Data Designer ships with pre-configured model providers and model configurations that make it easy to start generating synthetic data without manual setup.
Data Designer includes a few default model providers that are configured automatically:
NVIDIA provider (nvidia)
Endpoint: https://integrate.api.nvidia.com/v1
API key: NVIDIA_API_KEY environment variable
The NVIDIA provider gives you access to state-of-the-art models including Nemotron and other NVIDIA-optimized models.

OpenAI provider (openai)
Endpoint: https://api.openai.com/v1
API key: OPENAI_API_KEY environment variable
The OpenAI provider gives you access to GPT models and other OpenAI offerings.

OpenRouter provider (openrouter)
Endpoint: https://openrouter.ai/api/v1
API key: OPENROUTER_API_KEY environment variable
The OpenRouter provider gives you access to a unified interface for many different language models from various providers.
Data Designer provides pre-configured model aliases for common use cases. When you create a DataDesignerConfigBuilder without specifying model_configs, these default configurations are automatically available.
The following model configurations are automatically available when NVIDIA_API_KEY is set:
The following model configurations are automatically available when OPENAI_API_KEY is set:
The following model configurations are automatically available when OPENROUTER_API_KEY is set:
Default settings work out of the box with no extra configuration: create DataDesigner and DataDesignerConfigBuilder instances without any arguments, then reference the default model aliases in your column configurations.
For a complete example showing how to use default model settings, see the Getting Started page.
When the Data Designer library or the CLI is initialized, the default model configurations and providers are written to the Data Designer home directory if they do not already exist, so they are easy to find and customize. These configuration files serve as the single source of truth for model settings. By default they are saved to the following paths:
~/.data-designer/model_configs.yaml
~/.data-designer/model_providers.yaml

These files are a convenient place to keep settings for the model providers and model configurations you use most often, but the same settings can always be specified programmatically in your SDG workflow.
You can customize the home directory location by setting the DATA_DESIGNER_HOME environment variable:
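As a rough illustration of how the override affects the two file locations, the sketch below resolves both paths from an environment mapping. The helper names `data_designer_home` and `config_paths` are hypothetical and not part of Data Designer; the sketch only assumes that the two YAML files live directly under the home directory, as the default paths above suggest.

```python
import os
from pathlib import Path


def data_designer_home(env=None):
    """Return the assumed Data Designer home directory.

    Hypothetical helper: uses DATA_DESIGNER_HOME when it is set,
    otherwise falls back to the documented default ~/.data-designer.
    """
    env = os.environ if env is None else env
    override = env.get("DATA_DESIGNER_HOME")
    if override:
        return Path(override).expanduser()
    return Path.home() / ".data-designer"


def config_paths(env=None):
    """Return the expected locations of the two configuration files."""
    home = data_designer_home(env)
    return {
        "model_configs": home / "model_configs.yaml",
        "model_providers": home / "model_providers.yaml",
    }


# Example with an explicit mapping so the real environment is not modified.
for name, path in config_paths({"DATA_DESIGNER_HOME": "/tmp/dd-home"}).items():
    print(f"{name}: {path}")
```

Since the files are written when the library or CLI is initialized, the variable presumably needs to be set before that point, for example in the shell that launches your script.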
These configuration files can be modified in two ways:
Both methods operate on the same files, ensuring consistency across your entire Data Designer setup.
While default model configurations are always available, you need to set the appropriate API key environment variable (NVIDIA_API_KEY, OPENAI_API_KEY, or OPENROUTER_API_KEY) to actually use the corresponding models for data generation. Without a valid API key, any attempt to generate data using that provider’s models will fail.
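A quick preflight check can show which of the default providers have a key set before you start a generation run. This is a minimal sketch: the `usable_providers` helper and the `DEFAULT_PROVIDER_KEYS` mapping are hypothetical names built from the provider and variable names listed on this page, and the check only confirms that a variable is non-empty, not that the key is valid.

```python
import os

# Provider name -> environment variable, as listed for the default providers.
DEFAULT_PROVIDER_KEYS = {
    "nvidia": "NVIDIA_API_KEY",
    "openai": "OPENAI_API_KEY",
    "openrouter": "OPENROUTER_API_KEY",
}


def usable_providers(env=None):
    """Return the default providers whose API key variable is non-empty.

    Hypothetical helper: it does not contact any endpoint, so a present
    but invalid key is still reported as usable.
    """
    env = os.environ if env is None else env
    return [
        name
        for name, var in DEFAULT_PROVIDER_KEYS.items()
        if env.get(var, "").strip()
    ]


if __name__ == "__main__":
    found = usable_providers()
    if found:
        print("Providers with an API key set:", ", ".join(found))
    else:
        print("No provider API keys found in the environment.")
```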
The default model providers call hosted endpoints operated by NVIDIA, OpenAI, OpenRouter, or their upstream providers. Provider terms and privacy practices apply independently of Data Designer, and free or trial endpoints may log request data for security, operations, or product improvement. Do not submit confidential information or personal data, including faces, voices, screenshots, regulated data, or other sensitive content, unless the selected provider and endpoint are approved for your use case.
The default: key in ~/.data-designer/model_providers.yaml and the registry-level “default provider” concept are deprecated and will be removed in a future release. Specify provider= explicitly on every ModelConfig instead — the built-in defaults above already do this, and a DeprecationWarning is now emitted whenever the legacy routing is exercised. See issue #589.
Store your API keys in environment variables rather than hardcoding them in your scripts:
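One way to follow this advice in a script is to read the key from the environment and fail early with a clear message when it is missing. The `require_api_key` helper below is a hypothetical sketch, not a Data Designer function; the default providers already read these variables themselves, so a helper like this is only useful for your own checks or custom code.

```python
import os


def require_api_key(var_name, env=None):
    """Return the API key stored in var_name, or raise if it is missing.

    Hypothetical helper: the key is never written into the script itself,
    only looked up from the environment at run time.
    """
    env = os.environ if env is None else env
    key = env.get(var_name, "").strip()
    if not key:
        raise RuntimeError(
            f"{var_name} is not set; export it in your shell before running this script."
        )
    return key


if __name__ == "__main__":
    try:
        require_api_key("NVIDIA_API_KEY")
        print("NVIDIA_API_KEY found.")
    except RuntimeError as err:
        print(err)
```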