Configure Models#

Data Designer requires access to large language models (LLMs) to generate synthetic data. Without a properly configured model provider, no LLMs are available to Data Designer.

Configuration happens at two distinct levels:

  1. Model Providers (Deploy Time): The infrastructure layer that defines which LLM services are available. This only requires setting API endpoints and authentication keys in the deployment configuration. Model Providers are configured in a Model Provider Registry in Data Designer.

  2. Model Configurations (Runtime): The application layer where you define specific models, aliases, and inference parameters. This can be configured in your SDK code or API requests.

The system is designed to work with multiple provider types, from cloud-hosted services like NVIDIA’s API and OpenAI to locally deployed models like NIMs, Ollama, LM Studio, and so on. This flexibility allows you to choose providers based on your requirements for cost, privacy, performance, and model capabilities.

Understanding Model Providers#

Model providers are external LLM services that Data Designer uses for data generation. Each provider requires specific configuration including endpoints and authentication credentials.

By default, the Model Provider Registry includes NVIDIA’s API service at build.nvidia.com (referred to as “nvidiabuild”). This service provides access to various NVIDIA-hosted language models through a unified API. You can obtain free API access by creating an account at build.nvidia.com.

However, you can configure Data Designer to use any compatible LLM provider, including OpenAI, Azure OpenAI, local model endpoints, or other API-compatible services.

Configuration File Structure#

When you download the Docker Compose configuration, you receive this directory structure:

├── services/
└── docker-compose.yaml

All model provider configuration is managed in the docker-compose.yaml file. Refer to the data_designer section of the configs.platform_config field, as shown in the following sample.

configs:
  platform_config:
    content: |
      # Other NMP settings

      data_designer:
        # Other Data Designer settings

        model_provider_registry:
          default: "nvidiabuild"
          providers:
            - name: "nvidiabuild"
              endpoint: "https://integrate.api.nvidia.com/v1"
              api_key: "NIM_API_KEY"

Provider Registry Configuration (Deploy Time)#

The provider registry is configured at deployment time and defines which LLM service endpoints are available to Data Designer. Multiple providers may be configured. If more than one provider is configured, a default must be specified.

Providers are defined with the following fields:

  • name (required): A name for the provider, to be referenced in model configs

  • endpoint (required): The URL where the LLM service is hosted

  • api_key (optional, default None): Authentication credentials for accessing the service (see more details below)

  • provider_type (optional, default "openai"): The hosting service provider type; officially supported values are "openai" and "azure"

  • allowed_models (optional, default None): A list of allowed models hosted by the provider; when unset, all models are permitted

Note

The allowed_models entries should be the full model names as defined by the provider. For example:

name: "nvidiabuild"
endpoint: "https://integrate.api.nvidia.com/v1"
api_key: "NIM_API_KEY"
allowed_models:
  - "nvidia/nvidia-nemotron-nano-9b-v2"
  - "mistralai/mistral-small-24b-instruct"

Once providers are registered at deploy time, you can configure specific models, aliases, and inference parameters later through the SDK or API requests.

API keys should not be stored as plain text in the Data Designer configuration. Rather, the value should be a reference to either an environment variable or a key in a simple key-value JSON secrets file. For example, in Quickstart the nvidiabuild model provider is predefined and sets api_key to "NIM_API_KEY", which is exported as an environment variable and registered with the data-designer service.

# services/data_designer.yaml
services:
  data-designer:
    environment:
      - NIM_API_KEY

Any additional provider API keys referenced in the model_provider_registry should be exported and added to this list the same way.
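For example, if you register the openai provider sketched above, extend the same environment list with its key (the OPENAI_API_KEY name is an illustrative placeholder):

# services/data_designer.yaml
services:
  data-designer:
    environment:
      - NIM_API_KEY
      - OPENAI_API_KEY

Export the variable in your shell before starting the services, as shown in Environment Variable Management below.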

Model Configuration (Deploy Time and Runtime)#

Once providers are registered at deploy time, you configure specific models at runtime through your SDK code or API requests. This application layer is where you define which models to use, create convenient aliases, and set inference parameters for different use cases.

Key Point: Model configurations are flexible and can be defined in your application code, allowing you to:

  • Choose specific models from registered providers

  • Create meaningful aliases for easy reference

  • Set inference parameters like temperature and token limits

  • Configure different models for different content types

Model Configuration Structure#

Model configurations use the ModelConfig class to connect your data columns to specific AI models and their settings:

model_configs = [
    ModelConfig(
        alias="text-generation",
        provider="nvidiabuild",
        model="meta/llama-3.3-70b-instruct",
        inference_parameters=InferenceParameters(
            temperature=0.7,
            top_p=0.9,
            max_tokens=1024
        )
    ),
    ModelConfig(
        alias="code-generation",
        provider="nvidiabuild",
        model="qwen/qwen2.5-coder-32b-instruct",
        inference_parameters=InferenceParameters(
            temperature=0.3,
            top_p=0.9,
            max_tokens=1500
        )
    )
]

Required Model Configuration Fields#

| Field                | Description                                           | Required          |
|----------------------|-------------------------------------------------------|-------------------|
| alias                | Unique name to reference this model configuration     | Yes               |
| model                | Model identifier (e.g., meta/llama-3.3-70b-instruct)  | Yes               |
| inference_parameters | Model generation settings                             | Yes               |
| provider             | Name of the provider hosting the model                | No (uses default) |
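Because provider is optional, a model config that omits it resolves against the registry's default provider. A minimal sketch (the alias and parameter values are illustrative):

model_configs = [
    ModelConfig(
        alias="default-provider-model",
        model="meta/llama-3.3-70b-instruct",  # served by the default provider (nvidiabuild in Quickstart)
        inference_parameters=InferenceParameters(
            temperature=0.5,
            top_p=0.9,
            max_tokens=512
        )
    )
]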

Using Model Aliases in Columns#

Reference your model aliases in column configurations:

model_configs = [
    ModelConfig(
        alias="creative-writer",
        provider="nvidiabuild",
        model="meta/llama-3.1-8b-instruct",
        inference_parameters=InferenceParameters(
            temperature=0.8,
            top_p=0.95,
            max_tokens=1000
        )
    )
]

builder = DataDesignerConfigBuilder(model_configs=model_configs)
builder.add_column(
    LLMTextColumn(
        name="story",
        type="llm-text",
        prompt="Write an engaging story about {{theme}}",
        model_alias="creative-writer"
    )
)

Default Model Configurations#

If you find yourself using the same model configurations repeatedly, you can configure them as default model configs in the deployment configuration and omit the details from requests. The default_model_configs configuration setting takes a list of model configs exactly as structured in requests. In cases of overlapping aliases, a request-time config will take priority over a default model config.
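A sketch of what this could look like in the deployment configuration, assuming the YAML fields mirror the request-time model config structure shown above (the alias and values are illustrative):

data_designer:
  # Other Data Designer settings

  default_model_configs:
    - alias: "text-generation"
      provider: "nvidiabuild"
      model: "meta/llama-3.3-70b-instruct"
      inference_parameters:
        temperature: 0.7
        top_p: 0.9
        max_tokens: 1024

With this default in place, requests can reference the text-generation alias without restating the model details.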

Inference Parameters#

Inference parameters adjust how a model behaves during text generation and significantly impact output quality, consistency, and style.

Core Parameters#

Temperature (0.0-2.0)#

Controls randomness and creativity in outputs:

  • Low values (0.0-0.3): Deterministic, consistent outputs (good for code, technical content)

  • Medium values (0.3-0.7): Balanced creativity (good for general content)

  • High values (0.7-2.0): Creative, varied outputs (good for stories, marketing)

inference_parameters = InferenceParameters(
    temperature=0.7  # Balanced creativity
)

Top-P (0.0-1.0)#

Limits token selection to the most probable choices:

  • Low values (0.1-0.5): Focused, conservative word choices

  • High values (0.8-1.0): Diverse vocabulary and phrasing

inference_parameters = InferenceParameters(
    top_p=0.9  # Good balance of diversity and focus
)

Max Tokens#

Sets the limit for generated content length:

  • Guidelines: ~100 tokens ≈ 75 words, ~500 tokens ≈ 375 words

inference_parameters = InferenceParameters(
    max_tokens=1000  # ~750 words
)

Timeout#

Sets the timeout in seconds for each LLM request:

  • Default: 10 seconds with 3 auto retries

  • Recommendation: Increase based on your LLM endpoint scalability

inference_parameters = InferenceParameters(
    timeout=30
)

Max Parallel Requests#

Controls concurrency of requests to LLMs:

  • Guidelines: Use 4-8 for initial testing, then scale up to 50-200+ for high-performance LLM services

inference_parameters = InferenceParameters(
    max_parallel_requests=8  # Start here, scale up as needed
)

Tip

If you see HTTP errors from your model endpoint, your endpoint is the bottleneck. You can check the error log for the following specific exceptions:

  • ModelRateLimitError for 429 (rate limit) errors

  • ModelInternalServerError for 500 (internal server) errors

  • ModelTimeoutError for 504 (timeout) errors

  • ModelAPIError for other uncaught model endpoint errors

If Data Designer seems slow but no HTTP errors occur, try increasing max_parallel_requests.
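One way to confirm which of these errors you are hitting, assuming the Quickstart container name used later in this guide, is to filter the service logs for the exception names:

docker logs nemo-microservices-data-designer-1 2>&1 | \
  grep -E "ModelRateLimitError|ModelInternalServerError|ModelTimeoutError|ModelAPIError"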

Dynamic Parameters#

Use distributions for varied outputs across records:

inference_parameters = InferenceParameters(
    temperature=UniformDistribution(
        params=UniformDistributionParams(
            low=0.50,
            high=0.90
        )
    ),
    top_p=ManualDistribution(
        params=ManualDistributionParams(
            values=[0.8, 0.9, 0.95],
            weights=[0.3, 0.4, 0.3]
        )
    )
)

Connecting to Locally Deployed Models#

If you are hosting models locally on the same machine running the Data Designer microservice in Quickstart mode, the provider endpoint must point to the Docker host rather than localhost, because localhost inside the Data Designer container refers to the container itself.

  • When using Docker Desktop on Mac, the DNS entry host.docker.internal is configured by default.

  • When using Docker on Linux, 172.17.0.1 is the conventional endpoint used to access services on the Docker host.

Example: Connecting to a Local Model#

For a model running on your local machine at port 8000, configure the provider like this:

data_designer:
  model_provider_registry:
    default: "local-llm"
    providers:
      - name: "local-llm"
        endpoint: "http://host.docker.internal:8000/v1"  # or for Linux: "http://172.17.0.1:8000/v1"
        api_key: "LOCAL_API_KEY"  # If authentication is required

Then reference the provider in your model configurations as usual:

# Example model configuration for local deployment
model_configs = [
    ModelConfig(
        alias="local-llama",
        provider="local-llm",
        model="meta/llama-3.1-8b-instruct",  # Model identifier expected by your local service
        inference_parameters=InferenceParameters(
            temperature=0.7,
            top_p=0.9,
            max_tokens=1024,
            timeout=60  # Increase timeout for local models if needed
        )
    )
]

Best Practices#

Model Alias Naming Conventions#

Use descriptive, hyphenated names that clearly indicate purpose:

# ✅ Good: Clear purpose and scope
alias: "technical-documentation"
alias: "creative-storytelling"
alias: "precise-code-generation"

# ❌ Avoid: Vague or unclear names
alias: "model1"
alias: "good"
alias: "temp_high"

Parameter Tuning Guidelines#

Start with these baseline values and adjust based on your results:

  • Code generation: temperature: 0.2, top_p: 0.9

  • Technical writing: temperature: 0.3, top_p: 0.85

  • General content: temperature: 0.6, top_p: 0.9

  • Creative writing: temperature: 0.8, top_p: 0.95
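These baselines can be expressed as reusable InferenceParameters values attached to purpose-specific aliases; a sketch, with illustrative model choices and token limits:

# Baseline presets per content type; tune after reviewing preview outputs
code_params = InferenceParameters(temperature=0.2, top_p=0.9, max_tokens=1500)
technical_params = InferenceParameters(temperature=0.3, top_p=0.85, max_tokens=1024)
general_params = InferenceParameters(temperature=0.6, top_p=0.9, max_tokens=1024)
creative_params = InferenceParameters(temperature=0.8, top_p=0.95, max_tokens=1000)

model_configs = [
    ModelConfig(
        alias="precise-code-generation",
        provider="nvidiabuild",
        model="qwen/qwen2.5-coder-32b-instruct",
        inference_parameters=code_params
    ),
    ModelConfig(
        alias="creative-storytelling",
        provider="nvidiabuild",
        model="meta/llama-3.1-8b-instruct",
        inference_parameters=creative_params
    )
]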

Testing Model Configurations#

Always test configurations with preview mode before production:

# Add test columns to validate behavior
columns:
  - name: "test_output"
    type: "llm-text"
    prompt: "Test prompt for {{topic}}"
    model_alias: "your-custom-alias"

Troubleshooting#

Validation and Testing#

After modifying docker-compose.yaml, restart services and verify connectivity:

  1. Restart services:

    docker compose --profile data-designer down
    docker compose --profile data-designer up
    
  2. Check service health:

    curl -fv http://localhost:8080/health/data-designer
    
  3. Test model configurations: Use the Data Designer API to test each configured model by including model aliases in your requests.

Common Provider Issues#

Authentication Problems#

  • Verify environment variables are set correctly

  • Check API key validity with the provider’s documentation

  • Ensure endpoint URLs are accessible from the container

Connection Issues#

  • Test endpoint connectivity: curl -I <endpoint-url>

  • Check firewall and network settings

  • Verify provider service status

Configuration Errors#

  • Validate YAML syntax in docker-compose.yaml

  • Check container logs: docker logs nemo-microservices-data-designer-1

  • Verify provider names are unique and properly referenced

Common Model Configuration Issues#

“Alias not found” error#

  • Ensure you define the alias in model_configs before using it in columns

  • Check spelling and case sensitivity of alias names

“Model not available” error#

  • Verify the model exists and is accessible via the provider

  • Check that the model name matches exactly what the provider expects

Invalid parameter ranges#

  • Verify temperature is between 0.0-2.0

  • Ensure top_p is between 0.0-1.0

  • Confirm max_tokens is ≥1 and within provider limits

Performance Issues#

  • Slow generation with no errors: Try increasing max_parallel_requests

  • 429 or 503 HTTP errors: Your endpoint cannot handle the load; reduce max_parallel_requests or upgrade your LLM service

  • Timeouts: Increase timeout value or reduce max_parallel_requests

  • Check your endpoint’s monitoring dashboard to see if you’re hitting rate limits or capacity constraints

Environment Variable Management#

For multiple providers, organize your environment variables systematically:

# NVIDIA API
export NIM_API_KEY="nvapi-your-key"

# OpenAI
export OPENAI_API_KEY="sk-your-openai-key"

# Azure
export AZURE_OPENAI_KEY="your-azure-key"

# Start services
docker compose --profile data-designer up