
Copyright © 2026, NVIDIA Corporation.


data_designer.config.models

Module Contents

Classes

| Name | Description |
| --- | --- |
| Modality | Supported modality types for multimodal model data. |
| ModalityDataType | Data type formats for multimodal data. |
| DistributionType | Types of distributions for sampling inference parameters. |
| ModalityContext | Helper class that provides a standard way to create an ABC using inheritance. |
| ImageContext | Configuration for providing image context to multimodal models. |
| Distribution | Helper class that provides a standard way to create an ABC using inheritance. |
| ManualDistributionParams | Parameters for manual distribution sampling. |
| ManualDistribution | Manual (discrete) distribution for sampling inference parameters. |
| UniformDistributionParams | Parameters for uniform distribution sampling. |
| UniformDistribution | Uniform distribution for sampling inference parameters. |
| GenerationType | str(object='') -> str; str(bytes_or_buffer[, encoding[, errors]]) -> str |
| BaseInferenceParams | Base configuration for inference parameters. |
| ChatCompletionInferenceParams | Configuration for LLM inference parameters. |
| EmbeddingInferenceParams | Configuration for embedding generation parameters. |
| ImageInferenceParams | Configuration for image generation models. |
| ModelConfig | Configuration for a model used for generation. |
| ModelProvider | Configuration for a custom model provider. |

Functions

| Name | Description |
| --- | --- |
| load_model_configs | (no docstring) |

Data

logger, DistributionParamsT, DistributionT, InferenceParamsT

API

logger = getLogger(...)

class data_designer.config.models.Modality

Bases: str, enum.Enum

Supported modality types for multimodal model data.

Initialization:

Initialize self. See help(type(self)) for accurate signature.

IMAGE = "image"

class data_designer.config.models.ModalityDataType

Bases: str, enum.Enum

Data type formats for multimodal data.

Initialization:

Initialize self. See help(type(self)) for accurate signature.

URL = "url"
BASE64 = "base64"

class data_designer.config.models.DistributionType

Bases: str, enum.Enum

Types of distributions for sampling inference parameters.

Initialization:

Initialize self. See help(type(self)) for accurate signature.

UNIFORM = "uniform"
MANUAL = "manual"
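
Modality, ModalityDataType, and DistributionType all derive from both str and enum.Enum, so a member can be looked up from its plain string value and compares equal to it. A minimal standalone sketch of that pattern follows; DistributionTypeSketch is a hypothetical stand-in that mirrors the documented members and is not the library class.

```python
from enum import Enum


class DistributionTypeSketch(str, Enum):
    """Stand-in mirroring the documented members; not the library class."""

    UNIFORM = "uniform"
    MANUAL = "manual"


# Lookup by value returns the existing member rather than a new object.
member = DistributionTypeSketch("uniform")
is_same_member = member is DistributionTypeSketch.UNIFORM

# Because of the str mixin, a member compares equal to its raw string value.
equals_plain_string = member == "uniform"
```

This is why configuration written as plain strings (for example in a serialized config) can round-trip into enum-typed fields.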

class data_designer.config.models.ModalityContext(
    /,
    **data: typing.Any
)

Bases: abc.ABC, pydantic.BaseModel

modality: data_designer.config.models.Modality
column_name: str
data_type: data_designer.config.models.ModalityDataType | None

get_contexts(
    record: dict,
    *,
    base_path: str | None = None
) -> list[dict[str, typing.Any]]

class data_designer.config.models.ImageContext(
    /,
    **data: typing.Any
)

Bases: data_designer.config.models.ModalityContext

Configuration for providing image context to multimodal models.

Parameters:

modality

The modality type (always “image”).

column_name

Name of the column containing image data.

data_type

Format of the image data (“url”, “base64”, or None for auto-detection). When None, the format is auto-detected: URLs are passed through, file paths that exist under base_path are loaded as base64, and other values are assumed to be base64.

image_format

Image format (required when data_type is explicitly “base64”).

Attributes:

modality

The modality type (always “image”).

column_name

Name of the column containing image data.

data_type

Format of the image data (“url”, “base64”, or None for auto-detection). When None, the format is auto-detected: URLs are passed through, file paths that exist under base_path are loaded as base64, and other values are assumed to be base64.

image_format

Image format (required when data_type is explicitly “base64”).

Initialization:

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

modality: data_designer.config.models.Modality
image_format: data_designer.config.utils.image_helpers.ImageFormat | None

get_contexts(
    record: dict,
    *,
    base_path: str | None = None
) -> list[dict[str, typing.Any]]

Get the contexts for the image modality.

Parameters:

record
dict

The record containing the image data. The data can be:

  • A JSON serialized list of strings
  • A list of strings
  • A single string
base_path
str | None (defaults to None)

Optional base path for resolving relative file paths. When provided, file paths that exist under this directory are loaded and converted to base64. This enables generated images (stored as relative paths in create mode) to be sent to remote model endpoints.

Returns:

list[dict[str, typing.Any]]

A list of image contexts.
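
The three accepted shapes for the image column value (a JSON-serialized list of strings, a list of strings, or a single string) can be reduced to one list before any per-image handling. The sketch below shows one way to do that with the standard library only; normalize_image_values is a hypothetical helper name, and the library's own parsing may differ.

```python
import json
from typing import Any


def normalize_image_values(value: Any) -> list[str]:
    """Return a list of strings from any of the three documented input shapes."""
    if isinstance(value, list):
        return [str(item) for item in value]
    if isinstance(value, str):
        stripped = value.strip()
        # Only attempt JSON parsing when the string looks like a JSON array.
        if stripped.startswith("["):
            try:
                parsed = json.loads(stripped)
            except json.JSONDecodeError:
                parsed = None
            if isinstance(parsed, list):
                return [str(item) for item in parsed]
        # Anything else is treated as a single image reference.
        return [value]
    raise TypeError(f"Unsupported image column value: {type(value).__name__}")


record = {"image": '["https://example.com/a.png", "https://example.com/b.png"]'}
urls = normalize_image_values(record["image"])
```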

_auto_resolve_context_value(
    context_value: str,
    base_path: str | None
) -> dict[str, str]

Auto-detect the format of a context value and resolve it.

Resolution rules:

  • File path that exists under base_path → load to base64 (generated artifact)
  • URL (http/https) → pass through as-is
  • Otherwise → assume base64 data
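
The order of these rules matters: the file check runs first, then the URL check, and base64 is the fallback. The following sketch classifies a value in that order without building the final context dict; classify_context_value is a hypothetical name, and the simple path join used here is an assumption about what "exists under base_path" means.

```python
import tempfile
from pathlib import Path
from typing import Optional


def classify_context_value(context_value: str, base_path: Optional[str]) -> str:
    """Return "file", "url", or "base64" following the documented rule order."""
    if base_path is not None:
        try:
            if (Path(base_path) / context_value).is_file():
                return "file"
        except (OSError, ValueError):
            # Very long base64 strings are not valid file names on some systems.
            pass
    if context_value.startswith(("http://", "https://")):
        return "url"
    return "base64"


with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "img.png").write_bytes(b"\x89PNG")
    kinds = [
        classify_context_value("img.png", tmp),
        classify_context_value("https://example.com/a.png", tmp),
        classify_context_value("iVBORw0KGgo=", tmp),
        classify_context_value("img.png", None),
    ]
```

Note the last case: without a base_path, a relative file name cannot be resolved and falls through to the base64 branch.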

_format_base64_context(base64_data: str) -> dict[str, str]

Format base64 image data as an image_url context dict.

Uses self.image_format if set, otherwise detects from the image bytes.
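
Detecting the format from the image bytes usually means checking well-known file signatures. The sketch below illustrates that idea and wraps the result in a data URL, which is one common convention for image_url payloads; the helper names are hypothetical, and this reference does not show the dict layout or detection logic the library actually uses.

```python
import base64
from typing import Optional

# Well-known leading byte signatures for a few image formats.
_MAGIC_PREFIXES = {
    b"\x89PNG\r\n\x1a\n": "png",
    b"\xff\xd8\xff": "jpeg",
    b"GIF87a": "gif",
    b"GIF89a": "gif",
}


def detect_image_format(image_bytes: bytes) -> Optional[str]:
    """Return a format name based on the leading bytes, or None if unknown."""
    for prefix, name in _MAGIC_PREFIXES.items():
        if image_bytes.startswith(prefix):
            return name
    return None


def to_data_url(base64_data: str, image_format: Optional[str] = None) -> str:
    """Prefer an explicit format; otherwise detect it from the decoded bytes."""
    fmt = image_format or detect_image_format(base64.b64decode(base64_data))
    if fmt is None:
        raise ValueError("Could not detect image format; pass image_format explicitly")
    return f"data:image/{fmt};base64,{base64_data}"


png_header = base64.b64encode(b"\x89PNG\r\n\x1a\n" + b"\x00" * 8).decode("ascii")
data_url = to_data_url(png_header)
```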


_validate_image_format() -> typing_extensions.Self

DistributionParamsT = TypeVar(...)

class data_designer.config.models.Distribution(
    /,
    **data: typing.Any
)

Bases: abc.ABC, data_designer.config.base.ConfigBase, typing.Generic[data_designer.config.models.DistributionParamsT]

distribution_type: data_designer.config.models.DistributionType
params: data_designer.config.models.DistributionParamsT

sample() -> float

class data_designer.config.models.ManualDistributionParams(
    /,
    **data: typing.Any
)

Bases: data_designer.config.base.ConfigBase

Parameters for manual distribution sampling.

Parameters:

values

List of possible values to sample from.

weights

Optional list of weights for each value. If not provided, all values have equal probability.

Attributes:

values

List of possible values to sample from.

weights

Optional list of weights for each value. If not provided, all values have equal probability.

Initialization:

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

values: list[float] = Field(...)
weights: list[float] | None

_normalize_weights() -> typing_extensions.Self

_validate_equal_lengths() -> typing_extensions.Self
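
The two validator names suggest that weights are checked against values for length and rescaled to sum to 1. A minimal sketch of both steps follows, using a dataclass instead of a pydantic model so it runs with the standard library alone; ManualParamsSketch and its error messages are hypothetical, and the library's exact rules are not shown in this reference.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ManualParamsSketch:
    """Stand-in for the documented fields: values plus optional weights."""

    values: list[float]
    weights: Optional[list[float]] = None

    def __post_init__(self) -> None:
        if not self.values:
            raise ValueError("values must not be empty")
        if self.weights is None:
            return  # equal probability for every value
        if len(self.weights) != len(self.values):
            raise ValueError("values and weights must have the same length")
        total = sum(self.weights)
        if total <= 0:
            raise ValueError("weights must sum to a positive number")
        # Rescale so the weights form a probability distribution.
        self.weights = [w / total for w in self.weights]


params = ManualParamsSketch(values=[0.2, 0.7, 1.0], weights=[1, 1, 2])
```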

class data_designer.config.models.ManualDistribution

Bases: data_designer.config.models.Distribution[data_designer.config.models.ManualDistributionParams]

Manual (discrete) distribution for sampling inference parameters.

Samples from a discrete set of values with optional weights. Useful for testing specific values or creating custom probability distributions for temperature or top_p.

Attributes:

distribution_type

Type of distribution (“manual”).

params

Distribution parameters (values, weights).

distribution_type: data_designer.config.models.DistributionType | None = "manual"
params: data_designer.config.models.ManualDistributionParams

sample() -> float

Sample a value from the manual distribution.

Returns:

float

A float value sampled from the manual distribution.
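
Drawing from a discrete set with optional weights maps directly onto random.choices in the standard library. The sketch below shows that technique; sample_manual is a hypothetical function, and it is not a claim about how the library's sample() is implemented.

```python
import random
from typing import Optional, Sequence


def sample_manual(
    values: Sequence[float],
    weights: Optional[Sequence[float]] = None,
    rng: Optional[random.Random] = None,
) -> float:
    """Draw one value; weights=None means every value is equally likely."""
    rng = rng or random.Random()
    return float(rng.choices(values, weights=weights, k=1)[0])


rng = random.Random(0)
# A zero weight means the first value can never be drawn.
draws = [sample_manual([0.5, 0.9], weights=[0.0, 1.0], rng=rng) for _ in range(5)]
```

Passing a seeded random.Random makes a sampled temperature or top_p schedule reproducible in tests.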

class data_designer.config.models.UniformDistributionParams(
    /,
    **data: typing.Any
)

Bases: data_designer.config.base.ConfigBase

Parameters for uniform distribution sampling.

Parameters:

low

Lower bound (inclusive).

high

Upper bound (exclusive).

Attributes:

low

Lower bound (inclusive).

high

Upper bound (exclusive).

Initialization:

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

low: float
high: float

_validate_low_lt_high() -> typing_extensions.Self

class data_designer.config.models.UniformDistribution

Bases: data_designer.config.models.Distribution[data_designer.config.models.UniformDistributionParams]

Uniform distribution for sampling inference parameters.

Samples values uniformly between low and high bounds. Useful for exploring a continuous range of values for temperature or top_p.

Attributes:

distribution_type

Type of distribution (“uniform”).

params

Distribution parameters (low, high).

distribution_type: data_designer.config.models.DistributionType | None = "uniform"
params: data_designer.config.models.UniformDistributionParams

sample() -> float

Sample a value from the uniform distribution.

Returns:

float

A float value sampled from the uniform distribution.
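
The documented bounds (low inclusive, high exclusive) and the low < high check can be sketched as follows. UniformParamsSketch is a hypothetical stand-in, and the sampling formula is an assumption rather than the library's code.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class UniformParamsSketch:
    """Stand-in for the documented low/high pair."""

    low: float
    high: float

    def __post_init__(self) -> None:
        if not self.low < self.high:
            raise ValueError("low must be strictly less than high")

    def sample(self, rng: random.Random) -> float:
        # rng.random() is in [0.0, 1.0); floating-point rounding can, in rare
        # cases, still produce a result equal to high.
        return self.low + (self.high - self.low) * rng.random()


rng = random.Random(42)
temperature_range = UniformParamsSketch(low=0.5, high=1.0)
samples = [temperature_range.sample(rng) for _ in range(1000)]
```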

DistributionT
typing_extensions.TypeAlias

class data_designer.config.models.GenerationType

Bases: str, enum.Enum

CHAT_COMPLETION = "chat-completion"
EMBEDDING = "embedding"
IMAGE = "image"

class data_designer.config.models.BaseInferenceParams(
    /,
    **data: typing.Any
)

Bases: data_designer.config.base.ConfigBase, abc.ABC

Base configuration for inference parameters.

Parameters:

generation_type

Type of generation (chat-completion, embedding, or image). Acts as discriminator.

max_parallel_requests

Maximum number of parallel requests to the model API.

timeout

Timeout in seconds for each request.

extra_body

Additional parameters to pass to the model API.

Attributes:

generation_type

Type of generation (chat-completion, embedding, or image). Acts as discriminator.

max_parallel_requests

Maximum number of parallel requests to the model API.

timeout

Timeout in seconds for each request.

extra_body

Additional parameters to pass to the model API.

Initialization:

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

generation_type: data_designer.config.models.GenerationType
max_parallel_requests: int = Field(...)
timeout: int | None = Field(...)
extra_body: dict[str, typing.Any] | None

generate_kwargs: dict[str, typing.Any]

Get the generate kwargs for the inference parameters.

Returns:

dict[str, typing.Any]

A dictionary of the generate kwargs.

format_for_display() -> str

Format inference parameters for display as a single line.

Returns:

str

Formatted string of inference parameters

get_formatted_params() -> list[str]

Get a list of formatted parameter strings.

Returns:

list[str]

List of formatted parameter strings (e.g., [“temperature=0.70”, “max_tokens=100”])

_format_value(
    key: str,
    value: typing.Any
) -> str

Format a single parameter value. Override in subclasses for custom formatting.

Parameters:

key
str

Parameter name

value
typing.Any

Parameter value

Returns:

str

Formatted string representation of the value
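
The three display helpers above fit together as a small pipeline: format each value, build key=value strings, then join them on one line. The sketch below reproduces the documented example output ("temperature=0.70", "max_tokens=100"); the two-decimal float rule matches that example, while skipping None values and joining with ", " are assumptions.

```python
from typing import Any


def format_value(key: str, value: Any) -> str:
    """Format floats with two decimals; fall back to str() for everything else."""
    if isinstance(value, float):
        return f"{value:.2f}"
    return str(value)


def get_formatted_params(params: dict[str, Any]) -> list[str]:
    """Build key=value strings, skipping unset (None) parameters (assumption)."""
    return [f"{k}={format_value(k, v)}" for k, v in params.items() if v is not None]


def format_for_display(params: dict[str, Any]) -> str:
    """Join the formatted parameters into a single line (separator is assumed)."""
    return ", ".join(get_formatted_params(params))


raw_params = {"temperature": 0.7, "max_tokens": 100, "top_p": None}
formatted = get_formatted_params(raw_params)
one_line = format_for_display(raw_params)
```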

class data_designer.config.models.ChatCompletionInferenceParams(
    /,
    **data: typing.Any
)

Bases: data_designer.config.models.BaseInferenceParams

Configuration for LLM inference parameters.

Parameters:

generation_type

Type of generation, always “chat-completion” for this class.

temperature

Sampling temperature (0.0-2.0). Can be a fixed value or a distribution for dynamic sampling.

top_p

Nucleus sampling probability (0.0-1.0). Can be a fixed value or a distribution for dynamic sampling.

max_tokens

Maximum number of tokens to generate in the response.

Attributes:

generation_type

Type of generation, always “chat-completion” for this class.

temperature

Sampling temperature (0.0-2.0). Can be a fixed value or a distribution for dynamic sampling.

top_p

Nucleus sampling probability (0.0-1.0). Can be a fixed value or a distribution for dynamic sampling.

max_tokens

Maximum number of tokens to generate in the response.

Initialization:

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

generation_type: typing.Literal[data_designer.config.models.GenerationType]
temperature: float | data_designer.config.models.DistributionT | None
top_p: float | data_designer.config.models.DistributionT | None
max_tokens: int | None = Field(...)

generate_kwargs: dict[str, typing.Any]

_validate_temperature() -> typing_extensions.Self

_validate_top_p() -> typing_extensions.Self

_run_validation(
    value: float | data_designer.config.models.DistributionT | None,
    param_name: str,
    min_value: float,
    max_value: float
) -> typing_extensions.Self

_is_value_in_range(
    value: float,
    min_value: float,
    max_value: float
) -> bool
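
Since temperature and top_p accept either a fixed value or a distribution, a range check has to cover every value the parameter could take. The sketch below illustrates that idea with plain Python stand-ins: a list represents a manual distribution's values and a (low, high) tuple represents a uniform distribution's bounds. These representations and the function names are hypothetical, not the library's types.

```python
from typing import List, Tuple, Union

ParamValue = Union[float, List[float], Tuple[float, float], None]


def is_value_in_range(value: float, min_value: float, max_value: float) -> bool:
    """Inclusive range check."""
    return min_value <= value <= max_value


def validate_param(
    value: ParamValue, param_name: str, min_value: float, max_value: float
) -> None:
    """Raise ValueError if any value the parameter can take is out of range."""
    if value is None:
        return  # unset parameters are left to the model's defaults
    if isinstance(value, (list, tuple)):
        candidates = list(value)  # manual values or uniform (low, high) bounds
    else:
        candidates = [value]
    for candidate in candidates:
        if not is_value_in_range(candidate, min_value, max_value):
            raise ValueError(
                f"{param_name}={candidate} is outside [{min_value}, {max_value}]"
            )


validate_param(0.7, "temperature", 0.0, 2.0)        # fixed value, in range
validate_param([0.1, 0.5, 0.9], "top_p", 0.0, 1.0)  # manual values, in range
```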

_format_value(
    key: str,
    value: typing.Any
) -> str

Format chat completion parameter values, including distributions.

Parameters:

key
str

Parameter name

value
typing.Any

Parameter value

Returns:

str

Formatted string representation of the value

class data_designer.config.models.EmbeddingInferenceParams(
    /,
    **data: typing.Any
)

Bases: data_designer.config.models.BaseInferenceParams

Configuration for embedding generation parameters.

Parameters:

generation_type

Type of generation, always “embedding” for this class.

encoding_format

Format of the embedding encoding (“float” or “base64”).

dimensions

Number of dimensions for the embedding.

Attributes:

generation_type

Type of generation, always “embedding” for this class.

encoding_format

Format of the embedding encoding (“float” or “base64”).

dimensions

Number of dimensions for the embedding.

Initialization:

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

generation_type: typing.Literal[data_designer.config.models.GenerationType]
encoding_format: typing.Literal["float", "base64"] = "float"
dimensions: int | None

generate_kwargs: dict[str, float | int]

class data_designer.config.models.ImageInferenceParams(
    /,
    **data: typing.Any
)

Bases: data_designer.config.models.BaseInferenceParams

Configuration for image generation models.

Works for both diffusion and autoregressive image generation models. Pass all model-specific image options via extra_body.

Parameters:

generation_type

Type of generation, always “image” for this class.

Attributes:

generation_type

Type of generation, always “image” for this class.

Example:

# Assumption: dd is the package's config namespace imported under its usual alias.
import data_designer.config as dd

# OpenAI-style (DALL·E): quality and size in extra_body or as top-level kwargs
dd.ImageInferenceParams(
    extra_body={"size": "1024x1024", "quality": "hd"}
)

# Gemini-style: generationConfig.imageConfig
dd.ImageInferenceParams(
    extra_body={
        "generationConfig": {
            "imageConfig": {
                "aspectRatio": "1:1",
                "imageSize": "1024"
            }
        }
    }
)

Initialization:

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

generation_type: typing.Literal[data_designer.config.models.GenerationType]

InferenceParamsT
typing_extensions.TypeAlias

class data_designer.config.models.ModelConfig(
    /,
    **data: typing.Any
)

Bases: data_designer.config.base.ConfigBase

Configuration for a model used for generation.

Parameters:

alias

User-defined alias to reference in column configurations.

model

Model identifier (e.g., from build.nvidia.com or other providers).

inference_parameters

Inference parameters for the model (temperature, top_p, max_tokens, etc.). The generation_type is determined by the type of inference_parameters.

provider

Name of the model provider. Required in a future release. Leaving provider unset (or None) currently routes through the registry’s implicit default and is deprecated; specify provider= explicitly. See issue #589.

skip_health_check

Whether to skip the health check for this model. Defaults to False.

Attributes:

alias

User-defined alias to reference in column configurations.

model

Model identifier (e.g., from build.nvidia.com or other providers).

inference_parameters

Inference parameters for the model (temperature, top_p, max_tokens, etc.). The generation_type is determined by the type of inference_parameters.

provider

Name of the model provider. Required in a future release. Leaving provider unset (or None) currently routes through the registry’s implicit default and is deprecated; specify provider= explicitly. See issue #589.

skip_health_check

Whether to skip the health check for this model. Defaults to False.

Initialization:

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

alias: str
model: str
inference_parameters: data_designer.config.models.InferenceParamsT = Field(...)
provider: str | None
skip_health_check: bool = False

generation_type: data_designer.config.models.GenerationType

Get the generation type from the inference parameters.

_convert_inference_parameters(value: typing.Any) -> typing.Any

Convert raw dict to appropriate inference parameters type based on field presence.
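
Choosing a parameters class "based on field presence" can be pictured as a small dispatch over the keys of the raw dict. The sketch below uses only field names documented on this page (generation_type, encoding_format, dimensions); the precedence order and the chat-completion fallback are assumptions, and infer_generation_type is a hypothetical helper.

```python
from typing import Any


def infer_generation_type(raw: dict[str, Any]) -> str:
    """Guess the generation type of a raw inference-parameters dict."""
    # An explicit discriminator always wins.
    if "generation_type" in raw:
        return str(raw["generation_type"])
    # Fields documented only on EmbeddingInferenceParams.
    if "encoding_format" in raw or "dimensions" in raw:
        return "embedding"
    # Assumed fallback when nothing more specific is present.
    return "chat-completion"


kinds = [
    infer_generation_type({"temperature": 0.7, "max_tokens": 256}),
    infer_generation_type({"dimensions": 1024}),
    infer_generation_type({"generation_type": "image", "extra_body": {"size": "1024x1024"}}),
]
```

Image parameters have no distinguishing fields of their own on this page, so in this sketch they can only be recognized through an explicit generation_type.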

_warn_on_implicit_provider() -> typing_extensions.Self
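
The deprecation described for provider (leaving it unset currently works but will be required later) is typically surfaced with the standard warnings module. The sketch below shows that pattern; the function name, message text, and the choice of DeprecationWarning are assumptions, since this reference does not show what the library emits.

```python
import warnings
from typing import Optional


def warn_on_implicit_provider(alias: str, provider: Optional[str]) -> None:
    """Warn when a model config relies on the implicit default provider."""
    if provider is None:
        warnings.warn(
            f"Model '{alias}' has no provider set; specify provider= explicitly.",
            DeprecationWarning,
            stacklevel=2,
        )


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    warn_on_implicit_provider("my-model", None)      # warns
    warn_on_implicit_provider("my-model", "nvidia")  # silent
```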

class data_designer.config.models.ModelProvider(
    /,
    **data: typing.Any
)

Bases: data_designer.config.base.ConfigBase

Configuration for a custom model provider.

Parameters:

name

Name of the model provider.

endpoint

API endpoint URL for the provider.

provider_type

Provider type (default: “openai”). Determines the API format to use.

api_key

Optional API key for authentication.

extra_body

Additional parameters to pass in API requests.

extra_headers

Additional headers to pass in API requests.

Attributes:

name

Name of the model provider.

endpoint

API endpoint URL for the provider.

provider_type

Provider type (default: “openai”). Determines the API format to use.

api_key

Optional API key for authentication.

extra_body

Additional parameters to pass in API requests.

extra_headers

Additional headers to pass in API requests.

Initialization:

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

name: str
endpoint: str
provider_type: str = "openai"
api_key: str | None
extra_body: dict[str, typing.Any] | None
extra_headers: dict[str, str] | None

normalize_provider_type(v: str) -> str

data_designer.config.models.load_model_configs(
    model_configs: list[data_designer.config.models.ModelConfig] | str | pathlib.Path
) -> list[data_designer.config.models.ModelConfig]