data_designer.config.models
logger
DistributionParamsT
DistributionT
InferenceParamsT
Bases: str, enum.Enum
Supported modality types for multimodal model data.
Bases: str, enum.Enum
Data type formats for multimodal data.
Bases: str, enum.Enum
Types of distributions for sampling inference parameters.
Bases: abc.ABC, pydantic.BaseModel
Bases: data_designer.config.models.ModalityContext
Configuration for providing image context to multimodal models.
Parameters:
The modality type (always “image”).
Name of the column containing image data.
Format of the image data (“url”, “base64”, or None for auto-detection). When None, the format is auto-detected: URLs are passed through, file paths that exist under base_path are loaded as base64, and other values are assumed to be base64.
Image format (required when data_type is explicitly “base64”).
Attributes:
The modality type (always “image”).
Name of the column containing image data.
Format of the image data (“url”, “base64”, or None for auto-detection). When None, the format is auto-detected: URLs are passed through, file paths that exist under base_path are loaded as base64, and other values are assumed to be base64.
Image format (required when data_type is explicitly “base64”).
Initialization:
Create a new model by parsing and validating input data from keyword arguments.
Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.
self is explicitly positional-only to allow self as a field name.
Get the contexts for the image modality.
Parameters:
The record containing the image data. The data can be a URL, a file path, or base64-encoded image data.
Optional base path for resolving relative file paths. When provided, file paths that exist under this directory are loaded and converted to base64. This enables generated images (stored as relative paths in create mode) to be sent to remote model endpoints.
Returns:
list[dict[str, typing.Any]]
A list of image contexts.
Auto-detect the format of a context value and resolve it.
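Resolution rules: URLs are passed through, file paths that exist under base_path are loaded as base64, and other values are assumed to be base64.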
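The auto-detection described for data_type can be sketched roughly as follows. This is a minimal illustration, not the library's implementation: the function name resolve_image_value, the http/https prefix test for URLs, and the (kind, payload) return shape are all assumptions made for the example.

```python
from __future__ import annotations

import base64
from pathlib import Path


def resolve_image_value(value: str, base_path: str | None = None) -> tuple[str, str]:
    """Return (kind, payload), where kind is "url" or "base64".

    Hypothetical helper: the URL test and the return shape are assumptions.
    """
    # Rule 1: URLs are passed through unchanged.
    if value.startswith(("http://", "https://")):
        return "url", value

    # Rule 2: a file that exists under base_path is loaded and base64-encoded.
    if base_path is not None:
        candidate = Path(base_path) / value
        try:
            is_file = candidate.is_file()
        except OSError:
            # A long base64 string is not a usable file name on some systems.
            is_file = False
        if is_file:
            encoded = base64.b64encode(candidate.read_bytes()).decode("ascii")
            return "base64", encoded

    # Rule 3: anything else is assumed to already be base64 data.
    return "base64", value
```

Note that this sketch does not guard against absolute paths or ".." segments escaping base_path; a real implementation would likely need to.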
Format base64 image data as an image_url context dict.
Uses self.image_format if set, otherwise detects from the image bytes.
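As a hedged sketch of this formatting step: the example below assumes the context dict follows the common OpenAI-style image_url shape with a data URL, and it detects only PNG, JPEG, and GIF from magic bytes. The helper names detect_image_format and format_base64_context are hypothetical, and the exact dict layout used by the library is not confirmed by this page.

```python
import base64
from typing import Optional


def detect_image_format(data: bytes) -> str:
    """Guess an image format from leading magic bytes (small illustrative subset)."""
    if data.startswith(b"\x89PNG\r\n\x1a\n"):
        return "png"
    if data.startswith(b"\xff\xd8\xff"):
        return "jpeg"
    if data.startswith((b"GIF87a", b"GIF89a")):
        return "gif"
    raise ValueError("unrecognized image format")


def format_base64_context(b64_data: str, image_format: Optional[str] = None) -> dict:
    """Build an image_url context dict (assumed shape) from base64 image data."""
    # Prefer the explicitly configured format; otherwise inspect the decoded bytes.
    fmt = image_format or detect_image_format(base64.b64decode(b64_data))
    return {
        "type": "image_url",
        "image_url": {"url": f"data:image/{fmt};base64,{b64_data}"},
    }
```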
Bases: abc.ABC, data_designer.config.base.ConfigBase, typing.Generic[data_designer.config.models.DistributionParamsT]
Bases: data_designer.config.base.ConfigBase
Parameters for manual distribution sampling.
Parameters:
List of possible values to sample from.
Optional list of weights for each value. If not provided, all values have equal probability.
Attributes:
List of possible values to sample from.
Optional list of weights for each value. If not provided, all values have equal probability.
Initialization:
Create a new model by parsing and validating input data from keyword arguments.
Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.
self is explicitly positional-only to allow self as a field name.
Bases: data_designer.config.models.Distribution[data_designer.config.models.ManualDistributionParams]
Manual (discrete) distribution for sampling inference parameters.
Samples from a discrete set of values with optional weights. Useful for testing specific values or creating custom probability distributions for temperature or top_p.
Attributes:
Type of distribution (“manual”).
Distribution parameters (values, weights).
Sample a value from the manual distribution.
Returns:
float
A float value sampled from the manual distribution.
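Weighted discrete sampling of this kind can be illustrated with the standard library. The sketch below is not the library's code: ManualParamsSketch and sample_manual are stand-in names, and only the values and weights fields are taken from the description above.

```python
import random
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ManualParamsSketch:
    """Stand-in for the manual distribution parameters (hypothetical class)."""

    values: List[float]
    weights: Optional[List[float]] = None


def sample_manual(params: ManualParamsSketch, rng: Optional[random.Random] = None) -> float:
    """Draw one value; with weights=None every value is equally likely."""
    rng = rng or random.Random()
    # random.choices accepts weights=None and then samples uniformly.
    return float(rng.choices(params.values, weights=params.weights, k=1)[0])
```

For example, sample_manual(ManualParamsSketch(values=[0.2, 0.7, 1.0], weights=[1, 2, 1])) would return 0.7 about half the time.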
Bases: data_designer.config.base.ConfigBase
Parameters for uniform distribution sampling.
Parameters:
Lower bound (inclusive).
Upper bound (exclusive).
Attributes:
Lower bound (inclusive).
Upper bound (exclusive).
Initialization:
Create a new model by parsing and validating input data from keyword arguments.
Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.
self is explicitly positional-only to allow self as a field name.
Bases: data_designer.config.models.Distribution[data_designer.config.models.UniformDistributionParams]
Uniform distribution for sampling inference parameters.
Samples values uniformly between low and high bounds. Useful for exploring a continuous range of values for temperature or top_p.
Attributes:
Type of distribution (“uniform”).
Distribution parameters (low, high).
Sample a value from the uniform distribution.
Returns:
float
A float value sampled from the uniform distribution.
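A minimal sketch of uniform sampling between low and high, using only the standard library. UniformParamsSketch and sample_uniform are hypothetical stand-ins; how the library actually draws its samples is not stated on this page.

```python
import random
from dataclasses import dataclass
from typing import Optional


@dataclass
class UniformParamsSketch:
    """Stand-in for the uniform distribution parameters (hypothetical class)."""

    low: float
    high: float


def sample_uniform(params: UniformParamsSketch, rng: Optional[random.Random] = None) -> float:
    """Draw one value between low and high."""
    rng = rng or random.Random()
    # rng.random() lies in [0.0, 1.0), which matches the documented
    # inclusive-low / exclusive-high bounds up to floating-point rounding.
    return params.low + (params.high - params.low) * rng.random()
```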
Bases: str, enum.Enum
Bases: data_designer.config.base.ConfigBase, abc.ABC
Base configuration for inference parameters.
Parameters:
Type of generation (chat-completion, embedding, or image). Acts as discriminator.
Maximum number of parallel requests to the model API.
Timeout in seconds for each request.
Additional parameters to pass to the model API.
Attributes:
Type of generation (chat-completion, embedding, or image). Acts as discriminator.
Maximum number of parallel requests to the model API.
Timeout in seconds for each request.
Additional parameters to pass to the model API.
Initialization:
Create a new model by parsing and validating input data from keyword arguments.
Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.
self is explicitly positional-only to allow self as a field name.
Get the generate kwargs for the inference parameters.
Returns:
Any
A dictionary of the generate kwargs.
Format inference parameters for display as a single line.
Returns:
str
Formatted string of inference parameters
Get a list of formatted parameter strings.
Returns:
list[str]
List of formatted parameter strings (e.g., ["temperature=0.70", "max_tokens=100"])
Format a single parameter value. Override in subclasses for custom formatting.
Parameters:
Parameter name
Parameter value
Returns:
str
Formatted string representation of the value
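The three formatting methods above fit together roughly as in the sketch below. It is an illustration under assumptions: the function names are hypothetical, floats are rendered with two decimals to match the "temperature=0.70" example, unset (None) values are skipped, and the ", " separator for the single-line form is a guess.

```python
from typing import Any, Dict, List


def format_value(key: str, value: Any) -> str:
    """Format one parameter value; floats get two decimals, everything else str()."""
    if isinstance(value, float):
        return f"{value:.2f}"
    return str(value)


def format_parts(params: Dict[str, Any]) -> List[str]:
    """Return "name=value" strings, skipping parameters that are unset (assumption)."""
    return [f"{key}={format_value(key, value)}" for key, value in params.items() if value is not None]


def format_line(params: Dict[str, Any]) -> str:
    """Join the parts into a single display line (separator is an assumption)."""
    return ", ".join(format_parts(params))
```

A subclass that supports distributions would override only the per-value step, leaving the list and single-line builders unchanged.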
Bases: data_designer.config.models.BaseInferenceParams
Configuration for LLM inference parameters.
Parameters:
Type of generation, always “chat-completion” for this class.
Sampling temperature (0.0-2.0). Can be a fixed value or a distribution for dynamic sampling.
Nucleus sampling probability (0.0-1.0). Can be a fixed value or a distribution for dynamic sampling.
Maximum number of tokens to generate in the response.
Attributes:
Type of generation, always “chat-completion” for this class.
Sampling temperature (0.0-2.0). Can be a fixed value or a distribution for dynamic sampling.
Nucleus sampling probability (0.0-1.0). Can be a fixed value or a distribution for dynamic sampling.
Maximum number of tokens to generate in the response.
Initialization:
Create a new model by parsing and validating input data from keyword arguments.
Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.
self is explicitly positional-only to allow self as a field name.
Format chat completion parameter values, including distributions.
Parameters:
Parameter name
Parameter value
Returns:
str
Formatted string representation of the value
Bases: data_designer.config.models.BaseInferenceParams
Configuration for embedding generation parameters.
Parameters:
Type of generation, always “embedding” for this class.
Format of the embedding encoding (“float” or “base64”).
Number of dimensions for the embedding.
Attributes:
Type of generation, always “embedding” for this class.
Format of the embedding encoding (“float” or “base64”).
Number of dimensions for the embedding.
Initialization:
Create a new model by parsing and validating input data from keyword arguments.
Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.
self is explicitly positional-only to allow self as a field name.
Bases: data_designer.config.models.BaseInferenceParams
Configuration for image generation models.
Works for both diffusion and autoregressive image generation models. Pass all model-specific image options via extra_body.
Parameters:
Type of generation, always “image” for this class.
Attributes:
Type of generation, always “image” for this class.
Example:
Initialization:
Create a new model by parsing and validating input data from keyword arguments.
Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.
self is explicitly positional-only to allow self as a field name.
Bases: data_designer.config.base.ConfigBase
Configuration for a model used for generation.
Parameters:
User-defined alias to reference in column configurations.
Model identifier (e.g., from build.nvidia.com or other providers).
Inference parameters for the model (temperature, top_p, max_tokens, etc.). The generation_type is determined by the type of inference_parameters.
Name of the model provider. Required in a future release. Leaving provider unset (or None) currently routes through the registry’s implicit default and is deprecated; specify provider= explicitly. See issue #589.
Whether to skip the health check for this model. Defaults to False.
Attributes:
User-defined alias to reference in column configurations.
Model identifier (e.g., from build.nvidia.com or other providers).
Inference parameters for the model (temperature, top_p, max_tokens, etc.). The generation_type is determined by the type of inference_parameters.
Name of the model provider. Required in a future release. Leaving provider unset (or None) currently routes through the registry’s implicit default and is deprecated; specify provider= explicitly. See issue #589.
Whether to skip the health check for this model. Defaults to False.
Initialization:
Create a new model by parsing and validating input data from keyword arguments.
Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.
self is explicitly positional-only to allow self as a field name.
Get the generation type from the inference parameters.
Convert raw dict to appropriate inference parameters type based on field presence.
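Dispatching on field presence might look like the sketch below. This is not the library's validator: the function name is hypothetical, the embedding field names encoding_format and dimensions are assumed (this page gives only their descriptions), and falling back to "chat-completion" when nothing else matches is also an assumption. The generation_type values come from the descriptions above.

```python
from typing import Any, Dict

# Assumed names for the embedding-only fields; not confirmed by this page.
EMBEDDING_ONLY_FIELDS = {"encoding_format", "dimensions"}


def infer_generation_type(raw: Dict[str, Any]) -> str:
    """Pick a generation type for a raw inference-parameters dict."""
    # An explicit discriminator, when present, takes precedence.
    if "generation_type" in raw:
        return raw["generation_type"]
    # Otherwise fall back to which type-specific fields appear.
    if EMBEDDING_ONLY_FIELDS & raw.keys():
        return "embedding"
    # Assumed default when no type-specific field is present.
    return "chat-completion"
```

Once the type is known, the raw dict can be validated against the matching parameters class.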
Bases: data_designer.config.base.ConfigBase
Configuration for a custom model provider.
Parameters:
Name of the model provider.
API endpoint URL for the provider.
Provider type (default: “openai”). Determines the API format to use.
Optional API key for authentication.
Additional parameters to pass in API requests.
Additional headers to pass in API requests.
Attributes:
Name of the model provider.
API endpoint URL for the provider.
Provider type (default: “openai”). Determines the API format to use.
Optional API key for authentication.
Additional parameters to pass in API requests.
Additional headers to pass in API requests.
Initialization:
Create a new model by parsing and validating input data from keyword arguments.
Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.
self is explicitly positional-only to allow self as a field name.