
Copyright © 2026, NVIDIA Corporation.

NeMo Data Designer

data_designer.config.config_builder


Module Contents

Classes

Name: Description
BuilderConfig: Configuration container for Data Designer builder.
DataDesignerConfigBuilder: Config builder for Data Designer configurations.

Functions

Name: Description
_load_model_configs: Resolves the provided model_configs, which may be a list of ModelConfig objects or a string or Path to a model configuration file. If None or empty, returns default model configurations if possible, otherwise raises an error.

Data

logger

API

logger = getLogger(...)

class data_designer.config.config_builder.BuilderConfig(
    /,
    **data: typing.Any
)

Bases: data_designer.config.exportable_config.ExportableConfigBase

Configuration container for Data Designer builder.

This class holds the main Data Designer configuration along with optional datastore settings needed for seed dataset operations.

Parameters:

data_designer

The main Data Designer configuration containing columns, constraints, profilers, and other settings.

Attributes:

data_designer

The main Data Designer configuration containing columns, constraints, profilers, and other settings.

Initialization:

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

data_designer: data_designer.config.data_designer_config.DataDesignerConfig

library_version: str | None

_set_library_version() -> data_designer.config.config_builder.BuilderConfig

class data_designer.config.config_builder.DataDesignerConfigBuilder(
    model_configs: list[data_designer.config.models.ModelConfig] | str | pathlib.Path | None = None,
    tool_configs: list[data_designer.config.mcp.ToolConfig] | None = None
)

Config builder for Data Designer configurations.

This class provides a high-level interface for building Data Designer configurations.

Initialization:

Initialize a new DataDesignerConfigBuilder instance.

Parameters:

model_configs
list[data_designer.config.models.ModelConfig] | str | pathlib.Path | None. Defaults to None.

Model configurations. Can be:

  • None to use default model configurations in local mode
  • A list of ModelConfig objects
  • A string or Path to a model configuration file
tool_configs
list[data_designer.config.mcp.ToolConfig] | None. Defaults to None.

Tool configurations for MCP tool calling. Can be:

  • None if no tool configs are needed
  • A list of ToolConfig objects
from_config(config: dict | str | pathlib.Path | data_designer.config.config_builder.BuilderConfig) -> typing_extensions.Self

Create a DataDesignerConfigBuilder from an existing configuration.

Accepts both the full BuilderConfig format (with a top-level data_designer key) and the shorthand DataDesignerConfig format (columns, model_configs, etc. at the top level). When the shorthand format is detected it is automatically normalized into a full BuilderConfig.

Parameters:

config
dict | str | pathlib.Path | data_designer.config.config_builder.BuilderConfig

Configuration source. Can be:

  • A dictionary containing the configuration
  • A string or Path to a local YAML/JSON configuration file
  • An HTTP(S) URL string to a YAML/JSON configuration file
  • A BuilderConfig object

Returns:

typing_extensions.Self

A new instance populated with the configuration from the provided source.

Raises:

ValueError

If the config format is invalid.

ValidationError

If the builder config loaded from the config is invalid.
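
The shorthand handling described for from_config can be pictured with a small stand-alone sketch. It does not call the library: normalize_config is a hypothetical helper, and the only detail taken from the description above is that the full format carries a top-level data_designer key while the shorthand format does not. The library's actual detection logic may differ.

```python
from typing import Any


def normalize_config(raw: dict[str, Any]) -> dict[str, Any]:
    """Wrap a shorthand config so it has a top-level 'data_designer' key.

    Hypothetical helper for illustration only.
    """
    if not isinstance(raw, dict):
        raise ValueError("config must be a dictionary")
    if "data_designer" in raw:
        # Already in the full BuilderConfig layout: return a shallow copy.
        return dict(raw)
    # Shorthand layout: columns, model_configs, etc. sit at the top level.
    return {"data_designer": dict(raw)}


shorthand = {"columns": [{"name": "topic"}], "model_configs": []}
full = normalize_config(shorthand)
```

Passing the already-normalized dictionary through the helper a second time leaves it unchanged, which is the property that lets both formats share one loading path.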

model_configs: list[data_designer.config.models.ModelConfig]

Get the model configurations for this builder.

Returns:

list[data_designer.config.models.ModelConfig]

A list of ModelConfig objects used for data generation.

tool_configs: list[data_designer.config.mcp.ToolConfig]

Get the tool configurations for this builder.

Returns:

list[data_designer.config.mcp.ToolConfig]

A list of ToolConfig objects used for MCP tool calling.

allowed_references: list[str]

Get all referenceable variables allowed in prompt templates and expressions.

This includes all column names and their side effect columns that can be referenced in prompt templates and expressions within the configuration.

Returns:

list[str]

A list of variable names that can be referenced in templates and expressions.

info: data_designer.config.utils.info.ConfigBuilderInfo

Get the ConfigBuilderInfo object for this builder.

Returns:

data_designer.config.utils.info.ConfigBuilderInfo

An object containing information about the configuration.

add_model_config(model_config: data_designer.config.models.ModelConfig) -> typing_extensions.Self

Add a model configuration to the current Data Designer configuration.

Parameters:

model_config
data_designer.config.models.ModelConfig

The model configuration to add.

delete_model_config(alias: str) -> typing_extensions.Self

Delete a model configuration from the current Data Designer configuration by alias.

Parameters:

alias
str

The alias of the model configuration to delete.

add_tool_config(tool_config: data_designer.config.mcp.ToolConfig) -> typing_extensions.Self

Add a tool configuration to the current Data Designer configuration.

Parameters:

tool_config
data_designer.config.mcp.ToolConfig

The tool configuration to add.

Returns:

typing_extensions.Self

The current Data Designer config builder instance.

Raises:

BuilderConfigurationError

If a tool configuration with the same alias already exists.

delete_tool_config(alias: str) -> typing_extensions.Self

Delete a tool configuration from the current Data Designer configuration by alias.

Parameters:

alias
str

The alias of the tool configuration to delete.

Returns:

typing_extensions.Self

The current Data Designer config builder instance.

add_column(
    column_config: data_designer.config.column_types.ColumnConfigT | None = None,
    *,
    name: str | None = None,
    column_type: data_designer.config.column_types.DataDesignerColumnType | None = None,
    **kwargs
) -> typing_extensions.Self

Add a Data Designer column configuration to the current Data Designer configuration.

If no column config object is provided, you must provide the name, column_type, and any additional keyword arguments that are required by the column config constructor.

Parameters:

column_config
data_designer.config.column_types.ColumnConfigT | None. Defaults to None.

Data Designer column config object to add.

name
str | None. Defaults to None.

Name of the column to add. This is only used if column_config is not provided.

column_type
data_designer.config.column_types.DataDesignerColumnType | None. Defaults to None.

Column type to add. This is only used if column_config is not provided.

**kwargs

Additional keyword arguments to pass to the column constructor.

Returns:

typing_extensions.Self

The current Data Designer config builder instance.

Raises:

BuilderConfigurationError

If neither a column config nor the required constructor arguments are provided.

InvalidColumnTypeError

If the provided column config is not one of the supported column config types.
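
The two call styles of add_column (passing a ready-made config object, or passing name, column_type, and extra keyword arguments) can be sketched with a toy builder. Everything below is hypothetical and independent of the library: ColumnSketch and BuilderSketch are stand-ins, the type strings are placeholders, and the sketch raises ValueError where the real method is documented to raise BuilderConfigurationError.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class ColumnSketch:
    """Hypothetical stand-in for a column config object."""
    name: str
    column_type: str
    params: dict[str, Any] = field(default_factory=dict)


class BuilderSketch:
    """Hypothetical mini-builder mirroring the two add_column call styles."""

    def __init__(self) -> None:
        self.columns: dict[str, ColumnSketch] = {}

    def add_column(self, column_config=None, *, name=None, column_type=None, **kwargs):
        if column_config is None:
            # No object given: construct one from name, type, and kwargs.
            if name is None or column_type is None:
                raise ValueError("provide a column config, or both name and column_type")
            column_config = ColumnSketch(name=name, column_type=column_type, params=kwargs)
        self.columns[column_config.name] = column_config
        return self  # returning self is what allows method chaining


builder = (
    BuilderSketch()
    .add_column(ColumnSketch(name="topic", column_type="placeholder_a"))
    .add_column(name="summary", column_type="placeholder_b", prompt="Summarize {{ topic }}")
)
```

Returning the builder from each call mirrors the typing_extensions.Self return type documented for the builder's mutating methods.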

add_constraint(
    constraint: data_designer.config.sampler_constraints.ColumnConstraintT | None = None,
    *,
    constraint_type: data_designer.config.sampler_constraints.ConstraintType | None = None,
    **kwargs
) -> typing_extensions.Self

Add a constraint to the current Data Designer configuration.

Currently, constraints are only supported for numerical samplers.

You can either provide a constraint object directly, or provide a constraint type and additional keyword arguments to construct the constraint object. Valid constraint types are:

  • “scalar_inequality”: Constraint between a column and a scalar value.
  • “column_inequality”: Constraint between two columns.

Parameters:

constraint
data_designer.config.sampler_constraints.ColumnConstraintT | None. Defaults to None.

Constraint object to add.

constraint_type
data_designer.config.sampler_constraints.ConstraintType | None. Defaults to None.

Constraint type to add. Ignored when constraint is provided.

**kwargs

Additional keyword arguments to pass to the constraint constructor.

Returns:

typing_extensions.Self

The current Data Designer config builder instance.
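
To make the difference between the two constraint types concrete, here is a library-independent sketch of what each one expresses when checked against a single row. The class names, the op and rhs fields, and the satisfies helper are all assumptions made for illustration; only the idea of a scalar inequality (column versus number) and a column inequality (column versus column) comes from the description above.

```python
import operator
from dataclasses import dataclass

# Supported comparison operators (an illustrative subset).
OPS = {"lt": operator.lt, "le": operator.le, "gt": operator.gt, "ge": operator.ge}


@dataclass
class ScalarInequalitySketch:
    """Hypothetical: compares a column with a fixed number."""
    target_column: str
    op: str
    rhs: float


@dataclass
class ColumnInequalitySketch:
    """Hypothetical: compares a column with another column."""
    target_column: str
    op: str
    rhs: str  # name of the other column


def satisfies(row: dict, constraint) -> bool:
    """Return True if the row meets the constraint."""
    left = row[constraint.target_column]
    if isinstance(constraint, ColumnInequalitySketch):
        right = row[constraint.rhs]  # look up the second column's value
    else:
        right = constraint.rhs  # use the scalar as-is
    return OPS[constraint.op](left, right)


row = {"min_price": 5, "max_price": 9}
non_negative = ScalarInequalitySketch("min_price", "ge", 0)
ordered = ColumnInequalitySketch("min_price", "lt", "max_price")
```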

add_processor(
    processor_config: data_designer.config.processor_types.ProcessorConfigT | None = None,
    *,
    processor_type: data_designer.config.processors.ProcessorType | None = None,
    **kwargs
) -> typing_extensions.Self

Add a processor to the current Data Designer configuration.

If a processor with the same name already exists, it is replaced (upsert), making notebook cells safely re-runnable.

You can either provide a processor config object directly, or provide a processor type and additional keyword arguments to construct the processor config object.

Parameters:

processor_config
data_designer.config.processor_types.ProcessorConfigT | None. Defaults to None.

The processor configuration object to add.

processor_type
data_designer.config.processors.ProcessorType | None. Defaults to None.

The type of processor to add.

**kwargs

Additional keyword arguments to pass to the processor constructor.

Returns:

typing_extensions.Self

The current Data Designer config builder instance.
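
The upsert behavior described for add_processor (a processor with an existing name replaces the old one) can be illustrated with a few lines of plain Python. ProcessorRegistrySketch is a hypothetical stand-in that stores processors as dictionaries; it shows the pattern, not the library's implementation, and it does not model the side-effect cleanup the real method performs.

```python
class ProcessorRegistrySketch:
    """Hypothetical registry that replaces processors sharing a name."""

    def __init__(self) -> None:
        self.processors: list[dict] = []

    def add_processor(self, processor: dict) -> "ProcessorRegistrySketch":
        # Drop any earlier entry with the same name, then append the new one.
        self.processors = [p for p in self.processors if p["name"] != processor["name"]]
        self.processors.append(processor)
        return self


registry = ProcessorRegistrySketch()
# Simulates re-running the same notebook cell with a tweaked setting.
registry.add_processor({"name": "cleanup", "drop": ["tmp_a"]})
registry.add_processor({"name": "cleanup", "drop": ["tmp_a", "tmp_b"]})
```

After both calls the registry holds a single "cleanup" entry carrying the most recent settings, which is why re-running a cell does not accumulate duplicates.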

_remove_processor_by_name(name: str) -> None

Remove an existing processor by name and undo its side-effects.

_resolve_drop_column_names(column_names: list[str]) -> list[str]

Resolve column names, expanding glob patterns against known column configs.
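
One plausible way to expand glob patterns against a list of known column names uses the standard library's fnmatch module. This is a sketch under assumptions: resolve_column_names is a hypothetical function, and the choices to pass plain names through unchanged and to de-duplicate while preserving order are illustrative rather than documented behavior.

```python
from fnmatch import fnmatchcase


def resolve_column_names(patterns: list[str], known_columns: list[str]) -> list[str]:
    """Expand glob patterns into concrete column names (illustrative only)."""
    resolved: list[str] = []
    for pattern in patterns:
        if any(ch in pattern for ch in "*?["):
            # Glob pattern: keep every known column that matches it.
            matches = [c for c in known_columns if fnmatchcase(c, pattern)]
        else:
            # Plain name: pass it through unchanged.
            matches = [pattern]
        for name in matches:
            if name not in resolved:  # de-duplicate, preserving order
                resolved.append(name)
    return resolved


resolved = resolve_column_names(["tmp_*", "debug"], ["tmp_a", "tmp_b", "answer", "debug"])
```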

add_profiler(profiler_config: data_designer.config.analysis.column_profilers.ColumnProfilerConfigT) -> typing_extensions.Self

Add a profiler to the current Data Designer configuration.

Parameters:

profiler_config
data_designer.config.analysis.column_profilers.ColumnProfilerConfigT

The profiler configuration object to add.

Returns:

typing_extensions.Self

The current Data Designer config builder instance.

Raises:

BuilderConfigurationError

If the profiler configuration is of an invalid type.

get_profilers() -> list[data_designer.config.analysis.column_profilers.ColumnProfilerConfigT]

Get all profilers.

Returns:

list[data_designer.config.analysis.column_profilers.ColumnProfilerConfigT]

A list of profiler configuration objects.

build() -> data_designer.config.data_designer_config.DataDesignerConfig

Build a DataDesignerConfig instance based on the current builder configuration.

Returns:

data_designer.config.data_designer_config.DataDesignerConfig

The current Data Designer config object.

Raises:

BuilderConfigurationError

If any ToolConfig has duplicate tool names in its allow_tools list.

_validate_tool_configs_no_duplicates() -> None

Validate that no ToolConfig has duplicate tool names in its allow_tools list.

This is a static validation that catches obvious duplicates at config build time, before providers are queried. Full validation (including duplicates across providers) happens at resource provider creation time.

Raises:

BuilderConfigurationError

If any ToolConfig has duplicate tool names in allow_tools.
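
A static duplicate check of this kind can be sketched with collections.Counter. The function names and the dictionary that maps a tool config alias to its allow_tools list are hypothetical, and the sketch raises ValueError in place of BuilderConfigurationError. As noted above, a check like this only sees duplicates inside one list; it cannot catch duplicates across providers.

```python
from collections import Counter


def find_duplicate_tools(allow_tools: list[str]) -> list[str]:
    """Return tool names that appear more than once, sorted for stable output."""
    counts = Counter(allow_tools)
    return sorted(name for name, n in counts.items() if n > 1)


def validate_tool_configs(tool_configs: dict[str, list[str]]) -> None:
    """Raise if any alias lists the same tool twice (illustrative only)."""
    for alias, allow_tools in tool_configs.items():
        duplicates = find_duplicate_tools(allow_tools)
        if duplicates:
            raise ValueError(f"tool config {alias!r} lists duplicate tools: {duplicates}")


duplicates = find_duplicate_tools(["search", "fetch", "search"])
```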

delete_constraints(target_column: str) -> typing_extensions.Self

Delete all constraints for the given target column.

Parameters:

target_column
str

Name of the column to remove constraints for.

Returns:

typing_extensions.Self

The current Data Designer config builder instance.

delete_column(column_name: str) -> typing_extensions.Self

Delete the column with the given name.

Parameters:

column_name
str

Name of the column to delete.

Returns:

typing_extensions.Self

The current Data Designer config builder instance.

Raises:

BuilderConfigurationError

If trying to delete a seed dataset column.

get_column_config(name: str) -> data_designer.config.column_types.ColumnConfigT

Get a column configuration by name.

Parameters:

name
str

Name of the column to retrieve the config for.

Returns:

data_designer.config.column_types.ColumnConfigT

The column configuration object.

Raises:

KeyError

If no column with the given name exists.

get_column_configs() -> list[data_designer.config.column_types.ColumnConfigT]

Get all column configurations.

Returns:

list[data_designer.config.column_types.ColumnConfigT]

A list of all column configuration objects.

get_tool_config(alias: str) -> data_designer.config.mcp.ToolConfig

Get a tool configuration by alias.

Parameters:

alias
str

The alias of the tool configuration to retrieve.

Returns:

data_designer.config.mcp.ToolConfig

The tool configuration object.

Raises:

KeyError

If no tool configuration with the given alias exists.

get_constraints(target_column: str) -> list[data_designer.config.sampler_constraints.ColumnConstraintT]

Get all constraints for the given target column.

Parameters:

target_column
str

Name of the column to get constraints for.

Returns:

list[data_designer.config.sampler_constraints.ColumnConstraintT]

A list of constraint objects targeting the specified column.

get_columns_of_type(column_type: data_designer.config.column_types.DataDesignerColumnType) -> list[data_designer.config.column_types.ColumnConfigT]

Get all column configurations of the specified type.

Parameters:

column_type
data_designer.config.column_types.DataDesignerColumnType

The type of columns to filter by.

Returns:

list[data_designer.config.column_types.ColumnConfigT]

A list of column configurations matching the specified type.

get_columns_excluding_type(column_type: data_designer.config.column_types.DataDesignerColumnType) -> list[data_designer.config.column_types.ColumnConfigT]

Get all column configurations excluding the specified type.

Parameters:

column_type
data_designer.config.column_types.DataDesignerColumnType

The type of columns to exclude.

Returns:

list[data_designer.config.column_types.ColumnConfigT]

A list of column configurations that do not match the specified type.
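
The type-based getters amount to complementary filters over the column list. The sketch below uses a hypothetical ColumnKind enum and (name, kind) tuples in place of DataDesignerColumnType and real column config objects; it is meant only to show how the "of type" and "excluding type" results partition the same collection.

```python
from enum import Enum


class ColumnKind(Enum):
    """Hypothetical column types standing in for DataDesignerColumnType."""
    SAMPLER = "sampler"
    GENERATED = "generated"


# (name, kind) pairs stand in for column config objects.
COLUMNS = [
    ("age", ColumnKind.SAMPLER),
    ("bio", ColumnKind.GENERATED),
    ("city", ColumnKind.SAMPLER),
]


def columns_of_type(columns, kind):
    """Keep only the columns whose kind matches."""
    return [c for c in columns if c[1] is kind]


def columns_excluding_type(columns, kind):
    """Keep only the columns whose kind does not match."""
    return [c for c in columns if c[1] is not kind]


samplers = columns_of_type(COLUMNS, ColumnKind.SAMPLER)
others = columns_excluding_type(COLUMNS, ColumnKind.SAMPLER)
```

In this sketch the two results never overlap, and together they contain every column exactly once.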

get_processor_configs() -> list[data_designer.config.processor_types.ProcessorConfigT]

Get processor configuration objects.

Returns:

list[data_designer.config.processor_types.ProcessorConfigT]

A list of processor configuration objects.

get_seed_config() -> data_designer.config.seed.SeedConfig | None

Get the seed config for the current Data Designer configuration.

Returns:

data_designer.config.seed.SeedConfig | None

The seed config if configured, None otherwise.

num_columns_of_type(column_type: data_designer.config.column_types.DataDesignerColumnType) -> int

Get the count of columns of the specified type.

Parameters:

column_type
data_designer.config.column_types.DataDesignerColumnType

The type of columns to count.

Returns:

int

The number of columns matching the specified type.

with_seed_dataset(
    seed_source: data_designer.config.seed_source_types.SeedSourceT,
    *,
    sampling_strategy: data_designer.config.seed.SamplingStrategy = SamplingStrategy.ORDERED,
    selection_strategy: data_designer.config.seed.IndexRange | data_designer.config.seed.PartitionBlock | None = None
) -> typing_extensions.Self

Add a seed dataset to the current Data Designer configuration.

This method sets the seed dataset for the configuration, but columns are not resolved until compilation (including validation) is performed by the engine using a SeedReader.

Parameters:

seed_source
data_designer.config.seed_source_types.SeedSourceT

The pointer to the seed dataset.

sampling_strategy
data_designer.config.seed.SamplingStrategy. Defaults to SamplingStrategy.ORDERED.

The sampling strategy to use when generating data from the seed dataset. Defaults to ORDERED sampling.

selection_strategy
data_designer.config.seed.IndexRange | data_designer.config.seed.PartitionBlock | None. Defaults to None.

An optional selection strategy to use when generating data from the seed dataset. Defaults to None.

Returns:

typing_extensions.Self

The current Data Designer config builder instance.

write_config(
    path: str | pathlib.Path,
    indent: int | None = 2,
    **kwargs
) -> None

Write the current configuration to a file.

Parameters:

path
str | pathlib.Path

Path to the file to write the configuration to.

indent
int | None. Defaults to 2.

Indentation level for the output file (default: 2).

**kwargs

Additional keyword arguments passed to the serialization methods used.

Raises:

BuilderConfigurationError

If the file format is unsupported.

BuilderSerializationError

If the configuration cannot be serialized.

get_builder_config() -> data_designer.config.config_builder.BuilderConfig

Get the builder config for the current Data Designer configuration.

Returns:

data_designer.config.config_builder.BuilderConfig

The builder config.

__repr__() -> str

Generates a string representation of the DataDesignerConfigBuilder instance.

Returns:

str

A formatted string showing the builder’s configuration including seed dataset and column information grouped by type.

_repr_html_() -> str

Return an HTML representation of the DataDesignerConfigBuilder instance.

This method provides a syntax-highlighted HTML representation of the builder’s string representation.

Returns:

str

HTML string with syntax highlighting for the builder representation.

data_designer.config.config_builder._load_model_configs(model_configs: list[data_designer.config.models.ModelConfig] | str | pathlib.Path | None = None) -> list[data_designer.config.models.ModelConfig]

Resolves the provided model_configs, which may be a string or Path to a model configuration file. If None or empty, returns default model configurations if possible, otherwise raises an error.