data_designer.config.config_builder
Bases: data_designer.config.exportable_config.ExportableConfigBase
Configuration container for Data Designer builder.
This class holds the main Data Designer configuration along with optional datastore settings needed for seed dataset operations.
Parameters:
The main Data Designer configuration containing columns, constraints, profilers, and other settings.
Attributes:
The main Data Designer configuration containing columns, constraints, profilers, and other settings.
Initialization:
Create a new model by parsing and validating input data from keyword arguments.
Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.
self is explicitly positional-only to allow self as a field name.
Config builder for Data Designer configurations.
This class provides a high-level interface for building Data Designer configurations.
Initialization:
Initialize a new DataDesignerConfigBuilder instance.
Parameters:
Model configurations. Can be:
Tool configurations for MCP tool calling. Can be:
Create a DataDesignerConfigBuilder from an existing configuration.
Accepts both the full BuilderConfig format (with a top-level
data_designer key) and the shorthand DataDesignerConfig format
(columns, model_configs, etc. at the top level). When the
shorthand format is detected it is automatically normalized into a
full BuilderConfig.
Parameters:
Configuration source. Can be:
Returns:
typing_extensions.Self
A new instance populated with the configuration from the provided source.
Raises:
If the config format is invalid.
If the builder config loaded from the config is invalid.
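The shorthand normalization described above can be sketched roughly as follows. This is a minimal illustration on plain dictionaries, not the library's implementation: the helper name `normalize_builder_config` is hypothetical, and only the `data_designer` key is taken from the description.

```python
from typing import Any


def normalize_builder_config(raw: dict[str, Any]) -> dict[str, Any]:
    """Hypothetical helper: wrap a shorthand config in the full format."""
    if not isinstance(raw, dict):
        raise TypeError("config must be a mapping")
    # Full format: a top-level "data_designer" key is already present.
    if "data_designer" in raw:
        return dict(raw)
    # Shorthand format: columns, model_configs, etc. sit at the top level,
    # so nest everything under "data_designer".
    return {"data_designer": dict(raw)}


shorthand = {"columns": [{"name": "age"}], "model_configs": []}
full = normalize_builder_config(shorthand)
```

Passing an already normalized dictionary through the helper leaves it unchanged, so the step is safe to apply to either format.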
Get the model configurations for this builder.
Returns:
Any
A list of ModelConfig objects used for data generation.
Get the tool configurations for this builder.
Returns:
Any
A list of ToolConfig objects used for MCP tool calling.
Get all referenceable variables allowed in prompt templates and expressions.
This includes all column names and their side effect columns that can be referenced in prompt templates and expressions within the configuration.
Returns:
Any
A list of variable names that can be referenced in templates and expressions.
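As a rough sketch of the idea, the allowed variables can be thought of as each column name followed by any side effect columns it produces. The `Column` class, its `side_effect_columns` field, and the example names below are hypothetical stand-ins, not the library's types.

```python
from dataclasses import dataclass, field


@dataclass
class Column:
    # Hypothetical stand-in for a column config.
    name: str
    side_effect_columns: list[str] = field(default_factory=list)


def allowed_references(columns: list[Column]) -> list[str]:
    """Collect column names plus their side effect columns, in order."""
    refs: list[str] = []
    for col in columns:
        refs.append(col.name)
        refs.extend(col.side_effect_columns)
    return refs


columns = [
    Column("question"),
    Column("answer", side_effect_columns=["answer_trace"]),
]
refs = allowed_references(columns)
```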
Get the ConfigBuilderInfo object for this builder.
Returns:
Any
An object containing information about the configuration.
Add a model configuration to the current Data Designer configuration.
Parameters:
The model configuration to add.
Delete a model configuration from the current Data Designer configuration by alias.
Parameters:
The alias of the model configuration to delete.
Add a tool configuration to the current Data Designer configuration.
Parameters:
The tool configuration to add.
Returns:
typing_extensions.Self
The current Data Designer config builder instance.
Raises:
If a tool configuration with the same alias already exists.
Delete a tool configuration from the current Data Designer configuration by alias.
Parameters:
The alias of the tool configuration to delete.
Returns:
typing_extensions.Self
The current Data Designer config builder instance.
Add a Data Designer column configuration to the current Data Designer configuration.
If no column config object is provided, you must provide the name, column_type, and any
additional keyword arguments that are required by the column config constructor.
Parameters:
Data Designer column config object to add.
Name of the column to add. This is only used if column_config is not provided.
Column type to add. This is only used if column_config is not provided.
Additional keyword arguments to pass to the column constructor.
Returns:
typing_extensions.Self
The current Data Designer config builder instance.
Raises:
If neither a column config nor the required constructor arguments are provided.
If the provided column config is not one of the supported column config types.
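The two calling styles (a ready-made config object, or a name plus column type and keyword arguments) can be illustrated with a small stand-in builder. Everything here is an assumption made for illustration: `SketchBuilder`, `SamplerColumn`, the `COLUMN_CLASSES` registry, and the exception types are hypothetical and do not reflect the library's classes.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class SamplerColumn:
    # Hypothetical stand-in for a column config class.
    name: str
    sampler_type: str


# Hypothetical registry mapping a column_type string to its constructor.
COLUMN_CLASSES = {"sampler": SamplerColumn}


class SketchBuilder:
    def __init__(self) -> None:
        self.columns: dict[str, Any] = {}

    def add_column(self, column_config=None, *, name=None, column_type=None, **kwargs):
        if column_config is None:
            # No object given: build one from name, column_type, and kwargs.
            if name is None or column_type is None:
                raise ValueError("provide column_config, or name and column_type")
            column_config = COLUMN_CLASSES[column_type](name=name, **kwargs)
        elif not isinstance(column_config, tuple(COLUMN_CLASSES.values())):
            raise TypeError("unsupported column config type")
        self.columns[column_config.name] = column_config
        return self  # returning self allows chained calls


builder = (
    SketchBuilder()
    .add_column(SamplerColumn(name="age", sampler_type="uniform"))
    .add_column(name="city", column_type="sampler", sampler_type="category")
)
```

Returning the builder from each call is what makes the chained style possible, which matches the Self return type documented for the builder methods.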
Add a constraint to the current Data Designer configuration.
Currently, constraints are only supported for numerical samplers.
You can either provide a constraint object directly, or provide a constraint type and additional keyword arguments to construct the constraint object. Valid constraint types are:
Parameters:
Constraint object to add.
Constraint type to add. Ignored when constraint is provided.
Additional keyword arguments to pass to the constraint constructor.
Returns:
typing_extensions.Self
The current Data Designer config builder instance.
Add a processor to the current Data Designer configuration.
If a processor with the same name already exists, it is replaced (upsert), making notebook cells safely re-runnable.
You can either provide a processor config object directly, or provide a processor type and additional keyword arguments to construct the processor config object.
Parameters:
The processor configuration object to add.
The type of processor to add.
Additional keyword arguments to pass to the processor constructor.
Returns:
typing_extensions.Self
The current Data Designer config builder instance.
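The upsert behavior can be sketched as a replace-by-name over a list. This is only an illustration under stated assumptions: the `Processor` class and `upsert_processor` function are hypothetical, and the real method may also handle side effects that this sketch ignores.

```python
from dataclasses import dataclass, field


@dataclass
class Processor:
    # Hypothetical stand-in for a processor config.
    name: str
    settings: dict = field(default_factory=dict)


def upsert_processor(processors: list[Processor], new: Processor) -> list[Processor]:
    """Replace the processor with the same name, or append if none exists."""
    for i, existing in enumerate(processors):
        if existing.name == new.name:
            processors[i] = new  # same name: replace in place
            return processors
    processors.append(new)
    return processors


processors: list[Processor] = []
# Simulate re-running the same notebook cell with changed settings.
upsert_processor(processors, Processor("drop_pii", {"columns": ["email"]}))
upsert_processor(processors, Processor("drop_pii", {"columns": ["email", "phone"]}))
```

After the second call the list still holds a single `drop_pii` entry carrying the newer settings, which is why re-running a cell does not accumulate duplicates.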
Remove an existing processor by name and undo its side-effects.
Resolve column names, expanding glob patterns against known column configs.
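One way to picture glob expansion against known column names is with the standard library's `fnmatch` module. This is a sketch, not the library's code: the function name `resolve_column_names`, the choice to drop patterns that match nothing, and the order-preserving de-duplication are all assumptions.

```python
from fnmatch import fnmatchcase


def resolve_column_names(patterns: list[str], known_columns: list[str]) -> list[str]:
    """Expand glob patterns against known column names, without duplicates."""
    resolved: list[str] = []
    for pattern in patterns:
        for column in known_columns:
            # A plain name matches itself; "user_*" matches every user_ column.
            if fnmatchcase(column, pattern) and column not in resolved:
                resolved.append(column)
    return resolved


known = ["user_name", "user_email", "order_id"]
resolved = resolve_column_names(["user_*", "order_id", "user_name"], known)
```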
Add a profiler to the current Data Designer configuration.
Parameters:
The profiler configuration object to add.
Returns:
typing_extensions.Self
The current Data Designer config builder instance.
Raises:
If the profiler configuration is of an invalid type.
Get all profilers.
Returns:
list[data_designer.config.analysis.column_profilers.ColumnProfilerConfigT]
A list of profiler configuration objects.
Build a DataDesignerConfig instance based on the current builder configuration.
Returns:
data_designer.config.data_designer_config.DataDesignerConfig
The current Data Designer config object.
Raises:
If any ToolConfig has duplicate tool names in its allow_tools list.
Validate that no ToolConfig has duplicate tool names in its allow_tools list.
This is a static validation that catches obvious duplicates at config build time, before providers are queried. Full validation (including duplicates across providers) happens at resource provider creation time.
Raises:
If any ToolConfig has duplicate tool names in allow_tools.
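The static duplicate check can be sketched with a counter over each allow_tools list. The function names, the plain dictionary of alias to tool names, and the use of `ValueError` are assumptions for illustration; the exception the library raises is not stated here.

```python
from collections import Counter


def find_duplicate_tools(allow_tools: list[str]) -> list[str]:
    """Return the tool names that appear more than once, sorted."""
    counts = Counter(allow_tools)
    return sorted(name for name, n in counts.items() if n > 1)


def validate_allow_tools(tool_configs: dict[str, list[str]]) -> None:
    """Raise if any tool config (keyed by alias) lists a tool name twice."""
    for alias, allow_tools in tool_configs.items():
        duplicates = find_duplicate_tools(allow_tools)
        if duplicates:
            raise ValueError(f"tool config {alias!r} has duplicate tools: {duplicates}")


duplicates = find_duplicate_tools(["search", "fetch", "search"])
```

A check like this needs only the configuration itself, which is why it can run at build time before any provider is contacted.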
Delete all constraints for the given target column.
Parameters:
Name of the column to remove constraints for.
Returns:
typing_extensions.Self
The current Data Designer config builder instance.
Delete the column with the given name.
Parameters:
Name of the column to delete.
Returns:
typing_extensions.Self
The current Data Designer config builder instance.
Raises:
If trying to delete a seed dataset column.
Get a column configuration by name.
Parameters:
Name of the column to retrieve the config for.
Returns:
data_designer.config.column_types.ColumnConfigT
The column configuration object.
Raises:
If no column with the given name exists.
Get all column configurations.
Returns:
list[data_designer.config.column_types.ColumnConfigT]
A list of all column configuration objects.
Get a tool configuration by alias.
Parameters:
The alias of the tool configuration to retrieve.
Returns:
data_designer.config.mcp.ToolConfig
The tool configuration object.
Raises:
If no tool configuration with the given alias exists.
Get all constraints for the given target column.
Parameters:
Name of the column to get constraints for.
Returns:
list[data_designer.config.sampler_constraints.ColumnConstraintT]
A list of constraint objects targeting the specified column.
Get all column configurations of the specified type.
Parameters:
The type of columns to filter by.
Returns:
list[data_designer.config.column_types.ColumnConfigT]
A list of column configurations matching the specified type.
Get all column configurations excluding the specified type.
Parameters:
The type of columns to exclude.
Returns:
list[data_designer.config.column_types.ColumnConfigT]
A list of column configurations that do not match the specified type.
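The two type-based accessors above are complementary filters over the same column list. A minimal sketch, assuming a hypothetical `Col` class with a `column_type` attribute (the real column configs and type values may differ):

```python
from dataclasses import dataclass


@dataclass
class Col:
    # Hypothetical stand-in for a column config.
    name: str
    column_type: str


def columns_of_type(columns: list[Col], column_type: str) -> list[Col]:
    """Keep only the columns whose type matches."""
    return [c for c in columns if c.column_type == column_type]


def columns_excluding_type(columns: list[Col], column_type: str) -> list[Col]:
    """Keep only the columns whose type does not match."""
    return [c for c in columns if c.column_type != column_type]


cols = [Col("age", "sampler"), Col("bio", "llm-text"), Col("city", "sampler")]
samplers = columns_of_type(cols, "sampler")
others = columns_excluding_type(cols, "sampler")
```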
Get processor configuration objects.
Returns:
list[data_designer.config.processor_types.ProcessorConfigT]
A list of processor configuration objects.
Get the seed config for the current Data Designer configuration.
Returns:
data_designer.config.seed.SeedConfig | None
The seed config if configured, None otherwise.
Get the count of columns of the specified type.
Parameters:
The type of columns to count.
Returns:
int
The number of columns matching the specified type.
Add a seed dataset to the current Data Designer configuration.
This method sets the seed dataset for the configuration, but columns are not resolved until compilation (including validation) is performed by the engine using a SeedReader.
Parameters:
The pointer to the seed dataset.
The sampling strategy to use when generating data from the seed dataset. Defaults to ORDERED sampling.
An optional selection strategy to use when generating data from the seed dataset. Defaults to None.
Returns:
typing_extensions.Self
The current Data Designer config builder instance.
Write the current configuration to a file.
Parameters:
Path to the file to write the configuration to.
Indentation level for the output file (default: 2).
Additional keyword arguments passed to the serialization methods used.
Raises:
If the file format is unsupported.
If the configuration cannot be serialized.
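Writing by file extension can be sketched as below. Which formats the method accepts is not stated here, so this sketch handles only `.json` and treats every other suffix as unsupported; the function body, the plain-dictionary config, and the `ValueError` are assumptions, not the library's behavior.

```python
import json
import tempfile
from pathlib import Path


def write_config(config: dict, path, indent: int = 2, **kwargs) -> None:
    """Sketch: serialize a config dict to JSON, chosen by file suffix."""
    path = Path(path)
    suffix = path.suffix.lower()
    if suffix != ".json":
        # Assumption: any suffix this sketch does not know is rejected.
        raise ValueError(f"unsupported file format: {suffix!r}")
    # json.dumps raises TypeError if the config cannot be serialized.
    path.write_text(json.dumps(config, indent=indent, **kwargs))


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "config.json"
    write_config({"data_designer": {"columns": []}}, target)
    written = json.loads(target.read_text())
```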
Get the builder config for the current Data Designer configuration.
Returns:
data_designer.config.config_builder.BuilderConfig
The builder config.
Generate a string representation of the DataDesignerConfigBuilder instance.
Returns:
str
A formatted string showing the builder’s configuration including seed dataset and column information grouped by type.
Return an HTML representation of the DataDesignerConfigBuilder instance.
This method provides a syntax-highlighted HTML representation of the builder’s string representation.
Returns:
str
HTML string with syntax highlighting for the builder representation.
Resolve the provided model_configs, which may be a string or Path pointing to a model configuration file. If model_configs is None or empty, return the default model configurations when they are available; otherwise raise an error.