data_designer.config.base

Module Contents

Classes

Name	Description
`ConfigBase`	Base Pydantic model from which the configuration classes in this module derive.
`SkipConfig`	Expression gate for conditional column generation.
`SingleColumnConfig`	Abstract base class for all single-column configuration types.
`ProcessorConfig`	Abstract base class for all processor configuration types.

Data

_VALIDATION_ENV

API

_VALIDATION_ENV = ImmutableSandboxedEnvironment(...)

class data_designer.config.base.ConfigBase(
    /,
    **data: typing.Any
)

Bases: pydantic.BaseModel

model_config = ConfigDict(...)

class data_designer.config.base.SkipConfig(
    /,
    **data: typing.Any
)

Bases: data_designer.config.base.ConfigBase

Expression gate for conditional column generation.

Attach to a SingleColumnConfig via skip=SkipConfig(...) to gate generation on a Jinja2 expression. SkipConfig controls only when to skip; propagation of upstream skips is controlled separately by propagate_skip on SingleColumnConfig.

Parameters:

when

Jinja2 expression (including {{ }} delimiters); when truthy, skip generation for this row.

value

Value to write for skipped cells. Defaults to None (becomes NaN/pd.NA in the DataFrame).

Attributes:

when

Jinja2 expression (including {{ }} delimiters); when truthy, skip generation for this row.

value

Value to write for skipped cells. Defaults to None (becomes NaN/pd.NA in the DataFrame).

Initialization:

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

when: str = Field(...)

value: bool | int | float | str | None = Field(...)

_validate_when_syntax(v: str) -> str

columns() -> list[str]

Column names referenced in the when expression.

Parsed once from the Jinja2 AST and cached. Used by the DAG builder to add dependency edges and by the execution graph to store metadata.
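The parse-once-and-cache behaviour described above can be sketched with the standard library alone. This is a minimal illustration, not the library's implementation: `SkipGateSketch` is a hypothetical class, Python's `ast` module stands in for the Jinja2 parser (so it only handles expressions that are also valid Python, and would misread Jinja2 filters), and `functools.cached_property` is an assumed caching mechanism.

```python
import ast
import re
from functools import cached_property

# Matches an expression wrapped in {{ }} and captures the inner text.
_DELIMS = re.compile(r"^\s*\{\{(.*)\}\}\s*$", re.DOTALL)


class SkipGateSketch:
    """Hypothetical stand-in for SkipConfig; only models `when` and `columns`."""

    def __init__(self, when: str) -> None:
        self.when = when

    @cached_property
    def columns(self) -> list[str]:
        # Runs on first access only; later accesses reuse the stored list.
        match = _DELIMS.match(self.when)
        if match is None:
            raise ValueError("expected an expression wrapped in {{ }}")
        # Assumption: the inner expression is valid Python syntax.
        tree = ast.parse(match.group(1).strip(), mode="eval")
        names = {node.id for node in ast.walk(tree) if isinstance(node, ast.Name)}
        return sorted(names)


gate = SkipGateSketch("{{ age < 18 or country == 'US' }}")
referenced = gate.columns  # names a DAG builder could turn into dependency edges
```

Because the result is cached on the instance, consumers such as a DAG builder and an execution graph can both read `columns` without re-parsing the expression.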

class data_designer.config.base.SingleColumnConfig(
    /,
    **data: typing.Any
)

Bases: data_designer.config.base.ConfigBase, abc.ABC

Abstract base class for all single-column configuration types.

This class serves as the foundation for all column configurations in DataDesigner, defining the fields and properties shared across all column types.

Parameters:

name

Unique name of the column to be generated.

drop

If True, the column will be generated but removed from the final dataset. Useful for intermediate columns that are dependencies for other columns.

allow_resize

If True, the generator may emit a different number of rows than it received (1:N or N:1). Explicit skip gates are invalid on resize columns, and upstream skip propagation is not applied to them.

column_type

Discriminator field that identifies the specific column type. Subclasses must override this field to specify the column type with a Literal value.

skip

Optional expression gate for conditional generation.

propagate_skip

If True (default), this column auto-skips when any of its required_columns was skipped. Independent of skip.

Attributes:

name

Unique name of the column to be generated.

drop

If True, the column will be generated but removed from the final dataset. Useful for intermediate columns that are dependencies for other columns.

allow_resize

If True, the generator may emit a different number of rows than it received (1:N or N:1). Explicit skip gates are invalid on resize columns, and upstream skip propagation is not applied to them.

column_type

Discriminator field that identifies the specific column type. Subclasses must override this field to specify the column type with a Literal value.

skip

Optional expression gate for conditional generation.

propagate_skip

If True (default), this column auto-skips when any of its required_columns was skipped. Independent of skip.
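How skip and propagate_skip interact for a single row can be sketched as follows. This is a simplified model under stated assumptions, not the library's engine: `ColumnSketch` and `run_row` are hypothetical names, plain Python callables stand in for Jinja2 `when` expressions, columns are assumed to arrive in dependency order, and writing None for a propagated skip is an assumption.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class ColumnSketch:
    """Hypothetical stand-in for a SingleColumnConfig subclass."""

    name: str
    generate: Callable[[dict], object]
    required_columns: list[str] = field(default_factory=list)
    skip_when: Optional[Callable[[dict], bool]] = None  # stand-in for SkipConfig.when
    skip_value: object = None  # stand-in for SkipConfig.value
    propagate_skip: bool = True


def run_row(columns: list[ColumnSketch], row: dict) -> tuple[dict, set[str]]:
    """Fill one row column by column, tracking which cells were skipped."""
    skipped: set[str] = set()
    for col in columns:  # assumed to be in dependency order
        upstream_skipped = any(dep in skipped for dep in col.required_columns)
        if col.propagate_skip and upstream_skipped:
            row[col.name] = None  # assumption: propagated skips write None
            skipped.add(col.name)
        elif col.skip_when is not None and col.skip_when(row):
            row[col.name] = col.skip_value
            skipped.add(col.name)
        else:
            row[col.name] = col.generate(row)
    return row, skipped


columns = [
    ColumnSketch("summary", lambda r: "a summary",
                 skip_when=lambda r: r["length"] < 10, skip_value="N/A"),
    ColumnSketch("translation", lambda r: r["summary"].upper(),
                 required_columns=["summary"]),
    ColumnSketch("audit", lambda r: f"checked:{r['summary']}",
                 required_columns=["summary"], propagate_skip=False),
]
row, skipped = run_row(columns, {"length": 3})
```

In this sketch `summary` is skipped by its own gate, `translation` is skipped because its dependency was skipped, and `audit` still runs because it opted out with propagate_skip=False.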

Initialization:

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

name: str

drop: bool = False

allow_resize: bool = False

column_type: str

skip: data_designer.config.base.SkipConfig | None

propagate_skip: bool = Field(...)

_validate_skip_scope() -> typing_extensions.Self

get_column_emoji() -> str

required_columns: list[str]

Decorators: @abstractmethod

Returns a list of column names that must exist before this column can be generated.

Returns:

list[str]

List of column names that this column depends on. Empty list indicates no dependencies. Override in subclasses to specify dependencies.

side_effect_columns: list[str]

Decorators: @abstractmethod

Returns a list of additional columns that this column will create as a side effect.

Some column types generate additional metadata or auxiliary columns alongside the primary column (e.g., reasoning traces for LLM columns).

Returns:

list[str]

List of column names that this column will create as a side effect. Empty list indicates no side effect columns. Override in subclasses to specify side effects.
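A subclass overriding both members might look like the sketch below. It uses a plain `abc.ABC` base rather than the real Pydantic-based class, and it assumes both members are abstract properties; `ColumnConfigSketch`, `SummaryColumnSketch`, `source_column`, and the `__reasoning_trace` suffix are all hypothetical names chosen for illustration.

```python
from abc import ABC, abstractmethod


class ColumnConfigSketch(ABC):
    """Hypothetical stand-in for SingleColumnConfig (no Pydantic involved)."""

    def __init__(self, name: str) -> None:
        self.name = name

    @property
    @abstractmethod
    def required_columns(self) -> list[str]:
        """Columns that must exist before this one can be generated."""

    @property
    @abstractmethod
    def side_effect_columns(self) -> list[str]:
        """Extra columns this one creates alongside its primary output."""


class SummaryColumnSketch(ColumnConfigSketch):
    """Hypothetical LLM-style column that reads one upstream column."""

    def __init__(self, name: str, source_column: str) -> None:
        super().__init__(name)
        self.source_column = source_column

    @property
    def required_columns(self) -> list[str]:
        return [self.source_column]

    @property
    def side_effect_columns(self) -> list[str]:
        # Hypothetical naming scheme for an auxiliary reasoning-trace column.
        return [f"{self.name}__reasoning_trace"]


col = SummaryColumnSketch(name="summary", source_column="article")
```

Leaving either property unimplemented makes the subclass abstract, so instantiating it raises TypeError before any generation is attempted.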

get_model_aliases() -> list[str]

Return every model alias this column depends on.

The startup model health check uses this to decide which model endpoints to ping. The default implementation returns the column’s primary model_alias (if the attribute is present), which covers the built-in LLM, embedding, and image columns.

Override this method on configs that depend on more than one model — for example, a plugin config with both a model_alias and a judge_model_alias should return both so a typo or unreachable endpoint on the secondary alias surfaces at startup rather than at first generation.

An empty-string model_alias is forwarded to the health check so that the registry’s “no model config with alias '' found” error is raised eagerly at startup instead of at first generation; only a truly missing attribute is treated as “no model endpoints”.

Returns:

list[str]

List of model aliases this column depends on. Empty list indicates the column does not call any model endpoints.
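The default and the override described above can be sketched with plain classes. This is an illustration of the documented behaviour rather than the library's code: every class name below is hypothetical, and `judge_model_alias` is taken from the plugin example in the text.

```python
class ColumnSketch:
    """Hypothetical stand-in for SingleColumnConfig."""

    def get_model_aliases(self) -> list[str]:
        # Only a missing attribute means "no model endpoints";
        # an empty string is forwarded so the health check can reject it.
        if not hasattr(self, "model_alias"):
            return []
        return [self.model_alias]


class SamplerSketch(ColumnSketch):
    """Column type that never calls a model, so it has no model_alias."""


class LLMTextSketch(ColumnSketch):
    """Column type with a single primary model."""

    def __init__(self, model_alias: str) -> None:
        self.model_alias = model_alias


class JudgedTextSketch(LLMTextSketch):
    """Plugin-style column that depends on a second, judge model."""

    def __init__(self, model_alias: str, judge_model_alias: str) -> None:
        super().__init__(model_alias)
        self.judge_model_alias = judge_model_alias

    def get_model_aliases(self) -> list[str]:
        # Report both aliases so either one can fail fast at startup.
        return [self.model_alias, self.judge_model_alias]


no_model = SamplerSketch().get_model_aliases()
empty_alias = LLMTextSketch("").get_model_aliases()
both = JudgedTextSketch("writer", "judge").get_model_aliases()
```

Here `empty_alias` still contains one entry, the empty string, which is what lets a startup check report the misconfiguration instead of silently skipping it.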

class data_designer.config.base.ProcessorConfig(
    /,
    **data: typing.Any
)

Bases: data_designer.config.base.ConfigBase, abc.ABC

Abstract base class for all processor configuration types.

Processors are transformations that run at different stages of the generation pipeline. They can modify, reshape, or augment the dataset.

Parameters:

name

Unique name of the processor, used to identify the processor in results and to name output artifacts on disk.

processor_type

Discriminator field that identifies the specific processor type. Subclasses must override this field with a Literal value.

Attributes:

name

Unique name of the processor, used to identify the processor in results and to name output artifacts on disk.

processor_type

Discriminator field that identifies the specific processor type. Subclasses must override this field with a Literal value.
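The role of a Literal-typed discriminator can be illustrated with a small standard-library sketch. It uses dataclasses and a lookup table as a stand-in for Pydantic's discriminated unions, so nothing here validates the Literal at runtime; `ProcessorSketch`, `DropColumnsSketch`, `RenameSketch`, `parse_processor`, and the two type strings are hypothetical.

```python
from dataclasses import dataclass
from typing import Literal


@dataclass
class ProcessorSketch:
    """Hypothetical stand-in for ProcessorConfig: shared fields only."""

    name: str
    processor_type: str


@dataclass
class DropColumnsSketch(ProcessorSketch):
    # The Literal pins the discriminator to a single value for this subclass.
    processor_type: Literal["drop_columns"] = "drop_columns"


@dataclass
class RenameSketch(ProcessorSketch):
    processor_type: Literal["rename"] = "rename"


# Map each discriminator value to the class that declares it.
REGISTRY = {cls.processor_type: cls for cls in (DropColumnsSketch, RenameSketch)}


def parse_processor(payload: dict) -> ProcessorSketch:
    """Pick the concrete class from the payload's processor_type."""
    cls = REGISTRY[payload["processor_type"]]
    return cls(**payload)


processor = parse_processor({"name": "cleanup", "processor_type": "drop_columns"})
```

Because each subclass fixes its own discriminator value, a serialized config can be routed back to the right class from the `processor_type` string alone.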

Initialization:

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

name: str = Field(...)

processor_type: str
