data_designer.config.run_config
Bases: data_designer.config.utils.type_helpers.StrEnum
Template renderer used by the engine for user-supplied Jinja templates.
Bases: data_designer.config.base.ConfigBase
AIMD throttle tuning parameters for adaptive concurrency control.
These knobs configure the ThrottleManager that wraps every outbound
model HTTP request. The defaults are conservative and suitable for most
workloads; override only when you understand the trade-offs.
Parameters:
Multiplicative decrease factor applied to the per-domain concurrency limit on a 429 / rate-limit signal. Must be in (0, 1). Default is 0.75 (reduce by 25% on rate-limit).
Additive increase step applied after every success_window consecutive successes. Default is 1.
Number of consecutive successful releases before the additive increase is applied. Default is 25.
Default cooldown duration (seconds) applied after a rate-limit when the provider does not include a Retry-After header. Default is 2.0.
Fraction above the observed rate-limit ceiling that additive increase is allowed to probe before capping. Default is 0.10 (10% overshoot).
Optional startup ramp duration. When greater than zero, each throttle domain starts at one concurrent request and linearly ramps to its configured peak over this many seconds. A 429 aborts the startup ramp and switches to normal AIMD recovery. Default is 0.0 (disabled).
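The interaction of these parameters can be illustrated with a minimal sketch. It is not the ThrottleManager implementation: the class AimdLimit, the function ramp_limit, all field names, and the rounding choices are hypothetical stand-ins, and the cooldown timer is left out. Only the default values (0.75, 1, 25, 0.10) come from the descriptions above.

```python
import math
from dataclasses import dataclass


@dataclass
class AimdLimit:
    """Hypothetical stand-in for one throttle domain's concurrency limit."""

    peak: int                      # configured maximum concurrency
    decrease_factor: float = 0.75  # multiplicative decrease on a 429
    increase_step: int = 1         # additive increase step
    success_window: int = 25       # consecutive successes per increase
    overshoot: float = 0.10        # allowed probe above the observed ceiling

    def __post_init__(self) -> None:
        self.limit = self.peak
        self.ceiling = None  # limit that was in effect at the last 429
        self._streak = 0

    def on_success(self) -> None:
        # Count consecutive successes; increase only once per full window.
        self._streak += 1
        if self._streak < self.success_window:
            return
        self._streak = 0
        cap = self.peak
        if self.ceiling is not None:
            # Assumed rounding: probe at most `overshoot` above the ceiling.
            cap = min(cap, math.ceil(self.ceiling * (1 + self.overshoot)))
        self.limit = min(self.limit + self.increase_step, cap)

    def on_rate_limit(self) -> None:
        # Remember where the 429 happened, then back off multiplicatively.
        self.ceiling = self.limit
        self.limit = max(1, math.floor(self.limit * self.decrease_factor))
        self._streak = 0


def ramp_limit(elapsed: float, ramp_seconds: float, peak: int) -> int:
    """Linear startup ramp from 1 to `peak`; disabled when ramp_seconds <= 0."""
    if ramp_seconds <= 0 or elapsed >= ramp_seconds:
        return peak
    return max(1, math.floor(1 + (peak - 1) * elapsed / ramp_seconds))
```

With a peak of 16, a single rate-limit signal drops the limit to 12, and the limit rises to 13 only after 25 consecutive successes. A smaller decrease factor backs off harder; a larger success window recovers more slowly.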
Initialization:
Create a new model by parsing and validating input data from keyword arguments.
Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.
self is explicitly positional-only to allow self as a field name.
Bases: data_designer.config.base.ConfigBase
Runtime configuration for dataset generation.
Groups configuration options that control generation behavior but aren’t part of the dataset configuration itself.
Parameters:
If True, disables the executor’s early-shutdown behavior entirely. Generation will continue regardless of error rate, and the early-shutdown exception will never be raised. Error counts and summaries are still collected. Default is False.
Error rate threshold (0.0-1.0) that triggers early shutdown when early shutdown is enabled. Default is 0.5.
Minimum number of completed tasks before error rate monitoring begins. Must be >= 1. Default is 10.
Number of records to process in each batch during dataset generation. A batch is processed end-to-end (column generation, post-batch processors, and writing the batch to artifact storage) before moving on to the next batch. Must be > 0. Default is 1000.
Maximum number of worker threads used for non-inference cell-by-cell generators. Must be >= 1. Default is 4.
Maximum number of full conversation restarts permitted when generation tasks call ModelFacade.generate(...). Must be >= 0. Default is 5.
Maximum number of correction rounds permitted within a single conversation when generation tasks call ModelFacade.generate(...). Must be >= 0. Default is 0.
If True, collect per-task tracing data when using the async engine (DATA_DESIGNER_ASYNC_ENGINE=1). Has no effect on the sync path. Default is False.
If True, display sticky ANSI progress bars instead of periodic log lines during generation. Requires a TTY; falls back to log lines in non-TTY environments. Default is False.
How often (in seconds) the async progress reporter emits a consolidated log block. Must be > 0. Default is 5.0.
Template renderer used for engine-side Jinja evaluation. native uses Jinja2’s built-in sandbox with the standard filter set and fewer Data Designer-specific restrictions. secure uses Data Designer’s hardened sandbox with additional AST, filter, and output guards. Default is secure.
AIMD throttle tuning parameters. See ThrottleConfig for details.
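How the early-shutdown and batching settings combine can be sketched as follows. This is an illustration under stated assumptions, not the executor's code: the function and argument names are hypothetical, the comparison is assumed to be "error rate at or above the threshold", completed tasks are assumed to include failed ones, and the final batch is assumed to hold whatever records remain. Only the default values (0.5, 10, 1000) come from the descriptions above.

```python
def should_shut_down(
    completed: int,
    failed: int,
    *,
    threshold: float = 0.5,
    min_completed: int = 10,
    disabled: bool = False,
) -> bool:
    """Return True when the (assumed) early-shutdown condition is met."""
    if min_completed < 1:
        raise ValueError("min_completed must be >= 1")
    # No monitoring when disabled or before enough tasks have completed.
    if disabled or completed < min_completed:
        return False
    # Assumption: completed counts both successful and failed tasks.
    return failed / completed >= threshold


def batch_sizes(num_records: int, batch_size: int = 1000) -> list[int]:
    """Split a record count into batch sizes, each processed end-to-end."""
    if batch_size <= 0:
        raise ValueError("batch_size must be > 0")
    full, rest = divmod(num_records, batch_size)
    # Assumption: a final, smaller batch carries the remainder.
    return [batch_size] * full + ([rest] if rest else [])
```

Under these assumptions, nine failures out of nine completed tasks do not trigger a shutdown because monitoring has not started yet, while five failures out of ten do. A request for 2500 records with the default batch size yields batches of 1000, 1000, and 500.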
Initialization:
Create a new model by parsing and validating input data from keyword arguments.
Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.
self is explicitly positional-only to allow self as a field name.
Normalize shutdown settings for compatibility.