CLI Reference
Command-line reference for nemotron steps run sdg/data_designer. For a pipeline overview, see About Synthetic Data Generation.
Syntax
$ nemotron steps run sdg/data_designer \
[-c CONFIG] \
[--run PROFILE | --batch PROFILE] \
[--dry-run] \
[KEY=VALUE ...]
Flags
- -c, --config CONFIG
Config name (resolved from the step’s config/ directory) or an absolute or relative path to a YAML file.
Bundled names: default, customer_support_tools, rl_pref, tiny.
Default: default
- -r, --run PROFILE
Run attached using the env.toml profile named PROFILE. Job output streams to the terminal. Use for short interactive runs.
- -b, --batch PROFILE
Run detached using the env.toml profile named PROFILE. The job is submitted and the command returns immediately. Use for long cluster jobs.
- -d, --dry-run
Compile the config and print the resolved job spec without executing. Useful for verifying Hydra overrides before submission.
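The synopsis and flags above can be mirrored in a small argparse sketch. This is only an illustration of the documented interface (one optional config, mutually exclusive --run and --batch, an optional --dry-run, then free-form KEY=VALUE arguments); it is not the parser that nemotron itself uses, and build_parser is a hypothetical name.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser that mirrors the documented synopsis."""
    parser = argparse.ArgumentParser(prog="nemotron steps run sdg/data_designer")
    # -c/--config falls back to the bundled "default" config.
    parser.add_argument("-c", "--config", default="default", metavar="CONFIG")
    # --run and --batch are alternatives, so they share an exclusive group.
    mode = parser.add_mutually_exclusive_group()
    mode.add_argument("-r", "--run", metavar="PROFILE")
    mode.add_argument("-b", "--batch", metavar="PROFILE")
    parser.add_argument("-d", "--dry-run", action="store_true")
    # Everything left over is collected as KEY=VALUE override strings.
    parser.add_argument("overrides", nargs="*", metavar="KEY=VALUE")
    return parser


args = build_parser().parse_args(
    ["-c", "tiny", "--run", "my-profile", "--dry-run", "num_records=2"]
)
print(args.config, args.run, args.batch, args.dry_run, args.overrides)
```

Passing both --run and --batch to this sketch makes argparse exit with a usage error, which matches the "either one or the other" reading of the synopsis.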
Hydra Overrides
Any KEY=VALUE argument after the flags is passed as a Hydra dotlist override and merged into the resolved config. Overrides take precedence over YAML values.
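| Override | Example | Effect |
|---|---|---|
| num_records=N | num_records=100 | Generate N records |
| preview=true | preview=true | Run in preview mode |
| output_path=PATH | output_path=/data/my-project/sft.jsonl | Write output to PATH |
| | | Override seed file |
| | | Override first model’s temperature |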
The dotlist path follows the YAML structure. Nested keys use . as the separator; list items use .N (zero-indexed).
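To make the dotlist rules concrete, here is a minimal pure-Python sketch of how a KEY=VALUE override could be merged into a nested config. It is not Hydra's or OmegaConf's implementation, the value parsing is deliberately simplified, and the models key in the sample config is a hypothetical name chosen for illustration; only num_records and preview appear in this reference.

```python
def parse_value(raw):
    """Simplified scalar parsing: booleans, ints, floats, else plain string."""
    lowered = raw.lower()
    if lowered in ("true", "false"):
        return lowered == "true"
    for cast in (int, float):
        try:
            return cast(raw)
        except ValueError:
            pass
    return raw


def apply_override(config, dotted_key, value):
    """Walk the dotted path and set the final key or list index in place."""
    parts = dotted_key.split(".")
    node = config
    for part in parts[:-1]:
        # A numeric segment indexes into a list (zero-indexed).
        node = node[int(part)] if isinstance(node, list) else node[part]
    last = parts[-1]
    if isinstance(node, list):
        node[int(last)] = value
    else:
        node[last] = value


def apply_dotlist(config, overrides):
    """Apply each KEY=VALUE string; later values replace the YAML values."""
    for item in overrides:
        key, _, raw = item.partition("=")
        apply_override(config, key, parse_value(raw))
    return config


# Hypothetical config layout, standing in for a loaded YAML file.
config = {
    "num_records": 10,
    "preview": False,
    "models": [{"temperature": 0.7}, {"temperature": 0.2}],
}
apply_dotlist(config, ["num_records=2", "preview=true", "models.0.temperature=0.9"])
print(config)
```

The override values replace what the config held before, which is the precedence described above; the second list item is left untouched because only index 0 was addressed.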
Examples
Preview the default config with two records:
$ nemotron steps run sdg/data_designer -c default preview=true num_records=2
Generate 100 SFT records with a custom output path:
$ nemotron steps run sdg/data_designer -c default \
num_records=100 \
output_path=/data/my-project/sft.jsonl
Dry-run a cluster submission to check the resolved config:
$ nemotron steps run sdg/data_designer -c default --run my-profile --dry-run
Run attached on a Lepton profile with 500 records:
$ nemotron steps run sdg/data_designer -c default --run lepton_sdg_data_designer num_records=500
Use a config at an arbitrary path:
$ nemotron steps run sdg/data_designer -c /path/to/my-config.yaml preview=true num_records=2
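When these invocations are generated from a script, the command line can be assembled as an argument list and rendered with shlex. The sketch below only composes the command string and does not execute anything; build_command and its parameters are hypothetical helper names, not part of nemotron.

```python
import shlex


def build_command(config="default", profile=None, detached=False,
                  dry_run=False, **overrides):
    """Compose the argv list for the documented syntax (hypothetical helper)."""
    argv = ["nemotron", "steps", "run", "sdg/data_designer", "-c", config]
    if profile is not None:
        # --batch submits detached; --run stays attached to the terminal.
        argv += ["--batch" if detached else "--run", profile]
    if dry_run:
        argv.append("--dry-run")
    # Overrides go last, as KEY=VALUE tokens; values are passed as strings.
    argv += [f"{key}={value}" for key, value in overrides.items()]
    return argv


preview_cmd = build_command(preview="true", num_records=2)
print(shlex.join(preview_cmd))
```

Keeping the command as a list until the last moment avoids quoting mistakes; the same list could be handed to subprocess.run if execution is wanted.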