SDG Reference

Complete specifications for the SDG pipeline. For a pipeline overview and guidance on when to use it, refer to About Synthetic Data Generation.

Config Schema

All YAML fields: top-level settings, seed dataset, model aliases, column types, and output projections.

CLI Reference

Flags for the nemotron steps run sdg/data_designer command, plus Hydra override syntax.

Output Projections

The three projection shapes with annotated JSONL examples.

Troubleshooting

Failure modes for local runs and cluster dispatch.
