Reference
Specifications derived from the code in src/nemotron/steps/byob.
Outputs
Output files
Seed, stage cache, raw and final Parquet paths.
Troubleshooting
Common configuration errors, missing caches, filtering, and endpoint issues.
Configuration
Generation YAML
Required keys for ByobConfig.from_yaml.
Translation YAML
ByobTranslationConfig.from_yaml requirements.
Source benchmarks
Allowed Hugging Face datasets
Identifiers and default subsets from runtime/constants.py.