Pipeline Overview
The mcq family is registered in runtime/benchmark_families/registry.py and exposes three entrypoints: prepare_data, generate, and translate.
nemotron.steps.byob.scripts.runtime.run_byob dispatches by stage:
| CLI / YAML | Calls |
|---|---|
nemotron steps run byob/mcq executes mcq/step.py, which forwards to the BYOB argparse dispatcher in src/nemotron/steps/byob/scripts/run.py.
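The stage dispatch described above can be pictured with a small sketch. It is illustrative only: `BenchmarkFamily`, `REGISTRY`, `register`, and `dispatch` are hypothetical names, not the actual contents of registry.py or run_byob, and the lambdas stand in for the real entrypoints. Only the three entrypoint names (prepare_data, generate, translate) come from this page.

```python
from dataclasses import dataclass
from typing import Callable, Dict

STAGES = ("prepare_data", "generate", "translate")


@dataclass(frozen=True)
class BenchmarkFamily:
    """Hypothetical container for the three entrypoints a family exposes."""
    prepare_data: Callable[[dict], str]
    generate: Callable[[dict], str]
    translate: Callable[[dict], str]


# Hypothetical registry keyed by family name (for example "mcq").
REGISTRY: Dict[str, BenchmarkFamily] = {}


def register(name: str, family: BenchmarkFamily) -> None:
    """Add a family to the registry under the given name."""
    REGISTRY[name] = family


def dispatch(family_name: str, stage: str, config: dict) -> str:
    """Look up the family, then call the entrypoint that matches the stage."""
    if stage not in STAGES:
        raise ValueError(f"unknown stage: {stage!r}")
    family = REGISTRY[family_name]
    return getattr(family, stage)(config)


# Placeholder entrypoints that only report which one was called.
register(
    "mcq",
    BenchmarkFamily(
        prepare_data=lambda cfg: f"prepare_data({cfg['expt_name']})",
        generate=lambda cfg: f"generate({cfg['expt_name']})",
        translate=lambda cfg: f"translate({cfg['expt_name']})",
    ),
)

print(dispatch("mcq", "generate", {"expt_name": "demo"}))
```

Keeping the stage names in one tuple means an unsupported stage fails early with a clear error instead of an attribute lookup failure.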
Generate stage order
generate_mcq in runtime/benchmark_families/mcq/pipeline.py writes Parquet files under output_dir/expt_name/stage_cache/ in this order:
1. GENERATION: generated_questions.parquet
2. JUDGEMENT: judged_questions.parquet
3. SEMANTIC_DEDUPLICATION: semantic_deduplicated_questions.parquet (the stage body is skipped when semantic_deduplication_config.enabled is false, but the file is still written with duplicate flags)
4. DISTRACTOR_EXPANSION: optional, runs when do_distractor_expansion is true
5. COVERAGE_CHECK: optional, runs when do_coverage_check is true
6. DISTRACTOR_VALIDITY_CHECK: valid_distractors.parquet
7. SEMANTIC_OUTLIER_DETECTION: semantic_outlier_detection.parquet
8. HALLUCINATION_EASINESS_DETECTION: filtered_questions.parquet
9. FINAL_OUTPUT: copies into benchmark_raw.parquet, applies remove_hallucinated / remove_easy, renames columns, and writes benchmark.parquet
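To check how far a generate run has progressed, one option is to look for the stage files listed above. The sketch below is not part of the BYOB code: `stage_cache_dir` and `completed_stages` are hypothetical helpers. The filenames and the output_dir/expt_name/stage_cache/ layout are taken from this page. The two optional stages are left out because no filenames are given for them here.

```python
from pathlib import Path
from typing import Dict

# Stage name -> cache filename, in pipeline order (filenames from this page).
GENERATE_STAGE_FILES: Dict[str, str] = {
    "GENERATION": "generated_questions.parquet",
    "JUDGEMENT": "judged_questions.parquet",
    "SEMANTIC_DEDUPLICATION": "semantic_deduplicated_questions.parquet",
    "DISTRACTOR_VALIDITY_CHECK": "valid_distractors.parquet",
    "SEMANTIC_OUTLIER_DETECTION": "semantic_outlier_detection.parquet",
    "HALLUCINATION_EASINESS_DETECTION": "filtered_questions.parquet",
}


def stage_cache_dir(output_dir: str, expt_name: str) -> Path:
    """Return output_dir/expt_name/stage_cache as a Path."""
    return Path(output_dir) / expt_name / "stage_cache"


def completed_stages(output_dir: str, expt_name: str) -> Dict[str, bool]:
    """Map each stage to whether its cache file exists on disk."""
    cache = stage_cache_dir(output_dir, expt_name)
    return {
        stage: (cache / filename).is_file()
        for stage, filename in GENERATE_STAGE_FILES.items()
    }


if __name__ == "__main__":
    # "outputs" and "demo" are placeholder values for output_dir and expt_name.
    for stage, done in completed_stages("outputs", "demo").items():
        print(f"{stage:35s} {'present' if done else 'missing'}")
```

A present file only shows that the stage wrote its output; it says nothing about whether the pipeline reuses that file on a later run.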
Translate stage order
translate_mcq writes:
1. TRANSLATION: stage_cache/translated_questions.parquet
2. BACKTRANSLATION: stage_cache/backtranslated_questions.parquet
3. QUALITY_METRICS: stage_cache/quality_metrics.parquet
4. FINAL_OUTPUT: benchmark_raw.parquet, the optional remove_low_quality filter, a column rename to the MCQ schema, and benchmark.parquet
See Output Files for the exact filenames.
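Both FINAL_OUTPUT steps follow the same pattern: optionally drop flagged rows, then rename columns. A minimal sketch of that pattern on plain Python records follows. The `finalize` function and the column names (`is_low_quality`, `translated_question`) are hypothetical, chosen only for illustration. The real pipeline works on Parquet files and its actual column names are not shown on this page.

```python
from typing import Dict, Iterable, List


def finalize(
    records: Iterable[dict],
    drop_flags: Iterable[str],
    rename: Dict[str, str],
) -> List[dict]:
    """Drop rows where any listed flag column is truthy, then rename columns."""
    flags = list(drop_flags)
    kept = [r for r in records if not any(r.get(flag, False) for flag in flags)]
    # Columns missing from the rename map keep their original names.
    return [{rename.get(key, key): value for key, value in r.items()} for r in kept]


# Hypothetical rows standing in for benchmark_raw.parquet.
raw = [
    {"translated_question": "Q1", "is_low_quality": False},
    {"translated_question": "Q2", "is_low_quality": True},
]

# With a remove_low_quality-style option on, filter on the flag column;
# with it off, pass an empty list so that no rows are dropped.
final = finalize(raw, drop_flags=["is_low_quality"], rename={"translated_question": "question"})
print(final)
```

Passing the flag columns as a list lets the same helper cover the generate case, where remove_hallucinated and remove_easy can each be switched on independently.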