Concepts

These pages explain how the mcq family inside src/nemotron/steps/byob prepares data, runs each generation stage, and optionally translates benchmarks.

Architecture

Pipeline overview

Prepare, generate, translate, and the Parquet stage cache.

stages

Pipeline Overview

Core processes

Data preparation

Seeds from Hugging Face plus local corpus chunks.

few-shot

Data Preparation for Multiple-Choice Question Benchmarks

Mapping targets to sources

source_subjects, weights, and optional tags.

target_source_mapping

Getting the Right Questions From the Source Benchmark

Question generation

Data Designer batched calls from prepared seeds.

generation

Question Generation

Quality assurance

Validation stack

Judgement, deduplication, distractors, coverage, outliers.

validation

Quality Validation

Filtering

Easiness and hallucination scores with removal flags.

filtering

Easiness and Hallucination Filtering

Translation

Translation

Curator translation, backtranslation, metrics, final schema.

translate

Translation

Next steps

Hands-on first run: Getting Started with Building MCQ Benchmarks
YAML tables: Reference