About Nemotron Steps#
A Nemotron step is a named, reusable unit of work that you invoke with the nemotron steps CLI.
Each step declares the artifacts it consumes, the artifacts it produces, and a set of named configurations that you can run on your laptop, on a single node, or on a cluster.
Steps are the building blocks of every Nemotron pipeline.
This section is the entry point for the step model itself. Use it to learn what a step is, to explore the available steps from the CLI, and to find the right domain section for the work you have in mind.
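As a rough mental model, the description of a step above (consumed artifacts, produced artifacts, and named configurations) can be sketched as a small data structure. This is only an illustration, not the Nemotron implementation: the `StepSpec` class, its fields, the `example/step` name, the artifact names, and the configuration keys are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class StepSpec:
    """Hypothetical in-memory description of a step (not a Nemotron API)."""

    name: str
    consumes: tuple[str, ...]   # artifact kinds the step reads
    produces: tuple[str, ...]   # artifact kinds the step writes
    # Maps a configuration name to the settings that configuration applies.
    configs: dict[str, dict[str, object]] = field(default_factory=dict)

    def resolve(self, config_name: str) -> dict[str, object]:
        """Return a copy of the settings for one named configuration."""
        if config_name not in self.configs:
            known = ", ".join(sorted(self.configs)) or "none"
            raise KeyError(
                f"{self.name}: unknown configuration {config_name!r} (known: {known})"
            )
        return dict(self.configs[config_name])


# Assumed example values; real steps define their own artifacts and configurations.
example = StepSpec(
    name="example/step",
    consumes=("raw_text",),
    produces=("filtered_text",),
    configs={
        "laptop": {"nodes": 1, "gpus_per_node": 0},
        "single_node": {"nodes": 1, "gpus_per_node": 8},
        "cluster": {"nodes": 4, "gpus_per_node": 8},
    },
)

print(example.resolve("cluster"))
```

The point of the sketch is that the same step definition stays fixed while the selected configuration decides where and at what scale it runs.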
The Basics#
Definitions of step, configuration, environment profile, and artifact. Start here if you have not run a step before.
List the available steps, inspect their inputs and outputs, and chain steps together.
Building Block Steps#
Pipelines are modular. You can run a single step in isolation, and you can compose steps into longer flows. The cards below group the available steps by the outcome they support. Follow the link in each card for tutorials, how-to guides, concepts, and reference material in that domain.
Build your own dataset
Generate supervised fine-tuning (SFT) chat data, tool-calling data, or preference pairs with NeMo Data Designer.
Backed by the sdg/data_designer step.
Translate JSON Lines or Apache Parquet corpora with NeMo Curator, and optionally score the output with FAITH (faithfulness, accuracy, integrity, and translation-quality holistic scoring).
Backed by the translate/nemo_curator step.
Filter raw text with curate/nemo_curator, then tokenize and shard it with the data_prep/pretrain_prep, data_prep/sft_packing, and data_prep/rl_prep steps.
Use the curation docs for JSONL filtering and the training docs for data preparation.
Build your own benchmarks
Generate a custom multiple-choice question (MCQ) benchmark from your own documents, with optional translation.
Backed by the byob step.
Build your own models
Pretrain, fine-tune, align, and optimize models with the pretrain/, sft/, peft/, rl/, optimize/, and convert/ step families.
Score a trained checkpoint on standard benchmarks with NeMo Evaluator.
Backed by the eval/model_eval step.
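Because each step declares what it consumes and produces, a chain of steps can be checked before anything runs. The sketch below illustrates that idea under stated assumptions: the `Step` class and `check_chain` function are hypothetical helpers, the artifact names are invented for illustration, and `sft/example` is a placeholder rather than a real step name. Only `curate/nemo_curator`, `data_prep/sft_packing`, and `eval/model_eval` are names taken from this page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    """Hypothetical step record: a name plus the artifact kinds it reads and writes."""

    name: str
    consumes: frozenset[str]
    produces: frozenset[str]


def check_chain(steps, initial):
    """Return (step name, missing artifacts) for every step whose inputs are unmet."""
    available = set(initial)
    problems = []
    for step in steps:
        missing = step.consumes - available
        if missing:
            problems.append((step.name, sorted(missing)))
        # Outputs become available to every later step in the chain.
        available |= step.produces
    return problems


# Artifact names below are assumptions, not the real step contracts.
chain = [
    Step("curate/nemo_curator", frozenset({"raw_jsonl"}), frozenset({"curated_jsonl"})),
    Step("data_prep/sft_packing", frozenset({"curated_jsonl"}), frozenset({"packed_sft_data"})),
    Step("sft/example", frozenset({"packed_sft_data", "base_checkpoint"}), frozenset({"tuned_checkpoint"})),
    Step("eval/model_eval", frozenset({"tuned_checkpoint"}), frozenset({"eval_report"})),
]

print(check_chain(chain, {"raw_jsonl"}))
print(check_chain(chain, {"raw_jsonl", "base_checkpoint"}))
```

In this sketch the first call reports that the placeholder fine-tuning step lacks a base checkpoint, and the second call returns an empty list once that artifact is supplied up front.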
I Want To#
| Goal | Go To |
|---|---|
| Learn what a step, configuration, and profile are | |
| List the available steps from the CLI | |
| Run steps in an air-gapped environment | |
| Curate JSONL text | |
| Generate synthetic training data | |
| Translate a corpus | |
| Build an MCQ benchmark | |
| Fine-tune or align a model | |
| Evaluate a model | |
| Set up a Lepton or Slurm environment profile | |