Repository Structure
This introductory guide presents the structure of the NeMo AutoModel repository, provides a brief overview of its parts, introduces concepts such as components and recipes, and explains how everything fits together.
What is NeMo AutoModel?
NeMo AutoModel is a PyTorch library for fine-tuning and pretraining large-scale models. In particular, it provides:
- Optimized implementations for training efficiency, including fused kernels and memory-saving techniques.
- Day-0 support for LLMs and VLMs available on the Hugging Face Hub.
- Seamless integration with Hugging Face datasets, tokenizers, and related tools.
- Distributed training strategies using FSDP2 and MegatronFSDP across multi-GPU and multi-node environments.
- End-to-end workflows with recipes for data preparation, training, and evaluation.
Repository Structure
The AutoModel source code is available under the nemo_automodel directory. It is organized into three directories:
- components/ - Self-contained modules.
- recipes/ - End-to-end training workflows.
- cli/ - CLI entry point and job launcher dispatch.
Components Directory
The components/ directory contains isolated modules used in training loops.
Each component is designed to be dependency-light and reusable without cross-module imports.
Directory Structure
The following directory listing shows all components along with explanations of their contents.
Key Features
- Each component can be used independently in other projects.
- Each component has its own dependencies, without cross-module imports.
- Unit tests are colocated with the component they cover.
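The "no cross-module imports" property can be checked mechanically. The following is a minimal sketch, not part of NeMo AutoModel: the helper names `find_cross_component_imports` and `_imported_modules` are hypothetical, and the sketch assumes each component is a subdirectory of components/ and is imported under the `nemo_automodel.components` package path. It inspects absolute imports only; relative imports are ignored.

```python
import ast
from pathlib import Path


def _imported_modules(node, package):
    """Return the absolute module names referenced by an import node."""
    if isinstance(node, ast.Import):
        return [alias.name for alias in node.names]
    if isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
        if node.module == package:
            # `from <package> import foo` refers to the component `foo`.
            return [f"{package}.{alias.name}" for alias in node.names]
        return [node.module]
    return []  # not an import, or a relative import (ignored in this sketch)


def find_cross_component_imports(components_dir, package="nemo_automodel.components"):
    """Return (file, module) pairs where a component imports a different component.

    Assumes every direct subdirectory of `components_dir` is one component.
    """
    root = Path(components_dir)
    prefix = package + "."
    violations = []
    for component in sorted(p for p in root.iterdir() if p.is_dir()):
        for py_file in sorted(component.rglob("*.py")):
            tree = ast.parse(py_file.read_text(encoding="utf-8"))
            for node in ast.walk(tree):
                for module in _imported_modules(node, package):
                    if not module.startswith(prefix):
                        continue  # third-party or standard-library import
                    target = module[len(prefix):].split(".")[0]
                    if target != component.name:
                        violations.append((str(py_file.relative_to(root)), module))
    return violations
```

Calling `find_cross_component_imports("nemo_automodel/components")` from the repository root would return an empty list if the property holds for absolute imports under the assumptions above.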
Recipes Directory
Recipes define end-to-end workflows (data and model loading → training with custom loop → saving the output checkpoint) for a variety of tasks, such as training, fine-tuning, and knowledge distillation.
Available Recipes
The following directory listing shows all recipes along with explanations of their contents.
Run a Recipe
Recipes are launched via the automodel CLI:
The above command will fine-tune the Llama3.2-1B model on the SQuAD dataset with two GPUs using the llama3_2_1b_squad.yaml config.
For a single-GPU run, omit --nproc-per-node:
Each recipe imports the components it needs from the nemo_automodel/components/ catalog.
The recipe/components structure enables you to:
- Decouple individual components and replace them with custom implementations when needed.
- Avoid rigid, class-based trainer structures by using linear scripts that expose training logic for maximum flexibility and control.
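The two points above can be illustrated with a toy example. This is only a sketch of the pattern, written with the standard library: none of the names (`build_dataset`, `build_model`, `sgd_step`, `run_recipe`) come from NeMo AutoModel, and a tiny linear model stands in for a real LLM. The recipe is a linear function that wires independent building blocks together, and any block can be replaced by passing a different callable.

```python
import random


def build_dataset(num_samples=64, seed=0):
    """Hypothetical data component: synthetic (x, y) pairs with y = 3x + 1."""
    rng = random.Random(seed)
    xs = [rng.uniform(-1.0, 1.0) for _ in range(num_samples)]
    return [(x, 3.0 * x + 1.0) for x in xs]


def build_model():
    """Hypothetical model component: parameters of y = w * x + b."""
    return {"w": 0.0, "b": 0.0}


def sgd_step(model, batch, lr):
    """Hypothetical optimizer component: one gradient step on mean squared error."""
    grad_w = grad_b = 0.0
    for x, y in batch:
        err = model["w"] * x + model["b"] - y
        grad_w += 2.0 * err * x / len(batch)
        grad_b += 2.0 * err / len(batch)
    model["w"] -= lr * grad_w
    model["b"] -= lr * grad_b


def run_recipe(config, step_fn=sgd_step):
    """Linear recipe: load data and model, run the loop, return the final state."""
    data = build_dataset(config["num_samples"], config["seed"])
    model = build_model()
    batch_size = config["batch_size"]
    for _ in range(config["epochs"]):
        for start in range(0, len(data), batch_size):
            step_fn(model, data[start:start + batch_size], config["lr"])
    return dict(model)  # stand-in for saving the output checkpoint


config = {"num_samples": 64, "seed": 0, "batch_size": 16, "epochs": 200, "lr": 0.1}
trained = run_recipe(config)
```

Because the training loop is plain code rather than a method hidden inside a trainer class, swapping in a custom step is a one-argument change, for example `run_recipe(config, step_fn=my_step)`, where `my_step` is any function with the same three parameters.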
Configure a Recipe
An example YAML configuration is shown below. The complete config is available here:
More recipe examples are available under the examples/ directory.
CLI Directory (cli/)
The automodel (or am) CLI application simplifies job execution across different environments, from
single-GPU interactive sessions to batch multi-node runs. It supports interactive (local), SLURM,
SkyPilot, and NeMo-Run launchers. The CLI lives at the repository root in the
cli/ package, separate from the core nemo_automodel library.
Next Steps
Learn how to train models with NeMo AutoModel on:
- Your local workstation: See docs/launcher/local-workstation.md.
- A cluster: See docs/launcher/slurm.md.