Nemotron Architecture#
This directory contains documentation about Nemotron's architecture and design principles.
Overview#
Nemotron is a cookbook - a reference implementation showing best practices for training LLMs at scale. It's not a framework you install; it's a codebase you fork and customize.
Documents#
Runspec Specification - The [tool.runspec] metadata format for recipe scripts
CLI Architecture - How the CLI layer works and how to fork it
Design Philosophy - What we optimize for and why
Quick Start#
# Run pretraining locally
nemotron nano3 pretrain -c tiny
# Submit to cluster
nemotron nano3 pretrain -c tiny --run dgx
# See what's happening (execution logic is visible in the code)
# Open: src/nemotron/cli/commands/nano3/pretrain.py
Two-Layer Architecture#
| Layer | What | Where | Fork When |
|---|---|---|---|
| Execution | How to run and track experiments | | nemo-run/wandb -> SkyPilot/mlflow |
| Runtime | Training/data processing | | Algorithm changes |
Execution Layer                    Runtime Layer (recipes/)
┌──────────────────────────┐       ┌─────────────────────┐
│ cli/commands/nano3/      │       │ recipes/nano3/      │
│   pretrain.py            │       │   pretrain/train.py │
│                          │       │                     │
│ nemo_runspec (toolkit)   │──────►│ Megatron-Bridge     │
│ config, execution, env   │       │                     │
│ artifact registry        │       │                     │
└──────────────────────────┘       └─────────────────────┘
The runtime layer is typically a thin script that delegates to NVIDIA AI stack libraries. The execution layer contains all the job submission logic, which is what you'd change to swap nemo-run for SkyPilot or another backend.
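The split between the two layers can be pictured with a small, self-contained sketch. Everything here is hypothetical: `Job`, `Backend`, `LocalBackend`, `ClusterBackend`, and the `--config` flag are illustrative names, not the actual nemo_runspec or nemo-run API. The backends only build a command string, so nothing is launched.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class Job:
    """What to run: a runtime script plus the name of a config."""
    script: str
    config: str


class Backend(Protocol):
    """Execution-layer interface; swapping backends means swapping this."""
    def submit(self, job: Job) -> str: ...


class LocalBackend:
    def submit(self, job: Job) -> str:
        # Hypothetical command line; the real runtime script's flags may differ.
        return f"python {job.script} --config {job.config}"


class ClusterBackend:
    def __init__(self, profile: str) -> None:
        self.profile = profile

    def submit(self, job: Job) -> str:
        # A real backend would hand the job to a scheduler instead.
        return f"[{self.profile}] python {job.script} --config {job.config}"


def pretrain(config: str, backend: Backend) -> str:
    """Execution-layer command: chooses where to run, not what training does."""
    job = Job(script="recipes/nano3/pretrain/train.py", config=config)
    return backend.submit(job)


local_cmd = pretrain("tiny", LocalBackend())
cluster_cmd = pretrain("tiny", ClusterBackend(profile="dgx"))
```

In this sketch, replacing one backend with another touches only the execution layer; the runtime script path and its training logic stay unchanged.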
Package Responsibilities#
| Package | Scope |
|---|---|
| | Generic CLI toolkit: PEP 723 runspec parsing, config loading, env.toml profiles, execution helpers, packaging, pipeline orchestration, artifact registry ( |
| | Domain-specific: artifact type definitions (pretrain data, SFT data, checkpoints), lineage trackers (W&B, file-based), W&B integration |
| | CLI commands: visible execution logic per command, typer-based command tree |
| | Runtime scripts: training, data prep, RL (thin scripts delegating to NVIDIA AI stack) |
Dependency direction: nemotron.cli -> nemo_runspec + nemotron.kit -> NVIDIA stack. Never the reverse.
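One way to keep this rule honest in a fork is a small import check. The sketch below is an assumption about how such a check could look, not something the repository is known to ship: it parses a module's source with `ast` and reports imports that point from a lower layer up to a higher one. Treating `nemo_runspec` and `nemotron.kit` as peers at the same rank is also an assumption.

```python
import ast
from typing import Optional

# Higher rank may import lower rank, never the reverse.
# Placing nemo_runspec and nemotron.kit at the same rank is an assumption.
LAYER_RANK = {"nemotron.cli": 2, "nemo_runspec": 1, "nemotron.kit": 1}


def layer_of(module: str) -> Optional[str]:
    """Map a dotted module name to the layer it belongs to, if any."""
    for layer in LAYER_RANK:
        if module == layer or module.startswith(layer + "."):
            return layer
    return None


def violations(module: str, source: str) -> list[str]:
    """List absolute imports in `source` that point to a higher layer."""
    own = layer_of(module)
    if own is None:
        return []
    bad = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            names = [node.module]
        else:
            continue
        for name in names:
            target = layer_of(name)
            if target is not None and LAYER_RANK[target] > LAYER_RANK[own]:
                bad.append(name)
    return bad


# A toolkit module importing the CLI layer breaks the rule.
upward = violations("nemo_runspec.config", "from nemotron.cli.commands import nano3\n")
# A CLI module importing the toolkit and kit is fine.
downward = violations(
    "nemotron.cli.commands.nano3.pretrain",
    "import nemo_runspec\nfrom nemotron.kit import artifacts\n",
)
```

Run over every file in a package, a check like this could sit in CI so that an accidental upward import fails fast.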