Job Launchers
NeMo AutoModel provides several ways to launch training. The right choice depends on your hardware and environment.
Which Launcher Should I Use?
I Have 1–2 GPUs on My Workstation
Use the interactive launcher, which runs training directly on the current machine. No scheduler or cluster software is needed.
See the Run on Your Local Workstation guide.
I Have Access to a Slurm Cluster
Add a slurm: section to your YAML config and submit with the same automodel command you would use locally. The CLI generates the torchrun invocation and calls sbatch for you.
See the Run on a Cluster guide.
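To make the "torchrun inside sbatch" idea concrete, here is a minimal sketch of the kind of batch script such a launcher could produce. It is not the script the automodel CLI actually generates. The function name, its parameters, the port 29500, and the train.py entry point with its --config flag are all assumptions made for illustration. Only the #SBATCH directives, scontrol, srun, and the torchrun flags are standard Slurm and PyTorch options.

```python
def render_sbatch_script(job_name, nodes, gpus_per_node, time_limit, train_args):
    """Render an sbatch script that starts one torchrun process per node.

    train_args is the training entry point plus its arguments
    (hypothetical here; the real CLI chooses these for you).
    """
    torchrun = " ".join(
        [
            "srun torchrun",
            f"--nnodes={nodes}",
            f"--nproc_per_node={gpus_per_node}",
            "--rdzv_id=$SLURM_JOB_ID",
            "--rdzv_backend=c10d",
            "--rdzv_endpoint=$head_node:29500",  # port is an arbitrary choice
            *train_args,
        ]
    )
    lines = [
        "#!/bin/bash",
        f"#SBATCH --job-name={job_name}",
        f"#SBATCH --nodes={nodes}",
        "#SBATCH --ntasks-per-node=1",  # one torchrun per node; it spawns the GPU workers
        f"#SBATCH --gpus-per-node={gpus_per_node}",
        f"#SBATCH --time={time_limit}",
        "",
        "# Use the first allocated node as the rendezvous host.",
        'head_node=$(scontrol show hostnames "$SLURM_JOB_NODELIST" | head -n 1)',
        torchrun,
    ]
    return "\n".join(lines) + "\n"


script = render_sbatch_script(
    job_name="finetune",
    nodes=2,
    gpus_per_node=8,
    time_limit="01:00:00",
    train_args=["train.py", "--config", "config.yaml"],  # hypothetical entry point
)
print(script)
```

Saving this string to a file and passing it to sbatch would be the manual equivalent of what the slurm: launcher automates, which is why no hand-written batch script is needed when you use the CLI.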
I Want Managed Job Submission (Slurm, Kubernetes, Docker)
Add a nemo_run: section to your YAML config. NeMo-Run loads a pre-configured executor for your compute target (Slurm, Kubernetes, or Docker) and submits the job.
See the NeMo-Run guide.
I Want to Train on the Cloud
Add a skypilot: section to your YAML config. SkyPilot provisions VMs on any major cloud and handles spot-instance preemption automatically.
See the SkyPilot guide.
I Want to Train on Kubernetes with SkyPilot
Use the same skypilot: launcher, but set cloud: kubernetes. This is a good fit when your team already has a GPU-backed Kubernetes cluster and you want SkyPilot to handle job submission and multi-node orchestration.
See the SkyPilot + Kubernetes tutorial.
All Launchers Use the Same Config
Every launcher shares the same YAML recipe format. The only difference is an optional launcher section (slurm:, nemo_run:, or skypilot:) that tells the CLI where to run. Without a launcher section, training runs interactively on the current machine.
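The selection rule described above can be sketched in a few lines of Python. This illustrates the rule and is not the CLI's actual implementation. The function names, the model key in the sample recipes, and the choice to reject configs with more than one launcher section are assumptions. Only the section names slurm, nemo_run, and skypilot, and the cloud: kubernetes setting, come from this page.

```python
LAUNCHER_SECTIONS = ("slurm", "nemo_run", "skypilot")


def select_launcher(config):
    """Return the launcher implied by a parsed recipe dict."""
    present = [name for name in LAUNCHER_SECTIONS if name in config]
    if len(present) > 1:
        # Assumption: more than one launcher section is treated as ambiguous.
        raise ValueError(f"multiple launcher sections found: {present}")
    # No launcher section means training runs interactively on this machine.
    return present[0] if present else "interactive"


def split_recipe(config):
    """Separate the launcher options from the shared training recipe."""
    launcher = select_launcher(config)
    recipe = {k: v for k, v in config.items() if k not in LAUNCHER_SECTIONS}
    options = config.get(launcher, {}) if launcher != "interactive" else {}
    return launcher, options, recipe


# Hypothetical recipes: the "model" key stands in for real recipe content.
local = {"model": {"name": "example"}}
k8s = {"model": {"name": "example"}, "skypilot": {"cloud": "kubernetes"}}

print(select_launcher(local))
print(split_recipe(k8s))
```

Both sample configs yield the same recipe dict once the launcher section is removed, so moving a job from a workstation to a cluster or cloud means adding or swapping one section rather than rewriting the recipe.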