nemo_automodel.components.launcher.nemo_run.launcher#
Module Contents#
Classes#
NemoRunLauncher | Launch a recipe via NeMo-Run’s executor API. |
Data#
API#
- nemo_automodel.components.launcher.nemo_run.launcher.logger#
‘getLogger(…)’
- nemo_automodel.components.launcher.nemo_run.launcher._CONFIG_FILENAME#
‘automodel_config.yaml’
- nemo_automodel.components.launcher.nemo_run.launcher._REMOTE_CONFIG_PATH#
None
- class nemo_automodel.components.launcher.nemo_run.launcher.NemoRunLauncher[source]#
Bases: nemo_automodel.components.launcher.base.Launcher
Launch a recipe via NeMo-Run’s executor API.
Supports loading pre-configured executors from $NEMORUN_HOME/executors.py (or a custom path) and submitting jobs as nemo_run.Script objects. Works with any NeMo-Run executor backend (Slurm, Kubernetes, Docker, local).
Uses NeMo-Run’s native Torchrun launcher so that distributed training arguments (rendezvous, node rank, nproc-per-node) are managed automatically. The training config YAML is packaged via PatternPackager so it is available at /nemo_run/code/automodel_config.yaml inside the container.
- static _configure_torchrun(executor: Any, devices: int) → None[source]#
Enable the native NeMo-Run Torchrun launcher on executor.
Sets executor.launcher = "torchrun" and torchrun_nproc_per_node so NeMo-Run generates the correct torchrun --nproc-per-node=<N> invocation in the sbatch script.
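The effect described for _configure_torchrun can be illustrated with a minimal sketch. It assumes the method does nothing more than set the two attributes named above; the actual implementation may differ. The function name configure_torchrun and the SimpleNamespace stand-in for an executor are hypothetical and used only so the example runs without NeMo-Run installed.

```python
from types import SimpleNamespace
from typing import Any


def configure_torchrun(executor: Any, devices: int) -> None:
    """Hypothetical stand-in for NemoRunLauncher._configure_torchrun."""
    # Select the torchrun launcher by name, as the description states.
    executor.launcher = "torchrun"
    # Number of processes to start on each node.
    executor.torchrun_nproc_per_node = devices


# Hypothetical executor object; a real one would be a NeMo-Run executor
# loaded from $NEMORUN_HOME/executors.py or a custom path.
executor = SimpleNamespace(launcher=None, torchrun_nproc_per_node=None)

configure_torchrun(executor, devices=8)
print(executor.launcher, executor.torchrun_nproc_per_node)
```

With devices=8, the attributes set here correspond to the torchrun --nproc-per-node=8 invocation that the description says NeMo-Run generates.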