nemo_automodel.components.launcher.interactive#

Module Contents#

Classes#

InteractiveLauncher

Launch a recipe locally on the current node using torchrun or in-process.

Functions#

_get_repo_root

Return the repository root. If CWD looks like an editable checkout, prepend it to PYTHONPATH so the local source takes precedence.

resolve_recipe_cls

Import and return the recipe class from a dotted path.

_recipe_module_path

Convert a dotted recipe target into an absolute filesystem path.

Data#

logger

_INSTALL_MSG

API#

nemo_automodel.components.launcher.interactive.logger#

‘getLogger(…)’

nemo_automodel.components.launcher.interactive._get_repo_root() pathlib.Path[source]#

Return the repository root. If CWD looks like an editable checkout, prepend it to PYTHONPATH so the local source takes precedence.
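
The behavior described above can be illustrated with a minimal sketch. The checkout heuristic (a `pyproject.toml` next to a `nemo_automodel` directory), the `fallback` parameter, and the function names are assumptions made for illustration; the actual checks performed by `_get_repo_root` are not documented here.

```python
import os
from pathlib import Path
from typing import MutableMapping


def looks_like_checkout(path: Path) -> bool:
    # Hypothetical heuristic: a project file plus the package directory.
    return (path / "pyproject.toml").is_file() and (path / "nemo_automodel").is_dir()


def repo_root_sketch(cwd: Path, fallback: Path, env: MutableMapping[str, str]) -> Path:
    """Return cwd if it looks like a checkout, prepending it to PYTHONPATH."""
    if not looks_like_checkout(cwd):
        # Assumption: some other location is used when cwd is not a checkout.
        return fallback
    root = str(cwd)
    # Keep existing entries, but put the checkout first so it wins on import.
    parts = [p for p in env.get("PYTHONPATH", "").split(os.pathsep) if p]
    if root not in parts:
        env["PYTHONPATH"] = os.pathsep.join([root, *parts])
    return cwd
```

Passing the environment as a mapping keeps the sketch easy to test; a real launcher would more likely operate on `os.environ` so that child processes inherit the updated PYTHONPATH.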

nemo_automodel.components.launcher.interactive.resolve_recipe_cls(target_str: str)[source]#

Import and return the recipe class from a dotted path.
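
Resolving a class from a dotted path is commonly done with `importlib`. The sketch below shows one way to do it; the function name `resolve_class` and the error handling are assumptions, not the library's implementation.

```python
import importlib


def resolve_class(target: str) -> type:
    """Import 'package.module.ClassName' and return the class object."""
    module_name, sep, class_name = target.rpartition(".")
    if not sep or not module_name:
        raise ValueError(f"Expected 'package.module.ClassName', got {target!r}")
    module = importlib.import_module(module_name)
    try:
        return getattr(module, class_name)
    except AttributeError as exc:
        raise ImportError(f"{module_name!r} has no attribute {class_name!r}") from exc


# Standard-library example, so the sketch runs without nemo_automodel installed.
cls = resolve_class("collections.OrderedDict")
print(cls.__name__)  # OrderedDict
```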

pip install nemo-automodel          # CPU/basic

pip install nemo-automodel[all]     # with CUDA & all extras

nemo_automodel.components.launcher.interactive._recipe_module_path(
recipe_target: str,
repo_root: pathlib.Path,
) pathlib.Path[source]#

Convert a dotted recipe target into an absolute filesystem path.
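
One plausible mapping from a dotted target to a file path is sketched below. It assumes the last component of the target is a class name and that the remaining components mirror the directory layout under `repo_root`; both are assumptions. The target `my_pkg.recipes.train.TrainRecipe` is hypothetical.

```python
from pathlib import Path


def recipe_module_path_sketch(recipe_target: str, repo_root: Path) -> Path:
    """Map 'pkg.module.ClassName' to '<repo_root>/pkg/module.py' (absolute)."""
    module_name, sep, _class_name = recipe_target.rpartition(".")
    if not sep or not module_name:
        raise ValueError(f"Expected 'package.module.ClassName', got {recipe_target!r}")
    # Each dotted component becomes one path segment; the module gets '.py'.
    relative = Path(*module_name.split(".")).with_suffix(".py")
    return (repo_root / relative).resolve()


# Hypothetical target, used only to show the shape of the result.
path = recipe_module_path_sketch("my_pkg.recipes.train.TrainRecipe", Path.cwd())
print(path.relative_to(Path.cwd().resolve()).as_posix())  # my_pkg/recipes/train.py
```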

nemo_automodel.components.launcher.interactive._INSTALL_MSG = <Multiline-String>#

class nemo_automodel.components.launcher.interactive.InteractiveLauncher[source]#

Bases: nemo_automodel.components.launcher.base.Launcher

Launch a recipe locally on the current node using torchrun or in-process.

static _is_torchrun_worker() bool[source]#

Return True when this process was already spawned by torchrun.

torchrun (torch.distributed.run) sets both LOCAL_RANK and TORCHELASTIC_RUN_ID in the environment of every worker it spawns. We check for both to avoid false positives from environments (e.g. SLURM) that may set LOCAL_RANK without an active torchrun session.

When the user launches the CLI via torchrun --nproc-per-node N -m nemo_automodel.cli.app config.yaml, each worker must run the recipe in-process instead of re-launching torchrun.
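
The two-variable check described above can be sketched as follows. The function name and the `env` parameter are illustrative assumptions; only the variable names LOCAL_RANK and TORCHELASTIC_RUN_ID come from the description.

```python
import os
from typing import Mapping


def is_torchrun_worker(env: Mapping[str, str] = os.environ) -> bool:
    """True only when both variables set by torchrun are present."""
    return "LOCAL_RANK" in env and "TORCHELASTIC_RUN_ID" in env


# A scheduler-style environment that sets LOCAL_RANK alone is not a torchrun worker.
print(is_torchrun_worker({"LOCAL_RANK": "0"}))  # False
# Both variables present: treat the process as a torchrun worker.
print(is_torchrun_worker({"LOCAL_RANK": "0", "TORCHELASTIC_RUN_ID": "none"}))  # True
```

Requiring both variables is what prevents the launcher from mistaking a plain scheduler job for a worker and skipping the torchrun launch.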

_run_recipe_in_process(
recipe_target: str,
config: Dict[str, Any],
) int[source]#

Instantiate and run a recipe in the current process.
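
As a rough illustration of running a recipe in the current process, the sketch below instantiates a class with the config and calls it. `EchoRecipe`, its `run` method, and the convention of returning 0 on success and 1 on failure are all hypothetical; the actual recipe interface and the meaning of the returned `int` are not documented here.

```python
import logging
from typing import Any, Dict

log = logging.getLogger(__name__)


class EchoRecipe:
    """Hypothetical stand-in recipe; not part of nemo_automodel."""

    def __init__(self, config: Dict[str, Any]) -> None:
        self.config = config
        self.ran = False

    def run(self) -> None:
        # A real recipe would train or evaluate here.
        self.ran = True


def run_in_process(recipe_cls: type, config: Dict[str, Any]) -> int:
    """Instantiate recipe_cls with config, run it, and return an exit-style code."""
    try:
        recipe = recipe_cls(config)
        recipe.run()
    except Exception:
        log.exception("Recipe %s failed", recipe_cls.__name__)
        return 1
    return 0


print(run_in_process(EchoRecipe, {"steps": 1}))  # 0
```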

launch(
config: Dict[str, Any],
config_path: pathlib.Path,
recipe_target: str,
launcher_config: Any = None,
extra_args: Optional[List[str]] = None,
) int[source]#