YAML Configuration and Environment Variable Interpolation#
NeMo Automodel recipes are configured with YAML. Under the hood, YAML is parsed into a ConfigNode which:
Translates common scalar strings into typed Python values (e.g.,
"10"→10).Resolves
_target_(and*_fn) into Python callables/classes.Supports environment variable interpolation inside YAML strings.
Tries to make config printing safe by preserving original placeholders (to avoid leaking secrets).
Load Model and Dataset Configs#
Most recipes load the YAML via nemo_automodel.components.config.loader.load_yaml_config(), which returns a ConfigNode.
Within a ConfigNode:
Nested dicts become nested
ConfigNodeobjects.Lists are recursively wrapped.
Scalars are translated with
translate_value()when they are YAML strings.
Typed Scalar Translation (translate_value)#
Only strings are translated. Examples:
"123"→123"3.14"→3.14"true"/"false"→True/False"None"/"none"→None
YAML-native types (like step_size: 10 without quotes) are already typed by the YAML parser and remain unchanged.
Use _target_ for Instantiation#
Any mapping containing a _target_ key can be instantiated via ConfigNode.instantiate():
model:
_target_: nemo_automodel.NeMoAutoModelForCausalLM.from_pretrained
pretrained_model_name_or_path: meta-llama/Llama-3.2-1B
There is also support for resolving callables from:
Dotted paths:
pkg.module.symbolLocal file paths:
/abs/path/to/file.py:symbol
Safety and Policy#
By default, resolving targets is restricted:
Imports are allowed from common safe prefixes (e.g.
nemo_automodel,torch,transformers, …).Accessing private or dunder attributes is blocked by default.
Loading out-of-tree user code can be enabled with
NEMO_ENABLE_USER_MODULES=1or by callingset_enable_user_modules(True).
Interpolate Environment Variables in YAML#
NeMo Automodel supports env var interpolation inside YAML string values.
Supported Forms#
Braced:
${VAR}${VAR,default}${var.dot.var}(dots are treated as part of the env var name)
Dollar:
$VAR$var.dot.var
Back-compat:
${oc.env:VAR}${oc.env:VAR,default}
Interpolation Behavior#
Interpolation happens when values are wrapped into a
ConfigNode.If a referenced env var is missing and no default is provided, config loading raises a
KeyError.Defaults are supported only for braced forms via the first comma:
${VAR,default_value}.
Example (Databricks Delta)#
dataset:
_target_: nemo_automodel.components.datasets.llm.column_mapped_text_instruction_iterable_dataset.ColumnMappedTextInstructionIterableDataset
path_or_dataset_id: delta://catalog.schema.training_data
delta_storage_options:
DATABRICKS_HOST: ${DATABRICKS_HOST}
DATABRICKS_TOKEN: ${DATABRICKS_TOKEN}
DATABRICKS_HTTP_PATH: ${DATABRICKS_HTTP_PATH}
Prevent Secret Leakage in Logs#
When an env var placeholder is resolved, the config keeps the original placeholder in an internal ._orig_value field for safe printing:
str(cfg)/repr(cfg)prints placeholders (e.g.${DATABRICKS_TOKEN}), not resolved secrets.cfg.to_yaml_dict(use_orig_values=True, redact_sensitive=True)is the recommended way to produce a loggable YAML dict.
Important
Printing a leaf value (for example, print(cfg.dataset.delta_storage_options.DATABRICKS_TOKEN)) outputs the resolved secret. Instead, print the full config or use a redacted YAML dict.
Configure Slurm (automodel CLI)#
The automodel CLI loads YAML via yaml.safe_load() and then extracts the slurm: section.
Since the slurm: dict is not wrapped into a ConfigNode, env placeholders are passed through as-is. This lets you defer expansion to job runtime (and avoids embedding secrets into generated scripts).
Example:
slurm:
hf_home: ${HF_HOME} # passed through to the template as-is
hf_token: ${HF_TOKEN} # also passed through (recommended for secrets)
env_vars:
WANDB_API_KEY: ${WANDB_API_KEY}
Note
job_diris used by the CLI on the submit host to create the local log directory and write the sbatch script/config. If you setjob_dirto a placeholder like${SLURM_JOB_DIR}, the CLI will treat it literally.Some values are rendered into
#SBATCHdirectives (which are not shell-expanded). Prefer env placeholders for runtimeexport ...lines (hf_token,env_vars, etc.), not for SBATCH fields.The
slurm:section is passed through to a bash script. Use bash-compatible syntax ($VAR/${VAR}) there. Python-only forms like${oc.env:VAR}(and dotted names like${foo.bar}) are not valid bash parameter expansions and can fail at runtime.