nemo_rl.modelopt.utils#

Lightweight quantization config resolver usable by both Megatron and vLLM workers.

Module Contents#

Functions#

resolve_quant_cfg

Resolve a quantization config string into a dict consumable by mtq.quantize.

API#

nemo_rl.modelopt.utils.resolve_quant_cfg(quant_cfg: str) → dict[str, Any]#

Resolve a quantization config string into a dict consumable by mtq.quantize.

Resolution order:

  1. A built-in ModelOpt config constant exposed on modelopt.torch.quantization (e.g. "NVFP4_DEFAULT_CFG", "FP8_DEFAULT_CFG").

  2. A ModelOpt PTQ recipe — either the name of a built-in recipe shipped under modelopt_recipes/ (e.g. "general/ptq/nvfp4_default-fp8_kv"; the .yml / .yaml suffix is optional) or the path to a user-authored YAML recipe. Resolution is performed by modelopt.recipe.load_config, which searches the filesystem first and then the built-in recipe library.
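A minimal usage sketch covering these resolution paths (the recipe name and config constant come from the examples above; the user YAML filename is hypothetical):

```python
from nemo_rl.modelopt.utils import resolve_quant_cfg

# 1. Built-in ModelOpt config constant, looked up on modelopt.torch.quantization.
cfg = resolve_quant_cfg("NVFP4_DEFAULT_CFG")

# 2a. Built-in recipe name; the .yml / .yaml suffix is optional.
cfg = resolve_quant_cfg("general/ptq/nvfp4_default-fp8_kv")

# 2b. Path to a user-authored YAML recipe (hypothetical filename).
cfg = resolve_quant_cfg("examples/modelopt/quant_configs/my_recipe.yaml")

# Each call returns the {"quant_cfg": ..., "algorithm": ...} dict that
# mtq.quantize expects, e.g.:
#   import modelopt.torch.quantization as mtq
#   model = mtq.quantize(model, cfg, forward_loop)
```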

YAML recipes are expected to follow the standard ModelOpt PTQ recipe layout with a top-level quantize: section in the {"quant_cfg": [...], "algorithm": ...} shape that mtq.quantize expects. A bare {"quant_cfg": [...], "algorithm": ...} document (without a wrapping quantize: key) is also accepted for convenience. The extracted dict — not the full recipe — is returned.
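As a shape sketch only, a user-authored recipe might look like the following; the quantizer patterns and values are illustrative assumptions following the usual ModelOpt wildcard-pattern convention, not a tested configuration:

```yaml
# Illustrative recipe shape only; see modelopt_recipes/general/ptq/ for real settings.
quantize:
  quant_cfg:
    "*weight_quantizer":   # wildcard pattern matching weight quantizers
      num_bits: 8
    "*input_quantizer":    # wildcard pattern matching activation quantizers
      num_bits: 8
    "default":
      enable: false        # leave unmatched quantizers disabled
  algorithm: max           # calibration algorithm passed through to mtq.quantize
```

Passing this file's path to resolve_quant_cfg returns only the dict under quantize:; the same document without the quantize: wrapper resolves to the same dict.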

See modelopt_recipes/general/ptq/ in the TensorRT-Model-Optimizer repo for the canonical format and examples/modelopt/quant_configs/ for a user-authored example.