# nemo_rl.modelopt.utils
Lightweight quantization config resolver usable by both Megatron and vLLM workers.
## Module Contents

### Functions

| Function | Description |
|---|---|
| `resolve_quant_cfg` | Resolve a quantization config string into a dict consumable by `mtq.quantize`. |
## API

- `nemo_rl.modelopt.utils.resolve_quant_cfg(quant_cfg: str) → dict[str, Any]`
Resolve a quantization config string into a dict consumable by `mtq.quantize`.

Resolution order:

1. A built-in ModelOpt config constant exposed on `modelopt.torch.quantization` (e.g. `"NVFP4_DEFAULT_CFG"`, `"FP8_DEFAULT_CFG"`).
2. A ModelOpt PTQ recipe: either the name of a built-in recipe shipped under `modelopt_recipes/` (e.g. `"general/ptq/nvfp4_default-fp8_kv"`; the `.yml`/`.yaml` suffix is optional) or the path to a user-authored YAML recipe. Resolution is performed by `modelopt.recipe.load_config`, which searches the filesystem first and then the built-in recipe library.
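The resolution order can be sketched in plain Python. This is an illustrative stand-in, not the actual implementation: the built-in-constant lookup and the recipe loader are stubbed with dicts here so the example is self-contained, whereas the real function consults `modelopt.torch.quantization` and `modelopt.recipe.load_config`. The specific quantizer settings are made up for the example.

```python
from typing import Any

# Stand-ins for modelopt.torch.quantization constants and the recipe
# library searched by modelopt.recipe.load_config (illustrative only).
BUILTIN_CFGS: dict[str, dict[str, Any]] = {
    "FP8_DEFAULT_CFG": {"quant_cfg": [{"num_bits": (4, 3)}], "algorithm": "max"},
}
RECIPES: dict[str, dict[str, Any]] = {
    "general/ptq/nvfp4_default-fp8_kv": {
        "quantize": {"quant_cfg": [{"num_bits": (2, 1)}], "algorithm": "max"},
    },
}

def resolve_quant_cfg(quant_cfg: str) -> dict[str, Any]:
    # 1. A built-in config constant wins if the name matches.
    if quant_cfg in BUILTIN_CFGS:
        return BUILTIN_CFGS[quant_cfg]
    # 2. Otherwise treat the string as a recipe name/path and load it.
    doc = RECIPES[quant_cfg]
    # Accept both a wrapped `quantize:` section and a bare document,
    # returning only the extracted config dict, not the full recipe.
    return doc.get("quantize", doc)

print(resolve_quant_cfg("FP8_DEFAULT_CFG")["algorithm"])  # max
```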
YAML recipes are expected to follow the standard ModelOpt PTQ recipe layout with a top-level `quantize:` section in the `{"quant_cfg": [...], "algorithm": ...}` shape that `mtq.quantize` expects. A bare `{"quant_cfg": [...], "algorithm": ...}` document (without a wrapping `quantize:` key) is also accepted for convenience. The extracted dict, not the full recipe, is returned.

See `modelopt_recipes/general/ptq/` in the TensorRT-Model-Optimizer repo for the canonical format and `examples/modelopt/quant_configs/` for a user-authored example.
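As a rough sketch, a user-authored recipe following this layout might look like the fragment below. The quantizer pattern and settings are hypothetical placeholders, not values from a shipped recipe; only the top-level `quantize:` / `quant_cfg` / `algorithm` structure is taken from the description above.

```yaml
# Hypothetical PTQ recipe: top-level `quantize:` section in the
# {"quant_cfg": ..., "algorithm": ...} shape that mtq.quantize expects.
quantize:
  quant_cfg:
    "*weight_quantizer":      # placeholder pattern for illustration
      num_bits: [4, 3]        # placeholder FP8-style setting
  algorithm: max
```

A bare document with only the `quant_cfg:` and `algorithm:` keys at the top level (no `quantize:` wrapper) would also be accepted.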