nemo_automodel.components.quantization.qat#

TorchAO Quantization-Aware Training (QAT) helpers for NeMo-AutoModel.

This module provides thin wrappers to:

  • Instantiate and apply torchao QAT quantizers to models (prepare)

  • Toggle fake-quant on/off during training (for delayed fake-quant)

Module Contents#

Functions#

get_quantizer_mode

Return a short mode string for a known torchao QAT quantizer.

get_disable_fake_quant_fn

Return the disable fake-quant function for a given quantizer mode.

get_enable_fake_quant_fn

Return the enable fake-quant function for a given quantizer mode.

prepare_qat_model

Apply a torchao QAT quantizer to the given model.

Data#

API#

nemo_automodel.components.quantization.qat.logger#

'getLogger(…)'

nemo_automodel.components.quantization.qat._QUANTIZER_TO_MODE#

None

nemo_automodel.components.quantization.qat._DISABLE_FN_BY_MODE#

None

nemo_automodel.components.quantization.qat._ENABLE_FN_BY_MODE#

None

nemo_automodel.components.quantization.qat.get_quantizer_mode(quantizer: object) → Optional[str]#

Return a short mode string for a known torchao QAT quantizer.

Returns None when the quantizer is unrecognized.
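The lookup contract described above (a mode string for known quantizers, None otherwise) can be illustrated with a minimal sketch. The classes, the table contents, and the mode strings below are hypothetical stand-ins. The actual contents of `_QUANTIZER_TO_MODE` are not shown on this page, and the real function may be implemented differently.

```python
from typing import Optional


class DemoQuantizerA:  # hypothetical stand-in, not a torchao class
    pass


class DemoQuantizerB:  # hypothetical stand-in, not a torchao class
    pass


# Hypothetical table; the real _QUANTIZER_TO_MODE contents are not documented here.
DEMO_QUANTIZER_TO_MODE = {DemoQuantizerA: "demo-a", DemoQuantizerB: "demo-b"}


def demo_get_quantizer_mode(quantizer: object) -> Optional[str]:
    """Map a quantizer instance to its mode string, or None if unrecognized."""
    for quantizer_cls, mode in DEMO_QUANTIZER_TO_MODE.items():
        if isinstance(quantizer, quantizer_cls):
            return mode
    return None
```

Callers should treat a None result as "no fake-quant toggles available" rather than as an error.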

nemo_automodel.components.quantization.qat.get_disable_fake_quant_fn(
mode: str,
) → Optional[Callable]#

Return the disable fake-quant function for a given quantizer mode.

nemo_automodel.components.quantization.qat.get_enable_fake_quant_fn(
mode: str,
) → Optional[Callable]#

Return the enable fake-quant function for a given quantizer mode.
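One way the enable and disable functions could be combined for delayed fake-quant is sketched below. The helper `toggle_fake_quant_at_step` and its parameters are hypothetical and not part of this module. The sketch also assumes that the returned callables are per-module functions that a caller applies across the model (for example through `model.apply`), which this page does not confirm.

```python
from typing import Callable, Optional


def toggle_fake_quant_at_step(
    step: int,
    start_step: int,
    apply_to_model: Callable[[Callable], None],
    disable_fn: Optional[Callable],
    enable_fn: Optional[Callable],
) -> Optional[str]:
    """Apply a fake-quant toggle if one is due at this step.

    Returns "disabled" or "enabled" when a toggle was applied, else None.
    """
    # At the first step, switch fake-quant off if it should start later.
    if step == 0 and start_step > 0 and disable_fn is not None:
        apply_to_model(disable_fn)
        return "disabled"
    # At the configured step, switch fake-quant back on.
    if step == start_step and enable_fn is not None:
        apply_to_model(enable_fn)
        return "enabled"
    return None


# In a real training loop the arguments might come from this module, e.g.
#   disable_fn = get_disable_fake_quant_fn(mode)
#   enable_fn = get_enable_fake_quant_fn(mode)
#   apply_to_model = model.apply   # assumption: callables act on one module
```

Because both getters return `Optional[Callable]`, the sketch checks for None before applying either function.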

nemo_automodel.components.quantization.qat.prepare_qat_model(
model,
quantizer,
) → tuple[object, Optional[str]]#

Apply a torchao QAT quantizer to the given model.

Returns the (possibly wrapped) model and a mode string if recognized.
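The documented return shape, a prepared model plus an optional mode string, can be sketched with stand-ins as follows. `StandInQuantizer`, `prepare_sketch`, and the `mode_of` parameter are hypothetical. The only assumption carried over from torchao is that QAT quantizers expose a `prepare(model)` method, and the real `prepare_qat_model` may do more than this.

```python
from typing import Callable, Optional, Tuple


class StandInQuantizer:
    """Hypothetical stand-in with a torchao-style prepare(model) method."""

    def prepare(self, model: dict) -> dict:
        # A real quantizer would swap layers for fake-quantized versions.
        prepared = dict(model)
        prepared["fake_quant"] = True
        return prepared


def prepare_sketch(
    model: dict,
    quantizer: StandInQuantizer,
    mode_of: Callable[[object], Optional[str]],
) -> Tuple[dict, Optional[str]]:
    """Mimic the documented return shape: (prepared model, mode or None)."""
    return quantizer.prepare(model), mode_of(quantizer)


# The stand-in lookup returns None, as it would for an unrecognized quantizer.
model, mode = prepare_sketch({"layers": 2}, StandInQuantizer(), lambda q: None)

# Only look up enable/disable functions when a mode was recognized.
can_toggle = mode is not None
```

Guarding on `mode is not None` keeps the training loop working with quantizers the module does not recognize, at the cost of skipping delayed fake-quant for them.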

nemo_automodel.components.quantization.qat.__all__#

['get_quantizer_mode', 'get_disable_fake_quant_fn', 'get_enable_fake_quant_fn', 'prepare_qat_model']