nemo_automodel.components.quantization.qat#
TorchAO Quantization-Aware Training (QAT) helpers for NeMo-AutoModel.
This module provides thin wrappers to:
- Instantiate and apply torchao QAT quantizers to models (prepare)
- Toggle fake-quant on/off during training (for delayed fake-quant)
Module Contents#
Functions#
| get_quantizer_mode | Return a short mode string for a known torchao QAT quantizer. |
| get_disable_fake_quant_fn | Return the disable fake-quant function for a given quantizer mode. |
| get_enable_fake_quant_fn | Return the enable fake-quant function for a given quantizer mode. |
| prepare_qat_model | Apply a torchao QAT quantizer to the given model. |
Data#
API#
- nemo_automodel.components.quantization.qat.logger#
'getLogger(…)'
- nemo_automodel.components.quantization.qat._QUANTIZER_TO_MODE#
None
- nemo_automodel.components.quantization.qat._DISABLE_FN_BY_MODE#
None
- nemo_automodel.components.quantization.qat._ENABLE_FN_BY_MODE#
None
- nemo_automodel.components.quantization.qat.get_quantizer_mode(quantizer: object) -> Optional[str]#
Return a short mode string for a known torchao QAT quantizer.
Returns None when the quantizer is unrecognized.
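A lookup of this kind can be pictured as a table keyed by quantizer class. The sketch below is a self-contained illustration, not the module's source: the two classes are empty stand-ins for torchao quantizers, the mode strings are placeholders, and `sketch_get_quantizer_mode` is a hypothetical name.

```python
from typing import Optional


# Hypothetical stand-ins for torchao QAT quantizer classes.
class FakeInt8DynActInt4WeightQuantizer:
    pass


class FakeInt4WeightOnlyQuantizer:
    pass


# Assumed shape of a class-to-mode table; the mode strings are placeholders.
QUANTIZER_TO_MODE = {
    FakeInt8DynActInt4WeightQuantizer: "8da4w",
    FakeInt4WeightOnlyQuantizer: "4w",
}


def sketch_get_quantizer_mode(quantizer: object) -> Optional[str]:
    """Return the mode for a recognized quantizer instance, else None."""
    for cls, mode in QUANTIZER_TO_MODE.items():
        if isinstance(quantizer, cls):
            return mode
    return None


known_mode = sketch_get_quantizer_mode(FakeInt4WeightOnlyQuantizer())
unknown_mode = sketch_get_quantizer_mode(object())
print(known_mode)    # 4w
print(unknown_mode)  # None
```

Callers that rely on the mode should handle the None case explicitly, since an unrecognized quantizer has no associated enable or disable function.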
- nemo_automodel.components.quantization.qat.get_disable_fake_quant_fn(mode: str)
Return the disable fake-quant function for a given quantizer mode.
- nemo_automodel.components.quantization.qat.get_enable_fake_quant_fn(mode: str)
Return the enable fake-quant function for a given quantizer mode.
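The two getters pair naturally: one mode string selects a disable function and its matching enable function. The following is a minimal sketch of that pattern under stated assumptions. `FakeQuantLinear`, the `"demo-mode"` key, and the `sketch_*` helpers are all hypothetical, and returning None for an unknown mode is a choice made for this example rather than documented behavior.

```python
from typing import Callable, Dict, Optional


class FakeQuantLinear:
    """Hypothetical stand-in for a layer that carries a fake-quant switch."""

    def __init__(self) -> None:
        self.fake_quant_enabled = True


def _disable(layer: object) -> None:
    # Only touch layers that actually carry the switch.
    if isinstance(layer, FakeQuantLinear):
        layer.fake_quant_enabled = False


def _enable(layer: object) -> None:
    if isinstance(layer, FakeQuantLinear):
        layer.fake_quant_enabled = True


# Assumed layout: one table per direction, keyed by mode string.
DISABLE_FN_BY_MODE: Dict[str, Callable[[object], None]] = {"demo-mode": _disable}
ENABLE_FN_BY_MODE: Dict[str, Callable[[object], None]] = {"demo-mode": _enable}


def sketch_get_disable_fn(mode: str) -> Optional[Callable[[object], None]]:
    # Returning None for an unknown mode is an assumption of this sketch.
    return DISABLE_FN_BY_MODE.get(mode)


def sketch_get_enable_fn(mode: str) -> Optional[Callable[[object], None]]:
    return ENABLE_FN_BY_MODE.get(mode)


layers = [FakeQuantLinear(), FakeQuantLinear()]

disable_fn = sketch_get_disable_fn("demo-mode")
for layer in layers:
    disable_fn(layer)
states_after_disable = [layer.fake_quant_enabled for layer in layers]

enable_fn = sketch_get_enable_fn("demo-mode")
for layer in layers:
    enable_fn(layer)
states_after_enable = [layer.fake_quant_enabled for layer in layers]

print(states_after_disable)  # [False, False]
print(states_after_enable)   # [True, True]
```

With a real `torch.nn.Module`, a per-module function of this shape is typically applied to every submodule through `model.apply(fn)` instead of an explicit loop.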
- nemo_automodel.components.quantization.qat.prepare_qat_model(model, quantizer)
Apply a torchao QAT quantizer to the given model.
Returns the (possibly wrapped) model and a mode string if recognized.
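To show how the prepare step and delayed fake-quant might fit together in a training loop, here is a self-contained sketch. It uses toy stand-ins, not torchao or NeMo-AutoModel code: `ToyModel`, `ToyQuantizer`, `sketch_prepare_qat_model`, the `"demo-mode"` string, and the `start_step` parameter are all hypothetical. The only borrowed convention is that the quantizer exposes a `prepare(model)` method, as torchao QAT quantizers do.

```python
from typing import List, Optional, Tuple


class ToyModel:
    """Toy model that tracks whether it was prepared and whether fake-quant is on."""

    def __init__(self) -> None:
        self.prepared = False
        self.fake_quant_enabled = False


class ToyQuantizer:
    """Stand-in quantizer exposing prepare(model)."""

    def prepare(self, model: ToyModel) -> ToyModel:
        model.prepared = True
        model.fake_quant_enabled = True
        return model


def sketch_prepare_qat_model(
    model: ToyModel, quantizer: ToyQuantizer
) -> Tuple[ToyModel, Optional[str]]:
    # The mode is None when the quantizer type is not recognized.
    mode = "demo-mode" if isinstance(quantizer, ToyQuantizer) else None
    return quantizer.prepare(model), mode


def run_delayed_fake_quant(total_steps: int, start_step: int) -> List[bool]:
    """Train with fake-quant off until start_step, then switch it on."""
    model, mode = sketch_prepare_qat_model(ToyModel(), ToyQuantizer())
    if mode is not None:
        model.fake_quant_enabled = False  # begin training in full precision
    history = []
    for step in range(total_steps):
        if mode is not None and step == start_step:
            model.fake_quant_enabled = True  # delayed fake-quant kicks in
        history.append(model.fake_quant_enabled)  # a real loop would train here
    return history


history = run_delayed_fake_quant(total_steps=5, start_step=2)
print(history)  # [False, False, True, True, True]
```

Guarding the toggles on `mode is not None` keeps the loop safe for quantizers that have no known enable or disable function.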
- nemo_automodel.components.quantization.qat.__all__#
['get_quantizer_mode', 'get_disable_fake_quant_fn', 'get_enable_fake_quant_fn', 'prepare_qat_model']