# nemo_automodel.components.quantization.qat

TorchAO Quantization-Aware Training (QAT) helpers for NeMo-AutoModel.

This module provides:

* `QATConfig`: a configuration class for QAT settings
* Thin wrappers that instantiate torchao QAT quantizers and apply them to models (prepare)
* Helpers that toggle fake-quant on or off during training (for delayed fake-quant)

## Module Contents

### Classes

| Name                                                                 | Description                                          |
| -------------------------------------------------------------------- | ---------------------------------------------------- |
| [`QATConfig`](#nemo_automodel-components-quantization-qat-QATConfig) | Configuration for Quantization-Aware Training (QAT). |

### Functions

| Name                                                                                                 | Description                                                        |
| ---------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------ |
| [`get_disable_fake_quant_fn`](#nemo_automodel-components-quantization-qat-get_disable_fake_quant_fn) | Return the disable fake-quant function for a given quantizer mode. |
| [`get_enable_fake_quant_fn`](#nemo_automodel-components-quantization-qat-get_enable_fake_quant_fn)   | Return the enable fake-quant function for a given quantizer mode.  |
| [`get_quantizer_mode`](#nemo_automodel-components-quantization-qat-get_quantizer_mode)               | Return a short mode string for a known torchao QAT quantizer.      |
| [`prepare_qat_model`](#nemo_automodel-components-quantization-qat-prepare_qat_model)                 | Apply a torchao QAT quantizer to the given model.                  |

### Data

[`_DISABLE_FN_BY_MODE`](#nemo_automodel-components-quantization-qat-_DISABLE_FN_BY_MODE)

[`_ENABLE_FN_BY_MODE`](#nemo_automodel-components-quantization-qat-_ENABLE_FN_BY_MODE)

[`_QUANTIZER_TO_MODE`](#nemo_automodel-components-quantization-qat-_QUANTIZER_TO_MODE)

[`__all__`](#nemo_automodel-components-quantization-qat-__all__)

[`logger`](#nemo_automodel-components-quantization-qat-logger)

### API

```python
class nemo_automodel.components.quantization.qat.QATConfig(
    quantizer_type: typing.Literal['int8_dynact_int4weight', 'int4_weight_only'] = 'int8_dynact_int4weight',
    quantizer_kwargs = {}
)
```

Dataclass

Configuration for Quantization-Aware Training (QAT).

This config controls how QAT quantizers are instantiated and applied to models.
QAT is enabled when this config is passed to `from_pretrained`/`from_config`.

```python
nemo_automodel.components.quantization.qat.QATConfig.create_quantizer()
```

Create and return the appropriate QAT quantizer based on config.

**Returns:**

A torchao QAT quantizer instance (`Int8DynActInt4WeightQATQuantizer` or `Int4WeightOnlyQATQuantizer`, the two quantizer classes this module maps to a mode).

**Raises:**

* `ValueError`: If `quantizer_type` is not recognized.

```python
nemo_automodel.components.quantization.qat.QATConfig.to_dict() -> typing.Dict[str, typing.Any]
```

Convert config to dictionary.
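
The two methods above can be pictured with a small, self-contained sketch. It is not the module's implementation: `QATConfigSketch` and both `Stub...` classes are hypothetical stand-ins so the example runs without torchao installed, and the mapping from each `quantizer_type` string to a quantizer class is an assumption based on the names. The `groupsize` key in the usage lines is only an illustrative keyword argument.

```python
from dataclasses import asdict, dataclass, field
from typing import Any, Dict, Literal


class StubInt8DynActInt4WeightQuantizer:
    """Hypothetical stand-in for torchao's Int8DynActInt4WeightQATQuantizer."""

    def __init__(self, **kwargs: Any) -> None:
        self.kwargs = kwargs


class StubInt4WeightOnlyQuantizer:
    """Hypothetical stand-in for torchao's Int4WeightOnlyQATQuantizer."""

    def __init__(self, **kwargs: Any) -> None:
        self.kwargs = kwargs


@dataclass
class QATConfigSketch:
    """Illustrative config; mirrors the documented fields only."""

    quantizer_type: Literal["int8_dynact_int4weight", "int4_weight_only"] = "int8_dynact_int4weight"
    # A dataclass needs default_factory for a mutable default such as a dict.
    quantizer_kwargs: Dict[str, Any] = field(default_factory=dict)

    def create_quantizer(self) -> object:
        # Assumed dispatch: pick the quantizer class from the type string.
        if self.quantizer_type == "int8_dynact_int4weight":
            return StubInt8DynActInt4WeightQuantizer(**self.quantizer_kwargs)
        if self.quantizer_type == "int4_weight_only":
            return StubInt4WeightOnlyQuantizer(**self.quantizer_kwargs)
        raise ValueError(f"Unknown quantizer_type: {self.quantizer_type!r}")

    def to_dict(self) -> Dict[str, Any]:
        # One plausible way to produce a plain dictionary from the fields.
        return asdict(self)


cfg = QATConfigSketch(quantizer_type="int4_weight_only", quantizer_kwargs={"groupsize": 32})
quantizer = cfg.create_quantizer()
config_dict = cfg.to_dict()
```

In this sketch an unrecognized `quantizer_type` fails at `create_quantizer()` time rather than at construction, which matches the documented `ValueError` but is otherwise a design guess.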

```python
nemo_automodel.components.quantization.qat.get_disable_fake_quant_fn(
    mode: str
) -> typing.Optional[typing.Callable]
```

Return the disable fake-quant function for a given quantizer mode.

```python
nemo_automodel.components.quantization.qat.get_enable_fake_quant_fn(
    mode: str
) -> typing.Optional[typing.Callable]
```

Return the enable fake-quant function for a given quantizer mode.
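
A rough sketch of how the enable and disable getters might drive delayed fake-quant follows. Everything in it is hypothetical: `FakeQuantLayer`, the stub toggle functions, the `..._sketch` getters, and `toggle_for_step` are illustration-only names, and the example assumes that an unrecognized mode yields `None`, as the `Optional` return type suggests. In a real training loop the returned function would more likely be applied across the model's modules; that detail is also an assumption here.

```python
from typing import Callable, Dict, List, Optional


class FakeQuantLayer:
    """Hypothetical layer that carries a fake-quant on/off switch."""

    def __init__(self) -> None:
        self.fake_quant_enabled = True


def disable_stub(layer: FakeQuantLayer) -> None:
    layer.fake_quant_enabled = False


def enable_stub(layer: FakeQuantLayer) -> None:
    layer.fake_quant_enabled = True


# Mode-keyed tables, shaped like the module's private lookup dictionaries.
_DISABLE_SKETCH: Dict[str, Callable] = {"8da4w-qat": disable_stub}
_ENABLE_SKETCH: Dict[str, Callable] = {"8da4w-qat": enable_stub}


def get_disable_fn_sketch(mode: str) -> Optional[Callable]:
    return _DISABLE_SKETCH.get(mode)


def get_enable_fn_sketch(mode: str) -> Optional[Callable]:
    return _ENABLE_SKETCH.get(mode)


def toggle_for_step(layers: List[FakeQuantLayer], mode: str, step: int, delay_steps: int) -> None:
    """Disable fake-quant at step 0 and re-enable it once delay_steps is reached."""
    if delay_steps <= 0:
        return  # no delay requested: leave fake-quant as prepared
    if step == 0:
        fn = get_disable_fn_sketch(mode)
    elif step == delay_steps:
        fn = get_enable_fn_sketch(mode)
    else:
        return
    if fn is None:
        return  # unrecognized mode: nothing to toggle
    for layer in layers:
        fn(layer)


layers = [FakeQuantLayer(), FakeQuantLayer()]
history = []
for step in range(5):
    toggle_for_step(layers, "8da4w-qat", step, delay_steps=3)
    history.append(layers[0].fake_quant_enabled)
```

With `delay_steps=3`, the recorded history is off for steps 0 to 2 and on from step 3 onward.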

```python
nemo_automodel.components.quantization.qat.get_quantizer_mode(
    quantizer: object
) -> typing.Optional[str]
```

Return a short mode string for a known torchao QAT quantizer.

Returns `None` when the quantizer is unrecognized.

```python
nemo_automodel.components.quantization.qat.prepare_qat_model(
    model,
    quantizer
) -> tuple[object, typing.Optional[str]]
```

Apply a torchao QAT quantizer to the given model.

Returns the (possibly wrapped) model and a mode string if recognized.
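
As a minimal sketch of the prepare-then-identify flow described for `get_quantizer_mode` and `prepare_qat_model`, the example below uses stand-in classes so it runs without torch or torchao. `TinyModel`, both quantizer stubs, and the `..._sketch` functions are hypothetical; the only behavior taken from this page is that the quantizer is applied via a prepare step and that the mode is `None` for an unrecognized quantizer. The `isinstance` lookup is an assumption about how the type-to-mode table is consulted.

```python
from typing import Dict, Optional, Tuple


class TinyModel:
    """Hypothetical stand-in for a torch.nn.Module."""

    def __init__(self) -> None:
        self.fake_quant_inserted = False


class StubInt8DynActInt4WeightQATQuantizer:
    """Hypothetical quantizer exposing prepare(model)."""

    def prepare(self, model: TinyModel) -> TinyModel:
        model.fake_quant_inserted = True
        return model


class UnknownQuantizer:
    """A quantizer type that is absent from the mode table."""

    def prepare(self, model: TinyModel) -> TinyModel:
        model.fake_quant_inserted = True
        return model


# Type-to-mode table; the mode string matches a key used elsewhere on this page.
_QUANTIZER_TO_MODE_SKETCH: Dict[type, str] = {
    StubInt8DynActInt4WeightQATQuantizer: "8da4w-qat",
}


def get_quantizer_mode_sketch(quantizer: object) -> Optional[str]:
    for quantizer_cls, mode in _QUANTIZER_TO_MODE_SKETCH.items():
        if isinstance(quantizer, quantizer_cls):
            return mode
    return None  # unrecognized quantizer


def prepare_qat_model_sketch(model: TinyModel, quantizer) -> Tuple[object, Optional[str]]:
    prepared = quantizer.prepare(model)  # insert fake-quant into the model
    return prepared, get_quantizer_mode_sketch(quantizer)


model, mode = prepare_qat_model_sketch(TinyModel(), StubInt8DynActInt4WeightQATQuantizer())
other_model, other_mode = prepare_qat_model_sketch(TinyModel(), UnknownQuantizer())
```

The returned mode string is what a caller would then hand to `get_enable_fake_quant_fn` or `get_disable_fake_quant_fn`; when it is `None`, the model is still prepared but there is no known toggle for it.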

```python
nemo_automodel.components.quantization.qat._DISABLE_FN_BY_MODE = {'8da4w-qat': disable_8da4w_fake_quant, '4w-qat': disable_4w_fake_quant}
```

```python
nemo_automodel.components.quantization.qat._ENABLE_FN_BY_MODE = {'8da4w-qat': enable_8da4w_fake_quant, '4w-qat': enable_4w_fake_quant}
```

```python
nemo_automodel.components.quantization.qat._QUANTIZER_TO_MODE = {Int8DynActInt4WeightQATQuantizer: '8da4w-qat', Int4WeightOnlyQATQuantizer: '4w-...
```

```python
nemo_automodel.components.quantization.qat.__all__ = ['QATConfig', 'get_quantizer_mode', 'get_disable_fake_quant_fn', 'get_enable_fak...
```

```python
nemo_automodel.components.quantization.qat.logger = logging.getLogger(__name__)
```