# ModelQuantizationConfig Fields

| Field | value_type | description | default_value | valid_options |
|---|---|---|---|---|
| `backend` | categorical | The quantization backend to use. | `torchao` | `modelopt`, `torchao` |
| `mode` | categorical | The quantization mode to use. | `weight_only_ptq` | `static_ptq`, `weight_only_ptq` |
| `algorithm` | categorical | Calibration or optimization algorithm name to pass to the backend configuration. For the `modelopt` backend, this becomes the top-level `algorithm` field. | `minmax` | `minmax`, `entropy` |
| `default_layer_dtype` | categorical | Default data type for layers (currently ignored by backends; specify the dtype per layer). | `native` | `int8`, `fp8_e4m3fn`, `fp8_e5m2`, `native` |
| `default_activation_dtype` | categorical | Default data type for activations (currently ignored by backends; specify the dtype per layer). | `native` | `int8`, `fp8_e4m3fn`, `fp8_e5m2`, `native` |
| `layers` | list | List of per-module quantization configurations. | `[]` | |
| `skip_names` | list | List of module or layer names or patterns to exclude from quantization. | `[]` | |
| `model_path` | string | Path to the model to be quantized. | | |
| `results_dir` | string | Path to where all the assets generated from a task are stored. | | |
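To make the fields concrete, here is a sketch of a complete configuration, assuming it is supplied as YAML. The per-entry schema for `layers` is not specified in the table above, so the keys shown inside it (`name`, `layer_dtype`) are illustrative assumptions, and all paths and module names are placeholders:

```yaml
# Hypothetical ModelQuantizationConfig example.
# Field names come from the table above; values are placeholders.
backend: torchao                  # or: modelopt
mode: weight_only_ptq             # or: static_ptq
algorithm: minmax                 # or: entropy
default_layer_dtype: native       # currently ignored by backends
default_activation_dtype: native  # currently ignored by backends
layers:
  # Assumed per-module schema; the real key names may differ.
  - name: "model.layers.0.self_attn.q_proj"
    layer_dtype: int8
skip_names:
  - "lm_head"                     # exclude this module from quantization
model_path: /path/to/model
results_dir: /path/to/results
```

Note that because `default_layer_dtype` and `default_activation_dtype` are currently ignored by the backends, any module you want quantized must appear explicitly in `layers` with its own dtype.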