pytorch_quantization.nn

TensorQuantizer

class pytorch_quantization.nn.TensorQuantizer(quant_desc=<pytorch_quantization.tensor_quant.ScaledQuantDescriptor object>, disabled=False, if_quant=True, if_clip=False, if_calib=False)[source]

Tensor quantizer module

This module uses the tensor_quant or fake_tensor_quant function to quantize a tensor, and wraps the variables and moving statistics needed when training a quantized network.

Experimental features:

The clip stage learns the range before quantization is enabled; the calib stage runs calibration.

Parameters
  • quant_desc – An instance of QuantDescriptor.

  • disabled – A boolean. If True, bypass the whole module and return the input unchanged. Default False.

  • if_quant – A boolean. If True, run main quantization body. Default True.

  • if_clip – A boolean. If True, clip before quantization and learn amax. Default False.

  • if_calib – A boolean. If True, run calibration. Not implemented yet; calibration settings will probably move to QuantDescriptor. Default False.

Readonly Properties:
  • axis:

  • fake_quant:

  • scale:

  • step_size:

Mutable Properties:
  • num_bits:

  • unsigned:

  • amax:
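
A minimal usage sketch, assuming an 8-bit descriptor with a fixed amax; the range could instead be learned or obtained from calibration:

    import torch
    from pytorch_quantization import tensor_quant
    from pytorch_quantization.nn import TensorQuantizer

    # 8-bit fake quantization with a fixed dynamic range of [-2, 2].
    # amax could instead come from calibration (see load_calib_amax below).
    quant_desc = tensor_quant.QuantDescriptor(num_bits=8, amax=2.0)
    quantizer = TensorQuantizer(quant_desc)

    x = torch.randn(4, 16)
    x_q = quantizer(x)  # same shape and dtype as x, values snapped to the 8-bit grid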

__init__(quant_desc=<pytorch_quantization.tensor_quant.ScaledQuantDescriptor object>, disabled=False, if_quant=True, if_clip=False, if_calib=False)[source]

Initialize quantizer and set up required variables

disable()[source]

Bypass the module

disable_clip()[source]

Disable clip stage

enable_clip()[source]

Enable clip stage

forward(inputs)[source]

Apply tensor_quant function to inputs

Parameters

inputs – A Tensor of type float32.

Returns

outputs – A Tensor of type output_dtype

init_learn_amax()[source]

Initialize learned amax from fixed amax

load_calib_amax(*args, **kwargs)[source]

Load amax from calibrator.

Updates the amax buffer with the value computed by the calibrator, creating it if necessary. *args and **kwargs are passed directly to compute_amax, except for “strict” in kwargs. Refer to compute_amax for more details.
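
A sketch of the calibration workflow around load_calib_amax. It assumes the enable_calib/disable_calib and enable_quant/disable_quant toggles the module provides alongside the methods documented here; with the “max” calibrator, load_calib_amax needs no arguments:

    import torch
    from pytorch_quantization import tensor_quant
    from pytorch_quantization.nn import TensorQuantizer

    # No fixed amax; a max calibrator tracks the largest absolute value seen.
    quantizer = TensorQuantizer(tensor_quant.QuantDescriptor(num_bits=8, calib_method="max"))

    # 1) Collect statistics: quantization off, calibration on.
    quantizer.disable_quant()
    quantizer.enable_calib()
    with torch.no_grad():
        for _ in range(8):
            quantizer(torch.randn(32, 64))  # forward passes only feed the calibrator

    # 2) Load the computed amax, then switch back to quantizing.
    quantizer.load_calib_amax()
    quantizer.disable_calib()
    quantizer.enable_quant()

    y = quantizer(torch.randn(32, 64))  # fake-quantized with the calibrated range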

Quantized Modules

_QuantConvNd

class pytorch_quantization.nn.modules.quant_conv._QuantConvNd(in_channels, out_channels, kernel_size, stride, padding, dilation, transposed, output_padding, groups, bias, padding_mode, quant_desc_input, quant_desc_weight)[source]

Base class of quantized Conv layers, inheriting from _ConvNd.

Comments on the original arguments can be found in torch.nn.modules.conv.

Parameters
  • quant_desc_input – An instance of QuantDescriptor. Quantization descriptor of input.

  • quant_desc_weight – An instance of QuantDescriptor. Quantization descriptor of weight.

Raises

ValueError – If unsupported arguments are passed in.

Readonly properties:
  • input_quantizer:

  • weight_quantizer:

Static methods:
  • set_default_quant_desc_input: Set default_quant_desc_input

  • set_default_quant_desc_weight: Set default_quant_desc_weight
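
As a sketch of the static methods, class-wide defaults might be set once before building the model (shown here on the QuantConv2d subclass documented below); descriptors passed per instance still override these defaults:

    from pytorch_quantization import tensor_quant
    from pytorch_quantization import nn as quant_nn

    # Every QuantConv2d constructed afterwards defaults to histogram-calibrated
    # inputs and per-channel weight quantization.
    quant_nn.QuantConv2d.set_default_quant_desc_input(
        tensor_quant.QuantDescriptor(num_bits=8, calib_method="histogram"))
    quant_nn.QuantConv2d.set_default_quant_desc_weight(
        tensor_quant.QUANT_DESC_8BIT_CONV2D_WEIGHT_PER_CHANNEL)

    conv = quant_nn.QuantConv2d(3, 64, kernel_size=7, stride=2, padding=3)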

QuantConv1d

class pytorch_quantization.nn.QuantConv1d(in_channels, out_channels, kernel_size, stride=1, padding=0, dilation=1, groups=1, bias=True, padding_mode='zeros', **kwargs)[source]

Quantized 1D Conv

QuantConv2d

class pytorch_quantization.nn.QuantConv2d(in_channels, out_channels, kernel_size, stride=1, padding=0, dilation=1, groups=1, bias=True, padding_mode='zeros', **kwargs)[source]

Quantized 2D Conv
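
A minimal sketch of a drop-in replacement for nn.Conv2d, with the quantization descriptors passed explicitly through **kwargs (using the library's predefined per-tensor and per-channel descriptors):

    import torch
    from pytorch_quantization import tensor_quant
    from pytorch_quantization import nn as quant_nn

    conv = quant_nn.QuantConv2d(
        16, 32, kernel_size=3, padding=1,
        quant_desc_input=tensor_quant.QUANT_DESC_8BIT_PER_TENSOR,
        quant_desc_weight=tensor_quant.QUANT_DESC_8BIT_CONV2D_WEIGHT_PER_CHANNEL,
    )

    x = torch.randn(8, 16, 28, 28)
    y = conv(x)  # input and weight are fake-quantized before the convolution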

QuantConv3d

class pytorch_quantization.nn.QuantConv3d(in_channels, out_channels, kernel_size, stride=1, padding=0, dilation=1, groups=1, bias=True, padding_mode='zeros', **kwargs)[source]

Quantized 3D Conv

QuantConvTranspose1d

class pytorch_quantization.nn.QuantConvTranspose1d(in_channels, out_channels, kernel_size, stride=1, padding=0, output_padding=0, groups=1, bias=True, dilation=1, padding_mode='zeros', **kwargs)[source]

Quantized ConvTranspose1d

QuantConvTranspose2d

class pytorch_quantization.nn.QuantConvTranspose2d(in_channels, out_channels, kernel_size, stride=1, padding=0, output_padding=0, groups=1, bias=True, dilation=1, padding_mode='zeros', **kwargs)[source]

Quantized ConvTranspose2d

QuantConvTranspose3d

class pytorch_quantization.nn.QuantConvTranspose3d(in_channels, out_channels, kernel_size, stride=1, padding=0, output_padding=0, groups=1, bias=True, dilation=1, padding_mode='zeros', **kwargs)[source]

Quantized ConvTranspose3d

QuantLinear

class pytorch_quantization.nn.QuantLinear(in_features, out_features, bias=True, **kwargs)[source]

Quantized version of nn.Linear

Applies a quantized linear transformation to the incoming data: y = dequant(quant(x) quant(A)^T + b).

The module name is kept as “Linear” instead of “QuantLinear” so that the module can be dropped into a preexisting model and load pretrained weights easily. An alias “QuantLinear” is defined below. The base code is a copy of nn.Linear; see the detailed comments on the original arguments there.

Quantization descriptors are passed in through kwargs. If not present, default_quant_desc_input and default_quant_desc_weight are used.

Keyword Arguments
  • quant_desc_input – An instance of QuantDescriptor. Quantization descriptor of input.

  • quant_desc_weight – An instance of QuantDescriptor. Quantization descriptor of weight.

Raises
  • ValueError – If unsupported arguments are passed in.

  • KeyError – If unsupported kwargs are passed in.

Readonly properties:
  • input_quantizer:

  • weight_quantizer:

Static methods:
  • set_default_quant_desc_input: Set default_quant_desc_input

  • set_default_quant_desc_weight: Set default_quant_desc_weight
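
A sketch showing why the parameter names are kept compatible with nn.Linear: pretrained float weights can be loaded directly into the quantized layer (strict=False skips any quantizer-only state):

    import torch
    import torch.nn as nn
    from pytorch_quantization import nn as quant_nn

    fp_fc = nn.Linear(512, 256)            # pretrained float layer
    q_fc = quant_nn.QuantLinear(512, 256)  # drop-in replacement with default descriptors

    # Parameter names match nn.Linear, so the float weights load directly.
    q_fc.load_state_dict(fp_fc.state_dict(), strict=False)

    y = q_fc(torch.randn(32, 512))  # y = dequant(quant(x) @ quant(A)^T + b)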

QuantMaxPool1d

class pytorch_quantization.nn.QuantMaxPool1d(kernel_size, stride=None, padding=0, dilation=1, return_indices=False, ceil_mode=False, **kwargs)[source]

Quantized 1D maxpool

QuantMaxPool2d

class pytorch_quantization.nn.QuantMaxPool2d(kernel_size, stride=None, padding=0, dilation=1, return_indices=False, ceil_mode=False, **kwargs)[source]

Quantized 2D maxpool

QuantMaxPool3d

class pytorch_quantization.nn.QuantMaxPool3d(kernel_size, stride=None, padding=0, dilation=1, return_indices=False, ceil_mode=False, **kwargs)[source]

Quantized 3D maxpool

QuantAvgPool1d

class pytorch_quantization.nn.QuantAvgPool1d(kernel_size, stride=None, padding=0, ceil_mode=False, count_include_pad=True, **kwargs)[source]

Quantized 1D average pool

QuantAvgPool2d

class pytorch_quantization.nn.QuantAvgPool2d(kernel_size, stride=None, padding=0, ceil_mode=False, count_include_pad=True, divisor_override=None, **kwargs)[source]

Quantized 2D average pool

QuantAvgPool3d

class pytorch_quantization.nn.QuantAvgPool3d(kernel_size, stride=None, padding=0, ceil_mode=False, count_include_pad=True, divisor_override=None, **kwargs)[source]

Quantized 3D average pool

QuantAdaptiveAvgPool1d

class pytorch_quantization.nn.QuantAdaptiveAvgPool1d(output_size, **kwargs)[source]

Quantized 1D adaptive average pool

QuantAdaptiveAvgPool2d

class pytorch_quantization.nn.QuantAdaptiveAvgPool2d(output_size, **kwargs)[source]

Quantized 2D adaptive average pool

QuantAdaptiveAvgPool3d

class pytorch_quantization.nn.QuantAdaptiveAvgPool3d(output_size, **kwargs)[source]

Quantized 3D adaptive average pool
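
The pooling modules above are drop-in replacements for their torch.nn counterparts; the input is fake-quantized before the pooling op. A minimal sketch:

    import torch
    from pytorch_quantization import nn as quant_nn

    pool = quant_nn.QuantMaxPool2d(kernel_size=2, stride=2)
    gap = quant_nn.QuantAdaptiveAvgPool2d(output_size=(1, 1))

    x = torch.randn(4, 8, 32, 32)
    y = gap(pool(x))  # shape (4, 8, 1, 1); the input is fake-quantized before each pool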

Clip

class pytorch_quantization.nn.Clip(clip_value_min, clip_value_max, learn_min=False, learn_max=False)[source]

Clip tensor

Parameters
  • clip_value_min – A number or tensor; the lower bound to clip to.

  • clip_value_max – A number or tensor; the upper bound to clip to.

  • learn_min – A boolean. If True, learn the minimum; clip_value_min is used for initialization. Default False.

  • learn_max – A boolean. Similar to learn_min, but for the maximum.

Raises

ValueError
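
A minimal sketch with a fixed lower bound and a learnable upper bound; clip_value_max only provides the initial value when learn_max=True:

    import torch
    from pytorch_quantization.nn import Clip

    # Fixed lower bound at 0, learnable upper bound initialized at 4.0.
    clip = Clip(0.0, 4.0, learn_min=False, learn_max=True)

    x = torch.randn(16) * 10
    y = clip(x)  # values clamped to [0, ~4]; gradients flow to the learned upper bound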

QuantLSTM

class pytorch_quantization.nn.QuantLSTM(*args, **kwargs)[source]

Applies a multi-layer long short-term memory (LSTM) RNN to an input sequence.

QuantLSTMCell

class pytorch_quantization.nn.QuantLSTMCell(input_size, hidden_size, bias=True, **kwargs)[source]

A long short-term memory (LSTM) cell.
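
A minimal sketch of QuantLSTMCell, assuming its forward interface mirrors nn.LSTMCell (an input plus an (h, c) state tuple):

    import torch
    from pytorch_quantization import nn as quant_nn

    cell = quant_nn.QuantLSTMCell(input_size=32, hidden_size=64)

    x = torch.randn(8, 32)        # (batch, input_size)
    h = torch.zeros(8, 64)        # (batch, hidden_size)
    c = torch.zeros(8, 64)
    h, c = cell(x, (h, c))        # one quantized LSTM step, as in nn.LSTMCell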