# Quantize

Quantizes a floating-point input tensor into an integer output tensor. The output is computed as $$output_{i_0,\ldots,i_n} = \text{clamp}(\text{round}(\frac{input_{i_0,\ldots,i_n}}{scale} + \mathrm{zero\_point}))$$ where clamp limits the result to the representable range of the output type T3.
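The formula can be sketched in NumPy for a single per-tensor scale. This is an illustration only, not TensorRT's implementation: the helper name `quantize_reference` is hypothetical, the clamp range [-128, 127] is assumed from the int8 output type, and round-half-to-even (`np.rint`) is an assumed rounding mode, since the text only says "round".

```python
import numpy as np

def quantize_reference(x, scale, zero_point=0.0):
    """Hypothetical reference: clamp(round(x / scale + zero_point)) as int8."""
    x = np.asarray(x, dtype=np.float32)
    # Assumption: ties round to even (np.rint); the source only says "round".
    rounded = np.rint(x / np.float32(scale) + np.float32(zero_point))
    # Assumption: clamp to the int8 range, since the output type T3 is int8.
    return np.clip(rounded, -128, 127).astype(np.int8)

q = quantize_reference([0.1, -0.25, 3.0], scale=0.01)
print(q.tolist())  # [10, -25, 127]
```

The last value saturates: 3.0 / 0.01 = 300 lies outside the int8 range, so it is clamped to 127.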

## Attributes

axis: the axis along which the quantization is performed.

## Inputs

input: tensor of type T1.

scale: tensor of type T2 that provides the quantization scale. The scale tensor must be a build-time constant scalar or a 1D tensor.

zero_point: tensor of type T2 that provides the quantization zero-point. The zero_point tensor is optional and will be assumed to be zero if not set. The zero_point must only contain zero-valued coefficients if set, and must be a build-time constant scalar or a 1D tensor.

## Outputs

output: tensor of type T3.

## Data Types

T1: float16, float32

T2: float32

T3: int8

## Shape Information

input and output are tensors with a shape of $$[a_0,...,a_n]$$.

scale and zero_point must have the same shape, if zero_point is defined.
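When scale is a 1D tensor, one plausible reading is per-channel quantization: the scale holds one entry per slice along `axis` and is broadcast across the remaining dimensions. The sketch below assumes that reading, which the text here does not spell out. `quantize_per_axis` is a hypothetical helper, and the int8 clamp range and rounding mode are assumptions as well.

```python
import numpy as np

def quantize_per_axis(x, scale, axis):
    """Hypothetical per-axis quantization: one scale per slice along `axis`."""
    x = np.asarray(x, dtype=np.float32)
    scale = np.asarray(scale, dtype=np.float32)
    if scale.ndim != 1 or scale.shape[0] != x.shape[axis]:
        raise ValueError("scale must be 1D with length x.shape[axis]")
    # Reshape scale to [1, ..., C, ..., 1] so it broadcasts along `axis`.
    bshape = [1] * x.ndim
    bshape[axis] = scale.shape[0]
    q = np.rint(x / scale.reshape(bshape))
    # Assumption: clamp to the int8 range of the output type.
    return np.clip(q, -128, 127).astype(np.int8)

x = np.array([[1.0, 1.0], [2.0, 2.0]], dtype=np.float32)  # shape (2, 2)
q = quantize_per_axis(x, scale=[0.5, 0.25], axis=0)
print(q.tolist())  # [[2, 2], [8, 8]]
```

With `axis=0`, each row is divided by its own scale; passing `axis=1` would apply the same two scales column by column instead.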

## Examples

Quantize
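# Assumes `import numpy as np`, `import tensorrt as trt`, and an existing network definition `network`.
in1 = network.add_input("input1", dtype=trt.float32, shape=(1, 1, 3, 3))
scale = network.add_constant(shape=(1,), weights=np.array([1 / 127], dtype=np.float32))
# Quantize the input, then dequantize it again so the round trip can be compared in float.
quantize = network.add_quantize(in1, scale.get_output(0))
quantize.axis = 3
dequantize = network.add_dequantize(quantize.get_output(0), scale.get_output(0))
dequantize.axis = 3
network.mark_output(dequantize.get_output(0))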

inputs[in1.name] = np.array(
[
[
[0.56, 0.89, 1.4],
[-0.56, 0.39, 6.0],
[0.67, 0.11, -3.6],
]
]
)

outputs[dequantize.get_output(0).name] = dequantize.get_output(0).shape
expected[dequantize.get_output(0).name] = np.array(
[
[
[0.56, 0.89, 1.0],
[-0.56, 0.39, 1.0],
[0.67, 0.11, -1.0],
]
]
)
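The expected values can be checked by hand with a NumPy round trip: quantize, then multiply by the same scale again. This is a sketch rather than TensorRT's code path. The helper names are hypothetical, and the int8 clamp range and round-half-to-even are assumptions.

```python
import numpy as np

SCALE = np.float32(1 / 127)

def quantize(x, scale):
    # Assumed int8 semantics: round to nearest, then clamp to [-128, 127].
    return np.clip(np.rint(x / scale), -128, 127).astype(np.int8)

def dequantize(q, scale):
    # Map the int8 codes back to float by multiplying with the same scale.
    return q.astype(np.float32) * scale

x = np.array(
    [[0.56, 0.89, 1.4], [-0.56, 0.39, 6.0], [0.67, 0.11, -3.6]],
    dtype=np.float32,
)
q = quantize(x, SCALE)
print(q.tolist())  # [[71, 113, 127], [-71, 50, 127], [85, 14, -128]]
print(np.round(dequantize(q, SCALE), 3))
```

Under these assumptions, inputs with magnitude above roughly 1.0 saturate, which is why 1.4 and 6.0 both come back as 1.0. The round-trip values agree with the expected array to within about 0.01; the last element decodes to -128/127, roughly -1.008, because the int8 range is asymmetric.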