Calibrator

Module: polygraphy.backend.trt

Calibrator(data_loader, cache=None, BaseClass=None, batch_size=None, quantile=None, regression_cutoff=None, algo=None)

Supplies calibration data to TensorRT to calibrate the network for INT8 inference.

Parameters:
  • data_loader (Sequence[OrderedDict[str, Union[numpy.ndarray, DeviceView, torch.Tensor, int]]]) –

    A generator or iterable that yields dictionaries mapping input names to NumPy arrays, Polygraphy DeviceViews, PyTorch tensors, or GPU pointers. When NumPy arrays, DeviceViews, or PyTorch tensors are provided, the calibrator checks, where possible, that their data types and shapes match those expected by the model.

    If you don’t know the input details ahead of time, your data loader can read its input_metadata property, which Polygraphy APIs such as CreateConfig and EngineFromNetwork set to a TensorMetadata instance. Note that this does not work when the data loader is a generator or a list.

    The number of calibration batches equals the number of items supplied by the data loader.

  • cache (Union[str, file-like]) – Path or file-like object to save/load the calibration cache. By default, the calibration cache is not saved.

  • BaseClass (type) – The type of calibrator to inherit from. Defaults to trt.IInt8EntropyCalibrator2.

  • batch_size (int) – [DEPRECATED] The size of each batch provided by the data loader.

  • quantile (float) – The quantile to use for trt.IInt8LegacyCalibrator. Has no effect for other calibrator types. Defaults to 0.5.

  • regression_cutoff (float) – The regression cutoff to use for trt.IInt8LegacyCalibrator. Has no effect for other calibrator types. Defaults to 0.5.

  • algo (trt.CalibrationAlgoType) – Calibration algorithm to use for trt.IInt8Calibrator. Has no effect for other calibrator types. Defaults to trt.CalibrationAlgoType.MINMAX_CALIBRATION.
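
As an illustration of the data_loader and cache parameters described above, the following is a minimal sketch rather than a reference implementation. The class RandomCalibrationData, the helper build_int8_engine, the input name, the shape, and the cache file name are all hypothetical and chosen only for the example. The loader is written as a class with __iter__ rather than as a generator or list. Random data is used purely to keep the sketch self-contained.

```python
import numpy as np


class RandomCalibrationData:
    """Hypothetical data loader: yields one feed dict per calibration batch."""

    def __init__(self, input_name, shape, num_batches=4, seed=0):
        self.input_name = input_name
        self.shape = tuple(shape)
        self.num_batches = num_batches
        self.seed = seed

    def __iter__(self):
        # A fresh random generator per iteration keeps the data reproducible.
        rng = np.random.default_rng(self.seed)
        for _ in range(self.num_batches):
            # Each item maps input names to arrays; the number of items
            # yielded is the number of calibration batches.
            yield {self.input_name: rng.random(self.shape, dtype=np.float32)}


def build_int8_engine(onnx_path, data_loader, cache_path="calib.cache"):
    # Imported lazily so the loader above can be used without TensorRT installed.
    from polygraphy.backend.trt import (
        Calibrator,
        CreateConfig,
        EngineFromNetwork,
        NetworkFromOnnxPath,
    )

    # cache_path is an assumed file name; omit `cache` to skip saving the cache.
    calibrator = Calibrator(data_loader=data_loader, cache=cache_path)
    build_engine = EngineFromNetwork(
        NetworkFromOnnxPath(onnx_path),
        config=CreateConfig(int8=True, calibrator=calibrator),
    )
    # Polygraphy loaders are callables; calling one performs the build.
    return build_engine()
```

A call might look like build_int8_engine("model.onnx", RandomCalibrationData("x", (1, 3, 224, 224))), where the model path and input name are placeholders. In practice, calibration should use data that is representative of real inputs instead of random values.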