nvidia.dali.fn.box_encoder(*inputs, **kwargs)

Encodes the input bounding boxes and labels using a set of default boxes (anchors) passed as an argument.

This operator follows the algorithm described in “SSD: Single Shot MultiBox Detector” and implemented in https://github.com/mlperf/training/tree/master/single_stage_detector/ssd. Inputs must be supplied as the following Tensors:

  • BBoxes that contain bounding boxes that are represented as [l,t,r,b].

  • Labels that contain the corresponding label for each bounding box.

The results are two tensors:

  • EncodedBBoxes that contain M-encoded bounding boxes as [l,t,r,b], where M is number of anchors.

  • EncodedLabels that contain the corresponding label for each encoded box.

Supported backends
  • ‘cpu’

  • ‘gpu’

  • input0 (TensorList) – Input to the operator.

  • input1 (TensorList) – Input to the operator.

Keyword Arguments
  • anchors (float or list of float) – Anchors to be used for encoding, as the list of floats is in the ltrb format.

  • bytes_per_sample_hint (int or list of int, optional, default = [0]) –

    Output size hint, in bytes per sample.

    If specified, the operator’s outputs residing in GPU or page-locked host memory will be preallocated to accommodate a batch of samples of this size.

  • criteria (float, optional, default = 0.5) –

    Threshold IoU for matching bounding boxes with anchors.

    The value needs to be between 0 and 1.

  • means (float or list of float, optional, default = [0.0, 0.0, 0.0, 0.0]) – [x y w h] mean values for normalization.

  • offset (bool, optional, default = False) – Returns normalized offsets ((encoded_bboxes*scale - anchors*scale) - mean) / stds in EncodedBBoxes that use std and the mean and scale arguments.

  • preserve (bool, optional, default = False) – Prevents the operator from being removed from the graph even if its outputs are not used.

  • scale (float, optional, default = 1.0) – Rescales the box and anchor values before the offset is calculated (for example, to return to the absolute values).

  • seed (int, optional, default = -1) –

    Random seed.

    If not provided, it will be populated based on the global seed of the pipeline.

  • stds (float or list of float, optional, default = [1.0, 1.0, 1.0, 1.0]) – [x y w h] standard deviations for offset normalization.