NMS

The NMS algorithm iterates through a set of bounding boxes and their confidence scores in decreasing order of score. A box is selected if its score is above the score threshold and its intersection-over-union (IoU) with every previously selected box is less than or equal to the IoU threshold. This layer implements NMS per batch item and per class.

Per batch item, boxes are initially sorted by their scores without regard to class. Only the highest-scoring boxes, up to the TopK limit, are considered for selection in each batch item. During selection, only overlapping boxes of the same class are compared, so overlapping boxes of different classes do not suppress each other.

For each batch item, the ordering of candidate bounding boxes with the same score is unspecified, but the ordering will be consistent across different runs for the same inputs.
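The selection rule described above can be sketched in plain Python. This is an illustrative reference only, not the layer's implementation: the function names `iou` and `nms_reference` are hypothetical, boxes are assumed to be in CORNER_PAIRS format and shared across classes, ties are broken by (box, class) order, and the output row layout (batch, class, box) and its ordering by score are assumptions.

```python
def iou(a, b):
    """IoU of two CORNER_PAIRS boxes; corners may be any diagonal pair."""
    ax1, ax2 = sorted((a[0], a[2]))
    ay1, ay2 = sorted((a[1], a[3]))
    bx1, bx2 = sorted((b[0], b[2]))
    by1, by2 = sorted((b[1], b[3]))
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0.0 else 0.0


def nms_reference(boxes, scores, max_per_class,
                  iou_threshold=0.0, score_threshold=0.0, topk_limit=5000):
    """boxes[batch][box] is a 4-tuple; scores[batch][box][class] is a float."""
    selected = []
    for b, (batch_boxes, batch_scores) in enumerate(zip(boxes, scores)):
        # Keep only scores strictly above the threshold, across all classes.
        candidates = [
            (score, box_idx, class_idx)
            for box_idx, row in enumerate(batch_scores)
            for class_idx, score in enumerate(row)
            if score > score_threshold
        ]
        # Sort by descending score without regard to class, then apply TopK.
        candidates.sort(key=lambda t: -t[0])
        candidates = candidates[:topk_limit]

        kept = {}  # class index -> box indices already selected for that class
        for score, box_idx, class_idx in candidates:
            kept_for_class = kept.setdefault(class_idx, [])
            if len(kept_for_class) >= max_per_class:
                continue
            # Compare only against selected boxes of the same class.
            if all(iou(batch_boxes[box_idx], batch_boxes[j]) <= iou_threshold
                   for j in kept_for_class):
                kept_for_class.append(box_idx)
                selected.append((b, class_idx, box_idx))
    return selected


boxes = [[(0.0, 0.0, 0.1, 0.1), (0.2, 0.2, 0.4, 0.4), (0.5, 0.5, 0.6, 0.6)]]
scores = [[[0.7, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.9]]]
print(nms_reference(boxes, scores, max_per_class=1))  # [(0, 2, 2), (0, 0, 0)]
```

With the default thresholds of 0.0, zero-score entries are dropped and any positive overlap between same-class boxes suppresses the lower-scoring one.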

Attributes

fmt: The bounding box format, which can be one of:

  • CORNER_PAIRS (x1, y1, x2, y2) where (x1, y1) and (x2, y2) are any pair of diagonal corners.

  • CENTER_SIZES (x_center, y_center, width, height) where (x_center, y_center) is the center point of the box.

The default value is CORNER_PAIRS.
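The relationship between the two formats can be illustrated with a small conversion sketch. The helper names below are hypothetical and are not part of the TensorRT API; the second helper shows one way to bring an arbitrary diagonal corner pair into (min, min, max, max) order.

```python
def center_sizes_to_corner_pairs(box):
    """Convert (x_center, y_center, width, height) to (x1, y1, x2, y2)."""
    x_center, y_center, width, height = box
    half_w, half_h = width / 2.0, height / 2.0
    return (x_center - half_w, y_center - half_h,
            x_center + half_w, y_center + half_h)


def normalize_corner_pairs(box):
    """Reorder any pair of diagonal corners into (x_min, y_min, x_max, y_max)."""
    x1, y1, x2, y2 = box
    return (min(x1, x2), min(y1, y2), max(x1, x2), max(y1, y2))


print(center_sizes_to_corner_pairs((0.5, 0.5, 1.0, 2.0)))  # (0.0, -0.5, 1.0, 1.5)
print(normalize_corner_pairs((1.0, 0.0, 0.0, 1.0)))        # (0.0, 0.0, 1.0, 1.0)
```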

limit: The TopK box limit is the maximum number of filtered boxes considered for selection per batch item. It must be less than or equal to 2000 on SM 5.3 and 6.2 devices, and less than or equal to 5000 on all other devices. The default value is the same as this upper bound: 2000 for SM 5.3 and 6.2 devices, and 5000 otherwise.
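The device-dependent bound can be expressed as a small validation sketch. The function names are hypothetical, the SM version is passed in as a (major, minor) pair rather than queried from a device, and only the upper bound stated above is checked.

```python
def max_topk_box_limit(sm_major, sm_minor):
    """Upper bound (and default) of the TopK box limit for a given SM version."""
    return 2000 if (sm_major, sm_minor) in {(5, 3), (6, 2)} else 5000


def check_topk_box_limit(limit, sm_major, sm_minor):
    """Return limit unchanged if it respects the device bound, else raise."""
    upper = max_topk_box_limit(sm_major, sm_minor)
    if limit > upper:
        raise ValueError(
            f"TopK box limit {limit} exceeds {upper} for SM {sm_major}.{sm_minor}"
        )
    return limit


print(max_topk_box_limit(6, 2))          # 2000
print(check_topk_box_limit(3000, 8, 6))  # 3000
```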

Inputs

Boxes: tensor of type T.

Scores: tensor of type T.

MaxOutputBoxesPerClass: tensor of type int32.

IoUThreshold (optional): tensor of type float32. This must contain a scalar value in the range [0.0f, 1.0f]. The default value of the threshold is 0.0f.

ScoreThreshold (optional): tensor of type float32. The default value of the threshold is 0.0f.

Outputs

SelectedIndices: tensor of type int32.

NumOutputBoxes: tensor of type int32.

Data Types

T: float16, float32, bfloat16

Shape Information

Boxes is a tensor with a shape of [batchSize, numInputBoundingBoxes, numClasses, 4] if the boxes are per class, or [batchSize, numInputBoundingBoxes, 4] if the same boxes are to be used for each class.

Scores is a tensor with a shape of [batchSize, numInputBoundingBoxes, numClasses].

MaxOutputBoxesPerClass is a 0D tensor containing a scalar.

IoUThreshold is a 0D tensor containing a scalar.

ScoreThreshold is a 0D tensor containing a scalar.

SelectedIndices is a tensor with a shape of [NumOutputBoxes, 3].

NumOutputBoxes is a 0D tensor containing a scalar.
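The shape rules for Boxes and Scores can be checked before building a network with a sketch like the following. The function name `check_nms_shapes` and its return values are hypothetical; the function works on plain shape tuples and does not call TensorRT.

```python
def check_nms_shapes(boxes_shape, scores_shape):
    """Return "per_class" or "shared" depending on the Boxes layout, else raise."""
    if len(scores_shape) != 3:
        raise ValueError(
            "Scores must have shape [batchSize, numInputBoundingBoxes, numClasses]"
        )
    batch_size, num_boxes, num_classes = scores_shape
    if tuple(boxes_shape) == (batch_size, num_boxes, num_classes, 4):
        return "per_class"  # a separate box for every class
    if tuple(boxes_shape) == (batch_size, num_boxes, 4):
        return "shared"     # the same box is used for every class
    raise ValueError(
        f"Boxes shape {tuple(boxes_shape)} does not match Scores shape {tuple(scores_shape)}"
    )


print(check_nms_shapes((1, 3, 4), (1, 3, 3)))     # shared
print(check_nms_shapes((1, 3, 3, 4), (1, 3, 3)))  # per_class
```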

Examples

NMS
# Note: get_runner, network, inputs, outputs, and expected are not defined in this
# snippet; they are assumed to be provided by the surrounding documentation harness.
import numpy as np
import tensorrt as trt

# The output shape is data-dependent, so an optimization profile is registered.
opt_profile = get_runner.builder.create_optimization_profile()
get_runner.config.add_optimization_profile(opt_profile)

# One batch item, three boxes shared across three classes.
boxes = network.add_input("boxes", dtype=trt.float32, shape=(1, 3, 4))
scores = network.add_input("scores", dtype=trt.float32, shape=(1, 3, 3))
# MaxOutputBoxesPerClass is a 0D int32 constant with value 1.
constant = network.add_constant(shape=(), weights=np.ones(shape=(), dtype=np.int32))
max_output_boxes_per_class = constant.get_output(0)

# IoUThreshold and ScoreThreshold are omitted, so both default to 0.0.
layer = network.add_nms(boxes, scores, max_output_boxes_per_class)
layer.get_output(0).dtype = trt.int32
layer.get_output(1).dtype = trt.int32
network.mark_output(layer.get_output(0))
network.mark_output(layer.get_output(1))

inputs[boxes.name] = np.array([[[0.0, 0.0, 0.1, 0.1], [0.2, 0.2, 0.4, 0.4], [0.5, 0.5, 0.6, 0.6]]])
inputs[scores.name] = np.array([[[0.7, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.9]]])

# SelectedIndices: expected shape is [2, 3]
outputs[layer.get_output(0).name] = layer.get_output(0).shape
expected[layer.get_output(0).name] = np.array([[0, 2, 2], [0, 0, 0]])

# NumOutputBoxes: expected shape is [] with a scalar value of 2
outputs[layer.get_output(1).name] = layer.get_output(1).shape
expected[layer.get_output(1).name] = np.array(2)
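To read the expected SelectedIndices from the example, each row can be mapped back to the input box and score. The sketch below assumes each row is laid out as (batch index, class index, box index) and that boxes are shared across classes; the helper name `decode_selected_indices` is hypothetical. In this particular example the class and box indices coincide in both rows, so the result does not depend on the order of the last two columns.

```python
def decode_selected_indices(selected_indices, boxes, scores):
    """Look up the box and score for each assumed (batch, class, box) row."""
    decoded = []
    for batch_idx, class_idx, box_idx in selected_indices:
        decoded.append({
            "batch": batch_idx,
            "class": class_idx,
            "box": boxes[batch_idx][box_idx],
            "score": scores[batch_idx][box_idx][class_idx],
        })
    return decoded


# Same values as the example inputs and expected output, as plain lists.
boxes = [[[0.0, 0.0, 0.1, 0.1], [0.2, 0.2, 0.4, 0.4], [0.5, 0.5, 0.6, 0.6]]]
scores = [[[0.7, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.9]]]
selected_indices = [[0, 2, 2], [0, 0, 0]]

for row in decode_selected_indices(selected_indices, boxes, scores):
    print(row)
```

The first row resolves to the box with score 0.9 and the second to the box with score 0.7, which matches the decreasing-score selection order described earlier.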

C++ API

For more information about the C++ INMSLayer operator, refer to the C++ INMSLayer documentation.

Python API

For more information about the Python INMSLayer operator, refer to the Python INMSLayer documentation.