nvidia.dali.fn.segmentation.random_object_bbox(__input, /, *, background=0, bytes_per_sample_hint=[0], cache_objects=False, class_weights=None, classes=None, foreground_prob=1.0, format='anchor_shape', ignore_class=False, k_largest=None, output_class=False, preserve=False, seed=-1, threshold=None, device=None, name=None)

Randomly selects an object from a mask and returns its bounding box.

This operator takes a labeled segmentation map as its input. With probability foreground_prob it randomly selects a label (uniformly or according to the distribution given as class_weights), extracts connected blobs of pixels with the selected label and randomly selects one of the blobs. The blobs may be further filtered according to k_largest and threshold. The output is a bounding box of the selected blob in one of the formats described in format.

With probability 1-foreground_prob, the entire area of the input is returned.
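The selection logic described above can be sketched in plain Python. This is only an illustration of the documented behavior, not DALI's implementation: the helper `pick_box` and its `boxes_by_class` argument are hypothetical, and the boxes are assumed to be already extracted from the mask as `(start, end)` pairs.

```python
import random

def pick_box(boxes_by_class, input_shape, foreground_prob=1.0,
             class_weights=None, rng=None):
    """Illustrative only: pick a (start, end) box as the text describes.

    boxes_by_class is a hypothetical dict {label: [(start, end), ...]} that
    stands in for the blobs the operator would find in the mask.
    """
    rng = rng or random.Random()
    # The "entire area of the input": start at the origin, end at the shape.
    full_area = (tuple(0 for _ in input_shape), tuple(input_shape))

    # With probability 1 - foreground_prob, return the whole input area.
    if rng.random() >= foreground_prob:
        return full_area

    # Only classes that actually have at least one blob can be selected.
    labels = sorted(label for label, boxes in boxes_by_class.items() if boxes)
    if not labels:
        return full_area

    # Uniform choice unless relative weights are given; the weights do not
    # have to sum to 1 because random.choices treats them as relative.
    weights = [class_weights[label] for label in labels] if class_weights else None
    label = rng.choices(labels, weights=weights, k=1)[0]

    # Finally pick one blob of the selected class at random.
    return rng.choice(boxes_by_class[label])
```

For instance, `pick_box({1: [((0, 0), (2, 2))]}, (8, 8), foreground_prob=0.0)` always yields the full-area box `((0, 0), (8, 8))`.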

Supported backends
  • ‘cpu’


__input (TensorList) – Input to the operator.

Keyword Arguments:
  • background (int or TensorList of int, optional, default = 0) –

    Background label.

    If left unspecified, it’s either 0 or any value not in classes.

  • bytes_per_sample_hint (int or list of int, optional, default = [0]) –

    Output size hint, in bytes per sample.

    If specified, the operator’s outputs residing in GPU or page-locked host memory will be preallocated to accommodate a batch of samples of this size.

  • cache_objects (bool, optional, default = False) –

    Cache object bounding boxes to avoid the computational cost of finding object blobs in previously seen inputs.

    Searching for blobs of connected pixels and finding boxes can take a long time. When the dataset has few items but each item is large, you can use caching to save the boxes and reuse them when the same input is seen again. Inputs are compared by a 256-bit hash, which is much faster to compute than recalculating the object boxes.

  • class_weights (float or list of float or TensorList of float, optional) –

    Relative probabilities of foreground classes.

    Each value corresponds to a class label in classes. If classes are not specified, consecutive 1-based labels are assigned.

    The sum of the weights doesn’t have to be equal to 1. If it isn’t, the weights will be normalized.

  • classes (int or list of int or TensorList of int, optional) –

    List of labels considered as foreground.

    If left unspecified, all labels not equal to background are considered foreground.

  • foreground_prob (float or TensorList of float, optional, default = 1.0) – Probability of selecting a foreground bounding box.

  • format (str, optional, default = ‘anchor_shape’) –

    Format in which the data is returned.

    Possible choices are:
    • “anchor_shape” (the default) - there are two outputs: anchor and shape

    • “start_end” - there are two outputs: bounding box start and one-past-end coordinates

    • “box” - there is one output that contains concatenated start and end coordinates

  • ignore_class (bool, optional, default = False) –

    If True, all objects are picked with equal probability, regardless of the class they belong to. Otherwise, a class is picked first and then an object is randomly selected from this class.

    This argument is incompatible with classes, class_weights or output_class.

    This flag only affects the probability with which blobs are selected. It does not cause blobs of different classes to be merged.

  • k_largest (int, optional) –

    If specified, the boxes are sorted by decreasing volume and only k_largest are considered.

    If ignore_class is True, k_largest refers to all boxes; otherwise it refers to the selected class.

  • output_class (bool, optional, default = False) –

    If True, an additional output is produced which contains the label of the class to which the selected box belongs, or background label if the selected box is not an object bounding box.

    The output may not be an object bounding box when any of the following conditions occur:
    • the sample was randomly (according to foreground_prob) chosen not to be a foreground one

    • the sample contained no foreground objects

    • no bounding box met the required size threshold.

  • preserve (bool, optional, default = False) – Prevents the operator from being removed from the graph even if its outputs are not used.

  • seed (int, optional, default = -1) –

    Random seed.

    If not provided, it will be populated based on the global seed of the pipeline.

  • threshold (int or list of int or TensorList of int, optional) –

    Per-axis minimum size of the bounding boxes to return.

    If the selected class doesn’t contain any bounding box that meets this condition, it is rejected and another class is picked. If no class contains a satisfactory box, the entire input area is returned.
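The k_largest and threshold filters and the three format choices listed above can be illustrated with a small plain-Python sketch. It is not DALI code: `filter_boxes` and `to_format` are hypothetical helpers, boxes are assumed to be `(start, end)` tuples with one-past-end coordinates, the threshold is assumed to be given per axis and compared inclusively, and applying the threshold before k_largest is an assumption, since the text above does not state the order.

```python
from math import prod

def box_shape(box):
    """Per-axis extent of a (start, end) box with one-past-end coordinates."""
    start, end = box
    return tuple(e - s for s, e in zip(start, end))

def filter_boxes(boxes, k_largest=None, threshold=None):
    """Hypothetical helper: apply the threshold and k_largest filters."""
    kept = list(boxes)
    if threshold is not None:
        # Keep boxes whose extent is at least the threshold on every axis
        # (inclusive comparison is an assumption).
        kept = [b for b in kept
                if all(d >= t for d, t in zip(box_shape(b), threshold))]
    if k_largest is not None:
        # Sort by decreasing volume and keep only the first k_largest boxes.
        kept = sorted(kept, key=lambda b: prod(box_shape(b)),
                      reverse=True)[:k_largest]
    return kept

def to_format(start, end, fmt="anchor_shape"):
    """Hypothetical helper: express one box in each documented format."""
    if fmt == "anchor_shape":
        # Two outputs: anchor (start) and shape (end - start).
        return start, tuple(e - s for s, e in zip(start, end))
    if fmt == "start_end":
        # Two outputs: start and one-past-end coordinates.
        return start, end
    if fmt == "box":
        # One output: start and end coordinates concatenated.
        return tuple(start) + tuple(end)
    raise ValueError(f"unknown format: {fmt!r}")
```

With a 2D box starting at `(1, 1)` and ending at `(5, 6)`, "anchor_shape" corresponds to anchor `(1, 1)` and shape `(4, 5)`, while "box" corresponds to the single tuple `(1, 1, 5, 6)`. If `filter_boxes` returns an empty list for every class, the documented fallback is the entire input area.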