Retail Object Detection

The retail detector models described here detect retail items within an image and return a bounding box around each detected item. The retail items are generally packaged commercial goods with barcode and ingredient labels on them.

Two detection models are provided. The 100-class Retail Item Detection model detects 100 specific retail items, while the Binary-class Retail Item Detection model detects retail items generically and returns a single category.

These models are based on EfficientDet-D5. EfficientDet is a one-stage detector with the following architecture components:

  • NvImageNet-pretrained EfficientNet-B5 backbone

  • Weighted bi-directional feature pyramid network (BiFPN)

  • Bounding-box and classification prediction heads

  • A compound scaling method that uniformly scales the resolution, depth, and width of the backbone, feature network, and box/class prediction networks at the same time (see the sketch below)
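
As a rough illustration, the sketch below derives resolution, width, and depth from a single compound coefficient phi. The growth rule (linear resolution and depth, geometric width) follows the EfficientDet paper, but the constants are illustrative rather than the exact published values, and this retail model's 416 x 416 input deviates from the paper's default resolutions.

    # Conceptual sketch of EfficientDet-style compound scaling: a single
    # coefficient (phi) grows resolution, depth, and width together.
    # Constants are illustrative, not the exact published values.

    def compound_scale(phi: int) -> dict:
        return {
            "input_resolution": 512 + 128 * phi,     # grows linearly
            "bifpn_width": round(64 * 1.35 ** phi),  # grows geometrically
            "bifpn_depth": 3 + phi,                  # grows linearly
            "head_depth": 3 + phi // 3,              # box/class head layers
        }

    # D5 corresponds to phi = 5.
    print(compound_scale(5))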


Model Card

More details on the models can be found on the model card.

Deploy With DeepStream

Input

  • RGB image of dimensions: 416 x 416 x 3 (W x H x C)

  • Channel ordering of the input: NCHW, where N = batch size, C = number of channels (3), H = height of the image (416), W = width of the image (416)

  • Input scale: 1.0

  • Mean subtraction: 0
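
For reference, below is a minimal preprocessing sketch that matches this spec using OpenCV and NumPy. The function name is illustrative, and a plain resize is shown; the DeepStream config later in this section additionally preserves aspect ratio (maintain-aspect-ratio: 1), so its scaling pads the image rather than stretching it.

    import cv2
    import numpy as np

    def preprocess(image_path: str) -> np.ndarray:
        """Load an image as a 1 x 3 x 416 x 416 float32 (NCHW) array.

        Per the spec above, the input scale is 1.0 and mean subtraction
        is 0, so raw pixel values pass through unchanged.
        """
        bgr = cv2.imread(image_path)                # OpenCV loads BGR, HWC
        rgb = cv2.cvtColor(bgr, cv2.COLOR_BGR2RGB)  # model expects RGB
        rgb = cv2.resize(rgb, (416, 416))           # W x H = 416 x 416
        chw = rgb.transpose(2, 0, 1)                # HWC -> CHW
        return chw[np.newaxis].astype(np.float32)   # add batch dim -> NCHW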

Output

Category labels and bounding-box coordinates for each detected retail item in the input image.
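
The exported model emits four output tensors, named num_detections, detection_boxes, detection_scores, and detection_classes in the config below; in DeepStream they are decoded by the NvDsInferParseCustomEfficientDetTAO parser. As a standalone sketch of the equivalent decoding (the tensor shapes and box coordinate order are assumptions to verify against the exported model):

    import numpy as np

    def decode_detections(num_detections, detection_boxes,
                          detection_scores, detection_classes,
                          labels, score_threshold=0.5):
        """Turn raw detector outputs into labeled boxes.

        Assumed shapes for batch size 1: num_detections (1,),
        detection_boxes (1, max_det, 4), detection_scores (1, max_det),
        detection_classes (1, max_det).
        """
        results = []
        for i in range(int(num_detections[0])):
            score = float(detection_scores[0, i])
            if score < score_threshold:
                continue
            y1, x1, y2, x2 = detection_boxes[0, i]  # coordinate order assumed
            results.append({
                "label": labels[int(detection_classes[0, i])],
                "score": score,
                "box": (float(x1), float(y1), float(x2), float(y2)),
            })
        return results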

To deploy these models with the Perception App, use the config below to override the default configuration:

property:
    gpu-id: 0
    net-scale-factor: 1
    offsets: 0;0;0
    model-color-format: 0
    tlt-model-key: nvidia_tlt
    tlt-encoded-model: ../../models/retailDetector/retailDetector_100.etlt
    model-engine-file: ../../models/retailDetector/retailDetector_100.etlt_b1_gpu0_fp16.engine
    labelfile-path: ../../models/retailDetector/retailDetector_100_labels.txt
    network-input-order: 1
    infer-dims: 3;416;416
    maintain-aspect-ratio: 1
    batch-size: 1
    ## 0=FP32, 1=INT8, 2=FP16 mode
    network-mode: 2
    num-detected-classes: 100
    interval: 0
    cluster-mode: 3
    output-blob-names: num_detections;detection_boxes;detection_scores;detection_classes
    parse-bbox-func-name: NvDsInferParseCustomEfficientDetTAO
    custom-lib-path: ../../post_processor/libnvds_infercustomparser_tao.so
    #Use the config params below for NMS clustering mode
class-attrs-all:
    pre-cluster-threshold: 0.5
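
With the settings above saved to a file (the file and stream names below are hypothetical), a minimal Python GStreamer sketch that runs the detector through nvinfer could look like the following; the exact pipeline will vary with your input source and sink.

    import gi
    gi.require_version("Gst", "1.0")
    from gi.repository import Gst

    Gst.init(None)

    # nvstreammux batches frames, nvinfer runs the detector with the
    # config above, and nvdsosd draws the resulting boxes.
    pipeline = Gst.parse_launch(
        "filesrc location=retail_sample.h264 ! h264parse ! nvv4l2decoder ! "
        "m.sink_0 nvstreammux name=m batch-size=1 width=1920 height=1080 ! "
        "nvinfer config-file-path=retail_detector_pgie_config.txt ! "
        "nvvideoconvert ! nvdsosd ! fakesink"
    )
    pipeline.set_state(Gst.State.PLAYING)
    bus = pipeline.get_bus()
    bus.timed_pop_filtered(Gst.CLOCK_TIME_NONE,
                           Gst.MessageType.EOS | Gst.MessageType.ERROR)
    pipeline.set_state(Gst.State.NULL)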

Note

The sample perception app ships with an example configuration file at /opt/nvidia/deepstream/deepstream/sources/apps/sample_apps/deepstream-fewshot-learning-app/configs/fsl/fsl_pgie_config.txt.

The “Deploying to DeepStream” chapter of the TAO User Guide provides more details.