Retail Object Detection
The retail detector models described here detect retail items within an image and return a bounding box around each detected item. The retail items are generally packaged commercial goods with barcodes and ingredient labels on them.
Two detection models are provided. The 100-class Retail Item Detection model detects 100 specific retail item classes. The Binary-class Retail Item Detection model detects general retail items and returns a single category.
These models are based on EfficientDet-D5. EfficientDet is a one-stage detector with the following architecture components:
NvImageNet-pretrained EfficientNet-B5 backbone
Weighted bi-directional feature pyramid network (BiFPN)
Bounding-box and classification prediction heads
A compound scaling method that uniformly scales the resolution, depth, and width for all backbone, feature network, and box/class prediction networks at the same time
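The compound scaling idea above can be sketched with the base formulas from the EfficientDet paper. Note that the published D0-D7 configurations round and hand-adjust some values, and the retail models here use a custom 416 x 416 input rather than the paper's scaled resolution, so treat this purely as an illustration of how a single coefficient drives width and depth together.

```python
# Illustration of EfficientDet compound scaling: one coefficient phi
# scales the BiFPN width and depth together (base formulas from the
# EfficientDet paper; published configs adjust some values by hand).

def bifpn_width(phi: int) -> float:
    """BiFPN channel width grows geometrically: W_bifpn = 64 * 1.35**phi."""
    return 64 * 1.35 ** phi

def bifpn_depth(phi: int) -> int:
    """BiFPN layer count grows linearly: D_bifpn = 3 + phi."""
    return 3 + phi

print(bifpn_width(0), bifpn_depth(0))  # 64.0 3 (matches EfficientDet-D0)
```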
Model Card
More details on the models can be found on the model card.
Deploy With DeepStream
Input
RGB image of dimensions: 416 x 416 x 3 (W x H x C)
Channel ordering of the input: NCHW, where N = batch size, C = number of channels (3), H = height of the image (416), W = width of the image (416)
Input scale: 1.0
Mean subtraction: 0
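A minimal preprocessing sketch for this input spec, assuming the frame has already been decoded and resized to 416 x 416 RGB. Since the scale is 1.0 and the mean is 0, no normalization is applied beyond the dtype conversion; the `preprocess` helper name is illustrative, not part of any API.

```python
import numpy as np

def preprocess(frame_hwc: np.ndarray) -> np.ndarray:
    """Convert an HWC uint8 RGB frame to an NCHW float32 batch of size 1.

    Scale = 1.0 and mean subtraction = 0 per the model's input spec,
    so only the layout and dtype change.
    """
    assert frame_hwc.shape == (416, 416, 3), "expected a 416x416 RGB frame"
    chw = frame_hwc.astype(np.float32).transpose(2, 0, 1)  # HWC -> CHW
    return chw[np.newaxis, ...]                            # add batch dim -> NCHW

batch = preprocess(np.zeros((416, 416, 3), dtype=np.uint8))
print(batch.shape)  # (1, 3, 416, 416)
```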
Output
Category labels and bounding-box coordinates for each detected retail item in the input image.
To deploy these models with the Perception App, override the default configuration with the settings below:
property:
gpu-id: 0
net-scale-factor: 1
offsets: 0;0;0
model-color-format: 0
tlt-model-key: nvidia_tlt
tlt-encoded-model: ../../models/retailDetector/retailDetector_100.etlt
model-engine-file: ../../models/retailDetector/retailDetector_100.etlt_b1_gpu0_fp16.engine
labelfile-path: ../../models/retailDetector/retailDetector_100_labels.txt
network-input-order: 1
infer-dims: 3;416;416
maintain-aspect-ratio: 1
batch-size: 1
## 0=FP32, 1=INT8, 2=FP16 mode
network-mode: 2
num-detected-classes: 100
interval: 0
cluster-mode: 3
output-blob-names: num_detections;detection_boxes;detection_scores;detection_classes
parse-bbox-func-name: NvDsInferParseCustomEfficientDetTAO
custom-lib-path: ../../post_processor/libnvds_infercustomparser_tao.so
#Use the config params below for NMS clustering mode
class-attrs-all:
pre-cluster-threshold: 0.5
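In DeepStream, the output tensors named in `output-blob-names` are decoded by the `NvDsInferParseCustomEfficientDetTAO` function from the custom parser library. The sketch below only illustrates the filtering logic conceptually; the blob layouts (boxes as `[N, max_det, 4]`, scores and classes as `[N, max_det]`) are assumptions, and the threshold mirrors `pre-cluster-threshold: 0.5`.

```python
import numpy as np

# Conceptual decoding of the four output blobs: num_detections,
# detection_boxes, detection_scores, detection_classes. Layouts assumed:
# boxes [N, max_det, 4], scores/classes [N, max_det]. In DeepStream this
# is done by NvDsInferParseCustomEfficientDetTAO, not user code.

def filter_detections(num_detections, boxes, scores, classes, threshold=0.5):
    """Keep detections for batch item 0 whose score meets the threshold."""
    n = int(num_detections[0])
    kept = []
    for i in range(n):
        if scores[0, i] >= threshold:  # mirrors pre-cluster-threshold: 0.5
            kept.append((int(classes[0, i]),
                         float(scores[0, i]),
                         boxes[0, i].tolist()))
    return kept

# Tiny fabricated example: two detections, one below the threshold.
dets = filter_detections(
    np.array([2]),
    np.array([[[10, 20, 50, 60], [0, 0, 5, 5]]], dtype=np.float32),
    np.array([[0.9, 0.3]], dtype=np.float32),
    np.array([[7, 1]], dtype=np.float32),
)
print(len(dets))  # 1
```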
Note
The sample perception app has configuration file examples packaged under /opt/nvidia/deepstream/deepstream/sources/apps/sample_apps/deepstream-fewshot-learning-app/configs/fsl/fsl_pgie_config.txt.
The “Deploying to DeepStream” chapter of the TAO User Guide provides more details.