Retail Object Recognition
This model encodes retail items into embedding vectors and predicts their labels by comparing those embeddings against a reference space of known items.
The model consists of a trunk and an embedder. The trunk uses the ResNet101 architecture with its fully connected layer removed. The embedder is a one-layer perceptron with an input size of 2048 (the output size of the average-pool layer in ResNet101) and an output size of 2048, so the embedding dimension of the Retail Embedding model is 2048.
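The trunk/embedder split and reference-space matching can be illustrated with a minimal NumPy sketch. This is not the TAO implementation: the trunk here is a random stand-in for ResNet101's pooled features, the embedder is a single 2048x2048 linear map, and `predict` does cosine-similarity nearest-neighbor lookup against reference embeddings; all function and variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
EMBED_DIM = 2048  # output size of the ResNet101 average pool and of the embedder

def trunk(image):
    # Hypothetical stand-in: the real trunk is ResNet101 without its
    # fully connected layer, producing a 2048-d pooled feature vector.
    return rng.standard_normal(EMBED_DIM)

# One-layer perceptron embedder: a single 2048x2048 linear map.
W = rng.standard_normal((EMBED_DIM, EMBED_DIM)) / np.sqrt(EMBED_DIM)

def embed(image):
    return W @ trunk(image)

def predict(query_emb, ref_embs, ref_labels):
    # Cosine-similarity nearest neighbor in the reference embedding space.
    q = query_emb / np.linalg.norm(query_emb)
    r = ref_embs / np.linalg.norm(ref_embs, axis=1, keepdims=True)
    return ref_labels[int(np.argmax(r @ q))]

# Build a tiny reference space of three (hypothetical) retail items.
refs = np.stack([embed(None) for _ in range(3)])
labels = np.array(["cereal", "soda", "soap"])

# A query that is a slightly perturbed copy of the second reference
# should resolve to that reference's label.
query = refs[1] + 0.01 * rng.standard_normal(EMBED_DIM)
print(predict(query, refs, labels))  # "soda"
```

At deployment time the reference embeddings would come from enrolled product images, so new items can be added without retraining the network.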
Model Card
More details on the models can be found on the model card.
Deploy With DeepStream
To deploy this model with the Perception App, override the inference configuration as follows:
property:
  net-scale-factor: 0.003921568627451
  offsets: 0;0;0
  model-color-format: 0
  tlt-model-key: nvidia_tlt
  tlt-encoded-model: ../../models/retailEmbedder/retailEmbedder.etlt
  model-engine-file: ../../models/retailEmbedder/retailEmbedder.etlt_b16_gpu0_fp16.engine
  infer-dims: 3;224;224
  batch-size: 16
  ## 0=FP32, 1=INT8, 2=FP16 mode
  network-mode: 2
  network-type: 100
  interval: 0
  ## Infer Processing Mode 1=Primary Mode 2=Secondary Mode
  process-mode: 2
  output-tensor-meta: 1
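The net-scale-factor and offsets in the config above define nvinfer's per-pixel input normalization, y = net-scale-factor * (x - offset) applied per channel. With offsets of 0 and a scale of approximately 1/255, 8-bit pixel values are mapped into [0, 1]. A quick sanity check of that arithmetic (a sketch, not DeepStream code):

```python
# nvinfer normalizes each input pixel as y = net-scale-factor * (x - offset).
NET_SCALE_FACTOR = 0.003921568627451  # ~ 1/255, as in the config above
OFFSETS = (0.0, 0.0, 0.0)             # per-channel offsets, one per RGB plane

def normalize(pixel, channel):
    return NET_SCALE_FACTOR * (pixel - OFFSETS[channel])

print(normalize(0, 0))    # 0.0: black maps to the bottom of the range
print(normalize(255, 0))  # ~1.0: full-scale pixels map to the top
```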
Note
The sample perception app has a configuration file example packaged under /opt/nvidia/deepstream/deepstream/sources/apps/sample_apps/deepstream-fewshot-learning-app/configs/fsl/fsl_sgie_config.txt.
The “Deploying to DeepStream” chapter of the TAO User Guide provides more details.