Open Images Pre-trained Object Detection

Object detection is a popular computer vision technique that detects one or more objects in a frame. It recognizes the individual objects in an image and places a bounding box around each one. This model contains pretrained weights that may be used as a starting point for the following object detection networks in TAO Toolkit to facilitate transfer learning:

  • YOLOv3

  • YOLOv4

  • YOLOv4-tiny

  • FasterRCNN

  • SSD

  • DSSD

  • RetinaNet

The pretrained weights were trained on a subset of the Google OpenImages dataset.

The following backbones are supported with these detection networks:

  • resnet10/resnet18/resnet34/resnet50/resnet101

  • vgg16/vgg19

  • googlenet

  • mobilenet_v1/mobilenet_v2

  • squeezenet

  • darknet19/darknet53

  • efficientnet_b0

  • cspdarknet19/cspdarknet53

  • cspdarknet_tiny
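As a minimal sketch, the backbone names listed above can be expanded from their slash-grouped form into a flat set for a quick name check before writing a spec file. The helper `is_listed_backbone` and the constant names are hypothetical and are not part of TAO Toolkit. The check covers names only; it says nothing about which detection network a backbone can be paired with.

```python
# Backbone names copied from the list above, in their slash-grouped form.
GROUPED_BACKBONES = [
    "resnet10/resnet18/resnet34/resnet50/resnet101",
    "vgg16/vgg19",
    "googlenet",
    "mobilenet_v1/mobilenet_v2",
    "squeezenet",
    "darknet19/darknet53",
    "efficientnet_b0",
    "cspdarknet19/cspdarknet53",
    "cspdarknet_tiny",
]

# Expand each group on "/" into individual backbone names.
BACKBONES = frozenset(
    name for group in GROUPED_BACKBONES for name in group.split("/")
)


def is_listed_backbone(name: str) -> bool:
    """Hypothetical helper: True if `name` appears in the list above.

    The comparison ignores case and surrounding whitespace. It does not
    check whether the backbone works with a particular detection network.
    """
    return name.strip().lower() in BACKBONES


if __name__ == "__main__":
    for candidate in ("resnet18", "CSPDarkNet53", "alexnet"):
        print(candidate, is_listed_backbone(candidate))
```

A name that passes this check may still be unsupported for a given detection network, so the support matrix remains the reference for combinations.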

Some combinations might not be supported. See the matrix below for all supported combinations.

Object Detection
Backbone FasterRCNN SSD YOLOv3 RetinaNet DSSD YOLOv4 YOLOv4-tiny
ResNet10/18/34/50/101 Yes Yes Yes Yes Yes Yes
VGG 16/19 Yes Yes Yes Yes Yes Yes
GoogLeNet Yes Yes Yes Yes Yes Yes
MobileNet V1/V2 Yes Yes Yes Yes Yes Yes
SqueezeNet Yes Yes Yes Yes Yes
DarkNet 19/53 Yes Yes Yes Yes Yes Yes
CSPDarkNet 19/53 Yes
CSPDarkNet-tiny Yes
Efficientnet B0 Yes Yes Yes Yes
Efficientnet B1 Yes
Note
  • These are unpruned models that contain only the feature extractor weights. They must be retrained before they can be deployed in an object detection application.

  • Please make sure to set the all_projections field to False in the spec file when training a ResNet101 model.
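The ResNet101 note can be turned into a simple pre-training check, sketched below. Only the `all_projections` field name comes from the note above. The `arch` and `nlayers` field names, the `model_config` block name, and the text layout of the spec are assumptions made for illustration, and `check_all_projections` is a hypothetical helper, not a TAO Toolkit function. The actual spec layout for each network should be taken from its own documentation.

```python
import re


def check_all_projections(spec_text: str) -> bool:
    """Hypothetical check: True if the spec is consistent with the note.

    Assumes the backbone is written as `arch: "resnet"` together with
    `nlayers: 101` (assumed field names). Only `all_projections` is
    taken from the note itself.
    """
    arch = re.search(r'\barch\s*:\s*"?([A-Za-z_]+)', spec_text)
    nlayers = re.search(r'\bnlayers\s*:\s*(\d+)', spec_text)
    is_resnet101 = (
        arch is not None
        and arch.group(1).lower() == "resnet"
        and nlayers is not None
        and nlayers.group(1) == "101"
    )
    if not is_resnet101:
        # The note applies to ResNet101 only.
        return True

    flag = re.search(r'\ball_projections\s*:\s*(\w+)', spec_text)
    # A missing field is treated as a failure, because the note asks for
    # the value to be set explicitly.
    return flag is not None and flag.group(1).lower() == "false"


if __name__ == "__main__":
    # Hypothetical spec fragment; the block name is illustrative only.
    example_spec = """
    model_config {
      arch: "resnet"
      nlayers: 101
      all_projections: False
    }
    """
    print(check_all_projections(example_spec))
```

The check is text-based and deliberately small: it flags a ResNet101 spec in which `all_projections` is missing or set to anything other than False, and accepts every other backbone unchanged.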

For more instructions on downloading and using the models defined here, refer to the NGC catalog page.

© Copyright 2023, NVIDIA. Last updated on Dec 8, 2023.