Open Images Pre-trained Object Detection

Object detection is a popular computer vision technique that detects one or more objects in a frame. It recognizes the individual objects in an image and places a bounding box around each one. This model contains pretrained weights that can be used as a starting point for the following object detection networks in TAO Toolkit to facilitate transfer learning.

  • YOLOv3

  • YOLOv4

  • YOLOv4-tiny

  • FasterRCNN

  • SSD

  • DSSD

  • RetinaNet

The pretrained weights were trained on a subset of the Google OpenImages dataset.

The following backbones are supported with these detection networks:

  • resnet10/resnet18/resnet34/resnet50/resnet101

  • vgg16/vgg19

  • googlenet

  • mobilenet_v1/mobilenet_v2

  • squeezenet

  • darknet19/darknet53

  • efficientnet_b0/efficientnet_b1

  • cspdarknet19/cspdarknet53

  • cspdarknet_tiny

Some combinations might not be supported. See the matrix below for all supported combinations.

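Object Detection

| Backbone              | FasterRCNN | SSD | YOLOv3 | RetinaNet | DSSD | YOLOv4 | YOLOv4-tiny |
|-----------------------|------------|-----|--------|-----------|------|--------|-------------|
| ResNet10/18/34/50/101 | Yes        | Yes | Yes    | Yes       | Yes  | Yes    |             |
| VGG 16/19             | Yes        | Yes | Yes    | Yes       | Yes  | Yes    |             |
| GoogLeNet             | Yes        | Yes | Yes    | Yes       | Yes  | Yes    |             |
| MobileNet V1/V2       | Yes        | Yes | Yes    | Yes       | Yes  | Yes    |             |
| SqueezeNet            |            | Yes | Yes    | Yes       | Yes  | Yes    |             |
| DarkNet 19/53         | Yes        | Yes | Yes    | Yes       | Yes  | Yes    |             |
| CSPDarkNet 19/53      |            |     |        |           |      | Yes    |             |
| CSPDarkNet-tiny       |            |     |        |           |      |        | Yes         |
| EfficientNet B0       | Yes        | Yes |        | Yes       | Yes  |        |             |
| EfficientNet B1       | Yes        |     |        |           |      |        |             |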
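
The support matrix can also be encoded as a lookup so that a backbone/network pair is checked before a training run is configured. The following is a minimal Python sketch, not part of TAO Toolkit: the names `NETWORKS`, `SUPPORT`, and `is_supported` are hypothetical, and the assignment of each "Yes" to a network column is this sketch's reading of the matrix, so verify it against the NGC catalog page before relying on it.

```python
# Hypothetical helper, not a TAO Toolkit API.
# Column assignments below are an assumption based on the matrix above.
NETWORKS = ("FasterRCNN", "SSD", "YOLOv3", "RetinaNet", "DSSD", "YOLOv4", "YOLOv4-tiny")

# Most backbones work with every network except YOLOv4-tiny.
ALL_BUT_TINY = frozenset(NETWORKS) - {"YOLOv4-tiny"}

# One entry per table row: backbone names -> networks marked "Yes".
_ROWS = {
    ("resnet10", "resnet18", "resnet34", "resnet50", "resnet101"): ALL_BUT_TINY,
    ("vgg16", "vgg19"): ALL_BUT_TINY,
    ("googlenet",): ALL_BUT_TINY,
    ("mobilenet_v1", "mobilenet_v2"): ALL_BUT_TINY,
    ("squeezenet",): ALL_BUT_TINY - {"FasterRCNN"},
    ("darknet19", "darknet53"): ALL_BUT_TINY,
    ("cspdarknet19", "cspdarknet53"): frozenset({"YOLOv4"}),
    ("cspdarknet_tiny",): frozenset({"YOLOv4-tiny"}),
    ("efficientnet_b0",): frozenset({"FasterRCNN", "SSD", "RetinaNet", "DSSD"}),
    ("efficientnet_b1",): frozenset({"FasterRCNN"}),
}

# Flatten the rows into a per-backbone lookup.
SUPPORT = {name: nets for names, nets in _ROWS.items() for name in names}


def is_supported(backbone: str, network: str) -> bool:
    """Return True if the matrix marks this backbone/network pair as supported."""
    if network not in NETWORKS:
        raise ValueError(f"unknown detection network: {network!r}")
    key = backbone.lower()
    if key not in SUPPORT:
        raise ValueError(f"unknown backbone: {backbone!r}")
    return network in SUPPORT[key]


if __name__ == "__main__":
    for pair in [("resnet18", "YOLOv4"), ("cspdarknet_tiny", "SSD")]:
        print(pair, is_supported(*pair))
```

Raising `ValueError` for unknown names, rather than returning `False`, keeps a typo in a backbone name from being mistaken for an unsupported combination.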

Note
  • These are unpruned models that contain only the feature extractor weights; they cannot be deployed in a classification application without retraining.

  • Please make sure to set the all_projections field to False in the spec file when training a ResNet101 model.
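
The ResNet101 requirement above is easy to overlook, so a small pre-flight check on the spec file can help. The following Python sketch assumes the spec file is plain text in which the field appears as `all_projections: <value>` on its own line; the function name `all_projections_disabled`, the wrapper block `example_block`, and the sample spec text are hypothetical and only illustrate the check.

```python
import re


def all_projections_disabled(spec_text: str) -> bool:
    """Return True only if all_projections is set explicitly, and always to False.

    A spec with no all_projections field returns False, because this sketch
    does not assume what the default value is.
    """
    values = re.findall(
        r"^\s*all_projections\s*:\s*(\w+)", spec_text, flags=re.MULTILINE
    )
    return bool(values) and all(v.lower() == "false" for v in values)


# Hypothetical spec fragment; only the all_projections field name comes from
# the note above, the surrounding block name is made up for illustration.
example_spec = """
example_block {
  all_projections: False
}
"""

if __name__ == "__main__":
    if all_projections_disabled(example_spec):
        print("all_projections is False: OK for ResNet101")
    else:
        print("Set all_projections to False before training ResNet101")
```

To check a real spec, read the file with `pathlib.Path(path).read_text()` and pass the result to the function.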

For more instructions on downloading and using the models defined here, refer to the NGC catalog page.

© Copyright 2022, NVIDIA. Last updated on Jun 6, 2022.