Using a Custom Model with DeepStream

The NVIDIA® DeepStream SDK on NVIDIA® Tesla® or NVIDIA® Jetson platforms can be customized to support custom neural networks for object detection and classification. To use your own model, you must specify the applicable configuration parameters in the [property] group of the nvinfer configuration file (for example, config_infer_primary.txt). The configuration parameters that you must specify include:

  • model-file (Caffe model)

  • proto-file (Caffe model)

  • uff-file (UFF models)

  • onnx-file (ONNX models)

  • model-engine-file, if already generated

  • int8-calib-file for INT8 mode

  • mean-file, if required

  • offsets, if required

  • maintain-aspect-ratio, if required

  • parse-bbox-func-name (detectors only)

  • parse-classifier-func-name (classifiers only)

  • custom-lib-path

  • output-blob-names (Caffe and UFF models)

  • network-type

  • model-color-format

  • process-mode

  • engine-create-func-name

  • infer-dims (UFF models)

  • uff-input-order (UFF models)
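
For example, a minimal [property] group for a Caffe-based detector might look like the following. All file names, paths, and the parser function name are illustrative; substitute the values for your own model:

    [property]
    # Illustrative values for a hypothetical Caffe-based detector
    model-file=resnet10_detector.caffemodel
    proto-file=resnet10_detector.prototxt
    model-engine-file=resnet10_detector.caffemodel_b1_int8.engine
    int8-calib-file=cal_trt.bin
    output-blob-names=conv2d_bbox;conv2d_cov/Sigmoid
    network-type=0
    model-color-format=0
    process-mode=1
    parse-bbox-func-name=NvDsInferParseCustomResnet
    custom-lib-path=/path/to/libnvdsinfer_customparser.so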

Custom Model Implementation Interface

nvinfer supports interfaces for these purposes:

  • Custom bounding box parsing for custom neural network detectors and classifiers

  • IPlugin implementation for layers not natively supported by NVIDIA® TensorRT™

  • Initializing non-image input layers in cases where the network has more than one input layer

  • Creating a CUDA engine using TensorRT Layer APIs instead of model parsing APIs; for details, see the TensorRT Developer Guide: https://docs.nvidia.com/deeplearning/tensorrt/developer-guide/index.html

  • IModelParser interface to parse the model and fill the layers in an INetworkDefinition

All the interface implementations for the models must go into a single independent shared library. nvinfer dynamically loads the library with dlopen(), looks for implemented interfaces with dlsym(), and calls the interfaces as required. For more information about the interface, refer to the header file nvdsinfer_custom_impl.h.

Custom Output Parsing

For detectors, you must write a library that can parse the bounding box coordinates and the object class from the output layers. For classifiers, the library must parse the object attributes from the output layers. For instance segmentation models, the library must parse the bounding box, object class, and instance mask from the output layers. You can find example code and a makefile in the source directory sources/libs/nvdsinfer_customparser. The generated library path and the parsing function name must be specified with the parse-bbox-func-name (or parse-classifier-func-name) and custom-lib-path configuration parameters listed above; a sketch of a detector parsing function follows the list below. The README file in sources/libs/nvdsinfer_customparser has an example of how to use this custom parser. The following output parsers are supported in the current release:

  • FasterRCNN

  • MaskRCNN

  • SSD

  • YoloV3 / YoloV3Tiny / YoloV2 / YoloV2Tiny

  • DetectNet
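
As a sketch of the detector case, the following hypothetical parsing function converts a flat output buffer into NvDsInferObjectDetectionInfo entries. The layer name, output layout, detection count, and function name are assumptions for illustration; the types and the prototype-check macro come from nvdsinfer_custom_impl.h (check the header in your release for the exact field names):

    #include <cstring>
    #include <vector>
    #include "nvdsinfer_custom_impl.h"

    /* Hypothetical detector output: one output layer named "detections"
     * holding floats laid out as
     * [classId, confidence, left, top, width, height] per detection. */
    extern "C" bool NvDsInferParseCustomMyDetector (
        std::vector<NvDsInferLayerInfo> const &outputLayersInfo,
        NvDsInferNetworkInfo const &networkInfo,
        NvDsInferParseDetectionParams const &detectionParams,
        std::vector<NvDsInferObjectDetectionInfo> &objectList)
    {
        for (const auto &layer : outputLayersInfo) {
            if (strcmp (layer.layerName, "detections") != 0)
                continue;
            const float *data = static_cast<const float *> (layer.buffer);
            /* Assume 100 candidate detections, 6 floats each. */
            for (int i = 0; i < 100; i++) {
                const float *det = &data[i * 6];
                unsigned int classId = static_cast<unsigned int> (det[0]);
                /* Drop detections below the per-class threshold. */
                if (classId >= detectionParams.numClassesConfigured ||
                    det[1] < detectionParams.perClassThreshold[classId])
                    continue;
                NvDsInferObjectDetectionInfo obj;
                obj.classId = classId;
                obj.detectionConfidence = det[1];
                obj.left = det[2];
                obj.top = det[3];
                obj.width = det[4];
                obj.height = det[5];
                objectList.push_back (obj);
            }
        }
        return true;
    }

    /* Confirm at compile time that the function matches the expected prototype. */
    CHECK_CUSTOM_PARSE_FUNC_PROTOTYPE(NvDsInferParseCustomMyDetector);

In the configuration file, this function would be selected with parse-bbox-func-name=NvDsInferParseCustomMyDetector together with custom-lib-path pointing to the compiled library.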

IPlugin Implementation

DeepStream supports networks containing layers that are not supported natively by TensorRT but are implemented through TensorRT plugin interfaces. The objectDetector_SSD, objectDetector_FasterRCNN, and objectDetector_YoloV3 sample applications show examples of IPlugin implementations. The Gst-nvinfer plugin supports the IPluginV2 and IPluginCreator interfaces introduced in TensorRT 5.0. For Caffe models, and for backward compatibility with existing plugins, it also supports the following interfaces:

  • nvinfer1::IPluginFactory

  • nvuffparser::IPluginFactory

  • nvuffparser::IPluginFactoryExt

  • nvcaffeparser1::IPluginFactory

  • nvcaffeparser1::IPluginFactoryExt

  • nvcaffeparser1::IPluginFactoryV2

See the TensorRT documentation for details on new and deprecated plugin interfaces.

How to Use IPluginCreator

To use the new IPluginCreator interface you must implement the interface in an independent custom library. This library must be passed to the Gst-nvinfer plugin through its configuration file by specifying the library’s pathname with the custom-lib-path key. Gst-nvinfer opens the library with dlopen(), which causes the plugin to be registered with TensorRT. There is no further direct interaction between the custom library and Gst-nvinfer; TensorRT calls the custom plugin functions as required. The SSD sample included with the SDK shows an example of using the IPluginV2 and IPluginCreator interfaces. This sample has been adapted from TensorRT.
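
The following skeleton shows the registration pattern: derive from nvinfer1::IPluginCreator and register the creator with the REGISTER_TENSORRT_PLUGIN macro, so that loading the library with dlopen() is enough to make the plugin known to TensorRT. The class name and plugin name are assumptions, the plugin construction itself is omitted, and the signatures are shown without the noexcept qualifiers added in later TensorRT releases:

    #include <string>
    #include "NvInfer.h"

    /* Skeleton creator for a hypothetical custom layer named "MyCustomLayer". */
    class MyPluginCreator : public nvinfer1::IPluginCreator
    {
    public:
        const char* getPluginName () const override { return "MyCustomLayer"; }
        const char* getPluginVersion () const override { return "1"; }

        const nvinfer1::PluginFieldCollection* getFieldNames () override
        {
            static nvinfer1::PluginFieldCollection fc{};
            return &fc;
        }

        nvinfer1::IPluginV2* createPlugin (const char* name,
            const nvinfer1::PluginFieldCollection* fc) override
        {
            /* Construct and return your IPluginV2 implementation here (omitted). */
            return nullptr;
        }

        nvinfer1::IPluginV2* deserializePlugin (const char* name,
            const void* serialData, size_t serialLength) override
        {
            /* Recreate the plugin from its serialized form here (omitted). */
            return nullptr;
        }

        void setPluginNamespace (const char* ns) override { mNamespace = ns; }
        const char* getPluginNamespace () const override { return mNamespace.c_str (); }

    private:
        std::string mNamespace;
    };

    /* Registering the creator at library load time is what lets Gst-nvinfer's
     * dlopen() of custom-lib-path register the plugin with TensorRT. */
    REGISTER_TENSORRT_PLUGIN (MyPluginCreator);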

How to Use IPluginFactory

To use the IPluginFactory interface, you must implement the interface in an independent custom library. Pass this library to the Gst-nvinfer plugin through the plugin’s configuration file by specifying the library’s pathname in the custom-lib-path key. The custom library must implement the applicable functions:

  • NvDsInferPluginFactoryCaffeGet

  • NvDsInferPluginFactoryCaffeDestroy

  • NvDsInferPluginFactoryUffGet

  • NvDsInferPluginFactoryUffDestroy

  • NvDsInferPluginFactoryRuntimeGet

  • NvDsInferPluginFactoryRuntimeDestroy

These functions, and the structures they use, are declared in nvdsinfer_custom_impl.h. The function definitions must use the names given in the header file, because Gst-nvinfer opens the custom library with dlopen() and looks up the symbols by name.

For Caffe Files

During parsing and building of a Caffe network, Gst-nvinfer looks for NvDsInferPluginFactoryCaffeGet. If found, it calls the function to get the IPluginFactory instance. Depending on the type of IPluginFactory returned, Gst-nvinfer sets the factory using one of the ICaffeParser interface’s methods setPluginFactory(), setPluginFactoryExt(), or setPluginFactoryV2(). After the network has been built and serialized, Gst-nvinfer looks for NvDsInferPluginFactoryCaffeDestroy and calls it to destroy the IPluginFactory instance.
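 
As a sketch, the pair of functions for a Caffe model that returns an nvcaffeparser1::IPluginFactoryV2 might look like the following. The factory class, layer name, and plugin construction are assumptions for illustration; the NvDsInferPluginFactoryCaffe union and NvDsInferPluginFactoryType enum are declared in nvdsinfer_custom_impl.h (check the header in your release for the exact declarations):

    #include <cstring>
    #include "nvdsinfer_custom_impl.h"
    #include "NvCaffeParser.h"

    /* Hypothetical factory implementing nvcaffeparser1::IPluginFactoryV2
     * for a network with one custom layer (plugin creation omitted). */
    class MyCaffePluginFactory : public nvcaffeparser1::IPluginFactoryV2
    {
    public:
        bool isPluginV2 (const char* layerName) override
        {
            return strcmp (layerName, "MyCustomLayer") == 0;
        }

        nvinfer1::IPluginV2* createPlugin (const char* layerName,
            const nvinfer1::Weights* weights, int nbWeights,
            const char* libNamespace) override
        {
            /* Construct and return the IPluginV2 for this layer (omitted). */
            return nullptr;
        }
    };

    extern "C"
    bool NvDsInferPluginFactoryCaffeGet (NvDsInferPluginFactoryCaffe &pluginFactory,
        NvDsInferPluginFactoryType &type)
    {
        /* Tell Gst-nvinfer which union member is valid. */
        type = PLUGIN_FACTORY_V2;
        pluginFactory.pluginFactoryV2 = new MyCaffePluginFactory;
        return true;
    }

    extern "C"
    void NvDsInferPluginFactoryCaffeDestroy (NvDsInferPluginFactoryCaffe &pluginFactory)
    {
        delete static_cast<MyCaffePluginFactory *> (pluginFactory.pluginFactoryV2);
    }

The NvDsInferPluginFactoryUffGet / NvDsInferPluginFactoryUffDestroy pair for UFF models follows the same pattern with the nvuffparser factory types declared alongside it in the same header.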

For Uff Files

During parsing and building of a UFF network, Gst-nvinfer looks for NvDsInferPluginFactoryUffGet. If found, it calls the function to get the IPluginFactory instance. Depending on the type of IPluginFactory returned, Gst-nvinfer sets the factory using one of the IUffParser interface’s methods setPluginFactory() or setPluginFactoryExt(). After the network has been built and serialized, Gst-nvinfer looks for NvDsInferPluginFactoryUffDestroy and calls it to destroy the IPluginFactory instance.

During Deserialization

If deserializing the models requires an instance of nvinfer1::IPluginFactory, the custom library must also implement NvDsInferPluginFactoryRuntimeGet() and, optionally, NvDsInferPluginFactoryRuntimeDestroy(). During deserialization, Gst-nvinfer calls the library’s NvDsInferPluginFactoryRuntimeGet() function to get the IPluginFactory instance; during deinitialization, it calls NvDsInferPluginFactoryRuntimeDestroy() to destroy the instance, if the library implements that function. The FasterRCNN sample provided with the SDK shows an example of using the IPluginV2 + nvcaffeparser1::IPluginFactoryV2 interface with DeepStream. This sample has been adapted from TensorRT. It also provides an example of using the legacy IPlugin + nvcaffeparser1::IPluginFactory + nvinfer1::IPluginFactory interface for backward compatibility.
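
A sketch of the runtime pair, assuming a hypothetical MyRuntimePluginFactory class that implements nvinfer1::IPluginFactory (check nvdsinfer_custom_impl.h in your release for the exact prototypes):

    #include "nvdsinfer_custom_impl.h"

    /* Hypothetical runtime factory implementing nvinfer1::IPluginFactory;
     * it recreates plugin layers while a serialized engine is deserialized. */
    class MyRuntimePluginFactory : public nvinfer1::IPluginFactory
    {
    public:
        nvinfer1::IPlugin* createPlugin (const char* layerName,
            const void* serialData, size_t serialLength) override
        {
            /* Recreate the plugin layer from its serialized form (omitted). */
            return nullptr;
        }
    };

    extern "C"
    bool NvDsInferPluginFactoryRuntimeGet (nvinfer1::IPluginFactory *&pluginFactory)
    {
        pluginFactory = new MyRuntimePluginFactory;
        return true;
    }

    extern "C"
    void NvDsInferPluginFactoryRuntimeDestroy (nvinfer1::IPluginFactory *pluginFactory)
    {
        delete static_cast<MyRuntimePluginFactory *> (pluginFactory);
    }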

Input Layer Initialization

DeepStream supports initializing non-image input layers for networks having more than one input layer. The layers are initialized only once before the first inference call. The objectDetector_FasterRCNN sample application shows an example of an implementation.
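
A sketch of the NvDsInferInitializeInputLayers() function, modeled loosely on the FasterRCNN-style "im_info" input; the layer name, layout, and values are assumptions for illustration. nvinfer calls this function once before the first inference and passes buffers for the non-image input layers for the function to fill:

    #include <cstring>
    #include <vector>
    #include "nvdsinfer_custom_impl.h"

    /* Initialize a hypothetical "im_info" input layer that carries
     * (height, width, scale) for each image in the batch. */
    extern "C"
    bool NvDsInferInitializeInputLayers (
        std::vector<NvDsInferLayerInfo> const &inputLayersInfo,
        NvDsInferNetworkInfo const &networkInfo,
        unsigned int maxBatchSize)
    {
        for (const auto &layer : inputLayersInfo) {
            if (strcmp (layer.layerName, "im_info") != 0)
                continue;
            float *imInfo = static_cast<float *> (layer.buffer);
            for (unsigned int b = 0; b < maxBatchSize; b++) {
                imInfo[b * 3 + 0] = networkInfo.height;
                imInfo[b * 3 + 1] = networkInfo.width;
                imInfo[b * 3 + 2] = 1.0f; /* image scale */
            }
        }
        return true;
    }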

CUDA Engine Creation for Custom Models

DeepStream supports creating TensorRT CUDA engines for models that are not in Caffe, UFF, or ONNX format, or that must be created from TensorRT Layer APIs. The objectDetector_YoloV3 sample application shows an example of the implementation. When a single custom library serves multiple nvinfer plugin instances in a pipeline, each instance can have its own engine-creation implementation, selected per instance with engine-create-func-name in its configuration file. An example is a back-to-back detector pipeline that uses two different types of YOLO models.
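
As an illustration, two nvinfer instances in a back-to-back detector pipeline could load the same custom library but select different engine-creation functions. The library path and function names below are hypothetical, and each named function must match the engine-create prototype declared in nvdsinfer_custom_impl.h for your release:

    # Primary detector configuration file (illustrative)
    [property]
    custom-lib-path=/path/to/libnvdsinfer_custom_impl_yolo.so
    engine-create-func-name=NvDsInferYoloV3EngineCreate

    # Secondary detector configuration file (illustrative)
    [property]
    custom-lib-path=/path/to/libnvdsinfer_custom_impl_yolo.so
    engine-create-func-name=NvDsInferYoloV3TinyEngineCreate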

IModelParser Interface for Custom Model Parsing

This is an alternative to the “CUDA Engine Creation” interface for parsing and filling a TensorRT network (INetworkDefinition). The objectDetector_YoloV3 sample application shows an example of the implementation.
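
A skeleton implementation, assuming the nvdsinfer::IModelParser declarations in nvdsinfer_custom_impl.h for your release; the class name and parsing logic are illustrative. The library exposes the parser through NvDsInferCreateModelParser(), which Gst-nvinfer looks up with dlsym():

    #include "nvdsinfer_custom_impl.h"

    /* Hypothetical parser that builds the network with TensorRT Layer APIs. */
    class MyModelParser : public nvdsinfer::IModelParser
    {
    public:
        const char* getModelName () const override { return "MyModel"; }
        bool hasFullDimsSupported () const override { return false; }

        NvDsInferStatus parseModel (nvinfer1::INetworkDefinition &network) override
        {
            /* Add inputs, layers, and outputs to the network here using
             * network.addInput(), network.addConvolution(), ... (omitted). */
            return NVDSINFER_SUCCESS;
        }
    };

    /* Gst-nvinfer obtains the parser instance through this symbol. */
    extern "C" nvdsinfer::IModelParser*
    NvDsInferCreateModelParser (const NvDsInferContextInitParams* initParams)
    {
        return new MyModelParser;
    }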