Date: 2025-07-18

There are two options for integrating models from TAO with DeepStream:

Option 1: Integrate the model (.etlt) with the encryption key directly in the DeepStream app. The model file is generated by export.

Option 2: Generate a device-specific optimized TensorRT engine using tao-converter. The TensorRT engine file can also be ingested by DeepStream.

For FasterRCNN, you need to build the TensorRT Open Source Software (OSS) plugins and a custom bounding box parser. The instructions are provided in the TensorRT OSS section, and the required code can be found in this GitHub repo.

In order to integrate the models with DeepStream, you need the following:

The DeepStream SDK, downloaded and installed. The installation instructions for DeepStream are provided in the DeepStream Development Guide.

An exported .etlt model file and an optional calibration cache for INT8 precision.

The TensorRT OSS plugins.

A labels.txt file containing the class labels in the order in which the network produces outputs.

A sample config_infer_*.txt file to configure the nvinfer element in DeepStream. The nvinfer element handles everything related to TensorRT optimization and engine creation in DeepStream.

The DeepStream SDK ships with an end-to-end reference application that is fully configurable. Users can configure input sources, the inference model, and output sinks. The app requires a primary object detection model, followed by an optional secondary classification model. The reference application is installed as deepstream-app. The graphic below shows the architecture of the reference application.

Typically, two or more configuration files are used with this app. In the install directory, the config files are located in samples/configs/deepstream-app or samples/configs/tlt_pretrained_models. The main config file configures all the high-level parameters in the pipeline above: the input source and resolution, the number of inferences, the tracker, and the output sinks. The supporting config files cover the individual inference engines, one file per engine. These inference-specific config files specify the model, inference resolution, batch size, number of classes, and other customizations. The main config file references all the supporting config files. Here are some config files in samples/configs/deepstream-app for your reference.

source4_1080p_dec_infer-resnet_tracker_sgie_tiled_display_int8.txt : Main config file

config_infer_primary.txt : Supporting config file for primary detector in the pipeline above

config_infer_secondary_*.txt : Supporting config file for secondary classifier in the pipeline above

The deepstream-app takes only the main config file as input. This file will most likely remain the same for all models and can be used directly from the DeepStream SDK with little to no change. Users only need to modify or create config_infer_primary.txt and config_infer_secondary_*.txt.
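The relationship between the main config file and the supporting files can be sketched in Python. This is a minimal illustration, not part of the SDK: it assumes the main config uses [primary-gie] and [secondary-gie<N>] groups with enable and config-file keys, as the SDK sample configs do. The trimmed config text, the secondary file names, and the helper referenced_infer_configs are hypothetical.

```python
import configparser

# Hypothetical, trimmed-down main config; real files contain many more groups and keys.
MAIN_CONFIG = """
[application]
enable-perf-measurement=1

[primary-gie]
enable=1
config-file=config_infer_primary.txt

[secondary-gie0]
enable=1
config-file=config_infer_secondary_carcolor.txt

[secondary-gie1]
enable=0
config-file=config_infer_secondary_carmake.txt
"""


def referenced_infer_configs(main_config_text):
    """Return {group name: supporting config file} for every enabled inference group."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(main_config_text)
    refs = {}
    for section in parser.sections():
        if "-gie" not in section:
            continue  # not an inference group
        if not parser.getboolean(section, "enable", fallback=False):
            continue  # group is switched off in the main config
        config_file = parser.get(section, "config-file", fallback=None)
        if config_file is not None:
            refs[section] = config_file
    return refs


refs = referenced_infer_configs(MAIN_CONFIG)
for group, path in refs.items():
    print(f"{group} -> {path}")
```

Listing the referenced files this way shows which supporting config files need to be edited for a new model while the main config stays unchanged.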

To run a FasterRCNN model in DeepStream, you need a label file and a DeepStream configuration file. In addition, you need to compile the TensorRT Open Source Software (OSS) and the FasterRCNN bounding box parser for DeepStream.

A DeepStream sample with documentation on how to run inference using the trained FasterRCNN models from TAO is provided on GitHub here.

FasterRCNN requires the cropAndResizePlugin and the proposalPlugin. These plugins are available in the TensorRT open source repo. Detailed instructions to build TensorRT OSS can be found in TensorRT Open Source Software (OSS). FasterRCNN also requires custom bounding box parsers that are not built into the DeepStream SDK. The source code to build custom bounding box parsers for FasterRCNN is available here. The following instructions can be used to build the bounding box parser:

Step 1: Install git-lfs (git >= 1.8.2)

    curl -s https://packagecloud.io/install/repositories/github/git-lfs/script.deb.sh | sudo bash
    sudo apt-get install git-lfs
    git lfs install

Step 2: Download Source Code with SSH or HTTPS

    git clone -b release/tlt3.0 https://github.com/NVIDIA-AI-IOT/deepstream_tlt_apps

Step 3: Build

    # or Path for DS installation
    export CUDA_VER=10.2   # CUDA version, e.g. 10.2
    make

This generates libnvds_infercustomparser_tlt.so in the post_processor directory.

The label file is a text file containing the names of the classes that the FasterRCNN model is trained to detect. The order in which the classes are listed must match the order in which the model predicts the output. This order is derived from the target_class_mapping field of the FasterRCNN experiment specification file: during training, TAO FasterRCNN converts all the class names to lowercase and sorts them in alphabetical order. For example, if the target_class_mapping is:

    target_class_mapping {
      key: "car"
      value: "car"
    }
    target_class_mapping {
      key: "person"
      value: "person"
    }
    target_class_mapping {
      key: "bicycle"
      value: "bicycle"
    }

The actual class name list is bicycle, car, person. An example of the corresponding label_file_frcnn.txt file follows (a background class is always appended at the end):

    bicycle
    car
    person
    background
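The ordering rule described above (lowercase, sort alphabetically, append background) can be reproduced with a short Python sketch. The helper name frcnn_label_lines is hypothetical, and the sketch assumes that the class names come from the value side of target_class_mapping and that duplicate target classes collapse into one entry; neither detail is stated in this section.

```python
def frcnn_label_lines(target_class_values):
    """Lowercase the class names, drop duplicates, sort alphabetically, append background."""
    names = sorted({name.lower() for name in target_class_values})
    return names + ["background"]


# Values taken from the target_class_mapping example above, in spec-file order.
labels = frcnn_label_lines(["car", "person", "bicycle"])
print("\n".join(labels))
```

Writing the printed lines to a text file, one class per line, gives a label file in the layout shown above.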

Note: If --gen_ds_config is provided during TAO export of a FasterRCNN model, a label file named labels.txt is generated automatically. That labels.txt file can be used directly in DeepStream inference, without working through the details above.

The detection model is typically used as a primary inference engine. It can also be used as a secondary inference engine. To run this model in the sample deepstream-app, you must modify the existing config_infer_primary.txt file to point to this model as well as to the custom parser.

Option 1: Integrate the model (.onnx) directly in the DeepStream app.

For this option, users will need to add the following parameters in the configuration file. The int8-calib-file is only required for INT8 precision.

    onnx-file=<TAO exported .onnx>
    int8-calib-file=<Calibration cache file>

From TAO 5.0.0, .etlt is deprecated. To integrate a .etlt model directly in the DeepStream app, you need the following parameters in the configuration file.

    tlt-encoded-model=<TLT exported .etlt>
    tlt-model-key=<Model export key>
    int8-calib-file=<Calibration cache file>

The tlt-encoded-model parameter points to the exported model (.etlt) from TAO. The tlt-model-key is the encryption key used during model export.

Option 2: Integrate the TensorRT engine file with the DeepStream app.

Generate the device-specific TensorRT engine using TAO Deploy. After the engine file is generated, modify the following parameter to use this engine with DeepStream:

    model-engine-file=<PATH to generated TensorRT engine>
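The choice between the two options comes down to which model-source keys appear in the inference config. The following Python sketch emits the matching lines for each case. The helper model_source_lines and all file names are hypothetical; only the key names (onnx-file, tlt-encoded-model, tlt-model-key, int8-calib-file, model-engine-file) come from the text above.

```python
def model_source_lines(onnx_file=None, etlt_file=None, model_key=None,
                       engine_file=None, calib_file=None):
    """Build the model-source lines of an nvinfer config for Option 1 or Option 2."""
    lines = []
    if engine_file:
        # Option 2: a pre-built, device-specific TensorRT engine.
        lines.append(f"model-engine-file={engine_file}")
    elif onnx_file:
        # Option 1: the exported .onnx model.
        lines.append(f"onnx-file={onnx_file}")
    elif etlt_file:
        # Option 1, legacy format: the encrypted .etlt model needs its export key.
        if not model_key:
            raise ValueError("an .etlt model requires the model export key")
        lines.append(f"tlt-encoded-model={etlt_file}")
        lines.append(f"tlt-model-key={model_key}")
    else:
        raise ValueError("no model source given")
    if calib_file:
        # Only needed for INT8 precision.
        lines.append(f"int8-calib-file={calib_file}")
    return lines


# Hypothetical file names, for illustration only.
onnx_lines = model_source_lines(onnx_file="frcnn.onnx", calib_file="cal.bin")
engine_lines = model_source_lines(engine_file="frcnn.engine")
print("\n".join(onnx_lines))
print("\n".join(engine_lines))
```

Keeping this selection in one place makes it harder to leave a stale tlt-model-key or onnx-file entry behind when switching between the options.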

All other parameters are common to the two approaches. To use the custom bounding box parser instead of the default parsers in DeepStream, modify the following parameters in the [property] section of the primary inference configuration file:

    parse-bbox-func-name=NvDsInferParseCustomNMSTLT
    custom-lib-path=<PATH to libnvds_infercustomparser_tlt.so>

Add the label file generated above using:

    labelfile-path=<Classification labels>

For all the options, see the sample configuration file below. To learn what each parameter is used for, refer to the DeepStream Development Guide.

Here is a sample config file, config_infer_primary.txt:

    [property]
    gpu-id=0
    net-scale-factor=1.0
    offsets=<image mean values as in the training spec file> # e.g.: 103.939;116.779;123.68
    model-color-format=1
    labelfile-path=<Path to frcnn_labels.txt>
    onnx-file=<Path to FasterRCNN model>
    batch-size=<batch size> # e.g.: 1
    ## 0=FP32, 1=INT8, 2=FP16 mode
    network-mode=0
    num-detected-classes=<number of classes to detect (including background)> # e.g.: 5
    interval=0
    gie-unique-id=1
    is-classifier=0
    #network-type=0
    parse-bbox-func-name=NvDsInferParseCustomNMSTLT
    custom-lib-path=<PATH to libnvds_infercustomparser_tlt.so>

    [class-attrs-all]
    pre-cluster-threshold=0.6
    roi-top-offset=0
    roi-bottom-offset=0
    detected-min-w=0
    detected-min-h=0
    detected-max-w=0
    detected-max-h=0
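Because num-detected-classes counts the background class and the label file ends with a background entry, the two should normally agree. The following Python sketch is a simple consistency check under that assumption. The helper check_class_count and the sample texts are hypothetical, and the parsing handles only plain key=value lines with # comments.

```python
def check_class_count(config_text, label_text):
    """Compare num-detected-classes in an nvinfer config with the number of labels.

    Returns (matches, declared_count, label_count).
    """
    declared = None
    for raw in config_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing and full-line comments
        if line.startswith("num-detected-classes="):
            declared = int(line.split("=", 1)[1])
    if declared is None:
        raise ValueError("num-detected-classes not found in config")
    labels = [entry.strip() for entry in label_text.splitlines() if entry.strip()]
    return declared == len(labels), declared, len(labels)


# Hypothetical filled-in fragments, for illustration only.
sample_config = """[property]
labelfile-path=labels.txt
## 0=FP32, 1=INT8, 2=FP16 mode
network-mode=0
num-detected-classes=4  # 3 classes + background
"""
sample_labels = "bicycle\ncar\nperson\nbackground\n"

result = check_class_count(sample_config, sample_labels)
print(result)
```

Running a check like this before launching deepstream-app can catch a label file that is missing the trailing background entry.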