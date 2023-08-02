The deep learning and computer vision models that you trained can be deployed on edge devices, such as a Jetson Xavier, Jetson Nano, or Tesla, or in the cloud with NVIDIA GPUs. The exported \*.etlt model can be used in a stand-alone TensorRT inference sample or in DeepStream.

DeepStream SDK is a streaming analytics toolkit that accelerates building AI-based video analytics applications. TAO Toolkit is integrated with DeepStream SDK, so models trained with TAO Toolkit work out of the box with DeepStream.

Once you have the .etlt ActionRecognitionNet model, you can deploy it in the DeepStream 3d-action-recognition sample app. Refer to the sample applications documentation for detailed steps to run action recognition in DeepStream.

A stand-alone TensorRT inference sample is also provided. It consumes the TensorRT engine and supports running with 2D/3D input on images. The sample can be found on GitHub.

To use this sample, you need to generate a TensorRT engine from a \*.etlt model using tao-converter.

Using tao-converter

The tao-converter is a tool provided with the TAO Toolkit to facilitate the deployment of TAO Toolkit trained models on TensorRT and/or DeepStream. For deployment platforms with an x86-based CPU and discrete GPUs, the tao-converter is distributed within the TAO Docker, so we suggest using the Docker to generate the engine. An engine generated this way must be used with the same minor version of TensorRT that is distributed with the Docker. The TAO Docker includes TensorRT version 8.0. To use the engine with a different minor version of TensorRT, copy the converter from /opt/nvidia/tools/tao-converter to the target machine and follow the instructions for x86 to run it and generate a TensorRT engine.
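The version constraint above can be checked before an engine is moved between machines. The following is a minimal sketch, not part of TAO Toolkit: the helper name `same_minor_version` and the example version strings are hypothetical. On a target machine with the TensorRT Python bindings installed, the runtime string could be read from `tensorrt.__version__`.

```python
from typing import Tuple


def same_minor_version(build_version: str, runtime_version: str) -> bool:
    """Return True when both version strings share the same major.minor pair."""

    def major_minor(version: str) -> Tuple[int, int]:
        parts = version.split(".")
        return int(parts[0]), int(parts[1])

    return major_minor(build_version) == major_minor(runtime_version)


# The full version strings below are made-up examples; only the "8.0"
# major.minor of the TAO Docker comes from the text above.
print(same_minor_version("8.0.1.6", "8.0.3.4"))  # True
print(same_minor_version("8.0.1.6", "8.2.1.8"))  # False
```

If the check fails, the text above suggests regenerating the engine on the target machine with a copy of the converter rather than reusing the existing engine file.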

For the aarch64 platform, the tao-converter is available to download from Dev Zone.

Here are sample commands to generate the ActionRecognitionNet engine with tao-converter:

    # Convert a 2D RGB model with an input sequence length of 32 and an input size of 224x224:
    tao-converter <2d_rgb_etlt_model> \
                  -k <key_to_etlt_model> \
                  -p input_rgb,1x96x224x224,4x96x224x224,16x96x224x224 \
                  -e <path_to_generated_trt_engine> \
                  -t fp16

    # Convert a 3D RGB model with an input sequence length of 32 and an input size of 224x224:
    tao-converter <3d_rgb_etlt_model> \
                  -k <key_to_etlt_model> \
                  -p input_rgb,1x3x32x224x224,4x3x32x224x224,16x3x32x224x224 \
                  -e <path_to_generated_trt_engine> \
                  -t fp16

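Each command generates an optimized TensorRT engine with a dynamic batch size (1 to 16). The three shapes passed to -p are the minimum, optimal, and maximum input shapes, so the batch dimension ranges from 1 to 16, with 4 as the optimization target.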
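The shape strings in the sample commands follow a regular pattern, which the sketch below reproduces for other sequence lengths or batch ranges. It assumes that the 96 channels of the 2D model are the 3 RGB channels multiplied by the 32 frames of the sequence, which matches the numbers in the commands. The helper names `profile_shape` and `profile_argument` are hypothetical and are not part of tao-converter.

```python
from typing import Tuple

CHANNELS_RGB = 3  # an RGB frame has three channels


def profile_shape(batch: int, seq_len: int, height: int, width: int, is_2d: bool) -> str:
    """Format one input shape as the 'AxBxC...' string used by the -p option."""
    if is_2d:
        # Assumption: the 2D model stacks all frames along the channel axis.
        dims = (batch, CHANNELS_RGB * seq_len, height, width)
    else:
        # The 3D model keeps a separate sequence (depth) axis.
        dims = (batch, CHANNELS_RGB, seq_len, height, width)
    return "x".join(str(d) for d in dims)


def profile_argument(input_name: str, batches: Tuple[int, int, int],
                     seq_len: int, height: int, width: int, is_2d: bool) -> str:
    """Build '<input_name>,<min_shape>,<opt_shape>,<max_shape>' for -p."""
    shapes = [profile_shape(b, seq_len, height, width, is_2d) for b in batches]
    return ",".join([input_name] + shapes)


print(profile_argument("input_rgb", (1, 4, 16), 32, 224, 224, is_2d=True))
# input_rgb,1x96x224x224,4x96x224x224,16x96x224x224
print(profile_argument("input_rgb", (1, 4, 16), 32, 224, 224, is_2d=False))
# input_rgb,1x3x32x224x224,4x3x32x224x224,16x3x32x224x224
```

The two printed strings match the -p values in the sample commands, so the same helper can be reused when the trained model uses a different sequence length or input size.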

Usage of inference sample

Once you have the TensorRT engine, you can deploy it in the stand-alone sample. Use the following command to run inference:

    python ar_trt_inference.py --input_images_folder <path to input images folder> \
                               --trt_engine <path to tensorrt engine> \
                               [--center_crop] \
                               [--input_2d]
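When the sample has to be run over many folders, the command line can be assembled programmatically. The sketch below only builds the argument list from the flags shown in the command above. The helper name `build_inference_command` and the example paths are hypothetical, and running the result (for example with `subprocess.run`) is left to the caller.

```python
import shlex
from typing import List


def build_inference_command(images_folder: str, engine_path: str,
                            center_crop: bool = False,
                            input_2d: bool = False) -> List[str]:
    """Assemble the argument list for ar_trt_inference.py."""
    command = [
        "python", "ar_trt_inference.py",
        "--input_images_folder", images_folder,
        "--trt_engine", engine_path,
    ]
    if center_crop:
        command.append("--center_crop")
    if input_2d:
        command.append("--input_2d")
    return command


# Example paths are placeholders, not files shipped with the sample.
cmd = build_inference_command("data/video_1", "ar_2d.engine",
                              center_crop=True, input_2d=True)
print(shlex.join(cmd))
```

Passing the arguments as a list avoids shell quoting problems when folder names contain spaces.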

Required Arguments

--input_images_folder : The path to the input images folder. It should be a video_<n> level directory, as described in the Preparing the Dataset section.

--trt_engine : The path to the TensorRT engine.

Optional Arguments

--center_crop : Resizes the input images so that the short side is 256, then center crops a 224x224 area. If this flag is not set, the input images are resized directly to 224x224.

--input_2d : Set this flag if the engine is generated from a 2D model.
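The geometry behind the --center_crop option can be illustrated with a short calculation. This is a sketch of the resize-then-crop arithmetic only. It is not taken from ar_trt_inference.py, and the rounding convention and the helper name `center_crop_geometry` are assumptions.

```python
from typing import Tuple

Box = Tuple[int, int, int, int]


def center_crop_geometry(width: int, height: int,
                         short_side: int = 256,
                         crop: int = 224) -> Tuple[Tuple[int, int], Box]:
    """Return the resized (width, height) and the crop box (left, top, right, bottom)."""
    if crop > short_side:
        raise ValueError("crop size must not exceed the resized short side")
    shortest = min(width, height)
    # Scale both sides by the same factor so the short side becomes `short_side`.
    new_w = round(width * short_side / shortest)
    new_h = round(height * short_side / shortest)
    # Center the crop window inside the resized image.
    left = (new_w - crop) // 2
    top = (new_h - crop) // 2
    return (new_w, new_h), (left, top, left + crop, top + crop)


print(center_crop_geometry(640, 360))
# ((455, 256), (115, 16, 339, 240))
```

With the flag set, a 640x360 frame keeps its aspect ratio and loses the left and right margins. Without the flag, the same frame is squeezed directly to 224x224, which distorts the aspect ratio but keeps the full field of view.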