This sample demonstrates the simplest way to use NVIDIA® TensorRT™ to decode video and save the bounding box information to the result.txt
file. TensorRT was previously known as GPU Inference Engine (GIE).
This sample does not require a camera or display.
$ sudo vi /etc/apt/sources.list.d/nvidia-l4t-apt-source.list
Change the repository name and download URL in the deb commands shown below:
deb https://repo.download.nvidia.com/jetson/common <release> main
deb https://repo.download.nvidia.com/jetson/<platform> <release> main
<release> is the release number. Ex: r32.5.
<platform> identifies the platform's processor.
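For example, with release r32.5 on a Jetson AGX Xavier, the edited lines might read as follows (the platform code t194 is an assumption for this board; confirm the correct code for your platform in the NVIDIA L4T documentation):

```
deb https://repo.download.nvidia.com/jetson/common r32.5 main
deb https://repo.download.nvidia.com/jetson/t194 r32.5 main
```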
$ sudo apt-get update
$ sudo apt-get install tensorrt
$ cd /usr/src/jetson_multimedia_api/samples/04_video_dec_trt
$ make
$ ./video_dec_trt [Channel-num] <in-file1> <in-file2> ... <in-format> [options]
The following example generates two result files, result0.txt and result1.txt, which contain normalized rectangle coordinates for the detected objects.
$ /usr/src/tensorrt/bin/trtexec --onnx=../../data/Model/resnet10/resnet10_dynamic_batch.onnx \
    --maxShapes=data:4x3x368x640 --minShapes=data:1x3x368x640 --optShapes=data:2x3x368x640 \
    --fp16 --saveEngine=resnet10_dynamic_batch.engine
$ ./video_dec_trt 2 ../../data/Video/sample_outdoor_car_1080p_10fps.h264 \
    ../../data/Video/sample_outdoor_car_1080p_10fps.h264 H264 \
    --trt-engine resnet10_dynamic_batch.engine
$ sudo ~/jetson_clocks.sh
Specify the number of channels with the Channel-num option. For information on opening more than 16 video devices, see the following NVIDIA® DevTalk topic:
Inference Performance(ms per batch): xx
Wait from decode takes(ms per batch): xx
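When performance measurement is enabled, the per-batch timings above can be pulled out of a captured console log with standard tools. A minimal sketch, assuming the log lines follow the exact format shown above (the capture file name perf.log is hypothetical):

```shell
# Write a sample log line, then extract the per-batch inference time (ms).
# The "Inference Performance(ms per batch):<value>" format is assumed from
# the sample output above; perf.log is a hypothetical capture file.
printf 'Inference Performance(ms per batch):12.5\n' > perf.log
awk -F: '/Inference Performance/ { print $2 }' perf.log   # -> 12.5
```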
$ cp result*.txt /usr/src/jetson_multimedia_api/samples/02_video_dec_cuda
$ cd /usr/src/jetson_multimedia_api/samples/02_video_dec_cuda
$ ./video_dec_cuda ../../data/Video/sample_outdoor_car_1080p_10fps.h264 H264 --bbox-file result0.txt
$ ./video_dec_cuda ../../data/Video/sample_outdoor_car_1080p_10fps.h264 H264 --bbox-file result1.txt
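Because the rectangle coordinates in the result files are normalized, they must be multiplied by the frame dimensions to obtain pixel values. A minimal sketch for the 1920x1080 sample clip, assuming each result line carries a frame index followed by four normalized values in left/top/right/bottom order (this column layout is an assumption; inspect your result file first):

```shell
# Scale normalized rectangle coordinates to 1920x1080 pixel coordinates.
# Assumed line layout: <frame> <left> <top> <right> <bottom>, all normalized.
echo "0 0.25 0.5 0.75 1.0" | \
    awk '{ printf "%d %d %d %d\n", $2*1920, $3*1080, $4*1920, $5*1080 }'
# -> 480 540 1440 1080
```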
The data pipeline is as follows:
Input video file -> Decoder -> VIC -> TensorRT Inference -> Plain text file with Bounding Box info
The sample does the following:
The following block diagram shows the video decoder pipeline and memory sharing between different engines. This memory sharing also applies to other L4T Multimedia samples.
This sample uses the following key structures and classes:
The global structure context_t manages all the resources in the application.
Element | Description |
---|---|
NvVideoDecoder | Contains all video decoding-related elements and functions. |
EGLDisplay | Specifies the EGLImage used for CUDA processing. |
TRT_Context | Specifies interfaces for loading an ONNX model or Caffe model and performing inference. |
Member | Description |
---|---|
decCaptureLoop | Gets buffers from the dec capture plane, converts the buffers, and pushes the buffers to the TensorRT buffer queue. |
trtThread | Specifies the CUDA process and inference characteristics. |
To display and verify the results, and to scale the rectangle parameters, use the 02_video_dec_cuda sample as follows:
$ ./video_dec_cuda <in-file> <in-format> --bbox-file result.txt
By default, the sample uses the file resnet10_dynamic_batch.onnx in the directory $SDKDIR/data/Model/resnet10.
./video_dec_trt [Channel-num] <in-file1> <in-file2> ... <in-format> [options]
Option | Description |
---|---|
--trt-onnxmodel | Sets the ONNX model file name. |
--trt-deployfile | Sets the deploy file name. (Will be deprecated, as TensorRT is deprecating the Caffe parser.) |
--trt-modelfile | Sets the model file name. (Will be deprecated, as TensorRT is deprecating the Caffe parser.) |
--trt-mode <int> | Specifies whether to use float16 [0-2], where <int> is one of the following: |
--trt-enable-perf | Enables performance measurement. |