L4T Multimedia API Reference

32.1 Release

04_video_dec_trt

Overview

This sample demonstrates the simplest way to decode video, run NVIDIA® TensorRT inference on the decoded frames, and save the bounding box information to the result.txt file. TensorRT was previously known as GPU Inference Engine (GIE).

This sample does not require a camera or display.


Building and Running

Prerequisites

  • You have followed Steps 1-3 in Building and Running.
  • You have installed the following:
    • NVIDIA® CUDA®
    • TensorRT (previously known as GPU Inference Engine (GIE))
    • OpenCV

To build:

  • Enter:
     $ cd $HOME/tegra_multimedia_api/samples/04_video_dec_trt
     $ make
    

To run:

  • Enter:
     $ ./video_dec_trt [Channel-num] <in-file1> <in-file2> ... <in-format> [options]
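
The invocation syntax above can also be assembled from a script, for instance when launching several runs. The following Python sketch is illustrative only: the helper name build_command is hypothetical, and it assumes that Channel-num equals the number of input files, as in the example on this page.

```python
from typing import List, Optional, Sequence


def build_command(
    in_files: Sequence[str],
    in_format: str,
    deploy_file: Optional[str] = None,
    model_file: Optional[str] = None,
    trt_mode: Optional[int] = None,
    binary: str = "./video_dec_trt",
) -> List[str]:
    """Build an argv list following the documented syntax.

    Assumption: Channel-num is set to the number of input files.
    """
    if not in_files:
        raise ValueError("at least one input file is required")

    # [Channel-num] <in-file1> <in-file2> ... <in-format>
    argv = [binary, str(len(in_files)), *in_files, in_format]

    # Optional flags, using only the option names listed on this page.
    if deploy_file is not None:
        argv += ["--trt-deployfile", deploy_file]
    if model_file is not None:
        argv += ["--trt-modelfile", model_file]
    if trt_mode is not None:
        argv += ["--trt-mode", str(trt_mode)]
    return argv
```

The returned list can be passed to subprocess.run without shell quoting concerns.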
    

Example

The following example generates two results: result0.txt and result1.txt. The results contain normalized rectangle coordinates for detected objects.

   $ ./video_dec_trt 2 ../../data/Video/sample_outdoor_car_1080p_10fps.h264 \
     ../../data/Video/sample_outdoor_car_1080p_10fps.h264 H264 \
     --trt-deployfile ../../data/Model/resnet10/resnet10.prototxt \
     --trt-modelfile ../../data/Model/resnet10/resnet10.caffemodel \
     --trt-mode 0
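
Because the rectangle coordinates in the result files are normalized, they must be multiplied by the frame size before they can be drawn. The following Python sketch shows that arithmetic. It assumes each rectangle is given as (x, y, width, height) with every value in [0,1]; the column layout of result*.txt is not described here, so the function takes the four values directly rather than parsing the file. The names scale_rect and PixelRect are hypothetical.

```python
from typing import NamedTuple


class PixelRect(NamedTuple):
    left: int
    top: int
    width: int
    height: int


def _clamp01(value: float) -> float:
    """Keep a normalized value inside [0, 1]."""
    return max(0.0, min(1.0, value))


def scale_rect(x: float, y: float, w: float, h: float,
               frame_width: int, frame_height: int) -> PixelRect:
    """Convert a normalized rectangle to pixel units for a given frame size.

    Assumption: (x, y) is the top-left corner and (w, h) the size,
    all normalized to [0, 1].
    """
    return PixelRect(
        left=round(_clamp01(x) * frame_width),
        top=round(_clamp01(y) * frame_height),
        width=round(_clamp01(w) * frame_width),
        height=round(_clamp01(h) * frame_height),
    )
```

For a 1080p stream such as the sample video, frame_width is 1920 and frame_height is 1080.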

Notes

  • Boost the clocks before measuring performance.
     $ sudo ~/jetson_clocks.sh
    
  • To change the batch size, use the Channel-num option.
  • For information on opening more than 16 video devices, see the following NVIDIA® DevTalk topic:

    https://devtalk.nvidia.com/default/topic/1025375/

  • If the mode or any other parameter is changed, run the following command to remove the stale trtModel.cache file.
     $ rm trtModel.cache
    
  • The log shows the performance results with the following syntax:
     Inference Performance(ms per batch):xx  Wait from decode takes(ms per batch):xx
    
  • To verify the result and scale the rectangle parameters, enter the following commands:
    $ cp result*.txt $HOME/tegra_multimedia_api/samples/02_video_dec_cuda
    $ cd $HOME/tegra_multimedia_api/samples/02_video_dec_cuda
    $ ./video_dec_cuda ../../data/Video/sample_outdoor_car_1080p_10fps.h264 H264 --bbox-file result0.txt
    $ ./video_dec_cuda ../../data/Video/sample_outdoor_car_1080p_10fps.h264 H264 --bbox-file result1.txt
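
The performance log line described in the notes above can be extracted from captured output with a regular expression. The following Python sketch assumes that each xx placeholder is a non-negative decimal number; the function name parse_perf_line is hypothetical.

```python
import re
from typing import Optional, Tuple

# Matches the documented syntax:
#   Inference Performance(ms per batch):xx  Wait from decode takes(ms per batch):xx
# Assumption: each "xx" is a non-negative integer or decimal number.
_PERF_RE = re.compile(
    r"Inference Performance\(ms per batch\):\s*(\d+(?:\.\d+)?)"
    r"\s+Wait from decode takes\(ms per batch\):\s*(\d+(?:\.\d+)?)"
)


def parse_perf_line(line: str) -> Optional[Tuple[float, float]]:
    """Return (inference_ms, decode_wait_ms), or None if the line does not match."""
    match = _PERF_RE.search(line)
    if match is None:
        return None
    return float(match.group(1)), float(match.group(2))
```

Lines that do not follow the documented syntax yield None, so the function can be applied to every line of a log.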
    


Flow

The data pipeline is as follows:

Input video file -> Decoder -> VIC -> TensorRT Inference -> Plain text file with Bounding Box info

Operation Flow

The sample does the following:

  1. Reads the encoded input video stream.
  2. Performs one-channel video decode, followed by VIC, which does the following:
    • Converts the buffer layout from block linear to pitch linear.
    • Scales the image resolution to the resolution that TensorRT requires.
  3. Uses TensorRT to perform object identification and adds a bounding box to the object identified in the original frame.
  4. Converts the image from YUV to RGB format and saves it in a file.
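
To make the YUV-to-RGB step above concrete, the following Python sketch converts a single pixel. It assumes full-range BT.601 coefficients (the ones used by JPEG); this page does not state which color matrix or range the sample uses, and the sample performs the conversion on whole buffers rather than pixel by pixel. The function name yuv_to_rgb is hypothetical.

```python
from typing import Tuple


def _clamp8(value: float) -> int:
    """Round to the nearest integer and clamp to the 8-bit range."""
    return max(0, min(255, int(round(value))))


def yuv_to_rgb(y: int, u: int, v: int) -> Tuple[int, int, int]:
    """Convert one 8-bit YUV pixel to RGB.

    Assumption: full-range BT.601 (JPEG) coefficients.
    """
    d = u - 128  # blue-difference chroma, centered on zero
    e = v - 128  # red-difference chroma, centered on zero
    r = y + 1.402 * e
    g = y - 0.344136 * d - 0.714136 * e
    b = y + 1.772 * d
    return _clamp8(r), _clamp8(g), _clamp8(b)
```

With neutral chroma (u = v = 128) the output is a gray level equal to y.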

The following block diagram shows the video decoder pipeline and memory sharing between different engines. This memory sharing also applies to other L4T Multimedia samples.


Key Structure and Classes

This sample uses the following key structures and classes:

The global structure context_t manages all the resources in the application.

Element                        Description
NvVideoDecoder                 Contains all video decoding-related elements and functions.
NvVideoConverter               Contains elements and functions for video format conversion.
EGLDisplay                     Specifies the EGLImage used for CUDA processing.
conv_output_plane_buf_queue    Specifies the output plane queue for video conversion.
TRT_Context                    Specifies interfaces for loading Caffemodel and performing inference.

Key Thread

Member                        Description
decCaptureLoop                Gets buffers from the decoder capture plane, pushes them to the converter, and handles resolution changes.
Conv outputPlane dqThread     Returns the buffers dequeued from the converter output plane to the decoder capture plane.
Conv capturePlane dqThread    Gets buffers from the converter capture plane and pushes them to the TensorRT buffer queue.
trtThread                     Specifies the CUDA process and inference characteristics.

Programming Notes

To display and verify the results and to scale the rectangle parameters, use the 02_video_dec_cuda sample as follows:

   $ ./video_dec_cuda <in-file> <in-format> --bbox-file result.txt

The sample does the following:

  • Saves the resulting normalized rectangle within [0,1].
  • Supports in-stream resolution changes.
  • Uses the default deploy file:

     resnet10.prototxt
    

    Uses the default model file:

    resnet10.caffemodel
    

    Both files are located in this directory:

    $SDKDIR/data/Model/resnet10
    
  • Performs end-of-stream (EOS) processing as follows:

    a. Completely reads the file.

    b. Pushes a null v4l2buf to the decoder.

    c. Waits for all output plane buffers to return.

    d. Sets get_eos:

      decCap thread exit
    

    e. Ends the TensorRT thread.

    f. Sends EOS to the converter:

       conv output plane dqThread callback return false
       conv output plane dqThread exit
       conv capture plane dqThread callback return false
       conv capture plane dqThread exit
    

    g. Deletes the decoder:

      deinit output plane and capture plane buffers
    

    h. Deletes the converter:

       unmap capture plane buffers
    

Command Line Options

./video_dec_trt [Channel-num] <in-file1> <in-file2> ... <in-format> [options]

Option               Description
--trt-deployfile     Sets the deploy file name.
--trt-modelfile      Sets the model file name.
--trt-mode <int>     Sets the inference precision, where <int> is one of the following [0-2]:
                       • 0 float16
                       • 1 float32
                       • 2 int8
--trt-enable-perf    Enables performance measurement.
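
A wrapper script that validates --trt-mode before launching the sample could map the documented values to names. The following Python sketch is one way to do so; the names TrtMode and parse_trt_mode are hypothetical and are not part of the sample.

```python
from enum import IntEnum


class TrtMode(IntEnum):
    """Values accepted by --trt-mode, as listed in the options table."""
    FLOAT16 = 0
    FLOAT32 = 1
    INT8 = 2


def parse_trt_mode(text: str) -> TrtMode:
    """Convert a command-line string to a TrtMode.

    Raises ValueError if the text is not an integer in the range 0-2.
    """
    return TrtMode(int(text))
```

Rejecting out-of-range values early avoids starting a run with an unsupported mode.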