This application implements a typical appliance performing intelligent video analytics. Application areas include public safety, smart cities, and autonomous machines. This example demonstrates four concurrent video streams going through a decoding process using the on-chip decoders, video scaling using the on-chip scaler, and GPU compute. For simplicity of demonstration, only one of the channels uses NVIDIA® TensorRT™ to perform object identification and generate a bounding box around the identified object. This sample also uses video converter functions for various format conversions, and it uses EGLImage to demonstrate buffer sharing and image display.
In this sample, object detection is limited to identifying cars in video streams of 960 x 540 resolution, running up to 14 FPS. The network is based on GoogleNet. Inference is performed on a frame-by-frame basis, and no object tracking is involved. Note that this network is intended as an example that shows how to use TensorRT to quickly build the compute pipeline. The sample includes a trained GoogleNet model, which was trained with the NVIDIA Deep Learning GPU Training System (DIGITS). The training was done with roughly 3000 frames taken from an elevation of 5-10 feet. Varying levels of detection accuracy are expected, depending on the video samples fed in. Given that this sample is locked to perform at half-HD resolutions under 10 FPS, video feeds with a higher frame rate for inference will show stuttering during playback.
This sample does not require a camera.
$ sudo vi /etc/apt/sources.list.d/nvidia-l4t-apt-source.list
Change the repository name and download URL in the deb commands shown below:
deb https://repo.download.nvidia.com/jetson/common <release> main
deb https://repo.download.nvidia.com/jetson/<platform> <release> main
<release> is the release number, for example r32.5.
<platform> identifies the platform's processor:
$ sudo apt-get update
$ sudo apt-get install tensorrt
ENABLE_TRT := 0
$ cd backend
$ make
$ ./backend 1 ../../data/Video/sample_outdoor_car_1080p_10fps.h264 H264 \
    --trt-deployfile ../../data/Model/GoogleNet_one_class/GoogleNet_modified_oneClass_halfHD.prototxt \
    --trt-modelfile ../../data/Model/GoogleNet_one_class/GoogleNet_modified_oneClass_halfHD.caffemodel \
    --trt-mode 0 --trt-proc-interval 1 -fps 10
q
$ cd backend
$ ./backend -h
The following image shows the movement of data through the sample when TensorRT is not enabled.
The following image shows data flow details for the channel using TensorRT.
NvEGLImageFromFd is an NVIDIA API that returns an EGLImage pointer from the file descriptor buffer that is allocated via the Tegra mechanism. TensorRT then uses the EGLImage buffer to render the bounding box to the image.
For X11 technical details, see:
http://www.x.org/docs/X11/xlib.pdf
The context_t structure (backend/v4l2_backend_test.h) manages all resources in sample applications.
Element | Description |
---|---|
NvVideoDecoder | Contains all video decoding-related elements and functions. |
NvVideoConverter | Contains elements and functions for video format conversion. |
NvEglRenderer | Contains all EGL display rendering-related functions. |
EGLImageKHR | The EGLImage used for CUDA processing. This type is from the EGL open source graphical library. |
The NvVideoDecoder class creates a new V4L2 Video Decoder. The following table describes the key NvVideoDecoder members that this sample uses.
Member | Description |
---|---|
NvV4l2Element::output_plane | Holds the V4L2 output plane. |
NvV4l2Element::capture_plane | Holds the V4L2 capture plane. |
NvVideoDecoder::createVideoDecoder | Static function that creates a video decoder object. |
NvV4l2Element::subscribeEvent | Subscribes to an event. |
NvVideoDecoder::setExtControls | Sets external controls on the V4L2 device. |
NvVideoDecoder::setOutputPlaneFormat | Sets output plane format. |
NvVideoDecoder::setCapturePlaneFormat | Sets capture plane format. |
NvV4l2Element::getControl | Gets the value of a control setting. |
NvV4l2Element::dqEvent | Dequeues the event reported by the V4L2 device. |
NvV4l2Element::isInError | Checks whether the element is in an error state. |
The NvVideoConverter class packages all elements and functions related to video conversion. It performs color space conversion, scaling, and conversion between hardware buffer memory and software buffer memory. The following table describes the key NvVideoConverter members that this sample uses.
Member | Description |
---|---|
NvV4l2Element::output_plane | Holds the output plane. |
NvV4l2Element::capture_plane | Holds the capture plane. |
NvVideoConverter::waitForIdle | Waits until all the buffers queued on the output plane are converted and dequeued from the capture plane. This is a blocking method. |
NvVideoConverter::setCapturePlaneFormat | Sets the format on the converter capture plane. |
NvVideoConverter::setOutputPlaneFormat | Sets the format on the converter output plane. |
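The blocking behavior described for NvVideoConverter::waitForIdle can be pictured with a small conceptual model. The sketch below is not NVIDIA's implementation and does not use the Multimedia API: the class ConversionTracker, its method names, and the timeout parameter are all hypothetical, chosen only to show what "wait until every buffer queued on the output plane has been dequeued from the capture plane" means in terms of a counter and a condition variable.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>

// Conceptual model only (hypothetical class, not part of the Multimedia API).
// It counts buffers that were queued on the output plane and have not yet
// been dequeued, converted, from the capture plane.
class ConversionTracker {
public:
    // Call when a buffer is queued on the output plane.
    void onQueuedToOutputPlane() {
        std::lock_guard<std::mutex> lock(mutex_);
        ++in_flight_;
    }

    // Call when the matching converted buffer leaves the capture plane.
    void onDequeuedFromCapturePlane() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            assert(in_flight_ > 0);  // a dequeue must match an earlier queue
            --in_flight_;
        }
        idle_.notify_all();  // wake any thread blocked in waitForIdle()
    }

    // Blocks until no buffer is in flight or until max_wait elapses.
    // Returns true if the idle state was reached. The timeout is an
    // assumption of this model.
    bool waitForIdle(std::chrono::milliseconds max_wait) {
        std::unique_lock<std::mutex> lock(mutex_);
        return idle_.wait_for(lock, max_wait,
                              [this] { return in_flight_ == 0; });
    }

    // Number of buffers currently being converted.
    int inFlight() {
        std::lock_guard<std::mutex> lock(mutex_);
        return in_flight_;
    }

private:
    std::mutex mutex_;
    std::condition_variable idle_;
    int in_flight_ = 0;
};
```

In this model, a thread that tears down the pipeline would call waitForIdle before releasing buffers, so that no conversion is still pending when the planes are destroyed.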
NvVideoDecoder and NvVideoConverter contain two key elements: output_plane and capture_plane. These objects are instantiated from the NvV4l2ElementPlane class type.
NvV4l2ElementPlane creates an NvV4l2Element plane. Note that v4l2_buf is a local variable inside the NvV4l2ElementPlane::dqThreadCallback function, so its scope exists only inside the callback function. If other functions of the sample must access this buffer, the callback function must first make a copy of it. The following table describes the key NvV4l2ElementPlane members used in this sample.
Member | Description |
---|---|
NvV4l2ElementPlane::setupPlane | Sets up the plane of V4l2 element. |
NvV4l2ElementPlane::deinitPlane | Destroys the plane of V4l2 element. |
NvV4l2ElementPlane::setStreamStatus | Starts/Stops the stream. |
NvV4l2ElementPlane::setDQThreadCallback | Sets the callback function of the dequeue buffer thread. |
NvV4l2ElementPlane::startDQThread | Starts the dequeue buffer thread. |
NvV4l2ElementPlane::stopDQThread | Stops the dequeue buffer thread. |
NvV4l2ElementPlane::qBuffer | Queues a V4l2 buffer from the plane. |
NvV4l2ElementPlane::dqBuffer | Dequeues a V4l2 buffer from the plane. |
NvV4l2ElementPlane::getNumBuffers | Gets the number of V4l2 buffers. |
NvV4l2ElementPlane::getNumQueuedBuffers | Gets the number of V4l2 buffers in the queue. |
NvV4l2ElementPlane::getNthBuffer | Gets the NvBuffer queue object at index N. |
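The scoping note about v4l2_buf can be illustrated with a minimal, self-contained sketch. It does not use the real V4L2 or Multimedia API types: FakeV4l2Buffer, BufferCopyQueue, onBufferDequeued, and simulateDequeue are hypothetical stand-ins, and the callback signature is simplified. The point is only the pattern: the callback copies the descriptor into storage that outlives it, because the original object disappears when the dequeue step returns.

```cpp
#include <cassert>
#include <cstdint>
#include <deque>
#include <mutex>

// Hypothetical stand-in for the buffer descriptor that the callback receives;
// it carries only the fields needed for this illustration.
struct FakeV4l2Buffer {
    uint32_t index;      // slot of the buffer on the plane
    uint32_t bytesused;  // number of valid payload bytes
};

// Queue shared between the dequeue callback and the rest of the program.
class BufferCopyQueue {
public:
    // Stores a copy of the descriptor, so the caller's object may go away.
    void push(const FakeV4l2Buffer &buf) {
        std::lock_guard<std::mutex> lock(mutex_);
        copies_.push_back(buf);
    }

    // Moves the oldest copy into `out`; returns false if the queue is empty.
    bool pop(FakeV4l2Buffer &out) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (copies_.empty()) return false;
        out = copies_.front();
        copies_.pop_front();
        return true;
    }

private:
    std::mutex mutex_;
    std::deque<FakeV4l2Buffer> copies_;
};

// Hypothetical, simplified callback: copies the descriptor before returning.
bool onBufferDequeued(const FakeV4l2Buffer *buf, void *user_data) {
    assert(user_data != nullptr);
    if (buf == nullptr) return false;  // nothing was dequeued
    static_cast<BufferCopyQueue *>(user_data)->push(*buf);
    return true;
}

// Mimics one iteration of a dequeue thread: the descriptor is a local
// variable whose lifetime ends when this function returns.
void simulateDequeue(BufferCopyQueue &queue, uint32_t index, uint32_t bytes) {
    FakeV4l2Buffer local{index, bytes};
    onBufferDequeued(&local, &queue);
}
```

Code outside the callback then reads the copies with pop. Storing a pointer to the callback's local descriptor instead of a copy would leave a dangling pointer once the callback returns.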
TRT_Context provides a series of interfaces to load a Caffe model and perform inference. The following table describes the key TRT_Context members used in this sample.
Member | Description |
---|---|
TRT_Context::destroyTrtContext | Destroys the TRT_Context. |
TRT_Context::getNumTrtInstances | Gets the number of TRT_Context instances. |
TRT_Context::doInference | Runs inference after the TensorRT model is loaded. |
The sample uses two global functions to create and destroy an EGLImage from a dmabuf file descriptor. These functions are defined in nvbuf_utils.h.
Global Function | Description |
---|---|
NvEGLImageFromFd() | Creates an EGLImage from a dmabuf fd. |
NvDestroyEGLImage() | Destroys the EGLImage. |
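Because the two functions above form a create/destroy pair, every image created from a file descriptor needs a matching destroy call, including on early-exit paths. One way to guarantee that pairing is a small scope guard, sketched below. The sketch deliberately avoids the Jetson headers so that it compiles anywhere: ScopedHandle and FakeImageApi are hypothetical, and FakeImageApi::create and FakeImageApi::destroy merely stand in for NvEGLImageFromFd() and NvDestroyEGLImage(), whose real signatures (which may take further arguments, such as a display handle) are not modeled here.

```cpp
#include <cassert>
#include <functional>
#include <utility>

// Hypothetical scope guard: calls `destroy(handle)` at most once, when the
// guard is reset or goes out of scope, unless the handle is the invalid value.
template <typename Handle>
class ScopedHandle {
public:
    ScopedHandle(Handle handle, Handle invalid,
                 std::function<void(Handle)> destroy)
        : handle_(handle), invalid_(invalid), destroy_(std::move(destroy)) {}
    ~ScopedHandle() { reset(); }

    // Non-copyable, so the handle can never be destroyed twice.
    ScopedHandle(const ScopedHandle &) = delete;
    ScopedHandle &operator=(const ScopedHandle &) = delete;

    bool valid() const { return handle_ != invalid_; }
    Handle get() const { return handle_; }

    // Destroys the handle now, if it is still valid.
    void reset() {
        if (valid()) {
            destroy_(handle_);
            handle_ = invalid_;
        }
    }

private:
    Handle handle_;
    Handle invalid_;
    std::function<void(Handle)> destroy_;
};

// Stand-in for the create/destroy pair; it only counts live images.
struct FakeImageApi {
    int live = 0;

    // Returns 0 (the invalid handle) for a negative file descriptor.
    int create(int dmabuf_fd) {
        if (dmabuf_fd < 0) return 0;
        ++live;
        return dmabuf_fd + 1;
    }

    void destroy(int image) {
        assert(image != 0);
        --live;
    }
};

// Maps one frame, "processes" it, and always releases the image on return.
bool processFrame(FakeImageApi &api, int dmabuf_fd) {
    ScopedHandle<int> image(api.create(dmabuf_fd), 0,
                            [&api](int h) { api.destroy(h); });
    if (!image.valid()) return false;  // creation failed: nothing to release
    // ... GPU processing on image.get() would happen here ...
    return true;
}
```

With this pattern, the per-frame code does not need an explicit destroy call on each return path; the guard releases the image when processFrame exits.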