DriveWorks SDK Reference
3.0.4260 Release
For Test and Development only

# Copyright (c) 2019-2020 NVIDIA CORPORATION. All rights reserved.

@page dwx_drivenet_sample DriveNet Sample
@tableofcontents

@note SW Release Applicability: This sample is available in **NVIDIA DRIVE Software** releases.

@section dwx_drivenet_description Description

The NVIDIA<sup>&reg;</sup> DriveNet sample is a sophisticated, multi-class,
higher-resolution example that uses the NVIDIA<sup>&reg;</sup> DriveNet proprietary
deep neural network (DNN) to perform object detection.

The DriveNet sample application detects objects by performing inference on each frame of a RAW video or camera stream.
It clusters these objects with parameters defined within the sample application.

A follow-up algorithm clusters the detections from both inference passes to compute a more stable response.

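The clustering step merges overlapping raw detections into a single, stable response per object. The exact parameters and algorithm are internal to the sample; as an illustrative sketch only (none of these names are DriveWorks API), a generic IoU-based greedy clustering works like this:

```python
# Illustrative only: greedy IoU-based clustering of duplicate detections.
# The real DriveNet clustering parameters live inside the sample.

def iou(a, b):
    """Intersection-over-union of two boxes given as (x0, y0, x1, y1)."""
    x0, y0 = max(a[0], b[0]), max(a[1], b[1])
    x1, y1 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x1 - x0) * max(0.0, y1 - y0)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter) if inter > 0 else 0.0

def cluster(detections, iou_threshold=0.5):
    """Greedily keep the highest-confidence box per overlapping group.

    detections: list of (box, score) tuples, box = (x0, y0, x1, y1).
    """
    kept = []
    for box, score in sorted(detections, key=lambda d: -d[1]):
        if all(iou(box, kept_box) < iou_threshold for kept_box, _ in kept):
            kept.append((box, score))
    return kept
```

Two heavily overlapping boxes collapse to the higher-scoring one, while distant boxes survive as separate objects.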
@section dwx_drivenet_sample_running Running the Sample

The DriveNet sample, sample_drivenet, accepts the following optional parameters. If none are specified, it performs detections on a supplied pre-recorded video.

    ./sample_drivenet --input-type=[video|camera]
                      --video=[path/to/video]
                      --camera-type=[camera]
                      --camera-group=[a|b|c|d]
                      --camera-index=[0|1|2|3]
                      --slave=[0|1]
                      --dla=[0|1]
                      --dlaEngineNo=[0|1]
                      --precision=[int8|fp16|fp32]
                      --useCudaGraph=[0|1]
                      --stopFrame=[frame]
                      --enableUrgency=[0|1]
                      --stateless=[0|1]

Where:

    --input-type=[video|camera]
        Defines whether the input is from a live camera or from a recorded video.
        Live camera is supported only on NVIDIA DRIVE(tm) platforms.
        It is not supported on Linux (x86 architecture) host systems.
        Default value: video

    --video=[path/to/video]
        Specifies the absolute or relative path of a RAW, LRAW, or H.264 recording.
        Only applicable if --input-type=video.
        Default value: path/to/data/samples/raw/rccb.raw

    --camera-type=[camera]
        Specifies the camera type.
        Only applicable if --input-type=camera.
        Default value: ar0231-rccb-bae-sf3324

    --camera-group=[a|b|c|d]
        Specifies the group to which the camera is connected.
        Only applicable if --input-type=camera.
        Default value: a

    --camera-index=[0|1|2|3]
        Indicates the camera index on the given port.
        Default value: 0

    --slave=[0|1]
        Setting this parameter to 1 when running the sample on Xavier B accesses the camera
        on Xavier A.
        Only applicable if --input-type=camera.
        Default value: 0

    --dla=[0|1]
        Setting this parameter to 1 runs the DriveNet DNN inference on one of the DLA engines.
        Default value: 0

    --dlaEngineNo=[0|1]
        Chooses the DLA engine to be used.
        Only applicable if --dla=1.
        Default value: 0

    --precision=[int8|fp16|fp32]
        Defines the precision of the DriveNet DNN. The following precision levels are supported:
        - int8
          - 8-bit signed integer precision.
          - Supported GPUs: compute capability >= 6.1.
          - Faster than fp16 and fp32 on GPUs with compute capability equal to 6.1 or greater than 6.2.
        - fp16 (default)
          - 16-bit floating-point precision.
          - Supported GPUs: compute capability >= 6.2.
          - Faster than fp32.
          - If fp16 is selected on a Pascal GPU, the precision is set to fp32.
        - fp32
          - 32-bit floating-point precision.
          - Supported GPUs: only Pascal GPUs (compute capability 6.1).
          - Default for Pascal GPUs.
        When using DLA engines, only fp16 is allowed.
        Default value: fp16

    --useCudaGraph=[0|1]
        Setting this parameter to 1 runs the DriveNet DNN inference using CUDA Graphs if the hardware supports it.
        Default value: 0

    --stopFrame=[number]
        Runs DriveNet only on the first <number> frames and then exits the application.
        Setting this parameter to 0 makes the sample run endlessly.
        Default value: 0

    --enableUrgency=[0|1]
        Enables object urgency prediction by a temporal model.
        Only supports predicting the urgency for cars and pedestrians on the front camera with a 60&deg; field of view.
        Default value: 0

    --stateless=[0|1]
        Setting this parameter to 0 runs the stateful temporal model. Setting it to 1 runs the stateless temporal model.
        The stateful model uses all past frames to predict urgency, while the stateless model uses only the most recent frames.
        Only applicable if --enableUrgency=1.
        Default value: 0

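The effective precision depends on the interaction between the requested precision, the GPU's compute capability, and the `--dla` flag. As a quick mental model only (the actual selection logic is inside sample_drivenet, and `resolve_precision` is a hypothetical name), the fallback rules described above can be sketched as:

```python
# Hypothetical sketch of the precision fallback rules described above;
# this is not DriveWorks API.

def resolve_precision(requested, compute_capability, use_dla=False):
    """Return the precision the sample would effectively run with."""
    if use_dla:
        # When using DLA engines, only fp16 is allowed.
        return "fp16"
    if requested == "fp16" and compute_capability < 6.2:
        # fp16 selected on a Pascal GPU (compute capability 6.1)
        # falls back to fp32.
        return "fp32"
    return requested
```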
@subsection dwx_drivenet_sample_examples Examples

### To run the sample on a video

    ./sample_drivenet --input-type=video --video=<video file.raw>

### To run the sample on a camera on NVIDIA DRIVE platforms

    ./sample_drivenet --input-type=camera --camera-type=<camera type> --camera-group=<camera group> --camera-index=<camera idx on camera group>

where `<camera type>` is a supported `RCCB` sensor.
See @ref supported_sensors for the list of supported cameras for each platform.

### To run the sample on a DLA engine on an NVIDIA DRIVE platform

On NVIDIA DRIVE<sup>&trade;</sup> platforms, you can run DriveNet on DLA engines with the following command line:

    ./sample_drivenet --dla=1 --dlaEngineNo=0

### To run the sample on a video for the first 3000 frames

    ./sample_drivenet --video=<video file.raw> --stopFrame=3000

### To run the sample with different precisions

    ./sample_drivenet --precision=int8

### To run the sample with urgency predictions

    ./sample_drivenet --enableUrgency=1

@section dwx_drivenet_sample_output Output

The sample creates a window, displays a video, and overlays bounding boxes for detected objects.
The color of the bounding boxes represents the classes that the sample detects, as follows:

* Red: Cars and trucks (both labeled as cars).
* Green: Traffic signs.
* Blue: Bicycles.
* Magenta: Pedestrians.
* Orange: Traffic lights.
* Yellow: Curbs.
* Cyan: Other objects.
* Grey: Unknown objects.

When urgency prediction is enabled, the predicted urgency value is displayed after the object class name.
In this case, the color of the bounding box represents the urgency value using a color map that transitions smoothly from green through white to red:
green indicates negative urgency, white indicates zero urgency, and red indicates positive urgency.

![Multiclass object detector on an RCCB stream using DriveNet](sample_drivenet.png)
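The green-white-red color map can be illustrated with a small interpolation function. This is an illustrative sketch of the mapping described above, not the sample's actual implementation, and the sample's exact color values may differ:

```python
# Illustrative green-white-red color map for an urgency value in [-1, 1];
# the sample's exact mapping may differ.

def urgency_color(urgency):
    """Return an (r, g, b) tuple: green at -1, white at 0, red at +1."""
    u = max(-1.0, min(1.0, urgency))  # clamp to [-1, 1]
    if u <= 0.0:
        # Blend from white (u = 0) toward green (u = -1).
        return (1.0 + u, 1.0, 1.0 + u)
    # Blend from white (u = 0) toward red (u = +1).
    return (1.0, 1.0 - u, 1.0 - u)
```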

@section dwx_drivenet_sample_limitations Limitations

@warning DriveNet DNN currently has limitations that could affect its performance:
- It is optimized for daytime, clear-weather data. As a result, it
  does not perform well in dark or rainy conditions.
- It is trained primarily on data collected in the United States.
  As a result, it may have reduced accuracy in other locales,
  particularly for road sign shapes that do not exist in the U.S.

@section dwx_drivenet_sample_more Additional Information

For more information, see @ref drivenet_mainsection.