Performance

This topic provides details about configuration settings and measured performance for the DeepStream SDK.

Tesla

This section describes configuration and settings for the DeepStream SDK on NVIDIA® Tesla®.

System Configuration

The system configuration for the DeepStream SDK is listed below:
System Configuration    Specification
----------------------  --------------------------------------------------------------
CPU                     Dual Intel® Xeon® CPU E5-2650 v4 @ 2.20GHz (48 threads total)
GPU                     Tesla T4
System Memory           128 GB DDR4, 2400 MHz
Operating System        Ubuntu 18.04
GPU Driver              440+
CUDA                    10.2
TensorRT                7.0+
GPU clock frequency     1.3 GHz

Application Configuration

The application configuration for the DeepStream SDK is listed below:
Application Configuration   Specification
--------------------------  ----------------------------------------------------------------
N×1080p 30 fps streams      sample_1080p_h265.mp4 (provided with the SDK), N = 64
                            sample_1080p_h264.mp4 (provided with the SDK), N = 39
Primary GIE                 Resnet10 (480×272)
                            Batch Size = N
                            Interval = 0
Tracker                     Enabled; processing at 480×272 resolution, IOU tracker enabled.
3× Secondary GIEs           All batch sizes are 32. Asynchronous mode enabled.
                            Secondary_VehicleTypes (224×224, Resnet18)
                            Secondary_CarColor (224×224, Resnet18)
                            Secondary_CarMake (224×224, Resnet18)
Tiled Display               Disabled
Rendering                   Disabled

Achieved Performance

The performance achieved by the DeepStream SDK under the specified system and application configuration is as follows:
Stream Type   No. of Streams @ 30 FPS   CPU Utilization   GPU Utilization
-----------   -----------------------   ---------------   ---------------
H.265         64                        8% to 10%         58%
H.264         39                        5%                31%
 
Note:
The “No. of Streams @ 30 FPS” and “GPU Utilization” values in the table above were measured with the T4 performance workaround described in section 3.4 of the release notes.
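
To put the stream counts in terms of total frame throughput, the number of streams can be multiplied by the per-stream frame rate. The sketch below does this for the two rows of the table above; `aggregate_fps` is a helper name made up for this example and is not part of the SDK.

```shell
# aggregate_fps STREAMS FPS
# Prints the total number of frames per second processed across all streams.
# (Hypothetical helper for illustration only.)
aggregate_fps() {
    echo $(( $1 * $2 ))
}

aggregate_fps 64 30   # H.265 row: prints 1920
aggregate_fps 39 30   # H.264 row: prints 1170
```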
 

Jetson Performance

This section describes configuration and settings for the DeepStream SDK on NVIDIA Jetson platforms. JetPack 4.4 DP is used for software installation.

System Configuration

For the performance test:
1. Max power mode is enabled:
$ sudo nvpmodel -m 0
2. The GPU clocks are set to maximum:
$ sudo jetson_clocks
For information about supported power modes, see “Supported Modes and Power Efficiency” in the power management topics of the NVIDIA Tegra Linux Driver Package Development Guide, for example “Power Management for Jetson AGX Xavier Devices.”
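
The two preparation steps above can be wrapped in a small helper so that they are applied the same way before every measurement. This is only a sketch: `DRY_RUN`, `run`, and `prepare_jetson_for_benchmark` are names made up for this example and are not part of the SDK or JetPack, and the final `nvpmodel -q` query is an addition that reports the active power mode.

```shell
# run CMD [ARGS...]
# Executes the command, or only prints it when DRY_RUN=1.
# (DRY_RUN is a hypothetical variable introduced for this example.)
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Applies the two steps listed above, then queries the active power mode.
# Stops at the first command that fails.
prepare_jetson_for_benchmark() {
    # Step 1: enable max power mode.
    run sudo nvpmodel -m 0 || return 1
    # Step 2: set the clocks to maximum.
    run sudo jetson_clocks || return 1
    # Extra check (not in the original steps): show the current power mode.
    run sudo nvpmodel -q
}
```

For example, `DRY_RUN=1 prepare_jetson_for_benchmark` prints the three commands without changing the state of the device.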

Jetson Nano

The following tables describe performance results for the NVIDIA Jetson Nano.
Pipeline Configuration (deepstream-app)

Application Configuration   Specification
--------------------------  ----------------------------------------------------------------
N×1080p 30 fps streams      sample_1080p_h265.mp4 (provided with the SDK), N = 8
                            sample_1080p_h264.mp4 (provided with the SDK), N = 8
Primary GIE                 Resnet10 (480×272)
                            Asynchronous mode enabled
                            Batch Size = N
                            Interval = 4
Tracker                     Enabled; processing at 480×272 resolution, KLT tracker enabled.
OSD/tiled display           Disabled
Renderer                    Disabled
 
Achieved Performance

Stream Type   No. of Streams @ 30 FPS   CPU Utilization   GPU Utilization
-----------   -----------------------   ---------------   ---------------
H.265         8                         39%               67%
H.264         8                         39%               65%
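
The pipeline above uses Interval = 4, so not every decoded frame reaches the primary GIE. Assuming Interval has its usual deepstream-app meaning (the number of consecutive batches the primary GIE skips between inferences), the inference rate can be estimated as sketched below; `inference_fps` is a helper name made up for this example.

```shell
# inference_fps STREAMS FPS INTERVAL
# Prints how many frames per second are inferred on when INTERVAL batches
# are skipped after each inferred batch, i.e. one batch in (INTERVAL + 1).
# (Hypothetical helper for illustration only.)
inference_fps() {
    streams=$1
    fps=$2
    interval=$3
    echo $(( streams * fps / (interval + 1) ))
}

inference_fps 8 30 4   # Jetson Nano: 240 decoded frames/s, prints 48
```

With Interval = 0, as in the Tesla and Jetson AGX Xavier configurations, the same calculation gives the full decoded frame rate.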

Jetson AGX Xavier

The following tables describe performance results for the NVIDIA Jetson AGX Xavier™.
Pipeline Configuration (deepstream-app)

Application Configuration   Specification
--------------------------  ----------------------------------------------------------------
N×1080p 30 fps streams      sample_1080p_h265.mp4 (provided with the SDK), N = 47
                            sample_1080p_h264.mp4 (provided with the SDK), N = 32
Primary GIE                 Resnet10 (480×272)
                            Asynchronous mode enabled
                            Batch Size = N
                            Interval = 0
Tracker                     Enabled; processing at 480×272 resolution, IOU tracker enabled.
3× secondary GIEs           All batches are size 32.
                            Secondary_VehicleTypes (224×224, Resnet18)
                            Secondary_CarColor (224×224, Resnet18)
                            Secondary_CarMake (224×224, Resnet18)
OSD/tiled display           Disabled
Renderer                    Disabled
 
Achieved Performance

Stream Type   No. of Streams @ 30 FPS   CPU Utilization   GPU Utilization
-----------   -----------------------   ---------------   ---------------
H.265         47                        22%               95%
H.264         32                        19%               71%

Jetson Xavier NX

The following tables describe performance results for the NVIDIA® Jetson Xavier NX™.
Pipeline Configuration (deepstream-app)

Application Configuration   Specification
--------------------------  ----------------------------------------------------------------
N×1080p 30 fps streams      sample_1080p_h265.mp4 (provided with the SDK), N = 23
                            sample_1080p_h264.mp4 (provided with the SDK), N = 16
Primary GIE                 Resnet10 (480×272)
                            Asynchronous mode enabled
                            Batch Size = N
                            Interval = 0
Tracker                     Enabled; processing at 480×272 resolution, IOU tracker enabled.
3× secondary GIEs           All batches are size 32.
                            Secondary_VehicleTypes (224×224, Resnet18)
                            Secondary_CarColor (224×224, Resnet18)
                            Secondary_CarMake (224×224, Resnet18)
OSD/tiled display           Disabled
Renderer                    Disabled
 
Achieved Performance

Stream Type   No. of Streams @ 30 FPS   CPU Utilization   GPU Utilization
-----------   -----------------------   ---------------   ---------------
H.265         23                        55%               93%
H.264         16                        45%               65%

Jetson TX2

The following tables describe performance results for the NVIDIA Jetson TX2.
Pipeline Configuration (deepstream-app)

Application Configuration   Specification
--------------------------  ----------------------------------------------------------------
N×1080p 30 fps streams      sample_1080p_h265.mp4 (provided with the SDK), N = 15
                            sample_1080p_h264.mp4 (provided with the SDK), N = 14
Primary GIE                 Resnet10 (480×272)
                            Asynchronous mode enabled
                            Batch Size = N
                            Interval = 4
Tracker                     Enabled; processing at 480×272 resolution, KLT tracker enabled.
OSD/tiled display           Disabled
Renderer                    Disabled
 
Achieved Performance

Stream Type   No. of Streams @ 30 FPS   CPU Utilization   GPU Utilization
-----------   -----------------------   ---------------   ---------------
H.265         15                        35%               47%
H.264         14                        34%               43%

Jetson TX1

The following tables describe performance results for the NVIDIA Jetson TX1.
Pipeline Configuration (deepstream-app)

Application Configuration   Specification
--------------------------  ----------------------------------------------------------------
N×1080p 30 fps streams      sample_1080p_h265.mp4 (provided with the SDK), N = 13
                            sample_1080p_h264.mp4 (provided with the SDK), N = 10
Primary GIE                 Resnet10 (480×272)
                            Asynchronous mode enabled
                            Batch Size = N
                            Interval = 4
Tracker                     Enabled; processing at 480×272 resolution, KLT tracker enabled.
OSD/tiled display           Disabled
Renderer                    Disabled
 
Achieved Performance

Stream Type   No. of Streams @ 30 FPS   CPU Utilization   GPU Utilization
-----------   -----------------------   ---------------   ---------------
H.265         13                        56%               49%
H.264         10                        43%               43%