1. JetPack 4.4 Developer Preview

1.1. New Features

This release adds support for the Jetson Xavier NX module, and includes new versions of CUDA, TensorRT, and cuDNN. Support for Vulkan 1.2 has been added as well.

In addition, DeepStream 5.0 Developer Preview is supported in this release.

2. Additional Release Details

2.1. OS

L4T 32.4.2

  • Vulkan 1.2 support

  • Support for upgrading L4T version using the Debian package management tool [1]

  • Support for Generic Timestamping Engine (GTE) for Jetson AGX Xavier and Jetson Xavier NX

    • Kernel driver for GTE

    • Sample GPIO driver using GTE for timestamping input state transitions

    Refer to the GTE section in the NVIDIA Jetson Linux Developer Guide.

  • Support for Dynamic Frequency Scaling (DFS) for VIC using actmon.

  • Samples demonstrating the hardware-backed authentication and encryption capabilities of Jetson TX2, Jetson AGX Xavier, and Jetson Xavier NX.

    Refer to the Security section in the NVIDIA Jetson Linux Developer Guide.

  • Utility to burn fuses on multiple Jetson devices simultaneously.

  • For Jetson Nano and Jetson Xavier NX Developer Kits:

    • Option to select APP partition size on the microSD card during initial configuration at first boot.

[1] Debian package-based L4T upgrade is supported only starting with L4T version 32.3.1.

2.2. Libraries and APIs

CUDA 10.2

  • Performance optimization through user-mode submits.

    50% launch latency reduction for CUDA kernels, resulting in improved GPU utilization and lower CPU utilization.

TensorRT 7.1.0 (Developer Preview)

  • New Layers

    1. IFillLayer: The IFillLayer is used to generate an output tensor with the specified mode.

    2. IIteratorLayer: The IIteratorLayer enables a loop to iterate over a tensor. A loop is defined by loop boundary layers.

    3. ILoopBoundaryLayer: Class ILoopBoundaryLayer defines a virtual method getLoop() that returns a pointer to the associated ILoop.

    4. ILoopOutputLayer: The ILoopOutputLayer specifies an output from the loop.

    5. IParametricReluLayer: The IParametricReluLayer represents a parametric ReLU operation, i.e., a leaky ReLU where the slopes for x < 0 can differ per element.

    6. IRecurrenceLayer: The IRecurrenceLayer specifies a recurrent definition.

    7. ISelectLayer: The ISelectLayer returns either of the two inputs depending on the condition.

    8. ITripLimitLayer: The ITripLimitLayer specifies how many times the loop iterates.

  • New Operators

    Expanded support for ONNX operations: Added ConstantOfShape, DequantizeLinear, Equal, Erf, Expand, Greater, GRU, Less, Loop, LRN, LSTM, Not, PRelu, QuantizeLinear, RandomUniform, RandomUniformLike, Range, RNN, Scan, Sqrt, Tile, and Where.

  • New Samples

    sampleAlgorithmSelector shows how to use the algorithm selection API, based on sampleMNIST. This sample demonstrates the use of IAlgorithmSelector to deterministically build TensorRT engines.

  • Working with Loops

    TensorRT supports loop-like constructs, which can be useful for recurrent networks. TensorRT loops support scanning over input tensors, recurrent definitions of tensors, and both "scan outputs" and "last value" outputs.
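
As a rough illustration of these semantics (not the TensorRT API), here is a minimal pure-Python sketch of a loop that scans over an input tensor, carries a recurrence, honors an optional kCOUNT-style trip limit, and produces both a "scan output" and a "last value" output; the function name and structure are illustrative only:

```python
def run_loop(xs, init, step, trip_limit=None):
    """Emulate TensorRT-style loop semantics in plain Python:
    - iterator: scan over the elements of the input tensor `xs`
    - recurrence: carry `state` across iterations, seeded by `init`
    - trip limit: optional kCOUNT-style cap on iteration count
    - loop outputs: return both the "scan output" (all per-iteration
      values, concatenated) and the "last value" output."""
    state = init
    scan_out = []
    for i, x in enumerate(xs):
        if trip_limit is not None and i >= trip_limit:
            break
        state = step(state, x)
        scan_out.append(state)
    return scan_out, state

# Running sum over a 1-D "tensor": the scan output holds every partial
# sum, while the last-value output is just the final sum.
scan, last = run_loop([1, 2, 3, 4], init=0, step=lambda s, x: s + x)
assert scan == [1, 3, 6, 10] and last == 10
```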

  • ONNX parser with dynamic shapes support

    The ONNX parser supports full-dimensions mode only. Your network definition must be created with the explicitBatch flag set.

  • BERT INT8 and mixed precision optimizations

    In the BERT model, some GEMM layers are followed by GELU activations. Because TensorRT does not provide a dedicated IMMA GEMM layer, you can implement those GEMM layers with either IConvolutionLayer or IFullyConnectedLayer, depending on the precision you require. For example, you can use IConvolutionLayer with H == W == 1 (CONV1x1) to implement a FullyConnected operation and leverage IMMA math in INT8 mode. TensorRT supports the fusion of Convolution/FullyConnected and GELU.
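
To see why the two formulations are interchangeable, here is a small pure-Python sketch (illustrative only, not TensorRT code) showing that a FullyConnected operation and a 1x1 convolution over a 1x1 spatial map compute the same per-output-channel dot products:

```python
def fully_connected(w, x):
    # w: [out_ch][in_ch], x: [in_ch] -> output vector of length out_ch.
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]

def conv1x1(w, x):
    # Treat x as an [in_ch][H=1][W=1] feature map; a 1x1 convolution
    # then reduces to the same dot product per output channel.
    out = []
    for row in w:
        acc = 0.0
        for c in range(len(x)):
            acc += row[c] * x[c][0][0]   # the single kernel position (0, 0)
        out.append(acc)
    return out

w = [[1.0, 2.0], [3.0, -1.0]]
x_vec = [5.0, 7.0]
x_map = [[[5.0]], [[7.0]]]               # same data, shaped [in_ch][1][1]
assert fully_connected(w, x_vec) == conv1x1(w, x_map)  # both [19.0, 8.0]
```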

  • Working with Quantized Networks

    TensorRT supports quantized models trained with Quantization Aware Training. Support is limited to symmetrically quantized models, that is, models with zero_point = 0 in their QuantizeLinear and DequantizeLinear operations.
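
The symmetric scheme can be sketched in a few lines of plain Python (illustrative only; actual INT8 execution happens inside TensorRT):

```python
def quantize_linear(x, scale):
    # Symmetric INT8 quantization: zero_point is fixed at 0, so
    # q = clamp(round(x / scale), -128, 127).
    q = round(x / scale)
    return max(-128, min(127, q))

def dequantize_linear(q, scale):
    # Inverse mapping: x_hat = q * scale (again with zero_point = 0).
    return q * scale

scale = 0.1
q = quantize_linear(3.14, scale)        # 31
x_hat = dequantize_linear(q, scale)     # ~3.1, within one quantization step
assert abs(x_hat - 3.14) <= scale
```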

  • Boolean Tensor Support

    TensorRT supports Boolean tensors, which can be marked as network inputs and outputs. IElementWiseLayer, IUnaryLayer (kNOT only), IShuffleLayer, ITripLimitLayer (kWHILE only), and ISelectLayer support the Boolean data type. Boolean tensors can be used only with FP32 and FP16 precision networks.
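
The select operation driven by a Boolean tensor is simple to state; here is a pure-Python sketch of the elementwise semantics (illustrative, not the TensorRT API):

```python
def select(condition, then_t, else_t):
    # Elementwise select, in the spirit of ISelectLayer (or ONNX Where):
    # output[i] = then_t[i] if condition[i] else else_t[i].
    return [t if c else e for c, t, e in zip(condition, then_t, else_t)]

cond = [True, False, True]     # a Boolean "tensor"
a = [1.0, 2.0, 3.0]
b = [10.0, 20.0, 30.0]
assert select(cond, a, b) == [1.0, 20.0, 3.0]
```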

  • Working with empty tensors

    TensorRT supports empty tensors. A tensor is an empty tensor if it has one or more dimensions with length zero.

  • Builder layer timing cache

    The layer timing cache stores layer profiling information gathered during the builder phase. Models with repeated layers will see a significant speedup in build time.

  • Pointwise fusion based on code generation

    Pointwise fusion is updated to use code generation and runtime compilation to further improve performance.

  • Dilation support for deconvolution

    IDeconvolutionLayer now supports a dilation parameter. This is accessible through the C++ API, Python API, and the ONNX parser.
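
Under common conventions (assumed here, since the formula is not stated in this document; it matches, e.g., PyTorch's ConvTranspose2d with zero output padding), the effect of dilation on a deconvolution's output size can be sketched as:

```python
def deconv_output_size(in_size, kernel, stride=1, pad=0, dilation=1):
    # Transposed-convolution output-size formula with dilation:
    # out = (in - 1) * stride - 2 * pad + dilation * (kernel - 1) + 1
    # Dilation widens the effective kernel to dilation * (kernel - 1) + 1.
    return (in_size - 1) * stride - 2 * pad + dilation * (kernel - 1) + 1

# Without dilation: a 3-wide kernel at stride 2 maps 4 -> 9.
assert deconv_output_size(4, kernel=3, stride=2) == 9
# Dilation 2 widens the effective kernel from 3 to 5, so the output grows.
assert deconv_output_size(4, kernel=3, stride=2, dilation=2) == 11
```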

  • Selecting FP16 and INT8 kernels

    TensorRT supports mixed-precision inference with FP32, FP16, or INT8 precision. Depending on hardware support, you can enable any of these precisions to accelerate inference.

  • Calibration with dynamic shapes

    INT8 calibration with dynamic shapes supports the same functionality as a standard INT8 calibrator but for networks with dynamic shapes.

  • Algorithm selection

    Algorithm selection provides a mechanism to select and report the algorithms used for different layers in a network. It can also be used to deterministically build a TensorRT engine or to reproduce the same layer implementations across engines.

    For detailed changes, refer to the TensorRT 7.X.X release notes.

cuDNN 8.0.0 (Developer Preview)

  • The cuDNN library has been split into separate inference and training libraries, enabling applications to link only against the cuDNN sub-components they need.

VPI 0.2.0 (Developer Preview)

  • Performance optimization of algorithms introduced in VPI 0.1.0: up to 45x on GPU and up to 90x on CPU backends.

    Refer to the VPI documentation for benchmarks.

  • New Image FFT, Image iFFT, and Image Format Converter algorithms added with support for CPU and GPU backends. (The PVA backend will be supported in a future release.)
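
As a conceptual sketch of what an FFT/iFFT pair computes, here is a naive O(N²) DFT round trip in plain Python (VPI's implementations are of course optimized and operate on images, not 1-D lists):

```python
import cmath

def dft(signal):
    # Naive discrete Fourier transform: X[j] = sum_k x[k] * e^(-2*pi*i*j*k/N).
    n = len(signal)
    return [sum(signal[k] * cmath.exp(-2j * cmath.pi * j * k / n)
                for k in range(n)) for j in range(n)]

def idft(spectrum):
    # Inverse DFT ("iFFT"): conjugate kernel plus 1/N normalization,
    # so that idft(dft(x)) recovers x.
    n = len(spectrum)
    return [sum(spectrum[j] * cmath.exp(2j * cmath.pi * j * k / n)
                for j in range(n)) / n for k in range(n)]

x = [1.0, 2.0, 3.0, 4.0]
roundtrip = idft(dft(x))
assert all(abs(r - v) < 1e-9 for r, v in zip(roundtrip, x))
```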

2.3. Developer Tools

  • NVIDIA Nsight Systems 2020.2 for application profiling across GPU and CPU.

    • Enhanced data analysis with option to export to SQLite, HDF5 or JSON

    • Support for sampling Xavier PMU extensions

    • Reduced NVTX overhead

    • New CLI support for profiling on devices with intermittent network connectivity

  • NVIDIA Nsight Graphics 2020.1 for graphics application debugging and profiling.

    • Added support to save and load custom named layouts

    • Improved events view display and filtering

    • Enhanced support for mixed DPI monitor scaling

    • Added precise control of pixel position in the resources view for launching pixel history

    • Improved Vulkan action profiling information by fixing GPU clocks when running profiling experiments

    • Added support for new Vulkan extensions:

      VK_EXT_line_rasterization
      VK_EXT_headless_surface
      VK_KHR_create_renderpass2
      VK_KHR_external_fence_win32
      VK_KHR_external_memory_win32
      VK_KHR_external_semaphore
      VK_KHR_imageless_framebuffer
      VK_NVX_image_view_handle

  • NVIDIA Nsight Compute 2019.3 for CUDA kernel profiling.

    • The NVIDIA Nsight Compute version is unchanged from JetPack 4.3

