1. JetPack 4.2.1

1.1. New Product Support

  • Jetson TX2 4GB module (P3489-0080)

  • Jetson Nano module (P3448-0020)

1.2. New Features

All Jetson Products

  • NVIDIA Container Runtime with Docker integration [beta]

    Enables creation, distribution, and use of containerized GPU accelerated applications.

  • NVIDIA Indicator Applet

    Enables switching between nvpmodel profiles; monitors for voltage drop events and raises alerts.

  • Supports DeepStream 4.0

  • Support for headless initial configuration

    Developer kits support initial configuration via serial port instead of display, keyboard, and mouse.

  • Supports ISAAC version 2019.2

Jetson AGX Xavier

  • TensorRT gains DLA support for INT8 [beta]

  • FreeRTOS for SPE (Cortex-R5)

    Open source real-time operating system ported to run on Cortex-R5 processors in Jetson AGX Xavier and Jetson TX2 series modules. Example uses include timestamping incoming sensor data.

Jetson TX2

  • FreeRTOS for SPE (Cortex-R5)

Jetson Nano

  • Improved memory availability

    Reduced memory consumption and improved low-memory handling via use of ZRAM and system software optimizations.
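The benefit of ZRAM-backed swap can be illustrated in a few lines of Python (a conceptual sketch only, not the kernel implementation): pages swapped to a ZRAM device are compressed in RAM, so compressible pages occupy far less memory than their uncompressed size.

```python
import zlib

# Illustrative only: a highly compressible 4 KiB "page" shrinks dramatically
# when compressed, which is why ZRAM-backed swap frees usable memory.
page = b"A" * 2048 + b"B" * 2048
compressed = zlib.compress(page)
print(len(page), len(compressed))  # the compressed copy is far smaller
```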

  • Better handling of insufficient power supplies

    • Boot process updated to require less power

    • Widget to display under-voltage warning

1.3. Security

Jetson Nano, Jetson TX2, and Jetson AGX Xavier

  • Secure boot support extended to Jetson Nano

    Secure boot performs cryptographic checks at each stage of the boot process. These checks verify the integrity of each software component, preventing unauthorized code from running.
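The chain-of-trust idea behind secure boot can be sketched in pure Python (a conceptual illustration with hypothetical names, not the Tegra secure boot implementation): each stage carries a trusted digest of the next and refuses to hand off control on a mismatch.

```python
import hashlib

def digest(blob: bytes) -> str:
    return hashlib.sha256(blob).hexdigest()

def verify_chain(stages, expected_digests):
    # Each stage's measured digest must match the trusted value recorded
    # for it; any mismatch means unauthorized code, so boot is refused.
    return all(digest(blob) == expected
               for blob, expected in zip(stages, expected_digests))

bootloader, kernel = b"bootloader image", b"kernel image"
trusted = [digest(bootloader), digest(kernel)]
print(verify_chain([bootloader, kernel], trusted))    # True
print(verify_chain([b"tampered!", kernel], trusted))  # False
```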

Jetson TX2 and Jetson AGX Xavier

  • TrustedOS

    Provides a framework for customers to use when implementing applications that work with confidential assets. Common use cases include:

    1. Secure management of secret keys and OEM defined assets

    2. User authentication: PIN or biometric (e.g., facial recognition, fingerprint)

    3. Secure storage of sensitive user data

1.4. Additional Release Details

OS

  • L4T 32.2

    • Added userspace interface to change fan speed.

    • GNOME shell now uses Wayland graphics stack.

    • Jetson Nano Developer Kit now supports Ethernet boot.

    • Added a static WebRTC library with support for hardware-accelerated H.264 encoding.

    • Added DT GPIO labels to Jetson device trees.

    • Made I/O instances consistent across chips and platforms: I2S, I2C, UART, etc.
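The fan-speed interface mentioned above can be driven from user space. A minimal sketch, assuming the sysfs node lives at /sys/devices/pwm-fan/target_pwm (the path may differ by release; verify on your device):

```python
from pathlib import Path

# Assumed sysfs node for the pwm-fan target duty cycle on L4T 32.x;
# confirm the path on your release before relying on it.
FAN_PWM = Path("/sys/devices/pwm-fan/target_pwm")

def clamp_pwm(value: int) -> int:
    # The target PWM duty is an integer in 0..255.
    return max(0, min(255, int(value)))

def set_fan_speed(value: int) -> None:
    # Writing the node requires root on the device.
    FAN_PWM.write_text(str(clamp_pwm(value)))

print(clamp_pwm(300))  # out-of-range requests are clamped to 255
```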

Libraries and APIs

  • TensorRT 5.1.6.1

    • New/Updated Layers

      1. Asymmetric padding support has been added to IConvolutionLayer, IDeconvolutionLayer and IPoolingLayer.

      2. Added support for the Slice layer. The Slice layer implements a slice operator for tensors.

      3. Updated RNNv1 and RNNv2 validation of hidden and cell input/output dimensions. This affects only bidirectional RNNs.
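The asymmetric-padding and Slice semantics above can be sketched in pure Python (illustrative only; these helpers are not the TensorRT API):

```python
def pad_1d(row, pre, post, value=0):
    # Asymmetric padding: independent amounts before and after the data.
    return [value] * pre + list(row) + [value] * post

def slice_1d(data, start, size, stride):
    # Slice operator: output[i] = data[start + i * stride].
    return [data[start + i * stride] for i in range(size)]

print(pad_1d([1, 2, 3], pre=1, post=2))          # [0, 1, 2, 3, 0, 0]
print(slice_1d([0, 1, 2, 3, 4, 5, 6], 1, 3, 2))  # [1, 3, 5]
```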

    • New/Updated Samples – The README.md files for many samples have been greatly improved. Two new Python samples were added:

      1. INT8 Calibration in Python – This sample demonstrates how to create an INT8 calibrator, build and calibrate an engine for INT8 mode, and finally run inference in INT8 mode.

      2. Engine Refit in Python – This sample demonstrates the engine refit functionality provided by TensorRT. The sample first trains an MNIST model in PyTorch, then recreates the network in TensorRT.
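The core of INT8 calibration – choosing a scale that maps observed activation magnitudes onto the int8 range – can be sketched as follows (a conceptual illustration; the sample itself uses the TensorRT calibrator API):

```python
def int8_scale(calibration_values):
    # Symmetric per-tensor scale: map the largest observed magnitude to 127.
    return max(abs(v) for v in calibration_values) / 127.0

def quantize(x, scale):
    # Round to the nearest quantization step and clamp to the int8 range.
    return max(-128, min(127, round(x / scale)))

scale = int8_scale([-2.54, 0.5, 1.0])
print(quantize(2.54, scale))   # 127: the max magnitude maps to full scale
print(quantize(-3.0, scale))   # clamped to -128
```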

    • Updated Parser Support

      1. Caffe – Added BNLL, Clip and ELU ops. Additionally, the leaky ReLU option for the ReLU op (negative_slope != 0) was added.

      2. UFF – Added ArgMax, ArgMin, Clip, Elu, ExpandDims, Identity, LeakyReLU, Recip, Relu6, Sin, Cos, Tan, Asin, Acos, Atan, Sinh, Cosh, Asinh, Acosh, Atanh, Ceil, Floor, Selu, Slice, Softplus and Softsign ops

      3. ONNX – Added ArgMax, ArgMin, Clip, Cast, Elu, Selu, HardSigmoid, Softplus, Gather, ImageScaler, LeakyReLU, ParametricSoftplus, Sin, Cos, Tan, Asin, Acos, Atan, Sinh, Cosh, Asinh, Acosh, Atanh, Ceil, Floor, ScaledTanh, Softsign, Slice, ThresholdedRelu and Unsqueeze ops

    • Engine refitter – TensorRT engines can now be dynamically refitted with new weights without rebuilding the entire engine.
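Conceptually, refitting keeps the engine's structure and swaps only named weight tensors. A toy sketch of the idea (hypothetical class, not the TensorRT Refitter API):

```python
class ToyEngine:
    # Hypothetical stand-in for a built engine: topology is fixed, but
    # named weight tensors can be replaced without rebuilding the engine.
    def __init__(self, weights):
        self.weights = dict(weights)

    def refit(self, name, new_values):
        if name not in self.weights:
            raise KeyError(f"unknown weight tensor: {name}")
        self.weights[name] = new_values  # topology untouched, no rebuild

engine = ToyEngine({"fc1": [0.1, 0.2]})
engine.refit("fc1", [0.3, 0.4])
print(engine.weights["fc1"])  # [0.3, 0.4]
```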

    • Improved performance/watt for IMMA and HMMA kernels.

      The efficiency has been improved for both FP16 and INT8 modes across the platforms.

  • cuDNN 7.5.0.56

    • A new set of APIs is added to provide support for Multi-Head Attention computation.

    • A new set of APIs, and enhancements for the existing APIs, are introduced for Recurrent Neural Networks (RNNs).

    • A new set of APIs for general tensor folding is introduced.

    • Added a family of five new fast NHWC batch normalization functions and one new type descriptor.

    • API Logging is enhanced so that the process ID can be included in the log file name. See API Logging.

    • The FFT tiling algorithms for convolution have been enhanced to support strided convolution. Specifically, for the algorithms CUDNN_CONVOLUTION_FWD_ALGO_FFT_TILING and CUDNN_CONVOLUTION_BWD_DATA_ALGO_FFT_TILING, the convDesc's vertical and horizontal filter stride can be 2 when neither the filter width nor the filter height is 1.

    • The CUDNN_CONVOLUTION_FWD_ALGO_WINOGRAD algorithm for cudnnConvolutionForward() and cudnnConvolutionBackwardData() now gives superior performance for the Volta architecture. In addition, the mobile version of this algorithm in the same functions gives superior performance for the Maxwell and Pascal architectures.

    • Dilated convolutions now give superior performance for cudnnConvolutionForward(), cudnnConvolutionBackwardData(), and cudnnConvolutionBackwardFilter() on Volta architecture, in some cases.

    • All cudnnRNNForward/Backward* functions are enhanced to support FP16 math precision mode when both input and output are in FP16. To switch to FP16 math precision, set the mathPrec parameter in cudnnSetRNNDescriptor to CUDNN_DATA_HALF. To switch to FP32 math precision, set the mathPrec parameter in cudnnSetRNNDescriptor to CUDNN_DATA_FLOAT. This feature is only available for CUDNN_ALGO_STANDARD and for the compute capability 5.3 or higher.

    • Added support for the INT8x4 and INT8x32 data types for cudnnPoolingForward. Using these provides improved performance over the scalar data types.

    • In cudnnConvolutionForward() for 2D convolutions, for wDesc NCHW, the IMPLICIT_GEMM algorithm (algo 0) now also supports the INT8x4_CONFIG and INT8x4_EXT_CONFIG data type configurations.

    • Performance of cudnnPoolingBackward() is enhanced for the average pooling when using NHWC data format – for both the CUDNN_POOLING_AVERAGE_COUNT_INCLUDE_PADDING and CUDNN_POOLING_AVERAGE_COUNT_EXCLUDE_PADDING cases of cudnnPoolingMode_t.

    • Performance of the strided convolution in cudnnConvolutionBackwardData() is enhanced when the filter is in NHWC format and the data type is TRUE_HALF_CONFIG or PSEUDO_HALF_CONFIG or FLOAT_CONFIG. For strides u,v < r,s the performance is further enhanced.

    • Significantly improved the performance of cudnnConvolutionForward(), cudnnConvolutionBackwardData() and cudnnConvolutionBackwardFilter() functions on RCNN models such as Fast RCNN, Faster RCNN, and Mask RCNN.

  • CUDA 10.0.326

Developer Tools

  • NVIDIA Nsight Systems 2019.4 for application profiling across GPU and CPU.

    • Added buttons on the timeline to quickly show more or less of low-impact processes, threads, or CUDA streams.

    • The percentage breakdown of CUDA contexts, streams, and kernels is now shown on the timeline.

    • A notification is now displayed when a new version of Nsight Systems is released.

    • Various bug fixes and performance enhancements.

  • NVIDIA Nsight Graphics 2019.2 for graphics application debugging and profiling.

    • Range Profiler selection linking

      • Range Profiler selection now sets the current event to the last event in the selected range

      • Enhances transition between profiling and debugging workflows

    • OpenGL and Vulkan Interoperability debugging

      • Enables the Frame Debugging activity for applications using OpenGL and Vulkan simultaneously

      • Supports EXT_external_objects

    • New Range Profiler Preview

      • Configure sections to customize the profiling experience

      • Introduce rules to create warnings based on GPU performance counter values within the configurable Range Profiler

    • Feedback Button

      • Integrated a feedback button into Nsight Graphics to make sending comments, bugs, and feature requests easier

  • NVIDIA Nsight Compute 2019.3 for CUDA kernel profiling.

    • Kernel launch context and stream are reported as metrics.

    • PC sampling configuration options are reported as metrics.

    • The default base port for connections to the target has changed.

    • Section files support multiple, named Body fields.

    • NvRules allow users to query metrics using any convertible data type.

    • Improved performance and bug fixes.

Roadmap Notes

  • Vision Accelerator support

    • Jetson AGX Xavier includes a 7-way VLIW Vision Accelerator (VA) for accelerating traditional computer vision workloads. Support for the VA will be included in a future release.

 

Notices

Notice

THE INFORMATION IN THIS GUIDE AND ALL OTHER INFORMATION CONTAINED IN NVIDIA DOCUMENTATION REFERENCED IN THIS GUIDE IS PROVIDED “AS IS.” NVIDIA MAKES NO WARRANTIES, EXPRESSED, IMPLIED, STATUTORY, OR OTHERWISE WITH RESPECT TO THE INFORMATION FOR THE PRODUCT, AND EXPRESSLY DISCLAIMS ALL IMPLIED WARRANTIES OF NONINFRINGEMENT, MERCHANTABILITY, AND FITNESS FOR A PARTICULAR PURPOSE. Notwithstanding any damages that customer might incur for any reason whatsoever, NVIDIA’s aggregate and cumulative liability towards customer for the product described in this guide shall be limited in accordance with the NVIDIA terms and conditions of sale for the product.

THE NVIDIA PRODUCT DESCRIBED IN THIS GUIDE IS NOT FAULT TOLERANT AND IS NOT DESIGNED, MANUFACTURED OR INTENDED FOR USE IN CONNECTION WITH THE DESIGN, CONSTRUCTION, MAINTENANCE, AND/OR OPERATION OF ANY SYSTEM WHERE THE USE OR A FAILURE OF SUCH SYSTEM COULD RESULT IN A SITUATION THAT THREATENS THE SAFETY OF HUMAN LIFE OR SEVERE PHYSICAL HARM OR PROPERTY DAMAGE (INCLUDING, FOR EXAMPLE, USE IN CONNECTION WITH ANY NUCLEAR, AVIONICS, LIFE SUPPORT OR OTHER LIFE CRITICAL APPLICATION). NVIDIA EXPRESSLY DISCLAIMS ANY EXPRESS OR IMPLIED WARRANTY OF FITNESS FOR SUCH HIGH RISK USES. NVIDIA SHALL NOT BE LIABLE TO CUSTOMER OR ANY THIRD PARTY, IN WHOLE OR IN PART, FOR ANY CLAIMS OR DAMAGES ARISING FROM SUCH HIGH RISK USES.

NVIDIA makes no representation or warranty that the product described in this guide will be suitable for any specified use without further testing or modification. Testing of all parameters of each product is not necessarily performed by NVIDIA. It is customer’s sole responsibility to ensure the product is suitable and fit for the application planned by customer and to do the necessary testing for the application in order to avoid a default of the application or the product. Weaknesses in customer’s product designs may affect the quality and reliability of the NVIDIA product and may result in additional or different conditions and/or requirements beyond those contained in this guide. NVIDIA does not accept any liability related to any default, damage, costs or problem which may be based on or attributable to: (i) the use of the NVIDIA product in any manner that is contrary to this guide, or (ii) customer product designs.

Other than the right for customer to use the information in this guide with the product, no other license, either expressed or implied, is hereby granted by NVIDIA under this guide. Reproduction of information in this guide is permissible only if reproduction is approved by NVIDIA in writing, is reproduced without alteration, and is accompanied by all associated conditions, limitations, and notices.

Trademarks

NVIDIA, the NVIDIA logo, and cuBLAS, CUDA, CUDA-GDB, CUDA-MEMCHECK, cuDNN, cuFFT, cuSPARSE, DIGITS, DGX, DGX-1, DGX Station, NVIDIA DRIVE, NVIDIA DRIVE AGX, NVIDIA DRIVE Software, NVIDIA DRIVE OS, NVIDIA Developer Zone (aka "DevZone"), GRID, Jetson, NVIDIA Jetson Nano, NVIDIA Jetson AGX Xavier, NVIDIA Jetson TX2, NVIDIA Jetson TX2i, NVIDIA Jetson TX1, NVIDIA Jetson TK1, Kepler, NGX, NVIDIA GPU Cloud, Maxwell, Multimedia API, NCCL, NVIDIA Nsight Compute, NVIDIA Nsight Eclipse Edition, NVIDIA Nsight Graphics, NVIDIA Nsight Systems, NVLink, nvprof, Pascal, NVIDIA SDK Manager, Tegra, TensorRT, Tesla, Visual Profiler, VisionWorks and Volta are trademarks and/or registered trademarks of NVIDIA Corporation in the United States and other countries. Other company and product names may be trademarks of the respective companies with which they are associated.