1. JetPack 4.5

1.1. New Features

This release includes VPI 1.0 (first production release of VPI) and L4T 32.5 with new security features, boot enhancements, and a new way to flash Jetson devices using NFS.

2. Additional Release Details

2.1. OS

L4T 32.5

  • Secure boot enhanced for Jetson AGX Xavier and Jetson Xavier NX to extend encryption support to kernel, kernel-dtb and initrd.

  • Disk encryption supported to protect data at rest for Jetson AGX Xavier, Jetson Xavier NX and Jetson TX2 series.

  • Support for loading the kernel, device tree, and initrd from a root file system on a USB drive or NVMe for Jetson TX2 series, Jetson TX1, and Jetson Nano modules and developer kits.

  • U-Boot version updated to v2020.04 for Jetson TX2 series, Jetson TX1, and Jetson Nano.

  • New way of flashing eMMC, SD card, USB, or NVMe storage connected to a Jetson device using Network File System (NFS). Jetson AGX Xavier, Jetson Xavier NX, and Jetson TX2 series are supported in this release.

  • Boot firmware for all Jetson Nano developer kits updated to relocate boot firmware to integrated QSPI-NOR. The microSD card on the developer kit will solely be used for OS/app storage going forward. microSD cards with previous versions of JetPack will continue to function as before.

  • All Jetson Nano developer kits now show a warning screen if no SD card is inserted, then attempt to boot from other supported media.

2.2. Libraries and APIs

CUDA 10.2

  • Performance optimization through user-mode submits.

    50% launch latency reduction for CUDA kernels, resulting in improved GPU utilization and lower CPU utilization.

TensorRT 7.1.3

  • New Layers

    1. IFillLayer: The IFillLayer is used to generate an output tensor with the specified mode.

    2. IIteratorLayer: The IIteratorLayer enables a loop to iterate over a tensor. A loop is defined by loop boundary layers.

    3. ILoopBoundaryLayer: Class ILoopBoundaryLayer defines a virtual method getLoop() that returns a pointer to the associated ILoop.

    4. ILoopOutputLayer: The ILoopOutputLayer specifies an output from the loop.

    5. IParametricReluLayer: The IParametricReluLayer represents a parametric ReLU operation; that is, a leaky ReLU where the slope for x < 0 can be different for each element.

    6. IRecurrenceLayer: The IRecurrenceLayer specifies a recurrent definition.

    7. ISelectLayer: The ISelectLayer returns one of its two inputs elementwise, depending on a condition.

    8. ITripLimitLayer: The ITripLimitLayer specifies how many times the loop iterates.

  • New Operators

    Expanded support for ONNX operations: Added ConstantOfShape, DequantizeLinear, Equal, Erf, Expand, Greater, GRU, Less, Loop, LRN, LSTM, Not, PRelu, QuantizeLinear, RandomUniform, RandomUniformLike, Range, RNN, Scan, Sqrt, Tile, and Where.

  • New Samples

    sampleAlgorithmSelector, based on sampleMNIST, shows how to use the algorithm selection API. It demonstrates the usage of IAlgorithmSelector to deterministically build TensorRT engines.

    onnx_packnet is a Python sample which uses TensorRT to perform inference with the PackNet network. PackNet is a self-supervised monocular depth estimation network used in autonomous driving.
  • Working with Loops

    TensorRT supports loop-like constructs, which can be useful for recurrent networks. TensorRT loops support scanning over input tensors, recurrent definitions of tensors, and both "scan outputs" and "last value" outputs.
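
    The roles of the loop layers can be pictured in plain Python. The sketch below is an illustration of the semantics only, not the TensorRT API; the function name is made up. The trip limit bounds the iteration count, the iterator yields one slice of the scan input per trip, the recurrence carries state across iterations, and the loop exposes both a "last value" and a concatenated "scan" output.

```python
def scan_loop(xs, trip_limit):
    """Mimics TensorRT loop semantics: ITripLimit bounds the count,
    IIteratorLayer yields one slice of xs per iteration,
    IRecurrenceLayer carries `state` across iterations, and
    ILoopOutputLayer exposes both a last-value and a scan output."""
    state = 0                      # recurrence: initial value
    scan_out = []                  # concatenated "scan output"
    for i in range(trip_limit):    # count-style trip limit
        x = xs[i]                  # iterator: one slice per trip
        state = state + x          # recurrence: new state from old state + slice
        scan_out.append(state)
    return state, scan_out         # "last value" output, "scan" output

last, scan = scan_loop([1, 2, 3, 4], trip_limit=4)
assert last == 10 and scan == [1, 3, 6, 10]
```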

  • ONNX parser with dynamic shapes support

    The ONNX parser supports full-dimensions mode only. Your network definition must be created with the explicitBatch flag set.

  • BERT INT8 and mixed precision optimizations

    Some GEMM layers are now followed by GELU activation in the BERT model. Since TensorRT doesn't have IMMA GEMM layers, you can implement those GEMM layers in the BERT network with either IConvolutionLayer or IFullyConnectedLayer layers depending on what precision you require. For example, you can leverage IConvolutionLayer with H == W == 1 (CONV1x1) to implement a FullyConnected operation and leverage IMMA math under INT8 mode. TensorRT supports the fusion of Convolution/FullyConnected and GELU.
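
    Why a 1x1 convolution can stand in for a FullyConnected operation is easy to see in plain Python: with H == W == 1, the convolution degenerates to the same per-output-channel dot product. The sketch below is illustrative only (not TensorRT code; all names are made up).

```python
def fully_connected(x, w):
    # x: input vector of C channels; w: K x C weight matrix
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]

def conv1x1(x, w):
    # Treat x as C feature maps of spatial size 1x1 (H == W == 1);
    # a 1x1 convolution then reduces to the same dot product per filter.
    feature_map = [[[xi]] for xi in x]   # C x H x W with H == W == 1
    return [sum(row[c] * feature_map[c][0][0] for c in range(len(x)))
            for row in w]

x = [1.0, 2.0, 3.0]
w = [[0.5, 0.0, -1.0], [1.0, 1.0, 1.0]]
assert fully_connected(x, w) == conv1x1(x, w)
```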

  • Working with Quantized Networks

    Supports quantized models trained with Quantization Aware Training. Support is limited to symmetrically quantized models, meaning zero_point = 0 using QuantizeLinear and DequantizeLinear.
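
    Symmetric quantization with zero_point = 0 reduces to simple scale/round/clamp arithmetic. A plain-Python sketch of the QuantizeLinear/DequantizeLinear math (illustrative only, not TensorRT code):

```python
def quantize_linear(x, scale):
    # Symmetric INT8 quantization: zero_point == 0, so q = round(x / scale),
    # clamped to the symmetric signed range [-127, 127].
    q = round(x / scale)
    return max(-127, min(127, q))

def dequantize_linear(q, scale):
    # Reconstruct the real value; the difference from x is the quantization error.
    return q * scale

scale = 0.1
q = quantize_linear(2.34, scale)     # rounds 23.4 to 23
x_hat = dequantize_linear(q, scale)  # approximately 2.3
assert q == 23
```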

  • Boolean Tensor Support

    TensorRT supports Boolean tensors which can be marked as network input and output. IElementWiseLayer, IUnaryLayer (only kNOT), IShuffleLayer, ITripLimit (only kWHILE) and ISelectLayer support the Boolean datatype. Boolean tensors can be used only with FP32 and FP16 precision networks.
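
    The elementwise select semantics can be sketched in plain Python (an illustration of the behavior, not the TensorRT API; the names are made up): a Boolean condition tensor picks, per element, between two data tensors.

```python
def select(condition, then_vals, else_vals):
    """Elementwise select, mirroring ISelectLayer semantics:
    pick then_vals[i] where condition[i] is True, else else_vals[i]."""
    return [t if c else e for c, t, e in zip(condition, then_vals, else_vals)]

# A Boolean "tensor", e.g. as produced by an elementwise comparison.
mask = [x > 0 for x in [-1.5, 0.0, 2.0]]
result = select(mask, [1, 1, 1], [0, 0, 0])
assert result == [0, 0, 1]
```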

  • Working with empty tensors

    TensorRT supports empty tensors. A tensor is an empty tensor if it has one or more dimensions with length zero.

  • Builder layer timing cache

    The layer timing cache caches layer profiling information during the builder phase. Models with repeated layers will see a significant reduction in build time.

  • Pointwise fusion based on code generation

    Pointwise fusion is updated to use code generation and runtime compilation to further improve performance.

  • Dilation support for deconvolution

    IDeconvolutionLayer now supports a dilation parameter. This is accessible through the C++ API, Python API, and the ONNX parser.
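
    The effect of dilation on a transposed convolution can be seen from the standard output-size formula used by common frameworks: dilation expands the effective kernel to dilation * (kernel - 1) + 1. A plain-Python illustration (not the TensorRT API):

```python
def deconv_output_size(in_size, kernel, stride=1, padding=0, dilation=1):
    # Standard transposed-convolution output-size formula; dilation expands
    # the effective kernel extent to dilation * (kernel - 1) + 1.
    return (in_size - 1) * stride - 2 * padding + dilation * (kernel - 1) + 1

no_dilation = deconv_output_size(8, kernel=3, stride=2)
dilated = deconv_output_size(8, kernel=3, stride=2, dilation=2)
assert dilated > no_dilation   # dilation enlarges the output extent
```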

  • Selecting FP16 and INT8 kernels

    TensorRT supports mixed-precision inference with FP32, FP16, or INT8 as supported precisions. Depending on hardware support, you can enable any of these precisions to accelerate inference.
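
    The trade-off behind the precision choice can be seen by round-tripping a value through IEEE 754 half precision using only the Python standard library (an illustration of FP16 rounding, not of TensorRT's kernel selection itself):

```python
import struct

def to_fp16(x):
    # Round-trip a Python float through IEEE 754 half precision
    # ('e' is the half-precision format code for struct).
    return struct.unpack('e', struct.pack('e', x))[0]

third = 1.0 / 3.0
half = to_fp16(third)          # FP16 keeps only 10 mantissa bits
assert half != third           # precision is lost...
assert abs(half - third) < 1e-4  # ...but the error is small
```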

  • Calibration with dynamic shapes

    INT8 calibration with dynamic shapes supports the same functionality as a standard INT8 calibrator but for networks with dynamic shapes.

  • Algorithm selection

    Algorithm selection provides a mechanism to select and report algorithms for different layers in a network. This can also be used to deterministically build TensorRT engine or to reproduce the same implementations for layers in the engine.
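
    The motivation can be sketched in plain Python: measured layer timings jitter from run to run, so always picking the fastest tactic is not reproducible, while picking by a stable key is. The sketch below uses made-up data and a made-up function name; it illustrates the idea behind deterministic selection, not the IAlgorithmSelector API.

```python
def pick_deterministic(candidates):
    """candidates: list of (tactic_id, measured_ms) for one layer.
    Choosing by a stable key (here: lowest tactic_id) makes the
    choice reproducible across builds, unlike 'pick the fastest'."""
    return min(candidates, key=lambda c: c[0])[0]

run_a = [(7, 0.91), (3, 0.90), (5, 0.92)]   # jittered timings, build 1
run_b = [(7, 0.89), (3, 0.93), (5, 0.92)]   # jittered timings, build 2
# "Fastest" would pick tactic 3 in build 1 but tactic 7 in build 2;
# the stable key picks the same tactic both times.
assert pick_deterministic(run_a) == pick_deterministic(run_b) == 3
```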

  • INT8 Calibration

    The legacy class IInt8LegacyCalibrator is no longer deprecated. It is provided as a fallback option if the other calibrators yield poor results. A new kCALIBRATION_BEFORE_FUSION flag has been added, which allows calibration to run before fusion.

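
    For intuition, calibration ultimately produces a scale per tensor. A minimal "max calibration" sketch in plain Python (illustrative only; TensorRT's default entropy calibrator uses a more sophisticated method, and all names below are made up):

```python
def max_calibrate(activations):
    # Symmetric max calibration: map the largest observed magnitude to 127.
    amax = max(abs(a) for a in activations)
    return amax / 127.0

def quantize(x, scale):
    # Quantize with the calibrated scale, clamped to signed INT8 range.
    return max(-127, min(127, round(x / scale)))

scale = max_calibrate([-6.35, 0.2, 3.1])   # amax = 6.35, so scale = 0.05
q = quantize(3.1, scale)                   # 3.1 / 0.05 rounds to 62
assert q == 62
```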
  • Quantizing and dequantizing scale layers

    A quantizing scale layer can be specified as a scale layer with output precision type of INT8. Similarly, a dequantizing scale layer can be specified as a scale layer with output precision type of FP32. Networks must be created with Explicit Precision mode to use these layers.

  • Group normalization plugin

    A new group normalization plugin has been added.

    Refer to the TensorRT 7.x release notes for details.

cuDNN 8.0.0

  • The cuDNN library is split into multiple inference and training libraries, enabling applications to link against only the cuDNN sub-components they need.

VPI 1.0

  • New Algorithms

    • Pyramidal LK Optical Flow supported on CPU and GPU

  • YUV422 packed color support for VIC backend

  • Easier integration with OpenCV

Multimedia API

  • The V4L2 API is extended to support interaction with CSI cameras. The implementation is compliant with the V4L2 specification.

2.3. Developer Tools

  • NVIDIA Nsight Systems 2020.5 for application profiling across GPU and CPU.

    • The Nsight Systems CLI collects profiling data without using the GUI.

    • Several GUI improvements:

      • Improved selection highlighting

      • Improved support for high DPI displays

      • OpenGL GPU row moved next to CUDA GPU rows

      • Improved CUDA UVM coloring

      • Improved CUDA stream overview rendering

      • Improved event viewer search system

      • Improved event viewer highlight and/or zoom

      • Option to sort thread by utilization within selection

      • Keyboard hotkey info button

  • NVIDIA Nsight Graphics 2020.5 for graphics application debugging and profiling.

    • Adds support for new Vulkan extensions:

      VK_KHR_buffer_device_address
      VK_KHR_separate_depth_stencil_layouts
      VK_KHR_timeline_semaphore
      VK_KHR_deferred_host_operations
      VK_KHR_pipeline_library
      VK_EXT_tooling_info
    • Adds support for new OpenGL extensions:

      GL_ARB_compute_variable_group_size
      GL_NV_representative_fragment_test
      GL_NV_clip_space_w_scaling
    • Feature Enhancements:

      • Vulkan 1.2

        • Support for full frame debugging and profiling for the recently released Vulkan 1.2 specification.

        • Includes support for timeline semaphores and other key extensions present in the core spec.

      • VK_KHR_ray_tracing support (provisional extension)

        Applications that use VK_KHR_ray_tracing can be captured, profiled, and exported to a C++ Capture. While the extension is still evolving, the NVIDIA Beta Driver allows for experimentation before the extension is ratified. For more information, see https://developer.nvidia.com/vulkan-driver.

      • Added VRS visualization in the Current Target View

        • The shading rate can be visualized as an overlay in the Current Target View if it is enabled for the current draw call.

        • Supports Vulkan and OpenGL.

      • Vulkan Dynamic Shader Editing

        Vulkan (GLSL) shaders can now be dynamically edited, allowing you to try new changes in real time without having to restart your application.

      • Acceleration Structure Navigation Enhancements

        • Refined and added controls to rotate around a selected geometry.

        • Added context menus to show/hide geometry.

        • Added an animation when navigating to geometry.

    • Improvements

      • Improved experience for hosts running with mixed-DPI monitors.

      • Improved semaphore visualization for OpenGL and Vulkan interop applications.

      • C++ Capture build time for large applications was reduced by up to 40x.

      • C++ Capture replay performance was improved by up to 5%.

      • Improved detection and freezing of watchdog threads to help avoid application teardown by a watchdog after resuming from a capture.

      • Improved crash information: we now show the call stack if a crash occurs, allowing the user to differentiate between problems in their application and problems with the tool.

      • Added visualization of debug info loaded status in the Linked Programs View.

      • Connection Dialog: added the ability to use configuration variables in the Connection Dialog to support project-relative, application-relative, or working-directory-relative application launches.

      • API Inspector View: extended the buffer display to expand to a maximum of 100 rows for your Vulkan applications.

      • Added support for displaying fixed-size strings in the memory viewer.

      • OpenGL-Vulkan interoperability support on Linux.

  • NVIDIA Nsight Compute 2019.3 for CUDA kernel profiling.

    • The NVIDIA Nsight Compute version is unchanged from JetPack 4.4.


Notices

Notice

THE INFORMATION IN THIS GUIDE AND ALL OTHER INFORMATION CONTAINED IN NVIDIA DOCUMENTATION REFERENCED IN THIS GUIDE IS PROVIDED “AS IS.” NVIDIA MAKES NO WARRANTIES, EXPRESSED, IMPLIED, STATUTORY, OR OTHERWISE WITH RESPECT TO THE INFORMATION FOR THE PRODUCT, AND EXPRESSLY DISCLAIMS ALL IMPLIED WARRANTIES OF NONINFRINGEMENT, MERCHANTABILITY, AND FITNESS FOR A PARTICULAR PURPOSE. Notwithstanding any damages that customer might incur for any reason whatsoever, NVIDIA’s aggregate and cumulative liability towards customer for the product described in this guide shall be limited in accordance with the NVIDIA terms and conditions of sale for the product.

THE NVIDIA PRODUCT DESCRIBED IN THIS GUIDE IS NOT FAULT TOLERANT AND IS NOT DESIGNED, MANUFACTURED OR INTENDED FOR USE IN CONNECTION WITH THE DESIGN, CONSTRUCTION, MAINTENANCE, AND/OR OPERATION OF ANY SYSTEM WHERE THE USE OR A FAILURE OF SUCH SYSTEM COULD RESULT IN A SITUATION THAT THREATENS THE SAFETY OF HUMAN LIFE OR SEVERE PHYSICAL HARM OR PROPERTY DAMAGE (INCLUDING, FOR EXAMPLE, USE IN CONNECTION WITH ANY NUCLEAR, AVIONICS, LIFE SUPPORT OR OTHER LIFE CRITICAL APPLICATION). NVIDIA EXPRESSLY DISCLAIMS ANY EXPRESS OR IMPLIED WARRANTY OF FITNESS FOR SUCH HIGH RISK USES. NVIDIA SHALL NOT BE LIABLE TO CUSTOMER OR ANY THIRD PARTY, IN WHOLE OR IN PART, FOR ANY CLAIMS OR DAMAGES ARISING FROM SUCH HIGH RISK USES.

NVIDIA makes no representation or warranty that the product described in this guide will be suitable for any specified use without further testing or modification. Testing of all parameters of each product is not necessarily performed by NVIDIA. It is customer’s sole responsibility to ensure the product is suitable and fit for the application planned by customer and to do the necessary testing for the application in order to avoid a default of the application or the product. Weaknesses in customer’s product designs may affect the quality and reliability of the NVIDIA product and may result in additional or different conditions and/or requirements beyond those contained in this guide. NVIDIA does not accept any liability related to any default, damage, costs or problem which may be based on or attributable to: (i) the use of the NVIDIA product in any manner that is contrary to this guide, or (ii) customer product designs.

Other than the right for customer to use the information in this guide with the product, no other license, either expressed or implied, is hereby granted by NVIDIA under this guide. Reproduction of information in this guide is permissible only if reproduction is approved by NVIDIA in writing, is reproduced without alteration, and is accompanied by all associated conditions, limitations, and notices.

Trademarks

NVIDIA, the NVIDIA logo, and cuBLAS, CUDA, CUDA-GDB, CUDA-MEMCHECK, cuDNN, cuFFT, cuSPARSE, DIGITS, DGX, DGX-1, DGX Station, NVIDIA DRIVE, NVIDIA DRIVE AGX, NVIDIA DRIVE Software, NVIDIA DRIVE OS, NVIDIA Developer Zone (aka "DevZone"), GRID, Jetson, NVIDIA Jetson Nano, NVIDIA Jetson AGX Xavier, NVIDIA Jetson TX2, NVIDIA Jetson TX2i, NVIDIA Jetson TX1, NVIDIA Jetson TK1, Kepler, NGX, NVIDIA GPU Cloud, Maxwell, Multimedia API, NCCL, NVIDIA Nsight Compute, NVIDIA Nsight Eclipse Edition, NVIDIA Nsight Graphics, NVIDIA Nsight Systems, NVLink, nvprof, Pascal, NVIDIA SDK Manager, Tegra, TensorRT, Tesla, Visual Profiler, VisionWorks and Volta are trademarks and/or registered trademarks of NVIDIA Corporation in the United States and other countries. Other company and product names may be trademarks of the respective companies with which they are associated.