Nsight Deep Learning Designer Release Notes
Release notes and system requirements.
Updates in 2025.1.1
Nsight Deep Learning Designer 2025.1.1 includes the following fixes and updates:
Fixed a bug causing profiling to fail on Blackwell chips.
Fixed a bug where TensorRT layer metadata files were not rendered correctly in certain cases.
Added support for FP4 types when visualizing a TensorRT engine.
Fixed a bug where certain metadata (e.g. node positions) could not be saved in an ONNX local function if the parent document was saved under a different file name.
Fixed a bug where tensor sizes were not being displayed correctly in layer node terminal tooltips.
Improved visibility of selected layer nodes in the Model Canvas.
Updates in 2025.1
Nsight Deep Learning Designer 2025.1 includes the following updates:
Integrated with TensorRT 10.8.
Added support for switching between different minor versions of TensorRT.
Added support to hide or show Constant nodes in both the Model Canvas and Layer Explorer (hidden by default).
Added support to create a new ONNX node via the Ctrl+Space hotkey.
Added support for user-specified locations for storing dependency bits and timing caches via the NV_DLD_CACHE_DIR environment variable.
Fixed a bug that caused excessive CPU memory usage when loading very large ONNX models.
Fixed a bug that caused the type checker to fail for some YOLO models when not all inputs of some nodes are connected.
Fixed an issue where some ONNX node links were rendered invisible while nodes or links were being dragged across the screen.
Fixed an issue that caused the type checker to fail when an output node has the same name as an intermediate tensor.
Fixed a bug where an opened ONNX model file is not marked dirty when some nodes are renamed automatically during loading.
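The NV_DLD_CACHE_DIR variable mentioned above can be set before launching the application. A minimal sketch (the cache path and executable name are examples, not prescribed values):

```python
import os

# NV_DLD_CACHE_DIR redirects downloaded dependency bits and timing
# caches away from the default location. The directory path here is
# only an example; any writable location works.
cache_dir = os.path.join(os.path.expanduser("~"), "dld-cache")
env = dict(os.environ, NV_DLD_CACHE_DIR=cache_dir)

# Launch the application with the variable set; the executable name and
# install path depend on your system, so the call is left commented out.
# subprocess.run(["<path-to-nsight-dl-designer>"], env=env)
```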
Nsight Deep Learning Designer depends on the following components and will download some of them automatically when needed:
Component | Version
---|---
CUDA | 12.8
cuDNN | 8.9.7
DirectML | 1.15.2
Nsight Systems | 2024.6.1
ONNX IR | 9
ONNX GraphSurgeon | 0.5.1
ONNX Opsets | 1 to 20
ONNX Runtime | 1.17.3
Polygraphy | 0.49.9
TensorRT | 10.8
WinPixEventRuntime | 1.0.240308001
Updates in 2024.1
Nsight Deep Learning Designer has been completely revamped. In this release, we’ve embraced ONNX to provide broader and more flexible support for DL models. More importantly, we have adopted TensorRT, NVIDIA’s signature inference solution, to replace NvNeural as the underlying inference engine. Some notable changes in this release:
Nsight Deep Learning Designer is now a full-fledged ONNX model editor. ONNX models can be visualized, edited, and profiled.
We revamped how nodes are represented on the canvas.
Added a new layout algorithm that arranges nodes on the canvas more compactly.
Added support for ONNX opset version 1 to 20, and ONNX Runtime Contrib opset version 1.
Added support for subgraphs and local functions in ONNX models.
Added support to extract subgraphs or a selection of nodes to a standalone ONNX model.
Added support to export a TensorRT engine from an ONNX model.
Added support for visualizing TensorRT engine graphs.
Added support for TensorRT and ONNX Runtime profiling for ONNX models. Profiling generates a report document with detailed inference performance data for the model.
Added a new model validator based on Polygraphy to report any errors, warnings, or issues caused by the current model structure.
Added support to run custom tools on ONNX models from Nsight Deep Learning Designer.
Added batch modification actions for ONNX models: convert model to FP16, sanitize model, and batch tensor conversion.
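The "convert model to FP16" action above casts each weight tensor to half precision. A sketch of the numeric essence only (not the tool's implementation), using Python's standard `struct` module to round-trip IEEE 754 half-precision values:

```python
import struct

# Illustrative only: the core of an FP32 -> FP16 conversion is a cast
# of each weight. struct's "e" format packs IEEE 754 half precision.
def to_fp16(value):
    # Pack as half precision and unpack to observe the rounded value.
    return struct.unpack("<e", struct.pack("<e", value))[0]

weights_fp32 = [0.1, -2.5, 65504.0]
weights_fp16 = [to_fp16(w) for w in weights_fp32]
# 65504 is the largest finite FP16 value; larger magnitudes overflow,
# which is why FP16 conversion can change model accuracy.
```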
Nsight Deep Learning Designer depends on the following components and will download some of them automatically when needed:
Component | Version
---|---
CUDA | 12.7
cuDNN | 8.9.7
DirectML | 1.15.2
Nsight Systems | 2024.6.1
ONNX IR | 9
ONNX GraphSurgeon | 0.5.1
ONNX Opsets | 1 to 20
ONNX Runtime | 1.17.3
Polygraphy | 0.49.9
TensorRT | 10.7
WinPixEventRuntime | 1.0.240308001
As part of the transition to ONNX and TensorRT, we deprecated the following:
Editing and visualizing NvNeural models
Importing/exporting NvNeural models from/to PyTorch
NvNeural SDK
Analysis mode and channel inspector
Updates in 2022.2
NVIDIA Nsight Deep Learning Designer changes in version 2022.2:
We added support to launch the PyTorch exporter from a virtual environment (Conda or virtualenv).
We improved the overall performance of Channel Inspector by separating the visualization of per-layer weights from per-layer features and changing the way how we visualize NxCx1x1 weights.
We added an experimental feature that allows users to directly import existing PyTorch models into NVIDIA Nsight Deep Learning Designer without starting from scratch.
We switched to using cubic curves to represent layer links to reduce path-finding overhead.
We added support to visualize inference results of classification networks in the Analysis Mode.
We added support for custom padding (in addition to the existing `same` and `valid` options) to the relevant layers.
We fixed numerous bugs.
We have also decided to remove some uncommonly used operators as we modernize the inference library and unify our model exporters. The following layers are now deprecated and will be removed in a future product release:
Local response normalization (LRN) layer: Only `region` values of `within` are being removed. Normalization `across` channels is still fully supported.
Mono-four-stack layer: Replace with a custom layer.
Mono-to-RGB layer: Replace with a custom layer.
Network layer: Import the subnetwork as a template instead.
Output layer: Use of output layers for tensor slicing (the `width`, `height`, `channels`, and `offset` parameters) is deprecated. Use an explicit slice layer if these operations are required.
The following activations are now deprecated:
Leaky sigmoid activation: Replace with a custom layer.
Leaky tanh activation: Replace with a custom layer.
ReLU activation with `alpha` values other than zero: being removed. Normal ReLU is still fully supported. The element-wise `max` layer can replace these activations.
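The suggested replacement works because a "leaky" ReLU with 0 < alpha < 1 is equivalent to an element-wise maximum of the input and its scaled copy. A plain-Python sketch (the alpha value is illustrative):

```python
# For 0 < alpha < 1: leaky_relu(x) == max(x, alpha * x).
# When x > 0, x dominates; when x < 0, alpha * x is closer to zero.
def leaky_relu(x, alpha=0.1):
    return x if x > 0 else alpha * x

def leaky_relu_via_max(x, alpha=0.1):
    return max(x, alpha * x)

for x in (-3.0, 0.0, 2.5):
    assert leaky_relu(x) == leaky_relu_via_max(x)
```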
NvNeural changes in version 2022.2:
Changed the signature of `nvneural::XmlNetworkBuilder::createLayerObject` to receive the original serialized type used to select the layer object being instantiated. Custom classes deriving from `XmlNetworkBuilder` must be updated if they override this function.
Updates in 2022.1
NVIDIA Nsight Deep Learning Designer changes in version 2022.1:
Added support to save all tensors in the analysis mode.
Added support for using nested templates to construct hierarchical network graphs.
We significantly improved the performance of the type-checking process in the Editor.
Fixed a bug that prevented PyTorch exports on Linux from succeeding.
Removed clamping behavior from the Affine layer. It no longer restricts the values of its `scale` and `offset` parameters. The `options_on` parameter has been deprecated; users wishing to hide interactive controls for this layer during analysis should set the new `include_ui` parameter to false.
Fixed a bug that blocked FP16 inference when fusing 7x7 convolutions with batch normalizations.
NvNeural changes in version 2022.1:
Added a new analysis layer: Signal Injector.
Added a new Input (Constant) layer which supports direct embedding of scalar constants.
Optimized the performance of the BatchNorm layer.
Optimized the performance of the Upscale layer.
Added support for downscaling and fixed-size scaling to the Upscale layer.
The NvRTC wrapper in `nvneural::ICudaRuntimeCompiler` has been replaced with a stub when type-checking networks from the GUI. Plugins that rely on the ability to execute generated kernel code during initialization or `nvneural::ILayer::reshape` should call NvRTC directly, but for performance reasons we do not recommend this approach.
The `forward()` function in the exported PyTorch class now takes keyword-only arguments. Users should explicitly name the input parameters when calling the model/function.
The `INetwork::inferenceSubgraph` method now applies queued reshape operations. Queued reshapes are not cleared upon failure and will continue to block `inference` and `inferenceSubgraph` calls until they succeed.
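The keyword-only calling convention mentioned above can be sketched in plain Python (the class and input names here are hypothetical; the real exported class derives from `torch.nn.Module`):

```python
# The bare "*" makes every parameter after it keyword-only, so callers
# must name each input explicitly.
class ExportedModel:
    def forward(self, *, image, mask):
        # ... run inference on the named inputs (stubbed here) ...
        return image, mask

model = ExportedModel()
out = model.forward(image=[1.0], mask=[0.0])   # OK: inputs named
# model.forward([1.0], [0.0])                  # raises TypeError
```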
Updates in 2021.2
NVIDIA Nsight Deep Learning Designer changes in version 2021.2:
The Channel Inspector can display summary and per-channel statistics about a layer’s output tensor: mean, minimum/maximum, standard deviation, and sparsity (percentage of tensor elements close to zero).
Output tensor shapes are now visible during editing.
ConverenceNG can now save network outputs as .npy files.
Users can now expand or collapse the parameters list of a layer glyph in the editor view.
In Channel Inspector, users can now toggle a checkbox to perform auto scale and shift of the channels.
Users can now save a template as a file that can be imported into another model.
We have added more data to network profiling reports:
Per-layer device memory footprints
Whole-network device memory footprint
Percentage view for inference timings
Layers’ distance from the nearest input, for sorting by network depth
Template-level inference timings
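The Channel Inspector statistics described above can be reproduced with straightforward arithmetic. A plain-Python sketch (the near-zero threshold used for sparsity here is an assumption, not the tool's documented value):

```python
import math

# Summary statistics over a flattened output tensor: mean, min/max,
# population standard deviation, and sparsity (percentage of elements
# whose magnitude is at or below a small threshold).
def tensor_stats(values, eps=1e-6):
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    return {
        "mean": mean,
        "min": min(values),
        "max": max(values),
        "std": math.sqrt(var),
        "sparsity": 100.0 * sum(abs(v) <= eps for v in values) / n,
    }

stats = tensor_stats([0.0, 0.0, 1.0, -1.0])
```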
NvNeural changes in version 2021.2:
Plugin initialization has been refactored to reduce its reliance on translation-unit-scoped static initialization. The ExportPlugin framework now expects user plugin code to provide an implementation of the function `void nvneural::plugin::InitializePluginTypes()`. This function should call static ClassRegistry methods to make its export types visible to the client application.
The SkipConcatenation optimization has been rewritten. Custom concatenation layers should implement `nvneural::IConcatenationLayer2` to participate in this optimization.
We added two new analysis layers: Saliency Generator and Saliency Mix. The Saliency Generator layer converts its input tensor into a single-channel tensor with the same H and W as its input. The Saliency Mix layer simply overlays saliency information (the output of a Saliency Generator) onto another input tensor.
Known Issues
Nsight Deep Learning Designer’s profiling features require access to GPU performance counters, which by default requires local administrator privileges. See the following page for details on how to configure your system to allow profiling without elevation. Profiling without the appropriate permissions will generate error messages of the form `ERR: xxx: Error 19 returned from Perfworks`.
The values of operations/second calculated from this NVIDIA developer tools site and generated by using the data center monitoring tools are not calculated in the same way as the operations/second used for export control purposes and should not be relied upon to assess performance against the export control limits.
The screenshot feature in the Send Feedback dialog does not work on Linux systems using Wayland as a display manager.
The ONNX Runtime DirectML profiling provider does not support machines that use a software display adapter.
There is no support for COMPLEX64 and COMPLEX128 ONNX data types.
ONNX models’ inputs and outputs only support Tensor types.
Link annotations can overlap each other if the links are too close together.
Model validation fails when processing an ONNX model with an invalid SparseTensor.
TensorRT profiling does not support control-flow operators such as Loop and If.
If an unsaved local function document is closed with the [x] button in the document tab and the user elects to save changes, the document will not close. Instead, save local function edits with the ‘Confirm Local Function Edits’ button or the Ctrl+S keyboard shortcut.
Platform Support
Linux
Nsight Deep Learning Designer supports Linux x86_64 systems running Ubuntu 20.04 LTS or newer, with GLIBC version 2.29 or higher.
The Nsight Deep Learning Designer host application requires several packages to be installed to enable Qt. Please refer to the Qt for X11 Requirements. When executing Nsight Deep Learning Designer with missing dependencies, an error message with information on the missing packages is shown. Note that only one package will be shown at a time, even though multiple may be missing from your system. The following command installs needed packages for Nsight Deep Learning Designer on X11:
Ubuntu 20.04:
apt install libopengl0 libxcb-icccm4 libxcb-image0 libxcb-keysyms1 libxcb-render-util0 libxcb-xinerama0 libxcb-xkb1 libxkbcommon-x11-0 libxcb-cursor0
NVIDIA L4T
Nsight Deep Learning Designer supports NVIDIA L4T arm64 systems running Ubuntu 20.04 LTS or newer, with GLIBC version 2.29 or higher.
The Nsight Deep Learning Designer host application requires several packages to be installed to enable Qt. Please refer to the Qt for X11 Requirements. When executing Nsight Deep Learning Designer with missing dependencies, an error message with information on the missing packages is shown. Note that only one package will be shown at a time, even though multiple may be missing from your system. The following command installs needed packages for Nsight Deep Learning Designer on X11:
Ubuntu 20.04:
apt install libopengl0 libxcb-icccm4 libxcb-image0 libxcb-keysyms1 libxcb-render-util0 libxcb-xinerama0 libxcb-xkb1 libxkbcommon-x11-0 libxcb-cursor0
Windows
Nsight Deep Learning Designer supports Windows x86_64 systems running:
Windows 10: 20H1 or newer.
Windows 11: 21H2 or newer.
GPU Support
NVIDIA Nsight Deep Learning Designer requires an NVIDIA GPU to run:
Linux x86_64
GeForce GPUs: GeForce RTX 2000 series, RTX 3000 series, or RTX 4000 series.
Data Center GPUs: A100, or H100.
NVIDIA L4T arm64
Embedded Systems: Jetson Orin.
Windows x86_64
GeForce GPUs: GeForce RTX 2000 series, RTX 3000 series, or RTX 4000 series.
Recommended Display Driver
You must have a recent NVIDIA display driver installed on your system to run Nsight Deep Learning Designer. The following display drivers are recommended:
Windows: Release 560.00 or newer.
Linux: Release 560.00 or newer.
NVIDIA L4T: NVIDIA JetPack SDK 6.1. Note: When targeting the NVIDIA L4T platform, the user (local or remote) needs to be a member of the debug group in order to profile.