Nsight Deep Learning Designer Release Notes
Release notes and system requirements.
Updates in 2025.1.1
Nsight Deep Learning Designer 2025.1.1 includes the following fixes and updates:
Fixed a bug causing profiling to fail on Blackwell chips.
Fixed a bug where TensorRT layer metadata files were not rendered correctly in certain cases.
Added support for FP4 types when visualizing a TensorRT engine.
Fixed a bug where certain metadata (e.g. node positions) could not be saved in an ONNX local function if the parent document was saved under a different file name.
Fixed a bug where tensor sizes were not being displayed correctly in layer node terminal tooltips.
Improved visibility of selected layer nodes in the Model Canvas.
Updates in 2025.1
Nsight Deep Learning Designer 2025.1 includes the following updates:
Integrated with TensorRT 10.8.
Added support for switching between different minor versions of TensorRT.
Added support to hide or show Constant nodes in both the Model Canvas and Layer Explorer (hidden by default).
Added support to create a new ONNX node via the Ctrl+Space hotkey.
Added support for user-specified locations for storing dependency bits and timing caches via the NV_DLD_CACHE_DIR environment variable.
Fixed a bug that caused excessive CPU memory usage when loading very large ONNX models.
Fixed a bug that caused the type checker to fail for some YOLO models when not all inputs of some nodes are connected.
Fixed an issue where some ONNX node links were rendered invisible while nodes or links were being dragged across the screen.
Fixed an issue that caused the type checker to fail when an output node has the same name as an intermediate tensor.
Fixed a bug where an opened ONNX model file is not marked dirty when some nodes are renamed automatically during loading.
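The NV_DLD_CACHE_DIR variable mentioned above can be set before launching the application. A minimal sketch (the cache path and executable name are examples, not prescribed values):

```python
import os

# NV_DLD_CACHE_DIR redirects downloaded dependency bits and timing
# caches away from the default location. The directory path here is
# only an example; any writable location works.
cache_dir = os.path.join(os.path.expanduser("~"), "dld-cache")
env = dict(os.environ, NV_DLD_CACHE_DIR=cache_dir)

# Launch the application with the variable set; the executable name and
# install path depend on your system, so the call is left commented out.
# subprocess.run(["<path-to-nsight-dl-designer>"], env=env)
```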
Nsight Deep Learning Designer depends on the following components and will download some of them automatically when needed:
Component | Version
---|---
CUDA | 12.8
cuDNN | 8.9.7
DirectML | 1.15.2
Nsight Systems | 2024.6.1
ONNX IR | 9
ONNX GraphSurgeon | 0.5.1
ONNX Opsets | 1 to 20
ONNX Runtime | 1.17.3
Polygraphy | 0.49.9
TensorRT | 10.8
WinPixEventRuntime | 1.0.240308001
Updates in 2024.1
Nsight Deep Learning Designer has been completely revamped. In this release, we’ve embraced ONNX to provide broader and more flexible support for DL models. More importantly, we have adopted TensorRT, NVIDIA’s signature inference solution, to replace NvNeural as the underlying inference engine. Some notable changes in this release:
Nsight Deep Learning Designer is now a full-fledged ONNX model editor. ONNX models can be visualized, edited, and profiled.
We revamped how nodes are represented on the canvas.
Added a new layout algorithm that arranges nodes on the canvas more compactly.
Added support for ONNX opset version 1 to 20, and ONNX Runtime Contrib opset version 1.
Added support for subgraphs and local functions in ONNX models.
Added support to extract subgraphs or a selection of nodes to a standalone ONNX model.
Added support to export a TensorRT engine from an ONNX model.
Added support for visualizing TensorRT engine graphs.
Added support for TensorRT and ONNX Runtime profiling for ONNX models. Profiling generates a report document with detailed inference performance data for the model.
Added a new model validator based on Polygraphy to report any errors, warnings, or issues caused by the current model structure.
Added support to run custom tools on ONNX models from Nsight Deep Learning Designer.
Added batch modification actions for ONNX models: convert model to FP16, sanitize model, and batch tensor conversion.
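The "convert model to FP16" action above casts each weight tensor to half precision. A sketch of the numeric essence only (not the tool's implementation), using Python's standard `struct` module to round-trip IEEE 754 half-precision values:

```python
import struct

# Illustrative only: the core of an FP32 -> FP16 conversion is a cast
# of each weight. struct's "e" format packs IEEE 754 half precision.
def to_fp16(value):
    # Pack as half precision and unpack to observe the rounded value.
    return struct.unpack("<e", struct.pack("<e", value))[0]

weights_fp32 = [0.1, -2.5, 65504.0]
weights_fp16 = [to_fp16(w) for w in weights_fp32]
# 65504 is the largest finite FP16 value; larger magnitudes overflow,
# which is why FP16 conversion can change model accuracy.
```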
Nsight Deep Learning Designer depends on the following components and will download some of them automatically when needed:
Component | Version
---|---
CUDA | 12.7
cuDNN | 8.9.7
DirectML | 1.15.2
Nsight Systems | 2024.6.1
ONNX IR | 9
ONNX GraphSurgeon | 0.5.1
ONNX Opsets | 1 to 20
ONNX Runtime | 1.17.3
Polygraphy | 0.49.9
TensorRT | 10.7
WinPixEventRuntime | 1.0.240308001
As part of the transition to ONNX and TensorRT, we deprecated the following:
Editing and visualizing NvNeural models
Importing/exporting NvNeural models from/to PyTorch
NvNeural SDK
Analysis mode and channel inspector
Updates in 2022.2
NVIDIA Nsight Deep Learning Designer changes in version 2022.2:
We added support to launch the PyTorch exporter from a virtual environment (Conda or virtualenv).
We improved the overall performance of Channel Inspector by separating the visualization of per-layer weights from per-layer features and changing the way how we visualize NxCx1x1 weights.
We added an experimental feature that allows users to directly import existing PyTorch models into NVIDIA Nsight Deep Learning Designer without starting from scratch.
We switched to using cubic curves to represent layer links to reduce path-finding overhead.
We added support to visualize inference results of classification networks in the Analysis Mode.
We added support for custom padding (in addition to the existing `same` and `valid` options) to the relevant layers.
We fixed numerous bugs.
We have also decided to remove some uncommonly used operators as we modernize the inference library and unify our model exporters. The following layers are now deprecated and will be removed in a future product release:
Local response normalization (LRN) layer: Only `region` values of `within` are being removed. Normalization `across` channels is still fully supported.
Mono-four-stack layer: Replace with a custom layer.
Mono-to-RGB layer: Replace with a custom layer.
Network layer: Import the subnetwork as a template instead.
Output layer: Use of output layers for tensor slicing (the `width`, `height`, `channels`, and `offset` parameters) is deprecated. Use an explicit slice layer if these operations are required.
The following activations are now deprecated:
Leaky sigmoid activation: Replace with a custom layer.
Leaky tanh activation: Replace with a custom layer.
ReLU activation with `alpha` values other than zero: being removed. Normal ReLU is still fully supported. The element-wise `max` layer can replace these activations.
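The suggested replacement works because a "leaky" ReLU with 0 < alpha < 1 is equivalent to an element-wise maximum of the input and its scaled copy. A plain-Python sketch (the alpha value is illustrative):

```python
# For 0 < alpha < 1: leaky_relu(x) == max(x, alpha * x).
# When x > 0, x dominates; when x < 0, alpha * x is closer to zero.
def leaky_relu(x, alpha=0.1):
    return x if x > 0 else alpha * x

def leaky_relu_via_max(x, alpha=0.1):
    return max(x, alpha * x)

for x in (-3.0, 0.0, 2.5):
    assert leaky_relu(x) == leaky_relu_via_max(x)
```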
NvNeural changes in version 2022.2:
Changed the signature of `nvneural::XmlNetworkBuilder::createLayerObject` to receive the original serialized type used to select the layer object being instantiated. Custom classes deriving from `XmlNetworkBuilder` must be updated if they override this function.
Updates in 2022.1
NVIDIA Nsight Deep Learning Designer changes in version 2022.1:
Added support to save all tensors in the analysis mode.
Added support for using nested templates to construct hierarchical network graphs.
We significantly improved the performance of the type-checking process in the Editor.
Fixed a bug that prevented PyTorch exports on Linux from succeeding.
Removed clamping behavior from the Affine layer. It no longer restricts the values of its `scale` and `offset` parameters. The `options_on` parameter has been deprecated; users wishing to hide interactive controls for this layer during analysis should set the new `include_ui` parameter to false.
Fixed a bug that blocked FP16 inference when fusing 7x7 convolutions with batch normalizations.
NvNeural changes in version 2022.1:
Added a new analysis layer: Signal Injector.
Added a new Input (Constant) layer which supports direct embedding of scalar constants.
Optimized the performance of the BatchNorm layer.
Optimized the performance of the Upscale layer.
Added support for downscaling and fixed-size scaling to the Upscale layer.
The NvRTC wrapper in `nvneural::ICudaRuntimeCompiler` has been replaced with a stub when type-checking networks from the GUI. Plugins that rely on the ability to execute generated kernel code during initialization or `nvneural::ILayer::reshape` should call NvRTC directly, but for performance reasons we do not recommend this approach.
The `forward()` function in the exported PyTorch class now takes keyword-only arguments. Users should explicitly name the input parameters when calling the model/function.
The `INetwork::inferenceSubgraph` method now applies queued reshape operations. Queued reshapes are not cleared upon failure and will continue to block `inference` and `inferenceSubgraph` calls until they succeed.
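The keyword-only calling convention mentioned above can be sketched in plain Python (the class and input names here are hypothetical; the real exported class derives from `torch.nn.Module`):

```python
# The bare "*" makes every parameter after it keyword-only, so callers
# must name each input explicitly.
class ExportedModel:
    def forward(self, *, image, mask):
        # ... run inference on the named inputs (stubbed here) ...
        return image, mask

model = ExportedModel()
out = model.forward(image=[1.0], mask=[0.0])   # OK: inputs named
# model.forward([1.0], [0.0])                  # raises TypeError
```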
Updates in 2021.2
NVIDIA Nsight Deep Learning Designer changes in version 2021.2:
The Channel Inspector can display summary and per-channel statistics about a layer’s output tensor: mean, minimum/maximum, standard deviation, and sparsity (percentage of tensor elements close to zero).
Output tensor shapes are now visible during editing.
ConverenceNG can now save network outputs as .npy files.
Users can now expand or collapse the parameters list of a layer glyph in the editor view.
In Channel Inspector, users can now toggle a checkbox to perform auto scale and shift of the channels.
Users can now save a template as a file that can be imported into another model.
We have added more data to network profiling reports:
Per-layer device memory footprints
Whole-network device memory footprint
Percentage view for inference timings
Layers’ distance from the nearest input, for sorting by network depth
Template-level inference timings
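The Channel Inspector statistics described above can be reproduced with straightforward arithmetic. A plain-Python sketch (the near-zero threshold used for sparsity here is an assumption, not the tool's documented value):

```python
import math

# Summary statistics over a flattened output tensor: mean, min/max,
# population standard deviation, and sparsity (percentage of elements
# whose magnitude is at or below a small threshold).
def tensor_stats(values, eps=1e-6):
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    return {
        "mean": mean,
        "min": min(values),
        "max": max(values),
        "std": math.sqrt(var),
        "sparsity": 100.0 * sum(abs(v) <= eps for v in values) / n,
    }

stats = tensor_stats([0.0, 0.0, 1.0, -1.0])
```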
NvNeural changes in version 2021.2:
Plugin initialization has been refactored to reduce its reliance on translation-unit-scoped static initialization. The ExportPlugin framework now expects user plugin code to provide an implementation of the function `void nvneural::plugin::InitializePluginTypes()`. This function should call static ClassRegistry methods to make its export types visible to the client application.
The SkipConcatenation optimization has been rewritten. Custom concatenation layers should implement `nvneural::IConcatenationLayer2` to participate in this optimization.
We added two new analysis layers: Saliency Generator and Saliency Mix. The Saliency Generator layer converts its input tensor into a single-channel tensor with the same H and W as its input. The Saliency Mix layer simply overlays saliency information (the output of a Saliency Generator) onto another input tensor.
Known Issues
Nsight Deep Learning Designer’s profiling features require access to GPU performance counters, which by default requires local administrator privileges. See the following page for details on how to configure your system to allow profiling without elevation. Profiling without the appropriate permissions will generate error messages of the form `ERR: xxx: Error 19 returned from Perfworks`.
The values of operations/second calculated from this NVIDIA developer tools site and generated by using the data center monitoring tools are not calculated in the same way as the operations/second used for export control purposes and should not be relied upon to assess performance against the export control limits.
The screenshot feature in the Send Feedback dialog does not work on Linux systems using Wayland as a display manager.
The ONNX Runtime DirectML profiling provider does not support machines that use a software display adapter.
There is no support for COMPLEX64 and COMPLEX128 ONNX data types.
ONNX models’ inputs and outputs only support Tensor types.
Link annotations can overlap each other if the links are too close together.
Model validation fails when processing an ONNX model with an invalid SparseTensor.
TensorRT profiling does not support control-flow operators such as Loop and If.
If an unsaved local function document is closed with the [x] button in the document tab and the user elects to save changes, the document will not close. Instead, save local function edits with the ‘Confirm Local Function Edits’ button or the Ctrl+S keyboard shortcut.
Platform Support
Linux
Nsight Deep Learning Designer supports Linux x86_64 systems running Ubuntu 20.04 LTS or newer, with GLIBC version 2.29 or higher.
The Nsight Deep Learning Designer host application requires several packages to be installed to enable Qt. Please refer to the Qt for X11 Requirements. When executing Nsight Deep Learning Designer with missing dependencies, an error message with information on the missing packages is shown. Note that only one package will be shown at a time, even though multiple may be missing from your system. The following command installs needed packages for Nsight Deep Learning Designer on X11:
Ubuntu 20.04:
apt install libopengl0 libxcb-icccm4 libxcb-image0 libxcb-keysyms1 libxcb-render-util0 libxcb-xinerama0 libxcb-xkb1 libxkbcommon-x11-0 libxcb-cursor0
NVIDIA L4T
Nsight Deep Learning Designer supports NVIDIA L4T arm64 systems running Ubuntu 20.04 LTS or newer, with GLIBC version 2.29 or higher.
The Nsight Deep Learning Designer host application requires several packages to be installed to enable Qt. Please refer to the Qt for X11 Requirements. When executing Nsight Deep Learning Designer with missing dependencies, an error message with information on the missing packages is shown. Note that only one package will be shown at a time, even though multiple may be missing from your system. The following command installs needed packages for Nsight Deep Learning Designer on X11:
Ubuntu 20.04:
apt install libopengl0 libxcb-icccm4 libxcb-image0 libxcb-keysyms1 libxcb-render-util0 libxcb-xinerama0 libxcb-xkb1 libxkbcommon-x11-0 libxcb-cursor0
Windows
Nsight Deep Learning Designer supports Windows x86_64 systems running:
Windows 10: 20H1 or newer.
Windows 11: 21H2 or newer.
GPU Support
NVIDIA Nsight Deep Learning Designer requires an NVIDIA GPU to run:
Linux x86_64
GeForce GPUs: GeForce RTX 2000 series, RTX 3000 series, or RTX 4000 series.
Data Center GPUs: A100, or H100.
NVIDIA L4T arm64
Embedded Systems: Jetson Orin.
Windows x86_64
GeForce GPUs: GeForce RTX 2000 series, RTX 3000 series, or RTX 4000 series.
Recommended Display Driver
You must have a recent NVIDIA display driver installed on your system to run Nsight Deep Learning Designer. The following display drivers are recommended:
Windows: Release 560.00 or newer.
Linux: Release 560.00 or newer.
NVIDIA L4T: NVIDIA JetPack SDK 6.1. Note: When targeting the NVIDIA L4T platform, the user (local or remote) needs to be a member of the debug group in order to profile.