Updates in 2022.4
General
Added support for the CUDA toolkit 12.0.
Added support for profiling CUDA graphs as complete workloads instead of as single kernel nodes. Enable this using the Graph Profiling option in the activities. As with range replay results, some metrics are not available when profiling graphs.
Added support for profiling on Windows Subsystem for Linux (WSL2). See the System Requirements for more details.
The deprecated nv-nsight-cu and nv-nsight-cu-cli aliases have been removed in favor of ncu-ui and ncu, respectively.
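Existing scripts that still call the removed aliases need to be updated. As a minimal sketch, assuming the scripts invoke the tools by bare alias name, a small sed filter can do the rewrite. The helper name migrate_aliases and the sample command lines are hypothetical and not part of Nsight Compute.

```shell
# Rewrite the removed aliases to their replacements (stdin -> stdout).
# The longer alias is handled first so that nv-nsight-cu-cli is not
# partially rewritten by the nv-nsight-cu rule.
migrate_aliases() {
  sed -e 's/nv-nsight-cu-cli/ncu/g' -e 's/nv-nsight-cu/ncu-ui/g'
}

# Hypothetical sample invocations taken from an older script.
echo 'nv-nsight-cu-cli -o report ./my_app' | migrate_aliases
echo 'nv-nsight-cu report.ncu-rep' | migrate_aliases
```

Review the rewritten scripts before use, since a plain text substitution also touches comments and strings that mention the old names.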
NVIDIA Nsight Compute
The Source page now loads disassembly and static analysis results asynchronously in the background.
Added a new Metric Details tool window to inspect metric information such as raw value, unit, description or instance values. Open the tool window and select a metric on the Details or Raw page, or look up any metric in the focused report directly in the tool window’s search bar.
In the Source page PTX view, the source name will be shown as a list of comma-separated files.
Added flexibility with NVTX based filtering in the Next Trigger filter, similar to the command line. Filters can now use nvtx-include and nvtx-exclude expressions by adding the nvtx-include: or nvtx-exclude: prefix.
NVTX views now show the payload type.
Simplified the command line generated by the Profile activity.
Reduced the number of steps required to re-run the Profile activity.
The way to rename Baselines in-place has been improved.
The Resources tool window now shows the CUDA Dynamic Parallelism state for CUDA functions and modules.
OptiX traversable handles can now be exported as Graphviz DOT or SVG files for visualization from the Resources tool window.
All OptiX build, instance and geometry flags can be viewed in the Acceleration Structure Viewer.
Added OptiX-specific highlight filters to the Acceleration Structure Viewer.
Added support for user-specified index strides to the Acceleration Structure Viewer.
NVIDIA Nsight Compute CLI
Added new option --graph-profiling graph to enable profiling of complete CUDA graphs as single workloads.
Added new option --filter-mode per-launch-config to enable filtering of kernel launches for each GPU launch parameter separately.
Added support to print section body item metrics on the details page with the new --print-details command line option.
Added support to select what to show in the Metric Name column on the details page with the new --print-metric-name command line option.
Removed deprecated options: --units, --fp, --summary and --kernel-base.
Added support to print launch, session, process and device attributes on the session page with the new --page session option.
Added --kill yes support for application replay mode.
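The new options above might be combined as in the following sketch. The application ./my_app, the report name graph_report and the helper function run are hypothetical. The values passed to --print-details and --print-metric-name, and the use of --launch-count alongside --filter-mode and --kill, are assumptions recalled from the CLI help and should be checked with ncu --help. The script only prints each command unless RUN=1 is set and ncu is on the PATH.

```shell
# Hypothetical target application; override with APP=/path/to/binary.
APP=${APP:-./my_app}

# Print a command line; execute it only when RUN=1 and ncu is installed.
run() {
  printf '%s\n' "$*"
  if [ "${RUN:-0}" = "1" ] && command -v ncu >/dev/null 2>&1; then
    "$@"
  fi
}

# Profile each CUDA graph as one workload instead of per kernel node.
run ncu --graph-profiling graph -o graph_report "$APP"

# Apply launch filters separately for each GPU launch configuration
# (assumption: --launch-count 1 then keeps one launch per configuration).
run ncu --filter-mode per-launch-config --launch-count 1 "$APP"

# Print section body metrics and show metric names in the Metric Name
# column (assumed values: "all" and "name").
run ncu --print-details all --print-metric-name name "$APP"

# Print the session page of a previously saved report.
run ncu --import graph_report.ncu-rep --page session

# Application replay, terminating the target once the requested
# number of launches has been profiled.
run ncu --replay-mode application --kill yes --launch-count 1 "$APP"
```

Leaving RUN unset gives a dry run, which is a convenient way to review the generated command lines before profiling.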
Resolved Issues
Fixed an issue where NVIDIA Nsight Compute could crash when continuing profiling after transposing the Raw page table.
Fixed an issue that caused closing a report document to be delayed by pending source analysis.
Fixed support for profiling applications with older OptiX versions.
Fixed display of OptiX module inputs for IR and built-in modules.