Updates in 2023.3
General
NVIDIA Nsight Compute now supports collecting many metrics by sampling the GPU’s performance monitors (PM) periodically at fixed intervals. The results can be visualized on a timeline.
Added WSL profiling support on Windows 10 WSL with OS build version 19044 and greater. WSL profiling is not supported on Windows 10 WSL for systems that exceed 1 TB of system memory.
Rule outputs are prioritized to improve the accuracy of estimated speedups. The Summary page now shows the most actionable optimization advice when a result row is selected.
Improved the handling and reporting for unavailable metrics during collection and when applying rules.
Added instructionMix sample CUDA application and document to show how to use NVIDIA Nsight Compute to analyze and identify the performance bottleneck due to an imbalanced instruction mix. Refer to the README.TXT file, sample code, and document under extras/samples/instructionMix.
NVIDIA Nsight Compute
Added support to see the source files of two profile results side by side using Source Comparison. This allows you to quickly identify source differences and understand changes in metric values.
The Summary page is now the default page when a report is opened. The previous behavior can be restored in the options dialog.
On the Summary and Raw pages, values from all/selected rows are automatically aggregated in the column header for applicable metrics. Selected individual cells are aggregated in the bottom status bar.
Added Launch Name and Device options in the filter dialog launched by the Apply Filters button in the report header.
Added support for source view profiles that persist the Source page configuration and allow you to re-apply it to other reports.
The Metric Details tool window now supports querying metrics beyond the current report by using the chip:<chipname> tag in the search.
Added support for CUDA Graph Edge Data (such as port and dependency type) and CUDA Graph Conditional Handles in the Resources tool window.
The Acceleration Structure Viewer and Resources tool window now support OptiX Opacity Micromaps.
NVIDIA Nsight Compute CLI
Tracking and profiling all child processes (--target-processes all) is now the default for ncu.
Improved reporting of requested but unavailable metrics. Metrics requested in section files are by default considered optional and only cause a warning to be shown.
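Scripts that relied on the earlier behavior may need to pass the process-tracking option explicitly now that all child processes are profiled by default. The following is a minimal sketch of a wrapper that assembles an ncu command line. The helper name build_ncu_command, the report name, and the ./my_app target are hypothetical and chosen only for illustration. The sketch assumes the documented -o and --target-processes application-only options, and it only builds and prints the command without running it.

```python
import shlex


def build_ncu_command(app_argv, report_name, application_only=False):
    """Return an argv list for ncu (hypothetical helper, illustration only)."""
    cmd = ["ncu", "-o", report_name]
    if application_only:
        # Opt out of the new default and profile only the root application process.
        cmd += ["--target-processes", "application-only"]
    # With no flag, ncu 2023.3 tracks and profiles all child processes.
    return cmd + list(app_argv)


# Default behavior in 2023.3: child processes are profiled as well.
default_cmd = build_ncu_command(["./my_app", "--size", "1024"], "report")

# Restrict profiling to the root process, as in earlier releases.
root_only_cmd = build_ncu_command(["./my_app"], "report", application_only=True)

print(shlex.join(default_cmd))
print(shlex.join(root_only_cmd))
```

The resulting list can be passed to subprocess.run on a machine where ncu is installed. Building the list separately keeps the choice of process tracking visible in one place.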
Resolved Issues
Support for tracking child processes launched with system() is available on Linux ppc64le.
Improved the behavior of following SASS navigation links on the Source page.
Fixed issues with profiling CUDA graphs in graph-profiling mode when nodes are associated with a non-current CUDA context.
Fixed an issue in L2 bandwidth calculations in the hierarchical roofline sections.