Updates in 2019.1
General
Support for CUDA 10.1
Improved performance
Bug fixes
Profiling on Volta GPUs now uses the same metric names as on Turing GPUs
Section files support descriptions
The default sections and rules directory has been renamed to sections
NVIDIA Nsight Compute
Added new profiling options to the options dialog
Details page shows rule result icons in the section headers
Section descriptions are shown in the details page and in the sections tool window
Source page supports collapsing multiple source files or functions to show aggregated results
Source page heatmap color scale has changed
Invalid metric results are highlighted in the profiler report
Loaded section and rule files can be opened from the sections tool window
NVIDIA Nsight Compute CLI
Support for profiling child processes on Linux and Windows x86_64 targets
NVIDIA Nsight Compute CLI uses a temporary file if no output file is specified
Support for new --quiet option
Support for setting the GPU clock control mode using new --clock-control option
Details page output shows the NVTX context when --nvtx is enabled
Support for filtering kernel launches for profiling based on their NVTX context using new --nvtx-include and --nvtx-exclude options
Added new --summary options for aggregating profiling results
Added option --open-in-ui to open reports collected with NVIDIA Nsight Compute CLI directly in NVIDIA Nsight Compute
Resolved Issues
Installation directory scripts use absolute paths
OpenACC kernel names are correctly demangled
Profile activity report file supports a relative path
Source view can resolve all applicable files at once
UI font colors are improved
Details page layout and label elision issues are resolved
Turing metrics are properly reported on the Summary page
All byte-based metrics use a factor of 1000 when scaling units to follow SI standards
CSV exports properly align columns with empty entries
Fixed the metric computation for double_precision_fu_utilization on GV11b
Fixed incorrect ‘selected’ PC sampling counter values
The SpeedOfLight section uses ‘max’ instead of ‘avg’ cycles metrics for Elapsed Cycles
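
To illustrate the SI scaling of byte-based metrics noted above, here is a minimal Python sketch of decimal unit scaling, where each step divides by 1000 rather than 1024. It is not the tool's implementation: the function name format_bytes_si, the unit labels, and the two-decimal formatting are assumptions chosen for illustration.

```python
def format_bytes_si(num_bytes: float) -> str:
    """Scale a byte count with SI (decimal) prefixes, i.e. 1 kB = 1000 B.

    Hypothetical helper for illustration only; labels and precision
    are assumptions, not the tool's actual output format.
    """
    units = ["B", "kB", "MB", "GB", "TB"]
    value = float(num_bytes)
    for unit in units:
        # Stop once the value fits below the next prefix, or at the largest unit.
        if abs(value) < 1000.0 or unit == units[-1]:
            return f"{value:.2f} {unit}"
        value /= 1000.0  # SI step: factor of 1000, not 1024


if __name__ == "__main__":
    for sample in (999, 1536, 1_000_000):
        print(sample, "->", format_bytes_si(sample))
```

With a factor of 1000, a value of 1536 bytes scales to 1.54 kB. Under binary scaling with a factor of 1024, the same value would be 1.5 KiB.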