Updates in 2019.4
General
Added support for the Linux PowerPC target platform
Reduced the profiling overhead, especially if no source metrics are collected
Reduced the overhead for non-profiled kernels
Improved the deployment performance during remote launches
Trying to profile on an unsupported GPU now shows an “Unsupported GPU” error message
Added support for the %i sequential number placeholder to generate unique report file names
Added support for smsp__sass_* metrics on Volta and newer GPUs
The launch__occupancy_limit_shared_mem metric now reports the device block limit if no shared memory is used by the kernel
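The %i placeholder listed above can be pictured with a small sketch. This is an illustration of the general idea only, not the tool's implementation: the function name expand_sequence_placeholder, the starting value of 1, and the choice of the lowest unused number are all assumptions made for the example.

```python
import os


def expand_sequence_placeholder(template, exists=os.path.exists, start=1):
    """Replace the first %i in `template` with the lowest number, counting
    up from `start`, for which the resulting file name is not yet taken.

    `exists` is injectable so the logic can be exercised without touching
    the file system. All names here are hypothetical.
    """
    if "%i" not in template:
        # Nothing to expand; return the name unchanged.
        return template
    number = start
    while True:
        candidate = template.replace("%i", str(number), 1)
        if not exists(candidate):
            return candidate
        number += 1


if __name__ == "__main__":
    # Pretend two reports already exist on disk.
    taken = {"report_1.out", "report_2.out"}
    print(expand_sequence_placeholder("report_%i.out", exists=taken.__contains__))
```

With the two names in `taken` already in use, the sketch moves on to the next free number, so repeated runs do not overwrite earlier reports.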
NVIDIA Nsight Compute
The Profile activity shows the command line used to launch ncu
The heatmap on the Source page now shows the represented metric in its tooltip
The Memory Workload Analysis Chart on the Details page now supports baselines
When applying rules, a message displaying the number of new rule results is shown in the status bar
The Visual Profiler Transition Guide was added to the documentation
Connection dialog activity options were added to the documentation
A warning dialog is shown if the application is resumed without Auto-Profile enabled
Pausing the application now has immediate feedback in the toolbar controls
Added a Close All command to the File menu
NVIDIA Nsight Compute CLI
The --query-metrics option now shows only metric base names for faster metric query. The new option --query-metrics-mode can be used to display the valid suffixes for each base metric.
Added support for passing response files using the @ operator to specify command line options through a file
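The response-file idea mentioned above (options read from a file named after an @ prefix) is a common command line convention. The sketch below shows the convention using Python's argparse, which supports it through fromfile_prefix_chars. It does not reproduce the Nsight Compute CLI parser or its file format; the options --label and --repeat are hypothetical, and the one-argument-per-line layout is argparse's default, not a documented property of the tool.

```python
import argparse
import os
import tempfile


def parse_with_response_file(argv):
    """Parse argv, expanding any '@file' argument into the options
    stored in that file (one argument per line, argparse's default)."""
    parser = argparse.ArgumentParser(fromfile_prefix_chars="@")
    parser.add_argument("--label")             # hypothetical option
    parser.add_argument("--repeat", type=int)  # hypothetical option
    return parser.parse_args(argv)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "options.txt")
        with open(path, "w") as handle:
            # Each line of the response file becomes one argument.
            handle.write("--label\nbaseline\n--repeat\n3\n")
        args = parse_with_response_file(["@" + path])
        print(args.label, args.repeat)
```

Keeping long option lists in a file like this makes invocations reproducible and avoids shell length and quoting limits.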
Resolved Issues
Fixed an issue that reported the wrong executable name in the Session page when attaching
Fixed issues that caused chart labels to be shown elided on the Details page
Fixed an issue that caused the cache hit rates to be shown incorrectly when baselines were added
Fixed an illegal memory access when collecting sass__*_histogram metrics for applications using PyTorch on Pascal GPUs
Fixed an issue when attempting to collect all smsp__* metrics on Volta and newer GPUs
Fixed an issue when profiling multi-context applications
Fixed that profiling start/stop settings from the connection dialog weren’t properly passed to the interactive profile activity
Fixed that certain smsp__warp_cycles_per_issue_stall* metrics returned negative values on Pascal GPUs
Fixed that metric names were truncated in the --page details non-CSV command line output
Fixed that the target application could crash if a connection port was used by another application with higher privileges