Updates in 2019.4

General

  • Added support for the Linux PowerPC target platform

  • Reduced the profiling overhead, especially if no source metrics are collected

  • Reduced the overhead for non-profiled kernels

  • Improved the deployment performance during remote launches

  • Trying to profile on an unsupported GPU now shows an “Unsupported GPU” error message

  • Added support for the %i sequential number placeholder to generate unique report file names

  • Added support for smsp__sass_* metrics on Volta and newer GPUs

  • The launch__occupancy_limit_shared_mem metric now reports the device block limit if no shared memory is used by the kernel
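
As an illustration of the %i placeholder mentioned above, the following sketch profiles the same application twice into distinct report files. It assumes the CLI executable is on PATH as ncu (the name used in this document) and that -o sets the report output name; ./my_app is a hypothetical application, not something shipped with the tool. The exact numbers substituted for %i are not stated here, so none are shown.

```shell
#!/bin/sh
# Hypothetical application binary; replace with your own.
APP=./my_app

# %i is expanded by the profiler to a sequential number, so a second
# run does not overwrite the report written by the first run.
REPORT='report_%i'

if command -v ncu >/dev/null 2>&1 && [ -x "$APP" ]; then
    ncu -o "$REPORT" "$APP"
    ncu -o "$REPORT" "$APP"
else
    # Fallback so the sketch can be read and run without the tool installed.
    echo "would run twice: ncu -o $REPORT $APP"
fi
```

Quoting the report name keeps the shell from touching the % character; the placeholder is interpreted by the profiler, not by the shell.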

NVIDIA Nsight Compute

  • The Profile activity shows the command line used to launch ncu

  • The heatmap on the Source page now shows the represented metric in its tooltip

  • The Memory Workload Analysis Chart on the Details page now supports baselines

  • When applying rules, a message displaying the number of new rule results is shown in the status bar

  • The Visual Profiler Transition Guide was added to the documentation

  • Connection dialog activity options were added to the documentation

  • A warning dialog is shown if the application is resumed without Auto-Profile enabled

  • Pausing the application now has immediate feedback in the toolbar controls

  • Added a Close All command to the File menu

NVIDIA Nsight Compute CLI

  • The --query-metrics option now lists only metric base names, which makes metric queries faster. The new option --query-metrics-mode can be used to display the valid suffixes for each base metric.

  • Added support for passing response files using the @ operator to specify command line options through a file
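
The two CLI changes above could be combined roughly as follows. This is a sketch under assumptions, not a reference: the executable name ncu follows this document, ./my_app and ncu_options.txt are hypothetical names, the mode value suffix and its pairing with --metrics are taken from later CLI documentation rather than from these notes, and the options are kept on a single line because the accepted response file layout is not described here.

```shell
#!/bin/sh
# Hypothetical names; replace with your own application and file.
APP=./my_app
OPTS=ncu_options.txt

# Store the command line options in a response file.
# Single quotes keep %i literal; the profiler expands it later.
printf '%s\n' '--launch-skip 2 --launch-count 5 -o report_%i' > "$OPTS"

if command -v ncu >/dev/null 2>&1; then
    # List metric base names only (the new default behavior).
    ncu --query-metrics
    # Assumed usage: show the valid suffixes for one base metric.
    ncu --query-metrics-mode suffix --metrics sm__cycles_elapsed
    # Pass the stored options with the @ operator.
    if [ -x "$APP" ]; then
        ncu "@$OPTS" "$APP"
    fi
else
    # Fallback so the sketch can be run without the tool installed.
    echo "would run: ncu @$OPTS $APP"
fi
```

Keeping long option sets in a file makes a profiling configuration easy to share and to reuse across runs.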

Resolved Issues

  • Fixed an issue that reported the wrong executable name in the Session page when attaching

  • Fixed issues that caused chart labels to be shown elided on the Details page

  • Fixed an issue that caused the cache hit rates to be shown incorrectly when baselines were added

  • Fixed an illegal memory access when collecting sass__*_histogram metrics for applications using PyTorch on Pascal GPUs

  • Fixed an issue when attempting to collect all smsp__* metrics on Volta and newer GPUs

  • Fixed an issue when profiling multi-context applications

  • Fixed an issue where profiling start/stop settings from the connection dialog were not properly passed to the interactive profile activity

  • Fixed an issue where certain smsp__warp_cycles_per_issue_stall* metrics returned negative values on Pascal GPUs

  • Fixed an issue where metric names were truncated in the --page details non-CSV command line output

  • Fixed an issue where the target application could crash if a connection port was used by another application with higher privileges