Updates in 2019.5

General

  • Added section sets to reduce the default overhead and make it easier to configure metric sets for profiling

  • Reduced the size of the installation

  • Added support for the CUDA Graphs Recapture API

  • The NvRules API now supports accessing correlation IDs for instanced metrics

  • Added breakdown tables for SOL SM and SOL Memory in the Speed Of Light section for Volta+ GPUs

NVIDIA Nsight Compute

  • Added a snap-select feature to the Source page heatmap to help navigate large files

  • Added support for loading remote CUDA-C source files via SSH on demand for Linux x86_64 targets

  • Charts on the Details page provide better help in tool tips when hovering over metric names

  • Improved the performance of the Source page when scrolling or collapsing

  • The charts for Warp States and Compute pipelines are now sorted by value

NVIDIA Nsight Compute CLI

  • Added support for GPU cache control, see --cache-control

  • Added support for setting the kernel name base in command line output, see --kernel-base

  • Added support for listing the available names for --chips, see --list-chips

  • Improved the stability on Windows when using --target-processes all

  • Reduced the profiling overhead for small metric sets in applications with many kernels
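
The command line options named in the list above can be combined in a single invocation. The following is a minimal sketch that only assembles such command lines and does not run the profiler. The executable name nv-nsight-cu-cli, the option values "none" and "demangled", the helper function names, and the application path ./my_app are assumptions made for illustration; consult the tool's --help output for the values your version accepts.

```python
import shlex

# Assumed executable name for the CLI; adjust to match your installation.
NCU_CLI = "nv-nsight-cu-cli"


def build_profile_command(app, app_args=(), cache_control=None,
                          kernel_base=None, all_processes=False):
    """Assemble (but do not run) a profiling command line as an argv list.

    Only options mentioned in the release notes are used. The values
    passed for them are supplied by the caller and are not validated here.
    """
    cmd = [NCU_CLI]
    if cache_control is not None:
        # GPU cache control, see --cache-control
        cmd += ["--cache-control", cache_control]
    if kernel_base is not None:
        # Kernel name base used in command line output, see --kernel-base
        cmd += ["--kernel-base", kernel_base]
    if all_processes:
        # Profile child processes as well as the root process
        cmd += ["--target-processes", "all"]
    cmd.append(app)
    cmd.extend(app_args)
    return cmd


def build_list_chips_command():
    """Assemble the command that lists the names accepted by --chips."""
    return [NCU_CLI, "--list-chips"]


def to_shell(cmd):
    """Render an argv list as a single shell-quoted string."""
    return " ".join(shlex.quote(part) for part in cmd)


if __name__ == "__main__":
    # Hypothetical application and arguments, for illustration only.
    print(to_shell(build_profile_command(
        "./my_app", ["--size", "1024"],
        cache_control="none", kernel_base="demangled", all_processes=True)))
    print(to_shell(build_list_chips_command()))
```

Building the command as an argv list and quoting it only for display keeps application arguments containing spaces intact if the list is later passed to a process launcher.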

Resolved Issues

  • Reduced the overhead caused by demangling kernel names multiple times

  • Fixed an issue where kernel names were not demangled in the CUDA Graph Nodes resources window

  • The connection dialog better disables unsupported combinations or warns of invalid entries

  • Fixed metric thread_inst_executed_true to derive from smsp_not_predicated_off_thread_inst_executed on Volta+ GPUs

  • Fixed an issue with computing the theoretical occupancy on GV100

  • Selecting an entry on the Source page heatmap no longer selects the respective source line, to avoid losing the current selection

  • Fixed the current view indicator of the Source page heatmap to be line-accurate

  • Fixed an issue when comparing metrics from Pascal and later architectures on the Summary page

  • Fixed an issue where metrics representing constant values on Volta+ GPUs couldn’t be collected without non-constant metrics