Updates in 2019.5
General
Added section sets to reduce the default overhead and make it easier to configure metric sets for profiling
Reduced the size of the installation
Added support for CUDA Graphs Recapture API
The NvRules API now supports accessing correlation IDs for instanced metrics
Added breakdown tables for SOL SM and SOL Memory in the Speed Of Light section for Volta+ GPUs
NVIDIA Nsight Compute
Added a snap-select feature to the Source page heatmap to help navigate large files
Added support for loading remote CUDA-C source files via SSH on demand for Linux x86_64 targets
Charts on the Details page provide better help in tool tips when hovering over metric names
Improved the performance of the Source page when scrolling or collapsing
The charts for Warp States and Compute pipelines are now sorted by value
NVIDIA Nsight Compute CLI
Added support for GPU cache control, see --cache-control
Added support for setting the kernel name base in command line output, see --kernel-base
Added support for listing the available names for --chips, see --list-chips
Improved the stability on Windows when using --target-processes all
Reduced the profiling overhead for small metric sets in applications with many kernels
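
The new command line options can be combined in a single invocation. The following is a minimal sketch in Python that only assembles such a command line and does not run it. The option names come from the list above. The executable name nv-nsight-cu-cli, the option values none and demangled, the application path ./my_app, and the helper build_profile_command are assumptions made for illustration, so check the CLI help output for the values your version accepts.

```python
import shlex


def build_profile_command(app, cache_control=None, kernel_base=None,
                          all_processes=False, cli="nv-nsight-cu-cli"):
    """Assemble an argument list for the Nsight Compute CLI.

    Hypothetical helper: the default executable name and any values
    passed for the options are assumptions, not taken from the notes.
    """
    cmd = [cli]
    if cache_control is not None:
        # GPU cache control option introduced in this release
        cmd += ["--cache-control", cache_control]
    if kernel_base is not None:
        # Kernel name base used in command line output
        cmd += ["--kernel-base", kernel_base]
    if all_processes:
        # Profile the whole process tree, not only the root process
        cmd += ["--target-processes", "all"]
    cmd.append(app)
    return cmd


# Example values are assumed; "./my_app" is a placeholder application.
cmd = build_profile_command("./my_app", cache_control="none",
                            kernel_base="demangled", all_processes=True)
print(" ".join(shlex.quote(arg) for arg in cmd))
```

Keeping the command as a list of arguments avoids shell quoting problems if it is later passed to subprocess.run.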
Resolved Issues
Reduced the overhead caused by demangling kernel names multiple times
Fixed an issue where kernel names were not demangled in the CUDA Graph Nodes resources window
The connection dialog better disables unsupported combinations or warns of invalid entries
Fixed metric thread_inst_executed_true to derive from smsp_not_predicated_off_thread_inst_executed on Volta+ GPUs
Fixed an issue with computing the theoretical occupancy on GV100
Selecting an entry on the Source page heatmap no longer selects the respective source line, to avoid losing the current selection
Fixed the current view indicator of the Source page heatmap to be line-accurate
Fixed an issue when comparing metrics from Pascal and later architectures on the Summary page
Fixed an issue where metrics representing constant values on Volta+ GPUs could not be collected without non-constant metrics