Updates in 2019.1

General

  • Support for CUDA 10.1

  • Improved performance

  • Bug fixes

  • Profiling on Volta GPUs now uses the same metric names as on Turing GPUs

  • Section files support descriptions

  • The default sections and rules directory has been renamed to sections

NVIDIA Nsight Compute

  • Added new profiling options to the options dialog

  • Details page shows rule result icons in the section headers

  • Section descriptions are shown in the details page and in the sections tool window

  • Source page supports collapsing multiple source files or functions to show aggregated results

  • Source page heatmap color scale has changed

  • Invalid metric results are highlighted in the profiler report

  • Loaded section and rule files can be opened from the sections tool window

NVIDIA Nsight Compute CLI

  • Support for profiling child processes on Linux and Windows x86_64 targets

  • NVIDIA Nsight Compute CLI uses a temporary file if no output file is specified

  • Support for new --quiet option

  • Support for setting the GPU clock control mode using the new --clock-control option

  • Details page output shows the NVTX context when --nvtx is enabled

  • Support for filtering which kernel launches are profiled, based on their NVTX context, using the new --nvtx-include and --nvtx-exclude options

  • Added new --summary options for aggregating profiling results

  • Added option --open-in-ui to open reports collected with NVIDIA Nsight Compute CLI directly in NVIDIA Nsight Compute
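
As a rough illustration of how the new command line options listed above might be combined, the following Python sketch assembles an argument list for a profiling run. It only builds the list and does not launch anything. The helper name build_profile_command, the executable name nv-nsight-cu-cli, the assumption that the NVTX filters are passed together with --nvtx, and the shape of every option value are assumptions made for this example, not details confirmed by these release notes.

```python
from typing import List, Optional


def build_profile_command(
    app: List[str],
    executable: str = "nv-nsight-cu-cli",  # assumed executable name
    quiet: bool = False,
    clock_control: Optional[str] = None,
    nvtx_include: Optional[List[str]] = None,
    nvtx_exclude: Optional[List[str]] = None,
    summary: Optional[str] = None,
    open_in_ui: bool = False,
) -> List[str]:
    """Assemble an argument list (hypothetical helper, not part of the tool)."""
    cmd = [executable]
    if quiet:
        cmd.append("--quiet")
    if clock_control is not None:
        # The accepted mode names are not listed in these notes.
        cmd += ["--clock-control", clock_control]
    if nvtx_include or nvtx_exclude:
        # Assumption: NVTX filtering is used together with --nvtx.
        cmd.append("--nvtx")
        for expr in nvtx_include or []:
            cmd += ["--nvtx-include", expr]
        for expr in nvtx_exclude or []:
            cmd += ["--nvtx-exclude", expr]
    if summary is not None:
        # The accepted aggregation modes are not listed in these notes.
        cmd += ["--summary", summary]
    if open_in_ui:
        cmd.append("--open-in-ui")
    # The application and its own arguments come last.
    cmd += list(app)
    return cmd


if __name__ == "__main__":
    # All values below are placeholders for illustration only.
    print(build_profile_command(
        ["./my_app"],
        quiet=True,
        nvtx_include=["<nvtx-filter-expression>"],
        open_in_ui=True,
    ))
```

The resulting list could be handed to subprocess.run. The option values shown are placeholders; the tool's own documentation should be consulted for the accepted values of each option.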

Resolved Issues

  • Installation directory scripts use absolute paths

  • OpenACC kernel names are correctly demangled

  • Profile activity report file supports a relative path

  • Source view can resolve all applicable files at once

  • UI font colors are improved

  • Details page layout and label elision issues are resolved

  • Turing metrics are properly reported on the Summary page

  • All byte-based metrics use a factor of 1000 when scaling units to follow SI standards

  • CSV exports properly align columns with empty entries

  • Fixed the metric computation for double_precision_fu_utilization on GV11b

  • Fixed incorrect ‘selected’ PC sampling counter values

  • The SpeedOfLight section uses ‘max’ instead of ‘avg’ cycles metrics for Elapsed Cycles
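
To make the SI scaling fix listed above concrete, the following Python sketch scales a byte count using a factor of 1000 per unit step (rather than 1024). The function name scale_bytes_si and the unit labels are assumptions chosen for this example and may not match the exact labels shown by the tool.

```python
from typing import Tuple

# Unit labels are illustrative; each step is a factor of 1000.
SI_UNITS = ["B", "kB", "MB", "GB", "TB", "PB"]


def scale_bytes_si(num_bytes: float) -> Tuple[float, str]:
    """Scale a byte count to the largest SI unit that keeps the value below 1000."""
    value = float(num_bytes)
    index = 0
    # Divide by 1000 (not 1024) until the value fits or units run out.
    while abs(value) >= 1000.0 and index < len(SI_UNITS) - 1:
        value /= 1000.0
        index += 1
    return value, SI_UNITS[index]


if __name__ == "__main__":
    for n in (999, 1024, 1_500_000):
        value, unit = scale_bytes_si(n)
        print(f"{n} bytes -> {value:g} {unit}")
```

With this rule, 1024 bytes is reported as 1.024 kB rather than 1 of a 1024-based unit.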