Updates in 2024.1

General

  • Switched to using OpenSSL version 3.0.10.

  • Added new metrics available when profiling on CUDA Green Contexts.

  • Reduced the number of passes required for collecting PM sampling sections.

  • Counter domains can now be specified for PM sampling metrics in section files.

  • PM sampling metrics can now be queried on the command line and in the Metric Details window by specifying the respective collection option.

  • Added a new optional PmSampling_WarpStates section for understanding warp stall reasons over the workload duration.

  • Added a new rule for detecting load imbalances.

  • Improved the performance of graph-level profiling on new drivers.

  • Updated the metrics compatibility table for OptiX cmdlists and instruction-level SASS metrics.

NVIDIA Nsight Compute

  • Added SASS view and Source Markers support in Source Comparison.

  • Improved Source Comparison diff visualization by adding empty lines on the other side of inserted/deleted lines.

  • The Source page column chooser can now be opened directly from the Navigation drop-down.

  • Added a Launch Details tool window for showing information about individual launches within larger workloads like OptiX command lists.

  • Added support for CUDA Green Contexts in the Resources tool window, the Launch Statistics section and the report header.

NVIDIA Nsight Compute CLI

  • Improved the documentation on NVTX expressions and the command line output shown when a potentially incorrect expression leads to no workloads being profiled.

  • Improved checking for invalid expressions when using the --target-processes-filter option.

Resolved Issues

  • Fixed an issue where the L1 cache achieved roofline value was missing when profiling on GH100.

  • Fixed several “Launch Failed” errors when collecting instruction-level SASS metrics.

  • Fixed an issue where Live Register values were too high for some workloads.

  • Fixed a scrolling issue on the Source page when collapsing a multi-file view.

  • Fixed an issue where no PM sampling data was shown in the timeline when context switch trace was not available.

  • Fixed a display issue in the memory chart when adding baselines.

  • Fixed a crash when adding baselines.

  • Fixed a crash in timeline views when not all configured data was available.

  • Fixed an issue where the application history was not always deleted when selecting Reset Application Data.

  • Fixed an error in the metric compatibility documentation.