Updates in 2022.3
General
- Added support for the CUDA toolkit 11.8. 
- Added support for the Ada GPU architecture. 
- Added support for the Hopper GPU architecture. 
- Added support for OptiX 7.6. 
- Added the uncoalescedGlobalAccesses sample CUDA application and an accompanying document that show how the NVIDIA Nsight Compute profiler can be used to identify uncoalesced memory accesses, which result in inefficient DRAM accesses. Refer to the README, sample code and document under extras/samples/uncoalescedGlobalAccesses.
- Added a Metrics Reference to the documentation that lists metrics not available through --query-metrics.
- Reduced the overhead of collecting SASS-patching based metrics. 
- On Multi-Instance GPU (MIG) configurations, NVIDIA Nsight Compute can no longer lock clocks. Users are expected to lock clocks externally using nvidia-smi.
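
As a rough illustration of the MIG note above, the sketch below locks the GPU clocks with nvidia-smi, profiles, and then resets the clocks. The GPU index, the clock value, the report name and the ./my_app binary are placeholders chosen for this example, not values from these release notes. Locking clocks usually requires administrator privileges. The script only prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch: lock GPU clocks externally before profiling on a MIG-enabled GPU.
# GPU_INDEX, CLOCK_MHZ and APP are placeholders (assumptions for this example).
GPU_INDEX=0
CLOCK_MHZ=1410          # pick a value listed by: nvidia-smi -q -d SUPPORTED_CLOCKS
APP=./my_app            # hypothetical application binary
DRY_RUN=${DRY_RUN:-1}   # 1 = only print the commands, 0 = execute them

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Lock the GPU clock to a fixed frequency (min and max set to the same value).
run nvidia-smi -i "$GPU_INDEX" --lock-gpu-clocks="$CLOCK_MHZ,$CLOCK_MHZ"

# Profile while the clocks are fixed.
run ncu -o mig_report "$APP"

# Restore the default clock behavior afterwards.
run nvidia-smi -i "$GPU_INDEX" --reset-gpu-clocks
```

Resetting the clocks at the end keeps the GPU from staying pinned to one frequency for later workloads.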
NVIDIA Nsight Compute
- Wrapper script nv-nsight-cu is deprecated in favor of ncu-ui and will be removed in a future release.
- Source page supports range replay results. 
- Added a second chart on the Compute Workload Analysis section to avoid mixing metrics with different meanings.
- NVIDIA Nsight Compute now tracks traversable handles created with optixAccelRelocate.
- NVIDIA Nsight Compute now tracks traversable handles created as updates from others. 
- The Acceleration Structure viewer now reports unsupported inputs. 
- The Acceleration Structure viewer now supports opening multiple traversable handles. 
- The Acceleration Structure viewer now uses OptiX naming for displayed elements. 
NVIDIA Nsight Compute CLI
- Wrapper script nv-nsight-cu-cli is deprecated in favor of ncu and will be removed in a future release.
- Added new option --filter-mode per-gpu to enable filtering of kernel launches on each GPU separately.
- Added new option --app-replay-mode relaxed to produce profiling results for valid kernels even if the number of kernel launches is inconsistent across application replay passes.
- Added a documentation section on supported environment variables. 
- Improved the performance when loading existing reports on the command line. 
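
The two new command-line options above might be used roughly as follows. This is a sketch under assumptions: ./my_app, the report names and the skip/count values are hypothetical, and the launch filters shown (--launch-skip, --launch-count) are just one plausible pairing with --filter-mode per-gpu. The script only prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch: example invocations of the options added in this release.
APP=./my_app            # hypothetical application binary
DRY_RUN=${DRY_RUN:-1}   # 1 = only print the commands, 0 = execute them

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Apply the launch filters on each GPU separately: skip the first 2 kernel
# launches and profile the next 5, counted per device rather than globally.
run ncu --filter-mode per-gpu --launch-skip 2 --launch-count 5 \
    -o per_gpu_report "$APP"

# Use application replay and keep results for valid kernels even if the
# number of kernel launches differs between replay passes.
run ncu --replay-mode application --app-replay-mode relaxed \
    -o relaxed_report "$APP"
```

Relaxed matching is most likely to help with applications whose launch count varies from run to run, where the default matching could otherwise fail to produce results.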
Resolved Issues
- Fixed an issue when resolving files on the Source page. 
- Fixed an issue when profiling OptiX applications. 
- Fixed an issue in the OptiX traversable handle management caused by clashing handle values. 
- Fixed an issue in the Acceleration Structure viewer causing the display of invalid memory when viewing AABB buffers.