# 1\. Release Notes

Nsight Compute Release Notes.

Release notes, including new features and important bug fixes. Supported platforms and GPUs. List of known issues for the current release.

## 1.1. Release Notes

### 1.1.1. Updates in 2026.1.1

**Resolved Issues**

- Fixed an issue where metric colors in a timeline row could differ from those shown in the corresponding tooltip.

- Fixed an incorrect calculation in the Roofline Analysis rule that prevented it from triggering in all expected cases.

- Fixed an issue where the High Pipe Utilization rule might not analyze all pipelines.


### 1.1.2. Updates in 2026.1

**General**

- Added support for CUDA 13.2.

- The minimum supported version of macOS is now 13.0.

- The [Python Report Interface](https://docs.nvidia.com/nsight-compute/PythonReportInterface/index.html.md#introduction) can now be installed as a standalone package from PyPI using `pip install ncu-report`.
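
As an illustration of what the standalone package makes possible, the sketch below gathers one metric from every profiled result in a report. It assumes that the PyPI package exposes a module named `ncu_report`, and that the method names used here (`load_report`, `num_ranges`, `range_by_idx`, `num_actions`, `action_by_idx`, `name`, `metric_by_name`, `as_double`) match the Python Report Interface documentation for your installed version. The helper names `collect_metric` and `load_and_collect` are hypothetical.

```python
def collect_metric(context, metric_name):
    """Return a list of (result name, value) pairs for one metric.

    `context` is expected to behave like the object returned by
    ncu_report.load_report(); the method names are assumptions taken from
    the Python Report Interface documentation.
    """
    values = []
    for range_idx in range(context.num_ranges()):
        report_range = context.range_by_idx(range_idx)
        for action_idx in range(report_range.num_actions()):
            action = report_range.action_by_idx(action_idx)
            metric = action.metric_by_name(metric_name)
            # Assumption: a metric that was not collected is returned as None.
            if metric is not None:
                values.append((action.name(), metric.as_double()))
    return values


def load_and_collect(report_path, metric_name):
    # Imported lazily so this file can be loaded without the package installed.
    import ncu_report  # assumed module name of the `ncu-report` package

    return collect_metric(ncu_report.load_report(report_path), metric_name)
```

A call such as `load_and_collect("my_report.ncu-rep", "launch__stack_size")` would then list the value per profiled result, where both the report path and the choice of metric are placeholders.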


**NVIDIA Nsight Compute**

- Nsight Compute now supports profiling Linux (aarch64 sbsa) targets from macOS hosts.

- The [Start Activity Dialog](https://docs.nvidia.com/nsight-compute/NsightCompute/index.html.md#start-activity-dialog) now supports editing multi-line command line arguments.

- Added the [Report Merge Tool](https://docs.nvidia.com/nsight-compute/NsightCompute/index.html.md#report-merge-tool).
It enables you to combine multiple reports into one.
It is particularly useful for multi-GPU systems and scenarios where comparing and analyzing several reports individually becomes impractical.

- Added the [Clustering Window](https://docs.nvidia.com/nsight-compute/NsightCompute/index.html.md#clustering-window).
It helps you analyze and compare multiple profiling reports by grouping similar reports together.
This makes it easier to identify performance patterns and find relationships between different profiling sessions.

- Added [Register Dependencies](https://docs.nvidia.com/nsight-compute/NsightCompute/index.html.md#profiler-report-source-page-additional-tables) analysis to the Source page.
It helps you identify general-purpose register dependencies and occupancy issues due to live register pressure.
Added _Attributed Live Registers_ and _Output Registers_ metric columns in the Source page views.
Renamed _Instructions & Scoreboards_ to _Instructions & Dependencies_.

- The [Source](https://docs.nvidia.com/nsight-compute/NsightCompute/index.html.md#pre-defined-source-metrics) page now shows the [metric pipelines](https://docs.nvidia.com/nsight-compute/ProfilingGuide/index.html.md#pipelines) associated with each instruction.

- Added a [CUDA Graph Viewer](https://docs.nvidia.com/nsight-compute/NsightCompute/index.html.md#cuda-graph-viewer) tool window that dynamically visualizes CUDA Graphs during interactive profiling.

- [Timelines](https://docs.nvidia.com/nsight-compute/NsightCompute/index.html.md#timelines) on the Details page now overlay related metrics in a single row when possible.
They now also show a max bar in the background to indicate the maximum value at any zoom level.
Rows can be switched between theoretical peak and collected maximum value for the Y-axis scale.

- The Instruction Statistics section now has thread-level charts.

- Improved alignment of rule elements on the Details page.

- Moved the section body dropdowns to the left of the Details page section headers.

- Renamed “Kernel Analysis” to “SASS Analysis” in the Options dialog.

- The Save As dialog now supports the `.ncu-repz` file extension for compressed reports.


**NVIDIA Nsight Compute CLI**

- [Mandatory concurrent kernels](https://docs.nvidia.com/nsight-compute/NsightComputeCli/index.html.md#mandatory-concurrent-kernels) (e.g. NCCL communication kernels) can now be profiled across processes from the same process tree using the `--communicator shmem` option.

- The tool now generates a persisted log file in case of non-recoverable errors.

- The default value of `--clock-control` is now `boost`.
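
For scripts that drive the CLI, the options above can be passed explicitly. The following is a minimal sketch that assembles an `ncu` command line in Python. The helper `build_ncu_command`, the application `./my_app`, and the report name are hypothetical. It assumes `base` remains an accepted `--clock-control` value for keeping the earlier clock behavior, and it does not cover any further options that multi-process profiling with `--communicator shmem` may need.

```python
import shlex


def build_ncu_command(app_args, report_name=None, clock_control=None, communicator=None):
    """Assemble an `ncu` command line as an argument list (hypothetical helper)."""
    command = ["ncu"]
    if clock_control is not None:
        # The default is now `boost`; pass a value only to override it.
        command += ["--clock-control", clock_control]
    if communicator is not None:
        command += ["--communicator", communicator]
    if report_name is not None:
        command += ["--export", report_name]
    # The profiled application and its arguments follow the ncu options.
    return command + list(app_args)


cmd = build_ncu_command(
    ["./my_app", "--steps", "10"],
    report_name="my_report",
    clock_control="base",
    communicator="shmem",
)
print(shlex.join(cmd))
```

Building an argument list rather than a single string avoids shell quoting problems when the command is passed to `subprocess.run`.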


**Resolved Issues**

- Fixed issues with OptiX command lists in interactive profiling mode.

- Added `occupancy` as an option to `--query-metrics-collection`.

- Fixed issues with the Active Clusters graph in the [occupancy calculator](https://docs.nvidia.com/nsight-compute/NsightCompute/index.html.md#graphs).

- Fixed an issue where the inline functions table could show incorrect metric values in some cases.

- Fixed issues with syntax highlighting on the Source page.

- Improved the performance of the [Source Comparison](https://docs.nvidia.com/nsight-compute/NsightCompute/index.html.md#source-comparison) view for large SASS diffs.

- Fixed issues with alignment of multi-pass PM sampling [timelines](https://docs.nvidia.com/nsight-compute/NsightCompute/index.html.md#timelines) without context switch trace.

- Using an unknown metric in a PM sampling timeline section file is now an error.

- Fixed issues with opcode category tooltips in the Instruction Statistics charts.

- Fixed issues in node-level profiling of CUDA device launchable graphs.


### 1.1.3. Updates in 2025.4.1

**General**

- Added support for profiling OptiX workloads with the interactive profile activity.

- The Resources tool window now shows a _CUDA ID_ column for graph resources.


**Resolved Issues**

- Fixed issues in the Report Merge Tool.

- Fixed issues in the Clustering Window.

- Fixed an issue with the formatting of entries in the Inline Functions table.

- Fixed an issue that could cause the UI to crash when switching between kernels in the Source page in certain cases.

- Fixed an issue that the `ncu` CUDA Toolkit script may fail if unrecognized versions of Nsight Compute are installed separately.

- Fixed several incorrect metrics in the `PmSampling.section`.

- Fixed an issue that PM Sampling timelines could be distorted due to broken initial samples.

- Fixed a crash with node-level graph profiling in app-range replay mode. If graphs are uploaded outside the range, an error indicating that profiling them is not supported is now shown.

- Fixed a crash when instantiating device-side graphs with memory nodes.


### 1.1.4. Updates in 2025.4

**General**

- Added support for profiling CUDA tile workloads.

- Introduced a new Tile section to summarize tile dimensions and pipeline utilization, displayed when enabled and a tile workload is profiled.

- Source page supports correlation between SASS and high-level Tile code (limited to cuTile Python code).

- Added a new `ncu-repz` file format for [zstd](https://facebook.github.io/zstd/) compressed report files.

- Added support for locking GPUs to boost clock instead of base on Ampere and newer GPUs. Use the `boost` and `force-boost` [options](https://docs.nvidia.com/nsight-compute/NsightComputeCli/index.html.md#command-line-options-profile) on supported drivers.

- [Warp sampling](https://docs.nvidia.com/nsight-compute/ProfilingGuide/index.html.md#warp-sampling) by default now focuses on the _Not Issued_ (`(_not_issued)`) variants of the metrics.
This is to avoid pointing to source locations where warp stalls are mitigated by having sufficient numbers of warps during an issue cycle to hide latency.

- Added support for node-level profiling of CUDA conditional graphs, including device-updatable nodes and nodes that can set conditional graph handles.

- Added support for node-level profiling of CUDA graphs launched from the device (DGL), including host graph nodes that can launch DGL.

- Source page now displays [symbol labels](https://docs.nvidia.com/nsight-compute/NsightCompute/index.html.md#profiler-report-source-page-navigation): A new column for symbol labels has been added, and symbol labels are shown alongside addresses in SASS instruction disassembly.
This change aligns the output with that of the nvdisasm tool.

- Added support for collecting Warp sampling metrics with PM sampling, allowing users to see function-level warp stalls for the selected time range in the timeline.
See the [Function Stats tool window](https://docs.nvidia.com/nsight-compute/NsightCompute/index.html.md#tool-window-function-stats) for details.


### 1.1.5. Updates in 2025.3.1

**General**

- Improved the charts in the Compute Workload Analysis section to better distinguish between `per_cycle_active` and `per_cycle_elapsed` metrics.


**Resolved Issues**

- Fixed an issue where kernels using the compile-time attribute `__block_size__` were launched with incorrect grid dimensions.

- Fixed an issue with timeline y-axis labels showing unexpected units for small max values.

- Fixed a crash when stepping applications in interactive profiling mode.

- Fixed that roofline charts did not show achieved value data in some cases.

- Fixed that duplicated tooltips could be shown for some links in the Memory Chart.

- Fixed a potential hang when setting `--pm-sampling-buffer-size` to very large values.

- Fixed several rules to not show non-actionable warnings for unsupported, missing metrics when profiling on mobile chips.


### 1.1.6. Updates in 2025.3

**General**

- Added support for CUDA 13.0. See the tool’s CUDA driver [system requirements](https://docs.nvidia.com/nsight-compute/ReleaseNotes/index.html.md#system-requirements).

- Added or improved support for Blackwell chips.

- For Green Context launches, `launch__waves_per_multiprocessor` is now scaled to the number of SMs in the Green Context.

- Added support for profiling individual nodes of device-launchable CUDA graphs launched from the host.

- Added metric `launch__persisting_l2_cache_size` to the Memory Workload Analysis section.

- Removed metric `profiler__pmsampler_dropped_samples`.

- Added [support](https://docs.nvidia.com/nsight-compute/NsightComputeCli/index.html.md#command-line-options-profile) for not importing SASS cubins into the report.
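
The Green Context scaling of `launch__waves_per_multiprocessor` noted in the list above can be pictured with a small calculation. This is only a sketch: it assumes the metric is the grid size divided by the number of blocks that can be resident at once (blocks per SM times SM count), which is not stated in these notes, and all numbers are illustrative rather than measured.

```python
def waves_per_sm(grid_size, blocks_per_sm, sm_count):
    """Waves needed to run `grid_size` blocks (assumed formula, see lead-in)."""
    if blocks_per_sm <= 0 or sm_count <= 0:
        raise ValueError("blocks_per_sm and sm_count must be positive")
    return grid_size / (blocks_per_sm * sm_count)


# Illustrative values: the same launch measured against the whole device
# and against a Green Context that owns only a subset of the SMs.
whole_device = waves_per_sm(grid_size=1024, blocks_per_sm=4, sm_count=128)
green_context = waves_per_sm(grid_size=1024, blocks_per_sm=4, sm_count=32)
print(whole_device, green_context)
```

Under this assumption, a launch confined to fewer SMs needs proportionally more waves, which is what scaling to the Green Context's SM count would reflect.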


**NVIDIA Nsight Compute**

- The [Source page](https://docs.nvidia.com/nsight-compute/NsightCompute/index.html.md#pre-defined-source-metrics) now shows the instruction category in SASS and the instruction mix for high-level source.

- Added a new [instruction mix and scoreboard dependencies table](https://docs.nvidia.com/nsight-compute/NsightCompute/index.html.md#profiler-report-source-page-additional-tables) to the Source page.

- Added improved tooltips to the [memory chart](https://docs.nvidia.com/nsight-compute/ProfilingGuide/index.html.md#memory-chart).

- Added information on the GPC Constant Cache (GCC) and DSMEM atomics in the [memory tables](https://docs.nvidia.com/nsight-compute/ProfilingGuide/index.html.md#memory-tables).

- The [Metric Details](https://docs.nvidia.com/nsight-compute/NsightCompute/index.html.md#tool-window-metric-details) tool window now shows the breakdown for [throughput metrics](https://docs.nvidia.com/nsight-compute/ProfilingGuide/index.html.md#metrics-entities).

- Added support for searching web forum and rule results.

- Multiple results from the same search source are now combined to make the output more readable.

- Improved the [occupancy calculator](https://docs.nvidia.com/nsight-compute/NsightCompute/index.html.md#occupancy-calculator).


**NVIDIA Nsight Compute CLI**

- Added the option [`--forward-signals`](https://docs.nvidia.com/nsight-compute/NsightComputeCli/index.html.md#command-line-options-launch) to transparently forward signals to the profiled application.


**Resolved Issues**

- Fixed that some `ncu` console messages were truncated after 1024 characters.

- Fixed some display issues related to Green Context tables.

- Improved the performance of remote profiling in application replay mode.

- Fixed a hang in certain scenarios when profiling dependent kernels with device-mapped host allocations.

- Fixed missing correlation between JIT-compiled PTX to SASS in some situations.

- Fixed an error when profiling a CUDA graph kernel node doing a cluster launch on driver 580 or newer.


### 1.1.7. Updates in 2025.2.1

**Resolved Issues**

- Updated OpenSSL to 3.0.16.

- Fixed an issue that caused the Device Memory table of the Memory Workload Analysis section to show up empty for chips of type GB100 and GB102.


### 1.1.8. Updates in 2021.2.9

**NVIDIA Nsight Compute**

- Clarified when not all metrics for the roofline chart could be collected on the current chip.


### 1.1.9. Older Versions

#### Updates in 2025.2

**General**

- Added support for collecting C2C link information on Blackwell GPUs.

- [CPU call stack filtering](https://docs.nvidia.com/nsight-compute/NsightComputeCli/index.html.md#cpu-stack-filtering) now supports Python call stacks.

- Instruction statistics now show warp- and thread-level instruction counts per opcode category.
Added new metrics `sass__inst_executed_per_opcode_category` and `sass__thread_inst_executed_per_opcode_category`.
See the [Metrics Reference](https://docs.nvidia.com/nsight-compute/ProfilingGuide/index.html.md#metrics-reference) for details.

- Enhanced several rules to produce tables pointing to the source location of interest.

- Improved the NvRules API to support [generic tables](https://docs.nvidia.com/nsight-compute/NvRulesAPI/index.html.md#NvRules.IFrontend.generate_table) for the UI and CLI.

- Improved the [NvRules](https://docs.nvidia.com/nsight-compute/NvRulesAPI/index.html.md) and [Python Report Interface](https://docs.nvidia.com/nsight-compute/PythonReportInterface/index.html.md) documentation to be more Pythonic.

- Added APIs to the Python Report Interface for querying rules and source markers in the report.

- Added the [Occupancy Calculator Python Interface](https://docs.nvidia.com/nsight-compute/OccupancyCalculatorPythonInterface/index.html.md#introduction), which provides a Python-based interface for performing occupancy calculations and analysis of kernels on NVIDIA GPUs.

**NVIDIA Nsight Compute**

- Added product-wide [search](https://docs.nvidia.com/nsight-compute/NsightCompute/index.html.md#tool-window-search) functionality via a new search bar and tool window.

- The Source page now shows scoreboard dependencies in SASS.

- Converted more tooltips into interactive tooltips. Interactive tooltips can now be pinned and dragged.

- Added source correlation [navigation controls](https://docs.nvidia.com/nsight-compute/NsightCompute/index.html.md#navigation) which allow navigation to the previous or next block of correlated lines.


**NVIDIA Nsight Compute CLI**

- Added support for profiling [MPS applications](https://docs.nvidia.com/nsight-compute/ProfilingGuide/index.html.md#special-configurations-mps).

- Added support for filtering kernels based on a [renamed kernels](https://docs.nvidia.com/nsight-compute/NsightComputeCli/index.html.md#kernel-renaming) configuration during profiling.


**Resolved Issues**

- CUDA Graphs in the [Resources View](https://docs.nvidia.com/nsight-compute/NsightCompute/index.html.md#tool-window-resources) use the current UI theme.

- Resolved several issues when interacting with timelines on the Details page.

- Resolved issues with Python syntax highlighting on the Source page.

- Disabled deprecated columns in the API Stream tool window.

- Fixed that the Source page may show incorrect correlation when some source files were not resolved.

- Reduced the number of replay passes required for collecting the `PmSampling.section` on GH100 with applicable drivers.

- Resolved that `--native-include` did not work properly when using range replay and `cu(da)ProfilerStop`.

- Fixed an `Invalid or unsupported charset:ANSI_X3.4-1968` error when using the CLI on some systems.

- Fixed that memory available for saving context state during replay may be computed incorrectly when the app was using managed memory.

- Fixed that some metrics were not listed for collection in section files for GB20x GPUs.


#### Updates in 2025.1.1

**General**

- Added support for OptiX 9.0 functions `optixClusterAccelComputeMemoryUsage` and `optixClusterAccelBuild`.


**Resolved Issues**

- Fixed a possible deadlock condition while handling the launch of child processes on Linux systems.

- Fixed a possible crash of the Nsight Compute UI when switching to the Source Page.

- Fixed the missing roofline ceilings in the Floating Point Operations Roofline for GB20x chips.


#### Updates in 2025.1

**General**

- All roofline sections are now included in the `full` section set.

- [Range Replay](https://docs.nvidia.com/nsight-compute/ProfilingGuide/index.html.md#range-replay) and [app-range replay](https://docs.nvidia.com/nsight-compute/ProfilingGuide/index.html.md#application-range-replay) now support the collection of instruction-level source metrics.

- Rules are now supported for range replays.

- Improved which launch metrics are available for ranges.

- Added a new `launch__stack_size` metric in the _Launch Statistics_ section to report the configured stack size.

- Added a new `sass__inst_executed_register_spilling` metric which counts the number of load and store instructions that were created by the compiler due to register spilling.

- Nsight Compute now natively supports macOS arm64.


**NVIDIA Nsight Compute**

- Added interactive tooltips to [Details](https://docs.nvidia.com/nsight-compute/NsightCompute/index.html.md#profiler-report-details-page) and [Source](https://docs.nvidia.com/nsight-compute/NsightCompute/index.html.md#profiler-report-source-page) pages. An interactive tooltip can be used to compare different baselines. Its content can be copied to the clipboard using the copy icon button.

- CUDA [Green Contexts support](https://docs.nvidia.com/nsight-compute/NsightCompute/index.html.md#green-contexts-support) is improved by showing TPC mask information in the Launch Statistics section, the [Resources](https://docs.nvidia.com/nsight-compute/NsightCompute/index.html.md#tool-window-resources) tool window, and on the [Session page](https://docs.nvidia.com/nsight-compute/NsightCompute/index.html.md#session-page).

- Added a heatmap to the [Source Comparison](https://docs.nvidia.com/nsight-compute/NsightCompute/index.html.md#source-comparison) document to visualize the source code differences.

- Added a _Diff By_ drop-down menu to the [Source Comparison](https://docs.nvidia.com/nsight-compute/NsightCompute/index.html.md#source-comparison) document in the _SASS_ view. This allows you to choose the diff basis: either _Opcode_ or _Full Instruction_.

- Performance improvements in _SASS_ view.

- The [Resources View](https://docs.nvidia.com/nsight-compute/NsightCompute/index.html.md#tool-window-resources) for CUDA Graphs can now visualize the graph structure directly in a new _Chart_ mode.

- The [Memory Chart](https://docs.nvidia.com/nsight-compute/ProfilingGuide/index.html.md#memory-chart) now supports zoom and pan.

- The [Metric Details](https://docs.nvidia.com/nsight-compute/NsightCompute/index.html.md#tool-window-metric-details) tool window now shows PM sampling metrics from the timeline as context switched.

- Improved the performance for deploying to target systems over [remote connections](https://docs.nvidia.com/nsight-compute/NsightCompute/index.html.md#remote-connections).


**NVIDIA Nsight Compute CLI**

- Added experimental support for profiling [MPS applications](https://docs.nvidia.com/nsight-compute/ProfilingGuide/index.html.md#special-configurations-mps).


**Resolved Issues**

- Fixed that on some systems, not all free GPU memory was considered when saving context memory for multi-pass data collection.

- Fixed an incorrect multiplier in the calculation of non-tensor FP16 rooflines.

- Fixed the metric `Avg. Threads Executed` for inlined functions with control flow.

- Fixed that in some situations, no average was shown in the Source Statistics table for Warp Stall sampling metrics.

- Fixed several SASS syntax highlighting issues.

- Fixed an issue where the SM count wasn’t shown correctly in the report header when loading older reports.

- Improved interactions between the [Metric Details](https://docs.nvidia.com/nsight-compute/NsightCompute/index.html.md#tool-window-metric-details) tool window and the memory chart.


#### Updates in 2024.4

**General**

- Added support for the Blackwell architecture.

- Added support for several `launch__*` metrics for CUDA graphs.

- Added support for the `cuMemBatchDecompressAsync` API in [Range Replay](https://docs.nvidia.com/nsight-compute/ProfilingGuide/index.html.md#range-replay).


**NVIDIA Nsight Compute**

- A new feature overview is now shown the first time a new UI version is opened.

- Switched the default orientation of the _Raw_ page to show metrics in rows and profile results in columns.

- Added support for reporting register spilling compiler annotations on the Source page.

- The [source page](https://docs.nvidia.com/nsight-compute/NsightCompute/index.html.md#profiler-report-source-page-navigation) has improved search with support for regular expression- and value-based lookups.

- Added support to set a [Source View Profile](https://docs.nvidia.com/nsight-compute/NsightCompute/index.html.md#profiler-report-source-page-profiles) as the default profile to apply it automatically while opening a report.

- Added hyperlinks for the line numbers and inline function addresses in the [Inline Table](https://docs.nvidia.com/nsight-compute/NsightCompute/index.html.md#additional-tables). This enables you to quickly jump to the respective line number in the Source view and address in the SASS view. Added a new _Source File_ column to the [Inline Table](https://docs.nvidia.com/nsight-compute/NsightCompute/index.html.md#additional-tables) to show the name of the file to which the source belongs.

- The [memory chart](https://docs.nvidia.com/nsight-compute/ProfilingGuide/index.html.md#memory-chart) can indicate or hide inactive elements.

- Chart tooltips on the Details page now show more relevant information when a specific value is hovered.

- [Roofline charts](https://docs.nvidia.com/nsight-compute/ProfilingGuide/index.html.md#roofline) now support showing the formula for ridge point calculation in the [metric details](https://docs.nvidia.com/nsight-compute/NsightCompute/index.html.md#tool-window-metric-details) tool window.

- The [occupancy calculator](https://docs.nvidia.com/nsight-compute/NsightCompute/index.html.md#occupancy-calculator) now considers the impact of block barriers for Hopper-architecture and newer GPUs. It also has improved controls to adjust input values.

- The [remote connections](https://docs.nvidia.com/nsight-compute/NsightCompute/index.html.md#remote-connections) dialog now supports placeholders to deploy files to e.g. user-specific directories on the target system.


**NVIDIA Nsight Compute CLI**

- Added a new `--nvtx-push-pop-scope` [command line option](https://docs.nvidia.com/nsight-compute/NsightComputeCli/index.html.md#command-line-options-profile), which allows setting the push/pop range scope process-wide.


**Resolved Issues**

- Fixed UI scrolling issues on macOS trackpads.

- Fixed that certain Python script errors were not properly reported when loading rule files.

- On CUDA 12.7 drivers, [context switch trace](https://docs.nvidia.com/nsight-compute/ProfilingGuide/index.html.md#pm-sampling) can now filter events more precisely to the profiled CUDA context, even when profiling in containers.

- [NVTX filtering](https://docs.nvidia.com/nsight-compute/NsightComputeCli/index.html.md#nvtx-filtering) now properly supports start/end ranges that start and end in different threads.

- Fixed several issues with [range replay](https://docs.nvidia.com/nsight-compute/ProfilingGuide/index.html.md#range-replay-supported-apis) when capturing CUDA memcpy APIs.


#### Updates in 2024.3.2

**NVIDIA Nsight Compute**

- Added support for kernel profiling on the Microsoft Compute Driver Model (MCDM). Requires an NVIDIA display driver version of 565 or higher.


**Resolved Issues**

- Fixed the calculation of peak throughput values for select pipelines on chips using the Ada GPU architecture, including the NVIDIA L20 and NVIDIA L40 data-center GPUs.

- Re-enabled the target-side data collection for the [Acceleration Structure Viewer](https://docs.nvidia.com/nsight-compute/NsightCompute/index.html.md#acceleration-structure-viewer) on Linux (aarch64 and aarch64 sbsa), which was disabled in 2024.3.1.


#### Updates in 2024.3.1

**NVIDIA Nsight Compute**

- Added the ability to use the ESC button to close the search popup in the views of the Source Page.

- Disabled the [Acceleration Structure Viewer](https://docs.nvidia.com/nsight-compute/NsightCompute/index.html.md#acceleration-structure-viewer) on Linux (aarch64 and aarch64 sbsa) as the rendering view may cause a crash of the Nsight Compute UI. The functionality will be restored in a future version.


**NVIDIA Nsight Compute CLI**

- Added back the help documentation for command line parameters `--set` and `--list-sets`.


**Resolved Issues**

- Fixed an issue with PM Sampling on Hopper GPUs that could cause the target application to hang during profiling.

- Fixed an issue that caused CPU stack frames to not show up if stack frames could not be fully resolved.

- Fixed an issue with the heights of the tables in the Metric Details window.

- Fixed a possible crash of the Nsight Compute UI while preparing the information for the Source Page.

- Fixed an issue that caused a failure when the request to save a partial profile report is canceled.


#### Updates in 2024.3

**NVIDIA Nsight Compute**

- Added syntax highlighting support for section and rule files when opening these as a new document. These documents now support editing, saving (ctrl + s), text search (ctrl + f), zooming in (ctrl + mouse scroll down), and zooming out (ctrl + mouse scroll up).

- In the [Source Comparison](https://docs.nvidia.com/nsight-compute/NsightCompute/index.html.md#source-comparison) document, added synchronized horizontal scrolling support across both sides for frozen columns.

- Added hyperlinks for section and rule files in the [Metric Selection](https://docs.nvidia.com/nsight-compute/NsightCompute/index.html.md#metric-selection) tool window.

- Improved Inline Table source syntax highlighting.

- The ‘Connect’ button has been renamed to ‘Start Activity’.

- The ‘Connection Dialog’ has been renamed ‘Start Activity’.

- Improved the Progress Log window in the [Start Activity Dialog](https://docs.nvidia.com/nsight-compute/NsightCompute/index.html.md#connection-dialog) to show the progress status of files being deployed to the remote host machine.

- Added file menu options to Save Filtered Results and Save Selected Results from an opened report to a new report. Also added context menu option Save Results to save the selected results from the Summary Table or Raw Table to a new report.


**NVIDIA Nsight Compute CLI**

- Added support for exporting filtered results from an imported profile report to a new file using the `--export` and `--import` options together.

- The `--kernel-id` option now supports regex for specifying the context id and stream id.
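
Combining `--import` and `--export`, as described in the list above, might look like the following sketch. The helper `build_reexport_command` and the file names are hypothetical, and the use of `--kernel-name` as the filter is an assumption about which filter options apply when importing; consult the CLI documentation for the supported set.

```python
import shlex


def build_reexport_command(source_report, target_report, kernel_name=None):
    """Assemble an `ncu` command that re-exports results from an existing report."""
    command = ["ncu", "--import", source_report, "--export", target_report]
    if kernel_name is not None:
        # Assumed filter: keep only results whose kernel name matches.
        command += ["--kernel-name", kernel_name]
    return command


reexport = build_reexport_command("full.ncu-rep", "filtered.ncu-rep", kernel_name="my_kernel")
print(shlex.join(reexport))
```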


#### Updates in 2024.2.1

**General**

- Improved performance when filtering by NVTX context in kernel and application replay.

- Improved documentation for metric units and terms.


**NVIDIA Nsight Compute**

- Improved tooltips for the memory chart.

- Improved timeline row maximum selection for PM sampling metrics.


**Resolved Issues**

- Fixed an issue that the report result dropdown may not update when some options were changed.

- Fixed an issue with the Source page statistics table.

- Fixed an issue with PM sampling reporting incomplete data.

- Fixed an issue with filtering using NVTX start/end ranges.

- Fixed an issue with demangling Numba CUDA kernel names.

- Fixed an issue with profiling multi-ctx applications on vGPU.

- Fixed an issue that resulted in L1 caches not always being invalidated for every pass.

- Fixed an issue with applications using `execlp`.


#### Updates in 2024.2

**General**

- PM sampling timelines now show the sampled GPU workload activities.

- Added support for collecting [Python Call Stacks](https://docs.nvidia.com/nsight-compute/NsightCompute/index.html.md#profiler-report-nvtx-page) alongside native ones to better understand the context of a workload in Python applications.

- Demangled kernel names can now be automatically simplified or manually renamed. For reports with multiple results, all names are considered during simplification to make them easier to distinguish.

- Removed support for ppc64le.


**NVIDIA Nsight Compute**

- Redesigned the [report header](https://docs.nvidia.com/nsight-compute/NsightCompute/index.html.md#profiler-report-header) for easier access to all report pages. All actions are now sorted into clearly labeled buttons. The focused result selection was integrated into the current row. When adding the current result as a baseline, the row itself is updated to reflect this, instead of showing a separate one.

- Redesigned [Source Page](https://docs.nvidia.com/nsight-compute/NsightCompute/index.html.md#profiler-report-source-page) and [Source Comparison](https://docs.nvidia.com/nsight-compute/NsightCompute/index.html.md#source-comparison) controls to allow more vertical space.

- The [Source Page](https://docs.nvidia.com/nsight-compute/NsightCompute/index.html.md#profiler-report-source-page) and [Source Comparison](https://docs.nvidia.com/nsight-compute/NsightCompute/index.html.md#source-comparison) Navigate By dropdowns can be linked to the respective other view. Changing column names from one dropdown will change it in the other view, too. This applies only if a column is also available in the second view.

- Added Inline Table support in the [Source Comparison](https://docs.nvidia.com/nsight-compute/NsightCompute/index.html.md#source-comparison) document to each side separately.

- Added rich support for Python and Fortran source syntax highlighting. Enhanced CUDA-C and PTX syntax highlighting.

- Added a Statistics Table to the Source Page that allows you to quickly see aggregated metrics across a custom selection of lines.

- Improved tooltips in the memory chart to show more detailed information when metrics are missing.

- The [Acceleration Structure Viewer](https://docs.nvidia.com/nsight-compute/NsightCompute/index.html.md#acceleration-structure-viewer) can now compute ray-geometry intersection and traversal timing heatmaps.

- Added support for ignoring directories in the section search folder.

- Added support for specifying [custom metric descriptions](https://docs.nvidia.com/nsight-compute/CustomizationGuide/index.html.md#custom-descriptions) in section files.

- Added a warning if the opened report is newer than the UI and may not be fully compatible.


**NVIDIA Nsight Compute CLI**

- The raw page CSV output now includes metric instance values when these are enabled for printing.


**Resolved Issues**

- Improved handling of short workloads during PM sampling.

- Improved units for several metrics.

- Fixed an issue where some metrics did not show aggregates on the Summary and Raw pages.

- Fixed an issue where profiled applications could inadvertently override which PerfWorks library is loaded by the tool.

- Fixed an issue where kernel names including `$` were modified by the shell when profiling them from the System Trace activity.

- Fixed an issue where reports could not be saved with the expected file extension in certain cases.


#### Updates in 2024.1.1 [](https://docs.nvidia.com/nsight-compute/ReleaseNotes/index.html\#updates-in-2024-1-1 "Permalink to this headline")

**General**

- Added clarification that when profiling a range with multiple active CUDA Green Contexts, counter values that are not attributable to SMs are aggregated across all of these Green Contexts.


**Resolved Issues**

- Changed the way the PerfWorks library is loaded into the target application’s process space. This addresses possible connection errors in case the library search path includes other directories with PerfWorks libraries.

- Fixed an issue that caused PM sampling data to be missing from the results of a Profile Series.

- Fixed the incorrect calculation of the percentage values in the Inline Function table.

- Fixed a potential crash of the NVIDIA Nsight Compute UI when PM sampling data was requested, but no sample was collected.


#### Updates in 2024.1 [](https://docs.nvidia.com/nsight-compute/ReleaseNotes/index.html\#updates-in-2024-1 "Permalink to this headline")

**General**

- Switched to using OpenSSL version 3.0.10.

- Added new metrics available when profiling on CUDA Green Contexts.

- Reduced the number of passes required for collecting [PM sampling](https://docs.nvidia.com/nsight-compute/ProfilingGuide/index.html.md#pm-sampling) sections.

- Counter domains can now be specified for PM sampling metrics in section files.

- PM sampling metrics can now be queried in the command line and Metric Details window by specifying the respective `collection` option.

- Added a new optional [PmSampling\_WarpStates](https://docs.nvidia.com/nsight-compute/ProfilingGuide/index.html.md#sections-and-rules) section for understanding warp stall reasons over the workload duration.

- Added a new rule for detecting load imbalances.

- Improved the performance of graph-level profiling on new drivers.

- Updated the [metrics compatibility](https://docs.nvidia.com/nsight-compute/ProfilingGuide/index.html.md#compatibility) table for OptiX cmdlists and instruction-level SASS metrics.

- Improved PM sampling results by dynamically recollecting metrics to avoid outlier pass groups. Use the new [pm-sampling-max-passes](https://docs.nvidia.com/nsight-compute/NsightComputeCli/index.html.md#pm-sampling) option to control the maximum number of dynamically replayed passes.

- Added _interKernelCommunication_ sample CUDA application to show how to use NVIDIA Nsight Compute to profile kernels that depend on each other and must be launched concurrently. Refer to the `README.TXT` file and sample code under `extras/samples/interKernelCommunication`.


**NVIDIA Nsight Compute**

- Added SASS view and Source Markers support in [Source Comparison](https://docs.nvidia.com/nsight-compute/NsightCompute/index.html.md#source-comparison).

- Improved [Source Comparison](https://docs.nvidia.com/nsight-compute/NsightCompute/index.html.md#source-comparison) diff visualization by adding empty lines on the other side of inserted/deleted lines.

- The Source page [column chooser](https://docs.nvidia.com/nsight-compute/NsightCompute/index.html.md#profiler-report-source-page-metrics) can now be opened directly from the Navigation drop down.

- Added support for updating a [source-page profile](https://docs.nvidia.com/nsight-compute/NsightCompute/index.html.md#profiles).

- Added a [Launch Details](https://docs.nvidia.com/nsight-compute/NsightCompute/index.html.md#tool-window-launch-details) tool window for showing information about individual launches within larger workloads like OptiX command lists.

- Added support for CUDA Green Contexts in the [Resources](https://docs.nvidia.com/nsight-compute/NsightCompute/index.html.md#tool-window-resources) tool window, the Launch Statistics section and the report header.

- Property metrics can now be queried in the Metric Details window.

- Added support to show if a CUDA Graph kernel node is device-side updatable in the [Resources](https://docs.nvidia.com/nsight-compute/NsightCompute/index.html.md#tool-window-resources) tool window.


**NVIDIA Nsight Compute CLI**

- Improved documentation on NVTX expressions and command line output when a potentially incorrect expression led to no workloads being profiled.

- Improved checking for invalid expressions when using the `--target-processes-filter` option.


**Resolved Issues**

- Fixed an issue where the achieved L1 cache roofline value was missing when profiling on GH100.

- Fixed several “Launch Failed” errors when collecting instruction-level SASS metrics.

- Fixed an issue where [Live Register](https://docs.nvidia.com/nsight-compute/NsightCompute/index.html.md#profiler-report-source-page-metrics) values were too high for some workloads.

- Fixed a scrolling issue on the Source page when collapsing a multi-file view.

- Fixed an issue where no PM sampling data was shown in the timeline when context switch trace was not available.

- Fixed a display issue in the memory chart when adding baselines.

- Fixed a crash when adding baselines.

- Fixed a crash in timeline views when not all configured data was available.

- Fixed an issue where the application history was not always deleted when selecting Reset Application Data.

- Fixed an error in the metric compatibility documentation.


#### Updates in 2023.3.1 [](https://docs.nvidia.com/nsight-compute/ReleaseNotes/index.html\#updates-in-2023-3-1 "Permalink to this headline")

**General**

- Switched to using OpenSSL version 1.1.1w.

- Improved the speedup estimates for rule IssueSlotUtilization as well as its child rules.

- Updated report files and documentation for the samples located at `extras/samples/`.


**Resolved Issues**

- Fixed collection of context switch data during [PM Sampling](https://docs.nvidia.com/nsight-compute/ProfilingGuide/index.html.md#pm-sampling) when using [Range Replay](https://docs.nvidia.com/nsight-compute/ProfilingGuide/index.html.md#range-replay).

- Fixed a potential crash of NVIDIA Nsight Compute when an invalid regular expression was provided as a requested metric.

- Improved the performance of NVIDIA Nsight Compute in cases where only a single process is being profiled and `--target-processes all` was specified.

- Fixed an issue where register counts reported on the Source Page were too high.

- Fixed a bug that could cause a GPU fault while collecting SW counters through PerfWorks.

- Fixed incorrect baseline values being shown for the Runtime Improvement values on the Summary Page.


#### Updates in 2023.3 [](https://docs.nvidia.com/nsight-compute/ReleaseNotes/index.html\#updates-in-2023-3 "Permalink to this headline")

**General**

- NVIDIA Nsight Compute now supports collecting many metrics by [sampling the GPU’s performance monitors (PM)](https://docs.nvidia.com/nsight-compute/ProfilingGuide/index.html.md#pm-sampling) periodically at fixed intervals. The results can be visualized on a [timeline](https://docs.nvidia.com/nsight-compute/NsightCompute/index.html.md#details-page).

- Added WSL profiling support on Windows 10 WSL with OS build version 19044 and greater. WSL profiling is not supported on Windows 10 WSL for systems that exceed 1 TB of system memory.

- Rule outputs are prioritized to improve the accuracy of estimated speedups. The [Summary](https://docs.nvidia.com/nsight-compute/NsightCompute/index.html.md#summary-page) page now shows the most actionable optimization advice when a result row is selected.

- Improved the handling and reporting for unavailable metrics during collection and when applying rules.

- Added _instructionMix_ sample CUDA application and document to show how to use NVIDIA Nsight Compute to analyze and identify the performance bottleneck due to an imbalanced instruction mix. Refer to the `README.TXT` file, sample code, and document under `extras/samples/instructionMix`.


**NVIDIA Nsight Compute**

- Added support to see the source files of two profile results side by side using [Source Comparison](https://docs.nvidia.com/nsight-compute/NsightCompute/index.html.md#source-comparison). This allows you to quickly identify source differences and understand changes in metric values.

- The [Summary](https://docs.nvidia.com/nsight-compute/NsightCompute/index.html.md#summary-page) page is now the default page when a report is opened. Previous behavior can be enabled in the [options](https://docs.nvidia.com/nsight-compute/NsightCompute/index.html.md#profile) dialog.

- On the [Summary](https://docs.nvidia.com/nsight-compute/NsightCompute/index.html.md#summary-page) and [Raw](https://docs.nvidia.com/nsight-compute/NsightCompute/index.html.md#raw-page) pages, values from all/selected rows are automatically aggregated in the column header for applicable metrics. Selected individual cells are aggregated in the bottom status bar.

- Added `Launch Name` and `Device` options to the filter dialog opened by the `Apply Filters` button in the [report header](https://docs.nvidia.com/nsight-compute/NsightCompute/index.html.md#profiler-report-header).

- Added support for [source view profiles](https://docs.nvidia.com/nsight-compute/NsightCompute/index.html.md#profiles) that persist the Source page configuration and allow you to re-apply it to other reports.

- The [Metric Details](https://docs.nvidia.com/nsight-compute/NsightCompute/index.html.md#metric-details) tool window now supports querying metrics beyond the current report by using the `chip:<chipname>` tag in the search.

- Added support for _CUDA Graph Edge Data_ (such as port and dependency type) and _CUDA Graph Conditional Handles_ in the [Resources](https://docs.nvidia.com/nsight-compute/NsightCompute/index.html.md#resources) tool window.

- The [Acceleration Structure Viewer](https://docs.nvidia.com/nsight-compute/NsightCompute/index.html.md#acceleration-structure-viewer) and [Resources](https://docs.nvidia.com/nsight-compute/NsightCompute/index.html.md#resources) tool window now support OptiX Opacity Micromaps.


**NVIDIA Nsight Compute CLI**

- Tracking and profiling all child processes (`--target-processes all`) is now the default for ncu.

- Improved reporting of requested but unavailable metrics. Metrics requested in section files are by default considered optional and only cause a warning to be shown.
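
Scripts that must work with tool versions on both sides of this default change can add the flag only where it is still needed. The following is a minimal sketch of that idea, not an official helper: the function name `ncu_command`, the `(year, minor)` version tuple, and the `./my_app` path are hypothetical, and the sketch assumes only that releases before 2023.3 needed `--target-processes all` spelled out to track child processes.

```python
def ncu_command(app_argv, ncu_version):
    """Build an ncu argument list that tracks all child processes.

    ncu_version is a (year, minor) tuple such as (2023, 2); this tuple
    format is an illustration-only convention, not something ncu reports.
    """
    command = ["ncu"]
    # Starting with 2023.3, tracking all child processes is the default,
    # so the flag is only added for older releases.
    if ncu_version < (2023, 3):
        command += ["--target-processes", "all"]
    return command + list(app_argv)
```

Passing the flag explicitly on 2023.3 and later should also be harmless, so the version check mainly documents why older scripts carried it.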


**Resolved Issues**

- Support for tracking child processes launched with `system()` is now available on Linux ppc64le.

- Improved the behavior of following SASS navigation links on the Source page.

- Fixed issues with profiling CUDA graphs in graph-profiling mode when nodes are associated with a non-current CUDA context.

- Fixed an issue in L2 bandwidth calculations in the hierarchical roofline sections.


#### Updates in 2023.2.2 [](https://docs.nvidia.com/nsight-compute/ReleaseNotes/index.html\#updates-in-2023-2-2 "Permalink to this headline")

**Resolved Issues**

- Fixed a possible crash when profiling CUDA graphs on multiple GPUs.

- Fixed the detection mechanism of the C2C interface, i.e. the metric `c2clink__present`. The fix requires the display driver shipping with this release or any newer driver.


#### Updates in 2023.2.1 [](https://docs.nvidia.com/nsight-compute/ReleaseNotes/index.html\#updates-in-2023-2-1 "Permalink to this headline")

**Resolved Issues**

- Fixed a crash during application replay when the temporary directory is located on a network file system (NFS).

- Improved the detection mechanism for the C2C interface. Added caching of the detected configuration to reduce overhead.


#### Updates in 2023.2 [](https://docs.nvidia.com/nsight-compute/ReleaseNotes/index.html\#updates-in-2023-2 "Permalink to this headline")

**General**

- Extended the rules system to show estimates of the potential speedup that can be achieved by addressing the corresponding performance bottleneck. These speedups allow prioritizing applicable rules and help you focus first on the optimization strategies with the highest potential performance gain.

- Added support for rules to highlight individual source lines. Lines with global/local memory accesses that have high excessive sector counts, and lines with shared memory accesses that have many bank conflicts, are automatically detected and highlighted.

- Added the ability to query metric attributes in the NvRules API.

- Added support for creating instanced metrics through the NvRules API.

- For Orin+ mobile chips on the Linux aarch64 platform, added support for metrics (`mcc__*`) of the memory controller channel (MC Channel) unit, which connects to the DRAM.


**NVIDIA Nsight Compute**

- Added hyperlinks to the SASS View of the Source Page for instructions that reference others by address or offset. This lets you quickly jump to the target instruction of a branch.

- Improved the search bar in the Metric Details tool window. The search string now matches any part of the metric names, and the matching results are shown in a sorted order.

- Added a visual indication of the scale of metric value changes when baselines are used. The background bars in the table cells of the Details Page let you quickly identify which metric values increased or decreased the most. The color scheme can be configured in the [Baselines tool window](https://docs.nvidia.com/nsight-compute/NsightCompute/index.html.md#tool-window-baselines).

- Added a rules toggle button to the Summary Page. It hides the bottom pane with the rules output for the selected kernel launch.

- Added support for configuring properties on the [Summary Page](https://docs.nvidia.com/nsight-compute/NsightCompute/index.html.md#profiler-report-summary-page) using the [Metrics/Properties profile option](https://docs.nvidia.com/nsight-compute/NsightCompute/index.html.md#options-profile).

- Added percentage bars on the [Summary Page](https://docs.nvidia.com/nsight-compute/NsightCompute/index.html.md#profiler-report-summary-page).


**NVIDIA Nsight Compute CLI**

- Added support for tracking child processes launched with `posix_spawn(p)` when using `--target-processes all`.

- Added support for tracking child processes launched with `system()` on Windows and Linux (aarch64, x86\_64) when using `--target-processes all`.


**Resolved Issues**

- Fixed table alignment in the output of the NVIDIA Nsight Compute CLI on Windows when printing Unicode characters.

- Fixed view corruption in the Source Page after switching from the collapsed view to the expanded view.

- Fixed missing tooltip descriptions for some SASS instructions.

- Fixed a potential crash when copying from the Resources tool window using CTRL+C.

- Fixed a possible crash when restoring sections in the Sections tool window.


#### Updates in 2023.1.1 [](https://docs.nvidia.com/nsight-compute/ReleaseNotes/index.html\#updates-in-2023-1-1 "Permalink to this headline")

**NVIDIA Nsight Compute**

- Added new configuration options to set the default view mode and precision for the Source page.


**Resolved Issues**

- Added support for the `DT_RUNPATH` attribute when intercepting calls to `dlopen`. This fixes an issue where applications or libraries relying on `DT_RUNPATH` could not find all dynamic libraries when launched by NVIDIA Nsight Compute.

- Improved interaction between custom additional metrics and the selected metric set. Adding custom metrics no longer forces switching to the custom metric set.

- Added ability to gracefully skip folders with insufficient access permissions while importing source code.

- Fixed the calculation of the peak values for the L1 and L2 cache bandwidths in the hierarchical roofline charts.

- Fixed an issue that prevented modules loaded with the function `optixModuleCreateFromPTX` from showing up in the _Optix: Modules_ table of the _Resources_ tool window.

- Fixed handling of deprecated functions when querying function pointers from the OptiX interception library.

- Fixed an issue where sections or rules sometimes couldn’t be easily selected in the tool window.

- Fixed an issue with _Reset Application Data_ that prevented some settings from resetting correctly.

- Fixed a potential crash of NVIDIA Nsight Compute when _Reset Application Data_ was executed multiple times in a row.

- Fixed a crash when saving or loading baselines for non-kernel results.

- Fixed an issue where memory written while executing a CUDA graph was not properly restored in single-pass graph profiling.

- Fixed a potential memory leak while collecting SW counters for modules with unpatched kernel functions.


#### Updates in 2023.1 [](https://docs.nvidia.com/nsight-compute/ReleaseNotes/index.html\#updates-in-2023-1 "Permalink to this headline")

**General**

- Added support for the CUDA toolkit 12.1.

- Added a new [app-range](https://docs.nvidia.com/nsight-compute/ProfilingGuide/index.html.md#application-range-replay) replay mode to profile ranges without API capture by relaunching the entire application multiple times.

- Added _sharedBankConflicts_ sample CUDA application and document to show how NVIDIA Nsight Compute can be used to analyze and identify the shared memory bank conflicts which result in inefficient shared memory accesses. Refer to the `README.TXT` file, sample code and document under `extras/samples/sharedBankConflicts`.

- Jupyter notebook samples are available in the Nsight training [GitHub repository](https://github.com/NVIDIA/nsight-training/blob/master/cuda/nsight_compute/python_report_interface).

- The equivalent of the [high-level Python report interface](https://docs.nvidia.com/nsight-compute/CustomizationGuide/index.html.md#python-report-interface-high-level) is now available in rule files.


**NVIDIA Nsight Compute**

- Added support for profiling individual metrics in [Interactive Profile activity](https://docs.nvidia.com/nsight-compute/NsightCompute/index.html.md#connection-activity-interactive). A new input field for metrics was added in the [Metric Selection](https://docs.nvidia.com/nsight-compute/NsightCompute/index.html.md#tool-window-sections-info) tool window.

- Files on remote systems can be opened directly from the [menu](https://docs.nvidia.com/nsight-compute/NsightCompute/index.html.md#main-menu).

- Metric- and section-related entries in the menu, [Profile activity](https://docs.nvidia.com/nsight-compute/NsightCompute/index.html.md#connection-activity-non-interactive) and [Metric Selection](https://docs.nvidia.com/nsight-compute/NsightCompute/index.html.md#tool-window-sections-info) tool window were renamed to make them clearer.

- CPU and GPU [NUMA topology metrics](https://docs.nvidia.com/nsight-compute/ProfilingGuide/index.html.md#metrics-reference) can be collected on applicable systems. Topology information is shown in a new [NUMA Affinity section](https://docs.nvidia.com/nsight-compute/ProfilingGuide/index.html.md#sections-and-rules).

- Added content-aware suggestions to the Details page that are based on the selected profiling options.

- Added support for [re-resolving source files](https://docs.nvidia.com/nsight-compute/NsightCompute/index.html.md#profiler-report-source-page-navigation) on the Source page.

- Not-issued warp stall reasons are removed from the Source Counters section tables and hidden by default on the Source page. Users should focus on regular warp stall reasons by default and only inspect not-issued samples if this distinction is needed.

- Added support for searching for missing CUDA source files and permanently importing them into the report, using [Source Lookup options](https://docs.nvidia.com/nsight-compute/NsightCompute/index.html.md#options-source-lookup) in the [Interactive Profile activity](https://docs.nvidia.com/nsight-compute/NsightCompute/index.html.md#connection-activity-interactive).

- The [source page](https://docs.nvidia.com/nsight-compute/NsightCompute/index.html.md#profiler-report-source-page-metrics) now shows metric values as percentages by default. New buttons are added to support switching between different value modes.


**NVIDIA Nsight Compute CLI**

- Added support for config files in the current working or user directory to set default ncu parameters. See the [General options](https://docs.nvidia.com/nsight-compute/NsightComputeCli/index.html.md#command-line-options-general) for more details.

- Added the `--range-filter` [command line option](https://docs.nvidia.com/nsight-compute/NsightComputeCli/index.html.md#command-line-options-console-output), which allows selecting a subset of the enabled profile ranges.

- Added the new `--source-folders` [command line option](https://docs.nvidia.com/nsight-compute/NsightComputeCli/index.html.md#command-line-options-profile), which recursively searches for missing CUDA source files to permanently import into the report.


**Resolved Issues**

- Fixed performance issues on the Summary and Raw pages for large reports.

- Improved support for non-ASCII characters in filenames.

- Fixed an issue with delayed updates of assembly analysis information on the Source page’s Source and PTX views.

- Fixed potential crashes when using the Python report interface.


#### Updates in 2022.4.1 [](https://docs.nvidia.com/nsight-compute/ReleaseNotes/index.html\#updates-in-2022-4-1 "Permalink to this headline")

**General**

- Improved the documentation for the NvRules API.

- The Python report interface now links libstdc++ statically.


**Resolved Issues**

- Fixed an issue that enabled profiling on CUDA Graph uploads.

- Fixed formatting issues during unit conversion of metric instances.

- Fixed an issue that could lead to a crash during application replay.

- Fixed an issue that could lead to a crash in the Python report interface.

- Fixed typos in the metrics reference documentation and descriptions.


#### Updates in 2022.4 [](https://docs.nvidia.com/nsight-compute/ReleaseNotes/index.html\#updates-in-2022-4 "Permalink to this headline")

**General**

- Added support for the CUDA toolkit 12.0.


- Added support for profiling [CUDA graphs](https://docs.nvidia.com/cuda/cuda-driver-api/group__CUDA__GRAPH.html.md#group__CUDA__GRAPH) as complete workloads instead of as single kernel nodes. Enable this using the _Graph Profiling_ option in the [activities](https://docs.nvidia.com/nsight-compute/NsightCompute/index.html.md#connection-dialog). Similarly to [range replay](https://docs.nvidia.com/nsight-compute/ProfilingGuide/index.html.md#range-replay) results, selected metrics are not available when profiling graphs.


- Added support for profiling on Windows Subsystem for Linux (WSL2). See the [System Requirements](https://docs.nvidia.com/nsight-compute/ReleaseNotes/index.html.md#system-requirements) for more details.

- Deprecated `nv-nsight-cu` and `nv-nsight-cu-cli` aliases are removed in favor of `ncu-ui` and `ncu`.
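
Existing automation that still invokes the removed aliases needs to switch to the new names. As a rough sketch of how such a migration could be scripted, the helper below rewrites the first token of a command line using only the two renames listed above. The function name `migrate_command` and the sample command are hypothetical, and the sketch deliberately ignores aliases invoked through an absolute path.

```python
import shlex

# Renames stated in the release notes: removed alias -> current tool name.
RENAMED_TOOLS = {
    "nv-nsight-cu": "ncu-ui",
    "nv-nsight-cu-cli": "ncu",
}


def migrate_command(command_line):
    """Replace a removed alias at the start of a shell command line."""
    tokens = shlex.split(command_line)
    # Only the program name (first token) is rewritten; arguments are kept.
    if tokens and tokens[0] in RENAMED_TOOLS:
        tokens[0] = RENAMED_TOOLS[tokens[0]]
    return shlex.join(tokens)
```

For example, `migrate_command("nv-nsight-cu-cli ./my_app")` yields a command that starts with `ncu`, while command lines that already use `ncu` or `ncu-ui` pass through unchanged.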


**NVIDIA Nsight Compute**

- The [Source page](https://docs.nvidia.com/nsight-compute/NsightCompute/index.html.md#profiler-report-source-page) now loads disassembly and static analysis results asynchronously in the background.

- Added a new [Metric Details](https://docs.nvidia.com/nsight-compute/NsightCompute/index.html.md#tool-window-metric-details) tool window to inspect metric information such as raw value, unit, description or instance values. Open the tool window and select a metric on the _Details_ or _Raw_ page, or look up any metric in the focused report directly in the tool window’s search bar.

- In the Source page PTX view, the source name will be shown as a list of comma-separated files.

- Added flexibility for NVTX-based filtering in the _Next Trigger_ filter, similar to the command line. Filters can now use nvtx-include and nvtx-exclude expressions by adding the `nvtx-include:` or `nvtx-exclude:` prefix.

- NVTX views now show the payload type.

- Simplified the command line generated by the [Profile activity](https://docs.nvidia.com/nsight-compute/NsightCompute/index.html.md#connection-activity-non-interactive).

- Reduced the number of steps required to re-run the [Profile activity](https://docs.nvidia.com/nsight-compute/NsightCompute/index.html.md#connection-activity-non-interactive).

- The way to rename [Baselines](https://docs.nvidia.com/nsight-compute/NsightCompute/index.html.md#baselines) in-place has been improved.

- The [Resources](https://docs.nvidia.com/nsight-compute/NsightCompute/index.html.md#tool-window-resources) tool window now shows the CUDA Dynamic Parallelism state for CUDA functions and modules.

- OptiX traversable handles can now be exported as [Graphviz](https://graphviz.org/) DOT or SVG files for visualization from the Resources tool window.

- All OptiX build, instance and geometry flags can be viewed in the [Acceleration Structure Viewer](https://docs.nvidia.com/nsight-compute/NsightCompute/index.html.md#acceleration-structure-viewer).

- Added OptiX-specific highlight filters to the Acceleration Structure Viewer.

- Added support for user-specified index strides to the Acceleration Structure Viewer.


**NVIDIA Nsight Compute CLI**

- Added new option `--graph-profiling graph` to enable profiling of complete CUDA graphs as single workloads.


- Added new option `--filter-mode per-launch-config` to enable filtering of kernel launches for each GPU launch parameter separately.

- Added support for printing section body item metrics on the details page with the new `--print-details` [command line option](https://docs.nvidia.com/nsight-compute/NsightComputeCli/index.html.md#command-line-options-console-output).

- Added support for selecting what to show in the Metric Name column on the details page with the new `--print-metric-name` [command line option](https://docs.nvidia.com/nsight-compute/NsightComputeCli/index.html.md#command-line-options-console-output).

- Removed deprecated options: `--units`, `--fp`, `--summary` and `--kernel-base`.

- Added support for printing launch, session, process and device attributes on the session page with the new `--page session` option.

- Added `--kill yes` support for application replay mode.
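
To illustrate how the new and removed options in this list interact in a launch script, here is a small sketch that assembles an `ncu` argument list for whole-graph profiling and refuses options this release removed. The helper names `removed_options_in` and `graph_profiling_command` and the `./my_app` path are hypothetical; only the option spellings are taken from the list above.

```python
# Options removed in 2022.4, as listed in the release notes.
REMOVED_OPTIONS = {"--units", "--fp", "--summary", "--kernel-base"}


def removed_options_in(options):
    """Return the tokens in `options` that name a removed ncu option."""
    return [option for option in options if option in REMOVED_OPTIONS]


def graph_profiling_command(app_argv, extra_options=()):
    """Build an ncu argument list that profiles complete CUDA graphs."""
    removed = removed_options_in(extra_options)
    if removed:
        raise ValueError(f"options removed in 2022.4: {', '.join(removed)}")
    # `--graph-profiling graph` treats each CUDA graph as a single workload.
    return ["ncu", "--graph-profiling", "graph", *extra_options, *app_argv]
```

The resulting list can be passed to `subprocess.run` on a machine where `ncu` is installed. Additional options such as `--filter-mode per-launch-config` go in `extra_options`.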


**Resolved Issues**

- Fixed an issue where NVIDIA Nsight Compute could crash when continuing profiling after transposing the _Raw_ page table.

- Fixed an issue that caused closing a report document to be delayed by pending source analysis.

- Fixed support for profiling applications with older OptiX versions.

- Fixed display of OptiX module inputs for IR and built-in modules.


#### Updates in 2022.3 [](https://docs.nvidia.com/nsight-compute/ReleaseNotes/index.html\#updates-in-2022-3 "Permalink to this headline")

**General**

- Added support for the CUDA toolkit 11.8.

- Added support for the Ada GPU architecture.

- Added support for the Hopper GPU architecture.

- Added support for [OptiX 7.6](https://docs.nvidia.com/nsight-compute/ReleaseNotes/index.html.md#library-support-optix).

- Added _uncoalescedGlobalAccesses_ sample CUDA application and document to show how the NVIDIA Nsight Compute profiler can be used to analyze and identify the memory accesses which are uncoalesced and result in inefficient DRAM accesses. Refer to the README, sample code and document under `extras/samples/uncoalescedGlobalAccesses`.

- Added [Metrics Reference](https://docs.nvidia.com/nsight-compute/ProfilingGuide/index.html.md#metrics-reference) in the documentation that lists metrics not available through `--query-metrics`.

- Reduced the overhead of collecting SASS-patching based metrics.

- On [Multi-Instance GPU (MIG)](https://docs.nvidia.com/nsight-compute/ProfilingGuide/index.html.md#special-configurations-mig) configurations, NVIDIA Nsight Compute can no longer lock clocks. Users are expected to lock clocks externally using `nvidia-smi`.
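
For the external clock locking mentioned in the MIG item above, one possible approach is to wrap the `nvidia-smi` calls in a script that runs before and after profiling. The sketch below only builds the argument lists. It assumes the `nvidia-smi` options `-i`, `--lock-gpu-clocks` and `--reset-gpu-clocks`, which are not described in these release notes, and the helper names, GPU index and clock value are hypothetical. Whether a given clock can be locked depends on the GPU and usually requires administrator privileges.

```python
def lock_clocks_command(gpu_index, clock_mhz):
    """nvidia-smi arguments that pin one GPU's clock to a single value."""
    return [
        "nvidia-smi",
        "-i", str(gpu_index),
        # Using the same value for minimum and maximum fixes the clock.
        f"--lock-gpu-clocks={clock_mhz},{clock_mhz}",
    ]


def reset_clocks_command(gpu_index):
    """nvidia-smi arguments that restore default clock behavior."""
    return ["nvidia-smi", "-i", str(gpu_index), "--reset-gpu-clocks"]
```

A profiling script could run the first command with `subprocess.run(..., check=True)`, invoke `ncu`, and then run the reset command in a `finally` block so the clocks are restored even if profiling fails.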


**NVIDIA Nsight Compute**

- Wrapper script `nv-nsight-cu` is deprecated in favor of `ncu-ui` and will be removed in a future release.

- The Source page now supports range replay results.

- Added a second chart on the Compute Workload Analysis section to avoid mixing metrics with different meanings.

- NVIDIA Nsight Compute now tracks traversable handles created with `optixAccelRelocate`.

- NVIDIA Nsight Compute now tracks traversable handles created as updates from others.

- The Acceleration Structure viewer now reports unsupported inputs.

- The Acceleration Structure viewer now supports opening multiple traversable handles.

- The Acceleration Structure viewer now uses OptiX naming for displayed elements.


**NVIDIA Nsight Compute CLI**

- Wrapper script `nv-nsight-cu-cli` is deprecated in favor of `ncu` and will be removed in a future release.

- Added new option `--filter-mode per-gpu` to enable filtering of kernel launches on each GPU separately.

- Added new option `--app-replay-mode relaxed` to produce profiling results for valid kernels even if the number of kernel launches is inconsistent across application replay passes.

- Added a documentation section on supported [environment variables](https://docs.nvidia.com/nsight-compute/NsightComputeCli/index.html.md#environment-variables).

- Improved the performance when loading existing reports on the command line.


**Resolved Issues**

- Fixed an issue when resolving files on the Source page.

- Fixed an issue when profiling OptiX applications.

- Fixed an issue in the OptiX traversable handle management caused by clashing handle values.

- Fixed an issue in the Acceleration Structure viewer causing the display of invalid memory when viewing AABB buffers.


#### Updates in 2022.2.1 [](https://docs.nvidia.com/nsight-compute/ReleaseNotes/index.html\#updates-in-2022-2-1 "Permalink to this headline")

**Resolved Issues**

- Fixed an issue that caused some tooltips to not show up for the charts on the Details page.

- Fixed the incorrect reporting of the accessed bytes for LDGSTS (access) traffic in the L1TEX memory table.

- Fixed an issue that resulted in an empty view on the Source page after resolving multiple source files at once.

- Fixed a failure to connect to remote machines over SSH due to a mismatch in the configuration of data compression.

- Fixed a potential failure to profile kernels on multiple devices of the same type of chip. The failure occurred when attempting to profile on the second device.


#### Updates in 2022.2 [](https://docs.nvidia.com/nsight-compute/ReleaseNotes/index.html\#updates-in-2022-2 "Permalink to this headline")

**General**

- Added support for the CUDA toolkit 11.7.

- Improved performance for profiling and metric query.

- Added Linux (aarch64 sbsa) as a supported [host platform](https://docs.nvidia.com/nsight-compute/ReleaseNotes/index.html.md#platform-support).

- The NVIDIA Nsight Compute CLI stores the command line arguments, which can be viewed in the [Session](https://docs.nvidia.com/nsight-compute/NsightCompute/index.html.md#profiler-report-session-page) report page.

- Added an API to query the version of the [Python Report](https://docs.nvidia.com/nsight-compute/CustomizationGuide/index.html.md#python-report-interface) and [NvRules](https://docs.nvidia.com/nsight-compute/NvRulesAPI/index.html.md#abstract) interfaces.

- Added an API to query the PTX in the [Python Report](https://docs.nvidia.com/nsight-compute/CustomizationGuide/index.html.md#python-report-interface) and [NvRules](https://docs.nvidia.com/nsight-compute/NvRulesAPI/index.html.md#abstract) interfaces.


**NVIDIA Nsight Compute**

- The [Acceleration Structure Viewer](https://docs.nvidia.com/nsight-compute/NsightCompute/index.html.md#acceleration-structure-viewer) allows inspection of acceleration structures built using the OptiX API for debugging and performance optimization.

- The [Source page](https://docs.nvidia.com/nsight-compute/NsightCompute/index.html.md#profiler-report-source-page) column chooser now supports enabling or disabling groups of metrics. Note that, to make the view easier to use, not all metrics are enabled by default anymore.

- The [Resources](https://docs.nvidia.com/nsight-compute/NsightCompute/index.html.md#tool-window-resources) tool window now links to the exact target resource instances for _CUDA_ resource types.

- The [Resources](https://docs.nvidia.com/nsight-compute/NsightCompute/index.html.md#tool-window-resources) tool window now shows the instanced nodes for CUDA graphs.

- The [Resources](https://docs.nvidia.com/nsight-compute/NsightCompute/index.html.md#tool-window-resources) tool window now shows the loading state and number of loaded functions for _CUDA Modules_.

- The [Resources](https://docs.nvidia.com/nsight-compute/NsightCompute/index.html.md#tool-window-resources) tool window now shows the graph node enablement state for applicable instanced graph nodes.

- The [Resources](https://docs.nvidia.com/nsight-compute/NsightCompute/index.html.md#tool-window-resources) tool window now shows the graph node priorities for instanced kernel graph nodes.

- Added regex support in the _Next Trigger_ filter for NVTX based filtering. The _Next Trigger_ filter now considers the NVTX config as a regular expression if the `regex:` prefix is specified.

- Added regex support in the report’s _Filter Results_ dialog.

- Added [keyboard shortcuts](https://docs.nvidia.com/nsight-compute/NsightCompute/index.html.md#quick-start-navigate-report) to navigate between the pages in a report.

- The behavior for selecting sets and sections is now consistent between the [Sections/Rules Info window](https://docs.nvidia.com/nsight-compute/NsightCompute/index.html.md#tool-window-sections-info) and the [non-interactive profile activity](https://docs.nvidia.com/nsight-compute/NsightCompute/index.html.md#quick-start-non-interactive).

- Reports can now be opened directly from the welcome dialog.


**NVIDIA Nsight Compute CLI**

- Added support for collecting [sampling-based](https://docs.nvidia.com/nsight-compute/ProfilingGuide/index.html.md#sampling) warp stalls in range replay mode.

- Added regex support in [NVTX filtering](https://docs.nvidia.com/nsight-compute/NsightComputeCli/index.html.md#nvtx-filtering).

- The metric type is shown when querying metrics.


**Resolved Issues**

- Reduced overhead of connecting to the host UI for non-interactive remote profiling sessions.

- Fixed issues with persisting the Source page state when collapsing or switching between results.

- Fixed an issue where locked GPU clocks were not reset when terminating the NVIDIA Nsight Compute CLI while profiling a kernel.

- Fixed issues with selecting and copying text from the Details page tables.

- Fixed an issue with opening report files in the UI on macOS.

- Fixed an issue with the _Freeze API_ option.


#### Updates in 2022.1.1 [](https://docs.nvidia.com/nsight-compute/ReleaseNotes/index.html\#updates-in-2022-1-1 "Permalink to this headline")

**General**

- Filtering kernel launches or profile results based on NVTX domains/ranges now takes registered strings in the payload field into account, if the range name is empty.

- Added support for the suffix `.max_rate` for ratio metrics.


**Resolved Issues**

- Fixed a crash during the disassembly of the kernel’s SASS code for the Source page.

- Fixed a crash on exit of the NVIDIA Nsight Compute UI.

- Fixed a hang during profiling when CPU call stack collection is enabled.

- Fixed a missing flush of UVM buffers before taking memory checkpoints during [Range Replay](https://docs.nvidia.com/nsight-compute/ProfilingGuide/index.html.md#range-replay).

- Fixed tracking of memory during [Range Replay](https://docs.nvidia.com/nsight-compute/ProfilingGuide/index.html.md#range-replay), if the CUDA context has any device mapped memory allocations.

- Fixed the maximum available shared memory sizes in the [Occupancy Calculator](https://docs.nvidia.com/nsight-compute/NsightCompute/index.html.md#occupancy-calculator) for NVIDIA Ampere GPUs.

- Fixed that the shared memory usage of the kernel was incorrectly initialized when opening the [Occupancy Calculator](https://docs.nvidia.com/nsight-compute/NsightCompute/index.html.md#occupancy-calculator) from a profile report.


#### Updates in 2022.1 [](https://docs.nvidia.com/nsight-compute/ReleaseNotes/index.html\#updates-in-2022-1 "Permalink to this headline")

**General**

- Added support for the CUDA toolkit 11.6.

- Added support for GA103 chips.

- Added a new [Range Replay](https://docs.nvidia.com/nsight-compute/ProfilingGuide/index.html.md#range-replay) mode to profile ranges of multiple, concurrent kernels. Range replay is available in the NVIDIA Nsight Compute CLI and the non-interactive Profile activity.

- Added a new rule to detect non-fused floating-point instructions.

- The Uncoalesced Memory access rules now show results in a dynamic table.

- Unix Domain Sockets and Windows Named Pipes are used for local connection between the host and target processes on x86\_64 Linux and Windows, respectively.

- The [NvRules API](https://docs.nvidia.com/nsight-compute/NvRulesAPI/index.html.md#abstract) now supports querying action names using different function name bases (e.g. demangled).


**NVIDIA Nsight Compute**

- The default [report page](https://docs.nvidia.com/nsight-compute/NsightCompute/index.html.md#profiler-report-pages) is now chosen automatically when opening a report.

- Added coverage for ECC (Error Correction Code) operations in the L2 Cache table of the Memory Analysis section.

- Added a new [L2 Evict Policies](https://docs.nvidia.com/nsight-compute/ProfilingGuide/topics/memory-tables-l2-evict-policy.html.md) table to the Memory Analysis section.

- The [Occupancy Calculator](https://docs.nvidia.com/nsight-compute/NsightCompute/index.html.md#occupancy-calculator) now updates automatically when the input changes.

- Added new metric _Thread Instructions Executed_ to the Source page.

- Added tooltips to the [Register Dependency](https://docs.nvidia.com/nsight-compute/NsightCompute/index.html.md#profiler-report-source-page) columns in the Source page to identify the associated register more conveniently.

- Improved the selection of Sections and Sets in the Profile activity connection dialog.

- NVLink utilization is shown in the NVLink Tables section.

- NVLink links are colored according to the measured throughput.


**NVIDIA Nsight Compute CLI**

- `--kernel-regex` and `--kernel-regex-base` options are no longer supported. Alternate options are `--kernel-name` and `--kernel-name-base` respectively, added in 2021.1.0.

- Added support to resolve CUDA source files in the `--page source` output with the new `--resolve-source-file` [command line option](https://docs.nvidia.com/nsight-compute/NsightComputeCli/index.html.md#command-line-options-console-output).

- Added new option `--target-processes-filter` to filter the processes being profiled by name.

- The CPU Stack Trace is shown in the NVIDIA Nsight Compute CLI output.
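
For scripts that still pass the removed kernel filter options, the migration can be sketched as follows. This is a minimal illustration, not part of the tool: the helper `migrate_kernel_filter_args` and the `./app` target are hypothetical, and it assumes that a value formerly given to `--kernel-regex` should now carry the `regex:` prefix that regex-capable options expect.

```python
def migrate_kernel_filter_args(args):
    """Rewrite the removed kernel filter options to their replacements.

    Hypothetical helper for illustration only; it is not shipped with
    Nsight Compute.
    """
    migrated = []
    i = 0
    while i < len(args):
        arg = args[i]
        if arg == "--kernel-regex" and i + 1 < len(args):
            # Assumption: the old option always took a regular expression,
            # so the value gets an explicit `regex:` prefix.
            migrated += ["--kernel-name", "regex:" + args[i + 1]]
            i += 2
        elif arg == "--kernel-regex-base" and i + 1 < len(args):
            # The base value (e.g. demangled) is passed through unchanged.
            migrated += ["--kernel-name-base", args[i + 1]]
            i += 2
        else:
            migrated.append(arg)
            i += 1
    return migrated


old_command = ["ncu", "--kernel-regex", "gemm.*",
               "--kernel-regex-base", "demangled", "./app"]
print(migrate_kernel_filter_args(old_command))
```

Arguments other than the two removed options are copied through untouched, so the helper can be applied to a complete command line.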


**Resolved Issues**

- Fixed the calculation of aggregated average instruction execution metrics in non-SASS views on the Source page.

- Fixed that atomic instructions are counted as both loads and stores in the Memory Analysis tables.


#### Updates in 2021.3.1 [](https://docs.nvidia.com/nsight-compute/ReleaseNotes/index.html\#updates-in-2021-3-1 "Permalink to this headline")

**Resolved Issues**

- Fixed that kernels with the same name and launch configuration were in some scenarios associated with the wrong profiling results during application replay.

- Fixed an issue with binary forward compatibility of the report format.

- Fixed an issue with applications calling into the CUDA API during process teardown.

- Fixed an issue when profiling applications using pre-CUDA API 3.1 contexts.

- Fixed a crash when resolving files on the Source page.

- Fixed that opening reports with large embedded CUBINs would hang the UI.

- Fixed an issue with remote profiling on a target where the UI is already launched.


#### Updates in 2021.3 [](https://docs.nvidia.com/nsight-compute/ReleaseNotes/index.html\#updates-in-2021-3 "Permalink to this headline")

**General**

- Added support for the CUDA toolkit 11.5.

- Added a new rule for detecting inefficient memory access patterns in the L1TEX cache and L2 cache.

- Added a new rule for detecting high usage of system or peer memory.

- Added new `IAction::sass_by_pc` function to the [NvRules API](https://docs.nvidia.com/nsight-compute/NvRulesAPI/index.html.md#abstract).

- The [Python-based report interface](https://docs.nvidia.com/nsight-compute/CustomizationGuide/index.html.md#python-report-interface) is now available for Windows and macOS hosts, too.

- Added Hierarchical Roofline section files in a new “roofline” section set.

- Added support for collecting CPU call stack information.


**NVIDIA Nsight Compute**

- Added support for new remote profiling [SSH connection and authentication options](https://docs.nvidia.com/nsight-compute/NsightCompute/index.html.md#remote-connections) as well as local SSH configuration files.

- Added an [Occupancy Calculator](https://docs.nvidia.com/nsight-compute/NsightCompute/index.html.md#occupancy-calculator) which can be opened directly from a profile report or as a new activity. It offers feature parity with the CUDA Occupancy Calculator [spreadsheet](http://docs.nvidia.com/cuda/cuda-occupancy-calculator/index.html).

- Added new [Baselines tool window](https://docs.nvidia.com/nsight-compute/NsightCompute/index.html.md#tool-window-baselines) to manage (hide, update, re-order, save/load) baseline selections.

- The Source page views now support multi-line/cell selection and copy/paste. Different colors are used for highlighting selections and correlated lines.

- The search edit on the Source page now supports _Shift+Enter_ to search in reverse direction.

- The [Memory Workload Analysis Chart](https://docs.nvidia.com/nsight-compute/ProfilingGuide/index.html.md#memory-chart) can be configured to show throughput values instead of transferred bytes.

- The _Profile_ activity now supports the `--devices` option.

- The _NVLink Topology_ diagram displays per NVLink metrics.

- Added a new tool window showing the CPU call stack at the location where the current thread was suspended during interactive profiling activities.

- If enabled, the _Call Stack / NVTX_ page of the profile report shows the captured CPU call stack for the selected kernel launch.


**NVIDIA Nsight Compute CLI**

- Added support for printing source/metric content with the new `--page source` and `--print-source` [command line options](https://docs.nvidia.com/nsight-compute/NsightComputeCli/index.html.md#command-line-options-console-output).

- Added new option `--call-stack` to enable collecting the CPU call stack for every profiled kernel launch.


**Resolved Issues**

- Fixed that `memory_*` metrics could not be collected with the `--metrics` option.

- Fixed that selection and copy/paste was not supported for section header tables on the Details page.

- Fixed issues with the Source page when collapsing the content.

- Fixed that the UI could crash when applying rules to a new profile result.

- Fixed that PC Sampling metrics were not available for _Profile Series_.

- Fixed that local profiling did not work if no non-loopback address was configured for the system.

- Fixed termination of remote-launched applications. On QNX, terminating an application profiled via _Remote Launch_ is now supported. Canceling remote-launched _Profile_ activities is now supported.


#### Updates in 2021.2.8 [](https://docs.nvidia.com/nsight-compute/ReleaseNotes/index.html\#updates-in-2021-2-8 "Permalink to this headline")

**General**

- Updated Python libraries to version 3.10.5.


#### Updates in 2021.2.7 [](https://docs.nvidia.com/nsight-compute/ReleaseNotes/index.html\#updates-in-2021-2-7 "Permalink to this headline")

**General**

- Enabled stack canaries with random canary values for L4T builds.


#### Updates in 2021.2.6 [](https://docs.nvidia.com/nsight-compute/ReleaseNotes/index.html\#updates-in-2021-2-6 "Permalink to this headline")

**Resolved Issues**

- Fixed an issue causing a hang on QNX after pressing `ctrl+c` while profiling a multi-process application.


#### Updates in 2021.2.5 [](https://docs.nvidia.com/nsight-compute/ReleaseNotes/index.html\#updates-in-2021-2-5 "Permalink to this headline")

**Resolved Issues**

- Improved the handling of the performance monitor reservation on mobile target GPUs.


#### Updates in 2021.2.4 [](https://docs.nvidia.com/nsight-compute/ReleaseNotes/index.html\#updates-in-2021-2-4 "Permalink to this headline")

**Resolved Issues**

- Fixed an issue that prevented remote interactive profiling of kernels on NVIDIA GA10b chips.


#### Updates in 2021.2.3 [](https://docs.nvidia.com/nsight-compute/ReleaseNotes/index.html\#updates-in-2021-2-3 "Permalink to this headline")

**General**

- Added support for the NVIDIA GA10b chip.


**Resolved Issues**

- Improved error message on QNX for failure to deploy stock section and rules files.


#### Updates in 2021.2.2 [](https://docs.nvidia.com/nsight-compute/ReleaseNotes/index.html\#updates-in-2021-2-2 "Permalink to this headline")

**General**

- Changes for profiling support on NVIDIA virtual GPUs (vGPUs) for an upcoming GRID/vGPU release.


**Resolved Issues**

- Fixed hang issue on QNX when using the `--target-processes all` option while profiling shell scripts.


#### Updates in 2021.2.1 [](https://docs.nvidia.com/nsight-compute/ReleaseNotes/index.html\#updates-in-2021-2-1 "Permalink to this headline")

**General**

- Reduced the memory overhead when loading reports in the [Python Report Interface](https://docs.nvidia.com/nsight-compute/CustomizationGuide/index.html.md#python-report-interface).


**Resolved Issues**

- Fixed that links in the _Memory Allocations_ Resource view were not working correctly.

- Fixed that NVTX state might not be correctly reset between interactive profiling activities.

- Fixed that the UI could crash when opening baselines from different GPU architectures.


#### Updates in 2021.2 [](https://docs.nvidia.com/nsight-compute/ReleaseNotes/index.html\#updates-in-2021-2 "Permalink to this headline")

**General**

- Added support for the CUDA toolkit 11.4.

- Added support for OptiX version 7.3.

- Added support for profiling on [NVIDIA virtual GPUs](https://www.nvidia.com/en-us/data-center/virtual-gpu-technology.md/) (vGPUs) on an upcoming GRID/vGPU release.

- Added a new [Python-based report interface](https://docs.nvidia.com/nsight-compute/CustomizationGuide/index.html.md#python-report-interface) for interacting with report files from Python scripts.

- Added a new rule to warn users when sampling metrics were selected, but no sampling data was collected.

- Renamed _SOL_ to _Throughput_ in the Speed of Light section.

- Renamed several `memory_*` metrics used on the _Source_ page, to better reflect the measured value. See the [Source page](https://docs.nvidia.com/nsight-compute/NsightCompute/index.html.md#profiler-report-source-page) documentation for more details.


**NVIDIA Nsight Compute**

- Added support for opening [cubin files](https://docs.nvidia.com/cuda/cuda-binary-utilities/index.html.md#cuda-binary) in a [Standalone Source Viewer](https://docs.nvidia.com/nsight-compute/NsightCompute/index.html.md#cubin-viewer) without profiling the application.

- Moved the output of all rules so that it is visible even if a section’s body is collapsed. Visibility of the rules’ output can be toggled by a new button in the report header.

- The profiler report header now shows the report name for each baseline when ambiguous.

- Rules can define _Focused Metrics_ that were most important for triggering their result output. Metrics are provided per result message with additional information, such as the underlying conditions and thresholds.

- [Memory tables](https://docs.nvidia.com/nsight-compute/ProfilingGuide/index.html.md#memory-tables) show tooltips for cells with derived metric calculations.

- Added a knowledge base service to show more comprehensive background information on metric names and descriptions in their tooltips.

- Following a link in the Source Counters hot spot tables automatically selects the corresponding metric in the Source page.

- Added new columns for visualizing register dependencies in the SASS view of the [Source page](https://docs.nvidia.com/nsight-compute/NsightCompute/index.html.md#profiler-report-source-page).

- Functions in the SASS view are now sorted by name.

- Added support for OptiX 7.x resource tracking in the interactive profile activity. The _Resources_ tool window will show information on instantiated `optixDeviceContexts`, `optixModules`, `optixProgramGroups`, `optixPipelines` and `optixDenoiser` objects.

- Added support for new CUDA graph memory allocation APIs.

- Improved consistency between command line parameters and the _Next Trigger_ filter in the API Stream window for handling of regex inputs. The _Next Trigger_ filter now considers the kernel/API name as a regular expression only if the string has `regex:` as a prefix.

- Added ability to select font settings in the options dialog.

- Added ability to configure the metrics shown on the summary page via the options dialog.

- The selected heatmap color scale now also applies to the _Memory chart_.

- The ncu-ui script now checks for missing library dependencies, such as OpenGL or [Qt](https://doc.qt.io/qt-5/linux-requirements.html).


**NVIDIA Nsight Compute CLI**

- Added environment variable [NV\_COMPUTE\_PROFILER\_DISABLE\_STOCK\_FILE\_DEPLOYMENT=1](https://docs.nvidia.com/nsight-compute/NsightComputeCli/index.html.md#environment-variables) to skip deployment of section and rule files.


**Resolved Issues**

- Fixed a performance issue in the NVIDIA Nsight Compute CLI when using `--page raw --csv --units auto`.

- The SSH key passphrase is no longer persisted in the project file.

- Fixed the state of the restore button in the connection dialog. The button now supports restoring the default settings if the current settings differ from the defaults.

- Fixed the NVLink topology diagram on macOS so that the complete GPU name is shown.

- Fixed that collapsing the Source view reset the selected metrics.

- Fixed that correlated lines could differ between filtered and unfiltered views of the executed functions.

- Fixed that two application icons were shown in the macOS dock.

- Improved HiDPI awareness.


#### Updates in 2021.1.1 [](https://docs.nvidia.com/nsight-compute/ReleaseNotes/index.html\#updates-in-2021-1-1 "Permalink to this headline")

**General**

- Updated OpenSSL library to version 1.1.1k.


**NVIDIA Nsight Compute**

- Remote source resolution can now use the IP address, in addition to the hostname, to find the necessary SSH target.


**NVIDIA Nsight Compute CLI**

- Added support for the existing command line options for kernel filtering while importing data from an existing report file using `--import`.

- The `-k` option is no longer treated as the deprecated `--kernel-regex` option.


**Resolved Issues**

- Fixed failure to profile kernels from applications that use the CUDA graphics interop APIs to share semaphores.

- Fixed wavefront metric in the L1TEX table for writes to shared memory on GA10x chips.

- Fixed an issue resulting in incomplete data collection for the interactive profile activity after switching from single-pass mode to collecting multiple passes in the same session.

- Fixed values shown in the minimap of the Source page when all functions are collapsed.

- Fixed an issue causing names set by the NVTX naming APIs of one application to be applied to all subsequent sessions of the same instance of NVIDIA Nsight Compute.

- Fixed behavior of horizontal scroll bars when clicking in the source views on the Source page.

- Fixed appearance of multi-line entries in column chooser on the Source page.

- Fixed enablement state of the reset button on the Connection dialog.

- Fixed potential crash of NVIDIA Nsight Compute when the window size becomes small while on the Source page.

- Fixed potential crash of NVIDIA Nsight Compute when relative paths for section/rules files could not be found.

- Fixed potential crash of NVIDIA Nsight Compute after removing baselines.


#### Updates in 2021.1 [](https://docs.nvidia.com/nsight-compute/ReleaseNotes/index.html\#updates-in-2021-1 "Permalink to this headline")

**General**

- Added support for the CUDA toolkit 11.3.

- Added support for the [OptiX 7 API](https://docs.nvidia.com/nsight-compute/ReleaseNotes/index.html.md#library-support-optix).

- `GpuArch` enumeration values used for filtering in section files were renamed from architecture names to compute capabilities.

- NVTX states can now be accessed via the [NvRules API](https://docs.nvidia.com/nsight-compute/NvRulesAPI/index.html.md#abstract).

- Added a rule for the _Occupancy_ section.


**NVIDIA Nsight Compute**

- Added support for new CUDA asynchronous allocator attributes in the _Memory Pools_ resources view.

- Added a topology chart and link properties table in the NVLink section.

- The selected metric column is scrolled into view on the _Source_ page when a new metric is selected.

- Users can choose the _Source_ heatmap color scale in the _Options_ dialog.


**NVIDIA Nsight Compute CLI**

- Added file-based [application replay](https://docs.nvidia.com/nsight-compute/ProfilingGuide/index.html.md#application-replay) as the new default application replay mode. File-based replay uses a temporary file for keeping replay data, instead of allocating them in memory. This keeps the required memory footprint close to constant, independent of the number of profiled kernels. Users can switch between buffer modes using the `--app-replay-buffer` option.

- CLI output now shows NVTX color and message information.

- The `--kernel-regex` and `--kernel-regex-base` options are deprecated and replaced by `--kernel-name` and `--kernel-name-base`, respectively.

- All options that support regex require `regex:` as a prefix before the argument for it to be matched as a regular expression, e.g. `<option> <regex:expression>`.
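
The effect of the `regex:` prefix can be sketched in a few lines of Python. This is an illustration of the rule described above, not the tool's implementation: `matches_filter` and the kernel names are hypothetical, and two behaviors are assumptions here, namely that an argument without the prefix must equal the name exactly and that a regular expression may match any part of the name.

```python
import re

REGEX_PREFIX = "regex:"


def matches_filter(filter_arg, name):
    """Return True if `name` passes the filter argument.

    Hypothetical helper: the argument is treated as a regular expression
    only when it starts with `regex:`.
    """
    if filter_arg.startswith(REGEX_PREFIX):
        pattern = filter_arg[len(REGEX_PREFIX):]
        # Assumption: a partial match anywhere in the name is sufficient.
        return re.search(pattern, name) is not None
    # Assumption: without the prefix, the argument is a literal name.
    return filter_arg == name


kernels = ["vectorAdd", "vectorAddHalf", "reduce"]
print([k for k in kernels if matches_filter("vectorAdd", k)])
print([k for k in kernels if matches_filter("regex:^vectorAdd", k)])
```

Under these assumptions, a pattern such as `vector.*` given without the prefix is compared literally and selects nothing from the list above.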


**Resolved Issues**

- Fixed that baselines were not updated properly on the _Comments_ page.

- Fixed that NVTX ranges named using their payloads can be used in [NVTX filtering](https://docs.nvidia.com/nsight-compute/NsightComputeCli/index.html.md#nvtx-filtering) expressions.

- Fixed crashes on macOS hosts when terminating the target application.

- The NVLink (`nvl*`) metrics have been added back.


#### Updates in 2020.3.1 [](https://docs.nvidia.com/nsight-compute/ReleaseNotes/index.html\#updates-in-2020-3-1 "Permalink to this headline")

**General**

- Added support for LDSM instruction-level metrics.


**NVIDIA Nsight Compute**

- LDSM instruction-level metrics are shown in the _Source_ page and memory tables.

- Improved reporting and documentation for collecting _Profile Series_.

- Frozen columns in the _Source_ page are automatically scrolled into view.


**Resolved Issues**

- Fixed an issue when profiling multi-threaded applications.

- Fixed an issue where NVIDIA Nsight Compute would not automatically restart when using _Reset Application Data_.

- Fixed issues with target applications using libstdc++.

- Fixed an issue when collecting single-pass metrics in multiple Nsight Compute instances.

- Fixed an issue when using _Kernel ID_ and setting _Launch Capture Count_ as non-zero in the UI’s _Profile_ activity.

- Fixed an issue that prevented different users on the same Linux system from using NVIDIA Nsight Compute in shared instance mode.

- Fixed an issue that prevented resources from being properly renamed using NVTX information in the UI.


#### Updates in 2020.3 [](https://docs.nvidia.com/nsight-compute/ReleaseNotes/index.html\#updates-in-2020-3 "Permalink to this headline")

**General**

- Added support for _derived metrics_ in section files. Derived metrics can be used to create new metrics based on existing metrics and constants. See the [Customization Guide](https://docs.nvidia.com/nsight-compute/CustomizationGuide/index.html.md#section-derived-metrics) for details.

- Added a new _Import Source_ (`--import-source`) option to the UI and command line to permanently import source files into the report, when available.

- Added a new section that shows selected _NVLink_ metrics on supported systems.

- Added a new `launch__func_cache_config` metric to the _Launch Statistics_ section.

- Added new branch efficiency metrics to the _Source Counters_ section, including `smsp__sass_average_branch_targets_threads_uniform.pct` to replace nvprof’s `branch_efficiency`, as well as instruction-level metrics `smsp__branch_targets_threads_divergent`, `smsp__branch_targets_threads_uniform` and `branch_inst_executed`.

- A warning is shown if kernel replay starts staging GPU memory to CPU memory or the file system.

- Section and rule files are deployed to a versioned directory in the user’s home directory to allow easier editing of those files, and to prevent modifying the base installation.

- Removed support for NVLink (`nvl*`) metrics due to a potential application hang during data collection. The metrics will be added back in a future version of the driver/tool.


**NVIDIA Nsight Compute**

- Added support for _Profile Series_. Series allow you to profile a kernel with a range of configurable parameters to analyze the performance of each combination.

- Added a new _Allocations_ view to the _Resources_ tool window which shows the state of all current memory allocations.

- Added a new _Memory Pools_ view to the _Resources_ tool window which shows the state of all current memory pools.

- Added coverage of peer memory to the _Memory Chart_.

- The _Source_ page now shows the number of excessive sectors requested from L1 or L2, e.g. due to uncoalesced memory accesses.

- The _Source_ column on the _Source_ page can now be scrolled horizontally.

- The kernel duration `gpu__time_duration.sum` was added as column on the _Summary_ page.

- Improved the performance of _application replay_ when not all kernels in the application are profiled.


**NVIDIA Nsight Compute CLI**

- Added a new `--app-replay-match` option to select the mechanism used for matching kernel instances across application replay passes.

- An error is shown if `--nvtx-include/exclude` are used without `--nvtx`.


**Resolved Issues**

- The _Grid Size_ column on the _Raw_ page now shows the CUDA grid size like the _Launch Statistics_ section, rather than the combined grid and block sizes.

- The _Branch Resolving_ warp stall reason was added to the PC sampling metric groups and the _Warp State Statistics_ section.

- The _API Stream_ tool window shows kernel names according to the selected Function Name Mode.

- Fixed that an incorrect line could be shown after a heatmap selection on the _Source_ page.

- Fixed incorrect metric usage for system memory in the _Memory Chart_. Previously, all requested memory of L2 from system memory was reported instead of only the portion that missed in L2.


#### Updates in 2020.2.1 [](https://docs.nvidia.com/nsight-compute/ReleaseNotes/index.html\#updates-in-2020-2-1 "Permalink to this headline")

**Resolved Issues**

- Fixed several issues related to auto-profiling in the UI.

- Fixed a metric collection issue when profiling kernels on different GPU architectures with application replay.

- Fixed a performance problem related to profiling large process trees.

- Fixed that occupancy charts would not render correctly when comparing against baselines.

- Fixed that no memory metrics were shown on the _Source_ page for `LDGSTS` instructions.

- Fixed the automatic sorting on the _Summary_ and _Raw_ pages.

- Fixed an issue that would cause the NVIDIA Nsight Compute CLI to consume too much memory when importing or printing reports.

- Long kernel names are now elided in the _Details_ page source hot spot tables.

- Fixed that function names in the _Resources_ tool window were demangled differently.


#### Updates in 2020.2 [](https://docs.nvidia.com/nsight-compute/ReleaseNotes/index.html\#updates-in-2020-2 "Permalink to this headline")

**General**

- Added support for the NVIDIA Ampere GPUs with compute capability 8.6 and CUDA toolkit 11.1.

- Added support for application replay to collect metric results across multiple application runs, instead of replaying individual kernels.

- Added new `launch__device_id` metric.

- Added support for NVLink (`nvl*`) metrics for GPUs with compute capabilities 7.0, 7.5 and 8.0.

- Added documentation for memory charts and tables in the [Profiling Guide](https://docs.nvidia.com/nsight-compute/ProfilingGuide/index.html.md#memory-chart).


**NVIDIA Nsight Compute**

- Updated menu and toolbar layout.

- Added support for zoom and pan on roofline charts.

- The _Resources_ tool window shows the current CUDA stream attributes.

- The memory chart shows a heatmap for link and port utilization.

- The hot-spot tables in the _Source Counters_ section now show values as percentages, too.

- On-demand resolve of remote CUDA-C source is now available for macOS hosts.

- Metric columns in the _Summary_ and _Raw_ pages are now sortable.

- Added a new option to set the number of recent API calls shown in the _API Stream_ tool window.


**NVIDIA Nsight Compute CLI**

- CLI output now shows NVTX payload information.

- CSV output now shows NVTX states.

- Added a new `--replay-mode` option to select the mechanism used for replaying a kernel launch multiple times.

- Added a new `--kill` option to terminate the application once all requested kernels were profiled.

- Added a new `--log-file` option to decide the output stream for printing tool output.

- Added a new `--check-exit-code` option to decide if the child application exit code should be checked.


**Resolved Issues**

- The profiling progress dialog is not dismissed automatically anymore after an error.

- The inter-process lock is now automatically given write permissions for all users.

- All project extensions are enabled in the default dialog filter.

- Fixed handling of targets using _tcsh_ during remote profiling.

- Fixed handling of quoted application arguments on Windows.


#### Updates in 2020.1.2 [](https://docs.nvidia.com/nsight-compute/ReleaseNotes/index.html\#updates-in-2020-1-2 "Permalink to this headline")

**General**

- The NVIDIA Nsight Compute installer for Mac is now code-signed and notarized.

- Disabled the creation of the Python cache when executing rules to avoid permission issues and signing conflicts.


**Resolved Issues**

- Fixed the launcher script of the NVIDIA Nsight Compute CLI to no longer fail if `uname -p` is not available.

- Fixed the API parameter capture for function `cuDeviceGetLuid`.


#### Updates in 2020.1.1 [](https://docs.nvidia.com/nsight-compute/ReleaseNotes/index.html\#updates-in-2020-1-1 "Permalink to this headline")

**General**

- Added support for the NVIDIA GA100/SM 8.x GPU architecture


- Metrics passed to `--metrics` on the NVIDIA Nsight Compute CLI or in the respective _Profile_ activity option are automatically expanded to all first-level sub-metrics if required. See the documentation on `--metrics` for more details.

- Added new rules for detecting inefficiencies of using the sparse data compression on the NVIDIA Ampere architecture.

- The version of the NVIDIA Nsight Compute target collecting the results is shown in the _Session_ page.

- Added new `launch__grid_dim_[x,y,z]` and `launch__block_dim_[x,y,z]` metrics.


**NVIDIA Nsight Compute**

- The _Break on API Error_ functionality has been improved when auto profiling.


**NVIDIA Nsight Compute CLI**

- The full path to the report output file is printed after profiling.

- Added and corrected metrics in the nvprof _Metric Comparison_ table.


**Resolved Issues**

- Documented the _breakdown:_ metrics prefix.

- Fixed handling of escaped domain delimiters in NVTX filter expressions.

- Fixed issues with the occupancy charts for small block sizes.

- Fixed an issue when choosing a default report page in the options dialog.

- Fixed that the scroll bar could overlap the content when exporting the report page as an image.


#### Updates in 2020.1 [](https://docs.nvidia.com/nsight-compute/ReleaseNotes/index.html\#updates-in-2020-1 "Permalink to this headline")

**General**

- Added support for the NVIDIA GA100/SM 8.x GPU architecture

- Removed support for the Pascal SM 6.x GPU architecture

- Windows 7 is not a supported host or target platform anymore

- Added a rule for reporting uncoalesced memory accesses as part of the _Source Counters_ section

- Added support for report name placeholders %p, %q, %i and %h

- The [Kernel Profiling Guide](https://docs.nvidia.com/nsight-compute/ProfilingGuide/index.html.md#abstract) was added to the documentation


**NVIDIA Nsight Compute**

- The UI command was renamed from `nv-nsight-cu` to `ncu-ui`. Old names remain for backwards compatibility.

- Added support for roofline analysis charts

- Added linked hot spot tables in section bodies to indicate performance problems in the source code

- Added section navigation links in rule results to quickly jump to the referenced section

- Added a new option to select how kernel names are shown in the UI

- Added new memory tables for the L1/TEX cache and the L2 cache. The old tables are still available for backwards compatibility and moved to a new section containing deprecated UI elements.

- Memory tables now show the metric name as a tooltip

- Source resolution now takes into account file properties when selecting a file from disk

- Results in the profile report can now be filtered by NVTX range

- The Source page now supports collapsing views even for single files

- The UI shows profiler error messages as dismissible banners for increased visibility

- Improved the baseline name control in the profiler report header


**NVIDIA Nsight Compute CLI**

- The CLI command was renamed from `nv-nsight-cu-cli` to `ncu`. Old names remain for backwards compatibility.

- Queried metrics on GV100 and newer chips are sorted alphabetically

- Multiple instances of NVIDIA Nsight Compute CLI can now run concurrently on the same system, e.g. for profiling individual MPI ranks. Profiled kernels are serialized across all processes using a system-wide file lock.
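
The serialization described above can be pictured with a generic advisory file lock. The following is only a minimal sketch of that general technique on POSIX systems, not the tool's actual implementation; the lock path and the `system_wide_lock` helper are hypothetical names chosen for illustration.

```python
import fcntl
import os
import tempfile
from contextlib import contextmanager

@contextmanager
def system_wide_lock(lock_path):
    """Hold an exclusive advisory lock on lock_path for the duration of the block."""
    # Hypothetical lock file; created on first use.
    fd = os.open(lock_path, os.O_CREAT | os.O_RDWR, 0o666)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX)  # blocks while another process holds the lock
        yield
    finally:
        fcntl.flock(fd, fcntl.LOCK_UN)
        os.close(fd)

# Each process (for example, one per MPI rank) would wrap its profiled work
# in the same lock, so only one of them runs that work at a time.
lock_file = os.path.join(tempfile.gettempdir(), "example-profiler.lock")
with system_wide_lock(lock_file):
    print("exclusive section: one process at a time")
```

Because the lock lives in the file system rather than in process memory, independent processes that agree on the same path are serialized without needing to know about each other.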


**Resolved Issues**

- More C++ kernel names can be properly demangled

- Fixed a `free(): invalid pointer` error when profiling applications using pytorch > 19.07

- Fixed profiling IBM Spectrum MPI applications that require PAMI GPU hooks (`--smpiargs="-gpu"`)

- Fixed that the first kernel instruction was missed when computing `sass__inst_executed_per_opcode`

- Reduced surplus DRAM write traffic created from flushing caches during kernel replay

- The _Compute Workload Analysis_ section shows the IMMA pipeline on GV11b GPUs

- Profile reports now scroll properly on macOS when using a trackpad

- Relative output filenames for the Profile activity now use the document directory, instead of the current working directory

- Fixed path expansion of `~` on Windows

- Memory access information is now shown properly for RED assembly instructions on the Source page

- Fixed that user `PYTHONHOME` and `PYTHONPATH` environment variables would be picked up by NVIDIA Nsight Compute, resulting in locale encoding issues.


#### Updates in 2019.5.3 [](https://docs.nvidia.com/nsight-compute/ReleaseNotes/index.html\#updates-in-2019-5-3 "Permalink to this headline")

**General**

- More C++ kernel names can be properly demangled


#### Updates in 2019.5.2 [](https://docs.nvidia.com/nsight-compute/ReleaseNotes/index.html\#updates-in-2019-5-2 "Permalink to this headline")

**General**

- Bug fixes


#### Updates in 2019.5.1 [](https://docs.nvidia.com/nsight-compute/ReleaseNotes/index.html\#updates-in-2019-5-1 "Permalink to this headline")

**General**

- Added support for Nsight Compute Visual Studio Integration


#### Updates in 2019.5 [](https://docs.nvidia.com/nsight-compute/ReleaseNotes/index.html\#updates-in-2019-5 "Permalink to this headline")

**General**

- Added _section sets_ to reduce the default overhead and make it easier to configure metric sets for profiling

- Reduced the size of the installation

- Added support for CUDA Graphs Recapture API

- The NvRules API now supports accessing correlation IDs for instanced metrics

- Added breakdown tables for _SOL SM_ and _SOL Memory_ in the Speed Of Light section for Volta+ GPUs


**NVIDIA Nsight Compute**

- Added a snap-select feature to the Source page heatmap to help navigate large files

- Added support for loading remote CUDA-C source files via SSH on demand for Linux x86\_64 targets

- Charts on the Details page provide better help in tool tips when hovering metric names

- Improved the performance of the Source page when scrolling or collapsing

- The charts for Warp States and Compute pipelines are now sorted by value


**NVIDIA Nsight Compute CLI**

- Added support for GPU cache control, see `--cache-control`

- Added support for setting the kernel name base in command line output, see `--kernel-base`

- Added support for listing the available names for `--chips`, see `--list-chips`

- Improved the stability on Windows when using `--target-processes all`

- Reduced the profiling overhead for small metric sets in applications with many kernels


**Resolved Issues**

- Reduced the overhead caused by demangling kernel names multiple times

- Fixed an issue where kernel names were not demangled in the CUDA Graph Nodes resources window

- The connection dialog better disables unsupported combinations or warns of invalid entries

- Fixed metric _thread\_inst\_executed\_true_ to derive from _smsp\_not\_predicated\_off\_thread\_inst\_executed_ on Volta+ GPUs

- Fixed an issue with computing the theoretical occupancy on GV100

- Selecting an entry on the Source page heatmap no longer selects the respective source line, to avoid losing the current selection

- Fixed the current view indicator of the Source page heatmap to be line-accurate

- Fixed an issue when comparing metrics from Pascal and later architectures on the Summary page

- Fixed an issue where metrics representing constant values on Volta+ couldn’t be collected without non-constant metrics


#### Updates in 2019.4 [](https://docs.nvidia.com/nsight-compute/ReleaseNotes/index.html\#updates-in-2019-4 "Permalink to this headline")

**General**

- Added support for the Linux PowerPC target platform

- Reduced the profiling overhead, especially if no source metrics are collected

- Reduced the overhead for non-profiled kernels

- Improved the deployment performance during remote launches

- Trying to profile on an unsupported GPU now shows an “Unsupported GPU” error message

- Added support for the `%i` sequential number placeholder to generate unique report file names

- Added support for _smsp\_\_sass\_\*_ metrics on Volta and newer GPUs

- The _launch\_\_occupancy\_limit\_shared\_mem_ metric now reports the device block limit if no shared memory is used by the kernel
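
To illustrate the idea behind the `%i` placeholder mentioned above, here is a minimal sketch that emulates a sequential-number expansion. It assumes `%i` resolves to the lowest positive integer that yields a file name not yet in use; the `expand_sequential` helper and the `report_%i.rep` template are hypothetical and not part of the tool.

```python
import os
import tempfile

def expand_sequential(template, directory="."):
    """Replace %i with the lowest positive integer giving an unused file name."""
    if "%i" not in template:
        return template
    n = 1
    while True:
        candidate = template.replace("%i", str(n))
        if not os.path.exists(os.path.join(directory, candidate)):
            return candidate
        n += 1

with tempfile.TemporaryDirectory() as out_dir:
    first = expand_sequential("report_%i.rep", out_dir)
    open(os.path.join(out_dir, first), "w").close()  # pretend a report was written
    second = expand_sequential("report_%i.rep", out_dir)
    print(first, second)  # report_1.rep report_2.rep
```

Repeated runs with the same template therefore never overwrite an earlier report.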


**NVIDIA Nsight Compute**

- The _Profile_ activity shows the command line used to launch ncu

- The heatmap on the Source page now shows the represented metric in its tooltip

- The _Memory Workload Analysis Chart_ on the Details page now supports baselines

- When applying rules, a message displaying the number of new rule results is shown in the status bar

- The Visual Profiler Transition Guide was added to the documentation

- Connection dialog activity options were added to the documentation

- A warning dialog is shown if the application is resumed without Auto-Profile enabled

- Pausing the application now has immediate feedback in the toolbar controls

- Added a _Close All_ command to the _File_ menu


**NVIDIA Nsight Compute CLI**

- The `--query-metrics` option now shows only metric base names for faster metric query. The new option `--query-metrics-mode` can be used to display the valid suffixes for each base metric.

- Added support for passing response files using the `@` operator to specify command line options through a file


**Resolved Issues**

- Fixed an issue that reported the wrong executable name in the Session page when attaching

- Fixed issues where chart labels were shown elided on the Details page

- Fixed an issue that caused the cache hitrates to be shown incorrectly when baselines were added

- Fixed an illegal memory access when collecting _sass\_\_\*\_histogram_ metrics for applications using PyTorch on Pascal GPUs

- Fixed an issue when attempting to collect all _smsp\_\_\*_ metrics on Volta and newer GPUs

- Fixed an issue when profiling multi-context applications

- Fixed that profiling start/stop settings from the connection dialog weren’t properly passed to the interactive profile activity

- Fixed that certain _smsp\_\_warp\_cycles\_per\_issue\_stall\*_ metrics returned negative values on Pascal GPUs

- Fixed that metric names were truncated in the `--page details` non-CSV command line output

- Fixed that the target application could crash if a connection port was used by another application with higher privileges


#### Updates in 2019.3.1 [](https://docs.nvidia.com/nsight-compute/ReleaseNotes/index.html\#updates-in-2019-3-1 "Permalink to this headline")

**NVIDIA Nsight Compute**

- Added ability to send bug reports and suggestions for features using _Send Feedback_ in the _Help_ menu


**Resolved Issues**

- Fixed calculation of theoretical occupancy for grids with blocks that are not a multiple of 32 threads

- Fixed intercepting child processes launched through Python’s subprocess.Popen class

- Fixed issue of NVTX push/pop ranges not showing up for child threads in NVIDIA Nsight Compute CLI

- Fixed performance regression for metric lookups on the Source page

- Fixed description in rule covering the IMC stall reason

- Fixed cases where baseline values were not correctly calculated in the Memory tables when comparing reports of different architectures

- Fixed incorrect calculation of baseline values in the Executed Instruction Mix chart

- Fixed accessing instanced metrics in the NvRules API

- Fixed a bug that could cause the collection of unnecessary metrics in the Interactive Profile activity

- Fixed potential crash on exit of the profiled target application

- Switched underlying metric for `SOL FB` in the GPU Speed Of Light section to be driven by `dram__throughput.avg.pct_of_peak_sustained_elapsed` instead of `fbpa__throughput.avg.pct_of_peak_sustained_elapsed`
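
The theoretical occupancy fix listed above concerns block sizes that are not a multiple of the 32-thread warp size. The sketch below shows why rounding matters, under the simplifying assumption that warp slots per SM are the only limiter. The default of 64 warps per SM is an illustrative value, and real occupancy also depends on registers, shared memory, and block limits.

```python
WARP_SIZE = 32  # threads per warp

def warps_per_block(threads_per_block):
    """A partially filled warp still occupies a full warp slot, so round up."""
    return (threads_per_block + WARP_SIZE - 1) // WARP_SIZE

def warp_limited_occupancy(threads_per_block, max_warps_per_sm=64):
    """Occupancy if warp slots were the only limiter (illustrative assumption)."""
    warps = warps_per_block(threads_per_block)
    blocks_per_sm = max_warps_per_sm // warps
    return blocks_per_sm * warps / max_warps_per_sm

print(warps_per_block(48))         # 2 (not 1.5)
print(warp_limited_occupancy(48))  # 1.0
print(warp_limited_occupancy(96))  # 0.984375 (21 blocks of 3 warps = 63 of 64 slots)
```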


#### Updates in 2019.3 [](https://docs.nvidia.com/nsight-compute/ReleaseNotes/index.html\#updates-in-2019-3 "Permalink to this headline")

**General**

- Improved performance

- Bug fixes

- Kernel launch context and stream are reported as metrics

- PC sampling configuration options are reported as metrics

- The default base port for connections to the target changed

- Section files support multiple, named Body fields

- NvRules allows users to query metrics using any convertible data type


**NVIDIA Nsight Compute**

- Support for filtering kernel launches using their NVTX context

- Support for new options to select the connection port range

- The Profile activity supports configuring PC sampling parameters

- Sections on the Details page support selecting individual bodies


**NVIDIA Nsight Compute CLI**

- Support for stepping to kernel launches from specific NVTX contexts

- Support for new `--port` and `--max-connections` options

- Support for new `--sampling-*` options to configure PC sampling parameters

- Section file errors are reported with `--list-sections`

- A warning is shown if some section files could not be loaded


**Resolved Issues**

- Using the `--summary` option works for reports that include invalid metrics

- The full process executable filename is reported for QNX targets

- The project system now properly stores the state of opened reports

- Fixed PTX syntax highlighting

- Fixed an issue when switching between manual and auto profiling in NVIDIA Nsight Compute

- The source page in NVIDIA Nsight Compute now works with results from multiple processes

- Charts on the NVIDIA Nsight Compute details page use proper localization for numbers

- NVIDIA Nsight Compute no longer requires the system locale to be set to English


#### Updates in 2019.2 [](https://docs.nvidia.com/nsight-compute/ReleaseNotes/index.html\#updates-in-2019-2 "Permalink to this headline")

**General**

- Improved performance

- Bug fixes

- Kernel launch context and stream are reported as metrics

- PC sampling configuration options are reported as metrics

- The default base port for connections to the target changed

- Section files support multiple, named Body fields

- NvRules allows users to query metrics using any convertible data type


**NVIDIA Nsight Compute**

- Support for filtering kernel launches using their NVTX context

- Support for new options to select the connection port range

- The Profile activity supports configuring PC sampling parameters

- Sections on the Details page support selecting individual bodies


**NVIDIA Nsight Compute CLI**

- Support for stepping to kernel launches from specific NVTX contexts

- Support for new `--port` and `--max-connections` options

- Support for new `--sampling-*` options to configure PC sampling parameters

- Section file errors are reported with `--list-sections`

- A warning is shown if some section files could not be loaded


**Resolved Issues**

- Using the `--summary` option works for reports that include invalid metrics

- The full process executable filename is reported for QNX targets

- The project system now properly stores the state of opened reports

- Fixed PTX syntax highlighting

- Fixed an issue when switching between manual and auto profiling in NVIDIA Nsight Compute

- The source page in NVIDIA Nsight Compute now works with results from multiple processes

- Charts on the NVIDIA Nsight Compute details page use proper localization for numbers

- NVIDIA Nsight Compute no longer requires the system locale to be set to English


#### Updates in 2019.1 [](https://docs.nvidia.com/nsight-compute/ReleaseNotes/index.html\#updates-in-2019-1 "Permalink to this headline")

**General**

- Support for CUDA 10.1

- Improved performance

- Bug fixes

- Profiling on Volta GPUs now uses the same metric names as on Turing GPUs

- Section files support descriptions

- The default sections and rules directory has been renamed to _sections_


**NVIDIA Nsight Compute**

- Added new profiling options to the options dialog

- Details page shows rule result icons in the section headers

- Section descriptions are shown in the details page and in the sections tool window

- Source page supports collapsing multiple source files or functions to show aggregated results

- Source page heatmap color scale has changed

- Invalid metric results are highlighted in the profiler report

- Loaded section and rule files can be opened from the sections tool window


**NVIDIA Nsight Compute CLI**

- Support for profiling child processes on Linux and Windows x86\_64 targets

- NVIDIA Nsight Compute CLI uses a temporary file if no output file is specified

- Support for new `--quiet` option

- Support for setting the GPU clock control mode using new `--clock-control` option

- Details page output shows the NVTX context when `--nvtx` is enabled

- Support for filtering kernel launches for profiling based on their NVTX context using new `--nvtx-include` and `--nvtx-exclude` options

- Added new `--summary` options for aggregating profiling results

- Added option `--open-in-ui` to open reports collected with NVIDIA Nsight Compute CLI directly in NVIDIA Nsight Compute


**Resolved Issues**

- Installation directory scripts use absolute paths

- OpenACC kernel names are correctly demangled

- Profile activity report file supports a relative path

- Source view can resolve all applicable files at once

- UI font colors are improved

- Details page layout and label elision issues are resolved

- Turing metrics are properly reported on the Summary page

- All byte-based metrics use a factor of 1000 when scaling units to follow SI standards

- CSV exports properly align columns with empty entries


- Fixed the metric computation for double\_precision\_fu\_utilization on GV11b


- Fixed incorrect ‘selected’ PC sampling counter values

- The SpeedOfLight section uses ‘max’ instead of ‘avg’ cycles metrics for Elapsed Cycles
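
One of the resolved issues above states that byte-based metrics scale with a factor of 1000 to follow SI standards. A minimal sketch of that kind of scaling follows; the `format_bytes_si` helper and the unit labels are assumptions for illustration and may differ from what the tool displays.

```python
def format_bytes_si(num_bytes):
    """Scale a byte count with SI prefixes (factor 1000, not 1024)."""
    units = ["B", "kB", "MB", "GB", "TB"]  # assumed labels
    value = float(num_bytes)
    index = 0
    while value >= 1000 and index < len(units) - 1:
        value /= 1000
        index += 1
    return f"{value:.2f} {units[index]}"

print(format_bytes_si(1536))           # 1.54 kB (binary scaling would give 1.50)
print(format_bytes_si(2_500_000_000))  # 2.50 GB
```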


## 1.2. Known Issues [](https://docs.nvidia.com/nsight-compute/ReleaseNotes/index.html\#known-issues "Permalink to this headline")

**Installation**

- The installer might not show all patch-level version numbers during installation.

- Some command line options listed in the help of a _.run_ installer of NVIDIA Nsight Compute affect only the archive extraction, not the installation stage. To pass command line options to the embedded installer script, specify them after `--` in the form of `-- -<option>`. The available options for the installer script are:





```
  -help               : Print help message
  -targetpath=<PATH>  : Specify install path
  -noprompt           : No prompts. Implies acceptance of the EULA
```





For example, specifying only the `--quiet` option extracts the installer archive without any console output, but still prompts for user interaction during the installation. To install NVIDIA Nsight Compute without any console output or user interaction, specify `--quiet -- -noprompt`.

- After using the SDK Manager to install the NVIDIA Nsight Compute tools, their binary path needs to be manually added to your `PATH` environment variable.

- See also the [System Requirements](https://docs.nvidia.com/nsight-compute/ReleaseNotes/index.html.md#system-requirements) for more installation instructions.
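
The split between extraction options and installer-script options described above can be sketched as a small command builder. The installer file name below is hypothetical, the file is assumed to be executable, and only the options named in this section (`--quiet`, `-targetpath`, `-noprompt`) are used.

```python
import shlex

def build_installer_command(run_file, quiet=True, noprompt=True, target_path=None):
    """Assemble the argument list for a .run installer.

    Options before `--` affect the archive extraction; options after it
    are passed to the embedded installer script.
    """
    cmd = [run_file]
    if quiet:
        cmd.append("--quiet")
    script_options = []
    if target_path is not None:
        script_options.append(f"-targetpath={target_path}")
    if noprompt:
        script_options.append("-noprompt")
    if script_options:
        cmd.append("--")
        cmd.extend(script_options)
    return cmd

# Hypothetical installer file name and install path.
cmd = build_installer_command("./nsight-compute-installer.run",
                              target_path="/opt/nsight-compute")
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run` as is; printing it first makes it easy to check where the `--` separator falls.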


**Launch and Connection**

- Launching applications on remote targets/platforms is not supported for several combinations. See [Platform Support](https://docs.nvidia.com/nsight-compute/ReleaseNotes/index.html.md#platform-support) for details. Manually launch the application using command line `ncu --mode=launch` on the remote system and connect using the UI or CLI afterwards.

- In the NVIDIA Nsight Compute connection dialog, a remote system can only be specified for one target platform. Remove a connection from its current target platform in order to be able to add it to another.

- Loading of CUDA sources via SSH requires that the remote connection is configured, and that the hostname/IP address of the connection matches the target (as seen in the report session details). For example, prefer `my-machine.my-domain.com` over `my-machine`, even though the latter resolves to the same host.

- Other issues concerning remote connections are discussed in the documentation for [remote connections](https://docs.nvidia.com/nsight-compute/NsightCompute/index.html.md#remote-connections).

- Local connections between NVIDIA Nsight Compute and the launched target application might not work on some ppc64le or aarch64 (sbsa) systems configured to only support IPv6. On these platforms, the [NV\_COMPUTE\_PROFILER\_LOCAL\_CONNECTION\_OVERRIDE=uds](https://docs.nvidia.com/nsight-compute/NsightComputeCli/index.html.md#environment-variables) environment variable can be set to use _Unix Domain Sockets_ instead of _TCP_ for local connections to work around the problem. On x86\_64 Linux, Unix Domain Sockets are used by default, but local TCP connections can be forced using [NV\_COMPUTE\_PROFILER\_LOCAL\_CONNECTION\_OVERRIDE=tcp](https://docs.nvidia.com/nsight-compute/NsightComputeCli/index.html.md#environment-variables).
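
As a minimal sketch of applying the override described above, the environment for a launched `ncu` process can be prepared as follows. The `local_connection_env` helper and the `./my_app` target are hypothetical, and the launch itself is left commented out.

```python
import os

OVERRIDE_VAR = "NV_COMPUTE_PROFILER_LOCAL_CONNECTION_OVERRIDE"

def local_connection_env(mode, base_env=None):
    """Return a copy of the environment with the local connection override set."""
    if mode not in ("uds", "tcp"):
        raise ValueError(f"unsupported mode: {mode!r}")
    env = dict(os.environ if base_env is None else base_env)
    env[OVERRIDE_VAR] = mode
    return env

env = local_connection_env("uds")
print(env[OVERRIDE_VAR])  # uds

# Hypothetical launch of a profiled application with the override applied:
# import subprocess
# subprocess.run(["ncu", "./my_app"], env=env, check=True)
```

Copying the environment rather than mutating `os.environ` keeps the override scoped to the profiled process.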


**Profiling and Metrics**

- Profiling of 32-bit processes is not supported.

- Profiling kernels executed on a device that is part of an SLI group is not supported. An “Unsupported GPU” error is shown in this case.

- Profiling a kernel while other contexts are active on the same device (e.g. X server, or secondary CUDA or graphics application) can result in varying metric values for L2/FB (Device Memory) related metrics. Specifically, L2/FB traffic from non-profiled contexts cannot be excluded from the metric results. To completely avoid this issue, profile the application on a GPU without secondary contexts accessing the same device (e.g. no X server on Linux).

- In the current release, profiling a kernel while any other GPU work is executing on the same MIG compute instance can result in varying metric values for all units. NVIDIA Nsight Compute enforces serialization of the CUDA launches within the target application to ensure those kernels do not influence each other. See [Serialization](https://docs.nvidia.com/nsight-compute/ProfilingGuide/index.html.md#serialization) for more details. However, GPU work issued through other APIs in the target process or workloads created by non-target processes running simultaneously in the same MIG compute instance will influence the collected metrics. Note that it is acceptable to run CUDA processes in other MIG compute instances as they will not influence the profiled MIG compute instance.

- On Linux kernels with the setting `fs.protected_regular=1` (e.g. some Ubuntu 20.04 cloud service provider instances), root users may not be able to access the [inter-process lock file](https://docs.nvidia.com/nsight-compute/ProfilingGuide/index.html.md#serialization). See the [FAQ](https://docs.nvidia.com/nsight-compute/ProfilingGuide/index.html.md#faq) for workarounds.

- Profiling only supports up to 32 device instances, including instances of MIG partitions. Profiling the 33rd or higher device instance will result in indeterminate data.

- Enabling certain metrics can cause GPU kernels to run longer than the driver’s watchdog time-out limit. In these cases, the driver terminates the GPU kernel, which results in an application error, and profiling data will not be available. Please disable the driver watchdog time-out before profiling such long-running CUDA kernels.

  - On Linux, setting the X Config option Interactive to false is recommended.

  - For Windows, detailed information on disabling the Windows TDR is available at [https://docs.microsoft.com/en-us/windows-hardware/drivers/display/timeout-detection-and-recovery](https://docs.microsoft.com/en-us/windows-hardware/drivers/display/timeout-detection-and-recovery)

- Collecting device-level metrics, such as the NVLink metrics (`nvl*`), is not supported on [NVIDIA virtual GPUs](https://www.nvidia.com/en-us/data-center/virtual-gpu-technology.md/) (vGPUs).

- As of CUDA 11.4 and R470 TRD1 driver release, NVIDIA Nsight Compute is supported in a vGPU environment which requires a vGPU license. If the license is not obtained after 20 minutes, the reported performance metrics data from the GPU will be inaccurate. This is because of a feature in vGPU environment which reduces performance but retains functionality as specified [here](https://docs.nvidia.com/grid/latest/grid-licensing-user-guide/index.html#software-enforcement-grid-licensing).

- Profiling on [NVIDIA live-migrated virtual machines](https://www.nvidia.com/en-us/data-center/virtualization/virtual-gpu-migration.md/) is not supported and can result in undefined behavior.


- Profiling with enabled multi-process service (MPS) is not supported.


- Profiling is not supported while the target GPU is configured to run in any [Confidential Computing](https://docs.nvidia.com/confidential-computing/index.html) mode.

- Profiling most metrics with `CU_FORCE_PTX_JIT=1` set is only supported with CUDA 12.7 drivers or newer.

- When profiling using _Range Replay_ or _Application Range Replay_ with multiple active CUDA Green Contexts that belong to the same device context, the range result will contain counter values aggregated across all Green Contexts. If the Green Contexts use overlapping SM masks, this even applies to [Green-Context attributable metrics](https://docs.nvidia.com/nsight-compute/ProfilingGuide/index.html.md#cuda-green-contexts).

- The NVLink Topology section is not supported for a configuration using NVSwitch.

- NVIDIA Nsight Compute does not support per-NVLink metrics.

- NVIDIA Nsight Compute does not support the _Logical NVLink Throughput_ table.

- Setting a reduced NVLink Bandwidth mode does not impact the reported peak values for NVLink metrics. All peak values and corresponding percentages are calculated from the non-reduced NVLink bandwidth. Reconfiguring the NVLink Bandwidth mode using `nvidia-smi` while profiling may lead to undefined tool behavior.

- On the Tegra platforms, when profiling multi-process applications, the mcc\_\* metrics may sometimes fail to be collected.


- Profiling kernel nodes of a device-side graph can cause a hang in some cases on drivers older than 595. Use [Graph Profiling](https://docs.nvidia.com/nsight-compute/NsightComputeCli/index.html.md#command-line-options-profile) mode instead.

- Profiling in [Graph Profiling](https://docs.nvidia.com/nsight-compute/NsightComputeCli/index.html.md#command-line-options-profile) mode is performed on the context that is specified by the stream handle for the graph launch. Only kernel nodes executing on this context are profiled.

- On drivers older than 580, profiling CUDA graph kernel nodes doing cluster launches is not supported.


- On CUDA drivers older than 530.x, profiling on Windows Subsystem for Linux (WSL) is not supported if the system has multiple physical NVIDIA GPUs. This is not affected by setting `CUDA_VISIBLE_DEVICES`.

- Collecting software counters through PerfWorks currently forces all functions in the module of the profiled kernel to be loaded. This increases the host and device memory footprint of the target application for the remainder of the process lifetime.

- PM Sampling is not supported when collecting a Profile Series.

- Data collected using [PM Sampling](https://docs.nvidia.com/nsight-compute/ProfilingGuide/index.html.md#pm-sampling) across multiple passes may not align perfectly in the timeline, even with context switch filtering applied.

- For results collected with the [Work ID/Cluster Launch Control](https://docs.nvidia.com/cuda/parallel-thread-execution/index.html.md#parallel-synchronization-and-communication-instructions-clusterlaunchcontrol-try-cancel) feature, the count of clusters, blocks, warps and threads launched on the device could be lower than configured. Metrics that depend on such counts are affected accordingly.

- On Windows, when in MCDM mode, changing access to GPU performance counters is not supported through the NVIDIA Control Panel. See [ERR\_NVGPUCTRPERM](https://developer.nvidia.com/ERR_NVGPUCTRPERM) for further details.


**Compatibility**

- Applications calling blocking functions on standard input/output streams can cause the profiler to stop until the blocking function call is resolved.

- NVIDIA Nsight Compute can hang on applications using RAPIDS in versions 0.6 and 0.7, due to an issue in cuDF.

- Profiling child processes launched via `clone()` is not supported.

- Profiling of Cooperative Groups kernels launched with `cuLaunchCooperativeKernelMultiDevice` is not yet supported.

- On Linux systems, when profiling _bsd-csh_ scripts, the original application output will not be printed. As a workaround, use a different C-shell, e.g. _tcsh_.

- Attempting to use the `--clock-control` option to set the GPU clocks will fail when profiling on a MIG GPU partition. Please use `nvidia-smi` (installed with NVIDIA display driver) to control the clocks for the entire GPU. This will require administrative privileges when the GPU is partitioned.

- The `--clock-control` option is not supported on Linux (aarch64 sbsa) with GB10b (Thor) GPUs. Attempting to lock or reset the clocks has no effect.

- On Linux aarch64, NVIDIA Nsight Compute does not work if the _HOME_ environment variable is not set.

- NVIDIA Nsight Compute versions 2020.1.0 to 2020.2.1 are not compatible with CUDA driver version 460+ if the application launches Cooperative Groups kernels. Profiling will fail with error “UnknownError”.

- Collecting CPU call stack information on Windows Server 2016 can hang NVIDIA Nsight Compute in some cases. Currently, the only workaround is to skip CPU call stack collection on such systems by not specifying the option `--call-stack`.

- When profiling a script, `--target-processes all` may target utility executables such as _xargs_, _uname_ or _ls_. To avoid profiling these, use the `--target-processes-filter` option accordingly.

- On mobile platforms, the `--kill` option is not supported with application replay mode.

- NVIDIA Nsight Compute might show invalid characters for Unicode names and paths on Windows 10. As a workaround, use a third-party terminal emulator, e.g. Git bash.

- CPU call stack collection may lead to crashes for CUDA applications with device functions compiled with gcc 13.2.


**User Interface**

- The API Statistics filter in NVIDIA Nsight Compute does not support units.

- File size is the only property considered when resolving source files. Timestamps are currently ignored.

- Terminating or disconnecting an application in the _Interactive Profiling_ activity while the API Stream View is updated can lead to a crash.

- See the [OptiX library support section](https://docs.nvidia.com/nsight-compute/ReleaseNotes/index.html.md#optix) for limitations concerning the [Acceleration Structure Viewer](https://docs.nvidia.com/nsight-compute/NsightCompute/index.html.md#acceleration-structure-viewer).

- After updating from a previous version of NVIDIA Nsight Compute on Linux, the file load dialog may not allow column resizing and sorting. As a workaround, the _~/.config/QtProject.conf_ file can be edited to remove the _treeViewHeader_ entry from the _\[FileDialog\]_ section.


## 1.3. Support [](https://docs.nvidia.com/nsight-compute/ReleaseNotes/index.html\#support "Permalink to this headline")

Information on supported platforms and GPUs.

### 1.3.1. Platform Support [](https://docs.nvidia.com/nsight-compute/ReleaseNotes/index.html\#platform-support "Permalink to this headline")

Host denotes the UI can run on that platform. Target means that we can instrument applications on that platform for data collection. Applications launched with instrumentation on a target system can be connected to from most host platforms. The reports collected on one system can be opened on any other system.

|  | Host | Targets |
| --- | --- | --- |
| Windows | Yes | Windows\*, Linux (x86\_64) |
| Windows Subsystem for Linux (WSL2) | Yes | Windows Subsystem for Linux (WSL2) as part of the Linux (x86\_64) package. |
| Linux (x86\_64) | Yes | Windows\*, Linux (x86\_64), Linux (aarch64 sbsa) |
| Linux (ppc64le) | No | No |
| Linux (aarch64 sbsa) | Yes | Linux (aarch64 sbsa) |
| Linux (x86\_64) (Drive SDK) | Yes | Windows\*, Linux (x86\_64), Linux (aarch64), QNX |
| macOS 13+ (x86\_64, arm64) | Yes | Windows\*, Linux (x86\_64), Linux (aarch64 sbsa) |
| Linux (aarch64 l4t, Drive OS Linux) | Yes | Linux (aarch64 l4t, Drive OS Linux) |
| QNX | No | QNX |

Platforms supported by NVIDIA Nsight Compute

Remote launch means that the application can be launched on the target system from the host UI. Target platforms marked with \* do not support remote launch from the respective host. Instead, the application must be launched from the target system.

Profiling of 32-bit processes is not supported.
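
As an illustration of how to read the platform table, the sketch below encodes a few of its rows as a lookup. The names `TARGETS` and `remote_launch` are hypothetical, and only a subset of the hosts is transcribed; the table itself remains the reference.

```python
# Host -> targets, transcribed from some rows of the platform table.
# A trailing "*" marks a target without remote launch from that host.
TARGETS = {
    "Windows": ["Windows*", "Linux (x86_64)"],
    "Linux (x86_64)": ["Windows*", "Linux (x86_64)", "Linux (aarch64 sbsa)"],
    "Linux (aarch64 sbsa)": ["Linux (aarch64 sbsa)"],
    "macOS 13+ (x86_64, arm64)": ["Windows*", "Linux (x86_64)", "Linux (aarch64 sbsa)"],
}


def remote_launch(host, target):
    """True if remote launch is listed, False if the application must be
    launched on the target, None if the pair is not in this partial table."""
    for entry in TARGETS.get(host, []):
        if entry.rstrip("*") == target:
            return not entry.endswith("*")
    return None
```

For example, `remote_launch("Linux (x86_64)", "Windows")` returns `False`: the pair is supported, but the application has to be started on the Windows target.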

### 1.3.2. GPU Support

| Architecture | Support |
| --- | --- |
| Maxwell | No |
| Pascal | No |
| Volta | No |
| Turing TU1xx | Yes |
| NVIDIA GA100 | Yes |
| NVIDIA GA10x | Yes |
| NVIDIA GA10b | Yes |
| NVIDIA AD10x | Yes |
| NVIDIA GH100 | Yes |
| NVIDIA GB10x | Yes |
| NVIDIA GB10b | Yes |
| NVIDIA GB11x | Yes |
| NVIDIA GB20x | Yes |
| NVIDIA GB20b | Yes |
| NVIDIA GB20c | Yes |

GPU architectures supported by NVIDIA Nsight Compute
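
The table lists architectures by chip name. A rough programmatic check is sketched below under a stated assumption that is not part of this document: that Turing corresponds to CUDA compute capability 7.5 and that Maxwell, Pascal, and Volta devices report lower values. The function names are hypothetical.

```python
def parse_compute_capability(text):
    """Turn a string such as '8.6' into a comparable (major, minor) tuple."""
    major, minor = text.strip().split(".")
    return int(major), int(minor)


def listed_as_supported(compute_capability):
    """Assumption: everything from Turing (7.5) upward maps to a 'Yes' row."""
    return parse_compute_capability(compute_capability) >= (7, 5)
```

Comparing tuples rather than strings or floats keeps the ordering correct for values such as `10.0`.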

Many metrics used in NVIDIA Nsight Compute are identical to those of the PerfWorks Metrics API and follow the documented [Metrics Structure](https://docs.nvidia.com/nsight-compute/ProfilingGuide/index.html.md#metrics-structure).
Non-PerfWorks metrics are documented in the [Metrics Reference](https://docs.nvidia.com/nsight-compute/ProfilingGuide/index.html.md#metrics-reference).

### 1.3.3. Library Support

NVIDIA Nsight Compute can be used to profile CUDA applications, as well as applications that use CUDA via NVIDIA or third-party libraries. For most such libraries, the behavior is expected to be identical to applications using CUDA directly. However, for certain libraries, NVIDIA Nsight Compute has restrictions, behaves differently, or requires non-default setup steps prior to profiling.

#### OptiX

NVIDIA Nsight Compute supports profiling of OptiX applications, but with certain restrictions.

- **Internal Kernels**

Some kernels launched by OptiX that contain no user-defined code are given the generic name _NVIDIA internal_. These kernels show up on the API Stream in the NVIDIA Nsight Compute UI, and can be profiled in both the UI and the NVIDIA Nsight Compute CLI. However, no CUDA-C source, PTX or SASS is available for them.

- **User Kernels**

Kernels launched by OptiX can contain user-defined code. OptiX identifies these kernels in the API Stream with a custom name. This name starts with _raygen\_\__ (for “ray generation”). These kernels show up on the API Stream and can be profiled in the UI as well as the NVIDIA Nsight Compute CLI. The Source page displays CUDA-C source, PTX and SASS defined by the user. Certain parts of the kernel, including device functions that contain OptiX-internal code, will not be available in the Source page.

- **SASS**

When SASS information is available in the profile report, certain instructions might not be available in the Source page and are shown as _N/A_.


The [Acceleration Structure Viewer](https://docs.nvidia.com/nsight-compute/NsightCompute/index.html.md#acceleration-structure-viewer) for OptiX traversable handles currently has the following limitations:

- It is only available in interactive profiling sessions.

- It is not supported on macOS.

- Viewing instance acceleration structures using multi-level instancing is not supported.

- Applying motion traversables to acceleration structures is not supported.


The following feature set is supported per OptiX API version:

| OptiX API Version | Kernel Profiling | API Interception | Resource Tracking |
| --- | --- | --- | --- |
| 6.x | No | No | No |
| 7.0 - 9.0 | Yes | Yes | Yes |

#### CUDA Tile

CUDA Tile profiling is supported with driver versions 590 and above.

| Feature | Current Support | Future Support |
| --- | --- | --- |
| Tile kernel profiling | Yes | Yes |
| Source <-> SASS correlation | Yes | Yes |
| SIMT kernel feature parity | Yes | Yes |
| Source <-> Tile IR <-> SASS correlation | No | Yes |

cuTile Python Profile Support
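
The driver requirement for CUDA Tile profiling can be checked from a driver version string, for example one reported by `nvidia-smi --query-gpu=driver_version --format=csv,noheader`. This is a minimal sketch with hypothetical function names; the version strings in the usage note are made-up examples.

```python
def driver_major(version):
    """Major component of a driver version string of the form 'MAJOR.MINOR[.PATCH]'."""
    return int(version.strip().split(".")[0])


def supports_cuda_tile_profiling(version):
    """CUDA Tile profiling is documented for driver versions 590 and above."""
    return driver_major(version) >= 590
```

With these helpers, `supports_cuda_tile_profiling("590.10")` is `True` and `supports_cuda_tile_profiling("580.65.06")` is `False`.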

### 1.3.4. System Requirements

#### CUDA Driver

Nsight Compute requires a CUDA driver that is [compatible](https://docs.nvidia.com/deploy/cuda-compatibility) with CUDA 13.
In addition, certain features may require a specific minimum driver version.

#### Linux and WSL

On Linux platforms, NVIDIA Nsight Compute requires the following minimum GLIBC versions:

| Platform | Version |
| --- | --- |
| x86\_64 | 2.28 |
| aarch64 sbsa | 2.28 |
| aarch64 l4t | 2.28 |
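
To see whether a system meets the GLIBC minimum, the version can be read from Python's standard library. The sketch below uses `platform.libc_ver()`; the names `MIN_GLIBC` and `glibc_ok` are hypothetical, and the check returns `None` when the C library cannot be identified as glibc.

```python
import platform

MIN_GLIBC = (2, 28)  # minimum from the table above, identical for all listed platforms


def parse_version(text):
    """'2.31' -> (2, 31), so that 2.9 correctly sorts below 2.28."""
    return tuple(int(part) for part in text.split(".")[:2])


def glibc_ok(version_text=None):
    """True/False against MIN_GLIBC, or None if the C library is not glibc."""
    if version_text is None:
        lib, version_text = platform.libc_ver()
        if lib != "glibc" or not version_text:
            return None
    return parse_version(version_text) >= MIN_GLIBC
```

Calling `glibc_ok()` without arguments checks the running interpreter's C library; passing a string such as `"2.27"` checks a version obtained elsewhere, for example from `ldd --version` on a remote target.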

The NVIDIA Nsight Compute UI requires additional packages to be installed for Qt and other dependencies.
Please refer to the [Qt for X11 Requirements](https://doc.qt.io/qt-6/linux-requirements.html).
When executing `ncu-ui` with missing dependencies, an error message with information on the missing packages is shown.
Note that only one package will be shown at a time, even though multiple may be missing from your system.
For selected operating systems, the following commands install needed packages for NVIDIA Nsight Compute on X11:

- **Ubuntu 20.04**

`apt install libopengl0 libxcb-icccm4 libxcb-image0 libxcb-keysyms1 libxcb-render-util0 libxcb-xinerama0 libxcb-xkb1 libxkbcommon-x11-0`

- **Ubuntu 22.04** and **Ubuntu 24.04**

`apt install libxcb-cursor0`

- **RHEL 9.5**

`yum install libglvnd-opengl libxcb libxkbcommon-x11 xcb-util-keysyms xcb-util-wm`
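
The per-distribution commands above can be selected automatically from the contents of `/etc/os-release`. This is a sketch only: the names `X11_PACKAGES` and `install_command` are hypothetical, and it assumes the distributions report `ID` and `VERSION_ID` values such as `ubuntu` / `22.04` and `rhel` / `9.5`.

```python
# Commands copied from the list above, keyed by (ID, VERSION_ID) from /etc/os-release.
X11_PACKAGES = {
    ("ubuntu", "20.04"): (
        "apt install libopengl0 libxcb-icccm4 libxcb-image0 libxcb-keysyms1 "
        "libxcb-render-util0 libxcb-xinerama0 libxcb-xkb1 libxkbcommon-x11-0"
    ),
    ("ubuntu", "22.04"): "apt install libxcb-cursor0",
    ("ubuntu", "24.04"): "apt install libxcb-cursor0",
    ("rhel", "9.5"): (
        "yum install libglvnd-opengl libxcb libxkbcommon-x11 "
        "xcb-util-keysyms xcb-util-wm"
    ),
}


def parse_os_release(text):
    """Parse KEY=value lines, dropping comments and surrounding quotes."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        fields[key] = value.strip().strip("\"'")
    return fields


def install_command(os_release_text):
    """Return the documented command, or None for systems not listed above."""
    fields = parse_os_release(os_release_text)
    return X11_PACKAGES.get((fields.get("ID"), fields.get("VERSION_ID")))
```

A typical call would be `install_command(open("/etc/os-release").read())`. A `None` result means the system is not one of the selected operating systems, in which case the error message from `ncu-ui` remains the guide.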


Profiling on Windows Subsystem for Linux (WSL) is only supported with WSL version 2.
Profiling is supported on Windows 10 WSL with OS build version 19044 or greater and NVIDIA display driver version 545 or higher.
It is not supported on Windows 10 WSL for systems that exceed 1 TB of system memory.
Profiling is supported on Windows 11 WSL with NVIDIA display driver version 525 or higher.
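
The WSL conditions above combine into a single check, sketched below. The function name and parameters are hypothetical, memory is passed in terabytes without committing to a decimal or binary definition of 1 TB, and returning `False` for Windows versions other than 10 and 11 is an assumption.

```python
def wsl_profiling_supported(windows_major, os_build, driver_major,
                            system_memory_tb, wsl_version=2):
    """Evaluate the documented WSL profiling requirements."""
    if wsl_version != 2:
        return False  # only WSL version 2 is supported
    if windows_major == 10:
        return (os_build >= 19044
                and driver_major >= 545
                and system_memory_tb <= 1)  # not supported above 1 TB
    if windows_major == 11:
        return driver_major >= 525
    return False  # assumption: other Windows versions are treated as unsupported
```

For instance, `wsl_profiling_supported(10, 19045, 550, 0.064)` evaluates to `True`, while the same system with OS build 19043 evaluates to `False`.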

The Linux (x86\_64) NVIDIA Nsight Compute package can be used and should be installed directly within WSL2.
Remote profiling to and from WSL2 works equivalently to regular Linux (x86\_64) hosts and targets, as long as the WSL2 instance is accessible via SSH.
Access to NVIDIA GPU Performance Counters must be enabled in the NVIDIA Control Panel of the Windows host.
See also the [CUDA on WSL User Guide](https://docs.nvidia.com/cuda/wsl-user-guide/index.html.md).

For selected operating systems, the following commands install needed packages for NVIDIA Nsight Compute on Wayland on WSL2:

- **Ubuntu 24.04**

`apt install libopengl0 libxcb-icccm4 libxcb-keysyms1 libxcb-cursor0 libxcb-shape0 libxkbcommon-x11-0 libnss3`


#### Windows

Only Windows 10 and 11 are supported as host and target.

The Visual Studio 2019 redistributable is not automatically installed by the NVIDIA Nsight Compute installer. The workaround is to install the x64 version of the ‘Microsoft Visual C++ Redistributable for Visual Studio 2019’ manually. The installer is linked on the main download page for Visual Studio at [https://www.visualstudio.com/downloads/](https://www.visualstudio.com/downloads/) or can be downloaded directly from [https://go.microsoft.com/fwlink/?LinkId=746572](https://go.microsoft.com/fwlink/?LinkId=746572).

Notices

ALL NVIDIA DESIGN SPECIFICATIONS, REFERENCE BOARDS, FILES, DRAWINGS, DIAGNOSTICS, LISTS, AND OTHER DOCUMENTS (TOGETHER AND SEPARATELY, “MATERIALS”) ARE BEING PROVIDED “AS IS.” NVIDIA MAKES NO WARRANTIES, EXPRESSED, IMPLIED, STATUTORY, OR OTHERWISE WITH RESPECT TO THE MATERIALS, AND EXPRESSLY DISCLAIMS ALL IMPLIED WARRANTIES OF NONINFRINGEMENT, MERCHANTABILITY, AND FITNESS FOR A PARTICULAR PURPOSE.

Information furnished is believed to be accurate and reliable. However, NVIDIA Corporation assumes no responsibility for the consequences of use of such information or for any infringement of patents or other rights of third parties that may result from its use. No license is granted by implication or otherwise under any patent rights of NVIDIA Corporation. Specifications mentioned in this publication are subject to change without notice. This publication supersedes and replaces all other information previously supplied. NVIDIA Corporation products are not authorized as critical components in life support devices or systems without express written approval of NVIDIA Corporation.

Trademarks

NVIDIA and the NVIDIA logo are trademarks or registered trademarks of NVIDIA Corporation in the U.S. and other countries. Other company and product names may be trademarks of the respective companies with which they are associated.