1. Release Notes

1.1. Important Information about NVIDIA Nsight Visual Studio Edition 2020.1

1.1.1. Display Driver

You must install the NVIDIA display driver that supports the NVIDIA Nsight Visual Studio Edition tools. If you have an NVIDIA graphics card installed on your target machine, you likely already have an NVIDIA display driver; however, NVIDIA Nsight Visual Studio Edition requires a specific version of the driver in order to function properly. From the NVIDIA web site, download and install the following display driver (or newer):

Release 451.12 or newer
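As an illustration of the "451.12 or newer" requirement, the following is a minimal sketch of a version check. It assumes the installed driver release is already available as a string in major.minor form (for example, as reported by nvidia-smi --query-gpu=driver_version --format=csv,noheader). The function names parseDriverRelease and meetsMinimumDriver are hypothetical and are not part of any NVIDIA tool.

```cpp
#include <sstream>
#include <string>
#include <utility>

// Hypothetical helper: parse a release string such as "451.12" into
// {major, minor}. Returns {-1, -1} when the text is not "<major>.<minor>".
std::pair<int, int> parseDriverRelease(const std::string& text) {
    std::istringstream in(text);
    int major = 0;
    int minor = 0;
    char dot = 0;
    if (!(in >> major >> dot >> minor) || dot != '.') {
        return {-1, -1};
    }
    return {major, minor};
}

// Hypothetical helper: true when `installed` is at least `required`.
// The default of "451.12" is the minimum release named in these notes.
bool meetsMinimumDriver(const std::string& installed,
                        const std::string& required = "451.12") {
    const std::pair<int, int> have = parseDriverRelease(installed);
    const std::pair<int, int> need = parseDriverRelease(required);
    if (have.first < 0 || need.first < 0) {
        return false;  // Treat unparseable input as "requirement not met".
    }
    return have >= need;  // Lexicographic compare: major first, then minor.
}
```

For example, meetsMinimumDriver("452.06") is true and meetsMinimumDriver("446.14") is false. The sketch ignores any text after the minor number.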

1.2. New in NVIDIA Nsight Visual Studio Edition 2020.1

  • NVIDIA Nsight Integration, a Visual Studio extension, has been introduced to integrate the next-generation, standalone Nsight tools into Visual Studio. In particular:

    • Integrated Graphics Debugging, deprecated since NVIDIA Nsight Visual Studio Edition 2019.2, has been removed and replaced by Nsight Graphics.

    • Integrated CUDA profiling, deprecated since NVIDIA Nsight Visual Studio Edition 2019.2, has been removed from the Performance Analysis tools and replaced by: 

    • Integrated Analysis Trace, deprecated since NVIDIA Nsight Visual Studio Edition 2019.2, has not been removed, but will be in an upcoming release of NVIDIA Nsight VSE. The replacement, stand-alone Nsight Systems tool is currently available and works with NVIDIA Nsight Integration for Visual Studio.

  • OpenCL profiling support in NVIDIA Nsight Visual Studio Edition, deprecated as of NVIDIA Nsight VSE 2019.4, has been removed.

  • Support for Windows 7 (and Windows Server through 2012 R2), deprecated since the 2019.4 release, has been removed.

  • Support for the sm_30 and sm_32 architectures has been dropped, and support for sm_35, sm_37, and sm_50 has been deprecated as of the 2020.1 release. The default compilation target is now sm_52 in NVIDIA Nsight VSE build customizations. (CTK-865)

  • Support for Visual Studio 2013 has been dropped. Current Visual Studio support includes versions 2015, 2017, and 2019.
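The architecture changes listed above can be captured in a small lookup, for instance as a pre-build check in a project's own tooling. This is only a sketch: the enum ArchStatus and the function archStatusIn2020_1 are hypothetical names, and the function classifies only the targets that these notes mention. Any other target is reported as NotMentioned rather than assumed to be supported.

```cpp
#include <string>

// Hypothetical classification of a compilation target, limited to what the
// 2020.1 release notes state explicitly.
enum class ArchStatus { Dropped, Deprecated, DefaultTarget, NotMentioned };

// Hypothetical helper: map a target name such as "sm_35" to its status.
ArchStatus archStatusIn2020_1(const std::string& arch) {
    if (arch == "sm_30" || arch == "sm_32") {
        return ArchStatus::Dropped;        // No longer supported.
    }
    if (arch == "sm_35" || arch == "sm_37" || arch == "sm_50") {
        return ArchStatus::Deprecated;     // Still usable, but deprecated.
    }
    if (arch == "sm_52") {
        return ArchStatus::DefaultTarget;  // Default in build customizations.
    }
    return ArchStatus::NotMentioned;       // These notes do not say.
}
```

A build script could, for example, fail on Dropped and print a warning on Deprecated.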

1.3. CUDA Debugger

1.3.1. New in the 2020.1 Release

  • Added support for the NVIDIA GA100 GPU.

  • Added support for the CUDA 11.0 Toolkit.

  • Added ability to control breaking on and reporting CUDA API errors.

  • The Warp Watch view is now available in the Next-Gen Nsight Debugger.

  • The Resources view is now available in the Next-Gen Nsight Debugger.

  • CUDA Task Graph support has been added to the Next-Gen Nsight Debugger.

  • Support for Pascal has been dropped from the Legacy Nsight Debugger, but is fully supported by the Next-Gen Nsight Debugger.

1.3.2. Known Issues in the 2020.1 Release of the CUDA Debugger

  1. When the display driver is in TCC mode, reading managed memory in the Memory window or visualizing expressions containing managed data can sometimes return stale data if those regions have been written to from the CPU side and have not been synchronized back to the GPU memory. This issue does not affect unmanaged memory and managed memory in WDDM driver mode. (DTNSV-593)

  2. Macro expressions may not work in conditional breakpoints. As a workaround, instead of using macros in the expression (for example: @blockIdx(x,y,z) && threadIdx(a,b,c)), specify the expressions with variables (for example: blockIdx.x == x && blockIdx.y == y … etc.) (DTNSV-794)

  3. CUDA grid launch failures occur on Pascal GPUs when debugging with preemption enabled. A workaround for this issue is to use a second GPU for rendering the desktop, and use the Pascal GPU dedicated for compute work without a display attached. (55322)

  4. If you are using NVIDIA Nsight VSE on a Windows 10 x64 machine, you cannot attach to a win32/x86 CUDA application. Only 64-bit CUDA applications are supported. (44794)

  5. Half2 types are not supported for conditional breakpoints. (37814)

  6. Firewall and anti-intrusion software (e.g., McAfee Host Intrusion Prevention) will not allow remote debugger connections. Please disable or add an exclusion for the Nsight Monitor. (22804)

  7. In some cases, when the CUDA application is built with the "Generate Relocatable Device Code" option, and a CUDA kernel function is declared with both the __global__ and static qualifiers, the NVIDIA Nsight VSE debugger might not be able to display local variables inside that function. To work around this issue, remove the static qualifier from the function. (21914)

  8. You must enable Memory Checker before launching a process, and cannot change the setting while debugging (applies to Legacy Debugger only). (18935, 18937)

  9. When the CUDA Debugger is used to debug CUDA applications which share resources with DirectX 9 (such as the "simpleD3D9" sample program), the debugger may display incorrect values for memory locations in those shared resources. This may happen when the GPU device executing the application is Compute Capability 2.0 or higher. Incorrect values for the contents of memory may be displayed in any debug window (Autos, Locals, Watch, Warp Watch, or Memory). This issue does not affect applications using Direct3D 11. (13899)

  10. When using the CUDA Debugger with NVIDIA Nsight VSE, breakpoints will not be hit in source files whose full paths contain non-ASCII characters. Any path with a character code >= 128 is affected. (11429)

  11. If you experience hangs or TDRs while locally debugging CUDA on a single GPU (or using the Software Preemption debugging mode in general), try disabling operating system features that use video hardware acceleration. For example, disable Aero on Windows 7, change to a high-contrast desktop theme on Windows 8, or disable WPF acceleration.

  12. Variables do not appear for source code that is not executed. This occurs because the compiler aggressively optimizes code even if you have not specified any compiler optimizations. As a result, the compiler removes any code that will not be executed from the output executable.

  13. Breakpoints will hit multiple times on lines that have more than one inline function call. For example, setting a breakpoint on:

    x = cos() + sin()

    will generate three breakpoints on that line: one for the evaluation of the expression, plus one for each function on the line.

  14. Unloading modules does not refresh the state of breakpoints set in that module. This means that those breakpoints do not show their latest state in Visual Studio when they have been unloaded.

  15. The Visual Studio Breakpoint "Filter" option is not supported for CUDA GPU breakpoints.

  16. The Visual Studio Breakpoint "Hitcount" option is not supported for CUDA GPU breakpoints.

  17. The F5 hotkey (which is the default hotkey in Visual Studio for starting the CPU Debugger) does not start the CUDA Debugger. To start the CUDA Debugger, you must either change the key bindings or use the menu command: Nsight > Start CUDA Debugging.

  18. There is no support for automatically performing a Build when launching the CUDA Debugger.

  19. The Load Symbols option, or "Symbols settings," in the Modules view is not supported for CUDA debugging.
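Regarding known issue 10 above (breakpoints are not hit in source files whose full paths contain non-ASCII characters), a project can scan its source paths before debugging. The following is a minimal sketch under the assumption that each path is available as a byte string (for example, UTF-8 or an ANSI code page), so that every non-ASCII character contributes at least one byte with a value of 128 or higher. The function names firstNonAsciiByte and isAsciiOnlyPath are hypothetical.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: return the index of the first byte whose value is
// >= 128, or std::string::npos when every byte is plain ASCII.
std::size_t firstNonAsciiByte(const std::string& path) {
    for (std::size_t i = 0; i < path.size(); ++i) {
        // Cast to unsigned char so bytes above 127 are not seen as negative.
        if (static_cast<unsigned char>(path[i]) >= 128) {
            return i;
        }
    }
    return std::string::npos;
}

// Hypothetical helper: true when the path contains only ASCII bytes, meaning
// it is not affected by the limitation described in issue 10.
bool isAsciiOnlyPath(const std::string& path) {
    return firstNonAsciiByte(path) == std::string::npos;
}
```

If isAsciiOnlyPath returns false for a source file, moving the project to an ASCII-only directory avoids the limitation.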

1.4. Graphics Debugger

1.4.1. New in the 2020.1 Release

  • Integrated Graphics Debugging, deprecated since NVIDIA Nsight Visual Studio Edition 2019.2, has been removed and replaced by Nsight Graphics.

  • Note that NVIDIA Nsight Integration, a Visual Studio extension, has been introduced to allow Nsight Graphics integration into Visual Studio under the Nsight menu.

1.5. Analysis Tools

1.5.1. New in the 2020.1 Release

  • Integrated CUDA profiling, deprecated since NVIDIA Nsight Visual Studio Edition 2019.2, has been removed from the Performance Analysis tools and replaced by:

  • Integrated Analysis Trace, deprecated since NVIDIA Nsight Visual Studio Edition 2019.2, has not been removed, but will be in an upcoming release of NVIDIA Nsight VSE. However, the replacement, stand-alone Nsight Systems tool is currently available and works with NVIDIA Nsight Integration for Visual Studio.

1.5.2. Known Issues in the 2020.1 Release of the Analysis Tools

  1. Windows 10 20H1 (v2004) ships with Hardware-Accelerated GPU Scheduling disabled by default. Enabling Hardware-Accelerated GPU Scheduling is generally not supported with r450 drivers that precede this Windows 10 release, and will result in a crash when using Analysis Trace. (200601042)

  2. When running a CUDA Next-Gen analysis trace with an NVIDIA Volta GV100 GPU, some fields (e.g., Launch Statistics, PC Sampling) may be empty. This will be fixed in a future release. (60311)

  3. NVIDIA Nsight VSE may not display memcpy start times correctly on GeForce GTX 1050 and GeForce GTX 1050 Ti GPUs with drivers earlier than release 376.09.

  4. In order to capture ETW events, the Nsight Monitor must be run with elevated privileges by right-clicking on the Nsight Monitor icon and selecting Run As Administrator. (Note that even if you are logged in with an Administrator account, you will have to explicitly run Nsight Monitor as administrator in order for ETW events to be captured.) (12193)

  5. In order to prevent Tesla AutoBoost from causing applications to have non-repeatable behavior in NVIDIA Nsight VSE Analysis, the AutoBoost feature is disabled. This will cause apps running under NVIDIA Nsight VSE Analysis to run slower than they would outside of NVIDIA Nsight VSE on GPUs where AutoBoost defaults to enabled (for example, the Tesla K80). NVIDIA Nsight VSE Analysis will not disable AutoBoost if the user sets the environment variable CUDA_AUTO_BOOST to 1 in the Nsight Monitor process, which allows for higher performance at the cost of less repeatability for measurements. (34102)

  6. There may be some instability when trying to use Analysis Tracing on multi-GPU Quadro systems where the app uses gpu_affinity. This will be fixed in a future release. (22071)

  7. Analysis workload tracing on newer Maxwell GPUs requires an updated driver. Older drivers will display an error message in the analysis report summary. (31982)

  8. In rare cases, the reported number of memory transactions can exceed the number of transactions caused by executing memory requests from the user's code. This mismatch occurs when the GPU or driver must add transactions that are not controllable by the user. (27520)

  9. Firewall and anti-attack software (e.g., McAfee Host Intrusion Prevention) will not allow remote analysis connections. Please disable or add an exclusion for the Nsight Monitor. (22804)

  10. Do not start an analysis capture session when the CUDA Debugger is paused on a breakpoint. Doing so can cause the system to crash. (3203)

Analysis Activity Known Issues

  1. Tracing the following APIs is not supported in managed processes:

    • NVTX

    • OpenCL

    • Direct3D

    • OpenGL

    Launching a managed .exe for tracing with any of the aforementioned APIs enabled will result in an "Access Denied" pop-up message, and the analysis session will not start.

    In Trace Process Tree mode, instrumentation for tracing the aforementioned APIs can only propagate to native child processes. If a managed child process is launched, neither it nor any child process it launches (managed or native) can be instrumented by NVIDIA Nsight VSE. The analysis session will continue unaffected, and the user will not be notified of the problem; the report will not contain data from managed processes and their children.

    System and CUDA tracing is fully supported in managed processes, and in Trace Process Tree mode, tracing support propagates to all child processes (native or managed).

  2. The stop collection timer is implemented in Visual Studio. The latency to communicate to the monitor and application can result in a longer duration than requested.

  3. CPU Thread Trace

    If the Windows Kernel Event Provider is already in use when a new capture session is launched, the collected data may produce unexpected results. For best results, ensure that no other kernel providers are running during an analysis session.

  4. CUDA Trace

    • CUDA trace does not show implicit memory transfers for graphics interop.

    • CUDA Runtime API trace does not capture the <<< >>> kernel launch syntax. Instead, the corresponding CUDA Runtime API calls are reported. Some of the CUDA Driver API calls that are executed by the CUDA Runtime may report errors, such as CUDA_ERROR_INVALID_CONTEXT, even though the usage of the CUDA Runtime API is valid. (6745)

      When collecting trace information about CUDA kernels and memory transfers, the report file sometimes does not contain complete information about them. This happens because retrieving the data interferes with the application and affects performance, so the tool only retrieves it after one of these events:

      • a call to cuCtxSynchronize()/cudaDeviceSynchronize(),

      • a call to cuCtxDestroy()/cudaDeviceReset(),

      • a call to cuStreamDestroy()/cudaStreamDestroy(), or

      • the application launches enough kernels or memory transfers to fill NVIDIA Nsight VSE's buffer, so Nsight forces a context synchronize in order to retrieve the data.

  5. If your capture appears to be missing some or all kernel launch or memory transfer events, either force the data to flush by adding a call to cuCtxSynchronize()/cudaDeviceSynchronize() after all the CUDA work is finished, or (for an application that continuously launches kernels and memcpys) capture for a longer time so that enough data is generated to trigger NVIDIA Nsight VSE's flush for a full buffer. (4812)

Analysis Report Known Issues

  1. If two different host computers use the same remote target machine, both hosts could generate the same report directory, in which case reports from the two machines would be mixed together. Although unlikely, this can occur when the two machines analyze applications of the same name, because the NVIDIA Nsight VSE Analysis Tools on the host machine create the directory name based on the name of the application.

Timeline Known Issues

  1. Clicking the timeline sometimes crashes Visual Studio 2019; see workaround in the DevTalk forum.

  2. Skew between CPU events and GPU events will typically not exceed one microsecond.

  3. Percentages displayed in the row labels and tool tips are based upon the full capture time.

  4. The mouse forward and back buttons cannot be used to navigate the report page system.

  5. CTRL+- toggles to the previous document instead of Zooming Out.

  6. Double-clicking on a row containing a line/area graph that also has children will expand/collapse the row as opposed to increasing the height to 66% of the view.

 

Notices

Notice

NVIDIA® Nsight™ Application Development Environment for Heterogeneous Platforms, Visual Studio Edition 2020.1 User Guide

THE INFORMATION IN THIS GUIDE AND ALL OTHER INFORMATION CONTAINED IN NVIDIA DOCUMENTATION REFERENCED IN THIS GUIDE IS PROVIDED “AS IS.” NVIDIA MAKES NO WARRANTIES, EXPRESSED, IMPLIED, STATUTORY, OR OTHERWISE WITH RESPECT TO THE INFORMATION FOR THE PRODUCT, AND EXPRESSLY DISCLAIMS ALL IMPLIED WARRANTIES OF NONINFRINGEMENT, MERCHANTABILITY, AND FITNESS FOR A PARTICULAR PURPOSE. Notwithstanding any damages that customer might incur for any reason whatsoever, NVIDIA’s aggregate and cumulative liability towards customer for the product described in this guide shall be limited in accordance with the NVIDIA terms and conditions of sale for the product.

THE NVIDIA PRODUCT DESCRIBED IN THIS GUIDE IS NOT FAULT TOLERANT AND IS NOT DESIGNED, MANUFACTURED OR INTENDED FOR USE IN CONNECTION WITH THE DESIGN, CONSTRUCTION, MAINTENANCE, AND/OR OPERATION OF ANY SYSTEM WHERE THE USE OR A FAILURE OF SUCH SYSTEM COULD RESULT IN A SITUATION THAT THREATENS THE SAFETY OF HUMAN LIFE OR SEVERE PHYSICAL HARM OR PROPERTY DAMAGE (INCLUDING, FOR EXAMPLE, USE IN CONNECTION WITH ANY NUCLEAR, AVIONICS, LIFE SUPPORT OR OTHER LIFE CRITICAL APPLICATION). NVIDIA EXPRESSLY DISCLAIMS ANY EXPRESS OR IMPLIED WARRANTY OF FITNESS FOR SUCH HIGH RISK USES. NVIDIA SHALL NOT BE LIABLE TO CUSTOMER OR ANY THIRD PARTY, IN WHOLE OR IN PART, FOR ANY CLAIMS OR DAMAGES ARISING FROM SUCH HIGH RISK USES.

NVIDIA makes no representation or warranty that the product described in this guide will be suitable for any specified use without further testing or modification. Testing of all parameters of each product is not necessarily performed by NVIDIA. It is customer’s sole responsibility to ensure the product is suitable and fit for the application planned by customer and to do the necessary testing for the application in order to avoid a default of the application or the product. Weaknesses in customer’s product designs may affect the quality and reliability of the NVIDIA product and may result in additional or different conditions and/or requirements beyond those contained in this guide. NVIDIA does not accept any liability related to any default, damage, costs or problem which may be based on or attributable to: (i) the use of the NVIDIA product in any manner that is contrary to this guide, or (ii) customer product designs.

Other than the right for customer to use the information in this guide with the product, no other license, either expressed or implied, is hereby granted by NVIDIA under this guide. Reproduction of information in this guide is permissible only if reproduction is approved by NVIDIA in writing, is reproduced without alteration, and is accompanied by all associated conditions, limitations, and notices.

Trademarks

NVIDIA, the NVIDIA logo, and cuBLAS, CUDA, CUDA-GDB, CUDA-MEMCHECK, cuDNN, cuFFT, cuSPARSE, DIGITS, DGX, DGX-1, DGX Station, NVIDIA DRIVE, NVIDIA DRIVE AGX, NVIDIA DRIVE Software, NVIDIA DRIVE OS, NVIDIA Developer Zone (aka "DevZone"), GRID, Jetson, NVIDIA Jetson Nano, NVIDIA Jetson AGX Xavier, NVIDIA Jetson TX2, NVIDIA Jetson TX2i, NVIDIA Jetson TX1, NVIDIA Jetson TK1, Kepler, NGX, NVIDIA GPU Cloud, Maxwell, Multimedia API, NCCL, NVIDIA Nsight Compute, NVIDIA Nsight Eclipse Edition, NVIDIA Nsight Graphics, NVIDIA Nsight Integration, NVIDIA Nsight Systems, NVIDIA Nsight Visual Studio Edition, NVLink, nvprof, Pascal, NVIDIA SDK Manager, Tegra, TensorRT, Tesla, Visual Profiler, VisionWorks and Volta are trademarks and/or registered trademarks of NVIDIA Corporation in the United States and other countries. Other company and product names may be trademarks of the respective companies with which they are associated.