NVIDIA® Nsight™ Application Development Environment for Heterogeneous Platforms, Visual Studio Edition 5.3 User Guide
I get the following error when I run an analysis activity:
Nsight-NvEvents-Provider: Too few event buffers
The event system that captures the analysis data allocates a set of output buffers to communicate the captured data. Each OS thread that emits events to the analysis system must reserve one of these event buffers before it can output any data. If all of the buffers are already in use, any additional event provider triggers this error message and its event output is discarded.
You can configure the number of event buffers on the Activity document as part of the NvEvents controller options. To show these options, make sure that the flag Show Controller Options is set to TRUE. Set the option from the Nsight menu: Nsight > Options > Analysis.
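For optimal performance, the number of event buffers should be at least twice the number of threads outputting events. For example, an application with 8 threads that emit events should be configured with at least 16 event buffers.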
When I convert a project from Visual Studio 2008 to Visual Studio 2010, I get build errors.
For more information on how to convert a project, see the NVIDIA Developer Forums.
How do I get diagnostic logs of the NVIDIA Nsight host and monitor for troubleshooting purposes?
Browse to the following directories and delete any existing files:
%AppData%\NVIDIA Corporation\Nsight\Vsip\1.0\Logs
%AppData%\NVIDIA Corporation\Nsight\Monitor\1.0\Logs
Modify the Nvda.Diagnostics.nlog file in each of the following directories:
Program Files (x86)\NVIDIA Corporation\Nsight Visual Studio Edition 5.3\Common\Configurations
Program Files (x86)\NVIDIA Corporation\Nsight Visual Studio Edition Monitor 5.3\Common\Configurations
In each file, find the following line:
<logger name="*" minlevel="Error" writeTo="file-high-severity" />
Change the minlevel attribute value from "Error" to "Trace".
The diagnostic logs are then written to the following directories:
%AppData%\NVIDIA Corporation\Nsight\Vsip\1.0\Logs
%AppData%\NVIDIA Corporation\Nsight\Monitor\1.0\Logs
When breakpoints are set in source code, the CUDA Debugger pauses execution at locations unrelated to the breakpoints.
This can happen when more than one __global__ function (kernel) makes a call to a __device__ function within a single module, and both of the following are true:
The __device__ function is not inlined.
The different kernels call the exact same __device__ function.
There are a couple of approaches you can take to work around this issue:
Force the __device__ function to be inlined by applying the __forceinline__ keyword to the __device__ function. Note that using the inline keyword does not force inlining in debug builds.
Use a separate .cu file for each __global__ function that calls the __device__ function, so that there is one __global__ function for each instance of the __device__ function. This means that each .cu file that is compiled with the NVIDIA CUDA Compiler (nvcc.exe) should contain no more than one __global__ function. This works for both Driver API and CUDART applications. Be aware that there are other potential issues with this approach:
Shared __device__ functions must move to common header files. Use the #include statement to include the __device__ function in each .cu file containing a __global__ function.
If a global variable is declared in the style __device__ int x; and that variable is used by multiple __global__ functions, then splitting the __global__ functions across multiple files is not a trivial workaround. In this case, we recommend eliminating global variables that are declared in that style from the source code, and making them kernel parameters instead.
A __constant__ variable is associated with one CUDA module (a compiled .cu file).
If your source code is written in a way that multiple kernels depend on the same __constant__ variable, and the host side of your application dynamically updates that variable, then you will need some broader changes to your source code:
Copy the __constant__ variable into each .cu file, and give each copy a different name.
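As a rough illustration of the inlining workaround, the sketch below shows two kernels in one .cu file that call the same __device__ helper, with the helper marked __forceinline__ so that each kernel gets its own inlined copy. It also passes the shared value as a kernel parameter, which is the substitution recommended above for file-scope __device__ variables. The names scale, scaleKernel, and scaleAndOffsetKernel are hypothetical and exist only for this example; it is a minimal sketch using the CUDA Runtime API, not code taken from the Nsight samples.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical shared helper. __forceinline__ asks nvcc to inline it into
// every caller, including in debug builds where plain `inline` is only a hint.
__forceinline__ __device__ int scale(int value, int factor)
{
    return value * factor;
}

// Both kernels call the same helper. `factor` is passed as a kernel
// parameter instead of being read from a file-scope `__device__ int`.
__global__ void scaleKernel(int* data, int n, int factor)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] = scale(data[i], factor);
}

__global__ void scaleAndOffsetKernel(int* data, int n, int factor, int offset)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] = scale(data[i], factor) + offset;
}

int main()
{
    const int n = 256;
    int host[n];
    for (int i = 0; i < n; ++i) host[i] = i;

    int* dev = nullptr;
    if (cudaMalloc(&dev, n * sizeof(int)) != cudaSuccess) return 1;
    cudaMemcpy(dev, host, n * sizeof(int), cudaMemcpyHostToDevice);

    const int threads = 128;
    const int blocks = (n + threads - 1) / threads;
    scaleKernel<<<blocks, threads>>>(dev, n, 2);
    scaleAndOffsetKernel<<<blocks, threads>>>(dev, n, 3, 1);

    // The blocking copy also waits for both kernels to finish.
    cudaMemcpy(host, dev, n * sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(dev);

    // Each element was doubled, then tripled and incremented: i * 6 + 1.
    printf("host[10] = %d\n", host[10]);
    return 0;
}
```

With the helper inlined into both kernels, a breakpoint set inside scaleKernel or scaleAndOffsetKernel no longer shares a single out-of-line copy of scale, which is the situation described above as causing unrelated pauses.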
I get warnings that 64-bit injection and/or 32-bit injection is not present.
The Nsight Monitor checks for both the 64-bit and 32-bit versions of the CUDA injection, so you can get a warning if either one is not present. If this happens, re-install the tools.
My machine hangs when I use the CUDA Debugger locally on a single machine with 2 GPUs on it.
There are several possible issues that can cause a machine to hang when locally debugging on two GPUs with the NVIDIA Nsight tools.
Make sure that your TDR settings have been configured correctly. For more information, see Timeout Detection and Recovery.
We recommend not having a display attached or a desktop running on the GPU on which you are debugging CUDA code, as having concurrent activities on a GPU can cause machine hangs. See How To: Setup Local Headless GPU Debugging for more information.
The GPU debugger hangs when I also use the CPU debugger.
Never use the same Visual Studio instance to run both the CUDA Debugger and the CPU debugger.
In general, make sure you only use either CUDA Debugger or CPU debugger, not both. Attaching the CPU debugger and hitting a CPU breakpoint during a CUDA debugging session will cause the CUDA Debugger to hang (until you resume the CPU process).
If you are careful, you can attach two separate Visual Studio instances (one for CUDA, one for the CPU). While you are stopped in CPU code, the CUDA Debugger will hang. Once you resume the CPU code, the CUDA Debugger will resume as well.
I am unable to set and hit a breakpoint in my CUDA code.
Make sure to use the driver version specified in the release notes. This is the most common reason that breakpoints do not work. The driver must be installed on the machine where your application code runs.
Also, make sure your project uses a compatible CUDA toolkit. A compatible version of the CUDA toolkit generates symbolics information that allows the CUDA Debugger to properly debug your code when you use the -G0 flag on the nvcc command line. If you are using the CUDA Driver API, make sure that there are .cubin.elf.o files alongside each of your compiled .cubin files in the build output directory for your project. Projects using the CUDA Runtime API have the symbolics information embedded in the object file itself.
I get the following error message:
Local debugging failed. Nsight is incompatible with WPF acceleration.
Please see documentation about WPF acceleration. Run the
DisableWpfHardwareAcceleration.reg in your Nsight installation.
Disable WPF D3D acceleration. For more information, see Setup Local Debugging.
If one or more applications are running with WPF hardware acceleration when you run the .reg file, you could still have issues until those applications are restarted. If you are performing local debugging, this includes the Nsight Monitor: you need to restart it because it is also a WPF application.
My program ignores breakpoints set in CPU code when I debug a program by choosing Start CUDA Debugging from the Nsight menu.
The CUDA Debugger ignores breakpoints set in CPU code as it does not currently support debugging x86 or other CPU code.
When I hit a CUDA breakpoint, I only break once on thread (0, 0, 0) in my CUDA kernel. If I hit Continue (F5), it never breaks again and the entire launch completes.
The default behavior of the CUDA Debugger is to break unconditionally on the first thread of a kernel. After that, the breakpoints have an implicit conditional based on the CUDA Focus Picker. If you would like to break on a different thread, use the CUDA Focus Picker to switch focus to the desired thread or set a conditional breakpoint so that the debugger stops only on the thread you specify. For more information on setting the conditional breakpoint, see How To: Specify Debugger Context and How To: Set GPU Breakpoints. After you switch focus, the CUDA Debugger maintains the focus and breaks on breakpoints only in that thread for the duration of the kernel launch.
I encounter an error when trying to copy and paste my shader code.
This can happen in Visual Studio 2012, when the Productivity Power Tools extension is being used. Disable the HTML Copy option, and you should be able to copy and paste normally in the shader editor.
NVIDIA® Nsight™ Application Development Environment for Heterogeneous Platforms, Visual Studio Edition User Guide Rev. 5.3.170616 ©2009-2017. NVIDIA Corporation. All Rights Reserved.