Troubleshooting NVIDIA Nsight

Problem:

I get the following error when I run an analysis activity:

Nsight-NvEvents-Provider: Too few event  buffers

The event system that captures the analysis data allocates a set of output buffers to communicate the captured data. Each OS thread that emits events to the analysis system must reserve one of these event buffers before it can output any data. If all of the buffers are already in use, additional event providers trigger this error message and their event output is discarded.

Resolution:

You can configure the number of event buffers on the Activity document as part of the NvEvents controller options. To show these options, make sure that the flag Show Controller Options is set to TRUE. Set the option from the Nsight menu: Nsight > Options > Analysis.

For optimal performance, the number of event buffers should be at least twice the number of threads outputting events. For example, if 8 threads emit events, configure at least 16 event buffers.

 

Problem:

When I convert a project from Visual Studio 2008 to Visual Studio 2010, I get build errors.

Resolution:

For more information on how to convert a project, see the NVIDIA Developer Forums.

 

Problem:

How do I get diagnostic logs of the NVIDIA Nsight host and monitor for troubleshooting purposes?

Resolution:

  1. Close both Visual Studio and the Nsight Monitor.
  2. On both the host and target machines, go to %AppData%\NVIDIA Corporation\Nsight\Vsip\1.0\Logs and %AppData%\NVIDIA Corporation\Nsight\Monitor\1.0\Logs and delete any existing files.
  3. Edit Nvda.Diagnostics.nlog as follows.
    1. On the host machine:
    2. On the target machine:
  4. Go to the last logger at the bottom of the file: <logger name="*" minlevel="Error" writeTo="file-high-severity" />.
  5. Change the minlevel attribute value from "Error" to "Trace".
  6. Save the file.
  7. Reproduce the problem, and send the following generated logs:
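
Steps 2, 4, and 5 above can also be scripted. The following is a minimal sketch in Python, not a tool shipped with NVIDIA Nsight. The helper names (log_dirs, clear_logs, set_last_logger_minlevel) are hypothetical. Because the location of Nvda.Diagnostics.nlog is not given here, the sketch works on the file's text and leaves reading and writing the file to the caller. It also assumes that the last <logger ...> tag in the text is the real last logger (a commented-out logger at the end of the file would confuse it).

```python
import re
from pathlib import Path


def log_dirs(appdata):
    """Build the two log directories from step 2 under the given %AppData% path."""
    base = Path(appdata) / "NVIDIA Corporation" / "Nsight"
    return [base / "Vsip" / "1.0" / "Logs", base / "Monitor" / "1.0" / "Logs"]


def clear_logs(directory):
    """Delete the files directly inside `directory`; return how many were removed.

    A missing directory is treated as already empty.
    """
    removed = 0
    directory = Path(directory)
    if directory.is_dir():
        for entry in directory.iterdir():
            if entry.is_file():
                entry.unlink()
                removed += 1
    return removed


# Matches one complete <logger ...> or <logger ... /> tag.
LOGGER_TAG = re.compile(r"<logger\b[^>]*>")
# Captures the text around the minlevel value so only the value is replaced.
MINLEVEL = re.compile(r'(minlevel\s*=\s*")[^"]*(")')


def set_last_logger_minlevel(config_text, level="Trace"):
    """Return `config_text` with the minlevel of the last <logger> tag set to `level`."""
    tags = list(LOGGER_TAG.finditer(config_text))
    if not tags:
        raise ValueError("no <logger> element found")
    last = tags[-1]
    updated, count = MINLEVEL.subn(
        lambda m: m.group(1) + level + m.group(2), last.group(0), count=1
    )
    if count == 0:
        raise ValueError("the last <logger> element has no minlevel attribute")
    return config_text[: last.start()] + updated + config_text[last.end():]
```

To use it, close Visual Studio and the Nsight Monitor first, call clear_logs on each path returned by log_dirs(os.environ["APPDATA"]), then read Nvda.Diagnostics.nlog, pass its text through set_last_logger_minlevel, and write the result back. Only the last logger is changed; earlier loggers keep their levels.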

 

Problem:

When breakpoints are set in source code, the CUDA Debugger pauses execution at locations unrelated to the breakpoints.

This can happen when more than one __global__ function (kernel) makes a call to a __device__ function within a single module, and both of the following are true:

Resolution:

There are a couple of approaches you can take to work around this issue:

  1. Force the __device__ function to be inlined by applying the __forceinline__ keyword to the __device__ function. Note that using the inline keyword does not force inlining in debug builds.
  2. Reorganize the source code so that there is only one __global__ function for each instance of the __device__ function. This means that each .cu file that is compiled with the NVIDIA CUDA Compiler (nvcc.exe) should contain no more than one __global__ function. This works for both Driver API and CUDART applications. Be aware that there are other potential issues with this approach:

Problem:

I get warnings that 64-bit and/or 32-bit injection is not present.

Resolution:

The Nsight Monitor checks for 64-bit versions of the CUDA injection. This means that you can get warnings if 64-bit and/or 32-bit injection is not present. If this happens, re-install the tools. 

 

Problem:

My machine hangs when I use the CUDA Debugger locally on a single machine with 2 GPUs on it.

Resolution:

There are several possible issues that can cause a machine to hang when locally debugging on two GPUs with the NVIDIA Nsight tools.

Problem:

The GPU debugger hangs when I also use the CPU debugger.

Resolution:

Never use the same Visual Studio instance to run both the CUDA Debugger and the CPU debugger.

In general, use either the CUDA Debugger or the CPU debugger, not both. Attaching the CPU debugger and hitting a CPU breakpoint during a CUDA debugging session causes the CUDA Debugger to hang until you resume the CPU process.

If you are careful, you can attach two separate Visual Studio instances (one for CUDA, one for CPU). While you are stopped in CPU code, the CUDA Debugger hangs. Once you resume the CPU code, the CUDA Debugger becomes responsive again.

 

Problem:

I am unable to set and hit a breakpoint in my CUDA code.

Resolution:

Make sure to use the driver version specified in the release notes. This is the most common reason that breakpoints do not work. The driver must be installed on the machine where your application code runs.

Also make sure your project uses a compatible CUDA toolkit (version 4.2, 4.1, or 4.0). NVIDIA Nsight includes these versions. A compatible version of the CUDA toolkit generates symbolics information that allows the CUDA Debugger to properly debug your code when you use the -G0 flag on the nvcc command line. If you are using the CUDA Driver API, make sure that there are .cubin.elf.o files alongside each of your compiled .cubin files in the build output directory for your project. Projects using the CUDA Runtime API have the symbolics information embedded in the object file itself.
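
For Driver API projects, the check for .cubin.elf.o files can be automated. The following is a minimal sketch in Python, not part of NVIDIA Nsight. It assumes that the companion file for kernel.cubin is named kernel.cubin.elf.o and sits in the same directory; the function name cubins_missing_symbols is hypothetical.

```python
from pathlib import Path


def cubins_missing_symbols(build_dir):
    """Return the .cubin files under `build_dir` that have no sibling .cubin.elf.o file.

    Assumption: the companion of "kernel.cubin" is "kernel.cubin.elf.o"
    in the same directory.
    """
    missing = []
    for cubin in sorted(Path(build_dir).rglob("*.cubin")):
        companion = cubin.with_name(cubin.name + ".elf.o")
        if not companion.is_file():
            missing.append(cubin)
    return missing
```

An empty result means every .cubin file found has its companion. Any path that is returned points at a module for which the debugger may lack symbolics information, so rebuild that module with the -G0 flag.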


Problem:

I get the following error message:

Local debugging failed. Nsight is incompatible with  WPF acceleration.
Please see documentation about WPF acceleration. Run the
DisableWpfHardwareAcceleration.reg in your Nsight installation.

Resolution:

Disable WPF D3D acceleration. For more information, see Setup Local Debugging.

If one or more applications are running with WPF hardware acceleration when you run the .reg file, you could still have issues until those applications are restarted. If you are performing local debugging, this includes the Nsight Monitor: because it is also a WPF application, you need to restart it.


Problem:

My program ignores breakpoints set in CPU code when I debug a program by choosing Start CUDA Debugging from the Nsight menu.

Resolution:

The CUDA Debugger ignores breakpoints set in CPU code as it does not currently support debugging x86 or other CPU code.

 

Problem:

When I hit a CUDA breakpoint, I only break once on thread (0, 0, 0) in my CUDA kernel. If I hit Continue (F5), it never breaks again and the entire launch completes.

Resolution:

The default behavior of the CUDA Debugger is to break unconditionally on the first thread of a kernel. After that, the breakpoints have an implicit conditional based on the CUDA Focus Picker. If you would like to break on a different thread, use the CUDA Focus Picker to switch focus to the desired thread or set a conditional breakpoint so that the debugger stops only on the thread you specify. For more information on setting the conditional breakpoint, see How To: Specify Debugger Context and How To: Set GPU Breakpoints. After you switch focus, the CUDA Debugger maintains the focus and breaks on breakpoints only in that thread for the duration of the kernel launch.

 


NVIDIA® Nsight™ Development Platform, Visual Studio Edition User Guide Rev. 3.2.131009 ©2009-2013. NVIDIA Corporation. All Rights Reserved.
