CUDA Concurrent Kernel Trace Mode

NVIDIA® Nsight™ Application Development Environment for Heterogeneous Platforms, Visual Studio Edition 5.3 User Guide


NVIDIA Nsight supports tracing concurrent kernel execution on newer NVIDIA GPUs. In older versions of NVIDIA Nsight, analysis captures always serialized kernel launches, forcing them to execute one at a time. The concurrent kernel trace mode preserves the target application's runtime behavior with respect to concurrent kernel execution, and captures all kernel start and end times without forcing the kernels to execute one at a time.

Note that on NVIDIA GPUs built on the Tesla architecture, the serialized capture mode is always used, regardless of the configuration specified in the NVIDIA Nsight options.

A few notes on concurrent kernel trace mode:
  • The maximum number of kernel launches that a device can execute concurrently is sixteen.
  • A kernel from one CUDA context cannot execute concurrently with a kernel from another CUDA context.
  • Kernels that use many textures or a large amount of local memory are less likely to execute concurrently with other kernels.
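
Whether a given device can run kernels concurrently at all can be checked from the application before choosing a trace mode. The following is a minimal sketch using the CUDA runtime's `cudaGetDeviceProperties` call. It is not part of NVIDIA Nsight, and it assumes that Tesla-architecture GPUs are the ones reporting compute capability 1.x.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    int deviceCount = 0;
    cudaError_t err = cudaGetDeviceCount(&deviceCount);
    if (err != cudaSuccess) {
        std::fprintf(stderr, "cudaGetDeviceCount failed: %s\n",
                     cudaGetErrorString(err));
        return 1;
    }

    for (int dev = 0; dev < deviceCount; ++dev) {
        cudaDeviceProp prop;
        err = cudaGetDeviceProperties(&prop, dev);
        if (err != cudaSuccess) {
            std::fprintf(stderr, "cudaGetDeviceProperties failed: %s\n",
                         cudaGetErrorString(err));
            return 1;
        }

        // prop.major == 1 is assumed here to identify a Tesla-architecture GPU.
        std::printf("Device %d: %s (compute capability %d.%d)\n",
                    dev, prop.name, prop.major, prop.minor);

        // concurrentKernels is nonzero when the device can execute
        // multiple kernels from the same context at the same time.
        std::printf("  concurrent kernel execution supported: %s\n",
                    prop.concurrentKernels ? "yes" : "no");
    }
    return 0;
}
```

If the property reports "no", kernels run one at a time on that device regardless of how the application launches them, so the two trace modes should produce similar timelines.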

To select the analysis trace mode, do the following:

  1. With your solution file open in Visual Studio, go to Nsight > Options.
  2. Click on the Analysis page.
  3. Select the drop-down menu next to CUDA Kernel Trace Mode. Here, you can choose whether to use concurrent or serialized mode.

Concurrent versus Serialized Mode

The timeline report for the concurrentKernels sample, which ships with the NVIDIA GPU Computing SDK, shows how concurrent kernel execution appears: all eight kernel launches are executed in parallel on the GPU.

By contrast, in serialized mode, all kernel launches are forced to execute one at a time, which results in significantly different runtime behavior.
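
To try both modes on a small test case, a program along the following lines can be captured once in each mode. This is a hedged sketch written in the spirit of the concurrentKernels sample, not its actual source: the kernel name `spinKernel`, the `CHECK` macro, and the cycle count are illustrative choices. It launches eight small kernels, each into its own stream of the same CUDA context, so that the device is free to overlap them.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Abort with a message if a CUDA runtime call fails (illustrative helper).
#define CHECK(call)                                                   \
    do {                                                              \
        cudaError_t err_ = (call);                                    \
        if (err_ != cudaSuccess) {                                    \
            std::fprintf(stderr, "%s failed: %s\n", #call,            \
                         cudaGetErrorString(err_));                   \
            std::exit(1);                                             \
        }                                                             \
    } while (0)

// Busy-wait until roughly `cycles` GPU clock cycles have elapsed.
// Requires a device that supports clock64() (compute capability 2.0+).
__global__ void spinKernel(long long cycles) {
    long long start = clock64();
    while (clock64() - start < cycles) {
        // spin
    }
}

int main() {
    const int kNumKernels = 8;                 // matches the eight launches above
    const long long kSpinCycles = 10000000LL;  // arbitrary; long enough to see in a timeline

    cudaStream_t streams[kNumKernels];
    for (int i = 0; i < kNumKernels; ++i) {
        CHECK(cudaStreamCreate(&streams[i]));
    }

    // One tiny kernel per stream: kernels in different streams of the
    // same context are eligible to run concurrently.
    for (int i = 0; i < kNumKernels; ++i) {
        spinKernel<<<1, 1, 0, streams[i]>>>(kSpinCycles);
        CHECK(cudaGetLastError());
    }

    CHECK(cudaDeviceSynchronize());

    for (int i = 0; i < kNumKernels; ++i) {
        CHECK(cudaStreamDestroy(streams[i]));
    }

    std::printf("Launched %d kernels on %d streams\n", kNumKernels, kNumKernels);
    return 0;
}
```

Each kernel uses a single thread and no textures or local memory, which keeps resource use low and makes overlap more likely. Launching all eight kernels into the default stream instead would order them one after another in the application itself, independent of the trace mode.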


NVIDIA® Nsight™ Application Development Environment for Heterogeneous Platforms, Visual Studio Edition User Guide Rev. 5.3.170616 ©2009-2017. NVIDIA Corporation. All Rights Reserved.