6.35. Interactions with the CUDA Driver API
This section describes the interactions between the CUDA Driver API and the CUDA Runtime API.
Execution Contexts
The CUDA Runtime provides cudaExecutionContext_t as an abstraction over driver-level contexts—specifically, green contexts and the primary context.
There are two primary ways to obtain an execution context (sketched below):
- cudaDeviceGetExecutionCtx: Returns the execution context that corresponds to the primary context of the specified device.
- cudaGreenCtxCreate: Creates a green context with the specified resources and returns an execution context.
Note: Developers should treat cudaExecutionContext_t as an opaque handle and avoid assumptions about its underlying representation. The CUDA Runtime does not provide a way to convert this handle into a CUcontext or CUgreenCtx.
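The following is a minimal sketch of retrieving a device's execution context. This section does not spell out the signature of cudaDeviceGetExecutionCtx, so the parameter order used here is an assumption; cudaGreenCtxCreate additionally takes a resource description and is omitted.

```cpp
#include <cuda_runtime.h>

// Minimal sketch. The parameter order of cudaDeviceGetExecutionCtx below is
// an assumption; consult the Execution Context Management section for the
// authoritative signature.
cudaExecutionContext_t primaryExecCtx(int device)
{
    cudaExecutionContext_t ctx;
    cudaDeviceGetExecutionCtx(&ctx, device);  // assumed: (handle*, device)
    return ctx;
}
```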
Primary Context (aka Device Execution Context)
The primary context is the default execution context associated with a device in the Runtime. It can be obtained via a call to cudaDeviceGetExecutionCtx(). There is a one-to-one mapping between CUDA devices in the runtime and their primary contexts within a process.
From the CUDA Runtime’s perspective, a device and its primary context are functionally synonymous.
Unless explicitly overridden, either by making a different context current via the Driver API (e.g., cuCtxSetCurrent()) or by using an explicit execution context handle, the Runtime will implicitly initialize and use the primary context for API calls as needed.
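For example, a fresh thread's first runtime call is enough to bring up the primary context:

```cpp
#include <cuda_runtime.h>

int main()
{
    // No explicit device or context setup: this first runtime call
    // implicitly initializes the primary context of the current device
    // (device 0 by default) and makes it current to this thread.
    void* p = nullptr;
    if (cudaMalloc(&p, 1 << 20) == cudaSuccess)
        cudaFree(p);
    return 0;
}
```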
Initialization and Tear-Down
Unless an explicit execution context is specified (see “Execution Context Management” for APIs), CUDA Runtime API calls operate on the CUDA Driver CUcontext which is current to the calling host thread. If no CUcontext is current to the calling thread when a CUDA Runtime API call which requires an active context is made, then the primary context (device execution context) for a device will be selected, made current to the calling thread, and initialized. The context will be initialized using the parameters specified by the CUDA Runtime API functions cudaSetDeviceFlags(), cudaD3D9SetDirect3DDevice(), cudaD3D10SetDirect3DDevice(), cudaD3D11SetDirect3DDevice(), cudaGLSetGLDevice(), and cudaVDPAUSetVDPAUDevice(). Note that these functions will fail with cudaErrorSetOnActiveProcess if they are called when the primary context for the specified device has already been initialized, except for cudaSetDeviceFlags() which will simply overwrite the previous settings.
The function cudaInitDevice() ensures that the primary context is initialized for the requested device but does not make it current to the calling thread.
The function cudaSetDevice() initializes the primary context for the specified device and makes it current to the calling thread by calling cuCtxSetCurrent().
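A short sketch contrasting the two initialization paths (a second device is assumed to be present for the cudaInitDevice() call):

```cpp
#include <cuda_runtime.h>

int main()
{
    // Configure the primary context before it is initialized; calling this
    // after initialization simply overwrites the previous settings.
    cudaSetDeviceFlags(cudaDeviceScheduleBlockingSync);

    // Initialize device 1's primary context without binding it to this
    // thread (assumes a second device is present).
    cudaInitDevice(1, 0, 0);

    // Initialize device 0's primary context and make it current to this
    // thread (internally via cuCtxSetCurrent()).
    cudaSetDevice(0);
    return 0;
}
```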
Primary contexts will remain active until they are explicitly deinitialized using cudaDeviceReset(). The function cudaDeviceReset() will deinitialize the primary context for the calling thread's current device immediately. The context will remain current to all of the threads that it was current to. The next CUDA Runtime API call on any thread which requires an active context will trigger the reinitialization of that device's primary context.
Note that primary contexts are shared resources. It is recommended that the primary context not be reset except just before exit or to recover from an unspecified launch failure.
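For example:

```cpp
#include <cuda_runtime.h>

int main()
{
    cudaSetDevice(0);
    // ... normal work on device 0 ...

    // Recommended only just before process exit (or to recover from an
    // unspecified launch failure): tears down the shared primary context.
    // Any later runtime call that needs a context would reinitialize it.
    cudaDeviceReset();
    return 0;
}
```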
CUcontext Interoperability
Note that using multiple CUcontexts per device within a single process will substantially degrade performance and is strongly discouraged. Instead, it is highly recommended to use either execution contexts (cudaExecutionContext_t) or the implicit one-to-one device-to-primary-context mapping that the CUDA Runtime API provides for the process.
If a non-primary CUcontext created through the CUDA Driver API is current to a thread, then CUDA Runtime API calls made from that thread will operate on that CUcontext, with the exceptions listed below. Interoperability between data types is discussed in the following sections.
The function cudaDeviceEnablePeerAccess() and the rest of the peer access API may not be called when a non-primary CUcontext is current. To use peer access with a context created through the CUDA Driver API, the Driver API's own peer access functions must be used instead.
All CUDA Runtime API state (e.g., global variables' addresses and values) travels with its underlying CUcontext. In particular, if a CUcontext is moved from one thread to another, then all CUDA Runtime API state will move to that thread as well.
Please note that attaching to legacy CUcontexts (those with a version of 3010, as returned by cuCtxGetApiVersion()) is not possible. The CUDA Runtime will return cudaErrorIncompatibleDriverContext in such cases.
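The following sketch shows a runtime call operating on a driver-created, non-primary context (a single extra context, since multiple CUcontexts per device are discouraged as noted above):

```cpp
#include <cuda.h>
#include <cuda_runtime.h>

int main()
{
    cuInit(0);
    CUdevice dev;
    cuDeviceGet(&dev, 0);

    // Create a non-primary driver context; cuCtxCreate also makes it
    // current to this thread.
    CUcontext ctx;
    cuCtxCreate(&ctx, 0, dev);

    // This runtime call now operates on ctx, not on the primary context.
    void* p = nullptr;
    cudaMalloc(&p, 1 << 20);
    cudaFree(p);

    cuCtxDestroy(ctx);
    return 0;
}
```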
Interactions between CUstream and cudaStream_t
The types CUstream and cudaStream_t are identical and may be used interchangeably.
Interactions between CUevent and cudaEvent_t
The types CUevent and cudaEvent_t are identical and may be used interchangeably.
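As an illustration of both of the preceding subsections, stream and event handles created through the runtime can be passed directly to driver API calls:

```cpp
#include <cuda.h>
#include <cuda_runtime.h>

int main()
{
    cudaSetDevice(0);

    cudaStream_t stream;
    cudaStreamCreate(&stream);
    cudaEvent_t event;
    cudaEventCreate(&event);

    // CUstream/cudaStream_t and CUevent/cudaEvent_t are identical types,
    // so no cast is needed in either direction.
    cuEventRecord(event, stream);
    cuStreamSynchronize(stream);

    cudaEventDestroy(event);
    cudaStreamDestroy(stream);
    return 0;
}
```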
Interactions between CUarray and cudaArray_t
The types CUarray and struct cudaArray * represent the same data type and may be used interchangeably by casting the two types between each other.
In order to use a CUarray in a CUDA Runtime API function which takes a struct cudaArray *, it is necessary to explicitly cast the CUarray to a struct cudaArray *.
In order to use a struct cudaArray * in a CUDA Driver API function which takes a CUarray, it is necessary to explicitly cast the struct cudaArray * to a CUarray.
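For example, an array allocated through the runtime can be queried through the driver API with an explicit cast:

```cpp
#include <cuda.h>
#include <cuda_runtime.h>

int main()
{
    cudaSetDevice(0);

    // Allocate a 2D array through the runtime...
    cudaChannelFormatDesc desc = cudaCreateChannelDesc<float>();
    cudaArray_t rtArray;
    cudaMallocArray(&rtArray, &desc, 64, 64);

    // ...and query it through the driver API via an explicit cast.
    CUDA_ARRAY_DESCRIPTOR drvDesc;
    cuArrayGetDescriptor(&drvDesc, (CUarray)rtArray);

    cudaFreeArray(rtArray);
    return 0;
}
```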
Interactions between CUgraphicsResource and cudaGraphicsResource_t
The types CUgraphicsResource and cudaGraphicsResource_t represent the same data type and may be used interchangeably by casting the two types between each other.
In order to use a CUgraphicsResource in a CUDA Runtime API function which takes a cudaGraphicsResource_t, it is necessary to explicitly cast the CUgraphicsResource to a cudaGraphicsResource_t.
In order to use a cudaGraphicsResource_t in a CUDA Driver API function which takes a CUgraphicsResource, it is necessary to explicitly cast the cudaGraphicsResource_t to a CUgraphicsResource.
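A sketch, assuming the handle was previously registered with buffer semantics through one of the runtime's graphics-interop registration functions (e.g., cudaGraphicsGLRegisterBuffer):

```cpp
#include <cuda.h>
#include <cuda_runtime.h>

// Map a runtime-registered resource through the driver API after an
// explicit cast.
void mapThroughDriver(cudaGraphicsResource_t rtResource, CUstream stream)
{
    CUgraphicsResource drvResource = (CUgraphicsResource)rtResource;
    cuGraphicsMapResources(1, &drvResource, stream);

    CUdeviceptr ptr;
    size_t size;
    cuGraphicsResourceGetMappedPointer(&ptr, &size, drvResource);

    cuGraphicsUnmapResources(1, &drvResource, stream);
}
```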
Interactions between CUtexObject and cudaTextureObject_t
The types CUtexObject and cudaTextureObject_t represent the same data type and may be used interchangeably by casting the two types between each other.
In order to use a CUtexObject in a CUDA Runtime API function which takes a cudaTextureObject_t, it is necessary to explicitly cast the CUtexObject to a cudaTextureObject_t.
In order to use a cudaTextureObject_t in a CUDA Driver API function which takes a CUtexObject, it is necessary to explicitly cast the cudaTextureObject_t to a CUtexObject.
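For example, a texture object created through the runtime can be inspected through the driver API:

```cpp
#include <cuda.h>
#include <cuda_runtime.h>

int main()
{
    cudaSetDevice(0);

    // Build a texture object over a small linear buffer via the runtime.
    float* buf;
    cudaMalloc(&buf, 256 * sizeof(float));

    cudaResourceDesc resDesc = {};
    resDesc.resType = cudaResourceTypeLinear;
    resDesc.res.linear.devPtr = buf;
    resDesc.res.linear.desc = cudaCreateChannelDesc<float>();
    resDesc.res.linear.sizeInBytes = 256 * sizeof(float);

    cudaTextureDesc texDesc = {};
    texDesc.readMode = cudaReadModeElementType;

    cudaTextureObject_t tex;
    cudaCreateTextureObject(&tex, &resDesc, &texDesc, nullptr);

    // Query the same object through the driver API via an explicit cast.
    CUDA_TEXTURE_DESC drvTexDesc;
    cuTexObjectGetTextureDesc(&drvTexDesc, (CUtexObject)tex);

    cudaDestroyTextureObject(tex);
    cudaFree(buf);
    return 0;
}
```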
Interactions between CUsurfObject and cudaSurfaceObject_t
The types CUsurfObject and cudaSurfaceObject_t represent the same data type and may be used interchangeably by casting the two types between each other.
In order to use a CUsurfObject in a CUDA Runtime API function which takes a cudaSurfaceObject_t, it is necessary to explicitly cast the CUsurfObject to a cudaSurfaceObject_t.
In order to use a cudaSurfaceObject_t in a CUDA Driver API function which takes a CUsurfObject, it is necessary to explicitly cast the cudaSurfaceObject_t to a CUsurfObject.
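Likewise for surface objects:

```cpp
#include <cuda.h>
#include <cuda_runtime.h>

int main()
{
    cudaSetDevice(0);

    // Surface objects require an array allocated for surface load/store.
    cudaChannelFormatDesc desc = cudaCreateChannelDesc<float>();
    cudaArray_t arr;
    cudaMallocArray(&arr, &desc, 64, 64, cudaArraySurfaceLoadStore);

    cudaResourceDesc resDesc = {};
    resDesc.resType = cudaResourceTypeArray;
    resDesc.res.array.array = arr;

    cudaSurfaceObject_t surf;
    cudaCreateSurfaceObject(&surf, &resDesc);

    // Inspect the same object through the driver API via an explicit cast.
    CUDA_RESOURCE_DESC drvResDesc;
    cuSurfObjectGetResourceDesc(&drvResDesc, (CUsurfObject)surf);

    cudaDestroySurfaceObject(surf);
    cudaFreeArray(arr);
    return 0;
}
```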
Interactions between CUfunction and cudaFunction_t
The types CUfunction and cudaFunction_t represent the same data type and may be used interchangeably by casting the two types between each other.
In order to use a cudaFunction_t in a CUDA Driver API function which takes a CUfunction, it is necessary to explicitly cast the cudaFunction_t to a CUfunction.
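For example (myKernel is a placeholder kernel for illustration):

```cpp
#include <cuda.h>
#include <cuda_runtime.h>
#include <cstdio>

__global__ void myKernel() { }  // placeholder kernel

int main()
{
    cudaSetDevice(0);

    // Look up the runtime handle for the kernel's entry function...
    cudaFunction_t rtFunc;
    cudaGetFuncBySymbol(&rtFunc, (const void*)myKernel);

    // ...and pass it to a driver API call via an explicit cast.
    int regs = 0;
    cuFuncGetAttribute(&regs, CU_FUNC_ATTRIBUTE_NUM_REGS, (CUfunction)rtFunc);
    printf("registers per thread: %d\n", regs);
    return 0;
}
```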
Interactions between CUkernel and cudaKernel_t
The types CUkernel and cudaKernel_t represent the same data type and may be used interchangeably by casting the two types between each other.
In order to use a cudaKernel_t in a CUDA Driver API function which takes a CUkernel, it is necessary to explicitly cast the cudaKernel_t to a CUkernel.
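Similarly (again with a placeholder kernel):

```cpp
#include <cuda.h>
#include <cuda_runtime.h>

__global__ void myKernel() { }  // placeholder kernel

int main()
{
    cudaSetDevice(0);

    cudaKernel_t rtKernel;
    cudaGetKernel(&rtKernel, (const void*)myKernel);

    // Kernel-granularity attributes are queried per device through the
    // driver API; the runtime handle is cast to CUkernel.
    CUdevice dev;
    cuDeviceGet(&dev, 0);
    int maxThreads = 0;
    cuKernelGetAttribute(&maxThreads,
                         CU_FUNC_ATTRIBUTE_MAX_THREADS_PER_BLOCK,
                         (CUkernel)rtKernel, dev);
    return 0;
}
```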
Functions
- __host__ cudaError_t cudaGetFuncBySymbol ( cudaFunction_t* functionPtr, const void* symbolPtr )
  Get pointer to device entry function that matches entry function symbolPtr.
Parameters
- functionPtr: Returns the device entry function
- symbolPtr: Pointer to device entry function to search for
Returns
cudaSuccess
Description
Returns in functionPtr the device entry function corresponding to the symbol symbolPtr.
- __host__ cudaError_t cudaGetKernel ( cudaKernel_t* kernelPtr, const void* entryFuncAddr )
  Get pointer to device kernel that matches entry function entryFuncAddr.
Parameters
- kernelPtr: Returns the device kernel
- entryFuncAddr: Address of the device entry function whose kernel is to be looked up
Returns
cudaSuccess, cudaErrorInvalidDeviceFunction
Description
Returns in kernelPtr the device kernel corresponding to the entry function entryFuncAddr.
Note that multiple symbols in different translation units may be registered with the CUDA Runtime under the same entryFuncAddr, so the order in which the translation units are loaded and registered with the CUDA Runtime can lead to differing return pointers in kernelPtr. The suggested way to ensure uniqueness is to limit the visibility of __global__ device functions, either by declaring them static or by using the hidden visibility attribute in the respective translation units.
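For example, declaring the entry point static keeps its symbol local to the translation unit, so the lookup is unambiguous:

```cpp
#include <cuda_runtime.h>

// static limits this symbol to the current translation unit, so no other
// unit can register a different kernel under the same entry address.
static __global__ void helper() { }  // placeholder kernel

void lookup()
{
    cudaKernel_t k;
    cudaGetKernel(&k, (const void*)helper);
}
```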
See also:
cudaGetKernel (C++ API)