5.1. Data types used by CUDA driver
Defines
- #define CUDA_ARRAY3D_2DARRAY 0x01
- #define CUDA_ARRAY3D_COLOR_ATTACHMENT 0x20
- #define CUDA_ARRAY3D_CUBEMAP 0x04
- #define CUDA_ARRAY3D_DEPTH_TEXTURE 0x10
- #define CUDA_ARRAY3D_LAYERED 0x01
- #define CUDA_ARRAY3D_SURFACE_LDST 0x02
- #define CUDA_ARRAY3D_TEXTURE_GATHER 0x08
- #define CUDA_COOPERATIVE_LAUNCH_MULTI_DEVICE_NO_POST_LAUNCH_SYNC 0x02
- #define CUDA_COOPERATIVE_LAUNCH_MULTI_DEVICE_NO_PRE_LAUNCH_SYNC 0x01
- #define CUDA_EXTERNAL_MEMORY_DEDICATED 0x1
- #define CUDA_EXTERNAL_SEMAPHORE_SIGNAL_SKIP_NVSCIBUF_MEMSYNC 0x01
- #define CUDA_EXTERNAL_SEMAPHORE_WAIT_SKIP_NVSCIBUF_MEMSYNC 0x02
- #define CUDA_NVSCISYNC_ATTR_SIGNAL 0x1
- #define CUDA_NVSCISYNC_ATTR_WAIT 0x2
- #define CUDA_VERSION 10020
- #define CU_DEVICE_CPU ((CUdevice)-1)
- #define CU_DEVICE_INVALID ((CUdevice)-2)
- #define CU_IPC_HANDLE_SIZE 64
- #define CU_LAUNCH_PARAM_BUFFER_POINTER ((void*)0x01)
- #define CU_LAUNCH_PARAM_BUFFER_SIZE ((void*)0x02)
- #define CU_LAUNCH_PARAM_END ((void*)0x00)
- #define CU_MEMHOSTALLOC_DEVICEMAP 0x02
- #define CU_MEMHOSTALLOC_PORTABLE 0x01
- #define CU_MEMHOSTALLOC_WRITECOMBINED 0x04
- #define CU_MEMHOSTREGISTER_DEVICEMAP 0x02
- #define CU_MEMHOSTREGISTER_IOMEMORY 0x04
- #define CU_MEMHOSTREGISTER_PORTABLE 0x01
- #define CU_PARAM_TR_DEFAULT -1
- #define CU_STREAM_LEGACY ((CUstream)0x1)
- #define CU_STREAM_PER_THREAD ((CUstream)0x2)
- #define CU_TRSA_OVERRIDE_FORMAT 0x01
- #define CU_TRSF_NORMALIZED_COORDINATES 0x02
- #define CU_TRSF_READ_AS_INTEGER 0x01
- #define CU_TRSF_SRGB 0x10
- #define MAX_PLANES 3
Typedefs
- typedef CUarray_st * CUarray
- typedef CUctx_st * CUcontext
- typedef int CUdevice
- typedef unsigned int CUdeviceptr
- typedef CUeglStreamConnection_st * CUeglStreamConnection
- typedef CUevent_st * CUevent
- typedef CUextMemory_st * CUexternalMemory
- typedef CUextSemaphore_st * CUexternalSemaphore
- typedef CUfunc_st * CUfunction
- typedef CUgraph_st * CUgraph
- typedef CUgraphExec_st * CUgraphExec
- typedef CUgraphNode_st * CUgraphNode
- typedef CUgraphicsResource_st * CUgraphicsResource
- typedef void(CUDA_CB* CUhostFn )( void* userData )
- typedef CUmipmappedArray_st * CUmipmappedArray
- typedef CUmod_st * CUmodule
- typedef size_t(CUDA_CB* CUoccupancyB2DSize )( int blockSize )
- typedef CUstream_st * CUstream
- typedef void(CUDA_CB* CUstreamCallback )( CUstream hStream, CUresult status, void* userData )
- typedef unsigned long long CUsurfObject
- typedef CUsurfref_st * CUsurfref
- typedef unsigned long long CUtexObject
- typedef CUtexref_st * CUtexref
Enumerations
- enum CUaddress_mode
- enum CUarray_cubemap_face
- enum CUarray_format
- enum CUcomputemode
- enum CUctx_flags
- enum CUdevice_P2PAttribute
- enum CUdevice_attribute
- enum CUeglColorFormat
- enum CUeglFrameType
- enum CUeglResourceLocationFlags
- enum CUevent_flags
- enum CUexternalMemoryHandleType
- enum CUexternalSemaphoreHandleType
- enum CUfilter_mode
- enum CUfunc_cache
- enum CUfunction_attribute
- enum CUgraphNodeType
- enum CUgraphicsMapResourceFlags
- enum CUgraphicsRegisterFlags
- enum CUipcMem_flags
- enum CUjitInputType
- enum CUjit_cacheMode
- enum CUjit_fallback
- enum CUjit_option
- enum CUjit_target
- enum CUlimit
- enum CUmemAccess_flags
- enum CUmemAllocationGranularity_flags
- enum CUmemAllocationHandleType
- enum CUmemAllocationType
- enum CUmemAttach_flags
- enum CUmemLocationType
- enum CUmem_advise
- enum CUmemorytype
- enum CUoccupancy_flags
- enum CUpointer_attribute
- enum CUresourceViewFormat
- enum CUresourcetype
- enum CUresult
- enum CUshared_carveout
- enum CUsharedconfig
- enum CUstreamBatchMemOpType
- enum CUstreamCaptureMode
- enum CUstreamCaptureStatus
- enum CUstreamWaitValue_flags
- enum CUstreamWriteValue_flags
- enum CUstream_flags
Defines
- #define CUDA_ARRAY3D_2DARRAY 0x01
-
Deprecated, use CUDA_ARRAY3D_LAYERED
- #define CUDA_ARRAY3D_COLOR_ATTACHMENT 0x20
-
This flag indicates that the CUDA array may be bound as a color target in an external graphics API
- #define CUDA_ARRAY3D_CUBEMAP 0x04
-
If set, the CUDA array is a collection of six 2D arrays, representing faces of a cube. The width of such a CUDA array must be equal to its height, and Depth must be six. If CUDA_ARRAY3D_LAYERED flag is also set, then the CUDA array is a collection of cubemaps and Depth must be a multiple of six.
- #define CUDA_ARRAY3D_DEPTH_TEXTURE 0x10
-
If set, this flag indicates that the CUDA array is a DEPTH_TEXTURE.
- #define CUDA_ARRAY3D_LAYERED 0x01
-
If set, the CUDA array is a collection of layers, where each layer is either a 1D or a 2D array and the Depth member of CUDA_ARRAY3D_DESCRIPTOR specifies the number of layers, not the depth of a 3D array.
- #define CUDA_ARRAY3D_SURFACE_LDST 0x02
-
This flag must be set in order to bind a surface reference to the CUDA array
- #define CUDA_ARRAY3D_TEXTURE_GATHER 0x08
-
This flag must be set in order to perform texture gather operations on a CUDA array.
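The extent rules stated above for CUDA_ARRAY3D_CUBEMAP and CUDA_ARRAY3D_LAYERED can be checked before an array is created. The following is a minimal sketch, not part of the driver API: the helper name is hypothetical, and the two flag values are redefined locally (with an EX_ prefix) only so the snippet compiles without cuda.h.

```c
#include <stdbool.h>
#include <stddef.h>

/* Flag values copied from the defines documented above; real code would
   include <cuda.h> and use CUDA_ARRAY3D_LAYERED / CUDA_ARRAY3D_CUBEMAP. */
#define EX_ARRAY3D_LAYERED 0x01u
#define EX_ARRAY3D_CUBEMAP 0x04u

/* Hypothetical helper: checks the extent rules stated for cubemap arrays. */
bool ex_cubemap_extent_ok(size_t width, size_t height, size_t depth,
                          unsigned int flags)
{
    if (!(flags & EX_ARRAY3D_CUBEMAP))
        return true;                       /* rules apply to cubemaps only */
    if (width != height)
        return false;                      /* each face must be square */
    if (flags & EX_ARRAY3D_LAYERED)
        /* Collection of cubemaps: Depth is a multiple of six. Rejecting
           zero is an assumption of this sketch. */
        return depth != 0 && depth % 6 == 0;
    return depth == 6;                     /* a single cubemap: six faces */
}
```

A call such as ex_cubemap_extent_ok(64, 64, 12, EX_ARRAY3D_CUBEMAP | EX_ARRAY3D_LAYERED) describes two 64x64 cubemaps and passes the check.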
- #define CUDA_COOPERATIVE_LAUNCH_MULTI_DEVICE_NO_POST_LAUNCH_SYNC 0x02
-
If set, any subsequent work pushed in a stream that participated in a call to cuLaunchCooperativeKernelMultiDevice will only wait for the kernel launched on the GPU corresponding to that stream to complete before it begins execution.
- #define CUDA_COOPERATIVE_LAUNCH_MULTI_DEVICE_NO_PRE_LAUNCH_SYNC 0x01
-
If set, each kernel launched as part of cuLaunchCooperativeKernelMultiDevice only waits for prior work in the stream corresponding to that GPU to complete before the kernel begins execution.
- #define CUDA_EXTERNAL_MEMORY_DEDICATED 0x1
-
Indicates that the external memory object is a dedicated resource
- #define CUDA_EXTERNAL_SEMAPHORE_SIGNAL_SKIP_NVSCIBUF_MEMSYNC 0x01
-
When the flags parameter of CUDA_EXTERNAL_SEMAPHORE_SIGNAL_PARAMS contains this flag, signaling an external semaphore object skips the memory synchronization operations over all external memory objects imported as CU_EXTERNAL_MEMORY_HANDLE_TYPE_NVSCIBUF. These operations are otherwise performed by default to ensure data coherency with other importers of the same NvSciBuf memory objects.
- #define CUDA_EXTERNAL_SEMAPHORE_WAIT_SKIP_NVSCIBUF_MEMSYNC 0x02
-
When the flags parameter of CUDA_EXTERNAL_SEMAPHORE_WAIT_PARAMS contains this flag, waiting on an external semaphore object skips the memory synchronization operations over all external memory objects imported as CU_EXTERNAL_MEMORY_HANDLE_TYPE_NVSCIBUF. These operations are otherwise performed by default to ensure data coherency with other importers of the same NvSciBuf memory objects.
- #define CUDA_NVSCISYNC_ATTR_SIGNAL 0x1
-
When the flags parameter of cuDeviceGetNvSciSyncAttributes is set to this value, the application needs signaler-specific NvSciSyncAttr to be filled by cuDeviceGetNvSciSyncAttributes.
- #define CUDA_NVSCISYNC_ATTR_WAIT 0x2
-
When the flags parameter of cuDeviceGetNvSciSyncAttributes is set to this value, the application needs waiter-specific NvSciSyncAttr to be filled by cuDeviceGetNvSciSyncAttributes.
- #define CUDA_VERSION 10020
-
CUDA API version number
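The version number packs the release as 1000 * major + 10 * minor, so 10020 corresponds to CUDA 10.2. A minimal sketch of decoding it is shown below; the helper names are hypothetical, and the constant is redefined under a different name so that the snippet compiles without cuda.h.

```c
/* Local stand-in for CUDA_VERSION as documented above; real code would
   include <cuda.h> and use the macro directly. */
#define EX_CUDA_VERSION 10020

/* Hypothetical helpers: split the packed version into its two parts. */
int ex_cuda_version_major(int version)
{
    return version / 1000;
}

int ex_cuda_version_minor(int version)
{
    return (version % 1000) / 10;
}
```

The same decoding applies to the value reported at run time by cuDriverGetVersion, which uses the same packing.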
- #define CU_DEVICE_CPU ((CUdevice)-1)
-
Device that represents the CPU
- #define CU_DEVICE_INVALID ((CUdevice)-2)
-
Device that represents an invalid device
- #define CU_IPC_HANDLE_SIZE 64
-
CUDA IPC handle size
- #define CU_LAUNCH_PARAM_BUFFER_POINTER ((void*)0x01)
-
Indicator that the next value in the extra parameter to cuLaunchKernel will be a pointer to a buffer containing all kernel parameters used for launching kernel f. This buffer needs to honor all alignment/padding requirements of the individual parameters. If CU_LAUNCH_PARAM_BUFFER_SIZE is not also specified in the extra array, then CU_LAUNCH_PARAM_BUFFER_POINTER will have no effect.
- #define CU_LAUNCH_PARAM_BUFFER_SIZE ((void*)0x02)
-
Indicator that the next value in the extra parameter to cuLaunchKernel will be a pointer to a size_t which contains the size of the buffer specified with CU_LAUNCH_PARAM_BUFFER_POINTER. It is required that CU_LAUNCH_PARAM_BUFFER_POINTER also be specified in the extra array if the value associated with CU_LAUNCH_PARAM_BUFFER_SIZE is not zero.
- #define CU_LAUNCH_PARAM_END ((void*)0x00)
-
End of array terminator for the extra parameter to cuLaunchKernel
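The three CU_LAUNCH_PARAM_* markers above describe a key/value layout for the extra array. The sketch below builds such an array for a hypothetical kernel taking (float scale, int n) and then walks it back to recover the declared buffer size. The marker values are copied into EX_-prefixed macros so the snippet compiles without cuda.h; the struct, the helper names, and the kernel signature are assumptions made for illustration.

```c
#include <stddef.h>

/* Marker values copied from the defines above; real code would include
   <cuda.h> and use the CU_LAUNCH_PARAM_* macros. */
#define EX_PARAM_END            ((void *)0x00)
#define EX_PARAM_BUFFER_POINTER ((void *)0x01)
#define EX_PARAM_BUFFER_SIZE    ((void *)0x02)

/* Hypothetical parameter block for a kernel f(float scale, int n). A C
   struct is used here on the assumption that its layout matches the
   alignment and padding the kernel expects. */
struct ex_kernel_args {
    float scale;
    int   n;
};

/* Fills `extra` (five slots) with marker/value pairs and a terminator. */
void ex_build_extra(void *extra[5], struct ex_kernel_args *args, size_t *size)
{
    *size = sizeof *args;
    extra[0] = EX_PARAM_BUFFER_POINTER;
    extra[1] = args;
    extra[2] = EX_PARAM_BUFFER_SIZE;
    extra[3] = size;
    extra[4] = EX_PARAM_END;
}

/* Walks the marker/value pairs and returns the declared size (0 if none). */
size_t ex_extra_buffer_size(void *const *extra)
{
    for (size_t i = 0; extra[i] != EX_PARAM_END; i += 2) {
        if (extra[i] == EX_PARAM_BUFFER_SIZE)
            return *(const size_t *)extra[i + 1];
    }
    return 0;
}

/* Builds an extra array and reads the size back, as a round-trip check. */
size_t ex_demo_size(void)
{
    struct ex_kernel_args args = { 2.0f, 4 };
    size_t size = 0;
    void *extra[5];
    ex_build_extra(extra, &args, &size);
    return ex_extra_buffer_size(extra);
}
```

In a real launch the array would be passed as the extra argument of cuLaunchKernel, with the kernelParams argument left NULL.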
- #define CU_MEMHOSTALLOC_DEVICEMAP 0x02
-
If set, host memory is mapped into CUDA address space and cuMemHostGetDevicePointer() may be called on the host pointer. Flag for cuMemHostAlloc()
- #define CU_MEMHOSTALLOC_PORTABLE 0x01
-
If set, host memory is portable between CUDA contexts. Flag for cuMemHostAlloc()
- #define CU_MEMHOSTALLOC_WRITECOMBINED 0x04
-
If set, host memory is allocated as write-combined - fast to write, faster to DMA, slow to read except via SSE4 streaming load instruction (MOVNTDQA). Flag for cuMemHostAlloc()
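The CU_MEMHOSTALLOC_* values are independent bits and can be combined with bitwise OR. As a small sketch, the hypothetical helper below tests whether a flag combination includes the device-mapping bit, which the description above ties to calling cuMemHostGetDevicePointer(). The values are copied into EX_-prefixed macros so the snippet compiles without cuda.h.

```c
#include <stdbool.h>

/* Values copied from the defines above; real code would include <cuda.h>. */
#define EX_MEMHOSTALLOC_PORTABLE      0x01u
#define EX_MEMHOSTALLOC_DEVICEMAP     0x02u
#define EX_MEMHOSTALLOC_WRITECOMBINED 0x04u

/* Hypothetical helper: true when the allocation was requested with the
   device-mapping bit set. */
bool ex_is_device_mapped(unsigned int flags)
{
    return (flags & EX_MEMHOSTALLOC_DEVICEMAP) != 0;
}
```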
- #define CU_MEMHOSTREGISTER_DEVICEMAP 0x02
-
If set, host memory is mapped into CUDA address space and cuMemHostGetDevicePointer() may be called on the host pointer. Flag for cuMemHostRegister()
- #define CU_MEMHOSTREGISTER_IOMEMORY 0x04
-
If set, the passed memory pointer is treated as pointing to some memory-mapped I/O space, e.g. belonging to a third-party PCIe device. On Windows the flag is a no-op. On Linux that memory is marked as non cache-coherent for the GPU and is expected to be physically contiguous. It may return CUDA_ERROR_NOT_PERMITTED if run as an unprivileged user, or CUDA_ERROR_NOT_SUPPORTED on older Linux kernel versions. On all other platforms, it is not supported and CUDA_ERROR_NOT_SUPPORTED is returned. Flag for cuMemHostRegister()
- #define CU_MEMHOSTREGISTER_PORTABLE 0x01
-
If set, host memory is portable between CUDA contexts. Flag for cuMemHostRegister()
- #define CU_PARAM_TR_DEFAULT -1
-
For texture references loaded into the module, use default texunit from texture reference.
- #define CU_STREAM_LEGACY ((CUstream)0x1)
-
Legacy stream handle
Stream handle that can be passed as a CUstream to use an implicit stream with legacy synchronization behavior.
See details of the synchronization behavior.
- #define CU_STREAM_PER_THREAD ((CUstream)0x2)
-
Per-thread stream handle
Stream handle that can be passed as a CUstream to use an implicit stream with per-thread synchronization behavior.
See details of the synchronization behavior.
- #define CU_TRSA_OVERRIDE_FORMAT 0x01
-
Override the texref format with a format inferred from the array. Flag for cuTexRefSetArray()
- #define CU_TRSF_NORMALIZED_COORDINATES 0x02
-
Use normalized texture coordinates in the range [0,1) instead of [0,dim). Flag for cuTexRefSetFlags()
- #define CU_TRSF_READ_AS_INTEGER 0x01
-
Read the texture as integers rather than promoting the values to floats in the range [0,1]. Flag for cuTexRefSetFlags()
- #define CU_TRSF_SRGB 0x10
-
Perform sRGB->linear conversion during texture read. Flag for cuTexRefSetFlags()
- #define MAX_PLANES 3
-
Maximum number of planes per frame
Typedefs
- typedef CUarray_st * CUarray
-
CUDA array
- typedef CUctx_st * CUcontext
-
CUDA context
- typedef int CUdevice
-
CUDA device
- typedef unsigned int CUdeviceptr
-
CUDA device pointer. CUdeviceptr is defined as an unsigned integer type whose size matches the size of a pointer on the target platform.
- typedef CUeglStreamConnection_st * CUeglStreamConnection
-
CUDA EGLStream connection
- typedef CUevent_st * CUevent
-
CUDA event
- typedef CUextMemory_st * CUexternalMemory
-
CUDA external memory
- typedef CUextSemaphore_st * CUexternalSemaphore
-
CUDA external semaphore
- typedef CUfunc_st * CUfunction
-
CUDA function
- typedef CUgraph_st * CUgraph
-
CUDA graph
- typedef CUgraphExec_st * CUgraphExec
-
CUDA executable graph
- typedef CUgraphNode_st * CUgraphNode
-
CUDA graph node
- typedef CUgraphicsResource_st * CUgraphicsResource
-
CUDA graphics interop resource
- typedef void(CUDA_CB* CUhostFn )( void* userData )
-
CUDA host function
Parameters
- userData
- Argument value passed to the function
- typedef CUmipmappedArray_st * CUmipmappedArray
-
CUDA mipmapped array
- typedef CUmod_st * CUmodule
-
CUDA module
- typedef size_t(CUDA_CB* CUoccupancyB2DSize )( int blockSize )
-
Block size to per-block dynamic shared memory mapping for a certain kernel
Parameters
- blockSize
- Block size of the kernel.
Returns
The dynamic shared memory needed by a block.
- typedef CUstream_st * CUstream
-
CUDA stream
- typedef void(CUDA_CB* CUstreamCallback )( CUstream hStream, CUresult status, void* userData )
-
CUDA stream callback
Parameters
- hStream
- The stream the callback was added to, as passed to cuStreamAddCallback. May be NULL.
- status
- CUDA_SUCCESS or any persistent error on the stream.
- userData
- User parameter provided at registration.
- typedef unsigned long long CUsurfObject
-
An opaque value that represents a CUDA surface object
- typedef CUsurfref_st * CUsurfref
-
CUDA surface reference
- typedef unsigned long long CUtexObject
-
An opaque value that represents a CUDA texture object
- typedef CUtexref_st * CUtexref
-
CUDA texture reference
Enumerations
- enum CUaddress_mode
-
Texture reference addressing modes
Values
- CU_TR_ADDRESS_MODE_WRAP = 0
- Wrapping address mode
- CU_TR_ADDRESS_MODE_CLAMP = 1
- Clamp to edge address mode
- CU_TR_ADDRESS_MODE_MIRROR = 2
- Mirror address mode
- CU_TR_ADDRESS_MODE_BORDER = 3
- Border address mode
- enum CUarray_cubemap_face
-
Array indices for cube faces
Values
- CU_CUBEMAP_FACE_POSITIVE_X = 0x00
- Positive X face of cubemap
- CU_CUBEMAP_FACE_NEGATIVE_X = 0x01
- Negative X face of cubemap
- CU_CUBEMAP_FACE_POSITIVE_Y = 0x02
- Positive Y face of cubemap
- CU_CUBEMAP_FACE_NEGATIVE_Y = 0x03
- Negative Y face of cubemap
- CU_CUBEMAP_FACE_POSITIVE_Z = 0x04
- Positive Z face of cubemap
- CU_CUBEMAP_FACE_NEGATIVE_Z = 0x05
- Negative Z face of cubemap
- enum CUarray_format
-
Array formats
Values
- CU_AD_FORMAT_UNSIGNED_INT8 = 0x01
- Unsigned 8-bit integers
- CU_AD_FORMAT_UNSIGNED_INT16 = 0x02
- Unsigned 16-bit integers
- CU_AD_FORMAT_UNSIGNED_INT32 = 0x03
- Unsigned 32-bit integers
- CU_AD_FORMAT_SIGNED_INT8 = 0x08
- Signed 8-bit integers
- CU_AD_FORMAT_SIGNED_INT16 = 0x09
- Signed 16-bit integers
- CU_AD_FORMAT_SIGNED_INT32 = 0x0a
- Signed 32-bit integers
- CU_AD_FORMAT_HALF = 0x10
- 16-bit floating point
- CU_AD_FORMAT_FLOAT = 0x20
- 32-bit floating point
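Each array format above fixes the size of one channel element, which is needed when computing row pitches or copy sizes. The sketch below maps a format value to its size in bytes. The enum is a local copy of the values listed above (EX_ prefix) so the snippet compiles without cuda.h, and the helper name is hypothetical.

```c
#include <stddef.h>

/* Local copy of the CUarray_format values listed above. */
enum ex_array_format {
    EX_AD_FORMAT_UNSIGNED_INT8  = 0x01,
    EX_AD_FORMAT_UNSIGNED_INT16 = 0x02,
    EX_AD_FORMAT_UNSIGNED_INT32 = 0x03,
    EX_AD_FORMAT_SIGNED_INT8    = 0x08,
    EX_AD_FORMAT_SIGNED_INT16   = 0x09,
    EX_AD_FORMAT_SIGNED_INT32   = 0x0a,
    EX_AD_FORMAT_HALF           = 0x10,
    EX_AD_FORMAT_FLOAT          = 0x20
};

/* Hypothetical helper: bytes per channel element, or 0 if unrecognized. */
size_t ex_format_bytes(enum ex_array_format fmt)
{
    switch (fmt) {
    case EX_AD_FORMAT_UNSIGNED_INT8:
    case EX_AD_FORMAT_SIGNED_INT8:
        return 1;
    case EX_AD_FORMAT_UNSIGNED_INT16:
    case EX_AD_FORMAT_SIGNED_INT16:
    case EX_AD_FORMAT_HALF:
        return 2;
    case EX_AD_FORMAT_UNSIGNED_INT32:
    case EX_AD_FORMAT_SIGNED_INT32:
    case EX_AD_FORMAT_FLOAT:
        return 4;
    }
    return 0;
}
```

The size of one texel is this value multiplied by the number of channels in the array descriptor.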
- enum CUcomputemode
-
Compute Modes
Values
- CU_COMPUTEMODE_DEFAULT = 0
- Default compute mode (Multiple contexts allowed per device)
- CU_COMPUTEMODE_PROHIBITED = 2
- Compute-prohibited mode (No contexts can be created on this device at this time)
- CU_COMPUTEMODE_EXCLUSIVE_PROCESS = 3
- Compute-exclusive-process mode (Only one context used by a single process can be present on this device at a time)
- enum CUctx_flags
-
Context creation flags
Values
- CU_CTX_SCHED_AUTO = 0x00
- Automatic scheduling
- CU_CTX_SCHED_SPIN = 0x01
- Set spin as default scheduling
- CU_CTX_SCHED_YIELD = 0x02
- Set yield as default scheduling
- CU_CTX_SCHED_BLOCKING_SYNC = 0x04
- Set blocking synchronization as default scheduling
- CU_CTX_BLOCKING_SYNC = 0x04
-
Deprecated
This flag was deprecated as of CUDA 4.0 and was replaced with CU_CTX_SCHED_BLOCKING_SYNC.
Set blocking synchronization as default scheduling
- CU_CTX_SCHED_MASK = 0x07
- CU_CTX_MAP_HOST = 0x08
- Support mapped pinned allocations
- CU_CTX_LMEM_RESIZE_TO_MAX = 0x10
- Keep local memory allocation after launch
- CU_CTX_FLAGS_MASK = 0x1f
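The two mask values carry no description above. Their values suggest that CU_CTX_SCHED_MASK covers the three scheduling bits and CU_CTX_FLAGS_MASK covers every flag in this enumeration; the sketch below relies on that reading, which is an assumption. The constants are local copies (EX_ prefix) so the snippet compiles without cuda.h, and the helper names are hypothetical.

```c
#include <stdbool.h>

/* Local copies of the CUctx_flags values listed above. */
#define EX_CTX_SCHED_AUTO          0x00u
#define EX_CTX_SCHED_SPIN          0x01u
#define EX_CTX_SCHED_YIELD         0x02u
#define EX_CTX_SCHED_BLOCKING_SYNC 0x04u
#define EX_CTX_SCHED_MASK          0x07u
#define EX_CTX_MAP_HOST            0x08u
#define EX_CTX_LMEM_RESIZE_TO_MAX  0x10u
#define EX_CTX_FLAGS_MASK          0x1fu

/* Hypothetical helper: isolates the scheduling bits of a flag word. */
unsigned int ex_ctx_sched_policy(unsigned int flags)
{
    return flags & EX_CTX_SCHED_MASK;
}

/* Hypothetical helper: true when no bit outside the flags mask is set. */
bool ex_ctx_flags_known(unsigned int flags)
{
    return (flags & ~EX_CTX_FLAGS_MASK) == 0;
}
```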
- enum CUdevice_P2PAttribute
-
P2P Attributes
Values
- CU_DEVICE_P2P_ATTRIBUTE_PERFORMANCE_RANK = 0x01
- A relative value indicating the performance of the link between two devices
- CU_DEVICE_P2P_ATTRIBUTE_ACCESS_SUPPORTED = 0x02
- P2P access is enabled
- CU_DEVICE_P2P_ATTRIBUTE_NATIVE_ATOMIC_SUPPORTED = 0x03
- Atomic operation over the link supported
- CU_DEVICE_P2P_ATTRIBUTE_ACCESS_ACCESS_SUPPORTED = 0x04
-
Deprecated
Use CU_DEVICE_P2P_ATTRIBUTE_CUDA_ARRAY_ACCESS_SUPPORTED instead
- CU_DEVICE_P2P_ATTRIBUTE_CUDA_ARRAY_ACCESS_SUPPORTED = 0x04
- Accessing CUDA arrays over the link supported
- enum CUdevice_attribute
-
Device properties
Values
- CU_DEVICE_ATTRIBUTE_MAX_THREADS_PER_BLOCK = 1
- Maximum number of threads per block
- CU_DEVICE_ATTRIBUTE_MAX_BLOCK_DIM_X = 2
- Maximum block dimension X
- CU_DEVICE_ATTRIBUTE_MAX_BLOCK_DIM_Y = 3
- Maximum block dimension Y
- CU_DEVICE_ATTRIBUTE_MAX_BLOCK_DIM_Z = 4
- Maximum block dimension Z
- CU_DEVICE_ATTRIBUTE_MAX_GRID_DIM_X = 5
- Maximum grid dimension X
- CU_DEVICE_ATTRIBUTE_MAX_GRID_DIM_Y = 6
- Maximum grid dimension Y
- CU_DEVICE_ATTRIBUTE_MAX_GRID_DIM_Z = 7
- Maximum grid dimension Z
- CU_DEVICE_ATTRIBUTE_MAX_SHARED_MEMORY_PER_BLOCK = 8
- Maximum shared memory available per block in bytes
- CU_DEVICE_ATTRIBUTE_SHARED_MEMORY_PER_BLOCK = 8
- Deprecated, use CU_DEVICE_ATTRIBUTE_MAX_SHARED_MEMORY_PER_BLOCK
- CU_DEVICE_ATTRIBUTE_TOTAL_CONSTANT_MEMORY = 9
- Memory available on device for __constant__ variables in a CUDA C kernel in bytes
- CU_DEVICE_ATTRIBUTE_WARP_SIZE = 10
- Warp size in threads
- CU_DEVICE_ATTRIBUTE_MAX_PITCH = 11
- Maximum pitch in bytes allowed by memory copies
- CU_DEVICE_ATTRIBUTE_MAX_REGISTERS_PER_BLOCK = 12
- Maximum number of 32-bit registers available per block
- CU_DEVICE_ATTRIBUTE_REGISTERS_PER_BLOCK = 12
- Deprecated, use CU_DEVICE_ATTRIBUTE_MAX_REGISTERS_PER_BLOCK
- CU_DEVICE_ATTRIBUTE_CLOCK_RATE = 13
- Typical clock frequency in kilohertz
- CU_DEVICE_ATTRIBUTE_TEXTURE_ALIGNMENT = 14
- Alignment requirement for textures
- CU_DEVICE_ATTRIBUTE_GPU_OVERLAP = 15
- Device can possibly copy memory and execute a kernel concurrently. Deprecated, use CU_DEVICE_ATTRIBUTE_ASYNC_ENGINE_COUNT instead.
- CU_DEVICE_ATTRIBUTE_MULTIPROCESSOR_COUNT = 16
- Number of multiprocessors on device
- CU_DEVICE_ATTRIBUTE_KERNEL_EXEC_TIMEOUT = 17
- Specifies whether there is a run time limit on kernels
- CU_DEVICE_ATTRIBUTE_INTEGRATED = 18
- Device is integrated with host memory
- CU_DEVICE_ATTRIBUTE_CAN_MAP_HOST_MEMORY = 19
- Device can map host memory into CUDA address space
- CU_DEVICE_ATTRIBUTE_COMPUTE_MODE = 20
- Compute mode (See CUcomputemode for details)
- CU_DEVICE_ATTRIBUTE_MAXIMUM_TEXTURE1D_WIDTH = 21
- Maximum 1D texture width
- CU_DEVICE_ATTRIBUTE_MAXIMUM_TEXTURE2D_WIDTH = 22
- Maximum 2D texture width
- CU_DEVICE_ATTRIBUTE_MAXIMUM_TEXTURE2D_HEIGHT = 23
- Maximum 2D texture height
- CU_DEVICE_ATTRIBUTE_MAXIMUM_TEXTURE3D_WIDTH = 24
- Maximum 3D texture width
- CU_DEVICE_ATTRIBUTE_MAXIMUM_TEXTURE3D_HEIGHT = 25
- Maximum 3D texture height
- CU_DEVICE_ATTRIBUTE_MAXIMUM_TEXTURE3D_DEPTH = 26
- Maximum 3D texture depth
- CU_DEVICE_ATTRIBUTE_MAXIMUM_TEXTURE2D_LAYERED_WIDTH = 27
- Maximum 2D layered texture width
- CU_DEVICE_ATTRIBUTE_MAXIMUM_TEXTURE2D_LAYERED_HEIGHT = 28
- Maximum 2D layered texture height
- CU_DEVICE_ATTRIBUTE_MAXIMUM_TEXTURE2D_LAYERED_LAYERS = 29
- Maximum layers in a 2D layered texture
- CU_DEVICE_ATTRIBUTE_MAXIMUM_TEXTURE2D_ARRAY_WIDTH = 27
- Deprecated, use CU_DEVICE_ATTRIBUTE_MAXIMUM_TEXTURE2D_LAYERED_WIDTH
- CU_DEVICE_ATTRIBUTE_MAXIMUM_TEXTURE2D_ARRAY_HEIGHT = 28
- Deprecated, use CU_DEVICE_ATTRIBUTE_MAXIMUM_TEXTURE2D_LAYERED_HEIGHT
- CU_DEVICE_ATTRIBUTE_MAXIMUM_TEXTURE2D_ARRAY_NUMSLICES = 29
- Deprecated, use CU_DEVICE_ATTRIBUTE_MAXIMUM_TEXTURE2D_LAYERED_LAYERS
- CU_DEVICE_ATTRIBUTE_SURFACE_ALIGNMENT = 30
- Alignment requirement for surfaces
- CU_DEVICE_ATTRIBUTE_CONCURRENT_KERNELS = 31
- Device can possibly execute multiple kernels concurrently
- CU_DEVICE_ATTRIBUTE_ECC_ENABLED = 32
- Device has ECC support enabled
- CU_DEVICE_ATTRIBUTE_PCI_BUS_ID = 33
- PCI bus ID of the device
- CU_DEVICE_ATTRIBUTE_PCI_DEVICE_ID = 34
- PCI device ID of the device
- CU_DEVICE_ATTRIBUTE_TCC_DRIVER = 35
- Device is using TCC driver model
- CU_DEVICE_ATTRIBUTE_MEMORY_CLOCK_RATE = 36
- Peak memory clock frequency in kilohertz
- CU_DEVICE_ATTRIBUTE_GLOBAL_MEMORY_BUS_WIDTH = 37
- Global memory bus width in bits
- CU_DEVICE_ATTRIBUTE_L2_CACHE_SIZE = 38
- Size of L2 cache in bytes
- CU_DEVICE_ATTRIBUTE_MAX_THREADS_PER_MULTIPROCESSOR = 39
- Maximum resident threads per multiprocessor
- CU_DEVICE_ATTRIBUTE_ASYNC_ENGINE_COUNT = 40
- Number of asynchronous engines
- CU_DEVICE_ATTRIBUTE_UNIFIED_ADDRESSING = 41
- Device shares a unified address space with the host
- CU_DEVICE_ATTRIBUTE_MAXIMUM_TEXTURE1D_LAYERED_WIDTH = 42
- Maximum 1D layered texture width
- CU_DEVICE_ATTRIBUTE_MAXIMUM_TEXTURE1D_LAYERED_LAYERS = 43
- Maximum layers in a 1D layered texture
- CU_DEVICE_ATTRIBUTE_CAN_TEX2D_GATHER = 44
- Deprecated, do not use.
- CU_DEVICE_ATTRIBUTE_MAXIMUM_TEXTURE2D_GATHER_WIDTH = 45
- Maximum 2D texture width if CUDA_ARRAY3D_TEXTURE_GATHER is set
- CU_DEVICE_ATTRIBUTE_MAXIMUM_TEXTURE2D_GATHER_HEIGHT = 46
- Maximum 2D texture height if CUDA_ARRAY3D_TEXTURE_GATHER is set
- CU_DEVICE_ATTRIBUTE_MAXIMUM_TEXTURE3D_WIDTH_ALTERNATE = 47
- Alternate maximum 3D texture width
- CU_DEVICE_ATTRIBUTE_MAXIMUM_TEXTURE3D_HEIGHT_ALTERNATE = 48
- Alternate maximum 3D texture height
- CU_DEVICE_ATTRIBUTE_MAXIMUM_TEXTURE3D_DEPTH_ALTERNATE = 49
- Alternate maximum 3D texture depth
- CU_DEVICE_ATTRIBUTE_PCI_DOMAIN_ID = 50
- PCI domain ID of the device
- CU_DEVICE_ATTRIBUTE_TEXTURE_PITCH_ALIGNMENT = 51
- Pitch alignment requirement for textures
- CU_DEVICE_ATTRIBUTE_MAXIMUM_TEXTURECUBEMAP_WIDTH = 52
- Maximum cubemap texture width/height
- CU_DEVICE_ATTRIBUTE_MAXIMUM_TEXTURECUBEMAP_LAYERED_WIDTH = 53
- Maximum cubemap layered texture width/height
- CU_DEVICE_ATTRIBUTE_MAXIMUM_TEXTURECUBEMAP_LAYERED_LAYERS = 54
- Maximum layers in a cubemap layered texture
- CU_DEVICE_ATTRIBUTE_MAXIMUM_SURFACE1D_WIDTH = 55
- Maximum 1D surface width
- CU_DEVICE_ATTRIBUTE_MAXIMUM_SURFACE2D_WIDTH = 56
- Maximum 2D surface width
- CU_DEVICE_ATTRIBUTE_MAXIMUM_SURFACE2D_HEIGHT = 57
- Maximum 2D surface height
- CU_DEVICE_ATTRIBUTE_MAXIMUM_SURFACE3D_WIDTH = 58
- Maximum 3D surface width
- CU_DEVICE_ATTRIBUTE_MAXIMUM_SURFACE3D_HEIGHT = 59
- Maximum 3D surface height
- CU_DEVICE_ATTRIBUTE_MAXIMUM_SURFACE3D_DEPTH = 60
- Maximum 3D surface depth
- CU_DEVICE_ATTRIBUTE_MAXIMUM_SURFACE1D_LAYERED_WIDTH = 61
- Maximum 1D layered surface width
- CU_DEVICE_ATTRIBUTE_MAXIMUM_SURFACE1D_LAYERED_LAYERS = 62
- Maximum layers in a 1D layered surface
- CU_DEVICE_ATTRIBUTE_MAXIMUM_SURFACE2D_LAYERED_WIDTH = 63
- Maximum 2D layered surface width
- CU_DEVICE_ATTRIBUTE_MAXIMUM_SURFACE2D_LAYERED_HEIGHT = 64
- Maximum 2D layered surface height
- CU_DEVICE_ATTRIBUTE_MAXIMUM_SURFACE2D_LAYERED_LAYERS = 65
- Maximum layers in a 2D layered surface
- CU_DEVICE_ATTRIBUTE_MAXIMUM_SURFACECUBEMAP_WIDTH = 66
- Maximum cubemap surface width
- CU_DEVICE_ATTRIBUTE_MAXIMUM_SURFACECUBEMAP_LAYERED_WIDTH = 67
- Maximum cubemap layered surface width
- CU_DEVICE_ATTRIBUTE_MAXIMUM_SURFACECUBEMAP_LAYERED_LAYERS = 68
- Maximum layers in a cubemap layered surface
- CU_DEVICE_ATTRIBUTE_MAXIMUM_TEXTURE1D_LINEAR_WIDTH = 69
- Maximum 1D linear texture width
- CU_DEVICE_ATTRIBUTE_MAXIMUM_TEXTURE2D_LINEAR_WIDTH = 70
- Maximum 2D linear texture width
- CU_DEVICE_ATTRIBUTE_MAXIMUM_TEXTURE2D_LINEAR_HEIGHT = 71
- Maximum 2D linear texture height
- CU_DEVICE_ATTRIBUTE_MAXIMUM_TEXTURE2D_LINEAR_PITCH = 72
- Maximum 2D linear texture pitch in bytes
- CU_DEVICE_ATTRIBUTE_MAXIMUM_TEXTURE2D_MIPMAPPED_WIDTH = 73
- Maximum mipmapped 2D texture width
- CU_DEVICE_ATTRIBUTE_MAXIMUM_TEXTURE2D_MIPMAPPED_HEIGHT = 74
- Maximum mipmapped 2D texture height
- CU_DEVICE_ATTRIBUTE_COMPUTE_CAPABILITY_MAJOR = 75
- Major compute capability version number
- CU_DEVICE_ATTRIBUTE_COMPUTE_CAPABILITY_MINOR = 76
- Minor compute capability version number
- CU_DEVICE_ATTRIBUTE_MAXIMUM_TEXTURE1D_MIPMAPPED_WIDTH = 77
- Maximum mipmapped 1D texture width
- CU_DEVICE_ATTRIBUTE_STREAM_PRIORITIES_SUPPORTED = 78
- Device supports stream priorities
- CU_DEVICE_ATTRIBUTE_GLOBAL_L1_CACHE_SUPPORTED = 79
- Device supports caching globals in L1
- CU_DEVICE_ATTRIBUTE_LOCAL_L1_CACHE_SUPPORTED = 80
- Device supports caching locals in L1
- CU_DEVICE_ATTRIBUTE_MAX_SHARED_MEMORY_PER_MULTIPROCESSOR = 81
- Maximum shared memory available per multiprocessor in bytes
- CU_DEVICE_ATTRIBUTE_MAX_REGISTERS_PER_MULTIPROCESSOR = 82
- Maximum number of 32-bit registers available per multiprocessor
- CU_DEVICE_ATTRIBUTE_MANAGED_MEMORY = 83
- Device can allocate managed memory on this system
- CU_DEVICE_ATTRIBUTE_MULTI_GPU_BOARD = 84
- Device is on a multi-GPU board
- CU_DEVICE_ATTRIBUTE_MULTI_GPU_BOARD_GROUP_ID = 85
- Unique id for a group of devices on the same multi-GPU board
- CU_DEVICE_ATTRIBUTE_HOST_NATIVE_ATOMIC_SUPPORTED = 86
- Link between the device and the host supports native atomic operations (this is a placeholder attribute, and is not supported on any current hardware)
- CU_DEVICE_ATTRIBUTE_SINGLE_TO_DOUBLE_PRECISION_PERF_RATIO = 87
- Ratio of single precision performance (in floating-point operations per second) to double precision performance
- CU_DEVICE_ATTRIBUTE_PAGEABLE_MEMORY_ACCESS = 88
- Device supports coherently accessing pageable memory without calling cudaHostRegister on it
- CU_DEVICE_ATTRIBUTE_CONCURRENT_MANAGED_ACCESS = 89
- Device can coherently access managed memory concurrently with the CPU
- CU_DEVICE_ATTRIBUTE_COMPUTE_PREEMPTION_SUPPORTED = 90
- Device supports compute preemption.
- CU_DEVICE_ATTRIBUTE_CAN_USE_HOST_POINTER_FOR_REGISTERED_MEM = 91
- Device can access host registered memory at the same virtual address as the CPU
- CU_DEVICE_ATTRIBUTE_CAN_USE_STREAM_MEM_OPS = 92
- cuStreamBatchMemOp and related APIs are supported.
- CU_DEVICE_ATTRIBUTE_CAN_USE_64_BIT_STREAM_MEM_OPS = 93
- 64-bit operations are supported in cuStreamBatchMemOp and related APIs.
- CU_DEVICE_ATTRIBUTE_CAN_USE_STREAM_WAIT_VALUE_NOR = 94
- CU_STREAM_WAIT_VALUE_NOR is supported.
- CU_DEVICE_ATTRIBUTE_COOPERATIVE_LAUNCH = 95
- Device supports launching cooperative kernels via cuLaunchCooperativeKernel
- CU_DEVICE_ATTRIBUTE_COOPERATIVE_MULTI_DEVICE_LAUNCH = 96
- Device can participate in cooperative kernels launched via cuLaunchCooperativeKernelMultiDevice
- CU_DEVICE_ATTRIBUTE_MAX_SHARED_MEMORY_PER_BLOCK_OPTIN = 97
- Maximum optin shared memory per block
- CU_DEVICE_ATTRIBUTE_CAN_FLUSH_REMOTE_WRITES = 98
- Both the CU_STREAM_WAIT_VALUE_FLUSH flag and the CU_STREAM_MEM_OP_FLUSH_REMOTE_WRITES MemOp are supported on the device. See Stream memory operations for additional details.
- CU_DEVICE_ATTRIBUTE_HOST_REGISTER_SUPPORTED = 99
- Device supports host memory registration via cudaHostRegister.
- CU_DEVICE_ATTRIBUTE_PAGEABLE_MEMORY_ACCESS_USES_HOST_PAGE_TABLES = 100
- Device accesses pageable memory via the host's page tables.
- CU_DEVICE_ATTRIBUTE_DIRECT_MANAGED_MEM_ACCESS_FROM_HOST = 101
- The host can directly access managed memory on the device without migration.
- CU_DEVICE_ATTRIBUTE_VIRTUAL_ADDRESS_MANAGEMENT_SUPPORTED = 102
- Device supports virtual address management APIs like cuMemAddressReserve, cuMemCreate, cuMemMap and related APIs
- CU_DEVICE_ATTRIBUTE_HANDLE_TYPE_POSIX_FILE_DESCRIPTOR_SUPPORTED = 103
- Device supports exporting memory to a posix file descriptor with cuMemExportToShareableHandle, if requested via cuMemCreate
- CU_DEVICE_ATTRIBUTE_HANDLE_TYPE_WIN32_HANDLE_SUPPORTED = 104
- Device supports exporting memory to a Win32 NT handle with cuMemExportToShareableHandle, if requested via cuMemCreate
- CU_DEVICE_ATTRIBUTE_HANDLE_TYPE_WIN32_KMT_HANDLE_SUPPORTED = 105
- Device supports exporting memory to a Win32 KMT handle with cuMemExportToShareableHandle, if requested via cuMemCreate
- CU_DEVICE_ATTRIBUTE_MAX
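Several attributes above are reported in fixed units: CU_DEVICE_ATTRIBUTE_MEMORY_CLOCK_RATE in kilohertz and CU_DEVICE_ATTRIBUTE_GLOBAL_MEMORY_BUS_WIDTH in bits. As an illustration of combining them, the sketch below estimates a theoretical peak memory bandwidth. The helper name is hypothetical, and the factor of two assumes double-data-rate memory, which the attribute list does not state. In a real program the two inputs would come from cuDeviceGetAttribute.

```c
/* Hypothetical helper: theoretical peak memory bandwidth in GB/s.
   mem_clock_khz  : value of CU_DEVICE_ATTRIBUTE_MEMORY_CLOCK_RATE (kilohertz)
   bus_width_bits : value of CU_DEVICE_ATTRIBUTE_GLOBAL_MEMORY_BUS_WIDTH (bits)
   The factor 2.0 assumes double-data-rate memory; that is an assumption of
   this sketch, not something the attribute descriptions state. */
double ex_peak_bandwidth_gbs(int mem_clock_khz, int bus_width_bits)
{
    double transfers_per_second = 2.0 * (double)mem_clock_khz * 1000.0;
    double bytes_per_transfer   = (double)bus_width_bits / 8.0;
    return transfers_per_second * bytes_per_transfer / 1e9;
}
```

The result is an upper bound under the stated assumption; measured bandwidth is typically lower.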
- enum CUeglColorFormat
-
CUDA EGL Color Format - The different planar and multiplanar formats currently supported for CUDA_EGL interops.
Values
- CU_EGL_COLOR_FORMAT_YUV420_PLANAR = 0x00
- Y, U, V in three surfaces, each in a separate surface, U/V width = 1/2 Y width, U/V height = 1/2 Y height.
- CU_EGL_COLOR_FORMAT_YUV420_SEMIPLANAR = 0x01
- Y, UV in two surfaces (UV as one surface) with VU byte ordering; width and height ratios are the same as YUV420Planar.
- CU_EGL_COLOR_FORMAT_YUV422_PLANAR = 0x02
- Y, U, V each in a separate surface, U/V width = 1/2 Y width, U/V height = Y height.
- CU_EGL_COLOR_FORMAT_YUV422_SEMIPLANAR = 0x03
- Y, UV in two surfaces with VU byte ordering; width and height ratios are the same as YUV422Planar.
- CU_EGL_COLOR_FORMAT_RGB = 0x04
- R/G/B three channels in one surface with BGR byte ordering. Only pitch linear format supported.
- CU_EGL_COLOR_FORMAT_BGR = 0x05
- R/G/B three channels in one surface with RGB byte ordering. Only pitch linear format supported.
- CU_EGL_COLOR_FORMAT_ARGB = 0x06
- R/G/B/A four channels in one surface with BGRA byte ordering.
- CU_EGL_COLOR_FORMAT_RGBA = 0x07
- R/G/B/A four channels in one surface with ABGR byte ordering.
- CU_EGL_COLOR_FORMAT_L = 0x08
- Single luminance channel in one surface.
- CU_EGL_COLOR_FORMAT_R = 0x09
- Single color channel in one surface.
- CU_EGL_COLOR_FORMAT_YUV444_PLANAR = 0x0A
- Y, U, V in three surfaces, each in a separate surface, U/V width = Y width, U/V height = Y height.
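The planar entries above differ only in how the U/V planes are sized relative to the Y plane: half width and half height for 4:2:0, half width and full height for 4:2:2, and full size for 4:4:4. The sketch below computes the chroma plane extent from the luma extent. The enum, struct, and function names are hypothetical, and even luma dimensions are assumed (integer division truncates otherwise).

```c
/* Hypothetical subsampling selector; not part of the driver API. */
enum ex_subsampling { EX_YUV420, EX_YUV422, EX_YUV444 };

struct ex_extent {
    unsigned int width;
    unsigned int height;
};

/* Returns the U/V plane extent for a given Y plane extent, following the
   ratios stated in the planar format descriptions above. */
struct ex_extent ex_chroma_extent(enum ex_subsampling s,
                                  unsigned int y_width, unsigned int y_height)
{
    struct ex_extent e = { y_width, y_height };   /* 4:4:4: same as luma */
    if (s == EX_YUV420 || s == EX_YUV422)
        e.width = y_width / 2;                    /* half width */
    if (s == EX_YUV420)
        e.height = y_height / 2;                  /* half height as well */
    return e;
}
```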
- CU_EGL_COLOR_FORMAT_YUV444_SEMIPLANAR = 0x0B
- Y, UV in two surfaces (UV as one surface) with VU byte ordering; width and height ratios are the same as YUV444Planar.
- CU_EGL_COLOR_FORMAT_YUYV_422 = 0x0C
- Y, U, V in one surface, interleaved as UYVY.
- CU_EGL_COLOR_FORMAT_UYVY_422 = 0x0D
- Y, U, V in one surface, interleaved as YUYV.
- CU_EGL_COLOR_FORMAT_ABGR = 0x0E
- R/G/B/A four channels in one surface with RGBA byte ordering.
- CU_EGL_COLOR_FORMAT_BGRA = 0x0F
- R/G/B/A four channels in one surface with ARGB byte ordering.
- CU_EGL_COLOR_FORMAT_A = 0x10
- Alpha color format - one channel in one surface.
- CU_EGL_COLOR_FORMAT_RG = 0x11
- R/G color format - two channels in one surface with GR byte ordering
- CU_EGL_COLOR_FORMAT_AYUV = 0x12
- Y, U, V, A four channels in one surface, interleaved as VUYA.
- CU_EGL_COLOR_FORMAT_YVU444_SEMIPLANAR = 0x13
- Y, VU in two surfaces (VU as one surface) with UV byte ordering, U/V width = Y width, U/V height = Y height.
- CU_EGL_COLOR_FORMAT_YVU422_SEMIPLANAR = 0x14
- Y, VU in two surfaces (VU as one surface) with UV byte ordering, U/V width = 1/2 Y width, U/V height = Y height.
- CU_EGL_COLOR_FORMAT_YVU420_SEMIPLANAR = 0x15
- Y, VU in two surfaces (VU as one surface) with UV byte ordering, U/V width = 1/2 Y width, U/V height = 1/2 Y height.
- CU_EGL_COLOR_FORMAT_Y10V10U10_444_SEMIPLANAR = 0x16
- Y10, V10U10 in two surfaces (VU as one surface) with UV byte ordering, U/V width = Y width, U/V height = Y height.
- CU_EGL_COLOR_FORMAT_Y10V10U10_420_SEMIPLANAR = 0x17
- Y10, V10U10 in two surfaces (VU as one surface) with UV byte ordering, U/V width = 1/2 Y width, U/V height = 1/2 Y height.
- CU_EGL_COLOR_FORMAT_Y12V12U12_444_SEMIPLANAR = 0x18
- Y12, V12U12 in two surfaces (VU as one surface) with UV byte ordering, U/V width = Y width, U/V height = Y height.
- CU_EGL_COLOR_FORMAT_Y12V12U12_420_SEMIPLANAR = 0x19
- Y12, V12U12 in two surfaces (VU as one surface) with UV byte ordering, U/V width = 1/2 Y width, U/V height = 1/2 Y height.
- CU_EGL_COLOR_FORMAT_VYUY_ER = 0x1A
- Extended Range Y, U, V in one surface, interleaved as YVYU.
- CU_EGL_COLOR_FORMAT_UYVY_ER = 0x1B
- Extended Range Y, U, V in one surface, interleaved as YUYV.
- CU_EGL_COLOR_FORMAT_YUYV_ER = 0x1C
- Extended Range Y, U, V in one surface, interleaved as UYVY.
- CU_EGL_COLOR_FORMAT_YVYU_ER = 0x1D
- Extended Range Y, U, V in one surface, interleaved as VYUY.
- CU_EGL_COLOR_FORMAT_YUV_ER = 0x1E
- Extended Range Y, U, V three channels in one surface, interleaved as VUY. Only pitch linear format supported.
- CU_EGL_COLOR_FORMAT_YUVA_ER = 0x1F
- Extended Range Y, U, V, A four channels in one surface, interleaved as AVUY.
- CU_EGL_COLOR_FORMAT_AYUV_ER = 0x20
- Extended Range Y, U, V, A four channels in one surface, interleaved as VUYA.
- CU_EGL_COLOR_FORMAT_YUV444_PLANAR_ER = 0x21
- Extended Range Y, U, V in three surfaces, U/V width = Y width, U/V height = Y height.
- CU_EGL_COLOR_FORMAT_YUV422_PLANAR_ER = 0x22
- Extended Range Y, U, V in three surfaces, U/V width = 1/2 Y width, U/V height = Y height.
- CU_EGL_COLOR_FORMAT_YUV420_PLANAR_ER = 0x23
- Extended Range Y, U, V in three surfaces, U/V width = 1/2 Y width, U/V height = 1/2 Y height.
- CU_EGL_COLOR_FORMAT_YUV444_SEMIPLANAR_ER = 0x24
- Extended Range Y, UV in two surfaces (UV as one surface) with VU byte ordering, U/V width = Y width, U/V height = Y height.
- CU_EGL_COLOR_FORMAT_YUV422_SEMIPLANAR_ER = 0x25
- Extended Range Y, UV in two surfaces (UV as one surface) with VU byte ordering, U/V width = 1/2 Y width, U/V height = Y height.
- CU_EGL_COLOR_FORMAT_YUV420_SEMIPLANAR_ER = 0x26
- Extended Range Y, UV in two surfaces (UV as one surface) with VU byte ordering, U/V width = 1/2 Y width, U/V height = 1/2 Y height.
- CU_EGL_COLOR_FORMAT_YVU444_PLANAR_ER = 0x27
- Extended Range Y, V, U in three surfaces, U/V width = Y width, U/V height = Y height.
- CU_EGL_COLOR_FORMAT_YVU422_PLANAR_ER = 0x28
- Extended Range Y, V, U in three surfaces, U/V width = 1/2 Y width, U/V height = Y height.
- CU_EGL_COLOR_FORMAT_YVU420_PLANAR_ER = 0x29
- Extended Range Y, V, U in three surfaces, U/V width = 1/2 Y width, U/V height = 1/2 Y height.
- CU_EGL_COLOR_FORMAT_YVU444_SEMIPLANAR_ER = 0x2A
- Extended Range Y, VU in two surfaces (VU as one surface) with UV byte ordering, U/V width = Y width, U/V height = Y height.
- CU_EGL_COLOR_FORMAT_YVU422_SEMIPLANAR_ER = 0x2B
- Extended Range Y, VU in two surfaces (VU as one surface) with UV byte ordering, U/V width = 1/2 Y width, U/V height = Y height.
- CU_EGL_COLOR_FORMAT_YVU420_SEMIPLANAR_ER = 0x2C
- Extended Range Y, VU in two surfaces (VU as one surface) with UV byte ordering, U/V width = 1/2 Y width, U/V height = 1/2 Y height.
- CU_EGL_COLOR_FORMAT_BAYER_RGGB = 0x2D
- Bayer format - one channel in one surface with interleaved RGGB ordering.
- CU_EGL_COLOR_FORMAT_BAYER_BGGR = 0x2E
- Bayer format - one channel in one surface with interleaved BGGR ordering.
- CU_EGL_COLOR_FORMAT_BAYER_GRBG = 0x2F
- Bayer format - one channel in one surface with interleaved GRBG ordering.
- CU_EGL_COLOR_FORMAT_BAYER_GBRG = 0x30
- Bayer format - one channel in one surface with interleaved GBRG ordering.
- CU_EGL_COLOR_FORMAT_BAYER10_RGGB = 0x31
- Bayer10 format - one channel in one surface with interleaved RGGB ordering. Of the 16 bits, 10 are used and 6 are no-op.
- CU_EGL_COLOR_FORMAT_BAYER10_BGGR = 0x32
- Bayer10 format - one channel in one surface with interleaved BGGR ordering. Of the 16 bits, 10 are used and 6 are no-op.
- CU_EGL_COLOR_FORMAT_BAYER10_GRBG = 0x33
- Bayer10 format - one channel in one surface with interleaved GRBG ordering. Of the 16 bits, 10 are used and 6 are no-op.
- CU_EGL_COLOR_FORMAT_BAYER10_GBRG = 0x34
- Bayer10 format - one channel in one surface with interleaved GBRG ordering. Of the 16 bits, 10 are used and 6 are no-op.
- CU_EGL_COLOR_FORMAT_BAYER12_RGGB = 0x35
- Bayer12 format - one channel in one surface with interleaved RGGB ordering. Of the 16 bits, 12 are used and 4 are no-op.
- CU_EGL_COLOR_FORMAT_BAYER12_BGGR = 0x36
- Bayer12 format - one channel in one surface with interleaved BGGR ordering. Of the 16 bits, 12 are used and 4 are no-op.
- CU_EGL_COLOR_FORMAT_BAYER12_GRBG = 0x37
- Bayer12 format - one channel in one surface with interleaved GRBG ordering. Of the 16 bits, 12 are used and 4 are no-op.
- CU_EGL_COLOR_FORMAT_BAYER12_GBRG = 0x38
- Bayer12 format - one channel in one surface with interleaved GBRG ordering. Of the 16 bits, 12 are used and 4 are no-op.
- CU_EGL_COLOR_FORMAT_BAYER14_RGGB = 0x39
- Bayer14 format - one channel in one surface with interleaved RGGB ordering. Of the 16 bits, 14 are used and 2 are no-op.
- CU_EGL_COLOR_FORMAT_BAYER14_BGGR = 0x3A
- Bayer14 format - one channel in one surface with interleaved BGGR ordering. Of the 16 bits, 14 are used and 2 are no-op.
- CU_EGL_COLOR_FORMAT_BAYER14_GRBG = 0x3B
- Bayer14 format - one channel in one surface with interleaved GRBG ordering. Of the 16 bits, 14 are used and 2 are no-op.
- CU_EGL_COLOR_FORMAT_BAYER14_GBRG = 0x3C
- Bayer14 format - one channel in one surface with interleaved GBRG ordering. Of the 16 bits, 14 are used and 2 are no-op.
- CU_EGL_COLOR_FORMAT_BAYER20_RGGB = 0x3D
- Bayer20 format - one channel in one surface with interleaved RGGB ordering. Of the 32 bits, 20 are used and 12 are no-op.
- CU_EGL_COLOR_FORMAT_BAYER20_BGGR = 0x3E
- Bayer20 format - one channel in one surface with interleaved BGGR ordering. Of the 32 bits, 20 are used and 12 are no-op.
- CU_EGL_COLOR_FORMAT_BAYER20_GRBG = 0x3F
- Bayer20 format - one channel in one surface with interleaved GRBG ordering. Of the 32 bits, 20 are used and 12 are no-op.
- CU_EGL_COLOR_FORMAT_BAYER20_GBRG = 0x40
- Bayer20 format - one channel in one surface with interleaved GBRG ordering. Of the 32 bits, 20 are used and 12 are no-op.
- CU_EGL_COLOR_FORMAT_YVU444_PLANAR = 0x41
- Y, V, U in three surfaces (one component per surface), U/V width = Y width, U/V height = Y height.
- CU_EGL_COLOR_FORMAT_YVU422_PLANAR = 0x42
- Y, V, U in three surfaces (one component per surface), U/V width = 1/2 Y width, U/V height = Y height.
- CU_EGL_COLOR_FORMAT_YVU420_PLANAR = 0x43
- Y, V, U in three surfaces (one component per surface), U/V width = 1/2 Y width, U/V height = 1/2 Y height.
- CU_EGL_COLOR_FORMAT_BAYER_ISP_RGGB = 0x44
- NVIDIA proprietary Bayer ISP format - one channel in one surface with interleaved RGGB ordering, mapped to an opaque integer datatype.
- CU_EGL_COLOR_FORMAT_BAYER_ISP_BGGR = 0x45
- NVIDIA proprietary Bayer ISP format - one channel in one surface with interleaved BGGR ordering, mapped to an opaque integer datatype.
- CU_EGL_COLOR_FORMAT_BAYER_ISP_GRBG = 0x46
- NVIDIA proprietary Bayer ISP format - one channel in one surface with interleaved GRBG ordering, mapped to an opaque integer datatype.
- CU_EGL_COLOR_FORMAT_BAYER_ISP_GBRG = 0x47
- NVIDIA proprietary Bayer ISP format - one channel in one surface with interleaved GBRG ordering, mapped to an opaque integer datatype.
- CU_EGL_COLOR_FORMAT_MAX
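The planar and semi-planar entries above all state the chroma (U/V) size as a fraction of the luma (Y) size: full size for the 444 formats, half width for 422, and half width and half height for 420. The following is a minimal sketch of that arithmetic. The `subsampling` enum, the `plane_dims` struct and the `chroma_dims` helper are hypothetical names introduced for illustration and are not part of the driver API. Rounding odd luma sizes up is an assumption, since the descriptions above do not say how odd dimensions are handled.

```c
#include <assert.h>

/* Hypothetical subsampling classes, named after the 444/422/420 suffixes
 * in the format names above. Not part of the CUDA driver API. */
enum subsampling { SUBSAMPLING_444, SUBSAMPLING_422, SUBSAMPLING_420 };

struct plane_dims { unsigned width, height; };

/* Returns the size of one chroma (U or V) plane for a given luma (Y) size.
 * Assumption: odd luma sizes round up; the source does not specify this. */
static struct plane_dims chroma_dims(enum subsampling s,
                                     unsigned y_width, unsigned y_height)
{
    struct plane_dims d = { y_width, y_height };   /* 444: same size as Y */
    assert(y_width > 0 && y_height > 0);
    if (s == SUBSAMPLING_422 || s == SUBSAMPLING_420)
        d.width = (y_width + 1) / 2;               /* U/V width = 1/2 Y width */
    if (s == SUBSAMPLING_420)
        d.height = (y_height + 1) / 2;             /* U/V height = 1/2 Y height */
    return d;
}
```

For a 1920x1080 luma plane, `chroma_dims(SUBSAMPLING_420, 1920, 1080)` yields a 960x540 chroma plane. For semi-planar formats the two chroma components share one surface, so each row of that surface holds both U and V samples.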
- enum CUeglFrameType
-
CUDA EglFrame type - array or pointer
Values
- CU_EGL_FRAME_TYPE_ARRAY = 0
- Frame type CUDA array
- CU_EGL_FRAME_TYPE_PITCH = 1
- Frame type pointer
- enum CUeglResourceLocationFlags
-
Resource location flags - sysmem or vidmem
For a CUDA context on an iGPU, video and system memory are equivalent, so these flags have no effect on execution.
For a CUDA context on a dGPU, applications can use CUeglResourceLocationFlags to give a hint about the desired location.
CU_EGL_RESOURCE_LOCATION_SYSMEM - the frame data is made resident in system memory to be accessed by CUDA.
CU_EGL_RESOURCE_LOCATION_VIDMEM - the frame data is made resident in dedicated video memory to be accessed by CUDA.
There may be additional latency due to new allocation and data migration if the frame is produced in a different memory.
Values
- CU_EGL_RESOURCE_LOCATION_SYSMEM = 0x00
- Resource location sysmem
- CU_EGL_RESOURCE_LOCATION_VIDMEM = 0x01
- Resource location vidmem
- enum CUevent_flags
-
Event creation flags
Values
- CU_EVENT_DEFAULT = 0x0
- Default event flag
- CU_EVENT_BLOCKING_SYNC = 0x1
- Event uses blocking synchronization
- CU_EVENT_DISABLE_TIMING = 0x2
- Event will not record timing data
- CU_EVENT_INTERPROCESS = 0x4
- Event is suitable for interprocess use. CU_EVENT_DISABLE_TIMING must be set
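The only constraint stated above is that CU_EVENT_INTERPROCESS requires CU_EVENT_DISABLE_TIMING. The following is a minimal sketch of a pre-flight check for that rule. The `EV_*` constants are local mirrors of the values listed above so that the sketch compiles without `cuda.h`, and `event_flags_valid` is a hypothetical helper, not a driver function. Rejecting bits outside the four listed flags is an assumption.

```c
#include <assert.h>
#include <stdbool.h>

/* Local mirrors of the CUevent_flags values listed above. */
enum {
    EV_DEFAULT        = 0x0,
    EV_BLOCKING_SYNC  = 0x1,
    EV_DISABLE_TIMING = 0x2,
    EV_INTERPROCESS   = 0x4
};

/* Hypothetical helper: true if the flag combination respects the documented
 * rule that CU_EVENT_INTERPROCESS requires CU_EVENT_DISABLE_TIMING.
 * Assumption: any bit outside the listed flags is treated as invalid. */
static bool event_flags_valid(unsigned flags)
{
    const unsigned known = EV_BLOCKING_SYNC | EV_DISABLE_TIMING | EV_INTERPROCESS;
    if (flags & ~known)
        return false;   /* unknown bits set */
    if ((flags & EV_INTERPROCESS) && !(flags & EV_DISABLE_TIMING))
        return false;   /* interprocess without disable-timing */
    return true;
}
```

A caller could guard event creation with `assert(event_flags_valid(flags))` before passing the flags to the driver.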
- enum CUexternalMemoryHandleType
-
External memory handle types
Values
- CU_EXTERNAL_MEMORY_HANDLE_TYPE_OPAQUE_FD = 1
- Handle is an opaque file descriptor
- CU_EXTERNAL_MEMORY_HANDLE_TYPE_OPAQUE_WIN32 = 2
- Handle is an opaque shared NT handle
- CU_EXTERNAL_MEMORY_HANDLE_TYPE_OPAQUE_WIN32_KMT = 3
- Handle is an opaque, globally shared handle
- CU_EXTERNAL_MEMORY_HANDLE_TYPE_D3D12_HEAP = 4
- Handle is a D3D12 heap object
- CU_EXTERNAL_MEMORY_HANDLE_TYPE_D3D12_RESOURCE = 5
- Handle is a D3D12 committed resource
- CU_EXTERNAL_MEMORY_HANDLE_TYPE_D3D11_RESOURCE = 6
- Handle is a shared NT handle to a D3D11 resource
- CU_EXTERNAL_MEMORY_HANDLE_TYPE_D3D11_RESOURCE_KMT = 7
- Handle is a globally shared handle to a D3D11 resource
- CU_EXTERNAL_MEMORY_HANDLE_TYPE_NVSCIBUF = 8
- Handle is an NvSciBuf object
- enum CUexternalSemaphoreHandleType
-
External semaphore handle types
Values
- CU_EXTERNAL_SEMAPHORE_HANDLE_TYPE_OPAQUE_FD = 1
- Handle is an opaque file descriptor
- CU_EXTERNAL_SEMAPHORE_HANDLE_TYPE_OPAQUE_WIN32 = 2
- Handle is an opaque shared NT handle
- CU_EXTERNAL_SEMAPHORE_HANDLE_TYPE_OPAQUE_WIN32_KMT = 3
- Handle is an opaque, globally shared handle
- CU_EXTERNAL_SEMAPHORE_HANDLE_TYPE_D3D12_FENCE = 4
- Handle is a shared NT handle referencing a D3D12 fence object
- CU_EXTERNAL_SEMAPHORE_HANDLE_TYPE_D3D11_FENCE = 5
- Handle is a shared NT handle referencing a D3D11 fence object
- CU_EXTERNAL_SEMAPHORE_HANDLE_TYPE_NVSCISYNC = 6
- Opaque handle to NvSciSync Object
- CU_EXTERNAL_SEMAPHORE_HANDLE_TYPE_D3D11_KEYED_MUTEX = 7
- Handle is a shared NT handle referencing a D3D11 keyed mutex object
- CU_EXTERNAL_SEMAPHORE_HANDLE_TYPE_D3D11_KEYED_MUTEX_KMT = 8
- Handle is a globally shared handle referencing a D3D11 keyed mutex object
- enum CUfilter_mode
-
Texture reference filtering modes
Values
- CU_TR_FILTER_MODE_POINT = 0
- Point filter mode
- CU_TR_FILTER_MODE_LINEAR = 1
- Linear filter mode
- enum CUfunc_cache
-
Function cache configurations
Values
- CU_FUNC_CACHE_PREFER_NONE = 0x00
- no preference for shared memory or L1 (default)
- CU_FUNC_CACHE_PREFER_SHARED = 0x01
- prefer larger shared memory and smaller L1 cache
- CU_FUNC_CACHE_PREFER_L1 = 0x02
- prefer larger L1 cache and smaller shared memory
- CU_FUNC_CACHE_PREFER_EQUAL = 0x03
- prefer equal sized L1 cache and shared memory
- enum CUfunction_attribute
-
Function properties
Values
- CU_FUNC_ATTRIBUTE_MAX_THREADS_PER_BLOCK = 0
- The maximum number of threads per block, beyond which a launch of the function would fail. This number depends on both the function and the device on which the function is currently loaded.
- CU_FUNC_ATTRIBUTE_SHARED_SIZE_BYTES = 1
- The size in bytes of statically-allocated shared memory required by this function. This does not include dynamically-allocated shared memory requested by the user at runtime.
- CU_FUNC_ATTRIBUTE_CONST_SIZE_BYTES = 2
- The size in bytes of user-allocated constant memory required by this function.
- CU_FUNC_ATTRIBUTE_LOCAL_SIZE_BYTES = 3
- The size in bytes of local memory used by each thread of this function.
- CU_FUNC_ATTRIBUTE_NUM_REGS = 4
- The number of registers used by each thread of this function.
- CU_FUNC_ATTRIBUTE_PTX_VERSION = 5
- The PTX virtual architecture version for which the function was compiled. This value is the major PTX version * 10 + the minor PTX version, so a PTX version 1.3 function would return the value 13. Note that this may return the undefined value of 0 for cubins compiled prior to CUDA 3.0.
- CU_FUNC_ATTRIBUTE_BINARY_VERSION = 6
- The binary architecture version for which the function was compiled. This value is the major binary version * 10 + the minor binary version, so a binary version 1.3 function would return the value 13. Note that this will return a value of 10 for legacy cubins that do not have a properly-encoded binary architecture version.
- CU_FUNC_ATTRIBUTE_CACHE_MODE_CA = 7
- The attribute to indicate whether the function has been compiled with the user-specified option "-Xptxas --dlcm=ca" set.
- CU_FUNC_ATTRIBUTE_MAX_DYNAMIC_SHARED_SIZE_BYTES = 8
- The maximum size in bytes of dynamically-allocated shared memory that can be used by this function. If the user-specified dynamic shared memory size is larger than this value, the launch will fail. See cuFuncSetAttribute.
- CU_FUNC_ATTRIBUTE_PREFERRED_SHARED_MEMORY_CARVEOUT = 9
- On devices where the L1 cache and shared memory use the same hardware resources, this sets the shared memory carveout preference, in percent of the total shared memory. Refer to CU_DEVICE_ATTRIBUTE_MAX_SHARED_MEMORY_PER_MULTIPROCESSOR. This is only a hint, and the driver can choose a different ratio if required to execute the function. See cuFuncSetAttribute.
- CU_FUNC_ATTRIBUTE_MAX
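CU_FUNC_ATTRIBUTE_PTX_VERSION and CU_FUNC_ATTRIBUTE_BINARY_VERSION both encode a version as major * 10 + minor. The following is a minimal sketch of the decoding. The `arch_version` struct and the `decode_arch_version` helper are hypothetical names introduced for illustration.

```c
#include <assert.h>

struct arch_version { int major, minor; };

/* Splits a value encoded as major * 10 + minor, the encoding described for
 * CU_FUNC_ATTRIBUTE_PTX_VERSION and CU_FUNC_ATTRIBUTE_BINARY_VERSION. */
static struct arch_version decode_arch_version(int encoded)
{
    struct arch_version v;
    assert(encoded >= 0);
    v.major = encoded / 10;
    v.minor = encoded % 10;
    return v;
}
```

An encoded value of 13 decodes to version 1.3, matching the example in the descriptions above. A PTX version of 0 decodes to 0.0, which corresponds to the undefined value noted for cubins compiled prior to CUDA 3.0.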
- enum CUgraphNodeType
-
Graph node types
Values
- CU_GRAPH_NODE_TYPE_KERNEL = 0
- GPU kernel node
- CU_GRAPH_NODE_TYPE_MEMCPY = 1
- Memcpy node
- CU_GRAPH_NODE_TYPE_MEMSET = 2
- Memset node
- CU_GRAPH_NODE_TYPE_HOST = 3
- Host (executable) node
- CU_GRAPH_NODE_TYPE_GRAPH = 4
- Node which executes an embedded graph
- CU_GRAPH_NODE_TYPE_EMPTY = 5
- Empty (no-op) node
- CU_GRAPH_NODE_TYPE_COUNT
- enum CUgraphicsMapResourceFlags
-
Flags for mapping and unmapping interop resources
Values
- CU_GRAPHICS_MAP_RESOURCE_FLAGS_NONE = 0x00
- CU_GRAPHICS_MAP_RESOURCE_FLAGS_READ_ONLY = 0x01
- CU_GRAPHICS_MAP_RESOURCE_FLAGS_WRITE_DISCARD = 0x02
- enum CUgraphicsRegisterFlags
-
Flags to register a graphics resource
Values
- CU_GRAPHICS_REGISTER_FLAGS_NONE = 0x00
- CU_GRAPHICS_REGISTER_FLAGS_READ_ONLY = 0x01
- CU_GRAPHICS_REGISTER_FLAGS_WRITE_DISCARD = 0x02
- CU_GRAPHICS_REGISTER_FLAGS_SURFACE_LDST = 0x04
- CU_GRAPHICS_REGISTER_FLAGS_TEXTURE_GATHER = 0x08
- enum CUipcMem_flags
-
CUDA Ipc Mem Flags
Values
- CU_IPC_MEM_LAZY_ENABLE_PEER_ACCESS = 0x1
- Automatically enable peer access between remote devices as needed
- enum CUjitInputType
-
Device code formats
Values
- CU_JIT_INPUT_CUBIN = 0
- Compiled device-class-specific device code. Applicable options: none.
- CU_JIT_INPUT_PTX
- PTX source code. Applicable options: PTX compiler options.
- CU_JIT_INPUT_FATBINARY
- Bundle of multiple cubins and/or PTX of some device code. Applicable options: PTX compiler options, CU_JIT_FALLBACK_STRATEGY.
- CU_JIT_INPUT_OBJECT
- Host object with embedded device code. Applicable options: PTX compiler options, CU_JIT_FALLBACK_STRATEGY.
- CU_JIT_INPUT_LIBRARY
- Archive of host objects with embedded device code. Applicable options: PTX compiler options, CU_JIT_FALLBACK_STRATEGY.
- CU_JIT_NUM_INPUT_TYPES
- enum CUjit_cacheMode
-
Caching modes for dlcm
Values
- CU_JIT_CACHE_OPTION_NONE = 0
- Compile with no -dlcm flag specified
- CU_JIT_CACHE_OPTION_CG
- Compile with L1 cache disabled
- CU_JIT_CACHE_OPTION_CA
- Compile with L1 cache enabled
- enum CUjit_fallback
-
Cubin matching fallback strategies
Values
- CU_PREFER_PTX = 0
- Prefer to compile ptx if exact binary match not found
- CU_PREFER_BINARY
- Prefer to fall back to compatible binary code if exact match not found
- enum CUjit_option
-
Online compiler and linker options
Values
- CU_JIT_MAX_REGISTERS = 0
- Max number of registers that a thread may use. Option type: unsigned int. Applies to: compiler only.
- CU_JIT_THREADS_PER_BLOCK
- IN: Specifies the minimum number of threads per block to target compilation for. OUT: Returns the number of threads the compiler actually targeted. This restricts the resource utilization of the compiler (e.g. max registers) such that a block with the given number of threads should be able to launch based on register limitations. Note that this option does not currently take into account any other resource limitations, such as shared memory utilization. Cannot be combined with CU_JIT_TARGET. Option type: unsigned int. Applies to: compiler only.
- CU_JIT_WALL_TIME
- Overwrites the option value with the total wall clock time, in milliseconds, spent in the compiler and linker. Option type: float. Applies to: compiler and linker.
- CU_JIT_INFO_LOG_BUFFER
- Pointer to a buffer in which to print any log messages that are informational in nature (the buffer size is specified via option CU_JIT_INFO_LOG_BUFFER_SIZE_BYTES). Option type: char *. Applies to: compiler and linker.
- CU_JIT_INFO_LOG_BUFFER_SIZE_BYTES
- IN: Log buffer size in bytes. Log messages will be capped at this size (including null terminator). OUT: Amount of log buffer filled with messages. Option type: unsigned int. Applies to: compiler and linker.
- CU_JIT_ERROR_LOG_BUFFER
- Pointer to a buffer in which to print any log messages that reflect errors (the buffer size is specified via option CU_JIT_ERROR_LOG_BUFFER_SIZE_BYTES). Option type: char *. Applies to: compiler and linker.
- CU_JIT_ERROR_LOG_BUFFER_SIZE_BYTES
- IN: Log buffer size in bytes. Log messages will be capped at this size (including null terminator). OUT: Amount of log buffer filled with messages. Option type: unsigned int. Applies to: compiler and linker.
- CU_JIT_OPTIMIZATION_LEVEL
- Level of optimizations to apply to generated code (0 - 4), with 4 being the default and highest level of optimizations. Option type: unsigned int. Applies to: compiler only.
- CU_JIT_TARGET_FROM_CUCONTEXT
- No option value required. Determines the target based on the current attached context (default). Option type: No option value needed. Applies to: compiler and linker.
- CU_JIT_TARGET
- Target is chosen based on supplied CUjit_target. Cannot be combined with CU_JIT_THREADS_PER_BLOCK. Option type: unsigned int for enumerated type CUjit_target. Applies to: compiler and linker.
- CU_JIT_FALLBACK_STRATEGY
- Specifies choice of fallback strategy if matching cubin is not found. Choice is based on supplied CUjit_fallback. This option cannot be used with cuLink* APIs as the linker requires exact matches. Option type: unsigned int for enumerated type CUjit_fallback. Applies to: compiler only.
- CU_JIT_GENERATE_DEBUG_INFO
- Specifies whether to create debug information in output (-g) (0: false, default). Option type: int. Applies to: compiler and linker.
- CU_JIT_LOG_VERBOSE
- Generate verbose log messages (0: false, default). Option type: int. Applies to: compiler and linker.
- CU_JIT_GENERATE_LINE_INFO
- Generate line number information (-lineinfo) (0: false, default). Option type: int. Applies to: compiler only.
- CU_JIT_CACHE_MODE
- Specifies whether to enable caching explicitly (-dlcm). Choice is based on supplied CUjit_cacheMode_enum. Option type: unsigned int for enumerated type CUjit_cacheMode_enum. Applies to: compiler only.
- CU_JIT_NEW_SM3X_OPT
- The JIT options below are used for internal purposes only in this version of CUDA.
- CU_JIT_FAST_COMPILE
- CU_JIT_GLOBAL_SYMBOL_NAMES
- Array of device symbol names that will be relocated to the corresponding host addresses stored in CU_JIT_GLOBAL_SYMBOL_ADDRESSES. Must contain CU_JIT_GLOBAL_SYMBOL_COUNT entries. When loading a device module, the driver will relocate all encountered unresolved symbols to the host addresses. It is only allowed to register symbols that correspond to unresolved global variables. It is illegal to register the same device symbol at multiple addresses. Option type: const char **. Applies to: dynamic linker only.
- CU_JIT_GLOBAL_SYMBOL_ADDRESSES
- Array of host addresses that will be used to relocate corresponding device symbols stored in CU_JIT_GLOBAL_SYMBOL_NAMES. Must contain CU_JIT_GLOBAL_SYMBOL_COUNT entries. Option type: void **. Applies to: dynamic linker only.
- CU_JIT_GLOBAL_SYMBOL_COUNT
- Number of entries in CU_JIT_GLOBAL_SYMBOL_NAMES and CU_JIT_GLOBAL_SYMBOL_ADDRESSES arrays. Option type: unsigned int. Applies to: dynamic linker only.
- CU_JIT_NUM_OPTIONS
- enum CUjit_target
-
Online compilation targets
Values
- CU_TARGET_COMPUTE_20 = 20
- Compute device class 2.0
- CU_TARGET_COMPUTE_21 = 21
- Compute device class 2.1
- CU_TARGET_COMPUTE_30 = 30
- Compute device class 3.0
- CU_TARGET_COMPUTE_32 = 32
- Compute device class 3.2
- CU_TARGET_COMPUTE_35 = 35
- Compute device class 3.5
- CU_TARGET_COMPUTE_37 = 37
- Compute device class 3.7
- CU_TARGET_COMPUTE_50 = 50
- Compute device class 5.0
- CU_TARGET_COMPUTE_52 = 52
- Compute device class 5.2
- CU_TARGET_COMPUTE_53 = 53
- Compute device class 5.3
- CU_TARGET_COMPUTE_60 = 60
- Compute device class 6.0.
- CU_TARGET_COMPUTE_61 = 61
- Compute device class 6.1.
- CU_TARGET_COMPUTE_62 = 62
- Compute device class 6.2.
- CU_TARGET_COMPUTE_70 = 70
- Compute device class 7.0.
- CU_TARGET_COMPUTE_72 = 72
- Compute device class 7.2.
- CU_TARGET_COMPUTE_75 = 75
- Compute device class 7.5.
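Every CUjit_target value listed above equals the compute device class written as major * 10 + minor. The following is a minimal sketch that maps a (major, minor) pair to a target value and rejects pairs that are not in the list. The `known_targets` table is copied from the values above, and `jit_target_for` is a hypothetical helper, not a driver function.

```c
#include <assert.h>
#include <stddef.h>

/* Target values copied from the CUjit_target list above. */
static const int known_targets[] = {
    20, 21, 30, 32, 35, 37, 50, 52, 53, 60, 61, 62, 70, 72, 75
};

/* Hypothetical helper: returns major * 10 + minor if that value appears in
 * the list above, or -1 if the device class is not listed. */
static int jit_target_for(int major, int minor)
{
    int value;
    size_t i;
    assert(major >= 0 && minor >= 0 && minor <= 9);
    value = major * 10 + minor;
    for (i = 0; i < sizeof known_targets / sizeof known_targets[0]; ++i)
        if (known_targets[i] == value)
            return value;
    return -1;
}
```

With this table, `jit_target_for(7, 5)` returns 75 (CU_TARGET_COMPUTE_75), while `jit_target_for(8, 0)` returns -1 because no such value appears in this version of the enum.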
- enum CUlimit
-
Limits
Values
- CU_LIMIT_STACK_SIZE = 0x00
- GPU thread stack size
- CU_LIMIT_PRINTF_FIFO_SIZE = 0x01
- GPU printf FIFO size
- CU_LIMIT_MALLOC_HEAP_SIZE = 0x02
- GPU malloc heap size
- CU_LIMIT_DEV_RUNTIME_SYNC_DEPTH = 0x03
- GPU device runtime launch synchronize depth
- CU_LIMIT_DEV_RUNTIME_PENDING_LAUNCH_COUNT = 0x04
- GPU device runtime pending launch count
- CU_LIMIT_MAX_L2_FETCH_GRANULARITY = 0x05
- A value between 0 and 128 that indicates the maximum fetch granularity of L2 (in bytes). This is a hint.
- CU_LIMIT_MAX
- enum CUmemAccess_flags
-
Specifies the memory protection flags for mapping.
Values
- CU_MEM_ACCESS_FLAGS_PROT_NONE = 0x0
- Default, make the address range not accessible
- CU_MEM_ACCESS_FLAGS_PROT_READ = 0x1
- Make the address range read accessible
- CU_MEM_ACCESS_FLAGS_PROT_READWRITE = 0x3
- Make the address range read-write accessible
- CU_MEM_ACCESS_FLAGS_PROT_MAX = 0xFFFFFFFF
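In the values above, CU_MEM_ACCESS_FLAGS_PROT_READWRITE (0x3) contains the CU_MEM_ACCESS_FLAGS_PROT_READ bit (0x1). The following is a minimal sketch of how a caller might test a protection value under the assumption that the values can be treated as bit masks. The `ACCESS_*` constants are local mirrors of the listed values, and both helpers are hypothetical names, not driver functions.

```c
#include <assert.h>
#include <stdbool.h>

/* Local mirrors of the CUmemAccess_flags values listed above. */
enum {
    ACCESS_NONE      = 0x0,
    ACCESS_READ      = 0x1,
    ACCESS_READWRITE = 0x3
};

/* Assumption: protection values are compared as bit masks, so a
 * read-write mapping also counts as readable. */
static bool access_allows_read(unsigned flags)
{
    return (flags & ACCESS_READ) == ACCESS_READ;
}

static bool access_allows_write(unsigned flags)
{
    return (flags & ACCESS_READWRITE) == ACCESS_READWRITE;
}
```

Under this reading, a range mapped with the read-write value passes both checks, a read-only range passes only the first, and the default value passes neither.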
- enum CUmemAllocationGranularity_flags
-
Flag for requesting different optimal and required granularities for an allocation.
Values
- CU_MEM_ALLOC_GRANULARITY_MINIMUM = 0x0
- Minimum required granularity for allocation
- CU_MEM_ALLOC_GRANULARITY_RECOMMENDED = 0x1
- Recommended granularity for allocation for best performance
- enum CUmemAllocationHandleType
-
Flags for specifying particular handle types
Values
- CU_MEM_HANDLE_TYPE_POSIX_FILE_DESCRIPTOR = 0x1
- Allows a file descriptor to be used for exporting. Permitted only on POSIX systems. (int)
- CU_MEM_HANDLE_TYPE_WIN32 = 0x2
- Allows a Win32 NT handle to be used for exporting. (HANDLE)
- CU_MEM_HANDLE_TYPE_WIN32_KMT = 0x4
- Allows a Win32 KMT handle to be used for exporting. (D3DKMT_HANDLE)
- CU_MEM_HANDLE_TYPE_MAX = 0xFFFFFFFF
- enum CUmemAllocationType
-
Defines the allocation types available
Values
- CU_MEM_ALLOCATION_TYPE_INVALID = 0x0
- CU_MEM_ALLOCATION_TYPE_PINNED = 0x1
- This allocation type is 'pinned', i.e. cannot migrate from its current location while the application is actively using it
- CU_MEM_ALLOCATION_TYPE_MAX = 0xFFFFFFFF
- enum CUmemAttach_flags
-
CUDA Mem Attach Flags
Values
- CU_MEM_ATTACH_GLOBAL = 0x1
- Memory can be accessed by any stream on any device
- CU_MEM_ATTACH_HOST = 0x2
- Memory cannot be accessed by any stream on any device
- CU_MEM_ATTACH_SINGLE = 0x4
- Memory can only be accessed by a single stream on the associated device
- enum CUmemLocationType
-
Specifies the type of location
Values
- CU_MEM_LOCATION_TYPE_INVALID = 0x0
- CU_MEM_LOCATION_TYPE_DEVICE = 0x1
- Location is a device location, thus id is a device ordinal
- CU_MEM_LOCATION_TYPE_MAX = 0xFFFFFFFF
- enum CUmem_advise
-
Memory advise values
Values
- CU_MEM_ADVISE_SET_READ_MOSTLY = 1
- Data will mostly be read and only occasionally be written to
- CU_MEM_ADVISE_UNSET_READ_MOSTLY = 2
- Undo the effect of CU_MEM_ADVISE_SET_READ_MOSTLY
- CU_MEM_ADVISE_SET_PREFERRED_LOCATION = 3
- Set the preferred location for the data as the specified device
- CU_MEM_ADVISE_UNSET_PREFERRED_LOCATION = 4
- Clear the preferred location for the data
- CU_MEM_ADVISE_SET_ACCESSED_BY = 5
- Data will be accessed by the specified device, so prevent page faults as much as possible
- CU_MEM_ADVISE_UNSET_ACCESSED_BY = 6
- Let the Unified Memory subsystem decide on the page faulting policy for the specified device
- enum CUmemorytype
-
Memory types
Values
- CU_MEMORYTYPE_HOST = 0x01
- Host memory
- CU_MEMORYTYPE_DEVICE = 0x02
- Device memory
- CU_MEMORYTYPE_ARRAY = 0x03
- Array memory
- CU_MEMORYTYPE_UNIFIED = 0x04
- Unified device or host memory
- enum CUoccupancy_flags
-
Occupancy calculator flag
Values
- CU_OCCUPANCY_DEFAULT = 0x0
- Default behavior
- CU_OCCUPANCY_DISABLE_CACHING_OVERRIDE = 0x1
- Assume global caching is enabled and cannot be automatically turned off
- enum CUpointer_attribute
-
Pointer information
Values
- CU_POINTER_ATTRIBUTE_CONTEXT = 1
- The CUcontext on which a pointer was allocated or registered
- CU_POINTER_ATTRIBUTE_MEMORY_TYPE = 2
- The CUmemorytype describing the physical location of a pointer
- CU_POINTER_ATTRIBUTE_DEVICE_POINTER = 3
- The address at which a pointer's memory may be accessed on the device
- CU_POINTER_ATTRIBUTE_HOST_POINTER = 4
- The address at which a pointer's memory may be accessed on the host
- CU_POINTER_ATTRIBUTE_P2P_TOKENS = 5
- A pair of tokens for use with the nv-p2p.h Linux kernel interface
- CU_POINTER_ATTRIBUTE_SYNC_MEMOPS = 6
- Synchronize every synchronous memory operation initiated on this region
- CU_POINTER_ATTRIBUTE_BUFFER_ID = 7
- A process-wide unique ID for an allocated memory region
- CU_POINTER_ATTRIBUTE_IS_MANAGED = 8
- Indicates if the pointer points to managed memory
- CU_POINTER_ATTRIBUTE_DEVICE_ORDINAL = 9
- A device ordinal of a device on which a pointer was allocated or registered
- CU_POINTER_ATTRIBUTE_IS_LEGACY_CUDA_IPC_CAPABLE = 10
- 1 if this pointer maps to an allocation that is suitable for cudaIpcGetMemHandle, 0 otherwise
- CU_POINTER_ATTRIBUTE_RANGE_START_ADDR = 11
- Starting address for this requested pointer
- CU_POINTER_ATTRIBUTE_RANGE_SIZE = 12
- Size of the address range for this requested pointer
- CU_POINTER_ATTRIBUTE_MAPPED = 13
- 1 if this pointer is in a valid address range that is mapped to a backing allocation, 0 otherwise
- CU_POINTER_ATTRIBUTE_ALLOWED_HANDLE_TYPES = 14
- Bitmask of allowed CUmemAllocationHandleType for this allocation
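CU_POINTER_ATTRIBUTE_ALLOWED_HANDLE_TYPES is described as a bitmask of CUmemAllocationHandleType values, which are 0x1, 0x2 and 0x4 in this version. The following is a minimal sketch of decoding such a mask. The `HANDLE_*` constants are local mirrors of those values and `count_handle_types` is a hypothetical helper. The `unsigned long long` width of the mask is an assumption, since the text above does not state the attribute's data type.

```c
#include <assert.h>
#include <stddef.h>

/* Local mirrors of the CUmemAllocationHandleType values in this document. */
enum {
    HANDLE_POSIX_FD  = 0x1,
    HANDLE_WIN32     = 0x2,
    HANDLE_WIN32_KMT = 0x4
};

/* Hypothetical helper: returns how many known handle types are set in mask
 * and stores any remaining, unrecognized bits in *unknown.
 * Assumption: the mask fits in an unsigned long long. */
static int count_handle_types(unsigned long long mask,
                              unsigned long long *unknown)
{
    static const unsigned long long known[] = {
        HANDLE_POSIX_FD, HANDLE_WIN32, HANDLE_WIN32_KMT
    };
    int n = 0;
    size_t i;
    assert(unknown != NULL);
    *unknown = mask;
    for (i = 0; i < sizeof known / sizeof known[0]; ++i) {
        if (mask & known[i]) {
            ++n;
            *unknown &= ~known[i];   /* clear each recognized bit */
        }
    }
    return n;
}
```

Reporting unrecognized bits separately, instead of failing, lets a caller keep working if a later driver version adds handle types.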
- enum CUresourceViewFormat
-
Resource view format
Values
- CU_RES_VIEW_FORMAT_NONE = 0x00
- No resource view format (use underlying resource format)
- CU_RES_VIEW_FORMAT_UINT_1X8 = 0x01
- 1 channel unsigned 8-bit integers
- CU_RES_VIEW_FORMAT_UINT_2X8 = 0x02
- 2 channel unsigned 8-bit integers
- CU_RES_VIEW_FORMAT_UINT_4X8 = 0x03
- 4 channel unsigned 8-bit integers
- CU_RES_VIEW_FORMAT_SINT_1X8 = 0x04
- 1 channel signed 8-bit integers
- CU_RES_VIEW_FORMAT_SINT_2X8 = 0x05
- 2 channel signed 8-bit integers
- CU_RES_VIEW_FORMAT_SINT_4X8 = 0x06
- 4 channel signed 8-bit integers
- CU_RES_VIEW_FORMAT_UINT_1X16 = 0x07
- 1 channel unsigned 16-bit integers
- CU_RES_VIEW_FORMAT_UINT_2X16 = 0x08
- 2 channel unsigned 16-bit integers
- CU_RES_VIEW_FORMAT_UINT_4X16 = 0x09
- 4 channel unsigned 16-bit integers
- CU_RES_VIEW_FORMAT_SINT_1X16 = 0x0a
- 1 channel signed 16-bit integers
- CU_RES_VIEW_FORMAT_SINT_2X16 = 0x0b
- 2 channel signed 16-bit integers
- CU_RES_VIEW_FORMAT_SINT_4X16 = 0x0c
- 4 channel signed 16-bit integers
- CU_RES_VIEW_FORMAT_UINT_1X32 = 0x0d
- 1 channel unsigned 32-bit integers
- CU_RES_VIEW_FORMAT_UINT_2X32 = 0x0e
- 2 channel unsigned 32-bit integers
- CU_RES_VIEW_FORMAT_UINT_4X32 = 0x0f
- 4 channel unsigned 32-bit integers
- CU_RES_VIEW_FORMAT_SINT_1X32 = 0x10
- 1 channel signed 32-bit integers
- CU_RES_VIEW_FORMAT_SINT_2X32 = 0x11
- 2 channel signed 32-bit integers
- CU_RES_VIEW_FORMAT_SINT_4X32 = 0x12
- 4 channel signed 32-bit integers
- CU_RES_VIEW_FORMAT_FLOAT_1X16 = 0x13
- 1 channel 16-bit floating point
- CU_RES_VIEW_FORMAT_FLOAT_2X16 = 0x14
- 2 channel 16-bit floating point
- CU_RES_VIEW_FORMAT_FLOAT_4X16 = 0x15
- 4 channel 16-bit floating point
- CU_RES_VIEW_FORMAT_FLOAT_1X32 = 0x16
- 1 channel 32-bit floating point
- CU_RES_VIEW_FORMAT_FLOAT_2X32 = 0x17
- 2 channel 32-bit floating point
- CU_RES_VIEW_FORMAT_FLOAT_4X32 = 0x18
- 4 channel 32-bit floating point
- CU_RES_VIEW_FORMAT_UNSIGNED_BC1 = 0x19
- Block compressed 1
- CU_RES_VIEW_FORMAT_UNSIGNED_BC2 = 0x1a
- Block compressed 2
- CU_RES_VIEW_FORMAT_UNSIGNED_BC3 = 0x1b
- Block compressed 3
- CU_RES_VIEW_FORMAT_UNSIGNED_BC4 = 0x1c
- Block compressed 4 unsigned
- CU_RES_VIEW_FORMAT_SIGNED_BC4 = 0x1d
- Block compressed 4 signed
- CU_RES_VIEW_FORMAT_UNSIGNED_BC5 = 0x1e
- Block compressed 5 unsigned
- CU_RES_VIEW_FORMAT_SIGNED_BC5 = 0x1f
- Block compressed 5 signed
- CU_RES_VIEW_FORMAT_UNSIGNED_BC6H = 0x20
- Block compressed 6 unsigned half-float
- CU_RES_VIEW_FORMAT_SIGNED_BC6H = 0x21
- Block compressed 6 signed half-float
- CU_RES_VIEW_FORMAT_UNSIGNED_BC7 = 0x22
- Block compressed 7
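The uncompressed formats above (0x01 through 0x18) follow a regular layout: values come in groups of three for 1, 2 and 4 channels, and the groups run through unsigned 8-bit, signed 8-bit, unsigned 16-bit, signed 16-bit, unsigned 32-bit, signed 32-bit, 16-bit float and 32-bit float. The following is a minimal sketch that derives the channel count and bit width from that layout. The `view_format_info` struct and the `describe_view_format` helper are hypothetical names, and the layout is read off the list above, not taken from a documented guarantee.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct view_format_info {
    int  channels;       /* 1, 2 or 4 */
    int  bits;           /* bits per channel: 8, 16 or 32 */
    bool is_float;       /* true for the FLOAT_* formats */
    bool is_signed_int;  /* true for the SINT_* formats */
};

/* Hypothetical helper: fills *out for the uncompressed formats 0x01..0x18.
 * Returns false for CU_RES_VIEW_FORMAT_NONE (0x00), the block-compressed
 * formats (0x19 and above) and any other value. */
static bool describe_view_format(unsigned fmt, struct view_format_info *out)
{
    static const int channel_counts[3] = { 1, 2, 4 };
    static const int group_bits[8] = { 8, 8, 16, 16, 32, 32, 16, 32 };
    unsigned idx, group;
    assert(out != NULL);
    if (fmt < 0x01 || fmt > 0x18)
        return false;
    idx = fmt - 1;       /* 0-based position in the list */
    group = idx / 3;     /* 0..5 integer groups, 6..7 float groups */
    out->channels = channel_counts[idx % 3];
    out->bits = group_bits[group];
    out->is_float = group >= 6;
    out->is_signed_int = group < 6 && (group % 2) == 1;
    return true;
}
```

For example, 0x0a (CU_RES_VIEW_FORMAT_SINT_1X16) decodes to one signed 16-bit channel, and 0x18 (CU_RES_VIEW_FORMAT_FLOAT_4X32) decodes to four 32-bit floating-point channels.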
- enum CUresourcetype
-
Resource types
Values
- CU_RESOURCE_TYPE_ARRAY = 0x00
- Array resource
- CU_RESOURCE_TYPE_MIPMAPPED_ARRAY = 0x01
- Mipmapped array resource
- CU_RESOURCE_TYPE_LINEAR = 0x02
- Linear resource
- CU_RESOURCE_TYPE_PITCH2D = 0x03
- Pitch 2D resource
- enum CUresult
-
Error codes
Values
- CUDA_SUCCESS = 0
- The API call returned with no errors. In the case of query calls, this also means that the operation being queried is complete (see cuEventQuery() and cuStreamQuery()).
- CUDA_ERROR_INVALID_VALUE = 1
- This indicates that one or more of the parameters passed to the API call is not within an acceptable range of values.
- CUDA_ERROR_OUT_OF_MEMORY = 2
- The API call failed because it was unable to allocate enough memory to perform the requested operation.
- CUDA_ERROR_NOT_INITIALIZED = 3
- This indicates that the CUDA driver has not been initialized with cuInit() or that initialization has failed.
- CUDA_ERROR_DEINITIALIZED = 4
- This indicates that the CUDA driver is in the process of shutting down.
- CUDA_ERROR_PROFILER_DISABLED = 5
- This indicates that the profiler is not initialized for this run. This can happen when the application is running with external profiling tools such as the Visual Profiler.
- CUDA_ERROR_PROFILER_NOT_INITIALIZED = 6
-
Deprecated
This error return is deprecated as of CUDA 5.0. It is no longer an error to attempt to enable/disable the profiling via cuProfilerStart or cuProfilerStop without initialization.
- CUDA_ERROR_PROFILER_ALREADY_STARTED = 7
-
Deprecated
This error return is deprecated as of CUDA 5.0. It is no longer an error to call cuProfilerStart() when profiling is already enabled.
- CUDA_ERROR_PROFILER_ALREADY_STOPPED = 8
-
Deprecated
This error return is deprecated as of CUDA 5.0. It is no longer an error to call cuProfilerStop() when profiling is already disabled.
- CUDA_ERROR_NO_DEVICE = 100
- This indicates that no CUDA-capable devices were detected by the installed CUDA driver.
- CUDA_ERROR_INVALID_DEVICE = 101
- This indicates that the device ordinal supplied by the user does not correspond to a valid CUDA device.
- CUDA_ERROR_INVALID_IMAGE = 200
- This indicates that the device kernel image is invalid. This can also indicate an invalid CUDA module.
- CUDA_ERROR_INVALID_CONTEXT = 201
- This most frequently indicates that there is no context bound to the current thread. This can also be returned if the context passed to an API call is not a valid handle (such as a context that has had cuCtxDestroy() invoked on it). This can also be returned if a user mixes different API versions (e.g. a 3010 context with 3020 API calls). See cuCtxGetApiVersion() for more details.
- CUDA_ERROR_CONTEXT_ALREADY_CURRENT = 202
-
Deprecated
This error return is deprecated as of CUDA 3.2. It is no longer an error to attempt to push the active context via cuCtxPushCurrent().
This indicated that the context being supplied as a parameter to the API call was already the active context.
- CUDA_ERROR_MAP_FAILED = 205
- This indicates that a map or register operation has failed.
- CUDA_ERROR_UNMAP_FAILED = 206
- This indicates that an unmap or unregister operation has failed.
- CUDA_ERROR_ARRAY_IS_MAPPED = 207
- This indicates that the specified array is currently mapped and thus cannot be destroyed.
- CUDA_ERROR_ALREADY_MAPPED = 208
- This indicates that the resource is already mapped.
- CUDA_ERROR_NO_BINARY_FOR_GPU = 209
- This indicates that there is no kernel image available that is suitable for the device. This can occur when a user specifies code generation options for a particular CUDA source file that do not include the corresponding device configuration.
- CUDA_ERROR_ALREADY_ACQUIRED = 210
- This indicates that a resource has already been acquired.
- CUDA_ERROR_NOT_MAPPED = 211
- This indicates that a resource is not mapped.
- CUDA_ERROR_NOT_MAPPED_AS_ARRAY = 212
- This indicates that a mapped resource is not available for access as an array.
- CUDA_ERROR_NOT_MAPPED_AS_POINTER = 213
- This indicates that a mapped resource is not available for access as a pointer.
- CUDA_ERROR_ECC_UNCORRECTABLE = 214
- This indicates that an uncorrectable ECC error was detected during execution.
- CUDA_ERROR_UNSUPPORTED_LIMIT = 215
- This indicates that the CUlimit passed to the API call is not supported by the active device.
- CUDA_ERROR_CONTEXT_ALREADY_IN_USE = 216
- This indicates that the CUcontext passed to the API call can only be bound to a single CPU thread at a time but is already bound to a CPU thread.
- CUDA_ERROR_PEER_ACCESS_UNSUPPORTED = 217
- This indicates that peer access is not supported across the given devices.
- CUDA_ERROR_INVALID_PTX = 218
- This indicates that a PTX JIT compilation failed.
- CUDA_ERROR_INVALID_GRAPHICS_CONTEXT = 219
- This indicates an error with the OpenGL or DirectX context.
- CUDA_ERROR_NVLINK_UNCORRECTABLE = 220
- This indicates that an uncorrectable NVLink error was detected during execution.
- CUDA_ERROR_JIT_COMPILER_NOT_FOUND = 221
- This indicates that the PTX JIT compiler library was not found.
- CUDA_ERROR_INVALID_SOURCE = 300
- This indicates that the device kernel source is invalid.
- CUDA_ERROR_FILE_NOT_FOUND = 301
- This indicates that the file specified was not found.
- CUDA_ERROR_SHARED_OBJECT_SYMBOL_NOT_FOUND = 302
- This indicates that a link to a shared object failed to resolve.
- CUDA_ERROR_SHARED_OBJECT_INIT_FAILED = 303
- This indicates that initialization of a shared object failed.
- CUDA_ERROR_OPERATING_SYSTEM = 304
- This indicates that an OS call failed.
- CUDA_ERROR_INVALID_HANDLE = 400
- This indicates that a resource handle passed to the API call was not valid. Resource handles are opaque types like CUstream and CUevent.
- CUDA_ERROR_ILLEGAL_STATE = 401
- This indicates that a resource required by the API call is not in a valid state to perform the requested operation.
- CUDA_ERROR_NOT_FOUND = 500
- This indicates that a named symbol was not found. Examples of symbols are global/constant variable names, texture names, and surface names.
- CUDA_ERROR_NOT_READY = 600
- This indicates that asynchronous operations issued previously have not completed yet. This result is not actually an error, but must be indicated differently than CUDA_SUCCESS (which indicates completion). Calls that may return this value include cuEventQuery() and cuStreamQuery().
- CUDA_ERROR_ILLEGAL_ADDRESS = 700
- While executing a kernel, the device encountered a load or store instruction on an invalid memory address. This leaves the process in an inconsistent state and any further CUDA work will return the same error. To continue using CUDA, the process must be terminated and relaunched.
- CUDA_ERROR_LAUNCH_OUT_OF_RESOURCES = 701
- This indicates that a launch did not occur because it did not have appropriate resources. This error usually indicates that the user has attempted to pass too many arguments to the device kernel, or the kernel launch specifies too many threads for the kernel's register count. Passing arguments of the wrong size (e.g. a 64-bit pointer when a 32-bit int is expected) is equivalent to passing too many arguments and can also result in this error.
- CUDA_ERROR_LAUNCH_TIMEOUT = 702
- This indicates that the device kernel took too long to execute. This can only occur if timeouts are enabled - see the device attribute CU_DEVICE_ATTRIBUTE_KERNEL_EXEC_TIMEOUT for more information. This leaves the process in an inconsistent state and any further CUDA work will return the same error. To continue using CUDA, the process must be terminated and relaunched.
- CUDA_ERROR_LAUNCH_INCOMPATIBLE_TEXTURING = 703
- This error indicates a kernel launch that uses an incompatible texturing mode.
- CUDA_ERROR_PEER_ACCESS_ALREADY_ENABLED = 704
- This error indicates that a call to cuCtxEnablePeerAccess() is trying to re-enable peer access to a context which has already had peer access to it enabled.
- CUDA_ERROR_PEER_ACCESS_NOT_ENABLED = 705
- This error indicates that cuCtxDisablePeerAccess() is trying to disable peer access which has not been enabled yet via cuCtxEnablePeerAccess().
- CUDA_ERROR_PRIMARY_CONTEXT_ACTIVE = 708
- This error indicates that the primary context for the specified device has already been initialized.
- CUDA_ERROR_CONTEXT_IS_DESTROYED = 709
- This error indicates that the context current to the calling thread has been destroyed using cuCtxDestroy(), or is a primary context which has not yet been initialized.
- CUDA_ERROR_ASSERT = 710
- A device-side assert triggered during kernel execution. The context cannot be used anymore, and must be destroyed. All existing device memory allocations from this context are invalid and must be reconstructed if the program is to continue using CUDA.
- CUDA_ERROR_TOO_MANY_PEERS = 711
- This error indicates that the hardware resources required to enable peer access have been exhausted for one or more of the devices passed to cuCtxEnablePeerAccess().
- CUDA_ERROR_HOST_MEMORY_ALREADY_REGISTERED = 712
- This error indicates that the memory range passed to cuMemHostRegister() has already been registered.
- CUDA_ERROR_HOST_MEMORY_NOT_REGISTERED = 713
- This error indicates that the pointer passed to cuMemHostUnregister() does not correspond to any currently registered memory region.
- CUDA_ERROR_HARDWARE_STACK_ERROR = 714
- While executing a kernel, the device encountered a stack error. This can be due to stack corruption or exceeding the stack size limit. This leaves the process in an inconsistent state and any further CUDA work will return the same error. To continue using CUDA, the process must be terminated and relaunched.
- CUDA_ERROR_ILLEGAL_INSTRUCTION = 715
- While executing a kernel, the device encountered an illegal instruction. This leaves the process in an inconsistent state and any further CUDA work will return the same error. To continue using CUDA, the process must be terminated and relaunched.
- CUDA_ERROR_MISALIGNED_ADDRESS = 716
- While executing a kernel, the device encountered a load or store instruction on a memory address which is not aligned. This leaves the process in an inconsistent state and any further CUDA work will return the same error. To continue using CUDA, the process must be terminated and relaunched.
- CUDA_ERROR_INVALID_ADDRESS_SPACE = 717
- While executing a kernel, the device encountered an instruction which can only operate on memory locations in certain address spaces (global, shared, or local), but was supplied a memory address not belonging to an allowed address space. This leaves the process in an inconsistent state and any further CUDA work will return the same error. To continue using CUDA, the process must be terminated and relaunched.
- CUDA_ERROR_INVALID_PC = 718
- While executing a kernel, the device program counter wrapped its address space. This leaves the process in an inconsistent state and any further CUDA work will return the same error. To continue using CUDA, the process must be terminated and relaunched.
- CUDA_ERROR_LAUNCH_FAILED = 719
- An exception occurred on the device while executing a kernel. Common causes include dereferencing an invalid device pointer and accessing out-of-bounds shared memory. Less common cases can be system-specific; more information about these cases can be found in the system-specific user guide. This leaves the process in an inconsistent state and any further CUDA work will return the same error. To continue using CUDA, the process must be terminated and relaunched.
- CUDA_ERROR_COOPERATIVE_LAUNCH_TOO_LARGE = 720
- This error indicates that the number of blocks per grid for a kernel launched via cuLaunchCooperativeKernel or cuLaunchCooperativeKernelMultiDevice exceeds the maximum allowed. That maximum is the per-multiprocessor block count reported by cuOccupancyMaxActiveBlocksPerMultiprocessor or cuOccupancyMaxActiveBlocksPerMultiprocessorWithFlags, multiplied by the number of multiprocessors given by the device attribute CU_DEVICE_ATTRIBUTE_MULTIPROCESSOR_COUNT.
- CUDA_ERROR_NOT_PERMITTED = 800
- This error indicates that the attempted operation is not permitted.
- CUDA_ERROR_NOT_SUPPORTED = 801
- This error indicates that the attempted operation is not supported on the current system or device.
- CUDA_ERROR_SYSTEM_NOT_READY = 802
- This error indicates that the system is not yet ready to start any CUDA work. To continue using CUDA, verify the system configuration is in a valid state and all required driver daemons are actively running. More information about this error can be found in the system specific user guide.
- CUDA_ERROR_SYSTEM_DRIVER_MISMATCH = 803
- This error indicates that there is a mismatch between the versions of the display driver and the CUDA driver. Refer to the compatibility documentation for supported versions.
- CUDA_ERROR_COMPAT_NOT_SUPPORTED_ON_DEVICE = 804
- This error indicates that the system was upgraded to run with forward compatibility but the visible hardware detected by CUDA does not support this configuration. Refer to the compatibility documentation for the supported hardware matrix or ensure that only supported hardware is visible during initialization via the CUDA_VISIBLE_DEVICES environment variable.
- CUDA_ERROR_STREAM_CAPTURE_UNSUPPORTED = 900
- This error indicates that the operation is not permitted when the stream is capturing.
- CUDA_ERROR_STREAM_CAPTURE_INVALIDATED = 901
- This error indicates that the current capture sequence on the stream has been invalidated due to a previous error.
- CUDA_ERROR_STREAM_CAPTURE_MERGE = 902
- This error indicates that the operation would have resulted in a merge of two independent capture sequences.
- CUDA_ERROR_STREAM_CAPTURE_UNMATCHED = 903
- This error indicates that the capture was not initiated in this stream.
- CUDA_ERROR_STREAM_CAPTURE_UNJOINED = 904
- This error indicates that the capture sequence contains a fork that was not joined to the primary stream.
- CUDA_ERROR_STREAM_CAPTURE_ISOLATION = 905
- This error indicates that a dependency would have been created which crosses the capture sequence boundary. Only implicit in-stream ordering dependencies are allowed to cross the boundary.
- CUDA_ERROR_STREAM_CAPTURE_IMPLICIT = 906
- This error indicates a disallowed implicit dependency on a current capture sequence from CU_STREAM_LEGACY (cudaStreamLegacy in the runtime API).
- CUDA_ERROR_CAPTURED_EVENT = 907
- This error indicates that the operation is not permitted on an event which was last recorded in a capturing stream.
- CUDA_ERROR_STREAM_CAPTURE_WRONG_THREAD = 908
- A stream capture sequence not initiated with the CU_STREAM_CAPTURE_MODE_RELAXED argument to cuStreamBeginCapture was passed to cuStreamEndCapture in a different thread.
- CUDA_ERROR_TIMEOUT = 909
- This error indicates that the timeout specified for the wait operation has lapsed.
- CUDA_ERROR_GRAPH_EXEC_UPDATE_FAILURE = 910
- This error indicates that the graph update was not performed because it included changes which violated constraints specific to instantiated graph update.
- CUDA_ERROR_UNKNOWN = 999
- This indicates that an unknown internal error has occurred.
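The recovery guidance spread through the descriptions above can be collected into a small lookup. The following is a minimal sketch, not part of the driver API: the recovery_action enum and classify_result() are hypothetical names, and the numeric codes are copied from the list above so that the example compiles without cuda.h. Real code would compare against the CUresult enumerators instead of literals.

```c
/* Hypothetical recovery categories; not part of the CUDA driver API. */
typedef enum {
    ACTION_NONE,            /* call succeeded */
    ACTION_RETRY_LATER,     /* work still pending, not a failure */
    ACTION_DESTROY_CONTEXT, /* context unusable, its allocations are lost */
    ACTION_RESTART_PROCESS, /* process must be terminated and relaunched */
    ACTION_REPORT           /* ordinary error: report it to the caller */
} recovery_action;

/* Map a numeric CUresult value to the recovery step its description names.
   The literals mirror the values listed in this section. */
static recovery_action classify_result(int code)
{
    switch (code) {
    case 0:   /* CUDA_SUCCESS */
        return ACTION_NONE;
    case 600: /* CUDA_ERROR_NOT_READY */
        return ACTION_RETRY_LATER;
    case 710: /* CUDA_ERROR_ASSERT */
        return ACTION_DESTROY_CONTEXT;
    case 700: /* CUDA_ERROR_ILLEGAL_ADDRESS */
    case 702: /* CUDA_ERROR_LAUNCH_TIMEOUT */
    case 714: /* CUDA_ERROR_HARDWARE_STACK_ERROR */
    case 715: /* CUDA_ERROR_ILLEGAL_INSTRUCTION */
    case 716: /* CUDA_ERROR_MISALIGNED_ADDRESS */
    case 717: /* CUDA_ERROR_INVALID_ADDRESS_SPACE */
    case 718: /* CUDA_ERROR_INVALID_PC */
    case 719: /* CUDA_ERROR_LAUNCH_FAILED */
        return ACTION_RESTART_PROCESS;
    default:
        return ACTION_REPORT;
    }
}
```

A polling loop around cuEventQuery() or cuStreamQuery() would treat ACTION_RETRY_LATER as "keep waiting", while ACTION_RESTART_PROCESS signals that no further CUDA call in the process can succeed.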
- enum CUshared_carveout
-
Shared memory carveout configurations. These may be passed to cuFuncSetAttribute as the value of CU_FUNC_ATTRIBUTE_PREFERRED_SHARED_MEMORY_CARVEOUT.
Values
- CU_SHAREDMEM_CARVEOUT_DEFAULT = -1
- No preference for shared memory or L1 (default)
- CU_SHAREDMEM_CARVEOUT_MAX_SHARED = 100
- Prefer maximum available shared memory, minimum L1 cache
- CU_SHAREDMEM_CARVEOUT_MAX_L1 = 0
- Prefer maximum available L1 cache, minimum shared memory
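The three named values are the endpoints and the default of a wider range. Per the documentation of CU_FUNC_ATTRIBUTE_PREFERRED_SHARED_MEMORY_CARVEOUT (not the list above), the value is a percentage of the total shared memory and is only a hint to the driver. The sketch below shows one way a caller might derive such a percentage. carveout_percent() and CARVEOUT_DEFAULT are hypothetical names introduced for illustration, and rounding up is a design choice of this sketch rather than documented driver behavior.

```c
#include <stddef.h>

/* Local mirror of CU_SHAREDMEM_CARVEOUT_DEFAULT from the list above. */
#define CARVEOUT_DEFAULT (-1)

/* Hypothetical helper: smallest whole percentage of max_shared_bytes that
   covers requested_bytes, clamped to 100. Returns CARVEOUT_DEFAULT when the
   device total is unknown (passed as 0). */
static int carveout_percent(size_t requested_bytes, size_t max_shared_bytes)
{
    if (max_shared_bytes == 0)
        return CARVEOUT_DEFAULT;
    if (requested_bytes >= max_shared_bytes)
        return 100; /* same value as CU_SHAREDMEM_CARVEOUT_MAX_SHARED */

    /* Round up so the hint is never smaller than the request. */
    unsigned long long scaled = (unsigned long long)requested_bytes * 100ULL;
    return (int)((scaled + max_shared_bytes - 1) / max_shared_bytes);
}
```

The result would be passed as the value argument of cuFuncSetAttribute; a request of 0 bytes yields 0, which coincides with CU_SHAREDMEM_CARVEOUT_MAX_L1.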
- enum CUsharedconfig
-
Shared memory configurations
Values
- CU_SHARED_MEM_CONFIG_DEFAULT_BANK_SIZE = 0x00
- Set default shared memory bank size
- CU_SHARED_MEM_CONFIG_FOUR_BYTE_BANK_SIZE = 0x01
- Set shared memory bank width to four bytes
- CU_SHARED_MEM_CONFIG_EIGHT_BYTE_BANK_SIZE = 0x02
- Set shared memory bank width to eight bytes
- enum CUstreamBatchMemOpType
-
Operations for cuStreamBatchMemOp
Values
- CU_STREAM_MEM_OP_WAIT_VALUE_32 = 1
- Represents a cuStreamWaitValue32 operation
- CU_STREAM_MEM_OP_WRITE_VALUE_32 = 2
- Represents a cuStreamWriteValue32 operation
- CU_STREAM_MEM_OP_WAIT_VALUE_64 = 4
- Represents a cuStreamWaitValue64 operation
- CU_STREAM_MEM_OP_WRITE_VALUE_64 = 5
- Represents a cuStreamWriteValue64 operation
- CU_STREAM_MEM_OP_FLUSH_REMOTE_WRITES = 3
- This has the same effect as CU_STREAM_WAIT_VALUE_FLUSH, but as a standalone operation.
- enum CUstreamCaptureMode
-
Possible modes for stream capture thread interactions. For more details see cuStreamBeginCapture and cuThreadExchangeStreamCaptureMode
Values
- CU_STREAM_CAPTURE_MODE_GLOBAL = 0
- CU_STREAM_CAPTURE_MODE_THREAD_LOCAL = 1
- CU_STREAM_CAPTURE_MODE_RELAXED = 2
- enum CUstreamCaptureStatus
-
Possible stream capture statuses returned by cuStreamIsCapturing
Values
- CU_STREAM_CAPTURE_STATUS_NONE = 0
- Stream is not capturing
- CU_STREAM_CAPTURE_STATUS_ACTIVE = 1
- Stream is actively capturing
- CU_STREAM_CAPTURE_STATUS_INVALIDATED = 2
- Stream is part of a capture sequence that has been invalidated, but not terminated
- enum CUstreamWaitValue_flags
-
Flags for cuStreamWaitValue32 and cuStreamWaitValue64
Values
- CU_STREAM_WAIT_VALUE_GEQ = 0x0
- Wait until (int32_t)(*addr - value) >= 0 (or int64_t for 64-bit values). Note that this is a cyclic comparison which ignores wraparound. (Default behavior.)
- CU_STREAM_WAIT_VALUE_EQ = 0x1
- Wait until *addr == value.
- CU_STREAM_WAIT_VALUE_AND = 0x2
- Wait until (*addr & value) != 0.
- CU_STREAM_WAIT_VALUE_NOR = 0x3
- Wait until ~(*addr | value) != 0. Support for this operation can be queried with cuDeviceGetAttribute() and CU_DEVICE_ATTRIBUTE_CAN_USE_STREAM_WAIT_VALUE_NOR.
- CU_STREAM_WAIT_VALUE_FLUSH = 1<<30
- Follow the wait operation with a flush of outstanding remote writes. This means that, if a remote write operation is guaranteed to have reached the device before the wait can be satisfied, that write is guaranteed to be visible to downstream device work. The device is permitted to reorder remote writes internally. For example, this flag would be required if two remote writes arrive in a defined order, the wait is satisfied by the second write, and downstream work needs to observe the first write. Support for this operation is restricted to selected platforms and can be queried with cuDeviceGetAttribute() and CU_DEVICE_ATTRIBUTE_CAN_FLUSH_REMOTE_WRITES.
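The four wait conditions can be checked on the host with plain integer arithmetic, which makes the cyclic behavior of the default comparison easier to see. This is a minimal sketch for 32-bit values only: wait_satisfied32() and the WAIT_* constants are hypothetical local names that mirror the values listed above, and the code models the condition rather than calling the driver.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local mirrors of the CUstreamWaitValue_flags values listed above. */
enum {
    WAIT_GEQ   = 0x0,
    WAIT_EQ    = 0x1,
    WAIT_AND   = 0x2,
    WAIT_NOR   = 0x3,
    WAIT_FLUSH = 1 << 30
};

/* Hypothetical host-side model: would a wait on `value` be satisfied if the
   watched word currently holds `current`? */
static bool wait_satisfied32(uint32_t current, uint32_t value, unsigned flags)
{
    /* Assumption of this sketch: the comparison kind is the low two bits,
       because WAIT_FLUSH is a separate bit that can be ORed in. */
    switch (flags & 0x3u) {
    case WAIT_GEQ:
        /* Unsigned subtraction wraps modulo 2^32; the signed cast then
           tests which half of the cycle the difference falls in
           (two's complement conversion assumed). */
        return (int32_t)(uint32_t)(current - value) >= 0;
    case WAIT_EQ:
        return current == value;
    case WAIT_AND:
        return (current & value) != 0;
    default: /* WAIT_NOR */
        return (uint32_t)~(current | value) != 0;
    }
}
```

With the default comparison, a counter that has advanced from 0xFFFFFFFE past zero to 2 still satisfies a wait on 0xFFFFFFFE, because the wrapped difference is 4.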
- enum CUstreamWriteValue_flags
-
Flags for cuStreamWriteValue32 and cuStreamWriteValue64
Values
- CU_STREAM_WRITE_VALUE_DEFAULT = 0x0
- Default behavior
- CU_STREAM_WRITE_VALUE_NO_MEMORY_BARRIER = 0x1
- Permits the write to be reordered with writes which were issued before it, as a performance optimization. Normally, cuStreamWriteValue32 will provide a memory fence before the write, which has similar semantics to __threadfence_system() but is scoped to the stream rather than a CUDA thread.
- enum CUstream_flags
-
Stream creation flags
Values
- CU_STREAM_DEFAULT = 0x0
- Default stream flag
- CU_STREAM_NON_BLOCKING = 0x1
- Stream does not synchronize with stream 0 (the NULL stream)