4.1. CUDBGAPI_st Struct Reference

CUDA Debugger API methods.

Each instance of this struct is the API associated with a particular set of major/minor/revision version numbers.

See also:

cudbgGetAPIVersion

Public Variables

CUDBGResult  ( *acknowledgeEvent30 )( CUDBGEvent30* event )
Inform the debugger API that synchronous events have been processed.
CUDBGResult  ( *acknowledgeEvents42 )( )
Inform the debugger API that synchronous events have been processed.
CUDBGResult  ( *acknowledgeSyncEvents )( )
Inform the debugger API that synchronous events have been processed.
CUDBGResult  ( *clearAttachState )( )
Clear attach-specific state prior to detach.
CUDBGResult  ( *consumeCudaLogs )( CUDBGCudaLogMessage* logMessages, uint32_t numMessages, uint32_t* numConsumed )
Get CUDA error log entries.
CUDBGResult  ( *disableBreakpoint )( CUDBGBreakpointHandle handle )
Disable a breakpoint specified by its handle.
CUDBGResult  ( *disassemble )( uint32_t dev, uint64_t addr, uint32_t* instSize, char* buf, uint32_t sz )
Disassemble instruction at instruction address.
CUDBGResult  ( *enableBreakpoint )( CUDBGBreakpointHandle handle )
Enable a breakpoint specified by its handle.
CUDBGResult  ( *executeInternalCommand )( const char* command, char* resultBuffer, uint32_t sizeInBytes )
Execute an internal command (not available in public driver builds).
CUDBGResult  ( *finalize )( )
Finalize the API, shutting down the debugging session.
CUDBGResult  ( *generateCoredump )( const char* filename, CUDBGCoredumpGenerationFlags flags )
Generate a coredump for the current GPU state.
CUDBGResult  ( *getAdjustedCodeAddress )( uint32_t dev, uint64_t address, uint64_t* adjustedAddress, CUDBGAdjAddrAction adjAction )
Get the adjusted code address for a given code address for a given device.
CUDBGResult  ( *getBlockDim )( uint32_t dev, uint32_t sm, uint32_t wp, CuDim3* blockDim )
Get the dimensions of the given block.
CUDBGResult  ( *getCbuWarpState )( uint32_t dev, uint32_t sm, uint64_t warpMask, CUDBGCbuWarpState* warpStates, uint32_t numWarpStates )
Get the CBU state of a given warp.
CUDBGResult  ( *getClusterDim )( uint32_t dev, uint32_t sm, uint32_t wp, CuDim3* clusterDim )
Get the number of blocks in the given cluster.
CUDBGResult  ( *getClusterDim120 )( uint32_t dev, uint64_t gridId64, CuDim3* clusterDim )
Get the number of blocks in the given cluster.
CUDBGResult  ( *getClusterExceptionTargetBlock )( uint32_t dev, uint32_t sm, uint32_t wp, CuDim3* blockIdx, bool* blockIdxValid )
Get the target block index and validity status for cluster exceptions.
CUDBGResult  ( *getConstBankAddress )( uint32_t dev, uint64_t gridId64, uint32_t bank, uint64_t* address, uint32_t* size )
Get constant bank GPU VA and size.
CUDBGResult  ( *getConstBankAddress123 )( uint32_t dev, uint32_t sm, uint32_t wp, uint32_t bank, uint32_t offset, uint64_t* address )
Convert constant bank number and offset into GPU VA.
CUDBGResult  ( *getCudaExceptionString )( uint32_t dev, uint32_t sm, uint32_t wp, uint32_t ln, char* buf, uint32_t bufSz, uint32_t* msgSz )
Get the error string for CUDA Exceptions.
CUDBGResult  ( *getDeviceInfo )( uint32_t dev, CUDBGDeviceInfoQueryType_t type, void* buffer, uint32_t length, uint32_t* dataLength )
Read full device info for the device.
CUDBGResult  ( *getDeviceInfoSizes )( uint32_t dev, CUDBGDeviceInfoSizes* sizes )
Return sizes for device info structs and defined attributes.
CUDBGResult  ( *getDeviceName )( uint32_t dev, char* buf, uint32_t sz )
Get the device name string.
CUDBGResult  ( *getDevicePCIBusInfo )( uint32_t dev, uint32_t* pciBusId, uint32_t* pciDevId )
Get PCI bus and device ids associated with device index.
CUDBGResult  ( *getDeviceType )( uint32_t dev, char* buf, uint32_t sz )
Get the string description of the device.
CUDBGResult  ( *getElfImage )( uint32_t dev, uint32_t sm, uint32_t wp, bool  relocated, void* *elfImage, uint64_t* size )
Get the relocated or non-relocated ELF image and size for the grid on the given device.
CUDBGResult  ( *getElfImage32 )( uint32_t dev, uint32_t sm, uint32_t wp, bool  relocated, void* *elfImage, uint32_t* size )
Get the relocated or non-relocated ELF image and size for the grid on the given device.
CUDBGResult  ( *getElfImageByHandle )( uint32_t dev, uint64_t handle, CUDBGElfImageType type, void* elfImage, uint64_t size )
Get the relocated or non-relocated ELF image for the given handle on the given device.
CUDBGResult  ( *getErrorStringEx )( char* buf, uint32_t bufSz, uint32_t* msgSz )
Fills a user-provided buffer with an error message encoded as a null-terminated ASCII string.
CUDBGResult  ( *getGridAttribute )( uint32_t dev, uint32_t sm, uint32_t wp, CUDBGAttribute attr, uint64_t* value )
Get the value of a grid attribute.
CUDBGResult  ( *getGridAttributes )( uint32_t dev, uint32_t sm, uint32_t wp, CUDBGAttributeValuePair* pairs, uint32_t numPairs )
Get several grid attribute values in a single API call.
CUDBGResult  ( *getGridDim )( uint32_t dev, uint32_t sm, uint32_t wp, CuDim3* gridDim )
Get the dimensions in blocks of the given grid.
CUDBGResult  ( *getGridDim32 )( uint32_t dev, uint32_t sm, uint32_t wp, CuDim2* gridDim )
Get the dimensions of the given grid.
CUDBGResult  ( *getGridInfo )( uint32_t dev, uint64_t gridId64, CUDBGGridInfo* gridInfo )
Get information about the specified grid.
CUDBGResult  ( *getGridInfo120 )( uint32_t dev, uint64_t gridId64, CUDBGGridInfo120* gridInfo )
Get information about the specified grid.
CUDBGResult  ( *getGridInfo55 )( uint32_t dev, uint64_t gridId64, CUDBGGridInfo55* gridInfo )
Get information about the specified grid.
CUDBGResult  ( *getGridStatus )( uint32_t dev, uint64_t gridId64, CUDBGGridStatus* status )
Check whether the grid corresponding to the ID is still present on the device.
CUDBGResult  ( *getGridStatus50 )( uint32_t dev, uint32_t gridId, CUDBGGridStatus* status )
Check whether the grid corresponding to the ID is still present on the device.
CUDBGResult  ( *getHardwareBarrierInfo )( uint32_t dev, uint32_t sm, uint32_t wp, uint32_t ln, CUDBGBarrierScope* scope, char* buf, uint32_t bufSz, uint32_t* msgSz )
Get hardware barrier information.
CUDBGResult  ( *getHostAddrFromDeviceAddr )( uint32_t dev, uint64_t device_addr, uint64_t* host_addr )
Given a device virtual address, return a corresponding system memory virtual address.
CUDBGResult  ( *getLoadedFunctionInfo )( uint32_t dev, uint64_t handle, CUDBGLoadedFunctionInfo* info, uint32_t startIndex, uint32_t numEntries )
Get the section number and address of loaded functions for a given module.
CUDBGResult  ( *getLoadedFunctionInfo118 )( uint32_t dev, uint64_t handle, CUDBGLoadedFunctionInfo* info, uint32_t numEntries )
Get the section number and address of loaded functions for a given module.
CUDBGResult  ( *getManagedMemoryRegionInfo )( uint64_t startAddress, CUDBGMemoryInfo* memoryInfo, uint32_t memoryInfo_size, uint32_t* numEntries )
Get a sorted list of managed memory regions.
CUDBGResult  ( *getNextAsyncEvent50 )( CUDBGEvent50* event )
Copies the next available event in the asynchronous event queue into 'event' and removes it from the queue.
CUDBGResult  ( *getNextAsyncEvent55 )( CUDBGEvent55* event )
Copies the next available event in the asynchronous event queue into 'event' and removes it from the queue.
CUDBGResult  ( *getNextEvent )( CUDBGEventQueueType type, CUDBGEvent* event )
Copies the next available event into 'event' and removes it from the queue.
CUDBGResult  ( *getNextEvent30 )( CUDBGEvent30* event )
Copies the next available event in the event queue into 'event' and removes it from the queue.
CUDBGResult  ( *getNextEvent32 )( CUDBGEvent32* event )
Copies the next available event in the event queue into 'event' and removes it from the queue.
CUDBGResult  ( *getNextEvent42 )( CUDBGEvent42* event )
Copies the next available event in the event queue into 'event' and removes it from the queue.
CUDBGResult  ( *getNextSyncEvent50 )( CUDBGEvent50* event )
Copies the next available event in the synchronous event queue into 'event' and removes it from the queue.
CUDBGResult  ( *getNextSyncEvent55 )( CUDBGEvent55* event )
Copies the next available event in the synchronous event queue into 'event' and removes it from the queue.
CUDBGResult  ( *getNumDevices )( uint32_t* numDev )
Get the number of installed CUDA devices.
CUDBGResult  ( *getNumLanes )( uint32_t dev, uint32_t* numLanes )
Get the number of lanes per warp on the device.
CUDBGResult  ( *getNumPredicates )( uint32_t dev, uint32_t* numPredicates )
Get the number of predicate registers per lane on the device.
CUDBGResult  ( *getNumRegisters )( uint32_t dev, uint32_t* numRegs )
Get the maximum number of registers per lane on the device.
CUDBGResult  ( *getNumSMs )( uint32_t dev, uint32_t* numSMs )
Get the total number of SMs on the device.
CUDBGResult  ( *getNumUniformPredicates )( uint32_t dev, uint32_t* numPredicates )
Get the number of uniform predicate registers per warp on the device.
CUDBGResult  ( *getNumUniformRegisters )( uint32_t dev, uint32_t* numRegs )
Get the number of uniform registers per warp on the device.
CUDBGResult  ( *getNumWarps )( uint32_t dev, uint32_t* numWarps )
Get the number of warps per SM on the device.
CUDBGResult  ( *getPhysicalRegister30 )( uint64_t pc, char* reg, uint32_t* buf, uint32_t sz, uint32_t* numPhysRegs, CUDBGRegClass* regClass )
Get the physical register number(s) assigned to a virtual register name at a given PC, if it's live at that PC.
CUDBGResult  ( *getPhysicalRegister40 )( uint32_t dev, uint32_t sm, uint32_t wp, uint64_t pc, char* reg, uint32_t* buf, uint32_t sz, uint32_t* numPhysRegs, CUDBGRegClass* regClass )
Get the physical register number(s) assigned to a virtual register name at a given PC, if it's live at that PC.
CUDBGResult  ( *getSmType )( uint32_t dev, char* buf, uint32_t sz )
Get the SM type of the device.
CUDBGResult  ( *getSupportedDebuggerCapabilities )( CUDBGCapabilityFlags* capabilities )
Returns debug agent capabilities that are supported by this version of the API.
CUDBGResult  ( *getTID )( uint32_t dev, uint32_t sm, uint32_t wp, uint32_t* tid )
Get the ID of the Linux thread hosting the CUDA context active at the given coordinates.
CUDBGResult  ( *getWarpHitBreakpoint )( uint32_t dev, uint32_t sm, uint32_t wp, CUDBGBreakpointHandle* handle )
Get the handle of the breakpoint that the given warp hit.
CUDBGResult  ( *initialize )( )
Initialize the API.
CUDBGResult  ( *initializeAttachStub )( )
Initialize the attach stub.
CUDBGResult  ( *insertBreakpoint )( uint32_t dev, uint64_t addr, CUDBGBreakpointHandle* handle )
Set a breakpoint at the given instruction address for the given device.
CUDBGResult  ( *isBreakpointEnabled )( CUDBGBreakpointHandle handle, uint32_t* enabled )
Check if a breakpoint specified by its handle is enabled.
CUDBGResult  ( *isDeviceCodeAddress )( uintptr_t addr, bool* isDeviceAddress )
Determine whether a virtual address resides within device code.
CUDBGResult  ( *isDeviceCodeAddress55 )( uintptr_t addr, bool* isDeviceAddress )
Determine whether a virtual address resides within device code.
CUDBGResult  ( *lookupDeviceCodeSymbol )( char* symName, bool* symFound, uintptr_t* symAddr )
Determines whether a symbol represents a function in device code and returns its virtual address.
CUDBGResult  ( *memcheckReadErrorAddress )( uint32_t dev, uint32_t sm, uint32_t wp, uint32_t ln, uint64_t* address, ptxStorageKind* storage )
Get the address that memcheck detected an error on.
CUDBGResult  ( *readActiveLanes )( uint32_t dev, uint32_t sm, uint32_t wp, uint32_t* activeLanesMask )
Read the lane bitmask of active threads on a valid warp.
CUDBGResult  ( *readAllVirtualReturnAddresses )( uint32_t dev, uint32_t sm, uint32_t wp, uint32_t ln, uint64_t* addrs, uint32_t numAddrs, uint32_t* callDepth, uint32_t* syscallCallDepth )
Read all the virtual return addresses for a thread (the full backtrace).
CUDBGResult  ( *readBlockIdx )( uint32_t dev, uint32_t sm, uint32_t wp, CuDim3* blockIdx )
Read the CUDA block index running on a valid warp.
CUDBGResult  ( *readBlockIdx32 )( uint32_t dev, uint32_t sm, uint32_t wp, CuDim2* blockIdx )
Read the two-dimensional CUDA block index running on a valid warp.
CUDBGResult  ( *readBrokenWarps )( uint32_t dev, uint32_t sm, uint64_t* brokenWarpsMask )
Read the bitmask of warps that are at a breakpoint on a given SM.
CUDBGResult  ( *readCCRegister )( uint32_t dev, uint32_t sm, uint32_t wp, uint32_t ln, uint32_t* val )
Read the hardware CC register.
CUDBGResult  ( *readCPUCallStack )( uint32_t dev, uint64_t gridId64, uint64_t* addrs, uint32_t numAddrs, uint32_t* totalNumAddrs )
Read the CPU call stack captured at the time of kernel launch.
CUDBGResult  ( *readCallDepth )( uint32_t dev, uint32_t sm, uint32_t wp, uint32_t ln, uint32_t* depth )
Read the call depth (number of calls) for a given thread.
CUDBGResult  ( *readCallDepth32 )( uint32_t dev, uint32_t sm, uint32_t wp, uint32_t* depth )
Read the call depth (number of calls) for a given warp.
CUDBGResult  ( *readClusterIdx )( uint32_t dev, uint32_t sm, uint32_t wp, CuDim3* clusterIdx )
Read the CUDA cluster index running on a valid warp.
CUDBGResult  ( *readCodeMemory )( uint32_t dev, uint64_t addr, void* buf, uint32_t sz )
Read content at address in the code memory segment.
CUDBGResult  ( *readConstMemory129 )( uint32_t dev, uint64_t addr, void* buf, uint32_t sz )
Read content at address in the constant memory segment.
CUDBGResult  ( *readDeviceExceptionState )( uint32_t dev, uint64_t* mask, uint32_t numWords )
Get the exception state of the SMs on the device.
CUDBGResult  ( *readDeviceExceptionState80 )( uint32_t dev, uint64_t* exceptionSMMask )
Get the exception state of the SMs on the device.
CUDBGResult  ( *readErrorPC )( uint32_t dev, uint32_t sm, uint32_t wp, uint64_t* errorPC, bool* errorPCValid )
Get the hardware reported error PC if it exists.
CUDBGResult  ( *readGenericMemory )( uint32_t dev, uint32_t sm, uint32_t wp, uint32_t ln, uint64_t addr, void* buf, uint32_t sz )
Read content at an address in any memory segment.
CUDBGResult  ( *readGlobalMemory )( uint64_t addr, void* buf, uint32_t sz )
Read content at an address in the global address space.
CUDBGResult  ( *readGlobalMemory31 )( uint32_t dev, uint64_t addr, void* buf, uint32_t sz )
Read content at address in the global memory segment.
CUDBGResult  ( *readGlobalMemory55 )( uint32_t dev, uint32_t sm, uint32_t wp, uint32_t ln, uint64_t addr, void* buf, uint32_t sz )
Read content at address in the global memory segment.
CUDBGResult  ( *readGridId )( uint32_t dev, uint32_t sm, uint32_t wp, uint64_t* gridId64 )
Read the 64-bit CUDA grid index running on a valid warp.
CUDBGResult  ( *readGridId50 )( uint32_t dev, uint32_t sm, uint32_t wp, uint32_t* gridId )
Read the CUDA grid index running on a valid warp.
CUDBGResult  ( *readLaneException )( uint32_t dev, uint32_t sm, uint32_t wp, uint32_t ln, CUDBGException_t* exception )
Read the exception type for a given thread.
CUDBGResult  ( *readLaneStatus )( uint32_t dev, uint32_t sm, uint32_t wp, uint32_t ln, bool* error )
Read the status of the given thread.
CUDBGResult  ( *readLocalMemory )( uint32_t dev, uint32_t sm, uint32_t wp, uint32_t ln, uint64_t addr, void* buf, uint32_t sz )
Read content at address in the local memory segment.
CUDBGResult  ( *readPC )( uint32_t dev, uint32_t sm, uint32_t wp, uint32_t ln, uint64_t* pc )
Read the PC offset on the given active thread.
CUDBGResult  ( *readParamMemory )( uint32_t dev, uint32_t sm, uint32_t wp, uint64_t addr, void* buf, uint32_t sz )
Read content at address in the param memory segment.
CUDBGResult  ( *readPinnedMemory )( uint64_t addr, void* buf, uint32_t sz )
Read content at pinned address in system memory.
CUDBGResult  ( *readPredicates )( uint32_t dev, uint32_t sm, uint32_t wp, uint32_t ln, uint32_t predicates_size, uint32_t* predicates )
Read content of hardware predicate registers.
CUDBGResult  ( *readRegister )( uint32_t dev, uint32_t sm, uint32_t wp, uint32_t ln, uint32_t regno, uint32_t* val )
Read content of a hardware register.
CUDBGResult  ( *readRegisterRange )( uint32_t dev, uint32_t sm, uint32_t wp, uint32_t ln, uint32_t index, uint32_t numRegisters, uint32_t* registers, uint32_t* numRegistersRead )
Read content of a range of hardware registers.
CUDBGResult  ( *readRegisterRange60 )( uint32_t dev, uint32_t sm, uint32_t wp, uint32_t ln, uint32_t index, uint32_t registers_size, uint32_t* registers )
Read content of a range of hardware registers.
CUDBGResult  ( *readReturnAddress )( uint32_t dev, uint32_t sm, uint32_t wp, uint32_t ln, uint32_t level, uint64_t* ra )
Read the return address (offset) for a call level.
CUDBGResult  ( *readReturnAddress32 )( uint32_t dev, uint32_t sm, uint32_t wp, uint32_t level, uint64_t* ra )
Read the return address (offset) for a call level.
CUDBGResult  ( *readSharedMemory )( uint32_t dev, uint32_t sm, uint32_t wp, uint64_t addr, void* buf, uint32_t sz )
Read content at address in the shared memory segment.
CUDBGResult  ( *readSmException )( uint32_t dev, uint32_t sm, CUDBGException_t* exception, uint64_t* errorPC, bool* errorPCValid )
Get the SM exception status if it exists.
CUDBGResult  ( *readSyscallCallDepth )( uint32_t dev, uint32_t sm, uint32_t wp, uint32_t ln, uint32_t* depth )
Read the call depth of syscalls for a given thread.
CUDBGResult  ( *readTextureMemory )( uint32_t dev, uint32_t vsm, uint32_t wp, uint32_t id, uint32_t dim, uint32_t* coords, void* buf, uint32_t sz )
This method is no longer supported since CUDA 12.0.
CUDBGResult  ( *readTextureMemoryBindless )( uint32_t dev, uint32_t vsm, uint32_t wp, uint32_t texSymtabIndex, uint32_t dim, uint32_t* coords, void* buf, uint32_t sz )
This method is no longer supported since CUDA 12.0.
CUDBGResult  ( *readThreadIdx )( uint32_t dev, uint32_t sm, uint32_t wp, uint32_t ln, CuDim3* threadIdx )
Read the CUDA thread index running on a valid thread.
CUDBGResult  ( *readUniformPredicates )( uint32_t dev, uint32_t sm, uint32_t wp, uint32_t predicates_size, uint32_t* predicates )
Read contents of uniform predicate registers.
CUDBGResult  ( *readUniformRegisterRange )( uint32_t dev, uint32_t sm, uint32_t wp, uint32_t regno, uint32_t registers_size, uint32_t* registers )
Read a range of uniform registers.
CUDBGResult  ( *readValidLanes )( uint32_t dev, uint32_t sm, uint32_t wp, uint32_t* validLanesMask )
Read the lane bitmask of valid threads on a given warp.
CUDBGResult  ( *readValidWarps )( uint32_t dev, uint32_t sm, uint64_t* validWarpsMask )
Read the bitmask of valid warps on a given SM.
CUDBGResult  ( *readVirtualPC )( uint32_t dev, uint32_t sm, uint32_t wp, uint32_t ln, uint64_t* pc )
Read the PC (virtual address) on the given active thread.
CUDBGResult  ( *readVirtualReturnAddress )( uint32_t dev, uint32_t sm, uint32_t wp, uint32_t ln, uint32_t level, uint64_t* ra )
Read the virtual return address for a call level.
CUDBGResult  ( *readVirtualReturnAddress32 )( uint32_t dev, uint32_t sm, uint32_t wp, uint32_t level, uint64_t* ra )
Read the virtual return address for a call level.
CUDBGResult  ( *readWarpResources )( uint32_t dev, uint32_t sm, uint32_t wp, CUDBGWarpResources* resources )
Get the resources assigned to a given warp.
CUDBGResult  ( *readWarpState )( uint32_t dev, uint32_t sm, uint32_t wp, CUDBGWarpState* state )
Read the state of a given warp.
CUDBGResult  ( *readWarpState120 )( uint32_t dev, uint32_t sm, uint32_t wp, CUDBGWarpState120* state )
Read the state of a given warp.
CUDBGResult  ( *readWarpState127 )( uint32_t dev, uint32_t sm, uint32_t wp, CUDBGWarpState127* state )
Read the state of a given warp.
CUDBGResult  ( *readWarpState60 )( uint32_t dev, uint32_t sm, uint32_t wp, CUDBGWarpState60* state )
Read the state of a given warp.
CUDBGResult  ( *removeBreakpoint )( CUDBGBreakpointHandle handle )
Remove a breakpoint specified by its handle.
CUDBGResult  ( *requestCleanupOnDetach )( uint32_t appResumeFlag )
Request cleanup of driver state when detaching.
CUDBGResult  ( *requestCleanupOnDetach55 )( )
Request cleanup of driver state when detaching.
CUDBGResult  ( *resumeAllDevices )( )
Resume all running CUDA devices.
CUDBGResult  ( *resumeDevice )( uint32_t dev )
Resume a suspended CUDA device.
CUDBGResult  ( *resumeWarpsUntilPC )( uint32_t dev, uint32_t sm, uint64_t warpMask, uint64_t pc, uint32_t flags )
Insert a temporary breakpoint at the specified virtual PC and resume all warps in the specified bitmask on a given SM.
CUDBGResult  ( *resumeWarpsUntilPC60 )( uint32_t dev, uint32_t sm, uint64_t warpMask, uint64_t virtPC )
Insert a temporary breakpoint at the specified virtual PC and resume all warps in the specified bitmask on a given SM.
CUDBGResult  ( *setBreakpoint )( uint32_t dev, uint64_t addr )
Set a breakpoint at the given instruction address for the given device.
CUDBGResult  ( *setBreakpoint31 )( uint64_t addr )
Set a breakpoint at the given instruction address.
CUDBGResult  ( *setKernelLaunchNotificationMode )( CUDBGKernelLaunchNotifyMode mode )
Set the launch notification policy.
CUDBGResult  ( *setNotifyNewEventCallback )( CUDBGNotifyNewEventCallback callback, void* userData )
Provides the API with the function to call to notify the debugger of a new application or device event.
CUDBGResult  ( *setNotifyNewEventCallback31 )( CUDBGNotifyNewEventCallback31 callback, void* data )
Provides the API with the function to call to notify the debugger of a new application or device event.
CUDBGResult  ( *setNotifyNewEventCallback40 )( CUDBGNotifyNewEventCallback40 callback )
Provides the API with the function to call to notify the debugger of a new application or device event.
CUDBGResult  ( *setNotifyNewEventCallback41 )( CUDBGNotifyNewEventCallback41 callback )
Provides the API with the function to call to notify the debugger of a new application or device event.
CUDBGResult  ( *singleStepWarp )( uint32_t dev, uint32_t sm, uint32_t wp, uint32_t laneHint, uint32_t nsteps, uint32_t flags, uint64_t* warpMask )
Single step an individual warp nsteps times on a suspended CUDA device.
CUDBGResult  ( *singleStepWarp40 )( uint32_t dev, uint32_t sm, uint32_t wp )
Single step an individual warp on a suspended CUDA device.
CUDBGResult  ( *singleStepWarp41 )( uint32_t dev, uint32_t sm, uint32_t wp, uint64_t* warpMask )
Single step an individual warp on a suspended CUDA device.
CUDBGResult  ( *singleStepWarp65 )( uint32_t dev, uint32_t sm, uint32_t wp, uint32_t nsteps, uint64_t* warpMask )
Single step an individual warp nsteps times on a suspended CUDA device.
CUDBGResult  ( *suspendAllDevices )( uint32_t nonBlocking )
Suspend all running CUDA devices.
CUDBGResult  ( *suspendDevice )( uint32_t dev )
Suspend a running CUDA device.
CUDBGResult  ( *unsetBreakpoint )( uint32_t dev, uint64_t addr )
Unset a breakpoint at the given instruction address for the given device.
CUDBGResult  ( *unsetBreakpoint31 )( uint64_t addr )
Unset a breakpoint at the given instruction address.
CUDBGResult  ( *writeCCRegister )( uint32_t dev, uint32_t sm, uint32_t wp, uint32_t ln, uint32_t val )
Write to the hardware CC register.
CUDBGResult  ( *writeGenericMemory )( uint32_t dev, uint32_t sm, uint32_t wp, uint32_t ln, uint64_t addr, const void* buf, uint32_t sz )
Write to an address in any memory segment.
CUDBGResult  ( *writeGlobalMemory )( uint64_t addr, const void* buf, uint32_t sz )
Write to an address in global memory.
CUDBGResult  ( *writeGlobalMemory31 )( uint32_t dev, uint64_t addr, const void* buf, uint32_t sz )
Write to an address in global memory.
CUDBGResult  ( *writeGlobalMemory55 )( uint32_t dev, uint32_t sm, uint32_t wp, uint32_t ln, uint64_t addr, const void* buf, uint32_t sz )
Write to an address in global memory.
CUDBGResult  ( *writeLocalMemory )( uint32_t dev, uint32_t sm, uint32_t wp, uint32_t ln, uint64_t addr, const void* buf, uint32_t sz )
Write to an address in local memory.
CUDBGResult  ( *writeParamMemory )( uint32_t dev, uint32_t sm, uint32_t wp, uint64_t addr, const void* buf, uint32_t sz )
Write to an address in param memory.
CUDBGResult  ( *writePinnedMemory )( uint64_t addr, const void* buf, uint32_t sz )
Write to a pinned memory address.
CUDBGResult  ( *writePredicates )( uint32_t dev, uint32_t sm, uint32_t wp, uint32_t ln, uint32_t predicates_size, const uint32_t* predicates )
Write to hardware predicates.
CUDBGResult  ( *writeRegister )( uint32_t dev, uint32_t sm, uint32_t wp, uint32_t ln, uint32_t regno, uint32_t val )
Write to a hardware register.
CUDBGResult  ( *writeSharedMemory )( uint32_t dev, uint32_t sm, uint32_t wp, uint64_t addr, const void* buf, uint32_t sz )
Write to an address in shared memory.
CUDBGResult  ( *writeUniformPredicates )( uint32_t dev, uint32_t sm, uint32_t wp, uint32_t predicates_size, const uint32_t* predicates )
Write to hardware uniform predicates.
CUDBGResult  ( *writeUniformRegister )( uint32_t dev, uint32_t sm, uint32_t wp, uint32_t regno, uint32_t val )
Write to a hardware uniform register.

Variables

CUDBGResult ( *CUDBGAPI_st::acknowledgeEvent30 )( CUDBGEvent30* event )

Inform the debugger API that synchronous events have been processed. Behaves exactly like acknowledgeSyncEvents (the event parameter is ignored).

Since CUDA 3.0.

Note:

DEPRECATED in CUDA 3.1: Use acknowledgeSyncEvents instead.

See also:

acknowledgeSyncEvents

Parameters
event
- pointer to the event that has been processed
Returns

CUDBG_SUCCESS, CUDBG_ERROR_INVALID_ARGS, CUDBG_ERROR_UNINITIALIZED, CUDBG_ERROR_INITIALIZATION_FAILURE

CUDBGResult ( *CUDBGAPI_st::acknowledgeEvents42 )( )

Inform the debugger API that synchronous events have been processed. Behaves exactly like acknowledgeSyncEvents.

Since CUDA 3.1.

Note:

DEPRECATED in CUDA 5.0: Use acknowledgeSyncEvents instead.

See also:

acknowledgeSyncEvents

Returns

CUDBG_SUCCESS, CUDBG_ERROR_INVALID_ARGS, CUDBG_ERROR_UNINITIALIZED, CUDBG_ERROR_INITIALIZATION_FAILURE

CUDBGResult ( *CUDBGAPI_st::acknowledgeSyncEvents )( )

Inform the debugger API that synchronous events have been processed. This resumes any process that was interrupted by the synchronous event (e.g. a context creation or a module load). This method acknowledges only those SYNC events that have been read with getNextEvent (or its deprecated variants). SYNC events that have not been read are not acknowledged and continue to block their corresponding processes. ASYNC events do not require acknowledgement.

Since CUDA 5.0.

See also:

getNextEvent

setNotifyNewEventCallback

Returns

CUDBG_SUCCESS, CUDBG_ERROR_INVALID_ARGS, CUDBG_ERROR_UNINITIALIZED, CUDBG_ERROR_INITIALIZATION_FAILURE
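
The read-then-acknowledge rule above can be sketched as a small drain loop. This is a minimal illustration, not the reference client: the types are local stand-ins so the code compiles without cudadebugger.h, ApiSubset is a hypothetical two-member subset of CUDBGAPI_st, the enumerator values are placeholders, and it assumes getNextEvent reports an empty queue with CUDBG_ERROR_NO_EVENT_AVAILABLE. The fake_* functions exist only to exercise the loop.

```c
#include <stdint.h>

/* Stand-in declarations so the sketch compiles without cudadebugger.h.
   Enumerator values and the event layout are placeholders. */
typedef enum {
    CUDBG_SUCCESS = 0,
    CUDBG_ERROR_NO_EVENT_AVAILABLE = 1
} CUDBGResult;
typedef enum {
    CUDBG_EVENT_QUEUE_TYPE_SYNC = 0,
    CUDBG_EVENT_QUEUE_TYPE_ASYNC = 1
} CUDBGEventQueueType;
typedef struct { uint32_t kind; } CUDBGEvent;

/* Hypothetical subset of CUDBGAPI_st with only the members used here. */
typedef struct {
    CUDBGResult (*getNextEvent)(CUDBGEventQueueType type, CUDBGEvent *event);
    CUDBGResult (*acknowledgeSyncEvents)(void);
} ApiSubset;

/* Read every pending SYNC event, then acknowledge them in one call.
   Returns the number of events read, or -1 on an unexpected error. */
int drain_sync_events(const ApiSubset *api)
{
    CUDBGEvent ev;
    CUDBGResult res;
    int count = 0;

    while ((res = api->getNextEvent(CUDBG_EVENT_QUEUE_TYPE_SYNC, &ev)) == CUDBG_SUCCESS) {
        /* A real client would dispatch on the event contents here. */
        count++;
    }
    if (res != CUDBG_ERROR_NO_EVENT_AVAILABLE)
        return -1;
    /* Only the events read above are acknowledged by this call. */
    if (count > 0 && api->acknowledgeSyncEvents() != CUDBG_SUCCESS)
        return -1;
    return count;
}

/* Fake backend, used only to exercise the loop. */
int fake_pending = 0;
int fake_acks = 0;

CUDBGResult fake_getNextEvent(CUDBGEventQueueType type, CUDBGEvent *event)
{
    (void)type;
    if (fake_pending == 0)
        return CUDBG_ERROR_NO_EVENT_AVAILABLE;
    fake_pending--;
    event->kind = 1;
    return CUDBG_SUCCESS;
}

CUDBGResult fake_acknowledgeSyncEvents(void)
{
    fake_acks++;
    return CUDBG_SUCCESS;
}
```

Skipping the acknowledgement when nothing was read is a design choice of this sketch; the text above does not say how the API treats an acknowledgement with no read events.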

CUDBGResult ( *CUDBGAPI_st::clearAttachState )( )

Clear attach-specific state prior to detach. This call prepares the API for detaching. See the "Attaching and Detaching" section for more information.

Since CUDA 5.0.

Returns

CUDBG_SUCCESS, CUDBG_ERROR_UNINITIALIZED, CUDBG_ERROR_INITIALIZATION_FAILURE, CUDBG_ERROR_RECURSIVE_API_CALL

CUDBGResult ( *CUDBGAPI_st::consumeCudaLogs )( CUDBGCudaLogMessage* logMessages, uint32_t numMessages, uint32_t* numConsumed )

Get CUDA error log entries. This consumes the log entries, so they will not be available in subsequent calls. This functionality is only available if the CUDBG_DEBUGGER_CAPABILITY_ENABLE_CUDA_LOGS capability is enabled.

Since CUDA 12.9.

Parameters
logMessages
- client-allocated array to store log entries
numMessages
- capacity of the logMessages array, in number of elements
numConsumed
- returned number of entries written to logMessages
Returns

CUDBG_SUCCESS, CUDBG_ERROR_INVALID_ARGS, CUDBG_ERROR_UNINITIALIZED, CUDBG_ERROR_INITIALIZATION_FAILURE, CUDBG_ERROR_NO_EVENT_AVAILABLE, CUDBG_ERROR_RECURSIVE_API_CALL
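
Because consumed entries are not returned again, a client would typically drain the log in batches into a client-allocated array. The following is a sketch under stated assumptions: the types are local stand-ins (the real CUDBGCudaLogMessage layout is not shown in this section), ApiSubset is a hypothetical one-member subset of CUDBGAPI_st, the enumerator values are placeholders, and the loop simply stops on any non-success result or an empty batch. The fake_* backend is illustrative only.

```c
#include <stdint.h>

/* Stand-in declarations; values and layouts are placeholders. */
typedef enum {
    CUDBG_SUCCESS = 0,
    CUDBG_ERROR_NO_EVENT_AVAILABLE = 1
} CUDBGResult;
typedef struct { char text[64]; } CUDBGCudaLogMessage;

/* Hypothetical subset of CUDBGAPI_st. */
typedef struct {
    CUDBGResult (*consumeCudaLogs)(CUDBGCudaLogMessage *logMessages,
                                   uint32_t numMessages,
                                   uint32_t *numConsumed);
} ApiSubset;

enum { LOG_BATCH = 4 };   /* capacity of the client-allocated array */

/* Consume log entries batch by batch; returns how many were consumed. */
uint32_t drain_cuda_logs(const ApiSubset *api)
{
    CUDBGCudaLogMessage batch[LOG_BATCH];
    uint32_t total = 0;

    for (;;) {
        uint32_t n = 0;
        CUDBGResult res = api->consumeCudaLogs(batch, LOG_BATCH, &n);
        if (res != CUDBG_SUCCESS || n == 0)
            break;
        /* A real client would inspect batch[0..n-1] here. */
        total += n;
    }
    return total;
}

/* Fake backend, used only to exercise the loop. */
uint32_t fake_pending_logs = 0;

CUDBGResult fake_consumeCudaLogs(CUDBGCudaLogMessage *logMessages,
                                 uint32_t numMessages,
                                 uint32_t *numConsumed)
{
    uint32_t n = fake_pending_logs < numMessages ? fake_pending_logs : numMessages;
    (void)logMessages;
    fake_pending_logs -= n;
    *numConsumed = n;
    return n ? CUDBG_SUCCESS : CUDBG_ERROR_NO_EVENT_AVAILABLE;
}
```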

CUDBGResult ( *CUDBGAPI_st::disableBreakpoint )( CUDBGBreakpointHandle handle )

Disable a breakpoint specified by its handle. Disabling/enabling a breakpoint might be faster than removing and inserting it again.

Since CUDA 13.2.

See also:

enableBreakpoint

getWarpHitBreakpoint

insertBreakpoint

isBreakpointEnabled

removeBreakpoint

Parameters
handle
- the breakpoint handle
Returns

CUDBG_SUCCESS, CUDBG_ERROR_INVALID_ARGS, CUDBG_ERROR_UNINITIALIZED, CUDBG_ERROR_INTERNAL, CUDBG_ERROR_INITIALIZATION_FAILURE, CUDBG_ERROR_RECURSIVE_API_CALL, CUDBG_ERROR_BREAKPOINT_STATE_CONFLICT

CUDBGResult ( *CUDBGAPI_st::disassemble )( uint32_t dev, uint64_t addr, uint32_t* instSize, char* buf, uint32_t sz )

Disassemble instruction at instruction address. This method does not guarantee any specific output format and its result should be treated as plain text.

Since CUDA 3.0.

Parameters
dev
- device index
addr
- instruction address
instSize
- instruction size (32 or 64 bits)
buf
- disassembled instruction buffer
sz
- disassembled instruction buffer size
Returns

CUDBG_SUCCESS, CUDBG_ERROR_UNKNOWN, CUDBG_ERROR_INVALID_ARGS, CUDBG_ERROR_UNINITIALIZED, CUDBG_ERROR_INTERNAL, CUDBG_ERROR_INVALID_ADDRESS, CUDBG_ERROR_INITIALIZATION_FAILURE, CUDBG_ERROR_INVALID_CONTEXT, CUDBG_ERROR_RECURSIVE_API_CALL
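
Since the output is plain text with no guaranteed format, a caller can do little more than pass a buffer and display the result. The wrapper below is a minimal sketch with local stand-in types: ApiSubset is a hypothetical one-member subset of CUDBGAPI_st, the enumerator values are placeholders, and the forced null terminator is a defensive choice of this sketch because this section does not state whether the API terminates the buffer on truncation. The fake_disassemble backend and its "NOP;" text are invented for illustration.

```c
#include <stdint.h>
#include <string.h>

/* Stand-in declarations; enumerator values are placeholders. */
typedef enum { CUDBG_SUCCESS = 0, CUDBG_ERROR_INVALID_ARGS = 1 } CUDBGResult;

/* Hypothetical subset of CUDBGAPI_st. */
typedef struct {
    CUDBGResult (*disassemble)(uint32_t dev, uint64_t addr, uint32_t *instSize,
                               char *buf, uint32_t sz);
} ApiSubset;

/* Disassemble one instruction into text; returns 0 on success, -1 otherwise. */
int disassemble_at(const ApiSubset *api, uint32_t dev, uint64_t addr,
                   char *text, uint32_t textSz, uint32_t *instSize)
{
    if (text == NULL || textSz == 0)
        return -1;
    text[0] = '\0';
    if (api->disassemble(dev, addr, instSize, text, textSz) != CUDBG_SUCCESS)
        return -1;
    text[textSz - 1] = '\0';   /* defensive: never hand back unterminated text */
    return 0;
}

/* Fake backend, used only to exercise the wrapper. */
CUDBGResult fake_disassemble(uint32_t dev, uint64_t addr, uint32_t *instSize,
                             char *buf, uint32_t sz)
{
    (void)dev;
    (void)addr;
    if (buf == NULL || sz == 0 || instSize == NULL)
        return CUDBG_ERROR_INVALID_ARGS;
    strncpy(buf, "NOP;", sz);   /* may leave buf unterminated when sz is small */
    *instSize = 64;             /* placeholder size */
    return CUDBG_SUCCESS;
}
```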

CUDBGResult ( *CUDBGAPI_st::enableBreakpoint )( CUDBGBreakpointHandle handle )

Enable a breakpoint specified by its handle. Disabling/enabling a breakpoint might be faster than removing and inserting it again.

Since CUDA 13.2.

See also:

disableBreakpoint

getWarpHitBreakpoint

insertBreakpoint

isBreakpointEnabled

removeBreakpoint

Parameters
handle
- the breakpoint handle
Returns

CUDBG_SUCCESS, CUDBG_ERROR_INVALID_ARGS, CUDBG_ERROR_UNINITIALIZED, CUDBG_ERROR_INTERNAL, CUDBG_ERROR_INITIALIZATION_FAILURE, CUDBG_ERROR_RECURSIVE_API_CALL, CUDBG_ERROR_BREAKPOINT_STATE_CONFLICT
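
One way to use disableBreakpoint and enableBreakpoint together with isBreakpointEnabled is to query the current state first and toggle only when needed. The sketch below assumes local stand-in types: CUDBGBreakpointHandle is typedef'd to an integer purely as a placeholder, ApiSubset is a hypothetical three-member subset of CUDBGAPI_st, and the enumerator values are placeholders. Querying first avoids depending on how the API treats a redundant enable or disable, which this section does not spell out. The fake_* backend tracks a single breakpoint for illustration.

```c
#include <stdint.h>

/* Stand-in declarations; the real handle type and values may differ. */
typedef enum { CUDBG_SUCCESS = 0, CUDBG_ERROR_INVALID_ARGS = 1 } CUDBGResult;
typedef uint64_t CUDBGBreakpointHandle;

/* Hypothetical subset of CUDBGAPI_st. */
typedef struct {
    CUDBGResult (*isBreakpointEnabled)(CUDBGBreakpointHandle handle, uint32_t *enabled);
    CUDBGResult (*disableBreakpoint)(CUDBGBreakpointHandle handle);
    CUDBGResult (*enableBreakpoint)(CUDBGBreakpointHandle handle);
} ApiSubset;

/* Bring the breakpoint to the wanted state, toggling only if it differs. */
CUDBGResult set_breakpoint_enabled(const ApiSubset *api,
                                   CUDBGBreakpointHandle handle,
                                   uint32_t want)
{
    uint32_t enabled = 0;
    CUDBGResult res = api->isBreakpointEnabled(handle, &enabled);

    if (res != CUDBG_SUCCESS)
        return res;
    if ((enabled != 0) == (want != 0))
        return CUDBG_SUCCESS;   /* already in the wanted state */
    return want ? api->enableBreakpoint(handle)
                : api->disableBreakpoint(handle);
}

/* Fake backend modelling one breakpoint, used only to exercise the helper. */
uint32_t fake_enabled = 1;
int fake_toggles = 0;

CUDBGResult fake_isBreakpointEnabled(CUDBGBreakpointHandle handle, uint32_t *enabled)
{
    (void)handle;
    *enabled = fake_enabled;
    return CUDBG_SUCCESS;
}

CUDBGResult fake_disableBreakpoint(CUDBGBreakpointHandle handle)
{
    (void)handle;
    fake_enabled = 0;
    fake_toggles++;
    return CUDBG_SUCCESS;
}

CUDBGResult fake_enableBreakpoint(CUDBGBreakpointHandle handle)
{
    (void)handle;
    fake_enabled = 1;
    fake_toggles++;
    return CUDBG_SUCCESS;
}
```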

CUDBGResult ( *CUDBGAPI_st::executeInternalCommand )( const char* command, char* resultBuffer, uint32_t sizeInBytes )

Execute an internal command (not available in public driver builds). Always returns CUDBG_ERROR_NOT_SUPPORTED.

Since CUDA 12.6.

Parameters
command
- the command name and arguments
resultBuffer
- the destination buffer
sizeInBytes
- buffer size in bytes
Returns

CUDBG_ERROR_NOT_SUPPORTED

CUDBGResult ( *CUDBGAPI_st::finalize )( )

Finalize the API, shutting down the debugging session.

Since CUDA 3.0.

See also:

initialize

Returns

CUDBG_SUCCESS, CUDBG_ERROR_UNINITIALIZED, CUDBG_ERROR_INITIALIZATION_FAILURE, CUDBG_ERROR_RECURSIVE_API_CALL

CUDBGResult ( *CUDBGAPI_st::generateCoredump )( const char* filename, CUDBGCoredumpGenerationFlags flags )

Generate a coredump for the current GPU state.

Since CUDA 12.3.

Parameters
filename
- target coredump file name
flags
- coredump generation flags/options
Returns

CUDBG_SUCCESS, CUDBG_ERROR_INVALID_ARGS, CUDBG_ERROR_UNINITIALIZED, CUDBG_ERROR_INTERNAL, CUDBG_ERROR_INITIALIZATION_FAILURE, CUDBG_ERROR_RECURSIVE_API_CALL

CUDBGResult ( *CUDBGAPI_st::getAdjustedCodeAddress )( uint32_t dev, uint64_t address, uint64_t* adjustedAddress, CUDBGAdjAddrAction adjAction )

Get the adjusted code address for a given code address on a given device. The client must call this function before inserting a breakpoint, or when the previous or next code address is needed for breakpoint insertion.

Since CUDA 5.5.

See also:

setBreakpoint

Parameters
dev
- device index
address
- the code address to adjust
adjustedAddress
- adjusted address
adjAction
- whether the adjusted next, previous or current address is needed
Returns

CUDBG_SUCCESS, CUDBG_ERROR_INVALID_ARGS, CUDBG_ERROR_UNINITIALIZED, CUDBG_ERROR_INTERNAL, CUDBG_ERROR_INVALID_ADDRESS, CUDBG_ERROR_INITIALIZATION_FAILURE, CUDBG_ERROR_RECURSIVE_API_CALL
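
The adjust-then-insert order described above might look like the following. This is a sketch with local stand-in types: ApiSubset is a hypothetical two-member subset of CUDBGAPI_st, the CUDBGAdjAddrAction enumerator names and all numeric values are placeholders for whatever cudadebugger.h defines, and the fake backend's rounding to a 16-byte boundary is invented purely so the adjustment is observable.

```c
#include <stdint.h>

/* Stand-in declarations; names and values are placeholders. */
typedef enum { CUDBG_SUCCESS = 0, CUDBG_ERROR_INVALID_ADDRESS = 1 } CUDBGResult;
typedef enum {
    CUDBG_ADJ_PREVIOUS_ADDRESS = 1,
    CUDBG_ADJ_CURRENT_ADDRESS = 2,
    CUDBG_ADJ_NEXT_ADDRESS = 3
} CUDBGAdjAddrAction;

/* Hypothetical subset of CUDBGAPI_st. */
typedef struct {
    CUDBGResult (*getAdjustedCodeAddress)(uint32_t dev, uint64_t address,
                                          uint64_t *adjustedAddress,
                                          CUDBGAdjAddrAction adjAction);
    CUDBGResult (*setBreakpoint)(uint32_t dev, uint64_t addr);
} ApiSubset;

/* Adjust the requested address first, then set the breakpoint there.
   On success, *placedAt receives the address actually used. */
CUDBGResult set_adjusted_breakpoint(const ApiSubset *api, uint32_t dev,
                                    uint64_t addr, uint64_t *placedAt)
{
    uint64_t adjusted = 0;
    CUDBGResult res = api->getAdjustedCodeAddress(dev, addr, &adjusted,
                                                  CUDBG_ADJ_CURRENT_ADDRESS);
    if (res != CUDBG_SUCCESS)
        return res;
    res = api->setBreakpoint(dev, adjusted);
    if (res == CUDBG_SUCCESS)
        *placedAt = adjusted;
    return res;
}

/* Fake backend, used only to exercise the helper. */
uint64_t fake_last_breakpoint = 0;

CUDBGResult fake_getAdjustedCodeAddress(uint32_t dev, uint64_t address,
                                        uint64_t *adjustedAddress,
                                        CUDBGAdjAddrAction adjAction)
{
    (void)dev;
    (void)adjAction;
    *adjustedAddress = address & ~(uint64_t)0xF;   /* invented adjustment rule */
    return CUDBG_SUCCESS;
}

CUDBGResult fake_setBreakpoint(uint32_t dev, uint64_t addr)
{
    (void)dev;
    fake_last_breakpoint = addr;
    return CUDBG_SUCCESS;
}
```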

CUDBGResult ( *CUDBGAPI_st::getBlockDim )( uint32_t dev, uint32_t sm, uint32_t wp, CuDim3* blockDim )

Get the dimensions of the given block.

Since CUDA 3.0.

See also:

getClusterDim

getGridDim

Parameters
dev
- device index
sm
- SM index
wp
- warp index
blockDim
- the returned number of threads in the block
Returns

CUDBG_SUCCESS, CUDBG_ERROR_INVALID_ARGS, CUDBG_ERROR_UNINITIALIZED, CUDBG_ERROR_INTERNAL, CUDBG_ERROR_INVALID_WARP, CUDBG_ERROR_INITIALIZATION_FAILURE, CUDBG_ERROR_INVALID_CONTEXT, CUDBG_ERROR_RECURSIVE_API_CALL

CUDBGResult ( *CUDBGAPI_st::getCbuWarpState )( uint32_t dev, uint32_t sm, uint64_t warpMask, CUDBGCbuWarpState* warpStates, uint32_t numWarpStates )

Get the CBU state of a given warp. Since CUDA 12.9.

Parameters
dev
- device index
sm
- SM index
warpMask
- bitmask of the warps whose states should be returned in warpStates
warpStates
- pointer to the array of CUDBGCbuWarpState structures
numWarpStates
- number of elements in warpStates array
Returns

CUDBG_SUCCESS, CUDBG_ERROR_INVALID_ARGS, CUDBG_ERROR_UNINITIALIZED, CUDBG_ERROR_INTERNAL, CUDBG_ERROR_INVALID_WARP, CUDBG_ERROR_INITIALIZATION_FAILURE, CUDBG_ERROR_INVALID_CONTEXT, CUDBG_ERROR_RECURSIVE_API_CALL
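
A minimal sketch of building the `warpMask` argument, assuming bit i of the mask selects warp index i on the given SM (the reference does not state the bit order explicitly). The helper name `warp_mask_from_indices` is hypothetical and not part of the API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: set bit i for every warp index i listed in `warps`.
 * Assumes bit position == warp index, which is an assumption of this sketch. */
uint64_t warp_mask_from_indices(const uint32_t *warps, size_t count)
{
    uint64_t mask = 0;
    for (size_t i = 0; i < count; ++i) {
        assert(warps[i] < 64);              /* a 64-bit mask covers warps 0..63 */
        mask |= (uint64_t)1 << warps[i];
    }
    return mask;
}
```

For example, warps 0, 3 and 63 produce the mask 0x8000000000000009, which would then be passed as `warpMask` together with a `warpStates` array large enough for the selected warps.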

CUDBGResult ( *CUDBGAPI_st::getClusterDim )( uint32_t dev, uint32_t sm, uint32_t wp, CuDim3* clusterDim )

Get the number of blocks in the given cluster. Since CUDA 12.7.

See also:

getBlockDim

getGridDim

Parameters
dev
- device index
sm
- SM index
wp
- warp index
clusterDim
- the returned number of blocks in the cluster
Returns

CUDBG_SUCCESS, CUDBG_ERROR_INVALID_ARGS, CUDBG_ERROR_UNINITIALIZED, CUDBG_ERROR_INTERNAL, CUDBG_ERROR_INVALID_WARP, CUDBG_ERROR_INITIALIZATION_FAILURE, CUDBG_ERROR_INVALID_CONTEXT, CUDBG_ERROR_RECURSIVE_API_CALL

CUDBGResult ( *CUDBGAPI_st::getClusterDim120 )( uint32_t dev, uint64_t gridId64, CuDim3* clusterDim )

Get the number of blocks in the given cluster. Behaves like getClusterDim, but takes a grid ID instead of warp coordinates. On newer GPU architectures, different warps can belong to blocks of clusters of different sizes.

Since CUDA 12.0.

Note:

DEPRECATED in CUDA 12.7: Use getClusterDim instead.

See also:

getClusterDim

Parameters
dev
- device index
gridId64
- grid ID
clusterDim
- the returned number of blocks in the cluster
Returns

CUDBG_SUCCESS, CUDBG_ERROR_INVALID_ARGS, CUDBG_ERROR_UNINITIALIZED, CUDBG_ERROR_INTERNAL, CUDBG_ERROR_INITIALIZATION_FAILURE, CUDBG_ERROR_INVALID_GRID, CUDBG_ERROR_RECURSIVE_API_CALL

CUDBGResult ( *CUDBGAPI_st::getClusterExceptionTargetBlock )( uint32_t dev, uint32_t sm, uint32_t wp, CuDim3* blockIdx, bool* blockIdxValid )

Get the target block index and validity status for cluster exceptions. Since CUDA 12.7.

Parameters
dev
- device index
sm
- SM index
wp
- warp index
blockIdx
- pointer to a `CuDim3` structure that will be populated with the target block index
blockIdxValid
- pointer to a boolean variable that will be set to `true` if the target block index is valid, and `false` otherwise. The value is set to `false` if the warp is not stopped on a cluster exception.
Returns

CUDBG_SUCCESS, CUDBG_ERROR_INVALID_ARGS, CUDBG_ERROR_UNINITIALIZED, CUDBG_ERROR_INTERNAL, CUDBG_ERROR_INVALID_WARP, CUDBG_ERROR_INITIALIZATION_FAILURE, CUDBG_ERROR_INVALID_CONTEXT, CUDBG_ERROR_RECURSIVE_API_CALL, CUDBG_ERROR_NOT_SUPPORTED

CUDBGResult ( *CUDBGAPI_st::getConstBankAddress )( uint32_t dev, uint64_t gridId64, uint32_t bank, uint64_t* address, uint32_t* size )

Get constant bank GPU VA and size. Since CUDA 12.4.

Parameters
dev
- device index
gridId64
- grid ID of the grid containing the constant bank
bank
- constant bank number
address
- GPU VA of the bank memory
size
- bank size
Returns

CUDBG_SUCCESS, CUDBG_ERROR_INVALID_ARGS, CUDBG_ERROR_UNINITIALIZED, CUDBG_ERROR_INTERNAL, CUDBG_ERROR_INITIALIZATION_FAILURE, CUDBG_ERROR_INVALID_GRID, CUDBG_ERROR_RECURSIVE_API_CALL, CUDBG_ERROR_MISSING_DATA

CUDBGResult ( *CUDBGAPI_st::getConstBankAddress123 )( uint32_t dev, uint32_t sm, uint32_t wp, uint32_t bank, uint32_t offset, uint64_t* address )

Convert a constant bank number and offset into a GPU VA. It is more convenient to get the constant bank address with getConstBankAddress and then calculate the VA of any constant address from it.

Since CUDA 12.3.

Note:

DEPRECATED in CUDA 12.4: Use getConstBankAddress instead.

See also:

getConstBankAddress

Parameters
dev
- device index
sm
- SM index
wp
- warp index
bank
- constant bank number
offset
- offset within the bank
address
- GPU VA
Returns

CUDBG_SUCCESS, CUDBG_ERROR_INVALID_ARGS, CUDBG_ERROR_UNINITIALIZED, CUDBG_ERROR_INVALID_MEMORY_ACCESS, CUDBG_ERROR_INTERNAL, CUDBG_ERROR_INITIALIZATION_FAILURE, CUDBG_ERROR_INVALID_GRID, CUDBG_ERROR_RECURSIVE_API_CALL, CUDBG_ERROR_MISSING_DATA
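
The calculation mentioned for getConstBankAddress123 can be sketched as follows, assuming the bank occupies one contiguous range so that the VA of a constant is the bank base plus its offset. `const_bank_va` is a hypothetical helper; `bank_base` and `bank_size` stand for the `address` and `size` values returned by getConstBankAddress.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: compute the GPU VA of `offset` inside a constant bank.
 * Assumes a contiguous bank, so VA = base + offset (assumption of this sketch).
 * Returns false when the offset lies outside the bank. */
bool const_bank_va(uint64_t bank_base, uint32_t bank_size,
                   uint32_t offset, uint64_t *va)
{
    assert(va != NULL);
    if (offset >= bank_size)
        return false;
    *va = bank_base + offset;
    return true;
}
```

With this approach the bank address is queried once per grid and bank, and every later constant lookup is plain arithmetic rather than another API call.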

CUDBGResult ( *CUDBGAPI_st::getCudaExceptionString )( uint32_t dev, uint32_t sm, uint32_t wp, uint32_t ln, char* buf, uint32_t bufSz, uint32_t* msgSz )

Get the error string for CUDA Exceptions. Since CUDA 13.0.

Parameters
dev
- device index
sm
- SM index
wp
- warp index
ln
- lane index
buf
- buffer for the error string
bufSz
- buffer size
msgSz
- size of the error message including the terminating null character; the pointer can be null
Returns

CUDBG_SUCCESS, CUDBG_ERROR_BUFFER_TOO_SMALL, CUDBG_ERROR_INVALID_ARGS, CUDBG_ERROR_UNINITIALIZED, CUDBG_ERROR_INTERNAL, CUDBG_ERROR_INVALID_WARP, CUDBG_ERROR_INITIALIZATION_FAILURE, CUDBG_ERROR_INVALID_CONTEXT, CUDBG_ERROR_RECURSIVE_API_CALL

CUDBGResult ( *CUDBGAPI_st::getDeviceInfo )( uint32_t dev, CUDBGDeviceInfoQueryType_t type, void* buffer, uint32_t length, uint32_t* dataLength )

Read full device info for the device. Information returned by this method is cheap to calculate, so it can be used after every suspend to quickly get the updated device state. For convenience, the caller can always request partial updates; the API returns a full response when returning a partial one is not possible. If the CUDBG_DEBUGGER_CAPABILITY_REPORT_EXCEPTIONS_IN_EXITED_WARPS capability is enabled, exceptions in exited warps will be reported.

Since CUDA 12.4.

See also:

getDeviceInfoSizes

Parameters
dev
- device index
type
- query type (full or delta)
buffer
- output buffer
length
- output buffer length
dataLength
- number of bytes written to the buffer
Returns

CUDBG_SUCCESS, CUDBG_ERROR_INVALID_ARGS, CUDBG_ERROR_UNINITIALIZED, CUDBG_ERROR_INTERNAL, CUDBG_ERROR_INITIALIZATION_FAILURE, CUDBG_ERROR_RECURSIVE_API_CALL

CUDBGResult ( *CUDBGAPI_st::getDeviceInfoSizes )( uint32_t dev, CUDBGDeviceInfoSizes* sizes )

Return sizes for device info structs and defined attributes. Since CUDA 12.4.

See also:

getDeviceInfo

Parameters
dev
- device index
sizes
- device info sizes
Returns

CUDBG_SUCCESS, CUDBG_ERROR_INVALID_ARGS, CUDBG_ERROR_UNINITIALIZED, CUDBG_ERROR_INITIALIZATION_FAILURE, CUDBG_ERROR_RECURSIVE_API_CALL

CUDBGResult ( *CUDBGAPI_st::getDeviceName )( uint32_t dev, char* buf, uint32_t sz )

Get the device name string. Returns CUDBG_ERROR_BUFFER_TOO_SMALL if the provided buffer is not large enough. This value is constant within a single session for a given device.

Since CUDA 6.5.

See also:

getDeviceType

getSmType

Parameters
dev
- device index
buf
- the destination buffer
sz
- buffer size in bytes
Returns

CUDBG_SUCCESS, CUDBG_ERROR_BUFFER_TOO_SMALL, CUDBG_ERROR_INVALID_ARGS, CUDBG_ERROR_UNINITIALIZED, CUDBG_ERROR_INITIALIZATION_FAILURE, CUDBG_ERROR_RECURSIVE_API_CALL
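
Because getDeviceName reports CUDBG_ERROR_BUFFER_TOO_SMALL without returning the required size, one way to handle it is to grow the buffer and retry. The sketch below assumes that approach; `read_device_string`, `DeviceStringFn`, the 4096-byte cap and the mock are all hypothetical, and the enum is a stand-in so the code builds without cudadebugger.h.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for the real enum; names mirror the reference, values are placeholders. */
typedef enum {
    CUDBG_SUCCESS,
    CUDBG_ERROR_BUFFER_TOO_SMALL,
    CUDBG_ERROR_INVALID_ARGS
} CUDBGResult;

/* Parameter list shared by getDeviceName, getDeviceType and getSmType. */
typedef CUDBGResult (*DeviceStringFn)(uint32_t dev, char *buf, uint32_t sz);

/* Hypothetical helper: double the buffer until the call no longer reports
 * CUDBG_ERROR_BUFFER_TOO_SMALL. Returns a malloc'ed string or NULL. */
char *read_device_string(DeviceStringFn query, uint32_t dev)
{
    assert(query != NULL);
    for (uint32_t sz = 16; sz <= 4096; sz *= 2) {
        char *buf = malloc(sz);
        if (buf == NULL)
            return NULL;
        CUDBGResult res = query(dev, buf, sz);
        if (res == CUDBG_SUCCESS)
            return buf;                      /* caller frees */
        free(buf);
        if (res != CUDBG_ERROR_BUFFER_TOO_SMALL)
            return NULL;                     /* unrelated failure: give up */
    }
    return NULL;                             /* string exceeds the arbitrary cap */
}

/* Mock used only to exercise the helper; the device name is made up. */
CUDBGResult mock_get_device_name(uint32_t dev, char *buf, uint32_t sz)
{
    static const char name[] = "Example GPU with a long name";
    if (dev != 0 || buf == NULL)
        return CUDBG_ERROR_INVALID_ARGS;
    if (sz < sizeof name)
        return CUDBG_ERROR_BUFFER_TOO_SMALL;
    memcpy(buf, name, sizeof name);
    return CUDBG_SUCCESS;
}
```

Since the value is constant within a session for a given device, the result can be cached after the first successful call.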

CUDBGResult ( *CUDBGAPI_st::getDevicePCIBusInfo )( uint32_t dev, uint32_t* pciBusId, uint32_t* pciDevId )

Get the PCI bus and device IDs associated with a device index. Since CUDA 5.5.

Parameters
dev
- device index
pciBusId
- pointer to where the corresponding PCI bus ID is stored
pciDevId
- pointer to where the corresponding PCI device ID is stored
Returns

CUDBG_SUCCESS, CUDBG_ERROR_INVALID_ARGS, CUDBG_ERROR_UNINITIALIZED, CUDBG_ERROR_INITIALIZATION_FAILURE, CUDBG_ERROR_RECURSIVE_API_CALL

CUDBGResult ( *CUDBGAPI_st::getDeviceType )( uint32_t dev, char* buf, uint32_t sz )

Get the string description of the device. Returns CUDBG_ERROR_BUFFER_TOO_SMALL if the provided buffer is not large enough. This value is constant within a single session for a given device.

Since CUDA 3.0.

See also:

getDeviceName

getSmType

Parameters
dev
- device index
buf
- the destination buffer
sz
- buffer size in bytes
Returns

CUDBG_SUCCESS, CUDBG_ERROR_BUFFER_TOO_SMALL, CUDBG_ERROR_INVALID_ARGS, CUDBG_ERROR_UNINITIALIZED, CUDBG_ERROR_INITIALIZATION_FAILURE, CUDBG_ERROR_RECURSIVE_API_CALL

CUDBGResult ( *CUDBGAPI_st::getElfImage )( uint32_t dev, uint32_t sm, uint32_t wp, bool  relocated, void* *elfImage, uint64_t* size )

Get the relocated or non-relocated ELF image and size for the grid on the given device. Since CUDA 4.0.

See also:

getElfImageByHandle

getLoadedFunctionInfo

Parameters
dev
- device index
sm
- SM index
wp
- warp index
relocated
- set to true to specify the relocated ELF image, false otherwise
elfImage
- pointer to the ELF image
size
- size of the ELF image (64 bits)
Returns

CUDBG_SUCCESS, CUDBG_ERROR_INVALID_ARGS, CUDBG_ERROR_UNINITIALIZED, CUDBG_ERROR_INTERNAL, CUDBG_ERROR_INVALID_WARP, CUDBG_ERROR_INITIALIZATION_FAILURE, CUDBG_ERROR_INVALID_GRID, CUDBG_ERROR_INVALID_CONTEXT, CUDBG_ERROR_RECURSIVE_API_CALL

CUDBGResult ( *CUDBGAPI_st::getElfImage32 )( uint32_t dev, uint32_t sm, uint32_t wp, bool  relocated, void* *elfImage, uint32_t* size )

Get the relocated or non-relocated ELF image and size for the grid on the given device. Behaves like getElfImage but will truncate the image size for cubins larger than 4GiB.

Since CUDA 3.0.

Note:

DEPRECATED in CUDA 4.0: Use getElfImage instead.

See also:

getElfImage

getElfImageByHandle

Parameters
dev
- device index
sm
- SM index
wp
- warp index
relocated
- set to true to specify the relocated ELF image, false otherwise
elfImage
- pointer to the ELF image
size
- size of the ELF image (32 bits)
Returns

CUDBG_SUCCESS, CUDBG_ERROR_INVALID_ARGS, CUDBG_ERROR_UNINITIALIZED, CUDBG_ERROR_INTERNAL, CUDBG_ERROR_INVALID_WARP, CUDBG_ERROR_INITIALIZATION_FAILURE, CUDBG_ERROR_INVALID_GRID, CUDBG_ERROR_INVALID_CONTEXT, CUDBG_ERROR_RECURSIVE_API_CALL

CUDBGResult ( *CUDBGAPI_st::getElfImageByHandle )( uint32_t dev, uint64_t handle, CUDBGElfImageType type, void* elfImage, uint64_t size )

Get the relocated or non-relocated ELF image for the given handle on the given device. The handle is provided in the ELF Image Loaded notification event.

Since CUDA 6.0.

See also:

getElfImage

getLoadedFunctionInfo

Parameters
dev
- device index
handle
- ELF image handle
type
- type of the requested ELF image
elfImage
- pointer to the ELF image
size
- size in bytes of the buffer pointed to by elfImage
Returns

CUDBG_SUCCESS, CUDBG_ERROR_INVALID_ARGS, CUDBG_ERROR_UNINITIALIZED, CUDBG_ERROR_INTERNAL, CUDBG_ERROR_INITIALIZATION_FAILURE, CUDBG_ERROR_RECURSIVE_API_CALL

CUDBGResult ( *CUDBGAPI_st::getErrorStringEx )( char* buf, uint32_t bufSz, uint32_t* msgSz )

Fills a user-provided buffer with an error message encoded as a null-terminated ASCII string. The error message is specific to the last failed API call and is invalidated after every API method call except this one. It's possible to query the size of the error message without reading it by passing 0 for both the `buf` and `bufSz` parameters. The `msgSz` parameter is optional unless 0 is passed in for `buf` and `bufSz`. CUDBG_ERROR_BUFFER_TOO_SMALL is returned when the passed-in buffer is too small to contain the message.

Since CUDA 12.2.

Parameters
buf
- the destination buffer
bufSz
- the size of the destination buffer in bytes
msgSz
- the size of the written error message including the terminating null character.
Returns

CUDBG_SUCCESS, CUDBG_ERROR_BUFFER_TOO_SMALL, CUDBG_ERROR_INVALID_ARGS, CUDBG_ERROR_UNINITIALIZED, CUDBG_ERROR_INTERNAL, CUDBG_ERROR_INITIALIZATION_FAILURE, CUDBG_ERROR_RECURSIVE_API_CALL
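
The size-query-then-read usage described for getErrorStringEx can be sketched as below. The enum is a stand-in so the code builds without cudadebugger.h, and `fetch_last_error`, `GetErrorStringExFn` and the mock are hypothetical. The return code of a pure size query is not stated above, so the sketch accepts either CUDBG_SUCCESS or CUDBG_ERROR_BUFFER_TOO_SMALL for that step, which is an assumption.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for the real enum; names mirror the reference, values are placeholders. */
typedef enum {
    CUDBG_SUCCESS,
    CUDBG_ERROR_BUFFER_TOO_SMALL,
    CUDBG_ERROR_INVALID_ARGS
} CUDBGResult;

/* Same parameter list as getErrorStringEx. */
typedef CUDBGResult (*GetErrorStringExFn)(char *buf, uint32_t bufSz, uint32_t *msgSz);

/* Hypothetical helper: returns a malloc'ed copy of the last error message, or NULL. */
char *fetch_last_error(GetErrorStringExFn getErrorStringEx)
{
    assert(getErrorStringEx != NULL);
    uint32_t needed = 0;

    /* Step 1: size query. buf and bufSz are 0, so msgSz is mandatory. */
    CUDBGResult res = getErrorStringEx(NULL, 0, &needed);
    if ((res != CUDBG_SUCCESS && res != CUDBG_ERROR_BUFFER_TOO_SMALL) || needed == 0)
        return NULL;

    char *buf = malloc(needed);
    if (buf == NULL)
        return NULL;

    /* Step 2: read. No other API method is called in between, since any
     * other call would invalidate the message. msgSz is optional here. */
    if (getErrorStringEx(buf, needed, NULL) != CUDBG_SUCCESS) {
        free(buf);
        return NULL;
    }
    return buf;                              /* caller frees */
}

/* Mock used only to exercise the helper; the message text is made up. */
CUDBGResult mock_get_error_string(char *buf, uint32_t bufSz, uint32_t *msgSz)
{
    static const char msg[] = "invalid warp";
    uint32_t need = (uint32_t)sizeof msg;    /* includes the terminating null */
    if (msgSz != NULL)
        *msgSz = need;
    if (buf == NULL)
        return (bufSz == 0 && msgSz != NULL) ? CUDBG_SUCCESS : CUDBG_ERROR_INVALID_ARGS;
    if (bufSz < need)
        return CUDBG_ERROR_BUFFER_TOO_SMALL;
    memcpy(buf, msg, need);
    return CUDBG_SUCCESS;
}
```

The helper should be called immediately after the failing API call, before any other method has a chance to overwrite the stored message.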

CUDBGResult ( *CUDBGAPI_st::getGridAttribute )( uint32_t dev, uint32_t sm, uint32_t wp, CUDBGAttribute attr, uint64_t* value )

Get the value of a grid attribute. See CUDBGAttribute for the list of available attributes.

Since CUDA 3.1.

See also:

getGridAttributes

Parameters
dev
- device index
sm
- SM index
wp
- warp index
attr
- the attribute
value
- the returned value of the attribute
Returns

CUDBG_SUCCESS, CUDBG_ERROR_INVALID_ARGS, CUDBG_ERROR_UNINITIALIZED, CUDBG_ERROR_INVALID_WARP, CUDBG_ERROR_INITIALIZATION_FAILURE, CUDBG_ERROR_INVALID_ATTRIBUTE, CUDBG_ERROR_INVALID_CONTEXT, CUDBG_ERROR_RECURSIVE_API_CALL

CUDBGResult ( *CUDBGAPI_st::getGridAttributes )( uint32_t dev, uint32_t sm, uint32_t wp, CUDBGAttributeValuePair* pairs, uint32_t numPairs )

Get several grid attribute values in a single API call. See CUDBGAttribute for the list of available attributes.

Since CUDA 3.1.

See also:

getGridAttribute

Parameters
dev
- device index
sm
- SM index
wp
- warp index
pairs
- array of attribute/value pairs
numPairs
- the number of attribute/value pairs in the array
Returns

CUDBG_SUCCESS, CUDBG_ERROR_INVALID_ARGS, CUDBG_ERROR_UNINITIALIZED, CUDBG_ERROR_INVALID_WARP, CUDBG_ERROR_INITIALIZATION_FAILURE, CUDBG_ERROR_INVALID_ATTRIBUTE, CUDBG_ERROR_INVALID_CONTEXT, CUDBG_ERROR_RECURSIVE_API_CALL

CUDBGResult ( *CUDBGAPI_st::getGridDim )( uint32_t dev, uint32_t sm, uint32_t wp, CuDim3* gridDim )

Get the dimensions in blocks of the given grid. Since CUDA 4.0.

See also:

getBlockDim

getClusterDim

Parameters
dev
- device index
sm
- SM index
wp
- warp index
gridDim
- the dimensions of the grid
Returns

CUDBG_SUCCESS, CUDBG_ERROR_INVALID_ARGS, CUDBG_ERROR_UNINITIALIZED, CUDBG_ERROR_INTERNAL, CUDBG_ERROR_INVALID_WARP, CUDBG_ERROR_INITIALIZATION_FAILURE, CUDBG_ERROR_INVALID_CONTEXT, CUDBG_ERROR_RECURSIVE_API_CALL

CUDBGResult ( *CUDBGAPI_st::getGridDim32 )( uint32_t dev, uint32_t sm, uint32_t wp, CuDim2* gridDim )

Get the dimensions of the given grid. Behaves like getGridDim but doesn't return the z dimension.

Since CUDA 3.0.

Note:

DEPRECATED in CUDA 4.0: Use getGridDim instead.

See also:

getGridDim

Parameters
dev
- device index
sm
- SM index
wp
- warp index
gridDim
- the returned number of blocks in the grid
Returns

CUDBG_SUCCESS, CUDBG_ERROR_INVALID_ARGS, CUDBG_ERROR_UNINITIALIZED, CUDBG_ERROR_INTERNAL, CUDBG_ERROR_INVALID_WARP, CUDBG_ERROR_INITIALIZATION_FAILURE, CUDBG_ERROR_INVALID_CONTEXT, CUDBG_ERROR_RECURSIVE_API_CALL

CUDBGResult ( *CUDBGAPI_st::getGridInfo )( uint32_t dev, uint64_t gridId64, CUDBGGridInfo* gridInfo )

Get information about the specified grid. Returns CUDBG_ERROR_INVALID_GRID if the context of the grid has already been destroyed, even if the grid ID itself is correct.

Since CUDA 12.7.

Parameters
dev
- device index
gridId64
- grid ID
gridInfo
- pointer to a client allocated structure in which grid info will be returned
Returns

CUDBG_SUCCESS, CUDBG_ERROR_INVALID_ARGS, CUDBG_ERROR_UNINITIALIZED, CUDBG_ERROR_INTERNAL, CUDBG_ERROR_INITIALIZATION_FAILURE, CUDBG_ERROR_INVALID_GRID, CUDBG_ERROR_INVALID_CONTEXT, CUDBG_ERROR_RECURSIVE_API_CALL

CUDBGResult ( *CUDBGAPI_st::getGridInfo120 )( uint32_t dev, uint64_t gridId64, CUDBGGridInfo120* gridInfo )

Get information about the specified grid. Behaves like getGridInfo, but returns less information. Returns CUDBG_ERROR_INVALID_GRID if the context of the grid has already been destroyed, even if the grid ID itself is correct.

Since CUDA 12.0.

Note:

DEPRECATED in CUDA 12.7: Use getGridInfo instead.

See also:

getGridInfo

Parameters
dev
- device index
gridId64
- grid ID
gridInfo
- pointer to a client allocated structure in which grid info will be returned
Returns

CUDBG_SUCCESS, CUDBG_ERROR_INVALID_ARGS, CUDBG_ERROR_UNINITIALIZED, CUDBG_ERROR_INTERNAL, CUDBG_ERROR_INITIALIZATION_FAILURE, CUDBG_ERROR_INVALID_GRID, CUDBG_ERROR_INVALID_CONTEXT, CUDBG_ERROR_RECURSIVE_API_CALL

CUDBGResult ( *CUDBGAPI_st::getGridInfo55 )( uint32_t dev, uint64_t gridId64, CUDBGGridInfo55* gridInfo )

Get information about the specified grid. Behaves like getGridInfo, but returns less information. Returns CUDBG_ERROR_INVALID_GRID if the context of the grid has already been destroyed, even if the grid ID itself is correct.

Since CUDA 5.5.

Note:

DEPRECATED in CUDA 12.0: Use getGridInfo instead.

See also:

getGridInfo

Parameters
dev
- device index
gridId64
- grid ID
gridInfo
- pointer to a client allocated structure in which grid info will be returned
Returns

CUDBG_SUCCESS, CUDBG_ERROR_INVALID_ARGS, CUDBG_ERROR_UNINITIALIZED, CUDBG_ERROR_INTERNAL, CUDBG_ERROR_INITIALIZATION_FAILURE, CUDBG_ERROR_INVALID_GRID, CUDBG_ERROR_INVALID_CONTEXT, CUDBG_ERROR_RECURSIVE_API_CALL

CUDBGResult ( *CUDBGAPI_st::getGridStatus )( uint32_t dev, uint64_t gridId64, CUDBGGridStatus* status )

Check whether the grid corresponding to the ID is still present on the device. Since CUDA 5.5.

Parameters
dev
- device index
gridId64
- 64-bit grid ID
status
- enum indicating the grid status
Returns

CUDBG_SUCCESS, CUDBG_ERROR_INVALID_ARGS, CUDBG_ERROR_UNINITIALIZED, CUDBG_ERROR_INTERNAL, CUDBG_ERROR_INITIALIZATION_FAILURE, CUDBG_ERROR_RECURSIVE_API_CALL

CUDBGResult ( *CUDBGAPI_st::getGridStatus50 )( uint32_t dev, uint32_t gridId, CUDBGGridStatus* status )

Check whether the grid corresponding to the ID is still present on the device. Behaves like getGridStatus, but takes a 32-bit grid ID instead of a 64-bit one.

Since CUDA 5.0.

Note:

DEPRECATED in CUDA 5.5: Use getGridStatus instead.

See also:

getGridStatus

Parameters
dev
- device index
gridId
- grid ID
status
- enum indicating the grid status
Returns

CUDBG_SUCCESS, CUDBG_ERROR_INVALID_ARGS, CUDBG_ERROR_UNINITIALIZED, CUDBG_ERROR_INTERNAL, CUDBG_ERROR_INITIALIZATION_FAILURE, CUDBG_ERROR_RECURSIVE_API_CALL

CUDBGResult ( *CUDBGAPI_st::getHardwareBarrierInfo )( uint32_t dev, uint32_t sm, uint32_t wp, uint32_t ln, CUDBGBarrierScope* scope, char* buf, uint32_t bufSz, uint32_t* msgSz )

Get hardware barrier information. Since CUDA 13.1.

Parameters
dev
- device index
sm
- SM index
wp
- warp index
ln
- lane index
scope
- barrier scope
buf
- buffer for the barrier information
bufSz
- buffer size
msgSz
- size of the returned message including the terminating null character; the pointer can be null
Returns

CUDBG_SUCCESS, CUDBG_ERROR_BUFFER_TOO_SMALL, CUDBG_ERROR_INVALID_ARGS, CUDBG_ERROR_UNINITIALIZED, CUDBG_ERROR_INTERNAL, CUDBG_ERROR_INVALID_WARP, CUDBG_ERROR_INITIALIZATION_FAILURE, CUDBG_ERROR_INVALID_CONTEXT, CUDBG_ERROR_RECURSIVE_API_CALL

CUDBGResult ( *CUDBGAPI_st::getHostAddrFromDeviceAddr )( uint32_t dev, uint64_t device_addr, uint64_t* host_addr )

Given a device virtual address, return a corresponding system memory virtual address. Since CUDA 4.1.

See also:

readGenericMemory

writeGenericMemory

Parameters
dev
- device index
device_addr
- device memory address
host_addr
- returned system memory address
Returns

CUDBG_SUCCESS, CUDBG_ERROR_INVALID_ARGS, CUDBG_ERROR_UNINITIALIZED, CUDBG_ERROR_INVALID_MEMORY_SEGMENT, CUDBG_ERROR_INITIALIZATION_FAILURE, CUDBG_ERROR_INVALID_CONTEXT, CUDBG_ERROR_RECURSIVE_API_CALL

CUDBGResult ( *CUDBGAPI_st::getLoadedFunctionInfo )( uint32_t dev, uint64_t handle, CUDBGLoadedFunctionInfo* info, uint32_t startIndex, uint32_t numEntries )

Get the section number and address of loaded functions for a given module. If the CUDBG_DEBUGGER_CAPABILITY_LAZY_FUNCTION_LOADING capability is enabled, CUDA loads functions lazily after the module has been reported. This method can be used to retrieve the lazily loaded functions.

Since CUDA 12.3.

Parameters
dev
- device index
handle
- ELF/cubin image handle
info
- information about loaded functions
startIndex
- start index of the entries to get
numEntries
- number of function load entries to read
Returns

CUDBG_SUCCESS, CUDBG_ERROR_INVALID_ARGS, CUDBG_ERROR_UNINITIALIZED, CUDBG_ERROR_INTERNAL, CUDBG_ERROR_INITIALIZATION_FAILURE, CUDBG_ERROR_RECURSIVE_API_CALL

CUDBGResult ( *CUDBGAPI_st::getLoadedFunctionInfo118 )( uint32_t dev, uint64_t handle, CUDBGLoadedFunctionInfo* info, uint32_t numEntries )

Get the section number and address of loaded functions for a given module. Behaves like getLoadedFunctionInfo but doesn't allow querying a subset of all lazily loaded functions.

Since CUDA 11.8.

Note:

DEPRECATED in CUDA 12.3: Use getLoadedFunctionInfo instead.

See also:

getLoadedFunctionInfo

Parameters
dev
- device index
handle
- ELF/cubin image handle
info
- information about loaded functions
numEntries
- number of function load entries to read
Returns

CUDBG_SUCCESS, CUDBG_ERROR_INVALID_ARGS, CUDBG_ERROR_UNINITIALIZED, CUDBG_ERROR_INTERNAL, CUDBG_ERROR_INITIALIZATION_FAILURE, CUDBG_ERROR_RECURSIVE_API_CALL

CUDBGResult ( *CUDBGAPI_st::getManagedMemoryRegionInfo )( uint64_t startAddress, CUDBGMemoryInfo* memoryInfo, uint32_t memoryInfo_size, uint32_t* numEntries )

Get a sorted list of managed memory regions. The list starts from the region containing the specified starting address. If the starting address is set to 0, the list starts from the managed memory region with the lowest start address.

Since CUDA 6.0.

Parameters
startAddress
- the address that the first region in the list must contain
memoryInfo
- client-allocated array of memory region records of type CUDBGMemoryInfo
memoryInfo_size
- number of records of type CUDBGMemoryInfo that memoryInfo can hold
numEntries
- pointer to a client-allocated variable holding the number of valid entries returned in memoryInfo
Returns

CUDBG_SUCCESS, CUDBG_ERROR_INVALID_ARGS, CUDBG_ERROR_UNINITIALIZED, CUDBG_ERROR_INTERNAL, CUDBG_ERROR_INITIALIZATION_FAILURE, CUDBG_ERROR_RECURSIVE_API_CALL

CUDBGResult ( *CUDBGAPI_st::getNextAsyncEvent50 )( CUDBGEvent50* event )

Copies the next available event in the asynchronous event queue into 'event' and removes it from the queue. Behaves like getNextEvent, but only handles ASYNC events and doesn't support the latest event struct format, so some fields won't be available.

Since CUDA 5.0.

Note:

DEPRECATED in CUDA 5.5: Use getNextEvent instead.

See also:

getNextEvent

Parameters
event
- pointer to an event container where to copy the event parameters
Returns

CUDBG_SUCCESS, CUDBG_ERROR_INVALID_ARGS, CUDBG_ERROR_UNINITIALIZED, CUDBG_ERROR_INTERNAL, CUDBG_ERROR_INITIALIZATION_FAILURE, CUDBG_ERROR_NO_EVENT_AVAILABLE, CUDBG_ERROR_INVALID_CONTEXT

CUDBGResult ( *CUDBGAPI_st::getNextAsyncEvent55 )( CUDBGEvent55* event )

Copies the next available event in the asynchronous event queue into 'event' and removes it from the queue. Behaves like getNextEvent, but only handles ASYNC events and doesn't support the latest event struct format, so some fields won't be available.

Since CUDA 5.5.

Note:

DEPRECATED in CUDA 6.0: Use getNextEvent instead.

See also:

getNextEvent

Parameters
event
- pointer to an event container where to copy the event parameters
Returns

CUDBG_SUCCESS, CUDBG_ERROR_INVALID_ARGS, CUDBG_ERROR_UNINITIALIZED, CUDBG_ERROR_INTERNAL, CUDBG_ERROR_INITIALIZATION_FAILURE, CUDBG_ERROR_NO_EVENT_AVAILABLE, CUDBG_ERROR_INVALID_CONTEXT

CUDBGResult ( *CUDBGAPI_st::getNextEvent )( CUDBGEventQueueType type, CUDBGEvent* event )

Copies the next available event into 'event' and removes it from the queue. CUDBG_ERROR_NO_EVENT_AVAILABLE is returned if the queue is empty. The ASYNC and SYNC queues are separate and each is ordered independently, so the relative order of ASYNC and SYNC events cannot be determined.

Since CUDA 6.0.

See also:

acknowledgeSyncEvents

setNotifyNewEventCallback

Parameters
type
- application event queue type
event
- pointer to an event container where to copy the event parameters
Returns

CUDBG_SUCCESS, CUDBG_ERROR_INVALID_ARGS, CUDBG_ERROR_UNINITIALIZED, CUDBG_ERROR_INITIALIZATION_FAILURE, CUDBG_ERROR_NO_EVENT_AVAILABLE
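
A minimal sketch of draining one queue with getNextEvent until CUDBG_ERROR_NO_EVENT_AVAILABLE is returned. The types below are stand-ins so the code builds without cudadebugger.h: the enumerator names are assumed to mirror the header, the numeric values are placeholders, and the one-field `CUDBGEvent` is a simplification. `drain_event_queue`, `GetNextEventFn`, `EventHandler` and the mock are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for the real definitions; values are placeholders. */
typedef enum {
    CUDBG_SUCCESS,
    CUDBG_ERROR_INVALID_ARGS,
    CUDBG_ERROR_NO_EVENT_AVAILABLE
} CUDBGResult;
typedef enum {
    CUDBG_EVENT_QUEUE_TYPE_SYNC,
    CUDBG_EVENT_QUEUE_TYPE_ASYNC
} CUDBGEventQueueType;
typedef struct { uint32_t kind; } CUDBGEvent;   /* the real struct carries more fields */

typedef CUDBGResult (*GetNextEventFn)(CUDBGEventQueueType type, CUDBGEvent *event);
typedef void (*EventHandler)(const CUDBGEvent *event);

/* Hypothetical helper: pop events of one queue type until the queue is empty.
 * Returns the number of events handled, or -1 on an unexpected error. */
int drain_event_queue(GetNextEventFn getNextEvent, CUDBGEventQueueType type,
                      EventHandler handle)
{
    assert(getNextEvent != NULL);
    int handled = 0;
    for (;;) {
        CUDBGEvent ev;
        CUDBGResult res = getNextEvent(type, &ev);
        if (res == CUDBG_ERROR_NO_EVENT_AVAILABLE)
            return handled;                  /* queue is empty: normal exit */
        if (res != CUDBG_SUCCESS)
            return -1;
        if (handle != NULL)
            handle(&ev);
        ++handled;
    }
}

/* Mock with two pending SYNC events and one pending ASYNC event. */
static uint32_t mock_pending[2] = { 2, 1 };

CUDBGResult mock_get_next_event(CUDBGEventQueueType type, CUDBGEvent *event)
{
    if (event == NULL ||
        (type != CUDBG_EVENT_QUEUE_TYPE_SYNC && type != CUDBG_EVENT_QUEUE_TYPE_ASYNC))
        return CUDBG_ERROR_INVALID_ARGS;
    if (mock_pending[type] == 0)
        return CUDBG_ERROR_NO_EVENT_AVAILABLE;
    mock_pending[type]--;
    event->kind = 0;
    return CUDBG_SUCCESS;
}
```

Each queue is drained on its own call, which matches the separate ordering of the two queues. After the SYNC queue has been processed, a client would typically follow up with acknowledgeSyncEvents.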

CUDBGResult ( *CUDBGAPI_st::getNextEvent30 )( CUDBGEvent30* event )

Copies the next available event in the event queue into 'event' and removes it from the queue. Behaves like getNextEvent, but only handles SYNC events and doesn't support the latest event struct format, so some fields won't be available.

Since CUDA 3.0.

Note:

DEPRECATED in CUDA 3.1: Use getNextEvent instead.

See also:

getNextEvent

Parameters
event
- pointer to an event container where to copy the event parameters
Returns

CUDBG_SUCCESS, CUDBG_ERROR_INVALID_ARGS, CUDBG_ERROR_UNINITIALIZED, CUDBG_ERROR_INTERNAL, CUDBG_ERROR_INITIALIZATION_FAILURE, CUDBG_ERROR_NO_EVENT_AVAILABLE, CUDBG_ERROR_INVALID_CONTEXT

CUDBGResult ( *CUDBGAPI_st::getNextEvent32 )( CUDBGEvent32* event )

Copies the next available event in the event queue into 'event' and removes it from the queue. Behaves like getNextEvent, but only handles SYNC events and doesn't support the latest event struct format, so some fields won't be available.

Since CUDA 3.1.

Note:

DEPRECATED in CUDA 4.0: Use getNextEvent instead.

See also:

getNextEvent

Parameters
event
- pointer to an event container where to copy the event parameters
Returns

CUDBG_SUCCESS, CUDBG_ERROR_INVALID_ARGS, CUDBG_ERROR_UNINITIALIZED, CUDBG_ERROR_INTERNAL, CUDBG_ERROR_INITIALIZATION_FAILURE, CUDBG_ERROR_NO_EVENT_AVAILABLE, CUDBG_ERROR_INVALID_CONTEXT

CUDBGResult ( *CUDBGAPI_st::getNextEvent42 )( CUDBGEvent42* event )

Copies the next available event in the event queue into 'event' and removes it from the queue. Behaves like getNextEvent, but only handles SYNC events and doesn't support the latest event struct format, so some fields won't be available.

Since CUDA 4.0.

Note:

DEPRECATED in CUDA 5.0: Use getNextEvent instead.

See also:

getNextEvent

Parameters
event
- pointer to an event container where to copy the event parameters
Returns

CUDBG_SUCCESS, CUDBG_ERROR_INVALID_ARGS, CUDBG_ERROR_UNINITIALIZED, CUDBG_ERROR_INTERNAL, CUDBG_ERROR_INITIALIZATION_FAILURE, CUDBG_ERROR_NO_EVENT_AVAILABLE, CUDBG_ERROR_INVALID_CONTEXT

CUDBGResult ( *CUDBGAPI_st::getNextSyncEvent50 )( CUDBGEvent50* event )

Copies the next available event in the synchronous event queue into 'event' and removes it from the queue. Behaves like getNextEvent, but only handles SYNC events and doesn't support the latest event struct format, so some fields won't be available.

Since CUDA 5.0.

Note:

DEPRECATED in CUDA 5.5: Use getNextEvent instead.

See also:

getNextEvent

Parameters
event
- pointer to an event container where to copy the event parameters
Returns

CUDBG_SUCCESS, CUDBG_ERROR_INVALID_ARGS, CUDBG_ERROR_UNINITIALIZED, CUDBG_ERROR_INTERNAL, CUDBG_ERROR_INITIALIZATION_FAILURE, CUDBG_ERROR_NO_EVENT_AVAILABLE, CUDBG_ERROR_INVALID_CONTEXT

CUDBGResult ( *CUDBGAPI_st::getNextSyncEvent55 )( CUDBGEvent55* event )

Copies the next available event in the synchronous event queue into 'event' and removes it from the queue. Behaves like getNextEvent, but only handles SYNC events and doesn't support the latest event struct format, so some fields won't be available.

Since CUDA 5.5.

Note:

DEPRECATED in CUDA 6.0: Use getNextEvent instead.

See also:

getNextEvent

Parameters
event
- pointer to an event container where to copy the event parameters
Returns

CUDBG_SUCCESS, CUDBG_ERROR_INVALID_ARGS, CUDBG_ERROR_UNINITIALIZED, CUDBG_ERROR_INTERNAL, CUDBG_ERROR_INITIALIZATION_FAILURE, CUDBG_ERROR_NO_EVENT_AVAILABLE, CUDBG_ERROR_INVALID_CONTEXT

CUDBGResult ( *CUDBGAPI_st::getNumDevices )( uint32_t* numDev )

Get the number of installed CUDA devices. This value is constant within a single session.

Since CUDA 3.0.

See also:

getNumLanes

getNumPredicates

getNumRegisters

getNumSMs

getNumUniformPredicates

getNumUniformRegisters

getNumWarps

Parameters
numDev
- the returned number of devices
Returns

CUDBG_SUCCESS, CUDBG_ERROR_INVALID_ARGS, CUDBG_ERROR_UNINITIALIZED, CUDBG_ERROR_INITIALIZATION_FAILURE, CUDBG_ERROR_RECURSIVE_API_CALL

CUDBGResult ( *CUDBGAPI_st::getNumLanes )( uint32_t dev, uint32_t* numLanes )

Get the number of lanes per warp on the device. This value is constant within a single session for a given device.

Since CUDA 3.0.

See also:

getNumDevices

getNumPredicates

getNumRegisters

getNumSMs

getNumUniformPredicates

getNumUniformRegisters

getNumWarps

Parameters
dev
- device index
numLanes
- the returned number of lanes
Returns

CUDBG_SUCCESS, CUDBG_ERROR_INVALID_ARGS, CUDBG_ERROR_UNINITIALIZED, CUDBG_ERROR_INITIALIZATION_FAILURE, CUDBG_ERROR_RECURSIVE_API_CALL

CUDBGResult ( *CUDBGAPI_st::getNumPredicates )( uint32_t dev, uint32_t* numPredicates )

Get the number of predicate registers per lane on the device. This value is constant within a single session for a given device.

Since CUDA 6.5.

See also:

getNumDevices

getNumLanes

getNumRegisters

getNumSMs

getNumUniformPredicates

getNumUniformRegisters

getNumWarps

Parameters
dev
- device index
numPredicates
- the returned number of predicate registers
Returns

CUDBG_SUCCESS, CUDBG_ERROR_INVALID_ARGS, CUDBG_ERROR_UNINITIALIZED, CUDBG_ERROR_INITIALIZATION_FAILURE, CUDBG_ERROR_RECURSIVE_API_CALL

CUDBGResult ( *CUDBGAPI_st::getNumRegisters )( uint32_t dev, uint32_t* numRegs )

Get the maximum number of registers per lane on the device. This value is constant within a single session for a given device. The actual number of registers can vary per warp; use readWarpResources to query that number dynamically.

Since CUDA 3.0.

See also:

getNumDevices

getNumLanes

getNumPredicates

getNumSMs

getNumUniformPredicates

getNumUniformRegisters

getNumWarps

readWarpResources

Parameters
dev
- device index
numRegs
- the returned number of registers
Returns

CUDBG_SUCCESS, CUDBG_ERROR_INVALID_ARGS, CUDBG_ERROR_UNINITIALIZED, CUDBG_ERROR_INITIALIZATION_FAILURE, CUDBG_ERROR_RECURSIVE_API_CALL

CUDBGResult ( *CUDBGAPI_st::getNumSMs )( uint32_t dev, uint32_t* numSMs )

Get the total number of SMs on the device. This value is constant within a single session for a given device.

Since CUDA 3.0.

See also:

getNumDevices

getNumLanes

getNumPredicates

getNumRegisters

getNumUniformPredicates

getNumUniformRegisters

getNumWarps

Parameters
dev
- device index
numSMs
- the returned number of SMs
Returns

CUDBG_SUCCESS, CUDBG_ERROR_INVALID_ARGS, CUDBG_ERROR_UNINITIALIZED, CUDBG_ERROR_INITIALIZATION_FAILURE, CUDBG_ERROR_RECURSIVE_API_CALL

CUDBGResult ( *CUDBGAPI_st::getNumUniformPredicates )( uint32_t dev, uint32_t* numPredicates )

Get the number of uniform predicate registers per warp on the device. This value is constant within a single session for a given device.

Since CUDA 10.0.

See also:

getNumDevices

getNumLanes

getNumPredicates

getNumRegisters

getNumSMs

getNumUniformRegisters

getNumWarps

Parameters
dev
- device index
numPredicates
- the returned number of uniform predicate registers
Returns

CUDBG_SUCCESS, CUDBG_ERROR_INVALID_ARGS, CUDBG_ERROR_UNINITIALIZED, CUDBG_ERROR_INITIALIZATION_FAILURE, CUDBG_ERROR_RECURSIVE_API_CALL

CUDBGResult ( *CUDBGAPI_st::getNumUniformRegisters )( uint32_t dev, uint32_t* numRegs )

Get the number of uniform registers per warp on the device. This value is constant within a single session for a given device.

Since CUDA 10.0.

See also:

getNumDevices

getNumLanes

getNumPredicates

getNumRegisters

getNumSMs

getNumUniformPredicates

getNumWarps

Parameters
dev
- device index
numRegs
- the returned number of uniform registers
Returns

CUDBG_SUCCESS, CUDBG_ERROR_INVALID_ARGS, CUDBG_ERROR_UNINITIALIZED, CUDBG_ERROR_INITIALIZATION_FAILURE, CUDBG_ERROR_RECURSIVE_API_CALL

CUDBGResult ( *CUDBGAPI_st::getNumWarps )( uint32_t dev, uint32_t* numWarps )

Get the number of warps per SM on the device. This value is constant within a single session for a given device.

Since CUDA 3.0.

See also:

getNumDevices

getNumLanes

getNumPredicates

getNumRegisters

getNumSMs

getNumUniformPredicates

getNumUniformRegisters

Parameters
dev
- device index
numWarps
- the returned number of warps
Returns

CUDBG_SUCCESS, CUDBG_ERROR_INVALID_ARGS, CUDBG_ERROR_UNINITIALIZED, CUDBG_ERROR_INITIALIZATION_FAILURE, CUDBG_ERROR_RECURSIVE_API_CALL
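
The getNum* methods together describe the (dev, sm, wp, ln) coordinate space used throughout this API. The sketch below walks that space once to count the lane slots across all devices. `SketchApi` is a hypothetical stand-in holding only the four function pointers used here, the enum values are placeholders, and the mock topology numbers are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the real enum; values are placeholders. */
typedef enum { CUDBG_SUCCESS, CUDBG_ERROR_INVALID_ARGS } CUDBGResult;

/* Hypothetical stand-in for the four CUDBGAPI_st members used below. */
typedef struct {
    CUDBGResult (*getNumDevices)(uint32_t *numDev);
    CUDBGResult (*getNumSMs)(uint32_t dev, uint32_t *numSMs);
    CUDBGResult (*getNumWarps)(uint32_t dev, uint32_t *numWarps);
    CUDBGResult (*getNumLanes)(uint32_t dev, uint32_t *numLanes);
} SketchApi;

/* Sum SMs x warps-per-SM x lanes-per-warp over every device. */
bool count_lane_slots(const SketchApi *api, uint64_t *total)
{
    assert(api != NULL && total != NULL);
    uint32_t numDev = 0;
    if (api->getNumDevices(&numDev) != CUDBG_SUCCESS)
        return false;

    uint64_t sum = 0;
    for (uint32_t dev = 0; dev < numDev; ++dev) {
        uint32_t sms = 0, warps = 0, lanes = 0;
        if (api->getNumSMs(dev, &sms) != CUDBG_SUCCESS ||
            api->getNumWarps(dev, &warps) != CUDBG_SUCCESS ||
            api->getNumLanes(dev, &lanes) != CUDBG_SUCCESS)
            return false;
        sum += (uint64_t)sms * warps * lanes;
    }
    *total = sum;
    return true;
}

/* Mock topology (made-up numbers): two devices with different SM and warp counts. */
CUDBGResult mock_num_devices(uint32_t *n) { *n = 2; return CUDBG_SUCCESS; }
CUDBGResult mock_num_sms(uint32_t dev, uint32_t *n) { *n = (dev == 0) ? 4 : 2; return CUDBG_SUCCESS; }
CUDBGResult mock_num_warps(uint32_t dev, uint32_t *n) { *n = (dev == 0) ? 64 : 48; return CUDBG_SUCCESS; }
CUDBGResult mock_num_lanes(uint32_t dev, uint32_t *n) { (void)dev; *n = 32; return CUDBG_SUCCESS; }

const SketchApi mock_api = { mock_num_devices, mock_num_sms, mock_num_warps, mock_num_lanes };
```

Because all four values are constant within a session, a client can query them once at startup and reuse them when sizing per-warp or per-lane tables.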

CUDBGResult ( *CUDBGAPI_st::getPhysicalRegister30 )( uint64_t pc, char* reg, uint32_t* buf, uint32_t sz, uint32_t* numPhysRegs, CUDBGRegClass* regClass )

Get the physical register number(s) assigned to a virtual register name at a given PC, if it's live at that PC. Since CUDA 3.0.

Note:

DEPRECATED in CUDA 3.1: Do not use.

Parameters
pc
- program counter
reg
- virtual register name
buf
- physical register name(s)
sz
- the physical register name buffer size
numPhysRegs
- number of physical register names returned
regClass
- the class of the physical registers
Returns

CUDBG_SUCCESS, CUDBG_ERROR_BUFFER_TOO_SMALL, CUDBG_ERROR_UNKNOWN_FUNCTION, CUDBG_ERROR_INVALID_ARGS, CUDBG_ERROR_UNINITIALIZED, CUDBG_ERROR_INTERNAL, CUDBG_ERROR_INITIALIZATION_FAILURE, CUDBG_ERROR_INVALID_CONTEXT, CUDBG_ERROR_RECURSIVE_API_CALL

CUDBGResult ( *CUDBGAPI_st::getPhysicalRegister40 )( uint32_t dev, uint32_t sm, uint32_t wp, uint64_t pc, char* reg, uint32_t* buf, uint32_t sz, uint32_t* numPhysRegs, CUDBGRegClass* regClass )

Get the physical register number(s) assigned to a virtual register name at a given PC, if it's live at that PC. This method is deprecated; the PTX to SASS mappings can be read from the cubin directly instead.

Since CUDA 3.1.

Note:

DEPRECATED in CUDA 4.1: Do not use.

Parameters
dev
- device index
sm
- SM index
wp
- warp index
pc
- Program counter
reg
- virtual register index
buf
- physical register name(s)
sz
- the physical register name buffer size
numPhysRegs
- number of physical register names returned
regClass
- the class of the physical registers
Returns

CUDBG_SUCCESS, CUDBG_ERROR_BUFFER_TOO_SMALL, CUDBG_ERROR_UNKNOWN_FUNCTION, CUDBG_ERROR_INVALID_ARGS, CUDBG_ERROR_UNINITIALIZED, CUDBG_ERROR_INTERNAL, CUDBG_ERROR_INITIALIZATION_FAILURE, CUDBG_ERROR_INVALID_CONTEXT, CUDBG_ERROR_RECURSIVE_API_CALL

CUDBGResult ( *CUDBGAPI_st::getSmType )( uint32_t dev, char* buf, uint32_t sz )

Get the SM type of the device. Returns CUDBG_ERROR_BUFFER_TOO_SMALL if the provided buffer is not large enough. This value is constant within a single session for a given device.

Since CUDA 3.0.

See also:

getDeviceName

getDeviceType

Parameters
dev
- device index
buf
- the destination buffer
sz
- buffer size in bytes
Returns

CUDBG_SUCCESS, CUDBG_ERROR_BUFFER_TOO_SMALL, CUDBG_ERROR_INVALID_ARGS, CUDBG_ERROR_UNINITIALIZED, CUDBG_ERROR_INITIALIZATION_FAILURE, CUDBG_ERROR_RECURSIVE_API_CALL
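
The buffer-size behavior described above suggests a retry loop. The following is a minimal sketch, not the API's own code: it compiles without cudadebugger.h by using int in place of CUDBGResult, the hypothetical constants SKETCH_SUCCESS and SKETCH_BUFFER_TOO_SMALL in place of the real result codes, and a fake getSmType implementation. It also assumes the returned string is NUL-terminated.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Stand-ins for CUDBG_SUCCESS and CUDBG_ERROR_BUFFER_TOO_SMALL; the
 * numeric values are assumptions made for this sketch only. */
enum { SKETCH_SUCCESS = 0, SKETCH_BUFFER_TOO_SMALL = 1 };

/* Function pointer with the shape of getSmType (int replaces CUDBGResult). */
typedef int (*get_sm_type_fn)(uint32_t dev, char *buf, uint32_t sz);

/* Calls getSmType with a growing buffer until it fits.
 * Returns a malloc'd string the caller must free, or NULL on failure. */
static char *fetch_sm_type(get_sm_type_fn getSmType, uint32_t dev)
{
    uint32_t sz = 8;
    assert(getSmType != NULL);
    for (;;) {
        char *buf = malloc(sz);
        if (buf == NULL) {
            return NULL;
        }
        int rc = getSmType(dev, buf, sz);
        if (rc == SKETCH_SUCCESS) {
            return buf;
        }
        free(buf);
        /* Give up on any other error, or once the buffer is unreasonably large. */
        if (rc != SKETCH_BUFFER_TOO_SMALL || sz > 4096) {
            return NULL;
        }
        sz *= 2;
    }
}

/* Fake implementation used only to exercise the sketch. */
static int fake_get_sm_type(uint32_t dev, char *buf, uint32_t sz)
{
    static const char name[] = "sm_hypothetical";
    (void)dev;
    if (sz < sizeof name) {
        return SKETCH_BUFFER_TOO_SMALL;
    }
    memcpy(buf, name, sizeof name);
    return SKETCH_SUCCESS;
}
```

Because the value is constant within a session for a given device, a caller could cache the result of fetch_sm_type per device instead of querying repeatedly.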

CUDBGResult ( *CUDBGAPI_st::getSupportedDebuggerCapabilities )( CUDBGCapabilityFlags* capabilities )

Returns debug agent capabilities that are supported by this version of the API. This API method can be called without initializing the API.

Since CUDA 12.5.

Parameters
capabilities
- returned debug engine capabilities
Returns

CUDBG_SUCCESS, CUDBG_ERROR_INVALID_ARGS, CUDBG_ERROR_INITIALIZATION_FAILURE, CUDBG_ERROR_RECURSIVE_API_CALL

CUDBGResult ( *CUDBGAPI_st::getTID )( uint32_t dev, uint32_t sm, uint32_t wp, uint32_t* tid )

Get the ID of the Linux thread hosting the CUDA context active at the given coordinates.

Since CUDA 3.0.

See also:

getGridAttributes

Parameters
dev
- device index
sm
- SM index
wp
- warp index
tid
- the returned thread id
Returns

CUDBG_SUCCESS, CUDBG_ERROR_INVALID_ARGS, CUDBG_ERROR_UNINITIALIZED, CUDBG_ERROR_INVALID_WARP, CUDBG_ERROR_INITIALIZATION_FAILURE, CUDBG_ERROR_INVALID_CONTEXT, CUDBG_ERROR_RECURSIVE_API_CALL

CUDBGResult ( *CUDBGAPI_st::getWarpHitBreakpoint )( uint32_t dev, uint32_t sm, uint32_t wp, CUDBGBreakpointHandle* handle )

Get the handle of the breakpoint that the given warp hit. An error is returned if the warp did not hit a breakpoint. Use readBrokenWarps() to check if the warp is broken before calling this method. Some breakpoint handles are special; see the documentation of CUDBGBreakpointHandle for more details.

Since CUDA 13.2.

See also:

disableBreakpoint

enableBreakpoint

insertBreakpoint

isBreakpointEnabled

readBrokenWarps

removeBreakpoint

Parameters
dev
- device index
sm
- SM index
wp
- warp index
handle
- the returned breakpoint handle
Returns

CUDBG_SUCCESS, CUDBG_ERROR_INVALID_ARGS, CUDBG_ERROR_UNINITIALIZED, CUDBG_ERROR_INTERNAL, CUDBG_ERROR_INVALID_WARP, CUDBG_ERROR_INITIALIZATION_FAILURE, CUDBG_ERROR_INVALID_CONTEXT, CUDBG_ERROR_RECURSIVE_API_CALL
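
The check recommended above can be sketched as a bit test on the mask returned by readBrokenWarps(). The helper name warp_is_broken is hypothetical and not part of the API, and the sketch assumes that bit i of the mask corresponds to warp index i.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper (not part of the debugger API): returns true if the
 * bit for warp `wp` is set in a mask as filled in by readBrokenWarps().
 * Assumption: bit i of the mask corresponds to warp index i. */
static bool warp_is_broken(uint64_t brokenWarpsMask, uint32_t wp)
{
    assert(wp < 64); /* a 64-bit mask can describe at most 64 warps */
    return ((brokenWarpsMask >> wp) & 1u) != 0;
}
```

A caller would then invoke getWarpHitBreakpoint(dev, sm, wp, &handle) only for warps where warp_is_broken returns true.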

CUDBGResult ( *CUDBGAPI_st::initialize )( )

Initialize the API. setNotifyNewEventCallback() and getSupportedDebuggerCapabilities() can be called before initialize(). If no CUDA devices are detected on the system, CUDBG_ERROR_NO_DEVICE_AVAILABLE is returned.

Since CUDA 3.0.

See also:

finalize

Returns

CUDBG_SUCCESS, CUDBG_ERROR_INITIALIZATION_FAILURE, CUDBG_ERROR_NO_DEVICE_AVAILABLE

CUDBGResult ( *CUDBGAPI_st::initializeAttachStub )( )

Initialize the attach stub. This is no longer necessary starting with driver version r590.

Since CUDA 5.0.

Returns

CUDBG_SUCCESS

CUDBGResult ( *CUDBGAPI_st::insertBreakpoint )( uint32_t dev, uint64_t addr, CUDBGBreakpointHandle* handle )

Set a breakpoint at the given instruction address for the given device. Before setting a breakpoint, getAdjustedCodeAddress() should be called to get the adjusted breakpoint address. The returned handle can be used to enable/disable/remove the breakpoint.

Since CUDA 13.2.

See also:

disableBreakpoint

enableBreakpoint

getWarpHitBreakpoint

isBreakpointEnabled

removeBreakpoint

Parameters
dev
- the device index
addr
- instruction address
handle
- the returned breakpoint handle
Returns

CUDBG_SUCCESS, CUDBG_ERROR_INVALID_ARGS, CUDBG_ERROR_UNINITIALIZED, CUDBG_ERROR_INTERNAL, CUDBG_ERROR_INVALID_ADDRESS, CUDBG_ERROR_INITIALIZATION_FAILURE, CUDBG_ERROR_RECURSIVE_API_CALL, CUDBG_ERROR_BREAKPOINT_STATE_CONFLICT
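
The adjust-then-insert sequence described above might look like the following sketch. It is written against stand-in function pointers so it compiles without cudadebugger.h: int replaces CUDBGResult (0 meaning success), int replaces CUDBGAdjAddrAction, and sketch_bp_handle replaces CUDBGBreakpointHandle. All of these, and the fake implementations, are assumptions for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t sketch_bp_handle; /* stand-in for CUDBGBreakpointHandle */

/* Function pointers with the shapes of getAdjustedCodeAddress and
 * insertBreakpoint (int replaces CUDBGResult and CUDBGAdjAddrAction). */
typedef int (*get_adjusted_code_address_fn)(uint32_t dev, uint64_t address,
                                            uint64_t *adjustedAddress,
                                            int adjAction);
typedef int (*insert_breakpoint_fn)(uint32_t dev, uint64_t addr,
                                    sketch_bp_handle *handle);

/* Adjusts `addr` first, then inserts the breakpoint at the adjusted
 * address. Returns 0 on success, or the first non-zero result code. */
static int insert_adjusted_breakpoint(get_adjusted_code_address_fn adjust,
                                      insert_breakpoint_fn insert,
                                      uint32_t dev, uint64_t addr,
                                      int adjAction, sketch_bp_handle *handle)
{
    uint64_t adjusted = 0;
    assert(adjust != NULL && insert != NULL && handle != NULL);
    int rc = adjust(dev, addr, &adjusted, adjAction);
    if (rc != 0) {
        return rc;
    }
    return insert(dev, adjusted, handle);
}

/* Fake adjuster for the sketch: rounds up to a 16-byte boundary.
 * The real adjustment rule is not specified here. */
static int fake_adjust(uint32_t dev, uint64_t address,
                       uint64_t *adjustedAddress, int adjAction)
{
    (void)dev;
    (void)adjAction;
    *adjustedAddress = (address + 15u) & ~(uint64_t)15u;
    return 0;
}

/* Fake inserter for the sketch: uses the address itself as the handle. */
static int fake_insert(uint32_t dev, uint64_t addr, sketch_bp_handle *handle)
{
    (void)dev;
    *handle = addr;
    return 0;
}
```

The adjust action is passed through from the caller so that the sketch does not commit to any particular CUDBGAdjAddrAction value.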

CUDBGResult ( *CUDBGAPI_st::isBreakpointEnabled )( CUDBGBreakpointHandle handle, uint32_t* enabled )

Check if a breakpoint specified by its handle is enabled. Since CUDA 13.2.

See also:

disableBreakpoint

enableBreakpoint

getWarpHitBreakpoint

insertBreakpoint

removeBreakpoint

Parameters
handle
- the breakpoint handle
enabled
- whether the breakpoint is enabled
Returns

CUDBG_SUCCESS, CUDBG_ERROR_INVALID_ARGS, CUDBG_ERROR_UNINITIALIZED, CUDBG_ERROR_INTERNAL, CUDBG_ERROR_INITIALIZATION_FAILURE, CUDBG_ERROR_RECURSIVE_API_CALL

CUDBGResult ( *CUDBGAPI_st::isDeviceCodeAddress )( uintptr_t addr, bool* isDeviceAddress )

Determine whether a virtual address resides within device code. Since CUDA 3.0.

Parameters
addr
- virtual address
isDeviceAddress
- true if address resides within device code
Returns

CUDBG_SUCCESS, CUDBG_ERROR_INVALID_ARGS, CUDBG_ERROR_UNINITIALIZED, CUDBG_ERROR_INTERNAL, CUDBG_ERROR_INITIALIZATION_FAILURE, CUDBG_ERROR_RECURSIVE_API_CALL

CUDBGResult ( *CUDBGAPI_st::isDeviceCodeAddress55 )( uintptr_t addr, bool* isDeviceAddress )

Determine whether a virtual address resides within device code. Behaves exactly like isDeviceCodeAddress.

Since CUDA 3.0.

Note:

DEPRECATED in CUDA 6.0: Use isDeviceCodeAddress instead.

See also:

isDeviceCodeAddress

Parameters
addr
- virtual address
isDeviceAddress
- true if address resides within device code
Returns

CUDBG_SUCCESS, CUDBG_ERROR_INVALID_ARGS, CUDBG_ERROR_UNINITIALIZED, CUDBG_ERROR_INTERNAL, CUDBG_ERROR_INITIALIZATION_FAILURE, CUDBG_ERROR_RECURSIVE_API_CALL

CUDBGResult ( *CUDBGAPI_st::lookupDeviceCodeSymbol )( char* symName, bool* symFound, uintptr_t* symAddr )

Determines whether a symbol represents a function in device code and returns its virtual address. Since CUDA 3.0.

Parameters
symName
- symbol name
symFound
- set to true if the symbol is found
symAddr
- the symbol virtual address if found
Returns

CUDBG_SUCCESS, CUDBG_ERROR_UNKNOWN_FUNCTION, CUDBG_ERROR_INVALID_ARGS, CUDBG_ERROR_UNINITIALIZED, CUDBG_ERROR_INTERNAL, CUDBG_ERROR_INITIALIZATION_FAILURE, CUDBG_ERROR_RECURSIVE_API_CALL

CUDBGResult ( *CUDBGAPI_st::memcheckReadErrorAddress )( uint32_t dev, uint32_t sm, uint32_t wp, uint32_t ln, uint64_t* address, ptxStorageKind* storage )

Get the address that memcheck detected an error on. Will always return CUDBG_ERROR_NOT_SUPPORTED.

Since CUDA 5.0.

Note:

DEPRECATED in CUDA 12.0: Do not use.

Parameters
dev
- device index
sm
- SM index
wp
- warp index
ln
- lane index
address
- returned address detected by memcheck
storage
- returned address class of address
Returns

CUDBG_ERROR_INVALID_ARGS, CUDBG_ERROR_UNINITIALIZED, CUDBG_ERROR_INITIALIZATION_FAILURE, CUDBG_ERROR_RECURSIVE_API_CALL, CUDBG_ERROR_NOT_SUPPORTED

CUDBGResult ( *CUDBGAPI_st::readActiveLanes )( uint32_t dev, uint32_t sm, uint32_t wp, uint32_t* activeLanesMask )

Read the lane bitmask of active threads on a valid warp. Since CUDA 3.0.

See also:

readBlockIdx

readBrokenWarps

readGridId

readThreadIdx

readValidLanes

readValidWarps

Parameters
dev
- device index
sm
- SM index
wp
- warp index
activeLanesMask
- the returned bitmask of active threads
Returns

CUDBG_SUCCESS, CUDBG_ERROR_INVALID_ARGS, CUDBG_ERROR_UNINITIALIZED, CUDBG_ERROR_INVALID_WARP, CUDBG_ERROR_INITIALIZATION_FAILURE, CUDBG_ERROR_INVALID_CONTEXT, CUDBG_ERROR_RECURSIVE_API_CALL
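
As an illustration of consuming the returned bitmask, the sketch below expands activeLanesMask into a list of lane indices. The helper name active_lanes is hypothetical, and the sketch assumes that bit i of the mask corresponds to lane i.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: expands a lane bitmask, as returned in
 * activeLanesMask, into a list of lane indices.
 * Assumption: bit i of the mask corresponds to lane i.
 * `lanes` must have room for 32 entries; returns the number written. */
static uint32_t active_lanes(uint32_t activeLanesMask, uint32_t lanes[32])
{
    uint32_t count = 0;
    assert(lanes != NULL);
    for (uint32_t ln = 0; ln < 32; ++ln) {
        if ((activeLanesMask >> ln) & 1u) {
            lanes[count++] = ln;
        }
    }
    return count;
}
```

The resulting indices can be passed as the ln argument of per-lane methods such as readPC or readRegister.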

CUDBGResult ( *CUDBGAPI_st::readAllVirtualReturnAddresses )( uint32_t dev, uint32_t sm, uint32_t wp, uint32_t ln, uint64_t* addrs, uint32_t numAddrs, uint32_t* callDepth, uint32_t* syscallCallDepth )

Read all the virtual return addresses for a thread (the full backtrace). Note that syscallCallDepth is always set to 0.

Since CUDA 12.5.

See also:

readCallDepth

readReturnAddress

Parameters
dev
- device index
sm
- SM index
wp
- warp index
ln
- lane index
addrs
- the returned addresses array
numAddrs
- number of elements in addrs array
callDepth
- the returned call depth
syscallCallDepth
- the returned syscall call depth
Returns

CUDBG_SUCCESS, CUDBG_ERROR_INVALID_ARGS, CUDBG_ERROR_UNINITIALIZED, CUDBG_ERROR_INTERNAL, CUDBG_ERROR_INVALID_WARP, CUDBG_ERROR_INITIALIZATION_FAILURE, CUDBG_ERROR_INVALID_CONTEXT, CUDBG_ERROR_RECURSIVE_API_CALL

CUDBGResult ( *CUDBGAPI_st::readBlockIdx )( uint32_t dev, uint32_t sm, uint32_t wp, CuDim3* blockIdx )

Read the CUDA block index running on a valid warp. Since CUDA 4.0.

See also:

readActiveLanes

readBrokenWarps

readGridId

readThreadIdx

readValidLanes

readValidWarps

Parameters
dev
- device index
sm
- SM index
wp
- warp index
blockIdx
- the returned CUDA block index
Returns

CUDBG_SUCCESS, CUDBG_ERROR_INVALID_ARGS, CUDBG_ERROR_UNINITIALIZED, CUDBG_ERROR_INTERNAL, CUDBG_ERROR_INVALID_WARP, CUDBG_ERROR_INITIALIZATION_FAILURE, CUDBG_ERROR_INVALID_CONTEXT, CUDBG_ERROR_RECURSIVE_API_CALL

CUDBGResult ( *CUDBGAPI_st::readBlockIdx32 )( uint32_t dev, uint32_t sm, uint32_t wp, CuDim2* blockIdx )

Read the two-dimensional CUDA block index running on a valid warp. Behaves like readBlockIdx but doesn't return the z dimension.

Since CUDA 3.0.

Note:

DEPRECATED in CUDA 4.0: Use readBlockIdx instead.

See also:

readBlockIdx

Parameters
dev
- device index
sm
- SM index
wp
- warp index
blockIdx
- the returned CUDA block index
Returns

CUDBG_SUCCESS, CUDBG_ERROR_INVALID_ARGS, CUDBG_ERROR_UNINITIALIZED, CUDBG_ERROR_INTERNAL, CUDBG_ERROR_INVALID_WARP, CUDBG_ERROR_INITIALIZATION_FAILURE, CUDBG_ERROR_INVALID_CONTEXT, CUDBG_ERROR_RECURSIVE_API_CALL

CUDBGResult ( *CUDBGAPI_st::readBrokenWarps )( uint32_t dev, uint32_t sm, uint64_t* brokenWarpsMask )

Read the bitmask of warps that are at a breakpoint on a given SM. Since CUDA 3.0.

See also:

readActiveLanes

readBlockIdx

readGridId

readThreadIdx

readValidLanes

readValidWarps

Parameters
dev
- device index
sm
- SM index
brokenWarpsMask
- the returned bitmask of broken warps
Returns

CUDBG_SUCCESS, CUDBG_ERROR_INVALID_ARGS, CUDBG_ERROR_UNINITIALIZED, CUDBG_ERROR_INITIALIZATION_FAILURE, CUDBG_ERROR_RECURSIVE_API_CALL

CUDBGResult ( *CUDBGAPI_st::readCCRegister )( uint32_t dev, uint32_t sm, uint32_t wp, uint32_t ln, uint32_t* val )

Read the hardware CC register. The CC register is no longer available in the supported hardware.

Since CUDA 6.5.

Note:

DEPRECATED in CUDA 13.1: Do not use.

Parameters
dev
- device index
sm
- SM index
wp
- warp index
ln
- lane index
val
- the returned value of the CC register
Returns

CUDBG_SUCCESS, CUDBG_ERROR_INVALID_ARGS, CUDBG_ERROR_UNINITIALIZED, CUDBG_ERROR_INITIALIZATION_FAILURE, CUDBG_ERROR_RECURSIVE_API_CALL

CUDBGResult ( *CUDBGAPI_st::readCPUCallStack )( uint32_t dev, uint64_t gridId64, uint64_t* addrs, uint32_t numAddrs, uint32_t* totalNumAddrs )

Read the CPU call stack captured at the time of kernel launch. This method only works if the CUDBG_DEBUGGER_CAPABILITY_COLLECT_CPU_CALL_STACK_FOR_KERNEL_LAUNCHES capability is enabled.

Since CUDA 12.9.

Parameters
dev
- device index
gridId64
- 64-bit grid ID
addrs
- the returned addresses array, can be NULL
numAddrs
- capacity of addrs (possibly 0)
totalNumAddrs
- the actual size of the stack (number of frames) is written here; the value written can be greater than numAddrs
Returns

CUDBG_SUCCESS, CUDBG_ERROR_INVALID_ARGS, CUDBG_ERROR_UNINITIALIZED, CUDBG_ERROR_INTERNAL, CUDBG_ERROR_INITIALIZATION_FAILURE, CUDBG_ERROR_INVALID_GRID, CUDBG_ERROR_INVALID_CONTEXT, CUDBG_ERROR_RECURSIVE_API_CALL
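
Since addrs can be NULL and totalNumAddrs can exceed numAddrs, one plausible usage is a two-call pattern: query the size first, then read into a buffer of that size. The sketch below assumes this pattern is acceptable to the API. It uses a stand-in function pointer (int replaces CUDBGResult, 0 meaning success) and a fake implementation so that it compiles without cudadebugger.h.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Function pointer with the shape of readCPUCallStack.
 * `int` replaces CUDBGResult here, with 0 meaning success; both are
 * assumptions made so the sketch compiles without cudadebugger.h. */
typedef int (*read_cpu_call_stack_fn)(uint32_t dev, uint64_t gridId64,
                                      uint64_t *addrs, uint32_t numAddrs,
                                      uint32_t *totalNumAddrs);

/* Queries the stack size first (addrs == NULL, numAddrs == 0), then
 * allocates a buffer of that size and reads the frames.
 * Returns a malloc'd array the caller must free, or NULL on failure or
 * when the stack is empty; *numFrames receives the number of valid frames. */
static uint64_t *fetch_cpu_call_stack(read_cpu_call_stack_fn readStack,
                                      uint32_t dev, uint64_t gridId64,
                                      uint32_t *numFrames)
{
    uint32_t total = 0;
    assert(readStack != NULL && numFrames != NULL);
    *numFrames = 0;

    if (readStack(dev, gridId64, NULL, 0, &total) != 0 || total == 0) {
        return NULL;
    }
    uint32_t capacity = total;
    uint64_t *addrs = malloc((size_t)capacity * sizeof *addrs);
    if (addrs == NULL) {
        return NULL;
    }
    if (readStack(dev, gridId64, addrs, capacity, &total) != 0) {
        free(addrs);
        return NULL;
    }
    /* Report only the frames that actually fit in the buffer. */
    *numFrames = total < capacity ? total : capacity;
    return addrs;
}

/* Fake implementation used only to exercise the sketch: reports a
 * three-frame stack and copies as many frames as fit. */
static int fake_read_cpu_call_stack(uint32_t dev, uint64_t gridId64,
                                    uint64_t *addrs, uint32_t numAddrs,
                                    uint32_t *totalNumAddrs)
{
    static const uint64_t frames[3] = {0x1000, 0x2000, 0x3000};
    (void)dev;
    (void)gridId64;
    for (uint32_t i = 0; i < numAddrs && i < 3; ++i) {
        addrs[i] = frames[i];
    }
    *totalNumAddrs = 3;
    return 0;
}
```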

CUDBGResult ( *CUDBGAPI_st::readCallDepth )( uint32_t dev, uint32_t sm, uint32_t wp, uint32_t ln, uint32_t* depth )

Read the call depth (number of calls) for a given thread. Since CUDA 4.0.

See also:

readReturnAddress

readVirtualReturnAddress

Parameters
dev
- device index
sm
- SM index
wp
- warp index
ln
- lane index
depth
- the returned call depth
Returns

CUDBG_SUCCESS, CUDBG_ERROR_INVALID_ARGS, CUDBG_ERROR_UNINITIALIZED, CUDBG_ERROR_INTERNAL, CUDBG_ERROR_INVALID_WARP, CUDBG_ERROR_INITIALIZATION_FAILURE, CUDBG_ERROR_INVALID_CONTEXT, CUDBG_ERROR_RECURSIVE_API_CALL

CUDBGResult ( *CUDBGAPI_st::readCallDepth32 )( uint32_t dev, uint32_t sm, uint32_t wp, uint32_t* depth )

Read the call depth (number of calls) for a given warp. Behaves like readCallDepth() for the active thread group.

Since CUDA 3.1.

Note:

DEPRECATED in CUDA 4.0: Use readCallDepth instead.

See also:

readCallDepth

Parameters
dev
- device index
sm
- SM index
wp
- warp index
depth
- the returned call depth
Returns

CUDBG_SUCCESS, CUDBG_ERROR_INVALID_ARGS, CUDBG_ERROR_UNINITIALIZED, CUDBG_ERROR_INTERNAL, CUDBG_ERROR_INVALID_WARP, CUDBG_ERROR_INVALID_LANE, CUDBG_ERROR_INITIALIZATION_FAILURE, CUDBG_ERROR_INVALID_CONTEXT, CUDBG_ERROR_RECURSIVE_API_CALL

CUDBGResult ( *CUDBGAPI_st::readClusterIdx )( uint32_t dev, uint32_t sm, uint32_t wp, CuDim3* clusterIdx )

Read the CUDA cluster index running on a valid warp. Since CUDA 12.0.

See also:

readActiveLanes

readBlockIdx

readBrokenWarps

readGridId

readThreadIdx

readValidLanes

readValidWarps

Parameters
dev
- device index
sm
- SM index
wp
- warp index
clusterIdx
- the returned CUDA cluster index
Returns

CUDBG_SUCCESS, CUDBG_ERROR_INVALID_ARGS, CUDBG_ERROR_UNINITIALIZED, CUDBG_ERROR_INTERNAL, CUDBG_ERROR_INVALID_WARP, CUDBG_ERROR_INITIALIZATION_FAILURE, CUDBG_ERROR_INVALID_CONTEXT, CUDBG_ERROR_RECURSIVE_API_CALL

CUDBGResult ( *CUDBGAPI_st::readCodeMemory )( uint32_t dev, uint64_t addr, void* buf, uint32_t sz )

Read content at address in the code memory segment. It is generally not necessary to call this function - instead, the same memory could be read from the ELF module images received from the API.

Since CUDA 3.0.

See also:

readGenericMemory

readLocalMemory

readPC

readParamMemory

readRegister

readSharedMemory

readTextureMemory

Parameters
dev
- device index
addr
- memory address
buf
- buffer
sz
- buffer size in bytes
Returns

CUDBG_SUCCESS, CUDBG_ERROR_INVALID_ARGS, CUDBG_ERROR_UNINITIALIZED, CUDBG_ERROR_INTERNAL, CUDBG_ERROR_INVALID_ADDRESS, CUDBG_ERROR_INITIALIZATION_FAILURE, CUDBG_ERROR_INVALID_CONTEXT, CUDBG_ERROR_RECURSIVE_API_CALL

CUDBGResult ( *CUDBGAPI_st::readConstMemory129 )( uint32_t dev, uint64_t addr, void* buf, uint32_t sz )

Read content at address in the constant memory segment. Behaves exactly like readGlobalMemory.

Since CUDA 3.0.

Note:

DEPRECATED in CUDA 13.0: Use readGlobalMemory instead.

See also:

readGlobalMemory

Parameters
dev
- device index
addr
- memory address
buf
- buffer
sz
- buffer size in bytes
Returns

CUDBG_SUCCESS, CUDBG_ERROR_UNKNOWN, CUDBG_ERROR_INVALID_ARGS, CUDBG_ERROR_UNINITIALIZED, CUDBG_ERROR_INVALID_MEMORY_ACCESS, CUDBG_ERROR_INITIALIZATION_FAILURE, CUDBG_ERROR_ADDRESS_NOT_IN_DEVICE_MEM, CUDBG_ERROR_AMBIGUOUS_MEMORY_ADDRESS, CUDBG_ERROR_RECURSIVE_API_CALL, CUDBG_ERROR_NOT_SUPPORTED

CUDBGResult ( *CUDBGAPI_st::readDeviceExceptionState )( uint32_t dev, uint64_t* mask, uint32_t numWords )

Get the exception state of the SMs on the device. If the CUDBG_DEBUGGER_CAPABILITY_REPORT_EXCEPTIONS_IN_EXITED_WARPS capability is enabled, exceptions in exited warps will be reported.

Since CUDA 9.0.

See also:

getNumSMs

Parameters
dev
- device index
mask
- Arbitrarily sized bit field containing a 1 at (1 << i) if SM i hit an exception
numWords
- Number of uint64_t elements in mask (must be large enough to hold a bit for each sm on the device)
Returns

CUDBG_SUCCESS, CUDBG_ERROR_INVALID_ARGS, CUDBG_ERROR_UNINITIALIZED, CUDBG_ERROR_INITIALIZATION_FAILURE, CUDBG_ERROR_RECURSIVE_API_CALL
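
The sizing rule for numWords and the per-SM bit test can be sketched as follows. Both helper names are hypothetical. The mapping of SM i to bit (i % 64) of word (i / 64) is an assumption about how the multi-word mask is laid out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Number of uint64_t words needed to hold one bit per SM. */
static uint32_t exception_mask_words(uint32_t numSMs)
{
    return (numSMs + 63u) / 64u;
}

/* Hypothetical helper: tests whether SM `sm` is flagged in a mask filled in
 * by readDeviceExceptionState().
 * Assumption: SM i maps to bit (i % 64) of word (i / 64). */
static bool sm_hit_exception(const uint64_t *mask, uint32_t numWords,
                             uint32_t sm)
{
    assert(mask != NULL);
    if (sm / 64u >= numWords) {
        return false; /* SM index lies outside the mask */
    }
    return ((mask[sm / 64u] >> (sm % 64u)) & 1u) != 0;
}
```

In practice the SM count passed to exception_mask_words would come from getNumSMs for the same device.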

CUDBGResult ( *CUDBGAPI_st::readDeviceExceptionState80 )( uint32_t dev, uint64_t* exceptionSMMask )

Get the exception state of the SMs on the device. Behaves like readDeviceExceptionState but only supports up to 64 SMs.

Since CUDA 5.5.

Note:

DEPRECATED in CUDA 9.0: Use readDeviceExceptionState instead.

See also:

readDeviceExceptionState

Parameters
dev
- device index
exceptionSMMask
- Bit field containing a 1 at (1 << i) if SM i hit an exception
Returns

CUDBG_SUCCESS, CUDBG_ERROR_INVALID_ARGS, CUDBG_ERROR_UNINITIALIZED, CUDBG_ERROR_INITIALIZATION_FAILURE, CUDBG_ERROR_RECURSIVE_API_CALL

CUDBGResult ( *CUDBGAPI_st::readErrorPC )( uint32_t dev, uint32_t sm, uint32_t wp, uint64_t* errorPC, bool* errorPCValid )

Get the hardware-reported error PC if it exists. The error PC, if available, is the PC at which an error occurred; the thread may have progressed past it, so its current PC can be beyond the error PC.

Since CUDA 6.0.

Parameters
dev
- device index
sm
- SM index
wp
- warp index
errorPC
- PC of the exception
errorPCValid
- boolean to indicate that the returned error PC is valid
Returns

CUDBG_SUCCESS, CUDBG_ERROR_UNKNOWN_FUNCTION, CUDBG_ERROR_INVALID_ARGS, CUDBG_ERROR_UNINITIALIZED, CUDBG_ERROR_INVALID_WARP, CUDBG_ERROR_INITIALIZATION_FAILURE, CUDBG_ERROR_INVALID_CONTEXT, CUDBG_ERROR_RECURSIVE_API_CALL

CUDBGResult ( *CUDBGAPI_st::readGenericMemory )( uint32_t dev, uint32_t sm, uint32_t wp, uint32_t ln, uint64_t addr, void* buf, uint32_t sz )

Read content at an address in any memory segment. The address will be used to determine whether the read is to local, shared or global memory. The target address range should entirely reside within a single memory segment. Coordinate arguments are only used when relevant. They should be provided for the following segments:

  • Shared memory: SM and Warp

  • Local memory: SM, Warp and Lane

Since CUDA 6.0.

See also:

readCodeMemory

readLocalMemory

readPC

readParamMemory

readRegister

readSharedMemory

readTextureMemory

Parameters
dev
- device index
sm
- SM index
wp
- warp index
ln
- lane index
addr
- memory address
buf
- buffer
sz
- buffer size in bytes
Returns

CUDBG_SUCCESS, CUDBG_ERROR_UNKNOWN, CUDBG_ERROR_INVALID_ARGS, CUDBG_ERROR_UNINITIALIZED, CUDBG_ERROR_INVALID_MEMORY_ACCESS, CUDBG_ERROR_INTERNAL, CUDBG_ERROR_INVALID_WARP, CUDBG_ERROR_INITIALIZATION_FAILURE, CUDBG_ERROR_INVALID_CONTEXT, CUDBG_ERROR_ADDRESS_NOT_IN_DEVICE_MEM, CUDBG_ERROR_AMBIGUOUS_MEMORY_ADDRESS, CUDBG_ERROR_RECURSIVE_API_CALL, CUDBG_ERROR_NOT_SUPPORTED

CUDBGResult ( *CUDBGAPI_st::readGlobalMemory )( uint64_t addr, void* buf, uint32_t sz )

Read content at an address in the global address space. If the address is valid on more than one device and one of those devices does not support UVA, an error is returned.

Since CUDA 6.0.

See also:

readCodeMemory

readLocalMemory

readPC

readParamMemory

readRegister

readSharedMemory

readTextureMemory

Parameters
addr
- memory address
buf
- buffer
sz
- buffer size in bytes
Returns

CUDBG_SUCCESS, CUDBG_ERROR_UNKNOWN, CUDBG_ERROR_INVALID_ARGS, CUDBG_ERROR_UNINITIALIZED, CUDBG_ERROR_INVALID_MEMORY_ACCESS, CUDBG_ERROR_INITIALIZATION_FAILURE, CUDBG_ERROR_ADDRESS_NOT_IN_DEVICE_MEM, CUDBG_ERROR_AMBIGUOUS_MEMORY_ADDRESS, CUDBG_ERROR_RECURSIVE_API_CALL, CUDBG_ERROR_NOT_SUPPORTED

CUDBGResult ( *CUDBGAPI_st::readGlobalMemory31 )( uint32_t dev, uint64_t addr, void* buf, uint32_t sz )

Read content at address in the global memory segment. Behaves like readGenericMemory() with sm, wp, ln == 0. This makes this method not at all useful.

Since CUDA 3.0.

Note:

DEPRECATED in CUDA 3.2: Use readGlobalMemory instead.

See also:

readGlobalMemory

Parameters
dev
- device index
addr
- memory address
buf
- buffer
sz
- buffer size in bytes
Returns

CUDBG_SUCCESS, CUDBG_ERROR_UNKNOWN, CUDBG_ERROR_INVALID_ARGS, CUDBG_ERROR_UNINITIALIZED, CUDBG_ERROR_INVALID_MEMORY_ACCESS, CUDBG_ERROR_INTERNAL, CUDBG_ERROR_INVALID_WARP, CUDBG_ERROR_INITIALIZATION_FAILURE, CUDBG_ERROR_INVALID_CONTEXT, CUDBG_ERROR_ADDRESS_NOT_IN_DEVICE_MEM, CUDBG_ERROR_AMBIGUOUS_MEMORY_ADDRESS, CUDBG_ERROR_RECURSIVE_API_CALL, CUDBG_ERROR_NOT_SUPPORTED

CUDBGResult ( *CUDBGAPI_st::readGlobalMemory55 )( uint32_t dev, uint32_t sm, uint32_t wp, uint32_t ln, uint64_t addr, void* buf, uint32_t sz )

Read content at address in the global memory segment. Behaves exactly like readGenericMemory().

Since CUDA 3.2.

Note:

DEPRECATED in CUDA 6.0: Use readGlobalMemory instead.

See also:

readGlobalMemory

Parameters
dev
- device index
sm
- SM index
wp
- warp index
ln
- lane index
addr
- memory address
buf
- buffer
sz
- buffer size in bytes
Returns

CUDBG_SUCCESS, CUDBG_ERROR_UNKNOWN, CUDBG_ERROR_INVALID_ARGS, CUDBG_ERROR_UNINITIALIZED, CUDBG_ERROR_INVALID_MEMORY_ACCESS, CUDBG_ERROR_INTERNAL, CUDBG_ERROR_INVALID_WARP, CUDBG_ERROR_INITIALIZATION_FAILURE, CUDBG_ERROR_INVALID_CONTEXT, CUDBG_ERROR_ADDRESS_NOT_IN_DEVICE_MEM, CUDBG_ERROR_AMBIGUOUS_MEMORY_ADDRESS, CUDBG_ERROR_RECURSIVE_API_CALL, CUDBG_ERROR_NOT_SUPPORTED

CUDBGResult ( *CUDBGAPI_st::readGridId )( uint32_t dev, uint32_t sm, uint32_t wp, uint64_t* gridId64 )

Read the 64-bit CUDA grid index running on a valid warp. The grid ID is guaranteed to be unique within a device, but not globally.

Since CUDA 5.5.

See also:

readActiveLanes

readBlockIdx

readBrokenWarps

readThreadIdx

readValidLanes

readValidWarps

Parameters
dev
- device index
sm
- SM index
wp
- warp index
gridId64
- the returned 64-bit CUDA grid index
Returns

CUDBG_SUCCESS, CUDBG_ERROR_INVALID_ARGS, CUDBG_ERROR_UNINITIALIZED, CUDBG_ERROR_INVALID_WARP, CUDBG_ERROR_INITIALIZATION_FAILURE, CUDBG_ERROR_INVALID_CONTEXT, CUDBG_ERROR_RECURSIVE_API_CALL

CUDBGResult ( *CUDBGAPI_st::readGridId50 )( uint32_t dev, uint32_t sm, uint32_t wp, uint32_t* gridId )

Read the CUDA grid index running on a valid warp. Behaves like readGridId but truncates the grid ID to 32 bits. This is incompatible with some grid IDs, such as those used by OptiX applications.

Since CUDA 3.0.

Note:

DEPRECATED in CUDA 5.5: Use readGridId instead.

See also:

readGridId

Parameters
dev
- device index
sm
- SM index
wp
- warp index
gridId
- the returned CUDA grid index
Returns

CUDBG_SUCCESS, CUDBG_ERROR_INVALID_ARGS, CUDBG_ERROR_UNINITIALIZED, CUDBG_ERROR_INVALID_WARP, CUDBG_ERROR_INITIALIZATION_FAILURE, CUDBG_ERROR_INVALID_CONTEXT, CUDBG_ERROR_RECURSIVE_API_CALL

CUDBGResult ( *CUDBGAPI_st::readLaneException )( uint32_t dev, uint32_t sm, uint32_t wp, uint32_t ln, CUDBGException_t* exception )

Read the exception type for a given thread. Since CUDA 3.1.

Parameters
dev
- device index
sm
- SM index
wp
- warp index
ln
- lane index
exception
- the returned exception type
Returns

CUDBG_SUCCESS, CUDBG_ERROR_INVALID_ARGS, CUDBG_ERROR_UNINITIALIZED, CUDBG_ERROR_INVALID_WARP, CUDBG_ERROR_INITIALIZATION_FAILURE, CUDBG_ERROR_INVALID_CONTEXT, CUDBG_ERROR_RECURSIVE_API_CALL

CUDBGResult ( *CUDBGAPI_st::readLaneStatus )( uint32_t dev, uint32_t sm, uint32_t wp, uint32_t ln, bool* error )

Read the status of the given thread. For specific error values, use readLaneException.

Since CUDA 3.0.

Parameters
dev
- device index
sm
- SM index
wp
- warp index
ln
- lane index
error
- true if there is an error
Returns

CUDBG_SUCCESS, CUDBG_ERROR_INVALID_ARGS, CUDBG_ERROR_UNINITIALIZED, CUDBG_ERROR_INVALID_WARP, CUDBG_ERROR_INITIALIZATION_FAILURE, CUDBG_ERROR_INVALID_CONTEXT, CUDBG_ERROR_RECURSIVE_API_CALL

CUDBGResult ( *CUDBGAPI_st::readLocalMemory )( uint32_t dev, uint32_t sm, uint32_t wp, uint32_t ln, uint64_t addr, void* buf, uint32_t sz )

Read content at address in the local memory segment. Since CUDA 3.0.

See also:

readCodeMemory

readGenericMemory

readPC

readParamMemory

readRegister

readSharedMemory

readTextureMemory

Parameters
dev
- device index
sm
- SM index
wp
- warp index
ln
- lane index
addr
- memory address
buf
- buffer
sz
- buffer size in bytes
Returns

CUDBG_SUCCESS, CUDBG_ERROR_INVALID_ARGS, CUDBG_ERROR_UNINITIALIZED, CUDBG_ERROR_INTERNAL, CUDBG_ERROR_INVALID_WARP, CUDBG_ERROR_INVALID_ADDRESS, CUDBG_ERROR_INITIALIZATION_FAILURE, CUDBG_ERROR_INVALID_CONTEXT, CUDBG_ERROR_RECURSIVE_API_CALL

CUDBGResult ( *CUDBGAPI_st::readPC )( uint32_t dev, uint32_t sm, uint32_t wp, uint32_t ln, uint64_t* pc )

Read the PC offset on the given active thread. The returned PC offset is from the start of the current function. If a function can't be found, the full virtual address is returned.

Since CUDA 3.0.

See also:

readCodeMemory

readGenericMemory

readLocalMemory

readParamMemory

readRegister

readSharedMemory

readTextureMemory

readVirtualPC

Parameters
dev
- device index
sm
- SM index
wp
- warp index
ln
- lane index
pc
- the returned PC
Returns

CUDBG_SUCCESS, CUDBG_ERROR_UNKNOWN_FUNCTION, CUDBG_ERROR_INVALID_ARGS, CUDBG_ERROR_UNINITIALIZED, CUDBG_ERROR_INTERNAL, CUDBG_ERROR_INVALID_WARP, CUDBG_ERROR_INITIALIZATION_FAILURE, CUDBG_ERROR_INVALID_CONTEXT, CUDBG_ERROR_RECURSIVE_API_CALL

CUDBGResult ( *CUDBGAPI_st::readParamMemory )( uint32_t dev, uint32_t sm, uint32_t wp, uint64_t addr, void* buf, uint32_t sz )

Read content at address in the param memory segment. Since CUDA 3.0.

See also:

readCodeMemory

readGenericMemory

readLocalMemory

readPC

readRegister

readSharedMemory

readTextureMemory

Parameters
dev
- device index
sm
- SM index
wp
- warp index
addr
- memory address
buf
- buffer
sz
- buffer size in bytes
Returns

CUDBG_SUCCESS, CUDBG_ERROR_INVALID_ARGS, CUDBG_ERROR_UNINITIALIZED, CUDBG_ERROR_INTERNAL, CUDBG_ERROR_INVALID_WARP, CUDBG_ERROR_INITIALIZATION_FAILURE, CUDBG_ERROR_INVALID_CONTEXT, CUDBG_ERROR_RECURSIVE_API_CALL

CUDBGResult ( *CUDBGAPI_st::readPinnedMemory )( uint64_t addr, void* buf, uint32_t sz )

Read content at a pinned address in system memory. Depending on the platform, this method may fail, in which case a platform-specific mechanism for reading the debuggee's host memory (e.g. ptrace) must be used instead.

Since CUDA 3.2.

See also:

readCodeMemory

readGenericMemory

readLocalMemory

readPC

readParamMemory

readRegister

readSharedMemory

readTextureMemory

Parameters
addr
- system memory address
buf
- buffer
sz
- buffer size in bytes
Returns

CUDBG_SUCCESS, CUDBG_ERROR_UNKNOWN, CUDBG_ERROR_INVALID_ARGS, CUDBG_ERROR_UNINITIALIZED, CUDBG_ERROR_INVALID_MEMORY_ACCESS, CUDBG_ERROR_MEMORY_MAPPING_FAILED, CUDBG_ERROR_INITIALIZATION_FAILURE, CUDBG_ERROR_ADDRESS_NOT_IN_DEVICE_MEM, CUDBG_ERROR_AMBIGUOUS_MEMORY_ADDRESS, CUDBG_ERROR_RECURSIVE_API_CALL

CUDBGResult ( *CUDBGAPI_st::readPredicates )( uint32_t dev, uint32_t sm, uint32_t wp, uint32_t ln, uint32_t predicates_size, uint32_t* predicates )

Read content of hardware predicate registers.

Parameters
dev
- device index
sm
- SM index
wp
- warp index
ln
- lane index
predicates_size
- number of predicate registers to read
predicates
- buffer
Returns

CUDBG_SUCCESS, CUDBG_ERROR_INVALID_ARGS, CUDBG_ERROR_UNINITIALIZED, CUDBG_ERROR_INTERNAL, CUDBG_ERROR_INVALID_WARP, CUDBG_ERROR_INITIALIZATION_FAILURE, CUDBG_ERROR_INVALID_CONTEXT, CUDBG_ERROR_RECURSIVE_API_CALL

CUDBGResult ( *CUDBGAPI_st::readRegister )( uint32_t dev, uint32_t sm, uint32_t wp, uint32_t ln, uint32_t regno, uint32_t* val )

Read content of a hardware register. Note that warps can dynamically change the number of registers they use at runtime; readWarpResources() can be used to query the current number.

Since CUDA 3.0.

See also:

readCodeMemory

readGenericMemory

readLocalMemory

readPC

readParamMemory

readSharedMemory

readTextureMemory

Parameters
dev
- device index
sm
- SM index
wp
- warp index
ln
- lane index
regno
- register index
val
- the returned value of the register
Returns

CUDBG_SUCCESS, CUDBG_ERROR_INVALID_ARGS, CUDBG_ERROR_UNINITIALIZED, CUDBG_ERROR_INTERNAL, CUDBG_ERROR_INVALID_WARP, CUDBG_ERROR_INITIALIZATION_FAILURE, CUDBG_ERROR_INVALID_CONTEXT, CUDBG_ERROR_RECURSIVE_API_CALL

CUDBGResult ( *CUDBGAPI_st::readRegisterRange )( uint32_t dev, uint32_t sm, uint32_t wp, uint32_t ln, uint32_t index, uint32_t numRegisters, uint32_t* registers, uint32_t* numRegistersRead )

Read content of a range of hardware registers. Since CUDA 13.2.

See also:

readCodeMemory

readGenericMemory

readLocalMemory

readPC

readParamMemory

readRegister

readSharedMemory

readTextureMemory

Parameters
dev
- device index
sm
- SM index
wp
- warp index
ln
- lane index
index
- index of the first register to read
numRegisters
- number of registers to read
registers
- buffer
numRegistersRead
- number of registers actually read, ignored if null
Returns

CUDBG_SUCCESS, CUDBG_ERROR_INVALID_ARGS, CUDBG_ERROR_UNINITIALIZED, CUDBG_ERROR_INTERNAL, CUDBG_ERROR_INVALID_WARP, CUDBG_ERROR_INITIALIZATION_FAILURE, CUDBG_ERROR_INVALID_CONTEXT, CUDBG_ERROR_RECURSIVE_API_CALL

CUDBGResult ( *CUDBGAPI_st::readRegisterRange60 )( uint32_t dev, uint32_t sm, uint32_t wp, uint32_t ln, uint32_t index, uint32_t registers_size, uint32_t* registers )

Read content of a range of hardware registers. Since CUDA 6.0.

Note:

DEPRECATED in CUDA 13.2: Use readRegisterRange instead.

See also:

readRegisterRange

Parameters
dev
- device index
sm
- SM index
wp
- warp index
ln
- lane index
index
- index of the first register to read
registers_size
- number of registers to read
registers
- buffer
Returns

CUDBG_SUCCESS, CUDBG_ERROR_INVALID_ARGS, CUDBG_ERROR_UNINITIALIZED, CUDBG_ERROR_INTERNAL, CUDBG_ERROR_INVALID_WARP, CUDBG_ERROR_INITIALIZATION_FAILURE, CUDBG_ERROR_INVALID_CONTEXT, CUDBG_ERROR_RECURSIVE_API_CALL

CUDBGResult ( *CUDBGAPI_st::readReturnAddress )( uint32_t dev, uint32_t sm, uint32_t wp, uint32_t ln, uint32_t level, uint64_t* ra )

Read the return address (offset) for a call level. The returned return address is an offset from the start of the current function. If a function can't be found, the full virtual address is returned.

Since CUDA 4.0.

See also:

readCallDepth

readVirtualReturnAddress

Parameters
dev
- device index
sm
- SM index
wp
- warp index
ln
- lane index
level
- the specified call level
ra
- the returned return address for level
Returns

CUDBG_SUCCESS, CUDBG_ERROR_UNKNOWN_FUNCTION, CUDBG_ERROR_INVALID_ARGS, CUDBG_ERROR_UNINITIALIZED, CUDBG_ERROR_INTERNAL, CUDBG_ERROR_INVALID_WARP, CUDBG_ERROR_INITIALIZATION_FAILURE, CUDBG_ERROR_INVALID_CALL_LEVEL, CUDBG_ERROR_INVALID_CONTEXT, CUDBG_ERROR_RECURSIVE_API_CALL
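
Combined with readCallDepth, this method can be used to walk a thread's call levels one at a time. The sketch below assumes that valid levels run from 0 to depth - 1. It uses stand-in function pointers (int replaces CUDBGResult, 0 meaning success) and fake implementations so that it compiles without cudadebugger.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Function pointers with the shapes of readCallDepth and readReturnAddress.
 * `int` replaces CUDBGResult, with 0 meaning success (assumptions made so
 * the sketch compiles without cudadebugger.h). */
typedef int (*read_call_depth_fn)(uint32_t dev, uint32_t sm, uint32_t wp,
                                  uint32_t ln, uint32_t *depth);
typedef int (*read_return_address_fn)(uint32_t dev, uint32_t sm, uint32_t wp,
                                      uint32_t ln, uint32_t level,
                                      uint64_t *ra);

/* Reads up to `capacity` return addresses for one thread, one call level
 * at a time. Assumption: valid levels are 0 .. depth - 1.
 * Returns the number of addresses stored, or -1 on any error. */
static int collect_return_addresses(read_call_depth_fn readDepth,
                                    read_return_address_fn readRa,
                                    uint32_t dev, uint32_t sm, uint32_t wp,
                                    uint32_t ln, uint64_t *out,
                                    uint32_t capacity)
{
    uint32_t depth = 0;
    assert(readDepth != NULL && readRa != NULL && out != NULL);
    if (readDepth(dev, sm, wp, ln, &depth) != 0) {
        return -1;
    }
    uint32_t n = depth < capacity ? depth : capacity;
    for (uint32_t level = 0; level < n; ++level) {
        if (readRa(dev, sm, wp, ln, level, &out[level]) != 0) {
            return -1;
        }
    }
    return (int)n;
}

/* Fakes used only to exercise the sketch: a call depth of 2, with
 * made-up return addresses 0x100 and 0x200. */
static int fake_depth(uint32_t dev, uint32_t sm, uint32_t wp, uint32_t ln,
                      uint32_t *depth)
{
    (void)dev; (void)sm; (void)wp; (void)ln;
    *depth = 2;
    return 0;
}

static int fake_ra(uint32_t dev, uint32_t sm, uint32_t wp, uint32_t ln,
                   uint32_t level, uint64_t *ra)
{
    (void)dev; (void)sm; (void)wp; (void)ln;
    *ra = 0x100u * (uint64_t)(level + 1u);
    return 0;
}
```

Note that the values returned by readReturnAddress are offsets from the start of the function where one can be found; readVirtualReturnAddress is listed above as the related method for virtual addresses.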

CUDBGResult ( *CUDBGAPI_st::readReturnAddress32 )( uint32_t dev, uint32_t sm, uint32_t wp, uint32_t level, uint64_t* ra )

Read the return address (offset) for a call level. Behaves like readReturnAddress() for the active thread group.

Since CUDA 3.1.

Note:

DEPRECATED in CUDA 4.0: Use readReturnAddress instead.

See also:

readReturnAddress

Parameters
dev
- device index
sm
- SM index
wp
- warp index
level
- the specified call level
ra
- the returned return address for level
Returns

CUDBG_SUCCESS, CUDBG_ERROR_UNKNOWN_FUNCTION, CUDBG_ERROR_INVALID_ARGS, CUDBG_ERROR_UNINITIALIZED, CUDBG_ERROR_INTERNAL, CUDBG_ERROR_INVALID_WARP, CUDBG_ERROR_INVALID_LANE, CUDBG_ERROR_INITIALIZATION_FAILURE, CUDBG_ERROR_INVALID_CALL_LEVEL, CUDBG_ERROR_INVALID_CONTEXT, CUDBG_ERROR_RECURSIVE_API_CALL

CUDBGResult ( *CUDBGAPI_st::readSharedMemory )( uint32_t dev, uint32_t sm, uint32_t wp, uint64_t addr, void* buf, uint32_t sz )

Read content at address in the shared memory segment. Since CUDA 3.0.

See also:

readCodeMemory

readGenericMemory

readLocalMemory

readPC

readParamMemory

readRegister

readTextureMemory

Parameters
dev
- device index
sm
- SM index
wp
- warp index
addr
- memory address
buf
- buffer
sz
- buffer size in bytes
Returns

CUDBG_SUCCESS, CUDBG_ERROR_INVALID_ARGS, CUDBG_ERROR_UNINITIALIZED, CUDBG_ERROR_INVALID_MEMORY_ACCESS, CUDBG_ERROR_INTERNAL, CUDBG_ERROR_INVALID_WARP, CUDBG_ERROR_INITIALIZATION_FAILURE, CUDBG_ERROR_INVALID_CONTEXT, CUDBG_ERROR_RECURSIVE_API_CALL

CUDBGResult ( *CUDBGAPI_st::readSmException )( uint32_t dev, uint32_t sm, CUDBGException_t* exception, uint64_t* errorPC, bool* errorPCValid )

Get the SM exception status if it exists. If the CUDBG_DEBUGGER_CAPABILITY_REPORT_EXCEPTIONS_IN_EXITED_WARPS capability is enabled, exceptions in exited warps will be reported.

Since CUDA 12.5.

Parameters
dev
- the device index
sm
- the SM index
exception
- returned exception
errorPC
- returned PC of the exception
errorPCValid
- boolean to indicate that the returned error PC is valid
Returns

CUDBG_SUCCESS, CUDBG_ERROR_INVALID_ARGS, CUDBG_ERROR_UNINITIALIZED, CUDBG_ERROR_INITIALIZATION_FAILURE, CUDBG_ERROR_INVALID_CONTEXT, CUDBG_ERROR_RECURSIVE_API_CALL

CUDBGResult ( *CUDBGAPI_st::readSyscallCallDepth )( uint32_t dev, uint32_t sm, uint32_t wp, uint32_t ln, uint32_t* depth )

Read the call depth of syscalls for a given thread. Will always return 0.

Since CUDA 4.1.

Note:

DEPRECATED in CUDA 12.9: Do not use.

Parameters
dev
- device index
sm
- SM index
wp
- warp index
ln
- lane index
depth
- the returned call depth
Returns

CUDBG_SUCCESS, CUDBG_ERROR_INVALID_ARGS, CUDBG_ERROR_UNINITIALIZED, CUDBG_ERROR_INITIALIZATION_FAILURE, CUDBG_ERROR_RECURSIVE_API_CALL

CUDBGResult ( *CUDBGAPI_st::readTextureMemory )( uint32_t dev, uint32_t vsm, uint32_t wp, uint32_t id, uint32_t dim, uint32_t* coords, void* buf, uint32_t sz )

This method is no longer supported as of CUDA 12.0 and always returns CUDBG_ERROR_NOT_SUPPORTED.

Since CUDA 4.0.

Note:

DEPRECATED in CUDA 12.0: Do not use.

Parameters
dev
- device index
vsm
wp
- warp index
id
- texture id (the value of DW_AT_location attribute in the relocated ELF image)
dim
- texture dimension (1 to 4)
coords
- array of coordinates of size dim
buf
- result buffer
sz
- buffer size in bytes
Returns

CUDBG_ERROR_INVALID_ARGS, CUDBG_ERROR_UNINITIALIZED, CUDBG_ERROR_INITIALIZATION_FAILURE, CUDBG_ERROR_RECURSIVE_API_CALL, CUDBG_ERROR_NOT_SUPPORTED

CUDBGResult ( *CUDBGAPI_st::readTextureMemoryBindless )( uint32_t dev, uint32_t vsm, uint32_t wp, uint32_t texSymtabIndex, uint32_t dim, uint32_t* coords, void* buf, uint32_t sz )

This method is no longer supported as of CUDA 12.0 and always returns CUDBG_ERROR_NOT_SUPPORTED.

Since CUDA 4.2.

Note:

DEPRECATED in CUDA 12.0: Do not use.

Parameters
dev
- device index
vsm
wp
- warp index
texSymtabIndex
- global symbol table index of the texture symbol
dim
- texture dimension (1 to 4)
coords
- array of coordinates of size dim
buf
- result buffer
sz
- buffer size in bytes
Returns

CUDBG_ERROR_INVALID_ARGS, CUDBG_ERROR_UNINITIALIZED, CUDBG_ERROR_INITIALIZATION_FAILURE, CUDBG_ERROR_RECURSIVE_API_CALL, CUDBG_ERROR_NOT_SUPPORTED

CUDBGResult ( *CUDBGAPI_st::readThreadIdx )( uint32_t dev, uint32_t sm, uint32_t wp, uint32_t ln, CuDim3* threadIdx )

Read the CUDA thread index running on a valid thread. Since CUDA 3.0.

See also:

readActiveLanes

readBlockIdx

readBrokenWarps

readGridId

readValidLanes

readValidWarps

Parameters
dev
- device index
sm
- SM index
wp
- warp index
ln
- lane index
threadIdx
- the returned CUDA thread index
Returns

CUDBG_SUCCESS, CUDBG_ERROR_INVALID_ARGS, CUDBG_ERROR_UNINITIALIZED, CUDBG_ERROR_INTERNAL, CUDBG_ERROR_INVALID_WARP, CUDBG_ERROR_INITIALIZATION_FAILURE, CUDBG_ERROR_INVALID_CONTEXT, CUDBG_ERROR_RECURSIVE_API_CALL

CUDBGResult ( *CUDBGAPI_st::readUniformPredicates )( uint32_t dev, uint32_t sm, uint32_t wp, uint32_t predicates_size, uint32_t* predicates )

Read contents of uniform predicate registers. Since CUDA 10.0.

See also:

readPredicates

Parameters
dev
- device index
sm
- SM index
wp
- warp index
predicates_size
- number of predicate registers to read
predicates
- buffer
Returns

CUDBG_SUCCESS, CUDBG_ERROR_INVALID_ARGS, CUDBG_ERROR_UNINITIALIZED, CUDBG_ERROR_INTERNAL, CUDBG_ERROR_INVALID_WARP, CUDBG_ERROR_INITIALIZATION_FAILURE, CUDBG_ERROR_INVALID_CONTEXT, CUDBG_ERROR_RECURSIVE_API_CALL

CUDBGResult ( *CUDBGAPI_st::readUniformRegisterRange )( uint32_t dev, uint32_t sm, uint32_t wp, uint32_t regno, uint32_t registers_size, uint32_t* registers )

Read a range of uniform registers. Since CUDA 10.0.

See also:

readRegister

Parameters
dev
- device index
sm
- SM index
wp
- warp index
regno
- starting index into uniform register file
registers_size
- number of bytes to read
registers
- pointer to buffer
Returns

CUDBG_SUCCESS, CUDBG_ERROR_INVALID_ARGS, CUDBG_ERROR_UNINITIALIZED, CUDBG_ERROR_INTERNAL, CUDBG_ERROR_INVALID_WARP, CUDBG_ERROR_INITIALIZATION_FAILURE, CUDBG_ERROR_INVALID_CONTEXT, CUDBG_ERROR_RECURSIVE_API_CALL

CUDBGResult ( *CUDBGAPI_st::readValidLanes )( uint32_t dev, uint32_t sm, uint32_t wp, uint32_t* validLanesMask )

Read the lane bitmask of valid threads on a given warp. Since CUDA 3.0.

See also:

readActiveLanes

readBlockIdx

readBrokenWarps

readGridId

readThreadIdx

readValidWarps

Parameters
dev
- device index
sm
- SM index
wp
- warp index
validLanesMask
- the returned bitmask of valid threads
Returns

CUDBG_SUCCESS, CUDBG_ERROR_INVALID_ARGS, CUDBG_ERROR_UNINITIALIZED, CUDBG_ERROR_INVALID_WARP, CUDBG_ERROR_INITIALIZATION_FAILURE, CUDBG_ERROR_INVALID_CONTEXT, CUDBG_ERROR_RECURSIVE_API_CALL

CUDBGResult ( *CUDBGAPI_st::readValidWarps )( uint32_t dev, uint32_t sm, uint64_t* validWarpsMask )

Read the bitmask of valid warps on a given SM. Since CUDA 3.0.

See also:

readActiveLanes

readBlockIdx

readBrokenWarps

readGridId

readThreadIdx

readValidLanes

Parameters
dev
- device index
sm
- SM index
validWarpsMask
- the returned bitmask of valid warps
Returns

CUDBG_SUCCESS, CUDBG_ERROR_INVALID_ARGS, CUDBG_ERROR_UNINITIALIZED, CUDBG_ERROR_INITIALIZATION_FAILURE, CUDBG_ERROR_RECURSIVE_API_CALL
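The masks returned by readValidWarps and readValidLanes are typically walked bit by bit. The following is a minimal sketch of that walk in plain C, assuming bit i of a mask corresponds to warp (or lane) index i. The helper names count_set_bits and next_set_bit are hypothetical and not part of the API.

```c
#include <stdint.h>

/* Hypothetical helper: number of bits set in a 64-bit mask, for example
 * the validWarpsMask filled in by readValidWarps. */
uint32_t count_set_bits(uint64_t mask)
{
    uint32_t n = 0;
    while (mask != 0) {
        mask &= mask - 1;   /* clear the lowest set bit */
        n++;
    }
    return n;
}

/* Hypothetical helper: index of the first set bit at or above `start`,
 * or -1 if there is none. Assumes bit i stands for warp (or lane) i. */
int next_set_bit(uint64_t mask, uint32_t start)
{
    for (uint32_t i = start; i < 64; i++) {
        if ((mask >> i) & 1u)
            return (int)i;
    }
    return -1;
}
```

A client could then visit every valid warp with a loop such as `for (int wp = next_set_bit(m, 0); wp >= 0; wp = next_set_bit(m, (uint32_t)wp + 1))`. A 32-bit validLanesMask can be passed to the same helpers after widening to 64 bits.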

CUDBGResult ( *CUDBGAPI_st::readVirtualPC )( uint32_t dev, uint32_t sm, uint32_t wp, uint32_t ln, uint64_t* pc )

Read the PC (virtual address) on the given active thread. Since CUDA 3.0.

See also:

readPC

Parameters
dev
- device index
sm
- SM index
wp
- warp index
ln
- lane index
pc
- the returned PC
Returns

CUDBG_SUCCESS, CUDBG_ERROR_INVALID_ARGS, CUDBG_ERROR_UNINITIALIZED, CUDBG_ERROR_INTERNAL, CUDBG_ERROR_INVALID_WARP, CUDBG_ERROR_INITIALIZATION_FAILURE, CUDBG_ERROR_INVALID_CONTEXT, CUDBG_ERROR_RECURSIVE_API_CALL

CUDBGResult ( *CUDBGAPI_st::readVirtualReturnAddress )( uint32_t dev, uint32_t sm, uint32_t wp, uint32_t ln, uint32_t level, uint64_t* ra )

Read the virtual return address for a call level. Since CUDA 4.0.

See also:

readCallDepth

readReturnAddress

Parameters
dev
- device index
sm
- SM index
wp
- warp index
ln
- lane index
level
- the specified call level
ra
- the returned virtual return address for level
Returns

CUDBG_SUCCESS, CUDBG_ERROR_INVALID_ARGS, CUDBG_ERROR_UNINITIALIZED, CUDBG_ERROR_INTERNAL, CUDBG_ERROR_INVALID_WARP, CUDBG_ERROR_INITIALIZATION_FAILURE, CUDBG_ERROR_INVALID_CALL_LEVEL, CUDBG_ERROR_INVALID_CONTEXT, CUDBG_ERROR_RECURSIVE_API_CALL
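A backtrace for one thread can be sketched by calling readVirtualReturnAddress for increasing call levels. The sketch below assumes that levels start at 0 and that CUDBG_ERROR_INVALID_CALL_LEVEL marks the end of the stack; a client could bound the loop with readCallDepth instead. The type ReadVirtualRAFn, the helper collect_return_addresses, the fake reader, and the stand-in enum values are all hypothetical and exist only so the example compiles without cudadebugger.h.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-ins so the sketch compiles without cudadebugger.h. With the real
 * header, delete these; the numeric values here are arbitrary. */
typedef enum {
    CUDBG_SUCCESS = 0,
    CUDBG_ERROR_INVALID_ARGS = 1,
    CUDBG_ERROR_INVALID_CALL_LEVEL = 2
} CUDBGResult;

/* Same parameter list as the readVirtualReturnAddress member above. */
typedef CUDBGResult (*ReadVirtualRAFn)(uint32_t dev, uint32_t sm, uint32_t wp,
                                       uint32_t ln, uint32_t level, uint64_t *ra);

/* Hypothetical helper: collect return addresses for levels 0, 1, 2, ...
 * until `max` entries are stored or the reader reports an invalid call
 * level. The number of stored entries is written to *count. */
CUDBGResult collect_return_addresses(ReadVirtualRAFn readRA,
                                     uint32_t dev, uint32_t sm, uint32_t wp,
                                     uint32_t ln, uint64_t *out, size_t max,
                                     size_t *count)
{
    if (readRA == NULL || out == NULL || count == NULL)
        return CUDBG_ERROR_INVALID_ARGS;
    *count = 0;
    for (uint32_t level = 0; *count < max; level++) {
        uint64_t ra = 0;
        CUDBGResult res = readRA(dev, sm, wp, ln, level, &ra);
        if (res == CUDBG_ERROR_INVALID_CALL_LEVEL)
            break;              /* assumed: walked past the outermost frame */
        if (res != CUDBG_SUCCESS)
            return res;         /* propagate any other error */
        out[(*count)++] = ra;
    }
    return CUDBG_SUCCESS;
}

/* Fake reader for demonstration only: pretends the thread is two calls deep. */
CUDBGResult fake_read_ra(uint32_t dev, uint32_t sm, uint32_t wp,
                         uint32_t ln, uint32_t level, uint64_t *ra)
{
    static const uint64_t frames[] = { 0x1000, 0x2000 };
    (void)dev; (void)sm; (void)wp; (void)ln;
    if (level >= 2)
        return CUDBG_ERROR_INVALID_CALL_LEVEL;
    *ra = frames[level];
    return CUDBG_SUCCESS;
}
```

With the real API, the first argument would be the readVirtualReturnAddress member of the CUDBGAPI_st instance in use.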

CUDBGResult ( *CUDBGAPI_st::readVirtualReturnAddress32 )( uint32_t dev, uint32_t sm, uint32_t wp, uint32_t level, uint64_t* ra )

Read the virtual return address for a call level. Behaves like readVirtualReturnAddress for the active thread group.

Since CUDA 3.1.

Note:

DEPRECATED in CUDA 4.0: Use readVirtualReturnAddress instead.

See also:

readVirtualReturnAddress

Parameters
dev
- device index
sm
- SM index
wp
- warp index
level
- the specified call level
ra
- the returned virtual return address for level
Returns

CUDBG_SUCCESS, CUDBG_ERROR_INVALID_ARGS, CUDBG_ERROR_UNINITIALIZED, CUDBG_ERROR_INTERNAL, CUDBG_ERROR_INVALID_WARP, CUDBG_ERROR_INVALID_LANE, CUDBG_ERROR_INITIALIZATION_FAILURE, CUDBG_ERROR_INVALID_CALL_LEVEL, CUDBG_ERROR_INVALID_CONTEXT, CUDBG_ERROR_RECURSIVE_API_CALL

CUDBGResult ( *CUDBGAPI_st::readWarpResources )( uint32_t dev, uint32_t sm, uint32_t wp, CUDBGWarpResources* resources )

Get the resources assigned to a given warp. Note that these resources can change between suspends, which makes this method useful for avoiding warp data access errors.

Since CUDA 12.8.

Parameters
dev
- device index
sm
- SM index
wp
- warp index
resources
- pointer to structure that contains warp resources
Returns

CUDBG_SUCCESS, CUDBG_ERROR_INVALID_ARGS, CUDBG_ERROR_UNINITIALIZED, CUDBG_ERROR_INTERNAL, CUDBG_ERROR_INVALID_WARP, CUDBG_ERROR_INITIALIZATION_FAILURE, CUDBG_ERROR_INVALID_CONTEXT, CUDBG_ERROR_RECURSIVE_API_CALL

CUDBGResult ( *CUDBGAPI_st::readWarpState )( uint32_t dev, uint32_t sm, uint32_t wp, CUDBGWarpState* state )

Read the state of a given warp. Since CUDA 12.9.

Parameters
dev
- device index
sm
- SM index
wp
- warp index
state
- pointer to structure that contains warp state
Returns

CUDBG_SUCCESS, CUDBG_ERROR_INVALID_ARGS, CUDBG_ERROR_UNINITIALIZED, CUDBG_ERROR_INTERNAL, CUDBG_ERROR_INVALID_WARP, CUDBG_ERROR_INITIALIZATION_FAILURE, CUDBG_ERROR_INVALID_GRID, CUDBG_ERROR_INVALID_CONTEXT, CUDBG_ERROR_RECURSIVE_API_CALL

CUDBGResult ( *CUDBGAPI_st::readWarpState120 )( uint32_t dev, uint32_t sm, uint32_t wp, CUDBGWarpState120* state )

Read the state of a given warp. Behaves like readWarpState but returns fewer fields.

Since CUDA 12.0.

Note:

DEPRECATED in CUDA 12.7: Use readWarpState instead.

See also:

readWarpState

Parameters
dev
- device index
sm
- SM index
wp
- warp index
state
- pointer to structure that contains warp state
Returns

CUDBG_SUCCESS, CUDBG_ERROR_INVALID_ARGS, CUDBG_ERROR_UNINITIALIZED, CUDBG_ERROR_INTERNAL, CUDBG_ERROR_INVALID_WARP, CUDBG_ERROR_INITIALIZATION_FAILURE, CUDBG_ERROR_INVALID_GRID, CUDBG_ERROR_INVALID_CONTEXT, CUDBG_ERROR_RECURSIVE_API_CALL

CUDBGResult ( *CUDBGAPI_st::readWarpState127 )( uint32_t dev, uint32_t sm, uint32_t wp, CUDBGWarpState127* state )

Read the state of a given warp. Behaves like readWarpState but returns fewer fields.

Since CUDA 12.7.

Note:

DEPRECATED in CUDA 12.9: Use readWarpState instead.

See also:

readWarpState

Parameters
dev
- device index
sm
- SM index
wp
- warp index
state
- pointer to structure that contains warp state
Returns

CUDBG_SUCCESS, CUDBG_ERROR_INVALID_ARGS, CUDBG_ERROR_UNINITIALIZED, CUDBG_ERROR_INTERNAL, CUDBG_ERROR_INVALID_WARP, CUDBG_ERROR_INITIALIZATION_FAILURE, CUDBG_ERROR_INVALID_GRID, CUDBG_ERROR_INVALID_CONTEXT, CUDBG_ERROR_RECURSIVE_API_CALL

CUDBGResult ( *CUDBGAPI_st::readWarpState60 )( uint32_t dev, uint32_t sm, uint32_t wp, CUDBGWarpState60* state )

Read the state of a given warp. Behaves like readWarpState but returns fewer fields.

Since CUDA 6.0.

Note:

DEPRECATED in CUDA 12.0: Use readWarpState instead.

See also:

readWarpState

Parameters
dev
- device index
sm
- SM index
wp
- warp index
state
- pointer to structure that contains warp state
Returns

CUDBG_SUCCESS, CUDBG_ERROR_INVALID_ARGS, CUDBG_ERROR_UNINITIALIZED, CUDBG_ERROR_INTERNAL, CUDBG_ERROR_INVALID_WARP, CUDBG_ERROR_INITIALIZATION_FAILURE, CUDBG_ERROR_INVALID_GRID, CUDBG_ERROR_INVALID_CONTEXT, CUDBG_ERROR_RECURSIVE_API_CALL

CUDBGResult ( *CUDBGAPI_st::removeBreakpoint )( CUDBGBreakpointHandle handle )

Remove a breakpoint specified by its handle. Since CUDA 13.2.

See also:

disableBreakpoint

enableBreakpoint

getWarpHitBreakpoint

insertBreakpoint

isBreakpointEnabled

Parameters
handle
- the breakpoint handle
Returns

CUDBG_SUCCESS, CUDBG_ERROR_INVALID_ARGS, CUDBG_ERROR_UNINITIALIZED, CUDBG_ERROR_INTERNAL, CUDBG_ERROR_INITIALIZATION_FAILURE, CUDBG_ERROR_RECURSIVE_API_CALL

CUDBGResult ( *CUDBGAPI_st::requestCleanupOnDetach )( uint32_t appResumeFlag )

Request for cleanup of driver state when detaching. Needs to be conditionally called by the client depending on the state of the debugged application. See the "Attaching and Detaching" section for more information.

Since CUDA 6.0.

Parameters
appResumeFlag
- value of CUDBG_RESUME_FOR_ATTACH_DETACH as read from the application's process space.
Returns

CUDBG_SUCCESS, CUDBG_ERROR_INVALID_ARGS, CUDBG_ERROR_UNINITIALIZED, CUDBG_ERROR_INTERNAL, CUDBG_ERROR_INITIALIZATION_FAILURE, CUDBG_ERROR_RECURSIVE_API_CALL

CUDBGResult ( *CUDBGAPI_st::requestCleanupOnDetach55 )( )

Request for cleanup of driver state when detaching. Needs to be conditionally called by the client depending on the state of the debugged application. See the "Attaching and Detaching" section for more information.

Since CUDA 5.0.

Note:

DEPRECATED in CUDA 6.0: Use requestCleanupOnDetach instead.

See also:

requestCleanupOnDetach

Returns

CUDBG_SUCCESS, CUDBG_ERROR_INVALID_ARGS, CUDBG_ERROR_UNINITIALIZED, CUDBG_ERROR_INTERNAL, CUDBG_ERROR_INITIALIZATION_FAILURE, CUDBG_ERROR_RECURSIVE_API_CALL

CUDBGResult ( *CUDBGAPI_st::resumeAllDevices )( )

Resume all running CUDA devices. Since CUDA 13.2.

See also:

suspendAllDevices

Returns

CUDBG_SUCCESS, CUDBG_ERROR_INVALID_ARGS, CUDBG_ERROR_UNINITIALIZED, CUDBG_ERROR_INTERNAL, CUDBG_ERROR_RUNNING_DEVICE, CUDBG_ERROR_INITIALIZATION_FAILURE, CUDBG_ERROR_RECURSIVE_API_CALL

CUDBGResult ( *CUDBGAPI_st::resumeDevice )( uint32_t dev )

Resume a suspended CUDA device. Using this method is discouraged; use resumeAllDevices() instead to avoid race conditions. This method has no effect if the device is already running.

Since CUDA 3.0.

See also:

resumeAllDevices

Parameters
dev
- device index
Returns

CUDBG_SUCCESS, CUDBG_ERROR_INVALID_ARGS, CUDBG_ERROR_UNINITIALIZED, CUDBG_ERROR_INTERNAL, CUDBG_ERROR_RUNNING_DEVICE, CUDBG_ERROR_INITIALIZATION_FAILURE, CUDBG_ERROR_RECURSIVE_API_CALL

CUDBGResult ( *CUDBGAPI_st::resumeWarpsUntilPC )( uint32_t dev, uint32_t sm, uint64_t warpMask, uint64_t pc, uint32_t flags )

Insert a temporary breakpoint at the specified virtual PC and resume all warps in the specified bitmask on a given SM. Compared to resumeDevice(), this method provides finer-grained control by resuming a selected set of warps on the same SM. Its main intended use is to accelerate single-stepping when the target PC is known in advance: instead of single-stepping each warp individually until the target PC is hit, the client can use this method. If the resumed warps hit an unsteppable barrier, this method returns early (before reaching the target PC). When this method is used, errors within CUDA kernels are no longer reported precisely. If resuming warps is not possible, this method returns CUDBG_ERROR_WARP_RESUME_NOT_POSSIBLE, and the client should fall back to singleStepWarp() or resumeDevice().

Since CUDA 13.2.

See also:

resumeAllDevices

singleStepWarp

Parameters
dev
- device index
sm
- the SM index
warpMask
- the bitmask of warps to resume (1 = resume, 0 = do not resume)
pc
- the virtual PC where the temporary breakpoint will be inserted
flags
- flags of type CUDBGSingleStepFlags to change the stepping behavior
Returns

CUDBG_SUCCESS, CUDBG_ERROR_UNKNOWN_FUNCTION, CUDBG_ERROR_INVALID_ARGS, CUDBG_ERROR_UNINITIALIZED, CUDBG_ERROR_INTERNAL, CUDBG_ERROR_RUNNING_DEVICE, CUDBG_ERROR_INVALID_ADDRESS, CUDBG_ERROR_INITIALIZATION_FAILURE, CUDBG_ERROR_INVALID_CONTEXT, CUDBG_ERROR_WARP_RESUME_NOT_POSSIBLE, CUDBG_ERROR_INVALID_WARP_MASK, CUDBG_ERROR_RECURSIVE_API_CALL
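The warpMask argument and the documented fallback can be sketched as follows. This is a minimal illustration, assuming bit i of warpMask selects warp index i on the SM. The helpers build_warp_mask and should_fall_back_to_single_step are hypothetical, and the enum is a stand-in with arbitrary values so the example compiles without cudadebugger.h.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-ins so the sketch compiles without cudadebugger.h. With the real
 * header, delete these; the numeric values here are arbitrary. */
typedef enum {
    CUDBG_SUCCESS = 0,
    CUDBG_ERROR_WARP_RESUME_NOT_POSSIBLE = 1,
    CUDBG_ERROR_INVALID_WARP_MASK = 2
} CUDBGResult;

/* Hypothetical helper: set bit i for every warp index i in `warps`.
 * Returns 0 (nothing to resume) if any index does not fit in 64 bits. */
uint64_t build_warp_mask(const uint32_t *warps, size_t n)
{
    uint64_t mask = 0;
    for (size_t i = 0; i < n; i++) {
        if (warps[i] >= 64)
            return 0;
        mask |= UINT64_C(1) << warps[i];
    }
    return mask;
}

/* Hypothetical helper: nonzero when the result of resumeWarpsUntilPC
 * tells the client to fall back to singleStepWarp() or resumeDevice(). */
int should_fall_back_to_single_step(CUDBGResult res)
{
    return res == CUDBG_ERROR_WARP_RESUME_NOT_POSSIBLE;
}
```

A client would pass the mask as warpMask and test the returned CUDBGResult with the second helper before choosing a slower stepping path.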

CUDBGResult ( *CUDBGAPI_st::resumeWarpsUntilPC60 )( uint32_t dev, uint32_t sm, uint64_t warpMask, uint64_t virtPC )

Insert a temporary breakpoint at the specified virtual PC and resume all warps in the specified bitmask on a given SM. Compared to resumeDevice(), this method provides finer-grained control by resuming a selected set of warps on the same SM. Its main intended use is to accelerate single-stepping when the target PC is known in advance: instead of single-stepping each warp individually until the target PC is hit, the client can use this method. If the resumed warps hit an unsteppable barrier, this method returns early (before reaching the target PC). When this method is used, errors within CUDA kernels are no longer reported precisely. If resuming warps is not possible, this method returns CUDBG_ERROR_WARP_RESUME_NOT_POSSIBLE, and the client should fall back to singleStepWarp() or resumeDevice().

Since CUDA 6.0.

Note:

DEPRECATED in CUDA 13.2: Use resumeWarpsUntilPC instead.

See also:

resumeWarpsUntilPC

Parameters
dev
- device index
sm
- SM index
warpMask
- the bitmask of warps to resume (1 = resume, 0 = do not resume)
virtPC
- the virtual PC where the temporary breakpoint will be inserted
Returns

CUDBG_SUCCESS, CUDBG_ERROR_UNKNOWN_FUNCTION, CUDBG_ERROR_INVALID_ARGS, CUDBG_ERROR_UNINITIALIZED, CUDBG_ERROR_INTERNAL, CUDBG_ERROR_RUNNING_DEVICE, CUDBG_ERROR_INVALID_ADDRESS, CUDBG_ERROR_INITIALIZATION_FAILURE, CUDBG_ERROR_INVALID_CONTEXT, CUDBG_ERROR_WARP_RESUME_NOT_POSSIBLE, CUDBG_ERROR_INVALID_WARP_MASK, CUDBG_ERROR_RECURSIVE_API_CALL

CUDBGResult ( *CUDBGAPI_st::setBreakpoint )( uint32_t dev, uint64_t addr )

Set a breakpoint at the given instruction address for the given device. Before setting a breakpoint, getAdjustedCodeAddress() should be called to get the adjusted breakpoint address.

Since CUDA 3.2.

See also:

unsetBreakpoint

Parameters
dev
- device index
addr
- instruction address
Returns

CUDBG_SUCCESS, CUDBG_ERROR_INVALID_ARGS, CUDBG_ERROR_UNINITIALIZED, CUDBG_ERROR_INTERNAL, CUDBG_ERROR_INVALID_ADDRESS, CUDBG_ERROR_INITIALIZATION_FAILURE, CUDBG_ERROR_RECURSIVE_API_CALL
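The adjust-then-set sequence described above can be sketched as below. DemoApi is a hypothetical stand-in that holds only the two members used here, with the parameter lists given in this reference. CUDBGAdjAddrAction is reduced to an int stand-in and its value is passed through from the caller, since the enumerators are not listed in this section. The fake backend exists only to make the example self-contained and does not model real address adjustment.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-ins so the sketch compiles without cudadebugger.h. With the real
 * header, delete these; the numeric values here are arbitrary. */
typedef enum { CUDBG_SUCCESS = 0, CUDBG_ERROR_INVALID_ARGS = 1 } CUDBGResult;
typedef int CUDBGAdjAddrAction;   /* stand-in; an enum in the real header */

/* Hypothetical subset of CUDBGAPI_st with only the members used here. */
typedef struct {
    CUDBGResult (*getAdjustedCodeAddress)(uint32_t dev, uint64_t address,
                                          uint64_t *adjustedAddress,
                                          CUDBGAdjAddrAction adjAction);
    CUDBGResult (*setBreakpoint)(uint32_t dev, uint64_t addr);
} DemoApi;

/* Hypothetical helper: adjust `addr`, set the breakpoint at the adjusted
 * address, and report where it was placed so it can be unset later. */
CUDBGResult set_adjusted_breakpoint(const DemoApi *api, uint32_t dev,
                                    uint64_t addr, CUDBGAdjAddrAction action,
                                    uint64_t *placed_at)
{
    if (api == NULL || placed_at == NULL)
        return CUDBG_ERROR_INVALID_ARGS;
    uint64_t adjusted = 0;
    CUDBGResult res = api->getAdjustedCodeAddress(dev, addr, &adjusted, action);
    if (res != CUDBG_SUCCESS)
        return res;
    res = api->setBreakpoint(dev, adjusted);
    if (res == CUDBG_SUCCESS)
        *placed_at = adjusted;
    return res;
}

/* Fake backend for demonstration only. */
uint64_t fake_last_breakpoint = 0;

CUDBGResult fake_adjust(uint32_t dev, uint64_t address,
                        uint64_t *adjustedAddress, CUDBGAdjAddrAction adjAction)
{
    (void)dev; (void)adjAction;
    *adjustedAddress = address & ~UINT64_C(0xF);  /* made-up rounding rule */
    return CUDBG_SUCCESS;
}

CUDBGResult fake_set_breakpoint(uint32_t dev, uint64_t addr)
{
    (void)dev;
    fake_last_breakpoint = addr;
    return CUDBG_SUCCESS;
}
```

Returning the adjusted address lets the client pass the same value to unsetBreakpoint later.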

CUDBGResult ( *CUDBGAPI_st::setBreakpoint31 )( uint64_t addr )

Set a breakpoint at the given instruction address. Behaves like setBreakpoint but tries to automatically find a device for the given address.

Since CUDA 3.0.

Note:

DEPRECATED in CUDA 3.2: Use setBreakpoint instead.

See also:

setBreakpoint

Parameters
addr
- instruction address
Returns

CUDBG_SUCCESS, CUDBG_ERROR_INVALID_ARGS, CUDBG_ERROR_UNINITIALIZED, CUDBG_ERROR_INTERNAL, CUDBG_ERROR_INVALID_ADDRESS, CUDBG_ERROR_INITIALIZATION_FAILURE, CUDBG_ERROR_RECURSIVE_API_CALL

CUDBGResult ( *CUDBGAPI_st::setKernelLaunchNotificationMode )( CUDBGKernelLaunchNotifyMode mode )

Set the launch notification policy. If mode is CUDBG_KNL_LAUNCH_NOTIFY_EVENT, enable synchronous launch notification reporting (via events). This can noticeably slow down the execution of the application. If mode is CUDBG_KNL_LAUNCH_NOTIFY_DEFER, the launch notifications are not reported at all.

Since CUDA 5.5.

Parameters
mode
- mode to deliver kernel launch notifications in
Returns

CUDBG_SUCCESS, CUDBG_ERROR_INVALID_ARGS, CUDBG_ERROR_UNINITIALIZED, CUDBG_ERROR_INTERNAL, CUDBG_ERROR_INITIALIZATION_FAILURE, CUDBG_ERROR_RECURSIVE_API_CALL

CUDBGResult ( *CUDBGAPI_st::setNotifyNewEventCallback )( CUDBGNotifyNewEventCallback callback, void* userData )

Provides the API with the function to call to notify the debugger of a new application or device event. The callback function is called for every ASYNC and SYNC event. The callback function is always called on the same thread. No API methods can be called from that thread except getNextEvent() and acknowledgeSyncEvents() (and their deprecated variants), otherwise CUDBG_ERROR_RECURSIVE_API_CALL will be returned.

Since CUDA 13.0.

See also:

acknowledgeSyncEvents

getNextEvent

Parameters
callback
- the callback function
userData
- a pointer to be passed to the callback when called
Returns

CUDBG_SUCCESS, CUDBG_ERROR_INVALID_ARGS, CUDBG_ERROR_INITIALIZATION_FAILURE, CUDBG_ERROR_RECURSIVE_API_CALL

CUDBGResult ( *CUDBGAPI_st::setNotifyNewEventCallback31 )( CUDBGNotifyNewEventCallback31 callback, void* data )

Provides the API with the function to call to notify the debugger of a new application or device event. Behaves like setNotifyNewEventCallback but doesn't return the host thread ID from which the event originates.

Since CUDA 3.0.

Note:

DEPRECATED in CUDA 3.2: Use setNotifyNewEventCallback instead.

See also:

setNotifyNewEventCallback

Parameters
callback
- the callback function
data
- a pointer to be passed to the callback when called
Returns

CUDBG_SUCCESS, CUDBG_ERROR_INVALID_ARGS, CUDBG_ERROR_INITIALIZATION_FAILURE, CUDBG_ERROR_RECURSIVE_API_CALL

CUDBGResult ( *CUDBGAPI_st::setNotifyNewEventCallback40 )( CUDBGNotifyNewEventCallback40 callback )

Provides the API with the function to call to notify the debugger of a new application or device event. Behaves like setNotifyNewEventCallback but doesn't allow passing in the user data pointer.

Since CUDA 3.2.

Note:

DEPRECATED in CUDA 4.1: Use setNotifyNewEventCallback instead.

See also:

setNotifyNewEventCallback

Parameters
callback
- the callback function
Returns

CUDBG_SUCCESS, CUDBG_ERROR_INVALID_ARGS, CUDBG_ERROR_INITIALIZATION_FAILURE, CUDBG_ERROR_RECURSIVE_API_CALL

CUDBGResult ( *CUDBGAPI_st::setNotifyNewEventCallback41 )( CUDBGNotifyNewEventCallback41 callback )

Provides the API with the function to call to notify the debugger of a new application or device event. Behaves like setNotifyNewEventCallback but doesn't allow passing in the user data pointer. The timeout field is always 0.

Since CUDA 4.1.

Note:

DEPRECATED in CUDA 13.0: Use setNotifyNewEventCallback instead.

See also:

setNotifyNewEventCallback

Parameters
callback
- the callback function
Returns

CUDBG_SUCCESS, CUDBG_ERROR_INVALID_ARGS, CUDBG_ERROR_INITIALIZATION_FAILURE, CUDBG_ERROR_RECURSIVE_API_CALL

CUDBGResult ( *CUDBGAPI_st::singleStepWarp )( uint32_t dev, uint32_t sm, uint32_t wp, uint32_t laneHint, uint32_t nsteps, uint32_t flags, uint64_t* warpMask )

Single step an individual warp nsteps times on a suspended CUDA device. By default, if the warp is on a convergence barrier, resumeWarpsUntilPC is called internally to quickly advance the warp past that barrier. If the CUDBG_SINGLE_STEP_FLAGS_NO_STEP_OVER_WARP_BARRIERS flag is passed in, this optimization is not performed (which would likely lead to diverged threads becoming focused and starting to advance towards the convergence barrier). If a warp is on a block-wide barrier (or wider), other warps required to advance past the barrier are automatically resumed. The output parameter warpMask reports the warps resumed on the current SM. Warps may also be resumed on other SMs, but those are not reported via the API. This method is synchronous and will not return until the step is complete.

Since CUDA 12.4.

See also:

resumeAllDevices

resumeWarpsUntilPC

suspendAllDevices

Parameters
dev
- device index
sm
- SM index
wp
- warp index
laneHint
- focused lane (~0 to let the API decide)
nsteps
- number of single steps
flags
- flags of type CUDBGSingleStepFlags to change the stepping behavior
warpMask
- the warps that have been single-stepped
Returns

CUDBG_SUCCESS, CUDBG_ERROR_UNKNOWN_FUNCTION, CUDBG_ERROR_INVALID_ARGS, CUDBG_ERROR_UNINITIALIZED, CUDBG_ERROR_INTERNAL, CUDBG_ERROR_INVALID_WARP, CUDBG_ERROR_RUNNING_DEVICE, CUDBG_ERROR_INVALID_ADDRESS, CUDBG_ERROR_INITIALIZATION_FAILURE, CUDBG_ERROR_INVALID_CONTEXT, CUDBG_ERROR_WARP_RESUME_NOT_POSSIBLE, CUDBG_ERROR_INVALID_WARP_MASK, CUDBG_ERROR_RECURSIVE_API_CALL
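A client that wants to inspect state between steps may prefer to call singleStepWarp with nsteps set to 1 in a loop. The sketch below does that, passes ~0 as laneHint so the API picks the focused lane, forwards flags unchanged, and ORs together the warpMask reported by each step. SingleStepWarpFn, step_and_collect, the fake stepper, and the stand-in enum values are hypothetical and only make the example compile without cudadebugger.h.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-ins so the sketch compiles without cudadebugger.h. With the real
 * header, delete these; the numeric values here are arbitrary. */
typedef enum { CUDBG_SUCCESS = 0, CUDBG_ERROR_INVALID_ARGS = 1 } CUDBGResult;

/* Same parameter list as the singleStepWarp member documented above. */
typedef CUDBGResult (*SingleStepWarpFn)(uint32_t dev, uint32_t sm, uint32_t wp,
                                        uint32_t laneHint, uint32_t nsteps,
                                        uint32_t flags, uint64_t *warpMask);

/* Hypothetical helper: step warp `wp` one instruction at a time, `count`
 * times, and accumulate the warps reported as resumed on this SM. */
CUDBGResult step_and_collect(SingleStepWarpFn singleStepWarp,
                             uint32_t dev, uint32_t sm, uint32_t wp,
                             uint32_t count, uint32_t flags, uint64_t *resumed)
{
    if (singleStepWarp == NULL || resumed == NULL)
        return CUDBG_ERROR_INVALID_ARGS;
    *resumed = 0;
    for (uint32_t i = 0; i < count; i++) {
        uint64_t mask = 0;
        /* UINT32_MAX is ~0 for a uint32_t: let the API choose the lane. */
        CUDBGResult res = singleStepWarp(dev, sm, wp, UINT32_MAX, 1, flags, &mask);
        if (res != CUDBG_SUCCESS)
            return res;
        *resumed |= mask;
    }
    return CUDBG_SUCCESS;
}

/* Fake stepper for demonstration only: reports just the stepped warp. */
CUDBGResult fake_single_step(uint32_t dev, uint32_t sm, uint32_t wp,
                             uint32_t laneHint, uint32_t nsteps,
                             uint32_t flags, uint64_t *warpMask)
{
    (void)dev; (void)sm; (void)flags;
    if (laneHint != UINT32_MAX || nsteps != 1 || wp >= 64)
        return CUDBG_ERROR_INVALID_ARGS;
    *warpMask = UINT64_C(1) << wp;
    return CUDBG_SUCCESS;
}
```

Between iterations a real client could read the warp's PC or registers; that part is omitted here.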

CUDBGResult ( *CUDBGAPI_st::singleStepWarp40 )( uint32_t dev, uint32_t sm, uint32_t wp )

Single step an individual warp on a suspended CUDA device. Behaves like singleStepWarp41 without the output warpMask parameter.

Since CUDA 3.0.

Note:

DEPRECATED in CUDA 4.1: Use singleStepWarp instead.

See also:

singleStepWarp

Parameters
dev
- device index
sm
- SM index
wp
- warp index
Returns

CUDBG_SUCCESS, CUDBG_ERROR_UNKNOWN_FUNCTION, CUDBG_ERROR_INVALID_ARGS, CUDBG_ERROR_UNINITIALIZED, CUDBG_ERROR_INTERNAL, CUDBG_ERROR_INVALID_WARP, CUDBG_ERROR_RUNNING_DEVICE, CUDBG_ERROR_INVALID_ADDRESS, CUDBG_ERROR_INITIALIZATION_FAILURE, CUDBG_ERROR_INVALID_CONTEXT, CUDBG_ERROR_WARP_RESUME_NOT_POSSIBLE, CUDBG_ERROR_INVALID_WARP_MASK, CUDBG_ERROR_RECURSIVE_API_CALL

CUDBGResult ( *CUDBGAPI_st::singleStepWarp41 )( uint32_t dev, uint32_t sm, uint32_t wp, uint64_t* warpMask )

Single step an individual warp on a suspended CUDA device. Behaves like singleStepWarp65 with nsteps set to 1.

Since CUDA 4.1.

Note:

DEPRECATED in CUDA 6.5: Use singleStepWarp instead.

See also:

singleStepWarp

Parameters
dev
- device index
sm
- SM index
wp
- warp index
warpMask
- the warps that have been single-stepped
Returns

CUDBG_SUCCESS, CUDBG_ERROR_UNKNOWN_FUNCTION, CUDBG_ERROR_INVALID_ARGS, CUDBG_ERROR_UNINITIALIZED, CUDBG_ERROR_INTERNAL, CUDBG_ERROR_INVALID_WARP, CUDBG_ERROR_RUNNING_DEVICE, CUDBG_ERROR_INVALID_ADDRESS, CUDBG_ERROR_INITIALIZATION_FAILURE, CUDBG_ERROR_INVALID_CONTEXT, CUDBG_ERROR_WARP_RESUME_NOT_POSSIBLE, CUDBG_ERROR_INVALID_WARP_MASK, CUDBG_ERROR_RECURSIVE_API_CALL

CUDBGResult ( *CUDBGAPI_st::singleStepWarp65 )( uint32_t dev, uint32_t sm, uint32_t wp, uint32_t nsteps, uint64_t* warpMask )

Single step an individual warp nsteps times on a suspended CUDA device. Behaves like singleStepWarp with no lane hint and the CUDBG_SINGLE_STEP_FLAGS_NO_STEP_OVER_WARP_BARRIERS flag set.

Since CUDA 6.5.

Note:

DEPRECATED in CUDA 12.4: Use singleStepWarp instead.

See also:

singleStepWarp

Parameters
dev
- device index
sm
- SM index
wp
- warp index
nsteps
- number of single steps
warpMask
- the warps that have been single-stepped
Returns

CUDBG_SUCCESS, CUDBG_ERROR_UNKNOWN_FUNCTION, CUDBG_ERROR_INVALID_ARGS, CUDBG_ERROR_UNINITIALIZED, CUDBG_ERROR_INTERNAL, CUDBG_ERROR_INVALID_WARP, CUDBG_ERROR_RUNNING_DEVICE, CUDBG_ERROR_INVALID_ADDRESS, CUDBG_ERROR_INITIALIZATION_FAILURE, CUDBG_ERROR_INVALID_CONTEXT, CUDBG_ERROR_WARP_RESUME_NOT_POSSIBLE, CUDBG_ERROR_INVALID_WARP_MASK, CUDBG_ERROR_RECURSIVE_API_CALL

CUDBGResult ( *CUDBGAPI_st::suspendAllDevices )( uint32_t nonBlocking )

Suspend all running CUDA devices. If the nonBlocking flag is non-zero, the function returns immediately and sends CUDBG_EVENT_ALL_DEVICES_SUSPENDED when the operation finishes in the background. Otherwise, if the function returns with CUDBG_SUCCESS, that guarantees that all devices have been suspended.

Since CUDA 13.2.

See also:

resumeAllDevices

Parameters
nonBlocking
- whether or not asynchronous operation is desired
Returns

CUDBG_SUCCESS, CUDBG_ERROR_INVALID_ARGS, CUDBG_ERROR_UNINITIALIZED, CUDBG_ERROR_INTERNAL, CUDBG_ERROR_SUSPENDED_DEVICE, CUDBG_ERROR_INITIALIZATION_FAILURE, CUDBG_ERROR_RECURSIVE_API_CALL
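A common pattern is to suspend all devices with a blocking call, inspect state, and then resume. The sketch below illustrates that pairing under stated assumptions: DemoApi is a hypothetical stand-in holding only suspendAllDevices and resumeAllDevices, InspectFn and with_devices_suspended are hypothetical names, and the fake backend and enum values exist only so the example compiles without cudadebugger.h.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-ins so the sketch compiles without cudadebugger.h. With the real
 * header, delete these; the numeric values here are arbitrary. */
typedef enum {
    CUDBG_SUCCESS = 0,
    CUDBG_ERROR_INVALID_ARGS = 1,
    CUDBG_ERROR_SUSPENDED_DEVICE = 2,
    CUDBG_ERROR_RUNNING_DEVICE = 3
} CUDBGResult;

/* Hypothetical subset of CUDBGAPI_st with only the members used here. */
typedef struct {
    CUDBGResult (*suspendAllDevices)(uint32_t nonBlocking);
    CUDBGResult (*resumeAllDevices)(void);
} DemoApi;

typedef CUDBGResult (*InspectFn)(void *ctx);

/* Hypothetical helper: blocking suspend, run `inspect`, then resume even
 * if `inspect` failed. The first failure encountered is returned. */
CUDBGResult with_devices_suspended(const DemoApi *api, InspectFn inspect, void *ctx)
{
    if (api == NULL || inspect == NULL)
        return CUDBG_ERROR_INVALID_ARGS;
    CUDBGResult res = api->suspendAllDevices(0);   /* 0: wait until suspended */
    if (res != CUDBG_SUCCESS)
        return res;
    CUDBGResult inspect_res = inspect(ctx);
    res = api->resumeAllDevices();
    return inspect_res != CUDBG_SUCCESS ? inspect_res : res;
}

/* Fake backend for demonstration only. */
int fake_suspended = 0;

CUDBGResult fake_suspend_all(uint32_t nonBlocking)
{
    (void)nonBlocking;
    if (fake_suspended)
        return CUDBG_ERROR_SUSPENDED_DEVICE;
    fake_suspended = 1;
    return CUDBG_SUCCESS;
}

CUDBGResult fake_resume_all(void)
{
    if (!fake_suspended)
        return CUDBG_ERROR_RUNNING_DEVICE;
    fake_suspended = 0;
    return CUDBG_SUCCESS;
}

CUDBGResult fake_inspect(void *ctx)
{
    *(int *)ctx = fake_suspended;   /* record whether devices were suspended */
    return CUDBG_SUCCESS;
}
```

With a non-zero nonBlocking argument the client would instead wait for CUDBG_EVENT_ALL_DEVICES_SUSPENDED before inspecting; that event-driven variant is not shown.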

CUDBGResult ( *CUDBGAPI_st::suspendDevice )( uint32_t dev )

Suspends a running CUDA device. Using this method is discouraged; use suspendAllDevices() instead to avoid race conditions. The device has to be suspended in order to execute most operations on it. CUDBG_ERROR_SUSPENDED_DEVICE is returned if the device is already suspended.

Since CUDA 3.0.

See also:

suspendAllDevices

Parameters
dev
- device index
Returns

CUDBG_SUCCESS, CUDBG_ERROR_INVALID_ARGS, CUDBG_ERROR_UNINITIALIZED, CUDBG_ERROR_SUSPENDED_DEVICE, CUDBG_ERROR_INITIALIZATION_FAILURE, CUDBG_ERROR_RECURSIVE_API_CALL

CUDBGResult ( *CUDBGAPI_st::unsetBreakpoint )( uint32_t dev, uint64_t addr )

Unset a breakpoint at the given instruction address for the given device. Since CUDA 3.2.

See also:

setBreakpoint

Parameters
dev
- device index
addr
- instruction address
Returns

CUDBG_SUCCESS, CUDBG_ERROR_INVALID_ARGS, CUDBG_ERROR_UNINITIALIZED, CUDBG_ERROR_INTERNAL, CUDBG_ERROR_INITIALIZATION_FAILURE, CUDBG_ERROR_RECURSIVE_API_CALL

CUDBGResult ( *CUDBGAPI_st::unsetBreakpoint31 )( uint64_t addr )

Unset a breakpoint at the given instruction address. Behaves like unsetBreakpoint but tries to automatically find a device for the given address.

Since CUDA 3.0.

Note:

DEPRECATED in CUDA 3.2: Use unsetBreakpoint instead.

See also:

unsetBreakpoint

Parameters
addr
- instruction address
Returns

CUDBG_SUCCESS, CUDBG_ERROR_INVALID_ARGS, CUDBG_ERROR_UNINITIALIZED, CUDBG_ERROR_INTERNAL, CUDBG_ERROR_INVALID_ADDRESS, CUDBG_ERROR_INITIALIZATION_FAILURE, CUDBG_ERROR_RECURSIVE_API_CALL

CUDBGResult ( *CUDBGAPI_st::writeCCRegister )( uint32_t dev, uint32_t sm, uint32_t wp, uint32_t ln, uint32_t val )

Write to the hardware CC register. The CC register is no longer available in the supported hardware.

Since CUDA 6.5.

Note:

DEPRECATED in CUDA 13.1: Do not use.

Parameters
dev
- device index
sm
- SM index
wp
- warp index
ln
- lane index
val
- the new value of the CC register
Returns

CUDBG_SUCCESS, CUDBG_ERROR_INVALID_ARGS, CUDBG_ERROR_UNINITIALIZED, CUDBG_ERROR_INITIALIZATION_FAILURE, CUDBG_ERROR_RECURSIVE_API_CALL

CUDBGResult ( *CUDBGAPI_st::writeGenericMemory )( uint32_t dev, uint32_t sm, uint32_t wp, uint32_t ln, uint64_t addr, const void* buf, uint32_t sz )

Write to an address in any memory segment. The address will be used to determine whether the write is to local, shared or global memory. The target address range should entirely reside within a single memory segment. Coordinate arguments are only used when relevant. They should be provided for the following segments:

  • Shared memory: SM and Warp

  • Local memory: SM, Warp and Lane

Since CUDA 6.0.

See also:

writeGlobalMemory

writeLocalMemory

writeParamMemory

writeSharedMemory

Parameters
dev
- device index
sm
- SM index
wp
- warp index
ln
- lane index
addr
- address
buf
- buffer
sz
- buffer size in bytes
Returns

CUDBG_SUCCESS, CUDBG_ERROR_UNKNOWN, CUDBG_ERROR_INVALID_ARGS, CUDBG_ERROR_UNINITIALIZED, CUDBG_ERROR_INVALID_MEMORY_ACCESS, CUDBG_ERROR_INTERNAL, CUDBG_ERROR_INVALID_WARP, CUDBG_ERROR_INITIALIZATION_FAILURE, CUDBG_ERROR_INVALID_CONTEXT, CUDBG_ERROR_ADDRESS_NOT_IN_DEVICE_MEM, CUDBG_ERROR_AMBIGUOUS_MEMORY_ADDRESS, CUDBG_ERROR_RECURSIVE_API_CALL, CUDBG_ERROR_NOT_SUPPORTED
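The requirement that the target range reside entirely within one memory segment can be checked on the client side when the segment bounds are known. The sketch below is only an illustration: the API described here does not expose segment bounds, so seg_base and seg_size are assumed to come from elsewhere, and range_within_segment is a hypothetical helper.

```c
#include <stdint.h>

/* Hypothetical helper: nonzero when [addr, addr + sz) lies entirely inside
 * [seg_base, seg_base + seg_size). Written to avoid 64-bit overflow. */
int range_within_segment(uint64_t seg_base, uint64_t seg_size,
                         uint64_t addr, uint32_t sz)
{
    if (addr < seg_base)
        return 0;
    uint64_t offset = addr - seg_base;
    if (offset > seg_size)
        return 0;
    return (uint64_t)sz <= seg_size - offset;
}
```

Comparing sz against the remaining space, rather than computing addr + sz, keeps the check correct for ranges near the top of the address space.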

CUDBGResult ( *CUDBGAPI_st::writeGlobalMemory )( uint64_t addr, const void* buf, uint32_t sz )

Write to an address in global memory. It's not possible to access a shared memory page or an ambiguous address allocated on several devices that don't support UVA.

Since CUDA 6.0.

See also:

writeGenericMemory

writeLocalMemory

writeParamMemory

writeSharedMemory

Parameters
addr
- address
buf
- buffer
sz
- buffer size in bytes
Returns

CUDBG_SUCCESS, CUDBG_ERROR_UNKNOWN, CUDBG_ERROR_INVALID_ARGS, CUDBG_ERROR_UNINITIALIZED, CUDBG_ERROR_INVALID_MEMORY_ACCESS, CUDBG_ERROR_INITIALIZATION_FAILURE, CUDBG_ERROR_ADDRESS_NOT_IN_DEVICE_MEM, CUDBG_ERROR_AMBIGUOUS_MEMORY_ADDRESS, CUDBG_ERROR_RECURSIVE_API_CALL, CUDBG_ERROR_NOT_SUPPORTED

CUDBGResult ( *CUDBGAPI_st::writeGlobalMemory31 )( uint32_t dev, uint64_t addr, const void* buf, uint32_t sz )

Write to an address in global memory. This method is unsupported on Hopper and later architectures. Use newer methods: writeGlobalMemory or writeGenericMemory.

Since CUDA 3.0.

Note:

DEPRECATED in CUDA 3.2: Use writeGlobalMemory instead.

See also:

writeGlobalMemory

Parameters
dev
- device index
addr
- address
buf
- buffer
sz
- buffer size in bytes
Returns

CUDBG_SUCCESS, CUDBG_ERROR_UNKNOWN, CUDBG_ERROR_INVALID_ARGS, CUDBG_ERROR_UNINITIALIZED, CUDBG_ERROR_INVALID_MEMORY_ACCESS, CUDBG_ERROR_INTERNAL, CUDBG_ERROR_INVALID_WARP, CUDBG_ERROR_INITIALIZATION_FAILURE, CUDBG_ERROR_INVALID_CONTEXT, CUDBG_ERROR_ADDRESS_NOT_IN_DEVICE_MEM, CUDBG_ERROR_AMBIGUOUS_MEMORY_ADDRESS, CUDBG_ERROR_RECURSIVE_API_CALL, CUDBG_ERROR_NOT_SUPPORTED

CUDBGResult ( *CUDBGAPI_st::writeGlobalMemory55 )( uint32_t dev, uint32_t sm, uint32_t wp, uint32_t ln, uint64_t addr, const void* buf, uint32_t sz )

Write to an address in global memory. Use newer methods: writeGlobalMemory or writeGenericMemory.

Since CUDA 3.2.

Note:

DEPRECATED in CUDA 6.0: Use writeGlobalMemory instead.

See also:

writeGlobalMemory

Parameters
dev
- device index
sm
- SM index
wp
- warp index
ln
- lane index
addr
- address
buf
- buffer
sz
- buffer size in bytes
Returns

CUDBG_SUCCESS, CUDBG_ERROR_UNKNOWN, CUDBG_ERROR_INVALID_ARGS, CUDBG_ERROR_UNINITIALIZED, CUDBG_ERROR_INVALID_MEMORY_ACCESS, CUDBG_ERROR_INTERNAL, CUDBG_ERROR_INVALID_WARP, CUDBG_ERROR_INITIALIZATION_FAILURE, CUDBG_ERROR_INVALID_CONTEXT, CUDBG_ERROR_ADDRESS_NOT_IN_DEVICE_MEM, CUDBG_ERROR_AMBIGUOUS_MEMORY_ADDRESS, CUDBG_ERROR_RECURSIVE_API_CALL, CUDBG_ERROR_NOT_SUPPORTED

CUDBGResult ( *CUDBGAPI_st::writeLocalMemory )( uint32_t dev, uint32_t sm, uint32_t wp, uint32_t ln, uint64_t addr, const void* buf, uint32_t sz )

Write to an address in local memory. The destination address range must be within local memory.

Since CUDA 3.0.

See also:

writeGenericMemory

writeGlobalMemory

writeParamMemory

writeSharedMemory

Parameters
dev
- device index
sm
- SM index
wp
- warp index
ln
- lane index
addr
- address
buf
- buffer
sz
- buffer size in bytes
Returns

CUDBG_SUCCESS, CUDBG_ERROR_INVALID_ARGS, CUDBG_ERROR_UNINITIALIZED, CUDBG_ERROR_INTERNAL, CUDBG_ERROR_INVALID_WARP, CUDBG_ERROR_INVALID_ADDRESS, CUDBG_ERROR_INITIALIZATION_FAILURE, CUDBG_ERROR_INVALID_CONTEXT, CUDBG_ERROR_RECURSIVE_API_CALL

CUDBGResult ( *CUDBGAPI_st::writeParamMemory )( uint32_t dev, uint32_t sm, uint32_t wp, uint64_t addr, const void* buf, uint32_t sz )

Write to an address in param memory. The destination address range must be within param memory.

Since CUDA 3.0.

See also:

writeGenericMemory

writeGlobalMemory

writeLocalMemory

writeSharedMemory

Parameters
dev
- device index
sm
- SM index
wp
- warp index
addr
- address
buf
- buffer
sz
- buffer size in bytes
Returns

CUDBG_SUCCESS, CUDBG_ERROR_INVALID_ARGS, CUDBG_ERROR_UNINITIALIZED, CUDBG_ERROR_INTERNAL, CUDBG_ERROR_INVALID_WARP, CUDBG_ERROR_INITIALIZATION_FAILURE, CUDBG_ERROR_INVALID_CONTEXT, CUDBG_ERROR_RECURSIVE_API_CALL

CUDBGResult ( *CUDBGAPI_st::writePinnedMemory )( uint64_t addr, const void* buf, uint32_t sz )

Write to a pinned memory address. An ambiguous address, that is, one allocated on several devices that do not support UVA (unified virtual addressing), cannot be accessed.

Since CUDA 3.2.

See also:

readPinnedMemory

Parameters
addr
- address
buf
- buffer
sz
- buffer size in bytes
Returns

CUDBG_SUCCESS, CUDBG_ERROR_UNKNOWN, CUDBG_ERROR_INVALID_ARGS, CUDBG_ERROR_UNINITIALIZED, CUDBG_ERROR_INVALID_MEMORY_ACCESS, CUDBG_ERROR_MEMORY_MAPPING_FAILED, CUDBG_ERROR_INITIALIZATION_FAILURE, CUDBG_ERROR_ADDRESS_NOT_IN_DEVICE_MEM, CUDBG_ERROR_AMBIGUOUS_MEMORY_ADDRESS, CUDBG_ERROR_RECURSIVE_API_CALL
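
The sz parameter is a uint32_t, so a host buffer whose length is held in a size_t cannot always be passed in a single call. One possible approach, sketched below under stated assumptions, is to split the write into chunks and stop at the first error. `sketch_result`, the `SKETCH_*` codes, `write_pinned_chunked`, and the mock backend are hypothetical names introduced for illustration; only the parameter list of `write_pinned_fn` comes from the signature above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-ins so the sketch builds without cudadebugger.h (hypothetical names). */
typedef int sketch_result;                 /* plays the role of CUDBGResult */
enum { SKETCH_SUCCESS = 0, SKETCH_INVALID_ARGS = 1 };

/* Same parameter list as writePinnedMemory in the reference above. */
typedef sketch_result (*write_pinned_fn)(uint64_t addr, const void *buf, uint32_t sz);

/* Write len bytes in pieces of at most max_chunk bytes. On return, *written
 * holds the number of bytes that were written before success or failure. */
static sketch_result write_pinned_chunked(write_pinned_fn write_pinned,
                                          uint64_t addr, const void *buf,
                                          size_t len, uint32_t max_chunk,
                                          size_t *written)
{
    const uint8_t *src = buf;
    size_t done = 0;
    sketch_result r = SKETCH_SUCCESS;

    if (write_pinned == NULL || buf == NULL || max_chunk == 0)
        return SKETCH_INVALID_ARGS;
    while (done < len) {
        size_t remaining = len - done;
        uint32_t n = remaining < max_chunk ? (uint32_t)remaining : max_chunk;
        r = write_pinned(addr + done, src + done, n);
        if (r != SKETCH_SUCCESS)
            break;                         /* keep the first error code */
        done += n;
    }
    if (written != NULL)
        *written = done;
    return r;
}

/* Mock backend: a 64-byte region starting at address 0x1000.
 * In a real debugger client the API's writePinnedMemory pointer goes here. */
#define MOCK_BASE 0x1000u
#define MOCK_SIZE 64u
static uint8_t mock_pinned[MOCK_SIZE];
static unsigned mock_calls;

static sketch_result mock_write_pinned(uint64_t addr, const void *buf, uint32_t sz)
{
    assert(buf != NULL);
    if (addr < MOCK_BASE || addr - MOCK_BASE > MOCK_SIZE ||
        sz > MOCK_SIZE - (addr - MOCK_BASE))
        return SKETCH_INVALID_ARGS;        /* range falls outside the mock */
    memcpy(&mock_pinned[addr - MOCK_BASE], buf, sz);
    ++mock_calls;
    return SKETCH_SUCCESS;
}
```

A failed chunk leaves the earlier chunks in place, so a caller that needs all-or-nothing behavior would have to save and restore the original contents itself.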

CUDBGResult ( *CUDBGAPI_st::writePredicates )( uint32_t dev, uint32_t sm, uint32_t wp, uint32_t ln, uint32_t predicates_size, const uint32_t* predicates )

Write to hardware predicates. This method writes to predicates_size predicates, starting from P0. Each predicate value must be either 0 or 1.

Since CUDA 6.5.

See also:

writeRegister

writeUniformPredicates

writeUniformRegister

Parameters
dev
- device index
sm
- SM index
wp
- warp index
ln
- lane index
predicates_size
- number of predicates to write
predicates
- array of predicate values, each 0 or 1
Returns

CUDBG_SUCCESS, CUDBG_ERROR_INVALID_ARGS, CUDBG_ERROR_UNINITIALIZED, CUDBG_ERROR_INTERNAL, CUDBG_ERROR_INVALID_WARP, CUDBG_ERROR_INITIALIZATION_FAILURE, CUDBG_ERROR_INVALID_CONTEXT, CUDBG_ERROR_RECURSIVE_API_CALL
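
Since each entry of predicates must be 0 or 1 and entry i maps to predicate Pi, a debugger front end that stores predicates as a bitmask has to expand the mask before calling writePredicates. The sketch below shows one way to do that. `sketch_result`, the `SKETCH_*` codes, `write_predicate_mask`, the 32-entry limit, and the mock backend are hypothetical choices made for illustration, and the number of predicates a given GPU exposes is not assumed here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins so the sketch builds without cudadebugger.h (hypothetical names). */
typedef int sketch_result;                 /* plays the role of CUDBGResult */
enum { SKETCH_SUCCESS = 0, SKETCH_INVALID_ARGS = 1 };

/* Same parameter list as writePredicates in the reference above. */
typedef sketch_result (*write_predicates_fn)(uint32_t dev, uint32_t sm,
                                             uint32_t wp, uint32_t ln,
                                             uint32_t predicates_size,
                                             const uint32_t *predicates);

#define SKETCH_MAX_PREDS 32u               /* limit of a 32-bit mask */

/* Expand bit i of mask into the value for predicate Pi (0 or 1), then write
 * count predicates starting from P0 in a single call. */
static sketch_result write_predicate_mask(write_predicates_fn write_preds,
                                          uint32_t dev, uint32_t sm,
                                          uint32_t wp, uint32_t ln,
                                          uint32_t mask, uint32_t count)
{
    uint32_t values[SKETCH_MAX_PREDS];

    if (write_preds == NULL || count == 0 || count > SKETCH_MAX_PREDS)
        return SKETCH_INVALID_ARGS;
    for (uint32_t i = 0; i < count; ++i)
        values[i] = (mask >> i) & 1u;      /* always 0 or 1 */
    return write_preds(dev, sm, wp, ln, count, values);
}

/* Mock backend that records what it receives and rejects values above 1.
 * In a real debugger client the API's writePredicates pointer goes here. */
static uint32_t mock_preds[SKETCH_MAX_PREDS];
static uint32_t mock_pred_count;

static sketch_result mock_write_predicates(uint32_t dev, uint32_t sm,
                                           uint32_t wp, uint32_t ln,
                                           uint32_t predicates_size,
                                           const uint32_t *predicates)
{
    (void)dev; (void)sm; (void)wp; (void)ln;
    assert(predicates != NULL);
    if (predicates_size > SKETCH_MAX_PREDS)
        return SKETCH_INVALID_ARGS;
    for (uint32_t i = 0; i < predicates_size; ++i) {
        if (predicates[i] > 1u)
            return SKETCH_INVALID_ARGS;
        mock_preds[i] = predicates[i];
    }
    mock_pred_count = predicates_size;
    return SKETCH_SUCCESS;
}
```

The same expansion would apply to writeUniformPredicates, whose parameter list differs only by the absence of the lane index.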

CUDBGResult ( *CUDBGAPI_st::writeRegister )( uint32_t dev, uint32_t sm, uint32_t wp, uint32_t ln, uint32_t regno, uint32_t val )

Write to a hardware register.

Since CUDA 3.0.

See also:

writePredicates

writeUniformPredicates

writeUniformRegister

Parameters
dev
- device index
sm
- SM index
wp
- warp index
ln
- lane index
regno
- register number
val
- value
Returns

CUDBG_SUCCESS, CUDBG_ERROR_INVALID_ARGS, CUDBG_ERROR_UNINITIALIZED, CUDBG_ERROR_INTERNAL, CUDBG_ERROR_INVALID_WARP, CUDBG_ERROR_INITIALIZATION_FAILURE, CUDBG_ERROR_INVALID_CONTEXT, CUDBG_ERROR_RECURSIVE_API_CALL
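
The val parameter is a raw 32-bit pattern, so storing a float in a register means passing its bit representation rather than a numeric conversion of it. A minimal sketch follows, assuming the target register is being used to hold a 32-bit IEEE 754 float; `sketch_result`, the `SKETCH_*` codes, `float_to_bits`, `write_register_float`, and the mock register file are hypothetical names introduced for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-ins so the sketch builds without cudadebugger.h (hypothetical names). */
typedef int sketch_result;                 /* plays the role of CUDBGResult */
enum { SKETCH_SUCCESS = 0, SKETCH_INVALID_ARGS = 1 };

/* Same parameter list as writeRegister in the reference above. */
typedef sketch_result (*write_register_fn)(uint32_t dev, uint32_t sm,
                                           uint32_t wp, uint32_t ln,
                                           uint32_t regno, uint32_t val);

/* Copy the bytes of a float into a uint32_t. memcpy avoids the undefined
 * behavior of pointer-cast type punning; a plain cast would convert the
 * numeric value instead of preserving the bit pattern. */
static uint32_t float_to_bits(float f)
{
    uint32_t bits;
    assert(sizeof f == sizeof bits);
    memcpy(&bits, &f, sizeof bits);
    return bits;
}

/* Write a float to one register of one lane as its raw bit pattern. */
static sketch_result write_register_float(write_register_fn write_reg,
                                          uint32_t dev, uint32_t sm,
                                          uint32_t wp, uint32_t ln,
                                          uint32_t regno, float value)
{
    if (write_reg == NULL)
        return SKETCH_INVALID_ARGS;
    return write_reg(dev, sm, wp, ln, regno, float_to_bits(value));
}

/* Mock backend: a tiny register file for a single lane.
 * In a real debugger client the API's writeRegister pointer goes here. */
#define MOCK_REG_COUNT 8u
static uint32_t mock_regs[MOCK_REG_COUNT];

static sketch_result mock_write_register(uint32_t dev, uint32_t sm,
                                         uint32_t wp, uint32_t ln,
                                         uint32_t regno, uint32_t val)
{
    (void)dev; (void)sm; (void)wp; (void)ln;
    if (regno >= MOCK_REG_COUNT)
        return SKETCH_INVALID_ARGS;
    mock_regs[regno] = val;
    return SKETCH_SUCCESS;
}
```

A value wider than 32 bits would need more than one writeRegister call; how such values are laid out across registers is not specified in this section, so it is left out of the sketch.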

CUDBGResult ( *CUDBGAPI_st::writeSharedMemory )( uint32_t dev, uint32_t sm, uint32_t wp, uint64_t addr, const void* buf, uint32_t sz )

Write to an address in shared memory. The destination address range must be within shared memory.

Since CUDA 3.0.

See also:

writeGenericMemory

writeGlobalMemory

writeLocalMemory

writeParamMemory

Parameters
dev
- device index
sm
- SM index
wp
- warp index
addr
- address
buf
- buffer
sz
- buffer size in bytes
Returns

CUDBG_SUCCESS, CUDBG_ERROR_INVALID_ARGS, CUDBG_ERROR_UNINITIALIZED, CUDBG_ERROR_INVALID_MEMORY_ACCESS, CUDBG_ERROR_INTERNAL, CUDBG_ERROR_INVALID_WARP, CUDBG_ERROR_INITIALIZATION_FAILURE, CUDBG_ERROR_INVALID_CONTEXT, CUDBG_ERROR_RECURSIVE_API_CALL

CUDBGResult ( *CUDBGAPI_st::writeUniformPredicates )( uint32_t dev, uint32_t sm, uint32_t wp, uint32_t predicates_size, const uint32_t* predicates )

Write to hardware uniform predicates. This method writes to predicates_size uniform predicates, starting from UP0. Each predicate value must be either 0 or 1.

Since CUDA 10.0.

See also:

writePredicates

writeRegister

writeUniformRegister

Parameters
dev
- device index
sm
- SM index
wp
- warp index
predicates_size
- number of uniform predicates to write
predicates
- array of predicate values, each 0 or 1
Returns

CUDBG_SUCCESS, CUDBG_ERROR_INVALID_ARGS, CUDBG_ERROR_UNINITIALIZED, CUDBG_ERROR_INTERNAL, CUDBG_ERROR_INVALID_WARP, CUDBG_ERROR_INITIALIZATION_FAILURE, CUDBG_ERROR_INVALID_CONTEXT, CUDBG_ERROR_RECURSIVE_API_CALL

CUDBGResult ( *CUDBGAPI_st::writeUniformRegister )( uint32_t dev, uint32_t sm, uint32_t wp, uint32_t regno, uint32_t val )

Write to a hardware uniform register.

Since CUDA 10.0.

See also:

writePredicates

writeRegister

writeUniformPredicates

Parameters
dev
- device index
sm
- SM index
wp
- warp index
regno
- register number
val
- value
Returns

CUDBG_SUCCESS, CUDBG_ERROR_INVALID_ARGS, CUDBG_ERROR_UNINITIALIZED, CUDBG_ERROR_INTERNAL, CUDBG_ERROR_INVALID_WARP, CUDBG_ERROR_INITIALIZATION_FAILURE, CUDBG_ERROR_INVALID_CONTEXT, CUDBG_ERROR_RECURSIVE_API_CALL