-->

6. Data Structures

Here are the data structures with brief descriptions:

BufferInfo
BufferInfo is stored in the file for every buffer, i.e., for every call of the UtilDumpPcSamplingBufferInFile() API
CUPTI::​PcSamplingUtil::​CUptiUtil_GetBufferInfoParams
Params for CuptiUtilGetBufferInfo
CUPTI::​PcSamplingUtil::​CUptiUtil_GetHeaderDataParams
Params for CuptiUtilGetHeaderData
CUPTI::​PcSamplingUtil::​CUptiUtil_GetPcSampDataParams
Params for CuptiUtilGetPcSampData
CUPTI::​PcSamplingUtil::​CUptiUtil_MergePcSampDataParams
Params for CuptiUtilMergePcSampData
CUPTI::​PcSamplingUtil::​CUptiUtil_PutPcSampDataParams
Params for CuptiUtilPutPcSampData
CUpti_Activity
The base activity record
CUpti_ActivityAPI
The activity record for a driver or runtime API invocation
CUpti_ActivityAutoBoostState
Device auto boost state structure
CUpti_ActivityBranch
The activity record for source level result branch. (deprecated)
CUpti_ActivityBranch2
The activity record for source level result branch
CUpti_ActivityCdpKernel
The activity record for CDP (CUDA Dynamic Parallelism) kernel
CUpti_ActivityContext
The activity record for a context
CUpti_ActivityCudaEvent
The activity record for CUDA event
CUpti_ActivityDevice
The activity record for a device. (deprecated)
CUpti_ActivityDevice2
The activity record for a device. (deprecated)
CUpti_ActivityDevice3
The activity record for a device. (CUDA 7.0 onwards)
CUpti_ActivityDevice4
The activity record for a device. (CUDA 11.6 onwards)
CUpti_ActivityDevice5
The activity record for a device. (CUDA 11.6 onwards)
CUpti_ActivityDeviceAttribute
The activity record for a device attribute
CUpti_ActivityEnvironment
The activity record for CUPTI environmental data
CUpti_ActivityEvent
The activity record for a CUPTI event
CUpti_ActivityEventInstance
The activity record for a CUPTI event with instance information
CUpti_ActivityExternalCorrelation
The activity record for correlation with external records
CUpti_ActivityFunction
The activity record for global/device functions
CUpti_ActivityGlobalAccess
The activity record for source-level global access. (deprecated)
CUpti_ActivityGlobalAccess2
The activity record for source-level global access. (deprecated in CUDA 9.0)
CUpti_ActivityGlobalAccess3
The activity record for source-level global access
CUpti_ActivityGraphTrace
The activity record for trace of graph execution
CUpti_ActivityInstantaneousEvent
The activity record for an instantaneous CUPTI event
CUpti_ActivityInstantaneousEventInstance
The activity record for an instantaneous CUPTI event with event domain instance information
CUpti_ActivityInstantaneousMetric
The activity record for an instantaneous CUPTI metric
CUpti_ActivityInstantaneousMetricInstance
The instantaneous activity record for a CUPTI metric with instance information
CUpti_ActivityInstructionCorrelation
The activity record for source-level sass/source line-by-line correlation
CUpti_ActivityInstructionExecution
The activity record for source-level instruction execution
CUpti_ActivityJit
The activity record for JIT operations. This activity represents the JIT operations (compile, load, store) of a CUmodule from the Compute Cache. Gives the exact hashed path of where the cached module is loaded from, or where the module will be stored after Just-In-Time (JIT) compilation
CUpti_ActivityKernel
The activity record for kernel. (deprecated)
CUpti_ActivityKernel2
The activity record for kernel. (deprecated)
CUpti_ActivityKernel3
The activity record for a kernel (CUDA 6.5(with sm_52 support) onwards). (deprecated in CUDA 9.0)
CUpti_ActivityKernel4
The activity record for a kernel (CUDA 9.0(with sm_70 support) onwards). (deprecated in CUDA 11.0)
CUpti_ActivityKernel5
The activity record for a kernel (CUDA 11.0(with sm_80 support) onwards). (deprecated in CUDA 11.2) This activity record represents a kernel execution (CUPTI_ACTIVITY_KIND_KERNEL and CUPTI_ACTIVITY_KIND_CONCURRENT_KERNEL) but is no longer generated by CUPTI. Kernel activities are now reported using the CUpti_ActivityKernel9 activity record
CUpti_ActivityKernel6
The activity record for kernel. (deprecated in CUDA 11.6)
CUpti_ActivityKernel7
The activity record for kernel. (deprecated in CUDA 11.8)
CUpti_ActivityKernel8
The activity record for kernel
CUpti_ActivityKernel9
The activity record for kernel
CUpti_ActivityMarker
The activity record providing a marker which is an instantaneous point in time. (deprecated in CUDA 8.0)
CUpti_ActivityMarker2
The activity record providing a marker which is an instantaneous point in time
CUpti_ActivityMarkerData
The activity record providing detailed information for a marker
CUpti_ActivityMemcpy
The activity record for memory copies. (deprecated)
CUpti_ActivityMemcpy3
The activity record for memory copies. (deprecated in CUDA 11.1)
CUpti_ActivityMemcpy4
The activity record for memory copies. (deprecated in CUDA 11.6)
CUpti_ActivityMemcpy5
The activity record for memory copies
CUpti_ActivityMemcpyPtoP
The activity record for peer-to-peer memory copies
CUpti_ActivityMemcpyPtoP2
The activity record for peer-to-peer memory copies. (deprecated in CUDA 11.1)
CUpti_ActivityMemcpyPtoP3
The activity record for peer-to-peer memory copies. (deprecated in CUDA 11.6)
CUpti_ActivityMemcpyPtoP4
The activity record for peer-to-peer memory copies
CUpti_ActivityMemory
The activity record for memory
CUpti_ActivityMemory2
The activity record for memory
CUpti_ActivityMemory3
The activity record for memory
CUpti_ActivityMemory3::​CUpti_ActivityMemory3::​PACKED_ALIGNMENT
CUpti_ActivityMemoryPool
The activity record for memory pool
CUpti_ActivityMemoryPool2
The activity record for memory pool
CUpti_ActivityMemset
The activity record for memset. (deprecated)
CUpti_ActivityMemset2
The activity record for memset. (deprecated in CUDA 11.1)
CUpti_ActivityMemset3
The activity record for memset. (deprecated in CUDA 11.6)
CUpti_ActivityMemset4
The activity record for memset
CUpti_ActivityMetric
The activity record for a CUPTI metric
CUpti_ActivityMetricInstance
The activity record for a CUPTI metric with instance information
CUpti_ActivityModule
The activity record for a CUDA module
CUpti_ActivityName
The activity record providing a name
CUpti_ActivityNvLink
NVLink information. (deprecated in CUDA 9.0)
CUpti_ActivityNvLink2
NVLink information. (deprecated in CUDA 10.0)
CUpti_ActivityNvLink3
NVLink information
CUpti_ActivityNvLink4
NVLink information
CUpti_ActivityObjectKindId
Identifiers for object kinds as specified by CUpti_ActivityObjectKind
CUpti_ActivityOpenAcc
The base activity record for OpenACC records
CUpti_ActivityOpenAccData
The activity record for OpenACC data
CUpti_ActivityOpenAccLaunch
The activity record for OpenACC launch
CUpti_ActivityOpenAccOther
The activity record for OpenACC other
CUpti_ActivityOpenMp
The base activity record for OpenMP records
CUpti_ActivityOverhead
The activity record for CUPTI and driver overheads
CUpti_ActivityPcie
PCI devices information required to construct topology
CUpti_ActivityPCSampling
The activity record for PC sampling. (deprecated in CUDA 8.0)
CUpti_ActivityPCSampling2
The activity record for PC sampling. (deprecated in CUDA 9.0)
CUpti_ActivityPCSampling3
The activity record for PC sampling
CUpti_ActivityPCSamplingConfig
PC sampling configuration structure
CUpti_ActivityPCSamplingRecordInfo
The activity record for record status for PC sampling
CUpti_ActivityPreemption
The activity record for a preemption of a CDP kernel
CUpti_ActivitySharedAccess
The activity record for source-level shared access
CUpti_ActivitySourceLocator
The activity record for source locator
CUpti_ActivityStream
The activity record for CUDA stream
CUpti_ActivitySynchronization
The activity record for synchronization management
CUpti_ActivityUnifiedMemoryCounter
The activity record for Unified Memory counters (deprecated in CUDA 7.0)
CUpti_ActivityUnifiedMemoryCounter2
The activity record for Unified Memory counters (CUDA 7.0 and beyond)
CUpti_ActivityUnifiedMemoryCounterConfig
Unified Memory counters configuration structure
CUpti_CallbackData
Data passed into a runtime or driver API callback function
CUpti_EventGroupSet
A set of event groups
CUpti_EventGroupSets
A set of event group sets
CUpti_GetCubinCrcParams
Params for cuptiGetCubinCrc
CUpti_GetSassToSourceCorrelationParams
Params for cuptiGetSassToSourceCorrelation
CUpti_GraphData
CUDA graphs data passed into a resource callback function
CUpti_MetricValue
A metric value
CUpti_ModuleResourceData
Module data passed into a resource callback function
CUpti_NvtxData
Data passed into an NVTX callback function
CUpti_PCSamplingConfigurationInfo
PC sampling configuration information structure
CUpti_PCSamplingConfigurationInfoParams
PC sampling configuration structure
CUpti_PCSamplingData
Collected PC Sampling data
CUpti_PCSamplingDisableParams
Params for cuptiPCSamplingDisable
CUpti_PCSamplingEnableParams
Params for cuptiPCSamplingEnable
CUpti_PCSamplingGetDataParams
Params for cuptiPCSamplingGetData
CUpti_PCSamplingGetNumStallReasonsParams
Params for cuptiPCSamplingGetNumStallReasons
CUpti_PCSamplingGetStallReasonsParams
Params for cuptiPCSamplingGetStallReasons
CUpti_PCSamplingPCData
PC Sampling data
CUpti_PCSamplingStallReason
PC Sampling stall reasons
CUpti_PCSamplingStartParams
Params for cuptiPCSamplingStart
CUpti_PCSamplingStopParams
Params for cuptiPCSamplingStop
CUpti_Profiler_BeginPass_Params
Params for cuptiProfilerBeginPass
CUpti_Profiler_BeginSession_Params
Params for cuptiProfilerBeginSession
CUpti_Profiler_CounterDataImage_CalculateScratchBufferSize_Params
Params for cuptiProfilerCounterDataImageCalculateScratchBufferSize
CUpti_Profiler_CounterDataImage_CalculateSize_Params
Params for cuptiProfilerCounterDataImageCalculateSize
CUpti_Profiler_CounterDataImage_Initialize_Params
Params for cuptiProfilerCounterDataImageInitialize
CUpti_Profiler_CounterDataImage_InitializeScratchBuffer_Params
Params for cuptiProfilerCounterDataImageInitializeScratchBuffer
CUpti_Profiler_CounterDataImageOptions
Input parameter to define the counterDataImage
CUpti_Profiler_DeInitialize_Params
Default parameter for cuptiProfilerDeInitialize
CUpti_Profiler_DeviceSupported_Params
Params for cuptiProfilerDeviceSupported
CUpti_Profiler_DisableProfiling_Params
Params for cuptiProfilerDisableProfiling
CUpti_Profiler_EnableProfiling_Params
Params for cuptiProfilerEnableProfiling
CUpti_Profiler_EndPass_Params
Params for cuptiProfilerEndPass
CUpti_Profiler_EndSession_Params
Params for cuptiProfilerEndSession
CUpti_Profiler_FlushCounterData_Params
Params for cuptiProfilerFlushCounterData
CUpti_Profiler_GetCounterAvailability_Params
Params for cuptiProfilerGetCounterAvailability
CUpti_Profiler_Initialize_Params
Default parameter for cuptiProfilerInitialize
CUpti_Profiler_IsPassCollected_Params
Params for cuptiProfilerIsPassCollected
CUpti_Profiler_SetConfig_Params
Params for cuptiProfilerSetConfig
CUpti_Profiler_UnsetConfig_Params
Params for cuptiProfilerUnsetConfig
CUpti_ResourceData
Data passed into a resource callback function
CUpti_StateData
Data passed into a State callback function
CUpti_SynchronizeData
Data passed into a synchronize callback function
Header
Header info will be stored in file
NV::​Cupti::​Checkpoint::​CUpti_Checkpoint
Configuration and handle for a CUPTI Checkpoint
PcSamplingStallReasons
The names and respective indexes of all available stall reasons are stored in it

6.1. BufferInfo Struct Reference

[CUPTI PC Sampling Utility API]

Public Variables

uint64_t  bufferByteSize
uint64_t  numSelectedStallReasons
size_t  numStallReasons
uint64_t  recordCount

Variables

uint64_t BufferInfo::bufferByteSize [inherited]

Buffer size in Bytes.

uint64_t BufferInfo::numSelectedStallReasons [inherited]

Total number of stall reasons in a single record.

size_t BufferInfo::numStallReasons [inherited]

Count of all stall reasons supported on the GPU

uint64_t BufferInfo::recordCount [inherited]

Total number of PC records.

6.2. CUPTI::PcSamplingUtil::CUptiUtil_GetBufferInfoParams Struct Reference

[CUPTI PC Sampling Utility API]

Public Variables

struct BufferInfo bufferInfoData
std::ifstream * fileHandler
size_t  size

Variables

struct BufferInfo CUPTI::PcSamplingUtil::CUptiUtil_GetBufferInfoParams::bufferInfoData [inherited]

Buffer Info.

std::ifstream * CUPTI::​PcSamplingUtil::​CUptiUtil_GetBufferInfoParams::fileHandler [inherited]

File handle.

size_t CUPTI::​PcSamplingUtil::​CUptiUtil_GetBufferInfoParams::size [inherited]

Size of the data structure, i.e. CUpti_PCSamplingDisableParamsSize. The CUPTI client should set the size of the structure. CUPTI uses it to check which fields are available in the structure, which preserves backward compatibility.
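
The size field follows a common versioning pattern for C-style parameter structures: the client reports how large it believes the structure is, and the library only reads fields that lie inside that size. The sketch below illustrates the idea only. ExampleParams, kSizeV1, hasField and readExtraField are hypothetical names and are not part of CUPTI; how CUPTI performs the check internally is not described in this reference.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical params structure; NOT a CUPTI type. Newer library versions
// append fields at the end, so older clients report a smaller `size`.
struct ExampleParams {
    size_t   size;        // set by the client to the size it was compiled against
    uint64_t recordCount; // field present since "version 1"
    uint64_t extraField;  // field appended in "version 2"
};

// Size an old client, compiled before extraField existed, would report:
// everything up to and including recordCount.
constexpr size_t kSizeV1 = offsetof(ExampleParams, recordCount) + sizeof(uint64_t);

// A field is usable only if it lies entirely within the client-reported size.
inline bool hasField(const ExampleParams& p, size_t fieldOffset, size_t fieldSize) {
    return p.size >= fieldOffset + fieldSize;
}

// Library-side read that tolerates old clients by falling back to a default.
inline uint64_t readExtraField(const ExampleParams& p, uint64_t fallback) {
    return hasField(p, offsetof(ExampleParams, extraField), sizeof(uint64_t))
               ? p.extraField
               : fallback;
}
```

A client built against the newer layout would set size to sizeof(ExampleParams); a client built against the older layout would effectively pass kSizeV1, and the library would ignore extraField.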

6.3. CUPTI::PcSamplingUtil::CUptiUtil_GetHeaderDataParams Struct Reference

[CUPTI PC Sampling Utility API]

Public Variables

std::ifstream * fileHandler
struct Header headerInfo
size_t  size

Variables

std::ifstream * CUPTI::​PcSamplingUtil::​CUptiUtil_GetHeaderDataParams::fileHandler [inherited]

File handle.

struct Header CUPTI::PcSamplingUtil::CUptiUtil_GetHeaderDataParams::headerInfo [inherited]

Header Info.

size_t CUPTI::​PcSamplingUtil::​CUptiUtil_GetHeaderDataParams::size [inherited]

Size of the data structure, i.e. CUpti_PCSamplingDisableParamsSize. The CUPTI client should set the size of the structure. CUPTI uses it to check which fields are available in the structure, which preserves backward compatibility.

6.4. CUPTI::PcSamplingUtil::CUptiUtil_GetPcSampDataParams Struct Reference

[CUPTI PC Sampling Utility API]

Public Variables

PcSamplingBufferType bufferType
std::ifstream * fileHandler
size_t  numAttributes
BufferInfo * pBufferInfoData
CUpti_PCSamplingConfigurationInfo * pPCSamplingConfigurationInfo
PcSamplingStallReasons * pPcSamplingStallReasons
void * pSamplingData
size_t  size

Variables

PcSamplingBufferType CUPTI::PcSamplingUtil::CUptiUtil_GetPcSampDataParams::bufferType [inherited]

Type of buffer to store in file

std::ifstream * CUPTI::​PcSamplingUtil::​CUptiUtil_GetPcSampDataParams::fileHandler [inherited]

File handle.

size_t CUPTI::​PcSamplingUtil::​CUptiUtil_GetPcSampDataParams::numAttributes [inherited]

Number of configuration attributes

BufferInfo * CUPTI::​PcSamplingUtil::​CUptiUtil_GetPcSampDataParams::pBufferInfoData [inherited]

Pointer to the buffer info collected using CuptiUtilGetBufferInfo

CUpti_PCSamplingConfigurationInfo * CUPTI::PcSamplingUtil::CUptiUtil_GetPcSampDataParams::pPCSamplingConfigurationInfo [inherited]

PcSamplingStallReasons * CUPTI::PcSamplingUtil::CUptiUtil_GetPcSampDataParams::pPcSamplingStallReasons [inherited]

Refer to PcSamplingStallReasons. Memory is expected to be allocated for each string element of the stallReasons array of PcSamplingStallReasons.

void * CUPTI::​PcSamplingUtil::​CUptiUtil_GetPcSampDataParams::pSamplingData [inherited]

Pointer to allocated memory to store retrieved data from file.

size_t CUPTI::​PcSamplingUtil::​CUptiUtil_GetPcSampDataParams::size [inherited]

Size of the data structure, i.e. CUpti_PCSamplingDisableParamsSize. The CUPTI client should set the size of the structure. CUPTI uses it to check which fields are available in the structure, which preserves backward compatibility.

6.5. CUPTI::PcSamplingUtil::CUptiUtil_MergePcSampDataParams Struct Reference

[CUPTI PC Sampling Utility API]

Public Variables

MergedPcSampDataBuffers
CUpti_PCSamplingData * PcSampDataBuffer
size_t * numMergedBuffer
size_t  numberOfBuffers
size_t  size

Variables

* CUPTI::​PcSamplingUtil::​CUptiUtil_MergePcSampDataParams::MergedPcSampDataBuffers [inherited]

Pointer to array of merged buffers as per the range id.

CUpti_PCSamplingData * CUPTI::​PcSamplingUtil::​CUptiUtil_MergePcSampDataParams::PcSampDataBuffer [inherited]

Pointer to array of buffers to merge

size_t * CUPTI::​PcSamplingUtil::​CUptiUtil_MergePcSampDataParams::numMergedBuffer [inherited]

Number of merged buffers.

size_t CUPTI::​PcSamplingUtil::​CUptiUtil_MergePcSampDataParams::numberOfBuffers [inherited]

Number of buffers to merge.

size_t CUPTI::​PcSamplingUtil::​CUptiUtil_MergePcSampDataParams::size [inherited]

Size of the data structure, i.e. CUpti_PCSamplingDisableParamsSize. The CUPTI client should set the size of the structure. CUPTI uses it to check which fields are available in the structure, which preserves backward compatibility.

6.6. CUPTI::PcSamplingUtil::CUptiUtil_PutPcSampDataParams Struct Reference

[CUPTI PC Sampling Utility API]

Public Variables

PcSamplingBufferType bufferType
const char * fileName
size_t  numAttributes
CUpti_PCSamplingConfigurationInfo * pPCSamplingConfigurationInfo
PcSamplingStallReasons * pPcSamplingStallReasons
void * pSamplingData
size_t  size

Variables

PcSamplingBufferType CUPTI::PcSamplingUtil::CUptiUtil_PutPcSampDataParams::bufferType [inherited]

Type of buffer to store in file

const char * CUPTI::​PcSamplingUtil::​CUptiUtil_PutPcSampDataParams::fileName [inherited]

Name of the file in which to store the buffer.

size_t CUPTI::​PcSamplingUtil::​CUptiUtil_PutPcSampDataParams::numAttributes [inherited]

Number of configured attributes

CUpti_PCSamplingConfigurationInfo * CUPTI::​PcSamplingUtil::​CUptiUtil_PutPcSampDataParams::pPCSamplingConfigurationInfo [inherited]

Refer to CUpti_PCSamplingConfigurationInfo. Configuration details are expected for at least the CUPTI_PC_SAMPLING_CONFIGURATION_ATTR_TYPE_STALL_REASON attribute.

PcSamplingStallReasons * CUPTI::PcSamplingUtil::CUptiUtil_PutPcSampDataParams::pPcSamplingStallReasons [inherited]

void * CUPTI::PcSamplingUtil::CUptiUtil_PutPcSampDataParams::pSamplingData [inherited]

PC sampling buffer.

size_t CUPTI::​PcSamplingUtil::​CUptiUtil_PutPcSampDataParams::size [inherited]

Size of the data structure, i.e. CUpti_PCSamplingDisableParamsSize. The CUPTI client should set the size of the structure. CUPTI uses it to check which fields are available in the structure, which preserves backward compatibility.

6.7. CUpti_Activity Struct Reference

[CUPTI Activity API]

The activity API uses a CUpti_Activity as a generic representation for any activity. The 'kind' field is used to determine the specific activity kind, and from that the CUpti_Activity object can be cast to the specific activity record type appropriate for that kind.

Note that all activity record types are padded and aligned to ensure that each member of the record is naturally aligned.

See also:

CUpti_ActivityKind

Public Variables

CUpti_ActivityKind kind

Variables

CUpti_ActivityKind CUpti_Activity::kind [inherited]

The kind of this activity.
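
The kind-then-cast pattern described for CUpti_Activity can be sketched as follows. To keep the example compilable without the CUPTI headers, it uses simplified stand-in types: ExampleActivity, ExampleActivityContext, ExampleActivityCudaEvent, the EXAMPLE_KIND_* values and describe are all hypothetical and only mimic the shape of the real records.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Stand-in declarations so the sketch compiles on its own. They mimic the
// shape of the CUPTI types but are NOT the definitions from the CUPTI headers.
enum ExampleActivityKind : uint32_t {
    EXAMPLE_KIND_CONTEXT    = 1,
    EXAMPLE_KIND_CUDA_EVENT = 2,
};

struct ExampleActivity {          // plays the role of CUpti_Activity
    ExampleActivityKind kind;
};

struct ExampleActivityContext {   // plays the role of CUpti_ActivityContext
    ExampleActivityKind kind;     // first member, as in the generic record
    uint32_t contextId;
    uint32_t deviceId;
};

struct ExampleActivityCudaEvent { // plays the role of CUpti_ActivityCudaEvent
    ExampleActivityKind kind;
    uint32_t eventId;
    uint32_t streamId;
};

// Inspect `kind` first, then cast to the record type that matches it.
inline std::string describe(const ExampleActivity* record) {
    switch (record->kind) {
    case EXAMPLE_KIND_CONTEXT: {
        auto* ctx = reinterpret_cast<const ExampleActivityContext*>(record);
        return "context " + std::to_string(ctx->contextId) +
               " on device " + std::to_string(ctx->deviceId);
    }
    case EXAMPLE_KIND_CUDA_EVENT: {
        auto* ev = reinterpret_cast<const ExampleActivityCudaEvent*>(record);
        return "event " + std::to_string(ev->eventId) +
               " on stream " + std::to_string(ev->streamId);
    }
    default:
        return "unhandled kind";
    }
}
```

With the real headers, the same switch would be written over CUpti_ActivityKind values such as CUPTI_ACTIVITY_KIND_CONTEXT, casting to the matching record type listed in this chapter.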

6.8. CUpti_ActivityAPI Struct Reference

[CUPTI Activity API]

This activity record represents an invocation of a driver or runtime API (CUPTI_ACTIVITY_KIND_DRIVER and CUPTI_ACTIVITY_KIND_RUNTIME).

Public Variables

CUpti_CallbackId cbid
uint32_t  correlationId
uint64_t  end
CUpti_ActivityKind kind
uint32_t  processId
uint32_t  returnValue
uint64_t  start
uint32_t  threadId

Variables

CUpti_CallbackId CUpti_ActivityAPI::cbid [inherited]

The ID of the driver or runtime function.

uint32_t CUpti_ActivityAPI::correlationId [inherited]

The correlation ID of the driver or runtime CUDA function. Each function invocation is assigned a unique correlation ID that is identical to the correlation ID in the memcpy, memset, or kernel activity record that is associated with this function.

uint64_t CUpti_ActivityAPI::end [inherited]

The end timestamp for the function, in ns. A value of 0 for both the start and end timestamps indicates that timestamp information could not be collected for the function.

CUpti_ActivityKind CUpti_ActivityAPI::kind [inherited]

The activity record kind, must be CUPTI_ACTIVITY_KIND_DRIVER, CUPTI_ACTIVITY_KIND_RUNTIME, or CUPTI_ACTIVITY_KIND_INTERNAL_LAUNCH_API.

uint32_t CUpti_ActivityAPI::processId [inherited]

The ID of the process where the driver or runtime CUDA function is executing.

uint32_t CUpti_ActivityAPI::returnValue [inherited]

The return value for the function. For a CUDA driver function this will be a CUresult value, and for a CUDA runtime function this will be a cudaError_t value.

uint64_t CUpti_ActivityAPI::start [inherited]

The start timestamp for the function, in ns. A value of 0 for both the start and end timestamps indicates that timestamp information could not be collected for the function.

uint32_t CUpti_ActivityAPI::threadId [inherited]

The ID of the thread where the driver or runtime CUDA function is executing.
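
As a small illustration of the start and end fields, the sketch below computes the duration of an API call and treats the documented "both timestamps are 0" case as missing data. ExampleApiTimes and apiDurationNs are hypothetical names; the struct holds only the two timestamp fields and is not the CUpti_ActivityAPI definition.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Minimal stand-in holding only the two timestamp fields described above;
// it is NOT the CUpti_ActivityAPI definition from the CUPTI headers.
struct ExampleApiTimes {
    uint64_t start; // start timestamp, in ns
    uint64_t end;   // end timestamp, in ns
};

// Returns the call duration in ns, or std::nullopt when both timestamps are 0,
// which the field descriptions define as "timestamps could not be collected".
inline std::optional<uint64_t> apiDurationNs(const ExampleApiTimes& t) {
    if (t.start == 0 && t.end == 0) {
        return std::nullopt;
    }
    return t.end - t.start;
}
```

Returning an optional keeps the "no timestamp" case distinct from a genuine zero-length duration.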

6.9. CUpti_ActivityAutoBoostState Struct Reference

[CUPTI Activity API]

This structure defines the auto boost state for a device. See function cuptiGetAutoBoostState.

Public Variables

uint32_t  enabled
uint32_t  pid

Variables

uint32_t CUpti_ActivityAutoBoostState::enabled [inherited]

Returned auto boost state. 1 is returned in case auto boost is enabled, 0 otherwise

uint32_t CUpti_ActivityAutoBoostState::pid [inherited]

Id of process that has set the current boost state. The value will be CUPTI_AUTO_BOOST_INVALID_CLIENT_PID if the user does not have the permission to query process ids or there is an error in querying the process id.

6.10. CUpti_ActivityBranch Struct Reference

[CUPTI Activity API]

This activity records the locations of the branches in the source (CUPTI_ACTIVITY_KIND_BRANCH). Branch activities are now reported using the CUpti_ActivityBranch2 activity record.

Public Variables

uint32_t  correlationId
uint32_t  diverged
uint32_t  executed
CUpti_ActivityKind kind
uint32_t  pcOffset
uint32_t  sourceLocatorId
uint64_t  threadsExecuted

Variables

uint32_t CUpti_ActivityBranch::correlationId [inherited]

The correlation ID of the kernel to which this result is associated.

uint32_t CUpti_ActivityBranch::diverged [inherited]

Number of times this branch diverged

uint32_t CUpti_ActivityBranch::executed [inherited]

The number of times this instruction was executed per warp. It will be incremented regardless of predicate or condition code.

CUpti_ActivityKind CUpti_ActivityBranch::kind [inherited]

The activity record kind, must be CUPTI_ACTIVITY_KIND_BRANCH.

uint32_t CUpti_ActivityBranch::pcOffset [inherited]

The pc offset for the branch.

uint32_t CUpti_ActivityBranch::sourceLocatorId [inherited]

The ID for source locator.

uint64_t CUpti_ActivityBranch::threadsExecuted [inherited]

Each time this instruction is executed, this value is incremented by the number of threads that executed the instruction.

6.11. CUpti_ActivityBranch2 Struct Reference

[CUPTI Activity API]

This activity records the locations of the branches in the source (CUPTI_ACTIVITY_KIND_BRANCH).

Public Variables

uint32_t  correlationId
uint32_t  diverged
uint32_t  executed
uint32_t  functionId
CUpti_ActivityKind kind
uint32_t  pad
uint32_t  pcOffset
uint32_t  sourceLocatorId
uint64_t  threadsExecuted

Variables

uint32_t CUpti_ActivityBranch2::correlationId [inherited]

The correlation ID of the kernel to which this result is associated.

uint32_t CUpti_ActivityBranch2::diverged [inherited]

Number of times this branch diverged

uint32_t CUpti_ActivityBranch2::executed [inherited]

The number of times this instruction was executed per warp. It will be incremented regardless of predicate or condition code.

uint32_t CUpti_ActivityBranch2::functionId [inherited]

Correlation ID with global/device function name

CUpti_ActivityKind CUpti_ActivityBranch2::kind [inherited]

The activity record kind, must be CUPTI_ACTIVITY_KIND_BRANCH.

uint32_t CUpti_ActivityBranch2::pad [inherited]

Undefined. Reserved for internal use.

uint32_t CUpti_ActivityBranch2::pcOffset [inherited]

The pc offset for the branch.

uint32_t CUpti_ActivityBranch2::sourceLocatorId [inherited]

The ID for source locator.

uint64_t CUpti_ActivityBranch2::threadsExecuted [inherited]

Each time this instruction is executed, this value is incremented by the number of threads that executed the instruction.
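
The diverged, executed and threadsExecuted counters can be combined into simple per-branch ratios. The sketch below shows one plausible reading of the field descriptions, not a metric defined by CUPTI. ExampleBranchCounts, divergenceRatio and avgThreadsPerExecution are hypothetical names, and the struct contains only the three counters.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in with only the three counters discussed above; NOT the
// CUpti_ActivityBranch2 definition from the CUPTI headers.
struct ExampleBranchCounts {
    uint32_t diverged;        // number of times the branch diverged
    uint32_t executed;        // number of per-warp executions of the instruction
    uint64_t threadsExecuted; // threads summed over all executions
};

// Fraction of warp-level executions in which the branch diverged.
inline double divergenceRatio(const ExampleBranchCounts& b) {
    return b.executed == 0 ? 0.0
                           : static_cast<double>(b.diverged) / b.executed;
}

// Average number of threads that took part in each warp-level execution.
inline double avgThreadsPerExecution(const ExampleBranchCounts& b) {
    return b.executed == 0 ? 0.0
                           : static_cast<double>(b.threadsExecuted) / b.executed;
}
```

Both helpers return 0.0 when the instruction was never executed, which avoids a division by zero for records with empty counters.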

6.12. CUpti_ActivityCdpKernel Struct Reference

[CUPTI Activity API]

This activity record represents a CDP kernel execution.

Public Variables

int32_t  blockX
int32_t  blockY
int32_t  blockZ
uint64_t  completed
uint32_t  contextId
uint32_t  correlationId
uint32_t  deviceId
int32_t  dynamicSharedMemory
uint64_t  end
uint8_t  executed
int64_t  gridId
int32_t  gridX
int32_t  gridY
int32_t  gridZ
CUpti_ActivityKind kind
uint32_t  localMemoryPerThread
uint32_t  localMemoryTotal
const char * name
uint32_t  parentBlockX
uint32_t  parentBlockY
uint32_t  parentBlockZ
int64_t  parentGridId
uint64_t  queued
uint16_t  registersPerThread
uint8_t  requested
uint8_t  sharedMemoryConfig
uint64_t  start
int32_t  staticSharedMemory
uint32_t  streamId
uint64_t  submitted

Variables

int32_t CUpti_ActivityCdpKernel::blockX [inherited]

The X-dimension block size for the kernel.

int32_t CUpti_ActivityCdpKernel::blockY [inherited]

The Y-dimension block size for the kernel.

int32_t CUpti_ActivityCdpKernel::blockZ [inherited]

The Z-dimension block size for the kernel.

uint64_t CUpti_ActivityCdpKernel::completed [inherited]

The timestamp when the kernel is marked as completed, in ns. A value of CUPTI_TIMESTAMP_UNKNOWN indicates that the completion time is unknown.

uint32_t CUpti_ActivityCdpKernel::contextId [inherited]

The ID of the context where the kernel is executing.

uint32_t CUpti_ActivityCdpKernel::correlationId [inherited]

The correlation ID of the kernel. Each kernel execution is assigned a unique correlation ID that is identical to the correlation ID in the driver API activity record that launched the kernel.

uint32_t CUpti_ActivityCdpKernel::deviceId [inherited]

The ID of the device where the kernel is executing.

int32_t CUpti_ActivityCdpKernel::dynamicSharedMemory [inherited]

The dynamic shared memory reserved for the kernel, in bytes.

uint64_t CUpti_ActivityCdpKernel::end [inherited]

The end timestamp for the kernel execution, in ns. A value of 0 for both the start and end timestamps indicates that timestamp information could not be collected for the kernel.

uint8_t CUpti_ActivityCdpKernel::executed [inherited]

The cache configuration used for the kernel. The value is one of the CUfunc_cache enumeration values from cuda.h.

int64_t CUpti_ActivityCdpKernel::gridId [inherited]

The grid ID of the kernel. Each kernel execution is assigned a unique grid ID.

int32_t CUpti_ActivityCdpKernel::gridX [inherited]

The X-dimension grid size for the kernel.

int32_t CUpti_ActivityCdpKernel::gridY [inherited]

The Y-dimension grid size for the kernel.

int32_t CUpti_ActivityCdpKernel::gridZ [inherited]

The Z-dimension grid size for the kernel.

CUpti_ActivityKind CUpti_ActivityCdpKernel::kind [inherited]

The activity record kind, must be CUPTI_ACTIVITY_KIND_CDP_KERNEL

uint32_t CUpti_ActivityCdpKernel::localMemoryPerThread [inherited]

The amount of local memory reserved for each thread, in bytes.

uint32_t CUpti_ActivityCdpKernel::localMemoryTotal [inherited]

The total amount of local memory reserved for the kernel, in bytes.

const char * CUpti_ActivityCdpKernel::name [inherited]

The name of the kernel. This name is shared across all activity records representing the same kernel, and so should not be modified.

uint32_t CUpti_ActivityCdpKernel::parentBlockX [inherited]

The X-dimension of the parent block.

uint32_t CUpti_ActivityCdpKernel::parentBlockY [inherited]

The Y-dimension of the parent block.

uint32_t CUpti_ActivityCdpKernel::parentBlockZ [inherited]

The Z-dimension of the parent block.

int64_t CUpti_ActivityCdpKernel::parentGridId [inherited]

The grid ID of the parent kernel.

uint64_t CUpti_ActivityCdpKernel::queued [inherited]

The timestamp when the kernel is queued up, in ns. A value of CUPTI_TIMESTAMP_UNKNOWN indicates that the queued time is unknown.

uint16_t CUpti_ActivityCdpKernel::registersPerThread [inherited]

The number of registers required for each thread executing the kernel.

uint8_t CUpti_ActivityCdpKernel::requested [inherited]

The cache configuration requested by the kernel. The value is one of the CUfunc_cache enumeration values from cuda.h.

uint8_t CUpti_ActivityCdpKernel::sharedMemoryConfig [inherited]

The shared memory configuration used for the kernel. The value is one of the CUsharedconfig enumeration values from cuda.h.

uint64_t CUpti_ActivityCdpKernel::start [inherited]

The start timestamp for the kernel execution, in ns. A value of 0 for both the start and end timestamps indicates that timestamp information could not be collected for the kernel.

int32_t CUpti_ActivityCdpKernel::staticSharedMemory [inherited]

The static shared memory allocated for the kernel, in bytes.

uint32_t CUpti_ActivityCdpKernel::streamId [inherited]

The ID of the stream where the kernel is executing.

uint64_t CUpti_ActivityCdpKernel::submitted [inherited]

The timestamp when the kernel is submitted to the GPU, in ns. A value of CUPTI_TIMESTAMP_UNKNOWN indicates that the submission time is unknown.
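
The launch geometry and timestamp fields of a CDP kernel record lend themselves to two small derived values: the total number of threads launched and the delay between submission and start. The following is a sketch under stated assumptions. ExampleCdpKernel, kExampleTimestampUnknown, totalThreads and submitToStartNs are hypothetical names, the struct holds only a subset of the fields, and the sentinel value 0 is a placeholder assumption; real code should compare against CUPTI_TIMESTAMP_UNKNOWN from the CUPTI headers.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Stand-in with a subset of the fields described above; NOT the
// CUpti_ActivityCdpKernel definition from the CUPTI headers.
struct ExampleCdpKernel {
    int32_t  gridX, gridY, gridZ;    // grid size per dimension
    int32_t  blockX, blockY, blockZ; // block size per dimension
    uint64_t submitted;              // ns, may be "unknown"
    uint64_t start;                  // ns
    uint64_t end;                    // ns
};

// Placeholder for CUPTI_TIMESTAMP_UNKNOWN; the value 0 is an assumption here.
constexpr uint64_t kExampleTimestampUnknown = 0;

// Total threads launched: blocks in the grid times threads per block.
// The first operand is widened to 64 bits so the product cannot overflow int32_t.
inline int64_t totalThreads(const ExampleCdpKernel& k) {
    return int64_t{k.gridX} * k.gridY * k.gridZ * k.blockX * k.blockY * k.blockZ;
}

// Delay between submission to the GPU and start of execution, in ns.
// Returns std::nullopt if either side of the interval is unavailable.
inline std::optional<uint64_t> submitToStartNs(const ExampleCdpKernel& k) {
    if (k.submitted == kExampleTimestampUnknown) {
        return std::nullopt; // submission time unknown
    }
    if (k.start == 0 && k.end == 0) {
        return std::nullopt; // start/end timestamps could not be collected
    }
    return k.start - k.submitted;
}
```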

6.13. CUpti_ActivityContext Struct Reference

[CUPTI Activity API]

This activity record represents information about a context (CUPTI_ACTIVITY_KIND_CONTEXT).

Public Variables

uint16_t  computeApiKind
uint32_t  contextId
uint32_t  deviceId
CUpti_ActivityKind kind
uint16_t  nullStreamId

Variables

uint16_t CUpti_ActivityContext::computeApiKind [inherited]

The compute API kind.

See also:

CUpti_ActivityComputeApiKind

uint32_t CUpti_ActivityContext::contextId [inherited]

The context ID.

uint32_t CUpti_ActivityContext::deviceId [inherited]

The device ID.

CUpti_ActivityKind CUpti_ActivityContext::kind [inherited]

The activity record kind, must be CUPTI_ACTIVITY_KIND_CONTEXT.

uint16_t CUpti_ActivityContext::nullStreamId [inherited]

The ID for the NULL stream in this context

6.14. CUpti_ActivityCudaEvent Struct Reference

[CUPTI Activity API]

This activity is used to track recorded events (CUPTI_ACTIVITY_KIND_CUDA_EVENT).

Public Variables

uint32_t  contextId
uint32_t  correlationId
uint32_t  eventId
CUpti_ActivityKind kind
uint32_t  pad
uint32_t  streamId

Variables

uint32_t CUpti_ActivityCudaEvent::contextId [inherited]

The ID of the context where the event was recorded.

uint32_t CUpti_ActivityCudaEvent::correlationId [inherited]

The correlation ID of the API to which this result is associated.

uint32_t CUpti_ActivityCudaEvent::eventId [inherited]

A unique event ID to identify the event record.

CUpti_ActivityKind CUpti_ActivityCudaEvent::kind [inherited]

The activity record kind, must be CUPTI_ACTIVITY_KIND_CUDA_EVENT.

uint32_t CUpti_ActivityCudaEvent::pad [inherited]

Undefined. Reserved for internal use.

uint32_t CUpti_ActivityCudaEvent::streamId [inherited]

The compute stream where the event was recorded.

6.15. CUpti_ActivityDevice Struct Reference

[CUPTI Activity API]

This activity record represents information about a GPU device (CUPTI_ACTIVITY_KIND_DEVICE). Device activity is now reported using the CUpti_ActivityDevice5 activity record.

Public Variables

uint32_t  computeCapabilityMajor
uint32_t  computeCapabilityMinor
uint32_t  constantMemorySize
uint32_t  coreClockRate
CUpti_ActivityFlag flags
uint64_t  globalMemoryBandwidth
uint64_t  globalMemorySize
uint32_t  id
CUpti_ActivityKind kind
uint32_t  l2CacheSize
uint32_t  maxBlockDimX
uint32_t  maxBlockDimY
uint32_t  maxBlockDimZ
uint32_t  maxBlocksPerMultiprocessor
uint32_t  maxGridDimX
uint32_t  maxGridDimY
uint32_t  maxGridDimZ
uint32_t  maxIPC
uint32_t  maxRegistersPerBlock
uint32_t  maxSharedMemoryPerBlock
uint32_t  maxThreadsPerBlock
uint32_t  maxWarpsPerMultiprocessor
const char * name
uint32_t  numMemcpyEngines
uint32_t  numMultiprocessors
uint32_t  numThreadsPerWarp

Variables

uint32_t CUpti_ActivityDevice::computeCapabilityMajor [inherited]

Compute capability for the device, major number.

uint32_t CUpti_ActivityDevice::computeCapabilityMinor [inherited]

Compute capability for the device, minor number.

uint32_t CUpti_ActivityDevice::constantMemorySize [inherited]

The amount of constant memory on the device, in bytes.

uint32_t CUpti_ActivityDevice::coreClockRate [inherited]

The core clock rate of the device, in kHz.

CUpti_ActivityFlag CUpti_ActivityDevice::flags [inherited]

The flags associated with the device.

See also:

CUpti_ActivityFlag

uint64_t CUpti_ActivityDevice::globalMemoryBandwidth [inherited]

The global memory bandwidth available on the device, in kBytes/sec.

uint64_t CUpti_ActivityDevice::globalMemorySize [inherited]

The amount of global memory on the device, in bytes.

uint32_t CUpti_ActivityDevice::id [inherited]

The device ID.

CUpti_ActivityKind CUpti_ActivityDevice::kind [inherited]

The activity record kind, must be CUPTI_ACTIVITY_KIND_DEVICE.

uint32_t CUpti_ActivityDevice::l2CacheSize [inherited]

The size of the L2 cache on the device, in bytes.

uint32_t CUpti_ActivityDevice::maxBlockDimX [inherited]

Maximum allowed X dimension for a block.

uint32_t CUpti_ActivityDevice::maxBlockDimY [inherited]

Maximum allowed Y dimension for a block.

uint32_t CUpti_ActivityDevice::maxBlockDimZ [inherited]

Maximum allowed Z dimension for a block.

uint32_t CUpti_ActivityDevice::maxBlocksPerMultiprocessor [inherited]

Maximum number of blocks that can be present on a multiprocessor at any given time.

uint32_t CUpti_ActivityDevice::maxGridDimX [inherited]

Maximum allowed X dimension for a grid.

uint32_t CUpti_ActivityDevice::maxGridDimY [inherited]

Maximum allowed Y dimension for a grid.

uint32_t CUpti_ActivityDevice::maxGridDimZ [inherited]

Maximum allowed Z dimension for a grid.

uint32_t CUpti_ActivityDevice::maxIPC [inherited]

The maximum "instructions per cycle" possible on each device multiprocessor.

uint32_t CUpti_ActivityDevice::maxRegistersPerBlock [inherited]

Maximum number of registers that can be allocated to a block.

uint32_t CUpti_ActivityDevice::maxSharedMemoryPerBlock [inherited]

Maximum amount of shared memory that can be assigned to a block, in bytes.

uint32_t CUpti_ActivityDevice::maxThreadsPerBlock [inherited]

Maximum number of threads allowed in a block.

uint32_t CUpti_ActivityDevice::maxWarpsPerMultiprocessor [inherited]

Maximum number of warps that can be present on a multiprocessor at any given time.

const char * CUpti_ActivityDevice::name [inherited]

The device name. This name is shared across all activity records representing instances of the device, and so should not be modified.

uint32_t CUpti_ActivityDevice::numMemcpyEngines [inherited]

Number of memory copy engines on the device.

uint32_t CUpti_ActivityDevice::numMultiprocessors [inherited]

Number of multiprocessors on the device.

uint32_t CUpti_ActivityDevice::numThreadsPerWarp [inherited]

The number of threads per warp on the device.
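
As a hedged illustration of how the fields above combine, the following C sketch derives a few quantities from a device record. The struct DemoDeviceInfo is a hypothetical stand-in that holds only a subset of fields, with the same names and units as documented here; it is not the CUPTI definition, and real code would include the CUPTI headers instead. The helper function names are likewise invented for this example.

```c
#include <stdint.h>

/* Hypothetical stand-in for a subset of CUpti_ActivityDevice fields. */
typedef struct {
    uint32_t computeCapabilityMajor;
    uint32_t computeCapabilityMinor;
    uint32_t coreClockRate;              /* kHz, as documented above */
    uint32_t maxWarpsPerMultiprocessor;
    uint32_t numThreadsPerWarp;
    uint32_t numMultiprocessors;
} DemoDeviceInfo;

/* Upper bound on threads resident on one multiprocessor. */
uint32_t demo_max_threads_per_sm(const DemoDeviceInfo *d) {
    return d->maxWarpsPerMultiprocessor * d->numThreadsPerWarp;
}

/* Upper bound on threads resident on the whole device. */
uint64_t demo_max_resident_threads(const DemoDeviceInfo *d) {
    return (uint64_t)demo_max_threads_per_sm(d) * d->numMultiprocessors;
}

/* Core clock converted from kHz to MHz. */
double demo_core_clock_mhz(const DemoDeviceInfo *d) {
    return d->coreClockRate / 1000.0;
}

/* Compute capability packed as major * 10 + minor. */
uint32_t demo_cc_packed(const DemoDeviceInfo *d) {
    return d->computeCapabilityMajor * 10u + d->computeCapabilityMinor;
}
```

The resident-thread figures are theoretical limits derived from the record alone; actual occupancy also depends on per-block register and shared memory usage.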

6.16. CUpti_ActivityDevice2 Struct Reference

[CUPTI Activity API]

This activity record represents information about a GPU device (CUPTI_ACTIVITY_KIND_DEVICE). Device activity is now reported using the CUpti_ActivityDevice5 activity record.

Public Variables

uint32_t  computeCapabilityMajor
uint32_t  computeCapabilityMinor
uint32_t  constantMemorySize
uint32_t  coreClockRate
uint32_t  eccEnabled
CUpti_ActivityFlag flags
uint64_t  globalMemoryBandwidth
uint64_t  globalMemorySize
uint32_t  id
CUpti_ActivityKind kind
uint32_t  l2CacheSize
uint32_t  maxBlockDimX
uint32_t  maxBlockDimY
uint32_t  maxBlockDimZ
uint32_t  maxBlocksPerMultiprocessor
uint32_t  maxGridDimX
uint32_t  maxGridDimY
uint32_t  maxGridDimZ
uint32_t  maxIPC
uint32_t  maxRegistersPerBlock
uint32_t  maxRegistersPerMultiprocessor
uint32_t  maxSharedMemoryPerBlock
uint32_t  maxSharedMemoryPerMultiprocessor
uint32_t  maxThreadsPerBlock
uint32_t  maxWarpsPerMultiprocessor
const char * name
uint32_t  numMemcpyEngines
uint32_t  numMultiprocessors
uint32_t  numThreadsPerWarp
uint32_t  pad
CUuuid  uuid

Variables

uint32_t CUpti_ActivityDevice2::computeCapabilityMajor [inherited]

Compute capability for the device, major number.

uint32_t CUpti_ActivityDevice2::computeCapabilityMinor [inherited]

Compute capability for the device, minor number.

uint32_t CUpti_ActivityDevice2::constantMemorySize [inherited]

The amount of constant memory on the device, in bytes.

uint32_t CUpti_ActivityDevice2::coreClockRate [inherited]

The core clock rate of the device, in kHz.

uint32_t CUpti_ActivityDevice2::eccEnabled [inherited]

ECC enabled flag for the device.

CUpti_ActivityFlag CUpti_ActivityDevice2::flags [inherited]

The flags associated with the device.

See also:

CUpti_ActivityFlag

uint64_t CUpti_ActivityDevice2::globalMemoryBandwidth [inherited]

The global memory bandwidth available on the device, in kBytes/sec.

uint64_t CUpti_ActivityDevice2::globalMemorySize [inherited]

The amount of global memory on the device, in bytes.

uint32_t CUpti_ActivityDevice2::id [inherited]

The device ID.

CUpti_ActivityKind CUpti_ActivityDevice2::kind [inherited]

The activity record kind, must be CUPTI_ACTIVITY_KIND_DEVICE.

uint32_t CUpti_ActivityDevice2::l2CacheSize [inherited]

The size of the L2 cache on the device, in bytes.

uint32_t CUpti_ActivityDevice2::maxBlockDimX [inherited]

Maximum allowed X dimension for a block.

uint32_t CUpti_ActivityDevice2::maxBlockDimY [inherited]

Maximum allowed Y dimension for a block.

uint32_t CUpti_ActivityDevice2::maxBlockDimZ [inherited]

Maximum allowed Z dimension for a block.

uint32_t CUpti_ActivityDevice2::maxBlocksPerMultiprocessor [inherited]

Maximum number of blocks that can be present on a multiprocessor at any given time.

uint32_t CUpti_ActivityDevice2::maxGridDimX [inherited]

Maximum allowed X dimension for a grid.

uint32_t CUpti_ActivityDevice2::maxGridDimY [inherited]

Maximum allowed Y dimension for a grid.

uint32_t CUpti_ActivityDevice2::maxGridDimZ [inherited]

Maximum allowed Z dimension for a grid.

uint32_t CUpti_ActivityDevice2::maxIPC [inherited]

The maximum "instructions per cycle" possible on each device multiprocessor.

uint32_t CUpti_ActivityDevice2::maxRegistersPerBlock [inherited]

Maximum number of registers that can be allocated to a block.

uint32_t CUpti_ActivityDevice2::maxRegistersPerMultiprocessor [inherited]

Maximum number of 32-bit registers available per multiprocessor.

uint32_t CUpti_ActivityDevice2::maxSharedMemoryPerBlock [inherited]

Maximum amount of shared memory that can be assigned to a block, in bytes.

uint32_t CUpti_ActivityDevice2::maxSharedMemoryPerMultiprocessor [inherited]

Maximum amount of shared memory available per multiprocessor, in bytes.

uint32_t CUpti_ActivityDevice2::maxThreadsPerBlock [inherited]

Maximum number of threads allowed in a block.

uint32_t CUpti_ActivityDevice2::maxWarpsPerMultiprocessor [inherited]

Maximum number of warps that can be present on a multiprocessor at any given time.

const char * CUpti_ActivityDevice2::name [inherited]

The device name. This name is shared across all activity records representing instances of the device, and so should not be modified.

uint32_t CUpti_ActivityDevice2::numMemcpyEngines [inherited]

Number of memory copy engines on the device.

uint32_t CUpti_ActivityDevice2::numMultiprocessors [inherited]

Number of multiprocessors on the device.

uint32_t CUpti_ActivityDevice2::numThreadsPerWarp [inherited]

The number of threads per warp on the device.

uint32_t CUpti_ActivityDevice2::pad [inherited]

Undefined. Reserved for internal use.

CUuuid CUpti_ActivityDevice2::uuid [inherited]

The device UUID. This value is the globally unique immutable alphanumeric identifier of the device.
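
The uuid field holds raw bytes, so tools usually render it as text before showing it to a user. The sketch below formats 16 bytes in the common 8-4-4-4-12 hexadecimal layout. DemoUuid is a hypothetical stand-in that assumes the UUID is stored as a 16-byte char array, and demo_format_uuid is a name invented for this example.

```c
#include <stdio.h>

/* Hypothetical stand-in for CUuuid, assumed to hold 16 raw bytes. */
typedef struct {
    char bytes[16];
} DemoUuid;

/* Writes the 8-4-4-4-12 hex form into out, which must hold 37 bytes. */
void demo_format_uuid(const DemoUuid *u, char out[37]) {
    char *p = out;
    for (int i = 0; i < 16; ++i) {
        if (i == 4 || i == 6 || i == 8 || i == 10) {
            *p++ = '-';
        }
        /* Cast through unsigned char so bytes >= 0x80 are not sign-extended. */
        snprintf(p, 3, "%02x", (unsigned)(unsigned char)u->bytes[i]);
        p += 2;
    }
    *p = '\0';
}
```

The cast matters because plain char may be signed; without it, a byte such as 0x89 could print as a sign-extended value.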

6.17. CUpti_ActivityDevice3 Struct Reference

[CUPTI Activity API]

This activity record represents information about a GPU device (CUPTI_ACTIVITY_KIND_DEVICE). Device activity is now reported using the CUpti_ActivityDevice5 activity record.

Public Variables

uint32_t  computeCapabilityMajor
uint32_t  computeCapabilityMinor
uint32_t  constantMemorySize
uint32_t  coreClockRate
uint32_t  eccEnabled
CUpti_ActivityFlag flags
uint64_t  globalMemoryBandwidth
uint64_t  globalMemorySize
uint32_t  id
uint8_t  isCudaVisible
CUpti_ActivityKind kind
uint32_t  l2CacheSize
uint32_t  maxBlockDimX
uint32_t  maxBlockDimY
uint32_t  maxBlockDimZ
uint32_t  maxBlocksPerMultiprocessor
uint32_t  maxGridDimX
uint32_t  maxGridDimY
uint32_t  maxGridDimZ
uint32_t  maxIPC
uint32_t  maxRegistersPerBlock
uint32_t  maxRegistersPerMultiprocessor
uint32_t  maxSharedMemoryPerBlock
uint32_t  maxSharedMemoryPerMultiprocessor
uint32_t  maxThreadsPerBlock
uint32_t  maxWarpsPerMultiprocessor
const char * name
uint32_t  numMemcpyEngines
uint32_t  numMultiprocessors
uint32_t  numThreadsPerWarp
uint32_t  pad
CUuuid  uuid

Variables

uint32_t CUpti_ActivityDevice3::computeCapabilityMajor [inherited]

Compute capability for the device, major number.

uint32_t CUpti_ActivityDevice3::computeCapabilityMinor [inherited]

Compute capability for the device, minor number.

uint32_t CUpti_ActivityDevice3::constantMemorySize [inherited]

The amount of constant memory on the device, in bytes.

uint32_t CUpti_ActivityDevice3::coreClockRate [inherited]

The core clock rate of the device, in kHz.

uint32_t CUpti_ActivityDevice3::eccEnabled [inherited]

ECC enabled flag for the device.

CUpti_ActivityFlag CUpti_ActivityDevice3::flags [inherited]

The flags associated with the device.

See also:

CUpti_ActivityFlag

uint64_t CUpti_ActivityDevice3::globalMemoryBandwidth [inherited]

The global memory bandwidth available on the device, in kBytes/sec.

uint64_t CUpti_ActivityDevice3::globalMemorySize [inherited]

The amount of global memory on the device, in bytes.

uint32_t CUpti_ActivityDevice3::id [inherited]

The device ID.

uint8_t CUpti_ActivityDevice3::isCudaVisible [inherited]

Flag to indicate whether the device is visible to CUDA. Users can set the device visibility using the CUDA_VISIBLE_DEVICES environment variable.

CUpti_ActivityKind CUpti_ActivityDevice3::kind [inherited]

The activity record kind, must be CUPTI_ACTIVITY_KIND_DEVICE.

uint32_t CUpti_ActivityDevice3::l2CacheSize [inherited]

The size of the L2 cache on the device, in bytes.

uint32_t CUpti_ActivityDevice3::maxBlockDimX [inherited]

Maximum allowed X dimension for a block.

uint32_t CUpti_ActivityDevice3::maxBlockDimY [inherited]

Maximum allowed Y dimension for a block.

uint32_t CUpti_ActivityDevice3::maxBlockDimZ [inherited]

Maximum allowed Z dimension for a block.

uint32_t CUpti_ActivityDevice3::maxBlocksPerMultiprocessor [inherited]

Maximum number of blocks that can be present on a multiprocessor at any given time.

uint32_t CUpti_ActivityDevice3::maxGridDimX [inherited]

Maximum allowed X dimension for a grid.

uint32_t CUpti_ActivityDevice3::maxGridDimY [inherited]

Maximum allowed Y dimension for a grid.

uint32_t CUpti_ActivityDevice3::maxGridDimZ [inherited]

Maximum allowed Z dimension for a grid.

uint32_t CUpti_ActivityDevice3::maxIPC [inherited]

The maximum "instructions per cycle" possible on each device multiprocessor.

uint32_t CUpti_ActivityDevice3::maxRegistersPerBlock [inherited]

Maximum number of registers that can be allocated to a block.

uint32_t CUpti_ActivityDevice3::maxRegistersPerMultiprocessor [inherited]

Maximum number of 32-bit registers available per multiprocessor.

uint32_t CUpti_ActivityDevice3::maxSharedMemoryPerBlock [inherited]

Maximum amount of shared memory that can be assigned to a block, in bytes.

uint32_t CUpti_ActivityDevice3::maxSharedMemoryPerMultiprocessor [inherited]

Maximum amount of shared memory available per multiprocessor, in bytes.

uint32_t CUpti_ActivityDevice3::maxThreadsPerBlock [inherited]

Maximum number of threads allowed in a block.

uint32_t CUpti_ActivityDevice3::maxWarpsPerMultiprocessor [inherited]

Maximum number of warps that can be present on a multiprocessor at any given time.

const char * CUpti_ActivityDevice3::name [inherited]

The device name. This name is shared across all activity records representing instances of the device, and so should not be modified.

uint32_t CUpti_ActivityDevice3::numMemcpyEngines [inherited]

Number of memory copy engines on the device.

uint32_t CUpti_ActivityDevice3::numMultiprocessors [inherited]

Number of multiprocessors on the device.

uint32_t CUpti_ActivityDevice3::numThreadsPerWarp [inherited]

The number of threads per warp on the device.

uint32_t CUpti_ActivityDevice3::pad [inherited]

Undefined. Reserved for internal use.

CUuuid CUpti_ActivityDevice3::uuid [inherited]

The device UUID. This value is the globally unique immutable alphanumeric identifier of the device.

6.18. CUpti_ActivityDevice4 Struct Reference

[CUPTI Activity API]

This activity record represents information about a GPU device (CUPTI_ACTIVITY_KIND_DEVICE). Device activity is now reported using the CUpti_ActivityDevice5 activity record.

Public Variables

uint32_t  computeCapabilityMajor
uint32_t  computeCapabilityMinor
uint32_t  computeInstanceId
uint32_t  constantMemorySize
uint32_t  coreClockRate
uint32_t  eccEnabled
CUpti_ActivityFlag flags
uint64_t  globalMemoryBandwidth
uint64_t  globalMemorySize
uint32_t  gpuInstanceId
uint32_t  id
uint8_t  isCudaVisible
uint8_t  isMigEnabled
CUpti_ActivityKind kind
uint32_t  l2CacheSize
uint32_t  maxBlockDimX
uint32_t  maxBlockDimY
uint32_t  maxBlockDimZ
uint32_t  maxBlocksPerMultiprocessor
uint32_t  maxGridDimX
uint32_t  maxGridDimY
uint32_t  maxGridDimZ
uint32_t  maxIPC
uint32_t  maxRegistersPerBlock
uint32_t  maxRegistersPerMultiprocessor
uint32_t  maxSharedMemoryPerBlock
uint32_t  maxSharedMemoryPerMultiprocessor
uint32_t  maxThreadsPerBlock
uint32_t  maxWarpsPerMultiprocessor
CUuuid  migUuid
const char * name
uint32_t  numMemcpyEngines
uint32_t  numMultiprocessors
uint32_t  numThreadsPerWarp
uint32_t  pad
CUuuid  uuid

Variables

uint32_t CUpti_ActivityDevice4::computeCapabilityMajor [inherited]

Compute capability for the device, major number.

uint32_t CUpti_ActivityDevice4::computeCapabilityMinor [inherited]

Compute capability for the device, minor number.

uint32_t CUpti_ActivityDevice4::computeInstanceId [inherited]

Compute instance ID for MIG-enabled devices. If MIG mode is disabled, the value is set to UINT32_MAX.

uint32_t CUpti_ActivityDevice4::constantMemorySize [inherited]

The amount of constant memory on the device, in bytes.

uint32_t CUpti_ActivityDevice4::coreClockRate [inherited]

The core clock rate of the device, in kHz.

uint32_t CUpti_ActivityDevice4::eccEnabled [inherited]

ECC enabled flag for the device.

CUpti_ActivityFlag CUpti_ActivityDevice4::flags [inherited]

The flags associated with the device.

See also:

CUpti_ActivityFlag

uint64_t CUpti_ActivityDevice4::globalMemoryBandwidth [inherited]

The global memory bandwidth available on the device, in kBytes/sec.

uint64_t CUpti_ActivityDevice4::globalMemorySize [inherited]

The amount of global memory on the device, in bytes.

uint32_t CUpti_ActivityDevice4::gpuInstanceId [inherited]

GPU instance ID for MIG-enabled devices. If MIG mode is disabled, the value is set to UINT32_MAX.

uint32_t CUpti_ActivityDevice4::id [inherited]

The device ID.

uint8_t CUpti_ActivityDevice4::isCudaVisible [inherited]

Flag to indicate whether the device is visible to CUDA. Users can set the device visibility using the CUDA_VISIBLE_DEVICES environment variable.

uint8_t CUpti_ActivityDevice4::isMigEnabled [inherited]

MIG enabled flag for the device.

CUpti_ActivityKind CUpti_ActivityDevice4::kind [inherited]

The activity record kind, must be CUPTI_ACTIVITY_KIND_DEVICE.

uint32_t CUpti_ActivityDevice4::l2CacheSize [inherited]

The size of the L2 cache on the device, in bytes.

uint32_t CUpti_ActivityDevice4::maxBlockDimX [inherited]

Maximum allowed X dimension for a block.

uint32_t CUpti_ActivityDevice4::maxBlockDimY [inherited]

Maximum allowed Y dimension for a block.

uint32_t CUpti_ActivityDevice4::maxBlockDimZ [inherited]

Maximum allowed Z dimension for a block.

uint32_t CUpti_ActivityDevice4::maxBlocksPerMultiprocessor [inherited]

Maximum number of blocks that can be present on a multiprocessor at any given time.

uint32_t CUpti_ActivityDevice4::maxGridDimX [inherited]

Maximum allowed X dimension for a grid.

uint32_t CUpti_ActivityDevice4::maxGridDimY [inherited]

Maximum allowed Y dimension for a grid.

uint32_t CUpti_ActivityDevice4::maxGridDimZ [inherited]

Maximum allowed Z dimension for a grid.

uint32_t CUpti_ActivityDevice4::maxIPC [inherited]

The maximum "instructions per cycle" possible on each device multiprocessor.

uint32_t CUpti_ActivityDevice4::maxRegistersPerBlock [inherited]

Maximum number of registers that can be allocated to a block.

uint32_t CUpti_ActivityDevice4::maxRegistersPerMultiprocessor [inherited]

Maximum number of 32-bit registers available per multiprocessor.

uint32_t CUpti_ActivityDevice4::maxSharedMemoryPerBlock [inherited]

Maximum amount of shared memory that can be assigned to a block, in bytes.

uint32_t CUpti_ActivityDevice4::maxSharedMemoryPerMultiprocessor [inherited]

Maximum amount of shared memory available per multiprocessor, in bytes.

uint32_t CUpti_ActivityDevice4::maxThreadsPerBlock [inherited]

Maximum number of threads allowed in a block.

uint32_t CUpti_ActivityDevice4::maxWarpsPerMultiprocessor [inherited]

Maximum number of warps that can be present on a multiprocessor at any given time.

CUuuid CUpti_ActivityDevice4::migUuid [inherited]

The MIG UUID. This value is the globally unique immutable alphanumeric identifier of the device.

const char * CUpti_ActivityDevice4::name [inherited]

The device name. This name is shared across all activity records representing instances of the device, and so should not be modified.

uint32_t CUpti_ActivityDevice4::numMemcpyEngines [inherited]

Number of memory copy engines on the device.

uint32_t CUpti_ActivityDevice4::numMultiprocessors [inherited]

Number of multiprocessors on the device.

uint32_t CUpti_ActivityDevice4::numThreadsPerWarp [inherited]

The number of threads per warp on the device.

uint32_t CUpti_ActivityDevice4::pad [inherited]

Undefined. Reserved for internal use.

CUuuid CUpti_ActivityDevice4::uuid [inherited]

The device UUID. This value is the globally unique immutable alphanumeric identifier of the device.
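
The MIG fields use UINT32_MAX as a sentinel when MIG mode is disabled, so a consumer should check for it before treating the instance IDs as meaningful. The following sketch shows one way to do that. DemoDevice4 is a hypothetical stand-in with only the relevant fields, and the function names and label format are invented for illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for the MIG-related fields of CUpti_ActivityDevice4. */
typedef struct {
    uint32_t id;
    uint8_t  isMigEnabled;
    uint32_t gpuInstanceId;      /* UINT32_MAX when MIG mode is disabled */
    uint32_t computeInstanceId;  /* UINT32_MAX when MIG mode is disabled */
} DemoDevice4;

/* Returns 1 when the record carries usable MIG instance IDs, 0 otherwise. */
int demo_has_mig_instance(const DemoDevice4 *d) {
    return d->isMigEnabled
        && d->gpuInstanceId != UINT32_MAX
        && d->computeInstanceId != UINT32_MAX;
}

/* Writes a label such as "dev0/gi1/ci0", or "dev0" without MIG. */
int demo_device_label(const DemoDevice4 *d, char *out, size_t cap) {
    if (demo_has_mig_instance(d)) {
        return snprintf(out, cap, "dev%u/gi%u/ci%u",
                        (unsigned)d->id,
                        (unsigned)d->gpuInstanceId,
                        (unsigned)d->computeInstanceId);
    }
    return snprintf(out, cap, "dev%u", (unsigned)d->id);
}
```

Checking both the flag and the sentinel is deliberately defensive; either check alone may be sufficient in practice.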

6.19. CUpti_ActivityDevice5 Struct Reference

[CUPTI Activity API]

This activity record represents information about a GPU device (CUPTI_ACTIVITY_KIND_DEVICE).

Public Variables

uint32_t  computeCapabilityMajor
uint32_t  computeCapabilityMinor
uint32_t  computeInstanceId
uint32_t  constantMemorySize
uint32_t  coreClockRate
uint32_t  eccEnabled
CUpti_ActivityFlag flags
uint64_t  globalMemoryBandwidth
uint64_t  globalMemorySize
uint32_t  gpuInstanceId
uint32_t  id
uint8_t  isCudaVisible
uint8_t  isMigEnabled
uint32_t  isNumaNode
CUpti_ActivityKind kind
uint32_t  l2CacheSize
uint32_t  maxBlockDimX
uint32_t  maxBlockDimY
uint32_t  maxBlockDimZ
uint32_t  maxBlocksPerMultiprocessor
uint32_t  maxGridDimX
uint32_t  maxGridDimY
uint32_t  maxGridDimZ
uint32_t  maxIPC
uint32_t  maxRegistersPerBlock
uint32_t  maxRegistersPerMultiprocessor
uint32_t  maxSharedMemoryPerBlock
uint32_t  maxSharedMemoryPerMultiprocessor
uint32_t  maxThreadsPerBlock
uint32_t  maxWarpsPerMultiprocessor
CUuuid  migUuid
const char * name
uint32_t  numMemcpyEngines
uint32_t  numMultiprocessors
uint32_t  numThreadsPerWarp
uint32_t  numaId
uint32_t  pad
CUuuid  uuid

Variables

uint32_t CUpti_ActivityDevice5::computeCapabilityMajor [inherited]

Compute capability for the device, major number.

uint32_t CUpti_ActivityDevice5::computeCapabilityMinor [inherited]

Compute capability for the device, minor number.

uint32_t CUpti_ActivityDevice5::computeInstanceId [inherited]

Compute instance ID for MIG-enabled devices. If MIG mode is disabled, the value is set to UINT32_MAX.

uint32_t CUpti_ActivityDevice5::constantMemorySize [inherited]

The amount of constant memory on the device, in bytes.

uint32_t CUpti_ActivityDevice5::coreClockRate [inherited]

The core clock rate of the device, in kHz.

uint32_t CUpti_ActivityDevice5::eccEnabled [inherited]

ECC enabled flag for the device.

CUpti_ActivityFlag CUpti_ActivityDevice5::flags [inherited]

The flags associated with the device.

See also:

CUpti_ActivityFlag

uint64_t CUpti_ActivityDevice5::globalMemoryBandwidth [inherited]

The global memory bandwidth available on the device, in kBytes/sec.

uint64_t CUpti_ActivityDevice5::globalMemorySize [inherited]

The amount of global memory on the device, in bytes.

uint32_t CUpti_ActivityDevice5::gpuInstanceId [inherited]

GPU instance ID for MIG-enabled devices. If MIG mode is disabled, the value is set to UINT32_MAX.

uint32_t CUpti_ActivityDevice5::id [inherited]

The device ID.

uint8_t CUpti_ActivityDevice5::isCudaVisible [inherited]

Flag to indicate whether the device is visible to CUDA. Users can set the device visibility using the CUDA_VISIBLE_DEVICES environment variable.

uint8_t CUpti_ActivityDevice5::isMigEnabled [inherited]

MIG enabled flag for the device.

uint32_t CUpti_ActivityDevice5::isNumaNode [inherited]

NUMA (non-uniform memory access) information for the device: indicates whether the GPU is a NUMA node.

CUpti_ActivityKind CUpti_ActivityDevice5::kind [inherited]

The activity record kind, must be CUPTI_ACTIVITY_KIND_DEVICE.

uint32_t CUpti_ActivityDevice5::l2CacheSize [inherited]

The size of the L2 cache on the device, in bytes.

uint32_t CUpti_ActivityDevice5::maxBlockDimX [inherited]

Maximum allowed X dimension for a block.

uint32_t CUpti_ActivityDevice5::maxBlockDimY [inherited]

Maximum allowed Y dimension for a block.

uint32_t CUpti_ActivityDevice5::maxBlockDimZ [inherited]

Maximum allowed Z dimension for a block.

uint32_t CUpti_ActivityDevice5::maxBlocksPerMultiprocessor [inherited]

Maximum number of blocks that can be present on a multiprocessor at any given time.

uint32_t CUpti_ActivityDevice5::maxGridDimX [inherited]

Maximum allowed X dimension for a grid.

uint32_t CUpti_ActivityDevice5::maxGridDimY [inherited]

Maximum allowed Y dimension for a grid.

uint32_t CUpti_ActivityDevice5::maxGridDimZ [inherited]

Maximum allowed Z dimension for a grid.

uint32_t CUpti_ActivityDevice5::maxIPC [inherited]

The maximum "instructions per cycle" possible on each device multiprocessor.

uint32_t CUpti_ActivityDevice5::maxRegistersPerBlock [inherited]

Maximum number of registers that can be allocated to a block.

uint32_t CUpti_ActivityDevice5::maxRegistersPerMultiprocessor [inherited]

Maximum number of 32-bit registers available per multiprocessor.

uint32_t CUpti_ActivityDevice5::maxSharedMemoryPerBlock [inherited]

Maximum amount of shared memory that can be assigned to a block, in bytes.

uint32_t CUpti_ActivityDevice5::maxSharedMemoryPerMultiprocessor [inherited]

Maximum amount of shared memory available per multiprocessor, in bytes.

uint32_t CUpti_ActivityDevice5::maxThreadsPerBlock [inherited]

Maximum number of threads allowed in a block.

uint32_t CUpti_ActivityDevice5::maxWarpsPerMultiprocessor [inherited]

Maximum number of warps that can be present on a multiprocessor at any given time.

CUuuid CUpti_ActivityDevice5::migUuid [inherited]

The MIG UUID. This value is the globally unique immutable alphanumeric identifier of the device.

const char * CUpti_ActivityDevice5::name [inherited]

The device name. This name is shared across all activity records representing instances of the device, and so should not be modified.

uint32_t CUpti_ActivityDevice5::numMemcpyEngines [inherited]

Number of memory copy engines on the device.

uint32_t CUpti_ActivityDevice5::numMultiprocessors [inherited]

Number of multiprocessors on the device.

uint32_t CUpti_ActivityDevice5::numThreadsPerWarp [inherited]

The number of threads per warp on the device.

uint32_t CUpti_ActivityDevice5::numaId [inherited]

NUMA (non-uniform memory access) information for the device: the NUMA node ID of the GPU memory. If the GPU is not a NUMA node, the value is invalidNumaId.

uint32_t CUpti_ActivityDevice5::pad [inherited]

Undefined. Reserved for internal use.

CUuuid CUpti_ActivityDevice5::uuid [inherited]

The device UUID. This value is the globally unique immutable alphanumeric identifier of the device.
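
Because numaId is only meaningful when the GPU is a NUMA node, a consumer can gate on isNumaNode rather than compare against a sentinel. The sketch below does that without relying on the numeric value of invalidNumaId, which this section does not state. DemoDevice5 and demo_get_numa_id are hypothetical names for illustration.

```c
#include <stdint.h>

/* Hypothetical stand-in for the NUMA-related fields of CUpti_ActivityDevice5. */
typedef struct {
    uint32_t id;
    uint32_t isNumaNode;
    uint32_t numaId;
} DemoDevice5;

/*
 * Stores the NUMA node ID in *out and returns 1 if the GPU is a NUMA node.
 * Otherwise returns 0 and leaves *out untouched.
 */
int demo_get_numa_id(const DemoDevice5 *d, uint32_t *out) {
    if (!d->isNumaNode) {
        return 0;
    }
    *out = d->numaId;
    return 1;
}
```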

6.20. CUpti_ActivityDeviceAttribute Struct Reference

[CUPTI Activity API]

This activity record represents information about a GPU device: either a CUpti_DeviceAttribute or CUdevice_attribute value (CUPTI_ACTIVITY_KIND_DEVICE_ATTRIBUTE).

Public Variables

CUpti_ActivityDeviceAttribute::@23  attribute
uint32_t  deviceId
CUpti_ActivityFlag flags
CUpti_ActivityKind kind
CUpti_ActivityDeviceAttribute::@24  value

Variables

CUpti_ActivityDeviceAttribute::@23 CUpti_ActivityDeviceAttribute::attribute [inherited]

The attribute, either a CUpti_DeviceAttribute or CUdevice_attribute. Flag CUPTI_ACTIVITY_FLAG_DEVICE_ATTRIBUTE_CUDEVICE is used to indicate what kind of attribute this is. If CUPTI_ACTIVITY_FLAG_DEVICE_ATTRIBUTE_CUDEVICE is set, the CUdevice_attribute field is valid; otherwise the CUpti_DeviceAttribute field is valid.

uint32_t CUpti_ActivityDeviceAttribute::deviceId [inherited]

The ID of the device that this attribute applies to.

CUpti_ActivityFlag CUpti_ActivityDeviceAttribute::flags [inherited]

The flags associated with the device.

See also:

CUpti_ActivityFlag

CUpti_ActivityKind CUpti_ActivityDeviceAttribute::kind [inherited]

The activity record kind, must be CUPTI_ACTIVITY_KIND_DEVICE_ATTRIBUTE.

CUpti_ActivityDeviceAttribute::@24 CUpti_ActivityDeviceAttribute::value [inherited]

The value for the attribute. See CUpti_DeviceAttribute and CUdevice_attribute for the type of the value for a given attribute.
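
The attribute field is effectively a tagged union, with a bit in flags acting as the tag. The sketch below shows that dispatch. Everything in it is a hypothetical stand-in: the flag value (1u << 0) is a placeholder, the union member names are assumed, and the two attribute enum types are reduced to int. Consult the CUPTI headers for the real definitions.

```c
#include <stdint.h>

/* Placeholder bit; the real value comes from the CUpti_ActivityFlag enum. */
#define DEMO_FLAG_DEVICE_ATTRIBUTE_CUDEVICE (1u << 0)

typedef enum { DEMO_ATTR_CUPTI, DEMO_ATTR_CUDEVICE } DemoAttrSource;

/* Hypothetical stand-in for CUpti_ActivityDeviceAttribute. */
typedef struct {
    uint32_t flags;
    uint32_t deviceId;
    union {
        int cupti;  /* stands in for CUpti_DeviceAttribute */
        int cu;     /* stands in for CUdevice_attribute */
    } attribute;
    union {
        double   vDouble;
        uint32_t vUint32;
        uint64_t vUint64;
        int32_t  vInt32;
        int64_t  vInt64;
    } value;
} DemoDeviceAttribute;

/* Reports which member of the attribute union is valid. */
DemoAttrSource demo_attr_source(const DemoDeviceAttribute *a) {
    return (a->flags & DEMO_FLAG_DEVICE_ATTRIBUTE_CUDEVICE)
        ? DEMO_ATTR_CUDEVICE
        : DEMO_ATTR_CUPTI;
}

/* Reads the attribute identifier from the valid union member only. */
int demo_attr_id(const DemoDeviceAttribute *a) {
    return demo_attr_source(a) == DEMO_ATTR_CUDEVICE
        ? a->attribute.cu
        : a->attribute.cupti;
}
```

Which member of the value union to read still depends on the specific attribute, as the description of value notes.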

6.21. CUpti_ActivityEnvironment Struct Reference

[CUPTI Activity API]

This activity record provides CUPTI environmental data, including power, clocks, and thermals. This information is sampled at various rates and returned in this activity record. The consumer of the record must check the environmentKind field to determine what kind of environmental record this is.

Public Variables

CUpti_EnvironmentClocksThrottleReason clocksThrottleReasons
CUpti_ActivityEnvironment::@25::@29  cooling
uint32_t  deviceId
CUpti_ActivityEnvironmentKind environmentKind
uint32_t  fanSpeed
uint32_t  gpuTemperature
CUpti_ActivityKind kind
uint32_t  memoryClock
uint32_t  pcieLinkGen
uint32_t  pcieLinkWidth
CUpti_ActivityEnvironment::@25::@28  power
uint32_t  power
uint32_t  powerLimit
uint32_t  smClock
CUpti_ActivityEnvironment::@25::@26  speed
CUpti_ActivityEnvironment::@25::@27  temperature
uint64_t  timestamp

Variables

CUpti_EnvironmentClocksThrottleReason CUpti_ActivityEnvironment::clocksThrottleReasons [inherited]

The clocks throttle reasons.

CUpti_ActivityEnvironment::@25::@29 CUpti_ActivityEnvironment::cooling [inherited]

Data returned for CUPTI_ACTIVITY_ENVIRONMENT_COOLING environment kind.

uint32_t CUpti_ActivityEnvironment::deviceId [inherited]

The ID of the device.

CUpti_ActivityEnvironmentKind CUpti_ActivityEnvironment::environmentKind [inherited]

The kind of data reported in this record.

uint32_t CUpti_ActivityEnvironment::fanSpeed [inherited]

The fan speed as a percentage of maximum.

uint32_t CUpti_ActivityEnvironment::gpuTemperature [inherited]

The GPU temperature in degrees C.

CUpti_ActivityKind CUpti_ActivityEnvironment::kind [inherited]

The activity record kind, must be CUPTI_ACTIVITY_KIND_ENVIRONMENT.

uint32_t CUpti_ActivityEnvironment::memoryClock [inherited]

The memory frequency in MHz.

uint32_t CUpti_ActivityEnvironment::pcieLinkGen [inherited]

The PCIe link generation.

uint32_t CUpti_ActivityEnvironment::pcieLinkWidth [inherited]

The PCIe link width.

CUpti_ActivityEnvironment::@25::@28 CUpti_ActivityEnvironment::power [inherited]

Data returned for CUPTI_ACTIVITY_ENVIRONMENT_POWER environment kind.

uint32_t CUpti_ActivityEnvironment::power [inherited]

The power in milliwatts consumed by GPU and associated circuitry.

uint32_t CUpti_ActivityEnvironment::powerLimit [inherited]

The power in milliwatts that will trigger the power management algorithm.

uint32_t CUpti_ActivityEnvironment::smClock [inherited]

The SM frequency in MHz.

CUpti_ActivityEnvironment::@25::@26 CUpti_ActivityEnvironment::speed [inherited]

Data returned for CUPTI_ACTIVITY_ENVIRONMENT_SPEED environment kind.

CUpti_ActivityEnvironment::@25::@27 CUpti_ActivityEnvironment::temperature [inherited]

Data returned for CUPTI_ACTIVITY_ENVIRONMENT_TEMPERATURE environment kind.

uint64_t CUpti_ActivityEnvironment::timestamp [inherited]

The timestamp when this sample was retrieved, in ns. A value of 0 indicates that timestamp information could not be collected for the sample.
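
Since only one of speed, temperature, power, or cooling is valid in a given record, a consumer typically switches on environmentKind before reading any data. The sketch below illustrates this. The enum, the struct, and the grouping of the documented fields into nested members are assumptions made for the example, not the CUPTI definitions.

```c
#include <stdint.h>

/* Hypothetical stand-in for CUpti_ActivityEnvironmentKind. */
typedef enum {
    DEMO_ENV_SPEED,
    DEMO_ENV_TEMPERATURE,
    DEMO_ENV_POWER,
    DEMO_ENV_COOLING
} DemoEnvKind;

/* Hypothetical stand-in for CUpti_ActivityEnvironment. */
typedef struct {
    DemoEnvKind environmentKind;
    uint32_t deviceId;
    uint64_t timestamp;  /* ns */
    union {
        struct { uint32_t smClock; uint32_t memoryClock; } speed;  /* MHz */
        struct { uint32_t gpuTemperature; } temperature;           /* degrees C */
        struct { uint32_t power; uint32_t powerLimit; } power;     /* milliwatts */
        struct { uint32_t fanSpeed; } cooling;                     /* percent */
    } data;
} DemoEnvironment;

/* Returns the main reading for the record's kind, in that kind's unit. */
uint32_t demo_env_reading(const DemoEnvironment *e) {
    switch (e->environmentKind) {
    case DEMO_ENV_SPEED:       return e->data.speed.smClock;
    case DEMO_ENV_TEMPERATURE: return e->data.temperature.gpuTemperature;
    case DEMO_ENV_POWER:       return e->data.power.power;
    case DEMO_ENV_COOLING:     return e->data.cooling.fanSpeed;
    }
    return 0;
}

/*
 * For power samples, writes the remaining budget in watts to *out and
 * returns 1. Returns 0 for any other kind. The result is negative when
 * the reported power exceeds the limit.
 */
int demo_power_headroom_watts(const DemoEnvironment *e, double *out) {
    if (e->environmentKind != DEMO_ENV_POWER) {
        return 0;
    }
    *out = ((double)e->data.power.powerLimit - (double)e->data.power.power) / 1000.0;
    return 1;
}
```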

6.22. CUpti_ActivityEvent Struct Reference

[CUPTI Activity API]

This activity record represents a CUPTI event value (CUPTI_ACTIVITY_KIND_EVENT). This activity record kind is not produced by the activity API but is included for completeness and ease-of-use. Profile frameworks built on top of CUPTI that collect event data may choose to use this type to store the collected event data.

Public Variables

uint32_t  correlationId
CUpti_EventDomainID domain
CUpti_EventID id
CUpti_ActivityKind kind
uint64_t  value

Variables

uint32_t CUpti_ActivityEvent::correlationId [inherited]

The correlation ID of the event. Use of this ID is user-defined, but typically this ID value will equal the correlation ID of the kernel for which the event was gathered.

CUpti_EventDomainID CUpti_ActivityEvent::domain [inherited]

The event domain ID.

CUpti_EventID CUpti_ActivityEvent::id [inherited]

The event ID.

CUpti_ActivityKind CUpti_ActivityEvent::kind [inherited]

The activity record kind, must be CUPTI_ACTIVITY_KIND_EVENT.

uint64_t CUpti_ActivityEvent::value [inherited]

The event value.

6.23. CUpti_ActivityEventInstance Struct Reference

[CUPTI Activity API]

This activity record represents a CUPTI event value for a specific event domain instance (CUPTI_ACTIVITY_KIND_EVENT_INSTANCE). This activity record kind is not produced by the activity API but is included for completeness and ease-of-use. Profile frameworks built on top of CUPTI that collect event data may choose to use this type to store the collected event data. This activity record should be used when event domain instance information needs to be associated with the event.

Public Variables

uint32_t  correlationId
CUpti_EventDomainID domain
CUpti_EventID id
uint32_t  instance
CUpti_ActivityKind kind
uint32_t  pad
uint64_t  value

Variables

uint32_t CUpti_ActivityEventInstance::correlationId [inherited]

The correlation ID of the event. Use of this ID is user-defined, but typically this ID value will equal the correlation ID of the kernel for which the event was gathered.

CUpti_EventDomainID CUpti_ActivityEventInstance::domain [inherited]

The event domain ID.

CUpti_EventID CUpti_ActivityEventInstance::id [inherited]

The event ID.

uint32_t CUpti_ActivityEventInstance::instance [inherited]

The event domain instance.

CUpti_ActivityKind CUpti_ActivityEventInstance::kind [inherited]

The activity record kind, must be CUPTI_ACTIVITY_KIND_EVENT_INSTANCE.

uint32_t CUpti_ActivityEventInstance::pad [inherited]

Undefined. Reserved for internal use.

uint64_t CUpti_ActivityEventInstance::value [inherited]

The event value.
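
A framework that stores per-instance event values in records of this shape will often want a single total per event and kernel. The sketch below sums values across domain instances for a given event ID and correlation ID. DemoEventInstance is a hypothetical stand-in that uses plain integers for the ID types, and summation is only one possible aggregation; some events may call for a different reduction.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for CUpti_ActivityEventInstance. */
typedef struct {
    uint32_t id;             /* event ID */
    uint32_t domain;         /* event domain ID */
    uint32_t instance;       /* event domain instance */
    uint32_t correlationId;  /* user-defined, typically the kernel's ID */
    uint64_t value;
} DemoEventInstance;

/*
 * Sums the values of all records matching the given event ID and
 * correlation ID, across every domain instance.
 */
uint64_t demo_sum_event(const DemoEventInstance *recs, size_t n,
                        uint32_t eventId, uint32_t correlationId) {
    uint64_t total = 0;
    for (size_t i = 0; i < n; ++i) {
        if (recs[i].id == eventId && recs[i].correlationId == correlationId) {
            total += recs[i].value;
        }
    }
    return total;
}
```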

6.24. CUpti_ActivityExternalCorrelation Struct Reference

[CUPTI Activity API]

This activity record correlates native CUDA records (e.g. CUDA Driver API, kernels, memcpys, ...) with records from external APIs such as OpenACC (CUPTI_ACTIVITY_KIND_EXTERNAL_CORRELATION).

See also:

CUpti_ActivityKind

Public Variables

uint32_t  correlationId
uint64_t  externalId
CUpti_ExternalCorrelationKind externalKind
CUpti_ActivityKind kind
uint32_t  reserved

Variables

uint32_t CUpti_ActivityExternalCorrelation::correlationId [inherited]

The correlation ID of the associated CUDA driver or runtime API record.

uint64_t CUpti_ActivityExternalCorrelation::externalId [inherited]

The correlation ID of the associated non-CUDA API record. The exact field in the associated external record depends on that record's activity kind (see externalKind).

CUpti_ExternalCorrelationKind CUpti_ActivityExternalCorrelation::externalKind [inherited]

The kind of external API this record is correlated to.

CUpti_ActivityKind CUpti_ActivityExternalCorrelation::kind [inherited]

The kind of this activity.

uint32_t CUpti_ActivityExternalCorrelation::reserved [inherited]

Undefined. Reserved for internal use.
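
As a rough illustration of how a tool might use correlationId and externalId together, the following sketch looks up the external ID paired with a given CUDA correlation ID. SketchExternalCorrelation is a hypothetical stand-in that mirrors only two of the documented fields; it is not the CUPTI definition, and find_external_id is an illustrative helper, not a CUPTI function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in mirroring two documented fields; not the CUPTI struct. */
typedef struct {
    uint32_t correlationId; /* ID of the CUDA driver or runtime API record */
    uint64_t externalId;    /* ID of the non-CUDA (e.g. OpenACC) API record */
} SketchExternalCorrelation;

/* Linear search for the external ID paired with a CUDA correlation ID.
 * Returns 1 and writes *outExternalId on a match, 0 otherwise. */
static int find_external_id(const SketchExternalCorrelation *recs, size_t count,
                            uint32_t correlationId, uint64_t *outExternalId)
{
    assert(recs != NULL || count == 0);
    assert(outExternalId != NULL);
    for (size_t i = 0; i < count; ++i) {
        if (recs[i].correlationId == correlationId) {
            *outExternalId = recs[i].externalId;
            return 1;
        }
    }
    return 0;
}
```

A real tool would more likely keep these pairs in a hash map; the linear scan only keeps the sketch short.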

6.25. CUpti_ActivityFunction Struct Reference

[CUPTI Activity API]

This activity record holds the function name and corresponding module information (CUPTI_ACTIVITY_KIND_FUNCTION).

Public Variables

uint32_t  contextId
uint32_t  functionIndex
uint32_t  id
CUpti_ActivityKind kind
uint32_t  moduleId
const char * name

Variables

uint32_t CUpti_ActivityFunction::contextId [inherited]

The ID of the context where the function is launched.

uint32_t CUpti_ActivityFunction::functionIndex [inherited]

The function's unique symbol index in the module.

uint32_t CUpti_ActivityFunction::id [inherited]

The ID that uniquely identifies the record.

CUpti_ActivityKind CUpti_ActivityFunction::kind [inherited]

The activity record kind, must be CUPTI_ACTIVITY_KIND_FUNCTION.

uint32_t CUpti_ActivityFunction::moduleId [inherited]

The module ID in which this global/device function is present.

const char * CUpti_ActivityFunction::name [inherited]

The name of the function. This name is shared across all activity records representing the same kernel, and so should not be modified.
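
Other records in this reference carry a functionId field described as a correlation ID with the global/device function name. The sketch below assumes that such a functionId refers to the id field of a function record and resolves it to a name. SketchFunction and function_name_for_id are hypothetical names used only for illustration; they are not part of CUPTI.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in mirroring three documented fields; not the CUPTI struct. */
typedef struct {
    uint32_t id;        /* uniquely identifies the function record */
    uint32_t moduleId;  /* module containing the function */
    const char *name;   /* shared string; treat as read-only */
} SketchFunction;

/* Return the name of the function record whose id equals functionId,
 * or NULL when no such record was collected. The returned pointer is
 * the shared name string and must not be modified. */
static const char *function_name_for_id(const SketchFunction *funcs, size_t count,
                                        uint32_t functionId)
{
    assert(funcs != NULL || count == 0);
    for (size_t i = 0; i < count; ++i) {
        if (funcs[i].id == functionId) {
            return funcs[i].name;
        }
    }
    return NULL;
}
```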

6.26. CUpti_ActivityGlobalAccess Struct Reference

[CUPTI Activity API]

This activity records the locations of the global accesses in the source (CUPTI_ACTIVITY_KIND_GLOBAL_ACCESS). Global access activities are now reported using the CUpti_ActivityGlobalAccess3 activity record.

Public Variables

uint32_t  correlationId
uint32_t  executed
CUpti_ActivityFlag flags
CUpti_ActivityKind kind
uint64_t  l2_transactions
uint32_t  pcOffset
uint32_t  sourceLocatorId
uint64_t  threadsExecuted

Variables

uint32_t CUpti_ActivityGlobalAccess::correlationId [inherited]

The correlation ID of the kernel to which this result is associated.

uint32_t CUpti_ActivityGlobalAccess::executed [inherited]

The number of times this instruction was executed per warp. It is incremented when at least one thread in the warp is active with predicate and condition code evaluating to true.

CUpti_ActivityFlag CUpti_ActivityGlobalAccess::flags [inherited]

The properties of this global access.

CUpti_ActivityKind CUpti_ActivityGlobalAccess::kind [inherited]

The activity record kind, must be CUPTI_ACTIVITY_KIND_GLOBAL_ACCESS.

uint64_t CUpti_ActivityGlobalAccess::l2_transactions [inherited]

The total number of 32-byte transactions to the L2 cache generated by this access.

uint32_t CUpti_ActivityGlobalAccess::pcOffset [inherited]

The pc offset for the access.

uint32_t CUpti_ActivityGlobalAccess::sourceLocatorId [inherited]

The ID for source locator.

uint64_t CUpti_ActivityGlobalAccess::threadsExecuted [inherited]

Each time this instruction is executed, this value is incremented by the number of threads that executed it with predicate and condition code evaluating to true.

6.27. CUpti_ActivityGlobalAccess2 Struct Reference

[CUPTI Activity API]

This activity records the locations of the global accesses in the source (CUPTI_ACTIVITY_KIND_GLOBAL_ACCESS). Global access activities are now reported using the CUpti_ActivityGlobalAccess3 activity record.

Public Variables

uint32_t  correlationId
uint32_t  executed
CUpti_ActivityFlag flags
uint32_t  functionId
CUpti_ActivityKind kind
uint64_t  l2_transactions
uint32_t  pad
uint32_t  pcOffset
uint32_t  sourceLocatorId
uint64_t  theoreticalL2Transactions
uint64_t  threadsExecuted

Variables

uint32_t CUpti_ActivityGlobalAccess2::correlationId [inherited]

The correlation ID of the kernel to which this result is associated.

uint32_t CUpti_ActivityGlobalAccess2::executed [inherited]

The number of times this instruction was executed per warp. It is incremented when at least one thread in the warp is active with predicate and condition code evaluating to true.

CUpti_ActivityFlag CUpti_ActivityGlobalAccess2::flags [inherited]

The properties of this global access.

uint32_t CUpti_ActivityGlobalAccess2::functionId [inherited]

The ID used to correlate this record with the global/device function name.

CUpti_ActivityKind CUpti_ActivityGlobalAccess2::kind [inherited]

The activity record kind, must be CUPTI_ACTIVITY_KIND_GLOBAL_ACCESS.

uint64_t CUpti_ActivityGlobalAccess2::l2_transactions [inherited]

The total number of 32-byte transactions to the L2 cache generated by this access.

uint32_t CUpti_ActivityGlobalAccess2::pad [inherited]

Undefined. Reserved for internal use.

uint32_t CUpti_ActivityGlobalAccess2::pcOffset [inherited]

The pc offset for the access.

uint32_t CUpti_ActivityGlobalAccess2::sourceLocatorId [inherited]

The ID for source locator.

uint64_t CUpti_ActivityGlobalAccess2::theoreticalL2Transactions [inherited]

The minimum number of L2 transactions possible based on the access pattern.

uint64_t CUpti_ActivityGlobalAccess2::threadsExecuted [inherited]

Each time this instruction is executed, this value is incremented by the number of threads that executed it with predicate and condition code evaluating to true.

6.28. CUpti_ActivityGlobalAccess3 Struct Reference

[CUPTI Activity API]

This activity records the locations of the global accesses in the source (CUPTI_ACTIVITY_KIND_GLOBAL_ACCESS).

Public Variables

uint32_t  correlationId
uint32_t  executed
CUpti_ActivityFlag flags
uint32_t  functionId
CUpti_ActivityKind kind
uint64_t  l2_transactions
uint64_t  pcOffset
uint32_t  sourceLocatorId
uint64_t  theoreticalL2Transactions
uint64_t  threadsExecuted

Variables

uint32_t CUpti_ActivityGlobalAccess3::correlationId [inherited]

The correlation ID of the kernel to which this result is associated.

uint32_t CUpti_ActivityGlobalAccess3::executed [inherited]

The number of times this instruction was executed per warp. It is incremented when at least one thread in the warp is active with predicate and condition code evaluating to true.

CUpti_ActivityFlag CUpti_ActivityGlobalAccess3::flags [inherited]

The properties of this global access.

uint32_t CUpti_ActivityGlobalAccess3::functionId [inherited]

The ID used to correlate this record with the global/device function name.

CUpti_ActivityKind CUpti_ActivityGlobalAccess3::kind [inherited]

The activity record kind, must be CUPTI_ACTIVITY_KIND_GLOBAL_ACCESS.

uint64_t CUpti_ActivityGlobalAccess3::l2_transactions [inherited]

The total number of 32-byte transactions to the L2 cache generated by this access.

uint64_t CUpti_ActivityGlobalAccess3::pcOffset [inherited]

The pc offset for the access.

uint32_t CUpti_ActivityGlobalAccess3::sourceLocatorId [inherited]

The ID for source locator.

uint64_t CUpti_ActivityGlobalAccess3::theoreticalL2Transactions [inherited]

The minimum number of L2 transactions possible based on the access pattern.

uint64_t CUpti_ActivityGlobalAccess3::threadsExecuted [inherited]

Each time this instruction is executed, this value is incremented by the number of threads that executed it with predicate and condition code evaluating to true.
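
One way a tool might combine l2_transactions and theoreticalL2Transactions is as a ratio of the theoretical minimum to the observed count. The sketch below is an illustration only: SketchGlobalAccess is a hypothetical stand-in carrying just those two fields, and the ratio itself is a derived figure chosen for this example, not a value CUPTI reports.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in mirroring two documented fields; not the CUPTI struct. */
typedef struct {
    uint64_t l2_transactions;           /* observed 32-byte L2 transactions */
    uint64_t theoreticalL2Transactions; /* minimum possible for the access pattern */
} SketchGlobalAccess;

/* Ratio of the theoretical minimum to the observed transaction count.
 * A value near 1.0 suggests the access needed few extra transactions.
 * Returns -1.0 when no transactions were recorded. */
static double l2_access_efficiency(const SketchGlobalAccess *a)
{
    assert(a != NULL);
    if (a->l2_transactions == 0) {
        return -1.0;
    }
    return (double)a->theoreticalL2Transactions / (double)a->l2_transactions;
}
```

For example, a record with 8 observed and 4 theoretical transactions gives a ratio of 0.5, meaning the access issued twice the minimum number of transactions.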

6.29. CUpti_ActivityGraphTrace Struct Reference

[CUPTI Activity API]

This activity record represents execution for a graph without giving visibility about the execution of its nodes. This is intended to reduce overheads in tracing each node. The activity kind is CUPTI_ACTIVITY_KIND_GRAPH_TRACE.

Public Variables

uint32_t  contextId
uint32_t  correlationId
uint32_t  deviceId
uint64_t  end
uint32_t  graphId
CUpti_ActivityKind kind
void * reserved
uint64_t  start
uint32_t  streamId

Variables

uint32_t CUpti_ActivityGraphTrace::contextId [inherited]

The ID of the context where the graph is being launched.

uint32_t CUpti_ActivityGraphTrace::correlationId [inherited]

The correlation ID of the graph launch. Each graph launch is assigned a unique correlation ID that is identical to the correlation ID in the driver API activity record that launched the graph.

uint32_t CUpti_ActivityGraphTrace::deviceId [inherited]

The ID of the device where the graph execution is occurring.

uint64_t CUpti_ActivityGraphTrace::end [inherited]

The end timestamp for the graph execution, in ns. A value of 0 for both the start and end timestamps indicates that timestamp information could not be collected for the graph.

uint32_t CUpti_ActivityGraphTrace::graphId [inherited]

The unique ID of the graph that is launched.

CUpti_ActivityKind CUpti_ActivityGraphTrace::kind [inherited]

The activity record kind, must be CUPTI_ACTIVITY_KIND_GRAPH_TRACE.

void * CUpti_ActivityGraphTrace::reserved [inherited]

This field is reserved for internal use.

uint64_t CUpti_ActivityGraphTrace::start [inherited]

The start timestamp for the graph execution, in ns. A value of 0 for both the start and end timestamps indicates that timestamp information could not be collected for the graph.

uint32_t CUpti_ActivityGraphTrace::streamId [inherited]

The ID of the stream where the graph is being launched.
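
Because a value of 0 for both start and end means that no timestamps were collected, a duration should only be computed after checking for that case. The following is a minimal sketch under that reading; SketchGraphTrace and graph_duration_ns are hypothetical names, and the struct mirrors only the two timestamp fields.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in mirroring the two timestamp fields; not the CUPTI struct. */
typedef struct {
    uint64_t start; /* start timestamp, in ns */
    uint64_t end;   /* end timestamp, in ns */
} SketchGraphTrace;

/* Compute the graph execution time in ns.
 * Returns 1 and writes *outNs when timestamps are usable, 0 otherwise. */
static int graph_duration_ns(const SketchGraphTrace *g, uint64_t *outNs)
{
    assert(g != NULL);
    assert(outNs != NULL);
    if (g->start == 0 && g->end == 0) {
        return 0; /* timestamp information could not be collected */
    }
    if (g->end < g->start) {
        return 0; /* defensive check added for this sketch */
    }
    *outNs = g->end - g->start;
    return 1;
}
```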

6.30. CUpti_ActivityInstantaneousEvent Struct Reference

[CUPTI Activity API]

This activity record represents a CUPTI event value (CUPTI_ACTIVITY_KIND_EVENT) sampled at a particular instant. This activity record kind is not produced by the activity API but is included for completeness and ease-of-use. Profiler frameworks built on top of CUPTI that collect event data at a particular time may choose to use this type to store the collected event data.

Public Variables

uint32_t  deviceId
CUpti_EventID id
CUpti_ActivityKind kind
uint32_t  reserved
uint64_t  timestamp
uint64_t  value

Variables

uint32_t CUpti_ActivityInstantaneousEvent::deviceId [inherited]

The device ID.

CUpti_EventID CUpti_ActivityInstantaneousEvent::id [inherited]

The event ID.

CUpti_ActivityKind CUpti_ActivityInstantaneousEvent::kind [inherited]

The activity record kind, must be CUPTI_ACTIVITY_KIND_INSTANTANEOUS_EVENT.

uint32_t CUpti_ActivityInstantaneousEvent::reserved [inherited]

Undefined. Reserved for internal use.

uint64_t CUpti_ActivityInstantaneousEvent::timestamp [inherited]

The timestamp at which the event is sampled.

uint64_t CUpti_ActivityInstantaneousEvent::value [inherited]

The event value.

6.31. CUpti_ActivityInstantaneousEventInstance Struct Reference

[CUPTI Activity API]

This activity record represents a CUPTI event value for a specific event domain instance (CUPTI_ACTIVITY_KIND_EVENT_INSTANCE) sampled at a particular instant. This activity record kind is not produced by the activity API but is included for completeness and ease-of-use. Profiler frameworks built on top of CUPTI that collect event data may choose to use this type to store the collected event data. This activity record should be used when event domain instance information needs to be associated with the event.

Public Variables

uint32_t  deviceId
CUpti_EventID id
uint8_t  instance
CUpti_ActivityKind kind
uint8_t  pad[3]
uint64_t  timestamp
uint64_t  value

Variables

uint32_t CUpti_ActivityInstantaneousEventInstance::deviceId [inherited]

The device ID.

CUpti_EventID CUpti_ActivityInstantaneousEventInstance::id [inherited]

The event ID.

uint8_t CUpti_ActivityInstantaneousEventInstance::instance [inherited]

The event domain instance.

CUpti_ActivityKind CUpti_ActivityInstantaneousEventInstance::kind [inherited]

The activity record kind, must be CUPTI_ACTIVITY_KIND_INSTANTANEOUS_EVENT_INSTANCE.

uint8_t CUpti_ActivityInstantaneousEventInstance::pad[3] [inherited]

Undefined. Reserved for internal use.

uint64_t CUpti_ActivityInstantaneousEventInstance::timestamp [inherited]

The timestamp at which the event is sampled.

uint64_t CUpti_ActivityInstantaneousEventInstance::value [inherited]

The event value.
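
A profiler framework that stores per-instance samples in records of this shape may want a single total for an event at one sampling instant. The sketch below sums value over all domain instances that share an event ID and timestamp. SketchInstEventInstance is a hypothetical stand-in with only four of the documented fields, and summation across instances is an assumption made for this example; whether a sum is meaningful depends on the event.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in mirroring four documented fields; not the CUPTI struct. */
typedef struct {
    uint32_t id;        /* event ID (stands in for CUpti_EventID) */
    uint8_t  instance;  /* event domain instance */
    uint64_t timestamp; /* time at which the event was sampled */
    uint64_t value;     /* event value for this instance */
} SketchInstEventInstance;

/* Sum the values of every sample matching both the event ID and timestamp. */
static uint64_t sum_event_over_instances(const SketchInstEventInstance *samples,
                                         size_t count, uint32_t id,
                                         uint64_t timestamp)
{
    assert(samples != NULL || count == 0);
    uint64_t total = 0;
    for (size_t i = 0; i < count; ++i) {
        if (samples[i].id == id && samples[i].timestamp == timestamp) {
            total += samples[i].value;
        }
    }
    return total;
}
```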

6.32. CUpti_ActivityInstantaneousMetric Struct Reference

[CUPTI Activity API]

This activity record represents the collection of a CUPTI metric value (CUPTI_ACTIVITY_KIND_METRIC) at a particular instant. This activity record kind is not produced by the activity API but is included for completeness and ease-of-use. Profiler frameworks built on top of CUPTI that collect metric data may choose to use this type to store the collected metric data.

Public Variables

uint32_t  deviceId
uint8_t  flags
CUpti_MetricID id
CUpti_ActivityKind kind
uint8_t  pad[3]
uint64_t  timestamp
union CUpti_MetricValue value

Variables

uint32_t CUpti_ActivityInstantaneousMetric::deviceId [inherited]

The device ID.

uint8_t CUpti_ActivityInstantaneousMetric::flags [inherited]

The properties of this metric.

See also:

CUpti_ActivityFlag

CUpti_MetricID CUpti_ActivityInstantaneousMetric::id [inherited]

The metric ID.

CUpti_ActivityKind CUpti_ActivityInstantaneousMetric::kind [inherited]

The activity record kind, must be CUPTI_ACTIVITY_KIND_INSTANTANEOUS_METRIC.

uint8_t CUpti_ActivityInstantaneousMetric::pad[3] [inherited]

Undefined. Reserved for internal use.

uint64_t CUpti_ActivityInstantaneousMetric::timestamp [inherited]

The timestamp at which the metric is sampled.

union CUpti_MetricValue CUpti_ActivityInstantaneousMetric::value [inherited]

The metric value.

6.33. CUpti_ActivityInstantaneousMetricInstance Struct Reference

[CUPTI Activity API]

This activity record represents a CUPTI metric value for a specific metric domain instance (CUPTI_ACTIVITY_KIND_METRIC_INSTANCE) sampled at a particular time. This activity record kind is not produced by the activity API but is included for completeness and ease-of-use. Profiler frameworks built on top of CUPTI that collect metric data may choose to use this type to store the collected metric data. This activity record should be used when metric domain instance information needs to be associated with the metric.

Public Variables

uint32_t  deviceId
uint8_t  flags
CUpti_MetricID id
uint8_t  instance
CUpti_ActivityKind kind
uint8_t  pad[2]
uint64_t  timestamp
union CUpti_MetricValue value

Variables

uint32_t CUpti_ActivityInstantaneousMetricInstance::deviceId [inherited]

The device ID.

uint8_t CUpti_ActivityInstantaneousMetricInstance::flags [inherited]

The properties of this metric.

See also:

CUpti_ActivityFlag

CUpti_MetricID CUpti_ActivityInstantaneousMetricInstance::id [inherited]

The metric ID.

uint8_t CUpti_ActivityInstantaneousMetricInstance::instance [inherited]

The metric domain instance.

CUpti_ActivityKind CUpti_ActivityInstantaneousMetricInstance::kind [inherited]

The activity record kind, must be CUPTI_ACTIVITY_KIND_INSTANTANEOUS_METRIC_INSTANCE.

uint8_t CUpti_ActivityInstantaneousMetricInstance::pad[2] [inherited]

Undefined. Reserved for internal use.

uint64_t CUpti_ActivityInstantaneousMetricInstance::timestamp [inherited]

The timestamp at which the metric is sampled.

union CUpti_MetricValue CUpti_ActivityInstantaneousMetricInstance::value [inherited]

The metric value.

6.34. CUpti_ActivityInstructionCorrelation Struct Reference

[CUPTI Activity API]

This activity record holds source-level SASS/source correlation information (CUPTI_ACTIVITY_KIND_INSTRUCTION_CORRELATION).

Public Variables

CUpti_ActivityFlag flags
uint32_t  functionId
CUpti_ActivityKind kind
uint32_t  pad
uint32_t  pcOffset
uint32_t  sourceLocatorId

Variables

CUpti_ActivityFlag CUpti_ActivityInstructionCorrelation::flags [inherited]

The properties of this instruction.

uint32_t CUpti_ActivityInstructionCorrelation::functionId [inherited]

The ID used to correlate this record with the global/device function name.

CUpti_ActivityKind CUpti_ActivityInstructionCorrelation::kind [inherited]

The activity record kind, must be CUPTI_ACTIVITY_KIND_INSTRUCTION_CORRELATION.

uint32_t CUpti_ActivityInstructionCorrelation::pad [inherited]

Undefined. Reserved for internal use.

uint32_t CUpti_ActivityInstructionCorrelation::pcOffset [inherited]

The pc offset for the instruction.

uint32_t CUpti_ActivityInstructionCorrelation::sourceLocatorId [inherited]

The ID for source locator.
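
To show how functionId, pcOffset and sourceLocatorId fit together, the sketch below finds the source locator ID recorded for a given instruction in a given function. SketchInstrCorrelation and source_locator_for_pc are hypothetical names introduced for this example, and the struct mirrors only three of the documented fields.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in mirroring three documented fields; not the CUPTI struct. */
typedef struct {
    uint32_t functionId;      /* correlates with the function name record */
    uint32_t pcOffset;        /* pc offset of the instruction */
    uint32_t sourceLocatorId; /* ID of the source locator */
} SketchInstrCorrelation;

/* Find the source locator ID for (functionId, pcOffset).
 * Returns 1 and writes *outSourceLocatorId on a match, 0 otherwise. */
static int source_locator_for_pc(const SketchInstrCorrelation *recs, size_t count,
                                 uint32_t functionId, uint32_t pcOffset,
                                 uint32_t *outSourceLocatorId)
{
    assert(recs != NULL || count == 0);
    assert(outSourceLocatorId != NULL);
    for (size_t i = 0; i < count; ++i) {
        if (recs[i].functionId == functionId && recs[i].pcOffset == pcOffset) {
            *outSourceLocatorId = recs[i].sourceLocatorId;
            return 1;
        }
    }
    return 0;
}
```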

6.35. CUpti_ActivityInstructionExecution Struct Reference

[CUPTI Activity API]

This activity record holds the result for source-level instruction execution (CUPTI_ACTIVITY_KIND_INSTRUCTION_EXECUTION).

Public Variables

uint32_t  correlationId
uint32_t  executed
CUpti_ActivityFlag flags
uint32_t  functionId
CUpti_ActivityKind kind
uint64_t  notPredOffThreadsExecuted
uint32_t  pad
uint32_t  pcOffset
uint32_t  sourceLocatorId
uint64_t  threadsExecuted

Variables

uint32_t CUpti_ActivityInstructionExecution::correlationId [inherited]

The correlation ID of the kernel to which this result is associated.

uint32_t CUpti_ActivityInstructionExecution::executed [inherited]

The number of times this instruction was executed per warp. It will be incremented regardless of predicate or condition code.

CUpti_ActivityFlag CUpti_ActivityInstructionExecution::flags [inherited]

The properties of this instruction execution.

uint32_t CUpti_ActivityInstructionExecution::functionId [inherited]

The ID used to correlate this record with the global/device function name.

CUpti_ActivityKind CUpti_ActivityInstructionExecution::kind [inherited]

The activity record kind, must be CUPTI_ACTIVITY_KIND_INSTRUCTION_EXECUTION.

uint64_t CUpti_ActivityInstructionExecution::notPredOffThreadsExecuted [inherited]

Each time this instruction is executed, this value is incremented by the number of threads that executed it with predicate and condition code evaluating to true.

uint32_t CUpti_ActivityInstructionExecution::pad [inherited]

Undefined. Reserved for internal use.

uint32_t CUpti_ActivityInstructionExecution::pcOffset [inherited]

The pc offset for the instruction.

uint32_t CUpti_ActivityInstructionExecution::sourceLocatorId [inherited]

The ID for source locator.

uint64_t CUpti_ActivityInstructionExecution::threadsExecuted [inherited]

Each time this instruction is executed, this value is incremented by the number of threads that executed it, regardless of predicate or condition code.
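
The three counters executed, threadsExecuted and notPredOffThreadsExecuted can be combined into simple per-instruction ratios. The sketch below shows two such ratios. They are derived figures chosen for this example rather than values CUPTI reports, and SketchInstrExec is a hypothetical stand-in carrying only those three fields.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in mirroring three documented fields; not the CUPTI struct. */
typedef struct {
    uint32_t executed;                  /* per-warp executions */
    uint64_t threadsExecuted;           /* threads, regardless of predicate */
    uint64_t notPredOffThreadsExecuted; /* threads with predicate evaluating to true */
} SketchInstrExec;

/* Fraction of executing threads that were not predicated off.
 * Returns -1.0 when no thread executed the instruction. */
static double not_pred_off_fraction(const SketchInstrExec *r)
{
    assert(r != NULL);
    if (r->threadsExecuted == 0) {
        return -1.0;
    }
    return (double)r->notPredOffThreadsExecuted / (double)r->threadsExecuted;
}

/* Average number of threads taking part in each warp-level execution.
 * Returns -1.0 when the instruction was never executed. */
static double avg_threads_per_warp_execution(const SketchInstrExec *r)
{
    assert(r != NULL);
    if (r->executed == 0) {
        return -1.0;
    }
    return (double)r->threadsExecuted / (double)r->executed;
}
```

With executed = 4, threadsExecuted = 128 and notPredOffThreadsExecuted = 64, the first ratio is 0.5 and the second is 32.0.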

6.36. CUpti_ActivityJit Struct Reference

[CUPTI Activity API]

Public Variables

const char * cachePath
uint64_t  cacheSize
uint32_t  correlationId
uint32_t  deviceId
uint64_t  end
CUpti_ActivityJitEntryType jitEntryType
uint64_t  jitOperationCorrelationId
CUpti_ActivityJitOperationType jitOperationType
CUpti_ActivityKind kind
uint32_t  padding
uint64_t  start

Variables

const char * CUpti_ActivityJit::cachePath [inherited]

The path where the fat binary is cached.

uint64_t CUpti_ActivityJit::cacheSize [inherited]

The size of compute cache.

uint32_t CUpti_ActivityJit::correlationId [inherited]

The correlation ID of the JIT operation to which the records belong. Each JIT operation is assigned a unique correlation ID that is identical to the correlation ID in the driver or runtime API activity record that launched the JIT operation.

uint32_t CUpti_ActivityJit::deviceId [inherited]

The device ID.

uint64_t CUpti_ActivityJit::end [inherited]

The end timestamp for the JIT operation, in ns. A value of 0 for both the start and end timestamps indicates that timestamp information could not be collected for the JIT operation.

CUpti_ActivityJitEntryType CUpti_ActivityJit::jitEntryType [inherited]

The JIT entry type.

uint64_t CUpti_ActivityJit::jitOperationCorrelationId [inherited]

The correlation ID to correlate JIT compilation, load and store operations. Each JIT compilation unit is assigned a unique correlation ID at the time of the JIT compilation. This correlation id can be used to find the matching JIT cache load/store records.

CUpti_ActivityJitOperationType CUpti_ActivityJit::jitOperationType [inherited]

The JIT operation type.

CUpti_ActivityKind CUpti_ActivityJit::kind [inherited]

The activity record kind, must be CUPTI_ACTIVITY_KIND_JIT.

uint32_t CUpti_ActivityJit::padding [inherited]

Internal use.

uint64_t CUpti_ActivityJit::start [inherited]

The start timestamp for the JIT operation, in ns. A value of 0 for both the start and end timestamps indicates that timestamp information could not be collected for the JIT operation.
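
Since jitOperationCorrelationId ties together the compilation, load and store records of one JIT compilation unit, a tool could use it to total the time spent on that unit. The following is a minimal sketch of that idea. SketchJit and jit_unit_total_ns are hypothetical names, the struct mirrors only three of the documented fields, and records with both timestamps equal to 0 are skipped because they carry no timing information.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in mirroring three documented fields; not the CUPTI struct. */
typedef struct {
    uint64_t jitOperationCorrelationId; /* shared by records of one JIT unit */
    uint64_t start;                     /* start timestamp, in ns */
    uint64_t end;                       /* end timestamp, in ns */
} SketchJit;

/* Sum the durations (ns) of all records belonging to one JIT compilation unit. */
static uint64_t jit_unit_total_ns(const SketchJit *recs, size_t count,
                                  uint64_t unitId)
{
    assert(recs != NULL || count == 0);
    uint64_t total = 0;
    for (size_t i = 0; i < count; ++i) {
        if (recs[i].jitOperationCorrelationId != unitId) {
            continue;
        }
        if (recs[i].start == 0 && recs[i].end == 0) {
            continue; /* timestamps could not be collected */
        }
        if (recs[i].end < recs[i].start) {
            continue; /* defensive check added for this sketch */
        }
        total += recs[i].end - recs[i].start;
    }
    return total;
}
```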

6.37. CUpti_ActivityKernel Struct Reference

[CUPTI Activity API]

This activity record represents a kernel execution (CUPTI_ACTIVITY_KIND_KERNEL and CUPTI_ACTIVITY_KIND_CONCURRENT_KERNEL) but is no longer generated by CUPTI. Kernel activities are now reported using the CUpti_ActivityKernel9 activity record.

Public Variables

int32_t  blockX
int32_t  blockY
int32_t  blockZ
uint8_t  cacheConfigExecuted
uint8_t  cacheConfigRequested
uint32_t  contextId
uint32_t  correlationId
uint32_t  deviceId
int32_t  dynamicSharedMemory
uint64_t  end
int32_t  gridX
int32_t  gridY
int32_t  gridZ
CUpti_ActivityKind kind
uint32_t  localMemoryPerThread
uint32_t  localMemoryTotal
const char * name
uint32_t  pad
uint16_t  registersPerThread
void * reserved0
uint32_t  runtimeCorrelationId
uint64_t  start
int32_t  staticSharedMemory
uint32_t  streamId

Variables

int32_t CUpti_ActivityKernel::blockX [inherited]

The X-dimension block size for the kernel.

int32_t CUpti_ActivityKernel::blockY [inherited]

The Y-dimension block size for the kernel.

int32_t CUpti_ActivityKernel::blockZ [inherited]

The Z-dimension block size for the kernel.

uint8_t CUpti_ActivityKernel::cacheConfigExecuted [inherited]

The cache configuration used for the kernel. The value is one of the CUfunc_cache enumeration values from cuda.h.

uint8_t CUpti_ActivityKernel::cacheConfigRequested [inherited]

The cache configuration requested by the kernel. The value is one of the CUfunc_cache enumeration values from cuda.h.

uint32_t CUpti_ActivityKernel::contextId [inherited]

The ID of the context where the kernel is executing.

uint32_t CUpti_ActivityKernel::correlationId [inherited]

The correlation ID of the kernel. Each kernel execution is assigned a unique correlation ID that is identical to the correlation ID in the driver API activity record that launched the kernel.

uint32_t CUpti_ActivityKernel::deviceId [inherited]

The ID of the device where the kernel is executing.

int32_t CUpti_ActivityKernel::dynamicSharedMemory [inherited]

The dynamic shared memory reserved for the kernel, in bytes.

uint64_t CUpti_ActivityKernel::end [inherited]

The end timestamp for the kernel execution, in ns. A value of 0 for both the start and end timestamps indicates that timestamp information could not be collected for the kernel.

int32_t CUpti_ActivityKernel::gridX [inherited]

The X-dimension grid size for the kernel.

int32_t CUpti_ActivityKernel::gridY [inherited]

The Y-dimension grid size for the kernel.

int32_t CUpti_ActivityKernel::gridZ [inherited]

The Z-dimension grid size for the kernel.

CUpti_ActivityKind CUpti_ActivityKernel::kind [inherited]

The activity record kind, must be CUPTI_ACTIVITY_KIND_KERNEL or CUPTI_ACTIVITY_KIND_CONCURRENT_KERNEL.

uint32_t CUpti_ActivityKernel::localMemoryPerThread [inherited]

The amount of local memory reserved for each thread, in bytes.

uint32_t CUpti_ActivityKernel::localMemoryTotal [inherited]

The total amount of local memory reserved for the kernel, in bytes.

const char * CUpti_ActivityKernel::name [inherited]

The name of the kernel. This name is shared across all activity records representing the same kernel, and so should not be modified.

uint32_t CUpti_ActivityKernel::pad [inherited]

Undefined. Reserved for internal use.

uint16_t CUpti_ActivityKernel::registersPerThread [inherited]

The number of registers required for each thread executing the kernel.

void * CUpti_ActivityKernel::reserved0 [inherited]

Undefined. Reserved for internal use.

uint32_t CUpti_ActivityKernel::runtimeCorrelationId [inherited]

The runtime correlation ID of the kernel. Each kernel execution is assigned a unique runtime correlation ID that is identical to the correlation ID in the runtime API activity record that launched the kernel.

uint64_t CUpti_ActivityKernel::start [inherited]

The start timestamp for the kernel execution, in ns. A value of 0 for both the start and end timestamps indicates that timestamp information could not be collected for the kernel.

int32_t CUpti_ActivityKernel::staticSharedMemory [inherited]

The static shared memory allocated for the kernel, in bytes.

uint32_t CUpti_ActivityKernel::streamId [inherited]

The ID of the stream where the kernel is executing.
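
The grid and block dimension fields together give the launch size of the kernel. As an illustration, the sketch below multiplies them into a total thread count. SketchKernelDims and kernel_total_threads are hypothetical names, and the struct mirrors only the six dimension fields, which are signed 32-bit values in this record.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in mirroring the six dimension fields; not the CUPTI struct. */
typedef struct {
    int32_t gridX, gridY, gridZ;    /* grid size per dimension */
    int32_t blockX, blockY, blockZ; /* block size per dimension */
} SketchKernelDims;

/* Total threads launched: blocks in the grid times threads per block.
 * Returns 0 when any dimension is not positive. The product is computed
 * in 64 bits; extreme launch sizes could still overflow. */
static uint64_t kernel_total_threads(const SketchKernelDims *k)
{
    assert(k != NULL);
    if (k->gridX <= 0 || k->gridY <= 0 || k->gridZ <= 0 ||
        k->blockX <= 0 || k->blockY <= 0 || k->blockZ <= 0) {
        return 0;
    }
    uint64_t blocks = (uint64_t)k->gridX * (uint64_t)k->gridY * (uint64_t)k->gridZ;
    uint64_t threadsPerBlock =
        (uint64_t)k->blockX * (uint64_t)k->blockY * (uint64_t)k->blockZ;
    return blocks * threadsPerBlock;
}
```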

6.38. CUpti_ActivityKernel2 Struct Reference

[CUPTI Activity API]

This activity record represents a kernel execution (CUPTI_ACTIVITY_KIND_KERNEL and CUPTI_ACTIVITY_KIND_CONCURRENT_KERNEL) but is no longer generated by CUPTI. Kernel activities are now reported using the CUpti_ActivityKernel9 activity record.

Public Variables

int32_t  blockX
int32_t  blockY
int32_t  blockZ
uint64_t  completed
uint32_t  contextId
uint32_t  correlationId
uint32_t  deviceId
int32_t  dynamicSharedMemory
uint64_t  end
uint8_t  executed
int64_t  gridId
int32_t  gridX
int32_t  gridY
int32_t  gridZ
CUpti_ActivityKind kind
uint32_t  localMemoryPerThread
uint32_t  localMemoryTotal
const char * name
uint16_t  registersPerThread
uint8_t  requested
void * reserved0
uint8_t  sharedMemoryConfig
uint64_t  start
int32_t  staticSharedMemory
uint32_t  streamId

Variables

int32_t CUpti_ActivityKernel2::blockX [inherited]

The X-dimension block size for the kernel.

int32_t CUpti_ActivityKernel2::blockY [inherited]

The Y-dimension block size for the kernel.

int32_t CUpti_ActivityKernel2::blockZ [inherited]

The Z-dimension block size for the kernel.

uint64_t CUpti_ActivityKernel2::completed [inherited]

The completed timestamp for the kernel execution, in ns. It represents the completion of all its child kernels and the kernel itself. A value of CUPTI_TIMESTAMP_UNKNOWN indicates that the completion time is unknown.

uint32_t CUpti_ActivityKernel2::contextId [inherited]

The ID of the context where the kernel is executing.

uint32_t CUpti_ActivityKernel2::correlationId [inherited]

The correlation ID of the kernel. Each kernel execution is assigned a unique correlation ID that is identical to the correlation ID in the driver or runtime API activity record that launched the kernel.

uint32_t CUpti_ActivityKernel2::deviceId [inherited]

The ID of the device where the kernel is executing.

int32_t CUpti_ActivityKernel2::dynamicSharedMemory [inherited]

The dynamic shared memory reserved for the kernel, in bytes.

uint64_t CUpti_ActivityKernel2::end [inherited]

The end timestamp for the kernel execution, in ns. A value of 0 for both the start and end timestamps indicates that timestamp information could not be collected for the kernel.

uint8_t CUpti_ActivityKernel2::executed [inherited]

The cache configuration used for the kernel. The value is one of the CUfunc_cache enumeration values from cuda.h.

int64_t CUpti_ActivityKernel2::gridId [inherited]

The grid ID of the kernel. Each kernel is assigned a unique grid ID at runtime.

int32_t CUpti_ActivityKernel2::gridX [inherited]

The X-dimension grid size for the kernel.

int32_t CUpti_ActivityKernel2::gridY [inherited]

The Y-dimension grid size for the kernel.

int32_t CUpti_ActivityKernel2::gridZ [inherited]

The Z-dimension grid size for the kernel.

CUpti_ActivityKind CUpti_ActivityKernel2::kind [inherited]

The activity record kind, must be CUPTI_ACTIVITY_KIND_KERNEL or CUPTI_ACTIVITY_KIND_CONCURRENT_KERNEL.

uint32_t CUpti_ActivityKernel2::localMemoryPerThread [inherited]

The amount of local memory reserved for each thread, in bytes.

uint32_t CUpti_ActivityKernel2::localMemoryTotal [inherited]

The total amount of local memory reserved for the kernel, in bytes.

const char * CUpti_ActivityKernel2::name [inherited]

The name of the kernel. This name is shared across all activity records representing the same kernel, and so should not be modified.

uint16_t CUpti_ActivityKernel2::registersPerThread [inherited]

The number of registers required for each thread executing the kernel.

uint8_t CUpti_ActivityKernel2::requested [inherited]

The cache configuration requested by the kernel. The value is one of the CUfunc_cache enumeration values from cuda.h.

void * CUpti_ActivityKernel2::reserved0 [inherited]

Undefined. Reserved for internal use.

uint8_t CUpti_ActivityKernel2::sharedMemoryConfig [inherited]

The shared memory configuration used for the kernel. The value is one of the CUsharedconfig enumeration values from cuda.h.

uint64_t CUpti_ActivityKernel2::start [inherited]

The start timestamp for the kernel execution, in ns. A value of 0 for both the start and end timestamps indicates that timestamp information could not be collected for the kernel.

int32_t CUpti_ActivityKernel2::staticSharedMemory [inherited]

The static shared memory allocated for the kernel, in bytes.

uint32_t CUpti_ActivityKernel2::streamId [inherited]

The ID of the stream where the kernel is executing.
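
The completed timestamp covers the kernel and all of its child kernels, while end covers the kernel itself, so the difference between them indicates how long child kernels kept running after the parent ended. The sketch below computes that difference. SketchKernelTimes and child_tail_ns are hypothetical names, and SKETCH_TIMESTAMP_UNKNOWN is a local placeholder assumed to match CUPTI_TIMESTAMP_UNKNOWN; check the CUPTI header for the actual value before relying on it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local placeholder; assumed (not confirmed here) to match CUPTI_TIMESTAMP_UNKNOWN. */
#define SKETCH_TIMESTAMP_UNKNOWN 0ULL

/* Hypothetical stand-in mirroring three timestamp fields; not the CUPTI struct. */
typedef struct {
    uint64_t start;     /* kernel start, in ns */
    uint64_t end;       /* kernel end, in ns */
    uint64_t completed; /* completion of the kernel and all child kernels, in ns */
} SketchKernelTimes;

/* Time (ns) between the kernel's own end and completion of its child kernels.
 * Returns 1 and writes *outNs when it can be computed, 0 otherwise. */
static int child_tail_ns(const SketchKernelTimes *k, uint64_t *outNs)
{
    assert(k != NULL);
    assert(outNs != NULL);
    if (k->completed == SKETCH_TIMESTAMP_UNKNOWN) {
        return 0; /* completion time is unknown */
    }
    if (k->start == 0 && k->end == 0) {
        return 0; /* start/end timestamps could not be collected */
    }
    if (k->completed < k->end) {
        return 0; /* defensive check added for this sketch */
    }
    *outNs = k->completed - k->end;
    return 1;
}
```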

6.39. CUpti_ActivityKernel3 Struct Reference

[CUPTI Activity API]

This activity record represents a kernel execution (CUPTI_ACTIVITY_KIND_KERNEL and CUPTI_ACTIVITY_KIND_CONCURRENT_KERNEL). Kernel activities are now reported using the CUpti_ActivityKernel9 activity record.

Public Variables

int32_t  blockX
int32_t  blockY
int32_t  blockZ
uint64_t  completed
uint32_t  contextId
uint32_t  correlationId
uint32_t  deviceId
int32_t  dynamicSharedMemory
uint64_t  end
uint8_t  executed
int64_t  gridId
int32_t  gridX
int32_t  gridY
int32_t  gridZ
CUpti_ActivityKind kind
uint32_t  localMemoryPerThread
uint32_t  localMemoryTotal
const char * name
CUpti_ActivityPartitionedGlobalCacheConfig partitionedGlobalCacheExecuted
CUpti_ActivityPartitionedGlobalCacheConfig partitionedGlobalCacheRequested
uint16_t  registersPerThread
uint8_t  requested
void * reserved0
uint8_t  sharedMemoryConfig
uint64_t  start
int32_t  staticSharedMemory
uint32_t  streamId

Variables

int32_t CUpti_ActivityKernel3::blockX [inherited]

The X-dimension block size for the kernel.

int32_t CUpti_ActivityKernel3::blockY [inherited]

The Y-dimension block size for the kernel.

int32_t CUpti_ActivityKernel3::blockZ [inherited]

The Z-dimension block size for the kernel.

uint64_t CUpti_ActivityKernel3::completed [inherited]

The completed timestamp for the kernel execution, in ns. It represents the completion of all its child kernels and the kernel itself. A value of CUPTI_TIMESTAMP_UNKNOWN indicates that the completion time is unknown.

uint32_t CUpti_ActivityKernel3::contextId [inherited]

The ID of the context where the kernel is executing.

uint32_t CUpti_ActivityKernel3::correlationId [inherited]

The correlation ID of the kernel. Each kernel execution is assigned a unique correlation ID that is identical to the correlation ID in the driver or runtime API activity record that launched the kernel.

uint32_t CUpti_ActivityKernel3::deviceId [inherited]

The ID of the device where the kernel is executing.

int32_t CUpti_ActivityKernel3::dynamicSharedMemory [inherited]

The dynamic shared memory reserved for the kernel, in bytes.

uint64_t CUpti_ActivityKernel3::end [inherited]

The end timestamp for the kernel execution, in ns. A value of 0 for both the start and end timestamps indicates that timestamp information could not be collected for the kernel.

uint8_t CUpti_ActivityKernel3::executed [inherited]

The cache configuration used for the kernel. The value is one of the CUfunc_cache enumeration values from cuda.h.

int64_t CUpti_ActivityKernel3::gridId [inherited]

The grid ID of the kernel. Each kernel is assigned a unique grid ID at runtime.

int32_t CUpti_ActivityKernel3::gridX [inherited]

The X-dimension grid size for the kernel.

int32_t CUpti_ActivityKernel3::gridY [inherited]

The Y-dimension grid size for the kernel.

int32_t CUpti_ActivityKernel3::gridZ [inherited]

The Z-dimension grid size for the kernel.

CUpti_ActivityKind CUpti_ActivityKernel3::kind [inherited]

The activity record kind, must be CUPTI_ACTIVITY_KIND_KERNEL or CUPTI_ACTIVITY_KIND_CONCURRENT_KERNEL.

uint32_t CUpti_ActivityKernel3::localMemoryPerThread [inherited]

The amount of local memory reserved for each thread, in bytes.

uint32_t CUpti_ActivityKernel3::localMemoryTotal [inherited]

The total amount of local memory reserved for the kernel, in bytes.

const char * CUpti_ActivityKernel3::name [inherited]

The name of the kernel. This name is shared across all activity records representing the same kernel, and so should not be modified.

CUpti_ActivityPartitionedGlobalCacheConfigCUpti_ActivityKernel3::partitionedGlobalCacheExecuted [inherited]

The partitioned global caching executed for the kernel. Partitioned global caching is required to enable caching on certain chips, such as devices with compute capability 5.2. Partitioned global caching can be automatically disabled if the occupancy requirement of the launch cannot support caching.

CUpti_ActivityPartitionedGlobalCacheConfig CUpti_ActivityKernel3::partitionedGlobalCacheRequested [inherited]

The partitioned global caching requested for the kernel. Partitioned global caching is required to enable caching on certain chips, such as devices with compute capability 5.2.

uint16_t CUpti_ActivityKernel3::registersPerThread [inherited]

The number of registers required for each thread executing the kernel.

uint8_t CUpti_ActivityKernel3::requested [inherited]

The cache configuration requested by the kernel. The value is one of the CUfunc_cache enumeration values from cuda.h.

void * CUpti_ActivityKernel3::reserved0 [inherited]

Undefined. Reserved for internal use.

uint8_t CUpti_ActivityKernel3::sharedMemoryConfig [inherited]

The shared memory configuration used for the kernel. The value is one of the CUsharedconfig enumeration values from cuda.h.

uint64_t CUpti_ActivityKernel3::start [inherited]

The start timestamp for the kernel execution, in ns. A value of 0 for both the start and end timestamps indicates that timestamp information could not be collected for the kernel.
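
The start and end timestamps can be combined into a kernel duration, provided the "both zero" case is handled. The following is a minimal sketch; kernel_duration_ns is a hypothetical helper, and the end < start guard is a defensive assumption rather than documented behavior.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Write end - start (in ns) to *durationNs and return true when the
 * timestamps were collected. Return false when both are 0, which the
 * field descriptions define as "timestamp information not collected". */
bool kernel_duration_ns(uint64_t start, uint64_t end, uint64_t *durationNs)
{
    assert(durationNs != NULL);
    if (start == 0 && end == 0) {
        return false; /* timestamps not collected */
    }
    if (end < start) {
        return false; /* defensive check, an assumption of this sketch */
    }
    *durationNs = end - start;
    return true;
}
```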

int32_t CUpti_ActivityKernel3::staticSharedMemory [inherited]

The static shared memory allocated for the kernel, in bytes.

uint32_t CUpti_ActivityKernel3::streamId [inherited]

The ID of the stream where the kernel is executing.

6.40. CUpti_ActivityKernel4 Struct Reference

[CUPTI Activity API]

This activity record represents a kernel execution (CUPTI_ACTIVITY_KIND_KERNEL and CUPTI_ACTIVITY_KIND_CONCURRENT_KERNEL). Kernel activities are now reported using the CUpti_ActivityKernel9 activity record.

Public Variables

int32_t  blockX
int32_t  blockY
int32_t  blockZ
CUpti_ActivityKernel4::@9  cacheConfig
uint64_t  completed
uint32_t  contextId
uint32_t  correlationId
uint32_t  deviceId
int32_t  dynamicSharedMemory
uint64_t  end
uint8_t  executed
int64_t  gridId
int32_t  gridX
int32_t  gridY
int32_t  gridZ
uint8_t  isSharedMemoryCarveoutRequested
CUpti_ActivityKind kind
uint8_t  launchType
uint32_t  localMemoryPerThread
uint32_t  localMemoryTotal
const char * name
uint8_t  padding
CUpti_ActivityPartitionedGlobalCacheConfig partitionedGlobalCacheExecuted
CUpti_ActivityPartitionedGlobalCacheConfig partitionedGlobalCacheRequested
uint64_t  queued
uint16_t  registersPerThread
uint8_t  requested
void * reserved0
uint8_t  sharedMemoryCarveoutRequested
uint8_t  sharedMemoryConfig
uint32_t  sharedMemoryExecuted
uint64_t  start
int32_t  staticSharedMemory
uint32_t  streamId
uint64_t  submitted

Variables

int32_t CUpti_ActivityKernel4::blockX [inherited]

The X-dimension block size for the kernel.

int32_t CUpti_ActivityKernel4::blockY [inherited]

The Y-dimension block size for the kernel.

int32_t CUpti_ActivityKernel4::blockZ [inherited]

The Z-dimension block size for the kernel.

CUpti_ActivityKernel4::@9 CUpti_ActivityKernel4::cacheConfig [inherited]

For devices with compute capability 7.0 and higher, the cacheConfig values are not updated if the field isSharedMemoryCarveoutRequested is set.

uint64_t CUpti_ActivityKernel4::completed [inherited]

The completed timestamp for the kernel execution, in ns. It represents the completion of all its child kernels and the kernel itself. A value of CUPTI_TIMESTAMP_UNKNOWN indicates that the completion time is unknown.

uint32_t CUpti_ActivityKernel4::contextId [inherited]

The ID of the context where the kernel is executing.

uint32_t CUpti_ActivityKernel4::correlationId [inherited]

The correlation ID of the kernel. Each kernel execution is assigned a unique correlation ID that is identical to the correlation ID in the driver or runtime API activity record that launched the kernel.

uint32_t CUpti_ActivityKernel4::deviceId [inherited]

The ID of the device where the kernel is executing.

int32_t CUpti_ActivityKernel4::dynamicSharedMemory [inherited]

The dynamic shared memory reserved for the kernel, in bytes.

uint64_t CUpti_ActivityKernel4::end [inherited]

The end timestamp for the kernel execution, in ns. A value of 0 for both the start and end timestamps indicates that timestamp information could not be collected for the kernel.

uint8_t CUpti_ActivityKernel4::executed [inherited]

The cache configuration used for the kernel. The value is one of the CUfunc_cache enumeration values from cuda.h.

int64_t CUpti_ActivityKernel4::gridId [inherited]

The grid ID of the kernel. Each kernel is assigned a unique grid ID at runtime.

int32_t CUpti_ActivityKernel4::gridX [inherited]

The X-dimension grid size for the kernel.

int32_t CUpti_ActivityKernel4::gridY [inherited]

The Y-dimension grid size for the kernel.

int32_t CUpti_ActivityKernel4::gridZ [inherited]

The Z-dimension grid size for the kernel.
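
The six dimension fields together give the launch geometry. As a sketch of how they combine, the total number of threads in a launch is the product of the grid size and the block size; ExampleLaunchDims and total_threads are hypothetical names used only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical holder for the six documented dimension fields. */
typedef struct {
    int32_t gridX, gridY, gridZ;
    int32_t blockX, blockY, blockZ;
} ExampleLaunchDims;

/* Total threads = blocks in the grid * threads per block.
 * The product is computed in 64 bits to avoid 32-bit overflow. */
uint64_t total_threads(const ExampleLaunchDims *d)
{
    assert(d != NULL);
    assert(d->gridX >= 0 && d->gridY >= 0 && d->gridZ >= 0);
    assert(d->blockX >= 0 && d->blockY >= 0 && d->blockZ >= 0);
    uint64_t blocks = (uint64_t)d->gridX * (uint64_t)d->gridY * (uint64_t)d->gridZ;
    uint64_t threadsPerBlock =
        (uint64_t)d->blockX * (uint64_t)d->blockY * (uint64_t)d->blockZ;
    return blocks * threadsPerBlock;
}
```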

uint8_t CUpti_ActivityKernel4::isSharedMemoryCarveoutRequested [inherited]

This indicates whether CU_FUNC_ATTRIBUTE_PREFERRED_SHARED_MEMORY_CARVEOUT was updated for the kernel launch.

CUpti_ActivityKind CUpti_ActivityKernel4::kind [inherited]

The activity record kind, must be CUPTI_ACTIVITY_KIND_KERNEL or CUPTI_ACTIVITY_KIND_CONCURRENT_KERNEL.

uint8_t CUpti_ActivityKernel4::launchType [inherited]

This indicates whether the kernel was executed via a regular launch or via a single/multi device cooperative launch.

See also:

CUpti_ActivityLaunchType

uint32_t CUpti_ActivityKernel4::localMemoryPerThread [inherited]

The amount of local memory reserved for each thread, in bytes.

uint32_t CUpti_ActivityKernel4::localMemoryTotal [inherited]

The total amount of local memory reserved for the kernel, in bytes.

const char * CUpti_ActivityKernel4::name [inherited]

The name of the kernel. This name is shared across all activity records representing the same kernel, and so should not be modified.

uint8_t CUpti_ActivityKernel4::padding [inherited]

Undefined. Reserved for internal use.

CUpti_ActivityPartitionedGlobalCacheConfig CUpti_ActivityKernel4::partitionedGlobalCacheExecuted [inherited]

The partitioned global caching executed for the kernel. Partitioned global caching is required to enable caching on certain chips, such as devices with compute capability 5.2. Partitioned global caching can be automatically disabled if the occupancy requirement of the launch cannot support caching.

CUpti_ActivityPartitionedGlobalCacheConfig CUpti_ActivityKernel4::partitionedGlobalCacheRequested [inherited]

The partitioned global caching requested for the kernel. Partitioned global caching is required to enable caching on certain chips, such as devices with compute capability 5.2.

uint64_t CUpti_ActivityKernel4::queued [inherited]

The timestamp when the kernel is queued up in the command buffer, in ns. A value of CUPTI_TIMESTAMP_UNKNOWN indicates that the queued time could not be collected for the kernel. This timestamp is not collected by default. Use API cuptiActivityEnableLatencyTimestamps() to enable collection.

The command buffer is a buffer written by the CUDA driver to send commands such as kernel launches and memory copies to the GPU. All launches of CUDA kernels are asynchronous with respect to the host: the host requests the launch by writing commands into the command buffer, then returns without checking the GPU's progress.

uint16_t CUpti_ActivityKernel4::registersPerThread [inherited]

The number of registers required for each thread executing the kernel.

uint8_t CUpti_ActivityKernel4::requested [inherited]

The cache configuration requested by the kernel. The value is one of the CUfunc_cache enumeration values from cuda.h.

void * CUpti_ActivityKernel4::reserved0 [inherited]

Undefined. Reserved for internal use.

uint8_t CUpti_ActivityKernel4::sharedMemoryCarveoutRequested [inherited]

Shared memory carveout value requested for the function, as a percentage of the total resource. The value is updated only if the field isSharedMemoryCarveoutRequested is set.

uint8_t CUpti_ActivityKernel4::sharedMemoryConfig [inherited]

The shared memory configuration used for the kernel. The value is one of the CUsharedconfig enumeration values from cuda.h.

uint32_t CUpti_ActivityKernel4::sharedMemoryExecuted [inherited]

Shared memory size set by the driver.

uint64_t CUpti_ActivityKernel4::start [inherited]

The start timestamp for the kernel execution, in ns. A value of 0 for both the start and end timestamps indicates that timestamp information could not be collected for the kernel.

int32_t CUpti_ActivityKernel4::staticSharedMemory [inherited]

The static shared memory allocated for the kernel, in bytes.

uint32_t CUpti_ActivityKernel4::streamId [inherited]

The ID of the stream where the kernel is executing.

uint64_t CUpti_ActivityKernel4::submitted [inherited]

The timestamp when the command buffer containing the kernel launch is submitted to the GPU, in ns. A value of CUPTI_TIMESTAMP_UNKNOWN indicates that the submitted time could not be collected for the kernel. This timestamp is not collected by default. Use API cuptiActivityEnableLatencyTimestamps() to enable collection.
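
When latency timestamps have been enabled, the queued timestamp can be used to estimate how long a launch waited before it began executing. The sketch below is illustrative only: ExampleKernelTimes and queue_to_start_ns are hypothetical names, and EXAMPLE_TIMESTAMP_UNKNOWN is an assumed stand-in for CUPTI_TIMESTAMP_UNKNOWN whose actual value should be confirmed in the CUPTI headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: stands in for CUPTI_TIMESTAMP_UNKNOWN. Confirm the real
 * value in the CUPTI headers before relying on it. */
#define EXAMPLE_TIMESTAMP_UNKNOWN ((uint64_t)0)

/* Hypothetical holder for the documented timestamp fields, all in ns. */
typedef struct {
    uint64_t queued;
    uint64_t submitted;
    uint64_t start;
    uint64_t end;
    uint64_t completed;
} ExampleKernelTimes;

/* Write (start - queued) to *latencyNs and return true, or return false
 * when either timestamp was not collected or the values are inconsistent. */
bool queue_to_start_ns(const ExampleKernelTimes *t, uint64_t *latencyNs)
{
    assert(t != NULL && latencyNs != NULL);
    if (t->queued == EXAMPLE_TIMESTAMP_UNKNOWN) {
        return false; /* latency timestamps were not collected */
    }
    if (t->start == 0 && t->end == 0) {
        return false; /* execution timestamps were not collected */
    }
    if (t->start < t->queued) {
        return false; /* defensive check, an assumption of this sketch */
    }
    *latencyNs = t->start - t->queued;
    return true;
}
```

The same pattern applies to the submitted timestamp, which marks when the command buffer reached the GPU rather than when the launch was queued.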

6.41. CUpti_ActivityKernel5 Struct Reference

[CUPTI Activity API]

Public Variables

int32_t  blockX
int32_t  blockY
int32_t  blockZ
CUpti_ActivityKernel5::@11  cacheConfig
uint64_t  completed
uint32_t  contextId
uint32_t  correlationId
uint32_t  deviceId
int32_t  dynamicSharedMemory
uint64_t  end
uint8_t  executed
uint32_t  graphId
uint64_t  graphNodeId
int64_t  gridId
int32_t  gridX
int32_t  gridY
int32_t  gridZ
uint8_t  isSharedMemoryCarveoutRequested
CUpti_ActivityKind kind
uint8_t  launchType
uint32_t  localMemoryPerThread
uint32_t  localMemoryTotal
const char * name
uint8_t  padding
CUpti_ActivityPartitionedGlobalCacheConfig partitionedGlobalCacheExecuted
CUpti_ActivityPartitionedGlobalCacheConfig partitionedGlobalCacheRequested
uint64_t  queued
uint16_t  registersPerThread
uint8_t  requested
void * reserved0
uint8_t  sharedMemoryCarveoutRequested
uint8_t  sharedMemoryConfig
uint32_t  sharedMemoryExecuted
CUpti_FuncShmemLimitConfig shmemLimitConfig
uint64_t  start
int32_t  staticSharedMemory
uint32_t  streamId
uint64_t  submitted

Variables

int32_t CUpti_ActivityKernel5::blockX [inherited]

The X-dimension block size for the kernel.

int32_t CUpti_ActivityKernel5::blockY [inherited]

The Y-dimension block size for the kernel.

int32_t CUpti_ActivityKernel5::blockZ [inherited]

The Z-dimension block size for the kernel.

CUpti_ActivityKernel5::@11 CUpti_ActivityKernel5::cacheConfig [inherited]

For devices with compute capability 7.0 and higher, the cacheConfig values are not updated if the field isSharedMemoryCarveoutRequested is set.

uint64_t CUpti_ActivityKernel5::completed [inherited]

The completed timestamp for the kernel execution, in ns. It represents the completion of all its child kernels and the kernel itself. A value of CUPTI_TIMESTAMP_UNKNOWN indicates that the completion time is unknown.

uint32_t CUpti_ActivityKernel5::contextId [inherited]

The ID of the context where the kernel is executing.

uint32_t CUpti_ActivityKernel5::correlationId [inherited]

The correlation ID of the kernel. Each kernel execution is assigned a unique correlation ID that is identical to the correlation ID in the driver or runtime API activity record that launched the kernel.

uint32_t CUpti_ActivityKernel5::deviceId [inherited]

The ID of the device where the kernel is executing.

int32_t CUpti_ActivityKernel5::dynamicSharedMemory [inherited]

The dynamic shared memory reserved for the kernel, in bytes.

uint64_t CUpti_ActivityKernel5::end [inherited]

The end timestamp for the kernel execution, in ns. A value of 0 for both the start and end timestamps indicates that timestamp information could not be collected for the kernel.

uint8_t CUpti_ActivityKernel5::executed [inherited]

The cache configuration used for the kernel. The value is one of the CUfunc_cache enumeration values from cuda.h.

uint32_t CUpti_ActivityKernel5::graphId [inherited]

The unique ID of the graph that launched this kernel through graph launch APIs. This field will be 0 if the kernel is not launched through graph launch APIs.

uint64_t CUpti_ActivityKernel5::graphNodeId [inherited]

The unique ID of the graph node that launched this kernel through graph launch APIs. This field will be 0 if the kernel is not launched through graph launch APIs.
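
Because both graph fields are 0 for kernels that were not launched through graph launch APIs, a nonzero graphId can be used to separate graph launches from ordinary ones. A minimal sketch, with ExampleGraphInfo and count_graph_launched as hypothetical names:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical holder for the two documented graph fields. */
typedef struct {
    uint32_t graphId;
    uint64_t graphNodeId;
} ExampleGraphInfo;

/* Count the records whose graphId is nonzero, that is, the kernels
 * launched through graph launch APIs according to the field description. */
size_t count_graph_launched(const ExampleGraphInfo *records, size_t count)
{
    assert(records != NULL || count == 0);
    size_t launched = 0;
    for (size_t i = 0; i < count; ++i) {
        if (records[i].graphId != 0) {
            ++launched;
        }
    }
    return launched;
}
```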

int64_t CUpti_ActivityKernel5::gridId [inherited]

The grid ID of the kernel. Each kernel is assigned a unique grid ID at runtime.

int32_t CUpti_ActivityKernel5::gridX [inherited]

The X-dimension grid size for the kernel.

int32_t CUpti_ActivityKernel5::gridY [inherited]

The Y-dimension grid size for the kernel.

int32_t CUpti_ActivityKernel5::gridZ [inherited]

The Z-dimension grid size for the kernel.

uint8_t CUpti_ActivityKernel5::isSharedMemoryCarveoutRequested [inherited]

This indicates whether CU_FUNC_ATTRIBUTE_PREFERRED_SHARED_MEMORY_CARVEOUT was updated for the kernel launch.

CUpti_ActivityKind CUpti_ActivityKernel5::kind [inherited]

The activity record kind, must be CUPTI_ACTIVITY_KIND_KERNEL or CUPTI_ACTIVITY_KIND_CONCURRENT_KERNEL.

uint8_t CUpti_ActivityKernel5::launchType [inherited]

This indicates whether the kernel was executed via a regular launch or via a single/multi device cooperative launch.

See also:

CUpti_ActivityLaunchType

uint32_t CUpti_ActivityKernel5::localMemoryPerThread [inherited]

The amount of local memory reserved for each thread, in bytes.

uint32_t CUpti_ActivityKernel5::localMemoryTotal [inherited]

The total amount of local memory reserved for the kernel, in bytes.

const char * CUpti_ActivityKernel5::name [inherited]

The name of the kernel. This name is shared across all activity records representing the same kernel, and so should not be modified.

uint8_t CUpti_ActivityKernel5::padding [inherited]

Undefined. Reserved for internal use.

CUpti_ActivityPartitionedGlobalCacheConfig CUpti_ActivityKernel5::partitionedGlobalCacheExecuted [inherited]

The partitioned global caching executed for the kernel. Partitioned global caching is required to enable caching on certain chips, such as devices with compute capability 5.2. Partitioned global caching can be automatically disabled if the occupancy requirement of the launch cannot support caching.

CUpti_ActivityPartitionedGlobalCacheConfig CUpti_ActivityKernel5::partitionedGlobalCacheRequested [inherited]

The partitioned global caching requested for the kernel. Partitioned global caching is required to enable caching on certain chips, such as devices with compute capability 5.2.

uint64_t CUpti_ActivityKernel5::queued [inherited]

The timestamp when the kernel is queued up in the command buffer, in ns. A value of CUPTI_TIMESTAMP_UNKNOWN indicates that the queued time could not be collected for the kernel. This timestamp is not collected by default. Use API cuptiActivityEnableLatencyTimestamps() to enable collection.

The command buffer is a buffer written by the CUDA driver to send commands such as kernel launches and memory copies to the GPU. All launches of CUDA kernels are asynchronous with respect to the host: the host requests the launch by writing commands into the command buffer, then returns without checking the GPU's progress.

uint16_t CUpti_ActivityKernel5::registersPerThread [inherited]

The number of registers required for each thread executing the kernel.

uint8_t CUpti_ActivityKernel5::requested [inherited]

The cache configuration requested by the kernel. The value is one of the CUfunc_cache enumeration values from cuda.h.

void * CUpti_ActivityKernel5::reserved0 [inherited]

Undefined. Reserved for internal use.

uint8_t CUpti_ActivityKernel5::sharedMemoryCarveoutRequested [inherited]

Shared memory carveout value requested for the function, as a percentage of the total resource. The value is updated only if the field isSharedMemoryCarveoutRequested is set.
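
Since the carveout value is meaningful only when isSharedMemoryCarveoutRequested is set, a reader of the record should check the flag first. The sketch below shows one way to do that; requested_carveout_percent is a hypothetical helper, and returning -1 for "not requested" is a convention of this example only.

```c
#include <assert.h>
#include <stdint.h>

/* Return the requested shared memory carveout as a percentage, or -1
 * when the flag says no carveout was requested (in which case the
 * value field is not meaningful). */
int requested_carveout_percent(uint8_t isSharedMemoryCarveoutRequested,
                               uint8_t sharedMemoryCarveoutRequested)
{
    if (!isSharedMemoryCarveoutRequested) {
        return -1;
    }
    /* A percentage above 100 would indicate a malformed record. */
    assert(sharedMemoryCarveoutRequested <= 100);
    return (int)sharedMemoryCarveoutRequested;
}
```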

uint8_t CUpti_ActivityKernel5::sharedMemoryConfig [inherited]

The shared memory configuration used for the kernel. The value is one of the CUsharedconfig enumeration values from cuda.h.

uint32_t CUpti_ActivityKernel5::sharedMemoryExecuted [inherited]

Shared memory size set by the driver.

CUpti_FuncShmemLimitConfig CUpti_ActivityKernel5::shmemLimitConfig [inherited]

The shared memory limit config for the kernel. This field shows whether the user has opted for a higher per-block limit of dynamic shared memory.

uint64_t CUpti_ActivityKernel5::start [inherited]

The start timestamp for the kernel execution, in ns. A value of 0 for both the start and end timestamps indicates that timestamp information could not be collected for the kernel.

int32_t CUpti_ActivityKernel5::staticSharedMemory [inherited]

The static shared memory allocated for the kernel, in bytes.

uint32_t CUpti_ActivityKernel5::streamId [inherited]

The ID of the stream where the kernel is executing.

uint64_t CUpti_ActivityKernel5::submitted [inherited]

The timestamp when the command buffer containing the kernel launch is submitted to the GPU, in ns. A value of CUPTI_TIMESTAMP_UNKNOWN indicates that the submitted time could not be collected for the kernel. This timestamp is not collected by default. Use API cuptiActivityEnableLatencyTimestamps() to enable collection.

6.42. CUpti_ActivityKernel6 Struct Reference

[CUPTI Activity API]

This activity record represents a kernel execution (CUPTI_ACTIVITY_KIND_KERNEL and CUPTI_ACTIVITY_KIND_CONCURRENT_KERNEL) but is no longer generated by CUPTI. Kernel activities are now reported using the CUpti_ActivityKernel9 activity record.

Public Variables

int32_t  blockX
int32_t  blockY
int32_t  blockZ
CUpti_ActivityKernel6::@13  cacheConfig
uint64_t  completed
uint32_t  contextId
uint32_t  correlationId
uint32_t  deviceId
int32_t  dynamicSharedMemory
uint64_t  end
uint8_t  executed
uint32_t  graphId
uint64_t  graphNodeId
int64_t  gridId
int32_t  gridX
int32_t  gridY
int32_t  gridZ
uint8_t  isSharedMemoryCarveoutRequested
CUpti_ActivityKind kind
uint8_t  launchType
uint32_t  localMemoryPerThread
uint32_t  localMemoryTotal
const char * name
CUaccessPolicyWindow * pAccessPolicyWindow
uint8_t  padding
CUpti_ActivityPartitionedGlobalCacheConfig partitionedGlobalCacheExecuted
CUpti_ActivityPartitionedGlobalCacheConfig partitionedGlobalCacheRequested
uint64_t  queued
uint16_t  registersPerThread
uint8_t  requested
void * reserved0
uint8_t  sharedMemoryCarveoutRequested
uint8_t  sharedMemoryConfig
uint32_t  sharedMemoryExecuted
CUpti_FuncShmemLimitConfig shmemLimitConfig
uint64_t  start
int32_t  staticSharedMemory
uint32_t  streamId
uint64_t  submitted

Variables

int32_t CUpti_ActivityKernel6::blockX [inherited]

The X-dimension block size for the kernel.

int32_t CUpti_ActivityKernel6::blockY [inherited]

The Y-dimension block size for the kernel.

int32_t CUpti_ActivityKernel6::blockZ [inherited]

The Z-dimension block size for the kernel.

CUpti_ActivityKernel6::@13 CUpti_ActivityKernel6::cacheConfig [inherited]

For devices with compute capability 7.0 and higher, the cacheConfig values are not updated if the field isSharedMemoryCarveoutRequested is set.

uint64_t CUpti_ActivityKernel6::completed [inherited]

The completed timestamp for the kernel execution, in ns. It represents the completion of all its child kernels and the kernel itself. A value of CUPTI_TIMESTAMP_UNKNOWN indicates that the completion time is unknown.

uint32_t CUpti_ActivityKernel6::contextId [inherited]

The ID of the context where the kernel is executing.

uint32_t CUpti_ActivityKernel6::correlationId [inherited]

The correlation ID of the kernel. Each kernel execution is assigned a unique correlation ID that is identical to the correlation ID in the driver or runtime API activity record that launched the kernel.

uint32_t CUpti_ActivityKernel6::deviceId [inherited]

The ID of the device where the kernel is executing.

int32_t CUpti_ActivityKernel6::dynamicSharedMemory [inherited]

The dynamic shared memory reserved for the kernel, in bytes.

uint64_t CUpti_ActivityKernel6::end [inherited]

The end timestamp for the kernel execution, in ns. A value of 0 for both the start and end timestamps indicates that timestamp information could not be collected for the kernel.

uint8_t CUpti_ActivityKernel6::executed [inherited]

The cache configuration used for the kernel. The value is one of the CUfunc_cache enumeration values from cuda.h.

uint32_t CUpti_ActivityKernel6::graphId [inherited]

The unique ID of the graph that launched this kernel through graph launch APIs. This field will be 0 if the kernel is not launched through graph launch APIs.

uint64_t CUpti_ActivityKernel6::graphNodeId [inherited]

The unique ID of the graph node that launched this kernel through graph launch APIs. This field will be 0 if the kernel is not launched through graph launch APIs.

int64_t CUpti_ActivityKernel6::gridId [inherited]

The grid ID of the kernel. Each kernel is assigned a unique grid ID at runtime.

int32_t CUpti_ActivityKernel6::gridX [inherited]

The X-dimension grid size for the kernel.

int32_t CUpti_ActivityKernel6::gridY [inherited]

The Y-dimension grid size for the kernel.

int32_t CUpti_ActivityKernel6::gridZ [inherited]

The Z-dimension grid size for the kernel.

uint8_t CUpti_ActivityKernel6::isSharedMemoryCarveoutRequested [inherited]

This indicates whether CU_FUNC_ATTRIBUTE_PREFERRED_SHARED_MEMORY_CARVEOUT was updated for the kernel launch.

CUpti_ActivityKind CUpti_ActivityKernel6::kind [inherited]

The activity record kind, must be CUPTI_ACTIVITY_KIND_KERNEL or CUPTI_ACTIVITY_KIND_CONCURRENT_KERNEL.

uint8_t CUpti_ActivityKernel6::launchType [inherited]

This indicates whether the kernel was executed via a regular launch or via a single/multi device cooperative launch.

See also:

CUpti_ActivityLaunchType

uint32_t CUpti_ActivityKernel6::localMemoryPerThread [inherited]

The amount of local memory reserved for each thread, in bytes.

uint32_t CUpti_ActivityKernel6::localMemoryTotal [inherited]

The total amount of local memory reserved for the kernel, in bytes.

const char * CUpti_ActivityKernel6::name [inherited]

The name of the kernel. This name is shared across all activity records representing the same kernel, and so should not be modified.

CUaccessPolicyWindow * CUpti_ActivityKernel6::pAccessPolicyWindow [inherited]

The pointer to the access policy window. The structure CUaccessPolicyWindow is defined in cuda.h.

uint8_t CUpti_ActivityKernel6::padding [inherited]

Undefined. Reserved for internal use.

CUpti_ActivityPartitionedGlobalCacheConfig CUpti_ActivityKernel6::partitionedGlobalCacheExecuted [inherited]

The partitioned global caching executed for the kernel. Partitioned global caching is required to enable caching on certain chips, such as devices with compute capability 5.2. Partitioned global caching can be automatically disabled if the occupancy requirement of the launch cannot support caching.

CUpti_ActivityPartitionedGlobalCacheConfig CUpti_ActivityKernel6::partitionedGlobalCacheRequested [inherited]

The partitioned global caching requested for the kernel. Partitioned global caching is required to enable caching on certain chips, such as devices with compute capability 5.2.

uint64_t CUpti_ActivityKernel6::queued [inherited]

The timestamp when the kernel is queued up in the command buffer, in ns. A value of CUPTI_TIMESTAMP_UNKNOWN indicates that the queued time could not be collected for the kernel. This timestamp is not collected by default. Use API cuptiActivityEnableLatencyTimestamps() to enable collection.

The command buffer is a buffer written by the CUDA driver to send commands such as kernel launches and memory copies to the GPU. All launches of CUDA kernels are asynchronous with respect to the host: the host requests the launch by writing commands into the command buffer, then returns without checking the GPU's progress.

uint16_t CUpti_ActivityKernel6::registersPerThread [inherited]

The number of registers required for each thread executing the kernel.

uint8_t CUpti_ActivityKernel6::requested [inherited]

The cache configuration requested by the kernel. The value is one of the CUfunc_cache enumeration values from cuda.h.

void * CUpti_ActivityKernel6::reserved0 [inherited]

Undefined. Reserved for internal use.

uint8_t CUpti_ActivityKernel6::sharedMemoryCarveoutRequested [inherited]

Shared memory carveout value requested for the function, as a percentage of the total resource. The value is updated only if the field isSharedMemoryCarveoutRequested is set.

uint8_t CUpti_ActivityKernel6::sharedMemoryConfig [inherited]

The shared memory configuration used for the kernel. The value is one of the CUsharedconfig enumeration values from cuda.h.

uint32_t CUpti_ActivityKernel6::sharedMemoryExecuted [inherited]

Shared memory size set by the driver.

CUpti_FuncShmemLimitConfig CUpti_ActivityKernel6::shmemLimitConfig [inherited]

The shared memory limit config for the kernel. This field shows whether the user has opted for a higher per-block limit of dynamic shared memory.

uint64_t CUpti_ActivityKernel6::start [inherited]

The start timestamp for the kernel execution, in ns. A value of 0 for both the start and end timestamps indicates that timestamp information could not be collected for the kernel.

int32_t CUpti_ActivityKernel6::staticSharedMemory [inherited]

The static shared memory allocated for the kernel, in bytes.
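
The static and dynamic shared memory fields are reported separately, so a tool that wants the combined figure has to add them. A minimal sketch, assuming both values are non-negative byte counts; total_shared_memory_bytes is a hypothetical helper name.

```c
#include <assert.h>
#include <stdint.h>

/* Sum the static and dynamic shared memory sizes, in bytes. The result
 * is widened to 64 bits so the addition cannot overflow. */
int64_t total_shared_memory_bytes(int32_t staticSharedMemory,
                                  int32_t dynamicSharedMemory)
{
    /* Negative sizes would indicate a malformed record. */
    assert(staticSharedMemory >= 0);
    assert(dynamicSharedMemory >= 0);
    return (int64_t)staticSharedMemory + (int64_t)dynamicSharedMemory;
}
```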

uint32_t CUpti_ActivityKernel6::streamId [inherited]

The ID of the stream where the kernel is executing.

uint64_t CUpti_ActivityKernel6::submitted [inherited]

The timestamp when the command buffer containing the kernel launch is submitted to the GPU, in ns. A value of CUPTI_TIMESTAMP_UNKNOWN indicates that the submitted time could not be collected for the kernel. This timestamp is not collected by default. Use API cuptiActivityEnableLatencyTimestamps() to enable collection.

6.43. CUpti_ActivityKernel7 Struct Reference

[CUPTI Activity API]

This activity record represents a kernel execution (CUPTI_ACTIVITY_KIND_KERNEL and CUPTI_ACTIVITY_KIND_CONCURRENT_KERNEL) but is no longer generated by CUPTI. Kernel activities are now reported using the CUpti_ActivityKernel9 activity record.

Public Variables

int32_t  blockX
int32_t  blockY
int32_t  blockZ
CUpti_ActivityKernel7::@15  cacheConfig
uint32_t  channelID
CUpti_ChannelType  channelType
uint64_t  completed
uint32_t  contextId
uint32_t  correlationId
uint32_t  deviceId
int32_t  dynamicSharedMemory
uint64_t  end
uint8_t  executed
uint32_t  graphId
uint64_t  graphNodeId
int64_t  gridId
int32_t  gridX
int32_t  gridY
int32_t  gridZ
uint8_t  isSharedMemoryCarveoutRequested
CUpti_ActivityKind kind
uint8_t  launchType
uint32_t  localMemoryPerThread
uint32_t  localMemoryTotal
const char * name
CUaccessPolicyWindow * pAccessPolicyWindow
uint8_t  padding
CUpti_ActivityPartitionedGlobalCacheConfig partitionedGlobalCacheExecuted
CUpti_ActivityPartitionedGlobalCacheConfig partitionedGlobalCacheRequested
uint64_t  queued
uint16_t  registersPerThread
uint8_t  requested
void * reserved0
uint8_t  sharedMemoryCarveoutRequested
uint8_t  sharedMemoryConfig
uint32_t  sharedMemoryExecuted
CUpti_FuncShmemLimitConfig shmemLimitConfig
uint64_t  start
int32_t  staticSharedMemory
uint32_t  streamId
uint64_t  submitted

Variables

int32_t CUpti_ActivityKernel7::blockX [inherited]

The X-dimension block size for the kernel.

int32_t CUpti_ActivityKernel7::blockY [inherited]

The Y-dimension block size for the kernel.

int32_t CUpti_ActivityKernel7::blockZ [inherited]

The Z-dimension block size for the kernel.

CUpti_ActivityKernel7::@15 CUpti_ActivityKernel7::cacheConfig [inherited]

For devices with compute capability 7.0 and higher, the cacheConfig values are not updated if the field isSharedMemoryCarveoutRequested is set.

uint32_t CUpti_ActivityKernel7::channelID [inherited]

The ID of the HW channel on which the kernel is launched.

CUpti_ChannelType CUpti_ActivityKernel7::channelType [inherited]

The type of the channel.

uint64_t CUpti_ActivityKernel7::completed [inherited]

The completed timestamp for the kernel execution, in ns. It represents the completion of all its child kernels and the kernel itself. A value of CUPTI_TIMESTAMP_UNKNOWN indicates that the completion time is unknown.

uint32_t CUpti_ActivityKernel7::contextId [inherited]

The ID of the context where the kernel is executing.

uint32_t CUpti_ActivityKernel7::correlationId [inherited]

The correlation ID of the kernel. Each kernel execution is assigned a unique correlation ID that is identical to the correlation ID in the driver or runtime API activity record that launched the kernel.

uint32_t CUpti_ActivityKernel7::deviceId [inherited]

The ID of the device where the kernel is executing.

int32_t CUpti_ActivityKernel7::dynamicSharedMemory [inherited]

The dynamic shared memory reserved for the kernel, in bytes.

uint64_t CUpti_ActivityKernel7::end [inherited]

The end timestamp for the kernel execution, in ns. A value of 0 for both the start and end timestamps indicates that timestamp information could not be collected for the kernel.

uint8_t CUpti_ActivityKernel7::executed [inherited]

The cache configuration used for the kernel. The value is one of the CUfunc_cache enumeration values from cuda.h.

uint32_t CUpti_ActivityKernel7::graphId [inherited]

The unique ID of the graph that launched this kernel through graph launch APIs. This field will be 0 if the kernel is not launched through graph launch APIs.

uint64_t CUpti_ActivityKernel7::graphNodeId [inherited]

The unique ID of the graph node that launched this kernel through graph launch APIs. This field will be 0 if the kernel is not launched through graph launch APIs.

int64_t CUpti_ActivityKernel7::gridId [inherited]

The grid ID of the kernel. Each kernel is assigned a unique grid ID at runtime.

int32_t CUpti_ActivityKernel7::gridX [inherited]

The X-dimension grid size for the kernel.

int32_t CUpti_ActivityKernel7::gridY [inherited]

The Y-dimension grid size for the kernel.

int32_t CUpti_ActivityKernel7::gridZ [inherited]

The Z-dimension grid size for the kernel.

uint8_t CUpti_ActivityKernel7::isSharedMemoryCarveoutRequested [inherited]

This indicates whether CU_FUNC_ATTRIBUTE_PREFERRED_SHARED_MEMORY_CARVEOUT was updated for the kernel launch.

CUpti_ActivityKind CUpti_ActivityKernel7::kind [inherited]

The activity record kind, must be CUPTI_ACTIVITY_KIND_KERNEL or CUPTI_ACTIVITY_KIND_CONCURRENT_KERNEL.
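
Code that walks a buffer of mixed activity records typically inspects the kind field before treating a record as a kernel record. The sketch below illustrates that check with a hypothetical enum; the enumerator names and numeric values are placeholders for this example and are not the real CUpti_ActivityKind values.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Placeholder enum for illustration only. The real enumerators are
 * CUPTI_ACTIVITY_KIND_KERNEL and CUPTI_ACTIVITY_KIND_CONCURRENT_KERNEL;
 * the values here are arbitrary. */
typedef enum {
    EXAMPLE_KIND_OTHER = 0,
    EXAMPLE_KIND_KERNEL = 1,
    EXAMPLE_KIND_CONCURRENT_KERNEL = 2
} ExampleActivityKind;

/* Hypothetical record header carrying only the kind field. */
typedef struct {
    ExampleActivityKind kind;
} ExampleRecordHeader;

/* Return true only for the two kinds a kernel record may carry. */
bool is_kernel_record(const ExampleRecordHeader *record)
{
    assert(record != NULL);
    return record->kind == EXAMPLE_KIND_KERNEL ||
           record->kind == EXAMPLE_KIND_CONCURRENT_KERNEL;
}
```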

uint8_t CUpti_ActivityKernel7::launchType [inherited]

This indicates whether the kernel was executed via a regular launch or via a single/multi device cooperative launch.

See also:

CUpti_ActivityLaunchType

uint32_t CUpti_ActivityKernel7::localMemoryPerThread [inherited]

The amount of local memory reserved for each thread, in bytes.

uint32_t CUpti_ActivityKernel7::localMemoryTotal [inherited]

The total amount of local memory reserved for the kernel, in bytes.

const char * CUpti_ActivityKernel7::name [inherited]

The name of the kernel. This name is shared across all activity records representing the same kernel, and so should not be modified.

CUaccessPolicyWindow * CUpti_ActivityKernel7::pAccessPolicyWindow [inherited]

The pointer to the access policy window. The structure CUaccessPolicyWindow is defined in cuda.h.

uint8_t CUpti_ActivityKernel7::padding [inherited]

Undefined. Reserved for internal use.

CUpti_ActivityPartitionedGlobalCacheConfig CUpti_ActivityKernel7::partitionedGlobalCacheExecuted [inherited]

The partitioned global caching executed for the kernel. Partitioned global caching is required to enable caching on certain chips, such as devices with compute capability 5.2. Partitioned global caching can be automatically disabled if the occupancy requirement of the launch cannot support caching.

CUpti_ActivityPartitionedGlobalCacheConfig CUpti_ActivityKernel7::partitionedGlobalCacheRequested [inherited]

The partitioned global caching requested for the kernel. Partitioned global caching is required to enable caching on certain chips, such as devices with compute capability 5.2.

uint64_t CUpti_ActivityKernel7::queued [inherited]

The timestamp when the kernel is queued up in the command buffer, in ns. A value of CUPTI_TIMESTAMP_UNKNOWN indicates that the queued time could not be collected for the kernel. This timestamp is not collected by default. Use API cuptiActivityEnableLatencyTimestamps() to enable collection.

The command buffer is a buffer written by the CUDA driver to send commands, such as kernel launches and memory copies, to the GPU. All CUDA kernel launches are asynchronous with respect to the host: the host requests the launch by writing commands into the command buffer, then returns without checking the GPU's progress.

uint16_t CUpti_ActivityKernel7::registersPerThread [inherited]

The number of registers required for each thread executing the kernel.

uint8_t CUpti_ActivityKernel7::requested [inherited]

The cache configuration requested by the kernel. The value is one of the CUfunc_cache enumeration values from cuda.h.

void * CUpti_ActivityKernel7::reserved0 [inherited]

Undefined. Reserved for internal use.

uint8_t CUpti_ActivityKernel7::sharedMemoryCarveoutRequested [inherited]

Shared memory carveout value requested for the function in percentage of the total resource. The value will be updated only if field isSharedMemoryCarveoutRequested is set.

uint8_t CUpti_ActivityKernel7::sharedMemoryConfig [inherited]

The shared memory configuration used for the kernel. The value is one of the CUsharedconfig enumeration values from cuda.h.

uint32_t CUpti_ActivityKernel7::sharedMemoryExecuted [inherited]

Shared memory size set by the driver.

CUpti_FuncShmemLimitConfig CUpti_ActivityKernel7::shmemLimitConfig [inherited]

The shared memory limit configuration for the kernel. This field shows whether the user has opted for a higher per-block limit of dynamic shared memory.

uint64_t CUpti_ActivityKernel7::start [inherited]

The start timestamp for the kernel execution, in ns. A value of 0 for both the start and end timestamps indicates that timestamp information could not be collected for the kernel.

int32_t CUpti_ActivityKernel7::staticSharedMemory [inherited]

The static shared memory allocated for the kernel, in bytes.

uint32_t CUpti_ActivityKernel7::streamId [inherited]

The ID of the stream where the kernel is executing.

uint64_t CUpti_ActivityKernel7::submitted [inherited]

The timestamp when the command buffer containing the kernel launch is submitted to the GPU, in ns. A value of CUPTI_TIMESTAMP_UNKNOWN indicates that the submitted time could not be collected for the kernel. This timestamp is not collected by default. Use API cuptiActivityEnableLatencyTimestamps() to enable collection.
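The queued and submitted fields are only meaningful when latency timestamps were enabled, so code that consumes them has to test for the unknown sentinel first. The following is a minimal sketch of that check, not CUPTI code: the helper name and the TS_UNKNOWN_STANDIN macro are hypothetical, and the sentinel value 0 is an assumption. Real code should compare against CUPTI_TIMESTAMP_UNKNOWN from cupti_activity.h.

```c
#include <stdint.h>

/* Stand-in for CUPTI_TIMESTAMP_UNKNOWN. The value 0 is an assumption made
 * for this sketch; real code should use the constant from cupti_activity.h. */
#define TS_UNKNOWN_STANDIN 0ULL

/* Returns the time in ns between queuing the kernel in the command buffer
 * and submitting that buffer to the GPU. Returns -1 if either timestamp
 * was not collected or if the pair is out of order. */
int64_t queued_to_submitted_ns(uint64_t queued, uint64_t submitted)
{
    if (queued == TS_UNKNOWN_STANDIN || submitted == TS_UNKNOWN_STANDIN) {
        return -1;
    }
    if (submitted < queued) {
        return -1;
    }
    return (int64_t)(submitted - queued);
}
```

A caller would pass the queued and submitted fields of a kernel record and treat a negative result as "latency not available".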

6.44. CUpti_ActivityKernel8 Struct Reference

[CUPTI Activity API]

This activity record represents a kernel execution (CUPTI_ACTIVITY_KIND_KERNEL and CUPTI_ACTIVITY_KIND_CONCURRENT_KERNEL)

Public Variables

int32_t  blockX
int32_t  blockY
int32_t  blockZ
CUpti_ActivityKernel8::@17  cacheConfig
uint32_t  channelID
CUpti_ChannelType  channelType
uint32_t  clusterSchedulingPolicy
uint32_t  clusterX
uint32_t  clusterY
uint32_t  clusterZ
uint64_t  completed
uint32_t  contextId
uint32_t  correlationId
uint32_t  deviceId
int32_t  dynamicSharedMemory
uint64_t  end
uint8_t  executed
uint32_t  graphId
uint64_t  graphNodeId
int64_t  gridId
int32_t  gridX
int32_t  gridY
int32_t  gridZ
uint8_t  isSharedMemoryCarveoutRequested
CUpti_ActivityKind kind
uint8_t  launchType
uint32_t  localMemoryPerThread
uint32_t  localMemoryTotal
uint64_t  localMemoryTotal_v2
const char * name
CUaccessPolicyWindow * pAccessPolicyWindow
uint8_t  padding
CUpti_ActivityPartitionedGlobalCacheConfig partitionedGlobalCacheExecuted
CUpti_ActivityPartitionedGlobalCacheConfig partitionedGlobalCacheRequested
uint64_t  queued
uint16_t  registersPerThread
uint8_t  requested
void * reserved0
uint8_t  sharedMemoryCarveoutRequested
uint8_t  sharedMemoryConfig
uint32_t  sharedMemoryExecuted
CUpti_FuncShmemLimitConfig shmemLimitConfig
uint64_t  start
int32_t  staticSharedMemory
uint32_t  streamId
uint64_t  submitted

Variables

int32_t CUpti_ActivityKernel8::blockX [inherited]

The X-dimension block size for the kernel.

int32_t CUpti_ActivityKernel8::blockY [inherited]

The Y-dimension block size for the kernel.

int32_t CUpti_ActivityKernel8::blockZ [inherited]

The Z-dimension block size for the kernel.

CUpti_ActivityKernel8::@17 CUpti_ActivityKernel8::cacheConfig [inherited]

For devices with compute capability 7.0 and higher, the cacheConfig values are not updated if the field isSharedMemoryCarveoutRequested is set.

uint32_t CUpti_ActivityKernel8::channelID [inherited]

The ID of the HW channel on which the kernel is launched.

CUpti_ChannelType CUpti_ActivityKernel8::channelType [inherited]

The type of the channel.

uint32_t CUpti_ActivityKernel8::clusterSchedulingPolicy [inherited]

The cluster scheduling policy for the kernel. Refer to CUclusterSchedulingPolicy. This field is valid for devices with compute capability 9.0 and higher.

uint32_t CUpti_ActivityKernel8::clusterX [inherited]

The X-dimension cluster size for the kernel. This field is valid for devices with compute capability 9.0 and higher.

uint32_t CUpti_ActivityKernel8::clusterY [inherited]

The Y-dimension cluster size for the kernel. This field is valid for devices with compute capability 9.0 and higher.

uint32_t CUpti_ActivityKernel8::clusterZ [inherited]

The Z-dimension cluster size for the kernel. This field is valid for devices with compute capability 9.0 and higher.
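Because the cluster fields are only valid on devices with compute capability 9.0 and higher, a consumer may want to gate their use on the device's major version. The sketch below assumes the caller already knows that version. The record itself does not carry it, and both the helper name and the cc_major parameter are hypothetical.

```c
#include <stdint.h>

/* Returns the number of thread blocks per cluster, or 0 when the cluster
 * fields should not be interpreted (compute capability below 9.0).
 * cc_major is supplied by the caller, for example from device attributes. */
uint64_t blocks_per_cluster(int cc_major, uint32_t clusterX,
                            uint32_t clusterY, uint32_t clusterZ)
{
    if (cc_major < 9) {
        return 0;
    }
    /* Widen before multiplying so the product is computed in 64 bits. */
    return (uint64_t)clusterX * clusterY * clusterZ;
}
```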

uint64_t CUpti_ActivityKernel8::completed [inherited]

The completed timestamp for the kernel execution, in ns. It represents the completion of all its child kernels and the kernel itself. A value of CUPTI_TIMESTAMP_UNKNOWN indicates that the completion time is unknown.

uint32_t CUpti_ActivityKernel8::contextId [inherited]

The ID of the context where the kernel is executing.

uint32_t CUpti_ActivityKernel8::correlationId [inherited]

The correlation ID of the kernel. Each kernel execution is assigned a unique correlation ID that is identical to the correlation ID in the driver or runtime API activity record that launched the kernel.
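Since a kernel record and the API record that launched it carry the same correlation ID, the two can be joined with a simple lookup. The following sketch assumes the correlationId values of the API records have already been copied into a plain array. The function name and that array are hypothetical and not part of CUPTI.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns the index of the API record whose correlation ID equals the
 * kernel's correlation ID, or -1 if no record matches. api_ids holds
 * correlationId values taken from driver or runtime API records. */
long find_api_record(const uint32_t *api_ids, size_t count,
                     uint32_t kernel_correlation_id)
{
    for (size_t i = 0; i < count; ++i) {
        if (api_ids[i] == kernel_correlation_id) {
            return (long)i;
        }
    }
    return -1;
}
```

A linear scan is enough for a sketch. A tool that processes many records would more likely use a hash map keyed by correlation ID.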

uint32_t CUpti_ActivityKernel8::deviceId [inherited]

The ID of the device where the kernel is executing.

int32_t CUpti_ActivityKernel8::dynamicSharedMemory [inherited]

The dynamic shared memory reserved for the kernel, in bytes.

uint64_t CUpti_ActivityKernel8::end [inherited]

The end timestamp for the kernel execution, in ns. A value of 0 for both the start and end timestamps indicates that timestamp information could not be collected for the kernel.
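The start and end descriptions imply a small validity check before computing a duration: a pair of zeros means no timing was collected. A minimal sketch of that rule follows. The helper names are hypothetical, and treating an inverted pair as "no duration" is a choice made for this example.

```c
#include <stdbool.h>
#include <stdint.h>

/* True when both timestamps are 0, which the field descriptions define as
 * "timestamp information could not be collected". */
bool kernel_timestamps_missing(uint64_t start, uint64_t end)
{
    return start == 0 && end == 0;
}

/* Returns end - start in ns, or 0 when the timestamps are missing or
 * inverted. */
uint64_t kernel_duration_ns(uint64_t start, uint64_t end)
{
    if (kernel_timestamps_missing(start, end) || end < start) {
        return 0;
    }
    return end - start;
}
```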

uint8_t CUpti_ActivityKernel8::executed [inherited]

The cache configuration used for the kernel. The value is one of the CUfunc_cache enumeration values from cuda.h.

uint32_t CUpti_ActivityKernel8::graphId [inherited]

The unique ID of the graph that launched this kernel through graph launch APIs. This field will be 0 if the kernel is not launched through graph launch APIs.

uint64_t CUpti_ActivityKernel8::graphNodeId [inherited]

The unique ID of the graph node that launched this kernel through graph launch APIs. This field will be 0 if the kernel is not launched through graph launch APIs.

int64_t CUpti_ActivityKernel8::gridId [inherited]

The grid ID of the kernel. Each kernel is assigned a unique grid ID at runtime.

int32_t CUpti_ActivityKernel8::gridX [inherited]

The X-dimension grid size for the kernel.

int32_t CUpti_ActivityKernel8::gridY [inherited]

The Y-dimension grid size for the kernel.

int32_t CUpti_ActivityKernel8::gridZ [inherited]

The Z-dimension grid size for the kernel.
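The grid and block dimensions together give the launch size. As an illustration, the total thread count can be derived as the product of all six fields. The sketch below uses a hypothetical helper name and widens to 64 bits because the product easily exceeds the 32-bit range. It does not guard against overflow of the 64-bit result for extreme launch shapes.

```c
#include <stdint.h>

/* Returns gridX*gridY*gridZ * blockX*blockY*blockZ as a 64-bit count, or 0
 * if any dimension is not positive. The record stores each dimension as
 * int32_t, so each value is widened before multiplying. */
uint64_t total_threads(int32_t gridX, int32_t gridY, int32_t gridZ,
                       int32_t blockX, int32_t blockY, int32_t blockZ)
{
    if (gridX <= 0 || gridY <= 0 || gridZ <= 0 ||
        blockX <= 0 || blockY <= 0 || blockZ <= 0) {
        return 0;
    }
    uint64_t blocks = (uint64_t)gridX * (uint64_t)gridY * (uint64_t)gridZ;
    uint64_t threads_per_block =
        (uint64_t)blockX * (uint64_t)blockY * (uint64_t)blockZ;
    return blocks * threads_per_block;
}
```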

uint8_t CUpti_ActivityKernel8::isSharedMemoryCarveoutRequested [inherited]

This indicates whether CU_FUNC_ATTRIBUTE_PREFERRED_SHARED_MEMORY_CARVEOUT was updated for the kernel launch.

CUpti_ActivityKind CUpti_ActivityKernel8::kind [inherited]

The activity record kind, must be CUPTI_ACTIVITY_KIND_KERNEL or CUPTI_ACTIVITY_KIND_CONCURRENT_KERNEL.

uint8_t CUpti_ActivityKernel8::launchType [inherited]

This indicates whether the kernel was executed via a regular launch or via a single- or multi-device cooperative launch.

See also:

CUpti_ActivityLaunchType

uint32_t CUpti_ActivityKernel8::localMemoryPerThread [inherited]

The amount of local memory reserved for each thread, in bytes.

uint32_t CUpti_ActivityKernel8::localMemoryTotal [inherited]

The total amount of local memory reserved for the kernel, in bytes (deprecated in CUDA 11.8). Refer to field localMemoryTotal_v2.

uint64_t CUpti_ActivityKernel8::localMemoryTotal_v2 [inherited]

The total amount of local memory reserved for the kernel, in bytes.
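The descriptions do not state why the 64-bit field replaced the 32-bit one. One plausible reading is that totals of 4 GiB or more cannot be represented in a uint32_t. Under that assumption, the following sketch (with a hypothetical helper name) shows how a tool might detect values that the deprecated field could not hold.

```c
#include <stdbool.h>
#include <stdint.h>

/* True when a total taken from localMemoryTotal_v2 is too large to be
 * represented in the deprecated 32-bit localMemoryTotal field. */
bool legacy_total_would_truncate(uint64_t total_v2)
{
    return total_v2 > UINT32_MAX;
}
```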

const char * CUpti_ActivityKernel8::name [inherited]

The name of the kernel. This name is shared across all activity records representing the same kernel, and so should not be modified.

CUaccessPolicyWindow * CUpti_ActivityKernel8::pAccessPolicyWindow [inherited]

The pointer to the access policy window. The structure CUaccessPolicyWindow is defined in cuda.h.

uint8_t CUpti_ActivityKernel8::padding [inherited]

Undefined. Reserved for internal use.

CUpti_ActivityPartitionedGlobalCacheConfig CUpti_ActivityKernel8::partitionedGlobalCacheExecuted [inherited]

The partitioned global caching executed for the kernel. Partitioned global caching is required to enable caching on certain chips, such as devices with compute capability 5.2. Partitioned global caching can be automatically disabled if the occupancy requirement of the launch cannot support caching.

CUpti_ActivityPartitionedGlobalCacheConfig CUpti_ActivityKernel8::partitionedGlobalCacheRequested [inherited]

The partitioned global caching requested for the kernel. Partitioned global caching is required to enable caching on certain chips, such as devices with compute capability 5.2.

uint64_t CUpti_ActivityKernel8::queued [inherited]

The timestamp when the kernel is queued up in the command buffer, in ns. A value of CUPTI_TIMESTAMP_UNKNOWN indicates that the queued time could not be collected for the kernel. This timestamp is not collected by default. Use API cuptiActivityEnableLatencyTimestamps() to enable collection.

The command buffer is a buffer written by the CUDA driver to send commands, such as kernel launches and memory copies, to the GPU. All CUDA kernel launches are asynchronous with respect to the host: the host requests the launch by writing commands into the command buffer, then returns without checking the GPU's progress.

uint16_t CUpti_ActivityKernel8::registersPerThread [inherited]

The number of registers required for each thread executing the kernel.

uint8_t CUpti_ActivityKernel8::requested [inherited]

The cache configuration requested by the kernel. The value is one of the CUfunc_cache enumeration values from cuda.h.

void * CUpti_ActivityKernel8::reserved0 [inherited]

Undefined. Reserved for internal use.

uint8_t CUpti_ActivityKernel8::sharedMemoryCarveoutRequested [inherited]

Shared memory carveout value requested for the function in percentage of the total resource. The value will be updated only if field isSharedMemoryCarveoutRequested is set.
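The carveout percentage is only updated when isSharedMemoryCarveoutRequested is set, so the two fields are best read together. A minimal sketch of that pairing is shown below. The helper name and the use of -1 as a "not requested" marker are choices made for this example.

```c
#include <stdint.h>

/* Returns the requested shared memory carveout as a percentage, or -1 when
 * isSharedMemoryCarveoutRequested is not set and the carveout field
 * therefore should not be interpreted. */
int carveout_percent_or_unset(uint8_t isSharedMemoryCarveoutRequested,
                              uint8_t sharedMemoryCarveoutRequested)
{
    if (!isSharedMemoryCarveoutRequested) {
        return -1;
    }
    return (int)sharedMemoryCarveoutRequested;
}
```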

uint8_t CUpti_ActivityKernel8::sharedMemoryConfig [inherited]

The shared memory configuration used for the kernel. The value is one of the CUsharedconfig enumeration values from cuda.h.

uint32_t CUpti_ActivityKernel8::sharedMemoryExecuted [inherited]

Shared memory size set by the driver.

CUpti_FuncShmemLimitConfig CUpti_ActivityKernel8::shmemLimitConfig [inherited]

The shared memory limit configuration for the kernel. This field shows whether the user has opted for a higher per-block limit of dynamic shared memory.

uint64_t CUpti_ActivityKernel8::start [inherited]

The start timestamp for the kernel execution, in ns. A value of 0 for both the start and end timestamps indicates that timestamp information could not be collected for the kernel.

int32_t CUpti_ActivityKernel8::staticSharedMemory [inherited]

The static shared memory allocated for the kernel, in bytes.