6. Data Structures
Here are the data structures with brief descriptions:
- BufferInfo
- BufferInfo is stored in the file for every buffer, i.e. for every call of the UtilDumpPcSamplingBufferInFile() API
- CUPTI::PcSamplingUtil::CUptiUtil_GetBufferInfoParams
- Params for CuptiUtilGetBufferInfo
- CUPTI::PcSamplingUtil::CUptiUtil_GetHeaderDataParams
- Params for CuptiUtilGetHeaderData
- CUPTI::PcSamplingUtil::CUptiUtil_GetPcSampDataParams
- Params for CuptiUtilGetPcSampData
- CUPTI::PcSamplingUtil::CUptiUtil_MergePcSampDataParams
- Params for CuptiUtilMergePcSampData
- CUPTI::PcSamplingUtil::CUptiUtil_PutPcSampDataParams
- Params for CuptiUtilPutPcSampData
- CUpti_Activity
- The base activity record
- CUpti_ActivityAPI
- The activity record for a driver or runtime API invocation
- CUpti_ActivityAutoBoostState
- Device auto boost state structure
- CUpti_ActivityBranch
- The activity record for source level result branch. (deprecated)
- CUpti_ActivityBranch2
- The activity record for source level result branch
- CUpti_ActivityCdpKernel
- The activity record for CDP (CUDA Dynamic Parallelism) kernel
- CUpti_ActivityContext
- The activity record for a context
- CUpti_ActivityCudaEvent
- The activity record for CUDA event
- CUpti_ActivityDevice
- The activity record for a device. (deprecated)
- CUpti_ActivityDevice2
- The activity record for a device. (deprecated)
- CUpti_ActivityDevice3
- The activity record for a device. (CUDA 7.0 onwards)
- CUpti_ActivityDevice4
- The activity record for a device. (CUDA 11.6 onwards)
- CUpti_ActivityDevice5
- The activity record for a device. (CUDA 11.6 onwards)
- CUpti_ActivityDeviceAttribute
- The activity record for a device attribute
- CUpti_ActivityEnvironment
- The activity record for CUPTI environmental data
- CUpti_ActivityEvent
- The activity record for a CUPTI event
- CUpti_ActivityEventInstance
- The activity record for a CUPTI event with instance information
- CUpti_ActivityExternalCorrelation
- The activity record for correlation with external records
- CUpti_ActivityFunction
- The activity record for global/device functions
- CUpti_ActivityGlobalAccess
- The activity record for source-level global access. (deprecated)
- CUpti_ActivityGlobalAccess2
- The activity record for source-level global access. (deprecated in CUDA 9.0)
- CUpti_ActivityGlobalAccess3
- The activity record for source-level global access
- CUpti_ActivityGraphTrace
- The activity record for trace of graph execution
- CUpti_ActivityInstantaneousEvent
- The activity record for an instantaneous CUPTI event
- CUpti_ActivityInstantaneousEventInstance
- The activity record for an instantaneous CUPTI event with event domain instance information
- CUpti_ActivityInstantaneousMetric
- The activity record for an instantaneous CUPTI metric
- CUpti_ActivityInstantaneousMetricInstance
- The instantaneous activity record for a CUPTI metric with instance information
- CUpti_ActivityInstructionCorrelation
- The activity record for source-level sass/source line-by-line correlation
- CUpti_ActivityInstructionExecution
- The activity record for source-level instruction execution
- CUpti_ActivityJit
- The activity record for JIT operations. This activity represents the JIT operations (compile, load, store) of a CUmodule from the Compute Cache. It gives the exact hashed path from which the cached module is loaded, or where the module will be stored after Just-In-Time (JIT) compilation
- CUpti_ActivityKernel
- The activity record for kernel. (deprecated)
- CUpti_ActivityKernel2
- The activity record for kernel. (deprecated)
- CUpti_ActivityKernel3
- The activity record for a kernel (CUDA 6.5 (with sm_52 support) onwards). (deprecated in CUDA 9.0)
- CUpti_ActivityKernel4
- The activity record for a kernel (CUDA 9.0 (with sm_70 support) onwards). (deprecated in CUDA 11.0)
- CUpti_ActivityKernel5
- The activity record for a kernel (CUDA 11.0 (with sm_80 support) onwards). (deprecated in CUDA 11.2) This activity record represents a kernel execution (CUPTI_ACTIVITY_KIND_KERNEL and CUPTI_ACTIVITY_KIND_CONCURRENT_KERNEL) but is no longer generated by CUPTI. Kernel activities are now reported using the CUpti_ActivityKernel9 activity record
- CUpti_ActivityKernel6
- The activity record for kernel. (deprecated in CUDA 11.6)
- CUpti_ActivityKernel7
- The activity record for kernel. (deprecated in CUDA 11.8)
- CUpti_ActivityKernel8
- The activity record for kernel
- CUpti_ActivityKernel9
- The activity record for kernel
- CUpti_ActivityMarker
- The activity record providing a marker which is an instantaneous point in time. (deprecated in CUDA 8.0)
- CUpti_ActivityMarker2
- The activity record providing a marker which is an instantaneous point in time
- CUpti_ActivityMarkerData
- The activity record providing detailed information for a marker
- CUpti_ActivityMemcpy
- The activity record for memory copies. (deprecated)
- CUpti_ActivityMemcpy3
- The activity record for memory copies. (deprecated in CUDA 11.1)
- CUpti_ActivityMemcpy4
- The activity record for memory copies. (deprecated in CUDA 11.6)
- CUpti_ActivityMemcpy5
- The activity record for memory copies
- CUpti_ActivityMemcpyPtoP
- The activity record for peer-to-peer memory copies
- CUpti_ActivityMemcpyPtoP2
- The activity record for peer-to-peer memory copies. (deprecated in CUDA 11.1)
- CUpti_ActivityMemcpyPtoP3
- The activity record for peer-to-peer memory copies. (deprecated in CUDA 11.6)
- CUpti_ActivityMemcpyPtoP4
- The activity record for peer-to-peer memory copies
- CUpti_ActivityMemory
- The activity record for memory
- CUpti_ActivityMemory2
- The activity record for memory
- CUpti_ActivityMemory3
- The activity record for memory
- CUpti_ActivityMemory3::CUpti_ActivityMemory3::PACKED_ALIGNMENT
- CUpti_ActivityMemoryPool
- The activity record for memory pool
- CUpti_ActivityMemoryPool2
- The activity record for memory pool
- CUpti_ActivityMemset
- The activity record for memset. (deprecated)
- CUpti_ActivityMemset2
- The activity record for memset. (deprecated in CUDA 11.1)
- CUpti_ActivityMemset3
- The activity record for memset. (deprecated in CUDA 11.6)
- CUpti_ActivityMemset4
- The activity record for memset
- CUpti_ActivityMetric
- The activity record for a CUPTI metric
- CUpti_ActivityMetricInstance
- The activity record for a CUPTI metric with instance information
- CUpti_ActivityModule
- The activity record for a CUDA module
- CUpti_ActivityName
- The activity record providing a name
- CUpti_ActivityNvLink
- NVLink information. (deprecated in CUDA 9.0)
- CUpti_ActivityNvLink2
- NVLink information. (deprecated in CUDA 10.0)
- CUpti_ActivityNvLink3
- NVLink information
- CUpti_ActivityNvLink4
- NVLink information
- CUpti_ActivityObjectKindId
- Identifiers for object kinds as specified by CUpti_ActivityObjectKind
- CUpti_ActivityOpenAcc
- The base activity record for OpenACC records
- CUpti_ActivityOpenAccData
- The activity record for OpenACC data
- CUpti_ActivityOpenAccLaunch
- The activity record for OpenACC launch
- CUpti_ActivityOpenAccOther
- The activity record for OpenACC other
- CUpti_ActivityOpenMp
- The base activity record for OpenMP records
- CUpti_ActivityOverhead
- The activity record for CUPTI and driver overheads
- CUpti_ActivityPcie
- PCI devices information required to construct topology
- CUpti_ActivityPCSampling
- The activity record for PC sampling. (deprecated in CUDA 8.0)
- CUpti_ActivityPCSampling2
- The activity record for PC sampling. (deprecated in CUDA 9.0)
- CUpti_ActivityPCSampling3
- The activity record for PC sampling
- CUpti_ActivityPCSamplingConfig
- PC sampling configuration structure
- CUpti_ActivityPCSamplingRecordInfo
- The activity record for record status for PC sampling
- CUpti_ActivityPreemption
- The activity record for a preemption of a CDP kernel
- CUpti_ActivitySharedAccess
- The activity record for source-level shared access
- CUpti_ActivitySourceLocator
- The activity record for source locator
- CUpti_ActivityStream
- The activity record for CUDA stream
- CUpti_ActivitySynchronization
- The activity record for synchronization management
- CUpti_ActivityUnifiedMemoryCounter
- The activity record for Unified Memory counters (deprecated in CUDA 7.0)
- CUpti_ActivityUnifiedMemoryCounter2
- The activity record for Unified Memory counters (CUDA 7.0 and beyond)
- CUpti_ActivityUnifiedMemoryCounterConfig
- Unified Memory counters configuration structure
- CUpti_CallbackData
- Data passed into a runtime or driver API callback function
- CUpti_EventGroupSet
- A set of event groups
- CUpti_EventGroupSets
- A set of event group sets
- CUpti_GetCubinCrcParams
- Params for cuptiGetCubinCrc
- CUpti_GetSassToSourceCorrelationParams
- Params for cuptiGetSassToSourceCorrelation
- CUpti_GraphData
- CUDA graphs data passed into a resource callback function
- CUpti_MetricValue
- A metric value
- CUpti_ModuleResourceData
- Module data passed into a resource callback function
- CUpti_NvtxData
- Data passed into an NVTX callback function
- CUpti_PCSamplingConfigurationInfo
- PC sampling configuration information structure
- CUpti_PCSamplingConfigurationInfoParams
- PC sampling configuration structure
- CUpti_PCSamplingData
- Collected PC Sampling data
- CUpti_PCSamplingDisableParams
- Params for cuptiPCSamplingDisable
- CUpti_PCSamplingEnableParams
- Params for cuptiPCSamplingEnable
- CUpti_PCSamplingGetDataParams
- Params for cuptiPCSamplingGetData
- CUpti_PCSamplingGetNumStallReasonsParams
- Params for cuptiPCSamplingGetNumStallReasons
- CUpti_PCSamplingGetStallReasonsParams
- Params for cuptiPCSamplingGetStallReasons
- CUpti_PCSamplingPCData
- PC Sampling data
- CUpti_PCSamplingStallReason
- PC Sampling stall reasons
- CUpti_PCSamplingStartParams
- Params for cuptiPCSamplingStart
- CUpti_PCSamplingStopParams
- Params for cuptiPCSamplingStop
- CUpti_Profiler_BeginPass_Params
- Params for cuptiProfilerBeginPass
- CUpti_Profiler_BeginSession_Params
- Params for cuptiProfilerBeginSession
- CUpti_Profiler_CounterDataImage_CalculateScratchBufferSize_Params
- Params for cuptiProfilerCounterDataImageCalculateScratchBufferSize
- CUpti_Profiler_CounterDataImage_CalculateSize_Params
- Params for cuptiProfilerCounterDataImageCalculateSize
- CUpti_Profiler_CounterDataImage_Initialize_Params
- Params for cuptiProfilerCounterDataImageInitialize
- CUpti_Profiler_CounterDataImage_InitializeScratchBuffer_Params
- Params for cuptiProfilerCounterDataImageInitializeScratchBuffer
- CUpti_Profiler_CounterDataImageOptions
- Input parameter to define the counterDataImage
- CUpti_Profiler_DeInitialize_Params
- Default parameter for cuptiProfilerDeInitialize
- CUpti_Profiler_DeviceSupported_Params
- Params for cuptiProfilerDeviceSupported
- CUpti_Profiler_DisableProfiling_Params
- Params for cuptiProfilerDisableProfiling
- CUpti_Profiler_EnableProfiling_Params
- Params for cuptiProfilerEnableProfiling
- CUpti_Profiler_EndPass_Params
- Params for cuptiProfilerEndPass
- CUpti_Profiler_EndSession_Params
- Params for cuptiProfilerEndSession
- CUpti_Profiler_FlushCounterData_Params
- Params for cuptiProfilerFlushCounterData
- CUpti_Profiler_GetCounterAvailability_Params
- Params for cuptiProfilerGetCounterAvailability
- CUpti_Profiler_Initialize_Params
- Default parameter for cuptiProfilerInitialize
- CUpti_Profiler_IsPassCollected_Params
- Params for cuptiProfilerIsPassCollected
- CUpti_Profiler_SetConfig_Params
- Params for cuptiProfilerSetConfig
- CUpti_Profiler_UnsetConfig_Params
- Params for cuptiProfilerUnsetConfig
- CUpti_ResourceData
- Data passed into a resource callback function
- CUpti_StateData
- Data passed into a State callback function
- CUpti_SynchronizeData
- Data passed into a synchronize callback function
- Header
- Header info stored in the file
- NV::Cupti::Checkpoint::CUpti_Checkpoint
- Configuration and handle for a CUPTI Checkpoint
- PcSamplingStallReasons
- Stores the names of all available stall reasons and their respective indexes
6.1. BufferInfo Struct Reference
[CUPTI PC Sampling Utility API]
Public Variables
- uint64_t bufferByteSize
- uint64_t numSelectedStallReasons
- size_t numStallReasons
- uint64_t recordCount
Variables
- uint64_t BufferInfo::bufferByteSize [inherited]
-
Buffer size in Bytes.
- uint64_t BufferInfo::numSelectedStallReasons [inherited]
-
Total number of stall reasons in a single record.
- size_t BufferInfo::numStallReasons [inherited]
-
Count of all stall reasons supported on the GPU
- uint64_t BufferInfo::recordCount [inherited]
-
Total number of PC records.
6.2. CUPTI::PcSamplingUtil::CUptiUtil_GetBufferInfoParams Struct Reference
[CUPTI PC Sampling Utility API]
Public Variables
- struct BufferInfo bufferInfoData
- std::ifstream * fileHandler
- size_t size
Variables
- struct BufferInfo CUPTI::PcSamplingUtil::CUptiUtil_GetBufferInfoParams::bufferInfoData [inherited]
-
Buffer Info.
- std::ifstream * CUPTI::PcSamplingUtil::CUptiUtil_GetBufferInfoParams::fileHandler [inherited]
-
File handle.
- size_t CUPTI::PcSamplingUtil::CUptiUtil_GetBufferInfoParams::size [inherited]
-
Size of the data structure. The CUPTI client should set this field to the size of the structure. CUPTI uses it to check which fields are available in the structure, which preserves backward compatibility.
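The size field described above follows a common versioning idiom: the client records how large the structure was when it was compiled, and the library only reads fields that fit within that size. The following is a minimal sketch of that idea, not CUPTI's implementation. The type ExampleParams, the macro EXAMPLE_STRUCT_SIZE, and the function hasNewField are hypothetical names introduced only for illustration.

```cpp
#include <cstddef>

// Hypothetical parameter struct; NOT a CUPTI type. Only the leading
// `size` field mirrors the convention described in the text.
struct ExampleParams {
    std::size_t size;      // set by the client to the size it was compiled against
    int         oldField;  // present since the first version
    int         newField;  // added in a later version
};

// Size of the struct up to and including the given member (illustrative macro).
#define EXAMPLE_STRUCT_SIZE(type, member) \
    (offsetof(type, member) + sizeof(((type*)nullptr)->member))

// Library-side check: a field is treated as available only if the size
// reported by the client covers it.
inline bool hasNewField(const ExampleParams& p) {
    return p.size >= EXAMPLE_STRUCT_SIZE(ExampleParams, newField);
}
```

A client built against the older layout would report a size that ends at oldField, so hasNewField returns false and the newer field is never read.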
6.3. CUPTI::PcSamplingUtil::CUptiUtil_GetHeaderDataParams Struct Reference
[CUPTI PC Sampling Utility API]
Public Variables
- std::ifstream * fileHandler
- struct Header headerInfo
- size_t size
Variables
- std::ifstream * CUPTI::PcSamplingUtil::CUptiUtil_GetHeaderDataParams::fileHandler [inherited]
-
File handle.
- struct Header CUPTI::PcSamplingUtil::CUptiUtil_GetHeaderDataParams::headerInfo [inherited]
-
Header Info.
- size_t CUPTI::PcSamplingUtil::CUptiUtil_GetHeaderDataParams::size [inherited]
-
Size of the data structure. The CUPTI client should set this field to the size of the structure. CUPTI uses it to check which fields are available in the structure, which preserves backward compatibility.
6.4. CUPTI::PcSamplingUtil::CUptiUtil_GetPcSampDataParams Struct Reference
[CUPTI PC Sampling Utility API]
Public Variables
- PcSamplingBufferType bufferType
- std::ifstream * fileHandler
- size_t numAttributes
- BufferInfo * pBufferInfoData
- CUpti_PCSamplingConfigurationInfo * pPCSamplingConfigurationInfo
- PcSamplingStallReasons * pPcSamplingStallReasons
- void * pSamplingData
- size_t size
Variables
- PcSamplingBufferType CUPTI::PcSamplingUtil::CUptiUtil_GetPcSampDataParams::bufferType [inherited]
-
Type of buffer to store in file
- std::ifstream * CUPTI::PcSamplingUtil::CUptiUtil_GetPcSampDataParams::fileHandler [inherited]
-
File handle.
- size_t CUPTI::PcSamplingUtil::CUptiUtil_GetPcSampDataParams::numAttributes [inherited]
-
Number of configuration attributes
- BufferInfo * CUPTI::PcSamplingUtil::CUptiUtil_GetPcSampDataParams::pBufferInfoData [inherited]
-
Pointer to collected buffer info using CuptiUtilGetBufferInfo
- CUpti_PCSamplingConfigurationInfo * CUPTI::PcSamplingUtil::CUptiUtil_GetPcSampDataParams::pPCSamplingConfigurationInfo [inherited]
- PcSamplingStallReasons * CUPTI::PcSamplingUtil::CUptiUtil_GetPcSampDataParams::pPcSamplingStallReasons [inherited]
-
Refer to PcSamplingStallReasons. For the stallReasons field of PcSamplingStallReasons, the caller is expected to allocate memory for each string element of the array.
- void * CUPTI::PcSamplingUtil::CUptiUtil_GetPcSampDataParams::pSamplingData [inherited]
-
Pointer to allocated memory to store retrieved data from file.
- size_t CUPTI::PcSamplingUtil::CUptiUtil_GetPcSampDataParams::size [inherited]
-
Size of the data structure. The CUPTI client should set this field to the size of the structure. CUPTI uses it to check which fields are available in the structure, which preserves backward compatibility.
6.5. CUPTI::PcSamplingUtil::CUptiUtil_MergePcSampDataParams Struct Reference
[CUPTI PC Sampling Utility API]
Public Variables
- * MergedPcSampDataBuffers
- CUpti_PCSamplingData * PcSampDataBuffer
- size_t * numMergedBuffer
- size_t numberOfBuffers
- size_t size
Variables
- * CUPTI::PcSamplingUtil::CUptiUtil_MergePcSampDataParams::MergedPcSampDataBuffers [inherited]
-
Pointer to array of merged buffers as per the range id.
- CUpti_PCSamplingData * CUPTI::PcSamplingUtil::CUptiUtil_MergePcSampDataParams::PcSampDataBuffer [inherited]
-
Pointer to array of buffers to merge
- size_t * CUPTI::PcSamplingUtil::CUptiUtil_MergePcSampDataParams::numMergedBuffer [inherited]
-
Number of merged buffers.
- size_t CUPTI::PcSamplingUtil::CUptiUtil_MergePcSampDataParams::numberOfBuffers [inherited]
-
Number of buffers to merge.
- size_t CUPTI::PcSamplingUtil::CUptiUtil_MergePcSampDataParams::size [inherited]
-
Size of the data structure. The CUPTI client should set this field to the size of the structure. CUPTI uses it to check which fields are available in the structure, which preserves backward compatibility.
6.6. CUPTI::PcSamplingUtil::CUptiUtil_PutPcSampDataParams Struct Reference
[CUPTI PC Sampling Utility API]
Public Variables
- PcSamplingBufferType bufferType
- const char * fileName
- size_t numAttributes
- CUpti_PCSamplingConfigurationInfo * pPCSamplingConfigurationInfo
- PcSamplingStallReasons * pPcSamplingStallReasons
- void * pSamplingData
- size_t size
Variables
- PcSamplingBufferType CUPTI::PcSamplingUtil::CUptiUtil_PutPcSampDataParams::bufferType [inherited]
-
Type of buffer to store in file
- const char * CUPTI::PcSamplingUtil::CUptiUtil_PutPcSampDataParams::fileName [inherited]
-
Name of the file in which to store the buffer.
- size_t CUPTI::PcSamplingUtil::CUptiUtil_PutPcSampDataParams::numAttributes [inherited]
-
Number of configured attributes
- CUpti_PCSamplingConfigurationInfo * CUPTI::PcSamplingUtil::CUptiUtil_PutPcSampDataParams::pPCSamplingConfigurationInfo [inherited]
-
Refer to CUpti_PCSamplingConfigurationInfo. The caller is expected to provide configuration details for at least the CUPTI_PC_SAMPLING_CONFIGURATION_ATTR_TYPE_STALL_REASON attribute.
- PcSamplingStallReasons * CUPTI::PcSamplingUtil::CUptiUtil_PutPcSampDataParams::pPcSamplingStallReasons [inherited]
-
Refer to PcSamplingStallReasons.
- void * CUPTI::PcSamplingUtil::CUptiUtil_PutPcSampDataParams::pSamplingData [inherited]
-
PC sampling buffer.
- size_t CUPTI::PcSamplingUtil::CUptiUtil_PutPcSampDataParams::size [inherited]
-
Size of the data structure. The CUPTI client should set this field to the size of the structure. CUPTI uses it to check which fields are available in the structure, which preserves backward compatibility.
6.7. CUpti_Activity Struct Reference
[CUPTI Activity API]
The activity API uses a CUpti_Activity as a generic representation for any activity. The 'kind' field is used to determine the specific activity kind, and from that the CUpti_Activity object can be cast to the specific activity record type appropriate for that kind.
Note that all activity record types are padded and aligned to ensure that each member of the record is naturally aligned.
Public Variables
- CUpti_ActivityKind kind
Variables
- CUpti_ActivityKind CUpti_Activity::kind [inherited]
-
The kind of this activity.
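The kind-then-cast pattern described for CUpti_Activity can be sketched as follows. To keep the example compilable without the CUPTI headers, it uses hypothetical stand-in types (ExampleActivity, ExampleActivityContext) and placeholder enumerator values. Real code would use CUpti_Activity, the CUPTI_ACTIVITY_KIND_* values, and the matching record types instead.

```cpp
#include <cstdint>
#include <string>

// Stand-in declarations (assumptions of this sketch, not CUPTI types).
// The enumerator values are placeholders, not the real CUPTI values.
enum ExampleActivityKind : uint32_t {
    EXAMPLE_KIND_API = 1,
    EXAMPLE_KIND_CONTEXT = 2
};

// Generic header: every record starts with its kind.
struct ExampleActivity {
    ExampleActivityKind kind;
};

// One specific record type that shares the leading `kind` field.
struct ExampleActivityContext {
    ExampleActivityKind kind;
    uint32_t contextId;
    uint32_t deviceId;
};

// Inspect `kind` first, then cast to the record type that matches it.
// This mirrors the C-style pattern the text describes.
inline std::string describe(const ExampleActivity* record) {
    switch (record->kind) {
    case EXAMPLE_KIND_CONTEXT: {
        const auto* ctx = reinterpret_cast<const ExampleActivityContext*>(record);
        return "context " + std::to_string(ctx->contextId) +
               " on device " + std::to_string(ctx->deviceId);
    }
    default:
        return "unhandled kind " +
               std::to_string(static_cast<uint32_t>(record->kind));
    }
}
```

Keeping a default branch means records of kinds the tool does not know about are skipped rather than misread.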
6.8. CUpti_ActivityAPI Struct Reference
[CUPTI Activity API]
This activity record represents an invocation of a driver or runtime API (CUPTI_ACTIVITY_KIND_DRIVER and CUPTI_ACTIVITY_KIND_RUNTIME).
Public Variables
- CUpti_CallbackId cbid
- uint32_t correlationId
- uint64_t end
- CUpti_ActivityKind kind
- uint32_t processId
- uint32_t returnValue
- uint64_t start
- uint32_t threadId
Variables
- CUpti_CallbackId CUpti_ActivityAPI::cbid [inherited]
-
The ID of the driver or runtime function.
- uint32_t CUpti_ActivityAPI::correlationId [inherited]
-
The correlation ID of the driver or runtime CUDA function. Each function invocation is assigned a unique correlation ID that is identical to the correlation ID in the memcpy, memset, or kernel activity record that is associated with this function.
- uint64_t CUpti_ActivityAPI::end [inherited]
-
The end timestamp for the function, in ns. A value of 0 for both the start and end timestamps indicates that timestamp information could not be collected for the function.
- CUpti_ActivityKind CUpti_ActivityAPI::kind [inherited]
-
The activity record kind, must be CUPTI_ACTIVITY_KIND_DRIVER, CUPTI_ACTIVITY_KIND_RUNTIME, or CUPTI_ACTIVITY_KIND_INTERNAL_LAUNCH_API.
- uint32_t CUpti_ActivityAPI::processId [inherited]
-
The ID of the process where the driver or runtime CUDA function is executing.
- uint32_t CUpti_ActivityAPI::returnValue [inherited]
-
The return value for the function. For a CUDA driver function this will be a CUresult value, and for a CUDA runtime function this will be a cudaError_t value.
- uint64_t CUpti_ActivityAPI::start [inherited]
-
The start timestamp for the function, in ns. A value of 0 for both the start and end timestamps indicates that timestamp information could not be collected for the function.
- uint32_t CUpti_ActivityAPI::threadId [inherited]
-
The ID of the thread where the driver or runtime CUDA function is executing.
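Because a value of 0 for both start and end means the timestamps could not be collected, a duration should only be computed after checking for that case. The sketch below shows one way to do this. ExampleApiTimes and apiDurationNs are hypothetical names, and the struct carries only the two timestamp fields rather than the full CUpti_ActivityAPI layout.

```cpp
#include <cstdint>

// Stand-in holding only the two timestamp fields used here (an assumption
// of this sketch; the real record has more members).
struct ExampleApiTimes {
    uint64_t start;  // ns
    uint64_t end;    // ns
};

// Writes the call duration in ns to *out and returns true, or returns false
// when both timestamps are 0, i.e. timing could not be collected.
inline bool apiDurationNs(const ExampleApiTimes& r, uint64_t* out) {
    if (r.start == 0 && r.end == 0) {
        return false;
    }
    *out = r.end - r.start;
    return true;
}
```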
6.9. CUpti_ActivityAutoBoostState Struct Reference
[CUPTI Activity API]
This structure defines the auto boost state for a device. See the function cuptiGetAutoBoostState.
Public Variables
- uint32_t enabled
- uint32_t pid
Variables
- uint32_t CUpti_ActivityAutoBoostState::enabled [inherited]
-
Returned auto boost state. 1 is returned in case auto boost is enabled, 0 otherwise
- uint32_t CUpti_ActivityAutoBoostState::pid [inherited]
-
Id of process that has set the current boost state. The value will be CUPTI_AUTO_BOOST_INVALID_CLIENT_PID if the user does not have the permission to query process ids or there is an error in querying the process id.
6.10. CUpti_ActivityBranch Struct Reference
[CUPTI Activity API]
This activity records the locations of the branches in the source (CUPTI_ACTIVITY_KIND_BRANCH). Branch activities are now reported using the CUpti_ActivityBranch2 activity record.
Public Variables
- uint32_t correlationId
- uint32_t diverged
- uint32_t executed
- CUpti_ActivityKind kind
- uint32_t pcOffset
- uint32_t sourceLocatorId
- uint64_t threadsExecuted
Variables
- uint32_t CUpti_ActivityBranch::correlationId [inherited]
-
The correlation ID of the kernel to which this result is associated.
- uint32_t CUpti_ActivityBranch::diverged [inherited]
-
Number of times this branch diverged
- uint32_t CUpti_ActivityBranch::executed [inherited]
-
The number of times this instruction was executed per warp. It will be incremented regardless of predicate or condition code.
- CUpti_ActivityKind CUpti_ActivityBranch::kind [inherited]
-
The activity record kind, must be CUPTI_ACTIVITY_KIND_BRANCH.
- uint32_t CUpti_ActivityBranch::pcOffset [inherited]
-
The pc offset for the branch.
- uint32_t CUpti_ActivityBranch::sourceLocatorId [inherited]
-
The ID for source locator.
- uint64_t CUpti_ActivityBranch::threadsExecuted [inherited]
-
Incremented each time this instruction is executed, by the number of threads that executed it.
6.11. CUpti_ActivityBranch2 Struct Reference
[CUPTI Activity API]
This activity records the locations of the branches in the source (CUPTI_ACTIVITY_KIND_BRANCH).
Public Variables
- uint32_t correlationId
- uint32_t diverged
- uint32_t executed
- uint32_t functionId
- CUpti_ActivityKind kind
- uint32_t pad
- uint32_t pcOffset
- uint32_t sourceLocatorId
- uint64_t threadsExecuted
Variables
- uint32_t CUpti_ActivityBranch2::correlationId [inherited]
-
The correlation ID of the kernel to which this result is associated.
- uint32_t CUpti_ActivityBranch2::diverged [inherited]
-
Number of times this branch diverged
- uint32_t CUpti_ActivityBranch2::executed [inherited]
-
The number of times this instruction was executed per warp. It will be incremented regardless of predicate or condition code.
- uint32_t CUpti_ActivityBranch2::functionId [inherited]
-
Correlation ID with global/device function name
- CUpti_ActivityKind CUpti_ActivityBranch2::kind [inherited]
-
The activity record kind, must be CUPTI_ACTIVITY_KIND_BRANCH.
- uint32_t CUpti_ActivityBranch2::pad [inherited]
-
Undefined. Reserved for internal use.
- uint32_t CUpti_ActivityBranch2::pcOffset [inherited]
-
The pc offset for the branch.
- uint32_t CUpti_ActivityBranch2::sourceLocatorId [inherited]
-
The ID for source locator.
- uint64_t CUpti_ActivityBranch2::threadsExecuted [inherited]
-
Incremented each time this instruction is executed, by the number of threads that executed it.
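The diverged and executed counters of a branch record can be combined into a derived figure such as the fraction of non-divergent executions. The sketch below is one plausible reading, not a formula given by this reference. It assumes that executed counts all executions of the branch and that diverged never exceeds it. ExampleBranchCounts and branchEfficiency are hypothetical names.

```cpp
#include <cstdint>

// Stand-in with the two counters used here (not the full record layout).
struct ExampleBranchCounts {
    uint32_t diverged;  // number of times the branch diverged
    uint32_t executed;  // number of times the branch instruction executed
};

// Fraction of executions that did not diverge. Returns 1.0 when the branch
// never executed, so callers do not need a special case for empty records.
inline double branchEfficiency(const ExampleBranchCounts& b) {
    if (b.executed == 0) {
        return 1.0;
    }
    return 1.0 - static_cast<double>(b.diverged) /
                 static_cast<double>(b.executed);
}
```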
6.12. CUpti_ActivityCdpKernel Struct Reference
[CUPTI Activity API]
This activity record represents a CDP kernel execution.
Public Variables
- int32_t blockX
- int32_t blockY
- int32_t blockZ
- uint64_t completed
- uint32_t contextId
- uint32_t correlationId
- uint32_t deviceId
- int32_t dynamicSharedMemory
- uint64_t end
- uint8_t executed
- int64_t gridId
- int32_t gridX
- int32_t gridY
- int32_t gridZ
- CUpti_ActivityKind kind
- uint32_t localMemoryPerThread
- uint32_t localMemoryTotal
- const char * name
- uint32_t parentBlockX
- uint32_t parentBlockY
- uint32_t parentBlockZ
- int64_t parentGridId
- uint64_t queued
- uint16_t registersPerThread
- uint8_t requested
- uint8_t sharedMemoryConfig
- uint64_t start
- int32_t staticSharedMemory
- uint32_t streamId
- uint64_t submitted
Variables
- int32_t CUpti_ActivityCdpKernel::blockX [inherited]
-
The X-dimension block size for the kernel.
- int32_t CUpti_ActivityCdpKernel::blockY [inherited]
-
The Y-dimension block size for the kernel.
- int32_t CUpti_ActivityCdpKernel::blockZ [inherited]
-
The Z-dimension block size for the kernel.
- uint64_t CUpti_ActivityCdpKernel::completed [inherited]
-
The timestamp when the kernel is marked as completed, in ns. A value of CUPTI_TIMESTAMP_UNKNOWN indicates that the completion time is unknown.
- uint32_t CUpti_ActivityCdpKernel::contextId [inherited]
-
The ID of the context where the kernel is executing.
- uint32_t CUpti_ActivityCdpKernel::correlationId [inherited]
-
The correlation ID of the kernel. Each kernel execution is assigned a unique correlation ID that is identical to the correlation ID in the driver API activity record that launched the kernel.
- uint32_t CUpti_ActivityCdpKernel::deviceId [inherited]
-
The ID of the device where the kernel is executing.
- int32_t CUpti_ActivityCdpKernel::dynamicSharedMemory [inherited]
-
The dynamic shared memory reserved for the kernel, in bytes.
- uint64_t CUpti_ActivityCdpKernel::end [inherited]
-
The end timestamp for the kernel execution, in ns. A value of 0 for both the start and end timestamps indicates that timestamp information could not be collected for the kernel.
- uint8_t CUpti_ActivityCdpKernel::executed [inherited]
-
The cache configuration used for the kernel. The value is one of the CUfunc_cache enumeration values from cuda.h.
- int64_t CUpti_ActivityCdpKernel::gridId [inherited]
-
The grid ID of the kernel. Each kernel execution is assigned a unique grid ID.
- int32_t CUpti_ActivityCdpKernel::gridX [inherited]
-
The X-dimension grid size for the kernel.
- int32_t CUpti_ActivityCdpKernel::gridY [inherited]
-
The Y-dimension grid size for the kernel.
- int32_t CUpti_ActivityCdpKernel::gridZ [inherited]
-
The Z-dimension grid size for the kernel.
- CUpti_ActivityKind CUpti_ActivityCdpKernel::kind [inherited]
-
The activity record kind, must be CUPTI_ACTIVITY_KIND_CDP_KERNEL.
- uint32_t CUpti_ActivityCdpKernel::localMemoryPerThread [inherited]
-
The amount of local memory reserved for each thread, in bytes.
- uint32_t CUpti_ActivityCdpKernel::localMemoryTotal [inherited]
-
The total amount of local memory reserved for the kernel, in bytes.
- const char * CUpti_ActivityCdpKernel::name [inherited]
-
The name of the kernel. This name is shared across all activity records representing the same kernel, and so should not be modified.
- uint32_t CUpti_ActivityCdpKernel::parentBlockX [inherited]
-
The X-dimension of the parent block.
- uint32_t CUpti_ActivityCdpKernel::parentBlockY [inherited]
-
The Y-dimension of the parent block.
- uint32_t CUpti_ActivityCdpKernel::parentBlockZ [inherited]
-
The Z-dimension of the parent block.
- int64_t CUpti_ActivityCdpKernel::parentGridId [inherited]
-
The grid ID of the parent kernel.
- uint64_t CUpti_ActivityCdpKernel::queued [inherited]
-
The timestamp when the kernel is queued up, in ns. A value of CUPTI_TIMESTAMP_UNKNOWN indicates that the queued time is unknown.
- uint16_t CUpti_ActivityCdpKernel::registersPerThread [inherited]
-
The number of registers required for each thread executing the kernel.
- uint8_t CUpti_ActivityCdpKernel::requested [inherited]
-
The cache configuration requested by the kernel. The value is one of the CUfunc_cache enumeration values from cuda.h.
- uint8_t CUpti_ActivityCdpKernel::sharedMemoryConfig [inherited]
-
The shared memory configuration used for the kernel. The value is one of the CUsharedconfig enumeration values from cuda.h.
- uint64_t CUpti_ActivityCdpKernel::start [inherited]
-
The start timestamp for the kernel execution, in ns. A value of 0 for both the start and end timestamps indicates that timestamp information could not be collected for the kernel.
- int32_t CUpti_ActivityCdpKernel::staticSharedMemory [inherited]
-
The static shared memory allocated for the kernel, in bytes.
- uint32_t CUpti_ActivityCdpKernel::streamId [inherited]
-
The ID of the stream where the kernel is executing.
- uint64_t CUpti_ActivityCdpKernel::submitted [inherited]
-
The timestamp when the kernel is submitted to the GPU, in ns. A value of CUPTI_TIMESTAMP_UNKNOWN indicates that the submission time is unknown.
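The grid, block, and timestamp fields of a CDP kernel record lend themselves to two simple derived values: the total number of threads launched and the delay between submission and start. The sketch below illustrates both under stated assumptions. ExampleCdpKernel holds only the fields it needs, kExampleTimestampUnknown is a placeholder for CUPTI_TIMESTAMP_UNKNOWN whose value is assumed here, and the function names are hypothetical.

```cpp
#include <cstdint>

// Placeholder for CUPTI_TIMESTAMP_UNKNOWN. The value 0 is an assumption of
// this sketch; real code should use the macro from the CUPTI headers.
constexpr uint64_t kExampleTimestampUnknown = 0;

// Stand-in with only the fields used below (not the full record layout).
struct ExampleCdpKernel {
    int32_t gridX, gridY, gridZ;
    int32_t blockX, blockY, blockZ;
    uint64_t submitted, start, end;  // ns
};

// Widen a dimension before multiplying so the product cannot overflow 32 bits.
inline uint64_t dim(int32_t v) { return static_cast<uint64_t>(v); }

// Total threads launched: blocks in the grid times threads per block.
inline uint64_t totalThreads(const ExampleCdpKernel& k) {
    const uint64_t blocks = dim(k.gridX) * dim(k.gridY) * dim(k.gridZ);
    const uint64_t threadsPerBlock = dim(k.blockX) * dim(k.blockY) * dim(k.blockZ);
    return blocks * threadsPerBlock;
}

// Delay between submission and start, in ns. Returns false when the
// submission time is unknown or when start and end are both 0.
inline bool submitToStartNs(const ExampleCdpKernel& k, uint64_t* out) {
    const bool noExecTimes = (k.start == 0 && k.end == 0);
    if (k.submitted == kExampleTimestampUnknown || noExecTimes) {
        return false;
    }
    *out = k.start - k.submitted;
    return true;
}
```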
6.13. CUpti_ActivityContext Struct Reference
[CUPTI Activity API]
This activity record represents information about a context (CUPTI_ACTIVITY_KIND_CONTEXT).
Public Variables
- uint16_t computeApiKind
- uint32_t contextId
- uint32_t deviceId
- CUpti_ActivityKind kind
- uint16_t nullStreamId
Variables
- uint16_t CUpti_ActivityContext::computeApiKind [inherited]
- uint32_t CUpti_ActivityContext::contextId [inherited]
-
The context ID.
- uint32_t CUpti_ActivityContext::deviceId [inherited]
-
The device ID.
- CUpti_ActivityKind CUpti_ActivityContext::kind [inherited]
-
The activity record kind, must be CUPTI_ACTIVITY_KIND_CONTEXT.
- uint16_t CUpti_ActivityContext::nullStreamId [inherited]
-
The ID for the NULL stream in this context
6.14. CUpti_ActivityCudaEvent Struct Reference
[CUPTI Activity API]
This activity is used to track recorded events (CUPTI_ACTIVITY_KIND_CUDA_EVENT).
Public Variables
- uint32_t contextId
- uint32_t correlationId
- uint32_t eventId
- CUpti_ActivityKind kind
- uint32_t pad
- uint32_t streamId
Variables
- uint32_t CUpti_ActivityCudaEvent::contextId [inherited]
-
The ID of the context where the event was recorded.
- uint32_t CUpti_ActivityCudaEvent::correlationId [inherited]
-
The correlation ID of the API to which this result is associated.
- uint32_t CUpti_ActivityCudaEvent::eventId [inherited]
-
A unique event ID to identify the event record.
- CUpti_ActivityKind CUpti_ActivityCudaEvent::kind [inherited]
-
The activity record kind, must be CUPTI_ACTIVITY_KIND_CUDA_EVENT.
- uint32_t CUpti_ActivityCudaEvent::pad [inherited]
-
Undefined. Reserved for internal use.
- uint32_t CUpti_ActivityCudaEvent::streamId [inherited]
-
The compute stream where the event was recorded.
6.15. CUpti_ActivityDevice Struct Reference
[CUPTI Activity API]
This activity record represents information about a GPU device (CUPTI_ACTIVITY_KIND_DEVICE). Device activity is now reported using the CUpti_ActivityDevice5 activity record.
Public Variables
- uint32_t computeCapabilityMajor
- uint32_t computeCapabilityMinor
- uint32_t constantMemorySize
- uint32_t coreClockRate
- CUpti_ActivityFlag flags
- uint64_t globalMemoryBandwidth
- uint64_t globalMemorySize
- uint32_t id
- CUpti_ActivityKind kind
- uint32_t l2CacheSize
- uint32_t maxBlockDimX
- uint32_t maxBlockDimY
- uint32_t maxBlockDimZ
- uint32_t maxBlocksPerMultiprocessor
- uint32_t maxGridDimX
- uint32_t maxGridDimY
- uint32_t maxGridDimZ
- uint32_t maxIPC
- uint32_t maxRegistersPerBlock
- uint32_t maxSharedMemoryPerBlock
- uint32_t maxThreadsPerBlock
- uint32_t maxWarpsPerMultiprocessor
- const char * name
- uint32_t numMemcpyEngines
- uint32_t numMultiprocessors
- uint32_t numThreadsPerWarp
Variables
- uint32_t CUpti_ActivityDevice::computeCapabilityMajor [inherited]
-
Compute capability for the device, major number.
- uint32_t CUpti_ActivityDevice::computeCapabilityMinor [inherited]
-
Compute capability for the device, minor number.
- uint32_t CUpti_ActivityDevice::constantMemorySize [inherited]
-
The amount of constant memory on the device, in bytes.
- uint32_t CUpti_ActivityDevice::coreClockRate [inherited]
-
The core clock rate of the device, in kHz.
- CUpti_ActivityFlag CUpti_ActivityDevice::flags [inherited]
- uint64_t CUpti_ActivityDevice::globalMemoryBandwidth [inherited]
-
The global memory bandwidth available on the device, in kBytes/sec.
- uint64_t CUpti_ActivityDevice::globalMemorySize [inherited]
-
The amount of global memory on the device, in bytes.
- uint32_t CUpti_ActivityDevice::id [inherited]
-
The device ID.
- CUpti_ActivityKind CUpti_ActivityDevice::kind [inherited]
-
The activity record kind, must be CUPTI_ACTIVITY_KIND_DEVICE.
- uint32_t CUpti_ActivityDevice::l2CacheSize [inherited]
-
The size of the L2 cache on the device, in bytes.
- uint32_t CUpti_ActivityDevice::maxBlockDimX [inherited]
-
Maximum allowed X dimension for a block.
- uint32_t CUpti_ActivityDevice::maxBlockDimY [inherited]
-
Maximum allowed Y dimension for a block.
- uint32_t CUpti_ActivityDevice::maxBlockDimZ [inherited]
-
Maximum allowed Z dimension for a block.
- uint32_t CUpti_ActivityDevice::maxBlocksPerMultiprocessor [inherited]
-
Maximum number of blocks that can be present on a multiprocessor at any given time.
- uint32_t CUpti_ActivityDevice::maxGridDimX [inherited]
-
Maximum allowed X dimension for a grid.
- uint32_t CUpti_ActivityDevice::maxGridDimY [inherited]
-
Maximum allowed Y dimension for a grid.
- uint32_t CUpti_ActivityDevice::maxGridDimZ [inherited]
-
Maximum allowed Z dimension for a grid.
- uint32_t CUpti_ActivityDevice::maxIPC [inherited]
-
The maximum "instructions per cycle" possible on each device multiprocessor.
- uint32_t CUpti_ActivityDevice::maxRegistersPerBlock [inherited]
-
Maximum number of registers that can be allocated to a block.
- uint32_t CUpti_ActivityDevice::maxSharedMemoryPerBlock [inherited]
-
Maximum amount of shared memory that can be assigned to a block, in bytes.
- uint32_t CUpti_ActivityDevice::maxThreadsPerBlock [inherited]
-
Maximum number of threads allowed in a block.
- uint32_t CUpti_ActivityDevice::maxWarpsPerMultiprocessor [inherited]
-
Maximum number of warps that can be present on a multiprocessor at any given time.
- const char * CUpti_ActivityDevice::name [inherited]
-
The device name. This name is shared across all activity records representing instances of the device, and so should not be modified.
- uint32_t CUpti_ActivityDevice::numMemcpyEngines [inherited]
-
Number of memory copy engines on the device.
- uint32_t CUpti_ActivityDevice::numMultiprocessors [inherited]
-
Number of multiprocessors on the device.
- uint32_t CUpti_ActivityDevice::numThreadsPerWarp [inherited]
-
The number of threads per warp on the device.
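As a rough illustration of how the fields above combine, the following sketch derives a few per-device quantities. DemoDeviceInfo is a hypothetical stand-in that copies only a handful of the documented fields so that the example compiles without the CUPTI headers; it does not reproduce the real struct layout, and the helper names are made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in holding a subset of the CUpti_ActivityDevice fields
 * documented above. It is NOT the real CUPTI struct layout. */
typedef struct {
    uint32_t coreClockRate;             /* kHz, as documented */
    uint32_t maxWarpsPerMultiprocessor;
    uint32_t numThreadsPerWarp;
    uint32_t numMultiprocessors;
} DemoDeviceInfo;

/* Core clock in MHz (the record stores kHz). */
static double demo_core_clock_mhz(const DemoDeviceInfo *d)
{
    assert(d != NULL);
    return d->coreClockRate / 1000.0;
}

/* Maximum number of threads resident on one multiprocessor. */
static uint32_t demo_max_threads_per_sm(const DemoDeviceInfo *d)
{
    assert(d != NULL);
    return d->maxWarpsPerMultiprocessor * d->numThreadsPerWarp;
}

/* Maximum number of threads resident on the whole device. */
static uint64_t demo_max_resident_threads(const DemoDeviceInfo *d)
{
    return (uint64_t)demo_max_threads_per_sm(d) * d->numMultiprocessors;
}
```

In real code the same arithmetic would be applied to the fields of a device record delivered by the activity API rather than to a hand-filled struct.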
6.16. CUpti_ActivityDevice2 Struct Reference
[CUPTI Activity API]
This activity record represents information about a GPU device (CUPTI_ACTIVITY_KIND_DEVICE). Device activity is now reported using the CUpti_ActivityDevice5 activity record.
Public Variables
- uint32_t computeCapabilityMajor
- uint32_t computeCapabilityMinor
- uint32_t constantMemorySize
- uint32_t coreClockRate
- uint32_t eccEnabled
- CUpti_ActivityFlag flags
- uint64_t globalMemoryBandwidth
- uint64_t globalMemorySize
- uint32_t id
- CUpti_ActivityKind kind
- uint32_t l2CacheSize
- uint32_t maxBlockDimX
- uint32_t maxBlockDimY
- uint32_t maxBlockDimZ
- uint32_t maxBlocksPerMultiprocessor
- uint32_t maxGridDimX
- uint32_t maxGridDimY
- uint32_t maxGridDimZ
- uint32_t maxIPC
- uint32_t maxRegistersPerBlock
- uint32_t maxRegistersPerMultiprocessor
- uint32_t maxSharedMemoryPerBlock
- uint32_t maxSharedMemoryPerMultiprocessor
- uint32_t maxThreadsPerBlock
- uint32_t maxWarpsPerMultiprocessor
- const char * name
- uint32_t numMemcpyEngines
- uint32_t numMultiprocessors
- uint32_t numThreadsPerWarp
- uint32_t pad
- CUuuid uuid
Variables
- uint32_t CUpti_ActivityDevice2::computeCapabilityMajor [inherited]
-
Compute capability for the device, major number.
- uint32_t CUpti_ActivityDevice2::computeCapabilityMinor [inherited]
-
Compute capability for the device, minor number.
- uint32_t CUpti_ActivityDevice2::constantMemorySize [inherited]
-
The amount of constant memory on the device, in bytes.
- uint32_t CUpti_ActivityDevice2::coreClockRate [inherited]
-
The core clock rate of the device, in kHz.
- uint32_t CUpti_ActivityDevice2::eccEnabled [inherited]
-
ECC enabled flag for device
- CUpti_ActivityFlag CUpti_ActivityDevice2::flags [inherited]
- uint64_t CUpti_ActivityDevice2::globalMemoryBandwidth [inherited]
-
The global memory bandwidth available on the device, in kBytes/sec.
- uint64_t CUpti_ActivityDevice2::globalMemorySize [inherited]
-
The amount of global memory on the device, in bytes.
- uint32_t CUpti_ActivityDevice2::id [inherited]
-
The device ID.
- CUpti_ActivityKind CUpti_ActivityDevice2::kind [inherited]
-
The activity record kind, must be CUPTI_ACTIVITY_KIND_DEVICE.
- uint32_t CUpti_ActivityDevice2::l2CacheSize [inherited]
-
The size of the L2 cache on the device, in bytes.
- uint32_t CUpti_ActivityDevice2::maxBlockDimX [inherited]
-
Maximum allowed X dimension for a block.
- uint32_t CUpti_ActivityDevice2::maxBlockDimY [inherited]
-
Maximum allowed Y dimension for a block.
- uint32_t CUpti_ActivityDevice2::maxBlockDimZ [inherited]
-
Maximum allowed Z dimension for a block.
- uint32_t CUpti_ActivityDevice2::maxBlocksPerMultiprocessor [inherited]
-
Maximum number of blocks that can be present on a multiprocessor at any given time.
- uint32_t CUpti_ActivityDevice2::maxGridDimX [inherited]
-
Maximum allowed X dimension for a grid.
- uint32_t CUpti_ActivityDevice2::maxGridDimY [inherited]
-
Maximum allowed Y dimension for a grid.
- uint32_t CUpti_ActivityDevice2::maxGridDimZ [inherited]
-
Maximum allowed Z dimension for a grid.
- uint32_t CUpti_ActivityDevice2::maxIPC [inherited]
-
The maximum "instructions per cycle" possible on each device multiprocessor.
- uint32_t CUpti_ActivityDevice2::maxRegistersPerBlock [inherited]
-
Maximum number of registers that can be allocated to a block.
- uint32_t CUpti_ActivityDevice2::maxRegistersPerMultiprocessor [inherited]
-
Maximum number of 32-bit registers available per multiprocessor.
- uint32_t CUpti_ActivityDevice2::maxSharedMemoryPerBlock [inherited]
-
Maximum amount of shared memory that can be assigned to a block, in bytes.
- uint32_t CUpti_ActivityDevice2::maxSharedMemoryPerMultiprocessor [inherited]
-
Maximum amount of shared memory available per multiprocessor, in bytes.
- uint32_t CUpti_ActivityDevice2::maxThreadsPerBlock [inherited]
-
Maximum number of threads allowed in a block.
- uint32_t CUpti_ActivityDevice2::maxWarpsPerMultiprocessor [inherited]
-
Maximum number of warps that can be present on a multiprocessor at any given time.
- const char * CUpti_ActivityDevice2::name [inherited]
-
The device name. This name is shared across all activity records representing instances of the device, and so should not be modified.
- uint32_t CUpti_ActivityDevice2::numMemcpyEngines [inherited]
-
Number of memory copy engines on the device.
- uint32_t CUpti_ActivityDevice2::numMultiprocessors [inherited]
-
Number of multiprocessors on the device.
- uint32_t CUpti_ActivityDevice2::numThreadsPerWarp [inherited]
-
The number of threads per warp on the device.
- uint32_t CUpti_ActivityDevice2::pad [inherited]
-
Undefined. Reserved for internal use.
- CUuuid CUpti_ActivityDevice2::uuid [inherited]
-
The device UUID. This value is the globally unique immutable alphanumeric identifier of the device.
6.17. CUpti_ActivityDevice3 Struct Reference
[CUPTI Activity API]
This activity record represents information about a GPU device (CUPTI_ACTIVITY_KIND_DEVICE). Device activity is now reported using the CUpti_ActivityDevice5 activity record.
Public Variables
- uint32_t computeCapabilityMajor
- uint32_t computeCapabilityMinor
- uint32_t constantMemorySize
- uint32_t coreClockRate
- uint32_t eccEnabled
- CUpti_ActivityFlag flags
- uint64_t globalMemoryBandwidth
- uint64_t globalMemorySize
- uint32_t id
- uint8_t isCudaVisible
- CUpti_ActivityKind kind
- uint32_t l2CacheSize
- uint32_t maxBlockDimX
- uint32_t maxBlockDimY
- uint32_t maxBlockDimZ
- uint32_t maxBlocksPerMultiprocessor
- uint32_t maxGridDimX
- uint32_t maxGridDimY
- uint32_t maxGridDimZ
- uint32_t maxIPC
- uint32_t maxRegistersPerBlock
- uint32_t maxRegistersPerMultiprocessor
- uint32_t maxSharedMemoryPerBlock
- uint32_t maxSharedMemoryPerMultiprocessor
- uint32_t maxThreadsPerBlock
- uint32_t maxWarpsPerMultiprocessor
- const char * name
- uint32_t numMemcpyEngines
- uint32_t numMultiprocessors
- uint32_t numThreadsPerWarp
- uint32_t pad
- CUuuid uuid
Variables
- uint32_t CUpti_ActivityDevice3::computeCapabilityMajor [inherited]
-
Compute capability for the device, major number.
- uint32_t CUpti_ActivityDevice3::computeCapabilityMinor [inherited]
-
Compute capability for the device, minor number.
- uint32_t CUpti_ActivityDevice3::constantMemorySize [inherited]
-
The amount of constant memory on the device, in bytes.
- uint32_t CUpti_ActivityDevice3::coreClockRate [inherited]
-
The core clock rate of the device, in kHz.
- uint32_t CUpti_ActivityDevice3::eccEnabled [inherited]
-
ECC enabled flag for device
- CUpti_ActivityFlag CUpti_ActivityDevice3::flags [inherited]
- uint64_t CUpti_ActivityDevice3::globalMemoryBandwidth [inherited]
-
The global memory bandwidth available on the device, in kBytes/sec.
- uint64_t CUpti_ActivityDevice3::globalMemorySize [inherited]
-
The amount of global memory on the device, in bytes.
- uint32_t CUpti_ActivityDevice3::id [inherited]
-
The device ID.
- uint8_t CUpti_ActivityDevice3::isCudaVisible [inherited]
-
Flag to indicate whether the device is visible to CUDA. Users can set the device visibility using the CUDA_VISIBLE_DEVICES environment variable.
- CUpti_ActivityKind CUpti_ActivityDevice3::kind [inherited]
-
The activity record kind, must be CUPTI_ACTIVITY_KIND_DEVICE.
- uint32_t CUpti_ActivityDevice3::l2CacheSize [inherited]
-
The size of the L2 cache on the device, in bytes.
- uint32_t CUpti_ActivityDevice3::maxBlockDimX [inherited]
-
Maximum allowed X dimension for a block.
- uint32_t CUpti_ActivityDevice3::maxBlockDimY [inherited]
-
Maximum allowed Y dimension for a block.
- uint32_t CUpti_ActivityDevice3::maxBlockDimZ [inherited]
-
Maximum allowed Z dimension for a block.
- uint32_t CUpti_ActivityDevice3::maxBlocksPerMultiprocessor [inherited]
-
Maximum number of blocks that can be present on a multiprocessor at any given time.
- uint32_t CUpti_ActivityDevice3::maxGridDimX [inherited]
-
Maximum allowed X dimension for a grid.
- uint32_t CUpti_ActivityDevice3::maxGridDimY [inherited]
-
Maximum allowed Y dimension for a grid.
- uint32_t CUpti_ActivityDevice3::maxGridDimZ [inherited]
-
Maximum allowed Z dimension for a grid.
- uint32_t CUpti_ActivityDevice3::maxIPC [inherited]
-
The maximum "instructions per cycle" possible on each device multiprocessor.
- uint32_t CUpti_ActivityDevice3::maxRegistersPerBlock [inherited]
-
Maximum number of registers that can be allocated to a block.
- uint32_t CUpti_ActivityDevice3::maxRegistersPerMultiprocessor [inherited]
-
Maximum number of 32-bit registers available per multiprocessor.
- uint32_t CUpti_ActivityDevice3::maxSharedMemoryPerBlock [inherited]
-
Maximum amount of shared memory that can be assigned to a block, in bytes.
- uint32_t CUpti_ActivityDevice3::maxSharedMemoryPerMultiprocessor [inherited]
-
Maximum amount of shared memory available per multiprocessor, in bytes.
- uint32_t CUpti_ActivityDevice3::maxThreadsPerBlock [inherited]
-
Maximum number of threads allowed in a block.
- uint32_t CUpti_ActivityDevice3::maxWarpsPerMultiprocessor [inherited]
-
Maximum number of warps that can be present on a multiprocessor at any given time.
- const char * CUpti_ActivityDevice3::name [inherited]
-
The device name. This name is shared across all activity records representing instances of the device, and so should not be modified.
- uint32_t CUpti_ActivityDevice3::numMemcpyEngines [inherited]
-
Number of memory copy engines on the device.
- uint32_t CUpti_ActivityDevice3::numMultiprocessors [inherited]
-
Number of multiprocessors on the device.
- uint32_t CUpti_ActivityDevice3::numThreadsPerWarp [inherited]
-
The number of threads per warp on the device.
- uint32_t CUpti_ActivityDevice3::pad [inherited]
-
Undefined. Reserved for internal use.
- CUuuid CUpti_ActivityDevice3::uuid [inherited]
-
The device UUID. This value is the globally unique immutable alphanumeric identifier of the device.
6.18. CUpti_ActivityDevice4 Struct Reference
[CUPTI Activity API]
This activity record represents information about a GPU device (CUPTI_ACTIVITY_KIND_DEVICE). Device activity is now reported using the CUpti_ActivityDevice5 activity record.
Public Variables
- uint32_t computeCapabilityMajor
- uint32_t computeCapabilityMinor
- uint32_t computeInstanceId
- uint32_t constantMemorySize
- uint32_t coreClockRate
- uint32_t eccEnabled
- CUpti_ActivityFlag flags
- uint64_t globalMemoryBandwidth
- uint64_t globalMemorySize
- uint32_t gpuInstanceId
- uint32_t id
- uint8_t isCudaVisible
- uint8_t isMigEnabled
- CUpti_ActivityKind kind
- uint32_t l2CacheSize
- uint32_t maxBlockDimX
- uint32_t maxBlockDimY
- uint32_t maxBlockDimZ
- uint32_t maxBlocksPerMultiprocessor
- uint32_t maxGridDimX
- uint32_t maxGridDimY
- uint32_t maxGridDimZ
- uint32_t maxIPC
- uint32_t maxRegistersPerBlock
- uint32_t maxRegistersPerMultiprocessor
- uint32_t maxSharedMemoryPerBlock
- uint32_t maxSharedMemoryPerMultiprocessor
- uint32_t maxThreadsPerBlock
- uint32_t maxWarpsPerMultiprocessor
- CUuuid migUuid
- const char * name
- uint32_t numMemcpyEngines
- uint32_t numMultiprocessors
- uint32_t numThreadsPerWarp
- uint32_t pad
- CUuuid uuid
Variables
- uint32_t CUpti_ActivityDevice4::computeCapabilityMajor [inherited]
-
Compute capability for the device, major number.
- uint32_t CUpti_ActivityDevice4::computeCapabilityMinor [inherited]
-
Compute capability for the device, minor number.
- uint32_t CUpti_ActivityDevice4::computeInstanceId [inherited]
-
Compute instance ID for MIG-enabled devices. If MIG mode is disabled, the value is set to UINT32_MAX.
- uint32_t CUpti_ActivityDevice4::constantMemorySize [inherited]
-
The amount of constant memory on the device, in bytes.
- uint32_t CUpti_ActivityDevice4::coreClockRate [inherited]
-
The core clock rate of the device, in kHz.
- uint32_t CUpti_ActivityDevice4::eccEnabled [inherited]
-
ECC enabled flag for device
- CUpti_ActivityFlag CUpti_ActivityDevice4::flags [inherited]
- uint64_t CUpti_ActivityDevice4::globalMemoryBandwidth [inherited]
-
The global memory bandwidth available on the device, in kBytes/sec.
- uint64_t CUpti_ActivityDevice4::globalMemorySize [inherited]
-
The amount of global memory on the device, in bytes.
- uint32_t CUpti_ActivityDevice4::gpuInstanceId [inherited]
-
GPU instance ID for MIG-enabled devices. If MIG mode is disabled, the value is set to UINT32_MAX.
- uint32_t CUpti_ActivityDevice4::id [inherited]
-
The device ID.
- uint8_t CUpti_ActivityDevice4::isCudaVisible [inherited]
-
Flag to indicate whether the device is visible to CUDA. Users can set the device visibility using the CUDA_VISIBLE_DEVICES environment variable.
- uint8_t CUpti_ActivityDevice4::isMigEnabled [inherited]
-
MIG enabled flag for device
- CUpti_ActivityKind CUpti_ActivityDevice4::kind [inherited]
-
The activity record kind, must be CUPTI_ACTIVITY_KIND_DEVICE.
- uint32_t CUpti_ActivityDevice4::l2CacheSize [inherited]
-
The size of the L2 cache on the device, in bytes.
- uint32_t CUpti_ActivityDevice4::maxBlockDimX [inherited]
-
Maximum allowed X dimension for a block.
- uint32_t CUpti_ActivityDevice4::maxBlockDimY [inherited]
-
Maximum allowed Y dimension for a block.
- uint32_t CUpti_ActivityDevice4::maxBlockDimZ [inherited]
-
Maximum allowed Z dimension for a block.
- uint32_t CUpti_ActivityDevice4::maxBlocksPerMultiprocessor [inherited]
-
Maximum number of blocks that can be present on a multiprocessor at any given time.
- uint32_t CUpti_ActivityDevice4::maxGridDimX [inherited]
-
Maximum allowed X dimension for a grid.
- uint32_t CUpti_ActivityDevice4::maxGridDimY [inherited]
-
Maximum allowed Y dimension for a grid.
- uint32_t CUpti_ActivityDevice4::maxGridDimZ [inherited]
-
Maximum allowed Z dimension for a grid.
- uint32_t CUpti_ActivityDevice4::maxIPC [inherited]
-
The maximum "instructions per cycle" possible on each device multiprocessor.
- uint32_t CUpti_ActivityDevice4::maxRegistersPerBlock [inherited]
-
Maximum number of registers that can be allocated to a block.
- uint32_t CUpti_ActivityDevice4::maxRegistersPerMultiprocessor [inherited]
-
Maximum number of 32-bit registers available per multiprocessor.
- uint32_t CUpti_ActivityDevice4::maxSharedMemoryPerBlock [inherited]
-
Maximum amount of shared memory that can be assigned to a block, in bytes.
- uint32_t CUpti_ActivityDevice4::maxSharedMemoryPerMultiprocessor [inherited]
-
Maximum amount of shared memory available per multiprocessor, in bytes.
- uint32_t CUpti_ActivityDevice4::maxThreadsPerBlock [inherited]
-
Maximum number of threads allowed in a block.
- uint32_t CUpti_ActivityDevice4::maxWarpsPerMultiprocessor [inherited]
-
Maximum number of warps that can be present on a multiprocessor at any given time.
- CUuuid CUpti_ActivityDevice4::migUuid [inherited]
-
The MIG UUID. This value is the globally unique immutable alphanumeric identifier of the device.
- const char * CUpti_ActivityDevice4::name [inherited]
-
The device name. This name is shared across all activity records representing instances of the device, and so should not be modified.
- uint32_t CUpti_ActivityDevice4::numMemcpyEngines [inherited]
-
Number of memory copy engines on the device.
- uint32_t CUpti_ActivityDevice4::numMultiprocessors [inherited]
-
Number of multiprocessors on the device.
- uint32_t CUpti_ActivityDevice4::numThreadsPerWarp [inherited]
-
The number of threads per warp on the device.
- uint32_t CUpti_ActivityDevice4::pad [inherited]
-
Undefined. Reserved for internal use.
- CUuuid CUpti_ActivityDevice4::uuid [inherited]
-
The device UUID. This value is the globally unique immutable alphanumeric identifier of the device.
6.19. CUpti_ActivityDevice5 Struct Reference
[CUPTI Activity API]
This activity record represents information about a GPU device (CUPTI_ACTIVITY_KIND_DEVICE).
Public Variables
- uint32_t computeCapabilityMajor
- uint32_t computeCapabilityMinor
- uint32_t computeInstanceId
- uint32_t constantMemorySize
- uint32_t coreClockRate
- uint32_t eccEnabled
- CUpti_ActivityFlag flags
- uint64_t globalMemoryBandwidth
- uint64_t globalMemorySize
- uint32_t gpuInstanceId
- uint32_t id
- uint8_t isCudaVisible
- uint8_t isMigEnabled
- uint32_t isNumaNode
- CUpti_ActivityKind kind
- uint32_t l2CacheSize
- uint32_t maxBlockDimX
- uint32_t maxBlockDimY
- uint32_t maxBlockDimZ
- uint32_t maxBlocksPerMultiprocessor
- uint32_t maxGridDimX
- uint32_t maxGridDimY
- uint32_t maxGridDimZ
- uint32_t maxIPC
- uint32_t maxRegistersPerBlock
- uint32_t maxRegistersPerMultiprocessor
- uint32_t maxSharedMemoryPerBlock
- uint32_t maxSharedMemoryPerMultiprocessor
- uint32_t maxThreadsPerBlock
- uint32_t maxWarpsPerMultiprocessor
- CUuuid migUuid
- const char * name
- uint32_t numMemcpyEngines
- uint32_t numMultiprocessors
- uint32_t numThreadsPerWarp
- uint32_t numaId
- uint32_t pad
- CUuuid uuid
Variables
- uint32_t CUpti_ActivityDevice5::computeCapabilityMajor [inherited]
-
Compute capability for the device, major number.
- uint32_t CUpti_ActivityDevice5::computeCapabilityMinor [inherited]
-
Compute capability for the device, minor number.
- uint32_t CUpti_ActivityDevice5::computeInstanceId [inherited]
-
Compute instance ID for MIG-enabled devices. If MIG mode is disabled, the value is set to UINT32_MAX.
- uint32_t CUpti_ActivityDevice5::constantMemorySize [inherited]
-
The amount of constant memory on the device, in bytes.
- uint32_t CUpti_ActivityDevice5::coreClockRate [inherited]
-
The core clock rate of the device, in kHz.
- uint32_t CUpti_ActivityDevice5::eccEnabled [inherited]
-
ECC enabled flag for device
- CUpti_ActivityFlag CUpti_ActivityDevice5::flags [inherited]
- uint64_t CUpti_ActivityDevice5::globalMemoryBandwidth [inherited]
-
The global memory bandwidth available on the device, in kBytes/sec.
- uint64_t CUpti_ActivityDevice5::globalMemorySize [inherited]
-
The amount of global memory on the device, in bytes.
- uint32_t CUpti_ActivityDevice5::gpuInstanceId [inherited]
-
GPU instance ID for MIG-enabled devices. If MIG mode is disabled, the value is set to UINT32_MAX.
- uint32_t CUpti_ActivityDevice5::id [inherited]
-
The device ID.
- uint8_t CUpti_ActivityDevice5::isCudaVisible [inherited]
-
Flag to indicate whether the device is visible to CUDA. Users can set the device visibility using the CUDA_VISIBLE_DEVICES environment variable.
- uint8_t CUpti_ActivityDevice5::isMigEnabled [inherited]
-
MIG enabled flag for device
- uint32_t CUpti_ActivityDevice5::isNumaNode [inherited]
-
NUMA (non-uniform memory access) information for the device: indicates whether the GPU is a NUMA node or not.
- CUpti_ActivityKind CUpti_ActivityDevice5::kind [inherited]
-
The activity record kind, must be CUPTI_ACTIVITY_KIND_DEVICE.
- uint32_t CUpti_ActivityDevice5::l2CacheSize [inherited]
-
The size of the L2 cache on the device, in bytes.
- uint32_t CUpti_ActivityDevice5::maxBlockDimX [inherited]
-
Maximum allowed X dimension for a block.
- uint32_t CUpti_ActivityDevice5::maxBlockDimY [inherited]
-
Maximum allowed Y dimension for a block.
- uint32_t CUpti_ActivityDevice5::maxBlockDimZ [inherited]
-
Maximum allowed Z dimension for a block.
- uint32_t CUpti_ActivityDevice5::maxBlocksPerMultiprocessor [inherited]
-
Maximum number of blocks that can be present on a multiprocessor at any given time.
- uint32_t CUpti_ActivityDevice5::maxGridDimX [inherited]
-
Maximum allowed X dimension for a grid.
- uint32_t CUpti_ActivityDevice5::maxGridDimY [inherited]
-
Maximum allowed Y dimension for a grid.
- uint32_t CUpti_ActivityDevice5::maxGridDimZ [inherited]
-
Maximum allowed Z dimension for a grid.
- uint32_t CUpti_ActivityDevice5::maxIPC [inherited]
-
The maximum "instructions per cycle" possible on each device multiprocessor.
- uint32_t CUpti_ActivityDevice5::maxRegistersPerBlock [inherited]
-
Maximum number of registers that can be allocated to a block.
- uint32_t CUpti_ActivityDevice5::maxRegistersPerMultiprocessor [inherited]
-
Maximum number of 32-bit registers available per multiprocessor.
- uint32_t CUpti_ActivityDevice5::maxSharedMemoryPerBlock [inherited]
-
Maximum amount of shared memory that can be assigned to a block, in bytes.
- uint32_t CUpti_ActivityDevice5::maxSharedMemoryPerMultiprocessor [inherited]
-
Maximum amount of shared memory available per multiprocessor, in bytes.
- uint32_t CUpti_ActivityDevice5::maxThreadsPerBlock [inherited]
-
Maximum number of threads allowed in a block.
- uint32_t CUpti_ActivityDevice5::maxWarpsPerMultiprocessor [inherited]
-
Maximum number of warps that can be present on a multiprocessor at any given time.
- CUuuid CUpti_ActivityDevice5::migUuid [inherited]
-
The MIG UUID. This value is the globally unique immutable alphanumeric identifier of the device.
- const char * CUpti_ActivityDevice5::name [inherited]
-
The device name. This name is shared across all activity records representing instances of the device, and so should not be modified.
- uint32_t CUpti_ActivityDevice5::numMemcpyEngines [inherited]
-
Number of memory copy engines on the device.
- uint32_t CUpti_ActivityDevice5::numMultiprocessors [inherited]
-
Number of multiprocessors on the device.
- uint32_t CUpti_ActivityDevice5::numThreadsPerWarp [inherited]
-
The number of threads per warp on the device.
- uint32_t CUpti_ActivityDevice5::numaId [inherited]
-
NUMA (non-uniform memory access) information for the device: the NUMA node ID of the GPU memory. If the GPU is not a NUMA node, the value is invalidNumaId.
- uint32_t CUpti_ActivityDevice5::pad [inherited]
-
Undefined. Reserved for internal use.
- CUuuid CUpti_ActivityDevice5::uuid [inherited]
-
The device UUID. This value is the globally unique immutable alphanumeric identifier of the device.
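The uuid and migUuid fields are raw 16-byte identifiers, so they usually need formatting before they can be logged or compared with other tools' output. The sketch below prints one in the common 8-4-4-4-12 hexadecimal grouping. DemoUuid is a hypothetical stand-in for the 16-byte CUuuid type, and the grouping is a presentation choice for this example, not something the record itself prescribes.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the 16-byte CUuuid type. */
typedef struct {
    char bytes[16];
} DemoUuid;

/* Formats a UUID as 8-4-4-4-12 lowercase hex digits.
 * out must hold at least 37 bytes (36 characters plus the terminator).
 * Returns the number of characters written, excluding the terminator. */
static size_t demo_format_uuid(const DemoUuid *u, char out[37])
{
    static const char hex[] = "0123456789abcdef";
    size_t pos = 0;

    assert(u != NULL && out != NULL);
    for (size_t i = 0; i < 16; ++i) {
        unsigned char b = (unsigned char)u->bytes[i];
        /* A dash precedes bytes 4, 6, 8 and 10. */
        if (i == 4 || i == 6 || i == 8 || i == 10) {
            out[pos++] = '-';
        }
        out[pos++] = hex[b >> 4];
        out[pos++] = hex[b & 0x0f];
    }
    out[pos] = '\0';
    return pos;
}
```

Casting each byte to unsigned char before shifting avoids sign-extension problems on platforms where char is signed.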
6.20. CUpti_ActivityDeviceAttribute Struct Reference
[CUPTI Activity API]
This activity record represents information about a GPU device: either a CUpti_DeviceAttribute or CUdevice_attribute value (CUPTI_ACTIVITY_KIND_DEVICE_ATTRIBUTE).
Public Variables
- CUpti_ActivityDeviceAttribute::@23 attribute
- uint32_t deviceId
- CUpti_ActivityFlag flags
- CUpti_ActivityKind kind
- CUpti_ActivityDeviceAttribute::@24 value
Variables
- CUpti_ActivityDeviceAttribute::@23 CUpti_ActivityDeviceAttribute::attribute [inherited]
-
The attribute, either a CUpti_DeviceAttribute or CUdevice_attribute. The flag CUPTI_ACTIVITY_FLAG_DEVICE_ATTRIBUTE_CUDEVICE indicates what kind of attribute this is. If CUPTI_ACTIVITY_FLAG_DEVICE_ATTRIBUTE_CUDEVICE is set, the CUdevice_attribute field is valid; otherwise the CUpti_DeviceAttribute field is valid.
- uint32_t CUpti_ActivityDeviceAttribute::deviceId [inherited]
-
The ID of the device that this attribute applies to.
- CUpti_ActivityFlag CUpti_ActivityDeviceAttribute::flags [inherited]
- CUpti_ActivityKind CUpti_ActivityDeviceAttribute::kind [inherited]
-
The activity record kind, must be CUPTI_ACTIVITY_KIND_DEVICE_ATTRIBUTE.
- CUpti_ActivityDeviceAttribute::@24 CUpti_ActivityDeviceAttribute::value [inherited]
-
The value for the attribute. See CUpti_DeviceAttribute and CUdevice_attribute for the type of the value for a given attribute.
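The attribute field is a union whose active member is selected by a bit in flags. The sketch below shows that pattern in isolation. Everything in it is a hypothetical stand-in: the flag constant, the union member names cu and cupti, and the use of int for both enumerations are assumptions made so the example compiles without the CUPTI headers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for CUPTI_ACTIVITY_FLAG_DEVICE_ATTRIBUTE_CUDEVICE.
 * The real constant is defined in the CUPTI headers; this bit position is
 * an assumption made for the sketch. */
#define DEMO_FLAG_DEVICE_ATTRIBUTE_CUDEVICE 0x1u

/* Hypothetical stand-in for the flags/attribute pair described above. */
typedef struct {
    uint32_t flags;
    union {
        int cu;     /* stands in for a CUdevice_attribute value */
        int cupti;  /* stands in for a CUpti_DeviceAttribute value */
    } attribute;
} DemoDeviceAttribute;

typedef enum {
    DEMO_ATTR_FROM_CUPTI = 0,
    DEMO_ATTR_FROM_CUDEVICE = 1
} DemoAttrSource;

/* Reports which union member is valid, based on the flag bit. */
static DemoAttrSource demo_attr_source(const DemoDeviceAttribute *r)
{
    assert(r != NULL);
    return (r->flags & DEMO_FLAG_DEVICE_ATTRIBUTE_CUDEVICE)
        ? DEMO_ATTR_FROM_CUDEVICE
        : DEMO_ATTR_FROM_CUPTI;
}

/* Reads the attribute identifier from whichever member is valid. */
static int demo_attr_id(const DemoDeviceAttribute *r)
{
    return demo_attr_source(r) == DEMO_ATTR_FROM_CUDEVICE
        ? r->attribute.cu
        : r->attribute.cupti;
}
```

Reading only the member selected by the flag keeps the consumer from interpreting an identifier from one enumeration as a value of the other.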
6.21. CUpti_ActivityEnvironment Struct Reference
[CUPTI Activity API]
This activity record provides CUPTI environmental data, including power, clocks, and thermals. This information is sampled at various rates and returned in this activity record. The consumer of the record needs to check the environmentKind field to determine what kind of environmental record this is.
Public Variables
- CUpti_EnvironmentClocksThrottleReason clocksThrottleReasons
- CUpti_ActivityEnvironment::@25::@29 cooling
- uint32_t deviceId
- CUpti_ActivityEnvironmentKind environmentKind
- uint32_t fanSpeed
- uint32_t gpuTemperature
- CUpti_ActivityKind kind
- uint32_t memoryClock
- uint32_t pcieLinkGen
- uint32_t pcieLinkWidth
- CUpti_ActivityEnvironment::@25::@28 power
- uint32_t power
- uint32_t powerLimit
- uint32_t smClock
- CUpti_ActivityEnvironment::@25::@26 speed
- CUpti_ActivityEnvironment::@25::@27 temperature
- uint64_t timestamp
Variables
- CUpti_EnvironmentClocksThrottleReason CUpti_ActivityEnvironment::clocksThrottleReasons [inherited]
-
The clocks throttle reasons.
- CUpti_ActivityEnvironment::@25::@29 CUpti_ActivityEnvironment::cooling [inherited]
-
Data returned for CUPTI_ACTIVITY_ENVIRONMENT_COOLING environment kind.
- uint32_t CUpti_ActivityEnvironment::deviceId [inherited]
-
The ID of the device
- CUpti_ActivityEnvironmentKind CUpti_ActivityEnvironment::environmentKind [inherited]
-
The kind of data reported in this record.
- uint32_t CUpti_ActivityEnvironment::fanSpeed [inherited]
-
The fan speed as percentage of maximum.
- uint32_t CUpti_ActivityEnvironment::gpuTemperature [inherited]
-
The GPU temperature in degrees C.
- CUpti_ActivityKind CUpti_ActivityEnvironment::kind [inherited]
-
The activity record kind, must be CUPTI_ACTIVITY_KIND_ENVIRONMENT.
- uint32_t CUpti_ActivityEnvironment::memoryClock [inherited]
-
The memory frequency in MHz
- uint32_t CUpti_ActivityEnvironment::pcieLinkGen [inherited]
-
The PCIe link generation.
- uint32_t CUpti_ActivityEnvironment::pcieLinkWidth [inherited]
-
The PCIe link width.
- CUpti_ActivityEnvironment::@25::@28 CUpti_ActivityEnvironment::power [inherited]
-
Data returned for CUPTI_ACTIVITY_ENVIRONMENT_POWER environment kind.
- uint32_t CUpti_ActivityEnvironment::power [inherited]
-
The power in milliwatts consumed by GPU and associated circuitry.
- uint32_t CUpti_ActivityEnvironment::powerLimit [inherited]
-
The power in milliwatts that will trigger power management algorithm.
- uint32_t CUpti_ActivityEnvironment::smClock [inherited]
-
The SM frequency in MHz
- CUpti_ActivityEnvironment::@25::@26 CUpti_ActivityEnvironment::speed [inherited]
-
Data returned for CUPTI_ACTIVITY_ENVIRONMENT_SPEED environment kind.
- CUpti_ActivityEnvironment::@25::@27 CUpti_ActivityEnvironment::temperature [inherited]
-
Data returned for CUPTI_ACTIVITY_ENVIRONMENT_TEMPERATURE environment kind.
- uint64_t CUpti_ActivityEnvironment::timestamp [inherited]
-
The timestamp when this sample was retrieved, in ns. A value of 0 indicates that timestamp information could not be collected for the sample.
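Because the payload of an environment record depends on environmentKind, a consumer typically switches on that field before touching the speed, temperature, power, or cooling data. The sketch below illustrates the dispatch. DemoEnvKind and DemoEnvRecord are hypothetical stand-ins (including the union member name data and the reduced field set), and the output strings are arbitrary choices for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for CUpti_ActivityEnvironmentKind. */
typedef enum {
    DEMO_ENV_SPEED,
    DEMO_ENV_TEMPERATURE,
    DEMO_ENV_POWER,
    DEMO_ENV_COOLING
} DemoEnvKind;

/* Hypothetical stand-in for the record; only one union member is valid,
 * selected by environmentKind. It is NOT the real CUPTI layout. */
typedef struct {
    DemoEnvKind environmentKind;
    union {
        struct { uint32_t smClock; uint32_t memoryClock; } speed;  /* MHz */
        struct { uint32_t gpuTemperature; } temperature;           /* deg C */
        struct { uint32_t power; uint32_t powerLimit; } power;     /* mW */
        struct { uint32_t fanSpeed; } cooling;                     /* percent */
    } data;
} DemoEnvRecord;

/* Writes a one-line description of the sample into buf.
 * Returns the snprintf result, or -1 for an unknown kind. */
static int demo_describe_env(const DemoEnvRecord *r, char *buf, size_t len)
{
    assert(r != NULL && buf != NULL);
    switch (r->environmentKind) {
    case DEMO_ENV_SPEED:
        return snprintf(buf, len, "SM %u MHz, memory %u MHz",
                        (unsigned)r->data.speed.smClock,
                        (unsigned)r->data.speed.memoryClock);
    case DEMO_ENV_TEMPERATURE:
        return snprintf(buf, len, "GPU %u C",
                        (unsigned)r->data.temperature.gpuTemperature);
    case DEMO_ENV_POWER:
        return snprintf(buf, len, "%u mW of %u mW limit",
                        (unsigned)r->data.power.power,
                        (unsigned)r->data.power.powerLimit);
    case DEMO_ENV_COOLING:
        return snprintf(buf, len, "fan %u%%",
                        (unsigned)r->data.cooling.fanSpeed);
    }
    return -1;
}
```

Returning -1 for an unrecognized kind lets the caller skip records of a kind added after the consumer was written.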
6.22. CUpti_ActivityEvent Struct Reference
[CUPTI Activity API]
This activity record represents a CUPTI event value (CUPTI_ACTIVITY_KIND_EVENT). This activity record kind is not produced by the activity API but is included for completeness and ease-of-use. Profile frameworks built on top of CUPTI that collect event data may choose to use this type to store the collected event data.
Public Variables
- uint32_t correlationId
- CUpti_EventDomainID domain
- CUpti_EventID id
- CUpti_ActivityKind kind
- uint64_t value
Variables
- uint32_t CUpti_ActivityEvent::correlationId [inherited]
-
The correlation ID of the event. Use of this ID is user-defined, but typically this ID value will equal the correlation ID of the kernel for which the event was gathered.
- CUpti_EventDomainID CUpti_ActivityEvent::domain [inherited]
-
The event domain ID.
- CUpti_EventID CUpti_ActivityEvent::id [inherited]
-
The event ID.
- CUpti_ActivityKind CUpti_ActivityEvent::kind [inherited]
-
The activity record kind, must be CUPTI_ACTIVITY_KIND_EVENT.
- uint64_t CUpti_ActivityEvent::value [inherited]
-
The event value.
6.23. CUpti_ActivityEventInstance Struct Reference
[CUPTI Activity API]
This activity record represents a CUPTI event value for a specific event domain instance (CUPTI_ACTIVITY_KIND_EVENT_INSTANCE). This activity record kind is not produced by the activity API but is included for completeness and ease-of-use. Profile frameworks built on top of CUPTI that collect event data may choose to use this type to store the collected event data. This activity record should be used when event domain instance information needs to be associated with the event.
Public Variables
- uint32_t correlationId
- CUpti_EventDomainID domain
- CUpti_EventID id
- uint32_t instance
- CUpti_ActivityKind kind
- uint32_t pad
- uint64_t value
Variables
- uint32_t CUpti_ActivityEventInstance::correlationId [inherited]
-
The correlation ID of the event. Use of this ID is user-defined, but typically this ID value will equal the correlation ID of the kernel for which the event was gathered.
- CUpti_EventDomainID CUpti_ActivityEventInstance::domain [inherited]
-
The event domain ID.
- CUpti_EventID CUpti_ActivityEventInstance::id [inherited]
-
The event ID.
- uint32_t CUpti_ActivityEventInstance::instance [inherited]
-
The event domain instance.
- CUpti_ActivityKind CUpti_ActivityEventInstance::kind [inherited]
-
The activity record kind, must be CUPTI_ACTIVITY_KIND_EVENT_INSTANCE.
- uint32_t CUpti_ActivityEventInstance::pad [inherited]
-
Undefined. Reserved for internal use.
- uint64_t CUpti_ActivityEventInstance::value [inherited]
-
The event value.
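Since this record type is meant for frameworks that store their own collected event data, a common follow-up step is to combine the per-instance values of one event. The sketch below sums them. DemoEventInstance is a hypothetical stand-in with three of the documented fields, the use of uint32_t for the event ID is an assumption, and summation is only one possible way to aggregate.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in with a subset of the fields documented above.
 * It is NOT the real CUpti_ActivityEventInstance layout. */
typedef struct {
    uint32_t id;        /* event ID (integer type assumed) */
    uint32_t instance;  /* event domain instance */
    uint64_t value;     /* event value for that instance */
} DemoEventInstance;

/* Sums the values of every record whose event ID equals eventId.
 * Returns 0 when no record matches. */
static uint64_t demo_total_for_event(const DemoEventInstance *recs,
                                     size_t count, uint32_t eventId)
{
    uint64_t total = 0;

    assert(recs != NULL || count == 0);
    for (size_t i = 0; i < count; ++i) {
        if (recs[i].id == eventId) {
            total += recs[i].value;
        }
    }
    return total;
}
```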
6.24. CUpti_ActivityExternalCorrelation Struct Reference
[CUPTI Activity API]
This activity record correlates native CUDA records (e.g. CUDA Driver API, kernels, memcpys, ...) with records from external APIs such as OpenACC (CUPTI_ACTIVITY_KIND_EXTERNAL_CORRELATION).
Public Variables
- uint32_t correlationId
- uint64_t externalId
- CUpti_ExternalCorrelationKind externalKind
- CUpti_ActivityKind kind
- uint32_t reserved
Variables
- uint32_t CUpti_ActivityExternalCorrelation::correlationId [inherited]
-
The correlation ID of the associated CUDA driver or runtime API record.
- uint64_t CUpti_ActivityExternalCorrelation::externalId [inherited]
-
The correlation ID of the associated non-CUDA API record. The exact field in the associated external record depends on that record's activity kind.
- CUpti_ExternalCorrelationKind CUpti_ActivityExternalCorrelation::externalKind [inherited]
-
The kind of external API this record is correlated to.
- CUpti_ActivityKind CUpti_ActivityExternalCorrelation::kind [inherited]
-
The kind of this activity.
- uint32_t CUpti_ActivityExternalCorrelation::reserved [inherited]
-
Undefined. Reserved for internal use.
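The correlationId and externalId fields described above pair a CUDA API record with a record from an external API. The following is a minimal sketch of such a lookup, not CUPTI code: ExternalLink and find_external_link are hypothetical names, and ExternalLink is a stand-in that holds only the two fields copied out of a CUpti_ActivityExternalCorrelation record.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in: holds only the two IDs copied from a
 * CUpti_ActivityExternalCorrelation record. It is not the CUPTI struct. */
typedef struct {
    uint32_t correlationId; /* ID of the CUDA driver or runtime API record */
    uint64_t externalId;    /* ID of the associated non-CUDA API record */
} ExternalLink;

/* Linear search for the link that matches a CUDA correlation ID.
 * Returns NULL when no external record was correlated with that ID. */
const ExternalLink *find_external_link(const ExternalLink *links,
                                       size_t count,
                                       uint32_t correlationId)
{
    for (size_t i = 0; i < count; ++i) {
        if (links[i].correlationId == correlationId) {
            return &links[i];
        }
    }
    return NULL;
}
```

A tool that collects many records would likely replace the linear search with a hash map keyed on correlationId; the search is kept linear here only for brevity.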
6.25. CUpti_ActivityFunction Struct Reference
[CUPTI Activity API]
This activity record holds a function name and the corresponding module information (CUPTI_ACTIVITY_KIND_FUNCTION).
Public Variables
- uint32_t contextId
- uint32_t functionIndex
- uint32_t id
- CUpti_ActivityKind kind
- uint32_t moduleId
- const char * name
Variables
- uint32_t CUpti_ActivityFunction::contextId [inherited]
-
The ID of the context where the function is launched.
- uint32_t CUpti_ActivityFunction::functionIndex [inherited]
-
The function's unique symbol index in the module.
- uint32_t CUpti_ActivityFunction::id [inherited]
-
The ID that uniquely identifies the record.
- CUpti_ActivityKind CUpti_ActivityFunction::kind [inherited]
-
The activity record kind, must be CUPTI_ACTIVITY_KIND_FUNCTION.
- uint32_t CUpti_ActivityFunction::moduleId [inherited]
-
The module ID in which this global/device function is present.
- const char * CUpti_ActivityFunction::name [inherited]
-
The name of the function. This name is shared across all activity records representing the same kernel, and so should not be modified.
6.26. CUpti_ActivityGlobalAccess Struct Reference
[CUPTI Activity API]
This activity records the locations of the global accesses in the source (CUPTI_ACTIVITY_KIND_GLOBAL_ACCESS). Global access activities are now reported using the CUpti_ActivityGlobalAccess3 activity record.
Public Variables
- uint32_t correlationId
- uint32_t executed
- CUpti_ActivityFlag flags
- CUpti_ActivityKind kind
- uint64_t l2_transactions
- uint32_t pcOffset
- uint32_t sourceLocatorId
- uint64_t threadsExecuted
Variables
- uint32_t CUpti_ActivityGlobalAccess::correlationId [inherited]
-
The correlation ID of the kernel to which this result is associated.
- uint32_t CUpti_ActivityGlobalAccess::executed [inherited]
-
The number of times this instruction was executed per warp. It is incremented when at least one thread in the warp is active with predicate and condition code evaluating to true.
- CUpti_ActivityFlag CUpti_ActivityGlobalAccess::flags [inherited]
-
The properties of this global access.
- CUpti_ActivityKind CUpti_ActivityGlobalAccess::kind [inherited]
-
The activity record kind, must be CUPTI_ACTIVITY_KIND_GLOBAL_ACCESS.
- uint64_t CUpti_ActivityGlobalAccess::l2_transactions [inherited]
-
The total number of 32-byte transactions to L2 cache generated by this access.
- uint32_t CUpti_ActivityGlobalAccess::pcOffset [inherited]
-
The pc offset for the access.
- uint32_t CUpti_ActivityGlobalAccess::sourceLocatorId [inherited]
-
The ID for source locator.
- uint64_t CUpti_ActivityGlobalAccess::threadsExecuted [inherited]
-
The number of threads that executed this instruction with predicate and condition code evaluating to true, accumulated over every execution of the instruction.
6.27. CUpti_ActivityGlobalAccess2 Struct Reference
[CUPTI Activity API]
This activity records the locations of the global accesses in the source (CUPTI_ACTIVITY_KIND_GLOBAL_ACCESS). Global access activities are now reported using the CUpti_ActivityGlobalAccess3 activity record.
Public Variables
- uint32_t correlationId
- uint32_t executed
- CUpti_ActivityFlag flags
- uint32_t functionId
- CUpti_ActivityKind kind
- uint64_t l2_transactions
- uint32_t pad
- uint32_t pcOffset
- uint32_t sourceLocatorId
- uint64_t theoreticalL2Transactions
- uint64_t threadsExecuted
Variables
- uint32_t CUpti_ActivityGlobalAccess2::correlationId [inherited]
-
The correlation ID of the kernel to which this result is associated.
- uint32_t CUpti_ActivityGlobalAccess2::executed [inherited]
-
The number of times this instruction was executed per warp. It is incremented when at least one thread in the warp is active with predicate and condition code evaluating to true.
- CUpti_ActivityFlag CUpti_ActivityGlobalAccess2::flags [inherited]
-
The properties of this global access.
- uint32_t CUpti_ActivityGlobalAccess2::functionId [inherited]
-
The ID used to correlate this record with the global/device function name.
- CUpti_ActivityKind CUpti_ActivityGlobalAccess2::kind [inherited]
-
The activity record kind, must be CUPTI_ACTIVITY_KIND_GLOBAL_ACCESS.
- uint64_t CUpti_ActivityGlobalAccess2::l2_transactions [inherited]
-
The total number of 32-byte transactions to L2 cache generated by this access.
- uint32_t CUpti_ActivityGlobalAccess2::pad [inherited]
-
Undefined. Reserved for internal use.
- uint32_t CUpti_ActivityGlobalAccess2::pcOffset [inherited]
-
The pc offset for the access.
- uint32_t CUpti_ActivityGlobalAccess2::sourceLocatorId [inherited]
-
The ID for source locator.
- uint64_t CUpti_ActivityGlobalAccess2::theoreticalL2Transactions [inherited]
-
The minimum number of L2 transactions possible based on the access pattern.
- uint64_t CUpti_ActivityGlobalAccess2::threadsExecuted [inherited]
-
The number of threads that executed this instruction with predicate and condition code evaluating to true, accumulated over every execution of the instruction.
6.28. CUpti_ActivityGlobalAccess3 Struct Reference
[CUPTI Activity API]
This activity records the locations of the global accesses in the source (CUPTI_ACTIVITY_KIND_GLOBAL_ACCESS).
Public Variables
- uint32_t correlationId
- uint32_t executed
- CUpti_ActivityFlag flags
- uint32_t functionId
- CUpti_ActivityKind kind
- uint64_t l2_transactions
- uint64_t pcOffset
- uint32_t sourceLocatorId
- uint64_t theoreticalL2Transactions
- uint64_t threadsExecuted
Variables
- uint32_t CUpti_ActivityGlobalAccess3::correlationId [inherited]
-
The correlation ID of the kernel to which this result is associated.
- uint32_t CUpti_ActivityGlobalAccess3::executed [inherited]
-
The number of times this instruction was executed per warp. It is incremented when at least one thread in the warp is active with predicate and condition code evaluating to true.
- CUpti_ActivityFlag CUpti_ActivityGlobalAccess3::flags [inherited]
-
The properties of this global access.
- uint32_t CUpti_ActivityGlobalAccess3::functionId [inherited]
-
The ID used to correlate this record with the global/device function name.
- CUpti_ActivityKind CUpti_ActivityGlobalAccess3::kind [inherited]
-
The activity record kind, must be CUPTI_ACTIVITY_KIND_GLOBAL_ACCESS.
- uint64_t CUpti_ActivityGlobalAccess3::l2_transactions [inherited]
-
The total number of 32-byte transactions to L2 cache generated by this access.
- uint64_t CUpti_ActivityGlobalAccess3::pcOffset [inherited]
-
The pc offset for the access.
- uint32_t CUpti_ActivityGlobalAccess3::sourceLocatorId [inherited]
-
The ID for source locator.
- uint64_t CUpti_ActivityGlobalAccess3::theoreticalL2Transactions [inherited]
-
The minimum number of L2 transactions possible based on the access pattern.
- uint64_t CUpti_ActivityGlobalAccess3::threadsExecuted [inherited]
-
The number of threads that executed this instruction with predicate and condition code evaluating to true, accumulated over every execution of the instruction.
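Because theoreticalL2Transactions is the minimum number of L2 transactions possible for the access pattern and l2_transactions is the number actually generated, their ratio gives a rough efficiency figure for a global access. The sketch below illustrates that arithmetic only. GlobalAccessCounts and global_access_efficiency_pct are hypothetical names, and the struct is a stand-in for the two fields read from a CUpti_ActivityGlobalAccess3 record.

```c
#include <stdint.h>

/* Hypothetical stand-in holding only the two counters used here; real code
 * would copy them from a CUpti_ActivityGlobalAccess3 record. */
typedef struct {
    uint64_t l2_transactions;           /* 32-byte L2 transactions generated */
    uint64_t theoreticalL2Transactions; /* minimum possible for the pattern */
} GlobalAccessCounts;

/* Returns theoretical / actual transactions as a percentage.
 * Returns -1.0 when no transactions were recorded, so the caller can
 * distinguish "no data" from a genuinely poor ratio. */
double global_access_efficiency_pct(const GlobalAccessCounts *a)
{
    if (a->l2_transactions == 0) {
        return -1.0;
    }
    return 100.0 * (double)a->theoreticalL2Transactions
                 / (double)a->l2_transactions;
}
```

For example, an access that generated 200 transactions where 50 would have sufficed yields 25 percent, which suggests an uncoalesced access pattern.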
6.29. CUpti_ActivityGraphTrace Struct Reference
[CUPTI Activity API]
This activity record represents the execution of a graph without giving visibility into the execution of its nodes. This is intended to reduce the overhead of tracing each node. The activity kind is CUPTI_ACTIVITY_KIND_GRAPH_TRACE.
Public Variables
- uint32_t contextId
- uint32_t correlationId
- uint32_t deviceId
- uint64_t end
- uint32_t graphId
- CUpti_ActivityKind kind
- void * reserved
- uint64_t start
- uint32_t streamId
Variables
- uint32_t CUpti_ActivityGraphTrace::contextId [inherited]
-
The ID of the context where the graph is being launched.
- uint32_t CUpti_ActivityGraphTrace::correlationId [inherited]
-
The correlation ID of the graph launch. Each graph launch is assigned a unique correlation ID that is identical to the correlation ID in the driver API activity record that launched the graph.
- uint32_t CUpti_ActivityGraphTrace::deviceId [inherited]
-
The ID of the device where the graph execution is occurring.
- uint64_t CUpti_ActivityGraphTrace::end [inherited]
-
The end timestamp for the graph execution, in ns. A value of 0 for both the start and end timestamps indicates that timestamp information could not be collected for the graph.
- uint32_t CUpti_ActivityGraphTrace::graphId [inherited]
-
The unique ID of the graph that is launched.
- CUpti_ActivityKind CUpti_ActivityGraphTrace::kind [inherited]
-
The activity record kind, must be CUPTI_ACTIVITY_KIND_GRAPH_TRACE.
- void * CUpti_ActivityGraphTrace::reserved [inherited]
-
This field is reserved for internal use.
- uint64_t CUpti_ActivityGraphTrace::start [inherited]
-
The start timestamp for the graph execution, in ns. A value of 0 for both the start and end timestamps indicates that timestamp information could not be collected for the graph.
- uint32_t CUpti_ActivityGraphTrace::streamId [inherited]
-
The ID of the stream where the graph is being launched.
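The start and end fields are nanosecond timestamps, and a value of 0 for both means that no timestamp could be collected. A duration computation therefore needs to check for that case before subtracting. The following is a minimal sketch under that reading. TraceInterval and interval_duration_ns are hypothetical names, and the struct stands in for the start and end fields of a CUpti_ActivityGraphTrace record.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the start/end pair of a trace record. */
typedef struct {
    uint64_t start; /* start timestamp, in ns */
    uint64_t end;   /* end timestamp, in ns */
} TraceInterval;

/* Writes the duration to *out_ns and returns true when it can be computed.
 * Returns false when both timestamps are 0 (not collected) or when the
 * interval is malformed (end earlier than start). */
bool interval_duration_ns(const TraceInterval *t, uint64_t *out_ns)
{
    if (t->start == 0 && t->end == 0) {
        return false;
    }
    if (t->end < t->start) {
        return false;
    }
    *out_ns = t->end - t->start;
    return true;
}
```

The same check applies to any record in this section whose start and end fields carry the "0 for both" convention.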
6.30. CUpti_ActivityInstantaneousEvent Struct Reference
[CUPTI Activity API]
This activity record represents a CUPTI event value (CUPTI_ACTIVITY_KIND_INSTANTANEOUS_EVENT) sampled at a particular instant. This activity record kind is not produced by the activity API but is included for completeness and ease-of-use. Profiler frameworks built on top of CUPTI that collect event data at a particular time may choose to use this type to store the collected event data.
Public Variables
- uint32_t deviceId
- CUpti_EventID id
- CUpti_ActivityKind kind
- uint32_t reserved
- uint64_t timestamp
- uint64_t value
Variables
- uint32_t CUpti_ActivityInstantaneousEvent::deviceId [inherited]
-
The device ID.
- CUpti_EventID CUpti_ActivityInstantaneousEvent::id [inherited]
-
The event ID.
- CUpti_ActivityKind CUpti_ActivityInstantaneousEvent::kind [inherited]
-
The activity record kind, must be CUPTI_ACTIVITY_KIND_INSTANTANEOUS_EVENT.
- uint32_t CUpti_ActivityInstantaneousEvent::reserved [inherited]
-
Undefined. Reserved for internal use.
- uint64_t CUpti_ActivityInstantaneousEvent::timestamp [inherited]
-
The timestamp at which the event was sampled.
- uint64_t CUpti_ActivityInstantaneousEvent::value [inherited]
-
The event value.
6.31. CUpti_ActivityInstantaneousEventInstance Struct Reference
[CUPTI Activity API]
This activity record represents a CUPTI event value for a specific event domain instance (CUPTI_ACTIVITY_KIND_INSTANTANEOUS_EVENT_INSTANCE) sampled at a particular instant. This activity record kind is not produced by the activity API but is included for completeness and ease-of-use. Profiler frameworks built on top of CUPTI that collect event data may choose to use this type to store the collected event data. This activity record should be used when event domain instance information needs to be associated with the event.
Public Variables
- uint32_t deviceId
- CUpti_EventID id
- uint8_t instance
- CUpti_ActivityKind kind
- uint8_t pad[3]
- uint64_t timestamp
- uint64_t value
Variables
- uint32_t CUpti_ActivityInstantaneousEventInstance::deviceId [inherited]
-
The device ID.
- CUpti_EventID CUpti_ActivityInstantaneousEventInstance::id [inherited]
-
The event ID.
- uint8_t CUpti_ActivityInstantaneousEventInstance::instance [inherited]
-
The event domain instance.
- CUpti_ActivityKind CUpti_ActivityInstantaneousEventInstance::kind [inherited]
-
The activity record kind, must be CUPTI_ACTIVITY_KIND_INSTANTANEOUS_EVENT_INSTANCE.
- uint8_t CUpti_ActivityInstantaneousEventInstance::pad[3] [inherited]
-
Undefined. Reserved for internal use.
- uint64_t CUpti_ActivityInstantaneousEventInstance::timestamp [inherited]
-
The timestamp at which the event was sampled.
- uint64_t CUpti_ActivityInstantaneousEventInstance::value [inherited]
-
The event value.
6.32. CUpti_ActivityInstantaneousMetric Struct Reference
[CUPTI Activity API]
This activity record represents the collection of a CUPTI metric value (CUPTI_ACTIVITY_KIND_INSTANTANEOUS_METRIC) at a particular instant. This activity record kind is not produced by the activity API but is included for completeness and ease-of-use. Profiler frameworks built on top of CUPTI that collect metric data may choose to use this type to store the collected metric data.
Public Variables
- uint32_t deviceId
- uint8_t flags
- CUpti_MetricID id
- CUpti_ActivityKind kind
- uint8_t pad[3]
- uint64_t timestamp
- union CUpti_MetricValue value
Variables
- uint32_t CUpti_ActivityInstantaneousMetric::deviceId [inherited]
-
The device ID.
- uint8_t CUpti_ActivityInstantaneousMetric::flags [inherited]
- CUpti_MetricID CUpti_ActivityInstantaneousMetric::id [inherited]
-
The metric ID.
- CUpti_ActivityKind CUpti_ActivityInstantaneousMetric::kind [inherited]
-
The activity record kind, must be CUPTI_ACTIVITY_KIND_INSTANTANEOUS_METRIC.
- uint8_t CUpti_ActivityInstantaneousMetric::pad[3] [inherited]
-
Undefined. Reserved for internal use.
- uint64_t CUpti_ActivityInstantaneousMetric::timestamp [inherited]
-
The timestamp at which the metric was sampled.
- union CUpti_MetricValue CUpti_ActivityInstantaneousMetric::value [inherited]
-
The metric value.
6.33. CUpti_ActivityInstantaneousMetricInstance Struct Reference
[CUPTI Activity API]
This activity record represents a CUPTI metric value for a specific metric domain instance (CUPTI_ACTIVITY_KIND_INSTANTANEOUS_METRIC_INSTANCE) sampled at a particular time. This activity record kind is not produced by the activity API but is included for completeness and ease-of-use. Profiler frameworks built on top of CUPTI that collect metric data may choose to use this type to store the collected metric data. This activity record should be used when metric domain instance information needs to be associated with the metric.
Public Variables
- uint32_t deviceId
- uint8_t flags
- CUpti_MetricID id
- uint8_t instance
- CUpti_ActivityKind kind
- uint8_t pad[2]
- uint64_t timestamp
- union CUpti_MetricValue value
Variables
- uint32_t CUpti_ActivityInstantaneousMetricInstance::deviceId [inherited]
-
The device ID.
- uint8_t CUpti_ActivityInstantaneousMetricInstance::flags [inherited]
- CUpti_MetricID CUpti_ActivityInstantaneousMetricInstance::id [inherited]
-
The metric ID.
- uint8_t CUpti_ActivityInstantaneousMetricInstance::instance [inherited]
-
The metric domain instance.
- CUpti_ActivityKind CUpti_ActivityInstantaneousMetricInstance::kind [inherited]
-
The activity record kind, must be CUPTI_ACTIVITY_KIND_INSTANTANEOUS_METRIC_INSTANCE.
- uint8_t CUpti_ActivityInstantaneousMetricInstance::pad[2] [inherited]
-
Undefined. Reserved for internal use.
- uint64_t CUpti_ActivityInstantaneousMetricInstance::timestamp [inherited]
-
The timestamp at which the metric was sampled.
- union CUpti_MetricValue CUpti_ActivityInstantaneousMetricInstance::value [inherited]
-
The metric value.
6.34. CUpti_ActivityInstructionCorrelation Struct Reference
[CUPTI Activity API]
This activity record holds source-level SASS/source correlation information (CUPTI_ACTIVITY_KIND_INSTRUCTION_CORRELATION).
Public Variables
- CUpti_ActivityFlag flags
- uint32_t functionId
- CUpti_ActivityKind kind
- uint32_t pad
- uint32_t pcOffset
- uint32_t sourceLocatorId
Variables
- CUpti_ActivityFlag CUpti_ActivityInstructionCorrelation::flags [inherited]
-
The properties of this instruction.
- uint32_t CUpti_ActivityInstructionCorrelation::functionId [inherited]
-
The ID used to correlate this record with the global/device function name.
- CUpti_ActivityKind CUpti_ActivityInstructionCorrelation::kind [inherited]
-
The activity record kind, must be CUPTI_ACTIVITY_KIND_INSTRUCTION_CORRELATION.
- uint32_t CUpti_ActivityInstructionCorrelation::pad [inherited]
-
Undefined. Reserved for internal use.
- uint32_t CUpti_ActivityInstructionCorrelation::pcOffset [inherited]
-
The pc offset for the instruction.
- uint32_t CUpti_ActivityInstructionCorrelation::sourceLocatorId [inherited]
-
The ID for source locator.
6.35. CUpti_ActivityInstructionExecution Struct Reference
[CUPTI Activity API]
This activity record holds the result for source-level instruction execution (CUPTI_ACTIVITY_KIND_INSTRUCTION_EXECUTION).
Public Variables
- uint32_t correlationId
- uint32_t executed
- CUpti_ActivityFlag flags
- uint32_t functionId
- CUpti_ActivityKind kind
- uint64_t notPredOffThreadsExecuted
- uint32_t pad
- uint32_t pcOffset
- uint32_t sourceLocatorId
- uint64_t threadsExecuted
Variables
- uint32_t CUpti_ActivityInstructionExecution::correlationId [inherited]
-
The correlation ID of the kernel to which this result is associated.
- uint32_t CUpti_ActivityInstructionExecution::executed [inherited]
-
The number of times this instruction was executed per warp. It will be incremented regardless of predicate or condition code.
- CUpti_ActivityFlag CUpti_ActivityInstructionExecution::flags [inherited]
-
The properties of this instruction execution.
- uint32_t CUpti_ActivityInstructionExecution::functionId [inherited]
-
The ID used to correlate this record with the global/device function name.
- CUpti_ActivityKind CUpti_ActivityInstructionExecution::kind [inherited]
-
The activity record kind, must be CUPTI_ACTIVITY_KIND_INSTRUCTION_EXECUTION.
- uint64_t CUpti_ActivityInstructionExecution::notPredOffThreadsExecuted [inherited]
-
The number of threads that executed this instruction with predicate and condition code evaluating to true, accumulated over every execution of the instruction.
- uint32_t CUpti_ActivityInstructionExecution::pad [inherited]
-
Undefined. Reserved for internal use.
- uint32_t CUpti_ActivityInstructionExecution::pcOffset [inherited]
-
The pc offset for the instruction.
- uint32_t CUpti_ActivityInstructionExecution::sourceLocatorId [inherited]
-
The ID for source locator.
- uint64_t CUpti_ActivityInstructionExecution::threadsExecuted [inherited]
-
The number of threads that executed this instruction, regardless of predicate or condition code, accumulated over every execution of the instruction.
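The three counters executed, threadsExecuted and notPredOffThreadsExecuted can be combined into two simple ratios: the average number of threads active per warp-level execution, and the fraction of those threads that were not predicated off. The sketch below shows one way to derive them. InstructionCounts and both function names are hypothetical, and the struct is a stand-in for fields read from a CUpti_ActivityInstructionExecution record.

```c
#include <stdint.h>

/* Hypothetical stand-in holding only the counters used below. */
typedef struct {
    uint32_t executed;                  /* warp-level executions */
    uint64_t threadsExecuted;           /* threads, regardless of predicate */
    uint64_t notPredOffThreadsExecuted; /* threads whose predicate was true */
} InstructionCounts;

/* Average number of threads per warp-level execution of the instruction.
 * Returns 0.0 when the instruction was never executed. */
double avg_threads_per_execution(const InstructionCounts *c)
{
    if (c->executed == 0) {
        return 0.0;
    }
    return (double)c->threadsExecuted / (double)c->executed;
}

/* Fraction of executing threads that were not predicated off.
 * Returns 0.0 when no thread executed the instruction. */
double predicated_on_fraction(const InstructionCounts *c)
{
    if (c->threadsExecuted == 0) {
        return 0.0;
    }
    return (double)c->notPredOffThreadsExecuted
         / (double)c->threadsExecuted;
}
```

A low average relative to the warp size points to divergence, while a low predicated-on fraction points to work that was issued but masked out.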
6.36. CUpti_ActivityJit Struct Reference
[CUPTI Activity API]
Public Variables
- const char * cachePath
- uint64_t cacheSize
- uint32_t correlationId
- uint32_t deviceId
- uint64_t end
- CUpti_ActivityJitEntryType jitEntryType
- uint64_t jitOperationCorrelationId
- CUpti_ActivityJitOperationType jitOperationType
- CUpti_ActivityKind kind
- uint32_t padding
- uint64_t start
Variables
- const char * CUpti_ActivityJit::cachePath [inherited]
-
The path where the fat binary is cached.
- uint64_t CUpti_ActivityJit::cacheSize [inherited]
-
The size of the compute cache.
- uint32_t CUpti_ActivityJit::correlationId [inherited]
-
The correlation ID of the JIT operation to which the records belong. Each JIT operation is assigned a unique correlation ID that is identical to the correlation ID in the driver or runtime API activity record that launched the JIT operation.
- uint32_t CUpti_ActivityJit::deviceId [inherited]
-
The device ID.
- uint64_t CUpti_ActivityJit::end [inherited]
-
The end timestamp for the JIT operation, in ns. A value of 0 for both the start and end timestamps indicates that timestamp information could not be collected for the JIT operation.
- CUpti_ActivityJitEntryType CUpti_ActivityJit::jitEntryType [inherited]
-
The JIT entry type.
- uint64_t CUpti_ActivityJit::jitOperationCorrelationId [inherited]
-
The correlation ID used to correlate JIT compilation, load and store operations. Each JIT compilation unit is assigned a unique correlation ID at the time of the JIT compilation. This correlation ID can be used to find the matching JIT cache load/store records.
- CUpti_ActivityJitOperationType CUpti_ActivityJit::jitOperationType [inherited]
-
The JIT operation type.
- CUpti_ActivityKind CUpti_ActivityJit::kind [inherited]
-
The activity record kind, must be CUPTI_ACTIVITY_KIND_JIT.
- uint32_t CUpti_ActivityJit::padding [inherited]
-
Internal use.
- uint64_t CUpti_ActivityJit::start [inherited]
-
The start timestamp for the JIT operation, in ns. A value of 0 for both the start and end timestamps indicates that timestamp information could not be collected for the JIT operation.
6.37. CUpti_ActivityKernel Struct Reference
[CUPTI Activity API]
This activity record represents a kernel execution (CUPTI_ACTIVITY_KIND_KERNEL and CUPTI_ACTIVITY_KIND_CONCURRENT_KERNEL) but is no longer generated by CUPTI. Kernel activities are now reported using the CUpti_ActivityKernel9 activity record.
Public Variables
- int32_t blockX
- int32_t blockY
- int32_t blockZ
- uint8_t cacheConfigExecuted
- uint8_t cacheConfigRequested
- uint32_t contextId
- uint32_t correlationId
- uint32_t deviceId
- int32_t dynamicSharedMemory
- uint64_t end
- int32_t gridX
- int32_t gridY
- int32_t gridZ
- CUpti_ActivityKind kind
- uint32_t localMemoryPerThread
- uint32_t localMemoryTotal
- const char * name
- uint32_t pad
- uint16_t registersPerThread
- void * reserved0
- uint32_t runtimeCorrelationId
- uint64_t start
- int32_t staticSharedMemory
- uint32_t streamId
Variables
- int32_t CUpti_ActivityKernel::blockX [inherited]
-
The X-dimension block size for the kernel.
- int32_t CUpti_ActivityKernel::blockY [inherited]
-
The Y-dimension block size for the kernel.
- int32_t CUpti_ActivityKernel::blockZ [inherited]
-
The Z-dimension block size for the kernel.
- uint8_t CUpti_ActivityKernel::cacheConfigExecuted [inherited]
-
The cache configuration used for the kernel. The value is one of the CUfunc_cache enumeration values from cuda.h.
- uint8_t CUpti_ActivityKernel::cacheConfigRequested [inherited]
-
The cache configuration requested by the kernel. The value is one of the CUfunc_cache enumeration values from cuda.h.
- uint32_t CUpti_ActivityKernel::contextId [inherited]
-
The ID of the context where the kernel is executing.
- uint32_t CUpti_ActivityKernel::correlationId [inherited]
-
The correlation ID of the kernel. Each kernel execution is assigned a unique correlation ID that is identical to the correlation ID in the driver API activity record that launched the kernel.
- uint32_t CUpti_ActivityKernel::deviceId [inherited]
-
The ID of the device where the kernel is executing.
- int32_t CUpti_ActivityKernel::dynamicSharedMemory [inherited]
-
The dynamic shared memory reserved for the kernel, in bytes.
- uint64_t CUpti_ActivityKernel::end [inherited]
-
The end timestamp for the kernel execution, in ns. A value of 0 for both the start and end timestamps indicates that timestamp information could not be collected for the kernel.
- int32_t CUpti_ActivityKernel::gridX [inherited]
-
The X-dimension grid size for the kernel.
- int32_t CUpti_ActivityKernel::gridY [inherited]
-
The Y-dimension grid size for the kernel.
- int32_t CUpti_ActivityKernel::gridZ [inherited]
-
The Z-dimension grid size for the kernel.
- CUpti_ActivityKind CUpti_ActivityKernel::kind [inherited]
-
The activity record kind, must be CUPTI_ACTIVITY_KIND_KERNEL or CUPTI_ACTIVITY_KIND_CONCURRENT_KERNEL.
- uint32_t CUpti_ActivityKernel::localMemoryPerThread [inherited]
-
The amount of local memory reserved for each thread, in bytes.
- uint32_t CUpti_ActivityKernel::localMemoryTotal [inherited]
-
The total amount of local memory reserved for the kernel, in bytes.
- const char * CUpti_ActivityKernel::name [inherited]
-
The name of the kernel. This name is shared across all activity records representing the same kernel, and so should not be modified.
- uint32_t CUpti_ActivityKernel::pad [inherited]
-
Undefined. Reserved for internal use.
- uint16_t CUpti_ActivityKernel::registersPerThread [inherited]
-
The number of registers required for each thread executing the kernel.
- void * CUpti_ActivityKernel::reserved0 [inherited]
-
Undefined. Reserved for internal use.
- uint32_t CUpti_ActivityKernel::runtimeCorrelationId [inherited]
-
The runtime correlation ID of the kernel. Each kernel execution is assigned a unique runtime correlation ID that is identical to the correlation ID in the runtime API activity record that launched the kernel.
- uint64_t CUpti_ActivityKernel::start [inherited]
-
The start timestamp for the kernel execution, in ns. A value of 0 for both the start and end timestamps indicates that timestamp information could not be collected for the kernel.
- int32_t CUpti_ActivityKernel::staticSharedMemory [inherited]
-
The static shared memory allocated for the kernel, in bytes.
- uint32_t CUpti_ActivityKernel::streamId [inherited]
-
The ID of the stream where the kernel is executing.
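The grid and block dimension fields together describe the launch geometry, so the total number of threads in a launch is the product of all six values. Because each field is a 32-bit integer, the product should be computed in 64 bits to avoid overflow. The following is a minimal sketch of that calculation. LaunchDims and both function names are hypothetical, and the struct is a stand-in for the gridX..blockZ fields of a kernel activity record.

```c
#include <stdint.h>

/* Hypothetical stand-in for the six dimension fields of a kernel record.
 * Values are assumed to be non-negative, as they are for a valid launch. */
typedef struct {
    int32_t gridX, gridY, gridZ;    /* grid size, in blocks */
    int32_t blockX, blockY, blockZ; /* block size, in threads */
} LaunchDims;

/* Threads in one block, widened to 64 bits before multiplying. */
uint64_t threads_per_block(const LaunchDims *d)
{
    return (uint64_t)d->blockX * (uint64_t)d->blockY * (uint64_t)d->blockZ;
}

/* Total threads in the launch: blocks in the grid times threads per block. */
uint64_t total_threads(const LaunchDims *d)
{
    uint64_t blocks =
        (uint64_t)d->gridX * (uint64_t)d->gridY * (uint64_t)d->gridZ;
    return blocks * threads_per_block(d);
}
```

For a launch with a 4x2x1 grid of 128x2x1 blocks, this gives 256 threads per block and 2048 threads in total.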
6.38. CUpti_ActivityKernel2 Struct Reference
[CUPTI Activity API]
This activity record represents a kernel execution (CUPTI_ACTIVITY_KIND_KERNEL and CUPTI_ACTIVITY_KIND_CONCURRENT_KERNEL) but is no longer generated by CUPTI. Kernel activities are now reported using the CUpti_ActivityKernel9 activity record.
Public Variables
- int32_t blockX
- int32_t blockY
- int32_t blockZ
- uint64_t completed
- uint32_t contextId
- uint32_t correlationId
- uint32_t deviceId
- int32_t dynamicSharedMemory
- uint64_t end
- uint8_t executed
- int64_t gridId
- int32_t gridX
- int32_t gridY
- int32_t gridZ
- CUpti_ActivityKind kind
- uint32_t localMemoryPerThread
- uint32_t localMemoryTotal
- const char * name
- uint16_t registersPerThread
- uint8_t requested
- void * reserved0
- uint8_t sharedMemoryConfig
- uint64_t start
- int32_t staticSharedMemory
- uint32_t streamId
Variables
- int32_t CUpti_ActivityKernel2::blockX [inherited]
-
The X-dimension block size for the kernel.
- int32_t CUpti_ActivityKernel2::blockY [inherited]
-
The Y-dimension block size for the kernel.
- int32_t CUpti_ActivityKernel2::blockZ [inherited]
-
The Z-dimension block size for the kernel.
- uint64_t CUpti_ActivityKernel2::completed [inherited]
-
The completed timestamp for the kernel execution, in ns. It represents the completion of all its child kernels and the kernel itself. A value of CUPTI_TIMESTAMP_UNKNOWN indicates that the completion time is unknown.
- uint32_t CUpti_ActivityKernel2::contextId [inherited]
-
The ID of the context where the kernel is executing.
- uint32_t CUpti_ActivityKernel2::correlationId [inherited]
-
The correlation ID of the kernel. Each kernel execution is assigned a unique correlation ID that is identical to the correlation ID in the driver or runtime API activity record that launched the kernel.
- uint32_t CUpti_ActivityKernel2::deviceId [inherited]
-
The ID of the device where the kernel is executing.
- int32_t CUpti_ActivityKernel2::dynamicSharedMemory [inherited]
-
The dynamic shared memory reserved for the kernel, in bytes.
- uint64_t CUpti_ActivityKernel2::end [inherited]
-
The end timestamp for the kernel execution, in ns. A value of 0 for both the start and end timestamps indicates that timestamp information could not be collected for the kernel.
- uint8_t CUpti_ActivityKernel2::executed [inherited]
-
The cache configuration used for the kernel. The value is one of the CUfunc_cache enumeration values from cuda.h.
- int64_t CUpti_ActivityKernel2::gridId [inherited]
-
The grid ID of the kernel. Each kernel is assigned a unique grid ID at runtime.
- int32_t CUpti_ActivityKernel2::gridX [inherited]
-
The X-dimension grid size for the kernel.
- int32_t CUpti_ActivityKernel2::gridY [inherited]
-
The Y-dimension grid size for the kernel.
- int32_t CUpti_ActivityKernel2::gridZ [inherited]
-
The Z-dimension grid size for the kernel.
- CUpti_ActivityKind CUpti_ActivityKernel2::kind [inherited]
-
The activity record kind, must be CUPTI_ACTIVITY_KIND_KERNEL or CUPTI_ACTIVITY_KIND_CONCURRENT_KERNEL.
- uint32_t CUpti_ActivityKernel2::localMemoryPerThread [inherited]
-
The amount of local memory reserved for each thread, in bytes.
- uint32_t CUpti_ActivityKernel2::localMemoryTotal [inherited]
-
The total amount of local memory reserved for the kernel, in bytes.
- const char * CUpti_ActivityKernel2::name [inherited]
-
The name of the kernel. This name is shared across all activity records representing the same kernel, and so should not be modified.
- uint16_t CUpti_ActivityKernel2::registersPerThread [inherited]
-
The number of registers required for each thread executing the kernel.
- uint8_t CUpti_ActivityKernel2::requested [inherited]
-
The cache configuration requested by the kernel. The value is one of the CUfunc_cache enumeration values from cuda.h.
- void * CUpti_ActivityKernel2::reserved0 [inherited]
-
Undefined. Reserved for internal use.
- uint8_t CUpti_ActivityKernel2::sharedMemoryConfig [inherited]
-
The shared memory configuration used for the kernel. The value is one of the CUsharedconfig enumeration values from cuda.h.
- uint64_t CUpti_ActivityKernel2::start [inherited]
-
The start timestamp for the kernel execution, in ns. A value of 0 for both the start and end timestamps indicates that timestamp information could not be collected for the kernel.
- int32_t CUpti_ActivityKernel2::staticSharedMemory [inherited]
-
The static shared memory allocated for the kernel, in bytes.
- uint32_t CUpti_ActivityKernel2::streamId [inherited]
-
The ID of the stream where the kernel is executing.
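Since completed covers the kernel and all of its child kernels while end covers the kernel alone, the difference between the two indicates how long child kernels kept running after the parent finished. The sketch below illustrates that subtraction under stated assumptions. KernelTimes and child_tail_ns are hypothetical names, and the sentinel for an unknown timestamp is passed in by the caller (in real code, CUPTI_TIMESTAMP_UNKNOWN from the CUPTI headers) rather than hard-coded here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the end/completed pair of a kernel record. */
typedef struct {
    uint64_t end;       /* end of the kernel itself, in ns */
    uint64_t completed; /* completion of the kernel and its children, in ns */
} KernelTimes;

/* Writes (completed - end) to *out_ns and returns true when both values
 * are usable. unknown_sentinel is the value that marks an unknown
 * completion time; the caller supplies it so that this sketch does not
 * assume a particular constant. */
bool child_tail_ns(const KernelTimes *k, uint64_t unknown_sentinel,
                   uint64_t *out_ns)
{
    if (k->completed == unknown_sentinel || k->completed < k->end) {
        return false;
    }
    *out_ns = k->completed - k->end;
    return true;
}
```

A result of 0 means that no child kernel outlived its parent, which is the expected outcome for kernels that do not use CUDA Dynamic Parallelism.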
6.39. CUpti_ActivityKernel3 Struct Reference
[CUPTI Activity API]
This activity record represents a kernel execution (CUPTI_ACTIVITY_KIND_KERNEL and CUPTI_ACTIVITY_KIND_CONCURRENT_KERNEL). Kernel activities are now reported using the CUpti_ActivityKernel9 activity record.
Public Variables
- int32_t blockX
- int32_t blockY
- int32_t blockZ
- uint64_t completed
- uint32_t contextId
- uint32_t correlationId
- uint32_t deviceId
- int32_t dynamicSharedMemory
- uint64_t end
- uint8_t executed
- int64_t gridId
- int32_t gridX
- int32_t gridY
- int32_t gridZ
- CUpti_ActivityKind kind
- uint32_t localMemoryPerThread
- uint32_t localMemoryTotal
- const char * name
- CUpti_ActivityPartitionedGlobalCacheConfig partitionedGlobalCacheExecuted
- CUpti_ActivityPartitionedGlobalCacheConfig partitionedGlobalCacheRequested
- uint16_t registersPerThread
- uint8_t requested
- void * reserved0
- uint8_t sharedMemoryConfig
- uint64_t start
- int32_t staticSharedMemory
- uint32_t streamId
Variables
- int32_t CUpti_ActivityKernel3::blockX [inherited]
-
The X-dimension block size for the kernel.
- int32_t CUpti_ActivityKernel3::blockY [inherited]
-
The Y-dimension block size for the kernel.
- int32_t CUpti_ActivityKernel3::blockZ [inherited]
-
The Z-dimension block size for the kernel.
- uint64_t CUpti_ActivityKernel3::completed [inherited]
-
The completed timestamp for the kernel execution, in ns. It represents the completion of all its child kernels and the kernel itself. A value of CUPTI_TIMESTAMP_UNKNOWN indicates that the completion time is unknown.
- uint32_t CUpti_ActivityKernel3::contextId [inherited]
-
The ID of the context where the kernel is executing.
- uint32_t CUpti_ActivityKernel3::correlationId [inherited]
-
The correlation ID of the kernel. Each kernel execution is assigned a unique correlation ID that is identical to the correlation ID in the driver or runtime API activity record that launched the kernel.
- uint32_t CUpti_ActivityKernel3::deviceId [inherited]
-
The ID of the device where the kernel is executing.
- int32_t CUpti_ActivityKernel3::dynamicSharedMemory [inherited]
-
The dynamic shared memory reserved for the kernel, in bytes.
- uint64_t CUpti_ActivityKernel3::end [inherited]
-
The end timestamp for the kernel execution, in ns. A value of 0 for both the start and end timestamps indicates that timestamp information could not be collected for the kernel.
- uint8_t CUpti_ActivityKernel3::executed [inherited]
-
The cache configuration used for the kernel. The value is one of the CUfunc_cache enumeration values from cuda.h.
- int64_t CUpti_ActivityKernel3::gridId [inherited]
-
The grid ID of the kernel. Each kernel is assigned a unique grid ID at runtime.
- int32_t CUpti_ActivityKernel3::gridX [inherited]
-
The X-dimension grid size for the kernel.
- int32_t CUpti_ActivityKernel3::gridY [inherited]
-
The Y-dimension grid size for the kernel.
- int32_t CUpti_ActivityKernel3::gridZ [inherited]
-
The Z-dimension grid size for the kernel.
- CUpti_ActivityKind CUpti_ActivityKernel3::kind [inherited]
-
The activity record kind, must be CUPTI_ACTIVITY_KIND_KERNEL or CUPTI_ACTIVITY_KIND_CONCURRENT_KERNEL.
- uint32_t CUpti_ActivityKernel3::localMemoryPerThread [inherited]
-
The amount of local memory reserved for each thread, in bytes.
- uint32_t CUpti_ActivityKernel3::localMemoryTotal [inherited]
-
The total amount of local memory reserved for the kernel, in bytes.
- const char * CUpti_ActivityKernel3::name [inherited]
-
The name of the kernel. This name is shared across all activity records representing the same kernel, and so should not be modified.
- CUpti_ActivityPartitionedGlobalCacheConfig CUpti_ActivityKernel3::partitionedGlobalCacheExecuted [inherited]
-
The partitioned global caching executed for the kernel. Partitioned global caching is required to enable caching on certain chips, such as devices with compute capability 5.2. Partitioned global caching can be automatically disabled if the occupancy requirement of the launch cannot support caching.
- CUpti_ActivityPartitionedGlobalCacheConfig CUpti_ActivityKernel3::partitionedGlobalCacheRequested [inherited]
-
The partitioned global caching requested for the kernel. Partitioned global caching is required to enable caching on certain chips, such as devices with compute capability 5.2.
- uint16_t CUpti_ActivityKernel3::registersPerThread [inherited]
-
The number of registers required for each thread executing the kernel.
- uint8_t CUpti_ActivityKernel3::requested [inherited]
-
The cache configuration requested by the kernel. The value is one of the CUfunc_cache enumeration values from cuda.h.
- void * CUpti_ActivityKernel3::reserved0 [inherited]
-
Undefined. Reserved for internal use.
- uint8_t CUpti_ActivityKernel3::sharedMemoryConfig [inherited]
-
The shared memory configuration used for the kernel. The value is one of the CUsharedconfig enumeration values from cuda.h.
- uint64_t CUpti_ActivityKernel3::start [inherited]
-
The start timestamp for the kernel execution, in ns. A value of 0 for both the start and end timestamps indicates that timestamp information could not be collected for the kernel.
- int32_t CUpti_ActivityKernel3::staticSharedMemory [inherited]
-
The static shared memory allocated for the kernel, in bytes.
- uint32_t CUpti_ActivityKernel3::streamId [inherited]
-
The ID of the stream where the kernel is executing.
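The timestamp and dimension fields above are often combined when post-processing records. The following is a minimal sketch, not CUPTI code: KernelLaunchView is a hypothetical stand-in that copies only the fields it needs, and the helper names are invented for illustration. Real code would include cupti.h and read the same fields from the CUpti_ActivityKernel3 record.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in with only the fields used below. Real code would
   read these fields from a CUpti_ActivityKernel3 record. */
typedef struct {
    uint64_t start;                 /* start timestamp, in ns */
    uint64_t end;                   /* end timestamp, in ns */
    int32_t gridX, gridY, gridZ;    /* grid size in blocks */
    int32_t blockX, blockY, blockZ; /* block size in threads */
} KernelLaunchView;

/* Writes end - start to *out_ns and returns 1. Returns 0 when both
   timestamps are 0, which means timing could not be collected. */
static int kernel_duration_ns(const KernelLaunchView *k, uint64_t *out_ns)
{
    assert(k != NULL && out_ns != NULL);
    if (k->start == 0 && k->end == 0) {
        return 0;
    }
    *out_ns = k->end - k->start;
    return 1;
}

/* Total threads launched: blocks in the grid times threads per block.
   Operands are widened to 64 bits so large launches do not overflow. */
static uint64_t kernel_total_threads(const KernelLaunchView *k)
{
    assert(k != NULL);
    uint64_t blocks =
        (uint64_t)k->gridX * (uint64_t)k->gridY * (uint64_t)k->gridZ;
    uint64_t threads_per_block =
        (uint64_t)k->blockX * (uint64_t)k->blockY * (uint64_t)k->blockZ;
    return blocks * threads_per_block;
}
```

Checking the return value of kernel_duration_ns before using the duration keeps records without timing information out of any aggregate statistics.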
6.40. CUpti_ActivityKernel4 Struct Reference
[CUPTI Activity API]
This activity record represents a kernel execution (CUPTI_ACTIVITY_KIND_KERNEL and CUPTI_ACTIVITY_KIND_CONCURRENT_KERNEL). Kernel activities are now reported using the CUpti_ActivityKernel9 activity record.
Public Variables
- int32_t blockX
- int32_t blockY
- int32_t blockZ
- CUpti_ActivityKernel4::@9 cacheConfig
- uint64_t completed
- uint32_t contextId
- uint32_t correlationId
- uint32_t deviceId
- int32_t dynamicSharedMemory
- uint64_t end
- uint8_t executed
- int64_t gridId
- int32_t gridX
- int32_t gridY
- int32_t gridZ
- uint8_t isSharedMemoryCarveoutRequested
- CUpti_ActivityKind kind
- uint8_t launchType
- uint32_t localMemoryPerThread
- uint32_t localMemoryTotal
- const char * name
- uint8_t padding
- CUpti_ActivityPartitionedGlobalCacheConfig partitionedGlobalCacheExecuted
- CUpti_ActivityPartitionedGlobalCacheConfig partitionedGlobalCacheRequested
- uint64_t queued
- uint16_t registersPerThread
- uint8_t requested
- void * reserved0
- uint8_t sharedMemoryCarveoutRequested
- uint8_t sharedMemoryConfig
- uint32_t sharedMemoryExecuted
- uint64_t start
- int32_t staticSharedMemory
- uint32_t streamId
- uint64_t submitted
Variables
- int32_t CUpti_ActivityKernel4::blockX [inherited]
-
The X-dimension block size for the kernel.
- int32_t CUpti_ActivityKernel4::blockY [inherited]
-
The Y-dimension block size for the kernel.
- int32_t CUpti_ActivityKernel4::blockZ [inherited]
-
The Z-dimension block size for the kernel.
- CUpti_ActivityKernel4::@9 CUpti_ActivityKernel4::cacheConfig [inherited]
-
For devices with compute capability 7.0 and higher, the cacheConfig values are not updated if the isSharedMemoryCarveoutRequested field is set.
- uint64_t CUpti_ActivityKernel4::completed [inherited]
-
The completed timestamp for the kernel execution, in ns. It represents the completion of all its child kernels and the kernel itself. A value of CUPTI_TIMESTAMP_UNKNOWN indicates that the completion time is unknown.
- uint32_t CUpti_ActivityKernel4::contextId [inherited]
-
The ID of the context where the kernel is executing.
- uint32_t CUpti_ActivityKernel4::correlationId [inherited]
-
The correlation ID of the kernel. Each kernel execution is assigned a unique correlation ID that is identical to the correlation ID in the driver or runtime API activity record that launched the kernel.
- uint32_t CUpti_ActivityKernel4::deviceId [inherited]
-
The ID of the device where the kernel is executing.
- int32_t CUpti_ActivityKernel4::dynamicSharedMemory [inherited]
-
The dynamic shared memory reserved for the kernel, in bytes.
- uint64_t CUpti_ActivityKernel4::end [inherited]
-
The end timestamp for the kernel execution, in ns. A value of 0 for both the start and end timestamps indicates that timestamp information could not be collected for the kernel.
- uint8_t CUpti_ActivityKernel4::executed [inherited]
-
The cache configuration used for the kernel. The value is one of the CUfunc_cache enumeration values from cuda.h.
- int64_t CUpti_ActivityKernel4::gridId [inherited]
-
The grid ID of the kernel. Each kernel is assigned a unique grid ID at runtime.
- int32_t CUpti_ActivityKernel4::gridX [inherited]
-
The X-dimension grid size for the kernel.
- int32_t CUpti_ActivityKernel4::gridY [inherited]
-
The Y-dimension grid size for the kernel.
- int32_t CUpti_ActivityKernel4::gridZ [inherited]
-
The Z-dimension grid size for the kernel.
- uint8_t CUpti_ActivityKernel4::isSharedMemoryCarveoutRequested [inherited]
-
This indicates whether CU_FUNC_ATTRIBUTE_PREFERRED_SHARED_MEMORY_CARVEOUT was updated for the kernel launch.
- CUpti_ActivityKind CUpti_ActivityKernel4::kind [inherited]
-
The activity record kind, must be CUPTI_ACTIVITY_KIND_KERNEL or CUPTI_ACTIVITY_KIND_CONCURRENT_KERNEL.
- uint8_t CUpti_ActivityKernel4::launchType [inherited]
-
This indicates whether the kernel was executed via a regular launch or via a single-device or multi-device cooperative launch.
- uint32_t CUpti_ActivityKernel4::localMemoryPerThread [inherited]
-
The amount of local memory reserved for each thread, in bytes.
- uint32_t CUpti_ActivityKernel4::localMemoryTotal [inherited]
-
The total amount of local memory reserved for the kernel, in bytes.
- const char * CUpti_ActivityKernel4::name [inherited]
-
The name of the kernel. This name is shared across all activity records representing the same kernel, and so should not be modified.
- uint8_t CUpti_ActivityKernel4::padding [inherited]
-
Undefined. Reserved for internal use.
- CUpti_ActivityPartitionedGlobalCacheConfig CUpti_ActivityKernel4::partitionedGlobalCacheExecuted [inherited]
-
The partitioned global caching executed for the kernel. Partitioned global caching is required to enable caching on certain chips, such as devices with compute capability 5.2. Partitioned global caching can be automatically disabled if the occupancy requirement of the launch cannot support caching.
- CUpti_ActivityPartitionedGlobalCacheConfig CUpti_ActivityKernel4::partitionedGlobalCacheRequested [inherited]
-
The partitioned global caching requested for the kernel. Partitioned global caching is required to enable caching on certain chips, such as devices with compute capability 5.2.
- uint64_t CUpti_ActivityKernel4::queued [inherited]
-
The timestamp when the kernel is queued up in the command buffer, in ns. A value of CUPTI_TIMESTAMP_UNKNOWN indicates that the queued time could not be collected for the kernel. This timestamp is not collected by default. Use API cuptiActivityEnableLatencyTimestamps() to enable collection.
The command buffer is a buffer written by the CUDA driver to send commands such as kernel launches and memory copies to the GPU. All launches of CUDA kernels are asynchronous with respect to the host: the host requests the launch by writing commands into the command buffer, then returns without checking the GPU's progress.
- uint16_t CUpti_ActivityKernel4::registersPerThread [inherited]
-
The number of registers required for each thread executing the kernel.
- uint8_t CUpti_ActivityKernel4::requested [inherited]
-
The cache configuration requested by the kernel. The value is one of the CUfunc_cache enumeration values from cuda.h.
- void * CUpti_ActivityKernel4::reserved0 [inherited]
-
Undefined. Reserved for internal use.
- uint8_t CUpti_ActivityKernel4::sharedMemoryCarveoutRequested [inherited]
-
The shared memory carveout value requested for the function, as a percentage of the total resource. The value is updated only if the isSharedMemoryCarveoutRequested field is set.
- uint8_t CUpti_ActivityKernel4::sharedMemoryConfig [inherited]
-
The shared memory configuration used for the kernel. The value is one of the CUsharedconfig enumeration values from cuda.h.
- uint32_t CUpti_ActivityKernel4::sharedMemoryExecuted [inherited]
-
Shared memory size set by the driver.
- uint64_t CUpti_ActivityKernel4::start [inherited]
-
The start timestamp for the kernel execution, in ns. A value of 0 for both the start and end timestamps indicates that timestamp information could not be collected for the kernel.
- int32_t CUpti_ActivityKernel4::staticSharedMemory [inherited]
-
The static shared memory allocated for the kernel, in bytes.
- uint32_t CUpti_ActivityKernel4::streamId [inherited]
-
The ID of the stream where the kernel is executing.
- uint64_t CUpti_ActivityKernel4::submitted [inherited]
-
The timestamp when the command buffer containing the kernel launch is submitted to the GPU, in ns. A value of CUPTI_TIMESTAMP_UNKNOWN indicates that the submitted time could not be collected for the kernel. This timestamp is not collected by default. Use API cuptiActivityEnableLatencyTimestamps() to enable collection.
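The queued and submitted timestamps described above can be turned into launch latencies once collection has been enabled with cuptiActivityEnableLatencyTimestamps(). The following is a minimal sketch under stated assumptions: KernelLatencyView and the helper name are hypothetical, the fallback definition of CUPTI_TIMESTAMP_UNKNOWN exists only so the sketch compiles on its own (real code should take the value from the CUPTI headers), and the ordering check is a defensive assumption rather than documented behavior.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed stand-in value so the sketch compiles alone; real code should
   use the definition provided by the CUPTI headers. */
#ifndef CUPTI_TIMESTAMP_UNKNOWN
#define CUPTI_TIMESTAMP_UNKNOWN 0LL
#endif

/* Hypothetical stand-in with only the fields used below. Real code would
   read these fields from a CUpti_ActivityKernel4 record. */
typedef struct {
    uint64_t queued;    /* kernel queued in the command buffer, in ns */
    uint64_t submitted; /* command buffer submitted to the GPU, in ns */
    uint64_t start;     /* kernel execution start, in ns */
} KernelLatencyView;

/* Computes the two launch latencies and returns 1. Returns 0 when either
   latency timestamp is unknown or the timestamps are out of order. */
static int kernel_launch_latency_ns(const KernelLatencyView *k,
                                    uint64_t *queue_to_submit_ns,
                                    uint64_t *submit_to_start_ns)
{
    assert(k != NULL && queue_to_submit_ns != NULL && submit_to_start_ns != NULL);
    if (k->queued == (uint64_t)CUPTI_TIMESTAMP_UNKNOWN ||
        k->submitted == (uint64_t)CUPTI_TIMESTAMP_UNKNOWN) {
        return 0;
    }
    /* Defensive assumption: refuse to subtract out-of-order timestamps. */
    if (k->submitted < k->queued || k->start < k->submitted) {
        return 0;
    }
    *queue_to_submit_ns = k->submitted - k->queued;
    *submit_to_start_ns = k->start - k->submitted;
    return 1;
}
```

A return value of 0 for every record usually suggests that latency timestamp collection was not enabled before the kernels were launched.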
6.41. CUpti_ActivityKernel5 Struct Reference
[CUPTI Activity API]
Public Variables
- int32_t blockX
- int32_t blockY
- int32_t blockZ
- CUpti_ActivityKernel5::@11 cacheConfig
- uint64_t completed
- uint32_t contextId
- uint32_t correlationId
- uint32_t deviceId
- int32_t dynamicSharedMemory
- uint64_t end
- uint8_t executed
- uint32_t graphId
- uint64_t graphNodeId
- int64_t gridId
- int32_t gridX
- int32_t gridY
- int32_t gridZ
- uint8_t isSharedMemoryCarveoutRequested
- CUpti_ActivityKind kind
- uint8_t launchType
- uint32_t localMemoryPerThread
- uint32_t localMemoryTotal
- const char * name
- uint8_t padding
- CUpti_ActivityPartitionedGlobalCacheConfig partitionedGlobalCacheExecuted
- CUpti_ActivityPartitionedGlobalCacheConfig partitionedGlobalCacheRequested
- uint64_t queued
- uint16_t registersPerThread
- uint8_t requested
- void * reserved0
- uint8_t sharedMemoryCarveoutRequested
- uint8_t sharedMemoryConfig
- uint32_t sharedMemoryExecuted
- CUpti_FuncShmemLimitConfig shmemLimitConfig
- uint64_t start
- int32_t staticSharedMemory
- uint32_t streamId
- uint64_t submitted
Variables
- int32_t CUpti_ActivityKernel5::blockX [inherited]
-
The X-dimension block size for the kernel.
- int32_t CUpti_ActivityKernel5::blockY [inherited]
-
The Y-dimension block size for the kernel.
- int32_t CUpti_ActivityKernel5::blockZ [inherited]
-
The Z-dimension block size for the kernel.
- CUpti_ActivityKernel5::@11 CUpti_ActivityKernel5::cacheConfig [inherited]
-
For devices with compute capability 7.0 and higher, the cacheConfig values are not updated if the isSharedMemoryCarveoutRequested field is set.
- uint64_t CUpti_ActivityKernel5::completed [inherited]
-
The completed timestamp for the kernel execution, in ns. It represents the completion of all its child kernels and the kernel itself. A value of CUPTI_TIMESTAMP_UNKNOWN indicates that the completion time is unknown.
- uint32_t CUpti_ActivityKernel5::contextId [inherited]
-
The ID of the context where the kernel is executing.
- uint32_t CUpti_ActivityKernel5::correlationId [inherited]
-
The correlation ID of the kernel. Each kernel execution is assigned a unique correlation ID that is identical to the correlation ID in the driver or runtime API activity record that launched the kernel.
- uint32_t CUpti_ActivityKernel5::deviceId [inherited]
-
The ID of the device where the kernel is executing.
- int32_t CUpti_ActivityKernel5::dynamicSharedMemory [inherited]
-
The dynamic shared memory reserved for the kernel, in bytes.
- uint64_t CUpti_ActivityKernel5::end [inherited]
-
The end timestamp for the kernel execution, in ns. A value of 0 for both the start and end timestamps indicates that timestamp information could not be collected for the kernel.
- uint8_t CUpti_ActivityKernel5::executed [inherited]
-
The cache configuration used for the kernel. The value is one of the CUfunc_cache enumeration values from cuda.h.
- uint32_t CUpti_ActivityKernel5::graphId [inherited]
-
The unique ID of the graph that launched this kernel through graph launch APIs. This field will be 0 if the kernel is not launched through graph launch APIs.
- uint64_t CUpti_ActivityKernel5::graphNodeId [inherited]
-
The unique ID of the graph node that launched this kernel through graph launch APIs. This field will be 0 if the kernel is not launched through graph launch APIs.
- int64_t CUpti_ActivityKernel5::gridId [inherited]
-
The grid ID of the kernel. Each kernel is assigned a unique grid ID at runtime.
- int32_t CUpti_ActivityKernel5::gridX [inherited]
-
The X-dimension grid size for the kernel.
- int32_t CUpti_ActivityKernel5::gridY [inherited]
-
The Y-dimension grid size for the kernel.
- int32_t CUpti_ActivityKernel5::gridZ [inherited]
-
The Z-dimension grid size for the kernel.
- uint8_t CUpti_ActivityKernel5::isSharedMemoryCarveoutRequested [inherited]
-
This indicates whether CU_FUNC_ATTRIBUTE_PREFERRED_SHARED_MEMORY_CARVEOUT was updated for the kernel launch.
- CUpti_ActivityKind CUpti_ActivityKernel5::kind [inherited]
-
The activity record kind, must be CUPTI_ACTIVITY_KIND_KERNEL or CUPTI_ACTIVITY_KIND_CONCURRENT_KERNEL.
- uint8_t CUpti_ActivityKernel5::launchType [inherited]
-
This indicates whether the kernel was executed via a regular launch or via a single-device or multi-device cooperative launch.
- uint32_t CUpti_ActivityKernel5::localMemoryPerThread [inherited]
-
The amount of local memory reserved for each thread, in bytes.
- uint32_t CUpti_ActivityKernel5::localMemoryTotal [inherited]
-
The total amount of local memory reserved for the kernel, in bytes.
- const char * CUpti_ActivityKernel5::name [inherited]
-
The name of the kernel. This name is shared across all activity records representing the same kernel, and so should not be modified.
- uint8_t CUpti_ActivityKernel5::padding [inherited]
-
Undefined. Reserved for internal use.
- CUpti_ActivityPartitionedGlobalCacheConfig CUpti_ActivityKernel5::partitionedGlobalCacheExecuted [inherited]
-
The partitioned global caching executed for the kernel. Partitioned global caching is required to enable caching on certain chips, such as devices with compute capability 5.2. Partitioned global caching can be automatically disabled if the occupancy requirement of the launch cannot support caching.
- CUpti_ActivityPartitionedGlobalCacheConfig CUpti_ActivityKernel5::partitionedGlobalCacheRequested [inherited]
-
The partitioned global caching requested for the kernel. Partitioned global caching is required to enable caching on certain chips, such as devices with compute capability 5.2.
- uint64_t CUpti_ActivityKernel5::queued [inherited]
-
The timestamp when the kernel is queued up in the command buffer, in ns. A value of CUPTI_TIMESTAMP_UNKNOWN indicates that the queued time could not be collected for the kernel. This timestamp is not collected by default. Use API cuptiActivityEnableLatencyTimestamps() to enable collection.
The command buffer is a buffer written by the CUDA driver to send commands such as kernel launches and memory copies to the GPU. All launches of CUDA kernels are asynchronous with respect to the host: the host requests the launch by writing commands into the command buffer, then returns without checking the GPU's progress.
- uint16_t CUpti_ActivityKernel5::registersPerThread [inherited]
-
The number of registers required for each thread executing the kernel.
- uint8_t CUpti_ActivityKernel5::requested [inherited]
-
The cache configuration requested by the kernel. The value is one of the CUfunc_cache enumeration values from cuda.h.
- void * CUpti_ActivityKernel5::reserved0 [inherited]
-
Undefined. Reserved for internal use.
- uint8_t CUpti_ActivityKernel5::sharedMemoryCarveoutRequested [inherited]
-
The shared memory carveout value requested for the function, as a percentage of the total resource. The value is updated only if the isSharedMemoryCarveoutRequested field is set.
- uint8_t CUpti_ActivityKernel5::sharedMemoryConfig [inherited]
-
The shared memory configuration used for the kernel. The value is one of the CUsharedconfig enumeration values from cuda.h.
- uint32_t CUpti_ActivityKernel5::sharedMemoryExecuted [inherited]
-
Shared memory size set by the driver.
- CUpti_FuncShmemLimitConfig CUpti_ActivityKernel5::shmemLimitConfig [inherited]
-
The shared memory limit config for the kernel. This field shows whether the user has opted for a higher per-block limit of dynamic shared memory.
- uint64_t CUpti_ActivityKernel5::start [inherited]
-
The start timestamp for the kernel execution, in ns. A value of 0 for both the start and end timestamps indicates that timestamp information could not be collected for the kernel.
- int32_t CUpti_ActivityKernel5::staticSharedMemory [inherited]
-
The static shared memory allocated for the kernel, in bytes.
- uint32_t CUpti_ActivityKernel5::streamId [inherited]
-
The ID of the stream where the kernel is executing.
- uint64_t CUpti_ActivityKernel5::submitted [inherited]
-
The timestamp when the command buffer containing the kernel launch is submitted to the GPU, in ns. A value of CUPTI_TIMESTAMP_UNKNOWN indicates that the submitted time could not be collected for the kernel. This timestamp is not collected by default. Use API cuptiActivityEnableLatencyTimestamps() to enable collection.
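The graphId convention described above (0 when the kernel was not launched through graph launch APIs) makes it straightforward to separate graph launches from regular launches. The following is a minimal sketch: KernelGraphView and both helper names are hypothetical, and real code would read graphId and graphNodeId from the CUpti_ActivityKernel5 record after including cupti.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in with only the fields used below. Real code would
   read these fields from a CUpti_ActivityKernel5 record. */
typedef struct {
    uint32_t graphId;     /* 0 when not launched through graph launch APIs */
    uint64_t graphNodeId; /* 0 when not launched through graph launch APIs */
} KernelGraphView;

/* Returns 1 when the record carries a non-zero graph ID. */
static int kernel_launched_via_graph(const KernelGraphView *k)
{
    assert(k != NULL);
    return k->graphId != 0;
}

/* Counts how many of the n records were launched through a graph. */
static size_t count_graph_launches(const KernelGraphView *records, size_t n)
{
    size_t count = 0;
    assert(records != NULL || n == 0);
    for (size_t i = 0; i < n; ++i) {
        if (kernel_launched_via_graph(&records[i])) {
            ++count;
        }
    }
    return count;
}
```

Grouping records by graphId, and then by graphNodeId within each graph, is one way to attribute kernel time back to individual graph nodes.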
6.42. CUpti_ActivityKernel6 Struct Reference
[CUPTI Activity API]
This activity record represents a kernel execution (CUPTI_ACTIVITY_KIND_KERNEL and CUPTI_ACTIVITY_KIND_CONCURRENT_KERNEL) but is no longer generated by CUPTI. Kernel activities are now reported using the CUpti_ActivityKernel9 activity record.
Public Variables
- int32_t blockX
- int32_t blockY
- int32_t blockZ
- CUpti_ActivityKernel6::@13 cacheConfig
- uint64_t completed
- uint32_t contextId
- uint32_t correlationId
- uint32_t deviceId
- int32_t dynamicSharedMemory
- uint64_t end
- uint8_t executed
- uint32_t graphId
- uint64_t graphNodeId
- int64_t gridId
- int32_t gridX
- int32_t gridY
- int32_t gridZ
- uint8_t isSharedMemoryCarveoutRequested
- CUpti_ActivityKind kind
- uint8_t launchType
- uint32_t localMemoryPerThread
- uint32_t localMemoryTotal
- const char * name
- CUaccessPolicyWindow * pAccessPolicyWindow
- uint8_t padding
- CUpti_ActivityPartitionedGlobalCacheConfig partitionedGlobalCacheExecuted
- CUpti_ActivityPartitionedGlobalCacheConfig partitionedGlobalCacheRequested
- uint64_t queued
- uint16_t registersPerThread
- uint8_t requested
- void * reserved0
- uint8_t sharedMemoryCarveoutRequested
- uint8_t sharedMemoryConfig
- uint32_t sharedMemoryExecuted
- CUpti_FuncShmemLimitConfig shmemLimitConfig
- uint64_t start
- int32_t staticSharedMemory
- uint32_t streamId
- uint64_t submitted
Variables
- int32_t CUpti_ActivityKernel6::blockX [inherited]
-
The X-dimension block size for the kernel.
- int32_t CUpti_ActivityKernel6::blockY [inherited]
-
The Y-dimension block size for the kernel.
- int32_t CUpti_ActivityKernel6::blockZ [inherited]
-
The Z-dimension block size for the kernel.
- CUpti_ActivityKernel6::@13 CUpti_ActivityKernel6::cacheConfig [inherited]
-
For devices with compute capability 7.0 and higher, the cacheConfig values are not updated if the isSharedMemoryCarveoutRequested field is set.
- uint64_t CUpti_ActivityKernel6::completed [inherited]
-
The completed timestamp for the kernel execution, in ns. It represents the completion of all its child kernels and the kernel itself. A value of CUPTI_TIMESTAMP_UNKNOWN indicates that the completion time is unknown.
- uint32_t CUpti_ActivityKernel6::contextId [inherited]
-
The ID of the context where the kernel is executing.
- uint32_t CUpti_ActivityKernel6::correlationId [inherited]
-
The correlation ID of the kernel. Each kernel execution is assigned a unique correlation ID that is identical to the correlation ID in the driver or runtime API activity record that launched the kernel.
- uint32_t CUpti_ActivityKernel6::deviceId [inherited]
-
The ID of the device where the kernel is executing.
- int32_t CUpti_ActivityKernel6::dynamicSharedMemory [inherited]
-
The dynamic shared memory reserved for the kernel, in bytes.
- uint64_t CUpti_ActivityKernel6::end [inherited]
-
The end timestamp for the kernel execution, in ns. A value of 0 for both the start and end timestamps indicates that timestamp information could not be collected for the kernel.
- uint8_t CUpti_ActivityKernel6::executed [inherited]
-
The cache configuration used for the kernel. The value is one of the CUfunc_cache enumeration values from cuda.h.
- uint32_t CUpti_ActivityKernel6::graphId [inherited]
-
The unique ID of the graph that launched this kernel through graph launch APIs. This field will be 0 if the kernel is not launched through graph launch APIs.
- uint64_t CUpti_ActivityKernel6::graphNodeId [inherited]
-
The unique ID of the graph node that launched this kernel through graph launch APIs. This field will be 0 if the kernel is not launched through graph launch APIs.
- int64_t CUpti_ActivityKernel6::gridId [inherited]
-
The grid ID of the kernel. Each kernel is assigned a unique grid ID at runtime.
- int32_t CUpti_ActivityKernel6::gridX [inherited]
-
The X-dimension grid size for the kernel.
- int32_t CUpti_ActivityKernel6::gridY [inherited]
-
The Y-dimension grid size for the kernel.
- int32_t CUpti_ActivityKernel6::gridZ [inherited]
-
The Z-dimension grid size for the kernel.
- uint8_t CUpti_ActivityKernel6::isSharedMemoryCarveoutRequested [inherited]
-
This indicates whether CU_FUNC_ATTRIBUTE_PREFERRED_SHARED_MEMORY_CARVEOUT was updated for the kernel launch.
- CUpti_ActivityKind CUpti_ActivityKernel6::kind [inherited]
-
The activity record kind, must be CUPTI_ACTIVITY_KIND_KERNEL or CUPTI_ACTIVITY_KIND_CONCURRENT_KERNEL.
- uint8_t CUpti_ActivityKernel6::launchType [inherited]
-
This indicates whether the kernel was executed via a regular launch or via a single-device or multi-device cooperative launch.
- uint32_t CUpti_ActivityKernel6::localMemoryPerThread [inherited]
-
The amount of local memory reserved for each thread, in bytes.
- uint32_t CUpti_ActivityKernel6::localMemoryTotal [inherited]
-
The total amount of local memory reserved for the kernel, in bytes.
- const char * CUpti_ActivityKernel6::name [inherited]
-
The name of the kernel. This name is shared across all activity records representing the same kernel, and so should not be modified.
- CUaccessPolicyWindow * CUpti_ActivityKernel6::pAccessPolicyWindow [inherited]
-
The pointer to the access policy window. The structure CUaccessPolicyWindow is defined in cuda.h.
- uint8_t CUpti_ActivityKernel6::padding [inherited]
-
Undefined. Reserved for internal use.
- CUpti_ActivityPartitionedGlobalCacheConfig CUpti_ActivityKernel6::partitionedGlobalCacheExecuted [inherited]
-
The partitioned global caching executed for the kernel. Partitioned global caching is required to enable caching on certain chips, such as devices with compute capability 5.2. Partitioned global caching can be automatically disabled if the occupancy requirement of the launch cannot support caching.
- CUpti_ActivityPartitionedGlobalCacheConfig CUpti_ActivityKernel6::partitionedGlobalCacheRequested [inherited]
-
The partitioned global caching requested for the kernel. Partitioned global caching is required to enable caching on certain chips, such as devices with compute capability 5.2.
- uint64_t CUpti_ActivityKernel6::queued [inherited]
-
The timestamp when the kernel is queued up in the command buffer, in ns. A value of CUPTI_TIMESTAMP_UNKNOWN indicates that the queued time could not be collected for the kernel. This timestamp is not collected by default. Use API cuptiActivityEnableLatencyTimestamps() to enable collection.
The command buffer is a buffer written by the CUDA driver to send commands such as kernel launches and memory copies to the GPU. All launches of CUDA kernels are asynchronous with respect to the host: the host requests the launch by writing commands into the command buffer, then returns without checking the GPU's progress.
- uint16_t CUpti_ActivityKernel6::registersPerThread [inherited]
-
The number of registers required for each thread executing the kernel.
- uint8_t CUpti_ActivityKernel6::requested [inherited]
-
The cache configuration requested by the kernel. The value is one of the CUfunc_cache enumeration values from cuda.h.
- void * CUpti_ActivityKernel6::reserved0 [inherited]
-
Undefined. Reserved for internal use.
- uint8_t CUpti_ActivityKernel6::sharedMemoryCarveoutRequested [inherited]
-
The shared memory carveout value requested for the function, as a percentage of the total resource. The value is updated only if the isSharedMemoryCarveoutRequested field is set.
- uint8_t CUpti_ActivityKernel6::sharedMemoryConfig [inherited]
-
The shared memory configuration used for the kernel. The value is one of the CUsharedconfig enumeration values from cuda.h.
- uint32_t CUpti_ActivityKernel6::sharedMemoryExecuted [inherited]
-
Shared memory size set by the driver.
- CUpti_FuncShmemLimitConfig CUpti_ActivityKernel6::shmemLimitConfig [inherited]
-
The shared memory limit config for the kernel. This field shows whether the user has opted for a higher per-block limit of dynamic shared memory.
- uint64_t CUpti_ActivityKernel6::start [inherited]
-
The start timestamp for the kernel execution, in ns. A value of 0 for both the start and end timestamps indicates that timestamp information could not be collected for the kernel.
- int32_t CUpti_ActivityKernel6::staticSharedMemory [inherited]
-
The static shared memory allocated for the kernel, in bytes.
- uint32_t CUpti_ActivityKernel6::streamId [inherited]
-
The ID of the stream where the kernel is executing.
- uint64_t CUpti_ActivityKernel6::submitted [inherited]
-
The timestamp when the command buffer containing the kernel launch is submitted to the GPU, in ns. A value of CUPTI_TIMESTAMP_UNKNOWN indicates that the submitted time could not be collected for the kernel. This timestamp is not collected by default. Use API cuptiActivityEnableLatencyTimestamps() to enable collection.
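Because sharedMemoryCarveoutRequested is meaningful only when isSharedMemoryCarveoutRequested is set, a reader of the record should test the flag before using the percentage. The following is a minimal sketch: KernelCarveoutView and the helper name are hypothetical, and the -1 return value is a convention chosen for this example, not something CUPTI defines.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in with only the fields used below. Real code would
   read these fields from a CUpti_ActivityKernel6 record. */
typedef struct {
    uint8_t isSharedMemoryCarveoutRequested; /* non-zero if a carveout was requested */
    uint8_t sharedMemoryCarveoutRequested;   /* percentage of the total resource */
} KernelCarveoutView;

/* Returns the requested carveout percentage, or -1 (a convention of this
   sketch) when no carveout was requested and the value must be ignored. */
static int kernel_requested_carveout_percent(const KernelCarveoutView *k)
{
    assert(k != NULL);
    if (!k->isSharedMemoryCarveoutRequested) {
        return -1;
    }
    return (int)k->sharedMemoryCarveoutRequested;
}
```

Returning a sentinel instead of the raw field keeps a stale or uninitialized percentage from being reported for launches that never requested a carveout.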
6.43. CUpti_ActivityKernel7 Struct Reference
[CUPTI Activity API]
This activity record represents a kernel execution (CUPTI_ACTIVITY_KIND_KERNEL and CUPTI_ACTIVITY_KIND_CONCURRENT_KERNEL) but is no longer generated by CUPTI. Kernel activities are now reported using the CUpti_ActivityKernel9 activity record.
Public Variables
- int32_t blockX
- int32_t blockY
- int32_t blockZ
- CUpti_ActivityKernel7::@15 cacheConfig
- uint32_t channelID
- CUpti_ChannelType channelType
- uint64_t completed
- uint32_t contextId
- uint32_t correlationId
- uint32_t deviceId
- int32_t dynamicSharedMemory
- uint64_t end
- uint8_t executed
- uint32_t graphId
- uint64_t graphNodeId
- int64_t gridId
- int32_t gridX
- int32_t gridY
- int32_t gridZ
- uint8_t isSharedMemoryCarveoutRequested
- CUpti_ActivityKind kind
- uint8_t launchType
- uint32_t localMemoryPerThread
- uint32_t localMemoryTotal
- const char * name
- CUaccessPolicyWindow * pAccessPolicyWindow
- uint8_t padding
- CUpti_ActivityPartitionedGlobalCacheConfig partitionedGlobalCacheExecuted
- CUpti_ActivityPartitionedGlobalCacheConfig partitionedGlobalCacheRequested
- uint64_t queued
- uint16_t registersPerThread
- uint8_t requested
- void * reserved0
- uint8_t sharedMemoryCarveoutRequested
- uint8_t sharedMemoryConfig
- uint32_t sharedMemoryExecuted
- CUpti_FuncShmemLimitConfig shmemLimitConfig
- uint64_t start
- int32_t staticSharedMemory
- uint32_t streamId
- uint64_t submitted
Variables
- int32_t CUpti_ActivityKernel7::blockX [inherited]
-
The X-dimension block size for the kernel.
- int32_t CUpti_ActivityKernel7::blockY [inherited]
-
The Y-dimension block size for the kernel.
- int32_t CUpti_ActivityKernel7::blockZ [inherited]
-
The Z-dimension block size for the kernel.
- CUpti_ActivityKernel7::@15 CUpti_ActivityKernel7::cacheConfig [inherited]
-
For devices with compute capability 7.0 and higher, the cacheConfig values are not updated if the isSharedMemoryCarveoutRequested field is set.
- uint32_t CUpti_ActivityKernel7::channelID [inherited]
-
The ID of the HW channel on which the kernel is launched.
- CUpti_ChannelType CUpti_ActivityKernel7::channelType [inherited]
-
The type of the channel.
- uint64_t CUpti_ActivityKernel7::completed [inherited]
-
The completed timestamp for the kernel execution, in ns. It represents the completion of all its child kernels and the kernel itself. A value of CUPTI_TIMESTAMP_UNKNOWN indicates that the completion time is unknown.
- uint32_t CUpti_ActivityKernel7::contextId [inherited]
-
The ID of the context where the kernel is executing.
- uint32_t CUpti_ActivityKernel7::correlationId [inherited]
-
The correlation ID of the kernel. Each kernel execution is assigned a unique correlation ID that is identical to the correlation ID in the driver or runtime API activity record that launched the kernel.
- uint32_t CUpti_ActivityKernel7::deviceId [inherited]
-
The ID of the device where the kernel is executing.
- int32_t CUpti_ActivityKernel7::dynamicSharedMemory [inherited]
-
The dynamic shared memory reserved for the kernel, in bytes.
- uint64_t CUpti_ActivityKernel7::end [inherited]
-
The end timestamp for the kernel execution, in ns. A value of 0 for both the start and end timestamps indicates that timestamp information could not be collected for the kernel.
- uint8_t CUpti_ActivityKernel7::executed [inherited]
-
The cache configuration used for the kernel. The value is one of the CUfunc_cache enumeration values from cuda.h.
- uint32_t CUpti_ActivityKernel7::graphId [inherited]
-
The unique ID of the graph that launched this kernel through graph launch APIs. This field will be 0 if the kernel is not launched through graph launch APIs.
- uint64_t CUpti_ActivityKernel7::graphNodeId [inherited]
-
The unique ID of the graph node that launched this kernel through graph launch APIs. This field will be 0 if the kernel is not launched through graph launch APIs.
- int64_t CUpti_ActivityKernel7::gridId [inherited]
-
The grid ID of the kernel. Each kernel is assigned a unique grid ID at runtime.
- int32_t CUpti_ActivityKernel7::gridX [inherited]
-
The X-dimension grid size for the kernel.
- int32_t CUpti_ActivityKernel7::gridY [inherited]
-
The Y-dimension grid size for the kernel.
- int32_t CUpti_ActivityKernel7::gridZ [inherited]
-
The Z-dimension grid size for the kernel.
- uint8_t CUpti_ActivityKernel7::isSharedMemoryCarveoutRequested [inherited]
-
This indicates whether CU_FUNC_ATTRIBUTE_PREFERRED_SHARED_MEMORY_CARVEOUT was updated for the kernel launch.
- CUpti_ActivityKind CUpti_ActivityKernel7::kind [inherited]
-
The activity record kind, must be CUPTI_ACTIVITY_KIND_KERNEL or CUPTI_ACTIVITY_KIND_CONCURRENT_KERNEL.
- uint8_t CUpti_ActivityKernel7::launchType [inherited]
-
This indicates whether the kernel was executed via a regular launch or via a single-device or multi-device cooperative launch.
- uint32_t CUpti_ActivityKernel7::localMemoryPerThread [inherited]
-
The amount of local memory reserved for each thread, in bytes.
- uint32_t CUpti_ActivityKernel7::localMemoryTotal [inherited]
-
The total amount of local memory reserved for the kernel, in bytes.
- const char * CUpti_ActivityKernel7::name [inherited]
-
The name of the kernel. This name is shared across all activity records representing the same kernel, and so should not be modified.
- CUaccessPolicyWindow * CUpti_ActivityKernel7::pAccessPolicyWindow [inherited]
-
The pointer to the access policy window. The structure CUaccessPolicyWindow is defined in cuda.h.
- uint8_t CUpti_ActivityKernel7::padding [inherited]
-
Undefined. Reserved for internal use.
- CUpti_ActivityPartitionedGlobalCacheConfig CUpti_ActivityKernel7::partitionedGlobalCacheExecuted [inherited]
-
The partitioned global caching executed for the kernel. Partitioned global caching is required to enable caching on certain chips, such as devices with compute capability 5.2. Partitioned global caching can be automatically disabled if the occupancy requirement of the launch cannot support caching.
- CUpti_ActivityPartitionedGlobalCacheConfig CUpti_ActivityKernel7::partitionedGlobalCacheRequested [inherited]
-
The partitioned global caching requested for the kernel. Partitioned global caching is required to enable caching on certain chips, such as devices with compute capability 5.2.
- uint64_t CUpti_ActivityKernel7::queued [inherited]
-
The timestamp when the kernel is queued up in the command buffer, in ns. A value of CUPTI_TIMESTAMP_UNKNOWN indicates that the queued time could not be collected for the kernel. This timestamp is not collected by default. Use API cuptiActivityEnableLatencyTimestamps() to enable collection.
The command buffer is a buffer written by the CUDA driver to send commands such as kernel launches and memory copies to the GPU. All launches of CUDA kernels are asynchronous with respect to the host: the host requests the launch by writing commands into the command buffer, then returns without checking the GPU's progress.
- uint16_t CUpti_ActivityKernel7::registersPerThread [inherited]
-
The number of registers required for each thread executing the kernel.
- uint8_t CUpti_ActivityKernel7::requested [inherited]
-
The cache configuration requested by the kernel. The value is one of the CUfunc_cache enumeration values from cuda.h.
- void * CUpti_ActivityKernel7::reserved0 [inherited]
-
Undefined. Reserved for internal use.
- uint8_t CUpti_ActivityKernel7::sharedMemoryCarveoutRequested [inherited]
-
Shared memory carveout value requested for the function in percentage of the total resource. The value will be updated only if field isSharedMemoryCarveoutRequested is set.
- uint8_t CUpti_ActivityKernel7::sharedMemoryConfig [inherited]
-
The shared memory configuration used for the kernel. The value is one of the CUsharedconfig enumeration values from cuda.h.
- uint32_t CUpti_ActivityKernel7::sharedMemoryExecuted [inherited]
-
Shared memory size set by the driver.
- CUpti_FuncShmemLimitConfig CUpti_ActivityKernel7::shmemLimitConfig [inherited]
-
The shared memory limit configuration for the kernel. This field shows whether the user has opted for a higher per-block limit of dynamic shared memory.
- uint64_t CUpti_ActivityKernel7::start [inherited]
-
The start timestamp for the kernel execution, in ns. A value of 0 for both the start and end timestamps indicates that timestamp information could not be collected for the kernel.
- int32_t CUpti_ActivityKernel7::staticSharedMemory [inherited]
-
The static shared memory allocated for the kernel, in bytes.
- uint32_t CUpti_ActivityKernel7::streamId [inherited]
-
The ID of the stream where the kernel is executing.
- uint64_t CUpti_ActivityKernel7::submitted [inherited]
-
The timestamp when the command buffer containing the kernel launch is submitted to the GPU, in ns. A value of CUPTI_TIMESTAMP_UNKNOWN indicates that the submitted time could not be collected for the kernel. This timestamp is not collected by default. Use API cuptiActivityEnableLatencyTimestamps() to enable collection.
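The timestamp fields described above (queued, submitted, start and end) can be combined into a kernel duration and two launch latencies. The following is a minimal sketch, not CUPTI code: `DemoKernelTimes`, `demo_kernel_duration_ns`, `demo_delta_ns` and `DEMO_TIMESTAMP_UNKNOWN` are hypothetical stand-ins, and the value 0 used for the unknown marker is an assumption. Real code should read the fields from the activity record and compare against CUPTI_TIMESTAMP_UNKNOWN from the CUPTI headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed stand-in for CUPTI_TIMESTAMP_UNKNOWN; use the real macro in practice. */
#define DEMO_TIMESTAMP_UNKNOWN 0ULL

/* Hypothetical subset of the kernel record holding only the timestamps, in ns. */
typedef struct {
    uint64_t queued;
    uint64_t submitted;
    uint64_t start;
    uint64_t end;
} DemoKernelTimes;

/* Writes (end - start) to *out_ns. Returns false when start and end are both 0,
 * which the field descriptions define as "timestamps could not be collected". */
static bool demo_kernel_duration_ns(const DemoKernelTimes *t, uint64_t *out_ns)
{
    assert(t != NULL && out_ns != NULL);
    if (t->start == 0 && t->end == 0) {
        return false;
    }
    if (t->end < t->start) {
        return false; /* defensive: never report a wrapped-around duration */
    }
    *out_ns = t->end - t->start;
    return true;
}

/* Writes (to - from) to *out_ns, or returns false if either timestamp is
 * unknown, e.g. because latency timestamps were not enabled. */
static bool demo_delta_ns(uint64_t from, uint64_t to, uint64_t *out_ns)
{
    assert(out_ns != NULL);
    if (from == DEMO_TIMESTAMP_UNKNOWN || to == DEMO_TIMESTAMP_UNKNOWN || to < from) {
        return false;
    }
    *out_ns = to - from;
    return true;
}
```

With these helpers, `demo_delta_ns(t.queued, t.submitted, &ns)` gives the time a launch spent in the command buffer before submission, and `demo_delta_ns(t.submitted, t.start, &ns)` gives the time between submission and the start of execution. Both return false unless queued and submitted were collected.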
6.44. CUpti_ActivityKernel8 Struct Reference
[CUPTI Activity API]
This activity record represents a kernel execution (CUPTI_ACTIVITY_KIND_KERNEL and CUPTI_ACTIVITY_KIND_CONCURRENT_KERNEL)
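Because a kernel record is only meaningful when its kind is one of the two kernel kinds, a consumer usually checks the kind before interpreting a generic record as a kernel record. The sketch below shows that pattern with hypothetical stand-in types (`DemoActivity`, `DemoKernelRecord`, `DemoActivityKind`); the enum values are placeholders and are not the values of CUpti_ActivityKind. In real code the corresponding names would be CUpti_Activity, CUpti_ActivityKernel8, CUPTI_ACTIVITY_KIND_KERNEL and CUPTI_ACTIVITY_KIND_CONCURRENT_KERNEL.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder kinds for illustration only; not CUPTI's enum values. */
typedef enum {
    DEMO_KIND_MEMCPY = 1,
    DEMO_KIND_KERNEL = 2,
    DEMO_KIND_CONCURRENT_KERNEL = 3
} DemoActivityKind;

/* Stand-in for the base activity record: only the kind is known. */
typedef struct {
    DemoActivityKind kind;
} DemoActivity;

/* Stand-in kernel record. The base record is the first member, so a pointer
 * to the base of a kernel record can be converted back to the full record. */
typedef struct {
    DemoActivity base;
    const char *name;
    uint32_t correlationId;
} DemoKernelRecord;

/* Returns the record viewed as a kernel record, or NULL for any other kind.
 * The caller must only pass records whose kind matches their actual layout. */
static const DemoKernelRecord *demo_as_kernel(const DemoActivity *rec)
{
    assert(rec != NULL);
    switch (rec->kind) {
    case DEMO_KIND_KERNEL:
    case DEMO_KIND_CONCURRENT_KERNEL:
        return (const DemoKernelRecord *)rec;
    default:
        return NULL;
    }
}
```

Returning NULL for other kinds keeps the cast in one place, so callers cannot read kernel fields from a record of a different kind by accident.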
Public Variables
- int32_t blockX
- int32_t blockY
- int32_t blockZ
- CUpti_ActivityKernel8::@17 cacheConfig
- uint32_t channelID
- CUpti_ChannelType channelType
- uint32_t clusterSchedulingPolicy
- uint32_t clusterX
- uint32_t clusterY
- uint32_t clusterZ
- uint64_t completed
- uint32_t contextId
- uint32_t correlationId
- uint32_t deviceId
- int32_t dynamicSharedMemory
- uint64_t end
- uint8_t executed
- uint32_t graphId
- uint64_t graphNodeId
- int64_t gridId
- int32_t gridX
- int32_t gridY
- int32_t gridZ
- uint8_t isSharedMemoryCarveoutRequested
- CUpti_ActivityKind kind
- uint8_t launchType
- uint32_t localMemoryPerThread
- uint32_t localMemoryTotal
- uint64_t localMemoryTotal_v2
- const char * name
- CUaccessPolicyWindow * pAccessPolicyWindow
- uint8_t padding
- CUpti_ActivityPartitionedGlobalCacheConfig partitionedGlobalCacheExecuted
- CUpti_ActivityPartitionedGlobalCacheConfig partitionedGlobalCacheRequested
- uint64_t queued
- uint16_t registersPerThread
- uint8_t requested
- void * reserved0
- uint8_t sharedMemoryCarveoutRequested
- uint8_t sharedMemoryConfig
- uint32_t sharedMemoryExecuted
- CUpti_FuncShmemLimitConfig shmemLimitConfig
- uint64_t start
- int32_t staticSharedMemory
- uint32_t streamId
- uint64_t submitted
Variables
- int32_t CUpti_ActivityKernel8::blockX [inherited]
-
The X-dimension block size for the kernel.
- int32_t CUpti_ActivityKernel8::blockY [inherited]
-
The Y-dimension block size for the kernel.
- int32_t CUpti_ActivityKernel8::blockZ [inherited]
-
The Z-dimension block size for the kernel.
- CUpti_ActivityKernel8::@17 CUpti_ActivityKernel8::cacheConfig [inherited]
-
For devices with compute capability 7.0 and higher, the cacheConfig values are not updated if the field isSharedMemoryCarveoutRequested is set.
- uint32_t CUpti_ActivityKernel8::channelID [inherited]
-
The ID of the HW channel on which the kernel is launched.
- CUpti_ChannelType CUpti_ActivityKernel8::channelType [inherited]
-
The type of the channel.
- uint32_t CUpti_ActivityKernel8::clusterSchedulingPolicy [inherited]
-
The cluster scheduling policy for the kernel. Refer to CUclusterSchedulingPolicy. This field is valid for devices with compute capability 9.0 and higher.
- uint32_t CUpti_ActivityKernel8::clusterX [inherited]
-
The X-dimension cluster size for the kernel. This field is valid for devices with compute capability 9.0 and higher.
- uint32_t CUpti_ActivityKernel8::clusterY [inherited]
-
The Y-dimension cluster size for the kernel. This field is valid for devices with compute capability 9.0 and higher.
- uint32_t CUpti_ActivityKernel8::clusterZ [inherited]
-
The Z-dimension cluster size for the kernel. This field is valid for devices with compute capability 9.0 and higher.
- uint64_t CUpti_ActivityKernel8::completed [inherited]
-
The completed timestamp for the kernel execution, in ns. It represents the completion of the kernel itself and of all its child kernels. A value of CUPTI_TIMESTAMP_UNKNOWN indicates that the completion time is unknown.
- uint32_t CUpti_ActivityKernel8::contextId [inherited]
-
The ID of the context where the kernel is executing.
- uint32_t CUpti_ActivityKernel8::correlationId [inherited]
-
The correlation ID of the kernel. Each kernel execution is assigned a unique correlation ID that is identical to the correlation ID in the driver or runtime API activity record that launched the kernel.
- uint32_t CUpti_ActivityKernel8::deviceId [inherited]
-
The ID of the device where the kernel is executing.
- int32_t CUpti_ActivityKernel8::dynamicSharedMemory [inherited]
-
The dynamic shared memory reserved for the kernel, in bytes.
- uint64_t CUpti_ActivityKernel8::end [inherited]
-
The end timestamp for the kernel execution, in ns. A value of 0 for both the start and end timestamps indicates that timestamp information could not be collected for the kernel.
- uint8_t CUpti_ActivityKernel8::executed [inherited]
-
The cache configuration used for the kernel. The value is one of the CUfunc_cache enumeration values from cuda.h.
- uint32_t CUpti_ActivityKernel8::graphId [inherited]
-
The unique ID of the graph that launched this kernel through graph launch APIs. This field will be 0 if the kernel is not launched through graph launch APIs.
- uint64_t CUpti_ActivityKernel8::graphNodeId [inherited]
-
The unique ID of the graph node that launched this kernel through graph launch APIs. This field will be 0 if the kernel is not launched through graph launch APIs.
- int64_t CUpti_ActivityKernel8::gridId [inherited]
-
The grid ID of the kernel. Each kernel is assigned a unique grid ID at runtime.
- int32_t CUpti_ActivityKernel8::gridX [inherited]
-
The X-dimension grid size for the kernel.
- int32_t CUpti_ActivityKernel8::gridY [inherited]
-
The Y-dimension grid size for the kernel.
- int32_t CUpti_ActivityKernel8::gridZ [inherited]
-
The Z-dimension grid size for the kernel.
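The block and grid dimension fields together describe the launch geometry. As a small worked example, the sketch below derives threads per block, total blocks and total threads from them. `DemoLaunchDims` and the `demo_` functions are hypothetical stand-ins for the corresponding record fields. The products are computed in 64-bit arithmetic, since the fields are 32-bit and a large launch can exceed that range.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical subset of the kernel record: launch dimensions only. */
typedef struct {
    int32_t gridX, gridY, gridZ;
    int32_t blockX, blockY, blockZ;
} DemoLaunchDims;

/* Threads in one block: blockX * blockY * blockZ. */
static int64_t demo_threads_per_block(const DemoLaunchDims *d)
{
    assert(d != NULL);
    return (int64_t)d->blockX * d->blockY * d->blockZ;
}

/* Blocks in the grid: gridX * gridY * gridZ. */
static int64_t demo_total_blocks(const DemoLaunchDims *d)
{
    assert(d != NULL);
    return (int64_t)d->gridX * d->gridY * d->gridZ;
}

/* Threads in the whole launch, widened to 64 bits before multiplying. */
static int64_t demo_total_threads(const DemoLaunchDims *d)
{
    return demo_total_blocks(d) * demo_threads_per_block(d);
}
```

For a launch with a 256 x 2 x 1 grid of 128 x 1 x 1 blocks, this gives 512 blocks of 128 threads, or 65536 threads in total.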
- uint8_t CUpti_ActivityKernel8::isSharedMemoryCarveoutRequested [inherited]
-
This indicates whether CU_FUNC_ATTRIBUTE_PREFERRED_SHARED_MEMORY_CARVEOUT was updated for the kernel launch.
- CUpti_ActivityKind CUpti_ActivityKernel8::kind [inherited]
-
The activity record kind, must be CUPTI_ACTIVITY_KIND_KERNEL or CUPTI_ACTIVITY_KIND_CONCURRENT_KERNEL.
- uint8_t CUpti_ActivityKernel8::launchType [inherited]
-
This indicates whether the kernel was executed via a regular launch or via a single-device or multi-device cooperative launch.
- uint32_t CUpti_ActivityKernel8::localMemoryPerThread [inherited]
-
The amount of local memory reserved for each thread, in bytes.
- uint32_t CUpti_ActivityKernel8::localMemoryTotal [inherited]
-
The total amount of local memory reserved for the kernel, in bytes (deprecated in CUDA 11.8). Refer to the field localMemoryTotal_v2 instead.
- uint64_t CUpti_ActivityKernel8::localMemoryTotal_v2 [inherited]
-
The total amount of local memory reserved for the kernel, in bytes.
- const char * CUpti_ActivityKernel8::name [inherited]
-
The name of the kernel. This name is shared across all activity records representing the same kernel, and so should not be modified.
- CUaccessPolicyWindow * CUpti_ActivityKernel8::pAccessPolicyWindow [inherited]
-
The pointer to the access policy window. The structure CUaccessPolicyWindow is defined in cuda.h.
- uint8_t CUpti_ActivityKernel8::padding [inherited]
-
Undefined. Reserved for internal use.
- CUpti_ActivityPartitionedGlobalCacheConfig CUpti_ActivityKernel8::partitionedGlobalCacheExecuted [inherited]
-
The partitioned global caching executed for the kernel. Partitioned global caching is required to enable caching on certain chips, such as devices with compute capability 5.2. Partitioned global caching can be automatically disabled if the occupancy requirement of the launch cannot support caching.
- CUpti_ActivityPartitionedGlobalCacheConfig CUpti_ActivityKernel8::partitionedGlobalCacheRequested [inherited]
-
The partitioned global caching requested for the kernel. Partitioned global caching is required to enable caching on certain chips, such as devices with compute capability 5.2.
- uint64_t CUpti_ActivityKernel8::queued [inherited]
-
The timestamp when the kernel is queued up in the command buffer, in ns. A value of CUPTI_TIMESTAMP_UNKNOWN indicates that the queued time could not be collected for the kernel. This timestamp is not collected by default. Use API cuptiActivityEnableLatencyTimestamps() to enable collection.
The command buffer is a buffer written by the CUDA driver to send commands such as kernel launches and memory copies to the GPU. All launches of CUDA kernels are asynchronous with respect to the host: the host requests the launch by writing commands into the command buffer, then returns without checking the GPU's progress.
- uint16_t CUpti_ActivityKernel8::registersPerThread [inherited]
-
The number of registers required for each thread executing the kernel.
- uint8_t CUpti_ActivityKernel8::requested [inherited]
-
The cache configuration requested by the kernel. The value is one of the CUfunc_cache enumeration values from cuda.h.
- void * CUpti_ActivityKernel8::reserved0 [inherited]
-
Undefined. Reserved for internal use.
- uint8_t CUpti_ActivityKernel8::sharedMemoryCarveoutRequested [inherited]
-
Shared memory carveout value requested for the function in percentage of the total resource. The value will be updated only if field isSharedMemoryCarveoutRequested is set.
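Since sharedMemoryCarveoutRequested is only updated when isSharedMemoryCarveoutRequested is set, a reader of the record should check the flag before trusting the percentage. A minimal sketch of that check follows; `DemoCarveout` and `demo_requested_carveout_percent` are hypothetical names, and returning -1 for "no carveout requested" is a convention chosen for this example only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical subset of the kernel record: the carveout flag and value. */
typedef struct {
    uint8_t isSharedMemoryCarveoutRequested;
    uint8_t sharedMemoryCarveoutRequested;
} DemoCarveout;

/* Returns the requested carveout as a percentage of the total resource,
 * or -1 when the flag is clear and the value field carries no meaning. */
static int demo_requested_carveout_percent(const DemoCarveout *c)
{
    assert(c != NULL);
    if (!c->isSharedMemoryCarveoutRequested) {
        return -1;
    }
    return (int)c->sharedMemoryCarveoutRequested;
}
```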
- uint8_t CUpti_ActivityKernel8::sharedMemoryConfig [inherited]
-
The shared memory configuration used for the kernel. The value is one of the CUsharedconfig enumeration values from cuda.h.
- uint32_t CUpti_ActivityKernel8::sharedMemoryExecuted [inherited]
-
Shared memory size set by the driver.
- CUpti_FuncShmemLimitConfig CUpti_ActivityKernel8::shmemLimitConfig [inherited]
-
The shared memory limit configuration for the kernel. This field shows whether the user has opted for a higher per-block limit of dynamic shared memory.
- uint64_t CUpti_ActivityKernel8::start [inherited]
-
The start timestamp for the kernel execution, in ns. A value of 0 for both the start and end timestamps indicates that timestamp information could not be collected for the kernel.
- int32_t CUpti_ActivityKernel8::staticSharedMemory [inherited]
-
The static shared memory allocated for the kernel, in bytes.