6.6. CUpti_ActivityCdpKernel
-
struct CUpti_ActivityCdpKernel
The activity record for a CDP (CUDA Dynamic Parallelism) kernel.
This activity record represents a CDP kernel execution.
Public Members
-
CUpti_ActivityKind kind
The activity record kind, must be CUPTI_ACTIVITY_KIND_CDP_KERNEL.
-
uint8_t requested
The cache configuration requested by the kernel.
The value is one of the CUfunc_cache enumeration values from cuda.h.
-
uint8_t executed
The cache configuration used for the kernel.
The value is one of the CUfunc_cache enumeration values from cuda.h.
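As an illustration of interpreting the two cache configuration values, the following minimal sketch maps a value to a readable label. The `EX_CACHE_*` constants are local stand-ins assumed to mirror the CUfunc_cache values in cuda.h, and `cache_config_label` is a hypothetical helper, not part of CUPTI.

```c
#include <stdint.h>

/* Local stand-ins assumed to mirror the CUfunc_cache values in cuda.h.
 * Real code should include cuda.h and use the CU_FUNC_CACHE_* names. */
enum {
    EX_CACHE_PREFER_NONE   = 0,
    EX_CACHE_PREFER_SHARED = 1,
    EX_CACHE_PREFER_L1     = 2,
    EX_CACHE_PREFER_EQUAL  = 3
};

/* Hypothetical helper: map a cache configuration value to a label. */
static const char *cache_config_label(uint8_t value)
{
    switch (value) {
    case EX_CACHE_PREFER_NONE:   return "prefer none";
    case EX_CACHE_PREFER_SHARED: return "prefer shared";
    case EX_CACHE_PREFER_L1:     return "prefer L1";
    case EX_CACHE_PREFER_EQUAL:  return "prefer equal";
    default:                     return "unknown";
    }
}
```

Comparing the labels for requested and executed shows whether the kernel ran with the configuration it asked for.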
-
uint8_t sharedMemoryConfig
The shared memory configuration used for the kernel.
The value is one of the CUsharedconfig enumeration values from cuda.h.
-
uint16_t registersPerThread
The number of registers required for each thread executing the kernel.
-
uint64_t start
The start timestamp for the kernel execution, in ns.
A value of 0 for both the start and end timestamps indicates that timestamp information could not be collected for the kernel.
-
uint64_t end
The end timestamp for the kernel execution, in ns.
A value of 0 for both the start and end timestamps indicates that timestamp information could not be collected for the kernel.
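The zero-timestamp convention for start and end can be handled as in the sketch below. `kernel_duration_ns` is a hypothetical helper that takes the two field values directly; the check for end preceding start is a defensive assumption of this sketch, not documented CUPTI behavior.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: compute the kernel duration in ns.
 * Returns false when both timestamps are 0, meaning no timestamp
 * information was collected, or when end precedes start. */
static bool kernel_duration_ns(uint64_t start, uint64_t end,
                               uint64_t *duration)
{
    if (start == 0 && end == 0) {
        return false;               /* timestamps not collected */
    }
    if (end < start) {
        return false;               /* inconsistent pair; defensive check */
    }
    *duration = end - start;
    return true;
}
```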
-
uint32_t deviceId
The ID of the device where the kernel is executing.
-
uint32_t contextId
The ID of the context where the kernel is executing.
-
uint32_t streamId
The ID of the stream where the kernel is executing.
-
int32_t gridX
The X-dimension grid size for the kernel.
-
int32_t gridY
The Y-dimension grid size for the kernel.
-
int32_t gridZ
The Z-dimension grid size for the kernel.
-
int32_t blockX
The X-dimension block size for the kernel.
-
int32_t blockY
The Y-dimension block size for the kernel.
-
int32_t blockZ
The Z-dimension block size for the kernel.
-
int32_t staticSharedMemory
The static shared memory allocated for the kernel, in bytes.
-
int32_t dynamicSharedMemory
The dynamic shared memory reserved for the kernel, in bytes.
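The grid, block, and shared memory values can be combined into a launch footprint, as in this minimal sketch. All three helpers and their parameter names are hypothetical; they take the field values directly and widen to 64 bits before multiplying so that large grids do not overflow a 32-bit product.

```c
#include <stdint.h>

/* Hypothetical helper: number of blocks in the grid. */
static int64_t grid_blocks(int32_t gridX, int32_t gridY, int32_t gridZ)
{
    return (int64_t)gridX * gridY * gridZ;
}

/* Hypothetical helper: number of threads in one block. */
static int64_t block_threads(int32_t blockX, int32_t blockY, int32_t blockZ)
{
    return (int64_t)blockX * blockY * blockZ;
}

/* Hypothetical helper: sum of the static and dynamic shared memory
 * values reported for the kernel, in bytes. */
static int64_t shared_bytes(int32_t staticBytes, int32_t dynamicBytes)
{
    return (int64_t)staticBytes + dynamicBytes;
}
```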
-
uint32_t localMemoryPerThread
The amount of local memory reserved for each thread, in bytes.
-
uint32_t localMemoryTotal
The total amount of local memory reserved for the kernel, in bytes.
-
uint32_t correlationId
The correlation ID of the kernel.
Each kernel execution is assigned a unique correlation ID that is identical to the correlation ID in the driver API activity record that launched the kernel.
-
int64_t gridId
The grid ID of the kernel.
Each kernel execution is assigned a unique grid ID.
-
int64_t parentGridId
The grid ID of the parent kernel.
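The gridId and parentGridId values link a child launch to the grid that launched it. The sketch below counts the children of one parent grid. `ExGridLink` is a hypothetical reduced view holding only these two fields (it is not a CUPTI type), and `count_child_grids` is likewise an illustrative helper.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reduced view of a record: only the two grid ID fields,
 * copied out of each CDP kernel record by the caller. */
typedef struct {
    int64_t gridId;
    int64_t parentGridId;
} ExGridLink;

/* Count the records whose parent is the given grid. */
static size_t count_child_grids(const ExGridLink *links, size_t n,
                                int64_t parent)
{
    size_t count = 0;
    for (size_t i = 0; i < n; ++i) {
        if (links[i].parentGridId == parent) {
            ++count;
        }
    }
    return count;
}
```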
-
uint64_t queued
The timestamp when the kernel is queued, in ns.
A value of CUPTI_TIMESTAMP_UNKNOWN indicates that the queued time is unknown.
-
uint64_t submitted
The timestamp when the kernel is submitted to the GPU, in ns.
A value of CUPTI_TIMESTAMP_UNKNOWN indicates that the submission time is unknown.
-
uint64_t completed
The timestamp when the kernel is marked as completed, in ns.
A value of CUPTI_TIMESTAMP_UNKNOWN indicates that the completion time is unknown.
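Intervals between queued, submitted, and completed should only be computed when neither timestamp is unknown. In this minimal sketch, `timestamp_delta_ns` is a hypothetical helper; the sentinel is passed in as a parameter so the sketch does not assume its numeric value, and a caller would pass CUPTI_TIMESTAMP_UNKNOWN from the CUPTI headers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: compute later - earlier in ns.
 * 'unknown' is the sentinel meaning "timestamp not available".
 * Returns false if either timestamp is unknown or the pair is
 * out of order. */
static bool timestamp_delta_ns(uint64_t earlier, uint64_t later,
                               uint64_t unknown, uint64_t *delta)
{
    if (earlier == unknown || later == unknown) {
        return false;               /* at least one time is unknown */
    }
    if (later < earlier) {
        return false;               /* defensive check */
    }
    *delta = later - earlier;
    return true;
}
```

For example, passing queued and submitted gives the time the kernel waited before submission, and passing submitted and completed gives the time from submission to completion.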
-
uint32_t parentBlockX
The X-dimension of the parent block.
-
uint32_t parentBlockY
The Y-dimension of the parent block.
-
uint32_t parentBlockZ
The Z-dimension of the parent block.
-
uint32_t pad
Undefined.
Reserved for internal use.
-
const char *name
The name of the kernel.
This name is shared across all activity records representing the same kernel, and so should not be modified.
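Because the name string is shared and should not be modified, a tool that needs an editable name can work on a private copy instead. `copy_kernel_name` below is a hypothetical helper using only the C standard library; the caller owns the returned buffer and releases it with free().

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return a heap-allocated copy of the kernel name,
 * or NULL if the name is NULL or allocation fails. */
static char *copy_kernel_name(const char *name)
{
    if (name == NULL) {
        return NULL;
    }
    size_t len = strlen(name) + 1;  /* include the terminating NUL */
    char *copy = malloc(len);
    if (copy != NULL) {
        memcpy(copy, name, len);
    }
    return copy;
}
```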
-