2. CUPTI Python API Reference

2.1. Documentation Issues

The CUPTI Python API Reference section of the document is automatically generated and has some issues:

  • All the CUPTI Python enumerations, functions and classes are listed together in a single section.

  • The members of the inner struct/union classes (_py_anon_pod*) are not adequately documented. For more information about a member, refer to the CUPTI C documentation.

  • The kind member of Python classes has type int instead of cupti.cupti.ActivityKind. When using the kind member, use cupti.cupti.ActivityKind to get the enum value.

  • Some parts of the API documentation still cite C enums/data structures instead of mapping them to their Python counterparts.
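Because kind is reported as a plain int, it is usually converted to the enum before use. The following is a minimal sketch of that conversion. It uses a local stand-in enum that mirrors a few of the ActivityKind values listed in this reference so that it runs without CUPTI installed; with the real package, cupti.cupti.ActivityKind would presumably take its place. The record object and the kind_of helper are hypothetical.

```python
from enum import IntEnum
from types import SimpleNamespace


class ActivityKind(IntEnum):
    """Local stand-in mirroring a few cupti.cupti.ActivityKind values."""
    MEMCPY = 1
    KERNEL = 3
    DRIVER = 4
    RUNTIME = 5
    CONCURRENT_KERNEL = 10


def kind_of(record):
    """Convert the raw integer kind of a record into the enum member."""
    return ActivityKind(record.kind)


# Hypothetical record; real records are delivered by CUPTI.
record = SimpleNamespace(kind=10)
kind = kind_of(record)
print(kind.name)
```

Comparing against enum members (for example, kind is ActivityKind.CONCURRENT_KERNEL) keeps the dispatch code readable and avoids hard-coded integers.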

2.2. API Reference

exception cupti.cupti.cuptiError(status: int)

Bases: Exception

class cupti.cupti.ActivityAPI

Bases: object

Empty-initialize an instance of CUpti_ActivityAPI.

cbid

The ID of the driver or runtime function.

Type:

int

correlation_id

The correlation ID of the driver or runtime CUDA function. Each function invocation is assigned a unique correlation ID that is identical to the correlation ID in the memcpy, memset, or kernel activity record that is associated with this function.

Type:

int
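Since the correlation ID of an API record matches the one in the associated memcpy, memset, or kernel record, records can be joined on that field. The sketch below groups hypothetical stand-in records by correlation_id. The label attribute and the group_by_correlation helper are illustrative names and not part of the CUPTI Python API.

```python
from collections import defaultdict
from types import SimpleNamespace


def group_by_correlation(records):
    """Map each correlation_id to the list of records that carry it."""
    groups = defaultdict(list)
    for rec in records:
        groups[rec.correlation_id].append(rec)
    return dict(groups)


# Hypothetical records: one API call and the kernel it launched, plus
# an unrelated API call.
api = SimpleNamespace(label="api", correlation_id=7)
kernel = SimpleNamespace(label="kernel", correlation_id=7)
other = SimpleNamespace(label="api", correlation_id=8)

groups = group_by_correlation([api, kernel, other])
labels = [rec.label for rec in groups[7]]
print(labels)
```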

end

The end timestamp for the function, in ns. A value of 0 for both the start and end timestamps indicates that timestamp information could not be collected for the function.

Type:

int

kind

The activity record kind, must be CUPTI_ACTIVITY_KIND_DRIVER, CUPTI_ACTIVITY_KIND_RUNTIME, or CUPTI_ACTIVITY_KIND_INTERNAL_LAUNCH_API.

Type:

int

process_id

The ID of the process where the driver or runtime CUDA function is executing.

Type:

int

ptr

Get the pointer address to the data as Python int.

return_value

The return value for the function. For a CUDA driver function this will be a CUresult value, and for a CUDA runtime function this will be a cudaError_t value.

Type:

int

start

The start timestamp for the function, in ns. A value of 0 for both the start and end timestamps indicates that timestamp information could not be collected for the function.

Type:

int

thread_id

The ID of the thread where the driver or runtime CUDA function is executing.

Type:

int
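The start and end fields can be combined into a duration, provided the case where both are 0 is handled. This is a minimal sketch on a hypothetical stand-in record. The duration_ns helper is an illustrative name.

```python
from types import SimpleNamespace


def duration_ns(record):
    """Return end - start in ns, or None when both timestamps are 0."""
    if record.start == 0 and record.end == 0:
        return None  # timestamp information could not be collected
    return record.end - record.start


# Hypothetical records standing in for ActivityAPI instances.
timed = SimpleNamespace(start=1_000, end=1_750)
untimed = SimpleNamespace(start=0, end=0)

print(duration_ns(timed), duration_ns(untimed))
```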

class cupti.cupti.ActivityAttribute(value)

Bases: IntEnum

See CUpti_ActivityAttribute.

ATTR_DEVICE_BUFFER_FORCE_INT = 2147483647
ATTR_DEVICE_BUFFER_POOL_LIMIT = 2
ATTR_DEVICE_BUFFER_PRE_ALLOCATE_VALUE = 6
ATTR_DEVICE_BUFFER_SIZE = 0
ATTR_DEVICE_BUFFER_SIZE_CDP = 1
ATTR_DEVICE_BUFFER_SIZE_DEVICE_GRAPHS = 10
ATTR_MEM_ALLOCATION_TYPE_HOST_PINNED = 8
ATTR_PER_THREAD_BUFFER = 9
ATTR_PROFILING_SEMAPHORE_POOL_LIMIT = 4
ATTR_PROFILING_SEMAPHORE_POOL_SIZE = 3
ATTR_PROFILING_SEMAPHORE_PRE_ALLOCATE_VALUE = 7
ATTR_ZEROED_OUT_BUFFER = 5
class cupti.cupti.ActivityAutoBoostState

Bases: object

Empty-initialize an instance of CUpti_ActivityAutoBoostState.

enabled

The returned auto boost state: 1 if auto boost is enabled, 0 otherwise.

Type:

int

pid

The ID of the process that set the current boost state. The value will be CUPTI_AUTO_BOOST_INVALID_CLIENT_PID if the user does not have permission to query process IDs or if there is an error in querying the process ID.

Type:

int

ptr

Get the pointer address to the data as Python int.

class cupti.cupti.ActivityCdpKernel

Bases: object

Empty-initialize an instance of CUpti_ActivityCdpKernel.

block_x

The X-dimension block size for the kernel.

Type:

int

block_y

The Y-dimension block size for the kernel.

Type:

int

block_z

The Z-dimension block size for the kernel.

Type:

int

cache_config

Type:

_py_anon_pod7

completed

The timestamp when the kernel is marked as completed, in ns. A value of CUPTI_TIMESTAMP_UNKNOWN indicates that the completion time is unknown.

Type:

int

context_id

The ID of the context where the kernel is executing.

Type:

int

correlation_id

The correlation ID of the kernel. Each kernel execution is assigned a unique correlation ID that is identical to the correlation ID in the driver API activity record that launched the kernel.

Type:

int

device_id

The ID of the device where the kernel is executing.

Type:

int

dynamic_shared_memory

The dynamic shared memory reserved for the kernel, in bytes.

Type:

int

end

The end timestamp for the kernel execution, in ns. A value of 0 for both the start and end timestamps indicates that timestamp information could not be collected for the kernel.

Type:

int

grid_id

The grid ID of the kernel. Each kernel execution is assigned a unique grid ID.

Type:

int

grid_x

The X-dimension grid size for the kernel.

Type:

int

grid_y

The Y-dimension grid size for the kernel.

Type:

int

grid_z

The Z-dimension grid size for the kernel.

Type:

int

kind

The activity record kind, must be CUPTI_ACTIVITY_KIND_CDP_KERNEL

Type:

int

local_memory_per_thread

The amount of local memory reserved for each thread, in bytes.

Type:

int

local_memory_total

The total amount of local memory reserved for the kernel, in bytes.

Type:

int

name

The name of the kernel. This name is shared across all activity records representing the same kernel, and so should not be modified.

Type:

str

pad

Undefined. Reserved for internal use.

Type:

int

parent_block_x

The X-dimension of the parent block.

Type:

int

parent_block_y

The Y-dimension of the parent block.

Type:

int

parent_block_z

The Z-dimension of the parent block.

Type:

int

parent_grid_id

The grid ID of the parent kernel.

Type:

int

ptr

Get the pointer address to the data as Python int.

queued

The timestamp when the kernel is queued up, in ns. A value of CUPTI_TIMESTAMP_UNKNOWN indicates that the queued time is unknown.

Type:

int

registers_per_thread

The number of registers required for each thread executing the kernel.

Type:

int

shared_memory_config

The shared memory configuration used for the kernel. The value is one of the CUsharedconfig enumeration values from cuda.h.

Type:

int

start

The start timestamp for the kernel execution, in ns. A value of 0 for both the start and end timestamps indicates that timestamp information could not be collected for the kernel.

Type:

int

static_shared_memory

The static shared memory allocated for the kernel, in bytes.

Type:

int

stream_id

The ID of the stream where the kernel is executing.

Type:

int

submitted

The timestamp when the kernel is submitted to the GPU, in ns. A value of CUPTI_TIMESTAMP_UNKNOWN indicates that the submission time is unknown.

Type:

int
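The grid and block dimensions together determine how many threads a kernel launch creates. The sketch below multiplies them out for a hypothetical stand-in record. The threads_launched helper is an illustrative name.

```python
from types import SimpleNamespace


def threads_launched(k):
    """Total threads = blocks in the grid * threads in each block."""
    blocks = k.grid_x * k.grid_y * k.grid_z
    threads_per_block = k.block_x * k.block_y * k.block_z
    return blocks * threads_per_block


# Hypothetical kernel record: a 4 x 2 x 1 grid of 128 x 1 x 1 blocks.
k = SimpleNamespace(grid_x=4, grid_y=2, grid_z=1,
                    block_x=128, block_y=1, block_z=1)
total = threads_launched(k)
print(total)
```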

class cupti.cupti.ActivityComputeApiKind(value)

Bases: IntEnum

See CUpti_ActivityComputeApiKind.

CUDA = 1
CUDA_MPS = 2
FORCE_INT = 2147483647
UNKNOWN = 0

class cupti.cupti.ActivityConfidentialComputeRotation

Bases: object

Empty-initialize an instance of CUpti_ActivityConfidentialComputeRotation.

channel_id

Channel ID

Type:

int

channel_type

The channel type. See CUpti_ChannelType.

Type:

int

context_id

Context ID

Type:

int

device_id

Device ID

Type:

int

event_type

The type of event. See CUpti_ConfidentialComputeRotationEventType.

Type:

int

kind

The activity record kind, must be CUPTI_ACTIVITY_KIND_CONFIDENTIAL_COMPUTE_ROTATION.

Type:

int

ptr

Get the pointer address to the data as Python int.

timestamp

Timestamp in ns

Type:

int

class cupti.cupti.ActivityContext3

Bases: object

Empty-initialize an instance of CUpti_ActivityContext3.

cig_mode

This field indicates the CIG mode

Type:

int

compute_api_kind

The compute API kind. See CUpti_ActivityComputeApiKind.

Type:

int

context_id

The context ID.

Type:

int

device_id

The device ID.

Type:

int

is_green_context

This field indicates whether the context is a green context

Type:

int

kind

The activity record kind, must be CUPTI_ACTIVITY_KIND_CONTEXT.

Type:

int

null_stream_id

The ID for the NULL stream in this context

Type:

int

num_multiprocessors

Number of multiprocessors assigned to the green context. Invalid if the field ‘isGreenContext’ is 0.

Type:

int

padding

Type:

int

padding2

Type:

int

parent_context_id

The ID of the parent context. It is 0 if the context does not have a parent.

Type:

int

ptr

Get the pointer address to the data as Python int.
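Two fields of this record are only meaningful conditionally: num_multiprocessors is invalid unless is_green_context is non-zero, and parent_context_id is 0 when there is no parent. The sketch below encodes both rules for a hypothetical stand-in record. The summarize_context helper is an illustrative name.

```python
from types import SimpleNamespace


def summarize_context(ctx):
    """Return (sm_count, parent_id), using None for fields that do not apply."""
    # num_multiprocessors is only valid for green contexts.
    sm_count = ctx.num_multiprocessors if ctx.is_green_context else None
    # A parent_context_id of 0 means the context has no parent.
    parent_id = ctx.parent_context_id or None
    return sm_count, parent_id


# Hypothetical records standing in for ActivityContext3 instances.
green = SimpleNamespace(is_green_context=1, num_multiprocessors=16,
                        parent_context_id=3)
plain = SimpleNamespace(is_green_context=0, num_multiprocessors=0,
                        parent_context_id=0)

print(summarize_context(green), summarize_context(plain))
```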

class cupti.cupti.ActivityCudaEvent2

Bases: object

Empty-initialize an instance of CUpti_ActivityCudaEvent2.

context_id

The ID of the context where the event was recorded.

Type:

int

correlation_id

The correlation ID of the API to which this result is associated.

Type:

int

cuda_event_sync_id

A unique ID that associates event synchronization records with the latest CUDA Event record. A similar field in CUpti_ActivitySynchronization2 associates the synchronization record with the CUDA Event record. Because the same CUDA event can be used multiple times, the event ID alone cannot correlate a synchronization record with the latest CUDA Event record. This field is unique and can be used for that correlation.

Type:

int

device_id

The ID of the device where the event was recorded.

Type:

int

device_timestamp

The device-side timestamp of the CUDA event record, in ns. Collection of this field is disabled by default. It can be enabled by calling the CUPTI API cuptiActivityEnableCudaEventDeviceTimestamps.

Type:

int

event_id

A unique event ID to identify the event record.

Type:

int

kind

The activity record kind, must be CUPTI_ACTIVITY_KIND_CUDA_EVENT.

Type:

int

pad

Undefined. Reserved for internal use.

Type:

int

pad2

Undefined. Reserved for internal use.

Type:

int

ptr

Get the pointer address to the data as Python int.

reserved0

Undefined. Reserved for internal use.

Type:

int

stream_id

The compute stream where the event was recorded.

Type:

int

class cupti.cupti.ActivityDevice5

Bases: object

Empty-initialize an instance of CUpti_ActivityDevice5.

compute_capability_major

Compute capability for the device, major number.

Type:

int

compute_capability_minor

Compute capability for the device, minor number.

Type:

int

compute_instance_id

Compute instance ID for MIG-enabled devices. If MIG mode is disabled, the value is set to UINT32_MAX.

Type:

int

constant_memory_size

The amount of constant memory on the device, in bytes.

Type:

int

core_clock_rate

The core clock rate of the device, in kHz.

Type:

int

ecc_enabled

ECC enabled flag for the device.

Type:

int

flags_

The flags associated with the device. See CUpti_ActivityFlag.

Type:

int

global_memory_bandwidth

The global memory bandwidth available on the device, in kBytes/sec.

Type:

int

global_memory_size

The amount of global memory on the device, in bytes.

Type:

int

gpu_instance_id

GPU instance ID for MIG-enabled devices. If MIG mode is disabled, the value is set to UINT32_MAX.

Type:

int
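The compute_instance_id and gpu_instance_id fields use UINT32_MAX as a sentinel when MIG mode is disabled. The sketch below maps the sentinel to None for a hypothetical stand-in record. The mig_ids helper is an illustrative name, and UINT32_MAX is defined locally as the largest unsigned 32-bit value.

```python
from types import SimpleNamespace

UINT32_MAX = 0xFFFFFFFF  # largest unsigned 32-bit value


def mig_ids(device):
    """Return (gpu_instance_id, compute_instance_id), or None for the sentinel."""
    def clean(value):
        return None if value == UINT32_MAX else value
    return clean(device.gpu_instance_id), clean(device.compute_instance_id)


# Hypothetical records standing in for ActivityDevice5 instances.
mig_device = SimpleNamespace(gpu_instance_id=2, compute_instance_id=0)
plain_device = SimpleNamespace(gpu_instance_id=UINT32_MAX,
                               compute_instance_id=UINT32_MAX)

print(mig_ids(mig_device), mig_ids(plain_device))
```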

id

The device ID.

Type:

int

is_cuda_visible

Flag to indicate whether the device is visible to CUDA. Users can set the device visibility using the CUDA_VISIBLE_DEVICES environment variable.

Type:

int

is_mig_enabled

MIG enabled flag for the device.

Type:

int

is_numa_node

NUMA (non-uniform memory access) information for the device: indicates whether the GPU is a NUMA node.

Type:

int

kind

The activity record kind, must be CUPTI_ACTIVITY_KIND_DEVICE.

Type:

int

l2cache_size

The size of the L2 cache on the device, in bytes.

Type:

int

max_block_dim_x

Maximum allowed X dimension for a block.

Type:

int

max_block_dim_y

Maximum allowed Y dimension for a block.

Type:

int

max_block_dim_z

Maximum allowed Z dimension for a block.

Type:

int

max_blocks_per_multiprocessor

Maximum number of blocks that can be present on a multiprocessor at any given time.

Type:

int

max_grid_dim_x

Maximum allowed X dimension for a grid.

Type:

int

max_grid_dim_y

Maximum allowed Y dimension for a grid.

Type:

int

max_grid_dim_z

Maximum allowed Z dimension for a grid.

Type:

int

max_ipc

The maximum “instructions per cycle” possible on each device multiprocessor.

Type:

int

max_registers_per_block

Maximum number of registers that can be allocated to a block.

Type:

int

max_registers_per_multiprocessor

Maximum number of 32-bit registers available per multiprocessor.

Type:

int

max_shared_memory_per_block

Maximum amount of shared memory that can be assigned to a block, in bytes.

Type:

int

max_shared_memory_per_multiprocessor

Maximum amount of shared memory available per multiprocessor, in bytes.

Type:

int

max_threads_per_block

Maximum number of threads allowed in a block.

Type:

int

max_warps_per_multiprocessor

Maximum number of warps that can be present on a multiprocessor at any given time.

Type:

int

mig_uuid

The MIG UUID. This value is the globally unique immutable alphanumeric identifier of the device.

name

The device name. The client is responsible for freeing this memory using the free function when done.

Type:

str

num_memcpy_engines

Number of memory copy engines on the device.

Type:

int

num_multiprocessors

Number of multiprocessors on the device.

Type:

int

num_threads_per_warp

The number of threads per warp on the device.

Type:

int

numa_id

NUMA (non-uniform memory access) information for the device: the NUMA node ID of the GPU memory. If the GPU is not a NUMA node, the value is invalidNumaId.

Type:

int

ptr

Get the pointer address to the data as Python int.

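uuid

The device UUID. This value is the globally unique immutable alphanumeric identifier of the device.

class cupti.cupti.ActivityDeviceAttribute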

Bases: object

Empty-initialize an instance of CUpti_ActivityDeviceAttribute.

attribute

The attribute, either a CUpti_DeviceAttribute or CUdevice_attribute. The flag CUPTI_ACTIVITY_FLAG_DEVICE_ATTRIBUTE_CUDEVICE indicates which kind of attribute this is. If CUPTI_ACTIVITY_FLAG_DEVICE_ATTRIBUTE_CUDEVICE is 1, the CUdevice_attribute field is valid; otherwise the CUpti_DeviceAttribute field is valid.

Type:

_py_anon_pod9

device_id

The ID of the device that this attribute applies to.

Type:

int

flags_

The flags associated with the device. See CUpti_ActivityFlag.

Type:

int

kind

The activity record kind, must be CUPTI_ACTIVITY_KIND_DEVICE_ATTRIBUTE.

Type:

int

ptr

Get the pointer address to the data as Python int.

value

The value for the attribute. See CUpti_DeviceAttribute and CUdevice_attribute for the type of the value for a given attribute.

Type:

_py_anon_pod10
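Which member of the attribute union applies depends on the DEVICE_ATTRIBUTE_CUDEVICE bit in flags_. The sketch below shows that check on a hypothetical stand-in record. The flag value 1 is taken from the ActivityFlag listing in this reference, and the attribute_namespace helper is an illustrative name. How the union members are accessed on a real record is not shown, because the inner _py_anon_pod* classes are not documented here.

```python
from types import SimpleNamespace

# Value of ActivityFlag.DEVICE_ATTRIBUTE_CUDEVICE as listed in this reference.
DEVICE_ATTRIBUTE_CUDEVICE = 1


def attribute_namespace(record):
    """Name the C enumeration that the record's attribute belongs to."""
    if record.flags_ & DEVICE_ATTRIBUTE_CUDEVICE:
        return "CUdevice_attribute"
    return "CUpti_DeviceAttribute"


# Hypothetical records standing in for ActivityDeviceAttribute instances.
cuda_attr = SimpleNamespace(flags_=1)
cupti_attr = SimpleNamespace(flags_=0)

print(attribute_namespace(cuda_attr), attribute_namespace(cupti_attr))
```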

class cupti.cupti.ActivityDeviceGraphTrace

Bases: object

Empty-initialize an instance of CUpti_ActivityDeviceGraphTrace.

context_id

The ID of the context where the first node of the graph is executed.

Type:

int

device_id

The ID of the device where the first node of the graph is executed.

Type:

int

device_launch_mode

The type of launch. See CUpti_DeviceGraphLaunchMode.

Type:

int

end

The end timestamp for the graph execution, in ns. A value of 0 for both the start and end timestamps indicates that timestamp information could not be collected for the graph.

Type:

int

graph_id

The unique ID of the graph that is launched.

Type:

int

kind

The activity record kind, must be CUPTI_ACTIVITY_KIND_DEVICE_GRAPH_TRACE

Type:

int

launcher_graph_id

The unique ID of the graph that has launched this graph.

Type:

int

ptr

Get the pointer address to the data as Python int.

start

The start timestamp for the graph execution, in ns. A value of 0 for both the start and end timestamps indicates that timestamp information could not be collected for the graph.

Type:

int

stream_id

The ID of the stream where the graph is being launched.

Type:

int

class cupti.cupti.ActivityEnvironment

Bases: object

Empty-initialize an instance of CUpti_ActivityEnvironment.

data_

Type:

_py_anon_pod11

device_id

The ID of the device

Type:

int

environment_kind

The kind of data reported in this record.

Type:

int

kind

The activity record kind, must be CUPTI_ACTIVITY_KIND_ENVIRONMENT.

Type:

int

ptr

Get the pointer address to the data as Python int.

timestamp

The timestamp when this sample was retrieved, in ns. A value of 0 indicates that timestamp information could not be collected for the sample.

Type:

int

class cupti.cupti.ActivityEnvironmentKind(value)

Bases: IntEnum

See CUpti_ActivityEnvironmentKind.

COOLING = 4
COUNT = 5
FORCE_INT = 2147483647
POWER = 3
SPEED = 1
TEMPERATURE = 2
UNKNOWN = 0

class cupti.cupti.ActivityExternalCorrelation

Bases: object

Empty-initialize an instance of CUpti_ActivityExternalCorrelation.

correlation_id

The correlation ID of the associated CUDA driver or runtime API record.

Type:

int

external_id

The correlation ID of the associated non-CUDA API record. The exact field in the associated external record depends on that record’s activity kind (externalKind).

Type:

int

external_kind

The kind of external API this record correlated to.

Type:

int

kind

The kind of this activity.

Type:

int

ptr

Get the pointer address to the data as Python int.

class cupti.cupti.ActivityFlag(value)

Bases: IntEnum

See CUpti_ActivityFlag.

DEVICE_ATTRIBUTE_CUDEVICE = 1
DEVICE_CONCURRENT_KERNELS = 1
FLUSH_FORCED = 1
FORCE_INT = 2147483647
GLOBAL_ACCESS_KIND_CACHED = 512
GLOBAL_ACCESS_KIND_LOAD = 256
GLOBAL_ACCESS_KIND_SIZE_MASK = 255
INSTRUCTION_CLASS_MASK = 510
INSTRUCTION_VALUE_INVALID = 1
MARKER_COLOR_ARGB = 2
MARKER_COLOR_NONE = 1
MARKER_INSTANTANEOUS = 1
MARKER_START = 2
MARKER_SYNC_ACQUIRE = 8
MARKER_SYNC_ACQUIRE_FAILED = 32
MARKER_SYNC_ACQUIRE_SUCCESS = 16
MARKER_SYNC_RELEASE = 64
MEMCPY_ASYNC = 1
MEMSET_ASYNC = 1
METRIC_OVERFLOWED = 1
METRIC_VALUE_INVALID = 2
NONE = 0
SHARED_ACCESS_KIND_LOAD = 256
SHARED_ACCESS_KIND_SIZE_MASK = 255
THRASHING_IN_CPU = 1
THROTTLING_IN_CPU = 1
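
Several members in this listing share a value (for example, many are 1), and some members are bit masks, so a flags_ value has to be interpreted in the context of its record kind. The sketch below uses a local stand-in enum mirroring a few of the values above. Reading the bits selected by GLOBAL_ACCESS_KIND_SIZE_MASK as an access size is an assumption based on the member names; the CUPTI C documentation is the authority. The decode_global_access helper is an illustrative name.

```python
from enum import IntEnum


class ActivityFlag(IntEnum):
    """Local stand-in mirroring a few cupti.cupti.ActivityFlag values."""
    NONE = 0
    DEVICE_ATTRIBUTE_CUDEVICE = 1
    MEMCPY_ASYNC = 1  # same value, so this becomes an alias of the member above
    GLOBAL_ACCESS_KIND_SIZE_MASK = 255
    GLOBAL_ACCESS_KIND_LOAD = 256
    GLOBAL_ACCESS_KIND_CACHED = 512


def decode_global_access(flags):
    """Split the flags of a global-access record into its components."""
    return {
        "size": flags & ActivityFlag.GLOBAL_ACCESS_KIND_SIZE_MASK,
        "is_load": bool(flags & ActivityFlag.GLOBAL_ACCESS_KIND_LOAD),
        "is_cached": bool(flags & ActivityFlag.GLOBAL_ACCESS_KIND_CACHED),
    }


decoded = decode_global_access(256 | 4)
# Members that share a value are aliases: a lookup by value yields the
# first member defined with that value.
first_name = ActivityFlag(1).name
print(decoded, first_name)
```

Because of the aliasing, testing individual bits with the named member that matches the record kind is more reliable than converting a raw flags_ value to an enum member and reading its name.
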
class cupti.cupti.ActivityFunction

Bases: object

Empty-initialize an instance of CUpti_ActivityFunction.

context_id

The ID of the context where the function is launched.

Type:

int

function_ind_ex

The function’s unique symbol index in the module.

Type:

int

id

The ID that uniquely identifies the record.

Type:

int

kind

The activity record kind, must be CUPTI_ACTIVITY_KIND_FUNCTION.

Type:

int

module_id

The module ID in which this global/device function is present.

Type:

int

name

The name of the function. This name is shared across all activity records representing the same kernel, and so should not be modified.

Type:

str

pad

Undefined. Reserved for internal use.

Type:

int

ptr

Get the pointer address to the data as Python int.

class cupti.cupti.ActivityGraphTrace2

Bases: object

Empty-initialize an instance of CUpti_ActivityGraphTrace2.

context_id

The ID of the context where the first node of the graph is executed. If this is INT_MAX, then the start is on the host.

Type:

int

correlation_id

The correlation ID of the graph launch. Each graph launch is assigned a unique correlation ID that is identical to the correlation ID in the driver API activity record that launched the graph.

Type:

int

device_id

The ID of the device where the first node of the graph is executed. If this is INT_MAX, then the start is on the host.

Type:

int

end

The end timestamp for the graph execution, in ns. A value of 0 for both the start and end timestamps indicates that timestamp information could not be collected for the graph.

Type:

int

end_context_id

The ID of the context where the last node of the graph is executed.

Type:

int

end_device_id

The ID of the device where the last node of the graph is executed.

Type:

int

graph_id

The unique ID of the graph that is launched.

Type:

int

kind

The activity record kind, must be CUPTI_ACTIVITY_KIND_GRAPH_TRACE

Type:

int

ptr

Get the pointer address to the data as Python int.

start

The start timestamp for the graph execution, in ns. A value of 0 for both the start and end timestamps indicates that timestamp information could not be collected for the graph.

Type:

int

stream_id

The ID of the stream where the graph is being launched.

Type:

int
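The context_id and device_id fields use INT_MAX to signal that the graph starts on the host. The sketch below checks for that sentinel on a hypothetical stand-in record. INT_MAX is assumed to be the 32-bit value 2**31 - 1, which holds on common platforms, and the starts_on_host helper is an illustrative name.

```python
from types import SimpleNamespace

INT_MAX = 2**31 - 1  # assumed value of the C INT_MAX constant


def starts_on_host(trace):
    """True when either start field carries the INT_MAX sentinel."""
    return INT_MAX in (trace.device_id, trace.context_id)


# Hypothetical records standing in for ActivityGraphTrace2 instances.
host_start = SimpleNamespace(device_id=INT_MAX, context_id=INT_MAX)
gpu_start = SimpleNamespace(device_id=0, context_id=1)

print(starts_on_host(host_start), starts_on_host(gpu_start))
```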

class cupti.cupti.ActivityInstructionClass(value)

Bases: IntEnum

See CUpti_ActivityInstructionClass.

BARRIER = 17
BIT_CONVERSION = 4
CONSTANT = 11
CONTROL_FLOW = 5
FP_16 = 19
FP_32 = 1
FP_64 = 2
GENERIC = 9
GLOBAL = 6
GLOBAL_ATOMIC = 13
INTEGER = 3
INTER_THREAD_COMMUNICATION = 16
KIND_FORCE_INT = 2147483647
LOCAL = 8
MISCELLANEOUS = 18
SHARED = 7
SHARED_ATOMIC = 14
SURFACE = 10
SURFACE_ATOMIC = 15
TEXTURE = 12
UNIFORM = 20
UNKNOWN = 0

class cupti.cupti.ActivityJit2

Bases: object

Empty-initialize an instance of CUpti_ActivityJit2.

cache_path

The path where the fat binary is cached.

Type:

str

cache_size

The size of compute cache.

Type:

int

correlation_id

The correlation ID of the JIT operation to which the records belong. Each JIT operation is assigned a unique correlation ID that is identical to the correlation ID in the driver or runtime API activity record that launched the JIT operation.

Type:

int

device_id

The device ID.

Type:

int

end

The end timestamp for the JIT operation, in ns. A value of 0 for both the start and end timestamps indicates that timestamp information could not be collected for the JIT operation.

Type:

int

jit_entry_type

The JIT entry type.

Type:

int

jit_operation_correlation_id

The correlation ID used to correlate JIT compilation, load, and store operations. Each JIT compilation unit is assigned a unique correlation ID at the time of the JIT compilation. This correlation ID can be used to find the matching JIT cache load/store records.

Type:

int

jit_operation_type

The JIT operation type.

Type:

int

kind

The activity record kind, must be CUPTI_ACTIVITY_KIND_JIT.

Type:

int

padding

Internal use.

Type:

int

process_id

The ID of the process where the JIT operation is executing.

Type:

int

ptr

Get the pointer address to the data as Python int.

start

The start timestamp for the JIT operation, in ns. A value of 0 for both the start and end timestamps indicates that timestamp information could not be collected for the JIT operation.

Type:

int

thread_id

The ID of the thread where the JIT operation is executing.

Type:

int
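The jit_operation_correlation_id field ties a compilation to its matching cache load/store records. The sketch below groups hypothetical stand-in records by that field and names each operation using a local stand-in enum that mirrors the ActivityJitOperationType values listed in this reference. The operations_by_unit helper is an illustrative name.

```python
from collections import defaultdict
from enum import IntEnum
from types import SimpleNamespace


class ActivityJitOperationType(IntEnum):
    """Local stand-in mirroring cupti.cupti.ActivityJitOperationType."""
    INVALID = 0
    CACHE_LOAD = 1
    CACHE_STORE = 2
    COMPILE = 3


def operations_by_unit(records):
    """Group JIT operation names by jit_operation_correlation_id."""
    units = defaultdict(list)
    for rec in records:
        op = ActivityJitOperationType(rec.jit_operation_type)
        units[rec.jit_operation_correlation_id].append(op.name)
    return dict(units)


# Hypothetical records: unit 1 is compiled then stored, unit 2 is loaded.
records = [
    SimpleNamespace(jit_operation_correlation_id=1, jit_operation_type=3),
    SimpleNamespace(jit_operation_correlation_id=1, jit_operation_type=2),
    SimpleNamespace(jit_operation_correlation_id=2, jit_operation_type=1),
]
units = operations_by_unit(records)
print(units)
```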

class cupti.cupti.ActivityJitEntryType(value)

Bases: IntEnum

See CUpti_ActivityJitEntryType.

FORCE_INT = 2147483647
INVALID = 0
NVVM_IR_TO_PTX = 2
PTX_TO_CUBIN = 1

class cupti.cupti.ActivityJitOperationType(value)

Bases: IntEnum

See CUpti_ActivityJitOperationType.

CACHE_LOAD = 1
CACHE_STORE = 2
COMPILE = 3
FORCE_INT = 2147483647
INVALID = 0

class cupti.cupti.ActivityKernel10

Bases: object

Empty-initialize an instance of CUpti_ActivityKernel10.

block_x

The X-dimension block size for the kernel.

Type:

int

block_y

The Y-dimension block size for the kernel.

Type:

int

block_z

The Z-dimension block size for the kernel.

Type:

int

cache_config

For devices with compute capability 7.5 and higher, the cacheConfig values are not updated if the field isSharedMemoryCarveoutRequested is set.

Type:

_py_anon_pod24

channel_id

The ID of the HW channel on which the kernel is launched.

Type:

int

channel_type

The type of the channel.

Type:

int

cluster_scheduling_policy

The cluster scheduling policy for the kernel. Refer to CUclusterSchedulingPolicy. This field is valid for devices with compute capability 9.0 and higher.

Type:

int

cluster_x

The X-dimension cluster size for the kernel. This field is valid for devices with compute capability 9.0 and higher.

Type:

int

cluster_y

The Y-dimension cluster size for the kernel. This field is valid for devices with compute capability 9.0 and higher.

Type:

int

cluster_z

The Z-dimension cluster size for the kernel. This field is valid for devices with compute capability 9.0 and higher.

Type:

int

completed

The completed timestamp for the kernel execution, in ns. It represents the completion of all its child kernels and the kernel itself. A value of CUPTI_TIMESTAMP_UNKNOWN indicates that the completion time is unknown.

Type:

int

context_id

The ID of the context where the kernel is executing.

Type:

int

correlation_id

The correlation ID of the kernel. Each kernel execution is assigned a unique correlation ID that is identical to the correlation ID in the driver or runtime API activity record that launched the kernel.

Type:

int

device_id

The ID of the device where the kernel is executing.

Type:

int

dynamic_shared_memory

The dynamic shared memory reserved for the kernel, in bytes.

Type:

int

end

The end timestamp for the kernel execution, in ns. A value of 0 for both the start and end timestamps indicates that timestamp information could not be collected for the kernel.

Type:

int

graph_id

The unique ID of the graph that launched this kernel through graph launch APIs. This field will be 0 if the kernel is not launched through graph launch APIs.

Type:

int

graph_node_id

The unique ID of the graph node that launched this kernel through graph launch APIs. This field will be 0 if the kernel is not launched through graph launch APIs.

Type:

int

grid_id

The grid ID of the kernel. Each kernel is assigned a unique grid ID at runtime.

Type:

int

grid_x

The X-dimension grid size for the kernel.

Type:

int

grid_y

The Y-dimension grid size for the kernel.

Type:

int

grid_z

The Z-dimension grid size for the kernel.

Type:

int

is_device_launched

This field is set to 1 if the kernel is part of a device launched graph.

Type:

int

is_shared_memory_carveout_requested

This indicates whether CU_FUNC_ATTRIBUTE_PREFERRED_SHARED_MEMORY_CARVEOUT was updated for the kernel launch.

Type:

int

kind

The activity record kind, must be CUPTI_ACTIVITY_KIND_KERNEL or CUPTI_ACTIVITY_KIND_CONCURRENT_KERNEL.

Type:

int

launch_type

This indicates whether the kernel was executed via a regular launch or via a single/multi device cooperative launch. See CUpti_ActivityLaunchType.

Type:

int

local_memory_per_thread

The amount of local memory reserved for each thread, in bytes.

Type:

int

local_memory_total

The total amount of local memory reserved for the kernel, in bytes (deprecated in CUDA 11.8). Refer to the field localMemoryTotal_v2.

Type:

int

local_memory_total_v2

The total amount of local memory reserved for the kernel, in bytes.

Type:

int

max_active_clusters

The maximum number of clusters that could co-exist on the target device for the kernel.

Type:

int

max_potential_cluster_size

The maximum cluster size for the kernel

Type:

int

name

The name of the kernel. This name is shared across all activity records representing the same kernel, and so should not be modified.

Type:

str

p_access_policy_window

The pointer to the access policy window. The structure CUaccessPolicyWindow is defined in cuda.h.

Type:

int

padding

Undefined. Reserved for internal use.

Type:

int

padding3

(array of length 7).

Type:

uint8

partitioned_global_cache_executed

The partitioned global caching executed for the kernel. Partitioned global caching is required to enable caching on certain chips, such as devices with compute capability 5.2. Partitioned global caching can be automatically disabled if the occupancy requirement of the launch cannot support caching.

Type:

int

partitioned_global_cache_requested

The partitioned global caching requested for the kernel. Partitioned global caching is required to enable caching on certain chips, such as devices with compute capability 5.2.

Type:

int

ptr

Get the pointer address to the data as Python int.

queued

The timestamp when the kernel is queued up in the command buffer, in ns. A value of CUPTI_TIMESTAMP_UNKNOWN indicates that the queued time could not be collected for the kernel. This timestamp is not collected by default. Use API cuptiActivityEnableLatencyTimestamps() to enable collection. The command buffer is a buffer written by the CUDA driver to send commands such as kernel launches and memory copies to the GPU. All launches of CUDA kernels are asynchronous with respect to the host: the host requests the launch by writing commands into the command buffer, then returns without checking the GPU’s progress.

Type:

int

registers_per_thread

The number of registers required for each thread executing the kernel.

Type:

int

reserved0

Undefined. Reserved for internal use.

Type:

int

shared_memory_carveout_requested

Shared memory carveout value requested for the function in percentage of the total resource. The value will be updated only if field isSharedMemoryCarveoutRequested is set.

Type:

int

shared_memory_config

The shared memory configuration used for the kernel. The value is one of the CUsharedconfig enumeration values from cuda.h.

Type:

int

shared_memory_executed

Shared memory size set by the driver.

Type:

int

shmem_limit_config

The shared memory limit config for the kernel. This field shows whether user has opted for a higher per block limit of dynamic shared memory.

Type:

int

start

The start timestamp for the kernel execution, in ns. A value of 0 for both the start and end timestamps indicates that timestamp information could not be collected for the kernel.

Type:

int

static_shared_memory

The static shared memory allocated for the kernel, in bytes.

Type:

int

stream_id

The ID of the stream where the kernel is executing.

Type:

int

submitted

The timestamp when the command buffer containing the kernel launch is submitted to the GPU, in ns. A value of CUPTI_TIMESTAMP_UNKNOWN indicates that the submitted time could not be collected for the kernel. This timestamp is not collected by default. Use API cuptiActivityEnableLatencyTimestamps() to enable collection.

Type:

int
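When latency timestamps are enabled, the queued, submitted, and start fields describe the stages a launch passes through before it executes. The sketch below derives the two intermediate latencies for a hypothetical stand-in record. The value 0 for CUPTI_TIMESTAMP_UNKNOWN is an assumption that should be checked against the CUPTI C headers, and the launch_latencies helper is an illustrative name.

```python
from types import SimpleNamespace

CUPTI_TIMESTAMP_UNKNOWN = 0  # assumed value; verify against the CUPTI C headers


def launch_latencies(k):
    """Return (queued->submitted, submitted->start) in ns, None where unknown."""
    def delta(later, earlier):
        if CUPTI_TIMESTAMP_UNKNOWN in (later, earlier):
            return None
        return later - earlier
    return delta(k.submitted, k.queued), delta(k.start, k.submitted)


# Hypothetical records standing in for ActivityKernel10 instances.
full = SimpleNamespace(queued=100, submitted=150, start=400)
no_queue = SimpleNamespace(queued=0, submitted=150, start=400)

print(launch_latencies(full), launch_latencies(no_queue))
```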

class cupti.cupti.ActivityKind(value)

Bases: IntEnum

See CUpti_ActivityKind.

BRANCH = 16
CDP_KERNEL = 18
CONCURRENT_KERNEL = 10
CONTEXT = 9
COUNT = 56
CUDA_EVENT = 36
DEVICE = 8
DEVICE_ATTRIBUTE = 28
DEVICE_GRAPH_TRACE = 53
DRIVER = 4
ENVIRONMENT = 20
EVENT = 6
EVENT_INSTANCE = 21
EXTERNAL_CORRELATION = 39
FORCE_INT = 2147483647
FUNCTION = 26
GLOBAL_ACCESS = 15
GRAPH_TRACE = 51
INSTANTANEOUS_EVENT = 41
INSTANTANEOUS_EVENT_INSTANCE = 42
INSTANTANEOUS_METRIC = 43
INSTANTANEOUS_METRIC_INSTANCE = 44
INSTRUCTION_CORRELATION = 32
INSTRUCTION_EXECUTION = 24
INTERNAL_LAUNCH_API = 48
INVALID = 0
JIT = 52
KERNEL = 3
MARKER = 12
MARKER_DATA = 13
MEMCPY = 1
MEMCPY2 = 22
MEMORY = 45
MEMORY2 = 49
MEMORY_POOL = 50
MEMSET = 2
MEM_DECOMPRESS = 54
METRIC = 7
METRIC_INSTANCE = 23
MODULE = 27
NAME = 11
OPENACC_DATA = 33
OPENACC_LAUNCH = 34
OPENACC_OTHER = 35
OPENMP = 47
OVERHEAD = 17
PCIE = 46
PC_SAMPLING = 30
PC_SAMPLING_RECORD_INFO = 31
PREEMPTION = 19
ROTATION = 55
RUNTIME = 5
SHARED_ACCESS = 29
SOURCE_LOCATOR = 14
STREAM = 37
SYNCHRONIZATION = 38
UNIFIED_MEMORY_COUNTER = 25

class cupti.cupti.ActivityLaunchType(value)

Bases: IntEnum

See CUpti_ActivityLaunchType.

CBL_COMMANDLIST = 3
COOPERATIVE_MULTI_DEVICE = 2
COOPERATIVE_SINGLE_DEVICE = 1
REGULAR = 0

class cupti.cupti.ActivityMarker2

Bases: object

Empty-initialize an instance of CUpti_ActivityMarker2.

domain

The name of the domain to which this marker belongs. This will be NULL for the default domain.

Type:

str

flags_

The flags associated with the marker. See CUpti_ActivityFlag.

Type:

int

id

The marker ID.

Type:

int

kind

The activity record kind, must be CUPTI_ACTIVITY_KIND_MARKER.

Type:

int

name

The marker name for an instantaneous or start marker. This will be NULL for an end marker.

Type:

str

object_id

The identifier for the activity object associated with this marker. ‘objectKind’ indicates which ID is valid for this record.

Type:

ActivityObjectKindId

object_kind

The kind of activity object associated with this marker.

Type:

int

pad

Undefined. Reserved for internal use.

Type:

int

ptr

Get the pointer address to the data as Python int.

timestamp

The timestamp for the marker, in ns. A value of 0 indicates that timestamp information could not be collected for the marker.

Type:

int

class cupti.cupti.ActivityMarkerData

Bases: object

Empty-initialize an instance of CUpti_ActivityMarkerData.

category

The category for the marker.

Type:

int

color

The color for the marker.

Type:

int

flags_

The flags associated with the marker. See CUpti_ActivityFlag.

Type:

int

id

The marker ID.

Type:

int

kind

The activity record kind, must be CUPTI_ACTIVITY_KIND_MARKER_DATA.

Type:

int

payload

The payload value.

Type:

MetricValue

payload_kind

Defines the payload format for the value associated with the marker.

Type:

int

ptr

Get the pointer address to the data as Python int.

class cupti.cupti.ActivityMemDecompress

Bases: object

Empty-initialize an instance of CUpti_ActivityMemDecompress.

channel_id

The ID of the HW channel on which the memory copy is occurring.

Type:

int

channel_type

The type of the channel

Type:

int

context_id

The ID of the context.

Type:

int

correlation_id

The correlation ID of the decompression operations. Each operation is assigned a unique correlation ID that is identical to the correlation ID in the driver API activity record that launched the operation.

Type:

int

device_id

The ID of the device.

Type:

int

end

The end timestamp. A value of CUPTI_TIMESTAMP_UNKNOWN indicates that the end time is unknown.

Type:

int

kind

The activity record kind, must be CUPTI_ACTIVITY_KIND_MEM_DECOMPRESS

Type:

int

number_of_operations

The number of operations in the batch.

Type:

int

ptr

Get the pointer address to the data as Python int.

reserved0

This field is reserved for internal use

Type:

int

source_bytes

The number of bytes to be read and decompressed in the batch operation.

Type:

int

start

The start timestamp. A value of CUPTI_TIMESTAMP_UNKNOWN indicates that the start time is unknown.

Type:

int

stream_id

The ID of the stream.

Type:

int

class cupti.cupti.ActivityMemcpy6

Bases: object

Empty-initialize an instance of CUpti_ActivityMemcpy6.

bytes

The number of bytes transferred by the memory copy.

Type:

int

channel_id

The ID of the HW channel on which the memory copy is occurring.

Type:

int

channel_type

The type of the channel

Type:

int

context_id

The ID of the context where the memory copy is occurring.

Type:

int

copy_count

The total number of memcopy operations traced in this record. This field is valid for memcpy operations happening using MemcpyBatchAsync APIs in CUDA. In MemcpyBatchAsync APIs, multiple memcpy operations are batched together for optimization purposes based on certain heuristics. For other memcpy operations, this field will be 1.

Type:

int

copy_kind

The kind of the memory copy, stored as a byte to reduce record size. CUpti_ActivityMemcpyKind

Type:

int

correlation_id

The correlation ID of the memory copy. Each memory copy is assigned a unique correlation ID that is identical to the correlation ID in the driver API activity record that launched the memory copy.

Type:

int

device_id

The ID of the device where the memory copy is occurring.

Type:

int

dst_kind

The destination memory kind written by the memory copy, stored as a byte to reduce record size. See CUpti_ActivityMemoryKind.

Type:

int

end

The end timestamp for the memory copy, in ns. A value of 0 for both the start and end timestamps indicates that timestamp information could not be collected for the memory copy.

Type:

int

flags_

The flags associated with the memory copy. CUpti_ActivityFlag

Type:

int

graph_id

The unique ID of the graph that executed this memcpy through graph launch. This field will be 0 if the memcpy is not done through graph launch.

Type:

int

graph_node_id

The unique ID of the graph node that executed this memcpy through graph launch. This field will be 0 if the memcpy is not done through graph launch.

Type:

int

is_device_launched

This field is used to indicate if the memcpy operation is part of a device graph launch.

Type:

int

kind

The activity record kind, must be CUPTI_ACTIVITY_KIND_MEMCPY.

Type:

int

pad

Undefined. Reserved for internal use.

Type:

int

pad2

(array of length 3). Reserved for internal use.

Type:

uint8

ptr

Get the pointer address to the data as Python int.

reserved0

Undefined. Reserved for internal use.

Type:

int

runtime_correlation_id

The runtime correlation ID of the memory copy. Each memory copy is assigned a unique runtime correlation ID that is identical to the correlation ID in the runtime API activity record that launched the memory copy.

Type:

int

src_kind

The source memory kind read by the memory copy, stored as a byte to reduce record size. CUpti_ActivityMemoryKind

Type:

int

start

The start timestamp for the memory copy, in ns. A value of 0 for both the start and end timestamps indicates that timestamp information could not be collected for the memory copy.

Type:

int

stream_id

The ID of the stream where the memory copy is occurring.

Type:

int
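
The start, end, and bytes members are enough to estimate the effective throughput of a copy. The sketch below is illustrative only: it uses a types.SimpleNamespace object as a stand-in for an ActivityMemcpy6 record, and the helper name memcpy_throughput_gb_s is hypothetical. It applies the documented rule that 0 for both timestamps means no timing was collected.

```python
from types import SimpleNamespace


def memcpy_throughput_gb_s(record):
    """Return GB/s for a memcpy-like record, or None if timing is unavailable."""
    # Documented rule: start == end == 0 means timestamps were not collected.
    if record.start == 0 and record.end == 0:
        return None
    duration_ns = record.end - record.start
    if duration_ns <= 0:
        return None
    # Bytes per nanosecond is numerically equal to GB/s (1e9 B over 1e9 ns).
    return record.bytes / duration_ns


# Stand-in record: 4000 bytes copied in 2000 ns.
rec = SimpleNamespace(start=1_000, end=3_000, bytes=4_000)
print(memcpy_throughput_gb_s(rec))
```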

class cupti.cupti.ActivityMemcpyKind(value)

Bases: IntEnum

See CUpti_ActivityMemcpyKind.

ATOA = 5
ATOD = 6
ATOH = 4
DTOA = 7
DTOD = 8
DTOH = 2
FORCE_INT = 2147483647
HTOA = 3
HTOD = 1
HTOH = 9
PTOP = 10
UNKNOWN = 0
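
The copy_kind member of the memcpy records is an int, so grouping records by direction needs a conversion back to this enum. The following sketch totals bytes per copy kind. The enum is a local stand-in mirroring the values listed above (FORCE_INT omitted), the input is assumed to be (copy_kind, bytes) pairs already extracted from records, and bytes_by_copy_kind is a hypothetical helper name.

```python
from collections import Counter
from enum import IntEnum


# Stand-in mirroring cupti.cupti.ActivityMemcpyKind as listed above.
class ActivityMemcpyKind(IntEnum):
    UNKNOWN = 0
    HTOD = 1
    DTOH = 2
    HTOA = 3
    ATOH = 4
    ATOA = 5
    ATOD = 6
    DTOA = 7
    DTOD = 8
    HTOH = 9
    PTOP = 10


def bytes_by_copy_kind(pairs):
    """Sum bytes per copy kind from (copy_kind, bytes) pairs."""
    totals = Counter()
    for copy_kind, nbytes in pairs:
        totals[ActivityMemcpyKind(copy_kind).name] += nbytes
    return dict(totals)


print(bytes_by_copy_kind([(1, 100), (2, 50), (1, 25)]))
```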
class cupti.cupti.ActivityMemcpyPtoP4

Bases: object

Empty-initialize an instance of CUpti_ActivityMemcpyPtoP4.

bytes

The number of bytes transferred by the memory copy.

Type:

int

channel_id

The ID of the HW channel on which the memory copy is occurring.

Type:

int

channel_type

The type of the channel

Type:

int

context_id

The ID of the context where the memory copy is occurring.

Type:

int

copy_kind

The kind of the memory copy, stored as a byte to reduce record size. CUpti_ActivityMemcpyKind

Type:

int

correlation_id

The correlation ID of the memory copy. Each memory copy is assigned a unique correlation ID that is identical to the correlation ID in the driver and runtime API activity record that launched the memory copy.

Type:

int

device_id

The ID of the device where the memory copy is occurring.

Type:

int

dst_context_id

The ID of the context owning the memory being copied to.

Type:

int

dst_device_id

The ID of the device where memory is being copied to.

Type:

int

dst_kind

The destination memory kind written by the memory copy, stored as a byte to reduce record size. See CUpti_ActivityMemoryKind.

Type:

int

end

The end timestamp for the memory copy, in ns. A value of 0 for both the start and end timestamps indicates that timestamp information could not be collected for the memory copy.

Type:

int

flags_

The flags associated with the memory copy. CUpti_ActivityFlag

Type:

int

graph_id

The unique ID of the graph that executed this memcpy through graph launch. This field will be 0 if the memcpy is not done through graph launch.

Type:

int

graph_node_id

The unique ID of the graph node that executed the memcpy through graph launch. This field will be 0 if memcpy is not done using graph launch.

Type:

int

kind

The activity record kind, must be CUPTI_ACTIVITY_KIND_MEMCPY2.

Type:

int

ptr

Get the pointer address to the data as Python int.

reserved0

Undefined. Reserved for internal use.

Type:

int

src_context_id

The ID of the context owning the memory being copied from.

Type:

int

src_device_id

The ID of the device where memory is being copied from.

Type:

int

src_kind

The source memory kind read by the memory copy, stored as a byte to reduce record size. CUpti_ActivityMemoryKind

Type:

int

start

The start timestamp for the memory copy, in ns. A value of 0 for both the start and end timestamps indicates that timestamp information could not be collected for the memory copy.

Type:

int

stream_id

The ID of the stream where the memory copy is occurring.

Type:

int

class cupti.cupti.ActivityMemory

Bases: object

Empty-initialize an instance of CUpti_ActivityMemory.

address

The virtual address of the allocation

Type:

int

alloc_pc

The program counter of the allocation of memory

Type:

int

bytes

The number of bytes of memory allocated.

Type:

int

context_id

The ID of the context. If context is NULL, `contextId` is set to CUPTI_INVALID_CONTEXT_ID.

Type:

int

device_id

The ID of the device where the memory allocation is taking place.

Type:

int

end

The end timestamp for the memory operation, i.e. the time when memory was freed, in ns. This will be 0 if memory is not freed in the application

Type:

int

free_pc

The program counter of the freeing of memory. This will be 0 if memory is not freed in the application

Type:

int

kind

The activity record kind, must be CUPTI_ACTIVITY_KIND_MEMORY

Type:

int

memory_kind

The memory kind requested by the user

Type:

int

name

Variable name. This name is shared across all activity records representing the same symbol, and so should not be modified.

Type:

str

pad

Undefined. Reserved for internal use.

Type:

int

process_id

The ID of the process to which this record belongs.

Type:

int

ptr

Get the pointer address to the data as Python int.

start

The start timestamp for the memory operation, i.e. the time when memory was allocated, in ns.

Type:

int
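
As a small illustration of the start and end semantics above, the lifetime of an allocation can be derived as follows. This is a sketch under the assumption that a record exposes start and end as plain ints; the record here is a types.SimpleNamespace stand-in and allocation_lifetime_ns is a hypothetical helper name.

```python
from types import SimpleNamespace


def allocation_lifetime_ns(record):
    """Return the allocation lifetime in ns, or None if it was never freed."""
    # Documented rule: end is 0 when the memory is not freed in the application.
    if record.end == 0:
        return None
    return record.end - record.start


freed = SimpleNamespace(start=100, end=450)
leaked = SimpleNamespace(start=100, end=0)
print(allocation_lifetime_ns(freed), allocation_lifetime_ns(leaked))
```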

class cupti.cupti.ActivityMemory4

Bases: object

Empty-initialize an instance of CUpti_ActivityMemory4.

address

The virtual address of the allocation. The base address of the memory pool.

Type:

int

bytes

The number of bytes of memory allocated.

Type:

int

context_id

The ID of the context. If context is NULL, `contextId` is set to CUPTI_INVALID_CONTEXT_ID.

Type:

int

correlation_id

The correlation ID of the memory operation. Each memory operation is assigned a unique correlation ID that is identical to the correlation ID in the driver and runtime API activity record that launched the memory operation.

Type:

int

device_id

The ID of the device where the memory operation is taking place.

Type:

int

is_async

`isAsync` is set if memory operation happens through async memory APIs.

Type:

int

kind

The activity record kind, must be CUPTI_ACTIVITY_KIND_MEMORY2

Type:

int

memory_kind

The memory kind requested by the user, CUpti_ActivityMemoryKind.

Type:

int

memory_operation_type

The memory operation requested by the user, CUpti_ActivityMemoryOperationType.

Type:

int

memory_pool_config

The memory pool configuration used for the memory operations.

Type:

_py_anon_pod5

name

Variable name. This name is shared across all activity records representing the same symbol, and so should not be modified.

Type:

str

pad1

Undefined. Reserved for internal use.

Type:

int

pc

The program counter of the memory operation.

Type:

int

process_id

Type:

int

ptr

Get the pointer address to the data as Python int.

source

The shared object or binary that the memory allocation request comes from.

Type:

str

stream_id

The ID of the stream. If memory operation is not async, `streamId` is set to CUPTI_INVALID_STREAM_ID.

Type:

int

timestamp

The start timestamp for the memory operation, in ns.

Type:

int

class cupti.cupti.ActivityMemoryKind(value)

Bases: IntEnum

See CUpti_ActivityMemoryKind.

ARRAY = 4
DEVICE = 3
DEVICE_STATIC = 6
FORCE_INT = 2147483647
MANAGED = 5
MANAGED_STATIC = 7
PAGEABLE = 1
PINNED = 2
UNKNOWN = 0
class cupti.cupti.ActivityMemoryOperationType(value)

Bases: IntEnum

See CUpti_ActivityMemoryOperationType.

ALLOCATION = 1
FORCE_INT = 2147483647
INVALID = 0
RELEASE = 2
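
One use of memory_operation_type is to find allocations that were never released. The sketch below assumes that records arrive in timestamp order and that a RELEASE record carries the same address as its ALLOCATION; neither assumption is stated by this reference. The enum is a local stand-in for the values above, and the input is assumed to be (memory_operation_type, address, bytes) tuples taken from ActivityMemory4 records. The helper name outstanding_allocations is hypothetical.

```python
from enum import IntEnum


# Stand-in mirroring cupti.cupti.ActivityMemoryOperationType as listed above.
class ActivityMemoryOperationType(IntEnum):
    INVALID = 0
    ALLOCATION = 1
    RELEASE = 2


def outstanding_allocations(ops):
    """Map address -> bytes for allocations that have no later release."""
    live = {}
    for op_type, address, nbytes in ops:
        if op_type == ActivityMemoryOperationType.ALLOCATION:
            live[address] = nbytes
        elif op_type == ActivityMemoryOperationType.RELEASE:
            # Ignore releases for addresses that were never seen allocated.
            live.pop(address, None)
    return live


ops = [(1, 0x1000, 256), (1, 0x2000, 512), (2, 0x1000, 256)]
print(outstanding_allocations(ops))
```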
class cupti.cupti.ActivityMemoryPool3

Bases: object

Empty-initialize an instance of CUpti_ActivityMemoryPool3.

address

The virtual address of the allocation.

Type:

int

correlation_id

The correlation ID of the memory pool operation. Each memory pool operation is assigned a unique correlation ID that is identical to the correlation ID in the driver and runtime API activity record that launched the memory operation.

Type:

int

device_id

The ID of the device where the memory pool is created.

Type:

int

is_managed_pool

Whether the pool is of managed memory allocation or pinned memory allocation. If it is 0, it is pinned and if it is 1, the memory pool allocation is of managed memory type.

Type:

int

kind

The activity record kind, must be CUPTI_ACTIVITY_KIND_MEMORY_POOL

Type:

int

memory_pool_operation_type

The memory operation requested by the user, CUpti_ActivityMemoryPoolOperationType.

Type:

int

memory_pool_type

The type of the memory pool, CUpti_ActivityMemoryPoolType

Type:

int

min_bytes_to_keep

The minimum bytes to keep of the memory pool. `minBytesToKeep` is valid for CUPTI_ACTIVITY_MEMORY_POOL_OPERATION_TYPE_TRIMMED, CUpti_ActivityMemoryPoolOperationType

Type:

int

pad2

(array of length 7). Undefined. Reserved for internal use.

Type:

uint8

process_id

The ID of the process to which this record belongs.

Type:

int

ptr

Get the pointer address to the data as Python int.

release_threshold

The release threshold of the memory pool. `releaseThreshold` is valid for CUPTI_ACTIVITY_MEMORY_POOL_TYPE_LOCAL, CUpti_ActivityMemoryPoolType.

Type:

int

size_

The size of the memory pool operation in bytes. `size` is valid for CUPTI_ACTIVITY_MEMORY_POOL_TYPE_LOCAL, CUpti_ActivityMemoryPoolType.

Type:

int

timestamp

The start timestamp for the memory operation, in ns.

Type:

int

utilized_size

The utilized size of the memory pool. `utilizedSize` is valid for CUPTI_ACTIVITY_MEMORY_POOL_TYPE_LOCAL, CUpti_ActivityMemoryPoolType.

Type:

int

class cupti.cupti.ActivityMemoryPoolOperationType(value)

Bases: IntEnum

See CUpti_ActivityMemoryPoolOperationType.

CREATED = 1
DESTROYED = 2
FORCE_INT = 2147483647
INVALID = 0
TRIMMED = 3
class cupti.cupti.ActivityMemoryPoolType(value)

Bases: IntEnum

See CUpti_ActivityMemoryPoolType.

FORCE_INT = 2147483647
IMPORTED = 2
INVALID = 0
LOCAL = 1
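
Several ActivityMemoryPool3 members are documented as valid only for a particular pool type or operation type. The sketch below collects those rules in one place. The enums are local stand-ins for the values listed above (FORCE_INT omitted), and valid_pool_fields is a hypothetical helper name.

```python
from enum import IntEnum


# Stand-ins mirroring the enums listed above.
class ActivityMemoryPoolType(IntEnum):
    INVALID = 0
    LOCAL = 1
    IMPORTED = 2


class ActivityMemoryPoolOperationType(IntEnum):
    INVALID = 0
    CREATED = 1
    DESTROYED = 2
    TRIMMED = 3


def valid_pool_fields(pool_type, operation_type):
    """List the conditionally valid members for a memory pool record."""
    fields = []
    # release_threshold, size_ and utilized_size are documented for LOCAL pools.
    if pool_type == ActivityMemoryPoolType.LOCAL:
        fields += ["release_threshold", "size_", "utilized_size"]
    # min_bytes_to_keep is documented for TRIMMED operations.
    if operation_type == ActivityMemoryPoolOperationType.TRIMMED:
        fields.append("min_bytes_to_keep")
    return fields


print(valid_pool_fields(1, 3))
```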
class cupti.cupti.ActivityMemset4

Bases: object

Empty-initialize an instance of CUpti_ActivityMemset4.

bytes

The number of bytes being set by the memory set.

Type:

int

channel_id

The ID of the HW channel on which the memory set is occurring.

Type:

int

channel_type

The type of the channel

Type:

int

context_id

The ID of the context where the memory set is occurring.

Type:

int

correlation_id

The correlation ID of the memory set. Each memory set is assigned a unique correlation ID that is identical to the correlation ID in the driver API activity record that launched the memory set.

Type:

int

device_id

The ID of the device where the memory set is occurring.

Type:

int

end

The end timestamp for the memory set, in ns. A value of 0 for both the start and end timestamps indicates that timestamp information could not be collected for the memory set.

Type:

int

flags_

The flags associated with the memset. CUpti_ActivityFlag

Type:

int

graph_id

The unique ID of the graph that executed this memset through graph launch. This field will be 0 if the memset is not executed through graph launch.

Type:

int

graph_node_id

The unique ID of the graph node that executed this memset through graph launch. This field will be 0 if the memset is not executed through graph launch.

Type:

int

is_device_launched

This field is used to indicate if the memset operation is part of a device graph launch.

Type:

int

kind

The activity record kind, must be CUPTI_ACTIVITY_KIND_MEMSET.

Type:

int

memory_kind

The memory kind of the memory set CUpti_ActivityMemoryKind

Type:

int

pad

Undefined. Reserved for internal use.

Type:

int

pad2

(array of length 3). Undefined. Reserved for internal use.

Type:

uint8

ptr

Get the pointer address to the data as Python int.

reserved0

Undefined. Reserved for internal use.

Type:

int

start

The start timestamp for the memory set, in ns. A value of 0 for both the start and end timestamps indicates that timestamp information could not be collected for the memory set.

Type:

int

stream_id

The ID of the stream where the memory set is occurring.

Type:

int

value

The value being assigned to memory by the memory set.

Type:

int

class cupti.cupti.ActivityModule

Bases: object

Empty-initialize an instance of CUpti_ActivityModule.

context_id

The ID of the context where the module is loaded.

Type:

int

cubin

The pointer to cubin.

Type:

int

cubin_size

The cubin size.

Type:

int

id

The module ID.

Type:

int

kind

The activity record kind, must be CUPTI_ACTIVITY_KIND_MODULE.

Type:

int

ptr

Get the pointer address to the data as Python int.

class cupti.cupti.ActivityName

Bases: object

Empty-initialize an instance of CUpti_ActivityName.

kind

The activity record kind, must be CUPTI_ACTIVITY_KIND_NAME.

Type:

int

name

The name.

Type:

str

object_id

The identifier for the activity object. ‘objectKind’ indicates which ID is valid for this record.

Type:

ActivityObjectKindId

object_kind

The kind of activity object being named.

Type:

int

pad

Undefined. Reserved for internal use.

Type:

int

ptr

Get the pointer address to the data as Python int.

class cupti.cupti.ActivityObjectKind(value)

Bases: IntEnum

See CUpti_ActivityObjectKind.

CONTEXT = 4
DEVICE = 3
FORCE_INT = 2147483647
PROCESS = 1
STREAM = 5
THREAD = 2
UNKNOWN = 0
class cupti.cupti.ActivityObjectKindId

Bases: object

Empty-initialize an instance of CUpti_ActivityObjectKindId.

dcs

A device object requires that we identify the device ID. A context object requires that we identify both the device and context ID. A stream object requires that we identify device, context, and stream ID.

Type:

_py_anon_pod4

pt

A process object requires that we identify the process ID. A thread object requires that we identify both the process and thread ID.

Type:

_py_anon_pod3

ptr

Get the pointer address to the data as Python int.
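
Records that carry an object_id also carry an object_kind that indicates which of the two members is meaningful. The sketch below encodes that choice. The enum is a local stand-in for cupti.cupti.ActivityObjectKind as listed earlier, and object_id_member is a hypothetical helper name. The members of the inner _py_anon_pod* classes are not documented here, so the sketch only returns the name of the member to read.

```python
from enum import IntEnum


# Stand-in mirroring cupti.cupti.ActivityObjectKind.
class ActivityObjectKind(IntEnum):
    UNKNOWN = 0
    PROCESS = 1
    THREAD = 2
    DEVICE = 3
    CONTEXT = 4
    STREAM = 5


def object_id_member(object_kind):
    """Return 'pt', 'dcs', or None for the given integer object kind."""
    kind = ActivityObjectKind(object_kind)
    if kind in (ActivityObjectKind.PROCESS, ActivityObjectKind.THREAD):
        return "pt"  # process/thread identifiers
    if kind in (
        ActivityObjectKind.DEVICE,
        ActivityObjectKind.CONTEXT,
        ActivityObjectKind.STREAM,
    ):
        return "dcs"  # device/context/stream identifiers
    return None


print(object_id_member(2), object_id_member(5), object_id_member(0))
```

With a real record, the returned name could then be passed to getattr on the record's object_id member, assuming the result is not None.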

class cupti.cupti.ActivityOpenAccData

Bases: object

Empty-initialize an instance of CUpti_ActivityOpenAccData.

async_

Type:

int

async_map

Type:

int

bytes

Number of bytes

Type:

int

cu_context_id

CUDA context ID. Valid only if deviceType is acc_device_nvidia.

Type:

int

cu_device_id

CUDA device ID. Valid only if deviceType is acc_device_nvidia.

Type:

int

cu_process_id

The ID of the process where the OpenACC activity is executing.

Type:

int

cu_stream_id

CUDA stream ID. Valid only if deviceType is acc_device_nvidia.

Type:

int

cu_thread_id

The ID of the thread where the OpenACC activity is executing.

Type:

int

device_number

Type:

int

device_ptr

Device pointer if available

Type:

int

device_type

Type:

int

end

CUPTI end timestamp

Type:

int

end_line_no

Type:

int

event_kind

CUPTI OpenACC event kind (CUpti_OpenAccEventKind)

Type:

int

external_id

The OpenACC correlation ID. Valid only if deviceType is acc_device_nvidia. If not 0, it uniquely identifies this record. It is identical to the externalId in the preceding external correlation record of type CUPTI_EXTERNAL_CORRELATION_KIND_OPENACC.

Type:

int

func_end_line_no

Type:

int

func_line_no

Type:

int

func_name

Type:

str

host_ptr

Host pointer if available

Type:

int

implicit

Type:

int

kind

The activity record kind, must be CUPTI_ACTIVITY_KIND_OPENACC_DATA.

Type:

int

line_no

Type:

int

parent_construct

Type:

int

ptr

Get the pointer address to the data as Python int.

src_file

Type:

str

start

CUPTI start timestamp

Type:

int

thread_id

ThreadId

Type:

int

var_name

Type:

str

version

Type:

int

class cupti.cupti.ActivityOpenAccLaunch

Bases: object

Empty-initialize an instance of CUpti_ActivityOpenAccLaunch.

async_

Value of async() clause of the corresponding directive

Type:

int

async_map

Internal asynchronous queue number used

Type:

int

cu_context_id

CUDA context ID. Valid only if deviceType is acc_device_nvidia.

Type:

int

cu_device_id

CUDA device ID. Valid only if deviceType is acc_device_nvidia.

Type:

int

cu_process_id

The ID of the process where the OpenACC activity is executing.

Type:

int

cu_stream_id

CUDA stream ID. Valid only if deviceType is acc_device_nvidia.

Type:

int

cu_thread_id

The ID of the thread where the OpenACC activity is executing.

Type:

int

device_number

Device number

Type:

int

device_type

Device type

Type:

int

end

CUPTI end timestamp

Type:

int

end_line_no

For an OpenACC construct, this contains the line number of the end of the construct. A negative or zero value means the line number is not known.

Type:

int

event_kind

CUPTI OpenACC event kind (CUpti_OpenAccEventKind)

Type:

int

external_id

The OpenACC correlation ID. Valid only if deviceType is acc_device_nvidia. If not 0, it uniquely identifies this record. It is identical to the externalId in the preceding external correlation record of type CUPTI_EXTERNAL_CORRELATION_KIND_OPENACC.

Type:

int

func_end_line_no

The last line number of the function named in func_name. A negative or zero value means the line number is not known.

Type:

int

func_line_no

The line number of the first line of the function named in func_name. A negative or zero value means the line number is not known.

Type:

int

func_name

A pointer to a null-terminated string containing the name of the function in which the event occurred.

Type:

str

implicit

1 for any implicit event, such as an implicit wait at a synchronous data construct; 0 otherwise.

Type:

int

kernel_name

A pointer to null-terminated string containing the name of the kernel being launched, if known, or a null pointer if not.

Type:

str

kind

The activity record kind, must be CUPTI_ACTIVITY_KIND_OPENACC_LAUNCH.

Type:

int

line_no

The line number of the directive or program construct or the starting line number of the OpenACC construct corresponding to the event. A negative or zero value means the line number is not known.

Type:

int

num_gangs

The number of gangs created for this kernel launch

Type:

int

num_workers

The number of workers created for this kernel launch

Type:

int

parent_construct

CUPTI OpenACC parent construct kind (CUpti_OpenAccConstructKind). Note that for applications using PGI OpenACC runtime < 16.1, this will always be CUPTI_OPENACC_CONSTRUCT_KIND_UNKNOWN.

Type:

int

ptr

Get the pointer address to the data as Python int.

src_file

A pointer to null-terminated string containing the name of or path to the source file, if known, or a null pointer if not.

Type:

str

start

CUPTI start timestamp

Type:

int

thread_id

ThreadId

Type:

int

vector_length

The number of vector lanes created for this kernel launch

Type:

int

version

Version number

Type:

int
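
The OpenACC records describe source locations through src_file, line_no, and end_line_no, with a non-positive line number meaning "not known". The sketch below formats such a location for display. It assumes that an unknown source file surfaces in Python as None or an empty string, which this reference does not confirm. The helper name format_source_location is hypothetical.

```python
def format_source_location(src_file, line_no, end_line_no):
    """Format an OpenACC source location, tolerating unknown parts."""
    # Assumption: an unknown file arrives as None or an empty string.
    file_part = src_file if src_file else "<unknown file>"
    # Documented rule: a negative or zero line number is not known.
    if line_no <= 0:
        return file_part
    if end_line_no > line_no:
        return f"{file_part}:{line_no}-{end_line_no}"
    return f"{file_part}:{line_no}"


print(format_source_location("saxpy.c", 12, 20))
print(format_source_location(None, 0, 0))
```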

class cupti.cupti.ActivityOpenAccOther

Bases: object

Empty-initialize an instance of CUpti_ActivityOpenAccOther.

async_

Value of async() clause of the corresponding directive

Type:

int

async_map

Internal asynchronous queue number used

Type:

int

cu_context_id

CUDA context ID. Valid only if deviceType is acc_device_nvidia.

Type:

int

cu_device_id

CUDA device ID. Valid only if deviceType is acc_device_nvidia.

Type:

int

cu_process_id

The ID of the process where the OpenACC activity is executing.

Type:

int

cu_stream_id

CUDA stream ID. Valid only if deviceType is acc_device_nvidia.

Type:

int

cu_thread_id

The ID of the thread where the OpenACC activity is executing.

Type:

int

device_number

Device number

Type:

int

device_type

Device type

Type:

int

end

CUPTI end timestamp

Type:

int

end_line_no

For an OpenACC construct, this contains the line number of the end of the construct. A negative or zero value means the line number is not known.

Type:

int

event_kind

CUPTI OpenACC event kind (CUpti_OpenAccEventKind)

Type:

int

external_id

The OpenACC correlation ID. Valid only if deviceType is acc_device_nvidia. If not 0, it uniquely identifies this record. It is identical to the externalId in the preceding external correlation record of type CUPTI_EXTERNAL_CORRELATION_KIND_OPENACC.

Type:

int

func_end_line_no

The last line number of the function named in func_name. A negative or zero value means the line number is not known.

Type:

int

func_line_no

The line number of the first line of the function named in func_name. A negative or zero value means the line number is not known.

Type:

int

func_name

A pointer to a null-terminated string containing the name of the function in which the event occurred.

Type:

str

implicit

1 for any implicit event, such as an implicit wait at a synchronous data construct; 0 otherwise.

Type:

int

kind

The activity record kind, must be CUPTI_ACTIVITY_KIND_OPENACC_OTHER.

Type:

int

line_no

The line number of the directive or program construct or the starting line number of the OpenACC construct corresponding to the event. A negative or zero value means the line number is not known.

Type:

int

parent_construct

CUPTI OpenACC parent construct kind (CUpti_OpenAccConstructKind). Note that for applications using PGI OpenACC runtime < 16.1, this will always be CUPTI_OPENACC_CONSTRUCT_KIND_UNKNOWN.

Type:

int

ptr

Get the pointer address to the data as Python int.

src_file

A pointer to null-terminated string containing the name of or path to the source file, if known, or a null pointer if not.

Type:

str

start

CUPTI start timestamp

Type:

int

thread_id

ThreadId

Type:

int

version

Version number

Type:

int

class cupti.cupti.ActivityOpenMp

Bases: object

Empty-initialize an instance of CUpti_ActivityOpenMp.

cu_process_id

The ID of the process where the OpenMP activity is executing.

Type:

int

cu_thread_id

The ID of the thread where the OpenMP activity is executing.

Type:

int

end

CUPTI end timestamp

Type:

int

event_kind

CUPTI OpenMP event kind (CUpti_OpenMpEventKind)

Type:

int

kind

The kind of this activity.

Type:

int

ptr

Get the pointer address to the data as Python int.

start

CUPTI start timestamp

Type:

int

thread_id

ThreadId

Type:

int

version

Version number

Type:

int

class cupti.cupti.ActivityOverhead3

Bases: object

Empty-initialize an instance of CUpti_ActivityOverhead3.

correlation_id

The correlation ID of the overhead operation to which the record belongs. This ID is identical to the correlation ID in the driver or runtime API activity record that launched the overhead operation. In some cases, it can be zero, such as for CUPTI_ACTIVITY_OVERHEAD_CUPTI_BUFFER_FLUSH records.

Type:

int

end

The end timestamp for the overhead, in ns. A value of 0 for both the start and end timestamps indicates that timestamp information could not be collected for the overhead.

Type:

int

kind

The activity record kind, must be CUPTI_ACTIVITY_KIND_OVERHEAD.

Type:

int

object_id

The identifier for the activity object. ‘objectKind’ indicates which ID is valid for this record.

Type:

ActivityObjectKindId

object_kind

The kind of activity object that the overhead is associated with.

Type:

int

overhead_data

Pointer to the struct with additional details about the overhead. Refer to the CUpti_ActivityOverheadKind enum and the corresponding structure to typecast and access additional overhead data. The client is responsible for freeing this memory using the free function when done.

Type:

int

overhead_kind

The kind of overhead, CUPTI, DRIVER, COMPILER etc.

Type:

int

ptr

Get the pointer address to the data as Python int.

reserved0

Reserved for internal use.

Type:

int

start

The start timestamp for the overhead, in ns. A value of 0 for both the start and end timestamps indicates that timestamp information could not be collected for the overhead.

Type:

int

class cupti.cupti.ActivityOverheadKind(value)

Bases: IntEnum

See CUpti_ActivityOverheadKind.

ACTIVITY_BUFFER_REQUEST = 458752
COMMAND_BUFFER_FULL = 393216
CUPTI_BUFFER_FLUSH = 65536
CUPTI_INSTRUMENTATION = 131072
CUPTI_RESOURCE = 196608
DRIVER_COMPILER = 1
FORCE_INT = 2147483647
LAZY_FUNCTION_LOADING = 327680
RUNTIME_TRIGGERED_MODULE_LOADING = 262144
UNKNOWN = 0
UVM_ACTIVITY_INIT = 524288
class cupti.cupti.ActivityPCSamplingPeriod(value)

Bases: IntEnum

See CUpti_ActivityPCSamplingPeriod.

FORCE_INT = 2147483647
HIGH = 4
INVALID = 0
LOW = 2
MAX = 5
MID = 3
MIN = 1
class cupti.cupti.ActivityPCSamplingStallReason(value)

Bases: IntEnum

See CUpti_ActivityPCSamplingStallReason.

CONSTANT_MEMORY_DEPENDENCY = 7
EXEC_DEPENDENCY = 3
FORCE_INT = 2147483647
INST_FETCH = 2
INVALID = 0
MEMORY_DEPENDENCY = 4
MEMORY_THROTTLE = 9
NONE = 1
NOT_SELECTED = 10
OTHER = 11
PIPE_BUSY = 8
SLEEPING = 12
SYNC = 6
TEXTURE = 5
class cupti.cupti.ActivityPartitionedGlobalCacheConfig(value)

Bases: IntEnum

See CUpti_ActivityPartitionedGlobalCacheConfig.

FORCE_INT = 2147483647
NOT_SUPPORTED = 1
OFF = 2
ON = 3
UNKNOWN = 0
class cupti.cupti.ActivityPreemption

Bases: object

Empty-initialize an instance of CUpti_ActivityPreemption.

block_x

The X-dimension of the block that is preempted

Type:

int

block_y

The Y-dimension of the block that is preempted

Type:

int

block_z

The Z-dimension of the block that is preempted

Type:

int

grid_id

The grid-id of the block that is preempted

Type:

int

kind

The activity record kind, must be CUPTI_ACTIVITY_KIND_PREEMPTION

Type:

int

pad

Undefined. Reserved for internal use.

Type:

int

preemption_kind

The kind of the preemption.

Type:

int

ptr

Get the pointer address to the data as Python int.

timestamp

The timestamp of the preemption, in ns. A value of 0 indicates that timestamp information could not be collected for the preemption.

Type:

int

class cupti.cupti.ActivityPreemptionKind(value)

Bases: IntEnum

See CUpti_ActivityPreemptionKind.

FORCE_INT = 2147483647
RESTORE = 2
SAVE = 1
UNKNOWN = 0
class cupti.cupti.ActivityStream

Bases: object

Empty-initialize an instance of CUpti_ActivityStream.

context_id

The ID of the context where the stream was created.

Type:

int

correlation_id

The correlation ID of the API to which this result is associated.

Type:

int

flag

Flags associated with the stream.

Type:

int

kind

The activity record kind, must be CUPTI_ACTIVITY_KIND_STREAM.

Type:

int

priority

The clamped priority for the stream.

Type:

int

ptr

Get the pointer address to the data as Python int.

stream_id

A unique stream ID to identify the stream.

Type:

int

class cupti.cupti.ActivityStreamFlag(value)

Bases: IntEnum

See CUpti_ActivityStreamFlag.

FLAG_DEFAULT = 1
FLAG_FORCE_INT = 2147483647
FLAG_NON_BLOCKING = 2
FLAG_NULL = 3
FLAG_UNKNOWN = 0
MASK = 65535
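
The MASK member (65535, that is 0xFFFF) suggests that the creation flag sits in the low 16 bits of the stream record's flag member. That reading is an assumption based on the enum values, not something this reference states. Under that assumption, a minimal sketch for decoding the flag looks like this. The enum is a local stand-in for the values listed above, and stream_creation_flag is a hypothetical helper name.

```python
from enum import IntEnum


# Stand-in mirroring cupti.cupti.ActivityStreamFlag (FLAG_FORCE_INT omitted).
class ActivityStreamFlag(IntEnum):
    FLAG_UNKNOWN = 0
    FLAG_DEFAULT = 1
    FLAG_NON_BLOCKING = 2
    FLAG_NULL = 3
    MASK = 65535


def stream_creation_flag(raw_flag):
    """Decode the low 16 bits of a stream record's flag (assumed layout)."""
    # Raises ValueError if the masked value is not a listed member.
    return ActivityStreamFlag(raw_flag & ActivityStreamFlag.MASK)


# Bits above the mask are ignored by this sketch.
print(stream_creation_flag(0x10002).name)
```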
class cupti.cupti.ActivitySynchronization2

Bases: object

Empty-initialize an instance of CUpti_ActivitySynchronization2.

context_id

The ID of the context for which the synchronization API is called. In case of context synchronization API it is the context id for which the API is called. In case of stream/event synchronization it is the ID of the context where the stream/event was created.

Type:

int

correlation_id

The correlation ID of the API to which this result is associated.

Type:

int

cuda_event_id

The event ID for which the synchronization API is called. A CUPTI_SYNCHRONIZATION_INVALID_VALUE value indicates that the field is not applicable for this record. Not valid for cuCtxSynchronize, cuStreamSynchronize.

Type:

int

cuda_event_sync_id

A unique ID to associate event synchronization records with the latest CUDA Event record. A similar field is added in CUpti_ActivityCudaEvent2 to associate the synchronization record with the CUDA Event record. The same CUDA event can be used multiple times, so the event ID alone is not unique enough to correlate the synchronization record with the latest CUDA Event record. This field is unique and can be used to do the required correlation. A CUPTI_SYNCHRONIZATION_INVALID_VALUE value indicates that the field is not applicable for this record. Valid only for synchronization records related to CUDA Events.

Type:

int

end

The end timestamp for the function, in ns. A value of 0 for both the start and end timestamps indicates that timestamp information could not be collected for the function.

Type:

int

kind

The activity record kind, must be CUPTI_ACTIVITY_KIND_SYNCHRONIZATION.

Type:

int

pad

Undefined. Reserved for internal use.

Type:

int

ptr

Get the pointer address to the data as Python int.

return_value

The return value for the synchronization record. Use cuptiActivityEnableAllSyncRecords API to enable/disable collection of synchronization records with return value being non-zero. This will be a CUresult value.

Type:

int

start

The start timestamp for the function, in ns. A value of 0 for both the start and end timestamps indicates that timestamp information could not be collected for the function.

Type:

int

stream_id

The compute stream for which the synchronization API is called. A CUPTI_SYNCHRONIZATION_INVALID_VALUE value indicates that the field is not applicable for this record. Not valid for cuCtxSynchronize, cuEventSynchronize.

Type:

int

type

The type of record.

Type:

int

class cupti.cupti.ActivitySynchronizationType(value)

Bases: IntEnum

See CUpti_ActivitySynchronizationType.

CONTEXT_SYNCHRONIZE = 4
EVENT_SYNCHRONIZE = 1
FORCE_INT = 2147483647
STREAM_SYNCHRONIZE = 3
STREAM_WAIT_EVENT = 2
UNKNOWN = 0
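
The type member of a synchronization record is documented as a plain int, so it may help to map it back to a member name. The sketch below mirrors the values listed above in a local IntEnum so that it runs without CUPTI installed; with the real bindings, cupti.cupti.ActivitySynchronizationType would presumably take its place. The helper name sync_type_name is hypothetical.

```python
from enum import IntEnum


class ActivitySynchronizationType(IntEnum):
    """Local stand-in mirroring the values listed above."""
    UNKNOWN = 0
    EVENT_SYNCHRONIZE = 1
    STREAM_WAIT_EVENT = 2
    STREAM_SYNCHRONIZE = 3
    CONTEXT_SYNCHRONIZE = 4
    FORCE_INT = 2147483647


def sync_type_name(raw_type: int) -> str:
    """Return the member name for a raw `type` value, or a fallback string."""
    try:
        return ActivitySynchronizationType(raw_type).name
    except ValueError:
        # Values outside the enum are reported rather than raised.
        return f"UNRECOGNIZED({raw_type})"


print(sync_type_name(3))
print(sync_type_name(42))
```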
class cupti.cupti.ActivityThreadIdType(value)

Bases: IntEnum

See CUpti_ActivityThreadIdType.

DEFAULT = 0
FORCE_INT = 2147483647
SIZE = 2
SYSTEM = 1
class cupti.cupti.ActivityUnifiedMemoryAccessType(value)

Bases: IntEnum

See CUpti_ActivityUnifiedMemoryAccessType.

ATOMIC = 3
PREFETCH = 4
READ = 1
UNKNOWN = 0
WRITE = 2
class cupti.cupti.ActivityUnifiedMemoryCounter3

Bases: object

Empty-initialize an instance of CUpti_ActivityUnifiedMemoryCounter3.

address

This is the virtual base address of the page(s) being transferred. For CPU and GPU faults, the virtual address of the page that faulted.

Type:

int

counter_kind

The Unified Memory counter kind.

Type:

int

dst_id

The ID of the destination CPU/device involved in the memory transfer or remote map operation. Ignore this field if counterKind is CUPTI_ACTIVITY_UNIFIED_MEMORY_COUNTER_KIND_GPU_PAGE_FAULT, CUPTI_ACTIVITY_UNIFIED_MEMORY_COUNTER_KIND_CPU_PAGE_FAULT_COUNT, CUPTI_ACTIVITY_UNIFIED_MEMORY_COUNTER_KIND_THRASHING, or CUPTI_ACTIVITY_UNIFIED_MEMORY_COUNTER_KIND_THROTTLING.

Type:

int

end

The end timestamp of the counter, in ns. Ignore this field if counterKind is CUPTI_ACTIVITY_UNIFIED_MEMORY_COUNTER_KIND_CPU_PAGE_FAULT_COUNT, CUPTI_ACTIVITY_UNIFIED_MEMORY_COUNTER_KIND_THRASHING, or CUPTI_ACTIVITY_UNIFIED_MEMORY_COUNTER_KIND_REMOTE_MAP. For counterKind CUPTI_ACTIVITY_UNIFIED_MEMORY_COUNTER_KIND_BYTES_TRANSFER_HTOD and CUPTI_ACTIVITY_UNIFIED_MEMORY_COUNTER_KIND_BYTES_TRANSFER_DTOH, the timestamp is captured when the activity finishes on the GPU. For counterKind CUPTI_ACTIVITY_UNIFIED_MEMORY_COUNTER_KIND_GPU_PAGE_FAULT, the timestamp is captured when the CUDA driver queues the replay of faulting memory accesses on the GPU. For counterKind CUPTI_ACTIVITY_UNIFIED_MEMORY_COUNTER_KIND_THROTTLING, the timestamp is captured when the throttling operation was finished by the CUDA driver.

Type:

int

flags_

The flags associated with this record. The enum to use depends on counterKind: see CUpti_ActivityUnifiedMemoryAccessType if counterKind is CUPTI_ACTIVITY_UNIFIED_MEMORY_COUNTER_KIND_GPU_PAGE_FAULT; CUpti_ActivityUnifiedMemoryMigrationCause if counterKind is CUPTI_ACTIVITY_UNIFIED_MEMORY_COUNTER_KIND_BYTES_TRANSFER_HTOD or CUPTI_ACTIVITY_UNIFIED_MEMORY_COUNTER_KIND_BYTES_TRANSFER_DTOH; CUpti_ActivityUnifiedMemoryRemoteMapCause if counterKind is CUPTI_ACTIVITY_UNIFIED_MEMORY_COUNTER_KIND_REMOTE_MAP; and CUpti_ActivityFlag if counterKind is CUPTI_ACTIVITY_UNIFIED_MEMORY_COUNTER_KIND_THRASHING or CUPTI_ACTIVITY_UNIFIED_MEMORY_COUNTER_KIND_THROTTLING.

Type:

int

kind

The activity record kind, must be CUPTI_ACTIVITY_KIND_UNIFIED_MEMORY_COUNTER.

Type:

int

pad

Undefined. Reserved for internal use.

Type:

int

process_id

The ID of the process to which this record belongs.

Type:

int

processors

(array of length 5). The bitmask of devices involved in the operation. For counterKind CUPTI_ACTIVITY_UNIFIED_MEMORY_COUNTER_KIND_THRASHING, it is a bitwise OR of the device IDs fighting for the memory region. processors[0] represents devices 0 to 63, processors[1] represents devices 64 to 127, and so on. Ignore this field if counterKind is CUPTI_ACTIVITY_UNIFIED_MEMORY_COUNTER_KIND_CPU_PAGE_FAULT_COUNT, CUPTI_ACTIVITY_UNIFIED_MEMORY_COUNTER_KIND_GPU_PAGE_FAULT, CUPTI_ACTIVITY_UNIFIED_MEMORY_COUNTER_KIND_THROTTLING, CUPTI_ACTIVITY_UNIFIED_MEMORY_COUNTER_KIND_REMOTE_MAP, CUPTI_ACTIVITY_UNIFIED_MEMORY_COUNTER_KIND_BYTES_TRANSFER_HTOD, CUPTI_ACTIVITY_UNIFIED_MEMORY_COUNTER_KIND_BYTES_TRANSFER_DTOH, CUPTI_ACTIVITY_UNIFIED_MEMORY_COUNTER_KIND_DTOD, or CUPTI_ACTIVITY_UNIFIED_MEMORY_COUNTER_KIND_GPU_FAULT_REPLAY.

Type:

uint64

ptr

Get the pointer address to the data as Python int.

src_id

The ID of the source CPU/device involved in the memory transfer, page fault, thrashing, throttling or remote map operation. For counterKind CUPTI_ACTIVITY_UNIFIED_MEMORY_COUNTER_KIND_THRASHING, it is a bitwise OR of the device IDs fighting for the memory region, ONLY if there are fewer than 32 devices. Ignore this field if counterKind is CUPTI_ACTIVITY_UNIFIED_MEMORY_COUNTER_KIND_CPU_PAGE_FAULT_COUNT.

Type:

int

start

The start timestamp of the counter, in ns. For counterKind CUPTI_ACTIVITY_UNIFIED_MEMORY_COUNTER_KIND_BYTES_TRANSFER_HTOD and CUPTI_ACTIVITY_UNIFIED_MEMORY_COUNTER_KIND_BYTES_TRANSFER_DTOH, timestamp is captured when activity starts on GPU. For counterKind CUPTI_ACTIVITY_UNIFIED_MEMORY_COUNTER_KIND_GPU_PAGE_FAULT and CUPTI_ACTIVITY_UNIFIED_MEMORY_COUNTER_KIND_CPU_PAGE_FAULT_COUNT, timestamp is captured when CUDA driver started processing the fault. For counterKind CUPTI_ACTIVITY_UNIFIED_MEMORY_COUNTER_KIND_THRASHING, timestamp is captured when CUDA driver detected thrashing of memory region. For counterKind CUPTI_ACTIVITY_UNIFIED_MEMORY_COUNTER_KIND_THROTTLING, timestamp is captured when throttling operation was started by CUDA driver. For counterKind CUPTI_ACTIVITY_UNIFIED_MEMORY_COUNTER_KIND_REMOTE_MAP, timestamp is captured when CUDA driver has pushed all required operations to the processor specified by dstId.

Type:

int

stream_id

The ID of the stream causing the transfer. The value of this field is invalid.

Type:

int

value

Value of the counter. For counterKind CUPTI_ACTIVITY_UNIFIED_MEMORY_COUNTER_KIND_BYTES_TRANSFER_HTOD, CUPTI_ACTIVITY_UNIFIED_MEMORY_COUNTER_KIND_BYTES_TRANSFER_DTOH, CUPTI_ACTIVITY_UNIFIED_MEMORY_COUNTER_KIND_THRASHING and CUPTI_ACTIVITY_UNIFIED_MEMORY_COUNTER_KIND_REMOTE_MAP, it is the size of the memory region in bytes. For counterKind CUPTI_ACTIVITY_UNIFIED_MEMORY_COUNTER_KIND_GPU_PAGE_FAULT, it is the number of page fault groups for the same page. For counterKind CUPTI_ACTIVITY_UNIFIED_MEMORY_COUNTER_KIND_CPU_PAGE_FAULT_COUNT, it is the program counter for the instruction that caused the fault.

Type:

int
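
The processors member packs the involved devices into an array of 64-bit words. A minimal sketch of expanding such an array into a list of device IDs follows. It assumes that bit i of processors[w] stands for device 64 * w + i, which is one reading of the description above rather than something this reference states outright; the helper name devices_from_processors is hypothetical.

```python
from typing import Iterable, List


def devices_from_processors(processors: Iterable[int]) -> List[int]:
    """Expand an array of 64-bit masks into a sorted list of device IDs."""
    device_ids = []
    for word_index, word in enumerate(processors):
        for bit in range(64):
            # Assumption: bit `bit` of word `word_index` marks
            # device 64 * word_index + bit.
            if (word >> bit) & 1:
                device_ids.append(word_index * 64 + bit)
    return device_ids


# Devices 0 and 2 in the first word, device 65 in the second word.
print(devices_from_processors([0b101, 0b10, 0, 0, 0]))
```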

class cupti.cupti.ActivityUnifiedMemoryCounterConfig

Bases: object

Empty-initialize an array of CUpti_ActivityUnifiedMemoryCounterConfig.

The resulting object is of length size and of dtype activity_unified_memory_counter_config_dtype. If default-constructed, the instance represents a single struct.

Parameters:

size (int) – number of structs, default=1.

device_id

Device id of the target device. This is relevant only for single device scopes. (deprecated in CUDA 7.0)

Type:

Union[uint32, int]

enable

Control to enable or disable the counter. Set it to a non-zero value to enable the counter and to zero to disable it.

Type:

Union[uint32, int]

kind

Unified Memory counter kind.

Type:

Union[int32, int]

ptr

Get the pointer address to the data as Python int.

scope

Unified Memory counter scope. (deprecated in CUDA 7.0)

Type:

Union[int32, int]

class cupti.cupti.ActivityUnifiedMemoryCounterKind(value)

Bases: IntEnum

See CUpti_ActivityUnifiedMemoryCounterKind.

BYTES_TRANSFER_DTOD = 8
BYTES_TRANSFER_DTOH = 2
BYTES_TRANSFER_HTOD = 1
COUNT = 9
CPU_PAGE_FAULT_COUNT = 3
FORCE_INT = 2147483647
GPU_PAGE_FAULT = 4
REMOTE_MAP = 7
THRASHING = 5
THROTTLING = 6
UNKNOWN = 0
class cupti.cupti.ActivityUnifiedMemoryCounterScope(value)

Bases: IntEnum

See CUpti_ActivityUnifiedMemoryCounterScope.

COUNT = 3
FORCE_INT = 2147483647
PROCESS_ALL_DEVICES = 2
PROCESS_SINGLE_DEVICE = 1
UNKNOWN = 0
class cupti.cupti.ActivityUnifiedMemoryMigrationCause(value)

Bases: IntEnum

See CUpti_ActivityUnifiedMemoryMigrationCause.

ACCESS_COUNTERS = 5
COHERENCE = 2
EVICTION = 4
PREFETCH = 3
UNKNOWN = 0
USER = 1
class cupti.cupti.ActivityUnifiedMemoryRemoteMapCause(value)

Bases: IntEnum

See CUpti_ActivityUnifiedMemoryRemoteMapCause.

COHERENCE = 1
EVICTION = 5
OUT_OF_MEMORY = 4
POLICY = 3
THRASHING = 2
UNKNOWN = 0
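
The flags_ member of ActivityUnifiedMemoryCounter3 has to be read against a different enum depending on counter_kind. The sketch below shows one way to express that dispatch. It uses local IntEnum stand-ins that mirror the values listed above so that it runs without CUPTI installed, treats the BYTES_TRANSFER_DTOH mapping as an assumption, and leaves THRASHING and THROTTLING flags as raw integers because CUpti_ActivityFlag is not listed in this section. The helper name describe_flags is hypothetical.

```python
from enum import IntEnum


class CounterKind(IntEnum):
    """Subset of ActivityUnifiedMemoryCounterKind needed for the dispatch."""
    BYTES_TRANSFER_HTOD = 1
    BYTES_TRANSFER_DTOH = 2
    GPU_PAGE_FAULT = 4
    THRASHING = 5
    THROTTLING = 6
    REMOTE_MAP = 7


class AccessType(IntEnum):
    UNKNOWN = 0
    READ = 1
    WRITE = 2
    ATOMIC = 3
    PREFETCH = 4


class MigrationCause(IntEnum):
    UNKNOWN = 0
    USER = 1
    COHERENCE = 2
    PREFETCH = 3
    EVICTION = 4
    ACCESS_COUNTERS = 5


class RemoteMapCause(IntEnum):
    UNKNOWN = 0
    COHERENCE = 1
    THRASHING = 2
    POLICY = 3
    OUT_OF_MEMORY = 4
    EVICTION = 5


# Which enum explains flags_ for a given counter kind.
FLAG_ENUM_BY_KIND = {
    CounterKind.GPU_PAGE_FAULT: AccessType,
    CounterKind.BYTES_TRANSFER_HTOD: MigrationCause,
    CounterKind.BYTES_TRANSFER_DTOH: MigrationCause,  # assumption
    CounterKind.REMOTE_MAP: RemoteMapCause,
}


def describe_flags(counter_kind: int, flags: int) -> str:
    """Render flags_ using the enum that matches counter_kind."""
    enum_cls = FLAG_ENUM_BY_KIND.get(counter_kind)
    if enum_cls is None:
        # THRASHING, THROTTLING and anything else: keep the raw value.
        return f"raw flags {flags:#x}"
    try:
        return f"{enum_cls.__name__}.{enum_cls(flags).name}"
    except ValueError:
        return f"{enum_cls.__name__}({flags})"


print(describe_flags(4, 2))
print(describe_flags(7, 4))
```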
class cupti.cupti.ApiCallbackSite(value)

Bases: IntEnum

See CUpti_ApiCallbackSite.

API_CBSITE_FORCE_INT = 2147483647
API_ENTER = 0
API_EXIT = 1
class cupti.cupti.CallbackData

Bases: object

Empty-initialize an instance of CUpti_CallbackData.

callback_site

Point in the runtime or driver function from where the callback was issued.

Type:

int

context

Driver context current to the thread, or null if no context is current. This value can change from the entry to exit callback of a runtime API function if the runtime initializes a context.

Type:

int

context_uid

Unique ID for the CUDA context associated with the thread. The UIDs are assigned sequentially as contexts are created and are unique within a process.

Type:

int

correlation_data

Pointer to data shared between the entry and exit callbacks of a given runtime or driver API function invocation. This field can be used to pass 64-bit values from the entry callback to the corresponding exit callback.

Type:

int

correlation_id

The activity record correlation ID for this callback. For a driver domain callback (i.e. `domain` CUPTI_CB_DOMAIN_DRIVER_API) this ID will equal the correlation ID in the CUpti_ActivityAPI record corresponding to the CUDA driver function call. For a runtime domain callback (i.e. `domain` CUPTI_CB_DOMAIN_RUNTIME_API) this ID will equal the correlation ID in the CUpti_ActivityAPI record corresponding to the CUDA runtime function call. Within the callback, this ID can be recorded to correlate user data with the activity record. This field is new in 4.1.

Type:

int

function_name

Name of the runtime or driver API function which issued the callback. This string is a global constant and so may be accessed outside of the callback.

Type:

str

function_return_value

Pointer to the return value of the runtime or driver API call. This field is only valid within the exit (CUPTI_API_EXIT) callback. For a runtime API `functionReturnValue` points to a `cudaError_t`. For a driver API `functionReturnValue` points to a `CUresult`.

Type:

int

ptr

Get the pointer address to the data as Python int.

symbol_name

Name of the symbol operated on by the runtime or driver API function which issued the callback. This entry is valid only for driver and runtime launch callbacks, where it returns the name of the kernel.

Type:

str
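
Because correlation_id is the same in the entry and exit callbacks of one API invocation, it can serve as a key for pairing them. The sketch below illustrates that idea with simulated objects. The handler name on_api_callback, its now_ns parameter, and the SimpleNamespace stand-ins are all hypothetical; they only reuse the member names documented above and do not show how callbacks are actually registered with CUPTI.

```python
from enum import IntEnum
from types import SimpleNamespace


class ApiCallbackSite(IntEnum):
    """Local stand-in mirroring the values listed above."""
    API_ENTER = 0
    API_EXIT = 1


pending_enter_ns = {}   # correlation_id -> timestamp recorded at API_ENTER
completed_calls = []    # (function_name, correlation_id, elapsed_ns)


def on_api_callback(cbdata, now_ns):
    """Hypothetical handler: pair enter and exit callbacks by correlation_id."""
    if cbdata.callback_site == ApiCallbackSite.API_ENTER:
        pending_enter_ns[cbdata.correlation_id] = now_ns
    elif cbdata.callback_site == ApiCallbackSite.API_EXIT:
        entered = pending_enter_ns.pop(cbdata.correlation_id, None)
        if entered is not None:
            completed_calls.append(
                (cbdata.function_name, cbdata.correlation_id, now_ns - entered)
            )


# Simulated callbacks; real CallbackData objects would come from CUPTI.
enter = SimpleNamespace(callback_site=0, correlation_id=7,
                        function_name="cuLaunchKernel")
exit_ = SimpleNamespace(callback_site=1, correlation_id=7,
                        function_name="cuLaunchKernel")
on_api_callback(enter, now_ns=1_000)
on_api_callback(exit_, now_ns=1_450)
print(completed_calls)
```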

class cupti.cupti.CallbackDomain(value)

Bases: IntEnum

See CUpti_CallbackDomain.

DRIVER_API = 1
FORCE_INT = 2147483647
INVALID = 0
NVTX = 5
RESOURCE = 3
RUNTIME_API = 2
SIZE = 7
STATE = 6
SYNCHRONIZE = 4
class cupti.cupti.CallbackIdResource(value)

Bases: IntEnum

See CUpti_CallbackIdResource.

CONTEXT_CREATED = 1
CONTEXT_DESTROY_STARTING = 2
CU_INIT_FINISHED = 5
FORCE_INT = 2147483647
GRAPHEXEC_CREATED = 18
GRAPHEXEC_CREATE_STARTING = 17
GRAPHEXEC_DESTROY_STARTING = 19
GRAPHNODE_CLONED = 20
GRAPHNODE_CREATED = 13
GRAPHNODE_CREATE_STARTING = 12
GRAPHNODE_DEPENDENCY_CREATED = 15
GRAPHNODE_DEPENDENCY_DESTROY_STARTING = 16
GRAPHNODE_DESTROY_STARTING = 14
GRAPH_CLONED = 11
GRAPH_CREATED = 9
GRAPH_DESTROY_STARTING = 10
GRAPH_NODE_SET_PARAMS = 23
GRAPH_NODE_UPDATED = 22
INVALID = 0
MODULE_LOADED = 6
MODULE_PROFILED = 8
MODULE_UNLOAD_STARTING = 7
SIZE = 24
STREAM_ATTRIBUTE_CHANGED = 21
STREAM_CREATED = 3
STREAM_DESTROY_STARTING = 4
class cupti.cupti.CallbackIdState(value)

Bases: IntEnum

See CUpti_CallbackIdState.

ERROR = 2
FATAL_ERROR = 1
FORCE_INT = 2147483647
INVALID = 0
SIZE = 4
WARNING = 3
class cupti.cupti.CallbackIdSync(value)

Bases: IntEnum

See CUpti_CallbackIdSync.

CONTEXT_SYNCHRONIZED = 2
FORCE_INT = 2147483647
INVALID = 0
SIZE = 3
STREAM_SYNCHRONIZED = 1
class cupti.cupti.ChannelType(value)

Bases: IntEnum

See CUpti_ChannelType.

ASYNC_MEMCPY = 2
COMPUTE = 1
DECOMP = 3
FORCE_INT = 2147483647
INVALID = 0
class cupti.cupti.ConfidentialComputeRotationEventType(value)

Bases: IntEnum

See CUpti_ConfidentialComputeRotationEventType.

EVENT_TYPE_FORCE_INT = 2147483647
INVALID_ROTATION_EVENT = 0
KEY_ROTATION_ACKNOWLEGED = 2
KEY_ROTATION_CHANNEL_BLOCKED = 1
KEY_ROTATION_CHANNEL_DRAINED = 2
KEY_ROTATION_CHANNEL_UNBLOCKED = 3
KEY_ROTATION_COMPLETED = 4
KEY_ROTATION_REQUESTED = 1
KEY_ROTATION_STARTED = 3
class cupti.cupti.ContextCigMode(value)

Bases: IntEnum

See CUpti_ContextCigMode.

CIG = 1
CIG_FALLBACK = 2
FORCE_INT = 2147483647
NONE = 0
class cupti.cupti.DevType(value)

Bases: IntEnum

See CUpti_DevType.

FORCE_INT = 2147483647
GPU = 1
INVALID = 0
NPU = 2
class cupti.cupti.DeviceAttribute(value)

Bases: IntEnum

See CUpti_DeviceAttribute.

ATTR_CLASS = 10
ATTR_FLOP_DP_PER_CYCLE = 12
ATTR_FLOP_HP_PER_CYCLE = 17
ATTR_FLOP_SP_PER_CYCLE = 11
ATTR_FORCE_INT = 2147483647
ATTR_GLOBAL_MEMORY_BANDWIDTH = 3
ATTR_INSTRUCTION_PER_CYCLE = 4
ATTR_INSTRUCTION_THROUGHPUT_SINGLE_PRECISION = 5
ATTR_MAX_EVENT_DOMAIN_ID = 2
ATTR_MAX_EVENT_ID = 1
ATTR_MAX_FRAME_BUFFERS = 6
ATTR_MAX_L2_UNITS = 13
ATTR_MAX_SHARED_MEMORY_CACHE_CONFIG_PREFER_EQUAL = 16
ATTR_MAX_SHARED_MEMORY_CACHE_CONFIG_PREFER_L1 = 15
ATTR_MAX_SHARED_MEMORY_CACHE_CONFIG_PREFER_SHARED = 14
ATTR_NVSWITCH_PRESENT = 20
ATTR_PCIE_GEN = 9
class cupti.cupti.DeviceVirtualizationMode(value)

Bases: IntEnum

See CUpti_DeviceVirtualizationMode.

FORCE_INT = 2147483647
NONE = 0
PASS_THROUGH = 1
VIRTUAL_GPU = 2
class cupti.cupti.EnvironmentClocksThrottleReason(value)

Bases: IntEnum

See CUpti_EnvironmentClocksThrottleReason.

FORCE_INT = 2147483647
GPU_IDLE = 1
HW_SLOWDOWN = 8
NONE = 0
SW_POWER_CAP = 4
UNKNOWN = 2147483648
UNSUPPORTED = 1073741824
USER_DEFINED_CLOCKS = 2
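
Most members of this enum are distinct powers of two, which suggests that a throttle-reasons value can carry several of them at once. The sketch below decodes such a value under that assumption, which this reference does not state explicitly. It uses a local IntEnum stand-in that mirrors the values listed above and omits FORCE_INT because it overlaps every lower bit. The helper name throttle_reasons is hypothetical.

```python
from enum import IntEnum
from typing import List


class EnvironmentClocksThrottleReason(IntEnum):
    """Local stand-in mirroring the values listed above (FORCE_INT omitted)."""
    NONE = 0
    GPU_IDLE = 1
    USER_DEFINED_CLOCKS = 2
    SW_POWER_CAP = 4
    HW_SLOWDOWN = 8
    UNSUPPORTED = 1073741824
    UNKNOWN = 2147483648


def throttle_reasons(mask: int) -> List[str]:
    """List the names of all reasons whose bit is set in mask."""
    if mask == 0:
        return ["NONE"]
    return [
        reason.name
        for reason in EnvironmentClocksThrottleReason
        if reason.value != 0 and mask & reason.value
    ]


print(throttle_reasons(5))
```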
class cupti.cupti.ExternalCorrelationKind(value)

Bases: IntEnum

See CUpti_ExternalCorrelationKind.

CUSTOM0 = 3
CUSTOM1 = 4
CUSTOM2 = 5
FORCE_INT = 2147483647
INVALID = 0
OPENACC = 2
SIZE = 6
UNKNOWN = 1
class cupti.cupti.FuncShmemLimitConfig(value)

Bases: IntEnum

See CUpti_FuncShmemLimitConfig.

DEFAULT = 0
FORCE_INT = 2147483647
OPTIN = 1
class cupti.cupti.GraphData

Bases: object

Empty-initialize an instance of CUpti_GraphData.

See also

CUpti_GraphData

dependency

The dependent graph node

Type:

int

graph

CUDA graph

Type:

int

graph_exec

CUDA executable graph

Type:

int

node

CUDA graph node

Type:

int

node_type

Type of the node

Type:

int

original_graph

The original CUDA graph from which graph is cloned

Type:

int

original_node

The original CUDA graph node from which node is cloned

Type:

int

ptr

Get the pointer address to the data as Python int.

class cupti.cupti.MetricValue

Bases: object

Empty-initialize an instance of CUpti_MetricValue.

See also

CUpti_MetricValue

metric_value_double

float:

metric_value_int64

int:

metric_value_nvtx_extended_payload

Value for CUPTI_METRIC_VALUE_KIND_NVTX_EXTENDED_PAYLOAD.

Type:

int

metric_value_percent

float:

metric_value_throughput

int:

metric_value_uint64

int:

metric_value_utilization_level

int:

ptr

Get the pointer address to the data as Python int.

class cupti.cupti.MetricValueKind(value)

Bases: IntEnum

See CUpti_MetricValueKind.

DOUBLE = 0
FORCE_INT = 2147483647
INT64 = 4
NVTX_EXTENDED_PAYLOAD = 6
PERCENT = 2
THROUGHPUT = 3
UINT64 = 1
UTILIZATION_LEVEL = 5
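
MetricValue exposes one member per value kind, and the member names line up with the MetricValueKind names in lower case. A minimal sketch of picking the matching member follows. It uses a local IntEnum stand-in that mirrors the values listed above and a SimpleNamespace in place of a real MetricValue instance; the helper name read_metric_value is hypothetical.

```python
from enum import IntEnum
from types import SimpleNamespace


class MetricValueKind(IntEnum):
    """Local stand-in mirroring the values listed above (FORCE_INT omitted)."""
    DOUBLE = 0
    UINT64 = 1
    PERCENT = 2
    THROUGHPUT = 3
    INT64 = 4
    UTILIZATION_LEVEL = 5
    NVTX_EXTENDED_PAYLOAD = 6


def read_metric_value(metric_value, kind: int):
    """Return the member of metric_value that corresponds to kind."""
    # e.g. PERCENT -> metric_value_percent
    member = "metric_value_" + MetricValueKind(kind).name.lower()
    return getattr(metric_value, member)


# Stand-in object; a real MetricValue would come from the bindings.
fake_value = SimpleNamespace(metric_value_percent=37.5, metric_value_uint64=12)
print(read_metric_value(fake_value, MetricValueKind.PERCENT))
```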
class cupti.cupti.MetricValueUtilizationLevel(value)

Bases: IntEnum

See CUpti_MetricValueUtilizationLevel.

FORCE_INT = 2147483647
HIGH = 8
IDLE = 0
LOW = 2
MAX = 10
MID = 5
class cupti.cupti.ModuleResourceData

Bases: object

Empty-initialize an instance of CUpti_ModuleResourceData.

cubin_size

The size of the cubin.

Type:

int

module_id

Identifier to associate with the CUDA module.

Type:

int

p_cubin

Pointer to the associated cubin.

Type:

str

ptr

Get the pointer address to the data as Python int.

class cupti.cupti.OpenAccConstructKind(value)

Bases: IntEnum

See CUpti_OpenAccConstructKind.

ATOMIC = 8
DATA = 4
DECLARE = 9
ENTER_DATA = 5
EXIT_DATA = 6
FORCE_INT = 2147483647
HOST_DATA = 7
INIT = 10
KERNELS = 2
LOOP = 3
PARALLEL = 1
ROUTINE = 14
RUNTIME_API = 16
SET = 12
SHUTDOWN = 11
UNKNOWN = 0
UPDATE = 13
WAIT = 15
class cupti.cupti.OpenAccEventKind(value)

Bases: IntEnum

See CUpti_OpenAccEventKind.

ALLOC = 15
COMPUTE_CONSTRUCT = 9
CREATE = 13
DELETE = 14
DEVICE_INIT = 1
DEVICE_SHUTDOWN = 2
ENQUEUE_DOWNLOAD = 6
ENQUEUE_LAUNCH = 4
ENQUEUE_UPLOAD = 5
ENTER_DATA = 11
EXIT_DATA = 12
FORCE_INT = 2147483647
FREE = 16
IMPLICIT_WAIT = 8
INVALID = 0
RUNTIME_SHUTDOWN = 3
UPDATE = 10
WAIT = 7
class cupti.cupti.OpenMpEventKind(value)

Bases: IntEnum

See CUpti_OpenMpEventKind.

FORCE_INT = 2147483647
IDLE = 4
INVALID = 0
PARALLEL = 1
TASK = 2
THREAD = 3
WAIT_BARRIER = 5
WAIT_TASKWAIT = 6
class cupti.cupti.PcieDeviceType(value)

Bases: IntEnum

See CUpti_PcieDeviceType.

BRIDGE = 1
FORCE_INT = 2147483647
GPU = 0
class cupti.cupti.ResourceData

Bases: object

Empty-initialize an instance of CUpti_ResourceData.

context

For CUPTI_CBID_RESOURCE_CONTEXT_CREATED and CUPTI_CBID_RESOURCE_CONTEXT_DESTROY_STARTING, the context being created or destroyed. For CUPTI_CBID_RESOURCE_STREAM_CREATED and CUPTI_CBID_RESOURCE_STREAM_DESTROY_STARTING, the context containing the stream being created or destroyed.

Type:

int

ptr

Get the pointer address to the data as Python int.

resource_descriptor

Reserved for future use.

Type:

int

resource_handle

_py_anon_pod0:

class cupti.cupti.Result(value)

Bases: IntEnum

See CUptiResult.

ERROR_API_NOT_IMPLEMENTED = 11
ERROR_CDP_TRACING_NOT_SUPPORTED = 32
ERROR_CMP_DEVICE_NOT_SUPPORTED = 42
ERROR_CONFIDENTIAL_COMPUTING_NOT_SUPPORTED = 41
ERROR_CUDA_COMPILER_NOT_COMPATIBLE = 34
ERROR_DISABLED = 23
ERROR_FORCE_INT = 2147483647
ERROR_HARDWARE = 9
ERROR_HARDWARE_BUSY = 26
ERROR_INSUFFICIENT_PRIVILEGES = 35
ERROR_INVALID_CHIP_NAME = 46
ERROR_INVALID_CONTEXT = 3
ERROR_INVALID_DEVICE = 2
ERROR_INVALID_EVENT_DOMAIN_ID = 4
ERROR_INVALID_EVENT_ID = 5
ERROR_INVALID_EVENT_NAME = 6
ERROR_INVALID_EVENT_VALUE = 22
ERROR_INVALID_HANDLE = 19
ERROR_INVALID_KIND = 21
ERROR_INVALID_METRIC_ID = 16
ERROR_INVALID_METRIC_NAME = 17
ERROR_INVALID_METRIC_VALUE = 25
ERROR_INVALID_MODULE = 24
ERROR_INVALID_OPERATION = 7
ERROR_INVALID_PARAMETER = 1
ERROR_INVALID_STREAM = 20
ERROR_LEGACY_PROFILER_NOT_SUPPORTED = 38
ERROR_MAX_LIMIT_REACHED = 12
ERROR_MIG_DEVICE_NOT_SUPPORTED = 43
ERROR_MULTIPLE_SUBSCRIBERS_NOT_SUPPORTED = 39
ERROR_NOT_COMPATIBLE = 14
ERROR_NOT_INITIALIZED = 15
ERROR_NOT_READY = 13
ERROR_NOT_SUPPORTED = 27
ERROR_OLD_PROFILER_API_INITIALIZED = 36
ERROR_OPENACC_UNDEFINED_ROUTINE = 37
ERROR_OUT_OF_MEMORY = 8
ERROR_PARAMETER_SIZE_NOT_SUFFICIENT = 10
ERROR_QUEUE_EMPTY = 18
ERROR_SLI_DEVICE_NOT_SUPPORTED = 44
ERROR_UM_PROFILING_NOT_SUPPORTED = 28
ERROR_UM_PROFILING_NOT_SUPPORTED_ON_DEVICE = 29
ERROR_UM_PROFILING_NOT_SUPPORTED_ON_NON_P2P_DEVICES = 30
ERROR_UM_PROFILING_NOT_SUPPORTED_WITH_MPS = 31
ERROR_UNKNOWN = 999
ERROR_VIRTUALIZED_DEVICE_INSUFFICIENT_PRIVILEGES = 40
ERROR_VIRTUALIZED_DEVICE_NOT_SUPPORTED = 33
ERROR_WSL_DEVICE_NOT_SUPPORTED = 45
SUCCESS = 0
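
A status code such as the one carried by cupti.cupti.cuptiError is an int, so log messages are easier to read once it is mapped to a Result name. The sketch below shows that mapping with a local IntEnum stand-in holding only a few of the values listed above; with the real bindings, cupti.cupti.Result would presumably cover every code. The helper name describe_status is hypothetical.

```python
from enum import IntEnum


class Result(IntEnum):
    """Local stand-in with a small subset of the values listed above."""
    SUCCESS = 0
    ERROR_INVALID_PARAMETER = 1
    ERROR_NOT_INITIALIZED = 15
    ERROR_MULTIPLE_SUBSCRIBERS_NOT_SUPPORTED = 39
    ERROR_UNKNOWN = 999


def describe_status(status: int) -> str:
    """Format a status code as 'NAME (code)', tolerating unlisted codes."""
    try:
        name = Result(status).name
    except ValueError:
        name = "UNRECOGNIZED"
    return f"{name} ({status})"


print(describe_status(15))
print(describe_status(12345))
```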
class cupti.cupti.StateData

Bases: object

Empty-initialize an instance of CUpti_StateData.

See also

CUpti_StateData

ptr

Get the pointer address to the data as Python int.

class cupti.cupti.StreamAttrData

Bases: object

Empty-initialize an instance of CUpti_StreamAttrData.

attr

The type of the CUDA stream attribute

Type:

int

ptr

Get the pointer address to the data as Python int.

stream

The CUDA stream handle for the attribute

Type:

int

value

The value of the CUDA stream attribute

Type:

int

class cupti.cupti.SubscriberParams

Bases: object

Empty-initialize an instance of CUpti_SubscriberParams.

old_subscriber_name

The name of the incompatible tool or the existing CUPTI subscriber, if cupti.cupti.subscribe_v2 errors out with CUPTI_ERROR_MULTIPLE_SUBSCRIBERS_NOT_SUPPORTED return code. Is None otherwise.

Type:

Union[str, None]

ptr

Get the pointer address to the data as Python int.

struct_size

Size of the data structure. CUPTI client should set the size of the structure. It will be used in CUPTI to check what fields are available in the structure. Used to preserve backward compatibility.

Type:

int

subscriber_name

Name given to the subscriber. The subscriber name need not include the “CUPTI” prefix, as the CUPTI library automatically adds it as “CUPTI for <subscriberName>”. Can be None. An internal copy is created. Size must not exceed cupti.cupti.SUBSCRIBER_NAME_MAX_LEN to avoid truncation.

Type:

str

class cupti.cupti.SynchronizeData

Bases: object

Empty-initialize an instance of CUpti_SynchronizeData.

context

The context of the stream being synchronized.

Type:

int

ptr

Get the pointer address to the data as Python int.

stream

The stream being synchronized.

Type:

int

class cupti.cupti._py_anon_pod0

Bases: object

Empty-initialize an instance of _anon_pod0.

See also

_anon_pod0

ptr

Get the pointer address to the data as Python int.

stream

int:

class cupti.cupti._py_anon_pod1

Bases: object

Empty-initialize an instance of _anon_pod1.

See also

_anon_pod1

notification

_py_anon_pod2:

ptr

Get the pointer address to the data as Python int.

class cupti.cupti._py_anon_pod10

Bases: object

Empty-initialize an instance of _anon_pod10.

See also

_anon_pod10

ptr

Get the pointer address to the data as Python int.

v_double

float:

v_int32

int:

v_int64

int:

v_uint32

int:

v_uint64

int:

class cupti.cupti._py_anon_pod11

Bases: object

Empty-initialize an instance of _anon_pod11.

See also

_anon_pod11

cooling

_py_anon_pod15:

power

_py_anon_pod14:

ptr

Get the pointer address to the data as Python int.

speed

_py_anon_pod12:

temperature

_py_anon_pod13:

class cupti.cupti._py_anon_pod12

Bases: object

Empty-initialize an instance of _anon_pod12.

See also

_anon_pod12

clocks_throttle_reasons

int:

memory_clock

int:

int:

int:

ptr

Get the pointer address to the data as Python int.

sm_clock

int:

class cupti.cupti._py_anon_pod13

Bases: object

Empty-initialize an instance of _anon_pod13.

See also

_anon_pod13

gpu_temperature

int:

ptr

Get the pointer address to the data as Python int.

class cupti.cupti._py_anon_pod14

Bases: object

Empty-initialize an instance of _anon_pod14.

See also

_anon_pod14

power

int:

power_limit

int:

ptr

Get the pointer address to the data as Python int.

class cupti.cupti._py_anon_pod15

Bases: object

Empty-initialize an instance of _anon_pod15.

See also

_anon_pod15

fan_speed

int:

ptr

Get the pointer address to the data as Python int.

class cupti.cupti._py_anon_pod2

Bases: object

Empty-initialize an instance of _anon_pod2.

See also

_anon_pod2

message

str:

ptr

Get the pointer address to the data as Python int.

result

int:

class cupti.cupti._py_anon_pod24

Bases: object

Empty-initialize an instance of _anon_pod24.

See also

_anon_pod24

both

int:

config

_py_anon_pod25:

ptr

Get the pointer address to the data as Python int.

class cupti.cupti._py_anon_pod25

Bases: object

Empty-initialize an instance of _anon_pod25.

See also

_anon_pod25

executed

int:

ptr

Get the pointer address to the data as Python int.

requested

int:

class cupti.cupti._py_anon_pod3

Bases: object

Empty-initialize an instance of _anon_pod3.

See also

_anon_pod3

process_id

int:

ptr

Get the pointer address to the data as Python int.

thread_id

int:

class cupti.cupti._py_anon_pod4

Bases: object

Empty-initialize an instance of _anon_pod4.

See also

_anon_pod4

context_id

int:

device_id

int:

ptr

Get the pointer address to the data as Python int.

stream_id

int:

class cupti.cupti._py_anon_pod5

Bases: object

Empty-initialize an instance of _anon_pod5.

See also

_anon_pod5

address

int:

memory_pool_type

int:

pad2

int:

pool

_py_anon_pod6:

ptr

Get the pointer address to the data as Python int.

release_threshold

int:

utilized_size

int:

class cupti.cupti._py_anon_pod6

Bases: object

Empty-initialize an instance of _anon_pod6.

See also

_anon_pod6

process_id

int:

ptr

Get the pointer address to the data as Python int.

size_

int:

class cupti.cupti._py_anon_pod7

Bases: object

Empty-initialize an instance of _anon_pod7.

See also

_anon_pod7

both

int:

config

_py_anon_pod8:

ptr

Get the pointer address to the data as Python int.

class cupti.cupti._py_anon_pod8

Bases: object

Empty-initialize an instance of _anon_pod8.

See also

_anon_pod8

executed

int:

ptr

Get the pointer address to the data as Python int.

requested

int:

class cupti.cupti._py_anon_pod9

Bases: object

Empty-initialize an instance of _anon_pod9.

See also

_anon_pod9

cu

int:

cupti

int:

ptr

Get the pointer address to the data as Python int.

class cupti.cupti.driver_api_trace_cbid(value)

Bases: IntEnum

See CUpti_driver_api_trace_cbid.

FORCE_INT = 2147483647
INVALID = 0
SIZE = 807
cu64Array3DCreate = 230
cu64Array3DGetDescriptor = 231
cu64ArrayCreate = 228
cu64ArrayGetDescriptor = 229
cu64D3D10ResourceGetMappedPitch = 200
cu64D3D10ResourceGetMappedPointer = 198
cu64D3D10ResourceGetMappedSize = 199
cu64D3D10ResourceGetSurfaceDimensions = 201
cu64D3D9MapVertexBuffer = 206
cu64D3D9ResourceGetMappedPitch = 205
cu64D3D9ResourceGetMappedPointer = 203
cu64D3D9ResourceGetMappedSize = 204
cu64D3D9ResourceGetSurfaceDimensions = 202
cu64DeviceTotalMem = 197
cu64GLMapBufferObject = 207
cu64GLMapBufferObjectAsync = 208
cu64GraphicsResourceGetMappedPointer = 131
cu64MemAlloc = 30
cu64MemAllocPitch = 32
cu64MemFree = 34
cu64MemGetAddressRange = 36
cu64MemGetInfo = 28
cu64MemHostAlloc = 215
cu64MemHostGetDevicePointer = 41
cu64Memcpy2D = 232
cu64Memcpy2DAsync = 234
cu64Memcpy2DUnaligned = 233
cu64Memcpy3D = 59
cu64Memcpy3DAsync = 70
cu64MemcpyAtoD = 52
cu64MemcpyDtoA = 50
cu64MemcpyDtoD = 48
cu64MemcpyDtoDAsync = 65
cu64MemcpyDtoH = 46
cu64MemcpyDtoHAsync = 63
cu64MemcpyHtoD = 44
cu64MemcpyHtoDAsync = 61
cu64MemsetD16 = 74
cu64MemsetD16Async = 219
cu64MemsetD2D16 = 80
cu64MemsetD2D16Async = 225
cu64MemsetD2D32 = 82
cu64MemsetD2D32Async = 227
cu64MemsetD2D8 = 78
cu64MemsetD2D8Async = 223
cu64MemsetD32 = 76
cu64MemsetD32Async = 221
cu64MemsetD8 = 72
cu64MemsetD8Async = 217
cu64ModuleGetGlobal = 25
cu64TexRefGetAddress = 104
cu64TexRefSetAddress = 96
cu64TexRefSetAddress2D = 98
cuArray3DCreate = 90
cuArray3DCreate_v2 = 274
cuArray3DGetDescriptor = 91
cuArray3DGetDescriptor_v2 = 275
cuArrayCreate = 87
cuArrayCreate_v2 = 272
cuArrayDestroy = 89
cuArrayGetDescriptor = 88
cuArrayGetDescriptor_v2 = 273
cuArrayGetMemoryRequirements = 654
cuArrayGetPlane = 597
cuArrayGetSparseProperties = 582
cuBinaryFree = 376
cuCheckpointProcessCheckpoint = 771
cuCheckpointProcessGetRestoreThreadId = 768
cuCheckpointProcessGetState = 769
cuCheckpointProcessLock = 770
cuCheckpointProcessRestore = 772
cuCheckpointProcessUnlock = 773
cuCompilePtx = 375
cuCoredumpGetAttribute = 701
cuCoredumpGetAttributeGlobal = 702
cuCoredumpSetAttribute = 703
cuCoredumpSetAttributeGlobal = 704
cuCtxAttach = 12
cuCtxCreate = 10
cuCtxCreate_v2 = 235
cuCtxCreate_v3 = 645
cuCtxCreate_v4 = 757
cuCtxDestroy = 11
cuCtxDestroy_v2 = 322
cuCtxDetach = 13
cuCtxDisablePeerAccess = 314
cuCtxEnablePeerAccess = 313
cuCtxFromGreenCtx = 753
cuCtxGetApiVersion = 296
cuCtxGetCacheConfig = 299
cuCtxGetCurrent = 304
cuCtxGetDevResource = 746
cuCtxGetDevice = 16
cuCtxGetDevice_v2 = 795
cuCtxGetExecAffinity = 646
cuCtxGetFlags = 391
cuCtxGetId = 695
cuCtxGetLimit = 137
cuCtxGetSharedMemConfig = 337
cuCtxGetStreamPriorityRange = 370
cuCtxPopCurrent = 15
cuCtxPopCurrent_v2 = 324
cuCtxPushCurrent = 14
cuCtxPushCurrent_v2 = 323
cuCtxRecordEvent = 755
cuCtxResetPersistingL2Cache = 568
cuCtxSetCacheConfig = 300
cuCtxSetCurrent = 303
cuCtxSetFlags = 705
cuCtxSetLimit = 136
cuCtxSetSharedMemConfig = 336
cuCtxSynchronize = 17
cuCtxSynchronize_v2 = 800
cuCtxWaitEvent = 756
cuD3D10CtxCreate = 139
cuD3D10CtxCreateOnDevice = 212
cuD3D10CtxCreate_v2 = 236
cuD3D10GetDevice = 138
cuD3D10GetDevices = 211
cuD3D10GetDirect3DDevice = 297
cuD3D10MapResources = 143
cuD3D10RegisterResource = 141
cuD3D10ResourceGetMappedArray = 146
cuD3D10ResourceGetMappedPitch = 149
cuD3D10ResourceGetMappedPitch_v2 = 262
cuD3D10ResourceGetMappedPointer = 147
cuD3D10ResourceGetMappedPointer_v2 = 260
cuD3D10ResourceGetMappedSize = 148
cuD3D10ResourceGetMappedSize_v2 = 261
cuD3D10ResourceGetSurfaceDimensions = 150
cuD3D10ResourceGetSurfaceDimensions_v2 = 263
cuD3D10ResourceSetMapFlags = 145
cuD3D10UnmapResources = 144
cuD3D10UnregisterResource = 142
cuD3D11CtxCreate = 152
cuD3D11CtxCreateOnDevice = 210
cuD3D11CtxCreate_v2 = 237
cuD3D11GetDevice = 151
cuD3D11GetDevices = 209
cuD3D11GetDirect3DDevice = 298
cuD3D9Begin = 168
cuD3D9CtxCreate = 155
cuD3D9CtxCreateOnDevice = 214
cuD3D9CtxCreate_v2 = 238
cuD3D9End = 169
cuD3D9GetDevice = 154
cuD3D9GetDevices = 213
cuD3D9GetDirect3DDevice = 157
cuD3D9MapResources = 160
cuD3D9MapVertexBuffer = 171
cuD3D9MapVertexBuffer_v2 = 268
cuD3D9RegisterResource = 158
cuD3D9RegisterVertexBuffer = 170
cuD3D9ResourceGetMappedArray = 164
cuD3D9ResourceGetMappedPitch = 167
cuD3D9ResourceGetMappedPitch_v2 = 267
cuD3D9ResourceGetMappedPointer = 165
cuD3D9ResourceGetMappedPointer_v2 = 265
cuD3D9ResourceGetMappedSize = 166
cuD3D9ResourceGetMappedSize_v2 = 266
cuD3D9ResourceGetSurfaceDimensions = 163
cuD3D9ResourceGetSurfaceDimensions_v2 = 264
cuD3D9ResourceSetMapFlags = 162
cuD3D9UnmapResources = 161
cuD3D9UnmapVertexBuffer = 172
cuD3D9UnregisterResource = 159
cuD3D9UnregisterVertexBuffer = 173
cuDestroyExternalMemory = 488
cuDestroyExternalSemaphore = 494
cuDevResourceGenerateDesc = 748
cuDevSmResourceSplitByCount = 751
cuDeviceCanAccessPeer = 312
cuDeviceComputeCapability = 6
cuDeviceGet = 3
cuDeviceGetAttribute = 9
cuDeviceGetByPCIBusId = 331
cuDeviceGetCount = 4
cuDeviceGetDefaultMemPool = 606
cuDeviceGetDevResource = 745
cuDeviceGetExecAffinitySupport = 644
cuDeviceGetGraphMemAttribute = 641
cuDeviceGetHostAtomicCapabilities = 805
cuDeviceGetLuid = 532
cuDeviceGetMemPool = 610
cuDeviceGetName = 5
cuDeviceGetNvSciSyncAttributes = 542
cuDeviceGetP2PAtomicCapabilities = 804
cuDeviceGetP2PAttribute = 454
cuDeviceGetPCIBusId = 332
cuDeviceGetProperties = 8
cuDeviceGetTexture1DLinearMaxWidth = 579
cuDeviceGetUuid = 482
cuDeviceGetUuid_v2 = 647
cuDeviceGraphMemTrim = 640
cuDevicePrimaryCtxGetState = 392
cuDevicePrimaryCtxRelease = 387
cuDevicePrimaryCtxRelease_v2 = 544
cuDevicePrimaryCtxReset = 389
cuDevicePrimaryCtxReset_v2 = 545
cuDevicePrimaryCtxRetain = 386
cuDevicePrimaryCtxSetFlags = 388
cuDevicePrimaryCtxSetFlags_v2 = 546
cuDeviceRegisterAsyncNotification = 735
cuDeviceSetGraphMemAttribute = 642
cuDeviceSetMemPool = 609
cuDeviceTotalMem = 7
cuDeviceTotalMem_v2 = 259
cuDeviceUnregisterAsyncNotification = 736
cuDriverGetGpuCodeIsaVersion = 806
cuDriverGetVersion = 2
cuEGLStreamConsumerAcquireFrame = 395
cuEGLStreamConsumerConnect = 393
cuEGLStreamConsumerConnectWithFlags = 470
cuEGLStreamConsumerDisconnect = 394
cuEGLStreamConsumerReleaseFrame = 396
cuEGLStreamProducerConnect = 446
cuEGLStreamProducerDisconnect = 447
cuEGLStreamProducerPresentFrame = 448
cuEGLStreamProducerReturnFrame = 453
cuEventCreate = 118
cuEventCreateFromEGLSync = 479
cuEventCreateFromNVNSync = 469
cuEventDestroy = 122
cuEventDestroy_v2 = 325
cuEventElapsedTime = 123
cuEventElapsedTime_v2 = 780
cuEventQuery = 120
cuEventRecord = 119
cuEventRecordWithFlags = 587
cuEventRecordWithFlags_ptsz = 588
cuEventRecord_ptsz = 441
cuEventSynchronize = 121
cuExternalMemoryGetMappedBuffer = 486
cuExternalMemoryGetMappedMipmappedArray = 487
cuFlushGPUDirectRDMAWrites = 627
cuFuncGetAttribute = 85
cuFuncGetModule = 566
cuFuncGetName = 718
cuFuncGetParamInfo = 733
cuFuncIsLoaded = 741
cuFuncLoad = 742
cuFuncSetAttribute = 481
cuFuncSetBlockShape = 83
cuFuncSetCacheConfig = 86
cuFuncSetSharedMemConfig = 338
cuFuncSetSharedSize = 84
cuGLCtxCreate = 174
cuGLCtxCreate_v2 = 239
cuGLGetDevices = 333
cuGLGetDevices_v2 = 385
cuGLInit = 178
cuGLMapBufferObject = 180
cuGLMapBufferObjectAsync = 184
cuGLMapBufferObjectAsync_v2 = 270
cuGLMapBufferObjectAsync_v2_ptsz = 445
cuGLMapBufferObject_v2 = 269
cuGLMapBufferObject_v2_ptds = 417
cuGLRegisterBufferObject = 179
cuGLSetBufferObjectMapFlags = 183
cuGLUnmapBufferObject = 181
cuGLUnmapBufferObjectAsync = 185
cuGLUnregisterBufferObject = 182
cuGetErrorName = 373
cuGetErrorString = 372
cuGetExportTable = 135
cuGetProcAddress = 626
cuGetProcAddress_v2 = 677
cuGraphAddBatchMemOpNode = 669
cuGraphAddChildGraphNode = 525
cuGraphAddDependencies = 518
cuGraphAddDependencies_v2 = 727
cuGraphAddEmptyNode = 526
cuGraphAddEventRecordNode = 589
cuGraphAddEventWaitNode = 590
cuGraphAddExternalSemaphoresSignalNode = 618
cuGraphAddExternalSemaphoresWaitNode = 621
cuGraphAddHostNode = 530
cuGraphAddKernelNode = 502
cuGraphAddKernelNode_v2 = 689
cuGraphAddMemAllocNode = 638
cuGraphAddMemFreeNode = 639
cuGraphAddMemcpyNode = 504
cuGraphAddMemsetNode = 506
cuGraphAddNode = 712
cuGraphAddNode_v2 = 723
cuGraphBatchMemOpNodeGetParams = 670
cuGraphBatchMemOpNodeSetParams = 671
cuGraphChildGraphNodeGetGraph = 529
cuGraphClone = 523
cuGraphConditionalHandleCreate = 722
cuGraphCreate = 501
cuGraphDebugDotPrint = 628
cuGraphDestroy = 517
cuGraphDestroyNode = 522
cuGraphEventRecordNodeGetEvent = 591
cuGraphEventRecordNodeSetEvent = 593
cuGraphEventWaitNodeGetEvent = 592
cuGraphEventWaitNodeSetEvent = 594
cuGraphExecBatchMemOpNodeSetParams = 672
cuGraphExecChildGraphNodeSetParams = 586
cuGraphExecDestroy = 516
cuGraphExecEventRecordNodeSetEvent = 595
cuGraphExecEventWaitNodeSetEvent = 596
cuGraphExecExternalSemaphoresSignalNodeSetParams = 624
cuGraphExecExternalSemaphoresWaitNodeSetParams = 625
cuGraphExecGetFlags = 658
cuGraphExecHostNodeSetParams = 564
cuGraphExecKernelNodeSetParams = 538
cuGraphExecKernelNodeSetParams_v2 = 692
cuGraphExecMemcpyNodeSetParams = 562
cuGraphExecMemsetNodeSetParams = 563
cuGraphExecNodeSetParams = 714
cuGraphExecUpdate = 561
cuGraphExecUpdate_v2 = 696
cuGraphExternalSemaphoresSignalNodeGetParams = 619
cuGraphExternalSemaphoresSignalNodeSetParams = 620
cuGraphExternalSemaphoresWaitNodeGetParams = 622
cuGraphExternalSemaphoresWaitNodeSetParams = 623
cuGraphGetEdges = 535
cuGraphGetEdges_v2 = 724
cuGraphGetNodes = 534
cuGraphGetRootNodes = 510
cuGraphHostNodeGetParams = 531
cuGraphHostNodeSetParams = 533
cuGraphInstantiate = 513
cuGraphInstantiateWithFlags = 643
cuGraphInstantiateWithParams = 656
cuGraphInstantiateWithParams_ptsz = 657
cuGraphInstantiate_v2 = 578
cuGraphKernelNodeCopyAttributes = 569
cuGraphKernelNodeGetAttribute = 570
cuGraphKernelNodeGetParams = 503
cuGraphKernelNodeGetParams_v2 = 690
cuGraphKernelNodeSetAttribute = 571
cuGraphKernelNodeSetParams = 521
cuGraphKernelNodeSetParams_v2 = 691
cuGraphLaunch = 514
cuGraphLaunch_ptsz = 515
cuGraphMemAllocNodeGetParams = 648
cuGraphMemFreeNodeGetParams = 649
cuGraphMemcpyNodeGetParams = 505
cuGraphMemcpyNodeSetParams = 520
cuGraphMemsetNodeGetParams = 507
cuGraphMemsetNodeSetParams = 508
cuGraphNodeFindInClone = 524
cuGraphNodeGetDependencies = 511
cuGraphNodeGetDependencies_v2 = 725
cuGraphNodeGetDependentNodes = 512
cuGraphNodeGetDependentNodes_v2 = 726
cuGraphNodeGetEnabled = 651
cuGraphNodeGetType = 509
cuGraphNodeSetEnabled = 650
cuGraphNodeSetParams = 713
cuGraphReleaseUserObject = 637
cuGraphRemoveDependencies = 519
cuGraphRemoveDependencies_v2 = 728
cuGraphRetainUserObject = 636
cuGraphUpload = 580
cuGraphUpload_ptsz = 581
cuGraphicsD3D10RegisterResource = 140
cuGraphicsD3D11RegisterResource = 153
cuGraphicsD3D9RegisterResource = 156
cuGraphicsEGLRegisterImage = 390
cuGraphicsGLRegisterBuffer = 175
cuGraphicsGLRegisterImage = 176
cuGraphicsMapResources = 133
cuGraphicsMapResources_ptsz = 443
cuGraphicsResourceGetMappedEglFrame = 449
cuGraphicsResourceGetMappedMipmappedArray = 360
cuGraphicsResourceGetMappedPointer = 130
cuGraphicsResourceGetMappedPointer_v2 = 258
cuGraphicsResourceSetMapFlags = 132
cuGraphicsResourceSetMapFlags_v2 = 380
cuGraphicsSubResourceGetMappedArray = 129
cuGraphicsUnmapResources = 134
cuGraphicsUnmapResources_ptsz = 444
cuGraphicsUnregisterResource = 128
cuGraphicsVDPAURegisterOutputSurface = 189
cuGraphicsVDPAURegisterVideoSurface = 188
cuGreenCtxCreate = 743
cuGreenCtxDestroy = 744
cuGreenCtxGetDevResource = 747
cuGreenCtxGetId = 782
cuGreenCtxRecordEvent = 749
cuGreenCtxStreamCreate = 758
cuGreenCtxWaitEvent = 750
cuImportExternalMemory = 485
cuImportExternalSemaphore = 489
cuInit = 1
cuIpcCloseMemHandle = 330
cuIpcGetEventHandle = 334
cuIpcGetMemHandle = 328
cuIpcOpenEventHandle = 335
cuIpcOpenMemHandle = 329
cuIpcOpenMemHandle_v2 = 567
cuKernelGetAttribute = 686
cuKernelGetFunction = 683
cuKernelGetLibrary = 754
cuKernelGetName = 719
cuKernelGetParamInfo = 734
cuKernelSetAttribute = 687
cuKernelSetCacheConfig = 688
cuLaunch = 115
cuLaunchCooperativeKernel = 477
cuLaunchCooperativeKernelMultiDevice = 480
cuLaunchCooperativeKernel_ptsz = 478
cuLaunchGrid = 116
cuLaunchGridAsync = 117
cuLaunchHostFunc = 527
cuLaunchHostFunc_ptsz = 528
cuLaunchKernel = 307
cuLaunchKernelEx = 652
cuLaunchKernelEx_ptsz = 653
cuLaunchKernel_ptsz = 442
cuLibraryEnumerateKernels = 740
cuLibraryGetGlobal = 684
cuLibraryGetKernel = 681
cuLibraryGetKernelCount = 739
cuLibraryGetManaged = 685
cuLibraryGetModule = 682
cuLibraryGetUnifiedFunction = 700
cuLibraryLoadData = 678
cuLibraryLoadFromFile = 679
cuLibraryUnload = 680
cuLinkAddData = 363
cuLinkAddData_v2 = 382
cuLinkAddFile = 364
cuLinkAddFile_v2 = 383
cuLinkComplete = 365
cuLinkCreate = 362
cuLinkCreate_v2 = 381
cuLinkDestroy = 366
cuLogsCurrent = 765
cuLogsDumpToFile = 766
cuLogsDumpToMemory = 767
cuLogsRegisterCallback = 763
cuLogsUnregisterCallback = 764
cuMemAddressFree = 548
cuMemAddressReserve = 547
cuMemAdvise = 457
cuMemAdvise_v2 = 715
cuMemAlloc = 29
cuMemAllocAsync = 598
cuMemAllocAsync_ptsz = 599
cuMemAllocFromPoolAsync = 611
cuMemAllocFromPoolAsync_ptsz = 612
cuMemAllocHost = 37
cuMemAllocHost_v2 = 294
cuMemAllocManaged = 371
cuMemAllocPitch = 31
cuMemAllocPitch_v2 = 244
cuMemAlloc_v2 = 243
cuMemBatchDecompressAsync = 761
cuMemBatchDecompressAsync_ptsz = 762
cuMemCreate = 549
cuMemDiscardAndPrefetchBatchAsync = 791
cuMemDiscardAndPrefetchBatchAsync_ptsz = 792
cuMemDiscardBatchAsync = 789
cuMemDiscardBatchAsync_ptsz = 790
cuMemExportToShareableHandle = 554
cuMemFree = 33
cuMemFreeAsync = 600
cuMemFreeAsync_ptsz = 601
cuMemFreeHost = 38
cuMemFree_v2 = 245
cuMemGetAccess = 558
cuMemGetAddressRange = 35
cuMemGetAddressRange_v2 = 246
cuMemGetAllocationGranularity = 556
cuMemGetAllocationPropertiesFromHandle = 557
cuMemGetDefaultMemPool = 801
cuMemGetHandleForAddressRange = 674
cuMemGetInfo = 27
cuMemGetInfo_v2 = 242
cuMemGetMemPool = 802
cuMemHostAlloc = 39
cuMemHostAlloc_v2 = 271
cuMemHostGetDevicePointer = 40
cuMemHostGetDevicePointer_v2 = 247
cuMemHostGetFlags = 42
cuMemHostRegister = 301
cuMemHostRegister_v2 = 379
cuMemHostUnregister = 302
cuMemImportFromShareableHandle = 555
cuMemMap = 551
cuMemMapArrayAsync = 584
cuMemMapArrayAsync_ptsz = 585
cuMemPeerGetDevicePointer = 317
cuMemPeerRegister = 315
cuMemPeerUnregister = 316
cuMemPoolCreate = 607
cuMemPoolDestroy = 608
cuMemPoolExportPointer = 615
cuMemPoolExportToShareableHandle = 613
cuMemPoolGetAccess = 617
cuMemPoolGetAttribute = 604
cuMemPoolImportFromShareableHandle = 614
cuMemPoolImportPointer = 616
cuMemPoolSetAccess = 605
cuMemPoolSetAttribute = 603
cuMemPoolTrimTo = 602
cuMemPrefetchAsync = 467
cuMemPrefetchAsync_ptsz = 468
cuMemPrefetchAsync_v2 = 716
cuMemPrefetchAsync_v2_ptsz = 717
cuMemPrefetchBatchAsync = 784
cuMemPrefetchBatchAsync_ptsz = 785
cuMemRangeGetAttribute = 471
cuMemRangeGetAttributes = 472
cuMemRelease = 550
cuMemRetainAllocationHandle = 565
cuMemSetAccess = 553
cuMemSetMemPool = 803
cuMemUnmap = 552
cuMemcpy = 305
cuMemcpy2D = 56
cuMemcpy2DAsync = 68
cuMemcpy2DAsync_v2 = 289
cuMemcpy2DAsync_v2_ptsz = 424
cuMemcpy2DUnaligned = 57
cuMemcpy2DUnaligned_v2 = 288
cuMemcpy2DUnaligned_v2_ptds = 406
cuMemcpy2D_v2 = 287
cuMemcpy2D_v2_ptds = 405
cuMemcpy3D = 58
cuMemcpy3DAsync = 69
cuMemcpy3DAsync_v2 = 291
cuMemcpy3DAsync_v2_ptsz = 425
cuMemcpy3DBatchAsync = 778
cuMemcpy3DBatchAsync_ptsz = 779
cuMemcpy3DBatchAsync_v2 = 798
cuMemcpy3DBatchAsync_v2_ptsz = 799
cuMemcpy3DPeer = 320
cuMemcpy3DPeerAsync = 321
cuMemcpy3DPeerAsync_ptsz = 427
cuMemcpy3DPeer_ptds = 410
cuMemcpy3D_v2 = 290
cuMemcpy3D_v2_ptds = 407
cuMemcpyAsync = 306
cuMemcpyAsync_ptsz = 418
cuMemcpyAtoA = 55
cuMemcpyAtoA_v2 = 286
cuMemcpyAtoA_v2_ptds = 404
cuMemcpyAtoD = 51
cuMemcpyAtoD_v2 = 284
cuMemcpyAtoD_v2_ptds = 401
cuMemcpyAtoH = 54
cuMemcpyAtoHAsync = 67
cuMemcpyAtoHAsync_v2 = 283
cuMemcpyAtoHAsync_v2_ptsz = 420
cuMemcpyAtoH_v2 = 282
cuMemcpyAtoH_v2_ptds = 403
cuMemcpyBatchAsync = 776
cuMemcpyBatchAsync_ptsz = 777
cuMemcpyBatchAsync_v2 = 796
cuMemcpyBatchAsync_v2_ptsz = 797
cuMemcpyDtoA = 49
cuMemcpyDtoA_v2 = 285
cuMemcpyDtoA_v2_ptds = 400
cuMemcpyDtoD = 47
cuMemcpyDtoDAsync = 64
cuMemcpyDtoDAsync_v2 = 281
cuMemcpyDtoDAsync_v2_ptsz = 423
cuMemcpyDtoD_v2 = 280
cuMemcpyDtoD_v2_ptds = 399
cuMemcpyDtoH = 45
cuMemcpyDtoHAsync = 62
cuMemcpyDtoHAsync_v2 = 279
cuMemcpyDtoHAsync_v2_ptsz = 422
cuMemcpyDtoH_v2 = 278
cuMemcpyDtoH_v2_ptds = 398
cuMemcpyHtoA = 53
cuMemcpyHtoAAsync = 66
cuMemcpyHtoAAsync_v2 = 293
cuMemcpyHtoAAsync_v2_ptsz = 419
cuMemcpyHtoA_v2 = 292
cuMemcpyHtoA_v2_ptds = 402
cuMemcpyHtoD = 43
cuMemcpyHtoDAsync = 60
cuMemcpyHtoDAsync_v2 = 277
cuMemcpyHtoDAsync_v2_ptsz = 421
cuMemcpyHtoD_v2 = 276
cuMemcpyHtoD_v2_ptds = 397
cuMemcpyPeer = 318
cuMemcpyPeerAsync = 319
cuMemcpyPeerAsync_ptsz = 426
cuMemcpyPeer_ptds = 409
cuMemcpy_ptds = 408
cuMemcpy_v2 = 248
cuMemsetD16 = 73
cuMemsetD16Async = 218
cuMemsetD16Async_ptsz = 429
cuMemsetD16_v2 = 250
cuMemsetD16_v2_ptds = 412
cuMemsetD2D16 = 79
cuMemsetD2D16Async = 224
cuMemsetD2D16Async_ptsz = 432
cuMemsetD2D16_v2 = 253
cuMemsetD2D16_v2_ptds = 415
cuMemsetD2D32 = 81
cuMemsetD2D32Async = 226
cuMemsetD2D32Async_ptsz = 433
cuMemsetD2D32_v2 = 254
cuMemsetD2D32_v2_ptds = 416
cuMemsetD2D8 = 77
cuMemsetD2D8Async = 222
cuMemsetD2D8Async_ptsz = 431
cuMemsetD2D8_v2 = 252
cuMemsetD2D8_v2_ptds = 414
cuMemsetD32 = 75
cuMemsetD32Async = 220
cuMemsetD32Async_ptsz = 430
cuMemsetD32_v2 = 251
cuMemsetD32_v2_ptds = 413
cuMemsetD8 = 71
cuMemsetD8Async = 216
cuMemsetD8Async_ptsz = 428
cuMemsetD8_v2 = 249
cuMemsetD8_v2_ptds = 411
cuMipmappedArrayCreate = 347
cuMipmappedArrayDestroy = 349
cuMipmappedArrayGetLevel = 348
cuMipmappedArrayGetMemoryRequirements = 655
cuMipmappedArrayGetSparseProperties = 583
cuModuleEnumerateFunctions = 738
cuModuleGetFunction = 23
cuModuleGetFunctionCount = 737
cuModuleGetGlobal = 24
cuModuleGetGlobal_v2 = 241
cuModuleGetLoadingMode = 673
cuModuleGetSurfRef = 190
cuModuleGetTexRef = 26
cuModuleLoad = 18
cuModuleLoadData = 19
cuModuleLoadDataEx = 20
cuModuleLoadFatBinary = 21
cuModuleUnload = 22
cuMultiKernelCooperativeDomainCreate = 793
cuMultiKernelCooperativeDomainDestroy = 794
cuMulticastAddDevice = 707
cuMulticastBindAddr = 709
cuMulticastBindMem = 708
cuMulticastCreate = 706
cuMulticastGetGranularity = 711
cuMulticastUnbind = 710
cuNNSetAllocator = 466
cuNVNbufferGetPointer = 464
cuNVNtextureGetArray = 465
cuOccupancyAvailableDynamicSMemPerBlock = 543
cuOccupancyMaxActiveBlocksPerMultiprocessor = 374
cuOccupancyMaxActiveBlocksPerMultiprocessorWithFlags = 451
cuOccupancyMaxActiveClusters = 676
cuOccupancyMaxPotentialBlockSize = 384
cuOccupancyMaxPotentialBlockSizeWithFlags = 452
cuOccupancyMaxPotentialClusterSize = 675
cuParamSetSize = 110
cuParamSetTexRef = 114
cuParamSetf = 112
cuParamSeti = 111
cuParamSetv = 113
cuPointerGetAttribute = 310
cuPointerGetAttributes = 450
cuPointerSetAttribute = 378
cuProfilerInitialize = 311
cuProfilerStart = 308
cuProfilerStop = 309
cuSemaphoreCreate = 786
cuSemaphoreDestroy = 788
cuSemaphoreExport = 787
cuSignalExternalSemaphoresAsync = 490
cuSignalExternalSemaphoresAsync_ptsz = 491
cuStreamAddCallback = 346
cuStreamAddCallback_ptsz = 437
cuStreamAttachMemAsync = 377
cuStreamAttachMemAsync_ptsz = 438
cuStreamBatchMemOp = 462
cuStreamBatchMemOp_ptsz = 463
cuStreamBatchMemOp_v2 = 667
cuStreamBatchMemOp_v2_ptsz = 668
cuStreamBeginCapture = 495
cuStreamBeginCaptureToGraph = 720
cuStreamBeginCaptureToGraph_ptsz = 721
cuStreamBeginCapture_ptsz = 496
cuStreamBeginCapture_v2 = 539
cuStreamBeginCapture_v2_ptsz = 540
cuStreamCopyAttributes = 572
cuStreamCopyAttributes_ptsz = 573
cuStreamCreate = 124
cuStreamCreateForCaptureToCig = 783
cuStreamCreateWithPriority = 367
cuStreamDestroy = 127
cuStreamDestroy_v2 = 326
cuStreamEndCapture = 497
cuStreamEndCapture_ptsz = 498
cuStreamGetAttribute = 574
cuStreamGetAttribute_ptsz = 575
cuStreamGetCaptureInfo = 536
cuStreamGetCaptureInfo_ptsz = 537
cuStreamGetCaptureInfo_v2 = 629
cuStreamGetCaptureInfo_v2_ptsz = 630
cuStreamGetCaptureInfo_v3 = 729
cuStreamGetCaptureInfo_v3_ptsz = 730
cuStreamGetCtx = 483
cuStreamGetCtx_ptsz = 484
cuStreamGetCtx_v2 = 759
cuStreamGetCtx_v2_ptsz = 760
cuStreamGetDevice = 774
cuStreamGetDevice_ptsz = 775
cuStreamGetFlags = 369
cuStreamGetFlags_ptsz = 435
cuStreamGetGreenCtx = 752
cuStreamGetId = 693
cuStreamGetId_ptsz = 694
cuStreamGetPriority = 368
cuStreamGetPriority_ptsz = 434
cuStreamIsCapturing = 499
cuStreamIsCapturing_ptsz = 500
cuStreamQuery = 125
cuStreamQuery_ptsz = 439
cuStreamSetAttribute = 576
cuStreamSetAttribute_ptsz = 577
cuStreamSetFlags = 559
cuStreamSetFlags_ptsz = 560
cuStreamSynchronize = 126
cuStreamSynchronize_ptsz = 440
cuStreamUpdateCaptureDependencies = 631
cuStreamUpdateCaptureDependencies_ptsz = 632
cuStreamUpdateCaptureDependencies_v2 = 731
cuStreamUpdateCaptureDependencies_v2_ptsz = 732
cuStreamWaitEvent = 295
cuStreamWaitEvent_ptsz = 436
cuStreamWaitValue32 = 458
cuStreamWaitValue32_ptsz = 459
cuStreamWaitValue32_v2 = 659
cuStreamWaitValue32_v2_ptsz = 660
cuStreamWaitValue64 = 473
cuStreamWaitValue64_ptsz = 474
cuStreamWaitValue64_v2 = 661
cuStreamWaitValue64_v2_ptsz = 662
cuStreamWriteValue32 = 460
cuStreamWriteValue32_ptsz = 461
cuStreamWriteValue32_v2 = 663
cuStreamWriteValue32_v2_ptsz = 664
cuStreamWriteValue64 = 475
cuStreamWriteValue64_ptsz = 476
cuStreamWriteValue64_v2 = 665
cuStreamWriteValue64_v2_ptsz = 666
cuSurfObjectCreate = 343
cuSurfObjectDestroy = 344
cuSurfObjectGetResourceDesc = 345
cuSurfRefCreate = 191
cuSurfRefDestroy = 192
cuSurfRefGetArray = 196
cuSurfRefGetFormat = 195
cuSurfRefSetArray = 194
cuSurfRefSetFormat = 193
cuTensorMapEncodeIm2col = 698
cuTensorMapEncodeIm2colWide = 781
cuTensorMapEncodeTiled = 697
cuTensorMapReplaceAddress = 699
cuTexObjectCreate = 339
cuTexObjectDestroy = 340
cuTexObjectGetResourceDesc = 341
cuTexObjectGetResourceViewDesc = 361
cuTexObjectGetTextureDesc = 342
cuTexRefCreate = 92
cuTexRefDestroy = 93
cuTexRefGetAddress = 103
cuTexRefGetAddressMode = 106
cuTexRefGetAddress_v2 = 257
cuTexRefGetArray = 105
cuTexRefGetBorderColor = 456
cuTexRefGetFilterMode = 107
cuTexRefGetFlags = 109
cuTexRefGetFormat = 108
cuTexRefGetMaxAnisotropy = 359
cuTexRefGetMipmapFilterMode = 356
cuTexRefGetMipmapLevelBias = 357
cuTexRefGetMipmapLevelClamp = 358
cuTexRefGetMipmappedArray = 355
cuTexRefSetAddress = 95
cuTexRefSetAddress2D = 97
cuTexRefSetAddress2D_v2 = 256
cuTexRefSetAddress2D_v3 = 327
cuTexRefSetAddressMode = 100
cuTexRefSetAddress_v2 = 255
cuTexRefSetArray = 94
cuTexRefSetBorderColor = 455
cuTexRefSetFilterMode = 101
cuTexRefSetFlags = 102
cuTexRefSetFormat = 99
cuTexRefSetMaxAnisotropy = 354
cuTexRefSetMipmapFilterMode = 351
cuTexRefSetMipmapLevelBias = 352
cuTexRefSetMipmapLevelClamp = 353
cuTexRefSetMipmappedArray = 350
cuThreadExchangeStreamCaptureMode = 541
cuUserObjectCreate = 633
cuUserObjectRelease = 635
cuUserObjectRetain = 634
cuVDPAUCtxCreate = 187
cuVDPAUCtxCreate_v2 = 240
cuVDPAUGetDevice = 186
cuWGLGetDevice = 177
cuWaitExternalSemaphoresAsync = 492
cuWaitExternalSemaphoresAsync_ptsz = 493
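
Many driver entry points in the listing above appear several times with `_v2`/`_v3`, `_ptds`, or `_ptsz` suffixes, each with its own callback ID. The following is a minimal sketch of grouping those variants under one base name and of turning a raw `cbid` integer into a readable name. `DriverCbid` is a hypothetical stand-in that holds a few values copied from the listing so the example runs without CUPTI installed; with the real package, the driver callback ID enumeration documented above would presumably take its place. The helper names `base_name`, `variants_by_base`, and `describe` are illustrative and not part of the CUPTI API.

```python
from collections import defaultdict
from enum import IntEnum
import re


class DriverCbid(IntEnum):
    """Hypothetical stand-in: a few values copied from the listing above."""
    cuMemcpyHtoD = 43
    cuMemcpyHtoD_v2 = 276
    cuMemcpyHtoD_v2_ptds = 397
    cuLaunchKernel = 307
    cuLaunchKernel_ptsz = 442
    cuStreamSynchronize = 126
    cuStreamSynchronize_ptsz = 440


# Trailing variant markers seen in the listing: _v2/_v3, then _ptds or _ptsz.
_VARIANT_SUFFIX = re.compile(r"(_v\d+)?(_ptds|_ptsz)?$")


def base_name(member_name):
    """Strip the version and per-thread-stream suffixes from a member name."""
    return _VARIANT_SUFFIX.sub("", member_name)


def variants_by_base(enum_cls):
    """Map each base API name to the sorted callback IDs of all its variants."""
    groups = defaultdict(list)
    for name, member in enum_cls.__members__.items():
        groups[base_name(name)].append(int(member))
    return {base: sorted(ids) for base, ids in groups.items()}


def describe(cbid):
    """Return the member name for a raw cbid, or a placeholder if unknown."""
    try:
        return DriverCbid(cbid).name
    except ValueError:
        return f"unknown cbid {cbid}"


if __name__ == "__main__":
    for base, ids in variants_by_base(DriverCbid).items():
        print(base, ids)
    print(describe(307))
```

Grouping by base name is useful when aggregating activity records per API, because the same logical call can be reported under any of its variant IDs depending on how the application was compiled.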
class cupti.cupti.runtime_api_trace_cbid(value)

Bases: IntEnum

See CUpti_runtime_api_trace_cbid.
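
Several names in the member list that follows share one value, for example `cudaEventElapsedTime_v12080` and `cudaEventElapsedTime_v2_v12080` (both 486), or `cudaGetDeviceProperties_v12000` and `cudaGetDeviceProperties_v2_v12000` (both 440). In a standard Python `IntEnum`, the second name defined for a value becomes an alias of the first, so a lookup by value returns only one of them. The sketch below shows that behaviour on `RuntimeCbid`, a hypothetical stand-in with values copied from the listing. The definition order used here is an assumption: the listing is sorted alphabetically, so which name is canonical in the real module is not stated in this document. The helper `names_for` is illustrative only.

```python
from enum import IntEnum


class RuntimeCbid(IntEnum):
    """Hypothetical stand-in; values from the listing, definition order assumed."""
    cudaEventElapsedTime_v3020 = 139
    cudaEventElapsedTime_v12080 = 486
    cudaEventElapsedTime_v2_v12080 = 486      # same value: becomes an alias
    cudaGetDeviceProperties_v12000 = 440
    cudaGetDeviceProperties_v2_v12000 = 440   # same value: becomes an alias


def names_for(enum_cls, value):
    """Return every member name, aliases included, bound to the given value."""
    return sorted(
        name for name, member in enum_cls.__members__.items()
        if member.value == value
    )


if __name__ == "__main__":
    # Lookup by value yields the first-defined (canonical) name only.
    print(RuntimeCbid(486).name)
    # Iteration skips aliases; __members__ includes them.
    print(len(list(RuntimeCbid)), len(RuntimeCbid.__members__))
    print(names_for(RuntimeCbid, 486))
```

When comparing a record's `cbid` against a specific API, comparing integer values (or using `names_for`-style lookups) avoids depending on which alias the enumeration treats as canonical.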

FORCE_INT = 2147483647
INVALID = 0
SIZE = 523
cuda470_v12060 = 470
cuda471_v12060 = 471
cuda472_v12060 = 472
cuda473_v12060 = 473
cuda474_v12060 = 474
cuda475_v12060 = 475
cuda476_v12060 = 476
cuda477_v12060 = 477
cuda478_v12060 = 478
cuda479_v12060 = 479
cudaArrayGetInfo_v4010 = 181
cudaArrayGetMemoryRequirements_v11060 = 428
cudaArrayGetPlane_v11020 = 381
cudaArrayGetSparseProperties_v11010 = 359
cudaBindSurfaceToArray_v3020 = 61
cudaBindTexture2D_v3020 = 56
cudaBindTextureToArray_v3020 = 57
cudaBindTextureToMipmappedArray_v5000 = 195
cudaBindTexture_v3020 = 55
cudaChooseDevice_v3020 = 5
cudaConfigureCall_v3020 = 8
cudaCreateChannelDesc_v3020 = 7
cudaCreateSurfaceObject_v5000 = 189
cudaCreateTextureObject_v2_v11080 = 434
cudaCreateTextureObject_v5000 = 185
cudaCtxResetPersistingL2Cache_v11000 = 337
cudaD3D10GetDevice_v3020 = 88
cudaD3D10GetDevices_v3020 = 89
cudaD3D10GetDirect3DDevice_v3020 = 149
cudaD3D10MapResources_v3020 = 94
cudaD3D10RegisterResource_v3020 = 92
cudaD3D10ResourceGetMappedArray_v3020 = 98
cudaD3D10ResourceGetMappedPitch_v3020 = 101
cudaD3D10ResourceGetMappedPointer_v3020 = 99
cudaD3D10ResourceGetMappedSize_v3020 = 100
cudaD3D10ResourceGetSurfaceDimensions_v3020 = 97
cudaD3D10ResourceSetMapFlags_v3020 = 96
cudaD3D10SetDirect3DDevice_v3020 = 90
cudaD3D10UnmapResources_v3020 = 95
cudaD3D10UnregisterResource_v3020 = 93
cudaD3D11GetDevice_v3020 = 84
cudaD3D11GetDevices_v3020 = 85
cudaD3D11GetDirect3DDevice_v3020 = 148
cudaD3D11SetDirect3DDevice_v3020 = 86
cudaD3D9Begin_v3020 = 117
cudaD3D9End_v3020 = 118
cudaD3D9GetDevice_v3020 = 102
cudaD3D9GetDevices_v3020 = 103
cudaD3D9GetDirect3DDevice_v3020 = 105
cudaD3D9MapResources_v3020 = 109
cudaD3D9MapVertexBuffer_v3020 = 121
cudaD3D9RegisterResource_v3020 = 107
cudaD3D9RegisterVertexBuffer_v3020 = 119
cudaD3D9ResourceGetMappedArray_v3020 = 113
cudaD3D9ResourceGetMappedPitch_v3020 = 116
cudaD3D9ResourceGetMappedPointer_v3020 = 114
cudaD3D9ResourceGetMappedSize_v3020 = 115
cudaD3D9ResourceGetSurfaceDimensions_v3020 = 112
cudaD3D9ResourceSetMapFlags_v3020 = 111
cudaD3D9SetDirect3DDevice_v3020 = 104
cudaD3D9UnmapResources_v3020 = 110
cudaD3D9UnmapVertexBuffer_v3020 = 122
cudaD3D9UnregisterResource_v3020 = 108
cudaD3D9UnregisterVertexBuffer_v3020 = 120
cudaDestroyExternalMemory_v10000 = 277
cudaDestroyExternalSemaphore_v10000 = 283
cudaDestroySurfaceObject_v5000 = 190
cudaDestroyTextureObject_v5000 = 186
cudaDeviceCanAccessPeer_v4000 = 154
cudaDeviceDisablePeerAccess_v4000 = 156
cudaDeviceEnablePeerAccess_v4000 = 155
cudaDeviceFlushGPUDirectRDMAWrites_v11030 = 405
cudaDeviceGetAttribute_v5000 = 200
cudaDeviceGetByPCIBusId_v4010 = 173
cudaDeviceGetCacheConfig_v3020 = 168
cudaDeviceGetDefaultMemPool_v11020 = 372
cudaDeviceGetGraphMemAttribute_v11040 = 424
cudaDeviceGetHostAtomicCapabilities_v13000 = 521
cudaDeviceGetLimit_v3020 = 166
cudaDeviceGetMemPool_v11020 = 386
cudaDeviceGetNvSciSyncAttributes_v10020 = 328
cudaDeviceGetP2PAtomicCapabilities_v13000 = 522
cudaDeviceGetP2PAttribute_v8000 = 255
cudaDeviceGetPCIBusId_v4010 = 174
cudaDeviceGetSharedMemConfig_v4020 = 183
cudaDeviceGetStreamPriorityRange_v5050 = 205
cudaDeviceGetTexture1DLinearMaxWidth_v11010 = 347
cudaDeviceGraphMemTrim_v11040 = 423
cudaDeviceRegisterAsyncNotification_v12040 = 465
cudaDeviceReset_v3020 = 164
cudaDeviceSetCacheConfig_v3020 = 169
cudaDeviceSetGraphMemAttribute_v11040 = 425
cudaDeviceSetLimit_v3020 = 167
cudaDeviceSetMemPool_v11020 = 385
cudaDeviceSetSharedMemConfig_v4020 = 184
cudaDeviceSynchronize_v3020 = 165
cudaDeviceUnregisterAsyncNotification_v12040 = 466
cudaDriverGetVersion_v3020 = 1
cudaEGLStreamConsumerAcquireFrame_v7000 = 259
cudaEGLStreamConsumerConnectWithFlags_v7000 = 268
cudaEGLStreamConsumerConnect_v7000 = 257
cudaEGLStreamConsumerDisconnect_v7000 = 258
cudaEGLStreamConsumerReleaseFrame_v7000 = 260
cudaEGLStreamProducerConnect_v7000 = 261
cudaEGLStreamProducerDisconnect_v7000 = 262
cudaEGLStreamProducerPresentFrame_v7000 = 263
cudaEGLStreamProducerReturnFrame_v7000 = 264
cudaEventCreateFromEGLSync_v9000 = 271
cudaEventCreateWithFlags_v3020 = 134
cudaEventCreate_v3020 = 133
cudaEventDestroy_v3020 = 136
cudaEventElapsedTime_v12080 = 486
cudaEventElapsedTime_v2_v12080 = 486
cudaEventElapsedTime_v3020 = 139
cudaEventQuery_v3020 = 138
cudaEventRecordWithFlags_ptsz_v11010 = 371
cudaEventRecordWithFlags_v11010 = 370
cudaEventRecord_ptsz_v7000 = 242
cudaEventRecord_v3020 = 135
cudaEventSynchronize_v3020 = 137
cudaExternalMemoryGetMappedBuffer_v10000 = 275
cudaExternalMemoryGetMappedMipmappedArray_v10000 = 276
cudaFreeArray_v3020 = 24
cudaFreeAsync_ptsz_v11020 = 376
cudaFreeAsync_v11020 = 375
cudaFreeHost_v3020 = 26
cudaFreeMipmappedArray_v5000 = 194
cudaFree_v3020 = 22
cudaFuncGetAttributes_v3020 = 15
cudaFuncGetName_v12030 = 451
cudaFuncGetParamInfo_v12040 = 467
cudaFuncSetAttribute_v9000 = 273
cudaFuncSetCacheConfig_v3020 = 14
cudaFuncSetSharedMemConfig_v4020 = 182
cudaGLGetDevices_v4010 = 175
cudaGLMapBufferObjectAsync_v3020 = 69
cudaGLMapBufferObject_v3020 = 65
cudaGLRegisterBufferObject_v3020 = 64
cudaGLSetBufferObjectMapFlags_v3020 = 68
cudaGLSetGLDevice_v3020 = 63
cudaGLUnmapBufferObjectAsync_v3020 = 70
cudaGLUnmapBufferObject_v3020 = 66
cudaGLUnregisterBufferObject_v3020 = 67
cudaGetChannelDesc_v3020 = 6
cudaGetDeviceCount_v3020 = 3
cudaGetDeviceFlags_v7000 = 212
cudaGetDeviceProperties_v12000 = 440
cudaGetDeviceProperties_v2_v12000 = 440
cudaGetDeviceProperties_v3020 = 4
cudaGetDevice_v3020 = 17
cudaGetDriverEntryPointByVersion_ptsz_v12050 = 469
cudaGetDriverEntryPointByVersion_v12050 = 468
cudaGetDriverEntryPoint_ptsz_v11030 = 407
cudaGetDriverEntryPoint_v11030 = 406
cudaGetErrorName_v6050 = 209
cudaGetErrorString_v3020 = 12
cudaGetExportTable_v13000 = 493
cudaGetFuncBySymbol_v11000 = 336
cudaGetKernel_v12000 = 439
cudaGetLastError_v3020 = 10
cudaGetMipmappedArrayLevel_v5000 = 193
cudaGetSurfaceObjectResourceDesc_v5000 = 191
cudaGetSurfaceReference_v3020 = 62
cudaGetSymbolAddress_v3020 = 53
cudaGetSymbolSize_v3020 = 54
cudaGetTextureAlignmentOffset_v3020 = 59
cudaGetTextureObjectResourceDesc_v5000 = 187
cudaGetTextureObjectResourceViewDesc_v5000 = 199
cudaGetTextureObjectTextureDesc_v2_v11080 = 435
cudaGetTextureObjectTextureDesc_v5000 = 188
cudaGetTextureReference_v3020 = 60
cudaGraphAddChildGraphNode_v10000 = 298
cudaGraphAddDependencies_v10000 = 307
cudaGraphAddDependencies_v12030 = 458
cudaGraphAddDependencies_v2_v12030 = 458
cudaGraphAddEmptyNode_v10000 = 300
cudaGraphAddEventRecordNode_v11010 = 362
cudaGraphAddEventWaitNode_v11010 = 365
cudaGraphAddExternalSemaphoresSignalNode_v11020 = 397
cudaGraphAddExternalSemaphoresWaitNode_v11020 = 400
cudaGraphAddHostNode_v10000 = 296
cudaGraphAddKernelNode_v10000 = 289
cudaGraphAddMemAllocNode_v11040 = 419
cudaGraphAddMemFreeNode_v11040 = 421
cudaGraphAddMemcpyNode1D_v11010 = 352
cudaGraphAddMemcpyNodeFromSymbol_v11010 = 351
cudaGraphAddMemcpyNodeToSymbol_v11010 = 350
cudaGraphAddMemcpyNode_v10000 = 290
cudaGraphAddMemsetNode_v10000 = 293
cudaGraphAddNode_v12020 = 445
cudaGraphAddNode_v12030 = 460
cudaGraphAddNode_v2_v12030 = 460
cudaGraphChildGraphNodeGetGraph_v10000 = 299
cudaGraphClone_v10000 = 301
cudaGraphConditionalHandleCreate_v12030 = 454
cudaGraphCreate_v10000 = 286
cudaGraphDebugDotPrint_v11030 = 408
cudaGraphDestroyNode_v10000 = 309
cudaGraphDestroy_v10000 = 314
cudaGraphEventRecordNodeGetEvent_v11010 = 363
cudaGraphEventRecordNodeSetEvent_v11010 = 364
cudaGraphEventWaitNodeGetEvent_v11010 = 366
cudaGraphEventWaitNodeSetEvent_v11010 = 367
cudaGraphExecChildGraphNodeSetParams_v11010 = 361
cudaGraphExecDestroy_v10000 = 313
cudaGraphExecEventRecordNodeSetEvent_v11010 = 368
cudaGraphExecEventWaitNodeSetEvent_v11010 = 369
cudaGraphExecExternalSemaphoresSignalNodeSetParams_v11020 = 403
cudaGraphExecExternalSemaphoresWaitNodeSetParams_v11020 = 404
cudaGraphExecGetFlags_v12000 = 438
cudaGraphExecHostNodeSetParams_v10020 = 334
cudaGraphExecKernelNodeSetParams_v10010 = 326
cudaGraphExecMemcpyNodeSetParams1D_v11010 = 358
cudaGraphExecMemcpyNodeSetParamsFromSymbol_v11010 = 357
cudaGraphExecMemcpyNodeSetParamsToSymbol_v11010 = 356
cudaGraphExecMemcpyNodeSetParams_v10020 = 332
cudaGraphExecMemsetNodeSetParams_v10020 = 333
cudaGraphExecNodeSetParams_v12020 = 447
cudaGraphExecUpdate_v10020 = 335
cudaGraphExternalSemaphoresSignalNodeGetParams_v11020 = 398
cudaGraphExternalSemaphoresSignalNodeSetParams_v11020 = 399
cudaGraphExternalSemaphoresWaitNodeGetParams_v11020 = 401
cudaGraphExternalSemaphoresWaitNodeSetParams_v11020 = 402
cudaGraphGetEdges_v10000 = 323
cudaGraphGetEdges_v12030 = 455
cudaGraphGetEdges_v2_v12030 = 455
cudaGraphGetNodes_v10000 = 322
cudaGraphGetRootNodes_v10000 = 304
cudaGraphHostNodeGetParams_v10000 = 297
cudaGraphHostNodeSetParams_v10000 = 321
cudaGraphInstantiateWithFlags_v11040 = 418
cudaGraphInstantiateWithParams_ptsz_v12000 = 437
cudaGraphInstantiateWithParams_v12000 = 436
cudaGraphInstantiate_v10000 = 310
cudaGraphInstantiate_v12000 = 443
cudaGraphKernelNodeCopyAttributes_v11000 = 338
cudaGraphKernelNodeGetAttribute_v11000 = 339
cudaGraphKernelNodeGetParams_v10000 = 287
cudaGraphKernelNodeSetAttribute_v11000 = 340
cudaGraphKernelNodeSetParams_v10000 = 288
cudaGraphLaunch_ptsz_v10000 = 312
cudaGraphLaunch_v10000 = 311
cudaGraphMemAllocNodeGetParams_v11040 = 420
cudaGraphMemFreeNodeGetParams_v11040 = 422
cudaGraphMemcpyNodeGetParams_v10000 = 291
cudaGraphMemcpyNodeSetParams1D_v11010 = 355
cudaGraphMemcpyNodeSetParamsFromSymbol_v11010 = 354
cudaGraphMemcpyNodeSetParamsToSymbol_v11010 = 353
cudaGraphMemcpyNodeSetParams_v10000 = 292
cudaGraphMemsetNodeGetParams_v10000 = 294
cudaGraphMemsetNodeSetParams_v10000 = 295
cudaGraphNodeFindInClone_v10000 = 302
cudaGraphNodeGetDependencies_v10000 = 305
cudaGraphNodeGetDependencies_v12030 = 456
cudaGraphNodeGetDependencies_v2_v12030 = 456
cudaGraphNodeGetDependentNodes_v10000 = 306
cudaGraphNodeGetDependentNodes_v12030 = 457
cudaGraphNodeGetDependentNodes_v2_v12030 = 457
cudaGraphNodeGetEnabled_v11060 = 427
cudaGraphNodeGetType_v10000 = 303
cudaGraphNodeSetEnabled_v11060 = 426
cudaGraphNodeSetParams_v12020 = 446
cudaGraphReleaseUserObject_v11030 = 417
cudaGraphRemoveDependencies_v10000 = 308
cudaGraphRemoveDependencies_v12030 = 459
cudaGraphRemoveDependencies_v2_v12030 = 459
cudaGraphRetainUserObject_v11030 = 416
cudaGraphUpload_ptsz_v10000 = 349
cudaGraphUpload_v10000 = 348
cudaGraphicsD3D10RegisterResource_v3020 = 91
cudaGraphicsD3D11RegisterResource_v3020 = 87
cudaGraphicsD3D9RegisterResource_v3020 = 106
cudaGraphicsEGLRegisterImage_v7000 = 256
cudaGraphicsGLRegisterBuffer_v3020 = 73
cudaGraphicsGLRegisterImage_v3020 = 72
cudaGraphicsMapResources_v3020 = 76
cudaGraphicsResourceGetMappedEglFrame_v7000 = 265
cudaGraphicsResourceGetMappedMipmappedArray_v5000 = 196
cudaGraphicsResourceGetMappedPointer_v3020 = 78
cudaGraphicsResourceSetMapFlags_v3020 = 75
cudaGraphicsSubResourceGetMappedArray_v3020 = 79
cudaGraphicsUnmapResources_v3020 = 77
cudaGraphicsUnregisterResource_v3020 = 74
cudaGraphicsVDPAURegisterOutputSurface_v3020 = 83
cudaGraphicsVDPAURegisterVideoSurface_v3020 = 82
cudaHostAlloc_v3020 = 27
cudaHostGetDevicePointer_v3020 = 28
cudaHostGetFlags_v3020 = 29
cudaHostRegister_v4000 = 152
cudaHostUnregister_v4000 = 153
cudaImportExternalMemory_v10000 = 274
cudaImportExternalSemaphore_v10000 = 278
cudaInitDevice_v12000 = 444
cudaIpcCloseMemHandle_v4010 = 180
cudaIpcGetEventHandle_v4010 = 176
cudaIpcGetMemHandle_v4010 = 178
cudaIpcOpenEventHandle_v4010 = 177
cudaIpcOpenMemHandle_v4010 = 179
cudaKernelSetAttributeForDevice_v12060 = 479
cudaLaunchCooperativeKernelMultiDevice_v9000 = 272
cudaLaunchCooperativeKernel_ptsz_v9000 = 270
cudaLaunchCooperativeKernel_v9000 = 269
cudaLaunchHostFunc_ptsz_v10000 = 285
cudaLaunchHostFunc_v10000 = 284
cudaLaunchKernelExC_ptsz_v11060 = 431
cudaLaunchKernelExC_v11060 = 430
cudaLaunchKernel_ptsz_v7000 = 214
cudaLaunchKernel_v7000 = 211
cudaLaunch_ptsz_v7000 = 213
cudaLaunch_v3020 = 13
cudaLibraryEnumerateKernels_v12060 = 478
cudaLibraryGetGlobal_v12060 = 474
cudaLibraryGetKernelCount_v12060 = 477
cudaLibraryGetKernel_v12060 = 473
cudaLibraryGetManaged_v12060 = 475
cudaLibraryGetUnifiedFunction_v12060 = 476
cudaLibraryLoadData_v12060 = 470
cudaLibraryLoadFromFile_v12060 = 471
cudaLibraryUnload_v12060 = 472
cudaLogsCurrent_v13000 = 515
cudaLogsDumpToFile_v13000 = 516
cudaLogsDumpToMemory_v13000 = 517
cudaLogsRegisterCallback_v13000 = 513
cudaLogsUnregisterCallback_v13000 = 514
cudaMalloc3DArray_v3020 = 141
cudaMalloc3D_v3020 = 140
cudaMallocArray_v3020 = 23
cudaMallocAsync_ptsz_v11020 = 374
cudaMallocAsync_v11020 = 373
cudaMallocFromPoolAsync_ptsz_v11020 = 392
cudaMallocFromPoolAsync_v11020 = 391
cudaMallocHost_v3020 = 25
cudaMallocManaged_v6000 = 206
cudaMallocMipmappedArray_v5000 = 192
cudaMallocPitch_v3020 = 21
cudaMalloc_v3020 = 20
cudaMemAdvise_v12020 = 448
cudaMemAdvise_v2_v12020 = 448
cudaMemAdvise_v8000 = 254
cudaMemDiscardAndPrefetchBatchAsync_ptsz_v13000 = 492
cudaMemDiscardAndPrefetchBatchAsync_v13000 = 491
cudaMemDiscardBatchAsync_ptsz_v13000 = 490
cudaMemDiscardBatchAsync_v13000 = 489
cudaMemGetDefaultMemPool_v13000 = 518
cudaMemGetInfo_v3020 = 30
cudaMemGetMemPool_v13000 = 519
cudaMemPoolCreate_v11020 = 383
cudaMemPoolDestroy_v11020 = 384
cudaMemPoolExportPointer_v11020 = 389
cudaMemPoolExportToShareableHandle_v11020 = 387
cudaMemPoolGetAccess_v11020 = 382
cudaMemPoolGetAttribute_v11020 = 379
cudaMemPoolImportFromShareableHandle_v11020 = 388
cudaMemPoolImportPointer_v11020 = 390
cudaMemPoolSetAccess_v11020 = 380
cudaMemPoolSetAttribute_v11020 = 378
cudaMemPoolTrimTo_v11020 = 377
cudaMemPrefetchAsync_ptsz_v12020 = 450
cudaMemPrefetchAsync_ptsz_v8000 = 253
cudaMemPrefetchAsync_v12020 = 449
cudaMemPrefetchAsync_v2_ptsz_v12020 = 450
cudaMemPrefetchAsync_v2_v12020 = 449
cudaMemPrefetchAsync_v8000 = 252
cudaMemPrefetchBatchAsync_ptsz_v13000 = 488
cudaMemPrefetchBatchAsync_v13000 = 487
cudaMemRangeGetAttribute_v8000 = 266
cudaMemRangeGetAttributes_v8000 = 267
cudaMemSetMemPool_v13000 = 520
cudaMemcpy2DArrayToArray_ptds_v7000 = 222
cudaMemcpy2DArrayToArray_v3020 = 38
cudaMemcpy2DAsync_ptsz_v7000 = 228
cudaMemcpy2DAsync_v3020 = 44
cudaMemcpy2DFromArrayAsync_ptsz_v7000 = 230
cudaMemcpy2DFromArrayAsync_v3020 = 46
cudaMemcpy2DFromArray_ptds_v7000 = 220
cudaMemcpy2DFromArray_v3020 = 36
cudaMemcpy2DToArrayAsync_ptsz_v7000 = 229
cudaMemcpy2DToArrayAsync_v3020 = 45
cudaMemcpy2DToArray_ptds_v7000 = 218
cudaMemcpy2DToArray_v3020 = 34
cudaMemcpy2D_ptds_v7000 = 216
cudaMemcpy2D_v3020 = 32
cudaMemcpy3DAsync_ptsz_v7000 = 246
cudaMemcpy3DAsync_v3020 = 145
cudaMemcpy3DBatchAsync_ptsz_v12080 = 485
cudaMemcpy3DBatchAsync_ptsz_v13000 = 512
cudaMemcpy3DBatchAsync_v12080 = 484
cudaMemcpy3DBatchAsync_v13000 = 511
cudaMemcpy3DPeerAsync_ptsz_v7000 = 250
cudaMemcpy3DPeerAsync_v4000 = 163
cudaMemcpy3DPeer_ptds_v7000 = 249
cudaMemcpy3DPeer_v4000 = 162
cudaMemcpy3D_ptds_v7000 = 245
cudaMemcpy3D_v3020 = 144
cudaMemcpyArrayToArray_ptds_v7000 = 221
cudaMemcpyArrayToArray_v3020 = 37
cudaMemcpyAsync_ptsz_v7000 = 225
cudaMemcpyAsync_v3020 = 41
cudaMemcpyBatchAsync_ptsz_v12080 = 483
cudaMemcpyBatchAsync_ptsz_v13000 = 510
cudaMemcpyBatchAsync_v12080 = 482
cudaMemcpyBatchAsync_v13000 = 509
cudaMemcpyFromArrayAsync_ptsz_v7000 = 227
cudaMemcpyFromArrayAsync_v3020 = 43
cudaMemcpyFromArray_ptds_v7000 = 219
cudaMemcpyFromArray_v3020 = 35
cudaMemcpyFromSymbolAsync_ptsz_v7000 = 232
cudaMemcpyFromSymbolAsync_v3020 = 48
cudaMemcpyFromSymbol_ptds_v7000 = 224
cudaMemcpyFromSymbol_v3020 = 40
cudaMemcpyPeerAsync_v4000 = 161
cudaMemcpyPeer_v4000 = 160
cudaMemcpyToArrayAsync_ptsz_v7000 = 226
cudaMemcpyToArrayAsync_v3020 = 42
cudaMemcpyToArray_ptds_v7000 = 217
cudaMemcpyToArray_v3020 = 33
cudaMemcpyToSymbolAsync_ptsz_v7000 = 231
cudaMemcpyToSymbolAsync_v3020 = 47
cudaMemcpyToSymbol_ptds_v7000 = 223
cudaMemcpyToSymbol_v3020 = 39
cudaMemcpy_ptds_v7000 = 215
cudaMemcpy_v3020 = 31
cudaMemset2DAsync_ptsz_v7000 = 236
cudaMemset2DAsync_v3020 = 52
cudaMemset2D_ptds_v7000 = 234
cudaMemset2D_v3020 = 50
cudaMemset3DAsync_ptsz_v7000 = 244
cudaMemset3DAsync_v3020 = 143
cudaMemset3D_ptds_v7000 = 243
cudaMemset3D_v3020 = 142
cudaMemsetAsync_ptsz_v7000 = 235
cudaMemsetAsync_v3020 = 51
cudaMemset_ptds_v7000 = 233
cudaMemset_v3020 = 49
cudaMipmappedArrayGetMemoryRequirements_v11060 = 429
cudaMipmappedArrayGetSparseProperties_v11010 = 360
cudaOccupancyAvailableDynamicSMemPerBlock_v10200 = 329
cudaOccupancyMaxActiveBlocksPerMultiprocessorWithFlags_v7000 = 251
cudaOccupancyMaxActiveBlocksPerMultiprocessor_v6000 = 207
cudaOccupancyMaxActiveBlocksPerMultiprocessor_v6050 = 210
cudaOccupancyMaxActiveClusters_v11070 = 433
cudaOccupancyMaxPotentialClusterSize_v11070 = 432
cudaPeekAtLastError_v3020 = 11
cudaPeerGetDevicePointer_v4000 = 159
cudaPeerRegister_v4000 = 157
cudaPeerUnregister_v4000 = 158
cudaPointerGetAttributes_v4000 = 151
cudaProfilerInitialize_v4000 = 170
cudaProfilerStart_v4000 = 171
cudaProfilerStop_v4000 = 172
cudaRuntimeGetVersion_v3020 = 2
cudaSetDeviceFlags_v3020 = 19
cudaSetDevice_v3020 = 16
cudaSetDoubleForDevice_v3020 = 124
cudaSetDoubleForHost_v3020 = 125
cudaSetValidDevices_v3020 = 18
cudaSetupArgument_v3020 = 9
cudaSignalExternalSemaphoresAsync_ptsz_v10000 = 280
cudaSignalExternalSemaphoresAsync_ptsz_v11020 = 394
cudaSignalExternalSemaphoresAsync_v10000 = 279
cudaSignalExternalSemaphoresAsync_v11020 = 393
cudaSignalExternalSemaphoresAsync_v2_ptsz_v11020 = 394
cudaSignalExternalSemaphoresAsync_v2_v11020 = 393
cudaStreamAddCallback_ptsz_v7000 = 248
cudaStreamAddCallback_v5000 = 197
cudaStreamAttachMemAsync_ptsz_v7000 = 241
cudaStreamAttachMemAsync_v6000 = 208
cudaStreamBeginCaptureToGraph_ptsz_v12030 = 453
cudaStreamBeginCaptureToGraph_v12030 = 452
cudaStreamBeginCapture_ptsz_v10000 = 316
cudaStreamBeginCapture_v10000 = 315
cudaStreamCopyAttributes_ptsz_v11000 = 342
cudaStreamCopyAttributes_v11000 = 341
cudaStreamCreateWithFlags_v5000 = 198
cudaStreamCreateWithPriority_v5050 = 202
cudaStreamCreate_v3020 = 129
cudaStreamDestroy_v3020 = 130
cudaStreamDestroy_v5050 = 201
cudaStreamEndCapture_ptsz_v10000 = 320
cudaStreamEndCapture_v10000 = 319
cudaStreamGetAttribute_ptsz_v11000 = 344
cudaStreamGetAttribute_v11000 = 343
cudaStreamGetCaptureInfo_ptsz_v10010 = 325
cudaStreamGetCaptureInfo_ptsz_v12030 = 462
cudaStreamGetCaptureInfo_v10010 = 324
cudaStreamGetCaptureInfo_v12030 = 461
cudaStreamGetCaptureInfo_v2_ptsz_v11030 = 410
cudaStreamGetCaptureInfo_v2_v11030 = 409
cudaStreamGetCaptureInfo_v3_ptsz_v12030 = 462
cudaStreamGetCaptureInfo_v3_v12030 = 461
cudaStreamGetDevice_ptsz_v12080 = 481
cudaStreamGetDevice_v12080 = 480
cudaStreamGetFlags_ptsz_v7000 = 238
cudaStreamGetFlags_v5050 = 204
cudaStreamGetId_ptsz_v12000 = 442
cudaStreamGetId_v12000 = 441
cudaStreamGetPriority_ptsz_v7000 = 237
cudaStreamGetPriority_v5050 = 203
cudaStreamIsCapturing_ptsz_v10000 = 318
cudaStreamIsCapturing_v10000 = 317
cudaStreamQuery_ptsz_v7000 = 240
cudaStreamQuery_v3020 = 132
cudaStreamSetAttribute_ptsz_v11000 = 346
cudaStreamSetAttribute_v11000 = 345
cudaStreamSetFlags_ptsz_v10200 = 331
cudaStreamSetFlags_v10200 = 330
cudaStreamSynchronize_ptsz_v7000 = 239
cudaStreamSynchronize_v3020 = 131
cudaStreamUpdateCaptureDependencies_ptsz_v11030 = 412
cudaStreamUpdateCaptureDependencies_ptsz_v12030 = 464
cudaStreamUpdateCaptureDependencies_v11030 = 411
cudaStreamUpdateCaptureDependencies_v12030 = 463
cudaStreamUpdateCaptureDependencies_v2_ptsz_v12030 = 464
cudaStreamUpdateCaptureDependencies_v2_v12030 = 463
cudaStreamWaitEvent_ptsz_v7000 = 247
cudaStreamWaitEvent_v3020 = 147
cudaThreadExchangeStreamCaptureMode_v10010 = 327
cudaThreadExit_v3020 = 123
cudaThreadGetCacheConfig_v3020 = 150
cudaThreadGetLimit_v3020 = 127
cudaThreadSetCacheConfig_v3020 = 146
cudaThreadSetLimit_v3020 = 128
cudaThreadSynchronize_v3020 = 126
cudaUnbindTexture_v3020 = 58
cudaUserObjectCreate_v11030 = 413
cudaUserObjectRelease_v11030 = 415
cudaUserObjectRetain_v11030 = 414
cudaVDPAUGetDevice_v3020 = 80
cudaVDPAUSetVDPAUDevice_v3020 = 81
cudaWGLGetDevice_v3020 = 71
cudaWaitExternalSemaphoresAsync_ptsz_v10000 = 282
cudaWaitExternalSemaphoresAsync_ptsz_v11020 = 396
cudaWaitExternalSemaphoresAsync_v10000 = 281
cudaWaitExternalSemaphoresAsync_v11020 = 395
cudaWaitExternalSemaphoresAsync_v2_ptsz_v11020 = 396
cudaWaitExternalSemaphoresAsync_v2_v11020 = 395
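
Several entries in the listing above share one numeric value; for example, cudaSignalExternalSemaphoresAsync_v11020 and cudaSignalExternalSemaphoresAsync_v2_v11020 are both 393. The sketch below shows how such duplicates behave if the enumeration is a standard Python enum.IntEnum, which is an assumption here. DemoCbid and names_for are hypothetical names used only for illustration and are not part of cupti.

```python
from enum import IntEnum


class DemoCbid(IntEnum):
    """Hypothetical two-member stand-in for the callback ID enumeration."""
    cudaSignalExternalSemaphoresAsync_v11020 = 393
    # Same value as the member above: IntEnum treats this name as an alias.
    cudaSignalExternalSemaphoresAsync_v2_v11020 = 393


def names_for(value):
    """Return every name (canonical and alias) bound to a numeric value."""
    return sorted(
        name
        for name, member in DemoCbid.__members__.items()
        if member.value == value
    )
```

With a standard enum, looking a value up by number yields the first name defined for it, so code that prints callback names may show either spelling depending on definition order. Iterating over `__members__` is the usual way to see the aliases as well.
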
cupti.cupti.activity_configure_unified_memory_counter(config: int, count: int)

Set Unified Memory Counter configuration.

Parameters:
  • config (intptr_t) – A pointer to CUpti_ActivityUnifiedMemoryCounterConfig structures containing Unified Memory counter configuration.

  • count (uint32_t) – Number of Unified Memory counter configuration structures.

cupti.cupti.activity_disable(kind: int)

Disable collection of a specific kind of activity record.

Parameters:

kind (ActivityKind) – The kind of activity record to stop collecting.

cupti.cupti.activity_disable_context(context: int, kind: int)

Disable collection of a specific kind of activity record for a context.

Parameters:
  • context (intptr_t) – The context for which activity is to be disabled.

  • kind (ActivityKind) – The kind of activity record to stop collecting.

cupti.cupti.activity_enable(kind: int)

Enable collection of a specific kind of activity record.

Parameters:

kind (ActivityKind) – The kind of activity record to collect.
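
Because activity_enable and activity_disable are a matched pair, one way to keep them balanced is a small context manager. The following is a minimal sketch, not part of the CUPTI Python package: activity_kinds_enabled is a hypothetical helper, and cupti_mod is assumed to be the imported cupti.cupti module (or any object exposing the same two functions).

```python
from contextlib import contextmanager


@contextmanager
def activity_kinds_enabled(cupti_mod, kinds):
    """Enable each activity kind on entry and disable it again on exit.

    cupti_mod is assumed to expose activity_enable(kind) and
    activity_disable(kind) as documented in this reference.
    """
    enabled = []
    try:
        for kind in kinds:
            cupti_mod.activity_enable(kind)
            enabled.append(kind)
        yield
    finally:
        # Disable in reverse order, and only the kinds that were enabled.
        for kind in reversed(enabled):
            cupti_mod.activity_disable(kind)
```

A typical call would pass members of cupti.cupti.ActivityKind as kinds and run the CUDA workload inside the with block.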

cupti.cupti.activity_enable_all_sync_records(enable: int)

Enables collecting records for all synchronization operations.

Parameters:

enable (uint8_t) – A boolean flag indicating whether to enable or disable the collection of all CUDA event query and stream query records.

cupti.cupti.activity_enable_allocation_source(enable: int)

Enables tracking the source library for memory allocation requests.

Parameters:

enable (uint8_t) – A boolean flag indicating whether the source library of the memory allocation request needs to be tracked.

cupti.cupti.activity_enable_and_dump(kind: int)

Enable collection of a specific kind of activity record. For certain activity kinds it dumps existing records.

Parameters:

kind (ActivityKind) – The kind of activity record to collect.

cupti.cupti.activity_enable_context(context: int, kind: int)

Enable collection of a specific kind of activity record for a context.

Parameters:
  • context (intptr_t) – The context for which activity is to be enabled.

  • kind (ActivityKind) – The kind of activity record to collect.

cupti.cupti.activity_enable_cuda_event_device_timestamps(enable: int)

Enable or disable collection of device timestamps for CUPTI_ACTIVITY_KIND_CUDA_EVENT records.

Parameters:

enable (uint8_t) – A boolean flag indicating whether to enable or disable the collection of CUDA event device timestamps.

cupti.cupti.activity_enable_device_graph(enable: int)

Controls the collection of records for device launched graphs.

Parameters:

enable (uint8_t) – A boolean flag indicating whether these records should be collected.

cupti.cupti.activity_enable_driver_api(cbid: int, enable: int)

Controls the collection of activity records for specific CUDA Driver APIs.

Parameters:
  • cbid (uint32_t) – Callback ID of the CUDA Driver API. This can be found in the header cupti_driver_cbid.h.

  • enable (uint8_t) – A boolean flag indicating whether to enable or disable the collection.

cupti.cupti.activity_enable_hw_trace(enable: int)

Enables the collection of CUDA kernel timestamps through the Hardware Event System (HES).

Parameters:

enable (uint8_t) – A boolean flag indicating whether to enable or disable the collection through HW events.

cupti.cupti.activity_enable_latency_timestamps(enable: int)

Controls the collection of queued and submitted timestamps for kernels.

Parameters:

enable (uint8_t) – A boolean flag indicating whether these timestamps should be collected.

cupti.cupti.activity_enable_launch_attributes(enable: int)

Controls the collection of launch attributes for kernels.

Parameters:

enable (uint8_t) – A boolean flag indicating whether these launch attributes should be collected.

cupti.cupti.activity_enable_runtime_api(cbid: int, enable: int)

Controls the collection of activity records for specific CUDA Runtime APIs.

Parameters:
  • cbid (uint32_t) – Callback ID of the CUDA Runtime API. This can be found in the header cupti_runtime_cbid.h.

  • enable (uint8_t) – A boolean flag indicating whether to enable or disable the collection.

cupti.cupti.activity_flush_all(flag: int)

Request to deliver activity records via the buffer completion callback.

Parameters:

flag (uint32_t) – The flag can be set to indicate a forced flush. See CUpti_ActivityFlag.

cupti.cupti.activity_flush_period(time: int)

Sets the flush period for the worker thread.

Parameters:

time (uint32_t) – flush period in milliseconds (ms).

cupti.cupti.activity_get_attribute(attr: int, value_size: int, value: int)

Read an activity API attribute.

Parameters:
  • attr (ActivityAttribute) – The attribute to read.

  • value_size (intptr_t) – Pointer to the size, in bytes, of the buffer pointed to by value. On return, it holds the number of bytes written to value.

  • value (intptr_t) – Returns the value of the attribute.
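
Both value_size and value are documented as intptr_t, so the caller supplies raw addresses. The following is a minimal sketch of one way to do that with the standard ctypes module. It assumes the attribute being read is size_t-valued (check the CUPTI C documentation for the type of each attribute); read_size_t_attribute is a hypothetical helper and cupti_mod stands for the imported cupti.cupti module.

```python
import ctypes


def read_size_t_attribute(cupti_mod, attr):
    """Read an activity attribute that is assumed to be size_t-valued.

    Allocates the value and its size as ctypes objects and passes their
    addresses as plain Python ints, matching the intptr_t parameters.
    """
    value = ctypes.c_size_t(0)
    value_size = ctypes.c_size_t(ctypes.sizeof(value))
    cupti_mod.activity_get_attribute(
        attr,
        ctypes.addressof(value_size),  # in: buffer size, out: bytes written
        ctypes.addressof(value),       # out: attribute value
    )
    return value.value
```

Keeping the ctypes objects alive in local variables for the duration of the call matters here, since the callee writes through the addresses.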

cupti.cupti.activity_get_num_dropped_records(context: int, stream_id: int, dropped: int)

Get the number of activity records that were dropped because of insufficient buffer space.

Parameters:
  • context (intptr_t) – The context, or NULL to get dropped count from global queue.

  • stream_id (uint32_t) – The stream ID.

  • dropped (intptr_t) – Pointer to a location that receives the number of records that were dropped since the last call to this function.

cupti.cupti.activity_pop_external_correlation_id(kind: int) -> int

Pop an external correlation id for the calling thread.

Parameters:

kind (ExternalCorrelationKind) – The kind of external API that activities should be correlated with.

Returns:

On success, the last external correlation id for this kind.

Return type:

uint64_t

cupti.cupti.activity_push_external_correlation_id(kind: int, id: int)

Push an external correlation id for the calling thread.

Parameters:
  • kind (ExternalCorrelationKind) – The kind of external API that activities should be correlated with.

  • id (uint64_t) – External correlation id.
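
Push and pop operate per calling thread and are meant to be paired. A minimal sketch of pairing them with a context manager follows. external_correlation is a hypothetical helper, not part of the package, and cupti_mod is assumed to be the imported cupti.cupti module.

```python
from contextlib import contextmanager


@contextmanager
def external_correlation(cupti_mod, kind, correlation_id):
    """Push an external correlation id for the block, then pop it on exit.

    Must be entered and exited on the same thread, because the
    push/pop pair is documented as applying to the calling thread.
    """
    cupti_mod.activity_push_external_correlation_id(kind, correlation_id)
    try:
        yield correlation_id
    finally:
        cupti_mod.activity_pop_external_correlation_id(kind)
```

CUDA work launched inside the with block is then associated with correlation_id for the given kind, and the id is popped even if the block raises.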

cupti.cupti.activity_register_callbacks(func_buffer_requested, func_buffer_completed)

Registers callback functions with CUPTI for activity buffer handling.

Parameters:
  • func_buffer_requested (function) – callback which is invoked when an empty buffer is requested by CUPTI.

  • func_buffer_completed (function) – callback which is invoked when a buffer containing activity records is available from CUPTI.
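
The two callbacks work together with activity_flush_all: CUPTI asks for a buffer, fills it, and hands the completed records back. The following is a minimal sketch of that flow. The callback signatures are assumptions, since this reference does not state them: the buffer-requested callback is assumed to take no arguments and return a (buffer_size, max_num_records) tuple, and the buffer-completed callback is assumed to receive a list of activity records. The flag value 1 is assumed to request a forced flush (see CUpti_ActivityFlag). collect_activity_records is a hypothetical helper.

```python
def collect_activity_records(cupti_mod, workload, buffer_size=8 * 1024 * 1024):
    """Run workload() and return the activity records delivered by CUPTI.

    Assumes the desired activity kinds have already been enabled and that
    cupti_mod exposes the functions documented in this reference.
    """
    records = []

    def buffer_requested():
        # Assumed contract: (buffer size in bytes, max records; 0 = no limit).
        return buffer_size, 0

    def buffer_completed(activities):
        # Assumed contract: a list of activity record objects.
        records.extend(activities)

    cupti_mod.activity_register_callbacks(buffer_requested, buffer_completed)
    try:
        workload()
    finally:
        # Assumed: 1 requests a forced flush of outstanding records.
        cupti_mod.activity_flush_all(1)
    return records
```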

cupti.cupti.activity_register_timestamp_callback(func_timestamp)

Registers callback function with CUPTI for providing timestamp.

Parameters:

func_timestamp (function) – callback which is invoked when a timestamp is needed by CUPTI.

cupti.cupti.activity_set_attribute(attr: int, value_size: int, value: int)

Write an activity API attribute.

Parameters:
  • attr (ActivityAttribute) – The attribute to write.

  • value_size (intptr_t) – The size, in bytes, of the value.

  • value (intptr_t) – The attribute value to write.

cupti.cupti.compute_capability_supported(major: int, minor: int) -> int

Check support for a compute capability.

Parameters:
  • major (int) – The major revision number of the compute capability.

  • minor (int) – The minor revision number of the compute capability.

Returns:

The support status, as an integer.

Return type:

int

cupti.cupti.device_supported(dev: int) -> int

Check support for a compute device.

Parameters:

dev (int) – The device handle returned by CUDA Driver API cuDeviceGet.

Returns:

The support status, as an integer.

Return type:

int

cupti.cupti.device_virtualization_mode(dev: int) -> int

Query the virtualization mode of the device.

Parameters:

dev (int) – The device handle returned by CUDA Driver API cuDeviceGet.

Returns:

The virtualization mode of the device, as a CUpti_DeviceVirtualizationMode value.

Return type:

int

cupti.cupti.enable_all_domains(enable: int, subscriber: int)

Enable or disable all callbacks in all domains.

Parameters:
  • enable (uint32_t) – New enable state for all callbacks in all domains. Zero disables all callbacks, non-zero enables all callbacks.

  • subscriber (intptr_t) – Handle to callback subscription.

cupti.cupti.enable_callback(enable: int, subscriber: int, domain: int, cbid: int)

Enable or disable callbacks for a specific domain and callback ID.

Parameters:
  • enable (uint32_t) – New enable state for the callback. Zero disables the callback, non-zero enables the callback.

  • subscriber (intptr_t) – Handle to callback subscription.

  • domain (CallbackDomain) – The domain of the callback.

  • cbid (uint32_t) – The ID of the callback.

cupti.cupti.enable_domain(enable: int, subscriber: int, domain: int)

Enable or disable all callbacks for a specific domain.

Parameters:
  • enable (uint32_t) – New enable state for all callbacks in the domain. Zero disables all callbacks, non-zero enables all callbacks.

  • subscriber (intptr_t) – Handle to callback subscription.

  • domain (CallbackDomain) – The domain of the callback.

cupti.cupti.finalize()

Detach CUPTI from the running process.

See also

cuptiFinalize

cupti.cupti.get_auto_boost_state(context: int, state: int)

Get auto boost state.

Parameters:
  • context (intptr_t) – A valid CUcontext.

  • state (intptr_t) – A pointer to CUpti_ActivityAutoBoostState structure which contains the current state and the id of the process that has requested the current state.

cupti.cupti.get_callback_name(domain: int, cbid: int)

Get the name of a callback for a specific domain and callback ID.

Parameters:
  • domain (CallbackDomain) – The domain of the callback.

  • cbid (uint32_t) – The ID of the callback.

Returns:

Returns the name of the callback for the specified domain and callback ID.

Return type:

str

cupti.cupti.get_callback_state(subscriber: int, domain: int, cbid: int) -> int

Get the current enabled/disabled state of a callback for a specific domain and function ID.

Parameters:
  • subscriber (intptr_t) – Handle to the initialized subscriber.

  • domain (CallbackDomain) – The domain of the callback.

  • cbid (uint32_t) – The ID of the callback.

Returns:

Returns non-zero if the callback is enabled, zero otherwise.

Return type:

uint32_t

cupti.cupti.get_context_id(context: int) -> int

Get the ID of a context.

Parameters:

context (intptr_t) – The context.

Returns:

Returns a process-unique ID for the context.

Return type:

uint32_t

cupti.cupti.get_device_id(context: int) -> int

Get the ID of a device.

Parameters:

context (intptr_t) – The context, or NULL to indicate the current context.

Returns:

Returns the ID of the device that is current for the calling thread.

Return type:

uint32_t

See also

cuptiGetDeviceId

cupti.cupti.get_graph_exec_id(graph_exec: int) -> int

Get the unique ID of an executable graph.

Parameters:

graph_exec (intptr_t) – The executable graph.

Returns:

Returns the unique ID of the executable graph.

Return type:

uint32_t

cupti.cupti.get_graph_id(graph: int) -> int

Get the unique ID of a graph.

Parameters:

graph (intptr_t) – The graph.

Returns:

Returns the unique ID of the graph.

Return type:

uint32_t

See also

cuptiGetGraphId

cupti.cupti.get_graph_node_id(node: int) -> int

Get the unique ID of a graph node.

Parameters:

node (intptr_t) – The graph node.

Returns:

Returns the unique ID of the node.

Return type:

uint64_t

cupti.cupti.get_last_error() -> int

Returns the last error from a CUPTI call or callback.

cupti.cupti.get_stream_id_ex(context: int, stream: int, per_thread_stream: int) -> int

Get the ID of a stream.

Parameters:
  • context (intptr_t) – If non-NULL, the stream is checked to ensure that it belongs to this context. Typically this parameter should be NULL.

  • stream (intptr_t) – The stream.

  • per_thread_stream (uint8_t) – Flag to indicate if program is compiled for per-thread streams.

Returns:

Returns a context-unique ID for the stream.

Return type:

uint32_t

cupti.cupti.get_thread_id_type() -> int

Get the thread-id type.

Returns:

The thread-id type used in CUPTI.

Return type:

int

cupti.cupti.get_timestamp() -> int

Get the CUPTI timestamp.

Returns:

Returns the CUPTI timestamp.

Return type:

uint64_t
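
Since activity record start and end timestamps are reported in ns, one use of get_timestamp is to bracket a host-side region on the same clock. The following is a minimal sketch. timed_region is a hypothetical helper, cupti_mod is assumed to be the imported cupti.cupti module, and the value returned by get_timestamp is assumed to be in ns like the activity record timestamps.

```python
def timed_region(cupti_mod, fn):
    """Call fn() and return (result, elapsed) using CUPTI timestamps.

    elapsed is the difference of two get_timestamp() readings, assumed
    to be in nanoseconds like activity record timestamps.
    """
    start = cupti_mod.get_timestamp()
    result = fn()
    end = cupti_mod.get_timestamp()
    return result, end - start
```

The start and end readings can also be compared directly with the start and end fields of activity records to place them relative to the host-side region.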

cupti.cupti.set_thread_id_type(type: int)

Set the thread-id type.

Parameters:

type (ActivityThreadIdType) – The thread-id type to set.

cupti.cupti.subscribe(callback, userdata) -> int

Initialize a callback subscriber with a callback function and user data.

Parameters:
  • callback (function) – The callback function.

  • userdata (intptr_t) – A pointer to user data. This data will be passed to the callback function via the userdata parameter.

Returns:

Returns a handle to the initialized subscriber.

Return type:

intptr_t

See also

cuptiSubscribe

cupti.cupti.subscribe_v2(callback, userdata, p_params: int) -> int

Initialize a callback subscriber with a callback function and user data.

Parameters:
  • callback (function) – The callback function.

  • userdata (intptr_t) – A pointer to user data. This data will be passed to the callback function via the userdata parameter.

  • p_params (intptr_t) – A pointer to CUpti_SubscriberParams. Can be NULL.

Returns:

Returns a handle to the initialized subscriber.

Return type:

intptr_t

cupti.cupti.supported_domains()

Get the available callback domains.

Returns:

List of all available callback domains

Return type:

list[CallbackDomain]

cupti.cupti.unsubscribe(subscriber: int)

Unregister a callback subscriber.

Parameters:

subscriber (intptr_t) – Handle to the initialized subscriber.

See also

cuptiUnsubscribe
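
The subscribe, enable_domain, and unsubscribe functions form a natural lifecycle: obtain a handle, enable callbacks for a domain, and release the handle when done. The following is a minimal sketch of that lifecycle as a context manager. callback_subscription is a hypothetical helper, cupti_mod is assumed to be the imported cupti.cupti module, and the default userdata of 0 stands in for a NULL pointer. The expected signature of callback is not stated in this reference, so it is passed through unchanged.

```python
from contextlib import contextmanager


@contextmanager
def callback_subscription(cupti_mod, callback, domain, userdata=0):
    """Subscribe, enable all callbacks in one domain, unsubscribe on exit.

    Yields the subscriber handle so callers can make further
    enable_callback / enable_domain calls inside the block.
    """
    subscriber = cupti_mod.subscribe(callback, userdata)
    try:
        # Non-zero enables every callback in the domain.
        cupti_mod.enable_domain(1, subscriber, domain)
        yield subscriber
    finally:
        cupti_mod.unsubscribe(subscriber)
```

The domain argument would normally be a member of cupti.cupti.CallbackDomain, for example one of the values returned by supported_domains().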