Overview

CUPTI Python provides Python APIs for building profiling tools that target CUDA Python applications. It wraps the CUPTI C library and exposes both a low-level binding that mirrors the C API and idiomatic Python helpers built on top of it.

Supported APIs

CUPTI Python currently supports the following CUPTI APIs:

CUPTI API | Description | Python module
--- | --- | ---
Activity | Asynchronously record CUDA activities, e.g. CUDA API calls, kernels, and memory copies. | cupti.cupti
Callback | Callback mechanism that notifies a subscriber when a specific CUDA event has executed. | cupti.cupti
PM Sampling | Collect hardware metrics by periodically sampling the GPU performance monitors (PM). | cupti.pm_sampling
Profiler Host | Host APIs for enumeration, configuration, and evaluation of performance metrics. | cupti.profiler_host

The following CUPTI APIs are not supported in CUPTI Python:

  • Range Profiling

  • PC Sampling

  • SASS Metrics

  • Checkpoint

Refer to the CUPTI C documentation for the full CUPTI C API surface.
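As a quick reference, the support matrix above can be captured in a small lookup. This is only a sketch: `SUPPORTED`, `UNSUPPORTED`, and `python_module_for` are hypothetical names used for illustration and are not part of CUPTI Python. The module names are the ones listed above.

```python
# Hypothetical lookup built from the support lists above; not part of CUPTI Python.
SUPPORTED = {
    "Activity": "cupti.cupti",
    "Callback": "cupti.cupti",
    "PM Sampling": "cupti.pm_sampling",
    "Profiler Host": "cupti.profiler_host",
}
UNSUPPORTED = {"Range Profiling", "PC Sampling", "SASS Metrics", "Checkpoint"}


def python_module_for(api: str) -> str:
    """Return the name of the Python module that exposes the given CUPTI API."""
    if api in SUPPORTED:
        return SUPPORTED[api]
    if api in UNSUPPORTED:
        raise NotImplementedError(f"{api} is not supported in CUPTI Python")
    raise KeyError(f"unknown CUPTI API: {api!r}")


for name in SUPPORTED:
    print(f"{name:14s} -> {python_module_for(name)}")
```

Raising `NotImplementedError` for the unsupported APIs keeps "known but unavailable" distinct from a misspelled API name, which raises `KeyError`.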

API Layers

CUPTI Python exposes two layers. Most users should start with the pythonic layer.

1:1 bindings — cupti.cupti

A direct, function-for-function mapping of the CUPTI C API. The CUPTI C APIs use camel case naming, e.g. cuptiActivityEnable. The CUPTI Python APIs use snake case naming (lower case, words separated by underscores) and drop the cupti prefix, e.g. activity_enable.
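When porting C code, the naming rule can be applied mechanically. The helper below is a minimal sketch, assuming the Python name is always the C name with the `cupti` prefix removed and the remaining words lower-cased and joined by underscores. `c_to_python_name` is a hypothetical helper, not part of CUPTI Python, and names containing acronyms may not follow this rule exactly, so check the result against the `cupti.cupti` reference.

```python
import re


def c_to_python_name(c_name: str) -> str:
    """Convert a CUPTI C function name to its assumed snake_case Python name."""
    if not c_name.startswith("cupti"):
        raise ValueError(f"not a CUPTI C function name: {c_name!r}")
    stem = c_name[len("cupti"):]
    # Insert an underscore before every capital letter except the first one,
    # then lower-case the whole name.
    return re.sub(r"(?<!^)(?=[A-Z])", "_", stem).lower()


print(c_to_python_name("cuptiActivityEnable"))  # activity_enable
```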

Use this layer for the Activity and Callback APIs (no pythonic wrapper exists for these), when porting existing CUPTI C code, or when you need a CUPTI feature that is not yet wrapped by the pythonic layer.

Pythonic layer — cupti.pm_sampling, cupti.profiler_host

Higher-level Python APIs built on top of the 1:1 binding APIs (cupti.cupti) that present CUPTI in idiomatic Python: classes with explicit lifecycle methods, dataclass results, iteration over samples, and Python exceptions in place of CUPTI status codes. The pythonic layer handles object lifetimes, parameter struct sizing, and metric evaluation for you, and surfaces failures as ValueError, RuntimeError, MemoryError, and PermissionError.
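Because failures arrive as ordinary Python exceptions, they can be handled with a plain try/except. The sketch below does not import CUPTI, so it runs anywhere: `run_step` and `failing_step` are hypothetical stand-ins for a call into the pythonic layer. Which exception type a particular CUPTI failure maps to is not specified here, so the example only reports the type it caught.

```python
# The four exception types the pythonic layer is documented to raise.
PYTHONIC_LAYER_ERRORS = (ValueError, RuntimeError, MemoryError, PermissionError)


def run_step(step):
    """Run one profiling step; return (succeeded, message)."""
    try:
        step()
    except PYTHONIC_LAYER_ERRORS as exc:
        return False, f"{type(exc).__name__}: {exc}"
    return True, "ok"


def failing_step():
    # Stand-in for a pythonic-layer call that fails; not a real CUPTI call.
    raise PermissionError("profiling not permitted")


print(run_step(failing_step))  # (False, 'PermissionError: profiling not permitted')
print(run_step(lambda: None))  # (True, 'ok')
```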

Use this layer when it covers your use case. Today it covers PM Sampling and the Profiler Host APIs.

Mixing layers across different CUPTI APIs is fine — for example, using the pythonic PM Sampling collector alongside 1:1 Activity tracing. Mixing layers within the same CUPTI API is not advised: interoperability between the two layers within the same CUPTI API is not guaranteed. For example, enabling PM Sampling on the same device through both cupti.pm_sampling.Collector.enable() and cupti.cupti.pm_sampling_enable() will fail, since CUPTI only allows PM Sampling to be enabled once per device.
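One way to avoid enabling PM Sampling twice on a device by accident is to track ownership in the application itself. The class below is a minimal sketch of such bookkeeping, assuming the application routes every enable and disable through it. `PmSamplingOwners` is a hypothetical helper, not part of CUPTI Python, and it makes no CUPTI calls.

```python
class PmSamplingOwners:
    """Application-side record of which layer enabled PM Sampling on each device."""

    def __init__(self):
        self._owner_by_device = {}

    def claim(self, device_index: int, layer: str) -> None:
        """Record that `layer` is about to enable PM Sampling on the device."""
        owner = self._owner_by_device.get(device_index)
        if owner is not None:
            raise RuntimeError(
                f"PM Sampling on device {device_index} is already enabled via {owner}"
            )
        self._owner_by_device[device_index] = layer

    def release(self, device_index: int) -> None:
        """Record that PM Sampling was disabled on the device."""
        self._owner_by_device.pop(device_index, None)


owners = PmSamplingOwners()
owners.claim(0, "cupti.pm_sampling")
try:
    owners.claim(0, "cupti.cupti")  # second enable on the same device
except RuntimeError as exc:
    print(exc)
```

Calling `claim` before enabling and `release` after disabling turns a late CUPTI failure into an early, descriptive error in the application's own code.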

Supported CUPTI Activities

When using the Activity API through cupti.cupti, each supported CUPTI Activity struct corresponds to an equivalent Python class. Activity records are delivered through the func_buffer_completed callback as Python objects. Use the kind field to identify the type of activity record.
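Identifying records by their kind field lends itself to a dispatch table. The sketch below runs without CUPTI: `Kind` stands in for `cupti.cupti.ActivityKind`, the `SimpleNamespace` objects stand in for activity records, and `dispatch` is a hypothetical helper. The `name` attribute on the stand-in kernel records is also an assumption, and the actual signature of the `func_buffer_completed` callback is not shown here, so consult the `cupti.cupti` reference before wiring this into a real callback.

```python
from enum import Enum
from types import SimpleNamespace


class Kind(Enum):
    # Stand-in for cupti.cupti.ActivityKind so the sketch runs without CUPTI.
    RUNTIME = 1
    MEMCPY = 2
    KERNEL = 3


def dispatch(records, handlers):
    """Route each record to the handler registered for its kind.

    Returns the number of records that had no registered handler.
    """
    unhandled = 0
    for record in records:
        handler = handlers.get(record.kind)
        if handler is None:
            unhandled += 1
        else:
            handler(record)
    return unhandled


kernel_names = []
handlers = {Kind.KERNEL: lambda record: kernel_names.append(record.name)}

# Stand-in records; real records would be delivered by the callback.
records = [
    SimpleNamespace(kind=Kind.KERNEL, name="vec_add"),
    SimpleNamespace(kind=Kind.MEMCPY),
    SimpleNamespace(kind=Kind.KERNEL, name="saxpy"),
]
unhandled = dispatch(records, handlers)
print(kernel_names, unhandled)  # ['vec_add', 'saxpy'] 1
```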

The table below maps each supported Activity Kind to its corresponding CUPTI Activity record and Python class:

Supported Activity Kinds | Activity Record | Python Class
--- | --- | ---
cupti.cupti.ActivityKind.DRIVER | CUpti_ActivityAPI | cupti.cupti.ActivityAPI
cupti.cupti.ActivityKind.RUNTIME | CUpti_ActivityAPI | cupti.cupti.ActivityAPI
cupti.cupti.ActivityKind.MEMCPY | CUpti_ActivityMemcpy6 | cupti.cupti.ActivityMemcpy6
cupti.cupti.ActivityKind.MEMCPY2 | CUpti_ActivityMemcpyPtoP4 | cupti.cupti.ActivityMemcpyPtoP4
cupti.cupti.ActivityKind.MEMSET | CUpti_ActivityMemset4 | cupti.cupti.ActivityMemset4
cupti.cupti.ActivityKind.MEMORY_POOL | CUpti_ActivityMemoryPool3 | cupti.cupti.ActivityMemoryPool3
cupti.cupti.ActivityKind.MEMORY2 | CUpti_ActivityMemory4 | cupti.cupti.ActivityMemory4
cupti.cupti.ActivityKind.KERNEL | CUpti_ActivityKernel11 | cupti.cupti.ActivityKernel11
cupti.cupti.ActivityKind.CONCURRENT_KERNEL | CUpti_ActivityKernel11 | cupti.cupti.ActivityKernel11
cupti.cupti.ActivityKind.ENVIRONMENT | CUpti_ActivityEnvironment | cupti.cupti.ActivityEnvironment
cupti.cupti.ActivityKind.CONTEXT | CUpti_ActivityContext4 | cupti.cupti.ActivityContext4
cupti.cupti.ActivityKind.UNIFIED_MEMORY_COUNTER | CUpti_ActivityUnifiedMemoryCounter3 | cupti.cupti.ActivityUnifiedMemoryCounter3
cupti.cupti.ActivityKind.FUNCTION | CUpti_ActivityFunction | cupti.cupti.ActivityFunction
cupti.cupti.ActivityKind.MODULE | CUpti_ActivityModule | cupti.cupti.ActivityModule
cupti.cupti.ActivityKind.DEVICE_ATTRIBUTE | CUpti_ActivityDeviceAttribute | cupti.cupti.ActivityDeviceAttribute
cupti.cupti.ActivityKind.CUDA_EVENT | CUpti_ActivityCudaEvent2 | cupti.cupti.ActivityCudaEvent2
cupti.cupti.ActivityKind.STREAM | CUpti_ActivityStream | cupti.cupti.ActivityStream
cupti.cupti.ActivityKind.SYNCHRONIZATION | CUpti_ActivitySynchronization2 | cupti.cupti.ActivitySynchronization2
cupti.cupti.ActivityKind.EXTERNAL_CORRELATION | CUpti_ActivityExternalCorrelation | cupti.cupti.ActivityExternalCorrelation
cupti.cupti.ActivityKind.GRAPH_TRACE | CUpti_ActivityGraphTrace2 | cupti.cupti.ActivityGraphTrace2
cupti.cupti.ActivityKind.JIT | CUpti_ActivityJit2 | cupti.cupti.ActivityJit2
cupti.cupti.ActivityKind.NAME | CUpti_ActivityName | cupti.cupti.ActivityName
cupti.cupti.ActivityKind.MARKER | CUpti_ActivityMarker2 | cupti.cupti.ActivityMarker2
cupti.cupti.ActivityKind.MARKER_DATA | CUpti_ActivityMarkerData2 | cupti.cupti.ActivityMarkerData2
cupti.cupti.ActivityKind.OVERHEAD | CUpti_ActivityOverhead3 | cupti.cupti.ActivityOverhead3
cupti.cupti.ActivityKind.DEVICE | CUpti_ActivityDevice6 | cupti.cupti.ActivityDevice6
cupti.cupti.ActivityKind.DEVICE_GRAPH_TRACE | CUpti_ActivityDeviceGraphTrace | cupti.cupti.ActivityDeviceGraphTrace
cupti.cupti.ActivityKind.MEM_DECOMPRESS | CUpti_ActivityMemDecompress | cupti.cupti.ActivityMemDecompress
cupti.cupti.ActivityKind.ROTATION | CUpti_ActivityConfidentialComputeRotation | cupti.cupti.ActivityConfidentialComputeRotation
cupti.cupti.ActivityKind.GRAPH_HOST_NODE | CUpti_ActivityGraphHostNode | cupti.cupti.ActivityGraphHostNode
cupti.cupti.ActivityKind.COMPUTE_ENGINE_CTX_SWITCH | CUpti_ActivityComputeEngineCtxSwitch | cupti.cupti.ActivityComputeEngineCtxSwitch
cupti.cupti.ActivityKind.GREEN_CONTEXT | CUpti_ActivityGreenContext | cupti.cupti.ActivityGreenContext
cupti.cupti.ActivityKind.HOST_LAUNCH | CUpti_ActivityHostLaunch | cupti.cupti.ActivityHostLaunch
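Some activity kinds in the mapping above share a record class, which is why the kind field, rather than the Python type, is what tells such records apart. The sketch below illustrates this with an excerpt of the mapping written as plain strings, so it runs without CUPTI. `KIND_TO_CLASS` and `kinds_by_class` are hypothetical names for illustration.

```python
from collections import defaultdict

# Excerpt of the mapping above: activity kind -> Python class
# (both written without the cupti.cupti prefix).
KIND_TO_CLASS = {
    "DRIVER": "ActivityAPI",
    "RUNTIME": "ActivityAPI",
    "MEMCPY": "ActivityMemcpy6",
    "KERNEL": "ActivityKernel11",
    "CONCURRENT_KERNEL": "ActivityKernel11",
}


def kinds_by_class(kind_to_class):
    """Group activity kinds by the record class they are delivered as."""
    grouped = defaultdict(list)
    for kind, cls in kind_to_class.items():
        grouped[cls].append(kind)
    return dict(grouped)


# Keep only the classes that serve more than one activity kind.
shared = {
    cls: kinds
    for cls, kinds in kinds_by_class(KIND_TO_CLASS).items()
    if len(kinds) > 1
}
print(shared)
# {'ActivityAPI': ['DRIVER', 'RUNTIME'], 'ActivityKernel11': ['KERNEL', 'CONCURRENT_KERNEL']}
```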