Overview
CUPTI Python provides Python APIs for building profiling tools that target CUDA Python applications. It wraps the CUPTI C library and exposes both a low-level binding that mirrors the C API and idiomatic Python helpers built on top of it.
Supported APIs
CUPTI Python currently supports the following CUPTI APIs:
| CUPTI API | Description | Python module |
|---|---|---|
| Activity | Asynchronously records CUDA activities, e.g. CUDA API calls, kernels, and memory copies. | cupti.cupti |
| Callback | Callback mechanism that notifies a subscriber when a specific CUDA event has executed. | cupti.cupti |
| PM Sampling | Collects hardware metrics by periodically sampling the GPU performance monitors (PM). | cupti.pm_sampling |
| Profiler Host | Host APIs for enumeration, configuration, and evaluation of performance metrics. | cupti.profiler_host |

The following CUPTI APIs are not supported in CUPTI Python:
- Range Profiling
- PC Sampling
- SASS Metrics
- Checkpoint
Refer to the CUPTI C documentation for the full CUPTI C API surface.
API Layers
CUPTI Python exposes two layers. Most users should start with the pythonic layer.
- 1:1 bindings (cupti.cupti): A direct, function-for-function mapping of the CUPTI C API. The CUPTI Python APIs use snake case naming (lowercase, words separated by underscores), e.g. activity_enable. The CUPTI C APIs use camel case naming, e.g. cuptiActivityEnable. Use this layer for the Activity and Callback APIs (no pythonic wrapper exists for these), when porting existing CUPTI C code, or when you need a CUPTI feature that is not yet wrapped by the pythonic layer.
- Pythonic layer (cupti.pm_sampling, cupti.profiler_host): Higher-level Python APIs built on top of the 1:1 binding APIs (cupti.cupti) that present CUPTI in idiomatic Python: classes with explicit lifecycle methods, dataclass results, iteration over samples, and Python exceptions in place of CUPTI status codes. The pythonic layer handles object lifetimes, parameter struct sizing, and metric evaluation for you, and surfaces failures as ValueError, RuntimeError, MemoryError, and PermissionError. Use this layer when it covers your use case. Today it covers PM Sampling and the Profiler Host APIs.
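The naming rule for the 1:1 bindings can be illustrated with a small converter. This is a minimal sketch: c_name_to_python_name is a hypothetical helper written for this example and is not part of CUPTI Python. It assumes the Python name is obtained by dropping the cupti prefix and converting the rest to snake case, which matches the activity_enable and pm_sampling_enable names that appear in this document. C names containing runs of capital letters may map differently, so check the actual name in cupti.cupti before relying on it.

```python
import re


def c_name_to_python_name(c_name: str) -> str:
    """Guess the 1:1 binding name for a CUPTI C function name.

    Hypothetical helper for illustration only; not part of CUPTI Python.
    """
    if not c_name.startswith("cupti"):
        raise ValueError(f"not a CUPTI C function name: {c_name!r}")
    stem = c_name[len("cupti"):]
    # Put an underscore at each lowercase/digit -> uppercase boundary,
    # then lowercase the whole name.
    snake = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", stem)
    return snake.lower()


for c_name in ("cuptiActivityEnable", "cuptiPmSamplingEnable"):
    print(f"{c_name} -> {c_name_to_python_name(c_name)}")
```

A helper like this is mainly useful when porting existing CUPTI C code, where it gives a first guess at the Python function to look up.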
Mixing layers across different CUPTI APIs is fine: for example, you can use the pythonic PM Sampling collector alongside 1:1 Activity tracing. Mixing layers within the same CUPTI API is not advised, because interoperability between the two layers is not guaranteed in that case. For example, enabling PM Sampling on the same device through both cupti.pm_sampling.Collector.enable() and cupti.cupti.pm_sampling_enable() will fail, since CUPTI only allows PM Sampling to be enabled once per device.
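A profiling tool that uses both layers can guard against enabling PM Sampling twice on one device with its own bookkeeping. The sketch below is one possible approach. PmSamplingOwnership and its methods are hypothetical names invented for this example; the class makes no CUPTI calls and only records which layer the tool has chosen for each device.

```python
class PmSamplingOwnership:
    """Record which API layer enabled PM Sampling on each device.

    Hypothetical bookkeeping for a profiling tool; not part of CUPTI Python.
    """

    def __init__(self):
        # Maps a device index to the name of the layer that claimed it.
        self._owner_by_device = {}

    def claim(self, device_index, layer):
        """Call before enabling PM Sampling on a device through `layer`."""
        owner = self._owner_by_device.get(device_index)
        if owner is not None:
            raise RuntimeError(
                f"PM Sampling on device {device_index} "
                f"is already owned by the {owner} layer"
            )
        self._owner_by_device[device_index] = layer

    def release(self, device_index):
        """Call after PM Sampling has been disabled on the device."""
        self._owner_by_device.pop(device_index, None)


ownership = PmSamplingOwnership()
ownership.claim(0, "pythonic")  # the tool is about to use the pythonic collector

try:
    ownership.claim(0, "1:1")   # a second enable on device 0 is rejected early
except RuntimeError as exc:
    second_claim_error = str(exc)
    print(second_claim_error)

ownership.release(0)
ownership.claim(0, "1:1")       # allowed once the first owner has released
```

Failing in the tool's own code, before any CUPTI call is made, keeps the error message specific to the layer conflict.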
Supported CUPTI Activities
When using the Activity API through cupti.cupti, each supported CUPTI Activity struct corresponds to an equivalent Python class. Activity records are delivered through the func_buffer_completed callback as Python objects. Use the kind field to identify the type of activity record.
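Dispatching on the kind field could look like the sketch below. It does not import CUPTI Python: ActivityKind, KernelRecord, MemcpyRecord, and their fields (name, num_bytes, start, end) are stand-ins defined here for illustration, and the real enum values and record classes come from cupti.cupti. The sketch also assumes the callback receives an iterable of record objects.

```python
from dataclasses import dataclass
from enum import Enum, auto


class ActivityKind(Enum):
    """Stand-in for the activity kind enum; values are illustrative."""
    KERNEL = auto()
    MEMCPY = auto()
    OTHER = auto()


@dataclass
class KernelRecord:
    """Stand-in for a kernel activity record."""
    kind: ActivityKind
    name: str
    start: int  # timestamps, assumed to be in nanoseconds
    end: int


@dataclass
class MemcpyRecord:
    """Stand-in for a memory copy activity record."""
    kind: ActivityKind
    num_bytes: int
    start: int
    end: int


def summarize(record) -> str:
    """Pick the formatting for a record based on its kind field."""
    if record.kind is ActivityKind.KERNEL:
        return f"kernel {record.name}: {record.end - record.start} ns"
    if record.kind is ActivityKind.MEMCPY:
        return f"memcpy {record.num_bytes} bytes: {record.end - record.start} ns"
    return f"unhandled kind {record.kind.name}"


summaries = []


def func_buffer_completed(activities):
    """Callback sketch: assumed to receive an iterable of record objects."""
    for record in activities:
        summaries.append(summarize(record))


# Simulate one completed buffer with two stand-in records.
func_buffer_completed([
    KernelRecord(ActivityKind.KERNEL, "vector_add", 100, 350),
    MemcpyRecord(ActivityKind.MEMCPY, 4096, 400, 420),
])
print("\n".join(summaries))
```

Checking kind before touching other fields matters because different record classes expose different fields.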
The table below maps each supported Activity Kind to its corresponding CUPTI Activity record and Python class:
| Supported Activity Kinds | Activity Record | Python Class |
|---|---|---|