5.4. CUPTI Event API

Functions, types, and enums that implement the CUPTI Event API.

Note

The CUPTI event APIs from the header cupti_events.h are not supported on devices with compute capability 7.5 and higher (i.e. Turing and later GPU architectures). These APIs will be deprecated in a future CUDA release. They are replaced by the Profiling API in the header cupti_profiler_target.h and the Perfworks metrics API in the headers nvperf_host.h and nvperf_target.h, which are supported on devices with compute capability 7.0 and higher (i.e. Volta and later GPU architectures).

Data Structures

Macros

Enumerations

Functions

Typedefs

5.4.1. Macros

CUPTI_EVENT_INVALID

The value that indicates the event value is invalid.

CUPTI_EVENT_OVERFLOW

The overflow value for a CUPTI event.

The CUPTI event value that indicates an overflow.

5.4.2. Enumerations

enum CUpti_DeviceAttribute

Device attributes.

CUPTI device attributes. These attributes can be read using cuptiDeviceGetAttribute.

Values:

enumerator CUPTI_DEVICE_ATTR_MAX_EVENT_ID

Number of event IDs for a device.

Value is a uint32_t.

enumerator CUPTI_DEVICE_ATTR_MAX_EVENT_DOMAIN_ID

Number of event domain IDs for a device.

Value is a uint32_t.

enumerator CUPTI_DEVICE_ATTR_GLOBAL_MEMORY_BANDWIDTH

Get global memory bandwidth in Kbytes/sec.

Value is a uint64_t.

enumerator CUPTI_DEVICE_ATTR_INSTRUCTION_PER_CYCLE

Get theoretical maximum number of instructions per cycle.

Value is a uint32_t.

enumerator CUPTI_DEVICE_ATTR_INSTRUCTION_THROUGHPUT_SINGLE_PRECISION

Get theoretical maximum number of single precision instructions that can be executed per second.

Value is a uint64_t.

enumerator CUPTI_DEVICE_ATTR_MAX_FRAME_BUFFERS

Get number of frame buffers for device.

Value is a uint64_t.

enumerator CUPTI_DEVICE_ATTR_PCIE_LINK_RATE

Get PCIE link rate in Mega bits/sec for device.

Return 0 if bus-type is non-PCIE. Value is a uint64_t.

enumerator CUPTI_DEVICE_ATTR_PCIE_LINK_WIDTH

Get PCIE link width for device.

Return 0 if bus-type is non-PCIE. Value is a uint64_t.

enumerator CUPTI_DEVICE_ATTR_PCIE_GEN

Get PCIE generation for device.

Return 0 if bus-type is non-PCIE. Value is a uint64_t.

enumerator CUPTI_DEVICE_ATTR_DEVICE_CLASS

Get the class for the device.

Value is a CUpti_DeviceAttributeDeviceClass.

enumerator CUPTI_DEVICE_ATTR_FLOP_SP_PER_CYCLE

Get the peak single precision flop per cycle.

Value is a uint64_t.

enumerator CUPTI_DEVICE_ATTR_FLOP_DP_PER_CYCLE

Get the peak double precision flop per cycle.

Value is a uint64_t.

enumerator CUPTI_DEVICE_ATTR_MAX_L2_UNITS

Get number of L2 units.

Value is a uint64_t.

enumerator CUPTI_DEVICE_ATTR_MAX_SHARED_MEMORY_CACHE_CONFIG_PREFER_SHARED

Get the maximum shared memory for the CU_FUNC_CACHE_PREFER_SHARED preference.

Value is a uint64_t.

enumerator CUPTI_DEVICE_ATTR_MAX_SHARED_MEMORY_CACHE_CONFIG_PREFER_L1

Get the maximum shared memory for the CU_FUNC_CACHE_PREFER_L1 preference.

Value is a uint64_t.

enumerator CUPTI_DEVICE_ATTR_MAX_SHARED_MEMORY_CACHE_CONFIG_PREFER_EQUAL

Get the maximum shared memory for the CU_FUNC_CACHE_PREFER_EQUAL preference.

Value is a uint64_t.

enumerator CUPTI_DEVICE_ATTR_FLOP_HP_PER_CYCLE

Get the peak half precision flop per cycle.

Value is a uint64_t.

enumerator CUPTI_DEVICE_ATTR_NVLINK_PRESENT

Check if Nvlink is connected to device.

Returns 1, if at least one Nvlink is connected to the device, returns 0 otherwise. Value is a uint32_t.

enumerator CUPTI_DEVICE_ATTR_GPU_CPU_NVLINK_BW

Check if Nvlink is present between GPU and CPU.

Returns Bandwidth, in Bytes/sec, if Nvlink is present, returns 0 otherwise. Value is a uint64_t.

enumerator CUPTI_DEVICE_ATTR_NVSWITCH_PRESENT

Check if NVSwitch is present in the underlying topology.

Returns 1, if present, returns 0 otherwise. Value is a uint32_t.

enumerator CUPTI_DEVICE_ATTR_FORCE_INT

enum CUpti_DeviceAttributeDeviceClass

Device class.

Enumeration of device classes for device attribute CUPTI_DEVICE_ATTR_DEVICE_CLASS.

Values:

enumerator CUPTI_DEVICE_ATTR_DEVICE_CLASS_TESLA
enumerator CUPTI_DEVICE_ATTR_DEVICE_CLASS_QUADRO
enumerator CUPTI_DEVICE_ATTR_DEVICE_CLASS_GEFORCE
enumerator CUPTI_DEVICE_ATTR_DEVICE_CLASS_TEGRA

enum CUpti_EventAttribute

Event attributes.

Event attributes. These attributes can be read using cuptiEventGetAttribute.

Values:

enumerator CUPTI_EVENT_ATTR_NAME

Event name.

Value is a null terminated const c-string.

enumerator CUPTI_EVENT_ATTR_SHORT_DESCRIPTION

Short description of event.

Value is a null terminated const c-string.

enumerator CUPTI_EVENT_ATTR_LONG_DESCRIPTION

Long description of event.

Value is a null terminated const c-string.

enumerator CUPTI_EVENT_ATTR_CATEGORY

Category of event.

Value is CUpti_EventCategory.

enumerator CUPTI_EVENT_ATTR_PROFILING_SCOPE

Profiling scope of the events.

It can be either device or context or both. Value is a CUpti_EventProfilingScope.

enumerator CUPTI_EVENT_ATTR_FORCE_INT

enum CUpti_EventCategory

An event category.

Each event is assigned to a category that represents the general type of the event. An event’s category is accessed using cuptiEventGetAttribute and the CUPTI_EVENT_ATTR_CATEGORY attribute.

Values:

enumerator CUPTI_EVENT_CATEGORY_INSTRUCTION

An instruction related event.

enumerator CUPTI_EVENT_CATEGORY_MEMORY

A memory related event.

enumerator CUPTI_EVENT_CATEGORY_CACHE

A cache related event.

enumerator CUPTI_EVENT_CATEGORY_PROFILE_TRIGGER

A profile-trigger event.

enumerator CUPTI_EVENT_CATEGORY_SYSTEM

A system event.

enumerator CUPTI_EVENT_CATEGORY_FORCE_INT

enum CUpti_EventCollectionMethod

The collection method used for an event.

The collection method indicates how an event is collected.

Values:

enumerator CUPTI_EVENT_COLLECTION_METHOD_PM

Event is collected using a hardware global performance monitor.

enumerator CUPTI_EVENT_COLLECTION_METHOD_SM

Event is collected using a hardware SM performance monitor.

enumerator CUPTI_EVENT_COLLECTION_METHOD_INSTRUMENTED

Event is collected using software instrumentation.

enumerator CUPTI_EVENT_COLLECTION_METHOD_NVLINK_TC

Event is collected using NvLink throughput counter method.

enumerator CUPTI_EVENT_COLLECTION_METHOD_FORCE_INT

enum CUpti_EventCollectionMode

Event collection modes.

The event collection mode determines the period over which the events within the enabled event groups will be collected.

Values:

enumerator CUPTI_EVENT_COLLECTION_MODE_CONTINUOUS

Events are collected for the entire duration between the cuptiEventGroupEnable and cuptiEventGroupDisable calls.

Event values are reset when the events are read. For CUDA toolkit v6.0 and older this was the default mode.

enumerator CUPTI_EVENT_COLLECTION_MODE_KERNEL

Events are collected only for the durations of kernel executions that occur between the cuptiEventGroupEnable and cuptiEventGroupDisable calls.

Event collection begins when a kernel execution begins, and stops when kernel execution completes. Event values are reset to zero when each kernel execution begins. If multiple kernel executions occur between the cuptiEventGroupEnable and cuptiEventGroupDisable calls, the event values must be read after each kernel launch if those events need to be associated with the specific kernel launch. Note that collection in this mode may significantly change the overall performance characteristics of the application because kernel executions that occur between the cuptiEventGroupEnable and cuptiEventGroupDisable calls are serialized on the GPU. This is the default mode starting with CUDA toolkit v6.5.

enumerator CUPTI_EVENT_COLLECTION_MODE_FORCE_INT

enum CUpti_EventDomainAttribute

Event domain attributes.

Event domain attributes. Except where noted, all the attributes can be read using either cuptiDeviceGetEventDomainAttribute or cuptiEventDomainGetAttribute.

Values:

enumerator CUPTI_EVENT_DOMAIN_ATTR_NAME

Event domain name.

Value is a null terminated const c-string.

enumerator CUPTI_EVENT_DOMAIN_ATTR_INSTANCE_COUNT

Number of instances of the domain for which event counts will be collected.

The domain may have additional instances that cannot be profiled (see CUPTI_EVENT_DOMAIN_ATTR_TOTAL_INSTANCE_COUNT). Can be read only with cuptiDeviceGetEventDomainAttribute. Value is a uint32_t.

enumerator CUPTI_EVENT_DOMAIN_ATTR_TOTAL_INSTANCE_COUNT

Total number of instances of the domain, including instances that cannot be profiled.

Use CUPTI_EVENT_DOMAIN_ATTR_INSTANCE_COUNT to get the number of instances that can be profiled. Can be read only with cuptiDeviceGetEventDomainAttribute. Value is a uint32_t.

enumerator CUPTI_EVENT_DOMAIN_ATTR_COLLECTION_METHOD

Collection method used for events contained in the event domain.

Value is a CUpti_EventCollectionMethod.

enumerator CUPTI_EVENT_DOMAIN_ATTR_FORCE_INT

enum CUpti_EventGroupAttribute

Event group attributes.

Event group attributes. These attributes can be read using cuptiEventGroupGetAttribute. Attributes marked [rw] can also be written using cuptiEventGroupSetAttribute.

Values:

enumerator CUPTI_EVENT_GROUP_ATTR_EVENT_DOMAIN_ID

The domain to which the event group is bound.

This attribute is set when the first event is added to the group. Value is a CUpti_EventDomainID.

enumerator CUPTI_EVENT_GROUP_ATTR_PROFILE_ALL_DOMAIN_INSTANCES

[rw] Profile all the instances of the domain for this eventgroup.

This feature can be used to get load balancing across all instances of a domain. Value is an integer.

enumerator CUPTI_EVENT_GROUP_ATTR_USER_DATA

[rw] Reserved for user data.

enumerator CUPTI_EVENT_GROUP_ATTR_NUM_EVENTS

Number of events in the group.

Value is a uint32_t.

enumerator CUPTI_EVENT_GROUP_ATTR_EVENTS

Enumerates events in the group.

Value is a pointer to a buffer of size sizeof(CUpti_EventID) * num_of_events, where num_of_events is the number of events in the event group and can be queried using CUPTI_EVENT_GROUP_ATTR_NUM_EVENTS.

enumerator CUPTI_EVENT_GROUP_ATTR_INSTANCE_COUNT

Number of instances of the domain bound to this event group that will be counted.

Value is a uint32_t.

enumerator CUPTI_EVENT_GROUP_ATTR_PROFILING_SCOPE

The event group scope can be set to CUPTI_EVENT_PROFILING_SCOPE_DEVICE or CUPTI_EVENT_PROFILING_SCOPE_CONTEXT before any event is added to the group.

When the events that will be added have scope CUPTI_EVENT_PROFILING_SCOPE_BOTH, this attribute sets the scope of the event group to CUPTI_EVENT_PROFILING_SCOPE_DEVICE or CUPTI_EVENT_PROFILING_SCOPE_CONTEXT. If the profiling scope of an event is already CUPTI_EVENT_PROFILING_SCOPE_DEVICE or CUPTI_EVENT_PROFILING_SCOPE_CONTEXT, setting this attribute does not affect that default scope. Events with different scopes cannot be added to the same event group. Value is a uint32_t.

enumerator CUPTI_EVENT_GROUP_ATTR_FORCE_INT

enum CUpti_EventProfilingScope

Profiling scope for event.

The profiling scope of an event indicates whether the event can be collected at context scope, at device scope, or at either of the two.

Values:

enumerator CUPTI_EVENT_PROFILING_SCOPE_CONTEXT

Event is collected at context scope.

enumerator CUPTI_EVENT_PROFILING_SCOPE_DEVICE

Event is collected at device scope.

enumerator CUPTI_EVENT_PROFILING_SCOPE_BOTH

Event can be collected at device or context scope.

The scope can be set using cuptiEventGroupSetAttribute API.

enumerator CUPTI_EVENT_PROFILING_SCOPE_FORCE_INT

enum CUpti_ReadEventFlags

Flags for cuptiEventGroupReadEvent and cuptiEventGroupReadAllEvents.

Values:

enumerator CUPTI_EVENT_READ_FLAG_NONE

No flags.

enumerator CUPTI_EVENT_READ_FLAG_FORCE_INT

5.4.3. Functions

CUptiResult cuptiDeviceEnumEventDomains(CUdevice device, size_t *arraySizeBytes, CUpti_EventDomainID *domainArray)

Get the event domains for a device.

Returns the event domain IDs in domainArray for a device. The size of the domainArray buffer is given by *arraySizeBytes. The size of the domainArray buffer must be at least numdomains * sizeof(CUpti_EventDomainID), where numdomains is the number of domains on the device (see cuptiDeviceGetNumEventDomains); otherwise not all domains will be returned. The value returned in *arraySizeBytes contains the number of bytes returned in domainArray.
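The sizing rule for domainArray can be sketched in a few lines of C. This is an illustration, not CUPTI code: DemoDomainID is a hypothetical stand-in for CUpti_EventDomainID (assumed here to be a 32-bit integer), and the three helper names are invented for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for CUpti_EventDomainID (assumed 32-bit). */
typedef uint32_t DemoDomainID;

/* Minimum buffer size, in bytes, needed to receive numDomains IDs. */
size_t demo_domain_array_bytes(uint32_t numDomains)
{
    return (size_t)numDomains * sizeof(DemoDomainID);
}

/* Number of IDs represented by the byte count written back
 * through *arraySizeBytes. */
uint32_t demo_domains_returned(size_t bytesWritten)
{
    return (uint32_t)(bytesWritten / sizeof(DemoDomainID));
}

/* Returns 1 if every domain was returned, 0 if the buffer was too small. */
int demo_enumeration_complete(uint32_t numDomains, size_t bytesWritten)
{
    assert(bytesWritten % sizeof(DemoDomainID) == 0);
    return demo_domains_returned(bytesWritten) >= numDomains;
}
```

In real use, numDomains would come from cuptiDeviceGetNumEventDomains and bytesWritten from the value left in *arraySizeBytes after cuptiDeviceEnumEventDomains returns.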

Note

Thread-safety: this function is thread safe.

Parameters
  • device – The CUDA device

  • arraySizeBytes – The size of domainArray in bytes, and returns the number of bytes written to domainArray

  • domainArray – Returns the IDs of the event domains for the device

Return values
  • CUPTI_SUCCESS

  • CUPTI_ERROR_NOT_INITIALIZED

  • CUPTI_ERROR_INVALID_DEVICE

  • CUPTI_ERROR_INVALID_PARAMETER – if arraySizeBytes or domainArray is NULL

CUptiResult cuptiDeviceGetAttribute(CUdevice device, CUpti_DeviceAttribute attrib, size_t *valueSize, void *value)

Read a device attribute.

Read a device attribute and return it in *value.

Note

Thread-safety: this function is thread safe.

Parameters
  • device – The CUDA device

  • attrib – The attribute to read

  • valueSize – The size of the buffer pointed to by value, and returns the number of bytes written to value

  • value – Returns the value of the attribute

Return values
  • CUPTI_SUCCESS

  • CUPTI_ERROR_NOT_INITIALIZED

  • CUPTI_ERROR_INVALID_DEVICE

  • CUPTI_ERROR_INVALID_PARAMETER – if valueSize or value is NULL, or if attrib is not a device attribute

  • CUPTI_ERROR_PARAMETER_SIZE_NOT_SUFFICIENT – For non-c-string attribute values, indicates that the value buffer is too small to hold the attribute value.

CUptiResult cuptiDeviceGetEventDomainAttribute(CUdevice device, CUpti_EventDomainID eventDomain, CUpti_EventDomainAttribute attrib, size_t *valueSize, void *value)

Read an event domain attribute.

Returns an event domain attribute in *value. The size of the value buffer is given by *valueSize. The value returned in *valueSize contains the number of bytes returned in value.

If the attribute value is a c-string that is longer than *valueSize, then only the first *valueSize characters will be returned and there will be no terminating null byte.

Note

Thread-safety: this function is thread safe.

Parameters
  • device – The CUDA device

  • eventDomain – ID of the event domain

  • attrib – The event domain attribute to read

  • valueSize – The size of the value buffer in bytes, and returns the number of bytes written to value

  • value – Returns the attribute’s value

Return values
  • CUPTI_SUCCESS

  • CUPTI_ERROR_NOT_INITIALIZED

  • CUPTI_ERROR_INVALID_DEVICE

  • CUPTI_ERROR_INVALID_EVENT_DOMAIN_ID

  • CUPTI_ERROR_INVALID_PARAMETER – if valueSize or value is NULL, or if attrib is not an event domain attribute

  • CUPTI_ERROR_PARAMETER_SIZE_NOT_SUFFICIENT – For non-c-string attribute values, indicates that the value buffer is too small to hold the attribute value.
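CUPTI_EVENT_DOMAIN_ATTR_INSTANCE_COUNT and CUPTI_EVENT_DOMAIN_ATTR_TOTAL_INSTANCE_COUNT, both readable only through this function, are often combined to extrapolate a count collected on the profiled instances to all instances of the domain. The following is a minimal sketch of that arithmetic. It assumes activity is spread evenly across domain instances, which this section does not state, and demo_normalize_event_value is a hypothetical helper name.

```c
#include <assert.h>
#include <stdint.h>

/* Scale a value summed over the profiled instances up to all instances.
 * Assumption: every instance sees roughly the same activity.
 * The multiplication can wrap around for very large sums; a production
 * version would need to guard against that. */
uint64_t demo_normalize_event_value(uint64_t sum,
                                    uint32_t instanceCount,
                                    uint32_t totalInstanceCount)
{
    assert(instanceCount > 0);
    return (sum * totalInstanceCount) / instanceCount;
}
```

For example, a sum of 300 collected on 3 of 4 instances would be scaled to 400. When both counts are equal the value is returned unchanged.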

CUptiResult cuptiDeviceGetNumEventDomains(CUdevice device, uint32_t *numDomains)

Get the number of domains for a device.

Returns the number of domains in numDomains for a device.

Note

Thread-safety: this function is thread safe.

Parameters
  • device – The CUDA device

  • numDomains – Returns the number of domains

Return values
  • CUPTI_SUCCESS

  • CUPTI_ERROR_NOT_INITIALIZED

  • CUPTI_ERROR_INVALID_DEVICE

  • CUPTI_ERROR_INVALID_PARAMETER – if numDomains is NULL

CUptiResult cuptiDisableKernelReplayMode(CUcontext context)

Disable kernel replay mode.

Set profiling mode for the context to non-replay (default) mode. Event collection mode will be set to CUPTI_EVENT_COLLECTION_MODE_KERNEL. All previously enabled event groups and event group sets will be disabled.

Note

Thread-safety: this function is thread safe.

Parameters

context – The context

Return values

CUPTI_SUCCESS

CUptiResult cuptiEnableKernelReplayMode(CUcontext context)

Enable kernel replay mode.

Set profiling mode for the context to replay mode. In this mode, any number of events can be collected in one run of the kernel. The event collection mode will automatically switch to CUPTI_EVENT_COLLECTION_MODE_KERNEL. In this mode, cuptiSetEventCollectionMode will return CUPTI_ERROR_INVALID_OPERATION.

Note

Kernels might take longer to run if many events are enabled.

Note

Thread-safety: this function is thread safe.

Parameters

context – The context

Return values

CUPTI_SUCCESS

CUptiResult cuptiEnumEventDomains(size_t *arraySizeBytes, CUpti_EventDomainID *domainArray)

Get the event domains available on any device.

Returns all the event domains available on any CUDA-capable device. Event domain IDs are returned in domainArray. The size of the domainArray buffer is given by *arraySizeBytes. The size of the domainArray buffer must be at least numDomains * sizeof(CUpti_EventDomainID); otherwise not all domains will be returned. The value returned in *arraySizeBytes contains the number of bytes returned in domainArray.

Note

Thread-safety: this function is thread safe.

Parameters
  • arraySizeBytes – The size of domainArray in bytes, and returns the number of bytes written to domainArray

  • domainArray – Returns all the event domains

Return values
  • CUPTI_SUCCESS

  • CUPTI_ERROR_INVALID_PARAMETER – if arraySizeBytes or domainArray is NULL

CUptiResult cuptiEventDomainEnumEvents(CUpti_EventDomainID eventDomain, size_t *arraySizeBytes, CUpti_EventID *eventArray)

Get the events in a domain.

Returns the event IDs in eventArray for a domain. The size of the eventArray buffer is given by *arraySizeBytes. The size of the eventArray buffer must be at least numdomainevents * sizeof(CUpti_EventID), where numdomainevents is the number of events in the domain (see cuptiEventDomainGetNumEvents); otherwise not all events will be returned. The value returned in *arraySizeBytes contains the number of bytes returned in eventArray.

Note

Thread-safety: this function is thread safe.

Parameters
  • eventDomain – ID of the event domain

  • arraySizeBytes – The size of eventArray in bytes, and returns the number of bytes written to eventArray

  • eventArray – Returns the IDs of the events in the domain

Return values
  • CUPTI_SUCCESS

  • CUPTI_ERROR_NOT_INITIALIZED

  • CUPTI_ERROR_INVALID_EVENT_DOMAIN_ID

  • CUPTI_ERROR_INVALID_PARAMETER – if arraySizeBytes or eventArray is NULL

CUptiResult cuptiEventDomainGetAttribute(CUpti_EventDomainID eventDomain, CUpti_EventDomainAttribute attrib, size_t *valueSize, void *value)

Read an event domain attribute.

Returns an event domain attribute in *value. The size of the value buffer is given by *valueSize. The value returned in *valueSize contains the number of bytes returned in value.

If the attribute value is a c-string that is longer than *valueSize, then only the first *valueSize characters will be returned and there will be no terminating null byte.

Note

Thread-safety: this function is thread safe.

Parameters
  • eventDomain – ID of the event domain

  • attrib – The event domain attribute to read

  • valueSize – The size of the value buffer in bytes, and returns the number of bytes written to value

  • value – Returns the attribute’s value

Return values
  • CUPTI_SUCCESS

  • CUPTI_ERROR_NOT_INITIALIZED

  • CUPTI_ERROR_INVALID_EVENT_DOMAIN_ID

  • CUPTI_ERROR_INVALID_PARAMETER – if valueSize or value is NULL, or if attrib is not an event domain attribute

  • CUPTI_ERROR_PARAMETER_SIZE_NOT_SUFFICIENT – For non-c-string attribute values, indicates that the value buffer is too small to hold the attribute value.

CUptiResult cuptiEventDomainGetNumEvents(CUpti_EventDomainID eventDomain, uint32_t *numEvents)

Get number of events in a domain.

Returns the number of events in numEvents for a domain.

Note

Thread-safety: this function is thread safe.

Parameters
  • eventDomain – ID of the event domain

  • numEvents – Returns the number of events in the domain

Return values
  • CUPTI_SUCCESS

  • CUPTI_ERROR_NOT_INITIALIZED

  • CUPTI_ERROR_INVALID_EVENT_DOMAIN_ID

  • CUPTI_ERROR_INVALID_PARAMETER – if numEvents is NULL

CUptiResult cuptiEventGetAttribute(CUpti_EventID event, CUpti_EventAttribute attrib, size_t *valueSize, void *value)

Get an event attribute.

Returns an event attribute in *value. The size of the value buffer is given by *valueSize. The value returned in *valueSize contains the number of bytes returned in value.

If the attribute value is a c-string that is longer than *valueSize, then only the first *valueSize characters will be returned and there will be no terminating null byte.
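Because a truncated c-string is returned without a terminating null byte, the caller has to terminate the buffer before treating it as a string. The helper below is a minimal sketch of one way to do that. demo_terminate_attr_string is a hypothetical name, and the helper works only on the buffer and byte count; it does not call CUPTI.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Make an attribute string safe to use as a C string.
 *   buf      - the buffer that was passed as `value`
 *   capacity - total size of buf in bytes
 *   written  - byte count reported back through *valueSize
 * Returns 1 if the last byte had to be overwritten to terminate the
 * string (the text may be truncated), 0 otherwise. */
int demo_terminate_attr_string(char *buf, size_t capacity, size_t written)
{
    assert(buf != NULL && capacity > 0);
    if (written > capacity)
        written = capacity;

    /* A terminator already lies inside the written range. */
    if (memchr(buf, '\0', written) != NULL)
        return 0;

    /* Room is left after the written bytes: terminate there. */
    if (written < capacity) {
        buf[written] = '\0';
        return 0;
    }

    /* Buffer is completely full: sacrifice the last byte. */
    buf[capacity - 1] = '\0';
    return 1;
}
```

A simpler alternative is to zero the buffer and pass one byte less than its size in *valueSize, so that a terminator is always present after the call.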

Note

Thread-safety: this function is thread safe.

Parameters
  • event – ID of the event

  • attrib – The event attribute to read

  • valueSize – The size of the value buffer in bytes, and returns the number of bytes written to value

  • value – Returns the attribute’s value

Return values
  • CUPTI_SUCCESS

  • CUPTI_ERROR_NOT_INITIALIZED

  • CUPTI_ERROR_INVALID_EVENT_ID

  • CUPTI_ERROR_INVALID_PARAMETER – if valueSize or value is NULL, or if attrib is not an event attribute

  • CUPTI_ERROR_PARAMETER_SIZE_NOT_SUFFICIENT – For non-c-string attribute values, indicates that the value buffer is too small to hold the attribute value.

CUptiResult cuptiEventGetIdFromName(CUdevice device, const char *eventName, CUpti_EventID *event)

Find an event by name.

Find an event by name and return the event ID in *event.

Note

Thread-safety: this function is thread safe.

Parameters
  • device – The CUDA device

  • eventName – The name of the event to find

  • event – Returns the ID of the found event or undefined if unable to find the event

Return values
  • CUPTI_SUCCESS

  • CUPTI_ERROR_NOT_INITIALIZED

  • CUPTI_ERROR_INVALID_DEVICE

  • CUPTI_ERROR_INVALID_EVENT_NAME – if unable to find an event with name eventName. In this case *event is undefined

  • CUPTI_ERROR_INVALID_PARAMETER – if eventName or event is NULL

CUptiResult cuptiEventGroupAddEvent(CUpti_EventGroup eventGroup, CUpti_EventID event)

Add an event to an event group.

Add an event to an event group. Adding the event can fail for a number of reasons:

  • The event group is enabled

  • The event does not belong to the same event domain as the events that are already in the event group

  • Device limitations on the events that can belong to the same group

  • The event group is full

Note

Thread-safety: this function is thread safe.

Parameters
  • eventGroup – The event group

  • event – The event to add to the group

Return values
  • CUPTI_SUCCESS

  • CUPTI_ERROR_NOT_INITIALIZED

  • CUPTI_ERROR_INVALID_EVENT_ID

  • CUPTI_ERROR_OUT_OF_MEMORY

  • CUPTI_ERROR_INVALID_OPERATION – if eventGroup is enabled

  • CUPTI_ERROR_NOT_COMPATIBLE – if event belongs to a different event domain than the events already in eventGroup, or if a device limitation prevents event from being collected at the same time as the events already in eventGroup

  • CUPTI_ERROR_MAX_LIMIT_REACHED – if eventGroup is full

  • CUPTI_ERROR_INVALID_PARAMETER – if eventGroup is NULL

CUptiResult cuptiEventGroupCreate(CUcontext context, CUpti_EventGroup *eventGroup, uint32_t flags)

Create a new event group for a context.

Creates a new event group for context and returns the new group in *eventGroup.

Note

flags are reserved for future use and should be set to zero.

Note

Thread-safety: this function is thread safe.

Parameters
  • context – The context for the event group

  • eventGroup – Returns the new event group

  • flags – Reserved - must be zero

Return values
  • CUPTI_SUCCESS

  • CUPTI_ERROR_NOT_INITIALIZED

  • CUPTI_ERROR_INVALID_CONTEXT

  • CUPTI_ERROR_OUT_OF_MEMORY

  • CUPTI_ERROR_INVALID_PARAMETER – if eventGroup is NULL

CUptiResult cuptiEventGroupDestroy(CUpti_EventGroup eventGroup)

Destroy an event group.

Destroy an eventGroup and free its resources. An event group cannot be destroyed if it is enabled.

Note

Thread-safety: this function is thread safe.

Parameters

eventGroup – The event group to destroy

Return values
  • CUPTI_SUCCESS

  • CUPTI_ERROR_NOT_INITIALIZED

  • CUPTI_ERROR_INVALID_OPERATION – if the event group is enabled

  • CUPTI_ERROR_INVALID_PARAMETER – if eventGroup is NULL

CUptiResult cuptiEventGroupDisable(CUpti_EventGroup eventGroup)

Disable an event group.

Disable an event group. Disabling an event group stops collection of events contained in the group.

Note

Thread-safety: this function is thread safe.

Parameters

eventGroup – The event group

Return values
  • CUPTI_SUCCESS

  • CUPTI_ERROR_NOT_INITIALIZED

  • CUPTI_ERROR_HARDWARE

  • CUPTI_ERROR_INVALID_PARAMETER – if eventGroup is NULL

CUptiResult cuptiEventGroupEnable(CUpti_EventGroup eventGroup)

Enable an event group.

Enable an event group. Enabling an event group zeros the value of all the events in the group and then starts collection of those events.

Note

Thread-safety: this function is thread safe.

Parameters

eventGroup – The event group

Return values
  • CUPTI_SUCCESS

  • CUPTI_ERROR_NOT_INITIALIZED

  • CUPTI_ERROR_HARDWARE

  • CUPTI_ERROR_NOT_READY – if eventGroup does not contain any events

  • CUPTI_ERROR_NOT_COMPATIBLE – if eventGroup cannot be enabled due to other already enabled event groups

  • CUPTI_ERROR_INVALID_PARAMETER – if eventGroup is NULL

  • CUPTI_ERROR_HARDWARE_BUSY – if another client is profiling and hardware is busy
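Several functions in this section are documented as returning CUPTI_ERROR_INVALID_OPERATION depending on whether the event group is enabled. The sketch below collects those preconditions in one place. The enum and the function name are hypothetical and exist only for this illustration; nothing here calls CUPTI.

```c
#include <assert.h>

/* Hypothetical labels for the event group operations in this section. */
typedef enum {
    DEMO_OP_ADD_EVENT,
    DEMO_OP_REMOVE_EVENT,
    DEMO_OP_REMOVE_ALL_EVENTS,
    DEMO_OP_DESTROY,
    DEMO_OP_READ_EVENT,
    DEMO_OP_READ_ALL_EVENTS
} DemoGroupOp;

/* Returns 1 if the enabled/disabled precondition of `op` holds, 0 if the
 * call is documented to fail with CUPTI_ERROR_INVALID_OPERATION. */
int demo_op_allowed(DemoGroupOp op, int groupEnabled)
{
    switch (op) {
    case DEMO_OP_ADD_EVENT:
    case DEMO_OP_REMOVE_EVENT:
    case DEMO_OP_REMOVE_ALL_EVENTS:
    case DEMO_OP_DESTROY:
        return groupEnabled == 0;   /* group must be disabled */
    case DEMO_OP_READ_EVENT:
    case DEMO_OP_READ_ALL_EVENTS:
        return groupEnabled != 0;   /* group must be enabled */
    }
    assert(!"unknown operation");
    return 0;
}
```

Read this way, a typical sequence is: create the group, add events, enable it, read values, disable it, and only then remove events or destroy the group.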

CUptiResult cuptiEventGroupGetAttribute(CUpti_EventGroup eventGroup, CUpti_EventGroupAttribute attrib, size_t *valueSize, void *value)

Read an event group attribute.

Read an event group attribute and return it in *value.

Note

Thread-safety: this function is thread safe but client must guard against simultaneous destruction or modification of eventGroup (for example, client must guard against simultaneous calls to cuptiEventGroupDestroy, cuptiEventGroupAddEvent, etc.), and must guard against simultaneous destruction of the context in which eventGroup was created (for example, client must guard against simultaneous calls to cudaDeviceReset, cuCtxDestroy, etc.).

Parameters
  • eventGroup – The event group

  • attrib – The attribute to read

  • valueSize – The size of the buffer pointed to by value, and returns the number of bytes written to value

  • value – Returns the value of the attribute

Return values
  • CUPTI_SUCCESS

  • CUPTI_ERROR_NOT_INITIALIZED

  • CUPTI_ERROR_INVALID_PARAMETER – if valueSize or value is NULL, or if attrib is not an eventgroup attribute

  • CUPTI_ERROR_PARAMETER_SIZE_NOT_SUFFICIENT – For non-c-string attribute values, indicates that the value buffer is too small to hold the attribute value.

CUptiResult cuptiEventGroupReadAllEvents(CUpti_EventGroup eventGroup, CUpti_ReadEventFlags flags, size_t *eventValueBufferSizeBytes, uint64_t *eventValueBuffer, size_t *eventIdArraySizeBytes, CUpti_EventID *eventIdArray, size_t *numEventIdsRead)

Read the values for all the events in an event group.

Read the values for all the events in an event group. The event values are returned in the eventValueBuffer buffer. eventValueBufferSizeBytes indicates the size of eventValueBuffer. The buffer must be at least (sizeof(uint64_t) * number of events in group) if CUPTI_EVENT_GROUP_ATTR_PROFILE_ALL_DOMAIN_INSTANCES is not set on the group containing the events. The buffer must be at least (sizeof(uint64_t) * number of domain instances * number of events in group) if CUPTI_EVENT_GROUP_ATTR_PROFILE_ALL_DOMAIN_INSTANCES is set on the group.

The data format returned in eventValueBuffer is:

  • domain instance 0: event0 event1 … eventN

  • domain instance 1: event0 event1 … eventN

  • domain instance M: event0 event1 … eventN

The event order in eventValueBuffer is returned in eventIdArray. The size of eventIdArray is specified in eventIdArraySizeBytes. The size should be at least (sizeof(CUpti_EventID) * number of events in group).

If any instance of any event counter overflows, the value returned for that event instance will be CUPTI_EVENT_OVERFLOW.

The only allowed value for flags is CUPTI_EVENT_READ_FLAG_NONE.

Reading events from a disabled event group is not allowed. After being read, an event’s value is reset to zero.
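The buffer layout described above is instance-major: the value for event j of domain instance i sits at index i * numEvents + j. The following sketch shows the size computation and a per-event sum over instances. It is an illustration only: DEMO_EVENT_OVERFLOW is a local stand-in for CUPTI_EVENT_OVERFLOW (the real value comes from cupti_events.h), and both function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local stand-in for CUPTI_EVENT_OVERFLOW; assumed value, use the real
 * macro from cupti_events.h in actual code. */
#define DEMO_EVENT_OVERFLOW UINT64_MAX

/* Bytes needed for eventValueBuffer. Pass numInstances = 1 when
 * CUPTI_EVENT_GROUP_ATTR_PROFILE_ALL_DOMAIN_INSTANCES is not set. */
size_t demo_value_buffer_bytes(size_t numInstances, size_t numEvents)
{
    return sizeof(uint64_t) * numInstances * numEvents;
}

/* Sum one event over all domain instances of an instance-major buffer.
 * Returns 0 on success, -1 if any instance reported an overflow. */
int demo_sum_event(const uint64_t *values, size_t numInstances,
                   size_t numEvents, size_t eventIndex, uint64_t *sum)
{
    uint64_t total = 0;
    assert(values != NULL && sum != NULL && eventIndex < numEvents);
    for (size_t i = 0; i < numInstances; ++i) {
        uint64_t v = values[i * numEvents + eventIndex];
        if (v == DEMO_EVENT_OVERFLOW)
            return -1;
        total += v;
    }
    *sum = total;
    return 0;
}
```

The eventIndex used here is the position of the event in eventIdArray, which gives the order of events within each instance row. The number of domain instances counted for a group can be queried with CUPTI_EVENT_GROUP_ATTR_INSTANCE_COUNT.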

Note

Thread-safety: this function is thread safe but client must guard against simultaneous destruction or modification of eventGroup (for example, client must guard against simultaneous calls to cuptiEventGroupDestroy, cuptiEventGroupAddEvent, etc.), and must guard against simultaneous destruction of the context in which eventGroup was created (for example, client must guard against simultaneous calls to cudaDeviceReset, cuCtxDestroy, etc.). If cuptiEventGroupResetAllEvents is called simultaneously with this function, then returned event values are undefined.

Parameters
  • eventGroup – The event group

  • flags – Flags controlling the reading mode

  • eventValueBufferSizeBytes – The size of eventValueBuffer in bytes, and returns the number of bytes written to eventValueBuffer

  • eventValueBuffer – Returns the event values

  • eventIdArraySizeBytes – The size of eventIdArray in bytes, and returns the number of bytes written to eventIdArray

  • eventIdArray – Returns the IDs of the events in the same order as the values returned in eventValueBuffer.

  • numEventIdsRead – Returns the number of event IDs returned in eventIdArray

Return values
  • CUPTI_SUCCESS

  • CUPTI_ERROR_NOT_INITIALIZED

  • CUPTI_ERROR_HARDWARE

  • CUPTI_ERROR_INVALID_OPERATION – if eventGroup is disabled

  • CUPTI_ERROR_INVALID_PARAMETER – if eventGroup, eventValueBufferSizeBytes, eventValueBuffer, eventIdArraySizeBytes, eventIdArray or numEventIdsRead is NULL

  • CUPTI_ERROR_PARAMETER_SIZE_NOT_SUFFICIENT – if size of eventValueBuffer or eventIdArray is not sufficient

CUptiResult cuptiEventGroupReadEvent(CUpti_EventGroup eventGroup, CUpti_ReadEventFlags flags, CUpti_EventID event, size_t *eventValueBufferSizeBytes, uint64_t *eventValueBuffer)

Read the value for an event in an event group.

Read the value for an event in an event group. The event value is returned in the eventValueBuffer buffer. eventValueBufferSizeBytes indicates the size of the eventValueBuffer buffer. The buffer must be at least sizeof(uint64_t) if CUPTI_EVENT_GROUP_ATTR_PROFILE_ALL_DOMAIN_INSTANCES is not set on the group containing the event. The buffer must be at least (sizeof(uint64_t) * number of domain instances) if CUPTI_EVENT_GROUP_ATTR_PROFILE_ALL_DOMAIN_INSTANCES is set on the group.

If any instance of an event counter overflows, the value returned for that event instance will be CUPTI_EVENT_OVERFLOW.

The only allowed value for flags is CUPTI_EVENT_READ_FLAG_NONE.

Reading an event from a disabled event group is not allowed. After being read, an event’s value is reset to zero.

Note

Thread-safety: this function is thread safe but client must guard against simultaneous destruction or modification of eventGroup (for example, client must guard against simultaneous calls to cuptiEventGroupDestroy, cuptiEventGroupAddEvent, etc.), and must guard against simultaneous destruction of the context in which eventGroup was created (for example, client must guard against simultaneous calls to cudaDeviceReset, cuCtxDestroy, etc.). If cuptiEventGroupResetAllEvents is called simultaneously with this function, then returned event values are undefined.

Parameters
  • eventGroup – The event group

  • flags – Flags controlling the reading mode

  • event – The event to read

  • eventValueBufferSizeBytes – The size of eventValueBuffer in bytes, and returns the number of bytes written to eventValueBuffer

  • eventValueBuffer – Returns the event value(s)

Return values
  • CUPTI_SUCCESS

  • CUPTI_ERROR_NOT_INITIALIZED

  • CUPTI_ERROR_INVALID_EVENT_ID

  • CUPTI_ERROR_HARDWARE

  • CUPTI_ERROR_INVALID_OPERATION – if eventGroup is disabled

  • CUPTI_ERROR_INVALID_PARAMETER – if eventGroup, eventValueBufferSizeBytes or eventValueBuffer is NULL

  • CUPTI_ERROR_PARAMETER_SIZE_NOT_SUFFICIENT – if size of eventValueBuffer is not sufficient

CUptiResult cuptiEventGroupRemoveAllEvents(CUpti_EventGroup eventGroup)

Remove all events from an event group.

Remove all events from an event group. Events cannot be removed if the event group is enabled.

Note

Thread-safety: this function is thread safe.

Parameters

eventGroup – The event group

Return values
  • CUPTI_SUCCESS

  • CUPTI_ERROR_NOT_INITIALIZED

  • CUPTI_ERROR_INVALID_OPERATION – if eventGroup is enabled

  • CUPTI_ERROR_INVALID_PARAMETER – if eventGroup is NULL

CUptiResult cuptiEventGroupRemoveEvent(CUpti_EventGroup eventGroup, CUpti_EventID event)

Remove an event from an event group.

Remove an event from an event group. The event cannot be removed if the event group is enabled.

Note

Thread-safety: this function is thread safe.

Parameters
  • eventGroup – The event group

  • event – The event to remove from the group

Return values
  • CUPTI_SUCCESS

  • CUPTI_ERROR_NOT_INITIALIZED

  • CUPTI_ERROR_INVALID_EVENT_ID

  • CUPTI_ERROR_INVALID_OPERATION – if eventGroup is enabled

  • CUPTI_ERROR_INVALID_PARAMETER – if eventGroup is NULL

CUptiResult cuptiEventGroupResetAllEvents(CUpti_EventGroup eventGroup)

Zero all the event counts in an event group.

Note

Thread-safety: this function is thread safe but client must guard against simultaneous destruction or modification of eventGroup (for example, client must guard against simultaneous calls to cuptiEventGroupDestroy, cuptiEventGroupAddEvent, etc.), and must guard against simultaneous destruction of the context in which eventGroup was created (for example, client must guard against simultaneous calls to cudaDeviceReset, cuCtxDestroy, etc.).

Parameters
  • eventGroup – The event group

Return values
  • CUPTI_SUCCESS

  • CUPTI_ERROR_NOT_INITIALIZED

  • CUPTI_ERROR_HARDWARE

  • CUPTI_ERROR_INVALID_PARAMETER – if eventGroup is NULL

CUptiResult cuptiEventGroupSetAttribute(CUpti_EventGroup eventGroup, CUpti_EventGroupAttribute attrib, size_t valueSize, void *value)

Write an event group attribute.

Note

Thread-safety: this function is thread safe.

Parameters
  • eventGroup – The event group

  • attrib – The attribute to write

  • valueSize – The size, in bytes, of the value

  • value – The attribute value to write

Return values
  • CUPTI_SUCCESS

  • CUPTI_ERROR_NOT_INITIALIZED

  • CUPTI_ERROR_INVALID_PARAMETER – if valueSize or value is NULL, or if attrib is not an event group attribute, or if attrib is not a writable attribute

  • CUPTI_ERROR_PARAMETER_SIZE_NOT_SUFFICIENT – if the value buffer is too small to hold the attribute value

CUptiResult cuptiEventGroupSetDisable(CUpti_EventGroupSet *eventGroupSet)

Disable an event group set.

Disable a set of event groups. Disabling a set of event groups stops collection of events contained in the groups.

Note

Thread-safety: this function is thread safe.

Note

If this call fails, some of the event groups in the set may be disabled and other event groups may remain enabled.

Parameters
  • eventGroupSet – The pointer to the event group set

Return values
  • CUPTI_SUCCESS

  • CUPTI_ERROR_NOT_INITIALIZED

  • CUPTI_ERROR_HARDWARE

  • CUPTI_ERROR_INVALID_PARAMETER – if eventGroupSet is NULL

CUptiResult cuptiEventGroupSetEnable(CUpti_EventGroupSet *eventGroupSet)

Enable an event group set.

Enable a set of event groups. Enabling a set of event groups zeros the value of all the events in all the groups and then starts collection of those events.

Note

Thread-safety: this function is thread safe.

Parameters
  • eventGroupSet – The pointer to the event group set

Return values
  • CUPTI_SUCCESS

  • CUPTI_ERROR_NOT_INITIALIZED

  • CUPTI_ERROR_HARDWARE

  • CUPTI_ERROR_NOT_READY – if an event group in the set does not contain any events

  • CUPTI_ERROR_NOT_COMPATIBLE – if an event group in the set cannot be enabled because of other already enabled event groups

  • CUPTI_ERROR_INVALID_PARAMETER – if eventGroupSet is NULL

  • CUPTI_ERROR_HARDWARE_BUSY – if another client is profiling and the hardware is busy

CUptiResult cuptiEventGroupSetsCreate(CUcontext context, size_t eventIdArraySizeBytes, CUpti_EventID *eventIdArray, CUpti_EventGroupSets **eventGroupPasses)

For a set of events, get the grouping that indicates the number of passes and the event groups necessary to collect the events.

The number of events that can be collected simultaneously varies by device and by the type of the events. When events can be collected simultaneously, they may need to be grouped into multiple event groups because they are from different event domains. This function takes a set of events and determines how many passes are required to collect all those events, and which events can be collected simultaneously in each pass.

The numSets field of the CUpti_EventGroupSets object returned in eventGroupPasses gives the number of passes required to collect the events. Each entry in its sets array is an event group set that lists the event groups to collect on that pass.

Note

Thread-safety: this function is thread safe, but client must guard against another thread simultaneously destroying context.

Parameters
  • context – The context for event collection

  • eventIdArraySizeBytes – Size of eventIdArray in bytes

  • eventIdArray – Array of event IDs that need to be grouped

  • eventGroupPasses – Returns a CUpti_EventGroupSets object that indicates the number of passes required to collect the events and the events to collect on each pass

Return values
  • CUPTI_SUCCESS

  • CUPTI_ERROR_NOT_INITIALIZED

  • CUPTI_ERROR_INVALID_CONTEXT

  • CUPTI_ERROR_INVALID_EVENT_ID

  • CUPTI_ERROR_INVALID_PARAMETER – if eventIdArray or eventGroupPasses is NULL
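The pass structure described above can be walked with a simple loop. The sketch below is illustrative only: it mirrors the CUpti_EventGroupSet and CUpti_EventGroupSets layouts locally so that it builds without the toolkit (real code would include cupti_events.h instead), and it makes no CUPTI calls. PassHook, runPasses, and countGroups are hypothetical names introduced for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local mirrors of the CUPTI declarations, for illustration only.
 * Real code would include <cupti_events.h> and drop these. */
typedef void *CUpti_EventGroup;
typedef struct {
    uint32_t numEventGroups;
    CUpti_EventGroup *eventGroups;
} CUpti_EventGroupSet;
typedef struct {
    uint32_t numSets;
    CUpti_EventGroupSet *sets;
} CUpti_EventGroupSets;

/* Hypothetical per-pass hook; returns 0 on success. */
typedef int (*PassHook)(uint32_t pass, CUpti_EventGroupSet *set,
                        void *userData);

/* Call hook once per pass, in order, stopping at the first failure.
 * Returns the number of passes that completed successfully. */
uint32_t runPasses(CUpti_EventGroupSets *passes, PassHook hook,
                   void *userData)
{
    assert(passes != NULL && hook != NULL);
    uint32_t done = 0;
    for (uint32_t i = 0; i < passes->numSets; ++i) {
        if (hook(i, &passes->sets[i], userData) != 0) {
            break;
        }
        ++done;
    }
    return done;
}

/* Example hook: adds the number of event groups in this pass to the
 * uint32_t counter that userData points to. */
int countGroups(uint32_t pass, CUpti_EventGroupSet *set, void *userData)
{
    (void)pass;
    *(uint32_t *)userData += set->numEventGroups;
    return 0;
}
```

In a real collection loop the hook would typically enable the set with cuptiEventGroupSetEnable, run the workload, read each event from the set's event groups with cuptiEventGroupReadEvent, and then call cuptiEventGroupSetDisable. The workload has to be repeatable, since it runs once per pass.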

CUptiResult cuptiEventGroupSetsDestroy(CUpti_EventGroupSets *eventGroupSets)

Destroy an event group sets object.

Destroy a CUpti_EventGroupSets object.

Note

Thread-safety: this function is thread safe.

Parameters
  • eventGroupSets – The object to destroy

Return values
  • CUPTI_SUCCESS

  • CUPTI_ERROR_NOT_INITIALIZED

  • CUPTI_ERROR_INVALID_OPERATION – if any of the event groups contained in the sets is enabled

  • CUPTI_ERROR_INVALID_PARAMETER – if eventGroupSets is NULL

CUptiResult cuptiGetNumEventDomains(uint32_t *numDomains)

Get the number of event domains available on any device.

Returns the total number of event domains available on any CUDA-capable device.

Note

Thread-safety: this function is thread safe.

Parameters
  • numDomains – Returns the number of domains

Return values
  • CUPTI_SUCCESS

  • CUPTI_ERROR_INVALID_PARAMETER – if numDomains is NULL

CUptiResult cuptiKernelReplaySubscribeUpdate(CUpti_KernelReplayUpdateFunc updateFunc, void *customData)

Subscribe to kernel replay updates.

When subscribed, the function pointer passed in is called each time a kernel run finishes during kernel replay. Any previously subscribed function pointer is replaced. Passing NULL as the function pointer unsubscribes from updates.

Parameters
  • updateFunc – The update function pointer

  • customData – Pointer to any custom data

Return values
  • CUPTI_SUCCESS

CUptiResult cuptiSetEventCollectionMode(CUcontext context, CUpti_EventCollectionMode mode)

Set the event collection mode.

Set the event collection mode for a context. The mode controls the event collection behavior of all events in event groups created in the context. This API is invalid in kernel replay mode.

Note

Thread-safety: this function is thread safe.

Parameters
  • context – The context

  • mode – The event collection mode

Return values
  • CUPTI_SUCCESS

  • CUPTI_ERROR_NOT_INITIALIZED

  • CUPTI_ERROR_INVALID_CONTEXT

  • CUPTI_ERROR_INVALID_OPERATION – if called when replay mode is enabled

  • CUPTI_ERROR_NOT_SUPPORTED – if mode is not supported on the device

5.4.4. Typedefs

typedef uint32_t CUpti_EventDomainID

ID for an event domain.

ID for an event domain. An event domain represents a group of related events. A device may have multiple instances of a domain, indicating that the device can simultaneously record multiple instances of each event within that domain.
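When an event is collected on only some instances of a domain, the summed value is often scaled to estimate the count for the whole device. The sketch below shows that arithmetic. The scaling rule (sum multiplied by the total instance count, divided by the profiled instance count) is an assumption taken from general CUPTI usage and is not stated in this section. normalizeEventValue is a hypothetical helper name, and the code makes no CUPTI calls.

```c
#include <assert.h>
#include <stdint.h>

/* Scale a summed event value from the profiled domain instances to
 * all instances on the device.
 *   sum:               sum of the per-instance values that were read
 *   numInstances:      number of domain instances that were profiled
 *   numTotalInstances: number of domain instances on the device
 * Assumption: counts scale linearly with the number of instances.
 * Returns 0 if numInstances is 0. The multiplication is not checked
 * for 64-bit overflow. */
uint64_t normalizeEventValue(uint64_t sum, uint32_t numInstances,
                             uint32_t numTotalInstances)
{
    assert(numInstances <= numTotalInstances);
    if (numInstances == 0) {
        return 0;
    }
    return (sum * numTotalInstances) / numInstances;
}
```

For example, a sum of 240 collected on 2 of 8 instances scales to 960.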

typedef void *CUpti_EventGroup

A group of events.

An event group is a collection of events that are managed together. All events in an event group must belong to the same domain.

typedef uint32_t CUpti_EventID

ID for an event.

An event represents a countable activity, action, or occurrence on the device.

typedef void (*CUpti_KernelReplayUpdateFunc)(const char *kernelName, int numReplaysDone, void *customData)

Function type for getting updates on kernel replay.

Parameters
  • kernelName – The mangled kernel name

  • numReplaysDone – Number of replays done so far

  • customData – Pointer to any custom data passed in when subscribing
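A callback of this type might record replay progress in a client-owned structure passed through customData. The sketch below is illustrative: the typedef is repeated locally so that the example builds without the toolkit (real code would include cupti_events.h), and ReplayProgress and onReplayUpdate are hypothetical names.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Repeated from the typedef above; real code gets it from the header. */
typedef void (*CUpti_KernelReplayUpdateFunc)(const char *kernelName,
                                             int numReplaysDone,
                                             void *customData);

/* Hypothetical client-side progress record. */
typedef struct {
    int calls;            /* number of times the callback has run */
    int lastReplay;       /* most recent numReplaysDone value */
    char lastKernel[128]; /* most recent kernel name, truncated to fit */
} ReplayProgress;

/* Callback matching CUpti_KernelReplayUpdateFunc. Does nothing if no
 * progress record was supplied at subscription time. */
void onReplayUpdate(const char *kernelName, int numReplaysDone,
                    void *customData)
{
    ReplayProgress *progress = (ReplayProgress *)customData;
    if (progress == NULL) {
        return;
    }
    progress->calls++;
    progress->lastReplay = numReplaysDone;
    snprintf(progress->lastKernel, sizeof progress->lastKernel, "%s",
             kernelName != NULL ? kernelName : "");
}
```

In real code the callback would be registered with cuptiKernelReplaySubscribeUpdate(onReplayUpdate, &progress), where progress is a ReplayProgress that outlives the subscription, and removed by passing NULL as updateFunc.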