5.5. CUPTI Metric API

Functions, types, and enums that implement the CUPTI Metric API.

Note

The CUPTI Metric API from the header cupti_metrics.h is not supported on devices with compute capability 7.5 and higher (i.e. Turing and later GPU architectures). This API will be deprecated in a future CUDA release. It is replaced by the Profiling API in the header cupti_profiler_target.h and the Perfworks metrics API in the headers nvperf_host.h and nvperf_target.h, which are supported on devices with compute capability 7.0 and higher (i.e. Volta and later GPU architectures).

Data Structures

Enumerations

Functions

Typedefs

5.5.1. Enumerations

enum CUpti_MetricAttribute

Metric attributes.

Metric attributes describe properties of a metric. These attributes can be read using cuptiMetricGetAttribute.

Values:

enumerator CUPTI_METRIC_ATTR_NAME

Metric name.

Value is a null terminated const c-string.

enumerator CUPTI_METRIC_ATTR_SHORT_DESCRIPTION

Short description of metric.

Value is a null terminated const c-string.

enumerator CUPTI_METRIC_ATTR_LONG_DESCRIPTION

Long description of metric.

Value is a null terminated const c-string.

enumerator CUPTI_METRIC_ATTR_CATEGORY

Category of the metric.

Value is of type CUpti_MetricCategory.

enumerator CUPTI_METRIC_ATTR_VALUE_KIND

Value type of the metric.

Value is of type CUpti_MetricValueKind.

enumerator CUPTI_METRIC_ATTR_EVALUATION_MODE

Metric evaluation mode.

Value is of type CUpti_MetricEvaluationMode.

enumerator CUPTI_METRIC_ATTR_FORCE_INT

enum CUpti_MetricCategory

A metric category.

Each metric is assigned to a category that represents the general type of the metric. A metric’s category is accessed using cuptiMetricGetAttribute and the CUPTI_METRIC_ATTR_CATEGORY attribute.

Values:

enumerator CUPTI_METRIC_CATEGORY_MEMORY

A memory related metric.

enumerator CUPTI_METRIC_CATEGORY_INSTRUCTION

An instruction related metric.

enumerator CUPTI_METRIC_CATEGORY_MULTIPROCESSOR

A multiprocessor related metric.

enumerator CUPTI_METRIC_CATEGORY_CACHE

A cache related metric.

enumerator CUPTI_METRIC_CATEGORY_TEXTURE

A texture related metric.

enumerator CUPTI_METRIC_CATEGORY_NVLINK

An NVLink related metric.

enumerator CUPTI_METRIC_CATEGORY_PCIE

A PCIe related metric.

enumerator CUPTI_METRIC_CATEGORY_FORCE_INT

enum CUpti_MetricEvaluationMode

A metric evaluation mode.

A metric can be evaluated per hardware instance, to show the load balancing across instances of a domain, or in aggregate mode, when the events involved in the metric evaluation come from different event domains. Some metrics may support both modes for convenience. A metric’s evaluation mode is accessed using cuptiMetricGetAttribute and the CUPTI_METRIC_ATTR_EVALUATION_MODE attribute.

Values:

enumerator CUPTI_METRIC_EVALUATION_MODE_PER_INSTANCE

If this bit is set, the metric can be profiled for each instance of the domain.

The event values passed to cuptiMetricGetValue can contain values for one instance of the domain, and cuptiMetricGetValue can be called once for each instance.

enumerator CUPTI_METRIC_EVALUATION_MODE_AGGREGATE

If this bit is set, the metric can be profiled over all instances.

The event values passed to cuptiMetricGetValue can be aggregated values of events for all instances of the domain.

enumerator CUPTI_METRIC_EVALUATION_MODE_FORCE_INT

enum CUpti_MetricPropertyDeviceClass

Device class.

Enumeration of device classes for metric property CUPTI_METRIC_PROPERTY_DEVICE_CLASS.

Values:

enumerator CUPTI_METRIC_PROPERTY_DEVICE_CLASS_TESLA
enumerator CUPTI_METRIC_PROPERTY_DEVICE_CLASS_QUADRO
enumerator CUPTI_METRIC_PROPERTY_DEVICE_CLASS_GEFORCE
enumerator CUPTI_METRIC_PROPERTY_DEVICE_CLASS_TEGRA

enum CUpti_MetricPropertyID

Metric device properties.

Metric device properties describe device properties which are needed for a metric. Some of these properties can be collected using cuDeviceGetAttribute.

Values:

enumerator CUPTI_METRIC_PROPERTY_MULTIPROCESSOR_COUNT
enumerator CUPTI_METRIC_PROPERTY_WARPS_PER_MULTIPROCESSOR
enumerator CUPTI_METRIC_PROPERTY_KERNEL_GPU_TIME
enumerator CUPTI_METRIC_PROPERTY_CLOCK_RATE
enumerator CUPTI_METRIC_PROPERTY_FRAME_BUFFER_COUNT
enumerator CUPTI_METRIC_PROPERTY_GLOBAL_MEMORY_BANDWIDTH
enumerator CUPTI_METRIC_PROPERTY_PCIE_GEN
enumerator CUPTI_METRIC_PROPERTY_DEVICE_CLASS
enumerator CUPTI_METRIC_PROPERTY_FLOP_SP_PER_CYCLE
enumerator CUPTI_METRIC_PROPERTY_FLOP_DP_PER_CYCLE
enumerator CUPTI_METRIC_PROPERTY_L2_UNITS
enumerator CUPTI_METRIC_PROPERTY_ECC_ENABLED
enumerator CUPTI_METRIC_PROPERTY_FLOP_HP_PER_CYCLE

enum CUpti_MetricValueKind

Kinds of metric values.

Metric values can be one of several different kinds. Corresponding to each kind is a member of the CUpti_MetricValue union. The metric value returned by cuptiMetricGetValue should be accessed using the appropriate member of that union based on its value kind.
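As an illustration of selecting the union member by value kind, the following C sketch formats a metric value as text. So that it compiles without the CUDA toolkit, it uses local stand-in types (ExampleValueKind and ExampleMetricValue, with made-up member names) that are only modeled on CUpti_MetricValueKind and CUpti_MetricValue. Real code would include the CUPTI headers and use the CUPTI definitions, and the member names should be checked against cupti_metrics.h.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Stand-in for CUpti_MetricValueKind (hypothetical names, illustration only). */
typedef enum {
    EX_KIND_DOUBLE,
    EX_KIND_UINT64,
    EX_KIND_PERCENT,
    EX_KIND_THROUGHPUT,
    EX_KIND_INT64,
    EX_KIND_UTILIZATION_LEVEL
} ExampleValueKind;

/* Stand-in for the CUpti_MetricValue union: one member per value kind. */
typedef union {
    double   asDouble;
    uint64_t asUint64;
    double   asPercent;      /* 57.5 means 57.5% */
    uint64_t asThroughput;   /* bytes/second */
    int64_t  asInt64;
    uint32_t asUtilization;  /* 0 (idle) .. 10 (max) */
} ExampleMetricValue;

/* Format `value` into `buf`, reading only the member that matches `kind`.
 * Returns the snprintf result, or -1 for an unknown kind. */
int example_format_metric(ExampleValueKind kind, ExampleMetricValue value,
                          char *buf, size_t bufSize)
{
    switch (kind) {
    case EX_KIND_DOUBLE:
        return snprintf(buf, bufSize, "%.2f", value.asDouble);
    case EX_KIND_UINT64:
        return snprintf(buf, bufSize, "%" PRIu64, value.asUint64);
    case EX_KIND_PERCENT:
        return snprintf(buf, bufSize, "%.1f%%", value.asPercent);
    case EX_KIND_THROUGHPUT:
        return snprintf(buf, bufSize, "%" PRIu64 " B/s", value.asThroughput);
    case EX_KIND_INT64:
        return snprintf(buf, bufSize, "%" PRId64, value.asInt64);
    case EX_KIND_UTILIZATION_LEVEL:
        return snprintf(buf, bufSize, "%u/10", (unsigned)value.asUtilization);
    }
    return -1;
}
```

The kind would normally be read once per metric through cuptiMetricGetAttribute with CUPTI_METRIC_ATTR_VALUE_KIND and then reused for every value computed for that metric.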

Values:

enumerator CUPTI_METRIC_VALUE_KIND_DOUBLE

The metric value is a 64-bit double.

enumerator CUPTI_METRIC_VALUE_KIND_UINT64

The metric value is a 64-bit unsigned integer.

enumerator CUPTI_METRIC_VALUE_KIND_PERCENT

The metric value is a percentage represented by a 64-bit double.

For example, 57.5% is represented by the value 57.5.

enumerator CUPTI_METRIC_VALUE_KIND_THROUGHPUT

The metric value is a throughput represented by a 64-bit integer.

The unit for throughput values is bytes/second.

enumerator CUPTI_METRIC_VALUE_KIND_INT64

The metric value is a 64-bit signed integer.

enumerator CUPTI_METRIC_VALUE_KIND_UTILIZATION_LEVEL

The metric value is a utilization level, as represented by CUpti_MetricValueUtilizationLevel.

enumerator CUPTI_METRIC_VALUE_KIND_FORCE_INT

enum CUpti_MetricValueUtilizationLevel

Enumeration of utilization levels for metric values of kind CUPTI_METRIC_VALUE_KIND_UTILIZATION_LEVEL.

Utilization values can vary from IDLE (0) to MAX (10) but the enumeration only provides specific names for a few values.

Values:

enumerator CUPTI_METRIC_VALUE_UTILIZATION_IDLE
enumerator CUPTI_METRIC_VALUE_UTILIZATION_LOW
enumerator CUPTI_METRIC_VALUE_UTILIZATION_MID
enumerator CUPTI_METRIC_VALUE_UTILIZATION_HIGH
enumerator CUPTI_METRIC_VALUE_UTILIZATION_MAX
enumerator CUPTI_METRIC_VALUE_UTILIZATION_FORCE_INT
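Because only a few levels have names, code that reports a utilization level has to cope with unnamed intermediate values as well. The C sketch below is one way to do that. It relies only on the bounds stated above (IDLE is 0, MAX is 10); the macro and function names are hypothetical and not part of CUPTI.

```c
#include <stdio.h>

/* Bounds taken from the description above: IDLE is 0 and MAX is 10. */
#define EX_UTILIZATION_IDLE 0u
#define EX_UTILIZATION_MAX  10u

/* Write a short description of a utilization level into buf.
 * Returns 0 on success, -1 if the level is outside 0..10. */
int example_describe_utilization(unsigned level, char *buf, size_t bufSize)
{
    if (level > EX_UTILIZATION_MAX)
        return -1;

    if (level == EX_UTILIZATION_IDLE)
        snprintf(buf, bufSize, "idle (0/10)");
    else if (level == EX_UTILIZATION_MAX)
        snprintf(buf, bufSize, "max (10/10)");
    else
        snprintf(buf, bufSize, "%u/10", level);  /* unnamed intermediate level */
    return 0;
}
```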

5.5.2. Functions

CUptiResult cuptiDeviceEnumMetrics(CUdevice device, size_t *arraySizeBytes, CUpti_MetricID *metricArray)

Get the metrics for a device.

Returns the metric IDs in metricArray for a device. The size of the metricArray buffer is given by *arraySizeBytes. The buffer must be at least numMetrics * sizeof(CUpti_MetricID) bytes, where numMetrics is the count reported by cuptiDeviceGetNumMetrics; otherwise not all metric IDs will be returned. The value returned in *arraySizeBytes contains the number of bytes returned in metricArray.

Parameters
  • device – The CUDA device

  • arraySizeBytes – The size of metricArray in bytes, and returns the number of bytes written to metricArray

  • metricArray – Returns the IDs of the metrics for the device

Return values
  • CUPTI_SUCCESS

  • CUPTI_ERROR_NOT_INITIALIZED

  • CUPTI_ERROR_INVALID_DEVICE

  • CUPTI_ERROR_INVALID_PARAMETER – if arraySizeBytes or metricArray is NULL

CUptiResult cuptiDeviceGetNumMetrics(CUdevice device, uint32_t *numMetrics)

Get the number of metrics for a device.

Returns the number of metrics available for a device.

Parameters
  • device – The CUDA device

  • numMetrics – Returns the number of metrics available for the device

Return values
  • CUPTI_SUCCESS

  • CUPTI_ERROR_NOT_INITIALIZED

  • CUPTI_ERROR_INVALID_DEVICE

  • CUPTI_ERROR_INVALID_PARAMETER – if numMetrics is NULL
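The two device queries above are typically combined: ask for the count, size the buffer from it, then enumerate. The C sketch below shows that pattern under stated assumptions. So that it runs without a GPU, fake_get_num_metrics and fake_enum_metrics are hypothetical stand-ins that only mimic the buffer contract described for cuptiDeviceGetNumMetrics and cuptiDeviceEnumMetrics, and the metric IDs are invented.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef uint32_t ExampleMetricID;  /* CUpti_MetricID is a uint32_t */

/* Invented metric IDs served by the stand-in functions below. */
static const ExampleMetricID kFakeMetrics[] = { 101, 102, 103 };

/* Stand-in for cuptiDeviceGetNumMetrics (hypothetical, no device argument). */
static int fake_get_num_metrics(uint32_t *numMetrics)
{
    if (numMetrics == NULL) return -1;
    *numMetrics = (uint32_t)(sizeof kFakeMetrics / sizeof kFakeMetrics[0]);
    return 0;
}

/* Stand-in for cuptiDeviceEnumMetrics: *arraySizeBytes is the buffer size
 * on input and the number of bytes written on output. */
static int fake_enum_metrics(size_t *arraySizeBytes, ExampleMetricID *metricArray)
{
    if (arraySizeBytes == NULL || metricArray == NULL) return -1;
    size_t n = *arraySizeBytes / sizeof(ExampleMetricID);
    size_t avail = sizeof kFakeMetrics / sizeof kFakeMetrics[0];
    if (n > avail) n = avail;
    memcpy(metricArray, kFakeMetrics, n * sizeof(ExampleMetricID));
    *arraySizeBytes = n * sizeof(ExampleMetricID);
    return 0;
}

/* Query the count, allocate numMetrics * sizeof(ID) bytes, then enumerate.
 * On success returns a malloc'd array (caller frees) and stores the number
 * of IDs actually returned in *count. Returns NULL on failure. */
ExampleMetricID *example_list_metrics(uint32_t *count)
{
    uint32_t numMetrics = 0;
    if (count == NULL || fake_get_num_metrics(&numMetrics) != 0 || numMetrics == 0)
        return NULL;

    size_t bytes = (size_t)numMetrics * sizeof(ExampleMetricID);
    ExampleMetricID *ids = malloc(bytes);
    if (ids == NULL)
        return NULL;

    if (fake_enum_metrics(&bytes, ids) != 0) {
        free(ids);
        return NULL;
    }
    /* Convert the bytes written back into a number of IDs. */
    *count = (uint32_t)(bytes / sizeof(ExampleMetricID));
    return ids;
}
```

With CUPTI available, the stand-ins would be replaced by cuptiDeviceGetNumMetrics(device, &numMetrics) and cuptiDeviceEnumMetrics(device, &bytes, ids), with each result compared against CUPTI_SUCCESS.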

CUptiResult cuptiEnumMetrics(size_t *arraySizeBytes, CUpti_MetricID *metricArray)

Get all the metrics available on any device.

Returns the metric IDs in metricArray for all CUDA-capable devices. The size of the metricArray buffer is given by *arraySizeBytes. The buffer must be at least numMetrics * sizeof(CUpti_MetricID) bytes, where numMetrics is the count reported by cuptiGetNumMetrics; otherwise not all metric IDs will be returned. The value returned in *arraySizeBytes contains the number of bytes returned in metricArray.

Parameters
  • arraySizeBytes – The size of metricArray in bytes, and returns the number of bytes written to metricArray

  • metricArray – Returns the IDs of the metrics

Return values
  • CUPTI_SUCCESS

  • CUPTI_ERROR_INVALID_PARAMETER – if arraySizeBytes or metricArray is NULL

CUptiResult cuptiGetNumMetrics(uint32_t *numMetrics)

Get the total number of metrics available on any device.

Returns the total number of metrics available on any CUDA-capable device.

Parameters
  • numMetrics – Returns the number of metrics

Return values
  • CUPTI_SUCCESS

  • CUPTI_ERROR_INVALID_PARAMETER – if numMetrics is NULL

CUptiResult cuptiMetricCreateEventGroupSets(CUcontext context, size_t metricIdArraySizeBytes, CUpti_MetricID *metricIdArray, CUpti_EventGroupSets **eventGroupPasses)

For a set of metrics, get the grouping that indicates the number of passes and the event groups necessary to collect the events required for those metrics.

See also

cuptiEventGroupSetsCreate for details on event group set creation.

Parameters
  • context – The context for event collection

  • metricIdArraySizeBytes – Size of the metricIdArray in bytes

  • metricIdArray – Array of metric IDs

  • eventGroupPasses – Returns a CUpti_EventGroupSets object that indicates the number of passes required to collect the events and the events to collect on each pass

Return values
  • CUPTI_SUCCESS

  • CUPTI_ERROR_NOT_INITIALIZED

  • CUPTI_ERROR_INVALID_CONTEXT

  • CUPTI_ERROR_INVALID_METRIC_ID

  • CUPTI_ERROR_INVALID_PARAMETER – if metricIdArray or eventGroupPasses is NULL

CUptiResult cuptiMetricEnumEvents(CUpti_MetricID metric, size_t *eventIdArraySizeBytes, CUpti_EventID *eventIdArray)

Get the events required to calculate a metric.

Gets the event IDs in eventIdArray required to calculate a metric. The size of the eventIdArray buffer is given by *eventIdArraySizeBytes and must be at least numEvents * sizeof(CUpti_EventID), where numEvents is the count reported by cuptiMetricGetNumEvents; otherwise not all events will be returned. The value returned in *eventIdArraySizeBytes contains the number of bytes returned in eventIdArray.

Parameters
  • metric – ID of the metric

  • eventIdArraySizeBytes – The size of eventIdArray in bytes, and returns the number of bytes written to eventIdArray

  • eventIdArray – Returns the IDs of the events required to calculate metric

Return values
  • CUPTI_SUCCESS

  • CUPTI_ERROR_NOT_INITIALIZED

  • CUPTI_ERROR_INVALID_METRIC_ID

  • CUPTI_ERROR_INVALID_PARAMETER – if eventIdArraySizeBytes or eventIdArray is NULL

CUptiResult cuptiMetricEnumProperties(CUpti_MetricID metric, size_t *propIdArraySizeBytes, CUpti_MetricPropertyID *propIdArray)

Get the properties required to calculate a metric.

Gets the property IDs in propIdArray required to calculate a metric. The size of the propIdArray buffer is given by *propIdArraySizeBytes and must be at least numProp * sizeof(CUpti_MetricPropertyID), where numProp is the count reported by cuptiMetricGetNumProperties; otherwise not all properties will be returned. The value returned in *propIdArraySizeBytes contains the number of bytes returned in propIdArray.

Parameters
  • metric – ID of the metric

  • propIdArraySizeBytes – The size of propIdArray in bytes, and returns the number of bytes written to propIdArray

  • propIdArray – Returns the IDs of the properties required to calculate metric

Return values
  • CUPTI_SUCCESS

  • CUPTI_ERROR_NOT_INITIALIZED

  • CUPTI_ERROR_INVALID_METRIC_ID

  • CUPTI_ERROR_INVALID_PARAMETER – if propIdArraySizeBytes or propIdArray is NULL

CUptiResult cuptiMetricGetAttribute(CUpti_MetricID metric, CUpti_MetricAttribute attrib, size_t *valueSize, void *value)

Get a metric attribute.

Returns a metric attribute in *value. The size of the value buffer is given by *valueSize. The value returned in *valueSize contains the number of bytes returned in value.

If the attribute value is a c-string that is longer than *valueSize, then only the first *valueSize characters will be returned and there will be no terminating null byte.

Parameters
  • metric – ID of the metric

  • attrib – The metric attribute to read

  • valueSize – The size of the value buffer in bytes, and returns the number of bytes written to value

  • value – Returns the attribute’s value

Return values
  • CUPTI_SUCCESS

  • CUPTI_ERROR_NOT_INITIALIZED

  • CUPTI_ERROR_INVALID_METRIC_ID

  • CUPTI_ERROR_INVALID_PARAMETER – if valueSize or value is NULL, or if attrib is not a metric attribute

  • CUPTI_ERROR_PARAMETER_SIZE_NOT_SUFFICIENT – For non-c-string attribute values, indicates that the value buffer is too small to hold the attribute value.
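Since a truncated c-string attribute comes back without a terminating null byte, a caller can reserve one byte of its buffer and terminate the string itself. The C sketch below illustrates this. fake_get_name_attribute is a hypothetical stand-in that only mimics the truncation behavior described for cuptiMetricGetAttribute, and the metric name is made up.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a string-valued cuptiMetricGetAttribute call.
 * It copies at most *valueSize bytes, reports the bytes written, and omits
 * the terminator when the value does not fit. */
static int fake_get_name_attribute(size_t *valueSize, void *value)
{
    static const char kName[] = "example_metric_name";  /* made-up name */
    if (valueSize == NULL || value == NULL) return -1;
    size_t full = sizeof kName;  /* length including the '\0' */
    size_t n = *valueSize < full ? *valueSize : full;
    memcpy(value, kName, n);
    *valueSize = n;
    return 0;
}

/* Read the name into buf (capacity bufSize) and always null-terminate it.
 * Returns 0 on success, -1 on failure. */
int example_read_name(char *buf, size_t bufSize)
{
    if (buf == NULL || bufSize == 0) return -1;

    size_t size = bufSize - 1;  /* keep one byte for the terminator */
    if (fake_get_name_attribute(&size, buf) != 0) return -1;

    buf[size] = '\0';           /* size <= bufSize - 1, so this is in bounds */
    return 0;
}
```

With a buffer of 8 bytes the sketch yields the truncated but well-formed string "example", while a larger buffer yields the full name.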

CUptiResult cuptiMetricGetIdFromName(CUdevice device, const char *metricName, CUpti_MetricID *metric)

Find a metric by name.

Find a metric by name and return the metric ID in *metric.

Parameters
  • device – The CUDA device

  • metricName – The name of metric to find

  • metric – Returns the ID of the found metric or undefined if unable to find the metric

Return values
  • CUPTI_SUCCESS

  • CUPTI_ERROR_NOT_INITIALIZED

  • CUPTI_ERROR_INVALID_DEVICE

  • CUPTI_ERROR_INVALID_METRIC_NAME – if unable to find a metric with name metricName. In this case *metric is undefined

  • CUPTI_ERROR_INVALID_PARAMETER – if metricName or metric is NULL

CUptiResult cuptiMetricGetNumEvents(CUpti_MetricID metric, uint32_t *numEvents)

Get number of events required to calculate a metric.

Returns the number of events in numEvents that are required to calculate a metric.

Parameters
  • metric – ID of the metric

  • numEvents – Returns the number of events required for the metric

Return values
  • CUPTI_SUCCESS

  • CUPTI_ERROR_NOT_INITIALIZED

  • CUPTI_ERROR_INVALID_METRIC_ID

  • CUPTI_ERROR_INVALID_PARAMETER – if numEvents is NULL

CUptiResult cuptiMetricGetNumProperties(CUpti_MetricID metric, uint32_t *numProp)

Get number of properties required to calculate a metric.

Returns the number of properties in numProp that are required to calculate a metric.

Parameters
  • metric – ID of the metric

  • numProp – Returns the number of properties required for the metric

Return values
  • CUPTI_SUCCESS

  • CUPTI_ERROR_NOT_INITIALIZED

  • CUPTI_ERROR_INVALID_METRIC_ID

  • CUPTI_ERROR_INVALID_PARAMETER – if numProp is NULL

CUptiResult cuptiMetricGetRequiredEventGroupSets(CUcontext context, CUpti_MetricID metric, CUpti_EventGroupSets **eventGroupSets)

For a metric, get the groups of events that must be collected in the same pass.

For a metric, get the groups of events that must be collected in the same pass to ensure that the metric is calculated correctly. If the events are not collected as specified, the metric value may be inaccurate.

If a metric does not have any required event group, NULL is returned in *eventGroupSets. In this case the events needed for the metric can be grouped in any manner for collection.

Parameters
  • context – The context for event collection

  • metric – The metric ID

  • eventGroupSets – Returns a CUpti_EventGroupSets object that indicates the events that must be collected in the same pass to ensure the metric is calculated correctly. Returns NULL if no grouping is required for metric

Return values
  • CUPTI_SUCCESS

  • CUPTI_ERROR_NOT_INITIALIZED

  • CUPTI_ERROR_INVALID_METRIC_ID

CUptiResult cuptiMetricGetValue(CUdevice device, CUpti_MetricID metric, size_t eventIdArraySizeBytes, CUpti_EventID *eventIdArray, size_t eventValueArraySizeBytes, uint64_t *eventValueArray, uint64_t timeDuration, CUpti_MetricValue *metricValue)

Calculate the value for a metric.

Use the events collected for a metric to calculate the metric value. How the value is evaluated depends on the evaluation mode (CUpti_MetricEvaluationMode) that the metric supports. If the metric’s evaluation mode is CUPTI_METRIC_EVALUATION_MODE_PER_INSTANCE, the input event values are assumed to be for one domain instance. If it is CUPTI_METRIC_EVALUATION_MODE_AGGREGATE, the input event values are assumed to be normalized to represent all domain instances on a device.

For the most accurate metric collection, the events required for the metric should be collected for all profiled domain instances. For example, to collect all instances of an event, set the CUPTI_EVENT_GROUP_ATTR_PROFILE_ALL_DOMAIN_INSTANCES attribute on the group containing the event to 1. The normalized value for the event is then:

(sum_event_values * totalInstanceCount) / instanceCount

where sum_event_values is the sum of the event values across all profiled domain instances, totalInstanceCount is obtained by querying CUPTI_EVENT_DOMAIN_ATTR_TOTAL_INSTANCE_COUNT, and instanceCount is obtained by querying CUPTI_EVENT_GROUP_ATTR_INSTANCE_COUNT (or CUPTI_EVENT_DOMAIN_ATTR_INSTANCE_COUNT).

Parameters
  • device – The CUDA device that the metric is being calculated for

  • metric – The metric ID

  • eventIdArraySizeBytes – The size of eventIdArray in bytes

  • eventIdArray – The event IDs required to calculate metric

  • eventValueArraySizeBytes – The size of eventValueArray in bytes

  • eventValueArray – The normalized event values required to calculate metric. The values must be ordered to match the order of events in eventIdArray

  • timeDuration – The duration over which the events were collected, in ns

  • metricValue – Returns the value for the metric

Return values
  • CUPTI_SUCCESS

  • CUPTI_ERROR_NOT_INITIALIZED

  • CUPTI_ERROR_INVALID_METRIC_ID

  • CUPTI_ERROR_INVALID_OPERATION

  • CUPTI_ERROR_PARAMETER_SIZE_NOT_SUFFICIENT – if the eventIdArray does not contain all the events needed for metric

  • CUPTI_ERROR_INVALID_EVENT_VALUE – if any of the event values required for the metric is CUPTI_EVENT_OVERFLOW

  • CUPTI_ERROR_INVALID_METRIC_VALUE – if the computed metric value cannot be represented in the metric’s value type. For example, if the metric value type is unsigned and the computed metric value is negative

  • CUPTI_ERROR_INVALID_PARAMETER – if metricValue, eventIdArray or eventValueArray is NULL
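The normalization formula given in the description of cuptiMetricGetValue can be sketched in C as follows. For example, if 4 of 16 domain instances were profiled and their event values sum to 400, the normalized value is (400 * 16) / 4 = 1600. The function name and parameters are hypothetical; the two instance counts are assumed to have been queried through the event API beforehand.

```c
#include <stddef.h>
#include <stdint.h>

/* Normalize event values collected from `instanceCount` profiled domain
 * instances so that they represent all `totalInstanceCount` instances:
 *   (sum_event_values * totalInstanceCount) / instanceCount
 * Returns 0 when instanceCount is 0 (nothing was profiled).
 * Note: the multiplication is done in 64 bits and can overflow for very
 * large sums; this sketch does not guard against that. */
uint64_t example_normalize_event(const uint64_t *instanceValues,
                                 size_t numValues,
                                 uint32_t instanceCount,
                                 uint32_t totalInstanceCount)
{
    if (instanceCount == 0)
        return 0;

    uint64_t sum = 0;
    for (size_t i = 0; i < numValues; ++i)
        sum += instanceValues[i];  /* sum_event_values */

    return (sum * totalInstanceCount) / instanceCount;
}
```

One normalized value per event, placed in the same order as the event IDs, would then form the eventValueArray passed to cuptiMetricGetValue.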

CUptiResult cuptiMetricGetValue2(CUpti_MetricID metric, size_t eventIdArraySizeBytes, CUpti_EventID *eventIdArray, size_t eventValueArraySizeBytes, uint64_t *eventValueArray, size_t propIdArraySizeBytes, CUpti_MetricPropertyID *propIdArray, size_t propValueArraySizeBytes, uint64_t *propValueArray, CUpti_MetricValue *metricValue)

Calculate the value for a metric.

Use the events and properties collected for a metric to calculate the metric value. How the value is evaluated depends on the evaluation mode (CUpti_MetricEvaluationMode) that the metric supports. If the metric’s evaluation mode is CUPTI_METRIC_EVALUATION_MODE_PER_INSTANCE, the input event values are assumed to be for one domain instance. If it is CUPTI_METRIC_EVALUATION_MODE_AGGREGATE, the input event values are assumed to be normalized to represent all domain instances on a device.

For the most accurate metric collection, the events required for the metric should be collected for all profiled domain instances. For example, to collect all instances of an event, set the CUPTI_EVENT_GROUP_ATTR_PROFILE_ALL_DOMAIN_INSTANCES attribute on the group containing the event to 1. The normalized value for the event is then:

(sum_event_values * totalInstanceCount) / instanceCount

where sum_event_values is the sum of the event values across all profiled domain instances, totalInstanceCount is obtained by querying CUPTI_EVENT_DOMAIN_ATTR_TOTAL_INSTANCE_COUNT, and instanceCount is obtained by querying CUPTI_EVENT_GROUP_ATTR_INSTANCE_COUNT (or CUPTI_EVENT_DOMAIN_ATTR_INSTANCE_COUNT).

Parameters
  • metric – The metric ID

  • eventIdArraySizeBytes – The size of eventIdArray in bytes

  • eventIdArray – The event IDs required to calculate metric

  • eventValueArraySizeBytes – The size of eventValueArray in bytes

  • eventValueArray – The normalized event values required to calculate metric. The values must be ordered to match the order of events in eventIdArray

  • propIdArraySizeBytes – The size of propIdArray in bytes

  • propIdArray – The metric property IDs required to calculate metric

  • propValueArraySizeBytes – The size of propValueArray in bytes

  • propValueArray – The metric property values required to calculate metric. The values must be ordered to match the order of metric properties in propIdArray

  • metricValue – Returns the value for the metric

Return values
  • CUPTI_SUCCESS

  • CUPTI_ERROR_NOT_INITIALIZED

  • CUPTI_ERROR_INVALID_METRIC_ID

  • CUPTI_ERROR_INVALID_OPERATION

  • CUPTI_ERROR_PARAMETER_SIZE_NOT_SUFFICIENT – if the eventIdArray does not contain all the events needed for metric

  • CUPTI_ERROR_INVALID_EVENT_VALUE – if any of the event values required for the metric is CUPTI_EVENT_OVERFLOW

  • CUPTI_ERROR_NOT_COMPATIBLE – if the computed metric value cannot be represented in the metric’s value type. For example, if the metric value type is unsigned and the computed metric value is negative

  • CUPTI_ERROR_INVALID_PARAMETER – if metricValue, eventIdArray or eventValueArray is NULL

5.5.3. Typedefs

typedef uint32_t CUpti_MetricID

ID for a metric.

A metric provides a measure of some aspect of the device.