cupti.profiler_host

The cupti.profiler_host module provides the Pythonic layer for the CUPTI Profiler Host API, built on top of cupti.cupti.

APIs

class cupti.profiler_host.ProfilerHost(
chip_name: str,
profiler_type: ProfilerType,
single_pass_metric_set_name: str | None = None,
)

Bases: object

Host-side helper to build profiler metric configs and read metric values.

For one GPU chip and profiler kind, a ProfilerHost can produce the configuration data used when collecting metrics on the device, and can turn decoded counter data into floating-point metric values per sample or range. Call initialize() after construction. The chip name, profiler kind, and metric names must be valid for the installed CUPTI and hardware.

initialize() → None

Initialize the CUPTI profiler host object.

Raises:
  • ValueError – If the chip name is not valid for the installed CUPTI or hardware (ERROR_INVALID_CHIP_NAME from the underlying call).

  • ValueError – If single_pass_metric_set_name is invalid for the chip.

  • RuntimeError – If the nvperf* libraries could not be loaded.

  • cupti.cupti.cuptiError – If CUPTI fails for other reasons (e.g. internal errors).

deinitialize() → None

Release the CUPTI host object. Safe to call multiple times.
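The initialize() and deinitialize() pairing described above lends itself to a context manager. The following is a minimal sketch, not part of cupti.profiler_host: initialized_host is a hypothetical helper name, and it assumes only an object that exposes the two methods documented here.

```python
from contextlib import contextmanager


@contextmanager
def initialized_host(host):
    """Initialize `host` on entry and always deinitialize it on exit.

    `host` is any object exposing initialize() and deinitialize(), such as
    a ProfilerHost instance. This helper is hypothetical, not a CUPTI API.
    """
    host.initialize()  # may raise ValueError, RuntimeError, or cuptiError
    try:
        yield host
    finally:
        # deinitialize() is documented as safe to call multiple times, so an
        # extra call made inside the `with` body does no harm.
        host.deinitialize()
```

With the real library this might be used as `with initialized_host(ProfilerHost(chip_name, ProfilerType.RANGE_PROFILER)) as host:`, where chip_name is one of the names returned by get_supported_chips().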

create_config_image(
metrics: Sequence[str],
) → ndarray

Build a binary configuration buffer for the given metrics.

The returned buffer is used when enabling collection on the device.

Parameters:

metrics – Metric names to include; must be supported on this chip.

Returns:

Configuration image as a uint8 array.

Raises:

evaluate_metric_values(
counter_data_image: ndarray,
metrics: Sequence[str],
index: int,
) → ndarray

Get the floating-point metric values for one sample or range index in the counter data.

Parameters:
  • counter_data_image – Decoded counter payload as a uint8 array.

  • metrics – Metric names to evaluate, in the order values should appear.

  • index – Index of the sample or range in counter_data_image.

Returns:

New float64 array of shape (len(metrics),).
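Because evaluate_metric_values() handles one index per call, reading a whole counter data image means looping over indices. The sketch below shows one way to do that. collect_metric_rows is a hypothetical helper, and num_indices is an assumption: this reference does not say how to obtain the number of samples or ranges, so the caller must supply it.

```python
def collect_metric_rows(host, counter_data_image, metrics, num_indices):
    """Return one {metric_name: value} dict per sample or range index.

    `host` is any object with the evaluate_metric_values() method
    documented above. `num_indices` must be provided by the caller
    (an assumption of this sketch).
    """
    rows = []
    for index in range(num_indices):
        values = host.evaluate_metric_values(counter_data_image, metrics, index)
        # Values come back in the same order as `metrics`.
        rows.append({name: float(value) for name, value in zip(metrics, values)})
    return rows
```

Each row pairs a metric name with its value, which avoids relying on positional order once the data leaves this function.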

Raises:

get_base_metrics(
metric_type: MetricType,
) → list[str]

Get base metrics supported for a metric type on this chip.

Parameters:

metric_type – Metric type to query (for example counter, ratio, throughput).

Returns:

Base metric names for the requested metric type.

Raises:

get_sub_metrics(
metric_name: str,
metric_type: MetricType,
) → list[str]

Get sub-metrics for a metric name and metric type.

Parameters:
  • metric_name – Metric name to query (with or without extension).

  • metric_type – Metric type for the queried metric.

Returns:

Sub-metric names supported for metric_name.
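get_base_metrics() and get_sub_metrics() can be combined to enumerate what a chip offers for one metric type. This is a minimal sketch under that assumption. metric_tree is a hypothetical helper name, and the sketch makes no claim about how base names and sub-metric names are joined into a full metric name.

```python
def metric_tree(host, metric_type):
    """Map each base metric of `metric_type` to its list of sub-metrics.

    `host` is any object exposing get_base_metrics() and get_sub_metrics()
    as documented above, for example an initialized ProfilerHost.
    """
    tree = {}
    for base_name in host.get_base_metrics(metric_type):
        tree[base_name] = list(host.get_sub_metrics(base_name, metric_type))
    return tree
```

With the real library, metric_type would be a MetricType member such as MetricType.COUNTER.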

Raises:

get_metric_properties(
metric_name: str,
) → MetricProperties

Get properties for a metric.

Parameters:

metric_name – Metric name to query (with or without extension).

Returns:

A MetricProperties object with description, units, metric type, and collection scope.

Raises:

cupti.profiler_host.get_supported_chips() → list[str]

Get the list of supported chips.

Returns:

Supported chip names.

Return type:

list[str]

Raises:

cupti.cupti.cuptiError – If CUPTI reports internal failures.

cupti.profiler_host.get_single_pass_sets(chip_name: str) → list[str]

Get single-pass metric set names for a chip.

Parameters:

chip_name – Chip name for which single-pass metric sets are queried.

Returns:

Single-pass metric set names supported on the chip.

Return type:

list[str]

Raises:

Note

This API is primarily useful for PM Sampling workflows where you need to select metrics that can be collected in a single pass.

cupti.profiler_host.get_metrics_in_single_pass_set(
chip_name: str,
set_name: str,
) → list[str]

Get metric names in a single-pass metric set for a chip.

Parameters:
  • chip_name – Chip name to query.

  • set_name – Single-pass metric set name to query.

Returns:

Metric names in the single-pass set.

Return type:

list[str]

Raises:

Note

This API is primarily useful for PM Sampling workflows when choosing metrics from a specific single-pass metric set.
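For the PM Sampling use case mentioned in the notes, the two single-pass queries can be combined to find which sets contain every metric of interest. The following is a sketch, not a library function. sets_covering is a hypothetical name, and get_sets and get_metrics are parameters standing in for get_single_pass_sets and get_metrics_in_single_pass_set so the example runs without CUPTI installed. It also assumes that requested names match the returned names exactly.

```python
def sets_covering(chip_name, wanted, get_sets, get_metrics):
    """Return the single-pass set names that contain every wanted metric.

    get_sets(chip_name) and get_metrics(chip_name, set_name) are expected
    to behave like the two module-level functions documented above.
    """
    wanted = set(wanted)
    covering = []
    for set_name in get_sets(chip_name):
        available = set(get_metrics(chip_name, set_name))
        if wanted <= available:  # every wanted metric is in this set
            covering.append(set_name)
    return covering
```

A set name found this way could then be passed as single_pass_metric_set_name when constructing a ProfilerHost.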

cupti.profiler_host.get_num_of_passes(config_image: ndarray) → int

Get the number of passes required to collect scheduled metrics.

Parameters:

config_image – Config image returned by ProfilerHost.create_config_image().

Returns:

Number of passes required for profiling.

Return type:

int
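A workflow that can only afford a fixed number of passes can check the count right after building the config image. This is a minimal sketch. config_within_pass_budget is a hypothetical helper, and count_passes is a parameter standing in for get_num_of_passes so the example does not need CUPTI to run.

```python
def config_within_pass_budget(host, metrics, count_passes, max_passes=1):
    """Build a config image and reject it if it needs too many passes.

    `host` provides create_config_image() as documented for ProfilerHost.
    `count_passes(config_image)` should behave like get_num_of_passes().
    Returns (config_image, passes) when the budget is met.
    """
    config_image = host.create_config_image(metrics)
    passes = count_passes(config_image)
    if passes > max_passes:
        raise ValueError(
            f"{len(metrics)} metrics need {passes} passes, "
            f"but only {max_passes} allowed"
        )
    return config_image, passes
```

The default of one pass is an illustrative choice, not a requirement stated in this reference.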

Raises:

cupti.profiler_host.get_max_num_hardware_metrics_per_pass(
chip_name: str,
profiler_type: ProfilerType,
) → int

Get the upper bound on the number of hardware metrics (metrics whose names do not contain the _sass_ keyword) that can be scheduled in one pass.

Parameters:
  • chip_name – Chip name to query.

  • profiler_type – Profiler kind (for example ProfilerType.PM_SAMPLING).

Returns:

Maximum number of hardware metrics that can be scheduled in one pass.

Return type:

int
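One possible use of this bound is to split a long metric list into batches that never exceed it. The sketch below is an illustration only. split_by_hardware_limit is a hypothetical helper, and it classifies a metric as a hardware metric purely by the absence of "_sass_" in its name, as described above. Because the value is an upper bound, a batch within the limit may still need more than one pass; get_num_of_passes() on the resulting config image remains the authoritative check.

```python
def split_by_hardware_limit(metrics, max_hw_per_pass):
    """Split `metrics` into ordered batches with a capped hardware count.

    `max_hw_per_pass` is the value returned by
    get_max_num_hardware_metrics_per_pass(). Metrics containing "_sass_"
    do not count toward the cap.
    """
    if max_hw_per_pass < 1:
        raise ValueError("max_hw_per_pass must be at least 1")
    batches, current, hw_count = [], [], 0
    for name in metrics:
        is_hardware = "_sass_" not in name
        if is_hardware and hw_count == max_hw_per_pass:
            # Current batch is full of hardware metrics; start a new one.
            batches.append(current)
            current, hw_count = [], 0
        current.append(name)
        hw_count += is_hardware
    if current:
        batches.append(current)
    return batches
```

Each batch could then be passed to ProfilerHost.create_config_image() separately.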

Raises:

Data structures

class cupti.profiler_host.MetricProperties(
description: str,
hw_unit: str,
dim_unit: str,
metric_type: MetricType,
metric_collection_scope: MetricCollectionScope,
)

Bases: object

Properties for a metric reported by ProfilerHost.get_metric_properties().
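As a small illustration of consuming a MetricProperties value, the sketch below formats one into a single line. It assumes the object exposes attributes named after the constructor parameters listed above, which this reference does not state explicitly. describe_metric is a hypothetical helper.

```python
def describe_metric(metric_name, props):
    """Format a MetricProperties-like object as a one-line summary.

    Assumes `props` has description, hw_unit, dim_unit, metric_type and
    metric_collection_scope attributes, and that the last two are enum
    members with a `.name`.
    """
    return (
        f"{metric_name}: {props.description} "
        f"[type={props.metric_type.name}, "
        f"scope={props.metric_collection_scope.name}, "
        f"hw_unit={props.hw_unit}, dim_unit={props.dim_unit}]"
    )
```

With the real library, props would come from ProfilerHost.get_metric_properties(metric_name).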

Enumerations

class cupti.profiler_host.MetricType(value)

Bases: IntEnum

See CUpti_MetricType.

COUNTER = 0
RATIO = 1
THROUGHPUT = 2

class cupti.profiler_host.MetricCollectionScope(value)

Bases: IntEnum

See CUpti_MetricCollectionScope.

CONTEXT = 0
DEVICE = 1
INVALID = 2

class cupti.profiler_host.ProfilerType(value)

Bases: IntEnum

See CUpti_ProfilerType.

RANGE_PROFILER = 0
PM_SAMPLING = 1
PROFILER_INVALID = 2
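Since these enumerations derive from IntEnum, members can be looked up by name and compare equal to their integer values. The sketch below demonstrates this with a local stand-in that mirrors the documented ProfilerType values. It is not the real class, parse_profiler_type is a hypothetical helper, and rejecting PROFILER_INVALID is an assumption of this example.

```python
from enum import IntEnum


class ProfilerType(IntEnum):
    """Local stand-in mirroring the documented values; not the real class."""
    RANGE_PROFILER = 0
    PM_SAMPLING = 1
    PROFILER_INVALID = 2


def parse_profiler_type(text):
    """Map a case-insensitive name such as 'pm_sampling' to a member."""
    try:
        member = ProfilerType[text.strip().upper()]
    except KeyError:
        valid = ", ".join(m.name for m in ProfilerType)
        raise ValueError(
            f"unknown profiler type {text!r}; expected one of {valid}"
        ) from None
    if member is ProfilerType.PROFILER_INVALID:
        # Assumption: the invalid sentinel is never a meaningful choice.
        raise ValueError("PROFILER_INVALID is not a usable profiler type")
    return member
```

A helper like this might sit behind a command-line option that selects the profiler kind.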