2. Python Report Interface

2.1. Introduction

NVIDIA Nsight Compute features a Python-based interface to interact with exported report files.

The module is called ncu_report and works with any Python version from 3.4 onward [1]. It can be found in the extras/python directory of your NVIDIA Nsight Compute package.

In order to use the Python module, you need a report file generated by NVIDIA Nsight Compute. You can obtain such a file by saving it from the graphical interface or by using the --export flag of the command line tool.

The types and functions in the ncu_report module are a subset of the ones available in the NvRules API. The documentation in this section serves as a tutorial. For a more formal description of the exposed API, please refer to the NvRules API documentation.

[1] On Linux machines you will also need a GNU-compatible libc and libgcc_s.so.

2.2. Basic Usage

To import ncu_report, either navigate to the extras/python directory or add its absolute path to the PYTHONPATH environment variable. The module can then be imported like any other Python module:

>>> import ncu_report

Importing a report

Once the module is imported, you can load a report file by calling the load_report function with the path to the file. This function returns an object of type IContext which holds all the information concerning that report.

>>> my_context = ncu_report.load_report("my_report.ncu-rep")

Querying ranges

When working with the Python module, profiling results are grouped into ranges which are represented by IRange objects. You can inspect the number of ranges contained in the loaded report by calling the IContext.num_ranges member function of an IContext object and retrieve a range by its index using IContext.range_by_idx.

>>> my_context.num_ranges()
1
>>> my_range = my_context.range_by_idx(0)

Querying actions

Inside a range, profiling results are called actions. You can query the number of actions contained in a given range by using the IRange.num_actions method of an IRange object.

>>> my_range.num_actions()
2

In the same way ranges can be obtained from an IContext object by using the IContext.range_by_idx method, individual actions can be obtained from IRange objects by using the IRange.action_by_idx method. The resulting actions are represented by the IAction class.

>>> my_action = my_range.action_by_idx(0)

As mentioned previously, an action represents a single profiling result. To query the workload’s name, you can use the IAction.name member function of the IAction class.

>>> my_action.name()
MyKernel
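The index-based calls shown so far can be combined into a small loop over a whole report. The following is a minimal sketch; the helper name list_action_names is hypothetical, and it only relies on the num_ranges, range_by_idx, num_actions, action_by_idx and name methods described above.

```python
def list_action_names(context):
    """Return (range index, action index, action name) for every action in a report."""
    rows = []
    for range_idx in range(context.num_ranges()):
        current_range = context.range_by_idx(range_idx)
        for action_idx in range(current_range.num_actions()):
            action = current_range.action_by_idx(action_idx)
            rows.append((range_idx, action_idx, action.name()))
    return rows


# Typical use with a loaded report (requires the ncu_report module):
# for row in list_action_names(ncu_report.load_report("my_report.ncu-rep")):
#     print(row)
```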

To query the workload type, you can use the IAction.workload_type member function of the IAction class, which will return an integer among the IAction.WorkloadType_* values.

For CUDA kernels, you get the following result:

>>> my_action.workload_type()
0

All return values of IAction.workload_type are documented in the API documentation for the IAction class.
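Since the workload type is a plain integer, it can help to translate it back into a readable label. The sketch below assumes that the IAction.WorkloadType_* values are ordinary class attributes that can be discovered with dir(); the helper names constant_names and workload_type_label are hypothetical, and action_cls is expected to be ncu_report.IAction.

```python
def constant_names(cls, prefix):
    """Map the value of every class constant starting with `prefix` to its short name."""
    return {
        getattr(cls, attr): attr[len(prefix):]
        for attr in dir(cls)
        if attr.startswith(prefix)
    }


def workload_type_label(action, action_cls):
    """Return a label such as 'KERNEL' for the workload type of an action."""
    names = constant_names(action_cls, "WorkloadType_")
    return names.get(action.workload_type(), "UNKNOWN")


# Typical use (requires the ncu_report module):
# print(workload_type_label(my_action, ncu_report.IAction))
```

The same constant_names helper could be pointed at other constant families, for example the MetricType_ prefix of the IMetric class.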

Querying metrics

To get a tuple of all metric names contained within an action you can use the IAction.metric_names method. It is meant to be combined with the IAction.metric_by_name method which returns an IMetric object. However, for the same task you may also use the subscript operator through the IAction.__getitem__ method, as explained in the Python Interoperability section below.

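The metric names displayed here are the same as the ones you can use with the --metrics flag of NVIDIA Nsight Compute. Once you have extracted a metric from an action, you can obtain its value by using one of the following three methods:

  • IMetric.as_string to obtain its value as a Python str

  • IMetric.as_uint64 to obtain its value as a Python int

  • IMetric.as_double to obtain its value as a Python float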

For example, to print the display name of the GPU on which the workload was profiled you can query the device__attribute_display_name metric.

>>> display_name_metric = my_action.metric_by_name('device__attribute_display_name')
>>> display_name_metric.as_string()
'NVIDIA GeForce RTX 3060 Ti'

Note that accessing a metric with the wrong type can lead to unexpected (conversion) results.

>>> display_name_metric.as_double()
0.0

Therefore, it is advisable to use the high-level function IMetric.value directly, as explained below.
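If you do want to call the as_* accessors yourself, one way to avoid a mismatched conversion is to check IMetric.kind first. The following sketch only illustrates that idea and is not the module's own implementation; the helper name typed_value is hypothetical, metric_cls is expected to be ncu_report.IMetric, and the grouping of value kinds per accessor is an assumption.

```python
def typed_value(metric, metric_cls):
    """Pick the as_* accessor that matches the value kind reported by the metric."""
    kind = metric.kind()
    if kind == metric_cls.ValueKind_STRING:
        return metric.as_string()
    if kind in (metric_cls.ValueKind_FLOAT, metric_cls.ValueKind_DOUBLE):
        return metric.as_double()
    if kind in (metric_cls.ValueKind_UINT32, metric_cls.ValueKind_UINT64):
        return metric.as_uint64()
    # Unknown or undefined kinds: report "no value" rather than guessing a conversion.
    return None


# Typical use (requires the ncu_report module):
# typed_value(display_name_metric, ncu_report.IMetric)
```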

2.3. Python Interoperability

On top of the low-level NvRules API, the Python Report Interface also implements part of the Python object model. By implementing special methods, the classes exposed by the Python Report Interface can be used with built-in Python mechanisms such as iteration, string formatting and length querying.

This allows you to access metrics objects via the IAction.__getitem__ method of the IAction class:

>>> display_name_metric = my_action["device__attribute_display_name"]

There is also a convenience method IMetric.value which allows you to query the value of a metric object without knowledge of its type:

>>> display_name_metric.value()
'NVIDIA GeForce RTX 3060 Ti'
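Iteration over an action and the subscript operator can be combined with IMetric.value to collect all metrics of an action in one step. This is a minimal sketch; the helper name metrics_as_dict is hypothetical.

```python
def metrics_as_dict(action):
    """Return {metric name: value} for every metric stored in an action."""
    # Iterating an IAction yields metric names; action[name] returns the IMetric.
    return {name: action[name].value() for name in action}


# Typical use (requires the ncu_report module):
# values = metrics_as_dict(my_action)
# print(values["device__attribute_display_name"])
```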

All the available methods of a class, as well as their associated Python docstrings, can be looked up interactively via

>>> help(ncu_report.IMetric)

or similarly for other classes and methods. In your code, you can access the docstrings via the __doc__ attribute, for example ncu_report.IMetric.value.__doc__.

2.4. Metric attributes

In addition to the IMetric.name and IMetric.value of an IMetric object, you can also query the following metric attributes:

  • IMetric.metric_type

  • IMetric.metric_subtype

  • IMetric.rollup_operation

  • IMetric.unit

  • IMetric.description

The first method IMetric.metric_type returns one out of three enum values (IMetric.MetricType_COUNTER, IMetric.MetricType_RATIO, IMetric.MetricType_THROUGHPUT) if the metric is a hardware metric, or IMetric.MetricType_OTHER otherwise (e.g. for launch or device attributes).

The method IMetric.metric_subtype returns an enum value representing the subtype of a metric (e.g. IMetric.MetricSubtype_PEAK_SUSTAINED, IMetric.MetricSubtype_PER_CYCLE_ACTIVE). In case a metric does not have a subtype, None is returned. All available values may be found in the documentation for the IMetric class, or may be looked up interactively by executing help(ncu_report.IMetric).

IMetric.rollup_operation returns the operation which is used to accumulate different values of the same metric and can be one of IMetric.RollupOperation_AVG, IMetric.RollupOperation_MAX, IMetric.RollupOperation_MIN or IMetric.RollupOperation_SUM for averaging, maximum, minimum or summation, respectively. If the metric in question does not specify a rollup operation None will be returned.

Lastly, IMetric.unit and IMetric.description return a (possibly empty) string of the metric’s unit and a short textual description for hardware metrics, respectively.

The above methods can be combined to filter through all metrics of a report, given certain criteria:

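from ncu_report import IMetric

# my_action is an IAction as obtained in Basic Usage; iterating it yields metric names.
metrics = [my_action[name] for name in my_action]
for metric in metrics:
    if metric.metric_type() == IMetric.MetricType_COUNTER and \
       metric.metric_subtype() == IMetric.MetricSubtype_PER_SECOND and \
       metric.rollup_operation() == IMetric.RollupOperation_AVG:
        print(f"{metric.name()}: {metric.value()} {metric.unit()}")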
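The same attributes can also be used to group metrics instead of filtering them. The sketch below sorts the metric names of an action by their metric type; the helper name names_by_metric_type and the lower-case labels are hypothetical, and metric_cls is expected to be ncu_report.IMetric.

```python
from collections import defaultdict


def names_by_metric_type(action, metric_cls):
    """Group the metric names of an action by the label of their metric type."""
    labels = {
        metric_cls.MetricType_COUNTER: "counter",
        metric_cls.MetricType_RATIO: "ratio",
        metric_cls.MetricType_THROUGHPUT: "throughput",
        metric_cls.MetricType_OTHER: "other",
    }
    groups = defaultdict(list)
    for name in action:
        # Unrecognized type values are filed under "other" as well.
        label = labels.get(action[name].metric_type(), "other")
        groups[label].append(name)
    return dict(groups)


# Typical use (requires the ncu_report module):
# for label, names in names_by_metric_type(my_action, ncu_report.IMetric).items():
#     print(label, len(names))
```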

2.5. NVTX Support

The ncu_report module supports the NVIDIA Tools Extension (NVTX) through the INvtxState object, which represents the NVTX state of a profiled kernel.

An INvtxState object can be obtained from an action by using its IAction.nvtx_state method. It exposes the INvtxState.domains method which returns a tuple of integers representing the domains this kernel has state in. These integers can be used with the INvtxState.domain_by_id method to get an INvtxDomainInfo object which represents the state of a domain.

The INvtxDomainInfo can be used to obtain a tuple of Push-Pop, or Start-End ranges using the INvtxDomainInfo.push_pop_ranges and INvtxDomainInfo.start_end_ranges methods.
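Put together, the NVTX objects described above can be walked as follows. This is a minimal sketch; the helper name nvtx_ranges_by_domain and the layout of the returned dictionary are hypothetical choices.

```python
def nvtx_ranges_by_domain(action):
    """Return {domain id: {'name', 'push_pop', 'start_end'}} for one action."""
    state = action.nvtx_state()
    if state is None:
        # No NVTX state is associated with this action.
        return {}
    result = {}
    for domain_id in state.domains():
        domain = state.domain_by_id(domain_id)
        if domain is None:
            continue
        result[domain_id] = {
            "name": domain.name(),
            "push_pop": tuple(domain.push_pop_ranges()),
            "start_end": tuple(domain.start_end_ranges()),
        }
    return result


# Typical use (requires the ncu_report module):
# print(nvtx_ranges_by_domain(my_action))
```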

There is also an IRange.actions_by_nvtx member function in the IRange class which returns a tuple of indices of the actions matching the NVTX state described by its parameters.

The parameters for the IRange.actions_by_nvtx function are two lists of strings representing the state for which we want to query the actions. The first parameter describes the NVTX states to include while the second one describes the NVTX states to exclude. These strings are in the same format as the ones used with the --nvtx-include and --nvtx-exclude options.
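Because IRange.actions_by_nvtx returns indices rather than actions, a small wrapper can resolve them to names. The following is a sketch under that reading of the API reference; the helper name action_names_by_nvtx is hypothetical, and the filter string in the usage comment is the one used by the sample script in this section.

```python
def action_names_by_nvtx(profile_range, includes, excludes=()):
    """Return the names of all actions in a range that match the NVTX filters."""
    # Both filters are passed as lists of strings, as described above.
    indices = profile_range.actions_by_nvtx(list(includes), list(excludes))
    return [profile_range.action_by_idx(idx).name() for idx in indices]


# Typical use (requires the ncu_report module):
# action_names_by_nvtx(my_range, ["BottomRange/*/TopRange"])
```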

2.6. Sample Script

NVTX Push-Pop range filtering

This is a sample script which loads a report and prints the names of all the profiled kernels which were wrapped inside BottomRange and TopRange Push-Pop ranges of the default NVTX domain.

#!/usr/bin/env python3

import sys

import ncu_report

if len(sys.argv) != 2:
    print("usage: {} report_file".format(sys.argv[0]), file=sys.stderr)
    sys.exit(1)

report = ncu_report.load_report(sys.argv[1])

for range_idx in range(report.num_ranges()):
    current_range = report.range_by_idx(range_idx)
    for action_idx in current_range.actions_by_nvtx(["BottomRange/*/TopRange"], []):
        action = current_range.action_by_idx(action_idx)
        print(action.name())

2.7. API Reference

This documents the content of the ncu_report package which can be found in the extras/python directory of your NVIDIA Nsight Compute installation.

class ncu_report.FocusSeverity

Enum representing the severity of a focus metric.

DEFAULT

Default severity.

LOW

Low severity.

HIGH

High severity.

class ncu_report.IAction

The IAction represents a profile result such as a CUDA kernel in a single range or a range itself in range-based profiling, for which zero or more metrics were collected.

NameBase_DEMANGLED

Name base for demangled names.

Type

int

NameBase_FUNCTION

Name base for function signature names.

Type

int

NameBase_MANGLED

Name base for mangled names.

Type

int

WorkloadType_CMDLIST

Workload type for CBL command lists.

Type

int

WorkloadType_GRAPH

Workload type for CUDA graphs.

Type

int

WorkloadType_KERNEL

Workload type for CUDA kernels or CUDA graph kernel nodes.

Type

int

WorkloadType_RANGE

Workload type for result ranges.

Type

int

__getitem__(key)

Get an IMetric object contained in this IAction by its name.

Parameters

key (str) – The name of the IMetric object to retrieve.

Returns

An IMetric object.

Return type

IMetric

Raises
__iter__()

Get an iterator over the metric names of this IAction.

Returns

An iterator over the metric names.

Return type

iterator of str

__len__()

Get the number of IMetric objects of this IAction.

Returns

The number of IMetric objects.

Return type

int

__str__()

Get a human-readable representation of this IAction.

Returns

The name of the kernel the IAction object represents.

Return type

str

metric_by_name(metric_name)

Get a single IMetric by name.

Parameters

metric_name (str) – The name of the IMetric to retrieve.

Returns

The IMetric object or None if no such metric exists.

Return type

IMetric | None

metric_names()

Get the names of all metrics of this IAction.

Returns

The names of all metrics.

Return type

tuple of str

name(*args)

Get the name of the result the IAction object represents.

Parameters

name_base (int, optional) – The desired name base. Defaults to NameBase_FUNCTION.

Returns

The name of the result (potentially in a specific name base).

Return type

str
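As an illustration of the name_base parameter, the sketch below queries the name of one action in all three name bases. The helper name names_in_all_bases is hypothetical, and action_cls is expected to be ncu_report.IAction.

```python
def names_in_all_bases(action, action_cls):
    """Return {name base constant: action name in that base}."""
    bases = ("NameBase_FUNCTION", "NameBase_DEMANGLED", "NameBase_MANGLED")
    return {base: action.name(getattr(action_cls, base)) for base in bases}


# Typical use (requires the ncu_report module):
# for base, name in names_in_all_bases(my_action, ncu_report.IAction).items():
#     print(base, name)
```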

nvtx_state()

Get the NVTX state associated with this action.

Returns

The associated INvtxState or None if no state is associated.

Return type

INvtxState | None

ptx_by_pc(address)

Get the PTX associated with an address.

Parameters

address (int) – The address to get PTX for.

Returns

The PTX associated with the given address. If no PTX is available, the empty string will be returned.

Return type

str

rule_results()

Get the tuple of all rules associated with this action.

Returns

A tuple of rule results.

Return type

tuple of IRuleResult

rule_results_as_dicts()

Get the list of all rules data associated with this action.

Returns

A list of rule data dictionaries. Each rule data dictionary contains the following key-value pairs:
  • ’rule_identifier’ : (str) The rule identifier.

  • ’name’ : (str) The rule name.

  • ’section_identifier’ : (str) The section identifier.

  • ’rule_message’ : (dict) A dictionary with the following key-value pairs:
    • ’title’ : (str) The title of the rule message.

    • ’message’ : (str) The message.

    • ’type’ : (MsgType) The message type.

  • ’focus_metrics’ : (list of dict) A list of focus metrics. Each focus metric dictionary contains the following key-value pairs:
    • ’name’ : (str) The name of the focus metric.

    • ’value’ : (float) The value of the focus metric.

    • ’severity’ : (FocusSeverity) The severity of the focus metric.

    • ’info’ : (str) The information about the focus metric.

  • ’speedup_estimation’ : (dict) A dictionary with the following key-value pairs:
    • ’type’ : (SpeedupType) The speedup type.

    • ’speedup’ : (float) The speedup value.

  • ’result_tables’ : (dict) A dictionary with the following key-value pairs:
    • ’title’ : (str) The title of the result table.

    • ’description’ : (str) The description of the result table.

    • ’headers’ : (list of str) The column headers of the result table.

    • ’data’ : (list of list [int | float | str | Any ]) The table data in row-major format. Each column has elements of the same type.

    str values support the following link formats:
    • @url:<hypertext>:<external link>@ - To add an external link for a hypertext.

    • @sass:<address>:<hypertext>@ - To add a link to the hypertext to open the SASS address line on the Source page.

    • @source:<file name>:<line number>:<hypertext>@ - To add a link to the hypertext to open the source file at the specified line number on the Source page.

    • @section:<section identifier>:<hypertext>@ - To add a link to the hypertext to jump to the respective section.

Return type

list of dict
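Since the rule data is returned as plain dictionaries, it can be post-processed with ordinary Python. The sketch below ranks rules by their estimated speedup; the helper name top_speedups is hypothetical, and the code assumes that any of the keys listed above may be missing for a given rule, which is why it uses dict.get throughout.

```python
def top_speedups(rule_dicts, limit=3):
    """Return up to `limit` (speedup, rule name, message title) tuples, largest first."""
    rows = []
    for rule in rule_dicts:
        estimation = rule.get("speedup_estimation") or {}
        speedup = estimation.get("speedup")
        if speedup is None:
            # Rules without a speedup estimation are skipped.
            continue
        title = (rule.get("rule_message") or {}).get("title", "")
        rows.append((speedup, rule.get("name", ""), title))
    rows.sort(key=lambda row: row[0], reverse=True)
    return rows[:limit]


# Typical use (requires the ncu_report module):
# for speedup, name, title in top_speedups(my_action.rule_results_as_dicts()):
#     print(speedup, name, title)
```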

sass_by_pc(address)

Get the SASS associated with an address.

Parameters

address (int) – The address to get SASS for.

Returns

The SASS associated with the given address. If no SASS is available, the empty string will be returned.

Return type

str
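For source-correlated metrics, the correlation IDs of the instance values can serve as the addresses passed to sass_by_pc. The following sketch pairs each instance value with its SASS text; the helper name sass_per_instance is hypothetical, and the code assumes that the correlation IDs of the chosen metric are instruction addresses.

```python
def sass_per_instance(action, metric_name):
    """Return (address, SASS text, value) for every instance of a correlated metric."""
    metric = action.metric_by_name(metric_name)
    if metric is None or not metric.has_correlation_ids():
        return []
    addresses = metric.correlation_ids()
    rows = []
    for instance in range(metric.num_instances()):
        # Assumption: each correlation ID is the address of a SASS instruction.
        address = addresses.as_uint64(instance)
        rows.append((address, action.sass_by_pc(address), metric.value(instance)))
    return rows


# Typical use (requires the ncu_report module; the metric name is up to you):
# for address, sass, value in sass_per_instance(my_action, some_metric_name):
#     print(hex(address), sass, value)
```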

source_files()

Get the source files associated with this action along with their content.

If content is not available for a file (e.g. because it hadn’t been imported into the report), the file name will map to an empty string.

Returns

A dictionary mapping source files to their content.

Return type

dict of str to str

source_info(address)

Get the source info for a function address within this action.

Addresses are commonly obtained as correlation IDs of source-correlated metrics.

Parameters

address (int) – The address to get source info for.

Returns

The ISourceInfo associated to the given address. If no source info is available, None is returned.

Return type

ISourceInfo | None
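source_info and source_files can be combined to look up the source line that belongs to an address. This is a minimal sketch; the helper name source_line_for is hypothetical, and the code assumes that ISourceInfo.line is 1-based.

```python
def source_line_for(action, address):
    """Return (file name, line number, line text) for an address, or None."""
    info = action.source_info(address)
    if info is None:
        return None
    file_name = info.file_name()
    line_no = info.line()
    # Files whose content was not imported into the report map to an empty string.
    lines = action.source_files().get(file_name, "").splitlines()
    # Assumption: line numbers start at 1.
    text = lines[line_no - 1] if 1 <= line_no <= len(lines) else ""
    return file_name, line_no, text


# Typical use (requires the ncu_report module):
# print(source_line_for(my_action, address))
```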

source_markers()

Get all the source markers associated with this action.

Returns

A tuple of source data dictionaries. Each source data dict contains the following key-value pairs:
  • ’rule_identifier’ : (str) The rule identifier.

  • ’section_identifier’ : (str) The section identifier.

  • ’kind’ : (int) MarkerKind attribute.

  • ’message_type’ : (int) MsgType attribute.

  • ’message’ : (str) The source marker message.

  • ’source_location’ : (dict) A dictionary with the following key-value pairs:
    • ’file_name’ : (str) The file name.

    • ’line’ : (int) The line number.

  • ’source_address’ : (int) Source instruction address. Key available only for kind MarkerKind.SASS.

Return type

tuple of dict
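The source marker dictionaries can be regrouped by file for reporting. The sketch below is one way to do that; the helper name markers_by_file is hypothetical, and markers without a ’source_location’ entry are collected under an empty file name as an arbitrary choice.

```python
from collections import defaultdict


def markers_by_file(markers):
    """Group source markers as {file name: [(line, message), ...]}."""
    grouped = defaultdict(list)
    for marker in markers:
        location = marker.get("source_location") or {}
        file_name = location.get("file_name", "")
        grouped[file_name].append((location.get("line"), marker.get("message", "")))
    return dict(grouped)


# Typical use (requires the ncu_report module):
# for file_name, entries in markers_by_file(my_action.source_markers()).items():
#     print(file_name, entries)
```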

workload_type()

Get the workload type of the action.

Returns

The workload type.

Return type

int

class ncu_report.IContext

The IContext class is the top-level object representing an open report.

It can be created by calling the load_report function.

__getitem__(key)

Get one or more IRange objects by index or by slice.

Parameters

key (int | slice) – The index or slice to retrieve.

Returns

An IRange object or a tuple of IRange objects.

Return type

IRange | tuple of IRange

Raises
__iter__()

Get an iterator over the IRange objects of this IContext.

Returns

An iterator over the IRange objects.

Return type

iterator of IRange

__len__()

Get the number of IRange objects in this IContext.

Returns

The number of IRange objects.

Return type

int
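The special methods of IContext and IRange allow a report to be traversed without explicit index calls. The following sketch uses iteration over the context, len() on each range, slicing of a range and str() on each action; the helper name summarize_report is hypothetical.

```python
def summarize_report(context, per_range=2):
    """Return (number of actions, names of the first few actions) for each range."""
    summary = []
    for profile_range in context:                # IContext.__iter__
        first_actions = profile_range[:per_range]  # IRange.__getitem__ with a slice
        names = [str(action) for action in first_actions]  # IAction.__str__
        summary.append((len(profile_range), names))        # IRange.__len__
    return summary


# Typical use (requires the ncu_report module):
# print(summarize_report(ncu_report.load_report("my_report.ncu-rep")))
```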

num_ranges()

Get the number of IRange objects in this IContext.

Returns

The number of IRange objects.

Return type

int

range_by_idx(idx)

Get an IRange object by index.

Parameters

idx (int) – The index to retrieve.

Returns

An IRange object or None if the index is out of range.

Return type

IRange | None

class ncu_report.IMetric

Represents a single, named metric. An IMetric can carry one value or multiple ones if it is an instanced metric.

MetricType_OTHER

Metric type for metrics that do not fit in any other category.

Type

int

MetricType_COUNTER

Metric type for counter metrics.

Type

int

MetricType_RATIO

Metric type for ratio metrics.

Type

int

MetricType_THROUGHPUT

Metric type for throughput metrics.

Type

int

MetricSubtype_NONE

Metric subtype for metrics that do not have a subtype.

Type

int

MetricSubtype_PEAK_SUSTAINED

Metric subtype for peak sustained metrics.

Type

int

MetricSubtype_PEAK_SUSTAINED_ACTIVE

Metric subtype for peak sustained active metrics.

Type

int

MetricSubtype_PEAK_SUSTAINED_ACTIVE_PER_SECOND

Metric subtype for peak sustained active per-second metrics.

Type

int

MetricSubtype_PEAK_SUSTAINED_ELAPSED

Metric subtype for peak sustained elapsed metrics.

Type

int

MetricSubtype_PEAK_SUSTAINED_ELAPSED_PER_SECOND

Metric subtype for peak sustained elapsed per-second metrics.

Type

int

MetricSubtype_PER_CYCLE_ACTIVE

Metric subtype for per-cycle active metrics.

Type

int

MetricSubtype_PER_CYCLE_ELAPSED

Metric subtype for per-cycle elapsed metrics.

Type

int

MetricSubtype_PER_SECOND

Metric subtype for per-second metrics.

Type

int

MetricSubtype_PCT_OF_PEAK_SUSTAINED_ACTIVE

Metric subtype for percentage of peak sustained active metrics.

Type

int

MetricSubtype_PCT_OF_PEAK_SUSTAINED_ELAPSED

Metric subtype for percentage of peak sustained elapsed metrics.

Type

int

MetricSubtype_MAX_RATE

Metric subtype for max rate metrics.

Type

int

MetricSubtype_PCT

Metric subtype for percentage metrics.

Type

int

MetricSubtype_RATIO

Metric subtype for ratio metrics.

Type

int

RollupOperation_NONE

No rollup operation.

Type

int

RollupOperation_AVG

Average rollup operation.

Type

int

RollupOperation_MAX

Maximum rollup operation.

Type

int

RollupOperation_MIN

Minimum rollup operation.

Type

int

RollupOperation_SUM

Sum rollup operation.

Type

int

ValueKind_UNKNOWN

Unknown value kind.

Type

int

ValueKind_ANY

Undefined value kind.

Type

int

ValueKind_STRING

String value kind.

Type

int

ValueKind_FLOAT

Float value kind.

Type

int

ValueKind_DOUBLE

Double value kind.

Type

int

ValueKind_UINT32

Unsigned 32-bit integer value kind.

Type

int

ValueKind_UINT64

Unsigned 64-bit integer value kind.

Type

int

__str__()

Get a human-readable representation of this IMetric.

Returns

The name of the IMetric.

Return type

str

as_double(*args)

Get the metric value or metric instance value as a float.

Parameters

instance (int, optional) – If provided, the index of the instance value to retrieve instead of the metric value.

Returns

The metric value or metric instance value requested. If the value cannot be cast to a float, this function will return 0.0.

Return type

float

as_string(*args)

Get the metric value or metric instance value as a str.

Parameters

instance (int, optional) – If provided, the index of the instance value to retrieve instead of the metric value.

Returns

The metric value or metric instance value requested. If the value cannot be cast to a str, this function will return None.

Return type

str | None

as_uint64(*args)

Get the metric value or metric instance value as an int.

Parameters

instance (int, optional) – If provided, the index of the instance value to retrieve instead of the metric value.

Returns

The metric value or metric instance value requested. If the value cannot be cast to an int, this function will return 0.

Return type

int

correlation_ids()

Get a metric object for this metric’s instance value’s correlation IDs.

Returns a new IMetric representing the correlation IDs for the metric’s instance values. Use IMetric.has_correlation_ids to check if this metric has correlation IDs for its instance values. Correlation IDs are used to associate instance values with the instance their value represents. In the returned new metric object, the correlation IDs are that object’s instance values.

If the metric does not have any correlation IDs, this function will return None.

Returns

The new IMetric object representing the correlation IDs for this metric’s instance values or None if the metric has no correlation IDs.

Return type

IMetric | None

description()

Get the metric description.

Returns

The description of the metric.

Return type

str

has_correlation_ids()

Check if the metric has correlation IDs.

Returns

True if the metric has correlation IDs matching its instance values, False otherwise.

Return type

bool

has_value(*args)

Check if the metric or metric instance has a value.

Parameters

instance (int, optional) – If provided, the index of the instance metric to check.

Returns

True if the metric or metric instance has a value, False otherwise.

Return type

bool

kind(*args)

Get the metric or metric instance value kind.

Parameters

instance (int, optional) – If provided, the index of the instance metric to get the value kind for.

Returns

The metric or metric instance value kind.

Return type

int

metric_subtype()

Get the metric subtype.

Returns

The metric subtype.

Return type

int

metric_type()

Get the metric type.

Returns

The metric type.

Return type

int

name()

Get the metric name.

Returns

The metric name.

Return type

str

num_instances()

Get the number of instance values for this metric.

Not all metrics have instance values. If a metric has instance values, it may also have IMetric.correlation_ids matching these instance values.

Returns

The number of instances for this metric.

Return type

int

rollup_operation()

Get the type of rollup operation for this metric.

Returns

The rollup operation type.

Return type

int

unit()

Get the metric unit.

Returns

The metric unit.

Return type

str

value(idx=None)

Get the value of this IMetric.

This is a convenience function that will wrap the logic of invoking the correct IMetric.as_* method based on the value kind of this IMetric.

Parameters

idx (int, optional) – The index of the value to get.

Returns

The value of this IMetric as str, int or float. If no value is available, this will return None.

Return type

str | int | float | None

class ncu_report.INvtxDomainInfo

Represents a single NVTX domain of the NVTX state, including all ranges associated with this domain.

__str__()

Get a human-readable representation of this INvtxDomainInfo.

Returns

The name of the INvtxDomainInfo.

Return type

str

name()

Get a human-readable representation of this INvtxDomainInfo.

Returns

The name of the INvtxDomainInfo.

Return type

str

push_pop_range(idx)

Get a push/pop range object by index.

The index is identical to the range’s order on the call stack.

Returns

The requested INvtxRange or None if the index is out of range.

Return type

INvtxRange | None

push_pop_ranges()

Get a sorted list of push/pop range names.

Get the sorted list of stacked push/pop range names in this domain, associated with the current INvtxState.

Returns

The sorted names of all push/pop ranges.

Return type

tuple of str

start_end_range(idx)

Get a start/end range object by index.

Returns

The requested INvtxRange or None if the index is out of range.

Return type

INvtxRange | None

start_end_ranges()

Get a sorted list of start/end range names.

Get the sorted list of start/end range names in this domain, associated with the current INvtxState.

Returns

The sorted names of all start/end ranges.

Return type

tuple of str

class ncu_report.INvtxRange

Represents a single NVTX Push/Pop or Start/End range.

PayloadType_PAYLOAD_UNKNOWN

Payload type for ranges of unknown type.

Type

int

PayloadType_PAYLOAD_INT32

Payload type for ranges of INT32 type.

Type

int

PayloadType_PAYLOAD_INT64

Payload type for ranges of INT64 type.

Type

int

PayloadType_PAYLOAD_UINT32

Payload type for ranges of UINT32 type.

Type

int

PayloadType_PAYLOAD_UINT64

Payload type for ranges of UINT64 type.

Type

int

PayloadType_PAYLOAD_FLOAT

Payload type for ranges of float type.

Type

int

PayloadType_PAYLOAD_DOUBLE

Payload type for ranges of double type.

Type

int

PayloadType_PAYLOAD_JSON

Payload type for ranges of JSON type.

Type

int

category()

Get the category attribute value.

Returns

The category attribute value. If INvtxRange.has_attributes returns False, this will return 0.

Return type

int

color()

Get the color attribute value.

Returns

The color attribute value. If INvtxRange.has_attributes returns False, this will return 0.

Return type

int

has_attributes()

Check if range has event attributes.

Returns

True if the range has event attributes, False otherwise.

Return type

bool

message()

Get the message attribute value.

Returns

The message attribute value. If INvtxRange.has_attributes returns False, this will return the empty string.

Return type

str

name()

Get the range’s name.

Returns

The range’s name.

Return type

str

payload_as_double()

Get the payload attribute value as a float.

Returns

The payload attribute’s value as a float. If the value cannot be cast to a float, this function will return 0.0.

Return type

float

payload_as_string()

Get the payload attribute value as a str.

Returns

The payload attribute’s value as a str. If the value cannot be cast to a str, this function will return the empty string.

Return type

str

payload_as_uint64()

Get the payload attribute value as an int.

Returns

The payload attribute’s value as an int. If the value cannot be cast to an int, this function will return 0.

Return type

int

payload_type()

Get the payload type as an int.

Returns

The payload type.

Return type

int

class ncu_report.INvtxState

Represents the NVTX (NVIDIA Tools Extension) state associated with a single IAction.

__getitem__(key)

Get an INvtxDomainInfo object by ID.

Parameters

key (int) – The ID of the INvtxDomainInfo object.

Returns

An INvtxDomainInfo object.

Return type

INvtxDomainInfo

Raises
__iter__()

Get an iterator over the INvtxDomainInfo objects of this INvtxState.

Returns

An iterator over the INvtxDomainInfo objects.

Return type

iterator

__len__()

Get the number of INvtxDomainInfo objects of this INvtxState.

Returns

The number of INvtxDomainInfo objects.

Return type

int

domain_by_id(id)

Get an INvtxDomainInfo object by ID.

Use INvtxState.domains to retrieve the list of valid domain IDs.

Parameters

id (int) – The ID of the requested domain.

Returns

The requested INvtxDomainInfo object.

Return type

INvtxDomainInfo | None

domains()

Get the list of domain IDs in this state.

Returns

The tuple of valid domain IDs.

Return type

tuple of int

class ncu_report.IRange

Represents a serial, ordered stream of execution, such as a CUDA stream. It holds one or more actions that were logically executing in this range.

__getitem__(key)

Get one or more IAction objects by index or by slice.

Parameters

key (int | slice) – The index or slice to retrieve.

Returns

An IAction object or a tuple of IAction objects.

Return type

IAction | tuple of IAction

Raises
__iter__()

Get an iterator over the IAction objects of this IRange.

Returns

An iterator over the IAction objects.

Return type

iterator of IAction

__len__()

Get the number of IAction objects in this IRange.

Returns

The number of IAction objects.

Return type

int

action_by_idx(idx)

Get an IAction object by index.

Parameters

idx (int) – The index to retrieve.

Returns

An IAction object or None if the index is out of range.

Return type

IAction | None

actions_by_nvtx(includes, excludes)

Get a set of indices to IAction objects by their NVTX state. The state is defined using a series of includes and excludes.

Parameters
  • includes (iterable of str) – The NVTX states the result should be part of.

  • excludes (iterable of str) – The NVTX states the result should not be part of.

Returns

A tuple of indices to IAction matching the desired NVTX state.

Return type

tuple of int

num_actions()

Get the number of IAction objects in this IRange.

Returns

The number of IAction objects.

Return type

int

class ncu_report.IRuleResult

The IRuleResult represents rule results.

__str__()

Get a human-readable name of the rule result.

Returns

The name of the rule the IRuleResult object represents.

Return type

str

focus_metrics()

Get all the focus metrics details.

Returns

A list of focus metrics details. Each focus metric dictionary contains the following key-value pairs:
  • ’name’ : (str) The name of the focus metric.

  • ’value’ : (float) The value of the focus metric.

  • ’severity’ : (FocusSeverity) The severity of the focus metric.

  • ’info’ : (str) The information about the focus metric.

Return type

list of dict

has_rule_message()

Check if the rule has a message.

Returns

True if rule message present, False otherwise.

Return type

bool

has_speedup_estimation()

Check if the rule has speedup estimation.

Returns

True if the rule has speedup estimation, False otherwise.

Return type

bool

name()

Get the rule name.

Returns

The rule name.

Return type

str

result_tables()

Get all the result tables.

Returns

A list of result tables. Each result table dictionary contains the following key-value pairs:
  • ’title’ : (str) The title of the result table.

  • ’description’ : (str) The description of the result table.

  • ’headers’ : (list of str) The column headers of the result table.

  • ’data’ : (list of list [int | float | str | Any ]) The table data in row-major format. Each column has elements of the same type.

    str values may contain substrings with the following special link formats:
    • @url:<hypertext>:<external link>@ - To add an external link for a hypertext.

    • @sass:<address>:<hypertext>@ - To add a link to the hypertext to open the SASS address line on the Source page.

    • @source:<file name>:<line number>:<hypertext>@ - To add a link to the hypertext to open the source file at the specified line number on the Source page.

    • @section:<section identifier>:<hypertext>@ - To add a link to the hypertext to jump to the respective section.

Return type

list of dict

rule_identifier()

Get the rule identifier.

Returns

The rule identifier.

Return type

str

rule_message()

Get the rule message.

Returns

A dictionary with the following key-value pairs:
  • ’title’ : (str) The rule message title.

  • ’message’ : (str) The rule message.

    The message may contain substrings with the following special link formats:
    • @url:<hypertext>:<external link>@ - To add an external link for a hypertext.

    • @sass:<address>:<hypertext>@ - To add a link to the hypertext to open the SASS address line on the Source page.

    • @source:<file name>:<line number>:<hypertext>@ - To add a link to the hypertext to open the source file at the specified line number on the Source page.

    • @section:<section identifier>:<hypertext>@ - To add a link to the hypertext to jump to the respective section.

  • ’type’ : (MsgType) The message type.

Return type

dict
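When rule messages are printed outside of the NVIDIA Nsight Compute UI, the link markup listed above may need to be reduced to its hypertext. The sketch below does this with a regular expression; the helper name strip_links is hypothetical, and the code assumes that neither the hypertext nor a file name, address or section identifier contains ':' or '@' (only the external link of an @url entry may contain ':').

```python
import re

# Matches one link of the form @<kind>:<fields>@ for the four documented kinds.
_LINK = re.compile(r"@(url|sass|source|section):([^@]*)@")


def strip_links(text):
    """Replace every documented link with its hypertext only."""
    def _hypertext(match):
        kind, body = match.group(1), match.group(2)
        if kind == "url":
            # @url:<hypertext>:<external link>@ puts the hypertext first.
            return body.split(":", 1)[0]
        if kind == "source":
            # @source:<file name>:<line number>:<hypertext>@
            parts = body.split(":", 2)
            return parts[2] if len(parts) == 3 else body
        # @sass:<address>:<hypertext>@ and @section:<section identifier>:<hypertext>@
        parts = body.split(":", 1)
        return parts[1] if len(parts) == 2 else body
    return _LINK.sub(_hypertext, text)


# Typical use (requires the ncu_report module):
# print(strip_links(rule_result.rule_message()["message"]))
```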

section_identifier()

Get the section identifier.

Returns

The section identifier.

Return type

str

speedup_estimation()

Get the speedup estimation.

Returns

A dictionary with the following key-value pairs:
  • ’type’ : (SpeedupType) The speedup type.

  • ’speedup’ : (float) The estimated speedup.

Return type

dict

class ncu_report.ISourceInfo

Represents the source correlation information for a specific function address within an action.

file_name()

Get the file name, as embedded in the correlation information.

Returns

The file name.

Return type

str

line()

Get the line number within the file.

Returns

The line number.

Return type

int

class ncu_report.MarkerKind

Enum representing the kind of a source marker.

SASS

The marker will be associated with a SASS instruction.

SOURCE

The marker will be associated with a Source line.

NONE

No specific kind of marker.

class ncu_report.MsgType

Enum representing the type of the message.

NONE

No specific type for this message.

OK

The message is informative.

OPTIMIZATION

The message represents a suggestion for performance optimization.

WARNING

The message represents a warning or fixable issue.

ERROR

The message represents an error, potentially in executing the rule.

class ncu_report.SpeedupType

Enum representing the type of speedup estimation.

UNKNOWN

Unknown speedup type.

LOCAL

Value represents increase in hardware efficiency in isolated context.

GLOBAL

Value represents decrease in overall kernel runtime.

ncu_report.load_report(file_name)

Load an NVIDIA Nsight Compute report file into an IContext object.

Parameters

file_name (str | pathlib.Path) – The relative or absolute path to the .ncu-rep report file.

Returns

An IContext object representing the loaded report file.

Return type

IContext

Raises

FileNotFoundError – Either if file_name does not exist or if the NVIDIA Nsight Compute library directory cannot be found.
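Scripts that process several reports may want to handle the documented FileNotFoundError instead of aborting. The following is a minimal sketch; the helper name load_or_none is hypothetical, and the loader is passed in as an argument (normally ncu_report.load_report) so that the helper itself does not depend on the module being importable.

```python
import sys


def load_or_none(file_name, loader):
    """Call loader(file_name); print a hint and return None if the file is not found."""
    try:
        return loader(file_name)
    except FileNotFoundError as err:
        print("could not load report: {}".format(err), file=sys.stderr)
        return None


# Typical use (requires the ncu_report module):
# context = load_or_none("my_report.ncu-rep", ncu_report.load_report)
# if context is not None:
#     print(context.num_ranges())
```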

Notices

ALL NVIDIA DESIGN SPECIFICATIONS, REFERENCE BOARDS, FILES, DRAWINGS, DIAGNOSTICS, LISTS, AND OTHER DOCUMENTS (TOGETHER AND SEPARATELY, “MATERIALS”) ARE BEING PROVIDED “AS IS.” NVIDIA MAKES NO WARRANTIES, EXPRESSED, IMPLIED, STATUTORY, OR OTHERWISE WITH RESPECT TO THE MATERIALS, AND EXPRESSLY DISCLAIMS ALL IMPLIED WARRANTIES OF NONINFRINGEMENT, MERCHANTABILITY, AND FITNESS FOR A PARTICULAR PURPOSE.

Information furnished is believed to be accurate and reliable. However, NVIDIA Corporation assumes no responsibility for the consequences of use of such information or for any infringement of patents or other rights of third parties that may result from its use. No license is granted by implication or otherwise under any patent rights of NVIDIA Corporation. Specifications mentioned in this publication are subject to change without notice. This publication supersedes and replaces all other information previously supplied. NVIDIA Corporation products are not authorized as critical components in life support devices or systems without express written approval of NVIDIA Corporation.

Trademarks

NVIDIA and the NVIDIA logo are trademarks or registered trademarks of NVIDIA Corporation in the U.S. and other countries. Other company and product names may be trademarks of the respective companies with which they are associated.