2. Python Report Interface
2.1. Introduction
NVIDIA Nsight Compute features a Python-based interface to interact with exported report files.
The module is called ncu_report and works on any Python version from 3.7
onward [1]. It can be found in the extras/python directory of your NVIDIA
Nsight Compute package.
In order to use the Python module, you need a report file generated by NVIDIA
Nsight Compute. You can obtain such a file by saving it from the graphical
interface or by using the --export flag of the command line tool.
The types and functions in the ncu_report module are a subset of the ones
available in the NvRules API. The documentation in this section serves as a
tutorial. For a more formal description of the exposed API, please refer to the
NvRules API documentation.
[1] On Linux machines you will also need a GNU-compatible libc and libgcc_s.so.
2.2. Basic Usage
In order to be able to import ncu_report you will either have to navigate
to the extras/python directory, or add its absolute path to the
PYTHONPATH environment variable. Then, the module can be imported like any
Python module:
>>> import ncu_report
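As an alternative to setting PYTHONPATH, the module directory can be added to the search path from within a script. The following is a minimal sketch of that idea; the helper name add_module_dir and the example path in the comments are hypothetical and must be adapted to your installation.

```python
import sys
from pathlib import Path


def add_module_dir(extras_python_dir):
    """Prepend the directory that contains ncu_report to sys.path (only once)."""
    resolved = str(Path(extras_python_dir).expanduser().resolve())
    if resolved not in sys.path:
        sys.path.insert(0, resolved)
    return resolved


# Hypothetical location; substitute the extras/python directory of your installation.
# add_module_dir("/path/to/nsight-compute/extras/python")
# import ncu_report
```

Prepending rather than appending makes this directory take precedence over any other module with the same name on the search path.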
Importing a report
Once the module is imported, you can load a report file by calling the
load_report function with the path to the file. This function returns an
object of type IContext which holds all the information concerning that
report.
>>> my_context = ncu_report.load_report("my_report.ncu-rep")
Querying ranges
When working with the Python module, profiling results are grouped into ranges
which are represented by IRange objects. You can inspect the number of
ranges contained in the loaded report by calling the
IContext.num_ranges member function of an IContext object and
retrieve a range by its index using IContext.range_by_idx.
>>> my_context.num_ranges()
1
>>> my_range = my_context.range_by_idx(0)
Querying actions
Inside a range, profiling results are called actions. You can query the
number of actions contained in a given range by using the
IRange.num_actions method of an IRange object.
>>> my_range.num_actions()
2
In the same way ranges can be obtained from an IContext object by
using the IContext.range_by_idx method, individual
actions can be obtained from IRange objects by using the
IRange.action_by_idx method. The resulting actions are represented by
the IAction class.
>>> my_action = my_range.action_by_idx(0)
As mentioned previously, an action represents a single profiling result. To
query the workload’s name you can use the IAction.name member function
of the IAction class.
>>> my_action.name()
'MyKernel'
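The range and action queries shown so far can be combined into a single loop over the whole report. The sketch below uses only the methods introduced above (num_ranges, range_by_idx, num_actions, action_by_idx and name); the function name list_action_names is made up for this illustration.

```python
def list_action_names(context):
    """Return the names of all actions in a loaded report, range by range."""
    names = []
    for range_idx in range(context.num_ranges()):
        current_range = context.range_by_idx(range_idx)
        for action_idx in range(current_range.num_actions()):
            names.append(current_range.action_by_idx(action_idx).name())
    return names


# Usage with the context loaded earlier in this section:
# print(list_action_names(my_context))
```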
To query the workload type, you can use the IAction.workload_type member
function of the IAction class, which will return an integer among the
IAction.WorkloadType_* values.
For CUDA kernels, you get the following result:
>>> my_action.workload_type()
0
All return values of IAction.workload_type are documented in the API
documentation for the IAction class.
Querying metrics
To get a tuple of all metric names contained within an action you can use the
IAction.metric_names method. It is meant to be combined with the
IAction.metric_by_name method which returns an IMetric object.
However, for the same task you may also use the subscript operator through the
IAction.__getitem__ method, as explained in the Python Interoperability section below.
The metric names displayed here are the same as the ones you can use with the
--metrics flag of NVIDIA Nsight Compute. Once you have extracted a metric
from an action, you can obtain its value by using one of the following three
methods:
- IMetric.as_string to obtain its value as a Python str
- IMetric.as_uint64 to obtain its value as a Python int
- IMetric.as_double to obtain its value as a Python float
For example, to print the display name of the GPU on which the workload was
profiled you can query the device__attribute_display_name metric.
>>> display_name_metric = my_action.metric_by_name('device__attribute_display_name')
>>> display_name_metric.as_string()
'NVIDIA GeForce RTX 3060 Ti'
Note that accessing a metric with the wrong type can lead to unexpected (conversion) results.
>>> display_name_metric.as_double()
0.0
Therefore, it is advisable to use the high-level function
IMetric.value directly, as explained below.
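When a typed accessor is needed anyway, the choice among the three accessors above can be made explicit. This is only a sketch: the helper read_metric is hypothetical, and the check for a missing metric is a defensive assumption, since this section does not state what metric_by_name returns for an unknown name.

```python
def read_metric(action, metric_name, python_type):
    """Read one metric with the accessor that matches the requested Python type."""
    metric = action.metric_by_name(metric_name)
    if metric is None:  # assumption: guard against an unknown metric name
        raise KeyError(metric_name)
    accessors = {
        str: metric.as_string,
        int: metric.as_uint64,
        float: metric.as_double,
    }
    if python_type not in accessors:
        raise TypeError(f"unsupported type: {python_type!r}")
    return accessors[python_type]()


# Usage with the action obtained earlier in this section:
# read_metric(my_action, "device__attribute_display_name", str)
```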
2.3. Python Interoperability
On top of the low-level NvRules API the Python Report Interface also implements part of the Python object model. By implementing special methods, the Python Report Interface’s exposed classes can be used with built-in Python mechanisms such as iteration, string formatting and length querying.
This allows you to access metrics objects via the IAction.__getitem__
method of the IAction class:
>>> display_name_metric = my_action["device__attribute_display_name"]
There is also a convenience method IMetric.value which allows you to
query the value of a metric object without knowledge of its type:
>>> display_name_metric.value()
'NVIDIA GeForce RTX 3060 Ti'
All the available methods of a class, as well as their associated Python docstrings, can be looked up interactively via
>>> help(ncu_report.IMetric)
or similarly for other classes and methods. In your code, you can access the
docstrings via the __doc__ attribute, i.e.
ncu_report.IMetric.value.__doc__.
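The special methods make it possible to write report traversals with plain Python idioms. The sketch below relies on len() of an IRange giving its number of actions and on the subscript operator of IAction, both of which are described in this document; that iterating over an IContext yields IRange objects is an assumption based on the __iter__ and __getitem__ entries in the API reference. The function names are hypothetical.

```python
def count_actions(context):
    """Count the actions of a report using iteration and len()."""
    return sum(len(current_range) for current_range in context)


def metric_values(action, metric_names):
    """Look up several metrics by subscript and return their values by name."""
    return {name: action[name].value() for name in metric_names}


# Usage with the objects from the Basic Usage section:
# count_actions(my_context)
# metric_values(my_action, ["device__attribute_display_name"])
```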
2.4. Metric attributes
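Apart from the possibility to query the IMetric.name and
IMetric.value of an IMetric object, you can also query the
following additional metric attributes: IMetric.metric_type,
IMetric.metric_subtype, IMetric.rollup_operation, IMetric.unit and
IMetric.description.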
The first method IMetric.metric_type returns one out of three enum
values (IMetric.MetricType_COUNTER, IMetric.MetricType_RATIO,
IMetric.MetricType_THROUGHPUT) if the metric is a hardware metric, or
IMetric.MetricType_OTHER otherwise (e.g. for launch or device
attributes).
The method IMetric.metric_subtype returns an enum value representing
the subtype of a metric (e.g. IMetric.MetricSubtype_PEAK_SUSTAINED,
IMetric.MetricSubtype_PER_CYCLE_ACTIVE). In case a metric does not have
a subtype, None is returned. All available values may be found in the
documentation for the IMetric class, or may be looked up interactively
by executing help(ncu_report.IMetric).
IMetric.rollup_operation returns the operation which is used to
accumulate different values of the same metric and can be one of
IMetric.RollupOperation_AVG, IMetric.RollupOperation_MAX,
IMetric.RollupOperation_MIN or IMetric.RollupOperation_SUM for
averaging, maximum, minimum or summation, respectively. If the metric in
question does not specify a rollup operation None will be returned.
Lastly, IMetric.unit and IMetric.description return a (possibly
empty) string of the metric’s unit and a short textual description for
hardware metrics, respectively.
The above methods can be combined to filter through all metrics of a report, given certain criteria:
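# Collect the IMetric objects of the action obtained in the Basic Usage section.
metrics = [my_action.metric_by_name(name) for name in my_action.metric_names()]
IMetric = ncu_report.IMetric
for metric in metrics:
    if metric.metric_type() == IMetric.MetricType_COUNTER and \
       metric.metric_subtype() == IMetric.MetricSubtype_PER_SECOND and \
       metric.rollup_operation() == IMetric.RollupOperation_AVG:
        print(f"{metric.name()}: {metric.value()} {metric.unit()}")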
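The attribute queries can also be used to organize metrics rather than filter them. The following sketch formats a metric while accounting for a possibly empty unit string, and groups metric names by their rollup operation (which may be None, as noted above). Both helper names are hypothetical.

```python
from collections import defaultdict


def format_metric(metric):
    """Format 'name: value', appending the unit only when it is non-empty."""
    text = f"{metric.name()}: {metric.value()}"
    unit = metric.unit()
    if unit:
        text += f" {unit}"
    return text


def group_by_rollup(metrics):
    """Map each rollup operation (or None) to the names of the metrics using it."""
    groups = defaultdict(list)
    for metric in metrics:
        groups[metric.rollup_operation()].append(metric.name())
    return dict(groups)
```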
2.5. NVTX Support
The ncu_report module has support for the NVIDIA Tools Extension (NVTX). This
comes through the INvtxState object which represents the NVTX state of
a profiled kernel.
An INvtxState object can be obtained from an action by using its
IAction.nvtx_state method. It exposes the INvtxState.domains
method which returns a tuple of integers representing the domains this kernel
has state in. These integers can be used with the
INvtxState.domain_by_id method to get an INvtxDomainInfo object
which represents the state of a domain.
The INvtxDomainInfo can be used to obtain a tuple of Push-Pop, or
Start-End ranges using the INvtxDomainInfo.push_pop_ranges and
INvtxDomainInfo.start_end_ranges methods.
There is also an IRange.actions_by_nvtx member function in the
IRange class which allows you to get the indices of the actions matching the
NVTX state described in its parameters.
The parameters for the IRange.actions_by_nvtx function are two lists of
strings representing the state for which we want to query the actions. The first
parameter describes the NVTX states to include while the second one describes
the NVTX states to exclude. These strings are in the same format as the ones
used with the --nvtx-include and --nvtx-exclude options.
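The chain from an action to its NVTX ranges described above can be sketched as follows. The function uses only nvtx_state, domains, domain_by_id, name and push_pop_ranges as described in this document; the function name push_pop_ranges_by_domain is hypothetical, and the handling of a missing state follows the note in the API reference that nvtx_state may return None.

```python
def push_pop_ranges_by_domain(action):
    """Map each NVTX domain name of an action to its Push-Pop range names."""
    state = action.nvtx_state()
    if state is None:  # no NVTX state is associated with this action
        return {}
    result = {}
    for domain_id in state.domains():
        domain = state.domain_by_id(domain_id)
        result[domain.name()] = list(domain.push_pop_ranges())
    return result


# Usage with the action obtained in the Basic Usage section:
# print(push_pop_ranges_by_domain(my_action))
```

Start-End ranges could be collected the same way by calling start_end_ranges instead of push_pop_ranges.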
2.6. Sample Script
NVTX Push-Pop range filtering
This is a sample script which loads a report and prints the names of all the
profiled kernels which were wrapped inside BottomRange and TopRange
Push-Pop ranges of the default NVTX domain.
#!/usr/bin/env python3
import sys
import ncu_report
if len(sys.argv) != 2:
    print("usage: {} report_file".format(sys.argv[0]), file=sys.stderr)
    sys.exit(1)
report = ncu_report.load_report(sys.argv[1])
for range_idx in range(report.num_ranges()):
    current_range = report.range_by_idx(range_idx)
    for action_idx in current_range.actions_by_nvtx(["BottomRange/*/TopRange"], []):
        action = current_range.action_by_idx(action_idx)
        print(action.name())
2.7. API Reference
This documents the content of the ncu_report package which can be found
in the extras/python directory of your NVIDIA Nsight Compute installation.
- class ncu_report.FocusSeverity
  Enum representing the severity of a focus metric.
  - DEFAULT
    Default severity.
  - LOW
    Low severity.
  - HIGH
    High severity.
- class ncu_report.IAction
  The IAction represents a profile result such as a CUDA kernel in a single range or a range itself in range-based profiling, for which zero or more metrics were collected.
  - __iter__()
  - __len__()
  - name(*args)
    Get the name of the result the IAction object represents.
    Parameters:
      name_base (int, optional) – The desired name base. Defaults to NameBase_FUNCTION.
    Returns:
      The name of the result (potentially in a specific name base).
  - nvtx_state()
    Get the NVTX state associated with this action.
    Returns:
      The associated INvtxState or None if no state is associated.
  - ptx_by_pc(address)
    Get the PTX associated with an address.
  - rule_results()
    Get the tuple of all rules associated with this action.
    Returns:
      A tuple of rule results.
    Return type:
      tuple of IRuleResult
  - rule_results_as_dicts()
    Get the list of all rules data associated with this action.
    Returns:
      A list of rule data dictionaries. Each rule data dictionary contains the following key-value pairs:
      - 'rule_identifier' : (str) The rule identifier.
      - 'name' : (str) The rule name.
      - 'section_identifier' : (str) The section identifier.
      - 'focus_metrics' : (list of dict) A list of focus metrics. Each focus metric dictionary contains the following key-value pairs:
        - 'name' : (str) The name of the focus metric.
        - 'value' : (float) The value of the focus metric.
        - 'severity' : (FocusSeverity) The severity of the focus metric.
        - 'info' : (str) The information about the focus metric.
      - 'speedup_estimation' : (dict) with the following key-value pairs:
        - 'type' : (SpeedupType) The speedup type.
        - 'speedup' : (float) The speedup value.
      - 'result_tables' : (dict) with the following key-value pairs:
        - 'title' : (str) The title of the result table.
        - 'description' : (str) The description of the result table.
        - 'headers' : (list of str) The column headers of the result table.
        - 'data' : (list of list[int | float | str | Any]) The table data in row-major format. Each column has elements of the same type. str values support the following link formats:
          - @url:<hypertext>:<external link>@ - To add an external link for a hypertext.
          - @sass:<address>:<hypertext>@ - To add a link to the hypertext to open the SASS address line on the Source page.
          - @source:<file name>:<line number>:<hypertext>@ - To add a link to the hypertext to open the source file at the specified line number on the Source page.
          - @section:<section identifier>:<hypertext>@ - To add a link to the hypertext to jump to the respective section.
  - sass_by_pc(address)
    Get the SASS associated with an address.
  - source_files()
    Get the source files associated with this action along with their content.
    If content is not available for a file (e.g. because it hadn’t been imported into the report), the file name will map to an empty string.
  - source_info(address)
    Get the source info for a function address within this action.
    Addresses are commonly obtained as correlation IDs of source-correlated metrics.
    Parameters:
      address (int) – The address to get source info for.
    Returns:
      The ISourceInfo associated to the given address. If no source info is available, None is returned.
  - source_markers()
    Get all the source markers associated with this action.
    Returns:
      A tuple of source data dictionaries. Each source data dict contains the following key-value pairs:
      - 'rule_identifier' : (str) The rule identifier.
      - 'section_identifier' : (str) The section identifier.
      - 'kind' : (int) MarkerKind attribute.
      - 'message' : (str) The source marker message.
      - 'source_address' : (int) Source instruction address. Key available only for kind MarkerKind.SASS.
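The dictionaries returned by IAction.rule_results_as_dicts can be processed with ordinary dictionary operations. The sketch below turns them into printable lines, using only the keys listed in the entry above; the function name summarize_rules is hypothetical, and the use of dict.get assumes that some keys may be absent or empty for rules without a speedup estimate or focus metrics.

```python
def summarize_rules(rule_dicts):
    """Build one line per rule, followed by one indented line per focus metric."""
    lines = []
    for rule in rule_dicts:
        line = f"{rule['name']} [{rule['section_identifier']}]"
        speedup = rule.get("speedup_estimation")  # assumption: may be missing or empty
        if speedup:
            line += f": estimated speedup {speedup['speedup']:.2f}"
        lines.append(line)
        for focus in rule.get("focus_metrics", []):
            lines.append(f"  focus metric {focus['name']} = {focus['value']}")
    return lines


# Usage with an IAction object named my_action:
# print("\n".join(summarize_rules(my_action.rule_results_as_dicts())))
```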
- class ncu_report.IContext
  The IContext class is the top-level object representing an open report. It can be created by calling the load_report function.
  - __getitem__(key)
    Get one or more IRange objects by index or by slice.
    Returns:
      IRange | tuple of IRange: An IRange object or a tuple of IRange objects.
    Raises:
      IndexError – If key is out of range for the IContext.
  - __iter__()
  - __len__()
  - num_ranges()
- class ncu_report.IMetric
  Represents a single, named metric. An IMetric can carry one value or multiple ones if it is an instanced metric.
  - MetricSubtype_PEAK_SUSTAINED_ACTIVE_PER_SECOND
    Metric subtype for peak sustained active per-second metrics.
  - MetricSubtype_PEAK_SUSTAINED_ELAPSED_PER_SECOND
    Metric subtype for peak sustained elapsed per-second metrics.
  - MetricSubtype_PCT_OF_PEAK_SUSTAINED_ACTIVE
    Metric subtype for percentage of peak sustained active metrics.
  - MetricSubtype_PCT_OF_PEAK_SUSTAINED_ELAPSED
    Metric subtype for percentage of peak sustained elapsed metrics.
  - correlation_ids()
    Get a metric object for this metric’s instance value’s correlation IDs.
    Returns a new IMetric representing the correlation IDs for the metric’s instance values. Use IMetric.has_correlation_ids to check if this metric has correlation IDs for its instance values. Correlation IDs are used to associate instance values with the instance their value represents. In the returned new metric object, the correlation IDs are that object’s instance values.
    If the metric does not have any correlation IDs, this function will return None.
  - has_correlation_ids()
    Check if the metric has correlation IDs.
  - has_value(*args)
    Check if the metric or metric instance has a value.
  - kind(*args)
    Get the metric or metric instance value kind.
  - num_instances()
    Get the number of instance values for this metric.
    Not all metrics have instance values. If a metric has instance values, it may also have IMetric.correlation_ids matching these instance values.
    Returns:
      The number of instances for this metric.
  - rollup_operation()
    Get the type of rollup operation for this metric.
    Returns:
      The rollup operation type.
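For instanced metrics, the instance values and their correlation IDs described in the IMetric entries above can be paired up. The following is a sketch under one explicit assumption: that IMetric.value accepts an optional instance index, which is not shown in this excerpt. The function name instance_values is hypothetical.

```python
def instance_values(metric):
    """Pair each instance value of a metric with its correlation ID.

    Falls back to the plain instance index when the metric has no
    correlation IDs.
    """
    ids = metric.correlation_ids() if metric.has_correlation_ids() else None
    pairs = []
    for idx in range(metric.num_instances()):
        # Assumption: value(idx) returns the value of instance number idx.
        key = ids.value(idx) if ids is not None else idx
        pairs.append((key, metric.value(idx)))
    return pairs
```

For source-correlated metrics, the correlation IDs obtained this way are commonly the addresses that IAction.source_info expects.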
- class ncu_report.INvtxDomainInfo
  Represents a single NVTX domain of the NVTX state, including all ranges associated with this domain.
  - __str__()
    Get a human-readable representation of this INvtxDomainInfo.
    Returns:
      The name of the INvtxDomainInfo.
  - name()
    Get a human-readable representation of this INvtxDomainInfo.
    Returns:
      The name of the INvtxDomainInfo.
  - push_pop_range(idx)
    Get a push/pop range object by index.
    The index is identical to the range’s order on the call stack.
    Returns:
      The requested INvtxRange or None if the index is out of range.
  - push_pop_ranges()
    Get a sorted list of push/pop range names.
    Get the sorted list of stacked push/pop range names in this domain, associated with the current INvtxState.
  - start_end_range(idx)
    Get a start/end range object by index.
    Returns:
      The requested INvtxRange or None if the index is out of range.
  - start_end_ranges()
    Get a sorted list of start/end range names.
    Get the sorted list of start/end range names in this domain, associated with the current INvtxState.
- class ncu_report.INvtxRange
  Represents a single NVTX Push/Pop or Start/End range.
  - category()
    Get the category attribute value.
    Returns:
      The category attribute value. If INvtxRange.has_attributes returns False, this will return 0.
  - color()
    Get the color attribute value.
    Returns:
      The color attribute value. If INvtxRange.has_attributes returns False, this will return 0.
  - has_attributes()
    Check if range has event attributes.
  - message()
    Get the message attribute value.
    Returns:
      The message attribute value. If INvtxRange.has_attributes returns False, this will return the empty string.
- class ncu_report.INvtxState
  Represents the NVTX (NVIDIA Tools Extension) state associated with a single IAction.
  - __getitem__(key)
    Get an INvtxDomainInfo object by ID.
    Parameters:
      key (int) – The ID of the INvtxDomainInfo object.
    Returns:
      An INvtxDomainInfo object.
  - __iter__()
    Get an iterator over the INvtxDomainInfo objects of this INvtxState.
    Returns:
      An iterator over the INvtxDomainInfo objects.
  - __len__()
    Get the number of INvtxDomainInfo objects of this INvtxState.
    Returns:
      The number of INvtxDomainInfo objects.
  - domain_by_id(id)
    Get an INvtxDomainInfo object by ID.
    Use INvtxState.domains to retrieve the list of valid domain IDs.
    Parameters:
      id (int) – The ID of the requested domain.
    Returns:
      The requested INvtxDomainInfo object.
- class ncu_report.IRange
  Represents a serial, ordered stream of execution, such as a CUDA stream. It holds one or more actions that were logically executing in this range.
  - __getitem__(key)
  - __len__()
    Get the number of IAction objects in this IRange.
    Returns:
      The number of IAction objects.
  - actions_by_nvtx(includes, excludes)
    Get a set of indices to IAction objects by their NVTX state. The state is defined using a series of includes and excludes.
- class ncu_report.IRuleResult
  The IRuleResult represents rule results.
  - __str__()
    Get a human-readable name of the rule result.
    Returns:
      The name of the rule result the IRuleResult object represents.
  - focus_metrics()
    Get all the focus metrics details.
    Returns:
      A list of focus metrics details. Each focus metric dictionary contains the following key-value pairs:
      - 'name' : (str) The name of the focus metric.
      - 'value' : (float) The value of the focus metric.
      - 'severity' : (FocusSeverity) The severity of the focus metric.
      - 'info' : (str) The information about the focus metric.
  - has_rule_message()
    Check if the rule has a message.
  - has_speedup_estimation()
    Check if the rule has speedup estimation.
  - result_tables()
    Get all the result tables.
    Returns:
      A list of result tables. Each result table dictionary contains the following key-value pairs:
      - 'title' : (str) The title of the result table.
      - 'description' : (str) The description of the result table.
      - 'headers' : (list of str) The column headers of the result table.
      - 'data' : (list of list[int | float | str | Any]) The table data in row-major format. Each column has elements of the same type. str values may contain substrings with the following special link formats:
        - @url:<hypertext>:<external link>@ - To add an external link for a hypertext.
        - @sass:<address>:<hypertext>@ - To add a link to the hypertext to open the SASS address line on the Source page.
        - @source:<file name>:<line number>:<hypertext>@ - To add a link to the hypertext to open the source file at the specified line number on the Source page.
        - @section:<section identifier>:<hypertext>@ - To add a link to the hypertext to jump to the respective section.
  - rule_message()
    Get the rule message.
    Returns:
      A dictionary with the following key-value pairs:
      - 'title' : (str) The rule message title.
      - 'message' : (str) The rule message. The message may contain substrings with the following special link formats:
        - @url:<hypertext>:<external link>@ - To add an external link for a hypertext.
        - @sass:<address>:<hypertext>@ - To add a link to the hypertext to open the SASS address line on the Source page.
        - @source:<file name>:<line number>:<hypertext>@ - To add a link to the hypertext to open the source file at the specified line number on the Source page.
        - @section:<section identifier>:<hypertext>@ - To add a link to the hypertext to jump to the respective section.
      - 'type' : (MsgType) The message type.
  - speedup_estimation()
    Get the speedup estimation.
    Returns:
      A dictionary with the following key-value pairs:
      - 'type' : (SpeedupType) The speedup type.
      - 'speedup' : (float) The estimated speedup.
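When rule messages or table cells are printed to a terminal, the special link formats listed in the IRuleResult entries above may need to be reduced to their hypertext. The sketch below does this with a regular expression. It assumes that neither the hypertext nor the link target contains an @ character and that the fields before the hypertext contain no additional colons; the function name strip_links is hypothetical.

```python
import re

# Matches @url:...@, @sass:...@, @source:...@ and @section:...@ substrings.
_LINK = re.compile(r"@(url|sass|source|section):([^@]*)@")


def strip_links(text):
    """Replace each special link substring with its hypertext only."""

    def _hypertext(match):
        kind, body = match.group(1), match.group(2)
        if kind == "url":
            # @url:<hypertext>:<external link>@ puts the hypertext first.
            return body.split(":", 1)[0]
        # The other formats put the hypertext last, after one field
        # (sass, section) or two fields (source).
        max_splits = 2 if kind == "source" else 1
        return body.split(":", max_splits)[-1]

    return _LINK.sub(_hypertext, text)
```

A typical use would be strip_links(rule_result.rule_message()['message']) for a rule result whose has_rule_message returns True.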
- class ncu_report.ISourceInfo
  Represents the source correlation information for a specific function address within an action.
  - file_name()
    Get the file name, as embedded in the correlation information.
    Returns:
      The file name.
- class ncu_report.MarkerKind
  Enum representing the kind of a source marker.
  - SASS
    The marker will be associated with a SASS instruction.
  - SOURCE
    The marker will be associated with a Source line.
  - NONE
    No specific kind of marker.
- class ncu_report.MsgType
  Enum representing the type of the message.
  - NONE
    No specific type for this message.
  - OK
    The message is informative.
  - OPTIMIZATION
    The message represents a suggestion for performance optimization.
  - WARNING
    The message represents a warning or fixable issue.
  - ERROR
    The message represents an error, potentially in executing the rule.
- class ncu_report.SpeedupType
  Enum representing the type of speedup estimation.
  - UNKNOWN
    Unknown speedup type.
  - LOCAL
    Value represents increase in hardware efficiency in isolated context.
  - GLOBAL
    Value represents decrease in overall kernel runtime.
- ncu_report.load_report(file_name)
  Load an NVIDIA Nsight Compute report file into an IContext object.
  Parameters:
    file_name (str | pathlib.Path) – The relative or absolute path to the .ncu-rep report file.
  Returns:
    An IContext object representing the loaded report file.
  Raises:
    FileNotFoundError – Either if file_name does not exist or if the NVIDIA Nsight Compute library directory cannot be found.
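Since load_report raises FileNotFoundError both for a missing report and for a missing library directory, scripts may want to catch it explicitly. The sketch below wraps the call; the wrapper name open_report and its loader parameter are hypothetical and exist only so that the function can be exercised without the module installed.

```python
import sys


def open_report(file_name, loader=None):
    """Load a report, returning None instead of raising FileNotFoundError."""
    if loader is None:
        # Imported lazily; requires extras/python on the module search path.
        import ncu_report
        loader = ncu_report.load_report
    try:
        return loader(file_name)
    except FileNotFoundError as error:
        print(f"cannot load {file_name}: {error}", file=sys.stderr)
        return None


# Usage:
# context = open_report("my_report.ncu-rep")
# if context is not None:
#     print(context.num_ranges())
```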
Notices
ALL NVIDIA DESIGN SPECIFICATIONS, REFERENCE BOARDS, FILES, DRAWINGS, DIAGNOSTICS, LISTS, AND OTHER DOCUMENTS (TOGETHER AND SEPARATELY, “MATERIALS”) ARE BEING PROVIDED “AS IS.” NVIDIA MAKES NO WARRANTIES, EXPRESSED, IMPLIED, STATUTORY, OR OTHERWISE WITH RESPECT TO THE MATERIALS, AND EXPRESSLY DISCLAIMS ALL IMPLIED WARRANTIES OF NONINFRINGEMENT, MERCHANTABILITY, AND FITNESS FOR A PARTICULAR PURPOSE.
Information furnished is believed to be accurate and reliable. However, NVIDIA Corporation assumes no responsibility for the consequences of use of such information or for any infringement of patents or other rights of third parties that may result from its use. No license is granted by implication or otherwise under any patent rights of NVIDIA Corporation. Specifications mentioned in this publication are subject to change without notice. This publication supersedes and replaces all other information previously supplied. NVIDIA Corporation products are not authorized as critical components in life support devices or systems without express written approval of NVIDIA Corporation.
Trademarks
NVIDIA and the NVIDIA logo are trademarks or registered trademarks of NVIDIA Corporation in the U.S. and other countries. Other company and product names may be trademarks of the respective companies with which they are associated.