The Holoscan SDK has been annotated using the NVTX API to provide runtime tracing and profiling of key application calls such as the start, compute, and stop callbacks made to the operators used by the application. Time spent on emit and receive calls for individual operator ports as well as timings for optional metadata handling, CUDA stream handling and data logger calls are also shown. Additionally, the SDK provides frame-level tracking that correlates NVTX ranges with individual data frames flowing through the application pipeline. This profiling can be captured and visualized using the tools provided by NSight Systems.
To enable profiling and write out the results, set the HOLOSCAN_ENABLE_PROFILE environment variable to 1 and launch the application with the nsys command-line tool provided with NSight Systems. For example, the following command profiles the first 3 seconds of the bring_your_own_model example application and writes the results to byom_profile.nsys-rep:
The written profile can then be opened with the NSight Systems UI (nsys-ui) to visualize the results. The following sample profile of the bring_your_own_model example application is zoomed in to show details of the CPU and CUDA runtime of the application’s operators:
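One way to script such a run is sketched below in Python. This is a minimal illustration, not the SDK's own tooling: the helper names build_nsys_command and profile are hypothetical, the application path is an assumed placeholder, and the traced APIs (cuda, nvtx, osrt) are one reasonable choice rather than a required set. The nsys options used (--trace, --output, --force-overwrite, --duration) are standard nsys profile options.

```python
import os
import shlex
import shutil
import subprocess


def build_nsys_command(app_cmd, output="byom_profile", duration_s=3):
    """Build an `nsys profile` command line that wraps the application command."""
    return [
        "nsys", "profile",
        "--trace=cuda,nvtx,osrt",     # assumed selection of APIs to trace
        f"--output={output}",         # nsys appends the .nsys-rep extension
        "--force-overwrite=true",     # replace an existing report of the same name
        f"--duration={duration_s}",   # collect for this many seconds
        *app_cmd,
    ]


def profile(app_cmd, **kwargs):
    """Run the application under nsys with Holoscan NVTX profiling enabled."""
    if shutil.which("nsys") is None:
        raise FileNotFoundError("nsys was not found on PATH")
    # Copy the current environment and switch on the SDK's NVTX annotations.
    env = dict(os.environ, HOLOSCAN_ENABLE_PROFILE="1")
    return subprocess.run(build_nsys_command(app_cmd, **kwargs), env=env, check=True)


if __name__ == "__main__":
    # Hypothetical path to the example; adjust to the local installation.
    app_cmd = ["python3", "examples/bring_your_own_model/python/byom.py"]
    # Print the equivalent shell command without running it.
    print("HOLOSCAN_ENABLE_PROFILE=1 " + shlex.join(build_nsys_command(app_cmd)))
```

Calling profile(app_cmd) would launch the run; the printed line shows the same invocation as a shell command.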

bring_your_own_model example application

The Holoscan SDK includes advanced frame-level tracking that enhances profiling by correlating NVTX ranges with individual data frames as they flow through the application pipeline. This feature provides deeper insights into per-frame processing times and helps identify performance bottlenecks at the frame level.
Frame-level tracking automatically:
Frame-level tracking is automatically enabled when both data flow tracking and profiling are active:
In Python:
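A minimal sketch, assuming the application is wrapped in the Tracker context manager from holoscan.core (the SDK's Python entry point for data flow tracking) and that the process was launched with HOLOSCAN_ENABLE_PROFILE=1. The helper names profiling_requested and run_with_flow_tracking are hypothetical, and the check treats only the value 1 as enabled, matching the value used in this section; the SDK itself may accept other spellings.

```python
import os

PROFILE_ENV = "HOLOSCAN_ENABLE_PROFILE"


def profiling_requested(env=None):
    """Return True when HOLOSCAN_ENABLE_PROFILE is set to "1" in the environment."""
    env = os.environ if env is None else env
    return env.get(PROFILE_ENV, "").strip() == "1"


def run_with_flow_tracking(app):
    """Run a Holoscan application with data flow tracking enabled."""
    # Imported lazily so profiling_requested() works without the SDK installed.
    from holoscan.core import Tracker

    if not profiling_requested():
        # Data flow tracking still works; only frame-level NVTX correlation is
        # expected to be missing when profiling is not enabled.
        print(f"{PROFILE_ENV} is not set to 1; running with data flow tracking only")

    with Tracker(app) as tracker:
        app.run()
        tracker.print()  # print the data flow tracking summary after the run
```

A typical call is run_with_flow_tracking(MyApp()), where MyApp is the user's own holoscan.core.Application subclass, with the script itself started under nsys.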
In C++:
When frame tracking is enabled, NSight Systems will display:
Frame tracking adds minimal overhead to applications:
Frame-level tracking requires Data Flow Tracking to be enabled. If you only need basic NVTX profiling without frame correlation, you can use HOLOSCAN_ENABLE_PROFILE=1 without enabling data flow tracking.
The following images visualize the profiling results for the Endoscopy Tool Tracking and Ping Multi-Port applications using the NSight Systems UI (nsys-ui). The operators used in Endoscopy Tool Tracking are all single-port operators, whereas the operators used in Ping Multi-Port have 2 ports each to send and receive messages. In each image, the frame an operator is currently processing can be identified by the frame number that becomes visible when the compute section is expanded.

Endoscopy Tool Tracking Application
Ping Multi-Port Application