NVTX Trace


The NVIDIA Tools Extension Library (NVTX) is a powerful mechanism that allows users to manually instrument their application. Tegra System Profiler can then collect the information and present it on the timeline.

Tegra System Profiler supports version 2.0 of the NVTX specification.

The following features are supported:

To learn more about specific features of NVTX, please refer to the NVTX header file: nvToolsExt.h.

To use NVTX in your application, follow these steps:

  1. Add #include "nvToolsExt.h" to your source code. This header file is located in the Target-arm/nvtx/include directory on the host.

  2. Link with the libnvToolsExt.a static library (the -lnvToolsExt linker flag).

    • For Android, the library is located in the Target-arm/armv7 and Target-arm/armv8 directories on the host.

    • For Linux, the library is located in the Target-arm-linux/armv7 and Target-arm-linux/armv8 directories on the host.

  3. Add the following compiler flags as well: -pthread -ldl -lrt.

  4. Add calls to the NVTX API functions. For example, try adding nvtxRangePushA("main") at the beginning of the main() function, and nvtxRangePop() just before the return statement at the end.

    • For convenience in C++ code, consider adding a wrapper that implements the RAII (resource acquisition is initialization) pattern, which guarantees that every range gets closed.

  5. In the project settings, select the Collect NVTX trace checkbox.

  6. On Android, make sure that your application is launched by Tegra System Profiler. This is required so that the necessary launch environment is prepared, and the library responsible for collection of NVTX trace data is properly injected into the process.
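The instrumentation described in step 4 can be sketched as follows. This is a minimal illustration, not code shipped with the profiler: the USE_REAL_NVTX macro, the ScopedNvtxRange class, and the load_assets() and run_app() functions are hypothetical names introduced here. When USE_REAL_NVTX is not defined, small stand-in stubs replace the two NVTX calls so that the sketch compiles without the NVTX header; run_app() plays the role of main().

```cpp
#include <cassert>
#include <string>
#include <vector>

#ifdef USE_REAL_NVTX              // hypothetical macro: define it when building against NVTX
#include "nvToolsExt.h"
#else
// Stand-in stubs (assumption: for illustration only). They mimic just the two
// calls used below and record each event so the nesting can be inspected.
std::vector<std::string> g_events;    // hypothetical event log
int g_depth = 0;                       // current nesting depth

int nvtxRangePushA(const char* message) {
    assert(message != nullptr);
    g_events.push_back(std::string("push:") + message);
    return g_depth++;                  // zero-based level of the range being started
}

int nvtxRangePop() {
    if (g_depth == 0) return -1;       // nothing to pop
    g_events.push_back("pop");
    return --g_depth;                  // zero-based level of the range being ended
}
#endif

// RAII wrapper: the constructor opens a range and the destructor closes it,
// so the range ends on every exit path, including early returns and exceptions.
class ScopedNvtxRange {
public:
    explicit ScopedNvtxRange(const char* name) { nvtxRangePushA(name); }
    ~ScopedNvtxRange() { nvtxRangePop(); }
    ScopedNvtxRange(const ScopedNvtxRange&) = delete;
    ScopedNvtxRange& operator=(const ScopedNvtxRange&) = delete;
};

// Hypothetical workload with an early return.
int load_assets(bool cache_hit) {
    ScopedNvtxRange range("load_assets");
    if (cache_hit) return 0;           // the range is still closed here
    return 42;
}

// Plays the role of main(): explicit push at the start, pop before returning.
int run_app() {
    nvtxRangePushA("main");
    int result = load_assets(false) + load_assets(true);
    nvtxRangePop();
    return result;
}
```

To build against the real library, define USE_REAL_NVTX and apply the include path and link flags from steps 1 to 3. The explicit push/pop pair in run_app() mirrors the main() example from step 4, while ScopedNvtxRange shows why the RAII form is harder to get wrong: load_assets() has two return paths, and neither needs its own nvtxRangePop() call.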

Typically, calls to NVTX functions can be left in the source code even if the application is not being built for profiling purposes, since the overhead is very low when the profiler is not attached.

NVTX is not intended to annotate very small pieces of code that are called very frequently. As a rule of thumb, if the code being annotated usually takes less than 1 microsecond to execute, add an NVTX range around it only with care, because the cost of the annotation can become significant relative to the code itself.
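One way to follow this guideline is to annotate at a coarser granularity: wrap the loop that calls a tiny function instead of the function itself. The sketch below illustrates the idea under stated assumptions. The scale_sample() and process_batch() functions and the USE_REAL_NVTX macro are hypothetical, and the stub functions stand in for the NVTX calls only so that the example compiles without the NVTX header.

```cpp
#include <cassert>
#include <vector>

#ifdef USE_REAL_NVTX              // hypothetical macro: define it when building against NVTX
#include "nvToolsExt.h"
#else
// Stand-in stubs (assumption: for illustration only) that count opened ranges.
int g_ranges_opened = 0;           // hypothetical counter
int g_depth = 0;                   // current nesting depth

int nvtxRangePushA(const char* message) {
    assert(message != nullptr);
    ++g_ranges_opened;
    return g_depth++;
}

int nvtxRangePop() {
    return g_depth > 0 ? --g_depth : -1;
}
#endif

// Tiny function, expected to run far below the 1 microsecond guideline,
// so it is deliberately left without its own range.
inline float scale_sample(float x) { return x * 0.5f; }

// One range covers the whole batch: a single push/pop pair is amortized
// over every cheap call inside the loop.
float process_batch(const std::vector<float>& samples) {
    nvtxRangePushA("process_batch");
    float sum = 0.0f;
    for (float s : samples) {
        sum += scale_sample(s);
    }
    nvtxRangePop();
    return sum;
}
```

With this layout, a batch of 1000 samples produces one range on the timeline rather than 1000, so the annotation cost stays small compared with the work it measures.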


NVIDIA® GameWorks™ Documentation Rev. 1.0.180104 ©2014-2018. NVIDIA Corporation. All Rights Reserved.