Installing TensorRT-RTX#

There are several installation methods for TensorRT-RTX. This section covers the most common options:

  • An SDK zip file (Windows), or

  • A tarball file (Linux)

Prerequisites#

  1. Ensure you are a member of the NVIDIA Developer Program; if not, follow the prompts to gain access. Then download the package you want to install:

    1. Go to this download link.

    2. Click GET STARTED, then click Download Now.

    3. Select the version of TensorRT-RTX that you are interested in.

    4. Select the checkbox to agree to the license terms.

    5. Click the package you want to install. Your download begins.

  2. Install NVIDIA CUDA Toolkit 12.9 or later on the target system.
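
    You can check whether a suitable toolkit is already present by querying the CUDA compiler version; this assumes nvcc is on your PATH:

    nvcc --version   # the reported release should be 12.9 or later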

Windows SDK Installation#

  1. Decompress the Windows zip package to extract the following:

    • Core library and ONNX parser DLLs and import libraries

    • Development headers

    • Source code samples

    • Documentation

    • Python bindings

    • The tensorrt_rtx executable for building engines and running inference from the command line

    • Licensing information and open-source acknowledgments

    Note

    The DLLs are signed, and the signatures can be verified with the signtool utility. Remember to add the directories containing the DLLs and executable files to your PATH environment variable.
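
    For example, the commands below check the signatures of the extracted DLLs and add the package directories to PATH for the current PowerShell session. The extraction path and the lib and bin subdirectory names are assumptions; adjust them to match where and how you extracted the zip (signtool requires the Windows SDK):

    $rtxDir = "C:\TensorRT-RTX-1.0.0.14"   # example extraction path; adjust as needed
    # Verify the Authenticode signature of each extracted DLL
    Get-ChildItem "$rtxDir\lib\*.dll" | ForEach-Object { signtool verify /pa $_.FullName }
    # Make the DLLs and the tensorrt_rtx executable visible in this session
    $env:Path += ";$rtxDir\lib;$rtxDir\bin"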

  2. Optionally, install the Python bindings.

    $version = "1.0.0.14"  # Replace with newest version
    $arch = "amd64"        # Replace with your architecture
    $pyversion = "311"     # For Python 3.11; replace with your Python version
    $wheel = "TensorRT-RTX-$version\tensorrt_rtx-$version-cp$pyversion-none-win_$arch.whl"
    python -m pip install $wheel
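
    If the install succeeds, you can confirm that the bindings import correctly. The tensorrt_rtx module name and its __version__ attribute are assumptions, inferred from the wheel name and the convention used by the standard tensorrt package:

    python -c "import tensorrt_rtx; print(tensorrt_rtx.__version__)"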
    

Linux Tarball Installation#

TensorRT-RTX can be installed on Linux from a tarball package, much like the zip package installation on Windows. The tarball is supported on the following distributions:

  • Ubuntu 22.04

  • Ubuntu 24.04

  • Rocky Linux 8

Prerequisites

Ensure that NVIDIA CUDA Toolkit 12.9 or later is installed on the target system.

Steps

  1. Download the tarball that matches your operating system version and CPU architecture.

  2. Extract the tarball, optionally adding the path of the executable to your PATH variable and the path of the library directory to your LD_LIBRARY_PATH variable.

    version="1.0.0.14"  # Replace with newest version
    arch="x86_64"       # Replace with your architecture
    cuda="12.9"         # Replace with your CUDA version
    tarfile="TensorRT-RTX-${version}.Linux.${arch}-gnu-${cuda}.tar.gz"
    tar -xzf "$tarfile"
    export PATH=$PATH:$PWD/TensorRT-RTX-${version}/bin
    export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$PWD/TensorRT-RTX-${version}/lib
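
    To confirm that the executable is found through PATH and that its shared-library dependencies resolve through LD_LIBRARY_PATH, you can run, for example:

    which tensorrt_rtx
    ldd "$PWD/TensorRT-RTX-${version}/bin/tensorrt_rtx"   # all libraries should resolve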
    
  3. Optionally, install the Python bindings.

    pyversion="311"  # For Python 3.11; replace with your Python version
    wheel="tensorrt_rtx-${version}-cp${pyversion}-none-linux_${arch}.whl"
    python3 -m pip install "TensorRT-RTX-${version}/python/${wheel}"
    

Basic TensorRT-RTX Workflow#

Models can be specified via the TensorRT-RTX C++ or Python API, or read from the Open Neural Network Exchange (ONNX) format. Popular model training frameworks like PyTorch or TensorFlow typically offer an ONNX export option. Alternatively, the native TensorRT-RTX API defines operators for manually assembling the network structure.

ONNX is a framework-agnostic option that works with models from TensorFlow, PyTorch, and more. TensorRT-RTX supports automatic conversion from ONNX files through the TensorRT-RTX API or the tensorrt_rtx executable, which we will use in this section. ONNX conversion is all-or-nothing, meaning all operations in your model must be supported by TensorRT-RTX.
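
As a sketch of that flow, the command below builds an engine from an ONNX file with the tensorrt_rtx executable. The flag names follow trtexec conventions and the file names are placeholders, so treat them as assumptions and consult tensorrt_rtx --help for the exact options:

    # Build a TensorRT-RTX engine from an ONNX model for later inference.
    tensorrt_rtx --onnx=model.onnx --saveEngine=model.engine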

For maximum performance and customizability, you can manually construct TensorRT-RTX engines using the TensorRT-RTX network definition API. This involves building a network identical to your target model, operation by operation, using only TensorRT-RTX operations. After the TensorRT-RTX network is created, you export just the weights of your model from the training framework and load them into your TensorRT-RTX network. For this approach, more information about constructing the model using TensorRT-RTX’s network definition API can be found here:
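
As a rough illustration of that second approach, the sketch below declares a tiny network through the Python bindings. It assumes the tensorrt_rtx module mirrors the standard TensorRT builder and network-definition API (Logger, Builder, add_input, and so on); verify the names against the linked API reference before relying on them.

    # Minimal sketch of the network-definition approach; API names are
    # assumptions based on the standard TensorRT Python API.
    import tensorrt_rtx as trt  # module name assumed from the wheel name

    logger = trt.Logger(trt.Logger.WARNING)
    builder = trt.Builder(logger)
    network = builder.create_network()

    # One input tensor and a single identity layer, with the output marked
    # so the builder keeps it in the final engine.
    inp = network.add_input("input", trt.float32, (1, 3, 224, 224))
    identity = network.add_identity(inp)
    network.mark_output(identity.get_output(0))

    # Weights exported from the training framework would be loaded into the
    # corresponding layers here (for example, by wrapping NumPy arrays in
    # trt.Weights) before building and serializing the engine.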