Using TensorRT-RTX via PyTorch#

This walkthrough shows how to accelerate PyTorch inference workloads with TensorRT-RTX. The Torch-TensorRT project provides TensorRT and TensorRT-RTX as compilation backends for torch.compile(), where they optimize the intermediate graphs captured by PyTorch Dynamo.

Installing Torch-TensorRT-RTX#

Install the torch-tensorrt-rtx and tensorrt-rtx Python packages with pip:

python -m pip install torch-tensorrt-rtx

For additional installation options and platform requirements, refer to the Torch-TensorRT-RTX installation documentation.
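After installation, it can help to confirm which versions pip actually resolved. The following is a minimal sketch that uses only the Python standard library; the helper name installed_version is hypothetical, and the distribution names are the ones mentioned in this section.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version of a pip distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


# Distribution names as given in this section.
for dist in ("torch-tensorrt-rtx", "tensorrt-rtx"):
    print(f"{dist}: {installed_version(dist) or 'not installed'}")
```

A result of "not installed" for either name suggests the pip command ran against a different Python environment than the one running this check.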

Note

The TensorRT-RTX backend for PyTorch is experimental in this release. For the latest support status and known limitations, refer to the upstream Torch-TensorRT-RTX documentation.

Compiling a PyTorch Model with TensorRT-RTX#

To use TensorRT-RTX in PyTorch, compile your model with the backend set to "tensorrt":

import torch
import torch_tensorrt

model = MyModel().eval().cuda()                  # Replace MyModel with your own torch.nn.Module
x = torch.randn(1, 3, 224, 224, device="cuda")   # Example input; match your model's expected shape

optimized_model = torch.compile(model, backend="tensorrt")
optimized_model(x)                                # First call triggers compilation
optimized_model(x)                                # Later calls with the same input shape reuse the compiled graph
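Because compilation happens on the first call, that call is expected to take noticeably longer than the ones that follow. The sketch below shows one way to measure this. It is not part of Torch-TensorRT: time_calls and LazyCallable are hypothetical names, and LazyCallable is only a stand-in that simulates a one-time compile cost so that the snippet runs without a GPU.

```python
import time


def time_calls(fn, *args, repeats=3, sync=None):
    """Return wall-clock seconds for each of `repeats` consecutive calls to fn(*args).

    `sync` is an optional zero-argument callable invoked before each timestamp
    is taken. GPU work is launched asynchronously, so for a CUDA model
    torch.cuda.synchronize is the natural value to pass here.
    """
    durations = []
    for _ in range(repeats):
        if sync is not None:
            sync()
        start = time.perf_counter()
        fn(*args)
        if sync is not None:
            sync()
        durations.append(time.perf_counter() - start)
    return durations


class LazyCallable:
    """Stand-in for a compiled model: slow on the first call, fast afterwards."""

    def __init__(self):
        self.ready = False

    def __call__(self, x):
        if not self.ready:
            time.sleep(0.05)  # Simulated one-time compilation cost
            self.ready = True
        return x


durations = time_calls(LazyCallable(), 1.0)
print([f"{d * 1e3:.2f} ms" for d in durations])
```

To measure a real model, pass the compiled module and its input in place of the stand-in, together with sync=torch.cuda.synchronize.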

Note

Two naming conventions catch most users by surprise:

  • Import the package as torch_tensorrt (not torch_tensorrt_rtx).

  • Pass the backend name as "tensorrt" (not "tensorrt-rtx").
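The import name from the list above can be checked up front, before any model code runs. This is a minimal standard-library sketch; module_available is a hypothetical helper, and the two constants simply restate the names given above.

```python
import importlib.util


def module_available(name):
    """Return True if a top-level module with this name can be found for import."""
    return importlib.util.find_spec(name) is not None


IMPORT_NAME = "torch_tensorrt"   # Import name from this section (not torch_tensorrt_rtx)
BACKEND_NAME = "tensorrt"        # Backend string from this section (not "tensorrt-rtx")

if module_available(IMPORT_NAME):
    print(f"'{IMPORT_NAME}' is importable; use backend={BACKEND_NAME!r} with torch.compile().")
else:
    print(f"'{IMPORT_NAME}' was not found; check that torch-tensorrt-rtx is installed.")
```

The check only locates the module and does not import it, so it stays fast and does not need a GPU.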

For the full Torch-TensorRT API, supported precisions, and advanced compilation options, refer to the Torch-TensorRT documentation.

Next Steps#

After your PyTorch model is running with TensorRT-RTX: