Using TensorRT-RTX via PyTorch#
This walkthrough shows how to accelerate PyTorch inference workloads with TensorRT-RTX. Torch-TensorRT is the project that provides TensorRT and TensorRT-RTX as compilation backends for torch.compile(), optimizing the intermediate graphs captured by PyTorch Dynamo.
Installing Torch-TensorRT-RTX#
Install the torch-tensorrt-rtx and tensorrt-rtx Python packages with pip:
python -m pip install torch-tensorrt-rtx
For additional installation options and platform requirements, refer to the Torch-TensorRT-RTX installation documentation.
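To confirm the installation before compiling anything, you can query the installed package versions. The following is a minimal sketch that uses only the Python standard library. The helper name installed_version is illustrative and not part of Torch-TensorRT; the distribution names are the ones mentioned in the pip step above, plus torch itself.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version of a pip distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


# Distribution names as used with pip (not the Python import names).
report = {
    name: installed_version(name)
    for name in ("torch", "torch-tensorrt-rtx", "tensorrt-rtx")
}

for name, version in report.items():
    print(f"{name}: {version or 'not installed'}")
```

A "not installed" entry suggests that the pip step did not run in the Python environment you are currently using.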
Note
The TensorRT-RTX backend for PyTorch is experimental in this release. For the latest support status and known limitations, refer to the upstream Torch-TensorRT-RTX documentation.
Compiling a PyTorch Model with TensorRT-RTX#
To use TensorRT-RTX in PyTorch, compile your model with the backend set to "tensorrt":
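import torch
import torch_tensorrt # Importing this makes the "tensorrt" backend available to torch.compile
model = MyModel().eval().cuda() # Replace MyModel with your own torch.nn.Module
x = torch.randn((1, 3, 224, 224)).cuda() # Example input; match your model's expected shape
optimized_model = torch.compile(model, backend="tensorrt")
optimized_model(x) # The first call triggers compilation
optimized_model(x) # Subsequent calls reuse the compiled model and are fast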
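Because compilation happens on the first call, that call takes noticeably longer than later ones. One way to observe this is to time the first call separately from the rest. The helper below is a rough sketch and not part of the Torch-TensorRT API: time_calls and its parameters are hypothetical names, and the optional sync argument is assumed to be a callable such as torch.cuda.synchronize, so that asynchronous GPU work is included in each measurement.

```python
import time


def time_calls(fn, *args, iters=10, sync=None):
    """Return (first_call_seconds, mean_later_seconds) for fn(*args)."""

    def timed():
        start = time.perf_counter()
        fn(*args)
        if sync is not None:
            sync()  # Wait for queued GPU work before stopping the clock
        return time.perf_counter() - start

    first = timed()  # Includes compilation when fn is a freshly compiled model
    later = [timed() for _ in range(iters)]
    return first, sum(later) / len(later)


# Hypothetical usage with the example above (needs a CUDA GPU and torch_tensorrt):
#   first, steady = time_calls(optimized_model, x, sync=torch.cuda.synchronize)
#   print(f"first call: {first:.2f} s, later calls: {steady * 1e3:.2f} ms")
```

The first number is a one-time cost per compiled model, so the steady-state figure is the one to compare against eager PyTorch.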
Note
Two naming conventions catch most users by surprise:
Import the package as torch_tensorrt (not torch_tensorrt_rtx).
Pass the backend name as "tensorrt" (not "tensorrt-rtx").
For the full Torch-TensorRT API, supported precisions, and advanced compilation options, refer to the Torch-TensorRT documentation.
Next Steps#
After your PyTorch model is running with TensorRT-RTX:
Deploy Your First Model — End-to-end walkthrough using the ONNX path
ONNX Conversion Guide — Export PyTorch models to ONNX for use with the native TensorRT-RTX API
Architecture Overview: Model Specification — Compare ONNX, native API, and PyTorch entry points