TensorRT Optimized Inference


NVIDIA TensorRT is a CUDA-based deep learning inference framework that provides highly optimized execution on NVIDIA GPUs, including the Clara Developer Kits.
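
The sketch below illustrates the general TensorRT workflow referred to here: an ONNX model is parsed, optimized for the local GPU, and serialized to a plan file that can be cached for later inference. It is a minimal example assuming the TensorRT 8+ Python API; the model path and output filename are placeholders, not part of the Holoscan SDK.

```python
# Minimal sketch: build and serialize a TensorRT engine from an ONNX model
# using the TensorRT Python API (TensorRT 8+). Paths are placeholders.
import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)

builder = trt.Builder(logger)
network = builder.create_network(
    1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)
)
parser = trt.OnnxParser(network, logger)

with open("model.onnx", "rb") as f:      # placeholder ONNX model path
    if not parser.parse(f.read()):
        for i in range(parser.num_errors):
            print(parser.get_error(i))
        raise RuntimeError("Failed to parse the ONNX model")

config = builder.create_builder_config()
config.set_flag(trt.BuilderFlag.FP16)    # enable FP16 optimizations if supported

# Serialize the optimized engine so it can be cached and reloaded later.
serialized_engine = builder.build_serialized_network(network, config)
with open("model.plan", "wb") as f:      # placeholder cache filename
    f.write(serialized_engine)
```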

GXF ships with a base TensorRT extension, which the Holoscan SDK extends: the updated TensorRT extension can selectively load a cached TensorRT engine that matches the GPU specifications of the system, making it well suited to the Clara Developer Kits. A sketch of this idea follows.
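
The following sketch illustrates the idea of keying a cached engine to the GPU's specifications; it is not the extension's actual implementation. It assumes a hypothetical cache directory and filename scheme, and uses pycuda to query the device name and compute capability.

```python
# Illustrative sketch (not the extension's implementation): select a cached
# TensorRT engine keyed by GPU name and compute capability, and only
# deserialize it when a matching cache entry exists.
import os

import pycuda.driver as cuda   # assumes pycuda is installed
import tensorrt as trt

cuda.init()
device = cuda.Device(0)
major, minor = device.compute_capability()
gpu_tag = f"{device.name().replace(' ', '_')}_sm{major}{minor}"

cache_dir = "engine_cache"                               # hypothetical cache directory
engine_path = os.path.join(cache_dir, f"model_{gpu_tag}.plan")

logger = trt.Logger(trt.Logger.WARNING)
runtime = trt.Runtime(logger)

if os.path.exists(engine_path):
    # A cached engine built for this GPU exists: load it directly.
    with open(engine_path, "rb") as f:
        engine = runtime.deserialize_cuda_engine(f.read())
else:
    # Otherwise the engine would have to be (re)built for this GPU,
    # e.g. with the build step sketched above, and then cached here.
    engine = None
```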
