Clara Holoscan v0.3.0

Class TensorRtInference

class TensorRtInference : public gxf::Codelet

Loads an ONNX model, takes input tensors, and runs inference against them with TensorRT.

It takes input from all of the provided receivers and tries to locate a Tensor component with the specified name on them, one by one; the first occurrence found is used. Only tensors in GPU memory are accepted. A dynamic batch size is supported as the first dimension. The codelet has an engine cache directory that can be pre-populated to reduce start-up time. If the engine cache directory has no pre-existing engine file for the current architecture, the engine is generated dynamically at start-up. Requires gxf::CudaStream to run the load on a specific CUDA stream.
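
The snippet below is a minimal sketch of the tensor-lookup behavior described above: each receiver's message is checked in order for a tensor with the requested name, and the first match wins. It uses GXF's Receiver::receive() and Entity::get<Tensor>() in simplified form; the function name findInputTensor and the error handling are illustrative, not taken from the codelet's implementation.

#include <string>
#include <vector>

#include "gxf/core/entity.hpp"
#include "gxf/std/receiver.hpp"
#include "gxf/std/tensor.hpp"

namespace gxf = nvidia::gxf;

// Illustrative sketch only: walk the receivers in order and return the first
// tensor found under the given name, mirroring the lookup described above.
gxf::Expected<gxf::Handle<gxf::Tensor>> findInputTensor(
    const std::vector<gxf::Handle<gxf::Receiver>>& receivers,
    const std::string& tensor_name) {
  for (const auto& rx : receivers) {
    auto message = rx->receive();    // next message queued on this receiver
    if (!message) { continue; }      // nothing available; try the next receiver
    auto tensor = message->get<gxf::Tensor>(tensor_name.c_str());
    if (tensor) { return tensor; }   // first occurrence is used
  }
  return gxf::Unexpected{GXF_FAILURE};  // no receiver carried the named tensor
}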

Public Functions

gxf_result_t start() override

gxf_result_t tick() override

gxf_result_t stop() override

gxf_result_t registerInterface(gxf::Registrar *registrar) override
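
These overrides follow the standard GXF codelet lifecycle: registerInterface() declares the codelet's parameters, start() performs one-time setup such as building or loading the TensorRT engine, tick() runs inference for each incoming message, and stop() releases resources. A minimal, hypothetical skeleton of that pattern is sketched below; the class name and the model_file_path parameter are assumptions for illustration, not the actual TensorRtInference interface.

#include <string>

#include "gxf/core/parameter_parser_std.hpp"
#include "gxf/std/codelet.hpp"

namespace gxf = nvidia::gxf;

// Hypothetical codelet skeleton; not the TensorRtInference implementation.
class MyInferenceCodelet : public gxf::Codelet {
 public:
  gxf_result_t registerInterface(gxf::Registrar* registrar) override {
    gxf::Expected<void> result;
    result &= registrar->parameter(model_file_path_, "model_file_path",
                                   "Model File Path",
                                   "Path to the ONNX model (assumed parameter).");
    return gxf::ToResultCode(result);
  }

  gxf_result_t start() override {
    // One-time setup, e.g. deserialize a cached engine or build one from ONNX.
    return GXF_SUCCESS;
  }

  gxf_result_t tick() override {
    // Per-message work: receive input tensors, run inference, publish outputs.
    return GXF_SUCCESS;
  }

  gxf_result_t stop() override {
    // Release the engine, execution context, and any device buffers.
    return GXF_SUCCESS;
  }

 private:
  gxf::Parameter<std::string> model_file_path_;
};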

© Copyright 2022, NVIDIA. Last updated on Jun 28, 2023.