Defined in File tensor_rt_inference.hpp
class TensorRtInference : public nvidia::gxf::Codelet
Loads an ONNX model, takes input tensors, and runs inference on them with TensorRT.
It takes input from all provided receivers and searches them one by one for a Tensor component with the specified name; the first occurrence is used. Only tensors in GPU memory are accepted. A dynamic batch size is supported as the first dimension. The codelet has an engine cache directory that can be pre-populated to reduce start-up time. If the engine cache directory has no pre-existing engine file for an architecture, the codelet generates one dynamically. Requires a gxf::CudaStream to run the load on a specific CUDA stream.
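The first-occurrence lookup described above can be illustrated with a minimal, GXF-free sketch. The types `NamedTensor`, `Message`, `Match` and the function `FindFirstTensor` are hypothetical stand-ins invented for this example and are not part of the GXF or TensorRT APIs. The sketch only models the search order; what the codelet does when the first match is not in GPU memory is not stated here, so the check is left to the caller.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-ins for illustration only; these are not GXF types.
enum class MemoryKind { kHost, kDevice };

struct NamedTensor {
  std::string name;
  MemoryKind memory;
};

// One message per receiver; each message carries zero or more named tensors.
using Message = std::vector<NamedTensor>;

struct Match {
  std::size_t receiver_index;  // which receiver the tensor was found on
  const NamedTensor* tensor;   // points into the caller's messages
};

// Walks the receivers in order and returns the first tensor whose name matches.
// Later receivers are not inspected once a match has been found.
std::optional<Match> FindFirstTensor(const std::vector<Message>& messages,
                                     const std::string& name) {
  for (std::size_t i = 0; i < messages.size(); ++i) {
    for (const NamedTensor& t : messages[i]) {
      if (t.name == name) {
        return Match{i, &t};
      }
    }
  }
  return std::nullopt;
}

// Assumption: a caller would reject a match that is not in device (GPU) memory.
bool IsDeviceTensor(const Match& m) {
  return m.tensor->memory == MemoryKind::kDevice;
}
```

If two receivers both carry a tensor with the requested name, only the one on the earlier receiver is returned, which mirrors the "first occurrence" rule.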
- gxf_result_t start() override
- gxf_result_t tick() override
- gxf_result_t stop() override
- gxf_result_t registerInterface(gxf::Registrar *registrar) override
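The engine cache behaviour noted in the class description (reuse a pre-populated engine file if one exists for the architecture, otherwise generate one) might look roughly like the following sketch. The file naming scheme, the `arch_tag` parameter, and the helpers `EnginePathFor` and `EnsureEngine` are assumptions made for illustration; the actual cache layout and the member function in which the codelet performs this step are not specified here.

```cpp
#include <cassert>
#include <filesystem>
#include <string>

namespace fs = std::filesystem;

enum class EngineSource { kCached, kBuilt };

// Hypothetical naming scheme: one engine file per architecture tag.
fs::path EnginePathFor(const fs::path& cache_dir, const std::string& arch_tag) {
  return cache_dir / (arch_tag + ".engine");
}

// Reuses an existing engine file if present; otherwise calls `build` to
// create one at the expected path. `build` stands in for the (slow)
// ONNX-to-engine conversion and is supplied by the caller.
template <typename BuildFn>
EngineSource EnsureEngine(const fs::path& cache_dir,
                          const std::string& arch_tag,
                          BuildFn build) {
  const fs::path engine_path = EnginePathFor(cache_dir, arch_tag);
  if (fs::exists(engine_path)) {
    return EngineSource::kCached;  // pre-populated cache: skip the build
  }
  fs::create_directories(cache_dir);
  build(engine_path);
  return EngineSource::kBuilt;
}
```

Pre-populating the directory means the first branch is taken on start-up, which is where the reduction in start time comes from.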