Defined in File tensor_rt_inference.hpp
class nvidia::holoscan::custom_lstm_inference::TensorRtInference : public gxf::Codelet
Loads an ONNX model, takes input tensors, and runs inference on them with TensorRT.
It takes input from all provided receivers and tries to locate a Tensor component with the specified name on each of them, one by one. The first occurrence is used. Only tensors in GPU memory are accepted. A dynamic batch size is supported as the first dimension. The codelet has an engine cache directory that can be pre-populated to reduce start time. If the engine cache directory contains no pre-existing engine file for the current architecture, the codelet generates one dynamically. Requires a gxf::CudaStream to run the workload on a specific CUDA stream.
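The "first occurrence" lookup described above can be illustrated with a minimal, self-contained sketch. It does not use the real GXF types: `NamedTensor`, `Message`, `TensorLocation`, `find_first_tensor`, and `is_device_tensor` are hypothetical stand-ins introduced only for illustration. How the codelet reacts to a matching tensor that is not in GPU memory is not stated here, so the sketch leaves that check to the caller.

```cpp
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-ins for GXF types; not the real gxf::Tensor or entity API.
enum class MemoryKind { kHost, kDevice };

struct NamedTensor {
  std::string name;
  MemoryKind memory;
};

// One message per receiver, each carrying zero or more named tensors.
using Message = std::vector<NamedTensor>;

struct TensorLocation {
  std::size_t receiver_index;  // which receiver's message held the tensor
  std::size_t tensor_index;    // position of the tensor within that message
};

// Scans the messages in receiver order and returns the location of the first
// tensor whose name matches. Later matches are ignored.
std::optional<TensorLocation> find_first_tensor(
    const std::vector<Message>& messages, const std::string& name) {
  for (std::size_t r = 0; r < messages.size(); ++r) {
    for (std::size_t t = 0; t < messages[r].size(); ++t) {
      if (messages[r][t].name == name) {
        return TensorLocation{r, t};
      }
    }
  }
  return std::nullopt;  // no receiver carried a tensor with this name
}

// Assumption: the caller rejects a located tensor that is not in GPU memory.
bool is_device_tensor(const NamedTensor& tensor) {
  return tensor.memory == MemoryKind::kDevice;
}
```

A caller would invoke `find_first_tensor` once per expected input name, then use `is_device_tensor` on the result before handing the buffer to the inference step.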
gxf_result_t start() override
gxf_result_t tick() override
gxf_result_t stop() override
gxf_result_t registerInterface(gxf::Registrar *registrar) override