Class TensorRtInference
Defined in File tensor_rt_inference.hpp
Base Type
- public gxf::Codelet
class nvidia::gxf::TensorRtInference : public gxf::Codelet
Loads an ONNX model, takes input tensors, and runs inference on them with TensorRT.
It takes input from all provided receivers and tries to locate a Tensor component with the specified name on each of them in turn; the first occurrence found is used. Only tensors in GPU memory are accepted. Dynamic batching is supported via the first dimension. The codelet maintains an engine cache directory that can be pre-populated to reduce startup time; if the cache directory contains no pre-existing engine file for the current architecture, the engine is generated dynamically at startup. Requires a gxf::CudaStream so that the workload runs on a specific CUDA stream.
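The first-occurrence tensor lookup described above can be illustrated with a short sketch. This is not code from the class itself: it assumes only the standard GXF Entity/Tensor API (Entity::get&lt;Tensor&gt;(), Tensor::storage_type()), header paths follow the usual GXF source layout, and the helper name findInputTensor is hypothetical.

```cpp
#include <vector>

#include "gxf/core/entity.hpp"
#include "gxf/core/expected.hpp"
#include "gxf/core/handle.hpp"
#include "gxf/std/tensor.hpp"

namespace gxf = nvidia::gxf;

// Hypothetical helper: searches already-received messages for a Tensor
// component with the given name and returns the first occurrence that
// lives in GPU (device) memory, mirroring the behavior described above.
gxf::Expected<gxf::Handle<gxf::Tensor>> findInputTensor(
    const std::vector<gxf::Entity>& messages, const char* tensor_name) {
  for (const auto& message : messages) {
    auto maybe_tensor = message.get<gxf::Tensor>(tensor_name);
    if (!maybe_tensor) { continue; }  // No tensor of that name on this entity.
    // Per the codelet contract, only GPU-memory tensors are accepted.
    if (maybe_tensor.value()->storage_type() !=
        gxf::MemoryStorageType::kDevice) {
      continue;
    }
    return maybe_tensor;  // First occurrence wins.
  }
  return gxf::Unexpected{GXF_ENTITY_COMPONENT_NOT_FOUND};
}
```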
Public Functions
- gxf_result_t start() override
- gxf_result_t tick() override
- gxf_result_t stop() override
- gxf_result_t registerInterface(gxf::Registrar *registrar) override
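These overrides follow the standard gxf::Codelet lifecycle: registerInterface() declares the codelet's parameters, start() performs one-time setup (here, loading or building the TensorRT engine), tick() runs once per scheduled execution (one inference pass), and stop() releases resources. A minimal sketch of that contract follows; the class name and the single model_file_path parameter are illustrative, not the actual parameter list of TensorRtInference.

```cpp
#include <string>

#include "gxf/std/codelet.hpp"

// Illustrative codelet skeleton showing the lifecycle overrides above.
class MyInferenceCodelet : public nvidia::gxf::Codelet {
 public:
  // Declares configurable parameters so the application can set them;
  // "model_file_path" is an assumed example, not the full parameter set.
  gxf_result_t registerInterface(nvidia::gxf::Registrar* registrar) override {
    nvidia::gxf::Expected<void> result;
    result &= registrar->parameter(
        model_file_path_, "model_file_path", "Model File Path",
        "Path to the ONNX model to load.");
    return nvidia::gxf::ToResultCode(result);
  }

  // One-time setup, e.g. deserializing a cached engine or building one.
  gxf_result_t start() override { return GXF_SUCCESS; }

  // Invoked once per scheduled execution, e.g. one inference pass.
  gxf_result_t tick() override { return GXF_SUCCESS; }

  // Releases resources acquired in start().
  gxf_result_t stop() override { return GXF_SUCCESS; }

 private:
  nvidia::gxf::Parameter<std::string> model_file_path_;
};
```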