Class TensorRtInference
Defined in File tensor_rt_inference.hpp
Base Type
- public gxf::Codelet
class nvidia::gxf::TensorRtInference : public gxf::Codelet
Loads an ONNX model, takes input tensors, and runs inference on them with TensorRT.
It takes input from all provided receivers and tries to locate a Tensor component with the specified name on each of them in turn; the first occurrence found is used. Only tensors in GPU memory are accepted. Dynamic batching is supported via the first dimension. The codelet maintains an engine cache directory that can be pre-populated to reduce startup time; if the cache directory contains no pre-existing engine file for the current architecture, the engine is generated dynamically at startup. Requires a gxf::CudaStream so that the workload runs on a specific CUDA stream.
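The first-occurrence tensor lookup described above can be illustrated with a short sketch. This is not code from the class itself: it assumes only the standard GXF Entity/Tensor API (Entity::get&lt;Tensor&gt;(), Tensor::storage_type()), header paths follow the usual GXF source layout, and the helper name findInputTensor is hypothetical.

```cpp
#include <vector>

#include "gxf/core/entity.hpp"
#include "gxf/core/expected.hpp"
#include "gxf/core/handle.hpp"
#include "gxf/std/tensor.hpp"

namespace gxf = nvidia::gxf;

// Hypothetical helper: searches already-received messages for a Tensor
// component with the given name and returns the first occurrence that
// lives in GPU (device) memory, mirroring the behavior described above.
gxf::Expected<gxf::Handle<gxf::Tensor>> findInputTensor(
    const std::vector<gxf::Entity>& messages, const char* tensor_name) {
  for (const auto& message : messages) {
    auto maybe_tensor = message.get<gxf::Tensor>(tensor_name);
    if (!maybe_tensor) { continue; }  // No tensor of that name on this entity.
    // Per the codelet contract, only GPU-memory tensors are accepted.
    if (maybe_tensor.value()->storage_type() !=
        gxf::MemoryStorageType::kDevice) {
      continue;
    }
    return maybe_tensor;  // First occurrence wins.
  }
  return gxf::Unexpected{GXF_ENTITY_COMPONENT_NOT_FOUND};
}
```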
Public Functions
- gxf_result_t start() override
- gxf_result_t tick() override
- gxf_result_t stop() override
- gxf_result_t registerInterface(gxf::Registrar *registrar) override
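These overrides follow the standard gxf::Codelet lifecycle: registerInterface() declares the codelet's parameters, start() performs one-time setup (here, loading or building the TensorRT engine), tick() runs once per scheduled execution (one inference pass), and stop() releases resources. A minimal sketch of that contract follows; the class name and the single model_file_path parameter are illustrative, not the actual parameter list of TensorRtInference.

```cpp
#include <string>

#include "gxf/std/codelet.hpp"

// Illustrative codelet skeleton showing the lifecycle overrides above.
class MyInferenceCodelet : public nvidia::gxf::Codelet {
 public:
  // Declares configurable parameters so the application can set them;
  // "model_file_path" is an assumed example, not the full parameter set.
  gxf_result_t registerInterface(nvidia::gxf::Registrar* registrar) override {
    nvidia::gxf::Expected<void> result;
    result &= registrar->parameter(
        model_file_path_, "model_file_path", "Model File Path",
        "Path to the ONNX model to load.");
    return nvidia::gxf::ToResultCode(result);
  }

  // One-time setup, e.g. deserializing a cached engine or building one.
  gxf_result_t start() override { return GXF_SUCCESS; }

  // Invoked once per scheduled execution, e.g. one inference pass.
  gxf_result_t tick() override { return GXF_SUCCESS; }

  // Releases resources acquired in start().
  gxf_result_t stop() override { return GXF_SUCCESS; }

 private:
  nvidia::gxf::Parameter<std::string> model_file_path_;
};
```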