TensorRTExtension
Components with TensorRT inference capability.
UUID: d43f23e4-b9bf-11eb-9d18-2b7be630552b
Version: 2.0.0
Author: NVIDIA
nvidia::gxf::TensorRtInference
Codelet that takes input tensors and feeds them into TensorRT for inference.
Component ID: 06a7f0e0-b9c0-11eb-8cd6-23c9c2070107
Base Type: nvidia::gxf::Codelet
Parameters
model_file_path
Path to the ONNX model to be loaded.
Flags: GXF_PARAMETER_FLAGS_NONE
Type: GXF_PARAMETER_TYPE_STRING
engine_file_path
Path to the file where the generated engine is serialized and from which it is loaded.
Flags: GXF_PARAMETER_FLAGS_NONE
Type: GXF_PARAMETER_TYPE_STRING
force_engine_update
Always rebuild the engine, regardless of whether an engine file already exists. The conversion may take several minutes. Defaults to false.
Flags: GXF_PARAMETER_FLAGS_NONE
Type: GXF_PARAMETER_TYPE_BOOL
Default: False
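The interaction between engine_file_path and force_engine_update can be sketched as follows. This is a minimal illustration of the behavior described above, not the extension's actual code; the function name should_build_engine is hypothetical, and the sketch assumes that an existing engine file is reused unless an update is forced.

```python
from pathlib import Path


def should_build_engine(engine_file_path: str, force_engine_update: bool = False) -> bool:
    """Return True when a new engine should be built from the ONNX model.

    Hypothetical helper: assumes an existing engine file is reused
    unless force_engine_update is set.
    """
    # Rebuild when explicitly forced, or when no serialized engine exists yet.
    return force_engine_update or not Path(engine_file_path).is_file()
```

With force_engine_update left at its default, the potentially slow ONNX-to-engine conversion would only run when no engine file is found at engine_file_path.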
input_tensor_names
Names of input tensors in the order to be fed into the model.
Flags: GXF_PARAMETER_FLAGS_NONE
Type: GXF_PARAMETER_TYPE_STRING
input_binding_names
Names of input bindings as defined in the model, in the same order as the names provided in input_tensor_names.
Flags: GXF_PARAMETER_FLAGS_NONE
Type: GXF_PARAMETER_TYPE_STRING
output_tensor_names
Names of output tensors in the order to be retrieved from the model.
Flags: GXF_PARAMETER_FLAGS_NONE
Type: GXF_PARAMETER_TYPE_STRING
output_binding_names
Names of output bindings as defined in the model, in the same order as the names provided in output_tensor_names.
Flags: GXF_PARAMETER_FLAGS_NONE
Type: GXF_PARAMETER_TYPE_STRING
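Because tensor names and binding names are matched by position, the two lists in each pair must have the same length and order. The sketch below shows that pairing; pair_names and the example names are hypothetical and only illustrate the ordering rule, not the extension's internal logic.

```python
def pair_names(tensor_names, binding_names):
    """Map each tensor name to the binding name at the same position.

    Hypothetical helper: raises ValueError when the lists differ in length,
    since a positional pairing would then be ambiguous.
    """
    if len(tensor_names) != len(binding_names):
        raise ValueError(
            f"expected {len(tensor_names)} binding names, got {len(binding_names)}"
        )
    return dict(zip(tensor_names, binding_names))


# Example names are made up for illustration.
input_map = pair_names(["image", "mask"], ["input_0", "input_1"])
print(input_map)
```

Swapping two entries in one list without swapping them in the other would silently feed a tensor to the wrong binding, so it helps to keep both lists side by side in the configuration.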
pool
Allocator instance for output tensors.
Flags: GXF_PARAMETER_FLAGS_NONE
Type: GXF_PARAMETER_TYPE_HANDLE
Handle Type: nvidia::gxf::Allocator
cuda_stream_pool
Instance of gxf::CudaStreamPool used to allocate the CUDA stream.
Flags: GXF_PARAMETER_FLAGS_NONE
Type: GXF_PARAMETER_TYPE_HANDLE
Handle Type: nvidia::gxf::CudaStreamPool
max_workspace_size
Size of the workspace in bytes. Defaults to 64 MiB.
Flags: GXF_PARAMETER_FLAGS_NONE
Type: GXF_PARAMETER_TYPE_INT64
Default: 67108864
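The default value of 67108864 bytes corresponds to 64 × 1024 × 1024. A small conversion helper (hypothetical, not part of the extension) makes it easier to express other sizes in the same unit:

```python
def mib_to_bytes(mib: int) -> int:
    """Convert a size in mebibytes to bytes (1 MiB = 1024 * 1024 bytes)."""
    return mib * 1024 * 1024


default_workspace = mib_to_bytes(64)
print(default_workspace)  # 67108864
```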
dla_core
DLA core to use. Fallback to GPU is always enabled. Defaults to using the GPU only.
Flags: GXF_PARAMETER_FLAGS_OPTIONAL
Type: GXF_PARAMETER_TYPE_INT64
max_batch_size
Maximum possible batch size in case the first dimension is dynamic and used as batch size.
Flags: GXF_PARAMETER_FLAGS_NONE
Type: GXF_PARAMETER_TYPE_INT32
Default: 1
enable_fp16_
Enable FP16 inference with FP32 fallback.
Flags: GXF_PARAMETER_FLAGS_NONE
Type: GXF_PARAMETER_TYPE_BOOL
Default: False
verbose
Enable verbose logging to the console. Defaults to false.
Flags: GXF_PARAMETER_FLAGS_NONE
Type: GXF_PARAMETER_TYPE_BOOL
Default: False
relaxed_dimension_check
Ignore dimensions of size 1 when checking input tensor dimensions.
Flags: GXF_PARAMETER_FLAGS_NONE
Type: GXF_PARAMETER_TYPE_BOOL
Default: True
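One way to picture the relaxed check is to drop every dimension of size 1 from both shapes before comparing them. The sketch below is an assumption about the general idea, not the extension's implementation; dims_match is a hypothetical name, and details such as dynamic dimensions are deliberately left out.

```python
def dims_match(tensor_shape, binding_shape, relaxed=True):
    """Compare two shapes, optionally ignoring dimensions of size 1.

    Hypothetical helper illustrating a relaxed dimension check.
    """
    if not relaxed:
        # Strict mode: shapes must be identical element by element.
        return list(tensor_shape) == list(binding_shape)

    def strip_ones(shape):
        # Drop every dimension equal to 1 before comparing.
        return [d for d in shape if d != 1]

    return strip_ones(tensor_shape) == strip_ones(binding_shape)
```

Under this sketch, a tensor of shape (1, 3, 224, 224) would be accepted for a binding of shape (3, 224, 224) when the relaxed check is on, and rejected when it is off.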
clock
Clock instance used to obtain the publish time.
Flags: GXF_PARAMETER_FLAGS_OPTIONAL
Type: GXF_PARAMETER_TYPE_HANDLE
Handle Type: nvidia::gxf::Clock
rx
List of receivers from which input tensors are taken.
Flags: GXF_PARAMETER_FLAGS_NONE
Type: GXF_PARAMETER_TYPE_HANDLE
Handle Type: nvidia::gxf::Receiver
tx
Transmitter used to publish output tensors.
Flags: GXF_PARAMETER_FLAGS_NONE
Type: GXF_PARAMETER_TYPE_HANDLE
Handle Type: nvidia::gxf::Transmitter
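As a quick summary, the defaults documented above can be collected in one place and merged with application-specific values. The sketch below is only a convenience for reasoning about a parameter set; with_defaults, the file paths, and the tensor and binding names are hypothetical, and handle parameters (pool, cuda_stream_pool, clock, rx, tx) are omitted because they refer to other components.

```python
# Defaults copied from the parameter list above.
DOCUMENTED_DEFAULTS = {
    "force_engine_update": False,
    "max_workspace_size": 67108864,
    "max_batch_size": 1,
    "enable_fp16_": False,
    "verbose": False,
    "relaxed_dimension_check": True,
}


def with_defaults(params: dict) -> dict:
    """Return params merged over the documented defaults (params win)."""
    merged = dict(DOCUMENTED_DEFAULTS)
    merged.update(params)
    return merged


# All values below are made up for illustration.
params = with_defaults({
    "model_file_path": "/tmp/model.onnx",
    "engine_file_path": "/tmp/model.engine",
    "input_tensor_names": ["image"],
    "input_binding_names": ["input_0"],
    "output_tensor_names": ["scores"],
    "output_binding_names": ["output_0"],
    "enable_fp16_": True,
})
print(params["max_batch_size"], params["enable_fp16_"])
```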