TensorRTExtension

Components with TensorRT inference capability.

  • UUID: d43f23e4-b9bf-11eb-9d18-2b7be630552b

  • Version: 2.0.0

  • Author: NVIDIA

nvidia::gxf::TensorRtInference

Codelet that takes input tensors and feeds them into TensorRT for inference.

  • Component ID: 06a7f0e0-b9c0-11eb-8cd6-23c9c2070107

  • Base Type: nvidia::gxf::Codelet

Parameters

model_file_path

Path to the ONNX model to be loaded.

  • Flags: GXF_PARAMETER_FLAGS_NONE

  • Type: GXF_PARAMETER_TYPE_STRING

engine_file_path

Path at which the generated engine is serialized and from which it is loaded.

  • Flags: GXF_PARAMETER_FLAGS_NONE

  • Type: GXF_PARAMETER_TYPE_STRING

force_engine_update

Always rebuild the engine regardless of an existing engine file. Such a conversion may take minutes. Defaults to false.

  • Flags: GXF_PARAMETER_FLAGS_NONE

  • Type: GXF_PARAMETER_TYPE_BOOL

  • Default: False

input_tensor_names

Names of input tensors in the order to be fed into the model.

  • Flags: GXF_PARAMETER_FLAGS_NONE

  • Type: GXF_PARAMETER_TYPE_STRING

input_binding_names

Names of the input bindings in the model, in the same order as input_tensor_names (see the sketch below).

  • Flags: GXF_PARAMETER_FLAGS_NONE

  • Type: GXF_PARAMETER_TYPE_STRING
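For illustration, a minimal sketch of how the two lists line up positionally; the tensor name "input_image" and the binding name "input_0" are hypothetical, not taken from any particular model:

  # Position 0 in both lists refers to the same input:
  # the GXF tensor "input_image" feeds the model binding "input_0".
  input_tensor_names: [input_image]
  input_binding_names: [input_0]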

output_tensor_names

Names of output tensors in the order to be retrieved from the model.

  • Flags: GXF_PARAMETER_FLAGS_NONE

  • Type: GXF_PARAMETER_TYPE_STRING

output_binding_names

Names of the output bindings in the model, in the same order as output_tensor_names.

  • Flags: GXF_PARAMETER_FLAGS_NONE

  • Type: GXF_PARAMETER_TYPE_STRING

pool

Allocator instance for output tensors.

  • Flags: GXF_PARAMETER_FLAGS_NONE

  • Type: GXF_PARAMETER_TYPE_HANDLE

  • Handle Type: nvidia::gxf::Allocator

cuda_stream_pool

Instance of gxf::CudaStreamPool used to allocate a CUDA stream.

  • Flags: GXF_PARAMETER_FLAGS_NONE

  • Type: GXF_PARAMETER_TYPE_HANDLE

  • Handle Type: nvidia::gxf::CudaStreamPool

max_workspace_size

Size of the workspace in bytes. Defaults to 64 MiB (67108864 bytes).

  • Flags: GXF_PARAMETER_FLAGS_NONE

  • Type: GXF_PARAMETER_TYPE_INT64

  • Default: 67108864

dla_core

DLA core to use. Fallback to GPU is always enabled. By default, only the GPU is used.

  • Flags: GXF_PARAMETER_FLAGS_OPTIONAL

  • Type: GXF_PARAMETER_TYPE_INT64

max_batch_size

Maximum possible batch size when the first dimension is dynamic and used as the batch size.

  • Flags: GXF_PARAMETER_FLAGS_NONE

  • Type: GXF_PARAMETER_TYPE_INT32

  • Default: 1

enable_fp16_

Enable FP16 inference with FP32 fallback.

  • Flags: GXF_PARAMETER_FLAGS_NONE

  • Type: GXF_PARAMETER_TYPE_BOOL

  • Default: False

verbose

Enable verbose logging on the console. Defaults to false.

  • Flags: GXF_PARAMETER_FLAGS_NONE

  • Type: GXF_PARAMETER_TYPE_BOOL

  • Default: False

relaxed_dimension_check

Ignore dimensions of size 1 when checking input tensor dimensions.

  • Flags: GXF_PARAMETER_FLAGS_NONE

  • Type: GXF_PARAMETER_TYPE_BOOL

  • Default: True

clock

Clock instance used to set the publish time.

  • Flags: GXF_PARAMETER_FLAGS_OPTIONAL

  • Type: GXF_PARAMETER_TYPE_HANDLE

  • Handle Type: nvidia::gxf::Clock

rx

List of receivers from which to take input tensors.

  • Flags: GXF_PARAMETER_FLAGS_NONE

  • Type: GXF_PARAMETER_TYPE_HANDLE

  • Handle Type: nvidia::gxf::Receiver

tx

Transmitter on which to publish output tensors.

  • Flags: GXF_PARAMETER_FLAGS_NONE

  • Type: GXF_PARAMETER_TYPE_HANDLE

  • Handle Type: nvidia::gxf::Transmitter
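Putting the parameters together, a hedged sketch of a GXF application YAML entity that instantiates this codelet. The file paths, tensor and binding names, and the companion components (receiver, transmitter, allocator, CUDA stream pool) are illustrative assumptions; scheduling terms and connections to other entities are omitted.

  name: inference
  components:
  - name: input
    type: nvidia::gxf::DoubleBufferReceiver
  - name: output
    type: nvidia::gxf::DoubleBufferTransmitter
  - name: allocator
    type: nvidia::gxf::UnboundedAllocator
  - name: stream_pool
    type: nvidia::gxf::CudaStreamPool
  - name: tensor_rt
    type: nvidia::gxf::TensorRtInference
    parameters:
      model_file_path: /tmp/model.onnx      # hypothetical ONNX model path
      engine_file_path: /tmp/model.plan     # engine is serialized here and reloaded on later runs
      force_engine_update: false
      input_tensor_names: [input_image]     # hypothetical GXF tensor name
      input_binding_names: [input_0]        # hypothetical model binding name
      output_tensor_names: [output_tensor]
      output_binding_names: [output_0]
      pool: allocator
      cuda_stream_pool: stream_pool
      max_workspace_size: 67108864          # 64 MiB
      max_batch_size: 1
      enable_fp16_: false
      verbose: false
      relaxed_dimension_check: true
      rx: [input]
      tx: output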
