TensorRTExtension

Components with TensorRT inference capability.

  • UUID: d43f23e4-b9bf-11eb-9d18-2b7be630552b

  • Version: 2.0.0

  • Author: NVIDIA

nvidia::gxf::TensorRtInference

Codelet that takes input tensors and feeds them into TensorRT for inference.

  • Component ID: 06a7f0e0-b9c0-11eb-8cd6-23c9c2070107

  • Base Type: nvidia::gxf::Codelet

Parameters

model_file_path

Path to the ONNX model to be loaded.

  • Flags: GXF_PARAMETER_FLAGS_NONE

  • Type: GXF_PARAMETER_TYPE_STRING

engine_file_path

Path at which the generated engine is serialized and from which it is loaded.

  • Flags: GXF_PARAMETER_FLAGS_NONE

  • Type: GXF_PARAMETER_TYPE_STRING

force_engine_update

Always rebuild the engine regardless of an existing engine file. Such a conversion may take minutes. Defaults to false.

  • Flags: GXF_PARAMETER_FLAGS_NONE

  • Type: GXF_PARAMETER_TYPE_BOOL

  • Default: False

input_tensor_names

Names of input tensors in the order to be fed into the model.

  • Flags: GXF_PARAMETER_FLAGS_NONE

  • Type: GXF_PARAMETER_TYPE_STRING

input_binding_names

Names of the input bindings in the model, in the same order as input_tensor_names (see the sketch below).

  • Flags: GXF_PARAMETER_FLAGS_NONE

  • Type: GXF_PARAMETER_TYPE_STRING
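For illustration, a minimal sketch of how the two lists line up positionally; the tensor name "input_image" and the binding name "input_0" are hypothetical, not taken from any particular model:

  # Position 0 in both lists refers to the same input:
  # the GXF tensor "input_image" feeds the model binding "input_0".
  input_tensor_names: [input_image]
  input_binding_names: [input_0]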

output_tensor_names

Names of output tensors in the order to be retrieved from the model.

  • Flags: GXF_PARAMETER_FLAGS_NONE

  • Type: GXF_PARAMETER_TYPE_STRING

output_binding_names

Names of the output bindings in the model, in the same order as output_tensor_names.

  • Flags: GXF_PARAMETER_FLAGS_NONE

  • Type: GXF_PARAMETER_TYPE_STRING

pool

Allocator instance for output tensors.

  • Flags: GXF_PARAMETER_FLAGS_NONE

  • Type: GXF_PARAMETER_TYPE_HANDLE

  • Handle Type: nvidia::gxf::Allocator

cuda_stream_pool

Instance of gxf::CudaStreamPool used to allocate a CUDA stream.

  • Flags: GXF_PARAMETER_FLAGS_NONE

  • Type: GXF_PARAMETER_TYPE_HANDLE

  • Handle Type: nvidia::gxf::CudaStreamPool

max_workspace_size

Size of the workspace in bytes. Defaults to 64 MiB (67108864 bytes).

  • Flags: GXF_PARAMETER_FLAGS_NONE

  • Type: GXF_PARAMETER_TYPE_INT64

  • Default: 67108864

dla_core

DLA core to use. Fallback to GPU is always enabled. By default, only the GPU is used.

  • Flags: GXF_PARAMETER_FLAGS_OPTIONAL

  • Type: GXF_PARAMETER_TYPE_INT64

max_batch_size

Maximum possible batch size when the first dimension is dynamic and used as the batch size.

  • Flags: GXF_PARAMETER_FLAGS_NONE

  • Type: GXF_PARAMETER_TYPE_INT32

  • Default: 1

enable_fp16_

Enable FP16 inference with FP32 fallback.

  • Flags: GXF_PARAMETER_FLAGS_NONE

  • Type: GXF_PARAMETER_TYPE_BOOL

  • Default: False

verbose

Enable verbose logging on the console. Defaults to false.

  • Flags: GXF_PARAMETER_FLAGS_NONE

  • Type: GXF_PARAMETER_TYPE_BOOL

  • Default: False

relaxed_dimension_check

Ignore dimensions of size 1 when checking input tensor dimensions.

  • Flags: GXF_PARAMETER_FLAGS_NONE

  • Type: GXF_PARAMETER_TYPE_BOOL

  • Default: True

clock

Clock instance used to set the publish time.

  • Flags: GXF_PARAMETER_FLAGS_OPTIONAL

  • Type: GXF_PARAMETER_TYPE_HANDLE

  • Handle Type: nvidia::gxf::Clock

rx

List of receivers from which to take input tensors.

  • Flags: GXF_PARAMETER_FLAGS_NONE

  • Type: GXF_PARAMETER_TYPE_HANDLE

  • Handle Type: nvidia::gxf::Receiver

tx

Transmitter on which to publish output tensors.

  • Flags: GXF_PARAMETER_FLAGS_NONE

  • Type: GXF_PARAMETER_TYPE_HANDLE

  • Handle Type: nvidia::gxf::Transmitter
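Putting the parameters together, a hedged sketch of a GXF application YAML entity that instantiates this codelet. The file paths, tensor and binding names, and the companion components (receiver, transmitter, allocator, CUDA stream pool) are illustrative assumptions; scheduling terms and connections to other entities are omitted.

  name: inference
  components:
  - name: input
    type: nvidia::gxf::DoubleBufferReceiver
  - name: output
    type: nvidia::gxf::DoubleBufferTransmitter
  - name: allocator
    type: nvidia::gxf::UnboundedAllocator
  - name: stream_pool
    type: nvidia::gxf::CudaStreamPool
  - name: tensor_rt
    type: nvidia::gxf::TensorRtInference
    parameters:
      model_file_path: /tmp/model.onnx      # hypothetical ONNX model path
      engine_file_path: /tmp/model.plan     # engine is serialized here and reloaded on later runs
      force_engine_update: false
      input_tensor_names: [input_image]     # hypothetical GXF tensor name
      input_binding_names: [input_0]        # hypothetical model binding name
      output_tensor_names: [output_tensor]
      output_binding_names: [output_0]
      pool: allocator
      cuda_stream_pool: stream_pool
      max_workspace_size: 67108864          # 64 MiB
      max_batch_size: 1
      enable_fp16_: false
      verbose: false
      relaxed_dimension_check: true
      rx: [input]
      tx: output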
