TensorRTExtension

Components with TensorRT inference capability.

  • UUID: d43f23e4-b9bf-11eb-9d18-2b7be630552b

  • Version: 2.0.0

  • Author: NVIDIA

  • License: LICENSE

Components

nvidia::gxf::TensorRtInference

Codelet that takes input tensors and feeds them into TensorRT for inference.

  • Component ID: 06a7f0e0-b9c0-11eb-8cd6-23c9c2070107

  • Base Type: nvidia::gxf::Codelet

Parameters

model_file_path

Path to the ONNX model to be loaded.

  • Flags: GXF_PARAMETER_FLAGS_NONE

  • Type: GXF_PARAMETER_TYPE_STRING


engine_file_path

Path where the generated TensorRT engine is serialized to and loaded from.

  • Flags: GXF_PARAMETER_FLAGS_NONE

  • Type: GXF_PARAMETER_TYPE_STRING


force_engine_update

Always rebuild the engine regardless of an existing engine file. Such a conversion may take minutes. Defaults to false.

  • Flags: GXF_PARAMETER_FLAGS_NONE

  • Type: GXF_PARAMETER_TYPE_BOOL

  • Default: False


input_tensor_names

Names of input tensors in the order to be fed into the model.

  • Flags: GXF_PARAMETER_FLAGS_NONE

  • Type: GXF_PARAMETER_TYPE_STRING


input_binding_names

Names of input bindings in the model, in the same order as input_tensor_names.

  • Flags: GXF_PARAMETER_FLAGS_NONE

  • Type: GXF_PARAMETER_TYPE_STRING


output_tensor_names

Names of output tensors in the order to be retrieved from the model.

  • Flags: GXF_PARAMETER_FLAGS_NONE

  • Type: GXF_PARAMETER_TYPE_STRING


output_binding_names

Names of output bindings in the model, in the same order as output_tensor_names.

  • Flags: GXF_PARAMETER_FLAGS_NONE

  • Type: GXF_PARAMETER_TYPE_STRING


pool

Allocator instance for output tensors.

  • Flags: GXF_PARAMETER_FLAGS_NONE

  • Type: GXF_PARAMETER_TYPE_HANDLE

  • Handle Type: nvidia::gxf::Allocator


cuda_stream_pool

Instance of gxf::CudaStreamPool used to allocate a CUDA stream.

  • Flags: GXF_PARAMETER_FLAGS_NONE

  • Type: GXF_PARAMETER_TYPE_HANDLE

  • Handle Type: nvidia::gxf::CudaStreamPool


max_workspace_size

Size of the workspace in bytes. Defaults to 64 MB.

  • Flags: GXF_PARAMETER_FLAGS_NONE

  • Type: GXF_PARAMETER_TYPE_INT64

  • Default: 67108864


dla_core

DLA core to use. Fallback to the GPU is always enabled. By default, only the GPU is used.

  • Flags: GXF_PARAMETER_FLAGS_OPTIONAL

  • Type: GXF_PARAMETER_TYPE_INT64


max_batch_size

Maximum possible batch size, in case the first dimension is dynamic and used as the batch size.

  • Flags: GXF_PARAMETER_FLAGS_NONE

  • Type: GXF_PARAMETER_TYPE_INT32

  • Default: 1


enable_fp16_

Enable inference with FP16 and FP32 fallback.

  • Flags: GXF_PARAMETER_FLAGS_NONE

  • Type: GXF_PARAMETER_TYPE_BOOL

  • Default: False


verbose

Enable verbose logging on the console. Defaults to false.

  • Flags: GXF_PARAMETER_FLAGS_NONE

  • Type: GXF_PARAMETER_TYPE_BOOL

  • Default: False


relaxed_dimension_check

Ignore dimensions of size 1 in the input tensor dimension check.

  • Flags: GXF_PARAMETER_FLAGS_NONE

  • Type: GXF_PARAMETER_TYPE_BOOL

  • Default: True


clock

Clock instance used for the publish timestamp.

  • Flags: GXF_PARAMETER_FLAGS_OPTIONAL

  • Type: GXF_PARAMETER_TYPE_HANDLE

  • Handle Type: nvidia::gxf::Clock


rx

List of receivers from which input tensors are taken.

  • Flags: GXF_PARAMETER_FLAGS_NONE

  • Type: GXF_PARAMETER_TYPE_HANDLE

  • Handle Type: nvidia::gxf::Receiver


tx

Transmitter used to publish output tensors.

  • Flags: GXF_PARAMETER_FLAGS_NONE

  • Type: GXF_PARAMETER_TYPE_HANDLE

  • Handle Type: nvidia::gxf::Transmitter
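
Example

As a sketch of how the parameters above fit together, the codelet would typically be wired into a GXF application graph YAML alongside its receiver, transmitter, allocator, and CUDA stream pool. The entity and component names, tensor/binding names, and file paths below are hypothetical and only illustrate the parameter wiring:

```yaml
# Hypothetical GXF entity wiring nvidia::gxf::TensorRtInference.
# Names and paths are illustrative, not part of the extension.
name: inference
components:
- name: input_tensors
  type: nvidia::gxf::DoubleBufferReceiver
- name: output_tensors
  type: nvidia::gxf::DoubleBufferTransmitter
- name: allocator
  type: nvidia::gxf::UnboundedAllocator
- name: stream_pool
  type: nvidia::gxf::CudaStreamPool
- name: trt_inference
  type: nvidia::gxf::TensorRtInference
  parameters:
    model_file_path: /tmp/model.onnx        # ONNX model to convert
    engine_file_path: /tmp/model.engine     # cached serialized engine
    force_engine_update: false
    input_tensor_names: [input]             # order must match input_binding_names
    input_binding_names: [input_0]
    output_tensor_names: [output]           # order must match output_binding_names
    output_binding_names: [output_0]
    pool: allocator
    cuda_stream_pool: stream_pool
    max_workspace_size: 67108864            # 64 MB (the default)
    max_batch_size: 1
    enable_fp16_: false
    verbose: false
    relaxed_dimension_check: true
    rx: [input_tensors]
    tx: output_tensors
```

Note that input_tensor_names and input_binding_names (and likewise the output pairs) are parallel lists: the i-th tensor name is mapped to the i-th binding in the model.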