IPluginV3

class tensorrt.IPluginV3(*args, **kwargs)

Plugin class for the V3 generation of user-implemented layers.

IPluginV3 acts as a wrapper around the plugin capability interfaces that define the actual behavior of the plugin.

This class is made available for the purpose of implementing IPluginV3 plugins with Python.

Note

Every attribute must be explicitly initialized on Python-based plugins. These attributes will be read-only when accessed through a C++-based plugin.

Variables

num_outputs – int The number of outputs from the plugin. This is used by the implementations of INetworkDefinition and Builder. In particular, it is called prior to any call to initialize().
tensorrt_version – int [READ ONLY] The API version with which this plugin was built.
plugin_type – str The plugin type. Should match the plugin name returned by the corresponding plugin creator.
plugin_version – str The plugin version. Should match the plugin version returned by the corresponding plugin creator.
plugin_namespace – str The namespace that this plugin object belongs to. Ideally, all plugin objects from the same plugin library should have the same namespace.
serialization_size – int [READ ONLY] The size of the serialization buffer required.

Overloaded function.

__init__(self: tensorrt.tensorrt.IPluginV3) -> None
__init__(self: tensorrt.tensorrt.IPluginV3, arg0: tensorrt.tensorrt.IPluginV3) -> None

clone(self: tensorrt.tensorrt.IPluginV3) → tensorrt.tensorrt.IPluginV3

Clone the plugin object. This copies over internal plugin parameters as well and returns a new plugin object with these parameters.

If the source plugin is pre-configured with configure_plugin(), the returned object should also be pre-configured. Cloned plugin objects can share the same per-engine immutable resource (e.g. weights) with the source object to avoid duplication.
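The cloning behavior described above might be sketched as follows. ScalePlugin is a hypothetical pure-Python stand-in so that the snippet runs without TensorRT installed; a real plugin would derive from trt.IPluginV3 and the capability interfaces. The sketch copies the parameters and the configured state, and shares the immutable weights instead of duplicating them.

```python
class ScalePlugin:
    """Hypothetical stand-in for a Python-based plugin (not a TensorRT class)."""

    def __init__(self, scale=1.0, weights=()):
        self.scale = scale        # per-object plugin parameter
        self.weights = weights    # immutable per-engine resource (a tuple here)
        self.configured = False   # becomes True once configure_plugin() has run

    def configure_plugin(self, inp, out):
        self.configured = True

    def clone(self):
        # Build a fresh object, then copy every attribute across. The weights
        # tuple is shared by reference, so it is not duplicated in memory.
        cloned = ScalePlugin()
        cloned.__dict__.update(self.__dict__)
        return cloned


source = ScalePlugin(scale=2.0, weights=(0.1, 0.2, 0.3))
source.configure_plugin([], [])
cloned = source.clone()
```

Sharing by reference is only safe here because the weights are never modified after creation; mutable state would need its own copy in each clone.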

destroy(self: tensorrt.tensorrt.IPluginV3) → None: Perform any cleanup or resource release(s) needed before the plugin object is destroyed. This will be called when the INetworkDefinition, Builder or ICudaEngine is destroyed.

Note

There is no direct equivalent to this method in the C++ API.

Note

Implementing this method is optional. The default behavior is a pass.

get_capability_interface(self: tensorrt.tensorrt.IPluginV3, type: tensorrt.PluginCapabilityType) → object: Return a plugin object implementing the specified PluginCapabilityType.

Note

IPluginV3 objects added for the build phase (through add_plugin_v3()) must return valid objects for PluginCapabilityType.CORE, PluginCapabilityType.BUILD and PluginCapabilityType.RUNTIME.

Note

IPluginV3 objects added for the runtime phase must return valid objects for PluginCapabilityType.CORE and PluginCapabilityType.RUNTIME.
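A minimal sketch of the capability lookup, assuming a single object implements every capability. CapabilityType and MyPlugin are hypothetical stand-ins so that the snippet runs without TensorRT installed; a real plugin would receive tensorrt.PluginCapabilityType values and derive from trt.IPluginV3 together with the capability interfaces it implements. Returning None for an unsupported capability is an assumption of this sketch.

```python
from enum import Enum, auto


class CapabilityType(Enum):
    """Stand-in for tensorrt.PluginCapabilityType (illustration only)."""
    CORE = auto()
    BUILD = auto()
    RUNTIME = auto()


class MyPlugin:
    """Hypothetical plugin in which one object provides all capabilities."""

    # A plugin added only for the runtime phase could leave out BUILD.
    supported = {CapabilityType.CORE, CapabilityType.BUILD, CapabilityType.RUNTIME}

    def get_capability_interface(self, type):
        # Hand back the object that implements the requested capability,
        # or None when this plugin does not offer it.
        return self if type in self.supported else None


plugin = MyPlugin()
core = plugin.get_capability_interface(CapabilityType.CORE)
```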

class tensorrt.IPluginCapability

Base class for plugin capability interfaces.

IPluginCapability represents a split of TensorRT V3 plugins into sub-objects that expose the different types of capabilities a plugin may have, as opposed to a single interface which defines all capabilities and behaviors of a plugin.

class tensorrt.IPluginV3OneCore(*args, **kwargs)

A plugin capability interface that enables the core capability (PluginCapabilityType.CORE).

Note

Every attribute must be explicitly initialized on Python-based plugins. These attributes will be read-only when accessed through a C++-based plugin.

Variables

plugin_type – str The plugin type. Should match the plugin name returned by the corresponding plugin creator.
plugin_version – str The plugin version. Should match the plugin version returned by the corresponding plugin creator.
plugin_namespace – str The namespace that this plugin object belongs to. Ideally, all plugin objects from the same plugin library should have the same namespace.

Overloaded function.

__init__(self: tensorrt.tensorrt.IPluginV3OneCore) -> None
__init__(self: tensorrt.tensorrt.IPluginV3OneCore, arg0: tensorrt.tensorrt.IPluginV3OneCore) -> None

class tensorrt.IPluginV3OneBuild(*args, **kwargs)

A plugin capability interface that enables the build capability (PluginCapabilityType.BUILD).

Exposes methods that allow the expression of the build time properties and behavior of a plugin.

Note

Every attribute must be explicitly initialized on Python-based plugins. These attributes will be read-only when accessed through a C++-based plugin.

Variables: num_outputs – int The number of outputs from the plugin. This is used by the implementations of INetworkDefinition and Builder.

Overloaded function.

__init__(self: tensorrt.tensorrt.IPluginV3OneBuild) -> None
__init__(self: tensorrt.tensorrt.IPluginV3OneBuild, arg0: tensorrt.tensorrt.IPluginV3OneBuild) -> None

configure_plugin(self: tensorrt.tensorrt.IPluginV3, in: List[tensorrt.tensorrt.DynamicPluginTensorDesc], out: List[tensorrt.tensorrt.DynamicPluginTensorDesc]) → None

Configure the plugin.

This function can be called multiple times in the build phase during creation of an engine by IBuilder.

Build phase: configure_plugin() is called when a plugin is being prepared for profiling but not for any specific input size. This provides an opportunity for the plugin to make algorithmic choices on the basis of input and output formats, along with the bound of possible dimensions. The min, opt and max value of the DynamicPluginTensorDesc correspond to the MIN, OPT and MAX value of the current profile that the plugin is being profiled for, with the desc.dims field corresponding to the dimensions of plugin specified at network creation. Wildcard dimensions may exist during this phase in the desc.dims field.

Warning

In contrast to the C++ API for configurePlugin(), this method must not return an error code. The expected behavior is to throw an appropriate exception if an error occurs.

Warning

This configure_plugin() method is not available to be called from Python on C++-based plugins.

Parameters

in – The input tensors attributes that are used for configuration.
out – The output tensors attributes that are used for configuration.

get_output_data_types(self: tensorrt.tensorrt.IPluginV3, input_types: List[tensorrt.tensorrt.DataType]) → List[tensorrt.tensorrt.DataType]

Return the DataTypes of the plugin outputs.

Provide DataType.FLOAT for each output if the layer has no inputs. The data type for any size tensor outputs must be DataType.INT32. The returned data types must each have a format that is supported by the plugin.

Parameters: input_types – Data types of the inputs.
Returns: The DataTypes of the plugin outputs, one per output.
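The rules above might be sketched as follows for a hypothetical plugin with two outputs, where output 1 is a size tensor. DataType here is a stand-in enum so that the snippet runs without TensorRT installed, and the function is written as a free function instead of a method; NUM_OUTPUTS and SIZE_TENSOR_OUTPUTS are illustrative names, not part of the API.

```python
from enum import Enum, auto


class DataType(Enum):
    """Stand-in for tensorrt.DataType (only the members used here)."""
    FLOAT = auto()
    HALF = auto()
    INT32 = auto()


NUM_OUTPUTS = 2              # hypothetical plugin: output 0 is data
SIZE_TENSOR_OUTPUTS = {1}    # hypothetical plugin: output 1 is a size tensor


def get_output_data_types(input_types):
    # Data outputs follow the first input's type, or FLOAT without inputs.
    data_type = input_types[0] if input_types else DataType.FLOAT
    # Size tensor outputs are always INT32.
    return [
        DataType.INT32 if index in SIZE_TENSOR_OUTPUTS else data_type
        for index in range(NUM_OUTPUTS)
    ]
```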

get_output_shapes(self: tensorrt.tensorrt.IPluginV3, inputs: List[tensorrt.tensorrt.DimsExprs], shape_inputs: List[tensorrt.tensorrt.DimsExprs], expr_builder: tensorrt.tensorrt.IExprBuilder) → List[tensorrt.tensorrt.DimsExprs]

Get expressions for computing shapes of an output tensor from shapes of the input tensors.

This function is called by the implementations of IBuilder during analysis of the network.

Warning

This get_output_shapes() method is not available to be called from Python on C++-based plugins.

Parameters

inputs – Expressions for shapes of the input tensors
shape_inputs – Expressions for shapes of the shape inputs
expr_builder – Object for generating new expressions

Returns

Expressions for the output shapes.

get_valid_tactics(self: tensorrt.tensorrt.IPluginV3) → List[int]: Return any custom tactics that the plugin intends to use.

Note

The provided tactic values must be unique and positive

Warning

This get_valid_tactics() method is not available to be called from Python on C++-based plugins.

get_workspace_size(self: tensorrt.tensorrt.IPluginV3, in: List[tensorrt.tensorrt.DynamicPluginTensorDesc], out: List[tensorrt.tensorrt.DynamicPluginTensorDesc]) → int

Return the workspace size (in bytes) required by the plugin.

This function is called after the plugin is configured, and possibly during execution. The result should be a sufficient workspace size to deal with inputs and outputs of the given size or any smaller problem.

Note

When implementing a Python-based plugin, implementing this method is optional. The default behavior is equivalent to return 0.

Warning

This get_workspace_size() method is not available to be called from Python on C++-based plugins.

Parameters

in – How to interpret the memory for the input tensors.
out – How to interpret the memory for the output tensors.

Returns

The workspace size (in bytes).
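One way to satisfy the "given size or any smaller problem" requirement is to size the workspace from the upper bounds of the tensors. The sketch below assumes a hypothetical plugin that needs one scratch copy of its largest input. TensorBounds is a stand-in for the information a DynamicPluginTensorDesc carries (its max dims plus an element size) so that the snippet runs without TensorRT installed.

```python
from collections import namedtuple
from math import prod

# Hypothetical stand-in: upper-bound dimensions and bytes per element.
TensorBounds = namedtuple("TensorBounds", ["max_dims", "itemsize"])


def get_workspace_size(inputs, outputs):
    # Size the scratch buffer for the largest input at its maximum extent,
    # so that any smaller problem also fits. Returns 0 when there are no inputs.
    return max((prod(t.max_dims) * t.itemsize for t in inputs), default=0)


image = TensorBounds(max_dims=(8, 3, 224, 224), itemsize=4)
logits = TensorBounds(max_dims=(8, 10), itemsize=2)
workspace_bytes = get_workspace_size([image, logits], [])
```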

supports_format_combination(self: tensorrt.tensorrt.IPluginV3, pos: int, in_out: List[tensorrt.tensorrt.DynamicPluginTensorDesc], num_inputs: int) → bool

Return true if plugin supports the format and datatype for the input/output indexed by pos.

For this method, inputs are indexed from [0, num_inputs-1] and outputs are indexed from [num_inputs, (num_inputs + num_outputs - 1)]. pos is an index into in_out, where 0 <= pos < (num_inputs + num_outputs).

TensorRT invokes this method to query if the input/output tensor indexed by pos supports the format and datatype specified by in_out[pos].format and in_out[pos].type. The override shall return true if that format and datatype at in_out[pos] are supported by the plugin. It is undefined behavior to examine the format or datatype of any tensor that is indexed by a number greater than pos.

Warning

This supports_format_combination() method is not available to be called from Python on C++-based plugins.

Parameters

pos – The input or output tensor index being queried.
in_out – The combined input and output tensor descriptions.
num_inputs – The number of inputs.

Returns

boolean indicating whether the format combination is supported or not.
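A minimal sketch of the indexing rule, assuming a hypothetical plugin with one output that accepts linear FP32 or FP16 tensors and requires every tensor to share the type of tensor 0. The enums and named tuples are stand-ins so that the snippet runs without TensorRT installed. Because in_out is a list of DynamicPluginTensorDesc in the signature above, the sketch reads the format and type through a desc field; that access path is an assumption of this sketch.

```python
from collections import namedtuple
from enum import Enum, auto


class DataType(Enum):
    """Stand-in for tensorrt.DataType."""
    FLOAT = auto()
    HALF = auto()
    INT32 = auto()


class TensorFormat(Enum):
    """Stand-in for tensorrt.TensorFormat."""
    LINEAR = auto()
    CHW32 = auto()


Desc = namedtuple("Desc", ["type", "format"])   # stand-in for PluginTensorDesc
DynDesc = namedtuple("DynDesc", ["desc"])       # stand-in for DynamicPluginTensorDesc

NUM_OUTPUTS = 1  # hypothetical plugin with a single output


def supports_format_combination(pos, in_out, num_inputs):
    if not 0 <= pos < num_inputs + NUM_OUTPUTS:
        raise IndexError("pos is outside the combined input/output range")
    desc = in_out[pos].desc
    if desc.format != TensorFormat.LINEAR:
        return False
    if pos == 0:
        return desc.type in (DataType.FLOAT, DataType.HALF)
    # Only tensors at indices <= pos are examined: compare against tensor 0.
    return desc.type == in_out[0].desc.type


fp16 = DynDesc(Desc(DataType.HALF, TensorFormat.LINEAR))
fp32 = DynDesc(Desc(DataType.FLOAT, TensorFormat.LINEAR))
```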

class tensorrt.IPluginV3OneRuntime(*args, **kwargs)

A plugin capability interface that enables the runtime capability (PluginCapabilityType.RUNTIME).

Exposes methods that allow the expression of the runtime properties and behavior of a plugin.

Overloaded function.

__init__(self: tensorrt.tensorrt.IPluginV3OneRuntime) -> None
__init__(self: tensorrt.tensorrt.IPluginV3OneRuntime, arg0: tensorrt.tensorrt.IPluginV3OneRuntime) -> None

attach_to_context(self: tensorrt.tensorrt.IPluginV3, resource_context: tensorrt.IPluginResourceContext) → tensorrt.tensorrt.IPluginV3

Clone the plugin, attach the cloned plugin object to an execution context and grant the cloned plugin access to some context resources.

This function is called automatically for each plugin when a new execution context is created.

The plugin may use resources provided by the resource_context until the plugin is deleted by TensorRT.

Parameters: resource_context – A resource context that exposes methods to get access to execution context specific resources. A different resource context is guaranteed for each different execution context to which the plugin is attached.

Note

This method should clone the entire IPluginV3 object, not just the runtime interface

enqueue(self: tensorrt.tensorrt.IPluginV3, input_desc: List[tensorrt.tensorrt.PluginTensorDesc], output_desc: List[tensorrt.tensorrt.PluginTensorDesc], inputs: List[int], outputs: List[int], workspace: int, stream: int) → None

Execute the layer.

inputs and outputs contain pointers to the corresponding input and output device buffers as their intptr_t casts. stream also represents an intptr_t cast of the CUDA stream in which enqueue should be executed.

Warning

Since input, output, and workspace buffers are created and owned by TRT, care must be taken when writing to them from the Python side.

Warning

In contrast to the C++ API for enqueue(), this method must not return an error code. The expected behavior is to throw an appropriate exception if an error occurs.

Warning

This enqueue() method is not available to be called from Python on C++-based plugins.

Parameters

input_desc – How to interpret the memory for the input tensors.
output_desc – How to interpret the memory for the output tensors.
inputs – The memory for the input tensors.
outputs – The memory for the output tensors.
workspace – Workspace for execution.
stream – The stream in which to execute the kernels.
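The sketch below illustrates only the calling convention: buffers arrive as plain integers holding addresses. To stay runnable without a GPU it uses host memory allocated with ctypes, which is an assumption made for illustration. In a real plugin the integers are device addresses that cannot be dereferenced on the host and would be wrapped by a GPU array library, with the work launched on the given stream. Passing element counts in place of PluginTensorDesc objects is likewise a simplification.

```python
import ctypes


def enqueue(input_desc, output_desc, inputs, outputs, workspace, stream):
    # Hypothetical simplification: input_desc[0] is an element count here,
    # not a PluginTensorDesc.
    count = input_desc[0]
    # Rebuild typed views from the integer addresses (host memory in this sketch).
    src = (ctypes.c_float * count).from_address(inputs[0])
    dst = (ctypes.c_float * count).from_address(outputs[0])
    for i in range(count):
        dst[i] = 2.0 * src[i]   # the "layer": multiply every element by two


in_buf = (ctypes.c_float * 4)(1.0, 2.0, 3.0, 4.0)
out_buf = (ctypes.c_float * 4)()
enqueue([4], [4], [ctypes.addressof(in_buf)], [ctypes.addressof(out_buf)], 0, 0)
result = list(out_buf)
```

Errors inside such a method would be reported by raising an exception, in line with the warning above.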

get_fields_to_serialize(self: tensorrt.tensorrt.IPluginV3) → tensorrt.tensorrt.PluginFieldCollection_: Return the plugin fields which should be serialized.

Note

The set of plugin fields returned does not necessarily need to match that advertised through get_field_names() of the corresponding plugin creator.

Warning

This get_fields_to_serialize() method is not available to be called from Python on C++-based plugins.

on_shape_change(self: tensorrt.tensorrt.IPluginV3, in: List[tensorrt.tensorrt.PluginTensorDesc], out: List[tensorrt.tensorrt.PluginTensorDesc]) → None

Called when a plugin is being prepared for execution for specific dimensions. This could happen multiple times in the execution phase, both during creation of an engine by IBuilder and execution of an engine by IExecutionContext.

IBuilder will call this function once per profile, with in resolved to the values specified by the kOPT field of the current profile.

IExecutionContext will call this during the next instance of enqueue_v2() or execute_v3() if: (1) the optimization profile is changed, or (2) an input binding is changed.

Warning

In contrast to the C++ API for onShapeChange(), this method must not return an error code. The expected behavior is to throw an appropriate exception if an error occurs.

Warning

This on_shape_change() method is not available to be called from Python on C++-based plugins.

Parameters

in – The input tensors attributes that are used for configuration.
out – The output tensors attributes that are used for configuration.

set_tactic(self: tensorrt.tensorrt.IPluginV3, tactic: int) → None

Set the tactic to be used in the subsequent call to enqueue().

If no custom tactics were advertised, this will have a value of 0, which is designated as the default tactic.

Warning

In contrast to the C++ API for setTactic(), this method must not return an error code. The expected behavior is to throw an appropriate exception if an error occurs.

Warning

This set_tactic() method is not available to be called from Python on C++-based plugins.
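A minimal sketch of how custom tactics might be advertised and selected. TacticPlugin and its TACTICS table are hypothetical stand-ins so that the snippet runs without TensorRT installed. Advertised values are unique and positive, 0 is kept as the default tactic, and an unknown value raises an exception instead of returning an error code.

```python
class TacticPlugin:
    """Hypothetical stand-in showing tactic bookkeeping only."""

    # Unique, positive tactic ids mapped to illustrative kernel names.
    TACTICS = {1: "tiled", 2: "naive"}

    def __init__(self):
        self.tactic = 0  # default tactic when none has been selected

    def get_valid_tactics(self):
        return sorted(self.TACTICS)

    def set_tactic(self, tactic):
        # Accept the default (0) or any advertised tactic; reject anything else.
        if tactic != 0 and tactic not in self.TACTICS:
            raise ValueError(f"unknown tactic {tactic}")
        self.tactic = tactic

    def kernel_name(self):
        # The implementation that the next enqueue() call would use.
        return self.TACTICS.get(self.tactic, "default")


plugin = TacticPlugin()
plugin.set_tactic(2)
```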

class tensorrt.IPluginResourceContext

Interface for plugins to access per-context resources provided by TensorRT.

There is no public way to construct an IPluginResourceContext. It appears as an argument to trt.IPluginV3OneRuntime.attach_to_context().

tensorrt.PluginTensorDesc

Fields that a plugin might see for an input or output.

scale is only valid when the type is DataType.INT8. TensorRT will set the value to -1.0 if it is invalid.

Variables

dims – Dims Dimensions.
format – TensorFormat Tensor format.
type – DataType Type.
scale – float Scale for INT8 data type.

class tensorrt.DynamicPluginTensorDesc(self: tensorrt.tensorrt.DynamicPluginTensorDesc) → None

Summarizes tensors that a plugin might see for an input or output.

Variables

desc – PluginTensorDesc Information required to interpret a pointer to tensor data, except that desc.dims has -1 in place of any runtime dimension.
min – Dims Lower bounds on tensor’s dimensions.
max – Dims Upper bounds on tensor’s dimensions.

class tensorrt.IDimensionExpr

An IDimensionExpr represents an integer expression constructed from constants, input dimensions, and binary operations.

These expressions can be used in overrides of IPluginV2DynamicExt::get_output_dimensions() to define output dimensions in terms of input dimensions.

get_constant_value(self: tensorrt.tensorrt.IDimensionExpr) → int

Get the value of the constant.

If is_constant(), returns value of the constant. Else, return int64 minimum.

is_constant(self: tensorrt.tensorrt.IDimensionExpr) → bool: Return true if expression is a build-time constant.

is_size_tensor(self: tensorrt.tensorrt.IDimensionExpr) → bool: Return true if this denotes the value of a size tensor.

class tensorrt.DimsExprs(*args, **kwargs)

Analog of class Dims with expressions (IDimensionExpr) instead of constants for the dimensions.

Behaves like a Python iterable and lists or tuples of IDimensionExpr can be used to construct it.

Overloaded function.

__init__(self: tensorrt.tensorrt.DimsExprs) -> None
__init__(self: tensorrt.tensorrt.DimsExprs, arg0: List[tensorrt.tensorrt.IDimensionExpr]) -> None
__init__(self: tensorrt.tensorrt.DimsExprs, arg0: int) -> None

class tensorrt.IExprBuilder(self: tensorrt.tensorrt.IExprBuilder) → None

Object for constructing IDimensionExpr.

There is no public way to construct an IExprBuilder. It appears as an argument to method IPluginV2DynamicExt::get_output_dimensions(). Overrides of that method can use that IExprBuilder argument to construct expressions that define output dimensions in terms of input dimensions.

Clients should assume that any values constructed by the IExprBuilder are destroyed after IPluginV2DynamicExt::get_output_dimensions() returns.

constant(self: tensorrt.tensorrt.IExprBuilder, arg0: int) → tensorrt.tensorrt.IDimensionExpr: Return an IDimensionExpr for the given value.

declare_size_tensor(self: tensorrt.tensorrt.IExprBuilder, arg0: int, arg1: tensorrt.tensorrt.IDimensionExpr, arg2: tensorrt.tensorrt.IDimensionExpr) → tensorrt.tensorrt.IDimensionExpr

Declare a size tensor at the given output index, with the specified auto-tuning formula and upper bound.

A size tensor allows a plugin to have output dimensions that cannot be computed solely from input dimensions. For example, suppose a plugin implements the equivalent of INonZeroLayer for 2D input. The plugin can have one output for the indices of non-zero elements, and a second output containing the number of non-zero elements. Suppose the input has size [M,N] and has K non-zero elements. The plugin can write K to the second output. When telling TensorRT that the first output has shape [2,K], the plugin uses IExprBuilder.constant() and IExprBuilder.declare_size_tensor(1,…) to create the IDimensionExprs that respectively denote 2 and K.

TensorRT also needs to know the value of K to use for auto-tuning and an upper bound on K so that it can allocate memory for the output tensor. In the example, suppose typically half of the plugin’s input elements are non-zero, and all the elements might be non-zero. Then using M*N/2 might be a good expression for the opt parameter, and M*N for the upper bound. The IDimensionExprs for these expressions can be constructed from the IDimensionExprs for the input dimensions.
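The arithmetic of this example can be worked through in plain Python. The sketch does not use TensorRT: nonzero_indices is a hypothetical helper that mimics what the plugin would compute at runtime, and opt_k and upper_k are the concrete values that the M*N/2 and M*N expressions would take for this particular input.

```python
def nonzero_indices(matrix):
    """Return ([row_indices, col_indices], K) for the non-zero entries."""
    rows, cols = [], []
    for r, row in enumerate(matrix):
        for c, value in enumerate(row):
            if value != 0:
                rows.append(r)
                cols.append(c)
    return [rows, cols], len(rows)


matrix = [[0, 3, 0],
          [5, 0, 7]]
M, N = len(matrix), len(matrix[0])

indices, K = nonzero_indices(matrix)   # first output has shape [2, K]
opt_k = M * N // 2                     # value used for auto-tuning
upper_k = M * N                        # bound used to allocate the output
```

Here K is 3, so the first output has shape [2, 3] while memory is reserved for the [2, 6] worst case.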

operation(self: tensorrt.tensorrt.IExprBuilder, arg0: tensorrt.tensorrt.DimensionOperation, arg1: tensorrt.tensorrt.IDimensionExpr, arg2: tensorrt.tensorrt.IDimensionExpr) → tensorrt.tensorrt.IDimensionExpr: Return an IDimensionExpr that represents the given operation (arg0) applied to the first and second operands (arg1 and arg2). Returns None if arg0 is not a valid DimensionOperation.

class tensorrt.DimensionOperation(self: tensorrt.tensorrt.DimensionOperation, value: int) → None

An operation on two IDimensionExprs, which represent integer expressions used in dimension computations.

For example, given two IDimensionExprs x and y and an IExprBuilder eb, eb.operation(DimensionOperation.SUM, x, y) creates a representation of x + y.
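The integer meaning of each member can be illustrated with a small evaluator. This is a sketch, not TensorRT code: Op is a stand-in enum and evaluate is a hypothetical helper working on plain integers instead of IDimensionExpr objects. It assumes that EQUAL and LESS yield 1 or 0, as described for the C++ DimensionOperation enum, and that operands are non-negative, as tensor dimensions are.

```python
from enum import Enum, auto


class Op(Enum):
    """Stand-in for tensorrt.DimensionOperation."""
    SUM = auto()
    PROD = auto()
    MAX = auto()
    MIN = auto()
    SUB = auto()
    EQUAL = auto()
    LESS = auto()
    FLOOR_DIV = auto()
    CEIL_DIV = auto()


_RULES = {
    Op.SUM: lambda x, y: x + y,
    Op.PROD: lambda x, y: x * y,
    Op.MAX: lambda x, y: max(x, y),
    Op.MIN: lambda x, y: min(x, y),
    Op.SUB: lambda x, y: x - y,
    Op.EQUAL: lambda x, y: int(x == y),     # 1 if equal, otherwise 0
    Op.LESS: lambda x, y: int(x < y),       # 1 if x < y, otherwise 0
    Op.FLOOR_DIV: lambda x, y: x // y,
    Op.CEIL_DIV: lambda x, y: -(-x // y),   # division rounding up
}


def evaluate(op, x, y):
    # Apply the integer rule for the requested operation.
    return _RULES[op](x, y)


padded_rows = evaluate(Op.CEIL_DIV, 7, 2)
```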

Members:

SUM

PROD

MAX

MIN

SUB

EQUAL

LESS

FLOOR_DIV

CEIL_DIV

property name

class tensorrt.PluginCapabilityType(self: tensorrt.tensorrt.PluginCapabilityType, value: int) → None

Enumerates the different capability types an IPluginV3 object may have.

Members:

CORE

BUILD

RUNTIME

property name

class tensorrt.TensorRTPhase(self: tensorrt.tensorrt.TensorRTPhase, value: int) → None

Indicates a phase of operation of TensorRT.

Members:

BUILD

RUNTIME

property name