TensorRT 10.4.0
|
Similar to IPluginV2Ext, but with support for dynamic shapes.
#include <NvInferRuntime.h>
Public Member Functions | |
IPluginV2DynamicExt * | clone () const noexcept override=0 |
Clone the plugin object. This copies over internal plugin parameters as well and returns a new plugin object with these parameters. If the source plugin is pre-configured with configurePlugin(), the returned object must also be pre-configured. The returned object must allow attachToContext() with a new execution context. Cloned plugin objects can share the same per-engine immutable resource (e.g. weights) with the source object (e.g. via ref-counting) to avoid duplication. More... | |
virtual DimsExprs | getOutputDimensions (int32_t outputIndex, DimsExprs const *inputs, int32_t nbInputs, IExprBuilder &exprBuilder) noexcept=0 |
Get expressions for computing dimensions of an output tensor from dimensions of the input tensors. More... | |
virtual bool | supportsFormatCombination (int32_t pos, PluginTensorDesc const *inOut, int32_t nbInputs, int32_t nbOutputs) noexcept=0 |
Return true if plugin supports the format and datatype for the input/output indexed by pos. More... | |
virtual void | configurePlugin (DynamicPluginTensorDesc const *in, int32_t nbInputs, DynamicPluginTensorDesc const *out, int32_t nbOutputs) noexcept=0 |
Configure the plugin. More... | |
virtual size_t | getWorkspaceSize (PluginTensorDesc const *inputs, int32_t nbInputs, PluginTensorDesc const *outputs, int32_t nbOutputs) const noexcept=0 |
Find the workspace size required by the layer. More... | |
virtual int32_t | enqueue (PluginTensorDesc const *inputDesc, PluginTensorDesc const *outputDesc, void const *const *inputs, void *const *outputs, void *workspace, cudaStream_t stream) noexcept=0 |
Execute the layer. More... | |
Public Member Functions inherited from nvinfer1::IPluginV2Ext | |
virtual nvinfer1::DataType | getOutputDataType (int32_t index, nvinfer1::DataType const *inputTypes, int32_t nbInputs) const noexcept=0 |
Return the DataType of the plugin output at the requested index. More... | |
IPluginV2Ext ()=default | |
~IPluginV2Ext () override=default | |
virtual void | attachToContext (cudnnContext *, cublasContext *, IGpuAllocator *) noexcept |
Attach the plugin object to an execution context and grant the plugin the access to some context resources. More... | |
virtual void | detachFromContext () noexcept |
Detach the plugin object from its execution context. More... | |
Public Member Functions inherited from nvinfer1::IPluginV2 | |
virtual AsciiChar const * | getPluginType () const noexcept=0 |
Return the plugin type. Should match the plugin name returned by the corresponding plugin creator. More... | |
virtual AsciiChar const * | getPluginVersion () const noexcept=0 |
Return the plugin version. Should match the plugin version returned by the corresponding plugin creator. More... | |
virtual int32_t | getNbOutputs () const noexcept=0 |
Get the number of outputs from the layer. More... | |
virtual int32_t | initialize () noexcept=0 |
Initialize the layer for execution. This is called when the engine is created. More... | |
virtual void | terminate () noexcept=0 |
Release resources acquired during plugin layer initialization. This is called when the engine is destroyed. More... | |
virtual size_t | getSerializationSize () const noexcept=0 |
Find the size of the serialization buffer required to store the plugin configuration in a binary file. More... | |
virtual void | serialize (void *buffer) const noexcept=0 |
Serialize the layer. More... | |
virtual void | destroy () noexcept=0 |
Destroy the plugin object. This will be called when the network, builder or engine is destroyed. More... | |
virtual void | setPluginNamespace (AsciiChar const *pluginNamespace) noexcept=0 |
Set the namespace that this plugin object belongs to. Ideally, all plugin objects from the same plugin library must have the same namespace. More... | |
virtual AsciiChar const * | getPluginNamespace () const noexcept=0 |
Return the namespace of the plugin object. More... | |
Static Public Attributes | |
static constexpr int32_t | kFORMAT_COMBINATION_LIMIT = 100 |
Limit on number of format combinations accepted. More... | |
Protected Member Functions | |
int32_t | getTensorRTVersion () const noexcept override |
Return the API version with which this plugin was built. The upper byte is reserved by TensorRT and is used to differentiate this from IPluginV2. More... | |
virtual | ~IPluginV2DynamicExt () noexcept |
Protected Member Functions inherited from nvinfer1::IPluginV2Ext | |
int32_t | getTensorRTVersion () const noexcept override |
Return the API version with which this plugin was built. The upper byte is reserved by TensorRT and is used to differentiate this from IPluginV2. More... | |
void | configureWithFormat (Dims const *, int32_t, Dims const *, int32_t, DataType, PluginFormat, int32_t) noexcept override |
Derived classes must not implement this. In a C++11 API it would be override final. More... | |
Similar to IPluginV2Ext, but with support for dynamic shapes.
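Clients should override the public methods, including the inherited pure virtual methods listed above, such as getNbOutputs(), getOutputDataType(), getSerializationSize(), serialize(), destroy(), setPluginNamespace() and getPluginNamespace().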
For weakly typed networks, the inputTypes passed to getOutputDataType will always be DataType::kFLOAT or DataType::kINT32, and the returned type is canonicalized to DataType::kFLOAT if it is DataType::kHALF or DataType::kINT8. For strongly typed networks, inputTypes are inferred from previous operations, and getOutputDataType specifies the returned type based on the inputTypes. Details about the floating-point precision are elicited later by the supportsFormatCombination method.
~IPluginV2DynamicExt () | inline, protected, virtual, noexcept |
IPluginV2DynamicExt * clone () const | override, pure virtual, noexcept |
Clone the plugin object. This copies over internal plugin parameters as well and returns a new plugin object with these parameters. If the source plugin is pre-configured with configurePlugin(), the returned object must also be pre-configured. The returned object must allow attachToContext() with a new execution context. Cloned plugin objects can share the same per-engine immutable resource (e.g. weights) with the source object (e.g. via ref-counting) to avoid duplication.
Usage considerations
Implements nvinfer1::IPluginV2Ext.
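The sharing of per-engine immutable resources described above can be sketched with std::shared_ptr. This is a minimal illustration only: ScalePluginSketch, its members and configure() are hypothetical names, and the class does not derive from IPluginV2DynamicExt so that it compiles without the TensorRT headers. In a real plugin the method would be declared as IPluginV2DynamicExt* clone() const noexcept override.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <new>
#include <utility>
#include <vector>

// Hypothetical plugin-like class (not a TensorRT type).
class ScalePluginSketch
{
public:
    using Weights = std::shared_ptr<std::vector<float> const>;

    ScalePluginSketch(float scale, Weights weights)
        : mScale(scale)
        , mWeights(std::move(weights))
    {
        assert(mWeights != nullptr);
    }

    // Copies the plugin parameters and the configured state. The immutable
    // weights are shared through the reference count instead of being copied.
    ScalePluginSketch* clone() const noexcept
    {
        auto* copy = new (std::nothrow) ScalePluginSketch(mScale, mWeights);
        if (copy != nullptr)
        {
            copy->mConfigured = mConfigured;
            copy->mMaxBatch = mMaxBatch;
        }
        return copy; // nullptr signals an allocation failure
    }

    // Stand-in for the state that configurePlugin() would establish.
    void configure(int32_t maxBatch) noexcept
    {
        mMaxBatch = maxBatch;
        mConfigured = true;
    }

    bool isConfigured() const noexcept { return mConfigured; }
    int32_t maxBatch() const noexcept { return mMaxBatch; }
    float scale() const noexcept { return mScale; }
    std::vector<float> const* weights() const noexcept { return mWeights.get(); }

private:
    float mScale;
    Weights mWeights;
    bool mConfigured{false};
    int32_t mMaxBatch{0};
};
```

A clone of a configured object reports isConfigured() and points at the same weight buffer as its source. Deleting the clone only decrements the reference count.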
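virtual void configurePlugin (DynamicPluginTensorDesc const *in, int32_t nbInputs, DynamicPluginTensorDesc const *out, int32_t nbOutputs) | pure virtual, noexcept |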
Configure the plugin.
configurePlugin() can be called multiple times in both the build and execution phases. The build phase happens before initialize() is called and only occurs during creation of an engine by IBuilder. The execution phase happens after initialize() is called and occurs during both creation of an engine by IBuilder and execution of an engine by IExecutionContext.
Build phase: configurePlugin() is called when a plugin is being prepared for profiling but not for any specific input size. This provides an opportunity for the plugin to make algorithmic choices on the basis of input and output formats, along with the bounds of possible dimensions. The min and max values of the DynamicPluginTensorDesc correspond to the kMIN and kMAX values of the current profile that the plugin is being profiled for, and the desc.dims field corresponds to the dimensions of the plugin specified at network creation. Wildcard dimensions will exist during this phase in the desc.dims field.
Execution phase: configurePlugin() is called when a plugin is being prepared for execution with specific dimensions. This provides an opportunity for the plugin to change algorithmic choices based on the explicit input dimensions stored in the desc.dims field.
in | The input tensor attributes that are used for configuration. |
nbInputs | Number of input tensors. |
out | The output tensor attributes that are used for configuration. |
nbOutputs | Number of output tensors. |
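The difference between the two phases can be sketched as follows. The structs below are simplified stand-ins that mirror only a few fields of the TensorRT types of the same name, so that the sketch compiles without NvInferRuntime.h. TileChoice, chooseTile() and the tile widths 128 and 32 are hypothetical. The sketch assumes wildcard dimensions are encoded as -1 and falls back to the profile maximum whenever desc.dims contains one.

```cpp
#include <cassert>
#include <cstdint>

// Simplified stand-ins (assumptions), not the real TensorRT definitions.
struct Dims
{
    int32_t nbDims;
    int64_t d[8];
};
struct PluginTensorDesc
{
    Dims dims;
};
struct DynamicPluginTensorDesc
{
    PluginTensorDesc desc;
    Dims min;
    Dims max;
};

// True if any extent is the wildcard value -1, meaning it is not yet known.
bool hasWildcard(Dims const& dims) noexcept
{
    for (int32_t i = 0; i < dims.nbDims; ++i)
    {
        if (dims.d[i] == -1)
        {
            return true;
        }
    }
    return false;
}

// Hypothetical algorithmic choice made from within configurePlugin().
struct TileChoice
{
    int64_t tile;    // tile width selected for the innermost dimension
    bool usedBounds; // true if the choice came from the profile maximum
};

TileChoice chooseTile(DynamicPluginTensorDesc const& in) noexcept
{
    assert(in.desc.dims.nbDims > 0);
    int32_t const last = in.desc.dims.nbDims - 1;
    bool const wildcard = hasWildcard(in.desc.dims);
    // With wildcards only the profile bounds are meaningful, so plan for the
    // maximum. Otherwise desc.dims holds the explicit extent to tune for.
    int64_t const extent = wildcard ? in.max.d[last] : in.desc.dims.d[last];
    return TileChoice{extent >= 128 ? 128 : 32, wildcard};
}
```

A real override would store the result in a member so that enqueue() can use it, and would make the same kind of decision for each input and output it cares about.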
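virtual int32_t enqueue (PluginTensorDesc const *inputDesc, PluginTensorDesc const *outputDesc, void const *const *inputs, void *const *outputs, void *workspace, cudaStream_t stream) | pure virtual, noexcept |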
Execute the layer.
inputDesc | How to interpret the memory for the input tensors. |
outputDesc | How to interpret the memory for the output tensors. |
inputs | The memory for the input tensors. |
outputs | The memory for the output tensors. |
workspace | Workspace for execution. |
stream | The stream in which to execute the kernels. |
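virtual DimsExprs getOutputDimensions (int32_t outputIndex, DimsExprs const *inputs, int32_t nbInputs, IExprBuilder &exprBuilder) | pure virtual, noexcept |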
Get expressions for computing dimensions of an output tensor from dimensions of the input tensors.
outputIndex | The index of the output tensor |
inputs | Expressions for dimensions of the input tensors |
nbInputs | The number of input tensors |
exprBuilder | Object for generating new expressions |
This function is called by the implementations of IBuilder during analysis of the network.
Example #1: A plugin has a single output that transposes the last two dimensions of the plugin's single input. The body of the override of getOutputDimensions can be:
DimsExprs output(inputs[0]);
std::swap(output.d[output.nbDims-1], output.d[output.nbDims-2]);
return output;
Example #2: A plugin concatenates its two inputs along the first dimension. The body of the override of getOutputDimensions can be:
DimsExprs output(inputs[0]);
output.d[0] = exprBuilder.operation(DimensionOperation::kSUM, *inputs[0].d[0], *inputs[1].d[0]);
return output;
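To see what the expressions in the two examples evaluate to once concrete input shapes are known, the same logic can be written with plain integers. This is an analogue only: Shape, transposeLastTwo() and concatFirstDim() are hypothetical helpers, not part of the TensorRT API, and a real override must build IDimensionExpr objects through exprBuilder as shown above.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical concrete shape, standing in for an evaluated DimsExprs.
using Shape = std::vector<int64_t>;

// Analogue of Example #1: swap the last two dimensions of the single input.
Shape transposeLastTwo(Shape const& input)
{
    assert(input.size() >= 2);
    Shape output(input);
    std::swap(output[output.size() - 1], output[output.size() - 2]);
    return output;
}

// Analogue of Example #2: concatenate two inputs along the first dimension.
// DimensionOperation::kSUM on the leading extents becomes an integer addition.
Shape concatFirstDim(Shape const& first, Shape const& second)
{
    assert(!first.empty() && first.size() == second.size());
    Shape output(first);
    output[0] = first[0] + second[0];
    return output;
}
```

For an input of shape [2, 3, 5], the transpose analogue yields [2, 5, 3]. For inputs of shape [4, 7] and [6, 7], the concatenation analogue yields [10, 7].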
int32_t getTensorRTVersion () const | inline, override, protected, virtual, noexcept |
Return the API version with which this plugin was built. The upper byte is reserved by TensorRT and is used to differentiate this from IPluginV2.
Do not override this method as it is used by the TensorRT library to maintain backwards-compatibility with plugins.
Reimplemented from nvinfer1::IPluginV2.
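virtual size_t getWorkspaceSize (PluginTensorDesc const *inputs, int32_t nbInputs, PluginTensorDesc const *outputs, int32_t nbOutputs) const | pure virtual, noexcept |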
Find the workspace size required by the layer.
This function is called after the plugin is configured, and possibly during execution. The result should be a sufficient workspace size to deal with inputs and outputs of the given size or any smaller problem.
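One way to satisfy the "given size or any smaller problem" requirement is to make the result grow monotonically with every input extent. The sketch below assumes a hypothetical plugin that needs one float of scratch space per input element, rounded up to a 256-byte boundary. The structs are simplified stand-ins for the TensorRT types of the same name, and volume(), alignUp() and workspaceBytes() are illustrative helpers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Simplified stand-ins (assumptions), not the real TensorRT definitions.
struct Dims
{
    int32_t nbDims;
    int64_t d[8];
};
struct PluginTensorDesc
{
    Dims dims;
};

// Number of elements described by dims. All extents must be known here.
size_t volume(Dims const& dims) noexcept
{
    size_t count = 1;
    for (int32_t i = 0; i < dims.nbDims; ++i)
    {
        assert(dims.d[i] >= 0);
        count *= static_cast<size_t>(dims.d[i]);
    }
    return count;
}

// Round bytes up to the next multiple of alignment.
size_t alignUp(size_t bytes, size_t alignment) noexcept
{
    return (bytes + alignment - 1) / alignment * alignment;
}

// Hypothetical requirement: one float of scratch space per input element.
// The result never shrinks when an extent grows, so a size computed for the
// given shapes also covers any smaller problem.
size_t workspaceBytes(PluginTensorDesc const* inputs, int32_t nbInputs) noexcept
{
    size_t elements = 0;
    for (int32_t i = 0; i < nbInputs; ++i)
    {
        elements += volume(inputs[i].dims);
    }
    return alignUp(elements * sizeof(float), 256);
}
```

A real getWorkspaceSize override would call such a helper with its inputs and nbInputs arguments and could take the output descriptors into account in the same way.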
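virtual bool supportsFormatCombination (int32_t pos, PluginTensorDesc const *inOut, int32_t nbInputs, int32_t nbOutputs) | pure virtual, noexcept |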
Return true if plugin supports the format and datatype for the input/output indexed by pos.
For this method inputs are numbered 0..(nbInputs-1) and outputs are numbered nbInputs..(nbInputs+nbOutputs-1). Using this numbering, pos is an index into inOut, where 0 <= pos < nbInputs+nbOutputs.
TensorRT invokes this method to ask if the input/output indexed by pos supports the format/datatype specified by inOut[pos].format and inOut[pos].type. The override should return true if the format/datatype at inOut[pos] is supported by the plugin. If support is conditional on other input/output formats/datatypes, the plugin can make its result conditional on the formats/datatypes in inOut[0..pos-1], which will be set to values that the plugin supports. The override should not inspect inOut[pos+1..nbInputs+nbOutputs-1], which will have invalid values. In other words, the decision for pos must be based on inOut[0..pos] only.
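Some examples:
A plugin that supports only FP16 in linear format:
return inOut[pos].format == TensorFormat::kLINEAR && inOut[pos].type == DataType::kHALF;
A plugin that supports only FP16 in linear format for its two inputs and FP32 in linear format for its remaining inputs/outputs:
return inOut[pos].format == TensorFormat::kLINEAR && (inOut[pos].type == (pos < 2 ? DataType::kHALF : DataType::kFLOAT));
A polymorphic plugin that supports any format or type, provided that all inputs and outputs share the same format and type:
return pos == 0 || (inOut[pos].format == inOut[0].format && inOut[pos].type == inOut[0].type);
Warning: TensorRT will stop asking for formats once it has found kFORMAT_COMBINATION_LIMIT combinations.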
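The rule that the decision for pos may depend only on inOut[0..pos] can be sketched as a free function. The enums and the struct are simplified stand-ins for the TensorRT types of the same name, reduced to the fields and values used here. The plugin behavior (two inputs, one output, linear format only, FP32 or FP16 with all tensors sharing one type) is a hypothetical example.

```cpp
#include <cassert>
#include <cstdint>

// Simplified stand-ins (assumptions), not the real TensorRT definitions.
enum class DataType : int32_t
{
    kFLOAT = 0,
    kHALF = 1,
    kINT8 = 2,
    kINT32 = 3
};
enum class TensorFormat : int32_t
{
    kLINEAR = 0,
    kCHW2 = 1
};
struct PluginTensorDesc
{
    DataType type;
    TensorFormat format;
};

// Hypothetical plugin with two inputs and one output.
bool supportsFormatCombination(
    int32_t pos, PluginTensorDesc const* inOut, int32_t nbInputs, int32_t nbOutputs) noexcept
{
    assert(nbInputs == 2 && nbOutputs == 1);
    assert(pos >= 0 && pos < nbInputs + nbOutputs);

    PluginTensorDesc const& desc = inOut[pos];
    if (desc.format != TensorFormat::kLINEAR)
    {
        return false;
    }
    if (pos == 0)
    {
        // The first input is free to choose between FP32 and FP16.
        return desc.type == DataType::kFLOAT || desc.type == DataType::kHALF;
    }
    // Later positions must match the type already accepted for input 0.
    // Only entries at indices <= pos are read; later entries are not valid.
    return desc.type == inOut[0].type;
}
```

With inOut[0] set to FP16 linear, a query for pos 1 succeeds only if inOut[1] is also FP16 linear, and the same holds for the output at pos 2.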
static constexpr int32_t kFORMAT_COMBINATION_LIMIT = 100 | static, constexpr |
Limit on number of format combinations accepted.
Copyright © 2024 NVIDIA Corporation