TensorRT 8.6.1
|
An engine for executing inference on a built network, with functionally unsafe features.
#include <NvInferRuntime.h>
Public Member Functions | |
virtual | ~ICudaEngine () noexcept=default |
TRT_DEPRECATED int32_t | getNbBindings () const noexcept |
Get the number of binding indices. More... | |
TRT_DEPRECATED int32_t | getBindingIndex (char const *name) const noexcept |
Retrieve the binding index for a named tensor. More... | |
TRT_DEPRECATED char const * | getBindingName (int32_t bindingIndex) const noexcept |
Retrieve the name corresponding to a binding index. More... | |
TRT_DEPRECATED bool | bindingIsInput (int32_t bindingIndex) const noexcept |
Determine whether a binding is an input binding. More... | |
TRT_DEPRECATED Dims | getBindingDimensions (int32_t bindingIndex) const noexcept |
Get the dimensions of a binding. More... | |
Dims | getTensorShape (char const *tensorName) const noexcept |
Get shape of an input or output tensor. More... | |
TRT_DEPRECATED DataType | getBindingDataType (int32_t bindingIndex) const noexcept |
Determine the required data type for a buffer from its binding index. More... | |
DataType | getTensorDataType (char const *tensorName) const noexcept |
Determine the required data type for a buffer from its tensor name. More... | |
TRT_DEPRECATED int32_t | getMaxBatchSize () const noexcept |
Get the maximum batch size which can be used for inference. Should only be called if the engine is built from an INetworkDefinition with implicit batch dimension mode. More... | |
int32_t | getNbLayers () const noexcept |
Get the number of layers in the network. More... | |
IHostMemory * | serialize () const noexcept |
Serialize the network to a stream. More... | |
IExecutionContext * | createExecutionContext () noexcept |
Create an execution context. More... | |
TRT_DEPRECATED void | destroy () noexcept |
Destroy this object. More... | |
TRT_DEPRECATED TensorLocation | getLocation (int32_t bindingIndex) const noexcept |
Get location of binding. More... | |
TensorLocation | getTensorLocation (char const *tensorName) const noexcept |
Get whether an input or output tensor must be on GPU or CPU. More... | |
bool | isShapeInferenceIO (char const *tensorName) const noexcept |
True if tensor is required as input for shape calculations or is output from shape calculations. More... | |
TensorIOMode | getTensorIOMode (char const *tensorName) const noexcept |
Determine whether a tensor is an input or output tensor. More... | |
IExecutionContext * | createExecutionContextWithoutDeviceMemory () noexcept |
Create an execution context without any device memory allocated. More... | |
size_t | getDeviceMemorySize () const noexcept |
Return the amount of device memory required by an execution context. More... | |
bool | isRefittable () const noexcept |
Return true if an engine can be refit. More... | |
TRT_DEPRECATED int32_t | getBindingBytesPerComponent (int32_t bindingIndex) const noexcept |
Return the number of bytes per component of an element. More... | |
int32_t | getTensorBytesPerComponent (char const *tensorName) const noexcept |
Return the number of bytes per component of an element, or -1 if the provided name does not map to an input or output tensor. More... | |
int32_t | getTensorBytesPerComponent (char const *tensorName, int32_t profileIndex) const noexcept |
Return the number of bytes per component of an element of given profile, or -1 if the provided name does not map to an input or output tensor. More... | |
TRT_DEPRECATED int32_t | getBindingComponentsPerElement (int32_t bindingIndex) const noexcept |
Return the number of components included in one element. More... | |
int32_t | getTensorComponentsPerElement (char const *tensorName) const noexcept |
Return the number of components included in one element, or -1 if the provided name does not map to an input or output tensor. More... | |
int32_t | getTensorComponentsPerElement (char const *tensorName, int32_t profileIndex) const noexcept |
Return the number of components included in one element of given profile, or -1 if the provided name does not map to an input or output tensor. More... | |
TRT_DEPRECATED TensorFormat | getBindingFormat (int32_t bindingIndex) const noexcept |
Return the binding format. More... | |
TensorFormat | getTensorFormat (char const *tensorName) const noexcept |
Return the tensor format, or TensorFormat::kLINEAR if the provided name does not map to an input or output tensor. More... | |
TensorFormat | getTensorFormat (char const *tensorName, int32_t profileIndex) const noexcept |
Return the tensor format of given profile, or TensorFormat::kLINEAR if the provided name does not map to an input or output tensor. More... | |
TRT_DEPRECATED char const * | getBindingFormatDesc (int32_t bindingIndex) const noexcept |
Return the human readable description of the tensor format, or nullptr if the provided name does not map to an input or output tensor. More... | |
char const * | getTensorFormatDesc (char const *tensorName) const noexcept |
Return the human readable description of the tensor format, or empty string if the provided name does not map to an input or output tensor. More... | |
char const * | getTensorFormatDesc (char const *tensorName, int32_t profileIndex) const noexcept |
Return the human readable description of the tensor format of given profile, or empty string if the provided name does not map to an input or output tensor. More... | |
TRT_DEPRECATED int32_t | getBindingVectorizedDim (int32_t bindingIndex) const noexcept |
Return the dimension index that the buffer is vectorized, or -1 if the name is not found. More... | |
int32_t | getTensorVectorizedDim (char const *tensorName) const noexcept |
Return the dimension index that the buffer is vectorized, or -1 if the provided name does not map to an input or output tensor. More... | |
int32_t | getTensorVectorizedDim (char const *tensorName, int32_t profileIndex) const noexcept |
Return the dimension index that the buffer is vectorized of given profile, or -1 if the provided name does not map to an input or output tensor. More... | |
char const * | getName () const noexcept |
Returns the name of the network associated with the engine. More... | |
int32_t | getNbOptimizationProfiles () const noexcept |
Get the number of optimization profiles defined for this engine. More... | |
TRT_DEPRECATED Dims | getProfileDimensions (int32_t bindingIndex, int32_t profileIndex, OptProfileSelector select) const noexcept |
Get the minimum / optimum / maximum dimensions for a particular input binding under an optimization profile. More... | |
Dims | getProfileShape (char const *tensorName, int32_t profileIndex, OptProfileSelector select) const noexcept |
Get the minimum / optimum / maximum dimensions for an input tensor given its name under an optimization profile. More... | |
TRT_DEPRECATED int32_t const * | getProfileShapeValues (int32_t profileIndex, int32_t inputIndex, OptProfileSelector select) const noexcept |
Get minimum / optimum / maximum values for an input shape binding under an optimization profile. More... | |
TRT_DEPRECATED bool | isShapeBinding (int32_t bindingIndex) const noexcept |
True if tensor is required as input for shape calculations or output from them. More... | |
TRT_DEPRECATED bool | isExecutionBinding (int32_t bindingIndex) const noexcept |
True if pointer to tensor data is required for execution phase, false if nullptr can be supplied. More... | |
EngineCapability | getEngineCapability () const noexcept |
Determine what execution capability this engine has. More... | |
void | setErrorRecorder (IErrorRecorder *recorder) noexcept |
Set the ErrorRecorder for this interface. More... | |
IErrorRecorder * | getErrorRecorder () const noexcept |
Get the ErrorRecorder assigned to this interface. More... | |
bool | hasImplicitBatchDimension () const noexcept |
Query whether the engine was built with an implicit batch dimension. More... | |
TacticSources | getTacticSources () const noexcept |
Return the tactic sources required by this engine. More... | |
ProfilingVerbosity | getProfilingVerbosity () const noexcept |
Return the ProfilingVerbosity the builder config was set to when the engine was built. More... | |
IEngineInspector * | createEngineInspector () const noexcept |
Create a new engine inspector which prints the layer information in an engine or an execution context. More... | |
int32_t | getNbIOTensors () const noexcept |
Return number of IO tensors. More... | |
char const * | getIOTensorName (int32_t index) const noexcept |
Return name of an IO tensor. More... | |
HardwareCompatibilityLevel | getHardwareCompatibilityLevel () const noexcept |
Return the hardware compatibility level of this engine. More... | |
int32_t | getNbAuxStreams () const noexcept |
Return the number of auxiliary streams used by this engine. More... | |
Protected Attributes | |
apiv::VCudaEngine * | mImpl |
Additional Inherited Members | |
INoCopy ()=default | |
virtual | ~INoCopy ()=default |
INoCopy (INoCopy const &other)=delete | |
INoCopy & | operator= (INoCopy const &other)=delete |
INoCopy (INoCopy &&other)=delete | |
INoCopy & | operator= (INoCopy &&other)=delete |
An engine for executing inference on a built network, with functionally unsafe features.
|
virtualdefaultnoexcept |
|
inlinenoexcept |
Determine whether a binding is an input binding.
bindingIndex | The binding index. |
|
inlinenoexcept |
Create a new engine inspector which prints the layer information in an engine or an execution context.
|
inlinenoexcept |
Create an execution context.
The execution context created will call setOptimizationProfile(0) implicitly if there are no other execution contexts assigned to optimization profile 0. This functionality is deprecated in TensorRT 8.6 and will instead default all optimization profiles to 0 starting in TensorRT 9.0. If an error recorder has been set for the engine, it will also be passed to the execution context.
|
inlinenoexcept |
Create an execution context without any device memory allocated.
The memory for execution of this device context must be supplied by the application.
|
inlinenoexcept |
Destroy this object.
Deprecated: use delete instead.
|
inlinenoexcept |
Return the number of bytes per component of an element.
The vector component size is returned if getBindingVectorizedDim() != -1.
bindingIndex | The binding Index. |
|
inlinenoexcept |
Return the number of components included in one element.
The number of elements in the vectors is returned if getBindingVectorizedDim() != -1.
bindingIndex | The binding Index. |
|
inlinenoexcept |
Determine the required data type for a buffer from its binding index.
bindingIndex | The binding index. |
|
inlinenoexcept |
Get the dimensions of a binding.
bindingIndex | The binding index. |
For example, suppose an INetworkDefinition has an input with shape [-1,-1] that becomes a binding b in the engine. If the associated optimization profile specifies that b has minimum dimensions as [6,9] and maximum dimensions [7,9], getBindingDimensions(b) returns [-1,9], despite the second dimension being dynamic in the INetworkDefinition.
Because each optimization profile has separate bindings, the returned value can differ across profiles. Consider another binding b' for the same network input, but for another optimization profile. If that other profile specifies minimum dimensions [5,8] and maximum dimensions [5,9], getBindingDimensions(b') returns [5,-1].
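The two examples above are consistent with a simple rule: a dynamic network dimension is reported as fixed when the profile's minimum and maximum agree, and as -1 otherwise. The sketch below illustrates that reading only. `mergeProfileDims` is a hypothetical helper, not part of TensorRT, and it uses `std::vector<int>` in place of `Dims` so that it compiles without the TensorRT headers.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, not a TensorRT function. For a network input whose
// dimensions are all dynamic, combine a profile's minimum and maximum
// dimensions: a dimension is fixed when min == max, and -1 (dynamic) otherwise.
std::vector<int> mergeProfileDims(std::vector<int> const& minDims, std::vector<int> const& maxDims)
{
    assert(minDims.size() == maxDims.size());
    std::vector<int> merged(minDims.size());
    for (std::size_t i = 0; i < minDims.size(); ++i)
    {
        merged[i] = (minDims[i] == maxDims[i]) ? minDims[i] : -1;
    }
    return merged;
}
```

With the profiles from the text, `mergeProfileDims({6, 9}, {7, 9})` yields `[-1, 9]` and `mergeProfileDims({5, 8}, {5, 9})` yields `[5, -1]`.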
|
inlinenoexcept |
Return the binding format.
bindingIndex | The binding Index. |
|
inlinenoexcept |
Return the human readable description of the tensor format, or nullptr if the provided name does not map to an input or output tensor.
The description includes the order, vectorization, data type, and strides. Examples are shown as follows:
Example 1: kCHW + FP32 "Row major linear FP32 format"
Example 2: kCHW2 + FP16 "Two wide channel vectorized row major FP16 format"
Example 3: kHWC8 + FP16 + Line Stride = 32 "Channel major FP16 format where C % 8 == 0 and H Stride % 32 == 0"
bindingIndex | The binding Index. |
|
inlinenoexcept |
Retrieve the binding index for a named tensor.
IExecutionContext::enqueueV2() and IExecutionContext::executeV2() require an array of buffers.
Engine bindings map from tensor names to indices in this array. Binding indices are assigned at engine build time, and take values in the range [0 ... n-1] where n is the total number of inputs and outputs.
To get the binding index of the name in an optimization profile with index k > 0, mangle the name by appending " [profile k]", as described for method getBindingName().
name | The tensor name. |
|
inlinenoexcept |
Retrieve the name corresponding to a binding index.
This is the reverse mapping to that provided by getBindingIndex().
For optimization profiles with an index k > 0, the name is mangled by appending " [profile k]", with k written in decimal. For example, if the tensor in the INetworkDefinition had the name "foo", and bindingIndex refers to that tensor in the optimization profile with index 3, getBindingName returns "foo [profile 3]".
bindingIndex | The binding index. |
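The mangling rule described above can be reproduced on the application side when building a name to pass to getBindingIndex(). The following is a minimal sketch; `mangledBindingName` is a hypothetical helper name, not a TensorRT function.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper, not a TensorRT function. Builds the binding name for a
// tensor under a given optimization profile: unchanged for profile 0, and with
// " [profile k]" appended (k in decimal) for profiles with index k > 0.
std::string mangledBindingName(std::string const& tensorName, int profileIndex)
{
    assert(profileIndex >= 0);
    if (profileIndex == 0)
    {
        return tensorName;
    }
    return tensorName + " [profile " + std::to_string(profileIndex) + "]";
}
```

For the tensor "foo" and profile 3 this produces "foo [profile 3]", matching the example in the text.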
|
inlinenoexcept |
Return the dimension index that the buffer is vectorized, or -1 if the name is not found.
Specifically -1 is returned if scalars per vector is 1.
bindingIndex | The binding Index. |
|
inlinenoexcept |
Return the amount of device memory required by an execution context.
|
inlinenoexcept |
Determine what execution capability this engine has.
If the engine has EngineCapability::kSTANDARD, then all engine functionality is valid. If the engine has EngineCapability::kSAFETY, then only the functionality in safe engine is valid. If the engine has EngineCapability::kDLA_STANDALONE, then only serialize, destroy, and const-accessor functions are valid.
|
inlinenoexcept |
Get the ErrorRecorder assigned to this interface.
Retrieves the assigned error recorder object for the given class. A nullptr will be returned if an error handler has not been set.
|
inlinenoexcept |
Return the hardware compatibility level of this engine.
This is only supported for Ampere and newer architectures.
|
inlinenoexcept |
Return name of an IO tensor.
index | value between 0 and getNbIOTensors()-1 |
|
inlinenoexcept |
Get location of binding.
This lets you know whether the binding should be a pointer to device or host memory.
bindingIndex | The binding index. |
|
inlinenoexcept |
Get the maximum batch size which can be used for inference. Should only be called if the engine is built from an INetworkDefinition with implicit batch dimension mode.
|
inlinenoexcept |
Returns the name of the network associated with the engine.
The name is set during network creation and is retrieved after building or deserialization.
|
inlinenoexcept |
Return the number of auxiliary streams used by this engine.
This number will be less than or equal to the maximum allowed number of auxiliary streams set by IBuilderConfig::setMaxAuxStreams() API call when the engine was built.
|
inlinenoexcept |
Get the number of binding indices.
There are separate binding indices for each optimization profile. This method returns the total over all profiles. If the engine has been built for K profiles, the first getNbBindings() / K bindings are used by profile number 0, the following getNbBindings() / K bindings are used by profile number 1 etc.
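The layout described above, with equal-sized consecutive blocks of bindings per profile, can be written as a small index calculation. This is a sketch only; `bindingRangeForProfile` is a hypothetical helper, and the values passed in would come from getNbBindings() and getNbOptimizationProfiles().

```cpp
#include <cassert>
#include <utility>

// Hypothetical helper, not a TensorRT function. Returns the half-open range
// [first, last) of binding indices used by the given optimization profile,
// assuming the bindings are split evenly across profiles as described above.
std::pair<int, int> bindingRangeForProfile(int nbBindings, int nbProfiles, int profileIndex)
{
    assert(nbProfiles > 0 && nbBindings % nbProfiles == 0);
    assert(profileIndex >= 0 && profileIndex < nbProfiles);
    int const bindingsPerProfile = nbBindings / nbProfiles;
    return {profileIndex * bindingsPerProfile, (profileIndex + 1) * bindingsPerProfile};
}
```

For an engine with 6 bindings and 3 profiles, profile 0 uses bindings 0 and 1, profile 1 uses 2 and 3, and profile 2 uses 4 and 5.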
|
inlinenoexcept |
Return number of IO tensors.
It is the number of input and output tensors for the network from which the engine was built. The names of the IO tensors can be discovered by calling getIOTensorName(i) for i in 0 to getNbIOTensors()-1.
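The enumeration loop described above might look like the following sketch. It is written as a template over the engine type so that it compiles without the TensorRT headers; it only relies on getNbIOTensors() and getIOTensorName(), so it should also accept an ICudaEngine. `collectIOTensorNames` and `FakeEngine` are hypothetical names introduced for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper: gather the names of all IO tensors by index.
template <typename Engine>
std::vector<std::string> collectIOTensorNames(Engine const& engine)
{
    std::vector<std::string> names;
    std::int32_t const nbTensors = engine.getNbIOTensors();
    for (std::int32_t i = 0; i < nbTensors; ++i)
    {
        char const* name = engine.getIOTensorName(i);
        if (name != nullptr) // defensive check; skip entries without a name
        {
            names.emplace_back(name);
        }
    }
    return names;
}

// Hypothetical stand-in for an engine, used only so the sketch can be
// exercised without TensorRT installed.
struct FakeEngine
{
    std::vector<std::string> tensors;

    std::int32_t getNbIOTensors() const noexcept
    {
        return static_cast<std::int32_t>(tensors.size());
    }

    char const* getIOTensorName(std::int32_t index) const noexcept
    {
        bool const inRange = index >= 0 && index < getNbIOTensors();
        return inRange ? tensors[static_cast<std::size_t>(index)].c_str() : nullptr;
    }
};
```

The collected names can then be passed to the name-based queries such as getTensorShape() or getTensorIOMode().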
|
inlinenoexcept |
Get the number of layers in the network.
The number of layers in the network is not necessarily the number in the original network definition, as layers may be combined or eliminated as the engine is optimized. This value can be useful when building per-layer tables, such as when aggregating profiling data over a number of executions.
|
inlinenoexcept |
Get the number of optimization profiles defined for this engine.
|
inlinenoexcept |
Get the minimum / optimum / maximum dimensions for a particular input binding under an optimization profile.
bindingIndex | The input binding index, which must belong to the given profile, or be between 0 and bindingsPerProfile-1 as described below. |
profileIndex | The profile index, which must be between 0 and getNbOptimizationProfiles()-1. |
select | Whether to query the minimum, optimum, or maximum dimensions for this binding. |
For backwards compatibility with earlier versions of TensorRT, if the bindingIndex does not belong to the current optimization profile, but is between 0 and bindingsPerProfile-1, where bindingsPerProfile = getNbBindings()/getNbOptimizationProfiles(), then a corrected bindingIndex is used instead, computed by:
profileIndex * bindingsPerProfile + bindingIndex % bindingsPerProfile
Otherwise the bindingIndex is considered invalid.
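The correction rule above can be sketched as a standalone function. `correctedBindingIndex` is a hypothetical helper, and returning -1 for an invalid index is an assumption made for this illustration; the text only says such an index is considered invalid.

```cpp
#include <cassert>

// Hypothetical helper, not a TensorRT function. Applies the backwards
// compatibility rule: an index that already belongs to the profile, or that
// lies in [0, bindingsPerProfile), maps to the profile's own binding.
// Any other index is treated as invalid (signalled here by -1, an assumption).
int correctedBindingIndex(int bindingIndex, int profileIndex, int nbBindings, int nbProfiles)
{
    assert(nbProfiles > 0 && nbBindings % nbProfiles == 0);
    assert(profileIndex >= 0 && profileIndex < nbProfiles);
    int const bindingsPerProfile = nbBindings / nbProfiles;
    int const first = profileIndex * bindingsPerProfile;
    bool const inProfile = bindingIndex >= first && bindingIndex < first + bindingsPerProfile;
    bool const inLegacyRange = bindingIndex >= 0 && bindingIndex < bindingsPerProfile;
    if (!inProfile && !inLegacyRange)
    {
        return -1;
    }
    return profileIndex * bindingsPerProfile + bindingIndex % bindingsPerProfile;
}
```

With 6 bindings and 3 profiles, index 1 queried under profile 2 is corrected to 5, index 5 is left as 5, and index 3 (which belongs to profile 1) is rejected.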
|
inlinenoexcept |
Get the minimum / optimum / maximum dimensions for an input tensor given its name under an optimization profile.
tensorName | The name of an input tensor. |
profileIndex | The profile index, which must be between 0 and getNbOptimizationProfiles()-1. |
select | Whether to query the minimum, optimum, or maximum dimensions for this input tensor. |
|
inlinenoexcept |
Get minimum / optimum / maximum values for an input shape binding under an optimization profile.
profileIndex | The profile index (must be between 0 and getNbOptimizationProfiles()-1) |
inputIndex | The input index (must be between 0 and getNbBindings() - 1) |
select | Whether to query the minimum, optimum, or maximum shape values for this binding. |
For backwards compatibility with earlier versions of TensorRT, a bindingIndex that does not belong to the profile is corrected as described for getProfileDimensions().
|
inlinenoexcept |
Return the ProfilingVerbosity the builder config was set to when the engine was built.
|
inlinenoexcept |
Return the tactic sources required by this engine.
The value returned is equal to zero or more tactics sources set at build time via setTacticSources() in IBuilderConfig. Sources set by the latter but not returned by ICudaEngine::getTacticSources do not reduce overall engine execution time, and can be removed from future builds to reduce build time.
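Treating the tactic sources as a bitmask, the sources that could be dropped from future builds are the ones that were requested but are not reported as required. The sketch below assumes a 32-bit mask with one bit per tactic source; `TacticMask` and `removableTacticSources` are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: tactic sources are represented as a 32-bit mask, one bit per source.
using TacticMask = std::uint32_t;

// Hypothetical helper, not a TensorRT function. Returns the bits that were set
// at build time but are absent from the mask reported by the engine.
constexpr TacticMask removableTacticSources(TacticMask requestedAtBuild, TacticMask requiredByEngine)
{
    return requestedAtBuild & ~requiredByEngine;
}
```

Here `requestedAtBuild` would be the mask passed to setTacticSources() and `requiredByEngine` the value returned by getTacticSources().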
|
inlinenoexcept |
Return the number of bytes per component of an element, or -1 if the provided name does not map to an input or output tensor.
The vector component size is returned if getTensorVectorizedDim() != -1.
tensorName | The name of an input or output tensor. |
|
inlinenoexcept |
Return the number of bytes per component of an element of given profile, or -1 if the provided name does not map to an input or output tensor.
The vector component size is returned if getTensorVectorizedDim(tensorName, profileIndex) != -1.
tensorName | The name of an input or output tensor. |
profileIndex | The profile index to query |
|
inlinenoexcept |
Return the number of components included in one element, or -1 if the provided name does not map to an input or output tensor.
The number of elements in the vectors is returned if getTensorVectorizedDim() != -1.
tensorName | The name of an input or output tensor. |
|
inlinenoexcept |
Return the number of components included in one element of given profile, or -1 if the provided name does not map to an input or output tensor.
The number of elements in the vectors is returned if getTensorVectorizedDim(tensorName, profileIndex) != -1.
tensorName | The name of an input or output tensor. |
profileIndex | The profile index to query |
|
inlinenoexcept |
Determine the required data type for a buffer from its tensor name.
tensorName | The name of an input or output tensor. |
|
inlinenoexcept |
Return the tensor format, or TensorFormat::kLINEAR if the provided name does not map to an input or output tensor.
|
inlinenoexcept |
Return the tensor format of given profile, or TensorFormat::kLINEAR if the provided name does not map to an input or output tensor.
tensorName | The name of an input or output tensor. |
profileIndex | The profile index to query the format for. |
|
inlinenoexcept |
Return the human readable description of the tensor format, or empty string if the provided name does not map to an input or output tensor.
The description includes the order, vectorization, data type, and strides. Examples are shown as follows:
Example 1: kCHW + FP32 "Row major linear FP32 format"
Example 2: kCHW2 + FP16 "Two wide channel vectorized row major FP16 format"
Example 3: kHWC8 + FP16 + Line Stride = 32 "Channel major FP16 format where C % 8 == 0 and H Stride % 32 == 0"
tensorName | The name of an input or output tensor. |
|
inlinenoexcept |
Return the human readable description of the tensor format of given profile, or empty string if the provided name does not map to an input or output tensor.
The description includes the order, vectorization, data type, and strides. Examples are shown as follows:
Example 1: kCHW + FP32 "Row major linear FP32 format"
Example 2: kCHW2 + FP16 "Two wide channel vectorized row major FP16 format"
Example 3: kHWC8 + FP16 + Line Stride = 32 "Channel major FP16 format where C % 8 == 0 and H Stride % 32 == 0"
tensorName | The name of an input or output tensor. |
profileIndex | The profile index to query the format for. |
|
inlinenoexcept |
Determine whether a tensor is an input or output tensor.
tensorName | The name of an input or output tensor. |
|
inlinenoexcept |
Get whether an input or output tensor must be on GPU or CPU.
tensorName | The name of an input or output tensor. |
The location is established at build time. E.g. shape tensors inputs are typically required to be on the CPU.
|
inlinenoexcept |
Get shape of an input or output tensor.
tensorName | The name of an input or output tensor. |
|
inlinenoexcept |
Return the dimension index that the buffer is vectorized, or -1 if the provided name does not map to an input or output tensor.
Specifically -1 is returned if scalars per vector is 1.
tensorName | The name of an input or output tensor. |
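One common use of the vectorized dimension, components per element, and bytes per component is sizing a buffer for a vectorized format. The sketch below follows the padding approach used in the buffer helpers of the TensorRT samples, where the vectorized dimension is rounded up to a whole number of vectors; treat that rule as an assumption to verify for your format. `paddedBufferBytes` is a hypothetical helper, and when `vectorizedDim` is -1 the caller is assumed to pass the size of one scalar as `bytesPerComponent`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper, not a TensorRT function. Computes a buffer size in bytes
// for the given dimensions. If vectorizedDim != -1, that dimension is rounded up
// to a multiple of componentsPerElement before the volume is taken.
std::int64_t paddedBufferBytes(
    std::vector<std::int64_t> dims, int vectorizedDim, int componentsPerElement, int bytesPerComponent)
{
    assert(bytesPerComponent > 0);
    std::int64_t scalars = 1;
    if (vectorizedDim != -1)
    {
        assert(vectorizedDim >= 0 && vectorizedDim < static_cast<int>(dims.size()));
        assert(componentsPerElement > 0);
        auto const d = static_cast<std::size_t>(vectorizedDim);
        // Number of whole vectors needed along the vectorized dimension.
        dims[d] = (dims[d] + componentsPerElement - 1) / componentsPerElement;
        scalars = componentsPerElement;
    }
    for (std::int64_t extent : dims)
    {
        assert(extent >= 0);
        scalars *= extent;
    }
    return scalars * bytesPerComponent;
}
```

For example, dimensions [1, 3, 4, 4] vectorized along dimension 1 with 2 two-byte components per element pad the channel count from 3 to 4, giving 64 scalars and 128 bytes.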
|
inlinenoexcept |
Return the dimension index that the buffer is vectorized of given profile, or -1 if the provided name does not map to an input or output tensor.
Specifically -1 is returned if scalars per vector is 1.
tensorName | The name of an input. |
profileIndex | The profile index to query the format for. |
|
inlinenoexcept |
Query whether the engine was built with an implicit batch dimension.
This is an engine-wide property. Either all tensors in the engine have an implicit batch dimension or none of them do.
hasImplicitBatchDimension() is true if and only if the INetworkDefinition from which this engine was built was created with createNetworkV2() without NetworkDefinitionCreationFlag::kEXPLICIT_BATCH flag.
|
inlinenoexcept |
True if pointer to tensor data is required for execution phase, false if nullptr can be supplied.
For example, if a network uses an input tensor with binding i ONLY as the "reshape dimensions" input of IShuffleLayer, then isExecutionBinding(i) is false, and a nullptr can be supplied for it when calling IExecutionContext::execute or IExecutionContext::enqueue.
|
inlinenoexcept |
Return true if an engine can be refit.
|
inlinenoexcept |
True if tensor is required as input for shape calculations or output from them.
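TensorRT evaluates a network in two phases: phase 1 performs the shape calculations, and phase 2 is the execution phase that processes the tensor data.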
Some tensors are required in phase 1. These tensors are called "shape tensors", and always have type Int32 and no more than one dimension. These tensors are not always shapes themselves, but might be used to calculate tensor shapes for phase 2.
isShapeBinding(i) returns true if the tensor is a required input or an output computed in phase 1. isExecutionBinding(i) returns true if the tensor is a required input or an output computed in phase 2.
For example, if a network uses an input tensor with binding i as an addend to an IElementWiseLayer that computes the "reshape dimensions" for IShuffleLayer, then isShapeBinding(i) == true.
It's possible to have a tensor be required by both phases. For instance, a tensor can be used for the "reshape dimensions" and as the indices for an IGatherLayer collecting floating-point data.
It's also possible to have a tensor be required by neither phase, but nonetheless show up in the engine's inputs. For example, if an input tensor is used only as an input to IShapeLayer, only its shape matters and its values are irrelevant.
|
inlinenoexcept |
True if tensor is required as input for shape calculations or is output from shape calculations.
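Returns true if either of the following conditions holds: the tensor is a network input whose values are required for shape calculations, or the tensor is a network output whose values are computed by shape calculations.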
For example, if a network uses an input tensor "foo" as an addend to an IElementWiseLayer that computes the "reshape dimensions" for IShuffleLayer, then isShapeInferenceIO("foo") == true. If the network copies said input tensor "foo" to an output "bar", then isShapeInferenceIO("bar") == true and IExecutionContext::inferShapes() will write to "bar".
|
inlinenoexcept |
Serialize the network to a stream.
The network may be deserialized with IRuntime::deserializeCudaEngine().
|
inlinenoexcept |
Set the ErrorRecorder for this interface.
Assigns the ErrorRecorder to this interface. The ErrorRecorder will track all errors during execution. This function will call incRefCount of the registered ErrorRecorder at least once. Setting recorder to nullptr unregisters the recorder with the interface, resulting in a call to decRefCount if a recorder has been registered.
If an error recorder is not set, messages will be sent to the global log stream.
recorder | The error recorder to register with this interface. |
|
protected |