TensorRT  7.0.0.11
nvinfer1::ICudaEngine Class Reference abstract

An engine for executing inference on a built network, with functionally unsafe features.

#include <NvInferRuntime.h>

Public Member Functions

virtual int getNbBindings () const noexcept=0
 Get the number of binding indices.

virtual int getBindingIndex (const char *name) const noexcept=0
 Retrieve the binding index for a named tensor.

virtual const char * getBindingName (int bindingIndex) const noexcept=0
 Retrieve the name corresponding to a binding index.

virtual bool bindingIsInput (int bindingIndex) const noexcept=0
 Determine whether a binding is an input binding.

virtual Dims getBindingDimensions (int bindingIndex) const noexcept=0
 Get the dimensions of a binding.

virtual DataType getBindingDataType (int bindingIndex) const noexcept=0
 Determine the required data type for a buffer from its binding index.

virtual int getMaxBatchSize () const noexcept=0
 Get the maximum batch size which can be used for inference.

virtual int getNbLayers () const noexcept=0
 Get the number of layers in the network.

virtual TRT_DEPRECATED std::size_t getWorkspaceSize () const noexcept=0
 Get the amount of workspace the engine uses.

virtual IHostMemory * serialize () const noexcept=0
 Serialize the network to a stream.

virtual IExecutionContext * createExecutionContext () noexcept=0
 Create an execution context.

virtual void destroy () noexcept=0
 Destroy this object.

virtual TensorLocation getLocation (int bindingIndex) const noexcept=0
 Get location of binding.

virtual IExecutionContext * createExecutionContextWithoutDeviceMemory () noexcept=0
 Create an execution context without any device memory allocated.

virtual size_t getDeviceMemorySize () const noexcept=0
 Return the amount of device memory required by an execution context.

virtual bool isRefittable () const noexcept=0
 Return true if engine can be refit.

virtual int getBindingBytesPerComponent (int bindingIndex) const noexcept=0
 Return the number of bytes per component of an element.

virtual int getBindingComponentsPerElement (int bindingIndex) const noexcept=0
 Return the number of components included in one element.

virtual TensorFormat getBindingFormat (int bindingIndex) const noexcept=0
 Return the binding format.

virtual const char * getBindingFormatDesc (int bindingIndex) const noexcept=0
 Return the human readable description of the tensor format.

virtual int getBindingVectorizedDim (int bindingIndex) const noexcept=0
 Return the index of the dimension along which the buffer is vectorized.

virtual const char * getName () const noexcept=0
 Returns the name of the network associated with the engine.

virtual int getNbOptimizationProfiles () const noexcept=0
 Get the number of optimization profiles defined for this engine.

virtual Dims getProfileDimensions (int bindingIndex, int profileIndex, OptProfileSelector select) const noexcept=0
 Get the minimum / optimum / maximum dimensions for a particular binding under an optimization profile.

virtual const int32_t * getProfileShapeValues (int profileIndex, int inputIndex, OptProfileSelector select) const noexcept=0
 Get minimum / optimum / maximum values for an input shape binding under an optimization profile.

virtual bool isShapeBinding (int bindingIndex) const noexcept=0
 True if tensor is required as input for shape calculations or output from them.

virtual bool isExecutionBinding (int bindingIndex) const noexcept=0
 True if pointer to tensor data is required for execution phase, false if nullptr can be supplied.

virtual EngineCapability getEngineCapability () const noexcept=0
 Determine what execution capability this engine has.

virtual void setErrorRecorder (IErrorRecorder *recorder) noexcept=0
 Set the ErrorRecorder for this interface.

virtual IErrorRecorder * getErrorRecorder () const noexcept=0
 Get the ErrorRecorder assigned to this interface.

virtual bool hasImplicitBatchDimension () const =0
 Query whether the engine was built with an implicit batch dimension.
 

Detailed Description

An engine for executing inference on a built network, with functionally unsafe features.

Warning
Do not inherit from this class, as doing so will break forward-compatibility of the API and ABI.

Member Function Documentation

virtual bool nvinfer1::ICudaEngine::bindingIsInput ( int  bindingIndex) const
pure virtual noexcept

Determine whether a binding is an input binding.

Parameters
bindingIndex  The binding index.
Returns
True if the index corresponds to an input binding and the index is in range.
See also
getBindingIndex()
virtual IExecutionContext* nvinfer1::ICudaEngine::createExecutionContext ( )
pure virtual noexcept

Create an execution context.

See also
IExecutionContext.
virtual IExecutionContext* nvinfer1::ICudaEngine::createExecutionContextWithoutDeviceMemory ( )
pure virtual noexcept

Create an execution context without any device memory allocated.

The memory for execution of this device context must be supplied by the application.

See also
getDeviceMemorySize() IExecutionContext::setDeviceMemory()
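As a rough illustration of how these three calls fit together, the sketch below creates a context, asks the engine how much device memory it needs, and hands over a caller-owned buffer. The helper name `createContextWithExternalMemory` and the `allocateDevice` callback are hypothetical (the callback would typically wrap a CUDA allocation), and the engine type is a template parameter only so the sketch compiles without the TensorRT headers; in real code it would be `nvinfer1::ICudaEngine`.

```cpp
#include <cstddef>

// Hypothetical helper, not part of TensorRT. allocateDevice is an assumed
// callback taking a byte count and returning a device pointer (or nullptr).
template <typename Engine, typename Alloc>
auto createContextWithExternalMemory(Engine& engine, Alloc allocateDevice)
    -> decltype(engine.createExecutionContextWithoutDeviceMemory())
{
    auto* context = engine.createExecutionContextWithoutDeviceMemory();
    if (context == nullptr)
    {
        return nullptr;
    }

    // The application supplies the memory the context will execute in.
    void* memory = allocateDevice(engine.getDeviceMemorySize());
    if (memory == nullptr)
    {
        context->destroy(); // release the context if no memory is available
        return nullptr;
    }

    context->setDeviceMemory(memory);
    return context;
}
```

In this sketch the caller keeps ownership of the buffer, so it has to stay allocated for as long as the returned context is used.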
virtual int nvinfer1::ICudaEngine::getBindingBytesPerComponent ( int  bindingIndex) const
pure virtual noexcept

Return the number of bytes per component of an element.

The vector component size is returned if getBindingVectorizedDim() != -1.

Parameters
bindingIndex  The binding index.
See also
ICudaEngine::getBindingVectorizedDim()
virtual int nvinfer1::ICudaEngine::getBindingComponentsPerElement ( int  bindingIndex) const
pure virtual noexcept

Return the number of components included in one element.

The number of elements in the vector is returned if getBindingVectorizedDim() != -1.

Parameters
bindingIndex  The binding index.
See also
ICudaEngine::getBindingVectorizedDim()
virtual DataType nvinfer1::ICudaEngine::getBindingDataType ( int  bindingIndex) const
pure virtual noexcept

Determine the required data type for a buffer from its binding index.

Parameters
bindingIndex  The binding index.
Returns
The type of the data in the buffer.
See also
getBindingIndex()
virtual Dims nvinfer1::ICudaEngine::getBindingDimensions ( int  bindingIndex) const
pure virtual noexcept

Get the dimensions of a binding.

Parameters
bindingIndex  The binding index.
Returns
The dimensions of the binding if the index is in range, otherwise Dims(). Has -1 for any dimension with a dynamic value.
See also
getBindingIndex()
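Because a dynamic dimension is reported as -1, code that sizes buffers from the returned dimensions needs to check for it first. The following is a minimal sketch under that reading; `staticVolume` is a hypothetical helper, and `Dims` is left as a template parameter (anything with `nbDims` and `d[]`, as `nvinfer1::Dims` has) so the sketch compiles without the TensorRT headers.

```cpp
#include <cstdint>

// Hypothetical helper: returns the element count described by dims,
// or -1 if any dimension is dynamic (reported as a negative value).
template <typename Dims>
std::int64_t staticVolume(const Dims& dims)
{
    std::int64_t volume = 1;
    for (int i = 0; i < dims.nbDims; ++i)
    {
        if (dims.d[i] < 0)
        {
            return -1; // size is only known once runtime dimensions are set
        }
        volume *= dims.d[i];
    }
    return volume;
}
```

A typical call would be `staticVolume(engine->getBindingDimensions(i))`. Note that a Dims with zero dimensions yields 1 here.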
virtual TensorFormat nvinfer1::ICudaEngine::getBindingFormat ( int  bindingIndex) const
pure virtual noexcept

Return the binding format.

Parameters
bindingIndex  The binding index.
virtual const char* nvinfer1::ICudaEngine::getBindingFormatDesc ( int  bindingIndex) const
pure virtual noexcept

Return the human readable description of the tensor format.

The description includes the order, vectorization, data type, strides, etc. Examples:

Example 1: kCHW + FP32: "Row major linear FP32 format"
Example 2: kCHW2 + FP16: "Two wide channel vectorized row major FP16 format"
Example 3: kHWC8 + FP16 + Line Stride = 32: "Channel major FP16 format where C % 8 == 0 and H Stride % 32 == 0"

Parameters
bindingIndex  The binding index.
virtual int nvinfer1::ICudaEngine::getBindingIndex ( const char *  name) const
pure virtual noexcept

Retrieve the binding index for a named tensor.

IExecutionContext::enqueue() and IExecutionContext::execute() require an array of buffers.

Engine bindings map from tensor names to indices in this array. Binding indices are assigned at engine build time, and take values in the range [0 ... n-1] where n is the total number of inputs and outputs.

Parameters
name  The tensor name.
Returns
The binding index for the named tensor, or -1 if the name is not found.

See also
getNbBindings() getBindingIndex()

virtual const char* nvinfer1::ICudaEngine::getBindingName ( int  bindingIndex) const
pure virtual noexcept

Retrieve the name corresponding to a binding index.

This is the reverse mapping to that provided by getBindingIndex().

Parameters
bindingIndex  The binding index.
Returns
The name corresponding to the index, or nullptr if the index is out of range.
See also
getBindingIndex()
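The name/index mapping described above can be walked in either direction. The sketch below lists every binding with its name and direction, and checks whether a tensor name is known to the engine. `describeBindings` and `hasBinding` are hypothetical helper names, and the engine type is a template parameter only so the sketch compiles without the TensorRT headers; in real code it would be `nvinfer1::ICudaEngine`.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: one line of text per binding index.
template <typename Engine>
std::vector<std::string> describeBindings(const Engine& engine)
{
    std::vector<std::string> lines;
    const int nbBindings = engine.getNbBindings();
    for (int i = 0; i < nbBindings; ++i)
    {
        // getBindingName() returns nullptr only for an out-of-range index.
        const char* name = engine.getBindingName(i);
        std::string line = std::to_string(i) + ": ";
        line += (name != nullptr) ? name : "<unknown>";
        line += engine.bindingIsInput(i) ? " (input)" : " (output)";
        lines.push_back(line);
    }
    return lines;
}

// Hypothetical helper: getBindingIndex() returns -1 for unknown names.
template <typename Engine>
bool hasBinding(const Engine& engine, const char* name)
{
    return engine.getBindingIndex(name) != -1;
}
```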
virtual int nvinfer1::ICudaEngine::getBindingVectorizedDim ( int  bindingIndex) const
pure virtual noexcept

Return the index of the dimension along which the buffer is vectorized.

Specifically, -1 is returned if the number of scalars per vector is 1.

Parameters
bindingIndex  The binding index.
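One plausible way to combine getBindingVectorizedDim(), getBindingComponentsPerElement(), and getBindingBytesPerComponent() is to size a buffer for a vectorized format by rounding the vectorized dimension up to a whole number of vectors. The sketch below is an assumption about how these values compose, not a statement of what TensorRT does internally; `paddedBindingBytes` is a hypothetical helper, and the engine type is templated so the sketch compiles without the TensorRT headers.

```cpp
#include <cstdint>

// Hypothetical helper: bytes needed for one binding, padding the vectorized
// dimension up to a multiple of the vector width. Returns -1 if any
// dimension is dynamic, since the size is not known from the engine alone.
template <typename Engine>
std::int64_t paddedBindingBytes(const Engine& engine, int bindingIndex)
{
    auto dims = engine.getBindingDimensions(bindingIndex);
    for (int i = 0; i < dims.nbDims; ++i)
    {
        if (dims.d[i] < 0)
        {
            return -1;
        }
    }

    std::int64_t scalars = 1;
    const int vecDim = engine.getBindingVectorizedDim(bindingIndex);
    if (vecDim != -1)
    {
        // Count whole vectors along vecDim, each holding perVector scalars.
        const int perVector = engine.getBindingComponentsPerElement(bindingIndex);
        dims.d[vecDim] = (dims.d[vecDim] + perVector - 1) / perVector;
        scalars = perVector;
    }
    for (int i = 0; i < dims.nbDims; ++i)
    {
        scalars *= dims.d[i];
    }
    return scalars * engine.getBindingBytesPerComponent(bindingIndex);
}
```

For example, a 3x4x4 tensor vectorized two-wide along dimension 0 with 2-byte components would be padded to 4x4x4 scalars, giving 128 bytes under these assumptions.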
virtual size_t nvinfer1::ICudaEngine::getDeviceMemorySize ( ) const
pure virtual noexcept

Return the amount of device memory required by an execution context.

See also
IExecutionContext::setDeviceMemory()
virtual EngineCapability nvinfer1::ICudaEngine::getEngineCapability ( ) const
pure virtual noexcept

Determine what execution capability this engine has.

If the engine has EngineCapability::kDEFAULT, then all engine functionality is valid. If the engine has EngineCapability::kSAFE_GPU, then only the functionality in safe::ICudaEngine is valid. If the engine has EngineCapability::kSAFE_DLA, then only serialize, destroy, and const-accessor functions are valid.

Returns
The EngineCapability flag that the engine was built for.
virtual IErrorRecorder* nvinfer1::ICudaEngine::getErrorRecorder ( ) const
pure virtual noexcept

Get the ErrorRecorder assigned to this interface.

Retrieves the assigned error recorder object for the given class. A default error recorder does not exist, so a nullptr will be returned if setErrorRecorder has not been called.

Returns
A pointer to the IErrorRecorder object that has been registered.
See also
setErrorRecorder
virtual TensorLocation nvinfer1::ICudaEngine::getLocation ( int  bindingIndex) const
pure virtual noexcept

Get location of binding.

This lets you know whether the binding should be a pointer to device or host memory.

See also
ITensor::setLocation() ITensor::getLocation()
Parameters
bindingIndex  The binding index.
Returns
The location of the bound tensor with given index.
virtual int nvinfer1::ICudaEngine::getMaxBatchSize ( ) const
pure virtual noexcept

Get the maximum batch size which can be used for inference.

For an engine built from an INetworkDefinition without an implicit batch dimension, this will always return 1.

Returns
The maximum batch size for this engine.
virtual const char* nvinfer1::ICudaEngine::getName ( ) const
pure virtual noexcept

Returns the name of the network associated with the engine.

The name is set during network creation and is retrieved after building or deserialization.

See also
INetworkDefinition::setName(), INetworkDefinition::getName()
Returns
A null-terminated C-style string representing the name of the network.
virtual int nvinfer1::ICudaEngine::getNbBindings ( ) const
pure virtual noexcept

Get the number of binding indices.

If the engine has been built for K profiles, the first getNbBindings() / K bindings are used by profile number 0, the following getNbBindings() / K bindings are used by profile number 1, and so on.

See also
getBindingIndex()
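The profile layout described above reduces to simple index arithmetic. The sketch below maps a binding index expressed relative to profile 0 onto the index used by another profile; `bindingIndexForProfile` is a hypothetical helper, and its arguments would come from getNbBindings() and getNbOptimizationProfiles().

```cpp
// Hypothetical helper: returns the binding index that profile `profileIndex`
// uses for the binding at position `localIndex` within one profile's block,
// or -1 if any argument is out of range.
inline int bindingIndexForProfile(int nbBindings, int nbProfiles,
                                  int profileIndex, int localIndex)
{
    if (nbProfiles <= 0 || nbBindings % nbProfiles != 0)
    {
        return -1;
    }
    const int perProfile = nbBindings / nbProfiles; // getNbBindings() / K
    if (profileIndex < 0 || profileIndex >= nbProfiles)
    {
        return -1;
    }
    if (localIndex < 0 || localIndex >= perProfile)
    {
        return -1;
    }
    return profileIndex * perProfile + localIndex;
}
```

With 6 bindings and 3 profiles, each profile owns 2 bindings, so position 1 of profile 2 maps to binding index 5.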
virtual int nvinfer1::ICudaEngine::getNbLayers ( ) const
pure virtual noexcept

Get the number of layers in the network.

The number of layers in the network is not necessarily the number in the original network definition, as layers may be combined or eliminated as the engine is optimized. This value can be useful when building per-layer tables, such as when aggregating profiling data over a number of executions.

Returns
The number of layers in the network.
virtual int nvinfer1::ICudaEngine::getNbOptimizationProfiles ( ) const
pure virtual noexcept

Get the number of optimization profiles defined for this engine.

Returns
Number of optimization profiles. It is always at least 1.
See also
IExecutionContext::setOptimizationProfile()
virtual Dims nvinfer1::ICudaEngine::getProfileDimensions ( int  bindingIndex,
int  profileIndex,
OptProfileSelector  select 
) const
pure virtual noexcept

Get the minimum / optimum / maximum dimensions for a particular binding under an optimization profile.

Parameters
bindingIndex  The binding index (must be between 0 and getNbBindings() - 1)
profileIndex  The profile index (must be between 0 and getNbOptimizationProfiles() - 1)
select  Whether to query the minimum, optimum, or maximum dimensions for this binding.
Returns
The minimum / optimum / maximum dimensions for this binding in this profile.
virtual const int32_t* nvinfer1::ICudaEngine::getProfileShapeValues ( int  profileIndex,
int  inputIndex,
OptProfileSelector  select 
) const
pure virtual noexcept

Get minimum / optimum / maximum values for an input shape binding under an optimization profile.

Parameters
profileIndex  The profile index (must be between 0 and getNbOptimizationProfiles() - 1)
inputIndex  The input index (must be between 0 and getNbBindings() - 1)
select  Whether to query the minimum, optimum, or maximum shape values for this binding.
Returns
If the binding is an input shape binding, return a pointer to an array that has the same number of elements as the corresponding tensor, i.e. 1 if dims.nbDims == 0, or dims.d[0] if dims.nbDims == 1, where dims = getBindingDimensions(inputIndex). The array contains the elementwise minimum / optimum / maximum values for this shape binding under the profile. If either of the indices is out of range, or if the binding is not an input shape binding, return nullptr.
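Since the returned pointer carries no length, the element count has to be derived from the binding dimensions as described above. The sketch below copies the values into a vector; `profileShapeValues` is a hypothetical helper, and both the engine and selector types are template parameters so the sketch compiles without the TensorRT headers (in real code they would be `nvinfer1::ICudaEngine` and `nvinfer1::OptProfileSelector`).

```cpp
#include <cstdint>
#include <vector>

// Hypothetical helper: returns the shape values as a vector, or an empty
// vector if the engine returned nullptr (index out of range, or the binding
// is not an input shape binding).
template <typename Engine, typename Selector>
std::vector<std::int32_t> profileShapeValues(const Engine& engine, int profileIndex,
                                             int inputIndex, Selector select)
{
    const std::int32_t* values = engine.getProfileShapeValues(profileIndex, inputIndex, select);
    if (values == nullptr)
    {
        return {};
    }

    // 1 element for a zero-dimensional tensor, dims.d[0] for a 1-D tensor.
    const auto dims = engine.getBindingDimensions(inputIndex);
    const int count = (dims.nbDims == 0) ? 1 : dims.d[0];
    if (count < 0)
    {
        return {}; // defensive: treat an unexpected dynamic size as "no data"
    }
    return std::vector<std::int32_t>(values, values + count);
}
```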
virtual TRT_DEPRECATED std::size_t nvinfer1::ICudaEngine::getWorkspaceSize ( ) const
pure virtual noexcept

Get the amount of workspace the engine uses.

The workspace size will be no greater than the value provided to the builder when the engine was built, and will typically be smaller. Workspace will be allocated for each execution context.

virtual bool nvinfer1::ICudaEngine::hasImplicitBatchDimension ( ) const
pure virtual

Query whether the engine was built with an implicit batch dimension.

Returns
True if tensors have implicit batch dimension, false otherwise.

This is an engine-wide property. Either all tensors in the engine have an implicit batch dimension or none of them do.

hasImplicitBatchDimension() is true if and only if the INetworkDefinition from which this engine was built was created with createNetwork(), or with createNetworkV2() without the NetworkDefinitionCreationFlag::kEXPLICIT_BATCH flag.

See also
createNetworkV2
virtual bool nvinfer1::ICudaEngine::isExecutionBinding ( int  bindingIndex) const
pure virtual noexcept

True if pointer to tensor data is required for execution phase, false if nullptr can be supplied.

For example, if a network uses an input tensor with binding i ONLY as the "reshape dimensions" input of IShuffleLayer, then isExecutionBinding(i) is false, and a nullptr can be supplied for it when calling IExecutionContext::execute or IExecutionContext::enqueue.

See also
isShapeBinding()
virtual bool nvinfer1::ICudaEngine::isRefittable ( ) const
pure virtual noexcept

Return true if engine can be refit.

See also
nvinfer1::createInferRefitter()
virtual bool nvinfer1::ICudaEngine::isShapeBinding ( int  bindingIndex) const
pure virtual noexcept

True if tensor is required as input for shape calculations or output from them.

TensorRT evaluates a network in two phases:

  1. Compute shape information required to determine memory allocation requirements and validate that runtime sizes make sense.
  2. Process tensors on the device.

Some tensors are required in phase 1. These tensors are called "shape tensors", and always have type Int32 and no more than one dimension. These tensors are not always shapes themselves, but might be used to calculate tensor shapes for phase 2.

isShapeBinding(i) returns true if the tensor is a required input or an output computed in phase 1. isExecutionBinding(i) returns true if the tensor is a required input or an output computed in phase 2.

For example, if a network uses an input tensor with binding i as an addend to an IElementWiseLayer that computes the "reshape dimensions" for IShuffleLayer, then isShapeBinding(i) == true.

It's possible to have a tensor be required by both phases. For instance, a tensor can be used for the "reshape dimensions" and as the indices for an IGatherLayer collecting floating-point data.

It's also possible for a tensor to be required by neither phase and nonetheless show up in the engine's inputs. For example, if an input tensor is used only as an input to IShapeLayer, only its shape matters and its values are irrelevant.

See also
isExecutionBinding()
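The four combinations described above (phase 1 only, phase 2 only, both, neither) can be made explicit when preparing buffers. In the sketch below, the `BindingRole` enum and `classifyBinding` are hypothetical names, and the engine type is a template parameter so the sketch compiles without the TensorRT headers.

```cpp
// Hypothetical classification of a binding by the phases that need it.
enum class BindingRole
{
    kNeither,       // only the tensor's shape matters, not its values
    kShapeOnly,     // needed in phase 1; nullptr can be supplied for execution
    kExecutionOnly, // needed in phase 2 only
    kBoth           // needed in both phases
};

template <typename Engine>
BindingRole classifyBinding(const Engine& engine, int bindingIndex)
{
    const bool shape = engine.isShapeBinding(bindingIndex);
    const bool exec = engine.isExecutionBinding(bindingIndex);
    if (shape && exec)
    {
        return BindingRole::kBoth;
    }
    if (shape)
    {
        return BindingRole::kShapeOnly;
    }
    if (exec)
    {
        return BindingRole::kExecutionOnly;
    }
    return BindingRole::kNeither;
}
```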
virtual IHostMemory* nvinfer1::ICudaEngine::serialize ( ) const
pure virtual noexcept

Serialize the network to a stream.

Returns
An IHostMemory object that contains the serialized engine.

The network may be deserialized with IRuntime::deserializeCudaEngine() and also safe::IRuntime::deserializeCudaEngine() if only functional-safe features are used in the engine.

See also
IRuntime::deserializeCudaEngine() safe::IRuntime::deserializeCudaEngine()
virtual void nvinfer1::ICudaEngine::setErrorRecorder ( IErrorRecorder *  recorder)
pure virtual noexcept

Set the ErrorRecorder for this interface.

Assigns the ErrorRecorder to this interface. The ErrorRecorder will track all errors during execution. This function will call incRefCount of the registered ErrorRecorder at least once. Setting recorder to nullptr unregisters the recorder with the interface, resulting in a call to decRefCount if a recorder has been registered.

Parameters
recorder  The error recorder to register with this interface.
See also
getErrorRecorder

The documentation for this class was generated from the following file: