An engine for executing inference on a built network, with functionally unsafe features.

#include <NvInferRuntime.h>

Public Member Functions
int32_t	getNbBindings () const noexcept
	Get the number of binding indices.

int32_t	getBindingIndex (const char *name) const noexcept
	Retrieve the binding index for a named tensor.

const char *	getBindingName (int32_t bindingIndex) const noexcept
	Retrieve the name corresponding to a binding index.

bool	bindingIsInput (int32_t bindingIndex) const noexcept
	Determine whether a binding is an input binding.

Dims	getBindingDimensions (int32_t bindingIndex) const noexcept
	Get the dimensions of a binding.

DataType	getBindingDataType (int32_t bindingIndex) const noexcept
	Determine the required data type for a buffer from its binding index.

int32_t	getMaxBatchSize () const noexcept
	Get the maximum batch size which can be used for inference.

int32_t	getNbLayers () const noexcept
	Get the number of layers in the network.

IHostMemory *	serialize () const noexcept
	Serialize the network to a stream.

IExecutionContext *	createExecutionContext () noexcept
	Create an execution context.

TRT_DEPRECATED void	destroy () noexcept
	Destroy this object.

TensorLocation	getLocation (int32_t bindingIndex) const noexcept
	Get location of binding.

IExecutionContext *	createExecutionContextWithoutDeviceMemory () noexcept
	Create an execution context without any device memory allocated.

size_t	getDeviceMemorySize () const noexcept
	Return the amount of device memory required by an execution context.

bool	isRefittable () const noexcept
	Return true if an engine can be refit.

int32_t	getBindingBytesPerComponent (int32_t bindingIndex) const noexcept
	Return the number of bytes per component of an element.

int32_t	getBindingComponentsPerElement (int32_t bindingIndex) const noexcept
	Return the number of components included in one element.

TensorFormat	getBindingFormat (int32_t bindingIndex) const noexcept
	Return the binding format.

const char *	getBindingFormatDesc (int32_t bindingIndex) const noexcept
	Return the human readable description of the tensor format.

int32_t	getBindingVectorizedDim (int32_t bindingIndex) const noexcept
	Return the dimension index along which the buffer is vectorized.

const char *	getName () const noexcept
	Returns the name of the network associated with the engine.

int32_t	getNbOptimizationProfiles () const noexcept
	Get the number of optimization profiles defined for this engine.

Dims	getProfileDimensions (int32_t bindingIndex, int32_t profileIndex, OptProfileSelector select) const noexcept
	Get the minimum / optimum / maximum dimensions for a particular binding under an optimization profile.

const int32_t *	getProfileShapeValues (int32_t profileIndex, int32_t inputIndex, OptProfileSelector select) const noexcept
	Get minimum / optimum / maximum values for an input shape binding under an optimization profile.

bool	isShapeBinding (int32_t bindingIndex) const noexcept
	True if tensor is required as input for shape calculations or output from them.

bool	isExecutionBinding (int32_t bindingIndex) const noexcept
	True if pointer to tensor data is required for execution phase, false if nullptr can be supplied.

EngineCapability	getEngineCapability () const noexcept
	Determine what execution capability this engine has.

void	setErrorRecorder (IErrorRecorder *recorder) noexcept
	Set the ErrorRecorder for this interface.

IErrorRecorder *	getErrorRecorder () const noexcept
	Get the ErrorRecorder assigned to this interface.

bool	hasImplicitBatchDimension () const noexcept
	Query whether the engine was built with an implicit batch dimension.

TacticSources	getTacticSources () const noexcept
	Return the tactic sources required by this engine.

ProfilingVerbosity	getProfilingVerbosity () const noexcept
	Return the ProfilingVerbosity the builder config was set to when the engine was built.

IEngineInspector *	createEngineInspector () const noexcept
	Create a new engine inspector which prints the layer information in an engine or an execution context.

Protected Attributes
apiv::VCudaEngine *	mImpl

Additional Inherited Members
Protected Member Functions inherited from nvinfer1::INoCopy
	INoCopy (const INoCopy &other)=delete

INoCopy &	operator= (const INoCopy &other)=delete

	INoCopy (INoCopy &&other)=delete

INoCopy &	operator= (INoCopy &&other)=delete

Detailed Description

An engine for executing inference on a built network, with functionally unsafe features.

Warning: Do not inherit from this class, as doing so will break forward-compatibility of the API and ABI.

Member Function Documentation

◆ bindingIsInput()

bool nvinfer1::ICudaEngine::bindingIsInput ( int32_t bindingIndex ) const

inline noexcept

Determine whether a binding is an input binding.

Parameters

bindingIndex The binding index.

Returns: True if the index corresponds to an input binding and the index is in range.

See also: getBindingIndex()

◆ createEngineInspector()

IEngineInspector * nvinfer1::ICudaEngine::createEngineInspector ( ) const

inline noexcept

Create a new engine inspector which prints the layer information in an engine or an execution context.

See also: IEngineInspector.

◆ createExecutionContext()

IExecutionContext * nvinfer1::ICudaEngine::createExecutionContext ( )

inline noexcept

Create an execution context.

If the engine supports dynamic shapes, each execution context in concurrent use must use a separate optimization profile. The first execution context created will call setOptimizationProfile(0) implicitly. For other execution contexts, setOptimizationProfile() must be called with a unique profile index before calling execute or enqueue. If an error recorder has been set for the engine, it will also be passed to the execution context.

See also: IExecutionContext, IExecutionContext::setOptimizationProfile()

◆ createExecutionContextWithoutDeviceMemory()

IExecutionContext * nvinfer1::ICudaEngine::createExecutionContextWithoutDeviceMemory ( )

inline noexcept

Create an execution context without any device memory allocated.

The memory for execution of this device context must be supplied by the application.

◆ destroy()

TRT_DEPRECATED void nvinfer1::ICudaEngine::destroy ( )

inline noexcept

Destroy this object.

Deprecated: This interface is deprecated and will be removed in TensorRT 10.0.

Warning: Calling destroy on a managed pointer will result in a double-free error.

◆ getBindingBytesPerComponent()

int32_t nvinfer1::ICudaEngine::getBindingBytesPerComponent ( int32_t bindingIndex ) const

inline noexcept

Return the number of bytes per component of an element.

The vector component size is returned if getBindingVectorizedDim() != -1.

Parameters

bindingIndex The binding index.

See also: ICudaEngine::getBindingVectorizedDim()

◆ getBindingComponentsPerElement()

int32_t nvinfer1::ICudaEngine::getBindingComponentsPerElement ( int32_t bindingIndex ) const

inline noexcept

Return the number of components included in one element.

The number of elements in the vectors is returned if getBindingVectorizedDim() != -1.
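This page does not say how the per-component and per-element values combine into a buffer size. The sketch below shows one plausible calculation. It assumes the vectorized dimension is padded up to a whole number of vectors; bindingBufferBytes is a hypothetical helper, not a TensorRT function, and its arguments stand in for the values returned by getBindingDimensions(), getBindingVectorizedDim(), getBindingComponentsPerElement() and getBindingBytesPerComponent().

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper (not a TensorRT API): estimate the byte size of a
// binding buffer from values reported by the engine.
//   dims                 - fully specified dimensions of the binding
//   vectorizedDim        - vectorized dimension index, or -1 if not vectorized
//   componentsPerElement - scalars packed into one vector element
//   bytesPerComponent    - size of one scalar component in bytes
// Assumption: the vectorized dimension is padded to a whole number of vectors.
std::int64_t bindingBufferBytes(std::vector<std::int64_t> dims, int vectorizedDim,
                                std::int64_t componentsPerElement,
                                std::int64_t bytesPerComponent)
{
    assert(componentsPerElement >= 1 && bytesPerComponent >= 1);
    std::int64_t scalars = 1;
    if (vectorizedDim != -1)
    {
        assert(vectorizedDim >= 0 && vectorizedDim < static_cast<int>(dims.size()));
        // Count vectors along the vectorized dimension, rounding up.
        dims[vectorizedDim] = (dims[vectorizedDim] + componentsPerElement - 1) / componentsPerElement;
        // Each vector holds componentsPerElement scalars (including padding).
        scalars = componentsPerElement;
    }
    for (std::int64_t d : dims)
    {
        scalars *= d;
    }
    return scalars * bytesPerComponent;
}
```

For instance, a [1,3,4,4] binding vectorized along dimension 1 with two 2-byte components per element pads 3 channels to 4, so the helper yields 1 * 4 * 4 * 4 * 2 = 128 bytes.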

Parameters

bindingIndex The binding index.

See also: ICudaEngine::getBindingVectorizedDim()

◆ getBindingDataType()

DataType nvinfer1::ICudaEngine::getBindingDataType ( int32_t bindingIndex ) const

inline noexcept

Determine the required data type for a buffer from its binding index.

Parameters

bindingIndex The binding index.

Returns: The type of the data in the buffer.

See also: getBindingIndex()

◆ getBindingDimensions()

Dims nvinfer1::ICudaEngine::getBindingDimensions ( int32_t bindingIndex ) const

inline noexcept

Get the dimensions of a binding.

Parameters

bindingIndex The binding index.

Returns: The dimensions of the binding if the index is in range, otherwise Dims(). Has -1 for any dimension that varies within the optimization profile.

For example, suppose an INetworkDefinition has an input with shape [-1,-1] that becomes a binding b in the engine. If the associated optimization profile specifies minimum dimensions [6,9] and maximum dimensions [7,9] for b, getBindingDimensions(b) returns [-1,9], even though the second dimension is dynamic in the INetworkDefinition.

Because each optimization profile has separate bindings, the returned value can differ across profiles. Consider another binding b' for the same network input, but for another optimization profile. If that other profile specifies minimum dimensions [5,8] and maximum dimensions [5,9], getBindingDimensions(b') returns [5,-1].
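The rule behind both examples can be modelled in a few lines. This is an illustrative sketch only: reportedDims is a hypothetical helper, not part of TensorRT, and it assumes that a dimension is reported as -1 exactly when the profile's minimum and maximum for that dimension differ.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not a TensorRT API): model of the dimensions reported
// for a binding, given the minimum and maximum dimensions of its profile.
// A dimension is -1 when min and max differ, and the fixed value otherwise.
std::vector<std::int32_t> reportedDims(const std::vector<std::int32_t>& minDims,
                                       const std::vector<std::int32_t>& maxDims)
{
    assert(minDims.size() == maxDims.size());
    std::vector<std::int32_t> out(minDims.size());
    for (std::size_t i = 0; i < minDims.size(); ++i)
    {
        out[i] = (minDims[i] == maxDims[i]) ? minDims[i] : -1;
    }
    return out;
}
```

With the profiles from the text, reportedDims({6,9}, {7,9}) gives [-1,9] and reportedDims({5,8}, {5,9}) gives [5,-1].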

See also: getBindingIndex()

◆ getBindingFormat()

TensorFormat nvinfer1::ICudaEngine::getBindingFormat ( int32_t bindingIndex ) const

inline noexcept

Return the binding format.

Parameters

bindingIndex The binding index.

◆ getBindingFormatDesc()

const char * nvinfer1::ICudaEngine::getBindingFormatDesc ( int32_t bindingIndex ) const

inline noexcept

Return the human readable description of the tensor format.

The description includes the order, vectorization, data type, strides, etc. Examples:

Example 1: kCHW + FP32: "Row major linear FP32 format"
Example 2: kCHW2 + FP16: "Two wide channel vectorized row major FP16 format"
Example 3: kHWC8 + FP16 + Line Stride = 32: "Channel major FP16 format where C % 8 == 0 and H Stride % 32 == 0"

Parameters

bindingIndex The binding index.

◆ getBindingIndex()

int32_t nvinfer1::ICudaEngine::getBindingIndex ( const char * name ) const

inline noexcept

Retrieve the binding index for a named tensor.

IExecutionContext::enqueue() and IExecutionContext::execute() require an array of buffers.

Engine bindings map from tensor names to indices in this array. Binding indices are assigned at engine build time, and take values in the range [0 ... n-1] where n is the total number of inputs and outputs.

To get the binding index of the name in an optimization profile with index k > 0, mangle the name by appending " [profile k]", as described for method getBindingName().

Parameters

name	The tensor name.

Returns: The binding index for the named tensor, or -1 if the name is not found.

See also: getNbBindings(), getBindingName()

◆ getBindingName()

const char * nvinfer1::ICudaEngine::getBindingName ( int32_t bindingIndex ) const

inline noexcept

Retrieve the name corresponding to a binding index.

This is the reverse mapping to that provided by getBindingIndex().

For optimization profiles with an index k > 0, the name is mangled by appending " [profile k]", with k written in decimal. For example, if the tensor in the INetworkDefinition had the name "foo", and bindingIndex refers to that tensor in the optimization profile with index 3, getBindingName returns "foo [profile 3]".
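The mangling rule can be sketched as a small string helper, which is handy for building the name to pass to getBindingIndex(). mangledBindingName is a hypothetical name used only for illustration; it is not part of TensorRT.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not a TensorRT API): build the binding name described
// above for a tensor in the optimization profile with the given index.
std::string mangledBindingName(const std::string& tensorName, int profileIndex)
{
    assert(profileIndex >= 0);
    if (profileIndex == 0)
    {
        return tensorName; // only profiles with index k > 0 are mangled
    }
    // Append " [profile k]" with k written in decimal.
    return tensorName + " [profile " + std::to_string(profileIndex) + "]";
}
```

Calling mangledBindingName("foo", 3) produces "foo [profile 3]", matching the example in the text.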

Parameters

bindingIndex The binding index.

Returns: The name corresponding to the index, or nullptr if the index is out of range.

See also: getBindingIndex()

◆ getBindingVectorizedDim()

int32_t nvinfer1::ICudaEngine::getBindingVectorizedDim ( int32_t bindingIndex ) const

inline noexcept

Return the dimension index along which the buffer is vectorized.

Specifically, -1 is returned if the number of scalars per vector is 1.

Parameters

bindingIndex The binding index.

◆ getDeviceMemorySize()

size_t nvinfer1::ICudaEngine::getDeviceMemorySize ( ) const

inline noexcept

Return the amount of device memory required by an execution context.

See also: IExecutionContext::setDeviceMemory()

◆ getEngineCapability()

EngineCapability nvinfer1::ICudaEngine::getEngineCapability ( ) const

inline noexcept

Determine what execution capability this engine has.

If the engine has EngineCapability::kSTANDARD, then all engine functionality is valid. If the engine has EngineCapability::kSAFETY, then only the functionality in safe engine is valid. If the engine has EngineCapability::kDLA_STANDALONE, then only serialize, destroy, and const-accessor functions are valid.

Returns: The EngineCapability flag that the engine was built for.

◆ getErrorRecorder()

IErrorRecorder * nvinfer1::ICudaEngine::getErrorRecorder ( ) const

inline noexcept

Get the ErrorRecorder assigned to this interface.

Retrieves the assigned error recorder object for the given class. A nullptr will be returned if an error handler has not been set.

Returns: A pointer to the IErrorRecorder object that has been registered.

See also: setErrorRecorder()

◆ getLocation()

TensorLocation nvinfer1::ICudaEngine::getLocation ( int32_t bindingIndex ) const

inline noexcept

Get location of binding.

This lets you know whether the binding should be a pointer to device or host memory.

See also: ITensor::setLocation(), ITensor::getLocation()

Parameters

bindingIndex The binding index.

Returns: The location of the bound tensor with given index.

◆ getMaxBatchSize()

int32_t nvinfer1::ICudaEngine::getMaxBatchSize ( ) const

inline noexcept

Get the maximum batch size which can be used for inference.

For an engine built from an INetworkDefinition without an implicit batch dimension, this will always return 1.

Returns: The maximum batch size for this engine.

◆ getName()

const char * nvinfer1::ICudaEngine::getName ( ) const

inline noexcept

Returns the name of the network associated with the engine.

The name is set during network creation and is retrieved after building or deserialization.

See also: INetworkDefinition::setName(), INetworkDefinition::getName()

Returns: A null-terminated C-style string representing the name of the network.

◆ getNbBindings()

int32_t nvinfer1::ICudaEngine::getNbBindings ( ) const

inline noexcept

Get the number of binding indices.

There are separate binding indices for each optimization profile. This method returns the total over all profiles. If the engine has been built for K profiles, the first getNbBindings() / K bindings are used by profile number 0, the following getNbBindings() / K bindings are used by profile number 1, and so on.

See also: getBindingIndex()
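As a rough illustration of this layout, the range of binding indices used by one profile can be computed as follows. profileBindingRange is a hypothetical helper, not a TensorRT function; its first two arguments stand in for getNbBindings() and getNbOptimizationProfiles().

```cpp
#include <cassert>
#include <utility>

// Hypothetical helper (not a TensorRT API): half-open range [first, last) of
// the binding indices used by one optimization profile.
std::pair<int, int> profileBindingRange(int nbBindings, int nbProfiles, int profileIndex)
{
    assert(nbProfiles >= 1 && nbBindings % nbProfiles == 0);
    assert(profileIndex >= 0 && profileIndex < nbProfiles);
    const int bindingsPerProfile = nbBindings / nbProfiles;
    const int first = profileIndex * bindingsPerProfile;
    return {first, first + bindingsPerProfile};
}
```

For an engine with 6 bindings and 3 profiles, profile 1 uses indices 2 and 3, so the helper returns the pair (2, 4).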

◆ getNbLayers()

int32_t nvinfer1::ICudaEngine::getNbLayers ( ) const

inline noexcept

Get the number of layers in the network.

The number of layers in the network is not necessarily the number in the original network definition, as layers may be combined or eliminated as the engine is optimized. This value can be useful when building per-layer tables, such as when aggregating profiling data over a number of executions.

Returns: The number of layers in the network.

◆ getNbOptimizationProfiles()

int32_t nvinfer1::ICudaEngine::getNbOptimizationProfiles ( ) const

inline noexcept

Get the number of optimization profiles defined for this engine.

Returns: Number of optimization profiles. It is always at least 1.

See also: IExecutionContext::setOptimizationProfile()

◆ getProfileDimensions()

Dims nvinfer1::ICudaEngine::getProfileDimensions	(	int32_t	bindingIndex,
		int32_t	profileIndex,
		OptProfileSelector	select
	)		const

inline noexcept

Get the minimum / optimum / maximum dimensions for a particular binding under an optimization profile.

Parameters

bindingIndex	The binding index, which must belong to the given profile, or be between 0 and bindingsPerProfile-1 as described below.
profileIndex	The profile index, which must be between 0 and getNbOptimizationProfiles()-1.
select	Whether to query the minimum, optimum, or maximum dimensions for this binding.

Returns: The minimum / optimum / maximum dimensions for this binding in this profile. If the profileIndex or bindingIndex are invalid, return Dims with nbDims=-1.

For backwards compatibility with earlier versions of TensorRT, if the bindingIndex does not belong to the current optimization profile, but is between 0 and bindingsPerProfile-1, where bindingsPerProfile = getNbBindings() / getNbOptimizationProfiles(), then a corrected bindingIndex is used instead, computed by:

profileIndex * bindingsPerProfile + bindingIndex % bindingsPerProfile

Otherwise the bindingIndex is considered invalid.
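The correction rule above can be written out as a short function. This is a sketch of the documented behavior, not TensorRT's actual code: resolveBindingIndex is a hypothetical helper, nbBindings and nbProfiles stand in for getNbBindings() and getNbOptimizationProfiles(), and -1 is used here only as this sketch's marker for an invalid index.

```cpp
#include <cassert>

// Hypothetical helper (not a TensorRT API): resolve the binding index that
// would be used for a profile, or -1 if the index is considered invalid.
int resolveBindingIndex(int bindingIndex, int profileIndex, int nbBindings, int nbProfiles)
{
    assert(nbProfiles >= 1 && nbBindings % nbProfiles == 0);
    if (profileIndex < 0 || profileIndex >= nbProfiles)
    {
        return -1;
    }
    const int bindingsPerProfile = nbBindings / nbProfiles;
    const int first = profileIndex * bindingsPerProfile;
    if (bindingIndex >= first && bindingIndex < first + bindingsPerProfile)
    {
        return bindingIndex; // already belongs to the given profile
    }
    if (bindingIndex >= 0 && bindingIndex < bindingsPerProfile)
    {
        // Backwards-compatible correction quoted in the text above.
        return profileIndex * bindingsPerProfile + bindingIndex % bindingsPerProfile;
    }
    return -1;
}
```

With 12 bindings over 3 profiles, index 1 queried for profile 2 is corrected to 2 * 4 + 1 = 9, while index 5 belongs to profile 1 and is rejected.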

◆ getProfileShapeValues()

const int32_t * nvinfer1::ICudaEngine::getProfileShapeValues	(	int32_t	profileIndex,
		int32_t	inputIndex,
		OptProfileSelector	select
	)		const

inline noexcept

Get minimum / optimum / maximum values for an input shape binding under an optimization profile.

Parameters

profileIndex	The profile index (must be between 0 and getNbOptimizationProfiles()-1)
inputIndex	The input index (must be between 0 and getNbBindings() - 1)
select	Whether to query the minimum, optimum, or maximum shape values for this binding.

Returns: If the binding is an input shape binding, return a pointer to an array that has the same number of elements as the corresponding tensor, i.e. 1 if dims.nbDims == 0, or dims.d[0] if dims.nbDims == 1, where dims = getBindingDimensions(inputIndex). The array contains the elementwise minimum / optimum / maximum values for this shape binding under the profile. If either of the indices is out of range, or if the binding is not an input shape binding, return nullptr.
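The length of the returned array follows directly from the binding's dimensions. A minimal sketch, where shapeValueCount is a hypothetical helper (not a TensorRT function) and nbDims and d0 stand in for dims.nbDims and dims.d[0] from getBindingDimensions(inputIndex):

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not a TensorRT API): number of values in the array
// returned for an input shape binding. Shape tensors have at most one
// dimension, so only nbDims == 0 (scalar) and nbDims == 1 are handled.
std::int32_t shapeValueCount(std::int32_t nbDims, std::int32_t d0)
{
    assert(nbDims == 0 || nbDims == 1);
    return nbDims == 0 ? 1 : d0;
}
```

A caller would read exactly this many int32_t values from the returned pointer after checking it against nullptr.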

For backwards compatibility with earlier versions of TensorRT, a bindingIndex that does not belong to the profile is corrected as described for getProfileDimensions.

See also: ICudaEngine::getProfileDimensions

◆ getProfilingVerbosity()

ProfilingVerbosity nvinfer1::ICudaEngine::getProfilingVerbosity ( ) const

inline noexcept

Return the ProfilingVerbosity the builder config was set to when the engine was built.

Returns: The profiling verbosity the builder config was set to when the engine was built.

See also: IBuilderConfig::setProfilingVerbosity()

◆ getTacticSources()

TacticSources nvinfer1::ICudaEngine::getTacticSources ( ) const

inline noexcept

Return the tactic sources required by this engine.

See also: IBuilderConfig::setTacticSources()

◆ hasImplicitBatchDimension()

bool nvinfer1::ICudaEngine::hasImplicitBatchDimension ( ) const

inline noexcept

Query whether the engine was built with an implicit batch dimension.

Returns: True if tensors have implicit batch dimension, false otherwise.

This is an engine-wide property. Either all tensors in the engine have an implicit batch dimension or none of them do.

hasImplicitBatchDimension() is true if and only if the INetworkDefinition from which this engine was built was created with createNetwork() or createNetworkV2() without NetworkDefinitionCreationFlag::kEXPLICIT_BATCH flag.

See also: createNetworkV2

◆ isExecutionBinding()

bool nvinfer1::ICudaEngine::isExecutionBinding ( int32_t bindingIndex ) const

inline noexcept

True if pointer to tensor data is required for execution phase, false if nullptr can be supplied.

For example, if a network uses an input tensor with binding i ONLY as the "reshape dimensions" input of IShuffleLayer, then isExecutionBinding(i) is false, and a nullptr can be supplied for it when calling IExecutionContext::execute or IExecutionContext::enqueue.

See also: isShapeBinding()

◆ isRefittable()

bool nvinfer1::ICudaEngine::isRefittable ( ) const

inline noexcept

Return true if an engine can be refit.

See also: nvinfer1::createInferRefitter()

◆ isShapeBinding()

bool nvinfer1::ICudaEngine::isShapeBinding ( int32_t bindingIndex ) const

inline noexcept

True if tensor is required as input for shape calculations or output from them.

TensorRT evaluates a network in two phases:

Compute shape information required to determine memory allocation requirements and validate that runtime sizes make sense.
Process tensors on the device.

Some tensors are required in phase 1. These tensors are called "shape tensors", and always have type Int32 and no more than one dimension. These tensors are not always shapes themselves, but might be used to calculate tensor shapes for phase 2.

isShapeBinding(i) returns true if the tensor is a required input or an output computed in phase 1. isExecutionBinding(i) returns true if the tensor is a required input or an output computed in phase 2.

For example, if a network uses an input tensor with binding i as an addend to an IElementWiseLayer that computes the "reshape dimensions" for IShuffleLayer, then isShapeBinding(i) == true.

It's possible to have a tensor be required by both phases. For instance, a tensor can be used for the "reshape dimensions" and as the indices for an IGatherLayer collecting floating-point data.

It's also possible for a tensor to be required by neither phase and still show up in the engine's inputs. For example, if an input tensor is used only as an input to IShapeLayer, only its shape matters and its values are irrelevant.
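The four combinations described above can be summarized in a small classifier. describeBinding is a hypothetical helper for illustration only; its arguments stand in for the results of isShapeBinding(i) and isExecutionBinding(i).

```cpp
#include <string>

// Hypothetical helper (not a TensorRT API): summarize in which phase the
// values of a binding are needed.
std::string describeBinding(bool isShape, bool isExecution)
{
    if (isShape && isExecution)
    {
        return "values needed in phase 1 and phase 2";
    }
    if (isShape)
    {
        return "values needed in phase 1 only";
    }
    if (isExecution)
    {
        return "values needed in phase 2 only";
    }
    return "values needed in neither phase";
}
```

An application could use such a summary when deciding which buffers must be populated before calling execute or enqueue.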

See also: isExecutionBinding()

◆ serialize()

IHostMemory * nvinfer1::ICudaEngine::serialize ( ) const

inline noexcept

Serialize the network to a stream.

Returns: An IHostMemory object that contains the serialized engine.

The network may be deserialized with IRuntime::deserializeCudaEngine().

See also: IRuntime::deserializeCudaEngine()

◆ setErrorRecorder()

void nvinfer1::ICudaEngine::setErrorRecorder ( IErrorRecorder * recorder )

inline noexcept

Set the ErrorRecorder for this interface.

Assigns the ErrorRecorder to this interface. The ErrorRecorder will track all errors during execution. This function will call incRefCount of the registered ErrorRecorder at least once. Setting recorder to nullptr unregisters the recorder with the interface, resulting in a call to decRefCount if a recorder has been registered.

If an error recorder is not set, messages will be sent to the global log stream.

Parameters

recorder The error recorder to register with this interface.

See also: getErrorRecorder()

The documentation for this class was generated from the following file:

NvInferRuntime.h
