Updates weights in an engine.

#include <NvInferRuntime.h>

Inheritance: nvinfer1::IRefitter derives from nvinfer1::INoCopy.

Public Member Functions
virtual	~IRefitter () noexcept=default

bool	setWeights (char const *layerName, WeightsRole role, Weights weights) noexcept
	Specify new weights for a layer of given name. Returns true on success, or false if new weights are rejected.

bool	refitCudaEngine () noexcept
	Refits associated engine.

int32_t	getMissing (int32_t size, char const **layerNames, WeightsRole *roles) noexcept
	Get description of missing weights.

int32_t	getAll (int32_t size, char const **layerNames, WeightsRole *roles) noexcept
	Get description of all weights that could be refit.

bool	setDynamicRange (char const *tensorName, float min, float max) noexcept
	Update dynamic range for a tensor.

float	getDynamicRangeMin (char const *tensorName) const noexcept
	Get minimum of dynamic range.

float	getDynamicRangeMax (char const *tensorName) const noexcept
	Get maximum of dynamic range.

int32_t	getTensorsWithDynamicRange (int32_t size, char const **tensorNames) const noexcept
	Get names of all tensors that have refittable dynamic ranges.

void	setErrorRecorder (IErrorRecorder *recorder) noexcept
	Set the ErrorRecorder for this interface.

IErrorRecorder *	getErrorRecorder () const noexcept
	Get the ErrorRecorder assigned to this interface.

bool	setNamedWeights (char const *name, Weights weights) noexcept
	Specify new weights of given name.

int32_t	getMissingWeights (int32_t size, char const **weightsNames) noexcept
	Get names of missing weights.

int32_t	getAllWeights (int32_t size, char const **weightsNames) noexcept
	Get names of all weights that could be refit.

ILogger *	getLogger () const noexcept
	Get the logger with which the refitter was created.

bool	setMaxThreads (int32_t maxThreads) noexcept
	Set the maximum number of threads.

int32_t	getMaxThreads () const noexcept
	Get the maximum number of threads that can be used by the refitter.

bool	setNamedWeights (char const *name, Weights weights, TensorLocation location) noexcept
	Specify new weights on a specified device of given name.

Weights	getNamedWeights (char const *weightsName) const noexcept
	Get weights associated with the given name.

TensorLocation	getWeightsLocation (char const *weightsName) const noexcept
	Get location for the weights associated with the given name.

bool	unsetNamedWeights (char const *weightsName) noexcept
	Unset weights associated with the given name.

void	setWeightsValidation (bool weightsValidation) noexcept
	Set whether to validate weights during refitting.

bool	getWeightsValidation () const noexcept
	Get whether to validate weights values during refitting.

bool	refitCudaEngineAsync (cudaStream_t stream) noexcept
	Enqueue weights refitting of the associated engine on the given stream.

Weights	getWeightsPrototype (char const *weightsName) const noexcept
	Get the Weights prototype associated with the given name.

Protected Attributes
apiv::VRefitter *	mImpl

Additional Inherited Members
Protected Member Functions inherited from nvinfer1::INoCopy
	INoCopy ()=default

virtual	~INoCopy ()=default

	INoCopy (INoCopy const &other)=delete

INoCopy &	operator= (INoCopy const &other)=delete

	INoCopy (INoCopy &&other)=delete

INoCopy &	operator= (INoCopy &&other)=delete

Detailed Description

Updates weights in an engine.

Warning: Do not inherit from this class, as doing so will break forward-compatibility of the API and ABI.

Constructor & Destructor Documentation

◆ ~IRefitter()

virtual nvinfer1::IRefitter::~IRefitter ( )

virtual default noexcept

Member Function Documentation

◆ getAll()

int32_t nvinfer1::IRefitter::getAll	(	int32_t	size,
		char const **	layerNames,
		WeightsRole *	roles
	)

inline noexcept

Get description of all weights that could be refit.

Parameters

size	The number of items that can be safely written to a non-null layerNames or roles.
layerNames	Where to write the layer names.
roles	Where to write the weights roles.

Returns: The number of Weights that could be refit.

If layerNames!=nullptr, each written pointer points to a string owned by the engine being refit, and becomes invalid when the engine is destroyed.
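
The size and return-value semantics above suggest a two-call pattern: ask for the count with null buffers, then call again with buffers of that size. The following is a minimal sketch of that pattern, not code taken from TensorRT. `FakeRefitter`, the two-value `WeightsRole` enum, and `listRefittable` are hypothetical stand-ins so the snippet compiles without the TensorRT headers; real code would include NvInferRuntime.h and pass an nvinfer1::IRefitter.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for nvinfer1::WeightsRole (only two values shown).
enum class WeightsRole : int32_t { kKERNEL = 0, kBIAS = 1 };

// Hypothetical stand-in that mirrors only the getAll() signature above.
struct FakeRefitter
{
    std::vector<std::pair<std::string, WeightsRole>> entries;

    int32_t getAll(int32_t size, char const** layerNames, WeightsRole* roles) noexcept
    {
        int32_t const total = static_cast<int32_t>(entries.size());
        for (int32_t i = 0; i < size && i < total; ++i)
        {
            if (layerNames != nullptr) layerNames[i] = entries[i].first.c_str();
            if (roles != nullptr) roles[i] = entries[i].second;
        }
        return total;
    }
};

// Query the count first, then fetch names and roles into buffers of that size.
template <typename Refitter>
std::vector<std::pair<std::string, WeightsRole>> listRefittable(Refitter& refitter)
{
    int32_t const n = refitter.getAll(0, nullptr, nullptr);
    std::vector<char const*> names(n);
    std::vector<WeightsRole> roles(n);
    refitter.getAll(n, names.data(), roles.data());

    std::vector<std::pair<std::string, WeightsRole>> out;
    for (int32_t i = 0; i < n; ++i)
    {
        // Copy each name: the pointer is owned by the engine being refit.
        out.emplace_back(names[i], roles[i]);
    }
    return out;
}
```

Copying the names into std::string is a design choice made here so the result can outlive the engine; keeping the raw pointers is also valid while the engine exists.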

◆ getAllWeights()

int32_t nvinfer1::IRefitter::getAllWeights	(	int32_t	size,
		char const **	weightsNames
	)

inline noexcept

Get names of all weights that could be refit.

Parameters

size	The number of weights names that can be safely written to.
weightsNames	Where to write the weights names.

Returns: The number of Weights that could be refit.

If weightsNames!=nullptr, each written pointer points to a string owned by the engine being refit, and becomes invalid when the engine is destroyed.

◆ getDynamicRangeMax()

float nvinfer1::IRefitter::getDynamicRangeMax ( char const * tensorName ) const

inline noexcept

Get maximum of dynamic range.

Returns: Maximum of dynamic range.

If the dynamic range was never set, returns the maximum computed during calibration.

Warning: The string tensorName must be null-terminated, and be at most 4096 bytes including the terminator.
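
The same 4096-byte limit applies to every name argument in this class. A caller that builds names dynamically might check them before calling into the refitter. The helper below is a sketch under that assumption; `kMaxNameBytes` and `fitsNameLimit` are hypothetical names, not part of the TensorRT API. It is slightly stricter than the warning requires, because it also rejects strings with an embedded NUL, which would be silently truncated when passed as `char const*`.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Limit quoted in the warning above: 4096 bytes including the terminator.
constexpr std::size_t kMaxNameBytes = 4096;

// Returns true if name.c_str() is safe to pass as a tensor, layer, or weights name.
inline bool fitsNameLimit(std::string const& name)
{
    bool const noEmbeddedNul = name.find('\0') == std::string::npos;
    // +1 accounts for the null terminator appended by c_str().
    return noEmbeddedNul && name.size() + 1 <= kMaxNameBytes;
}
```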

◆ getDynamicRangeMin()

float nvinfer1::IRefitter::getDynamicRangeMin ( char const * tensorName ) const

inline noexcept

Get minimum of dynamic range.

Returns: Minimum of dynamic range.

If the dynamic range was never set, returns the minimum computed during calibration.

Warning: The string tensorName must be null-terminated, and be at most 4096 bytes including the terminator.

◆ getErrorRecorder()

IErrorRecorder * nvinfer1::IRefitter::getErrorRecorder ( ) const

inline noexcept

Get the ErrorRecorder assigned to this interface.

Retrieves the assigned error recorder object for the given class. A nullptr will be returned if an error handler has not been set.

Returns: A pointer to the IErrorRecorder object that has been registered.

See also: setErrorRecorder()

◆ getLogger()

ILogger * nvinfer1::IRefitter::getLogger ( ) const

inline noexcept

Get the logger with which the refitter was created.

Returns: The logger.

◆ getMaxThreads()

int32_t nvinfer1::IRefitter::getMaxThreads ( ) const

inline noexcept

Get the maximum number of threads that can be used by the refitter.

Returns: The maximum number of threads that can be used by the refitter.

See also: setMaxThreads()

◆ getMissing()

int32_t nvinfer1::IRefitter::getMissing	(	int32_t	size,
		char const **	layerNames,
		WeightsRole *	roles
	)

inline noexcept

Get description of missing weights.

For example, if some Weights have been set, but the engine was optimized in a way that combines weights, any unsupplied Weights in the combination are considered missing.

Parameters

size	The number of items that can be safely written to a non-null layerNames or roles.
layerNames	Where to write the layer names.
roles	Where to write the weights roles.

Returns: The number of missing Weights.

If layerNames!=nullptr, each written pointer points to a string owned by the engine being refit, and becomes invalid when the engine is destroyed.

◆ getMissingWeights()

int32_t nvinfer1::IRefitter::getMissingWeights	(	int32_t	size,
		char const **	weightsNames
	)

inline noexcept

Get names of missing weights.

For example, if some Weights have been set, but the engine was optimized in a way that combines weights, any unsupplied Weights in the combination are considered missing.

Parameters

size	The number of weights names that can be safely written to.
weightsNames	Where to write the weights names.

Returns: The number of missing Weights.

If weightsNames!=nullptr, each written pointer points to a string owned by the engine being refit, and becomes invalid when the engine is destroyed.
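
Because a refit fails while any weights are missing, it can be useful to list the missing names first. The sketch below shows one way to do that. `FakeRefitter` and `missingWeights` are hypothetical; the stand-in treats every entry in `required` as part of one combined group, which is a simplification of how an optimized engine groups weights.

```cpp
#include <cassert>
#include <cstdint>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in mirroring only getMissingWeights(); not the TensorRT class.
struct FakeRefitter
{
    std::vector<std::string> required; // weights the engine needs as a group
    std::set<std::string> supplied;    // names already given to the refitter

    int32_t getMissingWeights(int32_t size, char const** weightsNames) noexcept
    {
        int32_t count = 0;
        for (std::string const& name : required)
        {
            if (supplied.count(name) != 0) continue;
            if (weightsNames != nullptr && count < size) weightsNames[count] = name.c_str();
            ++count;
        }
        return count;
    }
};

// Returns copies of the names that still have to be supplied before a refit.
template <typename Refitter>
std::vector<std::string> missingWeights(Refitter& refitter)
{
    int32_t const n = refitter.getMissingWeights(0, nullptr);
    std::vector<char const*> names(n);
    refitter.getMissingWeights(n, names.data());
    return std::vector<std::string>(names.begin(), names.end());
}
```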

◆ getNamedWeights()

Weights nvinfer1::IRefitter::getNamedWeights ( char const * weightsName ) const

inline noexcept

Get weights associated with the given name.

Parameters

weightsName The name of the weights to be refitted.

Returns: Weights associated with the given name.

If the weights were never set, returns null weights and reports an error to the refitter errorRecorder.

Warning: The string weightsName must be null-terminated, and be at most 4096 bytes including the terminator.

◆ getTensorsWithDynamicRange()

int32_t nvinfer1::IRefitter::getTensorsWithDynamicRange	(	int32_t	size,
		char const **	tensorNames
	)		const

inline noexcept

Get names of all tensors that have refittable dynamic ranges.

Parameters

size	The number of items that can be safely written to a non-null tensorNames.
tensorNames	Where to write the tensor names.

Returns: The number of tensors that have refittable dynamic ranges.

If tensorNames!=nullptr, each written pointer points to a string owned by the engine being refit, and becomes invalid when the engine is destroyed.

◆ getWeightsLocation()

TensorLocation nvinfer1::IRefitter::getWeightsLocation ( char const * weightsName ) const

inline noexcept

Get location for the weights associated with the given name.

Parameters

weightsName The name of the weights to be refitted.

Returns: Location for the weights associated with the given name.

If the weights were never set, returns TensorLocation::kHOST and reports an error to the refitter errorRecorder.

Warning: The string weightsName must be null-terminated, and be at most 4096 bytes including the terminator.

◆ getWeightsPrototype()

Weights nvinfer1::IRefitter::getWeightsPrototype ( char const * weightsName ) const

inline noexcept

Get the Weights prototype associated with the given name.

Parameters

weightsName The name of the weights to be refitted.

Returns: Weights prototype associated with the given name.

The type and count of weights prototype is the same as weights used for engine building. The values property is nullptr for weights prototypes. The count of the weights prototype is -1 when the name of the weights is nullptr or does not correspond to any refittable weights.

Warning: The string weightsName must be null-terminated, and be at most 4096 bytes including the terminator.
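
The prototype makes it possible to check candidate weights on the caller's side before handing them to the refitter. The sketch below compares type and count, the two properties described above, and treats a count of -1 as "no such refittable weights". `DataType`, `Weights`, and `matchesPrototype` are hypothetical stand-ins shaped like their nvinfer1 counterparts so the snippet compiles on its own.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins shaped like nvinfer1::DataType and nvinfer1::Weights.
enum class DataType : int32_t { kFLOAT = 0, kHALF = 1 };

struct Weights
{
    DataType type;
    void const* values;
    int64_t count;
};

// Returns true if `candidate` agrees with `prototype` in type and count.
inline bool matchesPrototype(Weights const& prototype, Weights const& candidate)
{
    // A count of -1 signals a null name or a name with no refittable weights.
    if (prototype.count < 0)
    {
        return false;
    }
    return prototype.type == candidate.type && prototype.count == candidate.count;
}
```

In real code the prototype would come from getWeightsPrototype() and a passing candidate would then go to setNamedWeights().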

◆ getWeightsValidation()

bool nvinfer1::IRefitter::getWeightsValidation ( ) const

inline noexcept

Get whether to validate weights values during refitting.

◆ refitCudaEngine()

bool nvinfer1::IRefitter::refitCudaEngine ( )

inline noexcept

Refits associated engine.

Returns: True on success, or false if new weights validation fails or getMissingWeights() != 0 before the call. If false is returned, a subset of weights may have been refitted.

The behavior is undefined if the engine has pending enqueued work. Provided weights on CPU or GPU can be unset and released, or updated after refitCudaEngine returns.

IExecutionContexts associated with the engine remain valid for use afterwards. There is no need to set the same weights repeatedly for multiple refit calls as the weights memory can be updated directly instead.
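
Put together, a synchronous refit can be sketched as three steps: supply the weights, confirm nothing is missing, then refit. The code below is an illustration under stated assumptions, not TensorRT's implementation. `Weights`, `FakeRefitter`, and `refitAll` are hypothetical, and the stand-in uses a toy rule in which all refittable weights form a single combined group.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <set>
#include <string>

// Hypothetical stand-in shaped like nvinfer1::Weights (type reduced to an int).
struct Weights
{
    int32_t type;
    void const* values;
    int64_t count;
};

// Hypothetical stand-in exposing the three calls used by the workflow.
struct FakeRefitter
{
    std::set<std::string> refittable;
    std::set<std::string> supplied;
    bool refitted{false};

    bool setNamedWeights(char const* name, Weights) noexcept
    {
        if (name == nullptr || refittable.count(name) == 0) return false;
        supplied.insert(name);
        return true;
    }

    int32_t getMissingWeights(int32_t, char const**) noexcept
    {
        // Toy rule: once anything is supplied, every other weight is missing.
        if (supplied.empty()) return 0;
        return static_cast<int32_t>(refittable.size() - supplied.size());
    }

    bool refitCudaEngine() noexcept
    {
        refitted = getMissingWeights(0, nullptr) == 0;
        return refitted;
    }
};

// Supply all weights, verify none are missing, then refit.
template <typename Refitter, typename WeightsT>
bool refitAll(Refitter& refitter, std::map<std::string, WeightsT> const& newWeights)
{
    for (auto const& entry : newWeights)
    {
        if (!refitter.setNamedWeights(entry.first.c_str(), entry.second)) return false;
    }
    if (refitter.getMissingWeights(0, nullptr) != 0) return false;
    return refitter.refitCudaEngine();
}
```

Checking getMissingWeights() before the refit call avoids the partially refitted state that a false return from refitCudaEngine() can leave behind.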

◆ refitCudaEngineAsync()

bool nvinfer1::IRefitter::refitCudaEngineAsync ( cudaStream_t stream )

inline noexcept

Enqueue weights refitting of the associated engine on the given stream.

Parameters

stream The stream to enqueue the weights updating task.

Returns: True on success, or false if new weights validation fails or getMissingWeights() != 0 before the call. If false is returned, a subset of weights may have been refitted.

The behavior is undefined if the engine has pending enqueued work on a different stream from the provided one. Provided weights on CPU can be unset and released, or updated after refitCudaEngineAsync returns. Freeing or updating of the provided weights on GPU can be enqueued on the same stream after refitCudaEngineAsync returns.

IExecutionContexts associated with the engine remain valid for use afterwards. There is no need to set the same weights repeatedly for multiple refit calls as the weights memory can be updated directly instead. The weights updating task should use the same stream as the one used for the refit call.

◆ setDynamicRange()

bool nvinfer1::IRefitter::setDynamicRange	(	char const *	tensorName,
		float	min,
		float	max
	)

inline noexcept

Update dynamic range for a tensor.

Parameters

tensorName	The name of an ITensor in the network.
min	The minimum of the dynamic range for the tensor.
max	The maximum of the dynamic range for the tensor.

Returns: True if successful; false otherwise.

Returns false if there is no Int8 engine tensor derived from a network tensor of that name. If successful, then getMissing may report that some weights need to be supplied.

Warning: The string tensorName must be null-terminated, and be at most 4096 bytes including the terminator.

◆ setErrorRecorder()

void nvinfer1::IRefitter::setErrorRecorder ( IErrorRecorder * recorder )

inline noexcept

Set the ErrorRecorder for this interface.

Assigns the ErrorRecorder to this interface. The ErrorRecorder will track all errors during execution. This function will call incRefCount of the registered ErrorRecorder at least once. Setting recorder to nullptr unregisters the recorder with the interface, resulting in a call to decRefCount if a recorder has been registered.

If an error recorder is not set, messages will be sent to the global log stream.

Parameters

recorder The error recorder to register with this interface.

See also: getErrorRecorder()

◆ setMaxThreads()

bool nvinfer1::IRefitter::setMaxThreads ( int32_t maxThreads )

inline noexcept

Set the maximum number of threads.

Parameters

maxThreads The maximum number of threads that can be used by the refitter.

Returns: True if successful, false otherwise.

The default value is 1 and includes the current thread. A value greater than 1 permits TensorRT to use multi-threaded algorithms. A value less than 1 triggers a kINVALID_ARGUMENT error.
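
Since values below 1 are rejected, a caller that derives the thread count from the environment may want to clamp it first. For instance, std::thread::hardware_concurrency() is allowed to return 0 when the value cannot be determined. The helper below is a sketch of such clamping; `clampThreadCount` and its `upperBound` parameter are hypothetical and not part of the TensorRT API.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Clamp a requested thread count into [1, upperBound] so that the result is
// always acceptable as a maxThreads argument (which must be at least 1).
inline int32_t clampThreadCount(int64_t requested, int32_t upperBound)
{
    int64_t const high = std::max<int64_t>(1, upperBound);
    int64_t const low = std::max<int64_t>(1, requested);
    return static_cast<int32_t>(std::min<int64_t>(low, high));
}
```

A caller might then pass `clampThreadCount(std::thread::hardware_concurrency(), 8)` to setMaxThreads(), where the cap of 8 is an arbitrary example value.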

◆ setNamedWeights() [1/2]

bool nvinfer1::IRefitter::setNamedWeights	(	char const *	name,
		Weights	weights
	)

inline noexcept

Specify new weights of given name.

Parameters

name	The name of the weights to be refit.
weights	The new weights to associate with the name.

Returns: True on success, or false if new weights are rejected. Possible reasons for rejection are:

The name of weights is nullptr or does not correspond to any refittable weights.
The count of the weights is inconsistent with the count returned from calling getWeightsPrototype() with the same name.
The type of the weights is inconsistent with the type returned from calling getWeightsPrototype() with the same name.

Modifying the weights before method refitCudaEngine or refitCudaEngineAsync returns will result in undefined behavior.

Warning: The string name must be null-terminated, and be at most 4096 bytes including the terminator.

◆ setNamedWeights() [2/2]

bool nvinfer1::IRefitter::setNamedWeights	(	char const *	name,
		Weights	weights,
		TensorLocation	location
	)

inline noexcept

Specify new weights on a specified device of given name.

Parameters

name	The name of the weights to be refitted.
weights	The new weights on the specified device.
location	The location (host vs. device) of the new weights.

Returns: True on success, or false if new weights are rejected. Possible reasons for rejection are:

The name of the weights is nullptr or does not correspond to any refittable weights.
The count of the weights is inconsistent with the count returned from calling getWeightsPrototype() with the same name.
The type of the weights is inconsistent with the type returned from calling getWeightsPrototype() with the same name.

It is allowed to provide some weights on CPU and others on GPU. Modifying the weights before the method refitCudaEngine() or refitCudaEngineAsync() completes will result in undefined behavior.

Warning: The string name must be null-terminated, and be at most 4096 bytes including the terminator.

◆ setWeights()

bool nvinfer1::IRefitter::setWeights	(	char const *	layerName,
		WeightsRole	role,
		Weights	weights
	)

inline noexcept

Specify new weights for a layer of given name.

Returns: True on success, or false if new weights are rejected. Possible reasons for rejection are:

There is no such layer by that name.
The layer does not have weights with the specified role.
The count of weights is inconsistent with the layer’s original specification.
The type of weights is inconsistent with the layer’s original specification.

Modifying the weights before method refitCudaEngine or refitCudaEngineAsync returns will result in undefined behavior.

Warning: The string layerName must be null-terminated, and be at most 4096 bytes including the terminator.

◆ setWeightsValidation()

void nvinfer1::IRefitter::setWeightsValidation ( bool weightsValidation )

inline noexcept

Set whether to validate weights during refitting.

Parameters

weightsValidation Indicate whether to validate weights during refitting.

When set to true, TensorRT will validate weights during FP32 to FP16/BF16 weights conversions or sparsifying weights in the refit call. If provided weights are not proper for some weights transformations, TensorRT will issue a warning and continue the transformation for minor issues (such as overflow during narrowing conversion), or issue an error and stop the refitting process for severe issues (such as sparsifying dense weights). By default the flag is true. Set the flag to false for faster refitting performance.
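
If validation is turned off for speed, a caller could run a cheap check of its own for the narrowing case mentioned above. The sketch below counts FP32 values whose magnitude exceeds 65504, the largest finite FP16 value. It is a conservative illustration and does not reproduce TensorRT's validation: values just above 65504 may still round down to it, and `kMaxFiniteHalf` and `countHalfOverflows` are hypothetical names.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Largest finite value representable in IEEE 754 binary16 (FP16).
constexpr float kMaxFiniteHalf = 65504.0F;

// Counts finite FP32 values whose magnitude is above the FP16 finite range.
inline std::size_t countHalfOverflows(float const* values, std::size_t count)
{
    std::size_t overflows = 0;
    for (std::size_t i = 0; i < count; ++i)
    {
        if (std::isfinite(values[i]) && std::fabs(values[i]) > kMaxFiniteHalf)
        {
            ++overflows;
        }
    }
    return overflows;
}
```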

◆ unsetNamedWeights()

bool nvinfer1::IRefitter::unsetNamedWeights ( char const * weightsName )

inline noexcept

Unset weights associated with the given name.

Parameters

weightsName The name of the weights to be refitted.

Returns: False if the weights were never set; true otherwise.

Unset weights before releasing them.

Warning: The string weightsName must be null-terminated, and be at most 4096 bytes including the terminator.
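
One way to honor "unset before releasing" is to tie both steps to the lifetime of an owning object. The sketch below is a hypothetical RAII wrapper, not part of TensorRT: `ScopedNamedWeights` owns a host buffer, registers it on construction, and unsets it in its destructor before the buffer is freed. `Weights` and `FakeRefitter` are stand-ins so the snippet compiles without the TensorRT headers.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in shaped like nvinfer1::Weights (type reduced to an int).
struct Weights
{
    int32_t type;
    void const* values;
    int64_t count;
};

// Hypothetical stand-in mirroring setNamedWeights() and unsetNamedWeights().
struct FakeRefitter
{
    std::map<std::string, Weights> named;

    bool setNamedWeights(char const* name, Weights weights) noexcept
    {
        if (name == nullptr) return false;
        named[name] = weights;
        return true;
    }

    bool unsetNamedWeights(char const* weightsName) noexcept
    {
        return weightsName != nullptr && named.erase(weightsName) == 1;
    }
};

// Owns a host buffer and keeps it registered with the refitter for its lifetime.
template <typename Refitter>
class ScopedNamedWeights
{
public:
    ScopedNamedWeights(Refitter& refitter, std::string name, std::vector<float> data)
        : mRefitter(refitter)
        , mName(std::move(name))
        , mData(std::move(data))
    {
        Weights const w{0, mData.data(), static_cast<int64_t>(mData.size())};
        mIsSet = mRefitter.setNamedWeights(mName.c_str(), w);
    }

    ~ScopedNamedWeights()
    {
        // Unset first; mData is destroyed only after this body has run.
        if (mIsSet)
        {
            mRefitter.unsetNamedWeights(mName.c_str());
        }
    }

    ScopedNamedWeights(ScopedNamedWeights const&) = delete;
    ScopedNamedWeights& operator=(ScopedNamedWeights const&) = delete;

    bool isSet() const noexcept { return mIsSet; }

private:
    Refitter& mRefitter;
    std::string mName;
    std::vector<float> mData;
    bool mIsSet{false};
};
```

The wrapper is non-copyable so that two objects never try to unset the same name.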

Member Data Documentation

◆ mImpl

apiv::VRefitter* nvinfer1::IRefitter::mImpl

protected

The documentation for this class was generated from the following file:

NvInferRuntime.h
