TensorRT 10.0.0
nvinfer1::IRefitter Class Reference

Updates weights in an engine.

#include <NvInferRuntime.h>

Inherits nvinfer1::INoCopy.

Public Member Functions

virtual ~IRefitter () noexcept=default
 
bool setWeights (char const *layerName, WeightsRole role, Weights weights) noexcept
 Specify new weights for a layer of given name.

bool refitCudaEngine () noexcept
 Refits associated engine.

int32_t getMissing (int32_t size, char const **layerNames, WeightsRole *roles) noexcept
 Get description of missing weights.

int32_t getAll (int32_t size, char const **layerNames, WeightsRole *roles) noexcept
 Get description of all weights that could be refit.

bool setDynamicRange (char const *tensorName, float min, float max) noexcept
 Update dynamic range for a tensor.

float getDynamicRangeMin (char const *tensorName) const noexcept
 Get minimum of dynamic range.

float getDynamicRangeMax (char const *tensorName) const noexcept
 Get maximum of dynamic range.

int32_t getTensorsWithDynamicRange (int32_t size, char const **tensorNames) const noexcept
 Get names of all tensors that have refittable dynamic ranges.

void setErrorRecorder (IErrorRecorder *recorder) noexcept
 Set the ErrorRecorder for this interface.

IErrorRecorder * getErrorRecorder () const noexcept
 Get the ErrorRecorder assigned to this interface.

bool setNamedWeights (char const *name, Weights weights) noexcept
 Specify new weights of given name.

int32_t getMissingWeights (int32_t size, char const **weightsNames) noexcept
 Get names of missing weights.

int32_t getAllWeights (int32_t size, char const **weightsNames) noexcept
 Get names of all weights that could be refit.

ILogger * getLogger () const noexcept
 Get the logger with which the refitter was created.

bool setMaxThreads (int32_t maxThreads) noexcept
 Set the maximum number of threads.

int32_t getMaxThreads () const noexcept
 Get the maximum number of threads that can be used by the refitter.

bool setNamedWeights (char const *name, Weights weights, TensorLocation location) noexcept
 Specify new weights on a specified device of given name.

Weights getNamedWeights (char const *weightsName) const noexcept
 Get weights associated with the given name.

TensorLocation getWeightsLocation (char const *weightsName) const noexcept
 Get location for the weights associated with the given name.

bool unsetNamedWeights (char const *weightsName) noexcept
 Unset weights associated with the given name.

void setWeightsValidation (bool weightsValidation) noexcept
 Set whether to validate weights during refitting.

bool getWeightsValidation () const noexcept
 Get whether to validate weights values during refitting.

bool refitCudaEngineAsync (cudaStream_t stream) noexcept
 Enqueue weights refitting of the associated engine on the given stream.

Weights getWeightsPrototype (char const *weightsName) const noexcept
 Get the Weights prototype associated with the given name.
 

Protected Attributes

apiv::VRefitter * mImpl
 

Additional Inherited Members

Protected Member Functions inherited from nvinfer1::INoCopy

 INoCopy ()=default

virtual ~INoCopy ()=default

 INoCopy (INoCopy const &other)=delete

INoCopy & operator= (INoCopy const &other)=delete

 INoCopy (INoCopy &&other)=delete

INoCopy & operator= (INoCopy &&other)=delete
 

Detailed Description

Updates weights in an engine.

Warning
Do not inherit from this class, as doing so will break forward-compatibility of the API and ABI.

Constructor & Destructor Documentation

◆ ~IRefitter()

virtual nvinfer1::IRefitter::~IRefitter ( )
virtual, default, noexcept

Member Function Documentation

◆ getAll()

int32_t nvinfer1::IRefitter::getAll (int32_t size, char const **layerNames, WeightsRole *roles)
inline, noexcept

Get description of all weights that could be refit.

Parameters
size: The number of items that can be safely written to a non-null layerNames or roles.
layerNames: Where to write the layer names.
roles: Where to write the weights roles.
Returns
The number of Weights that could be refit.

If layerNames != nullptr, each written pointer points to a string owned by the engine being refit, and becomes invalid when the engine is destroyed.

◆ getAllWeights()

int32_t nvinfer1::IRefitter::getAllWeights (int32_t size, char const **weightsNames)
inline, noexcept

Get names of all weights that could be refit.

Parameters
size: The number of weights names that can be safely written to.
weightsNames: The names of the weights to be updated, or nullptr for unnamed weights.
Returns
The number of Weights that could be refit.

If weightsNames != nullptr, each written pointer points to a string owned by the engine being refit, and becomes invalid when the engine is destroyed.
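
A common way to use this kind of size-plus-buffer interface is a two-call pattern: ask for the count first, then size a buffer and fetch the names. The sketch below assumes that passing a size of 0 with a nullptr buffer returns the count. FakeRefitter and listRefittableWeights are hypothetical names introduced only so the example compiles without TensorRT; with a real refitter you would pass the object behind your nvinfer1::IRefitter pointer instead.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in that mimics only the getAllWeights() signature,
// so this sketch compiles without TensorRT. It is not part of the real API.
struct FakeRefitter
{
    std::vector<char const*> names{"conv1.weight", "conv1.bias"};

    int32_t getAllWeights(int32_t size, char const** weightsNames) noexcept
    {
        int32_t const total = static_cast<int32_t>(names.size());
        if (weightsNames != nullptr)
        {
            for (int32_t i = 0; i < size && i < total; ++i)
            {
                weightsNames[i] = names[static_cast<std::size_t>(i)];
            }
        }
        return total;
    }
};

// Two-call pattern: query the count, size a buffer, then fetch the names.
// Works with any type exposing getAllWeights(int32_t, char const**).
template <typename Refitter>
std::vector<std::string> listRefittableWeights(Refitter& refitter)
{
    int32_t const count = refitter.getAllWeights(0, nullptr);
    std::vector<char const*> raw(static_cast<std::size_t>(count), nullptr);
    refitter.getAllWeights(count, raw.data());

    // Copy into std::string so the result does not depend on engine-owned pointers.
    std::vector<std::string> result;
    for (char const* name : raw)
    {
        if (name != nullptr)
        {
            result.emplace_back(name);
        }
    }
    return result;
}
```

Copying the names into std::string is a design choice that sidesteps the lifetime rule above, since the raw pointers become invalid when the engine is destroyed.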

◆ getDynamicRangeMax()

float nvinfer1::IRefitter::getDynamicRangeMax ( char const *  tensorName) const
inline, noexcept

Get maximum of dynamic range.

Returns
Maximum of dynamic range.

If the dynamic range was never set, returns the maximum computed during calibration.

Warning
The string tensorName must be null-terminated, and be at most 4096 bytes including the terminator.

◆ getDynamicRangeMin()

float nvinfer1::IRefitter::getDynamicRangeMin ( char const *  tensorName) const
inline, noexcept

Get minimum of dynamic range.

Returns
Minimum of dynamic range.

If the dynamic range was never set, returns the minimum computed during calibration.

Warning
The string tensorName must be null-terminated, and be at most 4096 bytes including the terminator.

◆ getErrorRecorder()

IErrorRecorder * nvinfer1::IRefitter::getErrorRecorder ( ) const
inline, noexcept

Get the ErrorRecorder assigned to this interface.

Retrieves the assigned error recorder object for the given class. A nullptr will be returned if an error handler has not been set.

Returns
A pointer to the IErrorRecorder object that has been registered.
See also
setErrorRecorder()

◆ getLogger()

ILogger * nvinfer1::IRefitter::getLogger ( ) const
inline, noexcept

Get the logger with which the refitter was created.

Returns
The logger.

◆ getMaxThreads()

int32_t nvinfer1::IRefitter::getMaxThreads ( ) const
inline, noexcept

Get the maximum number of threads that can be used by the refitter.

Returns
The maximum number of threads that can be used by the refitter.
See also
setMaxThreads()

◆ getMissing()

int32_t nvinfer1::IRefitter::getMissing (int32_t size, char const **layerNames, WeightsRole *roles)
inline, noexcept

Get description of missing weights.

For example, if some Weights have been set, but the engine was optimized in a way that combines weights, any unsupplied Weights in the combination are considered missing.

Parameters
size: The number of items that can be safely written to a non-null layerNames or roles.
layerNames: Where to write the layer names.
roles: Where to write the weights roles.
Returns
The number of missing Weights.

If layerNames != nullptr, each written pointer points to a string owned by the engine being refit, and becomes invalid when the engine is destroyed.

◆ getMissingWeights()

int32_t nvinfer1::IRefitter::getMissingWeights (int32_t size, char const **weightsNames)
inline, noexcept

Get names of missing weights.

For example, if some Weights have been set, but the engine was optimized in a way that combines weights, any unsupplied Weights in the combination are considered missing.

Parameters
size: The number of weights names that can be safely written to.
weightsNames: The names of the weights to be updated, or nullptr for unnamed weights.
Returns
The number of missing Weights.

If weightsNames != nullptr, each written pointer points to a string owned by the engine being refit, and becomes invalid when the engine is destroyed.

◆ getNamedWeights()

Weights nvinfer1::IRefitter::getNamedWeights ( char const *  weightsName) const
inline, noexcept

Get weights associated with the given name.

Parameters
weightsName: The name of the weights to be refitted.
Returns
Weights associated with the given name.

If the weights were never set, returns null weights and reports an error to the refitter errorRecorder.

Warning
The string weightsName must be null-terminated, and be at most 4096 bytes including the terminator.

◆ getTensorsWithDynamicRange()

int32_t nvinfer1::IRefitter::getTensorsWithDynamicRange (int32_t size, char const **tensorNames) const
inline, noexcept

Get names of all tensors that have refittable dynamic ranges.

Parameters
size: The number of items that can be safely written to a non-null tensorNames.
tensorNames: Where to write the tensor names.
Returns
The number of tensors that have refittable dynamic ranges.

If tensorNames!=nullptr, each written pointer points to a string owned by the engine being refit, and becomes invalid when the engine is destroyed.

◆ getWeightsLocation()

TensorLocation nvinfer1::IRefitter::getWeightsLocation ( char const *  weightsName) const
inline, noexcept

Get location for the weights associated with the given name.

Parameters
weightsName: The name of the weights to be refitted.
Returns
Location for the weights associated with the given name.

If the weights were never set, returns TensorLocation::kHOST and reports an error to the refitter errorRecorder.

Warning
The string weightsName must be null-terminated, and be at most 4096 bytes including the terminator.

◆ getWeightsPrototype()

Weights nvinfer1::IRefitter::getWeightsPrototype ( char const *  weightsName) const
inline, noexcept

Get the Weights prototype associated with the given name.

Parameters
weightsName: The name of the weights to be refitted.
Returns
Weights prototype associated with the given name.

The type and count of the weights prototype are the same as those of the weights used for engine building. The values property is nullptr for weights prototypes. The count of the weights prototype is -1 when the name of the weights is nullptr or does not correspond to any refittable weights.
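
Because a prototype carries the expected type and count, it can serve as a pre-check before handing new weights to the refitter. The following is a minimal sketch under stated assumptions: DataType, Weights and FakeRefitter are simplified stand-ins declared locally so the example compiles without TensorRT, and matchesPrototype is a hypothetical helper, not a library function.

```cpp
#include <cstdint>
#include <cstring>

// Simplified stand-ins for the TensorRT types (assumptions for illustration only).
enum class DataType : int32_t { kFLOAT = 0, kHALF = 1 };

struct Weights
{
    DataType type;
    void const* values;
    int64_t count;
};

// Hypothetical refitter that knows a single refittable weights name.
struct FakeRefitter
{
    Weights getWeightsPrototype(char const* weightsName) const noexcept
    {
        if (weightsName != nullptr && std::strcmp(weightsName, "conv1.weight") == 0)
        {
            return Weights{DataType::kFLOAT, nullptr, 4};
        }
        // Unknown or null name: count is -1, as described above.
        return Weights{DataType::kFLOAT, nullptr, -1};
    }
};

// Returns true if the candidate has the type and count the prototype expects.
template <typename Refitter>
bool matchesPrototype(Refitter const& refitter, char const* name, Weights const& candidate)
{
    Weights const proto = refitter.getWeightsPrototype(name);
    if (proto.count == -1)
    {
        return false; // name is nullptr or not refittable
    }
    return candidate.type == proto.type && candidate.count == proto.count;
}
```

A check like this mirrors the rejection reasons listed for setNamedWeights(), so a mismatch can be reported with a more specific message before the call is made.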

Warning
The string weightsName must be null-terminated, and be at most 4096 bytes including the terminator.

◆ getWeightsValidation()

bool nvinfer1::IRefitter::getWeightsValidation ( ) const
inline, noexcept

Get whether to validate weights values during refitting.

◆ refitCudaEngine()

bool nvinfer1::IRefitter::refitCudaEngine ( )
inline, noexcept

Refits associated engine.

Returns
True on success, or false if new weights validation fails or getMissingWeights() != 0 before the call. If false is returned, a subset of weights may have been refitted.

The behavior is undefined if the engine has pending enqueued work. Provided weights on CPU or GPU can be unset and released, or updated after refitCudaEngine returns.

IExecutionContexts associated with the engine remain valid for use afterwards. There is no need to set the same weights repeatedly for multiple refit calls as the weights memory can be updated directly instead.
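
Since the refit fails when getMissingWeights() != 0, one reasonable approach is to check for missing weights first and report them instead of attempting the refit. The sketch below illustrates that ordering. FakeRefitter is a hypothetical stand-in that mimics only the two signatures involved, and refitIfComplete is an illustrative helper name, not part of the API.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <vector>

// Hypothetical stand-in so the sketch compiles without TensorRT.
struct FakeRefitter
{
    std::vector<char const*> missing; // names still waiting for weights
    bool refitted{false};

    int32_t getMissingWeights(int32_t size, char const** weightsNames) noexcept
    {
        int32_t const total = static_cast<int32_t>(missing.size());
        if (weightsNames != nullptr)
        {
            for (int32_t i = 0; i < size && i < total; ++i)
            {
                weightsNames[i] = missing[static_cast<std::size_t>(i)];
            }
        }
        return total;
    }

    bool refitCudaEngine() noexcept
    {
        if (!missing.empty())
        {
            return false;
        }
        refitted = true;
        return true;
    }
};

// Refit only when nothing is missing; otherwise log the missing names.
template <typename Refitter>
bool refitIfComplete(Refitter& refitter)
{
    int32_t const n = refitter.getMissingWeights(0, nullptr);
    if (n > 0)
    {
        std::vector<char const*> names(static_cast<std::size_t>(n), nullptr);
        refitter.getMissingWeights(n, names.data());
        for (char const* name : names)
        {
            std::fprintf(stderr, "missing weights: %s\n", name != nullptr ? name : "(unnamed)");
        }
        return false;
    }
    return refitter.refitCudaEngine();
}
```

Checking first avoids the partially refitted state that a false return can leave behind when the cause is simply an unsupplied weight.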

◆ refitCudaEngineAsync()

bool nvinfer1::IRefitter::refitCudaEngineAsync ( cudaStream_t  stream)
inline, noexcept

Enqueue weights refitting of the associated engine on the given stream.

Parameters
stream: The stream to enqueue the weights updating task.
Returns
True on success, or false if new weights validation fails or getMissingWeights() != 0 before the call. If false is returned, a subset of weights may have been refitted.

The behavior is undefined if the engine has pending enqueued work on a different stream from the provided one. Provided weights on CPU can be unset and released, or updated after refitCudaEngineAsync returns. Freeing or updating of the provided weights on GPU can be enqueued on the same stream after refitCudaEngineAsync returns.

IExecutionContexts associated with the engine remain valid for use afterwards. There is no need to set the same weights repeatedly for multiple refit calls as the weights memory can be updated directly instead. The weights updating task should use the same stream as the one used for the refit call.

◆ setDynamicRange()

bool nvinfer1::IRefitter::setDynamicRange (char const *tensorName, float min, float max)
inline, noexcept

Update dynamic range for a tensor.

Parameters
tensorName: The name of an ITensor in the network.
min: The minimum of the dynamic range for the tensor.
max: The maximum of the dynamic range for the tensor.
Returns
True if successful; false otherwise.

Returns false if there is no Int8 engine tensor derived from a network tensor of that name. If successful, then getMissing may report that some weights need to be supplied.

Warning
The string tensorName must be null-terminated, and be at most 4096 bytes including the terminator.

◆ setErrorRecorder()

void nvinfer1::IRefitter::setErrorRecorder (IErrorRecorder *recorder)
inline, noexcept

Set the ErrorRecorder for this interface.

Assigns the ErrorRecorder to this interface. The ErrorRecorder will track all errors during execution. This function will call incRefCount of the registered ErrorRecorder at least once. Setting recorder to nullptr unregisters the recorder with the interface, resulting in a call to decRefCount if a recorder has been registered.

If an error recorder is not set, messages will be sent to the global log stream.

Parameters
recorder: The error recorder to register with this interface.
See also
getErrorRecorder()

◆ setMaxThreads()

bool nvinfer1::IRefitter::setMaxThreads ( int32_t  maxThreads)
inline, noexcept

Set the maximum number of threads.

Parameters
maxThreads: The maximum number of threads that can be used by the refitter.
Returns
True if successful, false otherwise.

The default value is 1 and includes the current thread. A value greater than 1 permits TensorRT to use multi-threaded algorithms. A value less than 1 triggers a kINVALID_ARGUMENT error.

◆ setNamedWeights() [1/2]

bool nvinfer1::IRefitter::setNamedWeights (char const *name, Weights weights)
inline, noexcept

Specify new weights of given name.

Parameters
name: The name of the weights to be refit.
weights: The new weights to associate with the name.
Returns
True on success, or false if new weights are rejected. Possible reasons for rejection are:

  • The name of the weights is nullptr or does not correspond to any refittable weights.
  • The count of the weights is inconsistent with the count returned from calling getWeightsPrototype() with the same name.
  • The type of the weights is inconsistent with the type returned from calling getWeightsPrototype() with the same name.

Modifying the weights before the method refitCudaEngine() or refitCudaEngineAsync() returns will result in undefined behavior.

Warning
The string name must be null-terminated, and be at most 4096 bytes including the terminator.

◆ setNamedWeights() [2/2]

bool nvinfer1::IRefitter::setNamedWeights (char const *name, Weights weights, TensorLocation location)
inline, noexcept

Specify new weights on a specified device of given name.

Parameters
name: The name of the weights to be refitted.
weights: The new weights on the specified device.
location: The location (host vs. device) of the new weights.
Returns
True on success, or false if new weights are rejected. Possible reasons for rejection are:
  • The name of the weights is nullptr or does not correspond to any refittable weights.
  • The count of the weights is inconsistent with the count returned from calling getWeightsPrototype() with the same name.
  • The type of the weights is inconsistent with the type returned from calling getWeightsPrototype() with the same name.

Some weights may be provided on CPU and others on GPU. Modifying the weights before the method refitCudaEngine() or refitCudaEngineAsync() completes will result in undefined behavior.

Warning
The string name must be null-terminated, and be at most 4096 bytes including the terminator.

◆ setWeights()

bool nvinfer1::IRefitter::setWeights (char const *layerName, WeightsRole role, Weights weights)
inline, noexcept

Specify new weights for a layer of given name.

Returns
True on success, or false if new weights are rejected. Possible reasons for rejection are:

  • There is no such layer by that name.
  • The layer does not have weights with the specified role.
  • The count of weights is inconsistent with the layer’s original specification.
  • The type of weights is inconsistent with the layer’s original specification.

Modifying the weights before the method refitCudaEngine() or refitCudaEngineAsync() returns will result in undefined behavior.

Warning
The string layerName must be null-terminated, and be at most 4096 bytes including the terminator.

◆ setWeightsValidation()

void nvinfer1::IRefitter::setWeightsValidation ( bool  weightsValidation)
inline, noexcept

Set whether to validate weights during refitting.

Parameters
weightsValidation: Indicates whether to validate weights during refitting.

When set to true, TensorRT validates weights during FP32 to FP16/BF16 weights conversions or when sparsifying weights in the refit call. If the provided weights are unsuitable for a weights transformation, TensorRT issues a warning and continues the transformation for minor issues (such as overflow during narrowing conversion), or issues an error and stops the refitting process for severe issues (such as sparsifying dense weights). By default the flag is true. Set the flag to false for faster refitting performance.
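
To make "overflow during narrowing conversion" concrete: the largest finite FP16 value is 65504, so FP32 weights with a larger magnitude cannot be represented exactly after conversion to FP16. The sketch below counts such values in a host buffer. It is an illustration of the kind of condition involved, not a description of how TensorRT performs its validation, and countFp16Overflow is a hypothetical helper name.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Largest finite value representable in IEEE 754 half precision.
constexpr float kFp16MaxFinite = 65504.0F;

// Counts finite FP32 values whose magnitude exceeds the FP16 finite range.
// The check is conservative: values only slightly above the limit may still
// round down to 65504 depending on the rounding mode.
std::size_t countFp16Overflow(std::vector<float> const& values)
{
    std::size_t overflowCount = 0;
    for (float v : values)
    {
        if (std::isfinite(v) && std::fabs(v) > kFp16MaxFinite)
        {
            ++overflowCount;
        }
    }
    return overflowCount;
}
```

Running a check like this offline is one way to decide whether disabling validation for faster refitting is an acceptable trade-off for a given set of weights.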

◆ unsetNamedWeights()

bool nvinfer1::IRefitter::unsetNamedWeights ( char const *  weightsName)
inline, noexcept

Unset weights associated with the given name.

Parameters
weightsName: The name of the weights to be refitted.
Returns
False if the weights were never set; true otherwise.

Unset weights before releasing them.

Warning
The string weightsName must be null-terminated, and be at most 4096 bytes including the terminator.

Member Data Documentation

◆ mImpl

apiv::VRefitter* nvinfer1::IRefitter::mImpl
protected

The documentation for this class was generated from the following file:

  • NvInferRuntime.h

  Copyright © 2024 NVIDIA Corporation