TensorRT 5.1.3.4
Builds an engine from a network definition.
#include <NvInfer.h>
Public Member Functions | |
virtual nvinfer1::INetworkDefinition * | createNetwork ()=0 |
Create a network definition object. | |
virtual void | setMaxBatchSize (int batchSize)=0 |
Set the maximum batch size. | |
virtual int | getMaxBatchSize () const =0 |
Get the maximum batch size. | |
virtual void | setMaxWorkspaceSize (std::size_t workspaceSize)=0 |
Set the maximum workspace size. | |
virtual std::size_t | getMaxWorkspaceSize () const =0 |
Get the maximum workspace size. | |
virtual void | setHalf2Mode (bool mode)=0 |
Set whether half2 mode is used. | |
virtual bool | getHalf2Mode () const =0 |
Query whether half2 mode is used. | |
virtual void | setDebugSync (bool sync)=0 |
Set whether the builder should use debug synchronization. | |
virtual bool | getDebugSync () const =0 |
Query whether the builder will use debug synchronization. | |
virtual void | setMinFindIterations (int minFind)=0 |
Set the number of minimization iterations used when timing layers. | |
virtual int | getMinFindIterations () const =0 |
Query the number of minimization iterations. | |
virtual void | setAverageFindIterations (int avgFind)=0 |
Set the number of averaging iterations used when timing layers. | |
virtual int | getAverageFindIterations () const =0 |
Query the number of averaging iterations. | |
virtual nvinfer1::ICudaEngine * | buildCudaEngine (nvinfer1::INetworkDefinition &network)=0 |
Build a CUDA engine from a network definition. | |
virtual bool | platformHasFastFp16 () const =0 |
Determine whether the platform has fast native fp16. | |
virtual bool | platformHasFastInt8 () const =0 |
Determine whether the platform has fast native int8. | |
virtual void | destroy ()=0 |
Destroy this object. | |
virtual void | setInt8Mode (bool mode)=0 |
Set whether Int8 mode is used. | |
virtual bool | getInt8Mode () const =0 |
Query whether Int8 mode is used. | |
virtual void | setInt8Calibrator (IInt8Calibrator *calibrator)=0 |
Set Int8 Calibration interface. | |
virtual void | setDeviceType (ILayer *layer, DeviceType deviceType)=0 |
Set the device that this layer must execute on. | |
virtual DeviceType | getDeviceType (const ILayer *layer) const =0 |
Get the device that this layer executes on. | |
virtual bool | isDeviceTypeSet (const ILayer *layer) const =0 |
Query whether the DeviceType has been explicitly set for this layer. | |
virtual void | resetDeviceType (ILayer *layer)=0 |
Reset the DeviceType for this layer. | |
virtual bool | canRunOnDLA (const ILayer *layer) const =0 |
Check whether a layer can run on DLA. | |
virtual void | setDefaultDeviceType (DeviceType deviceType)=0 |
Set the default DeviceType to be used by the builder. All layers that can run on this device will run on it, unless setDeviceType is used to override the default DeviceType for a layer. | |
virtual DeviceType | getDefaultDeviceType () const =0 |
Get the default DeviceType which was set by setDefaultDeviceType. | |
virtual int | getMaxDLABatchSize () const =0 |
Get the maximum batch size DLA can support. For any tensor, the total volume of the index dimensions (dimensions other than CHW) combined with the requested batch size should not exceed the value returned by this function. | |
virtual void | allowGPUFallback (bool setFallBackMode)=0 |
Set the builder to use the GPU if a layer that was supposed to run on DLA cannot run on DLA. | |
virtual int | getNbDLACores () const =0 |
Return the number of DLA hardware cores accessible. | |
virtual void | setDLACore (int dlaCore)=0 |
Set the DLA core that the engine must execute on. | |
virtual int | getDLACore () const =0 |
Get the DLA core that the engine executes on. | |
virtual void | reset (nvinfer1::INetworkDefinition &network)=0 |
Reset the builder state. | |
virtual void | setGpuAllocator (IGpuAllocator *allocator)=0 |
Set the GPU allocator. | |
virtual void | setFp16Mode (bool mode)=0 |
Set whether or not 16-bit kernels are permitted. | |
virtual bool | getFp16Mode () const =0 |
Query whether 16-bit kernels are permitted. | |
virtual void | setStrictTypeConstraints (bool mode)=0 |
Set whether or not type constraints are strict. | |
virtual bool | getStrictTypeConstraints () const =0 |
Query whether or not type constraints are strict. | |
virtual void | setRefittable (bool canRefit)=0 |
Set whether engines will be refittable. | |
virtual bool | getRefittable () const =0 |
Query whether or not engines will be refittable. | |
virtual void | setEngineCapability (EngineCapability capability)=0 |
Configure the builder to target the specified EngineCapability flow. | |
virtual EngineCapability | getEngineCapability () const =0 |
Query the EngineCapability flow configured for the builder. | |
Builds an engine from a network definition.
|
pure virtual |
Set the builder to use the GPU if a layer that was supposed to run on DLA cannot run on DLA.
setFallBackMode | Allows fallback if true; otherwise disables the fallback option. |
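The fallback rule can be sketched as a small decision function. This is only an illustrative model, not TensorRT code: `Placement` and `placeDlaLayer` are hypothetical names, and treating the no-fallback case as "unsupported" is an assumption of the sketch, since the text above does not say what the builder does in that case.

```cpp
#include <cassert>

// Hypothetical result type for this sketch; not a TensorRT type.
enum class Placement { kGPU, kDLA, kUnsupported };

// Decide where a layer that was requested to run on DLA ends up.
// canRunOnDLA:        what IBuilder::canRunOnDLA would report for the layer.
// gpuFallbackAllowed: the flag passed to IBuilder::allowGPUFallback.
Placement placeDlaLayer(bool canRunOnDLA, bool gpuFallbackAllowed)
{
    if (canRunOnDLA)
    {
        return Placement::kDLA;
    }
    // Assumption: without fallback the layer has no valid placement.
    return gpuFallbackAllowed ? Placement::kGPU : Placement::kUnsupported;
}
```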
|
pure virtual |
Build a CUDA engine from a network definition.
|
pure virtual |
Checks if a layer can run on DLA.
|
pure virtual |
Create a network definition object.
|
pure virtual |
Query the number of averaging iterations.
|
pure virtual |
Query whether the builder will use debug synchronization.
|
pure virtual |
Get the device that this layer executes on.
|
pure virtual |
Get the DLA core that the engine executes on.
|
pure virtual |
Query EngineCapability flow configured for the builder.
|
pure virtual |
Query whether 16-bit kernels are permitted.
|
pure virtual |
Query whether half2 mode is used.
|
pure virtual |
Query whether Int8 mode is used.
|
pure virtual |
Get the maximum batch size.
|
pure virtual |
Get the maximum workspace size.
|
pure virtual |
Query the number of minimization iterations.
|
pure virtual |
Query whether or not engines will be refittable.
|
pure virtual |
Query whether or not type constraints are strict.
|
pure virtual |
Query whether the DeviceType has been explicitly set for this layer.
|
pure virtual |
Reset the DeviceType for this layer.
|
pure virtual |
Set the number of averaging iterations used when timing layers.
When timing layers, the builder minimizes over a set of average times for layer execution. This parameter controls the number of iterations used in averaging.
|
pure virtual |
Set whether the builder should use debug synchronization.
If this flag is true, the builder will synchronize after timing each layer, and report the layer name. It can be useful when diagnosing issues at build time.
|
pure virtual |
Set the default DeviceType to be used by the builder. All layers that can run on this device will run on it, unless setDeviceType is used to override the default DeviceType for a layer.
|
pure virtual |
Set the device that this layer must execute on.
deviceType | The DeviceType that this layer must execute on. If the DeviceType is not set or is reset, TensorRT will use the default DeviceType set in the builder. |
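The interaction between the per-layer setting and the builder default can be modelled in a few lines. The sketch below is a stand-in, not the TensorRT implementation: `Device` and `PlacementModel` are hypothetical, layers are identified by name rather than by `ILayer*`, and the initial default of `kGPU` is an assumption.

```cpp
#include <cassert>
#include <map>
#include <string>

// Stand-in for nvinfer1::DeviceType, for illustration only.
enum class Device { kGPU, kDLA };

// Minimal model of the override/default lookup described above.
class PlacementModel
{
public:
    void setDefaultDeviceType(Device d) { mDefault = d; }
    void setDeviceType(const std::string& layer, Device d) { mExplicit[layer] = d; }
    void resetDeviceType(const std::string& layer) { mExplicit.erase(layer); }

    bool isDeviceTypeSet(const std::string& layer) const
    {
        return mExplicit.count(layer) != 0;
    }

    // An explicit per-layer setting wins; otherwise the default applies.
    Device getDeviceType(const std::string& layer) const
    {
        const auto it = mExplicit.find(layer);
        return it != mExplicit.end() ? it->second : mDefault;
    }

private:
    Device mDefault{Device::kGPU}; // assumed initial default
    std::map<std::string, Device> mExplicit;
};
```

In this model, resetting a layer simply removes its override, so later changes to the default apply to it again.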
|
pure virtual |
Set the DLA core that the engine must execute on.
dlaCore | The DLA core to execute the engine on (0 to N-1, where N is the maximum number of DLA cores present on the device). Default value is 0. DLA Core is not a property of the engine that is preserved by serialization: when the engine is deserialized it will be associated with the DLA core which is configured for the runtime. |
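Because the valid range is 0 to N-1, a caller may want to check the index before setting it. The helper below is a sketch: `trySetDlaCore` and `FakeDlaBuilder` are hypothetical names, and it assumes that getNbDLACores() reports the N referred to above. The template only relies on the member names getNbDLACores and setDLACore, so it compiles without NvInfer.h.

```cpp
#include <cassert>

// Set the DLA core only if the index lies in [0, getNbDLACores()).
// Returns false and leaves the builder untouched otherwise.
template <typename Builder>
bool trySetDlaCore(Builder& builder, int dlaCore)
{
    const int nbCores = builder.getNbDLACores();
    if (dlaCore < 0 || dlaCore >= nbCores)
    {
        return false;
    }
    builder.setDLACore(dlaCore);
    return true;
}

// Stand-in with the same member names as IBuilder, so the helper can be
// exercised without TensorRT installed. Not part of the library.
struct FakeDlaBuilder
{
    int nbCores = 2;
    int core = 0; // default core is 0, as documented above

    int getNbDLACores() const { return nbCores; }
    void setDLACore(int c) { core = c; }
    int getDLACore() const { return core; }
};
```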
|
pure virtual |
Set whether or not 16-bit kernels are permitted.
When this mode is enabled, fp16 kernels are also tried during the engine build.
mode | Whether 16-bit kernels are permitted. |
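One common pattern is to permit 16-bit kernels only when platformHasFastFp16() reports fast native fp16. The sketch below shows that pattern; `enableFp16IfFast` and `FakeFp16Builder` are hypothetical names, and the policy itself is a suggestion rather than something this page prescribes.

```cpp
#include <cassert>

// Permit fp16 kernels only on platforms that report fast native fp16.
// Returns the mode that was set.
template <typename Builder>
bool enableFp16IfFast(Builder& builder)
{
    const bool fast = builder.platformHasFastFp16();
    builder.setFp16Mode(fast);
    return fast;
}

// Stand-in with the same member names as IBuilder; not part of TensorRT.
struct FakeFp16Builder
{
    explicit FakeFp16Builder(bool fast) : fastFp16(fast), fp16Mode(false) {}

    bool platformHasFastFp16() const { return fastFp16; }
    void setFp16Mode(bool mode) { fp16Mode = mode; }
    bool getFp16Mode() const { return fp16Mode; }

    bool fastFp16;
    bool fp16Mode;
};
```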
|
pure virtual |
Set the GPU allocator.
allocator | The GPU allocator to be used by the builder. All GPU memory acquired will use this allocator. If NULL is passed, the default allocator will be used. |
Default: uses cudaMalloc/cudaFree.
|
pure virtual |
Set whether half2 mode is used.
half2 mode is a paired-image mode that is significantly faster for batch sizes greater than one on platforms with fp16 support.
mode | Whether half2 mode is used. |
|
pure virtual |
Set whether Int8 mode is used.
Used for INT8 mode compression.
|
pure virtual |
Set the maximum batch size.
batchSize | The maximum batch size which can be used at execution time, and also the batch size for which the engine will be optimized. |
|
pure virtual |
Set the maximum workspace size.
workspaceSize | The maximum GPU temporary memory which the engine can use at execution time. |
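The batch size and workspace limits are usually set together before building. The helper below is a sketch under stated assumptions: `configureLimits` and `FakeLimitsBuilder` are hypothetical names, and the workspace size is assumed to be expressed in bytes, so the helper converts from MiB.

```cpp
#include <cassert>
#include <cstddef>

// Apply both execution-time limits in one place.
// workspaceMiB is converted to bytes (assumed unit of setMaxWorkspaceSize).
template <typename Builder>
void configureLimits(Builder& builder, int maxBatchSize, std::size_t workspaceMiB)
{
    assert(maxBatchSize > 0);
    builder.setMaxBatchSize(maxBatchSize);
    builder.setMaxWorkspaceSize(workspaceMiB << 20);
}

// Stand-in with the same member names as IBuilder; not part of TensorRT.
struct FakeLimitsBuilder
{
    int batch = 0;
    std::size_t workspace = 0;

    void setMaxBatchSize(int b) { batch = b; }
    int getMaxBatchSize() const { return batch; }
    void setMaxWorkspaceSize(std::size_t w) { workspace = w; }
    std::size_t getMaxWorkspaceSize() const { return workspace; }
};
```

Since the engine is optimized for the maximum batch size, choosing a value close to the batch size actually used at execution time is likely to give better results than a generous upper bound.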
|
pure virtual |
Set the number of minimization iterations used when timing layers.
When timing layers, the builder minimizes over a set of average times for layer execution. This parameter controls the number of iterations used in minimization.
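The "minimum over a set of averages" scheme can be illustrated independently of TensorRT. The sketch below is not the builder's code: `minOfAverages` is a hypothetical helper, and `sample` stands for whatever produces one timing measurement of a layer. Each of the `minFind` rounds averages `avgFind` samples, and the smallest average is returned.

```cpp
#include <cassert>
#include <functional>
#include <limits>

// Return the minimum, over minFind rounds, of the average of avgFind samples.
// sample: returns one timing measurement (for example, milliseconds).
double minOfAverages(const std::function<double()>& sample, int minFind, int avgFind)
{
    assert(minFind > 0 && avgFind > 0);
    double best = std::numeric_limits<double>::infinity();
    for (int m = 0; m < minFind; ++m)
    {
        double sum = 0.0;
        for (int a = 0; a < avgFind; ++a)
        {
            sum += sample();
        }
        const double average = sum / avgFind;
        if (average < best)
        {
            best = average;
        }
    }
    return best;
}
```

Under this model, raising the averaging count smooths noise within a round, while raising the minimization count gives more chances to observe an undisturbed round. Both increase the total number of measurements, which is `minFind * avgFind`.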
|
pure virtual |
Set whether engines will be refittable.
|
pure virtual |
Set whether or not type constraints are strict.
When strict type constraints are in use, TensorRT will always choose a layer implementation that conforms to the type constraints specified, if one exists. If this flag is not set, a higher-precision implementation may be chosen if it results in higher performance.
If no conformant layer exists, TensorRT will choose a non-conformant layer if available regardless of the setting of this flag.
See the developer guide for the definition of strictness.
mode | Whether type constraints are strict. |
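The selection rule described above can be modelled as follows. This is a simplified sketch, not TensorRT's tactic selection: `Impl` and `chooseImplementation` are hypothetical, every non-conformant candidate is assumed to be an eligible (for example, higher-precision) alternative, and "higher performance" is reduced to a single measured time.

```cpp
#include <cassert>
#include <vector>

// Hypothetical candidate implementation of one layer.
struct Impl
{
    bool conformant; // satisfies the requested type constraints
    double timeMs;   // measured execution time
};

// Return the index of the chosen candidate, or -1 if there are none.
int chooseImplementation(const std::vector<Impl>& impls, bool strict)
{
    int fastest = -1;           // fastest candidate overall
    int fastestConformant = -1; // fastest candidate meeting the constraints
    for (int i = 0; i < static_cast<int>(impls.size()); ++i)
    {
        if (fastest < 0 || impls[i].timeMs < impls[fastest].timeMs)
        {
            fastest = i;
        }
        if (impls[i].conformant
            && (fastestConformant < 0 || impls[i].timeMs < impls[fastestConformant].timeMs))
        {
            fastestConformant = i;
        }
    }
    // Strict mode: a conformant candidate always wins if one exists.
    if (strict && fastestConformant >= 0)
    {
        return fastestConformant;
    }
    // Otherwise, or when nothing conforms, take the fastest candidate.
    return fastest;
}
```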