TensorRT

An engine for executing inference on a built network.

#include <NvInfer.h>

Public Member Functions
virtual int getNbBindings() const = 0
    Get the number of binding indices.

virtual int getBindingIndex(const char *name) const = 0
    Retrieve the binding index for a named tensor.

virtual const char *getBindingName(int bindingIndex) const = 0
    Retrieve the name corresponding to a binding index.

virtual bool bindingIsInput(int bindingIndex) const = 0
    Determine whether a binding is an input binding.

virtual Dims getBindingDimensions(int bindingIndex) const = 0
    Get the dimensions of a binding.

virtual DataType getBindingDataType(int bindingIndex) const = 0
    Determine the required data type for a buffer from its binding index.

virtual int getMaxBatchSize() const = 0
    Get the maximum batch size which can be used for inference.

virtual int getNbLayers() const = 0
    Get the number of layers in the network.

virtual std::size_t getWorkspaceSize() const = 0
    Get the amount of workspace the engine uses.

virtual IHostMemory *serialize() const = 0
    Serialize the network to a stream.

virtual IExecutionContext *createExecutionContext() = 0
    Create an execution context.

virtual void destroy() = 0
    Destroy this object.

virtual TensorLocation getLocation(int bindingIndex) const = 0
    Get the location of a binding.

virtual IExecutionContext *createExecutionContextWithoutDeviceMemory() = 0
    Create an execution context without any device memory allocated.

virtual size_t getDeviceMemorySize() const = 0
    Return the amount of device memory required by an execution context.
An engine for executing inference on a built network.
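The member functions below fit into a common lifecycle: deserialize a previously built engine, create an execution context, and run inference. The sketch assumes that lifecycle; the `ConsoleLogger` class and the `engine.plan` path are illustrative assumptions, not part of this interface, and error checking is omitted.

```cpp
#include <NvInfer.h>

#include <fstream>
#include <iostream>
#include <iterator>
#include <vector>

// Minimal ILogger implementation; the name "ConsoleLogger" is an assumption.
class ConsoleLogger : public nvinfer1::ILogger
{
    void log(Severity severity, const char* msg) override
    {
        if (severity <= Severity::kWARNING)
            std::cerr << msg << std::endl;
    }
};

int main()
{
    ConsoleLogger logger;

    // Load a previously serialized engine ("engine.plan" is an assumed file name).
    std::ifstream file("engine.plan", std::ios::binary);
    std::vector<char> blob((std::istreambuf_iterator<char>(file)),
                           std::istreambuf_iterator<char>());

    // Rebuild the engine and create an execution context for it.
    nvinfer1::IRuntime* runtime = nvinfer1::createInferRuntime(logger);
    nvinfer1::ICudaEngine* engine =
        runtime->deserializeCudaEngine(blob.data(), blob.size(), nullptr);
    nvinfer1::IExecutionContext* context = engine->createExecutionContext();

    // ... allocate one device buffer per binding and run:
    // context->execute(batchSize, buffers.data());

    // Objects from this API are released with destroy(), in reverse order.
    context->destroy();
    engine->destroy();
    runtime->destroy();
    return 0;
}
```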

bindingIsInput() [pure virtual]
Determine whether a binding is an input binding.
Parameters:
    bindingIndex - The binding index.

createExecutionContext() [pure virtual]
Create an execution context.

createExecutionContextWithoutDeviceMemory() [pure virtual]
Create an execution context without any device memory allocated.
The memory for execution of this device context must be supplied by the application.
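A sketch of how this pairs with getDeviceMemorySize(), assuming a valid `engine` pointer; error checking is omitted:

```cpp
// The application allocates the scratch memory the context would otherwise own.
void* scratch = nullptr;
cudaMalloc(&scratch, engine->getDeviceMemorySize());

nvinfer1::IExecutionContext* context =
    engine->createExecutionContextWithoutDeviceMemory();
context->setDeviceMemory(scratch);  // memory must stay valid while the context runs

// ... run inference, then release in reverse order:
context->destroy();
cudaFree(scratch);
```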

getBindingDataType() [pure virtual]
Determine the required data type for a buffer from its binding index.
Parameters:
    bindingIndex - The binding index.

getBindingDimensions() [pure virtual]
Get the dimensions of a binding.
Parameters:
    bindingIndex - The binding index.

getBindingIndex() [pure virtual]
Retrieve the binding index for a named tensor.
IExecutionContext::enqueue() and IExecutionContext::execute() require an array of buffers. Engine bindings map from tensor names to indices in this array. Binding indices are assigned at engine build time, and take values in the range [0 ... n-1], where n is the total number of inputs and outputs.
Parameters:
    name - The tensor name.
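For example, input and output device buffers might be placed into the bindings array like this; the tensor names "input" and "output" and the byte sizes are assumptions for illustration:

```cpp
std::vector<void*> makeBindings(const nvinfer1::ICudaEngine& engine,
                                size_t inputBytes, size_t outputBytes)
{
    // One device pointer per binding, ordered by binding index.
    std::vector<void*> buffers(engine.getNbBindings(), nullptr);

    const int inputIndex  = engine.getBindingIndex("input");   // assumed tensor name
    const int outputIndex = engine.getBindingIndex("output");  // assumed tensor name

    cudaMalloc(&buffers[inputIndex],  inputBytes);
    cudaMalloc(&buffers[outputIndex], outputBytes);
    return buffers;  // buffers.data() is what enqueue()/execute() expect
}
```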

getBindingName() [pure virtual]
Retrieve the name corresponding to a binding index.
This is the reverse mapping to that provided by getBindingIndex().
Parameters:
    bindingIndex - The binding index.

getDeviceMemorySize() [pure virtual]
Return the amount of device memory required by an execution context.

getLocation() [pure virtual]
Get the location of a binding.
This lets you know whether the binding should be a pointer to device or host memory.
Parameters:
    bindingIndex - The binding index.
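One way to honor this when allocating buffers, sketched under the assumption of a valid `engine` pointer:

```cpp
for (int i = 0; i < engine->getNbBindings(); ++i)
{
    const bool onDevice =
        engine->getLocation(i) == nvinfer1::TensorLocation::kDEVICE;
    // Use cudaMalloc for device bindings, ordinary host allocation otherwise.
}
```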

getMaxBatchSize() [pure virtual]
Get the maximum batch size which can be used for inference.

getNbBindings() [pure virtual]
Get the number of binding indices.

getNbLayers() [pure virtual]
Get the number of layers in the network.
The number of layers is not necessarily the number in the original network definition, as layers may be combined or eliminated as the engine is optimized. This value can be useful when building per-layer tables, such as when aggregating profiling data over a number of executions.

getWorkspaceSize() [pure virtual]
Get the amount of workspace the engine uses.
The workspace size will be no greater than the value provided to the builder when the engine was built, and will typically be smaller. Workspace will be allocated for each execution context.

serialize() [pure virtual]
Serialize the network to a stream.
The network may be deserialized with IRuntime::deserializeCudaEngine().
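For instance, the serialized plan can be written to disk and later reloaded with IRuntime::deserializeCudaEngine(); the file name is an assumption, and a valid `engine` pointer plus `<fstream>` are assumed in scope:

```cpp
nvinfer1::IHostMemory* plan = engine->serialize();

std::ofstream out("engine.plan", std::ios::binary);
out.write(static_cast<const char*>(plan->data()), plan->size());
out.close();

plan->destroy();  // the serialized blob is owned by the IHostMemory object
```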