TensorRT 5.1.3.4

Context for executing inference using an engine.

#include <NvInfer.h>
Public Member Functions

virtual bool execute(int batchSize, void **bindings) = 0
    Synchronously execute inference on a batch.
virtual bool enqueue(int batchSize, void **bindings, cudaStream_t stream, cudaEvent_t *inputConsumed) = 0
    Asynchronously execute inference on a batch.
virtual void setDebugSync(bool sync) = 0
    Set the debug sync flag.
virtual bool getDebugSync() const = 0
    Get the debug sync flag.
virtual void setProfiler(IProfiler *) = 0
    Set the profiler.
virtual IProfiler *getProfiler() const = 0
    Get the profiler.
virtual const ICudaEngine &getEngine() const = 0
    Get the associated engine.
virtual void destroy() = 0
    Destroy this object.
virtual void setName(const char *name) = 0
    Set the name of the execution context.
virtual const char *getName() const = 0
    Return the name of the execution context.
virtual void setDeviceMemory(void *memory) = 0
    Set the device memory for use by this execution context.
Detailed Description

Context for executing inference using an engine.

Multiple execution contexts may exist for one ICudaEngine instance, allowing the same engine to be used for the execution of multiple batches simultaneously.
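Because contexts are lightweight relative to the engine they share, a common pattern is one context per worker thread or per concurrent stream. A minimal sketch, assuming `engine` points to an already-deserialized ICudaEngine and with all error handling elided:

```cpp
#include <NvInfer.h>

// Sketch: two independent execution contexts sharing one engine.
// Each context holds its own per-invocation state (e.g. device scratch
// memory), so the two can process separate batches concurrently.
void runConcurrently(nvinfer1::ICudaEngine* engine)
{
    nvinfer1::IExecutionContext* ctxA = engine->createExecutionContext();
    nvinfer1::IExecutionContext* ctxB = engine->createExecutionContext();

    ctxA->setName("worker-0");  // names are for debugging/profiling only
    ctxB->setName("worker-1");

    // ... call enqueue() on ctxA and ctxB with distinct streams/bindings ...

    ctxB->destroy();
    ctxA->destroy();
}
```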
Member Function Documentation

enqueue()

virtual bool enqueue(int batchSize, void **bindings, cudaStream_t stream, cudaEvent_t *inputConsumed) = 0    [pure virtual]
Asynchronously execute inference on a batch.
This method requires an array of input and output buffers. The mapping from tensor names to indices can be queried using ICudaEngine::getBindingIndex().
Parameters
    batchSize      The batch size. This is at most the value supplied when the engine was built.
    bindings       An array of pointers to input and output buffers for the network.
    stream         A CUDA stream on which the inference kernels will be enqueued.
    inputConsumed  An optional event which will be signaled when the input buffers can be refilled with new data.
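A minimal sketch of asynchronous execution. The tensor names "input" and "output" and the device buffers `devInput`/`devOutput` are placeholders for this example, and error handling is elided:

```cpp
#include <NvInfer.h>
#include <cuda_runtime_api.h>

// Sketch: asynchronous inference on one batch. Assumes a network with a
// single input tensor named "input" and a single output named "output",
// and that devInput/devOutput are device buffers of sufficient size.
bool inferAsync(nvinfer1::IExecutionContext* ctx,
                void* devInput, void* devOutput, int batchSize)
{
    const nvinfer1::ICudaEngine& engine = ctx->getEngine();

    // Buffer pointers must be ordered by binding index, not by name.
    void* bindings[2] = {};
    bindings[engine.getBindingIndex("input")]  = devInput;
    bindings[engine.getBindingIndex("output")] = devOutput;

    cudaStream_t stream;
    cudaStreamCreate(&stream);
    cudaEvent_t inputConsumed;
    cudaEventCreate(&inputConsumed);

    // Kernels are enqueued on `stream`; the call returns immediately.
    bool ok = ctx->enqueue(batchSize, bindings, stream, &inputConsumed);

    // Once inputConsumed fires, devInput may be refilled with the next batch.
    cudaEventSynchronize(inputConsumed);

    // Wait for the outputs themselves.
    cudaStreamSynchronize(stream);

    cudaEventDestroy(inputConsumed);
    cudaStreamDestroy(stream);
    return ok;
}
```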
execute()

virtual bool execute(int batchSize, void **bindings) = 0    [pure virtual]
Synchronously execute inference on a batch.
This method requires an array of input and output buffers. The mapping from tensor names to indices can be queried using ICudaEngine::getBindingIndex().
Parameters
    batchSize  The batch size. This is at most the value supplied when the engine was built.
    bindings   An array of pointers to input and output buffers for the network.
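The synchronous path is simpler, since no stream or event is involved. A minimal sketch under the same placeholder assumptions (tensor names "input"/"output", pre-allocated device buffers):

```cpp
#include <NvInfer.h>

// Sketch: synchronous inference. Binding order comes from the engine;
// the tensor names here are placeholders for this example.
bool inferSync(nvinfer1::IExecutionContext* ctx,
               void* devInput, void* devOutput, int batchSize)
{
    const nvinfer1::ICudaEngine& engine = ctx->getEngine();

    void* bindings[2] = {};
    bindings[engine.getBindingIndex("input")]  = devInput;
    bindings[engine.getBindingIndex("output")] = devOutput;

    // Blocks until inference for the whole batch has completed.
    return ctx->execute(batchSize, bindings);
}
```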
getDebugSync()

virtual bool getDebugSync() const = 0    [pure virtual]
Get the debug sync flag.
getEngine()

virtual const ICudaEngine &getEngine() const = 0    [pure virtual]
Get the associated engine.
getName()

virtual const char *getName() const = 0    [pure virtual]
Return the name of the execution context.
getProfiler()

virtual IProfiler *getProfiler() const = 0    [pure virtual]
Get the profiler.
setDebugSync()

virtual void setDebugSync(bool sync) = 0    [pure virtual]
Set the debug sync flag.
If this flag is set to true, the engine will log the successful execution for each kernel during execute(). It has no effect when using enqueue().
setDeviceMemory()

virtual void setDeviceMemory(void *memory) = 0    [pure virtual]
Set the device memory for use by this execution context.

The memory must be aligned to the CUDA memory alignment property (as reported by cudaGetDeviceProperties()), and its size must be at least that returned by getDeviceMemorySize(). If using enqueue() to run the network, the memory is in use from the invocation of enqueue() until network execution is complete. If using execute(), it is in use until execute() returns. Releasing the memory, or using it for other purposes, during this time results in undefined behavior.
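One way to use this is together with ICudaEngine::createExecutionContextWithoutDeviceMemory() and ICudaEngine::getDeviceMemorySize(). A minimal sketch, with error handling and the eventual cudaFree() of the buffer elided:

```cpp
#include <NvInfer.h>
#include <cuda_runtime_api.h>

// Sketch: supply the context's scratch memory manually instead of letting
// createExecutionContext() allocate it. Useful when several contexts run
// serially and can share one scratch buffer.
nvinfer1::IExecutionContext* makeContext(nvinfer1::ICudaEngine* engine)
{
    nvinfer1::IExecutionContext* ctx =
        engine->createExecutionContextWithoutDeviceMemory();

    // The buffer must be at least getDeviceMemorySize() bytes; cudaMalloc
    // returns memory satisfying the device's alignment requirements.
    void* scratch = nullptr;
    cudaMalloc(&scratch, engine->getDeviceMemorySize());
    ctx->setDeviceMemory(scratch);

    // `scratch` must stay valid, and unused elsewhere, for the duration of
    // every execute()/enqueue() call on this context.
    return ctx;
}
```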
setName()

virtual void setName(const char *name) = 0    [pure virtual]

Set the name of the execution context.
setProfiler()

virtual void setProfiler(IProfiler *) = 0    [pure virtual]
Set the profiler.