NVIDIA DeepStream SDK API Reference

6.4 Release
nvdsinferserver::TritonGrpcBackend Class Reference

Detailed Description

Triton gRPC mode backend processing class.

Definition at line 34 of file infer_grpc_backend.h.

Inheritance diagram for nvdsinferserver::TritonGrpcBackend:
Collaboration diagram for nvdsinferserver::TritonGrpcBackend:

Public Types

enum  {
  kLTpLayerDesc,
  kTpLayerNum
}
 
enum  {
  kInShapeName,
  kInShapeDims
}
 
using InferenceDone = std::function< void(NvDsInferStatus, SharedBatchArray)>
 Function wrapper for post inference processing. More...
 
using InputsConsumed = std::function< void(SharedBatchArray)>
 Function wrapper called after the input buffer is consumed. More...
 
using LayersTuple = std::tuple< const LayerDescription *, int >
 Tuple containing pointer to layer descriptions and the number of layers. More...
 
using InputShapeTuple = std::tuple< std::string, InferBatchDims >
 Tuple of layer name and dimensions including batch size. More...
 
using InputShapes = std::vector< InputShapeTuple >
 

Public Member Functions

 TritonGrpcBackend (std::string model, int64_t version)
 
 ~TritonGrpcBackend () override
 
void setOutputs (const std::set< std::string > &names)
 
void setUrl (const std::string &url)
 
void setEnableCudaBufferSharing (const bool enableSharing)
 
NvDsInferStatus initialize () override
 
void addClassifyParams (const TritonClassParams &c)
 Add Triton Classification parameters to the list. More...
 
NvDsInferStatus specifyInputDims (const InputShapes &shapes) override
 Specify the input layers for the backend. More...
 
void setTensorMaxBytes (const std::string &name, size_t maxBytes)
 Set the maximum size for the tensor; the maximum of the existing size and the new input size is used. More...
 
InferTensorOrder getInputTensorOrder () const final
 Returns the input tensor order. More...
 
void setUniqueId (uint32_t id)
 Set the unique ID for the object instance. More...
 
int uniqueId () const
 Get the unique ID of the object instance. More...
 
void setFirstDimBatch (bool flag)
 Set the flag indicating that it is a batch input. More...
 
bool isFirstDimBatch () const final
 Returns boolean indicating if batched input is expected. More...
 
uint32_t getLayerSize () const final
 Returns the total number of layers (input + output) for the model. More...
 
uint32_t getInputLayerSize () const final
 Returns the number of input layers for the model. More...
 
const LayerDescription * getLayerInfo (const std::string &bindingName) const final
 Retrieve the layer information from the layer name. More...
 
LayersTuple getInputLayers () const final
 Get the LayersTuple for input layers. More...
 
LayersTuple getOutputLayers () const final
 Get the LayersTuple for output layers. More...
 
bool checkInputDims (const InputShapes &shapes) const
 Check that the list of input shapes has fixed dimensions and that the corresponding layers are marked as input layers. More...
 
const LayerDescriptionList & allLayers () const
 Returns the list of all descriptions of all layers, input and output. More...
 
void setKeepInputs (bool enable)
 Set the flag indicating whether to keep input buffers. More...
 
int32_t maxBatchSize () const final
 Returns the maximum batch size set for the backend. More...
 
bool isNonBatching () const
 Checks if the batch size indicates batched processing or not. More...
 

Protected Types

enum  {
  kName,
  kGpuId,
  kMemType
}
 Tuple keys as <tensor-name, gpu-id, memType> More...
 
using AsyncDone = std::function< void(NvDsInferStatus, SharedBatchArray)>
 Asynchronous inference done function: AsyncDone(Status, outputs). More...
 
using PoolKey = std::tuple< std::string, int64_t, InferMemType >
 Tuple holding tensor name, GPU ID, memory type. More...
 
using PoolValue = SharedBufPool< UniqSysMem >
 The buffer pool for the specified tensor, GPU and memory type combination. More...
 
using ReorderItemPtr = std::shared_ptr< ReorderItem >
 
using LayerIdxMap = std::unordered_map< std::string, int >
 Map of layer name to layer index. More...
 

Protected Member Functions

NvDsInferStatus enqueue (SharedBatchArray inputs, SharedCuStream stream, InputsConsumed bufConsumed, InferenceDone inferenceDone) override
 
void requestTritonOutputNames (std::set< std::string > &names) override
 
NvDsInferStatus ensureServerReady () override
 
NvDsInferStatus ensureModelReady () override
 
NvDsInferStatus setupLayersInfo () override
 
NvDsInferStatus Run (SharedBatchArray inputs, InputsConsumed bufConsumed, AsyncDone asyncDone) override
 
NvDsInferStatus setupReorderThread ()
 Create a loop thread that calls inferenceDoneReorderLoop on the queued items. More...
 
void setAllocator (UniqTritonAllocator allocator)
 Set the output tensor allocator. More...
 
TrtServerPtr & server ()
 Get the Triton server handle. More...
 
NvDsInferStatus fixateDims (const SharedBatchArray &bufs)
 Extend the dimensions to include batch size for the buffers in input array. More...
 
SharedSysMem allocateResponseBuf (const std::string &tensor, size_t bytes, InferMemType memType, int64_t devId)
 Acquire a buffer from the output buffer pool associated with the device ID and memory type. More...
 
void releaseResponseBuf (const std::string &tensor, SharedSysMem mem)
 Release the output tensor buffer. More...
 
NvDsInferStatus ensureInputs (SharedBatchArray &inputs)
 Ensure that the array of input buffers is expected by the model and reshape the input buffers if required. More...
 
PoolValue findResponsePool (PoolKey &key)
 Find the buffer pool for the given key. More...
 
PoolValue createResponsePool (PoolKey &key, size_t bytes)
 Create a new buffer pool for the key. More...
 
void serverInferCompleted (std::shared_ptr< TrtServerRequest > request, std::unique_ptr< TrtServerResponse > uniqResponse, InputsConsumed inputsConsumed, AsyncDone asyncDone)
 Call the inputs consumed function and parse the inference response to form the array of output batch buffers and call asyncDone on it. More...
 
bool inferenceDoneReorderLoop (ReorderItemPtr item)
 Add input buffers to the output buffer list if required. More...
 
bool debatchingOutput (SharedBatchArray &outputs, SharedBatchArray &inputs)
 Separate the batch dimension from the output buffer descriptors. More...
 
void resetLayers (LayerDescriptionList layers, int inputSize)
 Set the layer description list of the backend. More...
 
LayerDescription * mutableLayerInfo (const std::string &bindingName)
 Get the mutable layer description structure for the layer name. More...
 
void setInputTensorOrder (InferTensorOrder order)
 Set the tensor order for the input layers. More...
 
bool needKeepInputs () const
 Check if the keep input flag is set. More...
 
void setMaxBatchSize (uint32_t size)
 Set the maximum batch size to be used for the backend. More...
 

Member Typedef Documentation

◆ AsyncDone

using nvdsinferserver::TrtISBackend::AsyncDone = std::function<void(NvDsInferStatus, SharedBatchArray)>
protected inherited

Asynchronous inference done function: AsyncDone(Status, outputs).

Definition at line 169 of file infer_trtis_backend.h.

◆ InferenceDone

using nvdsinferserver::IBackend::InferenceDone = std::function<void(NvDsInferStatus, SharedBatchArray)>
inherited

Function wrapper for post inference processing.

Definition at line 66 of file infer_ibackend.h.

◆ InputsConsumed

using nvdsinferserver::IBackend::InputsConsumed = std::function<void(SharedBatchArray)>
inherited

Function wrapper called after the input buffer is consumed.

Definition at line 70 of file infer_ibackend.h.

◆ InputShapes

using nvdsinferserver::IBackend::InputShapes = std::vector<InputShapeTuple>
inherited

Definition at line 84 of file infer_ibackend.h.

◆ InputShapeTuple

using nvdsinferserver::IBackend::InputShapeTuple = std::tuple<std::string, InferBatchDims>
inherited

Tuple of layer name and dimensions including batch size.

Definition at line 83 of file infer_ibackend.h.

◆ LayerIdxMap

using nvdsinferserver::BaseBackend::LayerIdxMap = std::unordered_map<std::string, int>
protected inherited

Map of layer name to layer index.

Definition at line 136 of file infer_base_backend.h.

◆ LayersTuple

using nvdsinferserver::IBackend::LayersTuple = std::tuple<const LayerDescription*, int>
inherited

Tuple containing pointer to layer descriptions and the number of layers.

Definition at line 77 of file infer_ibackend.h.

◆ PoolKey

using nvdsinferserver::TrtISBackend::PoolKey = std::tuple<std::string, int64_t, InferMemType>
protected inherited

Tuple holding tensor name, GPU ID, memory type.

Definition at line 224 of file infer_trtis_backend.h.

◆ PoolValue

using nvdsinferserver::TrtISBackend::PoolValue = SharedBufPool<UniqSysMem>
protected inherited

The buffer pool for the specified tensor, GPU and memory type combination.

Definition at line 229 of file infer_trtis_backend.h.

◆ ReorderItemPtr

using nvdsinferserver::TrtISBackend::ReorderItemPtr = std::shared_ptr<ReorderItem>
protected inherited

Definition at line 293 of file infer_trtis_backend.h.

Member Enumeration Documentation

◆ anonymous enum

anonymous enum
protected inherited

Tuple keys as <tensor-name, gpu-id, memType>

Enumerator
kName 
kGpuId 
kMemType 

Definition at line 220 of file infer_trtis_backend.h.

◆ anonymous enum

anonymous enum
inherited
Enumerator
kLTpLayerDesc 
kTpLayerNum 

Definition at line 72 of file infer_ibackend.h.

◆ anonymous enum

anonymous enum
inherited
Enumerator
kInShapeName 
kInShapeDims 

Definition at line 79 of file infer_ibackend.h.

Constructor & Destructor Documentation

◆ TritonGrpcBackend()

nvdsinferserver::TritonGrpcBackend::TritonGrpcBackend ( std::string  model,
int64_t  version 
)

◆ ~TritonGrpcBackend()

nvdsinferserver::TritonGrpcBackend::~TritonGrpcBackend ( )
override

Member Function Documentation

◆ addClassifyParams()

void nvdsinferserver::TrtISBackend::addClassifyParams ( const TritonClassParams &  c)
inline inherited

Add Triton Classification parameters to the list.

Definition at line 58 of file infer_trtis_backend.h.

◆ allLayers()

const LayerDescriptionList& nvdsinferserver::BaseBackend::allLayers ( ) const
inline inherited

Returns the list of all descriptions of all layers, input and output.

Definition at line 113 of file infer_base_backend.h.

◆ allocateResponseBuf()

SharedSysMem nvdsinferserver::TrtISBackend::allocateResponseBuf ( const std::string &  tensor,
size_t  bytes,
InferMemType  memType,
int64_t  devId 
)
protected inherited

Acquire a buffer from the output buffer pool associated with the device ID and memory type.

Create the pool if it doesn't exist.

Parameters
[in] tensor  Name of the output tensor.
[in] bytes  Buffer size.
[in] memType  Requested memory type.
[in] devId  Device ID for the allocation.
Returns
Pointer to the allocated buffer.

◆ checkInputDims()

bool nvdsinferserver::BaseBackend::checkInputDims ( const InputShapes &  shapes) const
inherited

Check that the list of input shapes has fixed dimensions and that the corresponding layers are marked as input layers.

◆ createResponsePool()

PoolValue nvdsinferserver::TrtISBackend::createResponsePool ( PoolKey &  key,
size_t  bytes 
)
protected inherited

Create a new buffer pool for the key.

Parameters
[in] key  The pool key combination.
[in] bytes  Size of the requested buffer.
Returns
The newly created buffer pool.

◆ debatchingOutput()

bool nvdsinferserver::TrtISBackend::debatchingOutput ( SharedBatchArray &  outputs,
SharedBatchArray &  inputs 
)
protected inherited

Separate the batch dimension from the output buffer descriptors.

Parameters
[in] outputs  Array of output batch buffers.
[in] inputs  Array of input batch buffers.
Returns
Boolean indicating success or failure.

◆ enqueue()

NvDsInferStatus nvdsinferserver::TritonGrpcBackend::enqueue ( SharedBatchArray  inputs,
SharedCuStream  stream,
InputsConsumed  bufConsumed,
InferenceDone  inferenceDone 
)
override protected virtual

◆ ensureInputs()

NvDsInferStatus nvdsinferserver::TrtISBackend::ensureInputs ( SharedBatchArray &  inputs)
protected inherited

Ensure that the array of input buffers is expected by the model and reshape the input buffers if required.

Parameters
inputs  Array of input batch buffers.
Returns
NVDSINFER_SUCCESS or NVDSINFER_TRITON_ERROR.

◆ ensureModelReady()

NvDsInferStatus nvdsinferserver::TritonGrpcBackend::ensureModelReady ( )
override protected virtual

Reimplemented from nvdsinferserver::TrtISBackend.

◆ ensureServerReady()

NvDsInferStatus nvdsinferserver::TritonGrpcBackend::ensureServerReady ( )
override protected virtual

Reimplemented from nvdsinferserver::TrtISBackend.

◆ findResponsePool()

PoolValue nvdsinferserver::TrtISBackend::findResponsePool ( PoolKey &  key)
protected inherited

Find the buffer pool for the given key.

◆ fixateDims()

NvDsInferStatus nvdsinferserver::TrtISBackend::fixateDims ( const SharedBatchArray &  bufs)
protected inherited

Extend the dimensions to include batch size for the buffers in input array.

Do nothing if batch input is not required.

◆ getClassifyParams()

std::vector<TritonClassParams> nvdsinferserver::TrtISBackend::getClassifyParams ( )
inline inherited

Definition at line 71 of file infer_trtis_backend.h.

◆ getInputLayers()

LayersTuple nvdsinferserver::BaseBackend::getInputLayers ( ) const
final virtual inherited

Get the LayersTuple for input layers.

Implements nvdsinferserver::IBackend.

◆ getInputLayerSize()

uint32_t nvdsinferserver::BaseBackend::getInputLayerSize ( ) const
inline final virtual inherited

Returns the number of input layers for the model.

Implements nvdsinferserver::IBackend.

Definition at line 83 of file infer_base_backend.h.

◆ getInputTensorOrder()

InferTensorOrder nvdsinferserver::BaseBackend::getInputTensorOrder ( ) const
inline final virtual inherited

Returns the input tensor order.

Implements nvdsinferserver::IBackend.

Definition at line 49 of file infer_base_backend.h.

◆ getLayerInfo()

const LayerDescription* nvdsinferserver::BaseBackend::getLayerInfo ( const std::string &  bindingName) const
final virtual inherited

Retrieve the layer information from the layer name.

Implements nvdsinferserver::IBackend.

Referenced by nvdsinferserver::BaseBackend::mutableLayerInfo().

◆ getLayerSize()

uint32_t nvdsinferserver::BaseBackend::getLayerSize ( ) const
inline final virtual inherited

Returns the total number of layers (input + output) for the model.

Implements nvdsinferserver::IBackend.

Definition at line 75 of file infer_base_backend.h.

◆ getOutputLayers()

LayersTuple nvdsinferserver::BaseBackend::getOutputLayers ( ) const
final virtual inherited

Get the LayersTuple for output layers.

Implements nvdsinferserver::IBackend.

◆ inferenceDoneReorderLoop()

bool nvdsinferserver::TrtISBackend::inferenceDoneReorderLoop ( ReorderItemPtr  item)
protected inherited

Add input buffers to the output buffer list if required.

De-batch and run inference done callback.

Parameters
[in] item  The reorder task.
Returns
Boolean indicating success or failure.

◆ initialize()

NvDsInferStatus nvdsinferserver::TritonGrpcBackend::initialize ( )
override virtual

◆ isFirstDimBatch()

bool nvdsinferserver::BaseBackend::isFirstDimBatch ( ) const
inline final virtual inherited

Returns boolean indicating if batched input is expected.

Implements nvdsinferserver::IBackend.

Definition at line 69 of file infer_base_backend.h.

◆ isNonBatching()

bool nvdsinferserver::BaseBackend::isNonBatching ( ) const
inline inherited

Checks if the batch size indicates batched processing or not.

Definition at line 130 of file infer_base_backend.h.

References INFER_EXPORT_API::isNonBatch(), and nvdsinferserver::BaseBackend::maxBatchSize().

◆ maxBatchSize()

int32_t nvdsinferserver::BaseBackend::maxBatchSize ( ) const
inline final virtual inherited

Returns the maximum batch size set for the backend.

Implements nvdsinferserver::IBackend.

Definition at line 125 of file infer_base_backend.h.

Referenced by nvdsinferserver::BaseBackend::isNonBatching().

◆ model()

const std::string& nvdsinferserver::TrtISBackend::model ( ) const
inline inherited

Definition at line 73 of file infer_trtis_backend.h.

◆ mutableLayerInfo()

LayerDescription* nvdsinferserver::BaseBackend::mutableLayerInfo ( const std::string &  bindingName)
inline protected inherited

Get the mutable layer description structure for the layer name.

Definition at line 153 of file infer_base_backend.h.

References nvdsinferserver::BaseBackend::getLayerInfo().

◆ needKeepInputs()

bool nvdsinferserver::BaseBackend::needKeepInputs ( ) const
inline protected inherited

Check if the keep input flag is set.

Definition at line 167 of file infer_base_backend.h.

◆ outputDevId()

int64_t nvdsinferserver::TrtISBackend::outputDevId ( ) const
inline inherited

Definition at line 70 of file infer_trtis_backend.h.

◆ outputMemType()

InferMemType nvdsinferserver::TrtISBackend::outputMemType ( ) const
inline inherited

Definition at line 68 of file infer_trtis_backend.h.

◆ outputPoolSize()

int nvdsinferserver::TrtISBackend::outputPoolSize ( ) const
inline inherited

Definition at line 66 of file infer_trtis_backend.h.

◆ releaseResponseBuf()

void nvdsinferserver::TrtISBackend::releaseResponseBuf ( const std::string &  tensor,
SharedSysMem  mem 
)
protected inherited

Release the output tensor buffer.

Parameters
[in] tensor  Name of the output tensor.
[in] mem  Pointer to the memory buffer.

◆ requestTritonOutputNames()

void nvdsinferserver::TritonGrpcBackend::requestTritonOutputNames ( std::set< std::string > &  names)
override protected virtual

Reimplemented from nvdsinferserver::TrtISBackend.

◆ resetLayers()

void nvdsinferserver::BaseBackend::resetLayers ( LayerDescriptionList  layers,
int  inputSize 
)
protected inherited

Set the layer description list of the backend.

This function sets the layer description for the backend and updates the number of input layers, layer name to index map.

Parameters
[in] layers  The list of descriptions for all layers, input followed by output layers.
[in] inputSize  The number of input layers in the list.

◆ Run()

NvDsInferStatus nvdsinferserver::TritonGrpcBackend::Run ( SharedBatchArray  inputs,
InputsConsumed  bufConsumed,
AsyncDone  asyncDone 
)
override protected virtual

Reimplemented from nvdsinferserver::TrtISBackend.

◆ server()

TrtServerPtr& nvdsinferserver::TrtISBackend::server ( )
inline protected inherited

Get the Triton server handle.

Definition at line 164 of file infer_trtis_backend.h.

◆ serverInferCompleted()

void nvdsinferserver::TrtISBackend::serverInferCompleted ( std::shared_ptr< TrtServerRequest >  request,
std::unique_ptr< TrtServerResponse >  uniqResponse,
InputsConsumed  inputsConsumed,
AsyncDone  asyncDone 
)
protected inherited

Call the inputs-consumed function, parse the inference response to form the array of output batch buffers, and call asyncDone on it.

Parameters
[in] request  Pointer to the inference request.
[in] uniqResponse  Pointer to the inference response from the server.
[in] inputsConsumed  Callback function for releasing the input buffer.
[in] asyncDone  Callback function for processing the response.

◆ setAllocator()

void nvdsinferserver::TrtISBackend::setAllocator ( UniqTritonAllocator  allocator)
inline protected inherited

Set the output tensor allocator.

Definition at line 148 of file infer_trtis_backend.h.

◆ setEnableCudaBufferSharing()

void nvdsinferserver::TritonGrpcBackend::setEnableCudaBufferSharing ( const bool  enableSharing)
inline

Definition at line 43 of file infer_grpc_backend.h.

◆ setFirstDimBatch()

void nvdsinferserver::BaseBackend::setFirstDimBatch ( bool  flag)
inline inherited

Set the flag indicating that it is a batch input.

Definition at line 64 of file infer_base_backend.h.

◆ setInputTensorOrder()

void nvdsinferserver::BaseBackend::setInputTensorOrder ( InferTensorOrder  order)
inline protected inherited

Set the tensor order for the input layers.

Definition at line 162 of file infer_base_backend.h.

◆ setKeepInputs()

void nvdsinferserver::BaseBackend::setKeepInputs ( bool  enable)
inline inherited

Set the flag indicating whether to keep input buffers.

Definition at line 118 of file infer_base_backend.h.

◆ setMaxBatchSize()

void nvdsinferserver::BaseBackend::setMaxBatchSize ( uint32_t  size)
inline protected inherited

Set the maximum batch size to be used for the backend.

Definition at line 174 of file infer_base_backend.h.

◆ setOutputDevId()

void nvdsinferserver::TrtISBackend::setOutputDevId ( int64_t  devId)
inline inherited

Definition at line 69 of file infer_trtis_backend.h.

◆ setOutputMemType()

void nvdsinferserver::TrtISBackend::setOutputMemType ( InferMemType  memType)
inline inherited

Definition at line 67 of file infer_trtis_backend.h.

◆ setOutputPoolSize()

void nvdsinferserver::TrtISBackend::setOutputPoolSize ( int  size)
inline inherited

Helper function to access the member variables.

Definition at line 65 of file infer_trtis_backend.h.

◆ setOutputs()

void nvdsinferserver::TritonGrpcBackend::setOutputs ( const std::set< std::string > &  names)
inline

Definition at line 39 of file infer_grpc_backend.h.

◆ setTensorMaxBytes()

void nvdsinferserver::TrtISBackend::setTensorMaxBytes ( const std::string &  name,
size_t  maxBytes 
)
inline inherited

Set the maximum size for the tensor; the maximum of the existing size and the new input size is used.

The size is rounded up to INFER_MEM_ALIGNMENT bytes.

Parameters
name  Name of the tensor.
maxBytes  New maximum number of bytes for the buffer.

Definition at line 110 of file infer_trtis_backend.h.

References INFER_MEM_ALIGNMENT, and INFER_ROUND_UP.

◆ setUniqueId()

void nvdsinferserver::BaseBackend::setUniqueId ( uint32_t  id)
inline inherited

Set the unique ID for the object instance.

Definition at line 54 of file infer_base_backend.h.

◆ setupLayersInfo()

NvDsInferStatus nvdsinferserver::TritonGrpcBackend::setupLayersInfo ( )
override protected virtual

Reimplemented from nvdsinferserver::TrtISBackend.

◆ setupReorderThread()

NvDsInferStatus nvdsinferserver::TrtISBackend::setupReorderThread ( )
protected inherited

Create a loop thread that calls inferenceDoneReorderLoop on the queued items.

Returns
NVDSINFER_SUCCESS or NVDSINFER_TRITON_ERROR.

◆ setUrl()

void nvdsinferserver::TritonGrpcBackend::setUrl ( const std::string &  url)
inline

Definition at line 42 of file infer_grpc_backend.h.

◆ specifyInputDims()

NvDsInferStatus nvdsinferserver::TrtISBackend::specifyInputDims ( const InputShapes shapes)
override virtual inherited

Specify the input layers for the backend.

Parameters
shapes  List of names and shapes of the input layers.
Returns
Status code of the type NvDsInferStatus.

Implements nvdsinferserver::IBackend.

Reimplemented in nvdsinferserver::TritonSimpleRuntime.

◆ uniqueId()

int nvdsinferserver::BaseBackend::uniqueId ( ) const
inline inherited

Get the unique ID of the object instance.

Definition at line 59 of file infer_base_backend.h.

◆ version()

int64_t nvdsinferserver::TrtISBackend::version ( ) const
inline inherited

Definition at line 74 of file infer_trtis_backend.h.


The documentation for this class was generated from the following file:

infer_grpc_backend.h