Wrapper class for the gRPC client of the Triton Inference Server; it interfaces with the Triton client library.
Definition at line 130 of file infer_grpc_client.h.
Public Member Functions

    InferGrpcClient (std::string url, bool enableCudaBufferSharing)
        Constructor, save the server URL and CUDA sharing flag.
    ~InferGrpcClient ()
        Destructor, default.
    NvDsInferStatus Initialize ()
        Create the gRPC client instance of the Triton Client library.
    NvDsInferStatus getModelMetadata (inference::ModelMetadataResponse *model_metadata, std::string &model_name, std::string &model_version)
        Get the model metadata from the Triton Inference Server.
    NvDsInferStatus getModelConfig (inference::ModelConfigResponse *config, const std::string &name, const std::string &version="", const Headers &headers=Headers())
        Get the model configuration from the Triton Inference Server.
    bool isServerLive ()
        Check if the Triton Inference Server is live.
    bool isServerReady ()
        Check if the Triton Inference Server is ready.
    bool isModelReady (const std::string &model, const std::string version="")
        Check if the specified model is ready for inference.
    NvDsInferStatus LoadModel (const std::string &model_name, const Headers &headers=Headers())
        Request to load the given model using the Triton client library.
    NvDsInferStatus UnloadModel (const std::string &model_name, const Headers &headers=Headers())
        Request to unload the given model using the Triton client library.
    SharedGrpcRequest createRequest (const std::string &model, const std::string &version, SharedIBatchArray input, const std::vector< std::string > &outputs, const std::vector< TritonClassParams > &classList=std::vector< TritonClassParams >())
        Create a new gRPC inference request.
    NvDsInferStatus inferAsync (SharedGrpcRequest request, TritonGrpcAsyncDone done)
        Get the inference input and output list from the request and trigger the asynchronous inference request using the Triton client library.
nvdsinferserver::InferGrpcClient::InferGrpcClient (std::string url, bool enableCudaBufferSharing)

Constructor, save the server URL and CUDA sharing flag.

Parameters
    [in] url                      The Triton server address.
    [in] enableCudaBufferSharing  Flag to enable CUDA buffer sharing.
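A minimal construction sketch, assuming the DeepStream nvdsinferserver headers are available; the include path and the server URL below are placeholders for illustration, not values stated on this page:

    #include "infer_grpc_client.h"   // declares nvdsinferserver::InferGrpcClient

    using namespace nvdsinferserver;

    int main() {
        // Triton gRPC endpoint (placeholder) and the CUDA buffer sharing flag.
        InferGrpcClient client("localhost:8001", /*enableCudaBufferSharing=*/false);

        // Initialize() creates the underlying Triton gRPC client instance.
        if (client.Initialize() != NVDSINFER_SUCCESS) {
            return 1;   // the gRPC client could not be created
        }
        return 0;
    }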
nvdsinferserver::InferGrpcClient::~InferGrpcClient ()

Destructor, default.
SharedGrpcRequest nvdsinferserver::InferGrpcClient::createRequest (const std::string &model, const std::string &version, SharedIBatchArray input, const std::vector< std::string > &outputs, const std::vector< TritonClassParams > &classList = std::vector< TritonClassParams >())
Create a new gRPC inference request.
Create the Triton client library InferInput objects from the input and copy/register the input buffers. Create InferRequestedOutput objects for the output layers.
Parameters
    [in] model      Model name.
    [in] version    Model version.
    [in] input      Array of input batch buffers.
    [in] outputs    List of output layer names.
    [in] classList  List of configured Triton classification parameters.
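A request-building sketch, continuing from the construction example above; inputBatch is a hypothetical SharedIBatchArray produced elsewhere in the nvdsinferserver pipeline, and the model name and output layer name are placeholders:

    // Output layer names requested from Triton (placeholder name).
    std::vector<std::string> outputNames = {"output0"};

    // inputBatch: SharedIBatchArray holding the input batch buffers, obtained
    // from the surrounding pipeline (hypothetical; not constructed here).
    SharedGrpcRequest request =
        client.createRequest("densenet_onnx", "1", inputBatch, outputNames);
    // Check the returned request before use (assumed to be a shared pointer
    // that is empty if the inputs could not be converted or registered).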
NvDsInferStatus nvdsinferserver::InferGrpcClient::getModelConfig (inference::ModelConfigResponse *config, const std::string &name, const std::string &version = "", const Headers &headers = Headers())
Get the model configuration from the Triton Inference Server.
Parameters
    [out] config   The model configuration protobuf message.
    [in]  name     Model name.
    [in]  version  Model version.
    [in]  headers  Optional HTTP headers to be included in the gRPC request.
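A short query sketch; the model name is a placeholder, and inference::ModelConfigResponse is the Triton protobuf message named in the signature above:

    inference::ModelConfigResponse configResponse;
    // Version and headers are left at their defaults here; requires <iostream>.
    if (client.getModelConfig(&configResponse, "densenet_onnx") == NVDSINFER_SUCCESS) {
        // DebugString() is the standard protobuf text dump of the message.
        std::cout << configResponse.DebugString() << std::endl;
    }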
NvDsInferStatus nvdsinferserver::InferGrpcClient::getModelMetadata (inference::ModelMetadataResponse *model_metadata, std::string &model_name, std::string &model_version)

Get the model metadata from the Triton Inference Server.
Parameters
    [out] model_metadata  The model metadata protobuf message.
    [in]  model_name      Model name.
    [in]  model_version   Model version.
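A metadata query sketch; note that model_name and model_version are passed as non-const std::string references in the signature above, so lvalue strings are used (names are placeholders):

    inference::ModelMetadataResponse metadata;
    std::string modelName = "densenet_onnx";   // placeholder model name
    std::string modelVersion = "1";
    if (client.getModelMetadata(&metadata, modelName, modelVersion) == NVDSINFER_SUCCESS) {
        std::cout << metadata.DebugString() << std::endl;   // protobuf text dump
    }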
NvDsInferStatus nvdsinferserver::InferGrpcClient::inferAsync (SharedGrpcRequest request, TritonGrpcAsyncDone done)
Get the inference input and output list from the request and trigger the asynchronous inference request using the Triton client library.
Parameters
    [in] request  The inference request object.
    [in] done     The inference complete callback.
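A submission sketch, continuing from the createRequest example; onInferDone is a hypothetical callable compatible with TritonGrpcAsyncDone (the exact callback signature is declared in infer_grpc_client.h and is not restated on this page):

    // Submission returns immediately; completion is reported through `done`.
    NvDsInferStatus status = client.inferAsync(request, onInferDone);
    if (status != NVDSINFER_SUCCESS) {
        // Handle an immediate submission failure here.
    }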
NvDsInferStatus nvdsinferserver::InferGrpcClient::Initialize ()
Create the gRPC client instance of the Triton Client library.
bool nvdsinferserver::InferGrpcClient::isModelReady (const std::string &model, const std::string version = "")
Check if the specified model is ready for inference.
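A readiness-check sketch combining the three status queries on this page; the model name is a placeholder:

    // Poll server liveness/readiness and model readiness before submitting
    // inference requests.
    if (client.isServerLive() && client.isServerReady() &&
        client.isModelReady("densenet_onnx")) {
        // Safe to create and submit requests for this model.
    }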
bool nvdsinferserver::InferGrpcClient::isServerLive ()
Check if the Triton Inference Server is live.
bool nvdsinferserver::InferGrpcClient::isServerReady ()
Check if the Triton Inference Server is ready.
NvDsInferStatus nvdsinferserver::InferGrpcClient::LoadModel (const std::string &model_name, const Headers &headers = Headers())
Request to load the given model using the Triton client library.
NvDsInferStatus nvdsinferserver::InferGrpcClient::UnloadModel (const std::string &model_name, const Headers &headers = Headers())
Request to unload the given model using the Triton client library.
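A load/unload sketch; the model name is a placeholder, and note that explicit load/unload requests generally require the Triton server to run with explicit model control enabled (a server-side assumption, not stated on this page):

    if (client.LoadModel("densenet_onnx") == NVDSINFER_SUCCESS) {
        // ... create requests and run inference against the loaded model ...
        client.UnloadModel("densenet_onnx");
    }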