NVIDIA DeepStream SDK API Reference

6.4 Release
nvdsinferserver::InferGrpcClient Class Reference

Detailed Description

Wrapper class for the gRPC client of the Triton Inference Server; it interfaces with the Triton client library.

Definition at line 130 of file infer_grpc_client.h.
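
The typical call sequence is: construct the client, Initialize(), check server and model readiness, createRequest(), then inferAsync(). A minimal sketch of that flow follows; the include path, the model and output names, and the callback parameter types (defined by the TritonGrpcAsyncDone typedef) are assumptions to be verified against the SDK headers, and inputBatch is presumed to be a SharedIBatchArray filled elsewhere in the nvdsinferserver pipeline.

#include "infer_grpc_client.h"   // include path is an assumption; check your DeepStream install

using namespace nvdsinferserver;

NvDsInferStatus runInference(SharedIBatchArray inputBatch) {
    // Connect to a Triton gRPC endpoint with CUDA buffer sharing enabled.
    InferGrpcClient client("localhost:8001", true /* enableCudaBufferSharing */);

    NvDsInferStatus status = client.Initialize();
    if (status != NVDSINFER_SUCCESS)
        return status;

    // Gate on server and model readiness ("my_model" is a placeholder name).
    if (!client.isServerLive() || !client.isServerReady() ||
        !client.isModelReady("my_model"))
        return NVDSINFER_TRITON_ERROR;

    // Build the request for the desired output layers and run it asynchronously.
    SharedGrpcRequest request =
        client.createRequest("my_model", "1", inputBatch, {"output_0"});
    return client.inferAsync(request, [](auto status, auto outputs) {
        // Completion callback; the exact parameter types are given by
        // the TritonGrpcAsyncDone typedef in the SDK headers.
    });
}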

Public Member Functions

 InferGrpcClient (std::string url, bool enableCudaBufferSharing)
 Constructor, saves the server URL and CUDA sharing flag. More...
 
 ~InferGrpcClient ()
 Destructor, default. More...
 
NvDsInferStatus Initialize ()
 Create the gRPC client instance of the Triton Client library. More...
 
NvDsInferStatus getModelMetadata (inference::ModelMetadataResponse *model_metadata, std::string &model_name, std::string &model_version)
 Get the model metadata from the Triton Inference server. More...
 
NvDsInferStatus getModelConfig (inference::ModelConfigResponse *config, const std::string &name, const std::string &version="", const Headers &headers=Headers())
 Get the model configuration from the Triton Inference Server. More...
 
bool isServerLive ()
 Check if the Triton Inference Server is live. More...
 
bool isServerReady ()
 Check if the Triton Inference Server is ready. More...
 
bool isModelReady (const std::string &model, const std::string version="")
 Check if the specified model is ready for inference. More...
 
NvDsInferStatus LoadModel (const std::string &model_name, const Headers &headers=Headers())
 Request to load the given model using the Triton client library. More...
 
NvDsInferStatus UnloadModel (const std::string &model_name, const Headers &headers=Headers())
 Request to unload the given model using the Triton client library. More...
 
SharedGrpcRequest createRequest (const std::string &model, const std::string &version, SharedIBatchArray input, const std::vector< std::string > &outputs, const std::vector< TritonClassParams > &classList=std::vector< TritonClassParams >())
 Create a new gRPC inference request. More...
 
NvDsInferStatus inferAsync (SharedGrpcRequest request, TritonGrpcAsyncDone done)
 Get the inference input and output list from the request and trigger the asynchronous inference request using the Triton client library. More...
 

Constructor & Destructor Documentation

◆ InferGrpcClient()

nvdsinferserver::InferGrpcClient::InferGrpcClient ( std::string  url,
bool  enableCudaBufferSharing 
)

Constructor, saves the server URL and CUDA sharing flag.

Parameters
[in] url - The Triton server address.
[in] enableCudaBufferSharing - Flag to enable CUDA buffer sharing.
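
A minimal construction sketch; the endpoint shown (host:port of the Triton gRPC service, conventionally port 8001) is an assumption about the deployment:

// gRPC endpoint of the Triton server; CUDA buffer sharing enabled.
nvdsinferserver::InferGrpcClient client("localhost:8001", true);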

◆ ~InferGrpcClient()

nvdsinferserver::InferGrpcClient::~InferGrpcClient ( )

Destructor, default.

Member Function Documentation

◆ createRequest()

SharedGrpcRequest nvdsinferserver::InferGrpcClient::createRequest ( const std::string &  model,
const std::string &  version,
SharedIBatchArray  input,
const std::vector< std::string > &  outputs,
const std::vector< TritonClassParams > &  classList = std::vector< TritonClassParams >() 
)

Create a new gRPC inference request.

Create the Triton client library InferInput objects from the input and copy/register the input buffers. Create InferRequestedOutput objects for the output layers.

Parameters
[in] model - Model name.
[in] version - Model version.
[in] input - Array of input batch buffers.
[in] outputs - List of output layer names.
[in] classList - List of configured Triton classification parameters.
Returns
Pointer to the gRPC inference request object created.
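
A sketch of building a request, assuming inputBatch is an already-populated SharedIBatchArray and that the model exposes an output tensor named "output_0" (both placeholders):

// classList is only needed when Triton-side classification is configured;
// the default empty list is used here.
std::vector<std::string> outputNames{"output_0"};
SharedGrpcRequest request =
    client.createRequest("my_model", "1", inputBatch, outputNames);
if (!request) {
    // Request creation failed, e.g. an input buffer copy/registration error.
}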

◆ getModelConfig()

NvDsInferStatus nvdsinferserver::InferGrpcClient::getModelConfig ( inference::ModelConfigResponse *  config,
const std::string &  name,
const std::string &  version = "",
const Headers &  headers = Headers() 
)

Get the model configuration from the Triton Inference Server.

Parameters
[out] config - The model configuration protobuf message.
[in] name - Model name.
[in] version - Model version.
[in] headers - Optional HTTP headers to be included in the gRPC request.
Returns
Error status.
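
A usage sketch; the accessors on the response come from Triton's model_config protobuf definitions, and "my_model" is a placeholder name:

inference::ModelConfigResponse configResponse;
if (client.getModelConfig(&configResponse, "my_model") == NVDSINFER_SUCCESS) {
    // Inspect the model configuration, e.g. the declared maximum batch size.
    int maxBatchSize = configResponse.config().max_batch_size();
}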

◆ getModelMetadata()

NvDsInferStatus nvdsinferserver::InferGrpcClient::getModelMetadata ( inference::ModelMetadataResponse *  model_metadata,
std::string &  model_name,
std::string &  model_version 
)

Get the model metadata from the Triton Inference server.

Parameters
[out] model_metadata - The model metadata protobuf message.
[in] model_name - Model name.
[in] model_version - Model version.
Returns
Error status.
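
A usage sketch; note that the name and version are taken by non-const reference, so named variables are passed. The metadata accessors come from Triton's protobuf definitions:

inference::ModelMetadataResponse metadata;
std::string name = "my_model";   // placeholder model name
std::string version = "1";       // placeholder version
if (client.getModelMetadata(&metadata, name, version) == NVDSINFER_SUCCESS) {
    // The response lists input/output tensors with name, datatype and shape.
    for (const auto &input : metadata.inputs()) {
        // input.name(), input.datatype(), input.shape() ...
    }
}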

◆ inferAsync()

NvDsInferStatus nvdsinferserver::InferGrpcClient::inferAsync ( SharedGrpcRequest  request,
TritonGrpcAsyncDone  done 
)

Get the inference input and output list from the request and trigger the asynchronous inference request using the Triton client library.

Parameters
[in] request - The inference request object.
[in] done - The inference completion callback.
Returns
Error status.
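
A sketch of submitting a previously created request; the callback parameter types are defined by the TritonGrpcAsyncDone typedef in the SDK headers, so a generic lambda is used here:

NvDsInferStatus submitStatus = client.inferAsync(request,
    [](auto status, auto outputs) {
        // Invoked by the Triton client library when inference completes;
        // on success, outputs carries the output batch buffers.
        if (status != NVDSINFER_SUCCESS) {
            // Handle the inference failure.
        }
    });
if (submitStatus != NVDSINFER_SUCCESS) {
    // The request could not be submitted.
}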

◆ Initialize()

NvDsInferStatus nvdsinferserver::InferGrpcClient::Initialize ( )

Create the gRPC client instance of the Triton Client library.

Returns
Error status.
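
Initialize() must succeed before any other call is made on the client; a minimal check:

if (client.Initialize() != NVDSINFER_SUCCESS) {
    // The gRPC client could not be created; do not use the instance further.
}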

◆ isModelReady()

bool nvdsinferserver::InferGrpcClient::isModelReady ( const std::string &  model,
const std::string  version = "" 
)

Check if the specified model is ready for inference.

◆ isServerLive()

bool nvdsinferserver::InferGrpcClient::isServerLive ( )

Check if the Triton Inference Server is live.

◆ isServerReady()

bool nvdsinferserver::InferGrpcClient::isServerReady ( )

Check if the Triton Inference Server is ready.
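
A sketch of the typical readiness gate combining isServerLive(), isServerReady() and isModelReady(); "my_model" is a placeholder name:

if (!client.isServerLive()) {
    // Server is not reachable or not live.
} else if (!client.isServerReady()) {
    // Server is live but not yet ready to serve requests.
} else if (!client.isModelReady("my_model")) {
    // The specific model is not loaded or not ready for inference.
}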

◆ LoadModel()

NvDsInferStatus nvdsinferserver::InferGrpcClient::LoadModel ( const std::string &  model_name,
const Headers &  headers = Headers() 
)

Request to load the given model using the Triton client library.

◆ UnloadModel()

NvDsInferStatus nvdsinferserver::InferGrpcClient::UnloadModel ( const std::string &  model_name,
const Headers &  headers = Headers() 
)

Request to unload the given model using the Triton client library.
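
A sketch covering both LoadModel() and UnloadModel(); explicit load and unload only take effect when the Triton server runs with explicit model control (for example, --model-control-mode=explicit), which is an assumption about the deployment:

// Optional gRPC headers (e.g. for authentication) may be passed as the
// second argument; the default empty Headers() is used here.
if (client.LoadModel("my_model") == NVDSINFER_SUCCESS) {
    // ... issue inference requests against "my_model" ...
    client.UnloadModel("my_model");
}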


The documentation for this class was generated from the following file:

infer_grpc_client.h