Wrapper class for the gRPC client of the Triton Inference Server; it interfaces with the Triton client library.
Definition at line 130 of file infer_grpc_client.h.
Public Member Functions

InferGrpcClient (std::string url, bool enableCudaBufferSharing)
    Constructor; saves the server URL and the CUDA buffer sharing flag.

~InferGrpcClient ()
    Destructor, default.

NvDsInferStatus Initialize ()
    Creates the gRPC client instance of the Triton client library.

NvDsInferStatus getModelMetadata (inference::ModelMetadataResponse *model_metadata, std::string &model_name, std::string &model_version)
    Gets the model metadata from the Triton Inference Server.

NvDsInferStatus getModelConfig (inference::ModelConfigResponse *config, const std::string &name, const std::string &version = "", const Headers &headers = Headers())
    Gets the model configuration from the Triton Inference Server.

bool isServerLive ()
    Checks whether the Triton Inference Server is live.

bool isServerReady ()
    Checks whether the Triton Inference Server is ready.

bool isModelReady (const std::string &model, const std::string version = "")
    Checks whether the specified model is ready for inference.

NvDsInferStatus LoadModel (const std::string &model_name, const Headers &headers = Headers())
    Requests that the Triton client library load the given model.

NvDsInferStatus UnloadModel (const std::string &model_name, const Headers &headers = Headers())
    Requests that the Triton client library unload the given model.

SharedGrpcRequest createRequest (const std::string &model, const std::string &version, SharedIBatchArray input, const std::vector< std::string > &outputs, const std::vector< TritonClassParams > &classList = std::vector< TritonClassParams >())
    Creates a new gRPC inference request.

NvDsInferStatus inferAsync (SharedGrpcRequest request, TritonGrpcAsyncDone done)
    Extracts the inference input and output lists from the request and triggers the asynchronous inference request via the Triton client library.
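The typical lifecycle of this client is: construct with the server URL, call Initialize(), then probe server and model readiness before submitting requests. The sketch below illustrates only that call order with a hypothetical stand-in class; the real InferGrpcClient is declared in infer_grpc_client.h and talks to a live Triton server over gRPC.

```cpp
#include <string>

// Hypothetical stand-in for nvdsinferserver::InferGrpcClient, showing the
// expected call order only (construct -> Initialize -> readiness probes).
// It performs no networking; the real class wraps the Triton gRPC client.
enum class Status { Success, Failed };

class FakeGrpcClient {
public:
    FakeGrpcClient(std::string url, bool enableCudaBufferSharing)
        : url_(std::move(url)), cudaSharing_(enableCudaBufferSharing) {}

    // Mirrors Initialize(): create the underlying gRPC client instance.
    Status Initialize() { initialized_ = true; return Status::Success; }

    // Mirrors the readiness probes; the real calls query the server.
    bool isServerLive() const { return initialized_; }
    bool isServerReady() const { return initialized_; }
    bool isModelReady(const std::string&, const std::string& = "") const {
        return initialized_;
    }

private:
    std::string url_;
    bool cudaSharing_;
    bool initialized_ = false;
};
```

A caller would check each probe in turn and only then build requests with createRequest() and submit them with inferAsync().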
nvdsinferserver::InferGrpcClient::InferGrpcClient (std::string url, bool enableCudaBufferSharing)

Constructor; saves the server URL and the CUDA buffer sharing flag.

Parameters:
    [in] url                      The Triton server address.
    [in] enableCudaBufferSharing  Flag to enable CUDA buffer sharing.
nvdsinferserver::InferGrpcClient::~InferGrpcClient ()
Destructor, default.
SharedGrpcRequest nvdsinferserver::InferGrpcClient::createRequest (const std::string &model, const std::string &version, SharedIBatchArray input, const std::vector< std::string > &outputs, const std::vector< TritonClassParams > &classList = std::vector< TritonClassParams >())

Create a new gRPC inference request.

Creates the Triton client library InferInput objects from the input and copies/registers the input buffers; creates InferRequestedOutput objects for the output layers.

Parameters:
    [in] model      Model name.
    [in] version    Model version.
    [in] input      Array of input batch buffers.
    [in] outputs    List of output layer names.
    [in] classList  List of configured Triton classification parameters.
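Conceptually, createRequest() bundles the model identity, the input batch, and the requested output layer names into one request object. The sketch below is a hypothetical simplification of that assembly step; the real method additionally builds Triton InferInput / InferRequestedOutput objects and copies or registers the input buffers.

```cpp
#include <memory>
#include <string>
#include <vector>

// Hypothetical simplification of the request object that createRequest()
// returns as a SharedGrpcRequest. All names here are illustrative.
struct FakeRequest {
    std::string model;                  // model name
    std::string version;                // model version ("" = latest)
    std::vector<std::string> outputs;   // output layer names to fetch
};

// Bundle the request fields, analogous to createRequest() minus the
// Triton buffer handling.
std::shared_ptr<FakeRequest> makeFakeRequest(
        const std::string& model, const std::string& version,
        const std::vector<std::string>& outputs) {
    auto req = std::make_shared<FakeRequest>();
    req->model = model;
    req->version = version;
    req->outputs = outputs;
    return req;
}
```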
NvDsInferStatus nvdsinferserver::InferGrpcClient::getModelConfig (inference::ModelConfigResponse *config, const std::string &name, const std::string &version = "", const Headers &headers = Headers())

Get the model configuration from the Triton Inference Server.

Parameters:
    [out] config   The model configuration protobuf message.
    [in]  name     Model name.
    [in]  version  Model version.
    [in]  headers  Optional HTTP headers to be included in the gRPC request.
NvDsInferStatus nvdsinferserver::InferGrpcClient::getModelMetadata (inference::ModelMetadataResponse *model_metadata, std::string &model_name, std::string &model_version)

Get the model metadata from the Triton Inference Server.

Parameters:
    [out] model_metadata  The model metadata protobuf message.
    [in]  model_name      Model name.
    [in]  model_version   Model version.
NvDsInferStatus nvdsinferserver::InferGrpcClient::inferAsync (SharedGrpcRequest request, TritonGrpcAsyncDone done)

Get the inference input and output list from the request and trigger the asynchronous inference request using the Triton client library.

Parameters:
    [in] request  The inference request object.
    [in] done     The inference complete callback.
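Because inferAsync() is asynchronous, results arrive through the done callback rather than a return value. The sketch below illustrates that completion-callback pattern with hypothetical stand-in types; TritonGrpcAsyncDone in the real API is the callback invoked when the gRPC inference finishes.

```cpp
#include <functional>
#include <memory>
#include <vector>

// Hypothetical stand-ins for the output batch and the completion callback
// (TritonGrpcAsyncDone in the real API). All names here are illustrative.
struct FakeOutput { std::vector<float> data; };
using SharedOutput = std::shared_ptr<FakeOutput>;
using AsyncDone = std::function<void(int /*status*/, SharedOutput)>;

// Runs a trivial "inference" (doubling each value) and then invokes the
// callback, the way the real client invokes done() when the gRPC call
// completes.
void fakeInferAsync(const std::vector<float>& input, AsyncDone done) {
    auto out = std::make_shared<FakeOutput>();
    for (float v : input) out->data.push_back(v * 2.0f);
    done(/*status=*/0, out);
}
```

In the real flow the callback may fire on a Triton client thread, so the caller's done handler must be thread-safe with respect to its own state.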
NvDsInferStatus nvdsinferserver::InferGrpcClient::Initialize ()
Create the gRPC client instance of the Triton Client library.
bool nvdsinferserver::InferGrpcClient::isModelReady (const std::string &model, const std::string version = "")
Check if the specified model is ready for inference.
bool nvdsinferserver::InferGrpcClient::isServerLive ()
Check if the Triton Inference Server is live.
bool nvdsinferserver::InferGrpcClient::isServerReady ()
Check if the Triton Inference Server is ready.
NvDsInferStatus nvdsinferserver::InferGrpcClient::LoadModel (const std::string &model_name, const Headers &headers = Headers())
Request to load the given model using the Triton client library.
NvDsInferStatus nvdsinferserver::InferGrpcClient::UnloadModel (const std::string &model_name, const Headers &headers = Headers())
Request to unload the given model using the Triton client library.
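LoadModel() and UnloadModel() give explicit control over which models the server keeps loaded, with optional per-request HTTP headers. The sketch below mimics only that bookkeeping with hypothetical stand-ins, assuming Headers is a string-to-string map of optional HTTP headers (as in the Triton client library's Headers typedef); no server is contacted, and the header value shown is a placeholder.

```cpp
#include <map>
#include <string>

// Assumed shape of the Headers type: optional HTTP headers keyed by name.
using Headers = std::map<std::string, std::string>;

// Hypothetical stand-in that tracks load state locally; the real
// LoadModel/UnloadModel issue gRPC model-control requests to Triton.
struct FakeModelControl {
    std::map<std::string, bool> loaded;

    // Mirrors LoadModel(): mark the named model as loaded.
    bool LoadModel(const std::string& name, const Headers& = Headers()) {
        loaded[name] = true;
        return true;
    }

    // Mirrors UnloadModel(): mark the named model as unloaded.
    bool UnloadModel(const std::string& name, const Headers& = Headers()) {
        loaded[name] = false;
        return true;
    }
};
```

Note that on a real deployment these calls only take effect when the server runs in explicit model-control mode.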