Wrapper class for the gRPC client of the Triton Inference Server; it interfaces with the Triton client library.
Definition at line 130 of file infer_grpc_client.h.
Public Member Functions

    InferGrpcClient (std::string url, bool enableCudaBufferSharing)
        Constructor, save the server URL and CUDA sharing flag.
    ~InferGrpcClient ()
        Destructor, default.
    NvDsInferStatus Initialize ()
        Create the gRPC client instance of the Triton Client library.
    NvDsInferStatus getModelMetadata (inference::ModelMetadataResponse *model_metadata, std::string &model_name, std::string &model_version)
        Get the model metadata from the Triton Inference Server.
    NvDsInferStatus getModelConfig (inference::ModelConfigResponse *config, const std::string &name, const std::string &version="", const Headers &headers=Headers())
        Get the model configuration from the Triton Inference Server.
    bool isServerLive ()
        Check if the Triton Inference Server is live.
    bool isServerReady ()
        Check if the Triton Inference Server is ready.
    bool isModelReady (const std::string &model, const std::string version="")
        Check if the specified model is ready for inference.
    NvDsInferStatus LoadModel (const std::string &model_name, const Headers &headers=Headers())
        Request to load the given model using the Triton client library.
    NvDsInferStatus UnloadModel (const std::string &model_name, const Headers &headers=Headers())
        Request to unload the given model using the Triton client library.
    SharedGrpcRequest createRequest (const std::string &model, const std::string &version, SharedIBatchArray input, const std::vector< std::string > &outputs, const std::vector< TritonClassParams > &classList=std::vector< TritonClassParams >())
        Create a new gRPC inference request.
    NvDsInferStatus inferAsync (SharedGrpcRequest request, TritonGrpcAsyncDone done)
        Get the inference input and output list from the request and trigger the asynchronous inference request using the Triton client library.
nvdsinferserver::InferGrpcClient::InferGrpcClient (std::string url, bool enableCudaBufferSharing)

Constructor, save the server URL and CUDA sharing flag.

Parameters
    [in] url                      The Triton server address.
    [in] enableCudaBufferSharing  Flag to enable CUDA buffer sharing.
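A minimal construction sketch, assuming the DeepStream nvdsinferserver headers are available; the include path and the server URL below are placeholders for illustration, not values stated on this page:

    #include "infer_grpc_client.h"   // declares nvdsinferserver::InferGrpcClient

    using namespace nvdsinferserver;

    int main() {
        // Triton gRPC endpoint (placeholder) and the CUDA buffer sharing flag.
        InferGrpcClient client("localhost:8001", /*enableCudaBufferSharing=*/false);

        // Initialize() creates the underlying Triton gRPC client instance.
        if (client.Initialize() != NVDSINFER_SUCCESS) {
            return 1;   // the gRPC client could not be created
        }
        return 0;
    }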
nvdsinferserver::InferGrpcClient::~InferGrpcClient ()

Destructor, default.
SharedGrpcRequest nvdsinferserver::InferGrpcClient::createRequest (const std::string &model, const std::string &version, SharedIBatchArray input, const std::vector< std::string > &outputs, const std::vector< TritonClassParams > &classList = std::vector< TritonClassParams >())
Create a new gRPC inference request.
Create the Triton client library InferInput objects from the input and copy/register the input buffers. Create InferRequestedOutput objects for the output layers.
Parameters
    [in] model      Model name.
    [in] version    Model version.
    [in] input      Array of input batch buffers.
    [in] outputs    List of output layer names.
    [in] classList  List of configured Triton classification parameters.
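A request-building sketch, continuing from the construction example above; inputBatch is a hypothetical SharedIBatchArray produced elsewhere in the nvdsinferserver pipeline, and the model name and output layer name are placeholders:

    // Output layer names requested from Triton (placeholder name).
    std::vector<std::string> outputNames = {"output0"};

    // inputBatch: SharedIBatchArray holding the input batch buffers, obtained
    // from the surrounding pipeline (hypothetical; not constructed here).
    SharedGrpcRequest request =
        client.createRequest("densenet_onnx", "1", inputBatch, outputNames);
    // Check the returned request before use (assumed to be a shared pointer
    // that is empty if the inputs could not be converted or registered).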
NvDsInferStatus nvdsinferserver::InferGrpcClient::getModelConfig (inference::ModelConfigResponse *config, const std::string &name, const std::string &version = "", const Headers &headers = Headers())
Get the model configuration from the Triton Inference Server.
Parameters
    [out] config   The model configuration protobuf message.
    [in]  name     Model name.
    [in]  version  Model version.
    [in]  headers  Optional HTTP headers to be included in the gRPC request.
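A short query sketch; the model name is a placeholder, and inference::ModelConfigResponse is the Triton protobuf message named in the signature above:

    inference::ModelConfigResponse configResponse;
    // Version and headers are left at their defaults here; requires <iostream>.
    if (client.getModelConfig(&configResponse, "densenet_onnx") == NVDSINFER_SUCCESS) {
        // DebugString() is the standard protobuf text dump of the message.
        std::cout << configResponse.DebugString() << std::endl;
    }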
NvDsInferStatus nvdsinferserver::InferGrpcClient::getModelMetadata (inference::ModelMetadataResponse *model_metadata, std::string &model_name, std::string &model_version)

Get the model metadata from the Triton Inference Server.
Parameters
    [out] model_metadata  The model metadata protobuf message.
    [in]  model_name      Model name.
    [in]  model_version   Model version.
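A metadata query sketch; note that model_name and model_version are passed as non-const std::string references in the signature above, so lvalue strings are used (names are placeholders):

    inference::ModelMetadataResponse metadata;
    std::string modelName = "densenet_onnx";   // placeholder model name
    std::string modelVersion = "1";
    if (client.getModelMetadata(&metadata, modelName, modelVersion) == NVDSINFER_SUCCESS) {
        std::cout << metadata.DebugString() << std::endl;   // protobuf text dump
    }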
NvDsInferStatus nvdsinferserver::InferGrpcClient::inferAsync (SharedGrpcRequest request, TritonGrpcAsyncDone done)
Get the inference input and output list from the request and trigger the asynchronous inference request using the Triton client library.
Parameters
    [in] request  The inference request object.
    [in] done     The inference complete callback.
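A submission sketch, continuing from the createRequest example; onInferDone is a hypothetical callable compatible with TritonGrpcAsyncDone (the exact callback signature is declared in infer_grpc_client.h and is not restated on this page):

    // Submission returns immediately; completion is reported through `done`.
    NvDsInferStatus status = client.inferAsync(request, onInferDone);
    if (status != NVDSINFER_SUCCESS) {
        // Handle an immediate submission failure here.
    }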
NvDsInferStatus nvdsinferserver::InferGrpcClient::Initialize ()
Create the gRPC client instance of the Triton Client library.
bool nvdsinferserver::InferGrpcClient::isModelReady (const std::string &model, const std::string version = "")
Check if the specified model is ready for inference.
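A readiness-check sketch combining the three status queries on this page; the model name is a placeholder:

    // Poll server liveness/readiness and model readiness before submitting
    // inference requests.
    if (client.isServerLive() && client.isServerReady() &&
        client.isModelReady("densenet_onnx")) {
        // Safe to create and submit requests for this model.
    }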
bool nvdsinferserver::InferGrpcClient::isServerLive ()
Check if the Triton Inference Server is live.
bool nvdsinferserver::InferGrpcClient::isServerReady ()
Check if the Triton Inference Server is ready.
NvDsInferStatus nvdsinferserver::InferGrpcClient::LoadModel (const std::string &model_name, const Headers &headers = Headers())
Request to load the given model using the Triton client library.
NvDsInferStatus nvdsinferserver::InferGrpcClient::UnloadModel (const std::string &model_name, const Headers &headers = Headers())
Request to unload the given model using the Triton client library.
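A load/unload sketch; the model name is a placeholder, and note that explicit load/unload requests generally require the Triton server to run with explicit model control enabled (a server-side assumption, not stated on this page):

    if (client.LoadModel("densenet_onnx") == NVDSINFER_SUCCESS) {
        // ... create requests and run inference against the loaded model ...
        client.UnloadModel("densenet_onnx");
    }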