Class InferHttpContext

Inheritance Relationships

Base Type

  • public nvidia::inferenceserver::client::InferContext

Class Documentation

class InferHttpContext : public nvidia::inferenceserver::client::InferContext

InferHttpContext is the HTTP instantiation of InferContext.

Public Functions

~InferHttpContext()

Error Run(ResultMap *results)

Send a synchronous request to the inference server to perform inference and produce results for the outputs specified in the most recent call to SetRunOptions(). The call blocks until the results are available.

Return

Error object indicating success or failure.

Parameters
  • results: Returns Result objects holding inference results as a map from output name to Result object.
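
A minimal sketch of a synchronous Run(), assuming a context 'ctx' created as in the Create() sketch under Public Static Functions below. The option and input setup (InferContext::Options, Outputs(), Inputs(), SetRaw()) belongs to the companion InferContext API and is an assumption here, as are the placeholder payload size and the <vector>/<iostream> includes.

  namespace nic = nvidia::inferenceserver::client;

  // Request every model output as a raw result, then attach the
  // options to the context.
  std::unique_ptr<nic::InferContext::Options> options;
  nic::Error err = nic::InferContext::Options::Create(&options);
  options->SetBatchSize(1);
  for (const auto& output : ctx->Outputs()) {
    options->AddRawResult(output);
  }
  ctx->SetRunOptions(*options);

  // Provide raw bytes for the model's first input (placeholder payload;
  // size it to match the input tensor).
  std::vector<uint8_t> input_data(64);
  ctx->Inputs()[0]->Reset();
  ctx->Inputs()[0]->SetRaw(input_data);

  // Blocks until the inference completes; 'results' maps output name
  // to Result object.
  nic::InferContext::ResultMap results;
  err = ctx->Run(&results);
  if (!err.IsOk()) {
    std::cerr << "inference failed: " << err << std::endl;
  }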

Error AsyncRun(std::shared_ptr<Request> *async_request)

Send an asynchronous request to the inference server to perform inference and produce results for the outputs specified in the most recent call to SetRunOptions(). The call returns immediately; retrieve the results with GetAsyncRunResults().

Return

Error object indicating success or failure.

Parameters
  • async_request: Returns a Request object that can be used to retrieve the inference results for the request.
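
A sketch of launching an asynchronous request, continuing the aliases from the Run() sketch above; as with Run(), SetRunOptions() and the inputs must already be configured. The call returns immediately, and the returned handle is used later with GetAsyncRunResults().

  // Issue the request without blocking.
  std::shared_ptr<nic::Request> request;
  nic::Error err = ctx->AsyncRun(&request);
  if (!err.IsOk()) {
    std::cerr << "failed to launch inference: " << err << std::endl;
  }
  // ... do other work while the server computes the results ...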

Error GetAsyncRunResults(ResultMap *results, bool *is_ready, const std::shared_ptr<Request> &async_request, bool wait)

Get the results of the asynchronous request referenced by 'async_request'.

Return

Error object indicating success or failure.

Parameters
  • results: Returns Result objects holding inference results as a map from output name to Result object.

  • is_ready: Returns a boolean indicating whether the results are ready. The results are valid only if is_ready returns true.

  • async_request: Request handle to retrieve results.

  • wait: If true, block until the request completes. Otherwise, return immediately.
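
A sketch of both retrieval modes, continuing from the AsyncRun() fragment above: a non-blocking poll (wait = false) followed by a blocking wait (wait = true).

  nic::InferContext::ResultMap results;
  bool is_ready = false;

  // Non-blocking poll: returns immediately, and 'results' is valid
  // only if 'is_ready' comes back true.
  nic::Error err = ctx->GetAsyncRunResults(&results, &is_ready, request, false);

  if (err.IsOk() && !is_ready) {
    // Blocking retrieval: wait until the request completes.
    err = ctx->GetAsyncRunResults(&results, &is_ready, request, true);
  }
  if (err.IsOk() && is_ready) {
    for (const auto& pr : results) {
      std::cout << "got result for output '" << pr.first << "'" << std::endl;
    }
  }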

Public Static Functions

static Error Create(std::unique_ptr<InferContext> *ctx, const std::string &server_url, const std::string &model_name, int64_t model_version = -1, bool verbose = false)

Create a context that performs inference for a non-sequence model using the HTTP protocol.

Return

Error object indicating success or failure.

Parameters
  • ctx: Returns a new InferHttpContext object.

  • server_url: The inference server name and port.

  • model_name: The name of the model to use for inference.

  • model_version: The version of the model to use for inference, or -1 to indicate that the latest (i.e. highest version number) version should be used.

  • verbose: If true, generate verbose output when contacting the inference server.
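
A minimal, self-contained sketch of creating a context for a non-sequence model. Only the Create() signature above is taken from this documentation; the header name "request_http.h", the server address, and the model name are assumptions.

  #include <iostream>
  #include <memory>

  #include "request_http.h"  // assumed header exposing InferHttpContext

  namespace nic = nvidia::inferenceserver::client;

  int main() {
    std::unique_ptr<nic::InferContext> ctx;
    // Use the latest version of "my_model" (model_version defaults to -1).
    nic::Error err = nic::InferHttpContext::Create(
        &ctx, "localhost:8000", "my_model");
    if (!err.IsOk()) {
      std::cerr << "failed to create context: " << err << std::endl;
      return 1;
    }
    return 0;
  }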

static Error Create(std::unique_ptr<InferContext> *ctx, CorrelationID correlation_id, const std::string &server_url, const std::string &model_name, int64_t model_version = -1, bool verbose = false)

Create a context that performs inference for a sequence model, using the given correlation ID and the HTTP protocol.

Return

Error object indicating success or failure.

Parameters
  • ctx: Returns a new InferHttpContext object.

  • correlation_id: The correlation ID to use for all inferences performed with this context. A value of 0 (zero) indicates that no correlation ID should be used.

  • server_url: The inference server name and port.

  • model_name: The name of the model to use for inference.

  • model_version: The version of the model to use for inference, or -1 to indicate that the latest (i.e. highest version number) version should be used.

  • verbose: If true, generate verbose output when contacting the inference server.
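
A sketch of the sequence-model overload, continuing the aliases from the sketch above. The nonzero correlation ID (42 here) ties together all inferences performed through this context; the server address and model name are placeholders.

  std::unique_ptr<nic::InferContext> ctx;
  nic::CorrelationID correlation_id = 42;  // nonzero so the server correlates this sequence
  nic::Error err = nic::InferHttpContext::Create(
      &ctx, correlation_id, "localhost:8000", "my_sequence_model");
  // Every Run()/AsyncRun() on 'ctx' now carries correlation ID 42.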