Model Configuration Extension

This document describes Triton’s model configuration extension. The model configuration extension allows Triton to return model-specific configuration information. Because this extension is supported, Triton reports “model_configuration” in the extensions field of its Server Metadata.

HTTP/REST

In all JSON schemas shown in this document, $number, $string, $boolean, $object and $array refer to the fundamental JSON types. #optional indicates an optional JSON field.

Triton exposes the model configuration endpoint at the following URL. The versions portion of the URL is optional; if it is not provided, Triton returns the model configuration for the highest-numbered version of the model.

GET v2/models/${MODEL_NAME}[/versions/${MODEL_VERSION}]/config

A model configuration request is made with an HTTP GET to the model configuration endpoint. A successful model configuration request is indicated by a 200 HTTP status code. The model configuration response object, identified as $model_configuration_response, is returned in the HTTP body for every successful request.

$model_configuration_response =
{
  # configuration JSON
}

The contents of the response will be the JSON representation of the model’s configuration described by the ModelConfig message from model_config.proto.
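As an illustration of the request described above, the following is a minimal sketch of a client that builds the endpoint URL and decodes a successful response. It uses only the Python standard library. The helper names model_config_url and fetch_model_config, the timeout parameter, and the base URL http://localhost:8000 are assumptions made for this example and are not part of the extension.

```python
import json
import urllib.request
from urllib.parse import quote


def model_config_url(base_url, model_name, model_version=None):
    """Build the model configuration URL; the versions segment is optional."""
    url = f"{base_url.rstrip('/')}/v2/models/{quote(model_name, safe='')}"
    if model_version is not None:
        # Only add the versions segment when a version is requested.
        url += f"/versions/{quote(str(model_version), safe='')}"
    return url + "/config"


def fetch_model_config(base_url, model_name, model_version=None, timeout=10.0):
    """Send an HTTP GET and decode the JSON body of a successful response.

    urllib raises urllib.error.HTTPError for error statuses such as 400.
    """
    url = model_config_url(base_url, model_name, model_version)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Assuming a server is reachable at the hypothetical address, fetch_model_config("http://localhost:8000", "my_model") would return the configuration as a Python dict, and passing model_version=2 would request that specific version instead of leaving the choice to the server.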

A failed model configuration request must be indicated by an HTTP error status (typically 400). The HTTP body must contain the $model_configuration_error_response object.

$model_configuration_error_response =
{
  "error": <error message string>
}
  • “error” : The descriptive message for the error.
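One way a client might separate the two response shapes is sketched below. It takes a status code and a raw body, returns the configuration on a 200 status, and otherwise raises an exception carrying the “error” string. The names ModelConfigError and parse_model_config_response are hypothetical and exist only for this example; the fallback message used when the body has no “error” field is likewise an assumption.

```python
import json


class ModelConfigError(Exception):
    """Hypothetical exception carrying the server's "error" message."""

    def __init__(self, status, message):
        super().__init__(f"HTTP {status}: {message}")
        self.status = status
        self.message = message


def parse_model_config_response(status, body):
    """Return the configuration dict on 200, otherwise raise ModelConfigError."""
    try:
        payload = json.loads(body)
    except (ValueError, TypeError):
        # The body was missing or was not valid JSON.
        payload = None
    if status == 200 and isinstance(payload, dict):
        return payload
    # Error responses carry a single "error" field with a descriptive message.
    message = payload.get("error", "") if isinstance(payload, dict) else ""
    raise ModelConfigError(status, message or "no error message in response body")
```

Keeping the parsing separate from the transport makes the success and failure paths easy to exercise without a running server.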

GRPC

The GRPC definition of the service is:

service GRPCInferenceService
{
  …

  // Get model configuration.
  rpc ModelConfig(ModelConfigRequest) returns (ModelConfigResponse) {}
}

Errors are indicated by the google.rpc.Status returned for the request. The OK code indicates success and other codes indicate failure. The request and response messages for ModelConfig are:

message ModelConfigRequest
{
  // The name of the model.
  string name = 1;

  // The version of the model. If not given the version of the model
  // is selected automatically based on the version policy.
  string version = 2;
}

message ModelConfigResponse
{
  // The model configuration.
  ModelConfig config = 1;
}

The ModelConfig message is defined in model_config.proto.
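The ModelConfig RPC can also be exercised from Python. The sketch below assumes the tritonclient package with its gRPC support is installed and that a server is listening at an address such as localhost:8001; both are assumptions of this example rather than requirements of the extension. The helper names version_field and fetch_model_config_grpc are hypothetical.

```python
def version_field(model_version=None):
    """Map an optional version to the string sent in ModelConfigRequest.version."""
    # An empty string leaves version selection to the server's version policy.
    return "" if model_version is None else str(model_version)


def fetch_model_config_grpc(url, model_name, model_version=None):
    """Issue the ModelConfig RPC and return the ModelConfig message."""
    # Imported lazily so this sketch loads even where tritonclient is absent.
    import tritonclient.grpc as grpcclient
    from tritonclient.utils import InferenceServerException

    client = grpcclient.InferenceServerClient(url=url)
    try:
        response = client.get_model_config(model_name, version_field(model_version))
    except InferenceServerException as exc:
        # This client reports a non-OK status by raising an exception.
        raise RuntimeError(f"ModelConfig failed: {exc}") from exc
    finally:
        client.close()
    # ModelConfigResponse carries the configuration in its config field.
    return response.config
```

For example, fetch_model_config_grpc("localhost:8001", "my_model") would return the ModelConfig protobuf message for the version chosen by the server, assuming the model is loaded.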