api.proto

message InferRequestHeader

  Meta-data for an inferencing request. The actual input data is delivered separately from this header, in the HTTP body for an HTTP request, or in the InferRequest message for a gRPC request.

  message Input

    Meta-data for an input tensor provided as part of an inferencing request.

    string name
      The name of the input tensor.

    uint64 byte_size
      The size of the input tensor, in bytes. This is the size of one instance of the input, not the entire size of a batched input.
  message Output

    Meta-data for a requested output tensor as part of an inferencing request.

    string name
      The name of the output tensor.

    uint64 byte_size
      The size of the output tensor, in bytes. This is the size of one instance of the output, not the entire size of a batched output.
  uint32 batch_size
    The batch size of the inference request. This must be >= 1. For models that don't support batching, batch_size must be 1.
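Assembled from the fields documented above, the request header can be sketched in .proto form. This is an illustrative reconstruction, not the original file: the field numbers and the repeated input/output list fields are assumptions.

```proto
// Sketch of InferRequestHeader assembled from the documentation above.
// Field numbers and the repeated input/output fields are assumptions.
message InferRequestHeader {
  // Meta-data for an input tensor provided as part of the request.
  message Input {
    string name = 1;       // The name of the input tensor.
    uint64 byte_size = 2;  // Size in bytes of one instance of the input.
  }

  // Meta-data for a requested output tensor.
  message Output {
    string name = 1;       // The name of the output tensor.
    uint64 byte_size = 2;  // Size in bytes of one instance of the output.
  }

  uint32 batch_size = 1;       // Must be >= 1; 1 for non-batching models.
  repeated Input input = 2;    // Assumed: the inputs being provided.
  repeated Output output = 3;  // Assumed: the outputs being requested.
}
```

Note that byte_size is per instance: the raw data delivered in the HTTP body (or InferRequest message) for a batched input is batch_size * byte_size bytes.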
message InferResponseHeader

  Meta-data for the response to an inferencing request. The actual output data is delivered separately from this header, in the HTTP body for an HTTP request, or in the InferResponse message for a gRPC request.

  message Output

    Meta-data for an output tensor requested as part of an inferencing request.
    string name
      The name of the output tensor.

    message Raw

      Meta-data for an output tensor being returned as raw data.

      uint64 byte_size
        The size of the output tensor, in bytes. This is the size of one instance of the output, not the entire size of a batched output.
    message Class

      Information about each classification for this output.

      int32 idx
        The classification index.

      float value
        The classification value as a float (typically a probability).

      string label
        The label for the class (optional, only available if provided by the model).
    message Classes

      Meta-data for an output tensor being returned as classifications.

    Raw raw
      If specified, deliver results for this output as raw tensor data. The actual output data is delivered in the HTTP body for an HTTP request, or in the InferResponse message for a gRPC request. Only one of 'raw' and 'batch_classes' may be specified.
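The Output sub-message and its nested types can be sketched in .proto form. The field numbers are assumptions, as are the batch_classes field and the cls field of Classes, which are inferred from the description of raw rather than stated explicitly above.

```proto
// Sketch of InferResponseHeader.Output assembled from the documentation
// above. Field numbers are assumptions, as are 'batch_classes' and
// 'Classes.cls', which are inferred from the description of 'raw'.
message Output {
  // Meta-data for an output tensor being returned as raw data.
  message Raw {
    uint64 byte_size = 1;  // Size in bytes of one instance of the output.
  }

  // Information about each classification for this output.
  message Class {
    int32 idx = 1;     // The classification index.
    float value = 2;   // The classification value (typically a probability).
    string label = 3;  // Optional label, if provided by the model.
  }

  // Meta-data for an output tensor being returned as classifications.
  message Classes {
    repeated Class cls = 1;  // Assumed: the classifications for one instance.
  }

  string name = 1;  // The name of the output tensor.

  // Only one of 'raw' and 'batch_classes' may be specified.
  Raw raw = 2;                        // Return this output as raw tensor data.
  repeated Classes batch_classes = 3; // Assumed: one Classes per instance.
}
```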
  string model_name
    The name of the model that produced the outputs.

  uint32 model_version
    The version of the model that produced the outputs.

  uint32 batch_size
    The batch size of the outputs. This will always be equal to the batch size of the inputs. For models that don't support batching, the batch_size will be 1.

  Output output (repeated)
    The outputs, in the same order as they were requested in InferRequestHeader.
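Putting the top-level fields together, the response header itself can be sketched as follows; the field numbers are illustrative assumptions, and Output refers to the sub-message documented above.

```proto
// Sketch of InferResponseHeader assembled from the documentation above.
// Field numbers are assumptions; Output is the sub-message documented above.
message InferResponseHeader {
  string model_name = 1;       // The model that produced the outputs.
  uint32 model_version = 2;    // The version of that model.
  uint32 batch_size = 3;       // Always equal to the batch size of the inputs.
  repeated Output output = 4;  // Outputs, in the order they were requested.
}
```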