server_status.proto

message StatDuration
    Statistic collecting a duration metric.

    uint64 count
        Cumulative number of times this metric occurred.

    uint64 total_time_ns
        Total collected duration of this metric in nanoseconds.
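Since StatDuration stores only a cumulative count and a total time, the mean duration has to be derived on the client side. A minimal sketch in plain Python, using a dict in place of the generated protobuf class (the dict layout is an assumption for illustration):

```python
def mean_duration_ns(stat):
    """Mean duration in nanoseconds for a StatDuration-like dict.

    `stat` mirrors the StatDuration fields: 'count' and 'total_time_ns'.
    Returns 0.0 when no samples have been recorded, to avoid dividing
    by zero on a metric that has never fired.
    """
    if stat["count"] == 0:
        return 0.0
    return stat["total_time_ns"] / stat["count"]

# 4 occurrences totalling 2,000,000 ns -> 500,000 ns on average.
print(mean_duration_ns({"count": 4, "total_time_ns": 2_000_000}))
```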
message StatusRequestStats
    Statistics collected for Status requests.

    StatDuration success
        Total time required to handle successful Status requests, not including HTTP or gRPC endpoint termination time.
message ProfileRequestStats
    Statistics collected for Profile requests.

    StatDuration success
        Total time required to handle successful Profile requests, not including HTTP or gRPC endpoint termination time.
message HealthRequestStats
    Statistics collected for Health requests.

    StatDuration success
        Total time required to handle successful Health requests, not including HTTP or gRPC endpoint termination time.
message InferRequestStats
    Statistics collected for Infer requests.

    StatDuration success
        Total time required to handle successful Infer requests, not including HTTP or gRPC endpoint termination time.

    StatDuration failed
        Total time required to handle failed Infer requests, not including HTTP or gRPC endpoint termination time.

    StatDuration compute
        Time required to run inferencing for an inference request, including time copying input tensors to GPU memory, time executing the model, and time copying output tensors from GPU memory.

    StatDuration queue
        Time an inference request waits in the scheduling queue for an available model instance.
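Because the success, queue, and compute durations are all cumulative, dividing each by the success count gives the average per-request latency breakdown, which makes it easy to see whether requests spend their time queued or computing. A sketch over a dict mirroring the InferRequestStats fields (the dict layout is an assumption, standing in for the generated protobuf class):

```python
def latency_breakdown_ns(infer_stats):
    """Average queue and compute time per successful Infer request.

    `infer_stats` mirrors InferRequestStats: each entry is a
    StatDuration-like dict with 'count' and 'total_time_ns'.
    """
    n = infer_stats["success"]["count"]
    if n == 0:
        return {"queue_avg_ns": 0.0, "compute_avg_ns": 0.0}
    return {
        "queue_avg_ns": infer_stats["queue"]["total_time_ns"] / n,
        "compute_avg_ns": infer_stats["compute"]["total_time_ns"] / n,
    }

# Hypothetical sample: 10 successful requests, 1 ms total queued,
# 3 ms total compute -> 0.1 ms queue and 0.3 ms compute per request.
stats = {
    "success": {"count": 10, "total_time_ns": 5_000_000},
    "failed":  {"count": 0,  "total_time_ns": 0},
    "compute": {"count": 10, "total_time_ns": 3_000_000},
    "queue":   {"count": 10, "total_time_ns": 1_000_000},
}
print(latency_breakdown_ns(stats))
```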
enum ModelReadyState
    Readiness status for models.

    MODEL_UNKNOWN = 0
        The model is in an unknown state. The model is not available for inferencing.

    MODEL_READY = 1
        The model is ready and available for inferencing.

    MODEL_UNAVAILABLE = 2
        The model is unavailable, indicating that the model failed to load or has been implicitly or explicitly unloaded. The model is not available for inferencing.

    MODEL_LOADING = 3
        The model is being loaded by the inference server. The model is not available for inferencing.

    MODEL_UNLOADING = 4
        The model is being unloaded by the inference server. The model is not available for inferencing.
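Of the five states, only MODEL_READY permits inferencing, so a client-side availability check reduces to a single comparison. A sketch with the enumerator values transcribed from above (the IntEnum wrapper is an illustration, not part of the proto):

```python
from enum import IntEnum

class ModelReadyState(IntEnum):
    # Values transcribed from the proto enum above.
    MODEL_UNKNOWN = 0
    MODEL_READY = 1
    MODEL_UNAVAILABLE = 2
    MODEL_LOADING = 3
    MODEL_UNLOADING = 4

def model_available(state):
    # Per the docs above, every state except MODEL_READY means
    # "not available for inferencing".
    return state == ModelReadyState.MODEL_READY

print(model_available(ModelReadyState.MODEL_READY))    # True
print(model_available(ModelReadyState.MODEL_LOADING))  # False
```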
message ModelVersionStatus
    Status for a version of a model.

    ModelReadyState ready_state
        Current readiness state for the model.

    map<uint32, InferRequestStats> infer_stats
        Inference statistics for the model, as a map from batch size to the statistics. A batch size will not occur in the map unless there has been at least one inference request of that batch size.

    uint64 model_execution_count
        Cumulative number of model executions performed for the model. A single model execution performs inferencing for the entire request batch and can perform inferencing for multiple requests if dynamic batching is enabled.

    uint64 model_inference_count
        Cumulative number of model inferences performed for the model. Each inference in a batched request is counted as an individual inference.
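Because a single execution covers the whole (possibly dynamically formed) batch, dividing model_inference_count by model_execution_count gives the average batch size the server actually ran, a quick way to see whether dynamic batching is taking effect. A sketch over a dict mirroring the ModelVersionStatus fields (the dict layout and sample numbers are assumptions):

```python
def average_batch_size(version_status):
    """Average inferences per model execution for a ModelVersionStatus-like dict."""
    execs = version_status["model_execution_count"]
    if execs == 0:
        return 0.0
    return version_status["model_inference_count"] / execs

# Hypothetical version status: 100 inferences served in 25 executions,
# so dynamic batching packed ~4 inferences per execution on average.
vs = {
    "ready_state": 1,      # MODEL_READY
    "infer_stats": {},     # map from batch size to InferRequestStats
    "model_execution_count": 25,
    "model_inference_count": 100,
}
print(average_batch_size(vs))  # 4.0
```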
message ModelStatus
    Status for a model.

    ModelConfig config
        The configuration for the model.

    map<int64, ModelVersionStatus> version_status
        Duration statistics for each version of the model, as a map from version to the status. A version will not occur in the map unless there has been at least one inference request of that model version. A version of -1 indicates the status is for requests for which the version could not be determined.
enum ServerReadyState
    Readiness status for the inference server.

    SERVER_INVALID = 0
        The server is in an invalid state and will likely not respond correctly to any requests.

    SERVER_INITIALIZING = 1
        The server is initializing.

    SERVER_READY = 2
        The server is ready and accepting requests.

    SERVER_EXITING = 3
        The server is exiting and will not respond to requests.

    SERVER_FAILED_TO_INITIALIZE = 10
        The server did not initialize correctly. Most requests will fail.
message ServerStatus
    Status for the inference server.

    string id
        The server's ID.

    string version
        The server's version.

    ServerReadyState ready_state
        Current readiness state for the server.

    uint64 uptime_ns
        Server uptime in nanoseconds.

    map<string, ModelStatus> model_status
        Status for each model, as a map from model name to the status.

    StatusRequestStats status_stats
        Statistics for Status requests.

    ProfileRequestStats profile_stats
        Statistics for Profile requests.

    HealthRequestStats health_stats
        Statistics for Health requests.
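Pulling the pieces together, a client that receives a ServerStatus can summarize it with a few field reads: convert uptime_ns to seconds and list the keys of the model_status map. A sketch over a dict mirroring the ServerStatus fields (the dict layout and the sample id, version, and model names are assumptions):

```python
def summarize(server_status):
    """One-line summary of a ServerStatus-like dict."""
    uptime_s = server_status["uptime_ns"] / 1e9  # uptime_ns is nanoseconds
    models = sorted(server_status["model_status"])  # map keys are model names
    return (f"{server_status['id']} v{server_status['version']}: "
            f"up {uptime_s:.1f}s, models: {', '.join(models)}")

# Hypothetical status payload, with ready_state 2 = SERVER_READY.
status = {
    "id": "inference:0",
    "version": "1.0.0",
    "ready_state": 2,
    "uptime_ns": 90_500_000_000,
    "model_status": {"resnet50": {}, "bert": {}},
}
print(summarize(status))  # inference:0 v1.0.0: up 90.5s, models: bert, resnet50
```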