Custom Metrics Example

This section walks through an end-to-end example of the Custom Metrics API in the Python backend. The model repository should contain the custom_metrics model, which uses the Custom Metrics API to register and collect custom metrics.

Deploying the Custom Metrics Models

  1. Create the model repository:

mkdir -p models/custom_metrics/1/

# Copy the Python models
cp examples/custom_metrics/ models/custom_metrics/1/
cp examples/custom_metrics/config.pbtxt models/custom_metrics/config.pbtxt

  2. Start the tritonserver:

tritonserver --model-repository `pwd`/models

  3. Send inference requests to the server:

python3 examples/custom_metrics/

In the client terminal, you should see output similar to the following:

custom_metrics example: found pattern '# HELP requests_process_latency_ns Cumulative time spent processing requests' in metrics
custom_metrics example: found pattern '# TYPE requests_process_latency_ns counter' in metrics
custom_metrics example: found pattern 'requests_process_latency_ns{model="custom_metrics",version="1"}' in metrics
PASS: custom_metrics

In the terminal that runs Triton Server, you should see output similar to the following:

Cumulative requests processing latency: 223406.0

The model file is heavily commented with explanations about each of the function calls.

Explanation of the Client Output

The client sends an HTTP request to http://localhost:8002/metrics to fetch the metrics from the Triton server. It then verifies that the custom metrics added in the model file are reported correctly.
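
The check described above can be sketched roughly as follows. This is a minimal illustration and not the example's actual client script: the function names `fetch_metrics` and `missing_patterns` are hypothetical, and the patterns are copied from the client output shown earlier in this section.

```python
import urllib.request

# Patterns copied from the client output shown earlier in this section.
EXPECTED_PATTERNS = [
    "# HELP requests_process_latency_ns Cumulative time spent processing requests",
    "# TYPE requests_process_latency_ns counter",
    'requests_process_latency_ns{model="custom_metrics",version="1"}',
]


def fetch_metrics(url="http://localhost:8002/metrics", timeout=5.0):
    """Fetch the metrics endpoint and return the response body as text.

    The URL is the one named in this section; the timeout is an
    arbitrary value chosen for this sketch.
    """
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


def missing_patterns(metrics_text, patterns=EXPECTED_PATTERNS):
    """Return the patterns that do not occur anywhere in metrics_text."""
    return [pattern for pattern in patterns if pattern not in metrics_text]
```

With the server running and at least one inference request sent, `missing_patterns(fetch_metrics())` would be expected to return an empty list. Any entries in the result name the metric lines that were not reported.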