Metrics Developer Guide

This guide explains how to create and use custom metrics in Dynamo components using the Dynamo metrics API.

Metrics Exposure

All metrics created via the Dynamo metrics API are automatically exposed in Prometheus text exposition format on the /metrics HTTP endpoint when the following environment variable is set:

DYN_SYSTEM_PORT=<port> - Port for the metrics endpoint (set to a positive value to enable; default: -1, disabled)

Example:

$ DYN_SYSTEM_PORT=8081 python -m dynamo.vllm --model <model>

The metrics will then be available in Prometheus text exposition format at: http://localhost:8081/metrics
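
For orientation, the following is a minimal, standard-library-only sketch of what a single counter looks like in Prometheus text exposition format. The function render_counter is hypothetical and is not part of the Dynamo API. The names and labels that Dynamo actually emits on /metrics may differ from this output (for example, through prefixes or automatically added labels).

```rust
/// Hypothetical helper (not part of Dynamo): render one counter sample in
/// Prometheus text exposition format.
/// Assumption: label values need no escaping (no quotes, backslashes, newlines).
fn render_counter(name: &str, help: &str, labels: &[(&str, &str)], value: f64) -> String {
    let mut out = String::new();
    // Metadata lines: human-readable help text and the metric type.
    out.push_str(&format!("# HELP {} {}\n", name, help));
    out.push_str(&format!("# TYPE {} counter\n", name));

    // Labels are rendered as {key="value",...}; omitted entirely when empty.
    let label_text = if labels.is_empty() {
        String::new()
    } else {
        let pairs: Vec<String> = labels
            .iter()
            .map(|(k, v)| format!("{}=\"{}\"", k, v))
            .collect();
        format!("{{{}}}", pairs.join(","))
    };

    // Sample line: name, optional labels, then the value.
    out.push_str(&format!("{}{} {}\n", name, label_text, value));
    out
}

fn main() {
    let text = render_counter(
        "requests_total",
        "Total requests",
        &[("region", "us-west")],
        3.0,
    );
    print!("{}", text);
}
```

A scraper such as Prometheus reads this text format directly, so the endpoint can also be inspected by hand with any HTTP client.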

Metric Name Constants

The prometheus_names.rs module provides centralized metric name constants and sanitization functions to ensure consistency across all Dynamo components.
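
To illustrate what name sanitization typically involves, here is a minimal sketch under stated assumptions. The function sanitize_metric_name is hypothetical and does not reproduce the functions in prometheus_names.rs. It only enforces the general Prometheus rule that a metric name matches [a-zA-Z_:][a-zA-Z0-9_:]*.

```rust
/// Hypothetical sanitizer (not the one in prometheus_names.rs): map an
/// arbitrary string to a valid Prometheus metric name by replacing every
/// disallowed character with '_' and guarding against a leading digit.
fn sanitize_metric_name(raw: &str) -> String {
    let mut out = String::with_capacity(raw.len() + 1);
    for (i, c) in raw.chars().enumerate() {
        if c.is_ascii_alphabetic() || c == '_' || c == ':' {
            out.push(c);
        } else if c.is_ascii_digit() {
            // Digits are valid anywhere except the first position.
            if i == 0 {
                out.push('_');
            }
            out.push(c);
        } else {
            out.push('_');
        }
    }
    // An empty input still has to yield a valid name.
    if out.is_empty() {
        out.push('_');
    }
    out
}

fn main() {
    println!("{}", sanitize_metric_name("my-component.latency"));
    println!("{}", sanitize_metric_name("9lives"));
}
```

Centralizing a function like this alongside the name constants means every component produces the same name for the same logical metric.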

Metrics API in Rust

The metrics API is accessible through the .metrics() method on runtime, namespace, component, and endpoint objects. See Runtime Hierarchy for details on the hierarchical structure.

Available Methods

.metrics().create_counter(): Create a counter metric
.metrics().create_gauge(): Create a gauge metric
.metrics().create_histogram(): Create a histogram metric
.metrics().create_countervec(): Create a counter with labels
.metrics().create_gaugevec(): Create a gauge with labels
.metrics().create_histogramvec(): Create a histogram with labels

Creating Metrics

use dynamo_runtime::DistributedRuntime;

let runtime = DistributedRuntime::new()?;
let endpoint = runtime.namespace("my_namespace").component("my_component").endpoint("my_endpoint");

// Simple metrics (the trailing &[] means no constant labels)
let requests_total = endpoint.metrics().create_counter(
    "requests_total",
    "Total requests",
    &[]
)?;

let active_connections = endpoint.metrics().create_gauge(
    "active_connections",
    "Active connections",
    &[]
)?;

// Histogram with explicit bucket upper bounds, in seconds
let latency = endpoint.metrics().create_histogram(
    "latency_seconds",
    "Request latency",
    &[],
    Some(vec![0.001, 0.01, 0.1, 1.0, 10.0])
)?;

Using Metrics

// Counters
requests_total.inc();

// Gauges
active_connections.set(42.0);
active_connections.inc();
active_connections.dec();

// Histograms
latency.observe(0.023);  // 23ms
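
To make the effect of latency.observe(0.023) concrete, the sketch below models how a Prometheus-style histogram reports an observation: every bucket whose upper bound is greater than or equal to the value is incremented, so bucket counts are cumulative. HistogramSketch is a hypothetical stand-in written for this guide, not the type returned by create_histogram, and it models the exposed semantics rather than any internal implementation.

```rust
/// Hypothetical model of Prometheus histogram semantics (not a Dynamo type).
struct HistogramSketch {
    upper_bounds: Vec<f64>,      // bucket upper bounds ("le" values), ascending
    cumulative_counts: Vec<u64>, // observations <= each upper bound
    count: u64,                  // total observations (the implicit +Inf bucket)
    sum: f64,                    // sum of all observed values
}

impl HistogramSketch {
    fn new(upper_bounds: Vec<f64>) -> Self {
        let n = upper_bounds.len();
        Self { upper_bounds, cumulative_counts: vec![0; n], count: 0, sum: 0.0 }
    }

    fn observe(&mut self, value: f64) {
        // Cumulative buckets: a value counts toward every bucket that contains it.
        for (bound, bucket) in self.upper_bounds.iter().zip(self.cumulative_counts.iter_mut()) {
            if value <= *bound {
                *bucket += 1;
            }
        }
        self.count += 1;
        self.sum += value;
    }
}

fn main() {
    let mut latency = HistogramSketch::new(vec![0.001, 0.01, 0.1, 1.0, 10.0]);
    latency.observe(0.023); // 23 ms: first counted in the 0.1 s bucket
    for (bound, n) in latency.upper_bounds.iter().zip(&latency.cumulative_counts) {
        println!("le={} count={}", bound, n);
    }
    println!("count={} sum={}", latency.count, latency.sum);
}
```

Because the buckets are cumulative, choosing bounds that bracket the expected latency range matters more than the number of buckets: an observation above the largest bound is only visible in the total count.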

Vector Metrics with Labels

// Create vector metrics with label names
// (the second slice holds constant labels; empty here)
let requests_by_model = endpoint.metrics().create_countervec(
    "requests_by_model",
    "Requests by model",
    &["model_type", "model_size"],
    &[]
)?;

let memory_by_gpu = endpoint.metrics().create_gaugevec(
    "gpu_memory_bytes",
    "GPU memory by device",
    &["gpu_id", "memory_type"],
    &[]
)?;

// Use with specific label values, given in the same order as the label names
requests_by_model.with_label_values(&["llama", "7b"]).inc();
memory_by_gpu.with_label_values(&["0", "allocated"]).set(8192.0);
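
The sketch below illustrates the idea behind a labeled counter: each distinct combination of label values is tracked as its own series, and the number of values must match the number of label names. CounterVecSketch is a hypothetical model built on the standard library, not the type returned by create_countervec, and its error handling for a mismatched label count is an assumption made for illustration.

```rust
use std::collections::BTreeMap;

/// Hypothetical model of a labeled counter (not a Dynamo type).
struct CounterVecSketch {
    label_names: Vec<String>,
    // One entry per distinct combination of label values.
    series: BTreeMap<Vec<String>, f64>,
}

impl CounterVecSketch {
    fn new(label_names: &[&str]) -> Self {
        Self {
            label_names: label_names.iter().map(|s| s.to_string()).collect(),
            series: BTreeMap::new(),
        }
    }

    /// Increment the series identified by `label_values`, creating it on first use.
    fn inc(&mut self, label_values: &[&str]) -> Result<(), String> {
        if label_values.len() != self.label_names.len() {
            return Err(format!(
                "expected {} label values, got {}",
                self.label_names.len(),
                label_values.len()
            ));
        }
        let key: Vec<String> = label_values.iter().map(|s| s.to_string()).collect();
        *self.series.entry(key).or_insert(0.0) += 1.0;
        Ok(())
    }

    /// Current value of one series, or 0 if it was never incremented.
    fn get(&self, label_values: &[&str]) -> f64 {
        let key: Vec<String> = label_values.iter().map(|s| s.to_string()).collect();
        self.series.get(&key).copied().unwrap_or(0.0)
    }
}

fn main() {
    let mut requests_by_model = CounterVecSketch::new(&["model_type", "model_size"]);
    requests_by_model.inc(&["llama", "7b"]).unwrap();
    requests_by_model.inc(&["llama", "70b"]).unwrap();
    println!("llama/7b = {}", requests_by_model.get(&["llama", "7b"]));
    println!("distinct series = {}", requests_by_model.series.len());
}
```

Since every new combination creates another series, labels with unbounded values (request IDs, user IDs) are usually best avoided.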

Advanced Features

Custom histogram buckets:

let latency = endpoint.metrics().create_histogram(
    "latency_seconds",
    "Request latency",
    &[],
    Some(vec![0.001, 0.01, 0.1, 1.0, 10.0])
)?;

Constant labels:

let counter = endpoint.metrics().create_counter(
    "requests_total",
    "Total requests",
    &[("region", "us-west"), ("env", "prod")]
)?;
