Observability
NIM exposes Prometheus metrics that report GPU, process, Python runtime, and request statistics. You can use these metrics to build dashboards with Grafana. When enabled, the metrics are available at http://0.0.0.0:8000/v1/metrics.
Note
The metrics endpoint might not be listed in the OpenAPI schema.
You can use the following command to retrieve the metrics:
curl -X 'GET' 'http://0.0.0.0:8000/v1/metrics'
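As a programmatic alternative to the curl command, the following is a minimal sketch that fetches the endpoint and parses the response. It assumes the endpoint returns the standard Prometheus text exposition format. The helper names `parse_metrics` and `fetch_metrics` are hypothetical, and whether NIM attaches labels to any series is not stated here, so the parser keeps any label set as part of the series key.

```python
import urllib.request

# Endpoint taken from the text above; adjust host and port for your deployment.
METRICS_URL = "http://0.0.0.0:8000/v1/metrics"


def parse_metrics(text):
    """Parse Prometheus text exposition format into {series: value}.

    Assumption: the response uses the standard text format, where each
    sample line is `name{labels} value [timestamp]`.
    """
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        if "}" in line:
            # Label values may contain spaces, so split after the closing brace.
            end = line.rindex("}") + 1
            series, rest = line[:end], line[end:]
        else:
            parts = line.split(None, 1)
            series = parts[0]
            rest = parts[1] if len(parts) > 1 else ""
        fields = rest.split()
        if not fields:
            continue  # malformed line without a value
        samples[series] = float(fields[0])  # an optional timestamp is ignored
    return samples


def fetch_metrics(url=METRICS_URL, timeout=5.0):
    """Fetch the metrics endpoint and return the parsed samples."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return parse_metrics(resp.read().decode("utf-8"))
```

Calling `fetch_metrics()` against a running container with metrics enabled would return a dictionary keyed by series name, for example `request_count_total`.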
The following table describes the available metrics.
| Category | Metric Name | Description |
|---|---|---|
| GPU | gpu_power_usage_watts | GPU instantaneous power, in watts |
| GPU | gpu_power_limit_watts | Maximum GPU power limit, in watts |
| GPU | gpu_total_energy_consumption_joules | GPU total energy consumption, in joules |
| GPU | gpu_utilization | GPU utilization rate (0.0–1.0) |
| GPU | gpu_memory_total_bytes | Total GPU memory, in bytes |
| GPU | gpu_memory_used_bytes | Used GPU memory, in bytes |
| Process | process_virtual_memory_bytes | Virtual memory size, in bytes |
| Process | process_resident_memory_bytes | Resident memory size, in bytes |
| Process | process_start_time_seconds | Start time of the process since Unix epoch, in seconds |
| Process | process_cpu_seconds_total | Total user and system CPU time spent, in seconds |
| Process | process_open_fds | Number of open file descriptors |
| Process | process_max_fds | Maximum number of open file descriptors |
| Python | python_gc_objects_collected_total | Objects collected during GC |
| Python | python_gc_objects_uncollectable_total | Uncollectable objects found during GC |
| Python | python_gc_collections_total | Number of times this generation was collected |
| Python | python_info | Python platform information |
| Request | request_count_total | Total number of requests since container was launched |
| Request | request_latency_seconds | Request latency (histogram) |
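
Several of the metrics in the table are most useful in combination. The sketch below derives a GPU memory fraction, a GPU power fraction, and a mean request latency from a dictionary of parsed samples. It rests on two assumptions: that `request_latency_seconds` follows the usual Prometheus histogram convention of exposing `_sum` and `_count` series, and that series of the same metric with different labels can be summed. The function names and the example values are hypothetical.

```python
def total(samples, name):
    """Sum every series of metric `name`, with or without labels."""
    values = [v for k, v in samples.items() if k.split("{", 1)[0] == name]
    return sum(values) if values else None


def ratio(numerator, denominator):
    """Return numerator / denominator, or None when it is undefined."""
    if numerator is None or not denominator:
        return None
    return numerator / denominator


def summarize(samples):
    """Derive a few ratios from parsed metric samples ({series: value})."""
    return {
        "gpu_memory_fraction": ratio(
            total(samples, "gpu_memory_used_bytes"),
            total(samples, "gpu_memory_total_bytes"),
        ),
        "gpu_power_fraction": ratio(
            total(samples, "gpu_power_usage_watts"),
            total(samples, "gpu_power_limit_watts"),
        ),
        # Assumption: the histogram exposes `_sum` and `_count` series,
        # as Prometheus histograms conventionally do.
        "mean_request_latency_seconds": ratio(
            total(samples, "request_latency_seconds_sum"),
            total(samples, "request_latency_seconds_count"),
        ),
    }


# Illustrative values only, not measurements from a real container.
example = {
    "gpu_memory_used_bytes": 8e9,
    "gpu_memory_total_bytes": 16e9,
    "request_latency_seconds_sum": 12.0,
    "request_latency_seconds_count": 48.0,
}
print(summarize(example))
```

Because Prometheus counters and histograms are cumulative, the mean latency computed this way covers the whole lifetime of the process. For a recent window, compute the same ratio from the difference between two scrapes.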