Observability

NIM exposes Prometheus metrics that report request statistics along with GPU, process, and Python runtime statistics. You can use these metrics to build dashboards in Grafana. When enabled, the metrics are available at http://0.0.0.0:8000/v1/metrics.

Note

The metrics endpoint might not be listed in the OpenAPI schema.

You can use the following command to retrieve the metrics:

curl -X 'GET' 'http://0.0.0.0:8000/v1/metrics'
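
The same request can be made from a script. The following is a minimal sketch, not part of NIM: `fetch_metrics` and `parse_metrics` are hypothetical helper names, the sample text is made up for illustration, and the parser assumes the endpoint returns the standard Prometheus text exposition format without trailing timestamps.

```python
import urllib.request


def fetch_metrics(url="http://0.0.0.0:8000/v1/metrics", timeout=5.0):
    """Return the raw text served by the metrics endpoint."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")


def parse_metrics(text):
    """Parse Prometheus text exposition into a {series: value} dict.

    The series key keeps any label set exactly as it appears in the text.
    Assumption: samples carry no trailing timestamp.
    """
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        # Split on the last space so spaces inside label values survive.
        series, _, value = line.rpartition(" ")
        samples[series] = float(value)
    return samples


# Illustrative sample only; real output depends on the deployment.
SAMPLE = """\
# HELP gpu_utilization GPU utilization rate (0.0-1.0)
# TYPE gpu_utilization gauge
gpu_utilization 0.25
request_count_total 12.0
"""

parsed = parse_metrics(SAMPLE)
```

Against a running container with metrics enabled, `parse_metrics(fetch_metrics())` would return the live values in the same shape.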

The following table describes the available metrics.

Category	Metric Name	Description
GPU	gpu_power_usage_watts	GPU instantaneous power, in watts
GPU	gpu_power_limit_watts	Maximum GPU power limit, in watts
GPU	gpu_total_energy_consumption_joules	GPU total energy consumption, in joules
GPU	gpu_utilization	GPU utilization rate (0.0–1.0)
GPU	gpu_memory_total_bytes	Total GPU memory, in bytes
GPU	gpu_memory_used_bytes	Used GPU memory, in bytes
Process	process_virtual_memory_bytes	Virtual memory size, in bytes
Process	process_resident_memory_bytes	Resident memory size, in bytes
Process	process_start_time_seconds	Process start time, in seconds since the Unix epoch
Process	process_cpu_seconds_total	Total user and system CPU time spent, in seconds
Process	process_open_fds	Number of open file descriptors
Process	process_max_fds	Maximum number of open file descriptors
Python	python_gc_objects_collected_total	Objects collected during GC
Python	python_gc_objects_uncollectable_total	Uncollectable objects found during GC
Python	python_gc_collections_total	Number of times this generation was collected
Python	python_info	Python platform information
Request	request_count_total	Total number of requests since container was launched
Request	request_latency_seconds	Request latency (histogram)
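
Several of these metrics are most useful in combination. The sketch below derives two dashboard-style values from a dictionary of already-parsed samples: the fraction of GPU memory in use, and the mean request latency. It rests on assumptions: the function names are hypothetical, the series are assumed to be unlabeled, the numbers are invented, and the `request_latency_seconds_sum` and `request_latency_seconds_count` series are the companions that Prometheus histograms conventionally expose, which this page does not list explicitly.

```python
def memory_used_fraction(samples):
    """Fraction of GPU memory in use (0.0 to 1.0)."""
    used = samples["gpu_memory_used_bytes"]
    total = samples["gpu_memory_total_bytes"]
    return used / total if total else 0.0


def mean_request_latency(samples):
    """Mean latency in seconds since launch, or None if no requests yet.

    Assumes the histogram exposes the conventional _sum and _count series.
    """
    count = samples.get("request_latency_seconds_count", 0.0)
    if count == 0:
        return None
    return samples["request_latency_seconds_sum"] / count


# Invented values for illustration only.
example = {
    "gpu_memory_used_bytes": 20e9,
    "gpu_memory_total_bytes": 80e9,
    "request_latency_seconds_sum": 6.0,
    "request_latency_seconds_count": 12.0,
}

mem_fraction = memory_used_fraction(example)
avg_latency = mean_request_latency(example)
```

If the deployment attaches labels to these series (for example, one series per GPU), the dictionary keys would need to include the label set.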