Observability#
NVIDIA NIM for Visual Generative AI (Visual GenAI NIM) supports exporting Triton metrics through a Prometheus endpoint.
Triton Metrics#
Triton exposes its metrics in Prometheus format on port 8002.
The following table describes the metrics available at http://localhost:8002/metrics.
| Category | Metric | Metric Name | Description |
|---|---|---|---|
| Count | Success Request Count | | Number of successful inference requests |
| Count | Failure Request Count | | Number of failed inference requests |
| Count | Total Request Count | | Number of inferences performed |
| Count | Request Duration | | Cumulative inference request duration in microseconds |
| Count | Queue Duration | | Cumulative inference queuing duration in microseconds |
| Count | Inference Duration | | Cumulative compute inference duration in microseconds |
| Gauge | GPU Utilization | | GPU utilization rate [0.0 - 1.0) |
| Gauge | Total GPU Memory | | GPU total memory, in bytes |
| Gauge | Used GPU Memory | | GPU used memory, in bytes |
| Gauge | CPU Utilization | | CPU utilization rate [0.0 - 1.0] |
| Gauge | Total CPU Memory | | CPU total memory (RAM), in bytes |
| Gauge | Used CPU Memory | | CPU used memory (RAM), in bytes |
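Because the duration metrics are cumulative counters, a per-request average has to be derived by dividing a duration counter by a request counter. The following is a minimal Python sketch of that arithmetic on the Prometheus text format. The metric names `nv_inference_request_duration_us` and `nv_inference_request_success` are assumptions taken from Triton's metrics documentation, since the table above does not list them; confirm them against the output of http://localhost:8002/metrics. The sample text and the model label are hypothetical, and the parser assumes samples carry no trailing timestamp.

```python
# Assumed metric names (from Triton's metrics documentation); verify them
# against the actual output of http://localhost:8002/metrics.
DURATION_METRIC = "nv_inference_request_duration_us"
SUCCESS_METRIC = "nv_inference_request_success"


def sum_metric(metrics_text: str, name: str) -> float:
    """Sum every sample of `name` across all label sets."""
    total = 0.0
    for line in metrics_text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        # A sample looks like `name{label="x"} value` or `name value`.
        sample_name = line.split("{", 1)[0].split(" ", 1)[0]
        if sample_name == name:
            # Assumes no trailing timestamp, so the last token is the value.
            total += float(line.rsplit(" ", 1)[1])
    return total


def average_request_latency_us(metrics_text: str) -> float:
    """Cumulative request duration divided by successful request count."""
    requests = sum_metric(metrics_text, SUCCESS_METRIC)
    if requests == 0:
        return 0.0
    return sum_metric(metrics_text, DURATION_METRIC) / requests


# Hypothetical sample of the text served at /metrics.
SAMPLE = """\
# TYPE nv_inference_request_success counter
nv_inference_request_success{model="example",version="1"} 4
# TYPE nv_inference_request_duration_us counter
nv_inference_request_duration_us{model="example",version="1"} 2000
"""

print(average_request_latency_us(SAMPLE))  # 500.0
```

This average covers the whole lifetime of the server. For a windowed average, take the difference between two scrapes of each counter before dividing, or let Prometheus compute it from the scraped series.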
Prometheus#
To install Prometheus for scraping metrics from NIM, download the Prometheus release appropriate for your system. The commands below use v2.52.0 for Linux amd64 as an example.
wget https://github.com/prometheus/prometheus/releases/download/v2.52.0/prometheus-2.52.0.linux-amd64.tar.gz
tar -xvzf prometheus-2.52.0.linux-amd64.tar.gz
cd prometheus-2.52.0.linux-amd64/
Edit the Prometheus configuration file to scrape from the NIM endpoint. Make sure the targets field points to localhost:8002.
vim prometheus.yml
# A scrape configuration containing exactly one endpoint to scrape:
# here it is the Triton metrics endpoint exposed by NIM.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: "prometheus"
    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.
    static_configs:
      - targets: ["localhost:8002"]
Next, run the Prometheus server:
./prometheus --config.file=./prometheus.yml
Use a browser to check that the Prometheus server detected the NIM target: open http://localhost:9090/targets?search= and confirm that the target is listed.
You can also click on the NIM target URL link to explore generated metrics.
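The same check can be scripted against the Prometheus HTTP API, which serves target status at `/api/v1/targets`. The sketch below is illustrative rather than part of NIM: the function names are hypothetical, and it assumes the default `instance` label, which Prometheus sets to the target address (`localhost:8002` here).

```python
import json
import urllib.request
from typing import List


def fetch_targets(prometheus_url: str = "http://localhost:9090") -> dict:
    """GET /api/v1/targets from the Prometheus HTTP API and decode the JSON."""
    url = f"{prometheus_url}/api/v1/targets"
    with urllib.request.urlopen(url, timeout=5) as resp:
        return json.load(resp)


def target_health(payload: dict, address: str = "localhost:8002") -> List[str]:
    """Return the reported health of each active target whose instance is `address`."""
    targets = payload.get("data", {}).get("activeTargets", [])
    return [
        target.get("health", "unknown")
        for target in targets
        if target.get("labels", {}).get("instance") == address
    ]


# Hypothetical, trimmed response body used to illustrate the expected shape.
SAMPLE_PAYLOAD = {
    "status": "success",
    "data": {
        "activeTargets": [
            {
                "labels": {"instance": "localhost:8002", "job": "prometheus"},
                "scrapeUrl": "http://localhost:8002/metrics",
                "health": "up",
            }
        ]
    },
}

print(target_health(SAMPLE_PAYLOAD))  # ['up']
```

Against a running server, call `target_health(fetch_targets())`. An empty list means Prometheus has no active target at that address, which usually points to a problem in `prometheus.yml`.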
Grafana#
You can use Grafana to build dashboards for NIM metrics. Install the Grafana release appropriate for your system. The commands below use v11.0.0 for Linux amd64 as an example.
wget https://dl.grafana.com/oss/release/grafana-11.0.0.linux-amd64.tar.gz
tar -zxvf grafana-11.0.0.linux-amd64.tar.gz
Run the Grafana server:
cd grafana-v11.0.0/
./bin/grafana-server
To access the Grafana dashboard, point your browser to http://localhost:3000. Log in with the default credentials:
username: admin
password: admin
The first step is to configure the data source that Grafana queries for metrics. Click the Data Source button, select Prometheus, and specify the Prometheus URL http://localhost:9090. After you save the configuration, you should see a success message. You are then ready to create a dashboard with metrics from NIM.
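The data source can also be registered without the UI through Grafana's HTTP API (`POST /api/datasources`). The following is a minimal sketch, assuming the default `admin`/`admin` credentials are still in place and that basic authentication is enabled. The function name and the data source name `NIM Prometheus` are hypothetical.

```python
import base64
import json
import urllib.request


def build_datasource_request(
    grafana_url: str = "http://localhost:3000",
    prometheus_url: str = "http://localhost:9090",
    user: str = "admin",
    password: str = "admin",
    name: str = "NIM Prometheus",  # hypothetical display name
) -> urllib.request.Request:
    """Build (but do not send) a request that registers Prometheus in Grafana."""
    payload = {
        "name": name,
        "type": "prometheus",
        "url": prometheus_url,
        # "proxy" makes the Grafana server, not the browser, query Prometheus.
        "access": "proxy",
    }
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return urllib.request.Request(
        f"{grafana_url}/api/datasources",
        data=json.dumps(payload).encode(),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
        method="POST",
    )


request = build_datasource_request()
print(request.get_method(), request.full_url)
```

To send the request to a running Grafana server, pass it to `urllib.request.urlopen(request)`. If you changed the admin password at first login, pass the new value as `password`.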