NV-CLIP NIM supports exporting metrics and traces in an OpenTelemetry-compatible format.

Additionally, the underlying Triton service exposes its own metrics through a Prometheus endpoint.

To collect these metrics and traces, export them to a running OpenTelemetry Collector instance, which can then export them to any OTLP-compatible backend.

Metrics

You can collect metrics from both the NVIDIA NIM for NV-CLIP container and underlying Triton instance.

Triton Metrics

Triton exposes its metrics on port 8002 in Prometheus format. To collect these metrics, use a Prometheus receiver to scrape the Triton endpoint and export them in an OpenTelemetry compatible format. See the following example for details.

The following table describes the available metrics at http://localhost:8002/metrics .

| Category | Metric | Metric Name | Description |
|----------|--------|-------------|-------------|
| Count | Success Request Count | nv_inference_request_success | Number of successful inference requests |
| Count | Failure Request Count | nv_inference_request_failure | Number of failed inference requests |
| Count | Total Request Count | nv_inference_count | Number of inferences performed |
| Count | Request Duration | nv_inference_request_duration_us | Cumulative inference request duration in microseconds |
| Count | Queue Duration | nv_inference_queue_duration_us | Cumulative inference queuing duration in microseconds |
| Count | Inference Duration | nv_inference_compute_infer_duration_us | Cumulative compute inference duration in microseconds |
| Gauge | GPU Utilization | nv_gpu_utilization | GPU utilization rate [0.0 - 1.0] |
| Gauge | Total GPU Memory | nv_gpu_memory_total_bytes | GPU total memory |
| Gauge | Used GPU Memory | nv_gpu_memory_used_bytes | GPU used memory |
| Gauge | CPU Utilization | nv_cpu_utilization | CPU utilization rate [0.0 - 1.0] |
| Gauge | Total CPU Memory | nv_cpu_memory_total_bytes | CPU total memory (RAM) |
| Gauge | Used CPU Memory | nv_cpu_memory_used_bytes | CPU used memory (RAM) |
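Because the duration metrics are cumulative, a per-request average has to be derived by dividing a duration counter by a request counter. The following Python sketch shows one way to do that from the text served at the metrics endpoint. The helper names (`sum_by_metric`, `average_us`), the label names, the model name, and the sample values are illustrative assumptions, not output captured from a real NV-CLIP deployment.

```python
from collections import defaultdict

# Hypothetical sample in the Prometheus text format.
# The label names, model name, and values are made up for illustration.
SAMPLE = """\
# HELP nv_inference_request_success Number of successful inference requests
# TYPE nv_inference_request_success counter
nv_inference_request_success{model="example_model",version="1"} 40
nv_inference_request_duration_us{model="example_model",version="1"} 2000000
nv_inference_queue_duration_us{model="example_model",version="1"} 100000
"""


def sum_by_metric(text):
    """Sum sample values per metric name, ignoring labels and comment lines.

    Summing across label sets is only meaningful for counters,
    not for gauges such as utilization rates.
    """
    totals = defaultdict(float)
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        if "{" in line:
            # Form: name{labels} value [timestamp]
            name, rest = line.split("{", 1)
            value = rest.rsplit("}", 1)[1].split()[0]
        else:
            # Form: name value [timestamp]
            fields = line.split()
            name, value = fields[0], fields[1]
        totals[name] += float(value)
    return totals


def average_us(totals, duration_metric, count_metric):
    """Return cumulative duration divided by request count, or None if no requests."""
    count = totals.get(count_metric, 0.0)
    if count == 0:
        return None
    return totals.get(duration_metric, 0.0) / count


totals = sum_by_metric(SAMPLE)
avg_request_us = average_us(
    totals, "nv_inference_request_duration_us", "nv_inference_request_success"
)
avg_queue_us = average_us(
    totals, "nv_inference_queue_duration_us", "nv_inference_request_success"
)
print(f"average request duration: {avg_request_us} us")
print(f"average queue duration: {avg_queue_us} us")
```

To run this against a live instance, replace `SAMPLE` with the body returned by an HTTP GET to http://localhost:8002/metrics, for example by using `urllib.request.urlopen`. Averages computed this way cover the whole lifetime of the server. For a recent window, take the difference between two scrapes before dividing.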

Prometheus

To install Prometheus for scraping metrics from NIM, download the latest Prometheus version appropriate for your system. The following commands use v2.52.0 for Linux on amd64 as an example.

    wget https://github.com/prometheus/prometheus/releases/download/v2.52.0/prometheus-2.52.0.linux-amd64.tar.gz
    tar -xvzf prometheus-2.52.0.linux-amd64.tar.gz
    cd prometheus-2.52.0.linux-amd64/

Edit the Prometheus configuration file to scrape from the NIM endpoint. Make sure the targets field points to localhost:8002.

    vim prometheus.yml

The scrape configuration in prometheus.yml should look like the following:

    # A scrape configuration containing exactly one endpoint to scrape:
    # Here it's Prometheus itself.
    scrape_configs:
      # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
      - job_name: "prometheus"

        # metrics_path defaults to '/metrics'
        # scheme defaults to 'http'.

        static_configs:
          - targets: ["localhost:8002"]

Next, run the Prometheus server.

    ./prometheus --config.file=./prometheus.yml

Open http://localhost:9090/targets?search= in a browser to check that the Prometheus server detected the NIM target. You can also click the NIM target URL link to explore the generated metrics.
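The target check can also be scripted rather than done in a browser. The following Python sketch reads the JSON returned by the Prometheus HTTP API endpoint `/api/v1/targets` and reports the health of one scrape target. The helper names (`target_health`, `fetch_targets`) and the sample payload are illustrative assumptions. The payload is trimmed to the fields the sketch reads, and a real response contains more.

```python
import json
import urllib.request

# Hypothetical, trimmed example of a /api/v1/targets response.
SAMPLE_PAYLOAD = {
    "status": "success",
    "data": {
        "activeTargets": [
            {
                "labels": {"instance": "localhost:8002", "job": "prometheus"},
                "scrapeUrl": "http://localhost:8002/metrics",
                "health": "up",
                "lastError": "",
            }
        ],
        "droppedTargets": [],
    },
}


def target_health(payload, instance):
    """Return the health string of the active target with this instance label.

    Returns None when no active target matches.
    """
    for target in payload.get("data", {}).get("activeTargets", []):
        if target.get("labels", {}).get("instance") == instance:
            return target.get("health")
    return None


def fetch_targets(base_url="http://localhost:9090"):
    """Fetch the target list from a running Prometheus server (not called here)."""
    with urllib.request.urlopen(f"{base_url}/api/v1/targets", timeout=5) as resp:
        return json.load(resp)


health = target_health(SAMPLE_PAYLOAD, "localhost:8002")
print(f"localhost:8002 health: {health}")
```

With the Prometheus server from the steps above running, `target_health(fetch_targets(), "localhost:8002")` should return `"up"` once the first scrape succeeds. A value of `"down"` or `None` suggests checking the targets field in prometheus.yml and whether port 8002 is reachable.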