Observability
About Observability
NeMo Retriever Text Embedding NIM supports exporting metrics and traces in an OpenTelemetry-compatible format. Additionally, the microservice and its underlying NVIDIA Triton Inference Server expose metrics through Prometheus endpoints.
To collect these metrics and traces, export them to a running OpenTelemetry Collector instance, which can then export them to any OTLP-compatible backend.
Metrics and Traces
You can collect metrics from both the NIM microservice and the Triton Inference Server instance.
The following environment variables are related to exporting OpenTelemetry metrics and traces from the NIM microservice.
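| Variable | Description |
|---|---|
| `OTEL_SERVICE_NAME` | Specifies the name of the service to use in the exported metrics. |
| `OTEL_EXPORTER_OTLP_ENDPOINT` | Specifies the endpoint of an OTLP gRPC receiver. |
| `OTEL_METRICS_EXPORTER` | Set to `otlp` to export metrics to the specified `OTEL_EXPORTER_OTLP_ENDPOINT`. |
| `OTEL_TRACES_EXPORTER` | Set to `otlp` to export traces to the specified `OTEL_EXPORTER_OTLP_ENDPOINT`. |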
The NIM microservice and Triton Inference Server also expose metrics in Prometheus format.
You can access the NIM microservice metrics through its API at <nim-host>:8000/v1/metrics, and the Triton Inference Server metrics at <nim-host>:8002/metrics.
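As a quick way to confirm that both Prometheus endpoints respond, the following sketch builds the two URLs from a host name and prints the first lines of each response. The helper names (`nim_metrics_url`, `triton_metrics_url`, `show_metrics`) are hypothetical, and the sketch assumes the endpoints are served over plain HTTP and that `curl` is installed.

```shell
# Hypothetical helpers; the http:// scheme is an assumption, not stated by the docs.
nim_metrics_url()    { printf 'http://%s:8000/v1/metrics\n' "$1"; }
triton_metrics_url() { printf 'http://%s:8002/metrics\n' "$1"; }

# Print the first lines of both endpoints for the given host.
# Requires curl and a running NIM container on that host.
show_metrics() {
  curl -sf "$(nim_metrics_url "$1")" | head -n 10
  curl -sf "$(triton_metrics_url "$1")" | head -n 10
}
```

For example, `show_metrics localhost` queries a container running on the local machine.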
Enabling OpenTelemetry
The following example requires that an OpenTelemetry Collector gRPC receiver is running at <opentelemetry-collector-host> on port <opentelemetry-collector-grpc-port>.
export IMG_NAME=nvcr.io/nim/nvidia/llama-3.2-nv-embedqa-1b-v2
export IMG_TAG=1.3.0
# Choose a container name for bookkeeping
export CONTAINER_NAME=$(basename $IMG_NAME)
# Set the OTEL environment variables to enable metrics and trace exporting
export OTEL_SERVICE_NAME=$CONTAINER_NAME
export OTEL_METRICS_EXPORTER=otlp
export OTEL_TRACES_EXPORTER=otlp
export OTEL_EXPORTER_OTLP_ENDPOINT="http://<opentelemetry-collector-host>:<opentelemetry-collector-grpc-port>"
docker run --runtime=nvidia -it --rm --name=$CONTAINER_NAME \
  ... \
  -e OTEL_SERVICE_NAME \
  -e OTEL_METRICS_EXPORTER \
  -e OTEL_TRACES_EXPORTER \
  -e OTEL_EXPORTER_OTLP_ENDPOINT \
  ... \
  $IMG_NAME:$IMG_TAG
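Because `docker run -e VAR` without a value copies the variable from the host environment, a variable that was never exported is silently dropped. The following sketch is a pre-flight check that could be run before `docker run`. The function name `check_otel_env` is hypothetical, and the sketch assumes `printenv` is available on the host.

```shell
# Report any OTEL variable that is missing from the exported environment.
# Returns non-zero if at least one variable is unset or empty.
check_otel_env() {
  missing=0
  for name in OTEL_SERVICE_NAME OTEL_METRICS_EXPORTER \
              OTEL_TRACES_EXPORTER OTEL_EXPORTER_OTLP_ENDPOINT; do
    # printenv only sees exported variables, which is what docker -e forwards.
    if [ -z "$(printenv "$name")" ]; then
      echo "missing: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

One possible usage is `check_otel_env && docker run ...`, so the container starts only when all four variables are exported.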
Receiving and Exporting Telemetry Data
The following OpenTelemetry Collector configuration enables both metrics and tracing exports.
Two receivers are defined:
- An OTLP receiver that receives both metrics and trace data from the NIM microservice. 
- A Prometheus receiver that scrapes Triton Inference Server metrics. 
Three exporters are defined:
- A Zipkin exporter that exports to a running Zipkin instance. 
- An OTLP gRPC exporter that exports to a downstream collector or backend, such as Datadog. 
- A debug exporter that prints received data to the console. This exporter is helpful for testing and development purposes. 
Traces are received exclusively by the OTLP receiver and exported by both the Zipkin and debug exporters. Metrics are received by the OTLP and Prometheus receivers. The metrics are exported by the OTLP and debug exporters.
receivers:
  otlp:
    protocols:
      grpc:
      http:
        cors:
          allowed_origins:
            - "*"
  prometheus:
    config:
      scrape_configs:
        - job_name: nim-triton-metrics
          scrape_interval: 10s
          static_configs:
            - targets: ["<nim-endpoint>:8002"]
exporters:
  zipkin:
    endpoint: "<zipkin-endpoint>:<zipkin-port>/api/v2/spans"
  otlp:
    endpoint: "<otlp-metrics-endpoint>:<otlp-metrics-port>"
    tls:
      insecure: true
  # NOTE: Prior to v0.86.0 use `logging` instead of `debug`.
  debug:
    verbosity: detailed
service:
  pipelines:
    traces:
      receivers: [otlp]
      exporters: [debug, zipkin]
    metrics:
      receivers: [otlp, prometheus]
      exporters: [debug, otlp]