Copyright © 2026, NVIDIA Corporation.


Utils Container Metrics

| Metric name | Metric type | Source | Description | Unit (where applicable) | Interesting Labels | Required Filters (where applicable) |
| --- | --- | --- | --- | --- | --- | --- |
| kube_pod_container_status_restarts_total | Counter | prometheus-kube-state-metrics:8080/metrics | Pod restart count | | function_id, function_version_id, namespace, pod | container="utils" |
| kube_pod_container_status_terminated_reason | Gauge | prometheus-kube-state-metrics:8080/metrics | Reason pod was terminated | | namespace, pod, reason | container="utils" |
| kube_pod_container_status_waiting_reason | Gauge | prometheus-kube-state-metrics:8080/metrics | Reason pod is waiting to start | | namespace, pod | container="utils" |
| nvcf_worker_service_response_total | Gauge | prometheus-agent-metrics-utils:8010/metrics | | | | |
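
The required filter in the table can be written as a PromQL label matcher. The following is a minimal sketch, assuming the metrics are queried through a Prometheus-compatible backend. The helper `selector`, the one-hour window, and the aggregation labels are illustrative choices, not part of the product.

```python
def selector(metric: str, **labels: str) -> str:
    """Build a PromQL instant-vector selector such as metric{key="value"}.

    Hypothetical helper for illustration; labels are sorted so the
    output is deterministic.
    """
    matchers = ",".join(f'{key}="{value}"' for key, value in sorted(labels.items()))
    return f"{metric}{{{matchers}}}"


# Metric names and the container="utils" filter come from the table above.
restarts = selector("kube_pod_container_status_restarts_total", container="utils")
terminated = selector("kube_pod_container_status_terminated_reason", container="utils")

# Restarts of the utils container per pod over the last hour (window is an assumption).
restarts_per_pod = f"sum by (namespace, pod) (increase({restarts}[1h]))"

# Termination reasons of the utils container, grouped by the `reason` label.
terminated_by_reason = f"sum by (namespace, pod, reason) ({terminated})"

if __name__ == "__main__":
    print(restarts_per_pod)
    print(terminated_by_reason)
```

The printed strings can be pasted into a Prometheus or Grafana query editor. Narrowing to a single function would mean adding a `function_id` matcher, which the table lists as a label only for the restart counter.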