
Copyright © 2026, NVIDIA Corporation.


Invocation Service Metrics

| Metric name | Metric type | Source | Description | Unit (where applicable) | Interesting Labels | Required Filters (where applicable) |
| --- | --- | --- | --- | --- | --- | --- |
| `axum_http_requests_total` | Counter | nvcf-invocation-service:41337/metrics | Invocation Service HTTP request count | | `status` | `exported_endpoint=~"^/v2/nvcf.*\|^/health$"`, `namespace="astro-tenant-nvcf-invocation-service"` |
| `axum_http_requests_duration_seconds_count` | Counter | nvcf-invocation-service:41337/metrics | Invocation Service HTTP request count | | `status` | `exported_endpoint=~"^/v2/nvcf.*\|^/health$"`, `namespace="astro-tenant-nvcf-invocation-service"` |
| `axum_http_requests_duration_seconds_sum` | Counter | nvcf-invocation-service:41337/metrics | Invocation Service HTTP request durations | seconds | `status` | `exported_endpoint=~"^/v2/nvcf.*\|^/health$"`, `namespace="astro-tenant-nvcf-invocation-service"` |
| `app_invocation_error` | Counter | nvcf-invocation-service:41337/metrics | Invocation Service invocation errors | | `http_status_code` | `namespace="astro-tenant-nvcf-invocation-service"` |
| `container_cpu_usage_seconds_total` | Counter | nvcf-invocation-service:41337/metrics | Container CPU usage (used for uptime calculation) | | `container` | `container="nvcf-invocation-service"` |
| `container_memory_usage_bytes` | Gauge | nvcf-invocation-service:41337/metrics | Container memory usage | bytes | `container` | `container="nvcf-invocation-service"` |
| `aws_requests_status` | Gauge | AWS CloudWatch | AWS request status | | `aws_status_code` | `namespace="astro-tenant-nvcf-invocation-service"`, `service="nvcf-invocation-service"` |
| `nats_jetstream_publish` | Counter | Pushed from Synadia | NATS Stream creation count | | `namespace` | `namespace="astro-tenant-nvcf-invocation-service"` |
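
The required filters in the table are meant to be applied as label matchers when querying these metrics. The following Python helper is a minimal sketch of how two common PromQL expressions (request rate by status, and mean request duration) could be assembled from the metric names and filters listed above. The function names, the `5m` rate window, and the choice of aggregations are illustrative assumptions and are not part of the documented product; adjust them to match your own dashboards.

```python
# Sketch only: builds PromQL strings from the metric names and required
# filters listed in the table. Helper names and the default window are
# hypothetical, chosen for illustration.

NAMESPACE = "astro-tenant-nvcf-invocation-service"
ENDPOINT_REGEX = "^/v2/nvcf.*|^/health$"

# Required filters for the axum_http_* metrics, as given in the table.
HTTP_FILTERS = (
    f'exported_endpoint=~"{ENDPOINT_REGEX}", namespace="{NAMESPACE}"'
)


def selector(metric: str, filters: str) -> str:
    """Return a PromQL instant-vector selector: metric{filters}."""
    return metric + "{" + filters + "}"


def request_rate_by_status(window: str = "5m") -> str:
    """Per-second HTTP request rate, grouped by the `status` label."""
    sel = selector("axum_http_requests_total", HTTP_FILTERS)
    return f"sum by (status) (rate({sel}[{window}]))"


def mean_request_duration_seconds(window: str = "5m") -> str:
    """Mean request duration: rate of the _sum series over rate of _count."""
    total = selector("axum_http_requests_duration_seconds_sum", HTTP_FILTERS)
    count = selector("axum_http_requests_duration_seconds_count", HTTP_FILTERS)
    return f"sum(rate({total}[{window}])) / sum(rate({count}[{window}]))"


if __name__ == "__main__":
    print(request_rate_by_status())
    print(mean_request_duration_seconds())
```

The generated strings can be pasted into any PromQL-compatible query editor. Dividing the rate of `_sum` by the rate of `_count` is the standard way to derive an average from a pair of Prometheus counters; it gives a mean, not a percentile.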