
Struct Cache Telemetry Event


Generated from cargo doc --no-deps -p nemo-relay -p nemo-relay-adaptive -p nemo-relay-ffi.

pub struct CacheTelemetryEvent {
    pub request_id: Uuid,
    pub agent_identity: AgentIdentity,
    pub cache_read_tokens: u64,
    pub cache_creation_tokens: u64,
    pub total_prompt_tokens: u64,
    pub hit_rate: f64,
    pub miss_reason: Option<CacheMissReason>,
    pub miss_diagnosis: Option<CacheMissDiagnosis>,
    pub provider: String,
    pub timestamp: DateTime<Utc>,
}

Per-call cache telemetry event.

Captures provider-agnostic cache metrics for a single LLM request. The agent_identity field cross-references the Phase 3 AgentIdentity type for per-agent grouping.
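The per-agent grouping mentioned above could be sketched as follows. This is a minimal illustration, not the crate's implementation: `EventSketch` is a hypothetical, trimmed-down mirror of `CacheTelemetryEvent`, the agent identity is reduced to a plain `String` (the real `AgentIdentity` type is not shown on this page), and the aggregate rate is assumed to be total cache-read tokens divided by total prompt tokens.

```rust
use std::collections::BTreeMap;

// Hypothetical, simplified mirror of CacheTelemetryEvent for illustration only.
struct EventSketch {
    agent: String, // stand-in for the real AgentIdentity type
    cache_read_tokens: u64,
    total_prompt_tokens: u64,
}

// Sums token counts per agent, then derives one aggregate hit rate per agent.
fn hit_rate_by_agent(events: &[EventSketch]) -> BTreeMap<String, f64> {
    let mut totals: BTreeMap<String, (u64, u64)> = BTreeMap::new();
    for e in events {
        let entry = totals.entry(e.agent.clone()).or_insert((0, 0));
        entry.0 += e.cache_read_tokens;
        entry.1 += e.total_prompt_tokens;
    }
    totals
        .into_iter()
        .map(|(agent, (read, total))| {
            // Mirror the documented zero-total behavior: report 0.0, never divide by zero.
            let rate = if total == 0 { 0.0 } else { read as f64 / total as f64 };
            (agent, rate)
        })
        .collect()
}

fn main() {
    let events = vec![
        EventSketch { agent: "planner".into(), cache_read_tokens: 100, total_prompt_tokens: 400 },
        EventSketch { agent: "planner".into(), cache_read_tokens: 300, total_prompt_tokens: 400 },
        EventSketch { agent: "coder".into(), cache_read_tokens: 0, total_prompt_tokens: 0 },
    ];
    for (agent, rate) in hit_rate_by_agent(&events) {
        println!("{agent}: {rate}");
    }
}
```

Summing tokens before dividing weights each request by its prompt size, which avoids small requests skewing the aggregate the way an average of per-request rates would.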

Fields

request_id: Uuid

Request ID this telemetry pertains to.

agent_identity: AgentIdentity

Identity of the agent that issued the request.

cache_read_tokens: u64

Number of tokens served from cache.

cache_creation_tokens: u64

Number of tokens written to cache.

total_prompt_tokens: u64

Total prompt tokens, used as the denominator in the hit rate calculation.

hit_rate: f64

Computed cache hit rate, in the range [0.0, 1.0].

miss_reason: Option<CacheMissReason>

Reason for cache miss, if applicable.

miss_diagnosis: Option<CacheMissDiagnosis>

Structured miss diagnosis, when the miss can be justified safely.

provider: String

Provider name (e.g., “anthropic”, “openai”).

timestamp: DateTime<Utc>

When this telemetry was recorded.

Implementations

impl CacheTelemetryEvent

compute_hit_rate

pub fn compute_hit_rate(cache_read_tokens: u64, total_prompt_tokens: u64) -> f64

Computes hit rate from token counts. Returns 0.0 if total_prompt_tokens is zero to avoid division by zero.
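The behavior described above can be sketched as a standalone function. This is an assumption-laden illustration rather than the crate's source: `hit_rate_sketch` is a hypothetical name, the ratio `cache_read_tokens / total_prompt_tokens` is inferred from the parameter names, and the final clamp is an added safeguard that the documentation does not mention.

```rust
// Hypothetical stand-in for CacheTelemetryEvent::compute_hit_rate.
// Assumption: the rate is cache_read_tokens divided by total_prompt_tokens.
fn hit_rate_sketch(cache_read_tokens: u64, total_prompt_tokens: u64) -> f64 {
    // Documented behavior: a zero total yields 0.0 instead of dividing by zero.
    if total_prompt_tokens == 0 {
        return 0.0;
    }
    let rate = cache_read_tokens as f64 / total_prompt_tokens as f64;
    // Assumption: clamp so a malformed report (reads > total) stays in [0.0, 1.0].
    rate.clamp(0.0, 1.0)
}

fn main() {
    println!("{}", hit_rate_sketch(750, 1000));
    println!("{}", hit_rate_sketch(0, 0));
}
```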

from_usage

pub fn from_usage(
    request_id: Uuid,
    agent_identity: AgentIdentity,
    provider: CacheTelemetryProvider,
    usage: &Usage,
    timestamp: DateTime<Utc>,
    request_facts: Option<&CacheRequestFacts>,
) -> Option<Self>

Builds a canonical cache telemetry event from normalized usage fields.

Returns None when the normalized usage payload does not contain prompt_tokens, because Phase 10 does not invent missing totals.
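The "no invented totals" rule can be illustrated with a simplified constructor. Everything here is hypothetical: `UsageSketch` and `EventSketch` are stand-ins for the real `Usage` and `CacheTelemetryEvent` types, whose full layout is not shown on this page, and treating missing cache counters as zero is an assumption made only for this sketch.

```rust
// Hypothetical stand-in for the normalized Usage payload.
#[derive(Debug, Clone, PartialEq)]
struct UsageSketch {
    prompt_tokens: Option<u64>,
    cache_read_tokens: Option<u64>,
    cache_creation_tokens: Option<u64>,
}

// Hypothetical, trimmed-down mirror of CacheTelemetryEvent.
#[derive(Debug, Clone, PartialEq)]
struct EventSketch {
    cache_read_tokens: u64,
    cache_creation_tokens: u64,
    total_prompt_tokens: u64,
    hit_rate: f64,
}

fn event_from_usage_sketch(usage: &UsageSketch) -> Option<EventSketch> {
    // No prompt total reported: return None instead of guessing one.
    let total = usage.prompt_tokens?;
    // Assumption: absent cache counters are treated as zero.
    let read = usage.cache_read_tokens.unwrap_or(0);
    let created = usage.cache_creation_tokens.unwrap_or(0);
    let hit_rate = if total == 0 { 0.0 } else { read as f64 / total as f64 };
    Some(EventSketch {
        cache_read_tokens: read,
        cache_creation_tokens: created,
        total_prompt_tokens: total,
        hit_rate,
    })
}

fn main() {
    let missing = UsageSketch { prompt_tokens: None, cache_read_tokens: Some(50), cache_creation_tokens: None };
    let present = UsageSketch { prompt_tokens: Some(200), cache_read_tokens: Some(50), cache_creation_tokens: None };
    println!("{:?}", event_from_usage_sketch(&missing));
    println!("{:?}", event_from_usage_sketch(&present));
}
```

Callers of a constructor shaped like this would typically skip recording telemetry on `None`, for example with `if let Some(event) = ...`, rather than substituting a default event.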

Trait Implementations

impl Clone for CacheTelemetryEvent

clone

fn clone(&self) -> CacheTelemetryEvent

clone_from

fn clone_from(&mut self, source: &Self)

impl Debug for CacheTelemetryEvent

fmt

fn fmt(&self, f: &mut Formatter<'_>) -> Result

impl<'de> Deserialize<'de> for CacheTelemetryEvent

deserialize

fn deserialize<__D>(__deserializer: __D) -> Result<Self, __D::Error>
where
    __D: Deserializer<'de>,

impl PartialEq for CacheTelemetryEvent

eq

fn eq(&self, other: &CacheTelemetryEvent) -> bool

ne

fn ne(&self, other: &Rhs) -> bool

impl Serialize for CacheTelemetryEvent

serialize

fn serialize<__S>(&self, __serializer: __S) -> Result<__S::Ok, __S::Error>
where
    __S: Serializer,

impl StructuralPartialEq for CacheTelemetryEvent