For AI agents: a documentation index is available at the root level at /llms.txt and /llms-full.txt. Append /llms.txt to any URL for a page-level index, or .md for the markdown version of any page.
  • About NVIDIA NeMo Relay
    • Overview
    • Architecture
    • Ecosystem
    • Concepts
    • Release Notes
  • Getting Started
    • Agent Runtime Primer
    • Prerequisites
    • Installation
    • Configuration / Setup
    • Quick Start
  • NVIDIA NeMo Relay CLI
    • About
    • Basic Usage
    • Claude Code
    • Codex
    • Cursor
    • Hermes Agent
  • Supported Integrations
    • About
    • OpenClaw Plugin Guide
    • LangChain Integration Guide
    • LangGraph Integration Guide
    • Deep Agents Integration Guide
  • Instrument Applications
    • About
    • Adding Scopes and Marks
    • Instrument a Tool Call
    • Instrument an LLM Call
    • Add Middleware
    • Code Examples
  • Observability Plugin
    • About
    • Configuration
    • Agent Trajectory Interchange Format (ATIF)
    • Agent Trajectory Observability Format (ATOF)
    • OpenTelemetry
    • OpenInference
  • Adaptive Plugin
    • About
    • Configuration
    • Adaptive Cache Governor (ACG)
    • Adaptive Hints
  • NeMo Guardrails Plugin
    • About
    • Configuration
  • Integrate into Frameworks
    • About
    • Adding Scopes
    • Wrap Tool Calls
    • Wrap LLM Calls
    • Handle Non-Serializable Data
    • Using Codecs
    • Provider Codecs
    • Provider Response Codecs
    • Code Examples
  • Build Plugins
    • About
    • Define a Plugin
    • Validate Plugin Configuration
    • Plugin Configuration Files
    • Register Plugin Behavior
    • Design Plugin Configuration
    • NeMo Guardrails Example Plugin
    • Code Examples
  • Contribute
    • About
    • Development Setup
    • Workflow and Reviews
    • Testing and Documentation
  • Reference
    • APIs
      • Python Library Reference
      • Node.js Library Reference
      • Rust Library Reference
        • nemo-relay
        • nemo-relay-adaptive
          • acg
            • anthropic_plugin
            • canonicalize
            • capability
            • ir_builder
            • openai_plugin
              • OpenAICachePlugin
            • passthrough
            • plugin
            • plugin_registry
            • policy
            • profile
            • prompt_ir
            • retention
            • stability
            • telemetry
            • types
            • variable_extractor
            • error
            • MIN_ACG_OBSERVATIONS
          • acg_component
          • acg_learner
          • acg_profile
          • adaptive_hints_intercept
          • cache_diagnostics
          • config
          • context_helpers
          • drain
          • error
          • intercepts
          • learner
          • plugin_component
          • redis
          • storage
          • subscriber
          • tool_parallelism_learner
          • trie
          • types
          • AdaptiveRuntime
        • nemo-relay-ffi
    • Performance
  • Resources
    • Support and FAQs
    • Glossary
    • Troubleshooting Guide
    • Community
    • Legal
NVIDIANVIDIA
Developer-friendly docs for your API
Privacy Policy | Your Privacy Choices | Terms of Service | Accessibility | Corporate Policies | Product Security | Contact

Copyright © 2026, NVIDIA Corporation.

LogoLogo
On this page
  • Threat mitigations
  • Structs
ReferenceAPIsRust Library Referencenemo-relay-adaptiveacg

Module openai_plugin

||View as Markdown|
Previous

Function build_prompt_ir

Next

Struct OpenAI Cache Plugin

Generated from cargo doc --no-deps -p nemo-relay -p nemo-relay-adaptive -p nemo-relay-ffi.

OpenAI cache plugin for the Adaptive Cache Governor (ACG) system.

Maximizes automatic prefix cache hits through deterministic JSON serialization. OpenAI uses automatic prefix caching at 1024+ tokens with exact prefix matching - no explicit annotations are needed. The plugin’s job is to ensure that semantically identical prefixes produce byte-identical JSON so the cache hits rather than misses.

Implements the ProviderPlugin trait with:

  • Tool schema canonicalization: RFC 8785 via [canonicalize_value] for deterministic key ordering in function parameter schemas.
  • Stable message content canonicalization: Structured JSON content blocks in the stable prefix are canonicalized for byte-identical output.
  • No annotations injected: OpenAI handles caching automatically.

Threat mitigations

  • T-08-06: RFC 8785 is a semantic-preserving transform (only reorders keys, normalizes numbers). The plugin canonicalizes tool schemas (structured JSON) but does NOT modify text content in messages.
  • T-08-09: If canonicalization fails for one tool, the plugin reports Degraded (not Applied) and continues with remaining tools.

Structs

  • OpenAICachePlugin: OpenAI-specific provider plugin for deterministic JSON serialization.