For AI agents: a documentation index is available at the root level at /llms.txt and /llms-full.txt. Append /llms.txt to any URL for a page-level index, or .md for the markdown version of any page.
  • About NVIDIA NeMo Relay
    • Overview
    • Architecture
    • Ecosystem
    • Concepts
    • Release Notes
  • Getting Started
    • Agent Runtime Primer
    • Prerequisites
    • Installation
    • Configuration / Setup
    • Quick Start
  • NVIDIA NeMo Relay CLI
    • About
    • Basic Usage
    • Claude Code
    • Codex
    • Cursor
    • Hermes Agent
  • Supported Integrations
    • About
    • OpenClaw Plugin Guide
    • LangChain Integration Guide
    • LangGraph Integration Guide
    • Deep Agents Integration Guide
  • Instrument Applications
    • About
    • Adding Scopes and Marks
    • Instrument a Tool Call
    • Instrument an LLM Call
    • Add Middleware
    • Code Examples
  • Observability Plugin
    • About
    • Configuration
    • Agent Trajectory Interchange Format (ATIF)
    • Agent Trajectory Observability Format (ATOF)
    • OpenTelemetry
    • OpenInference
  • Adaptive Plugin
    • About
    • Configuration
    • Adaptive Cache Governor (ACG)
    • Adaptive Hints
  • NeMo Guardrails Plugin
    • About
    • Configuration
  • Integrate into Frameworks
    • About
    • Adding Scopes
    • Wrap Tool Calls
    • Wrap LLM Calls
    • Handle Non-Serializable Data
    • Using Codecs
    • Provider Codecs
    • Provider Response Codecs
    • Code Examples
  • Build Plugins
    • About
    • Define a Plugin
    • Validate Plugin Configuration
    • Plugin Configuration Files
    • Register Plugin Behavior
    • Design Plugin Configuration
    • NeMo Guardrails Example Plugin
    • Code Examples
  • Contribute
    • About
    • Development Setup
    • Workflow and Reviews
    • Testing and Documentation
  • Reference
    • APIs
    • Performance
  • Resources
    • Support and FAQs
    • Glossary
    • Troubleshooting Guide
    • Community
    • Legal
NVIDIANVIDIA
Developer-friendly docs for your API
Privacy Policy | Your Privacy Choices | Terms of Service | Accessibility | Corporate Policies | Product Security | Contact

Copyright © 2026, NVIDIA Corporation.

LogoLogo
On this page
  • Use Adaptive When
  • Pages
Adaptive Plugin

Adaptive

||View as Markdown|

Use the Adaptive plugin when you want NeMo Relay to collect runtime signals and activate measured adaptive behavior through the shared plugin system.

Adaptive is a first-party plugin component with kind adaptive. It uses the same runtime model as the rest of NeMo Relay: scopes and managed calls emit lifecycle events, subscribers and learners observe those events, intercepts can add guidance, and plugin configuration controls what is active.

The plugin can coordinate:

  • Adaptive state for learned runtime signals.
  • Telemetry subscribers for adaptive learners.
  • Adaptive hints injected into outgoing model requests.
  • Tool-parallelism observation or scheduling behavior.
  • Adaptive Cache Governor (ACG) prompt-cache planning.
  • Component-local validation policy.

Use Adaptive When

Adaptive is useful when an agent workflow repeats similar work and you want to observe or tune behavior without hard-coding tuning logic into every application.

Start here when you need to:

  • Collect runtime signals before changing behavior.
  • Add model-request hints in a controlled way.
  • Plan prompt-cache breakpoints for supported providers.
  • Evaluate tool parallelism opportunities.
  • Share adaptive state across workers when needed.
  • Roll out measured tuning as a configuration change.

If instrumentation is not in place yet, start with Instrument Applications or Integrate into Frameworks.

Pages

  • Adaptive Configuration documents the full plugin component shape, validation, activation, teardown, and whole-plugin settings.
  • ACG explains Adaptive Cache Governor configuration and what prompt cache planning accomplishes.
  • Adaptive Hints explains request hint injection and how downstream model paths can consume the hints.

State, telemetry, tool parallelism, and policy are whole-plugin configuration areas. They are documented on Adaptive Configuration rather than as separate area pages.

Previous

OpenInference

Next

Adaptive Configuration