
Copyright © 2026, NVIDIA Corporation.

Adaptive Configuration


Use this page when you want to configure the built-in Adaptive plugin component as a whole. The component kind is adaptive.

Adaptive plugin configuration uses the generic NeMo Relay plugin document shape. Field names stay snake_case in every binding and in plugins.toml, even when language helper functions use language-native naming conventions.

For plugin file discovery, precedence, merge behavior, editor controls, and gateway conflict rules, see Plugin Configuration Files.

Component Shape

The top-level adaptive object contains:

| Field | Purpose |
| --- | --- |
| version | Adaptive config schema version. Defaults to 1. |
| agent_id | Fallback agent identifier used only when no Agent scope is active, such as gateway-mode requests. Scoped runtime calls use the active Agent scope name instead. |
| state | Adaptive state backend. |
| telemetry | Adaptive subscriber and learner settings. |
| adaptive_hints | Request hint-injection behavior. |
| tool_parallelism | Tool scheduling observation or scheduling behavior. |
| acg | Adaptive Cache Governor prompt-cache planning. |
| policy | Adaptive-local handling for unknown fields and unsupported values. |
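The agent_id fallback rule in the table can be sketched as a small resolver. This is an illustration only: resolve_agent_id and its parameters are hypothetical names chosen for this example, not part of the NeMo Relay API.

```python
from typing import Optional


def resolve_agent_id(
    active_scope_name: Optional[str],
    configured_agent_id: Optional[str],
) -> Optional[str]:
    """Hypothetical helper mirroring the documented agent_id fallback rule."""
    # An active Agent scope always takes precedence over the configured value.
    if active_scope_name is not None:
        return active_scope_name
    # With no Agent scope active (for example, a gateway-mode request),
    # fall back to the agent_id from the adaptive config.
    return configured_agent_id


print(resolve_agent_id("researcher", "planner"))  # scoped runtime call
print(resolve_agent_id(None, "planner"))          # gateway-mode request
```

The point of the sketch is that agent_id never overrides a scope name; it only fills the gap when no scope exists.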

Adaptive Cache Governor (ACG) and Adaptive Hints are covered on their own pages. State, telemetry, tool parallelism, and policy remain whole-plugin settings:

  • Use state.backend.kind = "in_memory" for local experiments.
  • Use Redis state when learned state must survive restarts or be shared across workers.
  • Enable telemetry when adaptive learners should consume runtime events.
  • Keep tool_parallelism.mode = "observe_only" until scheduling behavior has been validated.
  • Keep policy.unsupported_value = "error" for rollout safety.
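The recommendations above could be checked before a rollout with a small helper that inspects the adaptive config as a plain dictionary. This is a sketch under assumptions: rollout_warnings is a hypothetical function, it is not part of NeMo Relay, and it only looks at fields that are explicitly set because the defaults for these fields are not stated here.

```python
from typing import Any


def rollout_warnings(adaptive: dict[str, Any]) -> list[str]:
    """Hypothetical pre-rollout check against the recommendations above."""
    warnings: list[str] = []

    # Tool parallelism should stay observational until scheduling is validated.
    mode = adaptive.get("tool_parallelism", {}).get("mode")
    if mode is not None and mode != "observe_only":
        warnings.append(f"tool_parallelism.mode is {mode!r}, not 'observe_only'")

    # Unsupported values should fail loudly during rollout.
    unsupported = adaptive.get("policy", {}).get("unsupported_value")
    if unsupported is not None and unsupported != "error":
        warnings.append(f"policy.unsupported_value is {unsupported!r}, not 'error'")

    # In-memory state is recommended for local experiments only.
    kind = adaptive.get("state", {}).get("backend", {}).get("kind")
    if kind == "in_memory":
        warnings.append("state.backend.kind is 'in_memory' (local experiments only)")

    return warnings


local_config = {
    "state": {"backend": {"kind": "in_memory"}},
    "tool_parallelism": {"mode": "observe_only"},
    "policy": {"unsupported_value": "error"},
}
for warning in rollout_warnings(local_config):
    print(warning)
```

A check like this fits naturally in CI for the configuration repository, so that a risky setting is flagged before it reaches a shared environment.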

plugins.toml Example

```toml
version = 1

[[components]]
kind = "adaptive"
enabled = true

[components.config]
version = 1
agent_id = "planner"

[components.config.state.backend]
kind = "in_memory"

[components.config.telemetry]
subscriber_name = "adaptive.telemetry"
learners = ["tool_parallelism"]

[components.config.tool_parallelism]
mode = "observe_only"
priority = 100

[components.config.adaptive_hints]
priority = 100
break_chain = false
inject_header = true
inject_body_path = "nvext.agent_hints"

[components.config.acg]
provider = "passthrough"
observation_window = 100
priority = 50

[components.config.acg.stability_thresholds]
stable_threshold = 0.95
semi_stable_threshold = 0.50
min_observations_for_full_confidence = 20

[components.config.policy]
unknown_component = "warn"
unknown_field = "warn"
unsupported_value = "error"
```

This configuration activates adaptive telemetry, keeps tool parallelism observational, injects adaptive hints, and leaves ACG in passthrough mode so requests can be observed without provider-specific cache translation.

Per-Language Plugin Configuration

Python

```python
import asyncio

import nemo_relay

# Build the adaptive component config; field names match plugins.toml.
adaptive_config = nemo_relay.adaptive.AdaptiveConfig(
    agent_id="planner",
    state=nemo_relay.adaptive.StateConfig(
        backend=nemo_relay.adaptive.BackendSpec.in_memory(),
    ),
    telemetry=nemo_relay.adaptive.TelemetryConfig(
        subscriber_name="adaptive.telemetry",
        learners=["tool_parallelism"],
    ),
    tool_parallelism=nemo_relay.adaptive.ToolParallelismConfig(mode="observe_only"),
    adaptive_hints=nemo_relay.adaptive.AdaptiveHintsConfig(
        inject_body_path="nvext.agent_hints",
    ),
    acg=nemo_relay.adaptive.AcgConfig(provider="passthrough"),
)

# Wrap the adaptive config in the generic plugin document.
plugin_config = nemo_relay.plugin.PluginConfig(
    components=[nemo_relay.adaptive.ComponentSpec(adaptive_config)]
)


async def main() -> None:
    # Validate first and stop on any error-level diagnostic.
    report = nemo_relay.plugin.validate(plugin_config)
    if any(diagnostic["level"] == "error" for diagnostic in report["diagnostics"]):
        raise RuntimeError(report["diagnostics"])

    # Initialize only after validation passes.
    active = await nemo_relay.plugin.initialize(plugin_config)


asyncio.run(main())
```

Manual API

Use the manual runtime API when an integration needs to own adaptive lifecycle directly instead of activating the top-level plugin component.

Python

```python
import asyncio

import nemo_relay

# Same adaptive config as in the plugin-based example.
adaptive_config = nemo_relay.adaptive.AdaptiveConfig(
    agent_id="planner",
    state=nemo_relay.adaptive.StateConfig(
        backend=nemo_relay.adaptive.BackendSpec.in_memory(),
    ),
    telemetry=nemo_relay.adaptive.TelemetryConfig(
        subscriber_name="adaptive.telemetry",
        learners=["tool_parallelism"],
    ),
    tool_parallelism=nemo_relay.adaptive.ToolParallelismConfig(mode="observe_only"),
    adaptive_hints=nemo_relay.adaptive.AdaptiveHintsConfig(
        inject_body_path="nvext.agent_hints",
    ),
    acg=nemo_relay.adaptive.AcgConfig(provider="passthrough"),
)


async def main() -> None:
    # The runtime takes the config as a plain dict.
    runtime = nemo_relay.adaptive.AdaptiveRuntime(adaptive_config.to_dict())
    await runtime.register()
    try:
        # Run instrumented application work here.
        runtime.wait_for_idle()
    finally:
        # Always shut down, even if the work above raises.
        await runtime.shutdown()


asyncio.run(main())
```

Validation And Teardown

Validate plugin configuration before initialization. Disabled adaptive components are still validated, which lets operators prepare a rollout before setting enabled = true.

Common validation failures include:

  • Unknown adaptive fields when policy treats unknown fields as errors.
  • Unsupported backend kinds, tool-parallelism modes, or ACG providers.
  • Unsupported schema versions.
  • Backend-specific fields that do not match the selected backend.
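The first failure mode, unknown fields reported at a level set by policy, can be sketched with plain dictionaries. This is not the NeMo Relay validator: check_unknown_fields and has_errors are hypothetical helpers, the "message" key and the "warn" default are assumptions, and only the {"level": ...} diagnostic shape and the "warn" and "error" policy values come from the examples on this page.

```python
from typing import Any

# Top-level field names from the Component Shape table.
KNOWN_FIELDS = {
    "version", "agent_id", "state", "telemetry",
    "adaptive_hints", "tool_parallelism", "acg", "policy",
}


def check_unknown_fields(adaptive: dict[str, Any]) -> list[dict[str, str]]:
    """Hypothetical sketch: flag unknown top-level fields at the policy level."""
    # Assumption: fall back to "warn" when policy.unknown_field is not set.
    level = adaptive.get("policy", {}).get("unknown_field", "warn")
    return [
        {"level": level, "message": f"unknown adaptive field: {name}"}
        for name in sorted(set(adaptive) - KNOWN_FIELDS)
    ]


def has_errors(diagnostics: list[dict[str, str]]) -> bool:
    """Same error test the Python example applies to report["diagnostics"]."""
    return any(d["level"] == "error" for d in diagnostics)


# A misspelled field name is only caught as an error under a strict policy.
config = {
    "agent_id": "planner",
    "tool_paralellism": {"mode": "observe_only"},  # deliberate typo
    "policy": {"unknown_field": "error"},
}
diagnostics = check_unknown_fields(config)
print(diagnostics)
print(has_errors(diagnostics))
```

With unknown_field set to "warn", the same typo would produce a warning-level diagnostic and initialization would proceed, which is why the policy setting matters during rollout.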

Clear plugin configuration during shutdown or test cleanup. Clearing the plugin configuration deregisters adaptive subscribers and intercepts owned by the plugin runtime.

Rollout Guidance

Start by enabling state and telemetry in a development environment. Run representative instrumented workflows, inspect emitted events and adaptive reports, and then enable active behavior one area at a time. Keep rollback a configuration-only change, so that any area can be switched back without a code change.