For AI agents: a documentation index is available at the root level at /llms.txt and /llms-full.txt. Append /llms.txt to any URL for a page-level index, or .md for the markdown version of any page.
  • About NVIDIA NeMo Relay
    • Overview
    • Architecture
    • Ecosystem
    • Concepts
    • Release Notes
  • Getting Started
    • Agent Runtime Primer
    • Prerequisites
    • Installation
    • Configuration / Setup
    • Quick Start
  • NVIDIA NeMo Relay CLI
    • About
    • Basic Usage
    • Claude Code
    • Codex
    • Cursor
    • Hermes Agent
  • Supported Integrations
    • About
    • OpenClaw Plugin Guide
    • LangChain Integration Guide
    • LangGraph Integration Guide
    • Deep Agents Integration Guide
  • Instrument Applications
    • About
    • Adding Scopes and Marks
    • Instrument a Tool Call
    • Instrument an LLM Call
    • Add Middleware
    • Code Examples
  • Observability Plugin
    • About
    • Configuration
    • Agent Trajectory Interchange Format (ATIF)
    • Agent Trajectory Observability Format (ATOF)
    • OpenTelemetry
    • OpenInference
  • Adaptive Plugin
    • About
    • Configuration
    • Adaptive Cache Governor (ACG)
    • Adaptive Hints
  • NeMo Guardrails Plugin
    • About
    • Configuration
  • Integrate into Frameworks
    • About
    • Adding Scopes
    • Wrap Tool Calls
    • Wrap LLM Calls
    • Handle Non-Serializable Data
    • Using Codecs
    • Provider Codecs
    • Provider Response Codecs
    • Code Examples
  • Build Plugins
    • About
    • Define a Plugin
    • Validate Plugin Configuration
    • Plugin Configuration Files
    • Register Plugin Behavior
    • Design Plugin Configuration
    • NeMo Guardrails Example Plugin
    • Code Examples
  • Contribute
    • About
    • Development Setup
    • Workflow and Reviews
    • Testing and Documentation
  • Reference
    • APIs
    • Performance
  • Resources
    • Support and FAQs
    • Glossary
    • Troubleshooting Guide
    • Community
    • Legal
NVIDIANVIDIA
Developer-friendly docs for your API
Privacy Policy | Your Privacy Choices | Terms of Service | Accessibility | Corporate Policies | Product Security | Contact

Copyright © 2026, NVIDIA Corporation.

LogoLogo
On this page
  • plugins.toml Example
  • Plugin Configuration
  • Manual API
  • Fields
  • Expected Output
  • Common Validation Failures
Adaptive Plugin

Adaptive Hints

||View as Markdown|
Previous

Adaptive Cache Governor (ACG)

Next

NeMo Guardrails Plugin

Use Adaptive Hints when downstream model calls or provider adapters can safely receive guidance metadata from the adaptive runtime.

Adaptive hints register as LLM request intercepts. Lower numeric priority values run earlier in the intercept chain. The default priority is chosen relative to other middleware rather than as a standalone importance score.

plugins.toml Example

1version = 1
2
3[[components]]
4kind = "adaptive"
5enabled = true
6
7[components.config]
8version = 1
9agent_id = "planner"
10
11[components.config.state.backend]
12kind = "in_memory"
13
14[components.config.telemetry]
15subscriber_name = "adaptive.telemetry"
16learners = ["tool_parallelism"]
17
18[components.config.adaptive_hints]
19priority = 100
20break_chain = false
21inject_header = true
22inject_body_path = "nvext.agent_hints"

This configuration injects adaptive guidance into outgoing model requests while allowing later request intercepts to continue running.

Plugin Configuration

Use plugin configuration when the application should let NeMo Relay own the Adaptive Hints request-intercept lifecycle.

Python
Node.js
Rust
1import nemo_relay
2
3adaptive_config = nemo_relay.adaptive.AdaptiveConfig(
4 agent_id="planner",
5 state=nemo_relay.adaptive.StateConfig(
6 backend=nemo_relay.adaptive.BackendSpec.in_memory(),
7 ),
8 telemetry=nemo_relay.adaptive.TelemetryConfig(learners=["tool_parallelism"]),
9 adaptive_hints=nemo_relay.adaptive.AdaptiveHintsConfig(
10 inject_body_path="nvext.agent_hints",
11 ),
12)
13
14plugin_config = nemo_relay.plugin.PluginConfig(
15 components=[nemo_relay.adaptive.ComponentSpec(adaptive_config)]
16)
17
18report = nemo_relay.plugin.validate(plugin_config)
19if any(diagnostic["level"] == "error" for diagnostic in report["diagnostics"]):
20 raise RuntimeError(report["diagnostics"])
21
22await nemo_relay.plugin.initialize(plugin_config)
23try:
24 # Run instrumented application work here.
25 pass
26finally:
27 nemo_relay.plugin.clear()

Manual API

Use the manual runtime API when an integration needs to own adaptive lifecycle directly instead of activating the top-level plugin component.

Python
Node.js
Rust
1import nemo_relay
2
3adaptive_config = nemo_relay.adaptive.AdaptiveConfig(
4 agent_id="planner",
5 state=nemo_relay.adaptive.StateConfig(
6 backend=nemo_relay.adaptive.BackendSpec.in_memory(),
7 ),
8 telemetry=nemo_relay.adaptive.TelemetryConfig(learners=["tool_parallelism"]),
9 adaptive_hints=nemo_relay.adaptive.AdaptiveHintsConfig(
10 inject_body_path="nvext.agent_hints",
11 ),
12)
13
14runtime = nemo_relay.adaptive.AdaptiveRuntime(adaptive_config.to_dict())
15await runtime.register()
16try:
17 # Run instrumented application work here.
18 nemo_relay.adaptive.set_latency_sensitivity(8)
19finally:
20 await runtime.shutdown()

Fields

FieldDefaultNotes
priority100Request intercept priority. Lower values run earlier.
break_chainfalseWhether this intercept stops later request intercepts.
inject_headertrueWhether to add adaptive hints as request header metadata.
inject_body_pathnvext.agent_hintsJSON body path for request-body hint injection.

Disable break_chain unless the adaptive hint should be the final request transform. Adjust priority only when adaptive hints need to run before or after known application middleware.

Expected Output

Outgoing managed LLM requests receive adaptive hint metadata in the configured header and body location. The hints do not replace the application callback or change the returned value by themselves. Downstream code must explicitly interpret the metadata before behavior changes.

Common Validation Failures

  • Unknown adaptive hint fields when unknown fields are treated as errors.
  • inject_body_path does not match the request shape expected by downstream provider adapters.
  • Hint injection is enabled before downstream model paths can consume or ignore the metadata safely.