Copyright © 2026, NVIDIA Corporation.

LogoLogo
On this page
  • Core Runtime Setup
  • Plugin Setup
  • Observability Setup
  • Adaptive Setup
Getting Started

Configuration

NeMo Relay runtime behavior is configured through API objects and registration calls rather than a global configuration file.

Core Runtime Setup

Most applications configure NeMo Relay by:

  1. Creating or reusing a scope stack.
  2. Registering guardrails, intercepts, or subscribers.
  3. Calling the managed tool or LLM helpers from the active scope.
  4. Deregistering global middleware that should not remain active for the lifetime of the process.

Use scope-local registration when behavior must be tied to one request, session, or agent run.
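The four steps above, together with the difference between global and scope-local registration, can be sketched with a small stand-in model. This is not the NeMo Relay API: ScopeStack, register, deregister, scope, and call_tool are hypothetical names chosen only to illustrate the lifecycle. See the API reference for the real objects and calls.

```python
from contextlib import contextmanager
from typing import Callable, Dict, Iterator, List, Optional, Tuple

# Hypothetical stand-ins: none of these names come from the NeMo Relay API.
Subscriber = Callable[[str, dict], None]


class ScopeStack:
    """Toy model of a scope stack with global and scope-local subscribers."""

    def __init__(self) -> None:
        self._global: Dict[str, Subscriber] = {}
        self._scopes: List[Tuple[str, Dict[str, Subscriber]]] = []

    def register(self, name: str, subscriber: Subscriber) -> None:
        # Global registration: stays active until deregistered.
        self._global[name] = subscriber

    def deregister(self, name: str) -> None:
        self._global.pop(name, None)

    @contextmanager
    def scope(
        self, name: str, subscribers: Optional[Dict[str, Subscriber]] = None
    ) -> Iterator[None]:
        # Scope-local registration: removed automatically when the scope exits.
        self._scopes.append((name, dict(subscribers or {})))
        try:
            yield
        finally:
            self._scopes.pop()

    def call_tool(self, tool: Callable[..., object], **kwargs: object) -> object:
        # Managed helper: runs the tool, then notifies every active subscriber.
        if not self._scopes:
            raise RuntimeError("call_tool requires an active scope")
        result = tool(**kwargs)
        event = {"scope": self._scopes[-1][0], "tool": tool.__name__, "result": result}
        for subscriber in list(self._global.values()):
            subscriber("tool_call", event)
        for _, local in self._scopes:
            for subscriber in local.values():
                subscriber("tool_call", event)
        return result


def add(a: int, b: int) -> int:
    return a + b


stack = ScopeStack()                                   # step 1: create a scope stack
global_events: List[Tuple[str, str]] = []
local_events: List[object] = []
stack.register("audit", lambda kind, ev: global_events.append((kind, ev["scope"])))  # step 2

with stack.scope("request-1", {"trace": lambda kind, ev: local_events.append(ev["result"])}):
    total = stack.call_tool(add, a=2, b=3)             # step 3: call from the active scope

stack.deregister("audit")                              # step 4: remove global middleware

with stack.scope("request-2"):
    stack.call_tool(add, a=1, b=1)                     # no subscriber sees this call
```

In this model the scope-local subscriber disappears when the with block exits, while the global subscriber must be removed explicitly. That asymmetry is the reason to prefer scope-local registration for per-request behavior.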

Plugin Setup

Plugins use a structured plugin configuration with:

  • A version
  • One or more component definitions
  • Optional component policy

Start with Define a Plugin when you need reusable middleware, subscribers, or adaptive behavior.

Use NeMo Guardrails Configuration when you want the built-in first-party nemo_guardrails component.

The nemo-relay CLI gateway reads plugin files named plugins.toml. See Plugin Configuration Files for file locations, precedence, merge behavior, editor controls, and validation rules.
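The three-part structure described above (a version, one or more components, an optional policy) can be checked with a small validator before a configuration is handed to a plugin. The sketch below works on an already-parsed mapping and assumes the keys version, components, kind, and policy. Those key names are illustrative guesses, not the documented plugins.toml schema, so consult Plugin Configuration Files and Validate Plugin Configuration for the real rules.

```python
from typing import Any, Dict, List


def validate_plugin_config(config: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the shape looks valid.

    Key names ("version", "components", "kind", "policy") are assumptions
    made for this sketch, not the actual NeMo Relay schema.
    """
    errors: List[str] = []

    if "version" not in config:
        errors.append("missing 'version'")

    components = config.get("components")
    if not isinstance(components, list) or not components:
        errors.append("'components' must be a non-empty list")
    else:
        for index, component in enumerate(components):
            if not isinstance(component, dict) or "kind" not in component:
                errors.append(f"components[{index}] needs a 'kind'")

    # The policy is optional, but must be a mapping when present.
    policy = config.get("policy")
    if policy is not None and not isinstance(policy, dict):
        errors.append("'policy' must be a mapping when present")

    return errors


good = {"version": 1, "components": [{"kind": "nemo_guardrails"}]}
bad = {"components": [], "policy": "strict"}

good_errors = validate_plugin_config(good)
bad_errors = validate_plugin_config(bad)
```

Collecting every problem in a list, rather than raising on the first one, lets a caller report all configuration mistakes in a single pass.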

Observability Setup

Agent Trajectory Observability Format (ATOF) exporters, Agent Trajectory Interchange Format (ATIF) exporters, OpenTelemetry subscribers, and OpenInference subscribers can be configured directly through binding-native config objects. Use the built-in observability plugin when you want one plugin component to own standard exporter setup and teardown. See Observability Configuration and Observability for the supported export paths.

NeMo Relay does not require application-level environment variables for normal runtime use. Configure most behavior through API objects, registration calls, or plugin configuration.

OTEL_* variables are relevant only when the underlying OpenTelemetry exporter reads its endpoint settings from the environment. Prefer explicit config objects in application code so that the active export settings are visible in docs, tests, and deployment manifests.
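One way to apply the "explicit first, environment as fallback" preference is to resolve the endpoint in application code and record where it came from. In this sketch, ExporterSettings and resolve_endpoint are hypothetical helpers, not NeMo Relay or OpenTelemetry APIs. OTEL_EXPORTER_OTLP_ENDPOINT is the standard OpenTelemetry variable for the OTLP endpoint.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class ExporterSettings:
    """Hypothetical settings object; not part of any real library."""

    endpoint: str
    source: str  # "explicit" or "environment"


def resolve_endpoint(
    explicit: Optional[str], env: Mapping[str, str] = os.environ
) -> ExporterSettings:
    # An explicit value always wins, so the active setting is visible in code.
    if explicit:
        return ExporterSettings(endpoint=explicit, source="explicit")
    # Fall back to the standard OpenTelemetry variable only when nothing was passed.
    from_env = env.get("OTEL_EXPORTER_OTLP_ENDPOINT")
    if from_env:
        return ExporterSettings(endpoint=from_env, source="environment")
    raise ValueError("no OTLP endpoint configured")


fake_env = {"OTEL_EXPORTER_OTLP_ENDPOINT": "http://env-collector:4318"}

settings = resolve_endpoint("http://collector.internal:4318", env=fake_env)
fallback = resolve_endpoint(None, env=fake_env)
```

Passing the environment as a parameter keeps the fallback testable without mutating os.environ.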

Adaptive Setup

Adaptive tuning is enabled through the adaptive plugin component and binding helper APIs. See Adaptive Configuration.