For AI agents: a documentation index is available at the root level at /llms.txt and /llms-full.txt. Append /llms.txt to any URL for a page-level index, or .md for the markdown version of any page.
  • About NVIDIA NeMo Relay
    • Overview
    • Architecture
    • Ecosystem
    • Concepts
    • Release Notes
  • Getting Started
    • Agent Runtime Primer
    • Prerequisites
    • Installation
    • Configuration / Setup
    • Quick Start
  • NVIDIA NeMo Relay CLI
    • About
    • Basic Usage
    • Claude Code
    • Codex
    • Cursor
    • Hermes Agent
  • Supported Integrations
    • About
    • OpenClaw Plugin Guide
    • LangChain Integration Guide
    • LangGraph Integration Guide
    • Deep Agents Integration Guide
  • Instrument Applications
    • About
    • Adding Scopes and Marks
    • Instrument a Tool Call
    • Instrument an LLM Call
    • Add Middleware
    • Code Examples
  • Observability Plugin
    • About
    • Configuration
    • Agent Trajectory Interchange Format (ATIF)
    • Agent Trajectory Observability Format (ATOF)
    • OpenTelemetry
    • OpenInference
  • Adaptive Plugin
    • About
    • Configuration
    • Adaptive Cache Governor (ACG)
    • Adaptive Hints
  • NeMo Guardrails Plugin
    • About
    • Configuration
  • Integrate into Frameworks
    • About
    • Adding Scopes
    • Wrap Tool Calls
    • Wrap LLM Calls
    • Handle Non-Serializable Data
    • Using Codecs
    • Provider Codecs
    • Provider Response Codecs
    • Code Examples
  • Build Plugins
    • About
    • Define a Plugin
    • Validate Plugin Configuration
    • Plugin Configuration Files
    • Register Plugin Behavior
    • Design Plugin Configuration
    • NeMo Guardrails Example Plugin
    • Code Examples
  • Contribute
    • About
    • Development Setup
    • Workflow and Reviews
    • Testing and Documentation
  • Reference
    • APIs
    • Performance
  • Resources
    • Support and FAQs
    • Glossary
    • Troubleshooting Guide
    • Community
    • Legal
NVIDIANVIDIA
Developer-friendly docs for your API
Privacy Policy | Your Privacy Choices | Terms of Service | Accessibility | Corporate Policies | Product Security | Contact

Copyright © 2026, NVIDIA Corporation.

LogoLogo
On this page
  • Use This Plugin When
  • Current Scope
  • Managed Surfaces Versus Request Defaults
  • Pages
NeMo Guardrails Plugin

NeMo Guardrails Plugin

||View as Markdown|
Previous

Adaptive Hints

Next

NeMo Guardrails Configuration

Use the NeMo Guardrails plugin when you want first-party Guardrails policy around managed NeMo Relay LLM and tool execution through the shared plugin system.

The built-in plugin component has kind nemo_guardrails and is available as a first-party NeMo Relay plugin.

The plugin is designed around backend modes:

  • remote
    • Implemented now.
    • Calls a Guardrails service over HTTP(S), including streaming over the same remote contract.
  • local
    • Planned.
    • Reserved for a future in-process Python nemoguardrails backend.

Use This Plugin When

Start here when you need to:

  • Apply Guardrails input and output checks around managed llm.execute(...) calls.
  • Apply Guardrails policy around managed tool execution, including the current remote managed tool_output lane.
  • Configure Guardrails behavior through the same plugin config surface used by other first-party NeMo Relay components.
  • Keep Guardrails behavior in a reusable process-level config document instead of wiring provider-specific checks into each application call site.

Current Scope

The current shipped user-facing lane is the built-in remote backend.

That lane supports:

  • Managed non-streaming LLM input checks.
  • Managed non-streaming LLM output checks.
  • Managed streaming LLM execution over the remote HTTP(S) path.
  • Managed tool-result checks through tool_output.
  • Request-time Guardrails defaults passed through to the remote backend.

The current built-in remote backend does not support:

  • Managed tool_input checks against the stock Guardrails remote contract.
  • local mode.
  • Remote managed LLM parity beyond codec = "openai_chat".

Managed Surfaces Versus Request Defaults

The NeMo Guardrails plugin model uses two different concepts:

  • Currently supported managed NeMo Relay execution surfaces in the shipped remote backend:
    • input
    • output
    • tool_output
  • Guardrails backend request defaults:
    • request_defaults.context
    • request_defaults.thread_id
    • request_defaults.state
    • request_defaults.rails
    • request_defaults.llm_params
    • request_defaults.llm_output
    • request_defaults.output_vars
    • request_defaults.log

This distinction matters:

  • Managed surfaces wrap real NeMo Relay execution boundaries such as llm.execute(...) and tools.execute(...).
  • Managed surfaces let NeMo Relay enforce behavior around those boundaries. Depending on the surface, Relay can block work, allow it, or apply managed request or result handling before the application sees the final outcome.
  • Managed surfaces also give NeMo Relay a stable runtime boundary for its own middleware ordering, lifecycle behavior, and observability marks. Relay knows exactly which step is being wrapped and can attach policy and telemetry to that step directly.
  • request_defaults fields are forwarded to the selected Guardrails backend as request semantics. They do not create new NeMo Relay-native execution surfaces.
  • request_defaults can still influence Guardrails behavior, but they do not give NeMo Relay a new local runtime step to wrap. Relay is passing backend options along with a request, not creating a new middleware boundary of its own.
  • request_defaults are also backend-contract dependent. A selected Guardrails backend can use them when evaluating a request, but the exact effect depends on what that backend supports. Relay is not creating a separate local retrieval, dialog, or tool boundary just because those fields exist in the request.

In practice, the tradeoff is:

  • Managed surfaces give you a Relay-owned enforcement point around a known runtime step, with Relay-owned enforcement, ordering, and marks around that step.
  • request_defaults give you backend-level configuration for a request, but not a separate Relay-owned interception point, runtime boundary, or middleware surface.

Another way to think about it:

  • Managed surfaces are places where NeMo Relay is holding the steering wheel.
  • request_defaults are notes that NeMo Relay passes to the Guardrails backend with a request.

Top-level tool_input is still part of the built-in plugin contract, but it is not supported by the current stock-remote backend.

The overlap in names is important:

  • Top-level input is a managed NeMo Relay execution surface.
  • request_defaults.rails.input is a backend pass-through option.
  • Top-level output is a managed NeMo Relay execution surface.
  • request_defaults.rails.output is a backend pass-through option.
  • Top-level tool_input is part of the built-in plugin model, but the current stock-remote backend rejects it.
  • request_defaults.rails.tool_input is a backend pass-through option.
  • Top-level tool_output is a managed NeMo Relay execution surface.
  • request_defaults.rails.tool_output is a backend pass-through option.

In particular, request_defaults.rails.dialog and request_defaults.rails.retrieval are simple pass-through options. They are not separate managed middleware surfaces in NeMo Relay.

Pages

  • NeMo Guardrails Configuration documents the built-in component shape, remote-mode boundaries, and current support matrix.