For AI agents: a documentation index is available at the root level at /llms.txt and /llms-full.txt. Append /llms.txt to any URL for a page-level index, or .md for the markdown version of any page.
  • About NVIDIA NeMo Relay
    • Overview
    • Architecture
    • Ecosystem
    • Concepts
    • Release Notes
  • Getting Started
    • Agent Runtime Primer
    • Prerequisites
    • Installation
    • Configuration / Setup
    • Quick Start
  • NVIDIA NeMo Relay CLI
    • About
    • Basic Usage
    • Claude Code
    • Codex
    • Cursor
    • Hermes Agent
  • Supported Integrations
    • About
    • OpenClaw Plugin Guide
    • LangChain Integration Guide
    • LangGraph Integration Guide
    • Deep Agents Integration Guide
  • Instrument Applications
    • About
    • Adding Scopes and Marks
    • Instrument a Tool Call
    • Instrument an LLM Call
    • Add Middleware
    • Code Examples
  • Observability Plugin
    • About
    • Configuration
    • Agent Trajectory Interchange Format (ATIF)
    • Agent Trajectory Observability Format (ATOF)
    • OpenTelemetry
    • OpenInference
  • Adaptive Plugin
    • About
    • Configuration
    • Adaptive Cache Governor (ACG)
    • Adaptive Hints
  • NeMo Guardrails Plugin
    • About
    • Configuration
  • Integrate into Frameworks
    • About
    • Adding Scopes
    • Wrap Tool Calls
    • Wrap LLM Calls
    • Handle Non-Serializable Data
    • Using Codecs
    • Provider Codecs
    • Provider Response Codecs
    • Code Examples
  • Build Plugins
    • About
    • Define a Plugin
    • Validate Plugin Configuration
    • Plugin Configuration Files
    • Register Plugin Behavior
    • Design Plugin Configuration
    • NeMo Guardrails Example Plugin
    • Code Examples
  • Contribute
    • About
    • Development Setup
    • Workflow and Reviews
    • Testing and Documentation
  • Reference
    • APIs
    • Performance
  • Resources
    • Support and FAQs
    • Glossary
    • Troubleshooting Guide
    • Community
    • Legal
NVIDIANVIDIA
Developer-friendly docs for your API
Privacy Policy | Your Privacy Choices | Terms of Service | Accessibility | Corporate Policies | Product Security | Contact

Copyright © 2026, NVIDIA Corporation.

LogoLogo
On this page
  • Install
  • Configure
  • Example Agent
  • Runtime Behavior
  • Supported Codecs
  • Limitations
Build Plugins

NeMo Guardrails Example Plugin

||View as Markdown|
Previous

Design Plugin Configuration

Next

Code Examples

This example shows how to write a Python NeMo Relay plugin that calls the NeMo Guardrails Python API.

This page documents the older external Python example plugin. For the built-in first-party nemo_guardrails component, see NeMo Guardrails Plugin.

The example lives under examples/nemoguardrails. The single-file plugin implementation, runnable agent, and Guardrails config artifacts are under example. It is not part of the nemo_relay Python package, and NeMo Relay does not depend on nemoguardrails. Applications that use the example install NeMo Guardrails in their own environment and import or vendor the example plugin.

Install

Install NeMo Relay normally, then install NeMo Guardrails in the application or example environment that activates the plugin:

$pip install nemoguardrails

The bundled example config uses NeMo Guardrails’ nvidia_ai_endpoints model engine. Install the NVIDIA LangChain provider when you want to run that config as-is:

$pip install langchain-nvidia-ai-endpoints

Configure

Guardrails stay in native NeMo Guardrails config. Point the plugin at a Guardrails config directory, or pass inline YAML content.

1import asyncio
2
3import nemo_relay
4import plugin as nemoguardrails_plugin
5
6async def main() -> None:
7 nemoguardrails_plugin.register()
8 try:
9 config = nemo_relay.plugin.PluginConfig(
10 components=[
11 nemo_relay.plugin.ComponentSpec(
12 kind=nemoguardrails_plugin.DEFAULT_KIND,
13 config={
14 "config_path": "./rails",
15 "codec": "openai_chat",
16 },
17 )
18 ]
19 )
20 await nemo_relay.plugin.initialize(config)
21 finally:
22 nemo_relay.plugin.clear()
23 nemoguardrails_plugin.deregister()
24
25asyncio.run(main())

The config_path directory is a normal NeMo Guardrails config directory. For example:

1# rails/config.yml
2models:
3 - type: main
4 engine: nvidia_ai_endpoints
5 model: meta/llama-3.1-8b-instruct
6
7rails:
8 input:
9 flows:
10 - self check input
11 output:
12 flows:
13 - self check output
14
15prompts:
16 - task: self_check_input
17 content: |-
18 You are checking whether a NeMo Relay request should be allowed.
19 The input may be plain user text or a JSON object with tool_name and
20 arguments fields.
21 User input: {{ user_input }}
22 Should this request be blocked? Answer only Yes or No.
23
24 - task: self_check_output
25 content: |-
26 You are checking whether a NeMo Relay response should be returned.
27 The output may be assistant text or a JSON object with tool_name,
28 arguments, and result fields.
29 Model output: {{ bot_response }}
30 Should this response be blocked? Answer only Yes or No.

The plugin config accepts these fields:

  • config_path: Path to a NeMo Guardrails config directory.
  • config_yaml: Inline NeMo Guardrails YAML config.
  • colang_content: Optional inline Colang content. This can only be used with config_yaml.
  • codec: One of openai_chat, openai_responses, or anthropic_messages. This is required when input or output is enabled.
  • input: Whether to run input rails around LLM calls. Defaults to true.
  • output: Whether to run output rails around LLM calls. Defaults to true.
  • tool_input: Whether to check managed tool arguments before execution. Defaults to false.
  • tool_output: Whether to check managed tool results after execution. Defaults to false.
  • priority: Execution-intercept priority. Defaults to 100.

Exactly one of config_path or config_yaml is required.

Example Agent

The example includes agent_example.py, a concrete example agent that initializes the plugin, checks a managed tools.execute(...) call, and checks a managed llm.execute(...) call against live NVIDIA-hosted inference.

Run it from a checkout where NeMo Relay and NeMo Guardrails are installed. The default lane uses a passthrough Guardrails config and the current_time tool. This is the fastest live validation path because it exercises the real plugin, real nemoguardrails initialization, tool execution, and LLM execution without running model-backed self-check rails:

$export NVIDIA_API_KEY="<your-key>"
$python examples/nemoguardrails/example/agent_example.py

To run the inline self-check rails example, load example_config.yml from example and pass it as inline config_yaml:

$python examples/nemoguardrails/example/agent_example.py --guardrails-config inline

The config directory lane uses the bundled examples/nemoguardrails/example/rails/config.yml by default. It contains the same input and output self-check rails as example/example_config.yml:

$python examples/nemoguardrails/example/agent_example.py --guardrails-config path

Use --tool weather when you want the example to use a weather tool instead of the default current_time tool:

$python examples/nemoguardrails/example/agent_example.py --tool weather

Pass --config-path when you want the example agent to use your own native NeMo Guardrails config directory:

$python examples/nemoguardrails/example/agent_example.py \
> --guardrails-config path \
> --config-path ./rails

Runtime Behavior

For non-streaming llm.execute(...) calls, the plugin checks the user input before the model call and checks the assistant text after the model call. Guardrails can pass, block, or rewrite input. For output, this example supports pass and block; modified output raises because NeMo Relay response codecs are decode-only and the example does not rewrite provider-shaped responses.

For managed tools.execute(...) calls, the plugin can also check serialized tool arguments before execution and serialized tool results after execution. When Guardrails rewrites tool arguments or results, the rewritten content must be valid JSON.

The bundled config uses the same NeMo Guardrails input and output self-check rails for both LLM messages and tool payloads. The plugin makes tool calls visible to Guardrails by serializing managed tool arguments and results as JSON message content.

This behavior changes the real execution path. It is not an observability-only sanitize guardrail.

Supported Codecs

The example is intentionally limited to NeMo Relay’s built-in LLM codec shapes:

  • openai_chat for OpenAI Chat Completions-style requests and responses.
  • openai_responses for OpenAI Responses API-style requests and responses.
  • anthropic_messages for Anthropic Messages-style requests and responses.

Provider-specific payloads outside those codecs need a NeMo Relay codec and a response text replacement strategy before a production plugin can apply modified output safely.

Limitations

This example calls NeMo Guardrails check_async, not generate_async. It checks around NeMo Relay LLM and tool execution calls, but it does not let NeMo Guardrails take over generation or agent orchestration.

The example does not support:

  • Streaming LLM calls.
  • Dialog rails, retrieval rails, execution rails, or generation rails that require NeMo Guardrails to orchestrate the full generation flow.
  • Arbitrary provider payloads beyond the three supported NeMo Relay codecs.
  • Applying modified LLM output back into provider responses.
  • Rewriting tool-call arguments inside model responses before an application turns those model tool calls into managed tools.execute(...) calls.

Tool checks use serialized JSON and NeMo Guardrails input/output checks. They are NeMo Relay tool middleware checks powered by Guardrails, not a full generate_async agent-loop integration.

config_path points at native NeMo Guardrails configuration. Guardrails config can load project code such as actions, so treat that path as trusted application code.