For AI agents: a documentation index is available at the root level at /llms.txt and /llms-full.txt. Append /llms.txt to any URL for a page-level index, or .md for the markdown version of any page.
LogoLogoNeMo Guardrails
      • Overview
      • How It Works
      • Guardrail Types
      • Supported LLMs
      • Release Notes
      • Installation
      • Tutorials
      • Integrate Guardrails
      • About Configuring Guardrails
      • Configuration Overview
      • Pre-Configuration Checklist
      • Configuring YAML File
      • YAML Schema Reference
      • Guardrail Catalog
      • Custom Actions
      • Custom Initialization
      • Colang
      • Other Configurations
      • Caching
      • Exceptions
      • About Running Guardrailed Inference
      • Python APIs
      • Guardrails API Server
        • Overview
        • Run the Server
        • Chat Completions
        • List Configurations
        • List Models
        • Actions Server
      • Evaluate Configuration
      • Evaluation Methodology
      • Evaluate Guardrails
      • Vulnerability Scanning
      • Overview
      • Logging
      • Tracing
      • Metrics
      • Deployment Options
      • Docker
      • NeMo Microservice
      • LangChain Frameworks
      • Tools Integration
      • Architecture
      • Sequence Diagrams
      • Use Case Diagrams
      • CLI
      • Migrating to 0.22
      • Troubleshooting
      • Security
      • Telemetry and Privacy
      • Research
  • Overview
  • How It Works
  • Guardrail Types
  • Supported LLMs
  • Release Notes
  • Installation
  • Tutorials
  • Check Harmful Content
  • Content Safety Reasoning
  • Restrict Topics
  • Detect Jailbreak Attempts
  • Jailbreak Heuristics
  • Add Multimodal Content Safety
  • Integrate Guardrails
  • About Configuring Guardrails
  • Configuration Overview
  • Pre-Configuration Checklist
  • Configuring YAML File
  • Models
  • Guardrails
  • Prompts
  • Tracing
  • Streaming
  • Streaming LLM Responses
  • Output Rail Streaming
  • YAML Schema Reference
  • Guardrail Catalog
  • Content Safety
  • Jailbreak Protection
  • Topic Control
  • PII Detection
  • Agentic Security
  • Hallucinations & Fact-Checking
  • LLM Self-Check
  • Third-Party APIs
  • ActiveFence
  • AlignScore
  • AutoAlign
  • Cisco AI Defense
  • Clavata
  • Cleanlab
  • CrowdStrike AIDR
  • Fiddler
  • GCP Text Moderation
  • GLiNER PII
  • GuardrailsAI
  • Llama Guard
  • Pangea AI Guard
  • Patronus Evaluate API
  • Patronus Lynx
  • PolicyAI
  • Presidio
  • Private AI
  • Prompt Security
  • Regex
  • Trend Micro
  • Custom Actions
  • Creating Actions
  • Built-in Actions
  • Action Parameters
  • Registering Actions
  • Custom Initialization
  • Init Function
  • LLM Providers
  • Custom LLM Model
  • Testing
  • Custom LLM Framework
  • Embedding Providers
  • Custom Data
  • Colang
  • Colang 2.0 Guide
  • What's Changed
  • Migration Guide
  • Getting Started
  • Hello World
  • Interaction Loop
  • Input Rails
  • Dialog Rails
  • LLM Flows
  • Multimodal Rails
  • Recommended Next Steps
  • Language Reference
  • Introduction
  • Event Generation and Matching
  • Working with Actions
  • Defining Flows
  • Working with Variables and Expressions
  • Flow Control
  • Colang Standard Library
  • Core Flows
  • Timing Flows
  • LLM Flows
  • Avatar Flows
  • Guardrail Flows
  • Attention Flows
  • Make Use of LLMs
  • More on Flows
  • Python Actions
  • Development and Debugging
  • Colang 1.0 Guide
  • Colang 1.0 Language Syntax
  • Colang 1.0 Tutorials
  • Hello World
  • Core Colang Concepts
  • Demo Use Case
  • Input Rails
  • Output Rails
  • Topical Rails
  • Retrieval-Augmented Generation
  • Guardrailing Bot Reasoning Content
  • Colang Usage Examples
  • Bot Message Instructions
  • Extract User-provided Values
  • Other Configurations
  • Knowledge Base
  • Embedding Search Providers
  • Caching
  • Memory Model Cache
  • KV Cache Reuse
  • Exceptions
  • About Running Guardrailed Inference
  • Python APIs
  • Python APIs Overview
  • Core Classes
  • Generation Options
  • Streaming
  • Check Messages
  • Event-Based API
  • Guardrails API Server
  • Overview
  • Run the Server
  • Chat Completions
  • List Configurations
  • List Models
  • Actions Server
  • Evaluate Configuration
  • Evaluation Methodology
  • Evaluate Guardrails
  • Vulnerability Scanning
  • Overview
  • Logging
  • Tracing
  • Quick Start
  • Adapters
  • OpenTelemetry
  • OpenTelemetry Logs
  • Troubleshooting
  • Metrics
  • Enable Guardrails Metrics
  • OpenTelemetry Metrics Integration
  • Metric Reference
  • Deployment Options
  • Docker
  • NeMo Microservice
  • LangChain Frameworks
  • LangChain Integration
  • Agent Middleware
  • RunnableRails
  • Chain with Guardrails
  • Runnable as Action
  • LangGraph
  • Tools Integration
  • Architecture
  • Sequence Diagrams
  • Use Case Diagrams
  • Nemoguardrails
  • Actions
  • Action Dispatcher
  • Actions
  • Core
  • Llm
  • Generation
  • Utils
  • Math
  • Output Mapping
  • Retrieve Relevant Chunks
  • V2 X
  • Generation
  • Validation
  • Base
  • Filter Secrets
  • Actions Server
  • Actions Server
  • Base Guardrails
  • Cli
  • Chat
  • Debugger
  • Migration
  • Providers
  • Colang
  • Runtime
  • V1 0
  • Lang
  • Colang Parser
  • Comd Parser
  • Coyml Parser
  • Parser
  • Utils
  • Runtime
  • Eval
  • Flows
  • Runtime
  • Sliding
  • Utils
  • V2 X
  • Lang
  • Colang Ast
  • Expansion
  • Grammar
  • Load
  • Parser
  • Transformer
  • Utils
  • Runtime
  • Errors
  • Eval
  • Flows
  • Runtime
  • Serialization
  • Statemachine
  • System Functions
  • Utils
  • Context
  • Embeddings
  • Basic
  • Cache
  • Providers
  • Azureopenai
  • Base
  • Cohere
  • Fastembed
  • Google
  • Nim
  • Openai
  • Registry
  • Sentence Transformers
  • Eval
  • Check
  • Cli
  • Eval
  • Models
  • Ui
  • Chart Utils
  • Common
  • README
  • Streamlit Utils
  • Utils
  • Utils
  • Evaluate
  • Cli
  • Evaluate
  • Simplify Formatter
  • Evaluate Factcheck
  • Evaluate Hallucination
  • Evaluate Moderation
  • Evaluate Topical
  • Utils
  • Exceptions
  • Guardrails
  • Api Engine
  • Async Work Queue
  • Base Engine
  • Engine Registry
  • Guardrails
  • Guardrails Types
  • Iorails
  • Model Engine
  • Rail Action
  • Rails Manager
  • Telemetry
  • Imports
  • Integrations
  • Langchain
  • Actions
  • Actions
  • Safetools
  • Exceptions
  • Helpers
  • Langchain Initializer
  • Llm Adapter
  • Message Utils
  • Middleware
  • Providers
  • Huggingface
  • Pipeline
  • Streamers
  • Providers
  • Trtllm
  • Client
  • Llm
  • Runnable Rails
  • Utils
  • Kb
  • Kb
  • Utils
  • Library
  • Activefence
  • Actions
  • Ai Defense
  • Actions
  • Autoalign
  • Actions
  • Clavata
  • Actions
  • Errs
  • Request
  • Utils
  • Cleanlab
  • Actions
  • Content Safety
  • Actions
  • Crowdstrike Aidr
  • Actions
  • Factchecking
  • Align Score
  • Actions
  • Request
  • Server
  • Fiddler
  • Actions
  • Gcp Moderate Text
  • Actions
  • Gliner
  • Actions
  • Models
  • Request
  • Guardrails Ai
  • Actions
  • Errors
  • Registry
  • Hallucination
  • Actions
  • Hf Classifier
  • Actions
  • Backends
  • Injection Detection
  • Actions
  • Yara Config
  • Jailbreak Detection
  • Actions
  • Heuristics
  • Checks
  • Model Based
  • Checks
  • Models
  • Request
  • Server
  • Llama Guard
  • Actions
  • Pangea
  • Actions
  • Patronusai
  • Actions
  • Policyai
  • Actions
  • Privateai
  • Actions
  • Request
  • Prompt Security
  • Actions
  • Regex
  • Actions
  • Self Check
  • Facts
  • Actions
  • Input Check
  • Actions
  • Output Check
  • Actions
  • Sensitive Data Detection
  • Actions
  • Topic Safety
  • Actions
  • Trend Micro
  • Actions
  • Llm
  • Cache
  • Interface
  • Lfu
  • Utils
  • Clients
  • Base
  • Constants
  • Openai Compatible
  • Constants
  • Filters
  • Frameworks
  • Default
  • Registry
  • Openai Reasoning
  • Output Parsers
  • Prompts
  • Providers
  • Taskmanager
  • Types
  • Logging
  • Explain
  • Llm Tracker
  • Processing Log
  • Simplify Formatter
  • Stats
  • Verbose
  • Patch Asyncio
  • Rails
  • Llm
  • Buffer
  • Config
  • Llmrails
  • Options
  • Utils
  • Registry
  • Server
  • Api
  • App
  • Datastore
  • Datastore
  • Memory Store
  • Redis Store
  • Singleton
  • Streaming
  • Telemetry
  • Testing
  • Chat Harness
  • Fake Model
  • Fixtures
  • Tracing
  • Adapters
  • Base
  • Filesystem
  • Opentelemetry
  • Registry
  • Constants
  • Interaction Types
  • Span Extractors
  • Span Format
  • Span Formatting
  • Spans
  • Tracer
  • Types
  • Utils
  • CLI
  • Create Chat Completion
  • List Models
  • List Guardrails Configurations
  • List Challenges
  • Get Server Health or Chat UI
  • Migrating to 0.22
  • Troubleshooting
  • Security
  • Telemetry and Privacy
  • Research
Run Guardrailed Inference

Use the Guardrails API Server

||View as Markdown|

The NeMo Guardrails library includes the Guardrails API server that exposes guardrails through an HTTP API. This section covers how to run the server and interact with it.

Overview

The Guardrails API server is a tool for running guardrails in a secure, isolated environment.

Concept
Run the Server

Start the Guardrails API server, configure CORS, and enable auto-reload.

Tutorial
Chat Completions

Send chat requests, use streaming, and manage conversation threads.

Tutorial
List Configurations

Retrieve available guardrails configurations from the server.

Reference
List Models

Query the available LLM models from the configured provider.

Reference
Actions Server

Run guardrail actions in a secure, isolated environment.

Tutorial
Previous

Event-Based API

Next

Overview of the NVIDIA NeMo Guardrails Library API Server

NVIDIANVIDIA
Developer-friendly docs for your API
Privacy Policy | Your Privacy Choices | Terms of Service | Accessibility | Corporate Policies | Product Security | Contact

Copyright © 2026, NVIDIA Corporation.