For AI agents: a documentation index is available at the root level at /llms.txt and /llms-full.txt. Append /llms.txt to any URL for a page-level index, or .md for the markdown version of any page.
LogoLogoNeMo Guardrails
      • Overview
      • How It Works
      • Guardrail Types
      • Supported LLMs
      • Release Notes
      • Installation
      • Tutorials
      • Integrate Guardrails
      • About Configuring Guardrails
      • Configuration Overview
      • Pre-Configuration Checklist
      • Configuring YAML File
      • YAML Schema Reference
      • Guardrail Catalog
      • Custom Actions
      • Custom Initialization
      • Colang
      • Other Configurations
      • Caching
      • Exceptions
      • About Running Guardrailed Inference
      • Python APIs
        • Python APIs Overview
        • Core Classes
        • Generation Options
        • Streaming
        • Check Messages
        • Event-Based API
      • Guardrails API Server
      • Evaluate Configuration
      • Evaluation Methodology
      • Evaluate Guardrails
      • Vulnerability Scanning
      • Overview
      • Logging
      • Tracing
      • Metrics
      • Deployment Options
      • Docker
      • NeMo Microservice
      • LangChain Frameworks
      • Tools Integration
      • Architecture
      • Sequence Diagrams
      • Use Case Diagrams
      • CLI
      • Migrating to 0.22
      • Troubleshooting
      • Security
      • Telemetry and Privacy
      • Research
  • Overview
  • How It Works
  • Guardrail Types
  • Supported LLMs
  • Release Notes
  • Installation
  • Tutorials
  • Check Harmful Content
  • Content Safety Reasoning
  • Restrict Topics
  • Detect Jailbreak Attempts
  • Jailbreak Heuristics
  • Add Multimodal Content Safety
  • Integrate Guardrails
  • About Configuring Guardrails
  • Configuration Overview
  • Pre-Configuration Checklist
  • Configuring YAML File
  • Models
  • Guardrails
  • Prompts
  • Tracing
  • Streaming
  • Streaming LLM Responses
  • Output Rail Streaming
  • YAML Schema Reference
  • Guardrail Catalog
  • Content Safety
  • Jailbreak Protection
  • Topic Control
  • PII Detection
  • Agentic Security
  • Hallucinations & Fact-Checking
  • LLM Self-Check
  • Third-Party APIs
  • ActiveFence
  • AlignScore
  • AutoAlign
  • Cisco AI Defense
  • Clavata
  • Cleanlab
  • CrowdStrike AIDR
  • Fiddler
  • GCP Text Moderation
  • GLiNER PII
  • GuardrailsAI
  • Llama Guard
  • Pangea AI Guard
  • Patronus Evaluate API
  • Patronus Lynx
  • PolicyAI
  • Presidio
  • Private AI
  • Prompt Security
  • Regex
  • Trend Micro
  • Custom Actions
  • Creating Actions
  • Built-in Actions
  • Action Parameters
  • Registering Actions
  • Custom Initialization
  • Init Function
  • LLM Providers
  • Custom LLM Model
  • Testing
  • Custom LLM Framework
  • Embedding Providers
  • Custom Data
  • Colang
  • Colang 2.0 Guide
  • What's Changed
  • Migration Guide
  • Getting Started
  • Hello World
  • Interaction Loop
  • Input Rails
  • Dialog Rails
  • LLM Flows
  • Multimodal Rails
  • Recommended Next Steps
  • Language Reference
  • Introduction
  • Event Generation and Matching
  • Working with Actions
  • Defining Flows
  • Working with Variables and Expressions
  • Flow Control
  • Colang Standard Library
  • Core Flows
  • Timing Flows
  • LLM Flows
  • Avatar Flows
  • Guardrail Flows
  • Attention Flows
  • Make Use of LLMs
  • More on Flows
  • Python Actions
  • Development and Debugging
  • Colang 1.0 Guide
  • Colang 1.0 Language Syntax
  • Colang 1.0 Tutorials
  • Hello World
  • Core Colang Concepts
  • Demo Use Case
  • Input Rails
  • Output Rails
  • Topical Rails
  • Retrieval-Augmented Generation
  • Guardrailing Bot Reasoning Content
  • Colang Usage Examples
  • Bot Message Instructions
  • Extract User-provided Values
  • Other Configurations
  • Knowledge Base
  • Embedding Search Providers
  • Caching
  • Memory Model Cache
  • KV Cache Reuse
  • Exceptions
  • About Running Guardrailed Inference
  • Python APIs
  • Python APIs Overview
  • Core Classes
  • Generation Options
  • Streaming
  • Check Messages
  • Event-Based API
  • Guardrails API Server
  • Overview
  • Run the Server
  • Chat Completions
  • List Configurations
  • List Models
  • Actions Server
  • Evaluate Configuration
  • Evaluation Methodology
  • Evaluate Guardrails
  • Vulnerability Scanning
  • Overview
  • Logging
  • Tracing
  • Quick Start
  • Adapters
  • OpenTelemetry
  • OpenTelemetry Logs
  • Troubleshooting
  • Metrics
  • Enable Guardrails Metrics
  • OpenTelemetry Metrics Integration
  • Metric Reference
  • Deployment Options
  • Docker
  • NeMo Microservice
  • LangChain Frameworks
  • LangChain Integration
  • Agent Middleware
  • RunnableRails
  • Chain with Guardrails
  • Runnable as Action
  • LangGraph
  • Tools Integration
  • Architecture
  • Sequence Diagrams
  • Use Case Diagrams
  • Nemoguardrails
  • Actions
  • Action Dispatcher
  • Actions
  • Core
  • Llm
  • Generation
  • Utils
  • Math
  • Output Mapping
  • Retrieve Relevant Chunks
  • V2 X
  • Generation
  • Validation
  • Base
  • Filter Secrets
  • Actions Server
  • Actions Server
  • Base Guardrails
  • Cli
  • Chat
  • Debugger
  • Migration
  • Providers
  • Colang
  • Runtime
  • V1 0
  • Lang
  • Colang Parser
  • Comd Parser
  • Coyml Parser
  • Parser
  • Utils
  • Runtime
  • Eval
  • Flows
  • Runtime
  • Sliding
  • Utils
  • V2 X
  • Lang
  • Colang Ast
  • Expansion
  • Grammar
  • Load
  • Parser
  • Transformer
  • Utils
  • Runtime
  • Errors
  • Eval
  • Flows
  • Runtime
  • Serialization
  • Statemachine
  • System Functions
  • Utils
  • Context
  • Embeddings
  • Basic
  • Cache
  • Providers
  • Azureopenai
  • Base
  • Cohere
  • Fastembed
  • Google
  • Nim
  • Openai
  • Registry
  • Sentence Transformers
  • Eval
  • Check
  • Cli
  • Eval
  • Models
  • Ui
  • Chart Utils
  • Common
  • README
  • Streamlit Utils
  • Utils
  • Utils
  • Evaluate
  • Cli
  • Evaluate
  • Simplify Formatter
  • Evaluate Factcheck
  • Evaluate Hallucination
  • Evaluate Moderation
  • Evaluate Topical
  • Utils
  • Exceptions
  • Guardrails
  • Api Engine
  • Async Work Queue
  • Base Engine
  • Engine Registry
  • Guardrails
  • Guardrails Types
  • Iorails
  • Model Engine
  • Rail Action
  • Rails Manager
  • Telemetry
  • Imports
  • Integrations
  • Langchain
  • Actions
  • Actions
  • Safetools
  • Exceptions
  • Helpers
  • Langchain Initializer
  • Llm Adapter
  • Message Utils
  • Middleware
  • Providers
  • Huggingface
  • Pipeline
  • Streamers
  • Providers
  • Trtllm
  • Client
  • Llm
  • Runnable Rails
  • Utils
  • Kb
  • Kb
  • Utils
  • Library
  • Activefence
  • Actions
  • Ai Defense
  • Actions
  • Autoalign
  • Actions
  • Clavata
  • Actions
  • Errs
  • Request
  • Utils
  • Cleanlab
  • Actions
  • Content Safety
  • Actions
  • Crowdstrike Aidr
  • Actions
  • Factchecking
  • Align Score
  • Actions
  • Request
  • Server
  • Fiddler
  • Actions
  • Gcp Moderate Text
  • Actions
  • Gliner
  • Actions
  • Models
  • Request
  • Guardrails Ai
  • Actions
  • Errors
  • Registry
  • Hallucination
  • Actions
  • Hf Classifier
  • Actions
  • Backends
  • Injection Detection
  • Actions
  • Yara Config
  • Jailbreak Detection
  • Actions
  • Heuristics
  • Checks
  • Model Based
  • Checks
  • Models
  • Request
  • Server
  • Llama Guard
  • Actions
  • Pangea
  • Actions
  • Patronusai
  • Actions
  • Policyai
  • Actions
  • Privateai
  • Actions
  • Request
  • Prompt Security
  • Actions
  • Regex
  • Actions
  • Self Check
  • Facts
  • Actions
  • Input Check
  • Actions
  • Output Check
  • Actions
  • Sensitive Data Detection
  • Actions
  • Topic Safety
  • Actions
  • Trend Micro
  • Actions
  • Llm
  • Cache
  • Interface
  • Lfu
  • Utils
  • Clients
  • Base
  • Constants
  • Openai Compatible
  • Constants
  • Filters
  • Frameworks
  • Default
  • Registry
  • Openai Reasoning
  • Output Parsers
  • Prompts
  • Providers
  • Taskmanager
  • Types
  • Logging
  • Explain
  • Llm Tracker
  • Processing Log
  • Simplify Formatter
  • Stats
  • Verbose
  • Patch Asyncio
  • Rails
  • Llm
  • Buffer
  • Config
  • Llmrails
  • Options
  • Utils
  • Registry
  • Server
  • Api
  • App
  • Datastore
  • Datastore
  • Memory Store
  • Redis Store
  • Singleton
  • Streaming
  • Telemetry
  • Testing
  • Chat Harness
  • Fake Model
  • Fixtures
  • Tracing
  • Adapters
  • Base
  • Filesystem
  • Opentelemetry
  • Registry
  • Constants
  • Interaction Types
  • Span Extractors
  • Span Format
  • Span Formatting
  • Spans
  • Tracer
  • Types
  • Utils
  • CLI
  • Create Chat Completion
  • List Models
  • List Guardrails Configurations
  • List Challenges
  • Get Server Health or Chat UI
  • Migrating to 0.22
  • Troubleshooting
  • Security
  • Telemetry and Privacy
  • Research
Run Guardrailed Inference

Use the NeMo Guardrails Python APIs

||View as Markdown|

This section covers how to use the NeMo Guardrails library Python API to run guardrailed inference and integrate the guardrails into your application.

Overview

RailsConfig and LLMRails core classes for generating guarded responses.

Concept
Core Classes

RailsConfig and LLMRails class reference for loading and running guardrails.

Reference
Generation Options

Configure logging, LLM parameters, and rail selection for generation.

Reference
Streaming

Stream LLM responses in real-time with the stream_async method.

Tutorial
Check Messages

Validate messages against input and output rails using check_async and check methods.

Reference
Event-Based API

Use generate_events for low-level control over guardrails execution.

Reference
Previous

About Running Guardrailed Inference

Next

Overview of the NeMo Guardrails Python APIs

NVIDIANVIDIA
Developer-friendly docs for your API
Privacy Policy | Your Privacy Choices | Terms of Service | Accessibility | Corporate Policies | Product Security | Contact

Copyright © 2026, NVIDIA Corporation.