RunnableRails
This guide demonstrates how to integrate the NeMo Guardrails library into LangChain applications using the RunnableRails class. The class implements the full Runnable Protocol with comprehensive support for synchronous and asynchronous operations, streaming, and batch processing.
Overview
RunnableRails provides a complete LangChain-native interface that wraps guardrail configurations around LLMs or entire chains. It supports all Runnable methods including invoke(), ainvoke(), stream(), astream(), batch(), and abatch() with full metadata preservation.
Getting Started
To get started, load a guardrail configuration and create a RunnableRails instance.
To add guardrails around an LLM model inside a chain, wrap the LLM model with a RunnableRails instance. For example, (guardrails | ...).
The following is an example of using a prompt, model, and output parser:
Add guardrails around the LLM model in the example above with the following code:
Using the extra parenthesis is essential to enforce the order in which the | (pipe) operator is applied.
To add guardrails to an existing chain or any Runnable, wrap it similarly.
You can also use the same approach to add guardrails only around certain parts of your chain. The following example from the RunnableBranch Documentation adds guardrails around the "anthropic" and "general" branches inside a RunnableBranch.
In general, you can wrap any part of a runnable chain with guardrails.
Streaming Support
RunnableRails provides full streaming support with both synchronous and asynchronous methods. This enables responsive applications that stream LLM outputs as they are generated.
Metadata in Streaming: RunnableRails preserves all metadata during streaming, including response_metadata, usage_metadata, and additional_kwargs in AIMessageChunk objects.
Batch Processing
RunnableRails supports efficient batch processing for multiple inputs. The following example shows how to use the batch and abatch methods.
Input/Output Formats
RunnableRails intelligently handles various input and output formats with automatic transformation.
LLM Wrapping Formats
Chain Wrapping Formats
Metadata Preservation
RunnableRails maintains complete metadata compatibility with LangChain components. All AIMessage responses include the following:
response_metadata: Token usage, model info, finish reasons.usage_metadata: Input/output token counts, total tokens.additional_kwargs: Custom fields from the LLM provider.id: Unique message identifiers.tool_calls: Tool call information when applicable.
This ensures seamless integration with LangChain components that depend on message metadata.
Configuration Options
Passthrough Mode
The role of a guardrail configuration is to validate user input, check LLM output, and guide the LLM model on how to respond. See the Configuration Guide for more details on the different types of rails.
To achieve this, the guardrail configuration might make additional calls to the LLM or other models/APIs (for example, for fact-checking and content moderation).
By default, when the guardrail configuration decides that it is safe to prompt the LLM, it uses the exact prompt that was provided as the input, such as a string, StringPromptValue or ChatPromptValue. However, to enforce specific rails, for example, dialog rails, general instructions, the guardrails configuration needs to alter the prompt used to generate the response.
The passthrough parameter controls this behavior.
passthrough=True(default): Uses the exact input prompt with minimal guardrail intervention.passthrough=False: Allows guardrails to modify prompts for enhanced protection.
Tool Calling Requirement: Set passthrough=True for proper tool call handling.
Custom Input/Output Keys
When you use a guardrail configuration to wrap a chain or a Runnable, the input and output are either dictionaries or strings. However, a guardrail configuration always operates on a text input from the user and a text output from the LLM. To achieve this, when dictionaries are used, one of the keys from the input dictionary must be designated as the "input text" and one of the keys from the output as the "output text".
By default, these keys are input and output. To customize these keys, provide the input_key and output_key parameters when creating the RunnableRails instance.
The following examples show how to customize the input and output keys with “question” and “answer” keys.
When a guardrail is triggered and predefined messages must be returned instead of the output from the LLM, only a dictionary with the output key is returned.
Tool Calling
RunnableRails supports LangChain tool calling with full metadata preservation and streaming. Tool calling requires passthrough=True to work properly.
The following steps are required to use tool calling with RunnableRails:
- Set
passthrough=Truewhen creatingRunnableRailsinstance. - Use
bind_tools()to attach tools to your model. - Handle tool execution in your application logic.
Basic Tool Setup
Two-Call Tool Pattern
The standard flow for two-call tool calling is to get tool calls, execute them, and synthesize results.
Single-Call with Pre-processed Messages
Use single-call tool calling when you already have a complete message history with tool results.
Composition and Chaining
RunnableRails integrates with complex LangChain compositions. The following example shows how to use RunnableRails with a conditional branching chain.
Key Benefits of RunnableRails:
- Maintains full Runnable protocol compatibility.
- Preserves metadata throughout the chain.
- Supports all async/sync operations.
- Works with streaming and batch processing.