Configuration Structure#
A guardrail configuration contains several properties that customize how the service interacts with models and applies safety checks. This page describes the core components (models, prompts, and rails) as well as advanced options for tracing, passthrough, and other behaviors.
Models#
A model configuration defines the LLM to use for a specific task. It consists of the following fields:
- `type`: The task the model is used for.
- `engine`: The model provider. For most cases, use `nim`.
- `model`: The name of the model to use for the task. For the `main` model, an incoming Guardrails request can override the `model`.
- `mode`: The completion mode. Allowed values are `"chat"` (default) or `"text"`.
- `cache`: Cache configuration for this model. Primarily used for content safety models to cache repeated checks. Contains `enabled` (default: `false`), `maxsize` (default: `50000`), and `stats` sub-fields.
- `parameters`: Additional properties that configure how the service interacts with the model. The following fields are supported for all model types:
  - `base_url`: The URL to use for inference with this model. When using models deployed through the Inference Gateway, NeMo Guardrails automatically routes requests through the Inference Gateway, so you do not need to set a base URL explicitly.
  - `default_headers`: Custom HTTP headers to include in requests to this model. Each key-value pair represents a header name (key) and its default value (value). At inference time, you can override the default value for a header by including it in the request headers.
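Putting these fields together, a single model entry might look like the following sketch. The model name, header name, and values are illustrative placeholders, not required or recommended settings:

```python
# A hypothetical "main" model entry exercising the optional mode, cache,
# and parameters fields described above. Names and values are placeholders.
main_model = {
    "type": "main",
    "engine": "nim",
    "model": "meta/llama-3.1-8b-instruct",  # can be overridden per request
    "mode": "chat",                          # "chat" (default) or "text"
    "cache": {
        "enabled": False,   # caching is mainly useful for content safety models
        "maxsize": 50000,
    },
    "parameters": {
        "default_headers": {
            # Callers can override this header at inference time.
            "X-Request-Source": "guardrails-demo",
        },
    },
}
```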
The main model is the model that an end user uses for chat and chat-like interactions.
You can also configure task-specific models for any task that occurs during the guardrail process. The following are common tasks that can be configured:
- `content_safety`: Content Safety check for detecting harmful content.
- `topic_control`: Topic Control check for keeping conversations on-topic.
- `jailbreak`: Jailbreak detection check.
- `self_check_input`: Safety check that automatically checks the user input using the `main` model for inference.
- `self_check_output`: Safety check that automatically checks the final LLM output using the `main` model for inference.
models = [
{
"type": "content_safety",
"engine": "nim",
"model": "default/nvidia-llama-3-1-nemotron-safety-guard-8b-v3"
},
{
"type": "topic_control",
"engine": "nim",
"model": "default/nvidia-llama-3-1-nemoguard-8b-topic-control"
}
]
Tip
Model Entities are automatically created when you:
Deploy a NIM through the Inference Gateway (refer to Deploy Models)
Create a Model Provider pointing to an external API (refer to Deploy Models)
Use sdk.models.list() to retrieve available Model Entities in your workspace.
Using Direct URLs#
Using Inference Gateway is the recommended approach for interacting with models. If you require a direct connection to specific endpoints, you can explicitly set parameters.base_url:
models = [
{
"type": "main",
"engine": "nim",
"parameters": {
"base_url": "http://my-local-nim:8000/v1"
}
},
{
"type": "content_safety",
"engine": "nim",
"model": "nvidia/llama-3.1-nemotron-safety-guard-8b-v3",
"parameters": {
"base_url": "http://my-content-safety-nim:8000/v1"
}
}
]
Prompts#
A prompt is used by the model during a task to evaluate a message. It consists of the following fields:
- `task`: The task to apply the prompt to.
- `content`: The content of the prompt. Mutually exclusive with `messages`. Prompts that require dynamic input variables use Jinja2 templating. For example, the `{{ user_input }}` variable is replaced with the end user's input at runtime.
- `messages`: A list of messages for chat-model prompts. Mutually exclusive with `content`. Each message has `type` (such as `"system"` or `"user"`) and `content` fields.
- `output_parser`: Name of the output parser to process the model's response.
- `max_tokens`: Maximum number of tokens the model can generate.
- `max_length`: Maximum prompt length in characters. When the maximum length is exceeded, the prompt is truncated by removing older turns from the conversation history until the prompt fits within the limit. The default is 16,000 characters.
- `models`: Restricts this prompt to specific LLM engines or models. Format: a list of strings such as `"<engine>"` or `"<engine>/<model>"`.
- `mode`: The prompting mode for this prompt. Defaults to the top-level `prompting_mode` value (typically `"standard"`).
- `stop`: A list of stop tokens for models that support this feature.
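For chat models, the `messages` form replaces `content`. The sketch below is illustrative only: the task name reuses `self_check_input` from this page, but the message text is a placeholder, not a prompt shipped with the service:

```python
# Hypothetical chat-style prompt using "messages" instead of "content".
# The system and user text are placeholders for your own prompt wording.
prompt = {
    "task": "self_check_input",
    "messages": [
        {"type": "system", "content": "You review user messages for policy violations."},
        # Jinja2 variables such as {{ user_input }} are substituted at runtime.
        {"type": "user", "content": 'User message: "{{ user_input }}"\nIs this allowed? Answer yes or no.'},
    ],
    "max_tokens": 10,
    "stop": ["\n"],
}
```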
For Content Safety and Topic Control checks, prompts must include the model reference in the task name:
prompts = [
{
"task": "content_safety_check_input $model=content_safety",
"content": "Task: Check for unsafe content...",
"output_parser": "nemoguard_parse_prompt_safety",
"max_tokens": 50
},
{
"task": "topic_safety_check_input $model=topic_control",
"content": "Ensure the user messages meet the following guidelines: ...",
"max_tokens": 50
}
]
Prompt Template Variables#
The following dynamic variables can be used in the prompt content:
| Variable | Description |
|---|---|
| `{{ user_input }}` | Current user message |
| `{{ bot_response }}` | Current bot response (for output rails) |
| `{{ history }}` | Conversation history |
| `{{ relevant_chunks }}` | Retrieved knowledge base chunks (for retrieval rails) |
| `{{ context }}` | Additional context variables |
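At runtime the service renders these placeholders with Jinja2. The snippet below imitates only the simple variable substitution using the standard library, to show the effect; the real rendering supports full Jinja2 syntax (filters, loops), which this sketch does not:

```python
import re

def render(template: str, variables: dict) -> str:
    """Replace {{ name }} placeholders with values (simplified Jinja2-style substitution)."""
    return re.sub(
        r"\{\{\s*(\w+)\s*\}\}",
        lambda m: str(variables.get(m.group(1), m.group(0))),
        template,
    )

rendered = render(
    "Task: Check the user message.\nUser: {{ user_input }}",
    {"user_input": "What is your refund policy?"},
)
print(rendered)
# → Task: Check the user message.
# → User: What is your refund policy?
```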
Rails#
Rails specify which flows to apply to the user input and LLM output. Rails are organized into five categories, based on when they trigger during the guardrails process.
| Category | Trigger Point | Purpose |
|---|---|---|
| Input Rails | When user input is received | Validate, filter, or modify user input |
| Retrieval Rails | After RAG retrieval completes | Process retrieved chunks |
| Dialog Rails | After the canonical form is computed | Control conversation flow |
| Execution Rails | Before and after action execution | Control tool and action calls |
| Output Rails | When the LLM generates output | Validate, filter, or modify bot responses |
Each rail category supports a flows list and, optionally, a parallel flag to execute those flows concurrently.
The following example defines the flows to run as input and output rails.
rails = {
"input": {
"flows": ["self check input"],
"parallel": False
},
"output": {
"flows": ["self check output"],
"parallel": False,
"streaming": {
"enabled": False,
"chunk_size": 200,
"context_size": 50,
"stream_first": True
},
"apply_to_reasoning_traces": False
}
}
In addition to input and output, the rails object supports the following rail types:
- `retrieval`: Applied to retrieved chunks in RAG deployments. Supports `flows`.
- `dialog`: Applied after the canonical form is computed. Supports `flows`.
- `actions`: Controls tool and action execution. Supports `instant_actions` (a list of actions that finish instantly).
- `tool_input`: Applied to tool calls before they are executed. Supports `flows` and `parallel`.
- `tool_output`: Applied to tool results before they are processed. Supports `flows` and `parallel`.
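As an illustrative sketch, a configuration combining several of these rail types might look like the following. The flow and action names are placeholders for flows you define yourself:

```python
# Hypothetical rails object exercising the additional rail types.
# All flow and action names below are placeholders.
rails = {
    "retrieval": {"flows": ["check retrieved chunks"]},
    "dialog": {"flows": ["guide conversation"]},
    "actions": {
        # Actions listed here are treated as finishing instantly.
        "instant_actions": ["log_event"],
    },
    "tool_input": {"flows": ["validate tool call"], "parallel": False},
    "tool_output": {"flows": ["scan tool result"], "parallel": True},
}
```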
Output Rails Streaming#
Output rails support a streaming configuration for processing LLM tokens in chunks:
- `enabled`: Enables streaming mode (default: `false`).
- `chunk_size`: Number of tokens per processing chunk (default: `200`).
- `context_size`: Number of tokens carried over from the previous chunk for continuity (default: `50`).
- `stream_first`: If `true`, token chunks are streamed before output rails are applied (default: `true`).
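The interplay of `chunk_size` and `context_size` can be pictured as a sliding window over the token stream: each new chunk is checked together with the tail of the previous chunk. The following is a simplified sketch of that windowing, not the service's actual implementation:

```python
def stream_windows(tokens, chunk_size=200, context_size=50):
    """Yield (context + chunk) windows the way output rails see streamed tokens."""
    windows, start = [], 0
    while start < len(tokens):
        context = tokens[max(0, start - context_size):start]  # tail of previous chunk
        chunk = tokens[start:start + chunk_size]
        windows.append(context + chunk)
        start += chunk_size
    return windows

# With 6 tokens, chunk_size=4, context_size=2: the second window
# re-checks the last 2 tokens of the first chunk for continuity.
print(stream_windows(list("abcdef"), chunk_size=4, context_size=2))
# → [['a', 'b', 'c', 'd'], ['c', 'd', 'e', 'f']]
```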
Rails-specific Configuration#
Some rails require additional configuration, which you specify in the `config` key. The following integrations are supported:
- `jailbreak_detection`: Threshold-based jailbreak detection.
- `injection_detection`: Prompt injection detection.
rails = {
"input": {
"flows": ["self check input"],
},
"output": {
"flows": ["self check output"],
},
"config": {
# Configures jailbreak detection settings
"jailbreak_detection": {
"length_per_perplexity_threshold": 89.79,
"prefix_suffix_perplexity_threshold": 1845.65
}
}
}
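An `injection_detection` entry follows the same pattern. The option names below (`injections`, `action`) and the category values are assumptions based on the open-source NeMo Guardrails injection detection options, so verify them against your deployed version:

```python
# Hypothetical injection detection settings under the "config" key.
# Option names and category values are assumptions; confirm before use.
rails = {
    "input": {"flows": ["injection detection"]},
    "config": {
        "injection_detection": {
            # Injection categories to scan for (assumed option names).
            "injections": ["code", "sqli", "template", "xss"],
            # "reject" refuses the message outright.
            "action": "reject",
        }
    },
}
```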
General Instructions#
Instructions provide context to the model about expected behavior. They are added to the beginning of every prompt, similar to a system prompt.
instructions = [
{
"type": "general",
"content": """You are a customer service bot for ABC Company.
You answer questions about products and policies.
If you don't know an answer, say so honestly.
Always be polite and professional."""
}
]
Sample Conversation#
The sample conversation sets the tone for conversations between the user and the bot. It helps the LLM learn the format, tone, and verbosity of responses. Include a minimum of two turns. The sample conversation is appended to every prompt; keep it short and relevant.
sample_conversation = """user "Hi there. Can you help me with some questions I have about the company?"
express greeting and ask for assistance
bot express greeting and confirm and offer assistance
"Hi there! I'm here to help answer any questions you may have about the ABC Company. What would you like to know?"
user "What's the company policy on paid time off?"
ask question about benefits
bot respond to question about benefits
"The ABC Company provides eligible employees with up to two weeks of paid vacation time per year, as well as five paid sick days per year. Please refer to the employee handbook for more information."
"""
Advanced Options#
The guardrail configuration supports additional top-level fields for fine-tuning behavior. All fields below are optional and have sensible defaults.
| Field | Default | Description |
|---|---|---|
| `actions_server_url` | `None` | URL of the actions server used by the rails engine to execute custom actions. |
| `prompting_mode` | `"standard"` | Selects the prompting strategy. Custom prompts can target a specific mode using the prompt-level `mode` field. |
| `lowest_temperature` | `0.001` | The lowest temperature used for tasks that require deterministic output. Some models do not support a temperature of `0`, so a small positive value is used instead. |
| `enable_multi_step_generation` | `False` | Enables multi-step generation for highly capable LLMs. Use with caution; recommended only for models comparable to GPT-3.5-turbo-instruct or newer. |
| `colang_version` | `"1.0"` | The version of the Colang dialogue language to use. |
| `custom_data` | `{}` | A dictionary for arbitrary configuration data that custom actions or integrations can reference at runtime. |
| `passthrough` | `None` | When `True`, user input is sent to the LLM without prompt modifications such as general instructions or dialog rails. |
| `enable_rails_exceptions` | `False` | When `True`, rails return structured exceptions instead of predefined refusal messages. |
| `tracing` | Refer to the warning below | Configuration object for OpenTelemetry-compatible tracing. Contains `enabled` and related sub-fields such as `enable_content_capture`. |
Warning
Enabling `tracing.enable_content_capture` causes prompts and responses (including user, assistant, and tool message content) to be included in telemetry events. This can expose PII and sensitive data in your telemetry backend.
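Both fields named in the warning above can be set on the top-level `tracing` object. The sketch below assumes only the two fields mentioned on this page; your deployment may expose additional tracing sub-fields:

```python
# Hypothetical top-level tracing settings; content capture stays off
# so prompts and responses are not exported to the telemetry backend.
tracing = {
    "enabled": True,
    "enable_content_capture": False,  # avoid exporting PII and sensitive data
}
```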
Next Steps#
Refer to Manage Configurations to create, update, and manage your guardrail configurations.