Guardrail Types

View as Markdown

The NeMo Guardrails library applies guardrails at multiple stages of the LLM interaction. Input rails apply guardrails before the LLM is called by validating and sanitizing user inputs. Retrieval rails filter and validate retrieved knowledge (documents and chunks) to ensure only trusted context is provided to the LLM. Dialog rails steer and constrain the multi‑turn conversation, enforcing flow logic and policies across turns. Execution rails control and validate tool/function calls, their arguments, and results to safely interact with external systems. Output rails evaluate and post‑process model responses, filtering, editing, or blocking unsafe or off‑policy content before it reaches users.

Input and Output rails are the most common.

StageRail TypeCommon Use Cases
Before LLMInput railsContent safety, jailbreak detection, topic control, PII masking
RAG pipelineRetrieval railsDocument filtering, chunk validation
ConversationDialog railsFlow control, guided conversations
Tool callsExecution railsAction input/output validation
After LLMOutput railsResponse filtering, fact checking, sensitive data removal

Programmable Guardrails Flow

Use Cases and Applicable Rails

The following table summarizes which rail types apply to each use case.

Use CaseInputRetrievalDialogExecutionOutput
Content Safety
Jailbreak Protection
Topic Control
PII Detection
Knowledge Base / RAG
Agentic Security
Custom Rails