Input Rails
This topic demonstrates how to add input rails to a guardrails configuration. As discussed in the previous guide, Demo Use Case, this topic guides you through building the ABC Bot.
Prerequisites
- Install the
openaipackage:
- Set the
OPENAI_API_KEYenvironment variable:
- If you’re running this inside a notebook, patch the AsyncIO loop.
Config Folder
Create a config folder with a config.yml file with the following content that uses the gpt-4o-mini model:
General Instructions
Configure the general instructions for the bot. You can think of them as the system prompt. For details, see the Configuration Reference. These instructions configure the bot to answer questions about the employee handbook and the company’s policies.
Add the following content to config.yml to create a general instruction:
In the snippet above, we instruct the bot to answer questions about the employee handbook and the company’s policies.
Sample Conversation
Another option to influence how the LLM responds to a sample conversation. The sample conversation sets the tone for the conversation between the user and the bot. The sample conversation is included in the prompts, which are shown in a subsequent section. For details, see the Configuration Reference.
Add the following to config.yml to create a sample conversation:
Testing without Input Rails
To test the bot, provide it with a greeting similar to the following:
Get a summary of the LLM calls that have been made:
The summary shows that a single call was made to the LLM using the prompt for the task general. In contrast to the Core Colang Concepts guide, where the generate_user_intent task is used as a first phase for each user message, if no user canonical forms are defined for the Guardrails configuration, the general task is used instead. Take a closer look at the prompt and the completion:
As expected, the LLM is prompted with the general instructions and the user’s input. The next section adds an input rail, preventing the LLM to respond to certain jailbreak attempts.
Jailbreak Attempts
In LLMs, jail-breaking refers to finding ways to circumvent the built-in restrictions or guidelines set by the model’s developers. These restrictions are usually in place for ethical, legal, or safety reasons. For example, what happens if you instruct the ABC Bot to ignore previous instructions:
NOTE: this jailbreak attempt does not work 100% of the time. If you’re running this and getting a different result, try a few times, and you should get a response similar to the previous.
Allowing the LLM to comply with this type of request is something we don’t want. To prevent jailbreak attempts like this, you can add an input rail that can process the user input before it is sent to the LLM. NeMo Guardrails comes with a built-in self check input rail that uses a separate LLM query to detect a jailbreak attempt. To use it, you have to:
- Activate the
self check inputrail in config.yml. - Add a
self_check_inputprompt in prompts.yml.
Activate the rail
To activate the rail, include the self check input flow name in the input rails section of the config.yml file:
- The top-level
railskey configures the rails that are active in a guardrails configuration. - The
inputsub-key configures the input rails. Other valid sub-keys areoutput,retrieval,dialogandexecution, which are used in some of the following guides. - The
flowskeys contains the name of the flows that is used as input rails. self check inputis the name of a pre-defined flow that implements self-check input checking.
All the rails in NeMo Guardrails are implemented as flows. For example, you can find the self_check_input flow here.
The flows implementing input rails can call actions, such as execute self_check_input, instruct the bot to respond in a certain way, such as bot refuse to respond, and even stop any further processing for the current user request.
Add a prompt
The self-check input rail needs a prompt to perform the check.
Add the following content to prompts.yml to create a prompt for the self-check input task:
Using the Input Rails
Let’s reload the configuration and try the question again.
As you can see, the self_check_input LLM call has been made. The prompt and the completion were the following:
The following figure depicts in more details how the self-check input rail works:

The self check input rail calls the self_check_input action, which in turn calls the LLM using the self_check_input task prompt.
Here is a question that the LLM should answer:
In this case two LLM calls were made: one for the self_check_input task and one for the general task. The check_input was not triggered:
Because the input rail was not triggered, the flow continued as usual.

Note that the final answer is not correct.
Testing the Bot
You can also test this configuration in an interactive mode using NeMo Guardrails CLI Chat.
NOTE: make sure you are in the folder containing the config folder. Otherwise, you can specify the path to the config folder using the
--config=PATH/TO/CONFIGoption.
Feel free to experiment with various inputs that should or should not trigger the jailbreak detection.
More on Input Rails
Input rails also have the ability to alter the message from the user. By changing the value for the $user_message variable, the subsequent input rails and dialog rails work with the updated value. This can be useful, for example, to mask sensitive information. For an example of this behavior, checkout the Sensitive Data Detection rails.
Next
The next guide, Output Rails, adds output moderation to the bot.