Core Colang Concepts
This guide builds on the Hello World guide and introduces the core Colang concepts you should understand to get started with NeMo Guardrails.
Prerequisites
This “Hello World” guardrails configuration uses the OpenAI gpt-4o-mini model.
- Install the
openaipackage:
- Set the
OPENAI_API_KEYenvironment variable:
- If you’re running this inside a notebook, patch the AsyncIO loop.
What is Colang?
Colang is a modeling language for conversational applications. Use Colang to design how the conversation between a user and a bot should happen.
Throughout this guide, bot means the entire LLM-based Conversational Application.
Core Concepts
In Colang, the two core concepts are messages and flows.
Messages
In Colang, a conversation is modeled as an exchange of messages between a user and a bot. An exchanged message has an utterance, such as “What can you do?”, and a canonical form, such as ask about capabilities. A canonical form is a paraphrase of the utterance to a standard, usually shorter, form.
Using Colang, you can define the user messages that are important for your LLM-based application. For example, in the “Hello World” example, the express greeting user message is defined as:
The express greeting represents the canonical form and “Hello”, “Hi” and “Wassup?” represent example utterances. The role of the example utterances is to teach the bot the meaning of a defined canonical form.
You can also define bot messages, such as how the bot should converse with the user. For example, in the “Hello World” example, the express greeting and ask how are you bot messages are defined as:
If more than one utterance is given for a canonical form, the bot uses a random utterance whenever the message is used.
If you are wondering whether user message canonical forms are the same as classical intents, the answer is yes. You can think of them as intents. However, when using them, the bot is not constrained to use only the pre-defined list.
Flows
In Colang, flows represent patterns of interaction between the user and the bot. In their simplest form, they are sequences of user and bot messages. In the “Hello World” example, the greeting flow is defined as:
This flow instructs the bot to respond with a greeting and ask how the user is feeling every time the user greets the bot.
Guardrails
Messages and flows provide the core building blocks for defining guardrails, or rails for short. The previous greeting flow is in fact a rail that guides the LLM how to respond to a greeting.
How does it work?
This section answers the following questions:
- How are the user and bot message definitions used?
- How is the LLM prompted and how many calls are made?
- Can I use bot messages without example utterances?
Let’s use the following greeting as an example.
The ExplainInfo class
To get information about the LLM calls, call the explain function of the LLMRails class.
Colang History
Use the colang_history function to retrieve the history of the conversation in Colang format. This shows us the exact messages and their canonical forms:
LLM Calls
Use the print_llm_calls_summary function to list a summary of the LLM calls that have been made:
The info object also contains an info.llm_calls attribute with detailed information about each LLM call. That attribute is described in a subsequent guide.
The process
Once an input message is received from the user, a multi-step process begins.
Step 1: Compute the canonical form of the user message
After an utterance, such as “Hello!” in the previous example, is received from the user, the guardrails instance uses the LLM to compute the corresponding canonical form.
NeMo Guardrails uses a task-oriented interaction model with the LLM. Every time the LLM is called, it uses a specific task prompt template, such as generate_user_intent, generate_next_step, generate_bot_message. See the default template prompts for details.
In the case of the “Hello!” message, a single LLM call is made using the generate_user_intent task prompt template. The prompt looks like the following:
The prompt has four logical sections:
-
A set of general instructions. These can be configured using the
instructionskey in config.yml. -
A sample conversation, which can also be configured using the
sample_conversationkey in config.yml. -
A set of examples for converting user utterances to canonical forms. The top five most relevant examples are chosen by performing a vector search against all the user message examples. For more details see ABC Bot.
-
The current conversation preceded by the first two turns from the sample conversation.
For the generate_user_intent task, the LLM must predict the canonical form for the last user utterance.
As we can see, the LLM correctly predicted the express greeting canonical form. It even went further to predict what the bot should do, which is bot express greeting, and the utterance that should be used. However, for the generate_user_intent task, only the first predicted line is used. If you want the LLM to predict everything in a single call, you can enable the single LLM call option in config.yml by setting the rails.dialog.single_call key to True.
Step 2: Determine the next step
After the canonical form for the user message has been computed, the guardrails instance needs to decide what should happen next. There are two cases:
-
If there is a flow that matches the canonical form, then it is used. The flow can decide that the bot should respond with a certain message, or execute an action.
-
If there is no flow, the LLM is prompted for the next step using the
generate_next_steptask.
In our example, there was a match from the greeting flow and the next steps are:
Step 3: Generate the bot message
Once the canonical form for what the bot should say has been decided, the message must be generated. There are two cases:
-
If a predefined message is found, the exact utterance is used. If more than one example utterances are associated with the same canonical form, a random one is used.
-
If a predefined message does not exist, the LLM is prompted to generate the message using the
generate_bot_messagetask.
In our “Hello World” example, the predefined messages “Hello world!” and “How are you doing?” are used.
The follow-up question
In the previous example, the LLM is prompted once. The following figure provides a summary of the outlined sequence of steps:

Let’s examine the same process for the follow-up question “What is the capital of France?”.
Let’s check the colang history:
And the LLM calls:
Based on these steps, we can see that the ask general question canonical form is predicted for the user utterance “What is the capital of France?”. Since there is no flow that matches it, the LLM is asked to predict the next step, which in this case is bot response for general question. Also, since there is no predefined response, the LLM is asked a third time to predict the final message.

Wrapping up
This guide provides a detailed overview of two core Colang concepts: messages and flows. It also looked at how the message and flow definitions are used under the hood and how the LLM is prompted. For more details, see the reference documentation for the Python API and the Colang Language Syntax.
Next
The next guide, Demo Use Case, guides you through selecting a demo use case to implement different types of rails, such as for input, output, or dialog.