Make use of Large Language Models (LLM) | NVIDIA NeMo Guardrails Library Developer Guide

Introduction

At its core, Colang does not require a Large Language Model (LLM) as backend. However, many of the more advanced mechanisms in the Colang Standard Library (CSL) depend on one.

To enable the LLM backend, first configure LLM access in config.yml by adding a models section like this:

models:
  - type: main
    engine: openai
    model: gpt-4-turbo

Make sure to also define the required API access key. For example, for OpenAI you will have to set the OPENAI_API_KEY environment variable.
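A missing key usually only surfaces when the first LLM call fails, so it can help to check for it at startup. The following is a minimal Python sketch of such a check, run outside of Colang. The helper name missing_env_vars is hypothetical and not part of NeMo Guardrails; only the OPENAI_API_KEY variable name comes from the text above.

```python
import os


def missing_env_vars(required, env=None):
    """Return the names in `required` that are unset or empty in `env`.

    Hypothetical helper for illustration; not a NeMo Guardrails API.
    """
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]


if __name__ == "__main__":
    # OPENAI_API_KEY is the variable required for the openai engine.
    missing = missing_env_vars(["OPENAI_API_KEY"])
    if missing:
        print("Missing environment variables:", ", ".join(missing))
    else:
        print("All required API keys are set.")
```

For other engines, pass whichever variable names that provider requires.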

Every LLM prompt contains a default context that you can modify to fit your use case. This context heavily influences all LLM invocations. See this example configuration to get started.

Supported Models

Colang currently supports the following models out of the box:

engine: openai
model: gpt-3.5-turbo-instruct
model: gpt-3.5-turbo
model: gpt-4-turbo
model: gpt-4o
model: gpt-4o-mini

NVIDIA AI Foundry hosted NIMs:

engine: nim
model: meta/llama3-8b-instruct
model: meta/llama3-70b-instruct
model: meta/llama-3.1-8b-instruct
model: meta/llama-3.1-70b-instruct

To support other models, you need to create a set of new template prompts that take into account the specific capabilities and API of the model, and add them to your bot configuration.

Natural Language Description (NLD)

One of the main LLM generation mechanisms in Colang is the Natural Language Description (NLD), used in combination with the “generation” operator (...):

# Assign result of NLD to a variable
$world_population = ..."What is the number of people in the world? Give me a number."
# Extract a value from the current interaction context
$user_name = ..."What is the name of the user? Return 'friend' if not available."
# Extract structured information from the current interaction context
$order_information = ..."Provide the products ordered by the user in a list structure, e.g. ['product a', 'product b']"
# Use an existing variable in NLD
$response_to_user = ..."Provide a brief summary of the current order. Order Information: '{$order_information}'"

Every NLD is interpreted and replaced at runtime by the configured LLM backend and can be used in Colang to generate context-dependent values. With NLDs, you can extract values and summarize content from the conversation with the user or from the results of other sources (like a database or an external service).

NLDs together with the variable name are interpreted by the LLM directly. Depending on the LLM you use, you need to be very specific about the value you want to generate. It is good practice to always clearly specify how the response should be formatted and what type it should have, e.g., $user_name = ..."Return the user name as a single string between quotes ''. If no user name is available return 'friend'".

Alternatively, you can describe the purpose and function of a flow using a docstring-like NLD at the beginning of the flow. A standalone generation operator ... in the flow then uses the flow's NLD to infer the right flow expansion automatically:

flow main
    """You are an assistant that should talk to the user about cars.
    Politely decline to talk about anything else.
    Last user question is: "{{ question }}"
    Generate the output in the following format:
    bot say "<<the response>>"
    """
    $question = await user said something
    ...

See the example in LLM Flows for more details on how this works.

Note that there is no explicit control over the NLD response format, and sometimes it will fail to generate the expected result. Usually you can improve it by providing more explicit instructions in the NLD, e.g. “Welcome the user with a short sentence that is wrapped in quotation marks like this: ‘Hi there!’”. Another way is to check the returned value, for example with the is_str() function, to make sure that it has the expected format.
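The same defensive idea applies when a generated value is processed in Python, for example after it has been passed out of a flow. The sketch below is an illustration only and does not show how Colang evaluates NLD results internally. The helper name parse_string_list is hypothetical. It accepts text in the list structure requested in the $order_information example and falls back to a default when the LLM returned something else.

```python
import ast


def parse_string_list(raw, fallback=None):
    """Parse text such as "['product a', 'product b']" into a list of strings.

    Returns `fallback` (default: an empty list) when the text is not a
    well-formed list literal that contains only strings.
    Hypothetical helper for illustration; not a NeMo Guardrails API.
    """
    fallback = [] if fallback is None else fallback
    try:
        # literal_eval only accepts literals, so arbitrary code is rejected.
        value = ast.literal_eval(raw.strip())
    except (ValueError, SyntaxError):
        return fallback
    if isinstance(value, list) and all(isinstance(item, str) for item in value):
        return value
    return fallback


if __name__ == "__main__":
    print(parse_string_list("['product a', 'product b']"))
    print(parse_string_list("Sure! Here are the products."))
```

Returning a fallback instead of raising keeps the interaction going when the model ignores the format instruction. Whether that is preferable to failing loudly depends on the use case.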

User Intent Matching

In section Defining Flows, we have already seen how to define user intent flows. Their limitation was that they did not generalize to variations of the given user intent examples. With the help of an LLM, we can overcome this limitation and use its reasoning power to match unexpected user utterances to currently active user intent flows. To do so, import the llm standard library module and activate the flows automating intent detection and generating user intent for unhandled user utterance (GitHub link):

llm/user_intent_match_example/main.co

import core
import llm
flow main
    activate automating intent detection
    activate generating user intent for unhandled user utterance
    while True
        when user greeted
            bot say "Hi there!"
        or when user said goodbye
            bot say "Goodbye!"
        or when unhandled user intent # For any user utterance that does not match
            bot say "Thanks for sharing!"
flow user greeted
    user said "Hi" or user said "Hello"
flow user said goodbye
    user said "Bye" or user said "See you"

When running this example:

> Hi
Hi there!
> hi
Hi there!
> hallo
Hi there!
> How are you?
Thanks for sharing!
> bye bye
Goodbye!

You can see that if we have an exact match for “Hi”, for example, the LLM will not be invoked since it matches directly with one of the awaited user said flows. For any other user utterance, the activated flow generating user intent for unhandled user utterance will invoke the LLM to find a suitable user intent. If the user utterance is close enough to one of the predefined user intent flows (i.e. user greeted or user said goodbye), the related flow finishes successfully. This even lets you talk in a different language (if supported by the LLM) and still map to the correct flow. If no good match is found, the flow unhandled user intent will match.
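The order of these steps can be illustrated with a small Python sketch. It is a conceptual model of the behavior described above, not the implementation used by the Colang runtime. The function match_user_intent and the stub fake_llm are hypothetical, and exact matching is assumed to be a plain string comparison.

```python
def match_user_intent(utterance, intents, classify):
    """Return (intent_name, used_llm) for a user utterance.

    intents:  dict mapping an intent name to its example utterances.
    classify: callable standing in for the LLM; it receives the utterance
              and the intent names and returns one name or None.
    """
    # Step 1: an exact match on an example needs no LLM call.
    for name, examples in intents.items():
        if utterance in examples:
            return name, False
    # Step 2: otherwise ask the (stubbed) LLM for the closest intent.
    guess = classify(utterance, list(intents))
    if guess in intents:
        return guess, True
    # Step 3: nothing fits, so report an unhandled user intent.
    return "unhandled user intent", True


intents = {
    "user greeted": ["Hi", "Hello"],
    "user said goodbye": ["Bye", "See you"],
}


def fake_llm(utterance, names):
    # Hypothetical stand-in for a real LLM call.
    lookup = {"hallo": "user greeted", "bye bye": "user said goodbye"}
    return lookup.get(utterance)


if __name__ == "__main__":
    for text in ["Hi", "hallo", "How are you?"]:
        print(text, "->", match_user_intent(text, intents, fake_llm))
```

The second element of the result shows whether the stubbed LLM was consulted, which mirrors the point that exact matches avoid an LLM invocation.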

You might ask yourself how the LLM can know which flows are considered user intent flows. This can either be done based on the flow names by activating the flow automating intent detection to automatically detect flows starting with ‘user’, or using an explicit flow decorator to mark them independently of their names:

@meta(user_intent=True)
flow any fancy flow name
    user said "Hi" or user said "Hello"

From a semantic point of view, it makes sense to start a user intent flow with ‘user’ even if marked by a user intent meta decorator.
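The two ways of marking a user intent flow can be summarized as a simple predicate. The Python sketch below only restates the rule described above and is not taken from the library. The function name is hypothetical, and it assumes that "starting with ‘user’" means the first word of the flow name is exactly user.

```python
def is_user_intent_flow(name, meta=None):
    """Decide whether a flow counts as a user intent flow.

    name: the flow name, e.g. "user greeted".
    meta: optional dict of decorator values, e.g. {"user_intent": True}.
    Hypothetical helper for illustration; not a NeMo Guardrails API.
    """
    # An explicit decorator marks the flow regardless of its name.
    if meta and meta.get("user_intent") is True:
        return True
    # Otherwise fall back to the naming convention (assumed: first word).
    return name.split(" ", 1)[0] == "user"


if __name__ == "__main__":
    print(is_user_intent_flow("user greeted"))
    print(is_user_intent_flow("any fancy flow name", {"user_intent": True}))
    print(is_user_intent_flow("bot say"))
```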

Bot Action Generation

Just as we want to handle variations in the user input, we have seen bot intent flows that define variations of predefined bot actions. While this can be good enough for responses to expected user inputs, we would also like to handle unexpected user utterances without always replying with “Thanks for sharing!”. For this case, another flow from the Standard Library, llm continue interaction, will help us:

llm/bot_intent_generation_example/main.co

import core
import llm
flow main
    user said something
    llm continue interaction

> Hello
Hi there! How can I help you today?
> Tell me a funny story
Sure! Did you hear about the fire at the circus? It was intense!
> funny!
I'm glad you liked it! Do you want to hear another one?
> Bye
Bye! Have a great day!

You see that with this, the bot can react to any user input and respond with a suitable bot answer. This generalizes well to multimodal interactions and can be used to generate bot postures and bot gestures as well, if provided with a suitable prompting context.

The generated actions strongly depend on the current interaction context, the general prompt instructions, and sample conversation in the config.yml. Try updating them to achieve the expected results.

Basic Interaction Loop

We can now combine everything to create a basic interaction loop:

llm/interaction_loop/main.co

import core
import timing
import llm
flow main
    activate automating intent detection
    activate generating user intent for unhandled user utterance
    while True
        when unhandled user intent
            llm continue interaction
        or when user was silent 12.0
            $response = ..."A random fun fact"
            bot say $response
        or when user expressed greeting
            bot say "Hi there!"
        or when user expressed goodbye
            bot inform "That was fun. Goodbye"
flow user expressed greeting
    user said "hi"
        or user said "hello"
flow user expressed goodbye
    user said "goodbye"
        or user said "I am done"
        or user said "I have to go"

This loop will take care of matching user utterances to predefined user intents if possible (e.g. user expressed greeting or user expressed goodbye) or generate a suitable response to unexpected user intents using the flow llm continue interaction. Furthermore, if the user does not say anything for more than 12 seconds, the bot will say a random fun fact generated through an NLD.

Guardrailing

Check out the examples in the Getting Started section or refer to the NeMo Guardrails documentation to learn more about how Colang can be used to guardrail LLM responses and user inputs.