Make use of Large Language Models (LLMs)

Introduction

While Colang at its core does not require a Large Language Model (LLM) as a backend, many of the more advanced mechanisms in the Colang Standard Library (CSL) depend on one.

To enable the LLM backend, you first have to configure LLM access in config.yml by adding a models section like this:

models:
- type: main
  engine: openai
  model: gpt-3.5-turbo-instruct

Make sure to also define the required API access key, e.g. for OpenAI you will have to set the OPENAI_API_KEY environment variable.
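
For example, in a bash-compatible shell you could set it like this (the key value is an illustrative placeholder):

export OPENAI_API_KEY="sk-..."  # placeholder, use your own OpenAI API key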

Every LLM prompt contains a default context that can be modified to adapt it to your use case. See this example configuration to get started. This context heavily influences all LLM invocations.
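
As a minimal sketch, the general instructions and the sample conversation can be adjusted in config.yml using the standard NeMo Guardrails instructions and sample_conversation sections (the content shown here is only an illustrative assumption):

instructions:
  # Illustrative assumption: describe your assistant's role here
  - type: general
    content: |
      You are a helpful assistant that talks to the user about cars.

# Illustrative sample dialog that conditions the style of LLM generations
sample_conversation: |
  user "Hi there!"
  bot "Hello! How can I help you today?"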

Natural Language Description (NLD)

One of the main LLM generation mechanisms in Colang is the so-called Natural Language Description (NLD) in combination with the “generation” operator ....

# Assign result of NLD to a variable
$world_population = ..."What is the number of people in the world? Give me a number."

# Extract a value from the current interaction context
$user_name = ..."What is the name of the user? Return 'friend' if not available."

# Extract structured information from the current interaction context
$order_information = ..."Provide the products ordered by the user in a list structure, e.g. ['product a', 'product b']"

# Use an existing variable in NLD
$response_to_user = ..."Provide a brief summary of the current order. Order Information: '{$order_information}'"

Every NLD is interpreted and replaced at runtime by the configured LLM backend and can be used in Colang to generate context-dependent values. Alternatively, you can describe the purpose and function of a flow using a docstring-like NLD at the beginning of the flow. A standalone generation operator in the flow will then use the flow's NLD to infer the right flow expansion automatically:

flow main
    """You are an assistant that should talk to the user about cars.
    Politely decline to talk about anything else.

    Last user question is: "{{ question }}"
    Generate the output in the following format:

    bot say "<<the response>>"
    """
    $question = await user said something
    ...

See the example in LLM Flows for more details on how this works.

Note that there is no explicit control over the NLD response format, and sometimes the LLM will fail to generate the expected result. You can usually improve this by providing more explicit instructions in the NLD, e.g. “Welcome the user with a short sentence that is wrapped in quotation marks like this: ‘Hi there!’”. Another way is to check the returned value, e.g. with the is_str() function, to make sure that it has the expected format.
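
A minimal sketch of such a check, reusing the extraction example from above (the fallback value is an illustrative choice):

# Extract a value and fall back to a default if the LLM did not return a string
$user_name = ..."What is the name of the user? Return 'friend' if not available."
if not is_str($user_name)
    $user_name = "friend"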

User Intent Matching

In the section Defining Flows we have already seen how to define user intent flows. The limitation was that they did not generalize to variations of the given user intent examples. With the help of an LLM we can overcome this issue and use its reasoning power: import the llm standard library module and activate the flows automating intent detection and generating user intent for unhandled user utterance (Github link) to match unexpected user utterances to the currently active user intent flows.

llm/user_intent_match_example/main.co
import core
import llm

flow main
    activate automating intent detection
    activate generating user intent for unhandled user utterance

    while True
        when user greeted
            bot say "Hi there!"
        or when user said goodbye
            bot say "Goodbye!"
        or when unhandled user intent # For any user utterance that does not match
            bot say "Thanks for sharing!"

flow user greeted
    user said "Hi" or user said "Hello"

flow user said goodbye
    user said "Bye" or user said "See you"

When running this example:

> Hi

Hi there!

> hi

Hi there!

> hallo

Hi there!

> How are you?

Thanks for sharing!

> bye bye

Goodbye!

You can see that for an exact match like “Hi”, the LLM is not invoked, since the utterance matches directly with one of the awaited user said flows. For any other user utterance, the activated flow generating user intent for unhandled user utterance will invoke the LLM to find a suitable user intent. If the user utterance is close enough to one of the predefined user intent flows (i.e. user greeted or user said goodbye), the related flow will finish successfully. This even enables you to talk in a different language (if supported by the LLM) and still map to the correct flow. If no good match is found, the flow unhandled user intent will match instead.

You might ask yourself how the LLM knows which flows are considered user intent flows. This can either be done based on the flow names, by activating the flow automating intent detection to automatically detect flows starting with ‘user’, or by using an explicit flow decorator to mark them independently of their names:

@meta(user_intent=True)
flow any fancy flow name
    user said "Hi" or user said "Hello"

Note

From a semantic point of view, it always makes sense to start a user intent flow with ‘user’, even if it is marked by a user intent meta decorator.

Bot Action Generation

Just as we want to handle variations in the user input, we have seen bot intent flows that define variations of predefined bot actions. While this can be good enough for responses to expected user inputs, we would also like to handle unexpected user utterances without always replying “Thanks for sharing!”. For this case, another flow from the Standard Library, named llm continue interaction, will help us:

llm/bot_intent_generation_example/main.co
import core
import llm

flow main
    user said something
    llm continue interaction

When running this example:

> Hello

Hi there! How can I help you today?

> Tell me a funny story

Sure! Did you hear about the fire at the circus? It was intense!

> funny!

I'm glad you liked it! Do you want to hear another one?

> Bye

Bye! Have a great day!

With this, the bot can react to any user input and respond with a suitable answer. This generalizes well to multimodal interactions and can also be used to generate bot postures and bot gestures, if provided with a suitable prompting context.

Note

The generated actions strongly depend on the current interaction context, the general prompt instructions, and the sample conversation in config.yml. Try updating them to achieve the expected results.

Basic Interaction Loop

We can now combine everything into a basic interaction loop:

llm/interaction_loop/main.co
import core
import timing
import llm

flow main
    activate automating intent detection
    activate generating user intent for unhandled user utterance

    while True
        when unhandled user intent
            llm continue interaction
        or when user was silent 12.0
            $response = ..."A random fun fact"
            bot say $response
        or when user expressed greeting
            bot say "Hi there!"
        or when user expressed goodbye
            bot inform "That was fun. Goodbye"

flow user expressed greeting
    user said "hi"
        or user said "hello"

flow user expressed goodbye
    user said "goodbye"
        or user said "I am done"
        or user said "I have to go"

This loop takes care of matching user utterances to predefined user intents where possible (e.g. user expressed greeting or user expressed goodbye) and generates a suitable response to unexpected user intents using the flow llm continue interaction. Furthermore, if the user does not say anything for more than 12 seconds, the bot will say a random fun fact generated through an NLD.

Guardrailing

Check out the examples in the Getting Started section or refer to the NeMo Guardrails documentation to learn more about how Colang can be used to guardrail LLM responses and user inputs.