Make use of Large Language Models (LLM)
Introduction
At its core, Colang does not require a Large Language Model (LLM) as a backend. However, many of the more advanced mechanisms in the Colang Standard Library (CSL) depend on one.
To enable the LLM backend, first configure LLM access in config.yml by adding a models section that names the engine and the model to use.
Make sure to also define the required API access key. For example, for OpenAI you will have to set the OPENAI_API_KEY environment variable.
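As a minimal sketch of the two setup steps above, the Python snippet below holds a models section as a YAML string and checks that the API key is present before the bot is started. The engine and model values (openai, gpt-4-turbo), the REQUIRED_ENV_VARS table, and the helper name missing_api_keys are illustrative assumptions, not values taken from this guide; substitute the model you actually use.

```python
import os

# Illustrative config.yml content. The `models` list with `type`, `engine`
# and `model` keys follows the NeMo Guardrails configuration layout; the
# concrete engine and model names are assumptions for this sketch.
CONFIG_YML = """\
models:
  - type: main
    engine: openai
    model: gpt-4-turbo
"""

# Hypothetical mapping from engine name to the environment variable that
# must hold its API key. Only the OpenAI entry comes from the text above.
REQUIRED_ENV_VARS = {"openai": "OPENAI_API_KEY"}


def missing_api_keys(engine, env):
    """Return the required variable names that are unset or empty in `env`."""
    var = REQUIRED_ENV_VARS.get(engine)
    if var is None:
        return []  # unknown engine: nothing this sketch can check
    return [] if env.get(var) else [var]


if __name__ == "__main__":
    missing = missing_api_keys("openai", os.environ)
    if missing:
        print("Please set: " + ", ".join(missing))
```

Running such a check before loading the configuration turns a missing key into a clear message instead of a failure on the first LLM call.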
Every LLM prompt contains a default context that can be modified to fit the use case. Because this context is part of every prompt, it heavily influences all LLM invocations. See this example configuration to get started.
Supported Models
Colang currently supports the following models out of the box:
NVIDIA AI Foundry hosted NIMs:
To support other models, you need to create a set of new template prompts that account for the specific capabilities and the API of the model, and add them to your bot configuration.
Natural Language Description (NLD)
One of the main LLM generation mechanisms in Colang is the so-called Natural Language Description (NLD), used in combination with the “generation” operator, which is written as three dots (...).
Every NLD will be interpreted and replaced during runtime by the configured LLM backend and can be used in Colang to generate context-dependent values. With NLDs, you are able to extract values and summarize content from the conversation with the user or based on results from other sources (like a database or an external service).
NLDs are interpreted by the LLM directly, together with the name of the variable they are assigned to. Depending on the LLM you use, you may need to be very specific about the value you want to generate. It is good practice to always state clearly how the response should be formatted and what type it should have (e.g., $user_name = ..."Return the user name as a single string between quotes ''. If no user name is available return 'friend'").
Alternatively, you can describe the purpose and function of a flow using a docstring-like NLD at the beginning of the flow. A standalone generation operator ... in the flow will then use the flow's NLD to infer the right flow expansion automatically:
See the example in LLM Flows for more details on how this works.
Note that there is no explicit control over the NLD response format, and sometimes it will fail to generate the expected result. Usually you can improve it by providing more explicit instructions in the NLD, e.g. “Welcome the user with a short sentence that is wrapped in quotation marks like this: ‘Hi there!’”. Another way is to check the returned value, for example with the is_str() function, to make sure that it has the expected format.
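The idea of checking a generated value before using it can be illustrated outside Colang. The following Python sketch is an analogue only: it is not Colang code and not part of the Colang Standard Library. It normalizes a raw LLM reply that is expected to be a quoted string and falls back to a default when the format is off. The function name parse_quoted_string and the fallback value are assumptions for illustration.

```python
def parse_quoted_string(raw, default="friend"):
    """Return the text between matching quotes in `raw`, else `default`."""
    if not isinstance(raw, str):
        return default
    text = raw.strip()
    # Accept only a reply fully wrapped in one matching pair of quotes.
    if len(text) >= 2 and text[0] == text[-1] and text[0] in ("'", '"'):
        inner = text[1:-1].strip()
        if inner:
            return inner
    # Anything else (free text, empty quotes) is treated as a failed generation.
    return default
```

Rejecting anything that is not fully quoted is a deliberately strict choice: it is usually safer to fall back to a known default than to pass a loosely formatted reply further into the interaction.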
User Intent Matching
In the section Defining Flows, we saw how to define user intent flows. Their limitation was that they did not generalize to variations of the given user intent examples. With the help of an LLM, we can overcome this limitation and use its reasoning power: import the llm standard library module and activate two flows from it, automating intent detection and generating user intent for unhandled user utterance. Unexpected user utterances are then matched to the currently active user intent flows.
llm/user_intent_match_example/main.co
When running this example:
You can see that if we have an exact match, for example for “Hi”, the LLM will not be invoked, since the utterance matches one of the awaited user said flows directly. For any other user utterance, the activated flow generating user intent for unhandled user utterance will invoke the LLM to find a suitable user intent. If the user utterance is close enough to one of the predefined user intent flows (i.e. user greeted or user said goodbye), the related flow will finish successfully. This even lets you talk in a different language (if supported by the LLM) and still map to the correct flow. If no good match is found, the flow unhandled user intent will match.
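The matching order described above can be summarized in a small Python sketch. It is a conceptual model only, not how the Colang runtime is implemented: the intent table, the example utterances other than “Hi”, the function name match_user_intent, and the classify callback standing in for the LLM call are all hypothetical.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical intent table: flow name -> example utterances from the flow body.
INTENT_EXAMPLES: Dict[str, List[str]] = {
    "user greeted": ["Hi", "Hello"],
    "user said goodbye": ["Bye", "See you"],
}


def match_user_intent(
    utterance: str,
    classify: Callable[[str, List[str]], Optional[str]],
) -> str:
    """Resolve an utterance to a flow name in the order the text describes."""
    # 1. Exact match against an awaited utterance: the LLM is not invoked.
    for intent, examples in INTENT_EXAMPLES.items():
        if utterance in examples:
            return intent
    # 2. Otherwise ask the LLM (here: a stand-in callback) to pick an intent.
    guess = classify(utterance, list(INTENT_EXAMPLES))
    if guess in INTENT_EXAMPLES:
        return guess
    # 3. No good match: fall through to the unhandled case.
    return "unhandled user intent"
```

Passing the classifier in as a callback keeps the sketch testable without a model: a stub that returns a fixed intent name, or None, is enough to exercise all three branches.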
You might ask yourself how the LLM can know which flows are considered user intent flows. This can either be done based on the flow names by activating the flow automating intent detection to automatically detect flows starting with ‘user’, or using an explicit flow decorator to mark them independently of their names:
From a semantic point of view, it makes sense to start a user intent flow with ‘user’ even if marked by a user intent meta decorator.
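The two ways of marking a user intent flow, by name or by decorator, can be modeled as a simple predicate. This Python sketch is an illustration only; the FlowInfo type, its is_user_intent field, and the detect_by_name switch are hypothetical stand-ins for a flow's name, its decorator metadata, and the activation of name-based detection.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FlowInfo:
    name: str
    is_user_intent: bool = False  # stands in for an explicit decorator


def user_intent_flows(flows: List[FlowInfo], detect_by_name: bool = True) -> List[str]:
    """Names of flows treated as user intents, by decorator or by name prefix."""
    return [
        f.name
        for f in flows
        if f.is_user_intent or (detect_by_name and f.name.startswith("user"))
    ]
```

With name-based detection switched off, only explicitly marked flows remain, which mirrors the case where the decorator is used independently of the flow name.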
Bot Action Generation
Just as user intent flows capture variations in the user input, the bot intent flows we have seen define variations of predefined bot actions. While this can be good enough for responses to expected user inputs, we would also like to handle unexpected user utterances and not always reply with “Thanks for sharing!”. For this case, another flow from the Standard Library, named llm continue interaction, will help us:
llm/bot_intent_generation_example/main.co
You see that with this, the bot can react to any user input and respond with a suitable bot answer. This generalizes well to multimodal interactions and can be used to generate bot postures and bot gestures as well, if provided with a suitable prompting context.
The generated actions strongly depend on the current interaction context, the general prompt instructions, and sample conversation in the config.yml. Try updating them to achieve the expected results.
Basic Interaction Loop
We can now combine everything to create a basic interaction loop:
llm/interaction_loop/main.co
This loop will take care of matching user utterances to predefined user intents if possible (e.g. user expressed greeting or user expressed goodbye) or generate a suitable response to unexpected user intents using the flow llm continue interaction. Furthermore, if the user does not say anything for more than 12 seconds, the bot will say a random fun fact generated through an NLD.
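The branching of this loop can be sketched as a single decision function in Python. This is a conceptual model under stated assumptions, not the Colang implementation: the function name next_step, the returned labels, and the way an intent and an idle time are passed in are hypothetical; only the two intent names, the fallback flow name, and the 12-second threshold come from the description above.

```python
from typing import Optional

IDLE_TIMEOUT_S = 12.0  # threshold taken from the description above
PREDEFINED_INTENTS = ("user expressed greeting", "user expressed goodbye")


def next_step(intent: Optional[str], idle_seconds: float) -> str:
    """Pick the next branch of the loop for one iteration."""
    if intent is None:
        # No user input: fill the silence only after more than 12 seconds.
        return "bot says fun fact" if idle_seconds > IDLE_TIMEOUT_S else "wait"
    if intent in PREDEFINED_INTENTS:
        return "handle " + intent
    # Anything unexpected is handed to the LLM-driven fallback flow.
    return "llm continue interaction"
```

For example, next_step(None, 3.0) yields "wait", while an unrecognized intent always routes to the fallback regardless of idle time.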
Guardrailing
Check out the examples in the Getting Started section or refer to the NeMo Guardrails documentation to learn more about how Colang can be used to guardrail LLM responses and user inputs.