Building a Bot using Colang 2.0 and Event Interface

The ACE Agent event interface provides an asynchronous, event-based interface for interacting with bots written in Colang 2.0. Its asynchronous design gives you greater flexibility in managing the interaction between the user and the bot. Using the interface, you can easily build interactive systems that break strict turn-taking behavior (user speaks, bot speaks, user speaks…) and handle multiple actions going on at the same time (for example, the bot speaks, makes a gesture to emphasize a point, and the user looks around and interrupts the bot after a short while).
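
In Colang 2.0, this shows up as the ability to start an action and continue without waiting for it to finish, so several actions run concurrently. A minimal sketch of the pattern (the flow name and texts are illustrative; UtteranceBotAction and GestureBotAction are the underlying UMIM actions behind bot say and bot gesture):

    flow show and tell
      # Start a gesture without blocking the flow...
      start GestureBotAction(gesture="Point to the screen") as $gesture
      # ...and speak while the gesture is still running
      await UtteranceBotAction(script="Take a look at this.")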

The event interface is best suited to more complex interactive systems. An example is an interactive avatar system where you want to control not only the bot responses but also gestures, postures, sounds, showing pictures, and so on. The NVIDIA Tokkio reference application (part of ACE) is a great starting point for building a real interactive avatar system using the event interface from ACE Agent; it demonstrates such interactions with a 3D avatar accessible from web browsers.

In this tutorial you will learn how to work with the ACE Agent event interface and how to create a simple bot that makes use of Colang 2.0 and asynchronous event processing. The bot will feature:

  • Multimodality. The bot will make use of gestures, utterances, and information shown on a UI.

  • LLM integration. The bot will use LLMs in different ways, both to provide contextual answers and to simplify user input handling.

  • Proactivity. The bot will be proactive and will try to engage the user if no reply is given.

  • Interruptibility. The user can interrupt the bot at any time.

  • Back-channeling. The bot can react in real time to ongoing user input to make interactions more engaging.

Prerequisites

  1. Set up the Event Simulator packaged as part of the Quick Start resource at clients/event_client/. To run the Event Simulator script, you need to install a few Python packages; it is recommended that you create a new Python environment for this.

    cd clients/event_client/
    python3 -m venv sim-env
    source sim-env/bin/activate
    pip install -r requirements.txt
    
  2. Test the template.

    1. Open two different terminals.

    2. In one terminal, start the ACE Agent with the bot template packaged in the Quick Start resource at samples/event_interface_tutorial_bot.

      Set the OpenAI API key environment variable.

      export OPENAI_API_KEY=...
      
      export BOT_PATH=samples/event_interface_tutorial_bot
      source deploy/docker/docker_init.sh
      docker compose -f deploy/docker/docker-compose.yml up event-bot -d
      
    3. In a separate terminal, start the Event Simulator CLI.

      # Make sure that you are in the correct folder and that you have activated the Python environment
      cd clients/event_client/
      source sim-env/bin/activate
      
      python event_client.py
      
  3. To confirm the test was successful, you should see the message Welcome to the tutorial in the Chat section on the left. Refer to the sample event client section for more information.

Step 1: Making the Greeting Multimodal

In this section, you will see how you can make the greeting (or any other response from the bot) multimodal.

  1. Change the existing flow bot express greeting at the top of the main.co file. The flow should look similar to:

    flow bot express greeting
      # meta: bot intent
      (bot express "Hi there!"
        or bot express "Welcome!"
        or bot express "Hello!")
        and bot gesture "Wave with one hand"
    
  2. Show the greeting in the UI by adding the statement start scene show short information "Welcome to this tutorial interaction" as $intro_ui in the main flow, before the line bot express greeting, as shown below.

    # The bot greets the user and a welcome message is shown on the UI
      start scene show short information "Welcome to this tutorial interaction" as $intro_ui
      bot express greeting
    
  3. Restart the updated bot and Event Simulator.

    docker compose -f deploy/docker/docker-compose.yml down
    docker compose -f deploy/docker/docker-compose.yml up event-bot -d
    
    python event_client.py
    

In addition to the greeting message, a bot gesture is now shown for two seconds in the Motion area on the left, and a UI is shown on the right. You could use the exact same Colang code to make a 3D interactive avatar say “Welcome!”, wave into the camera, and display a proper UI in the view.
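
The same pattern works for any bot response: or picks one of the listed utterance variants (adding variety), while and runs the utterance and the gesture in parallel. As an illustrative sketch (the flow name, texts, and gesture are made up):

    flow bot express thanks
      # meta: bot intent
      # `or` selects one utterance variant at random; `and` runs it in parallel with the gesture
      (bot express "Thank you!" or bot express "Thanks a lot!")
        and bot gesture "Bow slightly"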

Step 2: Leveraging LLMs to Answer User Questions

In this section, you will enable the bot to answer any user question based on a large language model (LLM).

  1. Provide general instructions. Open the file event_interface_tutorial_bot/event_interface_tutorial_bot_config.yml. Under instructions you will find an example of general LLM instructions. Keep these instructions for now; later, you can experiment with different instructions and see how they change the types of answers the bot provides.

    instructions:
        - type: "general"
          content: |
            Below is a conversation between Emma, a helpful interactive avatar assistant (bot), and a user.
            The bot is designed to generate human-like actions based on the user actions that it receives.
          [...]
    
  2. Enable LLM fallback. Update your main.co file as shown below; you only need to update the CHANGE sections. Ensure you don’t duplicate flows when making your changes: if you define the same flow twice, the later definition will overwrite the first one. Your main.co file should look like this (a variation that handles a custom intent alongside the LLM fallback is sketched after these steps):

    flow bot express greeting
      # meta: bot intent
      (bot express "Hi there!"
        or bot express "Welcome!"
        or bot express "Hello!")
        and bot gesture "Wave with one hand"
    
    # CHANGE 1
    # Add two flows to handle ending the conversation: bot express goodbye, user expressed done
    flow bot express goodbye
      # meta: bot intent
      (bot express "Goodbye" or bot express "Talk to you soon!") and bot gesture "bowing in goodbye"
    flow user expressed done
      # meta: user intent
      user said r"(?i).*done.*|.*end.*showcase.*|.*exit.*"
    
    # The main flow is the entry point
    flow main
      # meta: exclude from llm
    
      # Technical flows; see the Colang 2.0 documentation for more details
      activate catch undefined flows
      activate catch colang errors
      activate poll llm request response 1.0
      activate track bot talking state
    
      # The bot greets the user and a welcome message is shown on the UI
      start scene show short information "Welcome to this tutorial interaction" as $intro_ui
      bot express greeting
    
      # CHANGE 2
      # If we don't have an exact match for a user utterance, we use an LLM to infer the user's intent.
      # Without this flow activated, only exact matches are considered (e.g., "Hello there" would not match "Hi there").
      activate trigger user intent for unhandled user utterance
    
      # CHANGE 3 (optional)
      # This will generate a variation of the question (variations are generated by the LLM)
      bot say something like "How can I help you today?"
    
      # CHANGE 4
      # This lets the bot continue to answer questions (first when clause) until the user expresses
      # that the conversation should end, in which case we break out of the while loop
      while True
        when unhandled user intent as $intent_ref
          generate then continue interaction
        orwhen user expressed done
          bot express goodbye
          break
    
      # This prevents the main flow from ever finishing
      wait indefinitely
    
  3. Restart the updated bot and Event Simulator.

    docker compose -f deploy/docker/docker-compose.yml down
    docker compose -f deploy/docker/docker-compose.yml up event-bot -d
    
    python event_client.py
    
  4. Ask any questions to the bot that will be answered by the LLM given the provided general instructions.
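
The when ... orwhen ... construct from CHANGE 4 generalizes nicely: you can handle selected user intents deterministically and only fall back to the LLM for everything else. A hedged sketch of such a variation (the user asked for help flow and its wording are made up for illustration):

    flow user asked for help
      # meta: user intent
      user said "help" or user said "what can you do"

    # Inside the main flow, the new intent is handled without calling the LLM
    while True
      when user asked for help
        bot say "You can ask me anything, or tell me you are done to end the showcase."
      orwhen unhandled user intent as $intent_ref
        generate then continue interaction
      orwhen user expressed done
        bot express goodbye
        break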

Step 3: Making the Bot Proactive

In this section, we will add a proactivity feature to make the conversation with your bot feel more natural. The bot will generate an appropriate utterance when the user has been silent for a specified amount of time. This helps drive the conversation forward and can be used to provide additional information or help to the user.

  1. Add an orwhen clause (for the flow user was silent 15.0) inside the when statement in the main flow.

    while True
        when unhandled user intent as $intent_ref
          generate then continue interaction
        orwhen user was silent 15.0
          generate then continue interaction
        orwhen user expressed done
          bot express goodbye
          break
    

    With this, we leverage the LLM to generate a bot response whenever the user has been silent for at least 15 seconds. (A variation with a fixed response instead of an LLM-generated one is sketched after these steps.)

  2. Restart the updated bot and Event Simulator.

    docker compose -f deploy/docker/docker-compose.yml down
    docker compose -f deploy/docker/docker-compose.yml up event-bot -d
    
    python event_client.py
    
  3. In the Utils section, look for a timer ticking down. When the timer finishes, the bot will follow up with the user.
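
If you prefer a deterministic nudge over an LLM-generated one, you can replace generate then continue interaction in the silence branch with a fixed utterance; an illustrative variation (the wording is made up):

    while True
        when unhandled user intent as $intent_ref
          generate then continue interaction
        orwhen user was silent 15.0
          # Fixed follow-up instead of an LLM-generated continuation
          bot say "Are you still there? Feel free to ask me anything."
        orwhen user expressed done
          bot express goodbye
          break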

Step 4: Interrupting the Bot

When humans talk to each other, they often interrupt one another to clarify certain points or to signal that they already know what the other person is talking about. With the ACE Agent event interface and Colang 2.0, we can achieve this with a few small changes to the current bot.
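
Conceptually, an interruption boils down to stopping a running bot utterance action as soon as the user starts talking. The library flow activated below takes care of this for you; the following is only an illustrative sketch of the underlying idea, not the actual implementation:

    flow illustrate stopping a bot utterance
      # Start a long utterance without waiting for it to finish
      start UtteranceBotAction(script="Once upon a time, in a land far away...") as $utterance
      # As soon as the user starts to say something, stop the utterance
      match UtteranceUserAction.Started()
      send $utterance.Stop()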

  1. Activate the following flow inside the main flow.

    flow main
      [...]
    
      # Allow the user to interrupt the bot at anytime
      activate interruption handling bot talking $mode="interrupt"
    
      while True
      [...]
    

    This flow handles any interruptions by the user.

  2. Restart the updated bot and Event Simulator.

    docker compose -f deploy/docker/docker-compose.yml down
    docker compose -f deploy/docker/docker-compose.yml up event-bot -d
    
    python event_client.py
    
  3. Ask the bot to tell a story about something, so that it responds with a long answer.

  4. While the bot is responding, type something to interrupt it.

Step 5: Back-Channeling

Back-channeling means the bot provides short reactions based on ongoing user input to make the interaction more engaging. For this tutorial, we use a very simple example: at the end of the interaction, we ask the user to provide an email address, and while the user is entering it, the bot provides contextual feedback.
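
Back-channeling builds on interim events that arrive while a user action is still running. As an aside, here is a hedged sketch of reacting to a partial speech transcript; it assumes the standard UMIM UtteranceUserAction.TranscriptUpdated event and that regex matching on the interim transcript behaves like the InputUpdated matchers below (the flow name, regex, and gesture are illustrative):

    flow react to mention of email
      # Fires while the user is still talking, based on the interim transcript
      match UtteranceUserAction.TranscriptUpdated(interim_transcript=r"(?i).*email.*")
      bot gesture "nod"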

  1. Add the following flows to the top of the file main.co.

    flow user confirmed
      # meta: user intent
      user has selected choice "yes"
        or user said "yes"
        or user said "ok"
        or user said "that's ok"
        or user said "yes why not"
        or user said "sure"
    
    flow user denied
      # meta: user intent
      user has selected choice "no"
        or user said "no"
        or user said "don't do it"
        or user said "I am not OK with this"
        or user said "cancel"
    
    flow ask for user email
      start VisualChoiceSceneAction(prompt="Would you share your e-mail?", support_prompts=["You can just type 'yes' or 'no'", "Or just click on the buttons below"], choice_type="selection", allow_multiple_choices=False, options=[{"id": "yes", "text": "Yes"}, {"id": "no", "text": "No"}]) as $confirmation_ui
      bot say "I would love to keep in touch. Would you be OK to give me your e-mail address?"
      when user confirmed
        send $confirmation_ui.Stop()
        bot ask "Nice! Please enter a valid email address to continue"
        start VisualFormSceneAction(prompt="Enter valid email",inputs=[{"id": "email", "description": "email address", "value" : ""}]) as $action
        while True
          when VisualFormSceneAction.InputUpdated(interim_inputs=[{"id": "email", "value": r"@$"}])
            bot say "And now only the domain missing!"
          orwhen VisualFormSceneAction.InputUpdated(interim_inputs=[{"id": "email", "value": r"^[-\w\.]+@([\w-]+\.)+[\w-]{2,4}$"}])
            bot say "Looks like a valid email address to me, just click ok to confirm" and bot gesture "success"
          orwhen VisualFormSceneAction.ConfirmationUpdated(confirmation_status="confirm")
            bot say "Thank you" and bot gesture "bowing"
            break
          orwhen VisualFormSceneAction.ConfirmationUpdated(confirmation_status="cancel")
            bot say "OK. Maybe another time."
            break
      orwhen user denied
        bot say "That is OK"
    
  2. Add the flow ask for user email to the orwhen branch that handles the user ending the conversation.

    while True
        [...]
        orwhen user expressed done
          ask for user email
          bot express goodbye
          break
    
  3. Restart the updated bot and Event Simulator.

    docker compose -f deploy/docker/docker-compose.yml down
    docker compose -f deploy/docker/docker-compose.yml up event-bot -d
    
    python event_client.py
    
  4. Test this interaction by ending the conversation. For example, write I am done into the prompt. This triggers the flow ask for user email, which first asks whether you are OK with providing your email (you can type the confirmation into the prompt or click an option in the UI on the right). If you confirm, the email entry form appears in the UI section on the right.
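
Note that the value entries passed to the InputUpdated matchers above are regular expressions matched against the current content of the form field, which is what makes this kind of contextual feedback possible. You could extend the when group in ask for user email with further branches, for example (illustrative):

    orwhen VisualFormSceneAction.InputUpdated(interim_inputs=[{"id": "email", "value": r"^[^@]+$"}])
      bot say "Don't forget the @ sign!"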