Multimodal Rails

This section explains how to create multimodal rails in Colang 2.0.

Definition

Multimodal rails are rails that take multiple input and output modalities into account (e.g., text, voice, gestures, posture, images).

Usage

The example below shows how you can control the greeting behavior of an interactive avatar.

Note

The Colang Standard Library (CSL) includes an avatars module with flows for multimodal events and actions that you can use to implement interactive avatar use cases.

examples/v2_x/tutorial/multi_modal/main.co
 1  import core
 2  import avatars
 3
 4  flow main
 5    user expressed greeting
 6    bot express greeting
 7
 8  flow user expressed greeting
 9    user expressed verbal greeting
10      or user gestured "Greeting gesture"
11
12  flow user expressed verbal greeting
13    user said "hi"
14      or user said "hello"
15
16  flow bot express greeting
17    bot express verbal greeting
18      and bot gesture "Smile and wave with one hand."
19
20  flow bot express verbal greeting
21    bot say "Hi there!"
22      or bot say "Welcome!"
23      or bot say "Hello!"

In the flows above, lines 10 and 18 use the pre-defined flows user gestured and bot gesture, which match user gestures and control bot gestures, respectively.

Under the Hood

Under the hood, the interactive system that uses the Colang script above needs to generate GestureUserActionFinished events (which the user gestured flow waits for) and know how to handle StartGestureBotAction events (which the bot gesture flow triggers).
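To make the two responsibilities of the interactive system more concrete, here is a minimal Python sketch. It does not use the NeMo Guardrails API. It assumes events are plain dictionaries with a "type" key and a "gesture" parameter, and the helper names make_gesture_finished_event, handle_bot_events, and play_gesture are hypothetical. A real integration may require additional event fields.

```python
from typing import Any, Callable, Dict, List

# Assumption: an event is a plain dict with a "type" key plus its parameters.
Event = Dict[str, Any]


def make_gesture_finished_event(gesture: str) -> Event:
    """Build the event that the `user gestured` flow waits for.

    Hypothetical helper: a real system would emit this when its
    gesture recognizer detects the given gesture.
    """
    return {"type": "GestureUserActionFinished", "gesture": gesture}


def handle_bot_events(
    events: List[Event], play_gesture: Callable[[str], None]
) -> int:
    """Forward StartGestureBotAction events to the avatar.

    `play_gesture` stands in for whatever renders the gesture.
    Returns the number of gesture events that were handled.
    """
    handled = 0
    for event in events:
        if event.get("type") == "StartGestureBotAction":
            play_gesture(event.get("gesture", ""))
            handled += 1
    return handled


if __name__ == "__main__":
    # Input side: the user waves, so the system reports a finished gesture.
    print(make_gesture_finished_event("Greeting gesture"))

    # Output side: the bot asked for a gesture, so the system renders it.
    outgoing = [
        {"type": "StartGestureBotAction", "gesture": "Smile and wave with one hand."}
    ]
    handle_bot_events(outgoing, play_gesture=lambda g: print(f"Gesture: {g}"))
```

In this sketch, events of any other type are ignored by handle_bot_events, so other output modalities can be handled by separate components.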

Testing

To test the above logic using the NeMo Guardrails CLI, you can manually send an event by starting the message with a /:

$ nemoguardrails chat --config=examples/v2_x/tutorial/multi_modal

> hi

Welcome!

Gesture: Smile and wave with one hand.

> /GestureUserActionFinished(gesture="Greeting gesture")

Hi there!

Gesture: Smile and wave with one hand.

The next example will show you how to define input rails.