Modalities

The API provides standard events and actions for the following modalities: speech, gestures, emotions, movements, and scene. Actions on different modalities are independent of each other and can run concurrently; an interactive system needs to support this. If multiple actions are executed on the same modality, the policy of that modality governs how their execution and scheduling are handled.

The following modality policies exist in UMIM:

UMIM Modality Policies

  • parallel – Multiple actions can run at the same time. Example modalities: sound effects, timers.

  • override – Multiple actions can run at the same time, but only one takes effect: a newly started action temporarily “overrides” the running one, and when the overriding action finishes, the overridden action is resumed. Example modalities: postures, gestures.

  • skip – If an action is started while another action is already in progress on the same modality, the new action is started and immediately finished (ActionStarted() followed by ActionFinished()) without any effect on the interactive system.

  • replace – Stops any ongoing action on the same modality (ActionFinished()) and starts the new action as soon as possible.
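To make these policies concrete, the following sketch shows how an action server might schedule a newly started action on a modality. This is a minimal illustration only; the ModalityManager class and the action methods (started, finished, pause, resume, stop) are hypothetical and not part of UMIM.

    from enum import Enum

    class Policy(Enum):
        PARALLEL = "parallel"
        OVERRIDE = "override"
        SKIP = "skip"
        REPLACE = "replace"

    class ModalityManager:
        """Hypothetical per-modality scheduler inside an action server."""

        def __init__(self, policy: Policy):
            self.policy = policy
            self.stack = []  # running actions; the last entry is the active one

        def start(self, action):
            if self.policy is Policy.SKIP and self.stack:
                action.started()   # ActionStarted() ...
                action.finished()  # ... immediately followed by ActionFinished()
                return
            if self.policy is Policy.OVERRIDE and self.stack:
                self.stack[-1].pause()  # temporarily override the running action
            if self.policy is Policy.REPLACE:
                for running in self.stack:
                    running.stop()      # ActionFinished() for the ongoing action
                self.stack.clear()
            self.stack.append(action)
            action.started()

        def finish(self, action):
            self.stack.remove(action)
            if self.policy is Policy.OVERRIDE and self.stack:
                self.stack[-1].resume()  # the overridden action is resumed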

Modality overview

Each interactive system can support different modalities. If a modality is supported, the interactive system needs to support all mandatory actions and events of that modality and may support further optional actions and events as listed in the table below.

UMIM Modality Overview

Speech

  • bot_speech (policy: replace) – required: UtteranceBotAction

  • user_speech (policy: replace) – required: UtteranceUserAction

Motion

  • bot_gesture (policy: override) – required: GestureBotAction

  • bot_posture (policy: override) – required: PostureBotAction

  • user_gesture (policy: parallel) – required: GestureUserAction

Scene

  • information (policy: override) – required: VisualInformationSceneAction; optional: VisualChoiceSceneAction, VisualFormSceneAction

System

  • user_presence (policy: parallel) – required: PresenceUserAction

  • bot_expectation (policy: parallel) – required: ExpectationBotAction, ExpectationSignalingBotAction

  • time (policy: parallel) – required: TimerBotAction

  • web_request (policy: parallel) – required: RestApiCallBotAction

Speech

This section defines events and actions related to dialog management using the speech modality. Both the bot and the user can use this modality; we distinguish between the bot_speech and user_speech modalities accordingly.

Utterance User Action

The user makes an utterance that is recognized by the interactive system. Examples of this action include the user typing into a text interface to interact with the bot or the user speaking to an interactive avatar.

UtteranceUserActionStarted()

The user started to produce an utterance. The user could have started talking or typing for example.

Parameters

... – Additional parameters/payload inherited from UserActionStarted().

UtteranceUserActionIntensityUpdated(intensity: float)

Provides updated speaking intensity levels if the interactive system supports it.

Parameters
  • intensity (float) – A value from 0-1 that indicates the intensity of the utterance. A value of 0.5 means an “average” intensity. The intensity of an utterance action can correspond to different metrics depending on the interactive system. For a chatbot system the intensity could relate to the typing rate. In a speech-enabled system intensity could be computed based on the volume and pitch variation of the user’s voice.

  • ... – Additional parameters/payload inherited from UserActionUpdated().

UtteranceUserActionTranscriptUpdated(interim_transcript: str)

Provides updated transcripts during a UtteranceUserAction.

Parameters
  • interim_transcript (str) – Partial transcript of the user utterance up to this point in time

  • ... – Additional parameters/payload inherited from UserActionUpdated().

StopUtteranceUserAction()

Indicates that the IM has received the information it needs and that the Action Server should consider the utterance finished as soon as possible. This could, for example, instruct the Action Server to decrease the hold time (the duration of silence in the user's speech after which the end of speech is assumed).

Parameters

... – Additional parameters/payload inherited from StopUserAction().

UtteranceUserActionFinished(final_transcript: str)

The user utterance has finished.

Parameters
  • final_transcript (str) – Final transcript of the user utterance

  • ... – Additional parameters/payload inherited from UserActionFinished().

Action Sequence Diagrams

The following sequence diagrams show example event flows for two typical interactive systems.

[Figure: Example event flow for a chatbot system]

[Figure: Example event flow for an interactive avatar system]
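As a textual complement to these diagrams, the chatbot case typically reduces to an event sequence like the following sketch. The payloads are abbreviated and illustrative; the exact wire format depends on the UMIM version in use.

    # Illustrative event sequence for a chatbot system.
    chat_flow = [
        {"type": "UtteranceUserActionStarted"},              # user starts typing
        {"type": "UtteranceUserActionTranscriptUpdated",
         "interim_transcript": "I need help"},               # partial transcript
        {"type": "UtteranceUserActionFinished",
         "final_transcript": "I need help with my order."},  # user sends the message
    ]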

Utterance Bot Action

The bot is producing an utterance (saying something) to the user. Depending on the interactive system this can mean different things, but this action always represents verbal communication with the user through a speech-like interface (e.g. a chat interface, an actual voice interface, or brain-to-machine communication 😀).

StartUtteranceBotAction(script: str, intensity: Optional[float])

The bot should start to produce an utterance. Depending on the interactive system this could be a bot sending a text message or an avatar talking to the user.

Parameters
  • script (str) – The utterance of the bot, supporting SSML

  • intensity (Optional[float]) – A value from 0-1 that indicates the intensity of the utterance. A value of 0.5 means an “average” intensity. The intensity of an utterance action should change how the utterance is delivered to the user, based on the type of interactive system. For a chatbot system, the intensity could relate to the typing rate in the UI. In a speech-enabled system, intensity could change the volume and pitch variation of generated speech.

  • ... – Additional parameters/payload inherited from StartBotAction().

UtteranceBotActionStarted()

The bot started to produce the utterance. This event should align as closely as possible with the moment in time the user is receiving the utterance. For example, in an interactive avatar system, the event is sent out by the Action Server once the text-to-speech (TTS) stream is sent to the user.

Parameters

... – Additional parameters/payload inherited from BotActionStarted().

ChangeUtteranceBotAction(intensity: float)

Adjusts the intensity of the utterance while the action is already running.

Parameters
  • intensity (float) – A value from 0-1 that indicates the intensity of the utterance. A value of 0.5 means an “average” intensity. The intensity of an utterance action should change how the utterance is delivered to the user, based on the type of interactive system. For a chatbot system, the intensity could relate to the typing rate in the UI. In a speech-enabled system, intensity could change the volume and pitch variation of generated speech.

  • ... – Additional parameters/payload inherited from ChangeBotAction().

UtteranceBotActionScriptUpdated(interim_script: str)

Provides updated scripts during a UtteranceBotAction. These events correspond to the time at which a certain part of the utterance is delivered to the user. In an interactive system that supports voice output, these events should align with when the user hears the partial script.

Parameters
  • interim_script (str) – Partial script of the bot utterance up to this point in time

  • ... – Additional parameters/payload inherited from BotActionUpdated().

StopUtteranceBotAction()

Stops the bot utterance. The action is only considered stopped once the UtteranceBotActionFinished event has been received. For interactive systems that do not support this event, the action will continue to run normally until finished. The interaction manager is expected to handle arbitrary delays between stopping the utterance and the time the utterance actually finishes.

Parameters

... – Additional parameters/payload inherited from StopBotAction().

UtteranceBotActionFinished(final_script: str)

The bot utterance finished, either because the utterance has been delivered to the user or the action was stopped.

Parameters
  • final_script (str) – Final script of the bot utterance

  • ... – Additional parameters/payload inherited from BotActionFinished().
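Putting these events together, the lifecycle of a single bot utterance might look like the following sketch. The action_uid field used to correlate the events is an assumption about the payload format.

    # Illustrative lifecycle of one UtteranceBotAction.
    lifecycle = [
        {"type": "StartUtteranceBotAction", "action_uid": "a1",
         "script": "Hi there, how can I help you?", "intensity": 0.5},
        {"type": "UtteranceBotActionStarted", "action_uid": "a1"},
        {"type": "UtteranceBotActionScriptUpdated", "action_uid": "a1",
         "interim_script": "Hi there,"},
        {"type": "UtteranceBotActionFinished", "action_uid": "a1",
         "final_script": "Hi there, how can I help you?"},
    ]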

Motion

Motion actions include movements, or sets of movements, that have a specific meaning. They are typically recognized through computer vision and can be generated by interactive avatars. At the moment we distinguish the following modalities for both the bot and the user: face, gesture, posture and position.

Many of these modalities are governed by the override policy. This means the action server is expected to handle multiple concurrent actions by temporarily overriding the currently running action with any new action that is started. A concrete example: the IM starts a PostureBotAction(posture=“attentive”) action (the avatar maintains an attentive posture). Two seconds later, the IM starts a PostureBotAction(posture=“listening”) action. The listening posture action is executed by the action server, overriding the “attentive” posture (the avatar appears to be listening). Once the listening posture action is stopped, the avatar goes back to the “attentive” posture (the overridden action is resumed).
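In event terms, this example could play out as in the following sketch (timestamps are illustrative):

    # Illustrative (seconds, event) timeline for the bot_posture modality.
    timeline = [
        (0.0, 'StartPostureBotAction(posture="attentive")'),  # IM request
        (0.1, "PostureBotActionStarted()"),                   # avatar is attentive
        (2.0, 'StartPostureBotAction(posture="listening")'),  # overrides "attentive"
        (2.1, "PostureBotActionStarted()"),                   # avatar is listening
        (6.0, "StopPostureBotAction()"),                      # stop the listening posture
        (6.1, "PostureBotActionFinished()"),                  # "attentive" is resumed
    ]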

Posture Bot Action

Instruct the bot to assume a pose. A pose will never be finished by the system; this is in contrast to gesture actions, which have a limited lifetime and are “performed” with a clear start and end dictated by the gesture. Poses can be implemented by the interactive system in different ways. For interactive avatar systems, poses can change the posture of the avatar. For chatbot systems, this could change a bot indication icon (e.g. like the Siri assistant).

StartPostureBotAction(posture: str)

The bot should start adopting the specified posture.

Parameters
  • posture (str) – Natural language description (NLD) of the posture. The availability of postures depends on the interactive system. Postures should be expressed hierarchically such that interactive systems that provide less nuanced postures can fall back onto higher-level postures. The following base postures need to be supported by all interactive systems supporting this action: “idle”, “attentive”

  • ... – Additional parameters/payload inherited from StartBotAction().

PostureBotActionStarted()

The bot has attained the posture.

Parameters

... – Additional parameters/payload inherited from BotActionStarted().

StopPostureBotAction()

Stop the posture. Postures have no natural lifetime, so unless the IM calls the Stop action the bot will keep the posture indefinitely.

Parameters

... – Additional parameters/payload inherited from StopBotAction().

PostureBotActionFinished()

The posture was stopped.

Parameters

... – Additional parameters/payload inherited from BotActionFinished().

Gesture Bot Action

Instruct the bot to make a gesture. In contrast to PostureBotAction, bot gestures have a limited “lifetime” and are used to provide an immediate effect. Bot gestures can be implemented by the interactive system in different ways. For interactive avatar systems, gestures should be performed by the avatar. For chatbot systems, gestures can be expressed by emojis, images, or GIFs.

StartGestureBotAction(gesture: str)

The bot should start making a specific gesture.

Parameters
  • gesture (str) – Natural language description (NLD) of the gesture. The availability of gestures depends on the interactive system. If a system supports this action, the following base gestures need to be supported: affirm, negate, attract

  • ... – Additional parameters/payload inherited from StartBotAction().

GestureBotActionStarted()

The bot has started to perform the gesture.

Parameters

... – Additional parameters/payload inherited from BotActionStarted().

StopGestureBotAction()

Stop the gesture. All gestures have a limited lifetime and finish on their own. Gestures are meant to accentuate a certain situation or statement. For example, in an interactive avatar system an affirm gesture could be implemented by a one-second animation clip where the avatar nods twice. The IM can use this action to stop a gesture before it finishes naturally.

Parameters

... – Additional parameters/payload inherited from StopBotAction().

GestureBotActionFinished()

The gesture was performed.

Parameters

... – Additional parameters/payload inherited from BotActionFinished().

Gesture User Action

The system detected a user gesture.

GestureUserActionStarted()

The interactive system detects the start of a user gesture. Note: the time the system detects the gesture might be different from when the user started to perform the gesture.

Parameters

... – Additional parameters/payload inherited from UserActionStarted().

GestureUserActionFinished(gesture: str)

The user performed a gesture.

Parameters
  • gesture (str) – Human-readable name of the gesture. The availability of gestures depends on the interactive system.

  • ... – Additional parameters/payload inherited from UserActionFinished().

Position Bot Action

Instructs the bot to hold a new position. Like PostureBotAction, this is a state action: when the action is stopped or finished, the bot returns to the position it held before.

StartPositionBotAction(position: str)

The bot needs to hold a new position.

Parameters
  • position (str) –

    Specify the position the bot needs to move to and maintain.

    Availability of positions depends on the interactive system. Positions are typically structured hierarchically into base positions and position modifiers (“off center”).

    Minimal NLD set:

    The following base positions are supported by all interactive systems (that support this action):

    center: Default position of the bot

    left: Bot should be positioned to the left (from the point of view of the bot)

    right: Bot should be positioned to the right (from the point of view of the bot)

  • ... – Additional parameters/payload inherited from StartBotAction().

PositionBotActionStarted()

The bot has started to transition to the new position.

Parameters

... – Additional parameters/payload inherited from BotActionStarted().

PositionBotActionUpdated(position_reached: str)

The bot has arrived at the position and is maintaining that position for the entire action duration.

Parameters
  • position_reached (str) – The position the bot has reached.

  • ... – Additional parameters/payload inherited from BotActionUpdated().

StopPositionBotAction()

Stop holding the position. The bot will return to the position it held before this action. Position-holding actions have an infinite lifetime, so unless the IM calls the Stop action the bot maintains the position indefinitely. Alternatively, PositionBotAction actions can be overridden, since the modality policy is override.

Parameters

... – Additional parameters/payload inherited from StopBotAction().

PositionBotActionFinished()

The bot has shifted back to the position it held before this action. This might be a neutral position or the position of any PositionBotAction overridden by this action that now regains “focus”.

Parameters

... – Additional parameters/payload inherited from BotActionFinished().

Facial Gesture Bot Action

Instruct the bot to make rapid and brief facial expressions that last for at most a few seconds, like a quick smile, a momentary frown, or a brief wink. In a chatbot system this could generate an emoji as part of a text message. For interactive avatars this changes the facial animations of the avatar for a short while (e.g. to wink or smile).

StartFacialGestureBotAction(facial_gesture: str)

The bot should start making a facial gesture.

Parameters
  • facial_gesture (str) –

    Natural language description (NLD) of the facial gesture or expression.

    Availability of facial gestures depends on the interactive system.

    Minimal NLD set:

    The following gestures should be supported by every interactive system implementing this action: smile, laugh, frown, wink

  • ... – Additional parameters/payload inherited from StartBotAction().

FacialGestureBotActionStarted()

The bot has started to perform the facial gesture.

Parameters

... – Additional parameters/payload inherited from BotActionStarted().

StopFacialGestureBotAction()

Stop the facial gesture or expression. All gestures have a limited lifetime and finish on their own (e.g., in an interactive avatar system a “smile” gesture could be implemented by a one-second animation clip where some facial bones are animated). The IM can use this action to stop an expression before it finishes naturally.

Parameters

... – Additional parameters/payload inherited from StopBotAction().

FacialGestureBotActionFinished()

The facial gesture was performed.

Parameters

... – Additional parameters/payload inherited from BotActionFinished().

Facial Gesture User Action

The system detected a short facial gesture, e.g. a frown, a smile, or a wink, that differs from the user's neutral expression. Such gestures or small expressions are shown by the user over a short period of time.

FacialGestureUserActionStarted(expression: str)
Parameters
  • expression (str) –

    Natural language description (NLD) of the facial expression.

    Detected facial expressions depend on the capabilities of the interactive system.

    Minimal NLD set:

    The following expressions should be supported by every interactive system implementing this action: smile, laugh, frown, wink

  • ... – Additional parameters/payload inherited from UserActionStarted().

FacialGestureUserActionFinished()
Parameters

... – Additional parameters/payload inherited from UserActionFinished().

Scene

Scene actions are typically available in interactive systems that provide some sort of screen real estate alongside the avatar interaction. In a chatbot system this could either be a section of the app that can display information or the ability to show information inline within a chat. In an interactive avatar system the avatar could be rendered alongside a TV (like in a news anchor scene), or a web UI could be rendered beside the avatar.

Shot Camera Action

Start the specified camera shot. We use “shot” to refer to anything that impacts the camera state over a longer period of time; this can include moving the camera to a new location, panning from left to right, etc. This is a state action (like PostureBotAction) that ensures the camera returns to its previous shot when the action is finished (override policy).

StartShotCameraAction(shot: str, start_transition: str)

Start a new shot.

Parameters
  • shot (str) –

    Natural language description (NLD) of the shot.

    Availability of shots depends on the interactive system.

    Minimal NLD set:

    The following shots should be supported by every interactive system implementing this action: full, medium, close-up

  • start_transition (str) –

    NLD of the transition to the new shot. This should describe the movement.

    Minimal NLD set:

    The following transitions should be supported by every interactive system implementing this action: cut, dolly

  • ... – Additional parameters/payload inherited from StartBotAction().

ShotCameraActionStarted()

The camera shot started.

Parameters

... – Additional parameters/payload inherited from BotActionStarted().

StopShotCameraAction(stop_transition: str)

Stop the camera shot. The camera will return to the shot it had before this action was started. ShotCameraAction actions have an infinite lifetime, so unless the IM calls the Stop action the camera maintains the shot indefinitely.

Parameters
  • stop_transition (str) –

    NLD of the transition back to the previous shot (override policy). This should describe the movement.

    Minimal NLD set:

    The following transitions should be supported by every interactive system implementing this action: cut, dolly

  • ... – Additional parameters/payload inherited from StopBotAction().

ShotCameraActionFinished()

The camera shot was stopped. The camera has returned to the shot it had before, either a neutral shot or the shot specified by any overridden ShotCameraAction.

Parameters

... – Additional parameters/payload inherited from BotActionFinished().

Motion Effect Camera Action

Apply a camera motion effect to the active camera.

StartMotionEffectCameraAction(effect: str)

Perform the described camera motion effect.

Parameters
  • effect (str) –

    Natural language description (NLD) of the effect.

    Availability of effects depends on the interactive system.

    Minimal NLD set:

    The following camera effects should be supported by every interactive system implementing this action:

    shake, jump cut in, jump cut out

  • ... – Additional parameters/payload inherited from StartBotAction().

MotionEffectCameraActionStarted()

Camera effect started.

Parameters

... – Additional parameters/payload inherited from BotActionStarted().

StopMotionEffectCameraAction()

Stop the camera effect. All effects have a limited lifetime and finish on their own (e.g., in an interactive avatar system a “shake” effect could be implemented by a one-second camera motion). The IM can use this action to stop a camera effect before it finishes naturally.

Parameters

... – Additional parameters/payload inherited from StopBotAction().

MotionEffectCameraActionFinished()

Camera effect finished.

Parameters

... – Additional parameters/payload inherited from BotActionFinished().

Visual Information Scene Action

Visualize information to the user. This action is used to show the user detailed information about a topic. Example: The user is interested in the details about a product or service.

StartVisualInformationSceneAction(content: List[umim.messages.modalities.scene.VisualInformationContent], title: str, summary: Optional[str], support_prompts: Optional[List[str]])

Present information in the scene to the user.

Parameters
  • content (List[umim.messages.modalities.scene.VisualInformationContent]) – List of content items to show to the user

  • title (str) – Title of the information shown to the user

  • summary (Optional[str]) – Summary of the information to be shown to the user

  • support_prompts (Optional[List[str]]) – List of prompts supporting the user in interacting with the information

  • ... – Additional parameters/payload inherited from StartBotAction().

VisualInformationSceneActionStarted()

The system has started presenting the information to the user.

Parameters

... – Additional parameters/payload inherited from BotActionStarted().

VisualInformationSceneActionConfirmationUpdated(confirmation_status: ConfirmationStatus)

Whenever the user confirms or tries to abort the visual information shown on the screen. Examples of this include clicking a “confirm” button or clicking “close”.

Parameters
  • confirmation_status (ConfirmationStatus) – Update on the confirmation status. User indicating to have understood or to cancel the visual information

  • ... – Additional parameters/payload inherited from BotActionUpdated().

StopVisualInformationSceneAction()

Stop presenting the information to the user.

Parameters

... – Additional parameters/payload inherited from StopBotAction().

VisualInformationSceneActionFinished()

Information action was stopped by the IM (no user action will cause the action to be finished by the Action Server).

Parameters

... – Additional parameters/payload inherited from BotActionFinished().

Visual Choice Scene Action

Visualize a choice to the user, ideally allowing the user to interact with the choice in multiple ways. Example: show a website on screen that presents the options, allowing the user to choose either by touching an option or by using their voice to select one.

StartVisualChoiceSceneAction(allow_multiple_choices: bool, choice_type: ChoiceType, options: List[umim.messages.modalities.scene.VisualChoiceOption], prompt: str, image: Optional[str], support_prompts: Optional[List[str]])

Present a choice in the scene to the user.

Parameters
  • allow_multiple_choices (bool) – Indicate if the user should be able to select multiple choices from the presented options

  • choice_type (ChoiceType) – Configures the type of choice the user can make.

  • options (List[umim.messages.modalities.scene.VisualChoiceOption]) – List of options for the user to choose from

  • prompt (str) – Describes the choice you are offering to the user

  • image (Optional[str]) – Description of an image that should be shown alongside the choice.

  • support_prompts (Optional[List[str]]) – List of prompts supporting the user in making a choice

  • ... – Additional parameters/payload inherited from StartBotAction().
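As a sketch, a simple plan selection could be started as follows. The option fields (id, text) and the choice_type value are assumptions for illustration; the actual VisualChoiceOption and ChoiceType definitions are part of the UMIM message schema.

    # Illustrative StartVisualChoiceSceneAction payload.
    start_choice = {
        "type": "StartVisualChoiceSceneAction",
        "allow_multiple_choices": False,
        "choice_type": "selection",                 # assumed ChoiceType value
        "prompt": "Which plan would you like?",
        "options": [
            {"id": "basic", "text": "Basic plan"},  # hypothetical option fields
            {"id": "premium", "text": "Premium plan"},
        ],
        "support_prompts": ["You can tap an option or just say it."],
    }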

VisualChoiceSceneActionStarted()

The system has started presenting the choice to the user.

Parameters

... – Additional parameters/payload inherited from BotActionStarted().

VisualChoiceSceneActionChoiceUpdated(current_choice: List[str])

Whenever the user interacts directly with the choice presented in the scene but has not yet confirmed or canceled the choice, a ChoiceUpdated event is sent out by the interactive system.

Parameters
  • current_choice (List[str]) – List of option IDs the user has currently selected

  • ... – Additional parameters/payload inherited from BotActionUpdated().

VisualChoiceSceneActionConfirmationUpdated(confirmation_status: ConfirmationStatus)

Whenever the user confirms or tries to abort the choice when interacting with the visual representation of the choice. Examples of this include clicking a “confirm” button or clicking “close”.

Parameters
  • confirmation_status (ConfirmationStatus) – Status of the choice confirmation by the user.

  • ... – Additional parameters/payload inherited from BotActionUpdated().

StopVisualChoiceSceneAction()

Stop presenting the choice to the user.

Parameters

... – Additional parameters/payload inherited from StopBotAction().

VisualChoiceSceneActionFinished(final_choice: List[str])

The choice action was stopped by the IM (no user action will cause the action to be finished by the Action Server).

Parameters
  • final_choice (List[str]) – List of option IDs the user finally chose

  • ... – Additional parameters/payload inherited from BotActionFinished().

Visual Form Scene Action

Visualize a form to the user for cases where the bot needs accurate and specific input. Common examples include showing a form to get a user’s postal or email address.

StartVisualFormSceneAction(inputs: List[umim.messages.modalities.scene.VisualFormInputs], prompt: str, image: Optional[str], support_prompts: Optional[List[str]])

Present a visual form in the scene that requests certain inputs from the user.

Parameters
  • inputs (List[umim.messages.modalities.scene.VisualFormInputs]) – List of inputs required.

  • prompt (str) – Describes the inputs you are requesting from the user

  • image (Optional[str]) – Description of an image that should be shown alongside the prompt.

  • support_prompts (Optional[List[str]]) – List of prompts supporting the user in making a choice

  • ... – Additional parameters/payload inherited from StartBotAction().

VisualFormSceneActionStarted()

The system has started presenting the form to the user.

Parameters

... – Additional parameters/payload inherited from BotActionStarted().

VisualFormSceneActionConfirmationUpdated(confirmation_status: ConfirmationStatus)

Whenever the user confirms or tries to abort the form input when interacting with the visual representation of the form. Examples of this include clicking a “confirm” button or clicking “close”.

Parameters
  • confirmation_status (ConfirmationStatus) – Status of the form input confirmation by the user.

  • ... – Additional parameters/payload inherited from BotActionUpdated().

VisualFormSceneActionInputUpdated(interim_inputs: List[umim.messages.modalities.scene.VisualFormInputs])

Whenever the user interacts directly with the form inputs presented in the scene but has not yet confirmed the input, a VisualFormSceneActionInputUpdated event is sent out by the interactive system. This allows the IM to react to partial inputs; e.g., if a user is typing an e-mail address, the bot can react to partial inputs (the bot could say “And now only the domain is missing” after the user typed “@” in the form field).

Parameters
  • interim_inputs (List[umim.messages.modalities.scene.VisualFormInputs]) – Current (partial) state of all inputs.

  • ... – Additional parameters/payload inherited from BotActionUpdated().
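A sketch of how an IM might react to such partial input follows. The input shape ({"id": ..., "value": ...}) and the say() helper are assumptions for illustration, not part of the UMIM specification.

    # Illustrative handler for VisualFormSceneActionInputUpdated events.
    def on_input_updated(interim_inputs, say):
        for field in interim_inputs:
            if field["id"] == "email" and field["value"].endswith("@"):
                # React to the partial input, e.g. via a StartUtteranceBotAction.
                say("And now only the domain is missing.")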

StopVisualFormSceneAction()

Stop presenting the form to the user.

Parameters

... – Additional parameters/payload inherited from StopBotAction().

VisualFormSceneActionFinished(final_inputs: List[umim.messages.modalities.scene.VisualFormInputs])

Form action was stopped by the IM (no user action will cause the action to be finished by the Action Server).

Parameters
  • final_inputs (List[umim.messages.modalities.scene.VisualFormInputs]) – Final state of all inputs.

  • ... – Additional parameters/payload inherited from BotActionFinished().

System

This section describes system and utility actions and events that are necessary to build robust interactions.

Presence User Action

The system detected the presence of a user. Depending on the interactive system this can mean that a user opened the interface (e.g. a chat app), that a user entered the field of view of a camera (e.g. a kiosk), or that the system detects mouse or keyboard input indicating that a user is in front of their laptop.

Note: if no user is present, no user actions should be created by the interactive system. The PresenceUserAction therefore serves as a kind of parent action for all user actions.

PresenceUserActionStarted()

The interactive system detects the presence of a user in the system.

Parameters

... – Additional parameters/payload inherited from UserActionStarted().

PresenceUserActionFinished()

The interactive system detects the user’s absence.

Parameters

... – Additional parameters/payload inherited from UserActionFinished().

Attention User Action

The system detected the engagement of a user with the interactive system. Engagement can be measured in many different ways depending on the interactive system. In a chat application, engagement can for example be estimated through typing characteristics or the app navigation behavior of the user. In an interactive avatar setting, user engagement might be estimated based on the user's visual attention.

AttentionUserActionStarted(attention_level: Optional[float])

The interactive system detects some level of engagement of the user.

Parameters
  • attention_level (Optional[float]) – Optional. Float in the range (0,1] (not including 0). Indicates the estimated attention level. The level should represent a statistic for the measured attention of the user. Depending on the interactive system this could for example be the mean engagement level over a sliding window.

  • ... – Additional parameters/payload inherited from UserActionStarted().

AttentionUserActionUpdated(attention_level: Optional[float])

The interactive system provides an update to the engagement level.

Parameters
  • attention_level (Optional[float]) – Optional. Float in the range (0,1] (not including 0). Indicates the estimated attention level.

  • ... – Additional parameters/payload inherited from UserActionUpdated().

AttentionUserActionFinished()

The system detects the user to be disengaged with the interactive system.

Parameters

... – Additional parameters/payload inherited from UserActionFinished().

Timer Bot Action

Set a timer for a specified duration.

StartTimerBotAction(duration: timedelta, timer_name: Optional[str])

Start a timer.

Parameters
  • duration (timedelta) – Time duration with respect to the event_created_at timestamp of this StartTimerBotAction event. When the duration has passed, the timer goes off (TimerBotActionFinished is sent out).

  • timer_name (Optional[str]) – Name for the timer

  • ... – Additional parameters/payload inherited from StartBotAction().
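For example, an IM might set a five-second silence timeout as in the sketch below. The timer name is hypothetical, and how the timedelta is serialized on the wire (e.g. as an ISO 8601 duration) depends on the implementation.

    from datetime import timedelta

    # Illustrative StartTimerBotAction payload.
    start_timer = {
        "type": "StartTimerBotAction",
        "timer_name": "user_silence_timeout",  # hypothetical timer name
        "duration": timedelta(seconds=5),      # measured from event_created_at
    }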

TimerBotActionStarted()

Timer started.

Parameters

... – Additional parameters/payload inherited from BotActionStarted().

ChangeTimerBotAction(duration: timedelta)

Change the duration of the timer. If the duration is reduced, this can cause the timer to go off immediately (a TimerBotActionFinished event will be sent out).

Parameters
  • duration (timedelta) – Change the duration of the timer with respect to the event_created_at timestamp of this StartTimerBotAction event.

  • ... – Additional parameters/payload inherited from ChangeBotAction().

StopTimerBotAction()

Stop the timer.

Parameters

... – Additional parameters/payload inherited from StopBotAction().

TimerBotActionFinished()

Timer finished.

Parameters

... – Additional parameters/payload inherited from BotActionFinished().

Managing context

Many interaction managers have a notion of context or memory in which information about the interaction can be stored. The following event supports notifying components about context updates.

ContextUpdate(data: Dict[str, Any])

An update to the context. All specified keys will override the ones in the current context.

Parameters
  • data (Dict[str, Any]) – Any contextual data that has changed.

  • ... – Additional parameters/payload inherited from Event().
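The override semantics correspond to a shallow dictionary merge, as the following sketch illustrates:

    # ContextUpdate applies a shallow merge: specified keys override the
    # ones in the current context; all other keys are left untouched.
    context = {"user_name": "Ada", "cart_items": 2}
    update = {"data": {"cart_items": 3, "last_intent": "checkout"}}

    context.update(update["data"])
    assert context == {"user_name": "Ada", "cart_items": 3, "last_intent": "checkout"}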

Pipeline events

Interaction managers can typically handle multiple interactions at the same time. In UMIM these are abstracted as pipelines. A pipeline consists of:

  • stream_uid, which corresponds to a unique stream of UMIM events, typically tied to a single client instance (e.g. an always-on kiosk system in a shop, or a new browser session started by a user)

  • session_uid, which denotes a single interaction session. What a session consists of depends on your use case.

  • user_uid, which identifies the user participating in the session.

PipelineAcquired(stream_uid: str, session_uid: Optional[str], user_uid: Optional[str])

A new pipeline has been acquired. A pipeline connects the IM to end user devices (stream_uid). This event informs the IM in an event-based implementation about the availability of new pipelines.

Parameters
  • stream_uid (str) – A unique identifier for the stream

  • session_uid (Optional[str]) – A unique identifier for the session

  • user_uid (Optional[str]) – A unique identifier for the user

  • ... – Additional parameters/payload inherited from Event().

PipelineUpdated(stream_uid: str, session_uid: Optional[str], user_uid: Optional[str])

Information about an existing pipeline has been updated. This means that a new session was started or a new user has been identified as part of the same pipeline.

Parameters
  • stream_uid (str) – A unique identifier for the stream

  • session_uid (Optional[str]) – A unique identifier for the session

  • user_uid (Optional[str]) – A unique identifier for the user

  • ... – Additional parameters/payload inherited from Event().

PipelineReleased(stream_uid: str, session_uid: Optional[str], user_uid: Optional[str])

A pipeline has been released and is no longer available. A pipeline connects the IM to end user devices (stream_uid). This event informs the IM in an event-based implementation about pipelines that have been released.

Parameters
  • stream_uid (str) – A unique identifier for the stream

  • session_uid (Optional[str]) – A unique identifier for the session

  • user_uid (Optional[str]) – A unique identifier for the user

  • ... – Additional parameters/payload inherited from Event().
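A minimal sketch of how an event-based IM might keep track of pipelines using these three events (the bookkeeping structure is illustrative):

    # Illustrative pipeline bookkeeping in an interaction manager.
    pipelines = {}  # stream_uid -> {"session_uid": ..., "user_uid": ...}

    def handle_pipeline_event(event):
        uid = event["stream_uid"]
        if event["type"] == "PipelineAcquired":
            pipelines[uid] = {"session_uid": event.get("session_uid"),
                              "user_uid": event.get("user_uid")}
        elif event["type"] == "PipelineUpdated":
            for key in ("session_uid", "user_uid"):
                if event.get(key) is not None:
                    pipelines[uid][key] = event[key]
        elif event["type"] == "PipelineReleased":
            pipelines.pop(uid, None)  # stop sending actions to this stream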

Integration

This section describes actions that integrate the interactive system with other apps or services, providing additional capabilities the system can leverage.

REST API Call Action

The system should make a REST API call.

StartRestApiCallBotAction(request_type: RequestType, url: str, headers: Optional[Dict[str, Any]], payload: Optional[Dict[str, Any]])

Start an API call.

Parameters
  • request_type (RequestType) – Request type

  • url (str) – REST API endpoint

  • headers (Optional[Dict[str, Any]]) – Custom headers

  • payload (Optional[Dict[str, Any]]) – Dict that will be converted to JSON and Content-Type header set to application/json

  • ... – Additional parameters/payload inherited from StartBotAction().
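For example, a POST request could be triggered as in this sketch; the endpoint, headers, and payload are placeholders, and the request_type value assumes “POST” is a valid RequestType.

    # Illustrative StartRestApiCallBotAction payload.
    start_call = {
        "type": "StartRestApiCallBotAction",
        "request_type": "POST",                    # assumed RequestType value
        "url": "https://example.com/api/orders",   # placeholder endpoint
        "headers": {"Authorization": "Bearer <token>"},
        "payload": {"order_id": 1234},             # sent as application/json
    }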

RestApiCallBotActionStarted()

API call started.

Parameters

... – Additional parameters/payload inherited from BotActionStarted().

RestApiCallBotActionFinished(response: Dict[str, Any])

API call finished.

Parameters
  • response (Dict[str, Any]) – Response of call

  • ... – Additional parameters/payload inherited from BotActionFinished().