Overview
The Tokkio UI is a reference web application that lets a user open and interact with a Tokkio application in a web browser. It can be deployed as a standalone application or embedded into another web application using an iframe.
It demonstrates the end-to-end interactions between the frontend and the Tokkio deployment, including the avatar stream, ASR and TTS transcripts, and multimodal interactions.
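The iframe embedding mentioned above can be sketched as follows. This is a minimal illustration, not part of the Tokkio UI itself: the helper name `buildTokkioIframe`, the example URL, the default dimensions, and the `allow` permission list are all assumptions chosen for the example. Check your own deployment for the actual UI address and the browser permissions it needs.

```typescript
// Hypothetical helper: builds the HTML markup for embedding a deployed
// Tokkio UI in a host page. Nothing here is a Tokkio API; the URL is
// whatever address your own deployment serves the UI from.
function escapeAttribute(value: string): string {
  // Escape the characters that would break out of a double-quoted attribute.
  return value
    .replace(/&/g, "&amp;")
    .replace(/"/g, "&quot;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

function buildTokkioIframe(
  uiUrl: string,
  width: string = "100%",
  height: string = "600px"
): string {
  const src = escapeAttribute(uiUrl);
  const style = escapeAttribute(`width:${width};height:${height};border:0`);
  // Assumption: a voice-driven avatar page needs microphone, camera, and
  // autoplay permission delegated to the iframe by the host page.
  const allow = "camera; microphone; autoplay";
  return `<iframe src="${src}" allow="${allow}" style="${style}"></iframe>`;
}

// Example with a placeholder URL (replace with your deployment's address).
const iframeMarkup = buildTokkioIframe("https://tokkio.example.com/?mode=embed&lang=en");
console.log(iframeMarkup);
```

The host page can insert the returned markup wherever the avatar should appear. Building the markup as a string keeps the sketch independent of any particular frontend framework.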
Architecture
Overall Architecture
Below is the architecture diagram of the Tokkio UI, showing how the UI connects to the other services in the Tokkio deployment.

Multimodal RAG Architecture
Below is the architecture diagram of the Tokkio UI Multimodal feature, showing how the UI connects to the ACE Controller and the NVIDIA RAG service to retrieve multimodal content.

The NVIDIA RAG service returns a list of images, tables, and text as citations alongside the answer in its RAG response. When Tokkio is connected to the NVIDIA RAG service, the ACE Controller listens for these citations, processes them into multimodal custom view frames, and sends the frames to the UI to be displayed alongside the avatar.
A similar process can be followed for any other multimodal service without changing any UI code. Visit the Adding Custom Multimodal Content section to learn more.
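To make the citation-to-frame step described above more concrete, here is a minimal sketch of the transformation. Every type and field name in it (`Citation`, `ViewBlock`, `CustomViewFrame`, `citationsToFrame`, and their properties) is hypothetical and chosen only for illustration. It does not reflect the actual NVIDIA RAG response schema or the ACE Controller's frame format, and in a real deployment this processing happens in the ACE Controller rather than in the UI.

```typescript
// Hypothetical citation shapes: one variant per content kind named in the text
// (images, tables, and text). Not the real NVIDIA RAG schema.
type Citation =
  | { kind: "image"; title: string; url: string }
  | { kind: "table"; title: string; rows: string[][] }
  | { kind: "text"; title: string; content: string };

// Hypothetical display block that a UI could render next to the avatar.
interface ViewBlock {
  type: "image" | "table" | "paragraph";
  title: string;
  payload: unknown;
}

// Hypothetical frame that wraps all blocks produced for one response.
interface CustomViewFrame {
  type: "custom_view";
  blocks: ViewBlock[];
}

// Map each citation to a display block, then wrap the blocks in a single frame.
function citationsToFrame(citations: Citation[]): CustomViewFrame {
  const blocks = citations.map((c): ViewBlock => {
    switch (c.kind) {
      case "image":
        return { type: "image", title: c.title, payload: { url: c.url } };
      case "table":
        return { type: "table", title: c.title, payload: { rows: c.rows } };
      case "text":
        return { type: "paragraph", title: c.title, payload: { text: c.content } };
    }
  });
  return { type: "custom_view", blocks };
}

// Example input with placeholder data.
const exampleFrame = citationsToFrame([
  { kind: "image", title: "Diagram", url: "https://example.com/diagram.png" },
  { kind: "text", title: "Summary", content: "Placeholder citation text." },
]);
console.log(JSON.stringify(exampleFrame, null, 2));
```

Because the UI only consumes the resulting frames, another multimodal service could plug in by producing frames of the same shape, which is consistent with the point that no UI code needs to change.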