Tokkio 3.0.1#
Tokkio LLM app#
The Tokkio LLM reference app has been renamed to the Tokkio LLM-RAG reference app. It allows users to connect their own RAG pipeline to Tokkio, with or without response streaming.
- Colang 2.0 support
- Enterprise RAG support
- GenerativeAIExamples RAG support
- LLM-based gesture generation
- LLM response streaming support
- Barge-in support
Tokkio QSR app#
- The intent-slot NLP models are replaced with an LLM
- Colang 2.0 support
- Barge-in support
Animation and Rendering#
- New animation pipeline architecture and microservices
- New A2F model and FP16 support
- OV Kit upgrade
- New scene
- Dynamic GPU allocation support
- Support for multiple log levels
VST#
- Updated VST microservice
- 4K streaming support and dual peer connection
UMIM#
- Introduction of the UMIM action server
UI and UI server#
- UMIM integration
- Dual peer connection support
ACE agent#
- Updated architecture and microservices
RIVA#
- Support for the parakeet_1-1b ASR model
- IPA dictionary support
SDR#
- Error recovery support
Architecture#
- Migration from UCF 2.0 to UCS 2.5
Security#
- Security patches for all the microservices
MLOPS#
- The MLOPS microservice is no longer supported
Known Issues#
- The Tokkio renderer and VST pods have to be manually restarted after a few hours of deployment.
- For T4 GPU-based platforms, it is recommended to use the [RIVA Conformer ASR model](https://registry.ngc.nvidia.com/orgs/nvidia/teams/ucs-ms/models/asr_conformer_en_us_streaming_throughput_flashlight_vad/version) instead of the default [RIVA Parakeet 1.1B ASR model](https://registry.ngc.nvidia.com/orgs/nvidia/teams/ucs-ms/models/rmir_asr_parakeet_1-1b_en_us_str_vad/version) due to compute limitations.
- The Tokkio one-click scripts do not support Tokkio customization with non-NGC resources.
- For full control over unethical, political, and racial questions, the user is expected to add guardrails to the ACE bot.
- The ASR model is sensitive to background noise. Use the RIVA Conformer ASR model in environments with background noise.
- A2F CPU usage has increased since Tokkio 3.0.
Tokkio Reference Application#
- Some menu-navigation speech queries, such as "Go to the next page" or "Show the main menu", might give inaccurate results.
- LLM prompt tuning is required to adapt some of the responses appropriately for different LLM models.
- Item and topping replacement might give inaccurate results.
- The item recommendations feature has been removed.
- Adding multiple items via speech in the same sentence might lead to inaccurate results.