Tokkio LLM-RAG - Omniverse Renderer#

The Tokkio LLM-RAG - Omniverse Renderer app integrates the Omniverse Renderer Microservice into the Tokkio pipeline to support the Omniverse RTX real-time renderer. It lets users deploy their Tokkio application with a wide selection of OV avatars (pre-built or custom).

This reference app is a variation of Tokkio LLM-RAG using Omniverse Renderer as its rendering option. Other workflows such as Tokkio Retail can also be used with this particular rendering option.

Minimum GPU Requirements#

Single-stream configuration: 2xT4 or 2xL4

3-stream deployment: 4xT4

6-stream deployment: 4xA10 or 4xL4
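The minimum requirements above can be captured as a small lookup for pre-deployment checks. This is a minimal sketch, not part of Tokkio: the names `MIN_GPU_OPTIONS` and `matches_listed_minimum` are hypothetical, and the helper only reports whether a GPU setup matches one of the listed configurations. It makes no claim about GPU models or stream counts that the table does not mention.

```python
# Minimum GPU options per stream count, copied from the requirements above.
# Each option is a (gpu_count, gpu_model) pair. All names here are
# illustrative assumptions, not part of any Tokkio API.
MIN_GPU_OPTIONS = {
    1: [(2, "T4"), (2, "L4")],
    3: [(4, "T4")],
    6: [(4, "A10"), (4, "L4")],
}


def matches_listed_minimum(streams, gpu_count, gpu_model):
    """Return True if the GPUs match a listed option for this stream count.

    A False result only means the combination is not listed above;
    it does not prove the hardware is insufficient.
    """
    options = MIN_GPU_OPTIONS.get(streams)
    if options is None:
        raise ValueError(f"no documented requirement for {streams} streams")
    return any(
        gpu_model == model and gpu_count >= count
        for count, model in options
    )
```

For example, `matches_listed_minimum(6, 4, "L4")` checks a four-L4 node against the 6-stream row.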

Architecture#

The LLM RAG with OV renderer is the default option for Tokkio deployment. It follows the basic Tokkio architecture described in Microservices. The architecture diagram is also shown below for reference.

Architecture Overview with Microservices

Note that the Tokkio LLM RAG resource used here is the resource for the Plugin server, which is part of the fulfillment pipeline. The renderer option is the OV renderer, as indicated in the diagram.

Source#

The Helm chart for the sample LLM RAG workflow can be found at https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ace/helm-charts/ucs-tokkio-app-base-1-stream-llm-rag-3d-ov.

Refer to the Deployment page for deployment instructions.

Customization#

Avatar and scene customizations for this rendering option can be performed as described in Avatar and Scene customization.

Other bot customizations are independent of the rendering pipeline. See the Customization page for more information.