ACE Controller Microservice

The ACE Controller is a microservice built on Pipecat, an open-source Python framework for building real-time, voice-enabled, multimodal conversational AI agents. Pipecat uses a pipeline-based architecture for real-time AI processing and handles the complex orchestration of AI services, network transport, audio processing, and multimodal interactions, letting you focus on creating engaging experiences. The ACE Controller microservice extends Pipecat so that developers can easily customize, debug, and deploy complex pipelines and integrate NVIDIA services into the Pipecat ecosystem.

Architecture

Figure: ACE Controller Architecture

Pipecat

Pipecat-ai is an open-source framework developed by Daily that streamlines the creation of real-time voice and multimodal conversational AI agents. It acts as an orchestration layer, managing the flow of audio, video, and text data between AI services such as speech-to-text, large language models, and text-to-speech. This lets developers build interactive applications with natural, low-latency conversations, for use cases such as AI-powered customer service, virtual assistants, and real-time interactive experiences. For an in-depth understanding of the Pipecat-ai framework, refer to the Pipecat documentation.
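The pipeline idea described above can be illustrated with a small, self-contained sketch. It deliberately does not use the Pipecat API: `Frame`, `Processor`, `Pipeline`, and the three `Fake*` stages are hypothetical names invented for this example, and the "services" are simple string transformations standing in for real speech-to-text, LLM, and text-to-speech calls. It only shows the general shape of frames flowing through an ordered chain of processors.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class Frame:
    """A unit of data flowing through the pipeline (hypothetical type)."""
    kind: str      # e.g. "audio", "text", or "speech"
    payload: str


class Processor:
    """Base stage: receives a frame and returns the frame to pass downstream."""

    async def process(self, frame: Frame) -> Frame:
        return frame


class FakeSpeechToText(Processor):
    async def process(self, frame: Frame) -> Frame:
        # Stand-in for a real STT service: strip the fake "audio:" prefix.
        if frame.kind == "audio":
            return Frame("text", frame.payload.replace("audio:", "", 1))
        return frame


class FakeLLM(Processor):
    async def process(self, frame: Frame) -> Frame:
        # Stand-in for a language model: produce a canned reply.
        if frame.kind == "text":
            return Frame("text", f"You said: {frame.payload}")
        return frame


class FakeTextToSpeech(Processor):
    async def process(self, frame: Frame) -> Frame:
        # Stand-in for a TTS service: wrap the text as "speech".
        if frame.kind == "text":
            return Frame("speech", f"<spoken>{frame.payload}</spoken>")
        return frame


class Pipeline:
    """Runs a frame through each processor in order."""

    def __init__(self, processors):
        self.processors = list(processors)

    async def run(self, frame: Frame) -> Frame:
        for processor in self.processors:
            frame = await processor.process(frame)
        return frame


def demo() -> Frame:
    pipeline = Pipeline([FakeSpeechToText(), FakeLLM(), FakeTextToSpeech()])
    return asyncio.run(pipeline.run(Frame("audio", "audio:hello")))


if __name__ == "__main__":
    print(demo())
```

Each stage only reacts to the frame kinds it understands and passes everything else through unchanged, which is what lets stages be added, removed, or reordered without touching their neighbors. The real framework adds streaming, interruption handling, and transports on top of this basic shape; refer to the Pipecat documentation for the actual classes.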

NVIDIA Pipecat

NVIDIA Pipecat extends Pipecat-ai with additional frame processors and services, plus new multimodal frames to drive avatar interactions. It integrates NVIDIA services and NIMs such as NVIDIA Riva, NVIDIA Audio2Face, and NVIDIA Foundational RAG. It also introduces processors aimed at improving the end-user experience of multimodal conversational agents, along with speculative speech processing, which reduces latency so that the bot responds faster. The nvidia-pipecat source code can be found in the GitHub repository.

Figure: Pipecat Library

ACE Controller

ACE Controller integrates the Pipecat ecosystem with ACE microservices so that developers can easily build and customize applications such as a digital human and deploy them for production use cases. ACE Controller provides a FastAPI-based HTTP and websocket server to support scaling across multiple user streams. It exposes REST APIs for adding and removing streams, which external components such as the SDR (Stream Distribution & Routing) microservice can use to manage the workload of ACE Controller pods. It also integrates with the VST (Video Storage Toolkit) microservice by supporting RTSP input streams in ACETransport, alongside a websocket connection that is used only for UI communication. For more details, refer to the ACE Controller Scaling section. The UCS microservice enables quick Kubernetes-based deployment, using a single Helm chart, alongside other ACE microservices such as the Animation Graph, Video Storage Toolkit, Audio2Face, and Riva microservices.
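To make the stream add and remove flow more concrete, here is a minimal sketch of the bookkeeping that such endpoints might sit on top of. This section does not show the actual REST paths or payloads, so everything here is an assumption for illustration: the `StreamRegistry` class, its method names, the per-instance `max_streams` limit, and the example RTSP URLs are all hypothetical. The sketch uses only the standard library; in a real service, methods like these would typically be called from HTTP route handlers.

```python
import threading


class StreamRegistry:
    """Tracks active streams for one controller instance (hypothetical helper)."""

    def __init__(self, max_streams: int):
        self.max_streams = max_streams  # assumed per-instance capacity limit
        self._streams = {}              # stream_id -> RTSP URL
        self._lock = threading.Lock()   # guards concurrent add/remove calls

    def add_stream(self, stream_id: str, rtsp_url: str) -> bool:
        """Register a stream; return False if the ID exists or capacity is reached."""
        with self._lock:
            if stream_id in self._streams:
                return False
            if len(self._streams) >= self.max_streams:
                return False
            self._streams[stream_id] = rtsp_url
            return True

    def remove_stream(self, stream_id: str) -> bool:
        """Unregister a stream; return False if it was not registered."""
        with self._lock:
            return self._streams.pop(stream_id, None) is not None

    def active_count(self) -> int:
        """Number of streams currently handled by this instance."""
        with self._lock:
            return len(self._streams)


if __name__ == "__main__":
    registry = StreamRegistry(max_streams=1)
    print(registry.add_stream("s1", "rtsp://example.invalid/s1"))
    print(registry.add_stream("s2", "rtsp://example.invalid/s2"))
    print(registry.remove_stream("s1"))
```

Returning an explicit success flag from `add_stream` gives an external router a simple signal that an instance is full, so it can send the stream to another instance instead. How the SDR microservice actually makes that decision is covered in the ACE Controller Scaling section, not by this sketch.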