Overview

NVIDIA ACE Agent is a GPU-accelerated SDK for building conversational AI agents, or bots, that are powered by LLMs, customized for your use case, and capable of real-time performance. It offers a complete workflow for building and deploying virtual agents that support multi-turn, multi-user contextual conversations. It connects AI skills such as NVIDIA Riva Speech AI and NVIDIA ACE Avatar AI & Vision AI with use-case-specific custom plugins and user interfaces through efficient system integration and composable dialog management.

ACE Agent inference is powered by NVIDIA TensorRT optimizations and served using the NVIDIA Riva Skills Service and NVIDIA Triton Inference Server, both of which are part of the NVIDIA AI platform. ACE Agent supports gRPC APIs for low-latency speech streaming in virtual assistant applications, and simple REST APIs for text-only chatbots.

ACE Agent Workflow

ACE Agent is fully containerized and can scale to a large number of concurrent users.

Some of the major benefits that ACE Agent provides are:

  • Built-in LLM integration - ACE Agent works with large language models (LLMs) out of the box and provides a hook to connect the LLM of your choice.

  • On-premises model deployment - ACE Agent supports on-premises deployment of ACE Agent models as well as community and custom models. NVIDIA NIM for LLMs provides state-of-the-art, GPU-accelerated large language model serving. Using NIM, you can deploy an LLM of your choice on premises and use it with ACE Agent.

  • Highly customizable - ACE Agent lets you fully customize the behavior of the bot for your use case using Colang. You can also integrate agents and bots built with LangChain or similar frameworks into the ACE Agent pipeline to build multi-model use cases.

  • RAG - ACE Agent integrates easily with Retrieval-Augmented Generation (RAG) workflows, so you can build agents on top of existing knowledge documents with minimal effort.

  • Low latency - ACE Agent uses NVIDIA TensorRT-optimized models, NVIDIA Triton Inference Server for model deployment, and an optimized chat controller to ensure low-latency, high-throughput bot interactions.

Structure of this Document

  • Quick Start Guide - This is the starting point to try out ACE Agent. Specifically, this Quick Start Guide enables you to deploy sample bots and interact with them.

  • Release Notes - These release notes describe the key features, software enhancements and improvements, and known issues for the ACE Agent release.

  • Architecture - ACE Agent is a collection of microservices. This section describes the architecture of the microservices and the different pipelines that can be built from them.

  • Deployment - This section provides instructions to deploy the bots built using ACE Agent in different environments like Docker, Kubernetes, or Python Native.

  • Tutorials - NVIDIA ACE Agent is an SDK that helps you build a domain-specific conversational AI agent using Large Language Models (LLMs) and other NLP models. In this section, you will learn how to build a simple bot using ACE Agent and then add various capabilities to it.

  • User Guide - Learn how to perform general configuration tasks, such as changing the LLM that your bot uses. Specifically:

    • Colang Guide - Learn about Colang, the dialog modeling language used to build conversations.

    • Integration with LangChain and LlamaIndex - Integrate NVIDIA ACE Agent in your existing LangChain-powered application or bring your preferred retrieval solutions to ACE Agent.

    • NLP Server - Learn how to plug in any custom NLP model seamlessly using this component and utilize it in your bots.

    • Speech AI - Learn how NVIDIA ACE Agent enables voice modality for your bots and the ecosystem of features around speech AI.

    • Training Models - The ACE Agent Quick Start comes with a model helper script. In this section, you will learn how to train NVIDIA Riva Joint Intent & Slot Classification, Text Classification, and Named Entity Recognition NLP models with custom domain-specific datasets, evaluate the models, and deploy them.

    • Plugin Server - Learn how to sandbox your domain-specific custom business logic, such as calling an external endpoint, while interacting with your bot.

  • Configurations Guide - This section provides guidance on the different configurations available to you and outlines the common configuration files you need to assemble in order to build your own bot.

  • API Guide - This section provides a comprehensive explanation for the schemas exposed by the ACE Agent servers.

  • Best Practices - This section goes into more detail and provides guidance on how to tackle different common use cases you may encounter while building applications using NVIDIA ACE Agent.

  • Sample Bots - Learn about the different sample bots that come with NVIDIA ACE Agent and how to deploy them in native and Docker-based environments.

  • Reference - Understand what the compatibility requirements are and learn how to migrate to the latest bot release.