Docker Environment

Prerequisites

Before you start using NVIDIA ACE Agent, ensure that you meet the following prerequisites. The current version of ACE Agent is supported only on NVIDIA data center GPUs.

  1. You have access to NVIDIA GPU Cloud (NGC) and are logged in. You have installed the NGC CLI tool on your local system and logged in to the NGC container registry. For more details about NGC, refer to the NGC documentation.

  2. You have installed Docker and the NVIDIA container toolkit. Ensure that you have gone through the Docker post-installation steps for managing Docker as a non-root user.

  3. You have access to an NVIDIA Volta, NVIDIA Turing, NVIDIA Ampere, NVIDIA Ada Lovelace, or an NVIDIA Hopper Architecture-based GPU.

  4. You have Python >= 3.8.10 and pip >= 23.1.2 installed on your workstation.
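
You can verify these prerequisites from a terminal. The following are standard version checks for the tools above, not part of the Quick Start scripts.

    docker --version
    nvidia-smi
    python3 --version
    pip --version
    ngc --version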

Setup

  1. Download NVIDIA ACE Agent Quick Start Scripts by cloning the GitHub ACE repository.

    git clone git@github.com:NVIDIA/ACE.git
    cd ACE
    
  2. Go to the ACE Agent microservices directory.

    cd microservices/ace_agent
    
  3. Set your NGC API key in the NGC_CLI_API_KEY environment variable. Based on the bot’s configurations, you might need to export additional environment variables. For example, bots using OpenAI models will need to set the OPENAI_API_KEY environment variable.

    export NGC_CLI_API_KEY=...
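
    If you have not yet logged in to the NGC container registry, you can reuse this key; $oauthtoken is the literal username NGC expects for API-key logins.

    # Log in to nvcr.io using the NGC API key ($oauthtoken is a literal string).
    echo "$NGC_CLI_API_KEY" | docker login nvcr.io --username '$oauthtoken' --password-stdin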
    

Deployment

The ACE Agent Quick Start package comes with a deploy/docker/docker-compose.yml for deploying the microservices. Before running Docker Compose commands, perform the following steps:

  1. Set the BOT_PATH environment variable to the directory containing the bot configurations.

    export BOT_PATH=<LOCAL Directory Path Containing Bot>
    
  2. Export the environment variables needed by sourcing deploy/docker/docker_init.sh.

    source deploy/docker/docker_init.sh
    
  3. Pass the environment variables to the ACE Agent containers. You can use the .env file present in the same directory as docker-compose.yml; a sketch of such a file follows these steps.

  4. Stop all running Docker containers.

    docker compose -f deploy/docker/docker-compose.yml down
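
Docker Compose reads the .env file next to docker-compose.yml for variable substitution. A minimal sketch, assuming the bot only needs the variables discussed above (the exact set depends on your bot's configuration):

    # deploy/docker/.env - illustrative entries only; replace the placeholders.
    NGC_CLI_API_KEY=<your NGC API key>
    OPENAI_API_KEY=<your OpenAI API key, if the bot uses OpenAI models>
    BOT_PATH=<local directory path containing the bot>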
    

CLI Interface

chat-bot-cli - gives you access to the Chat Engine container's terminal so that you can interact with the bot from your workstation terminal.

The Chit Chat sample bot uses OpenAI gpt-3.5-turbo-instruct as the main model. The Chit Chat bot is present in the quickstart directory at ./samples/chitchat_bot.

For deploying the Chit Chat sample bot via the CLI interface, perform the following steps:

  1. Set the OpenAI API key environment variable.

    export OPENAI_API_KEY=...
    
  2. Prepare the environment for the Docker Compose commands.

    export BOT_PATH=./samples/chitchat_bot/
    source deploy/docker/docker_init.sh
    
  3. Deploy the ACE Agent containers: the Chat Engine, Plugin server, and NLP server. To start the CLI interface, issue docker exec in the Chat Engine container (if the exec fails, see the troubleshooting note after these steps).

    docker compose -f deploy/docker/docker-compose.yml up chat-bot-cli -d
    docker compose -f deploy/docker/docker-compose.yml exec chat-bot-cli $CLI_CMD
    
  4. Interact with the bot by providing user queries via the CLI.

    [YOU] Are you a person
    [BOT] I'm not a real person, but I certainly exist.
    [YOU] What is your name?
    [BOT] I don't have a name. Or I don't know it.
    
  5. Stop all running Docker containers.

    docker compose -f deploy/docker/docker-compose.yml down
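
If the docker exec in step 3 fails because the container is still starting, you can first check the service state; chat-bot-cli is the same Compose service name used in the commands above.

    # Show the status of the chat-bot-cli service from this compose file.
    docker compose -f deploy/docker/docker-compose.yml ps chat-bot-cli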
    

HTTP Server Interface

chat-bot - deploys an HTTP web server with textual chat support only.

The Stock bot utilizes mixtral-8x7b-instruct-v0.1 from the NVIDIA API Catalog. The Stock bot is present in the quickstart directory at ./samples/stock_bot.

For deploying the Stock sample bot via the Server interface, perform the following steps:

  1. Set the API key for the NVIDIA API Catalog.

    export NVIDIA_API_KEY=...
    
  2. Prepare the environment for the Docker Compose commands.

    export BOT_PATH=./samples/stock_bot/
    source deploy/docker/docker_init.sh
    
  3. Deploy the ACE Agent microservices: the Chat Engine, Plugin server, and NLP server containers.

    docker compose -f deploy/docker/docker-compose.yml up chat-bot -d
    
  4. Try out the bot using a web browser. You can deploy a sample frontend application with textual chat support only by using the following command; a direct HTTP request is also sketched after this step.

    docker compose -f deploy/docker/docker-compose.yml up frontend
    

    You can interact with the bot using the URL http://<workstation IP>:9001/.
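
    Besides the web UI, you can send requests to the chat server directly over HTTP. The port, endpoint path, and JSON fields below are assumptions for illustration; consult the Chat Engine server API documentation for the actual interface.

    # Hypothetical request; adjust the port, path, and payload to your deployment.
    curl -X POST "http://<workstation IP>:9000/chat" \
      -H "Content-Type: application/json" \
      -d '{"UserId": "user1", "Query": "How did NVIDIA stock perform today?"}'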

Figure: Stock Market bot sample.

  5. Stop the deployment.

    docker compose -f deploy/docker/docker-compose.yml down
    

Event Interface

The ACE Agent event interface provides an asynchronous, event-based interface for interacting with bots written in Colang 2.0, and it allows bots to make full use of all features of UMIM (Unified Multimodal Interaction Management).

The Colang 2.0 sample bot uses OpenAI gpt-3.5-turbo-instruct as the main model. The Colang 2.0 sample bot is present in the quickstart directory at ./samples/colang_2_sample_bot.

For starting the Colang 2.0 sample bot via the Event Interface, perform the following steps:

  1. Set the OpenAI API key environment variable.

    export OPENAI_API_KEY=...
    
  2. Prepare the environment for the Docker Compose commands.

    export BOT_PATH=./samples/colang_2_sample_bot
    source deploy/docker/docker_init.sh
    
  3. Deploy the ACE Agent microservices: the Chat Engine, Plugin server, and NLP server containers, along with the Redis container.

    docker compose -f deploy/docker/docker-compose.yml up event-bot -d
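
Before trying a client, you can confirm that the event bot's containers, including Redis, are running by listing the Compose service status.

    # List the state of the services started from this compose file.
    docker compose -f deploy/docker/docker-compose.yml ps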
    

To interact with the bot, you can use the Python sample client for the Event Interface.

gRPC Interface

A gRPC web server is deployed with voice capture and playback support.

The Stock bot utilizes mixtral-8x7b-instruct-v0.1 from the NVIDIA API Catalog. The Stock bot is present in the quickstart directory at ./samples/stock_bot.

For deploying the Stock sample bot via the gRPC interface, perform the following steps:

  1. Set the API key for the NVIDIA API Catalog.

    export NVIDIA_API_KEY=...
    
  2. Prepare the environment for the Docker Compose commands.

    export BOT_PATH=./samples/stock_bot/
    source deploy/docker/docker_init.sh
    
  3. Deploy the Speech and NLP models required for the bot; this might take 20-40 minutes the first time. For the Stock sample bot, Riva ASR (Automatic Speech Recognition) and TTS (Text-to-Speech) models will be deployed.

    docker compose -f deploy/docker/docker-compose.yml up model-utils-speech
    
  4. Deploy the ACE Agent microservices: the Chat Controller, Chat Engine, Plugin server, and NLP server.

    docker compose -f deploy/docker/docker-compose.yml up speech-bot -d
    
  5. Wait a few minutes for all services to be ready; you can check the Docker logs of the individual microservices to confirm. The Chat Controller container's logs will show Server listening on 0.0.0.0:50055 when it is ready (see the log check after these steps).

  6. Try out the bot using a web browser. You can deploy a sample frontend application with voice capture and playback support, as well as text input-output support, using the following command.

    docker compose -f deploy/docker/docker-compose.yml up frontend-speech
    

    Interact with the bot using the URL http://<workstation IP>:9001/. To access the microphone in the browser, either convert the http endpoint to https by adding SSL validation, or update chrome://flags/ or edge://flags/ to treat http://<workstation IP>:9001 as a secure endpoint.
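
To confirm the readiness message from step 5 without tailing every log, you can grep the Chat Controller's logs. The service name chat-controller is an assumption; check deploy/docker/docker-compose.yml for the exact name.

    # The service name is assumed; replace it with the actual Compose service name.
    docker compose -f deploy/docker/docker-compose.yml logs chat-controller 2>&1 \
      | grep "Server listening on 0.0.0.0:50055"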

Alternatively, you can use the Python sample gRPC client present in the Quick Start resource.

Event Interface with Speech

For speech-to-speech conversation, you can deploy a gRPC server along with an event interface.

The Colang 2.0 sample bot uses OpenAI gpt-3.5-turbo-instruct as the main model. The Colang 2.0 sample bot is present in the quickstart directory at ./samples/colang_2_sample_bot.

For deploying the Colang 2.0 sample bot via the gRPC interface, perform the following steps:

  1. Set the OpenAI API key environment variable.

    export OPENAI_API_KEY=...
    
  2. Prepare the environment for the Docker Compose commands.

    export BOT_PATH=./samples/colang_2_sample_bot
    source deploy/docker/docker_init.sh
    
  3. For event-based bots, you need to use the speech_umim or avatar_umim pipeline configuration for the Chat Controller microservice. Update the PIPELINE variable in deploy/docker/docker_init.sh, or override it by setting the PIPELINE environment variable manually (see the note after these steps).

    export PIPELINE=speech_umim
    
  4. Deploy the Speech and NLP models required for the bot; this might take 20-40 minutes the first time. For the Colang 2.0 sample bot, Riva ASR (Automatic Speech Recognition) and TTS (Text-to-Speech) models will be deployed.

    docker compose -f deploy/docker/docker-compose.yml up model-utils-speech
    
  5. Deploy the ACE Agent microservices: the Chat Controller, Chat Engine, Plugin server, and NLP server, along with the Redis container.

    docker compose -f deploy/docker/docker-compose.yml up speech-event-bot -d
    
  6. Wait a few minutes for all services to be ready; you can check the Docker logs of the individual microservices to confirm. The Chat Controller container's logs will show Server listening on 0.0.0.0:50055 when it is ready.

  7. Try out the bot. You can use the Python sample gRPC client present in the Quick Start resource. The sample web application is not supported for event-interface-based bots.
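
Note on overriding PIPELINE in step 3: because docker_init.sh sets the variable, export your override after sourcing the script. As a standard shell alternative, you can scope the override to a single command; this assumes the compose file substitutes ${PIPELINE} from the environment.

    # One-shot override: PIPELINE applies only to this command's environment.
    PIPELINE=speech_umim docker compose -f deploy/docker/docker-compose.yml up speech-event-bot -d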

Model Deployment

For deploying all models required for the bot, perform the following steps:

  1. Prepare the environment for the Docker Compose commands.

    export BOT_PATH=<BOT Directory containing model_config.yaml>
    source deploy/docker/docker_init.sh
    
  2. Deploy the NLP models.

    docker compose -f deploy/docker/docker-compose.yml up model-utils
    
  3. Deploy the Speech and NLP models required for the bot.

    docker compose -f deploy/docker/docker-compose.yml up model-utils-speech
    
  4. Stop all running models.

    docker stop riva-speech-server nlp_triton
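
To check which of these model containers are running before stopping them, a standard docker ps filter works; the container names match those in the stop command above.

    # List only the Riva speech and NLP Triton containers, if they are running.
    docker ps --filter "name=riva-speech-server" --filter "name=nlp_triton"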