Python Environment

The ACE Agent Quick Start scripts contain the native Python ACE Agent tool packaged as a wheel. You can use the tool both to try out key NVIDIA ACE Agent functionality and for the development workflow of different bots. It allows you to quickly develop features in a non-containerized environment and to iteratively test changes to the bot configuration files. The Python environment supports the CLI, Server, and Event interfaces only.

Note

This environment doesn’t support speech-based bots and is more suitable for developers who want to use a native setup for building text-based bots.

Prerequisites

Before you start using NVIDIA ACE Agent, ensure that you meet the following prerequisites. The current version of ACE Agent is supported only on NVIDIA data center GPUs.

  1. You have access to and are logged into NVIDIA GPU Cloud (NGC). You have installed the NGC CLI tool on the local system and logged into the NGC container registry. For more details about NGC, refer to the NGC documentation.

  2. You have installed Docker and the NVIDIA container toolkit. Ensure that you have gone through the Docker post-installation steps for managing Docker as a non-root user.

  3. You have access to an NVIDIA Volta, NVIDIA Turing, NVIDIA Ampere, NVIDIA Ada Lovelace, or an NVIDIA Hopper Architecture-based GPU.

  4. You have Python >= 3.8.10 and pip >= 23.1.2 installed on your workstation.

Setup

  1. Download NVIDIA ACE Agent Quick Start Scripts by cloning the GitHub ACE repository.

git clone git@github.com:NVIDIA/ACE.git
cd ACE
  2. Go to the ACE Agent microservices directory.

cd microservices/ace_agent

  3. Set your NGC API key in the NGC_CLI_API_KEY environment variable.

export NGC_CLI_API_KEY=...

  4. Create a virtual environment and activate it.

python3 -m venv venv && source venv/bin/activate

  5. Install the aceagent Python package.

pip install deploy/wheel/aceagent-4.0.0-py3-none-any.whl

  6. Based on the bot’s configurations, you might need to export additional environment variables. For example, bots using OpenAI models need to set the OPENAI_API_KEY environment variable.

The aceagent tool exposes a command line interface for interaction and offers the following commands.

Chat Engine

The Chat Engine exposes different interfaces for interacting with the bot using text: CLI, Server, and Event.

CLI Interface

This interface is suited for bot development and allows for quicker iterations.

The Chit Chat sample bot is a simple conversation bot for doing small talk and is present in the Quick Start Scripts directory at ./samples/chitchat_bot. Perform the following steps to deploy using the CLI interface.

  1. Set the OPENAI_API_KEY environment variable before launching the bot. This bot uses OpenAI gpt-3.5-turbo-instruct as the main model.

    export OPENAI_API_KEY=...
    
  2. Start the bot and interact with it through your workstation terminal.

    aceagent chat cli -c samples/chitchat_bot/
    

For example:

[YOU] Are you a person?
[BOT] No, I am just a chatbot.

[YOU] What is your name?
[BOT] I do not have a name yet.

CLI Interface

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| --config | Path | none | Required path to a directory (or multiple directories) containing configuration files to use for the bot. Can also point to a single configuration file. |
| --log-level | Text | warning | Controls the verbosity level of the Chat Engine logs. |
| --help | | | Shows a list of supported command line arguments. |

HTTP Server Interface

The HTTP server interface exposes several REST APIs that can be used to interact with the bot, view the bot’s status, and update the bot’s state.

The Chit Chat sample bot is a simple conversation bot for doing small talk and is present in the Quick Start Scripts directory at ./samples/chitchat_bot. Perform the following steps to deploy using the Server interface.

  1. Set the OPENAI_API_KEY environment variable before launching the bot. This bot uses OpenAI gpt-3.5-turbo-instruct as the main model.

    export OPENAI_API_KEY=...
    
  2. Start the bot in the Server interface to deploy a FastAPI-based REST server.

    aceagent chat server -c samples/chitchat_bot/
    

    You can interact with the bot over HTTP following the HTTP Interface schema. The bot response text can be found under Response["response"]["text"].

    CURL

    curl -X POST http://localhost:9000/chat \
        -H "Content-Type: application/json" \
        -H "Accept: application/json" \
        -d '{"UserId": "1", "Query": "Hello"}'
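
If you prefer Python, the following is a minimal sketch of the same request using the requests library, assuming the server is running locally on the default port:

import requests

# Send a chat query to the locally running ACE Agent HTTP server,
# using the same payload as the cURL call above.
response = requests.post(
    "http://localhost:9000/chat",
    json={"UserId": "1", "Query": "Hello"},
)
response.raise_for_status()

# Print the full JSON payload; the bot's reply text is nested inside it.
print(response.json())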
    
Server Interface

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| --config | Path | none | Required path to a directory (or multiple directories) containing configuration files to use for the bot. Can also point to a single configuration file. |
| --port | Integer | 9000 | Port for the HTTP interface of ACE Agent. |
| --workers | Integer | 1 | Number of uvicorn workers with which the web server will start. |
| --log-level | Text | warning | Controls the verbosity level of the Chat Engine logs. |
| --help | | | Shows a list of supported command line arguments. |

Event Interface

The event interface requires a running Redis server. Communication (input and output) with the event interface takes place using Redis streams. The easiest way to set up a local Redis server is to use the official Redis container images (alternative installation methods can be found in the official Redis documentation):

docker run -d --rm --name redis --net host redis

You can try the Colang 2.0 sample bot, which is present in the Quick Start Scripts directory at ./samples/colang_2_sample_bot. The bot uses OpenAI gpt-3.5-turbo-instruct as the main model. Set the OPENAI_API_KEY environment variable:

export OPENAI_API_KEY=...

To start the bot using the event interface, run:

aceagent chat event --config samples/colang_2_sample_bot/

This launches the event interface and waits for PipelineAcquired events on the ace_agent_system_events stream. These events indicate that a new stream or pipeline has become available. ACE Agent then spawns an event worker dedicated to that stream, which forwards any events to the configured bot.

You can interact with the bot using the ACE Agent event interface Async API schema. You can use the Event Sample Client for trying out the bot.
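
For illustration, the sketch below publishes an event to the system events stream using the redis-py client. The event payload here is a hypothetical placeholder; the actual event names and fields are defined by the Async API schema, so refer to it (or to the Event Sample Client) for real interactions.

import json

import redis

# Connect to the local Redis server started earlier.
client = redis.Redis(host="localhost", port=6379)

# Publish an event on the stream that ACE Agent listens on.
# NOTE: the field names and values below are placeholders for illustration;
# the real schema comes from the event interface Async API documentation.
client.xadd(
    "ace_agent_system_events",
    {"event": json.dumps({"type": "PipelineAcquired", "stream_id": "stream_1"})},
)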

Event Interface

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| --config | Path | none | Required path to a directory (or multiple directories) containing configuration files to use for the bot. Can also point to a single configuration file. |
| --event-provider-name | Text | redis | Currently, only redis is supported. Support for additional message brokers or event providers is planned. |
| --event-provider-host | Text | localhost | Host address where the event provider (for example, the Redis server) is running. |
| --event-provider-port | Integer | 6379 | Port where the event provider is listening (for example, the Redis server port). |
| --port | Integer | 9000 | Port for the HTTP interface of ACE Agent. This is required if you need a programmatic way of checking the health status of ACE Agent (for example, as part of liveness or readiness probes in a cluster deployment). |
| --workers | Integer | 1 | Number of uvicorn workers with which the web server will start. |
| --log-level | Text | warning | Controls the verbosity level of the Chat Engine logs. |
| --help | | | Shows a list of supported command line arguments. |

Plugin Server

The Plugin server is a FastAPI-based server that enables the ACE Agent to interact with third-party applications or APIs over a REST interface. It exposes a Swagger endpoint, which allows developers to easily write and validate Plugin servers in a sandbox environment.
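
For illustration, a plugin is essentially a set of FastAPI routes. The sketch below is a minimal hypothetical plugin; the endpoint, the router variable, and how it is wired into plugin_config.yaml are assumptions here, so refer to the Plugin server documentation for the actual contract.

from fastapi import APIRouter

# Hypothetical plugin sketch: a single REST endpoint that a bot could call
# to fetch data from a third-party service.
router = APIRouter()

@router.get("/menu")
def get_menu() -> dict:
    # A real plugin would call an external API here instead of returning
    # static example data.
    return {"items": ["pizza", "burger", "fries"]}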

The food ordering bot is a virtual assistant that can help you place a food order. It can list items from the menu, add, remove, and replace items in your cart, and help you place the order. To deploy the Plugin server for the food ordering sample bot, run:

aceagent plugin-server deploy --config samples/food_ordering_bot/plugin_config.yaml

To stop the Plugin server, run:

aceagent plugin-server stop

Plugin Server

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| --config | Path | none | Required path (relative to the present working directory) to the plugin_config.yaml file. |
| --port | Integer | 9002 | Port where the Plugin server will start. |
| --log-level | Text | warning | Controls the verbosity level of the Plugin server logs. |
| --help | | | Shows a list of supported command line arguments. |

Model Deployment

To deploy the NLP models listed in model_config.yaml in the bot configurations, along with the NLP server, run:

aceagent models deploy --config samples/food_ordering_bot/model_config.yaml

This command deploys the Intent Slot model for the food ordering bot.

Model Deployment

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| --config | Path | none | Required path (relative to the present working directory) to the model_config.yaml file. |
| --model-repository-path | Path | ./model_repository | Path where optimized models will be stored for Triton Inference Server. |
| --no-cache | Boolean | False | By default, cached Triton model plans are used. Set this flag to rebuild the Triton plans. Cached models are stored in the $PWD/.cache directory. |
| --gpus | Text | 1 | GPUs to be used for deployment. Allowed formats: 3 (use 3 GPUs in total) or device=0,1 (use devices 0 and 1). |
| --speech | Boolean | False | Set this flag to deploy the speech models (ASR and TTS) if they are specified in model_config.yaml. |
| --help | | | Shows a list of supported command line arguments. |

NLP Server

The NLP server provides a single unified interface to integrate different NLP models in the dialog pipeline. It utilizes production-tested model servers such as the NVIDIA Triton Inference Server and the Riva Speech Server, while also allowing you to easily integrate experimental custom models into the pipeline.

If you have already deployed models using Triton Inference Server or Riva Server, you can deploy the NLP server REST interface by running:

aceagent nlp-server deploy --config samples/food_ordering_bot/model_config.yaml

NLP Server

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| --config | Path | none | Required path (relative to the present working directory) to the model_config.yaml file. |
| --custom_model_dir | Path | none | Directory containing the custom model clients for the NLP server. |
| --port | Integer | 9003 | Port where the NLP server will start. |
| --log-level | Text | INFO | Controls the verbosity level of the NLP server logs. |
| --workers | Integer | 1 | Number of uvicorn workers with which the NLP server will start. |
| --help | | | Shows a list of supported command line arguments. |

Clean Up

You can stop all running servers and models using the following commands:

aceagent chat stop
aceagent models stop
aceagent nlp-server stop
aceagent plugin-server stop

Python Native Application

You can write a Python application using the aceagent package and interact with the bot directly from your own code.

Interacting with the bot synchronously:

from time import sleep

from chat_engine import CreateBots, Core

# Load the bot from its configuration directory.
bots = CreateBots.from_path(config_dir=["samples/chitchat_bot/"])

# Wait until the bot has finished initializing.
while not bots[0].is_ready:
    sleep(1)

# Send a single query and print the bot's reply.
request = {"UserId": "1", "Query": "Hello"}
response = Core.get_response(bots[0], request)

print(response.get("Response").get("Text"))

Interacting with a streaming bot:

import asyncio
import json
from time import sleep

from chat_engine import CreateBots, Core

# Load the bot from its configuration directory.
bots = CreateBots.from_path(config_dir=["samples/chitchat_bot/"])

# Wait until the bot has finished initializing.
while not bots[0].is_ready:
    sleep(1)


async def chat():
    request = {"UserId": "1", "Query": "Tell me a joke"}
    streaming_handler = await Core.stream_response_async(bots[0], request)

    # Print each response chunk as it arrives, stopping at the final chunk.
    print("[BOT] ", end="", flush=True)
    async for chunk in streaming_handler:
        if not chunk:
            break
        parsed = json.loads(chunk)
        if parsed["Response"]["IsFinal"]:
            break

        print(parsed["Response"]["Text"], end="", flush=True)


if __name__ == "__main__":
    asyncio.run(chat())