Kubernetes Environment#
ACE Agent provides Kubernetes deployment using NVIDIA Unified Cloud Services (UCS) Tools. NVIDIA Unified Cloud Services Tools (UCS Tools) is a low-code framework for developing cloud-native, real-time, and multimodal AI applications. The NVIDIA ACE Agent release includes the following UCS microservices:
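Microservice Name | Version | Description
---|---|---
ucf.svc.ace-agent.chat-engine | 4.1.0 | Chat Engine Microservice
ucf.svc.ace-agent.nlp-server | 4.1.0 | NLP Server Microservice
ucf.svc.ace-agent.plugin-server | 4.1.0 | Plugin Server Microservice
ucf.svc.ace-agent.chat-controller | 4.1.0 | Chat Controller Microservice for Speech Support
ucf.svc.ace-agent.web-app | 4.1.0 | Sample WebUI Application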
You can create your own custom application Helm chart using ACE Agent microservices with UCS applications. The ACE Agent Quick Start package comes with a number of UCS applications for sample bots, which can be found in the ./deploy/ucs_apps/ directory.
Prerequisites
Before you start using NVIDIA ACE Agent, it’s assumed that you meet the following prerequisites. The current version of ACE Agent is only supported on NVIDIA data centers.
You have access and are logged into NVIDIA GPU Cloud (NGC). You have installed the NGC CLI tool on the local system and you have logged into the NGC container registry. For more details about NGC, refer to the NGC documentation.
You have installed UCS tools along with prerequisite setups such as Helm, Kubernetes, GPU Operator, and so on. Refer to the UCS tools developer system and deployment system prerequisite sections for detailed instructions. The latest UCS 2.5 tools require Ubuntu 22.04; alternatively, you can build a Docker image for executing UCS commands using the Ubuntu 22.04 base image.
You have access to an NVIDIA Volta, NVIDIA Turing, NVIDIA Ampere, NVIDIA Ada Lovelace, or an NVIDIA Hopper Architecture-based GPU.
Setup
Install the Local Path Provisioner by running the following command if not already done:
curl https://raw.githubusercontent.com/rancher/local-path-provisioner/v0.0.23/deploy/local-path-storage.yaml | sed 's/^ name: local-path$/ name: mdx-local-path/g' | kubectl apply -f -
Note
If you miss deploying the local path provisioner Helm chart, you will see the following errors during deployment and some of the pods will be stuck in pending state.
Warning FailedMount 4m10s (x2 over 4m11s) kubelet MountVolume.SetUp failed for volume "workload-cm-volume" : failed to sync configmap cache: timed out waiting for the condition
Warning FailedMount 4m10s (x2 over 4m11s) kubelet MountVolume.SetUp failed for volume "scripts-cm-volume" : failed to sync configmap cache: timed out waiting for the condition
Warning FailedMount 4m10s (x2 over 4m11s) kubelet MountVolume.SetUp failed for volume "configs-volume" : failed to sync configmap cache: timed out waiting for the condition
Set up the mandatory Kubernetes secrets required for deployment.
export NGC_CLI_API_KEY=...
kubectl create secret docker-registry ngc-docker-reg-secret --docker-server=nvcr.io --docker-username='$oauthtoken' --docker-password="${NGC_CLI_API_KEY}"
kubectl create secret generic ngc-api-key-secret --from-literal=NGC_CLI_API_KEY="${NGC_CLI_API_KEY}"
Download NVIDIA ACE Agent Quick Start Scripts by cloning the GitHub ACE repository.
git clone git@github.com:NVIDIA/ACE.git
cd ACE
Go to the ACE Agent microservices directory.
cd microservices/ace_agent
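The secret-creation commands in the Setup steps above assume that NGC_CLI_API_KEY is already exported. A mistake there typically only shows up later as an image pull or download failure, so it can help to fail early. The following is a minimal sketch, not part of the Quick Start package; the helper names require_env and create_ngc_secrets are hypothetical, while the kubectl commands are the ones shown above.

```shell
# Hypothetical helper (not part of the ACE Agent package): fail early when a
# required environment variable is unset or empty.
require_env() {
    name="$1"
    # Indirect lookup of the variable named by $name (POSIX-compatible).
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
        echo "error: $name is not set" >&2
        return 1
    fi
}

# Create both mandatory secrets, but only if the key is present.
create_ngc_secrets() {
    require_env NGC_CLI_API_KEY || return 1
    kubectl create secret docker-registry ngc-docker-reg-secret \
        --docker-server=nvcr.io \
        --docker-username='$oauthtoken' \
        --docker-password="${NGC_CLI_API_KEY}" || return 1
    kubectl create secret generic ngc-api-key-secret \
        --from-literal=NGC_CLI_API_KEY="${NGC_CLI_API_KEY}"
}
```

After exporting the key, calling create_ngc_secrets runs the same two commands as the manual step.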
Deploying a Bot via UCS Application#
We will use the stock bot application specs, present in the Quick Start package at ./deploy/ucs_apps/speech_bot, as an example.
The Stock bot uses the gpt-4-turbo model from OpenAI as the main model.
Create a Kubernetes secret for the OPENAI_API_KEY.
export OPENAI_API_KEY="sk-XXX"
kubectl create secret generic openai-key-secret --from-literal=OPENAI_API_KEY=${OPENAI_API_KEY}
Generate the Helm Chart using UCS tools.
ucf_app_builder_cli app build deploy/ucs_apps/speech_bot/stock_bot/app.yaml deploy/ucs_apps/speech_bot/stock_bot/app-params.yaml
Deploy the Helm Chart.
helm install ace-agent deploy/ucs_apps/speech_bot/stock_bot/ucf-app-speech-bot-4.1.0
Wait for all pods to be ready.
watch kubectl get pods
Try out the deployed bot using a web frontend application. Get the NodePort for ace-agent-webapp-deployment-service using kubectl get svc and interact with the bot using the URL http://<workstation IP>:<NodePort_for_7006>.
To access the microphone in the browser, either convert the http endpoint to https by adding SSL validation, or update chrome://flags/ or edge://flags/ to allow http://<workstation IP>:<nodeport_7006> as a secure endpoint.
Stop the deployment and remove the persistent volumes.
helm uninstall ace-agent
kubectl delete pvc --all
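The "wait for all pods" and "get the NodePort" steps in this section can also be scripted instead of watched by hand. The sketch below is an assumption-laden example rather than part of the package: the function names are hypothetical, and it assumes the web app service exposes service port 7006, as the <NodePort_for_7006> placeholder suggests.

```shell
# Hypothetical helpers; the service name and port 7006 come from the steps
# above, everything else is an assumption about your cluster.

# Block until every pod in the current namespace reports Ready, or time out.
wait_for_pods() {
    timeout="${1:-600s}"
    kubectl wait --for=condition=Ready pods --all --timeout="$timeout"
}

# Print the web app URL for a given node IP by looking up the NodePort that
# maps to service port 7006.
webapp_url() {
    node_ip="$1"
    node_port=$(kubectl get svc ace-agent-webapp-deployment-service \
        -o jsonpath='{.spec.ports[?(@.port==7006)].nodePort}') || return 1
    [ -n "$node_port" ] || { echo "error: no NodePort found for port 7006" >&2; return 1; }
    echo "http://${node_ip}:${node_port}"
}
```

For example, wait_for_pods 3600s followed by webapp_url <workstation IP> prints the URL to open in the browser. Pods that run to completion never report Ready, so the wait may time out if the chart contains such pods.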
Building a Custom Helm Chart using ACE Agent Microservices#
In this tutorial, we will showcase how you can create a Helm Chart using ACE Agent Microservices for use cases like text-based bot and speech-based bot and extend it for your custom applications. ACE Agent provides Kubernetes support using Unified Cloud Services (UCS). You can use the tutorials for building UCS applications in the UCS documentation as a reference.
We will use:
the UCS CLI; you should be able to execute the same steps via UCS Studio,
various customizations for the UCS application specs, and
various UCS microservices in this tutorial.
Building a Chatbot UCS Application#
In this tutorial, we will use the stock sample bot present at ./samples/stock_bot as a reference bot for building the chatbot UCS application. We already have a UCS application for the stock chat bot at ./deploy/ucs_apps/chat_bot/stock_bot/ that you can use as a reference during this tutorial.
Create the boilerplate UCS application specs.
ucf_app_builder_cli app create test-app
Build the UCS application with the Chat Engine microservice only. Update test-app/app.yaml as follows:
We only need the Chat Engine Microservice and Redis as the message broker; we can optionally include the Web App Microservice for a UI to interact with the text-based bot. The dependencies we need are:
dependencies:
- ucf.svc.ace-agent.chat-engine:4.1.0
- ucf.svc.core.redis-timeseries:0.0.20
- ucf.svc.ace-agent.web-app:4.1.0
Update the components section with the Chat Engine and Web App Microservices.
components:
- name: chat-engine
  type: ucf.svc.ace-agent.chat-engine
  parameters:
    imagePullSecrets:
    - name: ngc-docker-reg-secret
- name: redis-timeseries
  type: ucf.svc.core.redis-timeseries
  parameters:
    imagePullSecrets:
    - name: ngc-docker-reg-secret
- name: webapp
  type: ucf.svc.ace-agent.web-app
  parameters:
    imagePullSecrets:
    - name: ngc-docker-reg-secret
Update the connections between the Chat Engine and the Web App Microservices.
connections:
  chat-engine/redis: redis-timeseries/redis
  webapp/redis: redis-timeseries/redis
The Stock bot uses the gpt-4-turbo model from OpenAI as the main model. Create a Kubernetes secret for the OPENAI_API_KEY.
export OPENAI_API_KEY="sk-XXX"
kubectl create secret generic openai-key-secret --from-literal=OPENAI_API_KEY=${OPENAI_API_KEY}
Retrieve the NGC CLI API key for downloading NGC resources. We have created ngc-api-key-secret in the Prerequisites section. Remove the vault section and update the secrets section with ngc-api-key-secret and openai-key-secret.
secrets:
  k8sSecret/ngc-api-key-secret/NGC_CLI_API_KEY:
    k8sSecret:
      secretName: ngc-api-key-secret
      key: NGC_CLI_API_KEY
  k8sSecret/openai-key-secret/OPENAI_API_KEY:
    k8sSecret:
      secretName: openai-key-secret
      key: OPENAI_API_KEY
Configure ngc-api-key-secret and openai-key-secret in the chat-engine section.
- name: chat-engine
  type: ucf.svc.ace-agent.chat-engine
  parameters:
    imagePullSecrets:
    - name: ngc-docker-reg-secret
  secrets:
    ngc-api-key-secret: k8sSecret/ngc-api-key-secret/NGC_CLI_API_KEY
    openai-key-secret: k8sSecret/openai-key-secret/OPENAI_API_KEY
Mount the bot to the Chat Engine Microservice by providing the local bot directory under the chat-engine component in the application specs.
- name: chat-engine
  type: ucf.svc.ace-agent.chat-engine
  parameters:
    imagePullSecrets:
    - name: ngc-docker-reg-secret
  secrets:
    ngc-api-key-secret: k8sSecret/ngc-api-key-secret/NGC_CLI_API_KEY
    openai-key-secret: k8sSecret/openai-key-secret/OPENAI_API_KEY
  files:
    config_dir: ../samples/stock_bot/
Specify parameters for the WebUI and Chat Engine microservices. You can either update the parameters section in the chat-engine and webapp components or create a new file test-app/params.yaml and update it.
chat-engine:
  botConfigName: stock_bot_config.yml
  interface: "event"
webapp:
  chatInterface: "event"
  speechFlags: ""
Generate the Helm Chart using UCS tools.
ucf_app_builder_cli app build test-app/app.yaml test-app/params.yaml
Deploy the generated Helm Chart and wait for all pods to be ready. Interact with the bot using the Web app at http://<NodeIP>:<NodePort_for_7006>/. You can get the NodePort using kubectl get svc.
helm install test test-app/test-app-0.0.1/
As we haven’t deployed the Plugin server microservice, queries related to live stock prices will not work. Add an ACE Agent Plugin server to the UCS application by updating test-app/app.yaml.
Update the dependencies section with the Plugin server microservice.
dependencies:
- ucf.svc.ace-agent.chat-engine:4.1.0
- ucf.svc.ace-agent.web-app:4.1.0
- ucf.svc.core.redis-timeseries:0.0.20
- ucf.svc.ace-agent.plugin-server:4.1.0
Add the plugin-server in components.
- name: plugin-server
  type: ucf.svc.ace-agent.plugin-server
  parameters:
    imagePullSecrets:
    - name: ngc-docker-reg-secret
  secrets:
    ngc-api-key-secret: k8sSecret/ngc-api-key-secret/NGC_CLI_API_KEY
Mount the plugin configurations by providing the local directory under the plugin-server component.
- name: plugin-server
  type: ucf.svc.ace-agent.plugin-server
  parameters:
    imagePullSecrets:
    - name: ngc-docker-reg-secret
  secrets:
    ngc-api-key-secret: k8sSecret/ngc-api-key-secret/NGC_CLI_API_KEY
  files:
    config_dir: ../samples/stock_bot/
Add a connection between the Chat Engine and the Plugin microservices.
connections:
  chat-engine/plugin-server: plugin-server/http-api
  chat-engine/redis: redis-timeseries/redis
  webapp/redis: redis-timeseries/redis
Specify parameters for the Plugin microservice. You can either update the parameters section in the plugin-server component, or update test-app/params.yaml.
plugin-server:
  pluginConfigPath: "plugin_config.yaml" # relative path of the plugin_config.yaml
Generate the updated Helm Chart and deploy the application. You should be able to interact with the updated bot using the Web app at http://<NodeIP>:<NodePort>/.
ucf_app_builder_cli app build test-app/app.yaml test-app/params.yaml
# Uninstall existing deployment
helm uninstall test
kubectl delete pvc --all
# Deploy updated helm chart
helm install test test-app/test-app-0.0.1/
Try all queries supported by the stock sample bot.
The stock sample bot doesn’t use any NLP models, but if you add NLP models to your custom bots, you need to add an NLP server microservice.
Add the NLP server microservice in the dependencies section.
dependencies:
- ucf.svc.ace-agent.chat-engine:4.1.0
- ucf.svc.ace-agent.web-app:4.1.0
- ucf.svc.core.redis-timeseries:0.0.20
- ucf.svc.ace-agent.plugin-server:4.1.0
- ucf.svc.ace-agent.nlp-server:4.1.0
Update the components.
- name: nlp-server
  type: ucf.svc.ace-agent.nlp-server
  parameters:
    imagePullSecrets:
    - name: ngc-docker-reg-secret
  secrets:
    ngc-api-key-secret: k8sSecret/ngc-api-key-secret/NGC_CLI_API_KEY
Update the connections. The Chat Engine utilizes the NLP server.
connections:
  chat-engine/plugin-server: plugin-server/http-api
  chat-engine/redis: redis-timeseries/redis
  webapp/redis: redis-timeseries/redis
  chat-engine/nlp-server: nlp-server/api-server
We can mount the model configurations by providing the local directory under the nlp-server component.
- name: nlp-server
  type: ucf.svc.ace-agent.nlp-server
  parameters:
    imagePullSecrets:
    - name: ngc-docker-reg-secret
  secrets:
    ngc-api-key-secret: k8sSecret/ngc-api-key-secret/NGC_CLI_API_KEY
  files:
    config_dir: ../samples/stock_bot/
Update the NLP Server Microservice parameters and GPU device in test-app/params.yaml.
nlp-server:
  ucfVisibleGpus: [0]
  modelConfigPath: "model_config.yaml"
Generate the updated Helm Chart and deploy the application. You should be able to interact with the updated bot using the Web app at http://<NodeIP>:<NodePort>/.
# Build updated helm chart
ucf_app_builder_cli app build test-app/app.yaml test-app/params.yaml
# Uninstall previous deployment
helm uninstall test
kubectl delete pvc --all
# Install updated helm chart
helm install test test-app/test-app-0.0.1/
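The build, uninstall, and install sequence is repeated after every change to the application specs in this tutorial, so it may be convenient to wrap it in one function. The following is a sketch only: the function name redeploy and its argument order are hypothetical, and the commands inside are the ones used in the steps above.

```shell
# Hypothetical wrapper around the build / uninstall / install commands used
# in this tutorial. Arguments: release name, app directory, chart directory.
redeploy() {
    release="$1"
    app_dir="$2"
    chart_dir="$3"

    # Rebuild the Helm chart from the UCS application specs.
    ucf_app_builder_cli app build "$app_dir/app.yaml" "$app_dir/params.yaml" || return 1

    # Remove the previous release if one exists; ignore a "not found" error.
    helm uninstall "$release" 2>/dev/null || true

    # WARNING: deletes every PVC in the current namespace, as in the tutorial.
    kubectl delete pvc --all || return 1

    helm install "$release" "$chart_dir"
}
```

With the names used in this section, the call would be redeploy test test-app test-app/test-app-0.0.1/. Because kubectl delete pvc --all is not scoped to the release, this is only safe in a namespace dedicated to the bot.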
Building a Speech Bot UCS Application#
In the previous section, we created a text-based UCS application for the stock sample bot. We will now update that UCS application to support speech-to-speech conversations. We already have a UCS application for the stock speech bot at ./deploy/ucs_apps/speech_bot/stock_bot/ that you can use as a reference during this tutorial.
Add Speech support using the Chat Controller Microservice.
Add the Chat Controller Microservice in the dependencies section.
dependencies:
- ucf.svc.ace-agent.chat-engine:4.1.0
- ucf.svc.ace-agent.web-app:4.1.0
- ucf.svc.ace-agent.plugin-server:4.1.0
- ucf.svc.ace-agent.nlp-server:4.1.0
- ucf.svc.ace-agent.chat-controller:4.1.0
- ucf.svc.core.redis-timeseries:0.0.20
Add the chat-controller to the components section.
- name: chat-controller
  type: ucf.svc.ace-agent.chat-controller
  parameters:
    imagePullSecrets:
    - name: ngc-docker-reg-secret
  secrets:
    ngc-api-key-secret: k8sSecret/ngc-api-key-secret/NGC_CLI_API_KEY
Add the Riva Speech Skills microservice for deploying speech models in the dependencies section.
Add the riva-speech to the components section.
- name: riva-speech
  type: ucf.svc.riva.speech-skills
  parameters:
    imagePullSecrets:
    - name: ngc-docker-reg-secret
  secrets:
    ngc-api-key-secret: k8sSecret/ngc-api-key-secret/NGC_CLI_API_KEY
Add the Chat Controller connections to the Riva server and Chat Engine. Update the webapp connection with the Chat Controller.
connections:
  chat-engine/plugin-server: plugin-server/http-api
  chat-engine/redis: redis-timeseries/redis
  chat-controller/riva: riva-speech/riva-speech-api
  chat-controller/redis: redis-timeseries/redis
  webapp/chat-controller: chat-controller/grpc-api
  webapp/redis: redis-timeseries/redis
Update the Riva server parameters with the ASR and TTS models in test-app/params.yaml.
riva-speech:
  riva:
    visibleGpus: "0"
  modelRepoGenerator:
    ngcModelConfigs:
      triton0:
        models:
        #> description: List of NGC models for deployment
        - nvidia/ace/rmir_asr_parakeet_1-1b_en_us_str_vad:2.17.0 #english
        - nvidia/riva/rmir_tts_fastpitch_hifigan_en_us_ipa:2.17.0
Update the Chat Controller microservice parameters to use the speech_umim pipeline.
chat-controller:
  pipeline: speech_umim
Update the WebUI microservice parameters to enable speech mode in test-app/params.yaml.
webapp:
  chatInterface: "event"
  speechFlags: "--speech"
Generate the updated Helm Chart and deploy the application. Wait for all pods to be ready; this might take around 40-50 minutes. You should be able to interact with the updated bot using the Speech Web App at http://<NodeIP>:<NodePort_7006>/.
# Rebuild the helm chart
ucf_app_builder_cli app build test-app/app.yaml test-app/params.yaml
# Uninstall previous deployment
helm uninstall test
kubectl delete pvc --all
# Install updated helm chart
helm install test test-app/test-app-0.0.1/
Note
To access the microphone in the browser, you need an HTTPS endpoint. You can add SSL validation and convert the endpoint to HTTPS, or update chrome://flags/ or edge://flags/ to allow http://<Node_IP>:<Webapp_NodePort_7006> as a secure endpoint.
For production deployment, use NGC resources for providing bot configurations to the various ACE Agent Microservices. This allows you to keep track of versions without depending on the host file system. We will update the chat-engine Microservice; you can update the NLP, Plugin, and Chat Controller Microservices similarly.
Remove files from the chat-engine component in test-app/app.yaml.
- name: chat-engine
  type: ucf.svc.ace-agent.chat-engine
  parameters:
    imagePullSecrets:
    - name: ngc-docker-reg-secret
  secrets:
    ngc-api-key-secret: k8sSecret/ngc-api-key-secret/NGC_CLI_API_KEY
    openai-key-secret: k8sSecret/openai-key-secret/OPENAI_API_KEY
Add the chat-engine parameters in test-app/params.yaml. If files are provided via a component in test-app/app.yaml, the configNgcPath parameter will not be used. If you have multiple bots in NGC and want to deploy a single bot, use botConfigName to point to the required bot_config.yaml.
chat-engine:
  configNgcPath: <NGC_RESOURCE_PATH>
  botConfigName: bot_config.yaml
Rebuild the Helm Chart and redeploy.
Building a LangChain Bot UCS Application#
In this tutorial, we will showcase how you can create the UCS application for the sample LangChain-based bot present at ./samples/ddg_langchain_bot. This tutorial can also be used for creating UCS applications for other sample bots, such as ./samples/rag_bot, or for your own custom bot.
The DuckDuckGo LangChain sample bot uses OpenAI models, so we will need an OPENAI_API_KEY. We already have a UCS application for the DuckDuckGo LangChain bot at ./deploy/ucs_apps/speech_bot/ddg_langchain_bot/ that you can use as a reference during this tutorial.
Create the boilerplate UCS application specs.
ucf_app_builder_cli app create langchain-app
Update langchain-app/app.yaml as follows:
The LangChain Agent will be deployed as part of the Plugin server microservice. We will need a Chat Controller and Riva Speech Skills for speech support. We can optionally include the Web App microservice for a UI to interact with the bot using text or speech. The dependencies we need are:
dependencies:
- ucf.svc.riva.speech-skills:2.17.0
- ucf.svc.ace-agent.plugin-server:4.1.0
- ucf.svc.ace-agent.chat-controller:4.1.0
- ucf.svc.ace-agent.web-app:4.1.0
Update the components section for the microservices in the dependencies section.
- name: riva-speech
  type: ucf.svc.riva.speech-skills
  parameters:
    imagePullSecrets:
    - name: ngc-docker-reg-secret
- name: plugin-server
  type: ucf.svc.ace-agent.plugin-server
  parameters:
    imagePullSecrets:
    - name: ngc-docker-reg-secret
- name: chat-controller
  type: ucf.svc.ace-agent.chat-controller
  parameters:
    imagePullSecrets:
    - name: ngc-docker-reg-secret
- name: webapp
  type: ucf.svc.ace-agent.web-app
  parameters:
    imagePullSecrets:
    - name: ngc-docker-reg-secret
Update the connections between the microservices.
connections:
  chat-controller/riva: riva-speech/riva-speech-api
  chat-controller/chat-api: plugin-server/http-api
  webapp/chat-controller: chat-controller/grpc-api
Configure the OpenAI API key. To create the Kubernetes secret, run:
export OPENAI_API_KEY=...
kubectl create secret generic openai-key-secret --from-literal=OPENAI_API_KEY=${OPENAI_API_KEY}
Set the NGC CLI API key for downloading NGC resources. We have created ngc-api-key-secret in the Prerequisites section. Remove the vault section and update the secrets with ngc-api-key-secret and openai-key-secret.
secrets:
  k8sSecret/ngc-api-key-secret/NGC_CLI_API_KEY:
    k8sSecret:
      secretName: ngc-api-key-secret
      key: NGC_CLI_API_KEY
  k8sSecret/openai-key-secret/OPENAI_API_KEY:
    k8sSecret:
      secretName: openai-key-secret
      key: OPENAI_API_KEY
Configure ngc-api-key-secret and openai-key-secret in the plugin-server, chat-controller, and riva-speech sections as needed.
- name: riva-speech
  type: ucf.svc.riva.speech-skills
  parameters:
    imagePullSecrets:
    - name: ngc-docker-reg-secret
  secrets:
    ngc-api-key-secret: k8sSecret/ngc-api-key-secret/NGC_CLI_API_KEY
- name: plugin-server
  type: ucf.svc.ace-agent.plugin-server
  parameters:
    imagePullSecrets:
    - name: ngc-docker-reg-secret
  secrets:
    ngc-api-key-secret: k8sSecret/ngc-api-key-secret/NGC_CLI_API_KEY
    openai-key-secret: k8sSecret/openai-key-secret/OPENAI_API_KEY
- name: chat-controller
  type: ucf.svc.ace-agent.chat-controller
  parameters:
    imagePullSecrets:
    - name: ngc-docker-reg-secret
  secrets:
    ngc-api-key-secret: k8sSecret/ngc-api-key-secret/NGC_CLI_API_KEY
Mount the bot to the Plugin server and Chat Controller microservices by providing the local bot directory under the plugin-server and chat-controller components in the application specs.
- name: plugin-server
  type: ucf.svc.ace-agent.plugin-server
  parameters:
    imagePullSecrets:
    - name: ngc-docker-reg-secret
  secrets:
    ngc-api-key-secret: k8sSecret/ngc-api-key-secret/NGC_CLI_API_KEY
    openai-key-secret: k8sSecret/openai-key-secret/OPENAI_API_KEY
  files:
    config_dir: ../samples/ddg_langchain_bot
- name: chat-controller
  type: ucf.svc.ace-agent.chat-controller
  parameters:
    imagePullSecrets:
    - name: ngc-docker-reg-secret
  secrets:
    ngc-api-key-secret: k8sSecret/ngc-api-key-secret/NGC_CLI_API_KEY
  files:
    config_dir: ../samples/ddg_langchain_bot
Specify the parameters for the microservices. You can update the parameters section in each component or create langchain-app/params.yaml. Update langchain-app/params.yaml as follows:
Update the Riva Speech parameters with the ASR and TTS models in langchain-app/params.yaml.
riva-speech:
  riva:
    visibleGpus: "0"
  modelRepoGenerator:
    ngcModelConfigs:
      triton0:
        models:
        #> description: List of NGC models for deployment
        - nvidia/ace/rmir_asr_parakeet_1-1b_en_us_str_vad:2.17.0 #english
        - nvidia/riva/rmir_tts_radtts_hifigan_en_us_ipa:2.17.0
Update the Chat Controller parameters in langchain-app/params.yaml. We will be using the speech_lite pipeline. The Plugin server will expose the APIs with the langchain endpoint prefix.
chat-controller:
  pipeline: speech_lite
  speechConfigPath: "speech_config.yaml"
  chatEndpointPrefix: "langchain"
Update the Plugin server parameters.
plugin-server:
  pluginConfigPath: "plugin_config.yaml"
Update the WebUI microservice parameters.
webapp:
  chatInterface: "grpc"
  speechFlags: "--speech"
The LangChain plugin requires additional packages to be installed in the Plugin server container.
Build a custom Dockerfile by copying the requirements from ddg_langchain_bot/plugins/requirements_dev.txt into deploy/docker/dockerfiles/plugin_server.Dockerfile.
##############################
# Install custom dependencies
##############################
RUN pip3 install \
    langchain==0.1.1 \
    langchain-community==0.0.13 \
    langchain-core==0.1.12 \
    duckduckgo-search==5.3.1b1
Note
If you see a crash in the Plugin server or an issue with fetching a response from DuckDuckGo, try using a more recent duckduckgo-search version.
Build the container and push it to the NGC Docker registry.
# Set required environment variables for docker-compose.yaml
source deploy/docker/docker_init.sh
# Build custom plugin server docker image
docker compose -f deploy/docker/docker-compose.yml build plugin-server
# Retag docker image and push to NGC docker registry
docker tag docker.io/library/plugin-server:4.1.0 <CUSTOM_DOCKER_IMAGE_PATH>:<VERSION>
docker push <CUSTOM_DOCKER_IMAGE_PATH>:<VERSION>
If you want to use a different Docker registry, update imagePullSecrets in langchain-app/app.yaml.
Override the Plugin server image using langchain-app/params.yaml.
plugin-server:
  pluginConfigPath: "plugin_config.yaml"
  applicationSpecs:
    deployment:
      containers:
        container:
          image:
            repository: <CUSTOM_DOCKER_IMAGE_PATH>
            tag: <VERSION>
Generate the updated Helm Chart and deploy the application. You should be able to interact with the updated bot using the Speech Web App at http://<NodeIP>:<NodePort_7006>/.
# Build helm chart using UCS tools
ucf_app_builder_cli app build langchain-app/app.yaml langchain-app/params.yaml
# Uninstall previous deployment
helm uninstall test
kubectl delete pvc --all
# Deploy generated helm chart
helm install test langchain-app/langchain-app-0.0.1/
To access the microphone in the browser, you need an HTTPS endpoint. You can add SSL validation and convert the endpoint to HTTPS, or update chrome://flags/ or edge://flags/ to allow http://<Node_IP>:<Webapp_NodePort_7006> as a secure endpoint.
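The image override in langchain-app/params.yaml uses placeholders such as <CUSTOM_DOCKER_IMAGE_PATH> and <VERSION>, which are easy to leave behind before building the chart. A quick check along the following lines can catch that; this is a sketch, the function name check_placeholders is hypothetical, and it simply treats any upper-case token in angle brackets as unresolved.

```shell
# Hypothetical pre-build check: report any line that still contains an
# angle-bracket placeholder such as <CUSTOM_DOCKER_IMAGE_PATH> or <VERSION>.
check_placeholders() {
    file="$1"
    [ -f "$file" ] || { echo "error: $file not found" >&2; return 2; }
    # grep -n prints the offending lines with their line numbers.
    if grep -n '<[A-Z][A-Z0-9_]*>' "$file"; then
        echo "error: unresolved placeholders in $file" >&2
        return 1
    fi
}
```

Running check_placeholders langchain-app/params.yaml before ucf_app_builder_cli app build returns a non-zero status if any placeholder remains.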
Creating Kubernetes Secrets#
ACE Agent Microservices utilize the following Kubernetes secrets for setting various keys.
Image Pull Secret [Mandatory]: Pulls Docker images for deployment. ACE Agent containers utilize the NGC Docker registry. For setting up an image pull secret, run:
export NGC_CLI_API_KEY=...
kubectl create secret docker-registry ngc-docker-reg-secret --docker-server=nvcr.io --docker-username='$oauthtoken' --docker-password="${NGC_CLI_API_KEY}"
Where NGC_CLI_API_KEY is the NGC Personal API key.
NGC_CLI_API_KEY Secret [Mandatory]: Pulls models and resources from NGC. This secret is mandatory for the Chat Engine, Chat Controller, Plugin, and NLP Server microservices. It creates a file with the secret value at /secrets/ngc_api_key.txt in pods. It can also be used to populate the NGC_CLI_API_KEY environment variable.
export NGC_CLI_API_KEY=...
kubectl create secret generic ngc-api-key-secret --from-literal=NGC_CLI_API_KEY="${NGC_CLI_API_KEY}"
Where NGC_CLI_API_KEY is the NGC Personal API key.
OPENAI_API_KEY Secret [Optional]: Sets the OPENAI_API_KEY environment variable in the Chat Engine, Plugin, and NLP Server Microservices. The secret creates a file with the secret value at /secrets/openai_api_key.txt in pods.
export OPENAI_API_KEY="sk-XXX"
kubectl create secret generic openai-key-secret --from-literal=OPENAI_API_KEY=${OPENAI_API_KEY}
NVIDIA_API_KEY Secret [Optional]: Sets the NVIDIA_API_KEY environment variable, commonly used for accessing https://build.nvidia.com/ models, in the Plugin and Chat Engine Microservices. The secret creates a file with the secret value at /secrets/nvidia_api_key.txt in pods.
export NVIDIA_API_KEY="XXX"
kubectl create secret generic nvidia-api-key-secret --from-literal=NVIDIA_API_KEY=${NVIDIA_API_KEY}
Custom ENV Secret [Optional]: Passes arbitrary key-value pairs that are exported as environment variables. It is supported by the Chat Engine, Chat Controller, Plugin, and NLP Server Microservices. This secret creates a /secrets/.env file, which is sourced before the services start in order to set the environment variables.
cat <<EOF | tee custom-env.txt
KEY1=VALUE1
KEY2=VALUE2
EOF
kubectl create secret generic custom-env-secrets --from-file=ENV=custom-env.txt
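Because the custom env file is sourced inside the pods, a malformed line can break startup in ways that are hard to trace back. Checking the file before creating the secret may help. The validator below is a sketch under the assumption that every entry should be a plain KEY=VALUE line; the function name validate_env_file is hypothetical.

```shell
# Hypothetical validator for the custom env file: every non-empty,
# non-comment line must look like KEY=VALUE with a valid variable name.
validate_env_file() {
    file="$1"
    [ -f "$file" ] || { echo "error: $file not found" >&2; return 2; }
    # Select lines that are neither blank, nor comments, nor KEY=VALUE.
    bad=$(grep -n -v -E '^[[:space:]]*(#.*)?$|^[A-Za-z_][A-Za-z0-9_]*=' "$file" || true)
    if [ -n "$bad" ]; then
        echo "error: malformed lines in $file:" >&2
        echo "$bad" >&2
        return 1
    fi
}
```

For example, run validate_env_file custom-env.txt and create custom-env-secrets only if it succeeds. The check covers key names only; values that contain spaces or shell metacharacters may still need quoting.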
Bot and UCS Applications Customization#
This section provides guidance on how to modify sample UCS applications and sample workflows such as Tokkio for your custom use cases. ACE Agent documentation uses Docker flow for development and bot customizations, and recommends Kubernetes deployment for production use cases. For some customization or debugging, it becomes important to work with both Docker and Kubernetes deployment interchangeably.
Local Development with Kubernetes Environment#
If you are deploying ACE Agent as part of a more complex Kubernetes environment, for example, to create an interactive avatar experience with multiple ACE microservices, you will often want to test the bot you are developing directly in the setup environment. Testing the bot as part of the Kubernetes deployment will allow you to check the multimodal aspects of the interaction like the animations of the avatar, the voice, timings of response and more.
To simplify this workflow and to allow for fast iteration times, it is possible to run the Chat Engine in your local Python environment and connect it to your running Kubernetes deployment. In this case, you can test changes to the Colang scripts or bot configurations simply by restarting the Chat Engine locally.
In this section, we will showcase steps for using the local Chat Engine in the event interface, but similar steps should work for the Chat Engine server interface also.
Replace the Chat Engine microservice in your UCS application specification with the local Chat Engine deployment by updating the connection for the Chat Controller egress endpoint chat-api.
dependencies:
[...]
- ucf.svc.riva.speech-skills:2.17.0
# Remove chat-engine from the dependency list
<Remove>- ucf.svc.ace-agent.chat-engine:4.1.0</Remove>
- ucf.svc.ace-agent.chat-controller:4.1.0
[...]
components:
[...]
# Remove the chat-engine from the list of components
<Remove>- name: chat-engine
  type: ucf.svc.ace-agent.chat-engine
  parameters:
    imagePullSecrets:
    - name: ngc-docker-reg-secret
  secrets:
    ngc-api-key-secret: k8sSecret/ngc-api-key-secret/NGC_CLI_API_KEY
    openai-key-secret: k8sSecret/openai-key-secret/OPENAI_API_KEY</Remove>
[...]
# Add the local chat engine as external endpoint to the component list
- name: engine-placeholder
  type: ucf.svc.external-endpoint
  parameters:
    # Update Local Chat Engine deployment IP Port
    service: 127.0.0.1
    port: 8000
[...]
connections:
  [...]
  # Remove outgoing connections from chat-engine (since we removed it above)
  <Remove>chat-engine/plugin-server: plugin-server/http-api
  chat-engine/redis: redis-timeseries/redis</Remove>
  # Point the chat-api egress connection to the engine-placeholder
  chat-controller/chat-api: engine-placeholder/endpoint
Redeploy your UCS application by following the Kubernetes environment section.
In a separate terminal, use port forwarding or expose a NodePort to connect to the Redis time series microservice inside the cluster.
kubectl port-forward $(kubectl get pods | grep redis-timeseries | awk '{print $1}') 30379:6379
Start ACE Agent locally in your Python environment (refer to Python Environment for more information).
aceagent chat event -c bots/your_bot/ --event-provider-port 30379 --log-level=INFO
Edit the Colang Script and any bot configurations, save any changes to disk and simply restart the ACE Agent (CTRL+C and repeat the previous step) to test your changes.
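The port-forward command above passes every pod name that matches redis-timeseries to kubectl, which can fail when more than one pod matches or none does. A slightly more defensive variant is sketched below; the function names are hypothetical, and the port numbers 30379 and 6379 are the ones used in the step above.

```shell
# Hypothetical helper: print the name of the first pod whose name contains
# "redis-timeseries". Fails if no such pod exists.
redis_pod() {
    pod=$(kubectl get pods -o name | grep redis-timeseries | head -n 1 || true)
    [ -n "$pod" ] || { echo "error: no redis-timeseries pod found" >&2; return 1; }
    echo "$pod"
}

# Forward local port 30379 to the Redis port 6379 inside the cluster.
forward_redis() {
    pod=$(redis_pod) || return 1
    kubectl port-forward "$pod" 30379:6379
}
```

Calling forward_redis in a separate terminal keeps the tunnel open while the local Chat Engine runs with --event-provider-port 30379.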
Using Local Bot Configurations with UCS Applications#
The Tokkio workflow utilizes the bot configurations as NGC resources. Instead of using NGC resources to set the customizations, you can perform the following steps to customize your bot configurations using a local directory.
Download NGC bot resources locally.
ngc registry resource download-version <NGC_BOT_RESOURCE_PATH>
Update the UCS app.yaml file to use the downloaded NGC resource local directory. The local directory will be mounted as a configmap in the pods. Use the absolute directory path or a path relative to app.yaml.
- name: chat-engine
  type: ucf.svc.ace-agent.chat-engine
  parameters:
    imagePullSecrets:
    - name: ngc-docker-reg-secret
  secrets:
    ngc-api-key-secret: k8sSecret/ngc-api-key-secret/NGC_CLI_API_KEY
    openai-key-secret: k8sSecret/openai-key-secret/OPENAI_API_KEY
  files:
    config_dir: <LOCAL_BOT_DIRECTORY>
You can make similar updates for the Chat Controller, NLP server, and Plugin server microservices.
Update the UCS app-params.yaml file to remove references to the NGC bot configurations resource path from the chat-engine, chat-controller, nlp-server, and plugin-server components.
chat-engine:
  <REMOVE>configNgcPath: <NGC_RESOURCE_PATH><REMOVE>
  botConfigName: bot_config.yaml
You can make modifications in the local bot configurations directory, generate the updated Helm Chart, and deploy the application using the Helm upgrade.
# Build helm chart using UCS tools
ucf_app_builder_cli app build app.yaml params.yaml
# Upgrade deployment to utilize latest bot configurations
helm upgrade <DEPLOYMENT_NAME> <NEW_HELM_CHART>
Note
You might get warnings for binary files during Helm chart generation. Remove any binary files from the bot configurations, because the directory is mounted as a configmap in the Kubernetes deployment.
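To find the binary files mentioned in the note before building the chart, a recursive grep can list them. This is a sketch that assumes GNU grep; the function name list_binary_files is hypothetical, and empty files are reported as well because they contain no matching text.

```shell
# Hypothetical helper: list files under a bot configuration directory that
# grep treats as binary (or that are empty). Assumes GNU grep.
list_binary_files() {
    dir="$1"
    # -I makes binary files count as non-matching, -L lists files with no
    # match, so the output is the set of binary (and empty) files.
    grep -rIL . "$dir" || true
}
```

For example, list_binary_files <LOCAL_BOT_DIRECTORY> prints one path per line; review the list before removing anything.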
Using 3rd Party Text-to-Speech (TTS) Solutions#
The ACE Agent pipeline supports Riva TTS as the default option.
For speech bots, you might want to customize the voice used for speech responses. You can train your own TTS model, clone a TTS voice, or use a 3rd party provider. The tutorial section shows how to integrate the ElevenLabs text-to-speech APIs using the Docker flow; this section adds the steps needed for the Kubernetes deployment.
Build the NLP server Docker image with the required ElevenLabs dependencies.
Add the required dependencies in the NLP server Dockerfile present at deploy/docker/dockerfiles/nlp_server.Dockerfile.
##############################
# Install custom dependencies
##############################
RUN apt-get update && apt-get -y install ffmpeg
RUN pip3 install pydub elevenlabs==1.4.1
Build the container and push it to the NGC Docker registry.
    # Set required environment variables for docker-compose.yml
    source deploy/docker/docker_init.sh
    # Build the custom NLP server Docker image
    docker compose -f deploy/docker/docker-compose.yml build nlp-server
    # Retag the Docker image and push it to the NGC Docker registry
    docker tag docker.io/library/nlp-server:4.1.0 <CUSTOM_DOCKER_IMAGE_PATH>:<VERSION>
    docker push <CUSTOM_DOCKER_IMAGE_PATH>:<VERSION>
If you want to use a different Docker registry, update imagePullSecrets in the app.yaml file under the nlp-server component with your own image pull Kubernetes secret.
Update the bot configurations with the ElevenLabs specific client and configurations by following the steps in the tutorial section. If you are using the NGC bot resource path, switch to the local bot configurations for quicker iteration. You can verify your changes using the Docker flow.
Add the NLP server microservice to the UCS application specs.
Add the following entry to the dependencies section in the app.yaml file.
    - ucf.svc.ace-agent.nlp-server:4.1.0
Add the following entry to the components section in the app.yaml file.
    - name: nlp-server
      type: ucf.svc.ace-agent.nlp-server
      parameters:
        imagePullSecrets:
        - name: ngc-docker-reg-secret
      secrets:
        ngc-api-key-secret: k8sSecret/ngc-api-key-secret/NGC_CLI_API_KEY
      files:
        config_dir: <LOCAL_BOT_CONFIG_DIR>
Update the NLP server parameters in the app-params.yaml file and set the NLP server Docker image created in step 1.
    nlp-server:
      ucfVisibleGpus: [0]
      modelConfigPath: "model_config.yaml"
      customModelDir: "elabs.py" # custom elevenlabs client path
      applicationSpecs:
        deployment:
          containers:
            nlp-api:
              image: # UPDATE NGC image pushed in step1
                repository: <CUSTOM_DOCKER_IMAGE_PATH>
                tag: <VERSION>
Pass ELEVENLABS_API_KEY to the NLP server microservice.
Create a custom Kubernetes secret for passing ELEVENLABS_API_KEY.
    cat <<EOF | tee custom-env.txt
    ELEVENLABS_API_KEY=<API_KEY_VALUE>
    EOF
    kubectl create secret generic custom-env-secrets --from-file=ENV=custom-env.txt
Add custom-env-secrets in the secrets section of the app.yaml file.
    secrets:
      k8sSecret/ngc-api-key-secret/NGC_CLI_API_KEY:
        k8sSecret:
          secretName: ngc-api-key-secret
          key: NGC_CLI_API_KEY
      k8sSecret/custom-env-secrets:
        k8sSecret:
          secretName: custom-env-secrets
          key: ENV
Add custom-env-secrets in the nlp-server component.
    - name: nlp-server
      type: ucf.svc.ace-agent.nlp-server
      parameters:
        imagePullSecrets:
        - name: ngc-docker-reg-secret
      secrets:
        ngc-api-key-secret: k8sSecret/ngc-api-key-secret/NGC_CLI_API_KEY
        custom-env-secrets: k8sSecret/custom-env-secrets
      files:
        config_dir: <LOCAL_BOT_CONFIG_DIR>
Update the parameters in the UCS app-params.yaml file to utilize the ElevenLabs TTS client.
In the chat-controller component, update the TTS section to utilize the ElevenLabs client.
    riva_tts:
      RivaTTS:
        tts_mode: "http"
        voice_name: "Brian"
        language: "en-US"
        server: http://ace-agent-nlp-server-deployment-service:9003/speech/text_to_speech
        arpabet_dict: ""
        sample_rate: 16000
        model_name: "eleven_turbo_v2_5"
        chunk_duration_ms: 600 #amount of data to be sent to downstream in realtime
        audio_start_threshold_ms: 800 #2000 #duration for which audio data will be sent in burst and rest of the data will be sent in realtime
        send_audio_in_realtime: true #this will send synthesized audio data in realtime to downstream
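To get a feel for the timing parameters, it helps to convert the millisecond values into sample and byte counts. The sketch below is only arithmetic on the values shown above. It assumes 16-bit mono PCM audio (2 bytes per sample), which this section does not state, and the function name chunk_size is hypothetical.

```python
def chunk_size(sample_rate_hz: int, duration_ms: int, bytes_per_sample: int = 2):
    """Return (samples, bytes) covered by duration_ms of audio.

    bytes_per_sample=2 assumes 16-bit mono PCM; adjust it if your
    TTS output uses a different format.
    """
    samples = sample_rate_hz * duration_ms // 1000
    return samples, samples * bytes_per_sample


# chunk_duration_ms: 600 at sample_rate: 16000
chunk_samples, chunk_bytes = chunk_size(16000, 600)
print(chunk_samples, chunk_bytes)  # 9600 19200

# audio_start_threshold_ms: 800 at sample_rate: 16000
burst_samples, burst_bytes = chunk_size(16000, 800)
print(burst_samples, burst_bytes)  # 12800 25600
```

Under that assumption, each chunk carries 9600 samples, and the first 800 ms of audio (12800 samples) is sent as a burst before real-time pacing begins.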
Comment out the following TTS model in the riva-speech component.
    nvidia/riva/rmir_tts_fastpitch_hifigan_en_us_ipa:2.17.0
You can make modifications in the local bot configurations directory, generate the updated Helm chart, and redeploy the application by uninstalling the previous deployment and installing the new chart.
    # Build the Helm chart using UCS tools
    ucf_app_builder_cli app build app.yaml app-params.yaml
    # Uninstall the previous deployment
    helm uninstall <DEPLOYMENT_NAME>
    kubectl delete pvc --all
    # Deploy the generated Helm chart
    helm install <DEPLOYMENT_NAME> <HELM_CHART_PATH>
Deploying Multilingual Bots with Kubernetes#
NVIDIA ACE Agent supports deploying bots in different languages, with a few limitations.
For ASR (Automatic Speech Recognition), only Riva Speech compatible models and languages are supported.
For TTS (Text-to-Speech), you can use external services in addition to Riva TTS for your choice of language.
A single deployment cannot support multiple languages. You need to pick one language for ASR, and you can use the same or a different language for TTS.
This section shows the changes needed in the UCS application to deploy the sample Spanish bots included in the ACE Agent release. You can apply similar changes for other languages.
Update the ASR and TTS models for your target language in the app-params.yaml file under the riva-speech component.
    riva-speech:
      riva:
        visibleGpus: "0"
      modelRepoGenerator:
        ngcModelConfigs:
          triton0:
            models:
            #> description: List of NGC models for deployment
            - nvidia/ace/rmir_asr_parakeet_1-1b_en_us_str_vad:2.17.0 #english
            - nvidia/riva/rmir_tts_fastpitch_hifigan_en_us_ipa:2.17.0
If you use Riva’s Neural Machine Translation model to translate at either the query level or the response level, you also need to deploy the translation models in the riva-speech component.
Update language for ASR, and voice_name and language for TTS, in the chat-controller component in the app-params.yaml file. These updates can also be made in the speech_config.yaml file. If a setting is updated in both places, the value in the app-params.yaml file takes precedence.
    chat-controller:
      pipelineParams:
        riva_asr:
          RivaASR:
            server: "0.0.0.0:50051"
            language: "en-US"
            word_boost_file_path: "/workspace/config/asr_words_to_boost.txt"
            enable_profanity_filter: false
            endpointing_stop_history: 800
            endpointing_stop_history_eou: 240
        riva_tts:
          RivaTTS:
            server: "0.0.0.0:50051"
            voice_name: "English-US.Female-1"
            language: "en-US"
            sample_rate: 44100
            chunk_duration_ms: 100
            audio_start_threshold_ms: 400
            send_audio_in_realtime: true
            tts_mode: "grpc"
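The rule that app-params.yaml wins over speech_config.yaml can be pictured as a nested dictionary merge. The sketch below only illustrates that idea. The actual merge is performed by the ACE Agent tooling, and key-by-key nested merging is an assumption here. The function name merge_with_precedence, the "es-US" language code, and the voice placeholder are all illustrative.

```python
def merge_with_precedence(base: dict, override: dict) -> dict:
    """Return base updated with override, merging nested dicts key by key.

    Illustrative only: values from override win whenever both define a key.
    """
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_with_precedence(merged[key], value)
        else:
            merged[key] = value
    return merged


# Values as they might appear in speech_config.yaml (lower precedence).
speech_config = {
    "riva_tts": {"RivaTTS": {"language": "en-US", "sample_rate": 44100}}
}
# Values as they might appear in app-params.yaml (higher precedence).
app_params = {
    "riva_tts": {"RivaTTS": {"language": "es-US", "voice_name": "<VOICE_NAME>"}}
}

effective = merge_with_precedence(speech_config, app_params)
print(effective)
```

In this picture, language comes from app-params.yaml, while sample_rate, which only speech_config.yaml sets, is carried over unchanged.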
You can develop multilingual bots using one of the following implementations:
Use a multilingual LLM or a language-specific LLM - Refer to the Spanish LLM sample bot as an example. It uses Colang syntax (en-US) and a multilingual LLM to generate Spanish responses for user queries, which are also in Spanish.
Use Riva’s Neural Machine Translation model - In this implementation, the user query is translated into en-US, the response is generated in en-US, and the response is then translated to the target language. Refer to the Spanish NMT bot as an example.
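The translation-based flow can be sketched as three steps around an English-only bot. This is a conceptual sketch, not the ACE Agent or Riva API: translate and generate_response are hypothetical callables that you would back with your own NMT client and bot logic, and the stubs below exist only so the example runs.

```python
from typing import Callable


def respond_via_translation(
    query: str,
    target_language: str,
    translate: Callable[[str, str, str], str],
    generate_response: Callable[[str], str],
    pivot_language: str = "en-US",
) -> str:
    """Translate the query to the pivot language, answer, translate back."""
    pivot_query = translate(query, target_language, pivot_language)
    pivot_response = generate_response(pivot_query)
    return translate(pivot_response, pivot_language, target_language)


# Stub translator backed by a lookup table (stands in for a real NMT call).
_FAKE_TRANSLATIONS = {
    ("hola", "es-US", "en-US"): "hello",
    ("hello, how can I help?", "en-US", "es-US"): "hola, ¿cómo puedo ayudar?",
}


def fake_translate(text: str, source: str, target: str) -> str:
    return _FAKE_TRANSLATIONS[(text, source, target)]


def fake_generate(query: str) -> str:
    # Stands in for the English-only bot.
    return "hello, how can I help?" if query == "hello" else "sorry?"


reply = respond_via_translation("hola", "es-US", fake_translate, fake_generate)
print(reply)
```

Passing the translator and the bot as parameters keeps the flow independent of any particular NMT client, which also makes it easy to test with stubs like the ones above.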
Update the bot configurations either using the local directory or NGC resource path in the UCS applications with the required changes for your target use case. For more information, refer to the Using Local Bot configurations with UCS Applications section.
Rebuild the Helm chart using the UCS tools.
    # Build the Helm chart using UCS tools
    ucf_app_builder_cli app build app.yaml app-params.yaml