Kubernetes Environment

ACE Agent supports Kubernetes deployment using NVIDIA Unified Cloud Services (UCS) Tools, a low-code framework for developing cloud-native, real-time, and multimodal AI applications. The NVIDIA ACE Agent release includes the following UCS microservices:

UCF Microservices

Microservice Name                    Version    Description
ucf.svc.ace-agent.chat-engine        4.1.0      Chat Engine Microservice
ucf.svc.ace-agent.nlp-server         4.1.0      NLP Server Microservice
ucf.svc.ace-agent.plugin-server      4.1.0      Plugin Server Microservice
ucf.svc.ace-agent.chat-controller    4.1.0      Chat Controller Microservice for Speech Support
ucf.svc.ace-agent.web-app            4.1.0      Sample WebUI Application

You can easily create your own custom application Helm chart using ACE Agent microservices with UCS applications. The ACE Agent Quick Start package comes with a number of UCS applications for sample bots which can be found in the ./deploy/ucs_apps/ directory.

Prerequisites

Before you start using NVIDIA ACE Agent, it’s assumed that you meet the following prerequisites. The current version of ACE Agent is only supported on NVIDIA data centers.

  1. You have access to NVIDIA GPU Cloud (NGC) and are logged in. You have installed the NGC CLI tool on the local system and have logged in to the NGC container registry. For more details about NGC, refer to the NGC documentation.

  2. You have installed UCS tools along with prerequisite setups such as Helm, Kubernetes, GPU Operator, and so on. Refer to the UCS tools developer system and deployment system prerequisite sections for detailed instructions. The latest UCS 2.5 tools require Ubuntu 22.04. Alternatively, you can build a Docker image for executing UCS commands using the Ubuntu 22.04 base image.

  3. You have access to an NVIDIA Volta, NVIDIA Turing, NVIDIA Ampere, NVIDIA Ada Lovelace, or an NVIDIA Hopper Architecture-based GPU.

Setup

  1. Install the Local Path Provisioner by running the following command if not already done:

    curl https://raw.githubusercontent.com/rancher/local-path-provisioner/v0.0.23/deploy/local-path-storage.yaml | sed 's/^  name: local-path$/  name: mdx-local-path/g' | kubectl apply -f -
    

    Note

    If you skip deploying the Local Path Provisioner, you will see errors like the following during deployment, and some of the pods will be stuck in the Pending state.

    Warning  FailedMount  4m10s (x2 over 4m11s)  kubelet  MountVolume.SetUp failed for volume "workload-cm-volume" : failed to sync configmap cache: timed out waiting for the condition
    Warning  FailedMount  4m10s (x2 over 4m11s)  kubelet  MountVolume.SetUp failed for volume "scripts-cm-volume" : failed to sync configmap cache: timed out waiting for the condition
    Warning  FailedMount  4m10s (x2 over 4m11s)  kubelet  MountVolume.SetUp failed for volume "configs-volume" : failed to sync configmap cache: timed out waiting for the condition
    
  2. Set up the mandatory Kubernetes secrets required for deployment.

    export NGC_CLI_API_KEY=...
    
    kubectl create secret docker-registry ngc-docker-reg-secret --docker-server=nvcr.io --docker-username='$oauthtoken' --docker-password="${NGC_CLI_API_KEY}"
    
    kubectl create secret generic ngc-api-key-secret --from-literal=NGC_CLI_API_KEY="${NGC_CLI_API_KEY}"
    
  3. Download NVIDIA ACE Agent Quick Start Scripts by cloning the GitHub ACE repository.

    git clone git@github.com:NVIDIA/ACE.git
    cd ACE
    
  4. Go to the ACE Agent microservices directory.

    cd microservices/ace_agent
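
The setup steps above can be sanity-checked before deploying. The following is a minimal sketch, not part of the Quick Start package: check_setup is a hypothetical helper name, and it assumes that the sed expression in step 1 renamed the StorageClass to mdx-local-path and that the secrets from step 2 were created in the current namespace.

```shell
# Hypothetical helper (not part of the Quick Start package): reports which
# objects from the Setup steps are missing in the current namespace.
check_setup() {
  local missing=0
  # StorageClass created in step 1 (assumed to be renamed to mdx-local-path).
  if ! kubectl get storageclass mdx-local-path >/dev/null 2>&1; then
    echo "missing: storageclass/mdx-local-path"
    missing=1
  fi
  # Secrets created in step 2.
  for secret in ngc-docker-reg-secret ngc-api-key-secret; do
    if ! kubectl get secret "$secret" >/dev/null 2>&1; then
      echo "missing: secret/$secret"
      missing=1
    fi
  done
  return "$missing"
}
```

Calling check_setup prints nothing and returns 0 when everything is present; otherwise it lists the missing objects and returns 1.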
    

Deploying a Bot via UCS Application

We will use the stock bot application specs as an example, which are present in the Quick Start package at ./deploy/ucs_apps/speech_bot/stock_bot/.

The Stock bot uses the gpt-4-turbo model from OpenAI as the main model.

  1. Create a Kubernetes secret for the OPENAI_API_KEY.

    export OPENAI_API_KEY="sk-XXX"
    kubectl create secret generic openai-key-secret --from-literal=OPENAI_API_KEY="${OPENAI_API_KEY}"
    
  2. Generate the Helm Chart using UCS tools.

    ucf_app_builder_cli app build deploy/ucs_apps/speech_bot/stock_bot/app.yaml deploy/ucs_apps/speech_bot/stock_bot/app-params.yaml
    
  3. Deploy the Helm Chart.

    helm install ace-agent deploy/ucs_apps/speech_bot/stock_bot/ucf-app-speech-bot-4.1.0
    
  4. Wait for all pods to be ready.

    watch kubectl get pods
    
  5. Try out the deployed bot using a web frontend application. Get the nodeport for ace-agent-webapp-deployment-service using kubectl get svc and interact with the bot using the URL http://<workstation IP>:<NodePort_for_7006>.

To access the microphone in the browser, either convert the HTTP endpoint to HTTPS by adding SSL validation, or update chrome://flags/ or edge://flags/ to allow http://<workstation IP>:<NodePort_for_7006> as a secure endpoint.

  6. Stop the deployment and remove the persistent volumes.

    helm uninstall ace-agent
    kubectl delete pvc --all
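
The NodePort lookup from step 5 can be scripted. This is a minimal sketch, assuming the web app service exposes service port 7006 (as the <NodePort_for_7006> placeholder suggests); webapp_node_port is a hypothetical helper name, and its default argument is the service name mentioned above.

```shell
# Hypothetical helper: prints the NodePort mapped to service port 7006.
# Optional argument: service name (defaults to the web app service above).
webapp_node_port() {
  kubectl get svc "${1:-ace-agent-webapp-deployment-service}" \
    -o jsonpath='{.spec.ports[?(@.port==7006)].nodePort}'
}
```

For example, echo "http://<workstation IP>:$(webapp_node_port)" prints the URL to open, with the workstation IP filled in by hand.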
    

Building a Custom Helm Chart using ACE Agent Microservices

In this tutorial, we will showcase how you can create a Helm Chart using ACE Agent Microservices for use cases such as text-based and speech-based bots, and extend it for your custom applications. ACE Agent provides Kubernetes support using Unified Cloud Services (UCS). You can use the tutorials for building UCS applications in the UCS documentation as a reference.

This tutorial uses:

  • the UCS CLI, although you should be able to execute the same steps via UCS Studio,

  • various customizations of the UCS application specs, and

  • various UCS microservices.

Building a Chatbot UCS Application

In this tutorial, we will use the stock sample bot present at ./samples/stock_bot as a reference bot for building the chatbot UCS application. We already have a UCS application for the stock chat bot at ./deploy/ucs_apps/chat_bot/stock_bot/ and you can use the same as reference during this tutorial.

  1. Create the boilerplate UCS application specs.

    ucf_app_builder_cli app create test-app
    
  2. Build the UCS application with the Chat Engine microservice only. Update test-app/app.yaml as follows:

    1. We only need the Chat Engine Microservice and Redis as the message broker. We can optionally include the Web App Microservice to provide a UI for interacting with the text-based bot. The dependencies we need are:

      dependencies:
      - ucf.svc.ace-agent.chat-engine:4.1.0
      - ucf.svc.core.redis-timeseries:0.0.20
      - ucf.svc.ace-agent.web-app:4.1.0
      
    1. Update the components section with the Chat Engine and Web App Microservices.

      components:
      - name: chat-engine
        type: ucf.svc.ace-agent.chat-engine
        parameters:
          imagePullSecrets:
          - name: ngc-docker-reg-secret
      - name: redis-timeseries
        type: ucf.svc.core.redis-timeseries
        parameters:
          imagePullSecrets:
          - name: ngc-docker-reg-secret
      - name: webapp
        type: ucf.svc.ace-agent.web-app
        parameters:
          imagePullSecrets:
          - name: ngc-docker-reg-secret
      
    2. Update the connections between the Chat Engine and the Web App Microservices.

      connections:
        chat-engine/redis: redis-timeseries/redis
        webapp/redis: redis-timeseries/redis
      
    3. The Stock bot uses the gpt-4-turbo model from OpenAI as the main model. Create a Kubernetes secret for the OPENAI_API_KEY.

      export OPENAI_API_KEY="sk-XXX"
      kubectl create secret generic openai-key-secret --from-literal=OPENAI_API_KEY=${OPENAI_API_KEY}
      
    4. Configure the NGC CLI API key for downloading NGC resources. We created ngc-api-key-secret in the Setup section. Remove the vault section and update the secrets section with ngc-api-key-secret and openai-key-secret.

      secrets:
        k8sSecret/ngc-api-key-secret/NGC_CLI_API_KEY:
          k8sSecret:
            secretName: ngc-api-key-secret
            key: NGC_CLI_API_KEY
        k8sSecret/openai-key-secret/OPENAI_API_KEY:
          k8sSecret:
            secretName: openai-key-secret
            key: OPENAI_API_KEY
      
    5. Configure ngc-api-key-secret and openai-key-secret in the chat-engine section.

      - name: chat-engine
        type: ucf.svc.ace-agent.chat-engine
        parameters:
          imagePullSecrets:
          - name: ngc-docker-reg-secret
        secrets:
          ngc-api-key-secret: k8sSecret/ngc-api-key-secret/NGC_CLI_API_KEY
          openai-key-secret: k8sSecret/openai-key-secret/OPENAI_API_KEY
      
    6. Mount the bot to the Chat Engine Microservice. We can do the same by providing the local directory under the chat-engine component in the application specs.

      - name: chat-engine
        type: ucf.svc.ace-agent.chat-engine
        parameters:
          imagePullSecrets:
          - name: ngc-docker-reg-secret
        secrets:
          ngc-api-key-secret: k8sSecret/ngc-api-key-secret/NGC_CLI_API_KEY
          openai-key-secret: k8sSecret/openai-key-secret/OPENAI_API_KEY
        files:
          config_dir: ../samples/stock_bot/
      
    7. Specify parameters for the WebUI and Chat Engine microservices. You can either update the parameters section in the chat-engine and webapp component or create a new file test-app/params.yaml and update it.

      chat-engine:
        botConfigName: stock_bot_config.yml
        interface: "event"
      webapp:
        chatInterface: "event"
        speechFlags: ""
      
  3. Generate the Helm Chart using UCS tools.

    ucf_app_builder_cli app build test-app/app.yaml test-app/params.yaml
    
  4. Deploy the generated Helm Chart and wait for all pods to be ready. Interact with the bot using the Web app at http://<NodeIP>:<NodePort_for_7006>/. You can get NodePort using kubectl get svc.

    helm install test test-app/test-app-0.0.1/
    
  5. As we haven’t deployed the Plugin server microservices, queries related to live stock price will not work. Add an ACE Agent Plugin server in the UCS application. Update test-app/app.yaml.

    1. Update the dependencies section with the Plugin Microservices.

      dependencies:
      - ucf.svc.ace-agent.chat-engine:4.1.0
      - ucf.svc.ace-agent.web-app:4.1.0
      - ucf.svc.core.redis-timeseries:0.0.20
      - ucf.svc.ace-agent.plugin-server:4.1.0
      
    1. Add the plugin-server in components.

      - name: plugin-server
        type: ucf.svc.ace-agent.plugin-server
        parameters:
          imagePullSecrets:
          - name: ngc-docker-reg-secret
        secrets:
          ngc-api-key-secret: k8sSecret/ngc-api-key-secret/NGC_CLI_API_KEY
      
    2. Mount the plugin configurations by providing the local directory under the plugin-server component.

      - name: plugin-server
        type: ucf.svc.ace-agent.plugin-server
        parameters:
          imagePullSecrets:
          - name: ngc-docker-reg-secret
        secrets:
          ngc-api-key-secret: k8sSecret/ngc-api-key-secret/NGC_CLI_API_KEY
        files:
          config_dir: ../samples/stock_bot/
      
    3. Add a connection between the Chat Engine and the Plugin microservices.

      connections:
        chat-engine/plugin-server: plugin-server/http-api
        chat-engine/redis: redis-timeseries/redis
        webapp/redis: redis-timeseries/redis
      
    4. Specify parameters for the Plugin microservices. You can either update the parameters section in the plugin-server component, or add them to test-app/params.yaml (create the file if it does not exist).

      plugin-server:
        pluginConfigPath: "plugin_config.yaml" # relative path of the plugin_config.yaml
      
  6. Generate the updated Helm Chart and deploy the application. You should be able to interact with the updated bot using the Web app at http://<NodeIP>:<NodePort>/.

    ucf_app_builder_cli app build test-app/app.yaml test-app/params.yaml
    
    # Uninstall existing deployment
    helm uninstall test
    kubectl delete pvc --all
    
    # Deploy updated helm chart
    helm install test test-app/test-app-0.0.1/
    

    Try all queries supported by the stock sample bot.

  7. The stock sample bot doesn’t use any NLP models, but if you add NLP models to your custom bots, you need to add an NLP server microservice.

    1. Add the NLP server microservice in the dependencies section.

      dependencies:
      - ucf.svc.ace-agent.chat-engine:4.1.0
      - ucf.svc.ace-agent.web-app:4.1.0
      - ucf.svc.core.redis-timeseries:0.0.20
      - ucf.svc.ace-agent.plugin-server:4.1.0
      - ucf.svc.ace-agent.nlp-server:4.1.0
      
    1. Update the components.

      - name: nlp-server
        type: ucf.svc.ace-agent.nlp-server
        parameters:
          imagePullSecrets:
          - name: ngc-docker-reg-secret
        secrets:
          ngc-api-key-secret: k8sSecret/ngc-api-key-secret/NGC_CLI_API_KEY
      
    2. Update the connections. The Chat Engine utilizes the NLP server.

      connections:
        chat-engine/plugin-server: plugin-server/http-api
        chat-engine/redis: redis-timeseries/redis
        webapp/redis: redis-timeseries/redis
        chat-engine/nlp-server: nlp-server/api-server
      
    3. We can mount the model configurations by providing the local directory under the nlp-server component.

      - name: nlp-server
        type: ucf.svc.ace-agent.nlp-server
        parameters:
          imagePullSecrets:
          - name: ngc-docker-reg-secret
        secrets:
          ngc-api-key-secret: k8sSecret/ngc-api-key-secret/NGC_CLI_API_KEY
        files:
          config_dir: ../samples/stock_bot/
      
    4. Update the NLP Server Microservice parameters and GPU device for test-app/params.yaml.

      nlp-server:
        ucfVisibleGpus: [0]
        modelConfigPath: "model_config.yaml"
      
  8. Generate the updated Helm Chart and deploy the application. You should be able to interact with the updated bot using the Web app at http://<NodeIP>:<NodePort>/.

    # Build updated helm chart
    ucf_app_builder_cli app build test-app/app.yaml test-app/params.yaml
    
    # Uninstall previous deployment
    helm uninstall test
    kubectl delete pvc --all
    
    # Install updated helm chart
    helm install test test-app/test-app-0.0.1/
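
The rebuild, uninstall, and install cycle is repeated after every change to the application specs, so it can be wrapped in a small shell function. This is a minimal sketch using only the commands shown above; redeploy is a hypothetical helper name, and it assumes the application directory contains app.yaml and params.yaml.

```shell
# Hypothetical helper wrapping the rebuild/uninstall/install cycle.
# Usage: redeploy <release> <app_dir> <chart_dir>
redeploy() {
  local release="$1" app_dir="$2" chart_dir="$3"
  # Stop early if the chart cannot be generated.
  ucf_app_builder_cli app build "$app_dir/app.yaml" "$app_dir/params.yaml" || return 1
  # Ignore the failure from "helm uninstall" when the release is not installed yet.
  helm uninstall "$release" 2>/dev/null || true
  # Caution: deletes every PVC in the current namespace, as in the steps above.
  kubectl delete pvc --all
  helm install "$release" "$chart_dir"
}
```

For example, redeploy test test-app test-app/test-app-0.0.1/ repeats the commands from the step above.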
    

Building a Speech Bot UCS Application

In the previous section, we created a text-based UCS application for the stock sample bot. We will now update that UCS application to support speech-to-speech conversations. We already have a UCS application for the stock speech bot at ./deploy/ucs_apps/speech_bot/stock_bot/ and you can use it as a reference during this tutorial.

  1. Add Speech support using the Chat Controller Microservice.

    1. Add the Chat Controller Microservice in the dependencies section.

    dependencies:
    - ucf.svc.ace-agent.chat-engine:4.1.0
    - ucf.svc.ace-agent.web-app:4.1.0
    - ucf.svc.ace-agent.plugin-server:4.1.0
    - ucf.svc.ace-agent.nlp-server:4.1.0
    - ucf.svc.ace-agent.chat-controller:4.1.0
    - ucf.svc.core.redis-timeseries:0.0.20
    
    1. Add the chat-controller to the components section.

    - name: chat-controller
      type: ucf.svc.ace-agent.chat-controller
      parameters:
        imagePullSecrets:
        - name: ngc-docker-reg-secret
      secrets:
        ngc-api-key-secret: k8sSecret/ngc-api-key-secret/NGC_CLI_API_KEY
    
    1. Add the Riva Speech Skills microservice for deploying speech models in the dependencies section.

    dependencies:
    - ucf.svc.ace-agent.chat-engine:4.1.0
    - ucf.svc.ace-agent.web-app:4.1.0
    - ucf.svc.ace-agent.plugin-server:4.1.0
    - ucf.svc.riva.speech-skills:2.17.0
    - ucf.svc.ace-agent.nlp-server:4.1.0
    - ucf.svc.ace-agent.chat-controller:4.1.0
    - ucf.svc.core.redis-timeseries:0.0.20
    
    1. Add the riva-speech to the components section.

      - name: riva-speech
        type: ucf.svc.riva.speech-skills
        parameters:
          imagePullSecrets:
          - name: ngc-docker-reg-secret
        secrets:
          ngc-api-key-secret: k8sSecret/ngc-api-key-secret/NGC_CLI_API_KEY
      
    2. Add the Chat Controller connections to the Riva server and Chat Engine. Update the webapp connection with the Chat Controller.

      connections:
        chat-engine/plugin-server: plugin-server/http-api
        chat-engine/redis: redis-timeseries/redis
        chat-controller/riva: riva-speech/riva-speech-api
        chat-controller/redis: redis-timeseries/redis
        webapp/chat-controller: chat-controller/grpc-api
        webapp/redis: redis-timeseries/redis
      
    3. Update the Riva server parameters with the ASR and TTS models in test-app/params.yaml.

      riva-speech:
        riva:
          visibleGpus: "0"
        modelRepoGenerator:
          ngcModelConfigs:
            triton0:
              models:
              #> description: List of NGC models for deployment
              - nvidia/ace/rmir_asr_parakeet_1-1b_en_us_str_vad:2.17.0 #english
              - nvidia/riva/rmir_tts_fastpitch_hifigan_en_us_ipa:2.17.0
      
    4. Update the chat controller microservice parameters to use the speech_umim pipeline.

      chat-controller:
        pipeline: speech_umim
      
    5. Update the WebUI microservice parameters to enable speech mode in test-app/params.yaml.

      webapp:
        chatInterface: "event"
        speechFlags: "--speech"
      
  2. Generate the updated Helm Chart and deploy the application. Wait for all pods to be ready; this might take around 40-50 minutes. You should be able to interact with the updated bot using the Speech Web App at http://<NodeIP>:<NodePort_7006>/.

    # Rebuild the helm chart
    ucf_app_builder_cli app build test-app/app.yaml test-app/params.yaml
    
    # Uninstall previous deployment
    helm uninstall test
    kubectl delete pvc --all
    
    # Install updated helm chart
    helm install test test-app/test-app-0.0.1/
    

Note

To access the microphone in the browser, the web app must be served from an HTTPS endpoint. You can add SSL validation and convert the endpoint to HTTPS, or update chrome://flags/ or edge://flags/ to allow http://<Node_IP>:<Webapp_NodePort_7006> as a secure endpoint.

  3. For production deployment, use NGC resources for providing bot configurations to the various ACE Agent Microservices. This lets you keep track of versions without depending on the host file system. This step updates the chat-engine Microservice; you can update the NLP, Plugin, and Chat Controller Microservices similarly.

    1. Remove files from the chat-engine component in test-app/app.yaml.

      - name: chat-engine
        type: ucf.svc.ace-agent.chat-engine
        parameters:
          imagePullSecrets:
          - name: ngc-docker-reg-secret
        secrets:
          ngc-api-key-secret: k8sSecret/ngc-api-key-secret/NGC_CLI_API_KEY
          openai-key-secret: k8sSecret/openai-key-secret/OPENAI_API_KEY
      
    2. Add the chat-engine parameters in test-app/params.yaml. If files are provided via a component in test-app/app.yaml, the configNgcPath parameter will not be used. If you have multiple bots in NGC and want to deploy a single bot, use botConfigName to point to the required bot_config.yaml.

      chat-engine:
        configNgcPath: <NGC_RESOURCE_PATH>
        botConfigName: bot_config.yaml
      
    3. Rebuild the Helm Chart and redeploy.
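
Because the speech deployment can take 40-50 minutes to become ready, it can be convenient to block on pod readiness instead of watching the pod list. This is a minimal sketch using kubectl wait; wait_for_pods is a hypothetical helper name and the 60m default timeout is an assumption, not a documented value.

```shell
# Hypothetical helper: blocks until every pod in the current namespace is
# Ready, or fails once the timeout (default 60m, an assumed value) expires.
wait_for_pods() {
  kubectl wait --for=condition=Ready pods --all --timeout="${1:-60m}"
}
```

For example, wait_for_pods 90m waits for up to 90 minutes. Pods that have run to completion never report Ready, so the command may time out if the namespace contains finished Job pods.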

Building a LangChain Bot UCS Application

In this tutorial, we will showcase how you can create the UCS application for the sample LangChain based bot present at ./samples/ddg_langchain_bot. This tutorial can be used for creating UCS applications for the other sample bots such as ./samples/rag_bot or your own custom bot.

The DuckDuckGo LangChain sample bot uses OpenAI models and we will need an OPENAI_API_KEY. We already have a UCS application for the DuckDuckGo LangChain bot at ./deploy/ucs_apps/speech_bot/ddg_langchain_bot/ and you can use the same as reference during this tutorial.

  1. Create the boilerplate UCS application specs.

    ucf_app_builder_cli app create langchain-app
    
  2. Update langchain-app/app.yaml as follows:

    1. LangChain Agent will be deployed as part of the Plugin server microservice. We will need a Chat Controller and Riva Speech Skills for speech support. We might optionally include the Web App microservice for the UI to interact with the bot using text or speech. The dependencies we need are:

      dependencies:
      - ucf.svc.riva.speech-skills:2.17.0
      - ucf.svc.ace-agent.plugin-server:4.1.0
      - ucf.svc.ace-agent.chat-controller:4.1.0
      - ucf.svc.ace-agent.web-app:4.1.0
      
    1. Update the components section for the microservices in the dependencies section.

      - name: riva-speech
        type: ucf.svc.riva.speech-skills
        parameters:
          imagePullSecrets:
          - name: ngc-docker-reg-secret
      - name: plugin-server
        type: ucf.svc.ace-agent.plugin-server
        parameters:
          imagePullSecrets:
          - name: ngc-docker-reg-secret
      - name: chat-controller
        type: ucf.svc.ace-agent.chat-controller
        parameters:
          imagePullSecrets:
          - name: ngc-docker-reg-secret
      - name: webapp
        type: ucf.svc.ace-agent.web-app
        parameters:
          imagePullSecrets:
          - name: ngc-docker-reg-secret
      
    2. Update the connections between the microservices.

      connections:
        chat-controller/riva: riva-speech/riva-speech-api
        chat-controller/chat-api: plugin-server/http-api
        webapp/chat-controller: chat-controller/grpc-api
      
    3. Configure the OpenAI API key. To create the Kubernetes secret, run:

      export OPENAI_API_KEY=...
      kubectl create secret generic openai-key-secret --from-literal=OPENAI_API_KEY=${OPENAI_API_KEY}
      
    4. Set the NGC CLI API key for downloading NGC resources. We created ngc-api-key-secret in the Setup section. Remove the vault section and update the secrets section with ngc-api-key-secret and openai-key-secret.

      secrets:
        k8sSecret/ngc-api-key-secret/NGC_CLI_API_KEY:
          k8sSecret:
            secretName: ngc-api-key-secret
            key: NGC_CLI_API_KEY
        k8sSecret/openai-key-secret/OPENAI_API_KEY:
          k8sSecret:
            secretName: openai-key-secret
            key: OPENAI_API_KEY
      
    5. Configure ngc-api-key-secret and openai-key-secret in the plugin-server, chat-controller, and riva-speech components as needed.

      - name: riva-speech
        type: ucf.svc.riva.speech-skills
        parameters:
          imagePullSecrets:
          - name: ngc-docker-reg-secret
        secrets:
          ngc-api-key-secret: k8sSecret/ngc-api-key-secret/NGC_CLI_API_KEY
      
      - name: plugin-server
        type: ucf.svc.ace-agent.plugin-server
        parameters:
          imagePullSecrets:
          - name: ngc-docker-reg-secret
        secrets:
          ngc-api-key-secret: k8sSecret/ngc-api-key-secret/NGC_CLI_API_KEY
          openai-key-secret: k8sSecret/openai-key-secret/OPENAI_API_KEY
      
      - name: chat-controller
        type: ucf.svc.ace-agent.chat-controller
        parameters:
          imagePullSecrets:
          - name: ngc-docker-reg-secret
        secrets:
          ngc-api-key-secret: k8sSecret/ngc-api-key-secret/NGC_CLI_API_KEY
      
    6. Mount the bot to the Plugin server and Chat Controller microservices. We can do the same by providing the local directory under the plugin-server and chat-controller components in the application specs.

      - name: plugin-server
        type: ucf.svc.ace-agent.plugin-server
        parameters:
          imagePullSecrets:
          - name: ngc-docker-reg-secret
        secrets:
          ngc-api-key-secret: k8sSecret/ngc-api-key-secret/NGC_CLI_API_KEY
          openai-key-secret: k8sSecret/openai-key-secret/OPENAI_API_KEY
        files:
          config_dir: ../samples/ddg_langchain_bot
      
      - name: chat-controller
        type: ucf.svc.ace-agent.chat-controller
        parameters:
          imagePullSecrets:
          - name: ngc-docker-reg-secret
        secrets:
          ngc-api-key-secret: k8sSecret/ngc-api-key-secret/NGC_CLI_API_KEY
        files:
          config_dir: ../samples/ddg_langchain_bot
      
  3. Specify the parameters for the microservices. You can update the parameters section in each component or create langchain-app/params.yaml. Update langchain-app/params.yaml as follows:

    1. Update the Riva Speech parameters with the ASR and TTS models in langchain-app/params.yaml.

      riva-speech:
        riva:
          visibleGpus: "0"
        modelRepoGenerator:
          ngcModelConfigs:
            triton0:
              models:
              #> description: List of NGC models for deployment
              - nvidia/ace/rmir_asr_parakeet_1-1b_en_us_str_vad:2.17.0 #english
              - nvidia/riva/rmir_tts_radtts_hifigan_en_us_ipa:2.17.0
      
    2. Update the Chat Controller parameters in langchain-app/params.yaml. We will be using the speech_lite pipeline. The Plugin server will expose the APIs with the LangChain endpoint prefix.

      chat-controller:
        pipeline: speech_lite
        speechConfigPath: "speech_config.yaml"
        chatEndpointPrefix: "langchain"
      
    3. Update the Plugin server parameters.

      plugin-server:
        pluginConfigPath: "plugin_config.yaml"
      
    4. Update the WebUI microservice parameters.

      webapp:
        chatInterface: "grpc"
        speechFlags: "--speech"
      
  4. The LangChain plugin requires additional packages to be installed in the Plugin server container.

    1. Build a custom Dockerfile by copying the requirements from ddg_langchain_bot/plugins/requirements_dev.txt into deploy/docker/dockerfiles/plugin_server.Dockerfile.

      ##############################
      # Install custom dependencies
      ##############################
      RUN pip3 install \
          langchain==0.1.1 \
          langchain-community==0.0.13 \
          langchain-core==0.1.12 \
          duckduckgo-search==5.3.1b1
      

    Note

    If you see a crash in the Plugin server or an issue with fetching a response from DuckDuckGo, try using a more recent duckduckgo-search version.

    1. Build the container and push it to the NGC Docker registry.

      # Set required environment variables for docker-compose.yaml
      source deploy/docker/docker_init.sh
      
      # Build custom plugin server docker image
      docker compose -f deploy/docker/docker-compose.yml build plugin-server
      
      # Retag docker image and push to NGC docker registry
      docker tag docker.io/library/plugin-server:4.1.0 <CUSTOM_DOCKER_IMAGE_PATH>:<VERSION>
      
      docker push <CUSTOM_DOCKER_IMAGE_PATH>:<VERSION>
      

    If you want to use a different Docker registry, update imagePullSecrets in langchain-app/app.yaml.

    1. Override the Plugin server image using langchain-app/params.yaml.

      plugin-server:
        pluginConfigPath: "plugin_config.yaml"
        applicationSpecs:
          deployment:
            containers:
              container:
                image:
                  repository: <CUSTOM_DOCKER_IMAGE_PATH>
                  tag: <VERSION>
      
  5. Generate the updated Helm Chart and deploy the application. You should be able to interact with the updated bot using the Speech Web App at http://<NodeIP>:<NodePort_7006>/.

    # Build helm chart using UCS tools
    ucf_app_builder_cli app build langchain-app/app.yaml langchain-app/params.yaml
    
    # Uninstall previous deployment
    helm uninstall test
    kubectl delete pvc --all
    
    # Deploy generated helm chart
    helm install test langchain-app/langchain-app-0.0.1/
    

    To access the microphone in the browser, the web app must be served from an HTTPS endpoint. You can add SSL validation and convert the endpoint to HTTPS, or update chrome://flags/ or edge://flags/ to allow http://<Node_IP>:<Webapp_NodePort_7006> as a secure endpoint.
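
Pushing the custom Plugin server image in step 4 requires an authenticated Docker session with the registry. This is a minimal sketch, assuming the image is pushed to nvcr.io and that NGC_CLI_API_KEY is exported as in the Setup section; ngc_docker_login is a hypothetical helper name.

```shell
# Hypothetical helper: logs Docker in to nvcr.io with the NGC API key.
ngc_docker_login() {
  if [ -z "${NGC_CLI_API_KEY:-}" ]; then
    echo "NGC_CLI_API_KEY is not set" >&2
    return 1
  fi
  # '$oauthtoken' is a literal user name, so it must stay in single quotes.
  # The key is passed on stdin so that it does not appear in the process list.
  printf '%s' "$NGC_CLI_API_KEY" |
    docker login nvcr.io --username '$oauthtoken' --password-stdin
}
```

Run ngc_docker_login once before the docker push command. If you use a different Docker registry, replace nvcr.io and the credentials accordingly.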

Creating Kubernetes Secrets

ACE Agent Microservices utilize the following Kubernetes secrets for setting various keys.

  • Image Pull Secret [Mandatory]: Pulls Docker images for deployment. ACE Agent containers utilize the NGC Docker registry. For setting up an image pull secret, run:

    export NGC_CLI_API_KEY=...
    kubectl create secret docker-registry ngc-docker-reg-secret --docker-server=nvcr.io --docker-username='$oauthtoken' --docker-password="${NGC_CLI_API_KEY}"
    

    Where NGC_CLI_API_KEY is the NGC Personal API key.

  • NGC_CLI_API_KEY Secret [Mandatory]: Pulls models and resources from NGC. This secret is mandatory for Chat Engine, Chat Controller, Plugin, and NLP Server microservices. It will create a file with secret value at /secrets/ngc_api_key.txt in pods. It can also be used to populate the NGC_CLI_API_KEY environment variable.

    export NGC_CLI_API_KEY=...
    kubectl create secret generic ngc-api-key-secret --from-literal=NGC_CLI_API_KEY="${NGC_CLI_API_KEY}"
    

    Where NGC_CLI_API_KEY is the NGC Personal API key.

  • OPENAI_API_KEY Secret [Optional]: Sets the OPENAI_API_KEY environment variable in the Chat Engine, Plugin, and NLP Server Microservices. The secret will create a file with secret value at /secrets/openai_api_key.txt in pods.

    export OPENAI_API_KEY="sk-XXX"
    kubectl create secret generic openai-key-secret --from-literal=OPENAI_API_KEY=${OPENAI_API_KEY}
    
  • NVIDIA_API_KEY Secret [Optional]: Sets the NVIDIA_API_KEY environment variable commonly used for accessing https://build.nvidia.com/ models in the Plugin and Chat Engine Microservices. The secret will create a file with secret value at /secrets/nvidia_api_key.txt in pods.

    export NVIDIA_API_KEY="XXX"
    kubectl create secret generic nvidia-api-key-secret --from-literal=NVIDIA_API_KEY=${NVIDIA_API_KEY}
    
  • Custom ENV Secret [Optional]: This secret can be used to pass any key value pairs which will be exported as environment variables and supported by Chat Engine, Chat Controller, Plugin, and NLP Server Microservices. This secret will create a /secrets/.env file and will be sourced before running services to set the environment variables.

    cat <<EOF | tee custom-env.txt
    KEY1=VALUE1
    KEY2=VALUE2
    EOF
    kubectl create secret generic custom-env-secrets --from-file=ENV=custom-env.txt
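
After creating the secrets, you may want to confirm that each one holds the expected key without printing its value. This is a minimal sketch; secret_has_key is a hypothetical helper name, and it assumes the key name contains no dots (which would need escaping in the JSONPath expression).

```shell
# Hypothetical helper: succeeds if the secret exists in the current namespace
# and holds a non-empty value under the given key. The value is never printed.
# Usage: secret_has_key <secret_name> <key>
secret_has_key() {
  local value
  value="$(kubectl get secret "$1" -o jsonpath="{.data.$2}" 2>/dev/null)" || return 1
  [ -n "$value" ]
}
```

For example, secret_has_key ngc-api-key-secret NGC_CLI_API_KEY && echo ok prints ok only when the secret and its key are present.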
    

Bot and UCS Applications Customization

This section provides guidance on how to modify sample UCS applications and sample workflows such as Tokkio for your custom use cases. ACE Agent documentation uses Docker flow for development and bot customizations, and recommends Kubernetes deployment for production use cases. For some customization or debugging, it becomes important to work with both Docker and Kubernetes deployment interchangeably.

Local Development with Kubernetes Environment

If you are deploying ACE Agent as part of a more complex Kubernetes environment, for example, to create an interactive avatar experience with multiple ACE microservices, you will often want to test the bot you are developing directly in the setup environment. Testing the bot as part of the Kubernetes deployment will allow you to check the multimodal aspects of the interaction like the animations of the avatar, the voice, timings of response and more.

To simplify this workflow and to allow for fast iteration times, it is possible to run the Chat Engine in your local Python environment and connect it to your running Kubernetes deployment. In this case, you can test changes to the Colang scripts or bot configurations simply by restarting the Chat Engine locally.

In this section, we will showcase steps for using the local Chat Engine in the event interface, but similar steps should work for the Chat Engine server interface also.

  1. Replace the Chat Engine microservice in your UCS application specification with a local Chat Engine deployment by updating the connection for the Chat Controller egress endpoint chat-api.

    dependencies:
    [...]
    - ucf.svc.riva.speech-skills:2.17.0
    # Remove chat-engine from the dependency list
    <Remove>- ucf.svc.ace-agent.chat-engine:4.1.0</Remove>
    - ucf.svc.ace-agent.chat-controller:4.1.0
    
    [...]
    
    components:
    [...]
    # Remove the chat-engine from the list of components
    <Remove>- name: chat-engine
      type: ucf.svc.ace-agent.chat-engine
      parameters:
        imagePullSecrets:
        - name: ngc-docker-reg-secret
      secrets:
        ngc-api-key-secret: k8sSecret/ngc-api-key-secret/NGC_CLI_API_KEY
        openai-key-secret: k8sSecret/openai-key-secret/OPENAI_API_KEY</Remove>
    [...]
    # Add the local chat engine as external endpoint to the component list
    - name: engine-placeholder
      type: ucf.svc.external-endpoint
      parameters:
        # Update Local Chat Engine deployment IP Port
        service: 127.0.0.1
        port: 8000
    
    [...]
    
    connections:
    [...]
    # Remove outgoing connections from chat-engine (since we removed it above)
    <Remove>  chat-engine/plugin-server: plugin-server/http-api
      chat-engine/redis: redis-timeseries/redis</Remove>
      # Point the chat-api egress connection to the engine-placeholder
      chat-controller/chat-api: engine-placeholder/endpoint
    
  2. Redeploy your UCS application by following the Kubernetes environment section.

  3. In a separate terminal, use port forwarding or expose a NodePort to connect to the Redis time series microservice inside the cluster.

    kubectl port-forward $(kubectl get pods | grep redis-timeseries | awk '{print $1}') 30379:6379
    
  4. Start ACE Agent locally in your Python environment (refer to Python Environment for more information).

    aceagent chat event -c bots/your_bot/ --event-provider-port 30379 --log-level=INFO
    
  5. Edit the Colang scripts and bot configurations, save your changes to disk, and restart ACE Agent (press CTRL+C and repeat the previous step) to test them.
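
If the local Chat Engine fails to connect, a common cause is that the port-forward from step 3 is no longer running. The sketch below is one way to check that something is listening on the forwarded port before starting `aceagent`. It only confirms that a TCP connection succeeds, not that Redis itself is healthy, and `port_is_open` is a hypothetical helper name.

```python
import socket


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # 30379 is the local port used in the port-forward command above.
    if port_is_open("127.0.0.1", 30379):
        print("Port 30379 is reachable")
    else:
        print("Nothing is listening on 127.0.0.1:30379; is kubectl port-forward running?")
```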

Using Local Bot Configurations with UCS Applications#

The Tokkio workflow utilizes the bot configurations as NGC resources. Instead of using NGC resources to set the customizations, you can perform the following steps to customize your bot configurations using a local directory.

  1. Download NGC bot resources locally.

ngc registry resource download-version <NGC_BOT_RESOURCE_PATH>
  2. Update the UCS app.yaml file to use the local directory of the downloaded NGC resource. The local directory will be mounted as a ConfigMap in the pods. Use an absolute directory path or a path relative to app.yaml.

- name: chat-engine
  type: ucf.svc.ace-agent.chat-engine
  parameters:
    imagePullSecrets:
    - name: ngc-docker-reg-secret
  secrets:
    ngc-api-key-secret: k8sSecret/ngc-api-key-secret/NGC_CLI_API_KEY
    openai-key-secret: k8sSecret/openai-key-secret/OPENAI_API_KEY
  files:
    config_dir: <LOCAL_BOT_DIRECTORY>

You can make similar updates for the Chat Controller, NLP server, and Plugin server microservices.

  3. Update the UCS app-params.yaml file to remove references to the NGC bot configurations resource path from the chat-engine, chat-controller, nlp-server, and plugin-server components.

chat-engine:
  <REMOVE>configNgcPath: <NGC_RESOURCE_PATH></REMOVE>
  botConfigName: bot_config.yaml
  4. Make your modifications in the local bot configurations directory, generate the updated Helm chart, and deploy the application using helm upgrade.

# Build helm chart using UCS tools
ucf_app_builder_cli app build app.yaml app-params.yaml

# Upgrade deployment to utilize latest bot configurations
helm upgrade <DEPLOYMENT_NAME> <NEW_HELM_CHART>

Note

You might get warnings about binary files during Helm chart generation. Remove any binary files from the bot configurations directory, because the directory is mounted as a ConfigMap in the Kubernetes deployment.
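
To find the offending files before building the chart, you can scan the bot configurations directory. The sketch below uses a common heuristic, treating any file with a NUL byte in its first few kilobytes as binary. This is an assumption for illustration, not the check that UCS tools perform, and `find_binary_files` is a hypothetical helper.

```python
import os
import sys


def find_binary_files(root, sample_size=8192):
    """Return sorted relative paths of files whose first bytes contain NUL."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            with open(path, "rb") as handle:
                # Heuristic: text files normally contain no NUL bytes.
                if b"\x00" in handle.read(sample_size):
                    hits.append(os.path.relpath(path, root))
    return sorted(hits)


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python find_binary.py <LOCAL_BOT_DIRECTORY>
    for rel in find_binary_files(sys.argv[1]):
        print(rel)
```

Note that this heuristic also flags UTF-16 encoded text files, so review the list before deleting anything.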

Using 3rd Party Text-to-Speech (TTS) Solutions#

The ACE Agent pipeline supports Riva TTS as the default option.

For speech bots, you might want to customize the voice used for speech responses. You can train your own TTS model, clone a TTS voice, or use a 3rd party provider. The tutorial section shows how to integrate the ElevenLabs text-to-speech APIs using the Docker flow; this section covers the additional steps needed for a Kubernetes deployment.

  1. Build the NLP server Docker image with the required ElevenLabs dependencies.

     1. Add the required dependencies in the NLP server Dockerfile present at deploy/docker/dockerfiles/nlp_server.Dockerfile.

##############################
# Install custom dependencies
##############################
RUN apt-get update && apt-get -y install ffmpeg
RUN pip3 install pydub elevenlabs==1.4.1
     2. Build the container and push it to the NGC Docker registry.

# Set required environment variables for docker-compose.yml
source deploy/docker/docker_init.sh
# Build custom nlp server docker image
docker compose -f deploy/docker/docker-compose.yml build nlp-server

# Retag docker image and push to NGC docker registry
docker tag docker.io/library/nlp-server:4.1.0 <CUSTOM_DOCKER_IMAGE_PATH>:<VERSION>

docker push <CUSTOM_DOCKER_IMAGE_PATH>:<VERSION>

If you want to use a different Docker registry, update imagePullSecrets in the app.yaml file under the nlp-server component with your own Kubernetes image pull secret.

  2. Update the bot configurations with the ElevenLabs-specific client and configurations by following the steps in the tutorial section. If you are using the NGC bot resource path, switch to the local bot configurations for quicker iteration. You can verify your changes using the Docker flow.

  3. Add the NLP server microservice to the UCS application specs.

     1. Add the NLP server to the dependencies section of the app.yaml file.

- ucf.svc.ace-agent.nlp-server:4.1.0
     2. Add the NLP server to the components section of the app.yaml file.

- name: nlp-server
  type: ucf.svc.ace-agent.nlp-server
  parameters:
    imagePullSecrets:
    - name: ngc-docker-reg-secret
  secrets:
    ngc-api-key-secret: k8sSecret/ngc-api-key-secret/NGC_CLI_API_KEY
  files:
    config_dir: <LOCAL_BOT_CONFIG_DIR>
     3. Update the NLP server parameters in the app-params.yaml file to use the NLP server Docker image created in step 1.

nlp-server:
  ucfVisibleGpus: [0]
  modelConfigPath: "model_config.yaml"
  customModelDir: "elabs.py" # custom elevenlabs client path
  applicationSpecs:
    deployment:
      containers:
        nlp-api:
          image:
            # Update with the NGC image pushed in step 1
            repository: <CUSTOM_DOCKER_IMAGE_PATH>
            tag: <VERSION>
  4. Pass ELEVENLABS_API_KEY to the NLP server microservice.

     1. Create a custom Kubernetes secret for passing ELEVENLABS_API_KEY.

cat <<EOF | tee custom-env.txt
ELEVENLABS_API_KEY=<API_KEY_VALUE>
EOF
kubectl create secret generic custom-env-secrets --from-file=ENV=custom-env.txt
     2. Add custom-env-secrets to the secrets section of the app.yaml file.

secrets:
  k8sSecret/ngc-api-key-secret/NGC_CLI_API_KEY:
    k8sSecret:
      secretName: ngc-api-key-secret
      key: NGC_CLI_API_KEY
  k8sSecret/custom-env-secrets:
    k8sSecret:
      secretName: custom-env-secrets
      key: ENV
     3. Add custom-env-secrets to the nlp-server component.

- name: nlp-server
  type: ucf.svc.ace-agent.nlp-server
  parameters:
    imagePullSecrets:
    - name: ngc-docker-reg-secret
  secrets:
    ngc-api-key-secret: k8sSecret/ngc-api-key-secret/NGC_CLI_API_KEY
    custom-env-secrets: k8sSecret/custom-env-secrets
  files:
    config_dir: <LOCAL_BOT_CONFIG_DIR>
  5. Update the parameters in the UCS app-params.yaml file to utilize the ElevenLabs TTS client.

     1. In the chat-controller component, update the TTS section to utilize the ElevenLabs client.

riva_tts:
  RivaTTS:
    tts_mode: "http"
    voice_name: "Brian"
    language: "en-US"
    server: http://ace-agent-nlp-server-deployment-service:9003/speech/text_to_speech
    arpabet_dict: ""
    sample_rate: 16000
    model_name: "eleven_turbo_v2_5"
    chunk_duration_ms: 600   #amount of data to be sent to downstream in realtime
    audio_start_threshold_ms: 800 #2000   #duration for which audio data will be sent in burst and rest of the data will be sent in realtime
    send_audio_in_realtime: true   #this will send synthesized audio data in realtime to downstream
     2. Comment out the following TTS model in the riva-speech component.

nvidia/riva/rmir_tts_fastpitch_hifigan_en_us_ipa:2.17.0
  6. Make any further modifications in the local bot configurations directory, generate the updated Helm chart, and redeploy the application.

# Build helm chart using UCS tools
ucf_app_builder_cli app build app.yaml app-params.yaml

# Uninstall previous deployment
helm uninstall <DEPLOYMENT_NAME>
kubectl delete pvc --all

# Deploy generated helm chart
helm install <DEPLOYMENT_NAME> <HELM_CHART_PATH>
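
The chat-controller TTS parameters shown earlier (sample_rate, chunk_duration_ms, and audio_start_threshold_ms) are easier to reason about when converted from milliseconds to sample counts. The sketch below works through that arithmetic for the values in the example. The byte sizes assume uncompressed 16-bit mono PCM, which is an assumption for illustration and is not confirmed by this document.

```python
def samples_for_duration(sample_rate_hz, duration_ms):
    """Number of audio samples covering duration_ms at sample_rate_hz."""
    return sample_rate_hz * duration_ms // 1000


def pcm_bytes(sample_rate_hz, duration_ms, bytes_per_sample=2, channels=1):
    """Payload size in bytes for duration_ms of audio.

    Assumption: uncompressed PCM; 16-bit mono is a guess used for
    illustration, not a documented ACE Agent audio format.
    """
    return samples_for_duration(sample_rate_hz, duration_ms) * bytes_per_sample * channels


if __name__ == "__main__":
    print(samples_for_duration(16000, 600))  # chunk_duration_ms: 9600 samples
    print(samples_for_duration(16000, 800))  # audio_start_threshold_ms: 12800 samples
    print(pcm_bytes(16000, 600))             # 19200 bytes under the PCM assumption
```

With these values, the initial burst covers 800 ms of audio, after which each real-time chunk carries 600 ms.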

Deploying Multilingual Bots with Kubernetes#

NVIDIA ACE Agent supports deploying bots in different languages, with a few limitations.

  • For ASR (Automatic Speech Recognition), only Riva Speech compatible models and languages are supported.

  • For TTS (Text-to-Speech), you can use external services in addition to Riva TTS for your choice of language.

  • A single deployment can’t support multiple languages. You need to pick one language for ASR, and you can use the same or a different language for TTS.

In this section, we will showcase the needed changes in the UCS application to deploy sample Spanish bots released as part of the ACE Agent release. You can apply similar changes when changing languages.

  1. Update the ASR and TTS models for your target language in the app-params.yaml file under the riva-speech component.

riva-speech:
  riva:
    visibleGpus: "0"
  modelRepoGenerator:
    ngcModelConfigs:
      triton0:
        models:
        #> description: List of NGC models for deployment

        - nvidia/ace/rmir_asr_parakeet_1-1b_en_us_str_vad:2.17.0 #english
        - nvidia/riva/rmir_tts_fastpitch_hifigan_en_us_ipa:2.17.0
  2. If you are using Riva’s Neural Machine Translation model for translation at either the query level or the response level, you also need to deploy the translation models in the riva-speech component.

  3. Update language for ASR, and voice_name and language for TTS, in the chat-controller component in the app-params.yaml file. These updates can also be made in the speech_config.yaml file. If a value is set in both places, the app-params.yaml file takes precedence.

chat-controller:
 pipelineParams:
   riva_asr:
     RivaASR:
       server: "0.0.0.0:50051"
       language: "en-US"
       word_boost_file_path: "/workspace/config/asr_words_to_boost.txt"
       enable_profanity_filter: false
       endpointing_stop_history: 800
       endpointing_stop_history_eou: 240
   riva_tts:
     RivaTTS:
       server: "0.0.0.0:50051"
       voice_name: "English-US.Female-1"
       language: "en-US"
       sample_rate: 44100
       chunk_duration_ms: 100
       audio_start_threshold_ms: 400
       send_audio_in_realtime: true
       tts_mode: "grpc"
  4. You can develop multilingual bots using one of the following implementations:

     1. Use a multilingual LLM or a language-specific LLM. Refer to the Spanish LLM sample bot as an example. It uses Colang syntax (en-US) and a multilingual LLM to generate Spanish responses to user queries that are also in Spanish.

     2. Use Riva’s Neural Machine Translation model. In this implementation, the user query is translated into en-US, the response is generated in en-US, and the response is then translated into the target language. Refer to the Spanish NMT bot as an example.

  5. Update the bot configurations, using either the local directory or the NGC resource path in the UCS applications, with the required changes for your target use case. For more information, refer to the Using Local Bot Configurations with UCS Applications section.

  6. Rebuild the Helm chart using the UCS tools.

# Build helm chart using UCS tools
ucf_app_builder_cli app build app.yaml app-params.yaml
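
The translation-based implementation described above (translate the query into en-US, generate the response, translate it back) can be sketched as a small pipeline. This is an illustration of the data flow only: `answer_with_translation` and the `translate` and `generate` callables are hypothetical stand-ins, not ACE Agent or Riva APIs, and the `es-US` language code is an assumption.

```python
def answer_with_translation(query, translate, generate,
                            user_lang="es-US", bot_lang="en-US"):
    """Translate the query to bot_lang, generate a reply, translate it back.

    translate(text, source=..., target=...) and generate(text) are
    caller-supplied callables; they are hypothetical placeholders for an
    NMT service and the bot logic.
    """
    query_in_bot_lang = translate(query, source=user_lang, target=bot_lang)
    reply_in_bot_lang = generate(query_in_bot_lang)
    return translate(reply_in_bot_lang, source=bot_lang, target=user_lang)


if __name__ == "__main__":
    # Toy stand-ins so the sketch runs without Riva or an LLM.
    phrasebook = {
        ("hola", "en-US"): "hello",
        ("hello there", "es-US"): "hola a todos",
    }

    def toy_translate(text, source, target):
        return phrasebook.get((text, target), text)

    def toy_generate(text):
        return text + " there"

    print(answer_with_translation("hola", toy_translate, toy_generate))
```

Keeping the bot logic in en-US means only the two translation calls depend on the target language, which is the trade-off against the multilingual LLM approach.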