Intelligent Virtual Assistant
This section will walk through an end-to-end workflow deployment using the example software stack components previously described.
Ensure that the two nodes provisioned from the previous Hardware Requirements section are accessible.
One of the VMIs will be used for the training pipeline.
The second VMI, the Kubernetes cluster node, will be used for the inference pipeline.
SSH into the training VMI (this is the VMI without Cloud Native Stack or Kubernetes installed).
Download the training resources from the NGC registry.
ngc registry resource download-version "nvaie/intelligent-virtual-assistant-training:0.1"
Switch to the training directory.
cd intelligent-virtual-assistant-training_v0.1/
Make the set-up script executable.
chmod +x ./run.sh
Run the setup script.
sudo ./run.sh <YOUR-API-KEY>
Note: The installer may fail if dpkg did not run cleanly or to completion during instance provisioning. If this occurs, run the following command to resolve the issue, then retry the installation.
sudo dpkg --configure -a
From a browser, navigate to the Jupyter Notebook URL displayed once the setup script completes. It is part of the CUSTOM STATES output, e.g.
RUN: { "services": [ { "name": "notebook", "url": "http://<External-IP>/notebook" } ] }
Select and run through the Jupyter Notebooks, starting with the Welcome Notebook.
The Training Deployment steps are complete.
As a part of the workflow, we will be demonstrating how to deploy the packaged workflow components as a Helm chart on the previously described Kubernetes-based platform. We will also demonstrate how to interact with the workflow, how each of the components in the pipeline work, and how they all function together.
This includes an example of how to securely send requests to the inference pipeline, using Envoy set up as a proxy to authenticate and authorize requests sent to Triton, and Keycloak as the OIDC identity provider. For more information about the authentication portion of the workflow, refer to the Authentication section in the Appendix.
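As a minimal sketch of that flow (the endpoint and model name below are placeholders, not values defined by this workflow), an authenticated request attaches the Keycloak access token as a Bearer header, which Envoy validates before forwarding the request to Triton:

```shell
# Hypothetical sketch: ACCESS_TOKEN is obtained from Keycloak later in this
# section; the host and model name are placeholders.
ACCESS_TOKEN='eyJhbGc...'
AUTH_HEADER="Authorization: Bearer $ACCESS_TOKEN"
# A model readiness check through the Envoy proxy would then look like:
#   curl -sk "https://<External-IP>/v2/models/<MODEL_NAME>/ready" -H "$AUTH_HEADER"
echo "$AUTH_HEADER"
```

Requests without a valid token are rejected at the proxy and never reach Triton.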
First, configure Keycloak according to the instructions provided in the Appendix.
Note down these six fields for the Deployment workflow:
client_id
client_secret
realm name
username
password
token_endpoint
SSH into the inference/deployment VMI.
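One way to keep the noted values handy (a sketch with placeholder values, not ones defined by this workflow) is to export them as environment variables in the SSH session so they can be referenced later:

```shell
# Placeholders: substitute the values noted from your Keycloak configuration.
export TOKEN_ENDPOINT='https://<KEYCLOAK-HOST>/realms/<REALM-NAME>/protocol/openid-connect/token'
export CLIENT_ID='<CLIENT_ID>'
export CLIENT_SECRET='<CLIENT_SECRET>'
export USERNAME='<USERNAME>'
export PASSWORD='<PASSWORD>'
```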
Once Keycloak has been configured, run the following command on your system via the SSH console to get the access token (replace the TOKEN_ENDPOINT, CLIENT_ID, CLIENT_SECRET, USERNAME, and PASSWORD fields with the values previously created).
curl -k -L -X POST '<TOKEN_ENDPOINT>' -H 'Content-Type: application/x-www-form-urlencoded' --data-urlencode 'client_id=<CLIENT_ID>' --data-urlencode 'grant_type=password' --data-urlencode 'client_secret=<CLIENT_SECRET>' --data-urlencode 'scope=openid' --data-urlencode 'username=<USERNAME>' --data-urlencode 'password=<PASSWORD>' | json_pp
For example:
curl -k -L -X POST 'https://auth.your-cluster.your-domain.com/realms/ai-workflows/protocol/openid-connect/token' -H 'Content-Type: application/x-www-form-urlencoded' --data-urlencode 'client_id=merlin-workflow-client' --data-urlencode 'grant_type=password' --data-urlencode 'client_secret=vihhgVP76TgA4qDL3c5jUFAN1gixWYT8' --data-urlencode 'scope=openid' --data-urlencode 'username=nvidia' --data-urlencode 'password=hello123' | json_pp
This will output a JSON string like the one below:
{
   "access_token" : "eyJhbGc...",
   "expires_in" : 54000,
   "refresh_expires_in" : 108000,
   "refresh_token" : "eyJhbGci...",
   "not-before-policy" : 0,
   "session_state" : "e7e23016-2307-4290-af45-2c79ee79d0a1",
   "scope" : "openid email profile"
}
Note down the access_token; this field will be required later in the workflow, within the Jupyter notebook.
Now we're ready to deploy the intelligent virtual assistant application.
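Rather than copying the token by hand, it can also be captured into a shell variable. The snippet below is a sketch assuming the JSON response from the curl command above was saved in a RESPONSE variable; since a JWT contains no embedded double quotes, a simple sed expression suffices even when jq is not installed.

```shell
# Assumption: RESPONSE holds the JSON returned by the token request above
# (a truncated placeholder is shown here).
RESPONSE='{"access_token":"eyJhbGc...","expires_in":54000,"scope":"openid email profile"}'
# Pull out the access_token value.
ACCESS_TOKEN=$(printf '%s' "$RESPONSE" | sed -n 's/.*"access_token":"\([^"]*\)".*/\1/p')
echo "$ACCESS_TOKEN"
```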
Ensure the NGC CLI is configured.
ngc config set
Download the deployment resources from the NGC registry.
ngc registry resource download-version "nvaie/intelligent-virtual-assistant-deployment:0.1"
Switch to the Helm charts directory.
cd intelligent-virtual-assistant-deployment_v0.1/helm_charts
Install Haystack Helm chart.
helm -n cciva install haystack haystack --create-namespace
Note down the haystack url from the output of the Helm install.
Install the Rasa Helm chart.
helm -n cciva install rasa rasa/ --set ngcCredentials.password=<NGC_KEY> --set haystackUrl=<HAYSTACK_URL>
Note down the Rasa URL from the Helm chart output.
Important: If your VMI has only a single GPU attached and you previously installed the Audio Transcription Riva Helm chart, delete it before proceeding.
cd ~/audio-transcription-deployment_v0.1/helm_charts
helm del transcription -n riva
Install Riva via Helm.
helm -n cciva install riva riva/ --set ngcCredentials.password=<NGC_KEY> --set workflow.keycloak.keycloakrealm=<WORKFLOW_REALM> --create-namespace --set haystackUrl=<HAYSTACK_URL> --set rasaUrl=<RASA_URL>
Reference the output from the Helm install and launch the Jupyter Notebook from a browser once the pods are running.
Validate that all of the IVA pods are running:
kubectl get pods -n cciva
Note: It can take about half an hour for the Riva pod to initialize.
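Given that wait, instead of re-running kubectl get pods by hand, a small helper can report whether every pod in the namespace is Running. This is a sketch; all_running is a hypothetical function, not part of the workflow tooling.

```shell
# Hypothetical helper: succeeds only when no pod in the namespace reports a
# status other than Running.
NAMESPACE=cciva
all_running() {
  ! kubectl get pods -n "$NAMESPACE" --no-headers | awk '{print $3}' | grep -qv '^Running$'
}
# Usage on the cluster node, e.g.:
#   until all_running; do echo "waiting for pods..."; sleep 30; done
```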
Once all of the pods are running, access the Jupyter Notebook for instructions on running the workflow.