Set Up a Speech AI Server with Speech Recognition and Text-to-Speech Models
In this section, you will learn how to set up NVIDIA Riva Speech Skills on NVIDIA LaunchPad.
Through LaunchPad, enterprises can get immediate, short-term access to NVIDIA AI running on private accelerated compute infrastructure to power critical AI initiatives. This platform speeds the development of complex AI models using NVIDIA DGX SuperPOD™, NVIDIA Base Command, NVIDIA Fleet Command, and pretrained models from NVIDIA NGC™.
As a prerequisite for setting up Riva Speech Skills, you first need access to NVIDIA GPU Cloud (NGC). After signing up, you can use the NGC command-line interface (CLI) to access the relevant resources. The NGC CLI supports many of the same operations that are available from the NGC website, such as accessing Docker repositories within your organization and team space.
Setup
To access the LaunchPad VM console, click “VM Console” in the left-hand navigation pane and switch to the newly created tab.
Download, unzip, and install the NGC CLI from the command line: move to a directory where you have execute permissions and run the command below.
wget -O ngccli_arm64.zip https://ngc.nvidia.com/downloads/ngccli_arm64.zip && unzip -o ngccli_arm64.zip && chmod u+x ngc
Check the binary’s md5 hash to ensure the file wasn’t corrupted during download using the command below.
md5sum -c ngc.md5
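If the install steps are scripted, it can help to stop as soon as the checksum fails rather than continue with a corrupted binary. The following is a minimal sketch, assuming the checksum file is in standard md5sum format and sits next to the file it describes; the helper name verify_checksum is hypothetical and not part of the NGC CLI.

```shell
# verify_checksum FILE.md5
# Returns 0 when every file listed in FILE.md5 matches its recorded MD5 hash,
# and 1 otherwise. (Hypothetical helper, for illustration only.)
verify_checksum() {
  if md5sum -c "$1" >/dev/null 2>&1; then
    echo "checksum ok: $1"
  else
    echo "checksum mismatch: $1" >&2
    return 1
  fi
}

# Example use in an install script (left commented out here):
# verify_checksum ngc.md5 || exit 1
```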
Add your current directory to your PATH using the command below.
echo "export PATH=\"\$PATH:$(pwd)\"" >> ~/.bash_profile && source ~/.bash_profile
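Note that the command above appends a new line to ~/.bash_profile every time it is run. If you expect to repeat the setup, an idempotent variant may be preferable. The sketch below is one possible approach; the helper names add_to_path and append_once are hypothetical and introduced only for illustration.

```shell
# add_to_path DIR
# Appends DIR to PATH for the current shell only if it is not already present.
add_to_path() {
  case ":$PATH:" in
    *":$1:"*) ;;                 # already on PATH, nothing to do
    *) PATH="$PATH:$1" ;;
  esac
  export PATH
}

# append_once FILE LINE
# Appends LINE to FILE unless an identical line already exists.
append_once() {
  touch "$1"
  grep -qxF -- "$2" "$1" || printf '%s\n' "$2" >> "$1"
}

# Example use from the directory that contains the ngc binary (commented out):
# add_to_path "$(pwd)"
# append_once ~/.bash_profile "export PATH=\"\$PATH:$(pwd)\""
```

Running either helper a second time with the same arguments leaves PATH and the profile file unchanged.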
You must configure the NGC CLI before you can run its commands. Run the command below and enter your API key when prompted.
ngc config set
There are two options to deploy Riva: 1) Using Quick Start scripts to deploy Riva using Docker, and 2) Using the Helm chart available in NGC Helm Repository for a push-button deployment to a Kubernetes cluster. This section provides instructions for the former, using Quick Start scripts for local deployment on the LaunchPad VM.
Run the following commands in the VM console.
Download the Quick Start scripts through the NGC CLI by using the command below.
ngc registry resource download-version nvidia/riva/riva_quickstart:2.4.0
Initialize and Start Riva.
Navigate to your quickstart directory using the command below.
cd riva_quickstart_v2.4.0
Modify the config.sh file with your preferred configuration. Options include which models to retrieve from NGC, where to store them, and which GPU to use if more than one is installed in your system.
Initialize Riva using the command below.
bash riva_init.sh
The riva_init.sh script downloads all required models and containers from NGC and generates the model repository. The download can take some time depending on your internet bandwidth. Upon successful completion of this command, you should see the output shown below.
Logging into NGC Docker registry if necessary...
Pulling required Docker images if necessary...
> Pulling Riva Speech Server images.
> Pulling nvcr.io/nvidia/riva/riva-speech:2.4.0. This may take some time...
Riva initialization complete. Run bash riva_start.sh to launch services.
After downloading the required models and containers, start the Riva Speech Service API server using the command below.
bash riva_start.sh
You should see output similar to the following.
Starting Riva Speech Services
> Waiting for Triton server to load all models...retrying in 10 seconds
> Waiting for Triton server to load all models...retrying in 10 seconds
> Waiting for Triton server to load all models...retrying in 10 seconds
> Triton server is ready…
To verify that the servers have started correctly, check that the output of docker logs riva-speech shows the following lines.
I0428 03:14:50.440943 1 riva_server.cc:66] TTS Server connected to Triton Inference Server at 0.0.0.0:8001
I0428 03:14:50.440943 1 riva_server.cc:66] NLP Server connected to Triton Inference Server at 0.0.0.0:8001
I0428 03:14:50.440951 1 riva_server.cc:68] ASR Server connected to Triton Inference Server at 0.0.0.0:8001
I0428 03:14:50.440955 1 riva_server.cc:71] Riva Conversational AI Server listening on 0.0.0.0:50051
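Because model loading can take a while, you may prefer to poll the logs instead of checking them by hand. The sketch below repeatedly runs a command and succeeds once its output contains a given string. The helper name wait_for_pattern, the retry count, and the delay are assumptions chosen for illustration; the search string is taken from the log output described above.

```shell
# wait_for_pattern PATTERN RETRIES DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND up to RETRIES times, sleeping DELAY_SECONDS between attempts.
# Returns 0 as soon as the combined stdout/stderr contains PATTERN as a
# fixed string, and 1 if it never appears. (Hypothetical helper.)
wait_for_pattern() {
  wfp_pattern=$1
  wfp_retries=$2
  wfp_delay=$3
  shift 3
  wfp_try=0
  while [ "$wfp_try" -lt "$wfp_retries" ]; do
    # 2>&1 matters: container logs may be written to stderr.
    if "$@" 2>&1 | grep -qF -- "$wfp_pattern"; then
      return 0
    fi
    wfp_try=$((wfp_try + 1))
    sleep "$wfp_delay"
  done
  return 1
}

# Example use on the LaunchPad VM (commented out; requires Docker and a
# running riva-speech container):
# wait_for_pattern "Riva Conversational AI Server listening" 30 10 \
#   docker logs riva-speech && echo "Riva is ready"
```

Passing the log command as arguments keeps the helper independent of Docker, so the same function can be pointed at a log file with cat if that is more convenient.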
The Riva server is all set up and ready to explore!
The best place to start is the Interact with Real-time Speech AI APIs guide, which provides an introduction to Riva’s Python API for accessing Riva services. For more information, refer to Riva Speech Skills documentation.
To shut down the Riva Services server containers, use the command below.
bash riva_stop.sh
To clean up the local Riva installation, use the command below.
bash riva_clean.sh
This stops and removes all Riva-related containers and deletes the Docker volume used to store model files. The Docker images themselves are not removed.