Quick Start Guide

This NVIDIA® Riva Speech Skills Quick Start Guide is a starting point to try out Riva; specifically, this guide enables you to quickly deploy pretrained models on a local workstation and run a sample client.

For more information and questions, visit the NVIDIA Riva Developer Forum.

Prerequisites

Before you begin using Riva skills, ensure that you meet the following prerequisites.

  1. You have access and are logged into NVIDIA NGC™. For step-by-step instructions, refer to the NGC Getting Started Guide.

  2. You have access to an NVIDIA Volta™, NVIDIA Turing™, or an NVIDIA Ampere Architecture-based A100 GPU. For more information, refer to the Support Matrix.

  3. You have Docker installed with support for NVIDIA GPUs. For more information, refer to the Support Matrix.

Models Available for Deployment

To deploy Riva skills, there are two options:

Option 1: You can use the Quick Start scripts to set up a local workstation and deploy the Riva services using Docker. Continue with this section to use the Quick Start scripts.

Option 2: You can use a Helm chart. Included in the NGC Helm Repository is a chart designed to automate the steps for push-button deployment to a Kubernetes cluster. For details, see Kubernetes.

When using either push-button deployment option, Riva uses pretrained models from NGC. You can also fine-tune custom models with the NVIDIA TAO Toolkit; generating a model repository from a model fine-tuned with the TAO Toolkit is a more advanced workflow.

Local Deployment using Quick Start Scripts

Riva includes Quick Start scripts to help you get started with Riva skills. These scripts are meant for deploying the services locally for testing and running the example applications.

  1. Go to Riva Quick Start and select the File Browser tab to download the scripts, or download them from the command line with the NGC CLI tool by running:

    ngc registry resource download-version nvidia/riva/riva_quickstart:1.9.0-beta
    
  2. Initialize and start Riva. The initialization step downloads and prepares Docker images and models. The start script launches the server.

    1. Within the quickstart directory, modify the config.sh file with your preferred configuration. Options include which models to retrieve from NGC, where to store them, and which GPU to use if more than one is installed in your system (see Local (Docker) for more details).

      Note

      This process can take up to an hour on an average internet connection. Each model is individually optimized for the target GPU after download.
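
      As an illustration, a minimal edit to config.sh might look like the following excerpt. The variable names shown are assumptions based on the Quick Start scripts; verify them against the copy you downloaded before editing:

        # config.sh excerpt: enable only the services you need;
        # disabled services are not downloaded or deployed.
        service_enabled_asr=true
        service_enabled_nlp=false
        service_enabled_tts=true

        # Select the GPU to use when more than one is installed.
        gpus_to_use="device=0"

        # Location (Docker volume or directory) for the downloaded models.
        riva_model_loc="riva-model-repo"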

      cd riva_quickstart_v1.9.0-beta
      bash riva_init.sh
      bash riva_start.sh
      
  3. Start a container with sample clients for each service.

    bash riva_start_client.sh
    
  4. From inside the client container, try the different services using the provided Jupyter notebooks.

       jupyter notebook --ip=0.0.0.0 --allow-root --notebook-dir=/work/notebooks
    
    If running the Riva Quick Start scripts on a cloud service provider (such as AWS or GCP), ensure that your compute instance has an externally visible IP address. To run the Jupyter notebooks, connect a browser window to the correct port (``8888`` by default) of that external IP address.
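
    One common way to reach the notebook server on a cloud instance without exposing port 8888 publicly is an SSH tunnel; the key path, user, and host below are placeholders for your own values:

       # Forward local port 8888 to the instance's notebook port over SSH.
       ssh -i ~/.ssh/my-key.pem -L 8888:localhost:8888 ubuntu@<instance-ip>

    You can then open http://localhost:8888 in a local browser.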
    

For further details on how to customize a local deployment, see Local Deployment (Docker).

Running the Riva Client and Transcribing Audio Files

For Automatic Speech Recognition (ASR), run the following commands from inside the Riva client container to perform streaming and offline transcription of audio files.

  1. For offline recognition, run:

     riva_asr_client --audio_file=/work/wav/en-US_sample.wav

  2. For streaming recognition, run:

     riva_streaming_asr_client --audio_file=/work/wav/en-US_sample.wav
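
If the clients cannot connect, the server may still be optimizing and loading models. Assuming the default container name used by the Quick Start scripts (riva-speech); adjust if yours differs, you can watch its progress with:

    # Follow the server logs until the server reports that it is ready.
    docker logs -f riva-speech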

Running the Riva Client and Converting Text to Audio Files

To synthesize audio files from within the Riva client container, run:

riva_tts_client --voice_name=ljspeech --text="Hello, this is a speech synthesizer." \
    --audio_file=/work/wav/output.wav

The audio files are stored in the /work/wav directory.

The streaming API can be tested by using the command-line option --online=true. However, there is no observable difference between the two options with this command-line client, since in both cases it saves the entire audio to a .wav file.
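
For example, to exercise the streaming code path with the same sentence (the output filename here is illustrative):

    riva_tts_client --voice_name=ljspeech --online=true \
        --text="Hello, this is a speech synthesizer." \
        --audio_file=/work/wav/output_streaming.wav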