Quick Start Guide

This Riva Speech Skills Quick Start Guide is a starting point to try out Riva; specifically, this guide enables you to quickly deploy pretrained models on a local workstation and run a sample client.

For more information and questions, visit the NVIDIA Riva Developer Forum.


Prerequisites

Before you begin using Riva skills, ensure that you meet the following prerequisites.

  1. You have access and are logged into NVIDIA GPU Cloud (NGC). For step-by-step instructions, refer to the NGC Getting Started Guide.

  2. You have access to a Volta, Turing, or NVIDIA Ampere architecture-based A100 GPU. For more information, refer to the Support Matrix.

  3. You have Docker installed with support for NVIDIA GPUs. For more information, refer to the Support Matrix.
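The Docker prerequisite can be spot-checked before continuing; a minimal sketch, assuming the NVIDIA Container Toolkit is installed (the CUDA image tag below is only one example):

```shell
# Verify that Docker can see the GPU by running nvidia-smi inside a CUDA
# container. Assumes the NVIDIA Container Toolkit; the image tag is an example.
if docker run --rm --gpus all nvidia/cuda:11.0.3-base-ubuntu20.04 nvidia-smi; then
  gpu_check="ok"
else
  gpu_check="failed; verify the NVIDIA Container Toolkit installation"
fi
echo "Docker GPU check: $gpu_check"
```

If the check fails, fix the container runtime before moving on; the Quick Start scripts assume GPU-enabled Docker.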

Models Available for Deployment

To deploy Riva skills, there are two options:

Option 1: You can use the Quick Start scripts to set up a local workstation and deploy the Riva services using Docker. Continue with this section to use the Quick Start scripts.

Option 2: You can use a Helm chart. Included in the NGC Helm Repository is a chart designed to automate the steps for push-button deployment to a Kubernetes cluster. For details, see Kubernetes.
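For Option 2, deployment follows the standard Helm workflow; a hypothetical sketch is shown below. The repository URL, chart name, and release name here are placeholders, and the NGC API key is assumed to be in the NGC_API_KEY environment variable; see the Kubernetes page for the exact values.

```shell
# Hypothetical Helm deployment sketch; repo URL and chart name are placeholders.
if [ -z "${NGC_API_KEY:-}" ]; then
  echo "Set NGC_API_KEY to your NGC API key first"
else
  helm repo add ngc-riva https://helm.ngc.nvidia.com/nvidia/riva \
    --username='$oauthtoken' --password="$NGC_API_KEY"
  helm install riva-api ngc-riva/riva-api
fi
```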

With either deployment option, Riva uses pretrained models from NGC. You can also fine-tune custom models with the NVIDIA TAO Toolkit; creating a model repository from a model fine-tuned with TAO Toolkit is a more advanced workflow.

Local Deployment using Quick Start Scripts

Riva includes Quick Start scripts to help you get started with Riva skills. These scripts are meant for deploying the services locally for testing and running the example applications.

  1. Go to Riva Quick Start and select the File Browser tab to download the scripts, or download them from the command line with the NGC CLI tool by running:

    ngc registry resource download-version nvidia/riva/riva_quickstart:1.8.0-beta
  2. Initialize and start Riva. The initialization step downloads and prepares Docker images and models. The start script launches the server.

    1. Within the quickstart directory, modify the config.sh file with your preferred configuration. Options include which models to retrieve from NGC, where to store them, and which GPU to use if more than one is installed in your system (see Local (Docker) for more details).

    2. Run the initialization and start scripts:

      This process can take up to an hour on an average internet connection. Each model is individually optimized for the target GPU after download.

      cd riva_quickstart_v1.8.0-beta
      bash riva_init.sh
      bash riva_start.sh
  3. Start a container with sample clients for each service.

    bash riva_start_client.sh
  4. From inside the client container, try the different services using the provided Jupyter notebooks.

    jupyter notebook --ip=0.0.0.0 --allow-root --notebook-dir=/work/notebooks
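As a concrete example of the config.sh edit in step 2, a configuration that deploys only the ASR service on the first GPU might look like the excerpt below. The variable names follow the 1.8.0-beta quickstart and may differ in other releases; check the comments in your copy of config.sh.

```shell
# Excerpt of quickstart/config.sh -- a sketch; variable names may differ by release.
service_enabled_asr=true           # retrieve and deploy ASR models
service_enabled_nlp=false          # skip NLP models
service_enabled_tts=false          # skip TTS models
riva_model_loc="riva-model-repo"   # Docker volume where models are stored
gpus_to_use="device=0"             # which GPU to use if several are installed
```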

For further details on how to customize a local deployment, see Local Deployment (Docker).
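If riva_start.sh times out waiting for the server, the server container's logs usually explain why; a sketch, assuming the quickstart's default container name riva-speech (confirm the actual name with docker ps):

```shell
# Show Riva containers and tail the server log.
# The container name riva-speech is an assumption; verify it with docker ps.
docker ps --filter "name=riva"
docker logs --tail 50 riva-speech 2>&1 || echo "riva-speech container not found"
```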

Running the Riva Client and Transcribing Audio Files

For ASR, run the following commands from inside the Riva client container to perform streaming and offline transcription of audio files.

  1. For offline recognition, run: riva_asr_client --audio_file=/work/wav/en-US_sample.wav

  2. For streaming recognition, run: riva_streaming_asr_client --audio_file=/work/wav/en-US_sample.wav
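To transcribe several files in one pass, the offline client can be wrapped in a small loop; a sketch assuming the sample directory /work/wav used in the commands above:

```shell
# Transcribe every .wav under /work/wav with the offline client (sketch).
count=0
for f in /work/wav/*.wav; do
  [ -e "$f" ] || break          # no matches: the glob stays literal
  riva_asr_client --audio_file="$f"
  count=$((count + 1))
done
echo "transcribed $count file(s)"
```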

Running the Riva Client and Converting Text to Audio Files

To synthesize audio files, run the following from within the Riva client container:

riva_tts_client --voice_name=ljspeech --text="Hello, this is a speech synthesizer."

The audio files are stored in the /work/wav directory.

The streaming API can be tested by using the command-line option --online=true. However, there is no perceptible difference between the two modes with this command-line client, since it saves the entire audio to a .wav file in either case.
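To confirm that synthesis produced sensible output, the generated files can be inspected with Python's standard wave module; a minimal sketch (the file name output.wav below is hypothetical):

```python
import wave

def wav_info(path):
    """Return (channels, sample_rate, duration_seconds) for a PCM .wav file."""
    with wave.open(path, "rb") as w:
        frames = w.getnframes()
        rate = w.getframerate()
        return w.getnchannels(), rate, frames / rate

# Example (hypothetical path inside the client container):
# print(wav_info("/work/wav/output.wav"))
```

A sample rate and duration that match the synthesized text are a quick sign the pipeline is working end to end.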