Quick Start Guide#

This is the starting point to try out Riva. Specifically, this Quick Start Guide enables you to deploy pretrained models on a local workstation and run a sample client.

Riva Speech AI Skills supports two architectures, Linux x86_64 and Linux ARM64. These are referred to as data center (x86_64) and embedded (ARM64) throughout this documentation.

For more information and questions, visit the NVIDIA Riva Developer Forum.

Note

Riva embedded (ARM64) is in public beta.

Prerequisites#

Before using Riva Speech AI, ensure you meet the following prerequisites:

Data Center#

  1. You have access and are logged into NVIDIA NGC. For step-by-step instructions, refer to the NGC Getting Started Guide.

  2. You have access to an NVIDIA Volta™, NVIDIA Turing™, or an NVIDIA Ampere architecture-based A100 GPU. For more information, refer to the Support Matrix.

  3. You have Docker installed with support for NVIDIA GPUs. For more information, refer to the Support Matrix.

  4. Obtain a free trial license to install NVIDIA Riva. For more information, refer to the NVIDIA AI Enterprise Trial.
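Prerequisites 3 and 2 can be sanity-checked from a terminal before proceeding. The snippet below is a minimal sketch, not part of the official setup; the CUDA image tag is an assumption and any CUDA base image you have access to will do:

```shell
# Pre-flight checks for the data center prerequisites (sketch; the
# CUDA image tag below is an assumption -- any CUDA base image works).
check_riva_dc_prereqs() {
    # Docker is installed and on the PATH
    command -v docker >/dev/null || { echo "docker not installed"; return 1; }

    # The NVIDIA runtime can see a supported GPU from inside a container
    docker run --rm --gpus all nvidia/cuda:12.2.0-base-ubuntu22.04 nvidia-smi \
        || { echo "GPU not visible to Docker; check the NVIDIA Container Toolkit"; return 1; }
}
```

If the `nvidia-smi` output lists your GPU, Docker is correctly configured for GPU access.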

Embedded#

  1. You have access and are logged into NVIDIA NGC. For step-by-step instructions, refer to the NGC Getting Started Guide.

  2. You have access to an NVIDIA Jetson Orin, NVIDIA Jetson AGX Xavier™, or NVIDIA Jetson Xavier NX. For more information, refer to the Support Matrix.

  3. You have installed NVIDIA JetPack™ version 6.0 on the Jetson platform. For more information, refer to the Support Matrix.

  4. You have ~15 GB free disk space on Jetson as required by the default containers and models. If you are deploying any Riva model intermediate representation (RMIR) models, the additional disk space required is ~15 GB plus the size of the RMIR models.

  5. You have enabled the following power modes on the Jetson platform. These modes activate all CPU cores and clock the CPU/GPU at high frequency to achieve the best performance.

    sudo nvpmodel -m 0    # Jetson AGX Orin (mode MAXN)
    sudo nvpmodel -m 0    # Jetson AGX Xavier (mode MAXN)
    sudo nvpmodel -m 2    # Jetson Xavier NX (mode MODE_15W_6CORE)
    
  6. You have set the default runtime to nvidia on the Jetson platform by adding the following line to the /etc/docker/daemon.json file. After editing the file, restart the Docker service with sudo systemctl restart docker.

    "default-runtime": "nvidia"
    
  7. Obtain a free trial license to install NVIDIA Riva. For more information, refer to the NVIDIA AI Enterprise Trial.
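Items 4 through 6 above can be verified on the Jetson with a short script. This is a sketch only; the ~15 GB threshold and file paths mirror the prerequisites above:

```shell
# Pre-flight checks for the embedded prerequisites (sketch).

# Item 4: ~15 GB free disk space for the default containers and models
avail_kb=$(df -k --output=avail / | tail -1 | tr -d ' ')
[ "$avail_kb" -ge $((15 * 1024 * 1024)) ] \
    || echo "WARNING: less than ~15 GB free on /"

# Item 5: query the active power mode (only meaningful on the Jetson itself)
if command -v nvpmodel >/dev/null; then
    sudo nvpmodel -q
fi

# Item 6: the Docker default runtime should be nvidia
grep -q '"default-runtime"[[:space:]]*:[[:space:]]*"nvidia"' \
    /etc/docker/daemon.json 2>/dev/null \
    && echo "default-runtime is nvidia" \
    || echo "WARNING: default-runtime not set in /etc/docker/daemon.json"
```

`nvpmodel -q` prints the currently active power mode, so you can confirm the `nvpmodel -m` setting took effect.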

Deployment Guide#

There are two push-button options for deploying Riva Speech AI, both of which use pretrained models available from the NGC catalog:

Local Docker: You can use the Quick Start scripts to set up a local workstation and deploy the Riva services using Docker. Continue with this guide to use the Quick Start scripts.

Kubernetes: The Riva Helm Chart is designed to automate the steps for push-button deployment to a Kubernetes cluster. For more information, refer to Kubernetes deployment. This option is not supported for embedded.
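For the Local Docker option, the end-to-end flow with the Quick Start scripts typically looks like the sketch below. The wrapper function is hypothetical, and the version string is a placeholder; take the actual resource version from the NGC catalog:

```shell
# Hypothetical wrapper around the Riva Quick Start scripts; the version
# argument (for example, "2.x.x") is a placeholder from the NGC catalog.
riva_local_deploy() {
    local version="$1"
    command -v ngc >/dev/null || { echo "ngc CLI not found"; return 1; }

    # Download the Quick Start scripts from NGC
    ngc registry resource download-version "nvidia/riva/riva_quickstart:${version}"
    cd "riva_quickstart_v${version}" || return 1

    bash riva_init.sh     # downloads and optimizes the pretrained models (one time)
    bash riva_start.sh    # starts the Riva Speech AI services
}
```

After `riva_start.sh` completes, the Riva services listen locally and are ready for a sample client.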

In addition to using pretrained models, Riva Speech AI can run fine-tuned custom models built with NVIDIA NeMo. Refer to the Model Development with NeMo section for details on this advanced option for creating a model repository with NVIDIA NeMo.

For detailed instructions on deploying and using specific Riva services, refer to the following quick start guides.