NVIDIA Riva

Speech Synthesis

  • How do I use Riva TTS APIs with out-of-the-box models?
    • NVIDIA Riva Overview
    • Basics: Generating Speech with Riva TTS APIs
    • Customizing Riva TTS audio output with SSML
  • TTS Deploy
    • Learning Objectives
    • Prerequisites
    • Riva ServiceMaker
    • Run riva-build
    • Run riva-deploy
    • Run Inference
      • Connect to the Riva server and run inference
  • Evaluate a TTS Pipeline
    • Download data
    • Synthesize text from ASR.
    • Calculate character error rate (CER).
    • Calculate word error rate (WER).
    • Conclusion
  • Text to Speech Finetuning using NeMo
    • Text to Speech
    • Let’s Dig in: TTS using NeMo
  • Calculate and Plot the Distribution of Phonemes in a TTS Dataset
    • Get arpabet file
    • Reference distribution
      • Load reference distribution
      • Sample corpus phonemes
      • Plot distribution together
      • Phoneme comparison to a reference distribution
      • Calculate key differences between reference distribution and total phonemes
    • Conclusion
  • Guidelines to Record a TTS Dataset at Home
    • Recommended Data
    • Hardware Requirements
    • Software Requirements
    • Recording Prerequisites
    • Adjusting the Microphone Level and Body Position Before Recording
    • Positioning Yourself Just Right, Too Far or Too Close
    • Recording the TTS Data

By NVIDIA
© Copyright 2024 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
Last updated on Apr 03, 2025.