NVIDIA Riva

Speech Synthesis


  • Evaluate a TTS Pipeline
    • Download data
    • Synthesize text from ASR
    • Calculate character error rate (CER)
    • Calculate word error rate (WER)
    • Conclusion
  • Text to Speech Finetuning using NeMo
    • Text to Speech
    • Let’s Dig in: TTS using NeMo
  • Calculate and Plot the Distribution of Phonemes in a TTS Dataset
    • Get arpabet file
  • Reference distribution:
    • Load reference distribution
    • Sample corpus Phonemes
    • Plot distribution together
    • Phoneme comparison to a reference distribution
    • Calculate key differences between reference distribution and total phonemes
  • Conclusion


By NVIDIA
© Copyright 2022 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
Last updated on Mar 10, 2023.