Contents
- Speech Recognition
- How do I use Riva ASR APIs with out-of-the-box models?
  - Creating Grammars for Speech Hints
- How to Customize Riva ASR Vocabulary and Pronunciation with Lexicon Mapping
- How to Deploy a Custom Language Model (n-gram) Trained with NeMo on Riva
- How to Deploy a Custom Acoustic Model (Citrinet) Trained with NeMo on Riva
- How to Deploy a Custom Acoustic Model (Conformer-CTC) Trained with NeMo on Riva
- How to Deploy a Conformer-CTC Acoustic Model with WFST Decoders
- How to Customize a Riva ASR Acoustic Model (Conformer-CTC) with Adapters
  - ASR with Adapters
  - What are Adapters?
  - Advantages and Limitations of Adapter Training
  - Preparing the Acoustic Encoder for Adapter Training
  - Preparing the Model and Dataset for Adaptation
  - Creating and Training an Adapter
  - Evaluating the Model
  - Export the Model to Riva
  - What’s Next?
- How to Fine-Tune a Riva ASR Acoustic Model with NVIDIA NeMo
  - How to Improve Recognition of Specific Words
  - Conclusion
- How to Synthesize a Noisy Dataset that can be used to Train a Noise Robust ASR Model
- How to Improve the Accuracy on Noisy Speech by Fine-Tuning the Acoustic Model (Conformer-CTC) in the Riva ASR Pipeline
- How To Train, Evaluate, and Fine-Tune an n-gram Language Model
- How do I Use Speaker Diarization with Riva ASR?
  - Requirements and Setup
- How do I boost specific words at runtime with word boosting?
- Support for Class Based n-gram Language Models in Riva (WFST Decoder)
  - WFST Decoding
- Speech Recognition - New Language Adaptation
- Cloud Deployment
- Speech Synthesis
- Translation
- How do I perform Language Translation using Riva NMT APIs with out-of-the-box models?
- How to deploy a NeMo-finetuned NMT model on Riva Speech Skills server?
- How to fine-tune a Riva NMT Bilingual model with NVIDIA NeMo
- How to perform synthetic data generation using Riva NMT Multilingual model with NVIDIA NeMo
- How to fine-tune a Riva NMT Multilingual model with NVIDIA NeMo
- ASR Overview
- Basics of Speech Recognition and Customization of Riva ASR
- Basics of Automatic Speech Recognition
- Evaluation of ASR Accuracy
- Riva ASR
- Riva Speech Recognition Pipeline
- Pipeline Configuration
- Streaming/Offline Recognition
- Language Models
- Flashlight Decoder Lexicon
- Flashlight Decoder Lexicon Free
- OpenSeq2Seq Decoder
- Beginning/End of Utterance Detection
- Neural-Based Voice Activity Detection
- Generating Multiple Transcript Hypotheses
- Impact of Chunk Size and Padding Size on Performance and Accuracy (Advanced)
- Sharing Acoustic and Feature Extractor Models Across Multiple ASR Pipelines (Advanced)
- Riva-build Optional Parameters
- Offline Speaker Diarization
- Performance
- ASR Advanced Details
- Models
- gRPC & Protocol Buffers
- Troubleshooting
- Support Matrix
  - Riva 2.16.0
  - Riva 2.15.0
  - Riva 2.14.0
  - Riva 2.13.0
  - Riva 2.12.0
  - Riva 2.11.0
  - Riva 2.10.0
  - Riva 2.9.0
  - Riva 2.8.0
  - Riva 2.7.0
  - Riva 2.6.0
  - Riva 2.5.0
  - Riva 2.4.0
  - Riva 2.3.0
  - Riva 2.2.0
  - Riva 2.1.0
  - Riva 2.0.0
  - Riva 1.10.0 Beta
  - Riva 1.9.0 Beta
  - Riva 1.8.0 Beta
  - Riva 1.7.0 Beta
  - Riva 1.6.0 Beta
  - Riva 1.5.0 Beta
  - Riva 1.4.0 Beta
- Upgrading
- Acknowledgements
- End User License Agreement
- Notice