Contents
- Speech Recognition
- How do I use Riva ASR APIs with out-of-the-box models?
- How to Customize Riva ASR Vocabulary and Pronunciation with Lexicon Mapping
- How to Deploy a Custom Language Model (n-gram) Trained with NeMo on Riva
- How to Deploy a Custom Acoustic Model (Citrinet) Trained with NeMo on Riva
- How to Deploy a Custom Acoustic Model (Conformer-CTC) Trained with NeMo on Riva
- How to Customize a Riva ASR Acoustic Model (Conformer-CTC) with Adapters
  - ASR with Adapters
  - What are Adapters?
  - Advantages and Limitations of Adapter Training
  - Preparing the Acoustic Encoder for Adapter Training
  - Preparing the Model and Dataset for Adaptation
  - Creating and Training an Adapter
  - Evaluating the Model
  - Export the Model to Riva
  - What’s Next?
- How to Fine-Tune a Riva ASR Acoustic Model with NVIDIA NeMo
  - How to Improve Recognition of Specific Words
  - Conclusion
- How to Improve the Accuracy on Noisy Speech by Fine-Tuning the Acoustic Model (Conformer-CTC) in the Riva ASR Pipeline
- How to Fine-Tune a Riva ASR Acoustic Model (Conformer-CTC) with TAO Toolkit
- How to Pretrain a Riva ASR Language Model (n-gram) with TAO Toolkit
- How do I boost specific words at runtime with word boosting?
- Speech Recognition - New Language Adaptation
- Cloud Deployment
- Speech Synthesis
- ASR Overview
- ASR Customization Best Practices
- Pipeline Configuration
- Supported Languages and Models
- Streaming/Offline Recognition
- Language Models
- Flashlight Decoder Lexicon
- Flashlight Decoder Lexicon-Free
- OpenSeq2Seq Decoder
- Beginning/End of Utterance Detection
- Neural-Based Voice Activity Detection
- Generating Multiple Transcript Hypotheses
- Impact of Chunk Size and Padding Size on Performance and Accuracy (Advanced)
- Sharing Acoustic and Feature Extractor Models Across Multiple ASR Pipelines (Advanced)
- Riva-build Optional Parameters
- Offline Speaker Diarization
- Performance
- ASR Advanced Details