Riva ASR NIM Overview

Riva ASR NIM APIs provide easy access to state-of-the-art automatic speech recognition (ASR) models for multiple languages. Riva ASR NIM models are built on the NVIDIA software platform, incorporating CUDA, TensorRT, and Triton to offer out-of-the-box GPU acceleration.
Riva ASR NIM takes an audio stream or audio buffer as input and returns one or more text transcripts, along with optional metadata. It supports both offline (batch) and streaming recognition modes, each with various customizations.
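
To make the difference between the two recognition modes concrete, here is a minimal sketch of how a client might prepare the same audio for each. It does not call any Riva API. The helper name `chunk_pcm`, the 16 kHz 16-bit mono PCM format, and the 100 ms chunk size are all assumptions chosen for illustration.

```python
from typing import Iterator


def chunk_pcm(audio: bytes, sample_rate_hz: int, chunk_ms: int,
              bytes_per_sample: int = 2) -> Iterator[bytes]:
    """Yield consecutive chunks of a mono PCM buffer, each about chunk_ms long.

    Hypothetical helper for illustration; not part of any Riva client library.
    """
    if sample_rate_hz <= 0 or chunk_ms <= 0:
        raise ValueError("sample_rate_hz and chunk_ms must be positive")
    samples_per_chunk = max(1, sample_rate_hz * chunk_ms // 1000)
    chunk_bytes = samples_per_chunk * bytes_per_sample
    for start in range(0, len(audio), chunk_bytes):
        # The final chunk may be shorter than chunk_bytes.
        yield audio[start:start + chunk_bytes]


# One second of 16 kHz, 16-bit mono silence stands in for real audio (assumed format).
buffer = bytes(16000 * 2)

# Offline/batch mode: the whole buffer would be sent in a single request.
offline_payload = buffer

# Streaming mode: the same audio would be sent as a sequence of small chunks.
streaming_chunks = list(chunk_pcm(buffer, sample_rate_hz=16000, chunk_ms=100))
```

In a real client, `offline_payload` would go into one recognition request, while `streaming_chunks` would be fed to a streaming request as the audio becomes available.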

Figure: Customization across the Riva ASR pipeline.

Enterprise-Ready Features

Riva ASR NIM comes with enterprise-ready features, such as a high-performance inference server, flexible integration, and enterprise-grade security.

  • State-of-the-art accuracy: Low word error rate (WER) across diverse audio sources and domains, with strong robustness to non-speech segments.

  • Open source and extensible: Built on NVIDIA NeMo, allowing for seamless integration and customization.

  • Pre-trained checkpoints: Ready-to-use models for inference or fine-tuning.

  • Permissive license: Model checkpoints are released under the CC-BY-4.0 license and can be used in any commercial application.
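
The WER metric mentioned in the list above is conventionally defined as the word-level edit distance between a reference transcript and the recognizer's hypothesis, divided by the number of reference words: WER = (S + D + I) / N, where S, D, and I are the substitutions, deletions, and insertions. The sketch below computes it with a standard dynamic-programming edit distance. The function name `word_error_rate` and the whitespace tokenization are assumptions for illustration, not the evaluation procedure used for Riva models.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER = (substitutions + deletions + insertions) / reference word count.

    Illustrative helper; splits on whitespace and does no text normalization.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: reference word missing from hypothesis
                curr[j - 1] + 1,     # insertion: extra word in hypothesis
                prev[j - 1] + cost,  # match (cost 0) or substitution (cost 1)
            )
        prev = curr
    return prev[-1] / len(ref)


# One dropped word out of six reference words gives a WER of 1/6.
wer = word_error_rate("the cat sat on the mat", "the cat sat on mat")
```

Lower values are better. Because insertions count as errors, WER can exceed 1.0 when the hypothesis is much longer than the reference.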

Try It Out

You can try Riva ASR NIM at NVIDIA NIM.