Riva ASR NIM Overview
Riva ASR NIM APIs provide easy access to state-of-the-art automatic speech recognition (ASR) models that transcribe spoken English with high accuracy. The underlying model is an XXL version of the FastConformer-CTC model. Riva ASR NIM models are built on the NVIDIA software platform, incorporating CUDA, TensorRT, and Triton to offer out-of-the-box GPU acceleration.
The model architecture is described in the FastConformer-CTC paper.
Enterprise-Ready Features
Riva ASR NIM comes with enterprise-ready features, such as a high-performance inference server, flexible integration, and enterprise-grade security.
State-of-the-art accuracy: Low word error rate (WER) across diverse audio sources and domains, with strong robustness to non-speech segments.
Open source and extensible: Built on NVIDIA NeMo, allowing for seamless integration and customization.
Pre-trained checkpoints: Ready-to-use model for inference or fine-tuning.
Permissive license: Released under the CC-BY-4.0 license, so model checkpoints can be used in any commercial application.
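The accuracy claim in the feature list is stated in terms of WER (word error rate). As a rough illustration of what that metric measures, the sketch below computes WER as the word-level edit distance (substitutions, deletions, and insertions) divided by the number of reference words. The function name word_error_rate and the simple lowercase-and-split normalization are assumptions made for this example; they are not part of Riva or NeMo, and production evaluations typically apply more careful text normalization.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER = (substitutions + deletions + insertions) / reference words.

    Hypothetical helper for illustration only; not a Riva or NeMo API.
    """
    # Assumed normalization: lowercase and split on whitespace.
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One word ("the") is missing from a six-word reference.
    wer = word_error_rate("the cat sat on the mat", "the cat sat on mat")
    print(f"{wer:.3f}")  # prints 0.167
```

A lower WER means fewer word-level errors relative to the reference transcript; a value of 0 means the transcript matches the reference exactly under the chosen normalization.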
Riva ASR NIM can be tried out at this link.