Automatic Speech Recognition (Latest)

Riva ASR NIM Overview

Riva ASR NIM APIs provide easy access to state-of-the-art automatic speech recognition (ASR) models that transcribe spoken English with high accuracy. The underlying model is an XXL version of the FastConformer-CTC model. Riva ASR NIM models are built on the NVIDIA software platform, incorporating CUDA, TensorRT, and Triton to offer out-of-the-box GPU acceleration.

The model architecture is described in the FastConformer-CTC paper.

Enterprise-Ready Features

Riva ASR NIM comes with enterprise-ready features, such as a high-performance inference server, flexible integration, and enterprise-grade security.

  • State-of-the-art accuracy: Superior WER performance across diverse audio sources and domains with strong robustness to non-speech segments.

  • Open source and extensible: Built on NVIDIA NeMo, allowing for seamless integration and customization.

  • Pre-trained checkpoints: Ready-to-use model for inference or fine-tuning.

  • Permissive license: Model checkpoints are released under the CC-BY-4.0 license and can be used in any commercial application.
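
The accuracy claim above is stated in terms of word error rate (WER): the number of word-level substitutions, deletions, and insertions needed to turn a transcript into the reference text, divided by the number of reference words. The following is a minimal sketch of that calculation, not part of Riva ASR NIM. The function name word_error_rate and the simple lowercase, whitespace-based tokenization are assumptions made for illustration; production evaluations usually apply more careful text normalization.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER = (substitutions + deletions + insertions) / reference word count.

    Illustrative helper only: tokenization is lowercase + whitespace split.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: reference word missing from hypothesis
                curr[j - 1] + 1,     # insertion: extra word in hypothesis
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substitution ("sat" -> "sit") and one deletion ("the") over 6 reference words.
    print(word_error_rate("the cat sat on the mat", "the cat sit on mat"))
```

A lower WER means a more accurate transcript; 0.0 means the hypothesis matches the reference word for word under this tokenization.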

Riva ASR NIM can be tried out at this link.

© Copyright 2024, NVIDIA Corporation. Last updated on Aug 6, 2024.