NVIDIA Riva is a GPU-accelerated SDK for building speech AI applications that are customized for your use case and deliver real-time performance.

Fully Customizable: Flexibility at every step, from modifying model architectures to fine-tuning models on your data and customizing pipelines, as well as the ability to deploy on any platform.
State-of-the-Art Models: Built on a decade of AI innovations by NVIDIA across hardware, model architectures, training techniques, inference optimizations, and deployment solutions.
Real-time Performance Optimizations: Continued optimizations across the entire stack, from models to software to hardware, have delivered a 12X gain over the previous generation.
Flexible and Scalable Deployments: Supports scaling to hundreds of thousands of concurrent users in the cloud, on premises, and at the edge.
Data Ownership and Privacy: Data is processed on premises or in your own cloud.

Riva

NVIDIA Riva Skills 2.8.0 is a toolkit for production-grade conversational AI inference.

The Riva Speech server exposes a simple API for performing speech recognition, speech synthesis, and a variety of natural language processing (NLP) inference tasks.

Highlights

State-of-the-art pretrained models available from NGC
Easy fine-tuning with NVIDIA TAO Toolkit
Fully custom trained models with NVIDIA NeMo
Helm-managed cloud deployment
Streaming and batch speech recognition
Streaming and batch speech synthesis
NLP models including question answering, entity recognition, and more
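
To make the difference between streaming and batch recognition concrete, the sketch below shows how a client might prepare the same audio for each mode: batch sends the whole clip in one request, while streaming sends it as a sequence of short chunks so results can come back while audio is still arriving. This is an illustration only and does not use the Riva client API. The function name chunk_audio, the 16 kHz 16-bit mono PCM format, and the 100 ms chunk size are all assumptions chosen for the example.

```python
from typing import Iterator

# Assumed audio format for this sketch: 16 kHz, 16-bit (2-byte) mono PCM.
SAMPLE_RATE_HZ = 16_000
BYTES_PER_SAMPLE = 2


def chunk_audio(pcm: bytes, chunk_ms: int = 100) -> Iterator[bytes]:
    """Yield fixed-duration chunks of raw PCM audio (hypothetical helper).

    A streaming client would send each chunk as its own request;
    the last chunk may be shorter than the others.
    """
    if chunk_ms <= 0:
        raise ValueError("chunk_ms must be positive")
    chunk_bytes = SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * chunk_ms // 1000
    for start in range(0, len(pcm), chunk_bytes):
        yield pcm[start:start + chunk_bytes]


# One second of silence stands in for real recorded audio.
audio = bytes(SAMPLE_RATE_HZ * BYTES_PER_SAMPLE)

# Batch mode: the whole clip is a single request payload.
batch_request = [audio]

# Streaming mode: the clip is split into 100 ms payloads.
streaming_requests = list(chunk_audio(audio, chunk_ms=100))

print(len(batch_request), len(streaming_requests))
```

Smaller chunks generally lower the delay before the first partial result at the cost of more requests; the right size depends on the deployment and is not specified here.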