NVIDIA Speech NIM Microservices
NVIDIA Speech NIM microservices are GPU-accelerated Docker containers that provide speech AI capabilities as building blocks for your applications. Each NIM microservice packages an optimized speech model, the NVIDIA inference stack (CUDA, TensorRT, NVIDIA Triton Inference Server), and a unified API into a single container that you deploy, scale, and interact with through standard gRPC and HTTP interfaces.
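As a rough sketch of what deploying one of these containers looks like, the command below starts an ASR NIM with Docker; the image name, tag, ports, and environment variable are illustrative assumptions, so check the NGC catalog and the deployment guide for the current values.

```shell
# Illustrative only: image name/tag, ports, and NGC_API_KEY usage are assumptions.
# Exposes the gRPC API on 50051 and the HTTP API on 9000.
docker run --rm --gpus all \
  -e NGC_API_KEY="$NGC_API_KEY" \
  -p 50051:50051 \
  -p 9000:9000 \
  nvcr.io/nim/nvidia/parakeet-ctc-1.1b-asr:latest
```

Once the container reports ready, clients connect to the published gRPC or HTTP port like any other networked service.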
About NVIDIA Speech NIM Microservices
Understand what Speech NIM microservices are, how they work, and what is new in the latest release.
Read a product overview of the NVIDIA Speech NIM microservices and the capabilities they provide.
Learn how the NVIDIA Speech NIM microservices work together to build speech applications.
Track the release notes for the NVIDIA Speech NIM microservices.
API reference, support matrix, performance benchmarks, and environment variables.
Get Started
Set up prerequisites, install Speech NIM microservices, and deploy them with Docker or Helm.
Prerequisites, installation, configuration, and tutorials to get up and running.
Deploy NVIDIA Speech NIM microservices using Docker or Helm charts.
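After deployment, a quick way to confirm the microservice is up is to probe its health endpoint. The port and path below are common NIM defaults but are assumptions here; confirm them against your deployment's configuration.

```shell
# Assumed defaults: HTTP on port 9000, readiness probe at /v1/health/ready.
curl -s http://localhost:9000/v1/health/ready
```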
Developer Guides
Explore the capabilities of each Speech NIM microservice, including model customization and integration options.
Convert speech to text with the NVIDIA ASR NIM microservice, which supports multiple models, languages, and inference modes.
Generate natural-sounding speech from text with the NVIDIA TTS NIM microservice, including support for multiple voices, languages, and voice cloning.
Translate text between 36 languages with the NVIDIA NMT NIM microservice, including translation exclusion and custom dictionaries.
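To give a feel for how an application talks to these services over gRPC, here is a minimal offline-transcription sketch using the `nvidia-riva-client` Python package. The server address, audio file, and language code are illustrative assumptions, and the sketch requires a running ASR NIM to execute.

```python
# Sketch, assuming the nvidia-riva-client package (pip install nvidia-riva-client)
# and an ASR NIM serving gRPC on localhost:50051.
import riva.client

auth = riva.client.Auth(uri="localhost:50051")  # gRPC endpoint of the deployed NIM
asr = riva.client.ASRService(auth)

config = riva.client.RecognitionConfig(
    language_code="en-US",
    enable_automatic_punctuation=True,
)

with open("sample.wav", "rb") as f:  # hypothetical input file
    audio = f.read()

# Send the whole file in one request (offline mode); streaming mode is also available.
response = asr.offline_recognize(audio, config)
for result in response.results:
    print(result.alternatives[0].transcript)
```

The TTS and NMT microservices follow the same pattern: authenticate once, construct the service stub, then issue synthesis or translation requests over the same gRPC channel.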
References
Look up API specifications, supported configurations, performance data, and troubleshooting guidance.
gRPC and real-time API references for the ASR, NMT, and TTS NIM microservices.
Latency and throughput benchmarks for ASR, NMT, and TTS NIM microservices across supported GPUs.
Common issues and solutions shared across Speech NIM microservices (ASR, TTS, NMT).