Speech AI Models

NVIDIA NeMo Framework supports training and customizing Speech AI models, which are designed to enable voice-based interfaces for conversational AI applications. For more information on each speech subdomain, refer to the following sections of the NeMo Developer Documentation:

  • Automatic Speech Recognition

  • Speech Classification

  • Speaker Recognition

  • Speaker Diarization

  • Text To Speech

NeMo Framework also includes a large set of Speech AI tools (see ../NeMoToolKit/tools/intro) for dataset preparation, model evaluation, and text normalization.

Last updated on Jun 19, 2024.