Overview

NVIDIA TAO Release 4.0.1

Speech synthesis, or text-to-speech (TTS), is the artificial production of human speech. Its main use is to convert text into spoken audio automatically. TAO Toolkit supports a two-stage pipeline for TTS:

  1. A spectrogram model to generate a Mel spectrogram from text

  2. A vocoder model to generate audio from a Mel spectrogram
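
To make the data flow between the two stages concrete, here is a minimal sketch in plain Python. It is not TAO Toolkit code: `spectrogram_model`, `vocoder`, `synthesize`, `N_MELS`, and `HOP_LENGTH` are hypothetical names and values chosen for illustration, and both stages are placeholders that emit zeros instead of running a trained model.

```python
from typing import List

# Assumed values for illustration only; they do not come from TAO Toolkit.
N_MELS = 80        # number of Mel bands per spectrogram frame
HOP_LENGTH = 256   # audio samples represented by one spectrogram frame

Mel = List[List[float]]  # frames x Mel bands


def spectrogram_model(text: str) -> Mel:
    """Stage 1 placeholder: map text to a Mel spectrogram.

    A trained model would predict the frame values and how many frames
    each symbol lasts; this stand-in emits one all-zero frame per character.
    """
    return [[0.0] * N_MELS for _ in text]


def vocoder(mel: Mel) -> List[float]:
    """Stage 2 placeholder: map a Mel spectrogram to audio samples.

    A trained vocoder would generate a waveform; this stand-in emits
    silence of the matching length (HOP_LENGTH samples per frame).
    """
    return [0.0] * (len(mel) * HOP_LENGTH)


def synthesize(text: str) -> List[float]:
    """Run the two stages in sequence: text -> Mel spectrogram -> audio."""
    mel = spectrogram_model(text)
    return vocoder(mel)


audio = synthesize("hello")
print(len(audio))  # 5 frames * 256 samples = 1280
```

The point of the sketch is the interface: the only thing the vocoder needs from the first stage is the Mel spectrogram, so the two models can be trained and replaced independently.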

© Copyright 2023, NVIDIA. Last updated on Aug 2, 2023.