Customization
This section provides some examples of how to customize Riva TTS through the following SSML tags:
The prosody tag, whose rate, pitch, and volume attributes control the speaking rate, pitch, and volume of the generated audio.
The phoneme tag, which allows us to control the pronunciation of the generated audio.
The sub tag, which allows us to replace the pronunciation of the specified word or phrase with a different word or phrase.
Customizing rate, pitch, and volume with the prosody tag
python3 python-clients/scripts/tts/talk.py --server 0.0.0.0:50051 --text "<speak><prosody pitch='2.5'>Today is a sunny day</prosody>. <prosody rate='high' volume='+1dB'>But it might rain tomorrow.</prosody></speak>" --language-code en-US
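When the text or attribute values come from a program rather than being typed by hand, it can help to build the SSML string with an XML library so that escaping and tag nesting are handled automatically. The following is a minimal sketch using only the Python standard library. The helper name prosody_ssml is hypothetical and not part of Riva or its Python clients, and the sketch does not check whether Riva accepts a given attribute value.

```python
import xml.etree.ElementTree as ET


def prosody_ssml(segments):
    """Build a <speak> document from (text, attributes) pairs.

    Hypothetical helper for illustration only. Each pair becomes one
    <prosody> element; attributes is a dict such as {"pitch": "2.5"}.
    """
    speak = ET.Element("speak")
    for index, (text, attributes) in enumerate(segments):
        element = ET.SubElement(speak, "prosody", attributes)
        element.text = text
        # Separate consecutive segments with a single space.
        if index < len(segments) - 1:
            element.tail = " "
    # encoding="unicode" returns a str without an XML declaration.
    return ET.tostring(speak, encoding="unicode")


ssml = prosody_ssml([
    ("Today is a sunny day.", {"pitch": "2.5"}),
    ("But it might rain tomorrow.", {"rate": "high", "volume": "+1dB"}),
])
print(ssml)
```

The resulting string can be passed as the --text argument of talk.py. Characters such as & or < in the text are escaped by the library, which keeps the SSML well formed.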
Customizing pronunciation with the phoneme tag
python3 python-clients/scripts/tts/talk.py --server 0.0.0.0:50051 --text "<speak>You say <phoneme alphabet='ipa' ph='təˈmeɪˌtoʊ'>tomato</phoneme>, I say <phoneme alphabet='ipa' ph='təˈmɑˌtoʊ'>tomato</phoneme>.</speak>" --language-code en-US
Replacing pronunciation with the sub tag
python3 python-clients/scripts/tts/talk.py --server 0.0.0.0:50051 --text "<speak><sub alias='World Wide Web'>WWW</sub> is known as the web.</speak>" --language-code en-US
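To make the effect of the sub tag concrete, the sketch below parses an SSML string and returns the text that would be read aloud once every sub element is replaced by its alias. This only illustrates the substitution described above. It is not how Riva processes SSML internally, and the function name spoken_text is hypothetical.

```python
import xml.etree.ElementTree as ET


def spoken_text(ssml):
    """Return the SSML text with each <sub> replaced by its alias attribute."""

    def render(element):
        if element.tag == "sub":
            # The alias stands in for whatever the element contains.
            body = element.get("alias", "")
        else:
            body = (element.text or "") + "".join(render(child) for child in element)
        # The tail is the text that follows the element's closing tag.
        return body + (element.tail or "")

    return render(ET.fromstring(ssml))


print(spoken_text(
    "<speak><sub alias='World Wide Web'>WWW</sub> is known as the web.</speak>"
))
```

For the command above, the written form WWW is replaced by the phrase World Wide Web, while the rest of the sentence is left unchanged.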
The synthesized audio file output.wav will contain the resulting speech with SSML attributes applied.