riva/proto/jarvis_tts.proto

service JarvisTTS

rpc SynthesizeSpeechResponse Synthesize(SynthesizeSpeechRequest): Used to request text-to-speech from the service. Submit a request containing the desired text and configuration, and receive audio bytes in the requested format.

rpc stream SynthesizeSpeechResponse SynthesizeOnline(SynthesizeSpeechRequest): Used to request text-to-speech audio that is streamed back as it becomes available. Submit a SynthesizeSpeechRequest with the desired text and configuration, and receive a stream of audio bytes in the requested format.
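The difference between the two RPCs can be sketched from the client side: Synthesize returns one response, while SynthesizeOnline yields several responses whose audio chunks the caller joins. The sketch below is illustrative only. It uses a hypothetical in-memory `FakeTTSStub` and plain `SimpleNamespace` objects in place of generated gRPC classes, and the helper names and request values are placeholders rather than anything defined by this proto file. With a real server, the stub would be whatever class the gRPC code generator produces for this service.

```python
from types import SimpleNamespace


def synthesize_unary(stub, request) -> bytes:
    """Call Synthesize and return the audio from the single response."""
    return stub.Synthesize(request).audio


def synthesize_streaming(stub, request) -> bytes:
    """Call SynthesizeOnline and join audio chunks in arrival order."""
    chunks = []
    for response in stub.SynthesizeOnline(request):
        # Each streamed response carries one piece of the audio.
        chunks.append(response.audio)
    return b"".join(chunks)


class FakeTTSStub:
    """Hypothetical stand-in for a generated JarvisTTS client stub.

    It "synthesizes" by echoing the text as bytes, which is enough to
    show the call shapes without a running server.
    """

    def Synthesize(self, request):
        return SimpleNamespace(audio=request.text.encode("utf-8"))

    def SynthesizeOnline(self, request):
        data = request.text.encode("utf-8")
        for start in range(0, len(data), 4):
            yield SimpleNamespace(audio=data[start:start + 4])


# Placeholder request mirroring the SynthesizeSpeechRequest fields.
request = SimpleNamespace(
    text="hello world",
    language_code="en-US",   # placeholder value
    encoding=None,           # would be an nvidia.jarvis.AudioEncoding value
    sample_rate_hz=22050,    # placeholder value
    voice_name="",           # placeholder value
)

stub = FakeTTSStub()
unary_audio = synthesize_unary(stub, request)
streamed_audio = synthesize_streaming(stub, request)
```

A streaming caller does not have to buffer everything as done here; each chunk could instead be handed to an audio device as soon as it arrives, which is the main reason to prefer SynthesizeOnline for interactive use.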

message SynthesizeSpeechRequest

string text

string language_code

nvidia.jarvis.AudioEncoding encoding: Audio encoding parameters.

int32 sample_rate_hz

string voice_name: Voice parameters.

message SynthesizeSpeechResponse

bytes audio

SynthesizeSpeechResponseMetadata meta
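The audio field carries raw bytes in whatever encoding the request asked for. As a sketch only, assuming the response holds headerless 16-bit mono PCM samples (an assumption, since this section does not list the AudioEncoding values), the bytes could be wrapped in a WAV container with Python's standard `wave` module. `pcm_to_wav_bytes` is a hypothetical helper name.

```python
import io
import wave


def pcm_to_wav_bytes(audio: bytes, sample_rate_hz: int) -> bytes:
    """Wrap raw PCM samples in a WAV container.

    Assumes 16-bit, single-channel PCM; other encodings would need
    different handling.
    """
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(1)            # mono (assumed)
        wav_file.setsampwidth(2)            # 2 bytes per sample = 16-bit (assumed)
        wav_file.setframerate(sample_rate_hz)
        wav_file.writeframes(audio)
    return buffer.getvalue()


# One second of silence at a placeholder sample rate.
silence = b"\x00\x00" * 22050
wav_bytes = pcm_to_wav_bytes(silence, sample_rate_hz=22050)
```

The sample rate passed here should match the sample_rate_hz sent in the request; otherwise playback speed and pitch will be wrong.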

message SynthesizeSpeechResponseMetadata

string text: Currently an experimental API addition that returns the input text after preprocessing has completed, as well as the predicted duration for each token. Note: this message is subject to future breaking changes and potential removal.

string processed_text

float predicted_durations(repeated)
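Because this metadata message is described as experimental and subject to removal, client code may want to treat it as optional. The sketch below shows one defensive way to read it. It is duck-typed against plain objects rather than generated protobuf classes, and `read_metadata` is a hypothetical helper, not part of the API.

```python
from types import SimpleNamespace


def read_metadata(response):
    """Return the metadata fields as a dict, or None if meta is absent.

    Missing fields fall back to empty defaults, so callers keep working
    if the experimental message changes shape.
    """
    meta = getattr(response, "meta", None)
    if meta is None:
        return None
    return {
        "text": getattr(meta, "text", ""),
        "processed_text": getattr(meta, "processed_text", ""),
        "predicted_durations": list(getattr(meta, "predicted_durations", [])),
    }


# Stand-in response objects with placeholder values.
with_meta = SimpleNamespace(
    audio=b"",
    meta=SimpleNamespace(
        text="Hi",
        processed_text="hi",
        predicted_durations=[0.1, 0.2],
    ),
)
without_meta = SimpleNamespace(audio=b"")

info = read_metadata(with_meta)
missing = read_metadata(without_meta)
```

With generated protobuf classes, reading an unset message field returns an empty default instance rather than raising, so a real client would more likely see empty strings and an empty list than a missing attribute.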