riva_nmt.proto

service RivaTranslation

The RivaTranslation service provides RPCs to translate between languages.

rpc TranslateTextResponse TranslateText(TranslateTextRequest): Translate text to text, from a source to a target language. Currently the source and target language fields are required, along with the model name. Multiple texts may be passed per request, up to the batch size for the model, which is set at translation pipeline creation time.

rpc AvailableLanguageResponse ListSupportedLanguagePairs(AvailableLanguageRequest): Lists the available language pairs and model names to be used for TranslateText.

rpc stream StreamingTranslateSpeechToTextResponse StreamingTranslateSpeechToText(stream StreamingTranslateSpeechToTextRequest): Streaming speech-to-text translation API.

rpc stream StreamingTranslateSpeechToSpeechResponse StreamingTranslateSpeechToSpeech(stream StreamingTranslateSpeechToSpeechRequest)
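TranslateText accepts several texts per request, but only up to the model's batch size. A client with more texts than that would need to split them across requests. The following is a minimal sketch of that splitting step only; `chunk_texts` and the batch size of 2 are illustrative assumptions, not part of the Riva API, and the real limit is whatever was set when the translation pipeline was created.

```python
from typing import Iterator, List


def chunk_texts(texts: List[str], batch_size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of `texts`, each holding at most `batch_size` items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(texts), batch_size):
        yield texts[start:start + batch_size]


# Hypothetical batch size; the real limit is fixed at pipeline creation time.
batches = list(chunk_texts(["Hello", "Good morning", "Thank you"], batch_size=2))
```

Each resulting list could then be sent as the texts field of its own TranslateTextRequest.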

message AvailableLanguageRequest

Request for a map of each model name to its source and target language pairs. A specific model name can be given to retrieve only that model's language pairs.

string model: If empty, returns all available languages.

message AvailableLanguageResponse

Language pairs are the sets of source (src) to target (tgt) languages available per model. languages contains every model_name -> LanguagePair mapping.

AvailableLanguageResponse.LanguagesEntry languages (repeated)

message AvailableLanguageResponse.LanguagePair

string src_lang (repeated)

string tgt_lang (repeated)

message AvailableLanguageResponse.LanguagesEntry

string key

AvailableLanguageResponse.LanguagePair value
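To make the shape of the languages map concrete, the sketch below models it as a plain dictionary from model name to a LanguagePair stand-in and looks up which models cover a given direction. The `LanguagePair` dataclass, the `models_for` helper, and the model names are hypothetical. The sketch also assumes that a model supports every combination of its listed source and target languages, which this page does not state.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class LanguagePair:
    """Stand-in for AvailableLanguageResponse.LanguagePair (not the generated class)."""
    src_lang: List[str] = field(default_factory=list)
    tgt_lang: List[str] = field(default_factory=list)


def models_for(languages: Dict[str, LanguagePair], src: str, tgt: str) -> List[str]:
    """Return the sorted names of models listing `src` as a source and `tgt` as a target."""
    return sorted(
        name
        for name, pair in languages.items()
        if src in pair.src_lang and tgt in pair.tgt_lang
    )


# Illustrative data only; these model names and codes are made up.
languages = {
    "example_en_de": LanguagePair(src_lang=["en"], tgt_lang=["de"]),
    "example_multi": LanguagePair(src_lang=["en", "fr"], tgt_lang=["de", "es"]),
}
candidates = models_for(languages, "en", "de")
```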

message StreamingTranslateSpeechToSpeechConfig

Configuration for speech-to-speech (S2S) translation. Reuses existing protos from other services.

nvidia.riva.asr.StreamingRecognitionConfig asr_config: from riva_asr.proto

SynthesizeSpeechConfig tts_config

TranslationConfig translation_config

message StreamingTranslateSpeechToSpeechRequest

Streaming speech-to-speech translation request, used to configure the entire pipeline for speech translation. This can be backed by a cascade of ASR, NMT, and TTS models or by an end-to-end model.

StreamingTranslateSpeechToSpeechConfig config

bytes audio_content
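The request message carries both a config field and an audio_content field. The sketch below shows one plausible way to order a request stream: a single config-only message first, then one message per audio chunk. That ordering is an assumption borrowed from common streaming-recognition conventions and is not stated on this page. `S2SRequest` and `request_stream` are hypothetical stand-ins, and the config is a plain dict rather than the generated StreamingTranslateSpeechToSpeechConfig class.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, Optional


@dataclass
class S2SRequest:
    """Stand-in for StreamingTranslateSpeechToSpeechRequest (not the generated class)."""
    config: Optional[dict] = None  # placeholder for StreamingTranslateSpeechToSpeechConfig
    audio_content: Optional[bytes] = None


def request_stream(config: dict, audio_chunks: Iterable[bytes]) -> Iterator[S2SRequest]:
    """Yield one config-only request, then one audio-only request per chunk."""
    # Assumed ordering: configuration first, audio afterwards.
    yield S2SRequest(config=config)
    for chunk in audio_chunks:
        yield S2SRequest(audio_content=chunk)


requests = list(request_stream({"translation_config": {}}, [b"\x00\x01", b"\x02\x03"]))
```

Writing the stream as a generator suits a client-streaming RPC, since audio can be read and forwarded chunk by chunk instead of being buffered in full.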

message StreamingTranslateSpeechToSpeechResponse

nvidia.riva.tts.SynthesizeSpeechResponse speech: from riva_tts.proto. Contains speech responses; the last response sends an empty buffer to mark the end of the stream.

message StreamingTranslateSpeechToTextConfig

nvidia.riva.asr.StreamingRecognitionConfig asr_config: existing ASR config

TranslationConfig translation_config

message StreamingTranslateSpeechToTextRequest

StreamingTranslateSpeechToTextConfig config

bytes audio_content

message StreamingTranslateSpeechToTextResponse

nvidia.riva.asr.StreamingRecognitionResult results (repeated): from riva_asr.proto

message SynthesizeSpeechConfig

nvidia.riva.AudioEncoding encoding

int32 sample_rate_hz

string voice_name

string language_code

message TranslateTextRequest

Request for synchronous translation of each text in texts. Available languages can be queried using the ListSupportedLanguagePairs RPC. Source and target languages must be specified and are currently two-character ISO codes; this will likely change to BCP-47, in line with other Riva services, for GA.

string texts (repeated)

string model

string source_language

string target_language
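Because the model name and both language fields are currently required, a client may want to check them before sending a request. The sketch below is one way to do that. `check_translate_args` is a hypothetical helper, not part of any Riva client. Its two-letter test reflects only the current convention described above and would need to be relaxed if the service moves to BCP-47 codes.

```python
from typing import List


def check_translate_args(
    model: str, source_language: str, target_language: str
) -> List[str]:
    """Return a list of human-readable problems; an empty list means the fields look usable."""
    problems: List[str] = []
    if not model:
        problems.append("model is empty")
    for name, code in (
        ("source_language", source_language),
        ("target_language", target_language),
    ):
        if not code:
            problems.append(f"{name} is empty")
        elif len(code) != 2 or not code.isalpha():
            # Assumption: only the current two-character codes are accepted.
            problems.append(f"{name} {code!r} is not a two-letter code")
    return problems


ok = check_translate_args("example_model", "en", "de")
bad = check_translate_args("", "en-US", "")
```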

message TranslateTextResponse

Translations are returned as text:language pairs. These correspond 1:1 to the ‘texts’ passed in the request.

Translation translations (repeated)
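Since translations map 1:1 onto the request's texts, a client can line the two up to recover which output belongs to which input. The sketch below assumes that the translations arrive in the same order as the texts, which the 1:1 description suggests but does not spell out. `Translation` here is a stand-in dataclass and `pair_with_sources` is a hypothetical helper.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Translation:
    """Stand-in for the Translation message (not the generated class)."""
    text: str
    language: str


def pair_with_sources(
    texts: List[str], translations: List[Translation]
) -> List[Tuple[str, str, str]]:
    """Return (source text, translated text, target language) tuples in request order."""
    if len(texts) != len(translations):
        raise ValueError("expected exactly one translation per input text")
    return [(src, t.text, t.language) for src, t in zip(texts, translations)]


# Illustrative values only.
pairs = pair_with_sources(
    ["Hello", "Thank you"],
    [Translation("Hallo", "de"), Translation("Danke", "de")],
)
```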

message Translation

Contains a single translation, collected into the TranslateTextResponse. Includes the target language code, since multilingual models have multiple possible target languages.

string text

string language

message TranslationConfig

string source_language_code: BCP-47 language code, for example “en-US”

string target_language_code

string model_name

riva/proto/riva_nmt.proto
