riva/proto/riva_nmt.proto

service RivaTranslation

The RivaTranslation service provides RPCs to translate between languages.

rpc TranslateTextResponse TranslateText(TranslateTextRequest)

Translates text to text, from a source language to a target language. The source and target language fields are currently required, along with the model name. Multiple texts may be passed per request, up to the batch size of the model, which is set when the translation pipeline is created.
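Because one request can carry only as many texts as the model's batch size allows, a client with a longer list has to split it across requests. The following is a minimal Python sketch of that splitting, using plain dictionaries as stand-ins for TranslateTextRequest. The helper name build_requests, the max_batch_size value, and the model name are hypothetical and are not defined by this proto.

```python
from typing import Dict, Iterator, List


def build_requests(
    texts: List[str],
    model: str,
    source_language: str,
    target_language: str,
    max_batch_size: int,
) -> Iterator[Dict[str, object]]:
    """Yield dicts shaped like TranslateTextRequest, one per batch.

    max_batch_size is an assumption: the real limit is fixed when the
    translation pipeline is created and is not exposed by this proto.
    """
    if max_batch_size < 1:
        raise ValueError("max_batch_size must be at least 1")
    for start in range(0, len(texts), max_batch_size):
        yield {
            "texts": texts[start:start + max_batch_size],
            "model": model,
            "source_language": source_language,
            "target_language": target_language,
        }


# Three texts with a hypothetical batch size of 2 give two requests.
requests = list(
    build_requests(["Hello", "Goodbye", "Thanks"], "example-model", "en", "de", 2)
)
```

Each yielded dictionary would then be converted to a real request message and sent with its own TranslateText call.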

rpc AvailableLanguageResponse ListSupportedLanguagePairs(AvailableLanguageRequest)

Lists the available language pairs and model names that can be used with TranslateText.

rpc stream StreamingTranslateSpeechToTextResponse StreamingTranslateSpeechToText(stream StreamingTranslateSpeechToTextRequest)

Streaming speech-to-text translation API.

rpc stream StreamingTranslateSpeechToSpeechResponse StreamingTranslateSpeechToSpeech(stream StreamingTranslateSpeechToSpeechRequest)
message AvailableLanguageRequest

Returns a map from each model name to its source and target language pairs. A specific model name can be given to retrieve only that model's language pairs.

string model

If empty, returns all available languages.

message AvailableLanguageResponse

Language pairs are the sets of source (src) to target (tgt) languages available per model. The languages field maps each model name to its language pair.

AvailableLanguageResponse.LanguagesEntry languages (repeated)
message AvailableLanguageResponse.LanguagePair
string src_lang (repeated)
string tgt_lang (repeated)
message AvailableLanguageResponse.LanguagesEntry
string key
AvailableLanguageResponse.LanguagePair value
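As an illustration of how the languages map might be consumed, the sketch below finds the models that list a given source and target language. It uses a small dataclass as a stand-in for AvailableLanguageResponse.LanguagePair rather than the generated class. The helper models_supporting, the model names, and the language codes are hypothetical, and the sketch assumes that any entry in src_lang can be paired with any entry in tgt_lang for the same model.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class LanguagePair:
    """Stand-in for AvailableLanguageResponse.LanguagePair."""
    src_lang: List[str] = field(default_factory=list)
    tgt_lang: List[str] = field(default_factory=list)


def models_supporting(
    languages: Dict[str, LanguagePair], src: str, tgt: str
) -> List[str]:
    """Return the names of models that list both src and tgt."""
    return sorted(
        name
        for name, pair in languages.items()
        if src in pair.src_lang and tgt in pair.tgt_lang
    )


# Hypothetical response content; model names and codes are made up.
languages = {
    "model_a": LanguagePair(src_lang=["en"], tgt_lang=["de", "es"]),
    "model_b": LanguagePair(src_lang=["de", "es"], tgt_lang=["en"]),
}
print(models_supporting(languages, "en", "es"))  # prints ['model_a']
```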
message StreamingTranslateSpeechToSpeechConfig

Configuration for speech-to-speech (S2S) translation. Reuses existing protos from other services.

nvidia.riva.asr.StreamingRecognitionConfig asr_config

From riva_asr.proto

SynthesizeSpeechConfig tts_config
TranslationConfig translation_config
message StreamingTranslateSpeechToSpeechRequest

Streaming speech-to-speech translation request, used to configure the entire pipeline for speech translation. This can be backed by a cascade of ASR, NMT, and TTS models or by an end-to-end model.

StreamingTranslateSpeechToSpeechConfig config
bytes audio_content
nvidia.riva.RequestId id

The ID to be associated with the request. If provided, this will be returned in the corresponding response.

message StreamingTranslateSpeechToSpeechResponse
nvidia.riva.tts.SynthesizeSpeechResponse speech

Contains speech responses. The last response sends an empty buffer to mark the end of the stream.

From riva_tts.proto

nvidia.riva.RequestId id

The ID associated with the request.
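The end-of-stream convention for speech-to-speech responses can be handled on the client roughly as follows. This is a sketch only: SpeechResponse is a stand-in dataclass, not the generated message, and its audio attribute assumes that the audio bytes are carried in the audio field of the SynthesizeSpeechResponse defined in riva_tts.proto, which this page does not show.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class SpeechResponse:
    """Stand-in for StreamingTranslateSpeechToSpeechResponse.

    `audio` flattens speech.audio; that field name is an assumption
    taken from riva_tts.proto, not from this file.
    """
    audio: bytes


def collect_audio(responses: Iterable[SpeechResponse]) -> bytes:
    """Concatenate audio buffers until the empty end-of-stream buffer."""
    chunks = []
    for response in responses:
        if not response.audio:
            # An empty buffer marks the end of the stream.
            break
        chunks.append(response.audio)
    return b"".join(chunks)


# Simulated stream: two audio buffers followed by the empty marker.
stream = [SpeechResponse(b"\x01\x02"), SpeechResponse(b"\x03"), SpeechResponse(b"")]
audio = collect_audio(stream)
```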

message StreamingTranslateSpeechToTextConfig
nvidia.riva.asr.StreamingRecognitionConfig asr_config

Existing ASR config.

TranslationConfig translation_config
message StreamingTranslateSpeechToTextRequest
StreamingTranslateSpeechToTextConfig config
bytes audio_content
nvidia.riva.RequestId id

The ID to be associated with the request. If provided, this will be returned in the corresponding response.
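A streaming request sequence might be produced as sketched below. This assumes the convention used by the Riva ASR streaming API, in which the first message carries only config and every later message carries only audio_content. This page does not state that ordering, so treat it as an assumption. Plain dictionaries stand in for StreamingTranslateSpeechToTextRequest, and the helper request_stream, the chunk size, and the config values are hypothetical.

```python
from typing import Dict, Iterator


def request_stream(
    config: Dict[str, object], audio: bytes, chunk_size: int
) -> Iterator[Dict[str, object]]:
    """Yield request-shaped dicts: config first, then audio chunks.

    The config-first ordering is an assumption borrowed from the ASR
    streaming API; it is not stated in this proto documentation.
    """
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    yield {"config": config}
    for start in range(0, len(audio), chunk_size):
        yield {"audio_content": audio[start:start + chunk_size]}


# Hypothetical configuration values, for illustration only.
config = {
    "asr_config": {},
    "translation_config": {
        "source_language_code": "en-US",
        "target_language_code": "de-DE",
        "model_name": "example-model",
    },
}
messages = list(request_stream(config, b"\x00" * 10, 4))
```

With 10 bytes of audio and a chunk size of 4, the sequence holds one config message followed by three audio messages of 4, 4, and 2 bytes.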

message StreamingTranslateSpeechToTextResponse
nvidia.riva.asr.StreamingRecognitionResult results (repeated)

From riva_asr.proto

nvidia.riva.RequestId id

The ID associated with the request.

message SynthesizeSpeechConfig
nvidia.riva.AudioEncoding encoding
int32 sample_rate_hz
string voice_name
string language_code
message TranslateTextRequest

Request for synchronous translation of each text in texts. Available languages can be queried using the ListSupportedLanguagePairs RPC. Source and target languages must be specified and are currently two-character ISO codes. This will likely change to BCP-47, in line with other Riva services, for GA.

string texts (repeated)
string model
string source_language
string target_language
nvidia.riva.RequestId id

The ID to be associated with the request. If provided, this will be returned in the corresponding response.
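A client may want to check the required fields before sending a TranslateTextRequest. The sketch below is one possible pre-flight check and not server behavior: the helper validate_translate_request is hypothetical, and the two-letter test is only a rough approximation of a two-character ISO code.

```python
from typing import List


def validate_translate_request(
    model: str, source_language: str, target_language: str
) -> List[str]:
    """Return a list of problems; an empty list means the fields look usable."""
    problems = []
    if not model:
        problems.append("model is required")
    for name, code in (
        ("source_language", source_language),
        ("target_language", target_language),
    ):
        # Approximates "two-character ISO code"; it does not consult
        # the actual ISO registry.
        if not (len(code) == 2 and code.isalpha()):
            problems.append(f"{name} should be a two-letter code, got {code!r}")
    return problems


ok = validate_translate_request("example-model", "en", "de")
bad = validate_translate_request("", "en-US", "de")
```

Here ok is empty, while bad reports the missing model and the BCP-47 style source code.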

message TranslateTextResponse

Translations are returned as text:language pairs. These correspond 1:1 to the ‘texts’ passed in the request.

Translation translations (repeated)
nvidia.riva.RequestId id

The ID associated with the request.
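The 1:1 correspondence lets a client line up each input text with its translation. The sketch below does this with a stand-in Translation dataclass rather than the generated message. It assumes the translations come back in the same order as the request texts, which this page implies but does not state outright. The helper pair_with_sources and the sample strings are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Translation:
    """Stand-in for the Translation message."""
    text: str
    language: str


def pair_with_sources(
    texts: List[str], translations: List[Translation]
) -> List[Tuple[str, str, str]]:
    """Return (source text, translated text, target language) triples."""
    if len(texts) != len(translations):
        raise ValueError("expected one translation per input text")
    # Assumes the response preserves the order of the request texts.
    return [(src, t.text, t.language) for src, t in zip(texts, translations)]


pairs = pair_with_sources(
    ["Hello", "Thanks"],
    [Translation("Hallo", "de"), Translation("Danke", "de")],
)
```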

message Translation

Contains a single translation, collected into the TranslateTextResponse. Includes the target language code, since multilingual models have multiple possible target languages.

string text
string language
message TranslationConfig
string source_language_code

BCP-47 language code, for example “en-US”.

string target_language_code
string model_name