About NVIDIA NMT NIM Microservice#

The NVIDIA Neural Machine Translation (NMT) NIM microservice translates text between languages. It packages the Riva Translate 1.6b model, a Transformer-based encoder-decoder with 24 encoder and 24 decoder layers, into a self-contained container that handles model download, optimization, and serving.

Note

The NIM container image is named riva-translate-1_6b. The underlying NMT model that ships inside the container appears as megatronnmt_any_any_1b when you run --list-models against the deployed service. Both names refer to the same Megatron any-to-any translation model.

The Riva Translate 1.6b model supports any-to-any translation across 36 languages, including English, Chinese, Japanese, Korean, Arabic, Hindi, and major European languages. For the full language list, refer to the NMT support matrix.

Key Capabilities#

Translation Exclusion#

Protect terms from translation using <dnt> (do not translate) tags. Brand names, product names, and technical terms enclosed in <dnt>...</dnt> pass through unchanged while the surrounding text is translated normally.

--text "<dnt>NVIDIA NIM</dnt> provides optimized inference."
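When the protected terms come from a glossary rather than being typed by hand, the tags can be added programmatically before the text is sent for translation. The following is a minimal sketch, not part of the NIM client: `protect_terms` is a hypothetical helper, and it assumes the input text does not already contain <dnt> tags.

```python
import re


def protect_terms(text: str, terms: list[str]) -> str:
    """Wrap every occurrence of each term in <dnt>...</dnt> tags.

    Hypothetical helper for illustration; assumes `text` is not already tagged.
    """
    # Longest terms first, so "NVIDIA NIM" is matched before "NVIDIA".
    ordered = sorted(set(terms), key=len, reverse=True)
    if not ordered:
        return text
    pattern = re.compile("|".join(re.escape(term) for term in ordered))
    return pattern.sub(lambda match: f"<dnt>{match.group(0)}</dnt>", text)


tagged = protect_terms(
    "NVIDIA NIM provides optimized inference.",
    ["NVIDIA", "NVIDIA NIM"],
)
print(tagged)  # <dnt>NVIDIA NIM</dnt> provides optimized inference.
```

Sorting the terms longest-first keeps overlapping entries from producing nested or partial tags.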

Custom Dictionaries#

Provide a dictionary file to force specific translations or to block translation of specific words. Each entry either maps a source word to a target word (source##target) or marks a word to leave untranslated (the word alone, with no ##). This gives fine-grained control over domain-specific terminology.
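As an illustration of the entry format, the sketch below builds dictionary content from a mapping of forced translations and a list of words to leave untranslated. It assumes one entry per line, which this page does not state explicitly; `build_dictionary` and the sample entries are hypothetical.

```python
def build_dictionary(forced: dict[str, str], untranslated: list[str]) -> str:
    """Return dictionary file content, assuming one entry per line.

    forced:       source word -> required target word (written as source##target)
    untranslated: words to pass through unchanged (written with no ##)
    """
    lines = []
    for source, target in forced.items():
        if "##" in source or "##" in target:
            raise ValueError(f"entry may not contain '##': {source!r} -> {target!r}")
        lines.append(f"{source}##{target}")
    for word in untranslated:
        if "##" in word:
            raise ValueError(f"entry may not contain '##': {word!r}")
        lines.append(word)
    return "\n".join(lines) + "\n"


# Illustrative entries only: force one English-to-German term, protect one name.
content = build_dictionary({"GPU": "Grafikprozessor"}, ["NIM"])
print(content, end="")
```

Rejecting entries that contain the ## separator avoids writing a line that could be read as a different mapping.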

Batch Inference#

Translate multiple inputs in parallel using the --batch-size flag with a text file containing one input per line. Batching improves throughput when translating large volumes of text.
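Because the input file holds one input per line, any text with embedded line breaks must be flattened before it is written. The sketch below shows one way to prepare such a file; the helper names and the file name `inputs.txt` are hypothetical, and collapsing whitespace is an assumption about what is acceptable for your content.

```python
import tempfile
from pathlib import Path


def to_batch_lines(texts: list[str]) -> list[str]:
    """Flatten each input to a single line and drop empty inputs."""
    lines = []
    for text in texts:
        flat = " ".join(text.split())  # collapses newlines and repeated spaces
        if flat:
            lines.append(flat)
    return lines


def write_batch_file(texts: list[str], path: Path) -> int:
    """Write one input per line as UTF-8 and return the number of lines."""
    lines = to_batch_lines(texts)
    path.write_text("".join(line + "\n" for line in lines), encoding="utf-8")
    return len(lines)


inputs = ["Hello, world.", "A sentence\nsplit across lines.", "   "]
with tempfile.TemporaryDirectory() as tmp:
    batch_path = Path(tmp) / "inputs.txt"  # hypothetical file name
    count = write_batch_file(inputs, batch_path)
    written = batch_path.read_text(encoding="utf-8")

print(count)  # 2
print(written, end="")
```

The resulting file can then be passed to the client together with the --batch-size flag.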

Morphologically Complex Languages#

Translations into some target languages, such as Arabic, Turkish, and Finnish, can be longer than the source text. If the output is truncated, increase the --max-len-variation parameter (default: 20, range: 0-256) to allow longer translations.
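One way to apply this advice in a script is to retry a truncated translation with progressively larger values. The sketch below only computes the values to try, using the default and range stated above; the doubling strategy and the function names are assumptions, not documented behavior of the service.

```python
DEFAULT_MAX_LEN_VARIATION = 20
MIN_MAX_LEN_VARIATION = 0
MAX_MAX_LEN_VARIATION = 256


def validate_max_len_variation(value: int) -> int:
    """Raise ValueError if the value is outside the documented 0-256 range."""
    if not MIN_MAX_LEN_VARIATION <= value <= MAX_MAX_LEN_VARIATION:
        raise ValueError(f"--max-len-variation must be in 0-256, got {value}")
    return value


def escalation_schedule(start: int = DEFAULT_MAX_LEN_VARIATION) -> list[int]:
    """Return values to try in order, doubling until the upper bound is reached."""
    values = [validate_max_len_variation(start)]
    while values[-1] < MAX_MAX_LEN_VARIATION:
        # max(..., 1) lets a starting value of 0 grow.
        values.append(min(MAX_MAX_LEN_VARIATION, max(values[-1] * 2, 1)))
    return values


print(escalation_schedule())  # [20, 40, 80, 160, 256]
```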

Next Steps#

  • Deploy and Run NMT NIM: Deploy the NMT NIM container and run translation with the CLI client or the HTTP REST endpoint.

  • Custom Dictionaries: Force specific translations or protect domain terms from translation.

  • Translate with Python: Call the NMT gRPC API programmatically.

  • NMT Tutorial: Guided walkthrough of deployment, exclusion tags, custom dictionaries, and batch inference.

  • NMT Support Matrix: GPU requirements, model profiles, and all 36 supported languages.