Date: 2026-03-03

NVIDIA AI Enterprise License: Riva NMT NIM is available for self-hosting under the NVIDIA AI Enterprise (NVAIE) License.

NVIDIA GPU(s): Riva NMT NIM runs on any NVIDIA GPU with sufficient GPU memory, but some model/GPU combinations are optimized.

Have glibc >= 2.35 in the output of ld -v

Using a network repository as part of a package manager installation, skipping the CUDA toolkit installation as the libraries are available within the NIM container

After installing the toolkit, follow the instructions in the Configure Docker section in the NVIDIA Container Toolkit documentation.

To ensure that your setup is correct, run the following command:

This command should produce output similar to the following, where you can confirm the CUDA driver version and available GPUs.

To access NGC resources, you need an NGC API key. You can generate a key here: Generate Personal Key.

When creating an NGC API Personal key, ensure that at least “NGC Catalog” is selected from the “Services Included” dropdown. More services can be included if this key is to be reused for other purposes.

Personal keys allow you to configure an expiration date, revoke or delete the key using an action button, and rotate the key as needed. For more information about key types, refer to the NGC User Guide.

Pass the value of the API key to the docker run command in the next section as the NGC_API_KEY environment variable to download the appropriate models and resources when starting the NIM.

If you’re not familiar with how to create the NGC_API_KEY environment variable, the simplest way is to export it in your terminal:

Run one of the following commands to make the key available at startup:

Other, more secure options include saving the value in a file, so that you can retrieve it with cat $NGC_API_KEY_FILE, or using a password manager.

To pull the NIM container image from NGC, first authenticate with the NVIDIA Container Registry with the following command:

Use $oauthtoken as the username and NGC_API_KEY as the password. The $oauthtoken username is a special name that indicates that you will authenticate with an API key and not a user name and password.
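The key-handling options described above (an exported NGC_API_KEY variable, or a file read with cat $NGC_API_KEY_FILE) can be sketched in Python as follows. This is a minimal illustration and not part of the NIM tooling: the helper name load_ngc_api_key is hypothetical, it assumes that NGC_API_KEY_FILE holds the path to a file containing only the key, and giving the environment variable precedence over the file is a choice made for this sketch.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def load_ngc_api_key(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the key from NGC_API_KEY, else from the file named by NGC_API_KEY_FILE."""
    env = os.environ if env is None else env

    # Option 1: the key was exported directly in the terminal.
    key = env.get("NGC_API_KEY", "").strip()
    if key:
        return key

    # Option 2 (assumed layout): NGC_API_KEY_FILE points at a file that
    # contains only the key, mirroring `cat $NGC_API_KEY_FILE`.
    key_file = env.get("NGC_API_KEY_FILE", "").strip()
    if key_file:
        key = Path(key_file).read_text(encoding="utf-8").strip()
        if key:
            return key

    raise RuntimeError("Set NGC_API_KEY or NGC_API_KEY_FILE before starting the NIM")
```

Calling load_ngc_api_key() with no arguments reads from the current process environment; passing a plain dictionary makes the helper easy to exercise without touching real credentials.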

The following commands deploy the Riva Translate 1.6b model on any of the supported GPUs.

Running Inference#

Note: Depending on your network speed, it may take up to 30 minutes after the Docker container is started for it to become ready and start accepting requests.

Open a new terminal and run the following command to check if the service is ready to handle inference requests:

curl -X 'GET' 'http://localhost:9000/v1/health/ready'

If the service is ready, you get a response similar to the following.

{ "status" : "ready" }
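Because startup can take a long time, it may be convenient to poll the readiness endpoint rather than rerun curl by hand. The following is a minimal sketch using only the Python standard library. The function names, the 10-second polling interval, and the 5-second request timeout are assumptions of this example; the URL and the expected response body are the ones shown above.

```python
import json
import time
import urllib.error
import urllib.request


def is_ready(body: str) -> bool:
    """Return True if a health response body reports {"status": "ready"}."""
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return False
    return isinstance(payload, dict) and payload.get("status") == "ready"


def wait_until_ready(url: str = "http://localhost:9000/v1/health/ready",
                     timeout_s: float = 1800.0,
                     interval_s: float = 10.0) -> bool:
    """Poll the readiness endpoint until it reports ready or timeout_s elapses."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=5) as resp:
                if is_ready(resp.read().decode("utf-8", errors="replace")):
                    return True
        except (urllib.error.URLError, OSError):
            pass  # The server is not accepting connections yet; retry.
        time.sleep(interval_s)
    return False
```

The default timeout of 1800 seconds matches the upper bound of 30 minutes mentioned in the note above; adjust it to suit your network.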

Install the Riva Python client

Riva uses gRPC APIs. You can download the proto files from Riva gRPC Proto files and compile them to a target language using the protoc compiler. You can find Riva clients in C++ and Python at the following locations.

Install Riva Python client

sudo apt-get install python3-pip
pip install -U nvidia-riva-client

Download Riva sample client

git clone https://github.com/nvidia-riva/python-clients.git

Run Text-to-Text translation inference:

The following command translates the text from English to German.

python3 python-clients/scripts/nmt/nmt.py --server 0.0.0.0:50051 \
    --text "This will become German words" \
    --source-language-code en-US \
    --target-language-code de-DE

You will see the translated output as shown below.

## Das werden deutsche Wörter

Refer to Supported Languages for the supported source and target language codes.

Batched Inference#

Riva Translate supports batched inference of multiple inputs to provide a faster translation experience. Using the translation client, you can batch together up to 8 inputs and translate them in a single request.

The command below assumes that a multiline text file input_text.txt exists with one English text input on each line. Inputs from the file are grouped into batches of 8 and submitted to the model for inference. The translated output is printed on the terminal.

python3 python-clients/scripts/nmt/nmt.py --server 0.0.0.0:50051 \
    --text-file input_text.txt \
    --source-language-code en \
    --target-language-code de \
    --batch-size 8
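The grouping of file lines into batches can be sketched as follows. This is an illustration of the batching idea only, not the sample client's actual implementation: the function name batched_lines is hypothetical, and skipping blank lines is an assumption of this sketch. The upper bound of 8 comes from the limit stated above.

```python
from typing import Iterable, Iterator, List


def batched_lines(lines: Iterable[str], batch_size: int = 8) -> Iterator[List[str]]:
    """Yield lists of at most batch_size non-empty lines, preserving order."""
    if not 1 <= batch_size <= 8:
        raise ValueError("batch_size must be between 1 and 8")
    batch: List[str] = []
    for line in lines:
        text = line.strip()
        if not text:
            continue  # Skip blank lines (an assumption of this sketch).
        batch.append(text)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        yield batch  # Final, possibly smaller, batch.
```

For example, a file with 19 non-empty lines yields three batches of 8, 8, and 3 inputs, each of which would be sent as one request.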

Translation Exclusion#

Riva Translate supports a feature called translation exclusion. The tags <dnt> and </dnt> enclose words or phrases that should not be translated.

Without the <dnt> tag, all words are translated.

python3 python-clients/scripts/nmt/nmt.py --server 0.0.0.0:50051 \
    --text "Riva translate model translates audio between language pairs." \
    --source-language-code en-US \
    --target-language-code fr-FR

Le modèle de traduction Riva traduit l'audio entre les paires de langues.

With the <dnt> tag around Riva translate, the phrase is kept unchanged in the output.

python3 python-clients/scripts/nmt/nmt.py --server 0.0.0.0:50051 \
    --text "<dnt>Riva translate</dnt> model translates audio between language pairs." \
    --source-language-code en-US \
    --target-language-code fr-FR

Le modèle Riva translate traduit l'audio entre les paires de langues.
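If you have a list of phrases to protect, you can add the tags programmatically before sending the text. The following is a minimal sketch: the helper name protect_phrases is hypothetical, matching is case-sensitive, and the input is assumed to contain no <dnt> tags already.

```python
import re
from typing import Iterable


def protect_phrases(text: str, phrases: Iterable[str]) -> str:
    """Wrap each occurrence of the given phrases in <dnt>...</dnt> tags."""
    # Try longer phrases first so "Riva translate" wins over "Riva".
    ordered = sorted({p for p in phrases if p}, key=len, reverse=True)
    if not ordered:
        return text
    pattern = re.compile("|".join(re.escape(p) for p in ordered))
    return pattern.sub(lambda m: f"<dnt>{m.group(0)}</dnt>", text)
```

Applied to the example sentence above with the phrase Riva translate, the helper produces the same tagged input that is passed to --text in the second command.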

Custom Translation Dictionary#

Riva Translate allows the use of a custom text dictionary to specify the desired translation for particular words. The sample Python client accepts the custom dictionary as a text file with a defined syntax. Each line in the file should contain a translation pair: a source word and its desired translation, separated by a double-hash (##) symbol. To exclude certain words from translation, list them on separate lines without the ## symbol; they appear untranslated in the output.

Without a custom dictionary:

python3 python-clients/scripts/nmt/nmt.py --server 0.0.0.0:50051 \
    --text "bad morning everyone" \
    --source-language-code en-US \
    --target-language-code it-IT

brutto mattino tutti

With a custom dictionary, the translation can be customized.

echo bad##good > custom_dict.txt
echo everyone >> custom_dict.txt

python3 python-clients/scripts/nmt/nmt.py --server 0.0.0.0:50051 \
    --text "bad morning everyone" \
    --source-language-code en-US \
    --target-language-code it-IT \
    --dnt-phrases-file custom_dict.txt

good mattina everyone
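To make the file syntax concrete, the sketch below parses dictionary lines into translation pairs and untranslated words. It illustrates the format described above and is not the sample client's own parser: the function name parse_custom_dictionary is hypothetical, and trimming whitespace and ignoring blank lines are assumptions of this example.

```python
from typing import Dict, Iterable, List, Tuple


def parse_custom_dictionary(lines: Iterable[str]) -> Tuple[Dict[str, str], List[str]]:
    """Split dictionary lines into source->translation pairs and do-not-translate words."""
    translations: Dict[str, str] = {}
    untranslated: List[str] = []
    for raw in lines:
        entry = raw.strip()
        if not entry:
            continue  # Ignore blank lines (an assumption of this sketch).
        if "##" in entry:
            # "source##target": force this translation for the source word.
            source, target = entry.split("##", 1)
            translations[source.strip()] = target.strip()
        else:
            # No "##": the word is left untranslated in the output.
            untranslated.append(entry)
    return translations, untranslated
```

For the custom_dict.txt file created above, the result is the pair bad -> good plus the untranslated word everyone, which corresponds to the output good mattina everyone.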

Morphologically Complex Translations#

Translating into morphologically rich languages (such as Arabic or Turkish) typically requires more tokens to accurately convey the meaning of the input text, and the model needs to perform additional operations when translating into these languages. In such cases, you can use the --max-len-variation parameter (default: 20) to specify the allowed difference in token count between the source and translated text. For languages like Arabic or Turkish, which need more tokens, we recommend setting a higher value such as 150. The allowed range for this parameter is 0 to 256. Increasing this value may raise inference latency, but the impact is generally noticeable only when translating into languages that require more tokens because of their morphological complexity.

Incomplete translations may occur when using the default value for --max-len-variation.

python3 python-clients/scripts/nmt/nmt.py --server 0.0.0.0:50051 \
    --text "Despite numerous challenges faced by the international community in coordinating an effective response to climate change, several countries have committed to achieving net-zero emissions by 2050." \
    --source-language-code en-US \
    --target-language-code ar-AR \
    --max-len-variation 20

وعلى الرغم من التحديات العديدة التي يواجهها المجتمع الدولي في تنسيق الاستجابة الفعالة لتغير المناخ، فقد التزمت عدة بلدان بتحقيق صافي الانبعاثات الصفرية بحلول عام 2050 . وعلى الرغم من التحديات العديدة التي يواجهها المجتمع الدولي ف

Correct translations are more likely when --max-len-variation is set to a higher value.

python3 python-clients/scripts/nmt/nmt.py --server 0.0.0.0:50051 \
    --text "Despite numerous challenges faced by the international community in coordinating an effective response to climate change, several countries have committed to achieving net-zero emissions by 2050." \
    --source-language-code en-US \
    --target-language-code ar-AR \
    --max-len-variation 150

وعلى الرغم من التحديات العديدة التي يواجهها المجتمع الدولي في تنسيق الاستجابة الفعالة لتغير المناخ، فقد التزمت عدة بلدان بتحقيق صافي الانبعاثات الصفرية بحلول عام 2050.
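When you drive the sample client from another script, it can help to validate --max-len-variation against the documented range before launching it. The sketch below builds the argument list for the command shown above. The helper name nmt_command is hypothetical; the script path, flags, default of 20, and range of 0 to 256 are taken from this page.

```python
from typing import List


def nmt_command(text: str, source: str, target: str,
                max_len_variation: int = 20,
                server: str = "0.0.0.0:50051") -> List[str]:
    """Build the argv for the sample nmt.py client, validating max_len_variation."""
    if not 0 <= max_len_variation <= 256:
        raise ValueError("max_len_variation must be in the range 0 to 256")
    return [
        "python3", "python-clients/scripts/nmt/nmt.py",
        "--server", server,
        "--text", text,
        "--source-language-code", source,
        "--target-language-code", target,
        "--max-len-variation", str(max_len_variation),
    ]
```

The returned list can be passed to subprocess.run(..., check=True) from the directory that contains the cloned python-clients repository. For Arabic or Turkish targets, pass max_len_variation=150 as recommended above.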