Custom Models
Riva translation supports bilingual and multilingual models trained in NeMo. Each model requires 1 GB of shared memory. If you are not using the quick start path, pass --shm-size to Docker so the models have enough memory to run.
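As a rough sketch of the shared-memory requirement above, the helper below derives a --shm-size value from the number of models. It assumes that 1 GB per model adds up linearly, which is an inference from the sentence above rather than a documented formula. The function names and the RIVA_IMAGE variable are hypothetical; --shm-size and --gpus are standard docker run options.

```shell
# Sketch: size Docker shared memory for N translation models.
# Assumption: each model needs 1 GB, so N models need N GB in total.

# Print a --shm-size value (for example "3g") for the given number of models.
shm_size_for_models() {
  echo "${1}g"
}

# Start a container with enough shared memory for the given number of models.
# RIVA_IMAGE is a hypothetical variable; set it to the Riva image you use.
run_riva_container() {
  docker run --rm --gpus all \
    --shm-size="$(shm_size_for_models "$1")" \
    "${RIVA_IMAGE:?set RIVA_IMAGE to your Riva image}"
}
```

Calling run_riva_container 3 would request 3 GB of shared memory. Inside a running container, df -h /dev/shm shows how much was actually allocated.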
NeMo models can be converted into Riva models using nemo2riva. For example:
nemo2riva \
<nemo_filename> \
--out=<riva_filename> \
--max-dim=<max-dim>
Translation models can be deployed in Riva using riva-build and riva-deploy.
The translation pipeline takes a single optional parameter, --name, which sets <model name>. For example:
riva-build translation \
--name <model name> \
<rmir_filename>:<encryption_key> \
<riva_filename>:<encryption_key>
For models trained with NeMo Megatron-LLM, replace translation with megatron_translation. For example:
riva-build megatron_translation \
--name <model name> \
<rmir_filename>:<encryption_key> \
<riva_filename>:<encryption_key>
Both model types accept the following arguments:
- <rmir_filename> is the Riva rmir file that is generated.
- <riva_filename> is the name of the riva file to use as input.
- <encryption_key> is the encryption key used during the export of the .riva file.
- <model name> is how to differentiate the model at inference time. The default is riva-nmt.
Example
riva-build translation \
--name mnmt_en_deesfr_transformer12x2 \
/data/mnmt_en_deesfr_transformer12x2nmt.rmir \
/data/mnmt_en_deesfr_transformer12x2.riva
riva-deploy -f /data/mnmt_en_deesfr_transformer12x2nmt.rmir /data/models
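The build and deploy steps above can be wrapped in a small script. This is a minimal sketch, not an official tool: the function names with_key and build_and_deploy are hypothetical, and passing <rmir_filename>:<encryption_key> to riva-deploy is an assumption based on the riva-build syntax shown earlier. The key is optional, which matches the unencrypted example above.

```shell
# Sketch: run riva-build and riva-deploy for one translation model.

# Append ":<key>" to a filename only when an encryption key is given.
with_key() {
  if [ -n "$2" ]; then
    echo "$1:$2"
  else
    echo "$1"
  fi
}

# Usage: build_and_deploy <pipeline> <model name> <riva file> <rmir file> <model dir> [key]
build_and_deploy() {
  local pipeline="$1" model_name="$2" riva_file="$3" rmir_file="$4"
  local model_dir="$5" key="${6:-}"

  # Only the two pipeline types named in this section are accepted.
  case "${pipeline}" in
    translation|megatron_translation) ;;
    *) echo "unknown pipeline: ${pipeline}" >&2; return 1 ;;
  esac

  # Build the rmir file from the riva file; stop if the build fails.
  riva-build "${pipeline}" \
    --name "${model_name}" \
    "$(with_key "${rmir_file}" "${key}")" \
    "$(with_key "${riva_file}" "${key}")" || return 1

  # Deploy the generated rmir file into the model repository.
  riva-deploy -f "$(with_key "${rmir_file}" "${key}")" "${model_dir}"
}
```

For instance, build_and_deploy translation mnmt_en_deesfr_transformer12x2 /data/mnmt_en_deesfr_transformer12x2.riva /data/mnmt_en_deesfr_transformer12x2nmt.rmir /data/models would issue the same two commands as the example above.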
Supported Models
| Model | Architecture | Source Language(s) | Target Language(s) | NGC link |
|---|---|---|---|---|
| megatronnmt_en_any_500m_32 | Transformer Encoder-Decoder | English (en) | Any language | |
| megatronnmt_any_en_500m_32 | Transformer Encoder-Decoder | Any language | English (en) | |
| megatronnmt_any_en_1b | Transformer Encoder-Decoder | Any language | English (en) | |
| megatronnmt_en_any_1b | Transformer Encoder-Decoder | English (en) | Any language | |
| megatronnmt_any_any_1b | Transformer Encoder-Decoder | Any language | Any language | |