Neural Machine Translation (NMT)
Transformer-based Seq2Seq
The Transformer-based encoder-decoder Neural Machine Translation (NMT) models in Riva follow the architecture of the original Transformer paper, with one main modification: they use the pre-layernorm Transformer variant, in which layer normalization is applied to the input of each sublayer rather than to its output. For more information, refer to the NeMo Machine Translation Documentation.
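The difference between the two orderings can be illustrated with a minimal, dependency-free Python sketch. It is not the Riva or NeMo implementation: `layer_norm` omits the learned scale and shift, and `toy_sublayer` is a hypothetical stand-in for an attention or feed-forward sublayer.

```python
import math
from typing import Callable, List

Vector = List[float]


def layer_norm(x: Vector, eps: float = 1e-5) -> Vector:
    """Normalize a vector to zero mean and unit variance (no learned scale/shift)."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]


def post_ln_block(x: Vector, sublayer: Callable[[Vector], Vector]) -> Vector:
    """Original Transformer ordering: LayerNorm(x + Sublayer(x))."""
    y = sublayer(x)
    return layer_norm([a + b for a, b in zip(x, y)])


def pre_ln_block(x: Vector, sublayer: Callable[[Vector], Vector]) -> Vector:
    """Pre-layernorm ordering: x + Sublayer(LayerNorm(x))."""
    y = sublayer(layer_norm(x))
    return [a + b for a, b in zip(x, y)]


def toy_sublayer(x: Vector) -> Vector:
    # Hypothetical stand-in for attention or a feed-forward network.
    return [2.0 * v for v in x]


x = [1.0, 2.0, 3.0, 4.0]
pre_out = pre_ln_block(x, toy_sublayer)
post_out = post_ln_block(x, toy_sublayer)
```

In the pre-layernorm block the residual path carries `x` through unnormalized, whereas the post-layernorm block normalizes the sum, so its output always has zero mean.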
The 24x6 models provided with Riva have 24 encoder layers and 6 decoder layers, for a total of 500M parameters.
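To see how an encoder-heavy 24x6 layout distributes parameters, here is a rough counting sketch for a standard Transformer layer stack. The hidden size (`d_model=1024`) and feed-forward size (`d_ff=4096`) are assumptions chosen only for illustration; the text above does not state them, and embeddings and output projections are excluded, so the result is not expected to reproduce the 500M figure.

```python
def attention_params(d_model: int) -> int:
    # Query, key, value, and output projections, each with weights and a bias.
    return 4 * (d_model * d_model + d_model)


def feed_forward_params(d_model: int, d_ff: int) -> int:
    # Two linear layers (d_model -> d_ff -> d_model), each with a bias.
    return (d_model * d_ff + d_ff) + (d_ff * d_model + d_model)


def layer_norm_params(d_model: int) -> int:
    # Learned scale and shift vectors.
    return 2 * d_model


def encoder_layer_params(d_model: int, d_ff: int) -> int:
    # Self-attention + feed-forward, with one layer norm per sublayer.
    return (attention_params(d_model)
            + feed_forward_params(d_model, d_ff)
            + 2 * layer_norm_params(d_model))


def decoder_layer_params(d_model: int, d_ff: int) -> int:
    # Self-attention + cross-attention + feed-forward, one layer norm each.
    return (2 * attention_params(d_model)
            + feed_forward_params(d_model, d_ff)
            + 3 * layer_norm_params(d_model))


# Assumed sizes (hypothetical, not taken from the Riva documentation).
D_MODEL, D_FF = 1024, 4096
encoder_total = 24 * encoder_layer_params(D_MODEL, D_FF)
decoder_total = 6 * decoder_layer_params(D_MODEL, D_FF)
print(f"encoder stack: {encoder_total:,}")
print(f"decoder stack: {decoder_total:,}")
```

Under these assumed sizes, the encoder stack holds roughly three times as many parameters as the decoder stack, even though each decoder layer is larger because of its extra cross-attention sublayer.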