Released in December 2021, the Retrieval-Enhanced Transformer (RETRO) model is an approach to enhancing auto-regressive language models. Developed by researchers at DeepMind, RETRO leverages retrieval from a large text database as a complementary memory, improving model quality without significantly increasing computational requirements. More information is available in the companion paper “Improving language models by retrieving from trillions of tokens”.
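At a high level, RETRO splits the input into fixed-size chunks, retrieves the nearest-neighbor chunks (and their continuations) from the external text database, and lets the decoder attend to them through chunked cross-attention. The sketch below illustrates only the retrieval step, using a toy bag-of-words embedding and a brute-force nearest-neighbor search; it is a conceptual illustration under stated assumptions, not NeMo's or DeepMind's implementation. The `embed` helper, the chunk/continuation database layout, and the toy vocabulary size are hypothetical, and the real system uses frozen BERT embeddings with an approximate-nearest-neighbor index over trillions of tokens.

```python
import numpy as np

CHUNK_LEN = 64   # RETRO retrieves neighbors per 64-token chunk
K = 2            # neighbors returned per chunk

def embed(tokens, dim=128):
    """Toy deterministic bag-of-words embedding (stand-in for a frozen BERT encoder)."""
    rng = np.random.default_rng(0)
    table = rng.standard_normal((50_000, dim))           # hypothetical vocabulary size
    vec = table[np.asarray(tokens) % 50_000].mean(axis=0)
    return vec / (np.linalg.norm(vec) + 1e-8)

def build_database(corpus_token_ids):
    """Split a tokenized corpus into (chunk, continuation) pairs and pre-embed the chunks."""
    entries = []
    for start in range(0, len(corpus_token_ids) - 2 * CHUNK_LEN, CHUNK_LEN):
        chunk = corpus_token_ids[start:start + CHUNK_LEN]
        continuation = corpus_token_ids[start + CHUNK_LEN:start + 2 * CHUNK_LEN]
        entries.append((embed(chunk), chunk, continuation))
    keys = np.stack([e[0] for e in entries])
    return keys, entries

def retrieve_neighbors(input_token_ids, keys, entries, k=K):
    """Brute-force cosine search: for each input chunk, return its k nearest database
    chunks together with their continuations (what the decoder cross-attends to)."""
    neighbors = []
    for start in range(0, len(input_token_ids), CHUNK_LEN):
        query = embed(input_token_ids[start:start + CHUNK_LEN])
        scores = keys @ query                             # cosine similarity on unit vectors
        top = np.argsort(-scores)[:k]
        neighbors.append([(entries[i][1], entries[i][2]) for i in top])
    return neighbors

if __name__ == "__main__":
    corpus = list(range(1000))                            # stand-in for a tokenized corpus
    keys, entries = build_database(corpus)
    input_ids = list(range(100, 100 + 2 * CHUNK_LEN))
    hits = retrieve_neighbors(input_ids, keys, entries)
    print(f"{len(hits)} input chunks, {len(hits[0])} neighbors each")
```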
- Data Preparation
- Training
- Training with Predefined Configurations
- Model Inferencing
- Model Evaluation
| Feature | Status |
|---|---|
| Data parallelism | ✓ |
| Tensor parallelism | ✗ |
| Pipeline parallelism | ✗ |
| Interleaved Pipeline Parallelism Schedule | N/A |
| Sequence parallelism | ✗ |
| Selective activation checkpointing | ✓ |
| Gradient checkpointing | ✓ |
| Partial gradient checkpointing | ✓ |
| FP32/TF32 | ✓ |
| AMP/FP16 | ✓ |
| BF16 | ✓ |
| TransformerEngine | ✓ |
| TransformerEngine/FP8 | ✗ |
| Multi-GPU | ✓ |
| Multi-Node | ✓ |
| Inference | ✓ |
| Slurm | ✓ |
| Base Command Manager | ✓ |
| Kubernetes | ✗ |
| Distributed data preprocessing | N/A |
| NVfuser | ✗ |
| P-Tuning and Prompt Tuning | ✗ |
| IA3 and Adapter learning | ✗ |
| Distributed Optimizer | ✓ |
| Distributed Checkpoint | ✓ |
| Fully Sharded Data Parallel | ✗ |