RETRO

User Guide (Latest Version)

Released in December 2021, the Retrieval-Enhanced Transformer (RETRO) model is an innovative approach to enhancing auto-regressive language models. Developed by researchers at DeepMind, RETRO leverages retrieval from a large text database as a complementary memory to scale reasoning capabilities without significantly increasing computational requirements. More information is available in the companion paper “Improving language models by retrieving from trillions of tokens”.

Feature

Status

Data parallelism
Tensor parallelism
Pipeline parallelism
Interleaved Pipeline Parallelism Sched N/A
Sequence parallelism
Selective activation checkpointing
Gradient checkpointing
Partial gradient checkpointing
FP32/TF32
AMP/FP16
BF16
TransformerEngine
TransformerEngine/FP8
Multi-GPU
Multi-Node
Inference
Slurm
Base Command Manager
Kubernetes
Distributed data preprcessing N/A
NVfuser
P-Tuning and Prompt Tuning
IA3 and Adapter learning
Distributed Optimizer
Distributed Checkpoint
Fully Shared Data Parallel
Previous Parameter Efficient Fine-Tuning (PEFT)
Next Data Preparation
© | | | | | | |. Last updated on Jun 24, 2024.