Gemma and CodeGemma - NVIDIA Docs


Released in February 2024, Google’s Gemma is an open model based on the work (Gemini v1.5 Report) done to create Google’s Gemini family of models. It adopts the transformer decoder architecture and adds multi-query attention (used in the 2B model), RoPE positional embeddings, GeGLU activations, and more. Gemma is offered in 2B and 7B parameter sizes, providing a capable model at sizes that are practical to train and deploy. More information is available in Google’s release blog.

Subsequently released in April 2024, CodeGemma joins the Gemma family with a specialization in code understanding and generation.
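To illustrate one of the architectural components named above, the following is a minimal sketch of a GeGLU feed-forward activation in plain Python. It is not NeMo or Gemma code: the function names (`gelu_tanh`, `matvec`, `geglu`), the toy weights, and the choice of the tanh approximation of GELU are all assumptions made for illustration.

```python
import math


def gelu_tanh(x: float) -> float:
    """Tanh approximation of GELU (assumed variant for this sketch)."""
    inner = math.sqrt(2.0 / math.pi) * (x + 0.044715 * x ** 3)
    return 0.5 * x * (1.0 + math.tanh(inner))


def matvec(weights, x):
    """Multiply a (rows x cols) weight matrix by a vector of length cols."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


def geglu(x, w_gate, w_up):
    """GeGLU: GELU(W_gate x) multiplied elementwise by (W_up x)."""
    gate = [gelu_tanh(g) for g in matvec(w_gate, x)]
    up = matvec(w_up, x)
    return [g * u for g, u in zip(gate, up)]


# Hypothetical toy input and weights: model width 2, hidden width 3.
x = [1.0, -2.0]
w_gate = [[0.5, 0.0], [0.0, 0.5], [1.0, 1.0]]
w_up = [[1.0, 0.0], [0.0, 1.0], [1.0, -1.0]]

hidden = geglu(x, w_gate, w_up)
print(hidden)
```

In a full transformer block, the gated hidden vector would then be projected back to the model width by a third weight matrix, which is omitted here for brevity.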

Feature	Status
Data parallelism	✓
Tensor parallelism	✓
Pipeline parallelism	✓
Interleaved Pipeline Parallelism Schedule	N/A
Sequence parallelism	✓
Selective activation checkpointing	✓
Gradient checkpointing	✓
Partial gradient checkpointing	✓
FP32/TF32	✓
AMP/FP16	✗
BF16	✓
TransformerEngine/FP8	✗
Multi-GPU	✓
Multi-Node	✓
Inference	N/A
Slurm	✓
Base Command Manager	✓
Base Command Platform	✓
Distributed data preprocessing	✓
NVfuser	✗
P-Tuning and Prompt Tuning	✓
IA3 and Adapter learning	✓
Distributed Optimizer	✓
Distributed Checkpoint	✓
Fully Sharded Data Parallel	N/A
