Gemma and CodeGemma

User Guide (Latest Version)

Released in February 2024, Google’s Gemma is an open model based on the research behind Google’s Gemini family of models (see the Gemini v1.5 report). It adopts a transformer-decoder architecture and adds multi-query attention, rotary positional embeddings (RoPE), GeGLU activations, and other refinements. Gemma is offered in 2B and 7B parameter sizes, providing powerful models at practical scales. More information is available in Google’s release blog.
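To make one of the architectural choices above concrete, here is a minimal NumPy sketch of the GeGLU activation used in Gemma’s feed-forward blocks: a GELU-gated linear unit in which one projection is passed through GELU and gates a second projection elementwise. The weight names and dimensions are illustrative, not Gemma’s actual configuration.

```python
import numpy as np

def gelu(x):
    # tanh approximation of GELU
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x**3)))

def geglu(x, w_gate, w_up):
    # GeGLU: the gate branch (x @ w_gate) is passed through GELU and
    # multiplies the linear "up" branch (x @ w_up) elementwise.
    return gelu(x @ w_gate) * (x @ w_up)

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8))        # (batch, d_model); sizes are illustrative
w_gate = rng.standard_normal((8, 16))  # d_model -> d_ff gate projection
w_up = rng.standard_normal((8, 16))    # d_model -> d_ff up projection
out = geglu(x, w_gate, w_up)
print(out.shape)                       # (4, 16)
```

In a full feed-forward block, a down-projection back to d_model would follow the gated product.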

Released in April 2024, CodeGemma joins the Gemma family with a specialization in code understanding and generation.
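CodeGemma’s pretrained checkpoints support fill-in-the-middle (FIM) prompting, where the model generates the code that belongs between a given prefix and suffix. The sketch below assembles such a prompt; the special-token spellings follow CodeGemma’s published formatting and should be treated as assumptions if your tokenizer differs.

```python
# CodeGemma fill-in-the-middle control tokens (assumed spellings)
FIM_PREFIX = "<|fim_prefix|>"
FIM_SUFFIX = "<|fim_suffix|>"
FIM_MIDDLE = "<|fim_middle|>"

def build_fim_prompt(prefix: str, suffix: str) -> str:
    # Prefix-suffix-middle ordering: the model completes the span between
    # `prefix` and `suffix` after the <|fim_middle|> token.
    return f"{FIM_PREFIX}{prefix}{FIM_SUFFIX}{suffix}{FIM_MIDDLE}"

prompt = build_fim_prompt(
    prefix="def add(a, b):\n    return ",
    suffix="\n\nprint(add(1, 2))\n",
)
print(prompt)
```

The resulting string would be tokenized and passed to the model as-is, with generation stopping at the file-separator or end-of-turn token.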

Feature                                     Status
=========================================   ======
Data parallelism                            Yes
Tensor parallelism                          Yes
Pipeline parallelism                        Yes
Interleaved Pipeline Parallelism Schedule   N/A
Sequence parallelism                        Yes
Selective activation checkpointing          Yes
Gradient checkpointing                      Yes
Partial gradient checkpointing              Yes
FP32/TF32                                   Yes
AMP/FP16                                    Yes
BF16                                        Yes
TransformerEngine/FP8                       Yes
Multi-GPU                                   Yes
Multi-Node                                  Yes
Inference                                   N/A
Slurm                                       Yes
Base Command Manager                        Yes
Base Command Platform                       Yes
Distributed data preprocessing              Yes
NVfuser                                     Yes
P-Tuning and Prompt Tuning                  Yes
IA3 and Adapter learning                    Yes
Distributed Optimizer                       Yes
Distributed Checkpoint                      Yes
Fully Sharded Data Parallel                 N/A
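Several features in the table trade compute for memory. As a toy illustration of the idea behind gradient/activation checkpointing — keeping only segment-boundary activations during the forward pass and recomputing the inner ones during the backward pass — here is a NumPy sketch. This is a conceptual example, not NeMo’s implementation.

```python
import numpy as np

def layer_fwd(x, w):
    return np.maximum(x @ w, 0.0)          # linear + ReLU layer

def layer_bwd(x, w, g_out):
    mask = ((x @ w) > 0).astype(x.dtype)   # ReLU derivative
    g_pre = g_out * mask
    return g_pre @ w.T, x.T @ g_pre        # grads w.r.t. input and weight

def forward_checkpointed(x, weights, seg_len):
    # Keep only the input of each segment; inner activations are discarded.
    boundaries = []
    for i in range(0, len(weights), seg_len):
        boundaries.append(x)
        for w in weights[i:i + seg_len]:
            x = layer_fwd(x, w)
    return x, boundaries

def backward_checkpointed(boundaries, weights, seg_len, g_out):
    grads = [None] * len(weights)
    seg_starts = list(range(0, len(weights), seg_len))
    for s, x_in in zip(reversed(seg_starts), reversed(boundaries)):
        ws = weights[s:s + seg_len]
        acts = [x_in]                       # recompute inner activations
        for w in ws:
            acts.append(layer_fwd(acts[-1], w))
        for j in reversed(range(len(ws))):
            g_out, gw = layer_bwd(acts[j], ws[j], g_out)
            grads[s + j] = gw
    return grads

rng = np.random.default_rng(0)
weights = [rng.standard_normal((8, 8)) * 0.3 for _ in range(4)]
x = rng.standard_normal((2, 8))
out, saved = forward_checkpointed(x, weights, seg_len=2)
grads = backward_checkpointed(saved, weights, 2, np.ones_like(out))
print(len(saved))  # 2 segment inputs stored instead of 4 activations
```

With a segment length of 2, only two boundary tensors are retained instead of four per-layer activations; each backward pass pays one extra forward pass per segment in exchange.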
Last updated on May 30, 2024.