Gemma
Google’s Gemma is a family of open-weight language models built on the same research and technology as Gemini. Gemma models are available in multiple sizes and generations. Later generations add architectural improvements, including local sliding-window attention (Gemma 2) and interleaved global/local attention (Gemma 3).
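To make the idea of local sliding-window attention concrete, here is a minimal sketch of the attention mask it implies. It is an illustration only, not code from Gemma or NeMo AutoModel: the helper names `sliding_window_causal_mask` and `render`, the tiny window size, and the convention that the window includes the current token are all assumptions chosen for readability.

```python
from typing import List


def sliding_window_causal_mask(seq_len: int, window: int) -> List[List[bool]]:
    """Build a causal mask restricted to a local window.

    mask[i][j] is True when query position i may attend to key position j.
    Assumed convention: a token sees itself and the (window - 1) tokens before it.
    """
    if seq_len < 0 or window < 1:
        raise ValueError("seq_len must be >= 0 and window must be >= 1")
    return [
        [(j <= i) and (i - j < window) for j in range(seq_len)]
        for i in range(seq_len)
    ]


def render(mask: List[List[bool]]) -> str:
    """Draw the mask as text: 'x' = attention allowed, '.' = masked out."""
    return "\n".join(
        "".join("x" if allowed else "." for allowed in row) for row in mask
    )


if __name__ == "__main__":
    # Six tokens with a window of three (toy values, not Gemma's real settings).
    print(render(sliding_window_causal_mask(seq_len=6, window=3)))
```

A global attention layer corresponds to the same mask with `window` set to at least `seq_len`, so interleaving global and local layers amounts to alternating between the two mask shapes from layer to layer.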
Available Models
- Gemma 3: 1B, 4B, 12B, 27B
- Gemma 2: 2B, 9B, 27B
- Gemma (v1): 2B, 7B
Architectures
- GemmaForCausalLM: Gemma v1
- Gemma2ForCausalLM: Gemma 2
- Gemma3ForCausalLM: Gemma 3
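Hugging Face checkpoints record their architecture class in the `architectures` field of `config.json`, so the mapping above can be used to identify which Gemma generation a checkpoint belongs to. The snippet below is a small sketch of that lookup; the helper `gemma_generation` and the `GEMMA_ARCHITECTURES` table are hypothetical names introduced here, not part of NeMo AutoModel or Transformers.

```python
import json
from typing import Optional

# Architecture class name -> Gemma generation, as listed in this section.
GEMMA_ARCHITECTURES = {
    "GemmaForCausalLM": "Gemma v1",
    "Gemma2ForCausalLM": "Gemma 2",
    "Gemma3ForCausalLM": "Gemma 3",
}


def gemma_generation(config_json: str) -> Optional[str]:
    """Return the Gemma generation named by a config.json string, or None.

    Hypothetical helper: it only inspects the "architectures" list and
    does not load any model weights.
    """
    config = json.loads(config_json)
    for name in config.get("architectures") or []:
        if name in GEMMA_ARCHITECTURES:
            return GEMMA_ARCHITECTURES[name]
    return None


if __name__ == "__main__":
    # Minimal stand-in for the contents of a checkpoint's config.json.
    sample = '{"architectures": ["Gemma2ForCausalLM"]}'
    print(gemma_generation(sample))
```

In practice the string would come from reading the `config.json` file in a downloaded checkpoint directory.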
Example HF Models
Example Recipes
Try with NeMo AutoModel
1. Install (full instructions):
2. Clone the repo to get the example recipes:
3. Run the recipe from inside the repo:
Run with Docker
1. Pull the container and mount a checkpoint directory:
2. Navigate to the AutoModel directory (where the recipes are):
3. Run the recipe:
See the Installation Guide and LLM Fine-Tuning Guide.
Fine-Tuning
See the LLM Fine-Tuning Guide for full SFT and LoRA instructions.