Gemma
Gemma is Google’s original lightweight open model family. Megatron Bridge supports Gemma causal language models through the GemmaBridge implementation for the Hugging Face GemmaForCausalLM architecture.
Supported Variants
Megatron Bridge supports Hugging Face Gemma checkpoints that use the gemma model type, including:
Gemma 2B: https://huggingface.co/google/gemma-2b
Gemma 7B: https://huggingface.co/google/gemma-7b
Gemma release collection: https://huggingface.co/collections/google/gemma-release-65d5efbccdbb8c4202ec078b
Architecture Notes
RMSNorm with zero-centered gamma: the stored norm weight is an offset from 1, so the effective scale is 1 + gamma.
GeGLU-style gated MLPs.
RoPE positional embeddings and flash attention backend.
Shared input/output embedding weights.
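The first two notes can be illustrated with a minimal pure-Python sketch. It is not the Megatron Bridge or Hugging Face implementation: the function names (`rms_norm_zero_centered`, `gelu_tanh`, `geglu`) are hypothetical, the `eps` value is an assumed default, and the tanh approximation of GELU is assumed for the gate activation.

```python
import math


def rms_norm_zero_centered(x, gamma, eps=1e-6):
    """RMSNorm where the stored weight `gamma` is an offset from 1 (illustrative)."""
    mean_sq = sum(v * v for v in x) / len(x)
    inv_rms = 1.0 / math.sqrt(mean_sq + eps)
    # Effective scale is (1 + gamma), so gamma == 0 leaves the normalized value unscaled.
    return [v * inv_rms * (1.0 + g) for v, g in zip(x, gamma)]


def gelu_tanh(v):
    """Tanh approximation of GELU (assumed here for the gate activation)."""
    inner = math.sqrt(2.0 / math.pi) * (v + 0.044715 * v ** 3)
    return 0.5 * v * (1.0 + math.tanh(inner))


def geglu(gate, up):
    """GeGLU-style gating: GELU(gate) * up, elementwise."""
    return [gelu_tanh(g) * u for g, u in zip(gate, up)]


# With gamma all zeros, the output has a root mean square close to 1.
normed = rms_norm_zero_centered([3.0, 4.0], gamma=[0.0, 0.0])

# A gate value of 0 blocks the corresponding "up" value entirely.
gated = geglu([0.0, 1.0], [5.0, 2.0])
```

In a full gated MLP, `gate` and `up` would be two linear projections of the same hidden state, and the gated product would pass through a final down projection.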
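Examples
Gemma uses the common conversion and generation entry points. The first command imports the Hugging Face checkpoint into Megatron format; the second generates text using the converted checkpoint: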
uv run python examples/conversion/convert_checkpoints.py import \
    --hf-model google/gemma-2b \
    --megatron-path /checkpoints/gemma_2b_megatron

uv run python examples/conversion/hf_to_megatron_generate_text.py \
    --hf_model_path google/gemma-2b \
    --megatron_model_path /checkpoints/gemma_2b_megatron \
    --prompt "What is artificial intelligence?"
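
When the two steps need to run back to back, for example in a smoke test, they could be chained from a small Python wrapper. The sketch below only assembles the same command lines shown above; the helper names `build_gemma_commands` and `run_all` are hypothetical, and it assumes the script is launched from the repository root with `uv` available. It defaults to a dry run that prints the commands without executing them.

```python
import shlex
import subprocess


def build_gemma_commands(hf_model, megatron_path, prompt):
    """Return argv lists for the import step and the generation step."""
    convert = [
        "uv", "run", "python", "examples/conversion/convert_checkpoints.py", "import",
        "--hf-model", hf_model,
        "--megatron-path", megatron_path,
    ]
    generate = [
        "uv", "run", "python", "examples/conversion/hf_to_megatron_generate_text.py",
        "--hf_model_path", hf_model,
        "--megatron_model_path", megatron_path,
        "--prompt", prompt,
    ]
    return [convert, generate]


def run_all(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for argv in commands:
        print(shlex.join(argv))
        if not dry_run:
            # check=True stops the sequence at the first failing step.
            subprocess.run(argv, check=True)


commands = build_gemma_commands(
    hf_model="google/gemma-2b",
    megatron_path="/checkpoints/gemma_2b_megatron",
    prompt="What is artificial intelligence?",
)
run_all(commands)  # dry run: prints the two commands only
```

Passing `dry_run=False` would execute the conversion first and the generation second, so generation is skipped if the import fails.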