GLM-4 MoE (GLM-4.5 / GLM-4.7)
GLM-4.5 and GLM-4.7 are Mixture-of-Experts (MoE) variants of the GLM family, released under the zai-org organization on Hugging Face. GLM-4.7-Flash is a lighter variant with fewer active parameters.
Available Models
- GLM-4.5-Air (Glm4MoeForCausalLM)
- GLM-4.7 (Glm4MoeForCausalLM)
- GLM-4.7-Flash (Glm4MoeLiteForCausalLM): lightweight MoE variant
Architectures
- Glm4MoeForCausalLM: GLM-4.5, GLM-4.7
- Glm4MoeLiteForCausalLM: GLM-4.7-Flash
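The model-to-architecture pairing listed above can be written as a small lookup, which may help when checking which class a given checkpoint is expected to use. This is a minimal sketch, not part of NeMo AutoModel: the `ARCHITECTURES` table and the `architecture_for` helper are hypothetical names, and the `zai-org/<model>` id form is assumed from the organization name mentioned earlier.

```python
# Illustrative only: table contents come from the lists above;
# the names ARCHITECTURES and architecture_for are hypothetical.
ARCHITECTURES = {
    "Glm4MoeForCausalLM": ["GLM-4.5", "GLM-4.5-Air", "GLM-4.7"],
    "Glm4MoeLiteForCausalLM": ["GLM-4.7-Flash"],
}


def architecture_for(model_name: str) -> str:
    """Return the architecture class name listed for a model.

    Accepts a bare name ("GLM-4.7") or an assumed hub-style id
    ("zai-org/GLM-4.7"); only the part after the last "/" is compared.
    """
    short_name = model_name.split("/")[-1]
    for arch, models in ARCHITECTURES.items():
        if short_name in models:
            return arch
    raise KeyError(f"No architecture listed for {model_name!r}")


if __name__ == "__main__":
    for name in ("GLM-4.5-Air", "zai-org/GLM-4.7", "GLM-4.7-Flash"):
        print(name, "->", architecture_for(name))
```

An exact-match lookup is used rather than prefix matching so that GLM-4.7 and GLM-4.7-Flash cannot be confused with each other.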
Example HF Models
Example Recipes
Try with NeMo AutoModel
1. Install (full instructions):
2. Clone the repo to get the example recipes:
This recipe was validated on 8 nodes × 8 GPUs (64 H100s). See the Launcher Guide for multi-node setup.
3. Run the recipe from inside the repo:
Run with Docker
1. Pull the container and mount a checkpoint directory:
2. Navigate to the AutoModel directory (where the recipes are):
3. Run the recipe:
See the Installation Guide and LLM Fine-Tuning Guide.
Fine-Tuning
See the LLM Fine-Tuning Guide and the Large MoE Fine-Tuning Guide.