Diffusion Language Models (dLLMs)


Diffusion language models (dLLMs) generate text by denoising rather than left-to-right autoregression. A fixed-length response “canvas” is corrupted and then iteratively refined, so tokens are produced in parallel and can be revised across steps. NeMo AutoModel supports fine-tuning block-diffusion dLLMs with the same recipe-driven, FSDP2/Expert-Parallel training stack used for LLMs and VLMs.
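To make the "canvas" idea concrete, here is a toy sketch of iterative refinement in plain Python. It is not NeMo AutoModel code: `toy_denoiser` and `refine` are hypothetical names, and the denoiser is a stand-in that peeks at a known target instead of running a model. It only illustrates that every position is updated in the same step and that a position may change again later.

```python
import random


def toy_denoiser(canvas, target, rng, fix_prob):
    """Hypothetical stand-in for a model.

    Proposes a token for every position at once: each position is set to
    the target token with probability fix_prob, otherwise left unchanged.
    A real dLLM would predict tokens from the noisy canvas instead.
    """
    return [t if rng.random() < fix_prob else c for c, t in zip(canvas, target)]


def refine(target, vocab, steps=8, seed=0):
    """Refine a fully corrupted, fixed-length canvas over several steps."""
    rng = random.Random(seed)
    # Start from a canvas of random tokens with the same length as the target.
    canvas = [rng.choice(vocab) for _ in target]
    for step in range(1, steps + 1):
        # All positions are updated in one step (parallel), and the
        # probability of a correct token rises to 1.0 at the final step.
        canvas = toy_denoiser(canvas, target, rng, fix_prob=step / steps)
    return canvas


if __name__ == "__main__":
    vocab = ["the", "cat", "sat", "on", "a", "mat"]
    target = ["the", "cat", "sat", "on", "the", "mat"]
    print(refine(target, vocab))
```

The canvas length never changes during refinement, which is the main contrast with autoregressive decoding, where the sequence grows one token at a time.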

Supported Models

| Owner | Model Family | Architectures |
| --- | --- | --- |
| Google | DiffusionGemma | DiffusionGemmaForBlockDiffusion |

Fine-Tuning

See the DiffusionGemma Fine-Tuning Guide for the block-diffusion training objective (uniform-random token corruption, no [MASK]), self-conditioning, and the supported feature set (SFT, LoRA, Expert Parallelism, activation checkpointing).
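As a rough illustration of uniform-random token corruption without a [MASK] token, the sketch below replaces each token id with a uniformly sampled vocabulary id at a given probability. The function name `corrupt_uniform`, the per-token `noise_level` parameter, and the choice to allow a sampled id to coincide with the original are all assumptions made for illustration; the fine-tuning guide is the reference for the actual objective.

```python
import random


def corrupt_uniform(token_ids, vocab_size, noise_level, rng):
    """Corrupt a token sequence with uniformly random vocabulary ids.

    Each position is replaced with probability noise_level. No special
    [MASK] id is used: the replacement is an ordinary vocabulary id, so it
    may occasionally equal the original token (an assumption of this sketch).
    Returns the corrupted ids and a per-position "was resampled" flag.
    """
    corrupted = []
    resampled = []
    for tok in token_ids:
        if rng.random() < noise_level:
            corrupted.append(rng.randrange(vocab_size))
            resampled.append(True)
        else:
            corrupted.append(tok)
            resampled.append(False)
    return corrupted, resampled


if __name__ == "__main__":
    rng = random.Random(0)
    clean = [11, 42, 7, 99, 3, 58]
    noisy, flags = corrupt_uniform(clean, vocab_size=100, noise_level=0.5, rng=rng)
    print(noisy, flags)
```

Because corrupted positions hold ordinary tokens, the corrupted sequence alone does not reveal which positions are noise, unlike a masked-token scheme where [MASK] marks them explicitly.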