Diffusion | NVIDIA Dynamo Documentation

Overview

Dynamo supports serving diffusion models across multiple backends, enabling generation of images and video from text prompts. Backends expose diffusion capabilities through the same Dynamo pipeline infrastructure used for LLM inference, including frontend routing, scaling, and observability.

Support Matrix

Modality	vLLM-Omni	SGLang	TRT-LLM
Text-to-Text	✅	✅	❌
Text-to-Image	✅	✅	❌
Text-to-Video	✅	✅	✅
Image-to-Video	✅	❌	❌

Status: ✅ Supported | ❌ Not supported
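
As a rough illustration of how the matrix above might be consulted when choosing a backend, the sketch below transcribes it into a plain Python mapping. The names `SUPPORT_MATRIX` and `backends_for` are hypothetical helpers written for this example and are not part of Dynamo's API. The values are copied from the table and would need to be kept in sync with it by hand.

```python
# Illustrative only: a hand-copied transcription of the support matrix.
# SUPPORT_MATRIX and backends_for are hypothetical names, not Dynamo APIs.
SUPPORT_MATRIX = {
    "text-to-text":   {"vLLM-Omni": True, "SGLang": True,  "TRT-LLM": False},
    "text-to-image":  {"vLLM-Omni": True, "SGLang": True,  "TRT-LLM": False},
    "text-to-video":  {"vLLM-Omni": True, "SGLang": True,  "TRT-LLM": True},
    "image-to-video": {"vLLM-Omni": True, "SGLang": False, "TRT-LLM": False},
}


def backends_for(modality: str) -> list[str]:
    """Return the backends marked as supported for a modality (case-insensitive)."""
    row = SUPPORT_MATRIX.get(modality.lower())
    if row is None:
        raise ValueError(f"Unknown modality: {modality!r}")
    # Dicts preserve insertion order, so results follow the table's column order.
    return [backend for backend, supported in row.items() if supported]


if __name__ == "__main__":
    for modality in SUPPORT_MATRIX:
        print(modality, "->", ", ".join(backends_for(modality)))
```

A lookup like this only restates the table. The backend pages listed below remain the reference for what each backend actually supports.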

Backend Documentation

See the following pages for each backend's deployment guides, configuration, and examples:

vLLM-Omni
SGLang Diffusion
TRT-LLM Diffusion
FastVideo (custom worker)