Model Matrix
Refer to the table below for world foundation models (WFMs) available in Cosmos Predict2, along with their supported workflows and compute requirements.
Note
We recommend using NVIDIA H100-80GB or A100-80GB GPUs for inference and post-training.
| Model | Description | Inference: Compute Requirements | Inference: Multi-GPU Supported | Post-Training: Supported | Post-Training: Compute Requirements |
|---|---|---|---|---|---|
| | Diffusion-based text-to-image generation (2 billion parameters) | 1 GPU | Yes | Yes | 8 GPUs |
| | Diffusion-based text-to-image generation (14 billion parameters) | 1 GPU | No | No | – |
| | Diffusion-based video and text to visual world generation (2 billion parameters) | 1 GPU | Yes | Yes | 8 GPUs |
| | Diffusion-based video and text to visual world generation (14 billion parameters) | 1 GPU | No | No | – |
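
For scripted environment checks, the matrix can be encoded as plain data and queried. The following is a minimal sketch, not part of Cosmos Predict2: the `ModelEntry` class, the `MODEL_MATRIX` list, the `post_trainable` helper, and the short task labels are all hypothetical names introduced for illustration, and the entries simply mirror the four rows of the table.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class ModelEntry:
    """One row of the model matrix (hypothetical structure, for illustration)."""
    task: str                          # short label standing in for the row's description
    params_billion: int                # model size in billions of parameters
    inference_gpus: int                # GPUs required for inference
    multi_gpu_inference: bool          # whether multi-GPU inference is supported
    post_training: bool                # whether post-training is supported
    post_training_gpus: Optional[int]  # GPUs required for post-training, None if unsupported


# Values copied from the table; the task labels are assumptions, not official model names.
MODEL_MATRIX: List[ModelEntry] = [
    ModelEntry("text-to-image", 2, 1, True, True, 8),
    ModelEntry("text-to-image", 14, 1, False, False, None),
    ModelEntry("video-and-text-to-world", 2, 1, True, True, 8),
    ModelEntry("video-and-text-to-world", 14, 1, False, False, None),
]


def post_trainable(available_gpus: int) -> List[ModelEntry]:
    """Return the entries that can be post-trained with the given number of GPUs."""
    return [
        m for m in MODEL_MATRIX
        if m.post_training
        and m.post_training_gpus is not None
        and m.post_training_gpus <= available_gpus
    ]


if __name__ == "__main__":
    for entry in post_trainable(available_gpus=8):
        print(f"{entry.task} ({entry.params_billion}B): {entry.post_training_gpus} GPUs")
```

Keeping the rows as immutable records makes it straightforward to add further filters, for example on `multi_gpu_inference`, without changing the data.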