Cosmos-Predict1#

Cosmos-Predict1 is a collection of general-purpose world foundation models (WFMs) for inference, along with scripts for post-training these models for specific Physical AI use cases.

The architecture of Cosmos-Predict1 is shown in the following figure:

[Figure: Cosmos-Predict1 architecture diagram (predict1_diagram.png)]

Cosmos-Predict1 includes the following components:

  • Diffusion Models: Generate visual simulations using text or video prompts.

  • Autoregressive Models: Generate visual simulations using video prompts along with optional text prompts.

  • Tokenizers: Encode images or videos into either continuous tokens (latent vectors) or discrete tokens (integers).

  • Post-training Scripts: Help developers post-train the diffusion and autoregressive models for their particular Physical AI use cases.

  • Pre-training Scripts: Help developers train their WFMs from scratch.
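
The difference between continuous and discrete tokens in the Tokenizers item above can be illustrated with a toy sketch. This is not the Cosmos tokenizer and does not use its API: the `encode_continuous` and `quantize` functions, the 1-D "image", and the three-entry codebook are all hypothetical, chosen only to show that continuous tokens are float vectors while discrete tokens are integer indices.

```python
# Toy illustration only: this is NOT the Cosmos tokenizer or its API.
# It contrasts continuous tokens (float vectors) with discrete tokens (integers).

def encode_continuous(pixels, patch_size=4):
    """Map each patch of pixel values to a small float vector (mean, spread)."""
    tokens = []
    for start in range(0, len(pixels), patch_size):
        patch = pixels[start:start + patch_size]
        mean = sum(patch) / len(patch)
        spread = max(patch) - min(patch)
        tokens.append((mean, spread))
    return tokens


def quantize(continuous_tokens, codebook):
    """Replace each float vector with the index of its nearest codebook entry."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return [
        min(range(len(codebook)), key=lambda i: sq_dist(token, codebook[i]))
        for token in continuous_tokens
    ]


# Hypothetical 1-D "image": eight grayscale values in [0, 1].
pixels = [0.0, 0.1, 0.0, 0.1, 0.9, 1.0, 0.9, 1.0]
# Hypothetical codebook entries: dark and flat, bright and flat, high contrast.
codebook = [(0.0, 0.0), (1.0, 0.0), (0.5, 1.0)]

continuous_tokens = encode_continuous(pixels)            # list of float vectors
discrete_tokens = quantize(continuous_tokens, codebook)  # list of integers

print(continuous_tokens)
print(discrete_tokens)
```

In this sketch the two patches map to codebook indices 0 and 1. A real tokenizer learns its encoder and quantization scheme from data instead of using hand-written statistics and a fixed codebook.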

Examples#

Cosmos-Predict1-7B-Text2World-Multiview#

This video shows the text input and the corresponding multiview output generated by the Cosmos-Predict1-7B-Text2World-Multiview diffusion model.

Cosmos-Predict1-5B-Video2World#

This video shows the text and image input and the corresponding video output generated by the Cosmos-Predict1-5B-Video2World autoregressive model.

Getting Started Workflow#

Follow these steps to explore the capabilities of Cosmos-Predict1:

  1. Use the Model Matrix page to determine the best model for your use case. Note that only a subset of the models currently supports post-training.

  2. Review the Prerequisites page and follow the Installation guide.

  3. Follow the steps in the Diffusion Quickstart Guide or Autoregressive Quickstart Guide to get familiar with the inference process.

  4. If you want to post-train a model for a particular Physical AI use case, follow the steps in the Diffusion Post-Training Guide or Autoregressive Post-Training Guide.

  5. To learn more about the inference options available for each model, refer to the Diffusion Model Reference or Autoregressive Model Reference.
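
As a rough aid for step 1, the input requirements listed in the component overview (diffusion models take text or video prompts; autoregressive models take video prompts with optional text) can be written as a small filter. This is a minimal sketch: `candidate_model_families` is a hypothetical helper, not part of Cosmos-Predict1, and the Model Matrix page remains the authoritative source for specific model names and post-training support.

```python
def candidate_model_families(has_text_prompt: bool, has_video_prompt: bool) -> list[str]:
    """Return the model families whose stated inputs match the available prompts.

    Hypothetical helper based only on the component descriptions in this page.
    """
    families = []
    # Diffusion models generate from text or video prompts.
    if has_text_prompt or has_video_prompt:
        families.append("diffusion")
    # Autoregressive models need a video prompt; the text prompt is optional.
    if has_video_prompt:
        families.append("autoregressive")
    return families


# Text-only input narrows the choice to diffusion models.
print(candidate_model_families(has_text_prompt=True, has_video_prompt=False))
# A video prompt, with or without text, leaves both families as candidates.
print(candidate_model_families(has_text_prompt=True, has_video_prompt=True))
```

The helper narrows the choice to a model family only. Picking a specific checkpoint still depends on the sizes and capabilities listed in the Model Matrix.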