Autoregressive Quickstart Guide

This page walks you through setting up and running inference with Cosmos-Predict1-4B, a pre-trained autoregressive model.

Set up the Autoregressive Model

  1. Ensure you have the necessary hardware and software, as outlined on the Prerequisites page.

  2. Follow the Installation guide to download the Cosmos-Predict1 repo and set up the conda environment.

  3. Generate a Hugging Face access token. Set the access token permission to ‘Read’ (the default permission is ‘Fine-grained’).

  4. Log in to Hugging Face with the access token:

    huggingface-cli login
    
  5. Download the model weights for Cosmos-Predict1-4B from Hugging Face:

    CUDA_HOME=$CONDA_PREFIX PYTHONPATH=$(pwd) python scripts/download_autoregressive_checkpoints.py --model_sizes 4B
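
The login and download steps above can be sanity-checked with a small sketch like the following. It rests on two assumptions that this page does not state: that you keep your token in an environment variable named HF_TOKEN, and that the download script places the weights under checkpoints/Cosmos-Predict1-4B (inferred from the --checkpoint_dir and --ar_model_dir values used for inference below). The function names are hypothetical helpers, not part of the Cosmos-Predict1 repo.

```shell
# Hypothetical helper: log in non-interactively using a token stored in
# the HF_TOKEN environment variable (an assumption, not a repo convention).
hf_login_from_env() {
  if [ -z "${HF_TOKEN:-}" ]; then
    echo "HF_TOKEN is not set" >&2
    return 1
  fi
  huggingface-cli login --token "$HF_TOKEN"
}

# Hypothetical helper: confirm that the model directory exists.
# The default path is assumed from the inference flags on this page.
check_checkpoints() {
  ckpt_dir="${1:-checkpoints}"
  model_dir="${2:-Cosmos-Predict1-4B}"
  if [ -d "$ckpt_dir/$model_dir" ]; then
    echo "found: $ckpt_dir/$model_dir"
  else
    echo "missing: $ckpt_dir/$model_dir" >&2
    return 1
  fi
}
```

Running check_checkpoints from the repo root after the download finishes gives a quick signal that the weights landed where the inference command expects them.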
    

Generate a Video using Video Input

Use the Cosmos-Predict1-4B model to generate a video from video or image input. The following example passes the input.mp4 example video as the --input_image_or_video_path argument.

CUDA_HOME=$CONDA_PREFIX PYTHONPATH=$(pwd) python cosmos_predict1/autoregressive/inference/base.py \
    --checkpoint_dir checkpoints \
    --ar_model_dir Cosmos-Predict1-4B \
    --input_type video \
    --input_image_or_video_path assets/autoregressive/input.mp4 \
    --top_p 0.8 \
    --temperature 1.0 \
    --offload_diffusion_decoder \
    --offload_tokenizer \
    --video_save_name autoregressive-4b
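
To run the same command over several inputs, one option is to wrap it in a shell function. The following is a minimal sketch rather than a script shipped with the repo: run_ar_inference and the DRY_RUN variable are hypothetical names, and the save name is derived from the input file name purely for illustration. All flags are copied from the command above.

```shell
# Hypothetical wrapper around the quickstart inference command.
# Set DRY_RUN=1 to print the command instead of executing it.
run_ar_inference() {
  input="$1"
  # Derive a save name from the file name,
  # e.g. input.mp4 -> autoregressive-4b-input
  base="$(basename "$input")"
  name="autoregressive-4b-${base%.*}"
  ${DRY_RUN:+echo} env CUDA_HOME="${CONDA_PREFIX:-}" PYTHONPATH="$(pwd)" \
    python cosmos_predict1/autoregressive/inference/base.py \
    --checkpoint_dir checkpoints \
    --ar_model_dir Cosmos-Predict1-4B \
    --input_type video \
    --input_image_or_video_path "$input" \
    --top_p 0.8 \
    --temperature 1.0 \
    --offload_diffusion_decoder \
    --offload_tokenizer \
    --video_save_name "$name"
}
```

For example, with DRY_RUN=1 set, calling run_ar_inference assets/autoregressive/input.mp4 prints the full command so you can inspect it before starting a run. Without DRY_RUN, it executes the command from the repo root.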

Next Steps

To adapt an autoregressive model to your use case, see the Autoregressive Model Post-Training Guide. To explore all autoregressive model input/output options, see the Autoregressive Model Reference.