World Scenario Video Generation#
This page details how to generate world scenario videos from 3D scene annotations for use with Cosmos-Transfer2.5.
Additional Requirements#
In addition to the standard Transfer2.5 Prerequisites, you will need the following:
UV (for dependency management)
A GPU with EGL support (for headless OpenGL rendering)
3D scene annotation data in Parquet format
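The requirements above can be partly checked before installing anything. The following is a minimal sketch, not part of the Cosmos-Transfer2.5 tooling: the helper name check_prerequisites is hypothetical, and the check only looks for the uv executable and the EGL/OpenGL shared libraries. It does not prove that the GPU driver actually supports headless EGL rendering.

```python
import ctypes.util
import shutil


def check_prerequisites():
    """Report whether uv and the EGL/OpenGL shared libraries can be located."""
    return {
        # uv must be on PATH for `uv sync` to work.
        "uv": shutil.which("uv") is not None,
        # find_library returns None when the shared library cannot be found.
        "libEGL": ctypes.util.find_library("EGL") is not None,
        "libGL": ctypes.util.find_library("GL") is not None,
    }


for name, found in check_prerequisites().items():
    print(f"{name}: {'found' if found else 'NOT found'}")
```

A "NOT found" result for libEGL or libGL suggests installing the GPU drivers and EGL packages before rendering.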
Install Dependencies#
Use the following command to install dependencies:
cd packages/cosmos-transfer2
uv sync
source .venv/bin/activate # On Windows: .venv\Scripts\activate
Generate Control Videos#
The following command will generate control videos (videos for all seven cameras are generated by default):
python scripts/generate_control_videos.py /path/to/{input_root} ./{save_root}
The following command will generate a control video for the front:wide:120fov and cross:right:120fov cameras:
python scripts/generate_control_videos.py {input_root}/ {save_root}/ \
--cameras "camera:front:wide:120fov,camera:cross:right:120fov"
Command Options#
Option | Default | Description |
---|---|---|
--cameras | all | A comma-separated list of camera names, or “all” for all seven cameras |
Available Cameras#
camera:front:wide:120fov
camera:front:tele:sat:30fov
camera:cross:right:120fov
camera:cross:left:120fov
camera:rear:left:70fov
camera:rear:right:70fov
camera:rear:tele:30fov
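Because the --cameras value is a single comma-separated string, a typo in one name is easy to miss. As an illustrative sketch (the helper build_cameras_arg is hypothetical and not part of the script), the value can be assembled and validated against the camera names listed above before the command is run:

```python
# Camera names copied from the "Available Cameras" list on this page.
AVAILABLE_CAMERAS = (
    "camera:front:wide:120fov",
    "camera:front:tele:sat:30fov",
    "camera:cross:right:120fov",
    "camera:cross:left:120fov",
    "camera:rear:left:70fov",
    "camera:rear:right:70fov",
    "camera:rear:tele:30fov",
)


def build_cameras_arg(cameras):
    """Return a comma-separated value for --cameras, rejecting unknown names."""
    unknown = [name for name in cameras if name not in AVAILABLE_CAMERAS]
    if unknown:
        raise ValueError(f"Unknown camera name(s): {', '.join(unknown)}")
    return ",".join(cameras)


print(build_cameras_arg(["camera:front:wide:120fov", "camera:cross:right:120fov"]))
```

The printed string can be passed directly as the value of --cameras.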
Data Format#
Input Structure#
scene_annotations_directory/
├── uuid.obstacle.parquet (required)
├── uuid.calibration_estimate.parquet (required)
├── uuid.egomotion_estimate.parquet (required)
├── uuid.lane.parquet (optional)
├── uuid.lane_line.parquet (optional)
└── ... (other optional parquet files)
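A quick way to catch missing inputs before rendering is to check for the three required files. The sketch below assumes that uuid in the tree above stands for a per-clip identifier and that the files sit directly in the input directory. The helper missing_required_files is hypothetical and is not part of the script.

```python
from pathlib import Path

# Suffixes of the files marked "(required)" in the input structure.
REQUIRED_SUFFIXES = (
    "obstacle.parquet",
    "calibration_estimate.parquet",
    "egomotion_estimate.parquet",
)


def missing_required_files(input_root, clip_id):
    """Return the required Parquet file names that are absent for one clip."""
    root = Path(input_root)
    expected = [f"{clip_id}.{suffix}" for suffix in REQUIRED_SUFFIXES]
    return [name for name in expected if not (root / name).is_file()]
```

An empty list means all required files for that clip are present.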
Output Structure#
save_root/
└── uuid/
    ├── uuid.camera_front_wide_120fov.mp4
    ├── uuid.camera_front_tele_sat_30fov.mp4
    ├── uuid.camera_cross_right_120fov.mp4
    ├── uuid.camera_cross_left_120fov.mp4
    ├── uuid.camera_rear_left_70fov.mp4
    ├── uuid.camera_rear_right_70fov.mp4
    └── uuid.camera_rear_tele_30fov.mp4
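The file names in the output tree appear to follow the camera names with each colon replaced by an underscore. Assuming that pattern holds, and that uuid stands for a per-clip identifier, the expected location of a video can be derived as follows. The helper expected_output_path is hypothetical and only illustrates the naming.

```python
from pathlib import Path


def expected_output_path(save_root, clip_id, camera):
    """Map a camera name such as camera:front:wide:120fov to its video path."""
    # Colons in the camera name become underscores in the file name.
    file_name = f"{clip_id}.{camera.replace(':', '_')}.mp4"
    return Path(save_root) / clip_id / file_name


print(expected_output_path("save_root", "uuid", "camera:front:wide:120fov"))
```

This can be used to confirm that a run produced a video for every requested camera.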
Rendered Elements#
The following elements are always rendered:
3D bounding boxes for vehicles/pedestrians (from the required obstacle.parquet file)
The following elements are optionally rendered if the corresponding Parquet file is provided:
Lane lines, lanes, road boundaries
Crosswalks, poles, road markings, wait lines
Traffic lights and signs
Troubleshooting#
ModernGL/EGL errors
Install GPU drivers and EGL libraries (libGL.so.1, libEGL.so.1). On Ubuntu/Debian: apt install libegl1-mesa-dev libgl1-mesa-dri
Missing parquet files
Ensure the required files exist: obstacle.parquet, calibration_estimate.parquet, egomotion_estimate.parquet.
Memory issues
Process fewer cameras at once if needed.
Invalid camera names
Run with --help to see valid options.
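For the memory issue above, one way to process fewer cameras at once is to invoke the script once per camera. The following is a sketch under stated assumptions: it uses only the documented positional arguments and the --cameras option, it assumes the working directory is packages/cosmos-transfer2, and the helper names per_camera_commands and run_sequentially are hypothetical.

```python
import subprocess
import sys

# Script path as used in the commands on this page.
SCRIPT = "scripts/generate_control_videos.py"


def per_camera_commands(input_root, save_root, cameras):
    """Build one command line per camera instead of one for all cameras."""
    return [
        [sys.executable, SCRIPT, input_root, save_root, "--cameras", camera]
        for camera in cameras
    ]


def run_sequentially(commands):
    """Run each command in turn and stop at the first failure."""
    for command in commands:
        subprocess.run(command, check=True)
```

Calling run_sequentially(per_camera_commands(input_root, save_root, cameras)) renders one camera per process, which trades total run time for a smaller per-run footprint.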
Next Steps#
The generated control videos serve as conditioning inputs for Cosmos-Transfer2.5 multiview inference. The HD map visualizations provide spatial context for video generation tasks.