Cosmos-Reason1#

Cosmos-Reason1 is an open , customizable, reasoning vision language model (VLM) for physical AI and robotics. It enables robots and vision AI agents to reason like humans, using prior knowledge, physics understanding, and common sense to understand and act in the real world. This model understands space, time, and fundamental physics, and can serve as a planning model to reason what steps an embodied agent might take next.

Cosmos-Reason1 excels at navigating the long tail of diverse physical world scenarios with spatial-temporal understanding. The Cosmos-Reason1 model is post-trained with physical common sense and embodied reasoning data, including supervised fine-tuning and reinforcement learning. It uses chain-of-thought reasoning capabilities to understand world dynamics without human annotations.