Cosmos Guardrail

This page outlines the set of tools used to ensure content safety in Cosmos. For implementation details, consult the Cosmos paper.

The Cosmos-1.0-Guardrail model is integrated into the diffusion and autoregressive world generation pipelines and cannot be disabled.

Overview

The Cosmos Guardrail system consists of two stages: pre-guard and post-guard.
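
The two-stage flow can be sketched as a thin wrapper around a generator. This is a minimal illustration, not the Cosmos implementation: the function `guarded_generate`, its parameters, and the convention that a guard returns `None` to block are all assumptions made for this example.

```python
from typing import Callable, List, Optional, Sequence

# Hypothetical signatures for illustration only; not the Cosmos API.
PreGuard = Callable[[str], bool]                    # True means the prompt is safe
PostGuard = Callable[[List], Optional[List]]        # None means the video is blocked


def guarded_generate(
    prompt: str,
    generate: Callable[[str], Sequence],
    pre_guards: Sequence[PreGuard],
    post_guards: Sequence[PostGuard],
) -> Optional[List]:
    """Run text checks, generate frames, then run frame checks."""
    # Stage 1 (pre-guard): reject the prompt before any generation happens.
    for is_safe in pre_guards:
        if not is_safe(prompt):
            return None

    frames = list(generate(prompt))

    # Stage 2 (post-guard): each guard may block the video or modify frames.
    for guard in post_guards:
        result = guard(frames)
        if result is None:
            return None
        frames = result
    return frames
```

In this sketch a blocked prompt never reaches the generator, so the cost of generation is only paid for prompts that pass every text check.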

Pre-Guard

Cosmos pre-guard models are applied to text input, including input prompts and upsampled prompts.

  • Blocklist: A keyword-list checker that detects harmful keywords

  • Aegis: An LLM-based approach for blocking harmful prompts
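
As a rough sketch of what a keyword-list check can look like, the example below matches whole words and phrases case-insensitively. The function names, the example keywords, and the matching rules are assumptions for illustration; the actual Cosmos blocklist and its matching logic are not described on this page. The LLM-based check is not sketched here.

```python
import re
from typing import Iterable


def build_blocklist_pattern(keywords: Iterable[str]) -> "re.Pattern[str]":
    """Compile a whole-word pattern from a keyword list (hypothetical helper)."""
    # Longest entries first so multi-word phrases are tried before their parts.
    escaped = sorted((re.escape(k.lower()) for k in keywords), key=len, reverse=True)
    if not escaped:
        # An empty blocklist must block nothing; "(?!)" never matches.
        return re.compile(r"(?!)")
    return re.compile(r"\b(?:" + "|".join(escaped) + r")\b")


def is_prompt_blocked(prompt: str, pattern: "re.Pattern[str]") -> bool:
    """Return True if the prompt contains any blocklisted keyword."""
    return pattern.search(prompt.lower()) is not None
```

Word-boundary matching avoids flagging a harmless word that merely contains a blocklisted keyword as a substring, at the cost of missing deliberate misspellings; that trade-off is one reason a keyword check is typically paired with a model-based check.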

Post-Guard

Cosmos post-guard models are applied to video frames generated by Cosmos models.

  • Video Content Safety Filter: A classifier trained to distinguish between safe and unsafe video frames

  • Face Blur Filter: A face detection and blurring module
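
The two post-guard steps can be illustrated with a small sketch. Everything here is hypothetical: `is_unsafe_frame` and `detect_faces` stand in for the classifier and the face detector, frames are plain nested lists of grayscale values, the "blur" is a simple mean fill, and the policy of blocking the whole video when any frame is flagged is an assumption rather than documented Cosmos behavior.

```python
from typing import Callable, List, Optional, Sequence, Tuple

Frame = List[List[float]]          # grayscale frame as rows of pixel values
Box = Tuple[int, int, int, int]    # (top, left, bottom, right), end-exclusive


def blur_region(frame: Frame, box: Box) -> Frame:
    """Return a copy of the frame with the box replaced by its mean value."""
    top, left, bottom, right = box
    pixels = [frame[r][c] for r in range(top, bottom) for c in range(left, right)]
    if not pixels:
        return frame
    mean = sum(pixels) / len(pixels)
    out = [row[:] for row in frame]  # copy so the input frame is not mutated
    for r in range(top, bottom):
        for c in range(left, right):
            out[r][c] = mean
    return out


def post_guard(
    frames: Sequence[Frame],
    is_unsafe_frame: Callable[[Frame], bool],
    detect_faces: Callable[[Frame], Sequence[Box]],
) -> Optional[List[Frame]]:
    """Block the video if any frame is unsafe; otherwise blur detected faces."""
    # Assumed policy: a single unsafe frame blocks the entire video.
    if any(is_unsafe_frame(f) for f in frames):
        return None
    result = []
    for frame in frames:
        for box in detect_faces(frame):
            frame = blur_region(frame, box)
        result.append(frame)
    return result
```

Running the safety check before blurring means no blurring work is done on a video that will be rejected anyway.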