Cosmos Guardrail
This page outlines the set of tools used to ensure content safety in Cosmos. For implementation details, consult the Cosmos paper.
The Cosmos-1.0-Guardrail model is integrated into the diffusion and autoregressive world generation pipelines and cannot be disabled.
Overview
The Cosmos Guardrail system consists of two stages: pre-guard and post-guard.
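As a rough illustration of where the two stages sit relative to generation, the flow can be sketched as follows. This is a minimal sketch, not the Cosmos API: `guarded_generate` and its three callable parameters are hypothetical names, and returning `None` for blocked content is an assumption made for the example.

```python
from typing import Callable, Optional, TypeVar

Video = TypeVar("Video")


def guarded_generate(
    prompt: str,
    pre_guard: Callable[[str], bool],
    generate: Callable[[str], Video],
    post_guard: Callable[[Video], Optional[Video]],
) -> Optional[Video]:
    """Wrap a generator with a text check before it and a video check after it.

    All names here are hypothetical; they only illustrate the stage order.
    """
    # Stage 1 (pre-guard): reject unsafe text before any generation happens.
    if not pre_guard(prompt):
        return None
    video = generate(prompt)
    # Stage 2 (post-guard): filter or modify the generated video.
    return post_guard(video)
```

One consequence of this ordering is that a blocked prompt never reaches the generator, so no compute is spent on it.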
Pre-Guard
Cosmos pre-guard models are applied to text input, including input prompts and upsampled prompts.
Blocklist: A keyword list checker for detecting harmful keywords
Aegis: An LLM-based approach for blocking harmful prompts
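The two pre-guard checks can be pictured with a small sketch. Everything below is illustrative: `EXAMPLE_BLOCKLIST`, `blocklist_check`, and `pre_guard` are hypothetical names, the keyword list is a placeholder rather than the real Cosmos blocklist, and the LLM-based check is passed in as a plain callable instead of calling Aegis. Running the keyword check before the LLM check is also an assumption, chosen here because it is the cheaper of the two.

```python
import re
from typing import Callable, Iterable

# Placeholder keywords; the actual Cosmos blocklist is not reproduced here.
EXAMPLE_BLOCKLIST = frozenset({"badword", "worseword"})


def blocklist_check(prompt: str, blocklist: Iterable[str] = EXAMPLE_BLOCKLIST) -> bool:
    """Return True if the prompt contains no blocklisted keyword."""
    # Lowercase and split into word tokens so matching ignores case and punctuation.
    tokens = set(re.findall(r"[a-z0-9']+", prompt.lower()))
    return tokens.isdisjoint(blocklist)


def pre_guard(prompt: str, llm_is_safe: Callable[[str], bool]) -> bool:
    """Return True only if both the keyword check and the LLM-based check pass.

    `llm_is_safe` stands in for an LLM-based safety classifier (hypothetical).
    """
    return blocklist_check(prompt) and llm_is_safe(prompt)
```

Because the source states that pre-guard applies to both input prompts and upsampled prompts, a caller following this sketch would run `pre_guard` on each of them.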
Post-Guard
Cosmos post-guard models are applied to video frames generated by Cosmos models.
Video Content Safety Filter: A classifier trained to distinguish between safe and unsafe video frames
Face Blur Filter: A face detection and blurring module
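The post-guard stage can be sketched in the same spirit. The names `post_guard`, `frame_is_safe`, and `blur_faces` are hypothetical, and both components are passed in as callables rather than real models. Rejecting the whole video when any single frame is classified as unsafe is an assumption made for this example; the Cosmos paper describes the actual policy.

```python
from typing import Callable, List, Optional, Sequence, TypeVar

Frame = TypeVar("Frame")


def post_guard(
    frames: Sequence[Frame],
    frame_is_safe: Callable[[Frame], bool],
    blur_faces: Callable[[Frame], Frame],
) -> Optional[List[Frame]]:
    """Filter generated frames, then blur faces in the frames that remain.

    Returns None if the video is rejected (assumed policy: any unsafe frame
    rejects the whole video); otherwise returns the face-blurred frames.
    """
    # Content safety filter: classify every frame as safe or unsafe.
    if not all(frame_is_safe(frame) for frame in frames):
        return None
    # Face blur filter: applied to each frame of a video that passed the filter.
    return [blur_faces(frame) for frame in frames]
```

Running the classifier first means the blur step is only spent on videos that will actually be returned.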