---
description: Generate clip-level embeddings using Cosmos-Embed1
categories:
  - video-curation
tags:
  - embeddings
  - cosmos-embed1
  - video
personas:
  - data-scientist-focused
  - mle-focused
difficulty: intermediate
content_type: howto
modality: video-only
---
# Embeddings
Generate clip-level embeddings for search, question answering, filtering, and duplicate removal.
## Use Cases
* Prepare semantic vectors for search, clustering, and near-duplicate detection.
* Score optional text prompts against clip content.
* Enable downstream filtering or retrieval tasks that need clip-level vectors.
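As a rough illustration of the search use case, the sketch below ranks clips by cosine similarity to a query vector. It uses only the standard library and toy 3-dimensional vectors in place of real `clip.cosmos_embed1_embedding` values. The helper names `cosine_similarity` and `top_k` are hypothetical and not part of NeMo Curator.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat zero vectors as matching nothing
    return dot / (norm_a * norm_b)


def top_k(query, clip_embeddings, k=2):
    """Return the k (clip_id, score) pairs most similar to the query."""
    scored = [
        (clip_id, cosine_similarity(query, vec))
        for clip_id, vec in clip_embeddings.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy vectors standing in for real clip embeddings (assumption).
clip_embeddings = {
    "clip_a": [1.0, 0.0, 0.0],
    "clip_b": [0.9, 0.1, 0.0],
    "clip_c": [0.0, 1.0, 0.0],
}
results = top_k([1.0, 0.0, 0.0], clip_embeddings, k=2)
```

For large collections, a vector index would usually replace this linear scan; the scoring idea stays the same.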
## Before You Start
* Create clips upstream. Refer to [Clipping](/curate-video/process-data/clipping).
* Provide frames for embeddings or sample at the required rate. Refer to [Frame Extraction](/curate-video/process-data/frame-extraction).
* Make sure each node can access the model weights. The stages download the weights if they are missing.
***
## Quickstart
Use the pipeline stages or the example script flags to generate clip-level embeddings.
```python
from nemo_curator.pipeline import Pipeline
from nemo_curator.stages.video.clipping.clip_frame_extraction import ClipFrameExtractionStage
from nemo_curator.utils.decoder_utils import FrameExtractionPolicy, FramePurpose
from nemo_curator.stages.video.embedding.cosmos_embed1 import (
    CosmosEmbed1FrameCreationStage,
    CosmosEmbed1EmbeddingStage,
)

pipe = Pipeline(name="video_embeddings_example")

# Clips must already exist: add your reading and clipping stages before this point.

# Extract frames from each clip for the embedding model.
pipe.add_stage(
    ClipFrameExtractionStage(
        extraction_policies=(FrameExtractionPolicy.sequence,),
        extract_purposes=(FramePurpose.EMBEDDINGS,),
        target_res=(-1, -1),
        verbose=True,
    )
)

# Turn the extracted frames into model-ready tensors.
pipe.add_stage(
    CosmosEmbed1FrameCreationStage(
        model_dir="/models",
        variant="224p",
        target_fps=2.0,
        verbose=True,
    )
)

# Generate the clip-level embeddings.
pipe.add_stage(
    CosmosEmbed1EmbeddingStage(
        model_dir="/models",
        variant="224p",
        gpu_memory_gb=20.0,
        verbose=True,
    )
)

pipe.run()
```
```bash
# Cosmos-Embed1 (224p)
python tutorials/video/getting-started/video_split_clip_example.py \
... \
--generate-embeddings \
--embedding-algorithm cosmos-embed1-224p \
--embedding-gpu-memory-gb 20.0
```
## Embedding Options
### Cosmos-Embed1
1. Add `CosmosEmbed1FrameCreationStage` to transform extracted frames into model-ready tensors.
```python
from nemo_curator.stages.video.embedding.cosmos_embed1 import (
    CosmosEmbed1FrameCreationStage,
    CosmosEmbed1EmbeddingStage,
)

frames = CosmosEmbed1FrameCreationStage(
    model_dir="/models",
    variant="224p",  # or 336p, 448p
    target_fps=2.0,
    verbose=True,
)
```
2. Add `CosmosEmbed1EmbeddingStage` to generate `clip.cosmos_embed1_embedding` and optional `clip.cosmos_embed1_text_match`.
```python
embed = CosmosEmbed1EmbeddingStage(
    model_dir="/models",
    variant="224p",
    gpu_memory_gb=20.0,
    verbose=True,
)
```
#### Parameters

`CosmosEmbed1FrameCreationStage`:

| Parameter    | Type                     | Default                  | Description                                                                         |
| ------------ | ------------------------ | ------------------------ | ----------------------------------------------------------------------------------- |
| `model_dir`  | str                      | `"models/cosmos_embed1"` | Directory for model utilities and configs used to format input frames.              |
| `variant`    | {"224p", "336p", "448p"} | `"336p"`                 | Resolution preset that controls the model’s expected input size.                    |
| `target_fps` | float                    | 2.0                      | Source sampling rate used to select frames; may re-extract at higher FPS if needed. |
| `num_cpus`   | int                      | 3                        | CPU cores used when on-the-fly re-extraction is required.                           |
| `verbose`    | bool                     | `False`                  | Log per-clip decisions and re-extraction messages.                                  |

`CosmosEmbed1EmbeddingStage`:

| Parameter         | Type                     | Default                  | Description                                                      |
| ----------------- | ------------------------ | ------------------------ | ---------------------------------------------------------------- |
| `model_dir`       | str                      | `"models/cosmos_embed1"` | Directory for model weights; downloaded on each node if missing. |
| `variant`         | {"224p", "336p", "448p"} | `"336p"`                 | Resolution preset used by the model weights.                     |
| `gpu_memory_gb`   | int                      | 20                       | Approximate GPU memory reservation per worker.                   |
| `texts_to_verify` | list\[str] \| None       | `None`                   | Optional text prompts to score against the clip embedding.       |
| `verbose`         | bool                     | `False`                  | Log setup and per-clip outcomes.                                 |
#### Outputs
* `clip.cosmos_embed1_frames` → temporary tensors used by the embedding stage
* `clip.cosmos_embed1_embedding` → final clip-level vector (NumPy array)
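* Optional: `clip.cosmos_embed1_text_match` → result of scoring the `texts_to_verify` prompts against the clip, set only when prompts are provided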
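This page does not specify how the stage computes `clip.cosmos_embed1_text_match` or what structure it has. As a conceptual sketch only, one common way to score text prompts against an embedding is to take dot products between the clip vector and each prompt vector, then apply a softmax. The function names, the `scale` factor, and the toy vectors below are all assumptions for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    shift = max(scores)
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def best_text_match(clip_vec, text_vecs, scale=10.0):
    """Return (prompt, probability) for the prompt that best matches the clip.

    `scale` is a hypothetical temperature-like factor, not a documented value.
    """
    prompts = list(text_vecs)
    logits = [
        scale * sum(c * t for c, t in zip(clip_vec, text_vecs[prompt]))
        for prompt in prompts
    ]
    probs = softmax(logits)
    best = max(range(len(prompts)), key=probs.__getitem__)
    return prompts[best], probs[best]


# Toy 2-dimensional vectors; real embeddings come from the model.
clip_vec = [1.0, 0.0]
text_vecs = {
    "a dog running": [0.8, 0.6],
    "a city street": [0.0, 1.0],
}
prompt, probability = best_text_match(clip_vec, text_vecs)
```

Inspect the actual attribute on a processed clip before relying on any particular shape for the result.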
## Troubleshooting
* Not enough frames for embeddings: Increase `target_fps` during frame extraction or adjust clip length so that the model receives the required number of frames.
* Out of memory during embedding: Lower `gpu_memory_gb`, reduce the batch size if your configuration exposes one, or use a smaller resolution variant.
* Weights not found on node: Confirm `model_dir` and network access. The stages download weights if missing.
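For the "not enough frames" case, a quick back-of-the-envelope check can show whether a clip is long enough at a given sampling rate. The sketch below assumes the model needs a fixed number of frames per clip; `REQUIRED_FRAMES = 8` is an illustrative assumption, so check the model documentation for the real value. The stage's exact frame-selection logic may also differ from this simple estimate.

```python
import math

REQUIRED_FRAMES = 8  # assumed value for illustration only


def frames_available(clip_duration_s, target_fps):
    """Estimate how many frames a clip yields at the given sampling rate."""
    return math.floor(clip_duration_s * target_fps)


def min_fps_for(clip_duration_s, required_frames=REQUIRED_FRAMES):
    """Lowest sampling rate that yields the required frames for this clip."""
    return required_frames / clip_duration_s


# A 3-second clip sampled at 2 FPS falls short of the assumed requirement.
short_clip_frames = frames_available(3.0, 2.0)
short_clip_ok = short_clip_frames >= REQUIRED_FRAMES

# A 4-second clip needs at least this sampling rate under the same assumption.
needed_fps = min_fps_for(4.0)
```

If the estimate falls short, either raise `target_fps` or produce longer clips upstream.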
## Next Steps
* Use embeddings for duplicate removal. Refer to [Duplicate Removal](/curate-video/process-data/dedup).
* Generate captions and previews for review workflows. Refer to [Captions & Preview](/curate-video/process-data/captions-preview).
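To give a feel for how clip-level vectors support duplicate removal, here is a minimal greedy sketch: a clip is kept only if its cosine similarity to every previously kept clip stays below a threshold. This is not the Duplicate Removal workflow itself. The function names, the `0.95` threshold, and the toy vectors are assumptions for illustration.

```python
import math


def normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else list(vec)


def drop_near_duplicates(embeddings, threshold=0.95):
    """Keep a clip only if it is not too similar to any clip kept so far."""
    kept = []  # list of (clip_id, unit_vector)
    for clip_id, vec in embeddings.items():
        unit = normalize(vec)
        is_unique = all(
            sum(a * b for a, b in zip(unit, other)) < threshold
            for _, other in kept
        )
        if is_unique:
            kept.append((clip_id, unit))
    return [clip_id for clip_id, _ in kept]


# Toy vectors: clip_b is nearly identical to clip_a.
embeddings = {
    "clip_a": [1.0, 0.0],
    "clip_b": [0.99, 0.05],
    "clip_c": [0.0, 1.0],
}
unique_clips = drop_near_duplicates(embeddings)
```

The pairwise comparison is quadratic in the number of clips, so it only suits small sets. The result also depends on iteration order, because the first clip in each near-duplicate group is the one that survives.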