***
description: >-
Extract frames from clips or full videos for embeddings, filtering, and
analysis
categories:
* video-curation
tags:
* frames
* extraction
* fps
* ffmpeg
* nvdec
personas:
* data-scientist-focused
* mle-focused
difficulty: intermediate
content\_type: howto
modality: video-only
***
# Frame Extraction
Extract frames from clips or full videos at target rates and resolutions. Use frames for embeddings (such as Cosmos‑Embed1), aesthetic filtering, previews, and custom analysis.
## Use Cases
* Prepare inputs for embedding models that expect frame sequences.
* Run aesthetic filtering that operates on sampled frames.
* Generate lightweight previews or QA snapshots.
* Provide frames for scene-change detection before clipping (TransNetV2).
## Before You Start
If you need saved media files, frame extraction is optional. [Embeddings](/curate-video/process-data/embeddings) and [aesthetic filtering](/curate-images/process-data/filters/aesthetic) require frames.
***
## Quickstart
Use the pipeline stages or the example script flags to extract frames for embeddings, filtering, and analysis.
```python
from nemo_curator.pipeline import Pipeline
from nemo_curator.stages.video.clipping.clip_extraction_stages import FixedStrideExtractorStage
from nemo_curator.stages.video.clipping.clip_frame_extraction import ClipFrameExtractionStage
from nemo_curator.utils.decoder_utils import FrameExtractionPolicy, FramePurpose
from nemo_curator.stages.video.embedding.cosmos_embed1 import (
CosmosEmbed1FrameCreationStage,
CosmosEmbed1EmbeddingStage,
)
pipe = Pipeline(name="clip_frames_embeddings")
pipe.add_stage(FixedStrideExtractorStage(clip_len_s=10.0, clip_stride_s=10.0))
pipe.add_stage(
ClipFrameExtractionStage(
extraction_policies=(FrameExtractionPolicy.sequence,),
extract_purposes=(FramePurpose.EMBEDDINGS,),
target_res=(-1, -1),
verbose=True,
)
)
pipe.add_stage(CosmosEmbed1FrameCreationStage(model_dir="/models", variant="224p", target_fps=2.0, verbose=True))
pipe.add_stage(CosmosEmbed1EmbeddingStage(model_dir="/models", variant="224p", gpu_memory_gb=20.0, verbose=True))
pipe.run()
```
```bash
# Clip frames implicitly when generating embeddings or aesthetics
python tutorials/video/getting-started/video_split_clip_example.py \
... \
--generate-embeddings \
--clip-extraction-target-res -1
# Full-video frames for TransNetV2 scene change
python tutorials/video/getting-started/video_split_clip_example.py \
... \
--splitting-algorithm transnetv2 \
--transnetv2-frame-decoder-mode pynvc
```
## Options in NeMo Curator
NeMo Curator provides two complementary stages:
* `ClipFrameExtractionStage`: Extracts frames from already‑split clips. Supports several target FPS values and computes an LCM rate to reduce decode work.
* `VideoFrameExtractionStage`: Extracts frames from full videos (for example, before scene‑change detection). Supports PyNvCodec (NVDEC) or `ffmpeg` CPU/GPU decode.
### Extract Frames
```python
from nemo_curator.stages.video.clipping.clip_frame_extraction import (
ClipFrameExtractionStage,
)
from nemo_curator.utils.decoder_utils import FrameExtractionPolicy, FramePurpose
extract_frames = ClipFrameExtractionStage(
extraction_policies=(FrameExtractionPolicy.sequence,),
extract_purposes=(FramePurpose.EMBEDDINGS,), # sets default FPS if target_fps not provided
target_res=(-1, -1), # keep original resolution
# target_fps=[1, 2], # optional: override with explicit FPS values
verbose=True,
)
```
```python
from nemo_curator.stages.video.clipping.video_frame_extraction import VideoFrameExtractionStage
frame_extractor = VideoFrameExtractionStage(
decoder_mode="pynvc", # or "ffmpeg_gpu", "ffmpeg_cpu"
output_hw=(27, 48), # (height, width) for frame extraction
pyncv_batch_size=64, # batch size for PyNvCodec
verbose=True,
)
```
## Parameters
| Parameter | Description |
| --------------------- | ----------------------------------------------------------------------------------------------------------------------- |
| `extraction_policies` | Frame selection strategy. Use `sequence` for uniform sampling. `middle` selects a single middle frame. |
| `target_fps` | For clips: sampling rate in frames per second. If you provide several integer values, the stage uses LCM sampling. |
| `extract_purposes` | Shortcut that sets default FPS for specific purposes (such as embeddings). You can still pass `target_fps` to override. |
| `target_res` | Output frame resolution `(height, width)`. Use `(-1, -1)` to keep original. |
| `num_cpus` | Number of CPU cores for frame extraction. Default: `3`. |
| `decoder_mode` | For full‑video extraction: `pynvc` (NVDEC), `ffmpeg_gpu`, or `ffmpeg_cpu`. |
| `output_hw` | For full‑video extraction: `(height, width)` tuple for frame dimensions. Default: `(27, 48)`. |
| `pyncv_batch_size` | For full‑video extraction: batch size for PyNvCodec processing. Default: `64`. |
### LCM Sampling for Several FPS Values
If you provide several integer `target_fps` values (such as `1` and `2`), the clip stage decodes once at the LCM rate and then samples every k‑th frame to produce each target rate. This reduces decode cost.
```python
ClipFrameExtractionStage(
extraction_policies=(FrameExtractionPolicy.sequence,),
target_fps=[1, 2], # LCM = 2; decode once at 2 FPS, then subsample
)
```
## Hardware and Performance
* Prefer `pynvc` (NVDEC) or `ffmpeg_gpu` for high throughput when GPU hardware is available; otherwise use `ffmpeg_cpu`.
* Use batching where applicable and track worker resource use.
* Keep resolution modest if memory limits apply; set `target_res` when needed.
## Downstream Dependencies
* **Embeddings**: Cosmos‑Embed1 expects frames at specific rates. Refer to [Embeddings](/curate-video/process-data/embeddings).
* **Aesthetic Filtering**: Requires frames extracted earlier. Refer to [Filtering](/curate-video/process-data/filtering).
* **Clipping with TransNetV2**: Uses full‑video frame extraction before scene‑change detection. Refer to [Clipping](/curate-video/process-data/clipping).
## Troubleshooting
* "Frame extraction failed": Check decoder mode and availability; confirm `ffmpeg` and drivers for GPU modes.
* Not enough frames for embeddings: Increase `target_fps` or adjust clip length; certain embedding stages can re‑extract at a higher rate when needed.