Extract frames from clips or full videos at target rates and resolutions. Use frames for embeddings (such as Cosmos‑Embed1), aesthetic filtering, previews, and custom analysis.
If you only need saved media files, frame extraction is optional. Embeddings and aesthetic filtering require frames.
Use the pipeline stages or the example script flags to extract frames for embeddings, filtering, and analysis.
NeMo Curator provides two complementary stages:
ClipFrameExtractionStage: Extracts frames from already‑split clips. Supports several target FPS values and computes a least common multiple (LCM) rate to reduce decode work.
VideoFrameExtractionStage: Extracts frames from full videos (for example, before scene‑change detection). Supports PyNvCodec (NVDEC) or ffmpeg CPU/GPU decode.
If you provide several integer target_fps values (such as 1 and 2), the clip stage decodes once at the LCM rate and then samples every k‑th frame to produce each target rate. This reduces decode cost.
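The LCM sampling described above can be illustrated with a small standalone sketch. It is not the NeMo Curator implementation: the helper name plan_sampling is hypothetical, and decoded frames are stood in for by a plain list of indices. It only shows how one decode pass at the LCM rate can serve several integer target rates.

```python
from math import lcm


def plan_sampling(target_fps: list[int]) -> tuple[int, dict[int, int]]:
    """Return the single decode rate and the stride k for each target rate.

    Hypothetical helper for illustration; not part of NeMo Curator.
    """
    if not target_fps or any(fps <= 0 for fps in target_fps):
        raise ValueError("target_fps must contain positive integers")
    decode_fps = lcm(*target_fps)  # decode once at this rate
    # Keeping every k-th decoded frame yields the target rate.
    strides = {fps: decode_fps // fps for fps in target_fps}
    return decode_fps, strides


# Targets of 1 and 2 FPS share a decode rate of 2 FPS.
decode_fps, strides = plan_sampling([1, 2])

# Stand-in for 3 seconds of decoded frames (indices only).
decoded = list(range(decode_fps * 3))

# Slice the single decoded sequence once per target rate.
frames_by_rate = {fps: decoded[::k] for fps, k in strides.items()}
```

With targets of 1 and 2, the stride is 2 for the 1 FPS output and 1 for the 2 FPS output, so both outputs come from one decode pass.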
Use pynvc (NVDEC) or ffmpeg_gpu for high throughput when GPU hardware is available; otherwise use ffmpeg_cpu.
Set target_res when needed.
Make sure ffmpeg is installed, along with drivers for GPU modes.
Increase target_fps or adjust clip length; certain embedding stages can re‑extract at a higher rate when needed.
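The decoder guidance above can be written as a small selection helper. This is a sketch under stated assumptions: pick_decoder_mode and ffmpeg_on_path are hypothetical names, the caller supplies the GPU and PyNvCodec availability flags, and preferring pynvc over ffmpeg_gpu when both are usable is an assumption rather than documented behavior. Only the mode strings come from the text.

```python
import shutil


def pick_decoder_mode(gpu_available: bool, pynvc_installed: bool) -> str:
    """Choose a decoder mode string from hardware and software availability.

    Hypothetical helper; the pynvc-first preference is an assumption.
    """
    if gpu_available and pynvc_installed:
        return "pynvc"  # NVDEC through PyNvCodec
    if gpu_available:
        return "ffmpeg_gpu"  # GPU decode through ffmpeg
    return "ffmpeg_cpu"  # fallback when no GPU is available


def ffmpeg_on_path() -> bool:
    """Report whether an ffmpeg executable is discoverable on PATH."""
    return shutil.which("ffmpeg") is not None


mode = pick_decoder_mode(gpu_available=False, pynvc_installed=False)
```

Checking ffmpeg_on_path() before choosing either ffmpeg mode surfaces a missing dependency before decoding starts.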