Overview | NeMo Curator

Use NeMo Curator stages to split videos into clips, encode them, generate embeddings or captions, and remove duplicates.

How it Works

Create a Pipeline and add stages for clip extraction, optional re-encoding and filtering, embeddings or captions, previews, and writing outputs. Each stage is modular and configurable to match your quality and performance needs.
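The modular, stage-by-stage flow described above can be illustrated with a small toy in plain Python. This is only a conceptual sketch: `ToyPipeline` and the three stage functions are hypothetical stand-ins written for this example, not NeMo Curator classes, and the real stages operate on video data rather than dictionaries.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical stand-ins for illustration only; NOT NeMo Curator classes.
Task = Dict[str, object]
StageFn = Callable[[List[Task]], List[Task]]


@dataclass
class ToyPipeline:
    """Runs a list of stage functions in the order they were added."""

    name: str
    stages: List[StageFn] = field(default_factory=list)

    def add_stage(self, stage: StageFn) -> "ToyPipeline":
        self.stages.append(stage)
        return self  # returning self allows chained add_stage calls

    def run(self, tasks: List[Task]) -> List[Task]:
        for stage in self.stages:
            tasks = stage(tasks)  # each stage consumes the previous stage's output
        return tasks


def split_into_clips(tasks: List[Task]) -> List[Task]:
    # Fan out: every input video becomes two clips (fixed count for brevity).
    return [{"video": t["video"], "clip": i} for t in tasks for i in range(2)]


def drop_odd_clips(tasks: List[Task]) -> List[Task]:
    # Placeholder filter standing in for a quality filter.
    return [t for t in tasks if t["clip"] % 2 == 0]


def write_outputs(tasks: List[Task]) -> List[Task]:
    # Final stage: mark each surviving clip as written.
    return [{**t, "written": True} for t in tasks]


toy = ToyPipeline("video-demo")
toy.add_stage(split_into_clips).add_stage(drop_odd_clips).add_stage(write_outputs)
result = toy.run([{"video": "a.mp4"}, {"video": "b.mp4"}])
```

Because each stage only sees the output of the previous one, a stage can be swapped or removed without touching the others, which is the property the modular design relies on.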

Processing Options

Choose from the following stages to split, encode, filter, embed, caption, preview, and remove duplicates in your videos:

Clip Videos

Split long videos into shorter clips using fixed stride or scene-change detection. Tags: clips, fixed-stride, transnetv2.
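To make fixed-stride splitting concrete, the sketch below computes clip time spans from a video duration. The function name and its parameters (`clip_len_s`, `stride_s`, `min_clip_len_s`) are assumptions made for this illustration and may not match the options of the actual clipping stage.

```python
from typing import List, Tuple


def fixed_stride_spans(
    duration_s: float,
    clip_len_s: float,
    stride_s: float,
    min_clip_len_s: float = 0.0,
) -> List[Tuple[float, float]]:
    """Return (start, end) spans in seconds, stepping by stride_s.

    Hypothetical helper for illustration; not part of NeMo Curator.
    """
    if clip_len_s <= 0 or stride_s <= 0:
        raise ValueError("clip_len_s and stride_s must be positive")

    spans: List[Tuple[float, float]] = []
    start = 0.0
    while start < duration_s:
        # The last clip is truncated at the end of the video.
        end = min(start + clip_len_s, duration_s)
        # Skip trailing fragments shorter than the minimum length.
        if end - start >= min_clip_len_s:
            spans.append((start, end))
        start += stride_s
    return spans


# A 25-second video cut into 10-second clips with a 10-second stride.
spans = fixed_stride_spans(25.0, clip_len_s=10.0, stride_s=10.0, min_clip_len_s=2.0)
```

A stride equal to the clip length yields back-to-back clips; a smaller stride yields overlapping clips. Scene-change detection instead places boundaries where the visual content changes, so it cannot be reduced to arithmetic like this.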

Encode Clips

Encode clips to H.264 using CPU or GPU encoders and tune performance. Tags: clips, h264_nvenc.

Filter Clips and Frames

Apply motion-based and aesthetic filtering to improve dataset quality. Tags: clips, frames, motion, aesthetic.

Extract Frames

Extract frames from clips or full videos for embeddings, filtering, and analysis. Tags: frames, fps.

Create Embeddings

Generate clip-level embeddings with Cosmos-Embed1 for search and duplicate removal. Tags: clips, cosmos-embed1.

Create Captions & Preview

Produce clip captions and optional preview images for review workflows. Tags: clips, frames, captions, preview.

Remove Duplicate Embeddings

Remove near-duplicates using semantic clustering and pairwise similarity over the generated embeddings. Tags: clips, semantic, pairwise.
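The pairwise-similarity half of duplicate removal can be sketched in a few lines of plain Python. This is a simplified illustration under stated assumptions, not the library's implementation: the function names, the greedy keep-first policy, and the 0.95 threshold are all hypothetical, and the example omits the clustering step, which would typically limit comparisons to clips in the same cluster.

```python
import math
from typing import Dict, List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def drop_near_duplicates(
    embeddings: Dict[str, Sequence[float]], threshold: float = 0.95
) -> List[str]:
    """Greedily keep a clip only if it is below `threshold` similarity to every kept clip.

    Hypothetical helper for illustration; compares all pairs, so it is O(n^2).
    """
    kept: List[str] = []
    for clip_id, vec in embeddings.items():
        if all(cosine(vec, embeddings[k]) < threshold for k in kept):
            kept.append(clip_id)
    return kept


# Tiny 2-D vectors stand in for real clip embeddings.
embeddings = {
    "clip_a": [1.0, 0.0],
    "clip_b": [0.99, 0.05],  # nearly parallel to clip_a
    "clip_c": [0.0, 1.0],
}
kept = drop_near_duplicates(embeddings, threshold=0.95)
```

Here `clip_b` is dropped because it is almost identical in direction to `clip_a`, while `clip_c` is kept. The result depends on iteration order, which is one reason production systems usually define an explicit rule for which member of a duplicate group survives.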

Write Outputs

Persist clips, embeddings, previews, and metadata at the end of the pipeline using ClipWriterStage. Refer to Save & Export for directory layout and examples.

Example (place as the final stage):

from nemo_curator.stages.video.io.clip_writer import ClipWriterStage

# Assumes `pipeline`, OUT_DIR, and VIDEO_DIR are defined earlier in your script.
pipeline.add_stage(
    ClipWriterStage(
        output_path=OUT_DIR,
        input_path=VIDEO_DIR,
        upload_clips=True,
        dry_run=False,
        generate_embeddings=True,
        generate_previews=False,
        generate_captions=False,
        embedding_algorithm="cosmos-embed1-224p",
        caption_models=[],
        enhanced_caption_models=[],
        verbose=True,
    )
)

Path helpers are available to resolve common locations (such as clips/, filtered_clips/, previews/, metas/v0/, and ce1_embd_parquet/).
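As a rough picture of what such helpers resolve, the sketch below maps an output root to the subdirectories named above. The function `resolve_output_dirs` and its dictionary keys are hypothetical and written only for this example; use the library's own path helpers in real code, since the actual names and layout details may differ.

```python
from pathlib import Path
from typing import Dict

# Subdirectory names are taken from the text above; the mapping keys are
# hypothetical labels chosen for this illustration.
_SUBDIRS: Dict[str, str] = {
    "clips": "clips",
    "filtered_clips": "filtered_clips",
    "previews": "previews",
    "metas": "metas/v0",
    "embeddings": "ce1_embd_parquet",
}


def resolve_output_dirs(output_root: str) -> Dict[str, Path]:
    """Return the expected output subdirectories under output_root.

    Hypothetical helper; it only builds paths and does not create directories.
    """
    root = Path(output_root)
    return {key: root / sub for key, sub in _SUBDIRS.items()}


dirs = resolve_output_dirs("/data/out")
```

Centralizing the layout in one place means downstream code can look up a location by label instead of hard-coding directory names.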