Filtering

Apply motion-based filtering to clips and aesthetic filtering to frames to prune low-quality assets during curation.

How it Works

Filtering runs in two passes that balance speed and quality:

  1. Motion pass (fast): The pipeline decodes lightweight motion vectors and computes motion scores to drop static or near‑static clips early. The decode step attaches decoded_motion_data to each clip; the filter step then writes motion_score_global_mean and motion_score_per_patch_min_256. Clips that fall below either threshold move to video.filtered_clips, and video.clip_stats.num_filtered_by_motion increments.
  2. Aesthetic pass (model‑based): Upstream, the pipeline extracts frames using the sequence policy at a chosen target_fps. The aesthetic stage reads extracted_frames[sequence-<target_fps>], produces an aesthetic_score, and removes clips that score below the threshold. These clips move to video.filtered_clips, and video.clip_stats.num_filtered_by_aesthetic increments.
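The two motion statistics above can be sketched with plain NumPy. This is an illustrative reconstruction, not the library's implementation; the `256` in the field name is assumed here to be a patch size:

```python
import numpy as np


def motion_scores(motion_mags: np.ndarray, patch: int = 256) -> tuple[float, float]:
    """motion_mags: (T, H, W) per-pixel motion magnitudes for T sampled frames.

    Returns (global_mean, per_patch_min), roughly matching the
    motion_score_global_mean and motion_score_per_patch_min_256 fields.
    """
    global_mean = float(motion_mags.mean())
    t, h, w = motion_mags.shape
    # Average motion inside each non-overlapping patch, then take the minimum,
    # so a clip with even one static region gets a low per-patch score.
    ph, pw = h // patch, w // patch
    patches = motion_mags[:, : ph * patch, : pw * patch].reshape(t, ph, patch, pw, patch)
    per_patch_mean = patches.mean(axis=(0, 2, 4))  # (ph, pw)
    return global_mean, float(per_patch_mean.min())


# A fully static clip scores zero on both statistics and would be
# filtered at the default thresholds.
scores = motion_scores(np.zeros((4, 512, 512)))  # → (0.0, 0.0)
```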

Before You Start

Motion decoding and aesthetic scoring operate on clip buffers. You must run clipping and encoding first so each clip has a valid buffer (bytes).
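A minimal pre-flight check along these lines can surface missing buffers before the filtering stages run. This is a hypothetical helper, assuming only that each clip exposes the `buffer` (bytes) attribute produced by the clipping and encoding stages:

```python
from types import SimpleNamespace


def clips_ready_for_filtering(clips: list) -> list:
    """Return only clips with a non-empty encoded buffer.

    Hypothetical helper: assumes each clip exposes a `buffer` attribute
    holding the encoded bytes written by the encoding stage.
    """
    return [clip for clip in clips if getattr(clip, "buffer", None)]


# Stand-in clip objects for illustration.
encoded = SimpleNamespace(buffer=b"\x00\x01")
missing = SimpleNamespace(buffer=None)
ready = clips_ready_for_filtering([encoded, missing])  # only `encoded` survives
```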


Quickstart

Use the pipeline stages or the example script flags to enable motion and aesthetic filtering.

```python
from nemo_curator.pipeline import Pipeline
from nemo_curator.stages.video.filtering.motion_filter import (
    MotionVectorDecodeStage,
    MotionFilterStage,
)
from nemo_curator.stages.video.filtering.clip_aesthetic_filter import (
    ClipAestheticFilterStage,
)

pipe = Pipeline(name="filtering_examples")

# Motion filtering
pipe.add_stage(
    MotionVectorDecodeStage(target_fps=2.0, target_duration_ratio=0.5, num_cpus_per_worker=4.0)
)
pipe.add_stage(
    MotionFilterStage(
        score_only=False,
        global_mean_threshold=0.00098,
        per_patch_min_256_threshold=0.000001,
        motion_filter_batch_size=64,
        num_gpus_per_worker=0.5,
        verbose=True,
    )
)

# Aesthetic filtering (assumes frames extracted upstream)
pipe.add_stage(
    ClipAestheticFilterStage(
        model_dir="/models",
        score_threshold=3.5,
        reduction="min",
        target_fps=1.0,
        num_gpus_per_worker=0.25,
        verbose=True,
    )
)

pipe.run()
```

Filtering Options

Motion Filtering

Motion filtering is a two‑step process: first decode motion vectors, then filter clips based on motion scores.

  1. Add MotionVectorDecodeStage to sample motion vectors from each clip.

    ```python
    from nemo_curator.stages.video.filtering.motion_filter import MotionVectorDecodeStage

    decode = MotionVectorDecodeStage(
        target_fps=2.0,
        target_duration_ratio=0.5,
        num_cpus_per_worker=4.0,
    )
    ```

    This step adds decoded_motion_data to each clip, or records an error in clip.errors.

  2. Add MotionFilterStage to compute motion scores and filter out low‑motion clips.

    ```python
    from nemo_curator.stages.video.filtering.motion_filter import MotionFilterStage

    motion = MotionFilterStage(
        score_only=False,
        global_mean_threshold=0.00098,
        per_patch_min_256_threshold=0.000001,
        motion_filter_batch_size=64,
        num_gpus_per_worker=0.5,
        verbose=True,
    )
    ```
    • Adds motion_score_global_mean and motion_score_per_patch_min_256 to each clip.
    • Moves filtered clips to video.filtered_clips and increments video.clip_stats.num_filtered_by_motion.
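The keep/drop decision the bullets describe amounts to a two-threshold test. A sketch of the assumed semantics (a clip must clear both thresholds to survive; with score_only=True the stage would record scores without dropping anything):

```python
def keep_clip(
    global_mean: float,
    per_patch_min: float,
    global_mean_threshold: float = 0.00098,
    per_patch_min_256_threshold: float = 0.000001,
) -> bool:
    """Illustrative filter decision, not the library's exact code:
    a clip is kept only if both motion scores clear their thresholds."""
    return (
        global_mean >= global_mean_threshold
        and per_patch_min >= per_patch_min_256_threshold
    )


# A near-static clip fails the global-mean test and is filtered.
decision = keep_clip(global_mean=0.0, per_patch_min=0.0)  # → False
```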

Parameters

| Parameter | Type | Default | Description |
|---|---|---|---|
| num_cpus_per_worker | float | 6.0 | CPU cores reserved per worker for decoding motion vectors. |
| target_fps | float | 2.0 | Target frames per second for sampling motion vectors. |
| target_duration_ratio | float | 0.5 | Fraction of each clip's duration to decode for motion analysis. |
| verbose | bool | False | Log warnings and per‑clip issues during decoding. |
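Together, target_fps and target_duration_ratio bound how much of each clip is decoded. A back-of-the-envelope frame budget (illustrative arithmetic, not the stage's exact logic):

```python
def frames_decoded(
    clip_duration_s: float,
    target_fps: float = 2.0,
    target_duration_ratio: float = 0.5,
) -> int:
    """Rough motion-decoding frame budget: decode only a fraction of the
    clip at a reduced sampling rate. Illustrative only."""
    return int(clip_duration_s * target_duration_ratio * target_fps)


# A 10-second clip at the defaults: 10 * 0.5 * 2.0 = 10 frames decoded,
# versus 300 frames for full decoding at 30 fps.
budget = frames_decoded(10.0)  # → 10
```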

Aesthetic Filtering

Aesthetic filtering scores clips with a CLIP‑based aesthetic model, so you must prepare frames first.

  1. Extract frames earlier in the pipeline. Use a frame extraction stage with a sequence policy and set a target_fps that matches the aesthetic stage. Refer to Frame Extraction for guidance.

    ```python
    # Example: upstream frame extraction snippet (pseudocode)
    from nemo_curator.stages.video.frame_extraction import FrameExtractionStage

    frames = FrameExtractionStage(policy="sequence", target_fps=1.0)
    ```

    Frame Requirements:

    • Use sequence frame extraction policy.
    • Match target_fps here and in the aesthetic stage.
    • Ensure clip.extracted_frames contains frames for the signature sequence-<target_fps>.
  2. Add ClipAestheticFilterStage to score each clip and drop clips below a threshold.

    ```python
    from nemo_curator.stages.video.filtering.clip_aesthetic_filter import ClipAestheticFilterStage

    aesthetic = ClipAestheticFilterStage(
        model_dir="/models",
        score_threshold=3.5,
        reduction="min",  # or "mean"
        target_fps=1.0,
        num_gpus_per_worker=0.25,
        verbose=True,
    )
    ```
    • Adds aesthetic_score to each clip.
    • Moves filtered clips to video.filtered_clips and increments video.clip_stats.num_filtered_by_aesthetic.
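The reduction parameter controls how per-frame aesthetic scores collapse into the clip-level aesthetic_score. A sketch of the two aggregations (the frame scores are made up for illustration):

```python
def aggregate_aesthetic(frame_scores: list[float], reduction: str = "min") -> float:
    """Collapse per-frame aesthetic scores into one clip-level score.

    "min" is conservative: a single low-quality frame can reject the clip.
    "mean" tolerates occasional bad frames. Illustrative sketch only.
    """
    if reduction == "min":
        return min(frame_scores)
    if reduction == "mean":
        return sum(frame_scores) / len(frame_scores)
    raise ValueError(f"unknown reduction: {reduction}")


frame_scores = [4.2, 3.1, 4.8]
clip_score = aggregate_aesthetic(frame_scores, "min")  # → 3.1
# With score_threshold=3.5, "min" (3.1) would filter this clip,
# while "mean" (about 4.03) would keep it.
```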

Parameters

| Parameter | Type | Default | Description |
|---|---|---|---|
| model_dir | str | "models/clip_aesthetic" | Directory for model weights; downloaded on each node if missing. |
| score_threshold | float | 0.5 | Minimum aesthetic score required to keep a clip. |
| reduction | str | "min" | Aggregate frame‑level scores using "mean" or "min". |
| target_fps | float | 1.0 | Frame sampling rate; must match the upstream extracted frames. |
| num_gpus_per_worker | float | 0.25 | GPUs reserved per worker for aesthetic scoring. |
| verbose | bool | False | Log per‑clip aesthetic scores and decisions. |