Filtering

Apply motion-based filtering to clips and aesthetic filtering to frames to prune low-quality assets during curation.

How it Works

Filtering runs in two passes that balance speed and quality:

  1. Motion pass (fast): The pipeline decodes lightweight motion vectors and computes motion scores to drop static or near‑static clips early. The decode step attaches decoded_motion_data to each clip; the filter step then writes motion_score_global_mean and motion_score_per_patch_min_256. Clips that fall below either threshold move to video.filtered_clips, and video.clip_stats.num_filtered_by_motion increments.
  2. Aesthetic pass (model‑based): Upstream, the pipeline extracts frames using the sequence policy at a chosen target_fps. The aesthetic stage reads extracted_frames[sequence-<target_fps>], produces an aesthetic_score, and removes clips that score below the threshold. These clips move to video.filtered_clips, and video.clip_stats.num_filtered_by_aesthetic increments.
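The two motion statistics above can be sketched with plain NumPy. This is an illustrative reconstruction, not the library's implementation; the `256` in the field name is assumed here to be a patch size:

```python
import numpy as np


def motion_scores(motion_mags: np.ndarray, patch: int = 256) -> tuple[float, float]:
    """motion_mags: (T, H, W) per-pixel motion magnitudes for T sampled frames.

    Returns (global_mean, per_patch_min), roughly matching the
    motion_score_global_mean and motion_score_per_patch_min_256 fields.
    """
    global_mean = float(motion_mags.mean())
    t, h, w = motion_mags.shape
    # Average motion inside each non-overlapping patch, then take the minimum,
    # so a clip with even one static region gets a low per-patch score.
    ph, pw = h // patch, w // patch
    patches = motion_mags[:, : ph * patch, : pw * patch].reshape(t, ph, patch, pw, patch)
    per_patch_mean = patches.mean(axis=(0, 2, 4))  # (ph, pw)
    return global_mean, float(per_patch_mean.min())


# A fully static clip scores zero on both statistics and would be
# filtered at the default thresholds.
scores = motion_scores(np.zeros((4, 512, 512)))  # → (0.0, 0.0)
```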

Before You Start

Motion decoding and aesthetic scoring operate on clip buffers. You must run clipping and encoding first so each clip has a valid buffer (bytes).
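A minimal pre-flight check along these lines can surface missing buffers before the filtering stages run. This is a hypothetical helper, assuming only that each clip exposes the `buffer` (bytes) attribute produced by the clipping and encoding stages:

```python
from types import SimpleNamespace


def clips_ready_for_filtering(clips: list) -> list:
    """Return only clips with a non-empty encoded buffer.

    Hypothetical helper: assumes each clip exposes a `buffer` attribute
    holding the encoded bytes written by the encoding stage.
    """
    return [clip for clip in clips if getattr(clip, "buffer", None)]


# Stand-in clip objects for illustration.
encoded = SimpleNamespace(buffer=b"\x00\x01")
missing = SimpleNamespace(buffer=None)
ready = clips_ready_for_filtering([encoded, missing])  # only `encoded` survives
```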


Quickstart

Use the pipeline stages or the example script flags to enable motion and aesthetic filtering.

```python
from nemo_curator.pipeline import Pipeline
from nemo_curator.stages.video.filtering.motion_filter import (
    MotionVectorDecodeStage,
    MotionFilterStage,
)
from nemo_curator.stages.video.filtering.clip_aesthetic_filter import (
    ClipAestheticFilterStage,
)

pipe = Pipeline(name="filtering_examples")

# Motion filtering
pipe.add_stage(
    MotionVectorDecodeStage(target_fps=2.0, target_duration_ratio=0.5, num_cpus_per_worker=4.0)
)
pipe.add_stage(
    MotionFilterStage(
        score_only=False,
        global_mean_threshold=0.00098,
        per_patch_min_256_threshold=0.000001,
        motion_filter_batch_size=64,
        num_gpus_per_worker=0.5,
        verbose=True,
    )
)

# Aesthetic filtering (assumes frames extracted upstream)
pipe.add_stage(
    ClipAestheticFilterStage(
        model_dir="/models",
        score_threshold=3.5,
        reduction="min",
        target_fps=1.0,
        num_gpus_per_worker=0.25,
        verbose=True,
    )
)

pipe.run()
```

Filtering Options

Motion Filtering

Motion filtering is a two‑step process: first decode motion vectors, then filter clips based on motion scores.

  1. Add MotionVectorDecodeStage to sample motion vectors from each clip.

    ```python
    from nemo_curator.stages.video.filtering.motion_filter import MotionVectorDecodeStage

    decode = MotionVectorDecodeStage(
        target_fps=2.0,
        target_duration_ratio=0.5,
        num_cpus_per_worker=4.0,
    )
    ```

    This step adds decoded_motion_data to each clip, or records an error in clip.errors.

  2. Add MotionFilterStage to compute motion scores and filter out low‑motion clips.

    ```python
    from nemo_curator.stages.video.filtering.motion_filter import MotionFilterStage

    motion = MotionFilterStage(
        score_only=False,
        global_mean_threshold=0.00098,
        per_patch_min_256_threshold=0.000001,
        motion_filter_batch_size=64,
        num_gpus_per_worker=0.5,
        verbose=True,
    )
    ```
    • Adds motion_score_global_mean and motion_score_per_patch_min_256 to each clip.
    • Moves filtered clips to video.filtered_clips and increments video.clip_stats.num_filtered_by_motion.
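The keep/drop decision the bullets describe amounts to a two-threshold test. A sketch of the assumed semantics (a clip must clear both thresholds to survive; with score_only=True the stage would record scores without dropping anything):

```python
def keep_clip(
    global_mean: float,
    per_patch_min: float,
    global_mean_threshold: float = 0.00098,
    per_patch_min_256_threshold: float = 0.000001,
) -> bool:
    """Illustrative filter decision, not the library's exact code:
    a clip is kept only if both motion scores clear their thresholds."""
    return (
        global_mean >= global_mean_threshold
        and per_patch_min >= per_patch_min_256_threshold
    )


# A near-static clip fails the global-mean test and is filtered.
decision = keep_clip(global_mean=0.0, per_patch_min=0.0)  # → False
```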

Parameters

| Parameter | Type | Default | Description |
|---|---|---|---|
| num_cpus_per_worker | float | 6.0 | CPU cores reserved per worker for decoding motion vectors. |
| target_fps | float | 2.0 | Target frames per second for sampling motion vectors. |
| target_duration_ratio | float | 0.5 | Fraction of each clip's duration to decode for motion analysis. |
| verbose | bool | False | Log warnings and per‑clip issues during decoding. |
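Together, target_fps and target_duration_ratio bound how much of each clip is decoded. A back-of-the-envelope frame budget (illustrative arithmetic, not the stage's exact logic):

```python
def frames_decoded(
    clip_duration_s: float,
    target_fps: float = 2.0,
    target_duration_ratio: float = 0.5,
) -> int:
    """Rough motion-decoding frame budget: decode only a fraction of the
    clip at a reduced sampling rate. Illustrative only."""
    return int(clip_duration_s * target_duration_ratio * target_fps)


# A 10-second clip at the defaults: 10 * 0.5 * 2.0 = 10 frames decoded,
# versus 300 frames for full decoding at 30 fps.
budget = frames_decoded(10.0)  # → 10
```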

Aesthetic Filtering

Aesthetic filtering scores clips with a CLIP‑based aesthetic model, so you must prepare frames first.

  1. Extract frames earlier in the pipeline. Use a frame extraction stage with a sequence policy and set a target_fps that matches the aesthetic stage. Refer to Frame Extraction for guidance.

    ```python
    # Example: upstream frame extraction snippet (pseudocode)
    from nemo_curator.stages.video.frame_extraction import FrameExtractionStage

    frames = FrameExtractionStage(policy="sequence", target_fps=1.0)
    ```

    Frame Requirements:

    • Use sequence frame extraction policy.
    • Match target_fps here and in the aesthetic stage.
    • Ensure clip.extracted_frames contains frames for the signature sequence-<target_fps>.
  2. Add ClipAestheticFilterStage to score each clip and drop clips below a threshold.

    ```python
    from nemo_curator.stages.video.filtering.clip_aesthetic_filter import ClipAestheticFilterStage

    aesthetic = ClipAestheticFilterStage(
        model_dir="/models",
        score_threshold=3.5,
        reduction="min",  # or "mean"
        target_fps=1.0,
        num_gpus_per_worker=0.25,
        verbose=True,
    )
    ```
    • Adds aesthetic_score to each clip.
    • Moves filtered clips to video.filtered_clips and increments video.clip_stats.num_filtered_by_aesthetic.
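The reduction parameter controls how per-frame aesthetic scores collapse into the clip-level aesthetic_score. A sketch of the two aggregations (the frame scores are made up for illustration):

```python
def aggregate_aesthetic(frame_scores: list[float], reduction: str = "min") -> float:
    """Collapse per-frame aesthetic scores into one clip-level score.

    "min" is conservative: a single low-quality frame can reject the clip.
    "mean" tolerates occasional bad frames. Illustrative sketch only.
    """
    if reduction == "min":
        return min(frame_scores)
    if reduction == "mean":
        return sum(frame_scores) / len(frame_scores)
    raise ValueError(f"unknown reduction: {reduction}")


frame_scores = [4.2, 3.1, 4.8]
clip_score = aggregate_aesthetic(frame_scores, "min")  # → 3.1
# With score_threshold=3.5, "min" (3.1) would filter this clip,
# while "mean" (about 4.03) would keep it.
```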

Parameters

| Parameter | Type | Default | Description |
|---|---|---|---|
| model_dir | str | "models/clip_aesthetic" | Directory for model weights; downloaded on each node if missing. |
| score_threshold | float | 0.5 | Minimum aesthetic score required to keep a clip. |
| reduction | str | "min" | Aggregate frame‑level scores using "mean" or "min". |
| target_fps | float | 1.0 | Frame sampling rate; must match the upstream extracted frames. |
| num_gpus_per_worker | float | 0.25 | GPUs reserved per worker for aesthetic scoring. |
| verbose | bool | False | Log per‑clip aesthetic scores and decisions. |