Filtering#
Apply motion-based filtering to clips and aesthetic filtering to frames to prune low-quality assets during curation.
How it Works#
Filtering runs in two passes that balance speed and quality:
- Motion pass (fast): The pipeline decodes lightweight motion vectors and computes motion scores to drop static or near-static clips at the first filtering stage. This step adds `decoded_motion_data` per clip, then writes `motion_score_global_mean` and `motion_score_per_patch_min_256`. Clips below the thresholds move to `video.filtered_clips`, and `video.clip_stats.num_filtered_by_motion` increments.
- Aesthetic pass (model based): Upstream, the pipeline extracts frames using the `sequence` policy at a chosen `target_fps`. The aesthetic stage reads `extracted_frames[sequence-<target_fps>]`, produces an `aesthetic_score`, and removes clips below the threshold. These clips move to `video.filtered_clips`, and `video.clip_stats.num_filtered_by_aesthetic` increments (see the inspection sketch after this list).
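After both passes, you can check these per-clip fields directly. The following is a minimal sketch, assuming your pipeline output exposes `video` objects with the `filtered_clips` and `clip_stats` fields named above; how you obtain those objects from a run depends on your setup and is not shown here.

```python
# Minimal sketch: summarize filtering results on a processed `video` object.
# Assumes `video.filtered_clips` and `video.clip_stats` exist with the counter
# names described above; adapt attribute access to your actual task objects.
def summarize_filtering(video) -> None:
    stats = video.clip_stats
    print(f"filtered by motion:     {stats.num_filtered_by_motion}")
    print(f"filtered by aesthetics: {stats.num_filtered_by_aesthetic}")
    print(f"total filtered clips:   {len(video.filtered_clips)}")
```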
Before You Start#
Motion decoding and aesthetic scoring operate on clip buffers. You must run clipping and encoding first so each clip has a valid buffer (`bytes`).
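If you are not sure that clipping and encoding ran upstream, a quick precondition check can save a failed run. This is a sketch only: the `video.clips` and `clip.buffer` names are assumptions based on the description above, not a documented API.

```python
# Sketch: confirm every clip carries an encoded buffer before filtering.
# `video.clips` and `clip.buffer` are assumed attribute names; adjust them
# to match the task objects your pipeline actually produces.
def check_clip_buffers(video) -> None:
    for clip in video.clips:
        buffer = getattr(clip, "buffer", None)
        if not isinstance(buffer, (bytes, bytearray)) or len(buffer) == 0:
            raise ValueError(
                f"Clip {clip!r} has no encoded buffer; run clipping and encoding first."
            )
```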
Quickstart#
Use the pipeline stages or the example script flags to enable motion and aesthetic filtering.
Using the pipeline API:

```python
from nemo_curator.pipeline import Pipeline
from nemo_curator.stages.video.filtering.motion_filter import (
    MotionVectorDecodeStage,
    MotionFilterStage,
)
from nemo_curator.stages.video.filtering.clip_aesthetic_filter import (
    ClipAestheticFilterStage,
)

pipe = Pipeline(name="filtering_examples")

# Motion filtering
pipe.add_stage(
    MotionVectorDecodeStage(target_fps=2.0, target_duration_ratio=0.5, num_cpus_per_worker=4.0)
)
pipe.add_stage(
    MotionFilterStage(
        score_only=False,
        global_mean_threshold=0.00098,
        per_patch_min_256_threshold=0.000001,
        motion_filter_batch_size=64,
        num_gpus_per_worker=0.5,
        verbose=True,
    )
)

# Aesthetic filtering (assumes frames extracted upstream)
pipe.add_stage(
    ClipAestheticFilterStage(
        model_dir="/models",
        score_threshold=3.5,
        reduction="min",
        target_fps=1.0,
        num_gpus_per_worker=0.25,
        verbose=True,
    )
)

pipe.run()
```
Using the example script:

```bash
# Motion filtering
python -m nemo_curator.examples.video.video_split_clip_example \
  ... \
  --motion-filter enable \
  --motion-decode-target-fps 2.0 \
  --motion-decode-target-duration-ratio 0.5 \
  --motion-decode-cpus-per-worker 4.0 \
  --motion-global-mean-threshold 0.00098 \
  --motion-per-patch-min-256-threshold 0.000001 \
  --motion-score-batch-size 64 \
  --motion-score-gpus-per-worker 0.5

# Aesthetic filtering
python -m nemo_curator.examples.video.video_split_clip_example \
  ... \
  --aesthetic-threshold 3.5 \
  --aesthetic-reduction min \
  --aesthetic-gpus-per-worker 0.25
```
Filtering Options#
Motion Filtering#
Motion filtering is a two‑step process: first decode motion vectors, then filter clips based on motion scores.
1. Add `MotionVectorDecodeStage` to sample motion vectors from each clip.

   ```python
   from nemo_curator.stages.video.filtering.motion_filter import MotionVectorDecodeStage

   decode = MotionVectorDecodeStage(
       target_fps=2.0,
       target_duration_ratio=0.5,
       num_cpus_per_worker=4.0,
   )
   ```
   This step adds `decoded_motion_data` to each clip, or records an error in `clip.errors`.

2. Add `MotionFilterStage` to compute motion scores and filter out low-motion clips.

   ```python
   from nemo_curator.stages.video.filtering.motion_filter import MotionFilterStage

   motion = MotionFilterStage(
       score_only=False,
       global_mean_threshold=0.00098,
       per_patch_min_256_threshold=0.000001,
       motion_filter_batch_size=64,
       num_gpus_per_worker=0.5,
       verbose=True,
   )
   ```
   This stage:

   - Adds `motion_score_global_mean` and `motion_score_per_patch_min_256` to each clip (see the sketch after this list for reading them back).
   - Moves filtered clips to `video.filtered_clips` and increments `video.clip_stats.num_filtered_by_motion`.
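To understand why particular clips were kept or dropped, you can read the motion scores back off each clip. A minimal sketch, assuming the scores are attached to clips under the attribute names listed above (verify this against your clip objects):

```python
# Sketch: print per-clip motion scores after MotionFilterStage has run.
# Attribute and field names follow the description above and are assumptions.
def print_motion_scores(video) -> None:
    for clip in list(video.clips) + list(video.filtered_clips):
        global_mean = getattr(clip, "motion_score_global_mean", None)
        patch_min = getattr(clip, "motion_score_per_patch_min_256", None)
        print(f"{clip}: global_mean={global_mean}, per_patch_min_256={patch_min}")
```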
Parameters#
`MotionVectorDecodeStage` accepts the following parameters:

| Parameter | Type | Default | Description |
|---|---|---|---|
| `num_cpus_per_worker` | float | 6.0 | CPU cores reserved per worker for decoding motion vectors. |
| `target_fps` | float | 2.0 | Target frames per second for sampling motion vectors. |
| `target_duration_ratio` | float | 0.5 | Fraction of each clip's duration to decode for motion analysis. |
| `verbose` | bool | | Log warnings and per-clip issues during decoding. |
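Decoding cost scales with how densely you sample each clip. As a rough illustration (the settings here are examples, not recommendations), lowering `target_fps` and `target_duration_ratio` reduces CPU work per clip at the cost of a coarser motion estimate:

```python
from nemo_curator.stages.video.filtering.motion_filter import MotionVectorDecodeStage

# Cheaper, coarser sampling: fewer motion vectors per clip and less CPU per worker.
fast_decode = MotionVectorDecodeStage(
    target_fps=1.0,              # sample motion vectors less frequently
    target_duration_ratio=0.25,  # decode only a quarter of each clip's duration
    num_cpus_per_worker=2.0,
)
```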
`MotionFilterStage` accepts the following parameters:

| Parameter | Type | Default | Description |
|---|---|---|---|
| `score_only` | bool | | Compute motion scores without filtering out clips. |
| `global_mean_threshold` | float | 0.00098 | Threshold on the global mean motion score; lower implies less motion. |
| `per_patch_min_256_threshold` | float | 0.000001 | Threshold on the minimum per-patch score over 256 patches. |
| `motion_filter_batch_size` | int | 256 | Batch size for GPU computation; decrease to reduce memory usage. |
| `num_gpus_per_worker` | float | 0.0 | GPUs reserved per worker for motion scoring (0 uses the CPU path). |
| `verbose` | bool | | Log per-clip decisions and scores. |
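The motion thresholds are data dependent, so one practical approach is to score first and filter later: run `MotionFilterStage` with `score_only=True`, collect the scores, and pick thresholds from their distribution. The sketch below assumes the scores can be read back from clips as in the earlier example; the percentile choice is illustrative.

```python
import numpy as np

from nemo_curator.stages.video.filtering.motion_filter import MotionFilterStage

# Pass 1: compute motion scores without dropping any clips.
scoring_stage = MotionFilterStage(score_only=True, verbose=True)

# After a scoring-only run, derive candidate thresholds from a low percentile.
# The attribute names on `clip` are assumptions based on the fields above.
def suggest_thresholds(clips, percentile: float = 5.0) -> tuple[float, float]:
    global_means = [clip.motion_score_global_mean for clip in clips]
    patch_mins = [clip.motion_score_per_patch_min_256 for clip in clips]
    return (
        float(np.percentile(global_means, percentile)),
        float(np.percentile(patch_mins, percentile)),
    )
```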
Aesthetic Filtering#
Aesthetic filtering works best when you prepare frames first, then score clips using a CLIP‑based aesthetic model.
1. Extract frames earlier in the pipeline. Use a frame extraction stage with a `sequence` policy and set a `target_fps` that matches the aesthetic stage. Refer to Frame Extraction for guidance.

   ```python
   # Example: upstream frame extraction snippet (pseudocode)
   from nemo_curator.stages.video.frame_extraction import FrameExtractionStage

   frames = FrameExtractionStage(policy="sequence", target_fps=1.0)
   ```

   Frame Requirements:

   - Use the `sequence` frame extraction policy.
   - Match `target_fps` here and in the aesthetic stage.
   - Ensure `clip.extracted_frames` contains frames for the signature `sequence-<target_fps>`.
2. Add `ClipAestheticFilterStage` to score each clip and drop clips below a threshold.

   ```python
   from nemo_curator.stages.video.filtering.clip_aesthetic_filter import ClipAestheticFilterStage

   aesthetic = ClipAestheticFilterStage(
       model_dir="/models",
       score_threshold=3.5,
       reduction="min",  # or "mean"
       target_fps=1.0,
       num_gpus_per_worker=0.25,
       verbose=True,
   )
   ```
   This stage:

   - Adds `aesthetic_score` to each clip (see the sketch after this list).
   - Moves filtered clips to `video.filtered_clips` and increments `video.clip_stats.num_filtered_by_aesthetic`.
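If more clips than expected are removed here, two common causes are a frame-signature mismatch and a threshold that is too strict. The sketch below checks both; the signature string format (for example, `sequence-1.0`) and the clip attribute names are assumptions based on the description above.

```python
# Sketch: verify the expected frame signature and inspect aesthetic scores.
# Confirm the signature key format and attribute names against your version.
target_fps = 1.0
signature = f"sequence-{target_fps}"

def inspect_aesthetics(video) -> None:
    for clip in list(video.clips) + list(video.filtered_clips):
        frames = getattr(clip, "extracted_frames", {}) or {}
        if signature not in frames:
            print(f"{clip}: no frames for {signature} (available: {list(frames)})")
        print(f"{clip}: aesthetic_score={getattr(clip, 'aesthetic_score', None)}")
    print(f"filtered by aesthetics: {video.clip_stats.num_filtered_by_aesthetic}")
```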
Parameters#
`ClipAestheticFilterStage` accepts the following parameters:

| Parameter | Type | Default | Description |
|---|---|---|---|
| `model_dir` | str | | Directory for model weights; downloaded on each node if missing. |
| `score_threshold` | float | 0.5 | Minimum aesthetic score required to keep a clip. |
| `reduction` | {"mean", "min"} | | Aggregate frame-level scores using the mean or the minimum. |
| `target_fps` | float | 1.0 | Frame sampling rate; must match the `target_fps` used for frame extraction. |
| `num_gpus_per_worker` | float | 0.25 | GPUs reserved per worker for aesthetic scoring. |
| `verbose` | bool | | Log per-clip aesthetic scores and decisions. |
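The `reduction` setting controls how frame-level scores collapse into one clip-level score: `min` keeps a clip only if its weakest frame clears the threshold, while `mean` tolerates a few weak frames. A small, self-contained illustration with made-up numbers:

```python
# Illustrative only: how "min" versus "mean" reduction changes the clip decision.
frame_scores = [4.2, 3.9, 1.8, 4.5]  # hypothetical per-frame aesthetic scores
score_threshold = 3.5

clip_score_min = min(frame_scores)                       # 1.8 -> clip is filtered out
clip_score_mean = sum(frame_scores) / len(frame_scores)  # 3.6 -> clip is kept

print(f"min:  {clip_score_min:.2f} keep={clip_score_min >= score_threshold}")
print(f"mean: {clip_score_mean:.2f} keep={clip_score_mean >= score_threshold}")
```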