nemo_curator.stages.audio.alm.alm_data_overlap
ALM Data Overlap Stage - Native NeMo Curator Implementation.
Filters overlapping windows against an overlap threshold. Follows the exact pattern from NeMo Curator: https://github.com/NVIDIA-NeMo/Curator/blob/main/nemo_curator/stages/audio/common.py
Produces output identical to the SDP implementation.
Module Contents
Classes
Functions
Data
API
Bases: ProcessingStage[AudioTask, AudioTask]
Filter overlapping ALM windows.
Removes windows whose overlap exceeds the threshold, keeping the windows closest to the target duration.
Validate parameters.
Filter overlapping windows from an entry.
Calculate list of durations from windows data.
Calculate (end, start) timestamp pairs from windows data.
Calculate total duration from windows data.
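The three calculations above (durations, (end, start) pairs, total duration) can be illustrated with a minimal sketch. The window layout assumed here, a dict with a `segments` list whose items carry `start` and `end` in seconds, and all function names are hypothetical and are not taken from the module. The total is assumed to be a plain sum of window durations, so overlapping audio is counted more than once.

```python
from typing import Any

# Hypothetical window layout: {"segments": [{"start": float, "end": float}, ...]}
Window = dict[str, Any]


def window_timestamps(windows: list[Window]) -> list[tuple[float, float]]:
    """Return one (end, start) pair per window, matching the tuple order used on this page."""
    pairs = []
    for window in windows:
        segments = window["segments"]
        start = segments[0]["start"]   # window starts with its first segment
        end = segments[-1]["end"]      # window ends with its last segment
        pairs.append((end, start))
    return pairs


def window_durations(windows: list[Window]) -> list[float]:
    """Return the duration of each window in seconds."""
    return [end - start for end, start in window_timestamps(windows)]


def total_duration(windows: list[Window]) -> float:
    """Return the sum of all window durations (assumed: overlaps are not deduplicated)."""
    return sum(window_durations(windows))


example_windows = [
    {"segments": [{"start": 0.0, "end": 4.0}, {"start": 4.0, "end": 10.0}]},
    {"segments": [{"start": 8.0, "end": 20.0}]},
]
print(window_timestamps(example_windows))  # [(10.0, 0.0), (20.0, 8.0)]
print(window_durations(example_windows))   # [10.0, 12.0]
print(total_duration(example_windows))     # 22.0
```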
Filter out segments that have overlap greater than threshold.
Get complete window objects that correspond to filtered timestamps.
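One way to map filtered timestamps back to complete window objects is a set lookup on the (end, start) pair. This is only a sketch: the window layout (a dict with a `segments` list of `start`/`end` items) and the function name are assumptions made for illustration, not the module's actual data model.

```python
from typing import Any

# Hypothetical window layout: {"segments": [{"start": float, "end": float}, ...]}
Window = dict[str, Any]


def windows_for_timestamps(
    windows: list[Window],
    kept: list[tuple[float, float]],
) -> list[Window]:
    """Return the windows whose (end, start) pair appears in `kept`, in input order."""
    kept_pairs = set(kept)  # set membership keeps the lookup O(1) per window
    selected = []
    for window in windows:
        segments = window["segments"]
        pair = (segments[-1]["end"], segments[0]["start"])
        if pair in kept_pairs:
            selected.append(window)
    return selected


first = {"segments": [{"start": 0.0, "end": 10.0}]}
second = {"segments": [{"start": 8.0, "end": 20.0}]}
print(windows_for_timestamps([first, second], [(20.0, 8.0)]) == [second])  # True
```

In this sketch, two windows with identical boundaries would both be returned, because the pair alone does not distinguish them.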
Calculate overlap ratio between two segments (stored as (end, start) tuples).
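The overlap ratio and the filtering step described above can be sketched together. Several choices here are assumptions made for illustration and may differ from the stage's actual behavior: the ratio is defined as the intersection length divided by the shorter segment's duration, candidates are visited greedily in order of closeness to the target duration, and the function and parameter names are hypothetical.

```python
Segment = tuple[float, float]  # (end, start), the tuple order used on this page


def overlap_ratio(a: Segment, b: Segment) -> float:
    """Assumed definition: intersection length over the shorter segment's duration."""
    end_a, start_a = a
    end_b, start_b = b
    intersection = max(0.0, min(end_a, end_b) - max(start_a, start_b))
    shorter = min(end_a - start_a, end_b - start_b)
    return intersection / shorter if shorter > 0 else 0.0


def filter_overlapping(
    timestamps: list[Segment],
    threshold: float,
    target_duration: float,
) -> list[Segment]:
    """Greedily keep segments, preferring those closest to the target duration."""
    # Visit the best candidates first so that they win any overlap conflict.
    ranked = sorted(
        timestamps,
        key=lambda seg: abs((seg[0] - seg[1]) - target_duration),
    )
    kept: list[Segment] = []
    for candidate in ranked:
        # Keep the candidate only if it stays within the threshold against every kept segment.
        if all(overlap_ratio(candidate, other) <= threshold for other in kept):
            kept.append(candidate)
    return sorted(kept, key=lambda seg: seg[1])  # restore chronological order by start


segments = [(10.0, 0.0), (20.0, 8.0), (32.0, 20.0)]
print(overlap_ratio(segments[0], segments[1]))    # 0.2 (2 s shared, shorter segment is 10 s)
print(filter_overlapping(segments, 0.1, 12.0))    # [(20.0, 8.0), (32.0, 20.0)]
print(filter_overlapping(segments, 0.5, 12.0))    # all three segments survive
```

With a threshold of 0.1, the 10-second segment is dropped because the two 12-second segments match the target duration exactly and are therefore considered first.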
Get total duration of qualified segments.
Get duration list of qualified segments.