ALMDataOverlapStage removes redundant training windows that share too much audio content. When two windows overlap beyond a configurable threshold, the stage keeps the window whose duration is closest to the target and discards the other.
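The selection rule described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the stage's actual implementation: the `Window` class, the `overlap_pct` and `filter_overlaps` helpers, and the definition of overlap as shared time divided by the shorter window's duration are all hypothetical. Only the `overlap_percentage` and `target_duration` parameter names come from this page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Window:
    """Hypothetical stand-in for one training window (times in seconds)."""
    start: float
    end: float

    @property
    def duration(self) -> float:
        return self.end - self.start


def overlap_pct(a: Window, b: Window) -> float:
    """Shared time as a percentage of the shorter window (assumed definition)."""
    shared = min(a.end, b.end) - max(a.start, b.start)
    shorter = min(a.duration, b.duration)
    if shared <= 0 or shorter <= 0:
        return 0.0
    return 100.0 * shared / shorter


def filter_overlaps(windows, *, overlap_percentage, target_duration):
    """Greedy filter: visit windows closest to the target duration first and
    keep a window only if it does not overlap any kept window beyond the threshold."""
    ranked = sorted(windows, key=lambda w: abs(w.duration - target_duration))
    kept = []
    for cand in ranked:
        if all(overlap_pct(cand, k) <= overlap_percentage for k in kept):
            kept.append(cand)
    # Return the survivors in chronological order.
    return sorted(kept, key=lambda w: w.start)


windows = [Window(0.0, 118.0), Window(40.0, 170.0), Window(200.0, 320.0)]
kept = filter_overlaps(windows, overlap_percentage=50, target_duration=120.0)
print(kept)
```

In this example the second window shares about 66% of the first window's audio, which exceeds the threshold of 50. The first window (118 s) is closer to the 120 s target than the second (130 s), so the second is discarded.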
The stage processes each AudioTask independently:
- windows list produced by ALMDataBuilderStage
- target_duration
When using shorter target windows, set the target_duration parameter to match.
The stage adds the following user-facing fields to each AudioTask:
The stage also writes several intermediate fields (total_dur_list_window, total_dur_list_window_timestamps, filtered, swift_filepath) that are primarily used for internal bookkeeping. The original windows list produced by ALMDataBuilderStage is preserved so downstream consumers can compare pre- and post-filter results.
The right threshold depends on your training requirements:
- Use a low overlap_percentage (0 to 30) to maximize the variety of audio content in the training set
- Use a high overlap_percentage (70 to 100) to retain more windows at the cost of some redundancy
- Use overlap_percentage=50 as a starting point and adjust based on the ratio of filtered_windows to input windows
Monitor the yield by comparing filtered_dur to total_dur_window in the output.
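As a rough sketch of that yield check, the helper below computes the fraction of window duration that survives filtering. It assumes that `filtered_dur` and `total_dur_window` are numeric durations in the same unit and that the task's fields can be read like a dictionary; the `window_yield` function and the sample values are hypothetical.

```python
def window_yield(task: dict) -> float:
    """Fraction of window duration that survived filtering.

    Assumes numeric `filtered_dur` and `total_dur_window` fields;
    returns 0.0 when there was no window duration to begin with.
    """
    total = float(task.get("total_dur_window", 0.0))
    kept = float(task.get("filtered_dur", 0.0))
    return kept / total if total > 0 else 0.0


# Hypothetical output record, for illustration only.
task = {"total_dur_window": 600.0, "filtered_dur": 240.0}
print(f"yield: {window_yield(task):.0%}")
```

A yield that stays very low across many tasks may suggest the threshold is too strict for the data, while a yield near 1.0 suggests little is being filtered.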