Copyright © 2026, NVIDIA Corporation.

ALM Overlap Filtering

ALMDataOverlapStage removes redundant training windows that share too much audio content. When two windows overlap by at least a configurable threshold, the stage keeps the window whose duration is closer to the target duration and discards the other.

How it Works

The stage processes each AudioTask independently:

  1. Extracts the windows list produced by ALMDataBuilderStage
  2. Sorts windows by start time
  3. Compares each window against every later window whose start falls before its end (all pairs that overlap in time, not only adjacent ones) and calculates the overlap ratio: the overlap duration divided by the duration of the shorter window
  4. When the overlap ratio meets or exceeds the threshold (overlap_percentage / 100), greedily removes the window whose duration is further from target_duration
  5. Writes filtered results back to the task
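
The steps above can be sketched in plain Python. This is an illustrative approximation, not the stage's actual implementation: the (start, end) tuple representation, the function names, and the rule that ties keep the earlier window are all assumptions made for this example.

```python
from typing import List, Tuple

# Hypothetical window representation: (start, end) in seconds.
Window = Tuple[float, float]


def overlap_ratio(a: Window, b: Window) -> float:
    """Overlap duration divided by the duration of the shorter window."""
    overlap = min(a[1], b[1]) - max(a[0], b[0])
    shorter = min(a[1] - a[0], b[1] - b[0])
    if overlap <= 0 or shorter <= 0:
        return 0.0
    return overlap / shorter


def filter_overlaps(
    windows: List[Window],
    overlap_percentage: int = 0,
    target_duration: float = 120.0,
) -> List[Window]:
    """Greedy overlap filter following the steps described above."""
    threshold = overlap_percentage / 100.0
    ordered = sorted(windows, key=lambda w: w[0])  # step 2: sort by start
    removed = set()
    for i, a in enumerate(ordered):
        if i in removed:
            continue
        for j in range(i + 1, len(ordered)):
            b = ordered[j]
            if b[0] >= a[1]:
                break  # sorted by start, so no later window overlaps a
            if j in removed:
                continue
            if overlap_ratio(a, b) >= threshold:
                dist_a = abs((a[1] - a[0]) - target_duration)
                dist_b = abs((b[1] - b[0]) - target_duration)
                if dist_a <= dist_b:  # assumption: ties keep the earlier window
                    removed.add(j)
                else:
                    removed.add(i)
                    break  # a is gone, stop comparing against it
    return [w for k, w in enumerate(ordered) if k not in removed]


windows = [(0, 120), (60, 180), (120, 240)]
strict = filter_overlaps(windows, overlap_percentage=0)
lenient = filter_overlaps(windows, overlap_percentage=60)
```

In this sketch each neighboring pair overlaps by 60 seconds (ratio 0.5), so a threshold of 0 drops the middle window while a threshold of 60 keeps all three.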

Parameters

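| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| overlap_percentage | int | 0 | Overlap threshold from 0 to 100. Lower values remove more windows. |
| target_duration | float | 120.0 | Preferred window duration in seconds. When two windows overlap too much, the one closer to this duration is kept. |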

Overlap Percentage Behavior

| Value | Behavior | Typical Use Case |
| --- | --- | --- |
| 0 | Remove any overlapping windows | Maximum deduplication, smallest output |
| 50 | Remove windows with 50% or more overlap | Balanced yield and diversity |
| 100 | Keep all windows except fully-contained duplicates (ratio = 1.0) | Minimum filtering, largest output |
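
To make these values concrete, the following sketch computes the overlap ratio for two window pairs and checks each against the three thresholds. The helper names and the sample timestamps are hypothetical, and the comparison assumes the ratio is tested with "greater than or equal to" against overlap_percentage / 100, as the table suggests.

```python
def overlap_ratio(a_start, a_end, b_start, b_end):
    """Overlap duration divided by the duration of the shorter window."""
    overlap = max(0.0, min(a_end, b_end) - max(a_start, b_start))
    shorter = min(a_end - a_start, b_end - b_start)
    return overlap / shorter if shorter > 0 else 0.0


def triggers_removal(ratio, overlap_percentage):
    """True when an overlapping pair meets the threshold (assumed >= test)."""
    return ratio > 0 and ratio >= overlap_percentage / 100.0


# Partial overlap: 30 s shared, shorter window is 60 s long.
partial = overlap_ratio(0, 120, 90, 150)
# Full containment: the 60 s window lies entirely inside the 120 s window.
contained = overlap_ratio(0, 120, 30, 90)

decisions = {
    pct: (triggers_removal(partial, pct), triggers_removal(contained, pct))
    for pct in (0, 50, 100)
}
```

Here the partially overlapping pair has a ratio of 0.5, so it is filtered at thresholds 0 and 50 but survives at 100. The contained pair has a ratio of 1.0 and is filtered at every threshold.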

Basic Usage

    from nemo_curator.stages.audio.alm import ALMDataOverlapStage

    # Remove windows with any overlap
    overlap_filter = ALMDataOverlapStage(
        overlap_percentage=0,
        target_duration=120.0,
    )

Advanced Configuration

Moderate Filtering

    from nemo_curator.stages.audio.alm import ALMDataOverlapStage

    # Keep windows unless they overlap by 50% or more
    overlap_filter = ALMDataOverlapStage(
        overlap_percentage=50,
        target_duration=120.0,
    )

Short-Window Pipeline

When the builder stage targets shorter windows, set target_duration to the same target used by ALMDataBuilderStage:

    from nemo_curator.stages.audio.alm import ALMDataOverlapStage

    overlap_filter = ALMDataOverlapStage(
        overlap_percentage=30,
        target_duration=30.0,  # Match ALMDataBuilderStage target
    )

Output Fields

The stage adds the following user-facing fields to each AudioTask:

| Field | Type | Description |
| --- | --- | --- |
| filtered_windows | list | Windows that passed overlap filtering |
| filtered_dur | float | Total duration of filtered windows in seconds |
| filtered_dur_list | list | Duration of each individual filtered window |
| total_dur_window | float | Total duration of all input windows before filtering |
| manifest_filepath | str | Source manifest path carried through from the builder stage |

The stage also writes several intermediate fields (total_dur_list_window, total_dur_list_window_timestamps, filtered, swift_filepath) that are primarily used for internal bookkeeping. The original windows list produced by ALMDataBuilderStage is preserved so downstream consumers can compare pre- and post-filter results.
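
A quick consistency check over these fields can catch malformed output before training. The sketch below assumes each task's data can be read as a plain dict keyed by the field names above, and that filtered_dur equals the sum of filtered_dur_list. Both points are inferred from the field descriptions rather than confirmed, and check_overlap_fields is a hypothetical helper, not part of NeMo Curator.

```python
import math


def check_overlap_fields(entry):
    """Return a list of problems found; an empty list means consistent."""
    problems = []
    windows = entry.get("filtered_windows", [])
    durations = entry.get("filtered_dur_list", [])
    filtered_dur = entry.get("filtered_dur", 0.0)
    total_dur = entry.get("total_dur_window", 0.0)

    if len(windows) != len(durations):
        problems.append("filtered_windows and filtered_dur_list differ in length")
    # Assumption: filtered_dur is the sum of the per-window durations.
    if not math.isclose(sum(durations), filtered_dur, abs_tol=1e-6):
        problems.append("filtered_dur does not match sum of filtered_dur_list")
    # Filtering only removes windows, so the total should not grow.
    if filtered_dur > total_dur + 1e-6:
        problems.append("filtered_dur exceeds total_dur_window")
    return problems


# Hypothetical entry; the window dict keys are placeholders for this example.
sample = {
    "filtered_windows": [{"start": 0.0, "end": 120.0}, {"start": 120.0, "end": 240.0}],
    "filtered_dur_list": [120.0, 120.0],
    "filtered_dur": 240.0,
    "total_dur_window": 360.0,
}
issues = check_overlap_fields(sample)
```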

Tuning the Overlap Threshold

The right threshold depends on your training requirements:

  • For diverse training data, use a low overlap_percentage (0 to 30) to maximize the variety of audio content in the training set
  • For maximum training volume, use a higher overlap_percentage (70 to 100) to retain more windows at the cost of some redundancy
  • For balanced results, use overlap_percentage=50 as a starting point and adjust based on the ratio of filtered_windows to input windows

Monitor the yield by comparing filtered_dur to total_dur_window in the output.
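
The yield comparison can be expressed as a small helper. This is a minimal sketch that assumes the output fields are available as a plain dict per task. The window_yield name and the sample values are made up for illustration.

```python
def window_yield(entry):
    """Fraction of input window duration that survived overlap filtering."""
    total = entry.get("total_dur_window", 0.0)
    if not total:
        return 0.0  # no input windows, so nothing was retained
    return entry.get("filtered_dur", 0.0) / total


# Hypothetical per-task outputs from two runs with different thresholds.
strict_run = {"filtered_dur": 240.0, "total_dur_window": 480.0}
lenient_run = {"filtered_dur": 420.0, "total_dur_window": 480.0}

strict_yield = window_yield(strict_run)
lenient_yield = window_yield(lenient_run)
```

A yield near 1.0 means few windows were removed. A low yield suggests either heavy overlap in the input or a threshold that is stricter than the training set needs.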

Related Topics

  • ALM Data Builder: Previous stage in the ALM pipeline
  • ALM Pipeline Concepts: Architectural overview
  • ALM Tutorial: End-to-end walkthrough with sample data