nemo_curator.stages.audio.postprocessing.timestamp_mapper


Timestamp mapper stage.

Normalizes task data at the pipeline output boundary. Handles four sources of timing information (checked in priority order):

  1. segment_mappings in task._metadata — remaps concat-space positions back to original file positions.
  2. start_ms / end_ms in task.data — uses them directly as original positions (from VAD fan-out).
  3. diar_segments in task.data — computes span from first segment start to last segment end (from SpeakerSep).
  4. duration fallback — uses whole-file duration.
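The priority order above can be sketched as a simple chain of checks. This is an illustration only, not the stage's actual code: `resolve_timing` is a hypothetical helper, and the shape of each diarization segment (a dict with `start_ms` and `end_ms`) and the unit of `duration` (seconds) are assumptions that the text above does not confirm.

```python
from typing import Any, Optional


def resolve_timing(
    data: dict[str, Any], metadata: dict[str, Any]
) -> tuple[str, Optional[tuple[int, int]]]:
    """Hypothetical helper: pick the first timing source that applies.

    Returns a label for the source and, where it can be computed directly,
    the (start_ms, end_ms) span.
    """
    # 1. segment_mappings: remapping is a separate step, so no span is returned here.
    if metadata.get("segment_mappings"):
        return "segment_mappings", None
    # 2. start_ms / end_ms: used directly as original positions.
    if "start_ms" in data and "end_ms" in data:
        return "start_end_ms", (data["start_ms"], data["end_ms"])
    # 3. diar_segments: first segment start to last segment end.
    #    Assumed shape: list of dicts with "start_ms" and "end_ms".
    segments = data.get("diar_segments")
    if segments:
        return "diar_segments", (segments[0]["start_ms"], segments[-1]["end_ms"])
    # 4. Fallback: whole-file duration (assumed to be in seconds).
    return "duration", (0, round(data.get("duration", 0.0) * 1000))
```

Because the checks run in order, a task that carries both `start_ms` / `end_ms` and `diar_segments` resolves to the former.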

Output control uses two layers:

  • passthrough_keys (whitelist): only keys in this list are copied from the input to the output. Defaults to all built-in quality filter and speaker metadata keys. Users can override via config.
  • _NEVER_PASS_KEYS (safety net): non-serializable keys that are always blocked, even if accidentally added to passthrough_keys.
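A minimal sketch of how the two layers might combine: the whitelist selects keys, and the blocklist removes any key that must never be emitted. The function name `filter_passthrough` is hypothetical; the blocked key set is the one documented for `_NEVER_PASS_KEYS` on this page.

```python
from typing import Any

# Values as documented for _NEVER_PASS_KEYS on this page.
NEVER_PASS_KEYS = frozenset({"waveform", "audio", "audio_data", "audio_array", "segments"})


def filter_passthrough(item: dict[str, Any], passthrough_keys: list[str]) -> dict[str, Any]:
    """Hypothetical helper: copy whitelisted keys, skipping blocked ones."""
    output: dict[str, Any] = {}
    for key in passthrough_keys:
        if key in NEVER_PASS_KEYS:
            continue  # safety net: blocked even if whitelisted by mistake
        if key in item:
            output[key] = item[key]  # keys missing from the input are skipped
    return output
```

With this shape, adding `"waveform"` to the whitelist by mistake has no effect on the output.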

Module Contents

Classes

| Name | Description |
| --- | --- |
| TimestampMapperStage | Normalize task data at the pipeline output boundary. |

Functions

| Name | Description |
| --- | --- |
| _translate_to_original | Translate concatenated position range to original file positions. |

Data

_DEFAULT_PASSTHROUGH_KEYS

_NEVER_PASS_KEYS

API

class nemo_curator.stages.audio.postprocessing.timestamp_mapper.TimestampMapperStage(
passthrough_keys: list[str] | None = None,
name: str = 'TimestampMapper',
batch_size: int = 1,
resources: nemo_curator.stages.resources.Resources = (lambda: Resources(cpus=1.0...
)
Dataclass

Bases: ProcessingStage[AudioTask, AudioTask]

Normalize task data at the pipeline output boundary.

Constructs core output fields from available timing sources, then copies only the keys listed in passthrough_keys from the input.

Core fields (always present, not controlled by passthrough_keys): original_file, original_start_ms, original_end_ms, duration_ms, duration. When diarization segments are available, diar_segments and speaking_duration are also set as core fields.
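As a rough illustration of the always-present core fields, the sketch below builds them from a resolved span. `build_core_fields` is a hypothetical helper, and treating `duration` as `duration_ms` expressed in seconds is an assumption; the page lists the field names but not their units.

```python
from typing import Any


def build_core_fields(original_file: str, start_ms: int, end_ms: int) -> dict[str, Any]:
    """Hypothetical helper: assemble the core output fields for one span."""
    duration_ms = end_ms - start_ms
    return {
        "original_file": original_file,
        "original_start_ms": start_ms,
        "original_end_ms": end_ms,
        "duration_ms": duration_ms,
        "duration": duration_ms / 1000.0,  # assumption: seconds
    }
```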

Parameters:

passthrough_keys
list[str] | None. Defaults to None

Keys to copy from input to output. Defaults to all built-in quality filter and speaker metadata keys. Override to include custom fields or restrict the output schema.

batch_size
int = 1
name
str = 'TimestampMapper'
passthrough_keys
list[str] | None = field(default=None)
resources
Resources
nemo_curator.stages.audio.postprocessing.timestamp_mapper.TimestampMapperStage.__post_init__()
nemo_curator.stages.audio.postprocessing.timestamp_mapper.TimestampMapperStage._build_output_item(
item: dict[str, typing.Any],
orig: dict[str, typing.Any]
) -> dict[str, typing.Any]
nemo_curator.stages.audio.postprocessing.timestamp_mapper.TimestampMapperStage._build_output_item_no_mapping(
item: dict[str, typing.Any]
) -> dict[str, typing.Any]
nemo_curator.stages.audio.postprocessing.timestamp_mapper.TimestampMapperStage._copy_passthrough(
item: dict[str, typing.Any],
result: dict[str, typing.Any]
) -> None
nemo_curator.stages.audio.postprocessing.timestamp_mapper.TimestampMapperStage.inputs() -> tuple[list[str], list[str]]
nemo_curator.stages.audio.postprocessing.timestamp_mapper.TimestampMapperStage.outputs() -> tuple[list[str], list[str]]
nemo_curator.stages.audio.postprocessing.timestamp_mapper.TimestampMapperStage.process(
task: nemo_curator.tasks.AudioTask
) -> nemo_curator.tasks.AudioTask | list[nemo_curator.tasks.AudioTask]
nemo_curator.stages.audio.postprocessing.timestamp_mapper._translate_to_original(
mappings: list[dict[str, typing.Any]],
concat_start_ms: int,
concat_end_ms: int
) -> list[dict[str, typing.Any]]

Translate concatenated position range to original file positions.
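The idea of mapping a concat-space range back to original positions can be sketched as an interval-overlap pass. This is not the library's implementation: the mapping keys used here (`original_file`, `concat_start_ms`, `concat_end_ms`, `original_start_ms`) and the shape of the returned dicts are assumptions chosen for illustration, since the page does not document the mapping schema.

```python
from typing import Any


def translate_to_original_sketch(
    mappings: list[dict[str, Any]], concat_start_ms: int, concat_end_ms: int
) -> list[dict[str, Any]]:
    """Illustrative only: map a concat-space range onto original-file spans.

    Assumed mapping keys (not confirmed by the docs): original_file,
    concat_start_ms, concat_end_ms, original_start_ms.
    """
    results: list[dict[str, Any]] = []
    for mapping in mappings:
        # Clip the requested range to this mapping's concat-space interval.
        overlap_start = max(concat_start_ms, mapping["concat_start_ms"])
        overlap_end = min(concat_end_ms, mapping["concat_end_ms"])
        if overlap_start >= overlap_end:
            continue  # the requested range does not touch this segment
        # Shift from concat space into the original file's timeline.
        offset = mapping["original_start_ms"] - mapping["concat_start_ms"]
        results.append(
            {
                "original_file": mapping["original_file"],
                "original_start_ms": overlap_start + offset,
                "original_end_ms": overlap_end + offset,
            }
        )
    return results
```

A range that straddles two concatenated segments yields two entries, one per original file, which is consistent with the documented return type of a list of dicts.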

nemo_curator.stages.audio.postprocessing.timestamp_mapper._DEFAULT_PASSTHROUGH_KEYS: list[str] = ['speaker_id', 'num_speakers', 'speaking_duration', 'sample_rate', 'utmos_mos', ...
nemo_curator.stages.audio.postprocessing.timestamp_mapper._NEVER_PASS_KEYS = frozenset({'waveform', 'audio', 'audio_data', 'audio_array', 'segments'})