nemo_curator.stages.audio.alm.alm_manifest_writer

ALM Manifest Writer Stage — writes AudioTask dicts to a JSONL manifest.

Module Contents

Classes

Name | Description
ALMManifestWriterStage | Append a single AudioTask to a JSONL manifest file.

API

class nemo_curator.stages.audio.alm.alm_manifest_writer.ALMManifestWriterStage(
name: str = 'alm_manifest_writer',
output_path: str = ''
)
Dataclass

Bases: ProcessingStage[AudioTask, FileGroupTask]

Append a single AudioTask to a JSONL manifest file.

The output file is truncated once per node in setup_on_node() so repeated pipeline runs produce a clean output. Supports local and cloud paths via fsspec.
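The truncate-once-then-append behavior described above can be sketched as follows. This is a minimal illustration, not the stage's actual implementation: it uses the built-in `open` on a local temporary path instead of fsspec, and the helper names (`truncate_manifest`, `append_record`, `read_manifest`, `demo`) and the record fields (`audio_filepath`, `duration`) are hypothetical.

```python
import json
import os
import tempfile


def truncate_manifest(path: str) -> None:
    """Create or empty the manifest so a rerun starts from a clean file."""
    with open(path, "w", encoding="utf-8"):
        pass  # opening in "w" mode is enough to truncate


def append_record(path: str, record: dict) -> None:
    """Append one record to the manifest as a single JSON line."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(record, ensure_ascii=False) + "\n")


def read_manifest(path: str) -> list[dict]:
    """Parse the manifest back into a list of dicts, skipping blank lines."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


def demo() -> list[dict]:
    path = os.path.join(tempfile.mkdtemp(), "manifest.jsonl")

    # Simulate leftovers from an earlier pipeline run.
    append_record(path, {"audio_filepath": "stale.wav"})

    # One-time setup step: clear the file before any task is written.
    truncate_manifest(path)

    # Per-task step: each call appends exactly one line.
    append_record(path, {"audio_filepath": "a.wav", "duration": 1.5})
    append_record(path, {"audio_filepath": "b.wav", "duration": 2.0})

    return read_manifest(path)


if __name__ == "__main__":
    for row in demo():
        print(row)
```

Separating the one-time truncation from the per-task append is what keeps earlier tasks from being overwritten while still discarding output from a previous run.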

Parameters:

output_path
str, defaults to ''

Destination JSONL path (local or cloud).

name: str = 'alm_manifest_writer'
output_path: str = ''
nemo_curator.stages.audio.alm.alm_manifest_writer.ALMManifestWriterStage.__post_init__() -> None
nemo_curator.stages.audio.alm.alm_manifest_writer.ALMManifestWriterStage.num_workers() -> int | None
nemo_curator.stages.audio.alm.alm_manifest_writer.ALMManifestWriterStage.process(
task: nemo_curator.tasks.AudioTask
) -> nemo_curator.tasks.FileGroupTask
nemo_curator.stages.audio.alm.alm_manifest_writer.ALMManifestWriterStage.setup_on_node(
_node_info: nemo_curator.backends.base.NodeInfo | None = None,
_worker_metadata: nemo_curator.backends.base.WorkerMetadata | None = None
) -> None
nemo_curator.stages.audio.alm.alm_manifest_writer.ALMManifestWriterStage.xenna_stage_spec() -> dict[str, typing.Any]