nemo_curator.stages.audio.io.convert
Module Contents
Classes
API
Bases: ProcessingStage[AudioTask, DocumentBatch]
Convert AudioTask entries into DocumentBatch DataFrames.
Overrides process_batch to aggregate an entire batch of
AudioTask objects into a single multi-row DocumentBatch,
avoiding the overhead of many single-row DataFrames. Set
batch_size to control how many audio entries land in each
DataFrame (default 64).
batch_size
name
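
The batching behaviour described above can be sketched in plain Python. This is a minimal illustration, not the library's implementation: `FakeAudioTask`, `FakeDocumentBatch`, and `run` are hypothetical stand-ins, the `audio_filepath` and `duration` fields are assumed for the example, and a list of row dicts stands in for the DataFrame so the snippet needs no third-party packages.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class FakeAudioTask:
    """Hypothetical stand-in for AudioTask: one audio entry as a dict."""
    data: dict[str, Any]


@dataclass
class FakeDocumentBatch:
    """Hypothetical stand-in for DocumentBatch: rows model DataFrame rows."""
    rows: list[dict[str, Any]] = field(default_factory=list)


def process_batch(tasks: list[FakeAudioTask]) -> list[FakeDocumentBatch]:
    """Aggregate every task in the batch into one multi-row output."""
    if not tasks:
        return []
    # One output object for the whole batch, instead of one per task.
    return [FakeDocumentBatch(rows=[task.data for task in tasks])]


def run(tasks: list[FakeAudioTask], batch_size: int = 64) -> list[FakeDocumentBatch]:
    """Assumed driver: hand tasks to process_batch in groups of batch_size."""
    outputs: list[FakeDocumentBatch] = []
    for start in range(0, len(tasks), batch_size):
        outputs.extend(process_batch(tasks[start:start + batch_size]))
    return outputs


# 150 entries with batch_size=64 yield batches of 64, 64, and 22 rows.
tasks = [
    FakeAudioTask({"audio_filepath": f"clip_{i}.wav", "duration": 1.0 + i})
    for i in range(150)
]
batches = run(tasks, batch_size=64)
sizes = [len(batch.rows) for batch in batches]
print(sizes)
```

Under these assumptions, 150 entries produce three outputs rather than 150 single-row ones, which is the overhead the stage is described as avoiding.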