nemo_curator.stages.audio.tagging.resample_audio

Resample Audio Stage

Resamples audio files to a target sample rate and format. Follows the exact pattern from NeMo Curator: https://github.com/NVIDIA-NeMo/Curator/blob/main/nemo_curator/stages/audio/common.py

Module Contents

Classes

Name | Description
ResampleAudioStage | Stage for resampling audio files in a TTS/ALM dataset.

API

class nemo_curator.stages.audio.tagging.resample_audio.ResampleAudioStage(
resampled_audio_dir: str,
input_format: str = 'wav',
target_sample_rate: int = 16000,
target_format: str = 'wav',
target_nchannels: int = 1,
audio_filepath_key: str = 'audio_filepath',
resampled_audio_filepath_key: str = 'resampled_audio_filepath',
duration_key: str = 'duration',
audio_item_id_key: str = 'audio_item_id',
name: str = 'ResampleAudio'
)
Dataclass

Bases: ProcessingStage[AudioTask, AudioTask]

Stage for resampling audio files in a TTS/ALM dataset.

Takes a manifest containing audio file paths, resamples each file to the target sample rate and format, and creates a new manifest with the updated paths.
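As a rough illustration of how the stage might be constructed, the sketch below passes a few of the documented constructor parameters by keyword. The output directory and the 22050 Hz rate are hypothetical values chosen for the example, and the `build_stage` helper is not part of the library; it only guards the import so the snippet still runs where `nemo_curator` is not installed.

```python
# Constructor arguments taken from the documented signature.
# The directory path and sample rate are hypothetical example values.
stage_kwargs = {
    "resampled_audio_dir": "/tmp/resampled_audio",
    "input_format": "wav",
    "target_sample_rate": 22050,
    "target_format": "wav",
    "target_nchannels": 1,
}


def build_stage(kwargs):
    """Hypothetical helper: return a ResampleAudioStage, or None if
    nemo_curator (or this module) cannot be imported."""
    try:
        from nemo_curator.stages.audio.tagging.resample_audio import (
            ResampleAudioStage,
        )
    except ImportError:
        return None
    return ResampleAudioStage(**kwargs)


stage = build_stage(stage_kwargs)
if stage is None:
    print("nemo_curator is not available; stage was not constructed")
else:
    print(f"Constructed stage {stage.name!r} at {stage.target_sample_rate} Hz")
```

Parameters left out of `stage_kwargs`, such as `audio_filepath_key` and `duration_key`, keep the defaults listed in the signature above.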

audio_filepath_key
str = 'audio_filepath'
audio_item_id_key
str = 'audio_item_id'
duration_key
str = 'duration'
input_format
str = 'wav'
name
str = 'ResampleAudio'
resampled_audio_dir
str
resampled_audio_filepath_key
str = 'resampled_audio_filepath'
target_format
str = 'wav'
target_nchannels
int = 1
target_sample_rate
int = 16000
nemo_curator.stages.audio.tagging.resample_audio.ResampleAudioStage.inputs() -> tuple[list[str], list[str]]
nemo_curator.stages.audio.tagging.resample_audio.ResampleAudioStage.outputs() -> tuple[list[str], list[str]]
nemo_curator.stages.audio.tagging.resample_audio.ResampleAudioStage.process(
task: nemo_curator.tasks.AudioTask
) -> nemo_curator.tasks.AudioTask

Process a single task by resampling the audio file.

Parameters:

task
AudioTask

AudioTask whose data dict contains audio_filepath and, optionally, audio_item_id

Returns: AudioTask

AudioTask with updated metadata
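To make the key bookkeeping concrete, the sketch below models a manifest entry as a plain dict and shows one plausible way the updated entry could look. It is not the stage's implementation: the `sketch_resampled_entry` function, the rule for naming the output file, and the fallback to the source file stem when `audio_item_id` is absent are all assumptions made for illustration. No audio is read or written.

```python
from pathlib import PurePosixPath


def sketch_resampled_entry(
    entry,
    resampled_audio_dir,
    target_format="wav",
    audio_filepath_key="audio_filepath",
    resampled_audio_filepath_key="resampled_audio_filepath",
    audio_item_id_key="audio_item_id",
):
    """Hypothetical illustration of manifest bookkeeping only."""
    source = PurePosixPath(entry[audio_filepath_key])
    # Assumption: use the source file stem when no item id is present.
    item_id = entry.get(audio_item_id_key, source.stem)
    # Assumption: the output file is named <item_id>.<target_format>.
    target = PurePosixPath(resampled_audio_dir) / f"{item_id}.{target_format}"
    updated = dict(entry)  # leave the input entry unmodified
    updated[resampled_audio_filepath_key] = str(target)
    return updated


entry = {"audio_filepath": "/data/raw/utt_0001.flac", "audio_item_id": "utt_0001"}
result = sketch_resampled_entry(entry, "/data/resampled")
print(result["resampled_audio_filepath"])
```

The key names default to the same values as the stage's `audio_filepath_key`, `resampled_audio_filepath_key`, and `audio_item_id_key` parameters, so a stage configured with custom keys would read and write different fields.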

nemo_curator.stages.audio.tagging.resample_audio.ResampleAudioStage.setup_on_node(
_node_info: nemo_curator.backends.base.NodeInfo | None = None,
_worker_metadata: nemo_curator.backends.base.WorkerMetadata | None = None
) -> None