bridge.models.nemotron_omni.nemotron_omni_sound
Module Contents
Classes
BridgeSoundEncoder | Sound encoder wrapper for Bridge that wraps HF transformers’ ParakeetEncoder.
API
- class bridge.models.nemotron_omni.nemotron_omni_sound.BridgeSoundEncoder(config)
Bases: megatron.core.transformer.module.MegatronModule
Sound encoder wrapper for Bridge that wraps HF transformers’ ParakeetEncoder.
Uses the public ParakeetEncoder from transformers so that Megatron-side parameter names line up 1:1 with the Nemotron-Omni HF checkpoint’s sound_encoder.encoder.* state dict.
The outer config carries fields required by LLaVAModel’s sound interface (sound_model_type, sound_pad_to_clip_duration, sound_batch_split) plus the ParakeetEncoderConfig fields needed to build the inner encoder.
Does NOT include a feature extractor – input is pre-processed mel spectrograms of shape (batch, frames, mel_bins), not raw audio waveforms.
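Because the parameter names are described as lining up 1:1 with the checkpoint's sound_encoder.encoder.* entries, selecting the sound encoder's weights from a full checkpoint reduces, in principle, to a prefix filter. The following is a minimal sketch that assumes only the prefix named above. The helper extract_sound_encoder_state and the example key names are hypothetical illustrations, not part of Bridge or transformers.

```python
from typing import Any, Dict

# Prefix taken from the class description; everything else is illustrative.
HF_SOUND_PREFIX = "sound_encoder.encoder."


def extract_sound_encoder_state(hf_state: Dict[str, Any]) -> Dict[str, Any]:
    """Return the entries under HF_SOUND_PREFIX with the prefix stripped.

    Hypothetical helper: not part of Bridge or transformers.
    """
    return {
        key[len(HF_SOUND_PREFIX):]: value
        for key, value in hf_state.items()
        if key.startswith(HF_SOUND_PREFIX)
    }


# Made-up key names, used only to show the filtering behaviour.
example_state = {
    "sound_encoder.encoder.layers.0.weight": "w0",
    "sound_encoder.encoder.layers.0.bias": "b0",
    "language_model.embed.weight": "e",
}
sound_state = extract_sound_encoder_state(example_state)
print(sorted(sound_state))
```

With real checkpoints the values would be tensors rather than strings, and the resulting dictionary would still need to be loaded into whichever submodule actually holds the inner encoder.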
Initialization
- __setattr__(name, value)
- set_input_tensor(input_tensor)
Dummy for the pipeline-parallel set_input_tensor hook.
- forward(sound_clips, sound_length)
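The class description states that input arrives as pre-processed mel spectrograms of shape (batch, frames, mel_bins). A minimal sketch of how variable-length clips could be padded into that layout is given below in plain Python. It assumes that sound_length carries the unpadded frame count of each clip, which this page does not state. The helper pad_mel_batch and the zero padding value are likewise hypothetical.

```python
from typing import List, Tuple

Mel = List[List[float]]  # one clip: frames x mel_bins


def pad_mel_batch(
    clips: List[Mel], pad_value: float = 0.0
) -> Tuple[List[Mel], List[int]]:
    """Pad clips along the frame axis to a common length.

    Returns (batch, lengths): batch has shape (batch, max_frames, mel_bins)
    and lengths holds each clip's original frame count.
    Hypothetical helper: not part of Bridge. Assumes non-empty clips.
    """
    if not clips:
        return [], []
    mel_bins = len(clips[0][0])
    lengths = [len(clip) for clip in clips]
    max_frames = max(lengths)
    batch = []
    for clip in clips:
        if any(len(frame) != mel_bins for frame in clip):
            raise ValueError("all frames must have the same number of mel bins")
        # Append all-pad frames until this clip reaches max_frames.
        padding = [[pad_value] * mel_bins for _ in range(max_frames - len(clip))]
        batch.append([list(frame) for frame in clip] + padding)
    return batch, lengths


clip_a = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]  # 3 frames, 2 mel bins
clip_b = [[0.7, 0.8]]                          # 1 frame, 2 mel bins
batch, lengths = pad_mel_batch([clip_a, clip_b])
print(len(batch), len(batch[0]), len(batch[0][0]), lengths)
```

In a real pipeline the padded batch and the lengths would be converted to tensors before being passed as sound_clips and sound_length. The nested lists here only illustrate the shape convention.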