Qwen2-Audio

Qwen2-Audio is an audio-language model from the Qwen family. Megatron Bridge supports Qwen2-Audio through a dedicated audio model bridge and provider.

Supported Variants

  • Qwen2-Audio-7B-Instruct: https://huggingface.co/Qwen/Qwen2-Audio-7B-Instruct

Architecture Notes

  • The model pairs a Qwen2 language backbone with an audio encoder path.

  • The bridge maps Hugging Face Qwen2AudioForConditionalGeneration checkpoints to the Megatron audio-language model.

  • Inference supports remote audio URLs and local audio files through the audio generation helper.
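Because inference accepts both remote URLs and local files, a caller typically has to decide which kind of source it was given before loading the audio. The sketch below illustrates one way that decision could be made with the Python standard library only. The function names `classify_audio_source` and `resolve_local_audio` are hypothetical and are not part of Megatron Bridge; the actual audio generation helper may handle this differently.

```python
from pathlib import Path
from urllib.parse import urlparse


def classify_audio_source(source: str) -> str:
    """Return "url" for http(s) sources and "file" for everything else.

    Hypothetical helper for illustration; not the Megatron Bridge API.
    """
    scheme = urlparse(source).scheme.lower()
    if scheme in ("http", "https"):
        return "url"
    # Anything without an http(s) scheme is treated as a local path,
    # including Windows paths such as C:\audio\clip.wav.
    return "file"


def resolve_local_audio(source: str) -> Path:
    """Expand "~" and make the path absolute; fail early if it is missing."""
    path = Path(source).expanduser().resolve()
    if not path.is_file():
        raise FileNotFoundError(f"Audio file not found: {path}")
    return path
```

Checking the scheme rather than looking for a leading `http` substring keeps a local file named something like `http_sample.wav` from being mistaken for a remote source.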

Examples

For runnable audio inference commands and expected output, see the Qwen2-Audio examples README.