bridge.models.qwen3_asr.hf_qwen3_asr.processing_qwen3_asr#
Module Contents#
Classes#
Constructs a Qwen3ASR processor.
Functions#
Computes the output length of the convolutional layers and the output length of the audio encoder
Data#
API#
- class bridge.models.qwen3_asr.hf_qwen3_asr.processing_qwen3_asr.Qwen3ASRProcessorKwargs#
Bases: transformers.processing_utils.ProcessingKwargs
- _defaults#
None
- bridge.models.qwen3_asr.hf_qwen3_asr.processing_qwen3_asr._get_feat_extract_output_lengths(input_lengths)#
Computes the output length of the convolutional layers and the output length of the audio encoder
- class bridge.models.qwen3_asr.hf_qwen3_asr.processing_qwen3_asr.Qwen3ASRProcessor(
- feature_extractor=None,
- tokenizer=None,
- chat_template=None,
)#
Bases: transformers.processing_utils.ProcessorMixin

Constructs a Qwen3ASR processor. [Qwen3ASRProcessor] offers all the functionalities of [WhisperFeatureExtractor] and [Qwen2TokenizerFast]. See [~Qwen3ASRProcessor.__call__] and [~Qwen3ASRProcessor.decode] for more information.

- Parameters:
feature_extractor ([WhisperFeatureExtractor], optional) – The audio feature extractor.
tokenizer ([Qwen2TokenizerFast], optional) – The text tokenizer.
chat_template (Optional[str], optional) – The Jinja template to use for formatting the conversation. If not provided, the default chat template is used.
Initialization
- attributes#
['feature_extractor', 'tokenizer']
- feature_extractor_class#
'WhisperFeatureExtractor'
- tokenizer_class#
('Qwen2Tokenizer', 'Qwen2TokenizerFast')
- __call__(
- text: transformers.tokenization_utils_base.TextInput = None,
- audio: transformers.audio_utils.AudioInput = None,
- **kwargs,
)#
Main method to prepare one or several text sequence(s) and audio(s) for the model. This method forwards the text and kwargs arguments to Qwen2TokenizerFast's [~Qwen2TokenizerFast.__call__] if text is not None to encode the text. To prepare the audio(s), this method forwards the audio and kwargs arguments to WhisperFeatureExtractor's [~WhisperFeatureExtractor.__call__] if audio is not None. Please refer to the docstring of the above two methods for more information.

- Parameters:
text (str, List[str], List[List[str]]) – The sequence or batch of sequences to be encoded. Each sequence can be a string or a list of strings (pretokenized string). If the sequences are provided as a list of strings (pretokenized), you must set is_split_into_words=True (to lift the ambiguity with a batch of sequences).
audio (np.ndarray, List[np.ndarray]) – The audio or batch of audios to be prepared. Each audio can be a NumPy array.
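The conditional dispatch described above can be sketched standalone with toy stand-ins for the tokenizer and feature extractor (the real processor delegates to Qwen2TokenizerFast and WhisperFeatureExtractor; the framing parameters below are illustrative assumptions):

```python
import numpy as np

def toy_call(text=None, audio=None):
    # Mimics the dispatch in Qwen3ASRProcessor.__call__: encode text if
    # given, extract audio features if given, and merge the outputs.
    outputs = {}
    if text is not None:
        # Stand-in for Qwen2TokenizerFast.__call__ (toy whitespace ids).
        outputs["input_ids"] = [hash(tok) % 1000 for tok in text.split()]
    if audio is not None:
        # Stand-in for WhisperFeatureExtractor.__call__: toy framing with
        # 25 ms windows and a 10 ms hop at 16 kHz (assumed values).
        frame, hop = 400, 160
        n_frames = max(0, 1 + (len(audio) - frame) // hop)
        outputs["input_features"] = np.stack(
            [audio[i * hop : i * hop + frame] for i in range(n_frames)]
        )
    return outputs

out = toy_call(text="transcribe this", audio=np.zeros(16000, dtype=np.float32))
```

Passing only text or only audio yields a dict with just the corresponding keys, mirroring the if-not-None forwarding documented above.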
- replace_multimodal_special_tokens(text, audio_lengths)#
- get_chunked_index(
- token_indices: numpy.ndarray,
- tokens_per_chunk: int,
)#
Splits a token index list into chunks based on token value ranges.

Given a list of token indices, returns a list of (start, end) index tuples representing slices of the list where the token values fall within successive ranges of tokens_per_chunk.

For example, if tokens_per_chunk is 1000, the function will create chunks such that:
- the first chunk contains token values < 1000,
- the second chunk contains values >= 1000 and < 2000, and so on.

- Parameters:
token_indices (np.ndarray) – A monotonically increasing list of token index values.
tokens_per_chunk (int) – Number of tokens per chunk (used as the chunk size threshold).
- Returns:
A list of tuples, each representing the start (inclusive) and end (exclusive) indices of a chunk in token_indices.
- Return type:
list[tuple[int, int]]
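The chunking behavior documented above can be sketched as a standalone function (a sketch of the documented contract, not the library's exact implementation):

```python
import numpy as np

def get_chunked_index(token_indices, tokens_per_chunk):
    # Split a monotonically increasing index array into (start, end)
    # slices, where slice k covers values in
    # [k * tokens_per_chunk, (k + 1) * tokens_per_chunk).
    chunks = []
    start = 0
    bound = tokens_per_chunk
    for i, value in enumerate(token_indices):
        if value >= bound:
            chunks.append((start, i))  # close the current chunk
            start = i
            bound += tokens_per_chunk
    chunks.append((start, len(token_indices)))  # final (possibly short) chunk
    return chunks

indices = np.array([0, 500, 999, 1000, 1500, 2500])
# -> [(0, 3), (3, 5), (5, 6)]: values < 1000, then < 2000, then the rest.
```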
- apply_chat_template(conversations, chat_template=None, **kwargs)#
- property model_input_names#
- bridge.models.qwen3_asr.hf_qwen3_asr.processing_qwen3_asr.__all__#
['Qwen3ASRProcessor']