bridge.models.qwen3_asr.hf_qwen3_asr.processing_qwen3_asr#

Module Contents#

Classes#

Qwen3ASRProcessorKwargs

Qwen3ASRProcessor

Constructs a Qwen3ASR processor. [Qwen3ASRProcessor] offers all the functionalities of [WhisperFeatureExtractor] and [Qwen2TokenizerFast]. See [~Qwen3ASRProcessor.__call__] and [~Qwen3ASRProcessor.decode] for more information.

Functions#

_get_feat_extract_output_lengths

Computes the output length of the convolutional layers and the output length of the audio encoder.

Data#

__all__

API#

class bridge.models.qwen3_asr.hf_qwen3_asr.processing_qwen3_asr.Qwen3ASRProcessorKwargs#

Bases: transformers.processing_utils.ProcessingKwargs

_defaults#

None

bridge.models.qwen3_asr.hf_qwen3_asr.processing_qwen3_asr._get_feat_extract_output_lengths(input_lengths)#

Computes the output length of the convolutional layers and the output length of the audio encoder.

class bridge.models.qwen3_asr.hf_qwen3_asr.processing_qwen3_asr.Qwen3ASRProcessor(
feature_extractor=None,
tokenizer=None,
chat_template=None,
)#

Bases: transformers.processing_utils.ProcessorMixin

Constructs a Qwen3ASR processor. [Qwen3ASRProcessor] offers all the functionalities of [WhisperFeatureExtractor] and [Qwen2TokenizerFast]. See [~Qwen3ASRProcessor.__call__] and [~Qwen3ASRProcessor.decode] for more information.

Parameters:
  • feature_extractor ([WhisperFeatureExtractor], optional) – The audio feature extractor.

  • tokenizer ([Qwen2TokenizerFast], optional) – The text tokenizer.

  • chat_template (Optional[str], optional) – The Jinja template to use for formatting the conversation. If not provided, the default chat template is used.

Initialization

attributes#

['feature_extractor', 'tokenizer']

feature_extractor_class#

'WhisperFeatureExtractor'

tokenizer_class#

('Qwen2Tokenizer', 'Qwen2TokenizerFast')

__call__(
text: transformers.tokenization_utils_base.TextInput = None,
audio: transformers.audio_utils.AudioInput = None,
**kwargs,
) transformers.feature_extraction_utils.BatchFeature#

Main method to prepare one or several text sequence(s) and audio(s) for the model. This method forwards the text and kwargs arguments to Qwen2TokenizerFast’s [~Qwen2TokenizerFast.__call__] if text is not None to encode the text. To prepare the audio(s), it forwards the audio and kwargs arguments to WhisperFeatureExtractor’s [~WhisperFeatureExtractor.__call__] if audio is not None. Please refer to the docstrings of the two methods above for more information.

Parameters:
  • text (str, List[str], List[List[str]]) – The sequence or batch of sequences to be encoded. Each sequence can be a string or a list of strings (pretokenized string). If the sequences are provided as list of strings (pretokenized), you must set is_split_into_words=True (to lift the ambiguity with a batch of sequences).

  • audio (np.ndarray, List[np.ndarray]) – The audio or batch of audios to be prepared. Each audio should be a NumPy array.

replace_multimodal_special_tokens(text, audio_lengths)#
get_chunked_index(
token_indices: numpy.ndarray,
tokens_per_chunk: int,
) list[tuple[int, int]]#

Splits token index list into chunks based on token value ranges.

Given a list of token indices, returns a list of (start, end) index tuples representing slices of the list where the token values fall within successive ranges of tokens_per_chunk.

For example, if tokens_per_chunk is 1000, the function will create chunks such that:

  • the first chunk contains token values < 1000,

  • the second chunk contains values >= 1000 and < 2000, and so on.

Parameters:
  • token_indices (np.ndarray) – A monotonically increasing array of token index values.

  • tokens_per_chunk (int) – Number of tokens per chunk (used as the chunk-size threshold).

Returns:

A list of tuples, each representing the start (inclusive) and end (exclusive) indices of a chunk in token_indices.

Return type:

list[tuple[int, int]]
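The chunking behaviour described above can be sketched as follows. This is a hypothetical standalone re-implementation for illustration, not the method's actual body:

```python
import numpy as np

def get_chunked_index(
    token_indices: np.ndarray, tokens_per_chunk: int
) -> list[tuple[int, int]]:
    # Map each (monotonically increasing) token value to its chunk id,
    # then emit one (start, end) slice per run of equal chunk ids.
    if len(token_indices) == 0:
        return []
    chunk_ids = np.asarray(token_indices) // tokens_per_chunk
    boundaries = np.flatnonzero(np.diff(chunk_ids)) + 1
    edges = [0, *boundaries.tolist(), len(chunk_ids)]
    return [(edges[i], edges[i + 1]) for i in range(len(edges) - 1)]
```

With tokens_per_chunk=1000, the input [0, 500, 999, 1000, 1500, 2200] yields [(0, 3), (3, 5), (5, 6)]: values below 1000, values in [1000, 2000), and values in [2000, 3000).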

apply_chat_template(conversations, chat_template=None, **kwargs)#
property model_input_names#
bridge.models.qwen3_asr.hf_qwen3_asr.processing_qwen3_asr.__all__#

['Qwen3ASRProcessor']