bridge.data.vlm_datasets.collate#
Collation utilities for building VLM training batches from conversation examples.
Module Contents#
Functions#
| Function | Summary |
|---|---|
| `_gather_assistant_text_segments` | Extract assistant text segments from the structured conversation example. |
| `create_multiturn_loss_mask_by_search` | Tokenizer-agnostic masking via substring search of assistant texts. |
| `phi4_mm_collate_fn` | Collate function for Phi-4 MM model audio input. |
| `qwen2_5_collate_fn` | Collate function for Qwen2.5 VL model. |
| `default_collate_fn` | Default collate function for VLM models. |
Data#
API#
- bridge.data.vlm_datasets.collate.MISSING_QWEN_VL_UTILS_MSG#
‘qwen_vl_utils is required for Qwen2.5 VL processing. Please pip install qwen-vl-utils or provide c…’
- bridge.data.vlm_datasets.collate._gather_assistant_text_segments(example: dict) list[str] #
Extract assistant text segments from the structured conversation example.
The example schema is expected to be {"conversation": [{"role": …, "content": […]}, …]}, where content is a list of items like {"type": "text" | "image" | …, "text": "…"}. Returns a list of concatenated text strings, one per assistant turn.
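The expected schema can be exercised with a minimal sketch. The function below is an illustrative re-implementation of the behavior described above, not the module's private helper:

```python
def gather_assistant_text_segments(example: dict) -> list[str]:
    """Collect one concatenated text string per assistant turn."""
    segments = []
    for turn in example.get("conversation", []):
        if turn.get("role") != "assistant":
            continue
        # Keep only the "text" items of this turn; images etc. are skipped.
        texts = [
            item["text"]
            for item in turn.get("content", [])
            if item.get("type") == "text"
        ]
        segments.append("".join(texts))
    return segments


example = {
    "conversation": [
        {"role": "user", "content": [{"type": "image"},
                                     {"type": "text", "text": "What is shown?"}]},
        {"role": "assistant", "content": [{"type": "text", "text": "A cat "},
                                          {"type": "text", "text": "on a mat."}]},
    ]
}
```

Multiple `text` items within a single assistant turn are concatenated into one segment, which matches the "one string per assistant turn" contract.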
- bridge.data.vlm_datasets.collate.create_multiturn_loss_mask_by_search(
- example: dict,
- input_ids,
- processor,
- skipped_tokens: torch.Tensor,
- )#
Tokenizer-agnostic masking via substring search of assistant texts.
Steps:
1. Tokenize the full conversation with the processor (already done) -> input_ids.
2. Extract the assistant text strings from the structured example.
3. For each assistant text, tokenize it without special tokens and search the input_ids sequentially.
4. On success, unmask that span; otherwise leave it masked.
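The steps above can be sketched in pure Python with a toy character-level tokenizer. The real function works with the processor's tokenizer and torch tensors; the names and shapes here are illustrative assumptions:

```python
def loss_mask_by_search(input_ids, assistant_texts, tokenize):
    """Return a 0/1 loss mask: 1 over token spans matching an assistant text."""
    mask = [0] * len(input_ids)
    cursor = 0  # search resumes after the previous match, keeping turns in order
    for text in assistant_texts:
        target = tokenize(text)  # tokenized without special tokens
        n = len(target)
        for start in range(cursor, len(input_ids) - n + 1):
            if input_ids[start:start + n] == target:
                mask[start:start + n] = [1] * n
                cursor = start + n
                break
        # If no match is found, the span simply stays masked.
    return mask


# Toy tokenizer: one token per character.
tokenize = lambda s: [ord(c) for c in s]
ids = tokenize("user: hi assistant: hello")
mask = loss_mask_by_search(ids, ["hello"], tokenize)
```

Searching sequentially (from `cursor` onward) keeps repeated assistant strings in multi-turn conversations aligned with the correct occurrence, and a failed search degrades gracefully by leaving that turn masked rather than raising.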
- bridge.data.vlm_datasets.collate.phi4_mm_collate_fn(examples, processor)#
Collate function for Phi-4 MM model audio input.
- bridge.data.vlm_datasets.collate.qwen2_5_collate_fn(
- examples: list,
- processor,
- )#
Collate function for Qwen2.5 VL model.
- bridge.data.vlm_datasets.collate.default_collate_fn(
- examples: list,
- processor,
- )#
Default collate function for VLM models.
- bridge.data.vlm_datasets.collate.COLLATE_FNS#
None
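`COLLATE_FNS` presumably serves as a registry mapping a model or processor identifier to the matching collate function, with `default_collate_fn` as the fallback. A hedged usage sketch, in which the registry keys, the resolver, and the stand-in collate function are all assumptions rather than the module's actual API:

```python
def default_collate(examples, processor):
    # Stand-in for default_collate_fn: a real implementation would call the
    # processor and build tensors; here we only show the dispatch shape.
    return {"num_examples": len(examples)}


# Illustrative registry; the key name is an assumption.
collate_fns = {"Qwen2_5_VLProcessor": default_collate}


def resolve_collate_fn(processor, registry, fallback):
    """Pick a collate function by processor class name, else the fallback."""
    return registry.get(type(processor).__name__, fallback)


class FakeProcessor:
    pass


fn = resolve_collate_fn(FakeProcessor(), collate_fns, default_collate)
batch = fn([{"conversation": []}] * 4, FakeProcessor())
```

The resolved function can then be passed as the `collate_fn` argument of a `torch.utils.data.DataLoader`, so each worker builds model-ready batches from raw conversation examples.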