bridge.data.vlm_datasets.collate#

Collation utilities for building VLM training batches from conversation examples.

Module Contents#

Functions#

_gather_assistant_text_segments

Extract assistant text segments from the structured conversation example.

create_multiturn_loss_mask_by_search

Tokenizer-agnostic masking via substring search of assistant texts.

phi4_mm_collate_fn

Collate function for Phi-4 MM model audio inputs.

qwen2_5_collate_fn

Collate function for Qwen2.5 VL model.

default_collate_fn

Default collate function for VLM models.

Data#

API#

bridge.data.vlm_datasets.collate.MISSING_QWEN_VL_UTILS_MSG#

‘qwen_vl_utils is required for Qwen2.5 VL processing. Please pip install qwen-vl-utils or provide c…’

bridge.data.vlm_datasets.collate._gather_assistant_text_segments(example: dict) → list[str]#

Extract assistant text segments from the structured conversation example.

The example schema is expected to be {"conversation": [{"role": …, "content": […]}, …]}, where content is a list of items like {"type": "text"|"image"|…, "text": "…"}. Returns a list of concatenated text strings, one per assistant turn.
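
A minimal illustrative sketch of this extraction, assuming only the schema described above; it is not the module's actual implementation and the name is hypothetical.

```python
# Illustrative sketch; assumes the conversation schema described above.
def gather_assistant_text_segments_sketch(example: dict) -> list[str]:
    segments: list[str] = []
    for turn in example.get("conversation", []):
        if turn.get("role") != "assistant":
            continue
        # Keep only the "text" items of this assistant turn and join them.
        texts = [
            item.get("text", "")
            for item in turn.get("content", [])
            if item.get("type") == "text"
        ]
        if texts:
            segments.append("".join(texts))
    return segments
```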

bridge.data.vlm_datasets.collate.create_multiturn_loss_mask_by_search(…)#

Tokenizer-agnostic masking via substring search of assistant texts.

  • The full conversation has already been tokenized with the processor, yielding input_ids

  • Extract the assistant text strings from the structured example

  • For each assistant text, tokenize it without special tokens and search for the resulting token span sequentially in input_ids

  • On a successful match, unmask that span; otherwise leave it masked (see the sketch below)
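
A minimal sketch of the search-and-unmask steps above. The argument names (input_ids as a 1-D token-id tensor, assistant_texts, tokenizer) and the _find_subsequence helper are illustrative assumptions; the actual signature of create_multiturn_loss_mask_by_search is not shown on this page.

```python
import torch


def _find_subsequence(haystack: list[int], needle: list[int], start: int) -> int:
    """Return the first index >= start where needle occurs in haystack, else -1."""
    if not needle:
        return -1
    for i in range(start, len(haystack) - len(needle) + 1):
        if haystack[i : i + len(needle)] == needle:
            return i
    return -1


def create_multiturn_loss_mask_sketch(input_ids, assistant_texts, tokenizer):
    ids = input_ids.tolist()
    mask = torch.zeros(len(ids), dtype=torch.bool)  # start fully masked
    cursor = 0
    for text in assistant_texts:
        # Tokenize the assistant text without special tokens, then search for
        # that token span sequentially in the already-tokenized conversation.
        span = tokenizer(text, add_special_tokens=False)["input_ids"]
        pos = _find_subsequence(ids, span, cursor)
        if pos == -1:
            continue  # no match: leave this turn masked
        mask[pos : pos + len(span)] = True  # unmask the matched span
        cursor = pos + len(span)
    return mask
```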

bridge.data.vlm_datasets.collate.phi4_mm_collate_fn(examples, processor)#

Collate function for Phi-4 MM model audio inputs.

bridge.data.vlm_datasets.collate.qwen2_5_collate_fn(
examples: list,
processor,
) → dict[str, torch.Tensor]#

Collate function for Qwen2.5 VL model.
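
For orientation, a hedged sketch of the usual Qwen2.5 VL batching pattern built on qwen_vl_utils.process_vision_info and the Hugging Face processor (see MISSING_QWEN_VL_UTILS_MSG above). The module's qwen2_5_collate_fn may additionally construct labels and loss masks, which this sketch omits; the function name below is hypothetical.

```python
# Hedged sketch of typical Qwen2.5 VL batching; labels/loss masks omitted.
from qwen_vl_utils import process_vision_info  # raises ImportError if not installed


def qwen2_5_collate_sketch(examples: list, processor):
    # "conversation" is the schema key described for this module's examples.
    conversations = [ex["conversation"] for ex in examples]
    texts = [
        processor.apply_chat_template(conv, tokenize=False, add_generation_prompt=False)
        for conv in conversations
    ]
    # qwen_vl_utils extracts the image/video inputs referenced by the conversations.
    image_inputs, video_inputs = process_vision_info(conversations)
    return processor(
        text=texts,
        images=image_inputs,
        videos=video_inputs,
        padding=True,
        return_tensors="pt",
    )
```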

bridge.data.vlm_datasets.collate.default_collate_fn(
examples: list,
processor,
) → dict[str, torch.Tensor]#

Default collate function for VLM models.
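
These collate functions take (examples, processor), while torch.utils.data.DataLoader expects a single-argument collate_fn, so the processor is typically bound first. A usage sketch; train_dataset and processor are placeholders assumed to exist.

```python
# Usage sketch; `train_dataset` (yielding conversation examples in the schema
# above) and `processor` (the model's Hugging Face processor) are placeholders.
from functools import partial

from torch.utils.data import DataLoader

from bridge.data.vlm_datasets.collate import default_collate_fn

loader = DataLoader(
    train_dataset,
    batch_size=4,
    shuffle=True,
    collate_fn=partial(default_collate_fn, processor=processor),
)
```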

bridge.data.vlm_datasets.collate.COLLATE_FNS#

None
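
The value is not rendered above. Assuming COLLATE_FNS is a registry mapping a model or processor identifier to one of the collate functions in this module, a lookup might look like the following; the keying scheme is an assumption.

```python
# Hedged sketch: assumes COLLATE_FNS maps a processor/model identifier to a
# collate function; the actual keys are not shown on this page.
from bridge.data.vlm_datasets.collate import COLLATE_FNS, default_collate_fn

# `processor` and `examples` (a list of conversation dicts) are placeholders.
collate_fn = COLLATE_FNS.get(type(processor).__name__, default_collate_fn)
batch = collate_fn(examples, processor)  # dict[str, torch.Tensor]
```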