nemo_automodel.components.datasets.vlm.collate_fns#
Module Contents#
Functions#
| build_labels | Construct label and optional loss-mask tensors aligned to assistant responses. |
| phi4_mm_collate_fn | Collate function for Phi-4 MM model audio input. |
| qwen2_5_collate_fn | Collate function for Qwen2.5 VL model. |
| qwen3_omni_collate_fn | Collate function for Qwen3 Omni processors. |
| default_collate_fn | Default collate function for multimodal VLM datasets. |
Data#
API#
- nemo_automodel.components.datasets.vlm.collate_fns.logger#
‘getLogger(…)’
- nemo_automodel.components.datasets.vlm.collate_fns._find_pattern_indices(template, pattern, search_start_index=0, allow_first_token_mismatch=False)#
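The parameter names suggest a subsequence search over token IDs. The following is a minimal pure-Python sketch of that idea, not the library's implementation: the function name `find_pattern_indices_sketch`, the `(start, end)` return value, the `(-1, -1)` not-found sentinel, and the meaning given to `allow_first_token_mismatch` are all assumptions made for illustration.

```python
from typing import Sequence, Tuple


def find_pattern_indices_sketch(
    template: Sequence[int],
    pattern: Sequence[int],
    search_start_index: int = 0,
    allow_first_token_mismatch: bool = False,
) -> Tuple[int, int]:
    """Return (start, end) of the first match of `pattern` in `template`.

    Returns (-1, -1) when nothing matches (assumed sentinel).
    """
    n, m = len(template), len(pattern)
    if m == 0:
        return -1, -1
    wanted = list(pattern)
    for i in range(search_start_index, n - m + 1):
        window = list(template[i : i + m])
        if window == wanted:
            return i, i + m
        # Assumed meaning of the flag: accept a window whose first token
        # differs, as long as every remaining token matches.
        if allow_first_token_mismatch and window[1:] == wanted[1:]:
            return i, i + m
    return -1, -1


tokens = [5, 9, 7, 8, 2, 7, 8]
first = find_pattern_indices_sketch(tokens, [7, 8])
# Resume the search after the first hit to find the next occurrence.
second = find_pattern_indices_sketch(tokens, [7, 8], search_start_index=first[1])
loose = find_pattern_indices_sketch(tokens, [0, 8, 2], allow_first_token_mismatch=True)
missing = find_pattern_indices_sketch(tokens, [4, 4])
```

Resuming from the end of the previous match, as in `second`, is one way to walk through every assistant turn in a tokenized conversation.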
- nemo_automodel.components.datasets.vlm.collate_fns._extract_assistant_text(message: Dict[str, Any]) -> str#
- nemo_automodel.components.datasets.vlm.collate_fns.build_labels(input_ids_batch: torch.Tensor, conversations: Sequence[Sequence[Dict[str, Any]]], processor)#
Construct label and optional loss-mask tensors aligned to assistant responses.
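To illustrate what "aligned to assistant responses" can mean in practice, here is a hedged sketch that works on plain Python lists rather than tensors. It assumes the assistant token spans are already known and that ignored positions are filled with `-100`, the default `ignore_index` of `torch.nn.CrossEntropyLoss`. The helper name `mask_non_assistant_sketch` and its arguments are hypothetical and do not come from the module.

```python
from typing import List, Sequence, Tuple

# Assumption: -100 matches the default ignore_index of torch.nn.CrossEntropyLoss.
IGNORE_INDEX = -100


def mask_non_assistant_sketch(
    input_ids: Sequence[int],
    assistant_spans: Sequence[Tuple[int, int]],
    ignore_index: int = IGNORE_INDEX,
) -> Tuple[List[int], List[int]]:
    """Build labels and a 0/1 loss mask that keep only assistant tokens."""
    labels = [ignore_index] * len(input_ids)
    loss_mask = [0] * len(input_ids)
    for start, end in assistant_spans:
        # Copy the token IDs inside each half-open span [start, end).
        for pos in range(start, min(end, len(input_ids))):
            labels[pos] = input_ids[pos]
            loss_mask[pos] = 1
    return labels, loss_mask


# Hypothetical token IDs: positions 4..6 hold the assistant response.
input_ids = [101, 11, 12, 201, 31, 32, 33, 102]
labels, loss_mask = mask_non_assistant_sketch(input_ids, [(4, 7)])
```

With this layout, prompt and special tokens contribute nothing to the loss, and only the assistant response is supervised.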
- nemo_automodel.components.datasets.vlm.collate_fns.phi4_mm_collate_fn(examples, processor)#
Collate function for Phi-4 MM model audio input.
- nemo_automodel.components.datasets.vlm.collate_fns.qwen2_5_collate_fn(examples: list, processor)#
Collate function for Qwen2.5 VL model.
- nemo_automodel.components.datasets.vlm.collate_fns.qwen3_omni_collate_fn(examples: Sequence[Dict[str, Any]], processor, use_audio_in_video: bool = False)#
Collate function for Qwen3 Omni processors.
- nemo_automodel.components.datasets.vlm.collate_fns.default_collate_fn(examples: Sequence[Dict[str, Any]], processor)#
Default collate function for multimodal VLM datasets.
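The collate functions listed here take `(examples, processor)`, whereas a PyTorch `DataLoader` calls its `collate_fn` with a single argument, the list of samples. One common way to bridge the two is `functools.partial`. The sketch below shows the binding pattern with stand-ins only: `StubProcessor` and `stub_collate_fn` are hypothetical and merely mimic the argument order, not the behaviour of the real functions.

```python
from functools import partial
from typing import Any, Dict, List, Sequence


class StubProcessor:
    """Hypothetical stand-in for a processor; it only counts characters."""

    def __call__(self, texts: Sequence[str]) -> Dict[str, List[int]]:
        return {"lengths": [len(t) for t in texts]}


def stub_collate_fn(examples: Sequence[Dict[str, Any]], processor) -> Dict[str, List[int]]:
    """Stand-in with the same (examples, processor) argument order."""
    return processor([ex["text"] for ex in examples])


# Bind the processor once; the result takes only the list of examples.
collate = partial(stub_collate_fn, processor=StubProcessor())
batch = collate([{"text": "hello"}, {"text": "hi"}])
```

A callable bound this way can then be passed as `collate_fn=collate` when constructing a `torch.utils.data.DataLoader`.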
- nemo_automodel.components.datasets.vlm.collate_fns.COLLATE_FNS#
None