nemo_automodel.components.datasets.vlm.collate_fns
Module Contents
Functions
create_loss_mask_with_start_of_response_token | Create a loss mask by locating start-of-turn token positions, similar to the approach in squad.py.
phi4_mm_collate_fn | Collate function for Phi-4 MM model audio input.
qwen2_5_collate_fn | Collate function for Qwen2.5 VL model.
default_collate_fn | Default collate function for VLM models.
Data
API
- nemo_automodel.components.datasets.vlm.collate_fns.create_loss_mask_with_start_of_response_token(input_ids, processor, start_of_response_token=None)
Create a loss mask by locating start-of-turn token positions, similar to the approach in squad.py.
- Parameters:
input_ids – List or tensor of token IDs for a single example
processor – Processor/tokenizer to convert token string to ID
start_of_response_token – String token that marks the start of turns (e.g., “<start_of_turn>model\n”)
- Returns:
loss_mask – List of 0/1 flags where 0 = masked (prompt), 1 = unmasked (response)
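The masking idea described above can be illustrated with a minimal sketch. This is not the library's implementation: the helper names `find_subsequence_ends` and `sketch_loss_mask` are hypothetical, the marker is passed as already-tokenized IDs rather than being looked up through a processor, and two behaviors are assumptions made for the example (the last marker occurrence is used as the response boundary, and a missing marker leaves every token unmasked).

```python
from typing import List, Sequence


def find_subsequence_ends(tokens: Sequence[int], pattern: Sequence[int]) -> List[int]:
    """Return the index just past each occurrence of `pattern` in `tokens`."""
    n, m = len(tokens), len(pattern)
    if m == 0:
        return []
    return [i + m for i in range(n - m + 1) if list(tokens[i:i + m]) == list(pattern)]


def sketch_loss_mask(input_ids: Sequence[int], response_marker_ids: Sequence[int]) -> List[int]:
    """Hypothetical sketch: 0 for prompt tokens (marker included), 1 for response tokens.

    Assumption: the last marker occurrence starts the response.
    Assumption: if the marker is absent, all tokens stay unmasked.
    """
    ends = find_subsequence_ends(input_ids, response_marker_ids)
    if not ends:
        return [1] * len(input_ids)
    response_start = ends[-1]
    return [0] * response_start + [1] * (len(input_ids) - response_start)


# Toy example: IDs 7, 8 stand in for the tokenized start-of-response marker.
example_ids = [5, 9, 9, 7, 8, 1, 2, 3]
mask = sketch_loss_mask(example_ids, [7, 8])
print(mask)  # [0, 0, 0, 0, 0, 1, 1, 1]
```

In this sketch the mask has the same length as `input_ids`, so it can be multiplied element-wise with per-token losses to ignore the prompt portion.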
- nemo_automodel.components.datasets.vlm.collate_fns.phi4_mm_collate_fn(examples, processor)
Collate function for Phi-4 MM model audio input.
- nemo_automodel.components.datasets.vlm.collate_fns.qwen2_5_collate_fn(examples: list, processor, start_of_response_token='<|im_start|>assistant\n')
Collate function for Qwen2.5 VL model.
- nemo_automodel.components.datasets.vlm.collate_fns.default_collate_fn(examples: list, processor, start_of_response_token=None)
Default collate function for VLM models.
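The collate functions documented here take a `processor` argument in addition to the list of examples, whereas a PyTorch `DataLoader` calls its `collate_fn` with the batch alone. One common way to bridge that gap is to bind the processor ahead of time with `functools.partial`. The sketch below uses only the standard library; `ToyProcessor` and `toy_collate_fn` are hypothetical stand-ins that mirror the documented parameter order, not the real functions or a real processor.

```python
from functools import partial
from typing import Any, Dict, List


class ToyProcessor:
    """Hypothetical stand-in for a real processor; it only records text lengths."""

    def __call__(self, texts: List[str]) -> Dict[str, Any]:
        return {"lengths": [len(t) for t in texts]}


def toy_collate_fn(
    examples: List[Dict[str, Any]],
    processor,
    start_of_response_token=None,
) -> Dict[str, Any]:
    """Hypothetical collate function with the same parameter order as the documented ones."""
    batch = processor([ex["text"] for ex in examples])
    # Carry the marker along so the caller can see which value was used.
    batch["start_of_response_token"] = start_of_response_token
    return batch


# Bind the processor once; the resulting callable takes only the list of examples.
collate = partial(toy_collate_fn, processor=ToyProcessor())
batch = collate([{"text": "hello"}, {"text": "hi"}])
print(batch)  # {'lengths': [5, 2], 'start_of_response_token': None}
```

Under these assumptions, the bound `collate` callable has the single-argument shape a data loader expects, and `start_of_response_token` can be bound the same way when a model-specific marker is needed.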
- nemo_automodel.components.datasets.vlm.collate_fns.COLLATE_FNS
None