bridge.models.stepfun.step37_flickr8k_step

Step3.7 Flickr8k forward step. Consumes the packed batch dict produced by Step37Flickr8kSFTDataProvider.

Step37Model.forward takes (input_ids, images: list[ImageForInsert], cu_seqlens, position_ids, attention_mask, labels, loss_mask, packed_seq_params, max_seq_len). This file performs no list[ImageForInsert] → pixel_values translation; the packed batch flows straight from preprocess to model forward kwargs.

Responsibilities:

  1. next(data_iterator) → packed dict.

  2. preprocess_packed_batch → CUDA move + PIL load + list[ImageForInsert] with raw pixels (PP rank 0 only).

  3. Pad tokens / labels / loss_mask / position_ids to a multiple of TP×16 for TE/FP8; the tail padding becomes its own sub-seq via cu_seqlens.

  4. Build PackedSeqParams from cu_seqlens for FlashAttn varlen.

  5. Call model(**forward_args).
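Step 3 above can be illustrated with a minimal sketch. It uses plain Python lists rather than CUDA tensors so that it runs standalone, and it is not the module's actual code: the function name `pad_packed_row`, the pad token id `0`, the label ignore index `-100`, and restarting position ids at 0 inside the padding are all assumptions made for illustration.

```python
from typing import Dict, List


def pad_packed_row(
    tokens: List[int],
    labels: List[int],
    loss_mask: List[int],
    position_ids: List[int],
    cu_seqlens: List[int],
    tp_size: int,
    pad_id: int = 0,          # assumed pad token id
    ignore_index: int = -100,  # assumed "ignore" label value
) -> Dict[str, List[int]]:
    """Pad one packed row to a multiple of tp_size * 16 (hypothetical helper)."""
    multiple = tp_size * 16
    total = len(tokens)
    # Round the row length up to the next multiple of tp_size * 16.
    padded_len = -(-total // multiple) * multiple
    pad = padded_len - total

    out = {
        "tokens": list(tokens),
        "labels": list(labels),
        "loss_mask": list(loss_mask),
        "position_ids": list(position_ids),
        "cu_seqlens": list(cu_seqlens),
    }
    if pad > 0:
        out["tokens"] += [pad_id] * pad
        out["labels"] += [ignore_index] * pad
        out["loss_mask"] += [0] * pad              # padding never contributes to the loss
        out["position_ids"] += list(range(pad))    # padding is treated as a fresh sub-seq
        out["cu_seqlens"].append(padded_len)       # new boundary closes the padding sub-seq
    return out


# Two sub-seqs of length 12 and 8, tensor-parallel size 2 -> pad 20 up to 32.
row = pad_packed_row(
    tokens=list(range(20)),
    labels=list(range(20)),
    loss_mask=[1] * 20,
    position_ids=list(range(12)) + list(range(8)),
    cu_seqlens=[0, 12, 20],
    tp_size=2,
)
print(row["cu_seqlens"])
```

Appending a boundary to cu_seqlens, rather than stretching the last real sub-seq, keeps the padding tokens from attending to (or being attended by) real tokens in varlen attention.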

Module Contents

Functions

_build_packed_seq_params

Build PackedSeqParams from a 1-D cu_seqlens (the sub-seq boundary array inside one packed row).

forward_step

Forward step for the Flickr8k packed pipeline.

Data

API

bridge.models.stepfun.step37_flickr8k_step._build_packed_seq_params(
    cu_seqlens: torch.Tensor,
) -> megatron.core.packed_seq_params.PackedSeqParams

Build PackedSeqParams from a 1-D cu_seqlens (the sub-seq boundary array inside one packed row).
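As a rough sketch of what such a builder might compute, the example below derives the per-sub-seq lengths from the boundary array and records the longest one. `PackedSeqParamsSketch` is a stand-in dataclass that mirrors a subset of the field names on megatron.core's PackedSeqParams; the `"thd"` layout, the use of the same boundaries for queries and keys/values, and the use of plain lists instead of a torch.Tensor are assumptions, and the real function may set additional fields.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class PackedSeqParamsSketch:
    """Stand-in for PackedSeqParams (illustrative subset of fields only)."""
    qkv_format: str
    cu_seqlens_q: List[int]
    cu_seqlens_kv: List[int]
    max_seqlen_q: int
    max_seqlen_kv: int


def build_packed_seq_params_sketch(cu_seqlens: Sequence[int]) -> PackedSeqParamsSketch:
    """Build varlen-attention metadata from 1-D cumulative sub-seq boundaries."""
    if len(cu_seqlens) < 2 or cu_seqlens[0] != 0:
        raise ValueError("cu_seqlens must start at 0 and contain at least one sub-seq")
    # Consecutive differences give the length of each sub-seq in the packed row.
    lengths = [end - start for start, end in zip(cu_seqlens, cu_seqlens[1:])]
    if any(n <= 0 for n in lengths):
        raise ValueError("cu_seqlens must be strictly increasing")
    max_len = max(lengths)
    bounds = list(cu_seqlens)
    return PackedSeqParamsSketch(
        qkv_format="thd",            # assumed packed (total_tokens, heads, dim) layout
        cu_seqlens_q=bounds,
        cu_seqlens_kv=list(bounds),  # self-attention: same boundaries for q and kv
        max_seqlen_q=max_len,
        max_seqlen_kv=max_len,
    )


# Sub-seqs of length 12, 8, and 12 (the last one could be tail padding).
params = build_packed_seq_params_sketch([0, 12, 20, 32])
print(params.max_seqlen_q)
```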

bridge.models.stepfun.step37_flickr8k_step.forward_step(
    state: megatron.bridge.training.state.GlobalState,
    data_iterator: Iterable,
    model: megatron.core.models.gpt.GPTModel,
    return_schedule_plan: bool = False,
) -> tuple[torch.Tensor, functools.partial]

Forward step for the Flickr8k packed pipeline.

bridge.models.stepfun.step37_flickr8k_step.__all__

['forward_step']