bridge.data.vlm_datasets.step37_flickr8k#

Step3.7 Flickr8k SFT dataset pipeline.

Self-contained inside Megatron-Bridge with no external runtime dependency. trust_remote_code is never set; the tokenizer is loaded directly from the local HF snapshot via transformers.AutoTokenizer.from_pretrained.
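The tokenizer loading described above can be sketched as follows. This is a minimal illustration, not the module's actual code: the helper name `load_local_tokenizer` and its `snapshot_dir` parameter are hypothetical, and passing `local_files_only=True` is an assumption added here to make the "no external runtime dependency" behavior explicit. `trust_remote_code` is deliberately left out, so it keeps its default of `False`.

```python
from pathlib import Path


def load_local_tokenizer(snapshot_dir):
    """Load a tokenizer from a local Hugging Face snapshot directory.

    Hypothetical helper for illustration; not part of step37_flickr8k.
    """
    path = Path(snapshot_dir)
    if not path.is_dir():
        # Fail early with a clear message instead of falling back to the Hub.
        raise FileNotFoundError(f"tokenizer snapshot not found: {path}")

    # Imported lazily so the path check above works without transformers installed.
    from transformers import AutoTokenizer

    # trust_remote_code is not passed (defaults to False).
    # local_files_only=True (an assumption of this sketch) forbids network access.
    return AutoTokenizer.from_pretrained(str(path), local_files_only=True)
```

Calling the helper with the path to a downloaded snapshot returns a tokenizer object. A missing directory raises `FileNotFoundError` before any model code is touched.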

See :class:`Step37Flickr8kSFTDataProvider` for the mbridge integration entry point.

Submodules#

Package Contents#

Data#

API#

bridge.data.vlm_datasets.step37_flickr8k.__all__#

['IMAGE_ITEM_TYPE', 'PATCH_ITEM_TYPE', 'ImageForInsert', 'build_image_for_insert', 'compute_rope_arg…