bridge.data.vlm_datasets.step37_flickr8k#
Step3.7 Flickr8k SFT dataset pipeline.
Self-contained inside Megatron-Bridge with no external runtime dependency.
trust_remote_code is never set; the tokenizer is loaded directly from
the local HF snapshot via transformers.AutoTokenizer.from_pretrained.
See :class:`Step37Flickr8kSFTDataProvider` for the mbridge integration
entry point.
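The tokenizer-loading behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the package's actual code: the helper name `load_local_tokenizer` and the use of `local_files_only=True` are assumptions, and only `transformers.AutoTokenizer.from_pretrained` is taken from the description.

```python
from pathlib import Path


def load_local_tokenizer(snapshot_dir):
    """Hypothetical helper: load a tokenizer from a local HF snapshot directory."""
    path = Path(snapshot_dir)
    if not path.is_dir():
        # Fail early instead of letting the loader fall back to a Hub lookup.
        raise FileNotFoundError(f"snapshot directory not found: {path}")

    # Imported lazily so the path check above works without transformers installed.
    from transformers import AutoTokenizer

    # trust_remote_code is deliberately not passed, so it keeps its default (False).
    # local_files_only=True (an assumption here) prevents any network download.
    return AutoTokenizer.from_pretrained(str(path), local_files_only=True)
```

Leaving `trust_remote_code` unset means no custom Python shipped with the snapshot is executed during loading, which is consistent with the "no external runtime dependency" claim.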
Submodules#
- bridge.data.vlm_datasets.step37_flickr8k.flickr8k_loader
- bridge.data.vlm_datasets.step37_flickr8k.preprocess
- bridge.data.vlm_datasets.step37_flickr8k.samplers
- bridge.data.vlm_datasets.step37_flickr8k.packing
- bridge.data.vlm_datasets.step37_flickr8k.pack_transform
- bridge.data.vlm_datasets.step37_flickr8k.template
- bridge.data.vlm_datasets.step37_flickr8k.multimodal_utils
- bridge.data.vlm_datasets.step37_flickr8k.provider
- bridge.data.vlm_datasets.step37_flickr8k.packed_dataloader
Package Contents#
Data#
API#
- bridge.data.vlm_datasets.step37_flickr8k.__all__#
['IMAGE_ITEM_TYPE', 'PATCH_ITEM_TYPE', 'ImageForInsert', 'build_image_for_insert', 'compute_rope_arg…