bridge.data.vlm_datasets.step37_flickr8k.samplers
Looped sequential / shuffle / weighted-random samplers.
These drive MixedPackedDataloader._schedule_all. The seeds and the
exact torch.randperm / heapq-based selection order define the
packed batch sequence, so they must not be changed if reproducible packs
are required.
Module Contents
Classes
LoopedSequentialSampler | size=3: [0, 1, 2, 0, 1, 2, …]
LoopedShuffleSampler | Looped shuffle sampler.
WeightedRandomSampler | Heap-based balanced weighted sampler.
API
- class bridge.data.vlm_datasets.step37_flickr8k.samplers.LoopedSequentialSampler(size: int)
Looped sequential sampler. Cycles through the indices 0 to size - 1 in order; for example, size=3 yields [0, 1, 2, 0, 1, 2, …].
Initialization
- __iter__()
- __next__() -> int
- get() -> int
- update() -> None
- state_dict() -> dict[str, Any]
- load_state_dict(state_dict: dict[str, Any]) -> None
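The looping behaviour and the checkpoint methods listed above can be illustrated with a minimal sketch. It is not the library's implementation: the class name SketchLoopedSequentialSampler, the idx attribute, the state-dict keys, and the assumed split between get() (peek at the current index) and update() (advance with wrap-around) are all hypothetical, chosen only to match the method names in this listing.

```python
from typing import Any


class SketchLoopedSequentialSampler:
    """Illustrative stand-in only; names and state layout are assumptions."""

    def __init__(self, size: int) -> None:
        if size <= 0:
            raise ValueError("size must be positive")
        self.size = size
        self.idx = 0  # position within the current pass (assumed attribute)

    def __iter__(self) -> "SketchLoopedSequentialSampler":
        return self

    def get(self) -> int:
        # Assumption: get() returns the current index without advancing.
        return self.idx

    def update(self) -> None:
        # Assumption: update() moves to the next index and wraps at `size`.
        self.idx = (self.idx + 1) % self.size

    def __next__(self) -> int:
        value = self.get()
        self.update()
        return value

    def state_dict(self) -> dict[str, Any]:
        # Hypothetical keys; the real state dict may look different.
        return {"size": self.size, "idx": self.idx}

    def load_state_dict(self, state_dict: dict[str, Any]) -> None:
        self.size = state_dict["size"]
        self.idx = state_dict["idx"]


sampler = SketchLoopedSequentialSampler(3)
first_seven = [next(sampler) for _ in range(7)]  # [0, 1, 2, 0, 1, 2, 0]

# Resuming from a saved state continues the sequence where it stopped.
resumed = SketchLoopedSequentialSampler(3)
resumed.load_state_dict(sampler.state_dict())
next_three = [next(resumed) for _ in range(3)]  # [1, 2, 0]
```

Because the whole sampler state is a single position, saving and restoring it is enough to reproduce the remainder of the index sequence exactly.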
- class bridge.data.vlm_datasets.step37_flickr8k.samplers.LoopedShuffleSampler(size: int = 0, base_seed: int = 1234, same_order_for_each_epoch: bool = False)
Looped shuffle sampler.
Yields a fresh torch.randperm permutation per epoch using seed = base_seed + epoch (or just base_seed if same_order_for_each_epoch is set).
Initialization
- __iter__()
- __next__() -> int
- get() -> int
- update() -> None
- state_dict() -> dict[str, Any]
- load_state_dict(state_dict: dict[str, Any]) -> None
- _reset_idx_cur_epoch() -> None
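The per-epoch seeding rule described for LoopedShuffleSampler can be sketched as follows. To keep the example dependency-free, it swaps in Python's random.Random for torch.randperm, so the concrete orders it produces will not match the real sampler; only the seed arithmetic (base_seed + epoch, or base_seed alone) follows the description. The function names epoch_permutation and looped_shuffle are hypothetical.

```python
import random
from itertools import islice
from typing import Iterator


def epoch_permutation(
    size: int,
    base_seed: int,
    epoch: int,
    same_order_for_each_epoch: bool = False,
) -> list[int]:
    """Return one epoch's index order (stand-in for a seeded torch.randperm)."""
    seed = base_seed if same_order_for_each_epoch else base_seed + epoch
    rng = random.Random(seed)  # assumption: stdlib RNG replaces torch here
    order = list(range(size))
    rng.shuffle(order)
    return order


def looped_shuffle(
    size: int,
    base_seed: int = 1234,
    same_order_for_each_epoch: bool = False,
) -> Iterator[int]:
    """Yield indices forever, reshuffling at every epoch boundary."""
    if size <= 0:
        raise ValueError("size must be positive")
    epoch = 0
    while True:
        yield from epoch_permutation(
            size, base_seed, epoch, same_order_for_each_epoch
        )
        epoch += 1


# Two epochs of five indices each; rerunning gives the same ten values.
two_epochs = list(islice(looped_shuffle(5), 10))
repeated = list(islice(looped_shuffle(5, same_order_for_each_epoch=True), 10))
```

Since the seed depends only on base_seed and the epoch number, any epoch's order can be recomputed without replaying earlier epochs, which is one way a sampler of this kind can resume from a checkpoint.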
- class bridge.data.vlm_datasets.step37_flickr8k.samplers.WeightedRandomSampler(size: int = 0, base_seed: int = 1234, weights: Optional[Union[Sequence[float], torch.Tensor]] = None)
Heap-based balanced weighted sampler.
Maintains a min-heap of cumulative scores count[i] / weight[i] and always picks the lowest-score index (ties broken by lower index, per Python’s heapq invariant). This produces an exactly reproducible weighted order without floating-point randomness in the selection.
Initialization
- __iter__()
- __next__() -> int
- _select_idx() -> int
- get() -> int
- update() -> None
- state_dict() -> dict[str, Any]
- load_state_dict(state_dict: dict[str, Any]) -> None
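The heap-based selection described for WeightedRandomSampler can be sketched as below. This is an illustration under stated assumptions, not the library's code: the function name balanced_weighted_order is hypothetical, the score is taken to be count[i] / weight[i] computed after each pick, ties fall to the lower index because the heap entries are (score, index) tuples, and base_seed is left out because the description does not say how it is used.

```python
import heapq
from itertools import islice
from typing import Iterator, Sequence


def balanced_weighted_order(weights: Sequence[float]) -> Iterator[int]:
    """Yield indices so that pick counts track the weights deterministically."""
    if len(weights) == 0 or any(w <= 0 for w in weights):
        raise ValueError("weights must be non-empty and strictly positive")
    counts = [0] * len(weights)
    # Every index starts at score 0.0; the tuple order breaks ties by index.
    heap = [(0.0, i) for i in range(len(weights))]
    heapq.heapify(heap)
    while True:
        _, i = heapq.heappop(heap)  # lowest score, then lowest index
        counts[i] += 1
        # Assumed update rule: new score is count / weight after the pick.
        heapq.heappush(heap, (counts[i] / weights[i], i))
        yield i


# Index 0 has twice the weight of indices 1 and 2.
order = list(islice(balanced_weighted_order([2.0, 1.0, 1.0]), 8))
# order == [0, 1, 2, 0, 0, 1, 2, 0]
```

After eight picks, index 0 has been chosen four times and indices 1 and 2 twice each, matching the 2:1:1 weights. No random numbers are drawn, so the same weights always give the same sequence.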