aistore.pytorch.dynamic_sampler


Dynamic Batch Sampler for Dynamic Batch Sizing

When memory is constrained, the DynamicBatchSampler generates mini-batches that are guaranteed to fit within a given memory limit while packing as many samples as possible into each batch.

Copyright (c) 2024-2025, NVIDIA CORPORATION. All rights reserved.

Module Contents

Classes

| Name | Description |
| --- | --- |
| DynamicBatchSampler | Dynamically adds samples to mini-batch up to a maximum batch size. |

Data

SATURATION_FACTOR

logger

API

class aistore.pytorch.dynamic_sampler.DynamicBatchSampler(
data_source: aistore.pytorch.base_map_dataset.AISBaseMapDataset,
max_batch_size: float,
drop_last: bool = False,
allow_oversized_samples: bool = False,
saturation_factor: float = SATURATION_FACTOR,
shuffle: bool = False
)

Bases: Sampler

Dynamically adds samples to mini-batch up to a maximum batch size.

NOTE: Using this sampler with AISBaseMapDatasets that use ObjectGroups in their ais_source_lists will be slower than using it with Buckets, because ObjectGroups perform one extra API call per object to get size metadata.

Parameters:

data_source
AISBaseMapDataset

Base AIS map-style dataset to sample from to create dynamic mini-batches.

max_batch_size
float

Maximum size of mini-batch in bytes.

drop_last
bool. Defaults to False

If True, drops the last batch if it does not fill at least saturation_factor (80% by default) of max_batch_size. Defaults to False.

allow_oversized_samples
bool. Defaults to False

If True, any sample larger than max_batch_size is processed in a mini-batch by itself instead of being dropped. Defaults to False.

saturation_factor
float. Defaults to SATURATION_FACTOR

Minimum fraction of max_batch_size that the last batch must fill to avoid being dropped when drop_last=True. Default is 0.8.

shuffle
bool. Defaults to False

Randomizes order of samples before calculating mini-batches. Default is False.
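
The way drop_last and saturation_factor interact can be sketched as a small predicate. This is only an illustration, not the library's code: the helper name keep_last_batch is hypothetical, and treating the threshold as inclusive ("at least") is an assumption drawn from the drop_last description.

```python
SATURATION_FACTOR = 0.8  # mirrors the documented module-level default


def keep_last_batch(
    batch_bytes: float,
    max_batch_size: float,
    drop_last: bool = False,
    saturation_factor: float = SATURATION_FACTOR,
) -> bool:
    """Hypothetical helper: decide whether a trailing, partially filled batch is kept."""
    if not drop_last:
        # Without drop_last the final batch is always kept, however small.
        return True
    # With drop_last the final batch must reach the saturation threshold (assumed inclusive).
    return batch_bytes >= saturation_factor * max_batch_size
```

Under these assumptions, with a 1,000,000-byte budget and drop_last=True, a final batch of 900,000 bytes would be kept and one of 500,000 bytes would be dropped.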

_samples_list = data_source.get_obj_list()

aistore.pytorch.dynamic_sampler.DynamicBatchSampler.__iter__() -> typing.Iterator[typing.List[int]]

Returns an iterator containing mini-batches (lists of indices).
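
To make the idea of size-bounded mini-batches concrete, the following is a minimal pure-Python sketch of greedy packing over a list of sample sizes. It is not the aistore implementation: the function name iter_dynamic_batches and the plain list of byte sizes are assumptions for illustration, as is the choice to emit an oversized sample immediately as its own batch while the current batch stays open.

```python
from typing import Iterator, List, Sequence


def iter_dynamic_batches(
    sizes: Sequence[int],
    max_batch_size: float,
    drop_last: bool = False,
    allow_oversized_samples: bool = False,
    saturation_factor: float = 0.8,
) -> Iterator[List[int]]:
    """Hypothetical sketch: yield lists of indices whose total size fits max_batch_size."""
    batch: List[int] = []
    batch_bytes = 0
    for index, size in enumerate(sizes):
        if size > max_batch_size:
            # An oversized sample either forms its own batch or is skipped.
            if allow_oversized_samples:
                yield [index]
            continue
        if batch and batch_bytes + size > max_batch_size:
            # The sample does not fit: emit the current batch and start a new one.
            yield batch
            batch, batch_bytes = [], 0
        batch.append(index)
        batch_bytes += size
    # The trailing batch is kept unless drop_last is set and it is under-saturated.
    if batch and (not drop_last or batch_bytes >= saturation_factor * max_batch_size):
        yield batch


sizes = [40, 40, 40, 150, 30]  # made-up sample sizes in bytes
batches = list(iter_dynamic_batches(sizes, max_batch_size=100))
```

With these made-up sizes and a 100-byte budget, the first two samples share a batch, the 150-byte sample is skipped unless allow_oversized_samples=True, and the remaining samples form the final batch.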

aistore.pytorch.dynamic_sampler.DynamicBatchSampler.__len__() -> int

Returns the total number of samples.

aistore.pytorch.dynamic_sampler.DynamicBatchSampler._get_next_index(
index
) -> int

Get next index from indices if shuffling or otherwise return incremented count.

Returns: int

Next index to sample from
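
One way to read this behavior is as a lookup into a shuffled permutation when shuffling is enabled, and as the plain running counter otherwise. The sketch below is an assumption about that logic, not the library's code: the standalone function get_next_index and its indices argument are hypothetical.

```python
import random
from typing import List, Optional


def get_next_index(index: int, indices: Optional[List[int]] = None) -> int:
    """Hypothetical sketch: map a running counter to the sample index to read next."""
    if indices is not None:
        # Shuffling: the counter selects a position in the shuffled permutation.
        return indices[index]
    # No shuffling: the counter itself is the sample index.
    return index


# Build a reproducible permutation of five sample indices (the seed is arbitrary).
permutation = list(range(5))
random.Random(0).shuffle(permutation)
visited = [get_next_index(i, permutation) for i in range(5)]
```

Every sample is still visited exactly once; only the visiting order changes.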

aistore.pytorch.dynamic_sampler.SATURATION_FACTOR = 0.8
aistore.pytorch.dynamic_sampler.logger = getLogger(__name__)