aistore.pytorch.batch_iter_dataset


Iterable Dataset using Batch API for AIS

Copyright (c) 2025, NVIDIA CORPORATION. All rights reserved.

Module Contents

Classes

Name: AISBatchIterDataset
Description: Custom AIStore PyTorch dataset that uses the AIS batch API for efficient data loading

API

class aistore.pytorch.batch_iter_dataset.AISBatchIterDataset(
ais_source_list,
client: aistore.sdk.Client,
prefix_map: typing.Dict[aistore.sdk.AISSource, typing.Union[str, typing.List[str]]] = {},
show_progress: bool = False,
max_batch_size: int = 32,
output_format: str = '.tar',
streaming: bool = True
)

Bases: AISBaseIterDataset

Custom AIStore PyTorch dataset that uses the AIS batch API for efficient data loading with multi-worker support and memory-efficient iteration.

Parameters:

ais_source_list
Union[AISSource, List[AISSource]]

A single AISSource object, or a list of AISSource objects, to load data from

client
Client

AIStore client instance

max_batch_size
int, defaults to 32

Maximum number of objects to fetch in each batch request

output_format
str, defaults to '.tar'

Format of the batch response, given as a file extension such as '.tar'

streaming
bool, defaults to True

Whether to enable streaming mode

prefix_map
Dict[AISSource, Union[str, List[str]]], defaults to {}

Map from each AISSource object to a prefix, or a list of prefixes, for that source

show_progress
bool, defaults to False

Whether to show a progress indicator
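
The parameters above could be put together roughly as follows. This is a minimal sketch, not an official usage example: the endpoint URL, the bucket name, the "train/" prefix, and the helper name build_dataset are all hypothetical, and it assumes a bucket obtained from client.bucket(...) can be passed as an AISSource.

```python
from typing import Any


def build_dataset(endpoint: str = "http://localhost:8080",
                  bucket_name: str = "my-bucket") -> Any:
    """Construct an AISBatchIterDataset for one bucket (hypothetical names)."""
    # Imported lazily so this sketch can be loaded without aistore installed.
    from aistore.sdk import Client
    from aistore.pytorch.batch_iter_dataset import AISBatchIterDataset

    client = Client(endpoint)
    # Assumption: a bucket handle is accepted wherever an AISSource is expected.
    bucket = client.bucket(bucket_name)

    return AISBatchIterDataset(
        ais_source_list=bucket,          # a single source; a list also works
        client=client,
        prefix_map={bucket: "train/"},   # hypothetical prefix for this source
        show_progress=False,
        max_batch_size=32,
        output_format=".tar",
        streaming=True,
    )
```

Calling build_dataset() requires a reachable AIStore cluster; the keyword values shown simply restate the documented defaults.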

aistore.pytorch.batch_iter_dataset.AISBatchIterDataset.__iter__() -> typing.Iterator[typing.Tuple[str, bytes]]

Memory-efficient iterator with multi-worker support using batch API.
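
Because the iterator yields (name, content) tuples of type Tuple[str, bytes], downstream code can be written against any iterable of such pairs. The sketch below uses a plain list as a stand-in for the dataset so that it runs without a cluster; summarize_samples and fake_dataset are illustrative names, not part of the library.

```python
from itertools import islice
from typing import Dict, Iterable, Tuple


def summarize_samples(samples: Iterable[Tuple[str, bytes]],
                      limit: int = 3) -> Dict[str, int]:
    """Map object name to payload size for the first `limit` samples."""
    sizes: Dict[str, int] = {}
    # islice stops early, so a large dataset is never fully consumed.
    for name, content in islice(samples, limit):
        sizes[name] = len(content)
    return sizes


# Stand-in for an AISBatchIterDataset; a real dataset would be passed the same way.
fake_dataset = [
    ("a.txt", b"hello"),
    ("b.txt", b"hi"),
    ("c.bin", b"\x00\x01\x02"),
    ("d.txt", b""),
]
sizes = summarize_samples(fake_dataset, limit=3)
```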

aistore.pytorch.batch_iter_dataset.AISBatchIterDataset._process_batch(
batch_objects: typing.List
) -> typing.Iterator[typing.Tuple[str, bytes]]

Process a batch of objects using the batch API.

Parameters:

batch_objects
List

List of objects to process in this batch
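
To illustrate how max_batch_size bounds the size of each batch_objects list, the following sketch groups an iterable into lists of at most max_batch_size items. It is an assumption about the general batching pattern, not the library's actual implementation; chunk_objects is a hypothetical helper.

```python
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def chunk_objects(objects: Iterable[T],
                  max_batch_size: int = 32) -> Iterator[List[T]]:
    """Yield consecutive lists holding at most `max_batch_size` items each."""
    if max_batch_size < 1:
        raise ValueError("max_batch_size must be at least 1")
    batch: List[T] = []
    for obj in objects:
        batch.append(obj)
        if len(batch) == max_batch_size:
            yield batch
            batch = []
    if batch:
        # Emit the final, possibly smaller, batch.
        yield batch


# 70 items with a limit of 32 give two full batches and one partial batch.
batch_sizes = [len(b) for b in chunk_objects(range(70), max_batch_size=32)]
```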