aistore.pytorch.parallel_map_dataset


PyTorch Map-style Dataset with parallel download acceleration.

Copyright (c) 2026, NVIDIA CORPORATION. All rights reserved.

Module Contents

Classes

Name | Description
AISParallelMapDataset | Map-style dataset that uses parallel download to fetch objects.

API

class aistore.pytorch.parallel_map_dataset.AISParallelMapDataset(
ais_source_list: typing.Optional[typing.Union[aistore.sdk.AISSource, typing.List[aistore.sdk.AISSource]]] = None,
prefix_map: typing.Optional[typing.Dict[aistore.sdk.AISSource, typing.Union[str, typing.List[str]]]] = None,
num_workers: int = 16
)

Bases: AISBaseMapDataset

Map-style dataset that uses parallel download to fetch objects.

Parallel download splits each object into byte ranges and fetches them concurrently using num_workers workers.
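The range-splitting idea can be illustrated with a minimal sketch. This is not the library's implementation: `split_ranges`, `parallel_fetch`, and `read_range` are hypothetical names, and an in-memory payload stands in for a remote object so the example runs without a cluster.

```python
from concurrent.futures import ThreadPoolExecutor


def split_ranges(size: int, num_workers: int) -> list[tuple[int, int]]:
    """Split [0, size) into at most num_workers contiguous (start, end) ranges."""
    if size <= 0:
        return []
    chunk = -(-size // num_workers)  # ceiling division
    return [(start, min(start + chunk, size)) for start in range(0, size, chunk)]


def parallel_fetch(read_range, size: int, num_workers: int = 16) -> bytes:
    """Fetch every range concurrently and reassemble the parts in order."""
    ranges = split_ranges(size, num_workers)
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        # map() yields results in input order, so the parts line up.
        parts = pool.map(lambda r: read_range(*r), ranges)
        return b"".join(parts)


# Hypothetical stand-in for a remote object: an in-memory payload read by range.
payload = bytes(range(256)) * 40  # 10240 bytes


def read_range(start: int, end: int) -> bytes:
    return payload[start:end]


result = parallel_fetch(read_range, len(payload), num_workers=16)
```

With 10240 bytes and 16 workers, each range covers 640 bytes. Objects smaller than the worker count yield fewer ranges than workers.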

__getitem__ returns (object_name, ParallelBuffer). The caller (or PyTorch DataLoader collate function) is responsible for consuming and closing the ParallelBuffer.
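A collate function that consumes and closes each buffer might look like the sketch below. It assumes the buffer exposes file-like `read()` and `close()` methods, which this page does not confirm for ParallelBuffer; `io.BytesIO` is used as a stand-in, and `read_and_close_collate` is a hypothetical name.

```python
import io
from contextlib import closing


def read_and_close_collate(batch):
    """Turn [(name, buffer), ...] into (names, payloads), closing each buffer."""
    names, payloads = [], []
    for name, buffer in batch:
        # closing() calls buffer.close() even if read() raises.
        with closing(buffer):
            payloads.append(buffer.read())
        names.append(name)
    return names, payloads


# Stand-in samples shaped like the __getitem__ output described above;
# io.BytesIO plays the role of the buffer (an assumption, not the real class).
buffers = [io.BytesIO(b"alpha"), io.BytesIO(b"beta")]
batch = [("obj-0", buffers[0]), ("obj-1", buffers[1])]
names, payloads = read_and_close_collate(batch)
```

A function of this shape could be passed as `collate_fn` to `torch.utils.data.DataLoader`, so that buffers are released as soon as each batch is assembled.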

Parameters:

ais_source_list
Optional[Union[AISSource, List[AISSource]]]. Defaults to None.

A single AISSource object, or a list of AISSource objects, to load data from.

prefix_map
Optional[Dict[AISSource, Union[str, List[str]]]]. Defaults to None.

Map of AISSource to prefix(es) for filtering objects.

num_workers
int. Defaults to 16.

Number of concurrent range-read workers per object.

num_workers
int

Number of concurrent range-read workers per object.
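The `prefix_map` value may be a single prefix or a list of prefixes. The sketch below illustrates that shape only; it is not the library's filtering code. Plain strings stand in for AISSource keys, `filter_names` is a hypothetical helper, and keeping every name for an unmapped source is an assumption.

```python
from typing import Dict, List, Optional, Union

# Hypothetical alias: string keys stand in for AISSource objects.
PrefixMap = Dict[str, Union[str, List[str]]]


def filter_names(
    source: str, names: List[str], prefix_map: Optional[PrefixMap] = None
) -> List[str]:
    """Keep names starting with any prefix mapped to source.

    Assumption: a source with no entry in prefix_map is left unfiltered.
    """
    if not prefix_map or source not in prefix_map:
        return list(names)
    prefixes = prefix_map[source]
    if isinstance(prefixes, str):
        prefixes = [prefixes]  # normalize a single prefix to a list
    return [n for n in names if n.startswith(tuple(prefixes))]


names = ["train/a.jpg", "train/b.jpg", "val/c.jpg", "test/d.jpg"]
single = filter_names("bucket-a", names, {"bucket-a": "train/"})
multi = filter_names("bucket-a", names, {"bucket-a": ["train/", "val/"]})
unmapped = filter_names("bucket-b", names, {"bucket-a": "train/"})
```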

aistore.pytorch.parallel_map_dataset.AISParallelMapDataset.__getitem__(
index: int
)

aistore.pytorch.parallel_map_dataset.AISParallelMapDataset.__len__()