aistore.pytorch.base_iter_dataset

Base class for AIS Iterable Style Datasets

Copyright (c) 2024-2025, NVIDIA CORPORATION. All rights reserved.

Module Contents

Classes

Name                  Description
AISBaseIterDataset    A base class for creating AIS Iterable Datasets. Should not be instantiated directly.

API

class aistore.pytorch.base_iter_dataset.AISBaseIterDataset(
ais_source_list: typing.Union[aistore.sdk.AISSource, typing.List[aistore.sdk.AISSource]],
prefix_map: typing.Dict[aistore.sdk.AISSource, typing.Union[str, typing.List[str]]] = {}
)
Abstract

Bases: IterableDataset

A base class for creating AIS Iterable Datasets. Should not be instantiated directly. Subclasses should implement __iter__, which returns the samples from the dataset, and can optionally override other methods from torch's IterableDataset, such as __len__.

Parameters:

ais_source_list
Union[AISSource, List[AISSource]]

A single AISSource object or a list of AISSource objects to load data from

prefix_map
Dict[AISSource, Union[str, List[str]]]. Defaults to {}

Map of AISSource objects to a prefix or list of prefixes. Only objects with the specified prefixes are used from each source
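The two accepted value forms for prefix_map (a single string or a list of strings) can be illustrated with a small sketch. This is not the library's code: the helper names normalize_prefix_map and matches_any_prefix are hypothetical, plain strings stand in for AISSource objects, and prefix matching is assumed to behave like str.startswith.

```python
from typing import Dict, Hashable, List, Union

PrefixSpec = Union[str, List[str]]


def normalize_prefix_map(
    prefix_map: Dict[Hashable, PrefixSpec],
) -> Dict[Hashable, List[str]]:
    """Hypothetical helper: turn every value into a list of prefixes."""
    return {
        source: [prefixes] if isinstance(prefixes, str) else list(prefixes)
        for source, prefixes in prefix_map.items()
    }


def matches_any_prefix(object_name: str, prefixes: List[str]) -> bool:
    """Assumption: an object is kept if its name starts with any prefix."""
    return any(object_name.startswith(prefix) for prefix in prefixes)


# Plain strings stand in for AISSource objects in this sketch.
prefix_map = {"bucket-a": "train/", "bucket-b": ["val/", "test/"]}
normalized = normalize_prefix_map(prefix_map)

names = ["train/0.jpg", "val/0.jpg", "misc/readme.txt"]
kept = [n for n in names if matches_any_prefix(n, normalized["bucket-b"])]
```

Normalizing up front means later code only has to handle the list form.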

_ais_source_list

aistore.pytorch.base_iter_dataset.AISBaseIterDataset.__iter__() -> typing.Iterator
abstract

Return iterator with samples in this dataset.

Returns: Iterator

Iterator of samples
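The subclassing contract described for __iter__ might look like the following sketch. To keep it runnable without torch or aistore, BaseIterDatasetSketch is a hypothetical stand-in for AISBaseIterDataset, and the in-memory (name, content) pairs are an assumption standing in for objects fetched from AIS sources.

```python
from abc import ABC, abstractmethod
from typing import Iterator, List, Tuple


class BaseIterDatasetSketch(ABC):
    """Hypothetical stand-in for AISBaseIterDataset (no torch or aistore)."""

    def __init__(self, objects: List[Tuple[str, bytes]]):
        # Assumption: objects are simple (name, content) pairs held in memory.
        self._objects = objects

    @abstractmethod
    def __iter__(self) -> Iterator:
        """Return iterator with samples in this dataset."""


class NameAndSizeDataset(BaseIterDatasetSketch):
    """Concrete subclass: each sample is (object name, content length)."""

    def __iter__(self) -> Iterator[Tuple[str, int]]:
        for name, content in self._objects:
            yield name, len(content)


dataset = NameAndSizeDataset([("a.txt", b"hello"), ("b.txt", b"hi")])
samples = list(dataset)

# The abstract base itself cannot be instantiated.
try:
    BaseIterDatasetSketch([])
    base_instantiated = True
except TypeError:
    base_instantiated = False
```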

aistore.pytorch.base_iter_dataset.AISBaseIterDataset.__len__()

Returns the length of the dataset. Note that calling this will iterate through the dataset, taking O(N) time.

NOTE: If you only need the length after iterating through the dataset, count the samples during that pass with for i, data in enumerate(dataset) instead of calling len() separately.
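The cost difference between the two approaches can be sketched as follows. CountingByIteration is a hypothetical class whose __len__ walks every item, mirroring the O(N) behavior described above; it is not the library's implementation.

```python
from typing import Iterable, Iterator


class CountingByIteration:
    """Hypothetical iterable whose __len__ makes a full pass over the data."""

    def __init__(self, items: Iterable[str]):
        self._items = list(items)

    def __iter__(self) -> Iterator[str]:
        return iter(self._items)

    def __len__(self) -> int:
        # One full pass just to count: O(N).
        return sum(1 for _ in self)


dataset = CountingByIteration(["a", "b", "c"])

# Option 1: a separate full pass to obtain the length.
full_pass_length = len(dataset)

# Option 2: count while processing, so the data is traversed only once.
count = 0
for i, data in enumerate(dataset):
    count = i + 1  # any per-sample work would go here
```

Calling len() and then iterating traverses the data twice, whereas the enumerate loop traverses it once.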

aistore.pytorch.base_iter_dataset.AISBaseIterDataset._create_objects_iter() -> typing.Iterable

Create an iterable of objects given the AIS sources and associated prefixes.

Returns: Iterable

Iterable over the objects from the sources provided
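One way to build a single iterable from several sources and their prefixes is to chain per-prefix listings, as in this sketch. SourceSketch and its list_objects method are hypothetical and are not the aistore API; treating a source that is missing from prefix_map as "list everything" is also an assumption.

```python
from itertools import chain
from typing import Dict, Iterable, Iterator, List, Union


class SourceSketch:
    """Hypothetical stand-in for an AISSource holding object names in memory."""

    def __init__(self, names: List[str]):
        self._names = names

    def list_objects(self, prefix: str = "") -> Iterator[str]:
        # Hypothetical method: lazily yield names that start with the prefix.
        return (name for name in self._names if name.startswith(prefix))


def create_objects_iter(
    sources: List[SourceSketch],
    prefix_map: Dict[SourceSketch, Union[str, List[str]]],
) -> Iterable[str]:
    """Chain the listings of every source, one listing per prefix."""
    listings = []
    for source in sources:
        # Assumption: a source with no entry is listed without a filter.
        prefixes = prefix_map.get(source, [""])
        if isinstance(prefixes, str):
            prefixes = [prefixes]
        for prefix in prefixes:
            listings.append(source.list_objects(prefix))
    return chain.from_iterable(listings)


source_a = SourceSketch(["train/0", "train/1", "val/0"])
source_b = SourceSketch(["x/0"])
objects = list(create_objects_iter([source_a, source_b], {source_a: "train/"}))
```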

aistore.pytorch.base_iter_dataset.AISBaseIterDataset._get_worker_iter_info() -> typing.Tuple[typing.Iterator, str]

Return an iterator for the current worker to access, together with a worker name. The result depends on how many Torch workers are present, if any.

Returns: Tuple[Iterator, str]

Iterator of objects and name of worker
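A common way to give each worker its own share of an iterable is strided slicing, sketched below. In PyTorch the worker id and worker count come from torch.utils.data.get_worker_info(), which returns None in the main process; here they are passed in as plain arguments so the sketch runs without torch. Whether the library shards exactly this way, and the format of the worker name, are assumptions.

```python
from itertools import islice
from typing import Iterable, Iterator, Optional, Tuple


def worker_iter_info(
    objects: Iterable,
    worker_id: Optional[int],
    num_workers: int,
) -> Tuple[Iterator, str]:
    """Hypothetical helper: return this worker's iterator and a display name."""
    if worker_id is None or num_workers <= 1:
        # No worker processes (or a single one): use the full iterable.
        return iter(objects), ""
    # Worker k takes items k, k + num_workers, k + 2 * num_workers, ...
    shard = islice(objects, worker_id, None, num_workers)
    return shard, f"Worker {worker_id}"


# Each call receives its own range, as each worker would hold its own copy.
main_iter, main_name = worker_iter_info(range(7), None, 0)
w0_iter, w0_name = worker_iter_info(range(7), 0, 3)
w1_iter, w1_name = worker_iter_info(range(7), 1, 3)

main_items = list(main_iter)
w0_items = list(w0_iter)
w1_items = list(w1_iter)
```

With strided slicing the shards are disjoint and together cover every object, so no sample is duplicated across workers.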

aistore.pytorch.base_iter_dataset.AISBaseIterDataset._reset_iterator()

Reset the object iterator to start from the beginning.