
Copyright © 2026, NVIDIA Corporation.

aistore.pytorch.shard_reader


AIS Shard Reader for PyTorch

PyTorch Dataset and DataLoader for AIS.

Copyright (c) 2024-2025, NVIDIA CORPORATION. All rights reserved.

Module Contents

Classes

Name              Description
AISShardReader    An iterable-style dataset that iterates over objects stored as WebDataset shards

API

class aistore.pytorch.shard_reader.AISShardReader(
    bucket_list: typing.Union[aistore.sdk.Bucket, typing.List[aistore.sdk.Bucket]],
    prefix_map: typing.Dict[aistore.sdk.Bucket, typing.Union[str, typing.List[str]]] = {},
    etl_name: str = None,
    show_progress: bool = False
)

Bases: AISBaseIterDataset

An iterable-style dataset that iterates over objects stored as WebDataset shards and yields each sample as a tuple of basename (str) and contents (dictionary).

Parameters:

bucket_list
Union[Bucket, List[Bucket]]

A single Bucket object or a list of Bucket objects to load data from

prefix_map
Dict(AISSource, Union[str, List[str]])
Defaults to {}

Map of Bucket objects to a prefix or list of prefixes; only objects whose names start with one of the specified prefixes are used from that bucket

etl_name
str
Defaults to None

Optional ETL on the AIS cluster to apply to each object

show_progress
bool
Defaults to False

Enables a console progress indicator for shard reading
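
The parameters above can be combined as in the following minimal sketch. It assumes an AIS cluster reachable at http://localhost:8080, a bucket named "my-shards", and shards stored under a "train/" prefix; all three values are hypothetical, as are the helper names build_shard_reader and preview. The aistore imports sit inside the function so that the sketch can be loaded without a running cluster.

```python
def build_shard_reader(endpoint="http://localhost:8080", bucket_name="my-shards"):
    """Build an AISShardReader over one bucket (endpoint and bucket name are hypothetical)."""
    # Imported lazily so this sketch can be loaded without aistore installed.
    from aistore.sdk import Client
    from aistore.pytorch.shard_reader import AISShardReader

    bucket = Client(endpoint).bucket(bucket_name)
    return AISShardReader(
        bucket_list=bucket,             # a single Bucket or a list of Buckets
        prefix_map={bucket: "train/"},  # hypothetical prefix restricting which shards are read
        show_progress=True,             # console progress indicator
    )


def preview(reader, limit=3):
    """Print the basename and content keys of the first few samples."""
    for i, (basename, contents) in enumerate(reader):
        if i >= limit:
            break
        print(basename, sorted(contents))
```

Calling preview(build_shard_reader()) against a live cluster would print up to three samples; preview itself only relies on the documented sample shape, a tuple of basename and contents dictionary.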

_observed_keys = set()
aistore.pytorch.shard_reader.AISShardReader.__iter__() -> typing.Iterator
aistore.pytorch.shard_reader.AISShardReader.__len__()

Returns the length of the dataset. Note that calling this will iterate through the dataset, taking O(N) time.

NOTE: If you want the length of the dataset after iterating through it, use for i, data in enumerate(dataset) instead.
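
The difference between the two approaches can be sketched with a stand-in dataset. TinyIterDataset below is a hypothetical class, not part of aistore; it only mimics an iterable-style dataset whose `__len__` has to scan every sample, and it counts how many full passes are started.

```python
class TinyIterDataset:
    """Hypothetical stand-in for an iterable-style dataset with an O(N) __len__."""

    def __init__(self, samples):
        self._samples = list(samples)
        self.passes = 0  # number of iterations started over the data

    def __iter__(self):
        self.passes += 1
        return iter(self._samples)

    def __len__(self):
        # Counts by iterating, so every call costs one full pass.
        return sum(1 for _ in self)


def consume_and_count(dataset):
    """Process every sample once and return how many were seen."""
    count = 0
    for i, data in enumerate(dataset):
        count = i + 1  # enumerate is zero-based
        # ... process `data` here ...
    return count


dataset = TinyIterDataset(["a", "b", "c"])
total = consume_and_count(dataset)   # one pass: processing and counting together
single_pass = dataset.passes

dataset_again = TinyIterDataset(["a", "b", "c"])
length = len(dataset_again)          # first pass, only to count
for _ in dataset_again:              # second pass, to process
    pass
double_pass = dataset_again.passes
```

With a remote dataset each extra pass means re-reading every shard, which is why counting inside the loop is preferable when the data is being iterated anyway.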

aistore.pytorch.shard_reader.AISShardReader._read_samples_from_shards(
shard_content
) -> typing.Dict
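
The sample shape described for AISShardReader (a basename plus a dictionary of contents) follows the WebDataset convention of grouping files in a tar shard that share a name up to the first dot. The sketch below illustrates that grouping with the standard library only. It is not the library's `_read_samples_from_shards` implementation; make_shard and group_samples are hypothetical helper names, and keying the inner dictionary by file extension is an assumption based on the WebDataset convention.

```python
import io
import posixpath
import tarfile
from typing import Dict


def make_shard(files: Dict[str, bytes]) -> bytes:
    """Build an in-memory tar archive from a {member name: data} mapping."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name, data in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def group_samples(shard_bytes: bytes) -> Dict[str, Dict[str, bytes]]:
    """Group tar members into {basename: {extension: data}}."""
    samples: Dict[str, Dict[str, bytes]] = {}
    with tarfile.open(fileobj=io.BytesIO(shard_bytes), mode="r") as tar:
        for member in tar:
            if not member.isfile():
                continue  # skip directories and other non-file entries
            directory, filename = posixpath.split(member.name)
            # Split the file name at its first dot: "0001.seg.png" -> "0001", "seg.png"
            stem, _, extension = filename.partition(".")
            basename = posixpath.join(directory, stem)
            samples.setdefault(basename, {})[extension] = tar.extractfile(member).read()
    return samples


shard = make_shard({"0001.jpg": b"img", "0001.cls": b"3", "0002.jpg": b"img2"})
grouped = group_samples(shard)
for basename, contents in grouped.items():
    print(basename, sorted(contents))
```

Each (basename, contents) pair printed here has the same shape as the samples the class description says the dataset yields.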