nemo_automodel.components.datasets.diffusion.base_dataset
Module Contents
Classes
Data
API
Abstract
Bases: Dataset
Abstract base class for multiresolution datasets with bucket-based sampling.
cache_dir
calculator
metadata
Abstract method. Load a single sample. Subclasses must implement this method.
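The abstract-method contract can be illustrated with a small stand-in. This is only a sketch: `SketchBaseDataset`, `load_sample`, and `InMemoryDataset` are hypothetical names chosen for illustration, and the real base class's method name, signature, and return type are not shown on this page.

```python
from abc import ABC, abstractmethod


class SketchBaseDataset(ABC):
    """Hypothetical stand-in for an abstract dataset base class."""

    def __init__(self, metadata):
        # One record per sample; the record layout is an assumption.
        self.metadata = metadata

    def __len__(self):
        return len(self.metadata)

    def __getitem__(self, index):
        # Indexing is delegated to the subclass-provided loader.
        return self.load_sample(index)

    @abstractmethod
    def load_sample(self, index):
        """Load a single sample. Subclasses must implement."""


class InMemoryDataset(SketchBaseDataset):
    """Toy subclass that builds samples directly from the metadata."""

    def load_sample(self, index):
        record = self.metadata[index]
        return {"index": index, "caption": record["caption"]}


dataset = InMemoryDataset([{"caption": "a cat"}, {"caption": "a dog"}])
```

Instantiating `SketchBaseDataset` directly raises `TypeError`, because the abstract method has no implementation.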
Convert aspect ratio to a descriptive name.
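As a rough illustration of such a conversion, the sketch below maps a width/height ratio to a coarse label. The function name `aspect_ratio_name`, the labels, and the tolerance are assumptions made for this example; the names the library actually produces may differ.

```python
def aspect_ratio_name(aspect_ratio: float, tolerance: float = 0.05) -> str:
    """Map a width/height ratio to a coarse label (hypothetical thresholds)."""
    if aspect_ratio <= 0:
        raise ValueError("aspect_ratio must be positive")
    # Ratios close to 1.0 are treated as square.
    if abs(aspect_ratio - 1.0) <= tolerance:
        return "square"
    # Wider than tall is landscape, taller than wide is portrait.
    return "landscape" if aspect_ratio > 1.0 else "portrait"
```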
Group samples by bucket (aspect_ratio + resolution).
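One way to picture this grouping is a dictionary keyed by (aspect ratio, resolution) whose values are lists of sample indices. The sketch below assumes each metadata record carries `aspect_ratio` and `resolution` fields; those field names and the helper `group_by_bucket` are hypothetical, not taken from the library.

```python
from collections import defaultdict


def group_by_bucket(samples):
    """Group sample indices by an (aspect_ratio, resolution) key."""
    buckets = defaultdict(list)
    for index, sample in enumerate(samples):
        # Resolution is converted to a tuple so the key is hashable.
        key = (sample["aspect_ratio"], tuple(sample["resolution"]))
        buckets[key].append(index)
    return dict(buckets)


samples = [
    {"aspect_ratio": 1.0, "resolution": (512, 512)},
    {"aspect_ratio": 1.5, "resolution": (768, 512)},
    {"aspect_ratio": 1.0, "resolution": (512, 512)},
]
buckets = group_by_bucket(samples)
```

Keeping indices rather than loaded samples in each bucket lets a sampler draw batches of uniform shape without reading any data up front.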
Load metadata from cache directory.
Expects a metadata.json file with a "shards" key that references the shard files.
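A minimal sketch of this loading step follows. Only the `metadata.json` file name and the `shards` key come from the description above; the sketch further assumes that each entry under `shards` is a file name relative to the cache directory and that each shard file is a JSON list of sample records. The helper name `load_metadata` and the demo file names are hypothetical.

```python
import json
import tempfile
from pathlib import Path


def load_metadata(cache_dir):
    """Read metadata.json and concatenate the records of every shard."""
    cache_dir = Path(cache_dir)
    with open(cache_dir / "metadata.json", encoding="utf-8") as f:
        index = json.load(f)
    if "shards" not in index:
        raise KeyError("metadata.json is missing the 'shards' key")
    records = []
    for shard_name in index["shards"]:
        # Assumption: shard entries are paths relative to cache_dir.
        with open(cache_dir / shard_name, encoding="utf-8") as f:
            records.extend(json.load(f))
    return records


# Demo with a throwaway cache directory.
with tempfile.TemporaryDirectory() as tmp:
    tmp_path = Path(tmp)
    (tmp_path / "shard_0.json").write_text(json.dumps([{"id": 0}, {"id": 1}]))
    (tmp_path / "shard_1.json").write_text(json.dumps([{"id": 2}]))
    (tmp_path / "metadata.json").write_text(
        json.dumps({"shards": ["shard_0.json", "shard_1.json"]})
    )
    loaded = load_metadata(tmp_path)
```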
Get bucket organization information.