# nemo_automodel.components.datasets.diffusion.base_dataset

## Module Contents

### Classes

| Name                                                                                                                  | Description                                                                  |
| --------------------------------------------------------------------------------------------------------------------- | ---------------------------------------------------------------------------- |
| [`BaseMultiresolutionDataset`](#nemo_automodel-components-datasets-diffusion-base_dataset-BaseMultiresolutionDataset) | Abstract base class for multiresolution datasets with bucket-based sampling. |

### Data

[`logger`](#nemo_automodel-components-datasets-diffusion-base_dataset-logger)

### API

```python
class nemo_automodel.components.datasets.diffusion.base_dataset.BaseMultiresolutionDataset(
    cache_dir: str,
    quantization: int = 64
)
```

Abstract

**Bases:** `Dataset`

Abstract base class for multiresolution datasets with bucket-based sampling.
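The `quantization` argument is not described in this reference. In multiresolution pipelines, image dimensions are commonly snapped to multiples of a fixed step so that samples fall into a small number of buckets. The sketch below illustrates that idea only; `quantize` is a hypothetical helper, and treating `quantization` as such a step size is an assumption, not the documented behavior of this class.

```python
def quantize(value: int, quantization: int = 64) -> int:
    """Round `value` to the nearest multiple of `quantization`.

    Hypothetical helper: the result is clamped so it is never smaller
    than one quantization step.
    """
    return max(quantization, round(value / quantization) * quantization)


# Nearby sizes collapse onto the same grid point, which keeps the bucket count small.
sizes = [quantize(v) for v in (500, 530, 1000, 10)]
```

Under this assumption, a 500-pixel and a 530-pixel edge would both land on the same 512-pixel grid point.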

```python
nemo_automodel.components.datasets.diffusion.base_dataset.BaseMultiresolutionDataset.__getitem__(
    idx: int
) -> typing.Dict
```

abstract

Load a single sample. Subclasses must implement this method.
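The abstract-method contract can be sketched without the library installed. `SketchBaseDataset` and `InMemoryDataset` are hypothetical stand-ins written for illustration; they are not the real `BaseMultiresolutionDataset`, and the keys in the returned dictionary are assumed.

```python
from abc import ABC, abstractmethod
from typing import Dict, List


class SketchBaseDataset(ABC):
    """Hypothetical stand-in for an abstract dataset base (not the library's code)."""

    def __init__(self, samples: List[Dict]):
        self.samples = samples

    def __len__(self) -> int:
        return len(self.samples)

    @abstractmethod
    def __getitem__(self, idx: int) -> Dict:
        """Load a single sample. Concrete subclasses must override this."""


class InMemoryDataset(SketchBaseDataset):
    """Hypothetical subclass that returns a copy of the stored entry."""

    def __getitem__(self, idx: int) -> Dict:
        return dict(self.samples[idx])


dataset = InMemoryDataset([{"id": 0, "width": 512, "height": 512}])
first = dataset[0]
```

Instantiating the base directly raises `TypeError`, which is the standard `abc` behavior for a class that still has an unimplemented abstract method.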

```python
nemo_automodel.components.datasets.diffusion.base_dataset.BaseMultiresolutionDataset.__len__() -> int
```

```python
nemo_automodel.components.datasets.diffusion.base_dataset.BaseMultiresolutionDataset._aspect_ratio_to_name(
    aspect_ratio: float
) -> str
```

Convert aspect ratio to a descriptive name.
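The reference does not list the names this method returns. As a rough sketch of what such a mapping could look like, the hypothetical function below assumes the ratio is width divided by height and uses illustrative labels and an illustrative tolerance.

```python
def aspect_ratio_to_name(aspect_ratio: float, tolerance: float = 0.05) -> str:
    """Hypothetical mapping from a width/height ratio to a label.

    The labels and the tolerance are assumptions made for this example.
    """
    if abs(aspect_ratio - 1.0) <= tolerance:
        return "square"
    return "landscape" if aspect_ratio > 1.0 else "portrait"


names = [aspect_ratio_to_name(r) for r in (1.0, 16 / 9, 9 / 16)]
```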

```python
nemo_automodel.components.datasets.diffusion.base_dataset.BaseMultiresolutionDataset._group_by_bucket()
```

Group samples by bucket (aspect\_ratio + resolution).
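A minimal sketch of bucket grouping, assuming each metadata entry carries an `aspect_ratio` and a `resolution` field. The field names, the sample values, and the `group_by_bucket` function are hypothetical; the real method takes no arguments and its internal storage is not documented here.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical metadata entries; the real field names are not documented.
metadata = [
    {"aspect_ratio": 1.0, "resolution": (512, 512)},
    {"aspect_ratio": 1.5, "resolution": (768, 512)},
    {"aspect_ratio": 1.0, "resolution": (512, 512)},
]


def group_by_bucket(
    items: List[Dict],
) -> Dict[Tuple[float, Tuple[int, int]], List[int]]:
    """Map each (aspect_ratio, resolution) key to the indices of its samples."""
    buckets = defaultdict(list)
    for idx, item in enumerate(items):
        key = (item["aspect_ratio"], tuple(item["resolution"]))
        buckets[key].append(idx)
    return dict(buckets)


buckets = group_by_bucket(metadata)
```

Storing indices rather than samples keeps the mapping small, and a sampler can then draw every batch from a single bucket so that all tensors in the batch share one shape.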

```python
nemo_automodel.components.datasets.diffusion.base_dataset.BaseMultiresolutionDataset._load_metadata() -> typing.List[typing.Dict]
```

Load metadata from cache directory.

Expects a `metadata.json` file in the cache directory with a `"shards"` key that references the shard files.
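The on-disk layout beyond the `"shards"` key is not specified in this reference. The sketch below assumes that `"shards"` is a list of JSON file names relative to the cache directory and that each shard file holds a list of sample dictionaries. Both points are assumptions, and `load_metadata` is a hypothetical stand-in for the private method.

```python
import json
import tempfile
from pathlib import Path
from typing import Dict, List


def load_metadata(cache_dir: str) -> List[Dict]:
    """Read metadata.json and concatenate the entries of every listed shard.

    Assumption: "shards" is a list of JSON file names relative to cache_dir,
    and each shard file contains a list of sample dictionaries.
    """
    root = Path(cache_dir)
    with open(root / "metadata.json", encoding="utf-8") as f:
        index = json.load(f)
    samples: List[Dict] = []
    for shard_name in index["shards"]:
        with open(root / shard_name, encoding="utf-8") as f:
            samples.extend(json.load(f))
    return samples


# Build a throwaway cache directory to exercise the sketch.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "shard_0.json").write_text(json.dumps([{"id": 0}, {"id": 1}]), encoding="utf-8")
    (root / "shard_1.json").write_text(json.dumps([{"id": 2}]), encoding="utf-8")
    (root / "metadata.json").write_text(
        json.dumps({"shards": ["shard_0.json", "shard_1.json"]}), encoding="utf-8"
    )
    loaded = load_metadata(tmp)
```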

```python
nemo_automodel.components.datasets.diffusion.base_dataset.BaseMultiresolutionDataset.get_bucket_info() -> typing.Dict
```

Get bucket organization information.
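The structure of the returned dictionary is not documented here. As an illustration of the kind of summary such a method might produce, the hypothetical `bucket_info` function below counts buckets and samples; every key name in its output is an assumption.

```python
from typing import Dict, List, Tuple

# Hypothetical bucket mapping: (aspect-ratio name, resolution) -> sample indices.
buckets: Dict[Tuple[str, Tuple[int, int]], List[int]] = {
    ("square", (512, 512)): [0, 2, 5],
    ("landscape", (768, 512)): [1, 3],
}


def bucket_info(groups: Dict[Tuple[str, Tuple[int, int]], List[int]]) -> Dict:
    """Summarize bucket counts; the keys of this dict are illustrative only."""
    return {
        "total_buckets": len(groups),
        "total_samples": sum(len(idxs) for idxs in groups.values()),
        "samples_per_bucket": {
            f"{name}/{w}x{h}": len(idxs) for (name, (w, h)), idxs in groups.items()
        },
    }


info = bucket_info(buckets)
```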

```python
nemo_automodel.components.datasets.diffusion.base_dataset.logger = logging.getLogger(__name__)
```