# nemo_automodel.components.datasets.lazy_mapped_dataset

## Module Contents

### Classes

| Name                                                                                             | Description                                                             |
| ------------------------------------------------------------------------------------------------ | ----------------------------------------------------------------------- |
| [`LazyMappedDataset`](#nemo_automodel-components-datasets-lazy_mapped_dataset-LazyMappedDataset) | Dataset wrapper that applies a transform function on the fly instead of preprocessing the whole dataset upfront with `.map(fn)`. |

### Data

[`logger`](#nemo_automodel-components-datasets-lazy_mapped_dataset-logger)

### API

```python
class nemo_automodel.components.datasets.lazy_mapped_dataset.LazyMappedDataset(
    dataset,
    map_fn,
    cache_size = 10000
)
```

**Bases:** `Dataset`

Dataset wrapper that applies a transform function on the fly instead of
preprocessing the whole dataset upfront with `.map(fn)`.

**Parameters:**

`dataset`: Any object that supports `__len__` and `__getitem__`
(e.g. a Hugging Face `datasets.Dataset`).

`map_fn`: A callable that accepts a single example and returns the
transformed example.

`cache_size`: Number of processed items to cache. Defaults to 10,000.
Set to `0` to disable caching or `None` to cache all processed items.

**Returns:**

A map-style dataset that applies `map_fn` lazily on each item access.
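To illustrate what "lazily" means here, the following is a minimal sketch of the idea, not the library's implementation. The class `LazySketch` and the function `double` are hypothetical names introduced only for this example, and the sketch omits caching entirely.

```python
from typing import Any, Callable, Sequence


class LazySketch:
    """Hypothetical stand-in that applies map_fn on access, not upfront."""

    def __init__(self, dataset: Sequence[Any], map_fn: Callable[[Any], Any]) -> None:
        self.dataset = dataset
        self.map_fn = map_fn

    def __len__(self) -> int:
        return len(self.dataset)

    def __getitem__(self, idx: int) -> Any:
        # The transform runs here, once per access (no caching in this sketch).
        return self.map_fn(self.dataset[idx])


calls = []


def double(example: dict) -> dict:
    calls.append(example["x"])  # record each invocation of the transform
    return {"x": example["x"] * 2}


raw = [{"x": i} for i in range(1000)]
lazy = LazySketch(raw, double)
calls_after_wrap = len(calls)    # wrapping alone transforms nothing
item = lazy[3]                   # only index 3 is transformed
calls_after_access = len(calls)
```

Wrapping the 1,000 raw examples costs no transform calls; the work happens only for the indices that are actually read.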

The class can also return LRU cache statistics, or `None` if caching is disabled.

```python
nemo_automodel.components.datasets.lazy_mapped_dataset.LazyMappedDataset.__getitem__(
    idx: int
) -> typing.Any
```

Returns the transformed item at the given index.

```python
nemo_automodel.components.datasets.lazy_mapped_dataset.LazyMappedDataset.__getstate__() -> dict
```

Returns picklable state by dropping the unpicklable `_get_item` function.

```python
nemo_automodel.components.datasets.lazy_mapped_dataset.LazyMappedDataset.__len__() -> int
```

Returns the number of items in the dataset.

```python
nemo_automodel.components.datasets.lazy_mapped_dataset.LazyMappedDataset.__repr__() -> str
```

Returns a string representation of the dataset.

```python
nemo_automodel.components.datasets.lazy_mapped_dataset.LazyMappedDataset.__setstate__(
    state: dict
) -> None
```

Restores state and rebuilds `_get_item` after unpickling.
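The `__getstate__` / `__setstate__` pair can be sketched as follows. This is an illustrative pattern under stated assumptions, not the library's code: the class `PicklableSketch` and its attribute layout are hypothetical, and the accessor is assumed to be a locally defined function wrapped in `functools.lru_cache`. The round trip below mimics what `pickle` does by serializing the state dictionary and restoring it on a fresh object.

```python
import pickle
from functools import lru_cache


class PicklableSketch:
    """Hypothetical wrapper; attribute names are illustrative only."""

    def __init__(self, dataset, map_fn, cache_size=2):
        self.dataset = dataset
        self.map_fn = map_fn
        self.cache_size = cache_size
        self._build_get_item()

    def _build_get_item(self):
        def get_item(idx):
            return self.map_fn(self.dataset[idx])

        # A locally defined function cannot be pickled by reference.
        self._get_item = lru_cache(maxsize=self.cache_size)(get_item)

    def __getitem__(self, idx):
        return self._get_item(idx)

    def __getstate__(self):
        state = self.__dict__.copy()
        state.pop("_get_item", None)  # drop the unpicklable accessor
        return state

    def __setstate__(self, state):
        self.__dict__.update(state)
        self._build_get_item()  # rebuild the accessor; its cache starts empty


original = PicklableSketch([-1, -2, -3], abs)
first = original[1]

# Mimic pickle: serialize the state, then restore it on a new object.
state = original.__getstate__()
payload = pickle.dumps(state)
restored = PicklableSketch.__new__(PicklableSketch)
restored.__setstate__(pickle.loads(payload))
second = restored[1]
```

One consequence of this design is that cached items do not travel with the object: the restored copy recomputes items on first access.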

```python
nemo_automodel.components.datasets.lazy_mapped_dataset.LazyMappedDataset._build_get_item() -> None
```

Builds the internal item accessor, with or without LRU caching.
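One way to build such an accessor is sketched below, assuming `functools.lru_cache` as the caching mechanism (the source only says "LRU caching"). The helper `build_get_item` and the function `square` are hypothetical names for this example; the `cache_size` semantics follow the parameter description above, where `0` disables caching and `None` caches everything.

```python
from functools import lru_cache
from typing import Any, Callable, Optional, Sequence


def build_get_item(
    dataset: Sequence[Any],
    map_fn: Callable[[Any], Any],
    cache_size: Optional[int],
) -> Callable[[int], Any]:
    """Hypothetical helper: return an index -> transformed item accessor."""

    def get_item(idx: int) -> Any:
        return map_fn(dataset[idx])

    if cache_size == 0:
        return get_item  # caching disabled: plain accessor
    # maxsize=None makes functools.lru_cache unbounded ("cache all").
    return lru_cache(maxsize=cache_size)(get_item)


calls = []


def square(x: int) -> int:
    calls.append(x)  # record each invocation of the transform
    return x * x


cached = build_get_item([1, 2, 3], square, cache_size=2)
cached(0)
cached(0)
cached_calls = len(calls)      # the second access is served from the cache
info = cached.cache_info()     # statistics exposed by functools.lru_cache

uncached = build_get_item([1, 2, 3], square, cache_size=0)
uncached(0)
uncached(0)
uncached_calls = len(calls) - cached_calls
stats = getattr(uncached, "cache_info", None)  # plain function: no statistics
```

In this sketch the uncached accessor has no `cache_info` attribute, which is one plausible reason a statistics query would report `None` when caching is disabled.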

```python
nemo_automodel.components.datasets.lazy_mapped_dataset.logger = logging.getLogger(__name__)
```