# nemo_curator.utils.performance_utils

## Module Contents

### Classes

| Name                                                                     | Description                                        |
| ------------------------------------------------------------------------ | -------------------------------------------------- |
| [`StagePerfStats`](#nemo_curator-utils-performance_utils-StagePerfStats) | Statistics for tracking stage performance metrics. |
| [`StageTimer`](#nemo_curator-utils-performance_utils-StageTimer)         | Tracker for stage performance stats.               |

### API

```python
class nemo_curator.utils.performance_utils.StagePerfStats()
```

Statistics for tracking stage performance metrics.

Attributes:

- `stage_name`: Name of the processing stage.
- `process_time`: Total processing time in seconds.
- `actor_idle_time`: Time the actor spent idle in seconds.
- `input_data_size_mb`: Size of input data in megabytes.
- `num_items_processed`: Number of items processed in this stage.
- `custom_metrics`: Custom metrics to track.

```python
nemo_curator.utils.performance_utils.StagePerfStats.__add__(
    other: nemo_curator.utils.performance_utils.StagePerfStats
) -> nemo_curator.utils.performance_utils.StagePerfStats
```

Add two StagePerfStats.

```python
nemo_curator.utils.performance_utils.StagePerfStats.__radd__(
    other: int | nemo_curator.utils.performance_utils.StagePerfStats
) -> nemo_curator.utils.performance_utils.StagePerfStats
```

Add two `StagePerfStats` together. If `other` is `0`, return this instance unchanged.
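The zero case matters because Python's built-in `sum()` starts from the integer `0`, so the first addition it evaluates is `0 + stats`. The sketch below shows this pattern on a hypothetical stand-in dataclass, `PerfStatsSketch`. Its fields mirror three of the documented attributes, but the merging rules (summing numeric fields, keeping the left operand's stage name) are assumptions, not the library's implementation.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class PerfStatsSketch:
    """Hypothetical stand-in for StagePerfStats (not the real class)."""

    stage_name: str = ""
    process_time: float = 0.0
    num_items_processed: int = 0

    def __add__(self, other: PerfStatsSketch) -> PerfStatsSketch:
        # Assumption: numeric fields are summed and the left name is kept.
        return PerfStatsSketch(
            stage_name=self.stage_name,
            process_time=self.process_time + other.process_time,
            num_items_processed=self.num_items_processed + other.num_items_processed,
        )

    def __radd__(self, other: int | PerfStatsSketch) -> PerfStatsSketch:
        # sum() starts from 0, so 0 + stats must return stats unchanged.
        if other == 0:
            return self
        return self.__add__(other)


batches = [
    PerfStatsSketch("dedup", process_time=1.5, num_items_processed=10),
    PerfStatsSketch("dedup", process_time=2.5, num_items_processed=30),
]
total = sum(batches)
print(total.process_time, total.num_items_processed)  # 4.0 40
```

With this pattern, per-batch stats can be aggregated with `sum()` without passing an explicit start value.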

```python
nemo_curator.utils.performance_utils.StagePerfStats.items() -> list[tuple[str, float | int]]
```

Returns `(metric_name, metric_value)` pairs.
Entries in `custom_metrics` are flattened into the format `(custom.<metric_name>, metric_value)`.
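A minimal sketch of this flattening, using a hypothetical helper `flatten_metrics` rather than the real method. The metric names come from the documented attributes; the ordering of the pairs (built-in metrics first) is an assumption.

```python
from __future__ import annotations


def flatten_metrics(
    base: dict[str, float | int],
    custom_metrics: dict[str, float | int],
) -> list[tuple[str, float | int]]:
    """Hypothetical helper: built-in metrics first, then custom.<name> pairs."""
    pairs = list(base.items())
    # Prefix every custom metric so it cannot collide with a built-in name.
    pairs.extend((f"custom.{name}", value) for name, value in custom_metrics.items())
    return pairs


pairs = flatten_metrics(
    {"process_time": 1.25, "num_items_processed": 8},
    {"tokens": 4096},
)
print(pairs)
# [('process_time', 1.25), ('num_items_processed', 8), ('custom.tokens', 4096)]
```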

```python
nemo_curator.utils.performance_utils.StagePerfStats.reset() -> None
```

Reset the stats.

```python
nemo_curator.utils.performance_utils.StagePerfStats.to_dict() -> dict[str, float | int]
```

Convert the stats to a dictionary.

```python
class nemo_curator.utils.performance_utils.StageTimer(
    stage: nemo_curator.stages.base.ProcessingStage
)
```

Tracker for stage performance stats.
Tracks processing time and other metrics for each `process_data` call.

```python
nemo_curator.utils.performance_utils.StageTimer._reset() -> None
```

Reset internal counters.

```python
nemo_curator.utils.performance_utils.StageTimer.log_stats(
    verbose: bool = False
) -> tuple[str, nemo_curator.utils.performance_utils.StagePerfStats]
```

Log the stats of the stage.

Args:

- `verbose`: Whether to log the stats verbosely.

Returns:

A tuple of the stage name and the stage performance stats.

```python
nemo_curator.utils.performance_utils.StageTimer.reinit(
    stage_input_size: int = 1
) -> None
```

Reinitialize the stage timer.

Args:

- `stage_input_size`: The size of the stage input.

```python
nemo_curator.utils.performance_utils.StageTimer.time_process(
    num_items: int = 1
) -> collections.abc.Generator[None, None, None]
```

Time the processing of the stage.

Args:

- `num_items`: The number of items being processed.
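The `Generator[None, None, None]` return type suggests that `time_process` is meant to be used in a `with` block, presumably via `contextlib.contextmanager`. The sketch below illustrates that general pattern with a hypothetical `TimerSketch` class; its attribute names and accumulation logic are assumptions, not the library's implementation.

```python
import time
from collections.abc import Generator
from contextlib import contextmanager


class TimerSketch:
    """Hypothetical stand-in showing a generator-based timing context manager."""

    def __init__(self) -> None:
        self.process_time = 0.0
        self.num_items_processed = 0

    @contextmanager
    def time_process(self, num_items: int = 1) -> Generator[None, None, None]:
        start = time.perf_counter()
        try:
            yield  # The body of the with block runs here.
        finally:
            # Accumulate elapsed time even if the body raises.
            self.process_time += time.perf_counter() - start
            self.num_items_processed += num_items


timer = TimerSketch()
with timer.time_process(num_items=3):
    time.sleep(0.01)  # Placeholder for real work.
print(timer.num_items_processed, timer.process_time > 0)
```

Wrapping the measurement in `try`/`finally` keeps the counters consistent when the timed block raises an exception.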