# data\_designer.engine.processing.processors.base

## Module Contents

### Classes

| Name                                                                 | Description                        |
| -------------------------------------------------------------------- | ---------------------------------- |
| [`Processor`](#data_designerengineprocessingprocessorsbaseprocessor) | Base class for dataset processors. |

### API

```python
class data_designer.engine.processing.processors.base.Processor(
    config: data_designer.engine.configurable_task.TaskConfigT,
    resource_provider: data_designer.engine.resources.resource_provider.ResourceProvider
)
```

**Bases**: `data_designer.engine.configurable_task.ConfigurableTask[data_designer.engine.configurable_task.TaskConfigT]`, `abc.ABC`

Base class for dataset processors.

Processors transform data at different stages of the generation pipeline.
Override the callback methods for the stages you want to handle.

```python
implements(method_name: str) -> bool
```

Check whether the subclass overrides the callback method named by `method_name`.
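One plausible way to detect an override is to compare the method found on the instance's class with the one defined on the base class. The sketch below uses a stand-in `BaseProcessor` and a hypothetical `DropEmptyRows` subclass, not the real `Processor`; the library's actual implementation may differ.

```python
class BaseProcessor:
    """Stand-in for the real Processor base class (hypothetical)."""

    def implements(self, method_name: str) -> bool:
        # A method counts as overridden when the attribute found on the
        # instance's class is not the function defined on the base class.
        base_method = getattr(BaseProcessor, method_name, None)
        own_method = getattr(type(self), method_name, None)
        return own_method is not None and own_method is not base_method

    def process_before_batch(self, data):
        return data

    def process_after_batch(self, data, *, current_batch_number):
        return data


class DropEmptyRows(BaseProcessor):
    """Hypothetical subclass that only handles the post-batch stage."""

    def process_after_batch(self, data, *, current_batch_number):
        return [row for row in data if row]


processor = DropEmptyRows()
overrides_after = processor.implements("process_after_batch")
overrides_before = processor.implements("process_before_batch")
```

A check like this lets a caller skip stages that a given processor does not handle.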

```python
process_before_batch(data: data_designer.engine.configurable_task.DataT) -> data_designer.engine.configurable_task.DataT
```

Called at the PRE\_BATCH stage, before each batch is generated.

Override to transform batch data before generation begins.

**Parameters:**

`data`: The batch data before generation.

**Returns:**

`data_designer.engine.configurable_task.DataT`

Transformed batch data.
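As a minimal sketch of a pre-batch transform, the class below strips whitespace from string fields. It is a hypothetical stand-in that does not derive from the real `Processor`, and it models the batch as a list of dictionaries purely for illustration; the concrete type behind `DataT` may be different.

```python
class WhitespaceStripper:
    """Hypothetical processor-like class; not derived from the real Processor."""

    def process_before_batch(self, data):
        # Build new rows so the caller's input is left untouched.
        return [
            {
                key: value.strip() if isinstance(value, str) else value
                for key, value in row.items()
            }
            for row in data
        ]


stripper = WhitespaceStripper()
original = [{"name": "  Ada ", "age": 36}]
cleaned = stripper.process_before_batch(original)
```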

```python
process_after_batch(
    data: data_designer.engine.configurable_task.DataT,
    *,
    current_batch_number: int | None
) -> data_designer.engine.configurable_task.DataT
```

Called at the POST\_BATCH stage, after each batch is generated.

Override to process each batch of generated data.

**Parameters:**

`data`: The generated batch data.

`current_batch_number`: The current batch number (0-indexed), or `None` in preview mode.

**Returns:**

`data_designer.engine.configurable_task.DataT`

Transformed batch data.
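Because `current_batch_number` can be `None` in preview mode, an override should handle that case explicitly. The following sketch uses a hypothetical `BatchTagger` class (not derived from the real `Processor`) and assumes list-of-dict batches for illustration only.

```python
class BatchTagger:
    """Hypothetical processor-like class; not derived from the real Processor."""

    def process_after_batch(self, data, *, current_batch_number):
        # current_batch_number is None in preview mode, so label those rows
        # separately instead of formatting None as a number.
        if current_batch_number is None:
            label = "preview"
        else:
            label = f"batch-{current_batch_number}"
        return [{**row, "source": label} for row in data]


tagger = BatchTagger()
rows = [{"id": 1}, {"id": 2}]
tagged_first = tagger.process_after_batch(rows, current_batch_number=0)
tagged_preview = tagger.process_after_batch(rows, current_batch_number=None)
```

Note that `current_batch_number` is keyword-only in the signature above, so it has to be passed by name.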

```python
process_after_generation(data: data_designer.engine.configurable_task.DataT) -> data_designer.engine.configurable_task.DataT
```

Called at the AFTER\_GENERATION stage, on the final combined dataset.

Override to transform the complete generated dataset.

**Parameters:**

`data`: The final combined dataset.

**Returns:**

`data_designer.engine.configurable_task.DataT`

Transformed final dataset.
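To illustrate how the three callbacks relate to each other, here is a toy driver that calls them in the order the stage descriptions suggest. Everything in it is an assumption made for illustration: `RecordingProcessor`, `run_toy_pipeline`, the fake generation step, and the list-of-dict data are all hypothetical, and the engine's real scheduling is not documented on this page.

```python
class RecordingProcessor:
    """Hypothetical stand-in that records which callbacks run, in order."""

    def __init__(self):
        self.calls = []

    def process_before_batch(self, data):
        self.calls.append("before_batch")
        return data

    def process_after_batch(self, data, *, current_batch_number):
        self.calls.append(f"after_batch:{current_batch_number}")
        return data

    def process_after_generation(self, data):
        self.calls.append("after_generation")
        return data


def run_toy_pipeline(processor, seed_batches):
    """Toy driver (an assumption, not the engine's real scheduler)."""
    combined = []
    for batch_number, seed in enumerate(seed_batches):
        seed = processor.process_before_batch(seed)
        # Fake "generation": mark each row as generated.
        generated = [{**row, "generated": True} for row in seed]
        generated = processor.process_after_batch(
            generated, current_batch_number=batch_number
        )
        combined.extend(generated)
    # The final callback sees the combined output of all batches.
    return processor.process_after_generation(combined)


recorder = RecordingProcessor()
final_rows = run_toy_pipeline(recorder, [[{"id": 1}], [{"id": 2}]])
```

In this sketch the per-batch callbacks run once per batch, while `process_after_generation` runs once on the combined result.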