# data\_designer.config.processors

## Module Contents

### Classes

| Name                                                                                             | Description                                                                     |
| ------------------------------------------------------------------------------------------------ | ------------------------------------------------------------------------------- |
| [`ProcessorType`](#data_designerconfigprocessorsprocessortype)                                   | Enumeration of available processor types.                                       |
| [`DropColumnsProcessorConfig`](#data_designerconfigprocessorsdropcolumnsprocessorconfig)         | Drop columns from the output dataset (prefer `drop=True` in the column config). |
| [`SchemaTransformProcessorConfig`](#data_designerconfigprocessorsschematransformprocessorconfig) | Configuration for transforming the dataset schema using Jinja2 templates.       |

### Functions

| Name                                                                                                 | Description                                                                   |
| ---------------------------------------------------------------------------------------------------- | ----------------------------------------------------------------------------- |
| [`get_processor_config_from_kwargs`](#data_designerconfigprocessorsget_processor_config_from_kwargs) | Create a processor configuration from a processor type and keyword arguments. |

### API

```python
class data_designer.config.processors.ProcessorType
```

**Bases**: `str`, `enum.Enum`

Enumeration of available processor types.

**Attributes:**

`DROP_COLUMNS`: Processor that removes specified columns from the output dataset.

`SCHEMA_TRANSFORM`: Processor that creates a new dataset with a transformed schema using Jinja2 templates.

```python
DROP_COLUMNS = "drop_columns"
```

```python
SCHEMA_TRANSFORM = "schema_transform"
```
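Because `ProcessorType` derives from both `str` and `enum.Enum`, its members behave like their string values. The sketch below illustrates this with a stand-in class, `ProcessorTypeSketch`, which is hypothetical and only mirrors the documented bases and values; it is not the library class itself.

```python
from enum import Enum


class ProcessorTypeSketch(str, Enum):
    """Stand-in mirroring the documented bases (str, Enum); not the library class."""

    DROP_COLUMNS = "drop_columns"
    SCHEMA_TRANSFORM = "schema_transform"


# Members compare equal to their plain string values.
is_equal = ProcessorTypeSketch.DROP_COLUMNS == "drop_columns"

# A member can be looked up from its string value, e.g. when reading a config file.
looked_up = ProcessorTypeSketch("schema_transform")
```

This is standard behavior for any `str`-based enum in Python, so the same comparisons should hold for the real `ProcessorType`.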

```python
data_designer.config.processors.get_processor_config_from_kwargs(
    processor_type: data_designer.config.processors.ProcessorType,
    **kwargs: typing.Any
) -> data_designer.config.base.ProcessorConfig
```

Create a processor configuration from a processor type and keyword arguments.

**Parameters:**

`processor_type`: The type of processor to create.

`**kwargs`: Additional keyword arguments passed to the processor constructor.

**Returns:**

`data_designer.config.base.ProcessorConfig`

A processor configuration object of the specified type.
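One plausible way to implement this kind of factory is a lookup from processor type to config class, with the keyword arguments forwarded to the constructor. The sketch below is an assumption about the pattern, not the library's implementation; `SketchType`, `SketchDropColumns`, `SketchSchemaTransform`, `_REGISTRY`, and `sketch_config_from_kwargs` are all hypothetical stand-ins built on the standard library.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Any


class SketchType(str, Enum):
    """Hypothetical stand-in for the processor type enum."""

    DROP_COLUMNS = "drop_columns"
    SCHEMA_TRANSFORM = "schema_transform"


@dataclass
class SketchDropColumns:
    name: str
    column_names: list[str]


@dataclass
class SketchSchemaTransform:
    name: str
    template: dict[str, Any]


# Hypothetical registry: one config class per processor type.
_REGISTRY = {
    SketchType.DROP_COLUMNS: SketchDropColumns,
    SketchType.SCHEMA_TRANSFORM: SketchSchemaTransform,
}


def sketch_config_from_kwargs(processor_type: SketchType, **kwargs: Any):
    """Look up the class for processor_type and forward kwargs to its constructor."""
    return _REGISTRY[processor_type](**kwargs)


config = sketch_config_from_kwargs(
    SketchType.DROP_COLUMNS, name="drop_scratch", column_names=["scratch"]
)
```

With this shape, supporting a new processor type only requires a new registry entry; the factory function itself stays unchanged.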

```python
class data_designer.config.processors.DropColumnsProcessorConfig(
    /,
    **data: typing.Any
)
```

**Bases**: `data_designer.config.base.ProcessorConfig`

Drop columns from the output dataset (prefer `drop=True` in the column config).

This processor removes specified columns from the generated dataset. The dropped
columns are saved separately in the `dropped-columns-parquet-files` directory for reference.
When this processor is added via the config builder, the corresponding column
configs are automatically marked with `drop=True`.
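The effect described above can be pictured as splitting every row into the columns that stay in the output and the columns that are set aside. The following is a minimal sketch on plain dictionaries; `split_columns` is a hypothetical helper, and the actual processor operates on the generated dataset and writes the dropped columns to Parquet files, which this sketch does not attempt.

```python
from typing import Any

Row = dict[str, Any]


def split_columns(rows: list[Row], column_names: list[str]) -> tuple[list[Row], list[Row]]:
    """Return (kept, dropped): each row split by whether a column is in column_names."""
    to_drop = set(column_names)
    kept = [{k: v for k, v in row.items() if k not in to_drop} for row in rows]
    dropped = [{k: v for k, v in row.items() if k in to_drop} for row in rows]
    return kept, dropped


rows = [{"id": 1, "scratch": "a"}, {"id": 2, "scratch": "b"}]
kept, dropped = split_columns(rows, ["scratch"])
```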

**Parameters:**

`column_names`: List of column names to remove from the output dataset.

**Inherited Attributes:**

`name` (required): Name of the processor.

**Attributes:**

`column_names`: List of column names to remove from the output dataset.

**Initialization:**

Create a new model by parsing and validating input data from keyword arguments.

Raises `pydantic_core.ValidationError` if the input data cannot be
validated to form a valid model.

`self` is explicitly positional-only to allow `self` as a field name.

```python
column_names: list[str] = Field(...)
```

```python
processor_type: typing.Literal[data_designer.config.processors.ProcessorType]
```

```python
class data_designer.config.processors.SchemaTransformProcessorConfig(
    /,
    **data: typing.Any
)
```

**Bases**: `data_designer.config.base.ProcessorConfig`

Configuration for transforming the dataset schema using Jinja2 templates.

This processor creates a new dataset with a transformed schema. Each key in the
template becomes a column in the output, and values are Jinja2 templates that
can reference any column in the batch. The transformed dataset is written to
a `processors-files/{processor_name}/` directory alongside the main dataset.
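To illustrate how a template maps one input row to one output row, the sketch below walks a nested template and renders every string it finds. It rests on two assumptions: that each string in the template is rendered independently against the row, and, to stay dependency-free, that a small regex substitution for `{{ column }}` can stand in for Jinja2. `render_string` and `render_template` are hypothetical helpers, and the stand-in supports none of Jinja2's filters, expressions, or control flow.

```python
import re
from typing import Any

# Matches simple placeholders such as {{ text }}; not a Jinja2 implementation.
_PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render_string(text: str, row: dict[str, Any]) -> str:
    """Stand-in for Jinja2: replaces only simple {{ column }} placeholders."""
    return _PLACEHOLDER.sub(lambda m: str(row[m.group(1)]), text)


def render_template(node: Any, row: dict[str, Any]) -> Any:
    """Walk strings, lists, and dicts, rendering every string against the row."""
    if isinstance(node, str):
        return render_string(node, row)
    if isinstance(node, list):
        return [render_template(item, row) for item in node]
    if isinstance(node, dict):
        return {key: render_template(value, row) for key, value in node.items()}
    return node  # numbers, booleans, and None pass through unchanged


template = {
    "prompt": "Summarize: {{ text }}",
    "tags": ["{{ topic }}", "synthetic"],
}
row = {"text": "hello world", "topic": "greeting"}
output_row = render_template(template, row)
```

Here each key of `template` becomes a column of `output_row`, and the placeholders are filled from the columns of the input row.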

**Parameters:**

`template`: Dictionary defining the output schema. Keys are new column names,
values are Jinja2 templates (strings, lists, or nested structures).
Must be JSON-serializable.

**Inherited Attributes:**

`name` (required): Name of the processor.

**Attributes:**

`template`: Dictionary defining the output schema. Keys are new column names,
values are Jinja2 templates (strings, lists, or nested structures).
Must be JSON-serializable.

**Initialization:**

Create a new model by parsing and validating input data from keyword arguments.

Raises `pydantic_core.ValidationError` if the input data cannot be
validated to form a valid model.

`self` is explicitly positional-only to allow `self` as a field name.

```python
template: dict[str, typing.Any] = Field(...)
```

```python
processor_type: typing.Literal[data_designer.config.processors.ProcessorType]
```

```python
validate_template(v: dict[str, typing.Any]) -> dict[str, typing.Any]
```
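The page does not describe what `validate_template` checks. Given the stated requirement that the template be JSON-serializable, one plausible check is a round through `json.dumps`. The sketch below shows that idea only; `check_json_serializable` is a hypothetical function and may differ from the library's validator.

```python
import json
from typing import Any


def check_json_serializable(template: dict[str, Any]) -> dict[str, Any]:
    """Return the template unchanged, or raise ValueError if json.dumps rejects it."""
    try:
        json.dumps(template)
    except (TypeError, ValueError) as exc:
        raise ValueError(f"template must be JSON-serializable: {exc}") from exc
    return template


ok = check_json_serializable({"question": "{{ question }}", "choices": ["a", "b"]})

try:
    check_json_serializable({"bad": {1, 2, 3}})  # sets are not JSON-serializable
    rejected = False
except ValueError:
    rejected = True
```

Returning the value unchanged on success follows the usual shape of a field validator, which matches the documented signature `dict[str, typing.Any] -> dict[str, typing.Any]`.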