For AI agents: a documentation index is available at the root level at /llms.txt and /llms-full.txt. Append /llms.txt to any URL for a page-level index, or .md for the markdown version of any page.
  • Getting Started
    • Welcome
    • Contributing
  • Concepts
    • Columns
    • Seed Datasets
    • Agent Rollout Ingestion
    • Custom Columns
    • Validators
    • Processors
    • Person Sampling
    • Traces
    • Architecture & Performance
    • Deployment Options
    • Security
  • Tutorials
    • Overview
    • The Basics
    • Structured Outputs, Jinja Expressions, and Conditional Generation
    • Seeding with an External Dataset
    • Providing Images as Context
    • Generating Images
    • Image-to-Image Editing
  • Recipes
    • Recipe Cards
  • Plugins
    • Overview
    • Example Plugin
    • FileSystemSeedReader Plugins
    • Discover
  • Code Reference
    • Overview
      • Overview
      • seed_readers
      • processors
      • mcp
      • column_generators
      • Seed Reader API
      • Processor API
        • Base
        • Drop Columns
        • Registry
        • Schema Transform
      • MCP Runtime API
      • Column Generator API
  • Dev Notes
    • Overview
    • Have It Your Way
    • VLM Long Document Understanding
    • Push Datasets to Hugging Face Hub
    • Text-to-SQL for Nemotron Super
    • Async All the Way Down
    • Owning the Model Stack
NVIDIANVIDIA
Developer-friendly docs for your API
Privacy Policy | Your Privacy Choices | Terms of Service | Accessibility | Corporate Policies | Product Security | Contact

Copyright © 2026, NVIDIA Corporation.

LogoLogoNeMo Data Designer
On this page
  • Module Contents
  • Classes
  • Functions
  • Data
  • API
Code ReferenceEngine Extension APIProcessor API

data_designer.engine.processing.processors.schema_transform

||View as Markdown|
Previous

Registry

Next

MCP Runtime API

Module Contents

Classes

NameDescription
SchemaTransformProcessorTransforms dataset schema using Jinja2 templates after each batch.

Functions

NameDescription
_escape_value_for_jsonEscape a value for safe embedding inside a JSON string.

Data

logger

API

1logger = getLogger(...)
1data_designer.engine.processing.processors.schema_transform._escape_value_for_json(value: typing.Any) -> str

Escape a value for safe embedding inside a JSON string.

Unlike prompt or expression templates (which produce plain text), schema transform templates produce JSON. Values interpolated into a JSON string must be escaped - e.g. quotes and backslashes - so the rendered output is valid JSON. We pass this as record_str_fn to also enable nested dot access, such as {{ col.sub.field }}, on deserialized JSON columns.

1class data_designer.engine.processing.processors.schema_transform.SchemaTransformProcessor(
2 config: data_designer.engine.configurable_task.TaskConfigT,
3 resource_provider: data_designer.engine.resources.resource_provider.ResourceProvider
4)

Bases: data_designer.engine.processing.ginja.environment.WithJinja2UserTemplateRendering, data_designer.engine.processing.processors.base.Processor[data_designer.config.processors.SchemaTransformProcessorConfig]

Transforms dataset schema using Jinja2 templates after each batch.

1template_as_str: str
1process_after_batch(
2 data: pandas.DataFrame,
3 *,
4 current_batch_number: int | None
5) -> pandas.DataFrame
1_transform(data: pandas.DataFrame) -> pandas.DataFrame