Structured Outputs
Data Designer can generate structured data that conforms to user-defined schemas. This guide explains how to use structured outputs in your data generation workflows.
What Are Structured Outputs?
Structured outputs allow you to generate data with specific formats, schemas, and nested relationships. Instead of generating free-form text, you can generate dataset objects that conform to a specific schema.
Use cases include:
- Complex nested records (e.g., orders with line items) 
- Nested arrays and objects (e.g., lists of products) 
- Structured conversation data (e.g., chat logs) 
Defining Data Models with Pydantic
The most common way to define structured outputs is using Pydantic models. For example, you can define an order with a list of products as follows:
from pydantic import BaseModel, Field
# Define a simple product model
class Product(BaseModel):
    name: str = Field(..., description="Name of the product")
    price: float = Field(..., description="Price in USD")
    category: str = Field(..., description="Product category")
    in_stock: bool = Field(..., description="Whether the product is in stock")
# Define an order with a list of products
class Order(BaseModel):
    order_id: str = Field(..., description="Unique order identifier")
    customer_name: str = Field(..., description="Name of the customer")
    order_date: str = Field(..., description="Date the order was placed")
    total_amount: float = Field(..., description="Total order amount")
    products: list[Product] = Field(..., description="List of products in the order")
    shipping_address: dict = Field(..., description="Shipping address")
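Before wiring a model into a generation workflow, it can help to check what the schema accepts and rejects. The following is a minimal sketch, assuming Pydantic v2 (`model_validate` and `model_json_schema`); the sample record is hand-written for illustration and is not real generated output.

```python
from pydantic import BaseModel, Field, ValidationError


class Product(BaseModel):
    name: str = Field(..., description="Name of the product")
    price: float = Field(..., description="Price in USD")
    category: str = Field(..., description="Product category")
    in_stock: bool = Field(..., description="Whether the product is in stock")


class Order(BaseModel):
    order_id: str = Field(..., description="Unique order identifier")
    customer_name: str = Field(..., description="Name of the customer")
    order_date: str = Field(..., description="Date the order was placed")
    total_amount: float = Field(..., description="Total order amount")
    products: list[Product] = Field(..., description="List of products in the order")
    shipping_address: dict = Field(..., description="Shipping address")


# Hypothetical record, written by hand for illustration only
sample = {
    "order_id": "ORD-0001",
    "customer_name": "Jane Doe",
    "order_date": "2024-01-15",
    "total_amount": 59.98,
    "products": [
        {"name": "Desk lamp", "price": 29.99, "category": "Home", "in_stock": True},
        {"name": "Notebook", "price": 29.99, "category": "Office", "in_stock": False},
    ],
    "shipping_address": {"city": "Chicago", "state": "IL"},
}

# Nested dictionaries are converted into Product instances during validation
order = Order.model_validate(sample)

# The JSON Schema derived from the model, including the field descriptions
schema = Order.model_json_schema()

# A record that is missing required fields is rejected
try:
    Order.model_validate({"order_id": "ORD-0002"})
    rejected = False
except ValidationError:
    rejected = True
```

Note that `shipping_address` is typed as a plain `dict`, so any keys pass validation; a dedicated nested model would constrain it in the same way `Product` constrains each entry in `products`.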
Using Structured Outputs in Data Designer
Before getting started, ensure you have the Data Designer client and configuration builder set up:
import os
from nemo_microservices import NeMoMicroservices
from nemo_microservices.beta.data_designer import DataDesignerClient, DataDesignerConfigBuilder
from nemo_microservices.beta.data_designer.config import columns as C
from nemo_microservices.beta.data_designer.config import params as P
data_designer_client = DataDesignerClient(
    client=NeMoMicroservices(base_url=os.environ["NEMO_MICROSERVICES_BASE_URL"])
)
config_builder = DataDesignerConfigBuilder(model_configs="path/to/your/model_configs.yaml")
Adding a Structured Column
Use the LLMStructuredColumn class to add an LLM-generated structured column.
# Add a sampler column for customer information
config_builder.add_column(
    C.SamplerColumn(
        name="customer_city",
        type=P.SamplerType.CATEGORY,
        params=P.CategorySamplerParams(
            values=["New York", "Los Angeles", "Chicago", "Houston"]
        )
    )
)
# Add structured order data column
config_builder.add_column(
    C.LLMStructuredColumn(
        name="order_data",
        # Adjacent string literals are concatenated into a single prompt string
        prompt=(
            "Generate a realistic order for a customer from {{customer_city}}. "
            "Include between 1 and 5 products in the order."
        ),
        output_format=Order,
        model_alias="structured"
    )
)
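The `{{customer_city}}` placeholder in the prompt refers to the sampler column defined earlier, so each row's prompt presumably carries that row's sampled city. The sketch below imitates this substitution with a small regular expression. `render_prompt` is a hypothetical helper written for illustration only; it is not part of the Data Designer API, and the service's actual template rendering may support more syntax than this.

```python
import re

# Matches placeholders such as {{customer_city}} or {{ customer_city }}
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render_prompt(template: str, row: dict) -> str:
    """Replace each {{column}} placeholder with that column's value from the row."""
    return PLACEHOLDER.sub(lambda match: str(row[match.group(1)]), template)


template = (
    "Generate a realistic order for a customer from {{customer_city}}. "
    "Include between 1 and 5 products in the order."
)

# Hypothetical rows, as if the sampler column had already been filled in
rows = [{"customer_city": "Chicago"}, {"customer_city": "Houston"}]
prompts = [render_prompt(template, row) for row in rows]
```

Each entry in `prompts` is the same instruction with a different city filled in, which is how a single column definition can yield varied orders across the dataset.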
For end-to-end examples, we recommend following along with the Tutorial Notebooks.