Recipes are code examples that show how to use Data Designer for specific use cases. Each recipe is self-contained and can be run independently.
Recipes provide working code for specific use cases without detailed explanations. If you’re learning Data Designer for the first time, start with our tutorial notebooks, which offer step-by-step guidance and explain core concepts. Once you’re familiar with the basics, return here for practical, ready-to-use implementations.
These recipes use the OpenAI model provider by default. Ensure your OpenAI provider is set up via the Data Designer CLI before running a recipe.
Natural-language instructions paired with Python implementations across complexity levels and industries.
Python code generation · validation · LLM-as-judge
Natural-language instructions paired with SQL implementations across complexity levels and industries.
SQL code generation · validation · LLM-as-judge
Enterprise-grade text-to-SQL training data — dialect-specific SQL, distractor injection, dirty data, 5 LLM judges with 15 scoring dimensions.
Multi-dialect SQL · SubcategorySamplerParams · 5 judges · 15 score columns
Product information paired with question/answer pairs.
Structured outputs · expression columns · LLM-as-judge
Multi-turn chat conversations between a user and an AI assistant.
Structured outputs · expression columns · LLM-as-judge
Minimal example of MCP tool calling — defines a simple MCP server and generates data that requires tool calls to complete.
LocalStdioMCPProvider · simple tool server · tool-augmented text
Grounded Q&A pairs from PDF documents using MCP tool calls and BM25 search.
LocalStdioMCPProvider · BM25 retrieval · per-column trace capture
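For readers unfamiliar with the BM25 retrieval step mentioned in this recipe, the following is a minimal, standard-library sketch of Okapi BM25 scoring. It is not the recipe's code and does not use any Data Designer API: the function names (`tokenize`, `bm25_scores`), the whitespace tokenizer, and the parameter defaults `k1=1.5` and `b=0.75` are illustrative assumptions.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def bm25_scores(query, documents, k1=1.5, b=0.75):
    """Return one Okapi BM25 score per document for the given query.

    k1 controls term-frequency saturation and b controls length
    normalization; both defaults are common choices, not recipe settings.
    """
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)
    if n_docs == 0:
        return []
    avg_len = sum(len(doc) for doc in tokenized) / n_docs

    # Document frequency: in how many documents each term appears.
    doc_freq = Counter()
    for doc in tokenized:
        doc_freq.update(set(doc))

    query_terms = tokenize(query)
    scores = []
    for doc in tokenized:
        term_freq = Counter(doc)
        score = 0.0
        for term in query_terms:
            if term not in term_freq:
                continue
            # Smoothed IDF that stays non-negative for very common terms.
            idf = math.log(1 + (n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5))
            tf = term_freq[term]
            denom = tf + k1 * (1 - b + b * len(doc) / avg_len)
            score += idf * tf * (k1 + 1) / denom
        scores.append(score)
    return scores


if __name__ == "__main__":
    # Hypothetical text chunks standing in for passages extracted from a PDF.
    chunks = [
        "the cat sat on the mat",
        "dogs chase cats in the park",
        "quarterly revenue grew by ten percent",
    ]
    scores = bm25_scores("revenue percent", chunks)
    best = max(range(len(chunks)), key=lambda i: scores[i])
    print(chunks[best])
```

In a grounded Q&A setup, a tool exposed over MCP could run a search like this over document chunks and return the top-scoring passages for the model to cite. How the recipe actually wires this up is not shown here.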
Multi-turn search agent trajectories — Tavily web search via MCP, Wikidata KG seeding, BrowseComp-style question generation.
Tavily MCP · Wikidata seeding · two-stage question generation · trajectory capture