Traces capture the conversation history during LLM generation, including system prompts, user prompts, model reasoning, tool calls, tool results, and the final response. This visibility is essential for understanding model behavior, debugging generation issues, and iterating on prompts.
Traces are also useful in certain scenarios as the target output of the workflow, e.g. producing an SFT dataset for fine-tuning tool-use capability, for instance.
When generating content with LLM columns, you often need to understand what happened during generation:
Traces provide this visibility by capturing the ordered message history for each generation, including any multi-turn conversations that occur during tool use or retry scenarios.
Data Designer supports three trace modes via the TraceType enum:
Set with_trace on specific LLM columns:
When enabled, LLM columns produce an additional side-effect column:
{column_name}__traceFor example, if your column is named "answer", the trace column will be "answer__trace".
Each trace is a list[dict] where each dict represents a message in the conversation.
A basic trace without tool use:
When tool use is enabled, traces capture the full conversation including tool calls:
When an assistant message includes tool calls:
Some models (particularly those with extended thinking or chain-of-thought capabilities) expose their reasoning process separately via the reasoning_content field in assistant messages. While this is included in full traces, you may want to capture it separately without the overhead of storing the entire conversation history.
Set extract_reasoning_content=True on any LLM column to create a {column_name}__reasoning_content side-effect column:
The extracted reasoning content:
reasoning_content from the final assistant message in the traceNone if the model didn’t provide reasoning content or if it was whitespace-onlyThe extract_reasoning_content option is available on all LLM column types:
LLMTextColumnConfigLLMCodeColumnConfigLLMStructuredColumnConfigLLMJudgeColumnConfig