Message Traces
Traces capture the conversation history during LLM generation, including system prompts, user prompts, model reasoning, tool calls, tool results, and the final response. This visibility is essential for understanding model behavior, debugging generation issues, and iterating on prompts.
Traces are also useful in certain scenarios as the target output of the workflow, e.g. producing an SFT dataset for fine-tuning tool-use capability, for instance.
Overview
When generating content with LLM columns, you often need to understand what happened during generation:
- What system prompt was used?
- What did the rendered user prompt look like?
- Did the model provide any reasoning content?
- Which tools were called (if tool use is enabled)?
- What arguments were passed to tools?
- What did tools return?
- Did the model retry after failures?
- How did the model arrive at the final answer?
Traces provide this visibility by capturing the ordered message history for each generation, including any multi-turn conversations that occur during tool use or retry scenarios.
Trace Types
Data Designer supports three trace modes via the TraceType enum:
Enabling Traces
Per-Column (Recommended)
Set with_trace on specific LLM columns:
Trace Column Naming
When enabled, LLM columns produce an additional side-effect column:
{column_name}__trace
For example, if your column is named "answer", the trace column will be "answer__trace".
Trace Data Structure
Each trace is a list[dict] where each dict represents a message in the conversation.
Message Fields by Role
Example Trace (Simple Generation)
A basic trace without tool use:
Example Trace (With Tool Use)
When tool use is enabled, traces capture the full conversation including tool calls:
The tool_calls Structure
When an assistant message includes tool calls:
Extracting Reasoning Content
Some models (particularly those with extended thinking or chain-of-thought capabilities) expose their reasoning process separately via the reasoning_content field in assistant messages. While this is included in full traces, you may want to capture it separately without the overhead of storing the entire conversation history.
Dedicated Reasoning Column
Set extract_reasoning_content=True on any LLM column to create a {column_name}__reasoning_content side-effect column:
The extracted reasoning content:
- Contains only the
reasoning_contentfrom the final assistant message in the trace - Is stripped of leading/trailing whitespace
- Is
Noneif the model didn’t provide reasoning content or if it was whitespace-only
When to Use Each Approach
Availability
The extract_reasoning_content option is available on all LLM column types:
LLMTextColumnConfigLLMCodeColumnConfigLLMStructuredColumnConfigLLMJudgeColumnConfig
See Also
- Agent Rollout Ingestion: Import external agent traces from disk into normalized seed rows
- Safety and Limits: Understand turn limits and timeout behavior