
Copyright © 2026, NVIDIA Corporation.


Message Traces


Traces capture the conversation history during LLM generation, including system prompts, user prompts, model reasoning, tool calls, tool results, and the final response. This visibility is essential for understanding model behavior, debugging generation issues, and iterating on prompts.

Traces can also be the target output of a workflow, for example when producing an SFT dataset for fine-tuning tool-use capability.

Overview

When generating content with LLM columns, you often need to understand what happened during generation:

  • What system prompt was used?
  • What did the rendered user prompt look like?
  • Did the model provide any reasoning content?
  • Which tools were called (if tool use is enabled)?
  • What arguments were passed to tools?
  • What did tools return?
  • Did the model retry after failures?
  • How did the model arrive at the final answer?

Traces provide this visibility by capturing the ordered message history for each generation, including any multi-turn conversations that occur during tool use or retry scenarios.

Trace Types

Data Designer supports three trace modes via the TraceType enum:

| TraceType | Description |
| --- | --- |
| TraceType.NONE | No trace captured (default) |
| TraceType.LAST_MESSAGE | Only the final assistant message is captured |
| TraceType.ALL_MESSAGES | Full conversation history (system/user/assistant/tool) |

Enabling Traces

Per-Column (Recommended)

Set with_trace on specific LLM columns:

import data_designer.config as dd

# Assumes `builder` is an existing Data Designer config builder created earlier.

# Capture full conversation history
builder.add_column(
    dd.LLMTextColumnConfig(
        name="answer",
        prompt="Answer: {{ question }}",
        model_alias="nvidia-text",
        with_trace=dd.TraceType.ALL_MESSAGES,  # Full trace
    )
)

# Capture only the final assistant response
builder.add_column(
    dd.LLMTextColumnConfig(
        name="summary",
        prompt="Summarize: {{ text }}",
        model_alias="nvidia-text",
        with_trace=dd.TraceType.LAST_MESSAGE,  # Just the final response
    )
)

Trace Column Naming

When enabled, LLM columns produce an additional side-effect column:

  • {column_name}__trace

For example, if your column is named "answer", the trace column will be "answer__trace".
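The naming rule can be made explicit when reading generated records. The following is a minimal sketch on plain Python dicts; `trace_column_name`, `get_trace`, and the sample `record` are hypothetical illustrations, not part of the Data Designer API.

```python
TRACE_SUFFIX = "__trace"  # suffix from the naming rule: {column_name}__trace


def trace_column_name(column_name: str) -> str:
    """Return the side-effect trace column name for an LLM column."""
    return f"{column_name}{TRACE_SUFFIX}"


def get_trace(record: dict, column_name: str) -> list:
    """Fetch the trace for a column from one record; empty list if absent."""
    return record.get(trace_column_name(column_name)) or []


# Hypothetical record, shaped like one generated row converted to a dict.
record = {
    "answer": "The capital of France is Paris.",
    "answer__trace": [
        {"role": "user", "content": [{"type": "text", "text": "What is the capital of France?"}]},
        {"role": "assistant", "content": [{"type": "text", "text": "The capital of France is Paris."}]},
    ],
}

print(trace_column_name("answer"))
print(len(get_trace(record, "answer")))
```

Returning an empty list for a missing column keeps downstream loops simple when traces are disabled (the default).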

Trace Data Structure

Each trace is a list[dict] where each dict represents a message in the conversation.

Message Fields by Role

| Role | Fields | Description |
| --- | --- | --- |
| system | role, content | System prompt setting model behavior. content is a list of blocks in ChatML format. |
| user | role, content | User prompt (rendered from template). content is a list of blocks (text + multimodal). |
| assistant | role, content, tool_calls, reasoning_content | Model response; content may be empty if only requesting tools. |
| tool | role, content, tool_call_id | Tool execution result; tool_call_id links to the request. |
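When post-processing traces, it can help to check each message against the fields listed for its role. The sketch below encodes the table as a lookup; `FIELDS_BY_ROLE` and `unexpected_fields` are hypothetical names, and the check treats every listed field as optional, which is an assumption.

```python
# Fields listed per role in the table above.
FIELDS_BY_ROLE = {
    "system": {"role", "content"},
    "user": {"role", "content"},
    "assistant": {"role", "content", "tool_calls", "reasoning_content"},
    "tool": {"role", "content", "tool_call_id"},
}


def unexpected_fields(message: dict) -> set:
    """Return keys in a message that the table does not list for its role.

    An unknown or missing role has no allowed fields, so every key is reported.
    """
    allowed = FIELDS_BY_ROLE.get(message.get("role"), set())
    return set(message) - allowed


tool_message = {
    "role": "tool",
    "content": [{"type": "text", "text": "Found 3 documents"}],
    "tool_call_id": "call_abc123",
}
print(unexpected_fields(tool_message))
```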

Example Trace (Simple Generation)

A basic trace without tool use:

[
    # System message (if configured)
    {
        "role": "system",
        "content": [{"type": "text", "text": "You are a helpful assistant that provides clear, concise answers."}]
    },
    # User message (the rendered prompt)
    {
        "role": "user",
        "content": [{"type": "text", "text": "What is the capital of France?"}]
    },
    # Final assistant response
    {
        "role": "assistant",
        "content": [{"type": "text", "text": "The capital of France is Paris."}],
        "reasoning_content": None  # May contain reasoning if model supports it
    }
]
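Because content is a list of blocks rather than a plain string, reading the final answer out of a trace takes a small amount of unpacking. A minimal sketch, assuming the block layout shown in the example above; `message_text` and `final_answer` are hypothetical helpers.

```python
from typing import Optional


def message_text(message: dict) -> str:
    """Join the text of all text blocks in a message's content list."""
    blocks = message.get("content") or []
    return "".join(b.get("text", "") for b in blocks if b.get("type") == "text")


def final_answer(trace: list) -> Optional[str]:
    """Text of the last assistant message in the trace, or None if there is none."""
    for message in reversed(trace):
        if message.get("role") == "assistant":
            return message_text(message)
    return None


simple_trace = [
    {"role": "system", "content": [{"type": "text", "text": "You are a helpful assistant."}]},
    {"role": "user", "content": [{"type": "text", "text": "What is the capital of France?"}]},
    {
        "role": "assistant",
        "content": [{"type": "text", "text": "The capital of France is Paris."}],
        "reasoning_content": None,
    },
]

print(final_answer(simple_trace))
```

Non-text blocks (for example image blocks in multimodal prompts) are skipped by the `type == "text"` filter.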

Example Trace (With Tool Use)

When tool use is enabled, traces capture the full conversation including tool calls:

[
    # System message
    {
        "role": "system",
        "content": [{"type": "text", "text": "You must call tools before answering. Only use tool results."}]
    },
    # User message (the rendered prompt)
    {
        "role": "user",
        "content": [{"type": "text", "text": "What documents are in the knowledge base about machine learning?"}]
    },
    # Assistant requests tool calls
    {
        "role": "assistant",
        "content": [{"type": "text", "text": ""}],
        "tool_calls": [
            {
                "id": "call_abc123",
                "type": "function",
                "function": {
                    "name": "list_docs",
                    "arguments": "{\"query\": \"machine learning\"}"
                }
            }
        ]
    },
    # Tool response (linked by tool_call_id)
    {
        "role": "tool",
        "content": [{"type": "text", "text": "Found 3 documents: intro_ml.pdf, neural_networks.pdf, transformers.pdf"}],
        "tool_call_id": "call_abc123"
    },
    # Final assistant response
    {
        "role": "assistant",
        "content": [{"type": "text", "text": "The knowledge base contains three documents about machine learning: ..."}]
    }
]
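Since each tool message carries the tool_call_id of the request it answers, calls and results can be paired after the fact. The sketch below works on a shortened copy of the trace above; `pair_tool_calls` is a hypothetical helper, not a Data Designer function.

```python
def pair_tool_calls(trace: list) -> list:
    """Match each requested tool call with its result text via tool_call_id."""
    # Index tool results by the call ID they answer.
    results = {
        m["tool_call_id"]: m
        for m in trace
        if m.get("role") == "tool" and "tool_call_id" in m
    }
    pairs = []
    for message in trace:
        if message.get("role") != "assistant":
            continue
        for call in message.get("tool_calls") or []:
            result = results.get(call["id"])
            text = None  # stays None if no tool message answered this call
            if result is not None:
                text = "".join(
                    b.get("text", "") for b in result.get("content") or []
                    if b.get("type") == "text"
                )
            pairs.append({"id": call["id"], "name": call["function"]["name"], "result": text})
    return pairs


tool_trace = [
    {"role": "user", "content": [{"type": "text", "text": "What documents are in the knowledge base?"}]},
    {
        "role": "assistant",
        "content": [{"type": "text", "text": ""}],
        "tool_calls": [{
            "id": "call_abc123",
            "type": "function",
            "function": {"name": "list_docs", "arguments": "{\"query\": \"machine learning\"}"},
        }],
    },
    {
        "role": "tool",
        "content": [{"type": "text", "text": "Found 3 documents: intro_ml.pdf, neural_networks.pdf, transformers.pdf"}],
        "tool_call_id": "call_abc123",
    },
    {"role": "assistant", "content": [{"type": "text", "text": "The knowledge base contains three documents."}]},
]

for pair in pair_tool_calls(tool_trace):
    print(pair["name"], "->", pair["result"])
```

A pair whose result is None indicates a call that never received a tool response in the captured history.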

The tool_calls Structure

When an assistant message includes tool calls:

{
    "id": "call_abc123",  # Unique ID linking to tool response
    "type": "function",  # Always "function" for MCP tools
    "function": {
        "name": "search_docs",  # Tool name
        "arguments": "{...}"  # JSON string of tool arguments
    }
}
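Because arguments is a JSON string rather than a dict, it has to be decoded before the individual values can be inspected. A minimal sketch using the standard library; `decode_tool_call` is a hypothetical helper, and treating an empty string as "no arguments" is an assumption.

```python
import json


def decode_tool_call(tool_call: dict) -> tuple:
    """Return (tool name, parsed arguments dict) for one tool_calls entry."""
    function = tool_call["function"]
    # Assumption: an empty arguments string means the tool took no arguments.
    arguments = json.loads(function["arguments"] or "{}")
    return function["name"], arguments


tool_call = {
    "id": "call_abc123",
    "type": "function",
    "function": {
        "name": "list_docs",
        "arguments": "{\"query\": \"machine learning\"}",
    },
}

name, arguments = decode_tool_call(tool_call)
print(name, arguments["query"])
```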

Extracting Reasoning Content

Some models (particularly those with extended thinking or chain-of-thought capabilities) expose their reasoning process separately via the reasoning_content field in assistant messages. While this is included in full traces, you may want to capture it separately without the overhead of storing the entire conversation history.

Dedicated Reasoning Column

Set extract_reasoning_content=True on any LLM column to create a {column_name}__reasoning_content side-effect column:

import data_designer.config as dd

# Assumes `builder` is an existing Data Designer config builder created earlier.
builder.add_column(
    dd.LLMTextColumnConfig(
        name="solution",
        prompt="Solve this math problem step by step: {{ problem }}",
        model_alias="reasoning-model",
        extract_reasoning_content=True,  # Creates solution__reasoning_content
    )
)

The extracted reasoning content:

  • Contains only the reasoning_content from the final assistant message in the trace
  • Is stripped of leading/trailing whitespace
  • Is None if the model didn’t provide reasoning content or if it was whitespace-only
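The three rules above can be mirrored on a full trace. This is only a sketch of the described behavior, not the library's implementation; `extract_reasoning` is a hypothetical helper operating on the trace structure documented earlier.

```python
from typing import Optional


def extract_reasoning(trace: list) -> Optional[str]:
    """Apply the rules above: final assistant message only, stripped, None if empty."""
    for message in reversed(trace):
        if message.get("role") == "assistant":
            reasoning = message.get("reasoning_content")
            if not isinstance(reasoning, str):
                return None  # model provided no reasoning content
            return reasoning.strip() or None  # whitespace-only becomes None
    return None  # no assistant message in the trace


with_reasoning = [
    {"role": "user", "content": [{"type": "text", "text": "What is 2 + 2?"}]},
    {
        "role": "assistant",
        "content": [{"type": "text", "text": "4"}],
        "reasoning_content": "  Add the two numbers.\n",
    },
]
whitespace_only = [
    {"role": "assistant", "content": [{"type": "text", "text": "4"}], "reasoning_content": "   "},
]

print(repr(extract_reasoning(with_reasoning)))
print(repr(extract_reasoning(whitespace_only)))
```

Only the last assistant message is consulted, so reasoning attached to earlier tool-requesting turns is ignored in this sketch.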

When to Use Each Approach

| Need | Approach |
| --- | --- |
| Full conversation history for debugging | with_trace=dd.TraceType.ALL_MESSAGES |
| Just the model’s reasoning/thinking | extract_reasoning_content=True |
| Both conversation history and separate reasoning | Use both options |
| Fine-tuning data with reasoning | extract_reasoning_content=True for clean extraction |

Availability

The extract_reasoning_content option is available on all LLM column types:

  • LLMTextColumnConfig
  • LLMCodeColumnConfig
  • LLMStructuredColumnConfig
  • LLMJudgeColumnConfig

See Also

  • Agent Rollout Ingestion: Import external agent traces from disk into normalized seed rows
  • Safety and Limits: Understand turn limits and timeout behavior