Copyright © 2026, NVIDIA Corporation.

NeMo Data Designer

Analysis


The analysis module provides tools for profiling and analyzing generated datasets, including statistics tracking, column profiling, and reporting.

Column Statistics

Column statistics are automatically computed for every column after generation. They provide basic metrics specific to the column type. For example, LLM columns track token usage statistics, sampler columns track distribution information, and validation columns track validation success rates.
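To make the per-type metrics concrete, here is a minimal sketch of how such statistics could be derived from raw per-record data. Every name in it (LLMColumnStats, SamplerColumnStats, ValidationColumnStats and the three helper functions) is hypothetical and chosen for illustration; none of it is the library's API, and the library computes these statistics for you.

```python
from collections import Counter
from dataclasses import dataclass


# All classes and functions here are illustrative stand-ins,
# not part of the Data Designer API.
@dataclass
class LLMColumnStats:
    column_name: str
    mean_input_tokens: float
    mean_output_tokens: float


@dataclass
class SamplerColumnStats:
    column_name: str
    value_counts: dict[str, int]


@dataclass
class ValidationColumnStats:
    column_name: str
    success_rate: float


def llm_stats(name, token_usage):
    """Average token usage; token_usage is a non-empty list of (input, output) pairs."""
    n = len(token_usage)
    return LLMColumnStats(
        name,
        sum(i for i, _ in token_usage) / n,
        sum(o for _, o in token_usage) / n,
    )


def sampler_stats(name, values):
    """Distribution information: how often each sampled value occurs."""
    return SamplerColumnStats(name, dict(Counter(values)))


def validation_stats(name, is_valid):
    """Fraction of records that passed validation; is_valid is a non-empty list of bools."""
    return ValidationColumnStats(name, sum(is_valid) / len(is_valid))
```

For example, validation_stats("code_check", [True, True, False, True]) yields a success rate of 0.75, and sampler_stats("topic", ["a", "b", "a"]) records that "a" was drawn twice and "b" once.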

The classes below are result objects that store the computed statistics for each column type and provide methods for formatting these results for display in reports.

Column Profilers

Column profilers are optional analysis tools that provide deeper insights into specific column types. Currently, the only column profiler available is the Judge Score Profiler.
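This page does not describe what the Judge Score Profiler computes. As an assumption, one plausible form of deeper analysis for a column of judge scores is a summary of how the scores are distributed. The sketch below shows that idea only; profile_judge_scores is a hypothetical helper, not a library function.

```python
from collections import Counter
from statistics import mean, median


def profile_judge_scores(scores):
    """Summarize a non-empty list of numeric judge scores (illustrative only)."""
    counts = Counter(scores)
    return {
        "count": len(scores),
        "mean": mean(scores),
        "median": median(scores),
        # Score value -> number of records with that score, in ascending order.
        "distribution": {score: counts[score] for score in sorted(counts)},
    }
```

Calling profile_judge_scores([4, 5, 3, 4, 4]) reports five scores with a mean and median of 4, and a distribution of one 3, three 4s and one 5.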

The classes below are result objects that store the computed profiler results and provide methods for formatting these results for display in reports.

Dataset Profiler

The DatasetProfilerResults class contains complete profiling results for a generated dataset. It aggregates column-level statistics, metadata, and profiler results, and provides methods to:

  • Compute dataset-level metrics (completion percentage, column type summary)
  • Filter statistics by column type
  • Generate formatted analysis reports via the to_report() method

Reports can be displayed in the console or exported to HTML/SVG formats.
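The aggregation described above can be sketched with a small stand-in class. This is not DatasetProfilerResults: ProfileSketch, ColumnStat and all of their fields and methods are hypothetical, and the definition of completion percentage used here (records generated divided by records requested) is an assumption rather than something this page states.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class ColumnStat:
    # Hypothetical per-column record; real statistics objects carry more data.
    column_name: str
    column_type: str  # e.g. "llm", "sampler", "validation"


@dataclass
class ProfileSketch:
    # Illustrative stand-in, not the library's DatasetProfilerResults.
    num_records: int
    target_num_records: int
    column_statistics: list[ColumnStat] = field(default_factory=list)

    def completion_percentage(self) -> float:
        """Assumed definition: generated records as a percentage of requested records."""
        if self.target_num_records == 0:
            return 0.0
        return 100.0 * self.num_records / self.target_num_records

    def column_type_summary(self) -> dict[str, int]:
        """Number of columns of each type."""
        return dict(Counter(s.column_type for s in self.column_statistics))

    def filter_by_type(self, column_type: str) -> list[ColumnStat]:
        """Statistics for columns of a single type."""
        return [s for s in self.column_statistics if s.column_type == column_type]

    def to_text_report(self) -> str:
        """Plain-text summary; a simplified analogue of a formatted report."""
        lines = [f"Completion: {self.completion_percentage():.1f}%"]
        for column_type, n in sorted(self.column_type_summary().items()):
            lines.append(f"{column_type}: {n} column(s)")
        return "\n".join(lines)
```

With 90 of 100 requested records and one sampler column plus two LLM columns, to_text_report() starts with "Completion: 90.0%" followed by one line per column type.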
