# data\_designer.config.analysis.column\_profilers

## Module Contents

### Classes

| Name                                                                                                 | Description                                                                 |
| ---------------------------------------------------------------------------------------------------- | --------------------------------------------------------------------------- |
| [`ColumnProfilerType`](#data_designerconfiganalysiscolumn_profilerscolumnprofilertype)               | String-valued enumeration of the available column profiler types.           |
| [`ColumnProfilerResults`](#data_designerconfiganalysiscolumn_profilerscolumnprofilerresults)         | Abstract base class for column profiler results.                            |
| [`JudgeScoreProfilerConfig`](#data_designerconfiganalysiscolumn_profilersjudgescoreprofilerconfig)   | Configuration for the LLM-as-a-judge score profiler.                        |
| [`JudgeScoreSample`](#data_designerconfiganalysiscolumn_profilersjudgescoresample)                   | Container for a single judge score and its associated reasoning.            |
| [`JudgeScoreDistributions`](#data_designerconfiganalysiscolumn_profilersjudgescoredistributions)     | Container for computed distributions across all judge score dimensions.     |
| [`JudgeScoreSummary`](#data_designerconfiganalysiscolumn_profilersjudgescoresummary)                 | Container for an LLM-generated summary of a judge score dimension.          |
| [`JudgeScoreProfilerResults`](#data_designerconfiganalysiscolumn_profilersjudgescoreprofilerresults) | Container for complete judge score profiler analysis results.               |

### Data

[`ColumnProfilerConfigT`](#data_designerconfiganalysiscolumn_profilerscolumnprofilerconfigt)
[`ColumnProfilerResultsT`](#data_designerconfiganalysiscolumn_profilerscolumnprofilerresultst)

### API

```python
class data_designer.config.analysis.column_profilers.ColumnProfilerType
```

**Bases**: `str`, `enum.Enum`

```python
JUDGE_SCORE = "judge-score"
```
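
Because `ColumnProfilerType` derives from both `str` and `enum.Enum`, its members should compare equal to their string values and can be looked up from those strings. The sketch below illustrates that behavior with a stand-in class redeclared using only the standard library; the member name and value come from the listing above, and in real code the class would be imported from `data_designer.config.analysis.column_profilers` instead.

```python
from enum import Enum


# Stand-in that mirrors the documented bases and member; not the library class itself.
class ColumnProfilerType(str, Enum):
    JUDGE_SCORE = "judge-score"


# Look up a member from its string value, e.g. when reading a config file.
profiler_type = ColumnProfilerType("judge-score")

# Value lookup returns the existing member rather than a new object.
is_same_member = profiler_type is ColumnProfilerType.JUDGE_SCORE

# The str mixin makes the member compare equal to the plain string.
equals_plain_string = profiler_type == "judge-score"
```

Passing an unknown string to the enum raises `ValueError`, which gives an early failure on misspelled profiler names.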

```python
class data_designer.config.analysis.column_profilers.ColumnProfilerResults(
    /,
    **data: typing.Any
)
```

**Bases**: `pydantic.BaseModel`, `abc.ABC`

Abstract base class for column profiler results.

Stores results from column profiling operations. Subclasses hold profiler-specific
analysis results and provide methods for generating formatted report sections for display.

**Initialization:**

Create a new model by parsing and validating input data from keyword arguments.

Raises `pydantic_core.ValidationError` if the input data cannot be
validated to form a valid model.

`self` is explicitly positional-only to allow `self` as a field name.

```python
create_report_section() -> rich.panel.Panel
```

Creates a Rich Panel containing the formatted profiler results for display.

**Returns:**

`rich.panel.Panel`

A Rich Panel containing the formatted profiler results. Default implementation
returns a "Not Implemented" message; subclasses should override to provide
specific formatting.
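
The override pattern described here can be sketched without any dependencies. This is only an illustration under stated assumptions: the documented class also derives from `pydantic.BaseModel` and returns a `rich.panel.Panel`, whereas the stand-ins below return plain strings, and `ProfilerResultsSketch` and `WordCountResultsSketch` are hypothetical names.

```python
from abc import ABC


# Stand-in base class; plain strings replace rich.panel.Panel to stay dependency-free.
class ProfilerResultsSketch(ABC):
    def create_report_section(self) -> str:
        # Default behaviour described above: a placeholder message.
        return "Not Implemented"


# Hypothetical subclass that overrides the default with its own formatting.
class WordCountResultsSketch(ProfilerResultsSketch):
    def __init__(self, column_name: str, mean_words: float) -> None:
        self.column_name = column_name
        self.mean_words = mean_words

    def create_report_section(self) -> str:
        return f"{self.column_name}: mean words = {self.mean_words:.1f}"


default_section = ProfilerResultsSketch().create_report_section()
custom_section = WordCountResultsSketch("answer", 42.5).create_report_section()
```

Keeping a non-abstract default means a new results type still renders something in a report before its formatting is written.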

```python
class data_designer.config.analysis.column_profilers.JudgeScoreProfilerConfig(
    /,
    **data: typing.Any
)
```

**Bases**: `data_designer.config.base.ConfigBase`

Configuration for the LLM-as-a-judge score profiler.

**Parameters:**

`model_alias`: Alias of the LLM model to use for generating score distribution summaries.
Must match a model alias defined in the Data Designer configuration.

`summary_score_sample_size`: Number of score samples to include when prompting the LLM
to generate summaries. Larger sample sizes provide more context but increase
token usage. Must be at least 1 when provided. Set to `None` to skip LLM-generated
summaries. Defaults to 20.

**Initialization:**

Create a new model by parsing and validating input data from keyword arguments.

Raises `pydantic_core.ValidationError` if the input data cannot be
validated to form a valid model.

`self` is explicitly positional-only to allow `self` as a field name.

```python
model_alias: str
```

```python
summary_score_sample_size: int | None = Field(...)
```
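
The documented rules for `summary_score_sample_size` (default of 20, at least 1 when provided, `None` to skip summaries) can be sketched as follows. This is a standard-library stand-in, not the library's implementation: the real class is a pydantic-based `ConfigBase`, and the class name, the `summaries_enabled` helper, and the alias `"judge-model"` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


# Stand-in for the documented config; the real class is a pydantic-based ConfigBase.
@dataclass
class JudgeScoreProfilerConfigSketch:
    model_alias: str
    summary_score_sample_size: Optional[int] = 20  # documented default

    def __post_init__(self) -> None:
        size = self.summary_score_sample_size
        # None disables LLM-generated summaries; otherwise the size must be >= 1.
        if size is not None and size < 1:
            raise ValueError("summary_score_sample_size must be at least 1 or None")

    @property
    def summaries_enabled(self) -> bool:
        # Hypothetical helper: summaries are generated only when a size is set.
        return self.summary_score_sample_size is not None


default_config = JudgeScoreProfilerConfigSketch(model_alias="judge-model")
no_summary_config = JudgeScoreProfilerConfigSketch(
    model_alias="judge-model", summary_score_sample_size=None
)
```

With the real class, an invalid size would be expected to surface as a pydantic validation error rather than the plain `ValueError` used in this sketch.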

```python
class data_designer.config.analysis.column_profilers.JudgeScoreSample(
    /,
    **data: typing.Any
)
```

**Bases**: `pydantic.BaseModel`

Container for a single judge score and its associated reasoning.

Stores a paired score-reasoning sample extracted from an LLM-as-a-judge column.
Used when generating summaries to provide the LLM with examples of scoring patterns.

**Parameters:**

`score`: The score value assigned by the judge. Can be numeric (`int`) or categorical (`str`).

`reasoning`: The reasoning or explanation provided by the judge for this score.

**Initialization:**

Create a new model by parsing and validating input data from keyword arguments.

Raises `pydantic_core.ValidationError` if the input data cannot be
validated to form a valid model.

`self` is explicitly positional-only to allow `self` as a field name.

```python
score: int | str
```

```python
reasoning: str
```
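
As a rough illustration of how score-reasoning pairs might be assembled for one score dimension, the sketch below pairs two parallel lists and keeps a limited number of samples. It uses a standard-library stand-in for the model, and both the example data and the "take the first N" selection are assumptions; how the profiler actually selects samples is not described here.

```python
from dataclasses import dataclass
from typing import Union


# Stand-in for the documented model (the real class is a pydantic.BaseModel).
@dataclass
class JudgeScoreSampleSketch:
    score: Union[int, str]
    reasoning: str


# Hypothetical judge outputs for one score dimension.
scores = [4, 2, 5]
reasoning = ["Directly answers the question.", "Mostly off-topic.", "Complete and precise."]

# Pair each score with its reasoning, keeping at most `sample_size` pairs.
sample_size = 2
samples = [
    JudgeScoreSampleSketch(score=s, reasoning=r)
    for s, r in list(zip(scores, reasoning))[:sample_size]
]
```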

```python
class data_designer.config.analysis.column_profilers.JudgeScoreDistributions(
    /,
    **data: typing.Any
)
```

**Bases**: `pydantic.BaseModel`

Container for computed distributions across all judge score dimensions.

Stores the complete distribution analysis for all score dimensions in an LLM-as-a-judge
column. Each score dimension (e.g., "relevance", "fluency") has its own distribution
computed from the generated data.

**Parameters:**

`scores`: Mapping of each score dimension name to its list of score values.

`reasoning`: Mapping of each score dimension name to its list of reasoning texts.

`distribution_types`: Mapping of each score dimension name to its distribution type classification.

`distributions`: Mapping of each score dimension name to its computed distribution statistics.

`histograms`: Mapping of each score dimension name to its histogram data.

**Initialization:**

Create a new model by parsing and validating input data from keyword arguments.

Raises `pydantic_core.ValidationError` if the input data cannot be
validated to form a valid model.

`self` is explicitly positional-only to allow `self` as a field name.

```python
scores: dict[str, list[int | str]]
```

```python
reasoning: dict[str, list[str]]
```

```python
distribution_types: dict[str, data_designer.config.analysis.column_statistics.ColumnDistributionType]
```

```python
distributions: dict[str, data_designer.config.analysis.column_statistics.CategoricalDistribution | data_designer.config.analysis.column_statistics.NumericalDistribution | data_designer.config.analysis.column_statistics.MissingValue]
```

```python
histograms: dict[str, data_designer.config.analysis.column_statistics.CategoricalHistogramData | data_designer.config.analysis.column_statistics.MissingValue]
```
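
To make the shape of these mappings concrete, the sketch below builds hypothetical `scores` and `reasoning` dictionaries keyed by score dimension and derives simple per-dimension counts. The data is invented for illustration, the index-by-index alignment of the two mappings is an assumption, and the plain `dict` of counts is only a stand-in for the library's `CategoricalHistogramData`.

```python
from collections import Counter

# Hypothetical per-dimension data, shaped like the `scores` and `reasoning` fields.
scores = {
    "relevance": [4, 5, 4, 3],
    "fluency": ["high", "high", "medium", "high"],
}
reasoning = {
    "relevance": ["On topic.", "Fully answers.", "On topic.", "Partly answers."],
    "fluency": ["Reads well.", "Reads well.", "Some awkward phrasing.", "Reads well."],
}

# Assumed here: both mappings share keys and their lists line up index by index.
aligned = all(len(scores[name]) == len(reasoning[name]) for name in scores)

# A simple stand-in for histogram data: how often each score value occurs.
histograms = {name: dict(Counter(values)) for name, values in scores.items()}
```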

```python
class data_designer.config.analysis.column_profilers.JudgeScoreSummary(
    /,
    **data: typing.Any
)
```

**Bases**: `pydantic.BaseModel`

Container for an LLM-generated summary of a judge score dimension.

Stores the natural language summary and sample data for a single score dimension
generated by the judge score profiler. The summary is created by an LLM analyzing
the distribution and patterns in the score-reasoning pairs.

**Parameters:**

`score_name`: Name of the score dimension being summarized (e.g., "relevance", "fluency").

`summary`: LLM-generated natural language summary describing the scoring patterns,
distribution characteristics, and notable trends for this score dimension.

`score_samples`: List of score-reasoning pairs used to generate the summary,
serving as examples of the judge's scoring behavior.

**Initialization:**

Create a new model by parsing and validating input data from keyword arguments.

Raises `pydantic_core.ValidationError` if the input data cannot be
validated to form a valid model.

`self` is explicitly positional-only to allow `self` as a field name.

```python
score_name: str
```

```python
summary: str
```

```python
score_samples: list[data_designer.config.analysis.column_profilers.JudgeScoreSample]
```

```python
class data_designer.config.analysis.column_profilers.JudgeScoreProfilerResults(
    /,
    **data: typing.Any
)
```

**Bases**: `data_designer.config.analysis.column_profilers.ColumnProfilerResults`

Container for complete judge score profiler analysis results.

**Parameters:**

`column_name`: Name of the judge column that was profiled.

`summaries`: Mapping of each score dimension name to its LLM-generated summary.

`score_distributions`: Complete distribution analysis across all score dimensions.

**Initialization:**

Create a new model by parsing and validating input data from keyword arguments.

Raises `pydantic_core.ValidationError` if the input data cannot be
validated to form a valid model.

`self` is explicitly positional-only to allow `self` as a field name.

```python
column_name: str
```

```python
summaries: dict[str, data_designer.config.analysis.column_profilers.JudgeScoreSummary]
```

```python
score_distributions: data_designer.config.analysis.column_profilers.JudgeScoreDistributions | data_designer.config.analysis.column_statistics.MissingValue
```

```python
create_report_section() -> rich.panel.Panel
```