# data\_designer.config.analysis.dataset\_profiler

## Module Contents

### Classes

| Name                                                                                           | Description                                                    |
| ---------------------------------------------------------------------------------------------- | -------------------------------------------------------------- |
| [`DatasetProfilerResults`](#data_designerconfiganalysisdataset_profilerdatasetprofilerresults) | Container for complete dataset profiling and analysis results. |

### API

```python
class data_designer.config.analysis.dataset_profiler.DatasetProfilerResults(
    /,
    **data: typing.Any
)
```

**Bases**: `pydantic.BaseModel`

Container for complete dataset profiling and analysis results.

Stores profiling results for a generated dataset, including statistics for configured columns,
dataset-level metadata, side-effect column names, and optional advanced profiler results.
Provides methods for computing derived metrics and generating formatted reports.

**Parameters:**

`num_records`: Actual number of records successfully generated in the dataset.

`target_num_records`: Target number of records that were requested to be generated.

`column_statistics`: List of statistics objects for configured columns. Each column has statistics appropriate to its type. Must contain at least one column.

`side_effect_column_names`: Column names that were generated as side effects of other columns.

`column_profiles`: Column profiler results for specific columns when configured.

**Attributes:**

`num_records`: Actual number of records successfully generated in the dataset.

`target_num_records`: Target number of records that were requested to be generated.

`column_statistics`: List of statistics objects for configured columns. Each column has statistics appropriate to its type. Must contain at least one column.

`side_effect_column_names`: Column names that were generated as side effects of other columns.

`column_profiles`: Column profiler results for specific columns when configured.

**Initialization:**

Create a new model by parsing and validating input data from keyword arguments.

Raises `pydantic_core.ValidationError` if the input data cannot be
validated to form a valid model.

`self` is explicitly positional-only to allow `self` as a field name.

```python
num_records: int
```

```python
target_num_records: int
```

```python
column_statistics: list[typing.Annotated[data_designer.config.analysis.column_statistics.ColumnStatisticsT, Field(discriminator='column_type')]] = Field(...)
```

```python
side_effect_column_names: list[str] | None
```

```python
column_profiles: list[data_designer.config.analysis.column_profilers.ColumnProfilerResultsT] | None
```

```python
ensure_python_integers(v: int) -> int
```

```python
percent_complete: float
```

Returns the completion percentage of the dataset.
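The formula behind this property is not given here. A minimal sketch, assuming it is the ratio of `num_records` to `target_num_records` expressed as a percentage; the helper name `percent_complete_sketch` and the zero-target handling are hypothetical and not part of the library:

```python
def percent_complete_sketch(num_records: int, target_num_records: int) -> float:
    """Assumed definition: generated records as a percentage of the target."""
    if target_num_records <= 0:
        # Assumption: how the library treats a non-positive target is not documented here.
        raise ValueError("target_num_records must be positive")
    return 100.0 * num_records / target_num_records


# 950 of 1000 requested records were generated.
print(percent_complete_sketch(950, 1000))  # 95.0
```

Under this assumption, a value below 100 means that some requested records were not successfully generated.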

```python
column_types() -> list[str]
```

Returns a sorted list of unique column types present in the dataset.
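The "sorted list of unique column types" behavior can be illustrated without the library. The sketch below uses a hypothetical stand-in class, `FakeColumnStats`, and placeholder type strings. Only the `column_type` attribute name comes from the field declaration above; everything else is an assumption for illustration:

```python
from dataclasses import dataclass


@dataclass
class FakeColumnStats:
    """Hypothetical stand-in for a column statistics object."""

    column_name: str  # assumed attribute, for readability only
    column_type: str  # discriminator field named in the declaration above


def column_types_sketch(column_statistics: list[FakeColumnStats]) -> list[str]:
    # Deduplicate with a set, then sort for a stable, predictable order.
    return sorted({stats.column_type for stats in column_statistics})


stats = [
    FakeColumnStats("col_1", "type_b"),
    FakeColumnStats("col_2", "type_a"),
    FakeColumnStats("col_3", "type_b"),
]
print(column_types_sketch(stats))  # ['type_a', 'type_b']
```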

```python
get_column_statistics_by_type(column_type: data_designer.config.column_types.DataDesignerColumnType) -> list[data_designer.config.analysis.column_statistics.ColumnStatisticsT]
```

Filters column statistics to return only those of the specified type.
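As a rough illustration of this filter, the sketch below selects entries whose `column_type` equals the requested type. `FakeColumnStats`, the placeholder type strings, and the helper name are hypothetical. The real method takes a `DataDesignerColumnType` rather than a plain string:

```python
from dataclasses import dataclass


@dataclass
class FakeColumnStats:
    """Hypothetical stand-in for a column statistics object."""

    column_name: str  # assumed attribute, for readability only
    column_type: str  # placeholder string instead of DataDesignerColumnType


def filter_by_type_sketch(
    column_statistics: list[FakeColumnStats], column_type: str
) -> list[FakeColumnStats]:
    # Keep only the entries whose type matches, preserving the original order.
    return [stats for stats in column_statistics if stats.column_type == column_type]


stats = [
    FakeColumnStats("col_1", "type_b"),
    FakeColumnStats("col_2", "type_a"),
    FakeColumnStats("col_3", "type_b"),
]
matches = filter_by_type_sketch(stats, "type_b")
print([s.column_name for s in matches])  # ['col_1', 'col_3']
```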

```python
to_report(
    save_path: str | pathlib.Path | None = None,
    include_sections: list[data_designer.config.analysis.utils.reporting.ReportSection | data_designer.config.column_types.DataDesignerColumnType] | None = None
) -> None
```

Generate and print an analysis report based on the dataset profiling results.

**Parameters:**

`save_path`: Optional path to save the report. If provided, the report is saved in either HTML (`.html`) or SVG (`.svg`) format. If `None`, the report is only displayed in the console.

`include_sections`: Optional list of sections to include in the report. Choices are any `DataDesignerColumnType`, `"overview"` (the dataset overview section), and `"column_profilers"` (all column profilers in one section). If `None`, all sections are included.
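Because `save_path` only supports the `.html` and `.svg` formats, it can help to check the extension before calling `to_report`. The helper below is a hypothetical pre-check, not part of the library. How the library itself reacts to an unsupported extension is not documented here, so raising `ValueError` is an assumption:

```python
from pathlib import Path

# The two formats named in the parameter description.
ALLOWED_SUFFIXES = {".html", ".svg"}


def check_report_path(save_path: "str | Path | None") -> "Path | None":
    """Return a Path with a supported suffix, or None for console-only output."""
    if save_path is None:
        return None
    path = Path(save_path)
    if path.suffix.lower() not in ALLOWED_SUFFIXES:
        # Assumption: reject anything other than .html or .svg up front.
        raise ValueError(f"unsupported report format: {path.suffix!r}")
    return path


print(check_report_path("reports/profile.html"))
print(check_report_path(None))  # None
```

With a real `DatasetProfilerResults` instance (called `results` here for illustration), the checked path could then be passed as `results.to_report(save_path=path)`.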