data_designer.config.analysis.dataset_profiler
Bases: pydantic.BaseModel
Container for complete dataset profiling and analysis results.
Stores profiling results for a generated dataset, including statistics for configured columns, dataset-level metadata, side-effect column names, and optional advanced profiler results. Provides methods for computing derived metrics and generating formatted reports.
Parameters:
Actual number of records successfully generated in the dataset.
Target number of records that were requested to be generated.
List of statistics objects for configured columns. Each column has statistics appropriate to its type. Must contain at least one column.
Column names that were generated as side effects of other columns.
Column profiler results for specific columns when configured.
Attributes:
Actual number of records successfully generated in the dataset.
Target number of records that were requested to be generated.
List of statistics objects for configured columns. Each column has statistics appropriate to its type. Must contain at least one column.
Column names that were generated as side effects of other columns.
Column profiler results for specific columns when configured.
Initialization:
Create a new model by parsing and validating input data from keyword arguments.
Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.
self is explicitly positional-only to allow self as a field name.
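The validation behaviour described above can be illustrated with a small stand-in. This is a minimal sketch using a standard-library dataclass rather than pydantic. The class name ProfilerResultsSketch, its field names, and the helper try_build are hypothetical and chosen only for illustration. It mirrors the one constraint the documentation states explicitly: the list of column statistics must contain at least one entry.

```python
from dataclasses import dataclass, field


@dataclass
class ProfilerResultsSketch:
    """Hypothetical stand-in; names are assumptions, not the library's API."""

    num_records: int            # records actually generated
    target_num_records: int     # records that were requested
    column_statistics: list     # one statistics object per configured column
    side_effect_column_names: list = field(default_factory=list)

    def __post_init__(self) -> None:
        # Mirror the documented constraint: at least one column is required.
        if len(self.column_statistics) < 1:
            raise ValueError("column_statistics must contain at least one column")


def try_build(**kwargs):
    """Return the built object, or the error message if validation fails."""
    try:
        return ProfilerResultsSketch(**kwargs)
    except ValueError as exc:
        return str(exc)
```

With this sketch, try_build(num_records=0, target_num_records=100, column_statistics=[]) returns the error message instead of an object, whereas a non-empty list yields a populated instance.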
Returns the completion percentage of the dataset, that is, the actual number of records generated relative to the target number requested.
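As a rough illustration of the arithmetic, the completion percentage can be computed from the two record counts. The function name and the handling of a non-positive target are assumptions made for this sketch. The library's own computation may differ in detail.

```python
def percent_complete(num_records: int, target_num_records: int) -> float:
    """Share of the requested records that were actually generated, in percent."""
    if target_num_records <= 0:
        # Assumption: treat a missing or zero target as 0% rather than dividing by zero.
        return 0.0
    return 100.0 * num_records / target_num_records
```

For example, 90 generated records against a target of 100 gives 90.0.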
Returns a sorted list of unique column types present in the dataset.
Filters column statistics to return only those of the specified type.
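The two lookups above amount to a de-duplicated, sorted projection and a simple filter. The sketch below assumes each statistics object exposes a column_type attribute. The ColumnStats class and both function names are hypothetical stand-ins, not the library's types.

```python
from dataclasses import dataclass


@dataclass
class ColumnStats:
    """Hypothetical minimal statistics record for one column."""

    column_name: str
    column_type: str


def column_types(stats: list) -> list:
    """Sorted list of the unique column types present."""
    return sorted({s.column_type for s in stats})


def stats_by_type(stats: list, column_type: str) -> list:
    """Only the statistics objects whose type matches column_type."""
    return [s for s in stats if s.column_type == column_type]


example = [
    ColumnStats("topic", "sampler"),
    ColumnStats("answer", "llm-text"),
    ColumnStats("region", "sampler"),
]
```

Here column_types(example) yields the two distinct types in sorted order, and stats_by_type(example, "sampler") keeps the "topic" and "region" entries in their original order.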
Generate and print an analysis report based on the dataset profiling results.
Parameters:
Optional path to save the report. If provided, the report is saved in either HTML (.html) or SVG (.svg) format. If None, the report is only displayed in the console.
Optional list of sections to include in the report. Choices are any DataDesignerColumnType, “overview” (the dataset overview section), and “column_profilers” (all column profilers in one section). If None, all sections will be included.
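One way to picture how the two parameters might be interpreted is sketched below. The function names, the error handling for unsupported extensions, and the rejection of unknown section names are assumptions made for illustration. The documentation only states which formats and section names are accepted, not how invalid input is handled.

```python
from pathlib import Path

ALLOWED_SUFFIXES = {".html", ".svg"}


def resolve_report_format(save_path=None):
    """Return "html" or "svg" for a save path, or None for console-only output."""
    if save_path is None:
        return None
    suffix = Path(save_path).suffix.lower()
    if suffix not in ALLOWED_SUFFIXES:
        # Assumption: unsupported extensions are rejected rather than ignored.
        raise ValueError(f"unsupported report format: {suffix!r}")
    return suffix.lstrip(".")


def select_sections(available, include_sections=None):
    """Return the sections to render, preserving the order of `available`."""
    if include_sections is None:
        return list(available)  # None means every section is included
    unknown = [s for s in include_sections if s not in available]
    if unknown:
        raise ValueError(f"unknown sections: {unknown}")
    return [s for s in available if s in include_sections]
```

In this sketch, a path such as "report.html" resolves to the HTML format, and passing only "overview" restricts the output to the dataset overview section.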