morpheus.utils.compare_df
Functions
compare_df(df_a, df_b[, include_columns, ...]) | Compares two pandas DataFrames, returning a comparison summary as a dict.
filter_df(df, include_columns, exclude_columns) | Filters the dataframe df, including and excluding the columns specified by include_columns and exclude_columns respectively.
- compare_df(df_a, df_b, include_columns=None, exclude_columns=None, replace_idx=None, abs_tol=0.001, rel_tol=0.005, dfa_name='val', dfb_name='res', show_report=False)
Compares two pandas DataFrames, returning a comparison summary as a dict of the form:
{ "total_rows": <int>, "matching_rows": <int>, "diff_rows": <int>, "matching_cols": <[str]>, "extra_cols": <[str]>, "missing_cols": <[str]> }
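The shape of the summary can be illustrated with a small standalone sketch. It does not call Morpheus and is not the library's implementation: `summarize` is a hypothetical helper that works on lists of dicts, it assumes the tolerances behave like Python's `math.isclose`, and it assumes `extra_cols` means columns found only in the second frame and `missing_cols` columns found only in the first. None of these assumptions is confirmed by the documentation above.

```python
import math


def summarize(rows_a, rows_b, abs_tol=0.001, rel_tol=0.005):
    """Hypothetical stand-in that builds a dict with the documented keys.

    rows_a and rows_b are lists of dicts, compared row by row in order.
    """
    cols_a = set(rows_a[0]) if rows_a else set()
    cols_b = set(rows_b[0]) if rows_b else set()
    shared = sorted(cols_a & cols_b)

    def same(x, y):
        # Assumption: numeric values are compared with math.isclose
        # semantics; every other value must be exactly equal.
        if isinstance(x, (int, float)) and isinstance(y, (int, float)):
            return math.isclose(x, y, rel_tol=rel_tol, abs_tol=abs_tol)
        return x == y

    total = max(len(rows_a), len(rows_b))
    matching = sum(
        all(same(a[c], b[c]) for c in shared)
        for a, b in zip(rows_a, rows_b)
    )
    return {
        "total_rows": total,
        "matching_rows": matching,
        "diff_rows": total - matching,
        "matching_cols": shared,
        "extra_cols": sorted(cols_b - cols_a),    # assumption: only in the second frame
        "missing_cols": sorted(cols_a - cols_b),  # assumption: only in the first frame
    }


val = [{"id": 1, "score": 0.5}, {"id": 2, "score": 0.9}]
res = [{"id": 1, "score": 0.5004, "note": "x"}, {"id": 2, "score": 0.7, "note": "y"}]
summary = summarize(val, res)
```

Under these assumptions the first row counts as matching because 0.5 and 0.5004 are within tolerance, while the second row differs.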
- filter_df(df, include_columns, exclude_columns, replace_idx=None)
Filters the dataframe df, including and excluding the columns specified by include_columns and exclude_columns respectively. If a column is matched by both include_columns and exclude_columns, it will be excluded.
- Parameters
- df : pd.DataFrame
Dataframe to filter.
- include_columns : typing.List[str]
List of regular expression strings of columns to be included.
- exclude_columns : typing.List[str]
List of regular expression strings of columns to be excluded.
- replace_idx : str, optional
When replace_idx is not None and exists in the dataframe, it will be set as the index.
- Returns
- pd.DataFrame
Filtered slice of df.
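The include/exclude precedence described above can be sketched on a plain list of column names. `select_columns` is a hypothetical helper, not part of Morpheus, and it assumes the patterns are applied with `re.match` (anchored at the start of the column name); the documentation does not say which matching mode the library uses.

```python
import re


def select_columns(columns, include_columns=None, exclude_columns=None):
    """Hypothetical helper: return the column names that survive filtering."""
    kept = list(columns)
    if include_columns is not None:
        # Keep a column if any include pattern matches it.
        kept = [c for c in kept if any(re.match(p, c) for p in include_columns)]
    if exclude_columns is not None:
        # Exclusion runs last, so it wins when both lists match a column.
        kept = [c for c in kept if not any(re.match(p, c) for p in exclude_columns)]
    return kept


columns = ["timestamp", "user_id", "user_name", "score"]
kept = select_columns(
    columns,
    include_columns=[r"user_.*", r"score"],
    exclude_columns=[r".*_name"],
)
```

Here `user_name` is matched by both lists and is therefore dropped, which mirrors the documented rule that exclusion takes priority.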