morpheus.utils.schema_transforms

Functions

process_dataframe()

Applies column transformations to the input dataframe as defined by the input_schema.

process_dataframe(
df_in: pandas.DataFrame,
input_schema: DataFrameInputSchema,
) -> pandas.DataFrame
process_dataframe(
df_in: cudf.DataFrame,
input_schema: DataFrameInputSchema,
) -> cudf.DataFrame

Applies column transformations to the input dataframe as defined by the input_schema.

If input_schema is an instance of DataFrameInputSchema and has a ‘json_preproc’ attribute, the function first flattens the JSON columns and concatenates the results with the original DataFrame, then applies the column transformations.
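
As a rough illustration of the flattening step described above, the following sketch uses plain pandas rather than Morpheus itself. The helper name `flatten_json_column` and the sample column names are hypothetical, and the actual ‘json_preproc’ logic may differ in detail (for example in how flattened columns are named).

```python
import json

import pandas as pd


def flatten_json_column(df: pd.DataFrame, column: str) -> pd.DataFrame:
    """Hypothetical helper (not part of Morpheus): expand a column of JSON
    strings into flat columns and concatenate them with the original frame."""
    # Parse each JSON string into a Python dict.
    parsed = [json.loads(value) for value in df[column]]
    # Nested keys become dotted column names, e.g. "user.id".
    flat = pd.json_normalize(parsed)
    # Prefix with the source column name to avoid clashes (an assumption of this sketch).
    flat = flat.add_prefix(f"{column}.")
    # Reuse the original index so the concat lines rows up correctly.
    flat.index = df.index
    return pd.concat([df, flat], axis=1)


df_in = pd.DataFrame({
    "time": ["2024-01-01", "2024-01-02"],
    "properties": [
        '{"user": {"id": 1}, "ok": true}',
        '{"user": {"id": 2}, "ok": false}',
    ],
})
df_flat = flatten_json_column(df_in, "properties")
```

The original `properties` column is kept alongside the new `properties.user.id` and `properties.ok` columns, mirroring the "concatenate the results with the original DataFrame" behaviour described above.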

Parameters:
df_in : Union[pd.DataFrame, cudf.DataFrame]

The input DataFrame to process.

input_schema : DataFrameInputSchema

Defines the transformations to apply to ‘df_in’. JSON columns are preprocessed first if the ‘json_preproc’ attribute is present.

Returns:
Union[pd.DataFrame, cudf.DataFrame]

The processed DataFrame. If ‘df_in’ is a pd.DataFrame, the return type is also pd.DataFrame; otherwise, it is cudf.DataFrame.

Notes

Any transformation that needs to be performed should be defined in ‘input_schema’. If ‘df_in’ is a pandas DataFrame, it is temporarily converted into a cudf DataFrame for the transformation.
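
To make the idea of schema-defined transformations concrete, here is a minimal pandas-only sketch. `ColumnSpec` and `apply_schema` are hypothetical stand-ins invented for this example; they are not the Morpheus `DataFrameInputSchema` API. Keeping only the declared columns is an assumption of the sketch, not a statement about how `process_dataframe` treats other columns.

```python
from dataclasses import dataclass
from typing import List, Optional

import pandas as pd


@dataclass
class ColumnSpec:
    """Hypothetical stand-in for one column entry of an input schema."""
    name: str                         # output column name
    dtype: str                        # dtype to cast the output column to
    input_name: Optional[str] = None  # source column; defaults to `name`


def apply_schema(df: pd.DataFrame, specs: List[ColumnSpec]) -> pd.DataFrame:
    """Build a new DataFrame holding only the columns the schema declares."""
    out = pd.DataFrame(index=df.index)
    for spec in specs:
        source = spec.input_name or spec.name
        # Rename (if input_name differs) and cast in one step.
        out[spec.name] = df[source].astype(spec.dtype)
    return out


schema = [
    ColumnSpec(name="user_id", dtype="int64", input_name="uid"),
    ColumnSpec(name="score", dtype="float64"),
]
raw = pd.DataFrame({"uid": ["1", "2"], "score": [1, 2], "unused": ["a", "b"]})
processed = apply_schema(raw, schema)
```

The point of the design is that the caller describes every rename and cast declaratively in the schema, and the processing function contains no column-specific logic of its own.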