morpheus.utils.schema_transforms

Functions

process_dataframe()

Applies column transformations to the input dataframe as defined by the input_schema.

process_dataframe(df_in: pandas.DataFrame, input_schema: Union[nvtabular.Workflow, morpheus.utils.column_info.DataFrameInputSchema]) → pandas.DataFrame
process_dataframe(df_in: cudf.DataFrame, input_schema: Union[nvtabular.Workflow, morpheus.utils.column_info.DataFrameInputSchema]) → cudf.DataFrame

If input_schema is a DataFrameInputSchema instance with a ‘json_preproc’ attribute, the function first flattens the JSON columns and concatenates the results with the original DataFrame.
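
The flatten-and-concatenate step can be illustrated with a minimal sketch in plain Python. This is not the Morpheus implementation: rows are modeled as dicts rather than DataFrame rows, the helper names ‘flatten’ and ‘flatten_json_columns’ are hypothetical, and the dotted naming of the new columns is an assumption made for illustration.

```python
import json


def flatten(obj, prefix=""):
    """Flatten nested dicts into one level, joining keys with dots (assumed naming)."""
    flat = {}
    for key, value in obj.items():
        name = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, name))
        else:
            flat[name] = value
    return flat


def flatten_json_columns(rows, json_columns):
    """Hypothetical helper: expand each JSON string column and merge it with the original fields."""
    out = []
    for row in rows:
        new_row = dict(row)  # keep the original columns
        for col in json_columns:
            parsed = json.loads(row[col])
            new_row.update(flatten(parsed, prefix=col))
        out.append(new_row)
    return out


rows = [
    {"id": 1, "properties": '{"user": {"name": "alice"}, "status": "ok"}'},
    {"id": 2, "properties": '{"user": {"name": "bob"}, "status": "fail"}'},
]
result = flatten_json_columns(rows, ["properties"])
```

Each row in ‘result’ keeps its original ‘id’ and ‘properties’ fields and gains one new field per leaf of the parsed JSON, which mirrors the idea of concatenating the flattened columns with the original DataFrame.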

Parameters:
df_in : Union[pd.DataFrame, cudf.DataFrame]

The input DataFrame to process.

input_schema : Union[nvt.Workflow, DataFrameInputSchema]

Defines the transformations to apply to ‘df_in’. An nvt.Workflow is used directly to transform the dataframe. A DataFrameInputSchema is first converted to an nvt.Workflow, and its JSON columns are preprocessed if the ‘json_preproc’ attribute is present.

Returns:
Union[pd.DataFrame, cudf.DataFrame]

The processed DataFrame. If ‘df_in’ is a pd.DataFrame, the return type is also pd.DataFrame; otherwise, it is cudf.DataFrame.

© Copyright 2023, NVIDIA. Last updated on Aug 23, 2023.