Functions
Applies column transformations to the input dataframe as defined by the input_schema.
- process_dataframe(df_in: pandas.DataFrame, input_schema: Union[nvtabular.Workflow, morpheus.utils.column_info.DataFrameInputSchema]) → pandas.DataFrame
- process_dataframe(df_in: cudf.DataFrame, input_schema: Union[nvtabular.Workflow, morpheus.utils.column_info.DataFrameInputSchema]) → cudf.DataFrame
Applies column transformations to the input dataframe as defined by the input_schema.
If input_schema is an instance of DataFrameInputSchema, and it has a ‘json_preproc’ attribute, the function will first flatten the JSON columns and concatenate the results with the original DataFrame.
- Parameters:
- df_in : Union[pd.DataFrame, cudf.DataFrame]
The input DataFrame to process.
- input_schema : Union[nvt.Workflow, DataFrameInputSchema]
Defines the transformations to apply to ‘df_in’. If an instance of nvt.Workflow, it is used directly to transform the dataframe. If an instance of DataFrameInputSchema, it is first converted to an nvt.Workflow, with JSON columns preprocessed if the ‘json_preproc’ attribute is present.
- Returns:
- Union[pd.DataFrame, cudf.DataFrame]
The processed DataFrame. If ‘df_in’ was a pd.DataFrame, the return type is also pd.DataFrame, otherwise, it is cudf.DataFrame.
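
The JSON flattening step described above can be illustrated with a small standard-library sketch. This is not the Morpheus implementation: `flatten_json_columns` is a hypothetical helper, rows are modelled as plain dicts instead of a DataFrame, and both the `.` separator for nested keys and the choice to drop the raw JSON column after flattening are assumptions made for the example.

```python
import json


def flatten_json_columns(rows, json_columns, sep="."):
    """Hypothetical helper: expand JSON-string columns into flat keys
    and merge them with the remaining columns of each row."""

    def _flatten(value, prefix):
        # Recurse into nested objects; any non-dict value is kept as-is.
        if isinstance(value, dict):
            flat = {}
            for key, child in value.items():
                flat.update(_flatten(child, f"{prefix}{sep}{key}"))
            return flat
        return {prefix: value}

    out = []
    for row in rows:
        # Start from the non-JSON columns (assumption: the raw JSON column is dropped).
        merged = {k: v for k, v in row.items() if k not in json_columns}
        for col in json_columns:
            if row.get(col) is not None:
                merged.update(_flatten(json.loads(row[col]), col))
        out.append(merged)
    return out


rows = [
    {"user": "alice", "event": '{"ip": "10.0.0.1", "geo": {"country": "US"}}'},
    {"user": "bob", "event": '{"ip": "10.0.0.2", "geo": {"country": "DE"}}'},
]
flat_rows = flatten_json_columns(rows, json_columns=["event"])
```

With this sketch, each entry of `flat_rows` keeps the `user` column and gains the keys `event.ip` and `event.geo.country`, which mirrors the idea of flattened JSON fields sitting alongside the original columns before the remaining column transformations run.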