morpheus.controllers.file_to_df_controller

Morpheus pipeline module for fetching files and emitting them as DataFrames.

Functions

single_object_to_dataframe(file_object, ...) Converts a file object into a Pandas DataFrame with optional preprocessing.

Classes

FileToDFController(schema, filter_null, ...) Controller class for converting file objects to Pandas DataFrames with optional preprocessing.

single_object_to_dataframe(file_object, schema, file_type, filter_null, parser_kwargs)

Converts a file object into a Pandas DataFrame with optional preprocessing.

Parameters
file_object : fsspec.core.OpenFile

A file object, typically from a remote storage system.

schema : morpheus.utils.column_info.DataFrameInputSchema

A schema defining how to process the data.

file_type : morpheus.common.FileTypes

The type of the file being processed (e.g., CSV, Parquet).

filter_null : bool

Flag to indicate whether to filter out null values.

parser_kwargs : dict

Additional keyword arguments to pass to the file parser.

Returns
pd.DataFrame

The resulting Pandas DataFrame after parsing and any optional preprocessing.
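
Example

The flow described above (open a file object, parse it according to its type, optionally drop null data, return a Pandas DataFrame) can be sketched with pandas alone. This is a minimal illustration, not the Morpheus implementation: `parse_file_object` is a hypothetical stand-in, the file type is passed as a plain string rather than a `morpheus.common.FileTypes` value, the schema-driven preprocessing step is omitted, and dropping every row that contains a null value is only an assumption about what `filter_null` does.

```python
import io

import pandas as pd


def parse_file_object(file_object, file_type, filter_null, parser_kwargs=None):
    """Hypothetical stand-in that mirrors the documented parameters.

    file_object   : any readable file-like object (the real function
                    expects an fsspec.core.OpenFile)
    file_type     : "csv" or "json" (assumption: plain strings instead
                    of morpheus.common.FileTypes)
    filter_null   : if True, drop rows containing null values
                    (assumption: the real filtering rule may differ)
    parser_kwargs : extra keyword arguments forwarded to the pandas parser
    """
    parser_kwargs = parser_kwargs or {}

    # Pick the pandas reader that matches the declared file type.
    readers = {"csv": pd.read_csv, "json": pd.read_json}
    if file_type not in readers:
        raise ValueError(f"Unsupported file type: {file_type!r}")

    df = readers[file_type](file_object, **parser_kwargs)

    # Optional null filtering, applied after parsing.
    if filter_null:
        df = df.dropna()

    return df


# Usage: the second row has no value for "count".
csv_text = "user,count\nalice,3\nbob,\n"

unfiltered = parse_file_object(io.StringIO(csv_text), "csv", filter_null=False)
filtered = parse_file_object(
    io.StringIO(csv_text), "csv", filter_null=True, parser_kwargs={"sep": ","}
)
```

With `filter_null=False` both rows are kept; with `filter_null=True` the row with the missing `count` is dropped. In a real pipeline the file object would typically come from `fsspec` and the result would then be processed according to the `DataFrameInputSchema`.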

© Copyright 2024, NVIDIA. Last updated on Apr 25, 2024.