morpheus.controllers.file_to_df_controller.FileToDFController
- class FileToDFController(schema, filter_null, file_type, parser_kwargs, cache_dir, timestamp_column_name, download_method=DownloadMethods.DASK_THREAD)
Bases: object
Controller class for converting file objects to Pandas DataFrames with optional preprocessing.
- Parameters:
- schema : DataFrameInputSchema
A schema defining how to process the data.
- filter_null : bool
Flag to indicate whether to filter out null values.
- file_type : FileTypes
The type of the file being processed (e.g., CSV, Parquet).
- parser_kwargs : dict
Additional keyword arguments to pass to the file parser.
- cache_dir : str
Directory where cache will be stored.
- timestamp_column_name : str
Name of the timestamp column.
- download_method : typing.Union[DownloadMethods, str], optional, default = DownloadMethods.DASK_THREAD
The download method to use. If the MORPHEUS_FILE_DOWNLOAD_TYPE environment variable is set, it takes precedence over this argument.
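The precedence rule for download_method can be illustrated with a minimal sketch. This is not the Morpheus implementation: the DownloadMethods enum below is a local stand-in whose string values are assumed (only DASK_THREAD appears in the signature above), and resolve_download_method is a hypothetical helper written for illustration.

```python
import os
from enum import Enum


class DownloadMethods(Enum):
    """Local stand-in for Morpheus's DownloadMethods; the string values are assumed."""
    SINGLE_THREAD = "single_thread"
    DASK = "dask"
    DASK_THREAD = "dask_thread"


def resolve_download_method(download_method=DownloadMethods.DASK_THREAD, environ=None):
    """Hypothetical helper: return the effective method, letting the env var win."""
    environ = os.environ if environ is None else environ

    # The environment variable, when set, overrides the argument.
    override = environ.get("MORPHEUS_FILE_DOWNLOAD_TYPE")
    if override:
        return DownloadMethods(override.lower())

    # The argument may be given as a string or as an enum member.
    if isinstance(download_method, str):
        return DownloadMethods(download_method.lower())
    return download_method
```

Passing environ explicitly keeps the sketch easy to exercise without touching the real process environment; with no argument it falls back to os.environ.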
Methods
close(): Close the resources used by the controller.
convert_to_dataframe(file_object_batch): Convert a batch of file objects to a DataFrame.
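Because the controller exposes close(), one possible usage pattern is to wrap it in contextlib.closing so that resources are released even if conversion raises. The sketch below uses a hypothetical StubController so that it runs without Morpheus installed; it only mirrors the two method names listed above. The stub returns a list of dict rows rather than a DataFrame, and the shape of its file_object_batch argument is an assumption, not the documented input type.

```python
import csv
import io
from contextlib import closing


class StubController:
    """Hypothetical stand-in exposing the same two method names as FileToDFController."""

    def __init__(self):
        self.closed = False

    def convert_to_dataframe(self, file_object_batch):
        # The real method returns a DataFrame; this stub returns a list of dict rows.
        rows = []
        for file_object in file_object_batch:
            rows.extend(csv.DictReader(file_object))
        return rows

    def close(self):
        # The real controller releases its resources here; the stub records the call.
        self.closed = True


# Two in-memory CSV "files" standing in for a batch of file objects.
batch = [io.StringIO("ts,user\n1,alice\n"), io.StringIO("ts,user\n2,bob\n")]

controller = StubController()
with closing(controller) as ctl:
    rows = ctl.convert_to_dataframe(batch)
# On leaving the with-block, closing() has called controller.close().
```

The same with closing(...) form should apply to any object that provides a close() method, which is why it is used here instead of a manual try/finally.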