morpheus.loaders.file_to_df_loader

Functions

close_dask_cluster()

file_to_df_loader(control_message, task)

This function is used to load files containing data into a dataframe.

get_dask_client(dask_cluster)

get_dask_cluster(download_method)

close_dask_cluster()[source]

file_to_df_loader(control_message, task)[source]

This function is used to load files containing data into a dataframe. Dataframe is created by processing files either using a single thread, multiprocess, dask, or dask_thread. This function determines the download method to use, and if it starts with “dask,” it creates a dask client and uses it to process the files. Otherwise, it uses a single thread or multiprocess to process the files. This function then caches the resulting dataframe using a hash of the file paths. The dataframe is wrapped in a MessageMeta and then attached as a payload to a ControlMessage object and passed on to further stages.

Parameters
control_messageControlMessage

The ControlMessage object containing the pipeline control message.

tasktyping.Dict[any, any]

A dictionary representing the current task in the pipeline control message.

Returns
messageControlMessage

Updated message control object with payload as a MessageMeta.

Raises
RuntimeError:

If no files matched the input strings specified in the task, or if there was an error loading the data.

get_dask_client(dask_cluster)[source]

get_dask_cluster(download_method)[source]

© Copyright 2023, NVIDIA. Last updated on Apr 11, 2023.