morpheus.utils.downloader.Downloader

class Downloader(download_method=DownloadMethods.DASK_THREAD, dask_heartbeat_interval='30s')[source]

Bases: object

Downloads a list of fsspec.core.OpenFiles files using one of the following methods:

single_thread, dask or dask_thread

The download method can be passed in via the download_method parameter or via the MORPHEUS_FILE_DOWNLOAD_TYPE environment variable. If both are set, the environment variable takes precedence, by default dask_thread is used.

When using single_thread, dask and dask.distributed is not reuiqrred to be installed.

Parameters
download_method

The download method to use, if the MORPHEUS_FILE_DOWNLOAD_TYPE environment variable is set, it takes presedence.

dask_heartbeat_interval

The heartbeat interval to use when using dask or dask_thread.

Attributes
download_method

Return the download method.

Methods

close() Cluster management is handled by Merlin.Distributed
download(download_buckets, download_fn) Download the files in download_buckets using the method specified in the constructor.
get_dask_client() Construct a dask client using the cluster created by get_dask_cluster
get_dask_cluster() Get the dask cluster used by this downloader.
close()[source]

Cluster management is handled by Merlin.Distributed

download(download_buckets, download_fn)[source]

Download the files in download_buckets using the method specified in the constructor. If dask or dask_thread is used, the get_dask_client_fn function is used to create a dask client, otherwise it is not called. If using one of the other methods this can be set to None.

Parameters
download_buckets

Files to download

download_fn

Function used to download an individual file and return the contents as a pandas DataFrame

Returns
typing.List[pd.DataFrame]

property download_method: str

Return the download method.

get_dask_client()[source]

Construct a dask client using the cluster created by get_dask_cluster

Returns
dask.distributed.Client

get_dask_cluster()[source]

Get the dask cluster used by this downloader. If the cluster does not exist, it is created.

Returns
dask_cuda.LocalCUDACluster

Previous morpheus.utils.downloader.DownloadMethods
Next morpheus.utils.execution_chain
© Copyright 2024, NVIDIA. Last updated on Apr 25, 2024.