morpheus.models.dfencoder.dataloader.DFEncoderDataLoader
- class DFEncoderDataLoader(*args, **kwargs)[source]
Bases: torch.utils.data.DataLoader
Methods
- __call__(*args, **kwargs): Call self as a function.
- get_distributed_training_dataloader_from_dataset(...): Returns a distributed training DataLoader given a dataset and other arguments.
- get_distributed_training_dataloader_from_df(...): A helper function to get a distributed training DataLoader given a pandas dataframe.
- get_distributed_training_dataloader_from_path(...): A helper function to get a distributed training DataLoader given a path to a folder containing data.
- static get_distributed_training_dataloader_from_dataset(dataset, rank, world_size, pin_memory=False, num_workers=0)[source]
Returns a distributed training DataLoader given a dataset and other arguments.
- Parameters
- dataset: The dataset to load the data from.
- rank: The rank of the current process.
- world_size: The number of processes to distribute the data across.
- pin_memory: Whether to pin memory when loading data, by default False.
- num_workers: The number of worker processes to use for loading data, by default 0.
- Returns
- DataLoader
The training DataLoader with DistributedSampler for distributed training.
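The roles of rank and world_size can be illustrated without PyTorch. The sketch below is a plain-Python approximation of how torch.utils.data.DistributedSampler splits dataset indices across processes when shuffling is disabled and drop_last is False. It is not the Morpheus implementation, and `shard_indices` is a hypothetical helper name introduced only for illustration.

```python
import math


def shard_indices(dataset_len, rank, world_size):
    """Return the dataset indices that one process would iterate over.

    Approximates DistributedSampler with shuffle=False and drop_last=False:
    pad the index list so it divides evenly, then take every
    world_size-th index starting at position `rank`.
    """
    if not 0 <= rank < world_size:
        raise ValueError("rank must be in [0, world_size)")
    indices = list(range(dataset_len))
    per_rank = math.ceil(dataset_len / world_size)
    total = per_rank * world_size
    padding = total - len(indices)
    if padding and indices:
        # Repeat indices from the start so every rank gets the same count.
        indices += (indices * math.ceil(padding / len(indices)))[:padding]
    return indices[rank:total:world_size]


if __name__ == "__main__":
    # Show which rows each of three processes would see in a 10-row dataset.
    for rank in range(3):
        print(rank, shard_indices(10, rank, 3))
```

Every rank receives the same number of indices; when the dataset length is not a multiple of world_size, a few leading indices are repeated to fill the gap.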
- static get_distributed_training_dataloader_from_df(model, df, rank, world_size, pin_memory=False, num_workers=0)[source]
A helper function to get a distributed training DataLoader given a pandas dataframe.
- Parameters
- model: The autoencoder model used to get relevant params and the preprocessing func.
- df: The pandas dataframe containing the data.
- rank: The rank of the current process.
- world_size: The number of processes to distribute the data across.
- pin_memory: Whether to pin memory when loading data, by default False.
- num_workers: The number of worker processes to use for loading data, by default 0.
- Returns
- DFEncoderDataLoader
The training DataLoader with DistributedSampler for distributed training.
- static get_distributed_training_dataloader_from_path(model, data_folder, rank, world_size, load_data_fn=pandas.read_csv, pin_memory=False, num_workers=0)[source]
A helper function to get a distributed training DataLoader given a path to a folder containing data.
- Parameters
- model: The autoencoder model used to get relevant params and the preprocessing func.
- data_folder: The path to the folder containing the data.
- rank: The rank of the current process.
- world_size: The number of processes to distribute the data across.
- load_data_fn: A function for loading data from a provided file path into a pandas.DataFrame, by default pandas.read_csv.
- pin_memory: Whether to pin memory when loading data, by default False.
- num_workers: The number of worker processes to use for loading data, by default 0.
- Returns
- DFEncoderDataLoader
The training DataLoader with DistributedSampler for distributed training.
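A per-process setup function might call this helper as sketched below. The keyword arguments follow the signature documented above; `build_training_loader` is a hypothetical wrapper, and the loader class is passed in as a parameter so the snippet runs without Morpheus installed. In real code, `loader_cls` would presumably be DFEncoderDataLoader imported from morpheus.models.dfencoder.dataloader, and `model` an already constructed autoencoder.

```python
def build_training_loader(loader_cls, model, data_folder, rank, world_size,
                          load_data_fn=None, num_workers=0):
    """Hypothetical wrapper: build the training DataLoader for one process."""
    kwargs = dict(
        model=model,
        data_folder=data_folder,
        rank=rank,              # index of this process
        world_size=world_size,  # total number of processes
        pin_memory=False,
        num_workers=num_workers,
    )
    if load_data_fn is not None:
        # Only override the documented default (pandas.read_csv) when asked.
        kwargs["load_data_fn"] = load_data_fn
    return loader_cls.get_distributed_training_dataloader_from_path(**kwargs)
```

Each process calls the wrapper with its own rank and the shared world_size, so all processes read the same folder but iterate over different shards of it.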