morpheus.models.dfencoder.dataloader.DFEncoderDataLoader
- class DFEncoderDataLoader(*args, **kwargs)
Bases: torch.utils.data.DataLoader
Methods
__call__(*args, **kwargs)
Call self as a function.
get_distributed_training_dataloader_from_dataset(...)
Returns a distributed training DataLoader given a dataset and other arguments.
get_distributed_training_dataloader_from_df(...)
A helper function to get a distributed training DataLoader given a pandas dataframe.
get_distributed_training_dataloader_from_path(...)
A helper function to get a distributed training DataLoader given a path to a folder containing data.
- static get_distributed_training_dataloader_from_dataset(dataset, rank, world_size, pin_memory=False, num_workers=0)
Returns a distributed training DataLoader given a dataset and other arguments.
- Parameters
- dataset : Dataset
The dataset to load the data from.
- rank : int
The rank of the current process.
- world_size : int
The number of processes to distribute the data across.
- pin_memory : bool, optional
Whether to pin memory when loading data, by default False.
- num_workers : int, optional
The number of worker processes to use for loading data, by default 0.
- Returns
- DataLoader
The training DataLoader with DistributedSampler for distributed training.
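The rank and world_size arguments decide which slice of the dataset each process reads. The following is a minimal pure-Python sketch of the strided split used by torch.utils.data.DistributedSampler when shuffling is disabled and drop_last is False. The helper name shard_indices is hypothetical and is not part of Morpheus or PyTorch; it only illustrates the index arithmetic.

```python
import math
from typing import List


def shard_indices(dataset_len: int, rank: int, world_size: int) -> List[int]:
    """Return the sample indices one process would read (no shuffling).

    Hypothetical helper for illustration; not a Morpheus or PyTorch API.
    """
    if dataset_len <= 0:
        raise ValueError("dataset_len must be positive")
    if not 0 <= rank < world_size:
        raise ValueError("rank must satisfy 0 <= rank < world_size")

    # Every rank receives the same number of samples, so round up.
    per_rank = math.ceil(dataset_len / world_size)
    total = per_rank * world_size

    indices = list(range(dataset_len))
    # Pad by repeating indices from the start until the list divides evenly.
    padding = total - len(indices)
    if padding > 0:
        repeats = math.ceil(padding / len(indices))
        indices += (indices * repeats)[:padding]

    # Strided split: rank r takes positions r, r + world_size, ...
    return indices[rank:total:world_size]


# Ten samples split across four processes.
shards = [shard_indices(10, rank, world_size=4) for rank in range(4)]
```

With ten samples and four processes, every rank receives three indices, and the last two ranks reuse indices 0 and 1 as padding. With shuffling enabled, the real sampler permutes the indices before applying the same split.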
- static get_distributed_training_dataloader_from_df(model, df, rank, world_size, pin_memory=False, num_workers=0)
A helper function to get a distributed training DataLoader given a pandas dataframe.
- Parameters
- model : AutoEncoder
The autoencoder model that supplies the relevant parameters and the preprocessing function.
- df : pandas.DataFrame
The pandas dataframe containing the data.
- rank : int
The rank of the current process.
- world_size : int
The number of processes to distribute the data across.
- pin_memory : bool, optional
Whether to pin memory when loading data, by default False.
- num_workers : int, optional
The number of worker processes to use for loading data, by default 0.
- Returns
- DFEncoderDataLoader
The training DataLoader with DistributedSampler for distributed training.
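Each process must pass its own rank and the shared world_size to these helpers. The sketch below shows one way to obtain both values, assuming the processes are started by a launcher such as torchrun, which exports the RANK and WORLD_SIZE environment variables. The function name rank_and_world_size is hypothetical, and the single-process fallback is an assumption made for local runs.

```python
import os
from typing import Mapping, Optional, Tuple


def rank_and_world_size(env: Optional[Mapping[str, str]] = None) -> Tuple[int, int]:
    """Read (rank, world_size) from torchrun-style environment variables.

    Hypothetical helper for illustration; not a Morpheus API.
    """
    env = os.environ if env is None else env
    # Assumption: fall back to a single process when the variables are unset.
    rank = int(env.get("RANK", 0))
    world_size = int(env.get("WORLD_SIZE", 1))
    if not 0 <= rank < world_size:
        raise ValueError(f"invalid rank {rank} for world_size {world_size}")
    return rank, world_size


# Simulate the environment of the third of four processes.
rank, world_size = rank_and_world_size({"RANK": "2", "WORLD_SIZE": "4"})
```

The resulting pair can then be forwarded as the rank and world_size arguments of any of the three static methods documented here.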
- static get_distributed_training_dataloader_from_path(model, data_folder, rank, world_size, load_data_fn=pandas.read_csv, pin_memory=False, num_workers=0)
A helper function to get a distributed training DataLoader given a path to a folder containing data.
- Parameters
- model : AutoEncoder
The autoencoder model that supplies the relevant parameters and the preprocessing function.
- data_folder : str
The path to the folder containing the data.
- rank : int
The rank of the current process.
- world_size : int
The number of processes to distribute the data across.
- load_data_fn : function, optional
A function that loads the data at a given file path into a pandas.DataFrame, by default pandas.read_csv.
- pin_memory : bool, optional
Whether to pin memory when loading data, by default False.
- num_workers : int, optional
The number of worker processes to use for loading data, by default 0.
- Returns
- DFEncoderDataLoader
The training DataLoader with DistributedSampler for distributed training.