morpheus.models.dfencoder.dataloader.DatasetFromPath
- class DatasetFromPath(*args, **kwargs)
Bases: torch.utils.data.Dataset
A dataset class that reads data in batches from a folder and applies preprocessing to each batch. This class assumes that the data is saved as small CSV files in a single folder.
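As a rough illustration of the folder-of-small-CSV-files layout described above, the following sketch uses only the Python standard library. `FolderBatches`, `to_floats`, and the `preload` flag are hypothetical names introduced for this example; it is a simplified stand-in, not the Morpheus implementation. The `preload` flag loosely mirrors the `preload_data_into_memory` parameter of the helper methods documented on this page.

```python
import csv
import tempfile
from pathlib import Path


class FolderBatches:
    """Illustrative stand-in (not the Morpheus class): one CSV file is one batch."""

    def __init__(self, data_folder, preprocess, preload=False):
        self.files = sorted(Path(data_folder).glob("*.csv"))
        self.preprocess = preprocess
        # With preload=True every file is read once up front; otherwise on demand.
        self.cache = [self._read(path) for path in self.files] if preload else None

    @staticmethod
    def _read(path):
        with open(path, newline="") as handle:
            return list(csv.DictReader(handle))

    def __len__(self):
        return len(self.files)

    def __getitem__(self, index):
        if self.cache is not None:
            rows = self.cache[index]
        else:
            rows = self._read(self.files[index])
        # Preprocessing is applied per batch, at access time.
        return self.preprocess(rows)


def to_floats(rows):
    """Hypothetical preprocessing step: convert the 'value' column to floats."""
    return [float(row["value"]) for row in rows]


with tempfile.TemporaryDirectory() as folder:
    # Write three small CSV files, two rows each.
    for i in range(3):
        (Path(folder) / f"part-{i}.csv").write_text(f"value\n{i}\n{i + 10}\n")

    lazy = FolderBatches(folder, to_floats)
    eager = FolderBatches(folder, to_floats, preload=True)
    lazy_batches = [lazy[i] for i in range(len(lazy))]
    eager_batches = [eager[i] for i in range(len(eager))]
```

Both modes yield the same batches; the difference is only when the files are read, which is the trade-off between memory use and loading speed.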
- Attributes:
- num_samples: Returns the number of samples in the dataset.
Methods
- __call__(*args, **kwargs): Call self as a function.
- convert_to_validation(model): Converts the dataset to validation mode by resetting instance variables.
- get_preloaded_data(): Loads all data from the files into memory and returns it as a pandas.DataFrame.
- get_train_dataset(model, data_folder[, ...]): A helper function to get a train dataset with the provided parameters.
- get_validation_dataset(model, data_folder[, ...]): A helper function to get a validation dataset with the provided parameters.
- convert_to_validation(model)
Converts the dataset to validation mode by resetting instance variables.
- Parameters:
- model : AutoEncoder
The autoencoder model used to get the relevant parameters and the preprocessing function.
- get_preloaded_data()
Loads all data from the files into memory and returns it as a pandas.DataFrame.
- static get_train_dataset(model, data_folder, load_data_fn=pandas.read_csv, preload_data_into_memory=False)
A helper function to get a train dataset with the provided parameters.
- Parameters:
- model : AutoEncoder
The autoencoder model used to get the relevant parameters and the preprocessing function.
- data_folder : str
The path to the folder containing the data.
- load_data_fn : function, optional
A function for loading data from a provided file path into a pandas.DataFrame, by default pandas.read_csv.
- preload_data_into_memory : bool, optional
Whether to preload all the data into memory, by default False.
- Returns:
- DatasetFromPath
Training dataset set up to load from the path.
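Since `load_data_fn` is described as a function that takes a file path and returns a pandas.DataFrame, a custom loader can be a thin wrapper around `pandas.read_csv`. The sketch below is a minimal example under that assumption; `load_csv_without_empty_rows` and the sample columns are hypothetical names used only for illustration. The call to `get_train_dataset` is left commented out because it requires a fitted AutoEncoder, which is not constructed here.

```python
import tempfile
from pathlib import Path

import pandas as pd


def load_csv_without_empty_rows(path):
    """Hypothetical loader: read one CSV file and drop rows that are entirely empty."""
    frame = pd.read_csv(path)
    return frame.dropna(how="all").reset_index(drop=True)


# Small demonstration on a temporary file with one empty record.
with tempfile.TemporaryDirectory() as folder:
    sample = Path(folder) / "part-0.csv"
    sample.write_text("user,bytes\nalice,10\n,\nbob,20\n")
    loaded = load_csv_without_empty_rows(sample)

# Passing the loader to the helper, using the signature documented above
# (assumes `model` is a fitted AutoEncoder and `data_folder` holds the CSV files):
# dataset = DatasetFromPath.get_train_dataset(
#     model, data_folder, load_data_fn=load_csv_without_empty_rows
# )
```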
- static get_validation_dataset(model, data_folder, load_data_fn=pandas.read_csv, preload_data_into_memory=True)
A helper function to get a validation dataset with the provided parameters.
- Parameters:
- model : AutoEncoder
The autoencoder model used to get the relevant parameters and the preprocessing function.
- data_folder : str
The path to the folder containing the data.
- load_data_fn : function, optional
A function for loading data from a provided file path into a pandas.DataFrame, by default pandas.read_csv.
- preload_data_into_memory : bool, optional
Whether to preload all the data into memory, by default True. Preloading can speed up data loading if the data fits into memory.
- Returns:
- DatasetFromPath
Validation dataset set up to load from the path.
- property num_samples
Returns the number of samples in the dataset.