morpheus.models.dfencoder.dataloader.DatasetFromPath
- class DatasetFromPath(data_folder, batch_size, preprocess_fn, load_data_fn=<function read_csv>, shuffle_rows_in_batch=True, shuffle_batch_indices=False, preload_data_into_memory=False)
Bases: torch.utils.data.dataset.Dataset
A dataset class that reads data in batches from a folder and applies preprocessing to each batch. This class assumes that the data is saved as small CSV files in a single folder.
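The folder-of-small-CSV-files assumption can be made concrete with a minimal, standard-library-only sketch of how batch indices might map onto files. This illustrates the idea only and is not taken from the Morpheus implementation: the helper names `count_rows`, `build_batch_index`, and `read_batch` are hypothetical, and the real class additionally applies `preprocess_fn` and optional shuffling.

```python
import csv
import os


def count_rows(path):
    """Count data rows in a CSV file, excluding the header line."""
    with open(path, newline="") as f:
        return max(sum(1 for _ in csv.reader(f)) - 1, 0)


def build_batch_index(data_folder, batch_size):
    """Return a list of (file_path, start_row, end_row) tuples, one per batch.

    Batches never span two files, so the last batch of a file may be short.
    """
    index = []
    for name in sorted(os.listdir(data_folder)):
        if not name.endswith(".csv"):
            continue
        path = os.path.join(data_folder, name)
        n_rows = count_rows(path)
        for start in range(0, n_rows, batch_size):
            index.append((path, start, min(start + batch_size, n_rows)))
    return index


def read_batch(entry):
    """Load the rows belonging to one batch as a list of dicts."""
    path, start, end = entry
    with open(path, newline="") as f:
        rows = list(csv.DictReader(f))
    return rows[start:end]
```

Because each batch touches only one file in this sketch, small files keep the per-batch read cost low, which is presumably why the class expects the data to be split that way.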
- Attributes
num_samples
Returns the number of samples in the dataset.
Methods
convert_to_validation(model)
Converts the dataset to validation mode by resetting instance variables.
get_preloaded_data()
Loads all data from the files into memory and returns it as a pandas.DataFrame.
get_train_dataset(model, data_folder[, ...])
A helper function to get a train dataset with the provided parameters.
get_validation_dataset(model, data_folder[, ...])
A helper function to get a validation dataset with the provided parameters.
- convert_to_validation(model)
Converts the dataset to validation mode by resetting instance variables.
- Parameters
- model : AutoEncoder
The autoencoder model used to get the relevant parameters and the preprocessing function.
- get_preloaded_data()
Loads all data from the files into memory and returns it as a pandas.DataFrame.
- static get_train_dataset(model, data_folder, load_data_fn=<function read_csv>, preload_data_into_memory=False)
A helper function to get a train dataset with the provided parameters.
- Parameters
- model : AutoEncoder
The autoencoder model used to get the relevant parameters and the preprocessing function.
- data_folder : str
The path to the folder containing the data.
- load_data_fn : function, optional
A function for loading data from a provided file path into a pandas.DataFrame, by default pd.read_csv.
- preload_data_into_memory : bool, optional
Whether to preload all the data into memory, by default False.
- Returns
- DatasetFromPath
Training dataset set up to load from the path.
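Since `load_data_fn` only needs to accept a file path and return a `pandas.DataFrame`, a custom loader can stand in for `pd.read_csv`. The sketch below is one possible loader under stated assumptions: the function name `load_selected_columns` and the column names in `FEATURE_COLUMNS` are hypothetical. The commented-out call follows the `get_train_dataset` signature documented here and would require Morpheus and an `AutoEncoder` instance.

```python
import pandas as pd

# Hypothetical column subset; adjust to the features the model was built for.
FEATURE_COLUMNS = ["user", "bytes_sent"]


def load_selected_columns(path):
    """Read one CSV file and keep only the feature columns as a DataFrame."""
    return pd.read_csv(path, usecols=FEATURE_COLUMNS)


# With Morpheus installed and an AutoEncoder instance named `model`
# (both assumed here), the loader would be passed through as:
# train_ds = DatasetFromPath.get_train_dataset(
#     model, "path/to/train_folder", load_data_fn=load_selected_columns)
```

Dropping unused columns at load time keeps each batch smaller, which matters most when `preload_data_into_memory` is enabled.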
- static get_validation_dataset(model, data_folder, load_data_fn=<function read_csv>, preload_data_into_memory=True)
A helper function to get a validation dataset with the provided parameters.
- Parameters
- model : AutoEncoder
The autoencoder model used to get the relevant parameters and the preprocessing function.
- data_folder : str
The path to the folder containing the data.
- load_data_fn : function, optional
A function for loading data from a provided file path into a pandas.DataFrame, by default pd.read_csv.
- preload_data_into_memory : bool, optional
Whether to preload all the data into memory, by default True. Preloading can speed up data loading if the data fits into memory.
- Returns
- DatasetFromPath
Validation Dataset set up to load from the path.
- property num_samples
Returns the number of samples in the dataset.