morpheus.models.dfencoder.dataloader.DatasetFromPath

class DatasetFromPath(data_folder, batch_size, preprocess_fn, load_data_fn=<function read_csv>, shuffle_rows_in_batch=True, shuffle_batch_indices=False, preload_data_into_memory=False)

Bases: torch.utils.data.dataset.Dataset

A dataset class that reads data in batches from a folder and applies preprocessing to each batch. This class assumes that the data is saved as small CSV files in a single folder.
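The following is a minimal, standard-library-only sketch of the general idea described above: a folder of small CSV files is split into fixed-size batches, and each batch is read on demand. It is not the Morpheus implementation. The helper names `list_batches` and `read_batch` are hypothetical, and the sketch assumes that a batch never spans two files, which this page does not state.

```python
import csv
import tempfile
from pathlib import Path


def list_batches(data_folder, batch_size):
    """Return (path, start_row, end_row) triples covering every data row in the folder."""
    batches = []
    for path in sorted(Path(data_folder).glob("*.csv")):
        with path.open(newline="") as f:
            n_rows = sum(1 for _ in csv.reader(f)) - 1  # exclude the header row
        # Assumption: batches are cut per file and never cross a file boundary.
        for start in range(0, n_rows, batch_size):
            batches.append((path, start, min(start + batch_size, n_rows)))
    return batches


def read_batch(batch):
    """Load the rows of a single batch as a list of dicts."""
    path, start, end = batch
    with path.open(newline="") as f:
        rows = list(csv.DictReader(f))
    return rows[start:end]


# Demo: two small files holding 5 and 3 rows, read with a batch size of 4.
with tempfile.TemporaryDirectory() as folder:
    for name, n in [("a.csv", 5), ("b.csv", 3)]:
        with open(Path(folder) / name, "w", newline="") as f:
            writer = csv.writer(f)
            writer.writerow(["x"])
            writer.writerows([[i] for i in range(n)])
    batches = list_batches(folder, batch_size=4)
    batch_sizes = [len(read_batch(b)) for b in batches]
    num_samples = sum(batch_sizes)
```

In this sketch the total row count plays the role that the `num_samples` attribute plays for the real class, and a preprocessing function would be applied to the output of `read_batch`.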

Attributes
num_samples

Returns the number of samples in the dataset.

Methods

convert_to_validation(model)

Converts the dataset to validation mode by resetting instance variables.

get_preloaded_data()

Loads all data from the files into memory and returns it as a pandas.DataFrame.

get_train_dataset(model, data_folder[, ...])

A helper function to get a train dataset with the provided parameters.

get_validation_dataset(model, data_folder[, ...])

A helper function to get a validation dataset with the provided parameters.

convert_to_validation(model)

Converts the dataset to validation mode by resetting instance variables.

Parameters
model : AutoEncoder

The autoencoder model used to get relevant params and the preprocessing func.

get_preloaded_data()

Loads all data from the files into memory and returns it as a pandas.DataFrame.

static get_train_dataset(model, data_folder, load_data_fn=<function read_csv>, preload_data_into_memory=False)

A helper function to get a train dataset with the provided parameters.

Parameters
model : AutoEncoder

The autoencoder model used to get relevant params and the preprocessing func.

data_folder : str

The path to the folder containing the data.

load_data_fn : function, optional

A function for loading data from a provided file path into a pandas.DataFrame, by default pd.read_csv.

preload_data_into_memory : bool, optional

Whether to preload all the data into memory, by default False.

Returns
DatasetFromPath

Training dataset set up to load from the path.
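As an illustration of the `load_data_fn` parameter, the sketch below passes a custom loader to `get_train_dataset`. The loader `load_semicolon_csv` and the wrapper `build_train_dataset` are hypothetical names introduced here. The sketch assumes that `model` is an already constructed AutoEncoder and that Morpheus is installed. How the class discovers files inside `data_folder` is not documented on this page, so the example keeps the files as CSV and only changes the delimiter.

```python
import pandas as pd


def load_semicolon_csv(path):
    """Hypothetical loader: read one semicolon-delimited CSV file into a pandas.DataFrame."""
    return pd.read_csv(path, sep=";")


def build_train_dataset(model, data_folder):
    """Build a training dataset that uses the custom loader (requires Morpheus)."""
    # Imported lazily so the loader above can be used and tested without Morpheus.
    from morpheus.models.dfencoder.dataloader import DatasetFromPath

    return DatasetFromPath.get_train_dataset(
        model,
        data_folder,
        load_data_fn=load_semicolon_csv,
        preload_data_into_memory=False,  # the documented default for training
    )
```

Any callable that takes a file path and returns a `pandas.DataFrame` should fit the same slot, according to the parameter description above.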

static get_validation_dataset(model, data_folder, load_data_fn=<function read_csv>, preload_data_into_memory=True)

A helper function to get a validation dataset with the provided parameters.

Parameters
model : AutoEncoder

The autoencoder model used to get relevant params and the preprocessing func.

data_folder : str

The path to the folder containing the data.

load_data_fn : function, optional

A function for loading data from a provided file path into a pandas.DataFrame, by default pd.read_csv.

preload_data_into_memory : bool, optional

Whether to preload all the data into memory, by default True. Preloading can speed up data loading if the data fits into memory.

Returns
DatasetFromPath

Validation Dataset set up to load from the path.

property num_samples

Returns the number of samples in the dataset.

© Copyright 2023, NVIDIA. Last updated on Apr 11, 2023.