morpheus.models.dfencoder.dataloader.DatasetFromPath

class DatasetFromPath(*args, **kwargs)

Bases: torch.utils.data.Dataset

A dataset class that reads data in batches from a folder and applies preprocessing to each batch. This class assumes that the data is saved in small CSV files in one folder.
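The pattern described above can be illustrated with a minimal sketch. This is not the Morpheus implementation; `iter_batches`, `batch_size`, and `preprocess_fn` are hypothetical names introduced only for this example, and the sketch assumes every file in the folder has a `.csv` extension.

```python
from pathlib import Path
from typing import Callable, Iterator

import pandas as pd


def iter_batches(
    data_folder: str,
    batch_size: int,
    preprocess_fn: Callable[[pd.DataFrame], pd.DataFrame],
    load_data_fn: Callable[[str], pd.DataFrame] = pd.read_csv,
) -> Iterator[pd.DataFrame]:
    """Yield preprocessed batches from every CSV file in a folder (hypothetical helper)."""
    # Sort the paths so the file order is deterministic across runs.
    for path in sorted(Path(data_folder).glob("*.csv")):
        df = load_data_fn(str(path))
        # Slice each small file into consecutive batches of at most batch_size rows.
        for start in range(0, len(df), batch_size):
            yield preprocess_fn(df.iloc[start:start + batch_size])
```

Only one small file is held in memory at a time, which is the main reason to store the data as many small files rather than one large one.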

Attributes:
num_samples

Returns the number of samples in the dataset.

Methods

__call__(*args, **kwargs)

Call self as a function.

convert_to_validation(model)

Converts the dataset to validation mode by resetting instance variables.

get_preloaded_data()

Loads all data from the files into memory and returns it as a pandas.DataFrame.

get_train_dataset(model, data_folder[, ...])

A helper function to get a train dataset with the provided parameters.

get_validation_dataset(model, data_folder[, ...])

A helper function to get a validation dataset with the provided parameters.

convert_to_validation(model)

Converts the dataset to validation mode by resetting instance variables.

Parameters:
model : AutoEncoder

The autoencoder model used to get the relevant parameters and the preprocessing function.

get_preloaded_data()

Loads all data from the files into memory and returns it as a pandas.DataFrame.
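As a rough sketch of what loading a whole folder into one pandas.DataFrame can look like (an illustration under stated assumptions, not the library's code): `preload_folder` is a hypothetical helper, and it assumes all files end in `.csv` and share the same columns.

```python
from pathlib import Path
from typing import Callable

import pandas as pd


def preload_folder(
    data_folder: str,
    load_data_fn: Callable[[str], pd.DataFrame] = pd.read_csv,
) -> pd.DataFrame:
    """Load every CSV file in data_folder and return one combined DataFrame (hypothetical helper)."""
    paths = sorted(Path(data_folder).glob("*.csv"))
    frames = [load_data_fn(str(path)) for path in paths]
    if not frames:
        # pd.concat raises on an empty list, so return an empty frame instead.
        return pd.DataFrame()
    # ignore_index=True renumbers rows 0..n-1 instead of repeating per-file indices.
    return pd.concat(frames, ignore_index=True)
```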

static get_train_dataset(model, data_folder, load_data_fn=pandas.read_csv, preload_data_into_memory=False)

A helper function to get a train dataset with the provided parameters.

Parameters:
model : AutoEncoder

The autoencoder model used to get the relevant parameters and the preprocessing function.

data_folder : str

The path to the folder containing the data.

load_data_fn : function, optional

A function for loading data from a provided file path into a pandas.DataFrame, by default pandas.read_csv.

preload_data_into_memory : bool, optional

Whether to preload all the data into memory, by default False.

Returns:
DatasetFromPath

Training Dataset set up to load from the path.
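A possible way to pass a custom `load_data_fn`, shown as a sketch only. The call uses the signature documented above; `load_semicolon_csv` and `build_train_dataset` are hypothetical names, the semicolon delimiter is an assumption made for the example, and `model` is assumed to be an already constructed AutoEncoder.

```python
import functools

import pandas as pd

# Any callable that maps a file path to a pandas.DataFrame can serve as load_data_fn.
# Here: CSV files that use ';' as the delimiter (an assumption made for this example).
load_semicolon_csv = functools.partial(pd.read_csv, sep=";")


def build_train_dataset(model, data_folder):
    """Hypothetical wrapper; `model` is assumed to be an already constructed AutoEncoder."""
    # Imported lazily so the loader above can be tried without Morpheus installed.
    from morpheus.models.dfencoder.dataloader import DatasetFromPath

    return DatasetFromPath.get_train_dataset(
        model,
        data_folder,
        load_data_fn=load_semicolon_csv,
        preload_data_into_memory=False,
    )
```

Using `functools.partial` keeps the loader a plain one-argument callable, which matches the documented contract of taking a file path and returning a pandas.DataFrame.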

static get_validation_dataset(model, data_folder, load_data_fn=pandas.read_csv, preload_data_into_memory=True)

A helper function to get a validation dataset with the provided parameters.

Parameters:
model : AutoEncoder

The autoencoder model used to get the relevant parameters and the preprocessing function.

data_folder : str

The path to the folder containing the data.

load_data_fn : function, optional

A function for loading data from a provided file path into a pandas.DataFrame, by default pandas.read_csv.

preload_data_into_memory : bool, optional

Whether to preload all the data into memory, by default True. Preloading can speed up data loading if the data fits into memory.

Returns:
DatasetFromPath

Validation Dataset set up to load from the path.
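Whether preloading is safe depends on the size of the data. The following sketch shows one rough way to choose a value for `preload_data_into_memory`; `should_preload`, `memory_budget_bytes`, and `expansion_factor` are hypothetical names, and the default factor of 3.0 is an arbitrary assumption, since the in-memory size of a DataFrame can differ considerably from the on-disk CSV size.

```python
from pathlib import Path


def should_preload(
    data_folder: str,
    memory_budget_bytes: int,
    expansion_factor: float = 3.0,
) -> bool:
    """Guess whether a folder of CSV files fits in a memory budget (hypothetical heuristic)."""
    # Total size of the CSV files on disk, in bytes.
    on_disk = sum(path.stat().st_size for path in Path(data_folder).glob("*.csv"))
    # expansion_factor is an assumed ratio of in-memory size to on-disk size.
    return on_disk * expansion_factor <= memory_budget_bytes
```

The result could then be passed as `preload_data_into_memory` when building the validation dataset.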

property num_samples

Returns the number of samples in the dataset.
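For intuition, a sample count over a folder of CSV files can be obtained without loading the data into pandas. This is a sketch only and says nothing about how the property is implemented; `count_samples` is a hypothetical helper, and it assumes each file has exactly one header row and a `.csv` extension.

```python
import csv
from pathlib import Path


def count_samples(data_folder: str) -> int:
    """Count data rows across all CSV files in a folder (hypothetical helper)."""
    total = 0
    for path in sorted(Path(data_folder).glob("*.csv")):
        with open(path, newline="") as handle:
            rows = sum(1 for _ in csv.reader(handle))
        # Subtract the assumed header row; an empty file contributes zero.
        total += max(rows - 1, 0)
    return total
```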

© Copyright 2023, NVIDIA. Last updated on Oct 12, 2023.