
morpheus.models.dfencoder.dataloader.DFEncoderDataLoader

class DFEncoderDataLoader(*args, **kwargs)

Bases: torch.utils.data.DataLoader

Methods

`__call__`(*args, **kwargs)	Call self as a function.
`get_distributed_training_dataloader_from_dataset`(...)	Returns a distributed training DataLoader given a dataset and other arguments.
`get_distributed_training_dataloader_from_df`(...)	A helper function to get a distributed training DataLoader given a pandas dataframe.
`get_distributed_training_dataloader_from_path`(...)	A helper function to get a distributed training DataLoader given a path to a folder containing data.

static get_distributed_training_dataloader_from_dataset(dataset, rank, world_size, pin_memory=False, num_workers=0)

Returns a distributed training DataLoader given a dataset and other arguments.

Parameters

dataset (Dataset): The dataset to load the data from.
rank (int): The rank of the current process.
world_size (int): The number of processes to distribute the data across.
pin_memory (bool, optional): Whether to pin memory when loading data, by default False.
num_workers (int, optional): The number of worker processes to use for loading data, by default 0.

Returns

DataLoader: The training DataLoader with DistributedSampler for distributed training.
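
To make the roles of `rank` and `world_size` concrete, the following pure-Python sketch mimics how a `torch.utils.data.DistributedSampler` assigns dataset indices to each process. It assumes shuffling is disabled and `drop_last=False`; whether this method enables shuffling is not stated here, so treat the ordering as illustrative only. `shard_indices` is a hypothetical helper and not part of Morpheus or PyTorch.

```python
import math


def shard_indices(dataset_len, rank, world_size):
    """Return the dataset indices one rank would iterate over.

    Hypothetical helper mirroring DistributedSampler's index assignment,
    assuming shuffle=False and drop_last=False.
    """
    if not 0 <= rank < world_size:
        raise ValueError("rank must satisfy 0 <= rank < world_size")

    per_rank = math.ceil(dataset_len / world_size)  # samples per process
    total = per_rank * world_size                   # padded total size

    indices = list(range(dataset_len))
    padding = total - dataset_len
    if padding:
        # Wrap around to the start so every rank gets the same number of samples.
        repeats = math.ceil(padding / dataset_len)
        indices += (indices * repeats)[:padding]

    # Each rank takes every world_size-th index, starting at its own rank.
    return indices[rank:total:world_size]


# Ten samples spread across four processes.
shards = [shard_indices(10, rank, 4) for rank in range(4)]
```

Each process receives the same number of samples, and the last ranks reuse indices from the start of the dataset when the length is not divisible by `world_size`.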

static get_distributed_training_dataloader_from_df(model, df, rank, world_size, pin_memory=False, num_workers=0)

A helper function to get a distributed training DataLoader given a pandas dataframe.

Parameters

model (AutoEncoder): The autoencoder model used to get relevant params and the preprocessing func.
df (pandas.DataFrame): The pandas dataframe containing the data.
rank (int): The rank of the current process.
world_size (int): The number of processes to distribute the data across.
pin_memory (bool, optional): Whether to pin memory when loading data, by default False.
num_workers (int, optional): The number of worker processes to use for loading data, by default 0.

Returns

DFEncoderDataLoader: The training DataLoader with DistributedSampler for distributed training.

static get_distributed_training_dataloader_from_path(model, data_folder, rank, world_size, load_data_fn=pandas.read_csv, pin_memory=False, num_workers=0)

A helper function to get a distributed training DataLoader given a path to a folder containing data.

Parameters

model (AutoEncoder): The autoencoder model used to get relevant params and the preprocessing func.
data_folder (str): The path to the folder containing the data.
rank (int): The rank of the current process.
world_size (int): The number of processes to distribute the data across.
load_data_fn (function, optional): A function for loading data from a provided file path into a pandas.DataFrame, by default pandas.read_csv.
pin_memory (bool, optional): Whether to pin memory when loading data, by default False.
num_workers (int, optional): The number of worker processes to use for loading data, by default 0.

Returns

DFEncoderDataLoader: The training DataLoader with DistributedSampler for distributed training.
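
When the files in `data_folder` are not plain comma-separated CSV, a custom `load_data_fn` can be supplied. The sketch below shows one possible loader for tab-separated files. `load_tsv` and the sample file name are hypothetical, and the sketch assumes the loader is called with a single file path and must return a `pandas.DataFrame`, as the parameter description suggests.

```python
import os
import tempfile

import pandas as pd


def load_tsv(path):
    """Hypothetical loader: parse one tab-separated file into a DataFrame."""
    return pd.read_csv(path, sep="\t")


# Quick local check of the loader on a throwaway file.
with tempfile.TemporaryDirectory() as data_folder:
    sample_path = os.path.join(data_folder, "part-0.tsv")
    with open(sample_path, "w", encoding="utf-8") as f:
        f.write("user\tbytes\nalice\t10\nbob\t20\n")
    frame = load_tsv(sample_path)

# Passing the loader to the documented helper would then look like this
# (assumes `model`, `data_folder`, `rank` and `world_size` already exist):
# loader = DFEncoderDataLoader.get_distributed_training_dataloader_from_path(
#     model, data_folder, rank, world_size, load_data_fn=load_tsv)
```

Any callable with the same shape works, for example `functools.partial(pd.read_csv, sep="\t")`.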
