NVIDIA Morpheus (24.06)

morpheus.models.dfencoder.dataloader.DFEncoderDataLoader

class DFEncoderDataLoader(*args, **kwargs)[source]

Bases: torch.utils.data.DataLoader

Methods

__call__(*args, **kwargs) Call self as a function.
get_distributed_training_dataloader_from_dataset(...) Returns a distributed training DataLoader given a dataset and other arguments.
get_distributed_training_dataloader_from_df(...) A helper function to get a distributed training DataLoader given a pandas dataframe.
get_distributed_training_dataloader_from_path(...) A helper function to get a distributed training DataLoader given a path to a folder containing data.

static get_distributed_training_dataloader_from_dataset(dataset, rank, world_size, pin_memory=False, num_workers=0)[source]

Returns a distributed training DataLoader given a dataset and other arguments.

Parameters
dataset

The dataset to load the data from.

rank

The rank of the current process.

world_size

The number of processes to distribute the data across.

pin_memory

Whether to pin memory when loading data, by default False.

num_workers

The number of worker processes to use for loading data, by default 0.

Returns
DataLoader

The training DataLoader with DistributedSampler for distributed training.
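
To make the roles of rank and world_size concrete, the sketch below reproduces in plain Python the index partitioning that torch.utils.data.DistributedSampler performs when shuffling is off and drop_last is False. It is an illustration only: shard_indices is a hypothetical helper, not part of Morpheus or PyTorch, and the sampler options that this method actually passes are not stated on this page.

```python
import math
from typing import List


def shard_indices(dataset_len: int, rank: int, world_size: int) -> List[int]:
    """Return the sample indices one process would read (hypothetical helper, no shuffling)."""
    if not 0 <= rank < world_size:
        raise ValueError("rank must be in [0, world_size)")
    # Every rank receives the same number of samples, so the index list is
    # padded by wrapping around to the start of the dataset.
    per_rank = math.ceil(dataset_len / world_size)
    total = per_rank * world_size
    indices = [i % dataset_len for i in range(total)]
    # Strided slice: rank r takes positions r, r + world_size, r + 2 * world_size, ...
    return indices[rank:total:world_size]


if __name__ == "__main__":
    # 10 samples split across 3 processes; each process gets 4 indices.
    for r in range(3):
        print(r, shard_indices(10, r, 3))
```

With shuffling enabled, which is the DistributedSampler default, the indices are permuted before this split, so each process still receives an equally sized, mostly disjoint share of the dataset.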

static get_distributed_training_dataloader_from_df(model, df, rank, world_size, pin_memory=False, num_workers=0)[source]

A helper function to get a distributed training DataLoader given a pandas dataframe.

Parameters
model

The autoencoder model used to get relevant params and the preprocessing func.

df

The pandas dataframe containing the data.

rank

The rank of the current process.

world_size

The number of processes to distribute the data across.

pin_memory

Whether to pin memory when loading data, by default False.

num_workers

The number of worker processes to use for loading data, by default 0.

Returns
DFEncoderDataLoader

The training DataLoader with DistributedSampler for distributed training.

static get_distributed_training_dataloader_from_path(model, data_folder, rank, world_size, load_data_fn=pandas.read_csv, pin_memory=False, num_workers=0)[source]

A helper function to get a distributed training DataLoader given a path to a folder containing data.

Parameters
model

The autoencoder model used to get relevant params and the preprocessing func.

data_folder

The path to the folder containing the data.

rank

The rank of the current process.

world_size

The number of processes to distribute the data across.

load_data_fn

A function for loading data from a provided file path into a pandas.DataFrame, by default pandas.read_csv.

pin_memory

Whether to pin memory when loading data, by default False.

num_workers

The number of worker processes to use for loading data, by default 0.

Returns
DFEncoderDataLoader

The training DataLoader with DistributedSampler for distributed training.
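
As an illustration of the load_data_fn parameter, here is a minimal sketch of a custom loader for tab-separated files. It assumes, based on the parameter description, that the function is called with a single file path and must return a pandas.DataFrame. The names load_tsv and demo, the file name, and the column names are hypothetical.

```python
import os
import tempfile

import pandas as pd


def load_tsv(path) -> pd.DataFrame:
    """Hypothetical loader: read one tab-separated file into a DataFrame."""
    return pd.read_csv(path, sep="\t")


def demo() -> pd.DataFrame:
    """Write a small sample file and load it back with load_tsv."""
    with tempfile.TemporaryDirectory() as folder:
        file_path = os.path.join(folder, "part-0.tsv")
        with open(file_path, "w", encoding="utf-8") as handle:
            handle.write("user\tbytes\nalice\t10\nbob\t20\n")
        # The DataFrame is fully read into memory before the folder is removed.
        return load_tsv(file_path)


if __name__ == "__main__":
    print(demo())
```

A loader like this would be supplied as load_data_fn=load_tsv in place of the default pandas.read_csv.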

© Copyright 2024, NVIDIA. Last updated on Jul 8, 2024.