Modulus Datapipes - NVIDIA Docs

Benchmark datapipes

class modulus.datapipes.benchmarks.darcy.Darcy2D(resolution: int = 256, batch_size: int = 64, nr_permeability_freq: int = 5, max_permeability: float = 2.0, min_permeability: float = 0.5, max_iterations: int = 30000, convergence_threshold: float = 1e-06, iterations_per_convergence_check: int = 1000, nr_multigrids: int = 4, normaliser: Optional[Dict[str, Tuple[float, float]]] = None, device: Union[str, device] = 'cuda')[source]

Bases: Datapipe

2D Darcy flow benchmark problem datapipe.

This datapipe continuously generates solutions to the 2D Darcy equation with variable permeability. All samples are generated on the fly, and the datapipe is meant to be a benchmark problem for testing data-driven models. The permeability is drawn from a random Fourier series and then thresholded to give a piecewise constant function. The solution is obtained using a GPU-enabled multigrid Jacobi iterative method.
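The thresholded random Fourier series described above can be illustrated with a minimal, CPU-only sketch. It is not the Modulus implementation: the function name `sample_permeability`, the Gaussian amplitudes, the random phases, and the threshold at zero are all assumptions made for illustration. Only the parameter names mirror `nr_permeability_freq`, `min_permeability`, and `max_permeability` from the signature.

```python
import math
import random


def sample_permeability(resolution=16, nr_freq=5, min_perm=0.5, max_perm=2.0, seed=0):
    """Return a resolution x resolution grid holding only min_perm or max_perm."""
    rng = random.Random(seed)
    # One random amplitude and phase per (kx, ky) mode (assumed distribution).
    modes = [
        (kx, ky, rng.gauss(0.0, 1.0), rng.uniform(0.0, 2.0 * math.pi))
        for kx in range(nr_freq)
        for ky in range(nr_freq)
        if (kx, ky) != (0, 0)  # skip the constant mode
    ]
    field = []
    for i in range(resolution):
        x = i / resolution
        row = []
        for j in range(resolution):
            y = j / resolution
            value = sum(
                amp * math.cos(2.0 * math.pi * (kx * x + ky * y) + phase)
                for kx, ky, amp, phase in modes
            )
            # Threshold at zero to obtain a piecewise constant permeability.
            row.append(max_perm if value > 0.0 else min_perm)
        field.append(row)
    return field


field = sample_permeability(resolution=8, nr_freq=3, seed=1)
for row in field:
    print(" ".join(f"{value:.1f}" for value in row))
```

Every cell of the resulting grid holds one of the two permeability values, which gives the piecewise constant structure that the solver then works with.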

Parameters
Raises

generate_batch() → None[source]

initialize_batch() → None[source]

class modulus.datapipes.benchmarks.darcy.MetaData(name: str = 'Darcy2D', auto_device: bool = True, cuda_graphs: bool = True, ddp_sharding: bool = False)[source]

class modulus.datapipes.benchmarks.kelvin_helmholtz.KelvinHelmholtz2D(resolution: int = 512, batch_size: int = 16, seq_length: int = 8, nr_perturbation_freq: int = 5, perturbation_range: float = 0.1, nr_snapshots: int = 256, iteration_per_snapshot: int = 32, gamma: float = 1.6666666666666667, normaliser: Optional[Dict[str, Tuple[float, float]]] = None, device: Union[str, device] = 'cuda')[source]

Bases: Datapipe

Kelvin-Helmholtz instability benchmark problem datapipe.

This datapipe continuously generates samples with random initial conditions. All samples are generated on the fly, and the datapipe is meant to be a benchmark problem for testing data-driven models. Initial conditions are given in the form of small perturbations. The solution is obtained using a GPU-enabled finite volume method.
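As a rough illustration of "small perturbations" on an initial condition, the sketch below builds a shear layer and adds a bounded random perturbation to the cross-stream velocity. The layout of the shear layer, the sinusoidal form of the perturbation, and the function name `perturbed_shear_layer` are assumptions, not taken from the Modulus source. Only the parameter names echo `nr_perturbation_freq` and `perturbation_range`.

```python
import math
import random


def perturbed_shear_layer(resolution=32, nr_freq=5, perturbation_range=0.1, seed=0):
    """Return (vx, vy) grids for a shear layer with a small random vy perturbation."""
    rng = random.Random(seed)
    # One amplitude and phase per frequency. Amplitudes are scaled so that the
    # summed perturbation can never exceed perturbation_range in magnitude.
    modes = [
        (k,
         rng.uniform(-1.0, 1.0) * perturbation_range / nr_freq,
         rng.uniform(0.0, 2.0 * math.pi))
        for k in range(1, nr_freq + 1)
    ]
    vx, vy = [], []
    for j in range(resolution):
        y = (j + 0.5) / resolution
        # Middle band moves right, outer bands move left (assumed layout).
        band_velocity = 0.5 if 0.25 < y < 0.75 else -0.5
        vx.append([band_velocity] * resolution)
        vy.append([
            sum(a * math.sin(2.0 * math.pi * k * (i + 0.5) / resolution + p)
                for k, a, p in modes)
            for i in range(resolution)
        ])
    return vx, vy


vx, vy = perturbed_shear_layer(resolution=16, seed=3)
print(max(abs(v) for row in vy for v in row) <= 0.1)
```

Different seeds give different perturbations of the same base flow, which is the sense in which each generated sample starts from a random initial condition.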

Parameters

generate_batch() → None[source]

initialize_batch() → None[source]

class modulus.datapipes.benchmarks.kelvin_helmholtz.MetaData(name: str = 'KelvinHelmholtz2D', auto_device: bool = True, cuda_graphs: bool = True, ddp_sharding: bool = False)[source]

Weather and climate datapipes

class modulus.datapipes.climate.era5_hdf5.ERA5DaliExternalSource(data_paths: Iterable[str], num_samples: int, channels: Iterable[int], num_steps: int, stride: int, num_samples_per_year: int, batch_size: int = 1, shuffle: bool = True, process_rank: int = 0, world_size: int = 1)[source]

Bases: object

DALI Source for lazy-loading the HDF5 ERA5 files

Parameters

Note

For more information about the DALI external source operator, see: https://docs.nvidia.com/deeplearning/dali/archives/dali_1_13_0/user-guide/docs/examples/general/data_loading/parallel_external_source.html
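To make the callable-source pattern concrete, here is a minimal sketch of how such a source might shard sample indices across ranks and map each index to a yearly file and an offset within it. It does not import DALI or read HDF5: the class name `ShardedIndexSource`, the contiguous sharding scheme, and the file names are hypothetical, and a `SimpleNamespace` stands in for the sample-info object (with `idx_in_epoch`, `idx_in_batch`, `iteration`, and `epoch_idx` attributes) that DALI passes to a per-sample callable.

```python
from types import SimpleNamespace


class ShardedIndexSource:
    """Hypothetical callable mapping a DALI-style sample index to (file, offset)."""

    def __init__(self, data_paths, num_samples_per_year, batch_size=1,
                 process_rank=0, world_size=1):
        self.data_paths = list(data_paths)
        self.num_samples_per_year = num_samples_per_year
        total = len(self.data_paths) * num_samples_per_year
        # Contiguous shard of the global index range for this rank (assumed scheme).
        shard = total // world_size
        self.indices = list(range(process_rank * shard, (process_rank + 1) * shard))
        # Only full batches are served; any remainder is dropped.
        self.num_batches = len(self.indices) // batch_size

    def __call__(self, sample_info):
        # The end of an epoch is signalled by raising StopIteration.
        if sample_info.iteration >= self.num_batches:
            raise StopIteration()
        idx = self.indices[sample_info.idx_in_epoch]
        year_idx, in_idx = divmod(idx, self.num_samples_per_year)
        # A real source would lazily open data_paths[year_idx] and read in_idx here.
        return self.data_paths[year_idx], in_idx


source = ShardedIndexSource(["1979.h5", "1980.h5"], num_samples_per_year=4,
                            batch_size=2, process_rank=1, world_size=2)
info = SimpleNamespace(idx_in_epoch=1, idx_in_batch=1, iteration=0, epoch_idx=0)
print(source(info))  # ('1980.h5', 1)
```

Deferring the file open to the call is what makes the loading lazy: no file is touched until a sample from it is actually requested.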

class modulus.datapipes.climate.era5_hdf5.ERA5HDF5Datapipe(data_dir: str, stats_dir: Optional[str] = None, channels: Optional[List[int]] = None, batch_size: int = 1, num_steps: int = 1, stride: int = 1, patch_size: Optional[Union[Tuple[int, int], int]] = None, num_samples_per_year: Optional[int] = None, shuffle: bool = True, num_workers: int = 1, device: Union[str, device] = 'cuda', process_rank: int = 0, world_size: int = 1)[source]

Bases: Datapipe

ERA5 DALI data pipeline for HDF5 files

Parameters

load_statistics() → None[source]

Loads ERA5 statistics from pre-computed numpy files

The statistics files should be named global_means.npy and global_std.npy, each with a shape of [1, C, 1, 1], and be located in stats_dir.

Raises

parse_dataset_files() → None[source]

Parses the data directory for valid HDF5 files and determines training samples

Raises

class modulus.datapipes.climate.era5_hdf5.MetaData(name: str = 'ERA5HDF5', auto_device: bool = True, cuda_graphs: bool = True, ddp_sharding: bool = True)[source]

Graph datapipes

class modulus.datapipes.gnn.mgn_dataset.MGNDataset(name='dataset', data_dir=None, split='train', num_samples=1000, num_steps=600, noise_std=0.02, force_reload=False, verbose=False)[source]

Bases: DGLDataset

In-memory MeshGraphNet dataset for stationary meshes.

Notes

This dataset prepares and processes the data available in MeshGraphNet’s repo:
https://github.com/deepmind/deepmind-research/tree/master/meshgraphnets
A single adjacency matrix is used for each transient simulation.
Do not use with adaptive meshes or remeshing.
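Because the mesh is stationary, its connectivity can be built once and reused for every time step of a simulation. The sketch below shows one common way to turn triangular cells into a bidirectional edge list. It is an illustration only and not the implementation of `cell_to_adj` or `create_graph`: the function name `cells_to_edges` and the choice of triangular cells are assumptions.

```python
def cells_to_edges(cells):
    """Return sorted, de-duplicated directed edges (src, dst) for triangular cells."""
    edges = set()
    for a, b, c in cells:
        for u, v in ((a, b), (b, c), (c, a)):
            # Store both directions so that message passing is symmetric.
            edges.add((u, v))
            edges.add((v, u))
    if not edges:
        return [], []
    src, dst = zip(*sorted(edges))
    return list(src), list(dst)


# Two triangles sharing the edge between nodes 1 and 2.
src, dst = cells_to_edges([(0, 1, 2), (1, 3, 2)])
print(src)  # [0, 0, 1, 1, 1, 2, 2, 2, 3, 3]
print(dst)  # [1, 2, 0, 2, 3, 0, 1, 3, 1, 2]
```

The shared edge appears only once per direction, so the two triangles yield five undirected edges and ten directed ones. With adaptive meshing the cells would change between time steps and this edge list would no longer be valid, which is why the dataset is restricted to stationary meshes.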

Parameters

static add_edge_features(graph, pos)[source]

static cell_to_adj(cells)[source]

static create_graph(src, dst, dtype=torch.int32)[source]

static denormalize(invar, mu, std)[source]

static normalize_edge(graph, mu, std)[source]

static normalize_node(invar, mu, std)[source]