Modulus Datapipes
- class modulus.datapipes.benchmarks.darcy.Darcy2D(resolution: int = 256, batch_size: int = 64, nr_permeability_freq: int = 5, max_permeability: float = 2.0, min_permeability: float = 0.5, max_iterations: int = 30000, convergence_threshold: float = 1e-06, iterations_per_convergence_check: int = 1000, nr_multigrids: int = 4, normaliser: Optional[Dict[str, Tuple[float, float]]] = None, device: Union[str, device] = 'cuda')[source]
Bases: Datapipe
2D Darcy flow benchmark problem datapipe.
This datapipe continuously generates solutions to the 2D Darcy equation with variable permeability. All samples are generated on the fly and are meant to serve as a benchmark problem for testing data-driven models. Permeability is drawn from a random Fourier series and thresholded to give a piecewise-constant field. The solution is obtained using a GPU-enabled multi-grid Jacobi iterative method.
- Parameters
resolution (int, optional) – Resolution to run simulation at, by default 256
batch_size (int, optional) – Batch size of simulations, by default 64
nr_permeability_freq (int, optional) – Number of frequencies to use for generating random permeability. Higher values will give higher freq permeability fields., by default 5
max_permeability (float, optional) – Max permeability, by default 2.0
min_permeability (float, optional) – Min permeability, by default 0.5
max_iterations (int, optional) – Maximum iterations to use for each multi-grid, by default 30000
convergence_threshold (float, optional) – Solver L-Infinity convergence threshold, by default 1e-6
iterations_per_convergence_check (int, optional) – Number of Jacobi iterations to run before checking convergence, by default 1000
nr_multigrids (int, optional) – Number of multi-grid levels, by default 4
normaliser (Union[Dict[str, Tuple[float, float]], None], optional) – Dictionary with keys permeability and darcy. The values for these keys are two floats corresponding to mean and std (mean, std).
device (Union[str, torch.device], optional) – Device on which the datapipe places data, by default “cuda”
- Raises
ValueError – Incompatible multi-grid and resolution settings
- generate_batch() → None[source]
Solves for a new batch of simulations
- initialize_batch() → None[source]
Initializes arrays for a new batch of simulations
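A minimal usage sketch, assuming the datapipe is iterated directly and yields dictionaries keyed by “permeability” and “darcy” (inferred from the normaliser description above); tensor shapes are not guaranteed by this reference:

from modulus.datapipes.benchmarks.darcy import Darcy2D

# Sketch: the datapipe generates batches indefinitely, so break manually.
datapipe = Darcy2D(resolution=64, batch_size=4, device="cuda")

for i, batch in enumerate(datapipe):
    permeability = batch["permeability"]  # piecewise-constant input field
    darcy = batch["darcy"]                # solved pressure field
    print(permeability.shape, darcy.shape)
    if i == 1:
        break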
- class modulus.datapipes.benchmarks.darcy.MetaData(name: str = 'Darcy2D', auto_device: bool = True, cuda_graphs: bool = True, ddp_sharding: bool = False)[source]
Bases: DatapipeMetaData
- class modulus.datapipes.benchmarks.kelvin_helmholtz.KelvinHelmholtz2D(resolution: int = 512, batch_size: int = 16, seq_length: int = 8, nr_perturbation_freq: int = 5, perturbation_range: float = 0.1, nr_snapshots: int = 256, iteration_per_snapshot: int = 32, gamma: float = 1.6666666666666667, normaliser: Optional[Dict[str, Tuple[float, float]]] = None, device: Union[str, device] = 'cuda')[source]
Bases: Datapipe
Kelvin-Helmholtz instability benchmark problem datapipe.
This datapipe continuously generates samples with random initial conditions. All samples are generated on the fly and are meant to serve as a benchmark problem for testing data-driven models. Initial conditions are given in the form of small perturbations. The solution is obtained using a GPU-enabled finite volume method.
- Parameters
resolution (int, optional) – Resolution to run simulation at, by default 512
batch_size (int, optional) – Batch size of simulations, by default 16
seq_length (int, optional) – Sequence length of output samples, by default 8
nr_perturbation_freq (int, optional) – Number of frequencies to use for generating random initial perturbations, by default 5
perturbation_range (float, optional) – Range to use for random perturbations. This value will be the max amplitude of the initial perturbation, by default 0.1
nr_snapshots (int, optional) – Number of snapshots of simulation to generate for data generation. This will control how long the simulation is run for, by default 256
iteration_per_snapshot (int, optional) – Number of finite volume steps to take between each snapshot. Each step size is fixed as the smallest possible value that satisfies the Courant-Friedrichs-Lewy condition, by default 32
gamma (float, optional) – Heat capacity ratio, by default 5.0/3.0
normaliser (Union[Dict[str, Tuple[float, float]], None], optional) – Dictionary with keys density, velocity, and pressure. The values for these keys are two floats corresponding to mean and std (mean, std).
device (Union[str, torch.device], optional) – Device on which the datapipe places data, by default “cuda”
- generate_batch() → None[source]
Solves for a new batch of simulations
- initialize_batch() → None[source]
Initializes arrays for a new batch of simulations
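A similar usage sketch, assuming batches are dictionaries keyed by “density”, “velocity”, and “pressure” (inferred from the normaliser description above):

from modulus.datapipes.benchmarks.kelvin_helmholtz import KelvinHelmholtz2D

# Sketch: each sample is a sequence of seq_length snapshots; the pipe
# generates batches indefinitely, so break manually.
datapipe = KelvinHelmholtz2D(resolution=128, batch_size=2, seq_length=8)

for i, batch in enumerate(datapipe):
    density = batch["density"]
    velocity = batch["velocity"]
    pressure = batch["pressure"]
    print(density.shape, velocity.shape, pressure.shape)
    if i == 0:
        break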
- class modulus.datapipes.benchmarks.kelvin_helmholtz.MetaData(name: str = 'KelvinHelmholtz2D', auto_device: bool = True, cuda_graphs: bool = True, ddp_sharding: bool = False)[source]
Bases: DatapipeMetaData
- class modulus.datapipes.climate.era5_hdf5.ERA5DaliExternalSource(data_paths: Iterable[str], num_samples: int, channels: Iterable[int], num_steps: int, stride: int, num_samples_per_year: int, batch_size: int = 1, shuffle: bool = True, process_rank: int = 0, world_size: int = 1)[source]
Bases: object
DALI Source for lazy-loading the HDF5 ERA5 files
- Parameters
data_paths (Iterable[str]) – Paths to the HDF5 files holding the ERA5 data
num_samples (int) – Total number of training samples
channels (Iterable[int]) – List representing which ERA5 variables to load
num_steps (int) – Number of timesteps included in the output variables
stride (int) – Number of steps between input and output variables
num_samples_per_year (int) – Number of samples randomly taken from each year
batch_size (int, optional) – Batch size, by default 1
shuffle (bool, optional) – Shuffle dataset, by default True
process_rank (int, optional) – Rank ID of local process, by default 0
world_size (int, optional) – Number of training processes, by default 1
Note: For more information about the DALI external source operator, see https://docs.nvidia.com/deeplearning/dali/archives/dali_1_13_0/user-guide/docs/examples/general/data_loading/parallel_external_source.html
- class modulus.datapipes.climate.era5_hdf5.ERA5HDF5Datapipe(data_dir: str, stats_dir: Optional[str] = None, channels: Optional[List[int]] = None, batch_size: int = 1, num_steps: int = 1, stride: int = 1, patch_size: Optional[Union[Tuple[int, int], int]] = None, num_samples_per_year: Optional[int] = None, shuffle: bool = True, num_workers: int = 1, device: Union[str, device] = 'cuda', process_rank: int = 0, world_size: int = 1)[source]
Bases: Datapipe
ERA5 DALI data pipeline for HDF5 files
- Parameters
data_dir (str) – Directory where ERA5 data is stored
stats_dir (Union[str, None], optional) – Directory to data statistic numpy files for normalization, if None, no normalization will be used, by default None
channels (Union[List[int], None], optional) – Defines which ERA5 variables to load, if None will use all in HDF5 file, by default None
batch_size (int, optional) – Batch size, by default 1
stride (int, optional) – Number of steps between input and output variables. For example, if the dataset contains data at every 6 hours, stride 1 gives a 6 hour delta-t and stride 2 gives a 12 hour delta-t, by default 1
num_steps (int, optional) – Number of timesteps included in the output variables, by default 1
patch_size (Union[Tuple[int, int], int, None], optional) – If specified, crops input and output variables so image dimensions are divisible by patch_size, by default None
num_samples_per_year (int, optional) – Number of samples randomly taken from each year. If None, all will be used, by default None
shuffle (bool, optional) – Shuffle dataset, by default True
num_workers (int, optional) – Number of workers, by default 1
device (Union[str, torch.device], optional) – Device for DALI pipeline to run on, by default cuda
process_rank (int, optional) – Rank ID of local process, by default 0
world_size (int, optional) – Number of training processes, by default 1
- load_statistics() → None[source]
Loads ERA5 statistics from pre-computed numpy files
The statistics files should be named global_means.npy and global_std.npy, each with shape [1, C, 1, 1], and located in stats_dir.
- Raises
IOError – If mean or std numpy files are not found
AssertionError – If loaded numpy arrays are not of correct size
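For reference, a sketch of writing statistics files with the layout load_statistics expects; the channel count C is an example value for your own dataset:

import numpy as np

C = 20  # number of ERA5 channels in the HDF5 files (example value)

# Shape [1, C, 1, 1] and file names as described above; place these in stats_dir.
np.save("global_means.npy", np.zeros((1, C, 1, 1), dtype=np.float32))
np.save("global_std.npy", np.ones((1, C, 1, 1), dtype=np.float32))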
- parse_dataset_files() → None[source]
Parses the data directory for valid HDF5 files and determines training samples
- Raises
ValueError – If the specified channels or the number of samples per year is not valid
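A minimal construction sketch; the directory layout and the structure of the yielded batches are assumptions, so consult the Modulus examples for the exact output keys:

from modulus.datapipes.climate.era5_hdf5 import ERA5HDF5Datapipe

datapipe = ERA5HDF5Datapipe(
    data_dir="./era5/train",   # hypothetical directory of yearly HDF5 files
    stats_dir="./era5/stats",  # holds global_means.npy / global_std.npy
    channels=[0, 1, 2],        # subset of ERA5 variables to load
    num_steps=2,               # two output timesteps per input
    stride=1,                  # 6 hour delta-t if the data is 6-hourly
    batch_size=1,
)

for batch in datapipe:
    print(batch)  # DALI pipeline output; exact keys are an assumption
    break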
- class modulus.datapipes.climate.era5_hdf5.MetaData(name: str = 'ERA5HDF5', auto_device: bool = True, cuda_graphs: bool = True, ddp_sharding: bool = True)[source]
Bases: DatapipeMetaData
- class modulus.datapipes.gnn.vortex_shedding_dataset.VortexSheddingDataset(name='dataset', data_dir=None, split='train', num_samples=1000, num_steps=600, noise_std=0.02, force_reload=False, verbose=False)[source]
Bases: DGLDataset
In-memory MeshGraphNet Dataset for a stationary mesh
- Notes
This dataset prepares and processes the data available in MeshGraphNet’s repo: https://github.com/deepmind/deepmind-research/tree/master/meshgraphnets
A single adjacency matrix is used for each transient simulation. Do not use with an adaptive mesh or remeshing.
- Parameters
name (str, optional) – Name of the dataset, by default “dataset”
data_dir (str, optional) – Directory that stores the raw data in TFRecord format, by default None
split (str, optional) – Dataset split [“train”, “eval”, “test”], by default “train”
num_samples (int, optional) – Number of samples, by default 1000
num_steps (int, optional) – Number of time steps in each sample, by default 600
noise_std (float, optional) – The standard deviation of the noise added to the “train” split, by default 0.02
force_reload (bool, optional) – Force dataset reload, by default False
verbose (bool, optional) – Verbose mode, by default False
- static add_edge_features(graph, pos)[source]
Adds relative displacement and displacement norm as edge features
- static cell_to_adj(cells)[source]
Creates an adjacency matrix in COO format from mesh cells
- static create_graph(src, dst, dtype=torch.int32)[source]
Creates a DGL graph from an adjacency matrix in COO format. torch.int32 can handle graphs with up to 2**31 - 1 nodes or edges.
- static denormalize(invar, mu, std)[source]
Denormalizes a tensor
- static normalize_edge(graph, mu, std)[source]
Normalizes graph edge data
- static normalize_node(invar, mu, std)[source]
Normalizes a tensor
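A usage sketch; the directory path is hypothetical, and batching through DGL’s GraphDataLoader follows from the DGLDataset base class:

from dgl.dataloading import GraphDataLoader
from modulus.datapipes.gnn.vortex_shedding_dataset import VortexSheddingDataset

# Sketch: data_dir points at the TFRecord files from the MeshGraphNet
# repo linked above.
dataset = VortexSheddingDataset(
    name="vortex_shedding_train",
    data_dir="./raw_dataset/cylinder_flow",  # hypothetical path
    split="train",
    num_samples=10,
    num_steps=100,
)

loader = GraphDataLoader(dataset, batch_size=1, shuffle=True)
for graph in loader:
    print(graph)  # a dgl.DGLGraph
    break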
- class modulus.datapipes.gnn.ahmed_body_dataset.AhmedBodyDataset(data_dir: str, split: str = 'train', num_samples: int = 10, invar_keys: List[str] = ['pos', 'velocity', 'reynolds_number', 'length', 'width', 'height', 'ground_clearance', 'slant_angle', 'fillet_radius'], outvar_keys: List[str] = ['p', 'wallShearStress'], normalize_keys: List[str] = ['p', 'wallShearStress', 'velocity', 'reynolds_number', 'length', 'width', 'height', 'ground_clearance', 'slant_angle', 'fillet_radius'], normalization_bound: Tuple[float, float] = (-1.0, 1.0), force_reload: bool = False, name: str = 'dataset', verbose: bool = False, compute_drag: bool = False)[source]
Bases: DGLDataset, Datapipe
In-memory Ahmed body Dataset
- Parameters
data_dir (str) – The directory where the data is stored.
split (str, optional) – The dataset split. Can be ‘train’, ‘validation’, or ‘test’, by default ‘train’.
num_samples (int, optional) – The number of samples to use, by default 10.
invar_keys (List[str], optional) – The input node features to consider. Default includes ‘pos’, ‘velocity’, ‘reynolds_number’, ‘length’, ‘width’, ‘height’, ‘ground_clearance’, ‘slant_angle’, and ‘fillet_radius’.
outvar_keys (List[str], optional) – The output features to consider. Default includes ‘p’ and ‘wallShearStress’.
normalize_keys (List[str], optional) – The features to normalize. Default includes ‘p’, ‘wallShearStress’, ‘velocity’, ‘reynolds_number’, ‘length’, ‘width’, ‘height’, ‘ground_clearance’, ‘slant_angle’, and ‘fillet_radius’.
normalization_bound (Tuple[float, float], optional) – The lower and upper bounds for normalization. Default is (-1, 1).
force_reload (bool, optional) – If True, forces a reload of the data, by default False.
name (str, optional) – The name of the dataset, by default ‘dataset’.
verbose (bool, optional) – If True, enables verbose mode, by default False.
compute_drag (bool, optional) – If True, also returns the coefficient, mesh areas, and normals required for computing the drag coefficient, by default False.
- add_edge_features() → List[DGLGraph][source]
Add relative displacement and displacement norm as edge features for each graph in the list of graphs. The calculations are done using the ‘pos’ attribute in the node data of each graph. The resulting edge features are stored in the ‘x’ attribute in the edge data of each graph.
This method will modify the list of graphs in-place.
- Returns
The list of graphs with updated edge features.
- Return type
List[dgl.DGLGraph]
- denormalize(pred, gt, device) → Tuple[Tensor, Tensor][source]
Denormalize the graph node data.
- Parameters
pred (Tensor) – Normalized prediction
gt (Tensor) – Normalized ground truth
device (Any) – The device
- Returns
Denormalized prediction and ground truth
- Return type
Tuple[Tensor, Tensor]
- normalize_edge() → List[DGLGraph][source]
Normalize edge data ‘x’ in each graph in the list of graphs using min-max normalization. The normalization is performed in-place. The normalization formula used is:
normalized_x = 2.0 * normalization_bound[1] * (x - edge_min) / (edge_max - edge_min) + normalization_bound[0]
This will bring the edge data ‘x’ in each graph into the range of [normalization_bound[0], normalization_bound[1]].
- Returns
The list of graphs with normalized edge data ‘x’.
- Return type
List[dgl.DGLGraph]
- normalize_node() → List[DGLGraph][source]
Normalize node data in each graph in the list of graphs using min-max normalization. The normalization is performed in-place. The normalization formula used is:
normalized_data = 2.0 * normalization_bound[1] * (data - node_min) / (node_max - node_min) + normalization_bound[0]
This will bring the node data in each graph into the range of [normalization_bound[0], normalization_bound[1]]. After normalization, node data is concatenated according to the keys defined in ‘self.input_keys’ and ‘self.output_keys’, resulting in new node data ‘x’ and ‘y’, respectively.
- Returns
The list of graphs with normalized and concatenated node data.
- Return type
List[dgl.DGLGraph]
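A usage sketch; the data directory is a placeholder, and the node-data keys ‘x’ and ‘y’ follow the normalize_node description above:

from dgl.dataloading import GraphDataLoader
from modulus.datapipes.gnn.ahmed_body_dataset import AhmedBodyDataset

dataset = AhmedBodyDataset(
    data_dir="./ahmed_body",  # hypothetical path to the raw data
    split="train",
    num_samples=2,
)

loader = GraphDataLoader(dataset, batch_size=1, shuffle=True)
for graph in loader:
    # Concatenated input/output node data per normalize_node above.
    print(graph.ndata["x"].shape, graph.ndata["y"].shape)
    break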
- class modulus.datapipes.gnn.ahmed_body_dataset.MetaData(name: str = 'AhmedBody', auto_device: bool = True, cuda_graphs: bool = False, ddp_sharding: bool = True)[source]
Bases: DatapipeMetaData
- modulus.datapipes.gnn.utils.load_json(file: str) → Dict[str, Tensor][source]
Loads a JSON file into a dictionary of PyTorch tensors.
- Parameters
file (str) – Path to the JSON file.
- Returns
Dictionary where each value is a PyTorch tensor.
- Return type
Dict[str, torch.Tensor]
- modulus.datapipes.gnn.utils.read_vtp_file(file_path: str) → Any[source]
Read a VTP file and return the polydata.
- Parameters
file_path (str) – Path to the VTP file.
- Returns
The polydata read from the VTP file.
- Return type
vtkPolyData
- modulus.datapipes.gnn.utils.save_json(var: Dict[str, Tensor], file: str) → None[source]
Saves a dictionary of tensors to a JSON file.
- Parameters
var (Dict[str, torch.Tensor]) – Dictionary where each value is a PyTorch tensor.
file (str) – Path to the output JSON file.
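A round-trip sketch for the two JSON helpers above:

import torch
from modulus.datapipes.gnn.utils import load_json, save_json

# Save a dictionary of tensors (e.g., normalization statistics) ...
stats = {"mean": torch.tensor([0.5]), "std": torch.tensor([0.1])}
save_json(stats, "stats.json")

# ... and load it back as a Dict[str, torch.Tensor].
restored = load_json("stats.json")
print(restored["mean"], restored["std"])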