Graph Neural Network Datapipes

The VortexSheddingDataset processes flow field data around bluff bodies, capturing vortex shedding patterns and flow structures for graph-based learning. The VortexSheddingDataset is used in the VortexShedding CFD examples.

class physicsnemo.datapipes.gnn.vortex_shedding_dataset.VortexSheddingDataset(
name='dataset',
data_dir=None,
split='train',
num_samples=1000,
num_steps=600,
noise_std=0.02,
)

Bases: Dataset

In-memory MeshGraphNet Dataset for stationary mesh.

Notes

  • This dataset prepares and processes the data available in MeshGraphNet’s repo:

    deepmind/deepmind-research

  • A single adjacency matrix is used for each transient simulation.

    Do not use with adaptive meshes or remeshing.

Parameters:
  • name (str, optional) – Name of the dataset, by default “dataset”

  • data_dir (str, optional) – Directory that stores the raw data in .TFRecord format, by default None

  • split (str, optional) – Dataset split, one of “train”, “eval”, or “test”, by default “train”

  • num_samples (int, optional) – Number of samples, by default 1000

  • num_steps (int, optional) – Number of time steps in each sample, by default 600

  • noise_std (float, optional) – The standard deviation of the noise added to the “train” split, by default 0.02
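
As a rough illustration of the noise_std parameter, the sketch below adds zero-mean noise only when the split is “train”. The source states only that noise with this standard deviation is added to the “train” split; the Gaussian distribution, the helper name add_training_noise, and the choice of which values receive noise are assumptions made for this example, not the library's implementation.

```python
import random


def add_training_noise(values, noise_std=0.02, split="train", seed=None):
    """Return a copy of `values`, perturbed with zero-mean Gaussian noise
    when `split` is "train" (hypothetical helper, Gaussian noise assumed)."""
    if split != "train" or noise_std == 0.0:
        # Evaluation and test data are returned unperturbed.
        return list(values)
    rng = random.Random(seed)
    return [v + rng.gauss(0.0, noise_std) for v in values]


velocities = [1.0, 0.5, -0.25, 0.0]
noisy = add_training_noise(velocities, noise_std=0.02, seed=0)
unchanged = add_training_noise(velocities, split="test")
```

Perturbing only the training inputs leaves the evaluation and test splits as a clean reference for measuring error.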

static add_edge_features(graph, pos)

Adds relative displacement and displacement norm as edge features.
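
The edge features described above can be sketched in plain Python: for each edge, take the difference of the endpoint positions and append its Euclidean norm. The function name edge_features, the source-minus-destination sign convention, and the use of lists instead of tensors are assumptions for illustration; the library method operates on a graph object.

```python
import math


def edge_features(src, dst, pos):
    """For each edge (src[i], dst[i]) return [dx, dy, ..., norm].

    Hypothetical helper: displacement is taken as pos[src] - pos[dst].
    """
    feats = []
    for s, d in zip(src, dst):
        disp = [ps - pd for ps, pd in zip(pos[s], pos[d])]
        norm = math.sqrt(sum(c * c for c in disp))
        feats.append(disp + [norm])
    return feats


# Three nodes of a right triangle and three directed edges.
pos = [(0.0, 0.0), (3.0, 4.0), (3.0, 0.0)]
src = [0, 1, 2]
dst = [1, 2, 0]
features = edge_features(src, dst, pos)
```

For 2D positions each edge gets three features: two displacement components and one norm.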

static cell_to_adj(cells)

Creates an adjacency matrix in COO format from mesh cells.
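
To make the COO representation concrete, here is a minimal sketch that turns a list of mesh cells into paired source and destination index lists. It assumes each cell is a tuple of node indices whose consecutive entries (wrapping around) share an edge, and that every edge is stored in both directions; the name cells_to_coo and these conventions are illustrative and may differ from what the library does.

```python
def cells_to_coo(cells):
    """Build a symmetric COO edge list (src, dst) from polygonal cells.

    Hypothetical helper: edges shared by several cells are stored once
    per direction.
    """
    edges = set()
    for cell in cells:
        n = len(cell)
        for i in range(n):
            a, b = cell[i], cell[(i + 1) % n]
            if a != b:
                edges.add((a, b))
                edges.add((b, a))
    ordered = sorted(edges)
    src = [a for a, _ in ordered]
    dst = [b for _, b in ordered]
    return src, dst


# Two triangles sharing the edge between nodes 1 and 2.
cells = [(0, 1, 2), (1, 3, 2)]
src, dst = cells_to_coo(cells)
```

Because the adjacency is built once from the cell connectivity, it stays valid only while the mesh topology is fixed, which matches the note above about not using adaptive meshes.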

static create_graph(src, dst, dtype=torch.int32)

Creates a PyG graph from an adjacency matrix in COO format. The default dtype, torch.int32, can handle graphs with up to 2**31-1 nodes or edges.

static denormalize(invar, mu, std)

Denormalizes a tensor.

static normalize_edge(graph, mu, std)

Normalizes the edge features of a graph.

static normalize_node(invar, mu, std)

Normalizes a tensor.
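
The (invar, mu, std) signatures suggest standard mean and standard-deviation scaling. The sketch below shows that round trip on plain lists, assuming normalization is (x - mu) / std and denormalization is x * std + mu; the formulas and the helper names are assumptions, since the docstrings above do not spell them out.

```python
import math


def mean_std(values):
    """Population mean and standard deviation of a list of floats."""
    mu = sum(values) / len(values)
    var = sum((v - mu) ** 2 for v in values) / len(values)
    return mu, math.sqrt(var)


def normalize(values, mu, std):
    """Assumed form: shift by the mean, scale by the standard deviation."""
    return [(v - mu) / std for v in values]


def denormalize(values, mu, std):
    """Inverse of normalize() under the same assumption."""
    return [v * std + mu for v in values]


pressure = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
mu, std = mean_std(pressure)
scaled = normalize(pressure, mu, std)
restored = denormalize(scaled, mu, std)
```

Predictions made in normalized space need the same mu and std that were used for training data to be mapped back to physical units.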

The AhmedBodyDataset manages flow field data around Ahmed bodies, supporting aerodynamic analysis and drag prediction tasks. The AhmedBodyDataset is used in the AeroGraphNet CFD External Aerodynamics example.

class physicsnemo.datapipes.gnn.ahmed_body_dataset.AhmedBodyDataset(data_dir: str, split: str = 'train', *args, **kwargs)

Bases: Dataset

In-memory Ahmed body Dataset

Parameters:
  • data_dir (str) – The directory where the data is stored.

  • split (str, optional) – The dataset split. Can be ‘train’, ‘validation’, or ‘test’, by default ‘train’.

  • num_samples (int, optional) – The number of samples to use, by default 10.

  • invar_keys (Iterable[str], optional) – The input node features to consider. Default includes ‘pos’, ‘velocity’, ‘reynolds_number’, ‘length’, ‘width’, ‘height’, ‘ground_clearance’, ‘slant_angle’, and ‘fillet_radius’.

  • outvar_keys (Iterable[str], optional) – The output features to consider. Default includes ‘p’ and ‘wallShearStress’.

  • normalize_keys (Iterable[str], optional) – The features to normalize. Default includes ‘p’, ‘wallShearStress’, ‘velocity’, ‘length’, ‘width’, ‘height’, ‘ground_clearance’, ‘slant_angle’, and ‘fillet_radius’.

  • normalization_bound (tuple[float, float], optional) – The lower and upper bounds for normalization. Default is (-1, 1).

  • name (str, optional) – The name of the dataset, by default ‘dataset’.

  • compute_drag (bool, optional) – If True, also returns the coefficient along with the mesh areas and normals required to compute the drag coefficient.

  • num_workers (int, optional) – Number of dataset pre-loading workers. If None, will be chosen automatically.

add_edge_features() → list[PyGData]

Add relative displacement and displacement norm as edge features for each graph in the list of graphs. The calculations are done using the ‘pos’ attribute in the node data of each graph. The resulting edge features are stored in the ‘x’ attribute in the edge data of each graph.

This method will modify the list of graphs in-place.

Returns:

The list of graphs with updated edge features.

Return type:

list[PyGData]

create_graph(
index: int,
file_path: str,
info_path: str,
)

Creates a graph from VTP file.

This method is used in parallel loading of graphs.

Return type:

Tuple that contains graph index, graph, and optionally coeff, normal and area values.

denormalize(
pred,
gt,
device,
) → tuple[Tensor, Tensor]

Denormalize the graph node data.

Parameters:
  • pred (Tensor) – Normalized prediction

  • gt (Tensor) – Normalized ground truth

  • device (Any) – The device

Returns:

Denormalized prediction and ground truth

Return type:

tuple[Tensor, Tensor]

normalize_edge() → list[PyGData]

Normalize edge data ‘x’ in each graph in the list of graphs.

Returns:

The list of graphs with normalized edge data ‘x’.

Return type:

list[PyGData]

normalize_node() → list[PyGData]

Normalize node data in each graph in the list of graphs.

Returns:

The list of graphs with normalized and concatenated node data.

Return type:

list[PyGData]

class physicsnemo.datapipes.gnn.ahmed_body_dataset.FileInfo(
velocity: float,
reynolds_number: float,
length: float,
width: float,
height: float,
ground_clearance: float,
slant_angle: float,
fillet_radius: float,
)

Bases: object

VTP file info storage.

class physicsnemo.datapipes.gnn.ahmed_body_dataset.MetaData(
name: str = 'AhmedBody',
auto_device: bool = True,
cuda_graphs: bool = False,
ddp_sharding: bool = True,
)

Bases: DatapipeMetaData

The DrivAerNetDataset handles automotive aerodynamics surface data, providing access to surface pressure and wall shear stress distributions. The DrivAerNetDataset is used in the AeroGraphNet and FIGConvNet CFD External Aerodynamics examples.

class physicsnemo.datapipes.gnn.drivaernet_dataset.DrivAerNetDataset(
data_dir: str | Path,
split: str = 'train',
num_samples: int = 10,
coeff_filename: str = 'AeroCoefficients_DrivAerNet_FilteredCorrected.csv',
invar_keys: Iterable[str] = ('pos',),
outvar_keys: Iterable[str] = ('p', 'wallShearStress'),
normalize_keys: Iterable[str] = ('p', 'wallShearStress'),
cache_dir: str | Path = './cache/',
name: str = 'dataset',
force_reload: bool = False,
**kwargs,
)

Bases: Dataset, Datapipe

DrivAerNet dataset.

Note: DrivAerNetDataset caches graphs in the __getitem__ call, which avoids a long initialization delay but increases the time of the first epoch.
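
The trade-off in this note can be illustrated with a generic cache-on-first-access dataset: the constructor does no heavy work, and each sample is built and written to disk the first time it is requested. The class LazyCachedDataset, its pickle-based file layout, and the way force_reload is handled are hypothetical and are not taken from the DrivAerNetDataset source.

```python
import pickle
import tempfile
from pathlib import Path


class LazyCachedDataset:
    """Hypothetical dataset that builds and caches samples on first access."""

    def __init__(self, raw_items, cache_dir, force_reload=False):
        self.raw_items = list(raw_items)
        self.cache_dir = Path(cache_dir)
        self.cache_dir.mkdir(parents=True, exist_ok=True)
        # In this sketch, force_reload=True bypasses the cache on every access.
        self.force_reload = force_reload
        self.build_count = 0  # Counts expensive builds, for demonstration.

    def __len__(self):
        return len(self.raw_items)

    def _build(self, idx):
        # Stand-in for expensive work such as reading a mesh and building a graph.
        self.build_count += 1
        return {"index": idx, "value": self.raw_items[idx] ** 2}

    def __getitem__(self, idx):
        cache_file = self.cache_dir / f"sample_{idx}.pkl"
        if cache_file.exists() and not self.force_reload:
            with cache_file.open("rb") as f:
                return pickle.load(f)
        sample = self._build(idx)
        with cache_file.open("wb") as f:
            pickle.dump(sample, f)
        return sample


with tempfile.TemporaryDirectory() as tmp:
    ds = LazyCachedDataset([1, 2, 3], cache_dir=tmp)
    first = ds[1]   # Built and written to the cache.
    second = ds[1]  # Read back from the cache.
    builds_after_two_reads = ds.build_count
```

Construction is cheap because nothing is built up front; the cost moves to the first pass over the data, after which reads come from the cache directory.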

Parameters:
  • data_dir (str | Path) – The directory where the data is stored.

  • split (str, optional) – The dataset split. Can be ‘train’, ‘validation’, or ‘test’, by default ‘train’.

  • num_samples (int, optional) – The number of samples to use, by default 10.

  • coeff_filename (str, optional) – Name of the DrivAerNet coefficients file, by default ‘AeroCoefficients_DrivAerNet_FilteredCorrected.csv’, the name used in the dataset location.

  • invar_keys (Iterable[str], optional) – The input node features to consider. Default includes ‘pos’.

  • outvar_keys (Iterable[str], optional) – The output features to consider. Default includes ‘p’ and ‘wallShearStress’.

  • normalize_keys (Iterable[str], optional) – The features to normalize. Default includes ‘p’ and ‘wallShearStress’.

  • cache_dir (str | Path, optional) – Path to the cache directory used to store graphs in PyG format for fast loading. Default is ./cache/.

  • name (str, optional) – The name of the dataset, by default ‘dataset’.

  • force_reload (bool, optional) – If True, forces a reload of the cached data, by default False.

denormalize(
pred: Tensor,
gt: Tensor,
device: device,
) → tuple[Tensor, Tensor]

Denormalizes the inputs using previously collected statistics.

class physicsnemo.datapipes.gnn.drivaernet_dataset.MetaData(
name: str = 'DrivAerNet',
auto_device: bool = True,
cuda_graphs: bool = False,
ddp_sharding: bool = True,
)

Bases: DatapipeMetaData

The StokesDataset processes Stokes flow simulations in pipe domains obstructed by random polygons, supporting various boundary conditions and geometry configurations. The StokesDataset is used in the Stokes MeshGraphNet CFD example.

class physicsnemo.datapipes.gnn.stokes_dataset.StokesDataset(
data_dir,
split='train',
num_samples=10,
invar_keys=['pos', 'marker'],
outvar_keys=['u', 'v', 'p'],
normalize_keys=['u', 'v', 'p'],
name='dataset',
)

Bases: Dataset

In-memory Stokes flow Dataset

Parameters:
  • data_dir (str) – The directory where the data is stored.

  • split (str, optional) – The dataset split. Can be ‘train’, ‘validation’, or ‘test’, by default ‘train’.

  • num_samples (int, optional) – The number of samples to use, by default 10.

  • invar_keys (List[str], optional) – The input node features to consider. Default includes ‘pos’ and ‘marker’.

  • outvar_keys (List[str], optional) – The output features to consider. Default includes ‘u’, ‘v’, and ‘p’.

  • normalize_keys (List[str], optional) – The features to normalize. Default includes ‘u’, ‘v’, and ‘p’.

  • name (str, optional) – The name of the dataset, by default ‘dataset’.

add_edge_features()

Adds relative displacement and displacement norm as edge features.

static denormalize(invar, mu, std)

Denormalizes a tensor.

normalize_edge()

Normalizes the edge features.

normalize_node()

Normalizes the node features.

The GNN utilities provide helper functions for reading VTP files and saving/loading JSON-serialized statistics used by the GNN datapipes. The GNN utilities are used by the GNN dataset classes and in the structural mechanics examples.

physicsnemo.datapipes.gnn.utils.load_json(file: str) → Dict[str, Tensor]

Loads a JSON file into a dictionary of PyTorch tensors.

Parameters:

file (str) – Path to the JSON file.

Returns:

Dictionary where each value is a PyTorch tensor.

Return type:

Dict[str, torch.Tensor]

physicsnemo.datapipes.gnn.utils.read_vtp_file(file_path: str) → Any

Read a VTP file and return the polydata.

Parameters:

file_path (str) – Path to the VTP file.

Returns:

The polydata read from the VTP file.

Return type:

vtkPolyData

physicsnemo.datapipes.gnn.utils.save_json(var: Dict[str, Tensor], file: str) → None

Saves a dictionary of tensors to a JSON file.

Parameters:
  • var (Dict[str, torch.Tensor]) – Dictionary where each value is a PyTorch tensor.

  • file (str) – Path to the output JSON file.
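
Tensors are not directly JSON-serializable, so one common approach is to convert them to nested lists on save (for example with Tensor.tolist()) and rebuild tensors on load (for example with torch.tensor). The sketch below shows that round trip using plain lists as stand-ins for tensors so that it runs without PyTorch; the helper names save_stats and load_stats and the file name node_stats.json are hypothetical, and the library functions may be implemented differently.

```python
import json
import tempfile
from pathlib import Path


def save_stats(stats, file):
    """Write a dictionary of list-valued statistics to a JSON file."""
    with open(file, "w") as f:
        json.dump(stats, f)


def load_stats(file):
    """Read the dictionary back; values come back as Python lists."""
    with open(file) as f:
        return json.load(f)


# Lists stand in for tensors; with PyTorch one would store t.tolist().
stats = {"p_mean": [0.5], "p_std": [1.25], "pos_mean": [0.0, 1.0, 2.0]}

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "node_stats.json"
    save_stats(stats, path)
    restored = load_stats(path)
```

Storing statistics in JSON keeps them human-readable and lets evaluation runs reuse the values computed on the training split.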