# Darcy Flow with Adaptive Fourier Neural Operator¶

## Introduction¶

This tutorial demonstrates the use of the Adaptive Fourier Neural Operator (AFNO) for surrogate modeling of an elliptic PDE. It is an extension of the Darcy Flow with Fourier Neural Operator chapter. The unique topics covered here include:

1. How to use the Adaptive Fourier Neural Operator architecture in Modulus

2. Differences between Adaptive and the original Fourier Neural Operator

Note

This tutorial assumes that you are familiar with the basic functionality of Modulus and understand the AFNO architecture. Please see the Lid Driven Cavity Background and Adaptive Fourier Neural Operator sections for additional information. Additionally, this tutorial builds upon the Darcy Flow with Fourier Neural Operator which should be read prior to this one.

Warning

The Python package `gdown` is required for this example if you do not already have the example data downloaded and converted. Install using `pip install gdown`.

## Problem Description¶

This problem develops a surrogate model that learns the mapping between the permeability and pressure fields of a Darcy flow system. The mapping learned, $$\textbf{K} \rightarrow \textbf{U}$$, should hold for a distribution of permeability fields $$\textbf{K} \sim p(\textbf{K})$$, not a single solution. The major distinction between FNO and AFNO, outside of the model internals, is how the image data is inserted into and predicted by the model. As discussed further in the Adaptive Fourier Neural Operator theory, AFNO breaks the input field into a set of patches, which are all fed into the neural network. AFNO then predicts the output field with the same patch-based discretization; the predicted patches are stitched together for the final prediction. While this allows AFNO to handle larger input features, it can also introduce artifacts into the prediction.
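The patch encode/decode step described above can be illustrated with a small sketch. This is plain NumPy, not the Modulus implementation (`patchify` and `unpatchify` are illustrative helper names): a 2D field is split into non-overlapping patches that serve as tokens, and the predicted patches are later stitched back into an image.

```python
import numpy as np

def patchify(field, patch_size):
    # split a (H, W) field into non-overlapping (patch_size, patch_size) patches
    h, w = field.shape
    assert h % patch_size == 0 and w % patch_size == 0
    patches = field.reshape(
        h // patch_size, patch_size, w // patch_size, patch_size
    ).transpose(0, 2, 1, 3)
    return patches.reshape(-1, patch_size, patch_size)  # (num_patches, p, p)

def unpatchify(patches, h, w):
    # stitch the patches back into the original (H, W) field
    p = patches.shape[-1]
    grid = patches.reshape(h // p, w // p, p, p).transpose(0, 2, 1, 3)
    return grid.reshape(h, w)

field = np.random.rand(240, 240)
patches = patchify(field, 16)
print(patches.shape)  # (225, 16, 16): a 15x15 grid of 16x16 patch tokens
assert np.allclose(unpatchify(patches, 240, 240), field)
```

Because predictions are made patch by patch, any inconsistency at patch boundaries shows up as the grid-like artifacts discussed in the results section.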

## Case Setup¶

Similar to the FNO chapter, the training and validation data for this example can be found on the Fourier Neural Operator Github page. However, an automated script for downloading and converting this dataset is included. It requires the package `gdown`, which can be installed with `pip install gdown`.

Note

The Python script for this problem can be found at examples/darcy/darcy_AFNO.py.

### Configuration¶

The configuration for this problem is generally the same as the FNO example, but the AFNO architecture has different hyperparameters. A key hyperparameter in AFNO not present in FNO is patch_size, which defines the patch dimensions used to sub-divide the input feature. The embed_dim parameter defines the size of the latent embedded features used inside the model for each patch.
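As a quick sanity check on these hyperparameters (the values are taken from the config below; the `token_shape` helper is purely illustrative, not part of the Modulus API), a 240x240 input with patch_size=16 produces a 15x15 grid of 225 patch tokens, each embedded into an embed_dim=256 latent vector:

```python
def token_shape(img_shape, patch_size, embed_dim):
    # number of patch tokens and the latent feature size per token
    # (illustrative helper, not part of the Modulus API)
    h, w = img_shape
    assert h % patch_size == 0 and w % patch_size == 0
    num_tokens = (h // patch_size) * (w // patch_size)
    return num_tokens, embed_dim

print(token_shape((240, 240), 16, 256))  # (225, 256)
```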

    defaults:
      - modulus_default
      - arch:
          - afno
      - scheduler: tf_exponential_lr
      - loss: sum
      - _self_

    arch:
      afno:
        patch_size: 16
        embed_dim: 256
        depth: 4
        num_blocks: 8

    scheduler:
      decay_rate: 0.95
      decay_steps: 1000

    training:
      rec_results_freq: 1000
      max_steps: 10000

    batch_size:
      grid: 32
      validation: 32

Loading both the training and validation datasets into memory follows a similar process as the Darcy Flow with Fourier Neural Operator example.

    # load training/ test data
    input_keys = [Key("coeff", scale=(7.48360e00, 4.49996e00))]
    output_keys = [Key("sol", scale=(5.74634e-03, 3.88433e-03))]

    invar_train, outvar_train = load_FNO_dataset(
        "datasets/Darcy_241/piececonst_r241_N1024_smooth1.hdf5",
        [k.name for k in input_keys],
        [k.name for k in output_keys],
        n_examples=1000,
    )
    invar_test, outvar_test = load_FNO_dataset(
        "datasets/Darcy_241/piececonst_r241_N1024_smooth2.hdf5",
        [k.name for k in input_keys],
        [k.name for k in output_keys],
        n_examples=100,
    )


The inputs for AFNO need to be perfectly divisible by the specified patch size (in this example, patch_size=16), which is not the case for this dataset. Therefore, trim the input/output features so that their dimensions are divisible by the patch size (241x241 -> 240x240).

    # get training image shape
    img_shape = [
        next(iter(invar_train.values())).shape[-2],
        next(iter(invar_train.values())).shape[-1],
    ]

    # crop out some pixels so that img_shape is divisible by patch_size of AFNO
    img_shape = [s - s % cfg.arch.afno.patch_size for s in img_shape]
    print(f"cropped img_shape: {img_shape}")
    for d in (invar_train, outvar_train, invar_test, outvar_test):
        for k in d:
            d[k] = d[k][:, :, : img_shape[0], : img_shape[1]]
            print(f"{k}: {d[k].shape}")


### Initializing the Model¶

Initializing the model and domain follows the same steps as in other examples. For AFNO, the size of the domain is calculated after loading the dataset, since the input image dimensions need to be defined in the AFNO model. They are provided through the keyword argument img_shape in the instantiate_arch call.

    # make list of nodes to unroll graph on
    model = instantiate_arch(
        input_keys=input_keys,
        output_keys=output_keys,
        cfg=cfg.arch.afno,
        img_shape=img_shape,
    )
    nodes = [model.make_node(name="AFNO")]


### Adding Data Constraints and Validators¶

Data-driven constraints and validators are then added to the domain. For more information, see the Darcy Flow with Fourier Neural Operator chapter.

    # add constraints to domain
    supervised = SupervisedGridConstraint(
        nodes=nodes,
        invar=invar_train,
        outvar=outvar_train,
        batch_size=cfg.batch_size.grid,
        cell_volumes=None,
        lambda_weighting=None,
    )

    val = GridValidator(
        invar_test,
        outvar_test,
        nodes,
        batch_size=cfg.batch_size.validation,
        plotter=GridValidatorPlotter(n_examples=5),
    )


## Training the Model¶

The training can now be started by executing the Python script.

    python darcy_AFNO.py


### Results and Post-processing¶

The checkpoint directory is saved based on the results recording frequency specified by the rec_results_freq parameter. See Results Frequency for more information. The validators folder inside the network directory (in this case 'outputs/darcy_afno/validators') contains several plots of different validation predictions.

There is a distinct grid pattern in the error contours, which is an artifact of the patch-based input and prediction scheme used in AFNO. FNO and AFNO performance for this surrogate modeling problem is illustrated in the Tensorboard plot below. FNO outperforms AFNO for this problem, but the result demonstrates that AFNO can remain competitive with FNO on smaller machine learning problems. It is important to recognize that AFNO's strengths lie in its ability to scale to much larger model sizes and datasets than those used in this chapter [1][2].

References

[1] Guibas, John, et al. "Adaptive Fourier Neural Operators: Efficient Token Mixers for Transformers." International Conference on Learning Representations, 2022.

[2] Pathak, Jaideep, et al. "FourCastNet: A Global Data-driven High-resolution Weather Model Using Adaptive Fourier Neural Operators." arXiv preprint arXiv:2202.11214, 2022.