In this section we provide a brief introduction to solving differential equations with neural networks. The idea is to use a neural network to approximate the solution to the given differential equation and boundary conditions. We train this neural network by constructing a loss function for how well the neural network is satisfying the differential equation and boundary conditions. If the network is able to minimize this loss function then it will in effect, solve the given differential equation.

To illustrate this idea we will give an example of solving the following problem,

(5)\[\begin{split} \mathbf{P} : \left\{\begin{matrix}
\frac{\delta^2 u}{\delta x^2}(x) = f(x), \\
\\
u(0) = u(1) = 0,
\end{matrix}\right.\end{split}\]

We start by constructing a neural network \(u_{net}(x)\). The input to this network is a single value \(x \in \mathbb{R}\) and its output is also a single value \(u_{net}(x) \in \mathbb{R}\). We suppose that this neural network is infinitely differentiable, \(u_{net} \in C^{\infty}\). The typical neural network used is a deep fully connected network where the activation functions are infinitely differentiable.

Next we need to construct a loss function to train this neural network. We easily encode the boundary conditions as a loss in the following way:

(6)\[L_{BC} = u_{net}(0)^2 + u_{net}(1)^2\]

For encoding the equations, we need to compute the derivatives of \(u_{net}\). Using automatic differentiation we can do so and compute \(\frac{\delta^2 u_{net}}{\delta x^2}(x)\). This allows us to write a loss function of the form:

(7)\[ L_{residual} = \frac{1}{N}\sum^{N}_{i=0} \left( \frac{\delta^2 u_{net}}{\delta x^2}(x_i) - f(x_i) \right)^2\]

Where the \(x_i\) ‘s are a batch of points sampled in the interior,
\(x_i \in (0, 1)\). Our total loss is then
\(L = L_{BC} + L_{residual}\). Optimizers such as Adam ^{1} are used to train this neural
network. Given \(f(x)=1\), the true solution is
\(\frac{1}{2}(x-1)x\). Upon solving the problem, you can obtain good
agreement between the exact solution and the neural network solution as
shown in Fig. 10.

Fig. 10 *Neural Network Solver compared with analytical solution.*

Using the PINNs in Modulus Sym, we were able to solve complex problems with intricate geometries and multiple physics. In order to achieve this we have deviated and improved on the current state-of-the-art in several important ways. In this section we will briefly cover some topics related to this.

In literature, the losses are often defined as a summation similar to
our above equation (7),
^{2}. In Modulus Sym, we take a different
approach and view the losses as integrals. You can instead write
\(L_{residual}\) in the form,

(8)\[L_{residual} = \int^1_0 \left( \frac{\delta^2 u_{net}}{\delta x^2}(x) - f(x) \right)^2 dx\]

Now there is a question of how we approximate this integral. We can easily see that if we use Monte Carlo integration we arrive at the same summation in equation (7).

(9)\[\int^1_0 \left( \frac{\delta^2 u_{net}}{\delta x^2}(x) - f(x) \right)^2 dx \approx (\int^1_0 dx) \frac{1}{N} \sum^{N}_{i=0} \left( \frac{\delta^2 u_{net}}{\delta x^2}(x_i) - f(x_i) \right)^2 = \frac{1}{N} \sum^{N}_{i=0} \left( \frac{\delta^2 u_{net}}{\delta x^2}(x_i) - f(x_i) \right)^2\]

We note that, this arrives at the exact same summation because \(\int^1_0 dx = 1\). However, this will scale the loss proportional to the area. We view this as a benefit because it keeps the loss per area consistent across domains. We also note that this opens the door to more efficient integration techniques. In several examples, in this user guide we sample with higher frequency in certain areas of the domain to approximate the integral losses more efficiently.

Many PDEs of interest have integral formulations. Take for example the continuity equation for incompressible flow,

(10)\[\frac{\delta u}{\delta x} + \frac{\delta v}{\delta y} + \frac{\delta w}{\delta z} = 0\]

We can write this in integral form as the following,

(11)\[\iint_{S} (n_xu + n_yv + n_zw) dS = 0\]

Where \(S\) is any closed surface in the domain and \(n_x, n_y, n_z\) are the normals. We can construct a loss function using this integral form and approximate it with Monte Carlo Integration in the following way,

(12)\[L_{IC} = \left(\iint_{S} (n_xu + n_yv + n_zw) dS \right)^2 \approx \left((\iint_{S} dS) \frac{1}{N} \sum^N_{i=0} (n^i_xu_i + n^i_yv_i + n^i_zw_i)\right)^2\]

For some problems we have found that integrating such losses significantly speeds up convergence.

One important advantage of a neural network solver over traditional
numerical methods is its ability to solve parameterized geometries
^{3}. To illustrate this concept we
solve a parameterized version of equation
(5). Suppose we want to know how the
solution to this equation changes as we move the position on the
boundary condition \(u(l)=0\). We can parameterize this position
with a variable \(l \in [1,2]\) and our equation now has the form,

(13)\[\begin{split} \mathbf{P} : \left\{\begin{matrix}
\frac{\delta^2 u}{\delta x^2}(x) = f(x), \\
\\
u(0) = u(l) = 0,
\end{matrix}\right.\end{split}\]

To solve this parameterized problem we can have the neural network take \(l\) as input, \(u_{net}(x,l)\). The losses then take the form,

(14)\[L_{residual} = \int_1^2 \int_0^l \left( \frac{\delta^2 u_{net}}{\delta x^2}(x,l) - f(x) \right)^2 dx dl \approx \left(\int_1^2 \int^l_0 dxdl\right) \frac{1}{N} \sum^{N}_{i=0} \left(\frac{\delta^2 u_{net}}{\delta x^2}(x_i, l_i) - f(x_i)\right)^2\]

(15)\[L_{BC} = \int_1^2 (u_{net}(0,l))^2 + (u_{net}(l,l) dl \approx \left(\int_1^2 dl\right) \frac{1}{N} \sum^{N}_{i=0} (u_{net}(0, l_i))^2 + (u_{net}(l_i, l_i))^2\]

In Fig. 11 we see the solution to the differential equation for various \(l\) values after optimizing the network on this loss. While this example problem is overly simplistic, the ability to solve parameterized geometries presents significant industrial value. Instead of performing a single simulation we can solve multiple designs at the same time and for reduced computational cost. Examples of this will be given later in the user guide.

Fig. 11 *Modulus Sym solving parameterized differential equation problem.*

Another useful application of a neural network solver is solving inverse problems. In an inverse problem, we start with a set of observations and then use those observations to calculate the causal factors that produced them. To illustrate how to solve inverse problems with a neural network solver, we give the example of inverting out the source term \(f(x)\) from equation (5). Suppose we are given the solution \(u_{true}(x)\) at 100 random points between 0 and 1 and we want to determine the \(f(x)\) that is causing it. We can do this by making two neural networks \(u_{net}(x)\) and \(f_{net}(x)\) to approximate both \(u(x)\) and \(f(x)\). These networks are then optimized to minimize the following losses;

(16)\[L_{residual} \approx \left(\int^1_0 dx\right) \frac{1}{N} \sum^{N}_{i=0} \left(\frac{\delta^2 u_{net}}{\delta x^2}(x_i, l_i) - f_{net}(x_i)\right)^2\]

(17)\[L_{data} = \frac{1}{100} \sum^{100}_{i=0} (u_{net}(x_i) - u_{true}(x_i))^2\]

Using the function \(u_{true}(x)=\frac{1}{48} (8 x (-1 + x^2) - (3 sin(4 \pi x))/\pi^2)\) the solution for \(f(x)\) is \(x + sin(4 \pi x)\). We solve this problem and compare the results in Fig. 12, Fig. 13

Fig. 12 *Comparison of true solution for \(f(x)\) and the function approximated by the NN.*

Fig. 13 *Comparison of \(u_{net}(x)\) and train points from \(u_{true}\).*

In previous discussions on PINNs, we aimed at solving the classical
solution of the PDEs. However, some physics have no classical (or
strong) form but only a variational (or weak) form
^{4}. This requires handling the PDEs in
a different approach other than its original (classical) form,
especially for interface problem, concave domain, singular problem, etc.
In Modulus Sym, we can solve the PDEs not only in its classical form, but
also in it weak form. Before describing the theory for weak solutions of
PDEs using PINNs, let’s start by the definitions of classical, strong
and weak solutions.

**Note:** The mathematical definitions of the different spaces that are
used in the subsequent sections like the \(L^p\), \(C^k\),
\(W^{k,p}\), \(H\), etc. can be found in the
Relative Function Spaces and Integral Identities. For general theory of the partial differential
equations and variational approach, we recommend
^{5}, ^{6}.

### Classical solution, Strong solution, Weak solution

In this section, we introduce the classical solution, strong solution, and weak solution for the Dirichlet problem. Let us consider the following Poisson’s equation.

(18)\[\begin{split}\left\{\begin{matrix}
\Delta u = f \quad \text{ in } \Omega \\
\\
u = 0 \quad \text{ on } \partial \Omega
\end{matrix}\right.\end{split}\]

**Definition (Classical Solution):**

Let \(f\in C(\overline{\Omega})\) in (18), then there is a unique solution \(u\in C^2(\Omega)\cap C_0^1(\Omega)\) for (18). We call this solution as the classical solution of (18).

**Definition (Strong Solution):**

Let \(f\in L^2(\Omega)\) in (18), then there is a unique solution \(u\in H^2(\Omega)\cap H_0^1(\Omega)\) for (18). We call this solution as the strong solution of (18).

From the definition of strong solution and Sobolev space, we can see that the solution of (18) is actually the solution of the following problem: Finding a \(u\in H^2(\Omega)\cap H_0^1(\Omega)\), such that

(19)\[\int_{\Omega}(\Delta u + f)v dx = 0\qquad \forall v \in C_0^\infty(\Omega)\]

By applying integration by parts and \(u = 0\), we get

(20)\[\int_{\Omega}\nabla u\cdot\nabla v dx = \int_{\Omega} fv dx\]

This leads us to the definition of weak solution as the following.

**Definition (Weak Solution):**

Let \(f\in L^2(\Omega)\) in (18), then there is a unique solution \(u\in H_0^1(\Omega)\) for the following problem: Finding a \(u\in H_0^1(\Omega)\) such that

(21)\[ \int_{\Omega} \nabla u \cdot\nabla v dx = \int_{\Omega}fv dx\qquad \forall v\in H_0^1(\Omega).\]

We call this solution as the weak solution of (18).

In simpler terms, the difference between these three types of solutions can be summarized as below:

The essential difference among classical solution, strong solution and weak solution is their regularity requirements. The classic solution is a solution with \(2\)nd order continuous derivatives. The strong solution has \(2\)nd order weak derivatives, while the weak solution has weak \(1\)st order weak derivatives. Obviously, classical solution has highest regularity requirement and the weak solution has lowest one.

### PINNs for obtaining weak solution

Now we will discuss how PINNs can be used to handle the PDEs in
approaches different than its original (classical) form. In
^{7}, ^{8}, the authors
introduced the VPINN and hp-VPINN methods to solve PDEs’ integral form.
This integral form is based on (19). Hence, it is
solving a strong solution, which is better than a classical solution.

To further improve the performance of PINNs, we establish the method based on eq:weak i.e., we are solving the weak solution. Let us assume we are solving (18). To seek the weak solution, we may focus on the following variational form:

(22)\[ \int_{\Omega}\nabla u\cdot\nabla v dx = \int_{\Omega} fv dx\]

(23)\[ u = 0 \quad\mbox{ on } \partial \Omega\]

For (23), we may handle it as the traditional PINNs: take random points \(\{\mathbf{x_i}^b\}_{i=1}^{N_b}\subset\partial\Omega\), then the boundary loss is

(24)\[MSE_b = \frac{1}{N_b}\sum_{i=1}^{N_b}\left(u_{NN}(\mathbf{x_i}^b)-0\right)^2\]

For (22), we choose a quadrature rule \(\{\mathbf{x_i}^q,w_i^q\}_{i=1}^{N_q}\), such that for \(u: \Omega\mapsto\mathbb{R}\), we have

(25)\[\int_{\Omega} u dx \approx \sum_{i=1}^{N_q}w_i^q u(\mathbf{x_i}^q).\]

For uniform random points or quasi Monte Carlo points, \(w_i^q=1/N_q\) for \(i=1,\cdots, N_q\). Additionally, we choose a set of test functions \(v_j\in V_h\), \(j=1,\cdots, M\) and then the loss of the integral is

(26)\[MSE_v = \left[\sum_{i=1}^{N_q}w_i^q\left(\nabla u(\mathbf{x_i}^q)\cdot\nabla v_j(\mathbf{x_i}^q)-f(\mathbf{x_i}^q)v_j(\mathbf{x_i}^q)\right)\right]^2.\]

Then, the total loss is

(27)\[MSE=\lambda_v*MSE_v+\lambda_b*MSE_b,\]

where the \(\lambda_v\) and \(\lambda_b\) are the corresponding weights for each terms.

As we will see in the tutorial example Interface Problem by Variational Method, this scheme is flexible and can handle the interface and Neumann boundary condition easily. We can also use more than one neural networks on different domains by applying the discontinuous Galerkin scheme.

References

Kingma, Diederik P., and Jimmy partial. “Adam: A method for stochastic optimization.” arXiv preprint arXiv:1412.6980 (2014).

Raissi, Maziar, Paris Perdikaris, and George Em Karniadakis. “Physics informed deep learning (part i): Data-driven solutions of nonlinear partial differential equations.” arXiv preprint arXiv:1711.10561 (2017).

Sun, Luning, et al. “Surrogate modeling for fluid flows based on physics-constrained deep learning without simulation data.” Computer Methods in Applied Mechanics and Engineering 361 (2020): 112732.

Braess, Dietrich. Finite elements: Theory, fast solvers, and applications in solid mechanics. Cambridge University Press, 2007.

Gilbarg, David, and Neil S. Trudinger. Elliptic partial differential equations of second order. Vol. 224. springer, 2015.

Evans, Lawrence C. “Partial differential equations and Monge-Kantorovich mass transfer.” Current developments in mathematics 1997.1 (1997): 65-126.

Kharazmi, Ehsan, Zhongqiang Zhang, and George Em Karniadakis. “Variational physics-informed neural networks for solving partial differential equations.” arXiv preprint arXiv:1912.00873 (2019).

Kharazmi, Ehsan, Zhongqiang Zhang, and George Em Karniadakis. “hp-VPINNs: Variational physics-informed neural networks with domain decomposition.” Computer Methods in Applied Mechanics and Engineering 374 (2021): 113547.