Fast Numerical Simulation of Allen-Cahn Equation

Simulation speed depends on code structures, hence it is crucial how to build a fast algorithm. We solve the Allen-Cahn equation by an explicit finite difference method, so it requires grid calculations implemented by many for-loops in the simulation code. In terms of programming, many for-loops make the simulation speed slow. To solve the problem, we propose a model architecture containing a pad and a convolution operation for the Allen-Cahn equation. Also, the GPU operation is used to boost up the speed more. In this way, the simulation of other differential equations can be improved. In this paper, various numerical simulations are conducted to confirm that the Allen-Cahn equation follows motion by mean curvature and phase separation in two-dimensional and three-dimensional spaces. Finally, we demonstrate that our algorithm is much faster than an unoptimized code and the CPU operation.


Introduction
The Allen-Cahn (AC) equation is a reaction-diffusion equation composed of the reaction term −F (φ(x, t))/ 2 and the diffusion term ∆φ(x, t): where φ(x, t) is order parameter which is defined as the difference in concentration of the two components in a mixture and F (φ) is double well potential energy function with minimum values at −1 and 1, and its form is F (φ) = 0.25(φ 2 − 1) 2 . is thickness of transition layer and which is small positive constant value. The AC equation is first introduced in a research on the phase separation of binary iron alloys [1]. The AC equation has studied and applied to various fields such as image inpainting [2,3,4], image segmentation [5,6], crystal growth [7,8,9].
With the development of computer hardware such as GPU(Graphics Processing Unit) and memory cards, neural networks are applied in a wide range of research areas such as computer vision, natural language processing, and numerical analysis. As GPU operations outperform CPU(Central Processing Unit) performance in multi-tasks and high-dimensional problems, open source machine learning libraries such as Pytorch provide a variety of neural networks using GPU and are useful to build the architecture combined with neural networks and numerical methods. Thus, many researches use machine learning libraries. M. Raissi et al. [10] proposed physics informed neural networks combined by multi-layer perceptrons and numerical methods to solve nonlinear partial differential equations, S. Karumuri et al. [11] introduced a solver-free approach for Stochastic partial differential equations, and L. Yang et al. [12] proposed a Bayesian physics informed neural network. In this paper, we propose a structure using padding and convolution operation for the GPU calculation of the AC equation and demonstrate the validity of the proposed structure by verifying the result with Python code that has the same mathematically meaning.
This paper is organized as follows. In Section 2, we present an explicit finite difference method to solve the AC equation which implemented by CPU and GPU algorithms. In Section 3, the numerical simulations including a motion by mean curvature, phase separation, and temporal evolutions of various initial shapes are introduced as well as the runtime results between CPU and GPU operations are compared. Finally, conclusions are drawn in Section 4.

Numerical solutions
In this section, we present an explicit finite difference method to solve the AC Eq. (1). Also, we give an explanation of each algorithm for CPU and GPU computing. For simplicity of expression, we describe a numerical scheme for the AC equation in two dimensions (2D), and the definition in three dimensional (3D) space can be easily extended and considered. A computational domain is defined using a uniform grid of size h = 1/N x and For the definition of the boundary condition, we define the extended computational domain as follows: for 1 ≤ i ≤ N x + 2 and 1 ≤ j ≤ N y + 2. Let φ n ij be approximations of φ(x i , y j , n∆t), where ∆t = 0.1h 2 is temporal step size, T is a final time, and N t is a total number of time steps. The boundary condition is zero Neumann boundary condition: We define the thickness of transition layer in Eq. (1) as m [13]: where m is the number of grids representing the thickness.
First of all, we show the error results obtained by Eq. (3) to check the difference between the numerical results of CPU:Python (baseline) and GPU:Pytorch. In Table 1, all the errors for any cases are less than 1.0e-6. To estimate the error of the CPU and GPU codes, we use the following defined Err where avg(X) is the average of elements in an array X and t is the time step. As the result, it is confirmed that there is little difference between the two algorithms.

Numerical solutions on CPU (baseline)
The AC equation (1) is discretized using the explicit finite difference method as: Figure 1: Schematic of the process to obtain f The baseline algorithm is implemented in Eq. (4) using Numpy (CPU array).

Numerical solutions on GPU (Pytorch)
For GPU computing, the AC equation (1) can be expressed using Pytorch and the algorithm can be represented as Fig. 1.
where f is a Pytorch model. The main algorithm is a set of steps in Eq. (5) using Pytorch. In this algorithm, nn.ReplicationPad2d (or nn.ReplicationPad3d) is applied to satisfy the boundary condition by padding the φ n input using replication of the input boundary. Also, the convolutional operator F.conv2d (or F.conv3d) with the 2nd order differencing filter is used to calculate the diffusion term ∆φ n . The model can choose an operation mode between GPU and CPU by to(device). If device=cuda:0, the model is implemented on GPU, otherwise on CPU. All the codes are available from the first author's GitHub webpage: https://github.com/kimy-de/gpuallencahn

Numerical experiments
In this section, we perform the following numerical tests in 2D and 3D: m effect, the motion by mean curvature effect with various initial shapes [circle (sphere in 3D), dumbbell, star, torus, maze] and phase separation. In the m effect test, we compare the numerical solutions to analytic values to find a proper m value. We simulate a phenomenon that follows motion by mean curvature in various initial shapes and phase separation with random initial condition. We perform the simulation on the following specifications: Intel(R) Core(TM) i9-9900K CPU @3.60GHz, 32GB RAM / NVIDIA GeForce RTX 2080 Super.

Initial conditions
We consider the 2D and 3D initial conditions introduced in this section: To check the m effect, we measure the circle's (sphere in 3D) radius that changes with time and take the initial conditions: where R 0 is the initial radius of a circle (sphere in 3D).
In various tests to check the motion by the mean curvature, the initial conditions for the circle and sphere refer to Eqs. (6a) and (6b), respectively. A dumbbell is constructed by following the initial conditions (o/w: otherwise) where R 0 is initial radius of both side of dumbbell's circle (sphere in 3D) and for simplicity of The initial conditions of a star shape are defined as: and we use different domain sizes in 2D and 3D, so the center of the star depends on the dimensions.
And a torus shape is given by where R 1 and R 2 are the radius of major (outside) and minor (inside) circles, respectively. And, for simplicity of expression, XY = (x − 0.5) 2 + (y − 0.5) 2 .
The last initial conditions of a maze shape is complicated to describe its equation, so refer to the code in Appendix.

Simulations in 2 dimensional space
Unless otherwise stated, we use the following parameters: mesh size N x = N y = 200, space step size h = 1/N x , time step size ∆t = 0.1h 2 , and computational domain Ω = (0, 1) × (0, 1). We should find a proper thickness of transition layer as defined Eq. (2). First, we find an appropriate m by comparing the numerical solution with the exact solution for the radius that decreases due to the motion by mean curvature in the circle initial condition in Eq. (6a).
where r 0 is the initial radius of the circle and d is a dimension, and t is time. We simulate various m until the final time T = 0.03. As shown in Fig. 2, when 10 is used, it is found that the exact solution and the numerical solution are most similar in the experiment. Therefore, the other tests are performed using 10 except for the case of changing the grid in 2D. Figure 3 shows the temporal evolution of the dumbbell shape and the initial condition is shown in Eq. (7a). For resolution, we use N x = 400 and N y = 200 on the computational domain Ω = (0, 2) × (0, 1). The final time T of the simulation is 0.0094 and R 0 = 0.2. In Fig. 3(e), the changing direction by the motion by mean curvature is indicated by arrows. Figure 4 shows the evolution of the star shape created by Eq. (8a). The parameters are used as mentioned at the beginning of this section and T = 0.0325. As shown in Fig. 4, the tips of the star move inward and the gaps between the tips move outward. When it changes to the shape of a circle, the change of the radius can be predicted as shown in Fig. 2. Figure 5 shows the evolution of the torus shape of Eq. (9a) with T = 0.0575, R 1 = 0.4 and R 2 = 0.3. Because the inner circle has a larger curvature, it shrinks faster than the outer circle, and after the inner circle disappears, the change of radius over time can be measured as shown in Fig.2. Figure 6 shows the evolution of a maze shape. We use N x = N y = 100, 5 , and T = 0.04. As shown in Fig. 6, we obtain the results of shrinking while maintaining its initial shape.     The last simulation in 2D is phase separation with a random initial condition Eq. (10a). In Fig. 7, starting with random values with 0.1 amplitude, but over time, phase separation occurs with values −1 to 1.
According to the results in Table 2, the speed gap between CPU and GPU is significant. In 2D, that is up to 251.6 times the difference between the Python:CPU and the Pytorch:GPU codes. Also, Pytorch:GPU tensors make the model up to 4.73 times faster than Pytorch:CPU tensors in the same code.

Simulations in 3 dimensional space
Three dimensional simulations can be considered as an extension of two dimensions tests. Unless otherwise stated, we use the following parameters: mesh size N x = N y = N z = 100, space step size h = 1/N x , time step size ∆t = 0.1h 2 , and computational domain Ω = (0, 1) × (0, 1) × (0, 1). As in the 2D simulation, we find an appropriate m by comparing the numerical solution with the exact solution (using in Eq. (11)) for the radius of the sphere initial condition in Eq. (6b).   The reason is that in the initial shape in Fig. 9(a), the radius of the handle is much smaller than the radius of the spheres at both ends. Therefore, the handle part shrinks faster than end of spheres and breaks as in Fig. 9. Figure 10 shows the evolution of the star shape shown in Eq. (8b) on the computational domain Figure 9: (a)-(d) Time evolution of dumbbell shape. The handle shrinks quickly and breaks occurs, because the curvature of the handle is larger than that of both spheres 1). The parameters are used as mentioned at the beginning of this section and T = 0.02. As shown in Fig. 10, similar to the 2D result, the tips of the star move inward and the gaps between the tips move outward. Figure 11 shows the evolution of the torus shape with initial condition in Eq. (9b) using R 1 = 0.3 and R 2 = 0.3 and T = 0.01 on the computational domain Ω = (−1, 1) × (−1, 1) × (−1, 1). Contrary to the 2D result, the inner circle becomes larger because the radius of the minor circle exists. Therefore, the radius of the major circle and the minor circle have different each curvature. Since the radius of the minor circle is smaller than major circle, the mean curvature drives the motion into the inside of the torus as shown in Fig. 11. Figure 12 shows the evolution of a maze shape. The initial condition is described in Appendix code. In this simulation, we use T = 0.0175 and computational domain Ω = (−1, 1) × (−1, 1) × (−1, 1). As shown in Fig. 12, we obtain the results of shrinking while maintaining its initial shape in Fig. 12(b), then merges and shrinks in Fig. 12(c)-(d). If we use different m , it can shrink while preserving its initial shape. The last simulation in 3D is phase separation with random initial condition in Eq. (10b).    According to the results in Table 3, the speed gap between CPU and GPU operations is significant. Of course, it will be faster by performing GPU calculations, but we proposed a structure using padding and convolution operation in performing GPU calculations on AC equations. And the results are demonstrated by verifying the results (Tab. 1) with Python code which has the same mathematically meaning. In 3D, the GPU performance is much more overwhelming than the CPUs. For instance, the GPU code is 4766 times faster than the Python code in the torus problem. Also, GPU tensors make the model up to 76 times faster than CPU tensors in the same code. The values in parentheses describe how many times the difference is based on GPU:Pytorch time for each test.

Conclusions
In this paper, we proposed a structure using padding and convolution operation for the GPU calculation of the Allen-Cahn equation. We increased the simulation speed and demonstrated the validity of the proposed structure by verifying the result with Python code that has the same mathematically meaning. We solved the Allen-Cahn equation by the explicit finite difference method and compared the runtime results between CPU and GPU algorithms. The errors of CPU:Python and GPU:Pytorch are less than 1.0e-6 for the given initial conditions. Also, we showed that our GPU code is up to 251.6 and 4765.81 times faster than the CPU codes in 2D and 3D, respectively. By showing these results, accuracy and efficiency have been demonstrated. Various numerical simulations were presented to confirm that the Allen-Cahn equation follows the motion by mean curvature and phase separation in 2D and 3D space. In this way, we can build a fast algorithm for any differential equations using the finite difference method by efficient programmatic code structures.