Reduced-Order Modeling of Cavity Flow Oscillations across Multi-Mach Numbers Using Deep Learning

The reduced-order model can accurately and efficiently predict unsteady problems in many aerospace engineering applications. The traditional reduced-order model based on proper orthogonal decomposition (POD) and Galerkin projection has poor robustness and large errors when predicting complex problems. In this paper, a reduced-order model combining POD and deep learning is proposed to predict cavity flow oscillations under different flow conditions. Firstly, POD modes and corresponding coefficients are obtained by POD. Then, two deep learning frameworks, a multilayer perceptron (MLP) and a long short-term memory (LSTM) neural network, are used to predict the future POD coefficients. Finally, the cavity flow oscillations across multiple Mach numbers are predicted from the POD modes and the future coefficients. The results show that both frameworks can accurately predict cavity flow oscillations when the flow conditions change, and the time cost is reduced by an order of magnitude. In addition, the LSTM framework performs better than the MLP framework and computes faster.


Introduction
Cavity flow oscillations exist in many aerospace engineering fields [1][2][3][4], such as weapon bays [5,6] and landing gears [7,8]. The physical mechanism in a cavity is complex. The shear layer above the cavity generates vortices, which collide with the trailing wall to generate sound waves. The generated sound waves radiate forward and continue to excite the shear layer to generate new vortices. This process causes intense pressure oscillations in the cavity. Studying the oscillation characteristics in the cavity helps to understand the mechanism of cavity noise and to suppress it. Therefore, the cavity flow oscillation issue has received more and more attention [9][10][11]. Experimental investigations of cavity flow oscillations are usually carried out in wind tunnels or water tunnels. Maintaining the experimental equipment and covering the many working conditions are costly, whereas computational fluid dynamics (CFD) can effectively address this problem. CFD not only fundamentally changes the design process of aerospace vehicles but also effectively reduces the number of experiments and deepens the understanding of the physical mechanism.
In order to effectively reduce the calculation cost in engineering practice, research on reduced-order models (ROMs) has received wide attention. Since the 1990s, researchers [20,21] have developed a variety of unsteady ROMs. On the one hand, a ROM can save expensive calculation costs. On the other hand, it can effectively extract the main characteristics of the flow field and provide a theoretical basis for analyzing the mechanism and oscillation characteristics of complex systems.
Proper orthogonal decomposition (POD) is an efficient method for establishing a ROM. Rowley et al. [12] and Gloerfelt [22] accurately predicted the oscillations in a cavity with a ratio of L/D = 2 by POD and Galerkin projection. POD modes were obtained by extracting a series of orthogonal bases, and the high-dimensional governing equations were transformed into low-dimensional ordinary differential equations by the Galerkin projection. The traditional method combining POD and Galerkin projection can establish a ROM, but its robustness is poor [23][24][25]. In order to effectively handle nonlinear problems, deep learning is introduced in this work. As an important branch of machine learning, deep learning [26] has been widely recognized and developed in many applications [27][28][29][30][31][32], such as speech recognition [27,28] and image processing [29,30]. Its excellent ability to process big data and nonlinear relationships has made deep learning develop rapidly in fluid mechanics in recent years. In 2016, Ling et al. [33] realized deep learning of the RANS turbulence model by embedding Galilean invariants into the deep neural network structure and predicted channel flow and separated flow. It is considered the first combination of deep learning and fluid mechanics [34]. Miyanawala and Jaiman [35] predicted flow characteristics in the wake region of a two-dimensional cylinder with a convolutional neural network (CNN). Lee and You [36] applied a generative adversarial network (GAN) to predict the laminar vortex shedding over a cylinder.
In this work, ROMs based on multilayer perceptron (MLP) and long short-term memory (LSTM) neural networks are established for predicting cavity flow oscillations when the Mach number changes. Although there is some literature on establishing ROMs by deep learning, it is mainly applied to simple incompressible flows. For example, Yu and Hesthaven [23] proposed a flow reconstruction method based on POD and an artificial neural network (ANN). They validated the efficiency of this approach on two-dimensional viscous nozzle flows, an inviscid M6 wing flow, a viscous hypersonic flow over a complex configuration, and an unsteady two-dimensional Riemann problem. San et al. [25] also proposed an efficient framework based on POD and an ANN and accurately predicted a nonlinear wave-propagation problem. The compressible cavity flow is extremely unsteady and complex. There are intense flow oscillations in the cavity, with complex nonlinear interactions among the shear layer, vortices, sound waves, and cavity walls. Even a small change in Mach number makes the internal oscillations of the cavity change unpredictably. It is therefore meaningful to establish a deep-learning-based ROM of cavity flow oscillations across multiple Mach numbers.
In addition to the approach proposed in this paper, nonlinear approaches (such as autoencoders) can substitute for POD, but the establishment of the ROM is completely different. The ROM proposed in this paper can obtain the flow field distributions of multiple physical quantities through the predicted POD coefficients. However, nonlinear approaches can only obtain the flow field distribution of the input physical quantity; the hyperparameters would have to be readjusted and the neural network retrained to obtain the flow field distributions of other physical quantities. Furthermore, although POD is a linear approach, the loss of efficiency in intrusive ROM frameworks can be compensated by nonintrusive reduced-order models (NIROMs) [37][38][39]. Therefore, we did not use nonlinear approaches for the approximation of the solution space. The paper is organized as follows: Section 2 introduces the basic mathematical process of POD. The theory and characteristics of the MLP and LSTM neural networks are described in Sections 3 and 4. Section 5 discusses the POD results and the predicted results obtained with the MLP and LSTM neural networks. Finally, conclusions are summarized in Section 6.

Proper Orthogonal Decomposition
The snapshots POD method [40] is applied in this work. The snapshots q_k(x) = q(x, t_k), k = 1, 2, ..., N, contain the velocity components and the sound speed c. The snapshots are divided into the average quantity and the fluctuating quantity,

q_k(x) = \bar{q}(x) + q_k'(x),

where \bar{q}(x) = (1/N) \sum_{k=1}^{N} q_k(x). We seek a set of optimal orthogonal bases \varphi_i(x) such that the kth fluctuating quantity can be approximated as

q_k'(x) \approx \sum_{i=1}^{M} a_i(t_k) \varphi_i(x),

where a_i(t_k) is the coefficient of the ith POD mode constituting the kth snapshot. The snapshot q_k'(x) and the basis function \varphi_i(x) lie in the same spatial domain, so \varphi_i(x) can be represented by a linear combination of all snapshots,

\varphi_i(x) = \sum_{k=1}^{N} A_i(t_k) q_k'(x),

where A_i(t_k) is a coefficient. To obtain the orthogonal bases \varphi_i(x), the problem is converted into the eigenvalue problem [41]

C A_i = \lambda_i A_i,

where A_i is the coefficient vector, \lambda_i is the eigenvalue of the ith mode, and C is an N × N self-adjoint matrix whose entries are calculated by

C_{mn} = \frac{1}{N} (q_m', q_n'), \quad m, n = 1, 2, \ldots, N.

Here (\cdot, \cdot) is the inner product of energy based on the isentropic assumption [42,43], which weights the velocity components and the sound speed c. The eigenvalue decomposition of the matrix C yields the eigenvector matrix A, with the eigenvalues \lambda_i arranged in descending order.
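For concreteness, the following is a minimal numpy sketch of the snapshot POD method described above, under two simplifying assumptions: the snapshots are stacked as columns of a single matrix, and a plain L2 inner product is used in place of the isentropic energy inner product of [42,43]. All names are illustrative.

    # Snapshot POD: modes, coefficients, and eigenvalues from a snapshot matrix.
    import numpy as np

    def snapshot_pod(Q):
        """Q: (n_points, N) matrix whose N columns are the snapshots q_k(x)."""
        q_mean = Q.mean(axis=1, keepdims=True)   # average quantity q_bar(x)
        Qp = Q - q_mean                          # fluctuating quantities q'_k(x)
        N = Qp.shape[1]
        C = (Qp.T @ Qp) / N                      # N x N self-adjoint matrix C_mn
        lam, A = np.linalg.eigh(C)               # eigenvalues and eigenvectors of C
        order = np.argsort(lam)[::-1]            # sort eigenvalues in descending order
        lam, A = lam[order], A[:, order]
        Phi = Qp @ A                             # modes as combinations of snapshots
        Phi /= np.linalg.norm(Phi, axis=0)       # normalize each POD mode
        a = Phi.T @ Qp                           # a[i, k] = coefficient a_i(t_k)
        return q_mean, Phi, a, lam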

Multilayer Perceptron
Artificial neural networks are often referred to as neural networks or multilayer perceptrons. The multilayer perceptron, also known as the multilayer perceptron neural network, is a traditional supervised learning method that simulates human neurons [44]. When dealing with nonlinear regression problems, it can approximate the actual mapping between the input feature space X and the output target vector Y arbitrarily well by learning nonlinear functions. The perceptron is a single-neuron model, which is the precursor of larger neural networks [45]. The power of neural networks lies in their ability to represent the training data and relate it to the output variables to be predicted [46]. Mathematically, they can learn any mapping function and have been proved to be universal approximators [47]. The prediction ability of a neural network comes from the hierarchical or multilayer structure of the network [48], which can select (learn to represent) features at different scales or resolutions and combine them into higher-order features [49]. The basic MLP structure includes the input layer, hidden layers, and output layer, where the number of hidden layers can vary [45]. Each layer is composed of many nodes, and each node is a neuron. Except for the input layer, which only passes on the input features, every neuron has a nonlinear activation function, and all neurons are fully connected to the next layer. A simple MLP framework with a single hidden layer is shown in Figure 1. There are N nodes in the hidden layer. The POD coefficients of the POD modes at the current time instant, a_i(t_k), form the input layer, and the POD coefficients at the next time instant, a_i(t_{k+1}), form the output layer. The mapping can be expressed as

a(t_{k+1}) = w \, \mathrm{ReLU}(W a(t_k) + c) + b,

where W and w are the weights of the mappings from the input layer to the hidden layer and from the hidden layer to the output layer, respectively, and c and b are the corresponding biases. The rectified linear unit (ReLU) is selected as the activation function. ReLU is the most common activation function at present; it effectively avoids the vanishing-gradient problem of sigmoid and tanh, and its convergence speed is much higher [50]. The superiority of ReLU has been demonstrated in many studies [51,52]. The training of the neural network is based on the backpropagation algorithm, in which the parameters of each node are updated layer by layer. In addition, the adaptive moment estimation (Adam) algorithm [53], an excellent gradient-based optimization algorithm, is used to find the optimal weights and biases. Its main advantage is that it maintains a learning rate for each parameter and adapts it independently as learning progresses [54]. The loss function is the mean absolute error. Comparisons of activation functions and optimizers are shown in Figure 2: the results predicted by the framework with the ReLU activation function and the Adam optimizer are the most accurate. Although the results with ReLU and tanh are almost the same, ReLU computes faster than tanh. Therefore, the ReLU activation function and the Adam optimizer are used in this work.
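As an illustration, the following is a minimal PyTorch sketch of the single-hidden-layer MLP of Figure 1, with the ReLU activation, the Adam optimizer, and the mean absolute error loss described above. The layer sizes follow this work (13 POD modes, 10 hidden nodes); the learning rate and the training-step structure are assumptions.

    # One-step MLP predictor: coefficients at t_k -> coefficients at t_{k+1}.
    import torch
    import torch.nn as nn

    n_modes, n_hidden = 13, 10
    model = nn.Sequential(
        nn.Linear(n_modes, n_hidden),  # input layer -> hidden layer (weights W, bias c)
        nn.ReLU(),                     # ReLU activation in the hidden layer
        nn.Linear(n_hidden, n_modes),  # hidden layer -> output layer (weights w, bias b)
    )
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)  # lr is an assumption
    loss_fn = nn.L1Loss()              # mean absolute error, as in this work

    def train_step(a_k, a_k1):
        """a_k, a_k1: (batch, n_modes) coefficients at t_k and t_{k+1}."""
        optimizer.zero_grad()
        loss = loss_fn(model(a_k), a_k1)
        loss.backward()                # backpropagation
        optimizer.step()               # Adam update of the weights and biases
        return loss.item()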

Long Short-Term Memory Neural Network
LSTM is a special kind of recurrent neural network (RNN) that is good at processing time-sequence data. A traditional feedforward neural network cannot achieve continuous memory, but an RNN can. However, when a traditional RNN processes long sequences, its gradients may vanish or explode [55].
LSTM creates paths through which the gradient can flow for a long time by introducing controllable self-loops. The architecture of the LSTM cell is shown in Figure 3. The reason why LSTM can remember long-term information lies in the design of the gate structure, which allows information to pass selectively. The LSTM cell contains a forget gate, an input gate, and an output gate.
The specific mathematical process [56,57] is as follows.
The inputs of the forget gate are the output of the previous step h_{t-1} and the sequence data x_t. The output f_t, obtained through a sigmoid activation function, lies in the range [0, 1] and indicates the probability that the cell state of the previous step is forgotten, where 1 means completely retained and 0 means completely discarded:

f_t = \sigma(W_f \cdot [h_{t-1}, x_t] + b_f).

The next step is to decide what information to keep in the neuron cell, which consists of two parts. First, a sigmoid layer called the input gate determines the values to update. Then, a tanh layer generates a new candidate value \tilde{C}_t, which will be added to the cell state:

i_t = \sigma(W_i \cdot [h_{t-1}, x_t] + b_i),

\tilde{C}_t = \tanh(W_C \cdot [h_{t-1}, x_t] + b_C).

In order to obtain the current cell state, the previous cell state, partially forgotten through the forget gate, is added to the screened new information:

C_t = f_t \odot C_{t-1} + i_t \odot \tilde{C}_t.

The output gate controls how much of the cell state is passed on. Firstly, the sigmoid activation function gives the output gate o_t; then, the output h_t of the current step is obtained by multiplying o_t with the cell state C_t processed by the tanh activation function:

o_t = \sigma(W_o \cdot [h_{t-1}, x_t] + b_o),

h_t = o_t \odot \tanh(C_t).

In the above formulas, \sigma and \tanh are the sigmoid and tanh activation functions, respectively:

\sigma(x) = \frac{1}{1 + e^{-x}}, \quad \tanh(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}}.
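To make the gate equations concrete, the following numpy sketch transcribes one step of the LSTM cell. The weight layout, one matrix per gate acting on the concatenated vector [h_{t-1}, x_t], is a common convention and an assumption here; in practice, the cell would come from a deep learning library.

    # One LSTM cell step, following the gate equations above.
    import numpy as np

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def lstm_step(x_t, h_prev, C_prev, W, b):
        """W, b: dicts with keys "f", "i", "C", "o" holding per-gate weights/biases."""
        z = np.concatenate([h_prev, x_t])        # concatenated [h_{t-1}, x_t]
        f_t = sigmoid(W["f"] @ z + b["f"])       # forget gate
        i_t = sigmoid(W["i"] @ z + b["i"])       # input gate
        C_tilde = np.tanh(W["C"] @ z + b["C"])   # candidate cell state
        C_t = f_t * C_prev + i_t * C_tilde       # new cell state
        o_t = sigmoid(W["o"] @ z + b["o"])       # output gate
        h_t = o_t * np.tanh(C_t)                 # new hidden state
        return h_t, C_t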

Results and Discussion
5.1. The Whole Architecture. In this work, the cavity flow oscillations are numerically simulated by direct numerical simulation (DNS). The flow chart of the ROM for cavity flow oscillations is shown in Figure 4. The goal of this work is to combine POD and deep learning to predict cavity flow oscillations across multiple Mach numbers. The specific process can be divided into five steps: (1) simulate the cavity flow at the training Mach numbers by DNS to collect snapshots; (2) perform the POD analysis to extract the POD modes and corresponding coefficients; (3) train the MLP and LSTM frameworks on the POD coefficients; (4) predict the future POD coefficients at the test Mach numbers; (5) reconstruct the flow field from the POD modes and the predicted coefficients. In previous work [58], the simulation details and grid convergence have been described, and the accuracy of the simulation method has been verified. The sonicFoam solver is applied, which is based on the PISO (pressure-implicit with splitting of operators) algorithm.
The time derivative is discretized with the Euler scheme, the gradient terms with the Gauss linear scheme, and the divergence terms with the Gauss upwind scheme. The baseline rectangular cavity has a length-to-depth ratio of L/D = 4 and L/W = 0.5, with p_∞ = 70422 Pa, T_∞ = 294.5 K, Ma = 0.5, and Re_D = 5,000. A nonuniform mesh, dense near the cavity walls, is used. The number of grid points within the cavity is 260 × 200 and 312 × 240, as shown in Figure 5. The Strouhal numbers are compared with the theoretical and experimental results in Table 1; the theoretical results are calculated by the modified Rossiter formula [59]. The Strouhal numbers obtained by the present DNS are consistent with the theoretical and experimental results.

The POD analysis was conducted on the velocity fields of cavity flow oscillations at different Mach numbers to extract the dominant POD modes and the corresponding coefficients. The eigenvalue λ_i reflects the energy that the POD mode holds, and it is generally accepted that 99% of the energy is enough to reconstruct the flow field. The eigenvalues and the cumulative eigenvalues for cavity flow oscillations at Ma = 0.51 and 0.6 are shown in Figure 6. The low-order modes hold the most energy: the first two modes hold 78% of the energy at a Mach number of 0.51 and 77% at a Mach number of 0.6. The eigenvalues of the lower-order modes are almost identical at the two Mach numbers, but there are differences at higher orders. This is because, as the Mach number increases, the airflow velocity in the cavity increases and the flow structure becomes more complicated. Whether at Ma = 0.51 or 0.6, 13 POD modes already hold 99% of the energy, so we extract the first 13 POD modes and their corresponding coefficients.

The most important requirement for deep learning is that the data share some common features; totally dissimilar data cannot be learned well. In order to test the similarity of the POD modes at different Mach numbers, the first two POD modes at Ma = 0.51 and Ma = 0.6 are compared in Figure 7. POD mode 1 and mode 2 at the two Mach numbers each have three large-scale structures, located near the leading edge, the middle of the cavity, and the trailing edge, respectively. Mode 2 also has a small-scale structure at the trailing-edge corner. Although these large-scale structures differ in size and direction, their position and shape are similar. In short, the flow structures at different Mach numbers are qualitatively similar, but there are differences in size and direction, and these differences are irregular and unpredictable.
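The 99% energy criterion used above to truncate the modes can be sketched as follows, assuming lam holds the POD eigenvalues sorted in descending order, as returned by the POD sketch in Section 2.

    # Number of POD modes needed to reach a cumulative energy threshold.
    import numpy as np

    def n_modes_for_energy(lam, threshold=0.99):
        energy = np.cumsum(lam) / np.sum(lam)    # cumulative energy fraction
        return int(np.searchsorted(energy, threshold) + 1)

For the cases in this work, this criterion selects 13 modes at both Ma = 0.51 and Ma = 0.6.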

5.2. Selection of Parameters.
In this part, we employ the MLP and LSTM frameworks to predict the coefficient of the first POD mode. The performance of the two neural networks depends on parameters such as the number of layers, the number of nodes in the hidden layer, the number of iterations, and the learning rate. In order to achieve better performance, we adjust these parameters manually.
The effects of these parameters on model performance are shown in Tables 2 and 3, respectively. Increasing the number of nodes and hidden layers improves the performance of the model but can also lead to overfitting of the training data. If the learning rate is too small, the calculation time becomes too long; if it is too large, the optimal solution may be missed. The number of iterations is tied to the learning rate: a small learning rate requires more iterations, and a large learning rate requires fewer. If the batch size is too large, the model converges slowly; if it is too small, the model may not converge.
Among the MLP1, MLP2, and MLP3 frameworks, the test loss is smallest with 10 nodes, and there is little difference in the training loss, so the number of nodes is fixed at ten. The number of hidden layers is then increased to 2 and 3. The results show that the training loss of MLP5 is the smallest, with little difference in the test loss. In addition, neither increasing nor decreasing the learning rate reduces the test loss, and increasing the number of iterations decreases the training and test losses only slightly while increasing the calculation cost. The optimal LSTM framework is determined by the same single-variable method, as sketched below. Therefore, considering the training loss, test loss, and calculation cost, we choose the frameworks MLP5 and LSTM7.
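The single-variable tuning can be sketched as follows. Here build_and_train is a hypothetical helper that trains one configuration and returns its training and test losses, and the baseline values are illustrative.

    # One-factor-at-a-time tuning: vary one hyperparameter, keep the rest fixed.
    baseline = {"nodes": 10, "layers": 1, "lr": 1e-3, "epochs": 2000}  # illustrative
    candidates = {"nodes": [5, 10, 20], "layers": [1, 2, 3], "lr": [1e-4, 1e-3, 1e-2]}

    for name, values in candidates.items():
        for value in values:
            config = dict(baseline, **{name: value})         # change one parameter only
            train_loss, test_loss = build_and_train(config)  # hypothetical helper
            print(name, value, train_loss, test_loss)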

5.3. Predicted POD Coefficients.
The key to establishing the ROM is to predict the POD coefficients accurately, since their accuracy directly affects the accuracy of the reconstructed velocity field across multiple Mach numbers. Figure 8 shows the time evolution of different modal coefficients obtained by the MLP and LSTM frameworks. Due to limited space, we only show the first four POD modal coefficients at some of the test Mach numbers. The coefficients predicted by the MLP and LSTM frameworks are very close to the real coefficients and reflect their evolution well. Whether the Mach number is 0.53, 0.57, 0.6, or another value not shown in this paper, the predicted coefficients differ little from the real coefficients. It is worth noting, however, that for higher-order modes the deviation between the predicted and real coefficients increases, which is caused by the high instability and nonlinearity of the higher-order modal coefficients. Because the energy held by the higher-order POD modes is extremely low, this has little influence on the final velocity field reconstruction. In addition, the coefficients predicted by the MLP framework show small fluctuations at some positions, which may be because the method only considers the information of the current time instant.
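A sketch of producing the future coefficients with the trained one-step model follows; feeding each prediction back as the next input is our reading of how a one-step model yields a multi-step forecast, and the names match the MLP sketch in Section 3.

    # Roll the one-step coefficient model forward in time.
    import torch

    def rollout(model, a0, n_steps):
        """a0: (n_modes,) initial coefficients; returns (n_steps, n_modes)."""
        preds, a = [], torch.as_tensor(a0, dtype=torch.float32)
        with torch.no_grad():
            for _ in range(n_steps):
                a = model(a)                 # coefficients at the next time instant
                preds.append(a.clone())
        return torch.stack(preds)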
In order to analyze the training results in more detail, the training and test losses obtained by the different frameworks are shown in Figure 9. In general, for the same number of iterations, the training and test losses of LSTM are smaller than those of MLP (except for the second modal coefficient), and the LSTM framework converges with fewer iterations. This shows the superiority of the LSTM framework in dealing with time-series problems. The two frameworks are compared in detail in Section 5.5.

5.4. Velocity Field Reconstruction.
According to the predicted coefficients and POD modes, the reconstructed velocity field can be obtained. Figures 10 and 11 show the velocity fields simulated by DNS and predicted by the two deep learning frameworks at Ma = 0.54 and 0.6. Both deep learning frameworks accurately capture the large-scale structures in the instantaneous velocity fields, and the contours at other Mach numbers are similar. This shows that the ROMs based on the MLP and LSTM frameworks are accurate and reliable. In order to further analyze the reconstruction results, Figure 12 compares the time traces of velocity at the monitor point (x/L = 0.9, 0). The shapes of the velocity time-trace curves at different Mach numbers are basically the same, but their magnitudes differ. The two deep learning frameworks capture the velocity at all Mach numbers, and the predictions of the LSTM framework are more stable and do not oscillate.
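The reconstruction step itself reduces to adding the mean field back to a linear combination of the retained modes, as in the following sketch; the shapes match the POD sketch in Section 2.

    # Reconstruct the field: q(x, t) = q_bar(x) + sum_i a_i(t) * phi_i(x).
    import numpy as np

    def reconstruct(q_mean, Phi, a_pred):
        """q_mean: (n_points, 1); Phi: (n_points, M); a_pred: (M, n_t)."""
        return q_mean + Phi @ a_pred         # (n_points, n_t) reconstructed field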

5.5. Comparison of MLP and LSTM Frameworks.
From the above analysis of the reconstruction results, it can be seen that the MLP and LSTM frameworks can both predict the future flow field well, but MLP exhibits a certain degree of oscillation. In order to further compare the performance of the two frameworks, Figures 13-15 compare the root mean square error (RMSE) of the reconstruction results. Only at a Mach number of 0.53 is the RMSE of the LSTM framework slightly larger than that of the MLP framework; at all other Mach numbers, the RMSE of the LSTM framework is clearly lower. In particular, at Mach number 0.54, the maximum RMSE of the MLP framework is 0.05, while that of the LSTM framework is 0.025. In addition, all the maximum RMSEs occur near the cavity leading edge, which may be related to the velocity instability caused by the inflow passing over the leading edge. The calculation time of the ROM in this paper mainly comprises the time of the POD analysis and the training time of the deep learning frameworks. Table 4 compares the calculation time of these two ROMs with the DNS simulation. All calculations were carried out on a computer with 44 cores (Intel Xeon E5-2699) and 256 GB RAM, using a single core. As can be seen from the table, both deep learning frameworks reduce the calculation time by an order of magnitude on average, and the LSTM framework requires the least calculation time.
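For reference, a minimal sketch of the RMSE maps of the kind plotted in Figures 13-15 is given below; averaging over the time axis at each grid point is an assumption about how the error fields are computed.

    # Pointwise RMSE between DNS and reconstructed fields over the time axis.
    import numpy as np

    def rmse_field(q_dns, q_rom):
        """q_dns, q_rom: (n_points, n_t) fields; returns (n_points,) RMSE map."""
        return np.sqrt(np.mean((q_dns - q_rom) ** 2, axis=1))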

Conclusions
In this paper, a reduced-order model based on POD and deep learning was established for predicting cavity flow oscillations across multiple Mach numbers. The specific conclusions are as follows: (i) After the POD analysis of the numerical simulation data, the first 13 POD modes, which hold 99% of the energy, were extracted. The POD modal structures at Ma = 0.51 and 0.6 are qualitatively similar, so the deep learning method can accurately learn their common features. However, their size and direction differ, and the variation between different Mach numbers is irregular and unpredictable.
Therefore, it is of great significance to establish the reduced-order model of cavity flow oscillations across multiple Mach numbers.
(ii) By comparing the predicted coefficients with the actual POD coefficients, it is found that the MLP and LSTM frameworks can both accurately predict the POD coefficients, but there are small oscillations in the coefficients predicted by the MLP framework. (iii) Both frameworks can accurately reconstruct the velocity field at different Mach numbers, but the results reconstructed by the LSTM framework are more accurate, with a smaller root mean square error than those of the MLP framework. Comparing the total computation time of the two frameworks with DNS, the ROM proposed in this paper reduces the computation time by at least one order of magnitude, with the LSTM framework requiring the least time.

Data Availability
The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest
The authors declare that they have no conflicts of interest.