Evolution of Deep Neural Network Architecture Using Particle Swarm Optimization to Improve the Performance in Determining the Friction Angle of Soil



Introduction
In geotechnical engineering, the internal friction angle of soil is one of the most important parameters [1]. This index is mentioned in many studies in the field of geology, for instance, in investigations of the foundation bearing capacity [2-7], soil stability [3, 8-10], limit loads [11], and slope stability [9, 12-14]. In problems related to slope stability or retaining walls, the friction angle was proven to affect the soil slope stability significantly, as an increase of the friction angle decreases the soil consolidation [3]. Similarly, the retaining wall's reaction is also significantly affected by the internal friction angle [8]. In another study, Mo et al. [11] pointed out a positive correlation between friction angle and critical load. The above points show that the friction angle is a vital soil parameter [1] and that more in-depth research on it is required. However, the friction angle is only mentioned indirectly in related geological studies [2-4, 8, 9, 11], and research on this index is limited. The value of the internal friction angle depends on many soil parameters, such as grain size, particle shape, unit weight, specific gravity, and soil moisture [15, 16]. Two main approaches have been proposed for estimating the friction angle of soil: experimental and theoretical. In the experimental approach, the internal friction angle can be obtained from the direct shear test [1, 17-19] and the triaxial shear test [4, 19-22]. However, these experimental methods have some disadvantages, as they are time-consuming and require relatively expensive equipment [23-25]. Besides, theoretical and numerical methods have also been applied to determine the friction angle of soil. As an example, Mo et al. [11] proposed an equation for the friction angle of the soil and confirmed the validity of the theoretical method.
However, the theoretical method is limited by its reliance on assumptions, in this case, the ideal rigid-plastic slip-line field theory. Other studies proposed alternative methods to evaluate the internal friction angle of soil, based on the standard penetration test (SPT) values [26-29] and cone penetration test (CPT) results [30, 31].
Those methods provided several empirical formulas that could quickly predict the soil friction angle. However, these formulas contained only one or a few main input variables, which could reduce the prediction accuracy in many cases.
In the last two decades, with the development of computer science, numerous studies have proposed a novel approach for surveying and assessing geological issues, namely, the artificial intelligence (AI) approach [10, 25, 32, 33]. In the work of Mikaeil et al. [32], the genetic algorithm (GA) was used to study three crucial physical and mechanical characteristics of soil, among which the internal friction angle is an important parameter. Additionally, an adaptive neuro-fuzzy inference system (ANFIS) was used by Murlidhar et al. [25] to study the shear strength of rock based on the internal friction angle. Das and Basudhar [34] and Al-Hamed et al. [35] developed ANN models to predict the internal friction angle of soil. Besides, Pham et al. [36] developed a hybrid model using random forest and particle swarm optimization to estimate the undrained shear strength of soil. The obtained results showed the effectiveness and reliability of the AI approach, reflected by the high R² values achieved [10, 25, 32, 33]. The Deep Neural Network (DNN) is another powerful and efficient AI algorithm. Nowadays, DNNs have become popular and are widely used to solve practical engineering problems with reliable results [37-40].
The problem with such an ML approach is that it has a large number of critical hyperparameters, making it difficult to find an optimal model architecture. Among various optimization algorithms, Particle Swarm Optimization (PSO) is an efficient, robust, and straightforward algorithm mainly used to solve problems for which an exact mathematical model is difficult to find [41-43]. Therefore, the main objective of this study is to develop a PSO-DNN hybrid model that can evolve on its own to find the best model architecture, including the number of hidden layers and the number of neurons in each hidden layer, for predicting the internal friction angle of soil. Seven input factors that might affect the prediction of the friction angle of soil were considered: the soil state, standard penetration test value, unit weight of soil, void ratio, thickness of soil layer, top elevation of the soil layer, and bottom elevation of the soil layer. For this aim, a data set consisting of 245 soil samples was collected from drill holes in Ho Chi Minh City, Vietnam. The database was then divided into training (60% of the data set), validation (20% of the data set), and testing (20% of the data set) sets, corresponding to the training, validation, and testing phases of the hybrid model. In addition, a global sensitivity analysis using Monte Carlo simulation was conducted to find the most important parameter affecting the prediction of the friction angle of soil.

Experimental Measurement of Friction Angle of Soil.
In this work, the authors tested 245 distinct soil samples from Ho Chi Minh City, Vietnam (Figure 1), and the results are summarized in Table 1. To determine the soil properties, the laboratory method for determining shear resistance in a shear box apparatus was used [44, 45]. The test samples, with their original structure and natural moisture, were prepared by cutting blocks from the original soil samples, which were then taken into ring knives following the ring-knife method of volume determination [46]. The underside and top of the soil sample should be leveled with the ring knife's edge and first covered with dampened paper. For fast shearing without drainage, the absorbent paper must be replaced with tracing paper (or thin plastic) [46]. While cutting the soil samples, soil must also be taken to determine the moisture content.
With sandy soil, the sample was prepared by pouring dry sand onto a cutter installed in a hard-bottomed box with many small holes. Then, the sample is taken into a cylindrical ring knife open at both ends, placed in the shear box, and compressed under pressure σ. If the soil is not preconsolidated, the sample is sheared immediately and quickly, at a rate of 1 mm/min, until it fails. If the soil consolidates under pressure σ, the compressive force is maintained until the required degree of consolidation is reached, and the test then proceeds as normal. In that case, the sample must be sheared slowly, at a rate of 0.01 mm/min (or slower), until it fails. The failure force is the maximum value read on the strain gauges. The vertical pressure was applied at four levels: 50 kPa, 100 kPa, 200 kPa, and 300 kPa. For each level of vertical pressure, the corresponding shear strength of the soil was recorded. The shear strength is then plotted against the vertical pressure, from which the cohesion of the soil C and tan ϕ are determined and the internal friction angle (ϕ) is calculated.
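The final step of the procedure, reading C and ϕ from the plot of shear strength against vertical pressure, amounts to fitting the straight line τ = C + σ tan ϕ. The following is a minimal sketch of that fit, assuming an ordinary least-squares line; the function name `fit_mohr_coulomb` and the shear strength readings are hypothetical and serve only as an illustration, while the four pressure levels are those stated above.

```python
import math


def fit_mohr_coulomb(sigma, tau):
    """Least-squares fit of tau = c + sigma * tan(phi).

    sigma: vertical pressures in kPa
    tau:   measured shear strengths in kPa
    Returns (cohesion in kPa, friction angle in degrees).
    """
    n = len(sigma)
    mean_s = sum(sigma) / n
    mean_t = sum(tau) / n
    sxx = sum((s - mean_s) ** 2 for s in sigma)
    sxy = sum((s - mean_s) * (t - mean_t) for s, t in zip(sigma, tau))
    slope = sxy / sxx                      # slope of the line equals tan(phi)
    cohesion = mean_t - slope * mean_s     # intercept of the line equals c
    return cohesion, math.degrees(math.atan(slope))


# Pressure levels from the text; the shear strengths are made-up values
# chosen to lie exactly on the line tau = 20 + 0.2 * sigma.
sigma_levels = [50.0, 100.0, 200.0, 300.0]
tau_example = [30.0, 40.0, 60.0, 80.0]
c, phi = fit_mohr_coulomb(sigma_levels, tau_example)
print(f"C = {c:.1f} kPa, phi = {phi:.2f} degrees")
```

With real readings the four points rarely fall exactly on a line, so the fitted slope and intercept are estimates whose scatter depends on the test quality.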

Data Preparation.
To predict the internal friction angle of the soil, the inputs need to be carefully considered. First, the water content was not included in this study, as its effect is believed to be accounted for in the number of SPT blows [47]. Besides, many studies have succeeded in finding relationships between the results of in situ experiments (CPT and SPT) and the plastic and liquid limits [48-51]. Since the SPT is one of the most commonly used tests in practice to indicate the soil's ability to withstand compression and in situ shear, it is believed to relate to the factors that characterize the shear strength of the soil. Many studies have shown the relationship between in situ experiment results and the internal friction angle of the soil [26-31]. Second, this study aims to provide a tool for quickly estimating the internal friction angle of the soil from basic parameters that can be determined quickly and simply, rather than from parameters that are time-consuming and costly to determine. Therefore, this study selected the SPT value (N30) of soil as the main feature, together with the soil state (S), unit weight of soil (G), void ratio (e), thickness of soil layer (H), top elevation of soil layer (Z1), and bottom elevation of soil layer (Z2), as inputs to predict the internal friction angle of soil. The internal friction angle of soil (ϕ) is the single output variable.
One of the most popular methods to determine the internal friction angle of soil (ϕ) is the direct shear test [44, 45]. There are three direct shear modes: UU mode (fast shearing, unconsolidated-undrained), CU mode (fast shearing, consolidated-undrained), and CD mode (slow shearing, consolidated-drained; this mode usually applies to sandy soils) [1]. Firstly, the UU mode is suitable for conditions of fast construction and difficult drainage, and the soil samples in the present study were all under this type of condition. Secondly, the UU mode is a quick test, taking under 30 minutes to perform, while the CU and CD modes take longer and can require weeks or even months to complete. Besides, the shear strength parameters given by the UU mode are generally more conservative than those of the other two modes. Therefore, in this study, the results of the UU mode were used to determine the friction angle of all soil samples. The meaning of the soil state number is explained in Table 2. Besides, the parameters' meanings are illustrated in Figure 2, and the histograms of all the input and output variables are shown in Figure 3. The data set containing 245 samples is summarized statistically in Table 1, including the number of soil samples and the minimum, maximum, average, and standard deviation of the input and output variables. Among the 245 soil samples, 46 are sandy soil and the rest are clayey soil. The soil state number (S) ranged from 2 to 9, corresponding to soft clay through dense sandy soil. The minimum depth of the soil samples is 2.0 m, and the maximum depth is 79.32 m. The standard penetration test values (N30) ranged from 2 to 85, where 2 corresponds to the soft clay soil and 85 corresponds to the hard clay soil. The unit weight of soil (G) ranged from 16.2 kN/m³ to 21.57 kN/m³, going from the clay to the sandy soil layers.
The void ratio (e) varied from 0.416 to 2.194, corresponding to medium sandy soil and soft clay soil, respectively. The internal friction angles of soil ranged from 4.23° to 33.4°, reflecting the increase in shear strength from soft clay to medium and dense sandy soil. These input data are consistent with the properties of the soil layers. An example of 50 data samples is given in Table 3.
In this study, we divided the data into three sets: the training set was used to build the model, the validation set was used to estimate model skill while tuning the model's hyperparameters, and the testing set was used to estimate the skill of the final tuned models in order to choose the best one. The test set was fixed at 20% of the total data set, hidden during training, and used only to evaluate the performance of the final model after the hyperparameter tuning process.
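The partition described above can be sketched as follows. This is an illustrative sketch rather than the authors' code: the function name `split_indices` and the seed arguments are assumptions. The test indices are drawn first with their own fixed seed so that they stay hidden and unchanged, while the remaining samples can be reshuffled between training and validation.

```python
import random


def split_indices(n_samples, test_frac=0.2, val_frac=0.2,
                  test_seed=0, shuffle_seed=1):
    """Return (train, validation, test) index lists.

    The test set depends only on test_seed, so it stays fixed when the
    training/validation shuffle (shuffle_seed) changes.
    """
    indices = list(range(n_samples))
    random.Random(test_seed).shuffle(indices)
    n_test = round(test_frac * n_samples)
    test, rest = indices[:n_test], indices[n_test:]

    random.Random(shuffle_seed).shuffle(rest)
    n_val = round(val_frac * n_samples)
    val, train = rest[:n_val], rest[n_val:]
    return train, val, test


# 245 samples split 60% / 20% / 20%, as in the text.
train_idx, val_idx, test_idx = split_indices(245)
print(len(train_idx), len(val_idx), len(test_idx))
```

Fractions are expressed relative to the full data set, which matches the 60/20/20 description used in this study.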

Particle Swarm Optimization.
Particle Swarm Optimization (PSO) was presented by Kennedy and Eberhart [52]. It became popular because it handles continuous optimization and allows multiple objectives to be analyzed. To search continuously for the best solution, the method moves the positions of the particles with a velocity that is recalculated in each iteration. Each particle's movement is influenced by its own best position and by the best position found in the entire search space (Figure 4). This is expected to move the swarm toward the best position. The quality of a solution is assessed through a fitness function. In addition, PSO does not use the gradient of the problem to be optimized, which means that PSO does not require the optimization problem to be differentiable, unlike standard optimization methods such as gradient descent and quasi-Newton methods. PSO is a powerful technique that has been widely used for optimization problems in many fields in general and in geotechnical engineering in particular [53, 54]. The pseudocode of the algorithm is presented in Algorithm 1. The parameters in the equation that defines the velocity of the next iteration are as follows: w is the inertia parameter; c₁ and c₂ are the acceleration coefficients, where c₁ weights the individual best solution and c₂ weights the global best candidate solution; w_damp is the inertia damping coefficient, which helps the swarm converge quickly.
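The roles of w, c₁, c₂, and w_damp can be made concrete with a short sketch of the standard PSO update, v ← w·v + c₁r₁(pbest − x) + c₂r₂(gbest − x), followed by x ← x + v and w ← w·w_damp. This is not a reproduction of Algorithm 1: the parameter values, the box bounds, and the sphere test function are assumptions chosen only for illustration.

```python
import random


def pso_minimize(fitness, dim, bounds, n_particles=20, n_iter=100,
                 w=0.7, w_damp=0.99, c1=1.5, c2=1.5, seed=0):
    """Minimize `fitness` over a box with a basic global-best PSO."""
    rng = random.Random(seed)
    lo, hi = bounds
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                    # personal best positions
    pbest_cost = [fitness(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_cost[i])
    gbest, gbest_cost = pbest[g][:], pbest_cost[g]  # global best so far

    for _ in range(n_iter):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Move the particle and keep it inside the search box.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            cost = fitness(pos[i])
            if cost < pbest_cost[i]:
                pbest[i], pbest_cost[i] = pos[i][:], cost
                if cost < gbest_cost:
                    gbest, gbest_cost = pos[i][:], cost
        w *= w_damp                                 # damp the inertia each iteration
    return gbest, gbest_cost


def sphere(x):
    """Toy fitness function with its minimum of 0 at the origin."""
    return sum(v * v for v in x)


best_x, best_cost = pso_minimize(sphere, dim=3, bounds=(-5.0, 5.0))
print(best_x, best_cost)
```

Only fitness values are compared, so the function being optimized never needs to be differentiated, which is the property highlighted above.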

Deep Neural Network.
The deep neural network (DNN) is one of the most advanced regression methods. A DNN model is analogous to a multistage regression. The main idea is to create a flexible nonlinear statistical model consisting of several layers, with a number of neurons in each layer. In a DNN, each node in a layer is connected, through a weight denoted as w_ij, to every node in the adjacent layers, creating a fully connected neural system [55]. Except for the input layer, each node is a neuron that uses a nonlinear activation function [56].
Thanks to its multiple layers and nonlinear activation functions, the DNN model can handle data that are not linearly separable. The DNN structure used in this study is a multilayer perceptron (MLP), shown in Figure 5. Further details of the DNN model can be found in [57].
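As an illustration of how such a fully connected network maps inputs to an output, the sketch below runs a forward pass through a small MLP. The layer sizes, the tanh activation, the linear output layer, and the random weight range are all assumptions made for this example; the source does not specify them here.

```python
import math
import random


def init_mlp(layer_sizes, seed=0):
    """Create random weights w_ij and zero biases for each pair of adjacent layers."""
    rng = random.Random(seed)
    params = []
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                   for _ in range(n_out)]
        biases = [0.0] * n_out
        params.append((weights, biases))
    return params


def forward(params, x):
    """Propagate one input vector through the network."""
    a = list(x)
    last = len(params) - 1
    for k, (weights, biases) in enumerate(params):
        # Weighted sum of the previous layer's activations for every neuron.
        z = [sum(w_ij * a_j for w_ij, a_j in zip(row, a)) + b
             for row, b in zip(weights, biases)]
        # Nonlinear activation on hidden layers, linear output for regression.
        a = z if k == last else [math.tanh(v) for v in z]
    return a


# Seven inputs, two hidden layers (16 and 8 neurons), one output.
params = init_mlp([7, 16, 8, 1])
x_example = [0.1, 0.5, -0.3, 0.8, 0.0, -0.2, 0.4]   # hypothetical scaled inputs
y_hat = forward(params, x_example)
print(y_hat)
```

Training would adjust the weights and biases to reduce the prediction error; that step is omitted here.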

Particle Swarm Optimization-Deep Neural Network Algorithm.
The PSO-DNN hybrid model was built on the PSO algorithm, with the fitness of each particle in the swarm evaluated by a DNN. The hybrid algorithm is run with the number of hidden layers ranging from 2 to 10. For each number of hidden layers, random particles are generated; the length of the particle vector is the number of hidden layers, and the j-th dimension of the particle vector is the number of neurons in the j-th hidden layer. The swarm architecture is presented in Figure 6, in which k is the number of hidden layers and N is the number of particles in a swarm. The flow chart of the hybrid PSO-DNN model is shown in Figure 7. In this algorithm, the number of hidden layers is incremented from 2 to 10, and for each value the algorithm automatically searches for the optimal model architecture with that number of hidden layers.
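The particle encoding described above can be sketched as a mapping from a continuous particle vector to a list of layer sizes. This is a hedged illustration: the neuron bounds (1 to 50), the swarm size of 5, and the helper names are assumptions, and the fitness evaluation (training a DNN with the decoded architecture and scoring it on the validation set) is left out.

```python
import random


def random_particle(n_hidden_layers, min_neurons, max_neurons, rng):
    """One particle: a vector with one entry per hidden layer."""
    return [rng.uniform(min_neurons, max_neurons) for _ in range(n_hidden_layers)]


def decode_particle(particle, min_neurons, max_neurons):
    """Round each dimension to an integer neuron count within the bounds."""
    return [int(min(max_neurons, max(min_neurons, round(p)))) for p in particle]


def layer_sizes(particle, n_inputs=7, n_outputs=1, min_neurons=1, max_neurons=50):
    """Full architecture: input layer, decoded hidden layers, output layer."""
    return [n_inputs] + decode_particle(particle, min_neurons, max_neurons) + [n_outputs]


# One swarm per number of hidden layers, k = 2, ..., 10.
rng = random.Random(0)
swarms = {k: [random_particle(k, 1, 50, rng) for _ in range(5)]
          for k in range(2, 11)}

# A particle with four dimensions decodes to four hidden layers;
# the last entry exceeds the upper bound and is clipped to it.
example = layer_sizes([12.4, 30.6, 7.2, 80.0])
print(example)
```

Keeping the particle continuous lets the usual PSO velocity update be applied unchanged, with rounding deferred to the moment the architecture is built.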

Performance Evaluation.
To evaluate the effectiveness of the proposed AI models, three widely used statistical criteria are applied, namely, mean absolute error (MAE), root mean square error (RMSE), and the coefficient of determination (R²). In regression analysis, the R² coefficient is commonly used to estimate the proportion of the variation in the data that is explained by the prediction [58]. To measure the average magnitude of the error, both MAE and RMSE were used [59]. The model is more accurate when R² is closer to 1 and MAE and RMSE are closer to 0. These coefficients are determined by the following formulas:

$$\mathrm{MAE} = \frac{1}{k}\sum_{i=1}^{k}\left|y_i - \hat{y}_i\right|$$

$$\mathrm{RMSE} = \sqrt{\frac{1}{k}\sum_{i=1}^{k}\left(y_i - \hat{y}_i\right)^2}$$

$$R^2 = 1 - \frac{\sum_{i=1}^{k}\left(y_i - \hat{y}_i\right)^2}{\sum_{i=1}^{k}\left(y_i - \bar{y}\right)^2}$$

where $k$ represents the number of samples, $y_i$ and $\hat{y}_i$ are the actual and predicted outputs, respectively, and $\bar{y}$ is the average value of $y_i$.

Mathematical Problems in Engineering
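The three criteria (MAE, RMSE, and R²) can be computed directly from paired lists of actual and predicted values. The sketch below uses the standard definitions of these quantities; the function name and the three sample values are hypothetical.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return (R^2, MAE, RMSE) for paired actual and predicted values."""
    k = len(y_true)
    mean_y = sum(y_true) / k
    ss_res = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))   # residual sum of squares
    ss_tot = sum((y - mean_y) ** 2 for y in y_true)              # total sum of squares
    mae = sum(abs(y - p) for y, p in zip(y_true, y_pred)) / k
    rmse = math.sqrt(ss_res / k)
    r2 = 1.0 - ss_res / ss_tot
    return r2, mae, rmse


# Made-up friction angles in degrees, for illustration only.
y_true = [10.0, 20.0, 30.0]
y_pred = [12.0, 18.0, 30.0]
r2, mae, rmse = regression_metrics(y_true, y_pred)
print(r2, mae, rmse)
```

For these values the residual sum of squares is 8 and the total sum of squares is 200, giving R² = 0.96, MAE = 4/3, and RMSE = √(8/3).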

Effect of Training Data Size.
Because the testing set size was fixed at 20% of the total data set and hidden from the training process, the training set size was varied by reducing it from 60% to 20% of the data set, so that the validation set size correspondingly increased from 20% to 60%. For each training set size, 300 simulations were run, taking into account the random shuffle in the training order. The initialization parameters of the DNN model used in this section are presented in Table 4. From Figure 8, it can be seen that the training performance progressively increased as the training set shrank. The average R² value on the training set increased from 0.952 to 0.991 when the training set size was decreased from 60% to 20%. On the other hand, the predictions became less accurate on the validation and testing sets as the training set size decreased. The average R² value decreased from 0.889 to 0.602 on the validation set and dropped from 0.919 to 0.570 on the testing set. In addition, the standard deviation of the DNN models increased as the training set size decreased. To be more specific, the standard deviation increased from 0.0277 to 0.351 on the testing set. These results show that the model performed better and was more stable with the largest training set size. To increase the prediction accuracy of the model, a training set size of 60% was selected in this study.
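The repeated-simulation procedure (many random shuffles for each training set size, summarized by the mean and standard deviation of R²) can be sketched as below. To keep the example self-contained, a one-variable least-squares line stands in for the DNN and the data are synthetic; both are assumptions, so the numbers it prints are unrelated to those reported in this section.

```python
import random
import statistics


def fit_line(xs, ys):
    """Ordinary least-squares line, used here as a stand-in for the DNN."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return slope, my - slope * mx


def r2(ys, preds):
    """Coefficient of determination."""
    my = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, preds))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


def repeated_scores(xs, ys, train_frac, n_runs=300, seed=0):
    """Reshuffle, refit, and score on the held-out part n_runs times."""
    rng = random.Random(seed)
    idx = list(range(len(xs)))
    n_train = round(train_frac * len(xs))
    scores = []
    for _ in range(n_runs):
        rng.shuffle(idx)
        tr, held = idx[:n_train], idx[n_train:]
        slope, intercept = fit_line([xs[i] for i in tr], [ys[i] for i in tr])
        preds = [slope * xs[i] + intercept for i in held]
        scores.append(r2([ys[i] for i in held], preds))
    return statistics.mean(scores), statistics.stdev(scores), scores


# Synthetic data: a noisy straight line (illustration only).
noise = random.Random(42)
xs = [float(i) for i in range(50)]
ys = [2.0 * x + noise.gauss(0.0, 1.0) for x in xs]

mean_60, sd_60, scores_60 = repeated_scores(xs, ys, train_frac=0.6)
mean_20, sd_20, scores_20 = repeated_scores(xs, ys, train_frac=0.2)
print(mean_60, sd_60, mean_20, sd_20)
```

Reporting the standard deviation alongside the mean is what allows the stability of each training set size to be compared, as done in Figure 8.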

DNN Structure Optimization.
The evolutionary results of the PSO-DNN models are evaluated in this section. The initialization parameters of PSO-DNN used in this study are given in Table 5. Figure 9 illustrates the evolution of the PSO-DNN model through 30 generations with the number of hidden layers set to 2, 4, 6, 8, and 10. A summary of the best predictability of the models is presented in Table 6.
It can be seen that the DNN model with 10 hidden layers evolved to deliver better performance; the values for the best generation are reported in Table 6. The DNN model with 10 hidden layers gives the best performance for the R² and MAE criteria, while the model with 8 hidden layers gives the best result for the RMSE cost function. However, the best neuron structure differs for each cost function and is also shown in Table 6. The evolution of the number of neurons in the 10 hidden layers and 8 hidden layers of the best DNN models with respect to R², MAE, and RMSE is presented in Figure 10. The evolution of the model has thus found three good candidates; the best model is selected in the next section.

Predictive Capability of the Models.
In this section, the three best models, namely, 10 hidden layers with R² (10-HD-R²), 10 hidden layers with MAE (10-HD-MAE), and 8 hidden layers with RMSE (8-HD-RMSE), were compared with each other. From the point of view of statistical probability, the random factors in the input variables should be considered. Therefore, 300 simulations were run to account for the random ordering of the input samples. The data used in this section include the training, validation, and testing data sets. The criteria for selecting the best model are based on the results on the testing data set. The results are shown in Figure 11 and Tables 7-9. The predictions on the training and validation sets were accurate, with average R² ranging from 0.851 to 0.946, MAE ranging from 1.188 to 2.105, and RMSE varying from 1.566 to 2.668. Among the three good models, the 10-HD-R² model gives the best prediction, yielding an average R² = 0.935, MAE = 1.34, and RMSE = 1.77 on the testing data set. The next best model is 8-HD-RMSE, which produces intermediate accuracy (R² = 0.881, MAE = 1.523, and RMSE = 2.019). Moreover, the 10-HD-R² model has smaller standard deviations than the 8-HD-RMSE model: SD = 0.0138, 0.1339, and 0.1835 for the R², MAE, and RMSE criteria, respectively, compared with SD = 0.0186, 0.1923, and 0.3264. Table 10 shows the comparison between the DNN model and the PSO-DNN model. The DNN model received initialized parameters as shown in Table 11. The best 10-HD-R² PSO-DNN model was included in the comparison to demonstrate the effectiveness of the algorithm. The average error criteria over 300 simulations were compared to confirm the model's performance. It can be seen that the 10-HD-R² PSO-DNN model outperformed the DNN model, improving the average R² by 1.83%, MAE by 5.94%, and RMSE by 8.58%.
Table 11 presents some research results on ML applications in determining the internal friction angle of soil. The results of the present study, as well as those of other studies, show the effectiveness of the ML technique in predicting the internal friction angle of the soil, with R² from 0.79 to 0.935 on the testing data set. However, because the data sets and input variables differ, a direct comparison of these results is not meaningful. A project using different data sets as well as various input variables is needed to provide a general model for the prediction of soil shear parameters. Figure 12 shows a visual comparison of the measured and predicted friction angles of soil for the DNN and PSO-DNN models. The performance of the two ML models was tested on all three data sets: training, validation, and testing.

Sensitivity Analysis.
In this section, a global sensitivity analysis was conducted to evaluate the importance of the input parameters for the model using Monte Carlo methods [63]. It is an effective way to investigate the relationship between input and output. The input data set taken from Saltelli's sampling scheme was used to develop the DNN model [64]. The global sensitivity index is determined by the following formulas:

$$\mathrm{Var}(Y) = \sum_{i=1}^{d} \mathrm{Var}_i + \sum_{i<j}^{d} \mathrm{Var}_{ij} + \cdots + \mathrm{Var}_{1,2,\ldots,d}$$

$$S_{Ti} = \frac{\mathrm{Var}_i + \sum_{j \neq i} \mathrm{Var}_{ij} + \cdots}{\mathrm{Var}(Y)}$$

where $\mathrm{Var}(Y)$ denotes the total variance of the model output; $d$ denotes the number of input features; $\mathrm{Var}_i$ denotes the model output variance in response to variation of the $i$-th input variable; $\mathrm{Var}_{ij}$ denotes the model output variance in response to the simultaneous variation of the $i$-th and the $j$-th inputs; and $S_{Ti}$ denotes the total sensitivity index. The result of the total sensitivity analysis is shown in Figure 13. It can be seen that, among the 7 input variables used to predict the friction angle of soil, the bottom elevation of the soil layer (Z2) was the most important feature, with an average sensitivity index of 0.625. The top elevation of the soil layer (Z1) was the second most important variable, with an average sensitivity index of 0.603. From the point of view of soil mechanics, the deeper the soil is, the more compact it is, so the depth of the soil sample plays an important role in determining the friction angle of the soil, as its density, state, and many other mechanical properties change with depth. The variables H, G, and S were ranked as the third to fifth most important predictors, with average sensitivity indices ranging from 0.203 down to 0.05. The other predictor variables included in the model (e, G) had sensitivity indices lower than 0.01, indicating that they did not affect the output prediction. It is important to note that the sensitivity analysis in this section is only relevant for the input data set itself and cannot be confirmed with other data sets.
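A total sensitivity index of this kind can be estimated by Monte Carlo sampling. The sketch below uses the Jansen estimator, S_Ti ≈ Σ(f(A) − f(A with column i taken from B))² / (2N·Var(Y)), with plain pseudo-random samples on the unit cube rather than Saltelli's sampling scheme, and a toy linear function in place of the trained DNN. These substitutions are assumptions made to keep the example short and self-contained.

```python
import random


def total_sobol_indices(model, d, n=4000, seed=0):
    """Jansen estimator of total-order indices for independent U(0, 1) inputs."""
    rng = random.Random(seed)
    a = [[rng.random() for _ in range(d)] for _ in range(n)]   # sample matrix A
    b = [[rng.random() for _ in range(d)] for _ in range(n)]   # sample matrix B
    f_a = [model(row) for row in a]
    mean = sum(f_a) / n
    var_y = sum((v - mean) ** 2 for v in f_a) / n              # total output variance

    indices = []
    for i in range(d):
        acc = 0.0
        for row_a, row_b, fa in zip(a, b, f_a):
            mixed = row_a[:]
            mixed[i] = row_b[i]          # resample only the i-th input
            acc += (fa - model(mixed)) ** 2
        indices.append(acc / (2.0 * n * var_y))
    return indices


def toy_model(x):
    """Toy stand-in for the trained model: the third input has no effect."""
    return 2.0 * x[0] + x[1]


s_total = total_sobol_indices(toy_model, d=3)
print(s_total)
```

For this toy function the exact total indices are 0.8, 0.2, and 0, so the estimates for the first two inputs should be close to those values and the third is exactly zero because changing it never changes the output.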

Conclusions
In this study, a PSO-DNN hybrid model, which can evolve on its own to find the best models, was developed to predict the friction angle of soil. The hybrid model evolved on its own and identified the three best among the models surveyed. A database containing 245 soil samples from geological boreholes was used to develop and evaluate the three proposed DNN models: 10-HD-R², 10-HD-MAE, and 8-HD-RMSE.
The results show that the models' performance improved and stabilized, with R² rising from 0.602 to 0.889 on the validation set, as the training set size increased from 20% to 60%. The results also show that PSO-DNN obtains the best results with 8 to 10 hidden layers. The optimal number of neurons is not the same in each hidden layer and varies in a complex way across the hidden layers. A DNN model with 8 to 10 hidden layers is therefore recommended as potentially optimal for predicting the friction angle of soil. However, it is advisable to select the number of neurons in each hidden layer through evolutionary methods to make the DNN model highly efficient. On the training and validation data sets, all three best models, 10-HD-R², 10-HD-MAE, and 8-HD-RMSE, gave good predictions, with the 10-HD-R² model leading; on the testing data set, the 10-HD-R² DNN model still gave the best results and outperformed the other two models. Prediction results of 300 simulations show that the 10-HD-R² DNN model has a smaller standard deviation, indicating that it is more stable than the other two models.
In addition, a sensitivity analysis using the Monte Carlo method was carried out to evaluate the importance of the input features in the model. The results show that the two inputs related to the depth of the soil layers (Z2 and Z1) were the most important parameters for predicting the soil friction angle.
Data Availability
The data used in the manuscript are available in the supplementary materials. The data may be freely reshared with other researchers, provided that they are cited in the resulting research.