Long Short-Term Memory Networks and Bayesian Optimization for Predicting the Time-Weighted Average Pressure of Shield Supporting Cycles

Characteristic parameters of shield supporting in fully mechanized mining, especially time-weighted average pressure (TWAP), are crucial for the analysis and prediction of roof weightings in longwall panels. Despite the leap-forward development of underground data collection and transmission, mining and regional correlation analysis of massive shield data remains challenging. In this study, a hybrid machine learning model integrating the long short-term memory (LSTM) networks and the Bayesian optimization (BO) algorithm was developed to predict TWAP based on the setting pressure (SP), revised setting pressure (RSP), ﬁ nal pressure (FP), number of yielding (NY), TWAP in the last supporting cycle (TWAP (last)), and loading rate in each period. Statistical measures including the mean square error and mean absolute error were used to validate and compare the prediction performances of the BP model, the LSTM model, and the BO-LSTM model. Furthermore, sensitivity studies were carried out to evaluate the importance of input parameters. The results show that the BO-LSTM model is robust in predicting TWAP. FP and TWAP (last) are the most important input parameters in TWAP prediction, followed by RSP and NY. Moreover, the total importance scores of loading rates reach 0.229, indicating the necessity of including these parameters into the dataset. The proposed BO-LSTM model is capable of predicting TWAP which serves for shield-roof status intelligent perception.


Introduction
With the surge in energy demands and mining intensity, shallow resources are decreasing while deep resources are gradually exploited [1]. High ground stress induced by deep mining complicates mine pressure behaviors, including partial roof fall, rib spalling [2], shield crushing, and even rock burst [3]. Mine pressure monitoring is an important means to ensure safe and efficient production in coal mines. Mine pressure data in the roof-control area are comprised of basic data (roof-to-floor convergence, shield load, and leg closure) and statistical data (fissure development, the number of gob steps, etc.) [4]. In recent years, microseismic signals and elec-tromagnetic radiation signals have been included in the dataset [5]. In fact, time-weighted average pressure (TWAP) in the dataset, which belongs to shield supporting feedback and one of the judgment bases for roof weighting, is significant for weighting prediction [6]. As electronic technology and sensor theories advance, the information transmission speed and accuracy have achieved a leap-forward development thanks to the shield electrohydraulic monitoring technology [7]. However, deep analysis on nonlinear and nonstationary shield pressure signals remains challenging, and there is a lack of correlation analysis on information monitored by different stations. Machine learning (ML) is a branch of artificial intelligence that has experienced tremendous progress in the last two decades. Therefore, research on ML-based TWAP prediction of shields in the fully mechanized working face is meaningful for realizing the intelligent perception and coupling adaptation of shield-roof status.
ML has been extensively applied to mining engineering at an astounding rate primarily for the purposes of mine disaster monitoring (e.g., gushing water, gas, and mine pressure [8][9][10][11]) and roadway excavation technology (e.g., coal-rock interface recognition and digital drilling [12][13][14][15][16][17]). With respect to mine pressure monitoring, Li et al. [18] optimized the backpropagation (BP) neural network by using the particle swarm optimization (PSO) algorithm and established a PSO-BP model for assessing rock burst risk. Tan et al. [19] optimized the BP neural network by using the genetic algorithm (GA) and established a GA-BP mine pressure prediction model for predicting the weighting interval and intensity. Sun et al. [20] processed the monitored mine pressure data by means of chaotic analysis and established a shield pressure prediction model based on a support vector machine. The above researches all strived to predict roof weighting based on the traditional neural network. Although some researchers optimized the neural network by using the PSO algorithm, they failed to fully consider the time sequence of mine pressure data. Problems remain in the adaptability between intelligent optimization algorithms and neural networks. In addition, researchers rarely built a weighting prediction model considering the load-bearing information of shields (e.g., TWAP), and the characteristic parameters of shield supporting cycle lacks in-depth exploration. In summary, the weighting prediction model can still be optimized to a large extent.
Despite the application of various ML algorithms to roof weighting prediction, the applicability and mobility of these algorithms are questioned for two reasons. First, mine pressure behaviors possess strong periodicity which is reflected by the time sequence change in shield pressure data, so it is essential to choose a model suitable for dealing with time sequence problems. Second, due to the improper choice of input variables, a model matching the current engineering background can hardly be adjusted adaptively to the best performance in another engineering background. Long short-term memory (LSTM), first proposed by Hochreiter and Schmidhuber [21], is an excellent deep learning network widely adopted in fields such as natural language processing and time sequence prediction [22][23][24]. Since its establishment, the LSTM model demonstrates extraordinary performance for mine pressure prediction [25]. It is absolutely essential to select a suitable intelligent optimization algorithm to adjust its parameters. The commonly used optimization algorithms include the GA, the simulated annealing (SA) algorithm, the PSO algorithm, and the Bayesian optimization (BO) algorithm. Among these algorithms [26][27][28][29][30][31][32], the BO algorithm whose iterations are few is able to quickly find the optimal value without wasting resources. Such a feature enables this algorithm to be applicable to the coal mine field.
Several contributions of the study are summarized as follows: (1) Loading rates in each period are applied to TWAP prediction for the first time. (2) The BO-LSTM hybrid model is proposed to better predict the TWAP value in the next supporting cycle by setting appropriate input parameters, and this value is regarded as the criterion for weighting prediction. The study is mainly aimed at efficiently and accurately predicting the TWAP value in the next shield supporting cycle through the BO-LSTM model and serving for the intelligent perception of shield-roof status. The input parameters include setting pressure (SP), final pressure (FP), loading rates in each period, and number of yielding (NY) [33][34][35][36]. Considering that model development and validation require sufficient available shield data, the 22307 fully mechanized working face of the Bulianta Coal Mine was taken as the study area. The Pearson correlation coefficient was adopted for analyzing correlation between the input parameters and the output parameter. Meanwhile, the sensitivity between the input parameters and the output parameter was analyzed in depth. The 22307 working face is equipped with 150 ZY/18000/32/70D shields whose numbers are given in Figure 1(b). A PM32 electrohydraulic controller and a pressure sensor are arranged on each shield (Figure 1(c)). The data acquisition and transmission method is presented in Figure 1(d) [37]. Specifically, the pressure data collected by the sensors are conveyed through the CAN bus network to the ground server and the host computer, and then the automatic control of the shields is realized through the decision of the host computer. The self-developed Status of Shield and Roof IntelliSense (SSRI) system, integrated with the data mining and cycle analysis technology, is capable of efficiently and accurately analyzing massive monitoring data and cooperates with the host computer to make decisions intelligently. Detailed architecture and working principle of the SSRI system can be found in the literature [38].

Dataset Preparation
With the aid of the SSRI system, a total of over 56,000 pieces of data are extracted from the massive pressure data of 80 shields in the 22307 working face (Figure 3(a)). In this study, three shields, i.e., 1#, 40#, and 80#, from the end to the middle of the working face are selected as the research object. The processed pressure data of the three shields are displayed in Figure 3(b). The pressure data of each shield contains n supporting cycles reflecting the relationship between the shield and the surrounding rock.

Data collection
2.2.1. Output. The output variable is TWAP. The shield supporting cycle refers to the pressure variation process of a shield, including its setting, loading/unloading, and forward moving. The cycles can be obtained from the pressure data in Figure 3(b). TWAP [39], a characteristic quantity reflecting the pressure variation in shield cycles, is the basis for where P t is TWAP (MPa); P n is the shield pressure (MPa); t n is the total time (min). In this study, the SSRI system can identify the number of cycles in each pressure data set and calculate the TWAP value in each cycle. The processed results of the pressure data ( Figure 3(a)) indicate that each shield contains 600 supporting cycles. The overall TWAP data within 1100 h are shown in Figure 5.

Input
Variables. The input variables are multidimensional feature parameters of supporting cycles. As can be observed in Figure 4, the typical shield supporting cycle can be simplified into three periods in accordance with the loading/unloading characteristics: initial period (T 1 ), relatively stable period (T 2 ), and cutting influenced and neighboring shield movement period (T 3 ). Based on the loading characteristics of the supporting cycle in different stages, the SSRI system can extract 27 characteristic parameters by employing multiple algorithms such as Naive Bayes, clustering, classification, and correlation. In addition to SP and FP, loading rates and NY are also included in these parameters as input variables for TWAP prediction.
The 27 characteristic parameters of TWAP factors are subjected to gray correlation analysis. Among them, 9 parameters with high correlation degrees, namely, SP, revised setting pressure (RSP), FP, NY, loading rates in the initial period (LRIP 5 min and LRIP 10 min), loading rate in the relatively stable period (LRRSP), loading rate in the cutting influenced and neighboring shield movement period (LRCMP), and TWAP in the last supporting cycle (TWAP last), are regarded as input variables in this study.
(1) SP and RSP. SP, an active supporting force provided by the shield to the roof, plays a crucial role in ground controlling in the working face (point A in Figure 4). After shield pressure reaches SP, it will first experience a short-term (1-3 min) drop and then rise slowly or stabilize. The shield leg pressure at this moment is the actual SP provided by the shield to the roof, and it is defined as the RSP (point B in Figure 4).
(2) FP. The FP of a shield (point C in Figure 4), which is normally considered as the maximum supporting load in the whole supporting cycle, is attained when its neighboring shields are lowered and advanced.
(3) Loading Rates. Loading rates refer to the rates of load changes in the T 1 , T 2 , and T 3 stages in Figure 4. They reflect the state of roof activity in different time periods. Among the three, LRIP = M 1 /T 1 , LRRSP = M 2 /T 2 , and LRCMP = M 3 /T 3 . It is noteworthy that LRIP (5 min) and LRIP (10 min) refer to the ratio of resistance change to time within the first 5 min and 10 min in the T 1 stage, respectively.
(4) NY. When the shield pressure reaches or exceeds the rated shield capacity, high-pressure emulsion overflows for protecting the shield leg under the condition that working face production is suspended for a long time or that the roof activity is intense. The process of yield valve opening and closure is called a yielding cycle. The number of yielding cycles is an important indicator reflecting the intensity of roof activity.

Geofluids
The data of 9 feature parameters in this study are preliminarily analyzed in Table 1.

Fundamental Theory of LSTM and BOA
3.1.1. Long Short-Term Memory Networks (LSTM). LSTM neural networks are especially applicable to processing highly time-dependent problems. Roof weighting has notable periodic characteristics, as the pressure waveform is a strong time series whose front and back inputs are considerably correlated. Thus, LSTM neural networks are applied to predicting TWAP of shields.
The LSTM neural unit is optimized on the basis of the recurrent neural network (RNN) by adding control gates inside the units. The addition of a new cell state input enables the LSTM to remember and retain historical information. The entire neural unit is controlled by the input gate, the forget gate, and the output gate.
The internal structure of the LSTM neural unit at three consecutive times is illustrated in Figure 6 where x is the input vector of the neural unit at the current time; h is the output vector of the neural unit; C is the cell state; σ and tan h are both activation functions; f , i, and O are the forget gate, the input gate, and the output gate, respectively; and W and b are the weight and the deviation matrix, respectively.
Equations (2)-(7) show the internal operations followed by neural units: 3.1.2. Bayesian Optimization (BO) Algorithm. The BO algorithm is a kind of probability distribution optimization algorithm used for the automatic tuning of ML hyperparameters. The algorithm, whose core idea is to give an optimized objective function and update the posterior distribution of the objective function by continuously adding sample points, is mainly oriented to solving complex black box problems with multimodality, nonconvexity, high dimensionality, and high evaluation costs. Established on the basis of the Gaussian process, it considers the previous parameter information and constantly updates the prior. It not only greatly saves the optimization cost due to the small number and fast speed of iterations, but it also boasts good performance against both convex and nonconvex optimization problems. The basic flow of the BO algorithm is listed in Figure 7.
3.2. Modelling Methodology. Shield data, a kind of strong time series data, feature instability, nonlinearity, and periodic uncertainty. In this study, a prediction model regarding shield pressure data is built based on LSTM networks. Since data of various shields are not synchronized with significant differences, the BO algorithm is combined with the LSTM model so that the hyperparameters match the data characteristics of each shield.

Geofluids
The framework of the BO-LSTM model is listed in Figure 8. It is known that both the structure and model parameters of an artificial neural network exert an influence on the performance of the model, so the number of units in the LSTM layers, the parameter of the dropout layer, the batch size, and the optimizer learning rate are selected as the optimization objects of the Bayesian optimizer. The model is composed of an input layer, two LSTM layers, a dropout layer, and an output layer. The loss function adopts the mean square error (MSE), and the model training process is optimized through the Adam algorithm. The optimization value ranges of hyperparameters are set as follows: number of units in the LSTM layers (1, 80), batch size (1, 80), optimizer learning rate (0.0001, 0.01), and parameter of the dropout layer (0.1, 0.6).
The BP model and the LSTM model are taken as contrast models in this study. The BP model is comprised of an input layer, two hidden layers (20 units), a dropout layer (0.2 parameter), and an output layer. The LSTM model consists of an input layer, a LSTM layer (20 units), a dropout layer (0.2 parameter), and an output layer. The learning rate of the Adam optimizer is 0.01.
The flow of the BO-LSTM model is as follows: Step 1. To divide the experimental data into the training set and the test set.
Step 2. To initialize the BO algorithm by taking batch size, optimizer learning rate, and dropout layer parameter in the LSTM network model as optimization objects.
Step 3. To randomly calculate the current function distribution according to the formula.
Step 4. To adjust the current function distribution according to the strategy selected by the selection function.
Step 5. To judge whether the termination condition is met. If it is met, the optimal hyperparameter values are returned; otherwise, Step 4 is returned.
Step 6. To construct the LSTM network model with the optimal hyperparameters.
Step 7. To conduct training and testing.

Statistical Assessment.
The assessment of this study was carried out using MSE and the mean absolute error (MAE) [40], the most common indicators for ML model validation and comparison. The equations of the two indexes are as follows: where N is the number of instances;ŷ n is the value predicted by the model; y n is the true value. The two indexes can both represent the error betweenŷ n and y n . A smaller value of the index is indicative of a smaller deviation of the predicted value from the true value and better performance of the model.

Correlations among Inputs and Output.
The relationships between the 9 input variables and the output variable have been investigated based on the Pearson correlation coefficient [41]. Figure 9 is a distribution diagram of correlation coefficients among variables. The results show that SP, LRIP, FP, NY, and RSP are all positively correlated with TWAP, and the loading rates in the three stages (T 1 , T 2 , and T 3 ) will Figure 6: Structure of the LSTM neural unit.

TWAP Prediction for 40# Shield.
The TWAP values predicted by the three models on the test sets and the actual TWAP values for 40# shield are compared in Figure 11. Among the 662 pieces of 40# shield data (Sept. 27, 201427, -Nov. 11, 2014, 70% are taken as the training set and 30% are used as the test set. Figure 11(a) displays the results predicted by the BP model, the MSE value and MAE value being 0.0194 and 0.1058, respectively. Figure 11(b) exhibits the results predicted by the LSTM model, the MSE value and MAE value being 0.0165 and 0.0924, respectively. Figure 11(c) gives the results predicted by the BO-LSTM model. After being tuned by the BO algorithm, the hyperparameters of the BO-LSTM model are as follows: the number of units in the LSTM layer is 77, the batch size is 65, the optimizer learning rate is 0.0061, and the dropout layer parameter is 0.24. In this case, the MSE value and MAE value are 0.0155 and 0.0886, respectively.

TWAP Prediction for 80# Shield.
The TWAP values predicted by the three models on the test sets and the actual TWAP values for 80# shield are compared in Figure 12. Among the 645 pieces of 80# shield data (Sept. 27, 201427, -Nov. 11, 2014, 70% are taken as the training set and 30% are used as the test set.

Analysis of Prediction
Results. The errors of the prediction results by the BP, LSTM, and BO-LSTM models are shown in Figure 13, according to which the BO-LSTM model achieves the best prediction performance. Compared with the BP model, the LSTM model succeeds in reducing the errors when it is used to predict the TWAP data of shields. This result suggests that the LSTM model is more suitable for processing TWAP data with a notable time sequence than the BP model. Moreover, the BO-LSTM model whose hyperparameters have been automatically tuned by the BO algorithm boasts an improved performance compared with the LSTM model which has not received BO optimization, with the errors of the three shields by the BO-LSTM model all being smaller than those by the LSTM model and the BP model. The result confirmed the feasibility of the BO algorithm in improving the performance of the LSTM model.
The BP model presents the largest errors because its TWAP prediction is based on the data of the previous moment. It lacks the memory of data characteristics in the past. The LSTM model demonstrates a better prediction performance because it memorizes data characteristics in the past. However, it fails to be well compatible with the difference in data characteristics of each shield because the shield weightings in the 22307 working face are not synchronous while the LSTM model adopts the same set of hyperparameters to train the data of shields. In contrast, the BO-LSTM model boasts the smallest prediction errors because it not only possesses the advantages of the LSTM model but also finds out the hyperparameters that are the most suitable for the current data in accordance with the distinct characteristics of shields.
With respect to the spatial evolution, the MSE value and MAE value increase from 1# shield to 80# shield, demonstrating that the prediction error of the shield (80#) in the middle is larger than that of the shield (1#) in the end. The reason is that compared with 1# shield, 80# shield corresponds to a smaller weighting interval and experiences more fluctuant supporting pressure [42].
In short, the BO-LSTM model proposed in this study can quickly and effectively predict the TWAP values of shields at different positions. Its performance may be improved due to     PDP is a popular tool to determine the sensitivity of the output variable on the individual input variables [45]. The horizontal axis represents all possible values of the input variable, and the vertical axis represents the mean value of the predicted output value. Figures 14(a)-14(h) demonstrate the partial dependence of input variables (excluding LRIP (10 min)) and the output variable. The results reveal that loading rates in different stages, FP, and TWAP (last) are positively correlated with the TWAP value. SP and RSP are positively correlated with the TWAP value before 27 MPa, whereas they are negatively correlated with it after this pressure, indicating that 27 MPa is the best SP. The TWAP value varies more intensely with the changes in FP and TWAP (last), compared with NY, SP, RSP, and loading rates in different stages.
To illustrate the importance of input variables more clearly, the importance scores (IS) of all input variables were deduced through normalizing the summation ( Figure 15). The results manifest that FP (IS: 0.277) exerts the greatest influences on TWAP prediction, followed by TWAP (last) (IS: 0.241), RSP (IS: 0.109), NY (IS: 0.077), SP (IS: 0.067), and loading rates in different stages (IS: mostly around 0.06) in turn. From IS, it can be seen that FP and TWAP (last) are the primary variables influencing the TWAP prediction performance. The two parameters are also the basis for judging roof weighting in traditional mining, and this explains why the TWAP value varies more intensely with the changes in FP and TWAP (last) in Figure 14.
Indeed, traditional feature parameters (e.g., FP and TWAP (last)) remarkably influence the TWAP value in the next supporting cycle, which has been proved by IS. Nevertheless, the total IS of LRIP, LRRSP, and LRCMP reaches 0.229, so it can be concluded that loading rates in different stages are also indispensable for TWAP prediction.
In a supporting cycle, the feature parameters which are later in time play a more important role in TWAP prediction. For instance, LRCMP is more important than LRIP, and FP is more important than SP and RSP. Meanwhile, the shield supporting characteristic with a higher loading rate plays a greater role in TWAP prediction.
In addition, NY also exerts a certain influence on TWAP prediction. The importance of RSP on TWAP prediction is larger than that of SP, probably because when the shield is initially set, it has not fully contacted the roof and SP cannot obviously reflect the roof condition. This phenomenon exists commonly in shield supporting cycles. For example, in the 22304 working face of the Buertai Coal Mine in China and the LW23 working face of the Emerald Coal Mine in the United States, 48.5% and 71.9% of shield supporting cycles underwent a load decrement of over 1 MPa after the shield was initially set, and the shield supporting cycles within which the load decrement lasted shorter than 1 min accounted for 70.3% and 96%, respectively.
In general, the above research results verify the importance of the thoroughly explored input variables to a certain extent. It is believed that further research based on this viewpoint will have far-reaching significance for the further development of TWAP prediction.

Conclusions
In this study, the BO-LSTM model, a hybrid ML model integrating the advantages of the LSTM model and the BO algorithm, was proposed to predict the shield TWAP values for the purpose of shield-roof status intelligent perception. A total of over 56000 pieces of regionally correlated shield pressure data were collected from the Bulianta Coal Mine, and the SSRI system assisted in identifying the supporting cycles and extracting characteristic parameters from them. Nine parameters sharing large gray correlation degrees with the output variable (TWAP), namely, SP, RSP, FP, NY, loading rates, and TWAP (last), were selected as the input variables. Statistical measures including MSE and MAE were used to validate and compare the prediction performance of the BP model, the LSTM model, and the BO-LSTM model. Moreover, PDP and IS were employed to evaluate the importance of input parameters in the model study. The main conclusions are as follows.