Research on Identification of Natural and Unnatural Earthquake Events Based on AlexNet Convolutional Neural Network

Accurately and quickly identifying the types of natural and unnatural earthquake events is the basic premise of monitoring, prediction, early warning, and other studies in the field of seismology, and it is of great significance to the prevention and evaluation of earthquake disasters and to emergency rescue. The convolutional neural network is a representative deep learning algorithm that has been widely used in computer vision, natural language processing, object type identification, and other fields in recent years. In this study, the AlexNet convolutional neural network model is used to identify the types of 1539 earthquake event waveform records in and around Ningxia Hui Autonomous Region, China. The waveform records contain three types: natural earthquake, explosion, and collapse, of which explosion and collapse are unnatural earthquakes. MATLAB software is used to build the training module and test module for the AlexNet convolutional neural network model, and each earthquake event waveform record is transformed into an image file of 224 × 224 pixels as input. After training, the AlexNet convolutional neural network model is able to identify earthquake event types automatically. The results of this study show that the identification accuracy of earthquake event type in the training module is 99.97% with an average loss function value of 0.001, and the identification accuracy in the test set is 98.51% with an average loss function value of 0.059. After training and testing, 60 earthquake event waveform records of different types were randomly selected and identified automatically by the AlexNet convolutional neural network model. The automatic identification accuracy for natural earthquakes, explosions, and collapses was 90%, 80%, and 85%, respectively.
After being trained with earthquake event waveform records, the AlexNet convolutional neural network model acquires an accurate and fast automatic identification ability. The accuracy of automatic identification is comparable to that of professional seismic workers, and the time required is greatly reduced. This study provides a deep learning approach to the identification of earthquake event types and contributes to earthquake prevention and disaster reduction.


Introduction
Seismologists classify earthquake events into two main categories according to their seismogenic mechanism: natural earthquake events and unnatural earthquake events. A natural earthquake event is a natural phenomenon in which the crust ruptures rapidly under stress, releases enormous energy, and causes violent vibration at the earth's surface. Common natural earthquake events are tectonic earthquakes, volcanic earthquakes, and subsidence earthquakes [1,2]. Unnatural earthquake events are caused by human activities, so they are also called artificial earthquakes. Common unnatural earthquake events include engineering blasting, ground collapse caused by mining, geothermal exploration, reservoir water storage, and nuclear testing. Both natural and unnatural earthquake events can be highly destructive, and they pose a great threat to social and economic construction and to the safety of people's lives and property. In particular, unnatural earthquake events can also cause adverse social impacts such as social panic and military conflict [3,4]. In terms of vibration mechanism, both natural and unnatural earthquake events are elastic waves, which vibrate regularly in the propagation medium and carry and propagate energy. Therefore, the characteristics of their event waveform records are very similar. Professional seismologists need to spend a great deal of effort and time to distinguish them, and the results of type identification often have a large error rate. The traditional manual method can no longer meet the needs of earthquake monitoring, prediction, and early warning, and it also hampers the prevention and evaluation of earthquake disasters and emergency rescue work. Against the background of global economic prosperity and frequent human activities, the number of unnatural earthquake events is increasing.
According to the statistics of the China Earthquake Network Center (CENC), as many as 112 socially influential unnatural earthquake events were monitored from 2012 to 2021. Many unnatural earthquake events in history have caused a large number of casualties and economic losses. Because of their relatively small magnitude and concealed occurrence, the best time for emergency rescue is often missed. Therefore, how to quickly and effectively identify the types of natural and unnatural earthquake events has attracted the continuous attention of seismologists [5][6][7][8].
Early studies on the type identification of natural and unnatural earthquake events mainly focused on characteristic parameters of the event waveform records, such as signal-to-noise ratio, phase difference, and duration. After comparing and analyzing a large number of waveform records of natural and unnatural earthquake events, Zhu et al. [9] pointed out that the seismic phases in natural and unnatural earthquake events arrive in the order P wave, S wave, and surface wave. The period of the P wave is shorter than that of the S wave, and the period of the surface wave is the longest. After studying the waveform records of 293 explosion events in Southwest China, Zhao et al. [10] and Miao et al. [11] found that for an explosion near the seismic monitoring station (within several kilometers), the initial motion of the P wave in the vertical direction is upward and includes a jump-type pulse component, and the P wave has a short period and fast amplitude attenuation; the P wave of an explosion far from the seismic monitoring station (dozens of kilometers away) contains pulses similar to the Rayleigh surface wave in the vertical direction. This pulse has a large amplitude and a long period, which is completely different from natural earthquake events. After using the focal mechanism solution method to study the location of earthquake events, Shiyuan et al. [12] pointed out that most unnatural earthquake events occur near the surface, where the geological structure is relatively loose, so their high-frequency components are easily absorbed by the propagation medium and their event waveform records are relatively smooth; most natural earthquake events occur in hard rocks, where the high-frequency components are not easily absorbed, so their event waveform records fluctuate strongly.
The waveform record characteristics of collapse are very different from those of natural earthquake and explosion. Its vibration frequency is particularly low, its vibration spectrum is particularly simple, and the initial motion of its P wave in the vertical direction is almost always downward [13,14]. Korrat et al. [15] used the DEMY wavelet transform method to process the waveform record signals of natural and unnatural earthquake events and calculated their normalized spectral values and maximum time frequencies. The parameters in this method can be used to identify the types of natural and unnatural earthquake events. In order to identify the types of earthquake events, seismologists have summarized a large number of methods and laws, such as the minimum distance method, continuous Hamming method, Fisher method, step-by-step iterative minimum decision method, and continuous vector machine method [16][17][18]. Although the above methods can effectively identify the types of earthquake events from the perspective of scientific research, they still have certain defects in terms of timeliness and convenience, so they cannot be directly applied to practical work such as earthquake monitoring, prediction, early warning, and emergency rescue.
In the 1950s, the concept of artificial intelligence deep learning appeared in the field of computer science. Its principle is to simulate and learn human consciousness and thinking and to complete complex tasks with the help of the powerful computing capacity of computers. The basic process of artificial intelligence deep learning is to imitate the working mode of the human neural network, use a large number of data samples to train the constructed neural network model, and continually optimize the hyperparameters in the network layers during training. Once the hyperparameters have been optimized, the complete convolutional neural network model is used to automatically analyze and process new data samples. The convolutional neural network model has achieved great success in image recognition, face recognition, medical image recognition, natural language recognition, and many other fields [19][20][21]. Therefore, in the 1990s, some seismologists tried to introduce it into the study of earthquake event waveform records. Dai and MacBeth successfully identified the P wave, S wave, and ground pulsation noise in the waveform records of unidirectional earthquake events by using a neural network model, but because the algorithm was not mature at that time, they could not identify the type of earthquake events. Perol et al. [22] selected 689 waveform records of earthquake events with high signal-to-noise ratio and used a convolutional neural network model to identify their seismic phases. The results show that the identification accuracy reaches 87%. Kriegerowski et al. [23] first used the STA/LTA method to detect natural earthquake events and artificial noise and then used a convolutional neural network model to identify their types.
Their results showed that a convolutional neural network model can identify natural earthquake events and artificial noise with high accuracy after deep learning. Yang et al. [24] used the GoogLeNet convolutional neural network model to study the type identification of event waveform record images of explosion and collapse. The authors pointed out that the success rate in identifying the types of explosion and collapse in a specific area can be stabilized at about 90%. Shaohui Zhou et al. compared the accuracy of the VGG16, VGG19, and GoogLeNet convolutional neural network models in identifying natural and unnatural earthquake event waveform records. The authors pointed out that the identification rate of VGG16 and VGG19 is higher than that of GoogLeNet, and the identification accuracy of the three convolutional neural network models is directly proportional to the number of events in the training set. In addition, some studies showed that natural and unnatural earthquake events have obvious regional characteristics (geological structures in different regions affect the seismic phase characteristics in earthquake event waveform records) [25]; the signal-to-noise ratio of event waveform records is directly proportional to the recognition accuracy [26]; the number of training iterations is proportional to the recognition accuracy, and the computing performance of the GPU is inversely proportional to the recognition time [27].

Wireless Communications and Mobile Computing

AlexNet Convolutional Neural Network Model.
In 2012, Alex Krizhevsky, Ilya Sutskever, and Geoffrey Hinton developed the first-generation algorithm of the AlexNet convolutional neural network model and won that year's ImageNet Large Scale Visual Recognition Challenge (ILSVRC). Since then, the AlexNet convolutional neural network model has had a major impact in the field of artificial intelligence deep learning. The AlexNet convolutional neural network model includes one input layer, five convolutional layers, three fully connected layers, and one output layer. In particular, three of the convolutional layers are followed by a maximum pooling operation. Compared with other convolutional neural network models, AlexNet pioneered the use of the ReLU unsaturated activation function to shorten the training time of convolutional neural network models. It uses two GPUs to hold the convolutional kernels, applies local response normalization to some layers, and performs overlapping pooling, data augmentation, dropout, and other operations to prevent overfitting [28][29][30]. The schematic diagram of the AlexNet convolutional neural network model is shown in Figure 1.
The function of the input layer is to supply images to the whole convolutional neural network model. Each image is composed of pixels, and each pixel has a specific pixel value. Therefore, the input layer transforms the image into a matrix of pixel values and passes it on as input. The input image usually comes in one of two forms: color image or gray image. A color image adopts the RGB color mode and is represented by three pixel-value matrices, one per color channel. A gray image is simpler and is represented by a single gray-value matrix [31].
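To make the pixel-matrix view concrete, the following minimal Python sketch splits a toy RGB image into its three pixel-value matrices and collapses it into one gray-value matrix. The 2 × 2 image, the function names, and the gray conversion weights (0.299, 0.587, 0.114) are assumptions chosen for illustration; the source does not state how, or whether, gray conversion was performed.

```python
# Toy 2 x 2 RGB image: each pixel is an (R, G, B) triple in the range 0..255.
rgb_image = [
    [(255, 0, 0), (0, 255, 0)],
    [(0, 0, 255), (255, 255, 255)],
]


def split_channels(image):
    """Return the three pixel-value matrices (R, G, B) of an RGB image."""
    return [[[pixel[c] for pixel in row] for row in image] for c in range(3)]


def to_gray(image, weights=(0.299, 0.587, 0.114)):
    """Collapse an RGB image into one gray-value matrix.

    The luminance weights are an assumption made for this sketch.
    """
    return [
        [round(sum(w * v for w, v in zip(weights, pixel))) for pixel in row]
        for row in image
    ]


red, green, blue = split_channels(rgb_image)
gray = to_gray(rgb_image)
```

A 224 × 224 color input is the same structure at a larger size: three 224 × 224 matrices.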

Convolutional Layer.
The function of the convolutional layer is to extract features from the data supplied by the input layer. The convolutional layer contains multiple convolutional kernels; each element of a convolutional kernel corresponds to a weight coefficient and a bias, similar to a neuron in a feedforward neural network. Each neuron in the convolutional layer is connected to multiple neurons in a nearby region of the previous layer. The size of this region, known as the receptive field, depends on the size of the convolutional kernel. When working, the convolutional kernel scans the input features at regular intervals, multiplies and sums the matrix elements of the input features within the receptive field, and adds the bias. The important parameters of the convolutional layer include the convolutional kernel size, step size, and padding, which jointly determine the size of the output feature map of the convolutional layer. The size of the convolutional kernel can be specified as any value smaller than the size of the input image. The larger the convolutional kernel, the more complex the input features that can be extracted. The working principle of the convolutional layer is as follows [32]:

$$Z^{l+1}(i,j) = \left[Z^{l} \otimes w^{l}\right](i,j) + b = \sum_{k=1}^{K_l}\sum_{x=1}^{f}\sum_{y=1}^{f}\left[Z_k^{l}(s_0 i + x,\ s_0 j + y)\, w_k^{l}(x,y)\right] + b,$$

$$(i,j) \in \{0, 1, \ldots, L_{l+1}\}, \qquad L_{l+1} = \frac{L_l + 2p - f}{s_0} + 1,$$

where b is the bias, Z(i, j) represents the pixel of the corresponding feature map, Z^{l+1} represents the convolution output of layer l + 1, Z^l represents the convolution input of layer l + 1, w^l denotes the weight parameters of layer l, K_l is the number of channels of the feature map, L_{l+1} is the size of Z^{l+1}, and f, s_0, and p are convolutional layer parameters, corresponding to the convolutional kernel size, convolutional step size, and padding number.
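As an illustration of how kernel size, step size, and padding determine the output size, the following minimal Python sketch applies the usual relation (input + 2 × padding − kernel) / step + 1, rounded down, together with a naive single-channel convolution. The function names, the toy 4 × 4 input, and the padding of 2 in the 224 × 224 case are assumptions made for illustration; this is not the MATLAB code used in the study.

```python
def conv_output_size(input_size, kernel_size, stride, padding):
    """Output width of a convolution: (L + 2p - f) // s0 + 1."""
    return (input_size + 2 * padding - kernel_size) // stride + 1


def conv2d(feature, kernel, bias=0.0, stride=1, padding=0):
    """Naive single-channel convolution of a square input with a square kernel."""
    size = len(feature)
    f = len(kernel)
    # Zero-pad the input on every side.
    padded_size = size + 2 * padding
    padded = [[0.0] * padded_size for _ in range(padded_size)]
    for r in range(size):
        for c in range(size):
            padded[r + padding][c + padding] = feature[r][c]
    out_size = conv_output_size(size, f, stride, padding)
    out = [[bias] * out_size for _ in range(out_size)]
    # Slide the kernel over the input: multiply, sum, and add the bias.
    for i in range(out_size):
        for j in range(out_size):
            for x in range(f):
                for y in range(f):
                    out[i][j] += padded[stride * i + x][stride * j + y] * kernel[x][y]
    return out


feature = [
    [1, 2, 3, 0],
    [4, 5, 6, 0],
    [7, 8, 9, 0],
    [0, 0, 0, 0],
]
kernel = [
    [1, 0],
    [0, 1],
]
result = conv2d(feature, kernel, bias=1.0, stride=2)

# 224-pixel input, 11-pixel kernel, step size 4.
size_with_padding = conv_output_size(224, 11, 4, 2)
size_without_padding = conv_output_size(224, 11, 4, 0)
```

Under these assumptions a 224-pixel input yields a 55-pixel feature map only when a padding of 2 is applied; without padding the same relation gives 54.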
and other factors is also enhanced. The pooling operation can fuse the feature map of the upper layer, because the parameters of adjacent regions are strongly correlated, and it can prevent overfitting. The pooling layer works as shown in the following equation [33]:

$$A_k^{l}(i,j) = \left[\sum_{x=1}^{f}\sum_{y=1}^{f} A_k^{l}(s_0 i + x,\ s_0 j + y)^{p}\right]^{1/p}.$$

In the formula, s_0 and pixel (i, j) have the same meaning as in the convolutional layer; p is a prespecified parameter; when p → ∞, the operation is called maximal pooling.
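The limiting behaviour of the pooling parameter p can be checked numerically. The sketch below is a minimal Python illustration, assuming the common L_p pooling form (sum of the window values raised to p, then the p-th root) and nonnegative inputs; the function names and the 5 × 5 toy feature map are hypothetical. It uses a 3 × 3 window with a step size of 2, the overlapping configuration described for AlexNet.

```python
def pool_output_size(input_size, window, stride):
    """Output width of a pooling layer without padding."""
    return (input_size - window) // stride + 1


def windows(feature, window, stride):
    """Yield the pooling windows row by row as flat lists of values."""
    out_size = pool_output_size(len(feature), window, stride)
    for i in range(out_size):
        yield [
            [feature[stride * i + x][stride * j + y]
             for x in range(window) for y in range(window)]
            for j in range(out_size)
        ]


def lp_pool(feature, window, stride, p):
    """L_p pooling: p-th root of the sum of p-th powers in each window."""
    return [[sum(v ** p for v in vals) ** (1.0 / p) for vals in row]
            for row in windows(feature, window, stride)]


def max_pool(feature, window, stride):
    """Maximum pooling, the limit of L_p pooling as p grows."""
    return [[max(vals) for vals in row]
            for row in windows(feature, window, stride)]


feature = [
    [1, 2, 3, 4, 5],
    [6, 7, 8, 9, 1],
    [2, 3, 4, 5, 6],
    [7, 8, 9, 1, 2],
    [3, 4, 5, 6, 7],
]
pooled_max = max_pool(feature, window=3, stride=2)
pooled_p64 = lp_pool(feature, window=3, stride=2, p=64)
```

With p = 64 the L_p result already agrees with the maximum to within a small fraction of a unit. The same size helper gives 27 for a 55-pixel input and 6 for a 13-pixel input.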

Fully Connected Layer.
In convolutional neural networks, the function of the convolutional layer is to extract features, while the function of the fully connected layer is to classify them. After the pooling layer outputs the feature map, the fully connected layer classifies and outputs each feature according to that feature map. The fully connected layer is generally located at the end of the convolutional neural network. It connects all nodes of the previous layer to all nodes of the next layer and computes the network weights across these connections. In this way, the distributed features in the hidden-layer feature space are mapped to the sample label space. Because the fully connected layer has a large number of parameters, it slows down the computation of the model. Therefore, other layers, such as the global average pooling layer, are often used to replace the fully connected layer when the data volume is large and the model converges slowly, especially when real-time output of model results is required. This alternative has worked well with ResNet, GoogLeNet, and AlexNet. After many rounds of convolution and pooling, the feature data are flattened into vectors and passed through the activation function. A logistic function or a normalized exponential (softmax) function is used to output the classification label encoding. The forward calculation process of the fully connected layer is

$$a^{i,l} = \sigma\left(w^{l} a^{i,l-1} + b^{l}\right),$$

where a^{i,l} represents the feature vector of the ith data sample after passing through the lth fully connected layer, σ is the activation function, and w^l and b^l represent the weight and bias of the lth layer, respectively.
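The forward pass of a fully connected layer followed by a normalized exponential output can be sketched in a few lines of Python. All weights, biases, and input values below are made-up toy numbers, and the three class labels are taken from the event types in this study only for illustration; the sketch is not the trained network.

```python
import math


def dense(weights, bias, inputs):
    """Fully connected layer: one weighted sum plus bias per output neuron."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, bias)]


def relu(vector):
    """Element-wise ReLU activation."""
    return [max(0.0, v) for v in vector]


def softmax(scores):
    """Normalized exponential function, shifted by the maximum for stability."""
    shift = max(scores)
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Toy flattened feature vector and toy parameters (assumed values).
features = [0.5, -1.0, 2.0]
hidden_w = [[1.0, 0.0, 1.0], [0.0, 1.0, 0.0]]
hidden_b = [0.0, 0.5]
output_w = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
output_b = [0.0, 0.0, 0.0]

hidden = relu(dense(hidden_w, hidden_b, features))
probs = softmax(dense(output_w, output_b, hidden))

labels = ["natural earthquake", "explosion", "collapse"]
predicted = labels[probs.index(max(probs))]
```

The softmax output sums to one, so each entry can be read as the score the network assigns to one event type.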

ReLU Unsaturated Activation Function.
Other convolutional neural network models usually employ sigmoid or tanh activation functions, whose outputs level off and change very little when the independent variable is very large or very small. Therefore, such activation functions are also called saturating functions. To solve the saturation problem, AlexNet uses the unsaturated rectified linear activation function ReLU(x) = max(0, x). ReLU trains faster than saturating functions. It makes full use of its piecewise linear structure to achieve nonlinear expressive ability. The vanishing-gradient problem is also less severe, which helps to train deeper and more efficient convolutional neural network models. In addition, the AlexNet convolutional neural network model performs local response normalization on some layers, which can effectively reduce the error rate of recognition results [34,35].
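The difference between saturating and unsaturated activation functions shows up in their gradients. The minimal Python sketch below (function names are illustrative) evaluates the derivatives of sigmoid, tanh, and ReLU at several inputs; for large positive inputs the first two are nearly zero, while the ReLU gradient stays at 1.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_grad(x):
    """Derivative of the sigmoid; approaches 0 for large |x|."""
    s = sigmoid(x)
    return s * (1.0 - s)


def tanh_grad(x):
    """Derivative of tanh; approaches 0 for large |x|."""
    return 1.0 - math.tanh(x) ** 2


def relu(x):
    """ReLU(x) = max(0, x)."""
    return max(0.0, x)


def relu_grad(x):
    """Derivative of ReLU: 1 for positive inputs, 0 otherwise."""
    return 1.0 if x > 0 else 0.0


for x in (-10.0, -1.0, 1.0, 10.0):
    print(f"x={x:6.1f}  sigmoid'={sigmoid_grad(x):.2e}  "
          f"tanh'={tanh_grad(x):.2e}  relu'={relu_grad(x):.1f}")
```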

Description of each Layer
(1) The first layer is the input layer, which is a 3-channel image with a size of 224 × 224.
(2) The second layer is a convolutional layer, which uses 96 convolutional kernels with a size of 11 × 11 × 3. These convolutional kernels are divided into two groups (each group contains 48 convolutional kernels), which carry out the convolutional operation on the input layer with a step length of 4 pixels and obtain two groups of convolution results of 55 × 55 × 48. The ReLU activation function is applied to the convolution results to obtain the activation results. Then, the two groups of activation results of 55 × 55 × 48 are pooled using overlapping maximum pooling with a window of 3 × 3 and a step size of 2 pixels, and two groups of pooling results of 27 × 27 × 48 are obtained. Finally, two groups of 27 × 27 × 48 normalized results are obtained by applying the local response normalization operation to the pooling results.
(5) The fifth layer is a convolutional layer, which uses 384 convolutional kernels with a size of 3 × 3 × 192. These convolutional kernels are divided into two groups (each group contains 192 convolutional kernels), the convolution operation is performed on the activation results of layer 4 with a step length of 1 pixel, and two groups of convolution results of 13 × 13 × 192 are obtained. Then, the ReLU activation function is applied to the convolution results to obtain the activation results.
(6) The sixth layer is a convolutional layer, which uses 256 convolutional kernels with a size of 3 × 3 × 192. These convolutional kernels are divided into two groups (each group contains 128 convolutional kernels), the convolution operation is performed on the activation results of layer 5 with a step length of 1 pixel, and two groups of 13 × 13 × 128 convolution results are obtained. Then, the ReLU activation function is applied to the convolution results to obtain the activation results. Then, the two groups of 13 × 13 × 128 activation results are pooled with a window of 3 × 3 and a step size of 2 pixels to obtain two groups of 6 × 6 × 128 pooling results.
(7) The seventh layer is a fully connected layer, which uses 4096 neurons divided into two groups (each group has 2048 neurons). The pooling results of the sixth layer are fully connected, and then the ReLU activation function is used to obtain the activation results. The dropout result is then obtained by applying a dropout operation with a probability of 0.5 to the activation results.
(8) The eighth layer is a fully connected layer, which uses 4096 neurons divided into two groups (2048 neurons per group). The dropout results of layer 7 are fully connected, and then the ReLU activation function is used to obtain the activation results. The dropout result is then obtained by applying a dropout operation with a probability of 0.5 to the activation results.
(9) The final layer is a softmax output layer with 1000 channels, which is used to produce a label distribution covering 1000 classes.

[36,37]. Ningxia Hui Autonomous Region and its surrounding areas are mostly loess landforms, karst landforms, and river valley landforms, and this region is also very rich in coal, geothermal, and oil resources, which makes it a high-incidence area of unnatural earthquake events. At present, more than 40 fixed explosion sites have been recorded, covering more than a dozen cities and counties in Ningxia Hui Autonomous Region, Inner Mongolia Autonomous Region, Gansu Province, and Shaanxi Province. Among them, large fixed explosion sites include Sanguankou mining area, Zhongning County mining area, Zhongwei City mining area, Yinchuan mining area, Dafeng mining area, and Shitanjin mining area at the junction of Guyuan City and Jingyuan County, Pingluo County, and Alxa Left Banner. The largest explosion that occurred in the study area was the chamber blasting carried out by the Ning Coal Group in the Pingluo County open pit mining area on December 20, 2007. Its explosive volume was 5499 tons, and the explosion coverage was about 632.9 × 10^4 square meters. There are also countless collapses caused by mining, groundwater extraction, engineering construction, and other triggers, which are mainly distributed in the northwest, east, and southeast of Ningxia Hui Autonomous Region. Abundant waveform records of natural and unnatural earthquake events provide a solid data foundation for this study. The regional location of Ningxia Hui Autonomous Region is shown in Figure 2.

The waveform records of natural and unnatural earthquake events selected in this study span January 1, 2012, to December 31, 2021. The number of waveform records of natural earthquake events is 728, and the number of waveform records of unnatural earthquake events is 811 (including 426 explosions and 385 collapses). The event waveform records are stored in .SEED format, which is supported by the JOPENS6.0 seismic data processing system of the China Seismic Network Center (CENC). Each event waveform record includes the east-west, north-south, and vertical directions, and the signal-to-noise ratio is relatively high. Each event waveform record includes the P wave, S wave, surface wave, and background noise. The time window length of each event waveform record is 30 seconds. The proportional distribution of natural and unnatural earthquake event types is shown in Figure 3.
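The class proportions behind the distribution of event types follow directly from the counts given above. The short Python sketch below recomputes them; the percentages are derived here from the stated counts and are not quoted from the source.

```python
# Event counts as stated in the text.
counts = {"natural earthquake": 728, "explosion": 426, "collapse": 385}

total = sum(counts.values())
unnatural = counts["explosion"] + counts["collapse"]

# Share of each event type in percent, rounded to one decimal place.
shares = {name: round(100.0 * n / total, 1) for name, n in counts.items()}

for name, share in shares.items():
    print(f"{name}: {counts[name]} records ({share}%)")
print(f"total: {total}, unnatural: {unnatural}")
```

The three classes are of similar size, so overall accuracy is not dominated by a single event type.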

Research Process
4.1.1. Preprocessing. Import each event waveform record in .SEED format into the seismic waveform record data processing software MSDP. Use the software to delete the text, background color, coordinate axes, and other interfering elements in the event waveform record, and keep only the event waveforms in the three directions.
width. An example of the event waveform recording image is shown in Figure 4.
4.1.4. Training. The 3-channel event waveform recording images were fed into the AlexNet convolutional neural network model at an input resolution of 224 × 224. The model applied convolution with a step size of 4 pixels in the first convolutional layer, followed by the ReLU activation function, and the resulting activations were normalized using the local response normalization operation. During this period, the AlexNet convolutional neural network model also performs an automatic dropout operation on the activation results, which helps to mitigate overfitting. After training, the AlexNet convolutional neural network model has learned features from the pixels of every event waveform recording image, and it uses these features to identify different types of event waveform recording images.
4.1.5. Testing. The purpose of this step is to test whether the AlexNet convolutional neural network model has been trained to identify the recorded images of different types of event waveforms. The specific method is to import event waveform recording images of various types that are not in the training set and check whether the AlexNet convolutional neural network model can accurately and quickly identify their types.
percentage of recognition accuracy, and other key parameters), and draw the final conclusion of this study through the statistical results
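The dropout operation mentioned in the training step can be sketched as follows. This is a minimal Python illustration of the "inverted dropout" variant, in which the kept activations are rescaled during training so that no rescaling is needed at test time; it is a commonly used formulation swapped in for illustration and not necessarily the one applied by the MATLAB implementation in this study. The function name, the seed, and the toy activations are assumptions.

```python
import random


def dropout(activations, drop_prob=0.5, training=True, rng=random):
    """Inverted dropout: zero each value with probability drop_prob and
    rescale the kept values by 1 / (1 - drop_prob)."""
    if not training or drop_prob == 0.0:
        # At test time the activations pass through unchanged.
        return list(activations)
    keep_prob = 1.0 - drop_prob
    return [a / keep_prob if rng.random() < keep_prob else 0.0
            for a in activations]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
activations = [1.0] * 8
dropped = dropout(activations, drop_prob=0.5, rng=rng)
unchanged = dropout(activations, training=False)
```

Because a different random subset of neurons is silenced on every pass, the network cannot rely on any single neuron, which is why dropout mitigates overfitting.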

Research Results.
In this study, the AlexNet convolutional neural network model is used to identify the types of natural and unnatural earthquake event waveform records in Ningxia Hui Autonomous Region and its surrounding areas in China. There are three types of event waveform records: natural earthquake, explosion, and collapse. After the basic algorithm of the AlexNet convolutional neural network model is built and optimized with MATLAB software, the natural and unnatural earthquake event waveform records are imported as images of 224 × 224 pixels, and the model performs the convolutional operation, ReLU activation function operation, overlapping maximum pooling operation, local response normalization operation, and so on. Finally, AlexNet has the ability to automatically identify the type of event waveform records.
The results show that the identification accuracy of the AlexNet convolutional neural network model for natural and unnatural earthquake event waveform records is as high as 99.97% in the training process and 98.51% in the testing process. No overfitting is observed in the AlexNet convolutional neural network model during training and testing. The average value of its loss function is 0.001 during training and 0.059 during testing. During training and testing, the type identification accuracy gradually increases with the number of training iterations and exceeds 90%, after which the accuracy curve remains essentially unchanged and stabilizes near a fixed value. The loss function curve fluctuates at the beginning; as the number of training iterations increases, it decreases rapidly, gradually stabilizes near a relatively small value, and shows no further change. The accuracy curve and loss function curve of the AlexNet convolutional neural network model during training and testing are shown in Figure 5.
In the accuracy verification stage of natural and unnatural earthquake event waveform records, 18 of the 20 natural earthquakes were correctly identified, 1 was identified incorrectly, and 1 was not identified, giving an identification accuracy rate of 90%. Of the 20 explosions, 16 were correctly identified, 2 were identified incorrectly, and 2 were not identified, giving an identification accuracy rate of 80%. Of the 20 collapses, 17 were correctly identified, 1 was identified incorrectly, and 2 were not identified, giving an identification accuracy rate of 85%.
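The per-class accuracy rates above are simply the number of correctly identified records divided by the number of records of that type. The Python sketch below reproduces them from the stated counts; the overall figure across all 60 records is derived here and is not reported in the source.

```python
# Correctly identified records out of 20 per event type, as stated in the text.
verification = {
    "natural earthquake": {"correct": 18, "total": 20},
    "explosion": {"correct": 16, "total": 20},
    "collapse": {"correct": 17, "total": 20},
}

# Accuracy per event type, in percent.
per_class = {name: 100.0 * v["correct"] / v["total"]
             for name, v in verification.items()}

# Accuracy over all verification records (derived, not from the source).
all_correct = sum(v["correct"] for v in verification.values())
all_total = sum(v["total"] for v in verification.values())
overall = 100.0 * all_correct / all_total

for name, accuracy in per_class.items():
    print(f"{name}: {accuracy:.0f}%")
print(f"overall: {overall:.0f}% ({all_correct}/{all_total})")
```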

Conclusions and Discussion
(1) The AlexNet convolutional neural network is a representative deep learning algorithm with a high reputation in the field of image identification. In earlier work on the type identification of waveform records of natural and unnatural earthquake events, seismologists mainly focused on distinguishing seismic phase characteristics. In this study, the image of each event waveform record is simply regarded as a pixel matrix, and the pixel features in the image are scanned and learned by the AlexNet convolutional neural network model to achieve type identification, which can provide a new idea for such seismological work in the future.
(2) The AlexNet convolutional neural network model can clearly increase the accuracy of waveform record type identification of natural and unnatural earthquake events by using the ReLU activation function, overlapping maximum pooling, local response normalization, and other operations. These innovative operations distinguish it from other types of convolutional neural network models.
(3) In the type identification of natural and unnatural earthquake event waveform records, a low signal-to-noise ratio will affect the final identification accuracy. Similarly, large background noise interference, earthquake event waveform drift, missing earthquake event waveforms, and too small a magnitude (lower than about M0.8) will affect the final identification accuracy.
(4) The AlexNet convolutional neural network model has a very high accuracy in the type identification of natural earthquake event waveform records, which can reach almost 100%, but its accuracy in the type identification of explosion and collapse is relatively low, about 85%. The reason is that the seismogenic mechanisms of explosion and collapse are complex and diverse, and human interference during the seismogenic process is considerable, which sometimes leads to irregular event waveform records and thus reduces the accuracy of type identification.
(5) In this study, the training and testing process of the AlexNet convolutional neural network model is completely automated and needs no human intervention, which conforms to the automation concept of deep learning. However, the selection and format conversion of event waveform records in the early stage still rely on manual operation, which is relatively time-consuming and laborious, so this is a shortcoming of this study.
(6) After training and testing, the AlexNet convolutional neural network model in this study can replace manual operation to accurately and quickly identify the waveform records of natural and unnatural earthquake events in and around Ningxia Hui Autonomous Region, China. However, the data samples in this study are relatively few and the region is relatively concentrated, which imposes certain limitations. The reason is that the seismic phase parameters recorded in natural and unnatural earthquake event waveforms differ greatly between regions. When the model cannot be fully trained and tested on data from a region, its results will be biased.

Data Availability
The waveform record data of natural and unnatural seismic events used to support the findings of this study were supplied by the Earthquake Agency of Ningxia Hui Autonomous Region under license and so cannot be made freely available. Requests for access to these data should be made to Ren Jiaqi, renjiaqiqjr@163.com.

Conflicts of Interest
The authors declare that they have no conflicts of interest.