Model and Algorithm of BP Neural Network Based on Expanded Multichain Quantum Optimization

A model and algorithm of a BP neural network optimized by an expanded multichain quantum optimization algorithm, with super parallelism and ultra-high speed, are proposed on the basis of an analysis of the research status and defects of the BP neural network. The aim is to overcome overfitting, random initial weights, and the oscillation of fitting and generalization ability under subtle changes of the network parameters. The method optimizes the structure of the neural network effectively and overcomes a series of problems of the BP neural network optimized by the basic genetic algorithm, such as slow convergence, premature convergence, and poor computational stability. The performance of the BP neural network controller is further improved. The simulation results show that the model has good stability, high precision of the extracted parameters, and good real-time performance and adaptability in actual parameter extraction.


Introduction
An artificial neural network (ANN) is an information processing and computing system that is based on modern neuroscience research and formed by abstracting, simplifying, and simulating biological structures. The features of ANN are as follows. An ANN can approximate arbitrarily complex nonlinear relations; all quantitative or qualitative information is stored equipotentially in the neurons of the network, so the network is very robust and fault tolerant; an ANN can emulate and adapt to an unknown system and can deal with quantitative and qualitative knowledge at the same time [1]. The main research question of ANN is how to make a computer simulate and realize the self-learning and mathematical thinking ability of humans in order to mine the inner relations from limited samples. This is done mainly by learning and storing the data relations, inference rules, and probability distribution of known samples in order to deduce and reveal the potential information between variables in unknown data samples [2].
The BP neural network is currently the most popular neural network model in applications [3]. The BP algorithm was proposed by Rumelhart and McClelland in 1986; it solved the weight adjustment problem for nonlinear continuous functions in multilayer feedforward neural networks and is a typical error back propagation algorithm [4,5]. Since the emergence of the BP neural network, the selection of the activation function, the design of structure parameters, and the remedy of network defects have been studied extensively. In 1973, Grossberg found that the sigmoid function closely resembles the working behavior of biological neurons, so he began to explore the relationship between the features of the sigmoid function and the stability of the neural network. His work helped the sigmoid function become the most commonly used activation function of the BP neural network. Since then many scholars have worked on the limitations of the sigmoid function. So that the BP algorithm can obtain larger gradient values over the whole domain, many scholars combine activation functions with different characteristics on different intervals, setting a larger derivative value where it is needed to make up for the inadequacy of a single activation function [6]. In 1991, Kung and Hu [7] proposed the FARM approximate simplification method, which uses the Frobenius norm, following the ideas of global optimization and gradual optimization, to delete hidden layer units and determine the number of hidden layer nodes. Fahlman [8] found that, with the sigmoid function as the activation function, the derivative of the error with respect to the weights becomes very small when the output value of the BP network is close to 0 or 1. Salomon [9] created two BP neural networks with exactly the same initial parameters and structure and put forward a new learning method: the learning rate $\eta$ is increased in one network and decreased in the other to update their weights, and the network whose error drops faster is taken as the starting point of the next update. The effect is obvious. Dan Foresee and Hagan [10] proposed the BFGS quasi-Newton method, which avoids calculating the second derivative while keeping the fast convergence of the Newton algorithm. Riedmiller and Braun [11] proposed the resilient BP (RPROP) method in 1993. RPROP introduces a resilient update value to modify the weights directly, which reduces the influence of the network structure parameters on the whole learning process and keeps unforeseeable gradient behavior from blurring the weight update.
At present, ANN-based research on quantitative structure-activity relationships has gradually been applied to various fields. Wang et al. [12] used a neural network to study the quantitative structure-activity relationship of angiotensin converting enzyme inhibitors. Zhao et al. [13] established a model of the antitumor activity of emodin derivatives on the basis of a neural network. Cui et al. [14] applied principal component analysis and a neural network to study the quantitative structure-activity relationship of nitrobenzene and its homologues. González-Díaz et al. [15] established quantitative structure-activity relationship models of synthetic compounds with a neural network and verified the reliability of the models, which can be used to design new drugs. Prado-Prado et al. [16] studied the quantitative structure-activity relationship of drug resistance across different kinds of parasites through a neural network. Ramírez-Galicia et al. [17] established a quantitative structure-activity relationship model of amoeba drug resistance through multiple linear regression, stepwise regression analysis, and a neural network. The result shows that the three-dimensional structure of the model is very important.
Although BP algorithm has become the most widely used artificial neural network, the BP neural network has the following defects.
(1) It falls into local minima easily: the BP learning algorithm, which is based on gradient descent, easily falls into local extremum points and saddle points if the error function is not strictly convex and has multiple points where the gradient is zero. The network then cannot converge to the optimal solution of the problem, so the optimal connection weights and threshold parameters cannot be obtained [18].
(2) The error converges slowly: the convergence speed of the BP neural network is decided by two factors, the learning rate and the size of the derivative of the activation function [19]. The learning rate should be a small positive number, but its size is usually not easy to choose. If the learning rate is too large, training oscillates or even fails to converge. If the learning rate is too small, the product of the learning rate and the negative gradient vector becomes small, which slows the update of the weights and thresholds. The size of the derivative of the activation function also affects the rate of convergence. In a flat area of the error surface, the gradient of the error function is small, so the weights and thresholds are updated slowly. The network needs many iterations to leave the flat area, so convergence becomes slow.
(3) The structure of the network is not easy to determine: determining the structure of a BP neural network usually means determining the number of hidden layers and the number of neurons in each hidden layer. Especially since Kolmogorov et al. proved the approximation theorem for the BP neural network with a single hidden layer, how to select the number of neurons in a single-hidden-layer network has been one of the hottest key problems. Generally, the numbers of neurons in the input layer and the output layer are easy to determine from the identification objects. The number of neurons in the hidden layer is difficult to determine, and it directly affects the topology and performance of the BP neural network. It has been proved in theory that a three-layer BP neural network can approximate nonlinear functions to arbitrary precision, which settles the question of the number of hidden layers. If the hidden layer has too few neurons, the network cannot meet the requirements of learning and approximation performance; if it has too many, adverse effects appear in the network, and both the hardware implementation and the software calculation become complicated. At present, there is no unified and complete theoretical framework for determining the structure of the BP neural network. The structure is usually estimated and adjusted by experience or by trial over many experiments.
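The effect of the learning rate described in item (2) can be illustrated on a toy problem. The sketch below is not taken from the paper: it assumes a one-dimensional quadratic error $E(w) = w^2$, and the function name `descend` and the three rates are illustrative choices only.

```python
def descend(learning_rate, steps=20, w0=1.0):
    """Run plain gradient descent on E(w) = w**2 and return the weight trajectory."""
    w = w0
    path = [w]
    for _ in range(steps):
        grad = 2.0 * w                 # dE/dw for E(w) = w**2
        w = w - learning_rate * grad   # gradient descent update
        path.append(w)
    return path

small = descend(0.01)  # too small: the weight shrinks by a factor 0.98 per step
good = descend(0.4)    # moderate: the weight shrinks by a factor 0.2 per step
large = descend(1.1)   # too large: factor -1.2, the sign flips and |w| grows

print(abs(small[-1]), abs(good[-1]), abs(large[-1]))
```

With the small rate the weight is still far from the minimum after 20 steps, while the large rate oscillates around the minimum with growing amplitude, which mirrors the two failure modes described above.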
A lot of research has been conducted to further improve the efficiency of the BP neural network and overcome its shortcomings. Sun et al. established an improved BP neural network prediction model and quantitatively studied the related parameters [20]. The accuracy of the model was improved to some extent. Xiao et al. proposed a BP neural network with rough sets for short-term load forecasting, which further improved the prediction accuracy of the BP neural network [21].
The genetic algorithm imitates the evolutionary and genetic rules of biology and is a mathematical algorithm that searches for the global optimum. Because of its strong macroscopic search ability and good global optimization performance, many scholars have used the genetic algorithm to optimize the connection weights, structure, learning rules, and so forth of the BP neural network. Xiao et al. [22] combined the genetic algorithm with a neural network into the GA-ANN method to solve optimization problems in complex engineering. The method takes advantage of the nonlinear mapping, network reasoning, and prediction functions of the neural network and also uses the global optimization characteristics of the genetic algorithm. It can be applied to many complex engineering problems whose objective function is difficult to express as an explicit function of the decision variables. Yang et al. [23] applied the genetic algorithm to select parameters, which overcame the restriction to a symmetric weight matrix in the traditional fluid neural network and broadened the application fields of this intelligent exploration method. Ge [24] applied the genetic algorithm to optimize the controller parameters of the neural network structure and applied the controller to a plant with pure lag. The experiment showed that the control system optimized by the genetic algorithm had good static and dynamic performance. Li et al. [25] combined the genetic algorithm with the traditional DBD algorithm into a new algorithm to optimize the BP neural network, enabling large-scale networks to converge fast and escape local minima. This algorithm is less sensitive to the selection of network parameters and performs well in missile comprehensive testing.
To further overcome the shortcomings of the model that optimizes the BP neural network with the genetic algorithm, a model and algorithm of the BP neural network based on expanded multichain quantum optimization are proposed. The structure of the neural network is effectively optimized. The model overcomes a series of problems of the basic genetic algorithm, such as slow convergence, premature convergence, and poor computational stability, and thereby further improves the performance of the BP neural network controller.

BP Neural Network Model and Algorithm
The main idea of the BP algorithm is to divide the learning process into two stages: forward propagation of the signal and back propagation of the error. In the forward propagation stage, the input information passes from the input layer through the hidden layer to the output layer [26], where the output signal is formed. The weights of the network are fixed while the signal is transmitted forward, and the state of the neurons in each layer affects only the state of the neurons in the next layer. If the desired output is not achieved in the output layer, the error signal is propagated back. In the back propagation stage, the error signal that failed to meet the accuracy requirement propagates backward layer by layer, and the error is shared by all units of each layer. The connection weights are adjusted dynamically according to the error signal. The weights between neurons keep being corrected through the cycle of forward propagation and backward adjustment. Learning stops when the error of the output signal meets the precision requirement [27].

Topology Structure of BP Neural Network.
The simplest BP neural network has three layers, as shown in Figure 1: an input layer, a hidden layer, and an output layer. The number of nodes in the input layer equals the dimension $n$ of the input vector $X$. The number of nodes in the output layer equals the number $m$ of output pattern types. The number of nodes in the hidden layer, $u$, depends on the specific application and is usually selected by test.
The connection weight matrix between the input layer nodes and the hidden layer nodes is $W^1$, defined as formula (1). Similarly, the connection weight matrix between the hidden layer nodes and the output nodes is defined as $W^2$. The threshold matrices of the hidden layer nodes and the output layer nodes are $\theta^1$ and $\theta^2$, respectively:

$$W^1 = \begin{bmatrix} w^1_{11} & w^1_{12} & \cdots & w^1_{1u} \\ w^1_{21} & w^1_{22} & \cdots & w^1_{2u} \\ \vdots & \vdots & & \vdots \\ w^1_{n1} & w^1_{n2} & \cdots & w^1_{nu} \end{bmatrix}. \quad (1)$$

Learning Algorithm of BP Network.
The learning algorithm of the BP network is divided into two stages: forward propagation and back propagation. It is a gradient descent algorithm that reduces the error attributable to each connection weight of the neural network. At the beginning of learning, random numbers in the range [−1, +1] are assigned to the connection weight matrices $W^1$, $W^2$ and to the threshold matrices $\theta^1$, $\theta^2$ of the hidden layer and output layer nodes.
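The random initialization described above can be sketched as follows. This is a minimal illustration rather than the authors' implementation; the function name `init_network`, the nested-list layout of the matrices, and the optional `seed` argument are assumptions made for the example.

```python
import random

def init_network(n_in, n_hidden, n_out, seed=None):
    """Draw all weights and thresholds uniformly from [-1, +1]."""
    rng = random.Random(seed)  # seed only makes the example repeatable

    def draw():
        return rng.uniform(-1.0, 1.0)

    w1 = [[draw() for _ in range(n_hidden)] for _ in range(n_in)]   # input -> hidden
    w2 = [[draw() for _ in range(n_out)] for _ in range(n_hidden)]  # hidden -> output
    theta1 = [draw() for _ in range(n_hidden)]                      # hidden thresholds
    theta2 = [draw() for _ in range(n_out)]                         # output thresholds
    return w1, w2, theta1, theta2

w1, w2, theta1, theta2 = init_network(n_in=4, n_hidden=9, n_out=2, seed=0)
```

Because the starting point is random, two runs of plain BP can end in different local minima, which is the sensitivity that the optimization of initial weights discussed later is meant to reduce.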

Forward Operation.
First, enter the learning samples $\{X, T\}$, where $X$ is the input vector of the learning sample and $T$ is the corresponding output vector:

$$X = (x_1, x_2, \ldots, x_n), \qquad T = (t_1, t_2, \ldots, t_m), \quad (2)$$

where $n$ is the number of nodes of the input layer and $m$ is the number of nodes of the output layer. The data propagate forward from the input layer through the hidden layer to the output layer. The output pattern classification value produced there is the learning result. The main steps are as follows.
Step 1 (calculation of the output value of the hidden layer nodes). The input value of hidden layer node $j$ is

$$net^1_j = \sum_{i=1}^{n} w^1_{ij} x_i, \quad j = 1, 2, \ldots, u, \quad (3)$$

where $n$ is the number of nodes of the input layer, $u$ is the number of nodes of the hidden layer, $w^1_{ij}$ is the connection weight, and $x_i$ is a component of the input vector. The output value of node $j$ is

$$y^1_j = f(net^1_j + \theta^1_j), \quad (4)$$

where $\theta^1_j$ is the threshold of node $j$. The activation function $f$ is the sigmoid function given by Rumelhart:

$$f(x) = \frac{1}{1 + e^{-x}}. \quad (5)$$

Step 2 (calculation of the output value of the nodes of the output layer). The input value of output layer node $l$ is

$$net^2_l = \sum_{j=1}^{u} w^2_{jl} y^1_j, \quad l = 1, 2, \ldots, m. \quad (6)$$

The output value of node $l$ is

$$y^2_l = f(net^2_l + \theta^2_l), \quad (7)$$

where $\theta^2_l$ is the threshold of output layer node $l$ and $f$ is the activation function defined by formula (5).

Figure 1: Three-layer BP neural network (input layer, hidden layer, and output layer, joined by the connection weight vectors).
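Steps 1 and 2 of the forward operation can be sketched as follows. The sketch assumes that the threshold is added to the weighted sum before the sigmoid is applied; that sign convention, the function names, and the nested-list layout (`w1[i][j]` from input node `i` to hidden node `j`) are assumptions of this example, not details confirmed by the source.

```python
import math

def sigmoid(x):
    """Logistic sigmoid activation."""
    return 1.0 / (1.0 + math.exp(-x))

def forward(x, w1, theta1, w2, theta2):
    """Return (hidden outputs, network outputs) for one input vector x."""
    # Step 1: weighted sum into each hidden node, then activation.
    hidden = [
        sigmoid(sum(w1[i][j] * x[i] for i in range(len(x))) + theta1[j])
        for j in range(len(theta1))
    ]
    # Step 2: weighted sum into each output node, then activation.
    output = [
        sigmoid(sum(w2[j][l] * hidden[j] for j in range(len(hidden))) + theta2[l])
        for l in range(len(theta2))
    ]
    return hidden, output

# Two inputs, three hidden nodes, one output; all-zero first-layer weights
# make every hidden output equal to sigmoid(0).
hidden, output = forward(
    [1.0, 2.0],
    [[0.0] * 3, [0.0] * 3], [0.0] * 3,
    [[1.0], [1.0], [1.0]], [-1.5],
)
```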

The Back Operation.
Calculate the error between the output value and the expected value of the output layer. The error propagates back from the output layer through the hidden layer to the input layer, and the connection weights are modified. The steps are as follows.
Step 1 (calculate the output error of the nodes of the output layer). The error between the learning value $y^2_l$ of output layer node $l$ and the output value $t_l$ of the learning sample is

$$e_l = t_l - y^2_l. \quad (8)$$

Step 2 (test the learning error). $\varepsilon_0$ is the allowed maximum learning error, set by the user in the interval [0, 1]. If $\max_l |e_l| \le \varepsilon_0$, enter the next learning sample. Otherwise, adjust the weights of the network and reenter the original learning sample. The learning process ends when all learning samples meet this condition.
Step 3 (calculate the learning error of the nodes of the output layer). The learning error of output layer node $l$ is

$$d^2_l = y^2_l (1 - y^2_l)(t_l - y^2_l). \quad (9)$$

Step 4 (calculate the learning error of the nodes of the hidden layer). The learning error of hidden layer node $j$ is

$$d^1_j = y^1_j (1 - y^1_j) \sum_{l=1}^{m} w^2_{jl} d^2_l. \quad (10)$$

Step 5 (revise the connection weight matrix $W^2$). Taking the weight at time $k+1$ as the adjusted new weight,

$$w^2_{jl}(k+1) = w^2_{jl}(k) + \eta d^2_l y^1_j + \alpha [w^2_{jl}(k) - w^2_{jl}(k-1)], \quad (11)$$

where $\eta$ is the learning rate and $\alpha$ is the momentum factor. Both $\eta$ and $\alpha$ lie in [0, 1]. Using $\alpha$ accelerates learning and helps overcome the local minimum problem of the common BP algorithm.
Step 6 (revise the connection weight matrix $W^1$). Consider

$$w^1_{ij}(k+1) = w^1_{ij}(k) + \eta d^1_j x_i + \alpha [w^1_{ij}(k) - w^1_{ij}(k-1)]. \quad (12)$$

Step 7 (revise the threshold $\theta^2$). The threshold of output layer node $l$ is

$$\theta^2_l(k+1) = \theta^2_l(k) + \eta d^2_l + \alpha [\theta^2_l(k) - \theta^2_l(k-1)]. \quad (13)$$

Step 8 (revise the threshold $\theta^1$). The threshold of hidden layer node $j$ is

$$\theta^1_j(k+1) = \theta^1_j(k) + \eta d^1_j + \alpha [\theta^1_j(k) - \theta^1_j(k-1)]. \quad (14)$$

Run the BP Neural Network.
After learning, run the BP neural network to carry out pattern classification for the input vector $X$. Only the forward operation of the BP learning algorithm is used. The process is as follows.
Step 1. Assign the values revised by the BP learning algorithm to the connection weight matrices $W^1$ and $W^2$ and the threshold matrices $\theta^1$ and $\theta^2$.
Step 2. Input the vector $X$ that needs to be recognized.
Step 3. Apply formulas (3), (4), and (5) to calculate the output of the hidden layer.
Step 4. Use formulas (5), (6), and (7) to calculate the output of the output layer, that is, the classification result of the input vector $X$.
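The back operation and the run phase can be sketched together. This is a simplified illustration under stated assumptions: the momentum factor is taken as zero to keep the code short, the threshold enters additively, and the names `train_step`, `first_err`, and `final_err` as well as the starting weights and the single training sample are hypothetical.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def forward(x, w1, th1, w2, th2):
    """Forward operation: hidden outputs h and network outputs y."""
    h = [sigmoid(sum(w1[i][j] * x[i] for i in range(len(x))) + th1[j])
         for j in range(len(th1))]
    y = [sigmoid(sum(w2[j][l] * h[j] for j in range(len(h))) + th2[l])
         for l in range(len(th2))]
    return h, y

def train_step(x, t, w1, th1, w2, th2, eta=0.5):
    """One back-operation pass without momentum; updates the lists in place.

    Returns the largest absolute output error measured before the update.
    """
    h, y = forward(x, w1, th1, w2, th2)
    # Learning errors of the output nodes and of the hidden nodes.
    d2 = [y[l] * (1 - y[l]) * (t[l] - y[l]) for l in range(len(y))]
    d1 = [h[j] * (1 - h[j]) * sum(w2[j][l] * d2[l] for l in range(len(y)))
          for j in range(len(h))]
    # Revise weights and thresholds with learning rate eta.
    for j in range(len(h)):
        for l in range(len(y)):
            w2[j][l] += eta * d2[l] * h[j]
    for i in range(len(x)):
        for j in range(len(h)):
            w1[i][j] += eta * d1[j] * x[i]
    for l in range(len(y)):
        th2[l] += eta * d2[l]
    for j in range(len(h)):
        th1[j] += eta * d1[j]
    return max(abs(t[l] - y[l]) for l in range(len(y)))

# Hypothetical 2-2-1 network and a single learning sample.
w1 = [[0.1, -0.2], [0.3, 0.4]]
th1 = [0.0, 0.0]
w2 = [[0.2], [-0.1]]
th2 = [0.0]
x, t = [1.0, 0.0], [0.9]

first_err = train_step(x, t, w1, th1, w2, th2)
for _ in range(199):
    final_err = train_step(x, t, w1, th1, w2, th2)

# Run phase: only the forward operation is used with the learned parameters.
_, prediction = forward(x, w1, th1, w2, th2)
```

A full implementation would also keep the previous update of every parameter in order to add the momentum term, and would loop over all learning samples until each one satisfies the error test of Step 2.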

The Quantum Optimal Model Based on Multichain Expanded Coding Scheme
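In quantum computing, the state of a quantum bit can be expressed as

$$|\varphi\rangle = \alpha|0\rangle + \beta|1\rangle, \quad (15)$$

where $\alpha$ and $\beta$ satisfy the normalization condition

$$|\alpha|^2 + |\beta|^2 = 1. \quad (16)$$

The complex numbers $\alpha$ and $\beta$ satisfying formulas (15) and (16) are called the probability amplitudes of a quantum bit. A quantum bit can also be expressed by its probability amplitudes as $[\alpha, \beta]^T$. According to the property of probability amplitude, a quantum bit $|\varphi\rangle$ can be expressed as in Figure 2.
Obviously, in Figure 2, $\alpha = \cos\theta$ and $\beta = \sin\theta$; therefore, a quantum bit can be represented as

$$|\varphi\rangle = [\cos\theta, \sin\theta]^T. \quad (17)$$

The coding scheme is

$$p_i = \begin{bmatrix} \cos t_{i1} & \cos t_{i2} & \cdots & \cos t_{in} \\ \sin t_{i1} & \sin t_{i2} & \cdots & \sin t_{in} \end{bmatrix},$$

where $t_{ij} = 2\pi \times \text{rand}$, $\text{rand} \in [0, 1]$; $i = 1, 2, 3, \ldots, m$; $j = 1, 2, \ldots, n$; $m$ is the size of the population; $n$ is the number of quantum bits. Obviously, in Figure 3, $\beta = \sin\theta$ can be decomposed in the same way as formula (17) into

$$[\beta\cos\gamma, \beta\sin\gamma]^T. \quad (18)$$

Combining formulas (17) and (18), a quantum bit can be represented in the following form:

$$|\varphi\rangle = [\cos\theta, \beta\cos\gamma, \beta\sin\gamma]^T. \quad (19)$$

Formula (19) decomposes $\sin\theta$ of formula (17) into two variables, so it also meets the condition of formula (16). Formula (17), however, describes the quantum bit from the view of two-dimensional space, and a single variable $\theta$ is not conducive to describing the dynamic behavior of the quantum bit objectively, comprehensively, and vividly. Replacing the hypotenuse component $\beta$ by $\sin\theta$ according to formulas (17) and (18), formula (19) can be transformed to

$$|\varphi\rangle = [\cos\theta, \sin\theta\cos\gamma, \sin\theta\sin\gamma]^T. \quad (20)$$

This is equivalent to moving the $\sin\theta$ component of formula (17) from a representation in two-dimensional space to a representation in three-dimensional space. The encoding scheme based on three genes in three-dimensional space with two angle variables is formed:

$$p_i = \begin{bmatrix} \cos\theta_{i1} & \cdots & \cos\theta_{in} \\ \sin\theta_{i1}\cos\gamma_{i1} & \cdots & \sin\theta_{in}\cos\gamma_{in} \\ \sin\theta_{i1}\sin\gamma_{i1} & \cdots & \sin\theta_{in}\sin\gamma_{in} \end{bmatrix}.$$

Adding further supporting angles in the same way yields the encoding scheme based on $k+1$ chains in $(k+1)$-dimensional space, where $k$ is the number of angle variables of the multichain code.
It can be concluded from Figure 4 that the $(k+1)$-chain coding scheme is . . .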
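The multichain encoding can be sketched in code. The general pattern used below, in which each added angle splits the last sine component into a cosine part and a sine part, is an assumption extrapolated from the two-chain and three-chain cases in the text; the names `qubit_amplitudes` and `random_chromosome` are illustrative.

```python
import math
import random

def qubit_amplitudes(angles):
    """Map k angle variables to k+1 amplitudes whose squares sum to 1.

    One angle gives [cos a, sin a]; two angles give
    [cos a, sin a * cos b, sin a * sin b]; and so on.
    """
    amplitudes = []
    sin_product = 1.0
    for angle in angles:
        amplitudes.append(sin_product * math.cos(angle))
        sin_product *= math.sin(angle)
    amplitudes.append(sin_product)
    return amplitudes

def random_chromosome(n_qubits, k_angles, rng):
    """Return k+1 gene chains, each holding n_qubits values in [-1, 1]."""
    per_qubit = [
        qubit_amplitudes([2 * math.pi * rng.random() for _ in range(k_angles)])
        for _ in range(n_qubits)
    ]
    # Transpose: chain c collects amplitude c of every quantum bit.
    return [list(chain) for chain in zip(*per_qubit)]

amps = qubit_amplitudes([0.3, 1.1, 2.0])            # three angles, four chains
chromosome = random_chromosome(5, 2, random.Random(1))
```

Each chain is a candidate solution in the unit space, so one chromosome with $k$ angle variables carries $k+1$ candidate solutions at once.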
Set the maximal evolutionary generation to $G$. The size of the $j$th quantum rotation angle in the $t$th generation is $\Delta\theta_j^t = \Delta\theta_0 \times (1.01 - t/G)$, where $\Delta\theta_0$ is the maximum rotation angle. With this schedule the rotation angle decreases as the generation increases, so that the search rotates ever more finely around the optimal solution.
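The rotation angle schedule can be checked with a few lines of code. The function name `rotation_step` and the value chosen for the maximum rotation angle are assumptions for illustration; the maximal generation of 30 matches the setting used later in the simulation test.

```python
import math

def rotation_step(generation, max_generation, max_angle):
    """Rotation angle size for a given generation: max_angle * (1.01 - t/G)."""
    return max_angle * (1.01 - generation / max_generation)

max_angle = 0.05 * math.pi  # hypothetical maximum rotation angle
steps = [rotation_step(g, 30, max_angle) for g in range(1, 31)]
```

The constant 1.01 keeps the step strictly positive in the last generation, where it equals one percent of the maximum rotation angle.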

Mutation of
The specific form of the mutation operator can be determined by the method of undetermined coefficients. The quantum NOT gate increases the diversity of the population, which prevents local convergence and avoids premature convergence during evolution.

Algorithm Description.
The main realization steps of multichain quantum genetic algorithm are as follows.
Step 2 (transformation of solution space). Map the multiple approximate solutions of each chromosome from the unit space $I^n = [-1, 1]^n$ to the solution space $\Omega$ of the optimization problem to get the solution set $X(t)$.
Step 3. Calculate the fitness of the approximate solutions to obtain the contemporary optimal solution $\text{Best}X$ and the contemporary optimal chromosome $\text{Best}C$.
Step 4. Take $\text{Best}X$ as the global optimal solution $X_0$ and $\text{Best}C$ as the global optimal chromosome $C_0$.
Step 5. Enter the iteration cycle: set $t = t + 1$ and obtain the new population $P(t)$ by updating and mutation.
Step 6. Transform the solution space of $P(t)$ to get the solution set $X(t)$ of the optimization problem.
Step 7. Evaluate $X(t)$ to obtain the contemporary optimal solution $\text{Best}X$ and the contemporary optimal chromosome $\text{Best}C$.
Step 8. If $\text{fit}(\text{Best}X) < \text{fit}(X_0)$, set $\text{Best}X = X_0$ and $\text{Best}C = C_0$ to prevent degradation. Otherwise, set $X_0 = \text{Best}X$ and $C_0 = \text{Best}C$. This control ensures convergence to the optimal value.
Step 9. Go back to Step 5.
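The control flow of the steps above can be sketched as follows. This is a simplified illustration, not the authors' algorithm: it uses only two chains (cosine and sine) per chromosome, replaces the quantum rotation gate by a plain rotation of each angle toward the best chromosome, applies a NOT-gate style mutation that swaps the two amplitudes, and keeps the global best outside the population. The linear map from $[-1, 1]$ to $[low, high]$, the parameter values, and all names are assumptions.

```python
import math
import random

def to_solution_space(chain, low, high):
    """Linearly map every gene from [-1, 1] to [low, high]."""
    return [0.5 * (high * (1 + c) + low * (1 - c)) for c in chain]

def chains(angles):
    """Two gene chains of one chromosome: cosines and sines of its angles."""
    return [[math.cos(a) for a in angles], [math.sin(a) for a in angles]]

def optimize(fitness, n_vars, low, high, pop_size=20, max_gen=50,
             max_angle=0.05 * math.pi, p_mut=0.05, seed=0):
    rng = random.Random(seed)
    pop = [[2 * math.pi * rng.random() for _ in range(n_vars)]
           for _ in range(pop_size)]
    best_fit, best_x, best_angles = -math.inf, None, None
    history = []
    for gen in range(max_gen):
        # Transform to the solution space and evaluate every chain.
        gen_fit, gen_x, gen_angles = -math.inf, None, None
        for angles in pop:
            for chain in chains(angles):
                x = to_solution_space(chain, low, high)
                f = fitness(x)
                if f > gen_fit:
                    gen_fit, gen_x, gen_angles = f, x, list(angles)
        # Elitism: the global best is replaced only by a better solution.
        if gen_fit > best_fit:
            best_fit, best_x, best_angles = gen_fit, gen_x, gen_angles
        history.append(best_fit)
        # Update: rotate each angle toward the best chromosome, then mutate.
        step = max_angle * (1.01 - gen / max_gen)
        for angles in pop:
            for j in range(n_vars):
                diff = best_angles[j] - angles[j]
                if diff != 0.0:
                    angles[j] += math.copysign(min(step, abs(diff)), diff)
                if rng.random() < p_mut:
                    angles[j] = math.pi / 2 - angles[j]  # swaps cos and sin
    return best_x, best_fit, history

# Hypothetical test problem: maximize -(x1^2 + x2^2 + x3^2) on [-5, 5]^3.
best_x, best_fit, history = optimize(
    lambda x: -sum(v * v for v in x), n_vars=3, low=-5.0, high=5.0)
```

Because the global best is overwritten only by a strictly better solution, the recorded best fitness never decreases from one generation to the next, which is the purpose of the degradation check in Step 8.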

The Model of BP Neural Network Based on Multichain Code Quantum Optimization Algorithm

Code.
Connect each weight and threshold of the BP network in order to form a long string of real numbers. This string serves as a chromosome. The decoded value of an individual gives the corresponding weights and thresholds.
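The coding step can be sketched as a pair of functions that flatten the network parameters into one string and restore them. The ordering used here (first-layer weights, second-layer weights, hidden thresholds, output thresholds) and the names `encode` and `decode` are assumptions; the source only states that the parameters are connected in order.

```python
def encode(w1, theta1, w2, theta2):
    """Concatenate all weights and thresholds into one list of real numbers."""
    genes = [w for row in w1 for w in row]
    genes += [w for row in w2 for w in row]
    genes += list(theta1) + list(theta2)
    return genes

def decode(genes, n_in, n_hidden, n_out):
    """Split a chromosome back into weight matrices and threshold vectors."""
    pos = 0

    def take(count):
        nonlocal pos
        part = list(genes[pos:pos + count])
        pos += count
        return part

    w1 = [take(n_hidden) for _ in range(n_in)]
    w2 = [take(n_out) for _ in range(n_hidden)]
    theta1 = take(n_hidden)
    theta2 = take(n_out)
    return w1, w2, theta1, theta2

# A 15-31-3 network has 15*31 + 31*3 weights and 31 + 3 thresholds.
genes = [0.0] * (15 * 31 + 31 * 3 + 31 + 3)
w1, w2, theta1, theta2 = decode(genes, 15, 31, 3)
```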

Fitness Function.
The purpose of using the multichain quantum algorithm to optimize the BP network is to simplify the structure of the network and to minimize its error. Therefore, the fitness function is built from the global error $E$ of the network, which is accumulated over the $N$ training samples and the $m$ units of the output layer from the desired output values $t$ and the actual output values $y$. The multichain quantum optimization algorithm is combined with the neural network by using it to optimize the parameters of the neural network. This can be done in the following ways.
(1) Use the algorithm to optimize the initial weights and thresholds of the neural network.
(2) Use the algorithm to optimize the structure parameters of the BP network, mainly the number of hidden layer units.
(3) Use the algorithm to optimize the selection of the learning rate and momentum factor of the BP network.
The first way is adopted in this paper. The specific implementation process is as follows.
The basic idea of using the multichain quantum optimization algorithm to optimize the initial weights and thresholds of the BP network is this: the multichain quantum optimization algorithm provides appropriate initial weights and thresholds, from which the BP algorithm then finds the optimal solution.
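One plausible form of the fitness function can be sketched as follows. The exact formula is not recoverable from the text, so this example assumes a half sum of squared errors for the global error and takes its reciprocal as the fitness, so that a smaller error gives a larger fitness; the small constant `eps` and the function names are also assumptions.

```python
def global_error(desired, actual):
    """Half the sum of squared differences over all samples and output units."""
    return 0.5 * sum(
        (t - y) ** 2
        for t_row, y_row in zip(desired, actual)
        for t, y in zip(t_row, y_row)
    )

def fitness(desired, actual, eps=1e-12):
    """Larger is better; eps guards against division by zero (assumption)."""
    return 1.0 / (global_error(desired, actual) + eps)

desired = [[1.0, 0.0], [0.0, 1.0]]   # hypothetical desired outputs
close = [[0.9, 0.1], [0.2, 0.8]]     # hypothetical network outputs
far = [[0.5, 0.5], [0.5, 0.5]]
print(fitness(desired, close) > fitness(desired, far))
```

In the combined model, each chromosome would be decoded into weights and thresholds, the network would be run on the training samples, and this value would be returned to the optimization algorithm.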

Simulation Test
To test the trained network, three sets of new data are presented in Table 2.
Four methods are compared: BP, GA-BP, DCQA-BP (BP neural network based on the double-chain quantum optimization algorithm), and MCQA-BP (BP neural network based on the multichain quantum optimization algorithm, where the number of chains is 4). In GA-BP and DCQA-BP, the population size is set to 50 and the maximal evolutionary generation is set to 30. The performance of the algorithms averaged over 10 runs is shown in Table 3.
The test for the difference between the means of two normal populations ($t$-test) is applied to analyze the errors of BP, GA-BP, DCQA-BP, and MCQA-BP.
The error data of the four methods can be regarded as samples from normal populations $N(\mu, \sigma^2)$. For the comparison of BP with GA-BP, $t_1 = 6.5161 > t_{0.005}(18) = 2.8784$, so $H_0$ is rejected; that is, Error 2 of BP is larger than that of GA-BP with a probability of more than 99.5%.
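The comparison above can be reproduced in outline with a pooled two-sample $t$ statistic. The sketch assumes equal variances in the two populations and 10 runs per method, which matches the 18 degrees of freedom quoted in the text; the error values below are invented for illustration and are not the data of Table 3.

```python
import math
import statistics

def pooled_t(sample_a, sample_b):
    """Two-sample t statistic with pooled variance and its degrees of freedom."""
    na, nb = len(sample_a), len(sample_b)
    mean_a, mean_b = statistics.mean(sample_a), statistics.mean(sample_b)
    var_a, var_b = statistics.variance(sample_a), statistics.variance(sample_b)
    pooled_sd = math.sqrt(((na - 1) * var_a + (nb - 1) * var_b) / (na + nb - 2))
    t_value = (mean_a - mean_b) / (pooled_sd * math.sqrt(1 / na + 1 / nb))
    return t_value, na + nb - 2

# Hypothetical errors from 10 runs of two methods.
errors_a = [0.052, 0.047, 0.055, 0.049, 0.051, 0.053, 0.050, 0.048, 0.054, 0.052]
errors_b = [0.041, 0.043, 0.039, 0.044, 0.040, 0.042, 0.038, 0.043, 0.041, 0.040]

t_value, dof = pooled_t(errors_a, errors_b)
reject_h0 = t_value > 2.8784   # critical value t_0.005(18) quoted in the text
```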
The simulation results show that MCQA-BP is significantly superior to BP, GA-BP, and DCQA-BP. The evolutionary process of BP, GA-BP, DCQA-BP, and MCQA-BP is shown in Figure 5.
It can be seen that the fitting results of the different models are roughly the same and that the algorithms follow the same trend, which indicates that the method of this paper is sound and effective. The fitting error of the model of this paper is smaller than that of the other models, so the model performs better than the other models.
MCQA-BP, with its super parallelism and ultra-high speed, overcomes a series of problems of BP and GA-BP, such as slow convergence, premature convergence, and poor computational stability, and optimizes the structure of the neural network effectively. For this reason, the computational cost of MCQA-BP is lower than that of BP, GA-BP, and DCQA-BP in test case 1 and test case 2.

Test Case 2.
The training data come from numerical simulations of 1365 kinds of target board damage under penetration by kinetic energy rod collision. Fifteen destruction values predicted by the neural networks are randomly chosen and shown in Table 4. For the listed penetration values, the fitting results of the different models differ, and so do the errors between the predicted values and the actual values. The fitting error of the MCQA-BP model is smaller than that of the other models, which further demonstrates the effectiveness of MCQA-BP.
It can be seen from the table that, on the same sample data, MCQA-BP predicts better than the BP network and GA-BP. For the same number of learning iterations, the error range of the MCQA-BP network is smaller than that of the BP neural network and the GA-BP neural network. This illustrates the high efficiency of the MCQA-BP network model.

Conclusions
A super parallel, ultra-fast BP neural network model and algorithm based on the multichain quantum optimization algorithm are proposed. The algorithm makes full use of the local time domain feature of the BP network and the global search capability of the multichain quantum optimization algorithm to enhance the intelligent search ability of the network. It overcomes the disadvantages of the BP network, improving the effectiveness of the optimization and accelerating search and convergence. The model controller based on the MCQA-BP network evaluates the damage effect of kinetic energy penetration on the target board well and has good antijamming ability. The simulation results show that the MCQA-BP network controller is effective.

Figure 3: Decomposition of probability amplitude of quantum bit.

(1) An initial population of individuals is produced by the multichain quantum optimization algorithm. Each individual in the population is a chromosome string composed of the weights, thresholds, and number of hidden layer units of the BP network.
(2) Enter the neural network module. Apply the BP algorithm to carry out the learning process and apply the multichain quantum optimization algorithm to check the fitness. If the fitness is unqualified, update the chromosome until the fitness meets the requirements. The individuals that meet the requirements are updated by the multichain quantum optimization algorithm and their fitness is calculated. The unqualified individuals are updated until they meet the accuracy requirement.

Table 1 .
The algorithm adopts a network with three layers. The hidden layer uses the tangent sigmoid function and the output layer uses the logarithmic sigmoid function. The formula $n_2 = 2 \times n_1 + 1$ is used to calculate the number of neurons in the hidden layer, where $n_1$ is the number of neurons in the input layer and $n_2$ is the number of neurons in the hidden layer. Since there are 15 input parameters and 3 output parameters, the structure of the BP neural network is 15-31-3. That is, the input layer has 15 nodes, the hidden layer has 31 nodes, and the output layer has 3 nodes. The number of weights is 15 × 31 + 31 × 3 = 558 and the number of thresholds is 31 + 3 = 34. The total number of parameters to be optimized is 592.

Table 1: The training sample data.

Table 2: The test sample data.

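Table 3: The experiment results of the four algorithms.

The means of the samples of the four methods are $\mu_1$, $\mu_2$, $\mu_3$, and $\mu_4$, respectively, and the variances are $\sigma_1^2$, $\sigma_2^2$, $\sigma_3^2$, and $\sigma_4^2$. The following statistic is introduced as the test statistic:

$$t = \frac{\bar{X}_1 - \bar{X}_2}{S_w \sqrt{1/n_1 + 1/n_2}}, \qquad S_w^2 = \frac{(n_1 - 1) S_1^2 + (n_2 - 1) S_2^2}{n_1 + n_2 - 2},$$

where $\bar{X}_1$, $\bar{X}_2$ are the sample means, $S_1^2$, $S_2^2$ are the sample variances, and $n_1$, $n_2$ are the sample sizes of the two methods being compared.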

Table 4: Contrast between 15 kinds of simulation of damage index value and predicted value.