A New Method on Software Reliability Prediction



Introduction
Reliability is an important software quality characteristic related to the probability that a system works without failure over a period of time in a given environment. Estimating or predicting the reliability level is a very important task, because this level can be used to plan test, deployment, and maintenance activities. Modeling and predicting software reliability are therefore crucial to this task.
Different types of software reliability prediction models consider different elements of the software project, such as the specification and coding of the programs, and are usually based on characteristics of the testing activity. Some of those models consider the time between failures [1][2][3]. Others consider the coverage of a test criterion [4][5][6][7]. A criterion can be viewed as a predicate to be satisfied by the test cases and can be used to evaluate test sets [8]. The advantage of models based on coverage is that they are independent of the operational profile. However, models based on time are the most commonly used. Owing to their general nonlinear function mapping capabilities, artificial neural networks (ANNs) have received increasing attention in time series forecasting [9][10][11]. These works show that nonparametric ANN models give better results than traditional ones. However, most of those works explore only models based on time. In addition, the design of an ANN relies heavily on experience, its theory is neither rigorous nor easily interpreted, and its training easily converges to a local minimum. The application of artificial neural networks to software reliability prediction is therefore very limited.
Recently, a novel machine learning technique called the support vector machine (SVM) has drawn much attention in the fields of pattern classification and regression forecasting. SVM was first introduced by Vapnik and his colleagues in 1995 [5]. SVM is a learning method for classifiers based on statistical learning theory. The algorithm derives from the linear classifier and solves the two-class classification problem; it was later extended to nonlinear problems. That is to say, it finds the optimal (large-margin) hyperplane to classify the sample set. It is an approximate implementation of the structural risk minimization (SRM) principle in statistical learning theory, rather than of the empirical risk minimization (ERM) method [5].
Compared with traditional neural networks, SVM uses structural risk minimization to avoid problems such as overfitting, the curse of dimensionality, and local minima. SVM has been successfully used for machine learning with large and high-dimensional data sets. This is because the generalization property of an SVM does not depend on the complete training data but only on a subset thereof, the so-called support vectors. These attractive properties make SVM a promising technique, and it has now been applied in many fields [12][13][14]. However, the essence of SVM training is solving a convex quadratic programming problem with a linear equality constraint. The classic methods for solving nonlinear programming problems, such as the Newton method and the quasi-Newton method, are computationally expensive, so the prediction performance is less than ideal.
To overcome these limitations of SVM, researchers have applied particle swarm optimization (PSO) to the training of the SVM [15,16]. PSO is an intuitive and easy-to-implement algorithm from the swarm intelligence community. To replace numeric solvers, a PSO algorithm based on chaos searching (CPSO), which improves the convergence speed and the ability to find the global optimum, has been proposed and shown to be feasible for solving the SVM quadratic programming problem. However, that research suits large sample sets and is ineffective for the small sample sets available in early software reliability prediction.
In this paper, based on an analysis of the classic PSO-SVM model and the characteristics of software reliability prediction, we propose concrete improvement measures and establish an improved PSO-SVM model. This paper is organized as follows. Section 2 summarizes the classic PSO-SVM model. Section 3 analyzes the characteristics of software reliability prediction and the applicability of PSO-SVM and then proposes specific improvement strategies. Section 4 describes the results of two comparative simulation experiments. Finally, Section 5 concludes the paper.

Traditional PSO-SVM Characteristics Analysis
The traditional PSO-SVM model uses the PSO algorithm to optimize the model parameters and the kernel parameter of the SVM and improves the prediction accuracy by searching for the optimal parameter values. SVM was first proposed as a maximum-margin algorithm for two-class classification and was then gradually extended to nonlinear regression. SVM nonlinear regression prediction is similar to classification: the output is calculated according to the given decision function. The regression problem retains the main feature of the maximum-margin algorithm, which is to minimize a convex function, and the nonlinear function is obtained by learning a linear machine in the kernel feature space; the difference lies mainly in the given data set.

Suppose that a given data set is $\{(x_i, y_i)\}$. When SVM solves nonlinear regression problems, the nonlinear mapping $\varphi(\cdot)$ maps the input space into a high-dimensional feature space, and linear regression is then carried out in that space. To reduce the sensitivity to prediction error, the objective function of the nonlinear regression model is defined with the $\varepsilon$-insensitive loss function, and the slack variables $\xi_i$ and $\xi_i^*$ are introduced so that fitting errors smaller than $\varepsilon$ are ignored, which ensures that the model has a global minimum and reliable generalization.

In the original problem, $(1/2)w^T w$ is the regularization part, whose role is to make the function smoother and so enhance the generalization ability; $\xi_i$ and $\xi_i^*$ reflect the error margin of the training points; $\sum_i (\xi_i + \xi_i^*)$ reflects the empirical risk of the model; and the error penalty factor $C$ is the model parameter that determines the balance between the empirical risk and the regularization part.

Since the dimension of the feature space is high and the objective function is nondifferentiable, the kernel function technique and Wolfe duality theory are introduced to facilitate the calculation, and the original problem is transformed into a dual quadratic programming problem. Specifically, the Lagrangian function is constructed first, and then the partial derivative with respect to each variable is calculated; the Lagrangian must be minimized with respect to $w$, $b$, $\xi$, and $\xi^*$, and substituting the results into the original problem gives the dual problem.

The dual problem is a quadratic programming problem in which $K(x_i, x_j) = \langle \varphi(x_i) \cdot \varphi(x_j) \rangle$ is called the kernel function; common choices include the linear kernel, the polynomial kernel, the RBF kernel, and the sigmoid kernel. The RBF kernel function can fully reflect the nonlinear characteristics of software reliability, so it is used in the construction of the prediction model.
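For reference, the textbook $\varepsilon$-insensitive support vector regression problem, which is consistent with the terms described in this section, can be written as follows. This is a sketch of the standard formulation and not necessarily the exact notation of the original equations; the sample count $n$ and the multiplier names $\alpha_i$, $\alpha_i^*$ are assumptions introduced here.

$$
\begin{aligned}
\min_{w,\,b,\,\xi,\,\xi^*}\quad & \frac{1}{2} w^T w + C \sum_{i=1}^{n} (\xi_i + \xi_i^*) \\
\text{s.t.}\quad & y_i - w^T \varphi(x_i) - b \le \varepsilon + \xi_i, \\
& w^T \varphi(x_i) + b - y_i \le \varepsilon + \xi_i^*, \\
& \xi_i \ge 0,\quad \xi_i^* \ge 0,\quad i = 1, \ldots, n.
\end{aligned}
$$

Under the same assumptions, the corresponding dual problem and decision function are

$$
\begin{aligned}
\max_{\alpha,\,\alpha^*}\quad & -\frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} (\alpha_i - \alpha_i^*)(\alpha_j - \alpha_j^*) K(x_i, x_j) - \varepsilon \sum_{i=1}^{n} (\alpha_i + \alpha_i^*) + \sum_{i=1}^{n} y_i (\alpha_i - \alpha_i^*) \\
\text{s.t.}\quad & \sum_{i=1}^{n} (\alpha_i - \alpha_i^*) = 0,\quad 0 \le \alpha_i \le C,\quad 0 \le \alpha_i^* \le C, \\
f(x) ={}& \sum_{i=1}^{n} (\alpha_i - \alpha_i^*) K(x_i, x) + b.
\end{aligned}
$$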
Consider the RBF kernel function with kernel parameter $\sigma^2$. When PSO is used to optimize $C$, $\varepsilon$, and $\sigma^2$ in the SVM model, the population is constantly updated according to the best local positions and the best global position during the iterative process. Assume that the population size is $N$, that the position of particle $i$ in the $d$-dimensional space is $X_i = \{x_{i1}, x_{i2}, \ldots, x_{id}\}$, that its speed is $V_i = \{v_{i1}, v_{i2}, \ldots, v_{id}\}$, that its optimal local position is $pbest_i = \{p_{i1}, p_{i2}, \ldots, p_{id}\}$, and that the best global position is $gbest = \{g_1, g_2, \ldots, g_d\}$. The updates are as follows.

Speed:

$$v_{ij}^{t+1} = \omega v_{ij}^{t} + c_1 r_1 (p_{ij} - x_{ij}^{t}) + c_2 r_2 (g_j - x_{ij}^{t}).$$

Location:

$$x_{ij}^{t+1} = x_{ij}^{t} + v_{ij}^{t+1},$$

where $t$ is the current iteration number, $\omega$ is the inertia factor, and $c_1$ and $c_2$ are the acceleration factors: $c_1$ weights the particle's dependence on its own memory, and $c_2$ weights the influence of the other particles on the particle, so that each particle is drawn toward $pbest_i$ and $gbest$. $r_1$ and $r_2$ are random numbers uniformly distributed in (0, 1), which simulate slight disturbances.
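The speed and location updates described above can be sketched in Python as follows. This is a minimal sketch of a standard PSO step with an inertia weight, not the authors' implementation; the function name `pso_step`, the array layout, and the default values of `w`, `c1`, and `c2` are assumptions made for illustration.

```python
import numpy as np

def pso_step(x, v, pbest, gbest, w=0.7, c1=2.0, c2=2.0, rng=None):
    """One speed and location update for the whole swarm.

    x, v, pbest: arrays of shape (n_particles, d); gbest: array of shape (d,).
    w, c1, c2 are illustrative defaults, not values taken from the paper.
    """
    rng = np.random.default_rng() if rng is None else rng
    r1 = rng.random(x.shape)  # uniform random numbers in [0, 1)
    r2 = rng.random(x.shape)
    # Inertia term + pull toward each particle's own best + pull toward the global best.
    v_new = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)
    x_new = x + v_new
    return x_new, v_new
```

In a PSO-SVM setting, each row of `x` would hold one candidate parameter vector, for example one value each of the penalty factor and the kernel parameter, and the fitness of a row would be the validation error of the SVM trained with those values.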

Model Applicability and Improved Measures Analysis
The traditional PSO-SVM model has many outstanding advantages that suit the characteristics of software reliability prediction; they are shown in Table 1.
Although the traditional PSO-SVM prediction model has many advantages, the PSO and SVM algorithms also have inherent weaknesses, so this paper proposes corresponding improvement strategies to obtain the optimal software reliability prediction model. The model shortcomings and the corresponding improvement measures are shown in Table 2.

Improved Model
4.1. Block Population Initialized Measure. Particle swarm optimization (PSO) is a global optimal search algorithm, so it should find the optimal value quickly. However, the traditional particle swarm is generated randomly within the whole region, which cannot guarantee that the particles are dispersed throughout the search space. Dividing the search space into many blocks improves this nonuniform distribution. The main idea is to distribute the particles almost evenly: assuming that the number of particles is $N$, the entire search space is divided into $N$ small areas according to the range of values in each dimension, and the initial position of each particle is placed in its own area at an offset determined by a random number in [0, 1].
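The block initialization idea can be sketched as follows. This is one possible reading of the measure, not the authors' formula: each dimension is cut into as many equal blocks as there are particles, and every particle receives a different block in every dimension (a Latin-hypercube-style assignment, which is an assumption). The function name `block_init` and its parameters are hypothetical.

```python
import numpy as np

def block_init(n_particles, lower, upper, rng=None):
    """Spread initial positions so that every block of every dimension is used once.

    lower, upper: per-dimension bounds of the search space.
    """
    rng = np.random.default_rng() if rng is None else rng
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    dim = lower.size
    width = (upper - lower) / n_particles  # block width in each dimension
    # Assumption: an independent random permutation of block indices per dimension.
    blocks = np.column_stack([rng.permutation(n_particles) for _ in range(dim)])
    r = rng.random((n_particles, dim))     # random offset inside each block
    return lower + (blocks + r) * width
```

Compared with drawing all positions uniformly at random, this guarantees that no region of any single dimension is left without a particle at the start of the search.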

| Model advantages | Prediction characteristics |
| --- | --- |
| The parameter $C$ adjusts the ratio between structural risk and empirical risk, avoiding overlearning problems and improving the generalization ability of the prediction model. | Small software reliability sample set |
| The slack variable is introduced to reduce the error sensitivity of the prediction model. The model transforms the original problem into a dual problem, turning the solving process into a convex quadratic programming problem whose optimal solution is easy to obtain. | Complex prediction process |
| The kernel function is introduced, which maps the multidimensional input space into a high-dimensional space in order to solve multidimensional space problems. | Many software reliability characteristic parameters |
| The prediction process operates in the high-dimensional space, which transforms the original nonlinear prediction problem into a linear problem; the results are then mapped back to the solution of the nonlinear problem. | Nonlinear characteristics of software reliability prediction |
| PSO is used to search for the optimal parameter values to improve the overall prediction accuracy. | It is difficult to obtain the optimal parameters of the software reliability prediction model. |

Table 2: Model shortcomings and improved measures.

| Model shortcomings | Improved measures |
| --- | --- |
| The model parameters and kernel parameter of SVM are initialized randomly, which is not conducive to searching for the optimal parameters and thereby reduces the prediction accuracy and efficiency. | Block population initialized measure |
| The PSO inertia factor is fixed, and the local and global search abilities are limited, which reduces the ability to obtain the optimal solution. | Adaptive inertia factor |
| The PSO algorithm easily falls into a local minimum in the later part of the prediction. | Nonevolution number of mutation strategies |
| When the low-dimensional space is transformed into a high-dimensional space and quadratic programming problems are solved, computational efficiency becomes a new problem if the number of inputs is high. | Transforming SVM into LSSVM |

4.2. Adaptive Inertia Factor Measure. The inertia factor $\omega$ in the PSO algorithm gives the particle velocity update a memory of its history: it weights the previous velocity against the pull toward the local and global optima in order to balance the global search ability and the local search ability. At the beginning of the iteration, a larger inertia weight enhances the global search capability, that is, it enlarges the search area; in the later stage, a smaller inertia weight enhances the local search ability, which is conducive to a finer local search. Prolonging both the early and the late search phases improves the overall performance of the algorithm, so an adaptive weight update method is used in which $\omega$ decreases from $\omega_{\max}$ to $\omega_{\min}$ as a function of a power of $(t/t_{\max})$. Suppose that $\omega_{\max}$ and $\omega_{\min}$ are 0.9 and 0.1. The corresponding inertia weight curve is shown in Figure 1.

In the figure, the curves express the relationship between the power of $(t/t_{\max})$ and $\omega$. Compared with other values, when the power is 6, the particle search time is the longest. The method makes the inertia weight larger in the initial iterations and smaller in the later ones, which extends the global and local search times, strengthens the search ability, and balances the global and local search abilities.
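The adaptive inertia weight can be illustrated with a short sketch. The exact update expression is not reproduced in the text above, so the polynomial form used here, $\omega = \omega_{\max} - (\omega_{\max} - \omega_{\min})(t/t_{\max})^{p}$, is an assumption chosen only because it starts at $\omega_{\max}$, ends at $\omega_{\min}$, and depends on a power of $(t/t_{\max})$; the function name `inertia_weight` is likewise hypothetical.

```python
def inertia_weight(t, t_max, w_max=0.9, w_min=0.1, power=6):
    """Inertia weight at iteration t under an assumed polynomial decay.

    w_max = 0.9, w_min = 0.1, and power = 6 follow the values mentioned in the
    text; the functional form itself is an illustrative assumption.
    """
    return w_max - (w_max - w_min) * (t / t_max) ** power

# Example: the weight stays close to w_max for most of the run,
# then drops toward w_min in the last iterations.
weights = [inertia_weight(t, 100) for t in (0, 25, 50, 75, 100)]
```

With a high power, the early phase with a large weight lasts longer than it would under a linear decay, which is the behavior the section attributes to the power 6.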

4.3. Nonevolution Number of Mutation Measures.
The mutation mechanism comes from the genetic algorithm and is mainly used to overcome convergence to a local minimum in the iterative process. Standard PSO easily converges to a local optimal solution in high-dimensional function optimization problems, and the nonevolution number of the particles can indicate whether the swarm has entered a local optimum. Therefore, the nonevolution number and a mutation operator are introduced into PSO, with the nonevolution number serving as the criterion for when to mutate, in order to overcome the local minimum problem. The specific strategy is as follows.
(1) Calculate the fitness changing rate (abbreviated as FCR hereinafter): the FCR is the rate of change of the fitness of $p_g$ (the historical optimal particle position) between the current iteration and the iteration $k$ steps earlier ($k = 1$).
(2) Count the nonevolution number: at the beginning of the evolution, the nonevolution number is denoted stop time, the fitness changing threshold is denoted slope value, the nonevolution limit is MaxStep, and the mutation probability is $p_m$. In the iterative process, the nonevolution number is updated according to the fitness changing rate. If the nonevolution number exceeds the limit (MaxStep), the search is considered to have stalled, and a mutation operation is performed based on the mutation probability and a random number in [0, 1].
This improvement lets the particles continue to approach the global optimum after converging to a local minimum in training.
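The two steps above can be sketched as follows. The formulas for the FCR, the counting rule, and the mutation operation are not reproduced in the text, so everything here is an assumption: the FCR is taken as a relative change, the counter is reset to zero whenever the change exceeds the threshold, and a mutated particle is re-drawn uniformly inside the search bounds. All function names are hypothetical.

```python
import numpy as np

def fitness_change_rate(f_now, f_prev, eps=1e-12):
    """Assumed FCR: relative change of the best fitness over one iteration (k = 1)."""
    return abs(f_now - f_prev) / (abs(f_prev) + eps)

def update_stop_time(stop_time, fcr, slope_value):
    """Count consecutive iterations whose FCR stays below the threshold (assumed rule)."""
    return stop_time + 1 if fcr < slope_value else 0

def maybe_mutate(x, lower, upper, stop_time, max_step, p_m, rng=None):
    """Re-draw each particle with probability p_m once the swarm has stalled."""
    rng = np.random.default_rng() if rng is None else rng
    x = np.array(x, dtype=float)  # work on a copy
    if stop_time <= max_step:
        return x                  # still evolving: no mutation
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    mask = rng.random(x.shape[0]) < p_m  # particles selected for mutation
    x[mask] = lower + rng.random((int(mask.sum()), x.shape[1])) * (upper - lower)
    return x
```

In a full loop, `update_stop_time` would be called once per iteration with the FCR of the global best, and `maybe_mutate` would be applied to the swarm positions before the next speed update.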
4.4. LSSVM Measure. The least squares support vector machine (LSSVM) modifies the SVM in two main ways. First, a least squares term is introduced as the loss function, so that equality constraints replace the inequality constraints of SVM; second, solving a system of linear equations replaces solving the quadratic programming problem, which avoids the insensitive loss function and greatly improves the learning efficiency and the training accuracy.
The standard SVM problem can thus be simplified. Note that one equality constraint in LSSVM is used instead of the two inequality constraints in SVM, and the corresponding term $C\sum_i (\xi_i + \xi_i^*)$ of the objective function is replaced by $C\sum_i \xi_i^2$. The LSSVM decision function is obtained with the same standard method of transforming to the dual problem as in SVM. Because only linear equations have to be solved, $\alpha$ in the decision function can be obtained directly from them, which greatly reduces the computation and makes the model simpler.
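To make the "linear equations instead of quadratic programming" point concrete, the following is a minimal sketch of LSSVM regression in its common textbook form. It assumes the objective $(1/2)w^T w + (\gamma/2)\sum_i e_i^2$ with the constraint $y_i = w^T\varphi(x_i) + b + e_i$, and an RBF kernel written as $\exp(-\|x - x'\|^2 / (2\sigma^2))$; the regularization symbol $\gamma$, the kernel scaling, and all function names are assumptions and may differ from the paper's notation.

```python
import numpy as np

def rbf_kernel(a, b, sigma2):
    """RBF kernel matrix between the rows of a and the rows of b (assumed scaling)."""
    d2 = np.sum(a**2, axis=1)[:, None] + np.sum(b**2, axis=1)[None, :] - 2.0 * a @ b.T
    return np.exp(-np.maximum(d2, 0.0) / (2.0 * sigma2))

def lssvm_fit(x, y, gamma, sigma2):
    """Solve the LSSVM linear system [[0, 1^T], [1, K + I/gamma]] [b; alpha] = [0; y]."""
    n = x.shape[0]
    a = np.zeros((n + 1, n + 1))
    a[0, 1:] = 1.0
    a[1:, 0] = 1.0
    a[1:, 1:] = rbf_kernel(x, x, sigma2) + np.eye(n) / gamma
    rhs = np.concatenate(([0.0], y))
    sol = np.linalg.solve(a, rhs)
    return sol[0], sol[1:]  # bias b, multipliers alpha

def lssvm_predict(x_train, b, alpha, x_new, sigma2):
    """Decision function f(x) = sum_i alpha_i K(x, x_i) + b."""
    return rbf_kernel(x_new, x_train, sigma2) @ alpha + b
```

Training therefore costs one dense linear solve of size $n + 1$, which is cheap for the small sample sets discussed in this paper; the price is that every training sample receives a nonzero multiplier, so the sparsity of the standard SVM solution is lost.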

The Flow Chart of the Improved PSO-LSSVM Model
The flow chart of the improved PSO-LSSVM prediction model is shown in Figure 2; the dashed part marks the steps that are improved in the PSO-LSSVM model.

Simulation Comparison
To evaluate the performance of the new model and compare it with the traditional model, a simulation experiment is carried out as follows. Taking a military software system as an example, thirteen module indexes and the module defect numbers are shown in Table 3. SN is the module number; LOC is the module size (in lines of code); FO is the module output; FI is the module input; PATH is the module control flow path; FAULTS is the number of module defects.
To evaluate the prediction accuracy of the optimized model, we carry out two experiments. After the training samples and test samples are normalized [17], we input them into the BP network model, the traditional PSO-SVM model, and the optimized PSO-LSSVM model. The BP prediction model uses momentum factors, with 18 hidden nodes and a training objective of 0.00001. In accordance with the cross-validation algorithm and the depth search algorithm, after 2 rounds of selection, the traditional PSO-SVM prediction model parameter is $C = 499$, and the kernel parameter is $\sigma^2 = 5$. Both the traditional and the optimized PSO-SVM models use the RBF as the kernel function. In the model training process, the error curves of the BP prediction model and the optimized PSO-LSSVM model are shown in Figures 3 and 4, respectively.
We can see from the figures that the training error of the improved PSO-LSSVM prediction model decreases rapidly. Because the BP prediction model is greatly influenced by its initial parameters, we reduce the randomness by averaging over 10 consecutive runs. The average percentage prediction error is calculated; the prediction results are shown in Table 4, and the comparison is shown in Table 5.
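The error measure used for the comparison can be sketched as follows. The text does not give the formula, so the mean absolute percentage error used here is an assumption, as are the function names; the sketch also assumes that no true value is zero, which would need separate handling for modules with zero defects.

```python
import numpy as np

def mean_percentage_error(y_true, y_pred):
    """Average of |prediction - truth| / |truth|, expressed in percent."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs(y_pred - y_true) / np.abs(y_true)) * 100.0)

def averaged_over_runs(y_true, runs):
    """Average the percentage error over repeated runs, e.g. 10 BP trainings."""
    return float(np.mean([mean_percentage_error(y_true, r) for r in runs]))
```

Averaging over runs only matters for models with random initialization, such as the BP network; a model whose training is deterministic gives the same error in every run.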

Conclusion
Because the improved PSO-LSSVM prediction model uses optimized model parameters and kernel parameters, its prediction accuracy is much higher than that of the traditional PSO-SVM model and the BP prediction model. As the number of training samples decreases, the prediction accuracy of the improved PSO-LSSVM model remains significantly higher than that of the traditional PSO-SVM model and the BP model, owing to its good generalization performance with few training samples. Thus, the improved PSO-LSSVM prediction model is better than the traditional PSO-SVM and BP prediction models in both training efficiency and prediction accuracy. Because sample sets in software reliability prediction are currently small and costly to obtain, the proposed model has important practical significance, and it may become the preferred prediction method for projects with few samples.

Figure 2: The flow chart of the improved PSO-LSSVM prediction model.

Table 1: Model advantages and prediction characteristics.

Table 4: Prediction results. After training, the three methods yield prediction models fitted to the sample data; we can therefore input the prediction sample data into each model to forecast.

Table 5: Prediction error comparison results.