An Improved Nonlinear Grey Bernoulli Model Based on the Whale Optimization Algorithm and Its Application

To improve the prediction performance of the existing nonlinear grey Bernoulli model and extend its applicable range, an improved nonlinear grey Bernoulli model is presented by combining grey modeling techniques with optimization methods. First, the traditional whitening equation of the nonlinear grey Bernoulli model is transformed into a linear form. Second, improved structural parameters of the model are proposed to eliminate the inherent error caused by the jump from the differential equation to the difference equation. As a result, an improved nonlinear grey Bernoulli model is obtained. Finally, the structural parameters of the model are calculated with the whale optimization algorithm. The numerical results of several examples show that the prediction accuracy of the presented model is higher than that of the existing models and that the proposed model is more suitable for these practical cases.


Introduction
Professor Deng [1] originally proposed grey system theory to handle uncertain systems with partially known and partially unknown information. As a crucial branch of grey system theory, the grey prediction model has been widely used to address numerous real-world problems owing to its effectiveness, such as electricity prediction [2][3][4], energy prediction [5,6], and tourism prediction [7]. A common characteristic of these models is that they do not require a large number of observations (at least four are needed). They have attracted considerable interest from researchers because it is often difficult, even impossible, to collect enough data to build traditional models, including linear [8] or nonlinear regression models [9], the autoregressive integrated moving average model [10] and its extended versions [11], support vector machines [12], and artificial neural networks [13].
Generally speaking, the development of a discipline also benefits from practical applications. In the past three decades, various grey models have emerged rapidly, driven by practical applications. For example, Xie and Liu [14] investigated the discrete grey model and analyzed its connection to the traditional grey model. Wu et al. [15] investigated the grey model with fractional order accumulation, which made the grey model more flexible. To account for the effects of related factors on the behavioral system, Tien [16] initially proposed a novel grey model called GM (1, n), in which "n" stands for the n − 1 driving variables. More recently, Wang et al. [17] presented a grey modeling method based on a data-grouping approach to predict quarterly hydropower production in China. Subsequently, they proposed a seasonal grey model based on accumulation operators for forecasting the seasonal electricity consumption of China [18]. Zeng et al. [19] predicted sequences of ternary interval numbers using a novel multivariable grey model. Ma et al. [20] proposed a conformable fractional grey system model; they also investigated a novel fractional time-delayed grey model with the grey wolf optimizer [21]. Related research continues to emerge. Zeng et al. [22] presented a new-structure grey Verhulst model for predicting China's tight gas production. For this model, they deduced the time-response function and an initial value optimization method. The same year, they proposed another new-… In recent years, metaheuristic algorithms have been used in grey models to find optimal parameter solutions. Zhang et al. [34] optimized the background value weighting coefficients of the grey model using the genetic algorithm. In [35], a multiobjective grey wolf optimizer was used to optimize the kernel-based nonlinear extension of the Arps decline model to ensure both prediction stability and accuracy. Wu et al. [36] used the particle swarm optimization algorithm to search for the optimal system parameters of the nonlinear grey Bernoulli model.
This study focuses on improving the nonlinear grey Bernoulli model, which was initially proposed by Chen [37] and is abbreviated as NGBM (1, 1). NGBM (1, 1) has been widely used in many problems with nonlinear characteristics and has been extended to more general versions [38]. However, there is still room to improve its accuracy. The root cause, a loss of information in the conversion of the grey differential equation to the grey difference equation, is identified in [39]. Following the idea of Ma et al. [7], the model parameters of NGBM (1, 1) are optimized to better match these two equations and thereby reduce the prediction error. The main contributions of this paper are as follows: (1) the grey differential equation is transformed into a linear form rather than keeping the form used by the traditional NGBM (1, 1) model; (2) optimized parameters are constructed, and the whale optimization algorithm (WOA) is used to search for the optimal power index; (3) three cases are employed to verify the effectiveness of the resulting model, INGBM (1, 1). The rest of this paper is organized as follows: Section 2 briefly describes the NGBM (1, 1) model and derives its "linear" solution. In Section 3, the NGBM (1, 1) model with improved parameters is derived in detail. Section 4 provides two real-world examples to validate the effectiveness of the proposed model. Section 5 applies INGBM (1, 1) to predict the number of R&D institutions of higher education in China to demonstrate its forecasting ability, and the main conclusions are listed in the final section.

Description of the Nonlinear Grey Bernoulli Model
The nonlinear grey Bernoulli model (NGBM (1, 1)), originally proposed by Chen [37], has wide applications, especially in solving nonlinear problems. However, this model still has some drawbacks that impair its prediction accuracy. This section analyzes the root cause and proposes a method to reduce the modeling bias. First, a brief description of NGBM (1, 1) is given. Then, a "linear" solution to the whitening equation of NGBM (1, 1) is proposed to simplify the parameter optimization.

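2.1. The Traditional Solution to the Nonlinear Grey Bernoulli Model. Assume

$$X^{(0)} = \left(x^{(0)}(1), x^{(0)}(2), \ldots, x^{(0)}(n)\right) \tag{1}$$

to be a nonnegative series; then the first-order accumulative generating operator (1-AGO) series is

$$X^{(1)} = \left(x^{(1)}(1), x^{(1)}(2), \ldots, x^{(1)}(n)\right), \tag{2}$$

where $x^{(1)}(k) = \sum_{i=1}^{k} x^{(0)}(i)$, $k = 1, 2, \ldots, n$. The equation

$$\frac{dx^{(1)}(t)}{dt} + a\,x^{(1)}(t) = b\left(x^{(1)}(t)\right)^{c} \tag{3}$$

is called the whitening equation of the nonlinear grey Bernoulli model, and $c$, regarded as the power index, cannot be equal to one. With the two-point trapezoidal formula, the discrete difference equation can be written as

$$x^{(0)}(k) + a\,z^{(1)}(k) = b\left(z^{(1)}(k)\right)^{c}, \tag{4}$$

where $z^{(1)}(k)$ represents the background value and is obtained as

$$z^{(1)}(k) = 0.5\left(x^{(1)}(k) + x^{(1)}(k-1)\right), \quad k = 2, 3, \ldots, n. \tag{5}$$

The model parameters can be estimated by the least-squares method as

$$(\hat{a}, \hat{b})^{T} = \left(B^{T}B\right)^{-1}B^{T}Y, \quad B = \begin{pmatrix} -z^{(1)}(2) & \left(z^{(1)}(2)\right)^{c} \\ \vdots & \vdots \\ -z^{(1)}(n) & \left(z^{(1)}(n)\right)^{c} \end{pmatrix}, \quad Y = \begin{pmatrix} x^{(0)}(2) \\ \vdots \\ x^{(0)}(n) \end{pmatrix}. \tag{6}$$

Therefore, the solution to equation (3) with $x^{(1)}(1) = x^{(0)}(1)$ is

$$\hat{x}^{(1)}(k) = \left[\left(x^{(0)}(1)^{1-c} - \frac{b}{a}\right)e^{-a(1-c)(k-1)} + \frac{b}{a}\right]^{\frac{1}{1-c}}. \tag{7}$$

Using the first-order inverse accumulative generating operator (1-IAGO), the simulated values of $X^{(0)}$, denoted $\hat{X}^{(0)}$, are

$$\hat{x}^{(0)}(k) = \hat{x}^{(1)}(k) - \hat{x}^{(1)}(k-1), \quad k = 2, 3, \ldots \tag{8}$$

Mathematical Problems in Engineering

2.2. The "Linear" Solution to the Nonlinear Grey Bernoulli Model. This section transforms the whitening equation of the nonlinear grey Bernoulli model (NGBM (1, 1)) into a linear formulation rather than solving the whitening equation directly. That is, it does not share the same pattern as the traditional grey model. The detailed computational process is as follows.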
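Analogously to Section 2.1, both sides of whitening equation (3) are multiplied by $x^{(1)}(t)^{-c}$, and then

$$x^{(1)}(t)^{-c}\,\frac{dx^{(1)}(t)}{dt} + a\,x^{(1)}(t)^{1-c} = b. \tag{9}$$

Set $y^{(1)}(t) = x^{(1)}(t)^{1-c}$; furthermore,

$$\frac{1}{1-c}\,\frac{dy^{(1)}(t)}{dt} + a\,y^{(1)}(t) = b. \tag{10}$$

Thereby, equation (10) can be written as

$$\frac{dy^{(1)}(t)}{dt} + a(1-c)\,y^{(1)}(t) = b(1-c), \tag{11}$$

which is called the linearization of the NGBM (1, 1) model. Moreover, it easily yields the discrete form by using the two-point trapezoidal formula as follows:

$$y^{(1)}(k) - y^{(1)}(k-1) + a(1-c)\,z_y^{(1)}(k) = b(1-c), \tag{12}$$

where $z_y^{(1)}(k) = 0.5\left(y^{(1)}(k) + y^{(1)}(k-1)\right)$. If $(a, b)^{T}$ is the parameter vector, the parameters can be estimated by the least-squares method as

$$(\hat{a}, \hat{b})^{T} = \left(B^{T}B\right)^{-1}B^{T}Y, \quad B = (1-c)\begin{pmatrix} -z_y^{(1)}(2) & 1 \\ \vdots & \vdots \\ -z_y^{(1)}(n) & 1 \end{pmatrix}, \quad Y = \begin{pmatrix} y^{(1)}(2) - y^{(1)}(1) \\ \vdots \\ y^{(1)}(n) - y^{(1)}(n-1) \end{pmatrix}. \tag{13}$$

After estimating the model parameters, the whitening equation, equation (11), is solved. Multiply both sides of equation (11) by the integrating factor $e^{a(1-c)t}$:

$$\frac{d}{dt}\left[e^{a(1-c)t}\,y^{(1)}(t)\right] = b(1-c)\,e^{a(1-c)t}. \tag{14}$$

Integrate both sides of equation (14) over the interval $[1, t]$:

$$\int_{1}^{t}\frac{d}{d\tau}\left[e^{a(1-c)\tau}\,y^{(1)}(\tau)\right]d\tau = \int_{1}^{t} b(1-c)\,e^{a(1-c)\tau}\,d\tau \tag{15}$$

and

$$e^{a(1-c)t}\,y^{(1)}(t) - e^{a(1-c)}\,y^{(1)}(1) = \frac{b}{a}\left(e^{a(1-c)t} - e^{a(1-c)}\right), \tag{16}$$

which is also

$$y^{(1)}(t) = \left(y^{(1)}(1) - \frac{b}{a}\right)e^{-a(1-c)(t-1)} + \frac{b}{a}. \tag{17}$$

According to the 1-AGO and the definition of $y^{(1)}$, the initial value is $y^{(1)}(1) = x^{(0)}(1)^{1-c}$, and the simulated values are restored as $\hat{x}^{(1)}(k) = \hat{y}^{(1)}(k)^{1/(1-c)}$ and $\hat{x}^{(0)}(k) = \hat{x}^{(1)}(k) - \hat{x}^{(1)}(k-1)$.

The solution of the NGBM (1, 1) model, in either its linearized or its nonlinear form, is essentially approximate because the conversion between equations (11) and (12) is based on the two-point trapezoidal formula, which is an approximate method. This "misplaced replacement" of the model parameters has two consequences: (i) the grey difference equation does not match the grey differential equation because the model parameters have different meanings in the two equations; (ii) the prediction model is unsatisfactory in most situations. The performance of the NGBM (1, 1) model can therefore be improved. In other words, the model parameters should be optimized to better match equations (11) and (12) and to increase the forecasting ability of the NGBM (1, 1) model.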
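The linearized fitting procedure of this section can be illustrated with a short, dependency-free Python sketch. It is only an illustration under stated assumptions, not the authors' implementation: the function names `fit_linearized_ngbm` and `predict_ngbm` are hypothetical, the two-parameter least-squares problem is solved with the closed-form normal equations, and the input series is assumed to be positive so that the power transform stays real.

```python
import math


def fit_linearized_ngbm(x0, c):
    """Estimate (a, b) from the linearized difference equation by least squares.

    x0 is the raw nonnegative series and c the power index (c != 1).
    """
    if c == 1:
        raise ValueError("power index c must differ from 1")
    # 1-AGO series x1, then the transformed series y = x1 ** (1 - c).
    x1, total = [], 0.0
    for value in x0:
        total += value
        x1.append(total)
    y = [value ** (1.0 - c) for value in x1]
    # Regress d(k) = y(k) - y(k-1) on h(k) = -z_y(k):
    #   d = u * h + v  with  u = a(1 - c)  and  v = b(1 - c).
    h = [-0.5 * (y[k] + y[k - 1]) for k in range(1, len(y))]
    d = [y[k] - y[k - 1] for k in range(1, len(y))]
    m = len(h)
    sum_h, sum_d = sum(h), sum(d)
    sum_hh = sum(hv * hv for hv in h)
    sum_hd = sum(hv * dv for hv, dv in zip(h, d))
    det = m * sum_hh - sum_h * sum_h
    u = (m * sum_hd - sum_h * sum_d) / det
    v = (sum_hh * sum_d - sum_h * sum_hd) / det
    return u / (1.0 - c), v / (1.0 - c)


def predict_ngbm(x0, a, b, c, steps):
    """Evaluate the time-response function, undo y = x ** (1 - c), apply 1-IAGO."""
    y1 = x0[0] ** (1.0 - c)
    x1_hat = []
    for k in range(1, steps + 1):
        y = (y1 - b / a) * math.exp(-a * (1.0 - c) * (k - 1)) + b / a
        x1_hat.append(y ** (1.0 / (1.0 - c)))
    return [x1_hat[0]] + [x1_hat[k] - x1_hat[k - 1] for k in range(1, steps)]
```

Calling `fit_linearized_ngbm(series, c)` and then `predict_ngbm(series, a, b, c, len(series) + horizon)` would give simulated values followed by `horizon` forecasts. The sketch assumes a nonzero estimate of `a` and a positive transformed response.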

Parameter Optimization of Nonlinear Grey Bernoulli Model
The whitening equation parameters, a and b, and the power index c are important parameters of the nonlinear grey Bernoulli model. In this section, these parameters are calculated.

Whitening Equation Parameter Calculation.
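The optimized counterparts of the parameters a and b are denoted as p and q for simplicity. Substituting the optimized parameters into the time-response function gives

$$y^{(1)}(k) = \left(y^{(1)}(1) - \frac{q}{p}\right)e^{-p(1-c)(k-1)} + \frac{q}{p}. \tag{18}$$

Equation (18) is substituted into the left-hand side of equation (12), $L(k) = y^{(1)}(k) - y^{(1)}(k-1) + 0.5\,a(1-c)\left(y^{(1)}(k) + y^{(1)}(k-1)\right)$:

$$L(k) = \left(y^{(1)}(1) - \frac{q}{p}\right)e^{-p(1-c)(k-1)}\left[\left(1 - e^{p(1-c)}\right) + \frac{a(1-c)}{2}\left(1 + e^{p(1-c)}\right)\right] + a(1-c)\frac{q}{p}. \tag{19}$$

According to equation (12), the left-hand side $L(k)$ should be equal to the right-hand side $R(k) = b(1-c)$; that is, $L(k) - R(k) = 0$. Therefore,

$$\left(y^{(1)}(1) - \frac{q}{p}\right)e^{-p(1-c)(k-1)}\underbrace{\left[\left(1 - e^{p(1-c)}\right) + \frac{a(1-c)}{2}\left(1 + e^{p(1-c)}\right)\right]}_{\text{Part 1}} + \underbrace{a(1-c)\frac{q}{p} - b(1-c)}_{\text{Part 2}} = 0. \tag{20}$$

Because equation (20) must hold for every k, Part 1 and Part 2 must both be equal to zero; hence,

$$p = \frac{1}{1-c}\ln\frac{2 + a(1-c)}{2 - a(1-c)}, \quad q = \frac{b}{a}\,p. \tag{21}$$

In this way, the optimized parameters p and q can be estimated. With these values, the time-response function satisfies the difference equation, so the optimized parameters better match the differential equation and the difference equation and reduce the prediction error. For simplicity, NGBM (1, 1) with the improved parameters is abbreviated as INGBM (1, 1) in this study.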
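As a numerical check of the parameter correction, the following Python sketch computes p and q by setting the two parts of the residual L − R to zero and then evaluates that residual for the linearized difference equation. It is a sketch under assumptions: the closed form used here is the one obtained by solving Part 1 = 0 and Part 2 = 0 for the linearized model, and the names `improved_parameters` and `difference_residual` are hypothetical.

```python
import math


def improved_parameters(a, b, c):
    """Return (p, q) that make the time response satisfy the difference equation."""
    u = a * (1.0 - c)
    if a == 0 or abs(u) >= 2.0:
        raise ValueError("requires a != 0 and |a(1 - c)| < 2")
    p = math.log((2.0 + u) / (2.0 - u)) / (1.0 - c)
    q = b * p / a
    return p, q


def difference_residual(a, b, c, p, q, y1, k):
    """L(k) - R(k) of the linearized difference equation for the response built from p, q."""
    def y(j):
        return (y1 - q / p) * math.exp(-p * (1.0 - c) * (j - 1)) + q / p

    lhs = y(k) - y(k - 1) + a * (1.0 - c) * 0.5 * (y(k) + y(k - 1))
    return lhs - b * (1.0 - c)
```

Evaluating `difference_residual` with `p, q = improved_parameters(a, b, c)` gives a value at rounding-error level, whereas reusing the least-squares estimates directly (`p = a`, `q = b`) leaves a small but nonzero mismatch, which is the "misplaced replacement" error discussed in Section 2.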

Power Index Estimation Based on the Whale Optimization Algorithm. In the above description, the power index c is assumed to be known. However, the appropriate power index varies from one situation to another and must be adjusted flexibly for each dataset. To solve this problem, an intelligent algorithm, the whale optimization algorithm (WOA), is employed to determine the power index automatically.
Based on the hunting behavior of humpback whales, which locate prey and encircle it, Mirjalili and Lewis designed the WOA [40]. In this optimizer, the current best candidate solution (search agent) is assumed to be the target prey or to be near the optimum. Once the best search agent is defined, the other search agents update their positions towards the best search agent:

(i) In this behavioral system, they update their position by

$$\vec{D} = \left|\vec{C}\cdot\vec{X}^{*}(t) - \vec{X}(t)\right|, \tag{22}$$

$$\vec{X}(t+1) = \vec{X}^{*}(t) - \vec{A}\cdot\vec{D}, \tag{23}$$

where t represents the current iteration, $\vec{X}^{*}(t)$ is the current best agent, and $\vec{D} = (D_1, D_2, \ldots, D_d)$, whose component $D_j$, $j = 1, 2, \ldots, d$, denotes the distance between the individual whale and the current best search agent in the j-th spatial dimension. In particular, the coefficient vectors $\vec{A}$ and $\vec{C}$ are defined as

$$\vec{A} = 2a\,\vec{r} - a, \tag{24}$$

$$\vec{C} = 2\,\vec{r}, \tag{25}$$

where $\vec{r}$ is a random vector with components generated from [0, 1] and a is called the convergence factor, which decreases linearly from 2 to 0. That is,

$$a = 2 - \frac{2t}{t_{\max}}. \tag{26}$$

(ii) A spiral equation is also designed between the position of the whale and the prey to mimic the helix-shaped movement of humpback whales:

$$\vec{X}(t+1) = \vec{D}'\,e^{bl}\cos(2\pi l) + \vec{X}^{*}(t), \tag{27}$$

where $\vec{D}' = \left|\vec{X}^{*}(t) - \vec{X}(t)\right|$ is the distance of the i-th whale to the prey, b is a constant fixing the shape of the logarithmic spiral, and l is a random number in [−1, 1].

(iii) In addition, humpback whales also search for prey randomly according to the positions of each other. This behavior is written as the following mathematical expression:

$$\vec{D} = \left|\vec{C}\cdot\vec{X}_{rand} - \vec{X}(t)\right|, \tag{28}$$

$$\vec{X}(t+1) = \vec{X}_{rand} - \vec{A}\cdot\vec{D}, \tag{29}$$

where $\vec{X}_{rand}$ is a random position chosen from the current population.

For clarity, the detailed steps of the WOA-based algorithm to find the optimal c are listed as follows:
Step 1: set the algorithm parameters N, dim, and $t_{\max}$.
Step 2: initialize the positions of the N search agents randomly within the search range of c.
Step 3: calculate the fitness of each search agent, $f(\vec{X}_i)$.
Step 4: record the best search agent $\vec{X}^{*}$ and update a, A, C, and l.
Step 5: generate a random number p in [0, 1]. If p ≥ 0.5, update the position of the current search agent by equation (27). If p < 0.5 and |A| ≥ 1, update the position of the current search agent by equation (29). If p < 0.5 and |A| < 1, update the position of the current search agent by equation (22).
Step 6: return to Step 3 until the optimal value of c is found.
Note that the fitness function, $f(\vec{X}_i)$, is defined as the objective function, here the MAPE, which is given in the next section. Moreover, the flowchart of the INGBM (1, 1) model is shown in Figure 1.
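The search procedure above can be sketched in Python for a single decision variable such as the power index. This is a minimal illustration, not the reference WOA code: the function name `woa_minimize`, its default settings, the clipping of positions to the search interval, and the use of a seeded random generator are all assumptions made for the example.

```python
import math
import random


def woa_minimize(fitness, lower, upper, n_agents=20, t_max=100, b=1.0, seed=0):
    """One-dimensional WOA sketch; returns (best_position, best_fitness)."""
    rng = random.Random(seed)

    def clip(value):
        # Keep agents inside the search interval (an assumption of this sketch).
        return min(max(value, lower), upper)

    # Initialize the population and record the best agent.
    agents = [rng.uniform(lower, upper) for _ in range(n_agents)]
    best = min(agents, key=fitness)
    best_fit = fitness(best)

    for t in range(t_max):
        a = 2.0 - 2.0 * t / t_max  # convergence factor, decreasing from 2 towards 0
        for i, x in enumerate(agents):
            A = 2.0 * a * rng.random() - a
            C = 2.0 * rng.random()
            l = rng.uniform(-1.0, 1.0)
            if rng.random() >= 0.5:
                # Spiral update around the best agent.
                new = abs(best - x) * math.exp(b * l) * math.cos(2.0 * math.pi * l) + best
            elif abs(A) >= 1.0:
                # Random search relative to a randomly chosen agent.
                x_rand = rng.choice(agents)
                new = x_rand - A * abs(C * x_rand - x)
            else:
                # Encircling the best agent.
                new = best - A * abs(C * best - x)
            agents[i] = clip(new)
        # Re-evaluate fitness and keep the best position found so far.
        for x in agents:
            value = fitness(x)
            if value < best_fit:
                best, best_fit = x, value
    return best, best_fit
```

In the setting of this paper, `fitness` would map a candidate power index c to the MAPE of the fitted model; any callable of one float works in the sketch.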

Validation of the Nonlinear Grey Bernoulli Model
This section provides two examples to demonstrate the efficacy of the proposed model in comparison with four competing models: GM (1, 1), DGM (1, 1), NGBM (1, 1), and ONGBM (1, 1). To evaluate the prediction accuracy of these grey models, the mean absolute percentage error (MAPE) and the root mean square error (RMSE) are applied, which are defined as

$$\mathrm{MAPE} = \frac{1}{N}\sum_{k}\left|\frac{\hat{x}^{(0)}(k) - x^{(0)}(k)}{x^{(0)}(k)}\right| \times 100\%, \quad \mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{k}\left(\hat{x}^{(0)}(k) - x^{(0)}(k)\right)^{2}},$$

where the sums run over the N points of the period being evaluated (simulation or prediction). The grades of prediction performance proposed by Lewis [41] in terms of MAPE are listed in Table 1.
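For concreteness, the two error measures can be computed as in the following Python sketch. The function names `mape` and `rmse` are illustrative, and the sketch assumes that both sequences cover the same evaluation period and that no actual value is zero.

```python
import math


def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    ratios = [abs((p - a) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(ratios) / len(ratios)


def rmse(actual, predicted):
    """Root mean square error, in the units of the data."""
    squares = [(p - a) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squares) / len(squares))
```

For example, with actual values 100 and 200 and predictions 110 and 190, the relative errors are 10% and 5%, so the MAPE is 7.5%, and both absolute errors are 10, so the RMSE is 10.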
Case 1. Forecasting education-in-practice-intensive university: the example from paper [42] is used to test the efficacy and applicability of the grey models. The data points from 1 to 7 are used to build the different grey models, and the final data are used to test the prediction accuracies of these models. Accordingly, the parameters of the five models are listed in Table 2, and the parameter values of the proposed model found by WOA are plotted in Figure 2.
Consequently, the simulation and prediction results are shown in Table 3.
Case 2. Forecasting subway passengers: the dataset of the example from paper [43] is divided into two groups: the data from 2005 to 2012 are used to build the five grey models, and the other data are used to test the prediction accuracies of these models.
First, the parameter values of the five grey models are computed and listed in Table 4. Moreover, the track of searching for the optimal nonlinear parameter of the INGBM (1, 1) model using WOA is plotted in Figure 3.
Furthermore, the simulation and prediction results are shown in Table 5.
From Tables 1-5, the following conclusions can be drawn: (1) In case 1, the … According to the criteria for the MAPE value listed in Table 1, it is easy to find that these models can effectively make predictions because of their low MAPE values. The proposed model has a smaller value, which indicates higher accuracy. A favorable predictor should perform well in the simulation period and also achieve satisfactory accuracy in the verification period. Here, the proposed model is again better than the other grey models because of its lower MAPE value in the prediction period. In this case, the fitting errors and prediction errors of all the models are small, which shows that no overfitting has occurred. Moreover, the nonlinear models (NGBM (1, 1), ONGBM (1, 1), and INGBM (1, 1)) perform better than the linear models (GM (1, 1) and DGM (1, 1)), which indicates that the nonlinear grey models capture the nonlinear characteristics of the data well.
Note: ζ represents the weighting parameter of the background value and is generally taken as 0.5. It is, however, recommended to search for its optimal value in ONGBM (1, 1). In addition, β1 and β2 are the parameters of DGM (1, 1) in this case.
In terms of cost-effectiveness, grey models are designed for small-sample modeling, so their time consumption is usually very small. For example, in case 1, the time costs of GM (1, 1), DGM (1, 1), NGBM (1, 1), and INGBM (1, 1) are 0.1638 s, 0.1489 s, 0.1744 s, and 0.1862 s, respectively. All the time costs are less than 1 s and within the allowable range. In summary, the INGBM (1, 1) model can enhance the prediction accuracy of the traditional NGBM (1, 1) model by optimizing the model parameters. In the next section, the proposed model is applied to a practical problem.

Application
Universities play an irreplaceable role in building China into a country that is strong in science and technology. As the core institutions for cultivating talent and achieving technological innovation, they shoulder important responsibilities and missions in the National Innovation System. As expected, the number of R&D institutions of higher education has increased quickly in the past few years. Accurately forecasting this number will provide a reference for the Ministry of Education of the People's Republic of China and the government to make better plans and strategies in advance. However, the effects of related factors on the number of R&D institutions of higher education are quite uncertain, and reliable observations are limited because of China's rapid development, which implies that traditional models (e.g., regression analysis) are not suitable for this case because of the small sample size and uncertain factors. Hence, the proposed model, INGBM (1, 1), is more suitable for this case with few observations. The data, collected from the National Bureau of Statistics of the People's Republic of China and listed in Table 6, are divided into two groups: the data from 2011 to 2016 are used to build the five prediction models, and the others are used to assess the accuracy of these models.
Similar to Cases 1 and 2, all the parameters of these models are computed and listed in Table 7. Moreover, the track of the power index c searched by WOA is shown in Figure 4.
As a consequence, the simulated and predicted results are shown in Table 8.
In this case, ignoring the first item of the predicted results, the RMSE values (see Figure 5) of the five grey models are 0.14, 0.14, 0.04, 0.04, and 0.03 for simulation and 0.87, 0.85, 0.39, and 0.40 for prediction, respectively. Moreover, the MAPE values (see Figure 6) of these models are 1.17%, 1.18%, 0.28%, 0.27%, and 0.25% for simulation and 5.54%, 5.46%, 2.32%, 2.39%, and 1.72% for prediction, respectively. Therefore, in the simulation period, the proposed model outperforms the other grey models with the lowest RMSE value of 0.03 and a MAPE value of 0.27%.
The ONGBM (1, 1) model follows, with a relatively low MAPE value of 0.28%. As mentioned in [44], a proper forecasting method should perform well in the simulation stage and also in the prediction stage. From Table 8, it is easy to find that the proposed model is again better than the other grey models because of its lower RMSE value of 0.40 and MAPE value of 1.72%. Interestingly, ONGBM (1, 1) is the second best because its MAPE value is only slightly higher than that of INGBM (1, 1), which implies that the NGBM (1, 1) improved through optimization of the background value can be regarded as an alternative model to predict the number of R&D institutions of higher education. In this case, the prediction and fitting errors of all models are small, which shows that there is no overfitting in the modeling. At the same time, the nonlinear grey models predict better than the linear models, which shows that the nonlinear grey models can effectively capture the nonlinear characteristics of the data. Finally, the improved model has the highest accuracy, which indicates that the improvement strategy is effective.
To further verify the advantages of WOA, three other intelligent optimizers, the grey wolf optimizer (GWO) [45], the particle swarm optimizer (PSO) [46], and the ant lion optimizer (ALO) [47], are used for comparison. These four algorithms are all excellent optimizers with their own characteristics and advantages. The population size of each of the four algorithms is set to 100, and the number of search iterations is set to 100. The population is initialized 100 times to compare the final MAPE with the corresponding nonlinear parameters and to calculate the average time. For the four optimization algorithms, the MAPE and the corresponding nonlinear parameters after running 30 times are shown in Figure 7, and the time consumption is shown in Table 9.

It can be seen from Figure 7 and Table 9 that the operation of WOA is relatively stable, and the running time of WOA is 9.9931 s, which is relatively small.Overall, the WOA is reasonable as an optimizer.
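A comparison of this kind, repeated runs with the final objective value and the average running time recorded, can be organized as in the following Python sketch. It is a generic harness under assumptions, not the experiment code of this study: `benchmark` and `random_search` are hypothetical names, and `random_search` is a simple stand-in optimizer rather than WOA, GWO, PSO, or ALO.

```python
import random
import statistics
import time


def benchmark(optimizer, objective, runs=30):
    """Run optimizer(objective, seed) repeatedly; summarize final values and timing."""
    finals, elapsed = [], []
    for seed in range(runs):
        start = time.perf_counter()
        _, value = optimizer(objective, seed)
        elapsed.append(time.perf_counter() - start)
        finals.append(value)
    return {
        "mean_value": statistics.mean(finals),
        "std_value": statistics.pstdev(finals),  # spread across runs (stability)
        "mean_time_s": statistics.mean(elapsed),
    }


def random_search(objective, seed, evaluations=200):
    """Stand-in optimizer for the sketch: uniform random search on [-1, 1]."""
    rng = random.Random(seed)
    candidates = [rng.uniform(-1.0, 1.0) for _ in range(evaluations)]
    best = min(candidates, key=objective)
    return best, objective(best)
```

Passing each optimizer through the same `benchmark` call with the same objective keeps the comparison of stability (spread of final values) and cost (mean time) on an equal footing.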

Conclusion
This paper aims to further improve the prediction accuracy of the nonlinear grey Bernoulli model (NGBM (1, 1)); as a result, the nonlinear grey Bernoulli model with improved parameters, abbreviated as INGBM (1, 1), is proposed. This study does not use the differential equation of the traditional NGBM (1, 1) model directly. Instead, the differential equation is transformed into a linear form. Besides, considering that "misplaced replacement" is the root cause of the mismatch when converting the differential equation to the difference equation, the model parameters are optimized to better match these two equations and reduce the prediction error. In particular, the whale optimization algorithm is used to automatically determine the optimal power index of INGBM (1, 1). Three examples are employed to validate the proposed model's effectiveness by comparison with commonly used grey models. In all cases, the proposed model outperforms the other grey models, implying that the INGBM (1, 1) model can effectively solve nonlinear problems with a small sample size and provide valuable information for decision-makers to make strategies in advance.
Although INGBM (1, 1) performs well, some limitations remain to be overcome in future work: (1) overfitting may occur in some special cases; (2) more accurate parameter values may be obtained by using multiple optimizers.

Figure 2: The track of searching for the optimal power index by WOA.

Figure 3: The track of searching for the optimal power index by WOA.

Figure 4: The track of searching for the optimal power index by WOA.

Table 1: The criteria for MAPE proposed by Lewis.

Table 2: Parameter values for five grey models.

Table 3: Simulated and predicted results by different grey models.

Table 4: Parameter values for five grey models.

Table 5: Simulated and predicted results by different grey models.

Table 6: The number of R&D institutions of higher education from 2011 to 2018.

Table 7: Parameter values for five grey models.

Table 8: Simulated and predicted performance by five grey models using raw data of the number of R&D institutions of China's higher education.

Table 9: Average time cost of four optimizers.