Research on WNN Modeling for Gold Price Forecasting Based on Improved Artificial Bee Colony Algorithm

Gold price forecasting has recently been a hot issue in economics. In this work, a wavelet neural network (WNN) combined with a novel artificial bee colony (ABC) algorithm is proposed for gold price forecasting. In this improved algorithm, the conventional roulette selection strategy is discarded. In addition, the convergence status observed in a previous cycle of iteration is used as a feedback message to adjust the searching intensity in the subsequent cycle. Experimental results confirm that the new algorithm converges faster than the conventional ABC on several classical benchmark functions and effectively improves the modeling capacity of WNN for gold price forecasting.


Introduction
Since ancient times, gold has been recognized as a symbol of wealth and a frontierless currency that can be easily exchanged among different monetary systems [1,2]. In recent decades, gold has gradually become a popular nonmonetary tool in the financial market, which is characterized by high yield and high risk. The gold price is partly regarded as a reflection of investors' expectations and the world's economic trends. Therefore, gold price forecasting is a vital issue in economics. At the same time, it is noted that, during the financial crisis in 2008 and early 2009, the global gold price increased by 6% on average, while many mineral prices dropped by approximately 40% [3]. In this sense, the gold price behaves differently from most other mineral commodities, which makes forecasting even more challenging.
Regarding the prediction of gold price, as with the prediction of other macroeconomic indexes, research efforts have focused on neural network (NN) approaches [4]. A neural network is a mathematical model that consists of interconnected groups of artificial neurons and processes information using a connectionist approach to computation [5,6]. In most cases, neural networks are adaptive systems that can alter their internal structures according to external information. Ever since McCulloch and Pitts' pioneering work [7], artificial models such as the back-propagation neural network (BP-NN) [8], radial basis function neural network (RBF-NN) [9], wavelet neural network (WNN) [10], Kohonen neural network [11], and Hopfield neural network [12] have been proposed and investigated. Among these methods, WNN has shown advantages in regression accuracy and fault tolerance owing to the adoption of the wavelet transform. It has been confirmed that the WNN model is the optimal approximator for functions of one variable [13].
A large number of methods are available to optimize WNN models, among which the gradient descent method (GDM) and the least squares method (LS) are undoubtedly the most popular [14]. However, such conventional methods cannot optimize some complicated WNN models efficiently or globally. In other words, when a WNN model has a large number of parameters or when the training scheme is complicated, these deterministic optimization methods are not as efficient as expected. Therefore, researchers have gradually shifted their interest towards intelligent algorithms for optimizing WNN models [15].
Intelligent algorithms have been well studied in recent decades, among which the artificial bee colony (ABC) algorithm is a famous example. It is motivated by the foraging behavior of bee swarms, in which both local exploitation and global exploration are implemented in each iteration [16]. ABC has been applied and developed in different ways [17][18][19][20][21][22][23]. However, among the improvements that have been made to the conventional ABC, to the best of the author's knowledge, attention has seldom been paid to fully utilizing the convergence messages hidden in the iteration system. In this paper, internal-feedback ABC (IF-ABC) is applied for WNN parameter optimization when training a gold price prediction model. In this new algorithm, the number of invalid trials is taken as an index to reflect the internal status and then to adjust the exploration/exploitation intensity. The author believes that, apart from the objective function values, messages that reflect the convergence status should be fully used to direct the subsequent searching cycles [24]. As for gold price forecasting, neural network methods such as RBF-NN [25], BP-NN [26], and WNN [27] have been studied. This work provides an intensive study to evaluate the performance of IF-ABC when training WNN models, in comparison with the conventional ABC algorithm.
The remainder of this paper is organized as follows. In Section 2, the principle of the WNN model is briefly introduced. In Section 3, the principles of ABC and IF-ABC are introduced in detail. Section 4 validates the effectiveness of IF-ABC by means of some classical benchmark functions. Then, IF-ABC is applied to optimize the WNN models for gold price forecasting, and the simulation results are reported in Section 5, together with some discussions. The conclusion is drawn in the last section.

Principle of Wavelet Neural Network Model
WNN is a feed-forward neural network combined with the wavelet transform theory [15]. In such a framework, the wavelet space is regarded as a feature space, where features are extracted by weighting the interior states of the input signals. Compared with other NN models, WNN possesses higher prediction accuracy and better fault tolerance to meet the uncertainty, nonlinearity, and complexity in real-world systems [28].
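To make the structure concrete, the following is a minimal sketch of one forward pass through a single-hidden-layer WNN. It is not the model defined in this paper: the Morlet-type mother wavelet, the way the dilation and translation parameters enter the hidden layer, and all names (`morlet`, `wnn_forward`, `w_in`, `w_out`, `a`, `b`) are assumptions chosen for illustration, and the numbers are made up.

```python
import math

def morlet(t):
    # Morlet-type mother wavelet (assumed; a common choice in WNN studies).
    return math.cos(1.75 * t) * math.exp(-0.5 * t * t)

def wnn_forward(x, w_in, a, b, w_out):
    """One forward pass: inputs -> wavelet hidden layer -> linear outputs.

    w_in[j][i]   weight from input i to hidden node j
    a[j], b[j]   dilation and translation of hidden node j (a[j] must be nonzero)
    w_out[k][j]  weight from hidden node j to output k
    """
    hidden = []
    for j in range(len(w_in)):
        net = sum(w * xi for w, xi in zip(w_in[j], x))
        hidden.append(morlet((net - b[j]) / a[j]))
    return [sum(w * h for w, h in zip(row, hidden)) for row in w_out]

# Two inputs, two hidden nodes, one output (illustrative values only).
y = wnn_forward(x=[0.0, 0.0],
                w_in=[[0.2, 0.4], [0.6, 0.8]],
                a=[1.0, 2.0], b=[0.0, 0.0],
                w_out=[[0.5, 0.25]])
```

Under these assumptions, a swarm-based trainer would flatten every entry of `w_in`, `a`, `b`, and `w_out` into one position vector and use the training error as the objective function.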

Review of Conventional ABC.
ABC is a swarm intelligence-based optimization algorithm inspired by the foraging behavior of bees [16]. In this algorithm, three kinds of bees, namely, the employed bees, the onlooker bees, and the scout bees, cooperate to search for the optimal nectar source in the space [23].
At the beginning, an initial population containing $SN$ food sources (i.e., feasible solutions) is randomly generated using (5). In this equation, each solution $X_i = (x_{i1}, x_{i2}, \ldots, x_{iD})$ is a $D$-dimensional vector, $X_{\max}$ and $X_{\min}$ stand for the constraints of the optimization problem, and rand(0, 1) stands for a random number in the range (0, 1) obeying the uniform distribution. Note that the variables involved in this section are unrelated to those appearing in Section 2:

$$X_i = X_{\min} + \mathrm{rand}(0, 1) \cdot (X_{\max} - X_{\min}). \qquad (5)$$

Afterwards, the iteration process starts. Generally, in each cycle of iteration, $SN$ employed bees search globally, and then $SN$ onlooker bees search locally around the "qualified" employed bees. Here, the qualification standard concerns the roulette selection strategy and will be introduced later.
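The random initialization described above can be sketched as follows. The function name `init_food_sources` and the argument names are hypothetical, and drawing one uniform number per element (rather than one per vector) is an assumption about how (5) is applied.

```python
import random

def init_food_sources(sn, x_min, x_max, rng=random):
    """Return sn random positions inside the box [x_min, x_max]."""
    sources = []
    for _ in range(sn):
        # One uniform draw per element (assumed reading of (5)).
        position = [lo + rng.random() * (hi - lo) for lo, hi in zip(x_min, x_max)]
        sources.append(position)
    return sources

# 20 food sources in a 3-dimensional box; the bounds are illustrative.
sources = init_food_sources(sn=20, x_min=[-5.0] * 3, x_max=[5.0] * 3,
                            rng=random.Random(0))
```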
In detail, each employed bee utilizes the position of one randomly chosen companion to generate a new searching position. Here, only one (randomly chosen) element in the vector $X_i$ needs to be changed. For instance, when the $i$th employed bee utilizes the position of the $k$th companion in the $j$th element, the involved element $x_{ij}$ is changed according to (6). Afterwards, the greedy selection procedure is implemented. If the new position updated by (6) is better (i.e., if the corresponding objective function value is lower), the previous position is discarded; otherwise, the employed bee remains at the previous position.
When all the employed bees complete the searching procedure mentioned above, an index $P_i$ is calculated as the qualification measurement for the $i$th employed bee from obj(⋅), the objective function, and fitness(⋅), a conventionally defined fitness function. Each onlooker bee needs to search around an employed bee using (6). In this case, one of the two elements involved in (6) is the corresponding element of the selected employed bee and the other is that of the onlooker bee. Again, the greedy selection procedure is then implemented. The selection principle for the qualified employed bees concerns the roulette selection strategy. If $P_1 \ge$ rand(0, 1), the first employed bee is chosen for the specific onlooker bee; otherwise, the comparison between $P_2$ and rand(0, 1) is carried out. If every $P_i$ happens to be smaller than rand(0, 1), the process starts over until one employed bee satisfies the condition. In this way, each onlooker bee determines the corresponding employed bee to follow.
Computational Intelligence and Neuroscience. Table 2 note: the bold values denote the better value (mean or S.D.) in each line.
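As a rough illustration of the conventional ABC steps described in this subsection, the sketch below uses the forms that are standard in the ABC literature; they are assumed here, since the displayed equations are not reproduced in this text. In particular, the neighbour update, the fitness definition, and the normalized selection index are assumptions, and all function names are hypothetical.

```python
import random

def fitness(obj_value):
    # Conventional ABC fitness (assumed): larger is better when minimising.
    if obj_value >= 0:
        return 1.0 / (1.0 + obj_value)
    return 1.0 + abs(obj_value)

def selection_index(obj_values):
    # Assumed form of the qualification index: fitness normalised by the swarm total.
    fits = [fitness(v) for v in obj_values]
    total = sum(fits)
    return [f / total for f in fits]

def neighbour_move(positions, i, rng):
    """Perturb one random element of bee i relative to a random companion k != i."""
    k = rng.choice([idx for idx in range(len(positions)) if idx != i])
    j = rng.randrange(len(positions[i]))
    candidate = list(positions[i])
    candidate[j] += rng.uniform(-1.0, 1.0) * (positions[i][j] - positions[k][j])
    return candidate

def greedy(obj, current, candidate):
    """Keep the candidate only if it lowers the objective; also report success."""
    if obj(candidate) < obj(current):
        return candidate, True
    return current, False

def roulette_pick(probs, rng):
    """Scan the bees in order and accept the first whose index beats a fresh draw."""
    while True:
        for i, p in enumerate(probs):
            if p >= rng.random():
                return i

rng = random.Random(1)
probs = selection_index([4.0, 1.0, 0.0])   # illustrative objective values
chosen = roulette_pick(probs, rng)
positions = [[1.0, 2.0], [3.0, 5.0], [0.0, -1.0]]
candidate = neighbour_move(positions, 0, rng)
```

Note that the sequential scan in `roulette_pick` follows the verbal description above, so bees listed earlier are compared first.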
During each cycle of iteration, once the $i$th employed bee, or an onlooker bee that searches around the $i$th employed bee, finds a better position in the crossover procedure, the parameter trial($i$) is directly reset to zero; otherwise, it is increased by one. In this sense, trial($i$) is regarded as a counter memorizing the number of invalid searches for the $i$th employed bee. Before a new cycle of iteration starts, it is necessary to check whether any trial($i$) exceeds a certain threshold Limit. If trial($i$) > Limit, trial($i$) is directly reset to zero, and a scout bee with a randomly initialized position generated using (5) takes the place of that employed bee. It should be noted that at most one scout bee is allowed to emerge in each cycle of iteration.
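The counter and scout rules above might be coded as in the following sketch. The names `update_trial` and `scout_phase` are hypothetical. The text does not say which source is replaced when several counters exceed Limit in the same cycle, so picking the one with the largest counter is an assumption.

```python
import random

def update_trial(trial, i, improved):
    # Reset on success, otherwise count one more invalid search.
    trial[i] = 0 if improved else trial[i] + 1

def scout_phase(positions, trial, limit, x_min, x_max, rng):
    """Reinitialise at most one exhausted source per cycle; return its index or None."""
    exhausted = [i for i, t in enumerate(trial) if t > limit]
    if not exhausted:
        return None
    # Assumption: when several sources are exhausted, replace the worst counter.
    i = max(exhausted, key=lambda idx: trial[idx])
    positions[i] = [lo + rng.random() * (hi - lo) for lo, hi in zip(x_min, x_max)]
    trial[i] = 0
    return i

positions = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]]   # illustrative values
trial = [0, 201, 205]
replaced = scout_phase(positions, trial, limit=200,
                       x_min=[-1.0, -1.0], x_max=[1.0, 1.0],
                       rng=random.Random(0))
```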

Principle of IF-ABC.
The author and colleagues originally proposed IF-ABC in the previous literature [24,29]. The algorithm presented in this work is slightly modified to make it more efficient.
At first, all the employed bees are randomly sent out to explore in the nectar source space (i.e., the feasible solution space) following (5). Afterwards, the iteration process gets started.
In each cycle of iteration, an employed bee exchanges information with its (randomly selected) companions. Different from that in ABC, the crossover procedure should involve as many as trial($i$) elements in the position of the $i$th employed bee. The corresponding update equation is slightly different from (6), aiming to promote swarm diversity during the global exploration procedure. Then, the greedy selection procedure is conducted to select the better position.
Afterwards, the onlookers carry on the searching process. In IF-ABC, each of the employed bees is given a chance to be followed by an onlooker regardless of whether it is "qualified" or not, so as to bring about more chances (i.e., more dynamics and diversity) for evolution and to fight against premature convergence as well. In IF-ABC, a new idea is introduced to evaluate the qualification of a bee. Now that the roulette selection strategy is discarded, the onlookers directly choose their corresponding employed bees to search locally using (9), where the companion $X_k$ and the element index $j$ are randomly selected. Afterwards, the greedy selection is implemented.
For each of the employed bees, together with the corresponding onlookers, the parameter trial represents the number of inefficient searches before a better position is derived. If the $i$th employed bee or the $i$th onlooker bee finds a better position, trial($i$) is directly reset to 1; otherwise, it is increased by 1. If trial($i$) is greater than $D$, the current $i$th position $X_i$ should be replaced by a reinitialized position using (5).
Since 1 ≤ trial($i$) ≤ $D$, as many as trial($i$) out of the $D$ elements in a candidate feasible solution are involved in the exploration process. When it comes to the onlooker bees, however, only one element is changed, because a multicrossover process is believed to contribute little to local search ability. Note that a convergence factor appears in (9), which is carefully designed to adjust the exploitation accuracy according to the current convergence status of the $i$th employed bee. As shown in (10), this factor decreases exponentially to 0.1 as trial($i$) gradually approaches $D$. Here, 0.1 is a user-specified lower bound of the convergence scale, and other constants may be chosen by the user. In this sense, the exploitation around one certain employed bee is gradually intensified before the bee is eventually discarded by means of reinitialization (when trial($i$) exceeds $D$).
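The two IF-ABC moves can be illustrated with the sketch below. Equations (9) and (10) are not reproduced in this text, so everything here is an assumption: `convergence_factor` is just one exponential form that equals 1 at trial = 1 and 0.1 at trial = D, and the per-element perturbation simply reuses the standard ABC-style step with a fresh random companion for each element. All names are hypothetical.

```python
import random

def convergence_factor(trial_i, dim, floor=0.1):
    """Decay exponentially from 1 (trial_i == 1) to `floor` (trial_i == dim)."""
    return floor ** ((trial_i - 1) / max(dim - 1, 1))

def _companion(n, i, rng):
    # Any bee other than bee i.
    return rng.choice([idx for idx in range(n) if idx != i])

def employed_move(positions, i, trial_i, rng):
    """Global exploration: perturb trial_i distinct elements of bee i."""
    dim = len(positions[i])
    candidate = list(positions[i])
    for j in rng.sample(range(dim), min(trial_i, dim)):
        k = _companion(len(positions), i, rng)
        candidate[j] += rng.uniform(-1.0, 1.0) * (positions[i][j] - positions[k][j])
    return candidate

def onlooker_move(positions, i, trial_i, rng):
    """Local exploitation: perturb one element, scaled by the convergence factor."""
    dim = len(positions[i])
    j = rng.randrange(dim)
    k = _companion(len(positions), i, rng)
    step = rng.uniform(-1.0, 1.0) * (positions[i][j] - positions[k][j])
    candidate = list(positions[i])
    candidate[j] += convergence_factor(trial_i, dim) * step
    return candidate

rng = random.Random(0)
positions = [[0.0, 0.0, 0.0, 0.0], [1.0, 2.0, 3.0, 4.0], [5.0, 6.0, 7.0, 8.0]]
explored = employed_move(positions, 0, trial_i=2, rng=rng)
exploited = onlooker_move(positions, 0, trial_i=2, rng=rng)
```

With this form, a bee that keeps failing takes wider exploratory moves as an employed bee while its onlooker takes ever smaller local steps, which matches the qualitative behaviour described above.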
To briefly conclude, trial in IF-ABC works to adjust the searching intensity in local exploitation and to determine the searching scale in global exploration. In the author's viewpoint, the convergence performance of a bee is measured not only by the corresponding objective function value but also by whether its new position is better than the previous one.
Such a change intends to provide more possibilities for the so-called unqualified employed bees to be exploited locally by onlooker bees.
The pseudocode of IF-ABC for constrained optimization problems is given in Algorithm 1. MCN denotes the predefined maximum cycles of iteration.
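For readers without access to Algorithm 1, the following compact loop gives one possible reading of the IF-ABC procedure described in this section. It is a sketch under stated assumptions and not the author's pseudocode: the update steps, the convergence factor, the clipping to the bounds, and the choice to reinitialize a source as soon as its counter exceeds the dimension are all assumed, and every name (`if_abc`, `factor`, `perturb`, `remember`) is hypothetical.

```python
import random

def if_abc(obj, x_min, x_max, sn=20, mcn=100, seed=0):
    """Minimise obj over the box [x_min, x_max].

    Returns (best_value, best_position, history of best values per cycle).
    """
    rng = random.Random(seed)
    dim = len(x_min)

    def random_position():
        return [lo + rng.random() * (hi - lo) for lo, hi in zip(x_min, x_max)]

    def factor(t):
        # Assumed stand-in for (10): 1 at t == 1, 0.1 at t == dim.
        return 0.1 ** ((t - 1) / max(dim - 1, 1))

    pos = [random_position() for _ in range(sn)]
    val = [obj(p) for p in pos]
    trial = [1] * sn
    start = min(range(sn), key=lambda b: val[b])
    best = {"val": val[start], "pos": list(pos[start])}
    history = [best["val"]]

    def remember(i):
        # Keep the best position ever seen, even if its source is later discarded.
        if val[i] < best["val"]:
            best["val"], best["pos"] = val[i], list(pos[i])

    def perturb(i, elements, scale):
        cand = list(pos[i])
        for j in elements:
            k = rng.choice([b for b in range(sn) if b != i])
            cand[j] += scale * rng.uniform(-1.0, 1.0) * (pos[i][j] - pos[k][j])
            cand[j] = min(max(cand[j], x_min[j]), x_max[j])  # stay in the box (assumed)
        return cand

    def greedy(i, cand):
        value = obj(cand)
        if value < val[i]:
            pos[i], val[i], trial[i] = cand, value, 1
        else:
            trial[i] += 1
            if trial[i] > dim:              # exhausted: reinitialise as in (5)
                pos[i] = random_position()
                val[i] = obj(pos[i])
                trial[i] = 1
        remember(i)

    for _ in range(mcn):
        for i in range(sn):   # employed bees: trial[i] elements, full-size step
            greedy(i, perturb(i, rng.sample(range(dim), trial[i]), 1.0))
        for i in range(sn):   # onlookers: every source is followed, one element
            greedy(i, perturb(i, [rng.randrange(dim)], factor(trial[i])))
        history.append(best["val"])
    return best["val"], best["pos"], history

def sphere(x):
    # Simple unimodal test objective used only for this illustration.
    return sum(v * v for v in x)

best_val, best_pos, history = if_abc(sphere, [-5.0] * 4, [5.0] * 4,
                                     sn=10, mcn=50, seed=3)
```

Because the best position is stored separately, the recorded history can never get worse from one cycle to the next, even though individual sources are frequently reinitialized.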

Effectiveness Validation of IF-ABC for Numerical Optimization
In this section, ABC and IF-ABC are tested on a number of classical benchmark functions [30]. The functions concerned are listed in Table 1, together with the predefined optimization domains, optimums, and optimal solutions. In this table, dim stands for the dimension of feasible solutions. $f_1$ is unimodal and $f_2$, $f_3$, $f_4$, and $f_5$ are multimodal. All the simulations were implemented in MATLAB R2010a and executed on an Intel Core 2 Duo CPU running at 2.53 GHz with 2 GB RAM. Each experiment was repeated 50 times with different random seeds. The maximum cycle number MCN is set to 1000 for all the cases involved in this section, the swarm population (i.e., $2 \cdot SN$) is constantly set to 40, and Limit is set to 200. Two indexes that reflect the convergence performances (i.e., the mean and standard deviation of benchmark function values) are listed in Table 2. Figures 2, 3, 4, 5, and 6 show some typical simulation results that illustrate the advantage of IF-ABC.
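The repeated-run protocol above (50 seeds, then mean and standard deviation of the final values) can be sketched as follows. The benchmark functions in Table 1 are not named in this text, so `sphere` and `rastrigin` are only illustrative examples of a unimodal and a multimodal function, `random_search` is a trivial stand-in for the optimizer under test, and the use of the sample standard deviation is an assumption.

```python
import math
import random
import statistics

def sphere(x):
    # Unimodal example; minimum 0 at the origin.
    return sum(v * v for v in x)

def rastrigin(x):
    # Multimodal example; minimum 0 at the origin.
    return 10.0 * len(x) + sum(v * v - 10.0 * math.cos(2.0 * math.pi * v) for v in x)

def random_search(obj, dim, lo, hi, evaluations, seed):
    """Placeholder optimizer: best of `evaluations` uniform random points."""
    rng = random.Random(seed)
    return min(obj([rng.uniform(lo, hi) for _ in range(dim)])
               for _ in range(evaluations))

def summarize(optimizer, runs=50):
    """Run the optimizer once per seed and report mean and sample S.D."""
    finals = [optimizer(seed) for seed in range(runs)]
    return statistics.mean(finals), statistics.stdev(finals)

mean, sd = summarize(lambda seed: random_search(sphere, 5, -5.0, 5.0, 200, seed))
```

Any optimizer that takes a seed and returns a final objective value can be passed to `summarize` in place of the placeholder.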
As can be seen in Figures 2-6, the iteration process converges more slowly with IF-ABC than with ABC in the early cycles of iteration, but IF-ABC catches up with and surpasses ABC later. Initially, it is generally easy to evolve, regardless of the differences in algorithms. In other words, involving more than one element of a feasible solution in the crossover procedure does not necessarily lead to a better result than involving only one element. However, as the iteration moves on, the internal feedback strategy begins to take effect. Therefore, it is believed that IF-ABC sacrifices part of its initial convergence capability for dynamics and diversity in the bee swarms. A complete comparison of the two algorithms is listed in Table 2, where IF-ABC performs far better (within 1000 cycles of iteration) in most of the cases.
The author noticed that many remedies for the conventional ABC come from the outside world (e.g., [18,19]) and ignore the convergence status inside the iteration system. In this sense, IF-ABC intends to emphasize and advocate the importance of fully utilizing internal status as feedback messages. Even if IF-ABC turns out to perform worse than some existing ABC variants, it does not mean that the internal feedback strategy is of no use. Therefore, as a preliminary step, the author compared IF-ABC with the conventional ABC in all the experiments and simulations of this work.

Simulations for Gold Price Forecasting Scheme
The quantities of supply and demand, the prosperity of economics, and the environment of international politics mainly affect the gold price or may be regarded as good reflections of the gold price in the future [3,31,32]. In this work, seven macroeconomic indexes are considered principal reflections of the long-term gold price in the future (i.e., gold price in the next year), namely, Dow Jones Industrial Average Index (DJIA), Consumer Price Index (CPI), US dollar nominal effective exchange rate (NEER), US federal funds rate (FFR), US dollar index (USDX), the world's gold reserves (WGR), and the world's crude oil price (COP) [33,34]. The long-term prediction network structure is given in Figure 7. For the short term, four time-sensitive macroeconomic indexes are taken as principal reflections of the gold price in the future (i.e., gold price in the next month), namely, DJIA, CPI, USDX, and COP. The short-term prediction network structure is given in Figure 8.
Each type of experiment was repeated 50 times with randomly initialized conditions so that the initial conditions differ significantly in a statistical sense. It is set that $SN$ = 20 and Limit = 200. All the connection weights in WNN range from 0 to 1, and the two remaining groups of WNN parameters range from 0.0001 to 10 and from −1 to 1, respectively. The number of hidden layer nodes cannot be determined theoretically. In general, if it is too large, overfitting inevitably occurs; conversely, if it is too small, the derived model fails to reflect the underlying facts. In this work, the number of hidden layer nodes is selected using an equation in terms of the numbers of input and output layer nodes [28].
Besides, all data fed into the WNN model (i.e., economic indexes and the corresponding gold prices) are linearly standardized to the range [0, 1] as a preprocessing step, and the results produced by WNN (i.e., the predicted future gold prices) undergo the inverse process.
First, two cases were studied to compare the convergence performances of IF-ABC and ABC when optimizing a short-term forecasting model. In the first case, the prediction model was trained using four macroeconomic indexes from April 1982 to August 1985 (as long as 60 months), and the optimized WNN model was tested on the data in each of the coming 40 months (i.e., from September 1985 to August 1990). In the second case, the prediction model was trained using four macroeconomic indexes from September 1990 to April 1997 (as long as 80 months), and the optimized WNN model was tested on the data in each of the coming 120 months (i.e., from May 1997 to April 2007). The comparative convergence curves are illustrated in Figures 9 and 10. As can be seen in Figures 11 and 12, the trend of gold price predictions derived by the IF-ABC-WNN model is closer to the actual gold price data.
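The linear standardization to [0, 1] and its inverse, mentioned at the start of this passage, amount to ordinary min-max scaling. A minimal sketch follows; the function names are hypothetical and the numbers are placeholders, not gold price data.

```python
def fit_minmax(values):
    """Return the (min, max) pair that defines the linear map onto [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("constant series cannot be scaled to [0, 1]")
    return lo, hi

def scale(values, lo, hi):
    # Forward map applied before the data enter the network.
    return [(v - lo) / (hi - lo) for v in values]

def unscale(scaled, lo, hi):
    # Inverse map applied to the network outputs.
    return [lo + s * (hi - lo) for s in scaled]

series = [300.0, 400.0, 500.0]            # placeholder values
lo, hi = fit_minmax(series)
scaled = scale(series, lo, hi)
restored = unscale(scaled, lo, hi)
```

A practical point, stated here as a general precaution rather than something taken from the paper: the minimum and maximum would normally be fitted on the training window only and then reused for the test window, so test values can fall slightly outside [0, 1].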
In the long term, such training methodologies are ineffective or invalid, since the mechanism by which the macroeconomic indexes affect the gold price may vary significantly. To confirm this point of view, an experiment was carried out in this work as well (see Figure 13). In this example, annual macroeconomic indexes from 1987 to 2000 are regarded as the training data. The derived WNN model is tested by forecasting the gold price trends from 1973 to 1986 and from 2001 to 2011. Figure 13 clearly shows that the trained model fits the actual gold prices well only from 1987 to 2002. That is to say, the generalization ability of WNN is too weak to forecast the long-term gold price.

Conclusion
In this work, a modified version of ABC named IF-ABC is applied to optimize the WNN model for gold price forecasting. A series of numerical experiments confirms that IF-ABC is more effective than the conventional ABC in training WNN models.
IF-ABC is applied in this work to advocate the viewpoint that, apart from the quality of a nectar source (i.e., the objective function value), the true convergence efficiency may also be reflected by whether a bee finds a position better than the one it previously stayed at. The author believes that the internal feedback strategy in the IF-ABC algorithm may be applied to modify some other swarm intelligence algorithms.
Besides, further investigation into the relationship between the gold price and other key influencing variables, especially in the long term, will be the future work.