
A Multilayer Perceptron (MLP) is a feedforward neural network with one or more hidden layers between the input and output layers. MLPs have been applied successfully to a wide range of problems in fields such as neuroscience, computational linguistics, and parallel distributed processing. Although MLPs can solve problems that are not linearly separable, two of the biggest challenges in their development and application are the local-minima problem and slow convergence on big data. To tackle these problems, this study proposes a Hybrid Chaotic Biogeography-Based Optimization (HCBBO) algorithm for training MLPs for big data analysis and processing. Four benchmark datasets are employed to investigate the effectiveness of HCBBO in training MLPs. The accuracy of the results and the convergence of HCBBO are compared with those of three well-known heuristic algorithms: (a) Biogeography-Based Optimization (BBO), (b) Particle Swarm Optimization (PSO), and (c) Genetic Algorithms (GA). The experimental results show that training MLPs with HCBBO outperforms the other three heuristic learning approaches for big data processing.

The term big data [

In the last decade, feedforward neural networks (FNNs) [

Theoretically, the goal of the learning process of MLPs is to find the combination of connection weights and biases that achieves the minimum error on the given training and test data. However, one of the most common problems in training an MLP is the tendency of the algorithm to converge to a local minimum. Since the error surface of an MLP can contain multiple local minima, the training process can easily become trapped in one of them instead of converging to the global minimum. This is a common problem in most gradient-based learning approaches such as backpropagation (BP) based NNs [

Recently, a novel optimization method called Biogeography-Based Optimization (BBO) [

The rest of this paper is organized as follows. In Section

The notation used in the rest of the paper represents a fully connected feedforward MLP network with a single hidden layer (as shown in Figure

An MLP with one hidden layer.

The output of each hidden node is calculated as follows:
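For a network with $n$ inputs and $h$ hidden nodes, a standard formulation (assuming a sigmoid activation function, as is common in MLP training studies) is:

```latex
s_j = \sum_{i=1}^{n} w_{ij}\, x_i - \theta_j, \qquad
S_j = \operatorname{sigmoid}(s_j) = \frac{1}{1 + e^{-s_j}}, \qquad j = 1, 2, \ldots, h,
```

where $w_{ij}$ denotes the connection weight from the $i$th input node to the $j$th hidden node, $x_i$ is the $i$th input, and $\theta_j$ is the bias of the $j$th hidden node.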

After calculating outputs of the hidden nodes, the final output can be defined as follows:
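In the same notation, with $m$ output nodes, the usual formulation is:

```latex
o_k = \operatorname{sigmoid}\!\left( \sum_{j=1}^{h} w_{jk}\, S_j - \theta'_k \right), \qquad k = 1, 2, \ldots, m,
```

where $w_{jk}$ is the weight from the $j$th hidden node to the $k$th output node and $\theta'_k$ is the bias of the $k$th output node (a sigmoid output activation is assumed here).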

The learning error is then measured by the difference between the actual outputs and the desired outputs over all training samples.

From the above equations, it can be observed that the final value of the output in MLPs depends upon the parameters of the connecting weights and biases. Thus, training an MLP can be defined as the process of finding the optimal values of the weights and biases of the connections in order to achieve the desirable outputs from certain given inputs.
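To make this optimization view concrete, the sketch below (a simplified illustration, not the paper's code; the flat vector layout and sigmoid activations are assumptions) encodes all weights and biases of a one-hidden-layer MLP as a single vector, which is the kind of representation a search agent holds in heuristic MLP training:

```python
import numpy as np

def decode(vec, n_in, n_hidden, n_out):
    """Split a flat parameter vector into MLP weights and biases.
    (The ordering here is an illustrative assumption; any fixed layout works.)"""
    i = 0
    W1 = vec[i:i + n_in * n_hidden].reshape(n_in, n_hidden); i += n_in * n_hidden
    b1 = vec[i:i + n_hidden]; i += n_hidden
    W2 = vec[i:i + n_hidden * n_out].reshape(n_hidden, n_out); i += n_hidden * n_out
    b2 = vec[i:i + n_out]
    return W1, b1, W2, b2

def forward(vec, X, n_in, n_hidden, n_out):
    """Forward pass of the MLP for a batch of inputs X."""
    W1, b1, W2, b2 = decode(vec, n_in, n_hidden, n_out)
    S = 1.0 / (1.0 + np.exp(-(X @ W1 - b1)))      # hidden-layer sigmoid outputs
    return 1.0 / (1.0 + np.exp(-(S @ W2 - b2)))   # final sigmoid outputs

def mse(vec, X, D, n_in, n_hidden, n_out):
    """Mean squared error between actual outputs and desired outputs D."""
    O = forward(vec, X, n_in, n_hidden, n_out)
    return float(np.mean(np.sum((O - D) ** 2, axis=1)))
```

Training then reduces to minimizing `mse` over the flat vector, which is exactly the search problem handed to the heuristic algorithms below.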

Biogeography-Based Optimization (BBO) is a population-based optimization algorithm inspired by biogeography: the migration of biological species between habitats and the emergence and extinction of species in different ecosystems. Experiments show that the results obtained using BBO are at least competitive with those of other population-based algorithms, and BBO has been shown to outperform well-known heuristic algorithms such as PSO, GA, and ACO on a number of real-world problems and benchmark functions [

The steps of the BBO algorithm can be described as follows. At the beginning, BBO generates a population of randomly initialized search agents called habitats, each represented as a vector of the problem variables (analogous to chromosomes in GA). Next, each agent is assigned emigration, immigration, and mutation rates, which simulate the characteristics of different ecosystems. In addition, a variable called the habitat suitability index (HSI) measures the fitness of each habitat: a higher HSI indicates that the habitat is more suitable for the residence of biological species. In other words, a BBO solution with a high HSI represents a superior result, while one with a low HSI represents an inferior result.

Over the course of the iterations, a set of solutions is maintained from one iteration to the next, and each habitat exchanges inhabitants with other habitats according to its immigration and emigration rates, which are adapted probabilistically. In each iteration, a random number of inhabitants are also occasionally mutated. In this way, each solution adapts by learning from its neighbors as the algorithm progresses. Each solution parameter is referred to as a suitability index variable (SIV).

The process of BBO is composed of two phases: migration and mutation. During the migration phase, the immigration rate of each habitat determines how likely it is to receive solution features from other habitats, while its emigration rate determines how likely it is to share its own features.

Species model of a habitat.

The immigration and emigration rates of a habitat are defined mathematically as functions of the number of species it hosts.
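In the linear migration model commonly used in BBO, a habitat hosting $k$ species (out of a maximum of $n$) has the rates:

```latex
\lambda_k = I \left( 1 - \frac{k}{n} \right), \qquad \mu_k = E \, \frac{k}{n},
```

where $I$ and $E$ are the maximum possible immigration and emigration rates. A well-populated (high-HSI) habitat therefore emigrates many features and immigrates few, and vice versa.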

The mutation of each habitat, which improves the exploration of BBO, is defined as follows:
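The migration and mutation phases can be sketched as one BBO generation in Python. This is a minimal sketch of the standard algorithm (rank is used as a proxy for species count, elitism is omitted, and the cost function, bounds, and parameter names are illustrative assumptions, not the paper's exact code):

```python
import numpy as np

def bbo_step(pop, cost_fn, I=1.0, E=1.0, p_mut=0.05, rng=None):
    """One BBO generation: linear migration rates plus random mutation.
    pop is an (n_habitats, n_vars) array; cost_fn maps a habitat to a cost
    (lower is better, e.g. the MLP's MSE)."""
    rng = rng if rng is not None else np.random.default_rng()
    n, d = pop.shape
    cost = np.array([cost_fn(h) for h in pop])
    ranked = pop[np.argsort(cost)]       # best (lowest-cost) habitat first
    k = np.arange(1, n + 1)              # rank used as a species-count proxy
    lam = I * k / n                      # immigration: high for poor habitats
    mu = E * (1 - k / n)                 # emigration: high for good habitats
    new_pop = ranked.copy()
    for i in range(n):
        for j in range(d):
            if rng.random() < lam[i]:
                # habitat i immigrates SIV j from a source habitat chosen
                # with probability proportional to emigration rates
                src = rng.choice(n, p=mu / mu.sum())
                new_pop[i, j] = ranked[src, j]
            if rng.random() < p_mut:     # occasional random mutation
                new_pop[i, j] = rng.uniform(-1, 1)
    return new_pop
```

Each habitat vector here would hold the flattened MLP weights and biases when BBO is used for training.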

There are three different approaches for using heuristic algorithms for training MLPs. In the first approach, heuristic algorithms are employed to find a combination of weights and biases to provide the minimum error for an MLP. In the second approach, heuristic algorithms are utilized to find the proper architecture for an MLP to be applied to a particular problem. In the third approach, heuristic algorithms can be used to tune the parameters of a gradient-based learning algorithm.

Mirjalili et al. [

Chaos theory [

In this paper, chaotic systems are applied to BBO in place of random values [
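As a concrete illustration of a chaotic number source, the logistic map below generates a deterministic but non-repeating sequence in (0, 1) that can replace uniform random draws. The choice of this particular map and its parameters is an assumption for illustration; the paper's hybrid chaotic system is not reproduced here:

```python
def logistic_map(x0=0.7, mu=4.0, n=5):
    """Generate a chaotic sequence with the logistic map
    x_{t+1} = mu * x_t * (1 - x_t); mu = 4 gives fully chaotic behaviour
    for almost all seeds in (0, 1)."""
    seq, x = [], x0
    for _ in range(n):
        x = mu * x * (1 - x)
        seq.append(x)
    return seq
```

Unlike pseudorandom draws, the sequence is fully reproducible from its seed while still covering the interval irregularly, which is the property chaotic optimization variants exploit.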

During the training phase of an MLP, every training sample is involved in calculating the HSI of each candidate solution. In this work, the Mean Square Error (MSE) is used to evaluate all training samples. The MSE is defined as follows:
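Using the standard definition over $s$ training samples and $m$ output nodes:

```latex
\overline{\mathrm{MSE}} = \frac{1}{s} \sum_{k=1}^{s} \sum_{i=1}^{m} \left( o_i^{k} - d_i^{k} \right)^{2},
```

where $o_i^{k}$ and $d_i^{k}$ are the actual and desired values of the $i$th output for the $k$th training sample.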

To improve the convergence of the BBO algorithm during the mutation phase, a method named opposition-based learning (OBL) has been used in [

Assuming that a candidate solution is represented by a vector X whose components lie in an interval [a, b], the opposite vector X' is defined componentwise by x' = a + b - x. The OBL procedure can then be summarized as follows.

(1) Generate a vector X and compute its opposite vector X'.

(2) Evaluate the fitness of both points, f(X) and f(X').

(3) If f(X') is better than f(X), replace X with X'; otherwise, keep X.

Thus, the vector and its opposite vector are evaluated simultaneously to obtain the fitter one.
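The three OBL steps above can be transcribed directly (a minimal sketch; the function and parameter names are illustrative):

```python
import numpy as np

def opposition_select(x, lower, upper, cost_fn):
    """Opposition-based learning: compare a candidate vector with its
    opposite point x' = lower + upper - x and keep the fitter (lower-cost)
    of the two."""
    x_opp = lower + upper - x            # componentwise opposite vector
    return x if cost_fn(x) <= cost_fn(x_opp) else x_opp
```

Because the candidate and its opposite are evaluated in the same step, OBL doubles the chance of starting near a good region at the cost of one extra fitness evaluation.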

In this section, the main procedure of HCBBO is described. To guarantee an initial population with a certain quality and diversity, the initial population is generated using a combination of the chaotic system and the OBL approach. By fusing the local search strategies with the migration and mutation phases of the BBO algorithm, the exploration and exploitation capabilities of the HCBBO can be well balanced. The main procedure of our proposed HCBBO to train an MLP can be described as Algorithm

Probabilistically use immigration and emigration to modify each non-elite habitat based on Eq. (
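The population-initialization step of the procedure, combining a chaotic sequence with OBL as described above, can be sketched as follows (an illustrative sketch under stated assumptions: a logistic map is used as the chaotic source, and names and bounds are hypothetical, not the paper's exact code):

```python
import numpy as np

def chaotic_obl_init(pop_size, dim, lower, upper, cost_fn):
    """Initialise a population of habitats: draw each SIV from a logistic
    chaotic map scaled to [lower, upper], then keep the fitter of each
    candidate and its opposite point (OBL)."""
    x = 0.7                                   # chaotic seed (illustrative)
    pop = np.empty((pop_size, dim))
    for i in range(pop_size):
        row = np.empty(dim)
        for j in range(dim):
            x = 4.0 * x * (1.0 - x)           # chaotic value in [0, 1]
            row[j] = lower + (upper - lower) * x
        opp = lower + upper - row             # opposite candidate
        pop[i] = row if cost_fn(row) <= cost_fn(opp) else opp
    return pop
```

The resulting population then enters the usual BBO loop of migration and mutation, with the elite habitats preserved across generations.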

This study focuses on finding an efficient training method for MLPs. To evaluate the performance of the proposed HCBBO algorithm, a series of experiments was conducted in the MATLAB environment (version 2009). The system configuration is as follows: (a) CPU: Intel i7; (b) RAM: 4 GB; (c) operating system: Windows 8. Based on the works described in [

Classification datasets.

Classification datasets | Number of attributes | Number of training samples | Number of test samples | Number of classes |
---|---|---|---|---|

Balloon | 4 | 16 | 16 (same as the training samples) | 2 |

Iris | 4 | 150 | 150 (same as the training samples) | 3 |

SPECT Heart | 22 | 80 | 187 | 2 |

Vehicle | 18 | 400 | 446 | 4 |

In this paper, we compare the performance of four algorithms, BBO, PSO, GA, and HCBBO, on the benchmark datasets described in Table

The main parameters of BBO and HCBBO.

Maximum number of generations: | Maximum mutation rate: |

Elitism parameter: | Maximum possible emigration rate: |

Population size: | Maximum possible immigration rate: |

To increase the reliability of the experiment, each algorithm was run 20 times, and a different MLP structure was used for each dataset, as listed in Table

MLP structure parameters.

Layer | Balloon | Iris | SPECT Heart | Vehicle |
---|---|---|---|---|

Input | 4 | 4 | 22 | 18 |

Hidden | 9 | 9 | 45 | 52 |

Output | 1 | 3 | 1 | 4 |

The running time (RT) and convergence curves of each algorithm are shown in Figures

Total running time of each algorithm.

Convergence curves of algorithms for iris dataset.

Convergence curves of algorithms for heart dataset.

Convergence curves of algorithms for balloon dataset.

Convergence curves of algorithms for vehicle dataset.

The convergence curves in Figures

The experimental results of mean classification rate are provided in Table

Experimental results for classification rate.

Algorithm | Iris | Heart | Balloon | Vehicle |
---|---|---|---|---|

HCBBO | 93% | 81.2% | 100% | 76.2% |

BBO | 90% | 75.4% | 100% | 71.7% |

PSO | 38% | 66.5% | 100% | 56.8% |

GA | 88.2% | 56.9% | 100% | 59.9% |

In this paper, an HCBBO algorithm was presented for training MLPs. Four benchmark datasets (balloon, iris, heart, and vehicle) were employed to investigate the effectiveness of HCBBO in training MLPs. The performance results were statistically compared with those of three state-of-the-art algorithms: BBO, PSO, and GA. The main contributions and innovations of this work are summarized as follows: (a) this is the first work to combine a hybrid chaotic system with the BBO algorithm to train MLPs; (b) the OBL method was used in the mutation operator of HCBBO to improve the convergence of the algorithm; and (c) the results demonstrate that HCBBO has better convergence capabilities than BBO, PSO, and GA. In the future, we will apply the trained neural networks to analyze big medical data and integrate further novel data mining algorithms [

The authors declare that they have no conflicts of interest.