Three-Dimensional Short-Term Prediction Model of Dissolved Oxygen Content Based on PSO-BPANN Algorithm Coupled with Kriging Interpolation

Dissolved oxygen (DO) content is a significant aspect of water quality in aquaculture. Timely prediction of dissolved oxygen can help avoid the financial loss caused by inappropriate dissolved oxygen content, and three-dimensional prediction can provide more accurate and more comprehensive guidance. Therefore, this study presents a three-dimensional short-term prediction model of dissolved oxygen in crab aquaculture ponds based on a back propagation artificial neural network (BPANN) optimized by particle swarm optimization (PSO) and coupled with the Kriging method. In this model, wavelet analysis is adopted for denoising, BPANN optimized by PSO is utilized for data analysis and one-dimensional prediction, and the Kriging method is used for three-dimensional prediction. Compared with the traditional one-dimensional prediction model, the three-dimensional model reflects the dissolved oxygen content in the crab growth environment more realistically. In particular, the merits of PSO are evaluated against the genetic algorithm (GA). The root mean square error (RMSE), mean absolute error (MAE), and mean absolute percentage error (MAPE) for the PSO model are 0.136445, 0.90534, and 0.15384, respectively, while for the GA model the values are 2.04184, 1.18316, and 0.21014, respectively. Furthermore, results of the cross validation experiment show that the average error of this model is 0.0705 mg/L. Consequently, this study suggests that the prediction model operates in a satisfactory manner.


Introduction
Dissolved oxygen content is one of the most important concerns in pond water quality management [1]. Dissolved oxygen concentration has a strong influence on feed consumption, metabolic rate, and energy expenditure [2]. Inappropriate dissolved oxygen content causes not only an extra increase in the bait coefficient but also an inevitable decline in the growth rate of aquaculture products. For example, in crab culture ponds, crab deaths related to inappropriate dissolved oxygen account for about 50% of total deaths according to statistical analysis. Therefore, establishing an accurate and practical prediction model is urgent and important for aquaculture. Such a model can provide early warnings of dissolved oxygen content, improve efficiency and income, and help aquaculture reach the standard of scientific breeding.
In recent years, many studies have focused on dissolved oxygen (DO) content prediction and obtained great achievements. Holenda et al. designed an approach using model predictive control (MPC) to control the DO concentration [3]. Liu et al. studied the prediction of DO content based on least squares support vector regression optimized by improved particle swarm optimization [4] and a hybrid WA-CPSO-LSSVR model for DO content prediction in crab culture [5]. Ahmed proposed the prediction of DO using artificial neural networks [6] and applied an adaptive neurofuzzy inference system (ANFIS) to estimate the DO of the Surma River [7]. However, all the above studies are limited to one-dimensional prediction, which cannot forecast the DO comprehensively. Because aquatic products grow within a certain spatial range in ponds, traditional one-dimensional prediction cannot accurately predict dissolved oxygen throughout their growth environment. Therefore, the study of three-dimensional dissolved oxygen prediction is necessary. This paper proposes a three-dimensional prediction model of dissolved oxygen that can forecast the trend of change in space, which is closer to the real growth environment of aquatic products. Furthermore, a three-dimensional prediction model provides a more useful reference for the switching times and installation sites of aerators.
Many methods have been used for the prediction of water quality, such as the regression analysis method [8][9][10], the fuzzy reasoning method [11], the support vector machine [12,13], and the artificial neural network. The artificial neural network (ANN) has become an efficient model for predicting water quality because the water system is dynamic, complex, and nonlinear [14]. An ANN processes information by adjusting the internal relations between a large number of connected nodes, and it has strong self-learning and adaptive abilities. Xu and Liu built a short-term water quality prediction model combining the wavelet transform with the BP neural network. The model can predict water quality effectively and can meet the management requirements in intensive freshwater pearl breeding [15]. Alizadeh and Kavianpour developed a model based on wavelet-ANN to predict water quality parameters in Hilo Bay, Pacific Ocean. Their results show that the model can be successfully applied to predict daily and hourly values of DO, water temperature, and salinity in bays and oceans [16]. Sarkar and Pandey presented the use of the artificial neural network technique to estimate the dissolved oxygen concentrations downstream of Mathura city, India. The predicted values of DO show prominent accuracy, with high correlations between measured and predicted values [17].
However, the performance of an artificial neural network depends heavily on the choice of the network parameters, which determine the model accuracy. Although it is important to select the optimal parameters, there is no exact method to obtain them. Therefore, a heuristic global optimization algorithm, particle swarm optimization (PSO), is used to improve the model prediction accuracy. The optimization process of PSO consists of updating the particle positions and velocities iteratively. With its easy implementation, high precision, and fast convergence, this algorithm has been widely used. García Nieto et al. proposed a practical new hybrid PSO optimized SVM-based model to predict the successful growth cycle of Spirulina platensis. This optimization mechanism involved kernel parameter setting in the SVM training procedure, which significantly influences the regression accuracy [18]. Li et al. proposed an improved PSO-BP algorithm for the temperature compensation of a thermopile sensor, correcting the loss of system accuracy caused by temperature; the experimental results show that the proposed PSO-BP network outperforms other similar algorithms with faster convergence speed, lower errors, and higher accuracy [19]. In this study, the particle swarm optimization algorithm is applied to optimize the weights and threshold values of BPANN.
The Kriging method is capable of estimating an unknown value at a particular location and estimating an average value or distribution of values over a large or small area [20]. Many models have used the Kriging interpolation method. Khankham et al. adopted the MLPG method based on moving Kriging interpolation for solving convection-diffusion equations with an integral condition [21]. Phaochoo et al. established a meshless local Petrov-Galerkin method based on moving Kriging interpolation for solving the fractional Black-Scholes model. In the MLPG method, the shape function is constructed by the moving Kriging approximation [22]. This study couples the three-dimensional prediction model with the Kriging interpolation method because of the limited number of sensors and the limited test data. The experimental results show that this prediction model, which is simple and rapid with high prediction precision, is workable.

Data Collection.
For this research work, the measured data sets were collected from a crab pond in Gaocheng town, Yixing city, Jiangsu province, China. The local climate is a subtropical monsoon climate, warm and humid with abundant rainfall throughout the year. For this study, the data collected between 00:00 on 12 September 2014 and 23:50 on 16 September 2014 are used as sample data. There are 720 samples in total, taken every ten minutes. The water quality data are divided into a training set and a testing set: the first 600 samples are selected as the training data set and the remaining 120 samples as the testing data set. All of the data were processed with the data preprocessing methods described in Section 3.1. Since there is a large amount of data, only a portion of the data of collection point 2 is shown in Table 1.
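The sampling window and the split stated above can be checked with a few lines of Python. This is only an illustrative sketch: the variable names are hypothetical, and it assumes the split is purely chronological, as the text suggests.

```python
from datetime import datetime, timedelta

# Sampling window and interval as stated in the text.
start = datetime(2014, 9, 12, 0, 0)
end = datetime(2014, 9, 16, 23, 50)
step = timedelta(minutes=10)

# Build the list of sampling timestamps (both ends inclusive).
timestamps = []
t = start
while t <= end:
    timestamps.append(t)
    t += step

# Chronological split: first 600 samples for training, the rest for testing.
N_TRAIN = 600
train_times = timestamps[:N_TRAIN]
test_times = timestamps[N_TRAIN:]

print(len(timestamps), len(train_times), len(test_times))  # 720 600 120
```

Five days at 144 samples per day give the 720 samples reported, and under this assumption the 120 test samples cover the last 20 hours of the window.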
Dissolved oxygen content in water is affected by two kinds of processes. One kind lowers the dissolved oxygen content; for example, biological respiration consumes dissolved oxygen. The other kind raises the dissolved oxygen content, mainly through the photosynthesis of aquatic plants and the reaeration of oxygen from the air. Furthermore, the rates of plant photosynthesis and respiration are mainly related to solar radiation, air temperature, and water temperature. The oxygen reaeration rate is mainly affected by wind speed, air temperature, air humidity, atmospheric pressure, water temperature, and other factors. These parameters have a great influence on the content of dissolved oxygen, and the data can be easily obtained by sensors. Therefore, apart from the acquisition time and dissolved oxygen, the other 8 parameters in Table 1 (rainfall, wind speed, wind direction, solar radiation, air temperature, air humidity, atmospheric pressure, and water temperature) are used as input data in this paper.
The observed pond is 270 meters long and 76.5 meters wide. In addition, there was a nanotube aeration device at the bottom of the pond to increase oxygen content. Taking into account the limited number of sensors and the requirements of the experimental data, this study took twelve collection points in the pond. Figures 1 and 2 show the top view and sectional view of the collection points in the aquaculture pond, respectively, in which circles represent sensors. Water temperature sensors and dissolved oxygen sensors were installed at distances of 30 cm and 90 cm from the water surface to collect water quality data at different depths simultaneously every ten minutes. Because dissolved oxygen content is partially affected by many weather factors, a meteorological station was installed next to the aquaculture pond to collect solar radiation, atmospheric pressure, temperature, humidity, rainfall, wind direction, and wind speed every ten minutes.

Wavelet Analysis.
Wavelet analysis (WA) plays a critical role in the preprocessing of dissolved oxygen data. WA denoising methods can be divided into three types: the spatial correlation filtering method, the wavelet threshold filtering method, and the modulus maximum reconstruction filtering method. Among the three methods, the wavelet threshold filtering method, which offers good noise reduction and high computing speed, is widely used in engineering. WA can not only process a signal in the time and frequency domains simultaneously but also retain the useful information in the signal effectively. Suppose that $f(t)$ is the original signal and $s(t)$ is the noisy signal; then the general denoising model is

$$s(t) = f(t) + \sigma e(t), \qquad (1)$$

where $e(t)$ is noise and $\sigma$ is the noise intensity. The goal of wavelet denoising is to recover $f(t)$. Next, let $f(t)$ and $\psi(t)$ be square integrable functions, with $\psi(t)$ satisfying $\int_{-\infty}^{+\infty} \psi(t)\,dt = 0$; the continuous wavelet transform is then defined as

$$W_f(a, b) = \frac{1}{\sqrt{|a|}} \int_{-\infty}^{+\infty} f(t)\, \overline{\psi\!\left(\frac{t - b}{a}\right)}\, dt, \qquad (2)$$

where $\psi$ is the basic wavelet (mother wavelet), $a \neq 0$ is the scale parameter, and $b$ is the translation parameter. Wavelet noise reduction consists of three steps. Firstly, the original signal is decomposed by the wavelet transform. Secondly, denoising rules are set up to separate the noise from the useful signal in the wavelet domain. Finally, wavelet reconstruction is carried out to obtain the purified signal.
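The three steps of wavelet threshold denoising (decomposition, thresholding, reconstruction) can be sketched in plain Python. This is a minimal illustration and not the configuration used in the paper: it assumes the Haar wavelet and the universal threshold with a median-based noise estimate, chosen only because they fit in a few lines, and all function names are hypothetical.

```python
import math
import random
import statistics


def haar_dwt(signal):
    """One level of the orthonormal Haar transform: (approximation, detail)."""
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


def haar_idwt(approx, detail):
    """Inverse of haar_dwt."""
    s = math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.extend([(a + d) / s, (a - d) / s])
    return out


def soft_threshold(coeffs, thr):
    """Shrink every coefficient toward zero by thr (soft threshold function)."""
    return [math.copysign(max(abs(c) - thr, 0.0), c) for c in coeffs]


def denoise(signal, levels=3):
    """Decompose, threshold the detail coefficients, and reconstruct."""
    if len(signal) % (2 ** levels) != 0:
        raise ValueError("signal length must be a multiple of 2**levels")
    # Step 1: wavelet decomposition.
    approx, details = list(signal), []
    for _ in range(levels):
        approx, d = haar_dwt(approx)
        details.append(d)
    # Step 2: separate noise from signal in the wavelet domain.
    # Assumption: noise level from the finest details, universal threshold.
    sigma = statistics.median([abs(c) for c in details[0]]) / 0.6745
    thr = sigma * math.sqrt(2.0 * math.log(len(signal)))
    details = [soft_threshold(d, thr) for d in details]
    # Step 3: wavelet reconstruction.
    for d in reversed(details):
        approx = haar_idwt(approx, d)
    return approx


# Synthetic demonstration signal (not measured data).
random.seed(1)
clean = [6.0 + math.sin(2 * math.pi * i / 128) for i in range(256)]
noisy = [c + random.gauss(0.0, 0.3) for c in clean]
smoothed = denoise(noisy)
```

A library such as PyWavelets would normally replace the hand-written transform and give access to other wavelet families; the hand-written version is used here only to keep the sketch self-contained.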

Back Propagation Artificial Neural Network.
The back propagation artificial neural network (BPANN), which was presented by Rumelhart in 1986, is one of the most widely used neural network models at present. The characteristics of the neurons and the topological structure of the network are the key elements of the BP network. The neuron structure of the BP network is shown in Figure 3. In Figure 3, $x$ is the input vector, $n$ is the number of input neurons, $w$ is the weight vector, $b$ is the threshold, $f$ is the transfer function, and $y$ is the neuron output. The output can be expressed as $y = f(\mathrm{net})$, where $\mathrm{net} = \sum_{i=1}^{n} w_i x_i + b$. In the implementation of BPANN, the signals are input at the input layer, processed in the hidden layer, and output from the output layer. If the output does not meet the expected results, the error is propagated backward and the network weights and threshold values are adjusted repeatedly to minimize the error of the network. The classical BP neural network topology is shown in Figure 4.
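The neuron output and the forward pass described above can be sketched as follows. This is an illustration under assumptions: the logistic function is assumed as the transfer function in the hidden layer, the output neuron is assumed linear, and all names and numbers are hypothetical.

```python
import math


def sigmoid(z):
    """Logistic transfer function (an assumed choice of f)."""
    return 1.0 / (1.0 + math.exp(-z))


def neuron_output(x, w, b, f=sigmoid):
    """Single neuron: y = f(net), with net = sum_i w_i * x_i + b."""
    net = sum(wi * xi for wi, xi in zip(w, x)) + b
    return f(net)


def forward(x, hidden_w, hidden_b, out_w, out_b):
    """Forward pass: inputs -> one sigmoid hidden layer -> linear output neuron."""
    hidden = [neuron_output(x, w, b) for w, b in zip(hidden_w, hidden_b)]
    return neuron_output(hidden, out_w, out_b, f=lambda z: z)


# Hypothetical numbers, for illustration only.
x = [0.2, 0.7]
hidden_w = [[0.5, -0.3], [0.1, 0.8]]
hidden_b = [0.0, -0.1]
print(forward(x, hidden_w, hidden_b, out_w=[1.0, -1.0], out_b=0.5))
```

Training then consists of comparing this output with the expected value and adjusting the weights and thresholds along the error gradient, which is omitted here.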
In Figure 4, $X_1, X_2, \ldots, X_n$ are input values, $w_{ij}$ and $w_{jk}$ are network weights, and $Y_1, \ldots, Y_m$ are output values. The BP neural network can be regarded as a nonlinear function whose input values and output values are the independent variables and dependent variables, respectively. When building a BP neural network, there is no mature reference method for choosing an appropriate network structure and number of neurons. Therefore, empirical formulas are used to choose the number of hidden layer nodes:

$$l < n - 1, \qquad l < \sqrt{m + n} + a, \qquad l = \log_2 n, \qquad (3)$$

where $l$ is the number of nodes in the hidden layer, $m$ is the number of nodes in the output layer, $n$ is the number of nodes in the input layer, and $a$ is a constant between 0 and 10.
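As a rough illustration of how such empirical rules narrow down the hidden layer size, the sketch below enumerates the integer candidates. It assumes the commonly quoted forms of the rules ($l < n - 1$, $l < \sqrt{m + n} + a$, and $l = \log_2 n$ treated as a lower bound), and the choice of 9 inputs (the eight parameters plus dissolved oxygen) and 1 output is likewise an assumption, not a setting confirmed by the text.

```python
import math


def hidden_node_candidates(n_inputs, n_outputs, a_max=10):
    """Integer hidden-layer sizes allowed by the assumed empirical rules."""
    lower = max(1, math.floor(math.log2(n_inputs)))       # l = log2(n), rounded down
    upper = min(n_inputs - 1,                             # l < n - 1
                math.sqrt(n_inputs + n_outputs) + a_max)  # l < sqrt(m + n) + a
    return list(range(lower, math.ceil(upper)))


# Assumed network size: 9 inputs, 1 output.
print(hidden_node_candidates(9, 1))  # [3, 4, 5, 6, 7]
```

Under these assumptions the candidates run from 3 to 7, which agrees with the range of hidden layer sizes compared in the experiments.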
After the network structure is confirmed, the choice of weights and threshold values directly determines the prediction precision of the network. Two common and effective methods, the genetic algorithm (GA) and PSO, can be used to solve the global optimization problem of the weights and threshold values. This paper uses PSO, which is introduced below, to improve the prediction precision.

Back Propagation Artificial Neural Network Optimized by Particle Swarm Optimization.
Particle swarm optimization simulates the foraging behavior of a bird flock. Each bird gradually moves toward the individuals around it that may be closest to the food. PSO first initializes a group of random particles and then finds the optimal solution through iteration. In each iteration, the particles update themselves by tracking two extreme values: the individual optimal value, named $p_{\mathrm{best}}$, and the group optimal value, named $g_{\mathrm{best}}$. After finding the two optimal values, the particles update their velocity and position according to the following formulas:

$$v_{id}^{k+1} = \omega v_{id}^{k} + c_1 r_1 \left(p_{\mathrm{best},id}^{k} - x_{id}^{k}\right) + c_2 r_2 \left(g_{\mathrm{best},d}^{k} - x_{id}^{k}\right),$$

$$x_{id}^{k+1} = x_{id}^{k} + r\, v_{id}^{k+1},$$

where $\omega$ is the inertia factor, which is not less than 0; $r_1$ and $r_2$ are random numbers obeying the uniform distribution between 0 and 1; $c_1$ and $c_2$ are learning factors, usually both assigned as 2; $r$ is a restraint factor, usually assigned as 1; $x$ is the current position; and $v$ is the velocity of the particle. The indices $i$, $d$, and $k$ denote the particle, the dimension, and the iteration, respectively.

The BP neural network initializes the weights and thresholds randomly between 0 and 1 before training starts. This random initialization without optimization often slows the convergence of the BP neural network or even prevents the final result from being the optimal solution. Therefore, PSO is needed to optimize the weights and thresholds to improve the prediction accuracy [18]. The implementation process of BPANN optimized by the PSO algorithm can be described as follows.
Step 1. Build topological structure of BP neural network according to the number of input parameters and output parameters.
Step 2. Randomly initialize a group of particles and set parameters such as the population size, the number of iterations, the learning factors, and the initial positions and velocities.
Step 3. Calculate the fitness value of each particle at its current position and determine the individual extremum and the group extremum.
Step 4. Update each particle's position and speed according to the features of individual extremum and group extremum.
Step 5. If the prediction accuracy or the number of iterations meets the requirements, the algorithm stops. Otherwise, it returns to Step 3 and continues until the requirements are met. Finally, the optimal particle of the group is the model solution.
Step 6. Assign weights and thresholds according to the model solution and train the neural network.
To make the optimization process easier to follow, it is shown as a flow chart in Figure 5.
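The PSO stage of the procedure above (everything up to handing the best particle to the network for training) can be sketched in plain Python. This is a minimal sketch under assumptions, not the authors' Matlab implementation: the fitness is taken to be the mean squared error on the training data, the inertia factor 0.7 is an assumed value, the restraint factor is fixed at 1, the subsequent BP training is omitted, and all function names are hypothetical.

```python
import math
import random


def unpack(p, n_in, n_hid):
    """Split a flat particle position into weights and thresholds."""
    i = n_in * n_hid
    w1 = [p[h * n_in:(h + 1) * n_in] for h in range(n_hid)]
    return w1, p[i:i + n_hid], p[i + n_hid:i + 2 * n_hid], p[i + 2 * n_hid]


def predict(p, x, n_in, n_hid):
    """One sigmoid hidden layer followed by a linear output neuron."""
    w1, b1, w2, b2 = unpack(p, n_in, n_hid)
    hidden = []
    for row, b in zip(w1, b1):
        net = sum(w * xi for w, xi in zip(row, x)) + b
        net = max(-60.0, min(60.0, net))  # clamp to avoid overflow in exp
        hidden.append(1.0 / (1.0 + math.exp(-net)))
    return sum(w * h for w, h in zip(w2, hidden)) + b2


def fitness(p, data, n_in, n_hid):
    """Mean squared error over (inputs, target) pairs; lower is better."""
    return sum((predict(p, x, n_in, n_hid) - y) ** 2 for x, y in data) / len(data)


def pso(data, n_in, n_hid, n_particles=10, n_iter=50,
        omega=0.7, c1=1.49445, c2=1.49445, seed=0):
    rng = random.Random(seed)
    dim = n_in * n_hid + 2 * n_hid + 1
    # Step 2: random positions in [0, 1], zero initial velocities.
    pos = [[rng.uniform(0.0, 1.0) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    # Step 3: initial individual and group extrema.
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(p, data, n_in, n_hid) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]
    history = [gbest_fit]
    for _ in range(n_iter):
        for i in range(n_particles):
            # Step 4: velocity and position update.
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (omega * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            f = fitness(pos[i], data, n_in, n_hid)
            if f < pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], f
                if f < gbest_fit:
                    gbest, gbest_fit = pos[i][:], f
        history.append(gbest_fit)  # Step 5: here the stop rule is a fixed iteration count
    return gbest, gbest_fit, history


# Toy data (hypothetical): learn y = x1 + x2 on four points.
toy = [([0.0, 0.0], 0.0), ([0.0, 1.0], 1.0), ([1.0, 0.0], 1.0), ([1.0, 1.0], 2.0)]
best, best_fit, history = pso(toy, n_in=2, n_hid=3)
```

The returned vector `best` would then be unpacked into the initial weights and thresholds of the network before ordinary BP training (Step 6). The group best is updated as soon as any particle improves, which is one common variant; updating it once per iteration is equally valid.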

Kriging Method.
The Kriging method was proposed by the South African mining engineer D. G. Krige in the 1950s. The Kriging method first analyses how spatial attributes affect the measured sample values and determines how the measured sample values affect the points to be interpolated around them. Then, the attributes of the points to be interpolated are estimated according to these rules. The Kriging method is an unbiased estimation method whose precision is superior to the response surface method and the inverse distance weighting method [19]. The prediction formula used by the Kriging method is as follows:

$$\hat{Z}(s_0) = \sum_{i=1}^{N} \lambda_i Z(s_i),$$

where $Z(s_i)$ is the measured value at position $i$, $\lambda_i$ is the weight at position $i$, $\hat{Z}(s_0)$ is the predicted value at the prediction position $s_0$, and $N$ is the number of measured samples. The weights depend on the distances between the measured points and the prediction position and on the overall spatial arrangement of the measured points.
The variation function (variogram) is used to describe spatial correlation, and building the variation function is the first step of the Kriging method. The core idea of the variation function is that all point pairs are grouped according to their separation distance and the differences between the variable values are calculated within each group. The average of these differences is then taken as the variability of that distance group. In this way, a variety of distance relationships are considered when the attribute of an unknown point is calculated from the sample points. Commonly used variation functions include the spherical function, the exponential function, and the Gaussian function. This paper chooses the spherical function as the variation function because its behavior is similar to the way water quality changes.
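Kriging prediction with a spherical variation function can be sketched as follows. This is an illustrative sketch of ordinary Kriging under assumptions: the nugget, sill, and range values are hypothetical placeholders (in practice they are fitted to the experimental variation function), the sample coordinates and values are invented, and the function names are not from the paper.

```python
import math


def spherical(h, nugget, sill, a_range):
    """Spherical semivariogram; zero at h = 0 and equal to the sill beyond the range."""
    if h == 0.0:
        return 0.0
    if h >= a_range:
        return sill
    r = h / a_range
    return nugget + (sill - nugget) * (1.5 * r - 0.5 * r ** 3)


def distance(p, q):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def solve(A, b):
    """Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x


def ordinary_kriging(points, values, target, nugget=0.0, sill=1.0, a_range=100.0):
    """Return (prediction, weights) at target from measured points and values."""
    n = len(points)

    def gamma(p, q):
        return spherical(distance(p, q), nugget, sill, a_range)

    # Kriging system: semivariances between samples, plus the unbiasedness
    # constraint that the weights sum to one (last row and column).
    A = [[gamma(points[i], points[j]) for j in range(n)] + [1.0] for i in range(n)]
    A.append([1.0] * n + [0.0])
    b = [gamma(p, target) for p in points] + [1.0]
    weights = solve(A, b)[:n]
    return sum(w * v for w, v in zip(weights, values)), weights


# Hypothetical sample points (x, y, depth in metres) and DO values in mg/L.
pts = [(0.0, 0.0, 0.3), (30.0, 0.0, 0.3), (0.0, 20.0, 0.9), (30.0, 20.0, 0.9)]
vals = [6.1, 6.4, 5.8, 6.0]
estimate, weights = ordinary_kriging(pts, vals, (15.0, 10.0, 0.6))
```

With a zero nugget the predictor reproduces the measured value at a sample location, and the weights always sum to one, which is what makes the estimate unbiased.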

The Description of the Overall Prediction Model.
Based on the methods introduced above, the flow chart of the overall prediction model is shown in Figure 6. Firstly, the 8 parameters of one collection point are selected, and interpolation and wavelet noise reduction are used for the recovery and denoising of the dissolved oxygen data, respectively. Afterwards, the processed dissolved oxygen data and the other raw data are used as input to the one-dimensional BP artificial neural network prediction model optimized by PSO. Because this study uses twelve test points, the one-dimensional prediction model is run twelve times with different data. Finally, the predicted dissolved oxygen values of the twelve test points are put into the Kriging interpolation model to calculate the half variation function (semivariogram) and complete the three-dimensional prediction.

Results and Discussion
3.1. Data Preprocessing.
According to the factors influencing DO content in the pond, eight widely available parameters (rainfall, wind speed, wind direction, solar radiation, air temperature, air humidity, atmospheric pressure, and water temperature) are considered as input elements to predict the content of DO. However, the raw data collected by sensors may contain missing or abnormal values because of equipment failure or network failure. Such data increase data processing costs and response time and may even affect the validity of the final prediction model. In order to complete the data restoration, this paper uses the linear interpolation method for missing data and the mean smoothing method for abnormal data. In their standard forms, the two formulas are

$$x_{k+i} = x_k + \frac{i\,(x_{k+j} - x_k)}{j}, \quad 0 < i < j,$$

$$x_k = \frac{x_{k-1} + x_{k+1}}{2},$$

where $x_k$ and $x_{k+j}$ in the first formula are the known values on either side of a run of missing samples. The second formula is used when $x_k$ is identified as an abnormal value.

In addition, this paper utilizes the wavelet analysis method for noise reduction. The selections of the basis function, the threshold, and the threshold function all have an obvious influence on the noise reduction effect. Nonetheless, there is no optimal parameter selection method up to now. In this paper, an orthogonal experiment is carried out on the factors mentioned above, using the signal-to-noise ratio (SNR) and root mean square error (RMSE) as the evaluation standards of the noise reduction effect. The candidate basis functions are the Haar, Symlets, Coiflet, and Meyer wavelets. The candidate thresholds are the universal threshold, the SureShrink threshold, the Minimax threshold, and the heuristic threshold (Heursure). The candidate threshold functions are the hard threshold and the soft threshold. The results show that the sym4 basis function, 3-layer decomposition, the heuristic threshold, and the soft threshold function give the best effect with low complexity. The noise reduction result is shown in Figure 7, which presents the original dissolved oxygen data set at point 2 and the data after denoising. The final experimental results show that this model can meet the need for noise reduction of dissolved oxygen data in the aquaculture environment.

3.2. Dissolved Oxygen Prediction.
In order to show the advantages of PSO for optimizing the ANN, this study chooses the genetic algorithm (GA) for comparison. The prediction performance evaluation parameters adopted by this research are the mean absolute error (MAE), mean absolute percentage error (MAPE), and root mean square error (RMSE). The evaluation parameters are calculated as follows:

$$\mathrm{MAE} = \frac{1}{N} \sum_{i=1}^{N} \left| y_i - \hat{y}_i \right|,$$

$$\mathrm{MAPE} = \frac{1}{N} \sum_{i=1}^{N} \left| \frac{y_i - \hat{y}_i}{y_i} \right|,$$

$$\mathrm{RMSE} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} \left( y_i - \hat{y}_i \right)^2},$$

where $y_i$ is the true value, $\hat{y}_i$ is the predicted value, and $N$ is the number of test samples.

Prediction of dissolved oxygen requires the following steps. First of all, appropriate parameter settings are necessary. In the genetic algorithm, the number of iterations is 50, the population size is 10, the crossover probability is 0.2, and the mutation probability is 0.1. In the particle swarm algorithm, the number of iterations is 50, the population size is 10, the individual learning factor is 1.49445, the social learning factor is 1.49445, and the number of BPANN iterations is 200. According to the empirical formulas (3), the number of neurons in the hidden layer of the BP neural network is between 3 and 7. Then, the BP neural network is optimized by GA and PSO, respectively, and the results for RMSE, MAE, and MAPE are shown in Tables 2, 3, and 4. It should be noted that the results are the average of 5 experiments because of the randomness of the heuristic algorithms and the BP neural network. Tables 2, 3, and 4 show that BPANN optimized by either GA or PSO improves on RMSE, MAE, and MAPE, but the performance of the network with 5 hidden nodes optimized by PSO is particularly prominent. Therefore, in this paper, the BP neural network with 5 hidden nodes optimized by PSO is used as the prediction model of DO. The model is then run on Matlab 2010b using the data set of 8 input parameters and dissolved oxygen after preprocessing.

In addition to the comparison of prediction accuracy, the choice of the prediction step is also very important for the model performance. If the step length is too long, the predicted results are not accurate, while if the step length is too short, many unnecessary repeated operations may result. Therefore, the appropriate step length needs to be determined according to the fitting effect of the prediction curve. The prediction curve is shown in Figure 8.
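The three evaluation measures named in this section (MAE, MAPE, and RMSE) can be computed as in the following sketch. It assumes that MAPE is reported as a fraction rather than a percentage, and the sample values are hypothetical.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mape(y_true, y_pred):
    """Mean absolute percentage error as a fraction; true values must be nonzero."""
    return sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean square error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


# Hypothetical DO values in mg/L, for illustration only.
y_true = [4.0, 5.0, 8.0]
y_pred = [5.0, 5.0, 6.0]
print(mae(y_true, y_pred), mape(y_true, y_pred), rmse(y_true, y_pred))
```

For any set of errors, RMSE is never smaller than MAE, which gives a quick sanity check when the two are reported together.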
In Figure 8, the red line represents the actual measured value of DO, and the blue line and green line represent the dissolved oxygen content predicted by the BPANN model and by the BPANN model optimized by PSO, respectively. It can be clearly observed from Figure 8 that the PSO-BPANN model has high fitting precision for the first 20 sample points (that is, within 200 minutes), while the later predicted values are not ideal. Therefore, this is a short-term prediction model that is best suited to predicting data within 200 minutes.
This study selected a total of 12 experimental points which are shown in Figures 1 and 2. The prediction curve in Figure 8 is the result using the DO content and water temperature data in point 2. In order to realize the three-dimensional

Conclusions
In order to solve the problems of traditional prediction methods, such as low spatial accuracy and poor generalization capability, this paper presents a prediction model based on PSO-BPANN coupled with the Kriging method. Moreover, this study realizes a three-dimensional prediction model of DO in aquaculture ponds according to the actual requirements of aquaculture water quality monitoring and prediction. According to the results, the model has high prediction accuracy. The running time of the prediction method in this study is 1.028 s. Therefore, this model can meet the accuracy and time requirements for the real-time prediction of DO. The proposed method of dissolved oxygen content prediction is useful for comprehensive intelligent control, helps to mitigate water quality deterioration effectively, and saves production cost. The results show that this method can ultimately serve the purpose of safe and healthy cultivation.
Further work is still needed, since this study is restricted by many factors such as personal abilities and experimental conditions. This paper records the aspects in need of improvement as follows to provide some references for subsequent research. Firstly, this study did not observe the whole crab breeding cycle because of time limitations. For the same reason, the method was not extended to more breeding objects and breeding regions. Secondly, some other factors affecting DO, such as carbon, nitrogen, and ammonia nitrogen, are not considered in this model because of the lack of corresponding sensors and supporting empirical data. If more data on other factors can be obtained, the precision and generalization ability of the model will be further improved. Thirdly, aquaculture water is influenced by human factors and natural factors simultaneously. This research mainly considers the physical and chemical factors under natural conditions. If the switching times of the aerators and the times of water changes can be quantified and used as input variables, the prediction results would be more accurate. Fourthly, the 12 sample points used in this study to build the prediction model are the lower limit of the number of samples for a 3D model. If conditions permit, more sampling points should be chosen. Such future research may effectively improve the efficiency of aquaculture and is of important significance.

Figure 1: Top view of collection points in aquaculture pond.

Figure 5: Back propagation artificial neural network optimized by particle swarm optimization.

Figure 6: Flow chart of the overall prediction model.
The numerical experiments of this prediction model are implemented on Matlab 2010b.

Figure 7: The original data set and denoising data set of dissolved oxygen in point 2.

Table 2: The RMSE of the back propagation artificial neural network for different numbers of hidden layer neurons and different optimization methods.