Dynamic Heat Supply Prediction Using Support Vector Regression Optimized by Particle Swarm Optimization Algorithm

We developed an effective intelligent model to predict the dynamic heat supply of a heat source. A hybrid forecasting method was proposed based on a support vector regression (SVR) model optimized by particle swarm optimization (PSO) algorithms. Due to the interaction of meteorological conditions and the heating parameters of the heating system, it is extremely difficult to forecast dynamic heat supply. Firstly, the correlations among heat supply and related influencing factors in the heating system were analyzed through the correlation analysis of statistical theory. Then, the SVR model was employed to forecast dynamic heat supply. In the model, the input variables were selected based on the correlation analysis, and three crucial parameters, namely, the penalty factor, the gamma of the RBF kernel, and the insensitive loss function, were optimized by PSO algorithms. The optimized SVR model was compared with the basic SVR, SVR optimized by a genetic algorithm (GA-SVR), and an artificial neural network (ANN) through six groups of experiment data from two heat sources. The results of the correlation coefficient analysis revealed the relationship between the influencing factors and the forecasted heat supply and determined the input variables. The performance of the PSO-SVR model is superior to those of the other three models. The PSO-SVR method is statistically robust and can be applied to practical heating systems.


Introduction
With the recent development of intelligent heating systems in China, every component of the heating system needs to become intelligent. In order to ensure heat supply to the users while conserving energy, the users' heat supply demand should be accurately calculated. However, dynamic heat supply prediction remains a longstanding challenge because of the nonlinear problems of huge pipe network systems serving a growing number of heat users, such as heat decline in the conveying process, large time delay, and huge thermal inertia of the heating system itself. The accuracy of the prediction model directly affects users' comfort and the economical operation of the heating system. Therefore, it is necessary to accurately predict the dynamic heat supply of the whole heating system.
Although tremendous progress has been made, large prediction deviations for dynamic heat supply still exist and lead to high energy consumption of the whole heating system and thermal comfort problems for users. Most existing prediction methods are linear models, in which it is assumed that the heat supply of the system varies only with the outdoor temperature, without considering other influencing factors. These methods are relatively accurate only when they are used to describe the dynamic thermal load of the building envelope. However, the dynamic heat supply of the whole heating system includes the heat consumption of the exterior protected construction and the heat loss from the pipeline and different devices to the surrounding environment. The prediction of the dynamic heat supply is therefore a complicated nonlinear problem. When the predicted heat value is too high or too low, it causes resource waste or insufficient heat supply to users, respectively. Factors that reflect the internal performance of the heating system, including the supply and return water temperature, the supply and return water pressure, and the flow, should be considered together with outdoor meteorological parameters. For a huge and complex heating system, it is almost impossible to obtain an accurate mathematical model. Therefore, it is necessary to develop a more precise method to predict the dynamic heat supply of the heating system. Support vector regression (SVR) is generally used to solve nonlinear small-sample problems. SVR-based models are widely used in power systems [1,2], where they significantly enhance prediction accuracy.
The SVR prediction model combined with particle swarm optimization (PSO) algorithms was used to predict the heat supply of the heat source for the heating system. Combined with the analysis of influencing factors, the method can achieve reasonable heat supply prediction, thus guaranteeing the normal operation of the heat supply system and energy conservation.
The paper is structured as follows. Some related works are outlined in Section 2. In Section 3, the SVR model and its optimization algorithm are introduced and the flowchart of the proposed method is designed. In Section 4, the influencing factors of heat supply are analyzed, and the experiment and comparisons are performed to validate the proposed approach. Our conclusions are summarized in Section 5.

System Performance and Influencing Factors of Dynamic Heat Supply.
Before predicting the dynamic heat supply of the heating system, it is necessary to explore the system performance and influencing factors. In general, the heat load is calculated as a steady-state value, which is not consistent with the actual value. Westphal and Lamberts presented a transfer function method to analyze the dynamic heat load of nonresidential buildings based on simplified meteorological data [3]. A model and corresponding computer code, which are based on the accurate high-order numerical solution of the transient energy equation and the hydraulic prediction of pressure and fluid flow rates within the complex pipe network, were developed to simulate the thermal transients in local heating systems [4]. In a parameter estimation procedure [3], the hourly space heating and cooling loads were restored from the monthly energy consumption. The procedure was based on a nonlinear multivariate regression approach. In the literature [5], an assessment method for describing daily heat load variations was described in view of two basic conditions. Firstly, it was independent of the system size. Secondly, external parameters were not analyzed. A simplified building procedure spreadsheet was presented to evaluate energy demand in an early design stage of dwellings [6]; the spreadsheet required only a few input data to describe the building design within a short time. The two influencing factors of prediction loads in district-heating systems, the outdoor temperature and the social behaviors of consumers, were analyzed [7]. Solar radiation and the heat demand increase caused by a higher wind speed were taken into consideration in the prediction of heat supply for a single building [8]. In seasonal heating load calculations [9], it was important to carry out climatologic investigations and develop empirical equations for determining the total duration of daytime temperatures and solar radiation intensities.
The above studies were based on climate parameters and the characteristics of the heating system and mainly focused on the heat load of the building envelope. However, only external factors of heat load were considered. With the expansion of city construction and the increasing number of customers in China, dynamic heat supply prediction becomes more complicated because of the nonlinear problem of the central heating system. Dynamic heat supply is affected not only by external factors but also by internal factors.

Prediction Methods of Dynamic Heating Load.
Nowadays, various predicting techniques have been developed. Due to the complex nonlinear property of the central heating system, dynamic heating load prediction needs to be improved through nonlinear models. The existing prediction methods include the autoregressive integrated moving average (ARIMA) model [10,11], the linear regression (LR) technique [12], the artificial neural network (ANN) model [13,14], the wavelet neural network (WNN) [15], and the gray model [16]. These prediction models did not perform well enough because each model considered only a few factors or a single relevant factor. These models also had some defects, such as the difficulty of optimizing the weights and thresholds of the neural network. In addition, these models had relatively poor robustness. Recently, with the development of artificial intelligence, some coupled algorithms have advanced considerably [17][18][19]. Their accuracy and generalization have improved over those of previous traditional methods.
Although many methods for dynamic heat supply prediction have been presented, their application is limited because of the nonlinear characteristics of heating systems, such as large thermal inertia, attenuation, and large time delay. It is difficult to accurately forecast the dynamic heat supply of a central heating system with precise mathematical models and traditional linear methods. The core of SVR in Vapnik [20,21] is to describe the relationship between overfitting and generalization ability and to control the capacity of machine learning by introducing structural risk minimization. Compared with traditional methods, the method requires fewer samples, avoids the curse of dimensionality and the local minimum problem, and shows good adaptability to nonlinear problems. In addition, a significant characteristic of SVR is that it transforms a nonlinear problem into a linear problem in a higher-dimensional feature space with kernel functions. SVR is therefore applicable to predicting the nonlinear dynamic heat supply of the heating system, and the presented method may be a feasible solution for dynamic load prediction of heating systems.
In order to improve the prediction accuracy, a fusion algorithm for heat supply based on SVR and PSO was proposed and some actual data were provided to verify the proposed method.

Support Vector Regression (SVR).
The notions of support vector machine (SVM) for the case of regression are introduced briefly. Given a set of data $\{(x_i, y_i),\ i = 1, 2, \ldots, N\}$, where $x_i \in R^n$, $y_i \in R$, and $N$ is the total number of data patterns, a nonlinear mapping function, $\varphi: R^n \to R^{n_h}$, is defined to map the training (input) data $x_i$ into the so-called high dimensional feature space (which may have infinite dimensions), $R^{n_h}$. Then in the high dimensional feature space, there theoretically exists a linear function, $f$, which can be used to formulate the nonlinear relationship between input data and output data. The linear function, namely, the SVR function, is expressed as
$$f(x) = w^{T}\varphi(x) + b, \tag{1}$$
where $f(x)$ denotes the prediction values; $\varphi(x)$, which maps the input vectors $x$ into a high dimensional feature space, is a nonlinear mapping; the coefficients $w$ ($w \in R^{n_h}$) and $b$ ($b \in R$) are adjustable and can be estimated by minimizing the following regularized risk function:
$$R(f) = C\sum_{i=1}^{N} L_{\varepsilon}\big(y_i, f(x_i)\big) + \frac{1}{2}\|w\|^{2}, \tag{2}$$
$$L_{\varepsilon}\big(y_i, f(x_i)\big) = \begin{cases} 0, & |y_i - f(x_i)| \le \varepsilon, \\ |y_i - f(x_i)| - \varepsilon, & \text{otherwise}, \end{cases} \tag{3}$$
where $C$ and $\varepsilon$ are prescribed parameters. In (2), $L_{\varepsilon}(y_i, f(x_i))$ is called the $\varepsilon$-insensitive loss function. The loss equals zero if the forecasted value is within the $\varepsilon$-tube (3). The second term, $(1/2)\|w\|^{2}$, indicates the flatness of the function. Therefore, $C$ is considered to specify the trade-off between the empirical risk and the model flatness. The parameters $C$ and $\varepsilon$ are user-determined. Two positive slack variables $\xi_i$ and $\xi_i^{*}$ denote the distance from the actual values to the corresponding boundary values of the $\varepsilon$-tube. Therefore, (2) can be transformed into the quadratic optimization problem with inequality constraints:
$$\min_{w, b, \xi, \xi^{*}} \ \frac{1}{2}\|w\|^{2} + C\sum_{i=1}^{N}\big(\xi_i + \xi_i^{*}\big) \tag{4}$$
with the constraints
$$y_i - w^{T}\varphi(x_i) - b \le \varepsilon + \xi_i, \qquad w^{T}\varphi(x_i) + b - y_i \le \varepsilon + \xi_i^{*}, \qquad \xi_i, \xi_i^{*} \ge 0, \quad i = 1, 2, \ldots, N.$$
The optimal weight of the regression hyperplane is calculated as
$$w^{*} = \sum_{i=1}^{N}\big(\alpha_i - \alpha_i^{*}\big)\varphi(x_i), \tag{5}$$
where $\alpha_i$, $\alpha_i^{*}$ are the Lagrangian multipliers obtained by solving a quadratic program.
So the SVR regression function is obtained as (6) according to the duality theory,
$$f(x) = \sum_{i=1}^{N}\big(\alpha_i - \alpha_i^{*}\big)K(x_i, x) + b, \tag{6}$$
where $K(x_i, x_j)$ is defined as the kernel function and the value of the kernel equals the inner product of two vectors, $x_i$ and $x_j$, in the feature space, $\varphi(x_i)$ and $\varphi(x_j)$, respectively. That is to say, $K(x_i, x_j) = \varphi(x_i) \cdot \varphi(x_j)$. Any function which meets Mercer's condition [22] can be used as the kernel function.
There are several kernel functions, such as the polynomial kernel with an order of $d$ and the constants $a_1$ and $a_2$, $K(x_i, x_j) = (a_1 x_i^{T} x_j + a_2)^{d}$, and the Gaussian radial basis function (RBF) with a width of $\sigma$, $K(x_i, x_j) = \exp(-0.5\|x_i - x_j\|^{2}/\sigma^{2})$. However, it is difficult to determine which kind of kernel function is suitable for some specific data patterns [23]. The Gaussian RBF kernel can be easily applied and maps the original domain into the higher-dimensional space in a nonlinear way. Therefore, it is suitable for dealing with nonlinear relationship problems. The Gaussian RBF kernel is adopted in this study.
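As a minimal sketch of the pieces described above, the following Python functions evaluate the Gaussian RBF kernel, the insensitive loss, and the dual-form regression function. The function names, the default width, and the `dual_coefs` argument (standing for the differences of the Lagrangian multipliers, which a quadratic programming solver would supply) are illustrative assumptions, not part of the original study, which was implemented in MATLAB.

```python
import math

def rbf_kernel(x, z, sigma=1.0):
    """Gaussian RBF kernel exp(-0.5 * ||x - z||^2 / sigma^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-0.5 * sq_dist / sigma ** 2)

def eps_insensitive_loss(y, f, eps):
    """Zero inside the eps-tube, linear in the excess deviation outside it."""
    return max(0.0, abs(y - f) - eps)

def svr_predict(x, support_vectors, dual_coefs, b, sigma=1.0):
    """Dual-form prediction: sum_i coef_i * K(x_i, x) + b.

    dual_coefs[i] is assumed to hold the multiplier difference for x_i.
    """
    return sum(c * rbf_kernel(sv, x, sigma)
               for sv, c in zip(support_vectors, dual_coefs)) + b
```

The kernel of a point with itself is 1, so a single support vector with coefficient 2 and offset 0.5 predicts 2.5 at its own location.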
The choice of three parameters of the SVR model, namely, the penalty parameter ($C$), the insensitive loss function ($\varepsilon$), and the kernel parameter ($\sigma$), largely affects the forecasting accuracy. Thus, it is necessary to develop an efficient approach to determine the three optimal parameters simultaneously. Particle swarm optimization (PSO), which is a good method for optimizing SVR model parameters, was used to predict the cyanotoxin content [24]. Therefore, in this study, we adopted this optimization model to predict the dynamic heat supply of the central heating system.

Particle Swarm Optimization (PSO) Algorithm.
Particle swarm optimization (PSO) is an intelligent optimization algorithm [25,26]. In PSO, a population of particles, each representing a candidate solution, gradually evolves towards the optimal solution of the problem after each iteration. Genetic Algorithms [27,28], Differential Evolution [29], and Ant Colony System [30] are also intelligent optimization algorithms, in which the evolution process of the population involves random factors. In the PSO algorithm, a new population is obtained by shifting the positions of the previous one at each iteration. In its movement, each individual is influenced by its neighbors and its own trajectory.
Supposing a $D$-dimensional search space, one population is composed of $m$ particles. The state information of particle $i$ can be expressed through two $D$-dimensional vectors: $x_i = (x_{i1}, x_{i2}, \ldots, x_{iD})$ expresses the location information of particle $i$, and $v_i = (v_{i1}, v_{i2}, \ldots, v_{iD})$ expresses the velocity information, which decides the flying direction and distance of a particle.
In the simulation of each particle, two factors should be considered. One is the personal best position, expressed as $p_i = (p_{i1}, p_{i2}, \ldots, p_{iD})$; the other one is the global optimum value, expressed as $p_g = (p_{g1}, p_{g2}, \ldots, p_{gD})$. The algorithm updates the positions and the velocities of the particles following [31]
$$v_{id}^{(t+1)} = \omega v_{id}^{(t)} + c_1 r_1^{(t)}\big(p_{id}^{(t)} - x_{id}^{(t)}\big) + c_2 r_2^{(t)}\big(p_{gd}^{(t)} - x_{id}^{(t)}\big), \tag{7}$$
$$x_{id}^{(t+1)} = x_{id}^{(t)} + \eta\, v_{id}^{(t+1)}. \tag{8}$$
The velocity of each particle $i$ at iteration $t$ depends on three components:
(i) the previous step velocity term, $v_{id}^{(t)}$, affected by the inertia weight, $\omega$;
(ii) the cognitive learning term, which is the difference between the best position found so far by the particle (called $p_{id}^{(t)}$, local best) and the current particle position $x_{id}^{(t)}$;
(iii) the social learning term, which is the difference between the global best position found in the entire swarm (called $p_{gd}^{(t)}$, global best) and the current particle position $x_{id}^{(t)}$.
Here $r_1^{(t)}$ and $r_2^{(t)}$ are random numbers distributed uniformly in the interval [0, 1]; $c_1$ indicates the self-knowledge of the particle and is called the weight coefficient tracking the historical best value of the particle itself; $c_2$ indicates the knowledge of the whole group and is called the weight coefficient tracking the best value of the group. $\eta$ is called the constriction factor and is used to restrain the speed of the position update, and $\omega$ is the inertia weight coefficient that maintains the original speed. Generally, when $\omega$ is a nonnegative number, the algorithm shows better global search capability for large $\omega$ and better local search capability for small $\omega$. The inertia weight is therefore varied as
$$\omega = \omega_{\max} - (\omega_{\max} - \omega_{\min})\frac{\mathrm{iter}}{\mathrm{iter}_{\max}}, \tag{9}$$
where $\omega_{\max}$ and $\omega_{\min}$ represent the maximum and minimum weight factors, respectively; iter is the current iteration; $\mathrm{iter}_{\max}$ is the maximum iteration. In (9), the value of $\omega$ gradually decreases during the search process, which meets the demand of the algorithm for an adaptive transition from global optimization to local optimization.
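The update rule can be illustrated with a short Python sketch of one particle update with a linearly decreasing inertia weight. The learning factors 1.5 and 1.7 are the values reported later in this paper; the weight limits 0.9 and 0.4, the constriction factor of 1.0, and the function names are assumptions chosen for illustration only.

```python
import random

def inertia_weight(it, it_max, w_max=0.9, w_min=0.4):
    """Inertia weight that decreases linearly from w_max to w_min (assumed limits)."""
    return w_max - (w_max - w_min) * it / it_max

def update_particle(x, v, p_best, g_best, w,
                    c1=1.5, c2=1.7, eta=1.0, rng=random):
    """Return the new position and velocity of one particle.

    x, v, p_best, g_best are lists of floats with one entry per dimension.
    """
    new_x, new_v = [], []
    for xd, vd, pd, gd in zip(x, v, p_best, g_best):
        r1, r2 = rng.random(), rng.random()
        # inertia term + cognitive term + social term
        vd_next = w * vd + c1 * r1 * (pd - xd) + c2 * r2 * (gd - xd)
        new_v.append(vd_next)
        # the constriction factor eta scales the step actually taken
        new_x.append(xd + eta * vd_next)
    return new_x, new_v
```

A particle that already sits on both the personal and the global best with zero velocity does not move, which is a quick sanity check on the three terms.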

Particle Swarm Optimization (PSO) Optimizing the Parameters of SVR Model (PSO-SVR).
Based on the above SVR modeling theory and PSO optimization algorithm, the proposed algorithm for dynamic heat supply prediction is shown in Figure 1. The prediction steps are described as follows.
Step 1. Initialize the parameters, including the parameters of SVR, the population size, the number of iterations, and the positions and velocities of the particles.
Step 2. Randomly generate the initial population and its velocities, and calculate the initial fitness.
Step 3. Search for the extreme values and extreme points, including the local and global best points.
Step 4. Calculate the average fitness of each generation.
Step 5. Update the population and its velocities, calculate the fitness, and update the local and global best values.
Step 6. Judge whether the termination conditions are met. If the conditions are met, go to the next step; otherwise, go to Step 5.
Step 7. Obtain the best parameters of the SVR model.
Step 8. Predict the dynamic heat supply with the optimized model.
Step 9. Calculate MSE and output the data.
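Steps 1 to 7 above can be sketched as the following Python loop. This is an illustrative outline, not the MATLAB implementation used in the study: the fitness function here is a toy quadratic stand-in (in practice it would presumably be an SVR error measure for a given parameter triple), the search bounds are hypothetical, the weight limits 0.9 and 0.4 are assumed, and the termination condition is simply a fixed number of generations. The per-generation average fitness of Step 4 is omitted because it only serves monitoring.

```python
import random

def pso_minimize(fitness, bounds, n_particles=50, n_iter=100,
                 c1=1.5, c2=1.7, w_max=0.9, w_min=0.4, k=0.6, seed=0):
    """Minimize fitness(params) over box bounds with a basic PSO."""
    rng = random.Random(seed)
    dim = len(bounds)
    # velocity limit proportional to the largest allowed position magnitude
    v_max = [k * max(abs(lo), abs(hi)) for lo, hi in bounds]
    # Steps 1-2: random positions and velocities, initial fitness
    xs = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vs = [[rng.uniform(-vm, vm) for vm in v_max] for _ in range(n_particles)]
    p_best = [x[:] for x in xs]
    p_fit = [fitness(x) for x in xs]
    # Step 3: global best among the personal bests
    g = min(range(n_particles), key=p_fit.__getitem__)
    g_best, g_fit = p_best[g][:], p_fit[g]
    # Steps 5-6: iterate until the generation budget is exhausted
    for it in range(n_iter):
        w = w_max - (w_max - w_min) * it / n_iter
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                v = (w * vs[i][d]
                     + c1 * r1 * (p_best[i][d] - xs[i][d])
                     + c2 * r2 * (g_best[d] - xs[i][d]))
                v = max(-v_max[d], min(v_max[d], v))  # clamp velocity
                lo, hi = bounds[d]
                vs[i][d] = v
                xs[i][d] = max(lo, min(hi, xs[i][d] + v))  # stay in range
            f = fitness(xs[i])
            if f < p_fit[i]:
                p_best[i], p_fit[i] = xs[i][:], f
                if f < g_fit:
                    g_best, g_fit = xs[i][:], f
    # Step 7: best parameters found
    return g_best, g_fit

def toy_fitness(params):
    """Stand-in for an SVR error measure; minimum at (10, 0.1, 2)."""
    c, eps, sigma = params
    return (c - 10.0) ** 2 + (eps - 0.1) ** 2 + (sigma - 2.0) ** 2

# Hypothetical search ranges for the three SVR parameters.
search_bounds = [(0.1, 100.0), (0.001, 1.0), (0.01, 10.0)]
best, best_fit = pso_minimize(toy_fitness, search_bounds)
```

Replacing `toy_fitness` with a function that trains an SVR for the given parameters and returns its validation error would give Steps 8 and 9 their optimized model.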

Model Performance Evaluation.
To evaluate the performance of the optimized SVR models, MSE and $R^2$ are used to measure the forecasting accuracy. MSE, the mean squared error, reflects the ability of a system to fit actual data. A smaller value indicates a higher precision of the prediction model. The determination coefficient $R^2$ measures the overall similarity between the actual heat supply and the forecasted heat supply. The closer the determination coefficient $R^2$ is to 1, the higher the accuracy of the prediction model for the heating system. With $y_i$ and $f(x_i)$ denoting the measured and predicted values of the $N$ samples, MSE is defined as
$$\mathrm{MSE} = \frac{1}{N}\sum_{i=1}^{N}\big(y_i - f(x_i)\big)^{2},$$
and $R^2$ is computed from the same pairs of measured and predicted values.
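A minimal Python sketch of the two indexes follows. MSE is the usual mean of squared residuals. For $R^2$ the sketch assumes the common definition of one minus the ratio of residual to total sum of squares; the exact expression used in this study may differ (for example, a squared correlation coefficient), so the second function is only one plausible reading.

```python
def mse(y_true, y_pred):
    """Mean squared error between measured and predicted values."""
    n = len(y_true)
    return sum((y - f) ** 2 for y, f in zip(y_true, y_pred)) / n

def r2_score(y_true, y_pred):
    """Determination coefficient, assumed here to be 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((y - f) ** 2 for y, f in zip(y_true, y_pred))
    ss_tot = sum((y - mean_y) ** 2 for y in y_true)
    return 1.0 - ss_res / ss_tot
```

A perfect prediction gives an MSE of 0 and an $R^2$ of 1, while predicting the mean of the measured values everywhere gives an $R^2$ of 0 under this definition.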

Experiment Simulation
Here, $r$ is the correlation coefficient between an input variable and the output value, which correspond to an influencing factor and the dynamic heat load in this paper, respectively. The closer the absolute value is to 1, the greater the linear dependency between the input variable and the output value. The correlation coefficients between every influencing factor and the predicted heat supply are provided in Table 2, where $Q_{t-1}$, $Q_{t-2}$, and $Q_{t-3}$ represent the heat supply of the past three time points and $Q_t$ represents the predicted output heat supply. According to the statistical analysis results of 496 sets of measured data, we found that supply water temperature had the most significant effect on dynamic heat supply, while return water pressure had the least effect on it. A negative sign means that the influencing factor and the predicted heat supply move in opposite directions. This is in accordance with the practical situation in which heat supply increases as the outdoor temperature decreases. Because of the thermal inertia of the building envelope and the heating system itself, the heat supply at every moment is influenced by past values. Through the correlation analysis, we also found that the dynamic heat supply of the past three time points had a major impact on the prediction value. Therefore, the 8 influencing factors other than return water pressure, which have a great effect on the heat supply, are assigned as the input variables of the predictive model.
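The correlation screening can be sketched in Python as follows, assuming that the coefficient in question is the ordinary Pearson correlation coefficient (the text describes a measure of linear dependency, but the defining formula is not reproduced here). The function name and the sample values are illustrative only.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long series."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Made-up readings: heat supply rising as outdoor temperature falls
# yields a coefficient close to -1.
outdoor_temp = [-5.0, -3.0, 0.0, 2.0, 4.0]
heat_supply = [52.0, 49.0, 45.0, 41.0, 39.0]
r_temp = pearson_r(outdoor_temp, heat_supply)
```

Candidate inputs with coefficients near zero, such as the return water pressure in this study, would be dropped from the input set.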

Division of the Measurement Data.
The above 496 data sets were divided into four groups according to the following two criteria: the influences of small-sample size on the model and the influences of different data and sample sizes on the model.The four groups of data are shown in Figure 2.
As shown in Figure 2, the 496 data sets are divided into the four crosswise data sets with different sizes.The largest and smallest amounts of data in the four data sets are, respectively, 248 and 176.There are different sizes and variation trends in these data sets.These characteristics can be used to test the influences of different sizes and variation trends on the performance of the proposed model.
In addition, we selected two data sets from another heat source in Handan, which had a different scale and operation regulation mode, to study the impact of actual data from different heating systems on the proposed method. The two data sets, which contained 123 and 104 samples, respectively, had characteristics similar to those of the aforementioned data sets. The 8 influencing factors were also assigned as the input variables of the model. Therefore, the six data sets were used to train and test the proposed prediction model.

4.4. Pretreatment.
Some null or wrong data might be acquired because of instrument failure, system debugging, and other reasons. In addition, the influencing factors have different magnitudes and units. These issues affect prediction accuracy unless a certain pretreatment is performed. The pretreatment in this paper mainly includes the following two steps: data interpolation and normalization.
Data interpolation mainly targeted missing values, erroneous values, and abnormal values that exceeded 150% of the average value. These values were replaced by the average value.
The influencing factors differ from the dynamic heat supply in measurement units and value range. Therefore, if normalization were not performed, the obtained prediction results would be greatly affected.
The normalization method is expressed as
$$x_i' = \frac{x_i - x_{\min}}{x_{\max} - x_{\min}},$$
where $x_{\max}$ and $x_{\min}$ represent the maximum and minimum values of each dimension, respectively, and $x_i'$ is the normalized value.
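The two pretreatment steps can be sketched in Python as below. Several details are assumptions made for illustration: missing readings are represented by `None`, the average is taken over all non-missing readings, and normalization maps each dimension onto the interval [0, 1] using its minimum and maximum.

```python
def replace_bad_values(values, ratio=1.5):
    """Replace missing readings and readings above ratio * mean with the mean.

    The mean is computed over the non-missing readings (an assumption).
    """
    valid = [v for v in values if v is not None]
    mean = sum(valid) / len(valid)
    return [mean if (v is None or v > ratio * mean) else v for v in values]

def min_max_normalize(values):
    """Scale one dimension linearly so its minimum maps to 0 and maximum to 1."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

raw = [10.0, None, 10.0, 40.0]       # one missing and one abnormal reading
cleaned = replace_bad_values(raw)     # both are replaced by the mean, 20.0
scaled = min_max_normalize([2.0, 4.0, 6.0])
```

Each input dimension and the heat supply itself would be normalized separately, so that quantities with large numerical ranges do not dominate the kernel distances.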

Initial Parameters of the Model.
The initial parameters of the prediction model are set as follows, according to the characteristics of dynamic heat load prediction.

Population Size.
The choice of population size is a trade-off among precision, stability, and calculation time. The population size was set to 50 according to the previous results [32] and the trials in this paper. The number of evolutionary generations was set to 100.

Learning Factor or Search Capacity.
According to (7), the learning factors $c_1$ and $c_2$ adjust the behavior of the particle and the population. The learning factors indicate the weights of the random acceleration terms that pull particle $i$ towards the positions $p_i$ and $p_g$. Through actual operation, observation, and comparison, $c_1$ and $c_2$ are set to 1.5 and 1.7, respectively.

Ranges of SVR Model Parameters.
To avoid wasting computing time on an unbounded search, the search ranges of the SVR model parameters need to be set before they are optimized by the PSO algorithm. In this study, fixed search ranges were set for the three parameters. Generally, the minimum and maximum velocities of the particles should meet the relation $v_{\min} = -v_{\max}$. If the maximum velocity $v_{\max}$ of the particle exceeds the allowable values, the particle may skip the good solution and fail to converge effectively. If the particle velocity is too small, the particle cannot escape a local optimum interval and becomes trapped at a local extreme point. In the previous results [19], the relationship between the maximum velocity and the corresponding search space is expressed as
$$v_{\max} = k\,|x|_{\max},$$
where $k$ is the proportional coefficient between the velocity and the limited position of the particle, which is set to 0.6, and $|x|_{\max}$ is the maximum value of the limited position for the particle, corresponding to $C$, $\varepsilon$, and $\sigma$ in this paper.
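The velocity restriction can be written as a small Python helper. The proportional coefficient 0.6 is the value given in the text; the symmetric limit (minimum velocity equal to the negative of the maximum) and the example position bound of 100 are assumptions for illustration.

```python
def velocity_limit(x_max_abs, k=0.6):
    """Maximum velocity as a fixed proportion k of the largest allowed position."""
    return k * x_max_abs

def clamp_velocity(v, v_max):
    """Restrict a velocity component to the symmetric interval [-v_max, v_max]."""
    return max(-v_max, min(v_max, v))

v_max_c = velocity_limit(100.0)   # hypothetical upper bound of 100 for one parameter
```

Applying `clamp_velocity` to every component after each velocity update keeps a particle from jumping across the whole search range in a single step.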

Results.
Because the dynamic heat supply involves many influencing factors, the prediction model is a multi-input, single-output system. The 8 influencing factors were set as the inputs of the model according to the statistical method mentioned in Section 4.2. Of the 5 measured influencing factors, the outdoor temperature was taken at the prediction time point, whereas the supply water temperature, the return water temperature, the supply water pressure, and the circulation flow of the system were taken at the time point preceding the prediction time point. The remaining 3 inputs were the heat supply of the past three time points, $Q_{t-1}$, $Q_{t-2}$, and $Q_{t-3}$, where the subscript $t$ denotes the predicted time point, the subscript $t-1$ refers to the time point preceding $t$, the subscript $t-2$ refers to the time point preceding $t-1$, and so on. The 6 data sets mentioned above were first pretreated according to the method described in Section 4.4. Then, the processed data were passed through the procedure shown in Figure 1. The PSO-SVR model was established with MATLAB to perform the training and testing processes for the dynamic heat supply prediction of the local heating system. We divided every group of data into training and testing sets. As mentioned above, the data sets had different sample sizes. For comparison, we assigned the last 20 samples as the testing set; the remaining samples were assigned as the training set. The results are shown in Figure 3.
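The arrangement of inputs and the train/test division described above can be sketched in Python as follows. The series names and the ordering of the 8 inputs within each vector are hypothetical; only the choice of time indices and the hold-out of the last 20 samples follow the description in the text.

```python
def build_samples(t_out, t_sup, t_ret, p_sup, flow, q):
    """Build (inputs, target) pairs for every time point t with three past values.

    Inputs: outdoor temperature at t; supply temperature, return temperature,
    supply pressure, and flow at t-1; heat supply at t-1, t-2, and t-3.
    Target: heat supply at t.
    """
    samples = []
    for t in range(3, len(q)):
        features = [t_out[t], t_sup[t - 1], t_ret[t - 1], p_sup[t - 1],
                    flow[t - 1], q[t - 1], q[t - 2], q[t - 3]]
        samples.append((features, q[t]))
    return samples

def split_last(samples, n_test=20):
    """Keep the last n_test samples for testing and the rest for training."""
    return samples[:-n_test], samples[-n_test:]
```

Because the first three time points have no complete history, a series of length n yields n - 3 samples, and holding out the final block preserves the time order between training and testing data.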
Figure 3 shows the relationship between prediction heat supply and measured values from the six data sets.The predicted heat supply of PSO-SVR model is close to the measured values.According to the prediction results of the two heating systems (resp., shown in Figures 3(a)-3(d) and 3(e)-3(f)), the prediction tendency is also in accordance with the measured status.The comparison results indicate that the PSO-SVR model is applicable to predict the dynamic heat supply of heating system.Moreover, the smoother trend of the measured values indicates the higher prediction precision, as shown in Figures 3(a), 3(b), and 3(f).This conclusion can also be seen from the results of the training set in Table 3. Evaluation indexes MSE and  2 show the best training results for the six data sets.
The parameters of the SVR model optimized by the PSO algorithm and the evaluation indexes are listed in Table 3. As shown in Table 3, for the training sets, MSE and the determination coefficient $R^2$ are satisfactory. MSE values are lower than 0.003 and $R^2$ values are higher than 0.98, indicating that the proposed prediction model captures the nonlinear characteristics well and meets the requirements of the heating system. The two indexes are also satisfactory for the testing process, indicating that the model has good generalization capability for prediction.
As mentioned above, the SVR model can better resolve small-sample learning problems. The predicted results from the six different data sets indicate that the precision of the prediction result is not affected by sample size. The smallest MSE in the test set is 0.001 in Data Set 1 with 196 samples, which is not the biggest data set. The MSE of Data Set 5 with 123 samples is 0.0012, which is the second smallest MSE value among the 6 data sets, and Data Set 5 is not the biggest data set either. Therefore, the prediction algorithm can deal with small-sample learning problems.
To further validate the proposed prediction model, we compared it with the other three prediction models, which are the basic SVR, SVR optimized by genetic algorithm (GA-SVR) applied to aquaculture water quality prediction [33], and artificial neural network (ANN) applied to the prediction of heating energy consumption in a model house [34], respectively.The six data sets mentioned above from two heat sources were still used to train and test the other three models.Their prediction results which are compared with PSO-SVR are shown in Figures 4 and 5.
Figure 4 shows the two indexes of the four prediction models trained on the six different data sets. The comparison results indicate that the two indexes of these methods have consistent variation tendencies, where the $R^2$ values of the PSO-SVR method are higher than those of the other models and the MSE values of the PSO-SVR method are lower than those of the other models. These results show that the SVR optimized by PSO performs better in fitting the original data. The comparison results of the generalization ability of the four methods are shown in Figure 5. Although the methods are much closer on Data Set 2, the two indexes of the PSO-SVR method are better than those of the other methods as a whole. This shows that the generalization ability of the proposed model is better than those of the other models and that the testing results of the proposed model are closer to the measured values. Therefore, the proposed model is more suitable for the prediction of dynamic heat supply.

Conclusions
Accurate heat supply prediction is crucial for improving the operation regulation of the heating system and avoiding excessive energy consumption. This paper proposed a support vector regression model optimized by a particle swarm optimization algorithm to predict the dynamic heat supply. Heat supply results from the complex interaction of the outdoor temperature and the inherent dynamic complexity of the heating system. Analyzing the influencing factors is a crucial step in building a proper model. Correlation coefficient analysis is used to reveal the relationship between the heat supply and the influencing factors.
With their capability of modeling complex nonlinear systems, support vector regression (SVR) models are employed to forecast the heat supply data at Taiyuan and Handan in China. The forecasting accuracy of SVR greatly depends on its parameters ($C$, $\varepsilon$, and $\sigma$) and its input signals. Therefore, proper selection of inputs and parameters is essential for satisfactory SVR performance. Input signals analyzed by the correlation coefficient method are chosen as the input variables. The optimum parameters of SVR are selected by the PSO algorithm.
The data size for a heating system is limited, and the SVR model is well suited to nonlinear small-sample problems. Six data sets from the two different heat sources are used to validate the applicability of the PSO-SVR model to small-sample data. The results show that the performance of the proposed method is little affected by sample size.
Finally, the proposed approach is compared with the basic SVR, GA-SVR, and ANN. The prediction results show that the performance of the proposed prediction model is better than those of the other three models and that the proposed method provides more precise prediction and stronger generalization ability. The proposed method is applicable to the dynamic heat supply prediction of nonlinear heating systems.

Figure 1: The algorithm flow diagram of the optimized SVR model.

Flight Velocity of Particles.
The flight velocity of particles should be restricted; otherwise, the particles might show strong fluctuations after multistep iterations.

Table 1: The measured values and their units.

Table 2: Correlation coefficients between $Q_t$ and every influencing factor.