Short-Term Load Forecasting Using Neural Network and Particle Swarm Optimization (PSO) Algorithm

Electrical load forecasting plays a key role in power system planning and operation procedures. So far, a variety of techniques have been employed for electrical load forecasting. Among them, neural-network-based methods have led to fewer prediction errors because of their ability to adapt to the hidden characteristics of the consumed load. Therefore, these methods have been widely accepted by researchers. As the parameters of a neural network have a significant impact on its performance, this paper proposes a short-term electrical load forecasting method using a neural network and the particle swarm optimization (PSO) algorithm, in which some neural network parameters, including the learning rate and the number of hidden-layer neurons, are determined by the PSO algorithm so that the electrical load is forecast precisely. Then, the neural network with these optimized parameters is used to predict the short-term electrical load. In this method, a three-layer feedforward neural network trained by the backpropagation algorithm is used together with an improved gbest PSO algorithm. Also, the neural network prediction error is defined as the PSO algorithm cost function. The proposed approach has been tested on the Iranian power grid using MATLAB software. The average of three indices, together with graphical results, has been considered to evaluate the performance of the proposed method. The simulation results reflect the capability of the proposed method in accurately predicting the electrical load.


Introduction
Load forecasting is an effective and crucial process in the management and operation of power systems which can lead to significant cost savings when accurately calculated. Also, very important decisions are made based on the forecasted load, the economic consequences of which are notable [1].
Load forecasting can be divided into three categories: short-term, mid-term, and long-term forecasting [2]. Because short-term load forecasting plays a vital role in optimizing unit commitment, turning thermal units on and off, controlling the spinning reserve, and buying and selling electricity in interconnected systems, research efforts are mainly concentrated on it [3]. Appropriate load prediction has always been one of the main challenges for researchers: if the predicted load is less than its actual value, the required load will not be supplied, and if it is estimated to be more than the actual amount, it will impose additional costs and cause energy waste.
Due to their great ability to model nonlinear relationships between inputs and outputs, artificial neural networks are increasingly used in load forecasting [4][5][6][7]. These networks are able to extract the implicit relations between input variables by learning from training data [8]. The first reports of neural network applications in load forecasting were published in the late 1980s and early 1990s, and their number has steadily increased since then [9].
Optimizing neural network architecture design, including determining the number of input variables, the number of input nodes, and the number of hidden neurons to enhance prediction performance, is an important issue in intelligent systems [10][11][12][13]. In recent years, many intelligent methods such as PSO have been proposed to improve the training and architecture of artificial neural networks in short-term electrical load forecasting [8,14,15]. The results reflect the capability of these methods compared to earlier ones.
Reference [16] proposed a hybrid method that consists of a deep neural network and the empirical mode decomposition (EMD) technique. The EMD is used to decompose the load time series, and the deep neural network is used to perform short-term load forecasting. In [17], a new ensemble residual network model is presented. At first, a structure similar to a recurrent neural network is built, and then a modified residual network is applied, from which the final outputs are obtained. Reference [18] proposed a deep learning framework based on a combination of a convolutional neural network (CNN) [19] and long short-term memory (LSTM) [20]. The CNN layers are used for feature extraction [21] from the input data and the LSTM layers are used for sequence learning. In [22], a multifactorial framework for short-term load forecasting is proposed. At first, the candidate feature set is chosen from the load data. Next, partial mutual information is used to omit the redundant and irrelevant features from the candidate set. Then an artificial neural network optimized by a genetic algorithm is trained on this set. Finally, the optimized trained network is used to predict the short-term load. Reference [23] investigates how to apply sequence-to-sequence recurrent neural networks to short-term load forecasting. Reference [24] proposes a full wavelet neural network method for short-term load forecasting. The load profile and various features are decomposed using the full wavelet packet transform model. The neural networks are then trained using these features, and the outputs of these trained neural networks are taken as the forecasted load. In [25], a method combining data mining, an artificial neural network optimized by a multiobjective grasshopper algorithm, and phase space reconstruction is presented. In [26], a short-term load forecasting approach that can capture variations in building operation regardless of building type and location is proposed.
Also, nine different hybrids of recurrent neural networks and clustering methods are explored. In [27], the applications and features of the support vector machine method, the random forest regression method, and the LSTM neural network method are discussed and compared. Also, a fusion forecasting approach and a data preprocessing technique are proposed by integrating the advantages of these methods. In [28], the past load data are treated as a feature while the time series characteristics of the load data are considered simultaneously. In order to forecast the load, an approach named multi-temporal-spatial-scale temporal convolutional network is adopted. Reference [29] presented six clustering techniques involving different combinations of Kalman filtering (KF), wavelet neural network (WNN), and artificial neural network (ANN) schemes. In [30], a hybrid method based on the Elman neural network (ENN) and PSO is proposed. Reference [31] proposes a genetic-algorithm-based backpropagation neural network (GABPNN) considering data loss. Also, a particle swarm optimization-support vector regression (PSO-SVR) algorithm is further used to integrate the GABPNN results with better accuracy. In addition, a combined ultra-short-term load forecasting model for industrial power users is introduced. Furthermore, the proposed model combines the cubature Kalman filter (CKF) prediction model, which performs well in nonlinear dynamic systems, with the least square support vector machine (LS-SVM) prediction model, which performs well in small-scale data prediction. The grey neural network is used to integrate the two algorithms, which further improves the accuracy of ultra-short-term load forecasting.
The rest of this paper is organized as follows: A brief overview of neural networks and PSO algorithms is presented in Sections 2 and 3. In Section 4, the proposed method is demonstrated and, in Section 5, the methodology applied to predict the load is explained. In Section 6, the results are presented, and finally Section 7 is devoted to the conclusion.

Neural Network
An artificial neural network is inspired by the way information is processed in human biological systems and consists of an interconnected group of elements called neurons [32][33][34][35]. Various architectures are used in neural networks [36][37][38][39][40][41], including feedforward networks and recurrent networks. Among these, feedforward neural networks have become the more popular. In this type of network, the input signal propagates from the input layer to the output layer through the hidden layers, where the outputs of one layer are the inputs of the next layer [32], as depicted in Figure 1. This figure shows a neural network with R inputs and s outputs. It has three layers named input layer, hidden layer, and output layer. The output of the input layer is the input of the hidden layer, and the output of the hidden layer is the input of the output layer. Most of the published papers have used this architecture to perform short-term electrical load forecasting [42].
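The layer-by-layer propagation described above can be sketched in a few lines of Python. This is only an illustration, not the authors' MATLAB implementation: the tanh activation, the random weights, and the sizes R = 4, 3 hidden neurons, and s = 2 are all assumptions chosen for the example.

```python
import math
import random

def forward(x, w_hidden, b_hidden, w_out, b_out):
    """Propagate one input vector through a hidden layer and an output layer."""
    # Hidden layer: weighted sum of the inputs plus bias, then tanh (assumed activation).
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(w_hidden, b_hidden)]
    # Output layer: the hidden outputs are the inputs of this layer.
    return [math.tanh(sum(w * h for w, h in zip(row, hidden)) + b)
            for row, b in zip(w_out, b_out)]

def random_layer(n_out, n_in, rng):
    """Create small random weights and biases for a layer (illustrative only)."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    biases = [rng.uniform(-0.5, 0.5) for _ in range(n_out)]
    return weights, biases

rng = random.Random(0)
R, hidden_size, s = 4, 3, 2          # hypothetical layer sizes
w_h, b_h = random_layer(hidden_size, R, rng)
w_o, b_o = random_layer(s, hidden_size, rng)
y = forward([0.2, 0.4, 0.6, 0.8], w_h, b_h, w_o, b_o)
```

Here `y` holds s values, one per output neuron; training would then adjust the weights and biases rather than leave them random.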
An artificial neural network with an input layer, one or more hidden layers, and an output layer is called a multilayer perceptron network. Each layer consists of several neurons, each of which is connected to the neurons of its adjacent layers through weights. Weight and bias are the two adjustable parameters in neural networks. Tuning these parameters is done in a process called training [43]. Neural network training is accompanied by minimizing a cost function [44][45][46]. The most well-known training algorithm in neural networks is the backpropagation algorithm, in which the cost function is the mean of the squared errors. In order to minimize the errors, the backpropagation technique modifies the weights and biases according to the errors propagated back through the network [32].

Particle Swarm Optimization (PSO) Algorithm
The PSO algorithm is an iterative optimization method in which a population of candidate solutions, called particles, is first generated for the search process. Then these particles travel through the multidimensional search space [96]. There are two parameters in the PSO algorithm, position and velocity, which are updated for each particle in all considered dimensions [95]. Each particle alters its position according to the best position it has achieved so far and the best position achieved so far by the other particles.
The benefits of PSO over other metaheuristic approaches are its computational feasibility and effectiveness. PSO is also distinguished by its easy implementation and consistent performance [97].
The main advantage of PSO in comparison to other optimization methods is its ability to achieve fast convergence in many complicated optimization problems. In addition, PSO has several attractive advantages such as simplicity, with fewer mathematical equations and fewer parameters in implementation [98]. PSO has many key features that have attracted researchers to use it in applications in which traditional optimization algorithms might fail. Examples include the following:
(i) Only a fitness function to measure the "quality" of a solution is required, instead of complex mathematical operations like gradient, Hessian, or matrix inversion. This reduces the computational complexity and relieves some of the restrictions usually imposed on the objective function, like differentiability, continuity, or convexity.
(ii) As it is a population-based algorithm, it is less dependent on a good initial solution.
(iii) It is easily combined with other optimization tools to form hybrid ones.
(iv) It has the ability to escape local minima, since it follows probabilistic transition rules.
Further advantages of PSO emerge when it is compared to other evolutionary methods such as GA, HHO, and GWO:
(i) It is easily programmed and modified with basic mathematical and logic operations.
(ii) It is inexpensive in terms of computation time and memory.
(iii) It requires less parameter tuning.
(iv) It works directly with real-valued numbers, which removes the need for the binary conversion used in the classical canonical genetic algorithm [99].
Several PSO variants are known, among which the gbest algorithm is the most popular. In this approach, the whole population is considered a single neighborhood for each particle while it gains experience. In order to optimize the search procedure, the best particle shares its coordinate information with the other particles [99].
In this algorithm, the velocity of the $i$th particle, $v_i^{k+1}$, is updated according to equation (1) [95,99]:
$$v_i^{k+1} = w\,v_i^{k} + c_1\beta_1\left(\mathrm{pbest}_i - x_i^{k}\right) + c_2\beta_2\left(\mathrm{gbest}_i - x_i^{k}\right) \tag{1}$$
where $\beta_1$ and $\beta_2$ are two random numbers between zero and one, $x_i^{k}$ and $v_i^{k}$ are the position and velocity of the $i$th particle in the $k$th iteration, respectively, $\mathrm{gbest}_i$ is the best position experienced by the whole swarm, and $\mathrm{pbest}_i$ represents the best personal experience of the particle. Also, $w$ is called the inertia constant, which carries a fraction of the previous particle velocity into the new velocity calculation. $c_1$ and $c_2$ are constants called the personal learning factor and the social learning factor, respectively.
The particle position is updated according to the following equation:
$$x_i^{k+1} = x_i^{k} + v_i^{k+1} \tag{2}$$
where $x_i^{k+1}$ is the new particle position, $x_i^{k}$ is the previous particle position, and $v_i^{k+1}$ is the new particle velocity obtained according to (1). The velocity and position of each particle are updated according to (1) and (2) until all particles have moved.
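A minimal Python sketch of the gbest update loop described by (1) and (2) follows. It assumes the standard gbest form of the velocity update (inertia term plus personal and social attraction terms). The sphere test function, the swarm size, the search range, and the coefficient values are hypothetical choices made only for illustration, and drawing the random numbers separately for each dimension is likewise an assumption.

```python
import random

def sphere(x):
    """Toy cost function with its minimum at the origin (illustrative only)."""
    return sum(v * v for v in x)

def pso_minimize(cost, dim, n_particles=20, iterations=100,
                 w=0.7298, c1=1.4962, c2=1.4962, seed=0):
    rng = random.Random(seed)
    x = [[rng.uniform(-5, 5) for _ in range(dim)] for _ in range(n_particles)]
    v = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in x]
    pbest_cost = [cost(p) for p in x]
    g = min(range(n_particles), key=pbest_cost.__getitem__)
    gbest, gbest_cost = pbest[g][:], pbest_cost[g]
    history = [gbest_cost]
    for _ in range(iterations):
        for i in range(n_particles):
            for d in range(dim):
                beta1, beta2 = rng.random(), rng.random()
                # Velocity update, equation (1).
                v[i][d] = (w * v[i][d]
                           + c1 * beta1 * (pbest[i][d] - x[i][d])
                           + c2 * beta2 * (gbest[d] - x[i][d]))
                # Position update, equation (2).
                x[i][d] += v[i][d]
            c = cost(x[i])
            if c < pbest_cost[i]:
                pbest[i], pbest_cost[i] = x[i][:], c
        # The swarm best is refreshed only after all particles have moved.
        g = min(range(n_particles), key=pbest_cost.__getitem__)
        if pbest_cost[g] < gbest_cost:
            gbest, gbest_cost = pbest[g][:], pbest_cost[g]
        history.append(gbest_cost)
    return gbest, gbest_cost, history

best, best_cost, history = pso_minimize(sphere, dim=2)
```

The `history` list records the best cost after each iteration and never increases, which is a convenient check when adapting the loop to another cost function.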
Then the next iteration occurs, and this procedure continues until the best solution is found [94]. The older types of PSO algorithm had some undesirable dynamic characteristics, including the need for velocity restrictions to control the particle path. In this paper, by applying the limitation on the factors according to (3) and (4), it becomes possible to control the dynamic characteristics of the particle swarm and to balance local search and global search [100]. With $\theta$ defined by (3), the factors of the PSO algorithm are chosen as in (4):
$$\theta = \frac{2\alpha}{\left|2 - \rho - \sqrt{\rho^{2} - 4\rho}\right|}, \quad \rho = \rho_1 + \rho_2 \tag{3}$$
$$w = \theta, \quad c_1 = \theta\rho_1, \quad c_2 = \theta\rho_2 \tag{4}$$
where $\rho_1$ and $\rho_2$ are positive random numbers drawn from a uniform distribution, whose sum, $\rho$, should be more than 4, and $\alpha$ has a value between zero and one. Also, $\theta$ is known as the restriction factor.
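As a worked example, the restriction factor can be computed numerically. The sketch below assumes that (3) is the widely used constriction form $\theta = 2\alpha / |2 - \rho - \sqrt{\rho^2 - 4\rho}|$ with $\rho = \rho_1 + \rho_2$, and that (4) sets $w = \theta$, $c_1 = \theta\rho_1$, and $c_2 = \theta\rho_2$. The function name and the sample values of $\rho_1$, $\rho_2$, and $\alpha$ are illustrative.

```python
import math

def constriction(rho1, rho2, alpha=1.0):
    """Return (w, c1, c2) from the assumed constriction form of (3) and (4)."""
    rho = rho1 + rho2
    if rho <= 4:
        raise ValueError("rho1 + rho2 must be greater than 4")
    theta = 2 * alpha / abs(2 - rho - math.sqrt(rho * rho - 4 * rho))
    return theta, theta * rho1, theta * rho2

# Illustrative values: rho1 = rho2 = 2.05 (so rho = 4.1) and alpha = 1.
w, c1, c2 = constriction(2.05, 2.05)
```

Under these assumptions, $\rho = 4.1$ and $\alpha = 1$ give $w \approx 0.7298$ and $c_1 = c_2 \approx 1.4962$.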

Proposed Method
In this paper, in order to predict the short-term electrical load, a feedforward neural network trained by the backpropagation algorithm has been chosen. This network consists of one hidden layer, and the number of neurons in this layer is considered an optimization parameter. In designing the neural network architecture, the number of neurons in the hidden layer has an important effect on the network performance, so it must be chosen with care. If this number is too low, the network runs into trouble in the training step; if it is too high, the network will overfit. The network learning rate is considered the other optimization parameter. Suitable values for the two optimization parameters are found using the improved PSO algorithm introduced in Section 3. This algorithm's parameters are selected in accordance with (3) and (4) to overcome the dynamic problems involved in the traditional PSO algorithm. The error produced by the neural network in load forecasting is taken as the cost function and, as stated above, two variables of the network, the learning rate and the number of hidden-layer neurons, are the optimization variables for the PSO algorithm. By minimizing the neural network error, and hence the load forecasting error, the PSO algorithm finds the best values for the learning rate and the number of hidden-layer neurons. The neural network with these optimized parameters is then used to predict the load. The flowchart of the proposed method is shown in Figure 2. At the beginning, the preprocessed input data are fed into the three-layer backpropagation neural network, whose learning rate and number of hidden-layer neurons are the optimization parameters for the PSO algorithm. PSO travels the search space to find the best values for the intended neural network.
After simulation, these two parameters are obtained at the point where the neural network prediction error is minimal. Next, the trained, optimized neural network is used for electrical load forecasting. This network is used to predict the load per day and, for each day, three indices that will be introduced in Section 6 are calculated. The overall values of these indices over all days for which the load is predicted are obtained by averaging the daily indices.
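The overall loop, in which PSO proposes a learning rate and a hidden-layer size, a network is trained with them, and its prediction error is returned as the cost, can be sketched as follows. Everything concrete here is a stand-in and not the authors' setup: the toy sine series, the three lagged inputs, the linear output neuron, plain stochastic gradient descent, the search bounds, and the small swarm are all hypothetical choices that keep the example short.

```python
import math
import random

def make_data(n=60, lags=3):
    """Toy periodic series; each sample maps `lags` past values to the next one."""
    series = [0.5 + 0.4 * math.sin(2 * math.pi * t / 24) for t in range(n + lags)]
    return [(series[t:t + lags], series[t + lags]) for t in range(n)]

def train_and_score(lr, n_hidden, train, test, epochs=20, seed=0):
    """Train a one-hidden-layer network by backpropagation; return the test MSE."""
    rng = random.Random(seed)
    n_in = len(train[0][0])
    w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
    b1 = [0.0] * n_hidden
    w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
    b2 = 0.0

    def predict(x):
        h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
             for row, b in zip(w1, b1)]
        return h, sum(w * hj for w, hj in zip(w2, h)) + b2

    for _ in range(epochs):
        for x, target in train:
            h, y = predict(x)
            err = y - target
            for j in range(n_hidden):
                grad_h = err * w2[j] * (1 - h[j] * h[j])  # uses w2[j] before its update
                w2[j] -= lr * err * h[j]
                for k in range(n_in):
                    w1[j][k] -= lr * grad_h * x[k]
                b1[j] -= lr * grad_h
            b2 -= lr * err
    mse = sum((predict(x)[1] - t) * (predict(x)[1] - t) for x, t in test) / len(test)
    return mse if math.isfinite(mse) else float("inf")

def decode(position):
    """Map a particle position to (learning rate, hidden neurons) within assumed bounds."""
    lr = min(max(position[0], 0.001), 0.1)
    n_hidden = int(round(min(max(position[1], 1), 10)))
    return lr, n_hidden

def pso_tune(train, test, n_particles=6, iterations=5,
             w=0.7298, c1=1.4962, c2=1.4962, seed=1):
    rng = random.Random(seed)
    cost = lambda p: train_and_score(*decode(p), train, test)
    x = [[rng.uniform(0.001, 0.1), rng.uniform(1, 10)] for _ in range(n_particles)]
    v = [[0.0, 0.0] for _ in range(n_particles)]
    pbest, pcost = [p[:] for p in x], [cost(p) for p in x]
    g = min(range(n_particles), key=pcost.__getitem__)
    gbest, gcost = pbest[g][:], pcost[g]
    for _ in range(iterations):
        for i in range(n_particles):
            for d in range(2):
                v[i][d] = (w * v[i][d]
                           + c1 * rng.random() * (pbest[i][d] - x[i][d])
                           + c2 * rng.random() * (gbest[d] - x[i][d]))
                x[i][d] += v[i][d]
            c = cost(x[i])
            if c < pcost[i]:
                pbest[i], pcost[i] = x[i][:], c
        g = min(range(n_particles), key=pcost.__getitem__)
        if pcost[g] < gcost:
            gbest, gcost = pbest[g][:], pcost[g]
    return decode(gbest), gcost

data = make_data()
train, test = data[:45], data[45:]
(best_lr, best_hidden), best_mse = pso_tune(train, test)
```

Rounding the second coordinate in `decode` is one simple way to let a continuous PSO handle an integer neuron count; other encodings are possible.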

Methodology
This section provides more details about the proposed method presented in the previous section. As stated in Section 4, a neural network with one hidden layer is used to forecast the electrical load. At first, the total hourly load consumption of Iran's power grid was extracted from [101], and the data for 1093 days (22 March 2010 to 18 March 2013) were selected for study. The simulations are performed in the MATLAB software environment.
In order to improve the performance of the neural network and prevent neuron saturation, all data used in the neural network are normalized using the following formula:
$$\text{normalized data} = \frac{\text{actual amount of data}}{\text{maximum amount of data}} \tag{5}$$
The first 900 of the 1093 days under study are used as training data and the remaining 193 as neural network test data. The number of neurons in the output layer is 24 and, because the number of hidden-layer neurons plays a crucial role, it is considered an optimization parameter. The transfer function for the output layer and the hidden layer is tansig, and the learning rule is Levenberg-Marquardt [100,102]. The network learning rate is considered the other optimization parameter. The error produced by the neural network in load forecasting is taken as the cost function and, as stated above, two variables of the network, the learning rate and the number of hidden-layer neurons, are the optimization variables for the PSO algorithm. The improved type of PSO is used, the parameters of which are as in (3) and (4). The optimized value of ρ is considered 4.2 and w is considered as 0.729843.
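The normalization in (5) and the 900/193 split can be illustrated with a short Python sketch. The load values below are synthetic placeholders, not the Iranian grid data, and taking the maximum over the whole series (rather than over the training days only) is an assumption, since the text does not say which is used.

```python
import math

def normalize_by_max(values):
    """Scale every value by the largest one, as in formula (5)."""
    peak = max(values)
    return [v / peak for v in values], peak

def split_days(days, n_train=900):
    """Keep the first n_train days for training and the rest for testing."""
    return days[:n_train], days[n_train:]

# Hypothetical synthetic load in MW: 1093 days of 24 hourly values.
hourly = [30000 + 8000 * math.sin(2 * math.pi * (h % 24) / 24) + 5 * (h // 24)
          for h in range(1093 * 24)]
scaled, peak = normalize_by_max(hourly)
days = [scaled[d * 24:(d + 1) * 24] for d in range(1093)]
train_days, test_days = split_days(days)
```

Keeping `peak` allows a normalized forecast to be converted back to MW by multiplying by it.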

Results
This section demonstrates the effectiveness of the proposed method, which has been simulated in MATLAB software. The computing system is a Core i5 system with a 1.6 GHz CPU and 4 GB of memory.
The chosen test data are used to evaluate this network's performance, and the results for some days are shown in Figures 3 through 8. They show the consumed load by hour of the day. In each figure, the solid graph is the actual load and the dashed graph is the result of the prediction. It can be seen that Iran's load has two peaks: one appears in the afternoon at about 3 p.m. and the other in the evening at about 8 p.m. Figures 3, 5, 6, and 7 follow nearly the same load pattern, whereas in Figures 4 and 8 the first peak appears at a lower power level. This pattern is influenced by the day of the week, special religious ceremonies, certain TV programs, and so on. In all cases, the maximum prediction error is not more than 1000 MW. The obtained results show the appropriate performance of the introduced approach.
For further investigation, the MSE, MAPE, and MAE indices are calculated according to Table 1 to evaluate the forecasting model, and the average results of these calculations over all test data are summarized in Table 2. In Table 1, N is the total number of test data points, $y_m(i)$ is the actual load value of the $i$th data point, and $y_p(i)$ is its predicted load value.
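The three indices can be computed as in the sketch below. Because Table 1 is not reproduced in this text, the code assumes the standard definitions: MSE as the mean of squared errors, MAE as the mean of absolute errors, and MAPE as the mean absolute error relative to the actual value, in percent. The sample load values are made up for illustration.

```python
def mse(actual, predicted):
    """Mean squared error over N points."""
    return sum((m - p) ** 2 for m, p in zip(actual, predicted)) / len(actual)

def mae(actual, predicted):
    """Mean absolute error over N points."""
    return sum(abs(m - p) for m, p in zip(actual, predicted)) / len(actual)

def mape(actual, predicted):
    """Mean absolute percentage error; actual values must be nonzero."""
    return 100.0 * sum(abs((m - p) / m) for m, p in zip(actual, predicted)) / len(actual)

# Made-up example: y_m is the actual load and y_p the predicted load.
y_m = [100.0, 200.0, 400.0]
y_p = [110.0, 190.0, 400.0]
scores = {"MSE": mse(y_m, y_p), "MAE": mae(y_m, y_p), "MAPE": mape(y_m, y_p)}
```

For these sample values the errors are 10, -10, and 0, so MSE is 200/3, MAE is 20/3, and MAPE is 5%.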
Although these results are acceptable, only the historical load data are used in this paper, whereas including other factors that influence load behaviour could further improve the performance of the approach.

Conclusions
Electrical load forecasting affects power system operation and planning to the extent that the correct operation of the power system depends on the precision of this prediction. Also, the behaviour of the power system, especially of its generation units at small or large scale, is affected by this prediction, and its deviation from the actual value can impose additional costs on the system. Numerous load forecasting methods have been proposed up to now, and neural-network-based methods are among them. Because the relationship between load pattern changes and the parameters affecting them is nonlinear and complex, and because neural networks are able to discover such relationships, researchers have accepted them more readily than other methods. Meanwhile, the numerical values of the neural network parameters have an obvious effect on its performance, so exploiting algorithms such as PSO can help. This paper proposes an approach for electrical load forecasting using the PSO algorithm and a neural network trained with the backpropagation algorithm. At first, the PSO algorithm is used to tune some neural network parameters to obtain an optimized and appropriate model. Then, the neural network with the optimized parameters is used for short-term electrical load forecasting. The simulation results indicate the precision and power of the proposed method in short-term electrical load forecasting. For future directions, we will develop a model based on deep learning techniques [103,104] and fuzzy logic [105,106]. Also, more parameters besides the load information, such as the temperature and the humidity, can be used for load forecasting in order to improve the prediction accuracy.

Data Availability
The data used to support the findings of this study are included within the article.