Microgrid Load Forecasting Based on Improved Long Short-Term Memory Network

With the rapid growth of new energy technologies, the proportion of distributed renewable energy sources, dominated by wind and solar energy, in the microgrid continues to increase. However, the uncertainty and randomness of these energy sources bring challenges to the stable operation of the power system. High-accuracy microgrid load forecasting is a key means of handling these problems. It supports power grid dispatching and decision-making, optimizes resource allocation, reduces operation cost, and ensures system safety. In this paper, a load forecasting algorithm for microgrids based on an improved long short-term memory (LSTM) neural network is proposed. Firstly, the criticality of the load influencing factors is analyzed, and the clustering classification and weight calculation are completed. Then, the input data are preprocessed to ensure the quality of the database. Secondly, the LSTM is improved in three aspects: multilayer convolution channel, lookahead optimizer, and attention mechanism (AM) weighting. A complete forecasting model is then designed to accomplish the load forecasting. Finally, simulation experiments are conducted on the data of a local microgrid in Zhejiang Province, China. The results are quantitatively compared with other forecasting algorithms to verify the accuracy and superiority of the proposed algorithm.


Introduction
With the rapid progress of science and the economy across the world, the environmental pollution caused by traditional ways of power generation is becoming more and more serious [1]. New energy technologies such as wind power and photovoltaic power have become the focus of research at home and abroad. However, renewable energy itself has considerable randomness and volatility, which adversely affect the stability and reliability of the power grid when it is widely used in the power system [2,3]. As a small, decentralized, and independent system, the microgrid features self-control, protection, and management. The fast development of related technologies makes it possible to promote the wide access of new energy sources and the reliable supply of multiple energy forms [4]. The accurate prediction of microgrid load power and its inclusion in the dispatching plan are important guarantees of the security and economy of the modern power system [5].
Traditional load forecasting has been developed over the years and can be divided into time series methods, represented by autoregressive models, and deep learning algorithms, represented by artificial neural networks. The paper [6] uses deep belief networks for short-term load forecasting. The work in [7] first extracts load features with convolutional neural networks (CNNs) and then forecasts the load with long short-term memory networks. Artificial intelligence has prominent data mining and complex function approximation capabilities and is widely used in practical engineering. Its ability to autonomously mine the implicit features of the load and its influencing factors has made it a mainstream method for probabilistic load prediction. In addition, gray theory, chaos theory, wavelet analysis, and other theories are also used for load prediction [8-10]. Domestic and foreign scholars have made many contributions to further improve the prediction accuracy and generalization ability of the models. The article [11] used a CNN to extract the nonlinear hidden features between the influencing factors and the load in a short-term load forecasting model and improved the generalization ability of the model with fuzzy time series. The paper [12] used a CNN to extract spatially adjacent load-related features to improve the load forecasting effect. However, most of the above methods target traditional power systems and are not applicable to new microgrid systems with a large proportion of new energy access. The volatility of new energy sources places higher demands on the accuracy of microgrid load forecasting, and the excessive information processing also leads to problems such as a large computational workload and low efficiency. In addition, how to complete data feature selection and improve the expressive ability of the neural network based on the big data of the microgrid is also an urgent problem to be solved.
To deal with these problems, a load forecasting algorithm for microgrids based on an improved long short-term memory network is proposed in this paper. Firstly, the power load of the microgrid is analyzed, and the key factors that influence the load value are identified. Then, different types of key factor data are collected to complete characteristic analysis and feature extraction, forming the data set of key influencing factors. Secondly, the data are preprocessed. The missing data are supplemented, and the abnormal data are eliminated and smoothed to ensure the authenticity and reliability of the feature database. Then, the traditional LSTM is optimized in three aspects, namely, multilevel convolution channel, attention learning reinforcement, and lookahead optimizer, to improve the prediction accuracy and convergence speed of the algorithm. Finally, a microgrid data set collected in Zhejiang Province, China, is used for simulation, and the load forecasting is completed. The accuracy and reliability of the new algorithm are verified by comparison with the traditional neural network.

Analysis of Influencing Factors and Data Processing
The output of microgrid load is affected by many aspects, such as climate, temperature, economic level, time, and special events. The change of the load value shows randomness; however, it also has a certain periodicity and regularity [13]. In this chapter, the load output of several typical microgrids is studied, and the possible load influencing factors are analyzed to find the key influencing elements. The key data are preprocessed to improve the quality of the input data of the neural network and ensure the accuracy of microgrid load forecasting.

Analysis of Key Influencing Factors.
Research shows that the load output of a microgrid is affected not only by the structure and characteristics of the system itself but also by factors outside the system [14,15]. From the economic aspect, the gross domestic product (GDP) is closely related to the load output. From an industrial point of view, the proportion of power consumption in regional factories has a great impact on the load. In terms of meteorological factors, extreme weather has a large influence on the load output. In terms of time, the month, season, and holidays all affect the load. From the marketing angle, peak and valley power acts on the load. From the microgrid structure, the proportion of new energy also plays a role in the load. In order to reveal the influence of the different kinds of factors on the power grid output, this paper analyzes the characteristics of all factors.
The influencing factors of the load are numerous; as a result, the amount of data is huge. In order to select key influencing factors conveniently and improve the efficiency of load forecasting, the influence weights of the different factors are calculated separately. First, the K-means clustering algorithm is used to cluster the original data and add corresponding labels. Then, a feature selection model is established, and the maximum sum of intervals is taken as the objective function to complete the feature selection. Euclidean distance is adopted for the distance calculation of the clustering algorithm, and the formula is as follows:
$$d(x_m, x_n) = \sqrt{\sum_{i} \left(x_{mi} - x_{ni}\right)^2}$$
where $x_{mi}$ and $x_{ni}$ are the $i$-th components of the data samples $x_m$ and $x_n$ of the different types of influencing factors. Once clustering is complete, the sample data are mapped into a weight space W, and intervals are calculated to reflect the data structure. The larger the interval, the better the data structure. Based on this, the data samples are trained to find the best feature weights. In the interval objective, $\omega$ denotes the weight vector, $N_0(x)$ is the nearest neighbor of the sample $x$ from the same class, and $N_1(x)$ is the nearest neighbor of the sample $x$ from a different class.
After completing the clustering analysis and feature weight calculation, the obtained results are shown in Table 1. In the long run, the price of electricity is the main factor affecting the load, and industrial power consumption is the most frequent factor. The proportion of new energy in this region is relatively high, and the electricity price of new energy also has a certain impact on the load. In the short term, the load output is greatly affected by time. The occurrence of special events also affects the load value of the microgrid.
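The clustering and feature-weighting steps described above can be sketched as follows. This is a minimal pure-Python illustration, not the authors' implementation: the farthest-first initialisation of K-means, the Relief-style margin (distance to the nearest sample of another cluster minus distance to the nearest sample of the same cluster), and the names `kmeans` and `margin_weights` are all assumptions, since the exact objective function is not reproduced in the text.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length samples."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

def kmeans(samples, k, iters=100):
    """Plain K-means. Farthest-first initialisation is an assumption
    chosen here only to keep the sketch deterministic."""
    centers = [list(samples[0])]
    while len(centers) < k:
        farthest = max(samples, key=lambda s: min(euclidean(s, c) for c in centers))
        centers.append(list(farthest))
    labels = [0] * len(samples)
    for _ in range(iters):
        # Assign every sample to its closest center.
        labels = [min(range(k), key=lambda j: euclidean(s, centers[j])) for s in samples]
        new_centers = []
        for j in range(k):
            members = [s for s, lab in zip(samples, labels) if lab == j]
            if members:
                new_centers.append([sum(col) / len(members) for col in zip(*members)])
            else:
                new_centers.append(centers[j])  # keep an empty cluster's center
        if new_centers == centers:
            break
        centers = new_centers
    return labels, centers

def margin_weights(samples, labels):
    """Relief-style feature weights (assumed form): a feature gains weight
    when it separates a sample from its nearest other-cluster neighbor more
    than from its nearest same-cluster neighbor."""
    dim = len(samples[0])
    w = [0.0] * dim
    for i, x in enumerate(samples):
        same = [s for j, s in enumerate(samples) if j != i and labels[j] == labels[i]]
        other = [s for j, s in enumerate(samples) if labels[j] != labels[i]]
        if not same or not other:
            continue
        hit = min(same, key=lambda s: euclidean(x, s))
        miss = min(other, key=lambda s: euclidean(x, s))
        for d in range(dim):
            w[d] += abs(x[d] - miss[d]) - abs(x[d] - hit[d])
    w = [max(v, 0.0) for v in w]          # negative margins carry no weight
    total = sum(w)
    return [v / total for v in w] if total > 0 else w

# Toy data: feature 0 separates the two groups, feature 1 does not.
samples = [[0.0, 0.0], [0.0, 1.0], [10.0, 0.0], [10.0, 1.0]]
labels, centers = kmeans(samples, k=2)
weights = margin_weights(samples, labels)
```

In this toy example all of the weight falls on the first feature, which is the behaviour one would want from a key-factor ranking: factors that do not help separate the clusters receive little or no weight.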

Data Processing.
The analysis above shows that many factors affect the load prediction of a microgrid. However, the different types of collected data differ greatly in scale, which makes it impossible to compare them and train on them directly and easily leads to neuron saturation. Therefore, all data must be normalized to the same interval to facilitate statistical calculation [16]. In the normalization formula, $x_i^*$ is the data after normalization, $x_i$ is the data before processing, and $\bar{x}$ is the average of the initial data.
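As an illustration of the normalization step, the sketch below applies mean normalization to one factor series. The choice of denominator (the range, maximum minus minimum) is an assumption, because the text only states that the mean of the initial data is used; the function name `mean_normalize` is likewise hypothetical.

```python
def mean_normalize(values):
    """Centre a series on its mean and scale by its range.
    Scaling by (max - min) is an assumed choice, not taken from the paper."""
    mean = sum(values) / len(values)
    span = max(values) - min(values)
    if span == 0:
        return [0.0 for _ in values]      # constant series carries no information
    return [(v - mean) / span for v in values]

temperature = [10.0, 20.0, 30.0]          # toy values
scaled = mean_normalize(temperature)
```

After this step every factor lies in a comparable interval around zero, so factors measured in large units no longer dominate the network input.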
Because of the huge amount of data acquired in the load forecasting of a microgrid, data loss is frequently caused by equipment failure or communication errors. Missing data adversely affect the training quality of the forecasting model. In order to ensure the continuity and stability of the load data, the missing data must be completed. To ensure the repair quality, the data are supplemented by a weighted processing method. In the weighting formula, $y_i$ represents the missing data to be added, $y_{i-1}$ and $y_{i+1}$ are the data at the adjacent times, $y_i^*$ indicates the similar data, and $\lambda_1$ to $\lambda_3$ are the weights.
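The weighted completion of missing values can be sketched as below. The sketch assumes that the missing value is a weighted sum of the two adjacent values and a similar reference value (for example, the same time on a similar day); the weights `(0.4, 0.4, 0.2)` and the fallback to the reference value at the series boundaries are illustrative assumptions.

```python
def fill_missing(series, similar, weights=(0.4, 0.4, 0.2)):
    """Fill None entries with l1*y[i-1] + l2*y[i+1] + l3*similar[i].
    `similar` is a reference series of the same length (an assumption)."""
    l1, l2, l3 = weights
    filled = list(series)
    for i, value in enumerate(series):
        if value is not None:
            continue
        prev_v = series[i - 1] if i > 0 else None
        next_v = series[i + 1] if i + 1 < len(series) else None
        if prev_v is None or next_v is None:
            filled[i] = similar[i]        # no usable neighbors: use the reference
        else:
            filled[i] = l1 * prev_v + l2 * next_v + l3 * similar[i]
    return filled

load = [100.0, None, 120.0]               # toy load series with one gap
reference = [98.0, 105.0, 118.0]          # toy "similar day" series
repaired = fill_missing(load, reference)
```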
In addition to missing data, abnormal data may occur in data acquisition due to interference signal and other reasons. To ensure the prediction quality, it is necessary to eliminate and smooth the abnormal data. In this paper, the method of deviation identification is used to identify the abnormal data. When the identification is completed, the abnormal data are removed, and the missing data will be filled in.
In the anomaly identification formula, $E_x$ denotes the expectation of the data $x$, $D_x$ indicates the variance of the data $x$, and $\varepsilon$ is the preset identification threshold.
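A deviation-based identification rule of this kind can be sketched as follows. The exact criterion is not reproduced in the text, so the rule used here, flagging a point when its distance from the mean exceeds the threshold times the standard deviation, is an assumption, as are the function names and the default threshold of 2.

```python
import math

def flag_anomalies(values, eps=2.0):
    """Flag x when |x - E_x| > eps * sqrt(D_x) (assumed form of the rule)."""
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    std = math.sqrt(variance)
    return [std > 0 and abs(v - mean) > eps * std for v in values]

def remove_anomalies(values, eps=2.0):
    """Replace flagged points with None so they can be refilled afterwards."""
    flags = flag_anomalies(values, eps)
    return [None if bad else v for v, bad in zip(values, flags)]

readings = [10.0] * 9 + [100.0]           # toy series with one spike
cleaned = remove_anomalies(readings)
```

The removed points become gaps, which matches the procedure in the text: abnormal data are removed first and then treated as missing data to be filled.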

Improvement of LSTM Algorithm
In order to improve the accuracy and convergence speed of load forecasting algorithm, LSTM is improved in this paper. Traditional algorithms are optimized from three aspects: multiconvolution channel cooperative optimization, attention learning enhancement, and lookahead optimizer optimization.

Cooperative Optimization.
Traditional load forecasting algorithms use a single channel to perform convolution on the data. However, a single-channel prediction model is easily influenced by external noise, so it places strict requirements on the reliability of data acquisition. Especially when the input types of the load forecasting model are quite different, a single channel cannot meet the high prediction accuracy required by a microgrid. Therefore, in this paper, a multiconvolution channel structure is constructed, and a cooperative neural network prediction model is adopted. During data training and load forecasting, data are exchanged and shared between channels to establish a stable reconstruction loss. The accuracy of model prediction is improved by cross-training and cooperative training. The structure of cooperative prediction is shown in Figure 1.

Optimization of Attention.
Research shows that the characteristic signals in the key data feature set are easily submerged in a large amount of meaningless redundant data, which makes forecasting difficult and the workload huge. The attention mechanism is used to optimize the prediction model. By weighting the output of the LSTM, the important features are highlighted, and the prediction efficiency is well guaranteed.
The attention mechanism is in essence a weighting function. By integrating the information of the multidimensional feature vectors $f_i$, a new feature vector $f_i^*$ can be obtained. This feature vector reflects the critical information of a plurality of features. In the weighting formula, $\alpha_i$ stands for the weight, which is also the key information in attention optimization. The attention weight is mainly obtained through the scoring function $g$. Each feature is scored, with a higher score indicating that the feature is more important. After scoring, the final weight is calculated using the SoftMax function. In this paper, an attention optimization layer is added between the LSTM layer and the fully connected layer; in its expression, $W_\alpha$ denotes the attention weight matrix, $b$ is the bias term, and $\sigma$ indicates the sigmoid nonlinear function.
With the attention optimization mechanism, the data processing of the prediction model can be focused on the key features and ignore the irrelevant features, thus ensuring the prediction efficiency and accuracy.
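The attention weighting can be illustrated with the short sketch below. It assumes a scoring function of the form g(f) = sigmoid(w · f + b), a SoftMax over the scores, and a weighted sum of the feature vectors; treating the attention weights as a single vector `w_alpha` (rather than a full matrix) and the name `attention_pool` are simplifications made for the example.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def attention_pool(features, w_alpha, bias):
    """Score each feature vector, normalise the scores with SoftMax,
    and return the attention weights and the weighted sum of features."""
    scores = [sigmoid(sum(w * f for w, f in zip(w_alpha, vec)) + bias)
              for vec in features]
    top = max(scores)                      # subtract max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    alphas = [e / total for e in exps]
    dim = len(features[0])
    pooled = [sum(a * vec[d] for a, vec in zip(alphas, features))
              for d in range(dim)]
    return alphas, pooled

# Toy LSTM outputs: two time steps, two hidden units each.
lstm_outputs = [[1.0, 0.0], [0.0, 1.0]]
alphas, pooled = attention_pool(lstm_outputs, w_alpha=[2.0, 0.0], bias=0.0)
```

Here the first time step receives the larger weight because its score is higher, so its features dominate the pooled vector passed on to the fully connected layer.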

Lookahead Optimizer.
The weight optimization of LSTM in load forecasting has always been a focus of neural network training. High-quality optimization measures can guarantee the convergence speed and stability of the model. In this paper, a forward-looking optimizer called lookahead is used, which determines the search direction by looking ahead at a sequence of fast weights generated by another optimizer. Combined with the conventional Adam optimizer, it can improve the fitting speed and generalization ability and guarantee the robustness of self-learning. In the fast weight part, the conventional Adam optimizer updates the model k times and saves the resulting weight sequence. In the fast weight update expression, $A_d$ is the Adam optimizer function, $L$ represents the model objective function, $\theta$ is the updated weight, and $x$ is the input data. The slow weight part updates the weights of the training model through an exponentially weighted average. In the slow weight formula, $\omega_i$ denotes the model weight, and $\alpha$ is the learning rate.
In the initialization phase, the initial value of training is $\theta_{j+1,0} = \omega_{j+1}$. The working steps of the lookahead optimizer are shown in Figure 2.
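The interplay of fast and slow weights can be sketched in a few lines. The sketch follows the commonly published form of lookahead: the fast weights start from the slow weights, take k Adam steps, and the slow weights then move a fraction α toward the final fast weights. The toy quadratic objective, the hyperparameter values, and the class and function names are assumptions for illustration only.

```python
import math

class Adam:
    """Minimal Adam update for a list of floats (standard formulation)."""
    def __init__(self, dim, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8):
        self.lr, self.beta1, self.beta2, self.eps = lr, beta1, beta2, eps
        self.m = [0.0] * dim
        self.v = [0.0] * dim
        self.t = 0

    def step(self, theta, grad):
        self.t += 1
        updated = []
        for d, (p, g) in enumerate(zip(theta, grad)):
            self.m[d] = self.beta1 * self.m[d] + (1 - self.beta1) * g
            self.v[d] = self.beta2 * self.v[d] + (1 - self.beta2) * g * g
            m_hat = self.m[d] / (1 - self.beta1 ** self.t)
            v_hat = self.v[d] / (1 - self.beta2 ** self.t)
            updated.append(p - self.lr * m_hat / (math.sqrt(v_hat) + self.eps))
        return updated

def lookahead(grad_fn, omega, k=5, alpha=0.5, outer_steps=200, lr=0.1):
    """Slow weights `omega` follow the fast weights `theta` produced by Adam."""
    inner = Adam(len(omega), lr=lr)
    omega = list(omega)
    for _ in range(outer_steps):
        theta = list(omega)                       # theta_{j,0} = omega_j
        for _ in range(k):                        # k fast-weight updates
            theta = inner.step(theta, grad_fn(theta))
        # Slow-weight update: move a fraction alpha toward the fast weights.
        omega = [w + alpha * (t - w) for w, t in zip(omega, theta)]
    return omega

def toy_gradient(theta):
    """Gradient of the toy objective (t0 - 3)^2 + (t1 + 1)^2."""
    return [2.0 * (theta[0] - 3.0), 2.0 * (theta[1] + 1.0)]

solution = lookahead(toy_gradient, [0.0, 0.0])
```

On this toy problem the slow weights settle near the minimizer (3, -1). The interpolation step damps the oscillations of the inner optimizer, which is the stabilizing effect the text attributes to lookahead.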

The Basic Principles of CNN.
In order to fully mine the deep features of the input data, a convolutional neural network is used in this paper to extract the original feature information effectively [17,18]. Convolutional neural networks reduce the number of parameters inside the network through special connections between adjacent layers. A later convolution layer extracts the data features of the previous convolution layer and converts the feature information from low level to high level. The network mainly consists of the convolution layer, pooling layer, activation layer, and fully connected layer.
The convolution layer is the most important part of a convolutional neural network; it uses filters to obtain the characteristic information of the previous layer's data. Generally, after the convolution layer, the pooling layer resamples the data and obtains new feature information by using the correlation characteristics of the input data. Its main function is to reduce the dimension of the input data and improve the efficiency of information processing. The activation layer improves the convergence speed through the activation function and avoids the vanishing gradient problem. The fully connected layer is usually located at the end of the network; it integrates and outputs the data to obtain the feature vector reflecting the characteristics of the data.

The Basic Principles of LSTM.
The LSTM neural network is a type of recurrent neural network (RNN) [19]. It solves the problem of long-term dependence through the internal gating mechanism of the neural unit. Its basic internal structure is shown in Figure 3. The cells are connected recurrently to form a traditional RNN structure, and each internal unit adopts the gating mechanism at the core of LSTM.
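For readers unfamiliar with the gating mechanism, one time step of a standard LSTM cell can be sketched as follows. The equations are the textbook formulation (forget, input, and output gates plus a candidate state); the parameter layout in the `params` dictionary and the function name `lstm_step` are conventions chosen for this example, not details taken from the paper.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def lstm_step(x, h_prev, c_prev, params):
    """One LSTM time step.
    params maps a gate name to (W, U, b): W is H x D (input weights),
    U is H x H (recurrent weights), b has length H."""
    def affine(name):
        W, U, b = params[name]
        return [sum(w * xi for w, xi in zip(W[r], x))
                + sum(u * hi for u, hi in zip(U[r], h_prev))
                + b[r]
                for r in range(len(b))]

    f = [sigmoid(z) for z in affine("forget")]        # what to keep from c_prev
    i = [sigmoid(z) for z in affine("input")]         # how much new content to write
    o = [sigmoid(z) for z in affine("output")]        # how much of the cell to expose
    g = [math.tanh(z) for z in affine("candidate")]   # proposed new content
    c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c_prev, i, g)]
    h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]
    return h, c

# Toy cell with one input and one hidden unit; all parameters are zero,
# so every gate evaluates to 0.5 and the candidate to 0.
zero_gate = ([[0.0]], [[0.0]], [0.0])
params = {name: zero_gate for name in ("forget", "input", "output", "candidate")}
h, c = lstm_step([1.0], [0.0], [2.0], params)
```

With all-zero parameters the cell state is simply halved by the forget gate, which shows how the memory unit carries information forward independently of the current input.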

Improved LSTM Algorithm.
The improved LSTM neural network proposed in this paper takes the data set of the critical factors affecting the load output of the microgrid as the network input. Feature extraction is carried out through a two-layer convolutional neural network channel, and the high-value information contained in the data is further mined by cross-training between the two convolution layers. The output features of the convolutional neural network are then used as inputs to the LSTM, whose three gates and memory units further screen the features. At the same time, the lookahead optimizer is used to optimize the convolutional neural network in reverse. Finally, the output of the LSTM is weighted by the AM, and the load power is predicted. The detailed model structure is shown in Figure 4.
The structural design of the improved LSTM is related to the input data setting, the design of the convolutional kernel, and the choice of the activation function. The specific design of the proposed scheme is presented below in terms of these three aspects.
The data input selected in this scheme is a multidimensional data matrix. After the data selection is completed, the critical data with the highest relevance are used as the input to the prediction model. Different types of data have different sensitivities to time scales: some data characteristics need longer time periods to be reflected in the data, while others need shorter ones. It is therefore not appropriate to use the same time window for all data extraction. For the different types of data, three time resolutions (high, medium, and low) are used for data collection to ensure the accuracy and efficiency of power prediction.
The traditional convolutional kernel design is prone to inconsistent weight matching factors. This forces the convolution kernel to identify the data during feature extraction, which increases the difficulty of feature extraction.
In this paper, the problem of confused weight matching is avoided by setting the lateral size of the convolutional kernel equal to the number of critical factors, which fixes the convolution direction. To ensure the nonlinear approximation ability of the neural network and its ability to suppress useless signals, the ReLU function is used as the activation function. The function expression is as follows:
$$f(x) = \max(0, x)$$
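The kernel design described above, in which the kernel is as wide as the number of critical factors and therefore slides along the time axis only, can be illustrated with a small sketch. The window layout (rows are time steps, columns are factors), the toy values, and the function names are assumptions made for the example.

```python
def relu(x):
    """ReLU activation: passes positive values, suppresses the rest."""
    return max(0.0, x)

def conv_over_time(window, kernel, bias=0.0):
    """Valid convolution of a T x F window with a K x F kernel.
    Because the kernel spans all F factors, it moves along time only."""
    steps, height, width = len(window), len(kernel), len(kernel[0])
    if any(len(row) != width for row in window):
        raise ValueError("kernel width must equal the number of factors")
    outputs = []
    for t in range(steps - height + 1):
        total = bias
        for i in range(height):
            for j in range(width):
                total += kernel[i][j] * window[t + i][j]
        outputs.append(relu(total))
    return outputs

# Toy window: 3 time steps, 2 factors.
window = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
features = conv_over_time(window, kernel=[[1.0, 0.0], [0.0, 1.0]])
suppressed = conv_over_time(window, kernel=[[-1.0, 0.0], [0.0, 0.0]])
```

Each kernel column always meets the same factor, so a learned weight never has to be matched to a different factor as the kernel moves, which is the weight-matching problem the design is meant to avoid.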

Experimental Results and Analysis
In this paper, the microgrid load and related data sets of Zhejiang Province, China, over the past 5 years are used for model training and simulation verification. From them, 50,000 samples are selected for the experiments, 70% of which are used as the training set and 30% as the verification set. Python is used as the programming language.
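The 70/30 division can be expressed in a few lines. The sketch assumes that the samples are kept in chronological order and split without shuffling, which is common for time series but is not stated in the text; the function name is hypothetical.

```python
def split_samples(samples, train_fraction=0.7):
    """Split an ordered sample list into training and verification parts."""
    cut = round(len(samples) * train_fraction)
    return samples[:cut], samples[cut:]

samples = list(range(50000))              # stand-in for the 50,000 samples
train_set, verification_set = split_samples(samples)
```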

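Evaluation Index.
Mean Absolute Percentage Error (MAPE) and Root Mean Square Error (RMSE) are used to measure the prediction error in order to quantitatively evaluate the accuracy of the improved LSTM model. The formulas are as follows:
$$\mathrm{MAPE} = \frac{1}{n}\sum_{i=1}^{n}\left|\frac{z_i^* - z_i}{z_i}\right| \times 100\%$$
$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(z_i^* - z_i\right)^2}$$
where $z_i^*$ indicates the forecast data, $z_i$ indicates the real data, and $n$ is the number of samples.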
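Both indices are straightforward to compute. The sketch below uses the standard definitions of MAPE and RMSE; the function names and the toy load values are illustrative.

```python
import math

def mape(actual, predicted):
    """Mean absolute percentage error, in percent. Actual values must be nonzero."""
    return 100.0 * sum(abs((p - a) / a) for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Root mean square error, in the unit of the data."""
    return math.sqrt(sum((p - a) ** 2 for a, p in zip(actual, predicted)) / len(actual))

real_load = [100.0, 200.0]                # toy values in kW
forecast = [110.0, 190.0]
error_percent = mape(real_load, forecast)
error_kw = rmse(real_load, forecast)
```

MAPE expresses the error relative to the load level, whereas RMSE keeps the physical unit and penalizes large deviations more strongly, so the two indices complement each other.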

Identification of Characteristics.
It can be seen from Table 1 that there are 14 factors influencing the load, and their critical weights are different. As the input of the neural network, the data of the key influencing factors directly affect the accuracy and speed of load forecasting. Too little data leads to poor prediction accuracy, while too much data slows down the calculation. Therefore, the number of influencing factors must be selected reasonably.
Firstly, the data of all influencing factors of the load are taken as the model input for load forecasting, and the prediction precision and optimal convergence speed are calculated. The factors are then deleted one by one in order of critical weight from low to high, and the prediction accuracy and optimal convergence speed are recalculated each time. The conclusions obtained are shown in Figure 5.
It can be seen that when the number of influencing factors is more than 7, the prediction error is at its minimum, and the prediction accuracy is the highest. The convergence speed is fastest at 8 factors, where only about 0.56 s of running time is needed to complete an accurate prediction. Therefore, the 8 kinds of data with the highest critical weights are selected as the input of the model, and the improved neural network with 3 layers is trained for 250 epochs.
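The elimination procedure can be sketched as a simple loop. The `evaluate` callback stands in for training and timing the real model and is entirely hypothetical, as are the toy factor weights; only the order of elimination (lowest critical weight first) follows the text.

```python
def backward_elimination(weights, evaluate):
    """Drop factors one by one, lowest weight first, and record the
    error and running time reported by `evaluate` for each factor set."""
    ranked = sorted(weights, key=weights.get, reverse=True)
    trials = []
    for count in range(len(ranked), 0, -1):
        kept = ranked[:count]
        error, seconds = evaluate(kept)
        trials.append({"count": count, "factors": kept,
                       "error": error, "seconds": seconds})
    return trials

# Toy critical weights (not the values of Table 1).
toy_weights = {"electricity price": 0.30, "time of day": 0.25,
               "temperature": 0.20, "GDP": 0.05}

def toy_evaluate(factors):
    """Hypothetical stand-in for training and timing the forecasting model."""
    return 1.0 / len(factors), 0.1 * len(factors)

trials = backward_elimination(toy_weights, toy_evaluate)
```

Plotting the recorded error and running time against the factor count gives curves of the kind summarized in Figure 5, from which the trade-off between accuracy and speed can be read.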


Predictive Performance Test.
In order to verify the prediction accuracy of the proposed algorithm, the prediction performance is tested after the neural network training is fully completed. Eight groups of data under different conditions are selected for load prediction, and the prediction error is calculated for each. Then, load prediction is completed using the traditional LSTM load prediction algorithm for comparison.
The power error comparison is shown in Table 2. It can be seen that the basic error of the improved LSTM algorithm is within 4% and about 1 kW. Experiments show that this algorithm can complete the task of microgrid load forecasting. Compared with the traditional LSTM, the accuracy is greatly improved.
One hundred power prediction experiments are performed using the same data, and the average prediction error is calculated separately for the different scenarios.
The results are shown in Figure 6, which gives the daily comparison of load power forecasts. It can be seen that the load shows a certain daily cycle. It is lower at night and higher during the day, with a certain peak-to-valley characteristic. The experiment proves that the algorithm can accomplish the task of microgrid load forecasting. Compared with the traditional LSTM, the prediction accuracy is improved.

Optimization Performance Test.
The optimization speed of a load forecasting model is a key index for evaluating forecasting performance. Achieving higher test accuracy with fewer iterations is a necessary condition for an excellent prediction model. The improved LSTM method is compared with the traditional LSTM method and other load forecasting algorithms.
The test performance obtained is shown in Figure 7.
It can be seen from the figure that the improved LSTM method converges faster than the traditional method. It also has certain advantages over other methods: the proposed method achieves its best performance after about 250 iterations, and the prediction error is generally less than 4%.

Comparison with Other Schemes.
In order to further evaluate its forecasting performance, the proposed scheme is compared with other load forecasting methods, namely, CNN-LSTM, Bi-LSTM, and GAN-LSTM. In CNN-LSTM, the features extracted from the power signal by a CNN are used as the input of the LSTM, and the power prediction is completed after the important features are captured. Three convolutional layers are used, and the number of LSTM neurons is 64. Bi-LSTM is a bidirectional LSTM that considers feature selection. Accurate power prediction is achieved by exchanging information in forward and reverse order; the number of network layers is 3, and the training period is 250 rounds. GAN-LSTM expands the number of samples and improves the training quality through data generation. Two convolutional layers with 3 × 3 kernels and 64 feature maps are used, followed by batch-normalization layers with ReLU as the activation function. The comparison results obtained are shown in Table 3.
It can be seen that, compared with other load forecasting algorithms, the improved LSTM algorithm has certain advantages in accuracy and convergence speed. Experiments show that this scheme is feasible in microgrid load forecasting.

Conclusion
An improved LSTM algorithm for load power prediction of microgrids is proposed in this paper. Firstly, the analysis of influencing factors and the data processing are completed, and a high-quality model input database is established. Secondly, the traditional LSTM load forecasting algorithm is optimized in three aspects, namely, multilayer convolutional neural network optimization, lookahead optimizer, and attention mechanism optimization, to improve prediction accuracy, generalization ability, and convergence speed. Finally, the validity and reliability of the algorithm are demonstrated by simulation experiments. The main contributions of this paper can be summarized as follows:
(1) A comprehensive analysis of the load influencing factors is carried out, and the key influencing factors are identified by cluster analysis and characteristic weight calculation. Then, the collected data are preprocessed to ensure the quality of the input data.
(2) The multilayer convolution channel is used to extract features, and the deep mining of the features is completed by means of data exchange and cross-training between channels of different layers. The lookahead optimizer is also adopted to optimize the forward-looking weights, improving the fitting speed and generalization ability of the model and ensuring robustness. The output is weighted by the attention mechanism (AM) to further ensure the accuracy of prediction.
(3) Taking a microgrid model in Zhejiang Province as an example, simulation experiments are carried out. Experiments show that the 8 key factors can be used as the model input to guarantee the prediction speed and accuracy. Compared with the traditional LSTM algorithm and other algorithms, the proposed scheme has higher prediction accuracy and faster convergence.

Data Availability
The data used to support the findings of this study were supplied by YUQING ZHENG under license and so cannot be made freely available. Requests for access to these data should be made to YUQING ZHENG (zhengyq@cceecexpo.org).

Conflicts of Interest
The authors declare that they have no conflicts of interest.

Acknowledgments
This work was supported by the Project of Ningbo Polytechnic in 2022 "Basic research on renewable energy power consumption based on big data" (NZ22003).
Journal of Electrical and Computer Engineering