Explanatory Optimization of the Prediction Model for Building Energy Consumption

Traditional prediction models, which are based on artificial neural networks (ANNs), consider the various factors affecting building energy consumption comprehensively. However, their explanatory power is not ideal in actual application, resulting in prediction errors of building energy consumption. Thus, this paper pursues the explanatory optimization of the prediction model for building energy consumption. First, the authors displayed the architecture of the prediction model for building energy consumption, which is based on the temporal pattern attention mechanism (TPAM), and explained the principle of predicting building energy consumption. Then, the input of the TPAM was illustrated, and the execution steps of the model were depicted. Based on feature importance and the Shapley additive explanations (SHAP) method, the explanatory power of the proposed prediction model was analyzed, from the perspective of the time series features of building energy consumption prediction. The proposed model was proved effective through experiments.


Introduction
With the rapid development of the economy, global energy consumption has increased year by year. Meanwhile, buildings have become larger in scale and higher in grade, taking up a growing share of global energy consumption. Therefore, it is practically significant to reduce the energy consumption, lower the emissions, and predict the energy consumption of large buildings [1][2][3][4][5][6][7]. For models with low data requirements, the trend of the time series for building energy consumption prediction can be obtained through simple statistical analysis of time nodes. Then, it is possible to make the final prediction of building energy consumption [8][9][10][11][12][13][14][15][16][17]. Traditional prediction models, which are based on artificial neural networks (ANNs), consider the various factors affecting building energy consumption comprehensively. However, their explanatory power is not ideal in actual application, resulting in prediction errors of building energy consumption.
This significantly limits the applicable range of the models. As a result, it is necessary to develop models that can explain building energy consumption [18][19][20][21].
Nearly 40% of global carbon emissions come from the building industry, which therefore has great potential to help meet climate targets. Aiming to enhance building energy efficiency, the energy performance certificate requires accurate prediction of building energy performance. With the significant improvement of information and communication technology, the data-driven method has been introduced to study building energy performance and has proved to offer high computing efficiency and predictive performance. Nevertheless, most studies focus on predictive performance, without considering the potential of explainable artificial intelligence (AI). To fill the gap, Wenninger et al. [6] designed the novel QLattice algorithm, which considers both predictive performance and explanatory power. The algorithm was applied to forecast the annual energy-saving performance on a dataset containing more than 25,000 German residential buildings. Lei and Yin [22] improved the standard backpropagation (BP) neural network with the Levenberg-Marquardt (LM) algorithm and built an LMBP-based prediction model for the lighting energy consumption of high-rise buildings. The traditional prediction models for building energy consumption in shopping malls face limitations in the type and number of input variables. To overcome the limitations, Jing et al. [23] proposed a prediction model for building energy consumption in shopping malls based on chaos theory. The first step of the model is to compute the Lyapunov exponent of the energy consumption, which proves the chaotic nature of the energy consumption. As an important process in sustainable building design, the evaluation of building energy-saving performance has a great impact on global energy reduction and environmental protection. Li et al. [24] developed an efficient way to evaluate building energy consumption. Their strategy integrates building information modeling (BIM), energy consumption simulation, and energy consumption prediction.
The accurate prediction of building energy consumption is of great significance to the building energy management system. Nonetheless, building energy consumption is difficult to predict because the relevant data are often nonlinear and non-smooth. Karijadi and Chou [25] combined random forest (RF) and long short-term memory (LSTM) to predict building energy consumption, based on the complete ensemble empirical mode decomposition with adaptive noise (CEEMDAN).
Many studies have successfully predicted building energy consumption. Yet, most of them concentrate on improving prediction accuracy, summing up the key factors affecting building energy consumption, and tracking the research trend of building energy prediction. The prediction effect of ensemble models depends on the basic learner and the ensemble strategy, and it is impossible to demonstrate whether the model prediction is trustworthy or not. Thus, the prediction results often trigger controversy. Because the building energy prediction models lack explanatory power, the users who need to forecast building energy consumption no longer trust the prediction results, and the models are not frequently used in actual scenarios. Therefore, this paper pursues the explanatory optimization of the prediction model for building energy consumption. Section 2 displays the architecture of the prediction model for building energy consumption, which is based on the temporal pattern attention mechanism (TPAM), and explains the principle of predicting building energy consumption. Section 3 illustrates the input of the TPAM and details the execution steps of the model. Section 4 analyzes the explanatory power of the proposed prediction model based on feature importance and the Shapley additive explanations (SHAP) method, from the perspective of the time series features of building energy consumption prediction. Finally, Section 5 verifies the effectiveness of the proposed model through experiments.

Prediction Principle of Building Energy Consumption
In traditional prediction models of building energy consumption, information may be lost in the connection vector between the encoder and the decoder. To solve the problem, this paper introduces the TPAM to the existing prediction framework. The new prediction model can make better predictions of building energy consumption when handling input time series that involve multiple variables. Figure 1 illustrates the architecture of the TPAM-based prediction model for building energy consumption.
Compared with the traditional attention mechanism, the TPAM can effectively handle the multivariate input time series for the forecast of building energy consumption. This mechanism can mine more deep-seated information from the multivariate input time series, according to the weights of specific variables and the acquired local features. It can also identify the lost position and other key information. The TPAM mainly encompasses two parts: a time detection module based on a convolutional neural network (CNN) and an attention weighting module. The workflow of the TPAM is as follows. In the prediction model, the bidirectional gated recurrent unit (BiGRU) layer computes the n-dimensional hidden state $f_i$ of each time step and then derives the hidden state matrix $F = \{f_{e-q}, f_{e-q+1}, \ldots, f_{e-1}\}$, where $q$ is the length of the sliding window. For the time series for building energy prediction, the states of each variable at all time steps are represented by a row vector of the hidden state matrix $F$, and the states of all variables at each time step are characterized by a column vector of that matrix. Then, a one-dimensional (1D) CNN is called to extract the variable time pattern. Formula (1) performs a convolution operation $D$ with $l$ filters and a kernel size of $1 \times E$ over the row vectors of $F$, producing the time pattern matrix $F^D$ corresponding to each variable. Let $F^D_{ij}$ be the calculation result of the i-th row vector and the j-th kernel; $F^D_i$ be the i-th row of matrix $F^D$; $Q_x$ be the trainable weight matrix; $f_e$ be the hidden state outputted by the BiGRU layer. The variable time pattern is evaluated from these quantities. To obtain multiple variables that facilitate the prediction of building energy consumption, the TPAM determines the attention weights with the sigmoid function.
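The TPAM workflow described in this section can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the scoring function is assumed to be the bilinear form $(F^D_i)^\top Q_x f_e$ of the standard temporal pattern attention formulation, each filter is assumed to span the whole window (so convolving one row with one filter reduces to a dot product), and the function name `temporal_pattern_attention` and all array shapes are hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def temporal_pattern_attention(F, f_e, D, Q_x):
    """Sketch of temporal pattern attention (assumed formulation).

    F   : (n, q) hidden-state matrix, one row per state dimension,
          one column per time step in the sliding window.
    f_e : (n,)   current hidden state from the recurrent layer.
    D   : (l, q) bank of l one-dimensional filters, each as long as the window.
    Q_x : (l, n) trainable weight matrix of the scoring function.
    """
    # F_D[i, j] = response of row i of F to filter j (a dot product,
    # because the filter covers the entire window).
    F_D = F @ D.T                      # shape (n, l)
    # Assumed bilinear score between each row of F_D and f_e.
    scores = F_D @ (Q_x @ f_e)         # shape (n,)
    # Sigmoid instead of softmax, so several rows can be selected at once.
    beta = sigmoid(scores)             # shape (n,)
    # Context vector: weighted sum of the rows of F_D.
    u_e = beta @ F_D                   # shape (l,)
    return beta, u_e

# Toy usage with random values (dimensions are arbitrary).
rng = np.random.default_rng(0)
n, q, l = 4, 6, 3
F = rng.normal(size=(n, q))
f_e = rng.normal(size=n)
D = rng.normal(size=(l, q))
Q_x = rng.normal(size=(l, n))
beta, u_e = temporal_pattern_attention(F, f_e, D, Q_x)
print(beta.shape, u_e.shape)
```

Using the sigmoid rather than a softmax means the weights need not sum to one, which is what lets more than one variable receive a high weight at the same time.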
The attention weight $\beta_i$ of the i-th row is obtained with the sigmoid function. Finally, the context vector $u_e$ is obtained as the weighted sum of the row vectors of $F^D$.
The GA optimization of the BP neural network intends to improve the initial weights and thresholds of the network. However, the traditional GA has difficulty in optimizing the structural parameters of the neural network. The main difficulty lies in the variation of chromosome length with the number of hidden layers. To solve this problem, the GA is improved in this research. The improved GA adopts single-point crossover and polynomial mutation, in which the perturbation $\xi$ is calculated from the mutation operator $u_l$.
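The two genetic operators named above can be illustrated with a short Python sketch. The source does not preserve its crossover or mutation formulas, so everything here is an assumption: the cut point is taken inside the shorter parent so that parents of different lengths can still be crossed, and the mutation uses the common simplified polynomial mutation with a distribution index `eta`. The function and parameter names are hypothetical.

```python
import random

def single_point_crossover(parent_a, parent_b, rng):
    """Swap the tails of two chromosomes at one cut point.

    Assumption: the cut is drawn inside the shorter parent, so each
    child inherits the length of the parent that supplied its tail.
    """
    limit = min(len(parent_a), len(parent_b))
    cut = rng.randint(1, limit - 1) if limit > 1 else 0
    child_a = parent_a[:cut] + parent_b[cut:]
    child_b = parent_b[:cut] + parent_a[cut:]
    return child_a, child_b

def polynomial_mutation(gene, low, high, eta, rng):
    """Simplified polynomial mutation of one real-valued gene.

    u is a uniform random number in [0, 1); xi is the resulting
    perturbation in [-1, 1], scaled by the gene's range.
    """
    u = rng.random()
    if u < 0.5:
        xi = (2.0 * u) ** (1.0 / (eta + 1.0)) - 1.0
    else:
        xi = 1.0 - (2.0 * (1.0 - u)) ** (1.0 / (eta + 1.0))
    mutated = gene + xi * (high - low)
    # Clip so the gene stays inside its allowed range.
    return min(max(mutated, low), high)

# Toy usage: cross two weight vectors, then mutate one gene.
rng = random.Random(0)
child_a, child_b = single_point_crossover([1, 2, 3, 4], [5, 6, 7, 8], rng)
new_gene = polynomial_mutation(0.2, low=-1.0, high=1.0, eta=20.0, rng=rng)
print(child_a, child_b, new_gene)
```

A larger `eta` concentrates the perturbation near zero, which keeps mutated weights close to their parents.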

Construction of the Energy Consumption Prediction Model
The TPAM-based energy consumption prediction model consists of three parts: BiGRU encoder, TPAM, and BiGRU decoder. Figure 2 illustrates the input of the TPAM. The prediction model is executed in the following steps:
Step 1. Let $a_e$ be the input time series at the current time step; $f_{e-1}$ be the hidden state at the previous time step; $f_e$ be the hidden state at the current time step $e$. After importing $a_e$ and $f_{e-1}$ into the encoder, $f_e$ can be calculated by $f_e = \mathrm{BiGRU}_{\mathrm{encoder}}(a_e, f_{e-1})$.
Step 2. Let $b_{e-1}$ be the previous output series; $r_{e-1}$ be the hidden state of the previous time step; $r_e$ be the hidden state of the current time step. Based on $b_{e-1}$ and $r_{e-1}$, the decoder can compute $r_e$: $r_e = \mathrm{BiGRU}_{\mathrm{decoder}}(b_{e-1}, r_{e-1})$.
Step 3. Let $Q_u$ be the parameter matrix. The context vector $u_e$ and the hidden state $r_e$ are connected through $Q_u$.
Step 4. Let $Q_r$ be the parameter matrix. Then, $b_e$ is solved from the result of Step 3 with the SoftMax function.
The energy consumption of buildings is affected by various factors. The time series inputted to predict building energy consumption can be divided into the historical energy consumptions at the n previous moments, time variables, and weather variables. The objective function of building energy consumption prediction is defined over two groups of inputs. The first is $A = (PE, SF, XO, QR, F, C, Q, N, W)$, with PE, SF, XO, QR, F, C, Q, N, and W being the various input variables, such as wind velocity, humidity, temperature, and sun intensity. The second is the historical energy consumptions at the n previous moments, denoted by $B = (K_e, K_{e-1}, K_{e-2}, K_{e-3}, K_{e-4}, \ldots, K_{e-n})$, with n being the window length of the historical data.
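The windowed input described above can be assembled as in the following sketch. It is only an illustration under stated assumptions: each sample is assumed to pair the n most recent time steps (energy consumption plus the other input variables) with the energy consumption of the next m time steps, and the names `make_windows`, `exog`, `n`, and `m` are hypothetical.

```python
import numpy as np

def make_windows(energy, exog, n, m):
    """Build (input window, future target) pairs from aligned series.

    energy : (T,)   historical energy consumption.
    exog   : (T, k) time and weather variables at the same time steps.
    n      : length of the history window.
    m      : number of future steps to predict.
    Returns X of shape (N, n, 1 + k) and Y of shape (N, m),
    with N = T - n - m + 1.
    """
    energy = np.asarray(energy, dtype=float)
    exog = np.asarray(exog, dtype=float)
    T = len(energy)
    if T < n + m:
        raise ValueError("series is shorter than one window plus horizon")
    # Column 0 holds the energy series, the rest the other variables.
    series = np.column_stack([energy, exog])
    X, Y = [], []
    for e in range(n, T - m + 1):
        X.append(series[e - n:e])   # the n steps before time e
        Y.append(energy[e:e + m])   # the m steps starting at time e
    return np.stack(X), np.stack(Y)

# Toy usage: 10 time steps, 2 exogenous variables.
energy = np.arange(10.0)
exog = np.zeros((10, 2))
X, Y = make_windows(energy, exog, n=3, m=2)
print(X.shape, Y.shape)
```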
The predicted values for the m future moments are denoted as $b^* = (b^*_{e+1}, b^*_{e+2}, b^*_{e+3}, \ldots, b^*_{e+m})$. Figure 3 illustrates the input data of the model. The Adam algorithm can dynamically adjust the learning rate of each parameter, making the update of parameters steadier. In actual applications, this algorithm outperforms most other gradient-based optimizers.
This paper relies on the Adam algorithm to iteratively update the weight and bias of each node of the prediction model, in order to obtain an adaptive learning rate. Let $n_e$ and $m_e$ be the estimates of the first-order and second-order moments of the gradients, respectively; $n_e'$ and $m_e'$ be the corrections for $n_e$ and $m_e$, respectively; $c$ be the learning rate. The parameters are updated from the corrected moments, where $n_e'$ and $m_e'$ can be viewed as unbiased estimates of the expectations. The loss function of the model is the easily solvable mean squared error (MSE): $\mathrm{MSE} = \frac{1}{m}\sum_{i=1}^{m}(b_i - \hat{b}_i)^2$, where $m$ is the number of samples; $b_i$ and $\hat{b}_i$ are the actual value and the predicted value, respectively.
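The Adam update and the MSE loss can be sketched as follows. The source does not preserve its update equations, so this block follows the standard Adam formulation as an assumption: the decay rates `beta1` and `beta2` and the small constant `eps` are not named in the text, while `n_e`, `m_e`, and `c` mirror the symbols used above. The linear toy model is purely illustrative.

```python
import numpy as np

def mse(actual, predicted):
    """Mean squared error between actual and predicted values."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.mean((actual - predicted) ** 2))

def adam_step(theta, grad, n_e, m_e, step, c=0.001,
              beta1=0.9, beta2=0.999, eps=1e-8):
    """One standard Adam update (assumed form).

    n_e, m_e : running first- and second-moment estimates of the gradient.
    step     : 1-based iteration counter used for bias correction.
    c        : learning rate.
    """
    n_e = beta1 * n_e + (1.0 - beta1) * grad
    m_e = beta2 * m_e + (1.0 - beta2) * grad ** 2
    # Bias-corrected moments (the corrections n_e' and m_e').
    n_hat = n_e / (1.0 - beta1 ** step)
    m_hat = m_e / (1.0 - beta2 ** step)
    theta = theta - c * n_hat / (np.sqrt(m_hat) + eps)
    return theta, n_e, m_e

# Toy usage: fit b = 2a + 1 with a linear model trained on the MSE loss.
a = np.linspace(-1.0, 1.0, 50)
b = 2.0 * a + 1.0
A = np.column_stack([a, np.ones_like(a)])
theta = np.zeros(2)
n_e = np.zeros(2)
m_e = np.zeros(2)
for step in range(1, 2001):
    grad = 2.0 * A.T @ (A @ theta - b) / len(b)   # gradient of the MSE
    theta, n_e, m_e = adam_step(theta, grad, n_e, m_e, step, c=0.05)
print(theta, mse(b, A @ theta))
```

Because each parameter's step is divided by the square root of its own second-moment estimate, parameters with large or noisy gradients take proportionally smaller steps.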

Analysis on Explanatory Power
Based on feature importance and the SHAP method, the explanatory power of the proposed prediction model was analyzed, from the perspective of the time series features of building energy consumption prediction.
To disturb each time series for predicting building energy consumption, noise is added to the feature values of each time series. The model performance is measured by the MSE in formula (14). Let $MSE_i$ and $MSE_j$ denote the MSE of the model before and after noise addition, respectively; the importance of feature $l$ is calculated from these two values. The greater the SHAP feature importance, the more important the features of a time series. The SHAP feature importance is calculated from $\Psi_j^{(i)}$, the Shapley value of the j-th feature in the i-th time series sample, over the sample size $m$. The interaction effect between time series features can be measured by the effect of the combinatory features added to a single time series feature. The Shapley interaction index of two time series features $i$ and $j$ is defined for the case $i \neq j$.
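Both importance measures can be illustrated with a short sketch. The exact formulas are not preserved in the source, so two assumptions are made and should be read as such: the noise-based importance is taken as the increase in MSE after noise is added to one feature, and the SHAP feature importance is taken as the mean absolute Shapley value over the m samples, which is the usual convention. The function names and the toy model are hypothetical.

```python
import numpy as np

def noise_importance(predict, X, y, feature, noise_std, rng):
    """Increase in MSE after perturbing one feature with Gaussian noise.

    Assumed definition: MSE after noise minus MSE before noise.
    """
    mse_before = np.mean((predict(X) - y) ** 2)
    X_noisy = X.copy()
    column = X_noisy[..., feature]          # view into the copied array
    column += rng.normal(0.0, noise_std, size=column.shape)
    mse_after = np.mean((predict(X_noisy) - y) ** 2)
    return float(mse_after - mse_before)

def shap_importance(shap_values):
    """Mean absolute Shapley value per feature.

    shap_values : (m, p) array, one row per sample, one column per feature.
    """
    return np.mean(np.abs(np.asarray(shap_values, dtype=float)), axis=0)

# Toy usage: a model that depends on feature 0 and ignores feature 1.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
predict = lambda inputs: 3.0 * inputs[:, 0]
y = predict(X)
print(noise_importance(predict, X, y, feature=0, noise_std=0.5, rng=rng))
print(noise_importance(predict, X, y, feature=1, noise_std=0.5, rng=rng))
print(shap_importance([[1.0, -2.0], [-3.0, 4.0]]))
```

In this toy case the ignored feature receives an importance of exactly zero, because perturbing it cannot change the predictions.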

Experiments and Result Analysis
To highlight the universality of predicting building energy consumption, two different public datasets, one from China and one from abroad, were selected for our experiments. Out of the various factors affecting building energy consumption, the change of the local climate is a leading factor. This paper summarizes the energy consumption of target buildings at different temperatures. The results in Figure 4 show that when the temperature is below 20°C, the mean energy consumption of buildings does not change significantly with the rise of the mean temperature; when the temperature is above 20°C, the mean energy consumption of buildings rises sharply with the growth of the mean temperature, mainly due to the increase of refrigeration loads.
To demonstrate its effectiveness, the proposed prediction model of building energy consumption was compared with recurrent neural network (RNN) (model 1), GRU (model 2), LSTM (model 3), deep Q network (DQN) (model 4), and dueling DQN (model 5). Figure 5 compares the prediction results of the different models. It can be observed that the prediction of our model is closer to the actual value, and more accurate than that of the other models. Besides, the prediction effect of our model is better when the refrigeration load consumes much energy during the working period, as evidenced by the minimum prediction error. Table 1 compares the prediction errors of different models. The models differed significantly in the prediction accuracy on the time series for building energy consumption. Our experiment lasted five months. The results show that our model had a 3-6% lower MAPE and 35-50% lower MSE than the other models. The low MAPE and MSE demonstrate the good generalization and robustness of our model.
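For reference, the two error metrics used in this comparison can be computed as in the sketch below. The numbers are made up purely to exercise the functions and are not taken from Table 1; the helper `relative_reduction` is a hypothetical name for expressing how much lower one model's error is than another's.

```python
import numpy as np

def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be nonzero)."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.mean(np.abs((actual - predicted) / actual)) * 100.0)

def mse(actual, predicted):
    """Mean squared error."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.mean((actual - predicted) ** 2))

def relative_reduction(baseline_error, model_error):
    """Percentage by which model_error is lower than baseline_error."""
    return (baseline_error - model_error) / baseline_error * 100.0

# Illustrative values only (not experimental results).
actual = [100.0, 200.0]
baseline = [120.0, 160.0]
proposed = [110.0, 180.0]
print(mape(actual, proposed), mse(actual, proposed))
print(relative_reduction(mse(actual, baseline), mse(actual, proposed)))
```

MAPE is scale-free, so it is comparable across buildings of different sizes, whereas MSE penalizes large individual errors more heavily.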
As a tool for explaining the prediction of building energy consumption, SHAP can explain the quality of the prediction on a single test sample, and the importance of complex time series features. Figure 6 compares the metrics of different models. It can be seen that our model predicted building energy consumption better than the other models, as evidenced by its superior fitting effect on the time series for building energy consumption.
... other variables are greater than zero, suggesting that they promote the prediction output of our model. In the meantime, the positive effect of the feature values corresponding to the 11 features was smaller than their negative effect. Thus, the output of the time series for building energy consumption is smaller than the mean prediction output. Figure 7 shows the importance of the SHAP features of the 11 variables. It can be observed that the high-ranking variables exert a significant impact on the prediction output of building energy consumption. Among them, the SHAP feature corresponding to $b^*$ was the most important, indicating that $b^*$ contributes the most to the output of the energy consumption prediction model. By contrast, the

Conclusions
This paper optimizes the explanatory power of the prediction model for building energy consumption. Specifically, the authors displayed the architecture of the TPAM-based prediction model for building energy consumption and introduced the principle of predicting building energy consumption. Then, they illustrated the input of the TPAM and explained the execution steps of the model. Based on feature importance and the SHAP method, the explanatory power of the proposed prediction model was analyzed, from the perspective of the time series features of building energy consumption prediction. Through experiments, this paper summarizes the energy consumption of target buildings at different temperatures and identifies the main reasons for the influence of temperature on building energy consumption. To demonstrate its effectiveness, our prediction model for building energy consumption was compared with different models. The comparison of prediction errors proves the effectiveness of our model. In addition, the authors displayed the SHAP values of the features of the 11 variables below the mean prediction output and exhibited the importance of the SHAP features of these variables. Finally, the explanatory power of our model was analyzed.
Due to the limited data volume, it is difficult to build an energy consumption prediction model for each season. In the future, as more data become available, the building energy prediction model will be further improved to achieve a better prediction effect.

Data Availability
The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest
The authors declare that they have no conflicts of interest.