Load Identification Method Based on ISMA-GRNN

Nonintrusive load monitoring (NILM), also called noninvasive load monitoring, carries out load identification after event detection and feature extraction of load data. At present, NILM suffers from low load identification accuracy and long load identification time. To solve these problems, a load identification method based on the improved slime mould algorithm-generalized regression neural network (ISMA-GRNN) is proposed. Firstly, a mutation operation is added to the position update of the slime mould algorithm (SMA) to improve its global optimization ability. Then, the improved slime mould algorithm (ISMA) is used to optimize the smoothing factor of GRNN and find the best smoothing factor. Finally, the best smoothing factor is input into GRNN for load identification, and the load identification results are output. To measure the effect of load identification, load identification precision, load identification accuracy, and load identification time are used as evaluation indicators. The simulation results show that, compared with HHO-GRNN and WOA-GRNN, SMA-GRNN greatly shortens the load identification time, but its identification results are not satisfactory. On the basis of SMA-GRNN, ISMA-GRNN significantly improves the accuracy and precision of load identification. In conclusion, ISMA-GRNN adapts better to load identification in scenes with multiple pieces of electrical equipment.


Literature Review
With the in-depth research on the smart grid and the ubiquitous power Internet of things, NILM, as one of the key technologies for building the ubiquitous power Internet of things, has become a research hotspot [1][2][3]. NILM technology can not only realize energy consumption management by identifying electric equipment on the user side [4], but also predict power demand from load data to realize good interaction between power generation and power consumption [5]. NILM consists of four steps: data measurement, event detection, feature extraction, and load identification. As the core step of noninvasive load monitoring, the load identification method directly affects the effect of load identification [6]. The essence of load identification is load classification according to the load characteristics of electrical equipment [7]. Many algorithms such as SVM, the Bayesian algorithm, and GRNN are used to deal with classification problems. SVM classifies data by constructing a classification hyperplane and is mainly used for small- and medium-sized, nonlinear, and high-dimensional classification problems. Zhao et al. [8] used SVM to build a bearing fault diagnosis model and verified the ability of the SVM method to recognize a variety of rolling bearing fault types. The Bayesian algorithm calculates the probability of different events and selects the one with the greatest probability as the classification result. Jake et al. [9] used the Bayesian algorithm to build a bipolar short circuit fault location model. This method has the advantages of high precision, few measuring points, and high location accuracy. GRNN is an artificial neural network that uses a radial basis function as its activation function. Yao et al. [10] used a clustering algorithm to cluster weather characteristic quantities and GRNN to build a photovoltaic power generation prediction model. This method can handle small-sample usage scenarios and generalizes well, so it achieves high-precision prediction of photovoltaic power generation. In the field of load identification, the above-mentioned SVM and Bayesian algorithm have been applied, but there is almost no literature based on the GRNN method.
Many experts and scholars have also studied load identification. Du et al. [11] selected the fundamental power factor and the third harmonic content difference of voltage and current during equipment operation as the load characteristics and combined the improved Gray Wolf Algorithm with the FCM clustering algorithm. This method has a high recognition rate for small power loads, but it is prone to feature overlap and local optimal solutions when dealing with multiload classification problems. Qiang et al. [12] transformed V-I image features into track coding through the DPSH model, used them as load features, and then built a load recognition model using CNN. This method can solve the problem of the load identification rate dropping when unknown equipment is added, but its overall accuracy is not high and its load identification time is too long. Zhou et al. [13] used the Fourier transform to decompose current data, selected odd current harmonics as load characteristics, and then used the AdaBoost algorithm combined with a BP neural network to build a load identification model. Although this method can effectively improve the identification of easily misjudged loads, the model training time becomes longer.
To sum up, in order to solve the problems of low load identification accuracy and long load identification time in the above methods, a noninvasive load monitoring method based on the improved slime mould algorithm-generalized regression neural network (ISMA-GRNN) is proposed. Firstly, the method carries out data processing operations such as denoising, eliminating outliers, event detection, and feature extraction. Then, to address the tendency of SMA to fall into local optima, a mutation disturbance is added to SMA, the smoothing factor of GRNN is optimized with the resulting algorithm, the optimal smoothing factor is input into GRNN for load identification, and the load identification results are output. Finally, the load identification accuracy, load identification precision, and load identification time are compared with those of HHO-GRNN, WOA-GRNN, and SMA-GRNN to show the advantages of ISMA-GRNN in load identification. The rest of this study is organized as follows: Section 2 introduces the methods of event detection and feature extraction. Section 3 introduces the theory of SMA and GRNN and introduces the disturbance formula that improves the SMA method. Section 4 establishes the ISMA-GRNN model and describes its process in detail. Multiple indicators are used to measure ISMA-GRNN in Section 5, followed by the conclusion in Section 6.

Event Detection and Feature Extraction
Event detection scans the load data over a period of time with an algorithm to detect when electrical equipment is connected to or disconnected from the system, which is the premise of noninvasive load monitoring [14]. Niu and Jia [15] proposed a bilateral cumulative sum load event detection method based on a sliding window, which can effectively detect high-power load events from power load data, but its detection of low-power load events is inaccurate. On this basis, the improved mean difference accumulation method increases the event detection sensitivity of the algorithm and effectively detects low-power load events.
In this paper, the sliding window mean difference cumulative sum algorithm is used for load event detection. A load event threshold is set. When electrical equipment is connected to the system, the power data fluctuates greatly. After the mean difference accumulation calculation, the threshold is triggered, and the event detection algorithm detects the access or exit of the electrical equipment. Define the positive load event threshold $h^+$, the negative load event threshold $h^-$, the cumulative function $g$, and the step size of the sliding window $sp$. The characteristic power sequence during load operation is $P_1 = [p_1, p_2, \ldots, p_n]_{n \times 1}$. The matrix after sliding window mean difference processing is $X$, and its expression is shown in the following equation:
where $l = n - sp$. When the electric equipment is connected, the load data has a positive offset; that is, $x_k \ge x_{k-1}$, and the power sequence increases. Then, the positive offset is accumulated into $g_k^+$, and $g_k^+$ continues to increase. On the contrary, if the load data has a negative offset when the electric equipment exits the system, that is, $x_k < x_{k-1}$, and the power sequence decreases, the negative offset is accumulated into $g_k^-$, and $g_k^-$ continues to increase. When $g_k^+ \ge h^+$ or $g_k^+ < h^-$, the system detects the occurrence of a load event. When $g_k^+ \ge h^+$, a positive load event is recorded. Similarly, when $g_k^+ < h^-$, a negative load event is recorded.
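The detection logic described above can be illustrated with a minimal Python sketch. Because the expression for $X$ is not reproduced here, the sketch makes its own assumptions: the window mean is taken over `sp` consecutive samples, the accumulated drop is stored as a positive magnitude and compared with a positive `h_minus`, and both accumulators are reset after an event or a change of direction. The function name `detect_events` is hypothetical.

```python
def detect_events(power, sp, h_plus, h_minus):
    """Return (index, kind) pairs for load events in a power sequence.

    Assumed reading of the method: window means over sp samples, differences
    of consecutive means, and two cumulative sums compared with thresholds.
    """
    n = len(power)
    # Sliding-window means x_0 .. x_l with l = n - sp.
    means = [sum(power[k:k + sp]) / sp for k in range(n - sp + 1)]

    events = []
    g_plus = 0.0   # accumulated positive offset
    g_minus = 0.0  # accumulated magnitude of negative offset (assumption)
    for k in range(1, len(means)):
        diff = means[k] - means[k - 1]
        if diff > 0:
            g_plus += diff
            g_minus = 0.0          # direction changed: restart the other sum
        elif diff < 0:
            g_minus += -diff
            g_plus = 0.0
        if g_plus >= h_plus:
            events.append((k, "on"))    # positive load event
            g_plus = 0.0
        elif g_minus >= h_minus:
            events.append((k, "off"))   # negative load event
            g_minus = 0.0
    return events


if __name__ == "__main__":
    # A 100 W appliance switched on at sample 10 and off at sample 20.
    trace = [0.0] * 10 + [100.0] * 10 + [0.0] * 10
    print(detect_events(trace, sp=3, h_plus=50.0, h_minus=50.0))
```

Lower thresholds make the detector more sensitive to low-power events at the cost of more false triggers on noisy data, which is why denoising precedes this step.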
After data processing operations such as denoising and eliminating outliers, the collected load data is input into the sliding window mean difference cumulative sum algorithm for event detection. Then, the load event data is input into the feature extraction method. The load characteristics of electrical equipment can generally be divided into transient characteristics and steady-state characteristics. The steady-state characteristics are widely used because of their advantages such as strong repeatability, low cost, and simple acquisition [16]. The Hilbert-Huang transform (HHT) can transform a one-dimensional signal into a two-dimensional complex plane signal, which displays signal features better, and it is often used for feature extraction [17]. The improved CEEMDAN algorithm is used for feature extraction [18,19]. The CEEMDAN method replaces the estimation of the mode with the estimation of the local mean and uses the local mean of the signal to extract the k-order mode, which can effectively solve the problem of mode aliasing. Therefore, we use the CEEMDAN + HT transform to decompose the current data of load events and extract the instantaneous amplitude, instantaneous frequency, and instantaneous phase as the load characteristics of electrical equipment. The mathematical model is derived as follows.
Assuming that the current characteristic of the load event is $I(t)$, the first intrinsic mode component is defined as $I_1$, $\varepsilon$ is the adaptive coefficient, $\omega$ is the Gaussian white noise added during decomposition, and $N$ is the number of EMD decompositions of the signal. The expressions of $I_1$ and the residual $R_1(t)$ are as follows:
By repeating the above operation, we can get $I_1, I_2, I_3, \ldots, I_{K+1}$, where the expressions of $R_K(t)$ and $I_{K+1}$ are as follows:
If the number of extreme points of the residual sequence $R_K(t)$ is less than 2, the CEEMDAN decomposition is stopped. The final decomposition of the load event current characteristic $I(t)$ is shown in the following equation:
Calculate the Pearson correlation between the signal and each of $I_1, I_2, I_3, \ldots, I_{K+1}$, sort the components by correlation value, and select the $D$ intrinsic mode components with the largest correlation to construct the analytical signal. Then, the HT transformation is performed on the selected intrinsic mode components to obtain the instantaneous amplitude $a_i(t)$, instantaneous phase $\Phi_i(t)$, and instantaneous frequency $f(t)$.
The obtained formula is as follows:
The load data undergoes event detection with the sliding window mean difference cumulative sum algorithm and feature extraction with the CEEMDAN + HT transformation, and the instantaneous amplitude, instantaneous frequency, and instantaneous phase of electrical equipment during operation are extracted as load features. Then, the features are combined with the load labels and divided into a training set and a test set, which are input into the load identification model for training and identification.
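The component selection and Hilbert-transform steps can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' implementation: the CEEMDAN decomposition itself is not implemented (the intrinsic mode components are taken as given), the analytic signal is computed with a plain discrete Fourier transform, and the function names `pearson`, `select_imfs`, `analytic_signal`, and `instantaneous_features` are hypothetical.

```python
import cmath
import math


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    norm = math.sqrt(sum((u - mean_a) ** 2 for u in a)
                     * sum((v - mean_b) ** 2 for v in b))
    return cov / norm if norm else 0.0


def select_imfs(signal, imfs, d):
    """Indices of the d components most correlated with the signal."""
    ranked = sorted(range(len(imfs)),
                    key=lambda i: pearson(signal, imfs[i]), reverse=True)
    return ranked[:d]


def analytic_signal(x):
    """Analytic signal via a naive O(n^2) DFT (fine for short windows)."""
    n = len(x)
    spectrum = [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n)) for k in range(n)]
    # Keep DC (and Nyquist for even n), double positive frequencies,
    # and zero the negative frequencies.
    gain = [0.0] * n
    gain[0] = 1.0
    if n % 2 == 0:
        gain[n // 2] = 1.0
    for k in range(1, (n + 1) // 2):
        gain[k] = 2.0
    return [sum(g * s * cmath.exp(2j * math.pi * k * t / n)
                for k, (g, s) in enumerate(zip(gain, spectrum))) / n
            for t in range(n)]


def instantaneous_features(x, fs):
    """Instantaneous amplitude, unwrapped phase, and frequency (Hz)."""
    z = analytic_signal(x)
    amplitude = [abs(v) for v in z]
    phase = [cmath.phase(z[0])]
    for v in z[1:]:
        step = (cmath.phase(v) - phase[-1] + math.pi) % (2 * math.pi) - math.pi
        phase.append(phase[-1] + step)  # unwrap to a continuous phase
    frequency = [(phase[i + 1] - phase[i]) * fs / (2 * math.pi)
                 for i in range(len(phase) - 1)]
    return amplitude, phase, frequency
```

For a pure 5 Hz cosine sampled at 100 Hz over a whole number of periods, this sketch returns an amplitude close to 1 and a frequency close to 5 Hz at every sample, which is a quick sanity check before applying it to real current data.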

GRNN.
GRNN is mainly composed of four layers of neurons: the input layer, mode layer, summation layer, and output layer [19]. Assuming the input data $X = [x_1, x_2, x_3, \ldots, x_n]$, the calculation formula of the mode layer data is shown in (6), where $\sigma$ is the smoothing factor of GRNN.
There are two transfer functions in the summation layer [20]. One is the arithmetic summation of the outputs of all mode neurons, and its transfer function is $S_D = \sum_{i=1}^{n} p_i$. The other is the weighted summation of the outputs of all mode neurons, and its transfer function is $S_{nj} = \sum_{i=1}^{n} y_{ij} p_i$, $j = 1, 2, \ldots, m$. The number of output layer neurons is equal to the output data dimension, and the transfer function is a proportional function, $y_j = S_{nj} / S_D$. Its structure is shown in Figure 1.
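To make the role of the smoothing factor concrete, the following Python sketch evaluates the four GRNN layers for one query point. It assumes the standard Gaussian mode-layer kernel $p_i = \exp(-\lVert X - X_i \rVert^2 / (2\sigma^2))$, which is the usual GRNN form; the function name `grnn_predict` and the fallback used when every kernel underflows are illustrative choices.

```python
import math


def grnn_predict(train_x, train_y, query, sigma):
    """Predict the output vector for `query` with a GRNN.

    train_x: list of feature vectors, train_y: list of output vectors,
    sigma: smoothing factor (must be positive).
    """
    # Mode layer: one Gaussian kernel per training sample (assumed form).
    p = [math.exp(-sum((q - xi) ** 2 for q, xi in zip(query, x))
                  / (2.0 * sigma ** 2))
         for x in train_x]

    # Summation layer: arithmetic sum S_D and weighted sums S_nj.
    s_d = sum(p)
    m = len(train_y[0])
    s_n = [sum(y[j] * pi for y, pi in zip(train_y, p)) for j in range(m)]

    if s_d == 0.0:
        # All kernels underflowed (query far away, tiny sigma): fall back to
        # the mean training output. This fallback is an assumption.
        return [sum(y[j] for y in train_y) / len(train_y) for j in range(m)]

    # Output layer: proportional function y_j = S_nj / S_D.
    return [s / s_d for s in s_n]


if __name__ == "__main__":
    xs = [[0.0], [1.0]]
    ys = [[1.0], [2.0]]
    print(grnn_predict(xs, ys, [0.0], sigma=0.1))     # close to the label 1
    print(grnn_predict(xs, ys, [0.0], sigma=1000.0))  # close to the mean 1.5
```

A small smoothing factor makes the prediction follow the nearest training sample, while a very large one pulls every prediction toward the mean of the training outputs. This sensitivity is why the smoothing factor is the parameter left to the optimizer.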

SMA.
SMA is an optimization algorithm based on the foraging behavior of slime moulds. Slime moulds determine the regional location of food through diffusion behavior and foraging behavior. SMA first updates the location through three steps: approaching food, wrapping food, and obtaining food, and then, it finds the global optimal location by comparing fitness. It has the characteristics of fast convergence speed and strong optimization ability [21].
At the stage of approaching food, the approaching behavior of slime moulds is approximated by a mathematical equation, and its contraction rule is as follows: where $vb$ is a random number in $[-a, a]$, $vc$ is a parameter that decreases linearly from 1 to 0, $t$ is the current iteration number, $X_b(t)$ represents the individual position with the best current fitness, $X(t)$ represents the current position of the slime mould individual, $X_A(t)$ and $X_B(t)$ are the positions of two randomly selected individuals, and $W$ represents the weight coefficient of the slime moulds. The control parameter $p$, the parameter $a$, and the weight coefficient $W$ are updated using
where $i \in 1, 2, \ldots, n$, $n$ is the population size, $S(i)$ represents the fitness value of the $i$-th slime mould individual, and $DF$ is the best fitness value achieved so far. Condition denotes the individuals whose fitness ranks in the first half of the population, others denotes the remaining individuals, $r$ denotes a random number in $[0, 1]$, $bF$ denotes the best fitness value in the current iteration, and $WF$ denotes the worst fitness value in the current iteration. SmellIndex(i) is the sorted fitness value sequence (an increasing sequence for a minimization problem). During the food-wrapping phase, the location of slime mould individuals is updated as follows: where $UB$ and $LB$ are the upper and lower bounds, the random numbers are uniformly distributed between 0 and 1, and $z$ is a custom parameter with a value of 0.03. In the stage of obtaining food, the value of $vb$ oscillates randomly within $[-a, a]$ and gradually approaches zero as the number of iterations increases. The value of $vc$ oscillates within $[-1, 1]$ and finally tends to 0.
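One SMA iteration can be sketched in Python as shown here. Since the update equations are not reproduced in the text, the sketch follows the formulation commonly published for the slime mould algorithm, and every formula in it should be read as an assumption: $p = \tanh|S(i) - DF|$, $a = \operatorname{arctanh}(1 - t/t_{max})$, $vc$ drawn from $[-(1 - t/t_{max}), 1 - t/t_{max}]$, and $W = 1 \pm r \log_{10}((bF - S(i))/(bF - WF) + 1)$. The function name `sma_step` is hypothetical, and a minimization problem is assumed.

```python
import math
import random


def sma_step(positions, fitness, best_pos, best_fit, t, max_t,
             lb, ub, z=0.03, rng=random):
    """Return new positions after one SMA iteration (1 <= t <= max_t)."""
    n, dim = len(positions), len(positions[0])

    # SmellIndex: individuals sorted by fitness, best (smallest) first.
    order = sorted(range(n), key=lambda i: fitness[i])
    b_f, w_f = fitness[order[0]], fitness[order[-1]]
    spread = b_f - w_f

    # Weight W: larger for the better half, smaller for the rest.
    weights = [1.0] * n
    for rank, i in enumerate(order):
        ratio = (b_f - fitness[i]) / spread if spread != 0 else 0.0
        term = rng.random() * math.log10(ratio + 1.0)
        weights[i] = 1.0 + term if rank < n / 2 else 1.0 - term

    a = math.atanh(1.0 - t / max_t)  # range of vb, shrinks to 0
    b = 1.0 - t / max_t              # range of vc, shrinks linearly to 0

    new_positions = []
    for i in range(n):
        if rng.random() < z:
            # Random re-initialisation inside [lb, ub].
            new = [lb + rng.random() * (ub - lb) for _ in range(dim)]
        else:
            p = math.tanh(abs(fitness[i] - best_fit))
            new = []
            for d in range(dim):
                vb = rng.uniform(-a, a)
                vc = rng.uniform(-b, b)
                x_a = positions[rng.randrange(n)][d]
                x_b = positions[rng.randrange(n)][d]
                if rng.random() < p:
                    # Approach food around the best position found so far.
                    new.append(best_pos[d] + vb * (weights[i] * x_a - x_b))
                else:
                    # Contract the individual's own position.
                    new.append(vc * positions[i][d])
        # Keep every coordinate inside the search bounds.
        new_positions.append([min(max(v, lb), ub) for v in new])
    return new_positions
```

In this sketch the caller is responsible for re-evaluating fitness and tracking the best position between calls. For GRNN tuning each position would have a single coordinate, the smoothing factor.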

ISMA.
SMA can only alleviate the problem of falling into a local optimal solution to a certain extent. By adding a variation disturbance in the iterative process, slime mould individuals can jump out of local optimal solutions, which improves the global optimization ability of the algorithm, helps GRNN find the best smoothing factor, and improves the identification accuracy of the load identification method. The added variation disturbance formula is as follows: (13) where $A$ is the coefficient of variation, which is mainly used to control the variation amplitude, $T$ is the variation period, and rand is a random number uniformly distributed in $[0, 1]$. The main steps for ISMA to optimize the GRNN smoothing factor are as follows:
Step 1: Initialize the SMA population and the mutation disturbance parameters. After many experiments, the improvement effect is the best when $T = 5$ and $A = 1$.
Step 2: Calculate the initial fitness values from the initial slime mould positions, sort them, and select the larger value as the initial parameter.
Step 3: Calculate the control parameters, α parameters, and weight parameters according to equations (10) to (12).
Step 4: Update the individual positions of the slime moulds by using equations (9) and (14), calculate the fitness value, compare it with the initial fitness value, and update the fitness value.
Step 5: Use equation (15) to update the position with the variation disturbance, and use the fitness value to update the slime mould positions and the global optimal value.
Step 6: Judge whether the number of iterations has reached the maximum number of iterations. If it has, output the best smoothing factor. Otherwise, go to Step 3 to continue optimization.
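The control flow of Steps 1 to 6 can be sketched as a Python loop. This sketch is deliberately schematic and rests on several assumptions: the SMA position update is replaced by a simple placeholder (`toy_update`), the disturbance is an illustrative uniform perturbation of amplitude `A` applied every `T` iterations rather than the paper's formula, a perturbed position is kept only if it improves fitness, and the fitness is treated as an error to be minimized. All function and parameter names are hypothetical.

```python
import random


def toy_update(positions, best_x, t, max_iter, rng):
    """Placeholder for the SMA position update: contract towards the best."""
    shrink = 1.0 - t / max_iter
    return [best_x + shrink * rng.uniform(-1.0, 1.0) * (x - best_x)
            for x in positions]


def isma_optimize(fitness_fn, lb, ub, pop_size=30, max_iter=100,
                  A=1.0, T=5, seed=0):
    """Search a 1-D parameter (e.g. a smoothing factor) in [lb, ub]."""
    rng = random.Random(seed)

    def clip(x):
        return min(max(x, lb), ub)

    # Step 1: initialise the population (A and T are the disturbance settings).
    positions = [rng.uniform(lb, ub) for _ in range(pop_size)]
    # Step 2: evaluate the initial fitness and record the best individual.
    fits = [fitness_fn(x) for x in positions]
    best_f, best_x = min(zip(fits, positions))
    history = [best_f]

    for t in range(1, max_iter + 1):
        # Steps 3-4: position update (placeholder) and fitness evaluation.
        positions = [clip(x) for x in
                     toy_update(positions, best_x, t, max_iter, rng)]
        fits = [fitness_fn(x) for x in positions]

        # Step 5: every T iterations, try a perturbed copy of each individual
        # and keep it only when it improves the fitness (assumed rule).
        if t % T == 0:
            for i, x in enumerate(positions):
                candidate = clip(x + A * rng.uniform(-1.0, 1.0))
                f = fitness_fn(candidate)
                if f < fits[i]:
                    positions[i], fits[i] = candidate, f

        f, x = min(zip(fits, positions))
        if f < best_f:
            best_f, best_x = f, x
        history.append(best_f)

    # Step 6: the iteration budget is exhausted; return the best value found.
    return best_x, best_f, history


if __name__ == "__main__":
    x, f, _ = isma_optimize(lambda s: (s - 3.0) ** 2, lb=0.1, ub=1000.0)
    print(x, f)
```

In a real run, `fitness_fn` would train or evaluate GRNN with the candidate smoothing factor, so each fitness call dominates the cost. The extra evaluations in Step 5 are the source of the additional run time of a mutated variant over plain SMA.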

Input and Output of ISMA-GRNN.
Load data such as voltage, current, and power are input into the NILM method after denoising and eliminating abnormal values. The load events are detected by the sliding window mean difference cumulative sum algorithm, and the instantaneous amplitude, instantaneous phase, and instantaneous frequency are extracted as load characteristics by the CEEMDAN + HT transformation. Through many experiments, we find that the load identification accuracy is the highest when the first five intrinsic mode functions are extracted, so the input dimension of the load identification method is 15 (three features for each of the five functions). The load data used in this paper includes 9 kinds of electrical equipment with a total of 17 load states. The output of multiclass classification problems generally takes the form of either binary coded output or direct output. Binary encoded output is often used when there are few electrical devices to identify. Because the load data used in this paper contains many load states, the direct output label is adopted, and the output dimension of the load identification method is 1. To facilitate the subsequent representation of electrical equipment and status, the electrical loads and their states are numbered, and the corresponding labels are shown in Table 1.

ISMA-GRNN Implementation Process.
After the load data is input into the NILM method, data preprocessing operations such as denoising and eliminating outliers are carried out in the first stage to reduce the impact of load data acquisition errors on the load identification results. In the second stage, the sliding window mean difference cumulative sum algorithm is used to detect the load events in the power load data, and the access states of irrelevant electrical equipment during the load process are eliminated. In the third stage, the CEEMDAN + HT transform is used to decompose the current data of load events and extract the instantaneous amplitude, instantaneous frequency, and instantaneous phase as load characteristics, and then the extracted load characteristic data set is divided into a training set and a test set. In the fourth stage, ISMA-GRNN is used for load identification, and the load identification results are output. The main process of ISMA-GRNN is shown in Figure 2.
The fitness function of ISMA is shown in the following equation: where $C$ represents the number of correctly classified groups in the load identification results and $Z$ represents the total number of load data groups input into the load identification method.
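As an illustration of how $C$ and $Z$ can be obtained from identification results, the short Python sketch below computes the ratio $C / Z$. The fitness equation itself is not reproduced here, so treating the fitness as exactly this ratio is an assumption, and the function name `classification_fitness` is hypothetical.

```python
def classification_fitness(predicted, actual):
    """Return C / Z: correctly classified groups over all input groups.

    Assumption: the fitness is the plain classification ratio.
    """
    if len(predicted) != len(actual) or not actual:
        raise ValueError("label lists must be non-empty and of equal length")
    c = sum(1 for p, a in zip(predicted, actual) if p == a)  # correct groups
    z = len(actual)                                          # total groups
    return c / z


if __name__ == "__main__":
    # Three of four load labels are identified correctly.
    print(classification_fitness([1, 2, 3, 3], [1, 2, 3, 4]))
```

An optimizer that minimizes its objective would use one minus this ratio instead.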
In ISMA-GRNN, SMA and the variation disturbance are first initialized. Then the initial fitness value, control parameters, α, and weight parameters are calculated, and the individual positions and fitness of the slime moulds are updated by using equations (9), (14), and (10). Finally, the algorithm checks whether the preset number of iterations has been reached. If it has, the best smoothing factor is output. If it has not, equations (9), (14), and (10) are applied again to update the slime mould positions.

Evaluation Index.
Load identification is essentially a classification problem. For binary classification problems, classification methods are mainly evaluated by accuracy, precision (P), recall (R), and F1 [22]. For multiple-class classification problems, the average values of accuracy, precision, recall, and F1 are generally used to measure the performance of the algorithm. The average value is mainly calculated as a micro average or a macro average. The macro average calculates the accuracy, precision, recall, and F1 of each category and then averages them. This calculation method treats each category equally. The micro average is calculated by combining the contributions of all categories [23]. In NILM, the category of each piece of electrical equipment exists independently, so it is more reasonable to use Macro-P, Macro-R, accuracy, and Macro-F to measure models. The macro parameters are calculated as follows:
The confusion matrix [24] is the most intuitive and simplest way to measure the results of a multiple-class classification method. References [25,26] used a confusion matrix to measure the load identification results, but because the number of sample groups differs between the classified loads, the consistency of the results needs to be tested using kappa. Kappa is a statistical index that measures classification accuracy based on the confusion matrix. In this paper, we use kappa to analyze the consistency of the load identification accuracy of different power devices.
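The macro-averaged indices and kappa can be computed from a confusion matrix as in the following Python sketch. It uses the standard definitions, which are assumed to match the paper's omitted formulas: per-class precision and recall averaged over classes, Macro-F taken as the mean of the per-class F1 values, and Cohen's kappa as $(p_o - p_e)/(1 - p_e)$. The function name `macro_metrics` is hypothetical.

```python
def macro_metrics(cm):
    """Compute accuracy, macro P/R/F1, and kappa from a confusion matrix.

    cm[i][j] is the number of samples of true class i predicted as class j.
    """
    k = len(cm)
    total = sum(sum(row) for row in cm)
    row_sums = [sum(row) for row in cm]                          # true counts
    col_sums = [sum(cm[i][j] for i in range(k)) for j in range(k)]  # predicted

    precisions, recalls, f1s = [], [], []
    for c in range(k):
        tp = cm[c][c]
        p = tp / col_sums[c] if col_sums[c] else 0.0
        r = tp / row_sums[c] if row_sums[c] else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(2 * p * r / (p + r) if p + r else 0.0)

    accuracy = sum(cm[c][c] for c in range(k)) / total
    # Agreement expected by chance, from the row and column marginals.
    p_e = sum(row_sums[c] * col_sums[c] for c in range(k)) / total ** 2
    kappa = (accuracy - p_e) / (1 - p_e) if p_e != 1 else 1.0

    return {
        "accuracy": accuracy,
        "macro_p": sum(precisions) / k,
        "macro_r": sum(recalls) / k,
        "macro_f1": sum(f1s) / k,
        "kappa": kappa,
    }


if __name__ == "__main__":
    print(macro_metrics([[5, 1], [2, 4]]))
```

Because kappa discounts the agreement expected from the class marginals, it is less flattered than raw accuracy when a few labels dominate the sample groups.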
Load identification time is an important index to measure the performance of load identification methods. In engineering applications, the recognition accuracy and running time of different algorithms are generally compared, and the algorithm with high recognition accuracy and short recognition time is selected as the load identification method [27,28].
In summary, we use the confusion matrix and kappa to represent the accuracy of load identification and use the accuracy of load identification and load identification time to evaluate the quality of load identification methods.

Experiments and Analysis
The software simulation platform used in this paper is Matlab 2018b, the hardware platform is an Intel i7-9700 at 3.0 GHz, and the RAM is 16 GB. A total of 2743 groups of load data are collected. After data preprocessing, event detection, and feature extraction, the number of data groups input into the load identification method is 1988. The number of data groups corresponding to each electrical equipment and load status is shown in Table 2.
We divide the processed feature dataset into a training set and a test set, input the training set into the GRNN tuned by each optimization algorithm to obtain the best smoothing factor, and finally test the optimized GRNN on the test set to obtain the load identification results. To highlight the superiority of ISMA-GRNN, its advantages for load identification in scenes with multiple pieces of electrical equipment are verified by comparing it with HHO-GRNN, WOA-GRNN, and SMA-GRNN, using the confusion matrix, precision (P), recall (R), F1, and load identification time as evaluation indexes.
To make the comparison fair, the initial population size of the four optimization algorithms is set to 30, the maximum number of iterations is set to 100, the upper limit of the smoothing factor is set to 1000, and the lower limit is set to 0.1. The confusion matrices of the simulation results are shown in Figure 3.
Table 2: Data group number table corresponding to electrical equipment and load status.
Label: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17
Group number: 393 233 58 58 74 63 63 195 95 220 62 58 135 63 106 68
From Figure 3, we can draw the following conclusions: in the load identification of scenes with multiple pieces of electrical equipment, the load identification effect of ISMA-GRNN is significantly better than that of the other three load identification methods. In Figure 3, the other three methods do not have high recognition accuracy for label 16, which is easily misidentified as label 17. At the same time, WOA-GRNN easily misidentifies label 17 as label 16. The main reason is that these two kinds of electrical equipment have similar load data characteristics and yield similar intrinsic mode functions during feature extraction, so they are easy to misidentify. ISMA-GRNN alleviates this problem to some extent, and its load identification accuracy for labels 16 and 17 is higher than 90%. The main reason for the misidentification of SMA-GRNN is that four load labels (label 3, label 6, label 7, and label 11) have similar loads, which leads SMA to fall into a locally optimal solution when searching for the smoothing factor, resulting in different degrees of misidentification. By adding the mutation operation to the position update, ISMA takes the global optimum into account and improves the load identification accuracy for label 3, label 6, and label 7. In ISMA-GRNN, only label 11 has an identification accuracy below 90% (87%), and the load identification rate of all other electrical equipment and states is more than 90%.
However, because the number of load data groups of different load labels is unbalanced, we need to use kappa to check the consistency of the confusion matrix. The kappa values under the four methods are shown in Table 3.
Table 3 shows that the kappa of ISMA-GRNN is the highest, indicating that, in the confusion matrix obtained by the ISMA-GRNN load identification method, the prediction results are almost the same as the actual classification results, and the results have high credibility. At the same time, the kappa of all four load identification methods is greater than 0.93, which shows that the identification results of the four methods have good repeatability and that their prediction results are highly consistent with the actual classification results. When the number of load data groups of different electric equipment is uneven, the confusion matrix cannot fully reflect the identification ability of the load identification method in scenes with multiple pieces of electrical equipment. To compare the different load identification methods on a reasonable basis, we use precision (P), recall (R), and F1 to comprehensively evaluate the performance of the four load identification methods. The test results of the four load identification methods are shown in Table 4.
In Table 4, after comparing the identification effects of the four load identification methods on the 17 load states, it is found that the identification effect of ISMA-GRNN is better, and its precision, recall, and F1 are higher than those of the other three methods. The minimum value of a classification index is usually used to measure the identification effect of the load identification method. The greater the minimum value, the better the performance of the classification model and the more applicable it is to different use scenarios.
By comparing the minimum values of the three classification indexes under the four load identification methods, it is found that the recall and F1 of ISMA-GRNN are higher than those of the other three methods, and the precision is only slightly lower than that of HHO-GRNN, indicating that ISMA-GRNN is more suitable than the other three methods for scenes with multiple pieces of electrical equipment.
In engineering applications, the less time load identification takes, the faster the iterative optimization process of the algorithm and the stronger its performance. In addition, the parameters in Table 4 mainly describe the identification effect of the load identification method on a single piece of electrical equipment, which cannot reflect the overall effect of the load identification model well. Therefore, we use the macro average and load identification time to comprehensively describe the overall performance of the load identification model in scenes with multiple pieces of electrical equipment. The macro average and load identification time under the four load identification methods are shown in Table 5.
It can be seen from Table 5 that, among the four methods, there is little difference in the macro average parameters between HHO-GRNN and WOA-GRNN, but there is a great difference in load identification time. The main reason is that the update strategy of HHO is more complex, which makes each iteration take longer, so the load identification time of HHO-GRNN reaches 376.02 s. Compared with HHO-GRNN and WOA-GRNN, SMA-GRNN improves the accuracy and time of load identification, but its other indicators decrease accordingly. On the basis of SMA-GRNN, ISMA-GRNN adds the mutation operation to the location update so that SMA balances global optimization and local development capabilities, which greatly improves the macro average. However, because of the added location mutation operation, each iteration takes longer, and the load identification time is 1.63 s longer than that of SMA-GRNN. The load identification result figure compares the load identification output labels with the test set labels, which intuitively represents the identification effect of the load identification method. The identification results of the four load identification methods are shown in Figure 4.
If the characteristics of two pieces of electric equipment are similar, the load identification method tends to misjudge each of them as the other. It can be seen from Figure 4 that HHO-GRNN mainly misjudges label 6, label 7, and label 12, but we believe that similar load characteristics are unlikely to be the cause of these misjudgments. The main reason for the misjudgment of label 8 and label 13 in WOA-GRNN is that the data sample groups of label 8 account for a large proportion, resulting in the misidentification of label 13. The results of SMA-GRNN show a complex pattern of false recognition. The main reason is that the algorithm falls into a local optimum and cannot find the best smoothing factor. Compared with the other three methods, the load identification result of ISMA-GRNN is the best. Only individual pieces of electrical equipment are misidentified, and the identification error is within the acceptable range.

Conclusion
In this paper, a noninvasive load monitoring load identification method based on ISMA-GRNN is proposed. Firstly, load event detection and feature extraction are carried out on the load data. Then a mutation operation is added to the position update process of SMA, and ISMA is used to optimize the smoothing factor of GRNN. Finally, ISMA-GRNN is used for load identification, and the identification results are output. The load data we use includes 9 kinds of electrical equipment and a total of 17 power consumption states. The load identification accuracy, load identification precision, and load identification time are used as evaluation indexes. Through simulation experiments, the following conclusions are obtained:
(1) Compared with HHO-GRNN and WOA-GRNN, SMA-GRNN has the highest load identification accuracy and the shortest load identification time, but its load identification precision is reduced.
(2) After the mutation operation is added, ISMA can escape the local optimal solution to a certain extent and find the best smoothing factor. Compared with SMA-GRNN, the load identification accuracy of ISMA-GRNN is improved by 3.52%, the load identification precision is improved by 5.15%, and the load identification time is increased by only 1.63 s.
(3) For label 11 and label 12, all four methods show misidentification. The possible reason is that, when load data is collected for different power consumption states of the same electric equipment, there is a certain time difference between the collection of the operation data and the recording of the load label, resulting in load misidentification. In future work, we will consider recording the load labels algorithmically to reduce the time difference between operation data collection and label recording.

Data Availability
The data sets required considerable effort to measure in the laboratory, so we have chosen not to disclose them.

Conflicts of Interest
The authors declare that they have no conflicts of interest.