Intelligent Online Monitoring and Remote Verification of Gateway Meters under the Embedded Sensor and Clustering Algorithm

Since gateway meters cannot simultaneously realize online monitoring and remote verification, this study explores intelligent online monitoring and remote verification of the gateway meter. First, the similarity measures and related evaluation indexes are analyzed based on the theories of the embedded sensor and the clustering algorithm. Second, the gateway meter is tested on the standard vibration test bed, and the accuracy, timeliness, and environmental adaptability of its intelligent online monitoring system are evaluated. Finally, the remote verification of the gateway meter is carried out from three aspects: the error/load value, the secondary voltage drop, and the admittance test. The results show that when the intelligent online monitoring ability of the gateway meter is tested on the standard vibration test bed, the error of the same measuring point is controlled within 0.1 mm after multiple peak-to-peak tests, which is within the allowable range. In the intelligent online monitoring system based on the embedded sensor and clustering algorithm, the vibration acceleration fluctuates between -0.6 cm/s² and 0.3 cm/s², the speed between -1 cm/s and 1 cm/s, and the displacement between -0.8 cm and 0.8 cm. This shows that the intelligent online monitoring system can meet the performance requirements of online monitoring. In the remote verification of gateway meters, the active error and reactive error are within 0.2%. The results of the secondary voltage drop and the admittance test show that the relevant technical indexes of the system meet the expected requirements. Therefore, the intelligent online monitoring and remote verification of gateway meters are discussed based on the embedded sensor and clustering algorithm, which provides a reference for the rapid development of gateway meters.


Introduction
With the development of science and technology, the gateway meter is widely used in fields such as civil production, daily life, industry, and national defense. It is at the core of production in many large enterprises [1]. Realizing intelligent online monitoring and remote verification of gateway meters has become a problem that all sectors of society need to address [2]. Zhu et al. (2020) introduced the online monitoring principle of the gateway meter by referring to typical application examples of devices such as transformers and capacitors, analyzed the problems in the monitoring, accumulated experience in online monitoring technology, and explained the necessity of popularizing and applying online monitoring devices [3]. Junior et al. (2020) analyzed the key technologies of intelligent online monitoring and remote verification of the gateway meter and put forward a big data platform architecture and a business application architecture [4]. Mohammadi et al. (2022) summarized and compared the development strategies of intelligent gateway meters in China, the United States, and Europe, introduced the main characteristics of gateway meters in detail, predicted the development trend of gateway meters, and made a technical analysis of intelligent online monitoring and remote verification of American gateway meters [5]. Qi et al. (2020) discussed the power grid monitoring system based on the embedded sensor and clustering algorithm, improved the traditional power monitoring algorithm, and proposed a reliable monitoring and processing algorithm [6]. Mieloszyk et al. (2020) used the clustering technology in data mining to analyze the load curve of the power system and proposed a load clustering algorithm based on dimension reduction of characteristic indexes [7]. Arumona et al. (2020) used the boundary element method to study intelligent online monitoring and remote verification of gateway meters. However, the transmission line used in the actual experiment is a single circuit, which suits traditional power grid lines but is inconsistent with the transmission lines used in real life. Therefore, this scheme cannot be applied in practice [8]. Some research institutions have studied remote calibration and measurement control of the gateway meter, based on communication networks alone or in cooperation with other technologies, and have performed networked remote measurement. At present, remote calibration technology is being studied in the United States and other countries; some achievements have been made, and some projects are already in use [9].
The main task of this study is to explore intelligent online monitoring and remote verification of gateway meters. First, the embedded sensor and clustering algorithm are used to analyze the accuracy, timeliness, and environmental adaptability of the intelligent online monitoring system. Then, its remote verification ability is discussed in terms of the error rate/load, the secondary voltage drop, and the admittance. The innovation is to analyze the clustering effect of different cluster numbers through K-means. K-means converges quickly and is suitable for processing large datasets. It meets the requirements of the intelligent monitoring of gateway meters and provides ideas and a basis for processing intelligent data of gateway meters.

Materials and Methods
2.1. Basic Principle of the Clustering Algorithm. Clustering is the process of dividing disordered data into meaningful or useful groups, or class clusters, according to set rules. The data in the same class cluster are highly similar, while the data in different clusters differ markedly [10,11]. Cluster analysis is a statistical analysis method for studying the classification of samples or indexes [12]. The idea of cluster analysis comes from taxonomy, but it is different from taxonomy [13]. In addition, a similar set of data cannot be obtained through cluster analysis. If class C_i is the result of cluster analysis, it needs to meet the three conditions given in Equations (1)-(3). In Equation (1), U is a pattern set, and C_t is a subset of the pattern set, with C_t ⊆ U, t = 1, 2, …, k.
Equation (2) shows that the data in the cluster are similar even if they do not belong to the same group. C_m and C_r are two different subsets of the pattern set, and the meanings of the remaining letters are the same as those in the above equation.
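For reference, the conditions of a hard (non-overlapping) partition are commonly written as below. This is a sketch in standard notation under the assumption that the clusters do not overlap; the order and exact form may differ from the original Equations (1)-(3).

```latex
\begin{aligned}
C_t &\neq \varnothing, \quad t = 1, 2, \dots, k \\
\bigcup_{t=1}^{k} C_t &= U \\
C_m \cap C_r &= \varnothing, \quad m \neq r
\end{aligned}
```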
In addition, the clustering algorithm also has many requirements, as shown in Figure 1.
Clustering is composed of five steps, and the specific process is shown in Figure 2.
Clustering is a very important concept in data mining [14]. According to the accumulation rules of data in clustering, the clustering algorithm can be divided into different categories. The classic algorithms are shown in Figure 3.
Suppose the sample size is n and a cluster analysis divides the sample into k classes. If the sample size of a certain cluster is n_i, the proportion of this cluster in the whole sample is calculated by Equation (4). In addition, entropy is a basic concept of information theory. The size of the entropy shows the randomness of a random event: the greater the entropy is, the greater the uncertainty of the random event is and the more information it gives. Conversely, the smaller the entropy is, the smaller the uncertainty of the event is and the less information it gives [15]. The entropy of the discrete random vector y = [y_1, …, y_n] is calculated by Equation (5). In Equation (5), the meanings of letters are the same as those in the above equations.
Similarly, when the number of clusters is k, the entropy of the clustering is calculated by Equation (6). In Equation (6), p_i represents the proportion of the sample size, and the meanings of the remaining letters are the same as those in the above equations.
A clustering is regarded as a complete event group, and each category is regarded as an event in the complete event group with a probability of p_i. When the probabilities of all categories are equal, that is, when the sample sizes in all categories are equal, the information entropy of the clustering has its maximum value [16]. In other words, when p_i = 1/k, the entropy of the clustering is the largest, as shown in Equation (7).

Wireless Communications and Mobile Computing
In Equation (7), the meanings of letters are the same as those in the above equations.
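The maximum-entropy property above can be checked numerically. The following is a minimal sketch, assuming Shannon entropy with the natural logarithm (the logarithm base of the original equations is not recoverable here); the function name `cluster_entropy` and the cluster sizes are illustrative.

```python
import math

def cluster_entropy(sizes):
    """Entropy of a clustering from its cluster sizes, using p_i = n_i / n."""
    n = sum(sizes)
    return -sum((s / n) * math.log(s / n) for s in sizes if s > 0)

# Five equally sized clusters versus five unbalanced clusters (illustrative sizes).
balanced = cluster_entropy([20, 20, 20, 20, 20])
skewed = cluster_entropy([60, 10, 10, 10, 10])

print(balanced, math.log(5))  # equal proportions reach the maximum, log k
print(skewed)                 # unequal proportions give a smaller entropy
```

With equal proportions p_i = 1/k the entropy equals log k, and any imbalance lowers it.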

2.2. System Structure of Embedded Sensors. The sensor is a detection device that senses the measured information and transforms it into electrical signals or other required forms according to a certain law, so that the requirements of information transmission, processing, storage, display, recording, and control are met [17]. The sensor comprises a sensitive element, a conversion element, a conversion circuit, and an auxiliary power supply, as shown in Figure 4.
The sensitive element detects the measurand and outputs a physical quantity signal that has a definite relation with the measurand; the conversion element converts the physical quantity signal output by the sensitive element into an electrical signal; the conversion circuit amplifies and modulates the electrical signal output by the conversion element; the auxiliary power supply powers the conversion element and the conversion circuit [18].
The embedded system consists of hardware and software. The software includes the software running environment and an operating system. The hardware comprises a signal processor, memory, and a communication module [19]. In practical operation, embedded sensors often have a zero output (an output at zero input) due to component aging, asymmetric circuit parameters, and other unstable factors [20]. In a microcomputer-based sensor system, zero compensation is simple, and it is calculated by Equation (8). In Equation (8), x is the measured physical quantity, k is the scale coefficient, y_0 is the zero output, and y_c is the zero compensation. The embedded sensor is held at the zero position before normal operation. At this time, the output of the embedded sensor system is given by Equation (9). Equation (9) indicates that the output value of the embedded sensor system is equal to the zero output value, and the meanings of letters are the same as those in the above equations. The zero output is temporarily stored in the storage unit of the computer, and the compensation is calculated by Equation (10). Equation (10) relates the compensation to the stored zero output value, and the letters are the same as in the above equations. According to Equations (9) and (10), the output of the embedded sensor in normal operation is given by Equation (11). Equation (11) states that, with zero compensation applied, the output has a linear relation with the measured physical quantity when the embedded sensor works normally. The meanings of the letters in the equation are the same as above.
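The zero-compensation procedure can be sketched in a few lines. This is an illustration only: it assumes a linear raw output k·x + y_0 and a compensation y_c equal to the negative of the stored zero output, which is one common reading of the description above. The class `ZeroCompensatedSensor`, the helper `read_raw`, and the numeric values are hypothetical.

```python
class ZeroCompensatedSensor:
    """Hypothetical wrapper that removes the zero output from raw readings."""

    def __init__(self, read_raw):
        self.read_raw = read_raw  # callable returning the raw sensor output
        self.y_c = 0.0            # zero compensation, set during calibration

    def calibrate_at_zero(self):
        # At the zero position the raw output equals the zero output y_0,
        # so the stored compensation is its negative.
        self.y_c = -self.read_raw()

    def read(self):
        # Compensated output: raw output plus the stored compensation.
        return self.read_raw() + self.y_c


# Simulated sensor (assumed model): raw = k * x + y0.
state = {"x": 0.0}

def read_raw(k=2.0, y0=0.35):
    return k * state["x"] + y0

sensor = ZeroCompensatedSensor(read_raw)
sensor.calibrate_at_zero()   # sensor held at the zero position
state["x"] = 1.5             # normal operation
compensated = sensor.read()
print(compensated)           # approximately k * x
```

After calibration the reading depends only on k·x, which is the linear relation the text describes.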
The zero output of embedded sensors drifts with the change of working temperature, which is described by Equation (12). In Equation (12), Δy_0 is the zero temperature drift, α_0, α_1, …, α_n are the temperature error coefficients, and Δθ is the difference between the actual working temperature and the standard temperature.
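As an illustration of how such a drift model could be evaluated, the sketch below assumes that the drift is a polynomial in Δθ with coefficients α_0, …, α_n. The polynomial form, the function name, and the coefficient values are assumptions, not taken from the original equation.

```python
def zero_temperature_drift(alphas, delta_theta):
    """Evaluate alpha_0 + alpha_1*dT + ... + alpha_n*dT**n with Horner's rule."""
    drift = 0.0
    for a in reversed(alphas):
        drift = drift * delta_theta + a
    return drift

# Hypothetical coefficients and a 10-degree deviation from the standard temperature.
drift = zero_temperature_drift([0.01, 0.002, 0.0001], delta_theta=10.0)
print(drift)
```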
The logic topology optimization of embedded sensors can be realized by Equation (13). In Equation (13), S(x) is the total delay of the system, t(l_j) is the delay on incoming link l_j, t(n_i) is the delay of node n_i, γ is the set of all data transmission paths, L is the total number of links, and N is the total number of nodes. The link reliability is calculated by Equation (14). In Equation (14), E(l_j) represents the reliability of link l_j, G is the functional relationship between link reliability and unit price, and a represents the cost. The node reliability is calculated by Equation (15).

In Equation (15), E(m_j) is the reliability of node m_j, and F is the functional relation between node reliability and node cost. The structure of the specific intelligent sensor node is shown in Figure 5.
The waiting transmission time in the intelligent sensor is calculated by Equation (16). In Equation (16), T represents the time, n is the number, m is the group, and m′ is the group corresponding to that group. The average waiting time of the intelligent sensor can be obtained according to Equations (16) and (17).
In Equation (17), T is the time, n is the number, α is the information arrival rate, β is the service rate for grouped service information, and γ is the coefficient.
The time required for grouping two information groups of intelligent sensors is calculated by Equation (18). In Equation (18), the meanings of the letters are the same as above.
The average queue length of the embedded sensor in the intelligent online monitoring of gateway meters is calculated by Equation (19). In Equation (19), I is the queue length, α is the information arrival rate, β is the service rate for grouped service information, and γ is the coefficient. When the data are checked and the embedded sensor feeds the information data back to the monitoring system, the feedback time is calculated by Equation (20). In Equation (20), T is the time, i and j are the transmission time and reception time, respectively, and a denotes the ath transmission.
2.3. Similarity Measurement. The main task of clustering is to gather similar data or pattern vectors. Therefore, measuring the distance or similarity between data is the key to cluster analysis, and the measuring method determines the clustering results. The traditional method measures the similarity by revealing the relationship between two data. Two data are represented by the d-dimensional vectors x = (x_1, x_2, …, x_d) and y = (y_1, y_2, …, y_d), and the similarity between them is written s(x, y). The more similar x and y are, the greater the value of s(x, y) is. If d(x, y) is the distance between two data, the more similar x and y are, the smaller the value of d(x, y) is. The distance function needs to meet conditions that include symmetry, d(x, y) = d(y, x), and nonnegativity, d(x, y) ≥ 0.

Clustering algorithms rely on a measure of the similarity or dissimilarity of data. Generally, a distance function or a similarity function is used as the measurement standard. Here the similarity is measured by distance, and the main distances used are Euclidean distance, Minkowski distance, Manhattan distance, and Chebyshev distance [21,22].
(1) Euclidean distance
Euclidean distance is the actual straight-line distance between two points in space, and it is calculated by Equation (24). In Equation (24), d is the distance, p and q are two points in space, n is the dimension of the space, k indexes the kth coordinate, and x denotes the coordinates of p and q.
(2) Minkowski distance
Minkowski distance is a generalization of Euclidean distance [23], and it is calculated by Equation (25). In Equation (25), A and B are the coordinates of two points in space, c indexes the cth coordinate, |A_c − B_c| is the absolute difference between the two points along that coordinate, and z is the order (exponent) of the distance. The meanings of the remaining letters are the same as above.

(3) Manhattan distance
Manhattan distance is the sum of the absolute axis distances between two points in the coordinate system [24]. It is calculated by Equation (26). In Equation (26), x_1 and x_2 are the abscissas of the two points, y_1 and y_2 are their ordinates, i is the point formed by x_1 and y_1, j is the point formed by x_2 and y_2, |x_1 − x_2| is the distance between x_1 and x_2, and |y_1 − y_2| is the distance between y_1 and y_2. The meanings of the remaining letters are the same as above.

(4) Chebyshev distance
Chebyshev distance is a measure in vector space that is also known from chess, where it gives the number of moves a king needs to travel between two squares [25,26]. It is the maximum of the absolute coordinate differences between two points, and it is calculated by Equation (27). In Equation (27), the meanings of letters are the same as above. Another equivalent form of Chebyshev distance is given by Equation (28).

In Equation (28), x_1i and x_2i are the ith components of the two vectors, and k is the order of the distance.
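The four distances above are closely related, which a short sketch makes concrete. The functions below use the standard textbook definitions, not necessarily the exact notation of Equations (24)-(28); the function names and the sample points are illustrative.

```python
def minkowski(a, b, z):
    """Minkowski distance of order z between two equal-length vectors."""
    return sum(abs(x - y) ** z for x, y in zip(a, b)) ** (1.0 / z)

def manhattan(a, b):
    """Sum of absolute coordinate differences (Minkowski with z = 1)."""
    return sum(abs(x - y) for x, y in zip(a, b))

def euclidean(a, b):
    """Straight-line distance (Minkowski with z = 2)."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def chebyshev(a, b):
    """Largest absolute coordinate difference (limit of Minkowski as z grows)."""
    return max(abs(x - y) for x, y in zip(a, b))

p, q = (0.0, 0.0), (3.0, 4.0)
print(manhattan(p, q), euclidean(p, q), chebyshev(p, q))
print(minkowski(p, q, 50))  # already very close to the Chebyshev distance
```

Manhattan and Euclidean distance are the order-1 and order-2 cases of Minkowski distance, and the Minkowski distance approaches Chebyshev distance as the order increases.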

(5) Mahalanobis distance
Mahalanobis distance is the covariance distance of data. It is used to calculate the similarity of two unknown sample sets, and it is calculated by Equation (29). In Equation (29), x is a multivariate vector, T denotes the transpose, μ is the mean, Σ is the covariance matrix, and D_M is the Mahalanobis distance. Mahalanobis distance can also be defined as a measure of the difference between two random vectors x and y that obey the same distribution with covariance matrix Σ. This difference is calculated by Equation (30). In Equation (30), the meanings of letters are the same as above.
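A small worked example may help to show the role of the covariance matrix. The sketch below assumes the standard definition D_M(x) = sqrt((x − μ)^T Σ^{-1} (x − μ)) and is restricted to two-dimensional data so that the matrix inverse can be written out by hand; the function name and the values are illustrative.

```python
import math

def mahalanobis_2d(x, mu, cov):
    """Mahalanobis distance for 2-D data; cov is a 2x2 covariance matrix."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    if det == 0:
        raise ValueError("covariance matrix is singular")
    # Inverse of a 2x2 matrix written out explicitly.
    inv = [[d / det, -b / det], [-c / det, a / det]]
    dx = [x[0] - mu[0], x[1] - mu[1]]
    q = sum(dx[i] * inv[i][j] * dx[j] for i in range(2) for j in range(2))
    return math.sqrt(q)

# With an identity covariance the result equals the Euclidean distance.
print(mahalanobis_2d((3.0, 4.0), (0.0, 0.0), [[1.0, 0.0], [0.0, 1.0]]))
# A larger variance along the first axis shrinks the distance along that axis.
print(mahalanobis_2d((2.0, 0.0), (0.0, 0.0), [[4.0, 0.0], [0.0, 1.0]]))
```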

(6) Cosine similarity
Cosine similarity evaluates the similarity of two vectors by calculating the cosine of the angle between them [27,28]. The vectors are drawn in the vector space according to their coordinate values, most commonly in two-dimensional space. It is calculated by Equation (31). In Equation (31), A and B represent the given vectors, A_i and B_i represent the components of vectors A and B, and θ is the angle between A and B. The load curve monitoring points of five power users in the monitoring system are selected as the test dataset to test the actual clustering effect of K-means. The dataset contains 273 groups of data.
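The cosine similarity described above can be sketched as follows, using the standard definition (dot product divided by the product of the vector norms). The function name and the sample vectors are illustrative.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)

print(cosine_similarity([1, 0], [0, 1]))  # orthogonal vectors
print(cosine_similarity([1, 2], [2, 4]))  # same direction, different length
```

Because only the direction matters, two load curves with the same shape but different magnitudes would score close to 1 under this measure.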

2.4. Optimization of the Clustering Algorithm. Methods for evaluating the effectiveness of clustering are used to assess the quality of this clustering algorithm, so that the clustering results can be analyzed better [29]. The following three groups of evaluation indexes are mainly used: the adjusted Rand index (ARI); homogeneity, completeness, and V-measure; and adjusted mutual information (AMI) [30].
(1) ARI
ARI is used to measure the effectiveness of the clustering algorithm. It is stable and does not change when the cluster labels are permuted. In other words, if the real label in the experiment is [a,a,b,b,c,c] and the predicted label is [a,a,c,c,b,b], [b,b,c,c,a,a], or [b,b,a,a,c,c], the value of ARI remains unchanged [31]. The value range of ARI is between -1 and 1. The Rand index (RI) is calculated by Equation (32), and ARI is calculated by Equation (33).

Figure 7: Test results of the vibration test bed ((a) the main frequency of the gateway meter is 300 times/min and the peak-to-peak value is 3 mm; (b) the main frequency of the gateway meter is 480 times/min and the peak-to-peak value is 8 mm; (c) the main frequency of the gateway meter is 600 times/min and the peak-to-peak value is 6 mm).
In Equation (32), RI is the Rand index, C is the given category, O is the number of element pairs in the same category, P is the number of element pairs in different categories, and n is the total number of samples.
In Equation (33), ARI is the adjusted Rand index, RI is the Rand index, and E is the expected value of the Rand index.
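The label-permutation invariance of ARI can be verified with a short sketch. The function below follows the usual pair-counting definition of the adjusted Rand index and assumes at least two samples; it is an illustration and not necessarily identical in notation to Equations (32) and (33).

```python
from collections import Counter
from math import comb

def adjusted_rand_index(true_labels, pred_labels):
    """Adjusted Rand index from pair counts of the contingency table."""
    n = len(true_labels)
    cells = Counter(zip(true_labels, pred_labels))
    sum_cells = sum(comb(c, 2) for c in cells.values())
    sum_rows = sum(comb(c, 2) for c in Counter(true_labels).values())
    sum_cols = sum(comb(c, 2) for c in Counter(pred_labels).values())
    expected = sum_rows * sum_cols / comb(n, 2)
    max_index = (sum_rows + sum_cols) / 2
    if max_index == expected:
        return 1.0  # degenerate case: both labelings are trivial
    return (sum_cells - expected) / (max_index - expected)

truth = list("aabbcc")
# The three relabelings mentioned in the text.
scores = [adjusted_rand_index(truth, list(p)) for p in ("aaccbb", "bbccaa", "bbaacc")]
print(scores)
# A labeling that separates every true pair scores below zero.
print(adjusted_rand_index(truth, list("abcabc")))
```

Each relabeling groups the samples in exactly the same way as the real label, so all three scores are equal.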
(2) Homogeneity and completeness index
Homogeneity indicates that each cluster contains only members of a single category, and completeness (also called integrity) indicates that all members of a given category are assigned to the same cluster [32]. They are calculated by Equations (34) and (35), respectively.
In Equation (34), h is the homogeneity index, H is the entropy, L is the given category, and k is the clustering result.

In Equation (35), c is the completeness (integrity) index, and the meanings of the remaining letters are the same as above. The V-measure, which is the harmonic mean of homogeneity and completeness, is calculated by Equation (36). In Equation (36), V is the harmonic mean, and the meanings of the remaining letters are the same as above.
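The following sketch computes the three indexes with the common entropy-based definitions, h = 1 − H(C|K)/H(C), c = 1 − H(K|C)/H(K), and V = 2hc/(h + c). These formulas are assumed to correspond to Equations (34)-(36); the function names and the toy labels are illustrative.

```python
from collections import Counter
from math import log

def entropy(labels):
    """Shannon entropy (natural log) of a label sequence."""
    n = len(labels)
    return -sum((c / n) * log(c / n) for c in Counter(labels).values())

def conditional_entropy(a, b):
    """H(a | b) for two equal-length label sequences."""
    n = len(a)
    joint = Counter(zip(a, b))
    b_counts = Counter(b)
    return -sum((c / n) * log(c / b_counts[bv]) for (_, bv), c in joint.items())

def v_measure(classes, clusters):
    """Return (homogeneity, completeness, V-measure)."""
    h_c, h_k = entropy(classes), entropy(clusters)
    h = 1.0 if h_c == 0 else 1.0 - conditional_entropy(classes, clusters) / h_c
    c = 1.0 if h_k == 0 else 1.0 - conditional_entropy(clusters, classes) / h_k
    v = 0.0 if h + c == 0 else 2 * h * c / (h + c)
    return h, c, v

# Every cluster is pure, but category 1 is split across two clusters.
h, c, v = v_measure([0, 0, 1, 1], [0, 0, 1, 2])
print(h, c, v)
```

In this toy case the clustering is fully homogeneous but not complete, so the V-measure falls between the two values.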

(3) Mutual information index
Mutual information measures the degree of agreement between two label assignments. The value of the mutual information index is from -1 to 1. The larger the value of mutual information is, the closer the measured result is to the actual situation. The smaller the value is, the more the measured data differ from the actual values [33,34]. Mutual information is calculated by Equation (37). In Equation (37), MI is mutual information, U and V are the two assignments of sample labels, and i and j index the clusters of U and V. The meanings of the remaining letters are the same as above. The standardized mutual information is calculated by Equation (38). In Equation (38), NMI is the standardized mutual information and H is the entropy. The adjusted mutual information is calculated by Equation (39). In Equation (39), AMI is the adjusted mutual information, and the meanings of the remaining letters are the same as above.
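A minimal sketch of mutual information and its normalized form is given below. It assumes natural logarithms and normalization by the arithmetic mean of the two entropies, which is one common convention and may differ from Equation (38). The adjusted form (AMI) additionally subtracts the expected mutual information of random labelings and is not implemented here.

```python
from collections import Counter
from math import log

def entropy(labels):
    """Shannon entropy (natural log) of a label sequence."""
    n = len(labels)
    return -sum((c / n) * log(c / n) for c in Counter(labels).values())

def mutual_information(u, v):
    """MI(U, V) computed from the joint and marginal label counts."""
    n = len(u)
    joint = Counter(zip(u, v))
    pu, pv = Counter(u), Counter(v)
    return sum((c / n) * log(c * n / (pu[a] * pv[b])) for (a, b), c in joint.items())

def normalized_mutual_information(u, v):
    """MI divided by the mean entropy (assumed normalization)."""
    hu, hv = entropy(u), entropy(v)
    if hu == 0 or hv == 0:
        return 1.0 if hu == hv else 0.0
    return mutual_information(u, v) / ((hu + hv) / 2)

same_grouping = normalized_mutual_information([0, 0, 1, 1], [1, 1, 0, 0])
independent = normalized_mutual_information([0, 0, 1, 1], [0, 1, 0, 1])
print(same_grouping, independent)
```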

Analysis of Intelligent Online Monitoring and Remote Calibration Results of Gateway Meters
3.1. Simulation Analysis of K-Means. Figure 6 shows the simulation results of the clustering algorithm. Figure 6(a) shows that the distance sum is sharply reduced when K is 5. Since the calculation cost increases with the number of clusters, the optimal number of clusters is taken as 5. When K = 5, the convergence of the calculation is shown in Figure 6(b). The ordinate is the sum of the distances between the sample points and their centroids, and the abscissa is the number of iterations. After 3 iterations, the distance sum of the samples has largely converged. The overall convergence speed of the algorithm is fast, which verifies that K-means is simple, takes little time, and is suitable for processing large-scale datasets.
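The kind of analysis summarized in Figure 6 can be sketched with a plain K-means implementation. The code below is an illustration on synthetic data, not the load-curve dataset of the study. It assumes squared Euclidean distances for the distance sum and a deterministic farthest-point seeding; both are choices made for this sketch, and all function names are hypothetical.

```python
def dist2(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def init_farthest(points, k):
    """Deterministic farthest-point seeding (an assumption of this sketch)."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(max(points, key=lambda p: min(dist2(p, c) for c in centroids)))
    return centroids

def kmeans(points, k, max_iters=50):
    """Lloyd's algorithm; returns (centroids, distance sum at each iteration)."""
    centroids = init_farthest(points, k)
    history = []
    for _ in range(max_iters):
        groups = [[] for _ in range(k)]
        total = 0.0
        for p in points:
            d = [dist2(p, c) for c in centroids]
            j = d.index(min(d))          # assign to the nearest centroid
            groups[j].append(p)
            total += d[j]
        history.append(total)
        updated = [
            tuple(sum(col) / len(g) for col in zip(*g)) if g else centroids[i]
            for i, g in enumerate(groups)
        ]
        if updated == centroids:         # assignments no longer change
            break
        centroids = updated
    return centroids, history

# Synthetic data: three tight groups of five points each.
offsets = [(-0.2, 0.0), (0.2, 0.0), (0.0, -0.2), (0.0, 0.2), (0.0, 0.0)]
points = [(cx + dx, cy + dy) for cx, cy in [(0, 0), (5, 5), (0, 5)] for dx, dy in offsets]

final_cost = {k: kmeans(points, k)[1][-1] for k in range(1, 7)}
for k, cost in final_cost.items():
    print(k, round(cost, 3))
```

Plotting `final_cost` against k gives the elbow curve of the type shown in Figure 6(a), and the `history` list of a single run corresponds to a convergence curve like Figure 6(b). On this synthetic data the elbow lies at k = 3, because that is how the data were generated.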

3.2. Analysis of Intelligent Online Monitoring Results of Gateway Meters. The accuracy, timeliness, and environmental adaptability of intelligent online monitoring of gateway meters based on the embedded sensor and clustering algorithm are tested on the standard vibration test bed. The results are shown in Figure 7. Figure 7 shows that when the main frequency is set to 300 times/min and the peak-to-peak value is 3 mm, the measured main frequencies are 298, 300, and 299 times/min, respectively. When the main frequency is set to 480 times/min and the peak-to-peak value is 8 mm, the measured main frequencies are 478, 479, and 479 times/min, respectively. When the main frequency is set to 600 times/min and the peak-to-peak value is 6 mm, the measured main frequencies are 598, 598, and 601 times/min, respectively. In addition, the peak values of channel 1 and channel 2 reach their maximum in the second test, followed by the third test, with the first test lowest. These data show that the intelligent online monitoring network system of gateway meters works stably and reliably in data acquisition and transmission. The error between the final detected data and the parameter values set in advance is relatively small. When the main frequency is set to 300 times/min, the peak value of channel 1 over the three tests is between 2.95 and 2.98 mm, and that of channel 2 is between 2.96 and 3.00 mm; when the main frequency is 480 times/min, the peak value of channel 1 is between 7.97 and 7.98 mm and that of channel 2 is between 7.98 and 8.00 mm; when the main frequency is 600 times/min, the peak value of channel 1 is between 5.95 and 5.98 mm and that of channel 2 is between 5.97 and 6.01 mm. It is concluded that the error of the same measuring point is controlled within 0.1 mm after multiple peak-to-peak tests, and the error is within the allowable range.
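The stated error bound can be checked directly from the values reported above. The sketch below only restates those numbers; the variable names are illustrative.

```python
# Set peak-to-peak value (mm) -> reported (min, max) for channel 1 and channel 2.
reported_peaks = {
    3.0: [(2.95, 2.98), (2.96, 3.00)],
    8.0: [(7.97, 7.98), (7.98, 8.00)],
    6.0: [(5.95, 5.98), (5.97, 6.01)],
}
max_peak_deviation = max(
    abs(value - target)
    for target, channels in reported_peaks.items()
    for low, high in channels
    for value in (low, high)
)

# Set main frequency (times/min) -> measured main frequencies of the three tests.
reported_freqs = {300: [298, 300, 299], 480: [478, 479, 479], 600: [598, 598, 601]}
max_freq_rel_error = max(
    abs(measured - target) / target
    for target, values in reported_freqs.items()
    for measured in values
)

print(f"largest peak deviation: {max_peak_deviation:.2f} mm")
print(f"largest relative frequency error: {max_freq_rel_error:.2%}")
```

The largest peak deviation is about 0.05 mm, which is inside the 0.1 mm bound, and the measured main frequencies differ from the set values by less than 1%.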
Based on the test data, the gateway meter based on the embedded sensor and clustering algorithm is tested. The results are shown in Figure 8. Figure 8 shows that, in intelligent online monitoring, the vibration acceleration fluctuates between -0.6 cm/s² and 0.3 cm/s², the speed fluctuates between -1 cm/s and 1 cm/s, and the displacement fluctuates between -0.8 cm and 0.8 cm. These data show that the intelligent online monitoring system can meet the performance requirements of online monitoring, and the real-time performance of data acquisition and transmission is good. In addition, it saves cost and time and provides the necessary data support for security detection, real-time diagnosis, and management.

3.3. Remote Verification Results of Gateway Meters. The remote verification of gateway meters based on the embedded sensor and clustering algorithm is carried out from three aspects: error/load, the secondary voltage drop, and the admittance test.
(1) Error/load
The error/load data of a manufacturer from April 5, 2019, to April 24, 2019, are collected and shown in Figure 9. Figure 9 shows that the active error curve fluctuates only slightly, staying around 0.1% over the 20 days, while the reactive error curve fluctuates more. The maximum error value is about 0.12% on April 8 and April 11, 2019, and the overall error value is between -0.12% and 0.12%. In the load test, the reactive power varies between 0 var and 50 var, and the active power varies between 150 W and 250 W. However, the overall trends of the active power curve and the reactive power curve are the same.
(2) Secondary voltage drop
The specific secondary voltage drop curve is shown in Figure 10.
(3) Admittance test curve
The specific admittance test curve is shown in Figure 11. Figure 11 shows that the test curves of phase A and phase B fluctuate constantly; the admittance value of phase A is between 2.3 mS and 2.9 mS, and that of phase B is between 2.2 mS and 2.8 mS. This shows that the gateway meter based on the embedded sensor and clustering algorithm can accurately detect the faults of metering devices.
In short, the intelligent online monitoring system of the gateway meter based on the embedded sensor and clustering algorithm can meet the needs of industrial applications in terms of timeliness, accuracy, and stability in data transmission. In terms of remote verification, the error is within 0.2%, which meets the expected results. The test results of the secondary voltage drop and admittance show that the relevant technical indexes of the system meet the relevant requirements, so the system can be used to replace manual detection, improve efficiency, reduce cost, and promote the modernized, intelligent, and networked management of gateway meters.

Conclusion
With the construction of the energy Internet, gateway meters, which undertake trade settlement between power generation and supply, are developing rapidly toward intelligence, and great achievements have been made in their online monitoring. The study discusses the intelligent online monitoring and remote verification of gateway meters. First, the similarity measures and relevant evaluation indexes are analyzed based on the relevant theories of embedded sensors and the clustering algorithm. Second, the gateway meter is tested on the standard vibration test bed, and the accuracy, timeliness, and environmental adaptability of its intelligent online monitoring system are evaluated. Finally, the error rate/load, the secondary voltage drop, and the admittance are discussed to test the remote verification of gateway meters. The experimental results show that (1) in terms of intelligent monitoring, the error of the same measuring point is controlled within 0.1 mm after multiple peak-to-peak tests, which is within the allowable range; (2) judging from the acceleration, speed, and displacement, the intelligent online monitoring system of gateway meters based on the embedded sensor and clustering algorithm can meet the performance requirements of online monitoring and provide the necessary data support for safety detection, real-time diagnosis, and management; (3) the results of the remote verification of gateway meters based on embedded sensors and the clustering algorithm show that the relevant technical indexes of the system meet the relevant requirements, and the secondary voltage drop and admittance tests can replace manual detection, improve working efficiency, reduce costs, and promote the intelligent development of gateway meters. However, the study still has some shortcomings: (1) the sample size is small, resulting in some deviations in the inspection of relevant data; (2) the costs of the intelligent online monitoring and remote verification of gateway meters are not discussed.
In the follow-up study, the benefit evaluation will be carried out according to the specific situation, so that the online monitoring and remote verification of gateway meters can be realized more accurately and safely.

Data Availability
All research data used to support the findings of this study are included within the article.

Conflicts of Interest
The authors declare that they have no conflicts of interest.