A Data-Driven Modeling Strategy for Smart Grid Power Quality Coupling Assessment Based on Time Series Pattern Matching

This study introduces a data-driven modeling strategy for smart grid power quality (PQ) coupling assessment based on time series pattern matching to quantify the influence of single and integrated disturbance among nodes in different pollution patterns. Periodic and random PQ patterns are constructed by using multidimensional frequency-domain decomposition for all disturbances. A multidimensional piecewise linear representation based on local extreme points is proposed to extract the pattern features of single and integrated disturbance in consideration of disturbance variation trend and severity. A feature distance of pattern (FDP) is developed to implement pattern matching on univariate PQ time series (UPQTS) and multivariate PQ time series (MPQTS) to quantify the influence of single and integrated disturbance among nodes in the pollution patterns. Case studies on a 14-bus distribution system are performed and analyzed; the accuracy and applicability of the FDP in the smart grid PQ coupling assessment are verified by comparison with other time series pattern matching methods.


Introduction
In recent years, the increased penetration of distributed generation and power electronic loads has aggravated power quality (PQ) pollution in power systems [1,2]. Moreover, the extensive use of adjustable speed drive systems, computer systems, and precision production lines has raised the requirements for PQ [3][4][5]. Furthermore, as the electricity market system improves, diverse PQ options and corresponding electricity pricing mechanisms must be offered to consumers as an important service [6]. Traditional power grids cannot cope with these challenges because they lack intelligent PQ monitoring and analysis management platforms. Advanced PQ monitoring and analysis techniques for smart grids have therefore become indispensable to ensure the electromagnetic compatibility of various power loads, to satisfy the demand for superior power supply quality, and to provide reasonable electricity market services [7].
Smart grids have networked PQ monitoring and analysis systems. Knowledge-mining technologies and intelligent algorithms are used to analyze PQ problems, which generally include disturbance detection and classification [8], disturbance control [9], disturbance estimation [10], disturbance source locating [11], and PQ evaluation [12]. Most existing studies focus on the PQ problems of individual nodes in power networks, whereas analysis of the PQ influence among nodes is lacking. PQ disturbances propagate and diffuse in power networks. The effects of disturbance sources on nodes may be superposed or may counteract one another. A coupling relation therefore exists in the disturbance influence among nodes under the combined action of the disturbance sources [13]. Hence, smart grid PQ coupling assessment is required to quantify the disturbance influence among nodes. This research topic not only has theoretical value for PQ relation analysis among nodes in power networks, but also has potential application value in associated region division, local disturbance control, and disturbance estimation on the basis of the PQ coupling property.
PQ coupling among nodes is reflected by the similarity of the disturbance variation rules. Time series pattern matching can measure the similarity of univariate or multivariate data [14], which makes the coupling assessment of single or integrated disturbance possible. Common pattern matching methods for univariate time series are Euclidean distance (ED) and dynamic time warping (DTW) distance. ED is applicable to sample sequences of equal length [15], and DTW distance supports time stretching and warping [16]. However, the two methods consider only the difference in data values and ignore the data variation characteristic, which may cause misjudgment in the pattern matching of univariate time series. Popular pattern matching methods for multivariate time series include point distribution (PD), principal component analysis (PCA), and tendency distance (TD). PD uses the distribution of local important points to represent multivariate time series, but it does not consider the dimension and feature differences of the variables [17]. PCA extracts principal components to reduce the variable dimensions but destroys the isomorphism among low-dimensional time series [18]. TD employs a bottom-up algorithm to obtain a piecewise sequence representation, and pattern matching on multivariate time series is implemented by measuring the trend feature difference [19]. However, its segmentation manner does not reflect a definite physical meaning, and the impact of the size difference among variables is ignored in the pattern matching of multivariate time series.
Existing time series pattern matching methods have inherent advantages and disadvantages, which render them inapplicable to PQ coupling assessment. An appropriate time series pattern matching method for PQ coupling assessment should comprehensively consider the matching object demand, the pollution pattern characteristic, and the coupling property. Specifically, various disturbances may occur independently or simultaneously. Depending on the actual demand, the pattern matching object can be univariate PQ time series (UPQTS) or multivariate PQ time series (MPQTS). Moreover, PQ pollution is affected by different types of disturbance sources and thus contains a periodic variation pattern (e.g., caused by industrial loads that have a fixed production time and task) and a random variation pattern (e.g., caused by renewable energy generation) [20,21]. These PQ pollution patterns must be considered in the time series pattern matching to give an overall analysis of the disturbance influence among nodes. Additionally, PQ coupling among nodes is reflected by the similarity of the disturbance variation rules, and the importance of a disturbance is determined by its severity. Thus, both disturbance variation trend and severity should be considered in the time series pattern matching.
In this study, a data-driven modeling strategy for smart grid PQ coupling assessment based on time series pattern matching is introduced to quantify the influence of single and integrated disturbance among nodes in different pollution patterns. Periodic and random PQ patterns are constructed by using multidimensional frequency-domain decomposition for all disturbances. A multidimensional piecewise linear representation based on local extreme points is proposed to extract the pattern features of single and integrated disturbance, taking into account disturbance variation trend and severity. A feature distance of pattern (FDP) is developed to implement the pattern matching on UPQTS and MPQTS to quantify the influence of single and integrated disturbance among nodes in the pollution patterns. Case studies on a 14-bus distribution system are performed and analyzed, and the accuracy and applicability of the FDP method in the smart grid PQ coupling assessment are compared with those of other time series pattern matching methods.

PQ Pattern Construction
2.1. PQ Time Series Expression. Continuous PQ disturbances include voltage fluctuation, voltage deviation, interharmonic ratio of voltage, total harmonic distortion of voltage, and three-phase voltage unbalance. Each disturbance represents a type of PQ pollution, and the properties of the disturbances differ from each other. Various disturbances may occur independently or simultaneously in power networks. The coupling assessment object can be single or integrated disturbance, depending on the actual situation. The authors of [22] constructed a PQ state space to represent the state set of continuous disturbances. PQ state change can be expressed as a multivariate function of time. Hence, MPQTS can be expressed as $M(t) = [m_1(t), m_2(t), \ldots, m_5(t)]^{T}$. In this expression, $m_i(t)$ is the UPQTS of disturbance $i$, $m_i(t) = [m_i(1), m_i(2), \ldots, m_i(N)]$; $m_i(t)$ is the monitoring value of disturbance $i$ at time $t$; and $i = 1, 2, \ldots, 5$ indexes the five disturbances in the order listed.

Multidimensional Frequency-Domain Decomposition.
The PQ condition of nodes is determined by the common effect of different disturbance sources in a power network. The operations of some disturbance sources present certain regularity; for instance, arc furnaces and rolling mills may have a fixed working time and production task. The PQ disturbances generated by such sources usually show periodic variations. The operations of other disturbance sources are irregular, such as the output power of wind and photovoltaic generation, which is affected by weather conditions. The PQ disturbances generated by this type of source present random variations. Therefore, daily cycle, weekly cycle, low frequency, and high frequency patterns are defined in this study to represent the PQ pollution affected by different types of disturbance sources. Multidimensional frequency-domain decomposition is applied to all disturbances to construct the four pollution patterns.
Fourier series expansion can decompose a continuous periodic signal into mutually orthogonal frequency components [23]; hence, it is applied synchronously to all the dimensions of $M(t)$:

$$M(t) = a_0 + \sum_{k} \left[ a_k \cos(\omega_k t) + b_k \sin(\omega_k t) \right]$$

where $a_0$ is the direct current component and $a_k$ and $b_k$ are the coefficients of the cosine and sine components of $m_i(t)$, respectively. According to the angular frequency $\omega_k = (k/N) \times 2\pi$, $M(t)$ is reconstructed as

$$M(t) = a_0 + d_M(t) + w_M(t) + l_M(t) + h_M(t)$$

where $d_M(t)$, $w_M(t)$, $l_M(t)$, and $h_M(t)$ represent the daily cycle, weekly cycle, low frequency, and high frequency components of $M(t)$, respectively. The disturbance data are sampled every 10 min, which gives a total of 144 sampling points per day. The frequency sets of $d_M(t)$, $w_M(t)$, $l_M(t)$, and $h_M(t)$ are established as follows.
Periodic Frequency Sets. The cycles of $d_M(t)$ and $w_M(t)$ are 144 and $144 \times 7$ sampling points, respectively, and their angular frequency sets $\Omega_d$ and $\Omega_w$ are selected with the modulo operation mod.
Random Frequency Sets. The daily 144 sampling points are regarded as the critical condition that distinguishes $l_M(t)$ from $h_M(t)$, whose angular frequency sets are $\Omega_l$ and $\Omega_h$. $M(t)$ is transformed from the time domain to the frequency domain by multidimensional discrete Fourier transform.
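As an illustration of the frequency sets described above, the following Python sketch splits each disturbance dimension into daily cycle, weekly cycle, low frequency, and high frequency components with a discrete Fourier transform. The bin-selection rules, the function name `decompose`, and the requirement of a whole number of weeks are assumptions made for this sketch; they are one plausible reading of the periodic and random frequency sets, not their exact definitions.

```python
import numpy as np

SAMPLES_PER_DAY = 144  # 10-min sampling interval, as stated in the text


def decompose(M):
    """Split each row of M (disturbances x samples) into frequency bands.

    Returns (a0, daily, weekly, low, high). The masks below are an assumed
    reading of the periodic and random frequency sets.
    """
    M = np.asarray(M, dtype=float)
    n = M.shape[1]
    if n % (SAMPLES_PER_DAY * 7) != 0:
        raise ValueError("sketch assumes a whole number of weeks")
    days = n // SAMPLES_PER_DAY   # bin index of the daily fundamental
    weeks = days // 7             # bin index of the weekly fundamental

    spectrum = np.fft.rfft(M, axis=1)
    k = np.arange(spectrum.shape[1])

    # Periodic sets: harmonics of the daily and weekly fundamentals.
    daily = (k > 0) & (k % days == 0)
    weekly = (k > 0) & (k % weeks == 0) & ~daily
    # Random sets: remaining bins below or above the daily fundamental.
    low = (k > 0) & (k < days) & ~weekly
    high = (k > days) & ~daily & ~weekly

    def band(mask):
        # Keep only the selected bins and return to the time domain.
        return np.fft.irfft(np.where(mask, spectrum, 0), n=n, axis=1)

    a0 = M.mean(axis=1, keepdims=True)  # direct current component
    return a0, band(daily), band(weekly), band(low), band(high)
```

Every nonzero bin falls in exactly one mask, so the direct current term plus the four components reproduces the input in this sketch. With only one week of data every bin is a weekly harmonic, so at least two weeks are needed for the random components to be non-empty.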
To extract the four objective patterns, the direct current component $a_0$ is assigned to $d_M(t)$, $w_M(t)$, $l_M(t)$, and $h_M(t)$ in proportion to the sizes of the four components:

$$M_x(t) = \frac{a_0 \, |x_M(t)|_{\mathrm{rms}}}{|d_M(t)|_{\mathrm{rms}} + |w_M(t)|_{\mathrm{rms}} + |l_M(t)|_{\mathrm{rms}} + |h_M(t)|_{\mathrm{rms}}} + x_M(t), \quad x \in \{d, w, l, h\}$$

where $|x_M(t)|_{\mathrm{rms}}$ is the root-mean-square value of $d_M(t)$, $w_M(t)$, $l_M(t)$, or $h_M(t)$; $M_d(t)$ and $M_w(t)$ are the daily and weekly cycle patterns, which reflect the disturbance variations per day and per week, respectively; and $M_l(t)$ and $M_h(t)$ are the low and high frequency patterns, which reflect the randomness of the slow and fast disturbance variations, respectively.

[...] (9) or (10). In this definition of local extreme points, the extreme points on a horizontal line represent a placid tendency. For two disturbance dimensions, the segmentations of UPQTS based on local extreme points are shown in Figure 1(a). The segmentations of MPQTS are obtained from the local extreme points of each disturbance dimension, as shown in Figure 1(b). Multidimensional segmentation can therefore handle both UPQTS and MPQTS. The purpose of the piecewise strategy is to make the trend in each subsection generally consistent (rising, placid, or falling). Meanwhile, similar trends of different dimensions are segmented into the same subsection.
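The segmentation idea can be sketched in Python as follows. The sketch marks a breakpoint wherever the trend of a dimension changes between rising, placid, and falling, and it takes the union of the breakpoints over all dimensions so that every dimension has a consistent trend inside each subsection. The function names and the union rule are assumptions of this sketch and may differ from the exact definitions of the local extreme points.

```python
import numpy as np


def local_extreme_points(x):
    """Indices where the trend of x changes (rising, placid, falling)."""
    trend = np.sign(np.diff(np.asarray(x, dtype=float)))
    # Sample j is a breakpoint when the trend entering j differs from
    # the trend leaving j.
    return set(np.flatnonzero(trend[1:] != trend[:-1]) + 1)


def segment(M):
    """Breakpoints for a (disturbances x samples) array: union over rows."""
    M = np.atleast_2d(np.asarray(M, dtype=float))
    points = {0, M.shape[1] - 1}  # the series end points always delimit
    for row in M:
        points |= local_extreme_points(row)
    return sorted(int(p) for p in points)
```

A single row gives the UPQTS segmentation, and several rows give the MPQTS segmentation with the same function.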

PQ Pattern Representation
The least squares method is used to obtain a linear representation of the piecewise data $(t_j, m_i(t_j))$ of disturbance $i$ [24].

Pattern Features Representation.
The slope $k_i$ of the fitting segment reflects the trend feature of disturbance $i$. In addition, the average monitoring value $\bar{m}_i$ reflects the severity of disturbance $i$. A serious disturbance has a great influence on PQ and should be given much attention in PQ coupling assessment. Thus, $\bar{m}_i$ is used as an importance factor of disturbance $i$. The time span $T$ is the length of the subsection.
A long subsection has an important effect in characterizing a PQ time series. Hence, $T$ is used as an importance factor of a piecewise trend. Because the value ranges of the slope, the average monitoring value, and the time span are unbounded, two similar time series may be mistakenly regarded as dissimilar owing to an excessive numerical difference between two piecewise features. Therefore, the features are converted as follows:

$$\theta_i = \arctan(k_i), \qquad m_i^{*} = \frac{\bar{m}_i}{m_i^{\max}}, \qquad \tau_j = \frac{T_j}{N}$$

where $\theta_i$ is the inclination angle of the fitted straight line, $m_i^{*}$ is the average per-unit value of disturbance $i$, $m_i^{\max}$ is the standard limit of disturbance $i$, and $\tau_j$ is the ratio of the piecewise length $T_j$ to the total length $N$. A pattern of single and integrated disturbance is represented by the feature matrix (13), where $n$ is the number of subsections of the PQ time series.
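A minimal sketch of the feature extraction for one disturbance dimension is given below, assuming that the subsection breakpoints are already known. The function name `segment_features`, the use of `numpy.polyfit` for the least squares fit, and the row layout of the returned array are choices made for this sketch only.

```python
import numpy as np


def segment_features(x, breakpoints, limit):
    """Return rows (angle, per-unit mean, length ratio), one per subsection.

    x           : samples of one disturbance
    breakpoints : sorted sample indices delimiting the subsections
    limit       : standard limit of the disturbance (per-unit base value)
    """
    x = np.asarray(x, dtype=float)
    total = breakpoints[-1] - breakpoints[0]
    rows = []
    for start, end in zip(breakpoints[:-1], breakpoints[1:]):
        t = np.arange(start, end + 1)
        piece = x[start:end + 1]
        slope = np.polyfit(t, piece, 1)[0]   # least squares straight line
        theta = np.arctan(slope)             # bounded in (-pi/2, pi/2)
        per_unit = piece.mean() / limit      # severity relative to the limit
        rows.append((theta, per_unit, (end - start) / total))
    return np.array(rows)
```

Converting the slope to an angle and the mean to a per-unit value keeps all three features in bounded ranges, which is the motivation given above for the conversions.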

PQ Coupling Assessment
Time series pattern matching can measure the similarity of univariate or multivariate data. The PQ coupling among nodes is reflected by the similarity of the PQ time series variations. Time series pattern matching can therefore be introduced to measure the similarity of the variation trend of UPQTS or MPQTS and thereby achieve the coupling assessment of single or integrated disturbance. On the basis of the pattern construction and representation, a time series pattern matching method is required that applies to both UPQTS and MPQTS and comprehensively considers disturbance variation trend and severity. For a PQ pattern at two nodes, the row numbers of the feature matrices are the same because the disturbance variables correspond to one another. The column numbers of the feature matrices are determined by the numbers of subsections, which generally differ. The required time series pattern matching method should thus be applicable to pattern feature matrices with the same number of rows and different numbers of columns. DTW distance supports time stretching and warping and can determine the alignment matching relation between two univariate time series [16]. Thus, this study proposes an FDP method based on DTW distance to implement the pattern matching on both UPQTS and MPQTS in different pollution patterns.
$A$ and $B$ are the feature matrices of integrated disturbance of pattern $p$ at two nodes:

$$A = [a_1, a_2, \ldots, a_n], \qquad B = [b_1, b_2, \ldots, b_{n'}]$$

where $a_j$ ($j = 1, 2, \ldots, n$) and $b_k$ ($k = 1, 2, \ldots, n'$) are the column vectors of $A$ and $B$, respectively. The feature distance of pattern $p$ between $A$ and $B$ is defined to implement the pattern matching on MPQTS:

$$\mathrm{FDP}(A, B) = \mathrm{FDP}_{\mathrm{base}}(a_1, b_1) + \min \left\{ \mathrm{FDP}(A, B[2:n']),\ \mathrm{FDP}(A[2:n], B),\ \mathrm{FDP}(A[2:n], B[2:n']) \right\}$$

where $A[j:n]$ represents the subsequence composed of column vectors $j$ to $n$ in the pattern feature matrix $A$ and $\mathrm{FDP}_{\mathrm{base}}(a_j, b_k)$ is the basic distance between $a_j$ and $b_k$. The boundary conditions are $\mathrm{FDP}(\emptyset, \emptyset) = 0$, $\mathrm{FDP}(A, \emptyset) = \infty$, and $\mathrm{FDP}(\emptyset, B) = \infty$, where $\emptyset$ represents an empty set of column vectors. $\mathrm{FDP}_{\mathrm{base}}(a_j, b_k)$ reflects the cumulative differences between the corresponding subsections of $A$ and $B$ and considers the importance of disturbances and subsections. In the definition of $\mathrm{FDP}_{\mathrm{base}}(a_j, b_k)$, $m_{ij}^{*}$ and $m_{ik}^{\prime *}$ are, respectively, the average per-unit values of disturbance $i$ in the subsections $j$ and $k$; $\tau_j$ and $\tau'_k$ are, respectively, the time-weighted coefficients of the subsections $j$ and $k$; and $w_i$ is the disturbance-weighted coefficient. A longer time span corresponds to a stronger effect on the characterization of a pattern feature. A more serious disturbance corresponds to a more important influence in the PQ coupling assessment. When $w_i = 1$, the FDP method achieves the pattern matching on UPQTS.
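The recursive structure of the FDP, with its empty-set boundary conditions, is that of DTW and can be evaluated with dynamic programming. The Python sketch below shows this under stated assumptions: the array layout (subsection, disturbance, feature) is a choice of the sketch, and `base_distance` is a hypothetical stand-in that combines angle, per-unit mean, and length ratio with disturbance weights. It is not the basic distance defined in this study, which can be supplied through the `dist` argument.

```python
import numpy as np


def base_distance(a, b, weights):
    """Hypothetical base distance between two subsection feature blocks.

    a and b have shape (disturbances, 3): angle, per-unit mean, length
    ratio. The weighting used here is an assumption of this sketch.
    """
    fa = a[:, 0] * a[:, 1] * a[:, 2]
    fb = b[:, 0] * b[:, 1] * b[:, 2]
    return float(np.sum(weights * np.abs(fa - fb)))


def fdp(A, B, weights, dist=base_distance):
    """DTW-style feature distance between two subsection sequences.

    A has shape (n, disturbances, 3); B has shape (n', disturbances, 3).
    D[j, k] holds the distance between the suffixes A[j:] and B[k:].
    """
    n, m = len(A), len(B)
    D = np.full((n + 1, m + 1), np.inf)  # one empty suffix -> infinity
    D[n, m] = 0.0                        # both suffixes empty -> zero
    for j in range(n - 1, -1, -1):
        for k in range(m - 1, -1, -1):
            D[j, k] = dist(A[j], B[k], weights) + min(
                D[j, k + 1], D[j + 1, k], D[j + 1, k + 1])
    return float(D[0, 0])
```

Because the recursion allows one subsection to align with several subsections of the other sequence, the two feature matrices may have different numbers of columns, which is the property required above.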
The feature distance of the synthetic PQ pattern between two nodes is calculated according to the contributions of the four patterns. The more serious the PQ pollution caused by a pattern, the greater the contribution of that pattern.
In the corresponding equation, $m_p^{*}$ and $m_p^{\prime *}$ are the average per-unit values of single or integrated disturbance of pattern $p$ at the two nodes, respectively, and the result is the feature distance of the synthetic PQ pattern between the two nodes.
A small pattern feature distance corresponds to a high coupling degree of single or integrated disturbance among nodes. In accordance with this rule, the coupling degree $C$ of single or integrated disturbance is defined as

$$C = \left( 1 - \frac{\mathrm{FDP}}{\mathrm{FDP}_{\max}} \right) \times 100\%$$

where $\mathrm{FDP}_{\max}$ is the maximum of the pattern matching results. $C = 0$ indicates the minimal PQ coupling degree; it does not mean that absolutely no interaction of single or integrated disturbance occurs between two nodes. $C = 100\%$ indicates that the variation rules of single or integrated disturbance at two nodes are the same.
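The two steps above can be sketched in Python as follows. `coupling_degree` uses a linear mapping in which the largest distance gives 0 and a zero distance gives 100%, as the text describes. `synthetic_fdp` weights the pattern distances by the summed per-unit severity at the two nodes; this particular weighting is an assumption of the sketch, chosen only so that more polluting patterns contribute more.

```python
import numpy as np


def synthetic_fdp(pattern_fdp, per_unit_a, per_unit_b):
    """Combine per-pattern distances using assumed severity weights."""
    pattern_fdp = np.asarray(pattern_fdp, dtype=float)
    severity = np.asarray(per_unit_a, float) + np.asarray(per_unit_b, float)
    weights = severity / severity.sum()  # contribution of each pattern
    return float(np.sum(weights * pattern_fdp))


def coupling_degree(fdp_values):
    """Map feature distances to coupling degrees in percent."""
    fdp_values = np.asarray(fdp_values, dtype=float)
    fdp_max = fdp_values.max()
    if fdp_max == 0:
        # All distances are zero: identical variation rules everywhere.
        return np.full(fdp_values.shape, 100.0)
    return (1.0 - fdp_values / fdp_max) * 100.0
```

Because the mapping is relative to the largest distance in the result set, a coupling degree of 0 only identifies the least coupled node pair in that set.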

Case Studies and Results
Smart grid PQ coupling assessment based on pattern construction, pattern representation, and time series pattern matching is conducted on a 14-bus distribution system, as shown in Figure 2. The power supplies are located at nodes 1, 2, and 3. The UPQTS of voltage deviation (actual values) and the MPQTS of integrated disturbance (per-unit values) at the nodes other than the power supplies are given in Figures 3 and 4, respectively, with a sampling step of 10 min. Voltage fluctuation, voltage deviation, interharmonic ratio of voltage, total harmonic distortion of voltage, and three-phase voltage unbalance are represented by the disturbance variable dimensions 1, 2, . . ., 5, respectively. To prevent large numerical differences among the disturbances from obscuring the MPQTS plots, the disturbances are transformed into per-unit values (p.u.), with the limits of the disturbances as the base values [25]. The actual values are used in the calculation. The following two cases are considered.

Case 1.
The coupling of voltage deviation among the nodes is assessed through the FDP method. The accuracy and applicability of the FDP in the voltage deviation coupling assessment are compared with those of other pattern matching methods for univariate time series in this case.

Coupling Assessment of Voltage Deviation through FDP Method.
The voltage deviations of the nodes are decomposed into daily cycle, weekly cycle, low frequency, and high frequency patterns, as shown in Figure 5. The FDP method is used to implement the UPQTS pattern matching in the daily cycle, low frequency, and high frequency patterns. The assessment results of the coupling degree of voltage deviation among the nodes are shown in Figure 6.
Figure 5 shows the magnitudes and variation laws of voltage deviation in the four patterns, which reflect the following aspects. The daily cycle pattern changes periodically, and the weekly cycle pattern is nonexistent. The low frequency and high frequency patterns change randomly. The voltage

Comparison with ED and DTW Methods.
ED and DTW distance are used to implement the pattern matching on the UPQTS of voltage deviation [15,16]. The assessment results of the coupling degree of voltage deviation among the nodes are given in Figure 7.
The coupling degrees of voltage deviation obtained via ED and DTW distance are similar (Figure 7). A comparison of Figures 3 and 7 reveals that a large coupling degree corresponds to similar voltage deviation values rather than similar voltage deviation variations between two nodes, which indicates that the coupling degrees obtained via ED and DTW distance are unreasonable and inaccurate. The reason is as follows. ED and DTW distance measure the difference in disturbance values and therefore cannot discriminate disturbances that have similar values but different variation trends. In contrast, the FDP measures the difference in the variation trend of disturbance, which reflects the coupling property of disturbance among nodes. Therefore, the FDP method is more accurate and applicable for the coupling assessment of single disturbance among nodes in a distribution system.
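The value-based behavior of ED and DTW distance can be reproduced with a small synthetic example. In the Python sketch below, the three series and their names are made up for illustration and are not taken from the case study; `dtw` is a textbook DTW with absolute difference as the local cost.

```python
import numpy as np


def euclidean(a, b):
    """Euclidean distance between two equal-length series."""
    return float(np.sqrt(np.sum((np.asarray(a) - np.asarray(b)) ** 2)))


def dtw(a, b):
    """Classic DTW distance with absolute difference as the local cost."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            D[i, j] = abs(a[i - 1] - b[j - 1]) + min(
                D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return float(D[n, m])


t = np.arange(10)
rising = 1.00 + 0.01 * t         # reference series
falling = 1.09 - 0.01 * t        # similar values, opposite trend
rising_offset = 1.50 + 0.01 * t  # same trend, different level
```

For these series both `euclidean(rising, falling)` and `dtw(rising, falling)` are smaller than the corresponding distances to `rising_offset`, so both measures rank the series with the opposite trend as the closer match.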

Case 2.
The coupling of integrated disturbance among the nodes is assessed through the FDP method. The accuracy and applicability of the FDP in the integrated disturbance coupling assessment are compared with those of other pattern matching methods for multivariate time series in this case.

Mathematical Problems in Engineering

Coupling Assessment of Integrated Disturbance through FDP Method.
The integrated disturbances of the nodes are decomposed into daily cycle, weekly cycle, low frequency, and high frequency patterns, which are represented by the results at node 14 (Figure 8). The FDP method is used to implement the MPQTS pattern matching in the daily cycle, low frequency, and high frequency patterns. The assessment results of the coupling degree of integrated disturbance among the nodes are shown in Figure 9.
Figure 8 shows the magnitudes and variation laws of the integrated disturbance in the four patterns, which reflect the following aspects. The daily cycle pattern changes periodically, and the weekly cycle pattern is nonexistent. The low frequency and high frequency patterns change randomly. The integrated disturbance magnitude of the daily cycle

5.2.2. Comparison with PD, PCA, and TD.
The $k$-nearest neighbor method is introduced to compare the accuracy of FDP with that of PD, PCA, and TD for the MPQTS pattern matching [26]. As shown in Figure 4, the MPQTS of nodes 9, 10, 11, 12, and 13 are more similar to one another than to those of the other nodes in terms of amplitudes and variation rules, which indicates that this set of MPQTS belongs to the same category. PD, PCA, and TD methods are used for pattern matching on the five MPQTS samples. The densest subsection $[i-1 : i+1,\ j-1 : j+1]$ is chosen in PD [17]. For PCA, the cumulative percentage of the principal components is 0.85 [18]. The fitting error in TD is set to 0.03, and the importance of each disturbance is the same [19]. The accuracy rate $\eta$ of the MPQTS pattern matching for a sample is calculated from $r$, the number of matching results that fall in the same category. For the five samples, the mathematical expectation $E$ of the accuracy rate $\eta_j$ of a method is calculated as

$$E = \sum_{j} \eta_j \, P(\eta_j)$$

where $P(\eta_j)$ is the occurrence probability of $\eta_j$. The accuracy rate and expectation of PD, PCA, TD, and FDP in the MPQTS pattern matching are listed in Tables 1 and 2, respectively.
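The evaluation procedure can be illustrated with the following Python sketch. It assumes that the accuracy rate of a sample is the share of its k nearest neighbors that belong to the same category and that the occurrence probability of a rate is its relative frequency among the samples; both choices, like the function names, are assumptions of this sketch. The distance matrix can come from any of the compared methods.

```python
from collections import Counter

import numpy as np


def knn_accuracy(dist, labels, k):
    """Per-sample share of the k nearest neighbors with the same label.

    dist   : square matrix of pairwise pattern distances
    labels : category of each sample
    """
    dist = np.asarray(dist, dtype=float)
    rates = []
    for i in range(len(labels)):
        # Rank the other samples by distance and keep the k closest.
        nearest = [j for j in np.argsort(dist[i]) if j != i][:k]
        hits = sum(labels[j] == labels[i] for j in nearest)
        rates.append(hits / k)
    return rates


def expectation(rates):
    """Sum of each distinct rate times its relative frequency."""
    counts = Counter(rates)
    return sum(rate * c / len(rates) for rate, c in counts.items())
```

A method whose distances keep samples of the same category closest to one another obtains rates near 1 and therefore a high expectation.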
Tables 1 and 2 show that the accuracy rate and expectation of FDP are higher than those of PD, PCA, and TD in the MPQTS pattern matching. The comparison results indicate that reasonable and accurate coupling assessment of the [...] which directs more attention to over-limit disturbance than to standard-compliant disturbance. However, TD ignores the difference in disturbance severity, which may lead to erroneous results in MPQTS pattern matching. Compared with the three methods, FDP not only embodies the timing relationship and continuous variation of MPQTS, but also reflects the difference in the variation trends of integrated disturbance while considering disturbance severity. Therefore, the FDP method is more accurate and applicable for coupling assessment of integrated disturbance among nodes in a distribution system.

Conclusion
In this study, smart grid PQ coupling assessment based on pattern construction, pattern representation, and time series pattern matching is introduced to quantify the influence of single and integrated disturbance among nodes in different pollution patterns.
Case studies on a 14-bus distribution system are performed and analyzed to verify the accuracy and applicability of the FDP method in the smart grid PQ coupling assessment.
The main outcomes of the study are as follows.
(1) Periodic and random PQ patterns are constructed by using multidimensional frequency-domain decomposition for all disturbances, taking into account the variation laws of disturbances affected by different types of pollution sources.
(2) A multidimensional piecewise linear representation based on local extreme points is proposed to extract the pattern features of single and integrated disturbance in consideration of disturbance variation trend and severity.
(3) An FDP method is developed to implement the pattern matching on both UPQTS and MPQTS to quantify the influence of single and integrated disturbance among nodes in different pollution patterns. The study not only has theoretical value for PQ relation analysis among nodes in power networks, but also has potential application value in associated region division, local disturbance control, and disturbance estimation on the basis of the PQ coupling property.
(4) Compared with ED, DTW distance, PD, PCA, and TD, the FDP method is more accurate and applicable for the PQ coupling assessment. The advantages of FDP are as follows. The FDP method embodies the timing relationship and continuous variation of PQ time series. Moreover, the developed method

Figure 1: Schematic of multidimensional segmentations based on local extreme points.

Figure 5: Pattern construction of voltage deviation.

Figure 6: Coupling degree of voltage deviation in different pollution patterns obtained via FDP.

Figure 7: Coupling degree of voltage deviation among nodes obtained via ED and DTW distance.

Figure 8: Pattern construction of integrated disturbance at node 14.

Figure 9: Coupling degree of integrated disturbance in different pollution patterns obtained via FDP.