A GIS Partial Discharge Pattern Recognition Method Based on Improved CBAM-ResNet

Different types of partial discharge (PD) cause different damage to gas-insulated substations (GIS), so correctly identifying the PD type is very important for evaluating the GIS insulation condition. The traditional PD pattern recognition algorithm suffers from low recognition accuracy and slow recognition speed in engineering applications. To effectively diagnose the GIS PD type and safeguard the safe and reliable operation of the distribution network, a GIS PD pattern recognition method based on improved CBAM-ResNet is proposed in this paper. The improved CBAM-ResNet takes advantage of the residual neural network and the attention mechanism. In particular, the channel attention module and the spatial attention module are connected in parallel in the improved CBAM. The experimental results show that the proposed GIS PD pattern recognition method achieves recognition rates of 93.58%, 95.00%, 93.55%, and 93.88% for the four PD types. Compared with the traditional PD pattern recognition algorithm, the proposed algorithm has the advantages of a lightweight model and more accurate recognition results, which gives it better engineering application value.


Introduction
GIS is widely used in power systems thanks to its advantages of good insulation, high reliability, and small physical dimensions [1]. However, when the power system works in a complex environment of high temperature and high pressure, or during GIS manufacturing, some safety hazards are inevitable, such as dust, metal tips, and air gaps [2]. When there are insulation defects inside the GIS, PD generates ultrahigh-frequency (UHF) signals [3]. PD causes, and progressively aggravates, insulation deterioration of the GIS in the substation. If not handled in time, it will eventually lead to insulation breakdown of the GIS. Since different insulation defects require corresponding treatment methods, it is necessary to judge the type of equipment insulation fault accurately and effectively and then take measures to avoid insulation deterioration or even breakdown, which is an important technical means to safeguard the stable operation of substations and power systems [4,5].
The PD signal has obvious time-frequency-domain characteristics, and its phase-resolved partial discharge (PRPD) pattern can intuitively represent the relationship between the power frequency phase φ corresponding to the partial discharge pulse, the discharge amount q, and the number of discharges n [6,7]. The PRPD pattern contains the intrinsic information of insulation defect types, which makes the identification of PRPD patterns an important method for diagnosing defect types [8]. Li et al. [9] first extracted the statistical features of the PRPD pattern and then used a probabilistic neural network to analyze them. Sha and Liang [10] established different feature subsets based on the statistical features of the PRPD pattern and eliminated redundant features through a probabilistic neural network algorithm, with a recognition accuracy of about 90% for partial discharge types. Yan et al. [11] extracted the time-frequency distribution image features of partial discharge and input them into an SVM for classification, but this method of image feature extraction requires various manual transformations, which seriously reduces the efficiency of partial discharge pattern recognition.
With the development of deep learning and computer hardware, convolutional neural networks (CNN) have developed rapidly. Since a CNN can automatically extract data features and remove the dependence on manual feature extraction and expert experience, it has the potential to improve the efficiency of GIS partial discharge pattern recognition [12]. In recent years, more and more deep-learning methods have been applied in the field of partial discharge pattern recognition. Zhang et al. [13] used a deep belief network for GIS partial discharge pattern recognition and compared the recognition results with SVM and BPNN, and the overall recognition rate was significantly improved. Yin et al. [14] used a PRPD image multifeature information fusion method for PD pattern recognition, and the online sequential-extreme learning machine (OS-ELM) algorithm for PD pattern recognition was proposed by Zhang et al. [15]. These methods have achieved good recognition results, which proves the effectiveness of deep-learning algorithms in the field of PD pattern recognition. However, improving the performance of a CNN model depends largely on a deep network structure, yet as the number of network layers continues to increase, the performance decreases instead [16]. The network then suffers from degradation problems such as information loss and vanishing gradients. In response, techniques such as improved activation functions and batch normalization have been used to improve network performance, but they have not completely resolved the problem. The emergence of the residual neural network has solved, to a certain extent, the degradation problem caused by the increase in the number of network layers.
This paper proposes a GIS PD pattern recognition method based on improved CBAM-ResNet, which takes advantage of residual neural networks and convolutional block attention modules. The network framework has three key components: the PRPD image input layer, the feature extraction layer, and the classification output layer. In particular, the channel attention module and the spatial attention module are connected in parallel to avoid the interference caused by cascading the two attention mechanisms. The network structure is simple and can obtain detailed information on the target, reducing the redundant feature information generated in feature extraction. Each type of PD has a different form of PRPD image, which is selected as the input for PD-type recognition. To prove the effectiveness of the algorithm, comparative experiments were completed. The novelty and contribution of this work are threefold: (1) proposing an improved convolutional block attention module residual network (improved CBAM-ResNet) for GIS PD-type recognition, (2) developing a "parallel connection" structure for CBAM to improve the efficiency of extracting effective information from PRPD images, and (3) applying the proposed method to the GIS PD experimental platform and an engineering project for method verification and demonstration.
The rest of the paper is organized as follows. Section 2 details the proposed improved CBAM-ResNet for GIS PD pattern recognition. Section 3 reports the experimental results with discussions. The paper is concluded in Section 4.

Proposed GIS PD Pattern Recognition Method
The common GIS partial discharge modes include four types: tip (corona) discharge, free metal particle discharge, creeping (surface) discharge, and suspended electrode discharge [17]. The PRPD images of the various types of partial discharge have distinctive characteristic differences. The proposed GIS PD pattern recognition method uses PRPD image features and an improved CNN to improve the PD-type recognition accuracy. In particular, the attention module is added to the residual neural network in this work, forming an improved convolutional block attention module residual network (improved CBAM-ResNet) as the improved CNN for GIS PD-type recognition. Following an overview of the proposed GIS PD pattern recognition method, the three key components of the method, namely the PRPD image, the improved CBAM-ResNet structure, and the classification output, are detailed in this section.

Method Overview.
The overall architecture of the proposed GIS PD pattern recognition method based on improved CBAM-ResNet is illustrated in Figure 1. The network framework has three key components, namely the PRPD image input layer, the feature extraction layer, and the classification output layer, which together realize the detection from the PRPD image to the PD type. In particular, the PRPD image is the information source for the input layer, the two residual blocks and the improved residual block form the feature extraction layer, and the classification output layer consists of the global average pooling operation and the softmax classifier. The feature extraction layer uses ResNet34 as the backbone network. ResNet34 contains 5 stages, that is, 5 convolution stages with different parameters and features. The residual network is used in the feature extraction layer to alleviate network performance degradation phenomena such as exploding or vanishing gradients caused by the increase in network depth, to speed up model convergence, and to improve model training. Furthermore, the CBAM attention mechanism is introduced in the residual blocks of stage 3, stage 4, and stage 5. It seeks to improve the feature extraction ability of the proposed model, especially for information in low-contrast areas, so as to improve the detection accuracy of the PD type. In the classification output layer, global average pooling is used to fuse the image features, which compresses the output feature dimension, and the softmax function is used to calculate the classification result.

PRPD Image.
The PRPD image is a commonly used representation of PD information in the field of GIS PD pattern recognition, which mainly describes the relationship between the number of partial discharges n, the discharge quantity q, and the corresponding discharge phase φ [18]. In this paper, the PRPD image is a two-dimensional spectrum constructed from PD signals sampled continuously for 1 min, where the abscissa represents the discharge phase φ, the ordinate represents the discharge quantity q, and the number of discharges n is mapped to the color space. Each typical partial discharge has different discharge characteristics, which are manifested on the PRPD image as differences in the initial discharge phase, the amplitude, and the number of partial discharges.
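The construction of a PRPD pattern described above can be sketched as a two-dimensional count grid over phase and discharge quantity. This is only a minimal illustration, not the acquisition software used in this work: the function name `build_prpd`, the input format (a list of phase-amplitude pairs), and the bin counts are hypothetical choices made for the example.

```python
def build_prpd(pulses, phase_bins=36, amp_bins=10, amp_max=1.0):
    """Accumulate PD pulses into a phase/amplitude count grid.

    pulses: iterable of (phase_deg, amplitude) pairs (hypothetical format).
    Returns grid[a][p], the number of discharges n that fall into
    amplitude bin a (ordinate, q) and phase bin p (abscissa, phi).
    """
    grid = [[0] * phase_bins for _ in range(amp_bins)]
    for phase_deg, amplitude in pulses:
        # Wrap the phase into [0, 360) and map it to a column index.
        p = int((phase_deg % 360.0) / 360.0 * phase_bins)
        # Map the amplitude to a row index, clamped to the grid.
        a = int(amplitude / amp_max * amp_bins)
        a = max(0, min(a, amp_bins - 1))
        grid[a][p] += 1
    return grid


# Three illustrative pulses: two near 45 degrees, one near 225 degrees.
example = build_prpd([(45.0, 0.25), (45.5, 0.26), (225.0, 0.85)])
```

Mapping each count to a color then yields an image of the kind used as network input; the color map itself is not specified here.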
The typical PRPD image of corona discharge is illustrated in Figure 2(a). It can be seen that the amplitude of the corona discharge is small and that the maximum discharge amplitude appears at the peak voltage. The discharge density of the negative half cycle is greater than that of the positive half cycle, and the discharge is mainly distributed near the peak value of the negative half cycle. The pulse phase of the surface discharge is mainly distributed at the rising edge of the positive half cycle and the falling edge of the negative half cycle, as shown in Figure 2(b). The PD pulse phase distribution of the suspended electrode discharge is symmetrical about the peak of the applied voltage, as demonstrated in Figure 2(c). For free metal particle discharge, the initial discharge point appears earlier than the voltage peak, the distribution of PD is relatively compact, and the amplitude and density of PD are basically the same in the positive and negative half cycles, as shown in Figure 2(d).

Improved CBAM-ResNet Structure.
CNN is a kind of deep-learning network that simulates the hierarchical processing of input data by brain neurons. It can extract the features of the input layer, establish a structural model with rich information, and obtain more essential abstract features for better classification and recognition [19]. To improve the efficiency of type recognition, this paper introduces the residual network structure into the typical CNN and adds the improved CBAM, forming the improved CBAM-ResNet for GIS PD-type recognition. ResNet34 is used as the backbone network in the proposed improved CBAM-ResNet. The five convolutional stages with different parameters are used to construct the feature extraction layer, which extracts the PD image features. Then, the classification output layer is constructed using global average pooling and the softmax function to calculate the recognition results. In the feature extraction layer, stages 1 and 2 consist of a convolution layer and a maximum pooling layer, and stages 3-5 are stacked from residual blocks with the improved CBAM.

Residual Element.
For a normal CNN, increasing the depth of the neural network, that is, adding new hidden layers, can often improve the training accuracy after complete training. However, in actual experiments, blindly adding too many hidden layers and deepening the neural network easily causes vanishing and exploding gradients in the network, and the training accuracy cannot reach the expected goal. In response to these problems, He et al. [20] introduced the residual network structure into the ordinary CNN. The skip connection structure adopted in the residual network is shown in Figure 3.
In particular, x and y represent the input and output of the L-th residual network, respectively. F(x) is the residual function, representing the residual learned by this residual network. The activation function ReLU is represented by σ(x). The residual learned in the residual network can be expressed as

$$F(x) = w_2\,\sigma(w_1 x), \tag{1}$$

where $w_1$ and $w_2$ represent the weights of the convolution layers. H(x) stands for the identity map, that is, H(x) = x, and the output of the L-th residual network can be defined as

$$y = \sigma\big(F(x) + H(x)\big) = \sigma\big(F(x) + x\big). \tag{2}$$
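The skip connection described above can be illustrated with a small numerical sketch. It is a simplified stand-in rather than the network used in this work: dense matrices replace the convolution layers, and applying ReLU after the addition follows the common convention of He et al. [20], which is assumed here. The names `residual_unit`, `matvec`, and `relu` are hypothetical.

```python
def relu(v):
    """Element-wise ReLU on a list of floats."""
    return [max(0.0, a) for a in v]


def matvec(w, v):
    """Multiply matrix w (list of rows) by vector v."""
    return [sum(wij * vj for wij, vj in zip(row, v)) for row in w]


def residual_unit(x, w1, w2):
    """Residual unit: the learned residual F(x) = w2 * relu(w1 * x)
    is added to the identity branch x, then passed through ReLU."""
    f = matvec(w2, relu(matvec(w1, x)))
    return relu([fi + xi for fi, xi in zip(f, x)])


zeros = [[0.0, 0.0], [0.0, 0.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]

# With zero weights the residual vanishes and the input passes through.
y_skip = residual_unit([1.0, 2.0], zeros, zeros)
# With identity weights the residual equals the input, so y = 2x.
y_double = residual_unit([1.0, 2.0], identity, identity)
```

The first call shows why the structure eases training of deep networks: a block whose weights contribute nothing still passes its input forward unchanged (for non-negative inputs) instead of degrading it.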

Improved CBAM.
The PD information occupies only a small part of the PRPD image, and most of the content is background. In order to eliminate the redundant feature information irrelevant to the partial discharge and improve the detection accuracy of GIS PD, the residual block in ResNet34 is reconstructed, and an improved CBAM attention mechanism is designed and added in this work. CBAM imitates human visual perception: it automatically filters out unimportant information, devotes more attention resources to the target areas that need to be focused on, and improves the efficiency and accuracy of visual information processing.
The CBAM attention mechanism consists of a channel attention module and a spatial attention module and improves the ability of the network to extract implicit important features. It collaboratively learns the key information in the image, assigning a higher weight to the PD information area in the PRPD image and a lower weight to the background information. This strengthens the attention of the neural network to PD information and improves the feature learning and expression ability of the network.
In the "serial connection" structure, the channel attention module and the spatial attention module are linked in series. Whether the channel attention module is applied first and the spatial attention module second, or the other way round, the weights of the second module are generated from the feature image output by the first [21]. The output of the first attention module focuses more on valid information. However, the information in a PRPD image is highly localized. The first attention module can be regarded as "modifying" the original input feature image, which to a certain extent affects the features learned by the attention module that follows it. In other words, the first attention module focuses on the information that is valid under its own mechanism and ignores the information required by the other module, which is detrimental to the improvement of recognition accuracy.
In order to solve this problem, this work improves the serial attention module of CBAM and adopts a "parallel connection" structure, so that both attention modules learn directly from the original input feature image and the order of the spatial attention and channel attention no longer matters, which yields the improved CBAM. The overall process of the improved CBAM attention mechanism is shown in Figure 4.
The proposed improved CBAM first obtains the corresponding weights from the input feature image $F_1$ through channel attention and spatial attention and then multiplies the weights with the original input feature image $F_1$ to obtain the feature image $F_2$. The process can be expressed as

$$F_2 = M_c(F_1) \otimes M_s(F_1) \otimes F_1,$$

where $M_c(F_1)$ stands for the weights of the channel attention, $M_s(F_1)$ stands for the weights of the spatial attention, and $\otimes$ denotes element-wise multiplication.

The channel attention mechanism treats the feature information of different channels differently by constructing the relationship between the channels of different feature maps. The input feature map is marked as $F_1 \in \mathbb{R}^{C \times H \times W}$, where H, W, and C are the height, width, and channel number of the input feature map, respectively.
The global spatial information of the feature map $F_1$ is compressed by global maximum pooling and global average pooling, which generates two feature maps $S_1$ and $S_2$ of size $1 \times 1 \times C$. The two feature maps are input into a shared multilayer perceptron (MLP) network to obtain two one-dimensional feature maps. These are summed channel by channel and then normalized using the sigmoid function to obtain the channel weight $M_c$ of size $1 \times 1 \times C$. The channel attention weight calculation process can be expressed as

$$S_1(x_0) = \max_{i_0,\, j_0} f_{x_0}(i_0, j_0), \qquad S_2(x_0) = \frac{1}{H \times W} \sum_{i_0=1}^{H} \sum_{j_0=1}^{W} f_{x_0}(i_0, j_0),$$

$$M_c(F_1) = \sigma\big(\mathrm{MLP}(S_1) + \mathrm{MLP}(S_2)\big),$$

where σ stands for the sigmoid function and $f_{x_0}(i_0, j_0)$ represents the pixel value at the coordinate $(i_0, j_0)$ in the $x_0$-th channel of the input feature map $F_1$.
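The channel attention steps described above (global maximum and average pooling, a shared MLP, channel-wise summation, and a sigmoid) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the trained module: the two-layer MLP with a ReLU hidden layer follows the usual CBAM design and is assumed here, and the weight matrices `w_down` and `w_up` as well as all function names are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def shared_mlp(v, w_down, w_up):
    """Two-layer perceptron with a ReLU hidden layer (assumed design)."""
    hidden = [max(0.0, sum(w * a for w, a in zip(row, v))) for row in w_down]
    return [sum(w * h for w, h in zip(row, hidden)) for row in w_up]


def channel_attention(feat, w_down, w_up):
    """feat: C x H x W nested lists. Returns one weight in (0, 1) per channel."""
    # Global max pooling and global average pooling over each channel.
    s_max = [max(max(row) for row in ch) for ch in feat]
    s_avg = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in feat]
    # The same MLP processes both pooled vectors; outputs are summed per channel.
    a = shared_mlp(s_max, w_down, w_up)
    b = shared_mlp(s_avg, w_down, w_up)
    return [sigmoid(ai + bi) for ai, bi in zip(a, b)]


# Two channels of size 2 x 2: one carries signal, one is empty background.
feat = [[[1.0, 2.0], [3.0, 4.0]], [[0.0, 0.0], [0.0, 0.0]]]
m_c = channel_attention(feat, w_down=[[1.0, 0.0]], w_up=[[1.0], [-1.0]])
```

With these illustrative weights, the channel that carries signal receives a weight close to 1 and the empty channel a weight close to 0, which is the behavior the text attributes to the channel attention module.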
The improved CBAM proposed in this paper adopts the "parallel connection" structure of the two attention mechanisms, so the spatial attention mechanism also takes the original input feature map $F_1$ as input. It performs maximum pooling and average pooling along the channel dimension and generates two feature maps $P_1$ and $P_2$ of size $1 \times H \times W$. The two feature maps are concatenated along the channel dimension, passed through a convolution operation, and then normalized by the sigmoid function to obtain the spatial weight $M_s$. The process can be expressed as follows:

$$M_s(F_1) = \sigma\big(f_{\mathrm{conv}}([P_1; P_2])\big),$$

where $f_{\mathrm{conv}}$ stands for the convolution operation and $[P_1; P_2]$ denotes the concatenation of the two maps along the channel dimension.
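The spatial attention steps and the parallel combination can be sketched in the same way. Several details are assumptions made for illustration only: the kernel size and zero padding of the convolution are not specified in the text, and multiplying the input by both weights is one reading of how the weights are applied to the original feature map. The names `spatial_attention` and `parallel_cbam` are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def spatial_attention(feat, kernel, bias=0.0):
    """feat: C x H x W nested lists; kernel: 2 x k x k with odd k.
    Returns an H x W map of weights in (0, 1)."""
    C, H, W = len(feat), len(feat[0]), len(feat[0][0])
    # Max pooling and average pooling along the channel dimension.
    p_max = [[max(feat[c][i][j] for c in range(C)) for j in range(W)] for i in range(H)]
    p_avg = [[sum(feat[c][i][j] for c in range(C)) / C for j in range(W)] for i in range(H)]
    maps = [p_max, p_avg]  # the two maps stacked as a 2-channel input
    k = len(kernel[0])
    r = k // 2
    out = [[0.0] * W for _ in range(H)]
    for i in range(H):
        for j in range(W):
            acc = bias
            for m in range(2):
                for di in range(k):
                    for dj in range(k):
                        ii, jj = i + di - r, j + dj - r
                        if 0 <= ii < H and 0 <= jj < W:  # zero padding (assumed)
                            acc += kernel[m][di][dj] * maps[m][ii][jj]
            out[i][j] = sigmoid(acc)
    return out


def parallel_cbam(feat, m_c, m_s):
    """Apply channel weights m_c (length C) and spatial weights m_s (H x W),
    both computed from the same original input, to that input."""
    return [[[m_c[c] * m_s[i][j] * v for j, v in enumerate(row)]
             for i, row in enumerate(ch)]
            for c, ch in enumerate(feat)]


feat = [[[1.0, 2.0], [3.0, 4.0]], [[4.0, 3.0], [2.0, 1.0]]]
m_s = spatial_attention(feat, kernel=[[[0.0]], [[0.0]]])  # all weights 0.5
f2 = parallel_cbam(feat, m_c=[1.0, 0.5], m_s=m_s)
```

The point of the parallel arrangement is visible in the signatures: `spatial_attention` receives the original `feat`, not a version already rescaled by the channel weights, so neither module's output depends on the other.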

Classification Output.
The classification output layer consists of the global average pooling (GAP) operation and the softmax classifier in this work. The GAP operation refers to the spatial averaging of the output feature maps of the last convolution layer or pooling layer [22]. Compared with the fully connected operation, the GAP operation can enhance the correlation between the features and the target type. It also has no parameters to be trained, so it can greatly reduce the network parameters and the amount of calculation, effectively prevent overfitting, and make the learned features more robust. The mathematical expression can be written as

$$y_c = \frac{1}{n} \sum_{i=1}^{n} x_i^{c},$$

where $y_c$ represents the global average of the c-th channel of the input feature maps, n indicates the number of features in each channel, and $x_i^{c}$ is the i-th feature value of the c-th channel.
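As a small worked illustration of the GAP operation, the sketch below averages each channel of a feature map and compares the number of trainable parameters with a fully connected layer on the same input. The function names and the example sizes are hypothetical and chosen only for the illustration.

```python
def global_average_pooling(feat):
    """feat: C x H x W nested lists. Returns the mean of each channel."""
    return [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in feat]


def fully_connected_params(c, h, w, num_classes):
    """Weights plus biases of a dense layer acting on the flattened C*H*W input."""
    return c * h * w * num_classes + num_classes


feat = [[[1.0, 2.0], [3.0, 4.0]], [[0.0, 0.0], [0.0, 8.0]]]
pooled = global_average_pooling(feat)          # one value per channel
dense_cost = fully_connected_params(2, 2, 2, 4)  # GAP itself has 0 parameters
```

Even for this tiny 2 x 2 x 2 input with four output classes, the dense layer needs 36 trainable parameters, whereas the pooling step needs none.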
The softmax classifier is an extension of logistic regression to multiclass problems. Its principle is simple, and it can solve multiclass classification problems. Therefore, this work uses the softmax classifier to realize the classification of GIS PD types. We suppose that the set of input features is $X = \{x_1, x_2, \ldots, x_i, \ldots, x_T\}$ and that the PD type of each input feature belongs to the set $C = \{c_1, c_2, \ldots, c_k, \ldots, c_K\}$. Then, the probability of classifying the input sample $x_i$ as $c_k$ can be expressed as

$$P(c_k \mid x_i) = \frac{e^{x_i^{\top} c_k}}{\sum_{j=1}^{K} e^{x_i^{\top} c_j}},$$

where $e^{x_i^{\top} c_k}$ is the correlation between the input sample $x_i$ and category $c_k$ and $1 / \sum_{j=1}^{K} e^{x_i^{\top} c_j}$ is the normalization factor.
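The softmax step can be sketched as follows. This is a generic illustration of the classifier described above, assuming that each class is represented by a weight vector and scored by its dot product with the input feature; the function name `softmax_probs` and the example vectors are hypothetical. Subtracting the maximum score before exponentiation is a standard numerical-stability measure and does not change the result.

```python
import math


def softmax_probs(x, class_vectors):
    """Return the probability of each class for feature vector x.

    x: list of floats; class_vectors: one weight vector per PD type."""
    scores = [sum(a * b for a, b in zip(x, c)) for c in class_vectors]
    top = max(scores)
    # Shift by the largest score so that exp() cannot overflow.
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Four illustrative class vectors, one per PD type.
classes = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]]
probs = softmax_probs([2.0, 0.5], classes)
predicted = probs.index(max(probs))  # index of the most probable class
```

The probabilities always sum to 1, and the predicted PD type is the class with the largest probability.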

Experimentation
To verify the effectiveness of the proposed improved CBAM-ResNet for GIS partial discharge pattern recognition, a GIS PD experimental platform was set up in the laboratory with a PD detector as the acquisition equipment, as shown in Figure 5. On the GIS PD experimental platform, corona discharge, surface discharge, suspended electrode discharge, and free metal particle discharge defects were set artificially to acquire PRPD images of the four typical PD types. For each typical PD type, 500 PRPD maps of size 200 × 120 × 3 were collected, for a total of 2000 PRPD maps. Recognition performance experiments with different recognition methods and different training set sizes were completed, and then the proposed improved CBAM-ResNet for GIS partial discharge pattern recognition was tested in an engineering project application.

Recognition Performance for Different Training Set Sizes.
In order to verify the influence of the training set size on the recognition performance of the proposed improved CBAM-ResNet recognition method, the collected PRPD images are randomly divided into training and test sets in the ratios of 0.8 : 0.2, 0.6 : 0.4, and 0.4 : 0.6 in this work. The recognition performance of the proposed method under the different training datasets is then obtained in three groups of experiments. The experimental results are shown in Figure 6. The diagonal black boxes represent the number of samples whose recognized type is consistent with the actual PD type, the white boxes give the number of incorrectly recognized samples, the last row in the gray boxes indicates the precision rate of the recognition method, the last column in the gray boxes represents the recall rate, and the final dark gray box represents the average recognition precision rate.
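The random division into training and test sets can be sketched as follows. This is only an illustration: splitting within each PD type so that every class keeps the same ratio is an assumption made here for the example, and the function name `stratified_split`, the label encoding, and the seed are hypothetical.

```python
import random


def stratified_split(labels, train_ratio, seed=0):
    """Randomly split sample indices into train and test sets,
    keeping the same ratio inside every class (assumed here)."""
    rng = random.Random(seed)
    by_class = {}
    for idx, lab in enumerate(labels):
        by_class.setdefault(lab, []).append(idx)
    train, test = [], []
    for lab in sorted(by_class):
        idxs = by_class[lab][:]
        rng.shuffle(idxs)
        cut = round(len(idxs) * train_ratio)
        train.extend(idxs[:cut])
        test.extend(idxs[cut:])
    return train, test


# 2000 PRPD maps: 500 per PD type, encoded as labels 0..3.
labels = [k for k in range(4) for _ in range(500)]
for ratio in (0.8, 0.6, 0.4):
    train_idx, test_idx = stratified_split(labels, ratio)
    print(ratio, len(train_idx), len(test_idx))
```

For the 0.8 : 0.2 ratio this gives 1600 training and 400 test images, that is, 400 and 100 per PD type.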
From the experimental results, it can be clearly seen that the average recognition precision rate is the highest, reaching 94%, when the training set and the test set are randomly allocated at a ratio of 0.8 : 0.2. In this case, the precision rate and the recall rate of the four PD types are no lower than 93.55% and 92.93%, respectively. When the training set and the test set are randomly allocated at a ratio of 0.4 : 0.6, the average recognition precision rate of the recognition method is the lowest, at 82.25%. This shows that increasing the size of the training set can improve the recognition performance of the proposed improved CBAM-ResNet for GIS PD-type recognition.
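The precision and recall rates reported in such a confusion matrix can be computed as in the sketch below. The matrix orientation (rows as actual types, columns as recognized types) is an assumption, the overall figure is computed here as the fraction of correctly recognized samples, and the 2 x 2 counts are made-up illustrative numbers, not results from this work.

```python
def precision_recall(cm):
    """cm[i][j]: number of samples of actual class i recognized as class j.

    Returns per-class precision, per-class recall, and overall accuracy.
    Assumes every class is recognized at least once (no zero column)."""
    k = len(cm)
    precision = [cm[j][j] / sum(cm[i][j] for i in range(k)) for j in range(k)]
    recall = [cm[i][i] / sum(cm[i]) for i in range(k)]
    accuracy = sum(cm[i][i] for i in range(k)) / sum(map(sum, cm))
    return precision, recall, accuracy


# Illustrative two-class matrix (made-up counts).
toy = [[8, 2],
       [1, 9]]
prec, rec, acc = precision_recall(toy)
```

Precision answers "of the samples recognized as this type, how many really are this type", while recall answers "of the samples of this type, how many were recognized", which is why the two can differ for the same class.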

Recognition Performance for Different Recognition Methods.
To further verify the recognition performance of the proposed improved CBAM-ResNet for GIS PD-type recognition, the normal CNN and SVM recognition methods were used in a comparative experiment in this work. In particular, the SVM uses the statistical feature values of the PRPD image as the input for PD-type recognition and uses the radial basis function as the kernel function. Four statistical operators related to the standard normal distribution are used as statistical feature values of the PRPD image, namely the skewness $S_k$, the kurtosis $K_u$, the phase asymmetry Φ, and the phase correlation coefficient $C_c$. According to the above experiment, increasing the size of the training set can improve the recognition performance. Therefore, the collected PRPD images are divided into training and test sets in the ratio of 0.8 : 0.2 in this experiment. That is, among the 2000 collected PRPD images, 400 PRPD images of each PD type were randomly selected as the training set, so that a total of 1600 PRPD images were used to train the three PD recognition methods, and the remaining 400 PRPD images were used as the test set to test the three recognition methods. The recognition results are shown in Figure 7. The diagonal black boxes represent the number of samples whose recognized type is consistent with the actual PD type, the white boxes give the number of incorrectly recognized samples, the last row in the gray boxes indicates the precision rate of the recognition method, the last column in the gray boxes represents the recall rate, and the final dark gray box represents the average recognition precision rate. For the normal CNN, the average recognition accuracy is 86.5%, and the recognition precision rate is 85.32%, 88.00%, 86.02%, and 86.73% for corona discharge, surface discharge, suspended electrode discharge, and free metal particle discharge, respectively. For the SVM, the average recognition accuracy is 85.00%, and the recognition precision rate is 81.65%, 84.00%, 86.02%, and 88.77% for corona discharge, surface discharge, suspended electrode discharge, and free metal particle discharge, respectively. By comparison, the improved CBAM-ResNet has the highest average recognition accuracy, at 94.00%, and the SVM has the lowest, at 85.00%. The same conclusion holds for the recall rate. That is, compared with the improved CBAM-ResNet PD-type recognition method proposed in this work, both the normal CNN and the SVM recognition methods have poorer accuracy for PD-type recognition. The normal CNN realizes PD-type recognition by automatically obtaining PRPD map features. The shapes of the PRPD images of different PD types are relatively close, which seriously affects the accuracy of partial discharge pattern recognition [12]. The SVM uses statistical characteristic parameters for PD-type recognition. For very close statistical characteristic parameter values, it is difficult to accurately identify PD types by a single analysis [23]. Moreover, the CNN and SVM recognition methods are naturally dependent on large-scale training sample data. To sum up, the proposed improved CBAM-ResNet method is more suitable for PD-type recognition.

Engineering Project Application.
The improved CBAM-ResNet model trained with experimental data was applied to diagnose PD types at a GIS in Hefei. In the engineering application, a directional antenna was used to obtain the ultrahigh-frequency (UHF) PD pulse signals, a LeCroy WR640Zi oscilloscope with a bandwidth of 4 GHz was used to acquire the UHF signals, and the PD UHF signals were then used to generate the PRPD images. The UHF PD signal acquisition site is shown in Figure 8.

Journal of Electrical and Computer Engineering
A large number of PRPD images were collected in the field tests, and the trained CBAM-ResNet model was then used to identify the PD type. The recognition results are shown in Table 1. Among them, the numbers of corona discharge, surface discharge, suspended electrode discharge, and free metal particle discharge samples were 3, 17, 24, and 5, respectively. The average recognition accuracy is 93.88%. A randomly selected PRPD image is shown in Figure 9. The pulse phase of this PRPD image is mainly distributed at the positive half-cycle rising edge and the negative half-cycle falling edge; that is, the discharge occurs between 0° and 90° and between 180° and 270°. The recognition result for this PRPD image is surface discharge, which agrees with the real PD type.

Conclusion
PD-type recognition is very important for evaluating the GIS insulation condition. The traditional PD-type recognition method has low recognition accuracy, which limits its engineering application. To address this limitation, this paper proposed the improved CBAM-ResNet for GIS PD-type recognition. The improved CBAM-ResNet takes advantage of the residual neural network and the attention mechanism. In particular, the channel attention module and the spatial attention module are connected in parallel in the improved CBAM. The comparison test results show that the improved CBAM-ResNet method for GIS PD-type recognition proposed in this work greatly improves recognition accuracy compared with the normal CNN and the SVM and has good application prospects in engineering practice.

Figure 1: The overall architecture of the proposed GIS PD pattern recognition method based on improved CBAM-ResNet.

Figure 3: Basic structure of the residual element.

Figure 5: The diagram of the GIS PD experimental platform.

Figure 6: The recognition performance for three different training set sizes. (a) The ratio of training and test sets is 0.8 : 0.2; (b) the ratio of training and test sets is 0.4 : 0.6; (c) the ratio of training and test sets is 0.6 : 0.4.

Figure 7: The recognition performance for two different recognition methods. (a) The normal CNN recognition performance; (b) the SVM recognition performance.

Table 1: Recognition results of PRPD images in the field test.