Pi-Score: An Estimation Strategy of the Class Prior in Positive-Unlabeled Learning for Electrical Insulator Defect Detection with Incomplete Annotations

Insulators in high-voltage power


Introduction
A high-voltage power system consists of high-voltage transmission lines and towers. Insulators are responsible for isolating overhead transmission lines and preventing them from grounding [1,2]. They play a crucial role in maintaining the safety and stability of the power grid. However, most insulators are exposed to an outdoor environment with sunlight, rain, and snow throughout the year, and unpredictable objects may also damage them. Without periodic inspection of insulator defects, regional grid failures that affect residential electricity consumption are therefore difficult to avoid [2].
The conventional method for inspecting power equipment is manual, which is time-consuming, labor-intensive, and dangerous. Besides, manual inspection makes it difficult to assess the status of insulators promptly. Owing to advances in unmanned aerial vehicle (UAV) technology, UAV inspection is gradually replacing inefficient manual inspection [3,4]. UAV inspection automatically captures insulator images, enabling image-based insulator detection methods. Early investigations of image-based insulator detection rely on hand-crafted features [1,5,6]. The performance of these methods depends on the quality of the hand-crafted features, which are time-consuming to design and require expert-level experience.
With the development of deep neural networks [7,8], deep learning-based detectors are proposed and categorized into two lines of research: two-stage and one-stage object detection. The typical two-stage detectors contain a region proposal network (RPN), such as region-based convolutional neural networks (CNN) [9], Fast R-CNN [10], and Faster R-CNN [11]. RPN is regarded as the first stage to provide the regions that are more likely to contain objects of interest. On the contrary, one-stage detectors, such as the you only look once (YOLO) series [12][13][14][15], exclude RPN from their pipelines.
Insulator detection is considered a particular case of object detection, and various detectors based on deep neural networks have been applied to locating insulator defects. Faster R-CNN was introduced in the study of Kang et al. [16] to generate proposal regions containing insulators, and a deep multitask neural network was then used to locate the insulator defects in those proposal regions. Moreover, a cascading object detection architecture was adopted to solve two related insulator inspection tasks (insulator localization and defect detection) [17]. The cascading architecture incorporated two concatenated Faster R-CNNs, whose backbones are VGG16 and ResNet-101, respectively. A tighter-oriented localization (TOL) framework was proposed to improve the vanilla Faster R-CNN by adding an external post-processing network [18]. The additional network is built on generative adversarial networks [19] to refine the localization of the given bounding boxes.
Additionally, one-stage detectors were also introduced into the field of insulator detection. YOLO v3 framework with a lightweight backbone was used to identify the defective insulators [20]. The lightweight backbone was constructed by integrating spatial pyramid pooling [21] with MobileNet [22]. A lightweight version of YOLO v4 was developed in the study of Xing and Chen [23] with the backbone switched from CSPDarknet53 to MobileNet, which enhanced average detection speed and ensured high detection accuracy. Furthermore, Tiny-YOLO v4, another lightweight version of YOLO v4, incorporated the self-attention module for the better fusion of features in different channels [4]. In recent years, the YOLO v5 pipeline has been applied to the research of insulator detection. In the study of Lan and Xu [24], four different versions of YOLO v5 were compared to acquire the most suitable architecture of detectors for the task of insulator defect detection. Then, attention mechanisms such as channel and spatial attention were introduced into the YOLO v5 pipeline to improve the detection accuracy of the insulator defect.
In the object detection community, the acquisition of the perfect annotation is time-consuming and labor-intensive. To alleviate this problem, some studies about imperfect annotation have been explored in different scenarios, such as incomplete annotation [25][26][27], unreliable labels [28], and incremental new categories [29]. However, there are few insulator detection frameworks when dealing with imperfect annotation. As one kind of imperfect annotation, the incomplete annotation will lead to less supervision of insulators and make unannotated insulators treated as background during the training process.
In this paper, we present a framework that integrates positive-unlabeled (PU) learning with a Faster R-CNN detector for incompletely annotated insulator data. Since the class prior in the PU loss directly influences PU learning, we propose an estimation algorithm for the class prior, named Pi-Score, to improve PU learning. The Pi-Score calculates the class prior based on the predictions of the two-stage classifiers. In our designed framework, Faster R-CNN is employed as the base model, PU learning (the PU loss) is fused into the RPN, and Pi-Score estimates the class prior for PU learning. The proposed framework aims to successfully locate the insulators and detect the defective ones when part of the annotations is not accessible. Simultaneously, it helps alleviate the annotation burden and enhance the robustness of insulator defect detection.
The contributions of this paper are summarized in the following aspects: (1) the Pi-Score, a novel estimation strategy for the class prior, is proposed to improve vanilla PU learning; (2) the PU learning strategy is incorporated with RPN in Faster R-CNN to solve the missing label problem of insulator defect detection; (3) experiment results show that the proposed framework achieves better performance compared with baseline methods when the amount of missing labels changes.
The rest of this paper is organized as follows: Section 2 is about related work. Section 3 describes our framework that contains Faster R-CNN and PU learning, while the proposed Pi-Score is introduced in Section 4. In Section 5, experimental results demonstrate the benefit of the proposed approach. Section 6 gives a summary of this paper.

Related Work
In this section, we introduce the work on insulator detection most relevant to our study. This problem has been addressed under two distinct research paradigms: insulator detection and insulator segmentation. The latter can be treated as pixel-level detection of insulators.

Insulator Detection.
Most studies about insulator detection aim to predict the location of insulators, i.e., by surrounding the insulators in images with bounding boxes. Guo et al. [5] applied the human receptive field (RF) model to the different types of insulators and achieved the accurate location of the defective parts in insulator images. The introduced RF model is analogous to the pipeline of the human visual system. Kang et al. [16] employed Faster R-CNN to locate the image areas with insulators and then combined a deep material classifier and a deep denoising autoencoder for the detection of defective parts on the insulator surface. Tao et al. [17] presented a cascading network architecture that modeled the insulator problem as two-level object detection. The former detector utilized a VGGNet-based Faster R-CNN as an insulator localizer, while the latter adopted the original Faster R-CNN to inspect the defect region within the bounding boxes of insulators [17]. Zhong et al. [18] proposed a TOL framework to realize arbitrarily oriented localization of insulators and achieved better performance. An oriented RPN was used in place of the RPN in the TOL framework to refine the predicted bounding boxes by adjusting their orientation. Then two generative adversarial networks [19] were introduced at the end of the framework. Zhang et al. [30] proposed the adaptive receptive field network (ARFNet) and inserted it into the feature pyramid network (FPN) [31], which is the backbone of Faster R-CNN. ARFNet is based on an attention mechanism to extract context information for self-explosion defects.
The methods listed above rely on two-stage detectors. Besides, one-stage object detection frameworks, e.g., the YOLO series, have also been exploited to solve the insulator detection problem. Yang et al. [20] combined spatial pyramid pooling and MobileNet to form a lightweight network, which was then integrated into the YOLO v3 framework. This study focused on a particular type of defective insulator, i.e., missing-cap insulators. Xing and Chen [23] utilized MobileNet to replace the original backbone CSPDarknet53 in a lightweight framework of YOLO v4. The proposed method could detect insulators in images with a higher average detection speed and detection accuracy. Han et al. [4] reformed YOLO v4 by merging the efficient channel attention network (ECA-Net) into the FPN backbone, which was equivalent to introducing self-attention in the channel dimension of features. Lan and Xu [24] explored four different versions of YOLO v5 to address the problem of insulator defect detection and found that YOLO v5x is more suitable for the task. Gao et al. [3] introduced a triplet attention module into YOLO v5, which included three channel-spatial attention mechanisms. Similarly, Lan and Xu [24] sought to incorporate the convolutional block attention module (CBAM) into the neck module of YOLO v5. The CBAM enhanced both the channel and spatial context for improving insulator defect detection.

Insulator Segmentation.
Some research works also segment the insulators from the background in images. Li et al. [32] cascaded two modified networks, a global insulator detector and a local defect segmentation model, to locate tiny insulator defect objects. The segmentation model was based on an improved U-Net [33] that incorporated an attention module. Han et al. [34] introduced ECA-Net into the encoder of U-Net, which embedded the attention mechanism for insulator segmentation. Yu et al. [2] enhanced the fine-grained texture of the receptive field based on the SINet framework and proposed an enhanced positioning network to extract the insulator defect segmentation area. Antwi-Bekoe et al. [35] employed a standard instance segmentation framework to solve the insulator segmentation problem; at the end of the framework, the detection and mask branches simultaneously output instance-level bounding box and mask predictions. Xuan et al. [36] introduced a modified CenterMask framework to achieve satisfactory insulator defect segmentation results, in which a squeeze-excitation module was used to improve the backbone and a spatial attention module was used to predict the insulator mask.

Framework of PU Faster R-CNN
In this section, we first introduce the proposed framework for insulator defect detection. Then, we revisit the pipeline of Faster R-CNN, in which the vanilla RPN is replaced by the positive-unlabeled RPN (PU-RPN). Besides, the other two components are introduced: the FPN backbone and the region of interest (ROI) Head. Finally, we present the PU learning strategy [37] as the solution to incomplete insulator annotation.
3.1. Overview of the Proposed Framework. Our framework for detecting insulator defects is based on the Faster R-CNN pipeline [11]. We introduce PU learning and incorporate it into Faster R-CNN to address the issue of incomplete annotation, as illustrated in Figure 1. According to the network architecture, the framework combining Faster R-CNN and PU learning can be divided into three components. The FPN backbone, the initial component, is in charge of extracting feature maps, which are then fed into the two-stage detector (PU-RPN and ROI Head).
The remaining two components correspond to the two stages of Faster R-CNN. The first stage of the detector, PU-RPN, addresses the situation of incomplete annotation by merging the PU learning classifier and the bounding-box regressor. Its network architecture is consistent with the vanilla RPN. In addition, PU learning is capable of alleviating the effect of unlabeled insulators being treated as background [26]. The ROI Head, regarded as the second stage of the detector, utilizes the proposal regions to further refine the predicted insulator categories and bounding boxes.

Journal of Sensors 3
3.2. Faster R-CNN as Insulator Detector. In this paper, Faster R-CNN is adopted as the base detector for insulator defect detection. To handle the detection scenario in which part of the insulators is unlabeled, we modify the framework of Faster R-CNN by updating the RPN with PU learning. The modified Faster R-CNN is shown in Figure 1, and its three components correspond to Module A (FPN backbone), Module B (PU-RPN), and Module C (ROI Head). The details are described as follows.
3.2.1. FPNs as the Backbone. FPN extends the pyramid of images to a pyramid of feature maps, which applies a multiple-scale strategy to insulator defect detection. The low-level feature maps in the pyramid make it easy to detect smaller insulators, while the high-level feature maps facilitate locating the larger ones. As shown in Figure 2, FPN consists of a bottom-up and a top-down pathway. The bottom-up pathway is usually a CNN for computing feature maps. Following the architecture in the study of Ren et al. [11], our modified Faster R-CNN specifically employs a residual neural network (ResNet) [38]. To balance performance and computational complexity, ResNet-50 is chosen as the feature extractor of the bottom-up pathway. There are five residual blocks from bottom to top. On the left of Figure 2, the feature maps along the direction of the dataflow in ResNet contain more semantic information and possess lower spatial resolution (larger receptive field). The top layers are more suitable for perceiving the larger insulators and vice versa.
FPN also provides a top-down pathway to construct multiscale feature maps and enhance the semantic information of the bottom feature maps. To exploit the high-level structures obtained by the top layers, a reconstruction strategy is introduced to upsample the feature maps in the top layers. The reconstructed feature maps are semantically strong, but the locations of objects are not precise. Furthermore, lateral connections use a 1 × 1 convolutional layer to reshape the channel dimension of the bottom feature maps. We add the reshaped feature maps and the reconstructed feature maps to improve both the semantic quality and the localization resolution. In Figure 2, for example, feature map 5 (denoted as M5) is upsampled by 2 × 2 nearest-neighbor interpolation. The output channels of the residual block (denoted as C4) are reduced to 256. The upsampled and the channel-reduced feature maps are merged and then fed into a 3 × 3 convolutional layer. The final output P4 is used for insulator defect prediction. We repeat the same process for P3 and P2, and all pyramid feature maps (P5, P4, P3, and P2) have 256-d output channels.
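The merge step described above can be sketched numerically in a few lines. This is a minimal NumPy illustration, not the paper's implementation: the 1 × 1 channel-reduction and the final 3 × 3 smoothing convolution are elided, and the function names are ours.

```python
import numpy as np

def upsample_2x_nearest(fm):
    """2x2 nearest-neighbor upsampling of a (C, H, W) feature map."""
    return fm.repeat(2, axis=1).repeat(2, axis=2)

def fpn_merge(top_fm, lateral_fm):
    """Merge an upsampled top-down map with a channel-reduced lateral map.

    top_fm:     (256, H, W)   e.g. M5
    lateral_fm: (256, 2H, 2W) bottom-up map (e.g. C4) already reduced to 256 channels
    The result would then pass through a 3x3 convolution to produce P4.
    """
    return upsample_2x_nearest(top_fm) + lateral_fm
```

Repeating this merge down the pyramid yields P4, P3, and P2, each with 256 output channels.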

PU-RPN for Insulator Proposals.
PU-RPN in our framework introduces PU learning into the RPN to generate proposal regions of insulators under incomplete annotations. The details of how PU learning solves the missing label problem are given in Section 3.3; in this section, we focus on the pipeline of the PU-RPN. RPN applies a sliding window (also known as an anchor) over the feature maps to make predictions. The PU-RPN in this paper inherits the architecture and the supervision method of the vanilla RPN in the study of Ren et al. [11] (the supervision of RPN in Faster R-CNN converts the manually annotated or ground-truth boxes to labeled anchors: an anchor is assigned a positive label if it has intersection over union (IoU) over 0.7 with any ground-truth box, and a negative label if its IoU is lower than 0.3). As seen in Figure 3, it consists of a 3 × 3 convolutional layer and two separate 1 × 1 convolutional layers. The 3 × 3 convolutional layer receives multiscale feature maps from the FPN (P5, P4, P3, and P2). The dimension of the input channel is 256, while the output of this layer has 512-d channels. The upper 1 × 1 convolutional layer performs a binary classification between insulators and the background. The output dimension of this layer is 18 (2 × 9 scores), in which two stands for the prediction scores of foreground and background, and nine corresponds to the number of anchors. The other 1 × 1 convolutional layer in Figure 3 predicts the coordinates of the anchors. The values in the output channels represent the concatenated coordinate offsets (for x, y, w, and h) of nine anchor boxes. When training the PU-RPN, we first compute the ground-truth labels of the anchors based on the IoU between the anchors and the ground-truth boxes. The loss functions of the PU learning classification and bounding-box regression will be discussed in Section 3.3.

ROI Head for Insulator Detection.
The ROI Head is at the end of Faster R-CNN and follows the PU-RPN, as shown in Figure 1. It reserves the top k proposal regions (also named ROIs) with insulators and further refines the prediction results in both classification and regression. The ROI Head comprises two modules: ROI pooling, which transforms the selected ROIs into fixed-size feature maps, and the VGG-Head, which constructs several fully connected (FC) layers to identify good or defective insulators from those feature maps. In detail, ROI pooling receives proposal regions or ROIs of different sizes. It splits each ROI's feature map into a fixed number of roughly equal regions (the rows and columns correspond to the H and W parts, respectively). Max pooling is then applied to each of these regions, i.e., each region of the feature map is converted into a single value. This ensures that all ROI feature maps have the same size W × H. Moreover, the VGG-Head possesses two shared FC layers for learning the ROIs' feature maps and two separate FC layers as the classification and detection branches. The shared FC layers extract visual semantic features, while the separate FC layers transform the semantic features into task-related features. Finally, the ROI Head uses smooth L1-loss [39] for the regressor and cross-entropy for the classifier to calculate the loss. The losses of the ROI Head are accumulated and backpropagated to train the overall model.
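The ROI pooling step can be illustrated with a small NumPy sketch for a single-channel ROI. The rounding of bin edges is a simplifying assumption of this illustration:

```python
import numpy as np

def roi_max_pool(fm, out_h=2, out_w=2):
    """Split a (H, W) ROI feature map into out_h x out_w roughly equal bins
    and max-pool each bin, so every ROI yields the same fixed-size output."""
    H, W = fm.shape
    # integer bin edges that cover the whole ROI
    ys = np.linspace(0, H, out_h + 1).astype(int)
    xs = np.linspace(0, W, out_w + 1).astype(int)
    out = np.empty((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = fm[ys[i]:ys[i + 1], xs[j]:xs[j + 1]].max()
    return out
```

Whatever the input ROI size, the output is always out_h × out_w, which is what allows the fixed-size FC layers of the VGG-Head to follow.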

PU Learning for Insulator Defect Detection with Incomplete Annotations.
For insulator defect detection from images, manual annotation must contend with varied insulator appearances and complicated backgrounds. Therefore, it is challenging for experts to annotate every insulator. A method that can successfully detect insulators with incomplete annotations alleviates the annotation effort and enhances its own robustness. Traditional methods for complete annotations consider the regions enclosed by bounding boxes as positive samples, while the rest are regarded as background or negative samples. The loss function for this positive-negative (PN) classification is computed as follows:

L^{pn}_{cls} = \frac{1}{N_p + N_n}\Big(\sum_{i=1}^{N_p} H(c^p_i, 1) + \sum_{j=1}^{N_n} H(c^n_j, 0)\Big), (1)

where i and j are the indices of positive and negative samples, respectively; N_p and N_n are the total numbers of positive and negative samples, respectively; and c^p_i and c^n_j represent the predicted classification probabilities of the positive and negative samples (anchors, in object detection). H(·, ·) is usually the cross-entropy loss between an anchor's predicted classification probability and the corresponding ground-truth label. When it comes to incomplete annotations, missing-labeled regions with insulators are treated as background. This leads to semantic ambiguity because regions with insulators are identified as both foreground and background. Hence, training with the loss in Equation (1) degrades the performance of defective insulator detection.
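A minimal sketch of the PN classification loss of Equation (1), assuming H(·, ·) is binary cross-entropy over predicted probabilities (the function names are ours):

```python
import numpy as np

def bce(p, y):
    """Cross-entropy H(p, y) for a binary label y in {0, 1}."""
    eps = 1e-12  # numerical guard against log(0)
    return -(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps))

def pn_loss(c_pos, c_neg):
    """PN classification loss: positives against label 1,
    everything outside the annotated boxes against label 0."""
    n_p, n_n = len(c_pos), len(c_neg)
    return (sum(bce(p, 1) for p in c_pos) +
            sum(bce(p, 0) for p in c_neg)) / (n_p + n_n)
```

With incomplete annotations, unannotated insulators end up in `c_neg`, which is precisely the biased supervision the PU formulation below corrects.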
Due to the presence of missing-labeled insulators in images, the anchors sampled from the background contain both negative and positive samples. In this circumstance, these instances or anchors can be considered unlabeled. Consequently, an incompletely annotated dataset can be represented as a set of positively labeled samples x_p and unlabeled samples x_u. We exploit PU learning to approximate the PN classification loss for incomplete annotations of insulators. In a PU scenario, the PN loss function defined in Equation (1) faces the following challenges: (i) The first term in the parentheses stands for the loss component of positive samples. As part of the insulators is unlabeled, some positive samples are not included in this term. As a result, the estimate for the positive samples is biased and needs to be appropriately calibrated. (ii) The second term within the parentheses refers to the loss of negative samples. The unlabeled issue has the adverse effect that some "positive" samples receive negative labels during training; the positive samples are thus regarded as both positive and negative. These "positive" samples, which are insulators in our task, confuse the Faster R-CNN.
In the framework of traditional PU learning [37], the class prior π is usually introduced to represent the proportion of actual positive samples in the dataset. It equals the ratio of the number of actual positive samples (N_ap) to the number of all samples (N_p + N_u), where N_p and N_u stand for the numbers of labeled positive samples and unlabeled samples, respectively. The loss of the positive class is approximately estimated by

L^{p}_{cls} = \frac{\pi}{N_p}\sum_{i=1}^{N_p} H(c^p_i, 1). (2)

As part of the positive samples is inaccessible, the sum of losses over the actual positive samples is substituted by N_ap multiplied by the average loss of the labeled positive samples, which, after normalizing by the total number of samples, yields Equation (2).
The above solution about the positive sample can be applied to insulator defect detection. Following the previous work [26], an image is divided into a series of anchors to cover the whole image. After the manual annotations are converted to the annotated anchors, the classification for the anchors can be viewed as a PU problem based on incomplete annotations. Consequently, the loss for positive anchors can be reasonably estimated by Equation (2).
For the loss of the negative anchors, the estimation bias in the study of Yang et al. [26] derives from approximating the distribution of the whole data x using only the unlabeled data x_u. An improved strategy was proposed in the study of Zhao et al. [40] to approximate the distribution of x by combining the positively labeled and unlabeled anchors. The loss of negative anchors equals the loss of all anchors treated as negative minus the expected loss of actual positive anchors treated as negative:

L^{n}_{cls} = \frac{1}{N_p + N_u}\sum_{k=1}^{N_p + N_u} H(c_k, 0) - \frac{\pi}{N_p}\sum_{i=1}^{N_p} H(c^p_i, 0), (3)

where H(c_k, 0) stands for the loss between the predicted probability and the negative label. Therefore, the first term is the loss when all samples are predicted to be negative, and the second term corresponds to the average loss of actual positive samples belonging to the negative class. Then a nonnegative operation is applied to Equation (3), as in the study of Kiryo et al. [37], which leads to the following:

\tilde{L}^{n}_{cls} = \max(0, L^{n}_{cls}). (4)

Adding the two losses L^{p}_{cls} and \tilde{L}^{n}_{cls}, we have the classification loss L^{pu}_{cls} based on PU learning to approximate and calibrate the PN loss:

L^{pu}_{cls} = L^{p}_{cls} + \tilde{L}^{n}_{cls}. (5)

When it comes to the localization loss for insulator defect detection, a typical choice is the smooth L1-loss function [39]. The output of bounding-box regression is the predicted location v = {x̂, ŷ, ŵ, ĥ}, while the ground-truth bounding box is denoted as b = {x, y, w, h}. The first two elements in v and b are the coordinates of the bottom-left corner; w and ŵ represent the widths of the box annotations, while h and ĥ represent their heights. Hence, the localization loss L_loc is defined as follows:

L_{loc} = \frac{1}{N_p}\sum_{i=1}^{N_p} \mathrm{smooth}_{L1}(v_i - b_i), (6)

where i and N_p are the index and the total number of positive samples, i.e., the anchors with insulators in our task. The complete loss function L for insulator defect detection combines the PU classification loss and the localization loss:

L = L^{pu}_{cls} + L_{loc}. (7)
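The non-negative PU classification loss can be sketched as follows. This is a NumPy illustration under our own assumptions (probabilities rather than logits, cross-entropy for H), not the paper's implementation:

```python
import numpy as np

def nn_pu_loss(c_pos, c_unl, pi):
    """Non-negative PU classification loss.

    c_pos: predicted positive-class probabilities of labeled positive anchors
    c_unl: predicted positive-class probabilities of unlabeled anchors
    pi:    class prior, the fraction of actual positives among all anchors
    """
    eps = 1e-12
    h = lambda p, y: -(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps))
    c_pos, c_unl = np.asarray(c_pos, float), np.asarray(c_unl, float)
    c_all = np.concatenate([c_pos, c_unl])
    # calibrated positive risk: prior times average loss of labeled positives
    loss_p = pi * np.mean(h(c_pos, 1))
    # negative risk: all anchors as negative, minus the expected positives
    loss_n = np.mean(h(c_all, 0)) - pi * np.mean(h(c_pos, 0))
    # non-negative correction, as in nnPU
    loss_n = max(0.0, loss_n)
    return loss_p + loss_n
```

When the model is confident and the prior is accurate, both risk terms stay small; the `max(0, ·)` clamp prevents the negative risk from going below zero due to overfitting on labeled positives.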
We employ the loss in Equation (7) to train the PU-RPN in Faster R-CNN, which achieves insulator defect detection under the scenario of incomplete annotations. In the next section, we will describe the estimation of the class prior π in the PU classification loss L^{pu}_{cls}.

Estimation of Class Prior
Due to the problem of incomplete annotations in this study, the class prior is unknown but crucial for the PU classification loss. In this section, we first review the existing methods of class prior estimation. Then our proposed Pi-Score is described in detail in Section 4.2.

Revisiting the Existing Methods.
The class prior is defined as the proportion of actual positive samples in a given dataset. In insulator defect detection, it corresponds to the proportion of positive anchors derived from annotations. In the study of Zhao et al. [40], the class prior is treated as a hyperparameter and estimated by grid search on a validation set. To estimate an accurate class prior, the interval of the grid search needs to be set to a small value; although the class prior can thus be estimated with high precision, this leads to high computational complexity. For example, if the class prior is searched with an interval of 0.1, the computational cost increases tenfold.
Yang et al. [26] proposed an estimation method of the class prior based on a fixed threshold, which filters the anchors by comparing the probabilities from the RPN with a preset threshold. Denote the anchors in Faster R-CNN as a set A ≡ {a_j | j = 1, ⋯, N_a}, where N_a is the total number of anchors. The RPN is responsible for modeling the probability of each anchor belonging to the positive class, represented as P_rpn ≡ {p_j | j = 1, ⋯, N_a}. Specifically, an anchor is considered a positive sample if its probability exceeds the preset threshold T_rpn. These anchors form a new set

P = {a_j | p_j > T_rpn, j = 1, ⋯, N_a}, (8)

whose cardinality is divided by N_a to obtain the class prior:

π = |P| / N_a. (9)
To stabilize the estimation of the class prior π, an exponential moving average strategy was introduced in the study of Yang et al. [26], where the momentum is set to 0.9. In conclusion, a reasonable threshold T_rpn directly determines the number of positive anchors, which affects the estimation of the class prior. Although the default threshold in the study of Yang et al. [26] is set to 0.5, this threshold is not suitable for our application. Concretely, we found that the probabilities predicted by the RPN are all less than 0.5 after training converges, which means all anchors would be identified as negative samples. Therefore, a series of experiments should be conducted to determine a more suitable threshold T_rpn.
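The fixed-threshold estimate of Equations (8)-(9) and its exponential moving average can be sketched as follows (function names are ours):

```python
import numpy as np

def fixed_threshold_prior(rpn_probs, t_rpn=0.5):
    """Class prior as the fraction of anchors whose RPN probability
    exceeds the preset threshold t_rpn (Equations (8)-(9))."""
    rpn_probs = np.asarray(rpn_probs)
    return float(np.mean(rpn_probs > t_rpn))

def ema_update(pi_prev, pi_new, momentum=0.9):
    """Exponential moving average used to stabilize the running prior."""
    return momentum * pi_prev + (1 - momentum) * pi_new
```

The estimate is clearly sensitive to `t_rpn`: a threshold above every predicted probability yields a prior of zero, which matches the failure mode observed above.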
The experimental results are depicted in Figure 4. In Figure 4(a), the model performance reaches its optimum when the threshold T_rpn is set to 0.2, while T_rpn = 0.4 corresponds to the lowest detection performance. In Figure 4(b), the best performance is attained with a threshold of 0.1, while the worst is obtained at 0.3. Although the curve trends in Figures 4(c) and 4(d) are distinct, the best and worst detection performances are achieved at T_rpn = 0.2 and T_rpn = 0.1, respectively. In summary, the relationship between the thresholds and the detection performance is inconsistent under different annotation proportions. Therefore, the threshold also needs to be optimized as a hyperparameter, which further increases the computational complexity.

4.2. Pi-Score: The Proposed Method of Class Prior. To estimate the class prior more efficiently and accurately, we propose an adaptive threshold method, termed Pi-Score, based on the two-stage classifiers. The Pi-Score replaces fixed thresholds with adaptive ones. On the one hand, the adaptive threshold is automatically calculated from the current image and its annotations, avoiding the hyperparameter tuning process and reducing the computational complexity. On the other hand, it can more precisely calculate the class priors corresponding to various input batches, which facilitates backpropagation with a more appropriate PU loss. Besides, our estimation method introduces the two-stage classifiers, i.e., the classification heads of the RPN and the ROI Head. Compared with the method in the study of Yang et al. [26], our method further incorporates the ROI Head's classifier to distinguish positive anchors more precisely. As the ROI Head outperforms the RPN in classification, it is reasonable to combine the two-stage classifiers. Figure 5 depicts the two-stage framework, which corresponds to Modules B and C. In Modules B and C, we develop two levels of adaptive thresholds (T_rpn and T_roih, respectively) to determine whether the anchors are positive. Assume the batch size is denoted as N_B. The class prior estimation is a forward process of PU Faster R-CNN and consists of the following steps according to Figure 5.
(2) The first adaptive threshold T_rpn. The classification head of the RPN estimates the probability of each anchor belonging to the positive class based on the output of the FPN. This process can be represented by Equation (11). For the ith image, the RPN classification head outputs a vector of each anchor's probability of being positive, denoted as P^{(i)}_{rpn}. The dashed box of Module B in Figure 5 illustrates how the first adaptive threshold is used to filter the initial candidate proposals. The threshold T^{(i)}_{rpn} is obtained by computing the maximum of P^{(i)}_{rpn} and dividing it by 2:

T^{(i)}_{rpn} = \max(P^{(i)}_{rpn}) / 2. (12)

Based on the first adaptive threshold T^{(i)}_{rpn}, we filter the anchors to form a candidate set

P^{(i)} = {a_j | p^{(i)}_j > T^{(i)}_{rpn}}. (13)

The feature maps F^{(i)}_{FPN} and the candidate set P^{(i)} are combined as the input for the second-stage ROI Head.
(3) The second adaptive threshold T_roih. The classification head of the ROI Head predicts the multiclass probabilities of the anchors in P^{(i)}, as shown in Equation (14). Let P^{(i)}_{roih} ∈ R^{|P^{(i)}| × N_c} symbolize the multiclass probability matrix, where rows and columns correspond to anchors and classes, respectively. Then, we obtain the maximum probability over classes for each anchor, yielding a vector P̂^{(i)}_{roih} ∈ R^{|P^{(i)}|}, using Equation (15).
Then, sorting P̂^{(i)}_{roih} in ascending order yields S^{(i)}_{roih}. The probabilities at the 5th and 95th percentiles of the sorted vector are defined as S^{low}_{roih} and S^{high}_{roih}, respectively. Based on these two probability values, we can calculate the second-stage adaptive threshold T^{(i)}_{roih}, as described in Equation (16).
Based on the second-stage threshold, we can further filter out the final positive anchors. We define the maximum probabilities of these anchors as pi-scores, which represent their likelihood of being positive (Equation (17)).

For the ith image, its class prior can be computed as π^{(i)} = k / N_a, where k is the number of final positive anchors, and the class prior for the entire batch is obtained by averaging π^{(i)} over the N_B images. To demonstrate the Pi-Score more clearly, we also provide a comprehensive description in Algorithm 1.
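The per-image Pi-Score estimate described above can be sketched as follows. This is a hedged NumPy illustration: the ROI-Head classifier is stubbed out as a callable, and the midpoint of the 5th/95th-percentile scores is an assumed form for the second-stage threshold of Equation (16).

```python
import numpy as np

def pi_score_prior(rpn_probs, roih_prob_of, low_pct=5, high_pct=95):
    """Two-stage Pi-Score class-prior estimate for one image.

    rpn_probs:    per-anchor positive probability from the RPN head
    roih_prob_of: maps a candidate anchor index to its maximum ROI-Head
                  class probability (a stand-in for Module C)
    """
    rpn_probs = np.asarray(rpn_probs, float)
    n_a = len(rpn_probs)
    # first adaptive threshold: half the maximum RPN score (Equation (12))
    t_rpn = rpn_probs.max() / 2.0
    cand = np.flatnonzero(rpn_probs > t_rpn)
    # second stage: ROI-Head maximum class probability per candidate anchor
    scores = np.asarray([roih_prob_of(j) for j in cand])
    s_low, s_high = np.percentile(scores, [low_pct, high_pct])
    t_roih = (s_low + s_high) / 2.0          # assumed form of Equation (16)
    pi_scores = scores[scores > t_roih]      # final positive anchors
    return len(pi_scores) / n_a              # class prior for this image
```

Averaging the returned values over the N_B images of a batch gives the batch-level prior fed into the PU loss.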

Experiment Results
In this section, we present several experiments to evaluate our proposed framework in comparison to the baseline methods. First, the dataset and experimental setup are described in Section 5.1. Then, we introduce the evaluation metrics used in experimental evaluation. Finally, we report the insulator defect detection results on various annotation proportions in Section 5.3.

Dataset Description and Experimental Setup.
Our proposed framework was evaluated experimentally using the Insulator Defect Image Dataset (IDID) (the Insulator Defect Image Dataset can be downloaded from: https://ieee-dataport.org/competitions/insulator-defect-detection). The IDID was released by the Electric Power Research Institute through its artificial intelligence initiative and is intended for training artificial intelligence and machine learning methods. The IDID is composed of high-quality images of insulators on transmission lines and the corresponding annotations for object detection, as depicted in Figure 6. The annotations take the form of bounding boxes. Specifically, the insulators are enclosed by bounding boxes and labeled with one of three categories: good, broken, or flashover-damaged insulator shells. The green, blue, and cyan bounding boxes correspond to good, broken, and flashover-damaged insulators, respectively. Moreover, some insulators in an image constitute a group, and those that belong to the same group are surrounded by a larger red box and labeled as the insulator string class. The training samples of IDID consist of approximately 1,600 images with 1,788 insulator strings, 2,636 good, 1,140 broken, and 2,004 flashover-damaged insulator shells. In our experiments, we randomly split the training samples of IDID into three parts according to the ratio of 7 : 1 : 2, i.e., the training set (70%), validation set (10%), and test set (20%). Here, the random seed is set to 0 in all the experiments.
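As a concrete illustration, a 7 : 1 : 2 split with random seed 0 can be produced along the following lines (a minimal sketch; the actual experiments use Detectron2's data handling):

```python
import random

def split_dataset(image_ids, seed=0):
    """Randomly split image ids into train/val/test at a 7 : 1 : 2 ratio."""
    rng = random.Random(seed)          # seed 0, as in the experiments
    ids = list(image_ids)
    rng.shuffle(ids)
    n = len(ids)
    n_train, n_val = int(0.7 * n), int(0.1 * n)
    return (ids[:n_train],                       # 70% training set
            ids[n_train:n_train + n_val],        # 10% validation set
            ids[n_train + n_val:])               # 20% test set
```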
In real application scenarios, it may be difficult to provide perfect annotations of targets due to oversights by annotators or dense insulators that are too close to each other, which prevents a dataset from meeting the criterion of complete annotation. Even though the IDID has been made publicly available as a fully annotated dataset, it still contains a small number of unannotated targets. Figure 7 shows image samples with missing annotations in the IDID dataset, where the unannotated insulators are indicated by yellow bounding boxes. Since the validation and test sets are split randomly, they unavoidably contain unlabeled samples, and errors could be introduced when computing metrics (detailed in Section 5.2) on sets with incomplete labels. Therefore, we performed quality control on the validation and test sets by (1) removing images with a large number of missing annotations; and (2) manually labeling images in which only a few annotations are missing. To verify our method's performance under different annotation proportions, a part of the annotations is removed at random. We use a hyperparameter called Annotation PerCent (APC) to specify what percentage of the annotations in the training set is used during training. Moreover, there are other important hyperparameters for the training process: for example, the batch size is set to 16 in all the experiments, while the learning rate is set to 0.02.
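The APC-based annotation removal can be sketched as follows; the function name and list-based annotation representation are illustrative, not the actual preprocessing code:

```python
import random

def apply_apc(annotations, apc, seed=0):
    """Keep a random fraction `apc` of the bounding-box annotations;
    the removed boxes simply become unlabeled targets in the image."""
    rng = random.Random(seed)
    n_keep = int(apc * len(annotations))
    kept = rng.sample(range(len(annotations)), n_keep)
    return [annotations[i] for i in sorted(kept)]
```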
A total of 10,000 iterations are performed, and evaluation on the validation set is carried out every 200 iterations. The optimal model is chosen according to the performance on the validation set and applied to the test set to obtain the evaluation metrics, like mean average precision (mAP). Finally, the data augmentation in our framework contains the horizontal flip and vertical flip, as well as the default data augmentation strategy in Detectron2 (the GitHub repository of Detectron2: https://github.com/facebookresearch/detectron2).
The experiments were conducted on a Linux server using the VS Code development environment. Python was selected as the programming language and PyTorch as the deep learning framework. The server hardware mainly consists of two Intel Xeon Gold 4314 CPUs, 32 GB of RAM, and two Nvidia A40 GPUs.

Evaluation Metrics.
Existing object detection metrics have been utilized in the PASCAL VOC challenge [41], the COCO challenge [42], the Open Images challenge [43], etc. In this paper, we adopt the COCO evaluation metrics because of their popularity and comprehensiveness. The mAP is the principal metric, and it is extended into several variations. For an instance to be classified as a True Positive (TP), its predicted confidence score must exceed a predefined confidence threshold (CT). In addition, the IoU between its predicted and annotated bounding boxes needs to be higher than a predefined IoU threshold (IoUT). Otherwise, the instance is regarded as a False Positive (FP). The AP for one category averages the precision (defined as TP/(TP + FP)) as CT varies from 0 to 1 while the IoUT is held constant.
The mean of the APs across all categories is used as the mAP metric. Assuming $N_c$ is the number of categories, mAP can be calculated by Equation (18). Since every AP is computed at a fixed IoUT, the mAP should also be reported together with its IoUT (denoted as AP@IoUT). In our experiments, we report both AP@0.5 and AP@0.75.
The COCO evaluation metrics also include "AP" and "AP-category." AP is computed by averaging AP@IoUT with IoUT varying from 0.5 to 0.95 (at an interval of 0.05). In our experiments, AP-category is the AP applied to one particular category, such as AP-String, AP-Good, AP-Broken, and AP-Flashover-Damaged (abbreviated as AP-FlashoverD).
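For reference, the two building blocks of these metrics, the IoU between two boxes and the averaging of AP@IoUT over the ten COCO thresholds, can be sketched as:

```python
import numpy as np

def iou(box_a, box_b):
    """IoU of two axis-aligned boxes in [x1, y1, x2, y2] format."""
    x1 = max(box_a[0], box_b[0]); y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2]); y2 = min(box_a[3], box_b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

def coco_ap(ap_at_iout):
    """COCO "AP": average of AP@IoUT over IoUT in {0.50, 0.55, ..., 0.95}.
    `ap_at_iout` maps an IoU threshold to the mAP computed at it."""
    thresholds = np.arange(0.5, 1.0, 0.05)
    return float(np.mean([ap_at_iout[round(t, 2)] for t in thresholds]))
```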

Evaluation with AP Metrics for Various Object Detectors.
In this section, we conduct two groups of experiments to compare the proposed framework with mainstream methods. In the first group, we use all annotations from the IDID dataset; the results are summarized in Table 1. As Section 5.1 shows, the IDID dataset contains a small number of unlabeled samples, so applying the PU learning method to a realistic "fully annotated" dataset provides a check on the practical significance of our study. The second group is based on the IDID dataset with APC as a controllable hyperparameter: a proportion of annotations is randomly retained according to APC. This allows us to evaluate object detection algorithms in situations where a substantial proportion of annotations is missing. The experiments in this group employ three APCs of 0.7, 0.5, and 0.3.

To verify the effectiveness of the proposed algorithm, we compare it with mainstream PN and PU object detectors. The PN object detectors can be categorized as either single-stage or two-stage frameworks. The single-stage ones in this paper include YOLO v3 [14], YOLO v4 [15], and YOLO v5 (the GitHub repository of YOLO v5: https://github.com/ultralytics/yolov5). In applications of insulator detection, many studies substituted the backbone of YOLO frameworks with MobileNet, such as YOLO v3 with MobileNet (abbreviated to M-YOLO v3) [20] and YOLO v4 with MobileNet (abbreviated to M-YOLO v4) [23]. Since our study does not focus on lightweight models, we also evaluated YOLO v3, YOLO v4, and YOLO v5 based on DarkNet53 (abbreviated to D-YOLO v3, D-YOLO v4, and D-YOLO v5). The above single-stage detectors are listed in the first five rows of Tables 1-4, and the backbone network is indicated in parentheses. For the two-stage framework of PN learning, Faster R-CNN with a ResNet-50 backbone corresponds to the sixth row in Tables 1-4.
The PU object detectors are all based on Faster R-CNN, including the two methods described in Section 4.1 and our proposed Pi-Score in Section 4.2. The PU Faster R-CNN in the seventh row of Tables 1-4 corresponds to the method in the study of Zhao et al. [40], which estimates class prior probabilities via grid search on the validation set; the search interval is set to 0.1, and 10 repeated training processes are required to obtain the estimated class priors. The eighth row of Tables 1-4 lists the results of the method in the study of Yang et al. [26], which estimates class priors using a fixed threshold. The results of the proposed Pi-Score are shown in the last row of Tables 1-4.

Table 1 summarizes the results of the various comparison methods under complete annotation. Among the five single-stage object detectors, YOLO v5 achieved the best performance, with an AP of 84.50%. It also outperformed the two-stage detectors on the AP@0.75 and AP@0.5 metrics. However, the AP metric of the two-stage detectors was higher than that of YOLO v5. It can be inferred that the mAP of YOLO v5 at IoU > 0.75 is lower, indicating that YOLO v5 is slightly inferior to two-stage detectors in terms of localization accuracy. A comparison of the results in rows 1-2 and 3-4 of Table 1 shows that D-YOLO v3 and D-YOLO v4 achieved better performance than their MobileNet-based counterparts (M-YOLO v3 and M-YOLO v4). In theory, the lightweight MobileNet has a higher computation speed than DarkNet53 but lower detection accuracy, which is consistent with the detection results in Table 1. Among the four two-stage detection methods, the proposed Pi-Score obtained superior performance in terms of AP, AP@0.5, and AP@0.75.
In the PN frameworks, Faster R-CNN achieved a comprehensive detection result of 87.19% AP, which is 2.69% higher than YOLO v5. Comparing the results of the first six rows with the last three rows, we conclude that PU learning can effectively improve the detection performance on the IDID dataset. This also indirectly confirms the existence of the unlabeled samples described in Section 5.1. Finally, the Pi-Score achieves the highest performance, with an AP of 88.26%, compared to the other two PU Faster R-CNNs. This demonstrates that the adaptive threshold of the Pi-Score is effective in estimating class priors. The class prior estimation methods in the studies of Zhao et al. [40] and Yang et al. [26] are based on grid search over fixed values or fixed thresholds, and the 0.1 search interval may skip the optimal value and impair the detection performance.

Table 2 shows the detection results of the various methods with an APC of 0.7. Among the single-stage frameworks, only YOLO v5 achieved an AP value over 70%, but its performance is still inferior to that of the two-stage detector, Faster R-CNN (AP, AP@0.75, and AP@0.5 decreased by 3.84%, 1.08%, and 1.66%, respectively). Comparing the reduction magnitudes of these three metrics, we can conclude that YOLO v5 performs relatively worse on mAPs with an IoU threshold greater than 0.75. Therefore, YOLO v5 is slightly less effective than two-stage detectors in terms of precise localization. In comparison to the above PN-based detectors, the PU Faster R-CNN series performed better. In detail, the PU-based detectors in the last three rows of Table 2 gained about 0.6% to 1% in AP compared to Faster R-CNN. This proves that PU learning can boost the performance of PN learning on incompletely annotated data.
The main difference among the PU-based detectors in Table 2 lies in the estimation of the class prior; therefore, the comparison focuses on the effect of class prior estimation. From the last three rows of Table 2, both the proposed Pi-Score and the method in the study of Zhao et al. [40] achieved relatively good performance (77.01% and 77.03% AP, respectively). The PU Faster R-CNN [40] and our proposed method obtain similar performance, which differs from the conclusion drawn from Table 1. This may be attributed to the fact that the theoretical class prior is close to the values on the grid during grid search, making the estimation of the class prior more accurate. In addition, Pi-Score also achieved the highest accuracy on the AP@0.75 and AP@0.5 metrics (88.72% AP@0.75 and 93.74% AP@0.5).
When the annotation percentage is set to 0.5, the results of the various methods are summarized in Table 3. Faster R-CNN obtains higher performance than YOLO v5 on most of the metrics. Therefore, Faster R-CNN, a two-stage framework, outperformed the single-stage YOLO series frameworks among the PN-based object detection methods. Using the AP metric as an illustration, Faster R-CNN was 7.15% higher than YOLO v5. The PU Faster R-CNNs in the last three rows improved by approximately 0.5%-1% in AP when compared to Faster R-CNN. This verifies that PU learning has certain advantages in dealing with incompletely annotated data. Based on the results of the various PU-based detectors (class prior estimation strategies) in the last three rows of Table 3, the proposed Pi-Score achieved a higher AP metric (70.92%) than the algorithms in the studies of Zhao et al. [40] and Yang et al. [26].

Table 4 summarizes the results of all methods with APC = 0.3, which means that 70% of the annotations are randomly removed, signifying a severe label absence. From the first four rows of Table 4, we can conclude that the replacement with a lightweight backbone reduces the detection performance of the YOLO frameworks. However, the AP of YOLO v5 was lower than that of D-YOLO v4, indicating that YOLO v5 is unsuitable when a large proportion of annotations is missing. Similarly, in the PN learning scenario, Faster R-CNN outperforms D-YOLO v4 with a 5.17% improvement in AP. Moreover, the PU Faster R-CNNs in Table 4 obtained an improvement of about 0.8%-2.2% compared to vanilla Faster R-CNN (the best PN-based detector in this paper). The effectiveness of the proposed Pi-Score can be inferred from the results in the last three rows of Table 4: in terms of AP, the Pi-Score gained higher performance than the methods in the studies of Yang et al. [26] and Zhao et al. [40], while there was no significant difference between the latter two methods.
Based on Tables 1-4, we can conclude that the Pi-Score achieves the best performance under APC = 1, APC = 0.7, and APC = 0.5, and that both the proposed method and the approach in the study of Zhao et al. [40] outperformed the PU Faster R-CNN in the study of Yang et al. [26] at the 0.7 and 0.3 APCs. As the class prior is treated as a hyperparameter in the study of Zhao et al. [40], its determination by grid search requires multiple training processes, which increases the computational cost. Our proposed method estimates the class prior with adaptive thresholds, providing a more accurate class prior in an efficient way.

Prediction Results of Various Object Detectors.
To demonstrate the effectiveness of the proposed method more intuitively, we selected some samples under different annotation percentages and displayed the predicted boxes on the original images. During prediction, the object detection network outputs both the predicted boxes and classification confidences. The predicted boxes indicate the target positions, while the classification confidences provide category information. In addition, the classification confidence can indicate whether a target prediction is reliable: by setting a confidence threshold, unreliable targets can be removed to reduce false alarms. In this section, the confidence threshold was set to 0.5, and detection results with classification confidence less than 0.5 were considered background. Since some images contain densely arranged small insulator targets, we employ a series of yellow arrows to indicate the positions of the insulators (or some small insulator strings) for the convenience of result analysis. In (b)-(h) of each figure, the red dashed boxes represent the predicted boxes, and the green solid boxes are the ground-truth annotations. The text at the top-left corner of the red and green boxes denotes the predicted and ground-truth categories, respectively. Detection results can be analyzed from a macro perspective by counting the number of correctly detected targets. In contrast, the micro-level analysis covers the following three aspects: (1) whether the insulator strings are detected correctly; (2) whether the larger insulators are detected correctly, such as the first column in Figure 8 and the second column in Figure 9; and (3) whether smaller insulators are missed during detection and whether their categories and locations are predicted precisely, namely, the last two columns in Figure 8 and the first column in Figure 9.
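The confidence filtering described above amounts to a one-line rule; the dictionary-based detection format below is illustrative:

```python
def filter_detections(detections, conf_threshold=0.5):
    """Drop predictions whose classification confidence falls below the
    threshold; low-confidence boxes are treated as background."""
    return [d for d in detections if d["score"] >= conf_threshold]
```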

Prediction Results with IDID's Annotations.
Figure 8 shows the detection results of the aforementioned methods with the complete annotations of the IDID dataset. As described in Section 5.1, densely arranged insulators and oversights by annotators may lead to incomplete annotations in the IDID dataset; several samples with incomplete annotations are shown in Figure 7. In the first row, the insulators in the first column are relatively large in scale, while the insulators in the last two columns can be considered relatively small targets. The following analysis proceeds column by column from (b) to (h) for each method. In the first column, D-YOLO v3 classified all the targets correctly, but there was some deviation in locating the predicted boxes of the insulators at positions (4)-(6). M-YOLO v3 failed to detect the insulator string, identified the "Good" insulator as "Flashover Damaged" at position (1), and recognized "Flashover Damaged" insulators as "Good" at positions (3), (5), and (6). The targets at positions (2) and (3) were detected multiple times, while the insulator at position (4) was missed. D-YOLO v4 predicted the right categories for all targets in the image, but the insulator string's left boundary was detected with some error. M-YOLO v4 localized the upper boundary of the insulator string inaccurately. The insulator classes were accurately identified at positions (3)-(5), but their predicted boxes had inaccurate left boundaries. At position (1), the "Good" insulator was detected as both "Good" and "Flashover Damaged." YOLO v5 located an insulator string near its ground-truth box. Meanwhile, another insulator string was detected at position (3), which can be viewed as a false alarm. The predicted classes of the insulators at positions (3)-(6) were all incorrect (the "Flashover Damaged" insulators were predicted as "Good").
The last two rows correspond to the method in the study of Yang et al. [26] and the Pi-Score proposed in this paper, respectively. From the detection results, both methods performed well in terms of classification and localization.
In the second column of Figure 8, all the methods performed relatively well. The main performance differences among them are reflected in the detection of the insulator string and a few insulators. The upper boundary of the insulator string detected by D-YOLO v3 exhibited a significant error (close to the height of one insulator). M-YOLO v3 and D-YOLO v4 predicted inaccurate boxes for the insulator string, both of which exceeded the ground-truth box region. For the location of the insulator string, M-YOLO v4 output a predicted box whose upper boundary was about two insulator heights below that of the ground-truth box. The subsequent YOLO v5 and PU Faster R-CNNs performed better in locating the insulator string. The remaining differences among them are as follows: YOLO v5 detected the "Good" insulator at position (7) repeatedly; PU Faster R-CNN [26] identified the "Good" insulator at position (14) as both "Good" and "Flashover Damaged"; and the proposed Pi-Score detected all insulators correctly.

Journal of Sensors
The third column of Figure 8 shows the predicted results of the various methods. In general, the methods output similar detection results, and the main differences lie in the detection of two insulator strings and the insulators at positions (18) and (24). The target at position (25) corresponds to a small insulator string whose insulators are only partially visible in the image and cannot be annotated precisely; therefore, we focus on the detection of this insulator string rather than the insulators therein. For the detection of the insulator strings, D-YOLO v3 output a small predicted box for the insulator string in the center, i.e., it was entirely enclosed by the ground-truth box. In contrast, D-YOLO v3 detected the insulator string at position (25) more accurately. One extra insulator string was detected by M-YOLO v3 in the image's center, and both predicted boxes were quite distinct from the ground-truth boxes. Moreover, one insulator was detected at position (25), but the insulator string itself was missed.
D-YOLO v4 also had some errors in localizing the two insulator strings. The top and bottom borders of its prediction for the insulator string in the image's center exceeded the ground-truth box, and the predicted upper and lower boundaries at position (25) were both lower than those of the ground-truth box. M-YOLO v4 located the insulator string in the center of the image with a bounding box slightly larger than the ground-truth box. At position (25), the insulator string was not detected, whereas multiple insulators were identified. YOLO v5 and the PU Faster R-CNNs achieved superior performance in insulator string localization. The categories of the insulators at positions (18) and (24) were "Flashover Damaged." D-YOLO v3 produced a false detection and a missed detection at these two positions, respectively. M-YOLO v3 generated false detections at positions (18) and (24). The two YOLO v4 models, YOLO v5, and PU Faster R-CNN [26] detected the insulator at position (18) incorrectly, while the proposed method detected all targets with the right categories and precisely predicted boxes.

Prediction Results under 0.7 Annotation Percentage.
Figure 9 depicts the detection results of the seven methods with an annotation ratio of 0.7. The analysis of each method from (b) to (h) is presented below. For the predicted results in the first column, we first focus on the detection of the insulator string. D-YOLO v3 output a predicted box with an inaccurate right boundary. Meanwhile, M-YOLO v3 detected the insulator string twice, and both predicted boxes had large errors. D-YOLO v4 and M-YOLO v4 both localized the insulator string inaccurately: the former's predicted box was located slightly to the lower right of the ground-truth box, while the latter's predicted box had a certain error in its aspect ratio. The predicted box of YOLO v5 had large errors in the left and bottom boundaries, while that of PU Faster R-CNN [26] had a relatively large error in the right boundary. Compared with the above methods, the proposed Pi-Score obtained the predicted box with the highest overlap with the ground-truth box.
For other targets in the image, the detection results are summarized as follows: D-YOLO v3 incorrectly detected the insulators at positions (18) and (24) and missed the insulator string at position (25). M-YOLO v3 missed the insulators at positions (12) and (15) in addition to the aforementioned detection errors. Both D-YOLO v4 and M-YOLO v4 failed to detect the insulator at position (15) and missed the insulator string at position (25). Besides, the insulator at position (24) was missed by M-YOLO v4. YOLO v5 predicted the wrong categories of the insulators at positions (18) and (24), while PU Faster R-CNN [26] missed the insulators at positions (5) and (11). The proposed method successfully identified all targets except for the insulator at position (4) and achieved relatively accurate localization for the targets.
In the second column of Figure 9, the annotations of the image include eight targets: seven insulators and one insulator string. Since parts of the three insulators in the top-right corner are outside the image, their categories cannot be determined, and these insulators are therefore excluded from the evaluation procedure. D-YOLO v3 successfully recognized four "Good" insulators among the seven insulators but failed to detect the insulator string. M-YOLO v3 identified the right class for the insulator string but detected only one insulator; meanwhile, it localized the insulator string and the insulators with large errors. D-YOLO v4 recognized only two insulators and missed the insulator string and the insulators at positions (3)-(7). M-YOLO v4 detected the insulator string with slight errors in the upper and right boundaries of the predicted box, and it also recognized the right classes for the two insulators at positions (2) and (4). The PU Faster R-CNNs correctly detected all targets in the scene, which was better than the other detectors mentioned above. The difference between these two methods mainly concerns the insulator at position (7): from (g) and (h) in the second column, our method had higher localization accuracy than the method in the study of Yang et al. [26].
The third column of Figure 9 contains seven targets, including six insulators and one insulator string. D-YOLO v3 was unable to predict the position of the insulator string, and M-YOLO v3 missed the insulator at position (6). D-YOLO v4 predicted the correct categories of all targets and accurately located them. At position (6), the predicted box generated by M-YOLO v4 was shifted downward relative to the ground-truth box, and the "Broken" insulator was misidentified as "Good." YOLO v5 accurately located all targets but missed the one at position (6). The proposed method and PU Faster R-CNN [26] achieved similar detection results, accurately detecting all targets. PU Faster R-CNN [26] had a certain error in the upper boundary of the predicted box, while YOLO v5 and our method had better detection performance at position (6). In summary, our method obtained better detection performance than these mainstream methods.
The second column in Figure 10 contains insulator strings on both the left and right sides. Although YOLO v5 detected the insulator string on the right side, there was a large error in the lower boundary of the predicted box, and it missed the insulators at positions (6), (8), and (16). Besides, when PU Faster R-CNN [26] detected the insulator string on the left side, a false alarm occurred, i.e., an extra "Insulator String" bounding box was detected within the ground-truth box. It also missed the insulators at positions (6), (8), and (16). In contrast, our proposed method accurately detected the insulator strings on both the left and right sides and only missed the insulator at position (6).
In the third column of Figure 10, there is an insulator string containing 13 insulators in the center of each image. These insulators belong to the "Good" category except for one "Broken" insulator at position (4). The detection performance of YOLO v3 in (b) and (c) was unsatisfactory owing to the missed detection of many insulators: D-YOLO v3 detected only the insulator string, while M-YOLO v3 merely detected the two insulators at positions (6) and (13). Compared with YOLO v3, YOLO v4 significantly reduced the false negative rate for insulator detection. D-YOLO v4 only missed the detections at positions (2), (6), and (11), but it also failed to detect the insulator string. M-YOLO v4 had some error in the upper boundary of the predicted box for the insulator string and missed the insulators at positions (2)-(4) and (11)-(13). From the detection results of YOLO v5 in (f), a serious false alarm problem arose in detecting the insulator string.
Although it correctly detected the insulator string at position (1), it detected three extra insulator strings at positions (10)-(12). PU Faster R-CNN [26] detected all targets except the insulator at position (2). The proposed Pi-Score detected all the insulators and the insulator string precisely. Note that the "Good" insulator at position (1) is unannotated; this insulator was ignored by all methods except YOLO v5, which misclassified it as "Flashover Damaged." In summary, the proposed Pi-Score achieved better detection performance.

Prediction Results under 0.3 Annotation Percentage.
Figure 11 shows the detection results of the seven detectors under a 0.3 annotation percentage. The columns (b)-(h) correspond to the various methods and are analyzed as follows. Due to the existence of massive numbers of unannotated samples, the performance of the comparison methods decreased to varying degrees. In the first column, there are one insulator string and six insulators. Neither M-YOLO v3 nor M-YOLO v4 detected any insulators, as shown in (c) and (e), although M-YOLO v4 successfully detected the insulator string. In (b) and (d), the DarkNet-based YOLO v3 and YOLO v4 each detected two insulators, and D-YOLO v4 also detected the insulator string. Both YOLO v5 and PU Faster R-CNN [26] detected three insulators. For the localization of the insulator string, YOLO v5's predicted box had a relatively large error, while PU Faster R-CNN [26] obtained an accurate predicted bounding box. The proposed Pi-Score correctly detected four insulators and the insulator string.
In the second column of Figure 11, there are a total of seven targets (six insulators and one insulator string). Both D-YOLO v3 and M-YOLO v3 failed to detect any targets in the image. D-YOLO v4 and M-YOLO v4 successfully detected the insulator string, and D-YOLO v4 further identified two insulators at positions (3) and (4). YOLO v5 correctly detected the insulator string and three insulators at positions (1), (2), and (4), but the insulators at positions (5) and (6) were predicted with incorrect categories. Similarly, PU Faster R-CNN [26] detected the insulator string and three insulators at positions (2)-(4), but the insulator at position (6) was identified as "Good" instead of "Flashover Damaged." Our method correctly detected the insulator string and four insulators at positions (2)-(5) without any false alarms or false negatives.
In the third column of Figure 11, there are a total of eight targets (seven insulators and one insulator string). D-YOLO v3 detected the insulator string and one insulator on the string, while M-YOLO v3 did not detect any targets. D-YOLO v4 and M-YOLO v4 both detected the insulator string, and M-YOLO v4 additionally detected two insulators with poor localization accuracy. The location of the insulator string provided by YOLO v5 was insufficiently precise, but it correctly detected four insulators at positions (1), (2), (3), and (5). PU Faster R-CNN [26] detected the insulator string and three insulators at positions (1), (3), and (5). Our method accurately detected the insulator string and correctly identified five insulators: those at positions (1), (3), (4), and (5), plus an unannotated insulator at position (4).
In summary, the proposed method outperforms the single-stage detectors in terms of the number of correctly detected targets and localization accuracy. When massive numbers of unlabeled samples exist, the proposed method shows significant improvement over the other methods. Even when unannotated samples are few, the proposed method performs slightly better than the other PU-based detectors and is substantially superior to the traditional Faster R-CNN.

Conclusions
In this paper, we propose a framework for insulator defect detection with incomplete annotations. In our framework, Faster R-CNN is redesigned as a PU-based alternative by introducing an improved PU learning technique. Specifically, the vanilla RPN in Faster R-CNN is modified into a PU-RPN by incorporating the PU classification loss. Besides, the proposed Pi-Score method in the improved PU framework concentrates on boosting the estimation accuracy of the class prior by combining the RPN and the ROI Head to mine the potentially positive, unlabeled anchors in images. To verify the effectiveness of our proposed framework, we conducted two groups of experiments. On the one hand, the experiments show that our method outperforms not only the baseline Faster R-CNN but also the other mainstream PU-based detectors. On the other hand, the experimental results also demonstrate that our method achieves the highest AP metrics (77.03% for 0.7 APC, 70.92% for 0.5 APC, and 60.73% for 0.3 APC) under different proportions of annotations when compared with the mainstream methods.

Data Availability
The dataset used in this paper is the Insulator Defect Image Dataset (IDID), which can be downloaded from: https://ieee-dataport.org/competitions/insulator-defect-detection.

Conflicts of Interest
The authors declare that they have no conflicts of interest.