Fabric Defect Segmentation System Based on a Lightweight GAN for Industrial Internet of Things

Machine vision systems based on deep learning play an important role in the industrial Internet of things (IIoT) and Industry 4.0 applications, especially for product quality monitoring. Fabric defect detection is an important task in the industrial production of textiles and is crucial for product quality assurance. In actual production, the detection of many small and weak target defects remains challenging. Furthermore, industrial production requires high production rates and small model sizes in practice. This study proposes a lightweight segmentation system that meets real-time industrial production requirements. Herein, first, the defect sample image was repaired based on the image repair mechanism of the generative adversarial network model. Then, the difference between the defect sample and the repaired sample was obtained and subsequent processing, such as denoising and enhancement, was done. Finally, the defect areas were segmented. Our model was specifically designed for the segmentation of weak and small defects. This was achieved through adversarial training, optimization of an objective function, and image processing. Experimental comparisons show that the intersection over union of the three different datasets is 77.84%, 77.85%, and 73.6% and that our model is superior to the conventional semantic segmentation model. Furthermore, our model has good image restoration quality with a low mean absolute error and high structural similarity index. Additionally, our model is lightweight, has good real-time performance, and is suitable for applications in the IIoT and industrial production lines, such as embedded systems.


Introduction
In recent years, the industrial Internet of things (IIoT) has accelerated its integration into traditional industries and, therefore, has evolved into various applications. With the deployment of machine vision systems on the edge side, an automated production inspection line can be established for product defect detection; the inspection results can be transmitted to the cloud to provide data support to satisfy different customer needs. Migrating complete or partial tasks to the edge can diminish the network bandwidth, computing, and storage requirements of a cloud center [1]. Fabric defect detection has attracted significant research attention in the textile industry. In industrial production, it is essential to segment fabric defects to ensure the high quality of fabric products [2]. With the development of machine learning and machine vision technology, machine vision-based methods to solve textile quality control problems have gradually become an industry trend because of their high accuracy, fast detection speed, and low labor cost [3,4].
Many researchers have used various algorithms and models for the automatic detection and segmentation of fabric defects [5,6]. Two methods are applied for defect detection, one of which involves the use of classical image analysis algorithms, such as texture models [7], Fourier analysis [8], and Gabor filters [9].
The second method is based on deep learning algorithms [10,11] that can often achieve good results. However, some problems remain in practical applications. For example, far fewer defect samples than normal samples can be obtained during production, and few types of defects are observed. Additionally, the conventional labeling of defect samples is time-consuming and labor-intensive.
Traditional segmentation networks have insufficient segmentation capabilities for small and weak defect samples. In recent years, the generative adversarial network (GAN) model [12] has become increasingly favored and valued by researchers because of the strong modeling ability of the discriminator. It can continuously judge the difference between the segmentation results from the generator and the ground truth. The discriminator and the generator are optimized to obtain a segmentation feature map of multicontext features, which enables the generation of segmented images that are infinitely close to the ground truth. Therefore, in this study, we introduce a GAN as a fabric defect segmentation model.
Although the number of defect samples in the production is generally small and the types of defects that appear are also few, expanding the training sample set by heavy manual annotation work is not the best choice. By learning a small quantity of samples, the probability distribution of the texture and other features of the normal sample can be obtained. Some unknown random defects that are excluded from existing training samples often appear in industrial production. Regardless of the type and characteristics of the defect, differences exist between the normal sample area and the defect area. Therefore, the defect area can be determined based on these differences. Zhao et al. proposed a defect detection model based on positive samples, which first repairs the defect area and then determines the defect area by comparison [13]. The method of combining GANs and autoencoders is used to repair the defect image; then, local binary pattern (LBP) features are used to detect defects. The LBP method has a good effect on large-scale defects. However, the contours detected by the LBP features may be inaccurate for small and weak defects. In this study, we achieved the segmentation of the defect area based on the GAN image repair mechanism. In addition, many of the models have a large size and do not consider actual production needs.
Our research motivation is to develop a lightweight real-time system suitable for industrial production, confronting the more difficult detection problems of weak and small defects. Many existing models have good effects on conventional fabric defects. However, there are some small defects with low contrast, which affect the further improvement of product quality. Therefore, the detection and segmentation of these weak and small defects have become important research tasks. In addition, in actual production, the model must be as light as possible, occupy a small space, and meet real-time requirements. Deep learning systems are deployed at the edge and can play a very important role in the IIoT [14], and our system can be deployed on local production lines or at the edge of the IIoT.
In response to the abovementioned problems, herein, we studied the detection and segmentation of multiple types of defects in actual production samples, focusing on the segmentation of weak and small defects. A method for quickly constructing a large number of samples that conform to the true probability distribution is proposed.
In addition, a model designed and optimized to occupy a small space and have a fast segmentation speed is presented, which can also be applied to industrial fabric production. A GAN model was used to realize the segmentation of fabric defects. Adversarial training not only makes the model more stable but also increases the accuracy of defect segmentation.
The main contributions of this article are summarized as follows.
(1) A fabric defect segmentation system suitable for industrial applications is proposed. The system is composed of a defect sample-synthesizing module without manual annotation, defect repair module, and defect segmentation module.
(2) Using a combined image processing method, we designed a defect segmentation model for weak and small defects, including an objective function for adversarial training, a normalization method, and a learning rate decay strategy, which contribute to the accurate segmentation of defects.
(3) The segmentation model proposed herein has the advantages of being lightweight and functions in real time, which is especially suitable for applications in IIoT and industrial production lines, such as embedded systems.
The remainder of this paper is organized as follows: Section 2 summarizes related work in recent years, Section 3 introduces our methodology, and Section 4 discusses the training process and model optimization method. Experimental results are presented and discussed in Section 5. Finally, Section 6 concludes the paper.

Related Work
Many classical algorithm models have been used in fabric defect segmentation and detection research, for example, wavelet analysis [15,16], fuzzy C-means method [17], Gabor filter [18], established texture distribution model [19], Elo Rating algorithm [20], Bayesian classifier based on statistical features [21], and XGBoost classifier based on the genetic algorithm [22]. These methods have achieved good results in solving many application problems in related scenarios.
In terms of application, fabric defect detection is a type of object surface inspection, and some researchers have worked in this broader field. For instance, surface defect detection methods based on deep learning have been proposed [37,38].
In fabric production, defects are only a small part of the abnormal samples. In recent years, some researchers have proposed anomaly detection methods and models [39] for image anomaly data detection. With the development of the GAN model, new models and theories have emerged consistently, such as conditional GANs [40,41], cycle-consistent GANs [42], and style-based GANs [43]. In recent years, owing to the powerful learning ability of the GAN model, some researchers have also used GAN models for anomaly detection [44]. Schlegl et al. proposed AnoGAN [45], which is trained on positive samples to learn a mapping from the latent space. Akcay et al. subsequently proposed GANomaly [46]; their approach only requires a generator and a discriminator as in a standard GAN architecture, which is an improvement compared to AnoGAN and EGBAD. Perera et al. proposed OCGAN [47], and Ngo et al. proposed Fence-GAN, which corrects the GAN loss and has a better anomaly classification accuracy [48]. Zhao et al. proposed a defect detection framework based only on positive sample training [13]. The defect area in the sample is first repaired, and then, the model compares the input defect sample with the restored sample to determine the exact defect area. Furthermore, Wang et al. proposed a method using the GAN model with locality-preferred recoding for visual anomaly detection [49]. Nema et al. proposed an unpaired GAN model for brain tumor segmentation [50]. To identify small changes in small structures, Murugesan [53].
In recent years, some researchers have proposed lightweight systems for applications such as underwater object detection [54], salient object detection [55], and blind road and crosswalk detection [56].
Although previous studies have made their own contributions, limited research has addressed the balance between lightweight design, real-time operation, and high accuracy that fabric defect detection and segmentation require in enterprise production. This article designs a system that includes three modules for the abovementioned problems and a model for the segmentation of weak and small defects.

Our Proposed Methodology
Let I denote a fabric image and S denote a predicate with the same properties, and let the image include n regions R_i (i = 1, 2, ⋯, n). Among them, R_1 is a normal region and the other regions are defective regions that satisfy formula (1), where i = 1, 2, ⋯, n, j = 1, 2, ⋯, n, and i ≠ j. The defect segmentation result can be described by formula (2), where P(x, y) is the image pixel, I_mask is the result after defect segmentation, and the region with a pixel value of one in I_mask is the defect region.
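One way to write the two formulas of this subsection is sketched below. It assumes the standard region-based definition of image segmentation; the exact conditions and their layout are an assumption and are not taken from the source. The symbols I, S, R_i, P(x, y), and I_mask are those defined in the text.

```latex
% Assumed form of formula (1): the regions tile the image, do not overlap,
% each region is homogeneous under S, and merged regions are not.
\begin{equation}
\bigcup_{i=1}^{n} R_i = I, \qquad
R_i \cap R_j = \varnothing, \qquad
S(R_i) = \mathrm{true}, \qquad
S(R_i \cup R_j) = \mathrm{false}
\end{equation}

% Assumed form of formula (2): pixels of the normal region R_1 map to 0,
% pixels of every defective region map to 1.
\begin{equation}
I_{\mathrm{mask}}(x, y) =
\begin{cases}
0, & P(x, y) \in R_1, \\
1, & \text{otherwise}.
\end{cases}
\end{equation}
```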

Framework Description.
This study proposes a two-step segmentation method based on a GAN model, as shown in Figure 1. The first step repairs the defect image to obtain the corresponding repaired image. The second step compares the two images to obtain the difference image and then derives the mask of the defect area using image processing methods such as denoising, linear transformation, and binarization. The system is divided into three modules: a module that synthesizes defect samples, a defect repair module, and a defect segmentation module.

Synthesizing Defect Samples.
Our experiments used samples taken from the equipment during the fabric production process. Owing to the rapid improvement of production processes, few defect samples can be obtained. To meet the needs of sample training, we designed a method to quickly obtain a large number of experimental samples without manual labeling. As shown in Figure 2, first, the defect areas are separated to obtain the defect blocks using only a small number of existing defect samples combined with the corresponding labeling information. We used the "sliding cutting" method with a given sliding step and cutting resolution. By sliding over and cropping each sample, new images with the cutting resolution were obtained. Whether a cropped image is a defect image can be decided easily by checking whether the corresponding cropped region of the label contains any labeled pixels. In this manner, we can obtain a large number of normal samples with the cutting resolution.
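The sliding cutting step can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function name `sliding_crops` is hypothetical, the default step of 20 and crop size of 128 follow the values reported for the enterprise dataset, and treating the image as 257 rows by 371 columns is an assumption about orientation.

```python
import numpy as np

def sliding_crops(image, mask, crop=128, step=20):
    """Yield (patch, patch_mask, is_defect) for every sliding window position."""
    height, width = image.shape[:2]
    for top in range(0, height - crop + 1, step):
        for left in range(0, width - crop + 1, step):
            patch = image[top:top + crop, left:left + crop]
            patch_mask = mask[top:top + crop, left:left + crop]
            # A patch counts as a defect sample if its label window
            # contains at least one labeled pixel.
            yield patch, patch_mask, bool(patch_mask.any())

# Example: one 257 x 371 sample with a single small labeled defect.
image = np.zeros((257, 371), dtype=np.uint8)
mask = np.zeros_like(image)
mask[100:104, 200:206] = 1

patches = list(sliding_crops(image, mask))
normal = [patch for patch, _, is_defect in patches if not is_defect]
```

Patches whose label window is empty go to the pool of normal samples; the remaining patches still carry their cropped label, so no new annotation is needed.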
Then, these few defect blocks are randomly pasted into the existing normal background by programming while recording the labeling information at the same time. In this way, a set of samples, including the defect, repaired, and mask images, is quickly obtained. Here, "random" includes the random selection of defect blocks and random pasting positions. Figure 2 shows an example of a method for artificially synthesizing and constructing defect samples.
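The random pasting step might look like the sketch below. The helper `paste_defect`, the synthetic gray values, and the choice to copy only the labeled pixels of a block are assumptions made for illustration; the source only states that blocks and positions are chosen at random and that the label is recorded at the same time.

```python
import numpy as np

def paste_defect(normal, defect_block, block_mask, rng):
    """Paste one defect block at a random position of a normal image.

    Returns the triple (defect_image, repaired_image, mask_image).
    """
    height, width = normal.shape
    bh, bw = defect_block.shape
    top = int(rng.integers(0, height - bh + 1))
    left = int(rng.integers(0, width - bw + 1))

    defect_image = normal.copy()
    mask_image = np.zeros_like(normal, dtype=np.uint8)
    region = defect_image[top:top + bh, left:left + bw]  # view into defect_image
    # Copy only labeled defect pixels so the surrounding texture is kept.
    region[block_mask > 0] = defect_block[block_mask > 0]
    mask_image[top:top + bh, left:left + bw] = (block_mask > 0).astype(np.uint8)
    # The untouched normal image serves as the repaired target.
    return defect_image, normal, mask_image

rng = np.random.default_rng(0)
normal = np.full((128, 128), 120, dtype=np.uint8)
blocks = [(np.full((6, 9), 30, dtype=np.uint8), np.ones((6, 9), dtype=np.uint8))]
block, block_mask = blocks[int(rng.integers(len(blocks)))]  # random block choice
defect_image, repaired_image, mask_image = paste_defect(normal, block, block_mask, rng)
```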
This method not only quickly provides a large number of defect samples that are close to the original defect sample distribution but also directly obtains the corresponding annotation information, which can replace the tedious work of manual labeling. Figure 3 illustrates some examples of synthesized defect samples.

Defect Repairing.
Considering the real-time requirements of industrial production, we designed a simplified SegNet model. Compared with FCN [23] and U-Net [26], SegNet [29] uses the position information during maximum pooling. This does not require learning and, therefore, reduces the number of end-to-end training parameters. SegNet cleverly achieves upsampling by recording the position of the maximum value during pooling, and because there is no deconvolution process, it improves the training speed of the model.
In this study, we propose a concise SegNet with a reduced number of network layers. As shown in Figure 4, the model uses fewer encoding and decoding layers but retains more detail for repairing the image while significantly reducing the storage space occupied by the model. In the encoding process, convolution and max pooling are used alternately to downsample the image. This step is performed only three times (the original SegNet downsamples five times). In the decoding process, max unpooling and convolution are used alternately, also only three times. Furthermore, the LeakyReLU activation function is used directly for the output. The pooling indices (the location information recorded during pooling) are passed to the decoder, which places each value directly back at its original location during unpooling. Figure 5 shows the structure of the discriminant model, which uses a six-layer convolutional encoder structure. Each convolution is followed by a LeakyReLU activation, and the last output layer uses the sigmoid function. The resulting score is a probability value that indicates whether the input image is truly normal.
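The role of the pooling indices can be illustrated with a small NumPy sketch. It is not the network code: the function names are hypothetical, a single-channel input whose sides are divisible by the window size is assumed, and no learned layers are involved.

```python
import numpy as np

def max_pool_with_indices(x, k=2):
    """k x k max pooling that also records the flat position of each maximum."""
    h, w = x.shape
    # Group pixels into non-overlapping k x k windows, one row per window.
    windows = (x.reshape(h // k, k, w // k, k)
                .transpose(0, 2, 1, 3)
                .reshape(h // k, w // k, k * k))
    arg = windows.argmax(axis=2)
    pooled = windows.max(axis=2)
    rows = np.arange(h // k)[:, None] * k + arg // k
    cols = np.arange(w // k)[None, :] * k + arg % k
    return pooled, rows * w + cols  # indices into the flattened input

def max_unpool(pooled, indices, shape):
    """Put each pooled value back at its recorded position; zeros elsewhere."""
    out = np.zeros(shape[0] * shape[1], dtype=pooled.dtype)
    out[indices.ravel()] = pooled.ravel()
    return out.reshape(shape)

x = np.array([[1, 2, 0, 0],
              [3, 4, 0, 5],
              [0, 0, 7, 0],
              [6, 0, 0, 0]], dtype=np.float32)
pooled, idx = max_pool_with_indices(x)
restored = max_unpool(pooled, idx, x.shape)
```

Because unpooling only scatters values to stored positions, it adds no trainable parameters, which is the property the text relies on.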

Defect Segmentation.
The end-to-end defect repair model is finally obtained through alternate training of the G and D networks. The test samples are input into the generator network to obtain a normal image.
There are two situations in this study: if there are any defects in the input sample, the model repairs the defects; otherwise, there is no significant difference between the output of the model and the input if the sample is normal. The original image and the repaired image need to be compared to obtain the difference image. The image difference can be described by formula (3).
where I_ori(x, y) is the original image, I_rep(x, y) is the repaired image, and |·| denotes the absolute value.
Since the defect area in the difference image may not be apparent, several enhancement operations are required, as shown in Figure 6. Conventional filtering methods are not used for denoising because they may blur the edges and details of the target. Rather, the threshold method is used to denoise the image directly, based on the background of the difference image. While filtering out the noise, the details of the segmented target can be preserved. The threshold method can be described by formula (4).
where th is a threshold for denoising. Then, the brightness and contrast of the difference images are enhanced by a linear transformation, as in formula (5).
where X represents the pixel value of a certain point in the original image and Y represents the pixel value of the corresponding position after transformation. The contrast of the image can be adjusted using α, and the brightness of the image can be changed using β. Finally, the OTSU algorithm is used for binarization to obtain the required mask image, which is the final segmentation result.
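The post-processing chain of formulas (3) to (5) followed by OTSU binarization can be sketched in NumPy as below. This is a stand-in for the OpenCV calls, not the authors' code. The defaults th = 19, α = 5, and β = 0 are the values reported in the experimental settings; zeroing differences that do not exceed the threshold is an assumed form of formula (4), and both function names are hypothetical.

```python
import numpy as np

def otsu_threshold(img):
    """Return the gray level that maximizes the between-class variance."""
    hist = np.bincount(img.ravel(), minlength=256).astype(np.float64)
    levels = np.arange(256)
    w0 = np.cumsum(hist)                 # pixels with value <= t
    w1 = hist.sum() - w0                 # pixels with value > t
    s0 = np.cumsum(hist * levels)
    mu0 = s0 / np.maximum(w0, 1)
    mu1 = (s0[-1] - s0) / np.maximum(w1, 1)
    return int(np.argmax(w0 * w1 * (mu0 - mu1) ** 2))

def segment_defect(ori, rep, th=19, alpha=5.0, beta=0.0):
    # Formula (3): absolute difference of original and repaired image.
    diff = np.abs(ori.astype(np.int16) - rep.astype(np.int16)).astype(np.uint8)
    # Formula (4), assumed form: suppress differences not above the threshold.
    diff[diff <= th] = 0
    # Formula (5): Y = alpha * X + beta, saturated to the 8-bit range.
    enhanced = np.clip(alpha * diff + beta, 0, 255).astype(np.uint8)
    # OTSU binarization: 1 marks the defect region.
    return (enhanced > otsu_threshold(enhanced)).astype(np.uint8)

rep = np.full((128, 128), 120, dtype=np.uint8)   # repaired (normal) image
ori = rep.copy()
ori[10:14, 20:25] = 150                          # defect: difference of 30
ori[50, 50] = 125                                # noise: difference of 5
mask = segment_defect(ori, rep)
```

In this toy case the isolated noise pixel falls below the threshold and is removed without any blurring, while the defect edges stay sharp.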

Training Process.
In the sample-synthesizing module, we obtain the image group consisting of the defect image, repaired image, and mask image. Only the first two were used in the training of the repair model. In the GAN training, an alternate iteration method is adopted for model training. First, the G network was trained, following which the D network was trained. The training of the D network also requires the output of the G network in the previous round of gradient backpropagation as input. Figure 7 shows the training process for the G and D networks.
For the G network, the defect samples are input into the generation model to generate a fake repaired image, and then, the discriminant model is used to obtain a score. The generated repaired image is expected to be sufficiently realistic; therefore, an error is computed between this score and the true label "1." Meanwhile, an error is computed between the fake repaired image and the true repaired image. These two errors are combined to form the loss function of the G network, and the parameters of the G network are updated by gradient backpropagation through the loss function.
For the training process of the D network, a score was obtained after the true repaired image was inputted into the discriminant model. The D network is expected to be able to accurately distinguish between true and false repaired images. Therefore, the discriminant score of the true repaired image and the true label "1" form an error. Similarly, the score of the false repaired image and the false label "0" form an error. The average of the two errors constitutes the loss function of the D network.
The role of the D network is to oppose the generation model: it pushes the score of the true repaired image toward the true label "1" and the score of the false repaired image toward the false label "0." This contradicts the expectation of the G network that the score of the fake repaired image tends to the true label "1," which is the antagonism of the GAN model. In an ideal situation, the scores obtained when the true and false repaired images enter the discriminant model are all close to 0.5, which means that the discriminant model is unable to distinguish between the true and false repaired images. In other words, the samples generated by the generation model follow the distribution of the real samples. At this point, the model reaches an ideal balance.

Figure 6: Process of segmenting defect images.

Objective Function.
In the training of the adversarial segmentation network, there are four errors: the discriminant error and the generation error of the G network and the two discriminant errors of the D network. Therefore, four loss terms were included in the error analysis. For the G network, the error between the false repaired image and the true repaired image was evaluated with the mean square error (MSE). In addition, there is an error between the score of the fake repaired image and the true label "1." This is a binary classification problem, so binary cross entropy (BCE) was used to calculate the loss. Similarly, in the D network, both errors are binary classification problems and BCE was used to calculate the loss. First, we observe the composition of the MSE loss function, as shown in formula (6). Compared with the Euclidean distance, the MSE additionally averages over the number of samples; it can be described as the expected value of the squared difference between the true value and the estimated value. The MSE loss of the G network can be simply described by formula (7).
where y_i and ŷ_i are the true and estimated values, respectively.
where x and y are the defect and true repaired samples, respectively. The calculation of the BCE loss function is described in formula (8), where ŷ_i is the evaluation value of the sample and y_i is the label of the binary classification, which is 0 or 1.
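For reference, the textbook definitions of the MSE and BCE losses are given below in the notation of this subsection. The numbering follows the references in the text; the symbol L_MSE(G) and the expectation notation in formula (7) are assumptions, since only the variables x, y, y_i, and ŷ_i are defined in the source.

```latex
\begin{align}
\mathrm{MSE} &= \frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2 \tag{6} \\
L_{\mathrm{MSE}}(G) &= \mathbb{E}_{x,y}\!\left[\left(y - G(x)\right)^2\right] \tag{7} \\
\mathrm{BCE} &= -\frac{1}{n}\sum_{i=1}^{n}
  \left[\,y_i \log \hat{y}_i + (1 - y_i)\log\left(1 - \hat{y}_i\right)\right] \tag{8}
\end{align}
```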
It follows that the objective function of the D network is the average of its two BCE errors and that the objective function of the G network combines its BCE error with its MSE error.
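How the four error terms combine into the two network losses can be sketched in plain Python. The function names and the `weight` factor are hypothetical; the source states only that the G loss combines a BCE term with an MSE term and that the D loss averages two BCE terms, without giving a weighting.

```python
import math

def bce(score, label, eps=1e-12):
    """Binary cross entropy for one discriminator score in (0, 1)."""
    score = min(max(score, eps), 1.0 - eps)
    return -(label * math.log(score) + (1 - label) * math.log(1.0 - score))

def mse(fake, real):
    """Mean squared error between two equally sized pixel sequences."""
    return sum((f - r) ** 2 for f, r in zip(fake, real)) / len(real)

def generator_loss(score_fake, fake, real, weight=1.0):
    # Adversarial term (fake score against true label 1) plus repair error.
    # `weight` is an assumption; the balance of the terms is not stated.
    return bce(score_fake, 1) + weight * mse(fake, real)

def discriminator_loss(score_real, score_fake):
    # Average of the error on the true image (label 1)
    # and the error on the fake image (label 0).
    return 0.5 * (bce(score_real, 1) + bce(score_fake, 0))

# At the ideal balance both scores are 0.5, so the D loss equals log(2).
balanced = discriminator_loss(0.5, 0.5)
perfect_repair = generator_loss(0.5, [0.0, 1.0], [0.0, 1.0])
```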

Wireless Communications and Mobile Computing
Then, to unify the formulas, the objective function of the D network is rewritten in the same form, which gives the final objective function of the trained model.

The main parameter settings of the experiment are shown in Table 1.
In each epoch, all training data passed through the network. During training, this experiment used the Adam optimizer, which combines the advantages of the RMSProp and AdaGrad optimization algorithms. The initial learning rate of the Adam optimizer was 0.001, and the exponential decay rates of the first-order and second-order moment estimates were 0.5 and 0.999, respectively. A multistep learning rate decay strategy (MultiStepLR) was also used: the learning rate was reduced to 0.0001 at epoch 30 and to 0.00001 at epoch 60. The advantage of this setting is that the loss of the model decreases rapidly in the early stage and gradually approaches the optimum in the later stage.
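The learning rate schedule described above corresponds to a decay factor of 0.1 at epochs 30 and 60. The framework-free sketch below reproduces it; the function name is hypothetical and the factor 0.1 is inferred from the reported rates. In PyTorch, the same setting would presumably be `torch.optim.Adam(params, lr=0.001, betas=(0.5, 0.999))` with `torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[30, 60], gamma=0.1)`.

```python
def multistep_lr(epoch, base_lr=0.001, milestones=(30, 60), gamma=0.1):
    """Learning rate in effect during `epoch` (0-based)."""
    # Multiply by gamma once for every milestone already reached.
    passed = sum(1 for m in milestones if epoch >= m)
    return base_lr * gamma ** passed

schedule = {epoch: multistep_lr(epoch) for epoch in (0, 29, 30, 59, 60)}
```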
Image processing after defect repair was implemented using OpenCV: the denoising threshold was set to 19, the linear transformation used the convertScaleAbs method, and the α and β parameters were set to 5 and 0, respectively.

Datasets.
The experiments in this study used the following three datasets.

Enterprise Dataset.
The fabric defect samples in the experiment originated from the image acquisition equipment on the enterprise assembly line. In the production process, high-speed cameras are used to monitor product quality.
There were 4360 original samples, and the original image resolution was 371 × 257 pixels. After removing duplicate and invalid samples, there were only a total of 90 samples and these were labeled for defects. Then, we used image rotation, flip, transpose, and other operations to enlarge the image set and obtain seven new forms of defect images. In the process of transforming the defect image, the label corresponding to the image is also expanded so that there is no need to label the new defect image one by one.
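One reading of the "seven new forms" is that they are the seven non-identity combinations of 90-degree rotations and mirroring. The sketch below follows that assumption and applies every transform to the image and its label together, so that the label never has to be redrawn; `eight_forms` is a hypothetical helper.

```python
import numpy as np

def eight_forms(image, mask):
    """Return the original plus seven rotated, flipped, or transposed variants."""
    forms = []
    for k in range(4):  # rotations by 0, 90, 180, and 270 degrees
        img_r, mask_r = np.rot90(image, k), np.rot90(mask, k)
        forms.append((img_r, mask_r))
        # Mirrored copy of the rotated pair; the label is transformed identically.
        forms.append((np.fliplr(img_r), np.fliplr(mask_r)))
    return forms

image = np.arange(12, dtype=np.uint8).reshape(3, 4)
mask = (image > 5).astype(np.uint8)
forms = eight_forms(image, mask)
```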
We then adopted the method described in Section 3 to quickly obtain a large number of defect samples that were close to the original defect sample distribution. The "sliding cutting" method was used with a sliding step of 20 and a cutting resolution of 128 × 128 pixels.

AITEX Dataset.
The AITEX dataset [57] is composed of 245 images of 4096 × 256 pixels with seven different fabric structures. There are 140 nondefect images in the database

Expanded Dataset.
Because the number of defect samples is small, the existing defects and types of defects are very limited. Therefore, we created artificial defect samples. Such defects did not appear in the training set and were, therefore, used to test whether our defect segmentation model was effective on unseen defects.

Evaluation Metrics.
Our model is evaluated using several metrics, such as Pixel Acc and intersection over union (IoU). Pixel Acc represents the ratio of the number of correctly classified pixels to the total number of pixels in the segmentation image, including correctly classified background points. The IoU measures the similarity between the segmentation result and the ground truth, as shown in formula (14). Each pixel in the segmentation result is divided into four types, that is, true positive (TP) (the number of defect pixels that are correctly divided into the defect area by the model), false positive (FP) (the number of background pixels that are incorrectly divided into the defect area by the model), false negative (FN) (the number of defect pixels that are incorrectly divided into the background area by the model), and true negative (TN) (the number of background pixels that are correctly divided into the background area by the model).
We used the mean absolute error (MAE) to evaluate the average pixel error after image repair, as in formula (15).
where A_i is the original value of the ith pixel, C_i is the repaired value of the ith pixel, and n is the total number of pixels in an image. In addition, we used the structural similarity index (SSIM) [58] to analyze the quality of image restoration.
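The three pixel-level metrics can be computed as in the sketch below, which uses the standard definitions IoU = TP / (TP + FP + FN) and MAE = (1/n) Σ |A_i − C_i|. The function names are hypothetical, and returning 1.0 for two empty masks is a convention chosen here, not something stated in the source.

```python
import numpy as np

def pixel_accuracy(pred, truth):
    """Share of pixels, defect and background, that are classified correctly."""
    return float((pred == truth).mean())

def iou(pred, truth):
    tp = np.logical_and(pred == 1, truth == 1).sum()
    fp = np.logical_and(pred == 1, truth == 0).sum()
    fn = np.logical_and(pred == 0, truth == 1).sum()
    union = tp + fp + fn
    # Assumed convention: two empty masks count as a perfect match.
    return float(tp / union) if union else 1.0

def mae(original, repaired):
    """Mean absolute pixel error between original and repaired image."""
    diff = original.astype(np.float64) - repaired.astype(np.float64)
    return float(np.abs(diff).mean())

truth = np.zeros((4, 4), dtype=np.uint8)
truth[1, 0:4] = 1                      # four defect pixels
pred = np.zeros_like(truth)
pred[1, 0:3] = 1                       # three found, one missed
pred[2, 0] = 1                         # one false alarm

acc = pixel_accuracy(pred, truth)
overlap = iou(pred, truth)
error = mae(np.zeros((4, 4), dtype=np.uint8), np.full((4, 4), 2, dtype=np.uint8))
```

With a small defect, two wrong pixels barely change Pixel Acc but lower the IoU considerably, which is why the IoU is the more demanding metric for weak and small defects.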

Enterprise Fabric Samples.
Through the sample-synthesizing module, we obtained 19606 pairs of artificial defect samples. A total of 10240 pairs of samples were used as the training set, and 320 pairs were used as the validation set to observe the effect that training had on the model. For the fabric defect samples, the generation error (MSE) of the best model on the validation set was only 0.00021. The Pixel Acc was 99.68%, and the IoU was 77.84%. Figure 8 shows the segmentation results for the samples.

AITEX Dataset.
First, the large-sized samples were cropped to obtain 128 × 128 pixel samples. The sample preprocessing method and training hyperparameter settings were the same as those of the previous fabric defect sample set.
The generation error (MSE) of the best model on the validation set was 0.00056, the Pixel Acc of defect segmentation was 99.94%, and the IoU score was 77.85%. Figure 9 illustrates the segmentation results for the AITEX samples. In terms of segmentation accuracy, the model in this study also performs well on this type of sample with a more complex background. Table 2 lists the experimental results for several samples in this model.

Figure 11 illustrates examples of the artificial defect segmentation results. In the expanded set of 64 samples, the Pixel Acc of segmentation reached 99.3% and the IoU score reached 73.6%. These results are close to the segmentation accuracy achieved on the existing defect samples.

Comparative Experiments.
To compare the performance of the different models, we implemented six other models, as shown in Table 2.

The FCN, U-Net, and SegNet models are described in [23,26,29], respectively. The specific implementation details of the three models are presented in Figure 12.
FCNGAN, U-NetGAN, and SegNetGAN in Table 2 denote models trained with the GAN mechanism, where FCN, U-Net, and SegNet, respectively, are used as the G network and the D network is composed of a six-layer convolutional network. The D-network model is shown in Figure 5. The training parameters of the models listed in Table 2 are consistent with those listed in Table 1.

Figure 14: Segmentation results of uneven background samples.
Table 4: SSIM results in Figure 15.

Among the first three segmentation models, SegNet has more advantages and its three segmentation evaluation indicators are better than those of the other two models. After the GAN training mechanism was introduced into the three segmentation models, the segmentation performance of FCNGAN and SegNetGAN improved but that of U-NetGAN did not. This comparison shows that the GAN training mechanism did not significantly improve the performance of the three semantic segmentation models. The model proposed herein achieved the best experimental results, and its segmentation performance indicators were better than those of the other six models. In this experiment, many of the sample defect areas were weak and small targets; therefore, fluctuations in the segmentation results had a greater impact on the IoU. Nevertheless, the IoU reached 0.7784, which is 7.5 percentage points higher than the best result of 0.7034 among the other six models.

Segmentation Effects of Weak and Small-Defect Samples.
We compared the segmentation results of each model for weak and small-defect samples. As shown in Figure 13, the first row contains five weak and small-defect samples and the second row is the ground truth. The first sample on the left contains two very small defects situated very close to each other. The segmentation result of our model is the closest to the ground truth and separates the two small defects.
In another example, there were two small defects in the third sample. The segmentation result of the U-NetGAN model misses the defect, and the defect segmentation results of the other models are enlarged. The segmentation results of the model proposed in this study are the most accurate.

Samples with Uneven Background.
To test the ability of the model proposed herein to repair defect samples, we selected some defect samples with uneven backgrounds for testing. The test results show that the proposed model can effectively segment the defects. Since our model uses an image difference algorithm, the defect area that is very similar to the background may not be continuous in the segmentation results, as illustrated in Figure 14.

Analysis of the Sample Repair Effect.
This study provides representative samples for analyzing the repair results of our model. In Figure 15(a), sample 1 has an apparent flaw, sample 2 has a weak and small defect, and samples 3 and 4 have long stripe defects. The four samples in Figure 15(b) are normal samples with different backgrounds and textures. Figure 15(a) also shows the result of repairing the four defect samples. In the repaired results, the flaws have apparently disappeared. We used the MAE to evaluate the average pixel error after image repair, as shown in Table 3. The MAE results show that the average pixel error of the repaired images was less than three. Since samples 3 and 4 had long strips of flaws with larger areas, their MAE was also larger. Figure 15(b) illustrates the results of normal sample image restoration. The results generated by our model are nearly identical to the original images, and the MAE results show that the average pixel error was less than two. These results indicate that the repair effect of our model is excellent. Moreover, we used SSIM to analyze the quality of image restoration. Table 4 presents the SSIM results, which show that the similarity between the original and repaired samples was generally high. Among the flawed samples in Figure 15(a), the flaws in samples 1 and 2 have a minimal effect on the similarity because they are small, whereas, in samples 3 and 4, the similarity decreases owing to the larger area of the flaws. For the normal samples in Figure 15(b), the similarity was high. This indicates that the repair quality of our model is good.

Model Size Comparison.
To verify whether the model can meet real-time requirements, we tested the segmentation speed of seven different models. The metric is the number of cotton samples that a model can process in one second (frames per second, FPS). The experimental results are presented in Table 5 and Figure 16. The model size is the size of the saved model file. The model proposed in this article occupies only 14.4 MB, which makes it easy to embed in industrial equipment.
The first three models follow the pattern that the smaller the model, the higher the FPS value. However, this pattern does not hold in general, because model size and computing speed are not necessarily anticorrelated. The size of the model directly reflects its number of parameters, but the computing speed depends not only on the number of parameters but also on the structure of the model. The processing speeds of all seven models meet real-time requirements. Since our model runs on a GPU, the repair network produces the repaired image very quickly: processing 7200 samples takes only approximately 56 s (approximately 128 FPS). The image processing operations after the repair network run on the CPU with OpenCV and add only a small amount of calculation time. This lowers the overall FPS, but good real-time performance is still achieved.
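The throughput figure quoted above follows from dividing the number of samples by the elapsed time. A minimal timing harness might look like this; `frames_per_second` and the dummy workload are hypothetical, and a real measurement would call the repair network followed by the post-processing step.

```python
import time

def frames_per_second(process, samples):
    """Run `process` on every sample and return samples per second."""
    start = time.perf_counter()
    for sample in samples:
        process(sample)
    elapsed = time.perf_counter() - start
    return len(samples) / elapsed if elapsed > 0 else float("inf")

# Reported repair-network figure: 7200 samples in about 56 s.
reported_fps = 7200 / 56

# Dummy workload standing in for repair plus post-processing.
measured_fps = frames_per_second(lambda s: sum(s), [list(range(100))] * 200)
```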
Considering that the resolution of the test samples has an impact on the FPS, we used a sample set with a resolution of 256 × 256 pixels to test the model again; this included 720 defect samples and 6480 normal samples. These test samples did not come from an enlarged 128 × 128-pixel image but were cut directly from the original cotton cloth sample. The test results are presented in Table 6. The test results show that the FPS decreased owing to the increase in sample resolution; however, the real-time requirements can still be reached.

Conclusion
In this study, a lightweight system composed of three modules was designed to solve the segmentation problem of fabric defects, particularly for weak and small-defect targets. We used a GAN model based on the repair mechanism, which is lightweight and has good defect segmentation ability. The results of testing corporate samples and samples from a public database show that the model proposed in this study has good segmentation effects and can achieve real-time performance, thus demonstrating its application value in IIoT and industrial production lines.
In the future, we will focus on few-shot and unsupervised learning. In addition, further improvements in real-time performance are worth studying.

Data Availability
The data used to support the findings of this study are included within the article.