Semantic Recognition and Location of Cracks by Fusing Cracks Segmentation and Deep Learning

Cracks have long appeared on concrete surfaces and cause a number of safety problems. Traditional manual detection methods cost money and time and cannot guarantee high accuracy. Therefore, a recognition method that combines a convolutional neural network with cluster segmentation is proposed. The proposed method accurately identifies concrete surface cracks in images with complex backgrounds and improves the efficiency of concrete surface crack identification. The research results show that the proposed method not only classifies crack and noncrack images efficiently but also identifies cracks in complex backgrounds. The accuracy of the proposed method in crack recognition is at least 97.3% and reaches up to 98.6%.


Introduction
At present, concrete structures are widely used in China across many engineering industries. However, the various cracks generated during construction and use pose potential safety hazards for engineering construction and maintenance. The most common detection method is still manual detection, which is not only inefficient but also brings safety problems. The more advanced nondestructive detection methods are costly and cannot realize fully noncontact detection. In recent years, with the continuous development of computer vision technology, high-precision crack detection based on computer technology has become a research hot spot. The effects of detection methods are mainly focused on the sampled-data cluster formation protocol to tackle the leaderless cluster formation problems [1]. They are mainly divided into traditional image processing methods (including edge detection [2], threshold segmentation [3], and region segmentation [4]) and machine learning methods (including deep learning [5] and clustering segmentation [6]).
Among the traditional image processing methods, Xu et al. [7] used phase angle and gray distribution to improve the Canny operator, which can identify fine surface cracks but generates more noise. Rivera et al. [8] tried to detect cracks on the concrete surface by morphological methods and crack splicing, which removes noise well, but the splicing of the crack's overall skeleton is not very good. Su and Yang [9] proposed an image segmentation enhancement algorithm, named morphological segmentation based on edge detection-II (MSED-II), for concrete crack segmentation. Lu et al. [10] used a new double-threshold algorithm combined with morphological denoising to identify crack images of strain hardening cementitious composites, but this method cannot identify cracks with complex backgrounds. Based on the edge detection algorithm, Nikravesh and Nezamivand Chegini [11] proposed a detection method using the wavelet transform, which identifies crack images with insignificant grayscale differences better and can identify cracks under certain complex backgrounds, but the effect is not significant enough.
In contrast to traditional image processing methods, Rao et al. [12] proposed a faster and simpler single-level convolutional neural network based on real-time target detection technology, which reduces the impact of the background on crack identification to a certain extent and identifies various surface cracks of concrete bridges, but its accuracy is only 66%. Based on a deep fully convolutional neural network, Ren et al. [13] proposed a multiscale crack feature extraction network structure for concrete tunnels named CrackSegNet, which has a strong ability to split the entire crack but still has trouble extracting cracks with complex backgrounds. Dung and Anh [14] introduced a "Full Convolutional Neural Network" to detect crack targets effectively in a complex background and reduce false marks, but the processing efficiency is still not high enough. Based on the improved and optimized convolutional neural network "GoogLeNet," Ye et al. [15] realized high-precision identification of multiple cracks with complex backgrounds, but the identification accuracy on the crack data set is not very high.
For the accurate identification of concrete surface cracks, we propose a concrete surface crack identification method in this paper. It combines a convolutional neural network with a clustering segmentation algorithm and can achieve accurate identification and size calculation of concrete surface crack images.

Crack Identification Network (CIN).
According to the diversity of crack shapes and the uncertainty of crack magnitude, this paper proposes a convolutional neural network structure called the crack identification network (CIN), as shown in Figure 1. Let i be the serial number, L_i the i-th layer, C_i the i-th convolution operation, and P_i the i-th pooling operation. L_1, L_3, L_4, L_6, L_8, and L_9 are convolutional layers; L_2, L_5, L_7, and L_10 are pooling layers; L_11 is the fully connected layer; L_12 is the ReLU layer; L_13 is the dropout layer; and L_14 is the softmax layer. The basic structure of the model consists of six convolutional layers, four pooling layers, a fully connected layer, an input layer, and an output layer. All convolutional layers and fully connected layers are batch normalized to improve the generalization ability of the model. So that images that are too large or too small do not affect the recognition result, the input image size is set to 256 × 256 × 3; the three dimensions are length, width, and RGB components (red, green, and blue). To compress features better and reduce network complexity, the pooling window sizes of L_2, L_5, L_7, and L_10 are set to 2 × 2, 4 × 4, 4 × 4, and 6 × 6, respectively, with a step size of 2 and the maximum pooling rule. To obtain deeper image features without increasing the complexity of the network, the convolution kernel size of L_1, L_3, L_6, and L_8 is set to 3 × 3 and that of L_4 and L_9 to 5 × 5; the step size of all convolutional layers is 1. After convolution and pooling, the Flatten operation outputs a vector containing 512 elements to the ReLU layer. Then, the dropout layer randomly cuts off part of the neuron connections, which reduces the dependence between network model parameters, improves the robustness of the model, and helps prevent overfitting.
Finally, the softmax layer is used to judge whether the data after the above processing is a crack image.
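The layer settings listed above can be checked with a short size trace. The following is a minimal sketch, not the authors' implementation: it assumes unpadded ("valid") convolution and pooling, which the text does not state, and the names `CIN_LAYERS`, `out_size`, and `trace_sizes` are hypothetical.

```python
# Layer list follows the text: (name, kind, kernel size, step size).
# Assumption: no padding is applied; the paper does not specify the padding mode.
CIN_LAYERS = [
    ("L1", "conv", 3, 1), ("L2", "pool", 2, 2),
    ("L3", "conv", 3, 1), ("L4", "conv", 5, 1), ("L5", "pool", 4, 2),
    ("L6", "conv", 3, 1), ("L7", "pool", 4, 2),
    ("L8", "conv", 3, 1), ("L9", "conv", 5, 1), ("L10", "pool", 6, 2),
]


def out_size(size, kernel, stride):
    """Side length after an unpadded convolution or pooling window."""
    return (size - kernel) // stride + 1


def trace_sizes(input_size=256, layers=CIN_LAYERS):
    """Return [(layer name, output side length)] for a square input."""
    sizes = []
    size = input_size
    for name, _kind, kernel, stride in layers:
        size = out_size(size, kernel, stride)
        sizes.append((name, size))
    return sizes
```

Calling `trace_sizes()` lists the side length of the feature map after each of the ten layers. Under the no-padding assumption the map after L10 is 8 × 8; the number of channels per layer is not given in the text, so the sketch does not attempt to derive the 512-element vector.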
With the constructed crack identification network, the cracks can be preliminarily identified through the following steps. First, we established a large crack data set, which was divided into a training set and a validation set. Next, we used the training set and the validation set to train and validate the crack identification network model. Last, we used the test set to evaluate the prediction performance of the model. The overall identification process is shown in Figure 2.

Data Set Making.
The crack acquisition mechanism shown in Figure 3 was constructed to obtain the crack images used to make the data set. We used a CCD industrial camera (Basler acA1300-30gm) mounted on a drone to collect 1000 original images, most of which were crack pictures with complex backgrounds (such as crack pictures with rough surfaces, moss, bumps, depressions, and stains), and each image was different due to different photographing conditions.
The size of the collected original images was 4896 × 3672 pixels. They are too large to be input directly, so they were divided into 512 × 512 subimages in steps of 248, and the 1000 original images were divided into 46,000 subimages. In order to detect cracks and eliminate noise more effectively, some subimages were rotated. The final data set contained 58,000 subimages, 48,000 of which were used for training and 10,000 for testing and verifying the effectiveness of the network.
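The subimage division described above is a sliding-window crop. The sketch below illustrates the idea under stated assumptions: windows that would cross the image border are dropped, and the function names `window_origins` and `tile_boxes` are hypothetical. How the authors handled borders or filtered subimages is not stated, so the counts from this sketch are not meant to reproduce the reported totals.

```python
def window_origins(length, window=512, step=248):
    """Start offsets of windows that fit entirely inside an axis of `length` pixels."""
    if length < window:
        return []
    return list(range(0, length - window + 1, step))


def tile_boxes(width, height, window=512, step=248):
    """Yield (left, top, right, bottom) boxes for every full window."""
    for top in window_origins(height, window, step):
        for left in window_origins(width, window, step):
            yield (left, top, left + window, top + window)
```

Each yielded box can be passed to an image library's crop routine; the window size and step default to the 512 and 248 given in the text.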
In order to train a crack detection model with high robustness and high accuracy, the generated data set contains a wide range of crack image types. For example, the crack images contain cross cracks, thick and thin cracks, and cracks with debris. The noncrack images contain concrete surfaces with shadows, bumps, and textures, as shown in Figure 4.
In Figure 4(a), the blue dotted box in the first image marks the intersection of the cross crack, the box in the second image marks the fallen leaves in the crack image, and the last two images are examples of thick and fine cracks. In Figure 4(b), the blue dotted elliptical frame in the first noncrack image marks the shaded part of the concrete surface. The elliptical frame in the second image marks the depression on the concrete surface. The elliptical frame in the third image marks the bumps on the concrete surface, and the elliptical frame in the last image marks the texture of the concrete surface.
Figure 1: Schematic of crack identification network (CIN).
In order to verify the recognition accuracy of the recognition model effectively and keep the detection method simple, we used 5-fold cross-validation [16] to train and validate 5 sets of models and obtained the training results of the 5 sets of models. Then, we randomly selected one training result to show the accuracy and loss curves of the training set and the validation set, as shown in Figure 5. It can be seen from Figure 5(a) that the model converges very quickly. Even with a small amount of sample data, the loss curve drops fast. The curve tends to be stable at 20 epochs, which indicates that the sample training is almost completed. It can be seen from Figure 5(b) that the accuracy of the validation set continues to rise, and the loss value continues to decrease. The accuracy of the validation set tends to 100% at 20 epochs and the loss value tends to 0, which indicates that training has finished at 20 epochs. Figure 6 shows the results of the 5-fold cross-validation. The accuracy rates of the training sets of the five models are all above 99%, and the average accuracy rate of the validation set is 99.1%. The validation accuracy varies slightly across the five models, but the range of variation is small, which shows that the trained crack network model has good robustness and stability.
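The 5-fold procedure can be sketched as an index split. This is a generic illustration of k-fold cross-validation, not the authors' code; the shuffling seed and the name `k_fold_indices` are assumptions.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Split range(n_samples) into k shuffled folds; yield (train, validation) index lists."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        validation = folds[i]
        # Every fold except the i-th one contributes to the training indices.
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, validation
```

Each of the five (train, validation) pairs would be used to train and validate one model, and the five validation accuracies are then averaged.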
Some images were selected from the data set randomly to test the effect of crack identification of the crack identification network, as shown in Figure 7.
Figure 7(b) shows clearly that the crack identification network based on deep learning can classify cracks and noncracks accurately. On the one hand, the first and second rows of Figure 7(b) are the images classified as cracks, while the third and fourth rows are the images classified as noncracks. On the other hand, the cracks are identified and located accurately, as shown by the blue boxes in Figure 7(b). It can be seen from Figure 7(a) that the resolution of the original images differs. For example, the resolution of the first image is significantly greater than that of the second. However, the cracks are identified accurately in both the first and second images of Figure 7(b), indicating that the method is not affected by the resolution. At the same time, there are other miscellanies around the cracks, such as the leaves shown in Figure 7(a). The blue boxes in the second and seventh pictures of Figure 7(b) show the cracks around the leaves clearly, which indicates that the method is not affected by the surrounding miscellanies. As shown in the first and seventh pictures of Figure 7(b), the cracks in the first picture of Figure 7(a) are significantly larger than those in the seventh picture. For the cross cracks, the fourth picture in Figure 7(b) gives three crossing blue-box identification results, indicating that the method can automatically decompose the identification result of a cross crack into identification results of straight cracks. The noncrack image in the blue dotted oval frame in Figure 7(a) contains large convex and concave concrete surfaces, and the solid blue oval frame shows the concrete surface with large texture. The identification result shown in Figure 7(b) is not affected by those, indicating that the method is not affected by the surface characteristics of the crack.
From the above analysis, it can be seen that the proposed crack identification network is not affected by the image resolution, the surrounding miscellanies, or the surface characteristics of the crack. Although the crack identification network accurately locates the identified crack images, the positioning results do not capture complete, continuous cracks and cannot eliminate the influence of background miscellanies on the cracks. If we only clipped the images and performed quantitative calculation on the above results, a large error would occur, so the cracks must be segmented more accurately to achieve quantitative identification.

Accurate Crack Segmentation
After the crack identification in the previous section, the initial crack image was obtained, but the crack in the crack image was not segmented accurately. For this, we achieve accurate segmentation of cracks based on an improved K-means clustering algorithm.

Improved K-Means Clustering Segmentation Algorithm.
Although the K-means clustering algorithm [17] segments quickly and efficiently, it places high requirements on the selection of the clustering centers and converges easily to a local optimal solution, thereby missing the global optimal solution. In view of this, we used dynamic particle swarm optimization (DPSO) to improve the K-means clustering algorithm.
The specific steps of the improved K-means algorithm are as follows:
Step 1. Convert the color image to a grayscale image to obtain the initial grayscale image of the crack, and then set the initial cluster number k.
Step 2. Determine the clustering center m according to the K-means algorithm.
Step 3. After the clustering center is determined, calculate the fitness value f_i of particle i [18], as given by equation (1). Here, x_i is the i-th data point in the data point set X and m_j is the j-th cluster center.
Step 4. Determine the particle inertia coefficient w and the learning factors s according to equations (2) and (3) [18]. Here, w_max is the maximum inertia coefficient; w_min is the minimum inertia coefficient; f_i is the current adaptation value of the particle; f_ave is the current average adaptation value of all particles; and f_min is the minimum adaptation value of all particles.
Here, s_1 and s_2 represent the particle's self-learning ability and its ability to learn from the excellent collective; s_{1,int} and s_{2,int} represent the particle's initial learning ability; s_{1,fin} and s_{2,fin} represent the particle's final learning ability; s_{1,int} > s_{2,int}; s_{1,fin} > s_{2,fin}; and t_max represents the maximum time for algorithm operation.
Step 5. Use the inertia coefficient w and learning factors s obtained in step 4 to update the particle velocity V_i and position X_i according to [18]. Here, r_1 and r_2 are two random numbers uniformly distributed on the interval [0, 1]; t is time; p_ij is the position of particle i during the update process; and p_gi is the best position that particle i experienced during the update process.
Step 6. When the maximum number of iterations is reached or the particle swarm fitness variance δ^2 [18] converges to a fixed value, the global optimal solution is output; otherwise, repeat step 2 to step 4.
Step 7. Take the optimal solution of step 5 as the optimal clustering center, and then obtain k clusters.
Step 8. Perform the criterion function judgment. The Davies-Bouldin index (DBI) [19] is adopted as the criterion function of the final clustering result. Here, Dis(i, j) represents the distance between the i-th cluster and the j-th cluster; S_i represents the sum of the standard error of the Euclidean distance between each data point in the i-th cluster and the center m_i of the cluster; and k is the total number of clusters in the data set.
Step 9. If DBI converges, the clustering result at this time is output as the final result; if it does not converge, repeat steps 2 to 8 until it converges, and then output the result. The overall flowchart of the improved K-means algorithm is shown in Figure 8.
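The steps above can be sketched for one-dimensional gray levels as follows. This is a minimal illustration under several assumptions, not the authors' implementation: the fitness is taken as the sum of squared distances to the nearest center, the inertia coefficient uses a common adaptive form, the learning factors vary linearly in time, the fitness variance is normalised in a common way, the DBI scatter is the mean absolute distance to the center, and all parameter values and function names are hypothetical. The outer loop of step 9 is omitted.

```python
import random


def fitness(centers, data):
    """Sum of squared distances from each point to its nearest center (lower is better)."""
    return sum(min((x - m) ** 2 for m in centers) for x in data)


def inertia(f_i, f_ave, f_min, w_min=0.4, w_max=0.9):
    """Adaptive inertia: particles better than average get a smaller coefficient."""
    if f_i <= f_ave and f_ave > f_min:
        return w_min + (w_max - w_min) * (f_i - f_min) / (f_ave - f_min)
    return w_max


def learning_factors(t, t_max, s1=(2.5, 1.5), s2=(0.5, 1.0)):
    """Linearly interpolate the (initial, final) learning factors over the run."""
    frac = t / t_max
    return s1[0] + (s1[1] - s1[0]) * frac, s2[0] + (s2[1] - s2[0]) * frac


def fitness_variance(fits):
    """Normalised variance of the swarm's fitness values."""
    f_ave = sum(fits) / len(fits)
    norm = max(1.0, max(abs(f - f_ave) for f in fits))
    return sum(((f - f_ave) / norm) ** 2 for f in fits) / len(fits)


def pso_centers(data, k, n_particles=20, t_max=100, tol=1e-6, seed=0):
    """Search for k one-dimensional cluster centers with an adaptive PSO."""
    rng = random.Random(seed)
    lo, hi = min(data), max(data)
    pos = [[rng.uniform(lo, hi) for _ in range(k)] for _ in range(n_particles)]
    vel = [[0.0] * k for _ in range(n_particles)]
    fits = [fitness(p, data) for p in pos]
    best_pos, best_fit = [p[:] for p in pos], fits[:]   # personal bests
    g = min(range(n_particles), key=best_fit.__getitem__)
    g_pos, g_fit = best_pos[g][:], best_fit[g]          # global best
    prev_var = None
    for t in range(t_max):
        f_ave, f_min = sum(fits) / len(fits), min(fits)
        s1, s2 = learning_factors(t, t_max)
        for i in range(n_particles):
            w = inertia(fits[i], f_ave, f_min)
            for d in range(k):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + s1 * r1 * (best_pos[i][d] - pos[i][d])
                             + s2 * r2 * (g_pos[d] - pos[i][d]))
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            fits[i] = fitness(pos[i], data)
            if fits[i] < best_fit[i]:
                best_pos[i], best_fit[i] = pos[i][:], fits[i]
                if fits[i] < g_fit:
                    g_pos, g_fit = pos[i][:], fits[i]
        var = fitness_variance(fits)
        if prev_var is not None and abs(var - prev_var) < tol:
            break  # variance has settled to a fixed value
        prev_var = var
    return sorted(g_pos)


def assign(data, centers):
    """Label each point with the index of its nearest center."""
    return [min(range(len(centers)), key=lambda j: (x - centers[j]) ** 2) for x in data]


def davies_bouldin(data, labels, centers):
    """Davies-Bouldin index for 1-D clusters (lower means better separated)."""
    k = len(centers)
    scatter = []
    for j in range(k):
        members = [x for x, lab in zip(data, labels) if lab == j]
        scatter.append(sum(abs(x - centers[j]) for x in members) / len(members)
                       if members else 0.0)
    return sum(max((scatter[i] + scatter[j]) / abs(centers[i] - centers[j])
                   for j in range(k) if j != i) for i in range(k)) / k
```

For an image, `data` would be the flattened gray levels; `pso_centers` supplies the centers, `assign` produces the clusters, and `davies_bouldin` gives the criterion value that step 8 monitors.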

Algorithm Performance Analysis.
We used the evaluation indexes of precision, recall, and F-measure to evaluate the effectiveness of the algorithm. The specific evaluation indexes [20] are defined as shown in Figure 9.
In Figure 9, P stands for precision rate; R stands for recall rate; F stands for F-measure value; A stands for the number of images that are crack images and correctly identified as crack; B stands for the number of images that are background images but identified as crack; C represents the number of images that are crack images but identified as background; and D represents the number of images that are background images and are correctly identified as background.
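The three indexes can be computed from the counts A, B, and C defined above. The sketch below assumes the standard definitions P = A/(A + B), R = A/(A + C), and F = 2PR/(P + R); Figure 9 itself is not reproduced in the text, and the function name `crack_metrics` is hypothetical.

```python
def crack_metrics(a, b, c):
    """Precision, recall, and F-measure from the counts A, B, and C."""
    precision = a / (a + b) if (a + b) else 0.0   # share of predicted cracks that are cracks
    recall = a / (a + c) if (a + c) else 0.0      # share of true cracks that were found
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f_measure
```

The count D (background images correctly identified as background) does not enter any of the three indexes under these definitions.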

Comparison with Traditional Algorithms.
Based on the crack data set in Section 2 and taking a few simple crack images as examples, we compare the segmentation effects of the improved K-means algorithm with those of the traditional improved Otsu [21], improved Canny [22], and improved median filtering [23] algorithms. Taking the presence and absence of rough surfaces and of protrusions as examples, the segmentation effects are compared in Figures 10 and 11, respectively.
It can be seen from Figure 10 that, for the first crack image with a rough surface, the improved Otsu algorithm, Canny algorithm, and median filter algorithm can all segment the crack but leave some noise, whereas the improved K-means algorithm leaves almost no noise and gives a clearer crack contour. For the second crack image, because the surface is smooth, all algorithms produce little noise, but the crack skeleton integrity and continuity of the improved K-means algorithm are higher than those of the traditional algorithms. It can be seen from the above comparison that the precise segmentation of the improved K-means algorithm has good noise resistance and accuracy. As can be seen from Figure 11, for the first image with raised cracks, although the improved Otsu algorithm, Canny algorithm, and median filter algorithm can all segment the cracks, the crack skeletons are all thick, and the bulge is also segmented out. In addition to leaving almost no noise, the improved K-means algorithm also restores the outline of the thin original crack skeleton more accurately. For the second crack image, because the surface has no bumps, the traditional algorithms produce only a small amount of noise, but the crack continuity is poor. The improved K-means algorithm produces no noise while preserving the integrity and continuity of the crack skeleton well. It can be seen from the above comparison that the precise segmentation of the improved K-means algorithm has good noise resistance and accuracy. Then, we combine the algorithms with the crack identification network and compare and analyze the overall identification effect. The evaluation indicators are shown in Figure 12.
Figure 8: Overall flowchart of the improved K-means algorithm.
As can be seen from Figure 12, on the data set of this article, all indicators of the improved Otsu algorithm and the improved Canny algorithm are below 80%; the improved median filter algorithm has a precision rate only slightly higher than 80%, and its other two indicators do not exceed 80%; the improved K-means algorithm has a recall rate and an F-measure value above 90% and an accuracy rate of 97%. The improved K-means algorithm therefore performs much better than the traditional algorithms above.

Comparison with Clustering Algorithms.
Based on the crack data set, the improved K-means segmentation algorithm is used to identify crack pictures of concrete surfaces with moss, fallen leaves, or water stains, and the identification results are compared with those of the K-means algorithm [24], the mean shift algorithm [25], and the fuzzy C-means algorithm, as shown in Figure 13.
It can be seen from Figure 13 that when identifying the first and second cracks with moss in Figure 13(a), the improved K-means algorithm removes almost all noise and preserves the integrity of the crack skeleton well. For the third crack image with fallen leaves in Figure 13(a) (where the fallen leaves are marked with a blue dotted box), the other three algorithms fail to eliminate the noise from the fallen leaves. The two cracks connected to the second small segment were clearly identified. For the fourth crack with fallen leaves and shadows in Figure 13(a) (where the fallen leaves are marked with a blue dotted box and the shadow is marked with a blue solid line), the K-means algorithm, mean shift algorithm, and fuzzy C-means algorithm can hardly identify the cracks because they are too fine, and none of them can remove the noise caused by the fallen leaves and shadows. However, the improved K-means algorithm identifies the entire fine cracks well while completely removing the noise from the fallen leaves and shadows. For the last picture with water stains in Figure 13(a) (where water stains are marked with solid blue boxes), the other three algorithms all produce a lot of noise, miss the upper left part of the crack, and cannot identify the lower rightmost part of the crack because of the water stains and the grayscale difference of the concrete surface, whereas the improved K-means algorithm identifies the overall crack clearly without noise, including the small arc in the middle of the crack. The above analysis shows that the crack identification method is affected neither by moss, shadow, water stains, and miscellanies nor by the size of the crack and the surface characteristics.
Even for such cracks under complex backgrounds, the identification method of this paper can still identify and segment the crack skeleton with high accuracy, free of noise and with the integrity and continuity of the crack skeleton preserved. The comparison of the evaluation indexes of the algorithms is shown in Figure 14.
It can be seen from Figure 14 that, on the data set, the recall and F-measure values of the improved K-means algorithm are both above 90%, and the accuracy rate is 97%, which is still much higher than those of the other three clustering algorithms. It can be inferred that the improved K-means algorithm performs better than the other clustering algorithms.

Quantitative Identification of Cracks
Deep learning and the improved K-means algorithm can extract the crack accurately but cannot determine its size. Therefore, this section calculates the physical size of the crack from the pixel size of the crack determined beforehand.

Crack Physical Size.
The physical size of the cracks, namely, the area, length, average width, and occupancy ratio, needs to be calculated. The calculation method is as follows:
(1) Crack Area. Count the number of pixels included in each connected domain and convert it according to the actual physical size corresponding to each pixel, as shown in formula (7), P_a = α · P_i. Thus, the area of each connected domain is obtained, and the areas of all connected domains are summed. Here, P_a is the actual physical size of the crack, in mm^2; P_i is the pixel size of the crack, in pixel^2; and α is the zoom ratio of the crack, in mm^2/pixel^2.
(2) Crack Length. The length is calculated on the image of the crack skeleton. The crack width after thinning is a single pixel; half of the perimeter then gives the number of pixels along the crack, and the physical length of the crack is obtained according to formula (7).
(3) Average Crack Width. The average crack width is the ratio of the crack area to the crack length, and its calculation formula is obtained by combining this ratio with the calculation method of the crack length. Here, S_i is the area of the i-th crack; C_i is the perimeter of the i-th crack; d_i is the average width of the i-th crack; and n is the total number of cracks.
(4) Crack Occupancy Ratio. It is the ratio of the area of the crack to the total area of the image.
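The four quantities can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' MATLAB code: connected domains are taken as 8-connected, the zoom ratio is represented by a linear scale `mm_per_px` so that the area factor is its square, and the names `connected_domains` and `crack_size` are hypothetical.

```python
from collections import deque


def connected_domains(mask):
    """Return the pixel counts of the 8-connected foreground regions in a binary mask."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    counts = []
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            seen[r][c] = True
            queue, count = deque([(r, c)]), 0
            while queue:  # breadth-first flood fill of one region
                y, x = queue.popleft()
                count += 1
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and mask[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            counts.append(count)
    return counts


def crack_size(area_px, skeleton_perimeter_px, image_px, mm_per_px):
    """Physical area, length, average width, and occupancy ratio of one crack."""
    area_mm2 = area_px * mm_per_px ** 2                 # area factor assumed to be scale squared
    length_mm = skeleton_perimeter_px / 2 * mm_per_px   # half the skeleton perimeter
    width_mm = area_mm2 / length_mm if length_mm else 0.0
    return {"area_mm2": area_mm2, "length_mm": length_mm,
            "width_mm": width_mm, "occupancy": area_px / image_px}
```

For example, a crack covering 100 pixels with a skeleton perimeter of 40 pixels in a 1000-pixel image at 0.5 mm per pixel gives an area of 25 mm^2, a length of 10 mm, an average width of 2.5 mm, and an occupancy ratio of 0.1.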

Crack Marking.
To facilitate the calculation of the crack size, the acquired crack image needs to be marked. Based on the data set and the output of the CIN network model, the crack pictures were marked using a morphological method. The software used is MATLAB 2016b, the processor is an Intel Core i5-8300H CPU @ 2.30 GHz, the memory is 8.00 GB, the graphics card is an NVIDIA GeForce GTX 1050 Ti, and the operating system is Windows 10.
In order to verify the identification performance better, the identification test was carried out by taking the crack shown in Figure 15 as an example. The crack was marked in sections to facilitate the size calculation; at the same time, the grayscale image was inverted to improve the effect of marking. The results are shown in Figure 15, from the preceding stage in Figure 15(b) to the segmentation of the identified crack in Figure 15(c) and the marking of the split cracks in Figure 15(d).
It can be seen from Figure 15 that the marking of the crack skeleton is very clear and unambiguous due to the precise identification and segmentation, and almost every large turning point is segmented to facilitate the subsequent size calculation.

Crack Size Calculation.
The method of this paper was used to identify, segment, and mark the cracks. Then, we chose two images as examples to perform the size calculation, as shown in Figure 16. The crack size calculation method can achieve quantitative identification of cracks. The shooting equipment used in this article was a CCD industrial camera (Basler acA1300-30gm), and the zoom ratio of the camera in image shooting is α = 0.21 mm/pixel.
It can be seen from Figure 16 that segmented marking is adopted because the width differs from segment to segment. Crack 1 in Figure 16(a) contains 8 segments, and crack 2 in Figure 16(b) contains 4 segments. Table 1 shows the pixel size and occupancy ratio of each segment of crack 1 and crack 2 as well as the overall pixel size. Combining Table 1 and the zoom ratio, the actual sizes of cracks 1 and 2 were obtained and compared with the measurement results of the crack measuring instrument. The results are shown in Table 2.
It can be seen from Tables 1 and 2 that the error between the size obtained by the identification method and the size obtained by the crack measuring instrument is not very large. In order to verify the accuracy more effectively, a part of the images was randomly extracted from the data set. These images were divided into 10 categories, and a representative image was selected from each category, as shown in Figure 17.
These 10 crack images were identified and quantified, and the results were compared with the average width, overall length, and area measured by the crack measuring instrument. The statistics and comparison results are shown in Table 3, and the accuracy chart is shown in Figure 18.
It can be seen from Table 3 and Figure 18 that, for the 10 representative crack images, the accuracy of the average width calculated by the quantitative identification method is not less than 97.3% compared with the average width measured by the crack measuring instrument, and the highest accuracy is 98.84%; the accuracy of the overall length is no less than 97.8%, and the highest accuracy is more than 98.6%; the accuracy of the area is no less than 97.9%, and the highest accuracy is 98.5%. From the above comparative analysis, we can see that the quantitative identification method has good stability and high accuracy. Compared with the instrument measurement, the error obtained is basically within the controllable range.
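The text does not define how accuracy against the instrument measurement is computed. One plausible reading, shown here only as an assumption, is one minus the relative error; the function name `measurement_accuracy` is hypothetical.

```python
def measurement_accuracy(calculated, measured):
    """Relative agreement, in percent, between a calculated and a measured size."""
    # Assumption: accuracy = (1 - |calculated - measured| / measured) * 100.
    return (1 - abs(calculated - measured) / measured) * 100
```

With this reading, a calculated width of 9.8 mm against a measured width of 10.0 mm (illustrative numbers, not taken from Table 3) corresponds to an accuracy of about 98%.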

Conclusion
This paper proposes a concrete surface crack identification and size calculation method combining a deep learning convolutional neural network, clustering segmentation, and morphological methods, which can effectively classify concrete crack and noncrack images and accurately segment and quantitatively identify cracks. The main conclusions are as follows:
(1) Using a deep learning convolutional neural network for training, a preliminary crack recognition model is obtained. After verification and testing, the identification accuracy rate reached more than 99%. At the same time, the crack identification model of the deep convolutional neural network is combined with clustering segmentation and morphological methods to propose a hybrid multialgorithm crack identification extraction method called CIN.
(2) Compared with traditional segmentation algorithms and similar clustering segmentation algorithms, the improved K-means algorithm can identify a variety of concrete surface cracks in complex backgrounds while ensuring high segmentation accuracy.
(3) Based on the improved K-means algorithm, combined with morphological methods, the identification of cracks, quantitative identification, and calculation of physical size are realized. Experiments show that the accuracy of quantitative identification is at least 97.3%, and the highest can reach 98.6%.
(4) The method provides a certain theoretical basis for ensuring the safety of personnel who inspect cracks on concrete surfaces, reducing workload, and maintaining and safety testing concrete structures. It also provides some research bases for the identification of similar cracks under higher precision and more complex conditions.

Data Availability
All the data generated or analyzed during this study are included within this article.

Conflicts of Interest
The authors declare that there are no conflicts of interest regarding the publication of this paper.