Deep-Learning-Based Bughole Detection for Concrete Surface Image

Bugholes are surface imperfections that appear as small pits and craters on concrete surfaces after the casting process. Traditional measurement relies on in situ manual inspection, which is time-consuming and difficult. This paper proposes a deep-learning-based method to detect bugholes in concrete surface images. A deep convolutional neural network (DCNN) for detecting bugholes on concrete surfaces was developed by adding inception modules to the traditional convolutional network structure, to cope with the relatively small input image size (28 × 28 pixels) and the limited number of labeled examples in the training set (fewer than 10 K). The effects of noise such as illumination, shadows, and combinations of several different surface imperfections in real-world environments were considered. In image tests, the proposed DCNN showed excellent bughole detection performance, with a recognition accuracy of 96.43%. A comparative study with the Laplacian of Gaussian (LoG) algorithm and the Otsu method showed that the proposed DCNN is robust and avoids interference from cracks, color-differences, and nonuniform illumination on the concrete surface.


Introduction
Bugholes are surface imperfections that appear as small pits and craters on concrete surface after the casting process [1].
These imperfections appear as regular or irregular pits with diameters ranging from a few millimeters to 15 mm and are usually scattered randomly over the concrete surface. Bugholes were recognized as a major problem in the construction industry very early on [2,3]. On the one hand, building owners and architects are becoming stricter about the quality of concrete surfaces. Surfaces are required to be flat and free of bugholes to leave an aesthetically pleasing impression. On the other hand, even though bugholes are primarily an aesthetic issue for exposed concrete structures and do not affect the structural strength of concrete, they do reduce the adhesion of fiber-reinforced plastic (FRP) material applied to the concrete surface [4]. Additionally, research has indicated that salt accumulated in bugholes causes premature degradation of reinforced concrete (RC) structures [5]. Moreover, surface bugholes are generally considered a nuisance for subsequent processes, since they need to be filled before a paint coating is applied to the concrete surface, and this additional surface preparation is labor-intensive and costly.
Owing to their influence on the quality of concrete surfaces, methods for detecting bugholes were established very early on. In the 1970s, construction professionals attempted to assess concrete surface quality by manually counting bugholes, measuring their diameters, and then calculating the percentage of holed area on the surface, which was considered time-consuming and impractical [2,3]. Therefore, an improved method of classifying concrete surface quality was proposed by Thomson [6], who suggested using bughole photos with different degrees of coverage as reference samples to compare with the actual concrete surface. The current bughole rating method [7,8], developed by the American Concrete Institute (ACI), is based on the idea introduced by Thomson: the concrete surface to be assessed is compared with a set of standard surface photographs representing seven scales, and the expert performing the comparison determines which scale the inspected surface belongs to. While the method is simple in principle, some argue that comparison with photographs of reference samples can be problematic because of the variability between different printed scales of the reference samples and the subjectivity of the human eye [9,10]. In addition, one surface may have several types of bugholes combined, or a combination of several different surface imperfections, so that use of the reference becomes rather difficult and subjective. Moreover, given the large volume of concrete construction in practice, manual inspection is time-consuming, labor-intensive, and costly.
To obtain a better evaluation of the concrete surface, more objective and intelligent methods need to be developed. Digital image processing technology is considered a powerful automated tool that can provide objective results quickly [11,12], and it has been successfully applied to bridge coating quality assessment in recent years. Lee et al. developed an automated processor that can recognize the presence of bridge coating rust defects [13]. To solve the problem of nonuniform illumination, Chen et al. proposed the adaptive ellipse approach (AEA), the box-and-ellipse-based adaptive-network-based fuzzy inference system (BE-ANFIS), and the support-vector-machine-based rust assessment approach (SVMRA) [14-16]. To adapt to various background colors and overcome the effects of background noise and nonuniform illumination, Shen et al. proposed a rust defect recognition method based on color and texture features, which combines the Fourier transform and color image processing [17]. Son et al. employed the J48 decision tree algorithm to determine the rusted surface area rapidly and accurately [18]. To improve the detection accuracy of rusted areas on steel bridges, Liao et al. proposed a digital image recognition algorithm consisting of the K-means method and the double-center-double-radius (DCDR) algorithm [19]. Shen et al. proposed an artificial-neural-network-based rust intensity recognition approach (ANNRI) [20].
In addition to detecting rust, researchers have applied a variety of image processing techniques to visual images to detect cracks, including edge detection methods [21-24], morphological operations [25-27], digital image correlation [28,29], and image binarization [30,31]. A comparative study of the fast Haar transform (FHT), the fast Fourier transform, the Sobel edge detector, and the Canny edge detector showed that FHT has the best performance [22]. Lim et al. used the Laplacian of Gaussian (LoG) edge detector to detect surface cracks in concrete bridge decks and obtained global crack maps through camera calibration and robotic localization [23]. Talab et al. used multiple filters, such as Sobel and area filters, to turn small areas into background and used the Otsu method to detect major cracks [24]. Some researchers have verified that morphological operations are effective for crack detection [25,26,32]. Rimkus et al. used the digital image correlation (DIC) technique to detect and locate cracks in concrete surfaces [28]. Kim et al. and Li et al. suggested using image binarization methods to extract crack information from digital images [30,31].
Current research has focused on using image processing technology to detect surface cracks and rust, with relatively few studies on surface bugholes. Zhu and Brilakis proposed an image processing method to detect bugholes (referred to as air pockets in their paper) on concrete surfaces [33], but this method did not consider the influence of nonuniform illumination. Ozkul and Ismail developed a bughole measuring device for rating the quality of a concrete surface [1]. Silva et al. developed an expert system that uses image analysis methods to classify the surface quality of self-consolidating concrete for precast members by calculating the percentage area, diameter, and distribution of the bugholes [34]. Hirano et al. used a thresholding method to detect bugholes in images, but with this method it is difficult to detect small bugholes [35]. Liu et al. established a method to detect surface bugholes via image analysis and published evaluation parameters, adopting Otsu image threshold segmentation to extract the characteristics of bugholes on the concrete surface [36]. Isamu et al. developed an image analysis method using color images to quantify the bugholes distributed on concrete surfaces [37].
The application of digital image processing technology (IPT) has promoted the advancement of quality inspection methods for structural surfaces. However, the direct use of image processing technology for surface defect inspection has several disadvantages. First, the algorithms are tailored to particular images in the studied datasets, which affects their performance on new datasets [38]. Second, because of noise such as illumination, shadows, and combinations of several different surface imperfections, the detection results of image processing algorithms may be inaccurate [39-41]. Finally, image processing algorithms are often designed to aid the inspector in defect detection and still rely on human judgement for the final result [25]. One possible solution is to use deep-learning algorithms to analyze the inspection images [42]. In recent years, deep-learning algorithms have shown remarkable performance in image object recognition [43-46], and deep convolutional neural networks (DCNN) have attracted wide attention as an effective recognition method [47]. Several researchers have applied this method to detect concrete surface cracks. Zhang et al. conducted a comparative study on the image classification of pavement cracks using three methods: a deep convolutional network, a support vector machine, and integrated learning. The results showed that the deep convolutional network detected cracks better than the other two methods [48]. Cha et al. used a deep-learning-based convolutional neural network to detect cracks in concrete images [49]. Wang et al. proposed a convolutional neural network (CNN) for recognizing cracks on asphalt surfaces at subdivided image cells; the proposed CNN achieved accuracies of 96.32% and 94.29% on training data and testing data, respectively [50].

Advances in Civil Engineering
In this paper, a deep convolutional neural network (DCNN) has been used to detect bugholes on concrete surfaces, and the effects of noise such as illumination, shadows, and combinations of several different surface imperfections in real-world environments were considered.
The performance of the proposed method was compared with that of traditional image processing methods.

Main Concrete Surface Defect Classification
There are many different types of quality defects on concrete surfaces. The most common surface defects are cracks, bugholes, and color-differences. Figures 1-3 show the image characteristics of these three types of defects.
Cracks, bugholes, and color-differences each have their own features. Cracks are generally irregularly line-shaped, as shown in Figure 1. Owing to the different reflectivities of light, a bughole is normally darker than the rest of the concrete surface. The greater the depth, the darker the color. As shown in Figure 2(a), typical bugholes are nearly circular. However, irregularly shaped bugholes also exist, as shown in Figure 2(b). Additionally, because some bugholes are shallow, their contrast with the normal concrete surface is not obvious, as seen in Figure 2(c). Color-difference is a color that deviates from the color of a normal or desired concrete surface. It has neither a particular shape nor a clear border, as shown in Figure 3.
Previous studies ignored the situation in which multiple defects exist on a concrete surface at the same time. However, this situation often occurs during the concrete casting stage. The concrete surface bughole images were therefore classified according to the combination of surface defects, as shown in Table 1.

Proposed Method for Concrete Surface Bughole Detection
Figure 4 shows the general flow of the proposed method, with training steps (solid lines) and testing steps (dashed lines). The DCNN was trained for a total of 30 epochs, and the model was evaluated on the validation set after every epoch. Once the DCNN classifier was well trained, the testing images were scanned by the validated classifier to generate a report of bugholes.
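The scanning step can be pictured as a sliding window. The sketch below is only illustrative: it assumes the 28 × 28 classifier input and the 256 × 256 test images reported in this paper, while the stride of 28 pixels and the `classify` callback are hypothetical, since the scanning stride is not stated here.

```python
def window_origins(image_size, window=28, stride=28):
    """Top-left (row, col) of every window that fits inside a square image."""
    last = image_size - window
    return [(r, c)
            for r in range(0, last + 1, stride)
            for c in range(0, last + 1, stride)]


def scan(image_size, classify, window=28, stride=28):
    """Return the origins of the windows that the classifier flags as bughole."""
    return [origin
            for origin in window_origins(image_size, window, stride)
            if classify(origin)]


origins = window_origins(256)
# Dummy stand-in for the trained DCNN: it flags only the window at the origin.
hits = scan(256, classify=lambda origin: origin == (0, 0))
print(len(origins), hits)
```

A smaller stride produces overlapping windows, which costs more classifier calls but reduces the chance that a bughole is split across window borders.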

Database Establishment.
To train the CNN classifier, raw images of concrete surfaces with multiple types of defects (including cracks, bugholes, and color-differences) were taken at several completed construction sites with a mobile phone camera. The shooting distance ranged from 1.0 to 5.0 m, and the bugholes had to be visible to the naked eye in the images. A total of 116 raw images were collected, each with a resolution of 3,120 × 4,160 pixels. Of these, 80 were used for training and validation, and 36 were cropped into 800 small images (256 × 256 pixels) for testing. Because bugholes are small, a single bughole occupies between 18 × 18 and 60 × 60 pixels. The 80 raw images were cropped into smaller images (28 × 28 pixels), which were manually annotated as bughole (i.e., positive sample) or not bughole (i.e., negative sample) to generate a database, as shown in Figure 5.
The total number of prepared images in the database was 4 K. With a training-to-validation ratio of 4 : 1 [49], the training set contained 3.2 K images and the validation set 0.8 K images.
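The 4 : 1 split can be reproduced with a few lines of code. This is a minimal sketch under assumptions: the function name `split_dataset`, the random shuffling, and the fixed seed are illustrative choices, and the integer IDs merely stand in for the 4 K annotated patches.

```python
import random


def split_dataset(samples, train_parts=4, val_parts=1, seed=0):
    """Shuffle the samples and split them train_parts : val_parts."""
    items = list(samples)
    random.Random(seed).shuffle(items)  # seeded so the split is repeatable
    n_train = len(items) * train_parts // (train_parts + val_parts)
    return items[:n_train], items[n_train:]


patch_ids = range(4000)  # placeholder IDs for the 28 x 28 patches
train_ids, val_ids = split_dataset(patch_ids)
print(len(train_ids), len(val_ids))
```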

Overall Architecture.
This section describes the overall architecture of the DCNN used in this study, including the parameter selection for each layer. Figure 6 presents the DCNN architecture, which was the original configuration for concrete bughole detection.
The first layer was the input layer of 28 × 28 × 3 pixel resolution, where the dimensions indicate height, width, and channel (i.e., red, green, and blue), respectively. Because of the relatively small input image size (28 × 28 pixels) and the limited number of labeled examples in the training set (fewer than 10 K), the network used the inception module architecture proposed by Szegedy et al. [51]. The inception architecture approximates an optimal local sparse structure in a convolutional vision network, which allows efficient dense computation to be used instead of inefficient numerical calculation directly on nonuniform sparse data structures. Adding inception modules to the traditional convolutional network structure reduces the number of network parameters, which makes the model more robust to overfitting, especially when a small dataset is used. In addition, a batch normalization layer was inserted before the activation function in all layers, which made the network easier to train and improved the generalization ability of the final model [52]. Table 2 shows the convolution configuration and inception architecture.
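The parameter saving can be made concrete with a small count. The sketch below compares a direct 5 × 5 convolution with an inception-style branch that first reduces the channel depth with a 1 × 1 convolution. The channel counts (192, 16, 32) are illustrative assumptions and are not taken from Table 2.

```python
def conv_params(kernel, in_channels, out_channels):
    """Weights plus biases of one convolution layer with square kernels."""
    return kernel * kernel * in_channels * out_channels + out_channels


# Direct 5 x 5 convolution from 192 input channels to 32 output channels.
direct = conv_params(5, 192, 32)

# Inception-style branch: 1 x 1 reduction to 16 channels, then 5 x 5 to 32.
reduced = conv_params(1, 192, 16) + conv_params(5, 16, 32)

print(direct, reduced)
```

With these assumed sizes the reduced branch needs roughly a tenth of the parameters of the direct convolution, which is the effect the text attributes to the inception modules.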
The convolution layer has proved highly effective in extracting different features from images. Unlike the fully connected layer in a traditional neural network, each neuron of a convolution layer is connected only to a local region of the input volume. The spatial extent of this connectivity is a hyperparameter called the receptive field of the neuron (equivalently, the filter size). The extent of the connectivity along the depth axis is always equal to the depth of the input volume. Meanwhile, the filter shares its weights across the input image. In the first few layers, with local receptive fields, convolution layers can extract elementary visual features such as oriented edges, end-points, and corners. These features are then combined by the subsequent layers to detect higher-order features. For example, suppose that the input volume has a size of 28 × 28 × 3 (e.g., an input image with three channels of red, green, and blue). If the receptive field (or filter size) is 5 × 5, then each neuron in the convolution layer has weights to a 5 × 5 × 3 region of the input volume, for a total of 5 × 5 × 3 = 75 weights (plus 1 bias parameter). With a stride of 1 and no padding, the output size is (28 − 5)/1 + 1 = 24, so after the convolution layer the feature map has a size of 24 × 24.
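The arithmetic in this example can be checked with a short helper. The function names are illustrative, and the `padding` argument is an addition that the worked example leaves at zero.

```python
def conv_output_size(input_size, filter_size, stride=1, padding=0):
    """Spatial output size of a convolution along one dimension."""
    span = input_size + 2 * padding - filter_size
    if span < 0 or span % stride != 0:
        raise ValueError("filter does not tile the input with this stride")
    return span // stride + 1


def weights_per_neuron(filter_size, depth):
    """Number of weights (excluding the bias) seen by one neuron."""
    return filter_size * filter_size * depth


print(conv_output_size(28, 5))   # 24
print(weights_per_neuron(5, 3))  # 75
```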
In general, pooling layers are periodically inserted between the convolution layers of a CNN architecture. They reduce the number of parameters and the computation resources required for data storage, which also helps avoid overfitting to a certain extent. The pooling units can perform different functions, such as max pooling, average pooling, or even L2-norm pooling, of which max pooling is the most commonly used [53]. Max pooling divides the input image into several rectangular areas and outputs the maximum value of each subarea. Figure 7 shows an example of max pooling with a stride of 2, where the pooling layer output size is calculated by the equation in the figure.
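A minimal sketch of max pooling on a nested list follows. The 4 × 4 input and the 2 × 2 window are made-up values for illustration and are not the example drawn in Figure 7.

```python
def max_pool2d(image, size=2, stride=2):
    """Max pooling over a 2-D list of numbers, without padding."""
    rows = (len(image) - size) // stride + 1
    cols = (len(image[0]) - size) // stride + 1
    return [[max(image[r * stride + i][c * stride + j]
                 for i in range(size) for j in range(size))
             for c in range(cols)]
            for r in range(rows)]


feature_map = [[1, 3, 2, 4],
               [5, 6, 7, 8],
               [3, 2, 1, 0],
               [1, 2, 3, 4]]
pooled = max_pool2d(feature_map)
print(pooled)  # [[6, 8], [3, 4]]
```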
The ReLU layer applies the nonsaturating activation function $f(x) = \max(0, x)$ to perform a threshold operation on each element of the input. It increases the nonlinearity of the decision function and of the entire network without affecting the receptive fields of the convolution layer. The sigmoid function $f(x) = (1 + e^{-x})^{-1}$ and the saturating hyperbolic tangent $f(x) = \tanh(x)$ are also often used to increase nonlinearity. Figure 8 depicts several examples of nonlinear functions. Compared with the other functions, ReLU is more popular because it speeds up neural network training without significantly affecting the generalization accuracy of the model [45].
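The three activation functions named in this paragraph can be written directly from their definitions. This is a scalar sketch for illustration; a real network applies them elementwise to tensors.

```python
import math


def relu(x):
    """Nonsaturating rectifier: passes positive inputs, zeroes the rest."""
    return max(0.0, x)


def sigmoid(x):
    """Logistic function, saturating towards 0 and 1."""
    return 1.0 / (1.0 + math.exp(-x))


def tanh(x):
    """Hyperbolic tangent, saturating towards -1 and 1."""
    return math.tanh(x)


for value in (-2.0, 0.0, 2.0):
    print(value, relu(value), sigmoid(value), tanh(value))
```

For large positive inputs the sigmoid and hyperbolic tangent flatten out while ReLU keeps growing linearly, which is the saturating versus nonsaturating distinction made in the text.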
The loss layer determines how the training process penalizes the difference between the predicted and actual results of the network and is usually the last layer of the network. The softmax function is often used as the output layer of a CNN architecture to classify input data. It is given by Equation (1), expressed as the probability $p(y^{(i)} = n \mid x^{(i)}; W)$ that the $i$th input $x^{(i)}$ belongs to class $n$ given the weights $W$, where $W_n^{T} x^{(i)}$ are the inputs of the softmax layer. Summed over all classes, the right-hand side for the $i$th input always returns 1, because the function normalizes the distribution. In other words, the following equation returns the probabilities of each input's individual classes:

$$p\left(y^{(i)} = n \mid x^{(i)}; W\right) = \frac{e^{W_n^{T} x^{(i)}}}{\sum_{j} e^{W_j^{T} x^{(i)}}} \tag{1}$$
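The softmax normalization described above can be sketched as follows. The scores stand for the layer inputs $W_n^{T} x^{(i)}$, and subtracting the maximum score is a common numerical-stability step that is an addition here, not something stated in the paper.

```python
import math


def softmax(scores):
    """Turn a list of class scores into probabilities that sum to 1."""
    shift = max(scores)  # stabilizes exp() without changing the result
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Two classes, as in the bughole / not-bughole setting (scores are made up).
probs = softmax([2.0, 0.0])
print(probs, sum(probs))
```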
The network is trained using a stochastic gradient descent (SGD) algorithm with a minibatch size of 100 out of 4,000 images. SGD, using backpropagation, is considered the most efficient and simplest way to minimize deviations [54,55]. A disadvantage of SGD is that its update direction depends entirely on the current batch, so its updates are very unstable. A number of improvements have been proposed, including the momentum method, in which SGD remembers the update $\Delta w$ at each iteration and determines the next update as a linear combination of the gradient and the previous update [56]:

$$\Delta w := \alpha \Delta w - \eta \nabla Q_i(w), \qquad w := w + \Delta w,$$

which leads to

$$w := w - \eta \nabla Q_i(w) + \alpha \Delta w,$$

where the parameter $w$, which minimizes $Q_i(w)$, is to be estimated, $\eta$ is the step size (learning rate), and $\alpha$ is the momentum coefficient.
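A scalar sketch of the momentum update may help: the next step is a linear combination of the previous update and the current gradient. The quadratic objective, the learning rate of 0.1, and the step count are illustrative assumptions; only the momentum value of 0.9 matches the setting reported in this paper.

```python
def momentum_step(w, delta_w, grad, lr, momentum=0.9):
    """One SGD-with-momentum update; returns the new weight and update."""
    new_delta = momentum * delta_w - lr * grad
    return w + new_delta, new_delta


def grad_q(w):
    """Gradient of the toy objective Q(w) = (w - 3)^2."""
    return 2.0 * (w - 3.0)


w, delta_w = 0.0, 0.0
for _ in range(300):
    w, delta_w = momentum_step(w, delta_w, grad_q(w), lr=0.1)
print(w)  # close to the minimizer w = 3
```

Because the previous update is carried over, a single noisy gradient changes the direction of travel less than it would in plain SGD.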
As small and decreasing learning rates are recommended [57], the exponentially decaying learning rate depicted in Figure 9 was used in this study. The x-axis represents epochs, so the learning rate is updated once per epoch. As shown in Figure 9, the error tends to converge after 30 iterations. The weight decay and momentum parameters were set to 0.0001 and 0.9, respectively.
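An exponentially decaying schedule that is updated once per epoch can be sketched as below. The initial rate of 0.01 and the decay factor of 0.9 are hypothetical values chosen for illustration, because the values behind Figure 9 are not given in the text.

```python
def exponential_decay(initial_lr, decay_rate, epoch):
    """Learning rate used during the given epoch (epochs start at 0)."""
    return initial_lr * decay_rate ** epoch


# Hypothetical settings over the 30 training epochs used in this study.
schedule = [exponential_decay(0.01, 0.9, epoch) for epoch in range(30)]
print(schedule[0], schedule[-1])
```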

Testing and Discussion
To evaluate the bughole detection performance of the trained DCNN, 36 images not used in training were cropped into 800 small images (256 × 256 pixels) for testing. These images were taken at several construction sites. The test results for some images are shown in Figure 10. The yellow boxes mark the detected bugholes, and the number inside each box is the label number of the bughole within that image.
From the test results above, the proposed DCNN showed excellent performance in detecting bugholes on concrete surfaces. When multiple types of defects exist simultaneously, including cracks and color-differences, the trained DCNN still accurately detects the bugholes without interference from the other defects, as shown in Figures 10(b) and 10(c). Furthermore, under nonuniform illumination, the trained DCNN can exclude the noise of strong light or shade and recognize the bugholes with high accuracy, as shown in Figure 10(d).
The image detection experiment shows that errors mainly occur at the edge of the image and in color-difference areas, as shown in Figure 11. The identification error in Figure 11(a) is basically caused by improper image cropping, and the identification error in Figure 11(b) seems to arise because the shape of a region of stronger color-difference resembles a bughole.
In image testing of the trained DCNN, the recognition accuracy reached 96.43%, as shown in Table 3. Note that if pixels that are actually nonbughole are erroneously detected as bughole, this is regarded as a false positive; conversely, if pixels that are actually bughole are incorrectly classified as nonbughole, this is regarded as a false negative.
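The relation between these error types and the reported accuracy can be sketched as follows, assuming the usual definition of accuracy as the share of correct decisions. The counts in the example are invented for illustration and are not the values in Table 3.

```python
def accuracy(tp, tn, fp, fn):
    """Fraction of correct decisions among all decisions."""
    return (tp + tn) / (tp + tn + fp + fn)


def error_rates(tp, tn, fp, fn):
    """False-positive and false-negative rates from the same counts."""
    return fp / (fp + tn), fn / (fn + tp)


# Made-up counts: 50 true positives, 40 true negatives, 6 false positives,
# and 4 false negatives.
print(accuracy(50, 40, 6, 4))     # 0.9
print(error_rates(50, 40, 6, 4))
```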

Comparative Study
A comparative study was conducted to compare the performance of the proposed DCNN-based bughole detection method with that of traditional edge detection methods. According to related studies, the Otsu method identifies bugholes on concrete surfaces more accurately than the global threshold method, the nonmaximum suppression edge detection method, and the Canny edge detection method. Talab et al. [24] and Liu et al. [36] used the Otsu method to detect cracks and bugholes, respectively, and achieved good detection results. Lim et al. [23] used the Laplacian of Gaussian (LoG) algorithm to detect cracks, and the results showed that the LoG algorithm successfully detected the cracks in the image. Therefore, the LoG algorithm and the Otsu method were chosen for the comparative study.
The bughole detection results of the different methods are shown in Figure 12. For all four types of images, the method presented in this paper gave good recognition results and accurately identified the regions and the number of bugholes in the images. Moreover, the trained DCNN avoided the interference of cracks, color-differences, and uneven illumination on the concrete surface. When only bugholes exist on the concrete surface, the LoG method and the Otsu method identify bugholes well, as shown in Figure 12(a). However, when multiple types of defects exist on the concrete surface, as shown in Figures 12(b) and 12(c), the detection performance of both the LoG method and the Otsu method tends to be poor, and neither method obtains the expected bughole detection results because of the noise introduced by color-differences. As shown in Figure 12(d), when the illumination of the concrete surface was uneven, the Otsu method could not detect the bugholes in the shadows; the LoG method performed well in detecting bugholes in general, but strong light drastically lowered its detection performance.
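To make the Otsu baseline concrete, the sketch below picks the global gray-level threshold that maximizes the between-class variance. It is a generic illustration of the technique on made-up 8-bit pixel values, not the implementation used in this comparison.

```python
def otsu_threshold(pixels, levels=256):
    """Gray level t that best separates pixels <= t from pixels > t."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))
    w_bg, sum_bg = 0, 0
    best_t, best_var = 0, -1.0
    for t in range(levels):
        w_bg += hist[t]
        sum_bg += t * hist[t]
        if w_bg == 0:
            continue  # no pixels in the dark class yet
        w_fg = total - w_bg
        if w_fg == 0:
            break  # every pixel is already in the dark class
        mean_bg = sum_bg / w_bg
        mean_fg = (sum_all - sum_bg) / w_fg
        between_var = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if between_var > best_var:
            best_t, best_var = t, between_var
    return best_t


# Made-up patch: 30 dark bughole pixels on a bright concrete background.
patch = [20] * 30 + [200] * 70
t = otsu_threshold(patch)
dark = sum(1 for p in patch if p <= t)
print(t, dark)
```

Because a single threshold is applied to the whole image, a shadowed region or a dark color-difference area can fall on the same side of the threshold as the bugholes, which is consistent with the failure cases reported in Figure 12.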

Conclusion
This paper proposed a deep-learning-based method to detect bugholes in concrete surface images. The concrete images required for training, validation, and testing were taken at several construction sites with a mobile phone camera. The total number of raw images was 116. Of these, 80 images were cropped into 4,000 smaller images of 28 × 28 pixels to build the database for training and validation, and the other 36 images, cropped into 800 small images (256 × 256 pixels), were used for testing. Because of the relatively small input image size (28 × 28 pixels) and the limited number of labeled examples in the training set (fewer than 10 K), the network used the inception module architecture. In the image tests, the proposed DCNN showed excellent bughole detection performance, with a recognition accuracy of 96.43%, which extends the application of deep-learning-based methods to the detection of concrete surface defects. The effects of noise such as illumination, shadows, and combinations of several different surface imperfections in real-world environments were considered. The comparative study with the Laplacian of Gaussian (LoG) algorithm and the Otsu method shows that the proposed DCNN is robust and can avoid the interference of cracks, color-differences, and nonuniform illumination on the concrete surface, whereas both the Otsu method and the LoG method had difficulty detecting bugholes effectively because of the interference of color-difference and nonuniform illumination.
However, a common limitation of deep-learning-based methods is that the network training requires a large amount of labeled training data, so it is necessary to add more images to extend the database.In the next stage, the major task is to establish a quality evaluation system of concrete surface based on computer vision.

Figure 6: Overall architecture of the proposed DCNN.

Figure 11: Results of image testing with (a) false-negative error and (b) false-positive error.

Table 1: Image categories of concrete surface bughole.

Table 2: Dimensions of layers and operations.

Table 3: Accuracy of the proposed DCNN on testing images.