BgCut: Automatic Ship Detection from UAV Images

Ship detection in static UAV aerial images is a fundamental challenge in sea target detection and precise positioning. In this paper, an improved universal background model based on the Grabcut algorithm is proposed to segment foreground objects from the sea automatically. First, a sea template library including images taken in different natural conditions is built to provide an initial template to the model. Then the background trimap is obtained by combining template matching with a region growing algorithm. The output trimap initializes the Grabcut background in place of manual intervention, and the segmentation proceeds without iteration. The effectiveness of the proposed model is demonstrated by extensive experiments on real UAV aerial images of a certain area, captured by an airborne Canon 5D Mark. The proposed algorithm is not only adaptive but also achieves good segmentation. Furthermore, the model can be applied to the automated processing of industrial images in related research.


Introduction
With the application of unmanned aerial vehicles (UAVs) in the supervision of ships, forestry, and natural resources [1], their use in positioning vessels and managing fishery activities has been confirmed [2]. Depending on the task, a UAV can carry different devices such as synthetic aperture radar (SAR) systems and high-resolution optical aerial cameras. Compared with SAR images, UAV aerial images have several advantages, including high resolution and definition and measurability [3]. Because a large number of images is acquired during one flight, classical manual or semisupervised image processing is no longer suitable for some conditions and surroundings. Therefore, automatic object identification is crucial to industrial image processing. Since automated processing often loses recognition accuracy, object recognition and localization with high accuracy has become a hot issue for aerial images. For these reasons, we regard the problem as the separation of sea foreground and background. Because the sea background varies greatly with natural conditions such as illumination, weather, and wind speed, building a generic background model for high-accuracy automatic target recognition is of great significance.
Considering the problems mentioned above, we propose an improved universal background model based on the Grabcut algorithm to separate and identify ship candidate regions automatically and quickly. Meanwhile, pyramid multiresolution processing can be applied to the lowest-level images as advanced processing. The background model not only guarantees the recognition rate for surface extraction but also makes up for the shortcoming that the Grabcut algorithm cannot obtain the background automatically during image segmentation.
This paper is arranged as follows. Section 2 gives a brief introduction to ship identification methods and the development of the Grabcut algorithm. Section 3 presents the details of our proposed algorithm. The experiments and results are discussed and analyzed in Section 4.

The Scientific World Journal

research on SAR images and inverse synthetic aperture radar (ISAR) images to identify the ship target.
Ship identification methods are mainly divided into three categories: coarse-segmentation-based feature extraction, the statistical-model-based constant false alarm rate (CFAR) algorithm, and direct image segmentation. First, the coarse-segmentation-based feature extraction methods use either a single feature or multiple features. Single-feature methods are common, such as building a composite distribution to simulate the characteristics of various types of sea surface [4] and applying texture descriptors to extract features and build a statistical matrix [5]. Multifeature methods combine several features, for example, the aspect ratio and the number of target pixels [6]. These recognition methods are limited by the completeness of the feature library and by feature selection, and some even rely on a priori parameters or a previous segmentation.
CFAR based on the statistical model is the most widely used algorithm in SAR image target detection, and some fast algorithms based on it improve its unsatisfactory calculation speed [7], such as the Gama-CFAR [8] and the weighted Parzen window clustering algorithm [9]. Even though the fast algorithms improve the processing speed, their prescreening results are not rejoined, and they cannot ensure high detection accuracy [7].
The direct image segmentation methods consist of traditional approaches and approaches that combine specific theories and models. Traditional ones are mainly region based and edge based; model-based methods include triplet Markov fields [10], the layer set method [11], and the snake model [12]; theory-based methods include the fuzzy c-means clustering algorithm [13] and fuzzy logic [14]; graph-based methods include graph cut [15,16], Grabcut [17], and normalized cut [18,19]. Nevertheless, model-based methods are usually suitable only for single-object segmentation; edge-based approaches adapt poorly to noise; region-based mechanisms rely on interaction or a priori conditions. Methods based on graph theory require human interaction; however, their high accuracy meets the processing demands of aerial images.
Because of this advantage, the graph cut algorithm is widely used in many fields, such as myocardium image segmentation [20], accurate segmentation of mural images [21], segmentation of buildings in optical satellite images [22], and SAR image change detection [23]. SAR image target segmentation [24] uses a combination of the Grabcut algorithm and a neighborhood growing algorithm. However, because ships in UAV aerial images have different sizes and variable directions, most of these methods may miss some small fishing boats that appear as spots, which affects the final recognition rate.
Currently, Grabcut-based image segmentation methods mainly make improvements to the energy function and to the minimization of the objective function. Global energy minimization methods include simulated annealing, dynamic programming, and graph theory; local energy minimization methods include variational and ICM methods [25]. Because the processing is sensitive to initialization, it is sometimes essential to construct a good initial model. For example, it is impossible to offer an optimal trimap manually [26]. To solve the initialization problem, the adaptive shape prior method provides an a priori trimap to Graphcut [27]; however, it is tailored to specific shapes, so this kind of model lacks generality and the ability to detect vessels of various shapes.
In response to this situation, we propose a universal background model to initialize and optimize the Grabcut algorithm. The improved algorithm adapts to many kinds of sea surfaces and is robust to different shapes and sizes of ships. The novelty of our approach lies first in this universal background model. We apply a sea template library to enhance generality by randomly selecting pixel blocks of the sea as feature images and automatically acquiring the newly generated template. Secondly, a template matching algorithm is designed to find a sea seed point from which an optimal background area can be grown and a background trimap generated. Finally, we use the trimap to optimize and improve Grabcut and to segment the foreground objects.

Universal Background Model with Grabcut
In the present paper, the proposed algorithm is divided into two main parts: building a background model and improving Grabcut segmentation. The background model consists of three steps. Firstly, we use sum of squared differences (SSD) [28] template matching based on the selected sea template to acquire a seed point. Secondly, we apply the neighborhood growing algorithm from the seed point to grow the sea surface background and obtain a background mask image. Finally, we transform the mask image into a background trimap. Since the trimap initializes the Grabcut background model, a single pass can obtain more accurate results.

Automatic Selection of Sea Template
3.1.1. Template Library Building. The sea template library includes samples from different sea areas, seasons, weather, and natural conditions. The size of each sample is 26 * 16, and the number of samples is 1000. Some of the template samples are shown in Figure 1, including samples collected on cloudy, foggy, rainy, and sunny days and at night.

Automatic Template Selection.
We assume that the number of background pixels is much larger than the number of foreground pixels. The parameter $N$ is the number of feature samples.
$N$ blocks of pixels with size of 10 * 5 are randomly selected from the image to be segmented. Each pixel block is called a feature image sample. For each sample $s$, the characteristics are calculated as the mean gray value $G(s)$, the mean brightness $V(s)$, and the mean hue $H(s)$. Equation (1) calculates the overall characteristic value as follows:

$$F(s) = \lambda_1 G(s) + \lambda_2 V(s) + (1 - \lambda_1 - \lambda_2) H(s), \quad (1)$$

where $F(s)$ is the characteristic value of the feature sample, and $\lambda_1$ and $\lambda_2$ are positive weights with $\lambda_1, \lambda_2 \in (0, 1)$ and $\lambda_1 + \lambda_2 < 1$.

Figure 1: Part of sea template library. The first two rows represent the cloudy condition, the next two rows represent the foggy condition, the fifth and the sixth rows represent the night condition, the seventh and the eighth rows represent the rainy condition, and the last two rows represent the sunny condition.

After that, the obtained characteristic value is matched with all of the templates, which is defined as

$$D_{ij} = \left| F(s_i) - F(p_j) \right|, \quad (2)$$

where $i \in (1, N)$ and $j \in (1, M)$. We assign $N = 20$ and $M = 1000$ in this paper. The feature sample set $S = (s_1, \ldots, s_i, \ldots, s_N)$ has $N$ samples, and $s_i$ is the $i$th feature sample. The template library $P = (p_1, \ldots, p_j, \ldots, p_M)$ has $M$ samples, and $p_j$ is the $j$th template. Evidently, $D_{ij}$ represents the degree of difference between a feature sample and a template sample.
Then, a function $C(j)$ is defined to count the feature samples $s_i$ for which $D_{ij}$ is the smallest among all templates, and the voting method shown in (3) is adopted to find the most similar template as

$$j^{*} = \arg\max_{j \in (1, M)} C(j). \quad (3)$$

The task is to make sure that a suitable template is selected to seek the seed point and to initialize the background model.
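The template selection described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the characteristic value is taken to be a weighted sum of mean gray value, brightness, and hue in which the third weight is whatever remains after the two given weights, each feature sample votes for the template with the closest characteristic value, and the weights 0.4 and 0.3 and all numbers are hypothetical.

```python
from collections import Counter

def characteristic(gray, brightness, hue, w_gray=0.4, w_brightness=0.3):
    """Weighted characteristic value of one pixel block.

    w_gray and w_brightness are hypothetical weights in (0, 1) whose sum is
    below 1; the remaining weight is given to the hue (an assumption).
    """
    assert 0 < w_gray < 1 and 0 < w_brightness < 1 and w_gray + w_brightness < 1
    return w_gray * gray + w_brightness * brightness + (1 - w_gray - w_brightness) * hue

def select_template(sample_values, template_values):
    """Return the index of the template that collects the most votes.

    Each feature sample votes for the template whose characteristic value
    is closest to its own.
    """
    votes = Counter()
    for f in sample_values:
        nearest = min(range(len(template_values)),
                      key=lambda j: abs(f - template_values[j]))
        votes[nearest] += 1
    return votes.most_common(1)[0][0]

# Four feature samples and three templates as (gray, brightness, hue) means.
samples = [characteristic(g, v, h)
           for g, v, h in [(100, 110, 200), (102, 108, 198), (30, 40, 210), (101, 111, 202)]]
templates = [characteristic(g, v, h)
             for g, v, h in [(20, 30, 215), (100, 110, 200), (180, 190, 190)]]
best = select_template(samples, templates)
print("selected template index:", best)
```

Because most blocks are assumed to be sea, a single sample that happens to fall on a ship (the third one here) is outvoted by the others.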

Automatic Trimap by Background Model.
According to the analysis mentioned above, the trimap is specified as $T = \{T_B, T_U, T_F\}$, in which the background regions $T_B$ and the foreground regions $T_F$ are marked for the Grabcut algorithm [17] and $T_U$ is the unknown region. To get more reasonable results, the unknown regions in the trimap should be as small as possible. The reason is that the color and mark information of the adjacent pixels is used to estimate whether the pixels of the unknown region are background or foreground. If there are too many unknown regions or the regions are too far away from the marked area, the accuracy of the sampling-based assessment is greatly reduced and erroneous results may even be obtained. Intuitively, some regions are then not divided or processed.
This section describes how the trimap is modeled automatically, which is divided into three steps: finding a good seed point by the SSD template matching algorithm, growing an optimal background mask from the seed point, and generating the background trimap.

3.2.1. SSD Template Matching Algorithm. Sum of squared differences (SSD) [28] is a template matching method defined as (4). The best matching value is 0, and the larger the value is, the worse the match will be. Consider

$$R(x, y) = \sum_{x', y'} \left( T(x', y') - I(x + x', y + y') \right)^2, \quad (4)$$

where, in this equation, $T$ is the sea template image and $I$ is the input target image; $x' \in (0, w - 1)$ and $y' \in (0, h - 1)$; $w$ is the width of $T$ and $h$ is the height of $T$. According to (4), the position $(x, y)$ with the minimum SSD is selected as the seed point.
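As a minimal sketch of SSD matching, the following pure-Python function slides a template over a small grayscale image and returns the position with the smallest sum of squared differences. The function name `ssd_match` and the toy data are illustrative only; images are assumed to be lists of rows of numbers.

```python
def ssd_match(image, template):
    """Return ((x, y), score) for the top-left position with minimum SSD."""
    img_h, img_w = len(image), len(image[0])
    h, w = len(template), len(template[0])
    best_pos, best_score = None, float("inf")
    for y in range(img_h - h + 1):
        for x in range(img_w - w + 1):
            # Sum of squared differences between the template and the
            # image window whose top-left corner is (x, y).
            score = sum(
                (template[dy][dx] - image[y + dy][x + dx]) ** 2
                for dy in range(h)
                for dx in range(w)
            )
            if score < best_score:
                best_pos, best_score = (x, y), score
    return best_pos, best_score

image = [
    [9, 9, 9, 9],
    [9, 1, 2, 9],
    [9, 3, 4, 9],
]
template = [[1, 2], [3, 4]]
pos, score = ssd_match(image, template)
print(pos, score)  # (1, 1) 0
```

In practice the same quantity is available in OpenCV through `cv2.matchTemplate` with the `cv2.TM_SQDIFF` method, which is far faster than this loop on full-size aerial images.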

Optimal Background Growth.
The seed point obtained in the previous section is passed to the neighborhood growing algorithm as the starting point of growth. The optimal background should cover more than 90% of the total image background. The region growing process is described in Algorithm 1.
In the optimal algorithm, the prior threshold value is set to 0.07, which was obtained from a large number of tests on sea images. The threshold value depends on the resolution of the target images.
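The growth step can be illustrated with a small pure-Python sketch. It is a simplified reading of the procedure, not the paper's implementation: intensities are assumed to be normalized to [0, 1], the four-neighborhood is used, and at each step the candidate closest to the current region mean is added until its distance exceeds the threshold (0.07 as in the text). The function name `grow_region` and the toy image are hypothetical.

```python
def grow_region(image, seed, threshold=0.07):
    """Grow a region from seed (x, y); return a 0/1 mask of the same shape."""
    height, width = len(image), len(image[0])
    x, y = seed
    mask = [[0] * width for _ in range(height)]
    mask[y][x] = 1
    total, count = image[y][x], 1
    candidates = set()  # pixels adjacent to the region but not yet in it
    while True:
        # Register the four neighbours of the most recently added pixel.
        for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nx, ny = x + dx, y + dy
            if 0 <= nx < width and 0 <= ny < height and not mask[ny][nx]:
                candidates.add((nx, ny))
        if not candidates:
            break
        mean = total / count
        # Take the candidate whose intensity is closest to the region mean.
        x, y = min(candidates, key=lambda p: abs(image[p[1]][p[0]] - mean))
        if abs(image[y][x] - mean) > threshold:
            break  # even the best candidate is too different: stop growing
        candidates.remove((x, y))
        mask[y][x] = 1
        total += image[y][x]
        count += 1
    return mask

# Sea intensities around 0.5 with a bright two-pixel "ship" in the middle.
image = [
    [0.50, 0.52, 0.51, 0.50],
    [0.51, 0.90, 0.92, 0.52],
    [0.50, 0.51, 0.50, 0.51],
]
mask = grow_region(image, seed=(0, 0))
for row in mask:
    print(row)
```

The resulting mask marks the sea pixels with 1 and leaves the two bright pixels at 0, which is the background mask that the next step turns into a trimap.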

Trimap via Background Mask.
The background region obtained from the growing algorithm is scattered and punctate; it is not yet the target trimap image. This mask has three channels of RGB color, and we map it to a single-channel gray-scale image, which is the resulting trimap. The mask image is $B = (b_1, \ldots, b_i, \ldots, b_n)$, and the map function converts each mask pixel $b_i$ into a trimap pixel $t_i$, where $b_i(k)$ presents the value of the $k$th dimension of the pixel $b_i$ and $t_i$ is the corresponding pixel of the trimap. Compared to the trimap described as $T = \{T_B, T_U, T_F\}$ [17], we redefine the trimap as $T = \{T_B, T_{PF}\}$, where $T_B$ still represents the background and $T_{PF}$ represents the possible foreground. In this paper, we assume that PF = 3, and it is a further refinement of $T_U$, which means a greater possibility of the pixel belonging to the foreground. The initial values of $T_B$ and $T_F$ in the trimap are fixed, while $T_U$ is allocated by the Gaussian mixture models (GMM).
There are two types of connections in the GMM model, N-Links and T-Links [29]. N-Links connect neighboring pixels and describe the cost of a division boundary between adjacent pixels. T-Links connect each pixel to the background and foreground nodes and describe the possibility of an unknown pixel belonging to the foreground or background. T-Link weights can be updated by the GMM [29]. Therefore, PF is applied, and its initial value is reallocated after the GMM recalculation.
This model obtains the sea background mask automatically, and the mask image is the optimal result. If high-precision region segmentation is wanted, the model should be combined with the Grabcut algorithm in a further processing step. Grabcut connects the scattered background mask and achieves highly precise extraction of the foreground target area.
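A minimal sketch of the mask-to-trimap mapping follows. The encoding of the mask is an assumption made for illustration: a pixel is treated as grown background when any of its three channels is non-zero. The label values 0 (background) and 3 (possible foreground) follow the text; the function name `mask_to_trimap` is hypothetical.

```python
BACKGROUND = 0           # label for background pixels
POSSIBLE_FOREGROUND = 3  # "PF = 3" in the text

def mask_to_trimap(mask_rgb):
    """Map a grid of (r, g, b) mask pixels to a single-channel trimap.

    Assumption: a pixel reached by region growing has at least one
    non-zero channel; every other pixel becomes possible foreground.
    """
    return [
        [BACKGROUND if any(pixel) else POSSIBLE_FOREGROUND for pixel in row]
        for row in mask_rgb
    ]

mask_rgb = [
    [(255, 255, 255), (255, 255, 255), (0, 0, 0)],
    [(255, 255, 255), (0, 0, 0), (0, 0, 0)],
]
trimap = mask_to_trimap(mask_rgb)
print(trimap)  # [[0, 0, 3], [0, 3, 3]]
```

These two values coincide with OpenCV's `cv2.GC_BGD` (0) and `cv2.GC_PR_FGD` (3), so a trimap of this form can be passed as the mask to `cv2.grabCut` in `cv2.GC_INIT_WITH_MASK` mode.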

Grabcut Algorithm Based on Background Model.
To achieve automatic foreground segmentation, the key issue is how to combine the background model with the Grabcut algorithm. The original Grabcut algorithm achieves highly precise segmentation of foreground from background; however, it needs human intervention and interaction. For industrial image processing, it is necessary to improve the original algorithm by initializing the background model automatically. Therefore, the new and improved processing flow of Grabcut is shown in the algorithm of BgCut.
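(i) Establish the initial segmentation from the trimap. Assign pixels in $T_B$ to TrimBackground, which is the set of background points; assign pixels in $T_{PF}$ to TrimForeground, which is the set of foreground points.
(ii) Initialize $\alpha_n = 0$ for $n \in T_B$ and $\alpha_n = 3$ for $n \in T_{PF}$, where $\alpha_n$ is the opacity value of the pixel $n$, which is used to denote the initial trimap $T$.
(iii) Use the initial segmentation to obtain the GMM parameters.
N-Links between pixel $m$ and pixel $n$ are shown as follows:

$$N(m, n) = \frac{\mathrm{Alpha}}{\mathrm{dist}(m, n)} \exp\left(-\beta \left\| z_m - z_n \right\|^2\right),$$

where Alpha is a prior parameter, and its value is usually set to be 50. dist() is the Euclidean distance between pixel $m$ and pixel $n$, $z_m$ and $z_n$ are the color values [29], and $\beta = \left(2 \left\langle \left\| z_m - z_n \right\|^2 \right\rangle\right)^{-1}$ as in [17].
T-Links for pixel $n$, similar to N-Links, are shown as follows:

$$D(n) = -\log \sum_{i=1}^{K} \pi(\alpha_n, i) \frac{1}{\sqrt{\det \Sigma(\alpha_n, i)}} \exp\left(-\frac{1}{2} \left[z_n - \mu(\alpha_n, i)\right]^{T} \Sigma(\alpha_n, i)^{-1} \left[z_n - \mu(\alpha_n, i)\right]\right),$$

where $\pi()$ is the mixture weighting coefficient and the mean $\mu$ and covariance $\Sigma$ are those of the GMM components for the background and foreground distributions. It is a restatement of the T-Links in reference [17]. Consider

$$D(\alpha_n, k_n, \theta, z_n) = -\log \pi(\alpha_n, k_n) + \frac{1}{2} \log \det \Sigma(\alpha_n, k_n) + \frac{1}{2} \left[z_n - \mu(\alpha_n, k_n)\right]^{T} \Sigma(\alpha_n, k_n)^{-1} \left[z_n - \mu(\alpha_n, k_n)\right],$$

where $k = \{k_1, \ldots, k_n, \ldots\}$ with $k_n \in \{1, \ldots, K\}$ is a vector and $K$ is the number of GMM background and foreground components. $\theta$ is the parameter of the GMM, which is learned from the data $z$.
(iv) Build new GMMs after the iteration and calculation.
Calculate the minimum value of the Gibbs energy function $E(\alpha, k, \theta, z)$ and then obtain the min cut:

$$E(\alpha, k, \theta, z) = U(\alpha, k, \theta, z) + V(\alpha, z),$$

where $U(\alpha, k, \theta, z) = \sum_n D(\alpha_n, k_n, \theta, z_n)$, which is the data term of the color GMM and depends on the GMM components $k$, and $V$ is the smoothness term formed by the N-Links.
(v) Apply border matting.

The improved Grabcut algorithm based on the background model obtains the background trimap automatically and meets the demand for automatic segmentation of big-data images. In this paper, the Gibbs energy is applied as the energy function, and the maximum flow is used to compute the minimum cut, which guarantees the accuracy of the segmentation. Above all, the proposed model improves the initialization process and the quality of the background trimap, and it achieves more accurate identification results for sea targets.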
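To make the role of the N-Links concrete, the sketch below computes a neighborhood weight that is scaled by the prior parameter 50, divided by the Euclidean distance between the two pixels, and attenuated by their color difference. The exponential form and the estimate of the contrast parameter `beta` follow the standard GrabCut formulation; that this matches the paper's implementation exactly is an assumption, and the function names and colors are illustrative.

```python
import math

def estimate_beta(neighbour_pairs):
    """beta = 1 / (2 * mean squared colour difference) over neighbour pairs."""
    squared = [sum((a - b) ** 2 for a, b in zip(zm, zn))
               for zm, zn in neighbour_pairs]
    mean_sq = sum(squared) / len(squared)
    return 1.0 / (2.0 * mean_sq) if mean_sq > 0 else 0.0

def n_link(pos_m, pos_n, z_m, z_n, beta, gamma=50.0):
    """Weight of the edge between two neighbouring pixels.

    gamma plays the role of the prior parameter set to 50 in the text.
    """
    dist = math.hypot(pos_m[0] - pos_n[0], pos_m[1] - pos_n[1])
    colour_sq = sum((a - b) ** 2 for a, b in zip(z_m, z_n))
    return gamma / dist * math.exp(-beta * colour_sq)

# Two similar sea pixels and one bright ship pixel (hypothetical RGB values).
sea_a, sea_b, ship = (40, 90, 120), (42, 92, 121), (200, 200, 200)
beta = estimate_beta([(sea_a, sea_b), (sea_b, ship)])
w_sea = n_link((0, 0), (1, 0), sea_a, sea_b, beta)
w_edge = n_link((1, 0), (2, 0), sea_b, ship, beta)
print(f"sea-sea weight: {w_sea:.2f}, sea-ship weight: {w_edge:.2f}")
```

The weight between the two sea pixels stays close to 50, while the weight across the sea-ship boundary is much lower, so the minimum cut prefers to pass along the ship outline.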

Experiment
Experiments are conducted to evaluate the proposed algorithm on real UAV aerial images for the purpose of automatic foreground target segmentation. All the experimental images were collected from a certain area of the East China Sea in 2012. A Canon 5D Mark is used as the camera on the UAV with a flying height of 800 m, and the resolution of the image is 0.1 m. All the images have been preprocessed, for example, by defogging and illumination equalization. The big-data images are then stored in kmz format, compressed by a pyramid hierarchy mechanism. The experiment is performed on the 8th layer, which is the bottom layer of images after decompression, so as to verify the improved BgCut model in this paper.

Algorithm 1: Region growing algorithm.
Input: Coordinates $(x, y)$ of the seed point.
Step: Create the adjacency list of the seed point.
For: Add new pixels into the segmented region; recalculate the mean value of the distance between the latest added pixels and the whole region; if the mean distance value is less than the given threshold, go to Step i; else, go to Output.
Step i: Take the seed point as the center, and traverse the four pixels around it. If the coordinate of an adjacent pixel is in the segmentation region, while the pixel still does not belong to the segmented part, it will be added to the background region.
Step ii: Add the pixel whose intensity is closest to the average value to the region and mark it with a label. After that, remove it from the adjacency list.
Output: Background mask image.
The modeling and calculation are run on a 2 GHz CPU with 2 GB of memory. The seed point is calculated with the OpenCV library, which can be downloaded from http://sourceforge.net/projects/opencvlibrary/files/.

Results and Conclusions.
The proposed BgCut algorithm is a graph-based modeling and segmentation approach that carries out the whole task automatically. The experimental process is as follows. Firstly, we input the test images to the background model, and a template is selected from the library; secondly, a trimap is obtained from the output of the model; finally, we apply the trimap to initialize the Grabcut background model, and the target region is obtained after the segmentation. Figure 2 shows the most similar template selected from the initial template library. Figure 3 shows the UAV aerial image and the results of the proposed algorithm on this real-world image. Owing to the high resolution of the image, the shapes of boats and mariculture zones can be seen clearly. In the result image, 29 candidate regions are detected, of which 18 are boat candidates (18 boats), 5 are sea debris regions, and 6 are mariculture zones. The small bright spots in the image are surface debris, which has no effect on the detection of boats. The distance threshold value of the template matching algorithm is set to 0.07, and only one iteration of Grabcut is used. The false alarm rate is 17.24%.
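The reported false alarm rate can be checked with a short calculation. The reading used here is an assumption, since the text does not state the formula: only the sea debris regions are counted as false alarms, and the rate is taken over all candidate regions.

```python
def false_alarm_rate(false_alarms, total_candidates):
    """Fraction of candidate regions that are false alarms."""
    return false_alarms / total_candidates

# Candidate regions reported for the test image.
counts = {"boats": 18, "sea debris": 5, "mariculture zones": 6}
total = sum(counts.values())
rate = false_alarm_rate(counts["sea debris"], total)
print(f"{total} candidates, false alarm rate = {rate:.2%}")
# prints: 29 candidates, false alarm rate = 17.24%
```

Under this reading, 5 of 29 regions gives 17.24%, which agrees with the reported figure; counting the mariculture zones as false alarms as well would give a different value.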
In contrast, the original Grabcut is applied to the same image, and the manual rectangle and the results are shown in Figure 4. The calculation converges by the end of the 7th iteration, and the results of the 7th and the 15th iterations are almost the same. For Figure 4, the false alarm rate cannot be calculated, because no ship is segmented from the background by the Grabcut method.
Since the templates in the library cover different conditions, we can get better segmentation results for different sea images. Furthermore, one template is not limited to one kind of image condition. The following experiments are carried out on different scenarios. For each scenario, both the Grabcut [29] method and our improved method BgCut are applied to the target image. The Grabcut result shown is that of the final iteration.

Scenario: Ships with Different Sizes.
There are many kinds of ships on the ocean, such as large vessels and fishing boats. If the ship sizes vary widely, this has a great influence on the segmentation. Some smaller ships are often identified as background because they look similar to it. The comparison results are shown in Figure 6.
The results show that our algorithm can identify almost all the ships at the cost of some edge segmentation accuracy.

Scenario: Image Blur and Low Resolution.
Because of external disturbances and the jitter of the UAV during data collection, the images are easily blurred. Blurred edges have a great influence on the classification and identification of unknown pixels. The results of Grabcut and BgCut are shown in Figure 7. In this scenario, the Grabcut algorithm may give smoother edge segmentation if the ships are not near the edge of the image. Our BgCut algorithm may produce some noise on the sea surface.

Scenario: Reef Edge and Shore Removed Incompletely.
Reef edges or incompletely removed shore can also appear in the target image. This interference also has a great impact on the segmentation. The results of Grabcut and BgCut are shown in Figure 8. Our algorithm performs better on this kind of image, and the results show that the reef edge and shore do not interfere with the segmentation.

Scenario: Ships Concentration Areas.
Some ships are much closer to the mariculture zone than to the reef area. Depending on the specific conditions, ships tend to gather in designated areas. Where many ships gather, accurately detecting the number and the location of ships is difficult for fishery management. Figure 9 shows the ships identified by Grabcut and BgCut. Grabcut performs well on ships in gathering areas, while our BgCut algorithm has a higher accuracy rate.

BgCut Detection Results of Ships and Mariculture Zone.
Finally, we give the BgCut detection results for ships and mariculture zones. The candidate regions consist almost entirely of ships, which provides a good foundation for practical ship detection. All that remains is to apply a shape and texture feature library or other information suitable for detecting ships. As shown in Figure 10, we use a specific threshold to distinguish and filter the shape feature. At last, the ships and mariculture zones are circled by our BgCut algorithm: the fishing boats in green circles and the mariculture zones in red circles.

Conclusion
This paper proposes an effective and efficient approach to automatic ship detection from industrial UAV images. A universal background model based on the Grabcut algorithm is introduced to initialize Grabcut and obtain better segmentation results. The proposed algorithm does not need manual interaction during the whole process. It obtains accurate ship candidate regions and turns big-data image processing into an automatic flow. The results of segmentation are precise and the recognition is accurate. Experimental results for aerial images obtained from the East China Sea indicate that our improved BgCut model performs well and meets the needs of practical industrial conditions. Future work will focus on more in-depth learning to increase the accuracy of ship detection and on making the region growing parameter more adaptive.