A new method for detecting rooftops in satellite images is presented. The proposed method is based on a combination of machine learning techniques, namely,
Automatic rooftop detection from satellite/aerial images is an important task in a variety of applications. Interesting examples include change detection in urban monitoring, the production of digital maps, land use analysis, the verification and updating of GIS databases, and route planning [
However, detecting rooftops from aerial or satellite images can be very challenging. One reason is that the images used often differ in terms of lighting conditions, quality, and resolution. Another reason is that buildings may have diverse and complicated shapes and structures and as such can be easily confused with similar objects such as cars, roads, and courtyards. The result of these complications is that there are currently no algorithms or features that are universally applicable, that is, which can be used to detect roofs in all or even a majority of aerial and satellite images.
Much of the earlier work on rooftop detection has depended on computer vision and image processing techniques such as edge detection, corner detection, and image segmentation. One widely used approach is to first generate rooftop candidates using image segmentation techniques and then to identify the true rooftops within this set of candidates, where the latter process is performed using discriminative features such as intensity, shape, and area. Ren et al. and Nosrati and Saeedi [
Many modern approaches have used machine learning to perform rooftop detection. In [
Other studies have used both spectral and spatial features for the classification task (e.g., [
Based on our review of the literature, a new rooftop detection system which is novel in a number of key respects was developed. The proposed method has the following key characteristics. It uses only It based on both It utilizes classification results obtained using the SVM are subjected to a
There do not seem to be any existing studies which combine all four characteristics above, and we believe that, presented together, these represent a significant improvement over existing methods for performing rooftop detection. The original motivation for this study was to assess the rooftop area available in Abu Dhabi for the deployment of photovoltaics; accordingly, images from this area are used to evaluate the method.
The proposed rooftop detection system consists of the following three main steps.
The overall approach is shown in Figure
Diagram describing the overall model.
The goal of image segmentation is to create a set of candidate regions (segments), each of which will later be classified as rooftop or nonrooftop. To divide an image into segments we use
To improve the quality of the extracted segments bilateral filtering [
The original image (a). The result after applying bilateral filtering on the original image (b). The result of the
An important consideration is the choice of an appropriate value of
The result of applying the
The original image (a). The result of the 4-connected flood-fill algorithm (b). The result of the 8-connected flood-fill algorithm (c).
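The difference between 4-connected and 8-connected flood fill can be illustrated with a minimal sketch. It assumes the input is a small grid of per-pixel cluster labels (a hypothetical input format) and uses a breadth-first fill; the function name `label_regions` and the toy grid are illustrative only and are not taken from the implementation used in this work.

```python
from collections import deque

# Neighbour offsets (row, col) for the two connectivity types.
NEIGHBOURS_4 = [(-1, 0), (1, 0), (0, -1), (0, 1)]
NEIGHBOURS_8 = NEIGHBOURS_4 + [(-1, -1), (-1, 1), (1, -1), (1, 1)]


def label_regions(grid, connectivity=4):
    """Label connected regions of equal value using breadth-first flood fill.

    grid: list of equal-length lists of cluster indices (assumed input format).
    Returns (labels, region_count); labels has the same shape as grid.
    """
    if connectivity not in (4, 8):
        raise ValueError("connectivity must be 4 or 8")
    offsets = NEIGHBOURS_4 if connectivity == 4 else NEIGHBOURS_8
    rows, cols = len(grid), len(grid[0])
    labels = [[-1] * cols for _ in range(rows)]
    region_count = 0
    for r in range(rows):
        for c in range(cols):
            if labels[r][c] != -1:
                continue  # pixel already belongs to a region
            labels[r][c] = region_count
            queue = deque([(r, c)])
            while queue:
                cr, cc = queue.popleft()
                for dr, dc in offsets:
                    nr, nc = cr + dr, cc + dc
                    if (0 <= nr < rows and 0 <= nc < cols
                            and labels[nr][nc] == -1
                            and grid[nr][nc] == grid[cr][cc]):
                        labels[nr][nc] = region_count
                        queue.append((nr, nc))
            region_count += 1
    return labels, region_count


# Toy grid: a diagonal of 1s on a background of 0s.
toy = [
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 1],
]
_, n4 = label_regions(toy, connectivity=4)
_, n8 = label_regions(toy, connectivity=8)
print(n4, n8)
```

On this toy grid the diagonal pixels form three separate regions under 4-connectivity but a single region under 8-connectivity, which is why the choice of connectivity changes how many candidate segments are produced.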
After dividing the training images into candidate regions (segments) as described in the previous section, the dataset was constructed, in which each row represents one of the segments. Eight features were extracted to describe each segment (this is discussed in more detail in the next section). Each row is manually labeled as “1” (if it corresponds to a rooftop) or “0” (if not).
Features are numerical attributes that characterize the object to be classified. The extracted features are therefore those whose properties help to distinguish rooftops from nonrooftops in an image [
Each feature was normalized by subtracting the mean of the feature and dividing it by the standard deviation.
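This normalization can be sketched as follows. The sketch assumes the dataset is a list of rows with one numeric value per feature, and it uses the population standard deviation; the text does not say whether the population or sample estimate was used, so that choice, along with the function name `standardize_columns` and the example values, is an assumption.

```python
import statistics


def standardize_columns(rows):
    """Return rows with each feature column shifted to zero mean and unit std."""
    columns = list(zip(*rows))
    means = [statistics.fmean(col) for col in columns]
    # Population standard deviation is an assumption (see lead-in).
    stds = [statistics.pstdev(col) for col in columns]
    return [
        # A constant column (std == 0) is mapped to 0.0 to avoid division by zero.
        [(v - m) / s if s > 0 else 0.0 for v, m, s in zip(row, means, stds)]
        for row in rows
    ]


# Hypothetical segments described by two features each.
features = [[120.0, 0.5], [80.0, 1.5], [100.0, 1.0]]
normalized = standardize_columns(features)
for row in normalized:
    print([round(v, 3) for v in row])
```

When a classifier is trained on normalized features, the means and standard deviations computed on the training segments are normally reused for the test segments, so that both sets are scaled identically.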
The image showing rooftop and nonrooftop objects together with the features.
The support vector machine (SVM) is a machine learning technique which finds the decision boundary (or “hyperplane”) that optimally separates the data points of one class from those of the other class, where a “hyperplane” is optimal if it maximizes the margin of separation between the two classes. Like most kernel methods, the performance of an SVM is heavily dependent on the choice of kernel function. Because of its good classification performance on our data, we used the Gaussian radial basis function kernel:
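In one common parametrization, the Gaussian radial basis function kernel is written K(x, x') = exp(-gamma * ||x - x'||^2) with gamma > 0. The sketch below evaluates this form for two feature vectors; the parameter name `gamma` and its default value are assumptions and may not match the parametrization used in this work.

```python
import math


def rbf_kernel(x, y, gamma=1.0):
    """Gaussian RBF kernel exp(-gamma * ||x - y||^2) for two equal-length vectors."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


# Identical vectors give the maximum similarity of 1.0;
# similarity decays towards 0 as the vectors move apart.
print(rbf_kernel([0.2, -1.0], [0.2, -1.0]))
print(rbf_kernel([0.0, 0.0], [1.0, 0.0], gamma=1.0))
```

A larger `gamma` makes the kernel more local, so each training segment influences a smaller neighbourhood of the feature space.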
As already mentioned, it is likely that the SVM will not be able to detect all the rooftop regions in an image. To help address this problem, color information from the detected rooftops was subsequently used to find the “missing” rooftops.
The main idea is to use information from the regions which the SVM classified as rooftops to recover rooftop segments that it missed. This is based on the observation that rooftops within a single image tend to have the same pixel intensities. Hence, the intensity information of the segments classified as rooftops by the SVM is used to effect a “second pass” of classification. An example is shown in Figure
The original image (a). The result after the SVM (b). The distribution of intensities of the detected rooftop pixels (c). The distribution of intensities of the nondetected rooftop pixels (d).
Two histograms were used: one for the intensities of pixels which were classified as rooftops and another for pixels which were classified as nonrooftops. Each histogram consisted of 10 bins, which represented a reasonable balance between computational requirements, good results, and adequate coverage of each bin (in terms of pixels). The two histograms are shown in Figures
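A 10-bin intensity histogram of this kind can be built as in the sketch below. It assumes 8-bit grayscale intensities in the range 0 to 255 and equal-width bins; both are assumptions, and the function name `intensity_histogram` and the pixel values are illustrative only.

```python
def intensity_histogram(intensities, n_bins=10, max_value=255):
    """Count pixel intensities in n_bins equal-width bins over [0, max_value]."""
    counts = [0] * n_bins
    bin_width = (max_value + 1) / n_bins
    for value in intensities:
        # Clamp so that max_value always falls in the last bin.
        index = min(int(value / bin_width), n_bins - 1)
        counts[index] += 1
    return counts


# Hypothetical intensities of pixels labeled rooftop and nonrooftop by the SVM.
rooftop_pixels = [200, 210, 205, 40, 230]
nonrooftop_pixels = [10, 35, 90, 120, 215, 60]

rooftop_hist = intensity_histogram(rooftop_pixels)
nonrooftop_hist = intensity_histogram(nonrooftop_pixels)
print(rooftop_hist)
print(nonrooftop_hist)
```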
From the first distribution (shown in Figure
Considering only 2 bins also avoids adding too much noise to the model, since considering too many bins can significantly increase both true positive and false positive rates. An example is shown in Figure
The original image (a). The result when considering 2 bins (b). The result when considering 3 bins (c).
Another issue related to the histogram method is having different objects (roads, cars, and so on), which are of the same color as the rooftops (as an example see Figure
The image with roads of the same intensity as the rooftops (a). The image without any rooftop (b).
In order to avoid the situations discussed above a thresholding scheme was applied. The scheme adopted is based on the fact that the aim of the histogram method is to complement SVM classification; if the number of nondetected pixels in a bin is significantly higher than that of detected pixels in the same bin, there would be little sense in using that bin. For example, from Figure
The results will be discussed in greater detail in the next section, but briefly our observation was that the “histogram” method performed very well for one of our datasets, where it resulted in a big increase in performance. Unfortunately for the second dataset this method did not perform as well; however, even in this case it still produced a slight improvement in performance. Suggested reasons for this will be presented later on in the paper.
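The bin selection and thresholding scheme described above can be sketched roughly as follows. Several details are assumptions rather than facts from the text: that the 2 bins considered are the most populated bins of the detected-rooftop histogram, that "significantly higher" is expressed as a ratio (the hypothetical parameter `max_ratio`), and that the second pass relabels whole segments according to the bin of their mean intensity.

```python
def select_bins(detected_counts, nondetected_counts, top_k=2, max_ratio=2.0):
    """Pick up to top_k intensity bins to use in the second pass.

    A candidate bin is skipped when the nondetected pixel count exceeds
    max_ratio times the detected count (max_ratio is a hypothetical threshold).
    """
    ranked = sorted(range(len(detected_counts)),
                    key=lambda i: detected_counts[i], reverse=True)
    selected = []
    for i in ranked[:top_k]:
        if detected_counts[i] == 0:
            continue  # no detected rooftop pixels at this intensity
        if nondetected_counts[i] > max_ratio * detected_counts[i]:
            continue  # bin dominated by nondetected pixels, e.g. roads
        selected.append(i)
    return selected


def second_pass(segment_bins, svm_labels, selected_bins):
    """Keep SVM rooftops and add segments whose intensity bin was selected."""
    chosen = set(selected_bins)
    return [1 if label == 1 or b in chosen else 0
            for b, label in zip(segment_bins, svm_labels)]


# Hypothetical 10-bin histograms of detected and nondetected pixels.
detected = [0, 0, 0, 0, 0, 10, 0, 50, 300, 5]
nondetected = [100, 20, 30, 10, 5, 15, 20, 400, 350, 10]
bins = select_bins(detected, nondetected)

# Hypothetical segments: intensity bin of each segment and its SVM label.
final_labels = second_pass([8, 7, 2, 8], [1, 0, 0, 0], bins)
print(bins, final_labels)
```

In this example, bin 7 is among the two most populated bins but is rejected because it contains far more nondetected than detected pixels, so only segments falling in bin 8 are added in the second pass.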
As explained earlier, one of the aims of this research was for the proposed method to work using only panchromatic data. Such data can be obtained from a variety of commercial sources, but for this study we used images collected manually from Google Maps. Since the study focused on estimating the total rooftop area available for the deployment of photovoltaics in Abu Dhabi, UAE, the images were gathered from selected residential areas in Abu Dhabi city. To test the generality of our model, it was evaluated on two separate datasets, “Raha” and “Khalifa,” which consist of images gathered from Al Raha Gardens and Khalifa City A, respectively.
For the segmentation process to work properly, the
For each dataset, 14 such images were collected, of which 8 were used for training and the remaining 6 for testing and validation. In addition, the rooftops in each image were manually labeled, and these labels were subsequently used to label the regions extracted during segmentation: rooftop regions are labeled “1” and nonrooftop regions “0.”
Figure
Sample images from Al Raha Gardens (a). Sample images from Khalifa City A (b). An example of a manually labeled image (c).
Commonly adopted performance metrics were used to evaluate the performance of the system. These are Precision, Recall and
As mentioned, to determine the optimal value of
As might be expected, it can be seen that the accuracy of the SVM grows with the size of the training dataset. The relationship between the
While accuracy increases with the number of images used, it appears to level off after around 8 images, so this was deemed a sufficient amount of training data.
Finally, there was also the issue of the suitable value of
The results for trained SVM on Al Raha Gardens (a) and Khalifa City A (b) validation datasets.
Number of clusters | | | | | |
---|---|---|---|---|---|
Precision (%) | 60.5 | 68.19 | 79.66 | 73.5 | 67.39 |
Recall (%) | 86.62 | 84.9 | 83.71 | 72.4 | 68.22 |
F1-score (%) | 71.24 | 75.6 | 81.63 | 72.94 | 67.8 |
Number of clusters | | | | | |
---|---|---|---|---|---|
Precision (%) | 64.5 | 68.2 | 77.2 | 75.5 | 69.12 |
Recall (%) | 69.2 | 73.72 | 88.55 | 73.4 | 69.4 |
F1-score (%) | 66.76 | 70.85 | 82.48 | 74.43 | 69.25 |
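The third row of each table above is consistent with the harmonic mean of precision and recall, commonly called the F1-score. The sketch below shows how the three metrics relate; the function names and the pixel counts in the example are hypothetical, and only the first-column values (60.5 and 86.62) are taken from the first table.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (same units as the inputs)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def precision_recall_f1(true_pos, false_pos, false_neg):
    """Compute precision, recall and F1 as fractions from raw counts."""
    predicted = true_pos + false_pos
    actual = true_pos + false_neg
    precision = true_pos / predicted if predicted else 0.0
    recall = true_pos / actual if actual else 0.0
    return precision, recall, f1_score(precision, recall)


# First column of the first table: precision 60.5 %, recall 86.62 %.
print(round(f1_score(60.5, 86.62), 2))

# Hypothetical counts: 8 rooftop pixels found, 2 false alarms, 2 missed.
print(precision_recall_f1(8, 2, 2))
```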
We evaluate the overall performance of our method based on two criteria: the number of detected rooftops and the overall area covered by detected rooftops. We compare the results before and after applying the histogram method. In Table
The numbers of detected rooftops before and after applying the histogram method.
Image | Before the histogram method | After the histogram method |
---|---|---|
Image 1 from Raha | 12 out of 17 | 14 out of 17 |
Image 2 from Raha | 9 out of 14 | 10 out of 14 |
Image 3 from Raha | 7 out of 12 | 12 out of 12 |
Image 1 from Khalifa | 4 out of 13 | 13 out of 13 |
Image 2 from Khalifa | 6 out of 15 | 14 out of 15 |
Image 3 from Khalifa | 4 out of 12 | 10 out of 12 |
It can be observed that the SVM performs quite well on the “Raha” images even without the histogram method. For the “Khalifa” images, however, the basic SVM performs weakly and the histogram method produces a large improvement. One possible reason for this is that rooftops in the “Raha” images are well separated from each other by white boundaries (see Figure
To better evaluate the overall performance of the model, results based on the correctly classified rooftop and nonrooftop pixels are presented in Table
The amount of detected rooftop pixels before and after applying the histogram method for Al Raha Gardens (a) and Khalifa City A (b) test images.
Metric | Before the histogram method | After the histogram method |
---|---|---|
Precision (%) | 92.8 | 93.16 |
Recall (%) | 52.4 | 70.01 |
F1-score (%) | 66.9 | 79.9 |
Metric | Before the histogram method | After the histogram method |
---|---|---|
Precision (%) | 88.5 | 97.6 |
Recall (%) | 7.1 | 79.7 |
F1-score (%) | 13.1 | 87.7 |
Again it can be observed that in contrast to the “Raha” dataset, the histogram method significantly improves the performance of the system on the “Khalifa” dataset (though we still see a slight improvement in the case of the “Raha” dataset).
More results of the performance of our model can be seen in Figure
The original image from Raha Gardens (a). The result after the SVM (b). The result after the “histogram method” (c). The original image from Khalifa City A (d). The result after the SVM (e). The result after the “histogram method” (f).
The paper presented a new approach for detecting rooftops using machine learning techniques like
However, there are still some situations in which the method does not perform well. For example, rooftops which are very large relative to the image size were sometimes classified as nonrooftop by the SVM, which tended to treat such rooftops as outliers.
Another weakness of the method is poor performance when rooftops of many different colors are encountered. Also, when there is a single “dominant” rooftop color, it renders the system less sensitive to rooftops with less common colors.
For future work we intend to extend the system along three main directions: (i) improvements to the classification process via additional feature engineering to discover more informative features, and the screening and testing of alternative classifiers, such as the unbalanced SVM used in [; (ii) the addition of a higher-order classification stage. Rooftops which are in close proximity to each other tend to have similar characteristics (color, design, orientation, density, etc.). While the histogram method is a step in this direction, there are other characteristics beyond simply the grayscale intensity; (iii) testing and extension of the method to larger geographical areas.