In recent years, the spatial resolution of remote sensing images has improved greatly. However, a higher spatial resolution does not always lead to better automatic scene classification results. Visual attention is an important characteristic of the human visual system and can effectively help to classify remote sensing scenes. In this study, a novel visual attention feature extraction algorithm was proposed, which extracts visual attention features through a multiscale process, and a fuzzy classification method using visual attention features (FC-VAF) was developed to perform high resolution remote sensing scene classification. FC-VAF was evaluated on remote sensing scenes from widely used high resolution remote sensing images, including IKONOS, QuickBird, and ZY-3 images, and achieved more accurate classification results than the comparison methods according to quantitative accuracy evaluation indices. We also discuss the impact of different decomposition levels and different wavelets on the classification accuracy. FC-VAF improves the accuracy of high resolution scene classification and therefore advances research on digital image analysis and the applications of high resolution remote sensing images.
With the rapid development of satellite and sensor technologies, remote sensing has become an important and efficient means of collecting Earth spatial information in recent years [
Visual attention is an important characteristic of the human visual system [
The assumption of this study is that visual attention features could be extracted through a multiscale process for high resolution remote sensing scene classification. Fuzzy theory is an effective mathematical tool to process fuzzy and complex information [
Wavelet analysis is a powerful mathematical tool for the decomposition, reconstruction, and multiscale representation of signals [
Two-level two-dimensional discrete wavelet decomposition of an image.
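For concreteness, the following is a minimal sketch of such a two-level two-dimensional discrete wavelet decomposition using the PyWavelets library (the paper does not name an implementation, so this choice is an assumption):

```python
# Two-level 2D discrete wavelet decomposition of an image with PyWavelets.
import numpy as np
import pywt

image = np.random.rand(256, 256)  # stand-in for a remote sensing scene

# wavedec2 returns [cA2, (cH2, cV2, cD2), (cH1, cV1, cD1)]:
# the coarsest approximation followed by (horizontal, vertical, diagonal)
# detail coefficients for each decomposition level.
cA2, (cH2, cV2, cD2), (cH1, cV1, cD1) = pywt.wavedec2(image, wavelet="sym4", level=2)

# Each level roughly halves the spatial size of the coefficient arrays.
print(cA2.shape, cH2.shape, cH1.shape)
```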
The wavelet transform can provide a multiscale representation of images. Therefore, a novel visual attention feature extraction algorithm based on the wavelet transform is proposed, which extracts visual attention features from the saliency maps of remote sensing scenes through a multiscale process.
Visual saliency in an image measures to what extent details attract human attention [
The visual attention features are extracted from an integrated saliency map in the following steps (a sketch of the whole procedure, under stated assumptions, is given after the steps).
(a) The integrated saliency map is decomposed by wavelet transform to obtain a multiscale representation of the map.
The multiscale representation of an integrated saliency map for visual attention feature extraction.
(b) Visual attention focuses are extracted in the top level of the multiscale representation. The salient points in the top level are extracted based on their saliency values. The human visual system is most easily attracted by the most salient point; therefore, the most salient point is selected as the first and current visual attention focus. Visual attention is then shifted among the salient points in the top level: the next visual attention focus is the unselected salient point that is closest to the current visual attention focus. For example, there are three salient points in Figure
(c) Visual attention is shifted from the top level to the low level of the multiscale representation. Take the visual attention focus
(d) The saliency values of the visual attention focuses in the visual saliency map are used as the visual attention features for scene classification. In Figure
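A hedged sketch of steps (a)-(d) above, again using PyWavelets. The saliency-point selection rule, the simple coordinate scaling used to project focuses back to the full-resolution saliency map, and the function name extract_vaf are illustrative assumptions rather than the paper's exact procedure:

```python
import numpy as np
import pywt

def extract_vaf(saliency_map, wavelet="sym4", level=2, n_focuses=4):
    # (a) multiscale representation of the integrated saliency map;
    # keep the approximation coefficients at the top (coarsest) level
    coeffs = pywt.wavedec2(saliency_map, wavelet=wavelet, level=level)
    top = coeffs[0]

    # (b) salient points at the top level: here simply the n_focuses largest values
    flat_idx = np.argsort(top, axis=None)[-n_focuses:]
    points = [tuple(int(v) for v in np.unravel_index(i, top.shape)) for i in flat_idx]

    # start at the most salient point, then repeatedly shift attention to the
    # nearest unvisited salient point
    current = max(points, key=lambda p: top[p])
    ordered, remaining = [current], [p for p in points if p != current]
    while remaining:
        nxt = min(remaining,
                  key=lambda p: (p[0] - current[0]) ** 2 + (p[1] - current[1]) ** 2)
        ordered.append(nxt)
        remaining.remove(nxt)
        current = nxt

    # (c)+(d) project each focus back to the full-resolution saliency map
    # (plain 2**level coordinate scaling here; the paper shifts attention level
    # by level) and read out the saliency values as the visual attention features
    scale = 2 ** level
    feats = []
    for r, c in ordered:
        rr = min(r * scale, saliency_map.shape[0] - 1)
        cc = min(c * scale, saliency_map.shape[1] - 1)
        feats.append(saliency_map[rr, cc])
    return np.array(feats)
```

With level = 2 and n_focuses = 4, this yields the four visual attention features listed in the parameter table below.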
We apply the fuzzy classification method [
(a) Multiple original features are extracted from the samples of remote sensing scenes, including gray level cooccurrence matrix features [
(b) The features are transformed into fuzzy features using the standard S-function; a commonly used form of this function is given after these steps.
(c) Fuzzy class centers are obtained by using the mean value method. Suppose
(d) Test samples are classified using Euclidean fuzzy closeness degree on the basis of the fuzzy closeness principle [
(e) Fuzzy classification results are assessed using overall accuracy (OA), Kappa coefficient (KC), average producer’s accuracy (APA), and average user’s accuracy (AUA) based on confusion matrices [
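The equation for the S-function referenced in step (b) does not appear in this text; one commonly used form of the standard S-function is

$$
S(x; a, b, c) =
\begin{cases}
0, & x \le a,\\[2pt]
2\left(\dfrac{x-a}{c-a}\right)^{2}, & a < x \le b,\\[2pt]
1 - 2\left(\dfrac{x-c}{c-a}\right)^{2}, & b < x \le c,\\[2pt]
1, & x > c,
\end{cases}
\qquad b = \frac{a+c}{2},
$$

where $a$ and $c$ bound the transition region. The fuzzy parameter values 0.2 and 0.8 listed in the parameter table below are assumed here to play the roles of $a$ and $c$; the paper's exact parameterization is not stated in this text.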
A flowchart of the fuzzy classification process is shown in Figure
Flowchart of the fuzzy classification process.
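A minimal sketch of steps (b)-(d), assuming the S-function form given above, mean-value class centers, and one common definition of the Euclidean fuzzy closeness degree (1 minus the Euclidean distance between membership vectors, normalized by the square root of the feature dimension). The function names and the assumed scaling of features to [0, 1] are illustrative, not the paper's implementation:

```python
import numpy as np

def s_function(x, a=0.2, c=0.8):
    """Standard S-function, applied element-wise to features scaled to [0, 1]."""
    b = (a + c) / 2.0
    x = np.asarray(x, dtype=float)
    y = np.zeros_like(x)
    mid1 = (x > a) & (x <= b)
    mid2 = (x > b) & (x <= c)
    y[mid1] = 2 * ((x[mid1] - a) / (c - a)) ** 2
    y[mid2] = 1 - 2 * ((x[mid2] - c) / (c - a)) ** 2
    y[x > c] = 1.0
    return y

def euclidean_closeness(u, v):
    """Euclidean fuzzy closeness degree: 1 - ||u - v||_2 / sqrt(n)."""
    u, v = np.asarray(u, dtype=float), np.asarray(v, dtype=float)
    return 1.0 - np.linalg.norm(u - v) / np.sqrt(u.size)

def fuzzy_classify(train_feats, train_labels, test_feats):
    train_labels = np.asarray(train_labels)
    # (b) transform features into fuzzy features
    train_fuzzy = s_function(train_feats)
    test_fuzzy = s_function(test_feats)
    # (c) fuzzy class centers as the mean of each class's training samples
    labels = np.unique(train_labels)
    centers = {k: train_fuzzy[train_labels == k].mean(axis=0) for k in labels}
    # (d) assign each test sample to the class with the largest closeness degree
    return np.array([max(labels, key=lambda k: euclidean_closeness(f, centers[k]))
                     for f in test_fuzzy])
```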
In order to validate the effectiveness of FC-VAF, 80 samples of remote sensing scenes were selected as the experimental data from widely used high spatial resolution remote sensing images, including IKONOS, QuickBird, and ZY-3 images. The samples consist of four classes: residential areas, farmlands, woodlands, and water areas. Each class has 20 samples, of which 10 are used as training samples and all 20 are used as test samples. The size of the samples is
Representative samples of remote sensing scenes. (a) Residential areas; (b) farmlands; (c) woodlands; (d) water areas.
To demonstrate the effectiveness of FC-VAF, comparisons were carried out between FC-VAF and scene classification based on four traditional algorithms: standard backpropagation neural network classification (SBPC), adaptive learning rule backpropagation neural network classification (ALRBPC), general regression neural network classification (GRNNC), and fuzzy classification (FC). Four gray level cooccurrence matrix features and four Laws texture energy features were extracted from these samples for all scene classification methods. The Euclidean closeness degree measurement was adopted in both FC and FC-VAF, and the Symlets wavelet was adopted in FC-VAF. The main parameters of the different methods are shown in Table
Main parameters of different methods.
| Method | Parameter description | Parameter value |
|---|---|---|
| SBPC | Number of hidden layers | 1 |
| | Number of neurons in hidden layers | 15 |
| | Learning rate | 0.01 |
| | Maximum number of epochs to train | 5000 |
| ALRBPC | Number of hidden layers | 1 |
| | Number of neurons in hidden layers | 15 |
| | Learning rate | 0.01 |
| | Ratio to increase learning rate | 1.05 |
| | Ratio to decrease learning rate | 0.7 |
| | Maximum number of epochs to train | 1000 |
| GRNNC | Spread parameter | 0.5 |
| FC | Fuzzy parameter | 0.2 |
| | Fuzzy parameter | 0.8 |
| FC-VAF | Fuzzy parameter | 0.2 |
| | Fuzzy parameter | 0.8 |
| | Level of wavelet decomposition | 2 |
| | Number of VAF | 4 |
We compared the results of different scene classification methods using the measures of OA, KC, APA, and AUA. Table
Comparisons of different scene classification methods.
| Method | Overall accuracy (OA) (%) | Kappa coefficient (KC) | Average producer's accuracy (APA) (%) | Average user's accuracy (AUA) (%) |
|---|---|---|---|---|
| SBPC | 76.3 | 0.683 | 76.3 | 78.4 |
| ALRBPC | 78.8 | 0.717 | 78.8 | 81.5 |
| GRNNC | 82.5 | 0.767 | 82.5 | 86.8 |
| FC | 80.0 | 0.733 | 80.0 | 82.4 |
| FC-VAF | 85.0 | 0.800 | 85.0 | 89.1 |
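For reference, the following is a minimal sketch of how these four measures can be computed from a confusion matrix, assuming the standard definitions with rows as classified classes and columns as reference classes (the paper's exact conventions are not stated in this text):

```python
import numpy as np

def accuracy_measures(cm):
    """Return (OA, KC, APA, AUA) from a square confusion matrix."""
    cm = np.asarray(cm, dtype=float)
    n = cm.sum()
    diag = np.diag(cm)
    row_tot = cm.sum(axis=1)   # totals of classified classes
    col_tot = cm.sum(axis=0)   # totals of reference classes

    oa = diag.sum() / n                          # overall accuracy
    pe = (row_tot * col_tot).sum() / n ** 2      # expected chance agreement
    kc = (oa - pe) / (1 - pe)                    # Kappa coefficient
    apa = (diag / col_tot).mean()                # average producer's accuracy
    aua = (diag / row_tot).mean()                # average user's accuracy
    return oa, kc, apa, aua
```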
The wavelet decomposition level (DL) is the key parameter of FC-VAF that affects the accuracy of scene classification. The scene classification accuracy of FC-VAF with respect to DL was analyzed and discussed. The 80 samples of scenes in the case study were used with different DL values
The effects of different wavelet decomposition levels (DL) on the classification accuracy. OA represents overall accuracy, KC represents Kappa coefficient, APA represents average producer’s accuracy, and AUA represents average user’s accuracy.
Different wavelets lead to different wavelet decomposition effects, which affect the classification accuracy of FC-VAF. The scene classification accuracy of FC-VAF related to wavelets was analyzed and discussed. The 80 samples of scenes in the case study were used with different wavelets. Other parameters of FC-VAF were kept the same as those in the case study. The classification accuracy of FC-VAF using different wavelets is shown in Figure
The effects of different wavelets on the classification accuracy. OA represents overall accuracy, KC represents Kappa coefficient, APA represents average producer’s accuracy, and AUA represents average user’s accuracy.
In this study, a novel visual attention feature extraction algorithm was proposed, which extracted visual attention features through a multiscale process, and a fuzzy classification method using visual attention features (FC-VAF) was developed to perform high resolution remote sensing scene classification. FC-VAF was evaluated by using 80 samples of remote sensing scenes selected from widely used high resolution remote sensing images, including IKONOS, QuickBird, and ZY-3 images. FC-VAF achieved more accurate classification results than four traditional classification methods according to the measures of OA, KC, APA, and AUA: the OA values of SBPC, ALRBPC, GRNNC, FC, and FC-VAF are 76.3%, 78.8%, 82.5%, 80.0%, and 85.0%, respectively, and the corresponding KC values are 0.683, 0.717, 0.767, 0.733, and 0.800. The effects of the decomposition level and of the choice of wavelet on the classification accuracy of FC-VAF were also discussed.
FC-VAF can extract visual attention features through a multiscale process and improve the accuracy of scene classification in high resolution remote sensing images. Therefore, FC-VAF not only advances the research of visual attention models and digital image analysis methods, but also promotes the applications of high resolution remote sensing images. Possible further development of the study will focus on the integration of FC-VAF and other intelligent algorithms to further improve the accuracy of high resolution remote sensing scene classification.
The authors declare that there are no conflicts of interest regarding the publication of this paper.
This paper was supported by the National Natural Science Foundation of China (Grant no. 41371343). The authors also wish to thank Susan Cuddy at CSIRO for her helpful comments and suggestions.