Project Gradient Descent Adversarial Attack against Multisource Remote Sensing Image Scene Classification

Deep learning technology (deeper and better-optimized network structures) and remote sensing imaging (i.e., increasingly multisource and multicategory remote sensing data) have developed rapidly. Although deep convolutional neural networks (CNNs) have achieved state-of-the-art performance on remote sensing image (RSI) scene classification, the existence of adversarial attacks poses a potential security threat to CNN-based RSI scene classification. The corresponding adversarial samples can be generated by adding a small perturbation to the original images. Feeding a CNN-based classifier with adversarial samples causes it to misclassify with high confidence. To achieve a higher attack success rate against CNN-based scene classification, we introduce the projected gradient descent method to generate adversarial remote sensing images. Then, we select several mainstream CNN-based classifiers as the attacked models to demonstrate the effectiveness of our method. The experimental results show that our proposed method can dramatically reduce the classification accuracy under both untargeted and targeted attacks. Furthermore, we evaluate the quality of the generated adversarial images by visual and quantitative comparisons. The results show that our method generates imperceptible adversarial samples and has a stronger attack ability against RSI scene classification.


Introduction
With the high-speed development of remote sensing technology, we can obtain increasingly multisource and multicategory remote sensing images (RSIs). Those images can be employed to observe ground objects more efficiently. An RSI encompasses abundant spectral features of the ground objects, e.g., texture and geometric features. Therefore, employing RSIs for scene classification has gathered considerable attention. The purpose of scene classification is to predict the label of a given RSI, e.g., airport, forest, or river. The critical step of RSI scene classification is extracting image features. Existing research can be mainly divided into two strategies according to the features used: (1) low-level features and middle-level global features obtained by manual extraction and (2) high-level features extracted automatically by a convolutional neural network (CNN).
Early RSI scene classification research mainly relied on handcrafted low-level and middle-level features [1][2][3][4][5], e.g., color, texture, spatial structure, and the scale-invariant feature transform [6]. The extraction of middle-level features can be regarded as computing high-order statistics of the low-level local features [7][8][9]. Compared with low-level features, middle-level features (e.g., the bag-of-visual-words model [10]) are more expressive for scene images. However, due to their lack of flexibility and adaptability, middle-level features cannot easily distinguish complex scenes or reach the ideal classification accuracy. Traditional methods can express RSI features to a certain extent [11]. Nevertheless, the above features are manually extracted, so end-to-end scene classification cannot be achieved, and the final classification accuracy is largely subject to human experience.
To ameliorate the limitations of handcrafted features, more researchers have adopted automatic extraction of RSI features. In recent years, deep learning has achieved impressive performance in image processing tasks, e.g., image generation, object detection, and image classification. Moreover, CNN-based classifiers have also been employed for RSI scene classification. Within an end-to-end framework, a CNN performs multilayer nonlinear transformations to extract high-level image features automatically. These features focus on the overall semantics of the RSI rather than the pixels of a local region. Hence, the high-level features obtained by a CNN can be more effective for RSI scene classification. For instance, Wang et al. [12] proposed a method based on the Residual Network (ResNet) to achieve RSI classification, which breaks the limitation of scarce training samples. Zhang et al. [13] proposed an effective architecture based on a CNN and a Capsule Network (CapsNet), named CNN-CapsNet. RSI scene classification has a common problem: overfitting caused by deeper networks (too many parameters), while shallower networks lack sufficient ability to extract semantic information. To this end, Zhang et al. [14] proposed a simple and efficient network based on the dense convolutional network (DenseNet), which can improve network performance without increasing the number of network parameters.
Although the above CNN-based methods perform outstandingly on RSI scene classification, they also face various security risks [15], e.g., data poisoning attacks [16] and adversarial sample attacks. To defend against these attacks, researchers have also proposed corresponding malicious-input detection methods [17][18][19]. Among these attacks, classifiers can easily be fooled by specially designed adversarial samples into producing unexpected results. This phenomenon has raised serious concerns, especially for the security of applications based on deep learning [20][21][22]. The vulnerability of deep learning was first revealed in [23]: by adding adversarial perturbations to original images, one can obtain deceptive images imperceptible to human eyes. Subsequently, numerous attack methods were proposed. Goodfellow et al. hypothesized that the existence of adversarial examples may be due to the linear behavior of deep models. Based on this hypothesis, they proposed the fast gradient sign method (FGSM) [24], which modifies the pixel values of images so that the modified images are misclassified. Later, an iterative version of FGSM, the basic iterative method (BIM) [25], was proposed. BIM achieves an effective attack by generating a more imperceptible adversarial perturbation. Then, the projected gradient descent (PGD) [26] attack was proposed, which can be seen as a variant of BIM. Unlike BIM, PGD adds a step that initializes with a uniform random perturbation, and it can run more iterations until it finds an adversarial example. It also replaces BIM's clip operation on the gradient with a gradient projection [27]. Due to these improvements, the attack capability of PGD is far superior to both FGSM and BIM; PGD is known as the strongest first-order attack. Another strong attack is the Carlini and Wagner (C & W) [28] attack. C & W proposed a family of three attacks that minimize diverse similarity metrics: L 0 , L 2 , and L ∞ , respectively.
The C & W attacks achieve perturbations imperceptible to humans by constraining the L 0 , L 2 , and L ∞ norms to keep the perturbation small.
Although C & W can generate more precise perturbations, it is computationally slower. By contrast, PGD is a commonly used attack method that is both stable and efficient. The research on adversarial samples for RSI scene classification has not been thoroughly studied yet. Czaja et al. [29] provided a preliminary analysis of adversarial examples for remote sensing data; however, their analysis covers only the targeted attack. Xu et al. [30] used the FGSM and L-BFGS [23] attacks to generate adversarial samples for RSI scene classification and tested both attacks under the targeted and untargeted settings. We call this method Xu2020 in this paper. Additionally, they utilized an adversarial training strategy to increase the resistance of deep models to adversarial examples. Burnel et al. [31] proposed a different approach based on a modified Wasserstein generative adversarial network (GAN) [32] to generate natural adversarial hyperspectral images. Chen et al. [33] tested two adversarial attack algorithms (i.e., FGSM and BIM) on two classifiers trained on different remote sensing data sets. They found that a higher similarity in the feature space makes it more straightforward to generate the corresponding adversarial examples. Although the above methods can easily fool the classifier, they do not represent state-of-the-art attack capacity. FGSM is limited by its one-step nature and hence has a low success rate on large public data sets. BIM uses fewer iterations than PGD, so its attack effect lags behind PGD.
Based on the above analysis, we utilize the PGD attack to generate adversarial samples. Then, we test the generated images on eight state-of-the-art classifiers to prove our method's effectiveness. The main contributions are summarized as follows:
(1) We apply a strong first-order attack (i.e., PGD) to generate adversarial samples. This method generates a more accurate perturbation by using a uniform random perturbation as initialization and then running several BIM-style iterations to find an adversarial example.
(2) To demonstrate our method's effectiveness, we employ three benchmark multicategory remote sensing scene data sets imaged by different sensors. Eight CNN-based classifiers are attacked under both the untargeted and targeted settings. Compared with the Xu2020 method, our method drives the attacked classifiers to much lower classification accuracy. This reveals that current state-of-the-art classifiers still carry potential security risks in the multisource RSI scene classification task.
(3) Furthermore, we provide visual comparisons of the generated perturbations and adversarial samples. For a more objective visual quality evaluation, the peak signal-to-noise ratio (PSNR) is employed as the evaluation index. The results indicate that our method generates imperceptible adversarial samples and has a stronger attack ability against current state-of-the-art classifiers.
The rest of this paper is organized as follows. Section 2 introduces the main strategies for generating adversarial examples and the detailed expression of the PGD attack under the untargeted and targeted settings. Section 3 describes the three RSI scene data sets used in this paper and the corresponding experimental results. Conclusions are summarized in Section 4.

Method
Adversarial attacks principally follow two strategies: (1) maximize the classifier's loss function under an L p norm constraint bounded by a small constant, which causes the classifier to misclassify, and (2) constrain the perturbation with an L p norm to be small, so that the generated images are imperceptible to the human eye. The overall procedure of our method is shown in Figure 1.
Given a set of test remote sensing images X = {x_1, . . . , x_n}, x_i ∈ [0, 255]^n, with the corresponding labels Y = {y_1, . . . , y_n}, a CNN classifier F: X ⟶ Y is introduced to map images to their labels. The adversarial attack aims to find a perturbation δ that makes the predicted label y_adv ≠ y_true. Usually, we maximize the loss function of the original data under a norm constraint ‖δ‖_p ≤ ε. Hence, finding the perturbation can be regarded as an inner maximization problem:

max_{‖δ‖_p ≤ ε} L(θ, x′, y), with x′ = x + δ, (1)

where L(θ, x′, y) is the loss function, θ denotes the model's parameters, and x′ denotes the generated adversarial sample. To keep the perturbation small for a given input image x, we also minimize D(x, x′), where D is the distance metric between the original image and the adversarial image:

min_{x′} D(x, x′) subject to F(x′) = y_adv ≠ y_true. (2)

However, the optimization problem of equation (2) is difficult to solve directly. Hence, FGSM was proposed as an efficient one-step attack. It uses an L ∞-bounded constraint and the sign function to obtain the gradient direction, then adjusts the input by a small step in the direction that maximizes the loss. The perturbed image is obtained as

x′ = x + ε · sign(∇_x L(θ, x, y)), (3)

where ε denotes a small constant that restricts the perturbation. FGSM is based on a linear approximation of the model, so the direction of the perturbation is fixed: even if it is applied multiple times, the direction will not change. For a nonlinear model, however, the direction may not be correct after only one step. Hence, the multistep iterative version of FGSM, the PGD attack, was proposed. PGD first initializes with a uniform random perturbation and then finds an adversarial example by running several BIM-style iterations. PGD creates a stronger attack than previous iterative methods (e.g., BIM).
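The one-step FGSM update described above can be sketched as follows. This is a minimal illustration, not the attacked models' implementation: it assumes a toy linear softmax classifier (weights W, bias b) whose cross-entropy gradient with respect to the input is available in closed form, and inputs normalized to [0, 1]. The function names are ours.

```python
import numpy as np

def softmax(z):
    z = z - z.max()              # shift for numerical stability
    e = np.exp(z)
    return e / e.sum()

def fgsm(x, y, W, b, eps):
    """One-step FGSM: x' = x + eps * sign(grad_x L(theta, x, y)).

    For a linear softmax classifier with cross-entropy loss, the
    gradient of the loss with respect to the input x is
    W.T @ (p - onehot(y)), where p are the predicted probabilities.
    """
    p = softmax(W @ x + b)
    onehot = np.eye(W.shape[0])[y]
    grad = W.T @ (p - onehot)
    x_adv = x + eps * np.sign(grad)
    return np.clip(x_adv, 0.0, 1.0)  # keep a valid normalized image
```

With eps large enough, a single signed step already flips the toy classifier's prediction, illustrating why the one-step attack works well on near-linear models but stalls on strongly nonlinear ones.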
Formally, the iterative procedure follows

x^{t+1} = Π_{x+S}(x^t + α · sign(∇_x L(θ, x^t, y))), (4)

where Π denotes the projection operator, which clips the input back into the predefined perturbation range; α is the gradient step size; x + S represents the allowed perturbation set; and L is the cross-entropy loss. Usually, the L ∞ and L 2 norms are used as the constraints that bound the perturbation to be small. In this paper, we follow [34], which argues that L ∞ is the optimal distance metric for the image classification task, and [35], which argues that distillation is secure under this distance metric. Thus, we choose PGD-L ∞ as our attack method (Algorithm 1). Generally, adversarial attacks can fool a CNN classifier under untargeted and targeted settings. A targeted attack forces the classification result into a specific class assigned by the attacker; an untargeted attack only aims to make the classifier mispredict the given image's label, without specifying a particular wrong label. The untargeted and targeted settings of our attack method can be formulated as

x^{t+1} = Π_{x+S}(x^t + α · sign(∇_x L(θ, x^t, y_true))), (5)

x^{t+1} = Π_{x+S}(x^t − α · sign(∇_x L(θ, x^t, y_target))). (6)
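The iterative PGD-L ∞ procedure (random initialization in the L ∞ ball, signed gradient steps, projection back onto the ball after every step) can be sketched as follows. This is a minimal NumPy illustration on the same toy linear softmax classifier as before, not the implementation used in our experiments; the function name `pgd_linf` and the (W, b) classifier are ours for illustration.

```python
import numpy as np

def pgd_linf(x, y, W, b, eps, alpha, steps, targeted=False, seed=0):
    """PGD inside an L-infinity ball of radius eps around x.

    Starts from a uniformly random point in the ball, takes signed
    gradient steps of size alpha, and projects back onto the ball
    (and the valid pixel range) after every step.  For a targeted
    attack, y is the desired target label and the loss is descended
    (negative step) instead of ascended.
    """
    rng = np.random.default_rng(seed)
    x_adv = np.clip(x + rng.uniform(-eps, eps, size=x.shape), 0.0, 1.0)
    num_classes = W.shape[0]
    for _ in range(steps):
        z = W @ x_adv + b
        p = np.exp(z - z.max())
        p /= p.sum()                                   # softmax probabilities
        grad = W.T @ (p - np.eye(num_classes)[y])      # d CE / d x (closed form)
        step = -alpha if targeted else alpha           # descend vs. ascend loss
        x_adv = x_adv + step * np.sign(grad)
        x_adv = np.clip(x_adv, x - eps, x + eps)       # project onto L-inf ball
        x_adv = np.clip(x_adv, 0.0, 1.0)               # stay a valid image
    return x_adv
```

The two clip calls play the role of the projection operator Π in equations (4)-(6): the first enforces the perturbation budget, the second the valid pixel range.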

Experimental Setup.
We tested the Xu2020 method and our method on three RSI scene classification data sets, i.e., the UC Merced (UCM) data set [10], the WHU-RS19 data set [36], and the AID data set [37]. Figures 2-4 show examples from the three data sets. The UCM data set includes 2,100 remote sensing scene images. It consists of 21 scene classes, and each class contains 100 samples with a size of 256 × 256 pixels. All images were collected from the National Map Urban Area Imagery collection, covering multiple towns across the United States. The 21 land-use classes are agricultural, airplane, baseball diamond, beach, buildings, chaparral, dense residential, forest, freeway, golf course, harbor, intersection, medium-density residential, mobile home park, overpass, parking lot, river, runway, sparse residential, storage tanks, and tennis courts. The WHU-RS19 data set was collected from Google Earth (Google Inc.). It has 19 different scene classes, and each class contains 50 images with a size of 600 × 600 pixels. The 19 scene classes are airport, beach, bridge, commercial area, desert, farmland, football field, forest, industrial area, meadow, mountain, park, parking lot, pond, port, railway station, residential area, river, and viaduct. The AID data set is the biggest of the three data sets, containing 30 scene classes. Each class has around 220-420 images.
Security and Communication Networks

Figure 1: The overall procedure of our method.

Algorithm 1: The PGD-L ∞ attack.
Input: a set of test remote sensing images X = {x_1, . . . , x_n} and the corresponding labels Y = {y_1, . . . , y_n}; a classifier f with parameters θ; the number of iterations t; a small constant ε that restricts the perturbation; the step size α.
Output: adversarial example x_adv.

This data set includes 10,000 remote sensing scene images. The 30 scene classes contain airport, bare land, baseball field, beach, bridge, center, church, commercial, dense residential, desert, farmland, forest, industrial, meadow, medium residential, mountain, park, parking, playground, pond, port, railway station, resort, river, school, sparse residential, square, stadium, storage tanks, and viaduct.
To prove the strong attack ability of our method, we conducted experiments in three parts. First, we compared the classification accuracies of the Xu2020 method (used in [30]) with those of our method; both attacks were tested in the untargeted and targeted settings. Then, we also show the classification accuracy of each class under the targeted setting. Finally, we visualize the generated adversarial images and the corresponding perturbations. For a more objective evaluation, we used the peak signal-to-noise ratio (PSNR) index to analyze the images generated by Xu2020 and our method.
In this paper, we first randomly picked 20% of the labeled images as the training set; the remaining images were employed as the test set. All experiments were repeated three times with random splits. In our experiments, we used Xu2020 as the compared method. Eight state-of-the-art CNN-based classifiers were employed to test our method: VGG-16 [38], GoogLeNet [39], InceptionV3 [40], ResNet-18 [41], ResNet-50 [41], ResNet-101 [41], DenseNet-121 [42], and DenseNet-201 [42]. They were pretrained on the ImageNet [43] data set. Adam was used as the optimizer to train the classifiers with a batch size of 32. The number of training epochs was set to 40, with the learning rate initialized to 5e-5 for the first 20 epochs and then decayed to 1e-5 for the last 20 epochs. The experimental environment configuration is shown in Table 1.
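The two-stage learning-rate schedule described above can be written as a small helper. This is a schematic of the stated setup only; the function name is ours, and in real training its value would be handed to the Adam optimizer at the start of each epoch.

```python
def learning_rate(epoch, total_epochs=40, lr_head=5e-5, lr_tail=1e-5):
    """Piecewise-constant schedule: 5e-5 for the first half of training
    (epochs 0-19 of 40), then decayed to 1e-5 for the second half."""
    return lr_head if epoch < total_epochs // 2 else lr_tail
```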

Experimental Results of Untargeted Attack.
In this section, we show the classification accuracies of the eight classifiers before and after the attack. The overall accuracy (OA, i.e., the number of accurately classified images divided by the size of the entire test set) is employed to evaluate the different methods quantitatively. The OA gap is the gap between the classification accuracy on the clean test set and that on the test set perturbed by Xu2020 or by our method. For the untargeted and targeted attacks, the parameter ε was set to 0.01 for both the Xu2020 method and our method. Because Xu2020 is a one-step attack, its number of steps was set to 1; for our method, the number of steps was set to 40. Tables 2-4 show the untargeted attack results against three types of input (clean, perturbed by Xu2020, and perturbed by our method run for 40 steps).
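The OA and OA-gap quantities defined above amount to the following; the function names are ours.

```python
def overall_accuracy(predictions, labels):
    """OA: number of correctly classified images divided by the
    size of the entire test set."""
    correct = sum(int(p == y) for p, y in zip(predictions, labels))
    return correct / len(labels)

def oa_gap(oa_clean, oa_attacked):
    """Gap between accuracy on the clean test set and on the
    adversarially perturbed test set."""
    return oa_clean - oa_attacked
```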
As shown in Table 2, all classifiers perform excellently on the three data sets before the attack. Generally, the OA values of the eight classifiers are around 90%. Among the different classifiers, deeper classifiers achieve higher accuracy than shallower ones; for instance, ResNet-101 gains about 5% in OA over VGG-16 on the UCM data set. However, the OA of all classifiers decreases dramatically after the attack.
As shown in Table 4, for the biggest data set (i.e., the AID data set), the OA values of VGG-16 and DenseNet-201 are 20.97% and 53.88% after being attacked by Xu2020, whereas they are 1.31% and 2.89%, respectively, after being attacked by our method. As shown in Tables 2-4, our method's adversarial images yield lower OA values on the bigger data sets. Specifically, for the UCM data set (a small data set with 21 classes), the OA value of DenseNet-201 is 5.65% after being attacked by our method; for the WHU-RS19 and AID data sets, the OA values of DenseNet-201 are only 4.81% and 2.89%, respectively. These experimental results demonstrate the effectiveness of our method across different attacked classifiers and different data sets.
Also, we can find an interesting phenomenon in Tables 2-4. For Xu2020, the OA gap between the clean and adversarial test sets is highly related to the classifiers' depth. This shows that more complicated classifiers tend to be more robust and stable. However, the experimental results show that our method always yields a stable OA gap regardless of the classifiers' depth. In summary, our method is not affected by the depth of the classifiers and can attack various classifiers stably.

3.3. Experimental Results of Targeted Attack.
Furthermore, we also evaluated our method and the compared method under the targeted attack. Tables 5-7 report the corresponding OA values.
In our experiments, we took each category in turn as the target of the attack, so the number of attack runs equals the number of categories in the data set. The experimental results demonstrate that the classifiers are also vulnerable under the targeted attack. From Tables 5-7, we find that the targeted attack's OA values are higher than those of the untargeted attack. For instance, on the WHU-RS19 data set, the OA value of VGG-16 reaches 26.61% after being attacked by our targeted method, whereas it drops to 0.24% for the same classifier under the untargeted attack. This indicates that the targeted attack is a harder task than the untargeted attack.
We can find common points in Tables 5-7. The OA gap between the shallow model (i.e.,

As shown in Tables 5-7, our method exhibits no significant fluctuation in the OA gaps between the targeted and untargeted attacks. This proves that our method is more effective than Xu2020 under both targeted and untargeted attacks.
To further analyze the adversarial attack's capacity, we provide the classification accuracy (CA) of each category after being attacked by Xu2020 and by our method in Figures 5-7.
The CA values of Xu2020 are much higher than those of our method in each class. For instance, for the UCM data set, the CA value is 64.35% after being attacked by Xu2020, but only 23.06% after being attacked by our method. As shown in Figures 5-7, the CA values stay around 60% after being attacked by Xu2020 for every category in all three data sets; the gap between the highest and lowest CA values in Figure 5 is only 8.8%. However, the CA values fluctuate after being attacked by our method. A possible reason is that the feature distributions of images from different categories are similar.
Another interesting phenomenon can be found in Figures 5-7: the CA values vary dramatically when the same classifier is attacked by the same attack method on different data sets. For the WHU-RS19 data set, the gap between the highest and lowest CA values is 43.03% under our method, whereas for the bigger UCM data set, the gap is only 29.46%. This may be because deep models easily overfit when trained on a small data set.

Visualization of Adversarial Samples.
In this section, we visualize the adversarial samples generated by Xu2020 and by our method. To further compare the quality of the generated images, we also show the perturbations generated against the DenseNet-201 classifier on the UCM data set in Figure 8 (the untargeted attack) and Figure 9 (the targeted attack).
These visualization results indicate that the perturbations generated by our method are more imperceptible to human eyes than those of Xu2020. The number of steps was set to 1 for Xu2020 and 40 for our method, and ε was set to 0.01. Under the targeted attack, we set the target category to "airport". The second and fourth rows of Figures 8 and 9 show the perturbations generated by Xu2020 and our method, respectively. Most of the perturbations generated by Xu2020 are more evident than those generated by our method, especially the third and fifth perturbations of the second row under the targeted attack.
To evaluate visual quality more objectively, we employ the PSNR to compare the generated adversarial images, as shown in Figures 8 and 9. The PSNR is a common index that measures the similarity between the original clean image and the adversarial image. The PSNR is calculated as

PSNR = 10 · log 10 (MAX I 2 / MSE), (7)

where MAX I represents the maximum possible pixel value of the image and MSE is the mean squared error between the two images. A higher PSNR value indicates that the adversarial image's distribution is closer to that of the original clean image. A PSNR value above 40 dB means that the image quality is excellent; a value between 30 and 40 dB usually indicates good quality, i.e., the distortion of the generated image is detectable but acceptable. The PSNR values are given above the third and fifth rows of Figures 8 and 9. Figure 8 presents the adversarial images and perturbations generated by Xu2020 and our method. The PSNR values of Xu2020 are slightly lower than those of our method; hence, the distribution of our method's adversarial images is closer to that of the original clean images.
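The PSNR computation above can be sketched as follows; `max_i` defaults to 255 for 8-bit images, and the function name is ours.

```python
import numpy as np

def psnr(clean, adversarial, max_i=255.0):
    """PSNR = 10 * log10(MAX_I^2 / MSE), in decibels.

    Higher values mean the adversarial image is closer to the clean
    one; identical images give infinite PSNR.
    """
    clean = np.asarray(clean, dtype=np.float64)
    adversarial = np.asarray(adversarial, dtype=np.float64)
    mse = np.mean((clean - adversarial) ** 2)
    if mse == 0.0:
        return float("inf")      # no distortion at all
    return 10.0 * np.log10(max_i ** 2 / mse)
```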
This inference can also be confirmed from the generated perturbations in Figure 9. These results show that the images generated by our method achieve a better trade-off between attack strength and image quality.
In summary, our method maintains its attack strength across different classifiers and model depths. Our method is also more effective than Xu2020 under both untargeted and targeted attacks. Moreover, the adversarial images generated by our method exhibit less distortion than those of Xu2020.

Conclusions
In this paper, we discussed the adversarial sample problem in the RSI scene classification task. Although CNN-based classifiers achieve state-of-the-art performance on the classification task, they are vulnerable to adversarial samples, which may cause security risks for RSI scene classification applications. Our method performs better than the compared method under both targeted and untargeted attacks; specifically, it has stronger attack strength and better attack stability. We also analyzed the PSNR values of the generated adversarial images, and the results indicate that our method's adversarial images have better image quality.
Adversarial samples are a potential threat to the RSI scene classification task. In future work, we will further investigate effective defense or detection strategies against adversarial samples for RSI scene classification.

Data Availability
The data used to support the findings of this study are available from the websites given in the corresponding papers.

Conflicts of Interest
The authors declare that there are no conflicts of interest regarding the publication of this paper.