A SAR Target Recognition Method Based on Decision Fusion of Multiple Features and Classifiers

A synthetic aperture radar (SAR) target recognition method combining multiple features and multiple classifiers is proposed. The Zernike moments, kernel principal component analysis (KPCA), and monogenic signals are used to describe SAR image features. The three types of features describe the SAR target's geometric shape features, projection features, and image decomposition features. Their combined use can effectively enhance the description of the target. In the classification stage, the support vector machine (SVM), sparse representation-based classification (SRC), and joint sparse representation (JSR) are used as the classifiers for the three types of features, respectively, and the corresponding decision variables are obtained. For the decision variables of the three types of features, multiple sets of weight vectors are used for weighted fusion to determine the target label of the test sample. Experiments are performed on the MSTAR dataset under the standard operating condition (SOC) and extended operating conditions (EOCs). The experimental results verify the effectiveness, robustness, and adaptability of the proposed method.


Introduction
Synthetic aperture radar (SAR) obtains effective ground observation data through two-dimensional high-resolution imaging, supporting related applications in the military and civilian fields. SAR target recognition uses feature analysis and classification decisions to determine the target class [1]. Feature extraction obtains effective feature descriptions of the target in SAR images, including geometric shape, scattering center, and projection transformation features. The studies in [2][3][4][5][6][7] designed SAR target recognition methods based on geometric features such as the target region, contour, and shadow. In [2,3], the Zernike moments were used to describe the target region. In [4], a recognition method was proposed based on target region matching. In [6], the target contour distribution was modeled based on the elliptic Fourier descriptor. Papson proposed a SAR target recognition method based on shadow features [7]. The scattering center features describe the target's backscattering electromagnetic characteristics in the high-frequency region. In [8][9][10], SAR target recognition methods were developed using attributed scattering centers as the basic feature. Projection transformation features can be further divided into projection ones and image decomposition ones. Projection features mainly use mathematical transformation algorithms, such as principal component analysis (PCA) or kernel PCA (KPCA) [11,12] and nonnegative matrix factorization (NMF) [13]. Image decomposition methods include the wavelet analysis [14], monogenic signal [15], and bidimensional empirical mode decomposition (BEMD) [16]. The methods mentioned above all carry out target recognition based on a single feature. In fact, combining a variety of different features can effectively improve the performance of SAR target recognition. In [17], multitask compressive sensing was employed to implement joint classification of multiple features of SAR images.
In [18], a multifeature hierarchical decision fusion method was proposed. In [19], a multifeature and multirepresentation fusion strategy was proposed for target recognition. According to the extracted feature categories, the classifier analyzes and makes decisions accordingly to obtain the target label of the unknown sample. In [20], a SAR target recognition method was developed based on the K-nearest neighbor (KNN) classifier. The support vector machine (SVM) was employed in [19,20] as a base classifier to design SAR target recognition methods. Sparse representation-based classification (SRC) was employed for SAR target recognition in [12][13][14][15][16][17][18][19][20][21][22]. With the development of deep learning in recent years, the convolutional neural network (CNN) has gradually become a popular tool in SAR target recognition, and a number of representative methods have emerged [23][24][25][26][27][28][29][30]. Similarly, classifier fusion has also been used and verified in SAR target recognition. In [21], SVM and SRC were used for fused classification. In [24], the CNN and SVM were combined to further improve the classification performance.
This study proposes a SAR target recognition method based on multiple features and multiple classifiers. Three types of features, i.e., Zernike moments, KPCA, and monogenic signals, are used for feature extraction. Zernike moments describe the geometric shape of the target and have the advantage of translation and rotation invariance. These features have clear physical meaning and reflect the details of the target [2,3,19]. KPCA extracts the projection features of the original image, which provide a concise feature vector and have a certain nonlinear description ability [11,12]. The monogenic signal can effectively decompose the SAR image and obtain multilevel and multifrequency description characteristics [15].
Therefore, the three types of features have good complementarity and can provide more sufficient discriminative information for decision-making. In the classification stage, SVM, SRC, and joint sparse representation (JSR) are used as the classifiers for Zernike moments, KPCA feature vectors, and monogenic features, respectively, to obtain the corresponding decision variables. On this basis, multiple sets of linear weights are designed to perform weighted fusion on the decision variables of the three types of features [31][32][33][34][35][36], which finally determines the target label of the test sample. In the experiments, the proposed method is tested under the standard operating condition (SOC) and extended operating conditions (EOCs) [37][38][39][40][41][42] based on the MSTAR dataset. The recognition results and comparative analysis verify the effectiveness and robustness of the proposed method.

Zernike Moment.
Zernike moments are widely used in the description of SAR target regions because of their translation and rotation invariance and noise robustness [2,3,19]. For the image I(r, θ) in polar coordinates, the Zernike moment of order n with repetition l is calculated as follows:

Z_nl = ((n + 1)/π) ∫∫_{x² + y² ≤ 1} I(r, θ) V*_nl(r, θ) r dr dθ, (1)

where n = 0, 1, . . ., ∞; l = 0, ±1, . . .; n − |l| is even; and |l| ≤ n. The Zernike polynomials V_nl(r, θ) = R_nl(r)e^{ilθ} are a set of orthogonal, complete, complex-valued functions on the unit circle x² + y² ≤ 1, which satisfy the following condition:

∫∫_{x² + y² ≤ 1} V*_nl(r, θ) V_mk(r, θ) r dr dθ = (π/(n + 1)) δ_nm δ_lk. (2)

On this basis, the rotation-invariant feature is constructed as follows:

A_nl = |Z_nl|. (3)

Based on the above equations, the Zernike moments of the input image can be calculated at any order. Among them, the high-order moments can effectively reflect the detailed information in the image, which is beneficial to improving the recognition performance. In this study, the 3rd-8th order Zernike moments are used to describe the SAR target region and are concatenated into a feature vector.
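As a minimal illustration of this feature extraction step (a sketch, not the authors' implementation), the following Python code evaluates the rotation-invariant magnitudes |Z_nl| of the 3rd-8th order moments for a square image mapped onto the unit disc. The discrete approximation of the integral and the grid normalization are simplifying assumptions.

```python
import numpy as np
from math import factorial

def radial_poly(n, l, r):
    """Zernike radial polynomial R_nl(r); n - |l| must be even."""
    l = abs(l)
    out = np.zeros_like(r)
    for s in range((n - l) // 2 + 1):
        c = ((-1) ** s * factorial(n - s)
             / (factorial(s) * factorial((n + l) // 2 - s)
                * factorial((n - l) // 2 - s)))
        out += c * r ** (n - 2 * s)
    return out

def zernike_moment(img, n, l):
    """Magnitude |Z_nl| for a square image whose pixel grid is mapped
    onto the unit disc (discrete approximation of the integral)."""
    h, w = img.shape
    y, x = np.mgrid[0:h, 0:w]
    x = (2 * x - w + 1) / (w - 1)      # map columns to [-1, 1]
    y = (2 * y - h + 1) / (h - 1)      # map rows to [-1, 1]
    r = np.sqrt(x ** 2 + y ** 2)
    theta = np.arctan2(y, x)
    mask = r <= 1.0                    # keep only pixels inside the disc
    V_conj = radial_poly(n, l, r) * np.exp(-1j * l * theta)  # V*_nl
    Z = ((n + 1) / np.pi) * np.sum(img[mask] * V_conj[mask]) \
        * (2.0 / w) * (2.0 / h)        # pixel-area weight
    return np.abs(Z)

def zernike_features(img, orders=range(3, 9)):
    """Concatenate |Z_nl| for the 3rd-8th orders (all valid l >= 0)."""
    return np.array([zernike_moment(img, n, l)
                     for n in orders for l in range(n % 2, n + 1, 2)])
```

For 64×64 MSTAR target chips, `zernike_features` would be applied to the segmented target region before classification.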

KPCA.
PCA calculates the best projection directions by analyzing the data structure of a large number of samples to achieve data dimensionality reduction [11]. For the sample set X = [x_1, x_2, . . ., x_n], X ∈ R^{m×n}, the mean and covariance matrix are first calculated as follows:

μ = (1/n) Σ_{i=1}^{n} x_i, (4)

C = (1/n) Σ_{i=1}^{n} (x_i − μ)(x_i − μ)^T. (5)

The eigenvalues and eigenvectors of the covariance matrix are calculated as follows:

CD = D diag(V). (6)

In equation (6), the vector V stores the eigenvalues, and each eigenvalue corresponds to an eigenvector stored as a column of the matrix D. By selecting the eigenvectors corresponding to the several largest eigenvalues, a projection matrix can be constructed for feature extraction of the samples.
KPCA is an extension of PCA to nonlinear spaces, which can process datasets with nonlinear structure more effectively [11,12]. KPCA first maps the data into a high-dimensional feature space through a kernel function (typically a polynomial or radial basis function kernel) and then performs the PCA operations in that space. In this study, KPCA is used to process the SAR images and obtain 80-dimensional feature vectors for subsequent classification.
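This step can be sketched with scikit-learn's `KernelPCA` as follows. The RBF kernel, the `gamma` value, and the toy random data standing in for flattened SAR chips are assumptions; the 80-dimensional output follows the text.

```python
import numpy as np
from sklearn.decomposition import KernelPCA

# Toy stand-in for flattened SAR image chips: 200 samples of 64x64 pixels.
rng = np.random.default_rng(0)
X = rng.random((200, 64 * 64))

# RBF kernel with 80 components, as in the text; gamma is a tunable guess.
kpca = KernelPCA(n_components=80, kernel="rbf", gamma=1e-4)
features = kpca.fit_transform(X)   # training features, shape (200, 80)
test_feat = kpca.transform(X[:5])  # project new samples with the same map
```

In practice `gamma` (and the kernel choice) would be tuned on the training set before the SRC stage.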

Monogenic Signal.
The monogenic signal is a two-dimensional signal decomposition algorithm, which can effectively analyze the multilevel spectrum characteristics of the original image [15]. For the input image f(z), its Riesz transform is calculated as f_R(z), where z = (x, y)^T represents the two-dimensional coordinates. The corresponding monogenic signal f_M(z) is calculated as follows:

f_M(z) = f(z) + (i, j) · f_R(z) = f(z) + i f_x(z) + j f_y(z), (7)

where i and j are the imaginary units. f(z) and its Riesz transform correspond to the real part and the imaginary part of the monogenic signal, respectively.

Scientific Programming
Accordingly, the characteristics of the monogenic signal are defined as follows:

A(z) = sqrt(f²(z) + f_x²(z) + f_y²(z)),
φ(z) = arctan(sqrt(f_x²(z) + f_y²(z)) / f(z)),
θ(z) = arctan(f_y(z) / f_x(z)). (8)

In the above equations, f_x(z) and f_y(z) correspond to the i-imaginary part and j-imaginary part of the monogenic signal, respectively; A(z) represents the amplitude information; φ(z) and θ(z) correspond to the local phase and local orientation (azimuth) information, respectively. The three types of features obtained from the monogenic signal decomposition have different characteristics. Among them, A(z) mainly reflects the gray-level distribution of the image, whereas φ(z) and θ(z) reflect the local detail information and shape characteristics of the image. Therefore, the joint use of the characteristics of the monogenic signal is conducive to constructing a more informative characterization. In this study, according to [9], the SAR image is decomposed by the monogenic signal, and three corresponding feature vectors are obtained through downsampling and vector concatenation.
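A compact sketch of this decomposition (one plausible implementation, not the authors' code) computes the Riesz transform in the frequency domain and then the amplitude, local phase, and orientation maps:

```python
import numpy as np

def monogenic(img):
    """Monogenic decomposition of a 2-D image via the frequency-domain
    Riesz transform: returns amplitude A, local phase phi, and local
    orientation theta (single scale; band-pass filtering omitted)."""
    h, w = img.shape
    v = np.fft.fftfreq(h)[:, None]     # vertical frequencies
    u = np.fft.fftfreq(w)[None, :]     # horizontal frequencies
    q = np.sqrt(u ** 2 + v ** 2)
    q[0, 0] = 1.0                      # avoid division by zero at DC
    F = np.fft.fft2(img)
    # Riesz transfer functions: H_x = -i u / |q|, H_y = -i v / |q|
    fx = np.real(np.fft.ifft2(-1j * u / q * F))
    fy = np.real(np.fft.ifft2(-1j * v / q * F))
    A = np.sqrt(img ** 2 + fx ** 2 + fy ** 2)          # amplitude
    phi = np.arctan2(np.sqrt(fx ** 2 + fy ** 2), img)  # local phase
    theta = np.arctan2(fy, fx)                         # local orientation
    return A, phi, theta
```

A multiscale version would first band-pass the image (e.g., with log-Gabor filters) and decompose each band, then downsample and concatenate the three maps into the feature vectors used by JSR.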

SVM for Zernike Moments.
For binary classification problems, SVM obtains the best classification interface by minimizing the structural risk [24]. For a sample x with an unknown label, the decision hyperplane of the SVM classifier is as follows:

f(x) = w^T ϕ(x) + b, (9)

where w is the weight coefficient vector describing the parameters of the hyperplane, ϕ(·) represents the nonlinear mapping induced by the kernel function, and b represents the bias. SVM was originally proposed for binary recognition, i.e., the hyperplane in equation (9) was used to distinguish between two classes. Later, researchers extended it to multiclass classification through strategies such as "one-versus-one" and "one-versus-rest." Through training on a large number of labeled samples, a suitable classification surface can be obtained. At the same time, choosing a suitable kernel function can effectively enhance the nonlinear classification ability of SVM. When SVM is used for multiclass classification, the (pseudo) posterior probability of each class is output to represent the possibility that the current sample belongs to a certain training class. The label of the test sample can then be determined by the maximum posterior probability principle. In this study, SVM is used to classify the Zernike moments [43][44][45][46].
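The multiclass-SVM-with-posteriors step can be sketched with scikit-learn's `SVC` (Platt scaling via `probability=True`); the RBF kernel and the toy 21-dimensional Gaussian clusters standing in for Zernike feature vectors are assumptions.

```python
import numpy as np
from sklearn.svm import SVC

# Toy Zernike-like feature vectors for 3 classes (21-D, an assumption).
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(c, 0.3, (40, 21)) for c in range(3)])
y = np.repeat([0, 1, 2], 40)

# RBF-kernel SVM; probability=True enables (pseudo) posterior estimates.
clf = SVC(kernel="rbf", probability=True, random_state=0).fit(X, y)
probs = clf.predict_proba(X[:5])   # one posterior vector per sample
labels = probs.argmax(axis=1)      # maximum-posterior decision
```

The posterior vectors, rather than hard labels, are what enter the decision fusion stage.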

SRC for KPCA Features.
SRC uses sparse representation as the basic algorithm to characterize test samples with unknown classes and then determines their category based on an analysis of the reconstruction errors [14,15,23]. Dictionary construction is one of the key steps in SRC. Existing methods mostly use the samples of all training classes to construct a global dictionary A = [A_1, A_2, . . ., A_C] ∈ R^{d×N}, in which A_i ∈ R^{d×N_i} (i = 1, 2, . . ., C) contains all training samples from the i th training class. Accordingly, the sparse reconstruction of the test sample y is described as follows:

min_α ||α||_0, s.t. ||y − Aα||_2 ≤ ε, (10)

where α is the sparse representation coefficient vector to be solved and ε is the reconstruction error threshold. Under the constraint of the ℓ0 norm, it is very difficult to solve the sparse representation coefficients in equation (10). For this reason, researchers used ℓ1 norm minimization [23] to approximate the problem in equation (10) and converted it into a convex optimization problem that is easy to solve. In addition, the orthogonal matching pursuit (OMP) algorithm [14], Bayesian compressive sensing (BCS) [15], and other algorithms can also be used to obtain an approximate solution of equation (10). Based on the solution, the decision process is performed as follows:

r(i) = ||y − A_i α_i||_2, identity(y) = argmin_i r(i), (11)

where α_i are the coefficients corresponding to the i th training class extracted from α, and r(i) (i = 1, 2, . . ., C) represent the reconstruction errors of the different classes. Studies have shown that SRC has good robustness against noise interference and occlusion [23], so it can effectively complement SVM. This study uses SRC to classify the KPCA feature vectors.
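An OMP-based sketch of this classifier (one of the approximate solvers mentioned above; parameters are assumptions) is:

```python
import numpy as np
from sklearn.linear_model import OrthogonalMatchingPursuit

def src_classify(A, labels, y, n_nonzero=10):
    """SRC sketch: find a sparse code for y over the global dictionary A
    (columns = training samples, unit-norm), then assign the class whose
    atoms yield the smallest reconstruction error."""
    omp = OrthogonalMatchingPursuit(n_nonzero_coefs=n_nonzero,
                                    fit_intercept=False)
    omp.fit(A, y)                 # A: (d, N) dictionary, y: (d,) sample
    alpha = omp.coef_             # sparse coefficient vector, shape (N,)
    classes = np.unique(labels)
    errors = np.array([
        # keep only the coefficients of class c, then reconstruct
        np.linalg.norm(y - A @ np.where(labels == c, alpha, 0.0))
        for c in classes
    ])
    return classes[int(np.argmin(errors))], errors
```

The per-class error vector `errors` is exactly the decision variable that the fusion stage later converts into pseudo-probabilities.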

JSR for Monogenic Features.
For the different types of features extracted from the same SAR image, there is a certain inherent correlation. For this reason, this study uses JSR to represent them jointly, thereby improving the overall accuracy [5,15,26]. The three monogenic feature vectors obtained from the test sample y are denoted as [y^(1), y^(2), y^(3)]. This study uses JSR for their classification. The basic representation process is as follows:

min_β Σ_{k=1}^{3} ||y^(k) − Φ^(k) x^(k)||_2², (12)

where Φ^(k) is the global dictionary corresponding to the k th feature, x^(k) is the corresponding coefficient vector, and β = [x^(1) x^(2) x^(3)]. The objective function in equation (12) does not take into account the inherent relationship among the three types of features. This goal can be achieved by constraining the sparse coefficient matrix β. The updated objective function is as follows:

min_β Σ_{k=1}^{3} ||y^(k) − Φ^(k) x^(k)||_2² + λ ||β||_{1,2}. (13)

In equation (13), the ℓ1/ℓ2 norm ||β||_{1,2} (the sum of the ℓ2 norms of the rows of β) is used to constrain β, which can effectively exploit the internal relationship among the three types of features.
According to the obtained coefficient matrix β, the sum of the reconstruction errors of each class over the three types of features can be calculated, and then the target label of the test sample can be decided:

identity(y) = argmin_i Σ_{k=1}^{3} ||y^(k) − Φ_i^(k) x_i^(k)||_2², (14)

where Φ_i^(k) and x_i^(k) are the part of the dictionary and the corresponding coefficient vector of the i th class for the k th feature.
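The row-sparsity constraint can be approximated greedily with simultaneous OMP, which forces the three feature channels to share one support (a sketch of the idea; the true ℓ1/ℓ2-regularized solver and all sizes here are assumptions):

```python
import numpy as np

def somp_classify(dicts, labels, ys, n_atoms=10):
    """Joint sparse representation via simultaneous OMP: the K channels
    share one support (row-sparse beta). Class = smallest summed
    reconstruction error. dicts[k]: (d_k, N) with unit-norm columns,
    all dictionaries sharing the same column (training-sample) order."""
    K, N = len(dicts), dicts[0].shape[1]
    support = []
    coefs = [np.zeros(N) for _ in range(K)]
    residuals = [y.astype(float) for y in ys]
    for _ in range(n_atoms):
        # pick the atom most correlated with the residuals across channels
        corr = sum(np.abs(d.T @ r) for d, r in zip(dicts, residuals))
        corr[support] = -np.inf
        support.append(int(np.argmax(corr)))
        for k in range(K):  # least-squares refit on the shared support
            sub = dicts[k][:, support]
            x, *_ = np.linalg.lstsq(sub, ys[k], rcond=None)
            coefs[k][:] = 0.0
            coefs[k][support] = x
            residuals[k] = ys[k] - sub @ x
    errors = []
    for c in np.unique(labels):  # summed per-class reconstruction errors
        err = sum(np.linalg.norm(
                  ys[k] - dicts[k] @ np.where(labels == c, coefs[k], 0.0)) ** 2
                  for k in range(K))
        errors.append(err)
    return int(np.unique(labels)[int(np.argmin(np.array(errors)))])
```

The shared support plays the role of the ℓ1/ℓ2 row constraint: an atom is either used by all three monogenic channels or by none.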

Decision Fusion.
For SRC and JSR, the output result is the reconstruction error vector [r_1, r_2, . . ., r_C]. First, these reconstruction errors are converted into a probability vector according to the following:

p_i = exp(−r_i) / Σ_{j=1}^{C} exp(−r_j). (15)

This study uses multiple sets of weights for linear fusion, so as to obtain a more robust result. Denote r_i^k as the decision variable of the k th feature for the i th class, and first construct N weight vectors:

W = [w_1, w_2, . . ., w_N] ∈ R^{K×N}, (16)

where each column of the matrix W represents a weight vector, which satisfies

Σ_{k=1}^{K} w_kj = 1, w_kj ≥ 0. (17)

The weighting process under the j th weight vector is as follows:

R_i^j = Σ_{k=1}^{K} w_kj r_i^k. (18)

Therefore, under the group of N random weight vectors, the i th class obtains a weighted result R = [R_i^1 R_i^2 · · · R_i^N], which is called the fusion decision vector. Finally, the N decision variables are averaged as the final decision value of the i th class, and the target label of the test sample is determined by comparing these values across the classes. It can be seen that, under the action of multiple sets of weight vectors, the three types of features participating in the fusion can be fully analyzed.
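The fusion procedure can be sketched as follows. The softmax-style error-to-probability conversion and the Dirichlet sampling of the weight vectors are assumptions (the text does not specify how the random weights are generated), but the weighting-then-averaging structure follows the description above.

```python
import numpy as np

def probs_from_errors(r):
    """Convert reconstruction errors to pseudo-probabilities:
    smaller error -> larger probability (one common, assumed choice)."""
    p = np.exp(-np.asarray(r, dtype=float))
    return p / p.sum()

def fuse(decisions, n_weights=100, seed=0):
    """decisions: (K, C) array of per-classifier class probabilities.
    Build N random weight vectors (nonnegative, summing to 1), form the
    fusion decision vectors, average them, and pick the best class."""
    K, C = decisions.shape
    rng = np.random.default_rng(seed)
    W = rng.dirichlet(np.ones(K), size=n_weights)  # (N, K), rows sum to 1
    R = W @ decisions                              # (N, C) weighted results
    fused = R.mean(axis=0)                         # average over weight sets
    return int(np.argmax(fused)), fused
```

Here the rows of `W` correspond to the columns of the weight matrix in the text, so each row satisfies the sum-to-one, nonnegativity constraint.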

MSTAR Dataset.
The MSTAR dataset is a representative dataset for the testing and evaluation of current SAR target recognition methods. The dataset contains ten types of military vehicle targets, as shown in Figure 1. The SAR images of the various targets were acquired by an X-band airborne radar with a resolution of 0.3 m. The MSTAR dataset has abundant samples, and a number of representative operating conditions can be set accordingly. The azimuth angles of the various targets cover 0°-360°, which allows comprehensive training and testing. Some targets include several different configurations, which can be used to investigate the performance of recognition methods under configuration variance. Some targets have multiple different depression angles, which can be used to investigate the performance of recognition methods under large depression angle differences.
During the experiments, several existing methods are compared, which are divided into three categories.
The first category employs the three types of features used in this study, but with a single feature each; these methods are denoted as Zernike [2], KPCA [12], and monogenic [15], respectively. The second category comprises multifeature fusion methods, i.e., the methods in [17,18], denoted as fusion 1 and fusion 2, respectively. The third category is the currently popular deep learning methods, represented by the A-ConvNet method in [23]. Subsequent experiments are first carried out under SOC and then under two EOCs: configuration variance and depression angle variance.

SOC.
In the SAR target recognition problem, SOC generally refers to a high overall similarity between the test and training samples, so the recognition difficulty is relatively low. Table 1 provides the training and test samples under SOC, which are from 17° and 15° depression angles, respectively. The test and training samples of the various targets are from the same target configurations. The proposed method is used to classify the 10 types of targets shown in Figure 1, and the confusion matrix shown in Figure 2 is obtained. In it, the diagonal values mark the correct recognition rates of the corresponding targets. In the experiment, the average recognition rate is defined as the proportion of correctly recognized samples among all test samples. The average recognition rate of the proposed method for the 10 types of targets is 99.46%. Table 2 compares the average recognition rates of the various methods under the current experimental setting, which are all higher than 98%, reflecting the low difficulty of the recognition problem under SOC. Compared with the three single-feature methods, this study significantly improves the final recognition performance through the combined use of the features. Compared with the other two multifeature fusion methods, the performance of this study is better, which shows that the designed decision fusion algorithm is more effective. The CNN method can achieve high performance under SOC, but it is still lower than the proposed method. In summary, the proposed method achieves superior performance under SOC, which verifies its effectiveness.

EOCs.
EOCs are defined with reference to SOC and mainly examine the differences between the test and training samples caused by factors such as target, background, and sensor variations. The typical EOCs that can be set based on the MSTAR dataset mainly include configuration variance and depression angle variance, which are tested in the subsequent experiments.

Configuration Variance.
The configuration variance mainly arises from changes in the target itself and refers to the case in which the test samples and the training samples come from different configurations of the same target. Table 3 describes the current experimental scenario. In it, the test samples and training samples of the BMP2 and T72 targets are from different configurations. The appearance similarity between BTR70 and these two targets is relatively high (as shown in Figure 1), and its introduction increases the overall recognition difficulty. The various methods are tested under the current conditions, and their average recognition rates are given in Table 4. Compared with the three single-feature methods, the performance advantage of the proposed method is very significant, indicating that the joint representation and weighted fusion of the features can effectively improve the robustness of recognition. Compared with the two multifeature fusion methods, the recognition rate of the proposed method is higher, reflecting its stronger robustness.
The performance degradation of the CNN method under the current conditions is very obvious, mainly because the training samples cover the test samples poorly, so the adaptability of the trained network decreases accordingly.
Depression Angle Variance.
As the relative viewing angle between the target and the sensor changes, the corresponding SAR images will also show larger differences. In particular, when the test samples and the training samples come from significantly different depression angles, the recognition difficulty increases. The proposed method achieves the highest recognition rate at both depression angles, indicating that it has better robustness to depression angle variance. Through the effective fusion of the three types of features, the proposed method can more comprehensively account for the image changes caused by variances in the depression angle, so as to obtain more reliable recognition results.

Conclusion
In this study, a multifeature and multiclassifier SAR target recognition method is proposed. Zernike moments, KPCA, and the monogenic signal are used to describe the characteristics of the original SAR image, and the corresponding feature vectors are obtained. In the classification stage, SVM, SRC, and JSR are used to make decisions on the three types of features, and then their decision vectors are weighted and fused based on multiple weight vectors. Finally, the target label of the test sample is determined according to the fused decision variables. The three types of features and the three classifiers have good complementarity, so they can provide more effective information for target recognition. In the experiments, the proposed method is tested and verified under SOC, configuration variance, and depression angle variance based on the MSTAR dataset. The results show the performance advantages of the proposed method.

Data Availability
The dataset used to support the findings of this study is available from the corresponding author upon request.

Conflicts of Interest
The authors declare that they have no conflicts of interest.