Hierarchical Recognition System for Target Recognition from Sparse Representations

A hierarchical recognition system (HRS) based on a constrained Deep Belief Network (DBN) is proposed for SAR Automatic Target Recognition (SAR ATR). As a classical Deep Learning method, DBN has shown great performance on data reconstruction, big data mining, and classification. However, few works have applied Deep Learning methods to small data problems such as SAR ATR. In HRS, the deep structure and pattern classifier are combined to solve small data classification problems. After building the DBN with multiple Restricted Boltzmann Machines (RBMs), hierarchical features can be obtained and then fed to the classifier directly. To obtain a more natural sparse feature representation, the Constrained RBM (CRBM) is proposed by solving a generalized optimization problem. Three RBM variants, $L_1$-RBM, $L_2$-RBM, and $L_{1/2}$-RBM, are presented and introduced to HRS in this paper. The experiments on the MSTAR public dataset show that the proposed HRS with CRBM outperforms current pattern recognition methods in SAR ATR, such as PCA + SVM, LDA + SVM, and NMF + SVM.


Introduction
Synthetic Aperture Radar (SAR) Automatic Target Recognition (ATR) plays an important role in military and civil applications, such as social security, environmental monitoring, and national defense [1][2][3][4][5]. Most current research focuses on pattern features [6][7][8] or pattern classifiers [9,10]. Pattern recognition methods have shown excellent ability in classifying small data. However, if the number of samples is huge, pattern recognition methods are slow and inefficient. With the development of SAR imaging ability, more data can be captured. The data dimensionality is also increasing, which means that more powerful algorithms are needed.
Since Hinton and Salakhutdinov proposed the Deep Auto-Encoder networks [11], Deep Learning has been a research hot spot in recent years. Deep Learning algorithms have shown great performance on big data reconstruction, data mining, and classification [12], as they can learn hierarchical representations of high-dimensional data, and they have been applied in many fields, such as handwritten digit recognition [13,14], object detection [15][16][17], and scene classification [18][19][20]. Thus introducing Deep Learning to SAR ATR is necessary and urgent. Few researchers have started such work. The Auto-Encoder is applied to SAR ATR directly in [21], but few of the theoretical problems are addressed. In [22], the features of the SAR target and shadow are extracted with a multilayer Auto-Encoder; then the combined features are fed to a Synergetic Neural Network (SNN) for recognition. However, the recognition performance is not very prominent.
For current SAR image databases, a hierarchical recognition system (HRS) combining a Deep Belief Network (DBN) and a pattern classifier is proposed in this paper. The proposed HRS has the advantages of both deep structure and pattern recognition. Based on the great reconstruction ability of DBN, features can be obtained in each layer. These features can be fed to a classifier for high-performance recognition.
Meanwhile, in order to obtain a sparse feature representation, the Constrained Restricted Boltzmann Machine (CRBM) is defined based on a generalized optimization problem. Unlike the Sparse RBM (SRBM), which constrains the expectation of the hidden units to a certain value, the constraint in CRBM is performed on the probability density of the hidden units directly to obtain a sparser solution. Three RBM variants with norm constraints, $L_1$-RBM, $L_2$-RBM, and $L_{1/2}$-RBM, are presented. Stacked CRBMs are used to build a Constrained DBN (CDBN), which can be introduced to the proposed HRS.
Judging from the performance on the MSTAR public dataset, the proposed HRS with CRBM can effectively solve the small dataset recognition problem and outperforms current pattern recognition methods in SAR ATR, such as PCA + SVM, LDA + SVM, and NMF + SVM [23] ("PCA + SVM" means extracting the target feature by PCA and using the SVM for classification; "LDA + SVM" and "NMF + SVM" have similar meanings).
The contribution of this paper includes two aspects. One is a hierarchical recognition system built for SAR ATR, which can obtain hierarchical features for recognition. The other is the CRBM, which is proposed to obtain a more natural sparse feature representation and is introduced to HRS for better performance.
The rest of this paper is organized as follows. Section 2 introduces the framework of the proposed HRS and the hierarchical feature representation in HRS. Section 3 describes the Constrained RBM with a generalized optimization problem and presents three specific RBM variants. The recognition experiments based on the MSTAR database are performed and the results are analyzed in Section 4. Finally, the conclusion and future work are stated in Section 5.

Hierarchical Recognition System
The purpose of the proposed Hierarchical Recognition System is to solve small data classification problems by combining the deep structure and the pattern classifiers. The framework of HRS is shown in Figure 1.

Deep Structure.
Suppose the deep structure in HRS has $N$ layers. In each layer, features can be obtained by a certain feature extraction algorithm. Then the features are fed to pattern classifiers for recognition. The features in each layer can be the same or different, and the classifier in each layer can be the same or not. For the convenience of measurement and comparison, the same kind of feature and the same classifier are used in every layer in this paper.
The deep structure of Deep Belief Networks (DBN) is mainly discussed in this paper. The DBN is built by stacking Restricted Boltzmann Machines (RBMs). In Figure 1, the left part can be seen as a DBN with $N$ layers. In each layer, the reconstruction work is done by an RBM.
The deep structure need not come from a DBN only; it can also come from a Stacked Auto-Encoder [24] or Convolutional Neural Networks (CNN) [25,26].

Hierarchical Features Representation.
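Just like the feed-forward perception in a neural network, the features in Layer $i$ are obtained in the following way:

$$\mathrm{FEA}_i = \operatorname{sigm}\left(W_i^{T}\,\mathrm{FEA}_{i-1}\right), \quad \mathrm{FEA}_0 = X, \tag{1}$$

where $\mathrm{FEA}_i$ means the feature obtained in Layer $i$, $X$ stands for the samples, sigm is the function $f(x) = \operatorname{sigm}(x) = 1/(1 + e^{-x})$, and $W_i^{T}$ indicates the transpose of the weight basis matrix $W_i$ in Layer $i$.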
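The layer-by-layer feature computation can be illustrated with a minimal sketch. It assumes that each layer applies the sigmoid to the transposed weight matrix times the previous layer's output, that the raw sample is the input to the first layer, and that no bias term is used; the function names, layer sizes, and random weights are hypothetical and stand in for trained RBM weights.

```python
import math
import random


def sigm(x):
    """Logistic sigmoid, sigm(x) = 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + math.exp(-x))


def layer_features(weights, inputs):
    """Compute sigm(W^T x) for one sample.

    weights: list of rows with shape (n_in, n_out); inputs: list of length n_in.
    """
    n_out = len(weights[0])
    return [
        sigm(sum(weights[i][j] * inputs[i] for i in range(len(inputs))))
        for j in range(n_out)
    ]


def hierarchical_features(weight_stack, sample):
    """Return the features of every layer for one sample (assumed recursion)."""
    features = []
    current = sample
    for weights in weight_stack:
        current = layer_features(weights, current)
        features.append(current)
    return features


# Toy example: 6 inputs -> 4 hidden -> 3 hidden, with random (untrained) weights.
rng = random.Random(0)
sizes = [6, 4, 3]
weight_stack = [
    [[rng.gauss(0.0, 0.1) for _ in range(sizes[k + 1])] for _ in range(sizes[k])]
    for k in range(len(sizes) - 1)
]
sample = [rng.random() for _ in range(sizes[0])]
fea = hierarchical_features(weight_stack, sample)
```

Each entry of `fea` is the feature vector of one layer, so any of them can be handed to a separate pattern classifier.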
The hierarchical features of the training and test samples obtained by DBN can be treated as the pattern features and fed to the pattern classifier for recognition work directly.

Constrained RBM
Due to the textural characteristics of SAR images, sparse representation is beneficial for SAR ATR [27,28]. In order to obtain a sparse representation and improve the performance of HRS, a sparse constraint is introduced to DBN, and the constraint is imposed on the RBM. In this section, a brief introduction to the RBM and to the Sparse RBM proposed by Lee et al. in 2007 is presented first. Then the Constrained RBM is proposed based on a generalized optimization problem. The CRBM can be introduced to DBN to build a Constrained DBN (CDBN).

Restricted Boltzmann Machine.
The DBN is built by stacking multiple RBMs. The RBM is a particular type of Markov random field that has a two-layer architecture [29,30]. The structure of the RBM is shown in Figure 2.
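The parameters $\theta = \{W, a, b\}$ can be updated by the Gradient Descent (GD) method:

$$\Delta W_{ij} = \varepsilon\left(\langle v_i h_j\rangle_{\text{data}} - \langle v_i h_j\rangle_{\text{recon}}\right),$$
$$\Delta a_i = \varepsilon\left(\langle v_i\rangle_{\text{data}} - \langle v_i\rangle_{\text{recon}}\right),$$
$$\Delta b_j = \varepsilon\left(\langle h_j\rangle_{\text{data}} - \langle h_j\rangle_{\text{recon}}\right), \tag{3}$$

where $\varepsilon$ is the learning rate, $a_i$ and $b_j$ are the biases of the visible unit $v_i$ and the hidden unit $h_j$, $\langle\cdot\rangle_{\text{data}}$ stands for the expectation over all samples, and $\langle\cdot\rangle_{\text{recon}}$ is the expectation over the reconstruction data obtained by Contrastive Divergence (CD) learning. For detailed information on CD learning, please refer to [31].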
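As an illustration of such an update, the following is a minimal sketch of one-step Contrastive Divergence (CD-1) for a binary RBM. It is not the authors' implementation: the function names, the toy data, the learning rate, and the use of reconstruction probabilities instead of sampled visible states are all assumptions made for the example.

```python
import math
import random


def sigm(x):
    return 1.0 / (1.0 + math.exp(-x))


def hidden_probs(v, W, b):
    """P(h_j = 1 | v) for every hidden unit j."""
    return [sigm(b[j] + sum(v[i] * W[i][j] for i in range(len(v))))
            for j in range(len(b))]


def visible_probs(h, W, a):
    """P(v_i = 1 | h) for every visible unit i."""
    return [sigm(a[i] + sum(h[j] * W[i][j] for j in range(len(h))))
            for i in range(len(a))]


def cd1_update(batch, W, a, b, lr, rng):
    """One CD-1 step over a batch of binary samples; updates W, a, b in place."""
    n_vis, n_hid, m = len(a), len(b), len(batch)
    dW = [[0.0] * n_hid for _ in range(n_vis)]
    da = [0.0] * n_vis
    db = [0.0] * n_hid
    for v0 in batch:
        ph0 = hidden_probs(v0, W, b)                       # data statistics
        h0 = [1.0 if rng.random() < p else 0.0 for p in ph0]
        v1 = visible_probs(h0, W, a)                       # reconstruction
        ph1 = hidden_probs(v1, W, b)                       # reconstruction statistics
        for i in range(n_vis):
            da[i] += v0[i] - v1[i]
            for j in range(n_hid):
                dW[i][j] += v0[i] * ph0[j] - v1[i] * ph1[j]
        for j in range(n_hid):
            db[j] += ph0[j] - ph1[j]
    # Apply the averaged "data minus reconstruction" differences.
    for i in range(n_vis):
        a[i] += lr * da[i] / m
        for j in range(n_hid):
            W[i][j] += lr * dW[i][j] / m
    for j in range(n_hid):
        b[j] += lr * db[j] / m


# Toy run: two binary patterns, 6 visible units, 3 hidden units.
rng = random.Random(0)
n_vis, n_hid = 6, 3
W = [[rng.gauss(0.0, 0.01) for _ in range(n_hid)] for _ in range(n_vis)]
a = [0.0] * n_vis
b = [0.0] * n_hid
batch = [[1, 1, 1, 0, 0, 0], [0, 0, 0, 1, 1, 1]] * 4
for _ in range(50):
    cd1_update(batch, W, a, b, lr=0.1, rng=rng)
```

Averaging the differences over the batch plays the role of the expectations over the data and over the reconstruction.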

Sparse RBM.
It is believed that solving the minimal reconstruction error optimization problem under a sparse constraint can obtain better performance in feature representation. For sparse representation, the SRBM constrains the activation of the hidden units at a fixed level $p$. To achieve this purpose, a regularization term is added to the log-likelihood cost function of the RBM. The optimization problem in SRBM can be described as follows [13]:

$$\underset{\{W,a,b\}}{\text{minimize}}\; -\sum_{l=1}^{m}\log\sum_{h}P\left(v^{(l)},h^{(l)}\right) + \lambda\sum_{j=1}^{n}\left|\,p-\frac{1}{m}\sum_{l=1}^{m}\mathbb{E}\left[h_j^{(l)}\mid v^{(l)}\right]\right|^{2}, \tag{4}$$

where $\{v^{(1)}, \ldots, v^{(l)}, \ldots, v^{(m)}\}$ stands for the training set including $m$ examples, $\lambda$ is a regularization constant, and $p$ is a parameter to control the sparseness of the hidden units $h_j$.
The update of the log-likelihood term can be computed by CD learning. The regularization term on the right-hand side of (4) is updated by the gradient descent method. In the gradient step, SRBM only updates the bias term $b_j$ instead of updating all the parameters. The update of SRBM thus just adds one additional update rule for $b_j$ following the last rule in (3).
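The extra bias rule can be sketched as follows. The sketch assumes a squared-deviation penalty between a target activation level and the mean hidden activation probability, differentiated through the sigmoid; the function name `sparsity_bias_step` and all numeric values are hypothetical, and this is an illustration rather than the exact rule used in the experiments.

```python
def sparsity_bias_step(hid_probs, b, target, lam, lr):
    """Return hidden biases after one gradient step on the sparsity penalty.

    hid_probs[l][j] is P(h_j = 1 | v^(l)); target is the desired activation level.
    Assumed penalty: lam * sum_j (target - mean_l hid_probs[l][j]) ** 2.
    """
    m = len(hid_probs)
    new_b = []
    for j in range(len(b)):
        mean_act = sum(row[j] for row in hid_probs) / m
        # d(mean_act)/d(b_j) for sigmoid units is the mean of p * (1 - p).
        slope = sum(row[j] * (1.0 - row[j]) for row in hid_probs) / m
        grad = -2.0 * lam * (target - mean_act) * slope
        new_b.append(b[j] - lr * grad)
    return new_b


# Toy example: unit 0 is far too active, unit 1 is close to the target level.
hid_probs = [[0.9, 0.1], [0.8, 0.2]]
new_b = sparsity_bias_step(hid_probs, b=[0.0, 0.0], target=0.1, lam=1.0, lr=1.0)
```

In this toy case the bias of the over-active unit is pushed down much more strongly than that of the unit already near the target, which is the intended effect of the sparsity term.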

Constrained RBM.
The SRBM constrains the expectation of the hidden unit values of the RBM for sparse representation but does not constrain the probability density function of the hidden units directly. In this paper, the Constrained RBM (CRBM) is proposed by extending (4) to a generalized optimization problem for a sparser representation. The constraint is performed on the probability density of the hidden units, which includes the case of constraining the expectation. The optimization problem can be described as

$$\underset{\{W,a,b\}}{\text{minimize}}\; -\sum_{l=1}^{m}\log\sum_{h}P\left(v^{(l)},h^{(l)}\right) + \lambda f\left(P\left[h_j^{(l)}\mid v^{(l)}\right]\right), \tag{5}$$

where $f(\cdot)$ is a function of $P[h_j^{(l)} \mid v^{(l)}]$, the conditional probability of the hidden unit $h_j$ given the training sample $v^{(l)}$. Different from SRBM, which constrains the expectation of the average activation probability of the hidden units, the constraint on the RBM in (5) is performed on the probability density of the hidden units directly. The purpose of this generalized optimization problem is to obtain a sparser representation by increasing the probability of $h_j = 0$ while reducing the probability of $h_j = 1$.
In (5), $f(\cdot)$ can be a norm, a combination of norms, or a composite function of $P[h_j^{(l)} \mid v^{(l)}]$. Thus, the SRBM can be seen as a special case of CRBM. The norm constraints are mainly discussed in this paper:

$$f(x) = \|x\|_p. \tag{6}$$

For convenience of calculation, (6) can be modified to $f(x) = (1/p)\|x\|_p^p$. Three RBM variants, $L_1$-RBM, $L_2$-RBM, and $L_{1/2}$-RBM, corresponding to three common norms, the $L_1$-norm, $L_2$-norm, and $L_{1/2}$-norm, are specifically applied to SAR target recognition, and their performance is shown in the following sections.
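To make the three norm constraints concrete, the following sketch evaluates the modified penalty $(1/p)\|x\|_p^p$ for $p = 1$, $2$, and $1/2$ on two toy vectors of hidden activation probabilities. The function names and the example vectors are hypothetical, and the small `eps` guard in the gradient is an assumption added to avoid division by zero when $p < 1$.

```python
def norm_penalty(probs, p):
    """Evaluate (1/p) * sum_j |x_j| ** p for a vector of activation probabilities."""
    return sum(abs(x) ** p for x in probs) / p


def norm_penalty_grad(probs, p, eps=1e-8):
    """Derivative of the penalty with respect to each x_j, assuming x_j >= 0.

    For x > 0 the derivative of (1/p) * x ** p is x ** (p - 1); eps guards x = 0.
    """
    return [max(x, eps) ** (p - 1.0) for x in probs]


# Two activation patterns with the same total activation (same L1 mass).
dense = [0.25, 0.25, 0.25, 0.25]
sparse = [1.0, 0.0, 0.0, 0.0]

penalties = {
    p: (norm_penalty(dense, p), norm_penalty(sparse, p))
    for p in (1.0, 2.0, 0.5)
}
```

With equal total activation, the $p = 1$ penalty does not distinguish the two patterns, the $p = 2$ penalty is smaller for the spread-out pattern, and the $p = 1/2$ penalty is smaller for the concentrated one, which gives some intuition for why the $L_{1/2}$ constraint favours sparse hidden activations.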

Experimental Results and Analysis
To verify the performance of the proposed HRS, in this section, the HRS is compared with DBN and some pattern recognition methods on the MSTAR database. Compared with the standard databases for Deep Learning algorithms [25,33], the MSTAR database has only hundreds of samples and can definitely be regarded as a "small" dataset.

Initialization.
In the experiments, the DBN and the proposed HRS are each built by stacking two RBM or CRBM layers. Thus, the DBN has five variants: DBN(RBM), DBN(SRBM), DBN($L_1$-RBM), DBN($L_2$-RBM), and DBN($L_{1/2}$-RBM). Meanwhile, the proposed HRS has five variants: HRS(RBM), HRS(SRBM), HRS($L_1$-RBM), HRS($L_2$-RBM), and HRS($L_{1/2}$-RBM). The SVM is chosen as the pattern classifier. The HRS built for the experiments in this section is shown in Figure 3.
Each of the two layers in DBN and HRS has 300 hidden units. The input sample has 4096 (64 × 64) pixels, so the visual layer has 4096 units. For RBM, the learning rate $\varepsilon$ in (3) is set to 0.0045. For BPNN, the learning rate is set to 1. The parameter $\lambda$ in (5) is set to 0.00001.
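As a quick check on the scale of this configuration, the sketch below counts the trainable parameters of the two stacked RBMs from the layer sizes stated above. The dictionary keys and the helper name are hypothetical, and the count assumes one weight per visible-hidden pair plus one bias per unit.

```python
# Layer sizes and hyperparameters as stated in the text; key names are illustrative.
layer_sizes = [4096, 300, 300]
hyperparams = {
    "rbm_learning_rate": 0.0045,
    "bpnn_learning_rate": 1.0,
    "constraint_weight": 0.00001,
}


def rbm_parameter_count(n_visible, n_hidden):
    """Weights plus visible and hidden biases of a single RBM."""
    return n_visible * n_hidden + n_visible + n_hidden


counts = [
    rbm_parameter_count(layer_sizes[k], layer_sizes[k + 1])
    for k in range(len(layer_sizes) - 1)
]
total_parameters = sum(counts)
```

Under these assumptions the first RBM alone has more than a million parameters, far more than the number of available training samples, which underlines why MSTAR is treated as a small data problem.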

Experiments Analysis.
The experiments mainly include two aspects. One is to show the performance of the features in each DBN layer; the other is to compare the performance of different recognition methods.
Table 2 lists the performance of the hierarchical features in the two DBN layers with respect to the number of iterations. The second and third columns indicate the recognition rates obtained with features $\mathrm{FEA}_1$ and $\mathrm{FEA}_2$ using the SVM classifier. The last column shows the performance of the common DBN algorithm, which uses Softmax for classification.
From Table 2, it can be seen that, compared to the common DBN, using only the feature from one layer can give better performance. The FFNN obtains its results by fusing the hierarchical features. However, the fusion may reduce the performance because of the limited texture information in a small dataset.
Besides, feature $\mathrm{FEA}_1$ outperforms feature $\mathrm{FEA}_2$, which means that the feature obtained in the first layer has the best performance for the MSTAR 3-class recognition problem, in part because the texture information in SAR images is relatively limited and the features in higher layers incur more reconstruction loss. Please note that, in Table 2, when the number of iterations is 500, the recognition rates obtained by DBN and HRS are both lower than when it is 400. It can be seen that more iterations do not mean better recognition performance. That is partly because the MSTAR database can be seen as a "small" dataset, and too many iterations may lead to overfitting.
The comparison between the pattern recognition methods and HRS is shown in Table 3. In this part, only feature $\mathrm{FEA}_1$ is used, and the number of iterations for NMF, DBN, and HRS is set to 200.
From Table 3, it can be seen that, for targets BMP2 and BTR70, the proposed HRS has performance similar to that of the pattern methods. But for target T72, the proposed HRS shows an obvious improvement. From the average recognition rates, it can be seen that the performance of DBN is better than PCA + SVM and LDA + SVM, but slightly worse than NMF + SVM. The proposed HRS outperforms DBN and all three pattern methods. Moreover, the three proposed constrained HRS variants, HRS($L_1$-RBM), HRS($L_2$-RBM), and HRS($L_{1/2}$-RBM), have better performance than DBN, HRS(RBM), and HRS(SRBM), and among them HRS($L_{1/2}$-RBM) has the best performance.
Comparing the five DBN variants, it can be seen that the DBN with a sparse constraint obtains better performance than the plain DBN; DBN($L_{1/2}$-RBM) in particular has the best performance. Comparing the five HRS variants, the HRS with a sparse constraint outperforms the HRS without one; HRS($L_{1/2}$-RBM) in particular obtains the best recognition rate. Thus, the feasibility of the proposed CRBM is verified.
Comparing the HRS variants to the DBN variants, it can be seen that the proposed HRS has better performance than DBN. Thus, the effectiveness of the proposed HRS is verified.
Overall, the results in Tables 2 and 3 verify the feasibility and effectiveness of the proposed HRS and show that adding a sparse constraint to HRS can improve the recognition performance.

Conclusion
A hierarchical recognition system (HRS) based on the Deep Belief Network (DBN) is proposed to solve the SAR Automatic Target Recognition (SAR ATR) problem. In HRS, the deep structure of DBN is combined with a pattern classifier to solve small data classification problems. The hierarchical features are obtained by the multiple RBMs stacked in the DBN and are then fed to the pattern classifier directly. To obtain a more natural sparse feature representation, the Constrained RBM (CRBM) is proposed by solving a generalized optimization problem. Three RBM variants, $L_1$-RBM, $L_2$-RBM, and $L_{1/2}$-RBM, are introduced to HRS, corresponding to the three HRS variants presented in this paper: HRS($L_1$-RBM), HRS($L_2$-RBM), and HRS($L_{1/2}$-RBM). The experiments on the MSTAR public dataset show that the proposed HRS with CRBM outperforms the DBN and current pattern recognition methods in SAR ATR, such as PCA + SVM, LDA + SVM, and NMF + SVM.

Figure 1: The framework of the proposed hierarchical recognition system based on deep structure.

Figure 3: The structure of HRS with two RBMs or CRBMs layers used in the experiments.

Table 1: Training and test data from MSTAR.

The MSTAR images are collected at depression angles of 17° and 15° for training and test, respectively. BMP2 has three types, and only the type sn-c21 is used for training. Meanwhile, T72 has three types, and only the type sn-132 is used for test. The raw images of targets BMP2, BTR70, and T72 have 128 × 128 pixels. For convenience, all the images are cropped by extracting 64 × 64 patches from the center of the image. The statistics of these three targets are listed in Table 1. From Table 1, it can be seen that the training set has 698 samples and the test set has 1365 samples. The sample number in MSTAR is thus at the level of hundreds, whereas the standard databases for Deep Learning algorithms, such as MNIST, CIFAR, and ImageNet, have tens of thousands of samples.
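The center cropping step can be sketched as follows, assuming each image is stored as a list of pixel rows; the function name `center_crop` and the synthetic image are hypothetical and only illustrate the 128 × 128 to 64 × 64 reduction described above.

```python
def center_crop(image, size):
    """Extract a size x size patch from the center of a 2D image (list of rows)."""
    rows, cols = len(image), len(image[0])
    top = (rows - size) // 2
    left = (cols - size) // 2
    return [row[left:left + size] for row in image[top:top + size]]


# Synthetic 128 x 128 image whose pixel value encodes its own position.
image = [[r * 128 + c for c in range(128)] for r in range(128)]
patch = center_crop(image, 64)

# Flatten the patch into the 4096-dimensional input vector of the visual layer.
flat = [pixel for row in patch for pixel in row]
```

The flattened patch has 4096 entries, matching the number of visual units used in the experiments.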

Table 2: The performance of HRS and DBN on MSTAR with respect to iteration.

Table 3: The recognition rates obtained by different methods.