Generalized Discriminant Orthogonal Nonnegative Tensor Factorization for Facial Expression Recognition

In order to overcome the limitation of traditional nonnegative factorization algorithms, the paper presents a generalized discriminant orthogonal non-negative tensor factorization algorithm. At first, the algorithm takes the orthogonal constraint into account to ensure the nonnegativity of the low-dimensional features. Furthermore, the discriminant constraint is imposed on low-dimensional weights to strengthen the discriminant capability of the low-dimensional features. The experiments on facial expression recognition have demonstrated that the algorithm is superior to other non-negative factorization algorithms.


Introduction
Over the past few years, the nonnegative matrix factorization algorithm (NMF) [1] and its variants have proven to be useful for several problems, especially in facial image characterization and representation problems [2][3][4][5][6][7][8]. The idea of nonnegative factorization is partly motivated by the biological fact that the firing rates in visual perception neurons are nonnegative.
However, NMF and its variants have some drawbacks. First of all, NMF requires that all object images should be vectorized in order to find the non-negative decomposition. This vectorization leads to information loss, since the local structure of the image is lost. Moreover, NMF is not unique [9,10]. In order to remedy these drawbacks, non-negative tensor factorization (NTF) has been proposed [11][12][13]. NTF represents a facial expression database as a three-order tensor. The tensor representation avoids the vectorization operation and preserves the structure of the data. Under some mild conditions, NTF is unique. Existing NMF and NTF algorithms project data into low-dimensional space with the inverse or pseudoinverse of the basis images, so both of them cannot guarantee the nonnegativity of low-dimensional features, which restricts the application of non-negative factorization in real world. Furthermore, NTF do not take into account class information in data samples. Actually, it is believed that those features with discriminant constraints are of great importance for pattern recognition. Reference [14] develops a discriminant non-negative tensor factorization algorithm (DNTF), which adds fisher discriminant constraint into the objective function. But like other discriminant nonnegative matrix factorizations [6,[15][16][17][18], DNTF employed discriminant analysis on the representation coefficients and not on the actual features used in the recognition procedure. The actual features used for recognition are derived from the projection of data samples to the bases matrix and only implicitly depend on the representation coefficients.
Based on the above analysis, the paper proposes a generalized discriminant orthogonal non-negative tensor factorization algorithm (GDONTF), which makes full use of the class information and imposes the orthogonal constraint to the objective function. The algorithm not only guarantees the non-negativity of low-dimensional features, but also generalizes discriminant constraints to low-dimension features. The experiments on facial expression recognition indicate that GDONTF achieves better performance than other nonnegative factorization algorithms.

Generalized Discriminant Orthogonal Non-Negative Tensor Factorization
Consider an order tensor ∈ R 1 × 2 ⋅⋅⋅× , every data sample is an − 1 order tensor; that is,

is the dimensionality and
is the number of data set. The data set is divided into classes. Data samples belonging to class denote ( ); the number of data samples in ( ) is . In order to guarantee the non-negativity of low-dimensional features and take use of the class information, we propose generalized discriminant orthogonal non-negative tensor factorization algorithm; the objective function of which is defined as follow: In which, ≥ 0, ≥ 0, ( ) ≥ 0, = 1, 2, . . . , , is the identity matrix and and are the within-and betweenclass scatter matrices of the low-dimensional features, respectively. Because ( ) ( ) = , low-dimensional features can be computed as follows: where the basis matrix Let ℎ be the low-dimensional features of the sample ; then the feature matrix ∈ R × consists of all low-dimensional features, is the low dimensionality of samples, and is the number of all samples. Actually, the separability of the weight coefficient has nothing to do with the recognition accuracy, while the class separability of the low-dimensional features has a great influence on the recognition accuracy. Consequently, the within-and between-class scatter matrices are defined as follows: where is the mean of the low-dimensional features in the class and is the mean of all low-dimensional features. The objective function in (1) can be written as the following optimization problem: Since the basis matrix consists of the projection matrices ( ) , = 1, 2, . . . , − 1, we solve the projection matrices ( ) , = 1, 2, . . . , − 1, and the weight matrix ( ) , respectively, to deal with the optimization problem (4). First of all, we formulate the Lagrange multipliers out of the constrained optimization problem in (4): . Take the derivative of ( ( ) , ) with respect to ( ) and , = 1, 2, . . . , − 1; we have Set (6) and (7) to zeros; we get Left multiply both side of (8) by ( ) ; we immediately have Therefore, the update rule for ( ) is The gradient [∇ ( )] , = [ ]/ ( ) is given by The Scientific World Journal Since Similarly, we have To solve the weight matrix ( ) , the objective function is The gradient functon is The Scientific World Journal  Consequently, the update rules of ( ) are

Experiments
We have conducted facial expression recognition in order to compare the GDONTF with other algorithms such as NMFOS [19], DNMF [6], FisherNMF [16], and DNTF [14]. Because these algorithms calculate low-dimension features in iteration form, the iteration number is 100. For NMFOS and GDONTF, = 1. = 0.5 in DNMF and = 1 in FisherNMF. All low-dimension features are classified by SVM with linear kernel. The database used for the facial expression recognition experiments is Jaff facial expression database [20]. The database contains 213 images of ten Japanese women. Each person has two to four images for each of the seven expressions: neutral, happy, sad, surprise, anger, disgust, and fear. Each image is resized into 32 × 32. A few examples are shown in Figure 1. We randomly select 20 images from each expression for training; the rest is used for testing. The recognition rates with various dimensionalities of different algorithms are shown in Figure 2. Table 1 shows the best recognition rates of the above algorithms. Because NMF is unsupervised learning algorithm, it has the lowest recognition rates. DNMF and FisherNMF have better recognition rates with supervised learning. It is interesting that NMFOS is superior to DNMF and FisherNMF when the feature dimensionality is from 16 to 160 and is better than DNTF when the feature dimensionality is from 16 to 40, which also illustrates the validity of the orthogonal constraint. It is obvious that GDONTF outperforms other algorithms and the best recognition rate is up to 97.07%.

Conclusion
In this paper, a generalized discriminant orthogonal nonnegative tensor factorization algorithm is proposed considering the orthogonal constraint and the discriminant constraint. For the algorithm, the non-negativity of the lowdimensional features is preserved due to the orthogonal constraint for either training samples or testing samples. In order to enhance the recognition accuracy, the discriminant is conducted on low-dimensional features instead of the weight coefficient of the basis images. The experiments also validate the performance of the algorithm.