An MPCA / LDA Based Dimensionality Reduction Algorithm for Face Recognition



Introduction
Face recognition has become an active research topic in pattern recognition and computer vision because of its wide range of applications [1,2]. Feature extraction is the key element in face recognition, and different recognition methods use different extraction strategies. One of the most popular algorithms is principal component analysis (PCA), which finds the projection directions with the minimum reconstruction error and then maps the face dataset to a low-dimensional space spanned by the directions corresponding to the top eigenvalues [3,4]. Traditional PCA face recognition can reach an accuracy rate of 70%-92% [5]. However, it is still not fully practical.
PCA has certain limitations that result in poor adaptability to variations in image brightness and facial expression [6][7][8][9]. Under either very bright or very dim lighting, the feature information of the face is deficient; hence the structural information from the feature points of the face image can hardly be captured by traditional algorithms such as PCA [10]. In addition, existing algorithms that are based on capturing a single expression have difficulty capturing the correct features of the same person when the facial expression changes. Traditional PCA ignores the natural structure and correlation present in the data set [3], which can lose compact or useful facial representations and results in a higher reconstruction error [11].
Many proposals address the limitations of PCA presented above. In [12], Bansal and Chawla proposed normalized principal component analysis (NPCA) to improve the recognition rate. They normalized images to remove lighting variations by applying SVD instead of eigenvalue decomposition. Pereira et al. [13] introduced class-modular image principal component analysis (CMIPCA), a technique that reduces face dimensions by extracting local and global information in order to reduce the effects of illumination, facial expression, and head-pose changes, resulting in a speed-up over PCA. In [14], Tsai showed an application of dimensionality reduction techniques, such as PCA, EM-PCA, multidimensional scaling, and locally linear embedding, to identify the emotion of facial animations. However, that application was not for realistic human faces.
In our method, we address some of these limitations of PCA by adopting the MPCA algorithm together with the LDA algorithm as the basis for the study [3,15]. Instead of following the traditional approach of converting two-dimensional image data into vectors, the MPCA algorithm integrates multiple face images into a high-order tensor and processes the data in tensor space. The advantage of this approach lies in its ability to preserve the structural information of facial images, which increases the accuracy rate because the spatial relationships between pixels are taken into account. When the brightness or the facial expression changes, the spatial structural information between pixels becomes particularly important.
LDA was adopted to further reduce the dimensions of the samples processed by MPCA, as it is capable of aggregating the samples of each class in the subspace and hence improving the face recognition rate [16,17]. We combine MPCA and LDA to form an LDA subspace, from which both MPCA features and LDA features can be extracted.
The organization of this paper is as follows. Our proposed algorithm is discussed in Section 2. The methodology of the approach is presented in Section 3. To demonstrate the effectiveness of the proposed method, experimental results are shown in Section 4. Finally, conclusions are drawn in Section 5.

Principle of MPCA
In computer vision, most objects are naturally represented as $N$th-order tensors ($N \geq 2$) [18]. Take Figure 1 as an example: the image matrix in (i) is a 2nd-order tensor, while the movie clip in (ii) is a 3rd-order tensor. Traditional subspace dimensionality reduction techniques such as PCA transform the image matrix into a high-dimensional vector and operate in one mode only, which does not meet the needs of dimensionality reduction for such data. Such techniques therefore cannot handle multidimensional objects well or give satisfactory results, and a reduction algorithm that operates directly on a high-order tensor object is desirable. Two-dimensional PCA (2DPCA) was proposed and developed for this purpose, with researchers representing facial images as matrices (2nd-order tensors) instead of vectors [19][20][21][22]. However, 2DPCA can only project images in a single mode, which results in poor dimensionality reduction [3,23]. Thus, a more efficient algorithm, MPCA, has been proposed to achieve better dimensionality reduction.

Tensor Notations and Definitions.
Multilinear principal component analysis (MPCA) is introduced in detail in [3], where it is used to solve the problem of gait recognition. Before describing MPCA, we introduce the notation used in this paper.
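Let $(u_1, u_2, \ldots, u_{n_1})$ denote an orthonormal basis of the vector space $\mathbb{R}^{n_1}$ and $(v_1, v_2, \ldots, v_{n_2})$ an orthonormal basis of the vector space $\mathbb{R}^{n_2}$. The matrices $u_i v_j^T$ then form an orthonormal basis of the tensor space $\mathbb{R}^{n_1} \otimes \mathbb{R}^{n_2}$. An image matrix $X$ equals

$$X = \sum_{i=1}^{n_1} \sum_{j=1}^{n_2} \left(u_i^T X v_j\right) u_i v_j^T.$$

Define two matrices $U$ and $V$ whose columns are taken from $(u_i)$ and $(v_j)$, respectively. Based on different objective functions, the transformation matrices $U$ and $V$ can be obtained by iteration; hence dimension reduction can be achieved.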
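As a numerical illustration of this notation, the following minimal NumPy sketch checks the basis expansion on a small random matrix and then forms a reduced representation. The sizes `n1`, `n2`, `l1`, `l2`, the random bases, and the projection `Y = U.T @ X @ V` are illustrative assumptions, not values or formulas taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
n1, n2 = 4, 3                      # illustrative image size (assumption)
X = rng.standard_normal((n1, n2))  # stand-in for an image matrix

# Orthonormal bases of R^{n1} and R^{n2}, obtained here via QR (any orthonormal bases work).
U_full, _ = np.linalg.qr(rng.standard_normal((n1, n1)))
V_full, _ = np.linalg.qr(rng.standard_normal((n2, n2)))

# Expand X in the tensor basis {u_i v_j^T}; each coefficient is u_i^T X v_j.
X_rebuilt = np.zeros_like(X)
for i in range(n1):
    for j in range(n2):
        coeff = U_full[:, i] @ X @ V_full[:, j]
        X_rebuilt += coeff * np.outer(U_full[:, i], V_full[:, j])

# Keeping only the first l1 and l2 basis vectors gives a smaller representation.
l1, l2 = 2, 2                      # hypothetical reduced sizes
U, V = U_full[:, :l1], V_full[:, :l2]
Y = U.T @ X @ V                    # reduced l1 x l2 matrix
```

With the full bases the expansion reproduces `X` up to floating-point error; dimension reduction comes only from truncating the columns of `U` and `V`.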

Principle of MPCA.
MPCA is developed from the PCA algorithm. Its advantage is that it operates on tensors, replacing the traditional algorithms that transform high-dimensional data into one-dimensional vectors. For example, to process 100 face images of size 112 × 92, PCA treats them as a 100 × 10304 matrix, while MPCA treats them as a 100 × 112 × 92 tensor. MPCA has the advantage of taking into account the correlation in the original data, which is ignored by PCA.
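The difference between the two views can be made concrete with array shapes. The sketch below uses placeholder zero data and a hypothetical helper `unfold` for the mode-n unfolding of a tensor; it only illustrates the layouts and is not the authors' implementation.

```python
import numpy as np

def unfold(tensor, mode):
    """Mode-n unfolding: rows index the chosen mode, columns all remaining modes."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)

faces = np.zeros((100, 112, 92))      # placeholder data: 100 images of 112 x 92

pca_input = faces.reshape(100, -1)    # PCA view: one 10304-dimensional row per image
row_mode = unfold(faces[0], 0)        # one image unfolded along its first mode
col_mode = unfold(faces[0], 1)        # the same image unfolded along its second mode
```

In the vectorised view, neighbouring pixels of one image column end up far apart in the row vector, whereas the tensor view keeps both image modes available for separate projections.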
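As Figure 2 shows, by projecting each mode of the facial tensor $\mathcal{X}$, a low-dimensional facial tensor that satisfies the maximum-variance criterion can be obtained. For tensor objects $\{\mathcal{X}_m, m = 1, \ldots, M\}$ of image samples, the variance before projection is as follows:

$$\Psi_{\mathcal{X}} = \sum_{m=1}^{M} \left\| \mathcal{X}_m - \bar{\mathcal{X}} \right\|_F^2, \tag{5}$$

where $\bar{\mathcal{X}}$ is the mean of the samples. The tensors after projection satisfy the following equation:

$$\mathcal{Y}_m = \mathcal{X}_m \times_1 \tilde{U}^{(1)T} \times_2 \tilde{U}^{(2)T} \cdots \times_N \tilde{U}^{(N)T}. \tag{6}$$

By combining (5) and (6), we get the following equation:

$$\Psi_{\mathcal{Y}} = \sum_{m=1}^{M} \left\| \mathcal{Y}_m - \bar{\mathcal{Y}} \right\|_F^2. \tag{7}$$

The MPCA algorithm is equivalent to solving the optimization problem

$$\{\tilde{U}^{(n)}, n = 1, \ldots, N\} = \arg\max_{\tilde{U}^{(1)}, \ldots, \tilde{U}^{(N)}} \Psi_{\mathcal{Y}}.$$

Using the alternating least squares (ALS) method, a locally optimal solution can be computed. When solving for the $n$th projection matrix $\tilde{U}^{(n)}$, the other matrices are held constant, and each sample $\mathcal{X}_m$ is projected in the modes $(1, \ldots, n-1, n+1, \ldots, N)$ with $\tilde{U}^{(1)}, \ldots, \tilde{U}^{(n-1)}, \tilde{U}^{(n+1)}, \ldots, \tilde{U}^{(N)}$. The columns of $\tilde{U}^{(n)}$ can be obtained from an orthogonal basis of the projection subspace. The sample is thus projected to the lower-dimensional tensor $\hat{\mathcal{Y}}_m \in \mathbb{R}^{P_1 \times P_2 \times \cdots \times P_{n-1} \times I_n \times P_{n+1} \times \cdots \times P_N}$, and $\hat{Y}_{m(n)}$, the $n$th-mode unfolding matrix of $\hat{\mathcal{Y}}_m$, is used as the input to PCA. This is equivalent to

$$\tilde{U}^{(n)} = \arg\max_{U^{(n)}} \sum_{m=1}^{M} \left\| U^{(n)T} \left( \hat{Y}_{m(n)} - \bar{\hat{Y}}_{(n)} \right) \right\|_F^2.$$

2.3. MPCA Algorithm.
MPCA can thus handle multidimensional objects directly. Based on the above sections, the computation of the MPCA algorithm can be summarized in pseudocode [25], as shown in Figure 3.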
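As a rough illustration of the alternating procedure described above, the following NumPy sketch fits one projection matrix per mode by repeated eigendecomposition of the mode-wise total scatter matrix. The function names (`mode_product`, `project`, `mpca`), the truncated-identity initialisation, the stopping threshold `tol`, and the random test data are all assumptions made for this sketch; it follows the general MPCA scheme of [3] and should not be read as the authors' exact pseudocode.

```python
import numpy as np

def mode_product(t, matrix, mode):
    """Multiply tensor t by matrix (new_dim x old_dim) along the given axis."""
    moved = np.moveaxis(t, mode, 0)
    out = np.tensordot(matrix, moved, axes=(1, 0))
    return np.moveaxis(out, 0, mode)

def project(samples, mats, skip=None):
    """Apply U^(n)^T in every mode except `skip`; axis 0 indexes the samples."""
    out = samples
    for n, U in enumerate(mats):
        if n != skip:
            out = mode_product(out, U.T, n + 1)
    return out

def mpca(samples, ranks, max_iter=10, tol=1e-6):
    centered = samples - samples.mean(axis=0)
    n_modes = centered.ndim - 1
    # Initialisation (assumption): truncated identity matrices.
    mats = [np.eye(centered.shape[n + 1])[:, :ranks[n]] for n in range(n_modes)]
    prev_scatter = 0.0
    for _ in range(max_iter):
        for n in range(n_modes):
            # Hold the other modes fixed and project the samples in those modes only.
            partial = project(centered, mats, skip=n)
            unfolded = np.moveaxis(partial, n + 1, 0).reshape(partial.shape[n + 1], -1)
            scatter = unfolded @ unfolded.T          # mode-n total scatter matrix
            _, eigvecs = np.linalg.eigh(scatter)     # eigenvalues in ascending order
            mats[n] = eigvecs[:, ::-1][:, :ranks[n]] # keep the leading eigenvectors
        total = np.sum(project(centered, mats) ** 2) # scatter after projection
        if total - prev_scatter < tol:
            break
        prev_scatter = total
    return mats, project(centered, mats)

rng = np.random.default_rng(0)
faces = rng.standard_normal((20, 8, 6))   # 20 synthetic 8 x 6 "images"
mats, features = mpca(faces, ranks=(3, 2))
```

Each returned matrix has orthonormal columns, and `features` holds one reduced 3 × 2 matrix per sample, which could then be vectorised for a subsequent classifier.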
Step 3. Calculate the eigenvectors and their corresponding most significant eigenvalues; the result is output as $\tilde{U}^{(n)}$.
Step 4. Repeat the following: (a) calculate the eigenvectors of the total scatter matrix and their corresponding most significant eigenvalues, and output the result as $\tilde{U}^{(n)}$, for $n = 1, 2$; (b) get $\Psi_{Y_k}$ and $\{\hat{\mathcal{Y}}_m, m = 1, \ldots, M\}$; (c) if $\Psi_{Y_k} - \Psi_{Y_{k-1}} < \eta$, then break the loop and go to Step 5.
Step 5. Finally, calculate the feature matrix $\mathcal{Y}_m = \mathcal{X}_m \times_1 \tilde{U}^{(1)T} \times_2 \tilde{U}^{(2)T}$, $m = 1, \ldots, M$.
The LDA algorithm then proceeds as follows.
Step 1. Compute the mean of each class of facial images in the original space. The total number of samples is denoted by $N$, and $x_{ij}$ denotes the $j$th sample of the $i$th class.
Step 2. Compute the covariance matrix of each class.
Step 3. Compute the within-class scatter matrix $S_w$ and the between-class scatter matrix $S_b$.
Step 4. Compute the eigenvectors of the matrix $S_w^{-1} S_b$ to obtain the projection vectors. The dimensionality-reduced data are then obtained by projection [26,27].
After dimensionality reduction using MPCA, the resulting matrices are rearranged column by column into vectors, which serve as inputs to the LDA algorithm. By using the MPCA algorithm to reduce the dimension of the images, we not only avoid the singular-matrix problem but also retain the structural information in the images and thus improve the recognition rate.
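The LDA steps can be sketched in NumPy as follows. This is a minimal sketch under stated assumptions: the function name `lda_fit` is hypothetical, each class covariance is normalised by its class size, all classes are weighted equally (matching the equal-prior setting used in the experiments), and the input rows stand in for vectorised MPCA features. The exact normalisation used by the authors is not recoverable from the text.

```python
import numpy as np

def lda_fit(X, labels, n_components):
    """X: (n_samples, n_features) vectorised features; labels: integer class ids."""
    overall_mean = X.mean(axis=0)
    d = X.shape[1]
    S_w = np.zeros((d, d))
    S_b = np.zeros((d, d))
    for c in np.unique(labels):
        Xc = X[labels == c]
        mean_c = Xc.mean(axis=0)                  # Step 1: class mean
        centered = Xc - mean_c
        S_c = centered.T @ centered / len(Xc)     # Step 2: class covariance
        S_w += S_c                                # Step 3: within-class scatter, weight 1
        diff = (mean_c - overall_mean)[:, None]
        S_b += diff @ diff.T                      # Step 3: between-class scatter
    # Step 4: eigenvectors of S_w^{-1} S_b, sorted by decreasing eigenvalue.
    eigvals, eigvecs = np.linalg.eig(np.linalg.solve(S_w, S_b))
    order = np.argsort(eigvals.real)[::-1][:n_components]
    return eigvecs[:, order].real

rng = np.random.default_rng(0)
labels = np.repeat(np.arange(3), 10)                        # 3 classes, 10 samples each
X = rng.standard_normal((30, 4)) + 5.0 * labels[:, None]    # synthetic, well separated
W = lda_fit(X, labels, n_components=2)
Z = X @ W                                                   # projected samples
```

The sketch requires `S_w` to be invertible, which is why the feature dimension is first reduced (here it is only 4) before LDA is applied.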

KNN Algorithm.
The $k$-nearest neighbor (KNN) algorithm [28,29] is adopted for sample set classification here, and the concrete steps are as follows.
Step 1. Select a set of candidate values for the parameter $k$.
Step 2. Apply cross-validation to the training face images for each candidate value of $k$.
Step 3. Find the value of $k$ that minimizes the cross-validation classification error rate.
Step 4. Construct a prediction model with the selected $k$.
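The steps above can be sketched as a small cross-validated search over $k$. The function names `knn_predict` and `select_k`, the five-fold split, and the synthetic two-cluster data are assumptions made for illustration; the paper does not state which cross-validation scheme was used.

```python
import numpy as np

def knn_predict(train_X, train_y, query_X, k):
    """Majority vote among the k nearest training samples (Euclidean distance)."""
    dists = np.linalg.norm(query_X[:, None, :] - train_X[None, :, :], axis=2)
    nearest = np.argsort(dists, axis=1)[:, :k]
    votes = train_y[nearest]                      # labels must be non-negative integers
    return np.array([np.bincount(row).argmax() for row in votes])

def select_k(X, y, candidates, n_folds=5, seed=0):
    """Return the candidate k with the lowest cross-validation error rate."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(X)), n_folds)
    errors = {}
    for k in candidates:                          # Steps 1-2: try each candidate k
        wrong = 0
        for fold in folds:
            mask = np.ones(len(X), dtype=bool)
            mask[fold] = False                    # hold out the current fold
            pred = knn_predict(X[mask], y[mask], X[fold], k)
            wrong += np.sum(pred != y[fold])
        errors[k] = wrong / len(X)
    best_k = min(errors, key=errors.get)          # Step 3: minimise the error rate
    return best_k, errors

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (20, 2)), rng.normal(10, 1, (20, 2))])
y = np.repeat([0, 1], 20)
best_k, errors = select_k(X, y, candidates=range(1, 11))
```

The final model (Step 4) is then simply `knn_predict` called with `best_k` on the full training set.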

Process of the Recognition Algorithm
3.1. Preprocessing.
Image preprocessing and normalization are vital for face recognition systems, as images are often affected by image quality, illumination, face rotation, facial expression, and so forth [8,30]. In order to offset these factors, it is necessary to carry out face normalization before facial feature extraction.
Our data are preprocessed by normalizing the images to a resolution of 80 × 80. In our research, histogram equalization was applied (see (15)).
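For readers unfamiliar with histogram equalization, a minimal NumPy sketch of the common cumulative-distribution form is given below. The function name `equalize_histogram` and the synthetic low-contrast image are assumptions; the exact variant used in (15) is not recoverable from the text.

```python
import numpy as np

def equalize_histogram(image, levels=256):
    """image: 2-D uint8 array. Maps grey levels through the scaled cumulative histogram."""
    hist = np.bincount(image.ravel(), minlength=levels)
    cdf = np.cumsum(hist) / image.size                 # cumulative distribution in [0, 1]
    mapping = np.round((levels - 1) * cdf).astype(np.uint8)
    return mapping[image]                              # look up the new level per pixel

rng = np.random.default_rng(0)
# Synthetic low-contrast 80 x 80 image with grey levels confined to [100, 120).
image = rng.integers(100, 120, size=(80, 80), dtype=np.uint8)
equalized = equalize_histogram(image)
```

After equalization the grey levels of the synthetic image are spread over a much wider range, which is the effect relied on to reduce sensitivity to illumination.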

Dimensionality Reduction Using MPCA and Feature Matrix Extraction Using LDA.
MPCA reduces the dimensions of the input face images and generates feature projection matrices [30] that are then taken as input samples to LDA. The combination of MPCA and LDA was used to construct the LDA subspace, from which both MPCA features and LDA features can be extracted.
The detailed steps have been described in Sections 2.2 and 2.3.

Face Recognition Using L2 Distance Measure.
We used the output obtained above as input samples for training and applied the aforementioned techniques to get the feature matrix. Then we carried out a similarity measure on the image samples. In our research, we chose the L2 distance as the measure (see (16)):

$$d(x, y) = \sqrt{\sum_{i} (x_i - y_i)^2}, \tag{16}$$

where $x$ and $y$ are two feature vectors. A KNN classifier [31] is adopted for sample set classification here; the procedure and details are introduced in Section 2.4.
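A small worked example of the L2 distance used for matching is sketched below. The gallery and probe vectors are made-up two-dimensional values chosen only so that the arithmetic is easy to follow.

```python
import numpy as np

def l2_distance(a, b):
    """Euclidean (L2) distance between two feature vectors."""
    return np.sqrt(np.sum((a - b) ** 2))

gallery = np.array([[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]])   # hypothetical training features
probe = np.array([2.5, 4.5])                                # hypothetical test feature

distances = np.array([l2_distance(probe, g) for g in gallery])
closest = int(np.argmin(distances))                         # index of the best match
```

Here the probe lies closest to the second gallery vector, so a 1-nearest-neighbour rule would assign it that vector's label.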
The overall approach of face recognition proposed in this paper is shown in Figure 3.

Experiments
We evaluated the performance of our MPCA + LDA algorithm and compared it with the PCA, MPCA, and PCA + LDA algorithms by performing experiments on the ORL database [32]. In order to examine the ability of our method, we also tried it on other classical face databases, namely FERET and YALE. The experiments were conducted with three groups. We chose part of the images in each group for training and the rest for testing. As the probabilities of all classes of facial samples are the same, the prior probability $P_i$ is set to 1 in the LDA algorithm.
Initially, we tested how different parameters affect the recognition error rate, that is, how the classification result is affected by the dimensionality after MPCA, the dimensionality after LDA, and the $k$ value of the KNN algorithm. The LDA algorithm requires the reduced dimension to be no greater than the number of classes minus 1, so for a group of 20 persons 1 ≤ LDA dimension reduction ≤ 19. There are 10 samples in each category, so 1 ≤ $k$ ≤ 10. Other parameters of MPCA are set to the optimized values.
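The upper bound on the LDA dimension follows from the rank of the between-class scatter matrix, which is at most the number of classes minus 1 because the class-mean deviations sum to zero when the classes are equally sized. The sketch below checks this numerically on synthetic data; the sizes (20 classes, 5 samples each, 40 features) are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n_classes, per_class, dim = 20, 5, 40
X = rng.standard_normal((n_classes * per_class, dim))   # synthetic features
labels = np.repeat(np.arange(n_classes), per_class)

overall_mean = X.mean(axis=0)
S_b = np.zeros((dim, dim))
for c in range(n_classes):
    diff = (X[labels == c].mean(axis=0) - overall_mean)[:, None]
    S_b += diff @ diff.T                                 # between-class scatter

rank = np.linalg.matrix_rank(S_b)                        # at most n_classes - 1
```

Because only `rank` eigenvalues of $S_w^{-1} S_b$ can be nonzero, asking LDA for more than 19 directions with 20 classes adds no discriminative information.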

Experiments on the ORL.
The ORL face database contains a total of 400 images of 40 individuals (each individual has 10 grayscale images) [33]. Some photos were taken at different times, and some show varying facial expressions and facial details. Each image has a resolution of 92 × 112 pixels with 256 grey levels per pixel [34]. The recognition error rate of PCA is shown in Figure 5.
Judging from the figure, the recognition error rate is minimal when $k$ equals 1. PCA recognition accuracy ranges from 58% to 82%.
The error rate of MPCA under different $k$ values is shown in Figure 6. As shown in the figure, the error rate is minimal when $k$ equals 1. MPCA recognition accuracy ranges from 75% to 85% in the experiments.
When $k$ equals 8, the error rate of the PCA + LDA algorithm reaches its minimal value of 7%. How different LDA dimension reduction values affect recognition accuracy is shown in Figure 7.
We can see from Figures 5, 6, and 7 that the dimension after LDA increases as the number of samples increases. When applying the PCA + LDA algorithm, we use MPCA to decrease the dimension of the facial samples to 11 and then use LDA. For the 1st group, the dimension is reduced to 7. For the 2nd group, the dimension is reduced to 8. For the 3rd group, the dimension is reduced to 10. We can conclude that the LDA algorithm alone is not well suited to multidimensional objects. The accuracy of PCA + LDA reaches 86%-88%.
The MPCA + LDA algorithm produces a higher error rate of 10%-25% only when $k$ equals 10. In other situations, the recognition error rate is very low. When $k$ equals 8, the error rate for different LDA dimensionality reduction values is shown in Figure 8. The recognition accuracy of the algorithm reaches a high value.
The results of the experiments on the ORL database are shown in Figure 9. Comparing the recognition accuracy of the four algorithms, the combination of MPCA and LDA does result in better recognition performance than the traditional methods.

Experiments on More Face Databases.
We choose FERET and YALE for our experiments. The implementation steps are similar to those in the above section, so we simplify the steps and focus on the results. The FERET face database consists of a total of 1400 images of 200 individuals (each person has 7 different images). Figure 10 shows image examples of two persons before preprocessing.
The YALE face database contains 165 images of 15 individuals; Figure 11 shows image examples of two persons before preprocessing [35].
The performance of the PCA, MPCA, PCA + LDA, and MPCA + LDA techniques is tested by varying the number of eigenvectors. We have chosen one group of results from each database for comparison.
PCA performed worse on YALE than on FERET because of its poor adaptability to image brightness and facial expression, as shown in Figure 12.
Although MPCA performed well on face recognition in the YALE database (Figure 13), the process takes longer than with PCA.
Figures 14 and 15 show that both PCA + LDA and MPCA + LDA achieve high accuracy and a low error rate in recognition. However, PCA + LDA effectively sees only the Euclidean structure, while MPCA + LDA succeeds in discovering the underlying structure [36].
Compared with all the other algorithms, and despite the simple preprocessing, MPCA + LDA achieved the best overall performance on both the FERET and YALE databases.

Conclusions
This paper presents an algorithm for face recognition based on MPCA and LDA. As opposed to other traditional methods, our proposed algorithm treats data as a multidimensional tensor and fully considers the spatial relationships. This advantage is of great relevance to applications, as the approach is capable of recognizing faces under different lighting conditions and with various facial expressions. The LDA algorithm projects the data to a new space and gave accurate clustering results in our experiments. Compared with traditional face recognition algorithms, our proposed algorithm not only boosts recognition accuracy but also relieves the dimensionality bottleneck and efficiently resolves the small sample size problem. Future work will include applying this approach to larger face databases such as CMU Multi-PIE, NIST's FRGC, and MBGC.

Figure 2: Illustration of the multilinear projection in the 1-mode vector space.

Figure 3: Flow chart of face recognition algorithm.

Figure 4: Face image examples of two persons in ORL face database.

Figure 6: Recognition error rate of MPCA against different $k$ values.

Figure 7: Recognition error rate of PCA + LDA against different LDA dimension reduction values.

Figure 8: Error rate of MPCA + LDA algorithm against different LDA dimension reduction values.

Figure 9: Histogram of recognition results in experiments.

Figure 10: Face image examples of two persons in FERET face database.

Figure 11: Face image examples of two persons in YALE face database.

Figure 4 shows image examples of two persons before preprocessing. The images have been divided into three different groups. For the first group, we select the first 5 images of the first 20 persons as training data and the last 5 images of the first 20 persons as test samples for face identification. For the second group, we select the first 5 images of the remaining 20 persons as training data and the last 5 images of the remaining 20 persons as test samples. For the third group, we select the first 5 images of all 40 persons as training data and the last 5 images of all 40 persons as test samples.
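The three-group split can be written down as index sets. In the sketch below, persons are numbered 0 to 39 and images 0 to 9, and the helper name `split_group` is hypothetical; only the 5/5 split per person and the grouping of persons come from the text.

```python
def split_group(person_ids, images_per_person=10, n_train=5):
    """Return (train, test) lists of (person, image) index pairs for one group."""
    train = [(p, i) for p in person_ids for i in range(n_train)]
    test = [(p, i) for p in person_ids for i in range(n_train, images_per_person)]
    return train, test

groups = {
    1: split_group(range(0, 20)),    # first 20 persons
    2: split_group(range(20, 40)),   # remaining 20 persons
    3: split_group(range(0, 40)),    # all 40 persons
}
```

Groups 1 and 2 therefore each contain 100 training and 100 test images, and group 3 contains 200 of each.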

Figure 12: Recognition error rate of PCA against different $k$ values (best LDA dimension reduction). In one group of FERET, the value equals 27. In one group of YALE, it equals 63.

Figure 13: Recognition error rate of MPCA against different $k$ values.

Figure 14: Recognition error rate of PCA + LDA against different LDA dimension reduction values.