Incremental Graph Regularized Nonnegative Matrix Factorization for Face Recognition



Introduction
Nonnegative matrix factorization (NMF) is a widely used method for low-rank approximation of a nonnegative matrix (a matrix with only nonnegative entries), in which nonnegativity constraints are imposed on the factor matrices of the decomposition. There is a large body of past work on NMF [1]. Lee and Seung [2,3] proposed NMF for learning parts of faces, and in their work the reconstruction error function F(W, H) is introduced: F(W, H) = ‖V − WH‖²_F, where V denotes the data matrix, W can be considered the basis matrix, and H can be considered the coefficient matrix; all elements of V, W, and H are nonnegative. Sparse coding is a well-known parts-based representation method; by minimizing an ℓ1-regularization-related objective function of NMF-based algorithms, sparsity constraints can be enforced. Hoyer [4] proposed a method that keeps the ℓ2 norm unchanged in each iteration while the ℓ1 norm is set to achieve the desired sparseness. Li et al. [5] proposed another sparse representation method, a sparse NMF algorithm with a Kullback-Leibler-based cost function. Discriminative methods were also introduced into the NMF algorithm. Wang et al. [6] proposed Fisher nonnegative matrix factorization, which introduces a Fisher (discriminant) constraint into the NMF algorithm. Later, Nikitidis et al. [7] introduced subclass discriminant analysis into the NMF algorithm by separating each class into several subclasses; this method is able to ensure that the underlying data distribution in each subclass is unimodal. Because the convergence of canonical NMF algorithms is slow, gradient-descent-based methods were introduced to NMF to improve its speed of convergence. Guan et al. [8] applied Nesterov's optimal gradient method to alternately optimize one factor with the other fixed. By introducing a fast gradient descent method into the search for the optimal step size of gradient-descent-based NMF, Guan et al.
[9] introduced the nonnegative patch alignment framework (NPAF) and nonnegative discriminative locality alignment (NDLA). Canonical NMF algorithms aim to minimize the Euclidean distance or the Kullback-Leibler divergence between the data matrix V and its reconstruction W × H. By introducing the Manhattan distance into the NMF algorithm, Guan et al. [10] introduced Manhattan nonnegative matrix factorization. Incremental NMF algorithms compute the coefficient vector of a new sample at a small computation cost, while W can be recomputed as W = VHᵀ(HHᵀ)⁻¹; obviously there is an assumption that V ≈ W × H is a full-rank decomposition and HHᵀ is invertible.
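The basic factorization described above is usually computed with the Lee-Seung multiplicative updates. The following numpy sketch illustrates them; the function name, toy data, and parameters are ours, not from the paper:

```python
import numpy as np

def nmf(V, r, n_iter=200, eps=1e-9, seed=0):
    """Minimize ||V - W @ H||_F^2 with Lee-Seung multiplicative updates."""
    rng = np.random.default_rng(seed)
    m, n = V.shape
    W = rng.random((m, r)) + eps   # basis matrix, m x r
    H = rng.random((r, n)) + eps   # coefficient matrix, r x n
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)   # coefficient update
        W *= (V @ H.T) / (W @ H @ H.T + eps)   # basis update
    return W, H

# toy nonnegative data matrix standing in for vectorized face images
V = np.abs(np.random.default_rng(1).random((20, 30)))
W, H = nmf(V, r=5)
```

Because both updates only multiply by nonnegative ratios, nonnegativity of W and H is preserved at every iteration.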
Although the incremental algorithms mentioned above dramatically reduce the cost of incremental study, they do not consider the geometrical information of the original data space on the underlying manifold. The geometrical information of the original data space is important information for face recognition [20,21]. Data sets often lie on a high-dimensional manifold, so the intrinsic geodesic distance on the manifold between two data points is more suitable than the Euclidean distance. Thus, in this paper, by introducing manifold regularization into the incremental nonnegative matrix factorization algorithm (INMF), we first propose a new incremental study algorithm called incremental graph regularized nonnegative matrix factorization (IGNMF). Second, as human face images always come in batches, we improve our IGNMF algorithm into the Batch-IGNMF algorithm (B-IGNMF). The contributions of the research presented in this paper are as follows.
(1) We propose a novel subspace incremental method called the incremental graph regularized nonnegative matrix factorization algorithm (IGNMF), which is able to preserve discrimination information under the incremental study framework.
(2) We further improve our IGNMF algorithm, developing Batch-IGNMF (B-IGNMF), which is able to perform incremental study with image batches no matter whether the batch of images belongs to the same class or to different classes.
The remainder of this paper is organized as follows. Section 2 introduces the GNMF and INMF algorithms. Section 3 presents our IGNMF and B-IGNMF algorithms. Experiments on face databases are reported in Section 4. Section 5 concludes the paper.

A Brief Review of GNMF and INMF
In this section, we give a brief review of graph regularized nonnegative matrix factorization (GNMF), proposed by Cai et al. [13], and the incremental nonnegative matrix factorization algorithm (INMF), proposed by Bucak and Gunsel [14]; both algorithms are closely related to our newly proposed algorithms.

GNMF.
The geometrical information of the original data space is important information for face recognition [20,21]; data sets often lie on a high-dimensional manifold, so the intrinsic geodesic distance on the manifold between two data points is more suitable than the Euclidean distance. Thus, in this section, we give a brief review of graph regularized NMF (GNMF) [13]. GNMF minimizes the objective function

F(W, H) = ‖V − WH‖²_F + λ Tr(H L Hᵀ), (1)

where L = D − S is the graph Laplacian, S is the weight matrix of a p-nearest-neighbor graph over the data points, D is the diagonal degree matrix with D_ii = Σ_j S_ij, and λ ≥ 0 is the regularization parameter. The weight matrix S can be constructed in two ways.
(1) The supervised method is achieved by defining x_j ∈ N_p(x_i) if x_i and x_j belong to the same class, and setting S_ij = 1 if x_j ∈ N_p(x_i) and S_ij = 0 otherwise, where S_ij and D_ii are the elements of S and D, respectively.
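Both graph constructions can be sketched in a few lines of numpy. This is an illustrative implementation under our naming (S for the weight matrix, D for the degree matrix, L = D − S for the Laplacian), not code from the paper:

```python
import numpy as np

def graph_weights(X, labels=None, p=3):
    """Build the 0/1 weight matrix S over samples (columns of X),
    plus the degree matrix D and the graph Laplacian L = D - S.

    Supervised:   S_ij = 1 iff samples i and j share a class label.
    Unsupervised: S_ij = 1 iff one of i, j is among the p nearest
                  neighbors of the other under Euclidean distance.
    """
    n = X.shape[1]
    S = np.zeros((n, n))
    if labels is not None:                        # supervised graph
        lab = np.asarray(labels)
        S = (lab[:, None] == lab[None, :]).astype(float)
        np.fill_diagonal(S, 0.0)
    else:                                         # unsupervised p-NN graph
        dist = np.linalg.norm(X[:, :, None] - X[:, None, :], axis=0)
        for i in range(n):
            nn = np.argsort(dist[i])[1:p + 1]     # skip self at index 0
            S[i, nn] = 1.0
        S = np.maximum(S, S.T)                    # symmetrize
    D = np.diag(S.sum(axis=1))
    return S, D, D - S

# tiny example: 2 features, 4 samples, two classes
X = np.array([[0.0, 0.0, 1.0, 1.0],
              [0.0, 0.1, 1.0, 1.1]])
S_sup, D_sup, L_sup = graph_weights(X, labels=[0, 0, 1, 1])
S_unsup, _, _ = graph_weights(X, p=1)
```

By construction each row of L sums to zero, which is the property the regularizer Tr(H L Hᵀ) relies on.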

INMF. Incremental nonnegative matrix factorization (INMF)
[14] is a popular NMF-based incremental study algorithm. Consider the canonical reconstruction approximation of the NMF algorithm, V ≈ W × H; each column of both sides satisfies v_j ≈ W × h_j. From this approximation we can see that if we consider each column of the matrix W as a building block of the database and h_j as the corresponding reconstruction coefficient vector, then by summing the building blocks of W weighted by the coefficients in h_j, the original image v_j is approximately reconstructed [2,3]. Every time a new image x comes, the reconstruction approximation becomes [V x] ≈ W × [H h]; by assuming that the previous coefficient matrix H does not change during the incremental process, the computation cost is reduced significantly.
If we define F_k as the cost function of NMF over the first k samples,

F_k = ‖V_k − W H_k‖²_F, (3)

where k denotes the total number of samples before incremental study, then every time the (k+1)th sample x arrives, the corresponding cost function F_{k+1} can be defined as

F_{k+1} = ‖V_{k+1} − W H_{k+1}‖²_F. (4)

Note that F_{k+1} can be separated into two parts,

F_{k+1} = ‖V_k − W (H_{k+1})_{1:k}‖²_F + ‖x − W h‖², (5)

where x and h are the newly arrived image and its corresponding coefficient vector. By assuming that the first k columns of H_{k+1} do not change after incremental study, the first part of (5) can be rewritten in terms of F_k (6), so F_{k+1} can be rewritten as

F_{k+1} ≈ F_k + ‖x − W h‖². (7)

Considering that F_k is a function independent of h, the partial derivative ∂F_k/∂h = 0. Thus the partial derivatives ∂F_{k+1}/∂h (= ∂‖x − Wh‖²/∂h) and ∂F_{k+1}/∂W can be deduced, and the update rules of h and W can be formulated within the framework of gradient descent.
In order to save space, here we just list the iterative update rules of INMF in (8).
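Since the printed form of (8) is not reproduced above, the following numpy sketch shows a commonly cited multiplicative form of the INMF step: only the new coefficient column is fitted, and the basis is refreshed from cached statistics of the old samples. Variable names and the toy data are ours, and this is an illustration under those assumptions rather than the paper's exact rules:

```python
import numpy as np

def inmf_step(W, A, B, x, n_iter=100, eps=1e-9):
    """One INMF incremental step for a new sample x (a sketch of (8)).

    A = V_k @ H_k.T and B = H_k @ H_k.T are running statistics of the
    previous samples; the old coefficients H_k stay frozen.
    """
    r = W.shape[1]
    h = np.full(r, 1.0 / r)                       # new coefficient vector
    for _ in range(n_iter):
        h *= (W.T @ x) / (W.T @ W @ h + eps)      # update the new column only
        num = A + np.outer(x, h)                  # = V_{k+1} @ H_{k+1}.T
        den = W @ (B + np.outer(h, h)) + eps      # = W @ H_{k+1} @ H_{k+1}.T
        W *= num / den                            # full basis update
    return W, h, A + np.outer(x, h), B + np.outer(h, h)

rng = np.random.default_rng(0)
W0 = rng.random((10, 3)) + 0.1
Hk = rng.random((3, 8)) + 0.1
Vk = W0 @ Hk                                      # 8 "old" samples
x = np.abs(rng.random(10))                        # the (k+1)th sample
W1, h, A1, B1 = inmf_step(W0.copy(), Vk @ Hk.T, Hk @ Hk.T, x)
```

Caching A and B is what makes the step cheap: the cost per new sample no longer depends on reprocessing all k old images.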

IGNMF and B-IGNMF
Bucak and Gunsel [14] introduced an incremental nonnegative matrix factorization algorithm (INMF) which brings the NMF algorithm into incremental study, so INMF inherits a disadvantage of the NMF algorithm: it does not consider the geometric structure of the data. In this section, we introduce an incremental graph regularized nonnegative matrix factorization algorithm (IGNMF), in which manifold regularization is introduced to overcome this limitation.

IGNMF.
Let V_k, W_k, H_k, S_k, and F_k(W, H) denote the corresponding V, W, H, S, and F(W, H) in (1) under the initial k samples, so the objective function F_k can be rewritten as

F_k = ‖V_k − W_k H_k‖²_F + λ Tr(H_k L_k H_kᵀ). (9)

Let V_{k+1}, W_{k+1}, H_{k+1}, S_{k+1}, and F_{k+1} denote the corresponding V, W, H, S, and F(W, H) in (1) when the (k+1)th sample arrives, so the objective function F_{k+1} can be rewritten as

F_{k+1} = ‖V_{k+1} − W_{k+1} H_{k+1}‖²_F + λ Tr(H_{k+1} L_{k+1} H_{k+1}ᵀ). (10)

W can be considered the basis matrix and H the coefficient matrix, so the reconstruction process can be thought of as summing the columns of W weighted by the coefficient vector h_j, just as v_j ≈ W × h_j [3]. Thus we make the assumption that, during the incremental process, when the (k+1)th sample arrives, the first k columns of H_{k+1} do not change, which means the first k columns of H_{k+1} are approximately equal to H_k. This assumption reduces the computation cost significantly: we just need to update the last column of H_{k+1}, while W_{k+1} is updated completely. Experiments show that IGNMF iterates fewer than 5 times before the objective function converges to its minimum, which dramatically reduces the cost of incremental study. For more details about this assumption, please refer to [14].
Assume that G_{k+1} refers to the objective function corresponding to the GNMF representation of the first k samples when the (k+1)th sample arrives, that S̃_{k+1} refers to the k × k matrix equal to the first k rows and first k columns of S_{k+1}, and that r is a predefined parameter indicating that each m-dimensional facial image vector is mapped into an r-dimensional vector. Then G_{k+1} can be rewritten accordingly (11). Consequently, the objective function F_{k+1} can be rewritten as the sum of G_{k+1} and a term f_{k+1} contributed by the new sample (12), where x denotes the newly arrived facial vector and h the coefficient vector corresponding to x.
After constructing the objective function given by (12), the gradient descent optimization that yields IGNMF can be performed. The update rule of h can be formulated within the framework of gradient descent as in (13). In (13), ∂F_{k+1}/∂h is the partial derivative of F_{k+1} with respect to h, and ∂F_{k+1}/∂h = ∂G_{k+1}/∂h + ∂f_{k+1}/∂h; because G_{k+1} is a function that does not depend on h, ∂G_{k+1}/∂h = 0, and we just need to compute the partial derivative of f_{k+1}, given in (14). The step size is calculated by (15), where (H_{k+1})_{k+1}, (S_{k+1})_{k+1,:}, and (D_{k+1})_{k+1,:} denote the coefficient vector of the (k+1)th sample and the (k+1)th rows of S_{k+1} and D_{k+1}, respectively.
After substituting (14) and (15) into (13), the update rule for h is obtained as (16). The update rule for W_{k+1} can also be formulated within the framework of gradient descent, as in (17). In (17), ∂F_{k+1}/∂W_{k+1} is the partial derivative of F_{k+1} with respect to W_{k+1} and is given in (18); the step size is calculated by (19). After substituting (18) and (19) into (17), the update rule for W_{k+1} is obtained as (20). We omit the proof of convergence as it is similar to that of GNMF, since IGNMF merely assumes that the first k columns of H are not updated during the iterations.
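The overall step can be sketched as follows. Since the printed update rules (16) and (20) are not reproduced above, this numpy sketch uses a GNMF-style multiplicative analogue restricted to the last coefficient column, under the paper's frozen-columns assumption; the function name, λ, and the toy data are ours:

```python
import numpy as np

def ignmf_step(V, W, H_old, S, lam=0.1, n_iter=20, eps=1e-9):
    """Incremental graph-regularized NMF step for one new sample (sketch).

    V:     m x (k+1) data, last column is the new sample x.
    H_old: r x k frozen coefficients of the old samples.
    S:     (k+1) x (k+1) graph weight matrix; lam weights the
           graph-regularization term.
    Only the new coefficient column h and the full basis W are updated.
    """
    D = np.diag(S.sum(axis=1))
    r = W.shape[1]
    h = np.full(r, 1.0 / r)
    for _ in range(n_iter):
        H = np.hstack([H_old, h[:, None]])
        num = W.T @ V[:, -1] + lam * (H @ S)[:, -1]    # GNMF-style numerator
        den = W.T @ W @ h + lam * (H @ D)[:, -1] + eps
        h = h * num / den                              # update new column only
        H = np.hstack([H_old, h[:, None]])             # refresh with updated h
        W = W * (V @ H.T) / (W @ (H @ H.T) + eps)      # full basis update
    return W, h

rng = np.random.default_rng(0)
V = np.abs(rng.random((10, 6)))            # 5 old samples + 1 new sample
H_old = np.abs(rng.random((3, 5))) + 0.1   # frozen old coefficients
W0 = np.abs(rng.random((10, 3))) + 0.1
S = np.ones((6, 6)) - np.eye(6)            # toy fully connected graph
W1, h_new = ignmf_step(V, W0, H_old, S)
```

The graph term pulls h toward the coefficients of its graph neighbors, which is how the manifold information survives the incremental update.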

Batch-IGNMF.
In practice, images often come in batches. For example, a face recognition system may need to record a batch of images belonging to a new class or to many different classes.
Once several images come, our IGNMF needs to run once per image, which is time-consuming if there are too many images. In this section, we propose an improved version of the IGNMF algorithm which can deal with a batch of images. This improved version is named Batch-IGNMF (B-IGNMF); it is able to perform incremental study on a batch of images no matter whether the batch belongs to the same class or to different classes.
Let X_s denote the s newly arrived images. Then V_{k+s} is the total sample matrix, with its first k columns equal to V_k and the remaining s columns equal to X_s. W_{k+s} and H_{k+s} denote the optimized factor matrices of V_{k+s}; H_k, the first k columns of H_{k+s}, is assumed not to change during incremental study; H_s, the last s columns of H_{k+s}, denotes the coefficient matrix corresponding to X_s; and (S_{k+s})_{k+1∼k+s,:} denotes the rows of S_{k+s} from k+1 to k+s.
The objective function F_{k+s} can then be rewritten as in (21). We omit the detailed derivation process due to its similarity to the derivation of IGNMF and just list the multiplicative update rules as (22).
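Since the printed form of (22) is not reproduced above, the batch step can be sketched by generalizing the single-sample sketch to the last s coefficient columns. This is an illustrative multiplicative analogue under our naming, not the paper's exact rules:

```python
import numpy as np

def bignmf_step(V, W, H_old, S, s, lam=0.1, n_iter=20, eps=1e-9):
    """Batch incremental step (illustrative sketch of B-IGNMF).

    V:     m x (k+s) data, last s columns are the new batch X_s.
    H_old: r x k frozen coefficients; only the new coefficient block
           H_s and the full basis W are updated.
    S:     (k+s) x (k+s) graph weight matrix; lam is the trade-off weight.
    """
    D = np.diag(S.sum(axis=1))
    r = W.shape[1]
    Hs = np.full((r, s), 1.0 / r)                 # new coefficient block
    for _ in range(n_iter):
        H = np.hstack([H_old, Hs])
        num = W.T @ V[:, -s:] + lam * (H @ S)[:, -s:]
        den = W.T @ W @ Hs + lam * (H @ D)[:, -s:] + eps
        Hs = Hs * num / den                       # update the new s columns
        H = np.hstack([H_old, Hs])                # refresh with updated block
        W = W * (V @ H.T) / (W @ (H @ H.T) + eps) # full basis update
    return W, Hs

rng = np.random.default_rng(0)
V = np.abs(rng.random((10, 8)))                   # 5 old samples + batch of 3
H_old = np.abs(rng.random((3, 5))) + 0.1
W0 = np.abs(rng.random((10, 3))) + 0.1
S = np.ones((8, 8)) - np.eye(8)                   # toy fully connected graph
W1, Hs = bignmf_step(V, W0, H0 := H_old, S, s=3)
```

Processing the s new columns together is what saves the s separate IGNMF runs described above, whether or not the batch comes from one class.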

Experiments
In this section, the FERET database [23,24] and the CMU-PIE database [25] are selected to evaluate the performance of our IGNMF and B-IGNMF algorithms, along with two canonical face recognition algorithms, supervised GNMF (GNMF-S) and unsupervised GNMF (GNMF-U), and three incremental algorithms, INMF, CINMF, and IOPNMF. NMF is selected as the baseline. The stopping condition of the iterative updates in all experiments is defined in (23), where F(t) is the value of the criterion function defined in (9) at the tth iteration.
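The printed form of (23) is not reproduced above; a standard choice for such a stopping condition, which we assume here purely for illustration, is a relative-change test on successive objective values:

```python
def converged(f_prev, f_curr, tol=1e-4):
    """Relative-change stopping test on the criterion function F.

    We assume (23) has the usual form |F(t) - F(t-1)| / F(t-1) < tol;
    tol is our illustrative threshold, not the paper's value.
    """
    return abs(f_curr - f_prev) / max(f_prev, 1e-12) < tol
```

In an update loop, `converged` is checked after each iteration and the multiplicative updates stop at the first iteration where it returns True.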
In the FERET database [23,24], we select 200 individual faces (7 images each). All images in the database are aligned by the centers of the eyes and mouth and then resized to 25 × 25. The CMU-PIE database includes a total of 68 individuals [25], and we select all 68 individuals, each of which has 56 images.

Incremental Study for a Single Image.
In these experiments, we choose 5 images per individual as the training set from the FERET database and 42 images per individual from the CMU-PIE database. Our experiments are performed as follows: first, GNMF and NMF are run for initialization; second, IGNMF and INMF perform incremental study, while GNMF and NMF are rerun from scratch every time a new image comes. We choose one image of each individual from the rest of the database for incremental study, so all algorithms are run a total of 200 times for the FERET database and 68 times for the CMU-PIE database. Notice that INMF is an unsupervised learning algorithm, so unsupervised IGNMF is used for comparison.
Figures 3 and 4 illustrate the curves of recognition rates for INMF, IGNMF, NMF, and GNMF during incremental study. We can see that when a small number of new images come, the incremental methods (INMF and IGNMF) perform better than the canonical methods (NMF and GNMF). We also observe that the incremental methods keep the recognition rate increasing smoothly as the number of images increases, while the canonical methods make the recognition rate fluctuate. This is because every time we rerun the canonical methods, the initial values of W and H are randomly assigned; different initial values lead to different local minima, which in turn affect the recognition rate significantly. It is also noted that the recognition rates of IGNMF and GNMF are better than those of NMF and INMF; this is because the manifold method, which preserves the geometrical information of the original data, contributes to the improvement of the recognition rate.
Figures 5 and 6 illustrate the running times of the selected algorithms on both databases. From the figures we can see that the mean running time of IGNMF for a single image is close to that of INMF and smaller than those of GNMF and NMF; that is, IGNMF achieves a better recognition rate than INMF at a comparable cost and runs faster than rerunning GNMF or NMF.

Incremental Study for One Batch of Images Belonging
to One Class. In this section, our B-IGNMF and two other typical incremental algorithms (CINMF and IOPNMF) are evaluated along with supervised GNMF, unsupervised GNMF, and NMF. Because CINMF is a supervised learning algorithm while IOPNMF is an unsupervised one, both supervised B-IGNMF (BIGNMF-S) and unsupervised B-IGNMF (BIGNMF-U) are evaluated.
We choose 5 images per individual as the training set from the FERET database and 42 images per individual from the CMU-PIE database. First, we choose 195 individuals from the FERET database and 63 individuals from the CMU-PIE database for initial study. Second, one of the remaining five individuals is added for incremental study at a time, for a total of 5 rounds. The three parameters of CINMF are set to 10, 10⁻⁷, and 6 for the FERET database and to 0.1, 10⁻⁴, and 6 for the CMU-PIE database; the parameter of IOPNMF is set to 200 for the FERET database and 136 for the CMU-PIE database. Tables 1 and 2 show the recognition rates for the two databases; "Add 1" means the recognition rate after adding one batch of images to "Start," "Add 2" means the recognition rate after adding another batch of images to "Add 1," and so on.
Tables 1 and 2 show the recognition rates for the two databases. We can see that the recognition rate of our B-IGNMF is close to that of GNMF, both for the supervised and the unsupervised variant; in some cases B-IGNMF is slightly better than GNMF. The recognition rates of GNMF-U, GNMF-S, and NMF fluctuate during incremental study. In the experiments we found that CINMF was the fastest to start (study without increments); it needs less than 40 s for the FERET database and less than 15 s for the CMU-PIE database, but during the incremental process the recognition rate of our B-IGNMF outperforms CINMF while being much faster. It is noted that the recognition rate of IOPNMF is close to that of NMF for the FERET database while being the worst for the CMU-PIE database; the reason is unknown. Figures 7 and 8 illustrate the mean running times for all the selected algorithms on both databases; from the figures we can see that the mean running time of B-IGNMF for one batch of images belonging to one class is smaller than those of the other NMF-based incremental algorithms, both for the supervised and the unsupervised variant, which means our proposed algorithms achieve better recognition rates than the other NMF-based incremental algorithms while running faster.

Incremental Study for One Batch of Images Belonging to
Different Classes. In this section, incremental study for one batch of images belonging to different classes is performed. Our B-IGNMF, in both its supervised and unsupervised variants, is selected, along with supervised GNMF, unsupervised GNMF, IOPNMF, and NMF. The experiments are designed as follows: we choose 4 images per individual (out of 7 images in total) as the starting training set from the FERET database; the rest of the images are considered the testing set. For each incremental study round, one image per individual from the testing set is selected to form one batch, giving 200 images per batch. The incremental study for FERET is performed twice. For the CMU-PIE database, we choose 42 images per individual (out of 56 images in total) as the starting training set; the rest of the images are considered the testing set. For each incremental study round, 2 images per individual from the testing set are selected to form one batch, giving 68 × 2 images per batch. The incremental study for CMU-PIE is performed 5 times. Tables 3 and 4 show the recognition rates for the two databases; "Add 1" means the recognition rate after adding one batch of images to "Start," "Add 2" means the recognition rate after adding another batch of images to "Add 1," and so on.
Tables 3 and 4 show the recognition rates for the two databases. We can see that the recognition rate of our B-IGNMF, both supervised and unsupervised, is close to that of GNMF and better than that of NMF. Also, the recognition rate of both B-IGNMF variants increases as each new batch of images arrives, which illustrates the effectiveness of the new incremental study algorithms. Note that in the experiment on the FERET database there are 4 images per individual at the start, 800 images in total; after adding the second batch, we have added 400 images in total (2 images per individual). This means that even when the number of new images reaches 50% of the starting training set, our B-IGNMF still works. It is noted that the recognition rate of IOPNMF is close to that of NMF for the FERET database while being the worst for the CMU-PIE database; the reason is unknown. Figures 9 and 10 illustrate the mean running times for all the selected algorithms on both databases. From the figures we can see that the mean running time of our B-IGNMF for one batch of images belonging to different classes is smaller than those of the other NMF-based incremental algorithms, both supervised and unsupervised; the recognition rate of our proposed algorithms is close to that of rerunning the GNMF algorithms (sometimes slightly better), but is obtained faster.

Conclusions
In this paper, a novel incremental study method named incremental graph regularized nonnegative matrix factorization (IGNMF) has been proposed for face recognition. IGNMF brings the graph regularized nonnegative matrix factorization algorithm into incremental study. The proposed IGNMF is able to preserve discrimination information under the incremental study framework. In addition, we adapted our IGNMF algorithm to deal with learning from image batches, resulting in another new learning method called Batch-IGNMF (B-IGNMF). Experiments show that the recognition rates of our IGNMF and B-IGNMF algorithms are close to those of the GNMF algorithms while running faster. The running times of our IGNMF and B-IGNMF algorithms are close to that of INMF and faster than other popular NMF-based incremental face recognition algorithms, while the recognition rates of our IGNMF and B-IGNMF algorithms outperform theirs. Finally, we point out that if IGNMF and B-IGNMF run too many times, the recognition rate becomes worse than that of rerunning GNMF. The reason is that the assumption that H_k remains unchanged during the iterations cannot hold if the algorithms run too many times. Our future work will therefore focus on resolving this issue.

Figure 1: All images of one person from the FERET database.

Figure 2: All images of one person from the CMU-PIE database.

Figure 3: The average recognition rate during incremental study for the FERET database.
Figure 4: The average recognition rate during incremental study for the CMU-PIE database.

Figure 5: The mean running time during incremental study with a single image for the FERET database.
Figure 6: The mean running time during incremental study with a single image for the CMU-PIE database.

Figure 7: The mean running time during incremental study with images belonging to one class for the FERET database.
Figure 8: The mean running time during incremental study with images belonging to one class for the CMU-PIE database.

Figure 9: The mean running time during incremental study with images belonging to different classes for the FERET database.

Figure 10: The mean running time during incremental study with images belonging to different classes for the CMU-PIE database.
(2) The unsupervised method is achieved by defining N_p(x_i) as the set of p nearest neighbors of x_i under the Euclidean distance.

Table 1: The recognition rates (%) for the FERET database during adding 5 batches of images belonging to one class.

Table 2: The recognition rates (%) for the CMU-PIE database during adding 5 batches of images belonging to one class.

Table 3: The recognition rates (%) for the FERET database during adding 2 batches of images belonging to different classes.

Table 4: The recognition rates (%) for the CMU-PIE database during adding 5 batches of images belonging to different classes.