A Curvelet Domain Face Recognition Scheme Based on Local Dominant Feature Extraction

A feature extraction algorithm is introduced for face recognition, which efficiently exploits the local spatial variations in a face image utilizing curvelet transform. Although multi-resolution ideas have been profusely employed for addressing face recognition problems, theoretical studies indicate that digital curvelet transform is an even better method due to its directional properties. Instead of considering the entire face image, an entropy-based local band selection criterion is developed for feature extraction, which selects high-informative horizontal bands from the face image. These bands are segmented into several small spatial modules to capture the local spatial variations precisely. The effect of modularization in terms of the entropy content of the face images has been investigated. Dominant curvelet transform coefficients corresponding to each local region residing inside the horizontal bands are selected, based on the proposed threshold criterion, as features, which not only drastically reduces the feature dimension but also provides high within-class compactness and high between-class separability. A principal component analysis is performed to further reduce the dimensionality of the feature space. Extensive experimentation is carried out upon standard face databases and a very high degree of recognition accuracy is achieved even with a simple Euclidean distance based classifier.


Introduction
Automatic face recognition has widespread applications in security, authentication, surveillance, and criminal identification. Conventional ID card and password-based identification methods, although very popular, are no longer as reliable as before because of advanced forgery and password-hacking techniques. As an alternative, biometrics, defined as intrinsic physical or behavioral traits of human beings, are being used for identity access management [1]. The main advantage of biometric features is that they are not prone to theft or loss and do not rely on the memory of their users. Moreover, biometrics such as palm-print, finger-print, face, and iris do not change significantly over time, and it is difficult for a person to alter his or her own physiological biometric or to imitate that of another person. Among physiological biometrics, the face is gaining popularity because of its nonintrusiveness and high degree of security. Moreover, unlike iris or finger-print recognition, face recognition does not require high-precision equipment or user agreement during image acquisition, which makes it even more popular for video surveillance. Nevertheless, face recognition is a complicated visual task even for humans. The primary difficulty arises from the fact that different images of a particular person may vary largely, while images of different persons may not necessarily vary significantly. Moreover, variations in illumination, pose, position, scale, environment, accessories, and age make the recognition task more complicated.
Face recognition methods are based on extracting unique features from face images. In this regard, face recognition approaches can be classified into two main categories: holistic and texture-based [2][3][4]. Holistic or global approaches to face recognition involve encoding the entire facial image in a high-dimensional space [2]. These approaches assume that all faces are constrained to particular positions, orientations, and scales and, hence, are very sensitive to pose variations [5]. In contrast, texture-based approaches rely on the detection of individual facial characteristics and their geometric relationships prior to performing face recognition [3,4]. Edge information of faces has also been used for face recognition. A line edge map approach was proposed in [6], which gives a distance measurement between two line edge maps of faces and performs face matching based on those measures.
Apart from these approaches, face recognition can also be performed by using different local regions of face images [7][8][9]. It is well known that although face images are affected by variations such as nonuniform illumination, expressions, and partial occlusions, these variations are confined mostly to local regions. A local binary pattern was applied in [8] as a texture descriptor. The local pattern is extracted by binarising, pixel by pixel, the gradients from the center point to its eight neighboring points, and this binary pattern is used as the image feature for classification. It is expected that capturing the localized variations of images would result in better recognition accuracy [9]. In this regard, wavelet analysis, which possesses good spatial-frequency localization characteristics, is also employed to detect facial geometric structure [10].
Over the past two decades, following wavelets, other multi-resolution tools [11,12], such as contourlets [13], ridgelets [14], and curvelets [15], to name a few, were developed. These tools have better directional decomposition capabilities than wavelets. These new techniques were used for image processing problems like image compression [16] and denoising [17], but not for addressing problems related to computer vision. In [18], it has been shown that curvelets can be used for pattern recognition problems as well. In [19], it has been shown that curvelets can indeed supersede wavelets as bases for face recognition. Hence, it is worthwhile to utilize local variations of the human face for feature extraction and thereby develop a holistic face recognition scheme that incorporates the advantageous properties of the texture-based approach.
The objective of this paper is to develop a feature extraction algorithm for face recognition using multi-resolution tools, based on dominant curvelet domain features extracted from local zones instead of the entire face image. In order to exploit the high-informative areas of a face image, an entropy-based horizontal band selection criterion is presented. Such high-informative bands are further divided into smaller spatial modules to extract local variations in detail. The effect of modularization in terms of the entropy content of the face images has been investigated. A feature extraction algorithm using the digital curvelet transform is developed, which operates within those local zones to extract dominant spectral features. The curvelet transform is used in preference to the wavelet transform because it possesses better directional decomposition properties. It is shown that the discriminating capabilities of the proposed features are enhanced because of modularization of the face images. The variation of recognition performance with the module size has been investigated. Moreover, the improvement in the quality of the extracted features as a result of illumination adjustment has also been analyzed. In view of further reducing the computational complexity, principal component analysis is performed on the proposed feature space. Finally, the face recognition task is carried out using a distance-based classifier.

Brief Description of the Proposed Scheme
A typical face recognition system consists of some major steps, namely, input face image collection, preprocessing, feature extraction, classification, and template storage or database, as illustrated in Figure 1. The input image can generally be collected from a video camera, still camera, or surveillance camera. In the process of capturing images, distortions including rotation, scaling, shift, and translation may be present in the face images, which make it difficult to locate the face at the correct position. Preprocessing removes any unwanted objects (such as background) from the collected image. It may also segment the face image for feature extraction. For the purpose of classification, an image database needs to be prepared consisting of template face poses of different persons. The recognition task is based on comparing a test face image with template data. Comparing the images themselves would require extensive computation. Thus, instead of utilizing the raw face images, some characteristic features are extracted for preparing the template. It is to be noted that the recognition accuracy strongly depends upon the quality of the extracted features. Therefore, the main focus of this research is to develop an efficient feature extraction algorithm.
The proposed feature extraction algorithm is based on extracting spatial variations precisely from high-informative local zones of the face image instead of utilizing the entire image. In view of this, an entropy-based selection criterion is developed to select high-informative facial zones. A modularization technique is then employed to segment the high-informative zones into several smaller segments. It should be noted that variation of illumination across different face images of the same person may affect their similarity. Therefore, prior to feature extraction, an illumination adjustment step is included in the proposed algorithm. After feature extraction, a classifier compares features extracted from face images of different persons, and a database is used to store registered templates and also for verification purposes.

Proposed Method
For any type of biometric recognition, the most important task is to extract distinguishing features from the training biometric traits, which directly dictates the recognition accuracy. In comparison to person recognition based on other biometric features, face image-based recognition is very challenging even for a human being, as face images of different persons may seem similar whereas face images of a single person may seem different under different conditions. Thus, obtaining a significant feature space with respect to the spatial variation in a human face image is crucial. Moreover, a direct subjective correspondence between face image features in the spatial domain and those in the frequency domain is not very apparent. In what follows, we demonstrate the proposed feature extraction algorithm for face recognition, where spatial domain local variation is extracted using the curvelet transform.

Entropy-Based Horizontal Band Selection.
The information contents of different regions of a human face image vary widely [20]. It can be shown that if an image of a face were divided into certain segments, not all the segments would contain the same amount of information. It is expected that a close neighborhood of the eyes, nose, and lips contains more information than the other regions of a human face image. A region with high information content is therefore the region of interest for the purpose of feature extraction. However, identification of these regions is not a trivial task. Estimating the amount of information in a given image region can be used to identify those significant zones. In this paper, in order to determine the information content in a given area of a face image, an entropy-based measure of intensity variation is defined as [21]

$$H = -\sum_{k=1}^{m} p_k \log_2 p_k, \tag{1}$$

where the probabilities $\{p_k\}_{k=1}^{m}$ are obtained based on the intensity distribution of the pixels of a segment of an image. It is to be mentioned that the information in a face image exhibits variations more prominently in the vertical direction than in the horizontal direction [22]. Thus, the face image is proposed to be divided into several horizontal bands, and the entropy of each band is computed. It has been observed from our experiments that variation in entropy is closely related to variation in the face geometry. Figure 2(b) shows the entropy values obtained in different horizontal bands of a person for several sample face poses. One of the poses of the person is shown in Figure 2(a). As expected, it is observed from the figure that the neighborhood of the eyes, nose, and lips contains more information than the other regions. Moreover, it is found that the locus of entropies obtained from different horizontal bands can trace the spatial structure of a face image. Hence, for feature extraction in the proposed method, spatial horizontal bands of face images are chosen according to their entropy content.
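The band selection described above can be sketched as follows. This is a minimal illustration, assuming 8-bit grayscale images, non-overlapping bands of equal height, and a base-2 logarithm; the function names `band_entropy` and `select_bands` and their parameters are hypothetical and not taken from the paper.

```python
import numpy as np

def band_entropy(band, levels=256):
    """Shannon entropy (in bits) of the intensity histogram of one band."""
    hist, _ = np.histogram(band, bins=levels, range=(0, levels))
    p = hist[hist > 0] / hist.sum()          # empirical probabilities p_k
    return float(-np.sum(p * np.log2(p)))

def select_bands(image, band_height, n_bands):
    """Return (row_start, entropy) pairs for the n_bands highest-entropy bands.

    Assumption: bands are non-overlapping and rows that do not fill a
    whole band at the bottom of the image are ignored.
    """
    rows = image.shape[0]
    starts = range(0, rows - band_height + 1, band_height)
    scored = [(r, band_entropy(image[r:r + band_height])) for r in starts]
    return sorted(scored, key=lambda t: t[1], reverse=True)[:n_bands]

# Synthetic example: a flat band, a two-valued band, and another flat band.
flat = np.full((8, 16), 100, dtype=np.uint8)
two_valued = np.tile(np.array([0, 255], dtype=np.uint8), (8, 8))
image = np.vstack([flat, two_valued, flat])
best = select_bands(image, band_height=8, n_bands=1)
```

On this synthetic image the middle band is selected, because a constant band carries zero entropy while a band with two equally likely intensities carries one bit.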

Proposed Curvelet-Domain Feature.
For biometric recognition, feature extraction can be carried out using mainly two approaches, namely, the spatial domain approach and the frequency domain approach [23]. The spatial domain approach utilizes the spatial data directly from the face image or employs some statistical measure of the spatial data. On the other hand, frequency domain approaches employ some kind of transform over the face images for feature extraction. In the case of frequency domain feature extraction, pixel-by-pixel comparison between face images in the spatial domain is not necessary. Phenomena such as rotation, scale, and illumination are more severe in the spatial domain than in the frequency domain.
The curvelet transform is a relatively new multiscale and multidirectional transform. Curvelets exhibit a highly anisotropic shape obeying a parabolic-scaling relationship [19]. The curvelet transform has been developed for image analysis to provide improved directional capability and a better ability to represent edges and other singularities along curves, as compared to traditional multiscale transforms such as the wavelet transform. In order to implement the curvelet transform, first a two-dimensional discrete Fourier transform (2D DFT) of the image is taken. Then the 2D frequency plane is divided into parabolic wedges. Finally, an inverse Fourier transform of each wedge is taken to find the curvelet coefficients at each scale and angle. There are two different digital implementations of the Fast Digital Curvelet Transform (FDCT) [24], namely, (i) Curvelets via USFFT (Unequally Spaced Fast Fourier Transform) and (ii) Curvelets via Wrapping. In this research, we have implemented the Curvelets via USFFT. These digital transformations are linear and take as input Cartesian arrays of the form $f[t_1, t_2]$, $0 \le t_1, t_2 < n$, which allows us to think of the output as a collection of coefficients $c^{D}(j, l, k)$ obtained by

$$c^{D}(j, l, k) := \sum_{0 \le t_1, t_2 < n} f[t_1, t_2]\, \overline{\varphi^{D}_{j,l,k}[t_1, t_2]}, \tag{2}$$

where each $\varphi^{D}_{j,l,k}$ is a digital curvelet waveform.
We intend to demonstrate that the distinguishability of face images of separate persons is enhanced in the frequency domain. In Figure 3, two sample face images of two different persons are shown. The Euclidean distance between the raw face images and that between their corresponding curvelet coefficients are shown in Figure 4. It is observed from the figure that the latter provides a comparatively higher Euclidean distance than the former, which indicates better discriminating capability.
In order to demonstrate the effect of rotation on the extracted features in the frequency domain, two face images are shown in Figure 5. The two images are of the same person, except that in the second image the person's head is slightly rotated. The Euclidean distance between the raw face images and that between their corresponding curvelet coefficients are shown in Figure 6. It is evident from the figure that the latter provides a Euclidean distance that is orders of magnitude lower than the former, which indicates a strong correlation and thus a better match.
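The three implementation steps named in this section (2D DFT, division of the frequency plane into wedges, inverse transform of each wedge) can be illustrated with a deliberately crude sketch. It is not the FDCT of [24]: it assumes a single scale, hard-edged angular wedges without smooth windowing, and neither USFFT nor wrapping. The function name `wedge_coefficients` and the parameters `n_angles`, `r_min`, and `r_max` are hypothetical.

```python
import numpy as np

def wedge_coefficients(image, n_angles=8, r_min=0.05, r_max=0.5):
    """Single-scale, hard-wedge stand-in for a directional transform (NOT the FDCT)."""
    rows, cols = image.shape
    # Step 1: 2D DFT, shifted so that zero frequency sits at the centre.
    spectrum = np.fft.fftshift(np.fft.fft2(image))
    fy = np.fft.fftshift(np.fft.fftfreq(rows))[:, None]
    fx = np.fft.fftshift(np.fft.fftfreq(cols))[None, :]
    radius = np.hypot(fx, fy)
    # Orientation folded into [0, pi) so that opposite frequencies share a wedge.
    angle = np.mod(np.arctan2(fy, fx), np.pi)
    wedge_index = np.minimum((angle / np.pi * n_angles).astype(int), n_angles - 1)
    ring = (radius >= r_min) & (radius < r_max)   # one radial band, i.e. one "scale"
    coeffs = []
    for l in range(n_angles):
        # Step 2: keep only the frequencies inside one angular wedge of the ring.
        mask = ring & (wedge_index == l)
        # Step 3: the inverse DFT of the masked spectrum gives that wedge's coefficients.
        coeffs.append(np.fft.ifft2(np.fft.ifftshift(spectrum * mask)))
    return coeffs

rng = np.random.default_rng(0)
img = rng.random((16, 16))
c = wedge_coefficients(img, n_angles=4)
```

Because the wedges partition the radial band and every step is linear, the wedge outputs sum to the band-pass filtered image. A real curvelet implementation replaces the hard masks with smooth, overlapping windows at several scales.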

Illumination Adjustment.
It is intuitive that images of a particular person captured under different lighting conditions may vary significantly, which can affect the face recognition accuracy. In order to overcome the effect of lighting variation in the proposed method, illumination adjustment is performed prior to feature extraction. Given two images of a single person having different intensity distributions due to variation in illumination conditions, our objective is to provide similar feature vectors for these two images irrespective of the illumination condition. Since feature extraction in the proposed method is performed in the curvelet domain, it is of interest to analyze the effect of variation in illumination on the curvelet-based feature extraction.
In Figure 7, two face images of the same person are shown, where the second image (shown in Figure 7(b)) is made brighter than the first one by changing the average illumination level. The curvelet transform is performed upon each image, first without any illumination adjustment and then after performing illumination adjustment. Considering all the curvelet coefficients to form the feature vectors for these two images, a measure of similarity can be obtained by using correlation. In Figures 8 and 9, the cross-correlation values of the curvelet coefficients obtained from the two images without and with illumination adjustment are shown, respectively. It is evident from these two figures that the latter case exhibits more similarity between the curvelet coefficients, indicating that the features belong to the same person. The similarity measures in terms of Euclidean distances between the curvelet coefficients of the two images for the aforementioned two cases are also calculated. It is found that there exists a huge separation in terms of Euclidean distance when no illumination adjustment is performed, whereas the distance diminishes when illumination adjustment is performed, as expected, which clearly indicates a better similarity between the extracted feature vectors.
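The effect described above can be reproduced on synthetic data. The text does not specify the adjustment that is used, so the sketch below assumes a simple zero-mean, unit-variance normalization; `adjust_illumination` is a hypothetical name, and the comparison is made directly on pixel values rather than on curvelet coefficients.

```python
import numpy as np

def adjust_illumination(image, eps=1e-8):
    """Assumed adjustment: remove the mean intensity and scale to unit variance."""
    img = np.asarray(image, dtype=np.float64)
    return (img - img.mean()) / (img.std() + eps)

rng = np.random.default_rng(0)
face = rng.uniform(0, 150, size=(32, 32))   # stand-in for a face image
brighter = face + 60.0                      # same image, higher average illumination

# Euclidean distance without and with the adjustment.
d_raw = np.linalg.norm(face - brighter)
d_adj = np.linalg.norm(adjust_illumination(face) - adjust_illumination(brighter))
```

Here `d_raw` is large while `d_adj` is essentially zero, since a uniform brightness offset is removed by the mean subtraction. Since the transform is linear, an offset removed in the spatial domain is also absent from the transform coefficients.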

Modularization and Its Effect upon Information Content.
As mentioned earlier, spectral features are to be extracted from portions of a face image where the information content is relatively high. One possible way is to segment the entire face image into several horizontal bands, compute the entropy content of each band, and select the bands with higher entropy contents. It is to be noted that within a particular horizontal band of a face image, the change in information over the band may not be properly captured if the curvelet operation is performed over the entire band. Even if it is performed, it may offer features with very low between-class separation. In order to obtain high within-class compactness as well as high between-class separability, we modularize the horizontal bands into smaller segments, which are capable of extracting variation in image geometry locally within a band.
To further motivate modularizing the high-informative facial bands, three different face images are shown in Figures 10(a)-10(c) along with the corresponding histograms of their intensity distributions in Figures 10(d)-10(f). Based on these histograms, a general trend of the intensity distribution of human face images can be acquired. It is observed that the distribution follows an almost similar pattern for the three different persons. One can compute the information content, in terms of entropy, of a high-informative horizontal band using (1). For the purpose of comparison, the average entropy per segment H is computed from the entropy of the horizontal band by taking into account the total number of segments to be used for modularization. In Figure 11(a), the average entropy per segment H computed for the image shown in Figure 10(a), considering 23 segments each having a size of 28 × 4 pixels, is shown. Next, we consider different small segments of the high-informative horizontal band of the face images and compute the entropy of each segment H_i based on the histogram of the corresponding segment. In Figure 11(a), the entropy values computed from different segments of the high-informative horizontal band of the face image shown in Figure 10(a) are also shown.
However, the size of the module is also an important factor. In Figure 12, the variation of the average entropy of a sample face image with segment size is shown. It is clear that decreasing the size of the modules offers greater entropy values, that is, greater variation in information, which is desirable. However, if the modules were extremely small, the segments would not be capable of exhibiting significant differences between different images.
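The per-segment entropy computation discussed above can be sketched as follows. It assumes an 8-bit grayscale band, modules as tall as the band, and that trailing columns which do not fill a whole module are dropped; the names `split_into_modules`, `entropy_bits`, and `module_entropies` are hypothetical.

```python
import numpy as np

def split_into_modules(band, module_width):
    """Cut a horizontal band into side-by-side modules of equal width.

    Assumption: columns that do not fill a whole module are discarded.
    """
    n = band.shape[1] // module_width
    return [band[:, i * module_width:(i + 1) * module_width] for i in range(n)]

def entropy_bits(block, levels=256):
    """Entropy of the intensity histogram of one block, as in (1)."""
    hist, _ = np.histogram(block, bins=levels, range=(0, levels))
    p = hist[hist > 0] / hist.sum()
    return float(-np.sum(p * np.log2(p)))

def module_entropies(band, module_width):
    """Entropy H_i of every module in the band."""
    return [entropy_bits(m) for m in split_into_modules(band, module_width)]

# Synthetic band: left half constant, right half alternating between two values.
band = np.zeros((4, 8), dtype=np.uint8)
band[:, 4:] = np.tile(np.array([0, 255], dtype=np.uint8), (4, 2))
h = module_entropies(band, module_width=4)
```

Repeating the call for several values of `module_width` and averaging the returned list gives one point per segment size, which is the kind of curve reported in Figure 12.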

Proposed Dominant Curvelet-Domain Feature Selection.
Instead of taking the Curvelet coefficients of the entire image, the coefficients obtained from each module of the high-informative horizontal band of a face image are considered to form the feature vector of that image. However, if all of these coefficients were used, the result would be a feature vector with a very large dimension. In view of reducing the feature dimension, we propose to utilize the dominant Curvelet coefficients as the desired features. In order to select the dominant Curvelet coefficients, we propose to consider the frequency of occurrence of the Curvelet coefficients as the determining characteristic. It is expected that coefficients with a higher frequency of occurrence would dominate over the other coefficients in image reconstruction, so that it would be sufficient to consider only those coefficients as features. One way to visualize the frequency of occurrence of Curvelet coefficients is to compute the histogram of the coefficients of a segment of a high-informative horizontal band. In order to select the dominant features from a given histogram, the coefficients having a frequency of occurrence greater than a certain threshold value are considered.
It is intuitive that within a high-informative horizontal band of a face image, the image intensity distribution may change drastically at different localities. If the thresholding operation for selecting the dominant Curvelet coefficients were performed over the coefficients of the entire band, it would be difficult to obtain a global threshold value that is suitable for every local zone. Use of a global threshold in a particular horizontal band of a face image may offer features with very low between-class separation. In order to obtain high within-class compactness as well as high between-class separability, we have considered Curvelet coefficients corresponding to smaller spatial modules residing within a horizontal band, which are capable of extracting variation in image geometry locally. In this case, a different threshold value may have to be chosen for each module depending on the coefficient values of that segment. For a particular module of the face image, we propose to treat the coefficients corresponding to different scales j and orientations l whose frequency of occurrence is greater than θ% of the maximum frequency of occurrence as the dominant Curvelet coefficients, and to select them as the features of that segment. This operation is repeated for all the modules of a face image within the selected high-informative horizontal band.
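The per-module thresholding rule stated above can be sketched as follows. The text does not specify how the histogram is built, so this sketch assumes a histogram of coefficient magnitudes with a fixed number of equal-width bins; `dominant_coefficients`, `theta`, and `bins` are hypothetical names.

```python
import numpy as np

def dominant_coefficients(coeffs, theta=50.0, bins=32):
    """Keep the coefficients of one module that fall in well-populated histogram bins.

    A coefficient is kept when the count of its bin exceeds theta percent
    of the largest bin count of this module (a per-module threshold).
    """
    flat = np.ravel(coeffs)
    values = np.abs(flat)                      # assumption: histogram of magnitudes
    counts, edges = np.histogram(values, bins=bins)
    # Map every coefficient to its bin; the right-most edge belongs to the last bin.
    idx = np.clip(np.digitize(values, edges) - 1, 0, bins - 1)
    keep = counts[idx] > (theta / 100.0) * counts.max()
    return flat[keep]

# Toy module: ten coefficients near 1, three near 5, one near 9.
module_coeffs = np.array([1.0] * 10 + [5.0] * 3 + [9.0])
strict = dominant_coefficients(module_coeffs, theta=50.0, bins=3)
loose = dominant_coefficients(module_coeffs, theta=20.0, bins=3)
```

With `theta=50` only the ten coefficients from the most populated bin survive, while `theta=20` also admits the three from the second bin. Applying the function module by module and concatenating the outputs would give the feature vector of an image; note that the number of retained coefficients can differ between modules, which a full implementation has to handle.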
Next, in order to demonstrate the advantage of extracting dominant Curvelet coefficients corresponding to smaller modules residing in a horizontal band, we conduct an experiment considering two different cases: (i) when the entire horizontal band is used as a whole and (ii) when all the modules of that horizontal band are used separately for feature extraction. For these two cases, centroids of the dominant Curvelet coefficients obtained from several poses of two different persons (shown in Figure 13) are computed and shown in Figures 14 and 15, respectively. It is observed from Figure 14 that the feature centroids of the two persons at different poses are not well separated and, for some poses, even overlap with each other, which clearly indicates poor between-class separability. In Figure 15, it is observed that, irrespective of the poses, the feature centroids of the two persons maintain a significant separation, indicating a high between-class separability, which strongly supports the proposed local feature selection algorithm.
We have also considered dominant feature values obtained for various poses of those two persons in order to demonstrate the within-class compactness of the features. The feature values, along with their centroids, obtained for the two different cases, that is, extracting the features from the horizontal band without and with modularization, are shown in Figures 16 and 17, respectively. It is observed from Figure 16 that the feature values of several poses of the two different persons are significantly scattered around the respective centroids, resulting in poor within-class compactness. On the other hand, it is evident from Figure 17 that the centroids of the dominant features of the two different persons are well separated, with a low degree of scattering among the features around their corresponding centroids. Thus, the proposed dominant features extracted locally within a band offer not only a high degree of between-class separability but also a satisfactory within-class compactness.

Reduction of the Feature Dimension.
For cases where the acquired face images are of very high resolution, the feature vector length may still be very high even after selection of dominant features from the small segments of the high-informative horizontal band of a face image. Further dimensionality reduction may be employed to reduce the computational burden.
Principal component analysis (PCA) is a very well-known and efficient orthogonal linear transformation [25]. It reduces the dimension of the feature space and the correlation among the feature vectors by projecting the original feature space into a smaller subspace through a transformation. The PCA transforms the original p-dimensional feature vector into the L-dimensional linear subspace that is spanned by the leading eigenvectors of the covariance matrix of the feature vectors in each cluster (L < p). PCA is theoretically the optimum transform for given data in the least-squares sense. For a data matrix, $X^T$, with zero empirical mean, where each row represents a different repetition of the experiment and each column gives the results from a particular probe, the PCA transformation is given by

$$Y^T = X^T W = V \Sigma^T, \tag{3}$$

where the matrix Σ is an m × n diagonal matrix with nonnegative real numbers on the diagonal and $W \Sigma V^T$ is the singular value decomposition of X. If q poses of each person are considered and a total of M dominant Curvelet coefficients are selected per image, the feature space per person would have a dimension of q × M. For the proposed dominant spectral features, implementation of PCA on the derived feature space can efficiently reduce the feature dimension without losing much information. Hence, PCA is employed to reduce the dimension of the proposed feature space.
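A minimal sketch of PCA through the singular value decomposition is given below, assuming the feature vectors are stacked as rows of a matrix that plays the role of $X^T$. The function name `pca_reduce` and its return values are hypothetical.

```python
import numpy as np

def pca_reduce(features, n_components):
    """Project row-wise feature vectors onto their leading principal directions.

    features: array of shape (samples, p).
    Returns (projected, mean, components), where components has shape
    (n_components, p) and projected has shape (samples, n_components).
    """
    mean = features.mean(axis=0)
    centered = features - mean                 # enforce zero empirical mean
    # Rows of vt are the principal directions, ordered by decreasing singular value.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    components = vt[:n_components]
    return centered @ components.T, mean, components

# Five 3-dimensional feature vectors that lie on a single line.
t = np.arange(5.0)
X = np.outer(t, [0.6, 0.8, 0.0])
projected, mean, components = pca_reduce(X, n_components=1)
reconstructed = projected @ components + mean
```

Since the toy data are one-dimensional in nature, a single component reconstructs them exactly. For real feature vectors the number of retained components trades accuracy against dimension, and a test vector is projected with the stored `mean` and `components` before classification.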
3.7. Distance-Based Face Recognition.
In the proposed method, for the purpose of recognition using the extracted dominant features, a distance-based similarity measure is utilized. The recognition task is carried out based on the distances of the feature vectors of the training face images from the feature vector of the test image. Let the m-dimensional feature vector for the kth pose of the jth person be $\{\gamma_{jk}(1), \gamma_{jk}(2), \ldots, \gamma_{jk}(m)\}$, and let a test face image f have the feature vector $\{v_f(1), v_f(2), \ldots, v_f(m)\}$. A similarity measure between the test image f of the unknown person and the sample images of the jth person, namely, the average sum-squares distance, Δ, is defined as

$$\Delta_j^f = \frac{1}{q}\sum_{k=1}^{q}\sum_{i=1}^{m}\left|\gamma_{jk}(i) - v_f(i)\right|^2, \tag{4}$$

where a particular class represents a person with q poses. Therefore, according to (4), given the test face image f, the unknown person is classified as the person j among the p classes when

$$\Delta_j^f \le \Delta_g^f, \quad \forall j \ne g \text{ and } \forall g \in \{1, 2, \ldots, p\}. \tag{5}$$

Method         Yale      ORL
Method [26]    98.18%    99.00%
Method [27]    97.70%    N/A

The dominant Curvelet coefficients obtained from all the modules of the high-informative horizontal bands of a face image are used to form the feature vector of that image, and feature dimension reduction is performed using PCA. The recognition task is carried out using a simple Euclidean distance-based classifier as described in Section 3.7. The experiments were performed following the leave-one-out cross-validation rule.
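The distance-based classifier of Section 3.7 and the leave-one-out protocol mentioned above can be sketched as follows, assuming the feature vectors have already been extracted and reduced to a common length. The function names and the dictionary layout of the training set are hypothetical.

```python
import numpy as np

def average_sum_squares(train_person, test_vec):
    """Delta for one person: mean over the q poses of the squared Euclidean distance."""
    diff = train_person - test_vec             # shape (q, m) minus shape (m,)
    return float(np.mean(np.sum(diff ** 2, axis=1)))

def classify(train, test_vec):
    """train maps person id -> (q, m) array; return the id with the smallest Delta."""
    return min(train, key=lambda j: average_sum_squares(train[j], test_vec))

def leave_one_out_accuracy(features, labels):
    """Hold out each sample once, train on the rest, and report the hit rate."""
    features = np.asarray(features, dtype=float)
    labels = np.asarray(labels)
    correct = 0
    for i in range(len(labels)):
        mask = np.arange(len(labels)) != i
        train = {j: features[mask & (labels == j)] for j in np.unique(labels[mask])}
        correct += int(classify(train, features[i]) == labels[i])
    return correct / len(labels)

# Two well-separated toy classes with three "poses" each.
feats = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
labs = [0, 0, 0, 1, 1, 1]
acc = leave_one_out_accuracy(feats, labs)
```

On these toy clusters every held-out sample is assigned to its own class. With real data the held-out sample's class keeps one pose fewer than the others during that round, which the averaging over poses in the distance accommodates.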
For simulation purposes, N horizontal bands are selected based on the entropy measure described in Section 3.1 and divided further into small modules. The module height is the same as that of the horizontal band, and the module width is chosen based on the face image width. In our simulations, N = 2 for the ORL database and N = 3 for the Yale database are chosen, and the module size is chosen as 32 × 32 pixels for both databases.
In order to show the effectiveness of the proposed local dominant feature extraction scheme, where each module within the high-informative horizontal bands is considered separately, the recognition task is also carried out by considering the entire horizontal bands as a whole using the same feature extraction algorithm. We refer to the latter scheme as the Proposed Scheme Without Modularization (PSWOM) method. For the purpose of comparison, recognition accuracies obtained using the proposed method (Proposed Scheme With Modularization or PSWM), along with those obtained by the PSWOM method and the methods reported in [26,27], are listed in Table 1. Here, in the case of the ORL database, the recognition accuracy for the method in [27] is denoted as not available (N/A). It is evident from the table that the recognition accuracy of the proposed method is higher than those obtained by the other methods for both databases. This indicates the robustness of the proposed method against partial occlusions, expressions, and nonlinear lighting variations. It is to be noted that the recognition accuracy is drastically reduced for the PSWOM method, where, unlike the proposed method, feature extraction is carried out without dividing the horizontal bands into modules.
As mentioned earlier, dominant features are extracted from the small modules of the high-informative horizontal bands of the face images. Next, we intend to demonstrate the effect of variation of module width upon the recognition accuracy obtained by the proposed method. In Figure 20, the recognition accuracies, calculated using both Euclidean distance and Mahalanobis distance measures, obtained for different module widths for both databases are shown. The horizontal band height, and hence the module height, is chosen to be 32 pixels for both the ORL database and the Yale database. It is observed from the figure that the recognition accuracies reduce drastically if the module width is doubled. Note that when the entire horizontal band is considered as a whole, without any modularization, the recognition accuracy falls drastically to a value less than 7.00% for both databases, as expected.
In view of reducing computational complexity, dimension reduction of the feature space plays an important role. In the proposed method, the task of feature dimension reduction is performed using PCA. In Figure 21, the effect of dimension reduction upon recognition accuracy is shown. It is found from this figure that even for a very low feature dimension, the recognition accuracies remain very high for both databases.
For the case of choosing dominant Curvelet coefficients based on the thresholding criterion in the proposed method, the effect of changing the threshold values, that is, incorporating different amounts of Curvelet coefficients, has been investigated. In Figure 22, the variation of recognition accuracy with different threshold values is shown. As the amount of coefficients utilized decreases, the recognition accuracy also decreases, although the recognition accuracies remain sufficiently high even when a very small amount of coefficients is utilized.

Conclusion
The proposed Curvelet-domain dominant feature extraction algorithm provides excellent space-frequency localization, which is clearly reflected in the high within-class compactness and high between-class separability of the extracted features. Instead of using the whole face image for feature extraction at a time, certain high-informative horizontal bands within the image are first selected using the proposed entropy-based measure. Modularization of the horizontal bands is performed, and the effect of modularization has been investigated. The dominant Curvelet coefficient features are then extracted from within the local zones of those horizontal bands. The effect of variation of module size upon recognition performance has been investigated, and it is found that the recognition accuracy does not depend on the module size unless it is extremely large. It has been found that the proposed feature extraction scheme offers the advantage of precisely capturing local variations in the face images, which plays an important role in discriminating different faces. Moreover, it utilizes a very low-dimensional feature space, which ensures a lower computational burden. For the task of classification, a Euclidean distance-based classifier has been employed, and it is found that, because of the quality of the extracted features, such a simple classifier can provide a very satisfactory recognition performance without the need for a more complicated classifier. From our extensive simulations on different standard face databases, it has been found that the proposed method provides high recognition accuracy even for images affected by partial occlusions, expressions, and nonlinear lighting variations.

Figure 1: Block diagram of the proposed method.

Figure 2: (a) Sample face image of a person and (b) entropy values in different horizontal bands of several face poses.

Figure 3: Two sample face images of the two different persons.

Figure 5: Two face images of the same person.
(a) considering 23 segments each having a size of 28 × 4 pixels is shown.Next, we consider different small segments of the highinformative horizontal band of the face images and compute the entropy of each segment H i based on the histogram of corresponding segments.In Figure 11(a), the entropy values computed from different segments of the high-informative horizontal band of the face image shown in Figure 10(a)

Figure 7: Two face images of the same person under different illumination.

Figure 12: Variation of entropy with segment size.

Figure 13: Sample face images of two persons.

Figure 15: Feature centroids of different poses for modularized horizontal band.

Figure 20: Variation of recognition accuracy with module size for the ORL and the Yale databases using the Euclidean distance and the Mahalanobis distance measures.

Figure 22: Variation of recognition accuracy with different threshold values for the ORL Database and the Yale Database using the Euclidean distance and the Mahalanobis distance measures.

Table 1: Comparison of recognition accuracies.