3D face recognition using anthropometric and curvelet features fusion

The Curvelet transform can describe a signal at multiple scales and in multiple directions. To improve the performance of 3D face recognition, we propose an algorithm based on the fusion of Anthropometric and Curvelet features (Anthropometric Curvelet Fusion Face Recognition, ACFFR). First, the eye, nose, and mouth feature regions are extracted using the Anthropometric characteristics and curvature features of the human face. Second, the Curvelet energy features of these facial feature regions at different scales and in different directions are extracted by the Curvelet transform. Finally, the Euclidean distance is used as the similarity measure between template and target. To verify its performance, the proposed algorithm is compared with Anthroface3D and Curveletface3D on the Texas 3D FR database. The experimental results show that the proposed algorithm performs well, with an equal error rate of 1.75% and a rank 1 recognition rate of 97.0%. The proposed algorithm is more robust to expression and illumination changes than Anthroface3D and Curveletface3D.


Introduction
Face recognition is widely used in video, passports, security, psychological research, automatic aid, robotics, fatigue testing, human-machine interfaces, and other applications. Over the past few decades, researchers focused on 2D face recognition. However, 2D face recognition is still limited by pose, expression, illumination, age variation, and other external factors. Researchers are now paying more attention to 3D face recognition, which can overcome the pose and illumination problems of 2D face recognition. In addition, 3D data contain more geometric information for face recognition, such as geodesic distances and curvature characteristics. Research on 3D face recognition has therefore advanced in recent years [1-11].
Existing 3D face recognition algorithms can be divided mainly into two categories: global feature methods and local feature methods. Global feature methods use a unified facial model to represent the human face, such as eigenface [5], Fisher face [6], and ICP techniques [7]. Local feature methods use the features of partial face regions for recognition, such as Gabor features [8], iso-geodesic stripes [1], texture + Curvelet features [9,10], and Anthropometric characteristics [11]. Compared with global feature methods, local feature methods have greater advantages in dealing with changes in expression, pose, illumination, and scale.
Gupta et al. [11] proposed an Anthropometric 3D face recognition algorithm (Anthropometric 3D Face Recognition, Anthroface3D). Following Anthropometric principles, 10 facial basis points were located accurately, and the 3D geodesic distances between these points were used for face recognition. Anthroface3D performs well on the Texas 3D FR database, with an equal error rate of 1.98% and a rank 1 recognition rate of 96.8%.
Elaiwat et al. [9] proposed a face recognition algorithm that fuses 2D texture and 3D information in the Curvelet domain. First, the rigid region of the 2D image and of the 3D image, that is, the face region excluding the mouth and the area around it, is decomposed at multiple scales and in multiple directions. Second, the Curvelet coefficients of the rigid region are obtained by PCA, and key points are located by fusing the 2D and 3D Curvelet coefficients. Then, the local descriptors of the key points are obtained in the Curvelet domain, and recognition is performed with the Mahalanobis distance for 4 regions: 3D scale 2, 3D scale 3, 2D scale 2, and 2D scale 3. Finally, 3D face recognition is achieved by fusing the results of the 4 regions at the decision level.
Inspired by Elaiwat et al. [9] and Gupta et al. [11], this paper proposes a 3D face recognition algorithm using Anthropometric and Curvelet fusion (Anthropometric Curvelet Fusion Face Recognition, ACFFR). Because Anthropometric characteristics and Curvelet features both represent surface information well, we use the Anthropometric characteristics to extract the left eye, right eye, nose, and mouth regions from the face as the local feature regions for 3D face recognition. Then, a Curvelet feature vector is constructed that describes the 4 regions at different scales and orientations. Finally, 3D face recognition is performed using the Euclidean distance. The proposed ACFFR algorithm extracts its feature regions with Anthropometric characteristics, and it is more robust to changes of expression and illumination than the method proposed by Elaiwat et al. [9]. To verify its performance, the proposed ACFFR algorithm is tested on the Texas 3D FR database [12] and compared with the Anthroface3D and Curveletface3D face recognition methods. The experimental results show that the proposed ACFFR algorithm is more robust to changes in expression, illumination, and other environmental conditions.

Location of Facial Feature Region
The facial feature regions are not selected arbitrarily; they must be useful, robust, and salient. The nose, eyes, and mouth are salient, so they are selected as the facial feature regions in the proposed method. Figure 1 is the depth image corresponding to the 15th 3D face image in the Texas 3D FR database [12]. It covers the major curve information of the face surface and reflects the structural information of the whole face. The location of the nose region begins with the detection of the nose tip, and the tip location is then used to detect the nose width points. The nose region is determined by these 3 points. The same 3 points are then used to detect the inner corners of the eyes, from which the eye regions are determined. The 4 facial feature regions are located as follows.

Location of Nose Region
(a) Location of Nose Tip (prn). The nose tip is located from the Gaussian curvature $K$ and the mean curvature $H$ of the face surface $z = f(x, y)$. $K$ and $H$ can be calculated as shown in the following formulae:
$$K = \frac{f_{xx} f_{yy} - f_{xy}^2}{(1 + f_x^2 + f_y^2)^2}, \qquad (1)$$
$$H = \frac{(1 + f_y^2) f_{xx} - 2 f_x f_y f_{xy} + (1 + f_x^2) f_{yy}}{2 (1 + f_x^2 + f_y^2)^{3/2}}, \qquad (2)$$
where $f_x$ and $f_y$ are the first partial derivatives of $f(x, y)$ with regard to $x$ and $y$, respectively, and $f_{xx}$, $f_{yy}$, and $f_{xy}$ are the second partial derivatives of $f(x, y)$ with regard to $x$ and $y$.
The point at which $K > 0$ and $H < 0$ and the absolute values of $K$ and $H$ are maximum corresponds to the location of the nose tip (prn) [11].
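The nose tip search can be illustrated with a short numerical sketch. It assumes that the two curvature quantities are the Gaussian curvature K and the mean curvature H of the depth map, that depth increases toward the sensor (so the nose tip is a peak with H < 0), and that derivatives are approximated by finite differences with `numpy.gradient`. The function names and the choice to rank candidates by |K| alone are illustrative assumptions, not the procedure of [11].

```python
import numpy as np

def surface_curvatures(z, spacing=1.0):
    """Gaussian (K) and mean (H) curvature of a depth map z = f(x, y)."""
    fy, fx = np.gradient(z, spacing)       # first derivatives along rows (y) and columns (x)
    fxy, fxx = np.gradient(fx, spacing)    # second derivatives built from f_x
    fyy, _ = np.gradient(fy, spacing)      # second derivative along y
    g = 1.0 + fx**2 + fy**2
    K = (fxx * fyy - fxy**2) / g**2
    H = ((1 + fy**2) * fxx - 2 * fx * fy * fxy + (1 + fx**2) * fyy) / (2 * g**1.5)
    return K, H

def find_nose_tip(z, spacing=1.0):
    """Return (row, col) of the peak-like point with the largest |K|."""
    K, H = surface_curvatures(z, spacing)
    # Keep only peak-like points (K > 0, H < 0). Ranking them by |K| alone
    # is a simplification assumed for this sketch.
    score = np.where((K > 0) & (H < 0), np.abs(K), -np.inf)
    row, col = np.unravel_index(np.argmax(score), score.shape)
    return int(row), int(col)

# Synthetic "nose": a smooth bump centred at row 20, column 30.
yy, xx = np.mgrid[0:41, 0:61]
bump = 10.0 * np.exp(-((xx - 30) ** 2 + (yy - 20) ** 2) / (2 * 8.0 ** 2))
tip = find_nose_tip(bump)
```

On real range images the depth map would normally be smoothed before differentiation, since second derivatives amplify sensor noise.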

(b) Location of Nose Width Points (al-al).
On average, the human nose has a width with mean $\mu_w$ = 35 mm and standard deviation $\sigma_w$ = 2.5 mm and a height with mean $\mu_h$ = 53 mm and standard deviation $\sigma_h$ = 3.4 mm [13]. The nose width points in the Texas 3D FR database are located as follows [11]. First, to account for variation between individuals, the width of the search region for the points (al-al) about the nose tip is fixed at $\mu_w + 6\sigma_w$, and its height is fixed at $0.6 \times (\mu_h + 6\sigma_h)$. Second, within the search region, the ACFFR algorithm detects edges on the facial range image using a Laplacian of Gaussian edge detector. Figure 2 shows the edges in the search region. Finally, the leftmost and rightmost edge points closest to the nose tip along the vertical direction are taken as the nose width points (al-al).
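As a quick arithmetic check of the search region size, the sketch below reads the two pairs of values quoted from [13] as a mean and a standard deviation, which is an assumption about symbols lost from the text. The function and parameter names are hypothetical.

```python
def nose_width_search_region(mean_width=35.0, sd_width=2.5,
                             mean_height=53.0, sd_height=3.4):
    """Return (width, height) in mm of the al-al search region around the nose tip."""
    width = mean_width + 6.0 * sd_width              # mean + 6 * sd of nose width
    height = 0.6 * (mean_height + 6.0 * sd_height)   # 0.6 * (mean + 6 * sd) of nose height
    return width, height

region_width, region_height = nose_width_search_region()
```

With the quoted values, the region is 50 mm wide and about 44 mm high.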
(c) Location of Nose Region. After steps (a) and (b), the locations of the nose tip and the nose width points are known. The nose region is obtained from these three points (prn, al-al) and the nose height. Then, the nose region is normalized to 100 × . Figure 3 shows the location result of the nose region.

The Location of Left and Right Eye Region
(a) Location of Inner Eye Corners (en-en).
For an adult, the vertical distance between the inner corners of the eyes and the tip of the nose is 0.3803 times the vertical distance between the top of the head and the tip of the nose [14].
First, the upper limit of the search region is set by formula (3) [11] and the lower limit by formula (4) [11]. Both limits depend on the vertical coordinate of the nose tip and on the vertical coordinate of the highest point of the 3D model.
Second, the horizontal limits of the two search regions are determined from the ratio of the horizontal distance between the inner corners of the eyes to the nose width [13]. The limit for the inner corner of the left eye is given by formula (5) [11], and that for the right eye by formula (6) [11]; both depend on the horizontal coordinates of the left and right nose width points. Finally, within each search region, the point that meets the negative-curvature condition of [11] and has the maximum absolute curvature value corresponds to the location of an inner corner of the eye (en-en) [11].
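The anthropometric ratio quoted from [14] can be turned into a simple estimate of where the inner eye corners lie vertically. The sketch below is only that centre estimate; it does not reproduce formulae (3) and (4) of [11], which define the actual upper and lower limits. The function name is hypothetical.

```python
EYE_TO_NOSE_RATIO = 0.3803  # ratio quoted from [14]

def expected_eye_row(nose_tip_y, head_top_y, ratio=EYE_TO_NOSE_RATIO):
    """Estimate the vertical coordinate of the inner eye corners.

    The eyes are placed at `ratio` of the way from the nose tip to the top
    of the head, so the result is valid whether the vertical axis points
    up or down.
    """
    return nose_tip_y + ratio * (head_top_y - nose_tip_y)

# Image-style coordinates: row 200 is the nose tip, row 100 the top of the head.
eye_row = expected_eye_row(200.0, 100.0)
```

A search window would then be placed around this estimate, with its extent chosen to cover variation between individuals.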

(b) Location of Outer Eye Corners (ex-ex).
For an adult, the distance between the inner and the outer corner of an eye is approximately equal to the distance between the inner corners of the two eyes [13]. The position of the outer corner of the left eye is therefore approximated by formula (7) [11], and the position of the outer corner of the right eye by formula (8) [11]. Both formulae use the horizontal and vertical coordinates of the left and right inner eye corners. The inner and outer eye corners in the Texas 3D FR database are located as described above [11].
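One plausible reading of this anthropometric rule is sketched below: each outer corner is placed one inner-corner distance further out from its inner corner, at the same height. This is an assumption made for illustration; formulae (7) and (8) of [11] are not reproduced here and may differ, for example in how the vertical coordinate is chosen. The function name is hypothetical.

```python
def estimate_outer_eye_corners(en_left, en_right):
    """Estimate outer eye corners (ex-ex) from inner eye corners (en-en).

    Each argument is an (x, y) pair. The outer corner is assumed to lie one
    inner-corner distance away from the inner corner, moving away from the
    nose, at the same vertical coordinate (an assumption of this sketch).
    """
    (xl, yl), (xr, yr) = en_left, en_right
    d = abs(xr - xl)                    # horizontal distance between inner corners
    step = d if xl <= xr else -d        # outward direction depends on axis orientation
    ex_left = (xl - step, yl)
    ex_right = (xr + step, yr)
    return ex_left, ex_right

ex_left, ex_right = estimate_outer_eye_corners((40, 50), (70, 52))
```

For the inner corners at x = 40 and x = 70, the estimated outer corners fall at x = 10 and x = 100.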

Location of Mouth Region.
Studying the curvature of the facial surface regions below the nose, we found that the outer corners of the mouth are distinct concavities. The mouth corners can therefore be located from the surface curvature [11]. The locations of the detected nose width points (al-al) are used to constrain the search regions horizontally.
First, the search region of the mouth is determined. The limit for the left mouth corner is given by formula (9) [11], and that for the right mouth corner by formula (10) [11]; both depend on the horizontal coordinates of the left and right nose width points. Second, the upper and lower edges are determined. Within the search region, the point that meets the positive-curvature condition of [11] and has the maximum absolute curvature value corresponds to the location of a mouth corner (ch-ch) [11].
Finally, once the mouth corners are located, the mouth region is obtained from the mouth corners (ch-ch) and the upper and lower edges of the mouth. Then, the mouth region is normalized to 100 × . Figure 6 shows the location result of the mouth region.

Feature Extraction
The Curvelet transform is a multiscale transform proposed by Donoho and Duncan in 2000 [15]. In essence, the Curvelet transform is a multiscale pyramid decomposition in which each level corresponds to the image at different directions and scales. However, this pyramid is nonstandard: the length and width of each Curvelet are variable, and the width is approximately the square of the length. As the decomposition level increases, the directional decomposition of the Curvelet becomes finer.

Acquisition of Curvelet Coefficients.
After the facial features are located, the 4 normalized feature regions are decomposed by the fast discrete Curvelet transform via wrapping (FDCT wrapping). The decomposition level is 4.
The process of Curvelet decomposition is as follows.
Step 2. Acquire the Curvelet coefficients at scale 4, denoted by the matrix $C_{4,1}$:
(1) The right and left windows along the horizontal direction are constructed as row vectors.
(2) The right and left windows along the vertical direction are constructed as row vectors.
Step 3. Acquire the Curvelet coefficients at scale 3:
(2) A high-pass filter at scale 3 and angle 1 is constructed in the same way as at scale 4.
(3) $\mathrm{lowpass}_4$ is filtered by $\mathrm{lowpass}_3$, generating the filtered low-pass signal at scale 3.
(4) $\mathrm{lowpass}_4$ is filtered by $\mathrm{hipass}_3$, generating the filtered high-pass signal at scale 3, which has the same size as $\mathrm{lowpass}_4$.
(5) The discrete locating window of the wedge wave at scale 3 and angle 1 is determined. The Curvelet coefficients at scale 3 are divided into 4 quadrants, and each quadrant has 8 angles.
(6) The data in the discrete locating window of the wedge wave are filtered and rotated, generating the matrix $\mathrm{data}_2$.
(7) The inverse 2D FFT is applied to $\mathrm{data}_2$, generating the Curvelet coefficient at scale 3 and angle 1, $C_{3,1}$.
(8) Items (5), (6), and (7) of Step 3 are repeated in the same way as for $C_{3,1}$, generating the Curvelet coefficients at scale 3 and angles 2 to 8.
(9) The Curvelet coefficients at scale 3 in the other three quadrants are acquired in the same way as in the first quadrant.
Step 3 is then repeated to generate the Curvelet coefficients at scale 2 and angles 1 to 16.
Taking the nose region as an example, the sizes of the coefficient matrices at different scales and directions are listed in Table 1, and the Curvelet coefficients at different scales and directions are shown in Figure 7.
In Figure 7, the white parts of the image describe the edges of the nose in different directions; they are also the important Curvelet coefficient regions of the image. The low frequency coefficients (coarse scale coefficients, the third scale coefficients) are located at the center of the Curvelet coefficient image. The outer scale corresponds to the high frequency coefficients (fine scale coefficients, the second scale coefficients). The second scale includes 4 strips, which correspond to the scale 2 Curvelet coefficients of the four quadrants. Each subsegmented block corresponds to one scale and direction.

Extraction of Curvelet Feature.
During the Curvelet decomposition, the number of directions at the fourth scale is 1, and the number of directions at the second scale is 16. The Curvelet coefficients at different scales and in different directions represent the directional content of the image. Thus, the direction and detail of the original signal can be approximated by the Curvelet decomposition coefficients of each subblock. Following this principle, the proposed method summarizes the Curvelet coefficients of each subblock by their average $\ell_1$ norm:
$$E = \frac{1}{MN} \sum_{i=1}^{M} \sum_{j=1}^{N} |C(i, j)|,$$
where $C$ is the $M \times N$ coefficient matrix of the subblock and $|C(i, j)|$ is the modulus of $C(i, j)$. The feature vector of each feature region includes 50 features, which express both the overall information of the face and its detail and direction information. The feature vector of each feature region can be represented as $F = [E_1, E_2, E_3, \ldots, E_{49}, E_{50}]$. For example, the Curvelet energy at each scale and in each direction for the nose region is shown in Table 2. Since there are 50 features in each region, a total of 200 features are extracted for 3D face recognition. Finally, template matching based on the Euclidean distance is used to accomplish the recognition:
$$d = \sqrt{\sum_{k=1}^{50} \left( E_k^{t} - E_k^{m} \right)^2},$$
where $F^{t} = [E_1^{t}, \ldots, E_{50}^{t}]$ is the feature vector of the target subblock image and $F^{m} = [E_1^{m}, \ldots, E_{50}^{m}]$ is the feature vector of the template subblock image.
Since there are 4 feature regions in each face image, each match yields 4 distances $d$. This paper uses a simple weighted method to fuse these 4 distances. Finally, the template with the minimum fused distance is taken as the matching result.
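The feature extraction and matching steps can be sketched as follows. The sketch assumes that the Curvelet subbands of each region are already available as 2D arrays (the decomposition itself is not shown), that the energy of a subband is the mean absolute coefficient value, and that the fusion is a weighted sum with equal weights by default. The paper does not state its weights, so `weights` and all function names here are hypothetical.

```python
import numpy as np

def curvelet_energy_features(subbands):
    """Average l1 norm (mean absolute value) of each coefficient subband."""
    return np.array([np.mean(np.abs(c)) for c in subbands])

def region_distance(target, template):
    """Euclidean distance between two region feature vectors."""
    return float(np.linalg.norm(np.asarray(target) - np.asarray(template)))

def fused_distance(target_regions, template_regions, weights=None):
    """Weighted sum of the per-region distances (equal weights assumed by default)."""
    n = len(target_regions)
    w = np.full(n, 1.0 / n) if weights is None else np.asarray(weights, dtype=float)
    d = [region_distance(t, m) for t, m in zip(target_regions, template_regions)]
    return float(np.dot(w, d))

def identify(target_regions, gallery, weights=None):
    """Return the gallery identity with the minimum fused distance."""
    scores = {name: fused_distance(target_regions, regions, weights)
              for name, regions in gallery.items()}
    return min(scores, key=scores.get)

# Toy example: 4 regions per face, 2 features per region instead of 50.
gallery = {"a": [np.zeros(2)] * 4, "b": [np.ones(2)] * 4}
probe = [np.full(2, 0.9)] * 4
best_match = identify(probe, gallery)
```

In practice, each region would contribute a 50-element vector, and the weights could be tuned on the training set.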

Experimental Results and Analysis
The proposed 3D face recognition algorithm (Anthropometric Curvelet Fusion Face Recognition, ACFFR) is evaluated on the Texas 3D FR database, which includes 1149 depth images of 118 persons with different expressions, illumination, sex, and age.

Selection of Testing Sets and Training Sets.
Following the international evaluation standards for face recognition systems, FRVT 2002 [16,17] and FRGC 2006 [18,19], the Texas 3D FR database is divided into testing sets and training sets, as shown in Table 3. We randomly select the images of 15 persons from the database as the training set to optimize the classifier. Each of these persons has 30 images, which include neutral images and expression images. The testing sets include 699 images of 103 persons, independent of the training set, and are divided into model sets and target subsets. The model sets cover 103 persons with 1 neutral image each. The target subset includes 600 neutral and expression images of 95 of those 103 persons.

Experiment. The ROC (Receiver Operating Characteristic) curves and the EERs (equal error rates) of the verification experiment are shown in Figure 8 and Table 4, respectively. The CMC (Cumulative Match Characteristic) curves and the RRs (rank 1 recognition rates) of the recognition experiment are shown in Figure 9 and Table 5, respectively. To demonstrate the effectiveness of the proposed algorithm, we compare it with the methods in [9,11]. Figures 8 and 9 show that the proposed algorithm, ACFFR, performs better than Anthroface3D and Curveletface3D. The statistics in Tables 4 and 5 show that the EER of our algorithm is 1.75% and its RR is 97.0%. Compared with Anthroface3D, the EER of ACFFR is lower and its RR is higher. Compared with Curveletface3D, the EER of ACFFR is slightly higher on neutral images but lower on expression images. Thus, ACFFR is more effective than Anthroface3D and Curveletface3D under occlusion and expression changes.
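For readers unfamiliar with the two metrics, the sketch below shows one common way to compute them from distance scores, where a smaller distance means a better match. It is a generic illustration, not the evaluation code used for Tables 4 and 5: the EER is approximated at the threshold where the false accept and false reject rates are closest, and all names are hypothetical.

```python
def equal_error_rate(genuine, impostor):
    """Approximate EER from distance scores (smaller distance = more similar)."""
    best_gap, eer = float("inf"), None
    for t in sorted(set(genuine) | set(impostor)):
        frr = sum(d > t for d in genuine) / len(genuine)      # genuine pairs rejected
        far = sum(d <= t for d in impostor) / len(impostor)   # impostor pairs accepted
        if abs(far - frr) < best_gap:
            best_gap, eer = abs(far - frr), (far + frr) / 2
    return eer

def rank1_rate(distance_rows, true_index):
    """Fraction of probes whose nearest gallery entry is the correct identity."""
    hits = 0
    for row, truth in zip(distance_rows, true_index):
        nearest = min(range(len(row)), key=row.__getitem__)
        hits += nearest == truth
    return hits / len(true_index)

# Toy scores: 4 genuine and 4 impostor comparisons, 3 probes against 2 gallery faces.
eer = equal_error_rate([0.1, 0.2, 0.3, 0.9], [0.8, 1.0, 1.1, 1.2])
rr = rank1_rate([[0.1, 0.5], [0.7, 0.2], [0.3, 0.4]], [0, 1, 1])
```

In the toy data, one genuine pair and one impostor pair are misclassified at the balanced threshold, and two of the three probes are matched correctly at rank 1.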
The experimental results show that the proposed ACFFR algorithm, which fuses 3D Curvelet features on feature regions located by Anthropometric features, achieves high performance.

Conclusion
This paper proposes an Anthropometric and Curvelet features fusion-based algorithm for 3D face recognition (Anthropometric Curvelet Fusion Face Recognition, ACFFR). The experiments were carried out on the Texas 3D FR database. On expressive images, ACFFR achieves an EER of 2.12% and an RR of 96.1%, against EERs of 2.81% and 2.23% and RRs of 95.6% and 94.1% for Anthroface3D and Curveletface3D, respectively. Over all face images, including neutral and expressive images, the EER of ACFFR is 1.75% and its RR is 97.0%. Thus, the ACFFR algorithm is robust to occlusion and to changes of illumination and expression.