Localizing facial landmarks is a popular topic in the field of face analysis. However, practical applications still challenge current solutions: pose variations and partial occlusions must be handled while keeping the trained model size and the computational cost moderate. In this paper, we present a global shape reconstruction method for locating facial landmarks in addition to those used in the training phase. In the proposed method, the reduced configuration of facial landmarks is first decomposed into its corresponding sparse coefficients. Then explicit face shape correlations are exploited to regress between the sparse coefficients of different facial landmark configurations. Finally, the extra facial landmarks are reconstructed by combining the pretrained shape dictionary with the approximated sparse coefficients. By applying the proposed method, both the training time and the model size of the class of methods that stack local evidence into an appearance descriptor can be scaled down with only a minor compromise in detection accuracy. Extensive experiments show that the proposed method is feasible and is able to reconstruct extra facial landmarks even under very asymmetrical face poses.

Facial landmark localization is the first and a crucial step for many face analysis tasks such as face recognition [

Over the last ten years, remarkable progress has been made in the field of facial landmark localization [

However, facial landmark localization still faces great challenges in practical applications, such as handling pose variations and partial occlusion while maintaining a moderate trained model size and computational efficiency. In SDM and its improved variants, the dimension of the regression matrix in each stage matches that of the stacked feature descriptors, which grows with the number of feature points. This substantially limits the use of high-dimensional feature descriptors. In this type of method, solving such high-dimensional regression matrices occupies a large amount of memory, and neither training nor testing is computationally efficient.
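To make the scaling argument concrete, the following is a minimal sketch of how one stage's regression matrix grows with the number of landmarks. It assumes that a stage maps the stacked per-landmark descriptors to a 2-D shape update; the 128-dimensional descriptor and the 17-landmark subset are illustrative assumptions, not values taken from this paper.

```python
def sdm_stage_shape(n_landmarks, desc_dim=128):
    """Return (rows, cols) of one stage's regression matrix.

    Assumption: the stage maps stacked per-landmark descriptors
    (n_landmarks * desc_dim values) to a shape update with one (x, y)
    pair per landmark (2 * n_landmarks values). desc_dim=128 is only an
    illustrative descriptor length.
    """
    return 2 * n_landmarks, n_landmarks * desc_dim


def sdm_stage_entries(n_landmarks, desc_dim=128):
    """Number of scalar entries in one stage's regression matrix."""
    rows, cols = sdm_stage_shape(n_landmarks, desc_dim)
    return rows * cols


full = sdm_stage_entries(68)     # full 68-point configuration
reduced = sdm_stage_entries(17)  # hypothetical reduced configuration
print(full, reduced, full / reduced)
```

Under these assumptions the matrix size grows quadratically with the number of landmarks, so training on a quarter of the points shrinks each stage by a factor of sixteen.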

In this paper, we consider the possibility of training only a subset of the available facial landmarks, while in the prediction stage the “missing” landmarks are restored through the inner shape correlation as if they had been trained. In this way, both the training time and the model size can be scaled down at the cost of only a minor loss in detection accuracy. Following this idea, we propose a global shape reconstruction (GSR) method for predicting extra facial landmarks with limited models; see Figure

The flowchart of the proposed GSR method.

The main contributions of this paper are the following:

Facial landmark localization methods can be roughly divided into two categories [

The active shape model [

In contrast to generative methods, discriminative methods such as deep network based methods and cascade regression based methods have attracted much attention in recent years. In the deep network domain, Sun et al. [

As discussed in ESR [

In this section, we present the proposed method for extra facial landmark localization in detail. First, we give a brief review of the popular SDM framework. Then the training process on the reduced facial landmarks is described. Finally, the extra facial landmarks are predicted via the proposed sparse shape constraints. An approximate sparse shape reconstruction method is also presented for reconstructing only a small number of extra facial landmarks.

The shape

Previous works mostly focus on designing a robust feature mapping function or a specific form of stage regressors [

Given a set of

Motivated by the sparse shape constraint proposed in DSC-CR [

We call this a constrained strategy since the sparse coefficient vector is reused in a regressive manner. A number of face shapes

The proposed C-GSR method.
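The three steps named for the constrained strategy (sparse decomposition of the reduced shape, regression between sparse coefficients, and reconstruction with the full-shape dictionary) can be sketched as follows. This is an illustrative sketch and not the authors' implementation: the greedy orthogonal matching pursuit solver, the ridge regressor, the use of training exemplars as dictionary atoms, the synthetic shapes, and all names and parameter values are assumptions introduced here.

```python
import numpy as np


def unit_columns(D):
    """Scale every dictionary atom (column) to unit L2 norm."""
    return D / np.linalg.norm(D, axis=0, keepdims=True)


def omp(D, x, k):
    """Greedy orthogonal matching pursuit: code x with at most k atoms of D."""
    x = np.asarray(x, dtype=float)
    residual = x.copy()
    support = []
    coef = np.zeros(D.shape[1])
    for _ in range(k):
        if np.linalg.norm(residual) < 1e-10:
            break  # x is already represented exactly
        j = int(np.argmax(np.abs(D.T @ residual)))
        if j in support:
            break  # no new atom improves the fit
        support.append(j)
        # Refit all selected atoms jointly, then update the residual.
        sol, *_ = np.linalg.lstsq(D[:, support], x, rcond=None)
        residual = x - D[:, support] @ sol
    if support:
        coef[support] = sol
    return coef


def fit_coefficient_regressor(A_reduced, A_full, lam=1e-3):
    """Ridge regression W with A_full ~ W @ A_reduced (columns are samples)."""
    gram = A_reduced @ A_reduced.T + lam * np.eye(A_reduced.shape[0])
    return np.linalg.solve(gram, A_reduced @ A_full.T).T


def reconstruct_full_shape(x_reduced, D_reduced, D_full, W, k):
    """Code the reduced shape, map the code, rebuild the full shape."""
    a_reduced = omp(D_reduced, x_reduced, k)
    a_full = W @ a_reduced
    return D_full @ a_full


# Synthetic stand-in data (assumption): shapes stored as [x_1..x_n, y_1..y_n].
rng = np.random.default_rng(0)
n_full, n_train, n_atoms, k = 68, 200, 40, 5
keep = np.arange(0, n_full, 4)  # hypothetical reduced subset of landmarks
idx_reduced = np.concatenate([keep, keep + n_full])
mean_shape = rng.standard_normal(2 * n_full)
basis = 0.5 * rng.standard_normal((2 * n_full, 6))
X_full = mean_shape[:, None] + basis @ rng.standard_normal((6, n_train))
X_reduced = X_full[idx_reduced]

# Stand-in dictionaries (assumption): normalized training exemplars as atoms.
D_full = unit_columns(X_full[:, :n_atoms])
D_reduced = unit_columns(X_reduced[:, :n_atoms])

# Training: sparse codes for both configurations, then the code regressor.
A_full = np.stack([omp(D_full, x, k) for x in X_full.T], axis=1)
A_reduced = np.stack([omp(D_reduced, x, k) for x in X_reduced.T], axis=1)
W = fit_coefficient_regressor(A_reduced, A_full)

# Prediction: only the reduced landmarks of a new shape are observed.
x_test = mean_shape + basis @ rng.standard_normal(6)
estimate = reconstruct_full_shape(x_test[idx_reduced], D_reduced, D_full, W, k)
print(estimate.shape)
```

In practice the dictionaries would be learned from annotated shapes, and the sparsity level `k` and the ridge weight `lam` would need tuning. The values above only exercise the data flow.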

To further simplify the C-GSR method, an approximation is made as follows: after calculating all the sparse coefficient vectors in C-GSR, the sparse coefficients

The proposed S-GSR method.

Experiments are conducted on two datasets, namely, LFPW-68 and HELEN-68.

As an overall setting, every training image is cropped by detecting a face bounding box and then resized to

In this section, we evaluated the reconstruction error of (

Parameters evaluation on LFPW-68 testing set.

In this section, we study the reconstruction performance under different strategies for reconstructing extra facial landmarks. The C-GSR and S-GSR strategies combined with Procrustes analysis are denoted with a “-PA” suffix. All experiments are carried out on the LFPW-68 points testing dataset; that is,

In Figure

Reconstruction strategies evaluation on LFPW-68 points testing dataset.
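For readers unfamiliar with the Procrustes analysis behind the “-PA” variants, the following is a minimal sketch of a standard similarity Procrustes alignment between two 2-D point sets. How the alignment is wired into C-GSR and S-GSR is not reproduced here; the function name and the test data are assumptions for illustration.

```python
import numpy as np


def procrustes_align(shape, reference):
    """Align `shape` to `reference` with the best scale, rotation, translation.

    Both inputs are (n_points, 2) arrays with corresponding rows. The result
    minimizes the summed squared point-to-point distance to `reference`.
    """
    mu_s, mu_r = shape.mean(axis=0), reference.mean(axis=0)
    S, R = shape - mu_s, reference - mu_r
    U, sigma, Vt = np.linalg.svd(S.T @ R)
    # Flip the last singular direction if needed to exclude reflections.
    d = np.sign(np.linalg.det(U @ Vt))
    flip = np.array([1.0, d])
    rotation = U @ np.diag(flip) @ Vt
    scale = (sigma * flip).sum() / (S ** 2).sum()
    return scale * S @ rotation + mu_r


# Usage: a rotated, scaled and shifted copy is mapped back onto the reference.
rng = np.random.default_rng(1)
reference = rng.standard_normal((68, 2))
theta = 0.3
rot = np.array([[np.cos(theta), -np.sin(theta)],
                [np.sin(theta), np.cos(theta)]])
moved = 2.5 * reference @ rot + np.array([4.0, -1.0])
aligned = procrustes_align(moved, reference)
print(np.abs(aligned - reference).max())
```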

In order to get an intuitive understanding of the proposed method, we give a visualization of five different landmark configurations in Figure

Visualization of different landmark configurations on LFPW-68 points testing dataset.

In this section, we evaluate the performance of the proposed method in four aspects, namely, the normalized alignment error, training time cost, testing time cost, and training model size, on two facial datasets. We first train models as described in Section

In Figure

Cumulative error distribution (CED) curves of five reconstruction configurations on (a) HELEN-68 and (b) LFPW-68 datasets.
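The two quantities behind these curves can be sketched as follows. The sketch assumes that the alignment error is the mean point-to-point distance divided by the distance between the two eye centers, which is a common convention; the normalization actually used in the experiments, the function names, and the toy values are assumptions.

```python
import numpy as np


def normalized_error(pred, gt, left_eye_idx, right_eye_idx):
    """Mean point-to-point error divided by the inter-ocular distance.

    pred and gt are (n_points, 2) arrays. The eye index lists select the
    ground-truth landmarks that outline each eye (assumed convention).
    """
    mean_dist = np.linalg.norm(pred - gt, axis=1).mean()
    left_center = gt[left_eye_idx].mean(axis=0)
    right_center = gt[right_eye_idx].mean(axis=0)
    return mean_dist / np.linalg.norm(left_center - right_center)


def ced_curve(errors, thresholds):
    """Fraction of test images whose error is at or below each threshold."""
    errors = np.asarray(errors, dtype=float)
    return np.array([(errors <= t).mean() for t in thresholds])


# Toy example: every predicted point is off by (3, 4), a distance of 5,
# and the two "eye" landmarks are 10 apart.
gt = np.array([[0.0, 0.0], [10.0, 0.0], [5.0, 5.0], [5.0, 8.0]])
pred = gt + np.array([3.0, 4.0])
error = normalized_error(pred, gt, [0], [1])
curve = ced_curve([0.02, 0.04, 0.06, 0.10], [0.05, 0.10])
print(error, curve)
```

A CED curve plots `ced_curve` over a dense range of thresholds, so a curve that rises faster indicates that more test images fall below a given error.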

Normalized alignment error, training time cost, testing time cost, and training model size of five reconstruction configurations on two facial datasets: (a) HELEN-68 and (b) LFPW-68 datasets.

From Figure

For a better understanding, we also present the detection results with predicted landmarks labeled as green dots in Figure

Detection results on the LFPW-68 and HELEN-68 datasets. The first column gives the reduced landmark configuration used for model training. In the remaining columns, red dots are the landmarks predicted by SDM and green dots are the landmarks reconstructed by the C-GSR-PA method.

As discussed in Sections

CED curves of four methods on artificial occlusion testing dataset.

In this paper, we present a global shape reconstruction method for locating facial landmarks in addition to those used in the training phase. By applying the proposed method, both the training time and the model size of a class of methods can be scaled down with only a minor compromise in detection accuracy. Specifically, we propose a constrained strategy (C-GSR-PA) which exploits the sparse coefficient constraints in face shape correlations. Extensive experiments show that it is able to reconstruct up to three times as many facial landmarks, even under very asymmetrical face poses. We also propose a simplified strategy (S-GSR-PA) which shows comparable performance when reconstructing a few facial landmarks. It does not need a regression matrix to be trained in advance and requires fewer computational resources. It has potential for refining a small number of unreliable predictions.

The authors declare that there are no conflicts of interest regarding the publication of this paper.

This work is supported by the National Key Research & Development Plan of China (no. 2016YFB1001401) and the National Natural Science Foundation of China (no. 61572110).