Nonfrontal Expression Recognition in the Wild Based on PRNet Frontalization and Muscle Feature Strengthening

Mathematical Problems in Engineering


Introduction
Nonfrontal facial expression recognition (FER) in the wild is an important problem in artificial intelligence (AI) and a key component of human-computer interaction (HCI) [1]. For an HCI system to communicate with its human clients as a real person would, the computer must recognize the clients' facial expressions and detect their emotions effectively [2]. At present, however, FER methods have not yet reached a practical level: most of them can recognize expressions only on frontal faces, and FER methods for nonfrontal faces are still under research [3].
In the wild, FER is challenged by nonfrontal faces, illumination variations, and registration errors [4]. Nonfrontal faces are caused by head turning and pitching and by changes in camera viewpoint, which can easily deform the face shape significantly and cause FER errors [5]. In deformed facial images, the features important for expression recognition, including the appearance and position of the brows, eyes, cheeks, and mouth, differ considerably from those of the front face and cause serious FER errors. The other challenges for FER are also related to these interferences in nonfrontal faces [6]. If illumination is poor, especially when the head is turning, some parts of the face are not illuminated enough and may appear dark and blurry in the photo, which may cause FER errors [7]. For deformed faces, matching with a standard front face is also impossible: because the geometrical relationship among the key points on a deformed face image may differ greatly from that of the front face, traditional FER methods are seriously disturbed [8].
There are usually three ways to solve these problems of FER in the wild: extracting facial key points as expression features, extracting expression features from the whole or local face, and establishing the relationship between different poses and expressions. Facial key points, such as the points on the brows, eyes, and mouth, are very useful for FER because they are very sensitive to facial muscle movement [9]. The newest studies on this issue concentrate on introducing self-learning methods to build models for key point localization; for example, the study in [10] established a hierarchical probabilistic model to extract key points on the deformed face. As in-the-wild performance is very important for computer vision, traditional methods have also been improved for extracting facial key points in nonfrontal faces, for example, by training an Active Shape Model (AAM) in the wild to eliminate the disturbance caused by head pose changes [11]. Even when facial key points can be localized exactly, expression recognition can still be disturbed, because FER models based on key points rely mainly on the geometrical relationship among these points. The projection of 3D key points onto the 2D image loses a lot of information, and the geometrical projection of the same expression differs greatly from photo to photo when the head poses differ. This problem may cause FER errors, especially when the head pose changes significantly.
Expression features extracted from the whole or local face are more adaptable for FER in the wild. Compared with key points, much more expression information can be obtained from face areas such as the eyes, brows, cheeks, and mouth. The core of this approach is to extract the local or whole facial features under different poses. The newest deep learning methods are very suitable for this issue [12]. The study in [13] proposed weighted mixture deep networks that fuse the features of facial grayscale images and their corresponding LBP images for FER; the facial expression features are extracted through a pretrained model (a VGG16 model trained on ImageNet). The study in [14] proposed Weighted Center Regression Adaptive Feature Mapping (W-CR-AFM); this network can fine-tune its parameters to gain better recognition accuracy in specific applications, and misclassified samples and new samples can be corrected or reformulated. The authors also proposed a preprocessing method that assists the AFM by extracting precise face images based on the Neighbor-Center Difference Image (NCDI) method. The study in [15] proposed Multichannel Pose-aware Convolution Neural Networks (MPCNN) for nonfrontal face expression recognition; this network combines three parts: multichannel feature extraction, joint multiscale feature fusion, and pose-aware recognition. However, some problems remain. Because a 2D photo is a projection of the solid 3D face, the loss of spatial information in 2D facial photos is still serious under different head poses. The deformation and displacement of expression features differ in every nonfrontal face image and may cause FER errors.
Compared with facial feature points and regional expression features, establishing the relationship between different poses and expressions may be the most effective way to recognize expressions on nonfrontal faces. For example, one common way is to establish a transformation model between the front face and nonfrontal faces in order to frontalize the nonfrontal faces; FER in the wild becomes much more convenient with these frontalization faces. Although the lack of depth information in a 2D photo causes some difficulty for face frontalization, deep learning methods, especially deconvolution networks, can build effective frontalization models for nonfrontal faces. Typical methods include Generative Adversarial Nets (GAN), the 3D morphable model (3DMM), 3D Dense Face Alignment (3DDFA), and the Position Map Regression Network (PRNet). A GAN can transform a nonfrontal 2D face image into a 2D front face [16]. The study in [17] introduced a GAN to obtain the front faces of nonfrontal faces and used these frontalization faces for expression recognition. 3DMM [18], 3DDFA [19], and PRNet [20] offer another way of face frontalization: they directly build the 3D face for a nonfrontal face, and the front face is obtained through the projection of the 3D face. The 3D face contains much more information and can improve FER accuracy. In recent years, deep learning, especially CNNs, has therefore been extensively applied to expression recognition, and these networks can be designed to recognize expressions under any pose. The study in [14] presented Multichannel Pose-aware Convolution Neural Networks (MPCNN) to extract facial features; they consist of multichannel feature extraction, joint multiscale feature fusion, and pose-aware recognition. The study in [21] proposed the Region Attention Network (RAN), a region-based deep attention architecture. This network integrates visual clues from regions and whole faces to capture the important facial regions, and its weights can be refined adaptively. The study in [12] used weighted mixture deep networks (VGG16 model) to improve the traditional local binary pattern for expression recognition. The study in [13] proposed Weighted Center Regression Adaptive Feature Mapping (W-CR-AFM) to eliminate misclassified samples or add new samples and thereby improve model accuracy. The study in [22] improved the convolutional neural network with an attention mechanism and proposed two networks: patch-based ACNN (pACNN) and global-local-based ACNN (gACNN). These networks focus on the most discriminative unoccluded face regions and use a gate unit that weighs each region adaptively. The study in [23] proposed a lightweight CNN based on parallel feature extraction, named eXnet (Expression Net), which combines accuracy with a smaller number of parameters. The study in [4] proposed and combined three different novel CNN models. The first consists of six depth-wise separable residual convolution modules and addresses the problems of complex topology and overfitting. The second has a dual-branch structure that extracts traditional LBP features and deep learning features in parallel. The third overcomes the shortage of training samples through transfer learning. The study in [24] proposed the island loss (IL-CNN) to improve the CNN; it enlarges interclass differences and reduces intraclass variations simultaneously. The study in [25] proposed the identity-aware convolutional neural network (IACNN), for which the authors designed an identity-sensitive contrastive loss to recognize identity-invariant expressions for particular people.
However, none of the above methods can accomplish FER in the wild effectively. Facial muscle movement, which shows the expression directly, may be weakened on nonfrontal faces or on their frontalizations and cause FER errors. Some important parameters of muscle movement, such as the moving zone and the extent of contraction or relaxation, cannot be calculated exactly. This is a serious problem for frontalization faces and has been widely reported in previous studies. In this paper, based on the "N : q rule" of parameter estimation [26,27], face finite element analysis (FEM) [28-30], and muscle structure anatomy [31,32], it is shown that all muscle parameters in the frontalization face are weaker than those of the real face, except the muscle moving direction in each small area.
To solve these problems, this paper presents a more useful method for accomplishing FER in the wild. It rebuilds the face for frontalization and intensifies muscle movement to strengthen the expression features on the frontalization face and improve recognition accuracy.
First, the deep learning method PRNet is introduced. This network can extract the nonfrontal faces in a photo and build their 3D faces for frontalization. Considering that the human face is symmetrical, the front face can be established from the half face toward the camera and its mirror in the 3D face. To analyze the model accuracy for the face muscle, the 3D face is divided into a skull model and a muscular layer. According to the "N : q rule" of parameter estimation [26,27], it can be shown that the skull model can be rebuilt exactly. For the muscle model, however, according to the theory of muscular structure and cell shape, the 3D face can only estimate and show the muscular moving direction in each small area of the face; the other parameters of muscle movement cannot be estimated effectively, and the facial expression features on the 3D frontalization face are weakened significantly.
Therefore, a face contour model and a muscle movement strengthening model are proposed for intensifying expression features. In the face contour model, face contours and the Fréchet distance are used to divide the face into many small areas and to extract the muscular moving direction in each area.
Through the muscle movement strengthening model, each area is given a 3D Gauss model that follows the muscle moving direction to simulate and strengthen muscle movement. The Gauss strengthening models in different face areas can partially overlap each other to strengthen muscle contraction or relaxation across the whole frontalization face image and intensify the facial features.
In the test, the well-known MEGVII Face++ facial expression recognition platform, which is also able to recognize expressions in the wild, is used for comparison. Face++ recognizes the expression in the original nonfrontal faces, the frontalization faces, and the expression-strengthened frontalization faces, respectively. The test results show that the proposed method enlarges the facial features effectively and that the recognition accuracy on the strengthened frontalization faces is much higher than on the original faces.

Method Design
The structure of the nonfrontal facial expression recognition method is shown in Figure 1. It is composed of two parts: face frontalization through the PRNet 3D model, and the muscle movement and expression strengthening model.
For every face in a 2D photo, PRNet builds a dedicated 3D face. The frontalization faces are obtained from these 3D faces, and muscle movement can also be extracted from them.
For a nonfrontal face, it is inevitable that the facial features in the frontalization face are weaker than those of the real front face. This paper therefore presents a model that strengthens muscle movement and facial expression features to improve FER accuracy.

Face Frontalization and Facial Muscle Movement Extraction
PRNet can output the 3D face from a single 2D head photo. The core of PRNet is ten residual blocks, whose learning and modeling ability is much stronger than that of ordinary convolution layers. These blocks extract 2D face image features and transfer them into 8 × 8 × 512 feature maps. Seventeen transposed convolution layers then decode these features and generate the 256 × 256 × 3 position map, named the UV position map by the PRNet authors [20]. This UV position map is the 3D face. After projecting and frontalizing this 3D face, the front face can be obtained.
To recognize facial expressions effectively, face symmetry is exploited to build the front face. As the left and right halves of the face are symmetric, the muscle movement and expression on the two sides are assumed to be the same. Face frontalization therefore uses only the half of the 3D face that is toward the camera, which is less deformed and much clearer in the photo, to build the front face. Its mirror substitutes for the other half, which is more seriously deformed and compressed in the original 2D photo. After rotation and projection of the 3D face, the frontalization face is obtained. Figure 2 shows the image processing for frontalizing a side face through its left 3D half face and its mirror. The half face toward the camera can be extracted according to the head pose resolved by PRNet. Through this method, the expression of nonfrontal faces can be recognized, as in the example of Figures 3-5. Figure 3 shows the original sad face and its frontalization.
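The half-face mirroring step can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the 3D points have already been rotated to the frontal pose and centered so that the symmetry plane is x = 0, and the function name `mirror_half_face` and the `keep_left` flag are hypothetical.

```python
import numpy as np

def mirror_half_face(vertices, keep_left=True):
    """Rebuild a full face from the half that faces the camera.

    vertices: (N, 3) array of frontalized 3D points. The symmetry plane is
    assumed to be x = 0 (a convention chosen only for this sketch).
    keep_left: keep the points with x <= 0 if True, else those with x >= 0.
    """
    vertices = np.asarray(vertices, dtype=float)
    mask = vertices[:, 0] <= 0 if keep_left else vertices[:, 0] >= 0
    half = vertices[mask]
    # Reflect the kept half across the plane x = 0 to replace the hidden half.
    mirrored = half * np.array([-1.0, 1.0, 1.0])
    return np.vstack([half, mirrored])
```

In practice the side to keep would be chosen from the head pose, for example from the sign of the heading angle, and the texture would have to be mirrored together with the geometry.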
In the experiment, the well-known Face++ recognition platform is used to test our frontalization method. As shown in Figure 4, there are 7 horizontal bars on the right of the Face++ platform. They show the prediction confidence of seven expressions. From top to bottom, these are "happy," "neutral," "surprise," "sad," "disgust," "angry," and "fear," noted in English. The most probable expression in the recognition result is colored orange. The original face, with the head turned and bowed, is wrongly recognized as neutral, as shown in Figure 4. After face frontalization by our method, "sad" is recognized correctly, as shown in Figure 5.
However, there is still a problem with face frontalization: the muscle movement for expression in the frontalization face is weaker than that of the real front face, and almost all frontalization research is troubled by this problem. In Figure 6, Figures 6(a) and 6(b) are the front and nonfrontal face images of the former US president Richard M. Nixon, and Figure 6(c) is the frontalization of Figure 6(b). It can be seen that the facial features and muscle movement on the frontalization face are significantly weaker than on the original face.
There are two reasons for this problem. One is that some facial organ areas are compressed in the nonfrontal-view face; their resolution is very low in the nonfrontal face image and they remain blurry in the frontalization image, for example, the brow, eye, nose, and mouth on the side of the face that is not toward the camera. The other is that muscle movement cannot be rebuilt effectively in the frontalization model, and this is the main interference for expression recognition. Therefore, we solve these problems by strengthening facial muscular movement on the 3D face to intensify the facial expression features.

Accuracy Analysis for Facial Muscle Movement in 3D Face
In order to analyze the accuracy of muscle movement on the 3D face, the real human head is decomposed into two parts: the skull model (blue area) and the muscular layer model (red area), as shown in Figure 7.
The characterization of expression on the skin is produced by the muscular layer, so building the 3D face through PRNet is equivalent to building these two models through PRNet. The fitting accuracy of PRNet for the 3D skull and the 3D muscle can be analyzed from two aspects: how PRNet distributes its fitting ability between them, and the number of their parameters that need to be estimated. For example, if the fitting ability for the facial muscle is weak and the muscular parameters are numerous, it is impossible to build precise muscular movement in the 3D face. The distribution of PRNet fitting ability between the skull and muscular movement can be discussed through the loss function, which is the core of PRNet training to build the 3D face. According to the theory of the Finite Element Method (FEM), the face can be divided into many small areas for analysis. For each small area $i$, $P(x, y) = P_H(x, y) + P_{E,i}(x, y)$, where $P_H(x, y)$ and $P_{E,i}(x, y)$ are the skull and muscular components of the UV map output by PRNet. Therefore, for each training epoch, PRNet training is driven by a loss function in which $P(x, y)$ is the PRNet output (the UV map transformed from the 3D face coordinates), $\tilde{P}(x, y)$ represents the UV map of the real 3D face, $W_i(x, y)$ is the weight of this area in the loss function, and $r_i$ is the ratio between muscle and skull: $r_i = P_{E,i}(x, y) / (P_H(x, y) + P_{E,i}(x, y))$.
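As an illustration of the quantities named above, the following sketch computes a weighted position-map loss and the muscle-to-skull ratio. It assumes the weighted form published for PRNet, a sum over pixels of the distance between predicted and real position maps multiplied by a weight mask; the loss used in this paper, which also involves $r_i$, may differ. The function names are hypothetical.

```python
import numpy as np

def weighted_position_loss(pred, target, weight):
    """Sum over pixels of ||pred - target|| times a per-pixel weight.

    pred, target: (H, W, 3) UV position maps (predicted and real).
    weight: (H, W) weight mask; larger values emphasize an area.
    """
    dist = np.linalg.norm(np.asarray(pred) - np.asarray(target), axis=-1)
    return float(np.sum(dist * np.asarray(weight)))

def muscle_ratio(p_skull, p_muscle):
    """Share of the output attributed to the muscular layer: P_E / (P_H + P_E)."""
    return p_muscle / (p_skull + p_muscle)
```

When the skull term is much larger than the muscular term, `muscle_ratio` is close to 0, which is the situation the argument relies on: most of the loss is then explained by the skull component.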
Because the size and volume of the skull are much larger than those of the muscle under the skin, it can be deduced that $P_H(x, y) \gg P_{E,i}(x, y)$. So the share of the output for the skull, $1 - r_i$, is much larger than that for the facial muscle, $r_i$: $P(x, y)(1 - r_i) \gg P(x, y) r_i$. This means that PRNet distributes most of its ability to fitting the skull and correcting skull model error. For the facial muscle, the remaining PRNet fitting ability is very small and the fitting error may be large. This is a disadvantage for fitting muscle movement and weakens the expression features on the 3D face. What is worse, in the 3D face parameter estimation, some important parameters of the muscle movement model cannot be estimated as effectively as those of the skull model. This can be proved through the theory of the Finite Element Method (FEM) [28].
(1) The Parameter Fitting Error for Skull Model. The skull model can be described by function (2), in which the relationship among the 3D points $(X_H, Y_H, Z_H)$ follows the geometric structure of the skull. According to the FEM theory, the skull is composed of a large number of finite elements, $n_0$ in total. The model of each element contains $n_1$ parameters $p_{1,i}, p_{2,i}, \ldots, p_{n_1,i}$ and is given by function (3), which is the model of the No. $i$ element. If PRNet could estimate the parameters of each skull element exactly, it could build the skull model exactly.
According to the "N : q rule" of parameter estimation [26,27], the number of samples needs to be much larger than the number of parameters. The study in [26] suggested that the ratio should be at least 5 : 1, and the study in [27] suggested that it should be larger than 10 : 1 or 20 : 1. The 3D face built by PRNet contains 256 × 256 fixed points on the head, so it can be seen as using $n_0 = 256 \times 256$ elements to establish the skull model, and the total number of parameters is $n_0 \times n_1 = 256 \times 256 \times n_1$. For a deep learning network trained on a massive face database, the number of input samples for training PRNet must be much larger than $20 \times 256 \times 256 \times n_1$, in which case all the facial element parameters can be resolved. So, for the "neutral" expression, in which muscle movement is very slight, the 3D face can be built exactly, as shown in Figure 8. The front face for each nonfrontal face image is the projection of the frontal 3D face, which is obtained by rotating the nonfrontal 3D face with the Euler matrix $A(\text{heading}, \text{pitch}, \text{roll})$. This matrix, composed of the heading, pitch, and roll angles, is obtained by comparing the arrays of standard frontal 3D face points and in-the-wild 3D face points with the least-squares method. PRNet provides a special function "frontalize ( )" for projecting and getting the front face. So, as long as the number of input face samples is much larger than the number of skull parameters, the skull model can be fitted exactly by PRNet.
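The rotation step can be sketched as follows. This is an illustrative reimplementation under stated assumptions, not PRNet's own frontalize function: the axis convention in `euler_matrix` (heading about y, pitch about x, roll about z) is assumed, and the least-squares fit uses the standard SVD-based (Kabsch) solution for a rigid rotation between two corresponding point sets. All function names are hypothetical.

```python
import numpy as np

def euler_matrix(heading, pitch, roll):
    """Rotation matrix from heading (about y), pitch (about x), roll (about z).

    Angles are in radians; the axis convention is an assumption of this sketch.
    """
    ch, sh = np.cos(heading), np.sin(heading)
    cp, sp = np.cos(pitch), np.sin(pitch)
    cr, sr = np.cos(roll), np.sin(roll)
    r_y = np.array([[ch, 0.0, sh], [0.0, 1.0, 0.0], [-sh, 0.0, ch]])
    r_x = np.array([[1.0, 0.0, 0.0], [0.0, cp, -sp], [0.0, sp, cp]])
    r_z = np.array([[cr, -sr, 0.0], [sr, cr, 0.0], [0.0, 0.0, 1.0]])
    return r_z @ r_x @ r_y

def fit_rotation(frontal_pts, posed_pts):
    """Least-squares rotation R such that posed ~ R @ frontal (after centering)."""
    a = frontal_pts - frontal_pts.mean(axis=0)
    b = posed_pts - posed_pts.mean(axis=0)
    u, _, vt = np.linalg.svd(a.T @ b)
    # Guard against a reflection so that det(R) = +1.
    d = np.sign(np.linalg.det(vt.T @ u.T))
    return vt.T @ np.diag([1.0, 1.0, d]) @ u.T

def frontalize_points(posed_pts, rotation):
    """Undo the estimated head rotation; rows are 3D points."""
    centered = posed_pts - posed_pts.mean(axis=0)
    return centered @ rotation  # row-vector form of R^T applied to each point
```

The frontal image is then the projection of the returned points onto the XY plane, that is, their first two coordinates.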
(2) The Parameter Fitting Error for Muscular Model. For the model of the muscular layer, there may be some fitting error: the muscular model is more complicated than the skull model, and not all parameters of muscle movement can be resolved through PRNet. According to the FEM theory [28] and facial action coding theory [29], facial muscular expression can also be divided into many small expression areas, $n_M$ in total. When receiving an emotion signal from the brain, the muscles in some areas contract or relax and control the facial skin to make an expression (see the muscular layer profile in Figure 2(c)). The key to describing muscle movement is building a model that assigns different contraction and relaxation values to different skin points, as shown in Figure 9. Real facial muscle movement is very complicated. Reference [30] presented the important parameters of muscular movement, for example, the zone of influence and the moving direction and displacement of each point, as well as different fitting models for the muscles of the eye, mouth, cheek, and so forth.
Reference [31] proposed an image deformation method based on moving least squares for whole-face deformation, using a group of linear functions including affine, similarity, and rigid transformations.
To simplify the model, in this paper the face is divided into many small areas to decompose the muscle movement of the whole face. In each small area, the muscular movement is simplified as a 3D Gauss model. As shown in Figure 10, this model is characterized by the area extent $d_m$, the moving direction $\theta_m$, and the inclination angle $\alpha_m$ (when $\alpha_m \approx 0$, the inclination is maximum and the muscle movement is maximum, as shown in Figure 10(c)).
For the No. $m$ muscular small area, whose center is $(x_m, y_m)$, the Gauss model is described in terms of the parameters $d_m$, $\alpha_m$, and $\theta_m$. The estimation of the parameters $d_m$ and $\alpha_m$, which express the extent of muscle movement, may be difficult. Their subparameters depend on the area center $(x_m, y_m)$ and the muscle structure around this center. When the muscle moves, the original muscle point at $(x_m, y_m)$ is displaced, and the muscle structure around it also changes. So, for different face samples, these subparameters are not fixed values. Therefore, it is difficult for the PRNet used to build the 3D face to resolve $d_m$ and $\alpha_m$ exactly, and the extent of muscle movement in the frontalization face is weaker than that of the real front face.
However, the muscle moving direction $\theta_m$ can be estimated exactly, and the 3D face can show the muscle moving direction clearly. For the same area (e.g., No. $m$) in different face samples, although its center $(x_m, y_m)$ may be slightly displaced in each sample, the subparameters of $\theta_m$, including $p_{m,\theta,1}(x_m, y_m), p_{m,\theta,2}(x_m, y_m), \ldots, p_{m,\theta,n_\theta}(x_m, y_m)$, depend only on the direction of muscle cell contraction and relaxation, and the cells around the No. $m$ area have almost the same moving direction.
This can be proved through the theory of muscular structure and cell anatomy. The muscle on the face is striated muscle, which is used to control expression [32,33]. Every muscle cell is slender in shape, and the cells are arranged as a bundle to form a muscle, as shown in Figure 11. These cells contract and relax along one fixed direction, $\theta$ and $\pi + \theta$.
Every muscle is composed of a whole regular bundle of cells, and its cells contract and relax in the same direction. So one small area (No. $m$) in the cell bundle has the same moving direction as the whole bundle.
Therefore, with the area center in the $\theta_m$ resolving function (6), the number of subparameters (7) for all small areas is $n_M \times n_\theta$. Meanwhile, for the PRNet used to build the face muscle model, the number of training samples $n_s$ is very large, with $n_s \gg n_M \times n_\theta$; according to the "N : q rule" of parameter estimation [26,27], the subparameters of $\theta_m$ can be resolved. The moving directions of all the small areas ($\theta_i$, $i = 1, \ldots, n_\theta$) in the muscle model can be resolved effectively by PRNet. Although the extent of muscle movement in the model is weaker than in the real face, it can still be strengthened according to the moving direction.
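The sample-size argument reduces to simple arithmetic, sketched below. The helper names are hypothetical, and the default ratio of 20 : 1 is the strictest of the values quoted from [27]; the values passed for the number of areas and subparameters are placeholders to be chosen by the reader.

```python
def min_samples(num_elements, params_per_element, ratio=20):
    """Smallest sample count that meets an N : q ratio of `ratio` : 1."""
    return ratio * num_elements * params_per_element

def satisfies_nq_rule(num_samples, num_parameters, ratio=20):
    """True if the sample-to-parameter ratio is at least `ratio` : 1."""
    return num_samples >= ratio * num_parameters
```

For the skull model with 256 × 256 elements, `min_samples(256 * 256, n_1)` gives the bound 20 × 256 × 256 × n_1 quoted in the text; for the direction parameters, `satisfies_nq_rule(n_s, n_M * n_theta)` expresses the condition that n_s is large enough relative to n_M × n_θ.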
In the next section, to extract the muscle moving direction from the 3D face, a face contour model is proposed and the Fréchet distance is introduced to analyze the contour lines; a muscle movement strengthening model is also designed to strengthen muscle movement along the moving direction. The muscle strengthening result is shown in Figure 12; "sad" is intensified, especially on the brows, eyes, and mouth.

The Design of Facial Expression Strengthening Model
To strengthen muscle movement on the PRNet 3D face and intensify expression features, our expression strengthening model is composed of two parts: a 3D face contour model for extracting muscle moving parameters (especially the direction $\theta$) and a face muscle movement model for expression strengthening.

Face Contour Model for Extracting Muscle Moving Direction
Extracting the muscle moving direction for every small face area is the core of enlarging muscle movement and strengthening facial expression. In this paper, a 3D contour model and the Fréchet distance are introduced to extract the muscle moving direction. The contour model is designed based on the 3D face. When muscles contract and relax, the depths (the thickness of muscle and bone) of different face areas change, and the shapes of the contours around the muscle change significantly. As shown in Figure 13, contours with the same depth are drawn in the same color, and it can be seen clearly that the contour shapes of the happy face are very different from those of the neutral face, especially the contours on the eyes, cheeks, and mouth.
To simplify the contour model for computation, the 3D face is divided into 7 contour layers with different depths; in the horizontal direction of the head, the 3D face is divided along the skin into four expression-sensitive parts: the forehead part, the brow and eye part, the cheek part, and the mouth part. So, there are a total of 28 parts in the half face, and 28 muscle areas are extracted from these parts to strengthen muscle movement. The whole expression-strengthened frontalization face is obtained by combining this strengthened half face with its mirror.
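The 7 × 4 partition can be sketched as a simple labeling of face points. The sketch assumes equal-width depth layers between the minimum and maximum depth and assumes that a part index (0 to 3) is already known for every point; how the paper actually places the layer boundaries and part borders is not stated, so both are assumptions, and the names are hypothetical.

```python
import numpy as np

PARTS = ("forehead", "brow_eye", "cheek", "mouth")

def assign_regions(depth, part_index, n_layers=7):
    """Label each point with a region id = layer * len(PARTS) + part.

    depth: per-point depth values of the half face.
    part_index: per-point index into PARTS (0..3), assumed to be given.
    Returns integer ids in [0, n_layers * len(PARTS)), i.e. 28 regions by default.
    """
    depth = np.asarray(depth, dtype=float)
    edges = np.linspace(depth.min(), depth.max(), n_layers + 1)
    # Interior edges only, so that the layer index runs from 0 to n_layers - 1.
    layer = np.clip(np.digitize(depth, edges[1:-1]), 0, n_layers - 1)
    return layer * len(PARTS) + np.asarray(part_index, dtype=int)
```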
For each small part, the place where the Fréchet distance between the expression face contours and the standard neutral face contours is maximum is the most significant muscle movement area in this part. The Fréchet distance is a very useful measure of the distance between two curves. The direction of the Fréchet distance vector is the muscle moving direction, and its length is proportional to the extent of the real muscular movement area, as shown in Figure 14.
Given two contour curves $E$ and $S$, their Fréchet distance is
$$F(E, S) = \inf_{\alpha, \beta} \max_{t \in [0, 1]} d\big(E(\alpha(t)), S(\beta(t))\big), \quad (9)$$
where $d$ is the distance between a point $(x_i, y_i)$ on $E$ and a point $(x_k, y_k)$ on $S$, and $n_e$, $n_s$ are the numbers of points on the two contours. $\alpha$ and $\beta$ are reparameterizations of the two curves over $t \in [0, 1]$; the candidate pairings can be represented by the distance matrix $D$, which collects the lengths of the connecting lines between the two contours. By processing matrix $D$, a group of connecting-line distances $d_1, d_2, \ldots, d_{n_d}$ can be extracted as thresholds that meet the requirement of curve distance. $F(E, S)$ is the shortest of them, that is, the infimum of $d_1, d_2, \ldots, d_{n_d}$, denoted as "inf" in function (9). The Fréchet distance vectors are shown as the colored arrows in Figure 14. In this way, the muscle movement parameters in each face part, including the muscle area center $(x_m, y_m)$ and the moving direction $\theta$, can be obtained.
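For sampled contours, the Fréchet distance is commonly approximated by its discrete variant, computed by dynamic programming over the point-to-point distance matrix. The sketch below uses that standard discrete algorithm with Euclidean point distances; it is not the threshold-based procedure on matrix $D$ described above, and the function name is hypothetical.

```python
import numpy as np

def discrete_frechet(e, s):
    """Discrete Fréchet distance between two polylines.

    e: (n_e, 2) points of the expression contour.
    s: (n_s, 2) points of the neutral contour.
    """
    e = np.asarray(e, dtype=float)
    s = np.asarray(s, dtype=float)
    # Distance matrix D: Euclidean distance between every pair of points.
    d = np.linalg.norm(e[:, None, :] - s[None, :, :], axis=-1)
    n_e, n_s = d.shape
    ca = np.empty_like(d)
    ca[0, 0] = d[0, 0]
    for i in range(1, n_e):
        ca[i, 0] = max(ca[i - 1, 0], d[i, 0])
    for k in range(1, n_s):
        ca[0, k] = max(ca[0, k - 1], d[0, k])
    for i in range(1, n_e):
        for k in range(1, n_s):
            # Best way to reach (i, k), then pay the current pair distance.
            best_prev = min(ca[i - 1, k], ca[i - 1, k - 1], ca[i, k - 1])
            ca[i, k] = max(best_prev, d[i, k])
    return float(ca[-1, -1])
```

The pair of points at which the final value is attained gives a candidate distance vector whose direction can be read as the muscle moving direction for that part.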
As the nonfrontal face is frontalized only through the half face toward the camera, the muscle is strengthened only on this half face, and its mirror is then used to build the whole front face.

Muscle Movement Strengthening Model
The 3D Gauss model is very convenient for controlling muscle contraction and relaxation in different small face areas, and it can combine the muscle movements in adjacent areas to strengthen the expression of the whole face. This Gauss model simulates muscle movement by inclining its peak and enlarging its zone extent, so that its projection on the XY plane closely resembles the muscle movement, as shown in Figures 15(a) and 15(b). To simplify the muscular model, the inclination angle $\alpha$ is set to its minimum to make the strengthening effect maximum. The adjustable parameter is the area extent $d$ of muscle movement; it can be enlarged along the direction $\theta$ to enlarge the muscle movement area and strengthen muscle movement.
As the frontalization face is the projection of the 3D face on the XY plane, the muscle strengthening process can be simplified by calculating the projection of the 3D Gauss model on the XY plane, as shown in function (10). This function gives every point on the face image a new position $(X_{M,i}, Y_{M,i})$ to change the face shape and intensify facial muscle contraction or relaxation. For the No. $i$ area,
$$X_{M,i} = X_H \exp\!\left(-\frac{(X - X_{M,0})^2}{2d^2}\right)\sin(\pi + \theta) + X, \qquad Y_{M,i} = Y_H \exp\!\left(-\frac{(Y - Y_{M,0})^2}{2d^2}\right)\cos(\pi + \theta) + Y, \quad (10)$$
where $(X, Y)$ is the original coordinate of the face point, $(X_{M,0}, Y_{M,0})$ is the center of the area, $\theta$ is the muscle moving direction, and $X_H$, $Y_H$ are related to the model inclination angle $\alpha$, which has been set to make the muscle movement maximum. The moving direction $\theta$ enters function (10) as $\sin(\pi + \theta)$ and $\cos(\pi + \theta)$. It rotates the contraction or relaxation area toward the muscular moving direction on the XY plane.
This function not only makes the muscle contract and relax but also connects the strengthened area smoothly with its surrounding areas. For example, in the X direction, it gives each position a different coordinate displacement $X_H \exp(-(X - X_{M,0})^2 / (2d^2)) \sin(\pi + \theta)$. Following the 3σ principle of the Gauss model, the boundary of the real area used to strengthen muscle movement is set to ±2σ in our method, which is double the muscle movement area extent $d$ ($\sigma = d$). The coordinate range is $[X_{M,0} - 2d, X_{M,0} + 2d]$; only about 5% of the Gaussian lies outside this range, and at the area boundary the displacement factor has decayed to $\exp(-2) \approx 13.5\%$ of its central value. In the center of this area, the displacement is maximum: $X_H \sin(\pi + \theta)$. For the points on the two sides, the coordinate displacement decreases gradually toward the area boundary, so the area connects smoothly with the adjacent areas, as shown in Figure 15(b). The same relationship holds for the Y-axis and its muscle movement model $Y_H \exp(-(Y - Y_{M,0})^2 / (2d^2)) \cos(\pi + \theta) + Y$.
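The displacement terms quoted above can be evaluated directly. The following sketch applies them to arrays of face-point coordinates for a single area; the function name and argument names are hypothetical, and the handling of overlapping areas is not included.

```python
import numpy as np

def strengthen_points(x, y, center, d, theta, x_h, y_h):
    """Shift face points with the projected Gauss model of one muscle area.

    x, y: original point coordinates (arrays of equal length).
    center: (X_M0, Y_M0), the center of the muscle area.
    d: area extent (sigma of the Gauss model).
    theta: muscle moving direction in radians.
    x_h, y_h: peak displacements along X and Y.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x0, y0 = center
    gx = np.exp(-((x - x0) ** 2) / (2.0 * d ** 2))
    gy = np.exp(-((y - y0) ** 2) / (2.0 * d ** 2))
    x_new = x + x_h * gx * np.sin(np.pi + theta)
    y_new = y + y_h * gy * np.cos(np.pi + theta)
    return x_new, y_new
```

At the area center the Gaussian factor is 1, so the shift equals the peak displacement; at a distance of 2d from the center the factor is exp(−2), so the shift fades toward the boundary and neighboring areas join smoothly.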
This function allows the muscles of some areas to partially overlap each other so that the muscles of the whole face move together. For points in the overlap areas, separate coordinate functions are designed. The muscular contraction or relaxation for multiple overlapping areas is shown in Figure 16(c); this is a "sad" image. The expression-strengthened face is shown in Figure 16(d); the shapes of the brow, eye, cheek, and mouth are changed significantly to strengthen "sad" compared with the original frontalization face (Figure 16(b)).

Experiment Results
For testing this expression recognition method effectively, the library SFEW2.0 is used. is library concludes a great deal of photos taken from many famous films; human faces in these photos are taken in the wild and their heads poses are varied. e common FER methods might be disturbed seriously by this library, and recognition rates of traditional methods are only 60∼70% in the wild. For example, MEGVII Face++ is a famous face image processing company; its expression recognition software also makes a lot of mistakes for the images in SFEW2.0 library. While the face frontalization and expression strengthening method presented in this paper can solve these problems, these faces, the expression of which cannot be recognized exactly by Face++, can also be recognized exactly after face frontalization and expression strengthening. e following are the experiment results with different face poses and different expressions. Face++ software failed to detect expression from these original faces. Meanwhile, after our face frontalization, some expressions can be recognized, and, after expression strengthening, the facial features on the strengthened frontalization face are more significant than those of original frontalization face, and recognition result is exact. e FER accuracy for different expressions increases by 10% in average. Figure 17 shows the expression recognition result of rising head. According to the library label, this man is "sad." Meanwhile, as his head is rising, the sad features are all dislocated. So this expression is mistakenly identified as "neutral" by Face++ platform, as shown in Figure 18.
After face frontalization, these sad features are readjusted and become similar to those of a front face, as shown in Figure 17(b).
Through this image, Face++ can correctly identify the expression as "sad," as shown in Figure 19; the prediction confidence for "sad" is 41.35%, which is larger than that of the other expression kinds.
After expression strengthening, the features of "sad" on the cheeks and brows of the frontalization face are strengthened automatically, and Face++ can recognize "sad" much more accurately; the prediction confidence increases from 41.35% to 68.55%, as shown in Figure 20.
In Figure 18, seven horizontal bars on the right of the Face++ platform show the prediction confidence of the seven expressions. From top to bottom, the seven expressions are "happy," "neutral," "surprise," "sad," "disgust," "angry," and "fear," and they are labeled in English as shown in Figure 18. The most probable expression in the recognition result is shown as an orange bar. Figure 21 shows the expression recognition result for a turned head. According to the library label, this man is smiling. However, because his head is turned and bowed and his eyes are closed, the features in the original face are very blurry in the dim light.
This expression is mistakenly identified as "sad" by the Face++ platform, as shown in Figure 22.
After face frontalization, although the front face is obtained, the expression features are insignificant, and Face++ mistakenly identifies the expression as "disgust," as shown in Figure 23.
After expression strengthening, the features of the smile are intensified automatically on this frontalization face, especially the features on the cheek; the cheek muscle is squeezed toward the two sides to make the smile more significant. Through this strengthened image, Face++ can recognize the "smile" more accurately, as shown in Figure 24, and the prediction confidence is close to 80%. Figure 25 shows the expression recognition result for a head turned to the left. According to the library label, the person is "angry." However, because his head is turned, this expression is mistakenly identified as "surprise" (prediction confidence 47.16%) and "neutral" (prediction confidence 46.68%) by Face++, as shown in Figure 26.
Figure 19: The frontalization face is recognized as "sad," but the prediction confidence is only 41.35%.
Figure 20: The expression-strengthened face is correctly recognized as "sad," and the prediction confidence increases to 68.55%.
The frontalization face is recognized as "sad," which is somewhat closer to "anger" than "surprise" is, as shown in Figure 27.
After expression strengthening, the features of "sad" on the brows and mouth of the strengthened frontalization face are more significant, and Face++ can recognize the "anger" more accurately; the prediction confidence is 69.4%, as shown in Figure 28.
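The three cases above can be tabulated to check, stage by stage, whether the top prediction matches the library label. The following Python sketch is only an illustration and not the authors' code: the names `StageResult`, `CASES`, and `first_correct_stage` are hypothetical, the values are those quoted in the text, and confidences that the text does not report exactly (including the "close to 80%" of the second case) are left as `None`. Label spellings are normalized to the seven names shown on the Face++ bars, so "smile" is stored as "happy" and "anger" as "angry."

```python
from dataclasses import dataclass
from typing import Optional

# Processing stages in the order they are applied in the experiments.
STAGES = ("original", "frontalized", "strengthened")


@dataclass(frozen=True)
class StageResult:
    predicted: str               # top expression returned by the recognizer
    confidence: Optional[float]  # percent; None when no exact value is quoted


# Hypothetical container holding only values quoted in the text.
CASES = {
    "rising head": ("sad", {
        "original": StageResult("neutral", None),
        "frontalized": StageResult("sad", 41.35),
        "strengthened": StageResult("sad", 68.55),
    }),
    "turning head": ("happy", {
        "original": StageResult("sad", None),
        "frontalized": StageResult("disgust", None),
        "strengthened": StageResult("happy", None),  # text says "close to 80%"
    }),
    "turning head to left": ("angry", {
        "original": StageResult("surprise", 47.16),  # "neutral" follows at 46.68
        "frontalized": StageResult("sad", None),
        "strengthened": StageResult("angry", 69.4),
    }),
}


def first_correct_stage(label, stages):
    """Return the earliest stage whose top prediction equals the label, or None."""
    for name in STAGES:
        if stages[name].predicted == label:
            return name
    return None


if __name__ == "__main__":
    for case, (label, stages) in CASES.items():
        stage = first_correct_stage(label, stages)
        print(f"{case}: label={label}, first correct at: {stage}")
```

Read this way, only the first case is already correct after frontalization alone; the other two become correct only after expression strengthening.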

Results-Related Discussion
The SFEW database used to test our expression recognition method is a very difficult database for nonfrontal expression recognition. The reported recognition rates on SFEW are often very low, because the facial expression features in the wild are often deformed and translated. For example, among deep learning methods, Multiple deep CNNs reaches 55.96% [21], RAN (VGG16 + ResNet18) reaches 56.4% [30], gACNN reaches 54.47% [22], and ensemble IL-CNN reaches 59.41% [24]. Even for unpleasantness and pleasantness recognition, the recognition rates are still low; for example, the reported recognition rates of ensemble IL-CNN [24] for unpleasantness and pleasantness are just 64.82% and 73.7%.
Figure 24: The expression-strengthened face is correctly recognized as "happy."
Meanwhile, with the face rebuilding method and the muscle movement rebuilding and intensifying method presented in this paper, the expression on a nonfrontal face can be recognized more accurately from its rebuilt front face, in which the muscle movement is also strengthened during the frontalization process. The recognition rate for different expressions can be increased by 10% on average in the test.
Figure 26: The angry face is mistakenly recognized as "neutral" and "surprise."
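To make the comparison concrete, the SFEW rates quoted in this discussion can be collected in a small table, and an average gain over expressions can be formed as the mean of per-expression differences. The Python sketch below is an assumption about how such an average might be computed, not the authors' evaluation code: `REPORTED_SFEW_RATES` and `mean_gain` are hypothetical names, the gain is assumed to be measured in percentage points, and no per-expression accuracies are supplied because the text does not list them.

```python
from statistics import mean

# Overall SFEW recognition rates (percent) as quoted in the text.
REPORTED_SFEW_RATES = {
    "Multiple deep CNNs [21]": 55.96,
    "RAN (VGG16 + ResNet18) [30]": 56.4,
    "gACNN [22]": 54.47,
    "ensemble IL-CNN [24]": 59.41,
}


def mean_gain(before, after):
    """Average per-expression accuracy gain in percentage points.

    `before` and `after` map expression name -> accuracy (percent) and
    must cover exactly the same expressions.
    """
    if before.keys() != after.keys():
        raise ValueError("both mappings must cover the same expressions")
    if not before:
        raise ValueError("at least one expression is required")
    return mean(after[name] - before[name] for name in before)


if __name__ == "__main__":
    best = max(REPORTED_SFEW_RATES, key=REPORTED_SFEW_RATES.get)
    print(f"best quoted baseline: {best} at {REPORTED_SFEW_RATES[best]}%")
```

Averaging per expression rather than per image weights each expression equally, which matters on a library where some expressions have far fewer samples than others.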

Conclusions
Facial expression recognition (FER), especially in the wild, is a hotspot and also a challenge for AI. To solve the problem caused by face pose changes, this paper proposed a face frontalization and expression strengthening method for nonfrontal face images. This method builds a 3D face for each face in the wild and extracts the front face through projection. To improve FER accuracy, this method takes advantage of face symmetry to build the frontalization face. However, the frontalization face cannot be used for FER directly: according to the "N:q rule" of parameter estimation and the theory of muscle structure anatomy, this paper showed that the expression features in the frontalization face might be weaker than those of a real front face, and that the parameters of muscular facial expression, except the moving direction, cannot be estimated effectively. To strengthen muscle movement and highlight expression features, a face contour model and a muscle movement strengthening model are designed in this paper. The face contour model extracts the muscular moving direction in different face areas. The muscle movement strengthening model intensifies muscle movement in these areas. The expression features on the expression-strengthened frontalization face are much more significant than those on the original frontalization face. The expressions in nonfrontal faces, which cannot be recognized effectively by the well-known MEGVII Face++ software, can be recognized correctly after face frontalization and expression feature strengthening.

Data Availability
The database used in this manuscript is SFEW2.0. It can be downloaded from https://pan.baidu.com/s/1IktnrO0WTQoGNsX8cNSviQ, and the download password is "ztum."

Disclosure
All the human faces used in this manuscript, except those in Figures 2, 6, and 8, whose eyes are obscured, are taken from the open database SFEW2.0. When SFEW2.0 was built by its creators, all the patients whose images are sampled in this database gave consent for the clinical study to be published.

Conflicts of Interest
The authors declare that they have no conflicts of interest.