This paper presents a human gait recognition algorithm based on leg gesture separation. The main innovations are gait recognition using leg gesture classification, which is invariant to covariate conditions during the walking sequence and focuses only on lower-body motion, and a neuro-fuzzy combiner classifier (NFCC), which yields a high-precision recognition system. The performance of the proposed algorithm is validated on the HumanID Gait Challenge data set (HGCD), the largest gait benchmarking data set, with 122 subjects and realistic covariates including viewpoint, shoe, surface, carrying condition, and time. It is also compared with a recent gait recognition algorithm.
In the last decade, there has been great interest in applying human biometrics for identification and verification purposes, for instance, in video surveillance and human recognition. There has been much research on ear and face recognition, body tracking, and hand gesture recognition, and recently on gait recognition for human identification. But as a comparison between human gait and other various biometrics, such as hand geometry, iris, face, voice, signature, and fingerprint [
Previous works have performed classification under similar covariate conditions (e.g., clothing, surface, carrying, etc.). In this paper we propose an improved and novel classification method that is based only on the different gestures of the leg during walking, requires no body part tracking, and is invariant to different covariate conditions.
As it is indicated in Figure
(a) A sequence of energy halation images. (b) Underbody changes during walking. (c) Bust (upper body) changes during walking.
To review the fundamentals of the usual gait recognition approach: in the walking process, the functional versatility of the body joints allows the lower and upper limbs to readily accommodate stairs, doorways, changing surfaces, and obstacles in the path of progression. Efficiency in these endeavors depends upon free joint mobility and muscle activity that is selective in timing and intensity. Energy conservation is optimal in the normal pattern of limb action. A person performs his or her walking pattern in a fairly repeatable and unique way, and medical research has been trying to apply these gait patterns to the treatment of pathologically abnormal patients [
The approach of this paper can be summarized by the following procedure.
Five states of human gait are extracted after background estimation and human detection in the scene. Leg gestures are classified using the directional chain code of the bottom part of the silhouette contour. A spatio-temporal data base, namely, the Energy Halation Image (EHI), is constructed from the bottom part of the human silhouette in the training film sequence, separately for each of the five leg gestures. The eigenspace of the energy halation images is applied to a multilayer perceptron neural network. Five neural networks recognize people, but with a medium recognition rate. A neuro-fuzzy fusion technique is used to obtain a high recognition rate. Experiments are performed on a suitable data base, which includes 20 samples for each of eight people, where each sample has approximately 100 frames. The proposed system obtains a 99% recognition rate over the 10-sample test patterns.
Leg gesture studies have various applications. Among these, some interesting work indicates the importance of leg gesture classification, as in [
Infrared thermal imaging was applied to collect gait video, and an infrared thermal gait database was established in [
Reference [
Reference [
In [
Tactile ground surface indicators installed on sidewalks help visually impaired people walk safely. The visually impaired distinguish the indicators by stepping on their convexities and following them. However, these indicators sometimes cause people who are not visually impaired to stumble. In [
Another application of gait identification is detecting gait degeneration due to ageing, which may be closely linked to the causes of falls. This would help in taking appropriate measures to prevent falls. As in many other developed countries, falls in the older population have been identified as a major health issue in Australia [
Biomechanical analysis of gait has been successfully applied in human clinical gait analysis [
Identification of people by analysis of gait patterns extracted from video has recently become a popular research problem. However, the conditions under which the problem is “solvable” are not understood or characterized as in [
As a solution for making it possible to identify human gait from a sequence of segmented noisy silhouettes in low-resolution video, a model-based gait cycle extraction based on the prediction-based hierarchical active shape model (ASM) is presented in [
As mentioned, gait recognition is an effective way of identifying people from a distance, but there are two obstacles. First, in the low-resolution case the performance of gait recognition degrades because of noisy images. Second, as a usual step in gait recognition, the gait sequences are projected onto a nonoptimal low-dimensional subspace to reduce data complexity, which again leads to a decline in recognition performance. A new algorithm is proposed in [
Recognizing gait by decomposing the body into parts and fusing them has not been observed in the literature. The main contribution of this paper is gesture classification for human gait recognition. The further contributions of the paper are as follows: a new spatio-temporal data base, namely, the energy halation image; five-feature space generation using the leg gesture concept; human gait recognition based on leg gesture classification; neuro-fuzzy-based combiner classifiers (NFCCs); and the presentation of a complete gait recognition system.
The low performance of existing human gait recognition systems is one of the motivations of the proposed method. Human detection in the scene, object tracking, and the capability of classifiers over time-dependent features are some of the problems that lead to a low recognition rate. We therefore try to present a complete human gait recognition system that includes many features.
The block diagram of the proposed method is summarized in Figure. It comprises five stages: background estimation, leg gesture recognition, energy halation image construction (spatio-temporal data base), gait recognition in eigenspace, and the neuro-fuzzy-based combiner classifier.
Block diagram of proposed gait recognition method.
Several approaches are known for separating foreground from background. If the background is known, simple thresholding yields the foreground. One suitable way of detecting objects is background estimation. This paper uses probability density function (PDF) estimation of each pixel [
Human detection in the scenes using Gaussian PDF model.
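The per-pixel Gaussian background model can be illustrated with a minimal sketch. It assumes grayscale frames, a set of background-only training frames, and a deviation threshold `k`; the function names, the value of `k`, and the synthetic data are illustrative assumptions and are not taken from the paper.

```python
import numpy as np


def fit_background(frames):
    """Estimate a per-pixel Gaussian (mean, std) from background-only frames."""
    stack = np.asarray(frames, dtype=np.float64)
    mean = stack.mean(axis=0)
    std = stack.std(axis=0) + 1e-6  # small offset avoids a zero threshold
    return mean, std


def foreground_mask(frame, mean, std, k=2.5):
    """Mark pixels that deviate more than k standard deviations from the mean."""
    diff = np.abs(np.asarray(frame, dtype=np.float64) - mean)
    return diff > k * std


# Synthetic demonstration: noisy flat background plus a bright rectangular "person".
rng = np.random.default_rng(0)
background = rng.normal(100.0, 2.0, size=(30, 48, 64))
mean, std = fit_background(background)

test_frame = rng.normal(100.0, 2.0, size=(48, 64))
test_frame[10:30, 20:30] = 200.0  # hypothetical foreground blob
mask = foreground_mask(test_frame, mean, std)
```

The resulting boolean `mask` plays the role of the binary blob image that the later stages operate on; a real system would usually follow it with morphological cleaning.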
After background estimation and human detection in the scene, a binary human image (blob) is obtained. After cropping the bottom part of the blob image (waist to sole), the distribution function of the directional chain code is extracted from the blob contour. After normalizing the chain code distribution to its maximum, a multilayer perceptron neural network (MLP-NN) uses this feature to recognize the leg gesture. The block diagram of the leg gesture classifier is shown in Figure. The training images are named with three digits: the first digit denotes the state of the person (1-2-3-4-5), the second digit denotes the person (1-2-3-4-5), and the third digit denotes the number of the image of each person (1-2-3-4-5).
Therefore we now have 125 named images in the database for training. Moreover, we considered five different angles in the video sequence of samples for each state like the one in Figure
Different states of every sample for the NN trainer.
Five angles of each gait state.
A sample of recognizing 5 different states of gait using NN recognizer.
One part of the leg gesture classifier is the gesture data base, which is necessary for training the MLP-NN with the backpropagation algorithm. Five states are defined for the leg gesture; this number depends on the frame rate and the type of application. Figure
Normalized histogram for state 1 and 2.
The trained neural network cannot classify leg gestures perfectly, but this problem is compensated for by the construction of the spatio-temporal data base and the use of the subsequent classifiers.
Spatio-temporal data bases are used for compact representation of a film sequence and are used in many applications such as image retrieval, gesture analysis, action recognition, and behavioral recognition in the scene.
In this subsection we propose a spatio-temporal representation similar to the motion history image (MHI) in [
Each input frame belongs to one of the five leg gestures and is used to generate the five energy halation images. Initializing: Let Let
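The chain code feature described above can be sketched as follows. The sketch assumes an 8-direction Freeman chain code over an ordered, closed list of contour pixels in which consecutive points are 8-connected neighbours; the direction numbering and the function names are assumptions made for illustration. An eight-bin histogram would be consistent with the eight input neurons reported for the leg gesture recognizer.

```python
import numpy as np

# Freeman 8-direction codes keyed by the (dx, dy) step between neighbouring pixels.
DIRECTIONS = {
    (1, 0): 0, (1, 1): 1, (0, 1): 2, (-1, 1): 3,
    (-1, 0): 4, (-1, -1): 5, (0, -1): 6, (1, -1): 7,
}


def chain_code(contour):
    """Return the Freeman chain code of a closed, ordered list of (x, y) points."""
    codes = []
    for (x0, y0), (x1, y1) in zip(contour, contour[1:] + contour[:1]):
        codes.append(DIRECTIONS[(x1 - x0, y1 - y0)])
    return codes


def chain_code_histogram(contour):
    """Histogram of chain codes over 8 directions, normalized to its maximum."""
    hist = np.bincount(chain_code(contour), minlength=8).astype(np.float64)
    return hist / hist.max()


# Toy contour: the boundary of a 3 x 3 pixel square, traversed once.
square = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2), (1, 2), (0, 2), (0, 1)]
features = chain_code_histogram(square)
```

For the square, only the four axis-aligned directions occur and each occurs equally often, so the normalized histogram alternates between 1 and 0.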
Note: Adding zero rows and columns bilateral of If it is not end of sequence go to step 2. End.
Obtained results include five images of energy halation for each input sequence. As an example, Figure
Five images of energy halation (columns) for three people (rows).
As in face recognition and similar applications, we use the eigenspace transform to reduce the dimensions of the energy halation images before applying them to the MLP neural network. An MLP-NN is trained for each leg gesture for human gait recognition. Five trained MLP-NNs are thus created and used for human identification, and each network recognizes people separately, based on a different feature (the energy halation image of one leg gesture).
The recognition rate of each individual network is not high enough for a good human gait recognizer, so we combine the neural network outputs using neuro-fuzzy-based combiner classifiers, as described in the next subsection.
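The eigenspace reduction can be sketched with a principal component projection computed by singular value decomposition. The image size, the random stand-in data, and the function names below are assumptions; the choice of 50 components simply mirrors the 50 input neurons reported for the recognition networks and is not confirmed as the authors' setting.

```python
import numpy as np


def fit_eigenspace(images, n_components):
    """Return the mean image vector and the top principal directions."""
    X = np.asarray(images, dtype=np.float64).reshape(len(images), -1)
    mean = X.mean(axis=0)
    # Rows of vt are orthonormal principal directions, ordered by variance.
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]


def project(image, mean, basis):
    """Project one image onto the eigenspace to get a low-dimensional feature."""
    return basis @ (np.asarray(image, dtype=np.float64).ravel() - mean)


# Stand-in data: 80 random "energy halation images" of 32 x 24 pixels.
rng = np.random.default_rng(1)
train_images = rng.random((80, 32, 24))
mean, basis = fit_eigenspace(train_images, n_components=50)
feature_vector = project(train_images[0], mean, basis)
```

Each of the five gesture-specific networks would receive a vector like `feature_vector` computed from its own energy halation image.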
Neuro-fuzzy systems have been shown to give significant results in modeling nonlinear functions. They have been used frequently in the literature, for example in fishing predictions [
In a neuro-fuzzy system, the membership functions (MFs) are extracted from a data set that describes the system behavior. The neuro-fuzzy system learns features in the data set and adjusts the system parameters according to a given error criterion. In a fused architecture, NN learning algorithms are used to determine the parameters of the fuzzy inference system. Below, we have summarized the advantages of the neuro-fuzzy system technique. Fusion of classifier outputs with a linear combiner has been pointed out in [
A set of films comprising 160 sequences of eight people is used as the data base. The frame rate is 25 frames per second, and the image size is 352 × 288. Some images from the data base are shown in Figure
Some samples of people image.
The leg gesture recognizer is a three-layer MLP neural network with eight input neurons, fifteen neurons in the hidden layer, and five output neurons, which categorizes input frames into 5 states. An example of this stage is shown in Figure
Result of gesture recognizer system.
As mentioned before, the gesture label of each frame is used to sort the frame sequence into five energy halation images, and five MLP neural networks are trained over 10 film sequences for each of the 8 people. Each network has 50 neurons in the input layer, three hidden layers with 100, 90, and 40 neurons, and 8 neurons in the output layer. In the testing phase, the confusion matrices obtained for two of the networks are shown in Tables
Confusion matrix of neural networks 1 and 2 related to the 1st and 2nd gestures.
NN1 | P1 | P2 | P3 | P4 | P5 | P6 | P7 | P8 |
---|---|---|---|---|---|---|---|---|
P1 | 7 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
P2 | 3 | 7 | 0 | 1 | 2 | 0 | 0 | 1 |
P3 | 0 | 0 | 9 | 0 | 0 | 0 | 0 | 0 |
P4 | 0 | 0 | 0 | 9 | 1 | 0 | 0 | 0 |
P5 | 0 | 2 | 0 | 0 | 7 | 0 | 0 | 0 |
P6 | 0 | 1 | 1 | 0 | 0 | 7 | 0 | 0 |
P7 | 0 | 0 | 0 | 0 | 0 | 3 | 10 | 0 |
P8 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 9 |
Confusion matrix of neural networks 1 and 2 related to the 1st and 2nd gestures.
NN2 | P1 | P2 | P3 | P4 | P5 | P6 | P7 | P8 |
---|---|---|---|---|---|---|---|---|
P1 | 10 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
P2 | 0 | 8 | 0 | 0 | 0 | 0 | 0 | 1 |
P3 | 0 | 0 | 10 | 0 | 0 | 0 | 0 | 0 |
P4 | 0 | 0 | 0 | 8 | 1 | 1 | 0 | 0 |
P5 | 0 | 2 | 0 | 2 | 5 | 1 | 3 | 0 |
P6 | 0 | 0 | 0 | 0 | 1 | 5 | 0 | 0 |
P7 | 0 | 0 | 0 | 0 | 2 | 3 | 7 | 0 |
P8 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 9 |
Confusion matrix of proposed system as shown in Figure
NF | P1 | P2 | P3 | P4 | P5 | P6 | P7 | P8 |
---|---|---|---|---|---|---|---|---|
P1 | 10 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
P2 | 0 | 9 | 1 | 0 | 0 | 0 | 0 | 0 |
P3 | 0 | 0 | 10 | 0 | 0 | 0 | 0 | 0 |
P4 | 0 | 0 | 0 | 10 | 0 | 0 | 0 | 0 |
P5 | 0 | 0 | 0 | 0 | 10 | 0 | 0 | 0 |
P6 | 0 | 0 | 0 | 0 | 0 | 10 | 0 | 0 |
P7 | 0 | 0 | 0 | 0 | 0 | 0 | 10 | 0 |
P8 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 10 |
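As a worked check, the overall recognition rate implied by each confusion matrix is the sum of the diagonal divided by the total number of test samples. The short script below copies the three matrices from the tables above; the variable names are illustrative.

```python
import numpy as np

nn1 = np.array([
    [7, 0, 0, 0, 0, 0, 0, 0],
    [3, 7, 0, 1, 2, 0, 0, 1],
    [0, 0, 9, 0, 0, 0, 0, 0],
    [0, 0, 0, 9, 1, 0, 0, 0],
    [0, 2, 0, 0, 7, 0, 0, 0],
    [0, 1, 1, 0, 0, 7, 0, 0],
    [0, 0, 0, 0, 0, 3, 10, 0],
    [0, 0, 0, 0, 0, 0, 0, 9],
])
nn2 = np.array([
    [10, 0, 0, 0, 0, 0, 0, 0],
    [0, 8, 0, 0, 0, 0, 0, 1],
    [0, 0, 10, 0, 0, 0, 0, 0],
    [0, 0, 0, 8, 1, 1, 0, 0],
    [0, 2, 0, 2, 5, 1, 3, 0],
    [0, 0, 0, 0, 1, 5, 0, 0],
    [0, 0, 0, 0, 2, 3, 7, 0],
    [0, 0, 0, 0, 1, 0, 0, 9],
])
nf = np.diag([10, 9, 10, 10, 10, 10, 10, 10])
nf[1, 2] = 1  # the single off-diagonal entry in the NF table (row P2, column P3)


def recognition_rate(confusion):
    """Fraction of test samples on the diagonal (correctly recognized)."""
    return np.trace(confusion) / confusion.sum()


rates = {name: recognition_rate(m) for name, m in
         [("NN1", nn1), ("NN2", nn2), ("NF", nf)]}
print(rates)  # NN1: 0.8125, NN2: 0.775, NF: 0.9875
```

Each matrix covers 80 test samples. The two single networks reach 65/80 and 62/80, while the neuro-fuzzy combiner reaches 79/80, which rounds to the 99% quoted in the text.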
To evaluate the proposed method (gait recognition based on NFCC), we also compared it with the different algorithms evaluated on the HumanID Gait Challenge Dataset (HGCD) and with a recent gait recognition algorithm that has been evaluated on HGCD (Table
The final result of NFCC performance in gait recognition in comparison with HGCD and ASM algorithms [
Algorithm | Width vectors (UMD) | DTW | HMM | BodyShape (CMU) | HMM (MIT) | Body (CAS) | Baseline | ASM | Proposed method (NFCC) |
---|---|---|---|---|---|---|---|---|---|
Average performance* | 25.7 | 30.1 | 53.2 | 46.2 | 43.2 | 39.2 | 49.8 | 89.1 | 94.2 |
*The average performance under different experiments of HGCD.
The key idea of this paper is “human gait recognition based on leg gestu.” In addition, the paper introduces a new spatio-temporal gait data base (Energy Halation Image) and a neuro-fuzzy-based combiner classifier (NFCC). To overcome the limited recognition rate, we proposed a system for gait feature fusion. We used five spatio-temporal data bases and applied their eigenspace features to five neural networks separately. The performance of each NN on the test samples was low (about 70% to 80%). We then used a neuro-fuzzy combiner classifier to mix the neural network outputs, for the first time in gait recognition. The result of combining the neural network outputs was satisfactory.
Neural networks (NNs) have been demonstrated to have a powerful capability for expressing the relationship between input and output variables. In fact, it is always possible to develop a structure that approximates a function with a given precision. However, there is still distrust of the identification capability of NNs in some applications. Fuzzy set theory plays an important role in dealing with uncertainty in plant modeling applications. Neuro-fuzzy systems are fuzzy systems that use NNs to determine their properties (fuzzy sets and fuzzy rules) by processing data samples. They combine the merits of NNs and fuzzy systems in a complementary way to overcome the disadvantages of each. The fusion of an NN and fuzzy logic in neuro-fuzzy models offers both the low-level learning and computational power of NNs and the high-level, human-like reasoning of fuzzy systems. For identification, the hybrid neuro-fuzzy system called ANFIS combines an NN and a fuzzy system. ANFIS has been shown to give significant results in modeling nonlinear functions. In ANFIS, the membership functions (MFs) are extracted from a data set that describes the system behavior. ANFIS learns features in the data set and adjusts the system parameters according to a given error criterion. In a fused architecture, NN learning algorithms are used to determine the parameters of the fuzzy inference system.
The advantages of the ANFIS technique are summarized as follows: real-time processing of instantaneous system input and output data, which makes the technique useful for many operational research problems; offline adaptation instead of online system-error minimization, which is easier to manage and involves no iterative algorithms; system performance that is not limited by the order of the function, since it is not represented in polynomial format; fast learning time; flexible performance tuning, since the number of membership functions and training epochs can be altered easily; and simple if-then rule declarations and an ANFIS structure that are easy to understand and implement.
A typical architecture of ANFIS is shown in Figure
ANFIS architecture.
Layer 1, every node
Layer 2
Layer 3, the
Layer 4, the node function in this layer is represented by
Layer 5, the single node in this layer computes the overall output as the summation of all incoming signals:
It is seen from the ANFIS architecture that when the values of the premise parameters are fixed, the overall output can be expressed as a linear combination of the consequent parameters:
The hybrid learning algorithm combining the least square method and the backpropagation (BP) algorithm can be used to solve this problem. This algorithm converges much faster since it reduces the dimension of the search space of the BP algorithm. During the learning process, the premise parameters in layer 1 and the consequent parameters in layer 4 are tuned until the desired response of the FIS is achieved. The hybrid learning algorithm has a two-step process. First, while holding the premise parameters fixed, the functional signals are propagated forward to layer 4, where the consequent parameters are identified by the least square method. Second, the consequent parameters are held fixed while the error signals, the derivative of the error measure with respect to each node output, are propagated from the output end to the input end, and the premise parameters are updated by the standard BP algorithm.
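The forward half of the hybrid algorithm can be illustrated with a minimal sketch. It assumes the standard textbook ANFIS form with two inputs, two rules, generalized bell membership functions, and first-order Sugeno consequents; these choices, the parameter values, and all function names are assumptions and are not taken from this paper's own equations. With the premise parameters held fixed, the output is linear in the consequent parameters, so they can be identified by least squares. The backpropagation step for the premise parameters is omitted.

```python
import numpy as np


def bell(x, a, b, c):
    """Generalized bell membership function (layer 1)."""
    return 1.0 / (1.0 + np.abs((x - c) / a) ** (2 * b))


def normalized_firing(x, y, premise):
    """Layers 2 and 3: rule firing strengths and their normalization."""
    w = np.stack(
        [bell(x, *premise["A"][i]) * bell(y, *premise["B"][i]) for i in range(2)],
        axis=1,
    )
    return w / w.sum(axis=1, keepdims=True)


def design_matrix(x, y, wbar):
    """Columns wbar_i * x, wbar_i * y, wbar_i for each rule i (layer 4 terms)."""
    cols = []
    for i in range(2):
        cols += [wbar[:, i] * x, wbar[:, i] * y, wbar[:, i]]
    return np.stack(cols, axis=1)


def fit_consequents(x, y, target, premise):
    """Least squares estimate of the consequent parameters, premise fixed."""
    wbar = normalized_firing(x, y, premise)
    theta, *_ = np.linalg.lstsq(design_matrix(x, y, wbar), target, rcond=None)
    return theta


def predict(x, y, premise, theta):
    """Layer 5: overall output as the sum of the weighted rule outputs."""
    return design_matrix(x, y, normalized_firing(x, y, premise)) @ theta


# Hypothetical premise parameters (a, b, c) and a synthetic linear target.
premise = {"A": [(1.0, 2.0, -0.5), (1.0, 2.0, 0.5)],
           "B": [(1.0, 2.0, -0.5), (1.0, 2.0, 0.5)]}
rng = np.random.default_rng(2)
x, y = rng.uniform(-1, 1, 200), rng.uniform(-1, 1, 200)
target = 2.0 * x - y + 1.0
theta = fit_consequents(x, y, target, premise)
prediction = predict(x, y, premise, theta)
```

Because the normalized firing strengths sum to one, a target that is exactly linear in the inputs is reproduced by this least squares step. A full hybrid iteration would then hold `theta` fixed and update the premise parameters by gradient descent.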