Human Walking Pattern Recognition Based on KPCA and SVM with Ground Reflex Pressure Signal

Algorithms for human walking pattern recognition based on the ground reflex pressure (GRF) signal obtained from a pair of sensing shoes were investigated. Dimensionality reduction algorithms based on principal component analysis (PCA) and kernel principal component analysis (KPCA) were studied for walking pattern data compression in order to obtain higher recognition speed. Classifiers based on support vector machine (SVM), SVM-PCA, and SVM-KPCA were designed, and the classification performances of these three kinds of algorithms were compared using data collected from a person wearing the sensing shoes. Experimental results showed that the algorithm fusing SVM and KPCA had better recognition performance than the other two methods. Experimental outcomes also confirmed that the sensing shoes developed in this paper can be employed for automatically recognizing human walking patterns in unconstrained environments, which demonstrates their potential application in the control of exoskeleton robots.


Introduction
In the past decades, many wearable human-assistive robot systems have been developed for the purpose of assisting physically weakened people such as elderly, disabled, and injured people. Many important results have already been achieved. Sankai et al. developed the hybrid-assistive limb (HAL) for augmenting the power of normal persons [1,2], and Kazerooni et al. introduced the Berkeley Lower Extremity Exoskeleton (BLEEX) for military applications [3,4]. Yamamoto et al. developed the Power Assist Suit to assist nurses in lifting heavy patients [5]. Kong and Jeon introduced the Exoskeleton for Patients and Old People by Sogang University (EXPOS) for weakened persons [6]. However, several factors have limited the general use and commercialization of these devices. In particular, the development of the control strategy is challenging [7,8]. Most of the control strategies adopted in exoskeleton robots use finite-state machines for gait phase detection. At the hardware level, the mechanical components and sensors used in the prototypes usually confine the nature of the low-level controllers to particular configurations [7,8]. The control objective of an exoskeleton robot is to follow the movements of a healthy wearer, augmenting his/her physical capabilities for specific tasks in a relatively safe way. Human motion intent recognition is one of the key issues for the controller of an exoskeleton robot, because the robot must know the wearer's intent in order to follow the wearer's movements.
In recent years, much research has focused on the recognition of human motion patterns for the purpose of controlling exoskeleton robots. Signals for gait recognition can be obtained by different kinds of sensors. The main information types used are biomechanical signals, electromyographic (EMG) signals, peripheral nervous system signals, and central nervous system signals [8]. Among the signals used for motion pattern recognition, EMG signals have been widely used in exoskeleton robots. However, noise included in the EMG signals makes it difficult to identify the gait phase of walking exactly. Furthermore, each joint of the human body is actuated through the cooperation of many muscles. Therefore, it is difficult to identify the wearer's walking pattern accurately based on the activities of only a few muscles.
Consequently, studies on motion pattern recognition have focused on looking for other signals instead of EMG. The ground reflex pressure (GRF) signal is viable and effective for identifying behavior because human movement and posture are well reflected in the foot pressure distribution. Many walking pattern recognition methods based on GRF have been studied. The force platform is widely used to analyze human movement. However, the force platform imposes constraints on measurement and is not feasible for measuring free-living subjects. Present research tends to focus on daily worn wristwatches, glasses, and shoes into which sensors can be embedded. With embedded sensors, noninvasive detection is available for providing action assistance. Many researchers have developed wearable sensors attached to insoles [9-18].
In many robotic systems, pressure sensors were installed at the toe or heel to recognize movements. Most of the methods were based on thresholds [9,15,16]. Hirata et al. set two representative points to measure GRF on the heel and the toe, and used a threshold to determine flat foot. However, the recognition accuracy is relatively low. In order to improve the recognition accuracy, much research focuses on the recognition algorithm in addition to trying to find sensors to replace EMG, and most of the recognition algorithms are based on machine learning, with the recognition model built offline. An algorithm based on support vector machine (SVM) fusing neuromuscular and mechanical signals to continuously recognize a variety of locomotion modes was developed [10]. Walking modes were classified by a Bayesian classifier [11]. Different computational approaches have been proposed to support various gait pattern-based applications.
Generally, using more signal acquisition channels could provide more motion information for better performance of walking pattern classification. However, more sensors will definitely increase the complexity of computation and analysis, which may lead to a slow discrimination response. These issues make walking pattern recognition a difficult task for the control of exoskeleton robots. Therefore, the trade-off between recognition accuracy and processing speed should be analyzed comprehensively. Feature extraction plays a vital role in pattern recognition. In order to improve the processing speed, some algorithms can be used to compress the data during the process of feature extraction. Principal component analysis (PCA) is a well-known method for feature extraction which can lower the dimension [19-23]. By calculating the eigenvectors of the covariance matrix of the original inputs, PCA linearly transforms a high-dimensional input vector into a low-dimensional one whose components are uncorrelated. Although PCA has many advantages, it also has shortcomings, such as its sensitivity to noise and its limited ability to describe data. To eliminate these shortcomings, many methods have been proposed to improve the PCA algorithm. Among these improved PCA algorithms, kernel-based PCA (KPCA) is a state-of-the-art nonlinear PCA algorithm [24-27]. KPCA utilizes a kernel function to capture arbitrary high-order correlations between input variables and finds the principal components needed through the inner products between input data. KPCA can successfully describe not only data with a Gaussian distribution but also data with a non-Gaussian distribution. More and more researchers are interested in this field and have carried out relevant research.
It is worth noting that stair climbing has not been studied as extensively as gait in the control of exoskeleton robots, although the significance of preventing falls on stairs has been well recognized. Startzell et al. [28] reported that more than 1000 individuals over the age of 65 years die in the United States each year as a result of falling on stairs. More recently, Lee and Chou [29] found that older adults had more difficulty in maintaining balance during stair descending than during stair ascending. Hayashi and Kiguchi [30] proposed a ZMP-based stability control method for an assistive robot system when going upstairs and downstairs, so that the controller can adopt an active stability control strategy during stair ascending to ensure the safety of the wearer. Therefore, in the application of walking assistance for older people, it is very important for the controller of the exoskeleton robot to know whether the wearer is descending or ascending stairs.
The main purpose of this paper is to apply machine learning approaches to distinguish the walking patterns of stair descending and stair ascending from the patterns of walking on a flat surface and standing still using the GRF data. In order to improve both the recognition accuracy and speed, data reduction algorithms based on both PCA and KPCA were studied in this paper. Compared with the traditional motion recognition methods based on pressure thresholds, which are used in most exoskeleton robots, the method proposed in this paper can provide higher recognition accuracy. The remainder of this paper is organized as follows. Section 2 presents the walking pattern recognition system with foot pressure sensing shoes. Section 3 illustrates the proposed method. In Section 4, experimental results are shown. Finally, Section 5 summarizes this paper.

System Description
A pair of foot pressure sensing shoes was developed in this paper. The distribution of the pressure sensors is shown in Figure 1 and the sensor positions are listed in Table 1. The pressure sensors are FSR402 force-sensitive resistors. The FSR402 sensor is a flexible printed circuit with a thickness of 0.5 mm. The more sensors are placed, the more precisely the plantar pressure distribution can be measured. In this system, seven sensors were installed on each insole at seven different positions. These seven points were defined after walking analysis experiments, according to plantar parts traditionally used in research on gait analysis [31,32]. A microcontroller (STM32F107VET6) was employed for analog-digital conversion, data processing, and control of data transmission. A wireless transmission module based on ZigBee communication was used for digital data output and input.

Methodologies
The proposed method for recognition of human motion patterns using GRF signals is displayed in Figure 2 and can be divided into two main stages: offline and online. The offline stage, shown in the dashed box, involves the processes required for building the motion pattern recognition model. The result of the offline stage is a motion pattern recognition model which is used later in online motion pattern recognition. The process of online motion pattern recognition is shown in the solid box.

Data Inputs.
The first step of the process is to create the training input dataset from the signals obtained from the 14 pressure sensors installed under the soles.
The signals are wirelessly transmitted to the PC. The raw data is made up of the pressure at each point at each time step. In addition, every time step is labeled with a value which indicates the current type of motion (such as walking forward on a flat surface, stair descending, or stair ascending). Here, SR denotes the sensor sample rate and $n$ is the number of pressure sensors.

Preprocessing and Segmentation.
The signals were filtered by a low-pass filter with a 10 Hz cut-off frequency.
In order to prepare the input from the sensor data, the sliding window technique is used to segment the GRF data for continuous classification decision making (Figure 3).
Features that characterize the data signals were extracted from each analysis window. This technique is commonly used for separating time series data into input vectors without losing information. An experiment with different window lengths was carried out, after which a window of 200 ms was chosen. Therefore, decisions are made at 200 ms intervals. Processing algorithms were implemented in MATLAB; the processing was performed on a PC with 2 GB of memory and a 2 GHz CPU.
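The segmentation step can be sketched as follows. This is a minimal illustration in Python rather than the MATLAB implementation used in the paper: the function name `sliding_windows` and its parameters are hypothetical, the 40 ms sampling period is taken from the experimental protocol, and the step between windows is assumed to equal the 200 ms window length because decisions are made at 200 ms intervals.

```python
def sliding_windows(samples, sample_period_ms=40, window_ms=200, step_ms=200):
    """Split a sequence of time steps into fixed-length analysis windows.

    samples is a list with one entry per time step (for example, a list of
    14 pressure readings). Incomplete trailing windows are dropped.
    """
    size = window_ms // sample_period_ms   # samples per window (5 by default)
    step = step_ms // sample_period_ms     # samples between window starts (assumed)
    return [samples[start:start + size]
            for start in range(0, len(samples) - size + 1, step)]
```

Setting `step_ms` below `window_ms` would give overlapping windows, which is a design choice the text does not specify.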

Feature Extraction and Reduction.
The performance of the classifier mainly depends on the effective feature extraction method.In this work, five features were calculated from the collected sensor data for training and testing, which are average value, standard deviation, maximum value, minimum value, and difference deviation.
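A possible computation of the five features for one analysis window is sketched below. The names `channel_features` and `window_feature_vector` are hypothetical. The text does not define "difference deviation", so the sketch assumes it is the standard deviation of the first differences of the signal; the population form of the standard deviation is also an assumption.

```python
import statistics


def channel_features(values):
    """Return the five time-domain features for one sensor channel."""
    first_differences = [b - a for a, b in zip(values, values[1:])]
    return [
        statistics.fmean(values),              # average value
        statistics.pstdev(values),             # standard deviation (population form, assumed)
        max(values),                           # maximum value
        min(values),                           # minimum value
        statistics.pstdev(first_differences),  # "difference deviation" (assumed definition)
    ]


def window_feature_vector(window):
    """Concatenate the features of every channel in one analysis window.

    window is a list of time steps, each a list with one reading per sensor.
    """
    features = []
    for channel in zip(*window):
        features.extend(channel_features(list(channel)))
    return features
```

Under these assumptions, 14 sensors with five features each would give a 70-dimensional feature vector per window.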

The Principal Component Analysis Feature Reduction.
A commonly used feature reduction method is principal component analysis (PCA), which maps data onto the axes of greatest variance and reduces the number of dimensions. PCA reduces dimensionality by discarding axes with small variances, ensuring that the data matrix, now projected onto its principal components, loses as little information as possible.
Mathematically, the principal components of a matrix $X$ are calculated from the eigenvectors of $X$'s covariance matrix $C$. If $x_i$ is the $i$th column of $X$, $C$ is obtained from the following formula:

$$C = \frac{1}{N}\sum_{i=1}^{N} x_i x_i^{T}, \tag{2}$$

where $N$ is the number of columns in $X$. The $k$th eigenvector of $C$, $v_k$, can then be found using the standard eigenvalue problem

$$\lambda_k v_k = C v_k. \tag{3}$$

There must exist some vector of coefficients $\alpha_k$ that allows $v_k$ to be constructed from $X$ and that allows the $k$th principal component of $X$ to be calculated:

$$v_k = \sum_{i=1}^{N} \alpha_k^{i} x_i, \tag{4}$$

where $\alpha_k^{i}$ is the $i$th component of $\alpha_k$. Since PCA is a linear transformation, it is relatively quick to compute. However, the PCA algorithm may not be effective in dealing with nonlinear feature boundaries.
In this study, foot pressure detection is formulated as a four-class classification problem. The distribution of foot pressure signals is nonlinear, so a linear boundary is inappropriate, and the input vector should be mapped into a high-dimensional feature space for higher classification accuracy. Similar to PCA, kernel principal component analysis (KPCA) takes a matrix of data and projects it onto new, reduced principal components. Unlike PCA, however, KPCA accomplishes this mapping through the use of a nonlinear kernel function. The purpose of KPCA is to keep as much information as possible in terms of variance and to find directions that have minimal reconstruction error. KPCA has been shown to be more effective than PCA on nonlinear data sets. KPCA maps data to a higher-dimensional feature space and then executes traditional PCA. The nature of KPCA makes it far more adept at representing nonlinear data in a way that can be interpreted linearly.

The Kernel Principal Component Analysis Feature Reduction
If the data matrix $X$ is mapped into a higher-dimensional feature space by a nonlinear mapping $\Phi$, that is, $X \to \Phi(X)$, then the math describing KPCA is very similar to that of PCA. A covariance matrix can be built as follows:

$$\bar{C} = \frac{1}{N}\sum_{i=1}^{N} \Phi(x_i)\Phi(x_i)^{T}. \tag{5}$$

Then, the eigenvectors are found in the same way as in PCA. Constructing the principal components yields:

$$v_k = \sum_{i=1}^{N} \alpha_k^{i}\Phi(x_i). \tag{6}$$

Feature extraction with KPCA uses a kernel function defined as $k(x_i, x_j) = \Phi(x_i) \cdot \Phi(x_j)$. A kernel matrix $K$ is then built up from evaluations of this kernel function such that

$$K_{ij} = k(x_i, x_j). \tag{7}$$

To construct $K$, the vectors $x_i$ for $i = 1, \ldots, N$ are examined, centered, and scaled before being fed into the kernel function. Calculation time can be reduced drastically by realizing that $K$ is symmetric and thus $K_{ij} = K_{ji}$. $K$ is then centered in the feature space:

$$\tilde{K} = K - 1_N K - K 1_N + 1_N K 1_N, \tag{8}$$

where $1_N$ is the $N \times N$ matrix whose entries are all $1/N$. Because of the series of relations set up in (3), (4), (6), and (7), the eigenvectors of $\tilde{K}$ must be the coefficient vectors $\alpha_k$ such that

$$N\lambda_k \alpha_k = \tilde{K}\alpha_k. \tag{9}$$

Therefore, it follows from (5) and (6) that the $k$th principal component generated by KPCA can be calculated with the following formula:

$$y_k = \sum_{i=1}^{N} \alpha_k^{i}\tilde{K}_i, \tag{10}$$

where $\tilde{K}_i$ is the $i$th row of $\tilde{K}$. Here, $k = 1, \ldots, p$. The value of $p$ is the number of eigenvectors to be extracted from $\tilde{K}$ and can vary based on the user's needs, provided that $p < N$. The goal of this research was to improve the classification accuracy with kernel PCA, because KPCA tends to give better results than PCA on nonlinear data; experimental results with both PCA and KPCA are presented in this paper.
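The centering of the kernel matrix in the feature space can be illustrated with a short sketch. It assumes the usual centering rule, in which each entry has its row mean and column mean subtracted and the grand mean added back; the function name `center_kernel_matrix` is hypothetical and the matrix is represented as a plain list of lists.

```python
def center_kernel_matrix(K):
    """Center an N x N kernel matrix in the feature space.

    Equivalent to K - 1N*K - K*1N + 1N*K*1N, where 1N is the N x N matrix
    whose entries are all 1/N.
    """
    n = len(K)
    row_means = [sum(row) / n for row in K]
    col_means = [sum(K[i][j] for i in range(n)) / n for j in range(n)]
    grand_mean = sum(row_means) / n
    return [[K[i][j] - row_means[i] - col_means[j] + grand_mean
             for j in range(n)]
            for i in range(n)]
```

For a linear kernel this reduces to computing inner products of mean-centered samples, which gives a quick way to check the routine on small inputs.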
(b) Kernel Function Selection. The core idea of KPCA is to map the input data into a kernel feature space using a nonlinear mapping and then perform linear PCA in that space. In general, this nonlinear mapping is realized by means of a kernel function. Deciding on the form of the kernel function therefore plays a crucial role in KPCA-based methods. Some of the most widely used kernel functions are the Gaussian kernel (11), the polynomial kernel (12), and the sigmoid kernel (13), which in their standard forms read

$$k_{\text{Gaussian}}(x, y) = \exp\left(-\frac{\|x - y\|^{2}}{2\sigma^{2}}\right), \tag{11}$$

$$k_{\text{Polynomial}}(x, y) = \left(\langle x, y\rangle + c\right)^{d}, \tag{12}$$

$$k_{\text{Sigmoid}}(x, y) = \tanh\left(a\langle x, y\rangle + b\right). \tag{13}$$
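The three kernel types named above can be sketched in a few lines. The parameterizations and default values below (`sigma`, `degree`, `offset`, `scale`) are common textbook choices and are assumptions, not the settings used in the experiments; `kernel_matrix` is a hypothetical helper that fills only the upper triangle and mirrors it, using the symmetry of the kernel matrix.

```python
import math


def dot(x, y):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, y))


def gaussian_kernel(x, y, sigma=1.0):
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-squared_distance / (2.0 * sigma ** 2))


def polynomial_kernel(x, y, degree=2, offset=1.0):
    return (dot(x, y) + offset) ** degree


def sigmoid_kernel(x, y, scale=1.0, offset=0.0):
    return math.tanh(scale * dot(x, y) + offset)


def kernel_matrix(samples, kernel):
    """Build the symmetric kernel matrix, evaluating each pair only once."""
    n = len(samples)
    K = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            K[i][j] = K[j][i] = kernel(samples[i], samples[j])
    return K
```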
Once the kernel function has been selected, the values of its free parameters must be determined. Although these kernel functions have been used successfully in many applications, they are not the optimal choice for all data sets. Instead of selecting a kernel function a priori from a finite set of candidates without explicitly considering the structure of the data, a better way is to estimate it from the data. With a data-dependent kernel function tailored to the data under consideration, the performance of KPCA or other kernel-based methods can be improved.
(c) Number of Principal Components. Another crucial step of the PCA-based and KPCA-based approaches is determining the number of principal components to keep. The new components lie in a new dimensional space. By employing only a finite set of eigenvectors in descending order of eigenvalues, the number of principal components is reduced. The cumulative contribution rate of the first $p$ components can be expressed as $(1/S)\sum_{k=1}^{p}\lambda_k$, where $\lambda_k$ is the $k$th largest eigenvalue and $S$ is the sum of all eigenvalues. Usually the components are chosen so that the contribution rate is over 95%, which is enough to characterize the original data.
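As a small illustration of the 95% rule, the sketch below returns the smallest number of components whose cumulative contribution rate reaches a threshold. The function name `components_for_threshold` is hypothetical, and the eigenvalues are assumed to be non-negative.

```python
def components_for_threshold(eigenvalues, threshold=0.95):
    """Return how many leading components reach the cumulative contribution rate."""
    ordered = sorted(eigenvalues, reverse=True)  # descending order of eigenvalues
    total = sum(ordered)
    cumulative = 0.0
    for count, value in enumerate(ordered, start=1):
        cumulative += value
        if cumulative / total >= threshold:
            return count
    return len(ordered)
```

For example, with eigenvalues 6, 3, 0.6, 0.3, and 0.1, the first two components explain 90% of the variance and the first three explain 96%, so three components would be kept.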

Support Vector Machine for Classification.
A nonlinear approach, that is, a support vector machine (SVM) with a nonlinear kernel, was investigated for walking pattern recognition. The SVM is founded on statistical learning theory [33]. It is a powerful classification algorithm with state-of-the-art performance. Recently, SVM has been successfully applied in many fields, such as pattern identification, regression analysis, and function approximation [34,35]. The results give evidence that this technique is not only satisfactory from the theoretical perspective but can also lead to high accuracy in practical applications. Additional reasons for choosing the SVM with a nonlinear kernel were as follows: (1) a nonlinear classifier might accurately classify the data when the linear boundaries among classes are difficult to define and (2) the SVM is more computationally efficient than other nonlinear classifiers, such as the ANN.
The basic SVM takes a set of input data and predicts, for each given input, which of two possible classes forms the output, making it a nonprobabilistic binary linear classifier. Given a set of training examples, each marked as belonging to one of two categories, an SVM training algorithm builds a model that assigns new examples to one category or the other.
By choosing a nonlinear mapping, the SVM constructs an optimal separating hyperplane in the higher-dimensional space. To start with, we train a classifier $f(x)$ with a learning algorithm from a set of samples $\{(x_i, y_i),\ i = 1, 2, \ldots, N\}$, where $y_i$ is the given label for each training example $x_i$. We take a linear classification function:

$$f(x) = w^{T}x + b, \tag{14}$$

where $w$ and $b$ are defined according to the standard soft-margin formulation

$$\min_{w,\,b,\,\xi}\ \frac{1}{2}\|w\|^{2} + C\sum_{i=1}^{N}\xi_i \quad \text{subject to} \quad y_i\left(w^{T}x_i + b\right) \ge 1 - \xi_i,\ \ \xi_i \ge 0. \tag{15}$$

The parameter $C$ controls the trade-off between the model complexity and the empirical risk [15]. In the nonlinear case, a kernel function is used to map the input vector $x$ to a higher-dimensional space through a nonlinear mapping $\Phi(x)$, so that the inner product does not need to be evaluated explicitly in the feature space. With the kernel concept $k(x_i, x_j) = \Phi^{T}(x_i)\Phi(x_j)$, the resulting SVM model can be written as

$$f(x) = \sum_{i=1}^{N_s}\alpha_i y_i k(x_i, x) + b, \tag{16}$$

where $x_i$, $i = 1, 2, \ldots, N_s$, are support vectors, which are determined during the training process. The formulation above results in an optimization problem with convex constraints, which can be solved by the interior point method. For the walking pattern recognition problem, we selected the foot pressure signals of the 14 points in Figure 1 as input vectors. Thus, the SVM classifier was able to identify four kinds of movements: walking on level ground, standing still, stair descending, and stair ascending.
A multiclass SVM with a "one-against-one" structure was used [10]. The applied kernel function was the radial basis function (RBF), which was defined as

$$k(x_i, x_j) = \exp\left(-\gamma\|x_i - x_j\|^{2}\right), \tag{17}$$

where $x_i$ and $x_j$ are the feature vectors of the $i$th and $j$th classes, respectively, and $\gamma = 1/d$, where $d$ is the dimension of the feature $x$. During the training procedure, all observed feature vectors $x$ were nonlinearly mapped into a higher-dimensional feature space based on the kernel function in (17). For the $m$-class classification problem, $m(m - 1)/2$ binary classifiers were constructed.

Experimental Protocol.
During the experiments, the participant wore a pair of shoes with pressure sensors installed in the soles. Our tests were carried out with a 24-year-old female wearer, 1.66 m tall. The wearer walked at a comfortable speed. With power supplied by a battery, foot pressure signals were gathered by 14 FSR402 sensors every 40 ms and transmitted wirelessly through the data processing board to the computer. The waveforms of each sensor on both feet were displayed on the desktop simultaneously for monitoring. The walking pattern recognition procedure was implemented in MATLAB 2010a, running on a PC with 2 GB of memory and a 2 GHz CPU.
The raw data on foot pressure distributions for each moving pattern were acquired with the developed foot pressure sensing shoes (Figure 4).Variation of foot pressure for each kind of movement was displayed in Figure 5.

Experimental Result.
There was a total of 4 classes, one for each walking pattern, and therefore a total of 6 binary classifiers. To build each binary classifier, a hyperplane was found by maximizing the boundary margin between two classes and minimizing the training classification errors. Six hyperplanes, one between each pair of classes, were computed after training. During the testing procedure, each observed feature vector $x$ was nonlinearly transformed and sent to the individual binary classifiers built in the training procedure; therefore, a total of 6 classification decisions were made. A voting strategy was used to make the final decision. The class (mode) with the most votes out of the 6 decisions was considered to be the locomotion mode. If more than one class had the same number of votes, the class (mode) with the smaller class index was chosen as the final decision.
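The voting strategy can be sketched as follows. The function name `one_against_one_predict` is hypothetical, and the pairwise classifiers are replaced by fixed stubs purely for illustration; in practice each entry would be a trained binary SVM.

```python
from itertools import combinations


def one_against_one_predict(binary_classifiers, x, classes):
    """Combine pairwise decisions by majority vote.

    binary_classifiers maps each class pair (a, b), with a < b, to a callable
    that returns either a or b for the feature vector x.
    """
    votes = {label: 0 for label in classes}
    for pair in combinations(sorted(classes), 2):
        votes[binary_classifiers[pair](x)] += 1
    most_votes = max(votes.values())
    # Tie-break: the class with the smaller index wins.
    return min(label for label, count in votes.items() if count == most_votes)


# Stub classifiers standing in for trained SVMs (illustrative only).
fixed_winners = {(0, 1): 1, (0, 2): 2, (0, 3): 0, (1, 2): 1, (1, 3): 3, (2, 3): 2}
stubs = {pair: (lambda x, w=winner: w) for pair, winner in fixed_winners.items()}
decision = one_against_one_predict(stubs, x=None, classes=[0, 1, 2, 3])
```

In this stub example classes 1 and 2 each receive two of the six votes, so the tie-break selects class 1.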
Five time-domain parameters were selected as features: average value, standard deviation, maximum value, minimum value, and difference deviation. Features representing the pressure signal were stored in a matrix and sent to the classifier. Cross-validation was used to train and test the SVM-based classifier. We collected 630 sets of data samples for each moving pattern; the first 420 were used for classifier training and the remaining 210 for testing. The training data and training labels form the whole training set.
The classification accuracy of SVM, SVM-PCA, and SVM-KPCA was compared in the experiments.
(1) SVM. For the training part, we obtained an optimal $C$ of 724 and $\gamma$ of 8 by cross-validation. These two parameters were optimized to obtain a high recognition rate on the current training samples. The RBF (radial basis function) kernel was employed. The cross-validation result (contour map and 3D view) of parameter selection is shown in Figure 6. The accuracy is about 91.96%.
The classification model was then applied to predict the output category of the testing samples. The actual and testing patterns of the testing samples are shown in Figure 7. According to Figure 7, we summarize the classification performance achieved by this SVM classifier. The average accuracy with all seven sensors is 92.9% for all four kinds of movements, and the recognition accuracy for each moving pattern is given in Table 2.
(2) SVM-PCA. In contrast to the previous case, the input matrix was first passed to the PCA algorithm for dimensionality reduction. We kept the principal components that accounted for over 95% of the information in the original data. A small number of new input eigenvectors provided sufficient information for foot pressure labeling and walking pattern recognition. With $C = 1024$ and $\gamma = 32$, the accuracy rate reaches 88.7% (Figure 8). The outcome of inputting the new eigenvectors into the classifier is shown in Table 3. When an SVM classifier is used, PCA causes a decline in the recognition rate of moving patterns, but it gives a higher recognition speed: the classification time with PCA was only 0.21 seconds, 0.25 seconds less than the former classification. The accuracy is about 92.26%, which is a little higher than the result of SVM.
(3) SVM-KPCA. Here, too, the input matrix was first passed to the KPCA algorithm for dimensionality reduction. Firstly, the effect of different kernel functions on dimensionality reduction was analyzed. In this paper, 3 kinds of typical kernel functions, the Gaussian kernel, the polynomial kernel, and the sigmoid kernel ((11)-(13)), were each used to reduce the dimensionality. KPCA1, KPCA2, and KPCA3 denote the Gaussian kernel, the polynomial kernel, and the sigmoid kernel, respectively.
The kernel parameters of the different kernel functions were selected after several simulation experiments. We again kept the principal components that accounted for over 95% of the information in the original data. A small number of new input eigenvectors were fed to the SVM classifier.
As an example, when the polynomial kernel function was used, the accuracy rate reaches as high as 92.5%. The parameters of the SVM are $C = 8$ and $\gamma = 32$. The error between the predicted data and the original data is shown in Figure 10, and the outcome of inputting the new eigenvectors into the classifier is shown in Table 4. The classification time after KPCA2 processing was only 0.22 seconds. The values 0, 1, 2, and 3 label standing, walking, upstairs, and downstairs, respectively. The ordinate of Figure 10 is the difference between the predicted and original values. The difference is zero if the motion pattern was recognized correctly. For example, for the motion pattern of walking, a difference of −1 indicates that the walking pattern was recognized as standing. According to the experimental results, the recognition error for downstairs was higher than for the other kinds of motion.
Based on the experimental results, for the same testing data, the recognition rates of the algorithms based on KPCA with different kernel functions were clearly higher than that of the algorithm based on PCA. The main drawback of KPCA compared with PCA is that the KPCA algorithm requires more runtime during the training stage because it maps the data to a higher-dimensional space in order to perform PCA. However, the additional time can be regarded as a one-time cost and has little impact on the recognition speed in online classification.
The average recognition accuracy and running time of the different processing algorithms are compared in Table 5. The experimental results show that, when an SVM classifier is used, the classification speed is much higher with a dimensionality reduction algorithm (PCA or KPCA). PCA causes a decline in the recognition rate of walking patterns, whereas KPCA causes no obvious decline.

Conclusions
This study demonstrated that the foot pressure sensing shoes designed in this paper were able to recognize four walking patterns accurately. The experimental results showed that the classification method fusing SVM and KPCA was superior to the methods that used only SVM or fused SVM and PCA. The average recognition accuracy of the KPCA-SVM classifier was 91.1%, and its classification speed was twice that of the SVM classifier. These promising results may help the future controller design of exoskeleton robots. However, the designed system still has some drawbacks. The processing speed of the microcontroller and the ZigBee communication used in data acquisition is limited, which results in a lower sampling rate. A higher-speed data acquisition device will be adopted in the future to obtain better recognition accuracy. Furthermore, the walking patterns of stair ascending and stair descending were recognized in this study, but in the real control of an exoskeleton robot, predicting these two walking patterns in advance is more important for the controller to adopt the corresponding stability control strategy. Therefore, walking pattern prediction algorithms will be studied in our future work.

Figure 2: The process of human motion pattern recognition.

Figure 4: Ascending and descending stairs with developed shoes.

Evaluation and Validation.
The performances of the dimensionality reduction algorithms based on both PCA and KPCA were evaluated with experimental data. The data set is partitioned into 2 subsets. One subset is used to train the classifier model, whereas the other subset is used to test the classifier model. The classifier was evaluated by the classification accuracy (CA), which was defined as the percentage of correctly classified observations out of the total number of observations within that class. The value of CA can be calculated using the following equation:

$$\mathrm{CA} = \frac{\text{Number of correctly classified testing data}}{\text{Total number of applied testing data}}. \tag{18}$$
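The classification accuracy can be computed per class and overall as sketched below. The function names are hypothetical, and the labels 0 to 3 follow the coding used in the experiments (standing, walking, upstairs, downstairs).

```python
def classification_accuracy(true_labels, predicted_labels):
    """Fraction of testing samples whose predicted label matches the true label."""
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


def per_class_accuracy(true_labels, predicted_labels):
    """Classification accuracy computed separately for each true class."""
    totals, hits = {}, {}
    for t, p in zip(true_labels, predicted_labels):
        totals[t] = totals.get(t, 0) + 1
        hits[t] = hits.get(t, 0) + (1 if t == p else 0)
    return {label: hits[label] / totals[label] for label in totals}
```

Multiplying the returned fractions by 100 gives the percentages reported in the tables.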

Figure 6: The cross-validation results of parameter selection of SVM.

Figure 7: Actual and testing patterns of testing samples with SVM.

Table 1: Names of sensing positions.

Table 2: Accuracy of different movements with SVM.

Table 3: Accuracy of different movements with PCA.

Table 4: Accuracy of different movements with KPCA2.

Table 5: Comparison of recognition rate with different processing methods.