Human Activity Recognition Based on the Hierarchical Feature Selection and Classification Framework

Human activity recognition via triaxial accelerometers can provide valuable information for evaluating functional abilities. In this paper, we present an accelerometer sensor-based approach for human activity recognition. Our proposed recognition method used a hierarchical scheme, where the recognition of ten activity classes was divided into five distinct classification problems. Every classifier used the Least Squares Support Vector Machine (LS-SVM) and Naive Bayes (NB) algorithms to distinguish different activity classes. The activity class was recognized based on the mean, variance, and entropy of the magnitude and angle of the triaxial accelerometer signal. Our proposed activity recognition method recognized ten activities with an average accuracy of 95.6% using only a single triaxial accelerometer.


Introduction
Recently, activity recognition has become an emerging field of research and one of the challenges for pervasive computing. A typical application for activity recognition is in health care. Activity recognition is also an important research issue in building a pervasive and smart environment to provide personalized support.
Computer vision-based techniques and body-fixed accelerometers are the main methodologies used for activity recognition. Computer vision-based techniques must be conducted in a well-controlled environment and are subject to the limitations of that environment; they may fail significantly in an environment with clutter and variable lighting [1][2][3]. Body-fixed accelerometers offer a practical and relatively low-cost method to measure human motion.
The existing literature contains many studies on activity recognition that use accelerometers. However, these studies face three primary challenges.
(1) The large muscles of the body are controlled for walking, running, sitting, and other activities. The glutes are the primary muscles that drive lower-body movement because of their natural strength and leverage advantage on the legs. Lower-body movement includes activities such as running, jumping, and walking. Sleeping, sitting, standing, walking, running, and jumping must be recognized as typical physical activities. The activity recognition algorithm in Khan et al. [4] did not consider jumping. Running and jumping were excluded from the experiments in the research of Trabelsi et al. [5], Tang and Sazonov [6], Lee et al. [7], and Deng et al. [8]. Gupta and Dallas [9] did not report how to recognize standing and sleeping, and Tao et al. [10] did not describe tests for recognizing sitting and sleeping. Alshurafa et al. [11] studied only walking and running recognition. These studies therefore cover the typical physical activities only incompletely [12].
(2) Some studies [6,9,13,14] required the combination of multiple sensors to increase recognition performance. However, a user is less likely to wear a more complex system at all times, and people may not feel comfortable wearing multiple sensors. Moreover, multisensor systems do not have a large advantage in recognition accuracy over a single-sensor system if the latter uses a higher sampling rate, suitable features, a more sophisticated classifier, and the sensor position that performs best for recognizing activities. A single sensor mounted at the right position can also obtain good recognition performance. For typical physical activities, multiple sensors do not significantly improve recognition performance [15][16][17].
(3) A series of studies [18][19][20] address the recognition of so-called ADL (activities of daily living), which is not physical-activity recognition. "Activities of daily living" is a term used in healthcare to refer to daily self-care activities, such as cooking and hair drying, within an individual's place of residence or in outdoor environments. Physical activity includes any body movement that works the muscles and requires more energy than resting, such as running or walking [21][22][23]. Physical-activity recognition is discussed in this paper.
(1) SVM and ANN have been broadly used in human activity recognition, although they do not include a set of rules understandable by humans [24]. As two different algorithms, SVM and ANN share the same concept of using the linear learning model for pattern recognition. The difference is mainly in how nonlinear data are classified. SVM models tend to have better prediction performance than ANN models, and SVMs have been demonstrated to have superior classification accuracies to neural classifiers in many experiments. The generalization performance of neural classifiers depends on the structure size, and the selection of an appropriate structure relies on cross validation [25]. The performance of SVMs depends on the selection of kernel function type and parameters, but this dependence is weaker [26].
(2) kNN does not perform well when the size of the dataset increases, and it is suitable for small datasets. SVM is a complicated classifier; here, we implement the linear kernel function. Accuracy and other performance criteria do not depend significantly on the dataset size; among all factors, they depend most on the number of training cycles [27].
(3) A continuous HMM approach to activities relies on sequential data, and its predictions depend on the length of the event sequence. An HMM is used to model the sequential information in multiaspect target signatures. The parameter-learning task in HMMs is to determine the best set of state transition and emission probabilities given an output sequence or a set of such sequences. The task is usually to derive the maximum likelihood estimate of the parameters of the HMM for the set of output sequences. Typical physical activities are nonsequential, and it is not easy to use an HMM to recognize a single physical activity [28].
The traditional SVM [29] is formulated for binary nonlinear classification problems. How to effectively extend the SVM for multiclassification remains an active research topic. The Least Squares Support Vector Machine (LS-SVM) is an advanced version of the standard SVM: it defines a different cost function from the classical SVM and replaces the inequality constraints with equality constraints. There have been relatively few recent studies that use LS-SVM to recognize activities using a triaxial accelerometer. Nasiri et al. [30] presented the Energy-Based Least Square Twin Support Vector Machine (ELS-TSVM) algorithm, an extended LS-SVM classifier that performs classification using two nonparallel hyperplanes instead of the single hyperplane used in the conventional SVM. ELS-TSVM was used to recognize activities using computer vision instead of a triaxial accelerometer. Altun et al. [31] compared the performances of the least squares method (LSM) and the SVM but did not include the LS-SVM. The LS-SVM for multiclassification is decomposed into multiple binary classification tasks. It reduces the computational complexity by using a small number of classifiers and effectively eliminates the unclassifiable regions that could otherwise affect its classification performance [32][33][34].
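The difference between the two cost functions can be written out explicitly. The block below is the standard textbook formulation of the soft-margin SVM and of the LS-SVM classifier, given here only as an illustration; the symbols $C$, $\gamma$, $\xi_i$, $e_i$, $\varphi$, and $K$ are notation introduced for this sketch and do not appear in the original text.

```latex
% Classical soft-margin SVM: slack variables with inequality constraints
\min_{w,b,\xi}\ \tfrac{1}{2} w^{\top} w + C \sum_{i=1}^{n} \xi_i
\quad \text{s.t.}\quad y_i\left(w^{\top}\varphi(x_i) + b\right) \ge 1 - \xi_i,\quad \xi_i \ge 0

% LS-SVM: squared error terms with equality constraints
\min_{w,b,e}\ \tfrac{1}{2} w^{\top} w + \tfrac{\gamma}{2} \sum_{i=1}^{n} e_i^{2}
\quad \text{s.t.}\quad y_i\left(w^{\top}\varphi(x_i) + b\right) = 1 - e_i

% The LS-SVM optimality conditions reduce to one linear system in (b, \alpha)
\begin{bmatrix} 0 & y^{\top} \\ y & \Omega + \gamma^{-1} I \end{bmatrix}
\begin{bmatrix} b \\ \alpha \end{bmatrix}
=
\begin{bmatrix} 0 \\ \mathbf{1} \end{bmatrix},
\qquad \Omega_{ij} = y_i\, y_j\, K(x_i, x_j)
```

Because the constraints are equalities, training amounts to solving a linear system rather than a quadratic program, which is consistent with the low running time reported for the LS-SVM later in the paper.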
In this paper, we aimed to overcome the limitations of existing physical-activity recognition systems and to develop a new method that could recognize a set of typical physical activities using only a single triaxial accelerometer. This method consisted of three parts: six features for activity recognition, the hierarchical recognition scheme, and the activity estimator based on the LS-SVM and NB algorithms. This method could recognize ten physical activities with a high recognition rate.
The remainder of the paper is organized as follows. Section 2 describes the experimental dataset and the hierarchical classification framework. Section 3 covers feature extraction, which improves the classification accuracy by using feature data rather than raw sensor data. Section 4 focuses on an activity estimator for multiclassification to estimate the human activity from the feature data. The experimental results and conclusion are presented in Sections 5 and 6, respectively.

Hierarchical Classification Framework
2.1. Activities Dataset. The dataset used in this work was the University of Southern California Human Activity Dataset (USC-HAD). The USC-HAD was specifically designed to include the most basic and common human activities in daily life from a large and diverse group of human subjects. The activities in the dataset were applicable to many scenarios. The activity data were captured using a high-performance inertial sensing device, the MotionNode [35]. MotionNode integrates a 3-axis accelerometer, a 3-axis gyroscope, and a 3-axis magnetometer, and the measurement ranges for each axis of the accelerometer and gyroscope are ±6 g and ±500 dps, respectively. MotionNode was firmly attached onto the participant's right front hip. The sampling rates of this dataset for both accelerometer and gyroscope were set to 100 Hz. The dataset included 10 activities: walking (forward, left, and right), walking (upstairs, downstairs), jumping, running, standing, sitting, and sleeping [36][37][38].
The main goal of this paper was to identify ten activities, which were divided into four groups: 2D walking (walking forward, left, and right), 3D walking (walking upstairs, downstairs), plane motion (jumping, running), and static activities (standing, sitting, and sleeping). Recognition relied on a single triaxial accelerometer. The activities are listed in Table 1.

2.2. Hierarchical Classification Framework.
To achieve higher scalability than the single-layer framework, a multilayer classification framework was presented. In the first layer, because the walking-related activities (walking forward, walking left, walking right, walking upstairs, and walking downstairs), jumping, running, and static activities could be differentiated from one another, we classified the activities into two subsets (walking and all static activities) and two activities (jumping and running) based on feature selection. In the second layer, the walking-related activities subset was divided into plane motion and 3D motion, and the static activity subset was classified into standing, sitting, and sleeping. In the third layer, all detailed activities of 2D and 3D walking were recognized [39,40]. Figure 1 illustrates the structure of the hierarchical classification framework. The yellow boxes represent the activity sets, the green boxes represent the ten types of activities to recognize, and the red boxes represent the classifiers. In this way, the problem of recognizing ten activity classes was broken down into several distinct classification problems. A preliminary investigation of this selection is reported in Table 2. The four-class classifier was the best selection in this hierarchical classification framework because of the small number of classifiers and the high average accuracy rate of each classifier. The four-class classifier was used in this paper.
In the hierarchical classification framework of the four-class classifier, classifier 1, at the top layer, distinguishes walking-related activities, jumping, running, and static activities. Walking-related activities include walking forward, walking left, walking right, walking upstairs, and walking downstairs. Static activities include standing, sitting, and sleeping [37]. Classifier 2, at the second layer, distinguishes plane motions and 3D motions. Classifier 3 recognizes activities from plane motion, and classifier 4 distinguishes walking upstairs and downstairs from 3D motions. Finally, classifier 5 focuses on recognizing different static activities.
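The routing through the five classifiers can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function `recognize`, the dictionary of classifier callables, and the intermediate label strings ("walking", "static", "plane", "3d") are all hypothetical names chosen for this example.

```python
from typing import Callable, Dict, Sequence

Features = Sequence[float]
Classifier = Callable[[Features], str]


def recognize(features: Features, clf: Dict[int, Classifier]) -> str:
    """Route one feature vector through the five-classifier hierarchy."""
    top = clf[1](features)          # walking / jumping / running / static
    if top in ("jumping", "running"):
        return top                  # recognized directly at the top layer
    if top == "static":
        return clf[5](features)     # standing / sitting / sleeping
    # Walking-related: classifier 2 separates plane motion from 3D motion.
    if clf[2](features) == "plane":
        return clf[3](features)     # walking forward / left / right
    return clf[4](features)         # walking upstairs / downstairs


# Stub classifiers (hypothetical) that always return a fixed label.
stubs: Dict[int, Classifier] = {
    1: lambda f: "walking",
    2: lambda f: "3d",
    3: lambda f: "walking forward",
    4: lambda f: "walking upstairs",
    5: lambda f: "sitting",
}
print(recognize([0.0] * 6, stubs))  # walking upstairs
```

With this layout, any single input is evaluated by at most three of the five classifiers, one per layer.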

Feature Design and Selection
Recent related work performed feature selection with a filter-based approach using Relief-F and a wrapper-based approach using a variant of sequential forward floating search. Because different features were on different scales, all features were normalized to obtain the best results for the kNN or Naive Bayes classifiers, which were used for error estimation, and to ensure equal weight for all potential features [1-6, 8-10, 13, 18, 24, 29].
In our approach, according to the elementary mechanics of walking, running, jumping, and sleeping, we used the means and variances of magnitudes and angles as the activity features, where the magnitudes and angles were derived from the triaxial acceleration vector. The reasons for this approach are as follows. First, according to [41][42][43], the muscles produce different forces when people walk, run, jump, and sleep. Normally, the forces increase in the order of sleeping, walking, running, and jumping. Based on Newton's second law, the resultant accelerations of these activities also increase in that order. Second, as in [44], a model of persistent 2D random walks can be represented by drawing turning angles. Third, Shannon entropy in the time domain can measure the uncertainty of the acceleration signal and describe the information-related properties for an accurate representation of a given acceleration signal. Detailed features are described below.
The triaxial acceleration vector $\vec{a}(i)$ is
$$\vec{a}(i) = \left(a_x(i),\ a_y(i),\ a_z(i)\right),$$
where $a_x(i)$, $a_y(i)$, and $a_z(i)$ represent the $i$th acceleration sample of the $x$, $y$, and $z$ axes. Its magnitude is
$$\mathrm{mag}(i) = \left\|\vec{a}(i)\right\| = \sqrt{a_x(i)^2 + a_y(i)^2 + a_z(i)^2}.$$
This feature is independent of the orientation of the sensing device and measures the instantaneous intensity of human movements at index $i$. We computed the mean, variance, and entropy of the magnitude and of the angle over the window and used them as six features: $M_{\text{mag}}$, $V_{\text{mag}}$, $E_{\text{mag}}$, $M_{\text{ang}}$, $V_{\text{ang}}$, and $E_{\text{ang}}$, where $w$ is the window length and $\theta(i)$ is the angle between vectors $\vec{a}(i-1)$ and $\vec{a}(i)$:
$$\cos\theta(i) = \frac{\vec{a}(i-1)\cdot\vec{a}(i)}{\left\|\vec{a}(i-1)\right\|\,\left\|\vec{a}(i)\right\|}.$$
Let $k = 1, 2, \ldots, N/w$ index the windows, where $N$ is the number of samples; then, for example,
$$M_{\text{mag}}(k) = \frac{1}{w}\sum_{i=(k-1)w+1}^{kw} \mathrm{mag}(i),$$
and the variance and entropy of the magnitude, as well as the mean, variance, and entropy of the angle, are computed analogously over the same window.
To explore the performance and correlation among these six features, a series of scatter plots in a 2D feature space is shown in Figure 2. The horizontal and vertical axes represent two different features. The points in different colors represent different activities. Figure 2(a) shows the relationship between two magnitude features, in which the running, jumping, walking, and static activities are clustered. In Figure 2(b), the straight line between 2D walking (forward, left, and right) and 3D walking (upstairs and downstairs) implies that an angle feature can separate the two groups. Figure 2(c) illustrates that two magnitude features successfully partition the triaxial acceleration data samples from walking forward, walking left, and walking right into three isolated clusters, where each cluster contains data samples roughly from one single activity class. Figure 2(d) demonstrates the discrimination power of two angle features to differentiate walking upstairs and walking downstairs. Figure 2(e) shows that the triaxial acceleration signal can be classified into standing, sitting, and sleeping based on two magnitude features.
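The six windowed features can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the authors' code: non-overlapping windows, population variance, and a histogram-based Shannon entropy with a hypothetical `bins` parameter are all choices made for this example, and every function name is hypothetical.

```python
import math
from collections import Counter
from typing import Dict, List, Sequence, Tuple

Sample = Tuple[float, float, float]  # (a_x, a_y, a_z) at one time index


def magnitude(a: Sample) -> float:
    """Euclidean norm of one acceleration sample."""
    return math.sqrt(a[0] ** 2 + a[1] ** 2 + a[2] ** 2)


def angle(prev: Sample, cur: Sample) -> float:
    """Angle in radians between two consecutive acceleration vectors."""
    denom = magnitude(prev) * magnitude(cur)
    if denom == 0.0:
        return 0.0  # assumption: a zero vector adds no direction change
    cos_theta = sum(p * c for p, c in zip(prev, cur)) / denom
    return math.acos(max(-1.0, min(1.0, cos_theta)))  # clamp rounding error


def mean(xs: Sequence[float]) -> float:
    return sum(xs) / len(xs)


def variance(xs: Sequence[float]) -> float:
    """Population variance (divides by len(xs)); an assumed convention."""
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)


def entropy(xs: Sequence[float], bins: int = 10) -> float:
    """Shannon entropy in bits of a histogram of xs (assumed estimator)."""
    lo, hi = min(xs), max(xs)
    if hi == lo:
        return 0.0  # a constant signal carries no uncertainty
    width = (hi - lo) / bins
    counts = Counter(min(int((x - lo) / width), bins - 1) for x in xs)
    n = len(xs)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def window_features(window: Sequence[Sample]) -> Dict[str, float]:
    """Six features for one window containing at least two samples."""
    mags = [magnitude(a) for a in window]
    angs = [angle(window[i - 1], window[i]) for i in range(1, len(window))]
    return {
        "M_mag": mean(mags), "V_mag": variance(mags), "E_mag": entropy(mags),
        "M_ang": mean(angs), "V_ang": variance(angs), "E_ang": entropy(angs),
    }


def extract(samples: Sequence[Sample], w: int) -> List[Dict[str, float]]:
    """Split the signal into non-overlapping windows of length w."""
    return [window_features(samples[s:s + w])
            for s in range(0, len(samples) - w + 1, w)]
```

Calling `extract(samples, w)` on a recording of `N` samples returns one feature dictionary per complete window, which matches the per-window feature sequences described above.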
In this study, we used $M_{\text{mag}}$, $V_{\text{mag}}$, $E_{\text{mag}}$, $M_{\text{ang}}$, $V_{\text{ang}}$, and $E_{\text{ang}}$ as the best features for the classifiers in each layer [45].

Activity Estimation for Multiclassification
We present an activity estimator for multiclassification to estimate the human activity from the feature data. Each activity estimator for the multiclassification included one LS-SVM classifier and a maximum Act Label frequency estimator (Figure 3). We used the LS-SVM [34] method to classify the feature data. After loading the training data into Matlab, we built an activity-recognizing model from the data. After the parameters of the model were calculated, we estimated the activity by inputting some test feature data [46]. The function trainlssvm() was used to train the support values of an LS-SVM for classification, and the function simlssvm() was used to evaluate the LS-SVM for some test feature data.
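To make the train-then-evaluate workflow concrete, the following is a minimal binary LS-SVM sketch in plain Python. It is not the Matlab toolbox used in the paper: `train_lssvm` and `predict_lssvm` are hypothetical names, the RBF kernel and the values of `gamma` and `sigma2` are assumptions, and the toy data are invented. Training solves the standard LS-SVM linear system for the bias and the support values.

```python
import math
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def rbf_kernel(u: Vector, v: Vector, sigma2: float = 1.0) -> float:
    """RBF kernel exp(-||u - v||^2 / sigma2) (assumed kernel choice)."""
    d2 = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-d2 / sigma2)


def solve(A: List[List[float]], rhs: List[float]) -> List[float]:
    """Solve A x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    M = [list(row) + [r] for row, r in zip(A, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x


def train_lssvm(X: Sequence[Vector], y: Sequence[int], gamma: float = 10.0,
                sigma2: float = 1.0) -> Tuple[float, List[float]]:
    """Return (bias, support values) for labels y in {-1, +1}."""
    n = len(X)
    A = [[0.0] + [float(t) for t in y]]
    for i in range(n):
        row = [float(y[i])]
        for j in range(n):
            omega = y[i] * y[j] * rbf_kernel(X[i], X[j], sigma2)
            row.append(omega + (1.0 / gamma if i == j else 0.0))
        A.append(row)
    sol = solve(A, [0.0] + [1.0] * n)
    return sol[0], sol[1:]


def predict_lssvm(x: Vector, X: Sequence[Vector], y: Sequence[int], b: float,
                  alpha: Sequence[float], sigma2: float = 1.0) -> int:
    """Sign of the decision function sum_i alpha_i y_i K(x, x_i) + b."""
    s = b + sum(a * t * rbf_kernel(x, xi, sigma2)
                for a, t, xi in zip(alpha, y, X))
    return 1 if s >= 0 else -1


# Toy, invented data: two well-separated clusters.
X_train = [(0.0, 0.0), (0.0, 1.0), (3.0, 3.0), (3.0, 4.0)]
y_train = [-1, -1, 1, 1]
bias, alphas = train_lssvm(X_train, y_train)
print(predict_lssvm((0.2, 0.5), X_train, y_train, bias, alphas))
print(predict_lssvm((3.0, 3.5), X_train, y_train, bias, alphas))
```

A multiclass estimator would combine several such binary classifiers; the decomposition scheme is left out of this sketch.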
Because $M_{\text{mag}}$, $V_{\text{mag}}$, $E_{\text{mag}}$, $M_{\text{ang}}$, $V_{\text{ang}}$, and $E_{\text{ang}}$ each have $N/w$ elements (one per window, for $N$ samples and window length $w$), the LS-SVM multiclassifier outputs an activity set that contains $N/w$ Act Labels. The activity set may contain different Act Labels, so we must estimate the most likely Act Label in this set. We used the Naive Bayes algorithm to compute the likelihood of every Act Label and took the Act Label with the maximum likelihood as the recognized human activity.
Figure 4 shows the working process of the activity estimator, which includes the training stage and the testing stage (online activity recognition). In the training stage, the labeled triaxial acceleration data were normalized, and the statistical features were extracted from those synthesized-acceleration data. Then, the multiclassification estimator was used to build the classification model. In the testing stage, unlabeled raw data from the triaxial accelerometer were processed with the same method that was used in the training stage. These synthesized data were classified using the multiclassification estimator, and the recognized result was obtained [47,48].
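The final selection step can be illustrated with a simplified sketch. It assumes that the likelihood of each Act Label is estimated by its relative frequency in the activity set, which reduces the choice to a maximum-frequency vote; the paper's exact Naive Bayes computation is not reproduced here, and `estimate_activity` is a hypothetical name.

```python
from collections import Counter
from typing import Dict, Sequence, Tuple


def estimate_activity(labels: Sequence[str]) -> Tuple[str, Dict[str, float]]:
    """Return the most frequent label and the relative frequency of each label."""
    if not labels:
        raise ValueError("the activity set must contain at least one label")
    counts = Counter(labels)
    total = len(labels)
    freqs = {label: c / total for label, c in counts.items()}
    # On ties, max() keeps the label that appeared first in the sequence.
    best = max(freqs, key=lambda label: freqs[label])
    return best, freqs


# Hypothetical per-window outputs of one classifier for one recording.
window_labels = ["running", "running", "jumping", "running"]
activity, likelihoods = estimate_activity(window_labels)
print(activity, likelihoods)
```

Aggregating over all windows in this way lets a few misclassified windows be outvoted by the majority.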

Experiment
The activity recognition dataset was the USC Human Activity Dataset. The dataset included ten activities and contained data collected from 14 subjects. To capture the day-to-day activity variations, each subject was asked to perform 5 trials for each activity on different days at various indoor and outdoor locations. Although the duration of each trial varied for different activities, it was sufficiently long to capture all information for each performed activity [37]. In this section, we estimated the performances of the five activity classifiers in this activity recognition scheme. Table 3 shows the results of the five activity recognition classifiers. Each of these activity classifiers achieved over 95% accuracy, which is acceptable [24]. The results of these folds are summarized in Table 4. The average recognition accuracy of 95.6% indicates that our proposed human activity recognition scheme can achieve high recognition rates for a specific subject. Because 2D walking and 3D walking are similar, the recognition accuracy of the five walking activities is comparatively low. We will attempt to obtain higher recognition accuracy using an adequate amount of training data in future research.
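For reference, per-activity and average accuracies can be derived from a confusion matrix as in the sketch below. The matrix values are invented for illustration and are not the results in Table 4; the convention that rows are true classes and columns are predicted classes is an assumption, and `accuracy_summary` is a hypothetical name.

```python
from typing import List, Sequence, Tuple


def accuracy_summary(
    confusion: Sequence[Sequence[int]],
) -> Tuple[List[float], float, float]:
    """Return (per-class accuracy, mean of per-class accuracy, overall accuracy)."""
    per_class: List[float] = []
    for i, row in enumerate(confusion):
        total = sum(row)
        per_class.append(row[i] / total if total else 0.0)
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    overall = correct / sum(sum(row) for row in confusion)
    class_mean = sum(per_class) / len(per_class)
    return per_class, class_mean, overall


# Invented two-class example: rows are true classes, columns are predictions.
example = [[9, 1],
           [2, 8]]
per_class, class_mean, overall = accuracy_summary(example)
print(per_class, class_mean, overall)
```

The two averages coincide only when every class has the same number of test samples, so it is worth stating which one a reported figure refers to.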
We compared the accuracy rate and running time of common multiclassification methods. All algorithms were run on a computer with an Intel Core i7-2670QM CPU (2.2 GHz), 8 GB of RAM, and Matlab 2013a. The LS-SVM performed notably well in the tests. The average running time for the hierarchical classification framework with the LS-SVM recognizing activities was 0.021 seconds, which was less than that of the ANN (Artificial Neural Network), DT (Decision Tree), and kNN (k-Nearest Neighbor) algorithms. We performed the ANN, DT, and kNN classifier tests with the built-in functions of Matlab. The LS-SVM method was also better than ANN, DT, and kNN in terms of the average recognition accuracy rate for the ten activities. Table 5 shows the results.

Conclusion and Future Work
This paper aims to provide an accurate and robust human activity recognition scheme. The scheme used triaxial acceleration data, a hierarchical recognition scheme, and activity classifiers based on the LS-SVM and the NB algorithm. The mean, variance, and entropy of the magnitude and angle of the triaxial acceleration data were used as the features of the activity classifiers. The scheme effectively recognized a typical set of daily physical activities with an average accuracy of 95.6%. It could distinguish walking (forward, left, right, upstairs, and downstairs), running, jumping, standing, sitting, and sleeping activities using only a single triaxial accelerometer. The experimental results of the hierarchical recognition scheme show significant potential in its ability to accurately differentiate activities using triaxial acceleration data. Although the scheme has so far been tested only on the USC-HAD dataset, its core does not depend on features specific to that dataset; therefore, it should be applicable to other activity datasets.
The novelty of the proposed human activity recognition scheme is the introduction of the LS-SVM method as the classifier algorithm. The LS-SVM is an advanced version of the standard SVM, and there are relatively few recent studies using LS-SVM to recognize activities with only one triaxial accelerometer. The human activity recognition scheme with LS-SVM classifiers simplifies the construction of the hierarchical classification framework and has a lower running time than other common multiclassification algorithms. Accuracy is the basic element that must be considered when any activity recognition system is implemented, and this recognition scheme has a high success rate: it can recognize ten different types of activities with an average accuracy of 95.6%.
The next stage of our research has two parts. First, the algorithms will be improved so that the user does not have to worry about placing the sensors at the correct positions for the activities to be detected correctly. Second, an unsupervised approach for automatic activity recognition will be considered. An unsupervised learning framework for human activity recognition would automatically cluster a large amount of unlabeled acceleration data into discrete groups of activity, which implies that human activity recognition can be performed naturally.

Table 1: Classified states and activities recognized in this study.

Table 2: A preliminary investigation of classifier selection.
Least Squares Support Vector Machines classifier: $M_{\text{mag}}$, $V_{\text{mag}}$, $E_{\text{mag}}$, $M_{\text{ang}}$, $V_{\text{ang}}$, $E_{\text{ang}}$.

Table 4: Confusion matrix for average recognition accuracy for all activities.

Table 5: Accuracy rates and running times of the classification methods.