Transportation Mode Detection Based on Permutation Entropy and Extreme Learning Machine

With the increasing prevalence of GPS devices and mobile phones, transportation mode detection based on GPS data has become a hot topic in GPS trajectory data analysis. Transportation modes such as walking, driving, bus, and taxi are an important characteristic of the mobile user. Longitude, latitude, speed, acceleration, and direction are usually used as features in transportation mode detection. In this paper, first, we explore the possibility of using Permutation Entropy (PE) of speed, a measure of the complexity and uncertainty of a GPS trajectory segment, as a feature for transportation mode detection. Second, we employ Extreme Learning Machine (ELM) to distinguish GPS trajectory segments of different transportation modes. Finally, to evaluate the performance of the proposed method, we conduct experiments on the GeoLife dataset. Experimental results show that we can get more than 50% accuracy when using only PE as a feature to characterize a trajectory sequence, so PE can indeed be used effectively to detect transportation mode from GPS trajectories. The proposed method has much better accuracy and faster running time than the methods based on the other features and the SVM classifier.


Introduction
With the increasing prevalence of positioning technologies, GPS mobile devices, smartphones, and so forth are equipped with multiple sensors [1]. This makes it possible to collect human movement data and to implement various location-aware services [2]. Humans travel by different transportation modes, for example, walking, bicycle, car, and train [3]. In ubiquitous and context-aware computing, understanding the mobility of a mobile user is an important research area. Knowledge of the transportation mode is critical for travel behavior research, transport planning, and traffic management [4,5]. The transportation modes of individuals reflect their past events and help us understand their life patterns [6]. Because the data collected from GPS do not contain the transportation mode, detecting the transportation mode from the GPS trajectory is necessary.
Transportation mode detection from GPS data has been studied in the literature. Different studies use different features (or combinations of features), such as speed, acceleration, maximum or median speed, and acceleration and length between GPS fixes. A simple approach is to measure the speed and acceleration of the GPS data and compare them with empirical thresholds [5,7]. However, for some transport modes, such as cycling and running, speed and acceleration alone are not enough. For example, in traffic jams, rain, and snow, the speed and acceleration under different transportation modes may be the same, so it is hard to differentiate them using only speed and acceleration thresholds. Zheng et al. identified a set of sophisticated features including heading change rate, velocity change rate, and stop rate [6,8]. Compared with simple velocity and acceleration, these features are more robust to traffic conditions and contain more information about users' motion. Stenneth et al. considered transportation network data, which consist of real-time locations of buses, rail lines, and bus stop spatial data [9]. This approach can achieve over 93.5% accuracy for inferring various transportation modes. However, transportation network data are not available in most cases. Reddy et al. combined GPS sensor data with accelerometer data to detect the modes of transportation [10]. They selected GPS speed, accelerometer variance, and accelerometer DFT as features. Stopher et al. considered the average speed and the maximum and minimum speed as the feature set [11]. Other features used in transportation mode detection are described in [12,13].
Permutation Entropy (PE), introduced by Bandt and Pompe [14], directly investigates the temporal information contained in a time series. PE is simple, robust, and has very low computational cost. PE has been applied in many areas [15], such as neural applications [16], electroencephalography (EEG) signal analysis [17], electrocardiography (ECG) [18], and stock market analysis [19]. As a new learning algorithm for single-hidden layer feedforward neural networks, Extreme Learning Machine (ELM) has attracted a lot of research interest [20-23]. ELM has shown good performance in classification applications due to its low computational cost, better generalization performance, and faster learning speed than traditional gradient-based learning algorithms [24,25].
PE has not been used for analyzing moving object data. Therefore, it is interesting to investigate PE in the mobility analysis of moving objects. In this paper, we propose to use PE as a feature for transportation mode detection. To reduce the training time without compromising accuracy, Extreme Learning Machine (ELM) is used as the classifier. Experiments are conducted on the GeoLife dataset to validate the feasibility of the proposed method and evaluate its effectiveness. We carry out a comprehensive performance evaluation of various feature measures as well as different supervised classifiers in transportation mode detection. The results show that the proposed scheme is capable of detecting transportation modes not only with high accuracy but also very quickly.
The remainder of this paper is organized as follows. In Section 2, we introduce the proposed transportation mode detection algorithm based on PE and ELM. In Section 3, we review the concept of PE. In Section 4, we provide a review of ELM. In Section 5, we present experiments and results to demonstrate the effectiveness of the algorithm. Section 6 contains conclusions.

Transportation Mode Detection with PE and ELM
In transportation mode detection from GPS data, different features, such as speed, average speed, acceleration, and more sophisticated features, are used. But the speed of different transportation modes is usually vulnerable to traffic conditions and weather. Intuitively, the average speed of driving can be as slow as walking in congestion. The change of speed along a GPS trajectory is an important indicator for describing trajectories. For example, the speed of a car changes over a wide range, whereas the speed of walking changes less. Permutation Entropy estimates the complexity of a time series through the comparison of neighboring values. It is conceptually simple and computationally very fast. We therefore explore whether Permutation Entropy can be used to detect transportation modes.
In this paper, we adopt Permutation Entropy as a measure of complexity due to its fast calculation, robustness, and invariance with respect to nonlinear monotonic transformations.
To the best of our knowledge, there is no related study in the literature on detecting the transportation mode of moving objects by PE. Figure 1 shows the proposed transportation mode detection method with PE and ELM.
Firstly, GPS trajectory data are collected from a GPS sensor. Secondly, the trajectory data are segmented into time sequences of the same length. For each trajectory segment, several features, such as speed and PE, are extracted. Finally, a classification model, ELM, is used to detect the transportation mode.
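The segmentation step above can be illustrated with a minimal sketch. The function name `split_fixed_length` and the choice to discard a trailing remainder shorter than the segment length are assumptions made for this example; the paper does not state how leftover points are handled.

```python
def split_fixed_length(points, segment_length):
    """Split a trajectory (a list of points) into consecutive,
    non-overlapping segments that all have the same length.

    Assumption of this sketch: a trailing remainder shorter than
    segment_length is discarded.
    """
    if segment_length <= 0:
        raise ValueError("segment_length must be positive")
    last_start = len(points) - segment_length
    return [points[i:i + segment_length]
            for i in range(0, last_start + 1, segment_length)]


# Usage: ten points split into segments of four points each.
segments = split_fixed_length(list(range(10)), 4)
```

Equal-length segments matter here because PE is later computed per segment and compared across segments.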

Permutation Entropy (PE)
Permutation Entropy is widely used to study irregularity and nonlinearity in time series. It is highly sensitive to temporal changes, so it is an effective method for detecting the dynamic changes of a complex system.
For a given time series $\{x(i)\}_{i=1}^{N}$, the calculation steps of PE are described as follows.
(1) Using the time delay embedding theorem to reconstruct the phase space, the data segment $X(i)$ is derived from the point $x(i)$ of the original time series:
$$X(i) = [x(i),\; x(i+\tau),\; \ldots,\; x(i+(m-1)\tau)]. \tag{1}$$
In (1), $m$ is the embedding dimension and $\tau$ is the delay time.
(2) Each component in $X(i)$ can be arranged in increasing order to achieve an ordinal pattern:
$$x(i+(j_1-1)\tau) \le x(i+(j_2-1)\tau) \le \cdots \le x(i+(j_m-1)\tau), \tag{2}$$
where $j_*$ is the index of the element in the new vector.
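The steps that follow (2) are not reproduced in this section. As a hedged illustration, the sketch below completes the calculation using the standard Bandt and Pompe formulation [14]: the relative frequency $p(\pi)$ of each ordinal pattern $\pi$ is counted over all embedded vectors, and PE is $-\sum_{\pi} p(\pi) \ln p(\pi)$. The function name `permutation_entropy`, the tie-breaking rule (equal values keep their original order), and the optional normalization by $\ln(m!)$ are assumptions of this sketch rather than details taken from the paper.

```python
import math
from collections import Counter


def permutation_entropy(series, m=3, tau=1, normalize=True):
    """Permutation entropy of a 1-D sequence.

    m   : embedding dimension
    tau : delay time
    Ties are broken by original position (an assumption of this sketch).
    """
    n_vectors = len(series) - (m - 1) * tau
    if n_vectors <= 0:
        raise ValueError("series is too short for the given m and tau")

    counts = Counter()
    for i in range(n_vectors):
        # Embedded vector X(i) = [x(i), x(i+tau), ..., x(i+(m-1)tau)]
        window = [series[i + j * tau] for j in range(m)]
        # Ordinal pattern: indices that sort the window in increasing order
        pattern = tuple(sorted(range(m), key=lambda j: window[j]))
        counts[pattern] += 1

    # Shannon entropy of the ordinal pattern distribution
    entropy = sum((c / n_vectors) * math.log(n_vectors / c)
                  for c in counts.values())
    if normalize:
        entropy /= math.log(math.factorial(m))  # scale to [0, 1]
    return entropy


# Usage: a strictly increasing sequence has a single ordinal pattern.
pe_monotone = permutation_entropy([1, 2, 3, 4, 5, 6], m=3)  # 0.0
```

A perfectly regular sequence yields a PE of zero, while a sequence in which all $m!$ patterns are equally likely yields the maximum value.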

Single-Hidden Layer Feedforward Neural Network (SLFN).
Feedforward neural networks have been extensively used in many fields. A single-hidden layer feedforward neural network (SLFN) with at most $N$ hidden nodes and with almost any nonlinear activation function can exactly learn $N$ distinct observations. The activation function of a node defines the output of that node given an input or set of inputs. The input weights (linking the input layer to the first hidden layer) and hidden layer biases need to be adjusted.
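For $N$ arbitrary distinct samples $(x_i, t_i)$, where $x_i = [x_{i1}, x_{i2}, \ldots, x_{in}]^T \in \mathbb{R}^n$ and $t_i = [t_{i1}, t_{i2}, \ldots, t_{im}]^T \in \mathbb{R}^m$, an SLFN with $\tilde{N}$ hidden nodes and activation function $g(x)$ is modeled as
$$\sum_{i=1}^{\tilde{N}} \beta_i\, g(w_i \cdot x_j + b_i) = o_j, \quad j = 1, \ldots, N, \tag{3}$$
where $w_i$, $\beta_i$, and $b_i$ are the input weight vector, the output weight vector, and the threshold of the $i$th hidden node. When the SLFN approximates the samples with zero error, $o_j = t_j$ for $j = 1, \ldots, N$. These $N$ equations, referred to as (4), can be written compactly as
$$H\beta = T, \tag{5}$$
with $\beta = [\beta_1, \ldots, \beta_{\tilde{N}}]^T$ and $T = [t_1, \ldots, t_N]^T$, where $H$ is defined as
$$H = \begin{bmatrix} g(w_1 \cdot x_1 + b_1) & \cdots & g(w_{\tilde{N}} \cdot x_1 + b_{\tilde{N}}) \\ \vdots & & \vdots \\ g(w_1 \cdot x_N + b_1) & \cdots & g(w_{\tilde{N}} \cdot x_N + b_{\tilde{N}}) \end{bmatrix}_{N \times \tilde{N}}.$$
$H$ is called the hidden layer output matrix of the neural network; the $i$th column of $H$ is the $i$th hidden node output with respect to inputs $x_1, x_2, \ldots, x_N$.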
Traditionally, all the parameters of feedforward networks need to be tuned iteratively. Gradient descent-based methods have mainly been used in the learning algorithms of feedforward neural networks. These methods are generally very slow because of improper learning steps, and they may easily converge to local minima.

Extreme Learning Machine.
Extreme Learning Machine (ELM) is a simple learning algorithm for SLFN. The learning speed of ELM can be thousands of times faster than that of traditional feedforward network learning algorithms while obtaining better generalization performance.
In most applications, the number of hidden neurons is much smaller than the number of distinct training samples, and $H$ is a nonsquare matrix. In the ELM approach, the input weights $w_i$ and the hidden layer biases $b_i$ of SLFNs are not tuned but are assigned randomly and then fixed. This is equivalent to mapping the samples to a random feature space. Training an SLFN is then equivalent to finding a least squares solution $\hat{\beta}$ of the linear system $H\beta = T$:
$$\hat{\beta} = H^{\dagger} T,$$
where $H^{\dagger}$ is the Moore-Penrose generalized inverse of matrix $H$.
There are many ways of calculating the Moore-Penrose generalized inverse of a matrix, such as the orthogonal projection method, iterative methods, and singular value decomposition [26]. Here, singular value decomposition is used to calculate the Moore-Penrose generalized inverse.
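The ELM training procedure described above can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the authors' implementation: it assumes NumPy is available, draws the random input weights and biases uniformly from $[-1, 1]$ (the paper does not specify the distribution), and uses one-hot target rows so that the predicted class is the index of the largest output. The names `elm_train` and `elm_predict` are hypothetical. `numpy.linalg.pinv` computes the Moore-Penrose pseudo-inverse via singular value decomposition.

```python
import numpy as np


def sigmoid(z):
    """Sigmoid activation g(z)."""
    return 1.0 / (1.0 + np.exp(-z))


def elm_train(X, T, n_hidden, rng):
    """Train an ELM.

    X : (N, n) input samples, T : (N, m) targets (e.g. one-hot labels).
    Input weights and biases are random and never tuned; only the
    output weights beta are solved for.
    """
    n_features = X.shape[1]
    # Assumption: uniform initialisation in [-1, 1]
    W = rng.uniform(-1.0, 1.0, size=(n_features, n_hidden))
    b = rng.uniform(-1.0, 1.0, size=n_hidden)
    H = sigmoid(X @ W + b)            # hidden layer output matrix, (N, n_hidden)
    beta = np.linalg.pinv(H) @ T      # least squares solution of H beta = T
    return W, b, beta


def elm_predict(X, W, b, beta):
    """Network outputs for new samples; argmax over columns gives the class."""
    return sigmoid(X @ W + b) @ beta


# Usage on a toy problem: four points, each in its own class.
X_toy = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
T_toy = np.eye(4)
W, b, beta = elm_train(X_toy, T_toy, n_hidden=20, rng=np.random.default_rng(0))
predicted_class = elm_predict(X_toy, W, b, beta).argmax(axis=1)
```

Because training reduces to a single pseudo-inverse computation, no iterative tuning is needed, which is the source of the speed advantage claimed for ELM.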

Experiments Evaluation
In this section, we first describe the experimental dataset. Second, we present the feature extraction. Finally, we discuss transportation mode detection based on the elementary features, on PE only, and on the combination of PE and the elementary features.
We do not compare our method with previous transportation mode detection methods for the following reasons. Firstly, a transportation mode detection method is composed of trajectory partition, feature selection, and a learning process. In the other methods, subtrajectories are obtained automatically by a specific trajectory partition algorithm, so their lengths differ. In our method, however, subtrajectories need to have the same length to calculate PE, so we obtain fixed-length subtrajectories by partitioning the trajectories into pieces of equal length. Secondly, we compare PE with the features used in the other methods to show that PE is a valid indicator. Finally, we compare our learning method, ELM, with SVM, a learning method commonly used in the other methods, to validate ELM's efficiency.

Dataset Description.
The experiments are carried out on the Microsoft GeoLife dataset [1], which consists of 17621 trajectories of 182 users collected over three years. These trajectories were recorded by different GPS loggers and GPS phones. A GPS trajectory is represented by a sequence of time-stamped points of a user in a certain time interval, and each time-stamped point contains latitude, longitude, and altitude information. The trajectories of 73 users have been labeled with transportation mode. The total distance and duration of the transportation modes are listed in Table 1.

Feature Extraction.
We extract the features from each trajectory. The features are shown in Table 2. The elementary features {AV, DV, HCR, SR, and VCR} have the same definitions as in [9].

Transportation Modes Detection Based on Elementary Features.
We select 30 of the 73 users and extract 5525 trajectory segments to perform the experiments. The elementary features {AV, DV, HCR, SR, and VCR} are calculated to detect transportation modes. The 5525 trajectory segments are randomly partitioned into a training set and a testing set. Table 3 lists the different training set sizes and the features used.
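The random partition into training and testing sets can be sketched as follows. The function name `random_split`, the fixed seed, and the rounding of the training set size are assumptions of this example; the paper only states that the partition is random and that several training set sizes are used.

```python
import random


def random_split(samples, train_fraction, seed=0):
    """Randomly partition samples into a training and a testing set.

    train_fraction is the share of samples used for training,
    e.g. 0.1 for a 1:9 split or 0.5 for a 5:5 split.
    """
    indices = list(range(len(samples)))
    random.Random(seed).shuffle(indices)      # reproducible shuffle
    n_train = round(train_fraction * len(samples))
    train = [samples[i] for i in indices[:n_train]]
    test = [samples[i] for i in indices[n_train:]]
    return train, test


# Usage: a 3:7 split of 100 placeholder samples.
train_set, test_set = random_split(list(range(100)), 0.3)
```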
We choose SVM and ELM as classifiers to detect transportation modes. Tables 4-8 show the running time and accuracy. We observe that the classification accuracy of ELM is about 62% when the number of nodes is larger than 500, and the results are higher and more stable when the number of nodes is 800. The detection accuracy of SVM is about 45%, which is lower than that of ELM. For ELM, the choice of activation function has a great effect on running time and accuracy. Sigmoid gives the most accurate result. The accuracy of Hardlim is slightly lower than that of Sigmoid, but its training time is much shorter.

Transportation Modes Detection Based on Only PE.
To compute the PE of the speed for each transportation mode, we extract trajectory segments with a single transportation mode. The number of points in each trajectory segment is greater than 1000. We collect 500 trajectory segments in our experiment. We calculate the speed at each point and then compute the PE of the speed sequence of each trajectory segment. We use this PE of the speed as a feature to detect transportation modes.
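The per-point speed calculation can be illustrated with a short sketch. The paper does not state how speed is derived from the GPS points, so the use of the haversine great-circle distance, the tuple layout `(latitude, longitude, timestamp in seconds)`, and the names `haversine_m` and `point_speeds` are all assumptions of this example.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(min(1.0, math.sqrt(a)))


def point_speeds(points):
    """Speeds in m/s between consecutive time-ordered GPS points.

    points: list of (lat_deg, lon_deg, t_seconds) tuples.
    Pairs with a non-positive time difference are skipped.
    """
    speeds = []
    for (lat1, lon1, t1), (lat2, lon2, t2) in zip(points, points[1:]):
        dt = t2 - t1
        if dt <= 0:
            continue  # duplicate or out-of-order timestamp
        speeds.append(haversine_m(lat1, lon1, lat2, lon2) / dt)
    return speeds


# Usage: two fixes one degree of latitude apart, 1000 s apart.
example_speeds = point_speeds([(39.0, 116.0, 0.0), (40.0, 116.0, 1000.0)])
```

The resulting speed sequence of a segment is the time series on which PE is then computed.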
Figure 2 shows the average speed of the 500 trajectory segments. Figure 3 presents the speed distribution of different transportation modes. The range of speed change in each transportation mode is large. Different transportation modes, such as walk, bike, and bus, overlap heavily in average velocity. Consequently, the average velocity is not a perfect feature for distinguishing different transportation modes. Figure 4 shows the PE of the speed for different transportation modes. The PE of the speed for cars and buses is lower and spans a smaller range, whereas the PE of the speed for walking and bikes is higher and spans a larger range. Under normal traffic conditions, we can easily recognize different transportation modes from the average speed. In a traffic jam, however, the average speed of car and bus is almost the same as that of walking and bike. Because the PE of the speed for car is lower than that for walking, we can still distinguish car from walking using the PE of the speed. The average PE of the 500 trajectory segments with different $m$ under multiple transportation modes is shown in Table 9. The average PE of the speed becomes larger as $m$ increases. This is possibly because, when $m$ is larger, the probability of each distinct symbol is smaller and each row of the reconstruction matrix is more complex.
We use the PE of the speed as the feature to detect transportation modes for the 500 trajectory segments. We choose SVM and ELM as classifiers. In ELM, we adopt Sigmoid as the activation function. For the embedding dimension $m$, Bandt and Pompe recommend $m = 3, \ldots, 7$ [14,15]; it has been found that $m = 3$ and 4 may still be too small, and a value of $m = 5$, 6, or 7 seems to be the most suitable.
We set $m$ to 4, 5, 6, and 7. Tables 10-13 present the experimental results. We find that $m = 4$ is too small to give good results, as reported in [15]. For $m = 5$, 6, or 7, we find that the larger $m$ is, the lower the detection accuracy and the longer the training time. We also note that the larger the dimension is, the more time the PE computation needs. The best experimental results are obtained when $m = 5$. Of the two classifiers, ELM gives better stability and higher accuracy than SVM.

Transportation Modes Detection Based on PE and the Elementary Features.
Starting from PE, we gradually add the other elementary features. In ELM, we adopt Sigmoid as the activation function and the training data size is 50%. Tables 14-18 present the detection results with different feature sets.

Figure 1: The steps of transportation mode detection method based on PE and ELM.
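In the SLFN model, $w_i = [w_{i1}, w_{i2}, \ldots, w_{in}]^T$ is the weight vector connecting the input nodes to the $i$th hidden node, $\beta_i = [\beta_{i1}, \beta_{i2}, \ldots, \beta_{im}]^T$ is the weight vector connecting the $i$th hidden node to the output nodes, and $b_i$ is the threshold of the $i$th hidden node. $w_i \cdot x_j$ denotes the inner product of $w_i$ and $x_j$. The standard SLFN with $\tilde{N}$ hidden neurons can approximate these $N$ samples with zero error; that is, $\sum_{j=1}^{N} \| o_j - t_j \| = 0$, and there exist $\{\beta_i, w_i, b_i\}_{i=1}^{\tilde{N}}$ such that
$$\sum_{i=1}^{\tilde{N}} \beta_i\, g(w_i \cdot x_j + b_i) = t_j, \quad j = 1, \ldots, N. \tag{4}$$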

Table 1: Total distance and duration of transportation modes.

Table 2: Extracted features of each trajectory segment.

Table 3: Different training set sizes and the used features.

Table 4: Running time and accuracy of 1:9 data.

Table 5: Running time and accuracy of 2:8 data.

Table 6: Running time and accuracy of 3:7 data.

Table 7: Running time and accuracy of 4:6 data.

Table 8: Running time and accuracy of 5:5 data.

Table 9: Average PE of 500 trajectory segments with different $m$ under multiple transportation modes.

Table 12: Experimental results with $m = 6$.

In ELM, Sigmoid is adopted as the activation function and the number of nodes is 800, since these parameters give relatively good performance for ELM. To demonstrate the effect of the number of training samples, we design the experiments with different training set sizes (10%, 20%, 30%, 40%, and 50%); the remaining samples act as the testing set.