Support Vector Machine Based Mobility Prediction Scheme in Heterogeneous Wireless Networks

To improve the intelligence of mobile-aware applications in heterogeneous wireless networks (HetNets), it is essential to establish an advanced mechanism to anticipate the change of the user location in every subnet of the HetNets. This paper proposes a multiclass support vector machine based mobility prediction (Multi-SVMMP) scheme to estimate the future location of mobile users according to the movement history information of each user in HetNets. In the location prediction process, the regular and random user movement patterns are treated differently, which reflects the user movements more realistically than the existing movement models in HetNets. Different forms of multiclass support vector machines are embedded in the two mobility patterns according to their different characteristics. Moreover, the introduction of the target region (TR) cuts down the energy consumption efficiently without impacting the prediction accuracy. As reported in the simulations, our Multi-SVMMP can overcome the difficulties found in the traditional methods and obtain a higher prediction accuracy and user adaptability while reducing the cost of prediction resources.


Introduction
HetNets have been investigated intensively in recent years because no single wireless network form can satisfy the ever-growing requirements of all users [1,2]. In order to improve the overall capacity of HetNets and guarantee the QoS requirements of users, user mobility management [3,4] has become a research focus and should be well designed. Traditionally, mobility management includes keeping track of user locations and maintaining connections when users issue handoff requests. Therefore, many related problems, such as location updating, paging [5], handoff latency [6], and resource reservation [7], should be well handled. Recently, more and more scholars have realized that mobility management becomes more intelligent if a network system can anticipate the change of the user location. For instance, the mobility prediction information in a prehandoff mechanism can reduce the workload in the practical handoff process and reduce the handoff latency [8]. Moreover, resources can be reserved in advance for incoming users according to the prediction information. Thus, the situation in which insufficient resources are left for the QoS requirements of incoming users in the target network can be avoided [9]. In addition, mobility prediction also affects traffic control and reduces energy consumption. Therefore, mobility prediction [10] has become one of the hottest topics in wireless mobile computing research. In particular, HetNets aim to control different access networks centrally in one unified system, and they have to cope with multiple mobility patterns and more service types in a cooperative way. Furthermore, the geographical conditions in HetNets are more varied, because different wireless access network technologies are combined. Consequently, mobility prediction in HetNets faces more complicated challenges than in any single wireless network.
Mobility prediction for users in HetNets has been extensively explored [11,12]. The algorithms can be subdivided into three kinds according to the information used for the prediction: GPS position information [13]; domain information [14-16] (such as the coordinates, the speed and direction of moving users, and the geometric shape of the networks); and history location array information [17,18]. The Kalman filter in [13] forecasts the location well by correcting the inaccuracy of the GPS location. However, it consumes much power and bandwidth, since multiple parameters must be transmitted among nodes. In addition, it is ineffective in indoor environments, which are usually the hotspots in HetNets, although GPS can provide accurate location information outdoors. A good location prediction technique can guarantee the accuracy without performing complex localization operations, so this paper only considers prediction that does not use GPS position information. The methods in [14-16] commonly offer the user position as supplementary decision information in handoff or mobility management. However, both the mobility pattern hypothesis and the prediction in these methods are too simple to obtain accurate prediction results. Data mining (DM) for aircraft trajectories in [17,18] collects much history location information to construct databases and mine typical trajectories. Based on the databases, the method anticipates the location by using a matching mechanism. However, such databases are hard to establish and manage in HetNets because the number of mobile users is huge. Furthermore, the randomness of mobile users is more evident than that of aircraft trajectories owing to the movement character of the individual. That is to say, the data in the databases that are seldom visited by users will have a low utilization ratio, which leads to a high waste of resources.
On the other hand, from the system model perspective, the prediction methods can be deterministic model-based or statistical model-based [19]. The main drawback of the former algorithms is their high sensitivity to movement patterns that do not tally with the deterministic model; that is, the prediction performance decreases linearly with the increase of the randomness. References [20,21] record the historical series of base stations that users have gone through and discover the mobility model by statistical methods. Nevertheless, the models are too simple to reflect the actual movement patterns. The method in [22] is a good attempt to set up a better Markov model by dividing the movement patterns into regular and random parts. Unfortunately, it puts most of its effort into the model establishment and neglects the design of the prediction algorithm.
In this paper, we focus on the two main problems in the existing literature analyzed above. The first one is the complex localization operations and the excessive history location information required by the mobility prediction schemes. The second one is the impractical and overly simple mobility models in model-based methods. Thus, a novel mobility prediction mechanism, called Multi-SVMMP, is proposed. It solves the former problem by using the character of the multiclass SVM method and deals with the second problem by distinguishing the regular and random movements and treating them differently.
The main contributions and distinctions of our Multi-SVMMP are fourfold. Firstly, a Multi-SVMMP mechanism is proposed to predict the future location of users in HetNets. The multiclass SVM method in Multi-SVMMP guarantees the prediction accuracy without complex localization operations or excessive history location information. Secondly, we distinguish the regular and random movement patterns by introducing the location entropy threshold $H_{th}$. This approach reflects the movement patterns in HetNets more realistically and decreases the sensitivity of mobility models to the randomness of the movements. Moreover, the introduction of TR in the network topology saves many prediction resources without impacting the location prediction accuracy. Thirdly, we embed the Multi-SVMMP method into the two proposed mobility patterns and treat the two types of movement patterns by designing different input location samples. Thus, the practical value of the prediction mechanism can be improved. Lastly, we conduct experiments to investigate and verify the prediction performance of Multi-SVMMP. The simulations show that our Multi-SVMMP can overcome the difficulties found in the traditional methods and obtain a higher prediction accuracy and user adaptability while reducing the cost of prediction resources.

System Model
Among all types of access networks, cellular networks and WLAN have complementary characteristics. WLAN offers a high data rate and unlicensed spectrum. However, its coverage area is small. Cellular networks have a much wider coverage area but a lower data rate. Hence, the collaboration between these two wireless access networks can attract more users and introduce high-speed wireless data services for the cellular networks. Furthermore, LTE is an advanced type of cellular network. Therefore, LTE and WLAN are adopted to constitute the HetNets in this paper. This section describes the system model of our HetNets before the design of the mobility prediction mechanism. The network topology of the HetNets is given first, and then the mobility model is discussed.

Network Topology.
For simplicity, ideal circles and hexagons are often considered in the simulations of existing papers on cellular networks. Although these hypotheses simplify the analyses, they are impractical for a real cellular network. The reason is that the shape and size of each cell may vary with the receiver sensitivity, the antenna radiation patterns of the base stations, and the propagation environment. Moreover, the number of neighboring cells varies from cell to cell. Therefore, irregular polygons, rather than hexagons or circles, are adopted for the LTE subnets to better reflect practical deployments.
Figure 1 shows our HetNets topology. Actually, it is impractical and unnecessary to know the exact path of users. In most cases, we care more about which subnet users will be in at the next moment. Therefore, what the mobility prediction algorithm predicts is whether a user departs from one subnet to another in the next period of time. Obviously, subnet transfer happens more often when users move near the boundary of neighboring subnets. Hence, we pay more attention to the boundary of neighboring subnets. We define TR (in Figure 1) by drawing a dotted-line boundary close to the boundary of each subnet. TR is thus defined as the region between two dotted lines along the boundary of LTE subnets and between two dotted circles along the boundary of WLAN subnets. Let $\Delta t$ be the observing epoch.

The distance $d$ between the HetNets boundary and the TR boundary can be written as
$$d = v_{\max} \cdot \Delta t, \tag{1}$$
where $v_{\max}$ is the maximum speed of users and $\Delta t$ is the observing epoch. Thus, users in the region outside TR will still stay in that region after a period $\Delta t$, even if they move at the maximum speed. Obviously, we can predict that such users will still stay in the previous subnet after $\Delta t$. Then, we observe the location of the users again after $\Delta t$.
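The TR rule above can be illustrated with a minimal sketch. It assumes that the TR margin is the product of the maximum speed and the observing epoch, and that the distance from a user to the nearest subnet boundary is already known; the function names `tr_margin` and `needs_prediction` are hypothetical and are not part of the original scheme.

```python
def tr_margin(v_max: float, dt: float) -> float:
    """Distance between the subnet boundary and the TR boundary (assumed d = v_max * dt)."""
    return v_max * dt


def needs_prediction(dist_to_boundary: float, v_max: float, dt: float) -> bool:
    """Return True when the user is inside TR, i.e. within the margin of a subnet boundary.

    Users farther away than the margin cannot cross the boundary within one
    observing epoch, so they are assumed to stay in their current subnet.
    """
    return dist_to_boundary <= tr_margin(v_max, dt)


if __name__ == "__main__":
    # Simulation settings quoted later in the paper: 50 m/s and 5 s.
    print(tr_margin(50.0, 5.0))
    print(needs_prediction(100.0, 50.0, 5.0))
    print(needs_prediction(300.0, 50.0, 5.0))
```

With the simulation settings of 50 m/s and 5 s, the margin is 250 m, so only users within 250 m of a subnet boundary would trigger the predictor.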

Mobility Patterns.
Many studies [23,24] adopt random walk models to describe the mobility patterns of mobile users. However, the analyses of movement characteristics in [25,26] find that human trajectories are not completely random. On the contrary, they often exhibit a high degree of temporal and spatial regularity. That is, each individual may be characterized by a significant probability of returning to a few highly frequently visited locations and dwelling for a longer duration in those locations, as shown in Figure 2. Humans follow simple reproducible movement patterns, in spite of the high diversity of the travel history of mobile users. For instance, Figure 2 shows the locations one user may visit in a period of time in the HetNets. She/he visits many locations with low probability and revisits three locations (maybe home or office) with high probability, namely, 28%, 19%, and 10%, respectively. Based on the consideration above, and in order to model practical movements of mobile users, the prediction algorithms for regular movements and random movements are developed separately in this paper. Therefore, we describe the daily movements of a mobile user by two parts. Let $E(s, t)$ and $A(s, t)$ be the regular and random movement parts, respectively, when the user stays in subnet $s$ at time $t$. Thus, the movement pattern $M(s, t)$ is described by these two parts, as denoted in (2).

Multiclass SVM Module Training
The SVM method implements two steps: offline module training with history location information and online next-subnet prediction by inputting a test location data vector to the SVM module. The goal of offline module training is to find an approximate prediction module $f(\mathbf{x})$ that reflects the relationship between the prediction values $y_i$ and $\mathbf{x}_i$, by using the history information $(\mathbf{x}_1, y_1), \ldots, (\mathbf{x}_m, y_m)$. When a new vector $\mathbf{x}$ is input to the module $f(\mathbf{x})$, it outputs a prediction result $y$ for the next subnet number the user will leave for. In two-class SVM, this process is equivalent to constructing an optimal hyperplane in the feature space, whose equation can be written as
$$\mathbf{w} \cdot \mathbf{x} + b = 0, \tag{3}$$
where $\mathbf{w}$ is the weight vector and $b$ is the bias. The separation margin between support vectors (SVs) is $2/\|\mathbf{w}\|$. The goal is to find the optimal $\mathbf{w}^*$ and $b^*$. The problem is to maximize the separation margin with the corresponding constraints:
$$\min_{\mathbf{w}, b, \xi} \ \frac{1}{2}\|\mathbf{w}\|^2 + C \sum_{i=1}^{m} \xi_i \quad \text{s.t.} \quad y_i(\mathbf{w} \cdot \mathbf{x}_i + b) \ge 1 - \xi_i, \ \ \xi_i \ge 0, \ \ i = 1, \ldots, m, \tag{4}$$
where $\xi_i$ is the relaxation variable, which is introduced to construct the optimal hyperplane when the data are linearly inseparable, and $C$ is the adjustment factor that determines the model complexity and the punishment degree.
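According to the Lagrange theorem, (4) can be converted to seeking the minimum, with respect to $\mathbf{w}$, $b$, and $\xi_i$, of the following function:
$$L(\mathbf{w}, b, \xi, \alpha, \beta) = \frac{1}{2}\|\mathbf{w}\|^2 + C \sum_{i=1}^{m} \xi_i - \sum_{i=1}^{m} \alpha_i \left[ y_i(\mathbf{w} \cdot \mathbf{x}_i + b) - 1 + \xi_i \right] - \sum_{i=1}^{m} \beta_i \xi_i, \tag{5}$$
where $\alpha_i \ge 0$ and $\beta_i \ge 0$ are the Lagrange multipliers. Consider
$$\frac{\partial L}{\partial \mathbf{w}} = 0 \ \Rightarrow \ \mathbf{w} = \sum_{i=1}^{m} \alpha_i y_i \mathbf{x}_i, \tag{6}$$
$$\frac{\partial L}{\partial b} = 0 \ \Rightarrow \ \sum_{i=1}^{m} \alpha_i y_i = 0, \tag{7}$$
$$\frac{\partial L}{\partial \xi_i} = 0 \ \Rightarrow \ C - \alpha_i - \beta_i = 0. \tag{8}$$
To estimate the coefficients $\alpha_i$, it is sufficient to find the maximum of the dual functional with the corresponding constraints:
$$\max_{\alpha} \ \sum_{i=1}^{m} \alpha_i - \frac{1}{2} \sum_{i=1}^{m} \sum_{j=1}^{m} \alpha_i \alpha_j y_i y_j K(\mathbf{x}_i, \mathbf{x}_j) \quad \text{s.t.} \quad \sum_{i=1}^{m} \alpha_i y_i = 0, \ \ 0 \le \alpha_i \le C, \tag{9}$$
where $K(\mathbf{x}_i, \mathbf{x})$ is the kernel function. The role of the kernel function is to map samples in the original space to vectors in a high-dimensional feature space, so as to solve the problem of linear inseparability in the original space. The main kernel function types include the linear kernel, polynomial kernel, RBF kernel, and sigmoid kernel. According to (9), we can obtain $\alpha^*$. Substituting $\alpha^*$ into (6), we obtain the optimal $\mathbf{w}^*$:
$$\mathbf{w}^* = \sum_{i=1}^{m} \alpha_i^* y_i \mathbf{x}_i. \tag{10}$$
The bias $b^*$ can be obtained from the original constraint condition:
$$\alpha_i^* \left[ y_i(\mathbf{w}^* \cdot \mathbf{x}_i + b^*) - 1 + \xi_i \right] = 0. \tag{11}$$
The prediction module can then be written as
$$f(\mathbf{x}) = \operatorname{sgn}\left( \sum_{i=1}^{m} \alpha_i^* y_i K(\mathbf{x}_i, \mathbf{x}) + b^* \right), \tag{12}$$
where $f(\mathbf{x})$ is the two-group prediction result. The test location data $\mathbf{x}$ can then be input to $f(\mathbf{x})$ to implement the online mobility prediction. It is worth mentioning that the SVM algorithm in this section is designed only for two-group classification problems. However, more than two kinds of results need to be output from Multi-SVMMP. Therefore, multiclass SVM should be developed beyond the traditional two-class SVM.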
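As an illustration of the online step, the following minimal sketch evaluates a two-class kernel decision function of the standard form, the sign of the kernel-weighted sum over support vectors plus the bias. It assumes an RBF kernel and a hand-picked toy model; the multipliers, labels, bias, and the `gamma` value are hypothetical and were not obtained by solving the dual problem on real data.

```python
import math


def rbf_kernel(a, b, gamma=0.5):
    """RBF kernel K(a, b) = exp(-gamma * ||a - b||^2); gamma is an assumed value."""
    sq_dist = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-gamma * sq_dist)


def decision_value(x, support_vectors, labels, alphas, bias, kernel=rbf_kernel):
    """Return sum_i alpha_i * y_i * K(x_i, x) + b for a trained two-class SVM."""
    return sum(a * y * kernel(sv, x)
               for sv, y, a in zip(support_vectors, labels, alphas)) + bias


def predict(x, support_vectors, labels, alphas, bias, kernel=rbf_kernel):
    """Two-class prediction: sign of the decision value (ties map to +1)."""
    value = decision_value(x, support_vectors, labels, alphas, bias, kernel)
    return 1 if value >= 0 else -1


# Hypothetical toy model: two support vectors with hand-picked multipliers.
SVS = [(0.0, 0.0), (1.0, 1.0)]
LABELS = [-1, 1]
ALPHAS = [1.0, 1.0]  # satisfies the dual constraint sum_i alpha_i * y_i = 0
BIAS = 0.0

if __name__ == "__main__":
    print(predict((0.9, 0.9), SVS, LABELS, ALPHAS, BIAS))
    print(predict((0.1, 0.1), SVS, LABELS, ALPHAS, BIAS))
```

In practice the multipliers and the bias would come from a quadratic programming solver or an SVM library rather than being set by hand.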

Multi-SVMMP Mobility Prediction Design
The Multi-SVMMP scheme is implemented in the mobile information center (MIC), which is a central controller established for the mobility prediction. MIC collects the history location information and forecasts the future location of mobile users.

Data Collection and Data Preprocess.
As mentioned above, each mobile user follows regular movement patterns to some degree, which provides some predictability for the movements [27]. Before executing the prediction scheme, the information data used for the prediction should be collected and preprocessed.
Distribute a set $U$ of users in the HetNets. For each mobile user $u \in U$, the system observes and records the locations $\{x_1^u, x_2^u, \ldots, x_n^u\}$ and the corresponding subnet numbers $\{sn_1^u, sn_2^u, \ldots, sn_n^u\}$ for $n$ times. $x_i^u$ is the $i$th location coordinate. Let $\Delta t$ be the observing epoch. $sn_i^u$ is the subnet number at the next $\Delta t$, that is, the subnet that user $u$ will be in after the next $\Delta t$ (the user either stays in the current subnet or moves to another one). $sn_i^u$ is anticipated by using the history location information. Because the value of $sn_i^u$ is multiclass, a pretreatment of the location vector series is carried out first.
To avoid large errors owing to the different magnitudes of the input and output data, we eliminate the differences of magnitude among the location feature samples. Based on this consideration, the location samples $\{x_1^u, x_2^u, \ldots, x_n^u\}$ are normalized to the interval $[-1, +1]$ in the following way:
$$\bar{x}_i^u = \frac{2\,(x_i^u - x_{\min}^u)}{x_{\max}^u - x_{\min}^u} - 1, \tag{13}$$
where $x_{\max}^u$ and $x_{\min}^u$ are the maximum and minimum values of the location samples, respectively. Then, we make $K$ copies of each $\bar{x}_i^u$, where $K$ is the number of candidate subnets, and spread every copy with a construction vector $\mathbf{h}$ of length $K$. By using this process, the multiclass prediction problem is transformed into a two-class prediction problem. Let $X_i^u = \bar{x}_i^u$ and $Y_i^u = sn_i^u$. The $j$th spread element $(xs_i^{u,j}, ys_i^{u,j})$ can be written as
$$xs_i^{u,j} = X_i^u \oplus \mathbf{h}^j, \qquad ys_i^{u,j} = sn_i^u(j), \tag{14}$$
where the notation $\oplus$ stands for concatenating the two vectors before and after it. The $k$th ($k \in (1, \ldots, K)$) element of vector $\mathbf{h}^j$ is constructed as
$$h_k^j = \begin{cases} 1, & k = j, \\ 0, & k \ne j, \end{cases} \tag{15}$$
and the $j$th prediction value $sn_i^u(j)$ of the spread sample vector can be defined as
$$sn_i^u(j) = \begin{cases} +1, & sn_i^u = j, \\ -1, & sn_i^u \ne j. \end{cases} \tag{16}$$
Thus, the original $n$ feature samples are transformed into $M = K \cdot n$ spread samples $(xs_1, xs_2, \ldots, xs_M)$ and $(ys_1, ys_2, \ldots, ys_M)$, which are the input feature samples for the SVM in Section 3. Then, the traditional two-class SVM algorithm can be adopted to solve our prediction problem.
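The preprocessing described above can be sketched as follows. This is a minimal illustration that assumes min-max scaling to $[-1, +1]$ and a one-hot construction vector whose length equals the number of candidate subnets; the function names `normalize` and `spread` are hypothetical, and a scalar location series is used for brevity.

```python
def normalize(values):
    """Scale a series of location samples to the interval [-1, +1] (min-max scaling)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Degenerate series: every sample is identical, so map all of them to 0.
        return [0.0 for _ in values]
    return [2.0 * (v - lo) / (hi - lo) - 1.0 for v in values]


def spread(x, sn, num_subnets):
    """Turn one multiclass sample into num_subnets two-class samples.

    x is the feature vector, sn is the true next subnet number in 1..num_subnets.
    Copy j is x concatenated with a one-hot vector marking subnet j, and its
    label is +1 when sn == j and -1 otherwise.
    """
    samples = []
    for j in range(1, num_subnets + 1):
        h = [1.0 if k == j else 0.0 for k in range(1, num_subnets + 1)]
        samples.append((list(x) + h, 1 if sn == j else -1))
    return samples


if __name__ == "__main__":
    print(normalize([0, 5, 10]))
    for vector, label in spread([0.5], sn=2, num_subnets=3):
        print(vector, label)
```

Each original sample therefore yields as many two-class samples as there are candidate subnets, exactly one of which carries the label +1.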

Multi-SVMMP Online Mobility Prediction.
As mentioned in Section 2, the regular and random movement patterns are distinguished and detected differently. Here we introduce the concept of location entropy to capture the degree of regularity. Suppose that a user visits $K$ subnets; the appearance number at subnet $j$ is $n_j$; the user appears $n$ times in total. Thus, the frequency with which the user visits subnet $j$ is $n_j/n$. Let $X$ be the random variable representing the user's subnet. The entropy of $X$ can be calculated as follows:
$$H(X) = -\sum_{j=1}^{K} \frac{n_j}{n} \log \frac{n_j}{n}. \tag{17}$$
If a mobile user revisits one or two subnets (home or office) with high frequency, the location entropy becomes small, which means that the regularity and predictability are high. Therefore, we distinguish the regular and random movement patterns by defining a location entropy threshold $H_{th}$. If $H(X)$ of a mobile user is smaller than $H_{th}$, the user follows regular movement patterns. If $H(X)$ of a mobile user is bigger than $H_{th}$, the user follows random movement patterns. Evidently, the smaller the location entropy is, the higher the movement regularity and predictability are. For instance, different days of a week have different movement pattern regularity. Generally, weekends show more variation and randomness than weekdays, because most people visit only the workplace and home with high frequency on weekdays while arranging more activities at different places on weekends. Hence, the location entropy of the weekends is bigger than that of the weekdays.
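A minimal sketch of the entropy-based pattern detection follows. It assumes a base-2 logarithm, which the text does not state explicitly, and it treats an entropy exactly equal to the threshold as random, a case the text leaves open; the function names are hypothetical.

```python
import math
from collections import Counter


def location_entropy(visits):
    """Shannon entropy (assumed base 2) of the empirical subnet visit frequencies."""
    n = len(visits)
    counts = Counter(visits)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def movement_pattern(visits, h_th):
    """Classify a visit history as 'regular' (entropy below h_th) or 'random'."""
    return "regular" if location_entropy(visits) < h_th else "random"


if __name__ == "__main__":
    commuter = ["home"] * 8 + ["office"] * 2
    wanderer = ["s1", "s2", "s3", "s4", "s5", "s6", "s7", "s8"]
    print(location_entropy(commuter), movement_pattern(commuter, 1.34))
    print(location_entropy(wanderer), movement_pattern(wanderer, 1.56))
```

A user who splits the observations between two subnets has a low entropy and is classified as regular, whereas a user who visits eight subnets once each reaches an entropy of 3 under the base-2 assumption and is classified as random.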
Regular and random movement patterns are treated differently in the detailed mobility prediction steps. First of all, mobile users with regular movement patterns may revisit one or two subnets (home or office) with high frequency. This makes those locations more closely related and gives them more guiding value for the prediction, so long-term history location information may be very meaningful. Hence, in our Multi-SVMMP regular mobility prediction, multidimensional location samples $(x_{i-1}^u, x_{i-2}^u, \ldots, x_{i-d}^u)$ are input to the SVM module to train the prediction model for user $u$; that is, every next subnet number is evaluated by its former $d$ location samples:
$$XE_i^u = (x_{i-1}^u, x_{i-2}^u, \ldots, x_{i-d}^u), \qquad YE_i^u = sn_i^u, \tag{18}$$
where $XE_i^u$ and $YE_i^u$ are the history information for training the regular mobility predictor.
For the random movement patterns, users may visit different subnets randomly, which weakens the relationship among the locations. Thus, we suppose that the next subnet the mobile user leaves for is related only to the location one step before. Based on this consideration, we design different input feature sample vectors for random movement patterns. The location information $x_{i-1}^u$ is used as the input feature sample vector to the SVM module, denoted by
$$XA_i^u = x_{i-1}^u, \qquad YA_i^u = sn_i^u, \tag{19}$$
where $XA_i^u$ and $YA_i^u$ are the history information for training the random mobility predictor.
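The two input designs can be sketched as a sliding window over the location history. This minimal example assumes that the label of sample $i$ is the recorded subnet number at index $i$ and that the window length `d` is a free parameter; the function names are hypothetical.

```python
def regular_samples(locations, subnets, d):
    """Build regular-pattern samples: the d previous locations predict subnet i.

    Returns (xs, ys), where xs[k] = [x_{i-1}, ..., x_{i-d}] and ys[k] = sn_i.
    """
    xs, ys = [], []
    for i in range(d, len(locations)):
        xs.append([locations[i - k] for k in range(1, d + 1)])
        ys.append(subnets[i])
    return xs, ys


def random_samples(locations, subnets):
    """Build random-pattern samples: only the previous location predicts subnet i."""
    return regular_samples(locations, subnets, 1)


if __name__ == "__main__":
    locs = [10, 11, 12, 13, 14]  # hypothetical scalar locations
    sns = [1, 1, 2, 2, 3]        # hypothetical subnet numbers
    print(regular_samples(locs, sns, d=2))
    print(random_samples(locs, sns))
```

The random-pattern predictor is thus the special case of the regular-pattern design with a window of length one.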
Then, according to (14), the training samples are spread into two-class samples for module training. The detailed prediction steps are as follows.
Step 1. Observe user $u$ for a period $T$ and determine the movement pattern (regular or random) the user belongs to by calculating the location entropy $H(X)$ according to (17).
Step 2. Collect the test feature vectors $XE^u$ or $XA^u$ and spread them into two-class vectors, $xe^u$ or $xa^u$, according to the following formula. Here, "/" means "or" operation:
$$xe^u / xa^u = XE^u / XA^u \oplus \mathbf{h}^j, \quad j = 1, \ldots, K,$$
where $K$ is the number of candidate subnets.
. . .
Thus, we obtain the whole process for Multi-SVMMP. In order to investigate the influence of the location entropy threshold $H_{th}$ and TR on the prediction performance, we subdivide Multi-SVMMP into two algorithms, named Multi-SVMMP-HT and Multi-SVMMP-HNT, respectively, where Multi-SVMMP-HT denotes Multi-SVMMP with $H_{th}$ and the TR operation and Multi-SVMMP-HNT denotes Multi-SVMMP with $H_{th}$ but no TR operation.
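One plausible way to turn the spread two-class outputs back into a subnet number is sketched below. It is an assumption rather than the procedure of the original steps, which are not fully recoverable here: the test vector is spread once per candidate subnet, each copy is scored by a two-class decision function, and the subnet with the largest score is returned. The scorer `toy_score` and its weights are hypothetical.

```python
def predict_subnet(x, num_subnets, score):
    """Return the subnet j in 1..num_subnets whose spread copy gets the highest score.

    score is any callable that maps a spread vector to a real-valued two-class
    decision value (for example, the decision function of a trained SVM).
    """
    best_j, best_score = None, float("-inf")
    for j in range(1, num_subnets + 1):
        h = [1.0 if k == j else 0.0 for k in range(1, num_subnets + 1)]
        s = score(list(x) + h)
        if s > best_score:
            best_j, best_score = j, s
    return best_j


# Hypothetical linear scorer standing in for a trained two-class SVM.
WEIGHTS = [0.0, -1.0, 2.0, 0.5]


def toy_score(vector):
    """Dot product of the spread vector with hand-picked weights."""
    return sum(w * v for w, v in zip(WEIGHTS, vector))


if __name__ == "__main__":
    print(predict_subnet([0.3], 3, toy_score))
```

Taking the largest score instead of the first positive one avoids ambiguity when several spread copies, or none of them, are classified as positive.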

Simulation Results
In this section, we compare our Multi-SVMMP-HT and Multi-SVMMP-HNT with other existing methods in terms of the prediction accuracy and the user adaptability against different parameters. Two main comparison approaches are included: the classic method [22] that applies a finite-order Markov model (Classic Markov), which is found in many studies, and the two-class SVM scheme [28] that introduces neither TR nor the entropy threshold $H_{th}$ between regular and random movement patterns. The method in [28] is called MSS-TP Algorithm in our simulation. We use the movement data in [23,29] and on the website [30] for the experiments. In these mobility patterns, a mobile user tends to be stationary at some destinations for some time and then travels a long distance to other destinations. This process repeats again and again for the user. These data are based on real trajectory traces, which are trustworthy. MIC is in charge of collecting the history location information and applying the trained predictor online. The history information of 100 mobile users is collected every $\Delta t$ within a period of $T$ for the multiclass SVM training module. During the simulation, $\Delta t$ and $v_{\max}$ are set to 5 s and 50 m/s, respectively.
As mentioned before, a mobile user may prefer the top two or three locations most frequently on weekdays, and more infrequent locations may be included on weekends. Therefore, the average location entropy $H_{th}$ is set to 1.34 for weekdays and 1.56 for weekends [25,26]. To quantify the performance of these prediction methods under different mobility patterns, the results of statistical comparisons are presented as follows.

Simulation Results and Analyses.
Figures 3 and 4 show the location prediction accuracy of the proposed Multi-SVMMP and the comparison schemes against different numbers of training points for weekdays ($H_{th} = 1.34$) and weekends ($H_{th} = 1.56$), respectively. It can be seen that our Multi-SVMMP-HT and Multi-SVMMP-HNT outperform MSS-TP Algorithm and the Classic Markov scheme for both weekdays and weekends. Classic Markov has the worst performance due to its overly simple mobility model. Note that the accuracy of Multi-SVMMP-HT is not much higher than that of Multi-SVMMP-HNT. However, Multi-SVMMP-HT saves much more energy than Multi-SVMMP-HNT when implementing the prediction process, because Multi-SVMMP-HT does not need to forecast the location when the mobile user roams outside TR.
In addition, the fluctuation of the prediction accuracy of all the schemes is shown in Figures 3 and 4 for both weekdays and weekends. We can see that the fluctuation is relatively large for all four algorithms at the beginning. This is because the three SVM based algorithms have not yet found their SVs and Classic Markov has not yet collected enough history information to estimate an accurate transition probability matrix. In fact, only a few training samples (a few hundred) are needed for the three SVM based methods to converge and arrive at their optimal prediction results. On the contrary, Classic Markov has to collect many more training samples (about 9000) than the other algorithms. This is because SVM based algorithms are able to find the SVs from few training points thanks to the character of statistical learning. However, in Classic Markov, the state space increases dramatically with the number of users and subnets, so much more history information is needed to obtain the transition probability matrix. Furthermore, the accuracy for weekdays ($H_{th} = 1.34$) is better than that for weekends ($H_{th} = 1.56$) in all the schemes. The reason is that the movement patterns on weekends have more variety and randomness, which reduces the predictability of the movement patterns. The performance of MSS-TP Algorithm and Classic Markov declines more than that of Multi-SVMMP-HT and Multi-SVMMP-HNT, because these two schemes do not consider the difference between regular and random movement patterns and are therefore susceptible to movement randomness.
Figure 5 illustrates the relationship between the location prediction accuracy and the location entropy $H_{th}$. We can see that the accuracy decreases with the growth of entropy for all four schemes, because high entropy means high randomness and low predictability of the user movements. On the other hand, MSS-TP Algorithm and Classic Markov do not treat the regular and random movements differently and adapt worse than Multi-SVMMP-HT and Multi-SVMMP-HNT when more mobile users roam erratically. Thus, the prediction accuracy of MSS-TP Algorithm and Classic Markov declines clearly more than that of our proposed Multi-SVMMP-HT and Multi-SVMMP-HNT. Figure 6 gives the relation between the location prediction accuracy and the maximal movement speed $v_{\max}$ of a mobile user for both weekdays and weekends. We can see that the location prediction accuracy decreases with the increase of $v_{\max}$ for all four algorithms. This is because the correlation between the prediction time and the geographical distance becomes weaker as $v_{\max}$ increases, which affects the prediction accuracy. In addition, the prediction accuracy of Multi-SVMMP-HT and Multi-SVMMP-HNT is better than that of MSS-TP Algorithm and Classic Markov, which indicates that the $H_{th}$ operation in Multi-SVMMP brings better adaptability to the movement speed of users.
Figure 7 gives the location prediction accuracy against different $\Delta t$ on weekdays. The prediction accuracy decreases with the increase of $\Delta t$ for all four algorithms. It can be seen that the gap in prediction accuracy between Multi-SVMMP-HT and Multi-SVMMP-HNT is small at the beginning, since users move a very short distance within such a short period $\Delta t$. At this stage, the gain of Multi-SVMMP-HT is not obvious. With the further increase of $\Delta t$, Multi-SVMMP-HT outperforms Multi-SVMMP-HNT owing to the TR design. However, the area of TR becomes large when $\Delta t$ is big enough according to formula (1). Therefore, Multi-SVMMP-HT no longer works much more efficiently than Multi-SVMMP-HNT. That is, Multi-SVMMP-HT and Multi-SVMMP-HNT become almost the same algorithm with the expansion of TR.
Besides the performance analyses for a single user above, we also give the user adaptability against different prediction accuracy levels for multiple users. Figure 8 shows the ratio of users whose prediction accuracy is higher than the corresponding value. For instance, the ratio of users with a location prediction accuracy higher than 0.6 is 77%, 72%, 34%, and 5% for the four algorithms, respectively, on weekends. This indicates that the Multi-SVMMP methods have higher user adaptability than the other two existing methods. The user adaptability of the four algorithms on weekdays is better than that on weekends due to the lower randomness of user movements on weekdays.
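The user adaptability metric can be sketched as follows, assuming that the per-user prediction accuracy is the fraction of correctly predicted subnets and that the user ratio counts users whose accuracy is strictly higher than the given level. The function names and the sample values are hypothetical and are not taken from the simulation data.

```python
def prediction_accuracy(predicted, actual):
    """Fraction of observation epochs in which the predicted subnet was correct."""
    if not actual:
        return 0.0
    hits = sum(1 for p, a in zip(predicted, actual) if p == a)
    return hits / len(actual)


def user_ratio_above(accuracies, level):
    """Ratio of users whose prediction accuracy is strictly higher than level."""
    if not accuracies:
        return 0.0
    return sum(1 for acc in accuracies if acc > level) / len(accuracies)


if __name__ == "__main__":
    print(prediction_accuracy([1, 2, 3, 3], [1, 2, 2, 3]))
    print(user_ratio_above([0.9, 0.7, 0.5, 0.65], 0.6))
```

Sweeping the level from 0 to 1 and plotting the resulting ratio would give a curve of the kind described for Figure 8.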

Conclusions
This paper proposes an accurate mechanism to anticipate the change of the user location in every subnet of the HetNets. Based on the consideration that users usually have some degree of regularity in their movements, regular and random movement patterns are investigated separately to improve the prediction accuracy. Then, two different multiclass SVM location sample vectors are designed to treat the two mobility patterns differently. In addition, the prediction methods are triggered only when users roam into TR, which saves many prediction resources without impacting the location prediction accuracy. Thus, with our highly accurate location prediction scheme, users can immediately receive service or bandwidth resources when they roam to another location. Simulation results confirm that our Multi-SVMMP outperforms the other representative algorithms in terms of prediction accuracy and multiuser adaptability.
Besides, we find that many users in the HetNets prefer to stay at one or two places for a long time. Therefore, further efforts can be made to predict this temporal regularity in the future. Actually, it is more meaningful to estimate the future trend of both spatial and temporal changes. For instance, the system could reduce the energy consumption for location sensing by about 50% if we could predict accurately that a user will spend 50% of their residence time in one place.

Figure 5: Location prediction accuracy against location entropy.

Figure 6: Location prediction accuracy under different $v_{\max}$ on weekdays and weekends.

Figure 7: Location prediction accuracy against different $\Delta t$ (s) on weekdays.

Figure 8: User adaptability of different prediction algorithms on weekdays and weekends.