An Evolutionary Algorithm Based on the Four-Color Theorem for Location Area Planning

As an important constituent of wireless network planning, location area planning (LAP) directly affects the stability, security, and performance of a wireless network. This work proposes a novel evolutionary algorithm (EA) to solve the LAP problem. The proposed algorithm differs from previous EAs mainly in its encoding. The new coding method is inspired by the famous four-color theorem in graph theory: only four numbers are needed to encode all chromosomes. The encoding and decoding process is fast and easy to implement. What is more, illegal solutions can be handled easily during decoding. The design of effective and efficient genetic operators also benefits from this coding method. The modified evolutionary algorithm with this coding method is especially effective for the LAP problem. Using the principle of fuzzy clustering in initialization effectively compresses the search space. Computer simulations have been conducted, and the quality of the proposed algorithm is confirmed by comparing its results with those of a general EA and simulated annealing (SA).


Introduction
The demand for wireless communication, both real-time and non-real-time, has grown tremendously in recent years. As the foundation of a mobile communication system, the management of users' mobility plays an important role in the design of wireless mobile networks. When a user (also called a mobile station or mobile terminal) roams in the mobile network, the network must transmit data or voice information to the user quickly and accurately. So it is necessary for the network to locate users by tracking their location. These requirements give rise to mobility management in mobile communication. In the management of users' mobility, location areas (LAs) mainly serve two functions: location inquiry and location update. Location inquiry is performed by the network itself. When the network launches a phone call to a certain user, all the base stations in the LA page that user. In this process, the network tries to locate the user based on its last known location. In the process of location update, every user updates its location in the network and notifies the network of its current location. Whenever a user's entry into a new LA is detected, the records of the home location register (HLR) and visitor location register (VLR) are updated in a timely manner. In the GSM network, cells are bundled together to form a series of LAs. It is assumed that the user reports to the system as soon as it crosses an LA boundary. The user's report is called a location update (LU). So the user's location is updated when it enters a new LA; otherwise it is not updated.

Obviously, the higher the frequency of LA boundary crossing is, the larger the number of LUs is. Consequently, frequent LU leads to a high updating cost in the location registers. It seems to be a good idea to simply increase the size of the LA if we want to reduce LU cost. However, lowering the number of LUs alone does not guarantee better network performance. The size of the LA is limited by many other factors, such as the paging capacity of a mobile switching center (MSC) and the number of available channels. Moreover, there is another constraint in terms of paging cost. Paging cost is incurred by the network during a location inquiry, when the network tries to locate a user. As the size of the LA increases, the cost of paging also increases, because more cells need to be checked to find the called user in a bigger LA. This suggests keeping LAs small. The problem is that small LAs lead to a high frequency of boundary crossing. The results are frequent LU and a waste of signaling resources. Both paging and LU consume scarce resources such as wireless network bandwidth. Therefore, the LA should be neither too small nor too big.

So a reasonable planning of LAs, which aims to minimize users' LA boundary crossing while guaranteeing the quality of the signal received on the currently allocated channel, can effectively reduce the total cost of a mobility management system. To build a rational mathematical model, we should consider not only the number of LAs, carrier frequency, telephone traffic load, paging, and LU frequency but also a certain margin for system expansion. Furthermore, many other variables and complicated constraints are involved in LAP. LAP is a kind of combinatorial optimization problem with constraints [1]. It is characterized by the tradeoff between the number of LUs and the amount of paging signaling that the network has to handle. It is an NP-hard problem with a large solution space [2].
LAP is highly valued by network operators. Many heuristic algorithms, such as artificial neural networks (ANN), SA, and EA, have been proposed to solve the problem, and a large body of literature has emerged [1, 3-9]. In paper [3], the authors summarize much of the literature on the LAP problem. They give the general principles of LAP: make the LA border avoid areas with heavy telephone traffic, and set the LA boundary perpendicular or oblique to the roads. Papers [1, 4] discuss the LAP optimization problem in a fixed area and suggest some new, useful ideas. Paper [5] builds an objective function by making quantitative analogies. These analogies between signaling cost and paging cost or LU cost have been widely used in subsequent research. However, some studies suggest that these analogies are not entirely reasonable. The authors in [6, 7] take the view that the costs are incomparable. They argue that LAP optimization should be translated into the problem of minimizing the number of LA boundary crossings subject to the network's paging capacity. They use SA and other heuristic algorithms to solve this model. The authors in [8-11] convert the multiobjective optimization into a single-objective optimization by treating paging capacity as a constraint under some simplifying hypotheses. However, the simplification in [8-11] misses some key factors and can only yield some specific solutions. What is more, these works do not focus on the design of algorithms: they use existing algorithms to solve their model rather than propose a new effective method. In [12], the authors take the number of LA boundary crossings in the network as the measure of LU cost. Under some relevant constraints, the authors present an SA based on Cell-to-Switch (CSA) assignment. Paper [13] proposes a cell grouping strategy based on real statistics of traffic flow between cells. This strategy assigns cells with a high frequency of intercell user movements (high traffic load) to the same LA. Its purpose is to confine users' movements to the interior of LAs. The authors in [14, 15] try to solve LAP by combining LU cost with paging cost. This kind of converting and grouping strategy is useful and reasonable, and it has been adopted by many other researchers who study LAP.

Although those algorithms open doors for solving the LAP problem, some details remain to be completed before practical application, and they have several weaknesses. Firstly, the existing coding methods limit the use of those algorithms, and their encoding and decoding methods are relatively complicated and inefficient; secondly, some search algorithms easily fall into local optima, and the solutions of some algorithms rely heavily on the initial value; thirdly, some algorithms cannot produce good solutions in an acceptable time.
In this paper, an EA with a new coding method is proposed to solve the LAP problem. As discussed previously, a proper coding method can significantly increase the performance of an EA. Although a variety of coding methods for LAP have been presented in the literature, the coding method for the LAP problem still needs to be explored further. The existing coding methods cannot be regarded as really satisfying the needs of LAP, and the encoding and decoding processes based on them can be very hard to handle. In this paper, we design a coding method based on the famous four-color theorem in graph theory. With this coding method, only four numbers are needed to encode all the solutions of LAP. The encoding and decoding process is fast and easy to implement when the corresponding evolutionary algorithm is used to solve the LAP problem. This design of the chromosome decoding method can effectively reduce the generation of illegal solutions. Unfortunately, illegal solutions cannot be avoided completely. Based on this new coding method, we therefore design a repair strategy that can easily handle illegal solutions. The evolutionary algorithm built on this coding method is used to search for good solutions of the LAP problem.

To avoid the shortcomings of some algorithms described previously and make the model more practical, we use a LAP model integrated with road distribution information and traffic flow. In the process of solving this model, the multiple constraints are integrated into a single constraint. A new constraint is added: cells that are adjacent but have no road connection should belong to different LAs. This strategy can effectively compress the solution space and increase search efficiency. A reasonable coding method benefits not only the encoding and decoding process but also the design of genetic operators. In this algorithm, fixed-length chromosomes with a size equal to the number of cells in the network are used. Owing to the novel coding method, only four numbers are needed to encode all chromosomes. Therefore, the crossover and mutation of the chromosomes are very convenient. What is more, in the process of initialization, a fuzzy clustering method based on the known information produces solutions with fewer LA boundary crossings. It can effectively compress the search space and enhance the practical value of the algorithm.
The remainder of this paper is organized as follows. Section 2 describes the formulation of the problem and the new model proposed in this paper. Section 3 gives the design of the evolutionary algorithm in detail and shows the framework of the proposed algorithm. In Section 4, numerical simulations on two different generated networks are conducted. The experimental results are shown in Tables 3 and 4, and the corresponding configurations of the cells are shown in Figures 4, 5, and 6. Finally, we conclude the paper in Section 5.

The LAP Problem
2.1. Problem Statement. In the GSM network, the geographical coverage area is partitioned into cells, and each cell is served by a base station (BS). The essential task of LAP is to group those cells to form a series of LAs. The LU cost and paging cost of the network are both related to the size of the LA. Therefore, the goals of LAP are to reduce users' LA boundary crossing and to improve the quality of service at the same time. So the division of cells in network planning should be scientific and reasonable. The study of intelligent optimization algorithms for LAP focuses on the mathematical model and the corresponding algorithm. It is of great significance both in theory and in practical application.
To satisfy the goals of LAP, an LA should be neither too big nor too small. We hope that the paging cost and the LU cost can be minimized simultaneously when we plan a network. But those two objectives conflict, and there is no configuration in which both reach their smallest value at the same time. We can only obtain a set of trade-off solutions. In order to achieve a satisfactory result, the two goals should be transformed appropriately. Since the task is to find a balance between the paging cost and the LU cost, we take the paging load as a constraint in this paper. Therefore, the problem of finding trade-off solutions between the two goals is converted to finding a minimum LU cost that satisfies the corresponding constraints.
When solving an optimization problem, it is very important to model the problem properly. However, almost all the mathematical models of LAP commonly used previously consider only the theoretical distribution of cells. In other words, the information provided by the real geographic environment, such as the location of mountains, rivers, and streets, is ignored. In fact, these environmental factors can directly affect the mobility of users in real situations, and their effects can be decisive. It is common sense that regions connected by roads have a higher frequency of user mobility, while regions without connecting roads see relatively few user movements. For example, if there are mountains or rivers in some area, user movement in this area tends to be small. The LU among LAs is generated by users' LA boundary crossing, so geographic information is crucial in LAP. If we know that a river or another obstacle between two cells hinders people's movements, we should avoid assigning the two cells to the same LA in the process of dividing. Considering the impact of streets and roads, the boundaries of an LA should avoid running along or parallel to a road, in order to reduce the so-called ping-pong effect.

Location Area Planning Model.
This model is based on the important hypothesis that user movement between two cells takes place along roads. It means that an LA boundary generates no LU if no roads cross it, for example because a river or another obstacle hinders user movement.
Figure 1 shows an instance of a GSM network. As shown in Figure 1, it is supposed that there is a river between cell 23 and cell 24. So there is no user movement between the two cells, and no LU is generated at their common boundary. According to statistical information from real situations, the road traffic flow in the busy hour is relatively stable, although the number of users on a road cannot be determined at any given time. Understandably, the average road traffic flow can be regarded as the number of boundary crossings, while the number of boundary crossings can be regarded as a measure of LU. To make the model more realistic, the roads in every network are classified into different types according to their traffic flow. To mimic user movements, we suppose that the daily user traffic flow on each type of road follows a uniform distribution on its own interval. The other important assumptions for the model are as follows. Each cell has at least one road, and the mobile users on the road have relatively strong mobility. The user movements in each cell are confined to the roads. So the total number of LUs generated in a network is equal to the total number of LA boundary crossings by mobile users. Based on these hypotheses, the calculation of LU between different LAs can be simplified to counting the traffic flow on the roads that cross LA boundaries. Therefore, the objective of this model is to find a configuration of LAs with the fewest boundary crossings. Intuitively, good solutions of this new model give LA configurations that are better suited to practical application. For simplicity, roads are roughly divided into three types: main road, street, and alley. The main notations used in the following equations are listed as follows:
(i) $N$: the total number of cells in a network,
(ii) $\{1, 2, \ldots, N\}$: the corresponding index set of the cells,
(iii) $M$: the total number of road types,
(iv) $\{i_k, i_{k+1}\}$, where $i_k$ and $i_{k+1}$ represent the indexes of two different cells.
If cell $i$ and cell $j$ are assigned to the same LA, $x_{ij} = 1$; otherwise, $x_{ij} = 0$. $f^{m}_{i_k i_{k+1}}$ represents the traffic flow between cell $i_k$ and cell $i_{k+1}$ on a type $m$ road.
For the entire network system, let $F$ denote the total number of LA boundary crossings in the busy hour. Under the assumptions made above, minimizing $F$ achieves the goal of minimizing the corresponding LU cost. As stated above, we use the total number of user LA boundary crossings in the busy hour as the objective of the model in this paper. The goal of this model is to find a configuration of LAs that makes $F$ as small as possible without violating the constraints. So the model can be described as the minimization of $F$ subject to constraints (3)-(9). For readability, two indicator notations are used: the first equals 1 if cell $i$ and cell $j$ belong to different LAs and 0 otherwise; the second equals 1 if cell $i$ and cell $j$ are assigned to BSC $k$ and 0 otherwise. Here, $P^{*}_{l}$ is the total paging load of LA $l$, and $p_i$ represents the total paging load produced by the mobile users in cell $i$. In this paper, every BS corresponds to exactly one cell, which means that the two concepts can be used interchangeably. $P_{\mathrm{BS}}$ is the paging capacity of a single cell. The paging cost of an LA is determined by the paging cost generated in every cell assigned to this LA. Generally speaking, whether one user is called is independent of whether another is called, so the paging cost generated by each cell is also independent. Clearly, a cell with more calls contributes more paging cost to its LA. The size of an LA depends on its paging capacity, while the LA paging capacity is decided by the capacity of every single cell assigned to this LA.
Those constraints relate to paging capacity and call traffic capacity. The meanings of (3)-(9) are as follows: (3) means that a cell belongs to exactly one MSC; (4) means that a cell is assigned to exactly one LA; (5) means that the paging capacity must not be exceeded; (6) and (7) mean that the call traffic capacity of both the MSC and the LA must not be exceeded, where $t_i$ represents the call traffic capacity of cell $i$ ($i = 1, 2, \ldots, N$), while $T_{\mathrm{MSC}}$ and $T_{\mathrm{LA}}$ represent the traffic capacity of a single MSC service area and of a single LA, respectively; (8) means that the cells in the same LA must belong to the same MSC; (9) means that adjacent cells without boundary crossing should belong to different LAs. Here, $f_{ij}$ is the traffic flow between cell $i$ and cell $j$.
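The objective and the road-connection constraint (9) described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the dictionary layout of the flows, and the sample numbers are assumptions made for illustration only. The LU cost is taken to be the busy-hour flow summed over all road types between cells that end up in different LAs.

```python
def pair_flow(road_flows):
    """Sum the flow over all road types for each unordered cell pair."""
    total = {}
    for (a, b, _road_type), flow in road_flows.items():
        key = (min(a, b), max(a, b))
        total[key] = total.get(key, 0.0) + flow
    return total


def lu_cost(road_flows, la_of):
    """Total flow on roads whose two end cells lie in different LAs."""
    return sum(flow for (a, b), flow in pair_flow(road_flows).items()
               if la_of[a] != la_of[b])


def unconnected_same_la(adjacent_pairs, road_flows, la_of):
    """Adjacent cell pairs with no road flow that share an LA (violating (9))."""
    flow = pair_flow(road_flows)
    return [(a, b) for a, b in adjacent_pairs
            if flow.get((min(a, b), max(a, b)), 0.0) == 0.0
            and la_of[a] == la_of[b]]


# Hypothetical data: (cell, cell, road type) -> busy-hour flow.
road_flows = {(1, 2, "main"): 120.0, (1, 2, "alley"): 5.0, (2, 3, "street"): 40.0}
adjacent_pairs = [(1, 2), (2, 3), (3, 4)]  # cells 3 and 4 are separated by a river
la_of = {1: "A", 2: "A", 3: "B", 4: "B"}

cost = lu_cost(road_flows, la_of)
violations = unconnected_same_la(adjacent_pairs, road_flows, la_of)
```

In this toy configuration only the street between cells 2 and 3 crosses an LA boundary, and cells 3 and 4 share an LA although no road connects them, so the pair is reported as a violation of constraint (9).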

Code Design and Determination of LA.
In mathematics, the four-color theorem states that, given any separation of a plane into contiguous regions, no more than four colors are needed to color the regions so that no two adjacent regions have the same color. Inspired by the four-color theorem, we use four different numbers {1, 2, 3, 4} to encode LAs as the chromosome of the EA. Each number represents a color, so four numbers are enough to represent every LA division. The basic rules are that cells with different codes belong to different LAs, cells with the same code belong to the same LA, and membership in the same LA is transitive. According to this coding method and these rules, the cells can be divided into different LAs using the following steps.
Step 2. Set $h = h + 1$. Begin with an undivided cell $i$ ($1 \le i \le N$) and place cell $i$ in queue $Q$. Define the state of cell $i$ as the cell to be checked.
Step 3. Check all the cells that are adjacent to cell $i$ and have not yet been divided. Append the cells that have the same code as cell $i$ to the end of queue $Q$ in order, and change the state of cell $i$ to checked.
Step 4. If all the elements in $Q$ have been checked, go to Step 5; otherwise, choose the first unchecked cell in $Q$ as the next cell to be checked and repeat Step 3.
Step 5. Assign all the cells in $Q$ to LA $h$. If all cells have been divided, stop; otherwise, go to Step 2.
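The decoding steps above amount to a breadth-first flood fill over adjacent cells that share a code. The following Python sketch shows one way to write it, under the assumptions that the chromosome is a dictionary from cell to code and that the adjacency lists are given; `decode`, `codes`, and `adjacency` are illustrative names rather than names from the paper. The sketch labels a cell as soon as it is enqueued, which gives the same grouping as assigning the whole queue in Step 5.

```python
from collections import deque


def decode(codes, adjacency):
    """Group cells into LAs: adjacent cells sharing a code form one LA."""
    la_of = {}
    h = 0  # index of the LA currently being built
    for start in codes:
        if start in la_of:
            continue  # cell already divided into an LA
        h += 1
        la_of[start] = h
        queue = deque([start])
        while queue:
            cell = queue.popleft()  # next cell to be checked
            for neighbour in adjacency[cell]:
                if neighbour not in la_of and codes[neighbour] == codes[cell]:
                    la_of[neighbour] = h
                    queue.append(neighbour)
    return la_of


# Hypothetical 2 x 3 grid, cells numbered row by row:
#   0 1 2
#   3 4 5
adjacency = {0: [1, 3], 1: [0, 2, 4], 2: [1, 5],
             3: [0, 4], 4: [1, 3, 5], 5: [2, 4]}
codes = {0: 1, 1: 1, 2: 2, 3: 1, 4: 2, 5: 1}
la_of = decode(codes, adjacency)
```

In this example, cells 2 and 4 carry the same code but are not adjacent, so they end up in different LAs. This is why four codes can describe more than four LAs.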

Constraint Violation Adjustment Strategy.
If those constraints are handled directly, the implementation of the EA can be very difficult. So a constraint violation adjustment strategy is adopted in this paper. As we know, every cell must belong to an LA, and LAs are divided in the interior of an MSC. Therefore, the constraints presented in (4), (5), and (9) are necessarily met. Constraint (6) is converted into a violation value $g$ computed over the LAs, where $L$ represents the total number of LAs: $g = 0$ means that the constraint is satisfied, while $g \neq 0$ means that it is not. Constraints (7) and (8) are likewise converted into a violation value $q$: if $q = 0$, the constraint is satisfied; otherwise it is not. For (7), because the MSC paging handling capacity is large enough and the cells that meet the MSC paging capacity are divided preferentially in the division process, LAs are divided without considering the MSC paging constraint in this paper.
For (8), it can be satisfied by other strategies in the process of solving the model.
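A violation value of this kind can be sketched as follows. The exact formulas are not reproduced in the text above, so this is only an assumed form: the violation is taken to be the total amount by which the summed load of each LA exceeds a common capacity, which is zero exactly when the constraint holds. The function name and the sample data are hypothetical.

```python
def capacity_violation(load, la_of, capacity):
    """Total excess of per-LA load over the capacity (0.0 if none is exceeded)."""
    totals = {}
    for cell, value in load.items():
        la = la_of[cell]
        totals[la] = totals.get(la, 0.0) + value
    return sum(max(0.0, total - capacity) for total in totals.values())


# Hypothetical call traffic per cell and LA assignment.
load = {1: 30.0, 2: 50.0, 3: 20.0, 4: 10.0}
la_of = {1: "A", 2: "A", 3: "B", 4: "B"}

g_tight = capacity_violation(load, la_of, capacity=60.0)   # LA "A" carries 80
g_loose = capacity_violation(load, la_of, capacity=100.0)  # nothing exceeded
```

A graded value such as this, rather than a yes/no flag, lets the update step rank infeasible individuals by how far they are from feasibility.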

LA Initialization Strategy.
The initialization of the EA plays an important role in finding the solution effectively. We must make sure that all possible solutions can be generated from the initial population. This paper integrates fuzzy clustering into the initialization. The fuzzy clustering procedure is simple and easy to implement and proceeds as follows. First, the fuzzy similarity matrix is determined. Second, the fuzzy equivalence matrix is calculated. Finally, a threshold is applied to the equivalence matrix to obtain the equivalence classes. The traffic flow between cell $i$ and cell $j$ is normalized to give the similarity coefficient $r_{ij}$, and these coefficients form the fuzzy similarity matrix $R = (r_{ij})$. Considering the different distances between cells, we define an operator $\circ$ to calculate the fuzzy equivalence matrix: $R \circ R = (t_{ij})$,
where $t_{ij} = \max_{k = 1, 2, \ldots, N} (r_{ik} \cdot r_{kj})$. $R \circ R$ can be abbreviated as $R^2$. We can apply the squaring method to $R$ and obtain $R \to R^2 \to R^4 \to \cdots \to R^{2^k} \to \cdots$. It has been proved that, after a finite number of operations, there must be a $k \in \mathbb{N}$ satisfying $R^{2^k} = R^{2^{k+1}}$. Then the fuzzy equivalence matrix $R^* = R^{2^k} = R^{2^{k+1}}$ is obtained. Every individual in the population represents a configuration of LAs, and the population is initialized using the following steps.
Step 2. Find the cells that should not be divided into the same LA according to (8). The matrix $S_{2 \times n}$ is used to store the two sequences of such cells. For example, cell $s_{1k}$ and cell $s_{2k}$ cannot be divided into the same LA. Here, $n$ is the total number of cell pairs that cannot be divided into the same LA.
Step 3. Calculate the equivalence matrix $R^*$ based on the similarity matrix $R$.
Step 4. If $k > n$, stop; otherwise, compare, within a row of $R^*$, the element in column $s_{1k}$ with the element in column $s_{2k}$. The element with the smaller value is reset to 0. If no constraints are violated, divide as many as possible of the cells whose entries in that row are relatively large and nonzero into the same LA, and set $k \to k + 1$.
According to the steps above, it can be ensured that cells with a large number of boundary crossings between them are divided into the same LA. What is more, two cells that should not be assigned to the same LA are prevented from being divided into the same LA.
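The squaring procedure used to obtain the fuzzy equivalence matrix in the initialization can be sketched in plain Python. The composition below follows the max-product form given in the text; the classical fuzzy transitive closure uses max-min instead, so the inner product here should be read as an assumption about the intended operator. The matrix is assumed to be symmetric with entries in [0, 1] and ones on the diagonal, and the function names and sample matrix are illustrative.

```python
def compose(r):
    """R o R with t[i][j] = max over k of r[i][k] * r[k][j]."""
    n = len(r)
    return [[max(r[i][k] * r[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def fuzzy_closure(r, tol=1e-12, max_squarings=64):
    """Square R repeatedly until R^(2^k) stops changing, then return it."""
    n = len(r)
    for _ in range(max_squarings):
        r2 = compose(r)
        if all(abs(r2[i][j] - r[i][j]) <= tol
               for i in range(n) for j in range(n)):
            return r2
        r = r2
    raise RuntimeError("squaring did not converge")


# Hypothetical normalized flows between three cells.
similarity = [[1.0, 0.8, 0.0],
              [0.8, 1.0, 0.5],
              [0.0, 0.5, 1.0]]
r_star = fuzzy_closure(similarity)
```

In this example, cells 0 and 2 have no direct flow, but they receive a nonzero entry in the closure through cell 1, which is what allows indirectly linked cells to be grouped during initialization.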

The Strategy of Updating Population.
An evolutionary algorithm is implemented to solve the LAP problem in this paper. Each individual represents a configuration of LAs. A population size of $K$ means that there are $K$ different configurations of LAs in total. The crossover and mutation of every individual in the population are carried out with the best individual in the current population to generate new individuals (new configurations of LAs). To guide the population and accelerate its convergence, two kinds of updates are used: the best individual (optimal configuration of LAs) update and the population update. In the best individual update, every newly generated individual is compared with the current best individual, first by constraint violation value and then by LU. In the population update, the constraint violation values of every individual in the parent and offspring populations are calculated using the following steps.
Step 1. The violation values $g_k$ and $q_k$ of each new individual $k$ are calculated according to (11) and (12).
Step 2. Set $v_k = (g_k / g_{\max}) + (q_k / q_{\max})$, where $g_{\max}$ and $q_{\max}$ are the maximal constraint violation values of the parent population and offspring population, respectively. Thus, the values $v_k$ of the $K$ offspring individuals and the $K$ parent individuals can be calculated. If the number of individuals with $v = 0$ (satisfying all the constraints) is less than $K$, select the $K$ individuals with the lowest $v$ from the new population and parent population as the population of the next generation. If the number of individuals with $v = 0$ is larger than $K$, select the $K$ individuals with the lowest LU cost from the individuals with $v = 0$ as the population of the next generation.
The best individual in the current population is kept up to date by comparing it with every newly generated individual and applying the best individual update described above. Crossover and mutation with the best individual make the best use of the information it carries. In the process of dealing with constraints, the normalization used in this algorithm puts different constraints on a common scale. It is a feasible way to avoid the incomparability of different constraints. As a consequence, the new population keeps the high-quality individuals from both the parent population and the new population.
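The population update can be illustrated with a short Python sketch. Several details are assumptions rather than statements from the paper: individuals are represented as dictionaries with hypothetical keys `g`, `q`, and `lu`; the maxima are taken over the combined parent and offspring pool; a zero maximum is treated as "no violation"; and feasible individuals are ranked by LU cost.

```python
def normalized_violation(g, q, g_max, q_max):
    """v = g / g_max + q / q_max, with a zero maximum contributing 0."""
    part_g = g / g_max if g_max > 0 else 0.0
    part_q = q / q_max if q_max > 0 else 0.0
    return part_g + part_q


def next_generation(parents, offspring, size):
    """Keep `size` individuals: feasible ones by LU cost, else lowest violation."""
    pool = parents + offspring
    g_max = max(ind["g"] for ind in pool)
    q_max = max(ind["q"] for ind in pool)
    scored = [(normalized_violation(ind["g"], ind["q"], g_max, q_max), ind)
              for ind in pool]
    feasible = [ind for v, ind in scored if v == 0]
    if len(feasible) >= size:
        return sorted(feasible, key=lambda ind: ind["lu"])[:size]
    scored.sort(key=lambda pair: (pair[0], pair[1]["lu"]))
    return [ind for _, ind in scored[:size]]


# Hypothetical populations.
parents = [{"g": 0.0, "q": 0.0, "lu": 50.0}, {"g": 2.0, "q": 0.0, "lu": 10.0}]
offspring = [{"g": 0.0, "q": 0.0, "lu": 30.0}, {"g": 1.0, "q": 4.0, "lu": 5.0}]

keep_two = next_generation(parents, offspring, size=2)
keep_three = next_generation(parents, offspring, size=3)
```

With two slots, both feasible individuals survive, ordered by LU cost. With three slots, the infeasible individual with the smaller normalized violation fills the remaining place, even though the other infeasible one has a lower LU cost.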

Crossover and Mutation.
The crossover and mutation operations with the best individual play an important role in exploring good individuals in the solution space. The mutation and crossover methods are described in detail as follows.
Generate a random number $r \in (0, 1)$. If $r > 0.5$, an intermediate offspring is generated from $x_k$ and $x^*$ by single-point crossover; otherwise, the intermediate offspring is generated by randomized crossover. After crossover, the new individual is generated as follows: mutate the code of a cell of the intermediate offspring at random with probability 0.5, and then randomly choose an adjacent cell for mutation. Make sure that the two cells have the same code after mutation. In this way, a new individual is generated.
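One possible reading of these operators is sketched below in Python. It rests on several assumptions: chromosomes are lists of codes indexed by cell, "randomized crossover" is interpreted as uniform crossover (each gene taken from either parent at random), and mutation recodes one random cell and then copies its new code to one random neighbour. The function and variable names are illustrative.

```python
import random


def crossover(x, best, rng):
    """Single-point crossover if r > 0.5, otherwise uniform crossover."""
    if rng.random() > 0.5:
        point = rng.randrange(1, len(x))
        return x[:point] + best[point:]
    return [rng.choice((a, b)) for a, b in zip(x, best)]


def mutate(child, adjacency, rng, p=0.5):
    """With probability p, recode a random cell and give a neighbour the same code."""
    child = list(child)
    if rng.random() < p:
        cell = rng.randrange(len(child))
        child[cell] = rng.choice((1, 2, 3, 4))
        neighbour = rng.choice(adjacency[cell])
        child[neighbour] = child[cell]
    return child


# Hypothetical 2 x 3 grid, cells numbered row by row.
adjacency = {0: [1, 3], 1: [0, 2, 4], 2: [1, 5],
             3: [0, 4], 4: [1, 3, 5], 5: [2, 4]}
x = [1, 1, 2, 2, 3, 3]
best = [4, 4, 4, 1, 1, 1]

rng = random.Random(0)
intermediate = crossover(x, best, rng)
new_individual = mutate(intermediate, adjacency, rng)
forced = mutate([1, 2, 3, 4, 1, 2], adjacency, random.Random(1), p=1.0)
```

Copying the mutated code to a neighbour means that a mutation never leaves a freshly recoded cell as an isolated single-cell LA, which is consistent with the requirement that the two cells share a code after mutation.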
3.6. The Framework of the Algorithm. The algorithm can be implemented by the following steps.
Step 1. Set the population size $K$, the generation counter gen = 0, and the maximum number of generations $G_{\max}$. Initialize the population and encode it according to the method described in Section 3.
Step 2. Select the best individual $x^*$.
If a newly generated individual violates (9), the code repair process needs to be conducted. Such a violation means that cell $i$ and cell $j$ have the same code although no user crosses the boundary between them. The violation has three situations, (a), (b), and (c). In situation (c), both cells have adjacent cells with different codes. Identify the LAs where the adjacent cells are located, and calculate the sum of the traffic flow between the two cells and these LAs. The cell with the larger traffic flow is retained, while the cell with the smaller traffic flow is divided into a new LA using the method described in (b).
Step 4. New individuals are processed according to the coding method in Section 3. Update the best individual and the population using the update method described in Section 3.
Step 5. If gen $\le G_{\max}$, go to Step 2; otherwise, stop.
(e) The population size is $K = 100$, and the maximum number of generations is $G_{\max} = 1000$.
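The overall loop of the framework can be sketched as a small, generic Python skeleton. It is a toy illustration and not the LAP algorithm itself: the objective simply sums the codes of a chromosome, the constraint violation is fixed at zero, and the repair step is omitted. `evaluate` returns a (violation, cost) pair so that comparisons consider the violation first and the cost second, mirroring the best individual update. All names are hypothetical.

```python
import random


def evolve(population, evaluate, make_child, max_gen, rng):
    """Generic loop: breed every individual with the best one, then update."""
    best = min(population, key=evaluate)
    for _ in range(max_gen):
        offspring = [make_child(x, best, rng) for x in population]
        best = min(offspring + [best], key=evaluate)  # best individual update
        pool = population + offspring
        population = sorted(pool, key=evaluate)[:len(population)]  # population update
    return best


def evaluate(x):
    """Toy stand-in: (constraint violation, cost); lower is better."""
    return (0, sum(x))


def make_child(x, best, rng):
    """Uniform crossover with the best individual, then recode one gene."""
    child = [rng.choice((a, b)) for a, b in zip(x, best)]
    child[rng.randrange(len(child))] = rng.choice((1, 2, 3, 4))
    return child


rng = random.Random(0)
population = [[rng.choice((1, 2, 3, 4)) for _ in range(6)] for _ in range(10)]
start = min(evaluate(x) for x in population)
best = evolve(population, evaluate, make_child, max_gen=50, rng=rng)
```

Because the best individual is only ever replaced by one that compares lower, its evaluation never gets worse from one generation to the next.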

Computational Experiments and Comparison
The simulation program has been developed within the MATLAB programming environment.

Simulation Experiment and Analysis.
For each network, we randomly generate 10 groups of the three model input quantities according to the experimental hypothesis to simulate the real situation. It is known that the quality of the answer found by an EA depends heavily on the number of iterations that the algorithm undergoes. However, it is not easy to define a termination condition in advance. To compare on equal terms, the compared algorithms are executed until the maximum number of generations is reached. The costs of each simulation for the two given networks are shown in Tables 3 and 4, respectively. For simplicity, only 3 groups of corresponding configurations of the 5 × 5 network are shown in Figures 4-6. The simulation results in Tables 3 and 4 show that the novel algorithm outperforms the general EA and SA when solving the proposed model. Figures 4-6 show that cells with large traffic flow between them are assigned to the same LA. This agrees with the general rules established by previous researchers.

Conclusion
In this paper, we propose an evolutionary algorithm with a novel coding method based on the four-color theorem to solve the NP-hard LAP problem. Computer simulation shows that this cell-information-based algorithm for the LAP problem can significantly reduce signaling costs.
In the process of initialization, we use a fuzzy clustering method based on information from the real situation to enhance exploration. The new coding method based on the four-color theorem increases the efficiency of the proposed algorithm. The constraints are handled during the update process using different strategies. The comparisons of the experimental results show the practical applicability of the algorithm.

Figure 1: The LU between cells in LA.
Basic rules of the coding method:
(a) cell $i$ corresponds to a digital code $c_i$ ($i = 1, 2, \ldots, N$), $c_i \in \{1, 2, 3, 4\}$;
(b) cells with different codes belong to different LAs. That is, if $c_i \neq c_j$, then cell $i$ and cell $j$ belong to different LAs ($i, j = 1, 2, \ldots, N$);
(c) cells with the same code belong to the same LA. That is, if $c_i = c_j$, then cell $i$ and cell $j$ belong to the same LA ($i, j = 1, 2, \ldots, N$);
(d) if cell $i$ and cell $j$ belong to the same LA and cell $j$ and cell $k$ belong to the same LA, then cell $i$ and cell $k$ must belong to the same LA.

The two kinds of updates:
(a) the best individual update: in every generation of the evolution, every individual newly generated by crossover and mutation is compared with the best individual in the current population. First, their constraint violation values are compared. The individual with the lower constraint violation value is set as the new best individual. If the constraint violation values of the two are equal, the individual with the lower LU is set as the new best individual;
(b) the population update: the constraint violation values of every individual in both the parent population and the offspring population are calculated using Steps 1 and 2 of the population update.

Situations (a) and (b) of the code repair process:
(a) cell $i$ and cell $j$ have the same code as their adjacent cells. Then, we take the two cells as centers and divide the LA into two new LAs. According to the fuzzy equivalence matrix, the cells having large traffic flow with cell $i$ are divided into one LA, and the cells having large traffic flow with cell $j$ are divided into another LA;
(b) only one of the two cells, say cell $i$, has adjacent cells with a different code. Check the adjacent cells of cell $i$, and identify the LA in which the cell with the largest flow is located. Then, cell $i$ is divided into that LA.

Table 3: EA results of LAP for the given 5 × 5 network.

Table 4: EA results of LAP for the given 5 × 6 network.