Compromise Rank Genetic Programming for Automated Nonlinear Design of Disaster Management

. This paper presents a novel multiobjective evolutionary algorithm, called compromise rank genetic programming (CRGP), to realize a nonlinear system design (NSD) for disaster management automatically. This NSD issue is formulated here as a multiobjectiveoptimizationproblem(MOP)thatneedstooptimizemodelperformanceandmodelstructuresimultaneously.CRGP combinesdecisionmakingwiththeoptimizationprocesstogetthefinalglobalsolutioninasinglerun.Thisalgorithmadoptsanew rankapproachincorporatingthesubjectiveinformationtoguidethesearch,whichranksindividualsaccordingtothecompromise distanceoftheirmappingvectorsintheobjectivespace.Weproveherethattheproposedapproachcanconvergetotheglobal optimumundercertainconstraints.ToillustratethepracticalityofCRGP,finallyitisappliedtoapostearthquakereconstruction managementproblem.ExperimentalresultsshowthatCRGPiseffectiveinexploringtheunknownnonlinearsystemsamonghuge datasets,whichisbeneficialtoassistthepostearthquakerenewalwithhighaccuracyandefficiency.Theproposedmethodisfound tohaveasuperiorperformanceinobtainingasatisfiedmodelstructurecomparedtootherrelatedmethodstoaddressthedisaster managementproblem.


Introduction
Natural disasters occurred more frequently during these years.Most of them caused a large amount of infrastructure damage, heavy casualties, and financial loss every year, such as earthquake, floods, and typhoon.For example, Sichuan Earthquake left at least 5 million people without housing and government had to spend billions over the following years to rebuild the ravaged areas [1].In order to avoid the enlargement of economic and mental damage for the people and society, disaster management is therefore a growing need for many governments.The important and complex task of disaster management is to make an efficient reconstruction strategy that can rescue the victims on time and rebuild the ravaged areas efficiently with the limited resources and finance support.Formerly, several qualitative analyses were pointed out for certain special aspects of the reconstruction strategy, such as conceptual decision support for disaster mitigation [2,3], rescue planning of telecom power [4], and optimized strategy for resource allocation [5].But, few mathematics models for disaster management currently exist in the literature, and modeling the reconstruction strategy remains an important open research topic.
From the application view, [6] pointed out that some prediction models for "in-advance reconstruction strategy" were demanded which can reduce inevitable delays in the recovery process.In addition, [7] hints that the speed and quality of renewal process interact with the assignment of limited resources (such as experts, medical teams, and finance support).These underlying relationships can be modeled from the "Big Data" of disaster management for prediction.Therefore, it is an urgent problem to design the prediction models for disaster management to assist in making the efficient reconstruction strategy in advance.
Since nonlinear models are empirical for the complex process modeling, such as industrial control systems [8,9], biomedical data modeling [10,11], and chemical process systems [12], such underlying models for disaster management 2 Mathematical Problems in Engineering can be converted as a problem of nonlinear systems design (NSD).This NSD problem involves determining the structure and estimating the parameters of the nonlinear models embedded in the disaster management.Traditionally, solving NSD problems focuses on parameter estimation, while the structures are usually assumed to be known or approximated by some universal approximators such as neural networks [13].However, actually no a priori information can be understood about the nonlinear models for disaster management, and the disaster datasets are usually incomplete and inconsistent [14].In this case, this approximated approach has its limitation in the sense that it usually uses a very complex model to describe a maybe simple underlying function.Then, the unnecessary number of parameters posts a problem in estimating the parameters and contradicts the basic principle in designing a nonlinear system.Moreover, the related decision variables in the model, which are significant to express the system behavior, are generally difficult to select among a lot of features.In addition, the generalization performance of an approximated model is usually not guaranteed.Therefore, it is a significant challenge to determine an exact structure of each nonlinear model for disaster management and design these nonlinear models with sparse data and little a priori information.
To discover the nonlinear model structure, genetic programming (GP) is a powerful tool by using tree chromosome representation and the crossover operator [15,16].It searches in the functional space to determine the optimal structure of a nonlinear system [17].Generally, GP is used to coevolve model structure and parameters by optimizing a single objective, such as prediction error.However, this approach does not always converge to a global optimal structure since the functional space is usually too large to search for [18].It is observed that spurious terms and dependent variables are the main causes of this problem.Evolution of the spurious terms and dependent variable results in rapid growth of tree sizes in GP.It tends to cause the algorithm stagnating and leads to the phenomenon of bloating [19].The simple and common way to prevent bloating for GP is to limit the maximum tree depth of all the individuals and the maximum number of nodes [20].But for our problem with little a priori information, it is hard to determine the appropriate depth and the number of nodes.The other approaches focus on controlling the offspring trees growth or targeting redundant nodes by improving the crossover and selection operators [21,22].Our proposed approach mainly uses this idea to resolve bloating in GP.Besides, it is also a good way for controlling bloating to use a fitness penalty (parsimony pressure) that is biased against the individuals with more complex structure [23].Luke presented that all of these bloating control methods can perform well without reducing their ability [24].Considering that the NSD problem for disaster management should have a practical optimum solution, yet an over-complex model results in bad comprehensibility; thus model complexity should be restrained at a low level.Thus, model complexity should be optimized which is also considered as a fitness penalty to control bloating.Since a poor model structure may lead to poor parameter estimation, model accuracy and complexity should be optimized simultaneously in solving the NSD problem.Since the maximization of model accuracy is not the same as the minimization of model complexity, our NSD problem for disaster management is thus modeled as a multiobjective optimization problem (MOP) and solved by a multiobjective evolutionary algorithm (MOEA) cooperating with GP.
The MOEAs mainly include aggregating method and Pareto-based approach [25].For the aggregating method, predefining weights or a priori information of the objectives is required to convert the MOP to a single-objective optimization.This approach is easy to implement, but improper a priori information might lead to poor optimization result [25].The Pareto-based approach obtains an optimal solution in two steps: Pareto optimality process and multicriteria decision making process [26].This approach is usually time consuming in the first step to achieve the Pareto optimal set which consists of many redundant solutions.The final solution is selected from the Pareto optimal set using the goal or preference information from decision makers in the second step.Although several algorithms have been proposed to improve the validity of Pareto optimal set [27][28][29], little efforts have been put on incorporating the subjective information with Pareto optimality to reduce the search space and improve the efficiency.Since the NSD for disaster management requires a single solution instead of the whole Pareto optimal set, incorporating the subjective information is important to discover the exact nonlinear model for each underlying relation in disaster management.Because we found that the multiple objectives of the NSD for disaster management have different priority for ranking in different situations (and this ranking strategy can be adjusted according to the subjective probability theory), we here propose a novel multiobjective GP algorithm called compromise rank genetic programming (CRGP), to address the NSD for disaster management by combining the subjective information with Pareto optimality.
The proposed CRGP aims to uncover the exact nonlinear structure and parameter estimation of the models embedded in the incomplete disaster datasets and tries to combine with the subjective probability theory to reduce the computational complexity.No choosing of goals or weights information is required for CRGP, which makes it eliminate the error due to the weights mistake.The proposed CRGP utilizes the relative distance of chromosomes to guide the search in the Pareto optimality process and to obtain the final compromise solution in a single run.This characteristic is beneficial in reducing the evolution probability of the model structures composed by redundant terms and unimportant features.It can resolve the bloating problem in GP evolution and improve the convergence rate.To evaluate the effectiveness and practicality of CRGP for our problem, the proposed approach is then applied to a practical problempost-earthquake reconstruction in disaster management.
The paper is organized as follows.Section 2 presents the formulation of NSD based on MOP.Section 3 describes the proposed CRGP algorithm.Convergence analysis of CRGP is given in Section 4. Section 5 reports the application of CRGP to disaster management problem using real postearthquake reconstruction data.The method is compared with the traditional single-objective GP method, aggregating MOGP method, and Pareto-based MOGP approach.Concluding remarks are given in Section 6.
Traditional nonlinear system identification usually assumes x * and  * to be determined a priori, and the main task is to determine c * by some parameter estimation techniques.One popular approach is the minimum mean square error (MSE) method; that is, where ŷ is the estimated output and R contains all the real numbers.In practice, the a priori information about x * and  * is not usually available.A general formulation for NSD based on minimizing MSE can then be expressed by where G is the functional space that contains all possible nonlinear functions, X n is the input sequence space, and  1 is the MSE between the true and estimated output.Since input variables might be correlated, different combination of input variables might result in similar level of MSE.Therefore, the order  of the appropriate input set  * should also be minimized; that is, In addition, different model structures can have the same MSE if there are redundant terms.Thus, the nonlinear function  * should be in the most parsimonious form.The complexity of model structure is then considered as another measure of  * here.Two factors are considered to measure the complexity of model structure.One is the total number of terms in  to avoid redundancy and another is the ratio of the number of nonlinear terms to the total number.That is, where () is the number of terms in  and  nonlinear () is the number of nonlinear terms.Since the three optimization problems, (3), (4), and (5), are correlated, they should be solved simultaneously for a consistent solution.For example, the optimal input set x * is determined by minimizing  1 and  2 together.At the same time, an improper input set would cause divergence in optimizing  3 .In this paper, we propose formulating NSD as a MOP given below: One popular approach to MOP is the evolutionary multiobjective optimization (EMO).Traditional EMO approaches [25] convert MOP to a single-objective problem by weighting.Taking the sum of weighted objectives, the single fitness function can be expressed as: where   are the weights.The optimal solution can be obtained by ranking the individuals in terms of the singleobjective value.This approach has some difficulties to solve the NSD of disaster management problem.First, the values of those weights are hard to be defined a priori, and different weightings can result in very different solutions.Thus, several runs with different weight combinations are usually required to obtain various Pareto-optimal solutions [25], but they still cannot ensure the exploration of the real solution except for higher computation cost.Second, our model ( 7) is usually nonconvex, because the epigraph of the function  3 is usually not a convex set with respect to all the candidate structures of the unknown function "." To overcome these difficulties, the Pareto-based EMO approaches are usually considered.Compared to the objective space of a single-objective optimization, the objective space for Pareto-based EMO is usually more complex.It is a challenge to determine how to evaluate the individuals according to several inconsistent objectives.Pareto-based EMO [30] are usually implemented in two steps: Pareto optimality process and multicriteria decision making process.The first step ranks individuals by their nondominance degree and then achieves the Pareto-optimal set as a vector which is composed of a set of "nondominated" solutions for all the objectives.In the second step, the final solution is selected from the Pareto-optimal set according to the goal or preference information provided by decision makers [31].The definition of "dominance" is shown as below.
Definition 1 (Pareto dominance [26,32]).For a minimization optimization problem, given vector u and vice versa for a maximization optimization problem.Generally, the Pareto dominance rank method assumes that all objectives have an equal weighting for optimization.However, this assumption does not apply to the NSD of disaster management problem.Actually, the three objectives play different roles in determining the final best nonlinear system.In addition, two steps of the general Pareto-based methods cost much time to calculate many redundant Pareto-optimal solutions which is not required for our disaster management problem.To improve these issues, CRGP is proposed here to solve for the NSD MOP of disaster management problem.

Compromise Rank Genetic Programming
CRGP is proposed with the purpose of incorporating decision making process with Pareto optimality process to obtain the final compromise solution in one process.The main difference between CRGP and the general Pareto-based EMO methods is the rank approach of individuals according to multiple inconsistent objective functions.

Compromise Rank Approach.
Assume that there are  objectives ( ≥ 1).These objective functions map individuals in the variable space to vectors in the objective space.The individuals are ranked by evaluating the -dimensional vector in the objective space.Assume u and k be two different dimensional vectors in the objective space.  and V  (1 ≤  ≤ ) denote the th objective values of the u and k, respectively.For the NSD MOP problem for disaster management, there are three objectives as (7); that is,  = 3, and they are always positive; in fact, in this paper we only consider that all objective values are positive; that is, ∀,   > 0 and V  > 0.
We define the relative distance from u to k relative to the beginning point u through the th objective space as Δ  (u, k); that is, The sign of Δ  (u, k) is determined by the magnitude of   and V  , and the magnitude of Δ  (u, k) indicates the relative increment or decrement of the th objective when k is compared to u.For example, the case that Δ  (u, k) is negative implies that   is smaller than V  .For minimization, u is then considered to perform better than k for the th objective.Assume the term "rank" to measure the performance of every individual; the smaller the rank, the better the optimal solution.Thus, comparing the relative distances of different objective vectors, we have the following situations: (1) If ∀ Δ  (u, k) > 0, for minimization u should be ranked higher than k.(2) If ∀ Δ  (u, k) < 0, for minimization u should be ranked lower than k.
(3) If ∃ Δ  (u, k) > 0, and ∃ Δ  (u, k) ≤ 0, u and k are considered as having the same rank based on Pareto dominance sorting method (see Definition 1).As shown later, unlike Pareto dominance sorting, compromise rank method attempts to consider relative differences of all the objectives and to assign different ranks to u and k in this case.
(4) If ∀ Δ  (u, k) = 0, the rank of u and k should be the same.
Here, we analyze the characteristics of the objective vectors (i.e., u, k) in different situations for our MOP problem (7) and subsequently propose a new rank rule to solve it.Obviously, when the relative distance of u and k falls into the first situation and second situation, the rank can be determined by the sum of all parts Δ  (u, k).For the third situation, the relationship between the absolute values of Δ  (u, k) can be divided into three cases: In the case (a), the NSD MOP has solutions with  2 and  3 close to each other, but their MSE values, that is,  1 , can be quite different.For example, consider u corresponding to the model structure  1 +  2 and k corresponding to the model structure  1 *  2 ; their  2 and  3 values are close.However, their  1 values are quite different; that is,  1 and V 1 are 0.004 and 5 * 10 −7 , respectively.Actually in the case (a) the first objective has a higher priority; therefore the rank of u and k in this case would be in agreement with the sign of the sum of Δ  (u, k).In the case (b), the MSE values of different models are similar, and the difference between  1 and V 1 can be ignored.On the other side, in the case (b) the first objective has a lower priority.So, in this case the conclusion is also achieved in which the rank of u and k would be in agreement with the sign of the sum of Δ  (u, k).In the case (c), u and k are considered to have the same rank, yet it occurs rarely.Thus we found that the sum of Δ  (u, k) can adaptively reflect the preference information of objectives of NSD at different cases.This relationship is a novel character the proposed approach uses to apply to the NSD MOP for disaster management, unlike other approaches which need to know the preference information a priori.
Therefore, the proposed approach defines "compromise distance" of two vectors in a -dimensional objective space as below.
Definition 2 (compromise distance).The compromise distance from u to k is defined as the sum of all the relative distances of every objective from u to k in the objective space.That is, Compromise rank approach is then proposed to rank the compromise distance between two vectors in the objective space to guide the Pareto optimality search process.Assume that all objective values are positive; the rules for compromise distance ranking are given as follows: (1) If  uk > 0 and  ku < 0, then rank(u) > rank(k).
Additionally, these rules are Pareto-compliant which is proved later in Theorem 3.
Figure 1 illustrates the ranking scheme based on Pareto dominance sorting and compromise rank approach, respectively.For the two vectors with rank 1 in Figure 1(a), their relative distance of the first objective is smaller than that of the second objective.Thus, the compromise rank approach will rank these two vectors in agreement with the order of the second objective as shown in Figure 1(b).Consider any vector with rank 1 and any vector with rank 2 in Figure 1(a); their relative distance of the first objective is much bigger than that of the second objective.Thus, the compromise rank approach will rank them with more levels as shown in Figure 1(b).
The implementation of compromise rank approach is similar to the fast nondominated sorting approach [28].First, in one population, each individual is evaluated by pairwise comparison of compromise distance with other individuals.According to the compromise rank rules, the rank relations among individuals are recorded.For instance, individuals with rank higher than the compared individual are reserved in a set.The number of individuals with rank lower than the compared individual is also kept for record.Then, the individuals with rank lower than any other individuals are assigned with rank 1, and they are considered as the best solutions in this population.After that, according to the recorded rank relations of individuals, those remaining in this population are assigned with corresponding rank in order.

Main Loop of CRGP to Solve the NSD Problem.
By incorporating compromise rank into GP, CRGP can be divided into three steps: initialization, evaluation objectives, and individuals evolution as shown in Figure 2.

Initialization.
GP uses a tree to express a nonlinear system structure as an individual.All the internal nodes are mathematical operators, and the leaf nodes are the input variables.GP defines a function set that includes all the possible mathematical operators and a terminal set that involves all the possible input variables.For example, when a polynomial system is considered, the function set is defined as {+, * } and the terminal set contains all the input variables ( 1 ,  2 , . . .,   ).The tree structure is encoded as a string as in Figure 3.In order to solve model structures for consistency with semantic restraints, appropriate values within the function and terminal set of GP should be determined to restrict the search space.
The initial trees are selected by the ramped half-and-half method [17].The maximum depth of a tree is predefined.Half of the initial tree is a full tree with the maximum depth, the leaf nodes of which are randomly chosen from the terminal set and the other nodes are chosen from the function set.The other half of the initial tree has variant depth of no more than the maximum depth, all the nodes of which are selected from the terminal set or the function set at random.It is noted that the tree might be nonsensical when an internal node is determined as an input variable from the terminal set.Hence, every individual tree is checked by a model certification Model structure: procedure before calculating the fitness values of it.In this procedure, all the internal nodes are checked to figure out it is a mathematics operator or a symbolic variable.If an internal node is a symbolic variable, the subtree from this internal node down to the leaf nodes is deleted.An example is shown in Figure 4.This procedure is of particular importance for calculating the complexity of the nonlinear model in the following evaluation step.

Evaluation of Multiple
Objectives.After generating the individual trees of a population, the performance of a tree is evaluated by calculating the fitness value according to the multiple objectives.It should be noted that evolution operations will make new trees grow larger and have more insignificant subtrees.Before calculating the multiple objectives of each tree, we adopt orthogonal least squares (OLS) method [33] to eliminate the redundant subtrees of the individuals that contribute less to improving the model accuracy.For example, if a polynomial system is considered, an individual tree is divided into  subtrees in terms of the plus operators.OLS calculates the error reduction ratios of those subtrees.Let the input regression matrix Z = [z 1 , z 2 , . . ., z  ], where z 1 , z 2 , . . ., z  denotes the function term expressed by each subtree.The expected output vector can be expressed as where   is the parameter of the th subtree.Since these unknown parameters are linear, least squares (LS) method is employed to estimate c; that is, The first step of OLS is to make QR decomposition on Z. Suppose Z = Q * R, where Q is an orthogonal matrix and R is an upper triangular matrix.Then the positive diagonal matrix D = Q  Q is obtained and assistant parameter vector p for the output y is defined as where p is the corresponding solution vector of OLS.Thus, the mean square of output can be expressed as where e is the error between the expected and real outputs.Therefore, the error reduction ratio of every subtree z  ( = 1, 2, . . ., ) is given by Whether a subtree is eliminated is determined by comparing the error reduction ratio of the subtree with a threshold.When the error reduction ratio of a subtree is less than the threshold, the subtree will be eliminated; otherwise, the subtree will be kept.
After deleting the redundant subtrees, parameters of every subtree are obtained by (13).Consequently, the three objective functions of NSD are calculated tree by tree.The values of  1 ,  2 , and  3 form the -dimensional vectors in the objective space.The ranks of these vectors are calculated by the compromise rank approach and are then assigned as the fitness values of the corresponding individual trees.The smaller the fitness value is, the better a tree is.

Evolutionary Operations.
After evaluating the fitness values of individual trees, new trees are generated by evolutionary operations, including selection, reproduction, crossover, and mutation.First, the selection operation is carried out to select individuals from the population.These individuals form a mating pool which is used for reproduction, crossover, and mutation.We define the size of the mating pool as the half of the population size.Thus, tournament selection method is operated to compare any two individuals and the individual with better fitness value is selected.After that, individuals in the mating pool are separated into three parts with the probability to apply for reproduction, crossover, mutation, respectively.The individuals with better fitness values are reproduced as the new individuals of the next generation.In the crossover operation, two individual trees are selected, and a subtree of each individual is selected randomly and exchanged.For example, in Figure 5, the diagonal mark means the crossover point of parent 1 tree and parent 2 tree.The subtrees of " 2 " in parent 1 and " 1 " in parent 2 are exchanged in the child 1 and child 2 trees.In the mutation operation, a subtree of the individual tree is randomly selected and replaced by a new subtree.This new subtree is generated using the same method as the initial tree.After crossover and mutation, certain internal nodes might not be mathematical operators, which violate model construction rules, so the model certification procedure described before will be adopted after evolutionary operations.At last, an elitism mechanism, named competition strategy, is employed after applying the evolution operators to let the better individual survive and to avoid the loss of good genes due to random effects.This mechanism constructs a combined population with parents and offspring population.Every individual in the combined population is assigned a rank based on the compromise rank approach.The individuals with lower ranks are selected to construct the new population.When the new population grows with the same size of the former population, this work is finished.This elitism mechanism has been demonstrated for its convergence property [34,35] and is successfully applied to the genetic algorithm to solve for many real applications [36].
The proposed CRGP is summarized as follows: (1) Initialize the function set and terminal set of GP and set the parameters of the algorithm, such as the population size, generation number, and crossover probability.* * * * x 2 x 2 x 3 (2) Evaluate the compromise rank value of every tree in a generation using fast compromise rank approach.
(3) Generate a mating pool using tournament selection method and the candidate tree with a lower rank is selected in the tournament comparison.Then, assign every tree in the generated mating pool with an evolution operation among three different operations: elite, crossover, and mutation, based on operation probability.
(4) Realize the corresponding operations of parent trees to generate their children trees, and calculate the multiobjective vector of each children tree.
(5) Combine all the parent trees with children trees as an intermediate population and rank them by compromise rank approach.Put the trees with lower ranks in the next generation set until the population set is met.

Theoretic Analysis of Compromise Rank Genetic Programming
In order to ensure the solution accuracy of CRGP, the convergence properties of CRGP are discussed in this part.We present two theorems which prove that CRGP can converge to the actual Pareto Front.

Theorem 3. CRGP converges to Pareto optimum (Pareto Front
* ) with probability one; that is, where   contains vectors in the objective space at the th generation.
Proof.It is proved that an EA using Pareto-based ranking and a monotonic selection algorithm converges to the global optimum of a MOP with probability one [37].
Because compromise rank results obey Pareto dominance sorting, CRGP obeys Pareto-based ranking.In addition, this approach uses elitism mechanism during the selection process, and the rank of any individual in the offspring population is lower than or equal to the minimum rank in the parents population [38]; that is, where ( + 1) and () denote the objective vectors in the parents and offspring generation, respectively.It is proved that the selection algorithm is monotonic.In addition, the evolutionary procedure of CRGP has the mutation operator that can supply irreducibility property.That means the CRGP algorithm satisfies all the optimum convergence criteria.
Therefore, CRGP converges to the Pareto optimum with probability one after a finite number of iterations.This part aims to discuss the applicability constraints that should be satisfying when CRGP converges to the best solution on the Pareto Front.CRGP introduces compromise distance of vectors in the objective space to substitute the role of preference information in the optimization process.Thus, the constraints are related to the distribution of vectors in the objective space.When the preference information of decision making can be performed correctly by the relative distances of vectors in the objective space as given below, CRGP is applicable to solve other MOP problems.in the objective space, CRGP converges to the best solution of NSD with probability one; that is, where  * = ( * 1 ,  * 3 ,  * 3 ) denotes the unique best solution;   = ( 1 (),  2 (),  3 ()) denotes a vector in the objective space at the th generation on the Pareto Front  * ; (⋅) denotes the Hamming distance between the associated incidence vectors.Here, assume that  * 1 ,  * 2 , and  * 3 are the value of model error, model structure complexity, and the number of selected variables of the best solution, respectively;  1 (),  2 (), and  3 () are the first, second, and third objective value of a vector at the th generation, respectively.
Therefore, for vectors in the objective space within   ( * ) + , CRGP converges to the optimum with simplest structure complexity; that is, (2) Next, we analyze vectors in   ( * ) − .Consider u ∈   ( * ) − and k ∈   ( * ) + .From Figure 6, we have  1 (u) >  1 (k),  2 (u) <  2 (k), and  3 (u) <  3 (k).Then ∀, u, k, Δ +  (u, k) ∈ (0, 1).But Δ −  (u, k) ∈ (−∞, +∞), the sign of  uk cannot be determined without any constraint, and the convergence is uncertain.Therefore, in the following, we deduce the constraints which are required to realize the convergence; that is, Because CRGP adopts the elitism selection strategy, the objective vector corresponding to a smaller rank will be selected after mutation and recombination.That means the convergence can be realized only if the order of u and k meets rank(u) > rank(k).According to the rank rule, there are two situations: No constraint is required in this case.
(ii) If  uk < 0, then it requires  ku < 0 and  uk >  ku when rank(u) > rank(V) is met.
For the former, since this requirement is satisfying.For the later, the requirement can be written as this requirement is satisfying.In other words, the constraint can be written as Thus,  ku <  uk , and hence rank(u) > rank(k).
Summarizing the constraints in the two cases, it is found that if is satisfying, the convergence described by ( 23) can be achieved.That means when the value of an objective (e.g.,  1 has a higher priority) is far from the global optimum, the decrease distance in this objective from this value to the global optimum is always larger than the sum of the increase distances in other objectives (e.g.,  2 and  3 have lower priorities).If such a difference is greater than two, then the algorithm converges to the global optimum.
Combining ( 21) and ( 23), we conclude that CRGP converges to the best model structure with the minimum model error with probability one; that is, when the relative distances of pairwise comparison meet the constraint as (25) in the objective space.
Not only does this rank approach obey Pareto dominance sorting, but it also incorporates the preference of decision making of NSD by taking into account its special relationship among the three objectives.So it can conduct the final Paretooptimal solution in one process.

Application to Postearthquake Disaster Management
5.1.Description of Dataset.An actual dataset which recorded the 2003 Bam earthquake is studied in the following experiment.The database records 76 features through 106 time frames (nearly week intervals).It includes the number of buildings corresponding to seven stages of reconstruction process, respectively: building permit at first, building at foundation stage, building at framing stage, building at fencing stage, building at roofing stage, building close to be used, and building ready for use.Additionally, some features depict resources about the human and material needed and finance support during the recovery process, such as "number of engineers, " "number of loaders, " "number of reconstruction offices, " "number of heavy and not heavy equipments in reconstruction offices, " "number of units sent to consultants for building plan, " and "number of units received financing." In order to help human to efficiently determine the reconstruction strategy with high quality in the disaster management, we aim to uncover the underlying relationships between reconstruction processes and available resources.As we know, polynomial nonlinear models are empirical for process modeling.Thus, we assume that all the relationships can be expressed by nonlinear polynomial models.The structure of models is difficult to achieve due to the incomplete data in the dataset.For example, in our dataset, the "number of units at foundation stage" missed the records at the last thirty weeks, and it had to be filled with the sum of "number of commercial units at foundation stage" and "number of residential units at foundation stage" in the dataset.As a result, the record of "number of units at foundation stage" was not so accurate.Thus, the modeling method is required to avoid overfitting the training data and reduce the influence of measure error.
Besides, the variables of these models are not independent of each other and their underlying relations are uncertain.Thus, the significant underlying relationship models about these variables are unknown and not easy to understand which tend to be embedded by other weak relationships.That means that many models with different structures may result in equivalent satisfying model performance.Therefore, the difficulty to solve the underlying models is how to optimize model structure and model performance simultaneously.This paper would apply CRGP to solve it using the actual data from the 2003 Bam earthquake.

Application Results
. The first step of this simulation is done by using the proposed CRGP approach to generate six underlying nonlinear models among the building amounts in seven renewal stages.The results are shown in Table 1.
For each model, the left symbol of the equation denotes the estimated target and the other six features are considered as the input variables.These six CRGP employed the same parameter setting defined in Table 2.These models deliver the closest and simplest expression for the number of building units in every stage.Based on them, it can be observed that "the number of units at foundation stage" (feature  2 ) is the most significant element during the building reconstruction process.Once its value is decided, the values in other stages in the following time can be estimated in advance.Additionally, the relationship between the numbers of units at different stage is almost linear, which hints that other intelligent computing methods suitable for solving linear structure can be conducted to this problem with satisfying result.For example, these models can be simulated using neural network, but the satisfying models can only be obtained when training data are complete and simulation parameters are set properly.
But a problem is raised, that is, how to determine the feature  2 at the foundation stage.Although Model 1 offers an expression to predict  2 in mathematics, the feature  3 cannot be known before  2 is set in the logical level.The same problem happens to the feature  1 as well.Obviously, available where  8 denotes "number of units received financing." Then, decision makers can estimate the number of buildings for reconstruction according to the available financial resource and human resource.It can avoid the unnecessary loss of useful resource and overestimation for construction building.

Analysis.
To analyze the validity of the proposed CRGP approach for disaster recovery scheduling problem, the learning process of the variable "number of units at foundation stage" is given here as an example.Six available resource variables required during reconstruction are defined as the input features, that is, "number of engineers, " "number of loaders, " "number of reconstruction offices, " "number of heavy and not heavy equipments in reconstruction offices, " "number of units sent to consultants for building plan, " and "number of units received financing." In order to test the accuracy of solution, we chose 4/5 of dataset as training data and the residual 1/5 of dataset as testing data.The optimal solution is obtained by using CRGP to deal with the training data set and tested by testing data set.As a result, the estimated performance of CRGP solution is plotted in Figure 7 by comparing with the real data, and test accuracy of the testing data is 98.7%.It can be observed that CRGP can obtain the solution with high accuracy.
To illustrate the availability of CRGP, a comparison of CRGP with single-objective GP and NSGAII-GP algorithms to solve the relationship model for "number of units at foundation stage" is given.All of these three methods applied the same parameter setting as in Table 2. Without generalization, this single-objective GP chooses the approximate error of model as the objective function.The aim of the comparison of performance of CRGP with single-objective GP is to show that NSD problem is addressed better by considering it as a multiobjective optimization problem than a single-objective optimization problem.This NSGAII-GP applied a Pareto-based EMO algorithm, NSGAII, to deal with the evaluation and evolution process of multiobjective GP.The comparison of CRGP with NSGAII-GP is presented in order to demonstrate the validity and superior property of compromise rank approach.
In Figure 8(a), in order to make a common baseline for comparison, we convert a multiple objectives vector to an experienced objective as  1 +  1  2 +  2  3 to plot the learning curves of CRGP, single-objective GP, and NSGAII-GP.Here,  1 and  2 are chosen as the experience value 0.001, respectively, to express the different impact of every objective on the nonlinear system design.To make it clear, the learning curves of model error objective and model complexity objectives are, respectively, shown in Figures 8(b) and 8(c).
In Figure 8(b), even though it is shown that NSGAII-GP and single-objective GP can achieve better performance in terms of model error, CRGP can also obtain small model By comparing the performance of CRGP with singleobjective GP, it is seen that the solution of single-objective GP always involves small model error and complex structure.This is because single-objective GP only searches for the optimization of model error and the size of model structure tends to increase much due to bloating problem.Then, more than one structure corresponds to a smaller error.For our problem, many dependent variables and uncertain relations build the underlying models; thus it is hard to determine an exact boundary or a weighting parameter to control bloating and balance model error against model complexity.Therefore, it is more reasonable to treat our NSD problem as a multiobjective optimization problem than as a singleobjective optimization problem.
Through the comparison of CRGP with NSGAII-GP, it is found that NSGAII-GP also cannot obtain the optimal model structure and CRGP can easily converge to the best solution.The reason is that NSGA-II can only find the Pareto-optimal set, from which designers should use multicriteria decision making (MCDM) techniques to obtain the best solution.But the realization of MCDM always needs some weighting information between multiple objectives or goal information about the special application, which can scarcely be obtained for our NSD problem.Here, NSGAII-GP chooses the smallest approximate error in the Pareto-optimal set as the final result, and CRGP uses compromise rank approach to combine decision making with optimization process.It is concluded that CRGP provides a better way to this NSD problem for disaster management without any prior information and is able to obtain the satisfying solution.
Besides, average running time required of every algorithm is presented in Table 3.These experimental results illustrate that CRGP is beneficial to efficiently obtain the optimal model structure with allowable small model error and its efficiency and solution accuracy are better than singleobjective GP and NSGAII-GP approaches.Furthermore, such outstanding convergence performance is beneficial to explore nonlinear relationship models embedded in the postearthquake recovery management.

Conclusion
Many real world problems, such as disaster management, can be converted as a nonlinear system design (NSD) problem.The difficulty in NSD is to obtain the unknown model structure in the case that no a priori information of system is available.To address this problem, we propose to model NSD as a MOP here for hybrid estimation of mode performance and model structure simultaneously.In particular, the CRGP algorithm is developed in this paper.
The main feature of CRGP is that it uses the characteristics of individuals from disaster management problem to reflect the preference information of the NSD MOP problem.CRGP uses a novel compromise rank approach for Pareto-based EMO to obtain the final Pareto-optimal solution in one process.It is realized by comparing the compromise distances of vectors in the objectives space.The convergence properties of CRGP are also investigated here.It is shown that CRGP can converge to a global optimum of NSD problem.Moreover, the convergence constraint is deduced.CRGP is further applied to solve for a postearthquake reconstruction management problem using real earthquake data.Six underlying relationship models about recovery scheduling are generated, which illustrate that the number of buildings constructed in different recovery stages has nearly linear relationships and the number of buildings in the foundation stage is mainly related to financial support.Additionally, simulation results show that CRGP has a better solution accuracy and a faster convergence rate than the single-objective GP and another multiobjective GP algorithm (NSGAII-GP) for this NSD problem.In the future, CRGP is developed to solve other realworld MOPs, such as dense signal estimation and detection.

Figure 1 :
Figure 1: Example of the multiobjective ranking procedure of (a) Pareto dominance sorting without preference information and (b) compromise rank approach for NSD.

2 Figure 5 :
Figure 5: An example of crossover operation in CRGP.

Figure 7 :
Figure 7: Approximation performance of the relationship model for "number of units at foundation stage."

Figure 8 :
Figure 8: Convergence properties comparison of three algorithms: CRGP, single-objective GP, and NSGAII-GP applied to disaster management problem.

Table 2 :
Parameter setting of algorithms based on GP for the postearthquake reconstruction problem.

Table 3 :
CPU time required by three algorithms for the postearthquake reconstruction problem.