Optimizing Locations of Primary Schools in Rural Areas of China

Scientific location selection of schools is an important way to optimize the allocation of educational resources, improve the efficiency of operating schools, and realize the balanced development of education, especially in rural areas. Many studies have considered the location of schools, but most have omitted the impact of transportation network conditions and the time cost differences caused by different travel speeds under different road conditions. (e object of this study is to minimize the total transportation costs for students, construction costs for new schools, and the construction and upgrading costs for roads on a traffic network with travel time uncertainty indicated by different travel time scenarios. A mixed-integer programming model for this problem was proposed. Furthermore, a hybrid simulated annealing algorithm was used to solve the problem. Finally, a practical case study was used to illustrate the application of the proposed mathematical model. (e results showed that the traffic network has an important influence on the optimization location of rural schools, and the improvement of traffic network conditions can greatly reduce the time required for students to travel to school.


Introduction
e report of China's Nineteenth National Congress of the Communist Party proposed the strategy of "Rural Revitalization." e important task of "Rural Revitalization" is to revitalize education. School location, which involves planning and layout of school space in a region, is a crucial part of revitalizing regions and one of the important aspects of allocating educational resources [1]. Optimization of school location based on transportation network considerations has two aspects: one is to minimize students' travel costs, and the other is to minimize infrastructure costs by minimizing the need for newly constructed or upgraded roads and facilities. In particular, since " e State Council Decisions on the Basic Educational Reform and Development" were published in 2001 [2], rational planning and layout of primary and secondary schools in rural areas have received more and more attention from local governments. In 1987, the former Chinese State Education Commission made it clear that students' school time was not more than 45 min. In addition, considering the high cost of mountain environment construction, government departments should minimize the cost of school and road construction. Considering budget constraints on road and school construction, research on ways to reduce the time for students to get to school while simultaneously optimizing the location of schools has become increasingly urgent. In this context for rural areas typical of China, the objective of this study was to minimize the total travel costs for students, construction costs for school facilities, and the construction and upgrading costs for roads. To improve the efficient operation of a rural school system, the current infrastructure must be used, and critical decisions must be made on where to build a new school and where to upgrade an existing road or construct a new road.

Relevant School Location Models.
School location is a facility location problem. e earliest scientific research on facility location is that of Weber [3] who determined the optimal location of a factory warehouse. Because schools provide a very important public service, the school location problem has attracted the attention of many scholars. After the systematic study of many scholars, the theory of facility location has matured. Tewary et al. [4] used a maximum coverage model to study how to maximize the population covered within a maximum travel distance with a fixed number of schools. Pizzolato et al. [5] examined the location of elementary public schools in a large urban setting using operational research techniques in which an un-capacitated p-median model was used to minimize the sum of the overall residence-to-school distances. Subsequently, studies considered school capacity explicitly and located schools of different sizes. By building a dynamic (multiperiod) optimal model, Antunes [6] formulated planning proposals for the evolution of several school networks in Portugal. is model allowed uncapacity for facility closure or size reduction, as well as facility opening and size expansion, with sizes limited to a set of predefined standards. Pizzolato et al. [7] applied geographic information system (GIS) technology to the school location problem to evaluate the layout of Brazil's schools in two cases (p-median model with capacity and without capacity). Teixeira and Antunes [8] presented a discrete hierarchical location model for public facility planning, using a p-median model with capacities so that the allocation was made to the nearest school with available capacity. Hasse et al. [9] developed a model of location planning for a school network to maximize the standardized expected utility of all students taking into account capacity constraints and a given budget for the school network. Delmelle et al. [10] proposed a multiperiod capacitated median model for school network facility location planning that minimized transportation costs while functional costs were subject to a budget constraint. Some scholars have studied the problem of school location specifically for China. Zhang and Wang [11] applied GIS technology and used the gravity center model, the maximum welfare model, and the p-center model to study the location of universities for the elderly. Peng and Wang [12] compared the advantages and disadvantages of the p-center model and the p-gravity center model and improved the p-gravity center model to develop a decision support system for the optimal location of rural primary schools based on GIS technology. Based on Peng and Wang's [12] research, Dai et al. [1] added the constraint of school scale and improved the p-center of the gravity model to study the location of primary schools.

Facility Location and Network Design.
Classical facility location problems consist of some number of facilities to be located within a given network to satisfy a set of customers in respect of some constraints [13]. Network design problems address decisions about selecting a number of candidate links between network nodes to minimize the sum of construction and travel costs [14,15]. Essentially, the facility location network design problem (FLNDP) is a combination of the facility location and network design problems; Daskin et al. [16] introduced the first model of FLNDP. It not only pays attention to the selection of facility location but also considers the optimal layout design of the traffic road network. e facility location-network design method can effectively solve the optimal location problem of rural public service facilities [17]. ey believe that the cost of building new facilities may be much higher than that of changing the underlying traffic and assume that the underlying network configuration is changeable. en, Melkote [17] divided the facility location design problem into three models: uncapacitated facility location network design problem (UFLNDP), capacitated facility location network design problem (CFLNDP), and maximum covering location network design (MCLNDP). On this basis, many scholars have extended the research on FLNDP [18][19][20][21][22][23].

Facility Location under Uncertain Environment.
Generally, stochastic optimization and robust optimization (RO) are used to solve optimization problems under conditions of uncertainty. Stochastic optimization assumes that an uncertain parameter follows a probability distribution, and the objective is to find a solution that minimizes the expected cost. In contrast, discrete scenarios or a continuous range is assumed in RO. e objective of RO is to minimize the worst-case cost or regret [24]. e idea of set-based RO was first introduced by Soyster [25]. Kouvelis et al. [26] first introduced the concept of p-robustness to solve a layout planning problem for manufacturing systems; the relative regret in each scenario was ≤p. Snyder and Daskin [27] subsequently proposed a p-robustness measure that combined the advantages of both stochastic optimization and RO approaches in which the expected costs were minimized while the relative regret in each scenario was restricted. RO and stochastic optimization for handling uncertainly are frequent in location models, as reviewed by Owen and Daskin [28] and Snyder [24]. Unfortunately, FLNDPs under conditions of uncertainty, unlike other types of location problems, have received limited research attention. Rahmaniani and Shafia [29] analyzed the maximum covering FLNDPs that included uncertainty and presented comprehensive optimization models for their solution. Shishebori and Yousefi Babadi [30] proposed an efficient linear programming model for finding a robust and reliable solution to a medical service center location network design problem.
Many studies have examined the optimal location of schools. To our knowledge, most previous studies have neglected both the impact of traffic accessibility on the location of rural schools and the time cost differences caused by different travel speeds under different road conditions. In addition, previous research has failed to consider whether it was financially more appropriate to upgrade a road (or construct a new road) than to open new schools. Furthermore, there has been no study in which school location problems were considered together with network design and travel time uncertainly. To fill these knowledge gaps and contribute to the empirical literature, the objective of this study was to develop and test a methodology for examining the impact of a traffic network on school location. To do so, a facility location network design model with travel time uncertainly was constructed. e model was designed to minimize the total travel costs for students, construction 2 Complexity costs for school facilities, and the construction and upgrading costs for roads. e remainder of this paper is organized as follows. Section 2 introduces the robust optimization and p-robustness methods. Section 3 presents the mathematical formulation of the proposed problem. Section 4 contains a brief description of the hybrid simulated annealing algorithm. Section 5 illustrates how the model is applied to the location optimization of a rural primary school in a town in Guizhou Province, China. Section 6 provides discussion and Section 7 summarizes the study findings and concludes with observations about future research needs.

Introducing Stochastic p-Robust Approach
Stochastic optimization usually tries to minimize the total expected cost under all scenarios. However, the optimal solution found by this method can be satisfactory in some scenarios but unsatisfactory in others. In addition, the robust optimization method tries to minimize the worst-case cost objective value, and it provides a solution that is satisfactory in all scenarios. However, because the worst-case probability is very low, traditional RO provides a very conservative solution. Both stochastic optimization and RO have disadvantages in addressing uncertainty problems. Snyder and Daskin [27] presented a novel robustness measure called "stochastic p-robustness" based on a set of scenarios. is approach combines the two objectives of stochastic optimization and RO by minimizing the expected cost while bounding the relative regret in each scenario.
To expound on this robustness measure, let S represents a set of scenarios. Let P s be a deterministic minimization problem for each scenario s ∈ S. Let z s be the optimal solution of the objective function P s . X indicates the feasible solution of P s for all s ∈ S; z s (X) represents the objective value of P s under solution X. z * s (X) represents the optimal objective function value for P s . Meanwhile, z * s (X) > 0 for each scenario. us, the minimax cost and minimax regret objective functions can be formulated as equations (1) and (2), respectively.
To formulate p-robustness, let p be a nonnegative constant value that indicates the desired robustness level. erefore, a feasible solution e left-hand side of equation (3) is the relative regret under scenario s, and z s (X) − z * s indicates the absolute regret. Equation (3) sets the maximum upper limit for relative regret in each scenario s. Equation (3) can be rewritten as Next, the p-robustness measure and minimum expected cost objective function are combined to obtain a problem definition of the following form: minimize s∈S q s z s (X), subject to z s (X) ≤ (1 + p)z * s ∀s ∈ S, X ∈ Χ, where q s represents the occurrence probability of scenario s and Χ indicates a set of solutions feasible for all objective function P s .

Problem Definition and Assumptions.
is section describes a school location and network design problem that includes travel time uncertainly. We regard a set of villages as demand nodes exist in a geographical region, a set of roads as transfer links. e road links in the network contain existing and (potential) new road links. A set of schools already exists in the region, and it is clearly desirable to locate a set of new schools, to construct new road links, and to improve the existing road links such that the total investment costs (including the travel costs for students, construction costs for school facilities, and the construction and upgrading costs for roads) are minimized. Because the set of villages is situated in a mountain area, severe seasonal weather conditions have an important impact on (and cause uncertainty in) the time it takes students to travel to school. Several assumptions are inherent in the proposed problem solution. First, each node of the network represents a village or the center of a collection of residences. Second, each node can be a supply center or a demand center; that is, at each residential center, a primary school can be opened. ird, a primary school must be located only at the nodes (villages/ residential centers) of the network. Fourth, only one primary school can be located at each node (village/residential center) of the network. Fifth, the network is a customer-toserver system in which the students travel to the primary school to be served. Sixth, all travel costs are symmetric. Seventh, all network links are directed. Eighth, the schools and transfer roads are un-capacitated. Last, all travel costs are uncertain and have a stochastic behavior.

Notations.
e sets, parameters, and decision variables used in the proposed model are defined in Table 1.

Model Formulation.
Two models were formulated. Model 1 determines the optimal location of schools under the condition that existing roads can be upgraded. Model 2 determines the optimal location of schools under the condition that existing roads can be upgraded and new roads can be constructed. Regarding the above assumptions and notations, the two mixed-integer programming models for the p-robust school location network design problem (p-RSLNDP) can be defined. Model 1 is described as follows: subject to e unit travel costs of students were assumed to be uncertain under different weather conditions. us, formulation (5) is the objective function that minimizes the expected travel costs of all students, upgrading costs of roads, and the construction costs for school facilities over all scenarios. q s is the probability that scenario s occurs and indicates its importance in decision-making. Formulations (6)-(17) are constraint condition. Constraint (6) enforces the p-robust criterion and indicates that the relative regret of each scenario must be no greater than the robustness coefficient p. Furthermore, a small value of p may produce infeasible solutions. It should be note that if p � ∞, the p-robust criterion becomes meaningless, and the formulation is equivalent to a deterministic school location design problem. Constraint (7) means students at village i are either served by a school at i or obtained services from other places. Constraint (8) ensures that for students k, the inflow into i nodes is equal to the outflow from i nodes. Constraint (9) requires that all pupils must be served. Constraint (10) and (11) ensure that potential transfer links and schools are not used if they are not constructed. Constraint (12) guarantees that there is only one-way link between two nodes. Constraint (13) ensures that the cost of construct schools is less than the fixed budget B. Constraint (14) enforces the binary restriction on the location decision variable. is means that if there is a new school at demand node i, the value of Z s i is 1; otherwise, the value is 0. Constraints (15) and (16) enforce the nonnegativity of the flow variable. Constraint (17) enforces the binary restriction on the link decision variable. It means that if the existing link from node i to j is upgraded, the value of X s ij is 1; otherwise, the value is 0. Corresponding to Model 1, the optimal scenario costs Z * s can be calculated based on the following model: subject to (7)-(17) Model 2 is described as follows: subject to Complexity 5 Compared with that of Model 1, the objective function of Model 2 adds the costs of new roads. L indicates the set of existing and new candidate transfer links in the network. Constraint (31) enforces the binary restriction on the link decision variable. It means that if the transfer link from node i to j is upgraded or constructed, the value of X s ij is 1; otherwise, the value is 0. In addition, constraint (27) means that the total costs of school construction and road construction and upgrading budget are less than the fixed budget B.
Corresponding to Model 2, the optimal scenario costs Z * s can be calculated based on the following model: subject to (20)-(31)

Solution Methodology
When the number of scenarios is equal to 1, the problem presented is the classical UFLNDP. Melkote [17] proved that UFLNDP is a nondeterministic polynomial-time (NP)-hard problem. Simulated annealing (SA) is a local search-based heuristic that can avoid the local optimum solution by accepting inferior solutions during the solution iterations. is method is effective for solving highly complex combinatorial problems [31]. e approximate global optimal solution is obtained by using a hybrid metaheuristic algorithm which combines SA and Teitz-Bart algorithm in this paper. e SA algorithm was proposed by Metropolis et al. [32] and was independently described by Kirkpatrick et al. [33]. e SA algorithm is based on general concepts regarding the gradual cooling process of metal. e SA algorithm starts with an initial solution at a high temperature. e temperature is reduced according to a certain temperature regulation process. At each step, a neighbor is generated randomly, and its corresponding objective function is calculated. e movement to improve the objective function is always accepted, and other movements are accepted with some probability. e Teitz-Bart algorithm was proposed by Teitz and Bart [34]. is algorithm is a local search method and a heuristic improvement algorithm.
rough tentatively exchanging a node in the current solution for a node that is not in the current solution, it improves iteratively the solution found using either constructive algorithm [35].

Solution Representation.
To reduce the difficulty and complexity of solving the problem, we divide the singlescenario solution process into three stages for each s ∈ S. First, we should find out the existing roads that need to upgrade and the potential roads that need to construct in the road network according to the cost budget and fix the network investment decisions. Second, Teitz-Bart algorithm is used to determine the location of new schools according to the remaining cost budget. ird, each demand point must find a corresponding school to seek service. Moreover, the walking path of students to school at each demand point was determined. In the first stage of solving the process, we need to find out the low-quality, medium-quality, and potential roads in the road network and combine them into an array according to a certain order. e solution obtained by the simulated annealing algorithm is also represented by an array. e array elements consist of the location number of new facilities and a code of 0 or 1. Code 1 indicates the upgrading of low-quality and medium-quality roads or the construction of potential roads. Code 0 indicates no road upgrade or no road construction. Low grade, middle grade, and potential roads are arranged in the front, middle, and back of the array, respectively. For a road network consisting of five low-quality roads, five medium-quality roads, and five potential roads, the solution can be expressed as an array containing 15 elements (as shown in Figure 1). is array means upgrading or constructing second and third lowgrade roads, second and fourth middle-grade roads, and first, second, and fourth potential roads, respectively.

Initial Solution Generation.
e initial solution for each scenario in this study was generated randomly. e solution obtained by the SA algorithm was an array (the value of the element in the array is 0 or 1) that also was generated randomly within the range of meeting the budget constraint conditions.

Neighborhood Generating Procedure.
Neighborhood generation was one of the main factors that affected the performance of the SA algorithm. e neighborhood is obtained by swapping the value of two transfer links that 6 Complexity have different values. For this purpose, we look for a pair of node (i, j) with relationship (i, j) ∈ L, and then, other pairs would be selected to swap their values. After the variable X ij is fixed, we use Teitz-Bart algorithm to optimize the facility location. Finally, after the variables X ij and Z i are fixed, the problem is solved, and a neighborhood is generated with its objective function in this way.

Hybrid SA's Framework.
e main operations of the hybrid SA algorithm are shown in Figure 2. In the flowchart, x best represents the optimal solution so far; T 0 denotes the initial temperature; T f indicates the end temperature; α indicates the cooling coefficient and is used to control the cooling rate; ran d represents a random number; mapkob indicates the length of Markov chain; L represents the current length of the Markov chain; Δf denotes the objective function value difference between the current solution and its neighborhood. It is not difficult to solve the problem if the scenarios are restricted to just a single scenario. e algorithm is started by initializing a feasible road network and temperature and using Teitz-Bart algorithm to obtain facility location. is establishes the values of X ij and Z i , and the current objective value can be obtained by calculation. en, a new feasible road network is generated in the process of gradual cooling down. Location of facilities under new road network was obtained using the Teitz-Bart algorithm. erefore, we can get a new objective function value. e road network and facilities location are continually updated according to the Metropolis guidelines, and the process is repeated until the current temperature reaches its lowest value. Finally, the resultant road network and its corresponding facilities location set are the optimal solutions of the problem under the fixed scenario. In addition, the main operations of Teitz-Bart algorithm are shown in Figure 3. e main steps of Teitz-Bart algorithm include the following: (i) determining an initial subset of facilities location; (ii) using other nodes to try to replace each node in the subset of initial facility locations; and (iii) replacing the node with the largest contribution to total weighted distance reduction in the subset of facility location.

Basic Data.
e feasibility of the Models 1 and 2 and the hybrid SA algorithm was tested by applying them to a town in Guizhou, China. Guizhou is located in a mountainous environment that is typical of areas designated for "Rural Revitalization." Improving accessibility to schools for students in Guizhou is very important. As shown in Figure 4, the town consists of 26 villages (nodes, numbered 1 to 26) with a total population of 24,730. e population of each population center is obtained through field investigation. e road network, as well as the existing schools, is indicated by squares in Figure 4. Initially, there were 32 existing and six potential roads. e six potential roads were designed according to the actual topography of the study area (as portrayed in Google Earth). e roads were classified according to their quality ("high," "medium," and "low"). e costs of upgrading or constructing roads vary according to different types of roads. Consequently, the cost of upgrading low-and medium-quality roads to high-quality roads is lower than the cost of constructing new roads. In addition, the travel costs (in terms of time) of students vary with different qualities of roads.

Parameter Setting.
As described in Sections 3.1 and 3.3, there was uncertainty in the travel cost due to weather impacts. erefore, the transfer unit costs were stochastic in some scenarios, which were classified as "good," "medium," "bad," and "too bad." e influence of weather conditions on students' travel speed was the main consideration in the study.
us, according to the daily weather in 2017, the probabilities of the four uncertain situations were 0.48 (good), 0.38 (medium), 0.09 (bad), and 0.05 (too bad). e transfer costs for each student (in meters) were defined as a proportion of distance in the four mentioned scenarios and were 0.5 (good), 1 (medium), 1.5 (bad), and 2 (too bad). e demand for each population center was equal to the number of students residing there. e fixed cost of opening a school depended on a population center's demand and uniformly varied between 18,000 and 26,000 MU (monetary units). e construction cost of a new road and the costs of upgrading low-and medium-quality roads to high-quality roads were calculated as quadruple, triple, and twice, respectively, the road distance between two connected villages. e transportation costs (meters) of high-, medium-and low-quality roads for each student were randomly calculated subject to a discrete uniform distribution in (0.3, 0.32), (0.2, 0.22), and (0.1, 0.12), respectively.
It must be emphasized that in China, responsibilities for investment in school construction and road construction or upgrading are vested in local education departments and transportation departments, respectively. ese departments must put forward a comprehensive plan to improve students' accessibility to school. In addition, to save the cost of school construction and road construction or update, the ministries have a limited investment budget constraint that is determined. We assume that the cost budget of school construction and road construction or update in Model 1 and Model 2 is 80000 MU.  The sum of the weighted distance is changed Output subset V End through some intermediate villages to get to school. For example, the students in village 9 should be serviced by a located school in village 15 through route (9-14-15). e optimal solution value of model 1 is 297805 (that in detail, the value of first, second, third, and fourth unreliability scenario is 207147, 350611, 473845, and 617570, respectively). Besides, the optimal solution value of Model 2 is 329690 (that in detail, the value of first, second, third, and fourth unreliability scenario is 221987, 379940, 527179, 663330, respectively).

e Impact of Desired Robustness Level Changes on
Total Cost. In order to determine the influence of p-robust parameter changes on the objective function value of Model 1 and Model 2, a sensitivity analysis experiment is performed. According to Figures 7 and 8, we can see that the expected robustness level (p) and budget constraint (B) have an impact on the total cost. At first, p was set to a larger value and then began to decrease, until there was no feasible solution. In addition, the total cost has a decreasing trend as desired robustness level (p) grows. Based on the proposed model with increases in the desired robustness level (p), the feasible region does not decrease. Clearly, the desired robustness level (p) changes have an important impact on the total cost. It was possible to prove that p-RSLNDP is infeasible for p ≤ 0.05.

e Impact of Investment Budget Change on Travel
Costs. e investment budget constraint (B) is a significant parameter of the p-RSLNDP model that can affect the optimal solution. Figures 9 and 10 present the impact of the investment budget on the travel cost with varying p values. As expected, the travel costs are reduced with the increase of the investment budget. Moreover, we can see from Figures 9 and 10 that with the increase of p value, the travel cost under the same budget constraint is also reduced.

Discussion
is study constructs a mixed-integer programming model for the location optimization of rural primary schools and proposes a hybrid simulated annealing algorithm to solve the model, which effectively solves the location optimization   problem of rural primary schools. However, due to constraints of energy, ability, and time, this study has some limitations. Firstly, in the process of building the models, the size limit of primary schools is omitted. In real life, the size of schools may limit the number of students they accept. erefore, future research also should consider capacity constraints of school facilities and/or the traffic network. Secondly, we did not consider the impact of other factors (topography, slope, land use type, etc.) on the location of rural primary schools. ese factors also have an important impact on the choice of the primary school location. erefore, interested readers can further explore the location of rural primary schools and consider these factors. irdly, this paper only explored the static primary school location problem, without considering that the parameters such as demand, school, or road construction cost will change with time. It is necessary to consider the impact of demand changes and construction cost changes on the location of primary schools in future research. Lastly, further research is needed to develop algorithms to solve large problems. In a word, we hope       that in the future research and work, we will have the opportunity to continue the discussion, and we also hope that more scholars who work on the location optimization of rural primary schools can continue excavating such research.

Conclusions
Scientifically planning the location of schools is an important way to optimize the allocation of educational resources, improve the efficiency of running schools, and realize the balanced development of education, especially in rural areas. erefore, the object of this study is to minimize the total transportation costs for students, construction costs for schools, and the construction and upgrading costs for roads on a traffic network with travel time uncertainty indicated by different travel time scenarios. In this study, an optimization mixed-integer linear programming model and hybrid SA algorithm were developed and used to optimize the location of rural schools based on traffic network characteristics. In addition, the stochastic p-robust approach was used to model travel time uncertainly. e newly developed model efficiently determines the optimal locations of new schools in rural areas, optimal constructing/upgrading of transfer links, and optimal allocating of students to schools. A practical case study confirmed the applicability of the mathematical model as a means to solve the problems in planning rural schools. e traffic network has an important influence on the optimization of rural school locations. To determine the influence of p-robust parameter changes on the objective function value and investment budget on the travel costs for the proposed model, a sensitivity analysis experiment is performed.
e sensitivity analyses indicated that the changes of p values and investment budget have a significant effect on the objective function value and travel costs.
Nevertheless, further research is needed to develop algorithms to solve large problems. Future research also should consider capacity constraints of school facilities and/ or the traffic network.
Data Availability e data that support the findings of this study are available upon request to the author.