Incorporating Space-Time Correlation of Population Densities into the Design of a Candidate Rail Transit Line over Years

In contrast to private cars, rail transit systems are a more effective way to deal with the emerging challenges in cities with high population densities, such as congestion, air pollution, and traffic emissions. Rail transit systems, however, are commonly costly, due to substantial investments in construction and maintenance. It is thus necessary to design candidate rail transit systems carefully to ensure public transport accessibility and sustainability, with consideration of the space-time correlation of population densities. In this paper, the space-time correlations of population densities are incorporated into the design of a candidate rail transit line over years. A closed-form mathematical programming model is proposed, with an optimisation objective of social welfare budget maximisation. The social welfare budget is defined as the summation of the expected social welfare and the social welfare margin. The model decision variables include rail line length, rail station number, and project start time of the candidate rail transit line. The analytical solutions for the proposed rail design model are given explicitly for different scenarios with various constraints.


Motivation and Literature Review.
In areas with low population density, such as the United States, the preferred travel mode is the private car. In areas with high population density, such as Hong Kong, the preferred travel mode is rail transit. One possible reason is that high congestion and long delays may occur on highways during morning peak hours on working days in highly populated areas, so the generalised travel cost of a private car may be higher than that of rail transit in such areas. Travel mode choices have commonly been examined under the assumption of given and fixed population densities (see, e.g., [1-3]). These assumptions were reasonable in those models because the travel mode choices were analysed for short-term operation optimisation.
Under these assumptions, however, the effects of population densities on travel mode choices were not explicitly investigated. These effects are important for the design optimisation of a candidate rail transit line over years.
In the transportation corridor of a candidate rail transit line, the population densities in each residential location vary year by year. If an increase in population density in the first year leads to an increase in the second year, a positive temporal correlation exists between the population densities in the first year and the second year, and vice versa. Similarly, if an increase in population density in one residential location leads to an increase in population density in another residential location, a positive spatial correlation exists between the population densities in these two residential locations. The space-time correlation of population densities can be considered with the nested logit model (e.g., [4-6]) and the C-logit model (e.g., [6-9]). In these nested logit and C-logit models, the spatial-temporal correlation of population densities was considered in the residential locations and/or travel choice behaviours of households. In contrast with the nested logit model, the C-logit model has a simple closed-form probability expression and is simpler to calibrate [7, 8].
In these nested logit and C-logit models, the space-time correlations between alternatives were investigated to calculate the choice probabilities for the residential locations and/or travel choice behaviours of households. In other words, the space-time correlations of population densities cannot be explored explicitly with the nested logit and C-logit models, and neither can their possible effects on the design of a candidate rail transit line. The space-time correlations of population densities can instead be taken into account explicitly by the space-time correlation coefficient of population densities [2, 3, 7, 8, 10-12]. Among these studies, a bilevel optimisation model was proposed to estimate the space-time correlation of OD demands during the same peak hour periods due to day-to-day fluctuations over the whole year. Liu [2] investigated the effects of spatial and temporal correlation of population densities on system disutility in a railway transportation corridor. That study concluded that the spatial and temporal correlation of population densities had a significant influence on the population densities and on the system performance, measured in terms of system disutility, consumer surplus, and social welfare of the railway system.
Zhang et al. [3] explored the implementation flexibility of multiperiod rail line design with consideration of uncertainties in population distribution. The space-time correlation of population densities was taken into account, but the proposed model was not analytical. Yang et al. [8] proposed an estimation framework based on the generalised method of moments to infer the probability function of origin-destination (OD) demand variables using sets of traffic counts over a network. Sun et al. [7] estimated stochastic OD traffic demand with a biobjective optimisation model for the traffic count location problem.
It is noted that the space-time correlations of population densities were mainly considered for road traffic origin-destination estimation in these previous studies. In this paper, we incorporate the space-time correlations of population densities, through the space-time correlation coefficient of population densities, into the design optimisation of a candidate rail transit line.
Based on the space-time correlation coefficient of population densities, a closed-form programming model is introduced in this paper to examine the effects of the space-time correlation of population densities on the design of a rail transit line. The optimisation objective of the proposed model is social welfare budget maximisation. The social welfare budget is defined as the summation of the expected social welfare and the social welfare margin. The model decision variables include rail line length, rail station number, and the project start time.
As shown in Figure 1, a linear transportation corridor is separated into $n$ residential locations from the Central Business District (CBD) to the city boundary, and the planning time horizon is divided into $m$ equal time periods, where $n$ and $m$ are positive integers. $L(t_1)$ and $L(t_2)$ are the lengths of the candidate rail transit line with respect to project start time in years $t_1$ and $t_2$, which can be determined endogenously with the proposed model. $P(x_1, t_1)$ and $P(x_2, t_2)$ are the population densities in residential locations $x_1$ and $x_2$ in years $t_1$ and $t_2$, respectively, with $\forall x_1, x_2 \in [0, B]$ and $\forall t_1, t_2 \in [0, m]$, where $B$ is the length of the corridor. Spatial correlation exists between population densities $P(x_1, t_1)$ and $P(x_2, t_1)$, and between $P(x_1, t_2)$ and $P(x_2, t_2)$, while temporal correlation exists between population densities $P(x_1, t_1)$ and $P(x_1, t_2)$, and between $P(x_2, t_1)$ and $P(x_2, t_2)$.
Two major extensions to the related literature are made in this paper: (i) the effects of the space-time correlation of population densities on the design of a candidate rail transit line over years are investigated by a closed-form mathematical programming model; (ii) the analytical optimal solutions of the design variables of the candidate rail transit line over years are obtained with the proposed model.
The remainder of this paper is organised as follows. In the next section, some basic considerations are given. A rail design model is proposed in Section 3, taking account of the space-time correlation of population densities along a linear transportation corridor. Section 4 gives illustrative numerical examples to show the application and contributions of the proposed model. A summary of this paper is given in Section 5.

Assumptions.
To facilitate the presentation of the essential ideas, the following basic assumptions are made.
A1. The candidate rail transit line is assumed to be linear, to start from the CBD, and to be built along a linear transportation corridor [1, 13]. The candidate rail transit line project in each period is assumed to finish on time, and the rail service is expected to be supplied at the end of each design period [14].
A2. The standard deviation (SD) of the population density is assumed to be a nondecreasing function of its mean value [15]. This function is referred to as the stochastic population density function.
A3. Households' responses to the quality of the rail service provided are measured by a generalised travel cost that is a weighted combination of in-vehicle time, access time, waiting time, and the fare [16]. Households are assumed to be homogeneous and to have the same preferred arrival time at the workplace located in the CBD. This study focuses mainly on households' home-based work trips, which are compulsory activities. Thus, the number of trips is not affected by other factors, such as income level [17].
A4. The study period is assumed to be a peak hour, for instance, the morning peak hour, which is usually the most critical period in the day [19].
A5. Rail station number depends on rail line length and rail station spacing. To obtain the analytical solutions, an even rail station spacing is assumed. In other words, with the assumption of constant rail station spacing, once the rail line length is determined, the rail station number is also determined. This assumption is also used in the works of Li et al. [17] and Liu [2].
Discrete Dynamics in Nature and Society

Space-Time Correlation of Population Densities.
To take into account the space-time correlation of population densities, it is assumed that there exists a perturbation in the population density. The yearly perturbed population density $\tilde{P}(x, t)$ is given by the following equation [12]:
$$\tilde{P}(x, t) = P(x, t) + \varepsilon_P(x, t), \tag{1}$$
where $P(x, t)$ is the expected population density at location $x$ in year $t$ and $\varepsilon_P(x, t)$ is the random perturbation, with $E[\varepsilon_P(x, t)] = 0$. It is noted that the expected population density $P(x, t)$ is a deterministic value. In terms of A2, the SD of population density can be expressed as [15]
$$\sigma_P(x, t) = \varphi(P(x, t)), \tag{2}$$
where $\varphi(\cdot)$ is defined as the stochastic population density function, which represents the functional relationship between the mean value and the standard deviation of the stochastic population density. Specifically, a coefficient of variation of population density $CV_P$ is defined as
$$CV_P = \frac{\sigma_P(x, t)}{P(x, t)}, \tag{3}$$
where $CV_P$ is a standardised measure of the dispersion of the probability distribution or frequency distribution of the population density.
To take spatial and temporal correlations between population densities into account, the spatial and temporal covariance is defined as [10]
$$\sigma_P(x_1, t_1; x_2, t_2) = \rho \, \sigma_P(x_1, t_1) \, \sigma_P(x_2, t_2), \tag{4}$$
where $\rho$ is the correlation coefficient, which measures the statistical correlation between $P(x_1, t_1)$ and $P(x_2, t_2)$. The correlation coefficient can be negative, positive, or zero, representing negative statistical dependence, positive statistical dependence, or statistical independence of population densities, respectively. Specifically, with $x_1 = x_2$ and $t_1 = t_2$, $\rho = 1$ and the spatial and temporal covariance reduces to the variance, that is, the squared standard deviation, of the population density.
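As a minimal numerical sketch of the covariance definition above, the following Python snippet computes the space-time covariance of two population densities from their means. It assumes, for illustration only, that the stochastic population density function is linear in the mean (SD = CV × mean, which is nondecreasing as required by A2); the function names and the sample values are hypothetical and are not taken from the paper.

```python
import math


def sd_from_mean(mean_density: float, cv: float) -> float:
    """SD of population density, assuming phi(P) = CV_P * P (illustrative choice)."""
    return cv * mean_density


def space_time_covariance(mean_1: float, mean_2: float, rho: float, cv: float) -> float:
    """Covariance = rho * SD_1 * SD_2, the standard definition of a correlation coefficient."""
    if not -1.0 <= rho <= 1.0:
        raise ValueError("rho must lie in [-1, 1]")
    return rho * sd_from_mean(mean_1, cv) * sd_from_mean(mean_2, cv)


# Hypothetical mean densities (persons/km) at two location-year pairs.
cov_positive = space_time_covariance(20000.0, 15000.0, rho=0.5, cv=0.2)
cov_independent = space_time_covariance(20000.0, 15000.0, rho=0.0, cv=0.2)
# Same location and year: rho = 1 and the covariance equals the variance.
variance = space_time_covariance(20000.0, 20000.0, rho=1.0, cv=0.2)
```

With the same location and year, the last call returns the squared standard deviation, which is the special case noted in the text.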

Households' Residential Locations Choice Behaviours.
Households are assumed to choose their residential locations to maximise their own utilities subject to a budget constraint. A Cobb-Douglas form of the utility function, equation (5), is adopted [17], where $U(g(x, t), h(x, t))$ represents the daily household utility function for residential location $x$ in year $t$; $g(x, t)$ is the daily consumption of nonhousing goods for households in residential location $x$ in year $t$, of which the price is normalised to 1; $h(x, t)$ is the consumption of housing in residential location $x$ in year $t$, measured in square meters of floor space; and $\alpha$ and $\beta$ are positive constants, with $\alpha + \beta = 1$. The budget constraint for households is expressed as follows:
$$g(x, t) + r(x, t) \, h(x, t) = I - \pi(x, t), \tag{6}$$
where $r(x, t)$ is the daily housing rent per unit of housing in residential location $x$ in year $t$, $I$ is the average daily household income, and $\pi(x, t)$ is the daily generalised travel cost from residential location $x$ to the CBD in year $t$. Under the user equilibrium condition, no household can increase its utility by unilaterally changing its location choice. Mathematically, the utility maximisation for households, equation (7), maximises $U(g(x, t), h(x, t))$ subject to the budget constraint (6). A similar mathematical formulation is given in Li et al. [17]. According to the equilibrium condition proposed in their study, the equilibrium household utility is given by equation (8), together with equations (9) and (10), where $U_{equilibrium}$ is the equilibrium household utility in year $t$ and $r(0, t)$ is the housing rent in the CBD in year $t$. $r(x, t)$ in equation (9) is the daily housing rent function per unit of housing in residential location $x$ in year $t$, and $h(x, t)$ in equation (10) is the daily consumption function of housing for households in residential location $x$ in year $t$. It can be seen that both $r(x, t)$ and $h(x, t)$ are functions of the daily generalised travel cost from residential location $x$ to the CBD in year $t$.
To keep the supply and demand of housing in balance, equation (11) is required to hold. Substituting equation (10) into equation (11) yields equation (12), where $h(x, t)$ represents the consumption of housing in residential location $x$ in year $t$, measured in square meters of floor space, and $P(x, t)$ is the expected population density of households in residential location $x$ in year $t$ at equilibrium. The population conservation equation can be expressed as
$$\int_0^B P(x, t) \, dx = \bar{P}_t, \tag{13}$$
where $\bar{P}_t$ is the total population along the candidate rail transit line in year $t$ and $B$ is the length of the rail transportation corridor. To describe the year-by-year variation of the total population, a yearly growth factor is assumed [14], as given by equation (14), where $c(t)$ is a compound-account factor that measures the growth of the total population compared with the base year and $P_0$ is the total population in the base year. A positive $c(t)$ implies that the total population along the candidate rail transit line increases, and vice versa. The resulting multiplier of the total population measures the variation of the total population in year $t$ compared with the total population in the base year.

Social Welfare Budget.
The government or the rail operator builds a rail transit line to meet the increasing travel demand of households and to ease highway traffic congestion. Social welfare is commonly used to assess the performance of a candidate rail transit line. Due to the yearly uncertainty associated with rail travel demand, the social welfare of the candidate rail transit line is not a deterministic value. Because of this uncertainty, an extra safety margin is assigned to ensure a higher probability of gaining a certain level of social welfare. In view of this, the concept of social welfare budget is proposed as follows:
$$\phi(SW) = E[SW] + \lambda \, \sigma[SW], \tag{15}$$
where $E[SW]$ is the expected social welfare, $\sigma[SW]$ is the standard deviation of social welfare, and $\lambda \sigma[SW]$ is the social welfare margin. $\lambda$ relates to the requirement on ensuring a certain social welfare gain. A high value of $\lambda$ implies a relatively high $\phi(SW)$ and, through the relation $\lambda = -\Phi^{-1}(\delta)$ derived in this paragraph, a lower probability $\delta$ of attaining it. Formally, $\lambda$ can be related mathematically to the probability that there is a gain in the social welfare budget, namely,
$$P\{SW \ge \phi(SW)\} = \delta, \tag{16}$$
where $\delta$ is the probability of a gain in the social welfare budget. Substituting equation (15) into equation (16) gives
$$P\{SW \ge E[SW] + \lambda \, \sigma[SW]\} = \delta. \tag{17}$$
Rearranging terms, we can obtain
$$P\left\{\frac{SW - E[SW]}{\sigma[SW]} \ge \lambda\right\} = \delta. \tag{18}$$
Let $\Phi(\cdot)$ be the cumulative distribution function of the standard normal distribution. Equation (18) can be rewritten as follows:
$$1 - \Phi(\lambda) = \delta, \tag{19}$$
$$\Phi(\lambda) = 1 - \delta, \tag{20}$$
and, with equation (20) and the symmetry of the standard normal distribution, $\lambda = \Phi^{-1}(1 - \delta) = -\Phi^{-1}(\delta)$ can be obtained. Thus, the social welfare budget defined in equation (15) can be rewritten as
$$\phi(SW) = E[SW] - \Phi^{-1}(\delta) \, \sigma[SW]. \tag{21}$$
The value of $\delta$ represents the attitude of the government or rail operator toward social welfare gain. A larger $\delta$ implies a larger negative safety margin and a higher probability of a gain in the social welfare budget.
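The relation between the confidence level δ and the safety factor λ = −Φ⁻¹(δ) can be checked numerically. The sketch below assumes that the standardised social welfare follows a standard normal distribution, so that Φ is the standard normal cumulative distribution function; the function names and the sample figures are illustrative and do not come from the paper.

```python
from statistics import NormalDist


def safety_factor(delta: float) -> float:
    """Return lambda = -Phi^{-1}(delta) for a confidence level 0 < delta < 1."""
    if not 0.0 < delta < 1.0:
        raise ValueError("delta must lie strictly between 0 and 1")
    return -NormalDist().inv_cdf(delta)


def social_welfare_budget(expected_sw: float, sd_sw: float, delta: float) -> float:
    """Budget = expected social welfare + lambda * SD, the social welfare margin."""
    return expected_sw + safety_factor(delta) * sd_sw


# Hypothetical expected social welfare and SD (HK$ per year).
neutral_budget = social_welfare_budget(1.0e10, 1.5e9, delta=0.5)   # lambda = 0
cautious_budget = social_welfare_budget(1.0e10, 1.5e9, delta=0.9)  # lambda < 0
```

For δ above 0.5 the safety factor is negative, so the budget lies below the expected social welfare, which matches the "negative safety margin" described in the text.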
Social welfare of the candidate rail transit line consists of the consumer surplus of households and the profit of the rail operator. Mathematically, the expected social welfare $E[SW]$ can be expressed as follows:
$$E[SW] = E[CS] + E[PR], \tag{22}$$
where $E[CS]$ is the expected consumer surplus of households and $E[PR]$ is the expected profit of the rail operator. The expected consumer surplus of households $E[CS]$ is given by equation (23), where 365 is a parameter converting daily consumer surplus into yearly consumer surplus, $\pi(x, t)$ is the expected generalised travel cost from residential location $x$ to the CBD in year $t$ by rail, $m$ is the planning time horizon in years, and $B$ is the length of the rail transportation corridor. The expected profit of the rail operator $E[PR]$ is given by equation (24), where 365 is a parameter converting daily profit into yearly profit, $f(x, t)$ is the rail fare, $c$ is the variable cost of supplying rail service to each passenger, $L(t)$ is the rail line length in year $t$, $C_r$ is the yearly unit fixed maintenance cost of the rail line, $n_s^t$ is the rail station number in year $t$, and $C_s$ is the yearly fixed operation cost of each rail station.
In terms of A3, the travel demand for rail service from residential location $x$ in year $t$, $q(x, t)$, is assumed to be given by an exponential function [18], equation (25), where $\theta$ is a positive constant that reflects households' sensitivity to the rail service level and $\varepsilon_q$ is a random term with $E[\varepsilon_q] = 0$. The inverse function of travel demand is given by equation (26); substituting it into equation (23) yields equation (27).
The expected generalised travel cost consists of the fare, the access cost from residential locations to rail stations, the waiting cost for rail service at stations, and the in-vehicle cost from rail stations to the CBD [20], as given by equation (28), where $\mu_c$, $\mu_w$, and $\mu_i$ are the values of access time, waiting time, and in-vehicle time, respectively; $f(x, t)$ is the distance-based fare for rail service; $\kappa(t)$ is a compound-account factor to convert future values to present values; $t_c(t)$ is the average access time from residential locations to the rail station, with $\partial t_c(t)/\partial n_s^t < 0$; $t_w(t)$ is the average waiting time for rail service at stations; and $t_i(t)$ is the average in-vehicle time from rail stations to the CBD. The distance-based fare $f(x, t)$ is given by equation (29), where $f_0$ is the fixed fare component and $f_1$ is the variable fare component per kilometre.
Waiting time $t_w(t)$ is closely related to the travel demand and supply of the rail service. For long-term planning, this value can be estimated using the following function:
$$t_w(t) = 0.5 \, h(t), \tag{30}$$
where 0.5 is a reasonable parameter for short train headways and passenger arrival times and $h(t)$ is the average headway in year $t$ [21] (not to be confused with the housing consumption $h(x, t)$). The average headway in year $t$ is determined by the cycle time of train operation $T(t)$ and the fleet size of trains $F(t)$:
$$h(t) = \frac{T(t)}{F(t)}. \tag{31}$$
The cycle time of train operation is calculated by equation (32) [22], where $L(t)$ is the rail line length in year $t$, $v(t)$ is the average train speed in year $t$, and $\varsigma$ is the average constant terminal time.
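The waiting-time estimate described above can be illustrated with a short Python sketch. It assumes the standard relation headway = cycle time / fleet size and takes the cycle time as a given input, because the paper's cycle-time equation is not reproduced here; the function names and the numbers are hypothetical.

```python
def average_headway(cycle_time_hours: float, fleet_size: int) -> float:
    """Average headway, assuming headway = cycle time / number of trains in service."""
    if fleet_size <= 0:
        raise ValueError("fleet_size must be positive")
    return cycle_time_hours / fleet_size


def average_waiting_time(headway_hours: float) -> float:
    """Average waiting time as half the headway (the 0.5 parameter in the text)."""
    return 0.5 * headway_hours


# Hypothetical operation: a 1.2-hour round trip served by 12 trains.
headway = average_headway(cycle_time_hours=1.2, fleet_size=12)
waiting = average_waiting_time(headway)
waiting_minutes = waiting * 60.0
```

Under these assumed inputs, adding trains shortens the headway and therefore the waiting component of the generalised travel cost.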
The average in-vehicle travel time from rail station $i$ to the CBD, $t_i(t)$, is given by the distance between rail station $i$ and the CBD, $D_i(t)$, divided by the average train speed in year $t$, $v(t)$, namely,
$$t_i(t) = \frac{D_i(t)}{v(t)}, \tag{33}$$
where $D_i(t)$ is calculated by equation (34) if a constant station spacing is assumed, with $i \in [1, n_s^t]$. In terms of equations (1), (2), and (24)-(32), the expected social welfare budget is obtained as equation (35). The standard deviation of travel demand for rail service and the standard deviation of the social welfare budget are given by equations (36) and (37), respectively.

Model Formulation and Properties
As stated above, the government or rail operator aims to maximise the social welfare budget of the candidate rail transit line by determining the optimum rail line length, rail station number, and project start time of the candidate rail transit line.

Model Formulation.
In terms of equations (21), (22), and (37), the social welfare budget maximisation model is formulated as follows:
$$\max_{L(t),\, n_s^t,\, t} \; \phi(SW), \tag{38}$$
where $\phi(SW)$ is the social welfare budget, $L(t)$ is the rail line length, $n_s^t$ is the rail station number, and $t$ is the project start time of the candidate rail transit line.

Proposition 1. For the social welfare budget maximisation problem (38), the social welfare budget $\phi(SW)$ is a decreasing function of the spatial and temporal correlation coefficient $\rho$.
Proof. In terms of equation (38), it can be found that the derivative of the social welfare budget $\phi(SW)$ with respect to the spatial and temporal correlation coefficient $\rho$ is negative.
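The direction of Proposition 1 can be illustrated with a deliberately simplified toy case, which is not the paper's model. The sketch assumes that social welfare is the sum of two normally distributed components with correlation ρ, so that its standard deviation grows with ρ, and that δ > 0.5, so that the safety factor λ = −Φ⁻¹(δ) is negative. All names and numbers are hypothetical.

```python
import math
from statistics import NormalDist


def sd_of_sum(sd_a: float, sd_b: float, rho: float) -> float:
    """SD of the sum of two correlated components with correlation rho."""
    return math.sqrt(sd_a ** 2 + sd_b ** 2 + 2.0 * rho * sd_a * sd_b)


def toy_budget(expected_sw: float, sd_a: float, sd_b: float, rho: float, delta: float) -> float:
    """Toy social welfare budget: expected value plus lambda times the SD of the sum."""
    lam = -NormalDist().inv_cdf(delta)  # negative when delta > 0.5
    return expected_sw + lam * sd_of_sum(sd_a, sd_b, rho)


rhos = [-1.0, -0.5, 0.0, 0.5, 1.0]
budgets = [toy_budget(100.0, 10.0, 5.0, rho, delta=0.9) for rho in rhos]
is_decreasing = all(a > b for a, b in zip(budgets, budgets[1:]))
```

In this toy setting the budget falls as ρ rises because a larger ρ inflates the standard deviation while λ is negative; with δ below 0.5 the sign of λ, and hence the direction, would reverse.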

Proposition 2.
For the social welfare budget maximisation problem (38), at the equilibrium of equations (8)-(10), the optimal rail line length $L(t)$, rail station number $n_s^t$, and project start time $t$ can be obtained explicitly from the first-order conditions of problem (38).
Proof. To obtain the optimal rail line length, the partial derivative of the objective function (38) with respect to $L(t)$ is set to zero; solving this condition gives the optimal rail line length, equation (43). Similarly, to obtain the optimal rail station number, the partial derivative of the objective function (38) with respect to $n_s^t$ is set to zero, which gives the optimal rail station number. To obtain the optimal project start time of the candidate rail transit line, the partial derivative of the objective function (38) with respect to $t$ is set to zero, which gives the optimal project start time, equation (49).
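The first-order-condition approach used in the proof can be sketched on a toy problem. The objective below is a hypothetical concave function of rail line length, chosen only so that the answer is easy to verify by hand; it is not the paper's objective function (38), and the bisection solver is a generic numerical stand-in for the closed-form solutions.

```python
from typing import Callable


def solve_first_order_condition(
    derivative: Callable[[float], float],
    lower: float,
    upper: float,
    tol: float = 1e-10,
) -> float:
    """Bisection on derivative(x) = 0, assuming a decreasing derivative (concave objective)."""
    if derivative(lower) < 0.0 or derivative(upper) > 0.0:
        raise ValueError("the derivative must change sign on [lower, upper]")
    while upper - lower > tol:
        mid = 0.5 * (lower + upper)
        if derivative(mid) > 0.0:
            lower = mid  # objective still increasing: optimum lies to the right
        else:
            upper = mid  # objective decreasing: optimum lies to the left
    return 0.5 * (lower + upper)


# Hypothetical toy objective: benefit grows linearly, cost grows quadratically.
BENEFIT_PER_KM = 8.0
COST_COEFFICIENT = 0.2


def toy_budget_derivative(length_km: float) -> float:
    """Derivative of BENEFIT_PER_KM * L - COST_COEFFICIENT * L**2 with respect to L."""
    return BENEFIT_PER_KM - 2.0 * COST_COEFFICIENT * length_km


optimal_length = solve_first_order_condition(toy_budget_derivative, 0.0, 50.0)
```

For this toy objective the analytical optimum is 8 / (2 × 0.2) = 20 km, which the numerical solution reproduces.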

Numerical Examples
To facilitate the presentation of the essential ideas and contributions of this study, two illustrative examples are given below.

Example 1.
The input parameters are summarised in Table 1. Figure 2 plots the contours of the optimal social welfare budget in the space of the spatial and temporal correlation coefficients (cc), with the objective of social welfare budget maximisation. It can be seen that, for a particular spatial cc, as the temporal cc increases, the optimal social welfare budget of the candidate rail transit line decreases. For instance, for a spatial cc of 0, as the temporal cc increases from −1 to 1, the optimal social welfare budget decreases from 1.071 × 10^10 HK$ to 7.750 × 10^9 HK$.
For a given total population, a positive temporal cc means that an increase in population density in the first year leads to an increase in population density in the next year. As a result, households are concentrated in a limited number of residential locations and the total population has a centralised distribution.
In summary, as the temporal cc increases, the total population has a more centralised distribution, and the social welfare budget of the candidate rail transit line decreases. A more centralised population distribution thus leads to a lower social welfare budget, whereas a decentralised population distribution yields a higher social welfare budget.
Similarly, for a given temporal cc, as the spatial cc increases, the optimal social welfare budget decreases. For instance, for a temporal cc of 0.8, as the spatial cc increases from −1 to 1, the optimal social welfare budget decreases from 7.822 × 10^9 HK$ to 7.526 × 10^9 HK$.
A positive spatial cc implies that an increase in population density in one residential location leads to an increase in population density in another residential location. A cooperative relationship may exist between two such adjacent residential locations. For instance, the population growth in a new town can lead to an increase in population density in residential locations of the adjacent suburban city.
In summary, as the spatial cc increases, the residential locations are more correlated with each other, and the optimal social welfare budget decreases. More correlated residential locations can lead to a lower social welfare budget for the candidate rail transit line. Conversely, a competitive relationship between residential locations leads to a higher social welfare budget for the candidate rail transit line.
It is also noted that the effects of the temporal cc on the optimal social welfare budget are more significant than those of the spatial cc. For instance, for a spatial cc of 0, as the temporal cc increases from −1 to 1, the optimal social welfare budget decreases from 1.071 × 10^10 HK$ to 7.750 × 10^9 HK$, whereas for a temporal cc of 0.8, as the spatial cc increases from −1 to 1, it decreases only from 7.822 × 10^9 HK$ to 7.526 × 10^9 HK$.
Compared with traditional studies, which assume spatial and temporal cc of 0, the optimal social welfare budget is overestimated in parts (a) and (b) of Figure 2 and underestimated in parts (c) and (d). For instance, in part (b), traditional studies give 9.137 × 10^9 HK$, whereas the optimal social welfare budget with spatial and temporal cc of 1 is 7.154 × 10^9 HK$. In part (c), traditional studies give 9.137 × 10^9 HK$, whereas the optimal social welfare budget with spatial and temporal cc of −1 is 1.130 × 10^10 HK$.
Table 2 shows the numerical results for the optimal rail line length L(t), the optimal rail station number n_s^t, and the optimal project start time of the candidate rail transit line relative to the base year, for temporal correlation coefficients (cc) of −1 and 1 and spatial cc from −1 to 1. It can be seen that, with a temporal cc of −1, the optimal rail line length L(t) in each year is longer than with a temporal cc of 1. For instance, with temporal and spatial cc of −1, the optimal rail line length is 30.98 km in year 1, 17.49 km in year 2, and 9.59 km in year 3, while with a temporal cc of 1 and a spatial cc of −1 it is 22.09 km in year 1, 14.36 km in year 2, and 9.21 km in year 3. This implies that the optimal rail line length is longer with a decentralised population distribution than with a centralised population distribution.
It can also be seen that the optimal rail line length L(t) in each year decreases as the spatial cc increases from −1 to 1. For instance, with a temporal cc of −1 and the spatial cc increasing from −1 to 1, the optimal rail line length in year 1 decreases from 30.98 km to 20.74 km. This implies that, as cooperation between residential locations becomes stronger and competition between residential locations becomes weaker, the optimal rail line length in year 1 decreases.
From Table 2, it can be found that the optimal project start time t of the candidate rail transit line is fast-tracked when the temporal cc increases from −1 to 1. For instance, with temporal and spatial cc of −1, the optimal project start time of the candidate rail transit line is year 11.19 relative to the base year, while with a temporal cc of 1 and a spatial cc of −1 it is year 8.60. This implies that the optimal project start time of the candidate rail transit line is earlier under a centralised population distribution than under a decentralised population distribution.

Example 2.
Figure 3 gives the housing unit price map around MTR stations in Hong Kong in the first half of 2015. The housing unit prices, within a range of two hundred meters of each MTR station, are the transaction prices of representative housing estates. The representative housing estates are those in residential locations with the largest numbers of housing transactions in the past six months.
The housing unit prices are measured in HK$ per square foot (sqft). The data come from Centra data (hk.centranet.com/eng/ehome.htm). Table 3 lists the housing rents of representative housing estates at each rail station of the West Island Line in the first half of 2015. The housing rent-to-price ratio was around 3% in Hong Kong in 2015.
These data come from Chiefgroup of Hong Kong (www.chiefgroup.com.hk). The average flat size is 36.5 sqft, according to the Housing Authority Annual Report 2014-2015 (www.housing.wa.gov.au/hou-singDocuments). The daily housing rents around rail stations of the West Island Line are calculated from the housing prices, the housing rent-to-price ratio, the average flat size, and a constant parameter. For instance, in this example, daily housing rent = (36.5 × 7 × housing price × 3%)/(12 × 30), and 676.34 = (31864 × 3% × 36.5 × 7)/(12 × 30).
Figure 4 shows the effects of the space-time correlation of population densities on the social welfare budget for the West Island Line. It can be found that, for a given spatial correlation coefficient (cc) of population densities, as the temporal cc increases from −1 to 1, the social welfare budget for the West Island Line decreases. It can also be found that the effect of the temporal cc of population densities on the social welfare budget is more significant than that of the spatial cc. These results are in accord with the results of Example 1.
Table 1 (recovered fragment): "Variable cost for rail service 3"; C_r (HK$/km), daily unit fixed maintenance cost of rail line.
Table 2 (columns: temporal cc, spatial cc). Note: "cc" represents correlation coefficient; L(t) represents the optimal rail length in year t (t ∈ {1, 2, 3}); n_s^t represents the optimal rail station number in year t; and t represents the optimal project start time of the candidate rail transit line with respect to the base year.
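The daily rent conversion quoted in Example 2 can be written as a small Python function. The default arguments mirror the figures stated in the text (flat size 36.5, constant parameter 7, rent-to-price ratio 3%, and 12 × 30 days per year); the function name is illustrative. Evaluating the expression with the quoted price of 31864 gives roughly 678.44 rather than the 676.34 printed in the text, so the printed inputs may not be exactly the ones used.

```python
def daily_housing_rent(
    price_per_sqft: float,
    flat_size: float = 36.5,
    constant_parameter: float = 7.0,
    rent_price_ratio: float = 0.03,
) -> float:
    """Daily rent = (flat size * constant * price * ratio) / (12 months * 30 days)."""
    yearly_rent = flat_size * constant_parameter * price_per_sqft * rent_price_ratio
    return yearly_rent / (12 * 30)


# Price quoted in the worked example of the text (HK$ per sqft).
example_rent = daily_housing_rent(31864.0)
```

Because the formula is linear in the housing price, the rent at any other station scales proportionally with its unit price.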

Conclusions
This paper proposes a closed-form model to investigate the effects of the space-time correlation of population densities on the design of a candidate rail transit line over years. Traditional studies that assume independence of irrelevant alternatives (IIA) for population densities, namely, a space-time correlation of population densities of 0, are special cases of the proposed model. The proposed model offers several insights. For example, a decentralised population distribution yields a higher social welfare budget for the candidate rail transit line, as does competition between residential locations. The effects of the temporal correlation coefficient (cc) on the optimal social welfare budget are more significant than those of the spatial cc. The optimal rail line length L(t) in each year is longer with a temporal cc of −1 than with a temporal cc of 1. The optimal project start time t of the candidate rail transit line is fast-tracked as the temporal cc increases from −1 to 1. The proposed model also offers some managerial implications. For instance, Proposition 1 shows that the social welfare budget is a decreasing function of the spatial and temporal correlation coefficient. Because a rail transit line can strengthen the spatial and temporal correlation of population densities, the construction of a candidate rail transit line can in turn reduce the social welfare budget. The optimal design values of a candidate rail transit line, namely, the optimal rail length, rail station number, and project start time, can be determined explicitly by Proposition 2.
This paper provides a new avenue for the modelling and analysis of the effects of the space-time correlation of population densities on the design of a candidate rail transit line over years.
In this paper, households are assumed to be homogeneous, with trips commuting only from residences to the CBD. The proposed model can be extended to incorporate the effects of households' risk preferences regarding early and late arrival at the CBD on the design of a candidate rail transit line over years [22].
The decision to extend a rail line involves consideration of technological, social, and economic factors. The prime reason could be social, in other words, a desire to make life more convenient in terms of mobility for a specific set of people, namely, those living in the vicinity of the line and the new stations to be constructed. However, only the pressing economic factors are considered in this paper. More detailed social factors, for instance, the appreciation of land value along the rail line, can be taken into account in further studies [3].

Data Availability
The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest
The authors declare that they have no conflicts of interest.