Iterative Learning Control Approach for Signaling Split in Urban Traffic Networks with Macroscopic Fundamental Diagrams

Recent analysis of field experiments in cities revealed that a macroscopic fundamental diagram (MFD) relating network outflow and network vehicle accumulation exists in the urban traffic networks. It has been further confirmed that an MFD is well defined if the network has regular network topology and homogeneous spatial distribution of vehicle accumulation. However, many real urban networks have different levels of heterogeneity in the spatial distribution of vehicle accumulation. In order to improve the mobility in heterogeneously congested networks, we propose an iterative learning control approach for signaling split, which aims at distributing the accumulation in the networks as homogeneously as possible and ensuring the networks have a larger outflow. The asymptotic convergence of the proposed approach is proved by rigorous analysis and the effectiveness is further demonstrated by extensive simulations.


Introduction
With rapid urbanization and growing demand for automobiles, eliminating traffic congestion or reducing traffic delays in urban areas has become a challenging task for both traffic practitioners and researchers. Traffic signal control is generally considered a fundamental and effective measure to alleviate congestion and improve the efficiency of the existing urban transportation infrastructure.
A variety of traffic signal control strategies for urban areas have been developed in the past few decades. It is widely believed that real-time control schemes that respond automatically to current traffic conditions are potentially more efficient than fixed-time settings and actuated control approaches. Among the numerous real-time traffic signal control methods, the model-based control method is the dominant one. SCOOT [1] and SCATS [2] are both well known and widely used in many metropolises around the world. Both are traffic-responsive strategies that work effectively in real-world traffic. However, the performance of these two systems was reported to deteriorate under congested conditions. In the 1980s and 1990s, other model-based control strategies such as OPAC [3], PRODYN [4], RHODES [5], CRONOS [6], and MOTION [7] emerged. These control approaches obtain better control performance by using forecasting models. However, their performance is somewhat restricted, as they mainly use simple models based on traffic data measured by upstream detectors [8]. Moreover, the online computational complexity becomes a challenging problem for these strategies when they are implemented in real-life urban networks. To reduce the computational complexity, a number of control methods that require less online computation time or use more efficient models have been proposed, such as the PageRank approach [9], model predictive control (MPC) [10][11][12], and traffic-responsive urban control (TUC) [13][14][15].
A limitation of model-based control approaches is their sensitivity to, or dependence on, the modeling accuracy. Modeling the traffic flow dynamics in urban traffic networks remains a big challenge, due to various unpredictable factors such as route choice, time of departure, and driving behavior. Thus, model-based control methods may not be able to achieve satisfactory control performance. As an alternative to detailed modeling, the macroscopic fundamental diagram (MFD), which aims at simplifying the modeling of traffic flow, describes the evolution of network outflow and vehicle accumulation in the urban network at an aggregate level. The notion of an MFD was originally proposed by Godfrey [16], but its existence was only recently verified by empirical data [17]. Based on the MFDs of traffic networks, many control strategies have been developed to improve the network mobility by adjusting the vehicle accumulation in the network. Perimeter control strategies based on network MFD models have been proposed for single-region cities in [18]. An elegant perimeter control approach has been developed for a two-region urban network with well-defined MFDs in [19] and for multiregion cities in [20]. The stability of the perimeter control approach was analysed in [21]. In [22,23], a model predictive control method has been introduced to solve the optimal traffic control problems. An easy-to-implement feedback control approach has been proposed to improve the traffic mobility within the network in [24,25]. Moreover, route guidance strategies utilizing the concept of MFD have been investigated in [26], and the influence of route choice on the distribution of congestion and the MFD shape has been discussed in [27]. Recent studies [28][29][30] indicate that a heterogeneous spatial distribution of vehicle accumulation can notably decrease the network outflow for the same value of accumulation.
On the other hand, urban traffic flow patterns exhibit obvious cyclical characteristics on a daily basis, although they change at different times of the day. For instance, the urban traffic flow always starts from a very low level at midnight and reaches its maximum during rush hour at almost the same time every day. Thus, we should take full advantage of the inherent repeatability of urban traffic flow to improve mobility and decrease delays in urban traffic networks. To this end, we apply the iterative learning control (ILC) method to the signal split problem for urban networks. In fact, all fixed-time control strategies are developed based on the assumed traffic repeatability.
ILC was initially proposed by Arimoto for improving the tracking performance of systems that execute the same task repeatedly over a finite interval [31]. Compared with other control methods, ILC has several advantages: it can learn from previous executions to improve control performance, it requires much less system knowledge, and it need not know the exogenous disturbance patterns as long as they are repeated [32]. ILC is very suitable for urban traffic control, as the realistic modeling of an urban traffic network is a daunting task and the exogenous disturbances may not be fully known in practice.
This paper focuses on the development of an ILC controller based on the MFD and its application to the control of signal splits in urban networks. The convergence of the proposed method is proved by rigorous analysis. The simulation results show that the ILC controller is able to balance the accumulation distribution in the urban network and keep the number of vehicles in each link at the desired level, which makes the urban network run with a well-defined MFD and have a larger outflow.

[Figure 1: network outflow (veh/h) versus vehicle accumulation (veh).]

The remainder of this paper is organized as follows: Section 2 introduces the macroscopic fundamental diagram of urban traffic networks used by the ILC strategy. In Section 3, the utilized store-and-forward model for the signal control problem and the control objective are described. Section 4 presents the MFD-based ILC strategy for urban networks with rigorous analyses of the learning convergence properties. Simulation results are given in Section 5. Section 6 draws the conclusions and suggests directions for future work.

Macroscopic Fundamental Diagram of Urban Traffic Networks
The MFD of urban traffic defines a unimodal low-scatter relationship between network outflow and network vehicle accumulation for a homogeneous network. It is observed that the maximum outflow of the network may be maintained over a range of vehicle accumulation values close to a critical accumulation. The typical shape of an MFD for urban networks is shown in Figure 1, where $\tilde{n}$ is the critical accumulation corresponding to the maximum outflow $G_{\max}$ and the span $n_1 \sim n_2$ represents the range of vehicle accumulations which can produce almost the same maximum outflow around the critical accumulation.
In Figure 1, region 1 reflects undersaturated traffic states; that is, the green times are partially wasted due to a lack of sufficient traffic demand. Traffic states along line 2 indicate that the traffic flows are partially saturated; that is, most links in the network experience saturated traffic flow during their respective green times without obvious queue spillback to upstream intersections. The traffic states on the declining line 3 imply that there are significant queue spillbacks in some links and that several upstream intersections are blocked, so the green times are seriously wasted. A growth of the number of vehicles within this region will remarkably decrease the network outflow. Finally, region 4 denotes a complete network-wide gridlock, where the network outflow is nearly zero. Once this situation occurs, any signal control strategy may be powerless.
Note that the slope of line 1 indicates the average speed of vehicles in the network. This average speed may be increased via proper traffic signal operations, so that a larger maximum network outflow $G'_{\max}$ with a wider vehicle accumulation span $n'_1 \sim n'_2$ can be obtained (as indicated by the dotted line in Figure 1). The oversaturated traffic conditions (region 3) can also be ameliorated by better adapted control strategies. The control strategy investigated in this paper attempts to distribute the vehicle accumulation in the network as homogeneously as possible, so that the network can produce the optimal outflow and run with a well-defined MFD.
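To make the qualitative shape of Figure 1 concrete, the following sketch evaluates a trapezoidal MFD with a rising branch, a plateau, a declining branch, and gridlock. The piecewise-linear form, the function name `mfd_outflow`, and all parameter values are illustrative assumptions, not the diagram measured in the paper.

```python
def mfd_outflow(n, n1, n2, n_jam, g_max):
    """Piecewise-linear (trapezoidal) MFD: outflow as a function of accumulation n.

    Assumed shape: linear rise on [0, n1], plateau on [n1, n2],
    linear decline on [n2, n_jam], zero outflow at gridlock (n >= n_jam).
    """
    if n <= 0:
        return 0.0
    if n < n1:
        return g_max * n / n1
    if n <= n2:
        return float(g_max)
    if n < n_jam:
        return g_max * (n_jam - n) / (n_jam - n2)
    return 0.0


# Hypothetical parameters (veh and veh/h), chosen only for illustration.
params = dict(n1=1600, n2=1800, n_jam=4000, g_max=15600)
for n in (800, 1700, 2900, 4000):
    print(n, mfd_outflow(n, **params))
```

In this toy diagram an accumulation of 800 veh and one of 2900 veh give the same outflow, one on the undersaturated branch and one on the congested branch, which is why keeping the accumulation near the plateau matters.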

Urban Traffic Network Modeling and Problem Formulation
3.1. Store-and-Forward Modeling. In recent years, many macroscopic urban traffic flow models that describe the traffic flow dynamics with different levels of detail have been developed. Without attempting to realistically model the complex dynamics of traffic, the control strategy investigated in this paper focuses on the long-term evolution of the traffic flows in the network. Thus, the store-and-forward model first suggested by Gazis and Potts [33] and later used by the traffic-responsive urban control (TUC) framework is employed. This modeling approach introduces a model simplification that enables the mathematical description of the traffic dynamics without the use of discrete variables and allows a number of highly efficient optimization and control methods to be employed [14]. The model for traffic flow dynamics in urban networks presented below is taken from [13].
An urban traffic network can be considered as a directed graph with intersections $j \in J$ and links $z \in Z$. The sets $I_j$ and $O_j$ are defined as the incoming and outgoing links of intersection $j$, respectively. The rate of vehicles reaching intersection $j$ and running toward downstream link $z \in O_j$ from upstream link $w \in I_j$ is denoted by the turning rate $t_{w,z}$, and $S_z$ denotes the saturation flow of link $z \in Z$. In addition, the quite usual assumption that the cycle times $C_j$ for all intersections $j \in J$ equal a common cycle time $C$ is made here. Furthermore, the set $F_j$ contains the fixed number of phases at intersection $j$, while $g_{j,i}$ is the green time of phase $i \in F_j$. Also, let $v_z$ be the set of phases where link $z$ has the right of way (r.o.w.) at intersection $j$. The turning rates $t_{w,z}$ and saturation flows $S_z$ are assumed to be known and constant in this paper. Nevertheless, they can also be considered to be time-varying and may be continuously estimated by a constrained state observer approach [34,35].
By definition, the common cycle is enforced by the constraint $\sum_{i \in F_j} g_{j,i} + L_j = C$, where $L_j$ is the lost time of intersection $j$. The green time $g_{j,i}$ is constrained by $g_{j,i} \in [g^{\min}_{j,i}, g^{\max}_{j,i}]$, where $g^{\min}_{j,i}$ and $g^{\max}_{j,i}$ are the minimum and maximum allowable green times. Link $z$ connecting two consecutive intersections $j_1$ and $j_2$ is illustrated in Figure 2.
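The traffic flow dynamics of link $z$ in Figure 2 is given by the following conservation equation:
\[ x_z(k+1) = x_z(k) + \Delta \left[ q_z^{\mathrm{in}}(k) - q_z^{\mathrm{out}}(k) + d_z(k) - s_z(k) \right], \tag{1} \]
where $x_z(k)$ is the number of vehicles in link $z$; $q_z^{\mathrm{in}}(k)$ and $q_z^{\mathrm{out}}(k)$ are the inflow and outflow of link $z$ in the sample period $[k\Delta, (k+1)\Delta]$, respectively; $\Delta$ is the control interval; $k = 0, 1, \dots, K$ is the discrete time index. Moreover, $\Delta$ is equal to the common cycle time $C$. $d_z(k)$ is the traffic demand within the link which does not originate from adjacent links, and $s_z(k)$ is the exit flow.
The traffic flow entering link $z$ is computed as
\[ q_z^{\mathrm{in}}(k) = \sum_{w \in I_{j_1}} t_{w,z}\, q_w^{\mathrm{out}}(k), \tag{2} \]
where $t_{w,z}$ is the turning rate towards link $z \in O_{j_1}$ from link $w \in I_{j_1}$.
The average outflow of link $z$ during the corresponding cycle is expressed as
\[ q_z^{\mathrm{out}}(k) = \frac{S_z}{C} \sum_{i \in v_z} g_{j_2,i}(k), \tag{3} \]
where $\sum_{i \in v_z} g_{j_2,i}(k)$ denotes the green time for vehicles leaving link $z$.
Substituting the inflow and outflow from (2) and (3) into (1), the following state equation is obtained:
\[ x_z(k+1) = x_z(k) + \Delta \left[ \sum_{w \in I_{j_1}} t_{w,z} \frac{S_w}{C} \sum_{i \in v_w} g_{j_1,i}(k) - \frac{S_z}{C} \sum_{i \in v_z} g_{j_2,i}(k) + e_z(k) \right], \tag{4} \]
where $e_z(k) = d_z(k) - s_z(k)$ is a single disturbance determined by the demand $d_z(k)$ and the exit flow $s_z(k)$.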
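The link update described above can be sketched for a single link as follows. The function name `link_step`, its argument names, and the numbers in the example are hypothetical; the sketch only assumes the conservation structure stated in the text (inflow from upstream links weighted by turning rates, outflow proportional to green time, control interval equal to the cycle).

```python
def link_step(x, upstream_greens, own_green, turning_rates, sat_up, sat_own,
              cycle, demand, exit_flow):
    """One store-and-forward step for a single link (control interval = cycle).

    x               vehicles currently in the link (veh)
    upstream_greens green times (s) serving the upstream links that feed this link
    own_green       green time (s) serving this link at its downstream intersection
    turning_rates   share of each upstream link's outflow that turns into this link
    sat_up, sat_own saturation flows (veh/s) of the upstream links and of this link
    demand, exit_flow  flows (veh/s) not exchanged with adjacent links
    """
    inflow = sum(t * s * g / cycle
                 for t, s, g in zip(turning_rates, sat_up, upstream_greens))
    outflow = sat_own * own_green / cycle
    return x + cycle * (inflow - outflow + demand - exit_flow)


# Hypothetical example: two upstream links, saturation flow of 1 veh/s everywhere.
x_next = link_step(x=20.0, upstream_greens=[60.0, 50.0], own_green=40.0,
                   turning_rates=[0.5, 0.25], sat_up=[1.0, 1.0], sat_own=1.0,
                   cycle=120.0, demand=0.0, exit_flow=0.0)
print(x_next)
```

Here 42.5 vehicles enter and 40 leave during the cycle, so the link gains 2.5 vehicles. The linear model does not clip the state at zero or at the storage capacity, which is the simplification that keeps the model free of discrete variables.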

3.2. State-Space Representation.
For the purpose of the traffic control analysis, we first derive a state-space model for the traffic network. Denote
\[ \mathbf{x}(k) = [x_1(k), \dots, x_n(k)]^{T}, \quad \mathbf{u}(k) = [u_1(k), \dots, u_n(k)]^{T}, \quad \mathbf{e}(k) = [e_1(k), \dots, e_n(k)]^{T}, \tag{5} \]
where $n$ is the number of controlled links, which is equal to the number of the green times of all phases. Applying (4) to all the network links, a linear state-space model is obtained:
\[ \mathbf{x}(k+1) = A\mathbf{x}(k) + B\mathbf{u}(k) + D\mathbf{e}(k), \tag{6} \]
where $\mathbf{x}(k)$ is the state vector; $\mathbf{u}(k)$ is the control vector containing all the green times $g_{j,i}$; $\mathbf{e}(k)$ is the disturbance vector.
The state matrix $A$ and the output matrix are both identity matrices, so the output equals the state, $\mathbf{y}(k) = \mathbf{x}(k)$. The input matrix $B$ with appropriate dimensions reflects the network characteristics, such as network topology, turning rates, and saturation flows. The disturbance matrix $D$ is a diagonal matrix whose diagonal elements all equal $\Delta$.
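As an illustration of how the input matrix encodes topology, turning rates, and saturation flows, the sketch below builds it for a tiny network and advances the state by one step. It assumes, for simplicity, that every link is served by exactly one green time (so control index and link index coincide); the helper names, the `turning` dictionary layout, and the numbers are hypothetical.

```python
def build_input_matrix(sat_flows, turning, cycle, interval):
    """Input matrix for x(k+1) = x(k) + B u(k) + interval * e(k).

    sat_flows[z]   saturation flow of link z (veh/s)
    turning[z][w]  turning rate from upstream link w into link z
    Assumption of this sketch: link z is served by the single green time u[z].
    """
    n = len(sat_flows)
    B = [[0.0] * n for _ in range(n)]
    for z in range(n):
        B[z][z] -= interval * sat_flows[z] / cycle              # outflow of link z
        for w, rate in turning.get(z, {}).items():
            B[z][w] += interval * rate * sat_flows[w] / cycle   # inflow from link w
    return B


def step(x, u, e, B, interval):
    """One step of the linear model with identity state matrix."""
    n = len(x)
    return [x[z] + sum(B[z][m] * u[m] for m in range(n)) + interval * e[z]
            for z in range(n)]


# Hypothetical chain: link 0 feeds link 1 with turning rate 0.5.
B = build_input_matrix(sat_flows=[1.0, 1.0], turning={1: {0: 0.5}},
                       cycle=120.0, interval=120.0)
x_next = step(x=[22.0, 22.0], u=[40.0, 30.0], e=[0.3, 0.1], B=B, interval=120.0)
print(B, x_next)
```

The diagonal entries are negative because a longer green empties the link it serves, while the off-diagonal entries are positive because the same green feeds the downstream link.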

3.3. Control Objective.
The control objective is to seek an appropriate sequence of green times that drives the numbers of vehicles in all links of the network to converge to the desired ones over the entire interval $k \in \{1, 2, \dots, K\}$, despite the presence of modeling uncertainties and disturbances. In other words, the ILC controller tries to distribute the accumulation in the network as homogeneously as possible, which leads to the network having a larger outflow under a well-defined MFD. Throughout this paper, $\|\cdot\|$ denotes the 1-norm; that is, for an $n \times m$ matrix $A$ with entries $a_{i,j}$,
\[ \|A\| = \max_{1 \le j \le m} \sum_{i=1}^{n} |a_{i,j}|. \]
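The induced 1-norm of a matrix is its maximum absolute column sum. A minimal sketch of that computation, with a hypothetical helper name and a made-up matrix:

```python
def matrix_one_norm(A):
    """Induced 1-norm of a matrix given as a list of rows: max absolute column sum."""
    n_cols = len(A[0])
    return max(sum(abs(row[j]) for row in A) for j in range(n_cols))


example = [[1.0, -2.0],
           [3.0, 4.0]]
print(matrix_one_norm(example))  # column sums are 4 and 6
```

For a vector stored as a single column the same function returns the sum of absolute values of its entries.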

The Investigated Control Strategy
4.1. Some Assumptions. To facilitate the learning convergence analysis, some necessary assumptions are given below.
Assumption 1. Throughout the repeated iterations, the reinitialization condition is satisfied. That is,
\[ \mathbf{x}_i(0) = \mathbf{x}_d(0), \qquad \mathbf{y}_i(0) = \mathbf{y}_d(0), \]
where $\mathbf{x}_d(0)$ and $\mathbf{y}_d(0)$ are the initial values of the desired state and output and $i$ is the iteration number.
Assumption 2. There exists a sequence of control inputs $\mathbf{u}_d(k)$ that can exactly drive the system output to track the desired trajectory $\mathbf{y}_d(k)$ for system (6) over the finite time interval.
Assumption 3. For system (6), the matrix $B$ is full rank.
Assumption 1 requires the initial state and output values to be consistent with the desired ones.In fact, this condition is not always met in practice.In this case, we can revise the target trajectory to be aligned with the actual one at the initial stage of tracking [36].
Assumption 2 is a reasonable assumption that the control problem should be solvable.
Assumption 3 is a standard assumption that guarantees the existence of iterative learning control law, which will be derived in the next sections.

4.2. The ILC Strategy.
In the proof of the iterative learning convergence, the $\lambda$-norm of a vector $h(k)$ is defined as
\[ \|h(k)\|_{\lambda} = \sup_{k \in [0, K]} a^{-\lambda k} \|h(k)\|, \]
where $\lambda > 0$ and $a > 1$.
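A form of the $\lambda$-norm commonly used in discrete-time ILC analysis weights the signal at step $k$ by $a^{-\lambda k}$ and takes the largest weighted value over the horizon. The sketch below assumes that form; the function name, the default vector norm, and the sample signal are hypothetical.

```python
def lambda_norm(h, lam, a, vec_norm=lambda v: sum(abs(c) for c in v)):
    """Time-weighted norm: max over k of a**(-lam * k) * ||h[k]||.

    h    list of vectors h[0], ..., h[K]
    lam  weighting exponent (> 0)
    a    base of the weight (> 1)
    """
    return max(a ** (-lam * k) * vec_norm(hk) for k, hk in enumerate(h))


signal = [[1.0], [2.0], [4.0]]           # grows with k
print(lambda_norm(signal, lam=1, a=2))   # weighted values are 1, 1, 1
print(max(abs(v[0]) for v in signal))    # plain sup-norm for comparison
```

The weighting discounts later time steps, so a signal that grows along the time axis can still have a small weighted norm; on a finite horizon the weighted norm and the plain sup-norm bound each other up to constants.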
In practice, the green time $g_{j,i}$ for phase $i$ at intersection $j$ must be constrained by $g_{j,i} \in [g^{\min}_{j,i}, g^{\max}_{j,i}]$ to guarantee that sufficient green time is allocated to pedestrians. Therefore, it is necessary to investigate the ILC strategy with restricted inputs bounded by certain lower and upper limits. Thus, a saturator is introduced to express such input constraints.
Before investigating the iterative convergence under input constraints, an important property is firstly introduced.
The proof in details can be found in [32].
The state-space equation of traffic system (6) under constraints is The iterative learning control law under constraints is constructed below: where  denotes the iterations number and  is an iterative learning gain matrix.By the definition of output error,   ( + 1) =   ( + 1) −   ( + 1), and   () is the desired output at time .The convergence property with the ILC law ( 13) is summarized in the following theorem.
Theorem 5.Under Assumptions 1-4, choosing the learning gain matrix  of ( 13) such that ‖ − ‖ < 1, the output of the traffic system (12) will converge to the desired output iteratively; that is, Proof.For any given control input (),  ∈ [0,], the general solution () to the traffic system (12) can be written in the following form: Thus, for the th iteration, it can be known from (15) that, for any  ∈ [0, ], Furthermore, using ( 16) together with ( 13) and Assumption 1, we have Using Assumption 1 gives Thus, by the definition of the output error   () and ( 17)-( 18), we have Here, Accordingly (20) Thus, we can complete the proof of the theorem.
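The core of such a convergence argument can be sketched for the unconstrained case. The derivation below is an illustration under stated assumptions, not the proof of the paper: it assumes identity state and output matrices, an update of the form $u_{i+1}(k) = u_i(k) + \Gamma\,\delta_i(k+1)$, and the symbols $\delta_i$, $\Gamma$, and $w(k)$, all of which are notation introduced here.

```latex
% Assumed notation: \delta_i(k) = y_d(k) - y_i(k) is the tracking error at iteration i,
% \Gamma is the learning gain, w(k) is the iteration-invariant disturbance term.
\begin{align*}
y_i(k+1) &= y_i(k) + B\,u_i(k) + w(k), \\
u_{i+1}(k) &= u_i(k) + \Gamma\,\delta_i(k+1), \\
\delta_{i+1}(k+1) - \delta_i(k+1)
  &= -\bigl[y_{i+1}(k) - y_i(k)\bigr] - B\bigl[u_{i+1}(k) - u_i(k)\bigr] \\
  &= \delta_{i+1}(k) - \delta_i(k) - B\Gamma\,\delta_i(k+1), \\
\delta_{i+1}(k+1) &= (I - B\Gamma)\,\delta_i(k+1) + \delta_{i+1}(k) - \delta_i(k).
\end{align*}
```

The repeated disturbance cancels when two consecutive iterations are subtracted, which is why it does not enter the recursion. The first term on the right contracts when $\|I - B\Gamma\| < 1$, and the remaining terms involve the earlier time step $k$, which is the kind of coupling that a time-weighted norm is normally used to dominate.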
Remark 6. According to Assumption 3, if the matrix considered there is full rank, the learning gain matrix can be easily determined from the convergence condition of Theorem 5; that is, the learning gain matrix can be calculated from the inverse of that matrix, where inv(⋅) denotes the inverse of a matrix.
Remark 7. It is interesting to note that the learning convergence depends solely on the matrix appearing in the convergence condition. The unknown exogenous disturbances, as long as they are repeatable, will be eliminated by the learning control and therefore do not affect the learning convergence.
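The two remarks can be illustrated on a single link. The sketch below assumes scalar dynamics x(k+1) = x(k) - b g(k) + d(k), a saturated P-type update g_{i+1}(k) = sat(g_i(k) + gain * e_i(k+1)) with the gain set to the inverse of the input coefficient -b, and made-up numbers. The function names and the exact form of the update are assumptions of this illustration.

```python
def saturate(value, lower, upper):
    """Clamp a green time to its allowed range."""
    return max(lower, min(upper, value))


def run_iteration(greens, demand, x0, b):
    """Simulate one day: x(k+1) = x(k) - b * g(k) + d(k). Returns x(0..K)."""
    x = [x0]
    for g, d in zip(greens, demand):
        x.append(x[-1] - b * g + d)
    return x


def ilc_single_link(demand, x_desired, g_init, b, g_min, g_max, iterations):
    """Saturated P-type ILC; returns final greens and max |error| per iteration."""
    greens = [g_init] * len(demand)
    gain = -1.0 / b          # inverse of the (scalar) input coefficient -b
    history = []
    for _ in range(iterations):
        x = run_iteration(greens, demand, x_desired, b)   # same initial state each day
        errors = [x_desired - xk for xk in x[1:]]         # e_i(k+1), aligned with g(k)
        history.append(max(abs(e) for e in errors))
        greens = [saturate(g + gain * e, g_min, g_max)
                  for g, e in zip(greens, errors)]
    return greens, history


# Hypothetical repeated demand (veh per cycle), unknown to the controller.
demand = [36.0, 38.0, 40.0, 42.0, 40.0, 38.0]
greens, history = ilc_single_link(demand, x_desired=22.0, g_init=38.0, b=1.0,
                                  g_min=10.0, g_max=100.0, iterations=8)
print(history)
print(greens)
```

In this example the learned greens end up matching the demand profile even though the controller never observes the demand directly, only the resulting errors. The largest error is not monotone from one iteration to the next, which is consistent with convergence being stated in a weighted norm rather than in the plain maximum error.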
Since the ILC strategy does not explicitly consider the cycle constraints, these constraints are imposed after application of (13). For this reason, the following optimization problem is solved at each sample time $k$ for each intersection $j$ so as to obtain feasible green times on the basis of the ILC-based green times $g_{j,i}(k)$ resulting from (13):
\[ \min_{G_{j,i}(k)} \sum_{i \in F_j} \bigl( g_{j,i}(k) - G_{j,i}(k) \bigr)^2 \quad \text{subject to} \quad \sum_{i \in F_j} G_{j,i}(k) + L_j = C, \quad G_{j,i}(k) \in [g^{\min}_{j,i}, g^{\max}_{j,i}] \;\; \forall i \in F_j, \]
where $G_{j,i}(k)$ is the closest solution in Euclidean space to $g_{j,i}(k)$. This is a quadratic programming problem which can be solved using an efficient algorithm that was initially developed in [37].
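The projection step can be sketched as follows: find the green times closest to the raw ILC values that fill the cycle exactly and respect the bounds. The sketch uses a plain bisection on a common shift of all greens, which is a simple stand-in and not the algorithm of [37]; the function name and the example values are hypothetical.

```python
def project_greens(raw, g_min, g_max, cycle, lost_time, iters=100):
    """Euclidean projection of raw greens onto
    {G : sum(G) + lost_time == cycle, g_min[i] <= G[i] <= g_max[i]}.

    The optimum has the form G[i] = clip(raw[i] - shift); the shift is
    found by bisection because the clipped sum is non-increasing in it.
    """
    target = cycle - lost_time
    if sum(g_min) > target or sum(g_max) < target:
        raise ValueError("cycle constraint cannot be met within the green-time bounds")

    def clipped(shift):
        return [min(hi, max(lo, g - shift)) for g, lo, hi in zip(raw, g_min, g_max)]

    lo_shift = min(g - hi for g, hi in zip(raw, g_max))   # every green at its maximum
    hi_shift = max(g - lo for g, lo in zip(raw, g_min))   # every green at its minimum
    for _ in range(iters):
        mid = 0.5 * (lo_shift + hi_shift)
        if sum(clipped(mid)) > target:
            lo_shift = mid
        else:
            hi_shift = mid
    return clipped(0.5 * (lo_shift + hi_shift))


# Hypothetical two-phase intersection: 120 s cycle, 10 s lost time.
feasible = project_greens([70.0, 70.0], [10.0, 10.0], [100.0, 100.0], 120.0, 10.0)
print(feasible)
```

When no bound is active the projection simply removes the same amount from every phase, so two raw greens of 70 s that overfill the cycle by 30 s are both reduced by 15 s.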

Simulation Studies
In order to test and validate the effectiveness of the MFD-based ILC approach to the urban traffic signal control problem, the example network of Figure 3 with 16 intersections and 32 one-way roads is considered as a test bed for simulations. Note that the exit links of the network are not taken into account. To simulate real traffic, a microscopic traffic simulation tool, VISSIM 4.30, is employed for developing the simulated network model and examining the performance of the proposed strategy. The ILC strategy is developed using MATLAB, and the programming language Visual Basic (VB) is used to drive the simulation through the Component Object Model (COM) interface provided by VISSIM.

5.1. Network Description and Simulation Setup.
It has been shown that an MFD is well defined under the specific condition that the spatial distribution of network vehicle accumulation is homogeneous and the network topology is regular [20]. As the paper mainly focuses on homogeneously distributing the vehicle accumulation in the network by the proposed ILC strategy, a network with the regular topology depicted in Figure 3 is investigated. All the link lengths in the network are equal to 500 m and each link has 2 lanes. As illustrated by intersections 6 and 11 in Figure 3, the turning rates at the intersections are set to 2/3 for through movements and 1/3 for left or right movements. The marked peripheral intersections in Figure 3 are the source nodes where traffic flows enter the network. For comparison purposes, the ILC strategy is compared with the fixed-time (FT) control strategy executed in the same network with the same simulation setup; FT control is commonly used as a reference approach for assessing the efficiency of newly developed control methods. The fixed-time signals are designed through a method known as Webster's procedure [38]. More specifically, the nominal green time assigned to each phase of an intersection is computed from the nominal inflow to each link (veh/h), the saturation flow of the link, and the cycle time and lost time of the intersection, taken over the set of phases of that intersection.
As the default traffic configuration and vehicle behavior defined in VISSIM are chosen for the simulations, the saturation flow rate is 2000 veh/h per lane (refer to the user's manual of VISSIM). All the intersections in the network have two phases. The cycle times of all intersections in the network are set to 120 s, and $\Delta = C$ is taken as the control interval for all strategies. The average vehicle length is 6.7 m. For the ILC strategy, the FT signal plans are used as the initial input values and the desired outputs of all links are set to $y_d = 22$ veh. The learning process is iterated 15 times.
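A few quantities implied by this setup can be checked with simple arithmetic. The storage estimate below assumes that a queued vehicle occupies one average vehicle length per lane, which is an assumption of this sketch rather than a figure given in the paper; the variable names are hypothetical.

```python
cycle_s = 120            # common cycle time (s)
horizon_s = 3600         # 1 h simulation horizon (s)
link_length_m = 500
lanes = 2
veh_length_m = 6.7       # average vehicle length (m)
sat_flow_per_lane = 2000 # veh/h per lane
desired_veh = 22         # desired number of vehicles per link

cycles = horizon_s // cycle_s                          # control steps per simulated hour
sat_flow_link = sat_flow_per_lane * lanes              # veh/h for a two-lane link
storage_veh = link_length_m * lanes / veh_length_m     # assumed bumper-to-bumper storage
occupancy = desired_veh / storage_veh                  # share of storage at the set point

print(cycles, sat_flow_link, round(storage_veh, 1), round(occupancy, 3))
# prints: 30 4000 149.3 0.147
```

Under this assumption the desired accumulation of 22 veh corresponds to roughly 15% of a link's storage, so the set point keeps links well away from spillback.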
In order to measure and compare the performance of the proposed ILC approach and FT plan, four traffic scenarios are considered: one balanced scenario and three unbalanced scenarios.For each scenario, the traffic flows from all the source nodes entering the network have two values: 3000 veh/h and 1500 veh/h.The distributions of the traffic demands in the four different scenarios are given in Table 1.The simulation horizon for each scenario is 1 h (30 cycles).

5.2. Simulation Results.
To verify the effectiveness of the proposed ILC strategy, evaluation is conducted by using the corresponding MFDs and performance indices defined in VISSIM.
Figure 4 presents the MFDs resulting for the four considered scenarios under the different control strategies. Note that the throughput is used as a surrogate for the network outflow since the network outflow is not directly measured by VISSIM. Throughput in VISSIM is referred to as "number of vehicles that have left the network." The figures plot the throughput-accumulation relationship in the network for the whole simulation period. Each measurement point in the figures corresponds to 120 s. Figure 4 confirms the existence of an MFD for urban networks, whose exact shape is closely related to the applied signal control strategy. Also, for the same control strategy, the shape of the MFD is seen to depend on the spatial distribution of the vehicle accumulation in the network. For the FT control strategy, as shown in Figure 4, MFDs with different shapes are obtained under the four demand scenarios. In Figure 4(a), as the network vehicle accumulation is distributed more homogeneously under scenario 1, the resulting slope (average speed) is the highest and no traffic congestion occurs. The throughput quickly reaches its maximum value (around 520 veh per cycle), corresponding to a critical accumulation range from 1600 to 1800 veh. For the other three unbalanced scenarios, due to the different levels of heterogeneity in the spatial distribution of vehicle accumulation, the traffic states exhibit varying degrees of congestion. Among them, the most severe traffic congestion takes place under scenario 4, followed by scenarios 3 and 2.
As Figure 4(d) shows, the network throughput values decrease to only 74 vehs per cycle at the end of the simulation, which indicates that the network will suffer from a network-wide gridlock.From the MFDs resulting for all the demand scenarios under the ILC strategy, it can be seen that the traffic states of the network are unsaturated or partly saturated, which means that the vehicles run more efficiently and smoothly in the network under the ILC strategy.In general, the main reason for this is that the ILC controller is able to distribute the vehicle accumulation in the network more homogeneously than the FT controller.
The control performance of the different control strategies can also be compared by the measures of effectiveness (MOEs) defined in VISSIM, such as average delay time per vehicle (ADT), average number of stops per vehicle (AVS), average speed (AS), and total travel time (TTT). The integrated simulation results are listed in Table 2.
As displayed in Table 2, the ILC strategy strongly outperforms the FT strategy in scenarios 2, 3, and 4, but not in scenario 1. The reason is that the vehicle accumulation is homogeneous under scenario 1, while different levels of heterogeneity exist under scenarios 2, 3, and 4. Therefore, the ILC controller is able to adjust its signal plan to distribute the vehicle accumulation more homogeneously and achieve better control performance than the FT controller under unbalanced demand scenarios. For instance, in scenario 4, the ADT, AVS, and TTT under the FT strategy are 593.2, 81.5, and 2256.2, respectively. Under the ILC controller, these indices decrease to 160.1, 14.3, and 1321.3, which are improvements of 73.0%, 82.5%, and 41.4%, respectively. The AS also increases from 10.2 to 26.7, an increase of 161.8% relative to the FT value (equivalently, the FT speed is 61.8% lower than the ILC speed). A similar analysis can be conducted for the other scenarios, as shown in Table 2.
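The percentage changes for scenario 4 can be rechecked from the index values quoted above. The helper name `percent_change` is hypothetical; the only point of the sketch is that a change must be referred to a stated baseline, and that for the average speed the figure differs depending on whether the FT or the ILC value is used as the baseline.

```python
def percent_change(old, new):
    """Change from old to new, as a percentage of the old (baseline) value."""
    return 100.0 * (new - old) / old


# Scenario 4 values as quoted in the text (FT versus ILC).
ft = {"ADT": 593.2, "AVS": 81.5, "TTT": 2256.2, "AS": 10.2}
ilc = {"ADT": 160.1, "AVS": 14.3, "TTT": 1321.3, "AS": 26.7}

changes = {name: round(percent_change(ft[name], ilc[name]), 1) for name in ft}
print(changes)

# Speed difference expressed relative to the ILC value instead of the FT value.
as_relative_to_ilc = round(100.0 * (ilc["AS"] - ft["AS"]) / ilc["AS"], 1)
print(as_relative_to_ilc)
```

The delay, stop, and travel-time reductions come out as 73.0%, 82.5%, and 41.4% of the FT values, while the speed gain is 161.8% of the FT value and 61.8% of the ILC value.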
The four MOEs achieved by the different control strategies at every control time step for scenarios 2, 3, and 4 are plotted in Figures 5, 6, and 7, respectively. These figures reveal how the four MOEs change with time; they conform well with the MFD results of Figures 4(b)-4(d) and provide more evidence for the superiority of the newly developed ILC strategy over the FT scheme.
Furthermore, in order to verify that the ILC strategy is able to distribute the vehicle accumulation in the network more homogeneously, we show the spatiotemporal network density maps under the two controllers in scenario 2 and scenario 4, respectively (see Figures 8 and 9). The spatiotemporal network density map displays how the link densities change with time (cycle) during the simulation horizon. The color on the density map denotes the density value of a certain link at the corresponding time, and the density value varies from 0 to 1. As expected, the density map is unevenly distributed under the FT controller in both scenarios, which means that the link densities in the network are scattered more heterogeneously; by comparison, the density map is more homogeneously distributed under the ILC controller, which shows that the link densities in the network are less heterogeneously scattered than under the FT controller. Therefore, the result reveals that the ILC controller is able to balance the vehicle accumulation in the network and improve the network mobility.
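One simple way to put a number on what the density maps show is the standard deviation of the link densities at a given cycle. This index is an assumption of the sketch, not a measure reported in the paper, and the density values below are made up.

```python
from statistics import pstdev


def heterogeneity(link_densities):
    """Population standard deviation of normalized link densities (0 to 1)."""
    return pstdev(link_densities)


# Hypothetical snapshots of four links with the same mean density of 0.2.
balanced = [0.2, 0.2, 0.2, 0.2]
unbalanced = [0.0, 0.4, 0.0, 0.4]

print(heterogeneity(balanced), heterogeneity(unbalanced))
```

Both snapshots hold the same total accumulation, yet the second has some links empty and others twice as dense, which is the kind of uneven pattern the text associates with lower network outflow.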

Conclusion
In this paper, we propose a novel approach named ILC strategy for split control of urban traffic networks with MFD representation.The presented approach aims at distributing the vehicle accumulation in the urban networks as homogeneously as possible and maintaining the number of vehicles in each link around a desired point under various traffic demands.The main advantage of our approach is that it is able to learn and improve control performance under a repeated traffic environment and does not require high computational effort.Despite the existence of model mismatches and exogenous disturbances in the traffic model, the rigorous analysis reveals that the ILC method can guarantee the asymptotic learning convergence.Finally, the effectiveness of the proposed ILC approach is demonstrated by use of the corresponding MFDs and performance indices in a test bed network.
Future research will deal with the comparison of the proposed ILC approach with other urban traffic control strategies (e.g., feedback control) in more elaborate simulation and aim at developing a combined control method integrating the feedback and ILC strategies to achieve better control performance.

Figure 1: Macroscopic fundamental diagram for urban traffic networks.

Figure 2: Traffic flow dynamics in link $z$.

Figure 3: The test bed network.

Figure 4: MFDs resulting for the four scenarios under different control strategies.

Figure 5: Comparison of the four MOEs at every control time step in scenario 2.

Figure 6: Comparison of the four MOEs at every control time step in scenario 3.

Figure 7: Comparison of the four MOEs at every control time step in scenario 4.

Figure 8: Spatiotemporal density of the network under different controllers in scenario 2.

Figure 9: Spatiotemporal density of the network under different controllers in scenario 4.

Table 1: The supply flow rates (veh/h) for the network.

Table 2: Comparison of the simulation results of the four MOEs defined in VISSIM.