General-Logistic-Based Speed-Density Relationship Model Incorporating the Effect of Heavy Vehicles

Owing to its mathematical elegance and empirical accuracy, the speed-density model is critical in solving macroscopic traffic problems. This study developed an improved general-logistic-based speed-density model, which is a new method in macroscopic traffic flow theory. This article extensively discusses the properties of the general-logistic-based speed-density model. The physical meanings and values of all the parameters were determined based on the effect of heavy vehicles and the method for the linear and nonlinear regression analysis. The accuracy and versatility of the developed model were also found to be excellent based on the field data and relative error.


Introduction
The fundamental diagram in the field of traffic flow theory is essentially a two-dimensional graph showing the relationship between travel speed and traffic flow density (i.e., the speed-density relationship model). This fundamental diagram is widely used owing to its concise mathematical form and excellent statistical accuracy. Therefore, a well-organized speed-density relationship model should have simple mathematical equations, physically defined parameters, high fitting accuracy, and strong robustness. However, the main models that are currently in use do not simultaneously satisfy the abovementioned requirements.
Research on the speed-density relationship model started with Greenshields' linear model [1], a groundbreaking work that showed perfect mathematical elegance. However, this model is insufficient for matching empirical observation data, especially under current traffic conditions in which traffic flow is more or less affected by heavy vehicles. Greenshields' oversimplified model was revised or improved in several studies that followed. Greenberg [2] developed a mathematically simple speed-density model using the microscopic traffic flow model; however, it predicted an infinite free-flow speed. Underwood's model [3] has the same advantages as the abovementioned models, but it is not useful in cases of traffic congestion because the jam density is unlimited and the speed never goes to zero; this deficiency also appeared in the Northwestern model [4]. The Greenshields model was also used to develop several other models, including the Drew model [5], Pipes-Munjal model [6], and modified Greenshields model [7]. Consequently, these models have the same disadvantages as the Greenshields model, especially because they ignore the effect of heavy vehicles on the whole traffic stream. Further, some speed-density relationship models were derived from the car-following model, including the Newell model [8], Van Aerde model [9], and MacNicholas model [10]. Among these three models, the latter two were used relatively less frequently owing to their mathematical complexity. Though this deficiency was not observed in the Newell model, it always failed to match the empirical observation data because the traffic speed decayed extremely fast with increasing density. This inaccuracy could also be found in Kerner's model [11].
All of the abovementioned models could be defined as single-regime models, which means the two-dimensional speed-density graph is a single curve. Generally, the speed-density graph of multiregime models consists of two or three regimes. When single-regime models did not fit the observed data well, multiregime models were developed. The typical multiregime models are the Edie model [12], two-regime model [13], and modified Greenberg model [14]. The main principle of multiregime models is using different traditional single-regime models in different segments based on the boundaries in free flow, transitional flow, and congested flow. Further, the main limitation of current multiregime models is their inability to determine breakpoints in a scientific manner.
In the late 20th and early 21st centuries, the generalized logistic curve proved its superiority in modelling growth pattern dynamics [15][16][17][18]. Its "S-shaped" elegant mathematical form and adjustable parameters make it naturally suitable for building speed-density relationship models. In 2011, a research group led by Daiheng Ni proposed a logistic model of the equilibrium speed-density relationship [19]. In their work, a five-parameter logistic model was analyzed in detail, and a three-parameter logistic model was developed based on this analysis. This three-parameter logistic model has proven to be precise enough in traffic streams without large speed gaps between small vehicles and heavy vehicles (mainly in western countries where heavy vehicles have better dynamic characteristics).
However, in traffic flow affected by heavy vehicles, in other words, a mixed traffic flow, the effectiveness of three-parameter logistic models remains unknown. Inspired by Ni's work, the main objectives of this paper are as follows:
(i) To improve on Ni's and previous works, an accurate physical meaning is given to every parameter in the general-logistic-based speed-density model, and scientific and simple methods are sought to determine these parameters.
(ii) To improve the general-logistic-based speed-density model with mathematical elegance and empirical accuracy, the effect of heavy vehicles on the traffic stream is incorporated, which was not extensively studied in previous works.

Data Resources
To verify the efficiency of logistic models in mixed traffic flow, a large amount of traffic flow data is needed. Therefore, the raw data were collected at two road sections. Following the principle of convenience and typicality, the two road sections were chosen considering the three conditions described below:
(i) The data should be collected from expressways to eliminate the effect of pedestrians and nonmotor vehicles.
(ii) The traffic flow should have temporal distribution characteristics to facilitate the collection of traffic data in both low and high density conditions.
(iii) The road sections should not have mountainous areas to eliminate the effect of topography.
Following these conditions, the road sections K450+200-K460+700 in G30 and K1241+100-K1251+300 in G65W were chosen. Each road section had three study sites, and traffic data (including time, instantaneous speed, and vehicle type) were collected in each lane. Figure 1 shows the details of the road sections. To guarantee the availability of raw data, especially the data showing a heavy traffic density, all the data were collected during holidays (International Labour Day, National Day, and Mid-Autumn Festival).
More than 100,000 raw data points were collected. The average traffic speed, traffic volume, and mixing ratio of heavy vehicles were computed by dividing the raw data into groups, and the traffic density was obtained using the classic equation $q = k \times v$, where $q$ is the traffic volume, $k$ is the traffic density, and $v$ is the average speed. Finally, 4,000 data groups were obtained.
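The grouping step described above can be sketched as follows. This is a minimal illustration, not the procedure actually used in the study: the function name `summarize_group`, the `interval_h` input, and the use of the arithmetic mean of spot speeds are all assumptions (a harmonic, space-mean speed may be more appropriate in practice).

```python
from statistics import mean

def summarize_group(speeds_kmh, heavy_flags, interval_h, lanes=1):
    """Aggregate one group of spot observations into macroscopic quantities.

    speeds_kmh  -- instantaneous speeds of the vehicles in the group (km/h)
    heavy_flags -- True for each heavy vehicle, aligned with speeds_kmh
    interval_h  -- length of the observation interval in hours (hypothetical input)
    lanes       -- number of lanes covered by the group
    """
    n = len(speeds_kmh)
    q = n / interval_h / lanes   # traffic volume, veh/h per lane
    v = mean(speeds_kmh)         # average speed, km/h
    k = q / v                    # density from q = k * v, veh/km per lane
    r = sum(heavy_flags) / n     # mixing ratio of heavy vehicles
    return {"q": q, "v": v, "k": k, "r": r}

# Four vehicles observed in 0.01 h (36 s) on one lane, one of them heavy.
group = summarize_group(
    [100.0, 80.0, 90.0, 110.0],
    [False, True, False, False],
    interval_h=0.01,
)
print(group)
```

Each returned dictionary corresponds to one data group (volume, speed, density, and mixing ratio) of the kind used in the later fitting steps.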

General Logistic Model
As mentioned earlier, the logistic model is naturally suitable for building speed-density relationship models owing to its sigmoidal function, elegant mathematical form, and adjustable parameters. The general function of the logistic model is given below [20].
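The parameter description that follows (five parameters, with $c > 0$ and $g > 0$, and horizontal asymptotes as the input approaches zero and infinity) is consistent with the five-parameter logistic function in its common form. That form is written out here as an assumption and should be checked against [20]:

```latex
y = f(x;\, p) = d + \frac{a - d}{\left[\, 1 + \left( \dfrac{x}{c} \right)^{b} \right]^{g}}
\qquad (1)
```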
where p ≡ [a, b, c, d, g] is the parameter vector of the logistic model; the domain of the parameters is restricted so that c > 0 and g > 0. As observed in the equation, the general logistic model has five parameters. To build a speed-density relationship model based on the general logistic model, each parameter should be given a clear and precise physical meaning. Therefore, the impact of each parameter on the function should be studied.
Figure 2 shows the families of curves that can be generated from (1) by varying one parameter at a time. The figure clearly shows some characteristics of the general logistic model. Overall, Figure 2 indicates that the output approaches one horizontal asymptote as the input approaches zero and another horizontal asymptote as the input approaches infinity. Moreover, a transitional region exists between the asymptotic regions, and it contains a single inflection point.
Figure 2 reveals some of the ways in which these parameters affect the resulting curves.
(i) Parameter a controls the position of the top asymptote.
(ii) The magnitude of b controls the steepness of the curve between asymptotes. Its sign, along with the order of a and d, controls the slope of the curve. It solely controls the rate of approach to the top asymptote, and, along with g, it controls the approach to the bottom asymptote.
(iii) Parameter c controls the position of the transition region in the input.
(iv) Parameter d controls the position of the bottom asymptote.
(v) Parameter g controls the rate of approach to the bottom asymptote.
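The parameter roles listed above can be checked numerically with a small sketch. It assumes the five-parameter logistic form y = d + (a - d) / (1 + (x/c)^b)^g; the function name `logistic5` and all numerical values are hypothetical and chosen only for illustration.

```python
def logistic5(x, a, b, c, d, g):
    """Assumed five-parameter logistic: d + (a - d) / (1 + (x / c) ** b) ** g.

    Intended for x >= 0 with b > 0; c and g must be positive.
    """
    if c <= 0 or g <= 0:
        raise ValueError("c and g must be positive")
    return d + (a - d) / (1.0 + (x / c) ** b) ** g

# Hypothetical parameter values, for illustration only.
params = dict(a=100.0, b=3.0, c=40.0, d=5.0, g=1.0)

near_zero = logistic5(1e-9, **params)   # close to the top asymptote a
far_out = logistic5(1e6, **params)      # close to the bottom asymptote d
midpoint = logistic5(40.0, **params)    # at x = c with g = 1: halfway between a and d

# A larger g pulls the curve toward the bottom asymptote at the same input.
lower = logistic5(40.0, a=100.0, b=3.0, c=40.0, d=5.0, g=2.0)

print(near_zero, far_out, midpoint, lower)
```

With b > 0 and a > d, the curve falls from a to d as the input grows, which is the shape needed for a speed-density relationship.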
Based on the above analysis, and combined with the observation of traffic flow, some of the parameters in the speed-density relationship model based on the general logistic model could be given precise physical meanings [10,19]. Table 1 gives the physical meaning of some parameters, which can be used to express the speed-density relationship model based on the general logistic model as follows:
Usually, parameters b and g are treated as dimensionless constants to control the shape of the curve [19], and the values of these constants are restricted based on statistical data. Therefore, determining these two parameters without giving them precise physical meanings always presents some difficulties and uncertainties. Although the parameter $k_t$ is given a clear physical meaning, the method for calculating the value of this parameter remains unknown [19]. Therefore, the current work on logistic models based on the speed-density relationship should be further extended. The remainder of this article focuses on the problems mentioned above.
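If (1) is assumed to be the five-parameter logistic function $y = d + (a - d)/[1 + (x/c)^b]^g$, and if the top asymptote $a$, the bottom asymptote $d$, and the transition position $c$ are identified with the free-flow speed $v_f$, the stop-and-go speed $v_b$, and the boundary density $k_t$, then the speed-density relationship takes the form sketched below. Both the functional form and this mapping are assumptions made for illustration.

```latex
v(k) = v_b + \frac{v_f - v_b}{\left[\, 1 + \left( \dfrac{k}{k_t} \right)^{b} \right]^{g}}
```

Under this form, $v \to v_f$ as $k \to 0$ and $v \to v_b$ as $k \to \infty$ (for $b > 0$), while $b$ and $g$ shape the transition between the two.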

Improved General-Logistic-Based Speed-Density Model

Statistical Results.
As stated earlier, some of the parameters in the general-logistic-based speed-density relationship model are given precise physical meanings, but some parameters remain unknown. Moreover, the methods for calculating or determining the values of parameters $v_f$, $v_b$, and $k_t$ remain a subject of debate. To solve these problems, the statistical results are first presented. As shown in Figure 3(a), when the traffic density is lower than 40 veh/(km·lane), the empirical volume-density curve is approximately linear, in agreement with the results from previous studies (Payne 1984; Bando 1995). In other words, when the traffic stream is in a free-flow condition, the volume-density relationship is linear. However, when the traffic density exceeds 40 veh/(km·lane), the empirical data are scattered over a two-dimensional region. As shown in Figure 3(b), the initial speed (in free-flow conditions) varied from 50 km/h to 120 km/h, and the data are also distributed over a two-dimensional area. These characteristics cause two problems: first, the boundary between free flow and synchronized flow is blurred, and second, the volume-density and speed-density relationships cannot be depicted by a single line.
In actual road traffic, heavy vehicles play a critical role in traffic streams, an effect that was not extensively studied in previous works. Therefore, considering the effect of heavy vehicles in the speed-density relationship model is essential to arrive at a feasible and suitable solution to the problems mentioned above. The effects of heavy vehicles can be summarized as follows:
(i) The low operation speed of heavy vehicles significantly decreases the average speed of the vehicle platoon in free-flow conditions.
(ii) The behaviors during overtaking have a relatively obvious effect on the average speed of the vehicle platoon in free-flow conditions.
(iii) In synchronized and congested flows, the effects on the average speed could be ignored, but the size of heavy vehicles has an effect on traffic density.
Based on these observations, an improved speed-density relationship model will be developed based on the general logistic model and the mixing ratio of heavy vehicles.

Boundary between Free Flow and Synchronized Flow.
To determine parameters $v_f$, $v_b$, and $k_t$ in the improved general-logistic-based speed-density relationship model considering the heavy vehicle mixing ratio (represented by "r"), the boundary between free flow and synchronized flow should be studied first. Previous studies did not provide a clear method for determining this boundary [10,19]. However, the volume-density curve is widely accepted to follow a linear distribution under free-flow conditions (Payne 1984; Bando 1995). Combined with the three effects of heavy vehicles discussed in Section 4.1, the boundary between free and synchronized flow is assumed to be related to the heavy vehicle mixing ratio.
To determine whether the data follow a given functional form, the coefficient of determination $R^2$ is calculated, and the proximity of the adjusted value (Adj $R^2$) to 1 indicates the goodness of fit to the raw data. Therefore, taking Δr = 5% as the step size, the statistical data were divided into 11 groups (r = 0%, 5%, 10%, 15%, 20%, 25%, 30%, 35%, 40%, 45%, and 50%). In each group, taking Δk = 0.1 as the step size, the goodness of the linear fit was tested progressively to find the inflection point of Adj $R^2$; in other words, the boundary between free flow and synchronized flow was determined. The result is shown in Figure 4.
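The progressive goodness-of-fit test can be sketched as below. This is one possible reading of the procedure, not the authors' implementation: the functions `adjusted_r2` and `find_boundary` are hypothetical, and the "inflection point" is interpreted here as the largest density threshold whose Adj $R^2$ is still within a tolerance `tol` of the best value observed.

```python
def adjusted_r2(xs, ys):
    """Adjusted R^2 of an ordinary least-squares line y = intercept + slope * x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - my) ** 2 for y in ys)
    r2 = 1.0 - ss_res / ss_tot
    return 1.0 - (1.0 - r2) * (n - 1) / (n - 2)

def find_boundary(points, k_start, k_step, tol=0.01):
    """Grow the density window step by step and track the linear fit quality.

    points -- iterable of (density, volume) pairs for one mixing-ratio group
    Returns the largest upper density bound whose Adj R^2 stays within tol
    of the best Adj R^2 seen (an assumed stand-in for the inflection point).
    """
    pts = sorted(points)
    k_end = pts[-1][0]
    scores = []
    i = 0
    while k_start + i * k_step <= k_end:
        k_max = k_start + i * k_step
        subset = [(k, q) for k, q in pts if k <= k_max]
        if len(subset) >= 3:          # need at least 3 points for Adj R^2
            ks, qs = zip(*subset)
            scores.append((k_max, adjusted_r2(ks, qs)))
        i += 1
    best = max(score for _, score in scores)
    return max(k_max for k_max, score in scores if score >= best - tol)

# Synthetic, noise-free volume-density data: linear up to k = 40, then falling.
data = [(float(k), 90.0 * k if k <= 40 else 3600.0 - 30.0 * (k - 40))
        for k in range(1, 101)]
k_t = find_boundary(data, k_start=5.0, k_step=1.0, tol=1e-6)
print(k_t)
```

With real, scattered data a looser tolerance would be needed, and the step size of 0.1 used in the text would replace the unit step used in this synthetic example.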
According to the inflection point analysis, the boundaries differ between groups, which validates the assumption that the boundary between free flow and synchronized flow is related to the heavy vehicle mixing ratio. The value of parameter $k_t$ in each group could therefore be determined.
Additionally, the parameter $v_b$ stands for the average travel speed under unstable stop-and-go conditions; however, this value is near zero and is difficult to obtain. Although neglecting $v_b$ in (2) will slightly affect the value of $v$, this effect can be considered negligible. Therefore, taking $v_b$ as 0 and $v_f$ as the average speed in free-flow conditions, the values of $v_f$ and $k_t$ in the improved speed-density relationship model could be calculated and are given in Table 2.

Determining Parameters "b" and "g".
As shown in Figure 2(e), parameter "g" controls the rate of approach to the bottom asymptote, that is, the value of traffic density in congested flow. In congested flow, vehicles are stuck in the traffic stream or move at a speed close to 0 km/h, i.e., "bumper-to-bumper flow". Under this condition, the effect of overtaking behaviors could be ignored because all of the vehicles are in an extremely constrained driving environment. However, as mentioned at the end of Section 4.1, the size of heavy vehicles affects the traffic density under this condition. Using the same method described in Section 4.2, the statistical data were divided into 11 groups. The traffic densities $k_m$ in congested flow were obtained, including "bumper-to-bumper flow" and traffic jams. In other words, $k_m$ values with different heavy vehicle mixing ratios were obtained, as shown in Figure 5. Figure 5 reasonably suggests that parameter "g" is directly related to traffic density in congested flow. Further, $k_m$ varies with the heavy vehicle mixing ratio. Therefore, parameter "g" could be considered the congested density coefficient, given by $g = f_1(k_m)$.
As stated in Section 3, b controls the steepness of the curve between asymptotes. That is to say, in the general-logistic-based speed-density relationship model, with a fixed traffic density, a high value of "b" results in a low value of the traffic speed. Considering the observations in Section 4.1, parameter "b" can be concluded to be directly related to the heavy vehicle mixing ratio, and this parameter could be given physical meaning as the heavy vehicle coefficient. Thus, the relationship $b = f_2(r)$ was obtained. Then, (2) could be rewritten as follows:
Using the analysis method described in Section 4.2 and setting the step size to Δr = 5%, the statistical data were divided into 11 groups (r = 0%, 5%, 10%, 15%, 20%, 25%, 30%, 35%, 40%, 45%, and 50%). Then, the nonlinear fitting method was used for the different groups to find the best values of "b" and "g". The fitting result is shown in Figure 6, and the best values of "b" and "g" are given in Table 3.
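The per-group fitting of "b" and "g" can be illustrated with a deliberately simple sketch. It assumes the speed-density form v = v_b + (v_f - v_b) / (1 + (k/k_t)^b)^g with v_b = 0, and it replaces a proper nonlinear least-squares routine (for example, `scipy.optimize.curve_fit`) with a plain grid search so that the example needs only the standard library. The functions `speed` and `fit_b_g` and all numbers are hypothetical.

```python
def speed(k, v_f, k_t, b, g, v_b=0.0):
    """Assumed general-logistic speed-density form (not taken from the paper)."""
    return v_b + (v_f - v_b) / (1.0 + (k / k_t) ** b) ** g

def fit_b_g(data, v_f, k_t, b_grid, g_grid):
    """Grid search for the (b, g) pair minimizing the sum of squared errors.

    data -- iterable of (density, observed speed) pairs for one mixing-ratio group
    v_f, k_t are held fixed, as they are determined beforehand in the text.
    """
    best = None
    for b in b_grid:
        for g in g_grid:
            sse = sum((v - speed(k, v_f, k_t, b, g)) ** 2 for k, v in data)
            if best is None or sse < best[0]:
                best = (sse, b, g)
    return best[1], best[2]

# Synthetic group generated from known (hypothetical) parameters b = 3.0, g = 0.5.
v_f, k_t = 100.0, 40.0
data = [(float(k), speed(float(k), v_f, k_t, 3.0, 0.5)) for k in range(5, 121, 5)]

b_grid = [1.0 + 0.5 * i for i in range(9)]    # 1.0, 1.5, ..., 5.0
g_grid = [0.1 * i for i in range(1, 21)]      # 0.1, 0.2, ..., 2.0
b_hat, g_hat = fit_b_g(data, v_f, k_t, b_grid, g_grid)
print(b_hat, g_hat)
```

Repeating this for each of the 11 mixing-ratio groups would give one (b, g) pair per group, which is the kind of result that can then be regressed against r and $k_m$.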
Using the best values of the congested density coefficient $g = f_1(k_m)$ and heavy vehicle coefficient $b = f_2(r)$, the relationship between $g$ and $k_m$ and that between $b$ and $r$ could be determined using linear regression. The results of linear regression are presented in Figure 7 and Table 4.
The values of $v_f$, $k_t$, and $k_m$ in different traffic streams with different compositions are presented in Table 2 and Figure 5. However, the model was developed based on the statistical data obtained from expressways G30 and G65W. Therefore, the versatility and accuracy of the model should be further verified. Hence, the model should be validated based on two aspects. First, the data to verify the accuracy of the model should be collected from another expressway. Second, the model should be compared with other speed-density relationship models.
The data were collected from a section of the Xi'an Ring Expressway G3001, and the observation point was at K25+500. The improved model is directly related to the mixing ratio of heavy vehicles; therefore, data with a mixing ratio of 10% were chosen to verify the model, with a total of 1680 data points. According to Table 2 and Figure ... Therefore, it appears that most data points are around the free-flow condition in Figure 8. Figure 8 demonstrates that the improved model fits the statistical data from G3001 well, and the value of Adj $R^2$ is 0.9468. Although this value is lower than those in Table 3, it still supports the usefulness and versatility of the improved model. Moreover, as mentioned above, the accuracy of this model should also be considered. Therefore, the relative error is introduced in this section and is given below.

$E = \dfrac{\lvert v_e(k) - v_m(k) \rvert}{v_e(k)}$
In this equation, $v_e(k)$ denotes the empirically observed speed-density mean, and $v_m(k)$ represents a mathematical speed-density model. Theoretically, if the proposed model matches the empirical mean of speed-density observations in a perfect manner, this error will be 0, but this is not possible in reality. As a result, this error will be in the range of [0, 1]; the closer the error is to 0, the better the model performs. Taking Ni's logistic-based model [19] as the main comparison object, along with the Greenshields, Greenberg, Underwood, and Kerner models, the theoretical $v_m(k)$ values were calculated using the data obtained from G3001. The relative errors for different traffic densities could be calculated, as shown in Figure 9.
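The comparison can be sketched as follows, assuming the conventional definition of relative error, |v_e(k) - v_m(k)| / v_e(k). The helper names, the empirical means, and the Greenshields parameters (free-flow speed 100 km/h, jam density 120 veh/km) are hypothetical values used only to show the calculation.

```python
def relative_error(v_e, v_m):
    """Relative error between an empirical mean speed and a model speed."""
    if v_e <= 0:
        raise ValueError("empirical mean speed must be positive")
    return abs(v_e - v_m) / v_e

def error_profile(empirical, model):
    """Relative error at each density.

    empirical -- dict mapping density to the mean observed speed at that density
    model     -- callable returning the model speed for a given density
    """
    return {k: relative_error(v, model(k)) for k, v in sorted(empirical.items())}

def greenshields(k, v_f=100.0, k_j=120.0):
    """Greenshields' linear model with hypothetical parameter values."""
    return v_f * (1.0 - k / k_j)

# Hypothetical empirical means (density in veh/km, speed in km/h).
empirical = {20.0: 90.0, 60.0: 55.0, 100.0: 20.0}
errors = error_profile(empirical, greenshields)
print(errors)
```

Evaluating `error_profile` once per candidate model over the same empirical means gives curves of the kind that can be compared across densities.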
This figure confirms the accuracy of this model. Specifically, the accuracy of the improved model is at the same level as that of Ni's logistic model, and the model is more accurate than other classical models such as the Greenshields, Greenberg, Underwood, and Kerner models. Given the result shown in Figure 9, the advantage of the improved model over Ni's logistic model requires further examination. As described in Section 1, this model is focused on mixed traffic flow; therefore, the relative errors of the improved model and of Ni's logistic model were calculated for different heavy vehicle mixing ratios (r), using the statistical data collected on G3001. The results are shown in Figure 10.
Figure 10 demonstrates that, across traffic flows with different heavy vehicle mixing ratios, the model in this paper performs better, with a relative error that remains around 0.1. When the heavy vehicle mixing ratio is lower than 15%, the relative error of Ni's logistic-based model also remains around 0.1, but it increases as the heavy vehicle mixing ratio increases. Therefore, the accuracy of the model developed in this paper is supported.

Conclusions
Based on revisions to several speed-density relationship models, this study focused on the properties and analysis of a general-logistic-model-based speed-density relationship model. The parameters in the model were analyzed in detail, and the boundary between free flow and synchronized flow was identified based on the effect of heavy vehicles and the progressive linear fitting goodness test. Using this boundary, the values of parameters $v_f$ and $k_t$ were determined. Parameters "b" and "g" were studied, and both were given a precise physical meaning. Using nonlinear regression to find the best values of "b" and "g" based on the heavy vehicle mixing ratio, functions $g = f_1(k_m)$ and $b = f_2(r)$ were determined. Then, an improved general-logistic-based speed-density relationship model was developed, and the versatility and accuracy of this model were also verified using field data and relative errors.
However, this improved model has some limitations; particularly, the high dispersion of speed-density data in free flow could not be quantitatively analyzed using the macroscopic traffic flow model.Although this model is convenient in terms of its mathematical form and has a natural advantage in dealing with macroscopic traffic problems, it ignores the time variable, which makes it insufficient for dealing with real-time traffic problems.Therefore, future studies will focus on two aspects.The connection between this model and microscopic traffic behavior will be further studied from both theoretical and statistical perspectives to enhance the real-time applicability of the model.Moreover, the characteristics of the road network, including the road geometry and network structure, will be further considered to increase the practical applicability of the model.

Figure 1: Distribution of study sites.

Figure 2: (a-e) Effects of varying parameters a, b, c, d, and g, respectively.

Figure 3: Statistical results obtained from data collected at G30 and G65W.

Model Verification.
With the results given in Section 4.3, the improved general-logistic-based speed-density relationship model aimed at mixed traffic flows affected by heavy vehicles could be developed, and the model equation is given below.

Figure 6: Fitting results of different groups.

Figure 8: Statistical data of G3001 and plot of the improved model.

Table 1: Physical meaning of some parameters.

Table 2: Values of parameters $v_f$, $v_b$, and $k_t$.

Table 3: Best values of "b" and "g".

Table 4: Regression results of $g = f_1(k_m)$ and $b = f_2(r)$.