Analysis and Prediction on Vehicle Ownership Based on an Improved Stochastic Gompertz Diffusion Process

This paper introduces an improved stochastic differential equation, related to the Gompertz curve, for projecting vehicle ownership growth. The diffusion model describes the relationship between vehicle ownership and GDP per capita, which has previously been studied as a Gompertz-like function. The main innovations lie in two parts: by modifying the deterministic part of the original Gompertz equation, the model can capture the slow increase that remains after the S-shaped curve has reached its saturation level; by introducing the stochastic differential equation, the model can better fit real data that contain fluctuations. Comparisons are carried out on data from the US, the UK, Japan, and Korea over the period 1960–2008. The new process performs better in curve fitting and in predicting short-term growth. Finally, a prediction of Chinese


Introduction
Growth in vehicle ownership has accompanied great changes in transportation demand over the years and is an important part of urbanization. The study by Simonsen and Walnum [1] showed that transportation contributes nearly 30% of CO2 emissions in OECD countries and is a critical cause of regional and local air pollution. In addition, predictions of future vehicle numbers are highly relevant to policy. Therefore, the growth pattern of vehicle ownership deserves close attention, especially in developing countries such as China, which are entering the fast growth stage [2].
Many factors influence vehicle ownership growth, such as economic factors, public transportation service level, policy restrictions, and urban layout. Economic growth, measured in the present paper by GDP per capita, has been the dominant driving factor. The relationship between vehicle ownership growth and GDP per capita can be modeled in a specific functional form, and here the most commonly used form, the Gompertz curve, is improved to obtain a better-fitting projection. The proposed model addresses two significant issues of the original Gompertz curve by introducing a stochastic diffusion process and a modification term. It offers a new way to better predict vehicle ownership from limited data, requiring only aggregate vehicle ownership and GDP per capita.
This paper is organized as follows. The next section reviews related studies. Section 3 proposes the model (the improved stochastic Gompertz diffusion curve) with its structure, inference, and solution. In Section 4, the model is applied to real data from the US, the UK, Japan, and Korea for comparison, and a projection of vehicle ownership growth in China is made with the new model. Finally, we conclude our study and give suggestions for future work.

Literature Review
An external or internal vehicle ownership model is often used for various purposes, as mentioned by de Jong et al. [3]. At the aggregate level, a vehicle ownership model can be used by car manufacturers for market analysis, by national and local governments to design policy incentives based on forecasting results, or by the energy industry, which is concerned with vehicle-related oil consumption. At the disaggregate level, the model is usually treated as an input to mode choice in transportation model systems and can thus give more detailed output. Examples include traditional disaggregate car choice models [4], panel models [5], and dynamic transaction models [6]. This paper focuses on the aggregate model, as we aim at the main pattern and future trend of vehicle ownership growth rather than the detailed types or components of cars. In addition, the aggregate model has much lighter data requirements, whereas the disaggregate one relies heavily on the amount and types of data collected and sometimes requires datasets with a long observation period.
Aggregate models vary with their goals. Focusing on market change, K. U. Leuven and Standard and Poor's DRI [7] studied how the structure of transportation modes and car prices influences the stock of personal vehicles. Focusing on the evolution of vehicle ownership, Van den Broecke [8] divided people into categories by age and predicted the growth pattern of vehicles as people grow older, assuming that the behavioral characteristics of each category remain the same over the years. Based on product life cycle and diffusion theories, many studies use different models to depict the relationship between vehicle ownership and economic factors (e.g., per capita income or gross domestic product, GDP). This approach is more straightforward and is especially suitable for developing countries, which lack the vehicle ownership data needed for more detailed types of studies. Early studies of this kind analyzed the relationship between vehicle ownership and time, which was found to follow an S-shaped curve [9]. Various variables have been shown to influence the development of vehicle ownership; however, given the differences in data sources, a model that includes too many variables makes it difficult to compare results across countries and regions [2,10,11]. In addition, it is rather difficult to obtain and harmonize the data for all variables into a complete dataset. Therefore, it is more appropriate to build the model on a simple, easily obtained dataset in order to project vehicle ownership development and give a unified prediction for different countries, especially developing countries whose vehicle ownership remains rather low.
Since then, studies have focused more on the economic factor, which is seen as the main driving force of vehicle ownership growth. Various models have been developed to fit the S-shaped curve, such as the semi-log linear and log linear regression models of Dunkerley and Hoch [12], the quasi-logistic function model of Button et al. (1993), the elasticity analysis model of Stares and Liu [13], and the Gompertz diffusion function model of Dargay and Gately [2]. The Gompertz model has been found to be more flexible than the logistic model and suitable for both short-term and long-term prediction [14]. The study in [14] worked through a series of examples using a simple parametric method to choose between a Gompertz and a logistic equation and suggested that the Gompertz curve is indeed appropriate for the stock of cars series. In the present paper, we use the Gompertz function as the base of our model, with some improvements to make it fit better.
The Gompertz function was first used in biology, where it performs well in predicting growth, mortality, and thus lifespan (see, for example, Zwietering et al. [17] and Finch and Pike [18]). Acutt and Dodgson [19] used the Gompertz curve to forecast future car ownership. Dargay and Gately [2] applied the Gompertz function to a full range of countries, from low-income to high-income ones. As the model is flexible and suitable for developing countries, several studies have used it to predict the Chinese vehicle stock. Wang [20] used the general Gompertz function to represent the S-shaped curve of vehicle ownership growth in China. Zhao [21] estimated the function with panel data for 21 countries and areas, 1963-2008, in order to predict vehicle ownership in China up to year 2050. Although the Gompertz curve has been widely used in recent years, it has some disadvantages in application. As shown in Figure 1 [2], the curve is a smooth S-shape that stays flat after it quickly reaches the saturation level. This brings about two significant problems: firstly, the general Gompertz curve is unable to represent the fluctuations present in real data; secondly, it cannot predict the slow growth that remains after the curve has reached its saturation level.
To account for the slow remaining growth in the rear part of the curve, a modification is made to the model (see details in Section 3.2). To fit the fluctuations in real data, we use a stochastic differential equation (SDE) in the present paper. Gutierrez-Jaimez et al. [22] tested the SDE form of the Gompertz equation and showed that it performs well for the random growth of rabbit weights. The proposed model thus has the following advantages:
(1) It gives a better fit based on limited data, using only vehicle ownership and GDP per capita.
(2) It overcomes the two main shortcomings of the original Gompertz curve described above by introducing a modification term as well as a stochastic diffusion process.

Methodology
As in previous studies, the relationship between vehicle ownership and GDP per capita is represented by the Gompertz growth curve, modeled as follows:
$$V_t = \gamma\, e^{\alpha e^{\beta g_t}}, \tag{1}$$
where $V_t$ is the quantity of vehicle ownership per 1000 people in year $t$, $g_t$ is GDP per capita in year $t$, and $\alpha$, $\beta$, $\gamma$ are parameters of the function to be estimated by regression.
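To make the shape of such a curve concrete, the following minimal sketch evaluates a Gompertz function of the form commonly used in this literature [2], that is, a saturation level multiplied by a double exponential in GDP per capita. The function name `gompertz_ownership` and all parameter values are hypothetical and chosen only for illustration; they are not estimates from this paper.

```python
import math


def gompertz_ownership(gdp_per_capita, gamma, alpha, beta):
    """Vehicles per 1000 people for a given GDP per capita.

    gamma is the saturation level; alpha and beta (both negative)
    control where and how fast the S-shaped curve rises.
    """
    return gamma * math.exp(alpha * math.exp(beta * gdp_per_capita))


# Hypothetical parameters (GDP per capita in thousand dollars),
# chosen only so that the S-shape is visible.
GAMMA, ALPHA, BETA = 800.0, -6.0, -0.2

if __name__ == "__main__":
    for gdp in (1, 5, 10, 20, 40, 80):
        v = gompertz_ownership(gdp, GAMMA, ALPHA, BETA)
        print(f"GDP per capita {gdp:>3}k -> {v:7.1f} vehicles per 1000 people")
```

With these illustrative values the curve rises slowly at low income, accelerates at middle income, and then stays essentially flat just below the saturation level, which is the behavior the two proposed improvements are meant to relax.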
Although applying this function to real data raises problems, as noted in the literature review, the S-shaped curve successfully represents the general growth pattern of the process and thus remains the main structure of the new model. In this part, we propose two improvements to the original model, as described below.

3.1. The Stochastic Differential Equation of Gompertz Growth Function. To obtain a diffusion process related to the Gompertz curve (1), we search for a process whose Fokker-Planck equation, in the absence of noise, has that curve as its solution. This approach was proposed by Capocelli and Ricciardi [23] and successfully applied by Gutiérrez et al. [24] to a specific Gompertz-like curve used for biological phenomena. In this paper, we follow the same procedure and define the stochastic Gompertz diffusion process (SGDP) by a stochastic differential equation in which vehicle ownership and GDP per capita are defined as before, three parameters, indexed 1 to 3, are estimated by regression, and the driving noise $W_t$ is a one-dimensional standard Wiener process with zero mean and $\operatorname{var}(W_t - W_s) = t - s$.
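As an illustration of what a sample path of such a process looks like, the sketch below simulates the classical stochastic Gompertz equation from the literature cited above [24], dV = (aV - bV ln V) dt + cV dW, with an Euler-Maruyama scheme applied to ln V. This classical form, the time index, the symbols `a`, `b`, `c`, and the function name are assumptions made for the example; they may differ in detail from the SGDP defined in this paper.

```python
import math
import random


def simulate_gompertz_sde(v0, a, b, c, dt, n_steps, seed=0):
    """Simulate dV = (a*V - b*V*ln V) dt + c*V dW on a regular grid.

    The scheme works on Y = ln V, for which the Ito formula gives
    dY = (a - c**2 / 2 - b*Y) dt + c dW, so V stays positive.
    """
    rng = random.Random(seed)
    y = math.log(v0)
    path = [v0]
    for _ in range(n_steps):
        dw = rng.gauss(0.0, math.sqrt(dt))  # Wiener increment, variance dt
        y += (a - 0.5 * c * c - b * y) * dt + c * dw
        path.append(math.exp(y))
    return path


if __name__ == "__main__":
    # Hypothetical parameters; with c = 0 the path is the smooth Gompertz curve.
    smooth = simulate_gompertz_sde(1.0, 1.0, 0.5, 0.0, 0.01, 1000)
    noisy = simulate_gompertz_sde(1.0, 1.0, 0.5, 0.2, 0.01, 1000, seed=1)
    print(f"smooth end value: {smooth[-1]:.3f}, noisy end value: {noisy[-1]:.3f}")
```

Setting the noise coefficient to zero recovers a deterministic S-shaped path that saturates at exp(a/b), which mirrors the requirement that the process reduce to the Gompertz curve when the noise vanishes.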
Applying the Fokker-Planck equation gives the forward equation and the infinitesimal moments of this process, in which the variance parameter equals 1 for a standard Wiener process. It is clear that when the noise term vanishes, the solution of (3) reduces to the original equation (1). Thus the proposed process fulfills the imposed condition.
After defining the SGDP function, we turn to its parameter estimation. The function has three parameters: the first two are drift parameters and the third is the noise coefficient. Ferrante et al. [25] proposed a method for estimating the parameters of Itô stochastic differential equations from an observed continuous sample path. With the same method, the estimates are calculated from the sample path observed over a continuous time interval. In practice, as there are no continuous data for vehicle ownership, the estimation can only be based on a discrete sample observed at a finite number of time points. In the present study, we use Riemann integrals in place of the continuous stochastic integrals. The interval is divided with a small step (0.001 in this paper), and the Itô formula is applied on each subinterval in order to approximate the continuous function.
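The idea of replacing continuous-path estimators by sums over discrete observations can be sketched as follows. This is a simple least-squares discretization for the classical stochastic Gompertz equation dV = (aV - bV ln V) dt + cV dW, not the exact estimators of [25] and not the procedure used in this paper; the function names, the symbols `a`, `b`, `c`, and the demo parameters are assumptions made for illustration.

```python
import math


def estimate_gompertz_sde(values, dt):
    """Estimate (a, b, c) from equally spaced observations of V.

    With Y = ln V, the increments satisfy approximately
    (Y[k+1] - Y[k]) / dt = (a - c**2 / 2) - b * Y[k] + noise,
    so b and the intercept come from ordinary least squares and
    c**2 from the sum of squared residual increments.
    """
    y = [math.log(v) for v in values]
    x = y[:-1]
    z = [(y[k + 1] - y[k]) / dt for k in range(len(x))]
    n = len(x)
    mean_x, mean_z = sum(x) / n, sum(z) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxz = sum((xi - mean_x) * (zi - mean_z) for xi, zi in zip(x, z))
    slope = sxz / sxx
    intercept = mean_z - slope * mean_x
    resid = [(zi - intercept - slope * xi) * dt for xi, zi in zip(x, z)]
    c2_hat = sum(r * r for r in resid) / (n * dt)
    return intercept + 0.5 * c2_hat, -slope, math.sqrt(c2_hat)


def make_noise_free_series(a, b, v0, dt, n_steps):
    """Hypothetical test data: the same dynamics with the noise switched off."""
    y = math.log(v0)
    values = [v0]
    for _ in range(n_steps):
        y += (a - b * y) * dt
        values.append(math.exp(y))
    return values


if __name__ == "__main__":
    data = make_noise_free_series(1.0, 0.5, 1.0, 0.1, 50)
    print(estimate_gompertz_sde(data, 0.1))
```

On noise-free data the sketch recovers the drift parameters it was generated with and a noise coefficient close to zero; on real data the quality of the estimates depends on how finely the path is sampled, which is why a small step is used in the text.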
With the same procedure, we obtain the noise coefficient. The conditional function of the SGDP is presented next. As the driving noise is a standard Wiener process, applying (8) to (7), together with the initial value, gives the conditional trend function (9), which is used to predict future values.
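For orientation, the steps that lead to a conditional trend function can be written out for the classical stochastic Gompertz equation studied in [22, 24]. The form below, the time index, and the symbols $a$, $b$, $c$ are assumptions taken from that literature; they are not necessarily identical to equations (2)-(9) of this paper.

```latex
\begin{align*}
dV_t &= \left(a V_t - b V_t \ln V_t\right) dt + c\, V_t\, dW_t, \\
Y_t = \ln V_t:\quad
dY_t &= \left(a - \tfrac{c^2}{2} - b\, Y_t\right) dt + c\, dW_t
  \quad \text{(It\^o formula)}, \\
Y_t \mid (V_s = v_s) &\sim \mathcal{N}\!\left(\mu_{t|s},\ \sigma^2_{t|s}\right), \\
\mu_{t|s} &= e^{-b(t-s)} \ln v_s
  + \frac{a - c^2/2}{b}\left(1 - e^{-b(t-s)}\right), \\
\sigma^2_{t|s} &= \frac{c^2}{2b}\left(1 - e^{-2b(t-s)}\right), \\
E\left[V_t \mid V_s = v_s\right]
  &= \exp\!\left(\mu_{t|s} + \tfrac{1}{2}\,\sigma^2_{t|s}\right).
\end{align*}
```

Under these assumptions the log-transformed process is of Ornstein-Uhlenbeck type, so the conditional distribution of $V_t$ is lognormal and its mean gives the trend used for prediction. When $c = 0$ the trend reduces to the deterministic Gompertz curve $\exp\!\left(a/b + (\ln v_s - a/b)\, e^{-b(t-s)}\right)$.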
3.2. The Improved SGDP Model. Applying the SGDP model to vehicle ownership and GDP per capita data for the United States, as presented in Figure 2, shows that although the SGDP curve fits better than the original Gompertz curve when the data fluctuate, it still cannot capture the slow increase that persists as the curve approaches its saturation level. Therefore, the deterministic part needs to be improved to address this problem.
In the present paper, the deterministic part is modified to give the improved SGDP model, with two parameters to be estimated. The improved SGDP model has a more complex functional form, which makes estimation by analytical inference difficult. In this paper, we use the SDE Toolbox for Matlab by Umberto Picchini (SDE Toolbox: Simulation and Estimation of Stochastic Differential Equations with Matlab, http://sdetoolbox.sourceforge.net) to obtain the numerical results presented in the next section.

4.1. The Estimation and Comparison in Sample Countries. The growth pattern of vehicle ownership per 1000 people changes as GDP per capita increases. In terms of elasticity (the ratio of the average percentage growth in vehicle ownership to the average percentage growth in per capita income), the values differ across stages of development and across countries and regions [13]. A country such as the United States has reached the saturation level, with approximately 800 vehicles per 1000 people, while some countries, like Japan, are in the stage of slow growth, with an elasticity of approximately 0.5, and developing countries such as China are in the stage of fast growth, with a high elasticity of 1.7 [26]. In addition, some cities, for example, London and Singapore, have imposed policy restrictions on vehicle usage or purchase and thus follow a different path from places without such restrictions. Therefore, in the present paper, we choose four countries with quite different growth curves to illustrate the differences between the general Gompertz curve and the proposed improved SGDP curve.
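The elasticity defined above, the ratio of average percentage growth in vehicle ownership to average percentage growth in per capita income, can be computed as in the following minimal sketch. The function names and the short series are hypothetical and serve only to show the arithmetic.

```python
def average_growth_rate(series):
    """Mean year-on-year percentage growth of a series."""
    rates = [(later - earlier) / earlier * 100.0
             for earlier, later in zip(series, series[1:])]
    return sum(rates) / len(rates)


def elasticity(vehicles_per_1000, gdp_per_capita):
    """Average % growth in ownership divided by average % growth in income."""
    return average_growth_rate(vehicles_per_1000) / average_growth_rate(gdp_per_capita)


if __name__ == "__main__":
    # Hypothetical data: ownership grows 10% a year, income 5% a year.
    vehicles = [100.0, 110.0, 121.0]
    income = [1000.0, 1050.0, 1102.5]
    print(f"elasticity = {elasticity(vehicles, income):.2f}")
```

An elasticity above 1 means ownership grows faster than income, as in the fast growth stage, while a value below 1 corresponds to the slow growth stage near saturation.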
The countries selected are the United States, the United Kingdom, Japan, and Korea. Owing to data availability, the estimation period is 1960-2008 for the first three countries and 1966-2008 for Korea. Three kinds of data are collected. After calculating vehicle ownership per 1000 people and GDP per capita, we use the data up to year 2000 as input variables, and the parameters of the proposed improved SGDP are obtained with the SDE Toolbox for Matlab, as presented in Table 1.
Then we predict vehicle ownership per 1000 people from year 2001 to year 2008, based on the known GDP per capita and the estimated parameters. Table 2 presents the real data, the results of the general Gompertz fitting curve, and those of the improved SGDP fitting curve together with its 95% confidence interval. The improved SGDP curve has a much lower standard error and thus predicts better than the general Gompertz function. For simplicity, we give only the standard errors for the other three countries, as shown in Table 3.
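A comparison of this kind can be sketched with a simple error measure. The code below assumes that the prediction standard error is the root mean squared difference between observed and predicted values over the hold-out years; the exact definition used for Tables 2 and 3 is not stated here, and the function name and numbers are hypothetical.

```python
import math


def prediction_standard_error(actual, predicted):
    """Root mean squared difference between observed and predicted values."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("series must be non-empty and of equal length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


if __name__ == "__main__":
    # Hypothetical hold-out values, not taken from the paper's tables.
    observed = [780.0, 785.0, 790.0]
    model_a = [779.0, 786.0, 791.0]
    model_b = [770.0, 772.0, 775.0]
    print(prediction_standard_error(observed, model_a))
    print(prediction_standard_error(observed, model_b))
```

The model with the smaller value tracks the hold-out data more closely, which is the sense in which one fitting curve is said to predict better than another.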
It should be highlighted that although the improved SGDP function clearly performs better for the data of the United States, the United Kingdom, and Japan, it behaves almost the same as the general Gompertz function for the data of Korea. To examine this more directly and to allow an intuitive comparison of the fitting curves, each country's paths are presented in Figures 3-6, using data up to 2008. With the full time span, the regressed parameters differ slightly from those in Table 1; as the difference is small, we do not report them separately and focus instead on the comparison figures.
It is quite clear that the improved SGDP curve proposed in this paper behaves better than the general one: it follows the real growth curve more closely, or at least equally closely, for all four sample countries, especially for the US, the UK, and Japan, whose real data either reach the saturation level or show quite large fluctuations. As for Korea, whose path is an almost perfect S-shaped curve with no obvious fluctuations, the real data, the general Gompertz curve, and the improved SGDP curve behave almost the same. Combined with the prediction results from Table 2, the general Gompertz in this case behaves even slightly better than the improved SGDP curve.
Therefore, the improved SGDP model stands out when the original curve has either of two features: it has reached or almost reached the saturation level, or its path shows quite obvious fluctuations. Otherwise, when the curve is a smooth S-shaped path, the general Gompertz performs approximately the same as the improved SGDP model, in which case we suggest using the general Gompertz, as its estimation is much easier.
Alongside Tables 4 and 5, the projections of vehicle ownership per 1000 people and total vehicle quantity in China up to year 2025 are depicted in Figures 7 and 8. According to the prediction, China will maintain a fast-growing trend, and its vehicle ownership per 1000 people will reach 265 units in 2025, with a total vehicle quantity of over 350 million.
Future studies can focus on the following aspects. Firstly, aggregate studies usually concentrate on the relationship between vehicle ownership and economic factors; other influencing factors, such as public transportation service, vehicle policy restrictions, and car culture (proposed by research groups of IFMO, the Institute for Mobility Research of the BMW Group), should be included to obtain a more comprehensive understanding of vehicle ownership evolution, and a simple model with the main influencing factors is highly recommended for future study. Secondly, an analytical solution for the improved SGDP may be developed to replace the numerical output of the Matlab package. Last but not least, a more detailed model may be proposed for different stages of growth, in order to better capture the features of vehicle ownership evolution.

Conclusion
Figure 1: Projections of estimated vehicle Gompertz functions.

Figure 2: Real data versus SDE Gompertz versus Trend Gompertz in the United States.

Figure 3: Real data versus Improved SDE Gompertz versus Trend Gompertz in the United States.

Figure 4: Real data versus Improved SDE Gompertz versus Trend Gompertz in the United Kingdom.

Figure 5: Real data versus Improved SDE Gompertz versus Trend Gompertz in Japan.

Figure 6: Real data versus Improved SDE Gompertz versus Trend Gompertz in Korea.

4.2. The Projection of Vehicle Ownership to 2025 in China. China is in the initial stage of mobility development and thus has quite low vehicle ownership per 1000 people and a fast-growing trend. Vehicle ownership growth in the initial stage is quite unstable, with many possible fluctuations [13]. Therefore, we apply the proposed improved SGDP model to the original data for China (civilian vehicles, 1980 to 2014; data sources: MacroChina database (1980-2011), http://www.macrochina.com.cn/macro_data/; National Bureau of Statistics (2010-2014), http://www.stats.gov.cn/). After regression, vehicle ownership per 1000 people and the total vehicle quantity are predicted from the estimated parameters and the predicted GDP per capita and population. As the improved SGDP model is suitable for short-term predictions, we limit the prediction horizon to year 2025. The estimated parameters and predicted values are presented in Tables 4 and 5.

Figure 7: Projection of vehicle ownership per 1000 people to 2025 in China.
Figure 8: Projection of total vehicle quantity to 2025 in China.
This study introduces a new form of the Gompertz function based on a stochastic differential equation, aiming to depict the growth pattern of vehicle ownership driven by economic growth. Unlike previous models used for curve fitting, the proposed improved SGDP model adjusts the deterministic part and is cast in stochastic differential form. The general solution of the SGDP model is presented, and numerical studies are carried out with the SDE Toolbox for Matlab. In comparison, the improved SGDP model has the following advantages and features:
(i) Better fitting and prediction for countries whose vehicle ownership has reached (or almost reached) the saturation level: the improved SGDP model can reveal the slow growth in the rear part of the curve and thus gives a better fitting curve and a more precise short-term prediction.
(ii) Better performance for vehicle ownership growth curves with fluctuations: when the pattern shows obvious fluctuations, the stochastic nature and the adjustment term can capture them, giving a better fit and a projection closer to the real path; this is useful for understanding vehicle ownership growth.
(iii) For growth patterns that follow an S-shaped curve closely, we suggest using the general Gompertz equation instead of the improved SGDP; the two perform almost the same, and the computation is much easier for the general Gompertz function.
We then use the improved SGDP model to predict Chinese vehicle ownership up to year 2025. The fitting curve performs quite well, and the predicted vehicle ownership per 1000 people and total vehicle quantity in 2025 are over 250 units and over 350 million units, respectively.

Table 1: Parameters of improved SGDP and general Gompertz in sample countries.

Table 2: Predicted values by improved SGDP and general Gompertz of vehicle ownership per 1000 people in the United States.

Table 3: Prediction standard errors of improved SGDP and general Gompertz in sample countries.

Table 4: Estimated parameters of improved SGDP in China.