A New Algorithm of Parameter Estimation for the Logistic Equation in Modeling CO2 Emissions from Fossil Fuel Combustion

CO2 emissions from fossil fuel combustion are considered the most important driving factor of global climate change. A thorough understanding of the rules governing CO2 emissions is needed to refine climate change mitigation policy. The current paper proposes a new algorithm of parameter estimation for the logistic equation, which was used to simulate the trend of CO2 emissions from fossil fuel combustion. The differential equation form of the transformed logistic equation was used as the starting point of the parameter estimation. A discretization method was then designed so that the observed samples could be input. The estimated parameters were obtained by minimizing the residual sum of squares and setting the sum of the residuals equal to zero. Finally, the algorithm was applied to carbon emissions in China to examine its simulation precision. The error analysis indicators, namely mean absolute percentage error (MAPE), median absolute percentage error (MdAPE), maximal absolute percentage error (MaxAPE), and geometric mean relative absolute error (GMRAE), all showed that the new algorithm was better than the previous ones.


Introduction
Global climate change is presently one of the most important issues on the scientific and political agendas [1][2][3][4]. The Intergovernmental Panel on Climate Change (IPCC) has consistently documented a scientific consensus on the link between anthropogenic greenhouse gas (GHG) emissions and climate change [5][6][7][8]. As a direct result of fossil fuel combustion, energy-related CO2 emissions contribute well over 80% of the world's total and account for over two-thirds of all anthropogenic GHG emissions [9]. A thorough understanding of the rules governing CO2 emissions from fossil fuel combustion is important for refining climate change mitigation policy.
Similar to other energy-related indicators, the development of CO2 emissions from fossil fuel combustion follows its own law. Köne and Büke [10] were the first to study the long-term characteristics of energy-related CO2 emissions. They plotted the emission curves of the top 25 emitters and attempted to simulate them with linear models. On this basis, Meng and Niu [11] analyzed the reasons for the long-term change in energy-related CO2 emissions and proposed that the S-shaped model is more appropriate than the linear one for simulating the long-term emission curve. They also offered three similar algorithms to estimate the parameters of the logistic equation, the representative S-shaped model. An empirical analysis for China showed that their logistic model is better than the linear one in terms of both the maximum empirical risk and the quality of fit.
In essence, all three parameter estimation algorithms advanced by Meng and Niu [11] have two key logical steps. First, the logistic equation must be transformed into a linear structure because its unknown parameters appear together in the denominator, rendering direct parameter estimation impossible. Second, new parameters and variables are used to replace each term of the linear structure, and their values are estimated by the ordinary least squares (OLS) algorithm. These steps make it possible to estimate the parameters of the logistic equation but, at the same time, affect the precision of the estimates. One important reason is that the replacement for the linear structure may change the relative importance of different samples.
For example, as demonstrated by Meng and Niu [11], the logistic equation (1) can be transformed into the linear structure (2). By taking the relative increment $(x_{t+1} - x_t)/x_{t+1}$ as the dependent variable and $x_t$, the observed value at time $t$, as the independent variable, (2) becomes a simple linear model whose intercept and slope are functions of the original parameters. Using the OLS method, the two linear parameters are obtained. Consequently, the three parameters of the logistic equation are easily obtained.
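The linearization can be made concrete with a short derivation. The sketch below assumes that the logistic equation takes the form $x_t = 1/(a e^{bt} + c)$; the symbol names $a$, $b$, and $c$ and the sign convention of the exponent are assumptions made for illustration and may differ from the notation of [11].

```latex
\begin{align*}
x_t &= \frac{1}{a e^{bt} + c}
  && \text{(assumed form of the logistic equation)} \\
\frac{1}{x_{t+1}} - c &= e^{b}\left(\frac{1}{x_t} - c\right)
  && \text{(advance $t$ by one step)} \\
\frac{x_{t+1} - x_t}{x_{t+1}} &= \left(1 - e^{b}\right) - c\left(1 - e^{b}\right) x_t
  && \text{(multiply by $x_t$ and rearrange)}
\end{align*}
```

Under this assumption the intercept of the linear model is $1 - e^{b}$ and the slope is $-c(1 - e^{b})$. The left-hand side involves $x_{t+1}$ as well as $x_t$, so the residual attached to one sample in the linear structure also depends on the next sample.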
When estimating the parameters, the influence of a sample is positively correlated with its residual; that is, a larger residual indicates greater influence. In fact, according to (1) and (2), a sample with a large residual in (1) does not necessarily have a large residual in (2) because the latter residual is not only determined by the sample itself but also significantly affected by the next sample. As a result, the relative importance of different samples may be changed in the process of replacement for the linear structure. In other words, the estimated optimal parameters for (2) may not be the optimal parameters for (1). Thus, the present paper proposes a new algorithm (NA) of parameter estimation for the logistic equation, which does not require the process of replacement for the linear structure.
The current paper is organized as follows. Section 2 introduces the complete NA of parameter estimation for the logistic equation. Section 3 presents a case study on CO2 emissions from fossil fuel combustion in China to test the NA. A detailed comparison between the NA and the three previous algorithms (PAs) is offered. Finally, Section 4 provides the conclusions.

Parameter Estimation Algorithms
By letting $y_t = 1/x_t$, the logistic equation (1) is rewritten as (3). Equation (3) is an exponential curve whose exact shape is jointly determined by its three parameters. One of the parameters appears in the exponent; thus, the estimated values of the three parameters cannot be obtained directly by OLS. Grey theory [12,13] offers an idea for treating this kind of problem. Following this idea, we developed the parameter estimation algorithm.
Given that the three parameters of (3) cannot be directly estimated, the differential equation form is selected as the starting point, and (3) is rewritten as the differential equation (4). According to the definition of the derivative, and considering that the samples are discrete statistical results, the first term on the left of (4) can be approximated by the difference between the consecutive transformed samples $y_{t+1}$ and $y_t$. Because $y_t$ and $y_{t+1}$ are used in discretizing the first term, the second term can reasonably adopt the mean value of $y_t$ and $y_{t+1}$. As a result, (4) takes a discretized form. Inputting the $n$ samples into the discretized equation yields the system (8), in which each equation carries its own residual. Equation (8) can be written in matrix form. To obtain the optimum values of the two parameters of the differential equation, the residual sum of squares must be minimized.
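The discretization can be written out explicitly under an assumed parametrization. The sketch below takes (3) to be $y_t = a e^{bt} + c$ with $y_t = 1/x_t$; the symbol names, the sign convention of the exponent, and the residual symbol $\varepsilon_t$ are assumptions made for illustration.

```latex
\begin{align*}
y_t &= a e^{bt} + c
  && \text{(assumed form of (3))} \\
\frac{dy_t}{dt} - b\,y_t &= -bc
  && \text{(differential equation form)} \\
\frac{dy_t}{dt} &\approx y_{t+1} - y_t
  && \text{(first term: difference of consecutive samples)} \\
y_t &\approx \tfrac{1}{2}\left(y_t + y_{t+1}\right)
  && \text{(second term: mean of the same two samples)} \\
y_{t+1} - y_t - \tfrac{b}{2}\left(y_t + y_{t+1}\right) &= -bc + \varepsilon_t,
  \quad t = 1, \dots, n-1
  && \text{(one equation per pair of samples)}
\end{align*}
```

The last line is linear in $b$ and in the product $bc$, so both can be estimated by OLS from the $n-1$ pairs of consecutive samples. The parameter $a$ does not appear because it plays the role of the integration constant.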

By letting $Q$ denote the residual sum of squares, the derivative of $Q$ with respect to $\hat{Z}$, the vector of estimated parameters, must equal zero. In other words, the normal equations of the least squares problem must hold. Solving them gives the estimates $\hat{b}$ and $\hat{c}$. By introducing $\hat{b}$, $\hat{c}$, and all samples into (3), the sum of the residuals is required to equal zero, which ensures that the estimated curve passes through the center of the samples. The estimate of the remaining parameter is then obtained from this condition. Finally, the estimates of all three parameters are obtained.
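The complete procedure can be sketched in a few lines of Python with NumPy. This is a minimal illustration rather than the authors' implementation: it assumes the parametrization $y_t = 1/x_t = a e^{bt} + c$ with $t = 1, \dots, n$, and the names `fit_logistic_na` and `simulate_logistic` are hypothetical. The data in the usage lines are synthetic, generated from the assumed model itself.

```python
import numpy as np


def fit_logistic_na(x):
    """Estimate (a, b, c) for the ASSUMED form x_t = 1 / (a*exp(b*t) + c), t = 1..n."""
    x = np.asarray(x, dtype=float)
    n = x.size
    t = np.arange(1, n + 1)
    y = 1.0 / x  # transformed series y_t = 1 / x_t

    # Discretize dy/dt - b*y = -b*c:
    #   derivative -> difference of consecutive samples,
    #   y          -> mean of the same two samples.
    dy = y[1:] - y[:-1]
    y_mean = 0.5 * (y[1:] + y[:-1])

    # OLS on dy = b * y_mean + d + residual, where d = -b*c (assumed form).
    design = np.column_stack([y_mean, np.ones(n - 1)])
    (b_hat, d_hat), *_ = np.linalg.lstsq(design, dy, rcond=None)
    c_hat = -d_hat / b_hat

    # Choose a so that the residuals of y_t = a*exp(b*t) + c sum to zero.
    a_hat = (y.sum() - n * c_hat) / np.exp(b_hat * t).sum()
    return a_hat, b_hat, c_hat


def simulate_logistic(a, b, c, n):
    """Evaluate the assumed logistic form at t = 1..n."""
    t = np.arange(1, n + 1)
    return 1.0 / (a * np.exp(b * t) + c)


# Synthetic, noise-free data for illustration only (not the data of the case study).
x_obs = simulate_logistic(0.5, -0.3, 0.1, 10)
a_hat, b_hat, c_hat = fit_logistic_na(x_obs)
x_fit = simulate_logistic(a_hat, b_hat, c_hat, len(x_obs))
```

On noise-free data generated from the assumed model, the mean-value discretization recovers $c$ exactly, while the estimate of $b$ equals $2\tanh(b/2)$, which is close to $b$ when $|b|$ is small. The zero-sum condition is applied to the transformed series $y_t$, matching the description of introducing the estimates into (3).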

Experimental Setup and Results
In [11], the three PAs (A1, A2, and A3) were adopted to separately simulate the carbon emissions of each main industrial sector in China from 1998 to 2007. The best algorithm for each time series was then selected, and the simulated results of each best algorithm were summed to obtain the simulation results of the total carbon emissions.
To compare the NA offered in this paper with the three PAs [11], the same data were used and the same process was followed to simulate the total carbon emissions.
First, A1, A2, A3, and NA were adopted to simulate the carbon emissions of each main industrial sector. The simulation results of each algorithm and their mean absolute percentage error (MAPE) (17) relative to the real emissions (REs) are listed in Table 1.
$$\mathrm{MAPE} = \frac{1}{n}\sum_{k=1}^{n}\left|\frac{x(k)-\hat{x}(k)}{x(k)}\right| \times 100\% \tag{17}$$
where $x(k)$ is the RE of the $k$th sample, $\hat{x}(k)$ is the simulation result, and $n$ is the number of data used in the MAPE calculation.
Using MAPE as an indicator, the best algorithms of A1, A2, and A3 for each time series of carbon emissions were selected.
As shown in Table 1, the NA always has a lower MAPE than A1, A2, and A3. This preliminary result demonstrates the advantage of the NA.
Second, following the same process as in [11], the selected best results of A1, A2, and A3 (see Table 1) were summed to obtain the simulation results of the PAs for the total emissions. We also summed the simulation results of the NA for each time series of carbon emissions. The real total emissions (RTEs), the simulation results of the PAs and the NA, and the relative errors of each algorithm in each year are listed in Table 2.
As RTEs are usually influenced by stochastic factors, the forecasting results of the NA are not always more precise than those of the PAs. That is, for some forecasting points in Table 2 the relative errors of the NA are smaller than those of the PAs, whereas for other forecasting points the PAs perform better. As a result, the performance of each method should be evaluated comprehensively. To compare the two methods, the median absolute percentage error (MdAPE), maximal absolute percentage error (MaxAPE), and geometric mean relative absolute error (GMRAE) (18) were also calculated, where $\hat{x}^{*}(k)$ is the simulation result obtained from the benchmark method. The values of the different error analysis indicators were obtained using (17)-(18) and are listed in Table 3.
In essence, the above error analysis indicators have similar functions in distinguishing the better algorithm. However, they differ in detail. The MAPE is an indicator of accuracy reflecting the general closeness of the simulation results to the real data. The MdAPE is the middle value of all absolute percentage errors ordered by size. Similar to the MAPE, the MdAPE reflects the general closeness of the simulation results to the real data, but it is also robust to the influence of a few outliers. The MaxAPE corresponds to the worst simulation result; it reflects the maximal simulation risk. The GMRAE measures the extent of improvement of the new model compared with the previous model. It adopts the geometric mean and is especially suitable for comparison between different models. Table 3 indicates that the NA is better than the PAs on every indicator.
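The four indicators can be computed as in the following sketch, which uses their standard textbook definitions; whether these agree with (17)-(18) in every detail, such as the percentage scaling, is an assumption. The function name `error_indicators` is hypothetical and the numbers in the usage line are purely illustrative, not taken from Tables 1 to 3.

```python
import numpy as np


def error_indicators(actual, simulated, benchmark):
    """Return MAPE, MdAPE, MaxAPE (in percent) and GMRAE (a ratio)."""
    actual = np.asarray(actual, dtype=float)
    simulated = np.asarray(simulated, dtype=float)
    benchmark = np.asarray(benchmark, dtype=float)

    # Absolute percentage error of the evaluated method at each point.
    ape = np.abs((actual - simulated) / actual)
    # Absolute error relative to the benchmark method's absolute error.
    # Undefined if either method matches a sample exactly (zero error).
    rae = np.abs(actual - simulated) / np.abs(actual - benchmark)

    return {
        "MAPE": 100.0 * ape.mean(),
        "MdAPE": 100.0 * float(np.median(ape)),
        "MaxAPE": 100.0 * ape.max(),
        # Geometric mean computed through logarithms for numerical stability.
        "GMRAE": float(np.exp(np.log(rae).mean())),
    }


# Illustrative numbers only.
result = error_indicators([100, 200, 400], [110, 190, 420], [120, 180, 440])
```

With these definitions, a GMRAE below 1 means that the evaluated method has smaller errors than the benchmark in the geometric-mean sense.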

Conclusions
The intrinsic trend of CO2 emissions from fossil fuel combustion is an S-shaped curve that can feasibly be simulated by the logistic equation. To use the OLS method for parameter estimation, the PAs replaced each term of the linearized logistic equation with new parameters and variables. Given that the replacement may change the relative importance of different samples, the precision of the estimated parameters may also be affected. In the current paper, a new parameter estimation algorithm that does not require replacement was advanced. It adopted the differential equation form of the transformed logistic equation as the starting point. After discretization of the differential equation, the observed samples were input into the equation. By minimizing the residual sum of squares, the estimated values of parameters $b$ and $c$ were obtained. By inputting the estimated values of $b$ and $c$ into the logistic equation and setting the sum of the residuals equal to zero, the estimated value of the remaining parameter was also obtained. Finally, the carbon emissions in China were chosen to test the precision of the NA. The error analysis indicators (MAPE, MdAPE, MaxAPE, and GMRAE) all showed that the NA was better than the PAs.

Table 1: Simulation results of each algorithm for each industrial sector. AC: carbon emitted in agriculture, forestry, animal husbandry, fishery, and water conservancy; IC: carbon emitted in industry (including the electricity generation industry); CC: carbon emitted in construction; TSC: carbon emitted in transport, storage, and post; WC: carbon emitted in wholesale and retail trades; OC: carbon emitted in others.

Table 2: Comparison of the simulation results of the total emissions.

Table 3: Values of error analysis indicators.