Pakistan CO2 Emission Modelling and Forecasting: A Linear and Nonlinear Time Series Approach

Pakistan is considered among the top five countries with the highest CO2 emissions globally. This calls for pragmatic policy implementation by all stakeholders to bring finality to this alarming situation since it contributes greatly to global warming, thereby leading to climate change. This study is an attempt to make a comparative analysis of the linear time series models with nonlinear time series models to study CO2 emission data in Pakistan. These linear and nonlinear time series models were used to model and forecast future values of CO2 emissions for a short period. To assess and select the best model among these linear and nonlinear time series models, we used the root mean square error (RMSE) and the mean absolute error (MAE) as performance indicators. The outputs showed that the nonlinear machine learning models are the best among all other models, having the lowest RMSE and MAE values. Based on the forecasted value of the nonlinear machine learning neural network autoregressive model, Pakistan's CO2 emissions will be 1.048 metric tons per capita by 2028. The increasing trend in emissions is a frightening and clear warning, suggesting that innovative policies must be initiated to reduce the trend. We encourage the Pakistan government to price CO2 emissions by companies and entities per ton, adapt electricity production from hydro, wind, and different sources with no emissions of CO2, initiate rigorous planting of more trees in the populated areas of Pakistan as forest covers, provide incentives to companies, organisations, institutions, and households to come out with clean technologies or use technologies with no CO2 emissions or those with lower ones, and fund more studies to develop clean and innovative technologies with less or no CO2 emissions.


Introduction
Te collection of greenhouse gas (GHG) emissions in airspace is creating chaos across the globe and destroying life, economy, health, and food. Carbon (IV) oxide (CO 2 ) is a major component of GHG [1]. Te world is nowhere near securing a global temperature rise of less than two degrees Celsius as proposed in the Paris Agreement of the United Nations (UN) Framework Convention on Climate Change (UNFCCC). Using the 1990 criterion, most countries are still releasing more GHG into the atmosphere, with some at a standstill and others releasing less [1]. Te UN Environmental Program (UNEP) Climate Action Note indicated that the world is in a state of climate emergency [1].
Pakistan is labelled among the countries with the highest GHG emissions in the world by the UNEP [1]. It contributes to 1.02% of global GHG emissions. Pakistan emitted an estimated 504.59 million tons of GHG in 2018. It is one of the most impacted nations from unfriendly efects of environmental change as well as air contamination. GHG emissions are the leading cause of global warming, thereby becoming the maximum essential research issues within the felds of technological know-how and politics [1].
In Southern Asia, Pakistan is ranked among the developing economies with swift economic positive growth. Tis is expected to continue for a long time. Agriculture is the principal dominant area in Pakistan's economic growth irrespective of the growing industrial sector. However, this industrial drive, coupled with high population growth, has caused massive deforestation, resulting in less absorption of CO 2 by plants, leading to a higher CO 2 concentration in the atmosphere [2,3].
Pakistan is ranked frst among countries in Asia facing mass deforestation as a result of bulk utilization of natural energy from the environment. Tis is causing serious degradation of the environment. Tis degradation of the environment, together with deforestation, has brought about global warming, leading to climate change. Te critical factor of Pakistan's commercial revolution is reworking its economy from organic economies based totally on human and animal energy to inorganic economies, which are completely dependent on fossil fuels. Interestingly, the by-product of fossil fuels is CO 2 , thereby contributing to the concentration of CO 2 in the environment [2,3].
Te quest to be an industrialized giant defnitely comes at a price. Pakistan is geared towards a full-industrial revolution which encourages the use of fossil fuels to power the growing industry in the country. Te growth in population means expanding cities by destroying green vegetation, which absorbs most of the CO 2 in the atmosphere. Te growing population also means increasing cars with carbon emissions being used in the country [2,3].
Te Government of Pakistan has put in place a lot of programmes to reduce GHG emissions in the country. Great progress has been chalked up in climate action that ranges from policies, programmes, and nature-based solutions (NBS) to technology-based innovations [4]. Pakistan, recognizing the role of nature in climate modifcation and diminution, has put forward robust natural capital restoration plans including the Ten Billion Tree Tsunami Program as well as the Protected Drive. Tese programmes have contributed to enhancing livelihoods and giving opportunities to the vulnerable, including youth and women. In addition, Pakistan has introduced a number of policy actions focused on mitigating GHG emissions from high-emission sectors, such as energy and industry [4].
Notwithstanding, this available information indicates a rise in CO 2 emissions in Pakistan. Tis needs to be scientifcally substantiated and verifed to assist the government to restrategize in an attempt to achieve the targeted goal in the fght of reducing these emissions in order to achieve the Paris Agreement of the UNFCCC. Tis study provides a statistical and machine learning framework to delve deep into the discourse of CO 2 emissions in Pakistan from 1960 to 2018. It provides CO 2 emission forecasts for future and possible policy initiatives and directions to combat this issue based on scientifc results to meet the UNFCCC target.

Literature Review.
Te world is being confronted with signifcant difculties connected with an unnatural weather change and discharge of ozone-depleting substances in large amounts. In 2017, energy businesses represented 46% of all CO 2 discharges worldwide [5]. Bokde et al. [5] proposed two-time series disintegration techniques for transient determination of CO 2 outfow with a bunch of cutting-edge models for a 48-hour skyline power market. By estimating benchmarks for France, they showed that their new technique has a mean outright rate blunder, that is, 25% lower than that of existing models. Furthermore, they illustrated the use of the conjecture for booking adaptable power utilization in Germany, Norway, Denmark, and Poland. Teir results showed that booking an adaptable block of 4 hour of power utilization at a 24-hour stretch can on average diminish subsequent CO 2 discharge in all countries [5].
Te carbon exchanging market has turned into a powerful weapon in easing fossil fuel by-products in China, and carbon cost is at the centre of its activity. Tus, the carbon exchanging market flls in as an imperative part in gauging carbon cost precisely ahead of time [6]. Sun and Li [6] imaginatively investigated a group of driven long short-term memory network (LSTM) models in light of reciprocal outft experimental mode decay (ROEMD) for carbon cost gauging, applying it to each of the eight carbon exchanging piloting centres in China. ROEMD was frst executed for mode change to disintegrate the frst convoluted mode into a bunch of straightforward modes. Accordingly, LSTM was utilized to demonstrate the planning between time-slacking factors and every model's objective quality, thereby constructing numerous LSTM models for comparison. At last, converse ROEMD computation was acquainted with the coordinates' expected consequences of the multimode on eventual outcomes. Its viable application at the same time embraced every one of the eight carbon piloting centres in China, covering their compared carbon cost information over a signifcantly extended period. Te acquired outcomes showed that the proposed model had adequate precision in carbon cost anticipation in China in contrast to the single LSTM model as well as other ordinary machine learning models. Moreover, as indicated by the extent of its application, the creative model displayed solid dependability and comprehensiveness [6].
Wu and Liu [7] proposed a multiresolution combopredicting model that uses ensemble empirical mode decomposition (EEMD-ADD) to improve the predicting accuracy of the cost of carbon. Tey achieved it by decomposing carbon price sequencing using EEMD into numerous intrinsic framework functions (IFFs) by dividing these IFFs into several high-frequency compartments, lowfrequency compartments, and drift compartments after which the least square vector support machine (LSSVM), particle swarm optimization LSSVM (PSO-LSSVM), and bat algorithm-LSSVM (BA-LSSVM) were utilized to forecast three compartments, respectively. Similarly, Yun et al. [8] built a robust carbon price-predicting model of complete ensemble empirical mode decomposition with adaptive noise (CEEMDAN)-long short-term memory (LSTM) by blending the merits of CEEMDAN in decaying multiresolution time-frequency carbon price waves and the LSTM model by installing monetary waves.
A report by Eckstein et al. [9] positioned Pakistan as the ffth generally impacted country because of environmental changes throughout the course of recent years, while the author in [10] positioned Pakistan's air as the second generally contaminated in 2020. To diminish these efects, the nation needs to take broad variation and alleviation measures while changing the energy area towards dirtying and carbon impartial choices [11].
Existing studies imply that greater than two-thirds of GHG emissions in Pakistan come from fossil power-related CO 2 emissions. Consequently, forecasting CO 2 emissions is crucial for adjusting regulations to mitigate weather changes. Te forecasting of CO 2 emissions has been mentioned dramatically in [12].
Pao and Tsai [13] implemented the grey model to obtain CO 2 emissions in Brazil between 1980 and 2007. A combination of models based on the optimization algorithms of quantum harmony and discounted mean square has been established and employed in CO 2 emission forecasting for the world's top fve countries with high CO 2 emissions [14]. Aufhammer and Carson [15] came up with China's CO 2 emission path using a set of panel data in a provincial-level framework [16][17][18][19]. Te application of the traditional grey model and the upgraded grey model has been implemented to forecast CO 2 emissions [20,21].
From qualitative and quantitative perspectives, numerous discussions have been conducted on the relationship between consumption of energy, economic growth, and CO 2 emissions [22]. Al-mulali and Sab [23] investigated the infuence of the consumption of basic energy and emissions of CO 2 on sixteen developing countries, focusing on the impact on the development of their economies. Wang et al. [24] examined the signifcant factors contributing to CO 2 emissions in China's road freight transport. Zhang and Nian [25] computed the emissions of CO 2 and examined the factors contributing to emissions in China's transport sector. In China, Lin et al. [26] developed a new bootstrap ARDL bounding test for CO 2 emissions and industrial development. Pao and Tsai [27] conducted an examination of the dynamism of the causal relationship, linking the emission of pollutants, consumption of energy, and their output in Britain, Russia, India, China, and other countries over a defnite period [28][29][30][31][32][33][34].
Little work has been conducted on providing mathematical and statistical modelling techniques for modelling and predicting CO 2 emissions in Pakistan and where the country stands in the future concerning this important issue. It is against this backdrop that this paper provides a modelling and forecasting technique based on linear and nonlinear time series models. Tis study predicts the amount of CO 2 emissions in Pakistan for almost a decade to assist the Pakistan government and other stakeholders to plan ahead to devise ways of mitigating the efects of CO 2 emissions or possibly curtailing them. Section 2 of the paper presents the source of data, data characteristics, and the modelling and forecasting techniques applied. Section 3 presents the discussion and the results, while the last section presents the conclusions and recommendations of the study.

Data.
Te data are made up of CO 2 emissions in Pakistan from 1960 to 2018. Te amount of CO 2 was measured in metric tons per capita according to international standards. Te data were obtained from the World Bank data indicators of Pakistan (https://databank. worldbank.org/). Figure 1 shows CO 2 emissions (metrics tons per capita) from 1960 to 2018 as obtained from the World Bank. It can be observed from the fgure that there is an increasing trend, which is an indication of a seasonal trend in the series. Seasonality must therefore be removed from the series before applying our modelling and    [35,36] can be represented by With lagged values θ 1 y t−1 , . . . , θ p y t−p and lagged errors α 1 e t−1 , . . . , α q e t−q . Te p, d, and q of the series x t are the autoregressive term order, the diferencing degree of the series, and the moving average order term, respectively. Te   white noise e t has mean 0 and variance σ 2 . Te diferencing of the series x t can be conducted once or more times.
Te seasonal multiplicative ARIMA model has the form ARIMA (p, d, q) × (P, D, Q) written as In the formula, f and K are the seasonality frequency and the shift balanced operator, respectively. Te seasonal diference and ordinary diferencing degrees are represented by D and d, respectively. δ P (E f ) and θ p (E) are the seasonal autoregressive polynomial of order P and regular autoregressive polynomial of order p, respectively while μ Q (E f ) and ϑ q (E) are the seasonal moving average of order Q and polynomials of the regular moving average of order q, respectively. Similarly, Te white noise ε t has mean 0 and variance σ 2 . Figure 2 shows the diagrammatical display of the steps involved in this methodology. In Figure 2, it can be noticed that there are four steps involved in ARIMA modelling. We discuss the main steps involved in this method briefy.
Step 1. Identifcation of the ARIMA Model Tis step requires the series to be stationary and parameters independent of time. In time series analysis, it is often found that the series is not white noised in advance. To make the series white noised, diferencing is required. To test whether the series is stationary or not, we use the augmented Dickey-Fuller test. Once the series becomes stationary, graphical tools such as the autocorrelation function (ACF) and partial autocorrelation function (PACF) are implemented to identify the order of the candidate model.

Step 2. Model Estimation
With the help of the visual display of the autocorrelation function (ACF) and partial autocorrelation function (PACF) of the series, the appropriate candidate model for the series is estimated. Diferent combinations of candidate models are applied, and the fnal model is selected based on the criteria of accuracy of parameters.
Step 3. Diagnostic Checks Here, the selected candidate model passes through various tests, which include the ACF plot, QQ-Norm plot, and Shapiro-Wilk to verify the normality of residuals. Once the model passes all these tests, the model is considered one of the best models for the next step.
Step 4. Forecasting In this step, the candidate model fulflling the three steps discussed earlier is used for predicting the future values of the series. Te efciency of the forecasted model is measured by the root mean square error (RMSE) and the mean absolute error (MAE). Te mathematical expressions for RMSE and MAE are given by   Journal of Environmental and Public Health 5 where x 1 , . . . , x n and x n+1 , . . . , x t are the partitions of the data for training and forecasting, respectively. In our case, we used the complete data for both training and forecasting. Te RMSE and MAE assess the quality as well as summarize the model for superior forecasting. Te model with the smallest RMSE and MAE values is chosen for forecasting [5]. Other indicators of measuring efciency can also be implemented.

(2) Trigonometric seasonality, Box-Cox transformation, ARMA errors, Trend, and Seasonal (TBATS) Modelling Approach
Te trigonometric seasonality, Box-Cox transformation, ARMA errors, trend, and seasonal component (TBATS) model is mainly used to predict the time series with complex seasonal patterns by exponential smoothing. Te TBATS model utilizes the representation of trigonometric seasonal components based on the Fourier series. Te TBATS model [38,39] is given by wherex (α) t is the Box-Cox-transformed time series at moment t and α (i) t−n i is the ith seasonal component. τ t is the local level, while k t is the trend with damping. ϑ t is ARMA (p, q) and ε t is the white noise from a Gaussian process. T is the seasonality amount, whereas n i is the length of the ith seasonal period. Te Box-Cox transformation is α, while β is trending damping. Also, ω, c represent the smoothing and δ i , θ i are the ARMA (p, q) coefcients. ε t is the white noise. Te seasonal component of the TBATS model is given by  whereξ i � 2πj/n i and ϕ (i) 1 , ϕ (i) 2 are the seasonal smoothing.
(3) Naïve Modelling Approach. Te naïve modelling approach utilizes the last year's genuine data as a fgure for the current year. Te following year's estimate t + 1 will be equivalent to the current year's information [33]. Te supposition that its interest lies later in the period will be equivalent to the interest in the last period. Te expression of the naïve method [37,40,41] is Te seasonal naïve model for the period t + k is of the form where n is the period of seasonality and g is the integer part of k − 1/n, which is the number of years prior to the forecast year t + k.

(4) Exponential Smoothing (ETS) Modelling Approach.
Te exponential smoothing (ETS) technique is identically connected to one or more stochastic models. It likewise follows the property of robustness and treated as a model of critical extrapolation. Te thought behind the ETS is that the estimates from this approach are the weighted normal average of past data, and weights exponentially decay with time. Tat is, the current data have bigger loads in contrast to the previous ones. In this way, ETS delivers savvy forecasting in contrast to typical techniques [37,42,43]. Te simplest form of ETS [37,42,43] is given by where β, 0 ≤ β ≤ 1, is the parameter for smoothing and t + 1 is the time for one-step-forward average forecast for the series x 1 , . . . , x t . β controls the rate of decrease in weights.
For ETS (A, N, and N), the additive model [37,42,43] is represented mathematically by where (9) is the equation for the data and (10)   Journal of Environmental and Public Health where F t is the new level and F t−1 is the previews level. ε t is white noise with mean 0 and variance, σ 2 . We, therefore, present the nonlinear machine learning time series model after thoroughly discussing linear time series models.

Nonlinear Machine Learning Time Series Models.
Here, we briefy discuss neural network autoregressive and multilayer perceptron nonlinear machine learning time series models.
(1) Neural network autoregressive (NNAR) modelling approach. A neural network autoregressive (NNAR) model is obtained when the lagged values of the time series are input to a neural network. NNAR (p, r) denotes p delayed inputs and r nodes. NNAR (p, 0) equals ARIMA (p, 0, 0) without stationarity parameterization. Te mathematical expression for NNAR (p, r) [36,[44][45][46] is of the form Model (12) can be constructed in two diferent stages, with the frst being the activation R. Also, the hidden layer B(r) is computed as where r � 1, . . . , R and y j � y t−1, . . . , y t−p .s is a previously defned nonlinear activation function. Every B(r) acts as a unique transformation characteristic b r (y). Mathematically, the hidden layer, receives R instigations. Te NNAR model assumes a sigmoid function, , (15) used in the linear function translation of 0 probability to a probability of 1.
(2) Multilayer Perceptron (MLP) Model. Te structure of the multilayer perceptron (MLP) model [47][48][49][50] is composed of the input layer, hidden layer, and output layer with limited nonlinear functions that are  Journal of Environmental and Public Health diferentiable. In the MLP, information in the form of a nonlinear diferentiable function is processed by artifcial neurons from the input layer through the hidden layer to the output layer, yielding a response in the form of a disjointed feed-forward network algorithm. Te mathematical expression of the MLP is given by where c n , d n , k, k s , t(x), g i pn , and g 0 1p are the network inputs, bias of the network, activation function of intermediate layers, output layer activation function, output signal, weights of the intermediate layer, and connections of output neurons, respectively. Figure 3 shows the MLP model with a single hidden layer.
Te RMSE and MAE will be used for all model comparisons.

Results and Discussion
From the time series plotted in Figure 3, the series is not stationary. We verifed this by applying stationarity at the level (0) and found that the series is not stationary at the level (0). Tis means that the mean, variance, and covariance are not stationary over some time. Tis violation of nonstationarity was removed by taking the frst diference of the series. We implemented the Dickey-Fuller (DF) test without drift, the Phillips-Perron (PP) test without drift, to verify whether our series is stationary or not, and the structural break (Chow test) [57] test to check whether there is a structural change in the series. Te Chow test indicated that there is no structural change in the series as the p value is greater than 0.05. As a result, the series becomes stationary at the level (0). Te output of these tests is given in Table 2 for the level (0).
From Table 2, the series is not stationary at the level (0), so we implemented a single diferencing technique to make the series stationary. After making the series stationary, we searched for the best candidate model for the CO 2 emission data by producing the correlogram of the diferenced series. Te visual autocorrelation (ACF) and partial autocorrelation (PACF) plots are presented in Figure 4.
In Figure 4, the ACF and PACF suggest that the estimated ARIMA model for modelling CO 2 emissions includes two terms from the autoregressive (AR) part and one term from the moving average (MA) part. In the correlogram, the suitable ARIMA model for the series is ARIMA (4,1,7) since the lags of ACF and PACF are out of the boundaries on 5 and 7.
Using the complete dataset, we presented the visual single horizon forecast of the ARIMA, ETS, TBATS, naïve, NNAR, and MLP models in Figure 5, Figure 6, Figure 7, Figure 8, Figure 9, and Figure 10, respectively, to assess their We used the nnfor and neuralnet packages in R to perform the analyses for the nonlinear machine learning time series models. Te packages adjust itself for the optimum number of parameters according to the behaviour of the series.
Although all models showed a similar trend as the original series (observed), the nonlinear machine learning time series models had a more closer trend compared to the linear time series models. Tis is an indication that the nonlinear machine learning models have superior forecasting qualities than the linear time series models.
Except for ETS and naïve models (Figures 6 and 8) which forecasted stable CO 2 emissions in subsequent years, all other models forecasted increasing CO 2 emissions in Pakistan in subsequent years. Tis trend is worrying and devastating and similar to that observed by Butt et al. [11].
We estimated the RMSE and MAE for both linear and nonlinear time series models to illustrate the numerical performance of our models.  NNAR (1, 1) having the least value. We noticed that both nonlinear machine learning time series models had smaller RSME and MAE values than all linear time series models, indicating their superiority to linear time series models. Te NNAR (1, 1) model's least RSME and MAE values of 0.0007 and 0.0005, respectively, is a clear evidence that the machine learning nonlinear NNAR (1, 1) model possesses a superior forecasting quality and data assessment than the rest of the time series models. As a result, we utilized the NNAR (1, 1) model for forecasting CO 2 emissions in Pakistan from 2019 to 2028. Te forecast values as illustrated in Figure 9 show that CO 2 emissions in Pakistan will be 1.048 metric tons per capita by 2028. Tis depicts a rapid increase in CO 2 emissions from previous years. Tese results are in line with those of Aftab et al. [2] and Faruque et al. [3].    [3]. To achieve the targeted goal in the fght of reducing CO 2 emissions, in order to accomplish the Paris Agreement of the UNFCCC, we propose that the Pakistan government should adapt the following:

Conclusion
(i) We should charge US$ 1000 per ton of CO 2 emitted by companies and entities in every quarter of the year [61,62]. Tis amount, which is set out by international authorities to curtail CO 2 emissions, should be increased with time to discourage the use of high emitting fuels and encourage the use of low emitting ones. When this is initiated, it can reduce CO 2 emissions by over ffty percent [61,62].
(ii) We should produce electricity from hydro, wind, and diferent sources with no emissions of CO 2 . Tis can reduce CO 2 emissions in excess of seventy percent as well as make electricity cheaper [61].
(iii) We should initiate rigorous planting of more trees, at least fve million per year, in the populated areas of Pakistan as forest covers to absorb a greater chunk of emitted CO 2 in the atmosphere. Studies reveal that fve million trees in a particular area can absorb seventy percent of CO 2 concentrates in the atmosphere in the area [61]. Terefore, this could reduce the concentration of CO 2 in the atmosphere by over seventy percent of the recent value [61].
(iv) We should provide, as a matter of urgency, incentives to companies, organisations, institutions, and households to come out with clean technologies or use technologies with no CO 2 emissions or those with lower ones [61]. (v) We should fund more studies to develop clean and innovative technologies with less or no emissions   [61]. Tis will go a long way to reduce emissions by over forty percent.
Our study can be used as a suggestion for forecasting CO 2 emissions depending on diferent and dynamic settings. New policies for future purposes can be formulated and devised in light of this study. Our study can be extended to comparative analysis with other machine learning models using diferent activation functions.

Data Availability
Te data consist of CO 2 emissions in Pakistan from 1960 to 2018, available at https://databank.worldbank.org/.