Trend Analysis and Prediction on Water Consumption in Southwestern Ethiopia

This study examined the monthly water consumption forecasting performance of autoregressive integrated moving average (ARIMA) time series models. It aims to identify the ARIMA model that best fits the water consumption data of Tepi town, Southwestern Ethiopia, and to forecast water consumption in the town effectively. The data used for this study were the monthly water consumption in Tepi town from January 2016 to December 2021. The data were converted to returns to enhance their statistical properties, and the returns were used to fit a mean equation. The monthly average water consumption in Tepi town is 77,227.8 cubic meters. Both the original and the transformed data show that water consumption is increasing over time. Several ARIMA models were fitted to the data, and the most adequate model was ARIMA (1, 1, 1) based on the model selection criteria. The parameters of the ARIMA models were estimated using the ordinary least squares (OLS) method. The model was used to forecast consumption for the next ten months and to advise Tepi town Water Company Limited on meeting the demand of the people. Conclusion. The consumption of water is increasing from December to September.


Introduction
Water is one of the most important nutrients in the world. It is a source of security for human beings and plants and a source of livelihood for human beings. In general, water is used for agriculture, industry, transportation, electricity, and recreation [1]. According to the United Nations, about seven million people in sixty countries suffer from water shortages [2]. Many people around the world now suffer from water shortages [3]. Unsafe drinking water is harmful to health: contaminated water can carry bacteria, viruses, and parasites that cause intestinal worm infections [4]. A study of five African countries (Ghana, Uganda, Niger, Sierra Leone, and Rwanda) found that government commitment to water supply and sanitation was low [5].
Some studies in developing countries indicate that improper water use and poor sanitation account for 88% of deaths from diarrhea [6,7]. In larger cities, such as Addis Ababa, community demand for tap water is very high and still increasing [6][7][8][9][10][11]. Respect and resilience have a great deal to do with the use of water. Water shortage is primarily a problem for the community that provides food, shelter, electricity, and so on. Studies show that the use of lightening and distilled water, as well as the continuous flow of water, is a main reason for shortages in water supply and utilization. The use of rainwater, groundwater, and other sources through a variety of technologies can help reduce water supply shortages [12,13].
Ethiopia's Millennium Development Goal targets are 62% water supply coverage and 52% sanitation coverage [14]. The average water access achieved in Ethiopia was 17 percent of the total population. Currently, Ethiopia is making significant natural and social changes to manage water supply and demand, and formulating a water use policy is a main agenda of the country [15,16]. Water consumption in Tepi town has not yet been assessed, and no research article has been published on the study area. This study therefore draws on related national and global literature.
Forecasting water consumption helps to improve water resource management and provides technical support for the assessment of water resources [17]. It can also improve the management and service quality of water supply enterprises. In addition, effective forecasting helps ensure that water demand is met across different periods.
The contribution of this work is to show that the demand for water in the community is high. An alternative source is therefore needed to overcome the scarcity of water supply in and around the town, and communities should be made aware of the need to use water properly.
Therefore, the main goal of the study was to forecast water consumption in Tepi town using advanced time series analysis.

Materials and Methods
The study was conducted in Tepi town, located in Southwestern Ethiopia. The study area is 611 km from Addis Ababa, the capital of Ethiopia. It is also 263 km from Jimma city and 1.5 km from Mizan-Tepi University, Tepi Campus. The town is located at latitude 7°12′ N and longitude 35°27′ E, with a mean elevation of 1,097 meters above sea level [18]. Several ethnic groups live in the town, including the Shakacho, Amhara, Kafficho, Oromo, Bench, Sheko, and Majang. Water consumption data (measured in cubic meters), recorded from 2016 to 2021, were obtained from the Tepi town water Consumption Corporation.

Statistical Method.
Descriptive statistics were used to organize, present, and describe the amount of water consumed in Tepi town. From inferential statistics, time series analysis was used for estimation, hypothesis testing, and drawing conclusions about water consumption in Tepi town. A time series is a set of observations recorded over time and arranged in chronological order.

Test of Randomness.
Randomness in a time series means that the observations fluctuate around a constant value (such as the mean) and are assumed to be statistically independent [19].

Turning Point Test.
The turning point test is a simple test based on counting the number of peaks and troughs in a series. Before applying the test, the distribution of the number of turning points in a random series must be determined. Let $Y_1, Y_2, \dots, Y_n$ be the water consumption at different times. The hypotheses are $H_0$: $Y_t$ $(t = 1, 2, \dots, n)$ are independent and identically distributed, versus $H_1$: not $H_0$. Let $P$ be the number of turning points, obtained from the counting variable
$$Z_t = \begin{cases} 1, & \text{if } Y_t > Y_{t-1} \text{ and } Y_t > Y_{t+1}, \text{ or } Y_t < Y_{t-1} \text{ and } Y_t < Y_{t+1}, \\ 0, & \text{otherwise}, \end{cases}$$
so that the number of turning points is $P = \sum_{t=2}^{n-1} Z_t$.
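The turning point count can be sketched in a few lines of Python. This is a minimal illustration, not the computation used in the study: the function name and the short example series are hypothetical, and the normal approximation relies on the standard results for a random series, $E(P) = 2(n-2)/3$ and $Var(P) = (16n - 29)/90$.

```python
import math

def turning_point_test(y):
    """Turning point test for randomness of a series y (hypothetical helper)."""
    n = len(y)
    if n < 3:
        raise ValueError("need at least three observations")
    # Z_t = 1 when y[t] is a strict local peak or trough, 0 otherwise
    p = sum(
        1
        for t in range(1, n - 1)
        if (y[t] > y[t - 1] and y[t] > y[t + 1])
        or (y[t] < y[t - 1] and y[t] < y[t + 1])
    )
    expected = 2 * (n - 2) / 3       # E(P) under H0
    variance = (16 * n - 29) / 90    # Var(P) under H0
    z = (p - expected) / math.sqrt(variance)
    return p, expected, variance, z

# Hypothetical short series, NOT the Tepi town data
series = [5, 7, 6, 8, 9, 4, 6, 10, 12, 11]
p, expected, variance, z = turning_point_test(series)
print(p, round(expected, 3), round(z, 3))
```

Under the normal approximation, $|z| > 1.96$ would lead to rejecting randomness at the 5% level.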

Test of Stationarity.
Stationarity can be tested with the augmented Dickey-Fuller (ADF) test. Both the Bayesian information criterion and the Akaike information criterion are used to select the best model. The ADF test is based on the regression
$$\Delta Y_t = \alpha + \beta t + \gamma Y_{t-1} + \sum_{i=1}^{p} \delta_i \Delta Y_{t-i} + \varepsilon_t,$$
where $\alpha$ is a constant, $\beta$ is the coefficient on a time trend, and $p$ is the lag order of the autoregressive process. Imposing $\alpha = 0$ and $\beta = 0$ corresponds to modeling a random walk, and imposing only $\beta = 0$ corresponds to a random walk with drift. The unit root test is formulated as $H_0: \gamma = 0$ (nonstationary) versus $H_a: \gamma < 0$ (stationary). The test statistic is
$$DF_\gamma = \frac{\hat{\gamma}}{SE(\hat{\gamma})}.$$
The statistic is then compared with the critical value of the Dickey-Fuller distribution. If the $DF_\gamma$ statistic is less than the critical value, the null hypothesis is rejected and no unit root is present. Differencing and log transformation are used to convert a nonstationary time series into a stationary one [20].
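As a rough illustration of the test statistic, the sketch below computes the Dickey-Fuller t-ratio for the simplest case with a constant, no time trend, and no lagged differences ($\beta = 0$, $p = 0$). This is a simplification of the ADF regression and an assumption of the example; the function name, both example series, and the use of the approximate large-sample 5% critical value of -2.86 are illustrative and are not taken from the study.

```python
import math
import random

def dickey_fuller_stat(y):
    """t-ratio of gamma in  dy_t = alpha + gamma * y_{t-1} + e_t  (no lagged differences)."""
    x = y[:-1]                                        # y_{t-1}
    d = [y[t] - y[t - 1] for t in range(1, len(y))]   # first differences
    m = len(d)
    x_bar = sum(x) / m
    d_bar = sum(d) / m
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxd = sum((xi - x_bar) * (di - d_bar) for xi, di in zip(x, d))
    gamma = sxd / sxx                                 # OLS slope
    alpha = d_bar - gamma * x_bar                     # OLS intercept
    sse = sum((di - alpha - gamma * xi) ** 2 for xi, di in zip(x, d))
    se_gamma = math.sqrt(sse / (m - 2) / sxx)         # standard error of the slope
    return gamma, gamma / se_gamma

CRITICAL_5PCT = -2.86  # approximate large-sample value, constant and no trend

# Hypothetical series that oscillates around a fixed level
stationary = [10 + 2 * (-1) ** t + 0.5 * math.sin(t) for t in range(60)]
gamma, t_stat = dickey_fuller_stat(stationary)
reject_unit_root = t_stat < CRITICAL_5PCT
print(round(gamma, 3), round(t_stat, 2), reject_unit_root)

# Hypothetical random walk, for comparison only
random.seed(1)
walk = [0.0]
for _ in range(59):
    walk.append(walk[-1] + random.gauss(0, 1))
print(dickey_fuller_stat(walk))
```

In practice, a statistical package that handles lag selection and finite-sample critical values would be used instead of this hand-rolled regression.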

Trend Component.
The term trend refers to the long-term pattern of a data series. It applies, for example, to price inflation, general economic change, and population growth. The model in trend analysis is stationary.

Trend Analysis.
Trend analysis is used to find the trend in economic and business time series and to forecast future values. The parameters of the trend model can be estimated by the least squares method.
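A least squares trend fit can be sketched as follows, assuming a linear trend $T_t = a + b\,t$ with $t = 1, 2, \dots, n$. The linear form, the function name, and the example values are assumptions made for illustration; the study does not state the trend equation it fitted.

```python
def fit_linear_trend(y):
    """Least squares estimates (a, b) for the trend line y_t = a + b * t, t = 1..n."""
    n = len(y)
    t = range(1, n + 1)
    t_bar = (n + 1) / 2
    y_bar = sum(y) / n
    b = sum((ti - t_bar) * (yi - y_bar) for ti, yi in zip(t, y)) / sum(
        (ti - t_bar) ** 2 for ti in t
    )
    a = y_bar - b * t_bar
    return a, b

# Hypothetical monthly values, NOT the Tepi town data
y = [52, 55, 59, 60, 66, 69]
a, b = fit_linear_trend(y)
next_value = a + b * (len(y) + 1)   # one-step-ahead trend forecast
print(round(a, 3), round(b, 3), round(next_value, 3))
```

A positive slope `b` corresponds to an increasing trend.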

Time Series Model.
The patterns of the autocorrelation function (ACF) and partial autocorrelation function (PACF) guide the choice of model: autoregressive (AR) models tend to fit smooth time series well, whereas moving average (MA) models tend to fit irregular series well. The AR (p) model is given by the general formula
$$Y_t = c + \sum_{i=1}^{p} \phi_i Y_{t-i} + \varepsilon_t.$$
Journal of Nanomaterials
The MA (q) model is given by
$$Y_t = \mu + \varepsilon_t + \sum_{j=1}^{q} \theta_j \varepsilon_{t-j}.$$
A moving average process MA (q) can be represented as an autoregressive model of infinite order, AR (∞), so its correlation functions are expected to show the opposite pattern to those of an AR process: the partial autocorrelation function decays exponentially, and the autocorrelation function is used to identify the order of the moving average process. For the autoregressive moving average model ARMA (p, q), if p = 0 it reduces to an MA (q) process, and if q = 0 it reduces to an AR (p) process. Observing both the autocorrelation function and the partial autocorrelation function suggests combinations of autoregressive and moving average terms. Once the autocorrelation function and partial autocorrelation function have been determined, it is possible to move to the estimation stage.
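
Autoregressive Integrated Moving Average, ARIMA (p, d, q).
In practice, most time series are nonstationary, so they should be converted to stationary series by successive differencing. The ARIMA (p, d, q) model is
$$\left(1 - \sum_{i=1}^{p} \phi_i B^i\right)(1 - B)^d Y_t = c + \left(1 + \sum_{j=1}^{q} \theta_j B^j\right)\varepsilon_t,$$
where $B$ is the lag operator ($B^k Y_t = Y_{t-k}$), $\phi_i$ and $\theta_j$ are the autoregressive and moving average coefficients, and $\varepsilon_t$ is a white noise error.
The Box-Jenkins methodology is a procedure for identifying and estimating time series models such as the moving average (MA), autoregressive (AR), autoregressive moving average (ARMA), and autoregressive integrated moving average (ARIMA) models.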

Process of Building ARIMA Models.
If the time series is an MA (q) process, the sample autocorrelation cuts off after lag q; if it is an AR (p) or ARMA (p, q) process, the sample autocorrelation dies out gradually. Tentative models are selected using the ACF, PACF, BIC, and AIC.
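Selection by information criteria can be illustrated with the usual definitions $AIC = 2k - 2\ln L$ and $BIC = k \ln n - 2\ln L$, where $k$ is the number of estimated parameters, $n$ the number of observations, and $L$ the maximized likelihood. In the sketch below, the candidate orders, log-likelihood values, and sample size are made-up numbers for illustration only; they are not the values reported in the study.

```python
import math

def aic(log_likelihood, k):
    """Akaike information criterion."""
    return 2 * k - 2 * log_likelihood

def bic(log_likelihood, k, n):
    """Bayesian information criterion."""
    return k * math.log(n) - 2 * log_likelihood

# Hypothetical candidates: (p, d, q) -> (maximized log-likelihood, number of parameters)
candidates = {
    (1, 1, 0): (-105.2, 2),
    (0, 1, 1): (-104.8, 2),
    (1, 1, 1): (-101.3, 3),
}
n_obs = 71  # hypothetical number of differenced observations

scores = {
    order: (aic(ll, k), bic(ll, k, n_obs)) for order, (ll, k) in candidates.items()
}
best_by_aic = min(scores, key=lambda order: scores[order][0])
best_by_bic = min(scores, key=lambda order: scores[order][1])
print(best_by_aic, best_by_bic)
```

The model with the smallest criterion value is preferred; BIC penalizes additional parameters more heavily than AIC whenever $\ln n > 2$.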
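Autocorrelation Function (ACF): The ACF statistic at lag $k$ is given by
$$r_k = \frac{\sum_{t=k+1}^{n} (Y_t - \bar{Y})(Y_{t-k} - \bar{Y})}{\sum_{t=1}^{n} (Y_t - \bar{Y})^2}.$$
The partial autocorrelation at lag $k$ is the last regression coefficient $\phi_{kk}$ in the autoregression of order $k$,
$$Y_t = \phi_{k1} Y_{t-1} + \phi_{k2} Y_{t-2} + \dots + \phi_{kk} Y_{t-k} + e_t.$$
By using the ACF and PACF, we can identify the model in Table 1.
The parameters of both the AR (p) and MA (q) models are estimated using the sample autocorrelation function.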

Distribution of Monthly Water Consumption in Tepi Town, Ethiopia.
As shown in this study, the monthly water consumption in Tepi town over the last 60 months had a minimum of 42,325.00, an average of 77,227.77, and a maximum of 111,833.0. The time series analysis of the five-year data gave a negative skewness of -0.073761 and a kurtosis coefficient of 1.982369, which is below 3 and therefore indicates a platykurtic distribution. This result revealed that the distribution of water consumption in Tepi town was far from the required amount of water demand (Table 2).
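The summary measures above can be computed as in the following sketch. It assumes moment-based coefficients using population (divide-by-n) moments, with kurtosis on the scale where a normal distribution equals 3; whether the reported values were computed this way is an assumption, and the function name and data are hypothetical.

```python
def describe(y):
    """Minimum, mean, maximum, skewness, and kurtosis from population moments."""
    n = len(y)
    mean = sum(y) / n
    m2 = sum((v - mean) ** 2 for v in y) / n   # second central moment
    m3 = sum((v - mean) ** 3 for v in y) / n   # third central moment
    m4 = sum((v - mean) ** 4 for v in y) / n   # fourth central moment
    return {
        "min": min(y),
        "mean": mean,
        "max": max(y),
        "skewness": m3 / m2 ** 1.5,
        "kurtosis": m4 / m2 ** 2,              # normal distribution gives 3
    }

# Hypothetical monthly volumes in cubic meters, NOT the Tepi town data
sample = [61000, 72500, 80100, 69400, 90300, 77800, 84200, 66900]
summary = describe(sample)
print(summary)
```

A kurtosis below 3 indicates a flatter (platykurtic) distribution than the normal, and a value above 3 a more peaked (leptokurtic) one.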

Test of Stationarity.
To see the distribution of monthly water consumption in the study area, a time series plot was used, and the plot showed that water supply fluctuates from quarter to quarter. The plot was drawn using the original water consumption data in m³, with months grouped by quarter. Figure 1 presents the water consumption; the horizontal axis shows the months from 2016 to 2021. The general pattern of monthly water consumption in Tepi town changes over time; in other words, the level of water consumption is not similar across the months. This indicates that the original data are nonstationary. Figure 1 also shows the fluctuation and upward trend of water consumption. Table 3 presents the stationarity test for water consumption. Since the ADF test statistic is greater than the critical values, the null hypothesis of a unit root is not rejected at the 1%, 5%, and 10% levels of significance. Thus, it has been confirmed that the series is not stationary.

Transformation of Original Water Consumption Data.
If the data are not stationary, they should be converted to a stationary series using an appropriate transformation.
The nonstationary series was transformed into a stationary series using a log transformation and first differencing. The plot of the transformed data fluctuates about a fixed level. After the first difference, the p value of the ADF test is 0.0001, which is less than 0.05, so the water consumption data become stationary after the first difference (Table 4).
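The transformation can be sketched as the first difference of the logged series, $r_t = \ln Y_t - \ln Y_{t-1}$, which is one common way of forming returns. The helper names and the example values below are hypothetical, and the exact transformation applied in the study may differ in detail.

```python
import math

def log_difference(y):
    """First difference of log(y): r_t = ln(y_t) - ln(y_{t-1})."""
    logs = [math.log(v) for v in y]
    return [logs[t] - logs[t - 1] for t in range(1, len(logs))]

def undo_log_difference(first_value, returns):
    """Rebuild the original series from its first value and the log differences."""
    out = [first_value]
    for r in returns:
        out.append(out[-1] * math.exp(r))
    return out

# Hypothetical monthly volumes in cubic meters, NOT the Tepi town data
consumption = [60000, 63000, 61000, 70000]
returns = log_difference(consumption)
rebuilt = undo_log_difference(consumption[0], returns)
print([round(r, 4) for r in returns])
```

The inverse step matters for forecasting: forecasts made on the transformed scale have to be converted back to cubic meters in this way before they are reported.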

Identification of Tentative Model.
The Box-Jenkins approach is commonly used to identify the possible AR and MA terms and their orders, so an ARIMA (p, d, q) model would be appropriate for the time series data. The autocorrelation function and partial autocorrelation function of the differenced data are plotted in Figure 2. The ACF plot shows the autocorrelation of the first-differenced water consumption at various lags; it has significant negative spikes at lag 6 and lag 9. The PACF plot has a significant positive spike at lag 4 and a significant negative spike at lag 6.
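The sample ACF and PACF used for identification can be computed as in the sketch below. The PACF is obtained with the Durbin-Levinson recursion, which is one standard way to compute it and not necessarily the routine used in the study; the function names and the differenced series are hypothetical, and the $\pm 1.96/\sqrt{n}$ band is the usual approximate 5% significance bound.

```python
import math

def sample_acf(y, max_lag):
    """Sample autocorrelations r_0..r_max_lag."""
    n = len(y)
    mean = sum(y) / n
    c0 = sum((v - mean) ** 2 for v in y)
    return [
        sum((y[t] - mean) * (y[t - k] - mean) for t in range(k, n)) / c0
        for k in range(max_lag + 1)
    ]

def sample_pacf(r):
    """Partial autocorrelations from an ACF list r (r[0] == 1) via Durbin-Levinson."""
    pacf = [1.0]
    phi_prev = []  # coefficients phi_{k-1,1}, ..., phi_{k-1,k-1}
    for k in range(1, len(r)):
        num = r[k] - sum(phi_prev[j] * r[k - 1 - j] for j in range(k - 1))
        den = 1.0 - sum(phi_prev[j] * r[j + 1] for j in range(k - 1))
        phi_kk = num / den
        # update phi_{k,j} = phi_{k-1,j} - phi_kk * phi_{k-1,k-j}
        phi_curr = [phi_prev[j] - phi_kk * phi_prev[k - 2 - j] for j in range(k - 1)]
        phi_curr.append(phi_kk)
        pacf.append(phi_kk)
        phi_prev = phi_curr
    return pacf

# Hypothetical differenced series, NOT the Tepi town data
diffs = [1.2, -0.8, 0.5, -1.1, 0.9, -0.4, 0.7, -1.3, 1.0, -0.6, 0.3, -0.9]
acf = sample_acf(diffs, 4)
pacf = sample_pacf(acf)
bound = 1.96 / math.sqrt(len(diffs))
significant_lags = [k for k in range(1, 5) if abs(acf[k]) > bound]
print([round(v, 3) for v in acf], [round(v, 3) for v in pacf], significant_lags)
```

Spikes outside the band at particular lags are what the identification step reads off the ACF and PACF plots.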

Tentative Model.
In Table 5, the criterion values suggest that ARIMA (1, 1, 1) is the best among the competing models, since it has the lowest Bayesian information criterion (BIC) and AIC.
Table 6: Forecasting using the ARIMA model for water consumption data.
The monthly water consumption of Tepi town for 2022 (ten months) has been forecasted.
Based on 2021 forecasted value of water consumption in Tepi town.
As seen from the forecasted values above, water consumption will increase uniformly from September to December.
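How an ARIMA (1, 1, 1) model produces multi-step forecasts can be sketched with the recursion on first differences, $w_t = c + \phi w_{t-1} + \varepsilon_t + \theta \varepsilon_{t-1}$ with $w_t = Y_t - Y_{t-1}$. The parameter values and data below are hypothetical, since the fitted coefficients are not reproduced in the text, and the residuals are rebuilt with a simple zero starting value, which is an approximation.

```python
def arima_111_forecast(y, phi, theta, c, steps):
    """Point forecasts from an ARIMA(1,1,1) with given (assumed known) parameters."""
    if len(y) < 2:
        raise ValueError("need at least two observations")
    w = [y[t] - y[t - 1] for t in range(1, len(y))]   # first differences
    # Rebuild one-step residuals, starting from e = 0 (conditional approximation)
    e_prev = 0.0
    for t in range(1, len(w)):
        e_prev = w[t] - (c + phi * w[t - 1] + theta * e_prev)
    forecasts = []
    w_prev, level = w[-1], y[-1]
    for h in range(steps):
        # The last residual only enters the one-step-ahead forecast
        w_hat = c + phi * w_prev + (theta * e_prev if h == 0 else 0.0)
        level += w_hat                                # undo the differencing
        forecasts.append(level)
        w_prev = w_hat
    return forecasts

# Hypothetical data and parameters, NOT the fitted model from this study
history = [70000, 72000, 71500, 74000, 76500, 77000]
print(arima_111_forecast(history, phi=0.4, theta=-0.3, c=500.0, steps=10))
```

If the model is fitted to log-transformed data, the series passed in would be the logged values and the forecasts would need to be exponentiated afterwards.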

Discussion
This study sought to identify the best-fitting time series model for the water consumption data of Tepi town, southwestern Ethiopia, and to forecast water consumption in the municipality. The time series analysis of water consumption in Tepi town illustrated in Figure 3 implied a decrease in water consumption in the third quarter of every year, which is mainly the winter season. However, the results of this study revealed that in 2020-2021, water consumption in the study area showed good progression during the winter season. This finding is consistent with the study conducted in Southern Brazil [21][22][23]. Clear evidence of nonstationarity in the series was observed (Figure 4), and it was confirmed by the ADF test (Table 2). A log transformation was applied to the series, after which no clear evidence of nonstationarity was observed (no trend was visible in the transformed series) (Figure 5). Stationarity of the log-transformed series was confirmed by the ADF test (Table 7). Of the various ARMA models considered in this study, ARIMA (1, 1, 1) was found to be the best model for water consumption. The model was selected from the competing models using AIC and BIC, with the minimum forecasting error. Water consumption at the current time is significantly affected by its first lag (p ≤ 0.001) at the 5% level of significance. This shows that the amount of water consumed in the current month depends on the amount consumed in the previous month (autoregressive component) and on the previous period's shock (moving average component). Monthly forecasts for the next two months were made using the best-fitted model in this study. The forecasted values show that water consumption will increase over the next two months.

Conclusions
Water consumption forecasting is one of the basic and strategic management tasks of a water supply organization. It is done using time series models and water consumption data. This study provided short-term forecasts of water consumption in Tepi town, Southern Nations, Nationalities, and Peoples' Region, South Ethiopia. Over the five years, an average of 77,227.77 m³ of water was consumed in the town. Based on the AIC and BIC values of all the proposed time series models, ARIMA (1, 1, 1) is the best model for forecasting water consumption and its trend in the town. In general, water supply in Tepi town increased from 2016 GC to 2021 GC. According to the forecast, the water supply will increase in 2022 GC. There are also points that could be improved: long-term forecasting of water consumption, which is more important, should be conducted in further studies to extend the forecast period.

Abbreviations
ACF: Autocorrelation function
ADF: Augmented Dickey-Fuller
AIC: Akaike information criterion
AR (p): Autoregressive process of order p
ARIMA (p, d, q): Autoregressive integrated moving average
ARMA (p, q): Autoregressive moving average
BIC: Bayesian information criterion
MA (q): Moving average model of order q
MSD: Mean square deviations
PACF: Partial autocorrelation function

Data Availability
The data underlying the results presented in the study are available within the manuscript.

Conflicts of Interest
No potential conflict of interest was declared.