Harvested Energy Prediction Schemes for Wireless Sensor Networks: Performance Evaluation and Enhancements

We review harvested energy prediction schemes for use in wireless sensor networks and explore the relative merits of landmark solutions. We propose enhancements to the well-known Profile-Energy (Pro-Energy) model, the so-called Improved Profile-Energy (IPro-Energy), and compare its performance with the Accurate Solar Irradiance Prediction Model (ASIM), Pro-Energy, and Weather Conditioned Moving Average (WCMA). The performance metrics considered are the prediction accuracy and the execution time, which measures the implementation complexity. In addition, the effectiveness of the considered models, when integrated in an energy management scheme, is investigated in terms of the achieved throughput and the energy consumption. Both solar irradiance and wind power datasets are used for the evaluation study. Our results indicate that the proposed IPro-Energy scheme outperforms the other candidate models in terms of prediction accuracy by up to 78% for short term predictions and 50% for medium term prediction horizons. For long term predictions, its prediction accuracy is comparable to that of the Pro-Energy model, but it outperforms the other models by up to 64%. In addition, the IPro-Energy scheme achieves the highest throughput when integrated in the developed energy management scheme. Finally, the ASIM scheme reports the smallest implementation complexity.


Introduction
Wireless Sensor Networks (WSNs) have long been considered a technology with the potential to revolutionize the world we live in. With the emergence of the Internet of Things (IoT), this prospect is quickly becoming a reality. ZigBee, an alliance of 400+ companies, has developed standards which have led to the manufacturing of hundreds of millions of ZigBee products for a variety of energy management, industrial, and consumer applications. Sensor nodes are used for the collection and transmission of sensed data after the desired processing. However, in many applications sensor nodes are characterized by scarce energy resources, and for this reason energy provisioning is a significant aspect of sensor network design which greatly affects the overall system performance and lifetime.
Energy provisioning mechanisms can be classified with respect to a number of attributes. When the source of energy is considered, the following three categories can be identified [1]: (1) battery, (2) energy harvesting, and (3) energy transference. Each class can be further divided into subcategories. Batteries are divided into fixed and rechargeable ones. Fixed batteries were the single source of power for the sensor nodes initially designed. However, the rapid evolution of WSNs and their deployment to serve more demanding applications such as biotechnologies, agriculture, and military purposes [2] exacerbated the scarce energy problem, and the limited lifetime and leakage of batteries have become a serious issue [3].
To overcome these limitations, the concept of energy harvesting has emerged, and its utilization for the smooth operation of WSNs is now a challenging research topic [4]. Energy harvesting is a mechanism which allows one to extract energy from external sources. It reduces the dependency of sensor nodes on fixed batteries as a single source of energy. There are a number of sources for energy harvesting, for example, solar irradiance, wind, thermoelectric, piezoelectric, and vibration [5,6]. Among these, the most commonly used is solar irradiance. Energy transference is the latest methodology to transfer energy from high to low powered sources [7]. Magnetic resonance, reflection of solar irradiance, electromagnetic waves, and lasers are the main technologies employed in energy transference enabled systems [8-11]. In this category, a node with high energy storage can share or transfer its energy to nodes facing deficiencies in the available energy [7, 8, 12-14]. A characteristic example is the mobile Wireless Charging Vehicle (WCV) [13] which, when required, approaches an energy deficient node after a fixed time interval and injects a specific amount of energy into the battery through Witricity [11] via a magnetic-resonance based technique. Figure 1 visualizes the above classification of energy provisioning mechanisms along with its subcategories.
In this work, we focus on energy harvesting techniques when, in particular, the external sources of energy are solar irradiance and wind power. Energy harvesting enabled systems can be better managed when effective energy prediction models are readily available. For this reason, a number of energy prediction models have appeared in the literature [15-28]. The principal aim of this work is to review recently proposed harvested energy prediction schemes and provide a comparative study against landmark solutions which appear in the literature in order to investigate the relative advantages of each policy. Such a study can be used as a baseline for the selection of the most suitable energy prediction policy when designing an energy harvesting enabled system.
To this end, the most prominent existing prediction policies are considered, enhancements are proposed, and the resulting prediction schemes are compared in a number of scenarios to identify which policies perform better. In particular, we propose enhancements to the Pro-Energy model, the so-called Improved Profile-Energy (IPro-Energy). We then compare its performance with Pro-Energy, WCMA, and our recently proposed ASIM model as both short and long term predictors. The performance evaluation is based on the prediction accuracy, the achieved sensor node throughput, and the execution time. The latter is a good measure of the implementation complexity of the algorithm, whereas the achieved throughput is a good measure of the effectiveness of the prediction policy when integrated in an actual sensor network. We also develop a new energy management scheme which uses the generated predictions to control data transmission. The policy is integrated in realistic simulation models of specific sensor motes and the achieved throughput is measured. We demonstrate that, despite the simplicity of the changes made to the Pro-Energy model, IPro-Energy performs better in terms of the reported prediction accuracy, achieved throughput, and average execution time. Pro-Energy, on the other hand, exhibits comparable performance in terms of prediction accuracy; however, a degradation in performance is observed when the execution time is considered. The ASIM model is shown to be less accurate and also results in smaller throughput. However, it has the smallest execution time, which indicates a smaller implementation complexity.
Our contributions can be summarized as follows.
(1) We propose enhancements to the Pro-Energy model, to which we refer as the IPro-Energy model.
(2) We extend the Pro-Energy, IPro-Energy, and WCMA models to long term prediction so that all four models can be compared over a long term prediction horizon.
(3) We extend the ASIM model to short term prediction so that all four models can be compared over a short term prediction horizon.
(4) We perform simulations to evaluate the performance of the four considered models using the prediction accuracy, the execution time, and the throughput as the performance metrics.
(5) We perform simulations for low and high powered sensor nodes to show the performance of each candidate model in terms of best battery capacity and maximum throughput.
The rest of this paper is organized as follows. In Section 2 we describe the related work while, in Section 3, we describe implementation details of the considered prediction models including our proposed enhancements. In Section 4, we present the comparative simulation study and finally, in Section 5, we offer our conclusions and future research directions.

Related Work
Energy prediction models usually rely on available datasets, patterns, and samples to increase the prediction accuracy, and a number of parameters are involved with which the prediction error rate can be controlled. According to [15], prediction models can be categorized in three major classes: statistical, stochastic, and machine learning based models.

Statistical Models.
To predict the energy, statistical models take advantage of different statistics, for example, mean, moving average, standard deviation, and variance. In [16], a statistical prediction model is proposed which computes the mean solar irradiance value over a time period of one hour of a particular day. On the basis of a correction factor, the method is shown to improve the prediction accuracy when compared to previously proposed models. In [17], the Exponentially Weighted Moving Average (EWMA) is considered, and the proposed method relies on the assumption that the energy patterns of the most similar days remain unaltered at specific corresponding intervals. This consideration is shown to increase the prediction accuracy in a variety of weather conditions such as rainy, sunny, or mixed. Weather conditions are also considered in [18], where the WCMA is proposed; it takes averages over specific time intervals from previously observed days and scales them according to a weighting factor. Moreover, techniques such as Autoregressive (AR), Autoregressive Moving Average (ARMA), Autoregressive Integrated Moving Average (ARIMA), and Linear Regression (LR) have also been considered for energy prediction in [20]. A recently proposed solution is the Pro-Energy model, described in [21]. It is commonly used in the literature as it has been shown to outperform existing solutions by as much as 60% in some cases. For this reason, we use this protocol as a landmark solution against which we compare the considered proposals and enhancements. We describe the protocol in detail in subsequent sections. It must be noted that Pro-Energy and the schemes in [22,23] are improvements of the EWMA approach [15].

Stochastic Models.
Stochastic models incorporate various types of stochastic processes to represent the signals of interest. One commonly used stochastic process is the Markov chain [24,25]. A first-order Markov chain is used in [24] to generate solar irradiance predictions. A similar first-order Markov chain approach is adopted in [25], which also incorporates the concept of active and inactive states to achieve improved performance. MAKERS is another stochastic model [26], which utilizes first-order Markov chains to generate residual energy predictions for Body Sensor Networks (BSN), which constitute a promising network paradigm. Unlike previous proposals in [15], we use Markov chains of increasing order to generate solar irradiance predictions. We have demonstrated that increasing prediction accuracy levels can be achieved as the order of the model increases.

Machine Learning Based Models.
Machine learning prediction models utilize machine learning techniques such as fuzzy logic (FL) and neural networks (NN). A neural network model is incorporated in [27] to propose a scheme which predicts irradiance values over a time horizon of half a day. The scheme is shown to outperform the AR and FL models by achieving increased accuracy. The General Fuzzy Model (GFM) proposed in [28] utilizes fuzzy logic techniques and is used for long term prediction. The model is hybrid in nature as it also incorporates the Gaussian Mixture Model (GMM) to generate predictions. In general, these models are more computationally expensive and consume more time than statistical and stochastic models. The reason is the dependency of these models on a variety of environmental parameters. The analysis conducted in this paper considers the WCMA, Pro-Energy, and ASIM models, as all of these models have been proposed recently and have been shown to outperform previously proposed models. For clarity of presentation, we describe these models in detail in the next section.

Prediction Models
This section describes the fundamental concepts behind the ASIM, WCMA, Pro-Energy, and IPro-Energy prediction models. The WCMA and Pro-Energy models are considered landmark solutions which have been shown in the literature to outperform previous proposals, and we thus use them as reference solutions against which we compare our recently proposed ASIM scheme and the IPro-Energy scheme, which is first presented in this paper.

ASIM Model.
ASIM is a stochastic prediction model which uses Markov chains to predict the solar irradiance availability over a long term prediction horizon. Unlike previous proposals, it uses Markov processes of increasing order. That is, the probability of the discrete random process attaining a state at a particular time instant depends not only on the state at the previous time instant but on the states at the $n$ previous instants, where $n$ denotes the order of the model. The state transition dependencies are shown schematically in Figure 2.
The model is created using measured solar irradiance datasets which dictate both the attainable states and the state transition probabilities. The possible states are generated by dividing the training dataset into fixed sized bins. Each bin represents a unique state, so the number of states is found by dividing the highest irradiance value of the dataset by the bin size. The transition probabilities are also obtained using the training data. To find the transition probability from $n$ previous states to a particular state, the number of such transitions present in the dataset is divided by the total number of transitions from the $n$ states to all permissible states. The adopted design procedure when a dataset is available is to divide the set into two equal parts. The first part is used for training purposes, as described above, while the second part is used for evaluation purposes.
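The training procedure described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`to_states`, `fit_transitions`, `predict_next`), the sample values, and the choice of returning the most probable next state are all hypothetical, since the text does not specify how a predicted state is mapped back to an irradiance value.

```python
from collections import Counter, defaultdict

def to_states(values, bin_size):
    """Map each irradiance value to the index of its fixed-size bin."""
    return [int(v // bin_size) for v in values]

def fit_transitions(states, order):
    """Estimate P(next state | previous `order` states) from transition counts."""
    counts = defaultdict(Counter)
    for i in range(order, len(states)):
        history = tuple(states[i - order:i])
        counts[history][states[i]] += 1
    # Normalize: count of each transition / all transitions leaving that history
    return {
        history: {s: c / sum(nxt.values()) for s, c in nxt.items()}
        for history, nxt in counts.items()
    }

def predict_next(model, history):
    """Most probable next state, or None if this history was never observed."""
    dist = model.get(tuple(history))
    if not dist:
        return None
    return max(dist, key=dist.get)

# Hypothetical daily irradiance totals; the first half is used for training
daily = [120, 340, 310, 95, 330, 320, 110, 350, 300, 105, 345, 315]
half = len(daily) // 2
train = to_states(daily[:half], bin_size=100)
model = fit_transitions(train, order=2)
```

A history that never occurs in the training half has no estimated distribution, which is why `predict_next` returns `None` in that case; higher orders make such unseen histories more likely for a fixed amount of training data.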

WCMA Model.
WCMA is a statistical prediction model and its unique characteristic is that it considers solar irradiation values together with weather data for the current day to generate predictions. It uses a ($D \times N$) matrix $E$ to store the measured energy values for the past $D$ days. WCMA estimates the expected energy by taking into account the $K$ previously observed samples of the current day and the average values of the $D$ previously observed days. Equation (1) formally describes the predicted energy $E(d, n+1)$ for timeslot $n+1$ of the current day $d$ [18]:
$$E(d, n+1) = \alpha \cdot E(d, n) + \mathrm{GAP}_K \cdot (1 - \alpha) \cdot M_D(d, n+1), \tag{1}$$
where $\alpha$ is a weighting factor and $M_D(d, n+1)$ is the average of the $(n+1)$th values of the $D$ previously observed days. $\mathrm{GAP}_K$ is a weighting factor used to capture the relationship between the current day and the previous days. Equation (2) formally describes $\mathrm{GAP}_K$:
$$\mathrm{GAP}_K = \frac{V \cdot P}{\sum_{k=1}^{K} p_k}, \tag{2}$$
where $V$ is a vector that holds the quotients of the previously recorded samples and the previously recorded average energy of the $D$ days for the same samples. WCMA weights the impact of the previous days according to their proximity to the current day (the closer the day, the greater the weight) through a vector $P$. $P$ holds the quotients of the distance of the past samples and the total of $K$ samples. WCMA has been observed to yield minimum prediction error when 3 previously observed samples of the current day and 4 previously observed days are considered with $\alpha = 0.7$.
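As an illustration, the WCMA prediction step can be sketched as below. This is a minimal sketch assuming the formulation of WCMA as commonly stated for [18]: the prediction blends the current sample with the scaled historical mean of the next slot, where the scale (GAP) is a weighted average of the ratios between today's last K samples and their historical means, with weights k/K. The function name `wcma_predict` and the list-based storage are hypothetical; the defaults `alpha=0.7` and `K=3` follow the values quoted above.

```python
def wcma_predict(past_days, today, n, alpha=0.7, K=3):
    """Predict the energy of slot n + 1 of the current day.

    past_days: list of D lists, each holding the per-slot energies of one day
    today:     energies observed so far today (must include index n)
    """
    if n - K + 1 < 0:
        raise ValueError("need at least K observed samples today")
    D = len(past_days)

    def mean_slot(j):
        # Average of slot j over the D stored days
        return sum(day[j] for day in past_days) / D

    ks = range(1, K + 1)
    # V: today's sample divided by the historical mean, for the last K slots.
    # Assumes non-zero historical means (e.g. daytime slots only).
    V = [today[n - K + k] / mean_slot(n - K + k) for k in ks]
    # P: weights k / K, so the most recent sample counts the most
    P = [k / K for k in ks]
    gap = sum(v * p for v, p in zip(V, P)) / sum(P)

    return alpha * today[n] + gap * (1 - alpha) * mean_slot(n + 1)

# Hypothetical data: four identical stored days and a current day at half power
history = [[10, 20, 30, 40, 50] for _ in range(4)]
prediction = wcma_predict(history, [5, 10, 15, 20], n=3)
```

When the current day tracks the stored days exactly, GAP equals 1 and the prediction reduces to a plain blend of the current sample and the historical mean of the next slot; a darker day lowers GAP and pulls the historical term down accordingly.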

Pro-Energy Model and Enhancements (IPro-Energy).
In this section, we outline the main features of the Pro-Energy model and indicate the proposed enhancements which lead to the IPro-Energy model. Pro-Energy is also a statistical energy prediction model, designed to predict the energy over short and medium term horizons. It takes the dataset of previously recorded days as input for the prediction of the future energy intake. It divides a day into $N$ equally sized timeslots; $N$ is usually chosen to be 48. At each timeslot, it predicts the energy that will be available in the next timeslot. In this model, a vector is used to store the predicted energy during the current day. This vector stores the 48 values corresponding to the equally sized timeslots. Also, a matrix of size ($D \times N$) is used to store the profiles of previously observed typical days as a pool. $D$ represents the total number of days stored in the pool and $N$ represents the total number of timeslots in each stored day. At each timeslot, Pro-Energy forecasts the energy for the next timeslot by considering the most similar previously stored profiles in the pool. It takes the last $K$ observations into account when matching the most similar day by computing the Mean Absolute Error (MAE), to lower the chances of selecting a wrong day for prediction. Equations (3) and (4) formally describe the estimated energy over short and medium term prediction horizons, respectively [21]:
$$\hat{E}_{t+1} = \alpha \cdot C_t + (1 - \alpha) \cdot E^d_{t+1}, \tag{3}$$
$$\hat{E}_{t+i} = \gamma_i \cdot C_t + (1 - \gamma_i) \cdot E^d_{t+i}, \tag{4}$$
where $\hat{E}_{t+1}$ is the estimated energy for the next timeslot ($i = 1$), $E^d_{t+1}$ is the energy harvested at timeslot $t+1$ on the most similar day, $\alpha$ and $\gamma_i$ are weighting factors, and $C_t$ is the energy harvested during the current day at timeslot $t$.
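The short term prediction step described above can be sketched as follows. This is a minimal sketch assuming the commonly cited form of the Pro-Energy short term predictor from [21], in which the estimate blends the current sample with slot t + 1 of the stored day that has the smallest MAE over the last K slots. The function names and the default `alpha=0.5` are placeholders chosen for illustration, not values taken from the text.

```python
def mae(profile, today, t, K):
    """Mean absolute error between a stored day and today over the last K slots."""
    return sum(abs(today[i] - profile[i]) for i in range(t - K + 1, t + 1)) / K

def most_similar_day(pool, today, t, K):
    """Stored profile with the smallest MAE against today's recent slots."""
    return min(pool, key=lambda profile: mae(profile, today, t, K))

def predict_next_slot(pool, today, t, alpha=0.5, K=2):
    """Blend today's latest sample with slot t + 1 of the most similar stored day."""
    similar = most_similar_day(pool, today, t, K)
    return alpha * today[t] + (1 - alpha) * similar[t + 1]

# Hypothetical pool of two stored days with four slots each
pool = [[0, 10, 20, 30], [0, 50, 60, 70]]
today = [0, 48, 62]          # slots observed so far
estimate = predict_next_slot(pool, today, t=2)
```

Larger values of `alpha` trust the current sample more, while smaller values lean on the stored profile; the number of compared slots `K` controls how much of today's history is used to pick that profile.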
The authors of Pro-Energy have proposed enhancements to their original scheme which involve variable size timeslots (e.g., 30, 60, and 90 minutes), unlike their original design where the timeslot duration was fixed to 30 min. They refer to their improved model as Pro-Energy-VLT [30]. In order to calculate the size of each of the $N$ timeslots, a Perceptually Important Point (PIP) technique is employed [31]. It is an iterative algorithm which calculates the $N + 1$ points having maximum effect on the pattern of the harvesting profile. In our comparative study, we use the original Pro-Energy model, as Pro-Energy-VLT, despite its increased complexity and variable length timeslots, does not report a significant improvement in the prediction error: Pro-Energy-VLT reported a prediction error up to 12.33% lower than the error reported by Pro-Energy [30]. A further reason for the absence of Pro-Energy-VLT from our comparative study is that all considered models predict energy values over fixed time intervals, whereas Pro-Energy-VLT considers variable length timeslots for predictions. We compare our enhanced model on the basis of similar parameter settings (i.e., models having fixed length timeslots). Specifically, we map Pro-Energy and IPro-Energy from short interval to long interval prediction to test the long term prediction accuracy of these models. The granularity of the ASIM model is one aggregated value per day. Due to this core attribute of the ASIM model, we consider the original Pro-Energy in our comparison.
In this paper, we propose the IPro-Energy model, a statistical energy prediction model that enhances Pro-Energy. It improves the prediction accuracy by changing the implementation technique instead of revising the basic components and modules of the Pro-Energy scheme. It uses the previously observed harvested energy for prediction over short and medium term horizons. It has two main distinguishing features. First, it does not classify typical days with respect to their characteristics. More specifically, unlike Pro-Energy, it does not store a day's data based on whether the day is purely sunny, cloudy, rainy, or mixed. This design choice is based on a series of trace-driven experiments which show that this classification is one of the main limitations of Pro-Energy and results in prediction errors. To compensate for weather variations, IPro-Energy uses the weighted profile (WP) technique, also used in [21]. Second, it minimizes the control overhead in terms of both storage and execution time. This is achieved by minimizing the number of most similar days that are combined: IPro-Energy considers and combines just the two most similar previously recorded days. Combining more days can increase the prediction error.
IPro-Energy has three main modules: Analyzer, Predictor, and Updater. The purpose of the Analyzer is to select from the pool the most similar profile, that is, the one having the least MAE. The Predictor estimates the future energy intake over short and medium term horizons. The Updater refreshes the pool entries at the end of the day. Some basic notations used in this paper are given in the Notations section. The detailed working of the three modules is explained in the next subsections. Note that some of the implementation details of the IPro-Energy model presented below are common to the Pro-Energy model as well.

Analyzer.
The Analyzer is the core module of IPro-Energy as it feeds the Predictor with the most similar profile. It stores the energy values harvested up to timeslot $t$ of the current day in a vector $C$ of size $N$. Previously harvested days are stored in a ($D \times N$) matrix $E$, where $D$ represents the number of previously harvested days with $N$ timeslots each. Initially, the matrix $E$ contains the harvested energy of the last 30 observed days (i.e., $D = 30$). The Analyzer applies the MAE function over vector $C$ and matrix $E$ to select the most similar day(s). For this purpose, it matches the last $K$ timeslots of the current day with all the stored days, where $K$ is less than the current timeslot $t$. For instance, if $E^d$ is the day with the least MAE among the $D$ days, then profile $E^d$ will be selected. Mathematically, MAE is computed as
$$\mathrm{MAE}(E^d, C) = \frac{1}{K} \sum_{i=t-K+1}^{t} \left| C_i - E^d_i \right|. \tag{5}$$
The value of $K$, that is, the number of previous timeslots used to check the similarity, plays an important role in reducing the probability of selecting an inappropriate day. In our simulations, we fix $K = 2$, so the Analyzer module compares only the last two timeslots at $t$ using the MAE function, which increases the probability of an erroneous selection. It must be noted, however, that small values of $K$ decrease the computational overhead, and thus the choice is a tradeoff between performance and implementation overhead. To handle the variations in weather conditions, IPro-Energy uses the weighted profile (WP) technique [21] and correlates more than one profile having least MAE with the current day. WP is defined formally in (6) through a factor which gives the relative weight for combining more than one similar profile; $\mathrm{WP}_{t+1}$ denotes the harvested energy at timeslot $t+1$ of the WP. WP adjusts the value of more than one similar profile at timeslot $t+1$ according to the average MAE of the most similar days. It is also worth mentioning that IPro-Energy takes the two most similar profiles to compute WP.
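A weighted profile computation along these lines can be sketched as below. The exact weighting rule is not recoverable from the text, so this sketch assumes one plausible MAE-based rule: each of the P selected profiles receives the weight 1 − MAE_j / ΣMAE, and the weighted sum is divided by P − 1 so that the weights are normalized. That rule, the function names, and the sample data are assumptions made for illustration only; `K=2` and `P=2` follow the values stated above.

```python
def mae(profile, today, t, K):
    """Mean absolute error between a stored day and today over the last K slots."""
    return sum(abs(today[i] - profile[i]) for i in range(t - K + 1, t + 1)) / K

def weighted_profile_value(pool, today, t, K=2, P=2):
    """Combine slot t + 1 of the P stored days that are most similar to today."""
    ranked = sorted(pool, key=lambda prof: mae(prof, today, t, K))[:P]
    if len(ranked) < 2:
        raise ValueError("need at least two stored profiles to combine")
    errors = [mae(prof, today, t, K) for prof in ranked]
    total = sum(errors)
    if total == 0:
        # All selected profiles match today exactly: fall back to a plain average
        return sum(prof[t + 1] for prof in ranked) / len(ranked)
    # Assumed rule: a smaller MAE yields a larger weight; weights sum to P - 1
    weights = [1 - e / total for e in errors]
    combined = sum(w * prof[t + 1] for w, prof in zip(weights, ranked))
    return combined / (len(ranked) - 1)

# Hypothetical pool of three stored days with four slots each
pool = [[0, 10, 20, 30], [0, 50, 60, 70], [0, 40, 60, 80]]
today = [0, 48, 62]
wp_next = weighted_profile_value(pool, today, t=2)
```

With two profiles the weights sum to one, so the result is a convex combination that leans toward the day whose recent slots match the current day more closely.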

Predictor.
The main objective of the Predictor is to forecast the energy intake for EH-enabled WSNs over a short term horizon (30 minutes) and a medium term horizon (from one hour to a couple of hours). It takes the current day's observations and the most similar days as input to compute the expected future energy intake. In earlier prediction algorithms [18,21], the harvested energy values of the current day's timeslot $t$ and of timeslot $t+1$ of the most similar days were used. This approach, however, ignores the current day's pattern observed so far. This shortcoming is addressed in IPro-Energy through the use of a "smarting factor". This parameter not only incorporates the current day's harvested values up to timeslot $t$ but also reduces the computational complexity and prediction time (more on this in Section 4.2). The prediction through IPro-Energy is completely independent of the characteristics of a day and the size of the prediction horizon. Unlike Pro-Energy, it uses a single equation (7) to predict the future intake instead of two different equations for the short and medium term prediction horizons, respectively. The reason behind the single equation is that IPro-Energy does not involve the Pearson correlation approach used in Pro-Energy [21]. Furthermore, Pro-Energy uses $\alpha$ and $\gamma_i$ as weighting factors in (3) and (4), respectively. In this paper, we derive a single equation after a series of experiments and introduce a new weighting factor in it instead of the previously used $\alpha$ and $\gamma_i$. Equation (7) expresses the expected energy $\hat{C}_{t+i}$ for timeslot $t+i$ of the current day.
In (7), $i$ is the $i$th timeslot with respect to $t$, and the weighting factor takes values in the interval between 0 and 1. The weighting factor is added to give more importance to the current day's energy pattern. Throughout the paper we set it to 0.7. The smarting factor is given by (8). It should be noted at this stage that WP is used to combine the previously observed most similar days. On the other hand, (8) indicates that the "smarting factor" incorporates the average rate of change of energy between timeslots $t-1$ and $t$ of the current day, scaled by 0.5. The purpose of this approach is to account for the current day's energy pattern or trend by taking into account the last two timeslots of the current day at timeslot $t$. The smarting factor is an important element of the proposed approach as it directly contributes to the value computed in (7). To control the impact of any unexpected value of the smarting factor due to an abrupt change in weather conditions, IPro-Energy takes 50% of the smarting factor (i.e., a scaling parameter of 0.5) for prediction.
In order to demonstrate the effectiveness of the proposed energy predictor algorithm, in Figure 3 we show the prediction outputs obtained when the algorithm is used for short term prediction. Figure 3(a) compares the predictions against actual measurements using traces of solar energy availability for three consecutive days in May 2012 obtained in Prewitt, New Mexico, USA. The results indicate that the predictions closely follow the actual power density profile. In Figure 3(b) the ability of the algorithm to effectively predict wind energy profiles is also demonstrated. The predicted values are compared against the actual values using wind traces for a period of three consecutive days in August 2012 obtained in Lockney, Texas, USA. Again, the predictions closely follow the real data profiles. The performance of the model and the role of different parameters in improving accuracy are discussed in Section 4.

Updater.
The primary objective of the Updater module in IPro-Energy is to refresh the existing entries in the pool. Initially, the pool contains previously harvested values for the past 30 days. After the completion of a day, the Updater removes the oldest entry and adds the most recently harvested day to the pool. This replacement strategy keeps the pool fresh, which in turn leads to significant performance enhancement. We have investigated various aspects of this refreshing policy using extensive simulations and have deduced that the days most similar to the current day can be found in the last 20-30 days. This will be elaborated in the upcoming discussion on performance evaluation results.
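The replacement policy described above is a fixed-size first-in, first-out window and can be sketched with a bounded deque. The class name `ProfilePool` and its method are hypothetical; the capacity of 30 follows the pool size stated in the text.

```python
from collections import deque

class ProfilePool:
    """Fixed-size pool of daily harvesting profiles, oldest day evicted first."""

    def __init__(self, initial_days, capacity=30):
        # A deque with maxlen drops the oldest entry automatically on append
        self.days = deque((list(day) for day in initial_days), maxlen=capacity)

    def end_of_day(self, todays_profile):
        """Store the day that just completed, evicting the oldest if full."""
        self.days.append(list(todays_profile))

# Hypothetical pool holding 30 one-slot days labelled 0..29
pool = ProfilePool([[i] for i in range(30)])
pool.end_of_day([99])
```

After the call, the pool still holds 30 days: the entry labelled 0 has been dropped and the new day sits at the end.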

Performance Analysis and Evaluation
In this section, we compare the performance of the aforementioned prediction models with respect to a variety of metrics such as the accuracy achieved, the execution time, and their ability to result in effective solutions when they are incorporated in WSN protocols which adaptively regulate the data transmission policy based on the predictions generated. The datasets were chosen so that they are representative of different content and conditions: different locations, different time horizons, and both solar and wind data were considered. The time horizon over which the prediction is generated is of particular importance, as different time horizons can lead to different observed behavior. In order to compare over the same time horizons, it was necessary to extend the existing prediction models: the WCMA, Pro-Energy, and IPro-Energy models to the long term and the ASIM model to the short term prediction horizon. The behavior of the long term extensions can be observed in Figures 4-7. The graphs show the predictions generated by the four models for one complete year using datasets from four different regions. The predictions are compared with the actual data in order to evaluate the achieved accuracy. The predictions and simulations are all conducted in MATLAB. All four locations exhibit similar trends for each model. WCMA consistently underpredicts for all locations, as it uses the average values of previously observed days for prediction, although the results for the Algeria dataset are relatively close to the actual data. The ASIM model reports significant variation during prediction. On the other hand, the prediction results of IPro-Energy and Pro-Energy are close to the actual values and, unlike ASIM, do not show rapid variation.

Datasets.
A number of datasets were considered with the dual purpose of tuning the considered prediction models and evaluating their performance with respect to different performance metrics, by analyzing the predictions generated. Solar irradiance datasets from two border states on opposite sides of the USA [32], (1) New Mexico (NM) and (2) Michigan (MI), were considered, as well as two datasets of wind traces from (3) Portal, North Dakota, and (4) Lockney, Texas [33]. The datasets contain solar irradiance and wind traces for two years (2011 and 2012) with a granularity of one value per 30 minutes. Some basic characteristics of the data are shown in Table 1. The average solar irradiance from 6:00 a.m. to 5:59 p.m. is considered, as solar power density values outside this range are almost negligible. The selected datasets are significantly diverse in terms of weather conditions as well as location. For instance, the weather in New Mexico is clearer and sunnier than in Michigan, while North Dakota and Texas are located in the upper midwestern and south-central regions of the USA, respectively.
In this paper, we use the same data source as in [21]; however, we consider four different locations in the presented comparison. Since IPro-Energy has been shown to exhibit superior performance in four different locations (more on this in Section 4.2), it is likely to perform well on arbitrary traces. The simulation comparison presented in this paper cannot be exhaustive; although more traces could be used, we consider four different traces sufficient to offer the required confidence.

Performance Evaluation.
The considered prediction models were compared with respect to a number of performance metrics such as the accuracy achieved, the execution time, and their ability to result in effective solutions when they are incorporated in WSN protocols which adaptively regulate the data transmission policy based on the predictions generated. Below, we show the comparison results extracted for each of the considered metrics.

Prediction Accuracy.
The most important property that a prediction model must possess is high prediction accuracy, that is, the ability to generate predictions which are sufficiently close to the observed measurements. The time horizon over which this accuracy is evaluated is of critical importance, as different time horizons can lead to different performance behavior. We evaluate the prediction accuracy of the candidate models in three different settings: (a) short term comparison, (b) medium term comparison, and (c) long term comparison. In the first setting, we compare all four models (i.e., WCMA, Pro-Energy, our enhanced IPro-Energy, and ASIM) for both solar irradiance and wind traces over a short term horizon. In the second setting, we compare Pro-Energy and our enhanced IPro-Energy model for both solar and wind traces over a medium term horizon. In the third setting, we compare all four models for both solar irradiance and wind traces over a long term horizon. In order to measure the accuracy of prediction over short and medium term horizons, we randomly select 96 days from 2012 (excluding the leap day). These days are selected so that, on average, 2 to 3 days are picked from each window of 10 days. For the long term prediction, all datasets have the granularity of one accumulated value per day, that is, the sum of energy for a complete day. Each dataset contains solar irradiance and wind traces for two years, from 2011 to 2012. We set the same optimal values for all the parameters (e.g., $\alpha$, $D$, and $K$) when implementing the considered schemes to ensure a fair comparison, analysis, and validation.
The performance metric used for the prediction accuracy evaluation is the Mean Absolute Percentage Error (MAPE), defined formally as

$$\mathrm{MAPE} = \frac{1}{T} \sum_{t=1}^{T} \frac{\left| C_t - \hat{C}_t \right|}{C_t},$$

where $C_t$ is the actual energy and $\hat{C}_t$ is the predicted energy at timeslot $t$. For the short and medium term predictions, $T$ is the number of consecutive timeslots, usually taken during the peak hours of a day. Unlike Pro-Energy, we do not discard values which are less than 10% of the peak energy when evaluating the prediction accuracy. All plots are drawn with 95% confidence intervals.
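As a minimal sketch of how this metric can be computed, the snippet below implements the standard MAPE form (the mean of the absolute error divided by the actual value). The function name `mape` and the sample values are hypothetical, and the result is returned as a fraction rather than a percentage.

```python
from typing import Sequence

def mape(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean Absolute Percentage Error over the given timeslots, as a fraction."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("need two non-empty sequences of equal length")
    # Each term is |actual - predicted| / actual; a zero actual value
    # would raise ZeroDivisionError, so such slots must be handled upstream.
    total = sum(abs(c - c_hat) / c for c, c_hat in zip(actual, predicted))
    return total / len(actual)

# Hypothetical energy values for four consecutive peak-hour timeslots.
actual = [100.0, 200.0, 400.0, 250.0]
predicted = [90.0, 220.0, 400.0, 200.0]
error = mape(actual, predicted)
print(f"MAPE = {error:.3f}")
```

Because every term is divided by the actual energy, timeslots with very small harvested energy can dominate the average, which is why the choice of whether to discard low values matters for the reported numbers.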
Figure 8 shows the MAPE plots for short and medium term prediction horizons (i.e., 30 minutes to 2 hours) obtained using the four previously mentioned datasets. Figures 8(a) and 8(b) show the results for the solar datasets of New Mexico and Michigan, while Figures 8(c) and 8(d) show the results for the wind datasets of Texas and North Dakota, respectively. The plots illustrate two noticeable trends. Firstly, MAPE values increase as we move from short to medium term predictions. This is expected, since the longer the forecasting horizon, the higher the probability of error accumulation over the longer interprediction times.
Secondly, it is evident that the IPro-Energy prediction model exhibits superior performance, as it consistently reports significantly lower prediction errors than the other models. For solar and wind predictions over the short term horizon, IPro-Energy is 51%, 60%, and 78% better than Pro-Energy, WCMA, and ASIM, respectively. We must point out that the WCMA and ASIM models are not considered over the medium term horizon, due to limitations in their design and implementation. WCMA predicts the available energy for a fixed period of 30 minutes (short term horizon); its design offers no flexibility to predict the energy beyond the very next timeslot. Similarly, the ASIM model uses accumulated values as a training dataset and manages states using Markov chains. By design, both models can therefore predict values only in time sequence. For this reason, only Pro-Energy and IPro-Energy are compared for the medium term predictions. IPro-Energy again exhibits superior performance, reporting 50% and 43% better performance than Pro-Energy for solar and wind predictions, respectively.
Consistent behavior of the predictors is of critical importance, especially when they are used to manage the harvested energy. For instance, if a node schedules its major communication activity based on a wrong prediction, it can end up draining its battery to an alarmingly low or even dead level. The better performance of IPro-Energy can be attributed to its smarting factor, whose value is significantly affected by the most recent variations in weather. More specifically, IPro-Energy considers the energy pattern of the previously observed timeslots of the current day and makes decisions accordingly. This in turn helps in countering the impact of abrupt changes in weather conditions. Pro-Energy, on the other hand, ignores the most recent energy patterns and relies only on the energy values of the current and the next timeslot of the weighted profile.
Table 1 shows the recorded mean annual temperature for all considered states (i.e., New Mexico, Michigan, North Dakota, and Texas) along with the mean solar irradiance for the solar datasets. To test each model under mixed weather conditions, these states were selected because they lie on opposite borders of the USA. The selected datasets are significantly diverse in terms of weather conditions as well as location. For instance, with respect to the mean solar irradiance, the weather of New Mexico is clearer and sunnier than that of Michigan. The diversity of weather between New Mexico and Michigan can be observed through the average annual sunshine. The daylight percentage, total sunny hours, and annual clear days for New Mexico are 76%, 3415 hours, and 167 days, respectively. On the other hand, the daylight percentage, total sunny hours, and annual clear days for Michigan are 51%, 2392 hours, and 71 days, respectively [34]. It should be noted that clear days are days on which at least 70% of the sky is clear during daylight. There is no significant disparity between the results of the two solar datasets, due to similar average temperatures. The average wind speeds, shown in Table 1, are almost equal for the two wind datasets (North Dakota and Texas). Texas, however, shows more accurate results than North Dakota due to its consistent wind speed throughout the year.
In long term prediction, aggregate values are used, leading to a prediction of the energy intake for one complete year (i.e., 2012, excluding the leap day). The MAPE is thus calculated over 1 × 365 = 365 timeslots for each dataset. There are two reasons for calculating the MAPE over one complete year. The first is that the aggregate values of the power densities are nonzero throughout the year. The second is the granularity of each dataset, which is one day (i.e., one aggregate value for a complete day). Figures 9 and 10 show the MAPE results for the four datasets (i.e., New Mexico, Michigan, Texas, and North Dakota) over the long term prediction horizon. The comparison reveals that Pro-Energy and IPro-Energy exhibit comparable performance, which is better than that of WCMA and ASIM, as higher accuracy is achieved for all datasets. For instance, in the case of solar prediction, IPro-Energy and Pro-Energy are both 18% and 50% better than ASIM and WCMA, respectively. In the case of the wind datasets, IPro-Energy and Pro-Energy are both 29% and 64% better than ASIM and WCMA, respectively. Quantitative analysis shows that, due to the consistent accumulated values of the wind datasets, the results for the wind datasets are slightly better than those for the solar datasets.

Active Period and Throughput.
In this subsection, we investigate the effectiveness of the considered prediction models when they are integrated in an energy management scheme that uses the predictions generated to regulate data transmission. The objective is to maximize the throughput by sending as much data as possible without depleting the energy sources at each node. So, the main rationale behind the energy management scheme is that the higher the predicted energy availability, the higher the allowed transmission should be. Prediction errors, however, degrade performance: consistent overestimates lead to depletion of the energy sources, whereas underestimates lead to decreased throughput. Below, we provide the details of the utilized energy management scheme.
We consider energy being harvested periodically at each node. Time is thus divided into consecutive timeslots whose duration equals the harvesting period. At the start of each timeslot, a prediction of the harvested energy is made using the considered prediction policy and is denoted by Ê. We assume that each node, when in active mode, persistently consumes a constant circuit power to perform basic network operations. In addition, we assume that each timeslot contains an active period during which the node transmits data and therefore consumes a transmission power on top of the circuit power. For the rest of the time, the node is in idle or sleep mode, in which case the dissipated power is equal to 0. So, in each timeslot, the power that the node consumes equals the sum of the circuit power and the transmission power during the active period and 0 otherwise. The longer the active period, the higher the throughput. At each timeslot, the generated prediction is used to calculate the active period by matching the received energy with the energy to be consumed during the timeslot: the active period is the predicted energy divided by the total consumed power, projected onto the interval between 0 and the period. The projection guarantees that the calculated active period does not exceed the period, as larger values are not meaningful.
The performance of the combined prediction and energy management scheme is evaluated in terms of the achieved throughput and energy consumption. The objective is to maximize the throughput without depleting the battery resources. We conduct our study using MATLAB. As we assume that the energy management scheme is implemented in a totally distributed manner, without communication overhead between the network nodes, the reference scenario involves the implementation of the proposed scheme on a single node. Two types of sensor nodes are considered in the simulations: the mote of [35] and the -250 mote [36]. The parameters of the simulation model are tuned according to their data sheets. The transmission power and the circuit power are different for each mote. The mote of [35] has low power consumption, with a combined transmission and circuit power of 0.08382 W, while the -250 mote has high power consumption, with a combined transmission and circuit power of 0.137 W. Both nodes have the same data transmission rate (i.e., 250 kbps). We set the initial battery level to 600 mAh for each scenario, and the maximum capacity of the battery to 2500 mAh. Each simulation scenario is run for 37,800 s, and at the end of the scenario both the achieved throughput and the energy left in the battery are recorded. The throughput is defined as the average number of bits sent per second during the entire simulation time. The simulations are conducted using solar irradiance data from a random day of the New Mexico dataset. The battery level is updated using the actual solar irradiance levels. In addition, predictions Ê are generated using the four considered prediction schemes and are then used to determine the active period according to (11). A scenario is also considered in which the actual irradiance values are used to determine the active period. This serves as a performance reference scenario, as it represents the case where 100% accuracy is achieved. The period is set to 30 minutes.
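The energy management rule described above can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' MATLAB code: the function names, the joule-based battery values, and the per-slot energy figures are hypothetical, and the rule that a node cannot spend more than its stored plus harvested energy is an added assumption. The battery is kept in joules because converting the mAh figures would require a supply voltage that the text does not state.

```python
def active_period(predicted_energy_j, total_power_w, period_s):
    """Active time whose consumption matches the predicted energy,
    projected onto the interval [0, period_s]."""
    return min(max(predicted_energy_j / total_power_w, 0.0), period_s)

def simulate(predicted_j, actual_j, total_power_w, period_s, rate_bps,
             battery_j, capacity_j):
    """Return (throughput in bit/s, final battery energy in J) for one node."""
    if len(predicted_j) != len(actual_j) or not actual_j:
        raise ValueError("need one prediction per timeslot")
    bits_sent = 0.0
    for e_hat, e_actual in zip(predicted_j, actual_j):
        t_active = active_period(e_hat, total_power_w, period_s)
        # Assumption: the node cannot spend more than what is stored
        # plus what is actually harvested during the slot.
        spent = min(t_active * total_power_w, battery_j + e_actual)
        # The battery is updated with the actual harvested energy
        # and capped at its maximum capacity.
        battery_j = min(battery_j + e_actual - spent, capacity_j)
        bits_sent += rate_bps * (spent / total_power_w)
    throughput = bits_sent / (period_s * len(actual_j))
    return throughput, battery_j

P_TOTAL = 0.08382   # W, combined transmission and circuit power (low-power mote)
PERIOD = 30 * 60    # s, prediction period
RATE = 250_000      # bit/s
actual = [40.0, 90.0, 200.0]        # J per slot, hypothetical
optimistic = [60.0, 120.0, 260.0]   # J per slot, hypothetical overestimates

ref_tp, ref_batt = simulate(actual, actual, P_TOTAL, PERIOD, RATE, 500.0, 2000.0)
opt_tp, opt_batt = simulate(optimistic, actual, P_TOTAL, PERIOD, RATE, 500.0, 2000.0)
print(f"reference: {ref_tp:.0f} bit/s, {ref_batt:.1f} J left")
print(f"overestimating: {opt_tp:.0f} bit/s, {opt_batt:.1f} J left")
```

In this toy run the overestimating predictor sends more bits but ends with less stored energy than the reference case, which mirrors the trade-off between throughput and battery depletion discussed in the text.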
The simulation results are shown in Figures 11 and 12. Figure 11 shows the active period and the remaining energy recorded for the five considered scenarios (the four prediction models and the performance reference scenario). Figure 11(a) shows the results for the mote of [35], whereas Figure 11(b) shows the results for the -250 mote. It can be observed that the smaller the achieved active period, the less energy is dissipated, leaving more energy in the battery at the end of the simulation time. The results demonstrate that the IPro-Energy model achieves the highest active period, comparable to the one achieved by the "optimal" reference scenario, without depleting the battery. This is a result of its higher prediction accuracy. The picture is similar in Figure 12, which shows the throughput achieved. The similar results, which again demonstrate the effectiveness of the IPro-Energy scheme, are due to the fact that the throughput is linearly related to the active period: the longer the active period, the more packets are sent and the higher the throughput.

Execution Time.
The implementation complexity and overhead of a protocol are of great significance, especially in WSNs, which have limited energy and computational resources. The execution time is a parameter associated with the implementation complexity of a model, and in this study we use it as an evaluation metric. Simulation experiments are conducted in MATLAB, and the average execution time is evaluated by means of two time-measuring functions, one of which is tic-toc. The simulation scenario involves prediction over a single timeslot, and the results are thus independent of the prediction horizon. The tic-toc function is the recommended function to measure the model performance [37], but the second function is taken into account as well. In Table 2, the measured execution time is presented for each candidate prediction model using both time-measuring functions. The ASIM model reports a smaller execution time than the other three models, although its execution time varies across datasets. In the ASIM model, the execution time is directly proportional to the highest irradiance value present in the dataset: a large upper bound on the irradiance entails a larger number of states, which increases the overall execution time. IPro-Energy and WCMA report execution times which are not far from the ones achieved by the ASIM model. The Pro-Energy model, however, reports a much higher execution time per timeslot than the other three models.
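The measurement procedure can be illustrated with a rough Python analogue. This is not the MATLAB setup used in the study: `time.perf_counter` plays the role of a stopwatch-style timer, `time.process_time` is shown as one possible second, CPU-time-based clock (an assumption), and `dummy_predictor` is a hypothetical stand-in for a single-timeslot prediction.

```python
import time

def average_times(fn, repeats=1000):
    """Return (wall-clock seconds, CPU seconds) per call, averaged over `repeats` calls."""
    wall_start = time.perf_counter()   # stopwatch-style timer
    cpu_start = time.process_time()    # CPU time consumed by this process
    for _ in range(repeats):
        fn()
    cpu = (time.process_time() - cpu_start) / repeats
    wall = (time.perf_counter() - wall_start) / repeats
    return wall, cpu

def dummy_predictor():
    """Hypothetical stand-in for one prediction over a single timeslot."""
    history = [120.0, 135.0, 128.0, 140.0]
    return sum(history) / len(history)

wall_s, cpu_s = average_times(dummy_predictor)
print(f"wall-clock: {wall_s:.2e} s per call, CPU: {cpu_s:.2e} s per call")
```

Averaging over many repetitions reduces the influence of timer resolution, which matters when a single prediction takes only microseconds.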

Miscellaneous Overheads.
Apart from the execution time, memory overhead is also a significant factor affecting model performance, among others (power consumption, processing complexity, resource occupancy, etc.). Pro-Energy, for example, loads, matches, and combines the nine most similar profiles to increase the prediction accuracy of the future energy intake [21]. IPro-Energy, on the other hand, significantly reduces this overhead by loading, matching, and combining just the two most similar observed days. WCMA considers the past four days to compute the average used for the prediction of the next timeslot. To predict a given number of years, the ASIM model uses the same number of previously observed years as a training dataset. For example, to predict the energy intake for two years, it uses two years of previously observed values. Hence we conclude that IPro-Energy is a memory efficient solution when compared to the WCMA, Pro-Energy, and ASIM models.
However, some tradeoffs exist between the different performance metrics. IPro-Energy has much better prediction accuracy and is memory efficient as well, but it takes more prediction time per timeslot than the WCMA and ASIM models. The ASIM model has relatively poor prediction accuracy compared to the IPro-Energy and Pro-Energy models and is also bulky in terms of memory consumption, but it takes significantly less time for prediction. Pro-Energy shows average prediction accuracy and is less memory efficient than IPro-Energy, and it also takes significantly longer to generate predictions than the other models.
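To give a feel for the difference in working memory, the back-of-the-envelope sketch below counts the bytes needed to hold the days each model loads for one prediction. The numbers of loaded days come from the text; the 48 timeslots per day (30-minute slots) and the 4 bytes per stored value are assumptions made only for illustration, and the helper `profile_bytes` is hypothetical.

```python
def profile_bytes(days_loaded, slots_per_day, bytes_per_value=4):
    """Bytes needed to hold `days_loaded` daily profiles in memory."""
    return days_loaded * slots_per_day * bytes_per_value

SLOTS_PER_DAY = 48  # assumption: 30-minute timeslots over 24 hours

# Days loaded per prediction, as stated in the text.
loaded_days = {"Pro-Energy": 9, "IPro-Energy": 2, "WCMA": 4}
footprint = {name: profile_bytes(days, SLOTS_PER_DAY)
             for name, days in loaded_days.items()}

for name, size in footprint.items():
    print(f"{name}: {size} bytes")
```

Under these assumptions the loaded profiles occupy 1728 bytes for Pro-Energy, 768 bytes for WCMA, and 384 bytes for IPro-Energy. The sketch ignores the stored pool of past days and any model-specific state, so it only illustrates the relative ordering.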
The implementation of the discussed methods on actual motes, and the performance implications of such an implementation (in terms of achieved throughput, prediction accuracy, and network lifetime), are considered in this paper in the sense that the simulation parameters are chosen based on the real parameter values of the two considered motes, taken from the relevant data sheets. To study the implementation feasibility, all the simulation experiments are conducted in MATLAB. The results not only illustrate the prediction accuracy and performance but also indicate that, with respect to real technical aspects, IPro-Energy is a node-friendly model that can feed reliable predictions of the energy intake to a sensor node for its decision-making process. As the IPro-Energy scheme has similar implementation complexity to the Pro-Energy scheme, and the Pro-Energy scheme has already been implemented on sensor motes, the implementation of the IPro-Energy scheme on actual sensor motes is expected to be feasible without additional modifications.

Conclusions
A simulation study is conducted to compare the performance of energy prediction schemes to be used in WSNs. We propose the IPro-Energy scheme and compare its performance against ASIM, which we have recently proposed, and two landmark solutions, namely, Pro-Energy and WCMA. Our results indicate that the proposed IPro-Energy scheme outperforms the other candidate models in terms of prediction accuracy by up to 78% for short term predictions and 50% for medium term prediction horizons. For long term predictions, its prediction accuracy is comparable to that of the Pro-Energy model, and it outperforms the other models by up to 64%. Moreover, the ASIM model has been observed to dominate WCMA, IPro-Energy, and Pro-Energy in terms of minimum execution time. We also investigate the effectiveness of these prediction mechanisms when integrated in a simple energy management scheme that we have developed. The IPro-Energy scheme reports the highest throughput without depleting the energy resources. In the future, we aim to further evaluate the performance of these predictors using NS-3 simulations and a practical implementation on actual motes. In addition, we aim to examine how these schemes can be used to improve the MAC layer functions and to compare the performance of the integrated systems.

N: Number of timeslots per day
Number of days stored in the pool
Harvested energy during the current day
Previous timeslots used to check the similarity
Harvested energy during a given timeslot of the current day
Matrix of previously observed days
Ĉ: Predicted energy at the next timeslot of the current day
Harvested energy at a given timeslot of a given day
Weighting factor, with value between 0 and 1
Number of combined days for the estimation of energy
WP: Weighted profile; its entry for the next timeslot is the harvested energy at that timeslot of the weighted profile
Constant ratio to control the "smarting factor"
Number of timeslots for which the MAPE is calculated
MAE: Mean Absolute Error
MAPE: Mean Absolute Percentage Error
Circuit power of the device
Total consumed power of a node
Transmission power of the device
Active period of the device.

Figure 2: Transition probabilities of the Markov chain.

Figure 4: Algeria: long term prediction for one complete year.

Figure 7: Ireland: long term prediction for one complete year.
Figures 8(a) and 8(b) show graphically the results for the solar datasets of New Mexico and Michigan states while Figures 8
Period and Throughput.In this subsection, we investigate the effectiveness of the considered prediction

Table 2: Average simulation time measured with tic-toc and a second timing function in MATLAB.