Optimal Neural Network for Predicting Solar Energy in Sensor Units Based on a Cascaded Input/Structure Direct Optimization

German Jordanian University (GJU), Energy Engineering Department, 11180 Amman, Jordan
Princess Sumaya University for Technology, Electrical Engineering Department, 11941 Amman, Jordan
Philadelphia University, Electrical Engineering Department, 19392 Amman, Jordan
The University of Jordan, Mechatronics Engineering Department, Amman 11942, Jordan
Chemnitz University of Technology, Professorship for Measurement and Sensor Technology, 09126 Chemnitz, Germany


Introduction
Sensor units have recently tended to integrate solar harvesters as a power source in addition to their existing batteries. On the one hand, this supports more sustainable technology. On the other hand, it reduces the cost and effort of repeated visits to replace depleted batteries or to maintain them [1][2][3].
Moreover, solar energy is preferred over other forms because of its high power density (about 15 W/cm²) [4,5]. Since sensor units are mostly deployed outdoors, solar energy is a natural fit for them. This is particularly important when parameters must be measured at distant places, especially for military purposes and for meteorological stations, which are often difficult to reach [6][7][8].
The main challenge for sensor units fed by solar harvesters is the intermittent, fluctuating nature of solar energy, which is not a constant supply [9,10]. Predicting the harvested solar energy has been adopted to overcome this obstacle [11]. The prediction provides advance knowledge of the energy to be collected, which serves as an index for organizing the operation of the sensor units so that they work effectively and sustainably. This level of operation cannot be reached without good management of the energy consumption in those units [12][13][14].
To clarify this topic, Figure 1, taken from [15], shows the components that jointly build the sensor node and contribute to managing its energy consumption, together with the direction of energy flow. According to this figure, the DC-to-DC converters play a significant role in boosting the energy collected from the solar harvester, or curtailing it, so that it is suitable for storage. The microcontroller is the core of the energy management, since it controls how energy is distributed among the other components of the sensor unit during the different operation periods. It is also called the "decision-maker" for energy allocation in those nodes. Energy management schemes that depend on a prior prediction of the harvested energy run in this component, so the prediction algorithms are programmed into it alongside those schemes. The sensor unit appears on the right side of the figure. It includes the measurement unit, represented by the sensors, and the communication unit, which is responsible for transmitting and receiving data wirelessly [5,15].
Sensor units are small, and their microcontrollers have limited capability. For this reason, the traditional prediction approaches (i.e., the stochastic and statistical ones) seem suitable, since they are simple enough to be implemented easily in the microcontroller. However, their precision is not sufficient to achieve well-managed energy consumption in a sensor unit [16][17][18]. The main reason is that the traditional methods use only the historical data of a few days before the slots to be predicted.
Journal of Sensors
Consequently, large prediction errors appear when the weather changes suddenly and at the sunrise and sunset slots. As a modern, commonly used, and accurate prediction approach, artificial neural networks (ANNs) were suggested for the prediction task, not only in sensor units powered by solar energy but also in optical sensors used for biological and optical monitoring [12][19][20][21][22]. So far, ANNs have been implemented with an arbitrary architecture. That is, the structure (topology) and the inputs of a neural network that can be implemented in the microcontroller while achieving a high level of precision still need to be identified. The main goal of this paper is to apply a cascaded direct optimization technique to identify the optimal inputs and topology of the neural network for solar-supplied sensor nodes as well as optical ones.
This paper is structured as follows. Section 2 explains the processing, assessment, and implementation mechanism of artificial neural networks (ANNs) as a prediction approach. Section 3 proposes the cascaded direct optimization technique. Section 4 analyses and validates the proposed optimization of the ANN for sensor units, which includes identifying the optimal number and type of input parameters, specifying the optimal number of hidden layers and hidden neurons, and evaluating the resulting prediction error. Lastly, Section 5 concludes with the resulting optimal ANN.

Artificial Neural Networks (ANNs)
2.1. Processing and Assessment of ANNs. As stated earlier, either stochastic or statistical prediction approaches can be used with small devices of limited computing resources, such as microcontrollers. These approaches mainly use previous solar energy readings to perform the prediction in the sensor units. Unfortunately, manufacturers of sensor nodes consider them unreliable, as they may not reach an acceptable precision level. Empirical models (e.g., the Angstrom model) that describe global solar radiation are also not used with these units, because some of the parameters required for their calculations are not always available. In this context, ANNs have been promoted as a modern and accurate approach for sensor nodes [23].
ANNs, as a machine-learning-based methodology, depend on predefined historical data or samples. The network is trained by mapping premeasured samples to the desired output, which for the sensor node application is the global solar radiation (GSR) [24]. In other words, the ANN is used as a modeling technique for linear and nonlinear functions, and it is built of three types of layers (input, hidden, and output). The basic element in each layer is called a neuron, and each layer consists of several neurons, as in Figure 2, which is reproduced from [5]. Each neuron connects to the neurons in the adjacent layers through a connection with a specific weight and bias [25,26]. Figure 2 describes the fixed part of the structure (the number of layers and neurons), which is what needs to be optimized. The weights and biases are not indicated in it because they change iteratively.
Processing of the neural network divides into two actions, namely, normalization and training. These processes can be implemented in the following sequence to predict GSR values [27][28][29]:
(i) The first step is to normalize the premeasured input and output values. The data are scaled to values between zero and one by dividing each value in the dataset by the maximum value found.
(ii) The second step is to define the size of the input matrix (i.e., the number of input parameters and the number of samples available).
(iii) The third step is to create the ANN by selecting the number of hidden layers, the number of neurons in each layer, and the activation function.
(iv) The fourth step is to train the ANN. During the training phase, the value of the predefined activation function is calculated for each neuron in the first layer. Then, the neural variables are calculated from the resulting value of the activation function and the weights and biases of each connection according to Eq. (1). These variables are passed to the adjacent neurons in the next layer, and the process continues until the output layer. In Eq. (1), S_i is the state of neuron i, X_i is the resulting value of the activation function at neuron i, w_ij is the weight of the connection between neuron i and neuron j, b_ij is the bias of the connection between neuron i and neuron j, and σ is the activation function.
(v) The fifth step is to minimize the error. The calculation is performed iteratively to minimize the network performance function, which is the mean square error (MSE). In each iteration, the values of the weights and biases are updated.
(vi) The sixth step is to generate an output matrix with a size similar to that of the input matrix.
(vii) In the last step, the output values are unnormalized, and their accuracy is assessed by correlating them with the measured output values. This is done by calculating the correlation coefficient (R) according to Eq. (2), where x_t is the measured GSR at hour t, y_t is the predicted GSR at hour t, and T is the total number of hourly GSR samples.

Several activation functions are available for use within artificial neural networks; they are shown in Table 1. Consequently, many models can be created by changing the number of neurons, hidden layers, activation functions, and the number and type of input parameters. While ANNs are networks in which signals flow forward from the input to the output neurons, they also include a closed loop for propagation that allows the error to propagate backward through the neurons during the training phase. For these reasons, the ANN is called a feedforward-backpropagation network [30,31].
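The processing sequence described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' MATLAB implementation: the neuron state is taken in the common form S_j = Σ_i w_ij X_i + b_j with X_j = σ(S_j) (one bias per neuron), σ is assumed to be the logistic sigmoid, R is computed as the Pearson correlation coefficient, and all function names and numeric values are hypothetical. The iterative weight update of the training phase is omitted.

```python
import math

def normalize(values):
    """Step (i): scale to [0, 1] by dividing by the maximum value found."""
    peak = max(values)
    return [v / peak for v in values], peak

def logsig(s):
    """Logistic sigmoid, assumed here as the activation function sigma."""
    return 1.0 / (1.0 + math.exp(-s))

def forward(x, layers):
    """Step (iv): propagate one input vector through the layers.

    Each layer is (weights, biases); weights[j][i] links input i to neuron j.
    """
    for weights, biases in layers:
        x = [logsig(sum(w * xi for w, xi in zip(row, x)) + b)
             for row, b in zip(weights, biases)]
    return x

def mse(measured, predicted):
    """Step (v): performance function minimized during training."""
    return sum((m - p) ** 2 for m, p in zip(measured, predicted)) / len(measured)

def correlation(measured, predicted):
    """Step (vii): Pearson correlation coefficient R."""
    n = len(measured)
    mx, my = sum(measured) / n, sum(predicted) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(measured, predicted))
    var_x = sum((x - mx) ** 2 for x in measured)
    var_y = sum((y - my) ** 2 for y in predicted)
    return cov / math.sqrt(var_x * var_y)

# Tiny made-up example: 2 inputs -> 2 hidden neurons -> 1 output.
layers = [([[0.5, -0.2], [0.1, 0.8]], [0.0, 0.1]),
          ([[1.0, -1.0]], [0.2])]
gsr_norm, gsr_peak = normalize([100.0, 400.0, 800.0])
prediction = forward([0.3, 0.7], layers)[0] * gsr_peak  # step (vii): unnormalize
```

Because the sigmoid output lies between 0 and 1, multiplying it by the stored maximum returns the prediction to the original GSR scale.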

Mechanism of ANN Implementation with Sensor Units
Some obstacles have appeared in applying this technique to such sensor units. The most common one is the limitation of the microcontrollers: a neural network with a large topology needs more memory to store the weights and biases, and it requires more execution time and computational effort. As a solution, ANNs have been applied offline with those sensors, i.e., trained on a device with large computational resources, such as a PC, after which the resulting weights and biases are extracted and inserted into the microcontroller as fixed numbers for a specific location [33][34][35].
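As a hedged illustration of this offline workflow, the sketch below formats trained weights and biases as constant C arrays that could be compiled into microcontroller firmware. The function names `to_c_array` and `export_layer`, the array names, and the numeric values are hypothetical; the export format actually used is not described here.

```python
def to_c_array(name, values):
    """Format a flat list of floats as a constant C array definition."""
    body = ", ".join(f"{v:.6f}f" for v in values)
    return f"static const float {name}[{len(values)}] = {{ {body} }};"

def export_layer(index, weights, biases):
    """Flatten one layer (row-major weights) and emit two C arrays."""
    flat = [w for row in weights for w in row]
    return "\n".join([
        to_c_array(f"W{index}", flat),
        to_c_array(f"B{index}", biases),
    ])

# Made-up weights for a layer with 2 inputs and 2 neurons.
header = export_layer(1, [[0.5, -0.25], [0.125, 0.75]], [0.0, 0.1])
print(header)
```

Storing the parameters as constants keeps them out of the limited RAM on many microcontrollers, at the cost of retraining and reflashing when the unit is moved to another location.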
The neural network used with sensor units is of the feedforward-backpropagation type. The activation function should be chosen to fit the nature of the daily curve of harvested solar energy. Thus, the "Logsig" function is used to predict solar energy in the sensor units, as it shows the highest accuracy compared with the other ones. The reason is that the values of the collected solar energy (global solar radiation) are never negative, even at night. In this regard, the correlation coefficient (R) is usually considered sufficient to evaluate the accuracy of a predictive ANN for positive values, while the squared correlation coefficient (R²) is suitable when the predicted parameter takes both negative and positive values. For global solar radiation, R² is computed as additional confirmation of the accuracy results [36].
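To illustrate why a sigmoid-type function suits non-negative, normalized targets, the sketch below evaluates the logistic sigmoid (the usual definition behind MATLAB's `logsig`) next to the hyperbolic tangent (`tansig`). The comparison is an illustration of the output ranges only and does not reproduce the accuracy results referred to above.

```python
import math

def logsig(s):
    """Logistic sigmoid: output lies strictly between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-s))

def tansig(s):
    """Hyperbolic tangent sigmoid: output lies strictly between -1 and 1."""
    return math.tanh(s)

samples = [-5.0, -1.0, 0.0, 1.0, 5.0]
logsig_out = [logsig(s) for s in samples]
tansig_out = [tansig(s) for s in samples]

# logsig never produces a negative value, matching GSR scaled to [0, 1];
# tansig can, so part of its output range is unused for such targets.
print(min(logsig_out) > 0.0, min(tansig_out) < 0.0)
```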

Cascaded Direct Optimization Technique
In this section, a direct optimization technique is applied in a cascaded way to find the optimal neural network for sensor nodes supplied by solar harvesters. The term "direct optimization" refers to a procedure in which various parameters are compared iteratively until a satisfactory or optimum solution is found. This kind of optimization was created to solve difficult global optimization problems with bound constraints and a real-valued objective function [37,38]. Figure 3 shows the algorithm used to find the optimal neural network for our problem. According to Figure 3, values of the global solar radiation (GSR) are collected first to be used as the output, while other environmental or atmospheric parameters are used as inputs. Then, direct optimization is implemented in two cascaded steps. The first step finds the optimal number and type of inputs. The second step finds the optimal number of hidden layers and hidden neurons [39]. For each of these two processes, specific criteria are considered to select the optimal target (inputs or structure). For example, high accuracy (represented by the correlation coefficient (R)) or low computational effort (represented by the number of iterations) is adopted for the first direct optimization, while the mean square error (MSE) is taken for the second one. Note that the number of neurons and layers is kept constant in the first direct optimization process. On the other hand, the number and type of inputs that result from the first direct optimization process are kept fixed in the second one. Moreover, all these criteria are calculated for the ANNs under study using the neural networks toolbox in MATLAB software. This toolbox is designed to calculate the values of all those criteria smoothly for different neural network models. Thus, it gives high flexibility to change the number and type of inputs, the number of neurons and hidden layers, and the activation function.
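The two cascaded comparisons can be sketched as follows. This is a minimal illustration, not the algorithm of Figure 3 itself: the function `cascaded_direct_optimization`, the `evaluate` callback, and the scoring numbers in `toy_evaluate` are all hypothetical stand-ins for training a network and reading R and MSE from the toolbox, and the second stage here uses MSE alone as its criterion.

```python
from itertools import combinations

def cascaded_direct_optimization(candidates, structures, evaluate,
                                 fixed_structure, max_inputs=3):
    """Two exhaustive comparisons run one after the other.

    evaluate(inputs, structure) must return a dict with keys "R" and "MSE".
    """
    # Stage 1: structure fixed, compare every input combination by R.
    input_sets = [c for k in range(1, max_inputs + 1)
                  for c in combinations(candidates, k)]
    best_inputs = max(input_sets,
                      key=lambda c: evaluate(c, fixed_structure)["R"])

    # Stage 2: inputs fixed, compare every structure by MSE.
    best_structure = min(structures,
                         key=lambda s: evaluate(best_inputs, s)["MSE"])
    return best_inputs, best_structure

# Stand-in scorer with made-up numbers, used only to exercise the loops.
def toy_evaluate(inputs, structure):
    n_layers, n_neurons = structure
    r = 0.5 + 0.1 * len(inputs) + (0.05 if "zenith" in inputs else 0.0)
    return {"R": r, "MSE": abs(n_layers - 2) + abs(n_neurons - 10) / 10}

candidates = ["DR", "zenith", "AT", "RH"]
structures = [(n_layers, n_neurons)
              for n_layers in range(1, 9) for n_neurons in (5, 10, 15)]
best_inputs, best_structure = cascaded_direct_optimization(
    candidates, structures, toy_evaluate, fixed_structure=(1, 5))
print(best_inputs, best_structure)
```

Running the stages in sequence keeps the number of trained networks at the sum of the two search sizes rather than their product, which is the practical benefit of cascading.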

Optimizing a Neural Network for Sensor Units Powered by Solar Energy
This section contains three subsections that present, respectively, the data set used for analysis and assessment, the optimum number and type of inputs, and the optimum structure (topology) of the neural network considered for the sensor unit application.

Data Group.
A group of input and output data is required to train the proposed ANN applied to the sensor units. Since the GSR is the real source of energy that feeds these units, it has been chosen here as the output parameter.
The considered values of GSR are shown in Figure 4. For the input data, several environmental parameters have been selected for study: diffuse radiation (DR), zenith angle (θ_z), air temperature (AT), and relative humidity (RH). These were selected because they have a direct connection with global solar radiation. All of these parameters were measured hourly at the same location in Berlin over the same period (from 1st Jan 2011 to 31st Dec 2016) and are plotted in Figure 5. This city and this period were chosen because the data are freely available. The reason for choosing five years is the need for a dataset that describes real and varied weather situations, which is necessary for an application like the sensor unit. Figure 5 shows some spikes in air temperature, diffuse radiation, and relative humidity. These spikes indicate exceptional weather conditions. They do not appear in the zenith angle, since this parameter is associated with the sun, which is outside the atmosphere (i.e., the zenith angle has predefined, fixed values).

Optimizing the Number and Type of Inputs.
To apply the developed ANN in the sensor units with high accuracy, low computational effort, and low complexity, the optimal number and type of input parameters have been selected through the proposed optimization process with these requirements in mind. For this purpose, the aforementioned input parameters have been tested in three configurations: single parameters, combinations of two, and combinations of three. The correlation coefficient R has been calculated for all of these at a fixed number of neurons and hidden layers, as shown in Figures 6-8. In addition, the proportion of the variance, represented by R² (the squared correlation coefficient), and the slope of the linear equation are indicated in those figures as statistical parameters that confirm the accuracy results and the degree of linearity. Because R is less than 1 in all cases, each value of R² is smaller than the corresponding value of R. These values are slightly lower than expected because the MATLAB toolbox reports them to three decimal places. However, this does not influence the results of the optimization process. Note that AT and RH will be measured later (at implementation time) by small sensors if they are chosen as inputs of the developed ANN, while DR and θ_z will be measured by a pyranometer. From Figure 6, the network of zenith angle (θ_z) shows the highest R (0.865), the network of air temperature shows the lowest R (0.603), and the networks of diffuse radiation and relative humidity lie in between (0.819 and 0.690, respectively).
The values of R² accompanying these values of R are 0.748, 0.670, 0.364, and 0.476 for the networks of zenith angle, diffuse radiation, air temperature, and relative humidity, respectively. For Figure 6 as well as Figures 7 and 8, the slope values are given only in the figures, for readers interested in more detailed calculations, and are not discussed in the text.
From Figure 7, the network of zenith angle and relative humidity performs best, with R = 0.948 and R² = 0.899. In the same figure, the lowest R (0.730) appears for the network of air temperature and relative humidity together, which also gives the lowest R², 0.533.
As shown in Figure 8, the network that combines diffuse radiation, zenith angle, and air temperature shows R = 0.908, corresponding to R² = 0.824. This is the worst among the three-input networks. On the other hand, the network of zenith angle, air temperature, and relative humidity shows the best R (0.997) and the best R² (0.994), meaning that it is the optimal one among all combinations of inputs. Based on that, the optimal network input data are obtained by eliminating diffuse radiation as an input and relying only on the remaining parameters (i.e., the zenith angle, the air temperature, and the relative humidity).
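As a quick consistency check on the figures quoted above, the short sketch below squares each reported R and compares it with the reported R², and counts how many input combinations the first optimization stage covers for four candidate parameters. The labels are shorthand introduced here; the numbers are the ones quoted in the text.

```python
from math import comb

# (label, reported R, reported R^2) as quoted in the text for Figures 6-8.
reported = [
    ("zenith", 0.865, 0.748),
    ("DR", 0.819, 0.670),
    ("AT", 0.603, 0.364),
    ("RH", 0.690, 0.476),
    ("zenith+RH", 0.948, 0.899),
    ("AT+RH", 0.730, 0.533),
    ("DR+zenith+AT", 0.908, 0.824),
    ("zenith+AT+RH", 0.997, 0.994),
]

for label, r, r2 in reported:
    # Squaring R should reproduce the reported R^2 up to rounding.
    print(f"{label:>14}: R^2 from R = {r * r:.3f}, reported = {r2:.3f}")

max_gap = max(abs(r * r - r2) for _, r, r2 in reported)
best_label = max(reported, key=lambda row: row[1])[0]

# Single, pairwise, and triple combinations of the four candidate inputs.
n_networks = sum(comb(4, k) for k in (1, 2, 3))
```

All reported pairs agree to within about 0.001, which is the rounding level of three decimal places, and the first stage covers 4 + 6 + 4 = 14 candidate networks.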

The correlation coefficient R, the number of iterations, and the MSE have been considered as criteria for the direct optimization of the structure. Note that the optimal structure should show the maximum value of R (the highest accuracy), the lowest number of iterations (the lowest computational effort), and the minimum value of MSE (the lowest complexity). Hence, all these factors have been calculated for a sequence of layers. A range of eight hidden layers has been selected for the purpose of this study. In other words, 1 to 8 are the bounds of the constraint on the number of layers.
In optimizing the structure of the ANN, the number of neurons should also be considered, since it contributes to the desired target of reducing the memory size. For this reason, the number of neurons has been varied over a range of values (the bounds of the constraint on the number of neurons) to examine this impact. In this research, the number of neurons has been increased from 5 to 15 in steps of 5 (i.e., 5, 10, and 15), as depicted in Figures 9-11. The proportion of the variance (R²) and the slope of the linear equations are not indicated, because the MATLAB toolbox rounds them to the same values regardless of the number of layers and neurons when three decimal places are considered, as stated before. Those values are R² ≅ 0.99 and slope ≅ 1.
To make the optimization process easier to follow, the values of R and the number of iterations are plotted together in Figure 12 for different numbers of layers. This step is essential for determining the optimal number of hidden layers; priority is given to the structure of the neural network because it is directly related to the degree of complexity. The aim is to reduce both the execution time and the computational effort (i.e., a smaller memory size). Figure 12 shows that increasing the number of hidden layers increases the value of R. However, beyond seven layers, R remains almost constant regardless of the number of neurons. Additionally, the number of iterations decreases as the number of hidden layers increases. Unfortunately, both trends (for R and for the iterations) conflict with the low complexity required of the optimal network. Two hidden layers offer a reasonable balance of complexity, computational effort, and accuracy and are therefore considered optimal for wireless sensor units.
Figure 13 shows the impact of changing the number of neurons on the corresponding correlation coefficient and MSE. Increasing the number of neurons from 5 to 10 is accompanied by an increase in R but also by a higher MSE. This means that the highest accuracy does not coincide with the lowest complexity. Consequently, the middle point (10 neurons) is selected as the optimal operating point of the network, because R improves only slightly when the number of neurons changes from 10 to 15, compared with the significant improvement from 5 to 10.
Combining the results of both direct optimization processes, the optimal neural network has three inputs (air temperature, relative humidity, and zenith angle) and one output (global solar radiation). It includes two hidden layers with ten neurons each. Figure 14 shows all the components of this optimal network.
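To give a rough sense of what this topology means for microcontroller memory, the sketch below counts the weights and biases of a fully connected feedforward network. It assumes one bias per neuron (the usual convention, which may differ from the per-connection bias b_ij in the nomenclature) and single-precision storage; the function name `count_parameters` is hypothetical.

```python
def count_parameters(n_inputs, hidden_layers, n_outputs=1):
    """Count weights and biases of a fully connected feedforward network.

    Assumes one bias per neuron (the usual convention).
    """
    sizes = [n_inputs, *hidden_layers, n_outputs]
    weights = sum(a * b for a, b in zip(sizes, sizes[1:]))
    biases = sum(sizes[1:])
    return weights + biases

BYTES_PER_FLOAT = 4  # assumption: single-precision storage

for neurons in (5, 10, 15):
    n = count_parameters(3, [neurons, neurons])
    print(neurons, n, n * BYTES_PER_FLOAT)
```

Under these assumptions, the selected network (three inputs, two hidden layers of ten neurons, one output) has 161 parameters, or 644 bytes, compared with 56 parameters for five neurons per layer and 316 for fifteen.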

Prediction Error.
After the optimal network suitable for sensor units has been identified, it is used to predict the GSR throughout the study period (1st Jan 2011 to 31st Dec 2016) for testing purposes. The predicted GSR is plotted in Figure 15. The prediction error has then been calculated according to Eq. (3) and plotted in Figure 16. This figure indicates that the optimal neural network keeps the prediction error below 2% for all observations during the 5 years under study.

$$\text{Prediction Error} = \frac{\text{Measured Energy} - \text{Predicted Energy}}{\text{Measured Energy}} \times 100\% \qquad (3)$$

Taking both figures (Figures 15 and 16) together, the results prove the efficacy of the cascaded direct optimization process in selecting the best-fitting inputs and structure of the ANN for solar energy prediction applications. In summary, an ANN with 2 hidden layers of 10 neurons each achieves the most accurate prediction of solar-harvested energy when it is fed with the air temperature, relative humidity, and zenith angle as inputs.
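The percentage prediction error of Eq. (3) can be computed per sample as in the sketch below. The function name and the sample values are hypothetical, and the handling of samples with zero measured energy (for example, night hours) is an assumption, since the text does not state how they are treated.

```python
def prediction_error_percent(measured, predicted):
    """Eq. (3): (measured - predicted) / measured * 100, per sample.

    Samples with zero measured energy are returned as None, since the
    ratio is undefined there (an assumption; the source does not say
    how such samples are handled).
    """
    errors = []
    for m, p in zip(measured, predicted):
        errors.append(None if m == 0 else (m - p) / m * 100.0)
    return errors

# Made-up hourly values for illustration only.
measured = [0.0, 200.0, 500.0]
predicted = [0.0, 198.0, 505.0]
errors = prediction_error_percent(measured, predicted)
print(errors)
```

The sign of the result shows whether the network under-predicts (positive) or over-predicts (negative) the measured energy.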

Conclusion
A cascaded direct optimization approach for selecting both the inputs and the structure of ANNs applied in sensor units has been proposed. Artificial neural networks (ANNs) have been presented in this paper as a prediction approach for sensor units. This approach depends on training with premeasured environmental data. The variable predicted by the neural networks is the global solar radiation (GSR), the real source of operational energy. This paper found the optimal neural network suitable for sensor units powered by solar harvesters based on a cascaded direct optimization of the inputs and structure. In these optimization processes, accuracy, computational effort, and complexity were analyzed. The resulting neural network includes three input parameters (air temperature, relative humidity, and zenith angle) and consists of two hidden layers with ten neurons each. The prediction error of the resulting optimal neural network did not exceed 2%.

Nomenclature
AT: Air temperature
ANNs: Artificial neural networks
b_ij: Bias of connection between neuron i and neuron j
cm²: Square centimeter
DR: Diffuse radiation
GSR: Global solar radiation
J: Joule
MSE: Mean square error
R: Correlation coefficient
R²: Squared correlation coefficient
RH: Relative humidity
S_i: State of neuron i
T: Total number of hourly GSR samples
w_ij: Weight of connection between neuron i and neuron j
W: Watt
WSNs: Wireless sensor networks
x_t: Measured GSR at specific hour t
X_i: Resulting value of the activation function at neuron i
y_t: Predicted GSR at specific hour t
θ_z: Zenith angle
σ: Activation function

Data Availability
The weather data (global solar radiation, diffuse radiation, air temperature, zenith angle, and relative humidity) which are used to support the findings of this study are included within the article.

Conflicts of Interest
The authors declare that there are no conflicts of interest.