A Novel Cluster-Based Wireless Sensor Network Reliability Model Using the Expectation Maximization Algorithm

Wireless sensor networks (WSNs) are widely used across industries and business fields that require coverage of large geographical regions that are difficult for humans to reach. It is therefore important to be able to model, assess, and predict the reliability of WSNs. Masked data are a type of missing data used to represent system failures whose exact cause is unknown. This paper proposes a novel additive reliability model for a cluster-based WSN system using general masked data and uses the expectation maximization (EM) algorithm to solve the maximum likelihood estimation (MLE) problem. The proposed model assumes that a WSN comprises several clusters and that the failure processes of these clusters are independent. The probability characteristics of the system are determined according to the topology of the WSN system in order to evaluate the system reliability. Finally, a simulated dataset is used to show that the proposed model estimates WSN system reliability effectively.


Introduction
Wireless sensor networks are considered one of the most important technological innovations today. These networks offer environmental adaptability, self-correction, and random deployment, and they can meet the requirements of accurate, real-time, cross-environment information collection. A wireless sensor network is a typical distributed network system consisting of low-cost nodes that are randomly allocated and densely distributed. Compared with centralized sensor systems, wireless sensor networks have higher reliability and better flexibility in attack resistance and expansion, and they continue to function under degradation or node loss [1]. Despite these advantages, it is imperative to study the measurement and prediction of WSN reliability as quantitative measures of the abilities of WSNs.
The wireless sensor network is a distributed network system whose reliability evaluation can be studied using network reliability analysis approaches, and several such studies have already been conducted. AboElFotoh et al. computed WSN reliability based on random failure [2]. Niu and Varshney developed a performance analysis pipeline for distributed detection in a random sensor field, in which the sensor number is random and the wireless channels have nonnegligible error rates [3]. Other authors have proposed ordered binary decision diagram-based WSN reliability models [4,5]. Lee et al. analyzed the entire aging process of a sensor network in a periodic data gathering application [6]. Chen and Wang analyzed the reliability of the wireless sensor network system that executes a distributed code attestation protocol [7]. Silva et al. proposed a method to evaluate the reliability and availability of wireless sensor networks based on automatic fault tree generation technology [8]. In order to improve the survival time of sensor networks, several algorithms and methods have been proposed [9]. Wang et al. modeled the reliability and lifetime of WSN nodes in three typical working scenarios [10]. Some researchers have studied the energy efficiency and power reliability of WSNs [11][12][13][14]. Dâmaso et al. studied a reliability evaluation method for the WSN system based on routing algorithms [15]. Cai et al. used an acknowledgment-based transmission scheme to study the reliability of the data flow in event-driven WSNs [16], and others have studied the wireless link reliability [17].
Some researchers discussed the reliability and performance model of the wireless sensor network [18,19]. Zhu et al. proposed a certain assessment model and dynamic framework to meet the needs of users for network transmission reliability evaluation [20]. Kafi et al. reviewed the existing wireless sensor network reliability protocols, which are considered to be specially designed for wireless sensor networks due to their special characteristics [21], and a WSN communication reliability model was recently proposed [22][23][24]. Wang et al. proposed some algorithms to improve the energy efficiency of wireless sensor networks to extend their lifetime [25].
Zhang et al. proposed a mathematical model of network reliability with a diameter constraint and node proportion constraint to meet the requirements for a WSN performance evaluation [26]. Mostafaei and Obaidat proposed a distributed learning automaton-based algorithm and an irregular cellular learning automaton-based algorithm to preserve sensor protection [27,28]. Zhao et al. developed general WSN reliability models by removing the independence assumption on component or subsystem failures of WSNs [29]. Mostafaei modeled the problem as a multiconstrained optimal path problem and proposed a distributed learning automaton-based algorithm to preserve WSN [30]. Chakraborty et al. studied the reliability of WSNs with multistate nodes and proposed an approach to evaluate the flow-oriented network reliability of WSNs comprising multistate sensor nodes [31]. Díez-González et al. used a five-node time difference of arrival localization method to develop a new sensor deployment method that guarantees system availability in case of a sensor failure [32].
Most of these studies were based on probability evaluations and used probability-based methods. However, a WSN is a random and dynamic system, so a random process is needed to describe its failure process. Moreover, in order to improve the quality of WSNs, most research focuses on energy consumption, routing cost, etc., and the existing studies do not make full use of cluster failure data to evaluate WSN system reliability. The main research contents of this paper are as follows:
(a) A cluster-based WSN system reliability model is discussed
(b) A system reliability estimation method is studied for the case in which general masked data are present
(c) An expectation maximization algorithm is studied in order to solve the maximum likelihood estimation problem
The remainder of this paper is organized as follows. Section 2 discusses related work. Section 3 reviews the reliability model based on the additive nonhomogeneous Poisson process (NHPP), discusses the general masked data of wireless sensor networks, and proposes a novel additive cluster-based WSN system reliability model that uses general masked data. Section 4 describes the maximum likelihood estimation of the reliability model parameters. Section 5 describes a numerical example in which the proposed model is applied to grouped general masked data. Finally, Section 6 summarizes this paper.

Related Work
The topology of the wireless sensor network plays an important role in the network reliability assessment, which depends on the sensor coverage and network connectivity. Several different topologies of wireless sensor networks are applicable for different operating environments. The five most popular and simple structures, namely, mesh, star, tree, ring, and fully connected topologies, have been used in many practical fields (Figure 1) [33]. All wireless sensor networks with these topologies include a cluster head and multiple sensor heads. All data collected by the sensor heads are transferred to the cluster head and then from the cluster head to the base station through the middleware receiver. However, reliability modeling is related not only to the connectivity of each subnet but also to the spread ability of wireless sensor networks [34]. In general, a large WSN contains many subnets. Figure 2 shows the topological structure of a WSN system with three subnets (clusters), each of which contains one cluster head and several sensor heads.
Some proposed WSN reliability models have considered the failure process to be stochastic. Zhu et al. studied the reliability of wireless sensor networks in which the topology is switched between possible connections, and the connections are controlled by Markov chains [35]. Song et al. studied the sink node reliability method considering both software and hardware systems [36]. Salvo Rossi et al. developed a novel decision fusion approach and modeled a system by using a hidden Markov model [37]. Zhao et al. merged the theorem of NHPP with masked data when analyzing WSN systems [38]. Lei et al. studied the energy reliability in wireless sensor networks [39]. Ciuonzo et al. developed a novel fusion rule corresponding to a generalized Rao test to reduce the computational complexity [40]. Masked data represent system failure data when the exact cause of the failure is unknown; i.e., any system component (e.g., module, subsystem, and object) may have caused the failure [41]. Because of the influence of the actual environment, however, the cause of a system failure may be a subset of system components and not a single component. If the failure data also comprise general masked data, the failure process cannot be reduced to simple cluster processes. Thus, common methods used to maximize or minimize a complex function cannot be easily applied due to the potential existence of many unknown parameters. Some researchers have solved the problems of parameter estimation in an additive reliability model using the expectation maximization algorithm [42] and expectation least squares algorithm [43]. This situation is also commonly seen in WSN applications in which the failures cannot be attributed to specific subnets or when such information is not available. As often observed in practice, when the additive NHPP-based model cannot be reduced to simple NHPP models, the efficiency of parameter estimation is relatively low.
Journal of Sensors

Wireless Sensor Network Reliability Model
3.1. Review of the Additive NHPP Reliability Model. The additive NHPP model is an important reliability model for estimating and predicting system reliability using subsystem failure data. For example, in the hyper-exponential NHPP model [44], the underlying component model is the Goel-Okumoto (GO) model [45]. Yamada et al. also studied a similar version of the superexponential model [46]. Xie and Goh proposed a component-based software system reliability growth model [47]. A power law model has also been applied to a hardware/software system reliability data analysis based on component failure data [41]. The additive model contains many parameters, and therefore, the main problem associated with this type of model is the effective estimation of the model parameters. In general, an additive system reliability growth model requires the following basic assumptions [41]:
(a) The system contains $k$ subsystems
(b) The failure counting process $\{N_i(t), t \ge 0\}$ of subsystem $i$ is a nonhomogeneous Poisson process, and the $N_i(t)$ are statistically independent
(c) The cumulative number of system failures is
$$N(t) = \sum_{i=1}^{k} N_i(t).$$
Based on these basic assumptions, the mean value function (MVF) $m(t)$ and failure intensity function $\lambda(t)$ of the system are
$$m(t) = \sum_{i=1}^{k} m_i(t), \qquad \lambda(t) = \sum_{i=1}^{k} \lambda_i(t),$$
where $m_i(t)$ and $\lambda_i(t)$ are the MVF and failure intensity function of subsystem $i$. An additive reliability model requires that the system can be decomposed into several subsystems; a subsystem here may be a failure mode, a module, or a component. The reliability function of the system is then easily obtained as
$$R(\delta t \mid t) = \exp\{-[m(t + \delta t) - m(t)]\}.$$

General Masked Phenomenon for Wireless Sensor Networks
Sensor nodes in wireless sensor networks are frequently deployed in harsh environments and have obvious shortcomings in network bandwidth, battery and computing power, and memory, so some sensor nodes are prone to failure. In addition, sensor nodes are usually deployed redundantly and generate a lot of redundant data; processing steps such as data fusion [48,49] can lead to the loss of data and the generation of masked data. Some attack behaviors that target WSNs, such as spoofs, alterations, and replays [50,51], can also lead to masked data. In summary, masked data often exist in the reliability analysis of WSN systems. Suppose that a wireless sensor network contains $k$ clusters (or subnets), and $S = \{1, 2, \ldots, k\}$ is the cluster set. The general masked failure data in the wireless sensor network are defined as follows.
Definition 1. Suppose that a wireless sensor network contains $k$ clusters (subsystems or subnets), and $S = \{1, 2, \ldots, k\}$ is the cluster set. $S_j$ is the failure cause set (FCS) at time $t_j$, and $S_j \subseteq S$. The vector $(k, t_j, S_j)$ is the mathematical structure of general masked data. Figure 3 exemplifies a failure process of this kind for a WSN system. As shown in Figure 3, when a failure occurs, the failure arrival time $t_j$ and the FCS $S_j$ can be observed. For example, the system failed at time $t_4$ with failure cause set $S_4 = \{2, 3\}$, so either cluster 2 or cluster 3 may have caused the WSN system failure. These data are masked because it was impossible to determine which cluster was the cause. Likewise, the causes of the system failures were masked at times $t_1$, $t_6$, and $t_8$. In contrast, when the system failed at time $t_3$ with failure cause set $S_3 = \{2\}$, the cause of the WSN system failure was cluster 2, and there were no masked data. Figure 3 shows that the causes of the WSN system failures were not masked at times $t_2$, $t_5$, $t_7$, and $t_9$.
Assume that the wireless sensor network system is observed at consecutive times $t_1 < t_2 < \cdots < t_m$. Table 1 shows the general observation matrix of masked data, where $n_{Mj}$ is the number of masked failures at time $t_j$ $(j = 1, 2, \ldots, m)$ and $n_{ij}$ $(i = 1, 2, \ldots, k)$ is the number of nonmasked failures for the $i$th cluster.
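As an illustration of the data structure in Definition 1, the following minimal Python sketch stores one grouped observation per observation time. The class name `MaskedObservation`, its field names, and all numerical values are hypothetical and are not taken from Table 1; the sketch only shows how the failure cause set, the masked count, and the nonmasked counts fit together.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MaskedObservation:
    """One grouped observation of general masked data (hypothetical layout)."""
    t: float               # observation time t_j
    cause_set: frozenset   # failure cause set S_j for the masked failures
    n_masked: int          # n_Mj, number of masked failures at t_j
    n_cluster: tuple       # (n_1j, ..., n_kj), nonmasked failures per cluster

    def is_masked(self) -> bool:
        # Masked failures are present exactly when the masked count is positive.
        return self.n_masked > 0

    def total_failures(self) -> int:
        # All failures observed at t_j, masked or not.
        return self.n_masked + sum(self.n_cluster)


# Illustrative values only for a three-cluster system.
observations = [
    MaskedObservation(1.0, frozenset({1, 2}), 2, (3, 0, 1)),
    MaskedObservation(2.0, frozenset(), 0, (1, 2, 0)),
    MaskedObservation(3.0, frozenset({2, 3}), 1, (0, 1, 1)),
]

# Observed cumulative number of system failures at each observation time.
cumulative = []
running = 0
for obs in observations:
    running += obs.total_failures()
    cumulative.append(running)
```

Keeping the cause set next to the masked count makes the two cases of the definition explicit: an empty set with a zero count means the failures at that time are fully attributed to clusters.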

Cluster-Based Wireless Sensor Network Reliability Modeling with General Masked Data
Modern wireless sensor network systems are becoming more and more complex and are generally cluster-based. Additive reliability modeling is an important approach to cluster-based system reliability analysis. Considering the complexity of the environment, some assumptions are needed to evaluate the reliability of the WSN system. The proposed additive reliability model is also based on the assumptions described in Section 3.1, with subsystems replaced by clusters (subnets). Based on these assumptions, the MVF $m(t)$ and failure intensity function $\lambda(t)$ of the WSN system can be calculated by
$$m(t) = \sum_{i=1}^{k} m_i(t), \qquad \lambda(t) = \sum_{i=1}^{k} \lambda_i(t). \tag{5}$$
Furthermore, the reliability function of the WSN system is
$$R(\delta t \mid t) = \exp\{-[m(t + \delta t) - m(t)]\}. \tag{6}$$
In equation (5), $m(t)$ is the expected number of failures of the WSN system until time $t$, and $m_i(t)$ is the expected number of failures of cluster $i$. Note that a cluster may contain many sensor nodes, as shown in Figures 1 and 2. Equation (6) expresses the probability of no failure in the time interval $(t, t + \delta t)$.
Figure 3: Failure process of a three-cluster wireless sensor network system.
Table 1: Grouped general masked data of a wireless sensor network system (headers: failure causes of the WSN system; observation times).
In accordance with the above reliability model, the following sections estimate the parameters of this model, evaluate the model performance, and evaluate and predict the reliability of the WSN system. A flowchart for the reliability modeling and evaluation of the WSN system is shown in Figure 4.
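The additive structure of this section can be sketched in a few lines of Python. The cluster MVFs below are hypothetical and were chosen only so that the result can be checked by hand; the function names `system_mvf` and `system_reliability` are likewise illustrative. The sketch sums the cluster MVFs and evaluates the probability of no failure in $(t, t + \delta t)$ as $\exp\{-[m(t + \delta t) - m(t)]\}$.

```python
import math


def system_mvf(cluster_mvfs, t):
    """m(t) = sum_i m_i(t): expected number of system failures up to time t."""
    return sum(m_i(t) for m_i in cluster_mvfs)


def system_reliability(cluster_mvfs, t, dt):
    """R(dt | t) = exp(-[m(t + dt) - m(t)]): no failure in (t, t + dt)."""
    increment = system_mvf(cluster_mvfs, t + dt) - system_mvf(cluster_mvfs, t)
    return math.exp(-increment)


# Hypothetical cluster MVFs for a three-cluster system.
cluster_mvfs = [
    lambda t: 0.5 * t,       # cluster 1: constant intensity 0.5
    lambda t: 0.2 * t,       # cluster 2: constant intensity 0.2
    lambda t: 0.1 * t ** 2,  # cluster 3: increasing intensity 0.2 * t
]

r = system_reliability(cluster_mvfs, t=2.0, dt=1.0)
```

With these values, $m(3) - m(2) = 3.0 - 1.8 = 1.2$, so the reliability over the next unit of time is $e^{-1.2} \approx 0.30$. Because the clusters fail independently, the same value is obtained by multiplying the three per-cluster reliabilities.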

Maximum Likelihood Analysis of Wireless Sensor Network Reliability and Model Validation
Least squares estimation (LSE) and maximum likelihood estimation are two commonly used methods in parameter estimation. However, when the WSN system failures are masked, the objective functions of LSE and MLE in the model are both high-dimensional and complex multivariate functions. For example, an additive power law reliability model containing k clusters in Table 2 contains a total of 2k parameters to be simultaneously estimated. Therefore, due to the potential existence of many unknown parameters, the methods usually used to maximize or minimize objective functions cannot be easily applied.

Maximum Likelihood Estimation of Parameters in the Reliability Model
Grouped failure data, as used for modeling in this paper, are a commonly used type of reliability data. First, we give the following symbol definitions. Assume that the wireless sensor network system is observed at times $0 = t_0 < t_1 < \cdots < t_m$ and that the observed masked failure data from Table 1 are given by formula (7), where $k$ is the number of clusters in the WSN system, $m$ is the number of observations, $n_{Mj}$ is the number of observed masked failures corresponding to $S_j^*$ $(j = 1, 2, \ldots, m)$, and $n_{ij}$ is the number of observed failures for cluster $i$ $(i = 1, 2, \ldots, k)$. It follows that $n_{Mj} = 0 \iff S_j^* = \emptyset$. Let $N_{ij}$ be the random number of failures of cluster $i$ in $(t_{j-1}, t_j]$; the $N_{ij}$ are independent Poisson random variables with means $\lambda_{ij} = m_i(t_j) - m_i(t_{j-1})$. With the additional symbols defined in formulas (8)-(10), basic probability and statistics theory gives the likelihood function in formula (11), and taking its logarithm gives the log-likelihood function in formula (12). The maximum likelihood estimates of the parameters can be obtained by solving the partial derivative equations or by maximizing the log-likelihood function directly. However, it is difficult to obtain the global optimal solution because the log-likelihood function of the additive reliability model in formula (12) is very complicated. In the next section, we introduce the EM algorithm to solve this problem.
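The maximum likelihood estimation of the parameters in the additive model becomes very simple when there are no masked failure data, because the failure processes of the clusters are statistically independent. The likelihood function and log-likelihood function then take the following simple forms, and the computation of the MLEs is no longer complicated, as shown in previous studies [45]:
$$L = \prod_{i=1}^{k} \prod_{j=1}^{m} \frac{\lambda_{ij}^{n_{ij}} e^{-\lambda_{ij}}}{n_{ij}!}, \tag{13}$$
$$\log L = \sum_{i=1}^{k} \sum_{j=1}^{m} \left[ n_{ij} \log \lambda_{ij} - \lambda_{ij} - \log (n_{ij}!) \right], \tag{14}$$
where $\lambda_{ij} = m_i(t_j) - m_i(t_{j-1})$. Parameter estimation methods for the additive reliability model with traditional masked failure data, i.e., $S_j^* = \{1, 2, \ldots, k\}$, can be found in Reference [41]; in that case, the likelihood function follows from formula (11) by setting every nonempty $S_j^*$ equal to the full cluster set.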
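For the case without masked data, the log-likelihood separates into one Poisson term per cluster and observation interval, with mean $\lambda_{ij} = m_i(t_j) - m_i(t_{j-1})$. The Python sketch below evaluates it for two clusters; the counts, the two MVFs, and the helper names `poisson_loglik` and `interval_means` are assumptions made for illustration and do not come from the paper's tables.

```python
import math


def interval_means(mvf, times):
    """lambda_j = m(t_j) - m(t_{j-1}), with t_0 = 0."""
    grid = [0.0] + list(times)
    return [mvf(b) - mvf(a) for a, b in zip(grid, grid[1:])]


def poisson_loglik(counts, means):
    """sum_j [n_j log(lambda_j) - lambda_j - log(n_j!)] for one cluster."""
    return sum(n * math.log(lam) - lam - math.lgamma(n + 1)
               for n, lam in zip(counts, means))


times = [1.0, 2.0, 3.0]
# Hypothetical nonmasked counts n_ij, one row per cluster.
counts = [[2, 1, 0], [0, 1, 3]]
# Assumed cluster MVFs: m_1(t) = t and m_2(t) = 0.5 * t^2.
mvfs = [lambda t: 1.0 * t, lambda t: 0.5 * t ** 2]

# Independence of the clusters turns the total log-likelihood into a sum.
loglik = sum(poisson_loglik(n_i, interval_means(m_i, times))
             for n_i, m_i in zip(counts, mvfs))
```

Since each cluster contributes its own term, the parameters of cluster $i$ can be estimated from row $i$ of the count matrix alone, which is why the unmasked case is computationally simple.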

Expectation Maximization Algorithm for Estimating Model Parameters
The expectation maximization algorithm has gained popularity and is used in various applications, especially to simplify the maximization of the likelihood function in reliability models. Let $\theta_i$ be the parameter of the MVF $m_i(t)$. When $n_{Mj} > 0$, the random variable $N_{ij}$ is not completely observed, so the data are incomplete. Using formula (14), the function $Q$ is the conditional expectation of the complete-data log-likelihood given the observed data and the current estimate $\theta^{(l)}$:
$$Q(\theta \mid \theta^{(l)}) = \sum_{i=1}^{k} \sum_{j=1}^{m} \left[ E(N_{ij} \mid n_{ij}, n_{Mj}, \theta^{(l)}) \log \lambda_{ij} - \lambda_{ij} \right] + C,$$
where $\lambda_{ij} = m_i(t_j) - m_i(t_{j-1})$ depends on $\theta_i$ and $C$ collects the terms that do not depend on $\theta$. Because $E(N_{ij} \mid n_{ij}, n_{Mj}, \theta^{(l)})$ does not depend on the dummy variable $\theta$, it acts as a constant in the function $Q$. Therefore, the maximization step of the EM algorithm can be accomplished by maximizing the following functions $Q_i$ separately:
$$Q_i(\theta_i \mid \theta^{(l)}) = \sum_{j=1}^{m} \left[ \hat{N}_{ij} \log \lambda_{ij} - \lambda_{ij} \right], \quad i = 1, 2, \ldots, k, \tag{17}$$
where $\hat{N}_{ij} \equiv E(N_{ij} \mid n_{ij}, n_{Mj}, \theta^{(l)})$. In order to implement the expectation maximization algorithm, the expected number of failures for each cluster $i$ must be determined when $n_{Mj} > 0$. Next, we focus on the case $n_{Mj} \neq 0$, i.e., $S_j^* \neq \emptyset$; when $S_j^* = \emptyset$, there are no masked failure data. Let the random vector $N_j^* = (N_{r_1 j}, \ldots, N_{r_{L_j} j})$, $r_l \in S_j^*$, where $l = 1, 2, \ldots, L_j$ and $L_j$ is the number of elements in the set $S_j^*$. The random vector $N_j^*$ follows the multinomial distribution, i.e., $N_j^* \sim M(n_j^*, p_{r_1}, p_{r_2}, \ldots, p_{r_{L_j}})$, where $n_j^*$ is described in formula (10) and
$$p_{rj} = \frac{\lambda_{rj}}{\sum_{s \in S_j^*} \lambda_{sj}}, \quad r \in S_j^*.$$
It follows that $\sum_{r \in S_j^*} p_{rj} = \sum_{l=1}^{L_j} p_{r_l} = 1$ $(j = 1, 2, \ldots, m)$. Because the expectation $E[N_{ij}]$ is unknown when $n_{Mj} > 0$, it must be calculated in order to implement the EM algorithm; the conditional probability of $N_{ij}$ given the observed data yields $\hat{N}_{ij}$, which is given by formula (20). In summary, we obtain the following steps of the EM algorithm for estimating the parameters $\theta = (\theta_1, \theta_2, \ldots, \theta_k)$ with general masked data.
Step 2. Calculate $\hat{N}_{ij}$ using formula (20), and maximize the functions shown in formula (17) to obtain the new estimates $(\hat{\theta}_1, \ldots, \hat{\theta}_k)^{(1)}$.
Step 4. Repeat steps 2 and 3 until the stop rule is met.
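The iteration described by these steps can be sketched in Python for an additive power law model, $m_i(t) = \alpha_i t^{\beta_i}$. Several parts of the sketch are assumptions rather than the paper's method: the E-step splits each masked count over the clusters in $S_j^*$ in proportion to $\lambda_{ij}$ (a standard result for independent Poisson counts, which may differ in form from formula (20)); the M-step uses the closed form $\alpha_i = \sum_j \hat{N}_{ij} / t_m^{\beta_i}$ together with a golden-section search over $\beta_i$ that presumes a unimodal profile; the stop rule is a hypothetical maximum-change criterion; clusters are indexed from 0; and the data are invented.

```python
import math


def interval_means(alpha, beta, times):
    """lambda_j = alpha * (t_j^beta - t_{j-1}^beta), with t_0 = 0."""
    grid = [0.0] + list(times)
    return [alpha * (b ** beta - a ** beta) for a, b in zip(grid, grid[1:])]


def e_step(theta, times, n, n_masked, cause_sets):
    """Assumed E-step: share n_Mj among clusters in S_j* in proportion to lambda_ij."""
    lam = [interval_means(a, b, times) for a, b in theta]
    n_hat = [list(row) for row in n]
    for j, s in enumerate(cause_sets):
        if n_masked[j] == 0:
            continue
        total = sum(lam[i][j] for i in s)
        for i in s:
            n_hat[i][j] += n_masked[j] * lam[i][j] / total
    return n_hat


def m_step_cluster(n_hat_i, times, lo=0.01, hi=5.0, iters=80):
    """Maximize Q_i for one power law cluster; alpha is explicit once beta is fixed."""
    total = sum(n_hat_i)
    t_m = times[-1]

    def profile(beta):
        # Q_i with alpha = total / t_m^beta substituted, up to an additive constant.
        lam = interval_means(1.0, beta, times)
        return (sum(x * math.log(l) for x, l in zip(n_hat_i, lam))
                - total * beta * math.log(t_m))

    g = (math.sqrt(5) - 1) / 2  # golden-section ratio
    a, b = lo, hi
    for _ in range(iters):
        c, d = b - g * (b - a), a + g * (b - a)
        if profile(c) < profile(d):
            a = c
        else:
            b = d
    beta = (a + b) / 2
    return total / t_m ** beta, beta


def em(theta, times, n, n_masked, cause_sets, tol=1e-6, max_iter=500):
    """Alternate E- and M-steps until the largest parameter change is below tol."""
    for it in range(1, max_iter + 1):
        n_hat = e_step(theta, times, n, n_masked, cause_sets)
        new = [m_step_cluster(row, times) for row in n_hat]
        change = max(abs(x - y) for p, q in zip(new, theta) for x, y in zip(p, q))
        theta = new
        if change < tol:  # hypothetical stop rule
            break
    return theta, it


# Invented data for three clusters and six observation times.
times = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
n = [[5, 3, 2, 2, 1, 1], [1, 2, 2, 3, 3, 4], [0, 1, 1, 2, 3, 5]]
n_masked = [2, 0, 1, 3, 0, 2]
cause_sets = [{0, 1}, set(), {1, 2}, {0, 1, 2}, set(), {0, 2}]
theta0 = [(5.0, 0.5), (2.0, 1.0), (1.0, 1.5)]

theta_hat, iterations = em(theta0, times, n, n_masked, cause_sets)
```

A useful sanity check under these assumptions is that the fitted system MVF at the last observation time equals the total number of observed failures, because each M-step sets $\alpha_i t_m^{\beta_i}$ to the expected count of cluster $i$.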

Model Performance Evaluation Criteria.
In order to compare the performance of the reliability models, the mean squared error (MSE) and adjusted MSE are used to measure the goodness of fit of each model to the observed failure data. The MSE is defined as
$$\mathrm{MSE} = \frac{1}{m} \sum_{j=1}^{m} \left( \hat{m}(t_j) - m_j \right)^2,$$
where $\hat{m}(t_j)$ is the estimated cumulative number of system failures in $(0, t_j]$ and $m_j$ is the observed cumulative number of failures of the WSN system until time $t_j$.
The adjusted MSE additionally accounts for the influence of the number of model parameters $K$. Clearly, the smaller the MSE and adjusted MSE, the better the model fits the observed data.
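The two criteria can be computed as in the following Python sketch. The MSE averages the squared differences between fitted and observed cumulative failures. The form used for the adjusted MSE, dividing the same sum of squares by $m - K$, is an assumption commonly used to penalize the number of parameters and may not match the paper's exact definition; the data values are hypothetical.

```python
def mse(fitted, observed):
    """Mean squared error between fitted and observed cumulative failures."""
    m = len(observed)
    return sum((f - o) ** 2 for f, o in zip(fitted, observed)) / m


def adjusted_mse(fitted, observed, n_params):
    """Assumed form: sum of squared errors divided by (m - K)."""
    m = len(observed)
    return sum((f - o) ** 2 for f, o in zip(fitted, observed)) / (m - n_params)


# Hypothetical cumulative failure counts and fitted MVF values.
observed = [6, 9, 12, 18, 20]
fitted = [5.0, 10.0, 12.0, 17.0, 21.0]

plain = mse(fitted, observed)
penalized = adjusted_mse(fitted, observed, n_params=2)
```

Here the squared errors sum to 4, so the MSE is 0.8 and the adjusted value with $K = 2$ is $4/3$; a model with more parameters must reduce the squared error enough to offset the smaller divisor.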

Selected Models and Simulation Data.
In reliability analyses, a very widely used NHPP model is the power law model, also known as the Duane model. The power law model is very flexible, as the intensity can be decreasing ($\beta < 1$), constant ($\beta = 1$), or increasing ($\beta > 1$). Its MVF and intensity function are [52]
$$m(t) = \alpha t^{\beta}, \qquad \lambda(t) = \alpha \beta t^{\beta - 1}, \qquad \alpha > 0, \ \beta > 0.$$
In this paper, the power law model is applied to build the additive reliability model of a WSN. Table 2 lists the MVFs of the proposed PLGM model and of the traditional PL and PLTM models.
To illustrate the method described in the previous sections, a numerical example is given for a WSN with three clusters. Here, the MVF is given by
$$m(t) = \sum_{i=1}^{3} \alpha_i t^{\beta_i}.$$
A simulated dataset is shown in Table 3, in which C1, C2, and C3 represent the numbers of failures in each month for cluster 1, cluster 2, and cluster 3, respectively. M denotes the number of masked failures, GM means general masked data, and TM means traditional masked data. $S_j^*$ is the FCS described in formula (7).
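The power law MVF and intensity of the three-cluster example can be evaluated with the short Python sketch below. The parameter values are the initial values that the text later quotes for $(\alpha_1, \beta_1, \alpha_2, \beta_2, \alpha_3, \beta_3)$; they are used here purely as illustrative inputs and are not fitted estimates. The function names are hypothetical.

```python
def power_law_mvf(t, alpha, beta):
    """m(t) = alpha * t^beta."""
    return alpha * t ** beta


def power_law_intensity(t, alpha, beta):
    """lambda(t) = alpha * beta * t^(beta - 1)."""
    return alpha * beta * t ** (beta - 1)


# Illustrative (alpha_i, beta_i) pairs for clusters 1 to 3, not fitted values.
params = [(20, 0.06), (3, 0.5), (0.8, 1.5)]


def system_mvf(t):
    # Additive model: the system MVF is the sum of the cluster MVFs.
    return sum(power_law_mvf(t, a, b) for a, b in params)


def system_intensity(t):
    # The system intensity is the sum of the cluster intensities.
    return sum(power_law_intensity(t, a, b) for a, b in params)


m_at_1 = system_mvf(1.0)
lam_at_1 = system_intensity(1.0)
```

At $t = 1$ the system MVF is $20 + 3 + 0.8 = 23.8$ and the system intensity is $1.2 + 1.5 + 1.2 = 3.9$. The first two clusters have $\beta_i < 1$ and therefore decreasing intensities, while the third has $\beta_3 > 1$ and an increasing intensity.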

Expectation Maximization Algorithm Performance Analysis
In our computation, a stop rule is applied to terminate the EM algorithm. R software was used to write the program and to obtain the number of iterations of the EM algorithm under different initial values, as shown in Table 4. In 10 independently repeated experiments, the number of iterations was 8-10, which is very low. This indicates that the EM algorithm has a low computational cost and is robust to the choice of initial values.

Reliability Evaluation of Wireless Sensor Network Systems
Using the EM algorithm for the PLGM model described in Appendix A, if the initial value of $(\alpha_1, \beta_1, \alpha_2, \beta_2, \alpha_3, \beta_3)$ is taken as (20, 0.06, 3, 0.5, 0.8, 1.5), then the EM estimates of the parameters are obtained in eight iterations. The selected models are compared in Table 5. Here, the MSE and adjusted MSE of the proposed PLGM model are less than those of the traditional PL and PLTM models. Furthermore, the MSE and adjusted MSE of the PLTM model are less than those of the traditional PL model. Overall, the proposed PLGM reliability model has a better goodness of fit than all selected traditional models, and the traditional PL model has the lowest goodness of fit among all selected models. Figure 5 shows the fitted mean value functions for the WSN system. Figure 6 shows the fitted failure intensity functions of all selected models. Evidently, the traditional PL model is unable to fit the data. The reliability model proposed in this paper can evaluate not only the reliability of WSN systems but also the reliability of each cluster. Figure 7 shows the estimated MVFs and failure intensity functions of the three clusters. The reliability of each cluster can also be predicted using the estimated MVFs.

Conclusions
In this paper, the failure processes of clusters are characterized by a stochastic process, the NHPP, rather than by static probability-based methods. The proposed model can use the failure data of the clusters to evaluate the reliability of the system in order to improve the lifetime of the WSN system. Moreover, the EM algorithm is powerful in handling the optimization of the likelihood function: in 10 independently repeated experiments, the number of iterations was 8-10, which is very low. Finally, we used a simulated dataset to comparatively analyze the model performance and showed that the adjusted MSE and MSE of the proposed PLGM model are less than those of the traditional reliability models.
In the future, the proposed WSN reliability model can be extended to other systems, such as Internet of things systems and software/hardware systems. Reliability models of WSN systems that consider both masked data and random pulses can be the focus of further studies. Finally, WSN system reliability models that consider energy consumption, routing cost, sensor node failures, and link failures are also future research topics.