Abnormal Event Detection in Wireless Sensor Networks Based on Multiattribute Correlation

Abnormal event detection is one of the vital tasks in wireless sensor networks. However, the faults of nodes and the poor deployment environment have brought great challenges to abnormal event detection. In a typical event detection technique, spatiotemporal correlations are collected to detect an event, which is susceptible to noises and errors. To improve the quality of detection results, we propose a novel approach for abnormal event detection in wireless sensor networks. This approach considers not only spatiotemporal correlations but also the correlations among observed attributes. A dependency model of observed attributes is constructed based on Bayesian network. In this model, the dependency structure of observed attributes is obtained by structure learning, and the conditional probability table of each node is calculated by parameter learning. We propose a new concept named attribute correlation confidence to evaluate the fitting degree between the sensor reading and the abnormal event pattern. On the basis of time correlation detection and space correlation detection, the abnormal events are identified. Experimental results show that the proposed algorithm can reduce the impact of interference factors and the rate of the false alarm effectively; it can also improve the accuracy of event detection.


Introduction
Abnormal event detection is one of the main problems in wireless sensor networks [1]. In wireless sensor networks, abnormal events are usually complex, because an event usually involves multiple observed attributes, and it is difficult to describe an abnormal event pattern [2]. Existing anomaly detection algorithms detect an abnormal event by comparing a single attribute with a threshold [3, 4] or by considering the spatiotemporal correlations of sensor readings [2, 5-8]. However, some important information may be hidden in the correlations among different attributes [9].
In [3], an adaptive distributed event detection method is proposed, which dynamically adjusts the decision threshold based on the trust value of the sensor nodes and uses a moving average filter to tolerate the transient faults of the sensor nodes. Although this method is fault-tolerant, it may still misjudge event nodes as faulty nodes. Particularly when the event range is large, the accuracy of detection decreases significantly. Besides, this method computes a trust value for each sensor node, so it can only be applied to univariate applications. Paper [5] models the event region based on a Dynamic Markov Random Field. This method can effectively capture the dynamic changes of a local area; however, since it needs to constantly exchange information among spatiotemporal neighbors, the detection efficiency is low. Besides, the detection of events lacks a global perspective, which may lead to misjudgment of abnormal events. Paper [6] proposes an event detection scheme based on spatiotemporal correlations. In this method, the sensor nodes are divided into multiple working groups, and the time correlation of the sensor data is used to eliminate low frequency errors. Different working groups cooperate to determine whether the anomalies represent an event. However, this method constructs the model based on a single sensing attribute only and does not consider the relations between multiple sensed attributes and the abnormal event.
The attributes of the sensor readings usually contain time information, sensor topology information, and other attributes directly sensed by the sensor (e.g., temperature, humidity, and light intensity). When abnormal events occur in the network, events often show temporal correlation, spatial correlation, and attribute correlation [9]. In most cases, event detection methods that take only the spatiotemporal correlation of the data into account are susceptible to both sensor failures and external environmental noises. For observed attributes, a simple threshold comparison is insufficient to determine whether an abnormal event occurs. For instance, in an indoor fire monitoring application, the increase of the temperature and smoke concentration may be caused by cooking, rather than a fire accident.
In order to improve the accuracy of abnormal event detection in wireless sensor networks with multiple attributes and reduce the influence of environmental noises and sensor failures on the event detection results, this paper proposes a new method called Abnormal Event Detection based on Multiattribute Correlation (MACAED). First, considering that a Bayesian network can effectively represent the dependencies among variables, a Bayesian network is used to establish the dependency model of observed attributes. In this model, the dependency structure of abnormal events is obtained by structure learning, and each node obtains its conditional probability table by parameter learning. Then, the attribute correlation confidence is introduced to judge whether the attribute correlation mode of a point is an abnormal mode. Based on the sliding window model, the degree of temporal correlation is calculated; the spatial similarity is calculated by using the neighbor node information. Finally, abnormal events are detected by combining the three kinds of correlation.

Attribute Dependency Model
In wireless sensor networks, abnormal events usually show the following three characteristics:
(1) For a single sensor node, an abnormal event continues for a period of time once it occurs, so the data at adjacent times show a certain degree of similarity [7]. In addition, abnormal events inevitably affect the physical environment being monitored, and the sensor data change accordingly, showing a special mode.
(2) For a number of sensor nodes, sensor nodes in the event region will exhibit spatial similarity when abnormal events occur [10]; in other words, the readings of adjacent nodes exhibit similar patterns.
(3) When the abnormal events occur in the monitoring area, the sensed attributes of the sensor readings show a certain degree of relevance, and this correlation appears as probability relations [9].
According to these three characteristics of abnormal events in wireless sensor networks, and because a Bayesian network can effectively represent the probability relationships among attributes, we construct the attribute dependency model. The attribute correlation confidence is proposed to measure the degree of similarity between the measured points and the anomalies in the observed attribute probability model.
Given a sample dataset $D = (D_1, D_2, \ldots, D_m)$, let the Bayesian network $B$ take all the variables in the node set $(X_1, X_2, \ldots, X_n)$ as nodes, and instantiate all the variables of $B$ using the attribute values in the sample dataset. The variable $X_i$ has $r_i$ possible values $(v_{i1}, v_{i2}, \ldots, v_{ir_i})$. Let the parent variable set of $X_i$ be $\Pi_i$; $w_{ij}$ denotes the $j$th instantiation of the parent set $\Pi_i$ with respect to $D$, and $N_{ijk}$ denotes the number of instances in which the variable $X_i$ takes the value $v_{ik}$ and $\Pi_i$ is instantiated as $w_{ij}$, with $N_{ij} = \sum_{k=1}^{r_i} N_{ijk}$. The Bayesian scoring criterion is used to compare two Bayesian network structures $B_1$ and $B_2$ through the ratio of their posterior probabilities. Since $P(B_1 \mid D)/P(B_2 \mid D) = P(B_1, D)/P(B_2, D)$, we only need to compare the joint probabilities $P(B_1, D)$ and $P(B_2, D)$. These can be calculated by using the formula [12]

$$P(B, D) = P(B) \prod_{i=1}^{n} \prod_{j=1}^{q_i} \frac{(r_i - 1)!}{(N_{ij} + r_i - 1)!} \prod_{k=1}^{r_i} N_{ijk}!, \quad (1)$$

where $P(B)$ is the prior probability and the instantiations of $\Pi_i$ are indexed by $(1, \ldots, q_i)$. When the prior $P(B)$ is the same for all structures, maximizing the joint probability $P(B, D)$ in (1) decomposes over the variables: for each variable $X_i$, it is only necessary to maximize the local term

$$\max_{\Pi_i} \prod_{j=1}^{q_i} \frac{(r_i - 1)!}{(N_{ij} + r_i - 1)!} \prod_{k=1}^{r_i} N_{ijk}!.$$

In the initial stage of constructing the network structure, it is assumed that each node has no parent node. The nodes that most increase the local term are recursively added to the parent set of the node. When $P(B, D)$ no longer increases, no more nodes are added to the parent set, and the network structure $B_S$ is obtained. For the current sample dataset $D$, $B_S$ is the optimal network structure under the Bayesian scoring standard.
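The greedy parent search described above can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: it assumes the standard K2 form of the Bayesian score, works in log space for numerical stability, and uses hypothetical names (`log_k2_score`, `greedy_parents`) and a hypothetical row-of-tuples data layout. The `max_fan_in` cap mirrors the parameter of the same name used in the experiments.

```python
from collections import Counter
from math import lgamma


def log_k2_score(data, child, parents, arity):
    """Log of the local score of one variable for a candidate parent set.

    data    : list of tuples, one discrete value (0..r-1) per variable
    child   : column index of the variable being scored
    parents : tuple of column indices forming the candidate parent set
    arity   : arity[i] is the number of possible values of variable i
    """
    r = arity[child]
    # Count of rows for each (parent configuration, child value) pair
    pair_counts = Counter((tuple(row[p] for p in parents), row[child]) for row in data)
    # Count of rows for each parent configuration
    config_counts = Counter(tuple(row[p] for p in parents) for row in data)

    score = 0.0
    for config, total in config_counts.items():
        # log[(r - 1)! / (total + r - 1)!], using lgamma(n + 1) = log(n!)
        score += lgamma(r) - lgamma(total + r)
        for k in range(r):
            score += lgamma(pair_counts[(config, k)] + 1)  # log(count!)
    return score


def greedy_parents(data, child, candidates, arity, max_fan_in=2):
    """Start with no parents; add the best candidate while the score improves."""
    parents = ()
    best = log_k2_score(data, child, parents, arity)
    while len(parents) < max_fan_in:
        scored = [(log_k2_score(data, child, parents + (c,), arity), c)
                  for c in candidates if c not in parents]
        if not scored:
            break
        top_score, top = max(scored)
        if top_score <= best:
            break  # the score no longer increases, so stop adding parents
        parents, best = parents + (top,), top_score
    return parents, best
```

Parent configurations that never occur in the data contribute a factor of 1 to the score, so iterating only over observed configurations does not change the result.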

Attribute Correlation Confidence.
Attribute correlation confidence is a concept we propose to measure the fitting degree between the sensor reading and the abnormal event pattern. It is equal to the ratio of the joint probability distribution between the measured point and the abnormal point. Let $(x_1, x_2, \ldots, x_n)$ be the sensor reading at the current time. For an abnormal event $E_k$, the joint probability of all node variables $P(x_1, x_2, \ldots, x_n \mid E_k)$ is calculated according to the Bayesian network structure and the conditional probability table. Since, in a Bayesian network, not every node has an arc to all the other nodes, the conditional probability of a node depends only on its direct parent nodes. In other words, given the values of its parent variables, a node is conditionally independent of its nondescendant nodes. So the calculation of the joint probability can be simplified by using the chain rule [11]:

$$P(x_1, x_2, \ldots, x_n \mid E_k) = \prod_{i=1}^{n} P(x_i \mid \pi(x_i), E_k),$$

in which $\pi(x_i)$ represents the parent nodes of $x_i$. After calculating $P(x_1, x_2, \ldots, x_n \mid E_k)$, we obtain the probability pattern of the reading in an event. From this probability, the attribute correlation confidence $C$ of the tested point is calculated. The higher the confidence, the more likely the anomaly represents an abnormal event.
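As an illustration of the chain-rule factorization, the sketch below multiplies one conditional probability per attribute, looked up by the values of that attribute's parents. The two-node network (temperature as the parent of smoke), the attribute names, and all probability values are invented for the example; they are not taken from the experiments.

```python
def joint_probability(reading, parents, cpt):
    """Joint probability of a reading as a product of per-node conditionals.

    reading : dict mapping attribute name -> discrete value
    parents : dict mapping attribute name -> tuple of parent attribute names
    cpt     : dict mapping attribute name -> {(parent values, value): probability}
    """
    prob = 1.0
    for name, value in reading.items():
        parent_values = tuple(reading[p] for p in parents[name])
        # Combinations missing from the table are treated as probability 0
        prob *= cpt[name].get((parent_values, value), 0.0)
    return prob


# Hypothetical event model: temperature has no parent, smoke depends on temperature.
parents = {"temp": (), "smoke": ("temp",)}
cpt = {
    "temp": {((), "high"): 0.9, ((), "low"): 0.1},
    "smoke": {
        (("high",), "yes"): 0.8, (("high",), "no"): 0.2,
        (("low",), "yes"): 0.3, (("low",), "no"): 0.7,
    },
}

p = joint_probability({"temp": "high", "smoke": "yes"}, parents, cpt)  # 0.9 * 0.8
```

Only two table lookups are needed here, whereas a full joint table over both attributes would need one entry per value combination; the saving grows quickly with the number of attributes.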

Abnormal Event Detection Algorithm Based on Multiattribute Correlation
In this paper, we propose a detection algorithm based on multiattribute correlation, which is divided into three phases: attribute correlation pattern decision, temporal similarity detection, and spatial similarity detection.

Description of Abnormal Event.
For an abnormal event, define event information Info $= \{T, L, A, \text{Parm}, E_k\}$, where $T$ is the time of occurrence of the abnormal event, $L$ is the location of the abnormal event, and $A$ is the attribute set that the event involves. Parm is the parameter set, which includes the temporal similarity threshold, the spatial similarity threshold, and the attribute correlation confidence threshold. For different application environments, the values of each item in Parm can be adjusted adaptively to achieve the best detection result. $E_k$ represents the event type: $k = 0$ means that no abnormal event occurred, $k > 0$ means that an abnormal event occurred, and the higher the value of $k$, the more severe the abnormal event.
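One way to hold this event information in code is a pair of small record types. The sketch below is only a suggested layout: the class and field names are hypothetical, and the threshold values in the usage line are the ones reported as best in the parameter experiments.

```python
from dataclasses import dataclass


@dataclass
class Params:
    """Parameter set: the three thresholds used by the detection phases."""
    temporal_threshold: float
    spatial_threshold: float
    confidence_threshold: float


@dataclass
class EventInfo:
    """Event information: time, location, attributes, parameters, event type."""
    time: float
    location: tuple
    attributes: tuple
    params: Params
    event_type: int = 0  # 0: no abnormal event; larger values: more severe event

    def is_abnormal(self):
        return self.event_type > 0


info = EventInfo(time=0.0, location=(1.0, 2.0),
                 attributes=("temperature", "humidity"),
                 params=Params(0.1, 0.2, 0.5))
```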

Temporal Similarity Detection.
The data sampling frequency of most wireless sensor networks is relatively high and the data change little between adjacent times, so the sensor data are time-correlated. By combining the sliding window model with the attribute dependency model obtained above, candidate anomalies that may represent abnormal events are detected.
Let $w$ be the size of the sliding window. For each data sequence $x_i$ within the window, calculate the similarity $S(x_i, x_t)$ between $x_i$ and the current time series $x_t$. Considering that the data sequence closest to the current time is the most correlated, the average similarity between the current data and the data in the window is calculated by weighted summation, where the weight is $\omega_i = 1/(t - t_i)$. If the average similarity is smaller than the temporal similarity threshold and the attribute correlation confidence is greater than or equal to the confidence threshold, the data sequence at the current time not only deviates significantly from the historical data but also exhibits the relationship among attributes that is expected when the abnormal event occurs; such a reading is a candidate anomaly and requires further spatial correlation detection. In all other cases, the reading is filtered out as noise.
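The temporal check can be sketched as below. Two parts are assumptions rather than the paper's definitions: the per-pair similarity (here one minus the mean absolute difference of readings scaled to [0, 1]) and the normalization of the weighted sum by the total weight. The weight 1/(t - t_i) and the two-condition candidate test follow the description above; all function names are hypothetical.

```python
def similarity(a, b):
    """Assumed measure: 1 minus the mean absolute difference of scaled readings."""
    return 1.0 - sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def weighted_window_similarity(window, current, t):
    """Weighted average similarity between the current reading and the window.

    window  : list of (t_i, reading) pairs with t_i < t
    current : reading at time t, each attribute scaled to [0, 1]
    """
    weights = [1.0 / (t - t_i) for t_i, _ in window]  # recent readings weigh more
    sims = [similarity(reading, current) for _, reading in window]
    return sum(w * s for w, s in zip(weights, sims)) / sum(weights)


def is_candidate(avg_similarity, confidence, sim_threshold, conf_threshold):
    """Candidate anomaly: deviates from history and fits the event pattern."""
    return avg_similarity < sim_threshold and confidence >= conf_threshold


avg = weighted_window_similarity([(8, [0.2, 0.2]), (9, [0.4, 0.4])], [0.4, 0.4], t=10)
```

A reading that deviates from its history but has low attribute correlation confidence fails the second condition and is treated as noise, which is how interference such as cooking is meant to be separated from a real event.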

Spatial Similarity Detection.
The similarity between the candidate anomaly and each neighbor node's data sequence is calculated. If the candidate anomaly and the neighbor node's data sequence are sufficiently similar, an abnormal event has occurred in the region where the candidate anomaly is located, and it needs to be uploaded to the sink node. Specifically, if the spatial similarity $S(x_i, x_j)$ between the candidate anomaly $x_i$ and the neighbor sequence $x_j$ is greater than or equal to the spatial similarity threshold, both nodes have detected an abnormal event at the same time, and the candidate anomaly node and its neighbor node are marked as abnormal event nodes. Otherwise, no neighbor node has detected abnormal information at this time, and the candidate anomaly is treated as noise data and filtered out.
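A minimal sketch of this neighbor check follows. The similarity measure is an assumption (one minus the mean absolute difference of readings scaled to [0, 1]), since the paper's formula is not reproduced here, and the function names and node identifiers are hypothetical.

```python
def similarity(a, b):
    """Assumed measure: 1 minus the mean absolute difference of scaled readings."""
    return 1.0 - sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def spatial_check(node_id, candidate, neighbors, threshold):
    """Return the set of nodes to mark as abnormal event nodes.

    neighbors : dict mapping neighbor id -> that neighbor's reading at the same time
    An empty result means no neighbor agrees, so the candidate is treated as noise.
    """
    agreeing = {nid for nid, reading in neighbors.items()
                if similarity(candidate, reading) >= threshold}
    return agreeing | {node_id} if agreeing else set()


marked = spatial_check(1, [0.9, 0.9], {2: [0.85, 0.95], 3: [0.1, 0.1]}, threshold=0.9)
```

In this example node 2 reads values close to the candidate and node 3 does not, so nodes 1 and 2 are marked and node 3 is left unmarked.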

Description of MACAED Algorithm.
Based on the calculation of attribute correlation confidence and the detection of temporal and spatial correlation of sensor data, an abnormal event detection algorithm based on multiattribute correlation is proposed. The pseudocode of the algorithm is shown in Algorithm 1.
In the pseudocode of Algorithm 1, rows (2)∼(3) train the Bayesian network through the scoring-searching method and choose the network structure with the highest score as the observed attribute dependency model. Rows (4)∼(26) detect abnormal events in real time: rows (9)∼(10) perform parameter learning for each sensor in order to update the probability distribution in the attribute dependency model, rows (10)∼(14) compute the attribute correlation confidence of observed attributes, row (15) calculates the average similarity between the current reading and the readings within the window, row (18) calculates the average similarity between the readings of the current node and those of the adjacent nodes, and rows (17)∼(24) determine whether the current reading represents an abnormal event.

Time Complexity Analysis.
Let $n$ be the number of observed attributes, which corresponds to the number of nodes in the Bayesian network; $m$ is the number of instances, that is, the number of readings; $r$ is the number of possible values for each observed attribute; $N$ is the number of nodes in the WSN; $w$ is the size of the sliding window. For the structure learning part, the time complexity is $O(mn^4r)$ [12]. The abnormal event detection part contains two nested loops: the outer loop runs $(m - w - 1)$ times and the inner loop runs $O(N)$ times. The parameter learning consists of a cycle of $O(r)$ times. The time correlation detection consists of a cycle of $O(w)$ times. The spatial correlation detection consists of a cycle of $O(N)$ times. The total time complexity of the algorithm is $O(mn^4r) + (m - w - 1)(N)(r + w + N)$. Since, for most wireless sensor networks, the value of $n$ is small (less than 10) and the sliding window size $w$ and the number of possible values of each attribute $r$ are relatively small (in this experiment, $w = 10$; $r = 9$), the influence of these values on the total time complexity can be ignored, so the total time complexity can be simplified to $O(m) + O(mN^2) = O(mN^2)$.

Experimental Results and Analysis
Datasets.
We test the performance of the MACAED algorithm by conducting simulation experiments in Matlab 2014a. The experiments are run on a PC with an Intel Core i3-2120 CPU @ 3.30 GHz, 4 GB of memory, and the Windows 7 operating system. For the instance of detecting fire events, the performance tests are based on processed data from the Intel Lab Data [13] of the Intel Berkeley Lab. In addition to the real data, we manually insert fire event data and interference event data into the dataset.
The experiment dataset contains the records of 54 sensors deployed in the IBRL lab from February 28th to April 5th, 2004. The Mica2Dot sensors collect temperature, humidity, light intensity, and voltage values every 31 seconds. Sensor node deployment is shown in Figure 2.

(5) if t % period = 0 // period is the parameter update period
(6) flag = true; // flag indicates whether to update the parameters
(7) end
(8) for i = 1 to N // i is the id of the WSN node, N is the number of sensors
(9) learn parameter for each sensor node
(10) if dataPointer[i] < group_length // prevent i from exceeding the length of the group
(11) if groupData_time[i] < t // prevent a break caused by data loss
(12) compute C from M
(13) end
(14

Data Preprocessing.
In our experiment, we choose the records within the 24 hours of February 28th as our test data and preprocess the raw data as follows:
(1) Since the units of the attributes directly sensed by each sensor differ and the ranges of different attributes vary widely, the raw data are standardized and mapped to [0, 1]; in this way, the relative distance can be calculated.
(2) Since the change of each attribute value is continuous and periodic, the experimental datasets are discretized to facilitate calculation, and the values of each attribute are divided into 10 intervals.
(3) Some parts of the raw IBRL datasets have missing values and failed nodes (node 5 and node 15 have no records; node 28 has only 3 attribute records). NaN is used in this experiment to fill the missing values; these values are handled case by case and are not used in computation.
(4) In order to verify the performance of our algorithm in detecting abnormal events, abnormal readings that represent abnormal events are added to the dataset. In addition, readings of interference events are added (e.g., turning on a heater in the room makes the temperature rise).
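Steps (1) to (3) can be sketched as follows. The paper does not state how the 10 intervals are chosen, so equal-width binning is an assumption here, as are the function names and the choice to carry missing values through as NaN (after scaling) and None (after discretization).

```python
import math


def min_max_scale(values):
    """Map readings to [0, 1]; NaN entries stay NaN and are ignored for min/max."""
    valid = [v for v in values if not math.isnan(v)]
    lo, hi = min(valid), max(valid)
    span = hi - lo
    return [v if math.isnan(v) else ((v - lo) / span if span else 0.0)
            for v in values]


def discretize(scaled, bins=10):
    """Assign each scaled value to one of `bins` equal-width intervals (0..bins-1)."""
    # The top edge 1.0 is folded into the last interval
    return [None if math.isnan(v) else min(int(v * bins), bins - 1) for v in scaled]


scaled = min_max_scale([10.0, 20.0, float("nan"), 30.0])
levels = discretize(scaled)
```

Keeping the missing entries in place, rather than dropping them, preserves the alignment between readings and their timestamps.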

Experimental Parameters.
Temperature $T$, humidity $H$, light intensity $L$, and voltage $V$ are numbered 1, 2, 3, and 4, respectively. In order to obtain a relatively stable Bayesian network structure, we set the maximum number of parent nodes in structure learning max_fan_in = 2, the learning step length step = 10, and the number of instances ncases = 1000. The optimal parameter learning cycle is period = 600. Bayesian networks with four different scores are shown in Figure 3; the higher the score is, the more stable the network structure is. Thus, we choose the structure whose score is 74 as the attribute dependency model in this experiment. In this method, the sliding window size has a direct impact on the detection results. The precision, the recall, and the F1-measure of anomaly detection are evaluated under different sliding window sizes. The experimental results are shown in Figure 4.
From Figure 4, we can see that the recall decreases as the sliding window width increases, although the overall change is small. The precision declines faster, leading to a quick decrease of the F1 value. This is because, as the window width increases, more historical data are included and the calculated average similarity keeps declining, which makes readings more likely to become candidate anomalies. Since a small sliding window also keeps the amount of uploaded data small, we set the sliding window size to $w = 10$, which still makes full use of the historical data. The requirements for the threshold settings differ with the environment of the wireless sensor network. We vary each of the three thresholds in turn and test the accuracy of anomaly detection under the change of a single threshold; the results are shown in Figure 5.
From Figure 5, it can be concluded that the highest detection accuracy is obtained when the temporal similarity threshold is 0.1, the spatial similarity threshold is 0.2, and the attribute correlation confidence threshold is 0.5.
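For reference, the precision, recall, and F1-measure used in this subsection can be computed from detection counts as in the sketch below. The function name and the convention of returning 0 when a denominator is empty are choices made for this example.

```python
def precision_recall_f1(true_positive, false_positive, false_negative):
    """Standard detection metrics from counts of correct and incorrect detections."""
    predicted = true_positive + false_positive  # readings flagged as events
    actual = true_positive + false_negative     # readings that are real events
    precision = true_positive / predicted if predicted else 0.0
    recall = true_positive / actual if actual else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


precision, recall, f1 = precision_recall_f1(8, 2, 2)
```

Because F1 is the harmonic mean of precision and recall, a fast drop in precision pulls F1 down even when recall changes little, which matches the trend described for larger sliding windows.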

Contrast Experiment.
In the contrast experiment, we still use the IBRL dataset, in which the number of sensor nodes is 54, and the deployment of nodes is shown in Figure 2. We use $(T, H, L, V)$ to represent the four attributes: temperature, humidity, light intensity, and voltage. Since there are no interference factors in the dataset, we artificially add some false abnormal events, which are shown in Table 1.
The contrast algorithms include the Adaptive Fault-Tolerant Event Detection (AFTED) algorithm proposed in [3], the Online Dynamic Event Region Detection (ODERD) algorithm proposed in [5], the Real-Time Event Detection Approach based on Temporal-Spatial Correlations (TSCRED) presented in [6], and the Spatiotemporal Correlation based Fault-Tolerant Event Detection (STFTED) scheme proposed in [8]. We compare the detection accuracy, false alarm rate, and detection time of abnormal events. For the proposed algorithm, we use the same parameter settings as in the previous experiments. For the AFTED algorithm, we set the window size for tolerating transient faults to 4 and the threshold for filtering transient faults to 0.75, which were verified to be the most appropriate values in the original experiments. For the ODERD algorithm, since we only focus on static abnormal event detection, the parameters controlling the shift and deformation of event regions are set to 0. To compare these algorithms on an equal footing, we set the sliding window size of TSCRED and STFTED to 10, the same as in the proposed algorithm. Besides, all of the sensor nodes have the same communication range of 4, and each event region is assumed to be a circle with a radius of 2.
The results of the proposed algorithm compared with the other four algorithms in detection accuracy are shown in Figure 6. It can be seen from Figure 6 that when the node failure rate goes from 0.05 to 0.3, the detection accuracies of the five algorithms are similar, reaching 0.96 or more; this is because most of the noise is filtered out in the time correlation detection phase. When the node failure rate is greater than 0.3, the detection accuracies of the five algorithms decrease significantly, but the MACAED algorithm is significantly better than the other four algorithms. The reason for the decrease is that all five algorithms have a spatial correlation detection stage. As the failure rate increases, the faulty nodes are easily affected by neighbor nodes that have not detected the abnormal events, and they are converted into the normal state, which leads to the misjudgment that no abnormal events occurred.
As for the false alarm rate, the compared results are shown in Figure 7. It can be seen that MACAED has a significantly lower false alarm rate than the other four algorithms as the node failure rate increases. This is due to the fact that MACAED fully considers the impact of attribute correlations on abnormal event detection. By calculating the attribute correlation confidence, the fitting degree between the data records and the abnormal event attribute dependency model can be determined, so abnormal events and interference factors can be distinguished effectively.
The running time of the five algorithms is shown in Table 2.
It can be seen from Table 2 that the MACAED algorithm consumes the longest time. The reason is that the MACAED algorithm needs to train the network structure at the beginning. This process takes about 5 s on average. If the trained network structure is saved and reused, the detection phase needs 12.546 − 5 = 7.546 s, which is very close to the running time of the TSCRED and ODERD algorithms.

Conclusion
In this paper, we present a new approach to detect abnormal events in wireless sensor networks. We construct a dependency model of observed attributes based on Bayesian network and propose a new method to measure the dependency of the attributes. By combining it with the temporal correlation detection based on a sliding window and the spatial correlation detection based on neighbor node information, the influence of noise and interference events on event detection results is effectively reduced. Experimental results show that the proposed algorithm can effectively eliminate the influence of interference events. It not only reduces the false alarm rate of abnormal events but also improves the accuracy of event detection compared with the other four algorithms.

Figure 1: An example of attribute dependency model.
Input. WSN data set D
Output. Abnormal event information Info
(1) standardize D into values between 0 and 1
(2) divide D into subsets, choose the first set to learn Bayesian network
(3) choose the network with highest score as attribute dependency model M
(4) for t = w + 1 to epoch // epoch is incremental tick
...
(25) flag = false;
(26) end
Algorithm 1: Abnormal event detection algorithm based on multiattribute correlation.

Figure 2: Location of sensor nodes deployed in IBRL lab.

Figure 4: Influence of sliding window size on the test results.

Figure 5: Influence of the three thresholds on the test results.
$= \{P(X_i \mid \pi(X_i))\}$, where $X_i$ is the $i$th node in the network and $\pi(X_i)$ is the set of parent nodes of node $X_i$. Figure 1 is an example of an attribute dependency model.
and the training sample and select the appropriate search strategy to search the network structure with the highest scoring value.
Learning. According to the trained network structure, the parameters of each node in the network are learned to obtain the corresponding conditional probability table. The conditional probability table contains the probability relations among the variables. Using the maximum likelihood estimation method, suppose $(x_1, x_2, \ldots, x_n)$ is a set of possible values of the random variable set $(X_1, X_2, \ldots, X_n)$. The maximum likelihood estimate $\hat{\theta}$ of $\theta$ is calculated through $\max_{\theta \in \Theta} L(x_1, \ldots, x_n; \theta)$. The conditional probability table for each node is obtained from the sample data and prior knowledge.
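For discrete attributes, the maximum likelihood estimate of each conditional probability table entry reduces to a ratio of counts. The sketch below illustrates this with a hypothetical function name and a made-up two-attribute sample; it uses the sample data only and does not model the prior knowledge mentioned above.

```python
from collections import Counter


def learn_cpt(samples, child, parents):
    """Maximum likelihood CPT: P(child = v | parents = c) = N(c, v) / N(c).

    samples : list of dicts mapping attribute name -> discrete value
    child   : name of the node whose table is learned
    parents : tuple of parent attribute names
    """
    joint = Counter()     # N(c, v): parent configuration together with child value
    marginal = Counter()  # N(c): parent configuration alone
    for s in samples:
        config = tuple(s[p] for p in parents)
        joint[(config, s[child])] += 1
        marginal[config] += 1
    return {key: count / marginal[key[0]] for key, count in joint.items()}


samples = ([{"temp": "high", "smoke": "yes"}] * 3
           + [{"temp": "high", "smoke": "no"}]
           + [{"temp": "low", "smoke": "no"}] * 2)
smoke_cpt = learn_cpt(samples, "smoke", ("temp",))
```

Combinations that never appear in the samples get no entry, which corresponds to an estimated probability of 0; in practice a small pseudo-count is often added to avoid zero probabilities.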

Table 2: Running time of five algorithms.