We first propose MPD-Model, a novel distributed multipreference-driven data fusion model for WSNs. Here, preferences are regarded as the core elements of the collaboration mechanism in a data fusion procedure. We then present MFA, a distributed multi-preference feature-level fusion algorithm based on the weighted average method. Next, to implement feature extraction of wrist-pulse data, we propose FEA, a light-weight adaptive feature extraction algorithm for time series sensed data. We also design TFD-Pattern, a unique human pulse pattern. Based on historical data, we propose an SVM-based algorithm for health status detection tasks. Finally, we implement the proposed methods in a real wearable healthcare monitoring system that was previously developed in-house. We validate the proposed methods using real-world data sets with 2046 pulse samples. Experimental results show that the proposed methods outperform the baseline methods and that the proposed MPD-Model is reasonable and effective.
1. Introduction
The rapid development of Wireless Sensor Networks (WSNs) has brought about some new situations. On the one hand, WSN applications are rapidly expanding from traditional fields, for example, military and national defense and environmental monitoring, to emerging civil fields such as intelligent transportation, healthcare monitoring, and smart home. On the other hand, more mobile devices (e.g., smartphones) or local network systems (e.g., Body Sensor Networks) are frequently treated as one single node of WSNs. Undoubtedly, these situations will lead to large-scale and complex sensor networks. We think such a network is a typical Internet of Things (IoT) [1], and it has characteristics such as interactivity and sociality. It also emphasizes the intelligence and sensing-actuating ability of WSN applications. Consequently, sensed data generated under these situations have the following features [1–4]. (1) The data are polymorphic and heterogeneous. (2) The data are massive. (3) The data are real time. (4) The storage locations of the data are highly diverse. (5) The data have high complexity and strong relationships.
In the new situations, users attach more and more importance to the intelligence of sensor network systems because of growing practical demands. However, it is very difficult to implement this intelligence due to the seriously limited resources on nodes and the lack of historical sensed data. This is particularly true in healthcare monitoring applications based on WSNs. Fortunately, system detection/identification accuracy and intelligence can be greatly improved if multisource data fusion technologies are fully leveraged. Even so, several challenges remain. (1) How to select and evaluate the multisource sensing parameters to fuse? (2) How to establish a novel data fusion model/mechanism to improve detection accuracy? (3) How to leverage the limited resources on nodes to automatically recognize complex parameter patterns?
To address these challenges, we propose MPD-Model, a novel distributed multi-preference-driven data fusion model for WSNs. Building on the traditional data fusion architecture, MPD-Model regards preference information as the core element of the collaboration mechanism in a data fusion procedure. It introduces large-scale historical sensed data into machine learning for data fusion, treats the massive data as a real-world test set, and thus effectively ensures sufficient identification accuracy and high intelligence of WSN application systems. Here, a preference is defined as a kind of description information that differs from the existing nature, accuracy, and detail-level information of monitoring tasks in traditional data fusion methods. It can reflect the subjective wishes of individuals or a group. For example, the preference Task-Parameter Relevance is a kind of importance degree information telling that the accuracies of detection tasks depend on the sensing parameters selected for fusion. Preference information has two characteristics: uncertainty and dynamics. The former means that preference information varies with both the size and the type of WSN applications, which embodies individual data fusion well. The latter indicates that preference information can be quantified and then evaluated once it is formalized.
We systematically address the problems of data fusion and health status detection in a healthcare monitoring system based on WSNs with the following contributions.
To address the basic problem of data fusion, we propose MPD-Model, a novel distributed multi-preference-driven data fusion model for WSNs. Based on MPD-Model, this paper conducts the individual data fusion of multisource sensing parameters in a real-world healthcare monitoring system. MPD-Model can not only solve the incompleteness, inaccuracy, and uncertainty problems of preference information, but also significantly improve the detection accuracy and intelligence of WSN applications. According to MPD-Model, this paper presents several fusion algorithms to carry out practical health status detection tasks.
In MPD-Model, we design and formalize four kinds of preference information (or impact factors). By analyzing multi-preference information, we can quantify and evaluate the importance degree of different multisource sensing parameters during data fusion. We propose MFA, a distributed and unified multi-preference fusion algorithm based on the weighted average method, for generating complex parameter patterns that are further used for health status detection tasks.
To implement MPD-Model in a real wearable health status detection system [1], we propose FEA, a light-weight adaptive feature extraction algorithm for pulse data, to obtain physiological features of human wrist pulse waveforms. According to these features, we design TFD-Pattern, a novel time series pattern, and use the pattern as the input of the MFA algorithm. Considering massive historical pulse data, this paper also presents an SVM-based feature-decision-making algorithm to detect human health statuses, for example, subhealth. The algorithm is also an instance of the yi element in MPD-Model.
We validate our proposed MPD-Model and fusion algorithms using large-scale real-world healthcare data sets. Experimental results show that FEA is effective and outperforms the existing derivative-based method by over 16% in feature extraction accuracy, and that MFA achieves an average fusion ratio of 98.73%. The SVM-based health status detection method supports the intelligence of MPD-Model well, outperforming the AR and WPT methods by 20~30%. Different weight values of multi-preference parameters yield distinct diagnosis performance. This shows that our individual fusion model based on preferences and the weighted average method is reasonable and effective.
2. Multipreference-Driven Data Fusion Model for WSNs
2.1. MPD-Model
In this paper, we propose MPD-Model, a novel distributed multi-preference-driven data fusion model for Wireless Sensor Networks, as shown in Figure 1. The elements of MPD-Model are explained below. Its input may include environment monitoring information, human physiological parameters, and movement/activity information. Ri (1≤i≤n) indicates the output result of the signal/data processing procedure Pi on the sensor node Si. Ri♦Rj (1≤i≠j≤n) denotes the fusion operation between any two output results. Dk (1≤k≤m) denotes one monitoring target state or decision, which is the output of the model. yi stands for any method, algorithm, or model for recognition or decision-making.
Figure 1: MPD-Model: a distributed multipreference-driven data fusion model for Wireless Sensor Networks.
The model is suitable for all kinds of WSN applications for human sensing, particularly healthcare monitoring, due to the following flexible features. (1) Pi, a signal/data processing procedure, may be any kind of processing method or technology that can run on the sensor nodes, and it should be distributed, light-weight, or interactive. (2) The fusion operation, namely, Ri♦Rj, may also be any kind of calculation, algorithm, or technology. (3) The decision-making yi may be a complicated classification method (e.g., Support Vector Machine or Time-Space Factor Graph Model), a user-defined rule, or even the simplest threshold value judgement operation such as an IF statement. (4) The values of Dk depend on the WSN application. The model can identify one monitored target state with Yes or No when n=2. Otherwise, multiple states of one target can be recognized and output.
MPD-Model views intelligence and historical sensed data as its core elements and key features. This is reasonable because they reflect the development tendency of data fusion models, for several reasons. (1) The two features are strongly required by user demands on WSN applications in the new situations. In addition, the characteristics of sensed data generated in WSNs have changed dramatically. (2) Historical sensed data is playing a more and more important role in the data fusion procedure. This role should therefore be clearly embodied in a novel data fusion model; otherwise, the fusion performance of intelligent methods based on analyzing historical data will be seriously weakened. (3) With the rapid expansion of WSN applications from traditional domains to civil fields, the demands on intelligence have dramatically increased. (4) Data processing methods depend on data types, but only a data fusion model can effectively meet this requirement when intelligence is considered. Such a model should be driven by the physical model of WSNs and, aiming at the intelligence of WSN applications, should combine all kinds of data processing technologies and establish a unified architecture or a novel fusion mechanism to efficiently implement users’ requirements.
2.2. Formalization of Preferences’ Information/Impact Factors
The influence of sensing parameters on detection decision-making results depends on the specific WSN application. Thus, we propose an individual data fusion model for WSNs. The model first computes the fusion weights of different impact factors according to the features of one WSN application, and then combines and fuses these factors using the weighted average method. In this way, more reasonable and accurate event detection results can be obtained. In the model, we design four kinds of ubiquitous preferences (impact factors or constraints) for data fusion in WSNs. They can clearly describe the uniqueness or features of every WSN application. They are formalized and described below.
2.2.1. Task-Parameter Relevance (TPR)
This impact factor measures the relevance degree between sensing parameters and the task types of event detection. For instance, although some parameters, for example, temperature, humidity, and air composition, are often combined to detect a fire event, the temperature parameter will obviously have more influence on fire event detection than the others. TPR is designed from the perspective of the event detection types and computed using the following formula:
$$\mathrm{t\_score}(P,E)=\frac{\vec{q}_P\cdot\vec{q}_E}{\lVert\vec{q}_P\rVert\,\lVert\vec{q}_E\rVert},\tag{1}$$
where $\vec{q}_E$ is the vector of event types and $\vec{q}_P$ is the vector of sensing parameters for the detection or monitoring task.
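Formula (1) is the cosine of the angle between the two vectors, so it can be sketched in a few lines. The following is a minimal illustration only: the paper does not specify how the parameter and event vectors are constructed, so the three-component vectors and the zero-vector convention below are assumptions made for the example.

```python
import numpy as np

def t_score(q_p, q_e):
    """Cosine similarity between a parameter vector q_p and an event-type vector q_e (formula (1))."""
    q_p = np.asarray(q_p, dtype=float)
    q_e = np.asarray(q_e, dtype=float)
    denom = np.linalg.norm(q_p) * np.linalg.norm(q_e)
    if denom == 0.0:
        # Assumption: relevance is taken as 0 when either vector is all zeros.
        return 0.0
    return float(np.dot(q_p, q_e) / denom)

# Hypothetical vectors, invented purely for illustration.
fire_event = [1.0, 0.0, 0.0]
temperature = [0.9, 0.1, 0.0]
humidity = [0.2, 0.7, 0.1]

print(t_score(temperature, fire_event))
print(t_score(humidity, fire_event))
```

With these made-up vectors the temperature parameter scores higher than humidity for the fire event, which mirrors the fire detection example in the text.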
2.2.2. ECR
This impact factor reflects the relevance degree between the event location and the position where a sensing parameter is collected. Parameters collected at the event location have more influence on event detection performance. For example, in fire detection, temperature parameters at the fire location have a higher reference value than those at other locations. ECR is computed according to formula (2):
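$$\mathrm{e\_score}(P,E)=\begin{cases}\#\,P\text{'s } es \text{ of } E\text{'s location} & \text{if } E\text{'s location}\in L_a,\\ 0 & \text{if } E\text{'s location}\notin L_a,\end{cases}\tag{2}$$
where $es$ indicates an event and $L_a=\{l_1,l_2,\ldots,l_m\}$ is the set of positions from which sensing parameters are collected.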
2.2.3. Observer Recommendation Influence (ORI)
It indicates the importance degree of one parameter in the eyes of observers or domain experts during event detection. Different experts emphasize different sensing parameters in the same event detection task. Thus, the preference that experts assign to a sensing parameter is very important information for fusion. For example, environmental experts can obtain more accurate detection results for an algae outbreak by combining water quality, pH value, and water temperature parameters, whereas ordinary observers can detect it directly by using the value of the algae content. ORI is calculated using the following formulas:
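$$\mathrm{o\_score}(O,E)=\begin{cases}1-\mathrm{spw}(O,E) & \text{if } \mathrm{spw}(O,E)<1,\\ 1 & \text{if } \mathrm{spw}(O,E)\ge 1,\end{cases}\qquad \mathrm{o\_score}(P,E)=\frac{\sum_{P\text{'s observers}}\mathrm{o\_score}(O,E)}{\#\,P\text{'s recommenders}},\tag{3}$$
where $O$ denotes an observer or a domain expert, $E$ an event, and $\mathrm{spw}$ the weight of the shortest path between $O$ and $E$. The term “observers” is equivalent to “recommenders.”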
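The two parts of formula (3) can be sketched as below. This is only an illustration of the formula as printed: how the shortest-path weight spw is obtained is not described in the text, so the function simply takes spw values as inputs, and returning 0 for a parameter with no recommenders is an assumption.

```python
def o_score_observer(spw):
    """First part of formula (3): score of one observer from the shortest-path weight spw(O, E)."""
    return 1.0 - spw if spw < 1.0 else 1.0

def o_score_parameter(spw_values):
    """Second part of formula (3): mean observer score over all recommenders of a parameter."""
    if not spw_values:
        # Assumption: a parameter without recommenders gets no observer influence.
        return 0.0
    return sum(o_score_observer(s) for s in spw_values) / len(spw_values)

# Hypothetical shortest-path weights for three recommenders of one parameter.
print(o_score_parameter([0.25, 0.75, 1.5]))
```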
2.2.4. Collection Cost Preference (CCP)
It measures the collection cost of the sensing parameters selected for data fusion in WSN applications, so that a more reasonable detection solution can be designed. For example, an algae outbreak event can easily be detected using algae content sensors. However, for most environmental managers this approach is impractical because of its technical difficulty or high cost. Following the suggestions of environmental experts, they can instead combine water quality, water temperature, water visibility, and pH value to detect the same event. These kinds of information can be obtained using high-quality, low-price sensors or devices. In this case, the collection cost of sensing parameters undoubtedly affects the detection precision of the algae outbreak event.
Computing CCP requires mapping all kinds of information about both sensing parameters and collecting sensors/devices into the same numeric interval so that comparisons can be made at the same order of magnitude. Therefore, the sigmoid function is used to map all real-valued inputs into the interval [0,1], and a rank score function is applied to map ranking information into the interval [0,1]. They are defined as in (4), respectively,
$$\text{sigmoid: } S(t)=\frac{1}{1+e^{-t}},\qquad \text{rank\_score: } R(r,c)=1-\log_{10}\!\left(1+9\times\frac{r-1}{c-1}\right),\quad c>1,\tag{4}$$
where r indicates the rank of a sensing parameter or a device, and c indicates the number of ranks. We first adjust the input errors of devices and then carry out formatting and normalization. Finally, combining the sensing parameters’ information, we compute the following quantities: t_rank, the rank of the technology maturity degree of sensors/devices; t_count, the number of technology maturity degrees; p_rank, the rank of the popularity degree of devices; p_count, the number of popularity degrees; d_rank, the rank of the difficulty grade/degree for collecting sensing parameters; d_count, the number of difficulty grades; and gpa_score, the percentage transferred from the price of devices. We can then compute the collection cost score of a sensor/device for a parameter using formula (5); the final formula for computing the CCP impact weight is given in (6):
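$$\mathrm{c\_score}(P,I)=S\bigl(R(I\text{'s }\mathrm{t\_rank},\mathrm{t\_count})+R(I\text{'s }\mathrm{p\_rank},\mathrm{p\_count})-R(P\text{'s }\mathrm{d\_rank},\mathrm{d\_count})-I\text{'s }\mathrm{gpa\_score}\bigr),\tag{5}$$
$$\mathrm{c\_score}(P,E)=\frac{\sum_{P\text{'s devices for }E}\mathrm{c\_score}(P,I)}{\#\,P\text{'s devices for }E},\tag{6}$$
where d_count = n, t_count = m, p_count = k, P means a kind of sensing parameter, I indicates a sensor/device, and E means a monitoring task or an event.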
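Formulas (4) to (6) chain together as sketched below. The sketch assumes that the popularity term in (5) is passed through the rank score function R like the other two rank terms, and the device values in the example are invented for illustration; neither comes from the experiments in this paper.

```python
import math

def sigmoid(t):
    """S(t) from formula (4)."""
    return 1.0 / (1.0 + math.exp(-t))

def rank_score(r, c):
    """R(r, c) from formula (4): rank 1 maps to 1, rank c maps to 0."""
    if c <= 1:
        raise ValueError("c must be greater than 1")
    return 1.0 - math.log10(1.0 + 9.0 * (r - 1) / (c - 1))

def c_score_device(t_rank, t_count, p_rank, p_count, d_rank, d_count, gpa_score):
    """Formula (5): collection cost score of one device I for one parameter P."""
    return sigmoid(
        rank_score(t_rank, t_count)      # technology maturity of the device
        + rank_score(p_rank, p_count)    # popularity of the device (assumed to use R)
        - rank_score(d_rank, d_count)    # collection difficulty of the parameter
        - gpa_score                      # price-derived percentage
    )

def c_score_parameter(device_scores):
    """Formula (6): mean device score over all devices that collect P for event E."""
    return sum(device_scores) / len(device_scores)

# Hypothetical devices: (t_rank, t_count, p_rank, p_count, d_rank, d_count, gpa_score)
devices = [(1, 5, 1, 5, 5, 5, 0.0), (3, 5, 2, 5, 2, 5, 0.3)]
scores = [c_score_device(*d) for d in devices]
print(c_score_parameter(scores))
```

Because the last step in (5) is a sigmoid, every device score lies strictly between 0 and 1, which keeps CCP on the same scale as the other impact factors.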
2.3. Historical Sensed Data in MPD-Model
Although many WSN-based applications have employed machine learning methods to conduct recognition in WSNs [1, 2], to the best of our knowledge, few works explicitly introduce Historical Sensed Data (or Historical Data), HSD for short, as a core component of the fusion model [6]. In our opinion, HSD should be involved in data fusion model design due to the following facts.
(1) In WSNs, long-term continuous monitoring inevitably produces massive data. These data are important because they not only provide query services for users, but also lay a solid foundation for decision-making in future data mining.
(2) In WSNs, most parameters, such as pulse, ECG, and physical images, have complex data patterns. Accordingly, diverse machine learning methods are needed to train on, learn, and classify these patterns. Therefore, HSD should undoubtedly be involved in the data fusion procedure of WSN systems.
(3) Parameter types often determine which decision-making methods should be employed to complete a health status detection task. Specifically, in the case of unattended monitoring, only simple data processing methods can be employed.
(4) In the case of human sensing, complex factors such as users’ social relationships might be considered and accordingly call for more complicated data processing methods (e.g., Time-Space Factor Graph Model).
To illustrate the role of HSD in data fusion tasks, we conduct comparative diagnosis experiments based on the AR [7] and WPT [8] methods to recognize five kinds of human health statuses. SVM and user-defined rules are employed as decision-making methods. The comparison results in Figure 2 show that SVM-based methods with HSD often achieve higher recognition accuracies (by 30~40%) than user-defined rules without HSD.
Figure 2: Contribution and importance of Historical Sensed Data (HSD) for data fusion. User-defined rules are designed according to experts’ prior knowledge. On the x-axis, H shows the comparison results of identifying the Hypertensive from the Normal; S: Subhealthy versus Normal; Co: Coronary Heart disease versus Normal; Ci: Cirrhosis disease versus Normal; P: Pregnant versus Normal female.
3. Health Status Detection
Chinese Pulse Diagnosis Theory (CPDT) [9] has been used in clinical diagnosis for thousands of years and has proved to be valuable. According to CPDT, wrist-pulse data contain abundant physiological and pathological information about the human body. Pulse information has different patterns, which can identify different groups of people with a specific health status or symptom/disease. Thus, sensed pulse data can be leveraged to effectively detect human health status.
3.1. TFD-Pattern: A Novel Time Series Pattern for Health Status Detection
In this paper, a unique kind of pulse pattern is designed based on CPDT [9] and our previous research on pulse diagnosis [1, 5]. We first compute 90 time-domain feature variables based on the five feature points (see Figure 3). For example, the feature variable T1/T can be obtained from the T-point and the P-point. We then compute 36 frequency-domain feature variables based on the discrete Fourier transform, as in (7), and the discrete power spectral density function, as in (8). Finally, we obtain 6 feature variables for the pressure value. Each of them is the average value of all the real-time external pulse pressures in one pulse waveform. These computations are suboperations of f(·) in Figure 1:
$$X(k)=\sum_{n=0}^{N-1}x(n)\,e^{-j(2\pi/N)kn},\tag{7}$$
$$\sum_{n=0}^{N-1}x^{2}(n)=\frac{1}{N}\sum_{k=0}^{N-1}\lvert X(k)\rvert^{2}.\tag{8}$$
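As a quick numerical illustration of (7) and (8), the sketch below evaluates the discrete Fourier transform directly from its definition and checks that the energy computed in the time domain equals the energy computed from the spectrum. The input is a synthetic random signal standing in for one pulse waveform; it is not real pulse data, and the helper name `dft` is introduced only for this example.

```python
import numpy as np

def dft(x):
    """Direct evaluation of formula (7): X(k) = sum_n x(n) * exp(-j * 2*pi * k * n / N)."""
    x = np.asarray(x, dtype=float)
    N = len(x)
    n = np.arange(N)
    k = n.reshape(-1, 1)
    return np.exp(-2j * np.pi * k * n / N) @ x

rng = np.random.default_rng(0)
x = rng.standard_normal(64)          # synthetic stand-in for one pulse waveform
X = dft(x)

time_energy = np.sum(x ** 2)                   # left-hand side of (8)
freq_energy = np.sum(np.abs(X) ** 2) / len(x)  # right-hand side of (8)
print(np.isclose(time_energy, freq_energy))
```

The direct O(N²) form is used only to mirror the formula; in practice a fast Fourier transform such as `numpy.fft.fft` gives the same coefficients.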
Figure 3: The five most important physiological feature points of human pulse waveform data [5].
We merge the three types of feature variables into one 132-dimensional pulse pattern that represents a wrist-pulse waveform and call it the time-frequency-domain pattern, TFD-Pattern for short, as shown in Figure 4.
Figure 4: Illustration of our designed TFD-Pattern.
3.2. SVM-Based Feature-Decision-Making Algorithms for Health Status Detection
Since SVM performs well on problems with small training sets, we conduct SVM-based identification experiments on pulse data sample groups with different human health statuses. Based on the decision function of SVM theory, as in (9), we propose a health status detection algorithm that runs on the central server:
$$f(x)=\operatorname{sgn}\!\left(\sum_{i=1}^{k}\alpha_i^{*}y_i K(E_i,x)+b^{*}\right),\tag{9}$$
where $k$ is the number of pulse data samples and $E$ is the set of feature vectors. $K$ may be one of the Linear, Quadratic, Polynomial, RBF, and MLP kernel functions. $\alpha_i y_i$ is the weight vector, with $\sum_{i=1}^{k}\alpha_i y_i=0$.
Note that the algorithm is designed according to our proposed MPD-Model and corresponds to the yi element in the model. Another characteristic of this algorithm is that it is based on our designed TFD-Pattern, which makes it quite different from other pulse diagnosis (or health status detection) methods [7, 8].
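To make the decision function in (9) concrete, the sketch below evaluates it with an RBF kernel on a toy problem. It is an illustration under stated assumptions, not the algorithm deployed on the central server: the two-dimensional support vectors, the coefficients, the bias, and the kernel width `gamma` are all invented (real TFD-Patterns are 132-dimensional), and mapping sgn(0) to +1 is a convention chosen here.

```python
import numpy as np

def rbf_kernel(a, b, gamma=0.5):
    """RBF kernel K(a, b) = exp(-gamma * ||a - b||^2); gamma is a hypothetical setting."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(np.exp(-gamma * np.sum((a - b) ** 2)))

def svm_decide(x, support_vectors, alphas, labels, bias, kernel=rbf_kernel):
    """Formula (9): sign of the kernel-weighted sum over the training samples plus the bias."""
    total = sum(a * y * kernel(e, x)
                for a, y, e in zip(alphas, labels, support_vectors))
    return 1 if total + bias >= 0 else -1   # convention: sgn(0) is treated as +1

# Toy model: label -1 could stand for "normal", +1 for a target health status.
support_vectors = [[0.0, 0.0], [2.0, 2.0]]
labels = [-1, 1]
alphas = [1.0, 1.0]          # satisfies sum(alpha_i * y_i) = 0
bias = 0.0

print(svm_decide([1.9, 2.1], support_vectors, alphas, labels, bias))
print(svm_decide([0.1, -0.1], support_vectors, alphas, labels, bias))
```

Swapping `kernel` for a linear or polynomial function gives the other kernel variants listed above without changing the decision rule.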
4. Distributed Multipreference-Driven Feature-Level Fusion Algorithms Based on Weighted Average Method
For f(·): R1♦R2♦⋯♦Rm in MPD-Model, we present MFA, a novel distributed and unified multi-preference fusion algorithm based on the weighted average method. MFA first computes the impact weight of each multi-preference parameter for detecting health statuses using the probabilities of the four impact factors mentioned above, as in (10), and then fuses all physiological parameters to generate parameter patterns for further health status detection decision-making, as in (11):
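$$\mathrm{impact\_weight}(P,D)=\frac{\mathrm{t\_score}(P,D)\times\mathrm{t\_weight}+\mathrm{m\_score}(P,D)\times\mathrm{m\_weight}+\mathrm{e\_score}(P,D)\times\mathrm{e\_weight}+\mathrm{c\_score}(P,D)\times\mathrm{c\_weight}}{\mathrm{t\_weight}+\mathrm{m\_weight}+\mathrm{e\_weight}+\mathrm{c\_weight}},\tag{10}$$
$$\mathrm{Fusion}(P_1,P_2,\ldots,P_m)=\bigcup_{i=1}^{m}f(P_i)\,\blacklozenge\,\mathrm{impact\_weight}(P_i,D),\tag{11}$$
where the initial values of all weights are set to 1 and $\sum_{i=1}^{m}\mathrm{impact\_weight}(P_i,D)=1$. As feedback is received from users, the model uses the feedback data to optimize these weight values.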
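A minimal sketch of (10) and (11) follows. Several points are assumptions rather than details given in the text: the score keys follow the names used in (10); the constraint that the impact weights sum to 1 is met here by dividing by their total; and the fusion operator ♦ is read as scaling each parameter's feature vector by its weight, with the union realized as concatenation. All numeric values are invented.

```python
KEYS = ("t", "m", "e", "c")   # score names as they appear in formula (10)

def impact_weight(scores, weights=None):
    """Formula (10): weighted average of the four preference scores of one parameter."""
    if weights is None:
        weights = {k: 1.0 for k in KEYS}   # initial setting: all weights equal to 1
    num = sum(scores[k] * weights[k] for k in KEYS)
    den = sum(weights[k] for k in KEYS)
    return num / den

def normalize(raw_weights):
    """Assumed normalization step so that the impact weights sum to 1."""
    total = sum(raw_weights)
    return [w / total for w in raw_weights]

def fuse(feature_vectors, raw_weights):
    """One possible reading of formula (11): scale each vector by its weight, then concatenate."""
    fused = []
    for features, w in zip(feature_vectors, normalize(raw_weights)):
        fused.extend(w * v for v in features)
    return fused

# Hypothetical scores for one parameter and toy feature vectors for two parameters.
print(impact_weight({"t": 0.8, "m": 0.6, "e": 1.0, "c": 0.4}))
print(fuse([[1.0, 2.0], [4.0]], [1.0, 3.0]))
```

User feedback could later be folded in by passing an updated `weights` dictionary instead of the all-ones default.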
5. Experimental Results
5.1. Experimental Setup
In our previous work, we developed a wearable healthcare monitoring sensor network system [1] for health status detection. This system helped us collect the real-world healthcare data sets in large-scale clinical experiments that took place at the Institute of Computing Technology (ICT), Chinese Academy of Sciences (CAS), in 2009 and at the 2010 Hi-tech Fair of China in Shenzhen City, respectively. The distribution of wrist-pulse data samples is shown in Figure 5.
Figure 5: Wrist-pulse sample distribution in the real-world healthcare data sets.
We validate the effectiveness of the proposed methods by comparing them with the following baseline methods.
5.1.1. Derivative-Based Feature Extraction Method [11]
In this method, an automated derivative-based time-domain feature extraction method for wrist pulse waveforms is proposed to extract the magnitude-type features hi (i = 1 to 5). Here, the set of hi is equal to the five feature points (see Figure 3) that are employed in our work.
5.1.2. AR Pulse Diagnosis Method [7]
In this health status detection method, an autoregressive- (AR-) based method is proposed to extract the pulse signal features. The mean and variance of the prediction error are calculated and selected as features. The selected features are then taken as inputs to a support vector machine (SVM) for diseases classification.
5.1.3. WPT Pulse Diagnosis Method [8]
This health status detection method extracts pulse waveform features based on the optimal Wavelet Packet Transform (WPT). Subband energies contained in the best basis are extracted as features, and SVM classifiers are trained to differentiate cholecystitis patients and nephritic syndrome patients from normal people.
We evaluate our method from the following four angles.
Precision. We evaluate the proposed method in terms of Extraction Precision and Diagnosis Precision compared with the corresponding baseline methods.
Fusion Ratio. It states the ratio of the fused file size to the original file size. The fused file is used to save multi-preference parameter data. The fusion ratio is computed by the formula (1 − fused size/original size) × 100%.
CPU Time. It indicates how long the learning procedures take for the different SVM-based methods. It is employed to measure the real-time performance of the compared methods.
Significant Difference. It tells users which feature variables in TFD-Pattern help most in revealing the human symptom condition. The existence of any valid feature variable with a significant difference would indicate that our proposed method is effective.
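The fusion ratio formula above is simple enough to check by hand; the short sketch below does so for one row of Table 2 (an original file of 1491 KB reduced to 14 KB). The function name is introduced here for illustration only.

```python
def fusion_ratio(original_size, fused_size):
    """Fusion ratio in percent: (1 - fused size / original size) * 100."""
    if original_size <= 0:
        raise ValueError("original_size must be positive")
    return (1.0 - fused_size / original_size) * 100.0

# File sizes in KB taken from the NORMAL0613B_4_01 row of Table 2.
print(round(fusion_ratio(1491, 14), 2))
```

Since the table reports sizes rounded to whole kilobytes, ratios recomputed this way can differ from the reported ones in the last digit.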
5.2. Feature Extraction
According to CPDT [9], we must first extract the five most important time-domain feature points (see Figure 3) before running the MFA algorithm in our health status detection. Thus, we propose FEA, an adaptive feature extraction algorithm for pulse data. It can be considered an instance of Pi→Ri (1≤i≤n) in MPD-Model. Specifically, we formalize the slope variability as a trend decision function, as in (12). The final feature points are then selected from the candidates according to several adaptive parameters. The most important adaptive parameter, Threshold_of_PULSE, is used to find the PB-point, the TD-point, and the PD-point, and it is computed by (13). Another adaptive parameter, Amp_pulse, indicates the mean value of all pulse cycles in one pulse waveform. It can be computed by (14):
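$$f(i)=\begin{cases}0 & \text{other},\\ 1 & s(i)<0,\ s(i+1)<0,\\ 2 & s(i)>0,\ s(i+1)<0,\\ -2 & s(i)<0,\ s(i+1)>0,\end{cases}\tag{12}$$
$$\mathrm{Threshold\_of\_PULSE}=\left(\frac{1}{F_s}+m\right)\mathrm{Amp\_pulse},\tag{13}$$
$$\mathrm{Amp\_pulse}=\frac{1}{\mathrm{Peak\_Num}}\sum_{k=1}^{\mathrm{Peak\_Num}-1}\bigl(\text{T-point}(k+1)-\text{T-point}(k)\bigr).\tag{14}$$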
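The trend decision function in (12) can be illustrated as follows. The sketch assumes that the slope s(i) is the first difference between neighbouring samples, which the text does not state explicitly, and the comments on what each code suggests (descending segment, local maximum, local minimum) are an interpretation of the sign patterns rather than wording from the paper. The input waveform is a toy sequence.

```python
import numpy as np

def trend_decision(signal):
    """Apply formula (12) to each pair of consecutive slopes s(i), s(i+1)."""
    s = np.diff(np.asarray(signal, dtype=float))   # assumed slope: first difference
    n = max(len(s) - 1, 0)
    f = np.zeros(n, dtype=int)                     # 0 covers the "other" case
    for i in range(n):
        if s[i] < 0 and s[i + 1] < 0:
            f[i] = 1       # both slopes negative: still descending
        elif s[i] > 0 and s[i + 1] < 0:
            f[i] = 2       # rise followed by fall: candidate local maximum
        elif s[i] < 0 and s[i + 1] > 0:
            f[i] = -2      # fall followed by rise: candidate local minimum
    return f

# Toy waveform with one peak and one valley.
print(trend_decision([0, 1, 3, 2, 1, 2, 4]).tolist())
```

Candidate points flagged with 2 or -2 would still need to be filtered with the adaptive parameters in (13) and (14) before being accepted as feature points.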
5.2.1. Effectiveness of Adaptive Parameters
To illustrate the effectiveness of Threshold_of_PULSE, we run the FEA algorithm on normal pulse data with 50 Hz, 100 Hz, and 1000 Hz sampling frequencies, respectively. The results are shown in Figure 6. The figure shows that all the beacon feature points can be accurately extracted.
Figure 6: Effectiveness of the adaptive parameter Threshold_of_PULSE.
Similarly, according to our prior knowledge, we may set Amp_pulse = T/2 when an irregular slippery pulse is processed, and Amp_pulse = T/3 when an irregular normal pulse is processed, where T indicates the length of a pulse waveform cycle. The results in Figure 7 indicate that the FEA algorithm also performs well under different wrist-pulse conditions.
Figure 7: Effectiveness of prior knowledge: Amp_pulse.
5.2.2. Performance Evaluation of Feature Extraction
We compare the proposed FEA algorithm with a typical derivative-based method [11] by extracting the five feature points from the wrist-pulse samples of 354 healthy people. Table 1 shows the extraction accuracies for 10 random examples.
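Table 1: Comparison results of FEA versus the derivative-based method.

| Number | ID | Derivative-based method | FEA |
|---|---|---|---|
| … | … | … | … |
| 26 | A19_709A | 71% | 86% |
| 27 | A20_809C | 70% | 84% |
| 28 | A22_737E | 67% | 90% |
| 29 | A37_719E | 72% | 82% |
| 30 | A66_524G | 69% | 89% |
| 31 | B20_735E | 74% | 89% |
| 32 | B9_0724C | 64% | 84% |
| 33 | B10_723E | 70% | 78% |
| 34 | B18_706K | 73% | 89% |
| 35 | B19_706G | 70% | 88% |
| … | … | … | … |
| Average | | 70% | 86% |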
Table 1 shows that FEA outperforms the derivative-based method (by 16 percentage points on average) for the following reasons. (1) Although the derivative-based method is also light-weight, it is vulnerable to heavy noise from human body activities. (2) FEA is not only light-weight, which gives strong support to real-time pulse signal processing tasks, but also makes full use of many adaptive parameters based on automatic computing or prior knowledge.
5.3. Fusion Effectiveness
In our experiments, the different types of data at each processing stage are saved into respective files. Considering existing information fusion estimators, for example, the sample-based estimator using measurements [12], we calculate data fusion/reduction ratios by comparing file sizes before and after processing the sensed pulse data. The reduction in file size directly shows the performance of the MFA algorithm. Table 2 displays the data fusion ratios of MFA. In this table, the first column identifies one wrist-pulse sample, the second column gives the file size before running the MFA algorithm, the third column gives the file size after running the MFA algorithm, and the final column gives the data fusion ratio. The results show that the MFA algorithm has an excellent fusion effect on wrist-pulse data.
Table 2: Data fusion ratios of MFA algorithm.

| Data file name | Original size (KB) | Size after MFA (KB) | Ratio |
|---|---|---|---|
| … | … | … | … |
| NORMAL0613B_4_01 | 1491 | 14 | 99.06% |
| hBP_0612A_0_01 | 658 | 6 | 99.08% |
| hBP_AO0709B_1_01 | 192 | 3 | 98.44% |
| NORMAL0612B_0_01 | 271 | 5 | 98.15% |
| NORMAL0709A_5_02 | 858 | 8 | 99.07% |
| hBP_AO0709_1_01 | 119 | 2 | 98.32% |
| NORMAL0715A_1_01 | 1058 | 11 | 98.96% |
| … | … | … | … |
| Average | | | 98.73% |
5.4. Health Status Detection Performance
We make evaluations from the following aspects.
5.4.1. Impact Factor Weight Validation
According to Chinese Pulse Diagnosis Theory (CPDT) [9, 10], we investigate which wrist-pulse parameters from diverse sources (i.e., Cun, Guan, and Chi on the left and right human wrists [13]) have great impact on revealing disease/symptom information. Based on this investigation (see Figure 8), we further compute the different weight values that are listed in the 2–6 columns of Table 3. According to formula (1), we detect four kinds of health statuses or diseases/symptoms: hypertension (H), coronary (Co), cirrhosis (Ci), and pregnancy (P). These weights are selected and grouped into the three weight solutions shown in Table 3. Using these weight solutions, we conduct comparative experiments and give the results in Figures 9 and 10. The two figures show that the greater the weight value of the related impact factor (Pi), the higher the health status detection precision. This indicates that our multi-preference-driven, data-fusion-based impact factor evaluation works well.
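Table 3: Different weights of multi-preference parameters for health status detection.

| Health status | P1 | P2 | P3 | P4 | P5 | P6 |
|---|---|---|---|---|---|---|
| Weights solution-I of impact factors | | | | | | |
| H | 0.5 | 0 | 0 | 0 | 0 | 0.5 |
| Co | 1 | 0 | 0 | 0 | 0 | 0 |
| Ci | 1 | 0 | 0 | 0 | 0 | 0 |
| P | 0 | 0 | 0.5 | 0 | 0 | 0.5 |
| Weights solution-II of impact factors | | | | | | |
| H | 0 | 0.25 | 0.25 | 0.25 | 0.25 | 0 |
| Co | 0 | 0.2 | 0.2 | 0.2 | 0.2 | 0 |
| Ci | 0 | 0.2 | 0.2 | 0.2 | 0.2 | 0 |
| P | 0.25 | 0.25 | 0 | 0.25 | 0.25 | 0 |
| Weights solution-III of impact factors | | | | | | |
| H | 0.3 | 0.1 | 0.1 | 0.1 | 0.1 | 0.3 |
| Co | 0.7 | 0.06 | 0.06 | 0.06 | 0.06 | 0.06 |
| Ci | 0.8 | 0.04 | 0.04 | 0.04 | 0.04 | 0.04 |
| P | 0.02 | 0.02 | 0.4 | 0.02 | 0.02 | 0.4 |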
Figure 8: Wrist-pulse data from diverse sources have different effects on revealing health status conditions according to Traditional Chinese Medicine [10].
Figure 9: AR-based comparison results of different weight solutions for multi-preference parameters. The characters L, Q, P, R, and M stand for the Linear, Quadratic, Polynomial, RBF, and MLP kernel functions of SVM, respectively. The title of every subfigure tells which health status was recognized.
Figure 10: WPT-based comparison results of different weight solutions for multi-preference parameters (explanations of this figure have been given in Figure 9).
5.4.2. Comparative Evaluation
In this paper, we compare our method with the baseline AR [7] and WPT [8] methods to validate the effectiveness of the proposed MPD-Model. The two methods and ours all focus on feature extraction from wrist-pulse data and use SVM to distinguish patients with different diseases from normal people. The difference is that each method employs a distinct wrist-pulse pattern for recognition. Thus, we can obtain the solid comparison results shown in Figure 11. These comparative experiments are conducted on the real healthcare data sets mentioned above (see Figure 5).
Figure 11: Comparison with AR and WPT methods. Parts (a), (b), (c), and (d) show SVM-based diagnosis experiments with the Linear, Quadratic, Polynomial, and RBF kernel functions, respectively. The y-axis indicates the Precision (%) of health status or disease/symptom identification, and the x-axis is explained in Figure 2.
The CPU times of our method, the AR method, and the WPT method are approximately 17.3121 s, 164.7448 s, and 42.0289 s, respectively. Among all comparative experiments, our SVM-based health status detection method takes the least running time, which indicates that it is light-weight and better suited to a wearable healthcare monitoring system based on WSNs. For hypertension, coronary, cirrhosis, and pregnancy symptoms, our method outperforms the AR and WPT methods (by 15~30% on average). Thus, our method is more effective in detecting human health status. However, the identification accuracy of our method is slightly lower than that of the two methods when detecting subhealth status. There are two reasons for this. (1) Low feature extraction accuracy for pulse waveform data makes it hard to clearly distinguish some pulse patterns from others. The AR and WPT methods do not have this problem because their algorithms for computing feature vectors are totally different from ours. (2) In general, it is very difficult to distinguish people with subhealth status from normal people since such patients often have no clear disease characteristics.
5.4.3. Effectiveness Evaluation
Finally, a t-test is used to identify the feature variables that differ significantly between the two groups; they are listed in Table 4.
Table 4: Feature variables with significant difference.

| Feature variable number | Left/right wrist | Cun, Guan, or Chi | Feature variable name |
| --- | --- | --- | --- |
| No. 22 | Left | Cun | Real-time external pulse pressure (EPP) |
| No. 59 | Left | Chi | S5/S |
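As a rough illustration of how one feature variable can be screened for a significant difference between two groups, the sketch below computes a two-sample t statistic. The paper does not say which variant of the t-test was used; Welch's unequal-variance form is assumed here, and the function name `welch_t` and the sample values are hypothetical.

```python
import math
from statistics import mean, variance


def welch_t(group_a, group_b):
    """Return (t, df) for Welch's two-sample t-test.

    t is the difference of the group means divided by its standard
    error; df is the Welch-Satterthwaite degrees of freedom.
    """
    na, nb = len(group_a), len(group_b)
    va, vb = variance(group_a), variance(group_b)  # sample variances (n - 1)
    se2 = va / na + vb / nb  # squared standard error of the mean difference
    t = (mean(group_a) - mean(group_b)) / math.sqrt(se2)
    df = se2 ** 2 / ((va / na) ** 2 / (na - 1) + (vb / nb) ** 2 / (nb - 1))
    return t, df


if __name__ == "__main__":
    # Hypothetical values of one feature variable in two groups.
    patients = [0.31, 0.35, 0.29, 0.38, 0.33]
    healthy = [0.22, 0.25, 0.21, 0.27, 0.24]
    t, df = welch_t(patients, healthy)
    print(f"t = {t:.3f}, df = {df:.2f}")
```

A feature variable would be kept when the absolute value of t exceeds the critical value of the t distribution for the computed degrees of freedom at the chosen significance level.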
As Figure 3 shows, the feature variable S5/S is the ratio of the area between the TD-point and the T-point to the whole area of one pulse cycle. It has a clear physiological meaning: the reflected wave of blood flow within one heartbeat cycle is stronger in patients than in healthy people. This may be one reason why the blood pressure of hypertension patients is persistently high. The other feature variable, real-time External Pulse Pressure (EPP), indicates that the real-time blood pressure values recorded by the pulse-sensor nodes differ distinctly for the patients. It explains the pathological phenomenon that a patient's blood pressure is higher than a healthy person's more directly than S5/S does.
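An area-ratio feature such as S5/S can be sketched as follows. The trapezoidal rule, the zero baseline, the uniform sampling interval `dt`, and the index arguments `seg_start` and `seg_end` (standing in for the positions of the two feature points within one cycle) are all assumptions for illustration; the paper's own extraction algorithm is not reproduced here.

```python
def trapezoid_area(samples, start, end, dt=1.0):
    """Area under samples[start..end] (inclusive) by the trapezoidal rule."""
    return sum((samples[i] + samples[i + 1]) * dt / 2.0 for i in range(start, end))


def area_ratio(cycle, seg_start, seg_end, dt=1.0):
    """Ratio of the area between two sample indices to the whole-cycle area.

    cycle holds the amplitude samples of one pulse cycle, measured from
    a zero baseline; seg_start and seg_end are hypothetical indices of
    the two feature points that bound the segment.
    """
    total = trapezoid_area(cycle, 0, len(cycle) - 1, dt)
    if total == 0:
        raise ValueError("pulse cycle has zero area")
    return trapezoid_area(cycle, seg_start, seg_end, dt) / total


if __name__ == "__main__":
    # Hypothetical, heavily simplified pulse cycle.
    cycle = [0.0, 2.0, 4.0, 2.0, 0.0]
    print(area_ratio(cycle, 3, 4))
```

Because the sampling interval cancels in the ratio, the result does not depend on `dt` as long as the samples are uniformly spaced.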
6. Related Work
WSN-based healthcare solutions have become a hot research topic, advanced by research on activity recognition [14], physiological data gathering [15], pattern recognition [16], and so forth. Although much existing work collects blood pulse data for healthcare monitoring, few studies systematically examine how to fuse parameters from diverse sources based on impact factors.
The five most important feature points (see Figure 3) are considered by most researchers to be the basis for objectifying pulse study [17]. Currently, most extraction methods for time-domain features are derivative based [11]. Such methods are easy to implement on sensor nodes but are suitable only for regular pulse waveforms with a high Signal-to-Noise Ratio (SNR). To address the problem of limited resources, feature extraction in our work focuses on sensed pulse data with low SNR, and the proposed algorithm is light-weight and adaptive. Much work focuses on pulse detection and classification. For example, [18] proposed an approach based on Lempel-Ziv complexity analysis to classify seven pulse patterns that exhibit different rhythms. In addition, Chinese Pulse Diagnosis Theory (CPDT) has been objectified to support pulse acquisition, pulse feature extraction, and pulse diagnosis [17]. As a powerful method for pattern classification with small training sets, SVM is often used to diagnose pulse signals/patterns that are computerized using modified auto-regressive models [7] or spatial and spectrum features [19].
Although some data fusion methods already exist, for example, for crash fault correction [20], privacy protection [21], position-based aggregator node election in WSNs [22, 23], and environmental monitoring [24], they are not suitable for Wearable Sensor Networks because of complicated sensed data patterns and high intelligence requirements [6]. Reference [25] also focuses on a multisource fusion model but employs a chi-square distribution with n degrees of freedom instead of the weighted average method. Reference [12] discusses information fusion estimation problems and provides useful estimation methods for our proposed data fusion model. The continuous classification of multisource sensed data in [26, 27] inspired us to introduce historical data into the design of a multisource-driven data fusion model.
7. Conclusion
In this paper, we systematically investigate the problems of data fusion and health status detection and propose MPD-Model, a novel distributed multi-preference-driven data fusion model for WSNs. Based on MPD-Model, we present the MFA algorithm (an instance of f(·) in MPD-Model), the FEA algorithm (Pi), and an SVM-based health status detection method (yi). We also design TFD-Pattern, a novel time series pattern based on CPDT. It is a human pulse pattern that serves as both the output of FEA and the input of MFA. Experimental results show that the proposed methods all outperform the baseline methods, with a feature extraction accuracy of +86% and a system diagnosis accuracy of +75%, and that the proposed MPD-Model is reasonable and effective.
Acknowledgments
This paper is supported in part by the National Basic Research Program of China (973 Program) under Grant no. 2011CB302803, the "Strategic Priority Research Program" of the Chinese Academy of Sciences under Grant no. XDA060307000, the National Natural Science Foundation of China under Grant no. 61003292, and the Key Project of Natural Science Foundation of Jiangsu Province under Grant no. BK201100.
References
[1] J. J. Zhang, R. Wang, and S. L. Lu, "EASiCare: design and implementation of a portable Chinese pulse-wave retrieval system," in Proceedings of the 9th ACM Conference on Embedded Networked Sensor Systems (SenSys '11), pp. 149–161, Seattle, Wash, USA, 2011.
[2] M. Keally, G. Zhou, and G. L. Xing, "PBN: towards practical activity recognition using smartphone-based body sensor networks," in Proceedings of the 9th ACM Conference on Embedded Networked Sensor Systems (SenSys '11), pp. 246–259, Seattle, Wash, USA, 2011. doi:10.1145/2070942.2070968
[3] T. Park, J. Lee, I. Hwang, C. Yoo, L. Nachman, and J. Song, "Demo: e-gesture - a collaborative architecture for energy-efficient gesture recognition with hand-worn sensor and mobile devices," in Proceedings of the 9th ACM Conference on Embedded Networked Sensor Systems (SenSys '11), pp. 359–360, Seattle, Wash, USA, 2011. doi:10.1145/1999995.2000034
[4] E. Ertin, N. Stohs, and S. Kumar, "AutoSense: unobtrusively wearable sensor suite for inferring the onset, causality, and consequences of stress in the field," in Proceedings of the 9th ACM Conference on Embedded Networked Sensor Systems (SenSys '11), pp. 274–287, Seattle, Wash, USA, 2011.
[5] J. B. Gong, S. L. Lu, R. Wang, and L. Cui, "PDhms: pulse diagnosis via wearable healthcare sensor network," in Proceedings of the IEEE International Conference on Communications (ICC '11), Kyoto, Japan, 2011.
[6] E. F. Nakamura, A. A. F. Loureiro, and A. C. Frery, "Information fusion for wireless sensor networks: methods, models, and classifications," vol. 39, no. 3, pp. A9/1–A9/55, 2007. doi:10.1145/1267070.1267073
[7] Y. Chen, L. Zhang, D. Zhang, and D. Y. Zhang, "Computerized wrist pulse signal diagnosis using modified auto-regressive models," vol. 35, no. 3, pp. 321–328, 2011. doi:10.1007/s10916-009-9368-4
[8] Q. L. Guo, K. Q. Wang, D. Y. Zhang, and N. M. Li, "A wavelet packet based pulse waveform analysis for cholecystitis and nephrotic syndrome diagnosis," in Proceedings of the International Conference on Wavelet Analysis and Pattern Recognition (ICWAPR '08), pp. 513–517, Hong Kong, China, August 2008. doi:10.1109/ICWAPR.2008.4635834
[9] L. Xu, D. Zhang, and W. Kong, "Wavelet-based cascaded adaptive filter for removing baseline drift in pulse waveforms," vol. 52, no. 11, pp. 1973–1975, 2005.
[10] Z. F. Fei, People's Medical Publishing House, Beijing, China, 2003.
[11] C. Xia, Y. Li, J. Yan, Y. Wang, H. Yan, R. Guo, and F. Li, "Wrist pulse waveform feature extraction and dimension reduction with feature variability analysis," in Proceedings of the 2nd International Conference on Bioinformatics and Biomedical Engineering (ICBBE '08), pp. 2048–2051, May 2008. doi:10.1109/ICBBE.2008.841
[12] R. N. Madan and N. S. V. Rao, "A perspective on information fusion problems," vol. 5, no. 1, p. 4, 2009. doi:10.1080/15501320802498430
[13] Y. Chen, L. Zhang, D. Zhang, and D. Y. Zhang, "Wrist pulse signal diagnosis using modified Gaussian models and Fuzzy C-Means classification," vol. 31, no. 10, pp. 1283–1289, 2009. doi:10.1016/j.medengphy.2009.08.008
[14] T. Gu, Z. Wu, X. Tao, H. K. Pung, and J. Lu, "epSICAR: an emerging patterns based approach to sequential, interleaved and concurrent activity recognition," in Proceedings of the 7th Annual IEEE International Conference on Pervasive Computing and Communications (PerCom '09), pp. 1–9, Galveston, Tex, USA, March 2009. doi:10.1109/PERCOM.2009.4912776
[15] R. Cristescu, B. Beferull-Lozano, and M. Vetterli, "On network correlated data gathering," in Proceedings of the 23rd Annual Joint Conference of the IEEE Computer and Communications Societies (INFOCOM '04), pp. 2571–2582, March 2004.
[16] J. M. Shapiro, "Embedded image coding using zerotrees of wavelet coefficients," vol. 41, no. 12, pp. 3445–3462, 1993. doi:10.1109/78.258085
[17] L. Xu, K. Wang, and D. Zhang, "Objectifying researches on traditional Chinese pulse diagnosis," pp. 56–63, 2003.
[18] L. Xu, D. Zhang, K. Wang, and L. Wang, "Arrhythmic pulses detection using Lempel-Ziv complexity analysis," vol. 2006, Article ID 018268, 12 pages, 2006. doi:10.1155/ASP/2006/18268
[19] D. Y. Zhang, W. Zuo, and D. Zhang, "Wrist blood flow signal-based computerized pulse diagnosis using spatial and spectrum features," vol. 4, no. 3, pp. 361–366, 2010.
[20] B. Balasubramanian and V. K. Garg, "Fused data structures for handling multiple faults in distributed systems," in Proceedings of the International Conference on Distributed Computing Systems (ICDCS '11), pp. 677–688, 2011.
[21] W. B. He, X. Liu, H. Nguyen, and K. Nahrstedt, "A cluster-based protocol to enforce integrity and preserve privacy in data aggregation," in Proceedings of the 29th IEEE International Conference on Distributed Computing Systems Workshops (ICDCS '09), pp. 14–19, 2009.
[22] L. Buttyán and P. Schaffer, "Position-based aggregator node election in wireless sensor networks," vol. 2010, Article ID 679205, 15 pages, 2010. doi:10.1155/2010/679205
[23] A. Mpitziopoulos, D. Gavalas, C. Konstantopoulos, and G. Pantziou, "CBID: a scalable method for distributed data aggregation in WSNs," vol. 2010, Article ID 206517, 13 pages, 2010. doi:10.1155/2010/206517
[24] S. Dauwe, T. Van Renterghem, D. Botteldooren, and B. Dhoedt, "Multiagent-based data fusion in environmental monitoring networks," vol. 2012, Article ID 324935, 15 pages, 2012. doi:10.1155/2012/324935
[25] J. A. Benediktsson and I. Kanellopoulos, "Classification of multisource and hyperspectral data based on decision fusion," vol. 37, no. 3, pp. 1367–1377, 1999. doi:10.1109/36.763301
[26] G. Xing, J. Wang, K. Shen, Q. Huang, X. Jia, and H. C. So, "Mobility-assisted spatiotemporal detection in wireless sensor networks," in Proceedings of the 28th International Conference on Distributed Computing Systems (ICDCS '08), pp. 103–110, July 2008. doi:10.1109/ICDCS.2008.81
[27] B. Waske and J. A. Benediktsson, "Fusion of support vector machines for classification of multisensor data," vol. 45, no. 12, pp. 3858–3866, 2007. doi:10.1109/TGRS.2007.898446