Cyber-physical systems have grown rapidly and have attracted considerable attention in recent years. Retrieving and mining useful information from massive sensor data streams with spatial, temporal, and other multidimensional information has become an active research area. Moreover, recent research has shown that, in real applications, clusters of streams change from a comprehensive spatial-temporal viewpoint. In this paper, we propose a spatial-temporal clustering algorithm (STClu) based on nonnegative matrix trifactorization that exploits both time-series observational data streams and geospatial relationships to cluster multiple sensor data streams. Instead of directly clustering the streams periodically, STClu incorporates the spatial relationship between sensors in proximity and takes historical information into account. Furthermore, we develop an iterative updating optimization algorithm for STClu. The effectiveness and efficiency of STClu are demonstrated in experiments on both real and synthetic data sets. The results show that STClu outperforms existing methods for clustering sensor data streams.
Cyber-physical systems (CPS) have grown rapidly and have attracted considerable attention in recent years [
To obtain interesting relationships and useful information from multiple data streams, a variety of methods have been developed over the past decades. Beringer and Hüllermeier [
The values of a sensor in the temporal domain usually exhibit spatial correlation, meaning that a sensor's readings are influenced by nearby sensors. For example, in transportation systems, congestion expands swiftly along a street and influences nearby sensors. A serious congestion usually lasts for a few hours and covers hundreds of sensors at its full size. As time passes, it shrinks slowly, its coverage decreases, and it finally disappears. As shown in Figure
Time-series of upstream and downstream traffic flow.
In this paper, we propose a spatial-temporal clustering algorithm (
The rest of this paper is organized as follows. Related research work is introduced in Section
Graphs are used in a wide range of applications, such as transportation networks, sensor networks, and social networks, so various problems are studied under graph mining. Finding clusters in graphs becomes a new challenge when the graph evolves over time. A matrix is a common representation of a graph; the relationships between matrix factorization and
Singular value decomposition (SVD) has served as a building block for many important applications, such as PCA and LSI [
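As a minimal illustration of the building block mentioned above, the truncated SVD yields the best rank-k approximation of a matrix in the Frobenius norm, which is what PCA and LSI ultimately rely on:

```python
import numpy as np

def low_rank_approx(X, k):
    """Return the best rank-k approximation of X in the Frobenius norm."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

rng = np.random.default_rng(0)
X = rng.standard_normal((6, 2)) @ rng.standard_normal((2, 8))  # exact rank 2
X2 = low_rank_approx(X, 2)
assert np.allclose(X, X2)  # a rank-2 matrix is recovered exactly at k = 2
```

This is generic illustrative code, not part of the algorithms compared in this paper.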
Nonnegative matrix factorization (NMF) [
In particular, the result of a
To extend the applicable range of NMF methods to unconstrained data matrices, Semi-NMF has been motivated from the perspective of clustering [
To expand the range of application of NMF, the Semi-NMF and Convex-NMF algorithms have been proposed in [
However, the traditional NMF, Semi-NMF, and Convex-NMF are linear models, and they may fail to discover the nonlinearities of data streams. In the real world, data streams often have a latent nonlinear structure. The kernel method is a powerful technique for extracting useful information from nonlinear correlations: the data are mapped nonlinearly into a kernel feature space, in which the Convex-NMF method can then be carried out to process the nonlinear data.
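Since Convex-NMF touches the data only through inner products, kernelization amounts to replacing the Gram matrix with a kernel matrix. The following sketch, assuming an RBF kernel (our choice for illustration, with our own parameter names), shows this first step:

```python
import numpy as np

def rbf_kernel(X, gamma=1.0):
    """K[i, j] = exp(-gamma * ||x_i - x_j||^2) for columns x_i of X."""
    sq = np.sum(X**2, axis=0)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X.T @ X
    return np.exp(-gamma * np.maximum(d2, 0.0))  # clip tiny negatives

X = np.array([[0.0, 0.1, 3.0],
              [0.0, 0.0, 3.0]])  # two nearby points and one distant point
K = rbf_kernel(X, gamma=1.0)
# nearby columns get a kernel value close to 1, distant pairs close to 0
assert K[0, 1] > 0.9 and K[0, 2] < 1e-3
```

The multiplicative updates of Convex-NMF can then be run with K in place of the Gram matrix X^T X; the exact kernelized updates are given in the cited work.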
Indeed, several important applications can be modeled as large sparse graphs, such as transportation network analysis and social network analysis. Low-rank approximation of the matrix of a graph is essential for finding patterns and detecting anomalies, and it can extract correlations and remove noise from matrix-structured data. This has led to the proposal of methods such as CUR [
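A toy CUR-style construction illustrates the idea of approximating a matrix by actual columns and rows of itself; the index choices here are fixed by hand for simplicity, not the sampling schemes of the cited methods:

```python
import numpy as np

def cur_approx(A, col_idx, row_idx):
    """Approximate A ~= C U R from actual columns C and rows R of A."""
    C = A[:, col_idx]
    R = A[row_idx, :]
    U = np.linalg.pinv(C) @ A @ np.linalg.pinv(R)
    return C @ U @ R

rng = np.random.default_rng(0)
A = rng.standard_normal((8, 3)) @ rng.standard_normal((3, 8))  # rank 3
A_hat = cur_approx(A, col_idx=[0, 2, 5], row_idx=[1, 4, 6])
assert np.allclose(A, A_hat)  # rank(A) independent columns/rows give an exact fit
```

Unlike the SVD, the factors C and R keep the sparsity and interpretability of the original graph matrix, which is why such methods suit large sparse graphs.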
To monitor the traffic status along a freeway, sensors are deployed to collect readings such as vehicle speed and traffic volume. It is assumed that data records arrive synchronously, meaning that all data streams are updated simultaneously. The sensors are the nodes of the graph, and an edge is formed between each pair of sensors; these data records can be used to construct graphs of the transportation network periodically.
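The periodic graph construction described above might be sketched as follows; the similarity threshold and the unit edge weights are our assumptions for illustration, not the paper's specification:

```python
import numpy as np

def window_adjacency(readings, threshold=1.0):
    """readings: (n_sensors, window_len) array -> (n, n) adjacency matrix.

    Connect two sensors when their readings in the current window are close.
    """
    n = readings.shape[0]
    A = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            d = np.linalg.norm(readings[i] - readings[j])
            if d < threshold:
                A[i, j] = A[j, i] = 1.0
    return A

r = np.array([[1.0, 1.0], [1.1, 1.0], [5.0, 5.0]])  # 3 sensors, window of 2
A = window_adjacency(r, threshold=1.0)
assert A[0, 1] == 1.0 and A[0, 2] == 0.0  # similar streams become neighbors
```

Rebuilding such an adjacency matrix once per window yields the sequence of graphs that the clustering below operates on.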
Given two sets of spatial objects
Spatial-temporal cluster: graph
Without loss of generality, we use the adjacency matrices
The spatial relationships between two graphs are denoted as a set of edges connecting the nodes between
At timestamp
By utilizing time-series observational data streams and the geospatial relationship to cluster multiple sensor data streams, the proposed method
The objective function can be written as
Specifically, assume cluster numbers
As can be seen, the objective function in (
We optimize the objective function with respect to one variable while fixing the others. This iterative procedure repeats until convergence. Using the matrix properties
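The alternating scheme above can be illustrated with the standard multiplicative updates for an unconstrained nonnegative trifactorization X ≈ FSGᵀ. This is a generic sketch of the optimization pattern, not the paper's exact STClu update rules, which additionally carry the spatial and historical terms:

```python
import numpy as np

def nmtf(X, k, n_iter=200, eps=1e-9, seed=0):
    """Alternating multiplicative updates minimizing ||X - F S G^T||_F^2."""
    rng = np.random.default_rng(seed)
    n, m = X.shape
    F = rng.random((n, k)); S = rng.random((k, k)); G = rng.random((m, k))
    for _ in range(n_iter):
        # each rule is (positive gradient part) / (negative gradient part)
        F *= (X @ G @ S.T) / (F @ S @ G.T @ G @ S.T + eps)
        S *= (F.T @ X @ G) / (F.T @ F @ S @ G.T @ G + eps)
        G *= (X.T @ F @ S) / (G @ S.T @ F.T @ F @ S + eps)
    return F, S, G

rng = np.random.default_rng(1)
X = rng.random((12, 10))
F, S, G = nmtf(X, k=3)
err = np.linalg.norm(X - F @ S @ G.T) / np.linalg.norm(X)
assert err < 0.9  # the updates monotonically shrink the reconstruction error
```

Each update fixes the other two factors, which is exactly the block-coordinate structure used in the derivation above.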
To alternately update the entries of
Since the derivatives of
Using the Karush-Kuhn-Tucker conditions [
According to (
Proofs of the convergence of update rules (
To eliminate linearly dependent columns and construct the subspace used for low-rank approximation from data matrix
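A plain rank-check sketch of this preprocessing step, in the spirit of (but much simpler than) the incremental Colibri methods:

```python
import numpy as np

def independent_columns(X, tol=1e-10):
    """Greedily keep columns that are not linear combinations of kept ones."""
    kept = []
    for j in range(X.shape[1]):
        trial = kept + [j]
        if np.linalg.matrix_rank(X[:, trial], tol=tol) == len(trial):
            kept.append(j)
    return kept

X = np.array([[1.0, 2.0, 0.0],
              [0.0, 0.0, 1.0],
              [1.0, 2.0, 1.0]])
assert independent_columns(X) == [0, 2]  # column 1 = 2 x column 0 is dropped
```

The surviving columns then span the subspace used for the low-rank approximation; Colibri itself achieves the same effect incrementally without recomputing the rank from scratch.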
Input: Matrices
Output: Matrices
(1) use the family of Colibri methods to get
(2) determine cluster number
(3) initialize
(4) if we need to form a new partition
(5)  go to (8)
(6) else
(7)  let
(8) while not converging and
(9)  update
(10)
(11)  update
(12)
(13) end while
(14) return
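The control flow of the pseudocode can be sketched as follows. The warm-start branch mirrors steps (4)-(7); the update rules inside the loop are generic NMTF stand-ins, not the paper's exact STClu rules:

```python
import numpy as np

def stclu_window(X, k, prev=None, max_iter=200, tol=1e-6, eps=1e-9, seed=0):
    """One round of clustering for the current window of data X (n x m)."""
    rng = np.random.default_rng(seed)
    n, m = X.shape
    if prev is None:                      # form a new partition
        F, S, G = rng.random((n, k)), rng.random((k, k)), rng.random((m, k))
    else:                                 # warm-start from the last window
        F, S, G = (p.copy() for p in prev)
    last = np.inf
    for _ in range(max_iter):             # "while not converging and ..."
        F *= (X @ G @ S.T) / (F @ S @ G.T @ G @ S.T + eps)
        S *= (F.T @ X @ G) / (F.T @ F @ S @ G.T @ G + eps)
        G *= (X.T @ F @ S) / (G @ S.T @ F.T @ F @ S + eps)
        obj = np.linalg.norm(X - F @ S @ G.T)
        if last - obj < tol:
            break
        last = obj
    return F, S, G

rng = np.random.default_rng(3)
X1 = rng.random((10, 8))
fac = stclu_window(X1, k=2)
X2 = X1 + 0.01 * rng.random((10, 8))      # the next, slightly drifted window
F, S, G = stclu_window(X2, k=2, prev=fac)
labels = G.argmax(axis=1)                 # cluster assignment per stream
assert labels.shape == (8,)
```

Warm-starting from the previous factors is what makes per-window clustering cheap when the streams drift slowly, which is the situation the traffic example above describes.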
The computational cost of the proposed
In this section, we use several synthetic and real world data sets to evaluate the effectiveness and efficiency of the
To demonstrate the efficiency of
To evaluate clustering quality, all our comparisons are based on clustering accuracy (ACC) and normalized mutual information (NMI) measurements.
Clustering accuracy (ACC) discovers the one-to-one relationship between clusters and classes and measures the extent to which each cluster contains data points from the corresponding class. It is defined as [
Between two random variables
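As a concrete reference, both measures can be computed as follows. The brute-force permutation search for ACC (fine for small cluster counts) and the averaged-entropy normalization for NMI are common conventions and may differ in detail from the cited definitions:

```python
import numpy as np
from itertools import permutations

def acc(y_true, y_pred):
    """Best accuracy over one-to-one relabelings of the predicted clusters."""
    labels = sorted(set(y_true) | set(y_pred))
    best = 0
    for perm in permutations(labels):
        m = dict(zip(labels, perm))
        best = max(best, sum(m[p] == t for p, t in zip(y_pred, y_true)))
    return best / len(y_true)

def nmi(y_true, y_pred):
    """Mutual information normalized by the mean of the two entropies."""
    n = len(y_true)
    C = np.zeros((max(y_true) + 1, max(y_pred) + 1))
    for t, p in zip(y_true, y_pred):
        C[t, p] += 1
    P = C / n
    pt, pp = P.sum(1), P.sum(0)
    nz = P > 0
    mi = (P[nz] * np.log(P[nz] / np.outer(pt, pp)[nz])).sum()
    H = lambda q: -(q[q > 0] * np.log(q[q > 0])).sum()
    return mi / ((H(pt) + H(pp)) / 2)

y = [0, 0, 1, 1]
assert acc(y, [1, 1, 0, 0]) == 1.0   # a label permutation does not hurt ACC
assert abs(nmi(y, [1, 1, 0, 0]) - 1.0) < 1e-12
```

Both measures lie in [0, 1], with higher values indicating better agreement with the reference partition.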
Since there are no predefined categories in our data, we have to design an alternative way to carry out the evaluation. In this paper, we use the
In this section, we use synthetic datasets to demonstrate that
We show the performance of the four methods as the time window evolves and the number of clusters in the data varies from 2 to 20. The number of streams is fixed at 400, and each stream contains 1,000 points. As can be seen in Table
Comparison of ACC and NMI results for the four methods based on synthetic data sets.
Clusters | ACC | | | | NMI | | | |
---|---|---|---|---|---|---|---|---|
| | Ncut | NMF | STClu | | Ncut | NMF | STClu |
2 | 0.3959 | 0.4954 | 0.5449 | 0.6542 | 0.3480 | 0.5130 | 0.5895 | 0.6749 |
4 | 0.3855 | 0.4740 | 0.5120 | 0.6436 | 0.4148 | 0.5233 | 0.4416 | 0.6967 |
5 | 0.3773 | 0.4685 | 0.5205 | 0.6757 | 0.3810 | 0.5016 | 0.5781 | 0.6122 |
6 | 0.3710 | 0.4866 | 0.5425 | 0.5603 | 0.4162 | 0.5430 | 0.6041 | 0.6059 |
8 | 0.3864 | 0.4636 | 0.4893 | 0.6038 | 0.4257 | 0.5247 | 0.5098 | 0.7022 |
10 | 0.3711 | 0.4540 | 0.4919 | 0.6231 | 0.3960 | 0.5154 | 0.4082 | 0.7238 |
12 | 0.3896 | 0.4802 | 0.5231 | 0.5868 | 0.4051 | 0.5088 | 0.5916 | 0.5421 |
15 | 0.3819 | 0.4728 | 0.5102 | 0.4834 | 0.4180 | 0.5207 | 0.5252 | 0.6281 |
18 | 0.4121 | 0.4827 | 0.5082 | 0.5293 | 0.4315 | 0.5786 | 0.5850 | 0.5823 |
20 | 0.4019 | 0.5275 | 0.6110 | 0.6516 | 0.4171 | 0.5823 | 0.4390 | 0.6299 |
Average | 0.3873 | 0.4805 | 0.5254 | 0.6012 | 0.4053 | 0.5311 | 0.5272 | 0.6398 |
The average ACC and NMI values for the proposed algorithm as the number of clusters or data streams varies are shown in Figures
The performance of the proposed algorithm varying with the size of (a) clusters, (b) data streams, respectively.
Finally, we evaluate the average processing time for one round of clustering multiple data streams. Several factors affect the execution time; in this experiment, we evaluate the effect of the window size and of the number of data streams on the response time of the compared algorithms.
The following set of experiments evaluates the effect of window size on the execution time of these algorithms. Figure
Average processing time: (a) while the window size
The real world data sets were obtained from the PeMS (
First, we discuss the experiments on the PeMS data set with the number of clusters varying from 2 to 20. For each given cluster number
Comparison of the ACC and NMI results for the four methods on PeMS with the number of clusters varying from 2 to 20.
Clusters | ACC | | | | NMI | | | |
---|---|---|---|---|---|---|---|---|
| | Ncut | NMF | STClu | | Ncut | NMF | STClu |
2 | 0.2716 | 0.3014 | 0.3163 | 0.5060 | 0.2808 | 0.3027 | 0.3260 | 0.5127 |
3 | 0.2685 | 0.2950 | 0.3064 | 0.5065 | 0.2762 | 0.2947 | 0.3197 | 0.5167 |
5 | 0.2660 | 0.2933 | 0.3089 | 0.5107 | 0.2741 | 0.2952 | 0.3256 | 0.5540 |
8 | 0.2641 | 0.2988 | 0.3155 | 0.5123 | 0.2759 | 0.3011 | 0.3119 | 0.5679 |
10 | 0.2687 | 0.2919 | 0.2996 | 0.5078 | 0.2748 | 0.2899 | 0.3105 | 0.5282 |
12 | 0.2642 | 0.2890 | 0.3004 | 0.5115 | 0.2711 | 0.2889 | 0.3137 | 0.5611 |
15 | 0.2697 | 0.2969 | 0.3097 | 0.5057 | 0.2777 | 0.2973 | 0.3129 | 0.5098 |
19 | 0.2674 | 0.2946 | 0.3059 | 0.5104 | 0.2754 | 0.2943 | 0.2959 | 0.5510 |
20 | 0.2764 | 0.2976 | 0.3052 | 0.5164 | 0.2813 | 0.2955 | 0.3023 | 0.6043 |
Average | 0.2685 | 0.2954 | 0.3075 | 0.5097 | 0.2764 | 0.2955 | 0.3132 | 0.5451 |
Next, we discuss the experiments on the PeMS data set with different time steps. For each time step, the clustering results of all these algorithms on the PeMS data set are shown in Table
Comparison of the ACC and NMI results for the four methods on PeMS with different time steps.
Time step | ACC | | | | NMI | | | |
---|---|---|---|---|---|---|---|---|
| | Ncut | NMF | STClu | | Ncut | NMF | STClu |
1 | 0.2671 | 0.3033 | 0.3262 | 0.5052 | 0.2796 | 0.3085 | 0.3248 | 0.5052 |
2 | 0.2777 | 0.3157 | 0.3340 | 0.5053 | 0.2908 | 0.3184 | 0.3276 | 0.5065 |
3 | 0.2805 | 0.3102 | 0.3057 | 0.5100 | 0.2895 | 0.3019 | 0.3279 | 0.5477 |
4 | 0.2716 | 0.3074 | 0.2753 | 0.5083 | 0.2838 | 0.2856 | 0.3162 | 0.5323 |
5 | 0.2744 | 0.3054 | 0.3303 | 0.5140 | 0.2842 | 0.3115 | 0.3165 | 0.5836 |
6 | 0.2782 | 0.3090 | 0.3104 | 0.5028 | 0.2878 | 0.3035 | 0.3193 | 0.4839 |
7 | 0.2823 | 0.3263 | 0.3283 | 0.5077 | 0.2983 | 0.3208 | 0.3214 | 0.5277 |
8 | 0.2779 | 0.3275 | 0.2845 | 0.5091 | 0.2967 | 0.2999 | 0.3069 | 0.5395 |
9 | 0.2576 | 0.3409 | 0.3810 | 0.5027 | 0.2881 | 0.3796 | 0.3346 | 0.4834 |
10 | 0.2480 | 0.3256 | 0.3627 | 0.5086 | 0.2779 | 0.3759 | 0.3423 | 0.5350 |
11 | 0.2390 | 0.3118 | 0.3471 | 0.5072 | 0.2698 | 0.3678 | 0.3245 | 0.5227 |
12 | 0.2305 | 0.2989 | 0.3315 | 0.5120 | 0.2637 | 0.3656 | 0.3445 | 0.5653 |
Average | 0.2654 | 0.3152 | 0.3264 | 0.5077 | 0.2842 | 0.3283 | 0.3255 | 0.5277 |
Finally, we report the experimental results on the PeMS data set over different numbers of time steps with a varying number of clusters. The clustering results of all these algorithms on the PeMS data set are shown in Table
Comparison of the ACC and NMI results for the four methods on PeMS with the number of clusters varying from 2 to 20 at different time steps.
Time step | ACC | | | | NMI | | | |
---|---|---|---|---|---|---|---|---|
| | Ncut | NMF | STClu | | Ncut | NMF | STClu |
1 | 0.2694 | 0.3024 | 0.3213 | 0.5056 | 0.2802 | 0.3056 | 0.3254 | 0.5090 |
2 | 0.2731 | 0.3054 | 0.3202 | 0.5059 | 0.2835 | 0.3066 | 0.3237 | 0.5116 |
3 | 0.2733 | 0.3018 | 0.3073 | 0.5104 | 0.2818 | 0.2986 | 0.3268 | 0.5509 |
4 | 0.2679 | 0.3031 | 0.2954 | 0.5103 | 0.2799 | 0.2934 | 0.3141 | 0.5501 |
5 | 0.2716 | 0.2987 | 0.3150 | 0.5109 | 0.2795 | 0.3007 | 0.3135 | 0.5559 |
6 | 0.2712 | 0.2990 | 0.3054 | 0.5072 | 0.2795 | 0.2962 | 0.3165 | 0.5225 |
7 | 0.2760 | 0.3116 | 0.3190 | 0.5067 | 0.2880 | 0.3091 | 0.3172 | 0.5188 |
8 | 0.2727 | 0.3111 | 0.2952 | 0.5098 | 0.2861 | 0.2971 | 0.3014 | 0.5453 |
9 | 0.2670 | 0.3193 | 0.3431 | 0.5096 | 0.2847 | 0.3376 | 0.3185 | 0.5439 |
10 | 0.2607 | 0.3184 | 0.3494 | 0.5107 | 0.2822 | 0.3466 | 0.3389 | 0.5536 |
11 | 0.2481 | 0.3093 | 0.3384 | 0.5085 | 0.2731 | 0.3398 | 0.3301 | 0.5344 |
12 | 0.2539 | 0.3044 | 0.3084 | 0.5129 | 0.2757 | 0.3286 | 0.3308 | 0.5731 |
Average | 0.2654 | 0.3152 | 0.3264 | 0.5077 | 0.2842 | 0.3283 | 0.3255 | 0.5277 |
Clustering multiple sensor data streams in CPS has been extensively studied in various applications, including transportation systems, sensor networks, and social networks. To extract and retain meaningful information from multiple sensor data streams, we assume that the spatial feature summarizes the atypical event in the temporal dimension, and that the temporal feature summarizes the event in the spatial dimension. In this work, we have proposed a spatial-temporal clustering algorithm, called
In this section, we investigate the convergence and correctness of the objective function in (
For
The proofs are provided with the aid of auxiliary functions. Since the first term of the objective function
For any
Consider
Thus,
Next, we show that the update rule for
Since our update rule operates elementwise, it is sufficient to show that each
The function
We first get the Taylor series expansion of
Thus, (
Replacing
Since (
The authors declare that there is no conflict of interest regarding the publication of this paper.
The authors gratefully acknowledge the support provided for this research by the Research Fund for the Doctoral Program of Higher Education of China (Grant no. 20120191110047), Natural Science Foundation Project of CQ CSTC of China (Grant no. CSTC2012JJB40002), Engineering Center Research Program of Chongqing of China (Grant no. 2011pt-gc30005), Fundamental Research Funds for the Central Universities of China (Grant no. 106112014CDJZR178801), and Fundamental Research Funds for the Central Universities of China (Grant no. CDJXS10170004).