Reverse skyline queries have been used in many real-world applications such as business planning, market analysis, and environmental monitoring. In this paper, we investigated how to efficiently evaluate continuous reverse skyline queries over sliding windows. We first theoretically analyzed the inherent properties of reverse skyline on data streams and proposed a novel pruning technique to reduce the number of data points preserved for processing continuous reverse skyline queries. Then, an efficient approach, called Semidominance Based Reverse Skyline (SDRS), was proposed to process continuous reverse skyline queries. Moreover, an extension was also proposed to handle
The skyline operator [
Example of skyline and its variations.
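For reference, the classic skyline and its query-relative variant, the dynamic skyline, can be sketched in a few lines. This is a minimal brute-force illustration, assuming that smaller values are preferred in every dimension; the function names `dominates`, `skyline`, and `dynamic_skyline` are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is no worse than b in every dimension and strictly
    better in at least one (assumption: smaller is better)."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def skyline(points: List[Point]) -> List[Point]:
    """Points that are not dominated by any other point."""
    return [p for p in points if not any(dominates(o, p) for o in points)]


def dynamic_skyline(points: List[Point], q: Point) -> List[Point]:
    """Skyline computed after mapping every point to its per-dimension
    absolute distance from the query point q."""
    def shift(p: Point) -> Point:
        return tuple(abs(x - c) for x, c in zip(p, q))

    return [p for p in points
            if not any(dominates(shift(o), shift(p)) for o in points)]


pts = [(1, 4), (2, 2), (4, 1), (3, 3), (5, 5)]
print(skyline(pts))                  # [(1, 4), (2, 2), (4, 1)]
print(dynamic_skyline(pts, (4, 4)))  # [(1, 4), (4, 1), (3, 3), (5, 5)]
```

Both functions are quadratic in the number of points and serve only to make the definitions concrete.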
As a variation of skyline, given a query point,
Compared with dynamic skyline,
In such applications, the dealer may want to continuously monitor the trading system in order to select the customers to whom a newly listed used car should be recommended. Because used car prices fluctuate constantly, information from too long ago may no longer be relevant to the current recommendation. We therefore focus only on the most recent (e.g., within a week) used car information, that is, on the reverse skyline query over sliding windows.
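To make the problem setting concrete, a naive baseline for a reverse skyline query over a count-based sliding window might look as follows. This sketch recomputes the result from scratch on every arrival and is not the SDRS algorithm; the class name `SlidingReverseSkyline`, the count-based window, and the convention that a point is never compared against itself are assumptions made for illustration.

```python
from collections import deque
from typing import Deque, List, Sequence, Tuple

Point = Tuple[float, ...]


def dyn_dominates(a: Point, b: Point, center: Point) -> bool:
    """True if a dynamically dominates b with respect to center, i.e. a is
    at least as close to center in every dimension and closer in one."""
    da = [abs(x - c) for x, c in zip(a, center)]
    db = [abs(x - c) for x, c in zip(b, center)]
    return (all(x <= y for x, y in zip(da, db))
            and any(x < y for x, y in zip(da, db)))


def reverse_skyline(window: Sequence[Point], q: Point) -> List[Point]:
    """Points p such that no other window point dynamically dominates
    q with respect to p (q lies in the dynamic skyline of p)."""
    result = []
    for i, p in enumerate(window):
        others = (o for j, o in enumerate(window) if j != i)
        if not any(dyn_dominates(o, q, p) for o in others):
            result.append(p)
    return result


class SlidingReverseSkyline:
    """Brute-force monitor: keeps the most recent `size` points."""

    def __init__(self, q: Point, size: int) -> None:
        self.q = q
        self.window: Deque[Point] = deque(maxlen=size)  # FIFO eviction

    def push(self, p: Point) -> List[Point]:
        """Append a new point (evicting the oldest if full) and
        return the current reverse skyline of q."""
        self.window.append(p)
        return reverse_skyline(self.window, self.q)


monitor = SlidingReverseSkyline(q=(5, 5), size=3)
for point in [(4, 4), (8, 8), (7, 7), (1, 9)]:
    print(point, "->", monitor.push(point))
```

Each arrival costs quadratic time in the window size here, which is exactly the overhead that pruning and incremental maintenance aim to avoid.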
Although reverse skyline query processing has been well studied in recent years [
We propose an efficient algorithm, SDRS, to process reverse skyline queries over the sliding window. By exploiting the semidominance relationships and the first-in-first-out property of the sliding window, SDRS maintains only a small number of points in the sliding window. In addition, by maintaining a suitable structure for each retained point, SDRS can quickly compute the new reverse skyline whenever the sliding window moves.
By building and maintaining a 2D
Last but not least, extensive experiments show that our proposed SDRS approach can efficiently support continuous reverse skyline queries, including
Dynamic skyline was first introduced by Papadias et al. [
Based on the concept of dynamic skyline, Dellis and Seeger [
Several works address skyline query processing on data streams. Lin et al. [
Bai et al. [
We first recall two important concepts called
A point
A point
Figure
Full-dominance and semidominance.
In order to explain semidominance more clearly, for each point
It has been theoretically proved in [
Given a query point
According to Theorem
Given a query point
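For reference, the standard formulation of these notions in the skyline literature can be written as below. The notation is an assumption made here and may differ from the paper's own definitions: $P$ denotes the point set (for the sliding-window setting, the most recent points of the stream), $d$ the dimensionality, and $p[i]$ the $i$-th coordinate of $p$.

```latex
% p_1 dynamically dominates p_2 with respect to q
p_1 \prec_q p_2 \iff
  \bigl(\forall i \in \{1,\dots,d\}:\ |p_1[i]-q[i]| \le |p_2[i]-q[i]|\bigr)
  \ \wedge\
  \bigl(\exists j \in \{1,\dots,d\}:\ |p_1[j]-q[j]| < |p_2[j]-q[j]|\bigr)

% dynamic skyline of q
\mathrm{DSL}(q) = \{\, p \in P \mid \neg\exists\, p' \in P :\ p' \prec_q p \,\}

% reverse skyline of q: points whose dynamic skyline contains q
\mathrm{RSL}(q) = \{\, p \in P \mid \neg\exists\, p' \in P \setminus \{p\} :\ p' \prec_p q \,\}
```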
In this section, we present the details of the SDRS approach. Specifically, some important properties and query processing techniques are discussed in Sections
Suppose there are two points
A data point
Since we have
Consider the example in Figure
As illustrated in Figure
An example of false positive.
Now, we will present a very important characteristic of semidominance which helps us to solve the above problem.
If
Lemma
Then, the data points in
We use
A candidate point
Next, we will discuss the correctness of only keeping the data points in
A data point
It can be immediately deduced from Lemmas
A newly arriving point
Since
We prove the second part by contradiction. Suppose point
Lemma
Based on the analysis above, we can get Theorem
It can be immediately deduced from Theorems
In this section, we introduce the data structures and then the details of our SDRS approach.
Data structure.
If point
Dominance relationships.
If point
If
For each entry
find find the predecessor find find
Algorithm
initialize find the
insert all entries in the root of
remove the top entry break; insert break;
Algorithm
initialize sets find the
insert all entries in the root of remove the top entry remove all points in insert remove remove balance
Algorithms
In this section, we extend the proposed SDRS algorithm to support
Firstly, we introduce the definition of
Given the recent
According to Definition
For every point in
In this section, we introduce the definition of
Given the recent
According to Definition
For every point in
In this section, we experimentally compare our proposed SDRS algorithm against the only existing DCRS algorithm [
Experimental results on the real stock dataset. Panels: space usage versus window size; result size versus window size; response time versus window size.
Figure
The synthetic datasets contain three different distributions, including Uniformly Distributed, Clustered, and Anticorrelated Distributed Datasets [
Experimental parameters.
Parameter | Range |
---|---|
Dimensionality | 2, 3, |
Window size | |
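The three synthetic distributions named above can be approximated with a short generator. This is only a sketch of how such data are commonly produced; the function names, the number of clusters, the `spread` parameter, and the unit-cube domain are all assumptions and are not the settings used in the experiments.

```python
import random
from typing import List, Tuple

Point = Tuple[float, ...]


def uniform(n: int, d: int, rng: random.Random) -> List[Point]:
    """n points drawn independently and uniformly from [0, 1]^d."""
    return [tuple(rng.random() for _ in range(d)) for _ in range(n)]


def clustered(n: int, d: int, rng: random.Random,
              centers: int = 5, spread: float = 0.05) -> List[Point]:
    """Points scattered with Gaussian noise around a few random centers,
    clamped to the unit cube."""
    cs = uniform(centers, d, rng)
    pts = []
    for _ in range(n):
        c = rng.choice(cs)
        pts.append(tuple(min(1.0, max(0.0, rng.gauss(x, spread))) for x in c))
    return pts


def anticorrelated(n: int, d: int, rng: random.Random,
                   spread: float = 0.05) -> List[Point]:
    """Points near the hyperplane sum(x) = d / 2, so a point that is good
    in one dimension tends to be bad in another."""
    pts: List[Point] = []
    while len(pts) < n:
        base = [rng.random() for _ in range(d)]
        shift = (d / 2 - sum(base)) / d          # move onto the hyperplane
        p = [x + shift + rng.gauss(0, spread) for x in base]
        if all(0.0 <= x <= 1.0 for x in p):      # reject points outside the cube
            pts.append(tuple(p))
    return pts


rng = random.Random(42)
for name, gen in [("uniform", uniform), ("clustered", clustered),
                  ("anticorrelated", anticorrelated)]:
    print(name, gen(3, 2, rng))
```

Anticorrelated data typically yield the largest skylines, which is why they are usually the most demanding of the three cases.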
We first evaluate the impact of sliding window size
Space usage versus window size. Panels: Uniform Dataset, Clustered Dataset, Anticorrelated Dataset.
Result size versus window size. Panels: Uniform Dataset, Clustered Dataset, Anticorrelated Dataset.
Response time versus window size. Panels: Uniform Dataset, Clustered Dataset, Anticorrelated Dataset.
Next, we evaluate the impact of dimensionality
Space usage versus dimensionality. Panels: Uniform Dataset, Clustered Dataset, Anticorrelated Dataset.
Result size versus dimensionality. Panels: Uniform Dataset, Clustered Dataset, Anticorrelated Dataset.
Response time versus dimensionality. Panels: Uniform Dataset, Clustered Dataset, Anticorrelated Dataset.
In this section, maintenance time and processing time are used to evaluate the algorithm performance of
As shown in Figures
Maintenance time for
Panels: Uniform Dataset, Clustered Dataset, Anticorrelated Dataset.
Processing time for
Panels: Uniform Dataset, Clustered Dataset, Anticorrelated Dataset.
In this section, maintenance time and processing time are used to evaluate the algorithm performance of
As shown in Figures
Maintenance time for
Panels: Uniform Dataset, Clustered Dataset, Anticorrelated Dataset.
Processing time for
Panels: Uniform Dataset, Clustered Dataset, Anticorrelated Dataset.
Despite its importance in real-world applications, reverse skyline computation on data streams has not been well studied. Therefore, in this paper, we focus on the problem of efficiently computing the reverse skyline over sliding windows on an append-only data stream. Specifically, we present an effective pruning approach to minimize the number of points that must be kept in the sliding window and propose an efficient semidominance-based approach, SDRS, for processing continuous reverse skyline queries. Moreover, we also propose an extension for handling
The authors declare that there is no conflict of interests regarding the publication of this paper.
This research was partially supported by the National Natural Science Foundation of China under Grants nos. 61472069, 61402089, and 61100022 and the Fundamental Research Funds for the Central Universities under Grant no. N130404014.