Frequent-Pattern Based Broadcast Scheduling for Conflict Avoidance in Multi-Channel Data Dissemination Systems

With the popularity of mobile devices, using the traditional client-server model to handle a large number of requests is very challenging. Wireless data broadcasting can be used to provide services to many users at the same time, so reducing the average access time has become a popular research topic. For example, some location-based services (LBS) consider using multiple channels to disseminate information to reduce access time. However, data conflicts may occur when multiple channels are used, where multiple data items associated with the request are broadcast at about the same time. In this article, we consider the channel switching time and identify the data conflict issue in an on-demand multi-channel dissemination system. We model the considered problem as a Data Broadcast with Conflict Avoidance (DBCA) problem and prove it is NP-complete. We hence propose the frequent-pattern based broadcast scheduling (FPBS), which provides a new variant of the frequent pattern tree, FP*-tree, to schedule the requested data. Using FPBS, the system can avoid data conflicts when assigning data items to time slots in the channels. In the simulation, we discussed two modes of FPBS: online and offline. The results show that, compared with the existing heuristic methods, FPBS can shorten the average access time by 30%.


Introduction
With advances in wireless communication technologies, mobile devices such as notebooks, smartphones, and tablets deeply affect our daily lives. Users can easily access various information services, such as online news, traffic information, and stock prices. Recently, wireless data dissemination, which can transmit information to many users simultaneously, has become a popular topic [1][2][3]. In comparison with the conventional end-to-end transmission (or client-server) model, wireless data dissemination can make use of wireless network channels to reduce the delivery time for obtaining information. Wireless data broadcasting is well suited to location-based services (LBS) in an asymmetric communication environment, where a large number of users are interested in popular information such as news [4], traffic reports [5], and multimedia streams [6,7].
In general, wireless data dissemination can be classified into two modes: push-based and pull-based (on-demand).
In push-based wireless data dissemination environments [8][9][10], data items are disseminated cyclically according to a predefined schedule. In practice, however, the access pattern of data items may change dynamically, so the broadcast frequency of popular data items may end up lower than that of unpopular data items, which results in poor average access latency. In view of this, pull-based wireless data dissemination [11][12][13], which disseminates data items in a timely manner according to the received requests, was proposed to overcome this drawback. In the pull-based mode, the users first upload their demand information to the server through the uplink channel, and the relevant information is then immediately arranged into the broadcasting channels for dissemination to the users. In wireless data dissemination environments, one way of judging the quality of a scheduling approach is to measure the access time of the generated schedule. The access time is the time period from the moment a client starts tuning into the channels until it has obtained all the requested information. Thus, a better broadcasting schedule is one that yields a shorter access time.
1.1. Motivation. Early works [14][15][16] focused on how to maximize the bandwidth throughput or minimize the access time in single-channel environments. Recently, with advances in antenna techniques, most works [17][18][19] have shifted their focus to similar issues in multichannel environments. In general, a multichannel wireless data dissemination system can provide more network bandwidth and a shorter access time for data dissemination than a single-channel system can.
However, a new issue, data conflict [20][21][22], emerges when a client retrieves data items on multiple channels with channel switching in push-based broadcasting environments. Two types of conflicts may occur in multichannel dissemination systems. The first type occurs when two required data items are allocated to the same time slot on different channels, so the client cannot download both of them simultaneously. The second type occurs when two required data items are allocated to time slots $t$ and $(t + 1)$ on different channels, respectively. In such a scenario, the client cannot download both required data items during the time period $[t, t + 1]$. The first conflict type is obvious. The second arises because switching from one channel to another takes time: a client cannot download data at time slot $t + 1$ from one channel if it was downloading a data item from another channel at time slot $t$, because a time slot is already the smallest unit for data retrieval. Note that a client is allowed to access only one channel at a time.
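The two conflict types described above can be illustrated with a short sketch. It is not part of the original formulation: the `Placement` encoding (a channel index paired with a slot index) and the function name `in_conflict` are assumptions made here for illustration only.

```python
from typing import Tuple

# Hypothetical encoding: (channel index, time slot index).
Placement = Tuple[int, int]


def in_conflict(a: Placement, b: Placement) -> bool:
    """Return True if a single-tuner client cannot download both placements.

    Type 1: same slot on different channels.
    Type 2: adjacent slots on different channels (switching costs one slot).
    """
    (ch_a, t_a), (ch_b, t_b) = a, b
    if ch_a == ch_b:
        # Same channel: no switching is required, so there is no conflict.
        return False
    return abs(t_a - t_b) <= 1


if __name__ == "__main__":
    print(in_conflict((1, 3), (2, 3)))  # type 1: same slot
    print(in_conflict((1, 3), (2, 4)))  # type 2: adjacent slots
    print(in_conflict((1, 3), (2, 5)))  # one free slot in between
```

With this encoding, both conflict types reduce to a single test on the slot distance, which is why a gap of at least one free slot between items on different channels is sufficient for a client to switch.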
Such a data conflict makes a client miss needed data items during the time period spent on channel switching, thereby leading to a worse access time. On the one hand, some works [20][21][22] provide solutions from the client's point of view. These solutions let each client plan its own retrieval of the data items on the channels efficiently. On the other hand, only one work [13] provides a server-side scheduling algorithm that considers the data conflict issue in on-demand multichannel environments. That algorithm considers the associations between data items and requests while allocating data items to multiple channels, and thus provides a conflict-free schedule.
Most broadcast scheduling techniques in on-demand multichannel data dissemination environments do not consider the time required for channel switching, which leads to data conflicts or long access times. This motivates us to propose a more efficient server-side scheduling method with conflict avoidance, based on a frequent-pattern mining technique, to shorten the average access time.

1.2. Contribution.
In this study, we discuss how to shorten the average access time in a multichannel wireless data dissemination environment under data conflict conditions. The contributions of this work are listed as follows:
(1) We identify the data broadcast with conflict avoidance (DBCA) problem in on-demand multichannel wireless data dissemination environments and prove that the considered DBCA problem is NP-complete.
(2) We propose a heuristic approach, frequent-pattern-based broadcast scheduling (FPBS), for providing an approximate schedule in polynomial time. Inspired by the frequent-pattern tree (FP-tree), we suggest a new tree, the FP*-tree, for FPBS to schedule the requested data items with the consideration of channel switching.
(3) We analyze the time complexity and average access time of FPBS in both the average case and the worst case.
(4) We verify the performance of FPBS, which achieves a shorter average access time in comparison with the existing method, UPF [13].
The rest of this paper is organized as follows. Section 2 gives the background and reviews related research in the literature. Section 3 defines the DBCA problem and proves that the DBCA problem is NP-complete. Section 4 explains the proposed approach with examples and algorithms in detail. In Section 5, we discuss the time complexity and access time of the proposed approach in the worst case. Section 6 presents the experimental simulation results and validates the correctness and effectiveness of the proposed methods in various situations. Finally, we conclude this work in Section 7.

Related Work
In multichannel dissemination environments, many related works focused on data scheduling to improve the access time performance [17,18] from the perspective of spectrum utilization. Yee et al. [17] proposed a greedy algorithm to find the best way to distribute data items into the channels, allowing users to access requested data in a limited time. Zheng et al. [18] incorporated the data access frequency, data length, and channel bandwidth into a model and proposed a two-level optimization scheduling algorithm (TOSA) to find an appropriate schedule. They also showed that the schedule produced by TOSA approximates the best average access time. Yi et al. [19] proposed a method that allows multiple copies of a data item to be replicated in a broadcasting channel. If there are multiple copies of a popular data item in the channel, the average access time can be effectively reduced.
Wireless Communications and Mobile Computing
In addition to the above methods, some works considered the priority of incoming queries and found ways to reduce the access time [12,14,15,23]. Lu et al. [14] proposed algorithms to schedule data for the maximum throughput request selection (MTRS) and minimum latency request ordering (MLRO) problems in a single-channel environment and proved that both problems are NP-hard. Xu et al. [15] proposed the SIN-α algorithm with a set of priority decisions based on the ratio of the length of the expiration time over the amount of information. Lv et al. [23] proved that minimizing access time in the broadcast scheduling of multi-item requests with deadline constraints in a single-channel environment is an NP-hard problem. The authors provided a profit-based heuristic scheduling algorithm to minimize the request miss rate (or delivery miss rate) considering the access frequency of data. Liu and Su [12] focused on reducing the request loss rate and shortening the access time. They proposed two algorithms, most popular first heuristic (MPFH) and most popular last heuristic (MPLH), and analyzed the differences between the online version (user demands arrive continuously, so scheduling cannot start until information about the demands has been received) and the offline version (the system already has all the information about the demands).
Some works have found that the dependency between requested data items may greatly influence the performance of multichannel data broadcasting. Lin and Liu [24] modeled the dependencies among data items as a directed acyclic graph (DAG). They proved that finding the best schedule that preserves the dependencies between data items is an NP-hard problem and proposed some heuristics for it. Qiu et al. [25] proposed a three-layer on-demand data broadcasting (ODDB) system that enhances the uplink access capacity by introducing a virtual node layer. Each virtual node can merge duplicated requests and relieve the server of a heavy computational load, thereby improving the broadcasting efficiency.
Lu et al. [20][21][22] were the first to define the two well-known types of data conflicts in multichannel broadcast applications. They proved that the client-side retrieval scheduling problem is NP-hard and provided client-side data retrieval algorithms that help clients retrieve data from multiple channels efficiently. Liu et al. [26] were the first to propose a server-side heuristic data scheduling algorithm, dynamic urgency and productivity (DUP), for on-demand multichannel systems. DUP considers the request conflict (or request overlapping) issue and the dependency between requests, schedules at the request level, and gives higher priority to requests that are close to their deadlines. Such an approach counteracts the request starvation problem and improves the utilization of broadcasting bandwidth. However, they did not consider the two types of data conflicts. He et al. [13] proposed a server-side heuristic scheduling approach, most urgent and popular request first (UPF), that considers the two types of data conflicts in on-demand systems. Apart from the UPF method, the hardness of the data scheduling problem that considers the two types of data conflicts from the server perspective is seldom discussed.
The comparisons of the existing works and this paper are summarized in Table 1. In this work, we propose a new server-side heuristic scheduling approach for providing a conflict-free multichannel data broadcast service with a better performance on the average access time.

Problem Description
The length of a broadcasting cycle is an important factor that is normally predefined in wireless data dissemination applications. Most existing data scheduling strategies focus on how to efficiently schedule data items in each broadcasting cycle. To validate the performance of a scheduling strategy, the average access time (or average latency) is the most widely used metric. If the average access time is shorter, users can generally obtain all the requested data in a shorter time, meaning that the scheduling strategy is more efficient. In the following subsections, we describe the considered system model, define the considered scheduling problem, and then prove the hardness of this problem.
3.1. System Model. In this work, the considered on-demand multichannel data dissemination system is shown in Figure 1, and we only consider the one-hop broadcasting scenario. The system uses $|C| + 2$ antennas with the orthogonal frequency division multiplexing (OFDM) technique [27] to provide $|C|$ downlink broadcast channels, one downlink index channel, and one uplink request channel, where $C = \{c_1, c_2, \dots, c_{|C|}\}$ and $|C| > 1$. The downlink index channel and the uplink request channel are denoted as $c_{index}$ and $c_{uplink}$, respectively. Each user device has two antennas, one for receiving data over the downlink broadcast channels and one for transferring requests via the uplink request channel. We assume that each user device can only access one channel at a time. We assume that all the channels are nonoverlapping, synchronous, and discretized into fixed-duration slots. The broadcasting server puts the requests coming from the uplink channel into a buffer with a first-come-first-serve (FCFS) strategy and handles all the received requests in a batch manner. In this work, we only focus on the efficiency of (application-layer) data/packet scheduling for users to retrieve the requested data items by accessing the downlink channels.
We assume that all the requested data items are in a dataset $D = \{d_1, d_2, \dots, d_{|D|}\}$, where $|D|$ is the size of $D$, and the length of a broadcasting cycle is $L = |D|$ by default. Suppose that there are $n$ queries, $Q = \{q_1, q_2, \dots, q_n\}$, and each query $q_i$ requests $k$ data items from the dataset $D$, where $i = 1, 2, \dots, n$ and $k = 1, 2, \dots, |D|$. We let $q_i = \{d^i_1, d^i_2, \dots, d^i_k\}$, and all the data items have the same data size, where $d^i_j \in D$, $j = 1, 2, \dots, k$, and $\bigcup_{i=1}^{n} q_i \subseteq D$. Thus, the system has to arrange the requested data items into $|C|$ broadcasting channels. Note that each time slot on a broadcasting channel can contain at most one data item, and data replication is only allowed on different channels. That is, multiple copies of one data item may be placed within a broadcasting cycle. With $L$ as the cycle length, each index $I_t$ at time slot $t$ records the information about all the data items in time slot $t'$ and the corresponding requests of these data items, where $t'$ is obtained by [equation missing in source]. When a client tunes in, it first accesses the index channel until it obtains information about the first required data item.

3.2. Problem Formulation.
The considered scheduling problem can be treated as a mapping $M$ that assigns the data items associated with all the queries to the $|C|$ broadcasting channels. For each data item $d^i_j \in D$ associated with a query $q_i \in Q$, let [equation missing in source]. Since there are multiple channels and each user can only tune into one broadcasting channel at a time, each user may switch channels many times to retrieve all the requested data items on different channels. In general, channel switching is a relatively fast operation (in the microsecond range) [28,29]. For simplicity, we follow the assumptions about channel switching in [22], and each channel switch takes one time slot in the considered data dissemination environment. Figure 2 shows an example of channel switching. However, channel switching may cause a new problem, data conflict, in multichannel wireless data dissemination systems. For example, if a data item requested by $q_i$ is placed, on a different channel, in the slot immediately before, the same slot as, or the slot immediately after another scheduled data item that is also associated with $q_i$, a data conflict occurs. An example of data conflicts is presented in Figure 3. A data conflict may result in a longer access time and is defined in Definition 1.
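Definition 1 (data conflict). For a query $q_i$, two requested data items $d^i_j$ and $d^i_{j'}$ are in conflict if they are placed on different channels in the same time slot or in adjacent time slots [formal condition missing in source].
In summary, the problem we want to solve in this work is the data broadcast with conflict avoidance (DBCA) problem, which can be defined as follows.
Definition 2 (DBCA problem). [opening of the definition missing in source] (1) there is no data conflict for each query in the mapping, i.e., w.r.t. query $q_i$, for each pair of data items [condition missing in source]; (2) the average access time of $M$, $acc_M = \sum_{i=1}^{n} acc(q_i)/n$, is minimized.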
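The two objectives of the DBCA problem can be made concrete with a small sketch. It rests on assumptions that are not stated in the text: each query is given as a hypothetical retrieval plan, i.e., a list of `(channel, slot)` pairs telling where the client downloads each requested item, and $acc(q_i)$ is approximated by the slot of the last retrieved item for a client that tunes in at the start of the cycle. The names `is_conflict_free` and `average_access_time` are illustrative only.

```python
from itertools import combinations
from typing import Dict, List, Tuple

# Hypothetical encoding: (channel index, time slot index).
Placement = Tuple[int, int]


def is_conflict_free(plan: List[Placement]) -> bool:
    """Objective (1): no pair of items sits on different channels
    in the same slot or in adjacent slots."""
    for (ch_a, t_a), (ch_b, t_b) in combinations(plan, 2):
        if ch_a != ch_b and abs(t_a - t_b) <= 1:
            return False
    return True


def average_access_time(plans: Dict[str, List[Placement]]) -> float:
    """Objective (2): acc_M = sum(acc(q_i)) / n.

    Assumption: acc(q_i) is the slot of the last item retrieved by q_i.
    """
    total = sum(max(slot for _, slot in plan) for plan in plans.values())
    return total / len(plans)


if __name__ == "__main__":
    # Illustrative plans, not taken from the paper's running example.
    plans = {
        "qa": [(1, 1), (1, 2), (2, 4)],
        "qb": [(2, 1), (2, 2)],
    }
    print(all(is_conflict_free(p) for p in plans.values()))
    print(average_access_time(plans))
```

The sketch only evaluates a given mapping; finding the mapping that minimizes the average is the hard part addressed in the remainder of the section.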

3.3. NP-Completeness.
To the best of our knowledge, most existing works only considered schedules without data replication in a broadcasting cycle. They did not discuss or analyze in detail the problem of conflict-avoiding schedules in multichannel dissemination environments. In contrast, our proposed approach, FPBS, considers a multichannel dissemination environment that allows data items to be replicated on different channels within a broadcasting cycle. In such a scenario, we investigate the data conflict problem and propose a new approach to avoid it. In this subsection, we prove that the DBCA problem is NP-complete.
In the definition of the DBCA problem, the first objective indicates that the broadcasting schedule avoids the data conflict problem. The second objective is to minimize the average access time. Since the server has no prior knowledge about incoming requests, the broadcast scheduling is performed in an online fashion. We first look at the offline version of the DBCA problem, which we refer to as the conflict-free data broadcasting with minimum average latency (CDBML) problem and define below.
Definition 3 (CDBML problem). Instance: There are $|C|$ data broadcasting channels with cycle length $L$, a set of $|D|$ data items $D = \{d_1, \dots, d_{|D|}\}$, and a set of $n$ requests $Q = \{q_1, \dots, q_n\}$ [remainder of the instance missing in source]. Any two data items associated with two different requests are different, and every data item needs a unit time $u_t$ to be broadcast. Let $loc^i_{\min}$ and $loc^i_{\max}$ be the start time and finish time of $q_i$, respectively.
Question: Does there exist a mapping $M$ : [conditions missing in source]
In the definition of the CDBML problem, the first objective indicates that the broadcasting schedule avoids the data conflict problem. The second objective is to reduce the average access time, and all of the data items associated with a request $q_i$ should be broadcast before the end of the broadcasting cycle. $W_i$ is an indicator function that indicates whether a request is served or not. To show that the CDBML problem is NP-complete, we consider a special case of it, where the number of data items associated with each request is the same and equal to the number of channels. That is, we consider the case $k = |C|$. The data items associated with different requests are all different. The following gives the definition of the decision problem for this special case.
Definition 4 (CDBMLρ problem). Instance: There are $|C|$ data broadcasting channels with cycle length $L$, a set of $|D|$ data items $D = \{d_1, \dots, d_{|D|}\}$, a set of $n$ requests $Q = \{q_1, \dots, q_n\}$, and an integer $h$. Any two data items associated with two different requests are different, and every data item needs a unit time $u_t$ to be broadcast. Let $loc^i_{\min}$ and $loc^i_{\max}$ be the start time and finish time of $q_i$, respectively.
Question: Does there exist a mapping $M$ : [conditions missing in source]
To show that the CDBMLρ problem is NP-complete, we reduce the minimizing mean flow time in unit time open shop (MMUOS) [30] problem with preemption ($O \mid p_{i,j} \in \{0, 1\};\ pmtn \mid \sum C_i$) to the CDBMLρ problem. It was proved in [30] that this problem ($O \mid p_{i,j} \in \{0, 1\};\ pmtn \mid \sum C_i$) is NP-hard by a reduction from the graph coloring problem, and thus the CDBML problem is NP-hard. The MMUOS problem is defined as follows.
Definition 5 (MMUOS problem). Instance: Given $m$ machines, a set of $n$ jobs $J = \{J_1, J_2, \dots, J_n\}$, a set of $|O|$ unit operations $O = \{o_1, o_2, \dots, o_{|O|}\}$, and an integer $T$. Each job has to be processed on the $j$th machine. Job $J_i$ will be processed in a window defined by a release time $r_i$ and a finish time $c_i$.
Question: Does there exist a mapping $M$ : [conditions missing in source]
Proof. It is easy to see that the CDBMLρ problem is in NP, since validating a given conflict-free schedule only needs polynomial time. To prove that the CDBMLρ problem is NP-hard, a reduction from the MMUOS problem can be made. Suppose that $I'$ is an instance of the MMUOS problem. A corresponding instance $I$ of the CDBMLρ problem can be constructed from $I'$ as follows: (1) A unit operation time is equal to the unit time slot for broadcasting a data item. [remaining construction steps missing in source]
According to the last step of the construction, the first objective of the MMUOS problem is equivalent to the first objective of the CDBMLρ problem, and the above construction can be done in polynomial time. It is straightforward to show that there is a solution for an instance $I'$ of the MMUOS problem if and only if there is a solution for the instance $I$ of the CDBMLρ problem, since the reduction is a one-to-one mapping of the variables from the MMUOS problem to the CDBMLρ problem. Hence, the CDBMLρ problem is NP-complete.
Thus, we can conclude the following theorem.
Theorem 7. The CDBML problem is NP-complete.

Frequent-Pattern-Based Broadcast Scheduling
In this section, we propose an approach, the frequent-pattern-based broadcast scheduling (FPBS), to shorten the average access time per user for the DBCA problem. In FPBS, we construct a new tree from the frequent patterns of queries. This tree is named the FP*-tree. FPBS includes four stages: (1) sorting requested data items, (2) constructing the FP*-tree's backbone, (3) constructing the FP*-tree's accelerating branches, and (4) schedule mapping. In the following, the proposed method is introduced in detail with a running example.
4.1. Stage 1: Sorting Requested Data Items. We consider a running example that uses two data broadcasting channels, $c_1$ and $c_2$, and an additional index channel $c_{index}$. The data dissemination server receives five queries [definitions of $q_1$ to $q_4$ missing in source] and $q_5 = \{d_2, d_5, d_8\}$, and then derives the access frequency $f_{d_j}$ of each data item $d_j$ in these queries. After that, the server sorts the data items in each query in descending order of their access frequencies and also derives the statistical average access frequency $f_{q_i}$ of each query $q_i$.
The detailed process for the first stage, $FPBS_{StatisticAndSort}(Q)$, is presented in Algorithm 1. Line 2 and Line 3 analyze the received query set $Q$, derive the statistical information, and save it as a temporary set $S$. The operations from Line 4 to Line 6 sort the requested data items of each query according to their access frequencies. As the example in Table 2 shows, the orders of the requested data items in queries $q_4$ and $q_5$ change after the sorting. Line 7 and Line 8 store the results in two lists, $list_{SortedWithSize}$ and $list_{SortedWithFre}$, respectively, in different orders. Finally, the process returns these two lists at Line 9 for use in the following stages.
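The first stage can be sketched as follows. This is a minimal illustration rather than Algorithm 1 itself: the function name `statistic_and_sort`, the tie-break by item name, and the reading of a query's average access frequency as the mean of its items' frequencies are assumptions, and the sample queries are hypothetical rather than the running example.

```python
from collections import Counter
from typing import Dict, List, Tuple


def statistic_and_sort(
    queries: Dict[str, List[str]],
) -> Tuple[Counter, Dict[str, List[str]], Dict[str, float]]:
    """Derive item frequencies, sort each query, and average per query."""
    # Access frequency of an item: number of queries that request it.
    freq = Counter(item for items in queries.values() for item in set(items))
    # Sort each query's items by descending frequency.
    # Assumption: ties are broken by item name to keep the result stable.
    sorted_queries = {
        q: sorted(items, key=lambda d: (-freq[d], d))
        for q, items in queries.items()
    }
    # Assumption: a query's average frequency is the mean over its items.
    avg_freq = {
        q: sum(freq[d] for d in items) / len(items)
        for q, items in queries.items()
    }
    return freq, sorted_queries, avg_freq


if __name__ == "__main__":
    # Hypothetical queries for illustration only.
    sample = {"qa": ["d1", "d2"], "qb": ["d2", "d3"], "qc": ["d3", "d2", "d4"]}
    freq, sorted_queries, avg_freq = statistic_and_sort(sample)
    print(sorted_queries)
    print(avg_freq)
```

The two returned dictionaries carry the information that the later stages order by, namely the number of requested items and the average access frequency of each query.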

4.2. Stage 2: Constructing the FP*-Tree's Backbone. After deriving the statistical information and the sorting result in Table 2, the system starts to create the backbone of an FP*-tree. In this stage, the system always first selects the query that requests the largest number of data items and inserts it into the FP*-tree. If multiple queries request the same number of data items, the system selects the one with the maximum average access frequency $f_{q_i}$. Thus, the system selects $q_4$ as the first query to construct the backbone of the FP*-tree, and the result is shown in Figure 4(a). After adding $q_4$ to the FP*-tree, the system updates the statistical information of the unhandled queries, as shown in Table 3.
After updating the statistical information, the system selects the next query to handle in the same way. According to the updated statistics, both $q_1$ and $q_3$ request 2 data items, so the system compares the remaining average access frequencies of $q_1$ and $q_3$ ($f_{q_1} = 2$, $f_{q_3} = 2$), which are equal. The query that arrived first is then selected, so $q_1$ becomes the next one in this step. Note that a smaller query index means an earlier arrival, so $q_1$ arrived before $q_3$. Thus, the handling priority of the remaining queries is $q_1 \to q_3 \to q_5$. While adding data item $d_2$ into the FP*-tree, the system needs to consider the relations between $d_2$ and the other queries. In this case, $q_2$ and $q_3$ also request the data item $d_2$. The system then checks the other data items that are in the request lists of both queries and have already been added to the FP*-tree. Since the level of $d_4$ is larger than the level of $d_3$, the system inserts $d_2$ as a child of $d_4$. This avoids increasing the access time of $q_4$, which has already been handled. After handling $d_2$, the system handles $d_7$ in the same way, and the resulting FP*-tree is shown in Figure 4(b). The system then updates the statistical information, which is presented in Table 4.
The next query to be handled is $q_3$. Since no other queries relate to the requested data item $d_8$, the system needs to add $d_8$ after $d_2$ according to the order of $q_3$'s requested list. However, $d_2$ is also requested by $q_1$, so $d_2$ already has one branch, and that position is occupied by $d_7$. Therefore, $d_8$ can only be scheduled in the level (time slot) after $d_2$ and $d_7$. In this case, the system creates a new branch of $d_2$ and inserts an empty node between $d_2$ and $d_8$. Note that an empty node is a node that does not store any data item. After handling $q_3$, the result is shown in Figure 4(c). The last query is $q_5$, and no other queries relate to $d_6$. Hence, the system has to add $d_6$ after $d_1$ according to the order of $q_5$'s requested list. However, $d_1$ is also requested by $q_4$, so $d_6$ needs to be scheduled after $d_4$. In this case, the system creates a new branch of $d_1$ and inserts an empty node between $d_1$ and $d_6$. Finally, the construction of the FP*-tree's backbone is finished, and the result is shown in Figure 4(d).
Algorithm 2 presents two functions for the backbone construction. $FPBS_{CreateBackbone}(S)$ describes the main process of an FP*-tree's backbone construction, and $FPBS_{AddNodeForBackbone}(T, N_p, d)$ is the function for adding a node during the backbone construction. From Line 3 to Line 5, the operations initialize an empty FP*-tree $T$ and create a sorted query table $Q_{table}$ from the sorted result derived in stage 1. The operations from Line 6 to Line 8 handle each requested data item of the first query in the sorted query set. The first query is the most important and has the maximum number of requested data items. As shown in the above example in Figure 4(a), the query $q_4$ is the first to be handled. At Line 9, the remaining information about unhandled queries and data items in the query table $Q_{table}$ is updated. From Line 10 to Line 17, the operations continue inserting the unhandled data items of $Q_{table}$ into the backbone of $T$. At Line 13, the operation finds the right position in $T$'s backbone to insert the unhandled data item, considering the query dependency and the order of data items. The operations from Line 21 to Line 35 present the detailed process of adding a data node to the backbone of $T$. Note that the operation $T.isOverload(N_{temp}.slot + 1)$ at Line 26 is used to avoid scheduling data items beyond the $|C|$ data broadcasting channels.
4.3. Stage 3: Constructing the FP*-Tree's Accelerating Branches. After the construction of the FP*-tree's backbone, the system starts to create the accelerating branches to optimize the schedule. The purpose of constructing the accelerating branches is to increase the chance that each user gets the requested data item earlier after switching channels.
In this stage, we propose two different ordering rules, request-number-first and frequency-first, for inserting data items into the FP*-tree's accelerating branches. The priority of a query for insertion into the FP*-tree is decided by the following values: number of requested data items, average access frequency, and arrival time. With the request-number-first rule, the system first selects the query that requests the maximum number of data items. If multiple queries request the maximum number of data items, the system selects the one among them that has the maximum average access frequency. If multiple queries also share the maximum average access frequency, the system selects the query according to its arrival order. Conversely, with the frequency-first rule, the system first selects the query that has the maximum average access frequency. If multiple queries have the maximum average access frequency, the system selects the one among them that requests the maximum number of data items. If multiple queries also request the maximum number of data items, the system selects the query according to its arrival order. Note that the construction of the FP*-tree's backbone always follows the request-number-first rule in our design. The system can use different rules only when constructing the accelerating branches of the FP*-tree.
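Both ordering rules amount to a lexicographic sort over the same three values, with only the first two keys swapped. The following sketch illustrates this; the `QueryInfo` record, its field names, and the sample values are hypothetical and are not taken from the running example.

```python
from typing import List, NamedTuple


class QueryInfo(NamedTuple):
    # Hypothetical per-query statistics used for ordering.
    name: str
    num_items: int    # number of requested data items
    avg_freq: float   # average access frequency of the query
    arrival: int      # arrival order; smaller means earlier


def request_number_first(queries: List[QueryInfo]) -> List[QueryInfo]:
    """Most items first, then higher average frequency, then earlier arrival."""
    return sorted(queries, key=lambda q: (-q.num_items, -q.avg_freq, q.arrival))


def frequency_first(queries: List[QueryInfo]) -> List[QueryInfo]:
    """Higher average frequency first, then most items, then earlier arrival."""
    return sorted(queries, key=lambda q: (-q.avg_freq, -q.num_items, q.arrival))


if __name__ == "__main__":
    sample = [
        QueryInfo("qa", 3, 2.0, 1),
        QueryInfo("qb", 4, 1.5, 2),
        QueryInfo("qc", 3, 2.0, 3),
        QueryInfo("qd", 2, 3.0, 4),
    ]
    print([q.name for q in request_number_first(sample)])
    print([q.name for q in frequency_first(sample)])
```

Negating the first two keys gives descending order for item count and frequency while keeping ascending order for arrival, so a single `sorted` call resolves all tie-breaks.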
Since different orders of handling queries and data items lead to different accelerating branches of the FP*-tree, we compare the performance of the schedules generated under the different rules. By default, the system uses the frequency-first rule to select the query for constructing the FP*-tree's accelerating branches. Due to space limitations and the similarity of the two processes, we only introduce the proposed approach with the frequency-first rule in detail. In this example, the system follows the frequency-first rule and obtains the following handling sequence: $q_2 \to q_4 \to q_1 \to q_3 \to q_5$. Note that the value of $f_{q_i}$ is shown in Table 2.
The system first handles query $q_2$, whose sorted requested data items are $d_2$, $d_3$, and $d_4$. Hence, the system schedules $d_2$, $d_3$, and $d_4$ in sequence. When scheduling $d_2$, the system temporarily inserts $d_2'$ into level (or slot) 1 as a right child of the root. Then, the system searches for $d_2$ in the backbone and checks whether $p_2 > p_2'$ and $p_2 - p_2' > 1$. In this case, $p_2 > p_2'$ and $p_2 - p_2' = 4 > 1$ hold, so $d_2$ can be inserted at the position of $d_2'$.
For the next requested data item $d_3$, the system inserts $d_3'$ after $d_2$ in the accelerating branch and then checks in the same way whether the position is legal. In this case, $d_3$ can be inserted at the position of $d_3'$. For the last data item requested by query $q_2$, $d_4$, the system tries to temporarily insert $d_4'$ after $d_3$ in the accelerating branch. However, the system finds $d_4$ in the backbone with $p_4 - p_4' \le 1$. Thus, $d_4$ cannot be inserted into the accelerating branch. After handling $q_2$, the resulting FP*-tree is shown in Figure 5(a).
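The position check used in these steps can be written as a one-line predicate. The sketch below covers only the case spelled out above, where a backbone copy of the item exists at position $p$ and a temporary copy is tried at position $p'$; the function name `gap_allows_insertion` is hypothetical, and other cases (for example, no later backbone copy) are deliberately left out.

```python
def gap_allows_insertion(p_backbone: int, p_temp: int) -> bool:
    """Check whether p > p' and p - p' > 1 hold.

    p_backbone: position (level/slot) of the item in the backbone.
    p_temp:     position of the temporary copy in the accelerating branch.
    A gap larger than one slot leaves room for a channel switch.
    """
    return p_backbone > p_temp and p_backbone - p_temp > 1


if __name__ == "__main__":
    # d_2 in the example: p_2 - p_2' = 4, so the copy is accepted.
    print(gap_allows_insertion(5, 1))
    # d_4 in the example: p_4 - p_4' <= 1, so the copy is rejected.
    print(gap_allows_insertion(4, 3))
```

The second condition implies the first for integer positions; both are kept to mirror the wording of the check.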
For the next query $q_4$, the system does nothing in the accelerating branch. The reason is that $q_4$ is the first query handled in the backbone, so its schedule, $d_3 \to d_5 \to d_1 \to d_4$, has already been optimized.
Next, the system handles $q_1$, whose requested data items are $d_2$, $d_5$, and $d_7$. Since $d_2$ has already been inserted into the accelerating branch, the system skips $d_2$ and tries to insert $d_5$ in this step. According to the order of $q_1$'s requested list, $d_5$ needs to be inserted after $d_2$. In the accelerating branch, node $d_2$ already has a child, so the system creates a new branch of $d_2$, inserts an empty node as $d_2$'s right child, and then adds a temporary $d_5'$ after the empty node. Since there is no $d_5$ with $p_5 > p_5'$ in the backbone, it is legal to insert $d_5$ at the position of $d_5'$. The last data item requested by $q_1$, $d_7$, is inserted in the same way: the system first inserts $d_7'$ after $d_5$ and then checks whether the backbone contains $d_7$. Since $p_7 > p_7'$ and $p_7 - p_7' = 2 > 1$, it is legal to insert $d_7$ at the position of $d_7'$. After handling all the requested data items in $q_1$, the resulting FP*-tree is shown in Figure 5(b).
After handling $q_1$, the system handles $q_3$, whose sorted requested data items are $d_2$, $d_5$, and $d_8$. Since $d_2$ has been scheduled at the first slot (level) in the accelerating branch, the system skips $d_2$ in this step. The next data item, $d_5$, has also been scheduled in the accelerating branch while handling the previous query $q_1$. Hence, the system only needs to handle $d_8$ for $q_3$. According to the requested list of $q_3$, $d_8$ needs to be inserted at a position after $d_2$ and $d_5$. In the accelerating branch, $p_5 > p_2$, so $d_8$ will be inserted under $d_5$. Since $d_5$ already has a branch, the system creates a new branch of $d_5$, inserts an empty node after $d_5$, and tries to insert a temporary $d_8'$ after the empty node (at $p_8' = 5$). However, $|C| = 2$ and the bandwidth at slot $p_8' = 5$ is already occupied by $d_2$ and $d_6$. The system therefore inserts another empty node and tries to add a temporary $d_8'$ at position $p_8' = 6$. It then searches for $d_8$ in the backbone and checks whether $p_8 > p_8'$ and $p_8 - p_8' > 1$. In this case, $p_8 - p_8' = 1$, so it is illegal to place $d_8$ at the position of $d_8'$, and the system removes all the empty nodes after $d_5$ in the accelerating branch. Hence, the final FP*-tree is shown in Figure 5(c).
Algorithm 2: Functions used for the FP*-tree's backbone construction.
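The slot-capacity step in the handling of $q_3$ (adding empty nodes until a slot with a free channel is found) can be sketched as follows. The function name and the `slot_load` dictionary are hypothetical helpers introduced only for illustration; the legality check against the backbone still has to be applied to the slot that is returned.

```python
def first_free_slot(slot_load, start_slot, num_channels):
    """Return the first slot at or after start_slot that still has a free channel.

    slot_load maps a slot number to the number of data items already placed in it.
    Each skipped slot corresponds to one extra empty node in the accelerating branch.
    """
    slot = start_slot
    while slot_load.get(slot, 0) >= num_channels:
        slot += 1
    return slot


# With |C| = 2, slot 5 is already full (two items), so the candidate moves to slot 6.
candidate = first_free_slot({5: 2}, start_slot=5, num_channels=2)
```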
Algorithm 3 presents the pseudocode for the functions of accelerating branch construction. FPBS_CreateAcceleratingBranch($T$, $S$) is the main function for constructing the accelerating branches. At Line 6, the process calls the subfunction FPBS_AddNodeForAcceleratingBranch($T$, $N_{curr}$, $d$) to insert a data item into the accelerating branch of $T$. This process is similar to the function FPBS_AddNodeForBackbone($T$, $N_p$, $d$) in the backbone construction. The operation at Line 7 calls another subfunction, FPBS_RangeSearch($T$, $N_p$), to check whether the inserted data item is in the search range (or levels). The insertion is illegal if the same data item in the backbone of $T$ is located at one of the search levels. If the insertion is illegal, the inserted nodes (the data item and any empty nodes) are deleted at Line 47.

Stage 4: Schedule Mapping.
After finishing Stage 3, the system maps every slot (or level) of the FP*-tree onto the broadcasting channels using the breadth-first-search (BFS) strategy. The final results are shown in Figure 6. Note that the maximum number of data items in each slot (level) is the number of channels, $|C|$. The mapping process is described by the operations before Line 24 in Algorithm 4. From Lines 25 to 29, the process schedules the index items in the index channel, and the result is also shown in Figure 6. According to the indexing rule defined in (1), the index $I_1$ records which requests ask for the data items in slot 3, and the index $I_6$ records the corresponding information for the data items in slot 1.
Consider the example of Table 1. For the request $q_2 = \{d_2, d_3, d_4\}$, the final schedule in Figure 6 generated by the proposed FPBS shows that the user can retrieve all the requested data items, $d_2$ and $d_3$ (on $c_2$) and $d_4$ (on $c_1$), within 4 time slots, including one channel switch. Without the accelerating branch, the user needs 5 time slots to retrieve data items $d_2$, $d_3$, and $d_4$ on $c_1$. This result shows that the proposed FP*-tree can indeed reduce the access time.
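The level-by-level mapping of Stage 4 can be illustrated with a short Python sketch. It is a simplification rather than a transcription of Algorithm 4: the `Node` class and the function name are hypothetical, the index channel is omitted, and data items in a level are simply assigned to channels from left to right while empty nodes occupy no channel.

```python
class Node:
    """Tree node; item is None for the root and for empty nodes."""

    def __init__(self, item=None):
        self.item = item
        self.children = []

    def add(self, item=None):
        child = Node(item)
        self.children.append(child)
        return child


def map_levels_to_slots(root, num_channels):
    """Traverse the tree level by level (BFS) and return one row per time slot."""
    schedule = []
    level = list(root.children)
    while level:
        items = [n.item for n in level if n.item is not None]
        if len(items) > num_channels:
            raise ValueError("a level holds more data items than channels")
        # Pad idle channels with None so every slot has num_channels entries.
        schedule.append(items + [None] * (num_channels - len(items)))
        level = [child for n in level for child in n.children]
    return schedule


# Hypothetical tree: backbone d3 -> d5 -> d1, branch d2 -> (empty) -> d7.
root = Node()
root.add("d3").add("d5").add("d1")
root.add("d2").add(None).add("d7")
slots = map_levels_to_slots(root, num_channels=2)
```

In this toy tree, the empty node keeps the second channel idle in slot 2, which leaves the client one slot to switch channels before `d7` is broadcast.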

Analysis and Discussion
In this section, we analyze the performance of FPBS in terms of time complexity, space complexity, and access time.

Time Complexity.
Suppose that the notations are defined as above and the FP*-tree is denoted as $T$. Then the time complexity of constructing $T$ is $O(nk)$. The design of the FP*-tree derives from the FP-tree, and the only difference between them is that the FP*-tree adds an empty node when creating a new branch, except at the root node. In the last stage of the proposed method, schedule mapping needs to map all the data nodes of $T$ to the broadcasting channels, and $|T| \le nk$, so the time complexity of schedule mapping is also $O(nk)$. Because the FP*-tree evolves from the FP-tree, FPBS costs $O(nk)$ in both the average case and the worst case. In summary, FPBS provides a polynomial-time algorithm for the DBCA problem.

Space Complexity.
We next analyze the space complexity of FPBS. In this part, we only consider the temporary space used by the FPBS process. In Stage 1, the system uses a table of size $O(nk)$ to store the sorted requests and the statistical information. In Stage 2, the system uses the sorted table to construct the backbone of an FP*-tree, which also costs $O(nk)$ space. In Stage 3, the system constructs the accelerating branches of the FP*-tree, which costs $O(nk')$ space, where $1 \le k' \le k$. In the last stage, the system just maps the FP*-tree to the channels and only costs $O(1)$ additional temporary space for traversing the FP*-tree. That is, the temporary space complexity of the scheduling process is $O(nk)$.

Access Time.
In wireless data dissemination environments, access time (or latency) is an important metric for validating the efficiency of scheduling. In FPBS, the system always first selects the request whose size and average access frequency are maximum and then schedules it in the backbone of the FP*-tree. We then treat it as the base of the schedule. The access time for a request $q_i$ can thus be formulated as in Theorem 8.

Theorem 8.
Suppose that $F$ is the maximal frequent itemset in the first-scheduled request, $t$ is the minimum cost for channel switching, and $t_{wait}$ is the average waiting time from tuning into the channel to receiving the first required data item for a request. Then the access time for a request $q_i$ can be expressed as in (2), where $\sigma_1$ is the frequency of channel switching and $\sigma_2$ is the frequency of occupied-slot (empty node in the FP*-tree) skipping.
Proof. With the use of the index channel in FPBS, the average waiting time can be reduced efficiently. If $q_i \subseteq F$, all the data items required by $q_i$ can be obtained before the broadcast of all the data items in $F$ ends. In this case, the access time for $q_i$ is $t_{wait} + |q_i| + \sigma_1 t + \sigma_2$, where $|q_i| + \sigma_1 t + \sigma_2 \le |F|$. If $q_i \cap F = \emptyset$ (equivalently, $|q_i \cap F| = 0$), $q_i$ and $F$ are two disjoint sets. In this case, the data items requested by $q_i$ can only be allocated after the first-scheduled maximal frequent itemset, so the access time for $q_i$ is $t_{wait} + |F| + |q_i| + \sigma_1 t + \sigma_2$. However, the time $|F|$ can be merged into the average waiting time $t_{wait}$ that elapses before the first data item requested by $q_i$ is accessed. Otherwise, when $|q_i \setminus F| > 0$, $q_i$ and $F$ partially overlap, which means that some data items required by $q_i$ are scheduled after $F$. Hence, the access time for $q_i$ is given by (2). After discussing the general case of access time, we discuss the worst case in Theorem 9.
Theorem 9. Suppose all the notations are defined as above. The worst case of access time will be
$$acc_{worst} = t_{wait} + \left| \bigcup_{i=1}^{n} q_i \right|.$$
Proof. In general, the worst case is the scenario in which a client accesses the channels from the first time slot to the last time slot. In other words, the worst access time of FPBS is the height of the FP*-tree. By the design of FPBS, the accelerating branches of the FP*-tree cannot be longer than its backbone. Hence, the height of the FP*-tree, $H_T$, is the height of the backbone, $\left| \bigcup_{i=1}^{n} q_i \right|$. In practice, each client tunes into the channel at a random time slot, so the worst-case access time $acc_{worst}$ is $t_{wait} + \left| \bigcup_{i=1}^{n} q_i \right|$.
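The two closed-form expressions stated explicitly in the proofs above can be evaluated with a few lines of Python. This sketch covers only the contained case $q_i \subseteq F$ of Theorem 8 and the worst-case bound of Theorem 9; the function and parameter names are hypothetical, the numeric values are illustrative, and the query sets are those that appear in the walkthrough (the items of $q_5$ are not listed in this section, so it is left out).

```python
def access_time_contained(q, F, t_wait, t_switch, sigma1, sigma2):
    """Access time when every item of q lies in F: t_wait + |q| + sigma1 * t + sigma2."""
    q, F = set(q), set(F)
    if not q <= F:
        raise ValueError("this sketch only covers the case where q is a subset of F")
    return t_wait + len(q) + sigma1 * t_switch + sigma2


def worst_case_access_time(queries, t_wait):
    """Worst-case bound: t_wait plus the number of distinct requested data items."""
    distinct_items = set().union(*queries)
    return t_wait + len(distinct_items)


queries = [
    {"d2", "d5", "d7"},        # q1
    {"d2", "d3", "d4"},        # q2
    {"d2", "d5", "d8"},        # q3
    {"d1", "d3", "d4", "d5"},  # q4
]
F = queries[3]  # first-scheduled request in the example

contained = access_time_contained({"d3", "d5"}, F, t_wait=2, t_switch=1, sigma1=1, sigma2=0)
worst = worst_case_access_time(queries, t_wait=2)
```

With these illustrative values, the contained request needs 5 slots, whereas the worst-case bound over the four queries is 9 slots.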

In FPBS, no data item is replicated in the FP*-tree's backbone, so the height of the backbone is $\left| \bigcup_{i=1}^{n} q_i \right|$, which cannot exceed the number of distinct data items in the dataset. In this work, we focus on minimizing the average access time, and the proposed FPBS approach effectively shortens the access time of each request using the accelerating branches. In (2), the terms $|q_i \cap F|$ and $|q_i \setminus F|$ are uncertain since the relation between a request $q_i$ and the maximal frequent itemset $F$ is unpredictable. Hence, FPBS focuses on minimizing the frequency of channel switching and the frequency of occupied-slot (empty node in the FP*-tree) skipping, that is, $\sigma_1$ and $\sigma_2$ in (2). The FP*-tree addresses this problem through its accelerating branches. In other words, FPBS is designed to tighten the upper bound of the access time, so the worst case becomes a very rare occurrence.

Simulation Results
We validate and discuss the performance of FPBS in terms of average access time by running simulations in different scenarios. The unit of time is a time slot. All the simulations are written in C++ and executed on a Windows 7 server equipped with an Intel(R) Core(TM) i7-3770 CPU @ 3.4 GHz and 12 GB of RAM. We use the Quandl databases [31] to extract U.S. stock prices and then use the obtained stock dataset as the input of our simulations.
We assume that the maximum number of channels is 10 ($|C| = 2, 3, \ldots, 10$) in the simulation. We also assume that one channel is the uplink for receiving requests, while the remaining channels (up to 10) are used as downlink broadcasting channels. The detailed parameters of our simulations are shown in Table 5.
In the simulations, FPBS is run in online and offline modes. In the online mode, the system uses a buffer to keep the information about queries and requested data items. When the buffer becomes full, the system starts to schedule data onto the broadcasting channels. The scheduled data items are removed from the buffer while new user demands continue to arrive in the buffer. This means that the FP*-tree and the schedule may change during the simulation. Conversely, in the offline mode, we assume that the system schedules the data only after storing all the requested information in the buffer.
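The difference between the two modes can be sketched as two small driver loops. This is an illustrative sketch only: `schedule_batch` is a hypothetical callback standing in for one FPBS scheduling run, the buffer capacity is assumed to be counted in requested data items, and flushing a partially filled buffer at the end of the run is also an assumption.

```python
def run_online(arrivals, capacity, schedule_batch):
    """Schedule a batch every time the buffered requests reach the capacity."""
    buffer, buffered_items, cycles = [], 0, []
    for query in arrivals:
        buffer.append(query)
        buffered_items += len(query)
        if buffered_items >= capacity:
            cycles.append(schedule_batch(buffer))
            buffer, buffered_items = [], 0   # scheduled requests leave the buffer
    if buffer:
        cycles.append(schedule_batch(buffer))  # assumption: flush the remainder
    return cycles


def run_offline(arrivals, schedule_batch):
    """Store every request first, then schedule once."""
    return [schedule_batch(list(arrivals))]


# Hypothetical arrivals; `list` stands in for the scheduler and just records the batch.
arrivals = [("d1", "d2"), ("d2", "d3"), ("d4", "d5"), ("d1", "d5"), ("d6", "d7")]
online_cycles = run_online(arrivals, capacity=4, schedule_batch=list)
offline_cycles = run_offline(arrivals, schedule_batch=list)
```

In the online run, the five requests are split into three scheduling cycles, so each cycle builds its tree from partial information, whereas the offline run sees all five requests at once.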
Note that there are two selection strategies in the scheduling process of FPBS: request-number-first and frequency-first. The request-number-first strategy selects the query according to the number of its requested data items first and then according to its average access frequency if multiple queries request the same number of data items. The frequency-first strategy selects the query according to its average access frequency first and then according to the number of its requested data items if multiple queries have the same average access frequency. Hence, we discuss these two strategies in the online and offline modes, respectively.
    1: Function FPBS_ScheduleMapping(T, S, |C|)
       Input: an FP*-tree T, a sorted query set S, and the number of channels |C|
       Output: a scheduled channel set S_channel and an index channel I_channel
    2: let a list list_handling ⟵ T.root.children;
    3: let a temporary list list_next ⟵ ∅;
    4: create a data channel S_channel with |C| data broadcasting channels (or rows);
    5: create an index channel I_channel ⟵ ∅;
    6: int i;
    7: while list_handling is not empty do
    8:   i ⟵ 1; /* i is used as a pointer to the current channel */
    9:   for each node N in list_handling do
    10:    if N is an empty node then
    11:      break;
    12:    else if N.parent is an empty node then
    13:      insert N into S_channel whose slot N. …
To the best of our knowledge, no existing work models the optimal performance of multi-item request scheduling while simultaneously considering channel switching and the dependencies between different requests in multichannel dissemination environments. Only [13] provides a heuristic algorithm, UPF, for a similar problem. For this reason, we choose UPF as the comparative baseline in the simulations.
6.1. Size of Dataset. In the first simulation, we discuss the performance of FPBS with different dataset sizes in terms of average access time. Note that the size of the dataset is the number of different data items stored in it. Figure 7 shows the results in three different cases [...]. From Figures 7(b) and 7(c), both the online and offline FPBS approaches outperform UPF for different dataset sizes when $|C| \ge 6$. Additionally, the frequency-first strategy, FPBS-Fre, always has the best performance in the different scenarios.
6.2. Number of Channels. In this part, we discuss the performance of FPBS in scenarios where the number of broadcasting channels is set from 2 to 20; the results are shown in Figure 8. The results indicate that the existing method, UPF, is not suitable for multiple-channel ($|C| \ge 4$) broadcasting environments and that UPF cannot dynamically [...] show that the average access time of UPF is unstable and increases slightly when the dataset becomes large ($|D| \ge 500$). A possible reason for this result is that UPF aims to minimize the request miss rate, not the average access time. There may be a trade-off between minimizing the request miss rate and minimizing the average access time.

6.3. Number of Requested Data Items. As the number of requested data items becomes larger, data dependencies between queries become more likely. In this subsection, we consider the effect of the number of requested data items on the average access time. As shown in Figure 9, all the FPBS-based approaches outperform UPF when the maximum number of requested data items, $q_{max}$, is smaller than 11. When $q_{max}$ is 2, all the FPBS-based approaches have similar average access times. As the value of $q_{max}$ increases, the average access time of all the FPBS-based approaches also increases linearly.
According to the results in Figure 9, the frequency-first strategies are better than the request-number-first strategies, since the average access times of FPBS-Fre and FPBS-Fre-Online increase more smoothly than those of FPBS-Rn and FPBS-Rn-Online. In addition, FPBS-Fre has the best performance, and its trend is almost parallel to that of UPF.
6.4. Buffer Size. In the last simulation, we discuss the effect of the buffer size on the average access time in order to compare the two proposed online approaches, FPBS-Rn-Online and FPBS-Fre-Online. We also consider the performance trend in scenarios where the number of channels is set to 3, 6, and 9, respectively.

The results in Figure 10 indicate that both FPBS-Rn-Online and FPBS-Fre-Online achieve a shorter average access time as the buffer size increases. In an environment with a small number of channels ($|C| = 3$), as shown in Figure 10(a), FPBS-Fre-Online performs slightly better than FPBS-Rn-Online when the buffer can store more than 2500 data items. The results in Figures 10(b) and 10(c) show that FPBS-Fre-Online is much better than FPBS-Rn-Online for different buffer sizes when the number of channels increases ($|C| \ge 6$).
6.5. Open Issues. In this subsection, we summarize some remaining issues (or potential challenges) in on-demand multi-channel data dissemination systems as follows:
(i) Hardware constraint: although the minimum cost $t$ for channel switching is normalized to one time slot in FPBS, it is difficult to implement a broadcasting system that meets this condition due to hardware limitations.
(ii) Cross-layer system design: in this paper, we design a server-side data scheduling approach for serving multi-item requests. For wireless networks, the time-varying and uncertain nature of wireless channels can also be considered in the scheduling. Thus, the server needs a new cross-layer system design to simultaneously access the request information in the application layer and the channel information in the physical layer and then schedule data items more efficiently.

Conclusion
In this paper, we investigate and formulate an emerging problem, DBCA, in multichannel wireless data dissemination environments. We also prove that the DBCA problem is NP-complete. Then, we present a heuristic scheduling approach, FPBS, to avoid data conflicts on multiple broadcasting channels. In FPBS, we use frequent patterns of requested data items to build an FP*-tree that extracts the correlation between the received requests. Thus, data conflicts can be avoided. During the construction of the FP*-tree's accelerating branches, adding empty nodes at appropriate positions gives the client sufficient time to switch channels to obtain the required data. We not only show that FPBS runs in polynomial time but also present an upper bound on the access time of a request, which is related to the size of the dataset. According to the simulation results, FPBS performs much better than the existing work, UPF, in most cases.

Data Availability
The stock dataset used to support this study is available online and is cited as a reference [31] in relevant places in the text. The program data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest
The authors declare that they have no conflicts of interest.