This paper proposes a tree-based adaptive broadcasting (TAB) algorithm for data dissemination to improve data access efficiency. The proposed TAB algorithm first constructs a broadcast tree to determine the broadcast frequency of each data and splits the broadcast tree into some broadcast wood to generate the broadcast program. In addition, this paper develops an analytical model to derive the mean access latency of the generated broadcast program. In light of the derived results, both the index channel’s bandwidth and the data channel’s bandwidth can be optimally allocated to maximize bandwidth utilization. This paper presents experiments to help evaluate the effectiveness of the proposed strategy. From the experimental results, it can be seen that the proposed mechanism is feasible in practice.
1. Introduction
Mobile web services are a new generation of web services accessible to mobile clients through the air in support of anytime-and-anywhere access to services [1, 2]. Furthermore, owing to the characteristics of wireless environments including device mobility, scarce bandwidth, and limited battery power, accessing services in wireless-oriented service environments has become an emerging challenge to the data-management and telecommunication communities [3].
In essence, there are two fundamental modes for data-service dissemination in a wireless region: the broadcasting mode and the on-demand mode [4]. In a broadcasting mode, data is broadcast periodically to mobile devices according to a broadcast program in the region [5–14]. To fetch a data record, mobile clients have to wait until the target data appears on the broadcast channel. In this way, a broadcast-based system can serve thousands of mobile users simultaneously, since the broadcast cost is identical regardless of the number of users. The other data dissemination mode is the on-demand mode. This mode is similar to the traditional client-server approach. In the on-demand mode, a mobile node first sends its query on an uplink channel and the server sends the requested data to the client through the downlink channel. In this paper, we consider data disseminated in a broadcast-based wireless environment.
In the literature, access efficiency and energy consumption are two issues of concern in assessing the performance of wireless communication systems [4, 15]. Access efficiency can be evaluated by access time, which means the time that has elapsed from the moment a client requests data to the moment the client retrieves the target item. Energy consumption concerns the battery power consumed by the client to retrieve the requested data, and it can be quantified according to tune-in time [15], in other words, according to the amount of time the mobile device stays active “listening” to the broadcast channel.
The plain-broadcast scheme is the simplest approach to generating data-broadcast programs and has been adopted in earlier research [3, 16]. Using this approach, the server broadcasts all data records in a round robin manner. Therefore, this method is easily implemented. Furthermore, since the plain-broadcast scheme treats all data items equally, the average waiting time for each packet of data equals half of the overall broadcast period. As a result, it is clear that this scheme is not feasible for cases in which data-access frequencies are not uniform.
An alternative data dissemination mechanism is the broadcast disks scheme, which permits data items to be broadcast with different frequencies [5]. This algorithm first divides data items into a few groups (i.e., disks) such that data items with similar popularity are assigned to the same disks. Afterwards, it determines the rotation speed of each disk according to the popularity of data items. In this way, one can construct a broadcast program that adjusts the trade-off between the access time of hot data and that of cold data.
In addition to access efficiency, power conservation is critical for mobile nodes owing to limited battery capacities [17–19]. To facilitate power saving, it is necessary for mobile devices to support two operation modes: the active mode and the doze mode [20]. Mobile clients normally operate in the active mode, and they can switch to the energy-saving doze mode when mobile devices become idle. Thus, keeping mobile devices in the doze mode for as long as possible could be achieved through the application of an air index technique.
By broadcasting the arrival time of data items to clients, mobile devices can stay in the doze mode until the requested data arrives. In this way, the tune-in time can be reduced to the initial index probe time plus the data-retrieval time. At present, several research efforts have addressed reducing the initial probe time [15, 21–26]. These studies complement our work in different aspects.
In this paper, we investigate the effects of data-access frequency and data size on access efficiency. And we propose the tree-based adaptive broadcasting (TAB) algorithm for skewed-data-access to generate an efficient broadcast cycle. The TAB algorithm first generates a broadcast tree to determine the broadcast frequency of each data record. After that, the broadcast tree is split into broadcast wood to balance the interbroadcast time of successive copies of data. In order to reduce the tune-in time, we further separate one individual channel from the broadcast channel to broadcast index packets.
The rest of this paper is organized as follows. Section 2 introduces related data-broadcast research. The system architecture used throughout this paper is presented as well. Section 3 discusses the proposed TAB algorithm and its role in improving data-access latency. Section 4 establishes an analytical model for optimizing index-channel and data-channel bandwidth allocation. Section 5 discusses the proposed dynamic broadcast adaptive for weight change. Section 6 discusses experiments serving to evaluate the performance of the proposed mechanism. Finally, Section 6 remarks on the conclusions drawn.
2. Related Work
This section reviews important attempts at applying data broadcasting and bandwidth allocation in mobile networks. This paper develops an analytical model to approximate the proposed TAB algorithm. This model makes it convenient to efficiently evaluate the mean access time of the generated broadcast program. Moreover, in light of the derived access time, both the index channel’s optimum bandwidth allocation and the data channel’s optimum bandwidth allocation are formulated.
In order to assess the feasibility and efficiency of our mechanism, we conducted several experiments. Results reveal that the proposed TAB algorithm performs well in terms of data-access efficiency. Moreover, the optimum bandwidth allocation yields a significant performance improvement in tune-in time. As a consequence, it can be seen from experimental results that putting the proposed mechanism into practice is entirely feasible.
In [5], this scheme assumes that data items are of equal sizes. In terms of practicality, it is not efficient to apply the broadcast disks to the varied-size data items. Moreover, it is hard for system developers to define the similarity of data popularity so as to partition data items into disks. The determination of the relative broadcast frequency for each disk is also imprecise. In this paper, we propose a TAB algorithm for varied-size data items to tackle the above drawbacks. The TAB algorithm grows a broadcast tree to determine the broadcast frequency of each data record. After that, we split the broadcast tree into some broadcast wood with similar sizes so as to place those data items in the broadcast cycle. The details are described clearly in Section 3.
Xu et al. present exponential-index technology that enables some flexibility in trade-off between tune-in time and access latency [21]. As shown in Figure 1, the exponential index adopts a flat broadcast and disseminates data items in the ascending order of their identifiers. These researchers further group data items into a chunk and maintain one index table for each chunk. The number of entries in an index table is determined by the index base. Then the exponential-index technology can adjust the trade-off between access efficiency and energy consumption by tuning the index base and the chunk-size parameters.
An example of an exponential index.
Therefore, the bandwidth of an exponential index in broadcasting index information is only dominated by the index base and chunk size. The current study further examines the effects of data placement on broadcast programs and establishes an analytical model to derive the optimum bandwidth allocation for index packets and data elements. The performance comparison between our mechanism and the exponential index is demonstrated in Section 5.
The system architecture in this paper is depicted in Figure 2. As shown in Figure 2, the proposed TAB algorithm schedules data items in a server’s database to construct a broadcast program. According to the broadcast program, the system disseminates these data records periodically through the data channel. In addition, in order to reduce power consumption, some effective indexing techniques can generate index packets. Those index packets contain information for mobile nodes, such as data identifiers and the nearest data-appearance time. Index packets are broadcast through the index channel.
The sketch of our system architecture.
On the other hand, when a user submits queries to a mobile client, the mobile equipment first fetches an index packet from the index channel to get the arrival time of the target data. Then, the mobile device switches from the active mode to the doze mode for energy savings until the target item appears on the data channel [20]. After that, the mobile client downloads the target-data transaction so as to process the user’s request.
Servers’ databases maintain some auxiliary information for each data item. As depicted in Table 1, each data entry consists of three attributes: the data identifier, the data-access probability, and the data size. Addressing these factors, this paper proposes an efficient TAB algorithm to generate a skewed broadcast program. Afterwards, the paper proposes an analytical model to approximate the mean access time of our mechanism. We also derive the optimum index-channel and data-channel bandwidth allocations. Details are presented in the following sections.
Data structure in servers’ databases.
Data identifier
Access probability
Size
D1
Pr(D1)
S(D1)
D2
Pr(D2)
S(D2)
D3
Pr(D3)
S(D3)
⋮
⋮
⋮
3. Tree-Based Adaptive Broadcasting
In this section, we propose the TAB algorithm as a way to improve the performance of existing data-broadcasting mechanisms. According to the statistical probability of data access, the TAB algorithm broadcasts a significant number of copies for popular data in a broadcast cycle to diminish the average access time. In addition, the proposed algorithm balances the interbroadcast time of successive copies of a single packet of data even though data-item sizes can vary.
The notation for the TAB algorithm is summarized in Table 2. Consider the case in which a server’s database contains N data items for broadcasting. The data-access probability and the data size for each data item are given as well. In terms of these factors, the TAB algorithm should construct an efficient broadcast program to reduce access time. In fact, the TAB algorithm can be split into three different steps: data-item reordering, broadcast-tree construction, and wood-size equalization. The details for each step are described as follows.
Notation used in the TAB mechanism.
Notation
Definition
N
The number of data items
Pr(Di)
The access probability of data item Di
S(Di)
The size of data item Di
ni
The broadcast frequency of data item Di
R
Sapling
I
Broadcast tree
h
Tree height
L
The length of a broadcast cycle
B
The bandwidth of a broadcast channel
τ
Maximum tree height of a broadcast tree
wi
The ith broadcast wood
3.1. Data-Item Reordering
The first step of the TAB algorithm is to sort all data records in the database by their access frequency and size. More precisely, after performing the data-item reordering, we would get a broadcast cycle [D1,D2,…,DN] such that (1) Pr(Di)≥Pr(Dj) and (2) if Pr(Di)=Pr(Dj), and then S(Di)≤S(Dj) for any integers i<j≤N. To reduce the average access time, it is beneficial to broadcast hotter data more frequently [27, 28]. Therefore, sorting data records from hottest to coldest can make it convenient to determine the broadcast frequency of each data item.
The second step is to determine the broadcast frequency (i.e., the number of replicates) for each data item. Note that the replication of a data record would reduce the access time of that data; however, the replication would lengthen the whole broadcast cycle and increase the access time of other records. Therefore, this paper proposes a broadcast-tree construction algorithm to balance the trade-off between these two factors.
As a matter of fact, the broadcast-tree construction yields broadcast trees in a top-down manner. Figure 3 presents the scenario of the broadcast-tree construction. First of all, the broadcast-tree construction starts with the sorted data from the data-item reordering (Figure 3(a)). Then, after some evaluation, the algorithm iteratively moves data items with high access probabilities to the lower level so as to double their broadcast frequencies (Figures 3(b) and 3(c)). Finally, we can get a broadcast tree by copying the nodes at each level, resulting in a full binary tree as drawn in Figure 3(d).
The scenario of the broadcast-tree construction.
Sapling with height 0
Sapling with height 1
Sapling with height 2
Broadcast tree with height 2
3.2. Broadcast-Tree Construction
Once the broadcast tree is built, the broadcast frequency ni for each data item Di is determined as well. The number of replicates in the broadcast tree stands for the data’s broadcast frequency. Thus, take the broadcast tree in Figure 3 as an example. In this case, we have n1=4, n2=n3=2, and n4=n5=⋯=n11=1. Specifically, the criterion for estimating which data items should be moved to the next level is determined by the following theorem.
Theorem 1.
Suppose that sapling R of height h has m data records at depth h (as shown in Figure 4(a)). Let di denote the depth of data item Di in R and let S(Di) represent its data size (i.e., the broadcast frequency ni=2diand the broadcast lengthL=∑i=1NniS(Di). Assume that the bandwidth of the data channel is B. Then the reduced average access time, achieved by moving data items D1,…,Dr(1≤r≤m) to the next level h+1, can be formulated by
(1)L2h+2B∑i=1rPr(Di)-1B(∑i=1rPr(Di)4+2h-1∑i=r+1NPr(Di)ni)(∑i=1rS(Di)).
Move data items {D1, …, Dr} to the next level.
Proof.
Consider the average access time before and after moving data to the next level. Figure 4 shows that, before data are moved, one can perform an approximate calculation of the mean access time TBE by using the equation
(2)TBE=∑i=1NPr(Di)(LBE2niB+S(Di)B),
where LBE denotes the current broadcast length. Likewise, after data are moved, the average access time TAF can be estimated by means of the equation
(3)TAF=∑i=1rPr(Di)(LAF2·(2ni)B+S(Di)B)+∑i=r+1NPr(Di)(LAF2niB+S(Di)B),
where the latter broadcast cycle length LAF is equal to LBE+∑i=1rniS(Di). Therefore, we can obtain the reduced access time
(4)TBE-TAF=∑i=1rPr(Di)4niBLBE-(∑i=1rPr(Di)4niB+∑i=r+1NPr(Di)2niB)(∑j=1rnjS(Dj))=LBE2h+2B∑i=1rPr(Di)-1B(∑i=1rPr(Di)4+2h-1∑i=r+1NPr(Di)ni)(∑j=1rS(Dj)).
According to Theorem 1, the constancy of bandwidth B facilitates the broadcast-tree construction procedure. The broadcast-tree construction starts with the sorted data elements from the data-item reordering.
Afterwards, we use Theorem 1 to determine the optimal cutpoint c for each level and move data records D1,…,Dc to the next floor so as to reduce the overall access time. In addition, note that the maximum height of the generated broadcast tree is limited by parameter τ. This factor can prevent the procedures shown in Algorithm 1 from taking too much execution time.
<bold>Algorithm 1</bold>
Broadcast-Tree Construction Procedure:
(0) Initial settings: Let h = 0, m = N, L=∑i=1NS(Di), and ni=1 for i=1,2,…,N
(1) Let F(r)=(L/2h+2)∑i=1rPr(Di)-(∑i=1r(Pr(Di)/4)+2h-1∑i=r+1N(Pr(Di)/ni))(∑i=1rS(Di)).
Find the cutpoint c such that F(c)=max1≤r≤m{F(r)}.
(2) IfF(c)≤0 or h>τ, then return the expanded full binary tree I.
(3) else move the data items {D1,…,Dc} to the next level.
(4) Set h=h+1, m=c, L=L+∑i=1cniS(Di), and ni=2h for i=1,2,…,c. Then go to step (1).
3.3. Wood-Size Equalization
Once the number of duplicates for each data item is obtained, we determine the replicated data placement in the broadcast cycle. Clearly, to achieve a better performance, the interbroadcast time of successive copies of data should be the same. However, it is known that such an optimum placement problem associated with the variant data sizes is an NP-complete problem [29]. Consequently, in this subsection we develop a wood-size equalization algorithm to place those data items in the broadcast cycle.
The functionality of the wood-size equalization is to split broadcast tree I of height h into 2^{
h} pieces of broadcast wood (w1,…,w2h) with similar sizes. Actually, the wood-size equalization is a recurrence in structure. It splits the broadcast tree in a bottom-up manner. Given broadcast tree I, we first get 2^{
h}/2 broadcast woods by applying the wood-size equalization to the left subtree of I and get the other 2^{
h}/2 broadcast woods from the right subtree of I. Afterwards, the root-cutting procedure permits the distribution of the data in the root to these woods such that each broadcast wood has a similar size. The wood-size equalization can be stated as follows.
Basically, the root-cutting procedure adopts a greedy strategy to divide the root node. More precisely, the root-cutting procedure iteratively splits the data ai with the largest data size from the root and attaches it to the minimum-size wood wc until all the data in the root are allocated. Thus, we can summarize the root-cutting procedure in Algorithms 2 and 3.
<bold>Algorithm 2</bold>
Wood-Size Equalization WSE(I):
//Let h:= the height of broadcast tree I,
//R:= the root node of tree I
//IL and IR denote the left and the right subtrees of I.
(1) If h=0, then returnI.
(2) else (w1,…,w2h-1):= WSE(IL),
(3) (w2h-1+1,…,w2h):= WSE(IR),
(4) returnRoot-Cutting(R,β1,…,β2h).
<bold>Algorithm 3</bold>
Root-cutting procedure (R,w1,…,wv):
(1)Sort the data items in root R according to their sizes into a non-increasing order a1,…,ak.
(i.e., S(ai)≥S(aj)iff1≤i≤j≤k).
(2)Fori = 1 to k
(3) Find the wood wcwith the smallest size.
(4) Split the data item ai from the root and attach it to the top of wood wc.
(5)return (w1,…,wv).
An example of the proposed wood-size equalization is illustrated in Figure 5. Consider broadcast tree I of height 2 in Figure 3(d). In the beginning, we recursively apply wood-size equalization, resulting in the intermedium tree with four broadcast woods shown in Figure 5(a). After performing the root-cutting procedure, we get four individual broadcast woods as drawn in Figure 5(b).
The scenario of the wood-size equalization.
Upon completion of the wood-size equalization, one can obtain the broadcast cycle by sequentially broadcasting each wood from top to bottom. Therefore, in this case we can get the broadcast program as shown in Figure 6.
The generated broadcast program.
3.4. Complexity Analysis
In this subsection, the time complexity of the TAB algorithm is studied. Recall that the TAB algorithm contains three steps. The first step is the data-item reordering, which requires time complexity O(NlogN) for sorting N data items. In addition, the second step, broadcast-tree construction, builds a broadcast tree having a height of at most τ. For each level, it takes at most O(N) time to determine the fittest cutpoint. And then, it requires O(2τN) time to expand from a sapling to a full binary tree. Finally, the wood-size equalization is a recurrence in structure, and it requires a time complexity of O(τNlogN +τ2τN). Besides, because the value τ is relatively insignificant for the large value of N, we concluded that the proposed TAB algorithm takes only a time complexity of O(NlogN) in total.
4. Optimum Bandwidth Allocation
In this section, we develop an analytical model to approximate the average access time of the proposed TAB algorithm (Theorem 9). Afterwards, this analytical model helps derive the optimum bandwidth allocation for our system architecture and minimize the average access time (Theorem 10). The details are described as follows.
It can be seen from Figure 7 that the access time of the proposed TAB algorithm can be decomposed into two portions: index-access time and data-access time. When a user submits queries to mobile clients, the mobile device needs to read an index packet from the index channel. The time interval between the moment that the user sends a query and the moment that the mobile device gets one index packet is called the index-access time. Likewise, the data-access time is defined as the time interval between the moment that the mobile client finishes the index packet and the moment that the mobile device obtains the target data packet. As a result, by the above definitions, we have the following lemma.
Sketch of the data-access time.
Lemma 2.
Let the random variable Taccess denote the total access time of a query. And let the random variablesTaccessIandTaccessDrepresent the index-access time and data-access time of the query, respectively. Thus, it is clear that
(5)Taccess=TaccessI+TaccessD.
Therefore, according to Lemma 2, we know that the index-access time and data-access time should be calculated first to obtain the total access time. In order to get the index-access time, we consider the index channel as shown in Figure 8. It can be seen from Figure 8 that the index access timeTaccessIis determined by the time the mobile device submits its query. Consequently, if we assume that a uniform distribution characterizes the duration of time extending from the starting point of the current index packets to the moment the mobile device submits its query [30], then the average index-access time can be arrived at by the following lemma.
Sketch of the index-access time.
Lemma 3.
Let Sindex represent the size of each index packet and let BI denote the bandwidth of the index channel. Assume that the interval between the starting point of the current index packet and the moment the mobile client submits its query follows a uniform distribution over[0,Sindex/BI). Then the average index-access time can be formulated as
(6)E[TaccessI]=∫0Sindex/BI(2·SindexBI-t)·(SindexBI)-1dt.
Proof.
Without loss of generality, we consider the case in which the mobile user submits a query during the jth index-packet broadcast time as in Figure 8. Let T1 denote the random variable representing the time interval between the starting point of the jth index-packet broadcast cycle and the moment that the mobile device submits its query. Thus, it is clear that the index-access time TaccessI can be formulated as
(7)TaccessI={SindexBIifT1=02SindexBI-T1if0<T1<SindexBI.
In addition, if we further assume that the random variable T1 follows a uniform distribution over[0,Sindex/BI), then the mean index-access time can be computed by
(8)E[TaccessI]=∫0Sindex/BI(2·SindexBI-t)·(SindexBI)-1dt.
On the other hand, since each data item Di has its own access probability Pr(Di), the average data-access time can be expressed as a weighted summation of the average access time of all data items. In terms of mathematic form, the mean data-access time can be formulated as follows.
Lemma 4.
Suppose that the server database contains N data items D1,D2,…,DN for broadcasting. Furthermore, let
Pr
(Di) denote the access probability of the data item Di, for i=1,2,…,N. Then the expected value of the data-access time can be expressed by
(9)E[TaccessD]=∑i=1NE[Taccess(Di)]·
Pr
(Di),
where Taccess(Di) stands for the random variable representing the access time of the specific data item Di.
As a consequence, in order to obtain the average data-access time, we need to get the average access time for each data item first. Figure 9 depicts the sketch of the data item D3’s access time. As shown in Figure 9, the access time of an arbitrary data item Di can be further decomposed into two parts: waiting time and retrieval time. The waiting time of an arbitrary data item Di is defined as the time interval between the moment the mobile device gets an index packet and the moment the data channel starts to broadcast the target data item Di. And the retrieval time represents the time interval during which the mobile equipment downloads the target data item. Thus, by the above definition, we have the following lemma.
Sketch of the data-access time.
Lemma 5.
Let Twait(Di) denote the random variable representing the waiting time of the data item Di and let Tret(Di) denote the retrieval time of the data item Di. Then we have
(10)Taccess(Di)=Twait(Di)+Tret(Di).
In addition, the retrieval time Tret(Di) can be determined by its data size S(Di) divided by the data channel bandwidth BD. That is,
(11)Tret(Di)=S(Di)BD.
On the other hand, it is not easy to derive the waiting time Twait(Di) directly for some data items Di because the proposed TAB algorithm broadcasts duplicates for those data items with high-access probability. Furthermore, the positions of the replicated data items in the broadcast cycle also determine the data items’ waiting time. Therefore, in this paper, we introduce a specific Li[k]-function to represent the position of the kth replicate of the data item Di in the broadcast cycle.
Definition 6.
The length L of a broadcast cycle is defined as the total number of data bits in this broadcast cycle. And the term Li[k] is defined as the total number of broadcasted bits before broadcasting the kth replicate of the data item Di in a broadcast cycle.
Example 7.
Take the broadcast cycle depicted in Figure 6 as an example. In this case, we have the equations L1[1]=S(D11)+S(D4)+S(D3), L1[2]=L1[1]+S(D1)+S(D8)+S(D2), L1[3]=L1[2]+S(D1)+S(D5)+S(D10)+S(D7)+S(D3), and L1[4]=L1[3]+S(D1)+S(D9)+S(D6)+S(D2). The length of the broadcast cycle L is equal to L1[4]+S(D1).
As a result, we know that the Li[k]-function can help describe any broadcast program accurately. Consider the case in which, after this work performs the proposed TAB algorithm, the data item Di has ni duplicates in a broadcast cycle, and these ni duplicates are located at Li[1],Li[2],…,Li[ni], respectively (see Figure 10). Subsequently, the following theorem can yield the data item’s waiting time Twait(Di) relative to the proposed TAB scheme.
The broadcast deployment for data item Di (Li-function).
Theorem 8.
Let BD be the bandwidth of the data channel. Suppose that the broadcast program obtained by performing the TAB algorithm is given in terms of the Li[k]-function. Moreover, let the random variable T represent the time interval between the starting point of the current broadcast cycle and the moment that the mobile client starts to wait for the target data item. Then the random variable Twait(Di) can be formulated as
(12)Twait(Di)={Li[1]BD-Tif0≤T<Li[1]BDLi[k+1]BD-TifLi[k]BD≤T<Li[k+1]BDfork=1,2,…,ni-1L-T+Li[1]BDifLi[ni]BD≤T<LBD.
Besides, if we assume that the random variable T follows a uniform distribution over[0,L/BD), then the average waiting time E[Twait(Di)] can be further simplified as
(13)E[Twait(Di)]=12BDL[∑k=1ni-1(Li[k+1]-Li[k])2(L-Li[ni]+Li[1])2hhhhhhhhhhh+∑k=1ni-1(Li[k+1]-Li[k])2].
Proof.
Without loss of generality, we assume that the mobile client starts to wait for the target data item Di in the mth broadcast cycle as in Figure 10. Since the data Di is broadcast ni times during a broadcast cycle, the mobile client retrieves the nearest replicate of the target Di according to the entry time the mobile client starts to wait. So with time T being the time at which the mobile device starts to wait, we now consider three cases for calculating the waiting time.
Case 1 (0≤T<Li[1]/BD). In this case, the mobile client starts to wait before the first replicate of the target item in the mth broadcast cycle is broadcast. Therefore, the waiting time can be obtained by
(14)Twait(Di)=Li[1]BD-T.
Case 2 (Li[k]/BD≤T<Li[k+1]/BD,k=1,2,…,ni-1). In this case, the mobile client retrieves the (k+1)th replicate of the target data item in the mth broadcast cycle. Thus, the waiting time can be computed by
(15)Twait(Di)=Li[k+1]BD-T.
Case 3 (Li[ni]/BD≤T<L/BD). In this case, all the replicates of the target data item in the mth broadcast cycle were broadcast when the mobile device began to wait. Thus, the mobile client would download the first replicate of the target data in the (m+1)th broadcast cycle. In other words, the waiting time can be formulated as
(16)Twait(Di)=L-T+Li[1]BD.
On the other hand, if we further assume that the random variable T satisfies a uniform distribution over[0,L/BD), then the expected value of the waiting time can be derived by
(17)E[Twait(Di)]=∫0Li[1]/BD(Li[1]BD-t)·(LBD)-1dt+∑k=1ni-1∫Li[k]/BDLi[k+1]/BD(Li[k+1]BD-t)·(LBD)-1dt+∫Li[ni]/BDL/BD(L-t+Li[1]BD)·(LBD)-1dt=12BDL[∑k=1ni-1(Li[k+1]-Li[k])2(L-Li[ni]+Li[1])2+∑k=1ni-1(Li[k+1]-Li[k])2].
According to the above lemmas and theorems, the average access time can be computed as well. We now summarize the derivation of the average access time via the following theorem.
Theorem 9.
The average access time E[Taccess] can be derived by
(18)E[Taccess]=3Sindex2BI+∑i=1N
Pr
(Di)·{S(Di)BD12BDL[∑k=1ni-1(Li[k+1]-Li[k])2(L-Li[ni]+Li[1])2+∑k=1ni-1(Li[k+1]-Li[k])2]+S(Di)BD}.
Proof.
Based on Lemma 2, it is clear that
(19)E[Taccess]=E[TaccessI]+E[TaccessD].
Furthermore, after applying Lemmas 4 and 5 to the above equation, we get
(20)E[Taccess]=E[TaccessI]+∑i=1NPr(Di)·(E[Twait(Di)]+S(Di)BD).
Finally, Lemma 3 and Theorem 8 permit us to calculate the average access time E[Taccess] as follows:
(21)3Sindex2BI+∑i=1NPr(Di)·{12BDL[∑k=1ni-1(Li[k+1]-Li[k])2(L-Li[ni]+Li[1])2+∑k=1ni-1(Li[k+1]-Li[k])2]+S(Di)BD[∑k=1ni-1(Li[k+1]-Li[k])2(L-Li[ni]+Li[1])2}.
From Theorem 9, we can not only estimate the average access time of a query for any broadcast program, but also determine the optimum bandwidth allocation for both the index channel and the data channel. Consider the case in which the overall channel bandwidth for data broadcasting is B. Then, for the proposed TAB algorithm to achieve the minimum access time, the optimum bandwidth allocation is given by the following theorem.
Theorem 10.
Let B denote the total bandwidth for data broadcasting. Then the optimum bandwidth settings necessary for index channel BIopt and data channel BDopt to achieve the minimum average access time can be formulated as
(22)BIopt=3Sindex3Sindex+2ξB,BDopt=2ξ3Sindex+2ξB,
where
(23)ξ=∑i=1N
Pr
(Di)·{12L·[∑k=1ni-1(Li[k+1]-Li[k])2(L-Li[ni]+Li[1])2+∑k=1ni-1(Li[k+1]-Li[k])2]+S(Di)}.
Proof.
According to Theorem 9, the average access time E[Taccess] is equal to
(24)E[Taccess]=3Sindex2BI+ξBD.
Denote the estimation function Φ(BI,BD) by
(25)Φ(BI,BD)=3Sindex2BI+ξBD.
As a consequence, to minimize the average access time E[Taccess], we need to choose the values BI and BD to minimize the estimation function Φ(BI,BD) subject to the constraint BI+BD=B.
Assume that Γ(BI)=Φ(BI,B-BI). Then it is clear that the optimum bandwidth of the index channel BIopt satisfies the equation
(26)dΓ(BI)dBI|BI=BIopt=0.
After substituting this into the estimation function, we have
(27)dΓ(BI)dBI|BI=BIopt=(-3Sindex2BI2+ξ(B-BI)2)|BI=BIopt=0.
That is,
(28)BIopt=3Sindex3Sindex+2ξB.
Also, the optimum bandwidth of the data channel BDopt can be arrived at through the following equation:
(29)BDopt=B-BIopt=2ξ3Sindex+2ξB.
In order to assess the performance of the proposed system architecture, we conducted several experiments. Table 3 shows the parameter settings in our experiments. We assumed that the number of total data records for broadcasting varied from 50 to 100. And each data item contained two attributes: data size and data-access frequency.
Parameter setting.
Parameters
Values
The number of data items (N)
50~1000
Bandwidth (B)
80 KB/sec
Index packet size (Sindex)
128 bytes
The sizes of data items (S(Di))
Normal distribution(mean: 50~150 KB, variance: 900 KB^{2})
The access probabilities (Pr(Di))
Zipf distribution (θ= 0.5~1.5)
The number of requests
5000
Maximum height of broadcast tree
3
In our experiments, the data sizes followed a normal distribution with the mean varying from 50 KB to 150 KB and a variance of 900 KB^{2}. The modeling of the data-access probabilities rested on the Zipf distribution with the parameter θ [31]. In other words, the probability of the data item Di was assumed to be
(30)Pr(Di)=(1/i)θ∑j=1N(1/j)θ,
where the value of skew factor θ ranged from 0.5 to 1.5.
We did not employ any indexing technology for the index channel in our system, and in this way we could realize the actual effects of the proposed method. The index packet size was assumed to be 128 bytes. Further, the total available bandwidth including the index and data channel was set to 80 KB/sec [3].
In addition to the proposed system architecture, we implemented a plain broadcast, broadcast disks, and an exponential index scheme for comparison. In line with the simulation model in [1], the broadcast-disk technology was implemented with three broadcast disks. And the relative frequencies between these disks were dominated by parameter Δ. More precisely, the broadcast frequency of the disk i was determined by ref_freq(i)=(3-i)Δ+1. In the experiments, we considered three kinds of broadcast disks schemes: Δ=1, 2, and 3.
For each experiment, we generated five different datasets, each one containing 50~100 data records for broadcasting. In addition, for each data set, we generated 5,000 queries and calculated the corresponding access time and tune-in time to evaluate access efficiency and power conservation. The interarrival time of queries followed an exponential distribution with an arrival rate of λ=1. The simulator and query generator were coded in MATLAB.
5.2. Experimental Results
This work applies average tune-in time and average access time as the performance measurement. It allows devices to power on when they need to access data so that these two values are lower than the ones gained by the other method. It means that this device can stay in sleep mode longer and save more energy.
Figure 11 shows the average access time and tune-in time of different schemes with the number of data items varying from 50 to 100. In this experiment, we considered the case in which the Zipf parameter was set at 0.9 and the data-size generation process was a normal distribution with a mean of 100 KB and a variance of 900 KB^{2}.
The average access time and tune-in time for various data-item numbers.
It can be seen from Figure 11(a) that even though we sacrificed some bandwidth to broadcast index packets, our mechanism still achieved a lower access time than the broadcast disks did. Furthermore, as presented in Figure 11(a), this phenomenon became more prominent as the number of data items increased. Figure 11(b) shows the effects of our system on power conservation. As shown in Figure 11(b), the average tune-in time of our mechanism was lower than that of the exponential-index scheme. This finding demonstrates the importance of efficient bandwidth allocation in power conservation.
Figure 12 presents our comparison of the access latency and tune-in time in various average data sizes. Without loss of generality, we considered the number of data records to be set at 75 and the Zipf parameter at 0.9 in this experiment. Furthermore, the data-size generation process followed a normal distribution with a mean ranging from 50 KB to 100 KB.
Performance comparison in various average data sizes.
As shown in Figure 12(a), the average access latency of all schemes increased as the average data size increased. Nevertheless, our mechanism consistently outperformed all the other schemes for various data sizes. In addition, the slope of our mechanism proved to be smaller than that of the broadcast disks. A similar condition appeared in our comparison of the tune-in time. As depicted in Figure 12(b), because our system used the optimum bandwidth allocation, our mechanism could achieve a better performance in power conservation than the exponential index scheme.
Regarding the effects of the skewness of the data-access probabilities, Figure 13 depicts our comparison of the average access time and tune-in time in various Zipf parameters. In this experiment, the Zipf parameter representing the skewness of data-access frequencies varied from 0.5 to 1.5. The number of data records was 100 and the data size had a normal distribution with a mean of 100 KB and a variance of 900 KB^{2}.
Performance comparison in various skew factors.
It can be seen from Figure 13(a) that our mechanism significantly outperforms all the other schemes. Furthermore, the performance gain of our mechanism becomes more and more conspicuous as the value of the skew factor increases. This indicates that an efficient determination of the data-broadcast frequency is critical, especially the more skewed the data access is. In terms of power conservation, Figure 13(b) shows that the skew factor only slightly affects the tune-in time of our mechanism and the exponential index scheme.
Regarding the accuracy of our approximation model, Figure 14 presents our comparison between the analytic results and the experimental results. We obtained each point depicted in the curve of approximation by calculating the derived access time in Theorem 9. Each point shown in the curve of the simulation results represents the average access time of 5,000 data queries on various parameters. Again, Figure 14 shows that the curve of our approximation is close to that of the numerical results.
Average access time for experimental and analytical results.
In conclusion, even though our system releases some bandwidth to broadcast index packets, our experimental results show that our mechanism exhibits better access latency than the plain broadcast and broadcast disks scheme do, especially when the data-access probabilities are skewed. In terms of power conservation, it is clear that our system can reduce much more tune-in time if a proper indexing technology is applied to our index channel. In addition, the numerical results of our experiments confirm the accuracy of the proposed approximation model.
6. Conclusion
Data broadcasting involves important data dissemination technology for accessing mobile services in wireless networks. In general, there are two main approaches to data broadcast, namely, push-based broadcast and on-demand broadcast [32]. Mobile Internet and mobile services that make use of mobile data are increasingly popular [33–35]. Among others, access efficiency and power conservation are two critical performance indexes for assessing the effectiveness of wireless communication systems. In this paper, we present a TAB algorithm to reduce the response time of mobile clients’ requests. We provide an analytical model to measure the expected access latency of the generated broadcast program. This analytical model helps formulate the optimum bandwidth allocation for index and data channels. From the experimental results, it can be seen that our mechanism outperforms the existing data-broadcast schemes in terms of access time. Moreover, the optimum bandwidth allocation also brings about a significant improvement in energy conservation. Based on these advantages, it can be seen that the proposed mechanism is scalable and can feasibly increase the efficiency of data dissemination in broadcast-based systems.
Conflict of Interests
The authors declare that there is no conflict of interests.
Acknowledgments
The authors thank the National Science Council of Taiwan for funding this research (Project no. NSC 102-2218-E-268-001).
ChengS. T.LiuJ. P.KaoJ. L.ChenC. M.A New Framework for Mobile Web ServicesProceedings of the IEEE Symposium on Applications and the Internet2002218222YangX.BouguettayaA.MedjahedB.LongH.HeW.Organizing and Accessing Web Services on AirImielinskiT.BadrinathB. R.Mobile wireless computing challenges in data managementImielinskiT.ViswanathanS.BadrinathB. R.Data on air: Organization and accessAcharyaS.AlonsoR.FranklinM.ZdonikS.Broadcast disks: data management for asymmetric communications environmentsProceedings of the ACM SIGMOD Conference on Management of DataMay 1995199210HuangJ. L.ChenM. S.Dependent data broadcasting for unordered queries in a multiple channel mobile environmentSumariP.DarusR. M.KamarulhailiH.Data organization for broadcasting in mobile computingProceedings of the International Conference on Geometric Modeling and graphicsJuly 20034954HameedS.VaidyaN. H.Efficient algorithms for scheduling data broadcastLeeG.LoS.-C.Broadcast data allocation for efficient access of multiple data items in mobile environmentsSunW.ShiW.ShiB.YuY.A cost-efficient scheduling algorithm of on-demand broadcastsJeaK. F.ChenM. H.A data broadcast scheme based on prediction for the wireless environmentProceedings of the 9th International Conference Parallel and Distributed Systems (ICPADS `02)December 2002HungH. P.ChenM. S.On exploring channel allocation in the diverse data broadcasting environmentProceedings of the 25th IEEE International Conference on Distributed Computing SystemsJune 20057297382-s2.0-27944480115JuhnL. S.TsengL. M.Fast data broadcasting and receiving scheme for popular video serviceChungW.EndresT. J.LongC. D.A data broadcasting system expanding the information capacity of existing analog communication systemsImielinskiT.ViswanathanS.BadrinathB. R.Energy efficient indexing on airProceedings of the ACM SIGMOD International Conference on Management of DataMay 199425362-s2.0-0028447926HermanG.GopalG.LeeK.WeinribA.The datacycle architecture for very high throughput database systemsProceedings of the ACM SIGMOD Conference on Management of DataMay 1987YangX.BouguettayaA.Adaptive data access in broadcast-based wireless environmentsYinL.CaoG.Adaptive power-aware prefetch in wireless networksViredazM. A.BrakmoL. S.HamburgenW. R.Energy management on handheld devicesTanK. L.OoiB. C.XuJ.LeeW.-C.TangX.Exponential index: a parameterized distributed indexing scheme for data on airProceedings of the 2nd ACM/USENIX International Conference on Mobile Systems, Applications and Services (MobiSys '04)June 20041531642-s2.0-4544301484ChenM.YuP. S.WuK.Indexed sequential data broadcasting in wireless mobile computingProceedings of the 17th International Conference Distributed Computing SystemsMay 1997ChenM.-S.WuK.-L.YuP. S.Optimizing index allocation for sequential data broadcasting in wireless mobile computingShivakumarN.VenkatasubramanianS.Efficient indexing for broadcast based wireless systemsHuQ.LeeW. C.LeeD. L.Indexing techniques for wireless data broadcast under data clustering and schedulingProceedings of the 8th International Conference on Information Knowledge ManagementNovember 19993513582-s2.0-0033279049LeeW. C.LeeD. L.Using signature techniques for information filtering in wireless and mobile environmentsHuangY.SistlaP.WolfsonO.Data replication for mobile computersProceedings of the ACM SIGMOD International Conference on Management of DataMay 199413242-s2.0-0028448235WolfsonO.MiloA.Multicast policy and its relationship to replicated data placementSipserM.YatesR. D.GoodmanD. J.ZipfG. K.WangH.XiaoY.ShuL.Scheduling periodic continuous queries in real-time data broadcast environmentsMolnarA.MunteanC. H.Cost-oriented adaptive multimedia deliveryHorngG. J.WangC. H.ChengS. T.HsuC. W.SuS. F.Tree-based adaptive broadcasting of bandwidth allocation for vehicle ad hoc networksProceedings of the 12th IEEE International Conference on High Performance Computing and Communications (HPCC '10)September 20103913972-s2.0-7814933451210.1109/HPCC.2010.66LaiY. L.JiangJ. R.Pricing resources in LTE networks through multiobjective optimization