Delay-Aware Program Codes Dissemination Scheme in Internet of Everything

Due to recent advancements in big data, connection technologies, and smart devices, our environment is transforming into an “Internet of Everything” (IoE) environment. Smart devices can obtain new or special functions by reprogramming, that is, by upgrading their software after receiving a new version of the program codes. However, bulk codes dissemination suffers from large delay, high energy consumption, and a large number of retransmissions because wireless links are unreliable. In this paper, a delay-aware program dissemination (DAPD) scheme is proposed to disseminate program codes in a fast, reliable, and energy-efficient manner. We observe that although total energy is limited in a wireless sensor network, residual energy exists in nodes deployed far from the base station. The DAPD scheme therefore improves bulk codes dissemination in two ways. (1) Because a high transmitting power can significantly improve the quality of wireless links, the transmitting power of sensors with more residual energy is enhanced to improve link quality. (2) Because the performance of correlated dissemination tends to degrade in a highly dynamic environment, link correlation is autonomously updated in DAPD during codes dissemination to maintain the improvements brought by correlated dissemination. Theoretical analysis and experimental results show that, compared with previous work, the DAPD scheme improves dissemination performance in terms of completion time, transmission cost, and energy utilization efficiency.


Introduction
Due to recent advancements in big data, connection technologies, and smart devices, the number of connected devices has exceeded the number of people on Earth since 2011. Connected smart devices have reached 9 billion and are expected to grow more rapidly, reaching 24 billion by 2020 [1]. Our environment is transforming into an "Internet of Everything" (IoE) environment. In 2012, global commercialization of IoT-based application systems generated a revenue of $4.8 trillion [2]. Cisco estimates that, due to IoT, global corporate profits will also increase by approximately 21% [3]. Because sensors and actuators are extremely cheap, they find their place in a wide range of applications in smart factories, smart cities, and smart life, which leads to the "Internet of Everything" (IoE) [4][5][6][7][8].
For example, many smart wireless sensors have been deployed in smart factories to monitor the states of machines by sensing temperature, humidity, and sound [9,10]. Wireless sensors are well suited to complicated industrial environments because deploying them requires no wiring, so they are already widely used in industrial production. A smart factory composed of smart wireless sensors can collect various kinds of data from machines and mine the collected data (i.e., industrial big data) to obtain valuable information for factory operation [11]. Machines are then automatically controlled by the obtained information to form an efficient production line (i.e., adequate production speed, low power consumption, and failure prediction). Therefore, smart wireless sensors make it possible to optimize factory operation without human intervention.

Mobile Information Systems
A sensor works for several months or years once it is deployed [12]. However, when an industrial production line is upgraded to gain new functions, the sensors must be upgraded at the same time. Reprogramming is considered an economical and convenient method for such operations [13][14][15][16]. Moreover, even when manufacturing facilities are not upgraded, sensors sometimes need to be upgraded to adapt to changes in production requirements. Therefore, in an "Internet of Everything" (IoE) environment (e.g., smart factory, smart city, and smart life), it is common to disseminate new program codes to all wireless sensors through wireless communication. In this paper, this operation is called codes dissemination. Codes dissemination is a crucial technique when sensors are deployed in environments where physically accessing and reprogramming them is difficult or infeasible. As a basic operation that enables wireless reprogramming, it has attracted much research attention in recent years. However, codes dissemination faces many challenges. First, the program codes are longer than a single code packet, and a network may include thousands of sensor nodes. Thus, disseminating large codes correctly to a great number of sensor nodes is challenging. Second, dissemination delay (i.e., dissemination completion time, DCT), which is the time required to disseminate codes to all sensor nodes, should be small, because a large dissemination completion time may leave different nodes with inconsistent codes, causing application failures due to disordered communication and signal transmission. Third, it is important to ensure complete reliability, which means that each active node in the network should receive the program codes completely and correctly. Thus, large-scale program codes dissemination in an "Internet of Everything" (IoE) environment is a challenging task.
There are mainly three kinds of codes dissemination schemes. (1) The first is Deluge [17], which uses negotiation to improve reliability. Its method can be divided into three stages: broadcast, request, and send. Because each transmission needs three operations, this scheme incurs a high cost, including transmission delay, to ensure reliability. (2) The second is the flooding-based dissemination scheme [18], which removes the request stage and therefore spreads program codes faster. Its disadvantage is that it causes the broadcast storm problem. (3) The third is the link correlation-aware data dissemination scheme proposed in [19]. The main idea of correlated dissemination (CD) is to disseminate codes to the whole network through broadcasts by sensor nodes; thus it is a one-to-many operation in unreliable wireless networks. Nodes that broadcast are called parent nodes, and nodes that receive codes are called child nodes. In the CD scheme, each node chooses exactly one node as its parent node, and this parent node is responsible for broadcasting codes to all its child nodes. Link correlation refers to the proportion of packets that are successfully (or unsuccessfully) received by all child nodes during broadcasting. Assigning sensor nodes with high link correlation to the same parent node makes each retransmitted packet more likely to be needed by more than one child node, which reduces the number of retransmissions, the dissemination completion time, and energy consumption. The main innovation of the CD scheme is a model that estimates link correlation and uses it to reduce the number of retransmissions.
Although much research has been done, some problems deserve further study [20]. (1) The first problem is improving the reliability of wireless links. Previous work normally ensures the reliability of codes dissemination through multiple retransmissions in the network layer. However, such solutions increase the delay and energy consumption of codes dissemination, especially in a network with a high packet loss ratio. Therefore, ensuring link reliability while maintaining low delay is an important and challenging issue. (2) Although correlated dissemination (CD) can reduce retransmitted packets, it must send extra packets to obtain link correlation, which consumes more sensor energy and shortens the lifetime of the network. In addition, link correlations tend to change dynamically during real-world codes dissemination. Therefore, overcoming the extra energy consumption of sensors while obtaining the latest link correlation is another major concern.
Based on the analysis above, a delay-aware program dissemination (DAPD) scheme is proposed to disseminate program codes in a fast, reliable, and energy-efficient manner. The improvement that DAPD brings to codes dissemination is founded on two facts about wireless sensor networks. (1) Link quality is directly related to the transmitting power of sensor nodes. Link quality improves greatly when transmitting power is enhanced, which decreases the number of retransmissions and the DCT. The energy spent on retransmitting codes is also reduced. On the other hand, although the total energy of sensors is limited, sensors spend most of their time sensing machines and transmitting the collected data to the base station over multiple hops. In this many-to-one operation, sensors around the base station not only transmit their own data but also forward data originating from sensors far from the base station (far nodes). Therefore, the energy consumption of these sensors is much larger than that of far nodes, and residual energy accumulates in far nodes during this period. If this residual energy can be exploited when disseminating codes, performance will improve greatly. (2) The premise of correlated dissemination is using extra packets to obtain link correlation before disseminating codes. If the number of these packets is too small, link correlation cannot be obtained correctly and adequately. If it is too large, the lifetime of the network is shortened by the extra energy consumption. With our scheme, sensor nodes far from the base station (which also have excess energy) can obtain link correlations adequately, so that dissemination performance is improved without affecting the lifetime of the network. The main contributions of the DAPD scheme are as follows.
(1) The DAPD scheme enhances the transmitting power of sensors with excess energy to improve link quality and reduce dissemination completion time without shortening the lifetime of the network.
(2) The DAPD scheme takes full advantage of residual energy to obtain link correlations correctly and adequately before codes dissemination. It also proposes a dynamic parent node selection algorithm that runs during codes dissemination to adapt quickly to the latest environment and make more effective use of link correlation.
(3) Through theoretical analysis and simulation, we demonstrate that the DAPD scheme reduces codes dissemination completion time and enhances energy utilization efficiency simultaneously. Compared with former schemes, codes dissemination completion time can be reduced by as much as 19.05% (more when the environment is highly dynamic). More importantly, the proposed scheme improves performance without harming network lifetime, which is difficult to achieve in previous schemes.
The rest of this paper is organized as follows. Section 2 reviews related work. System models and problem statements are introduced in Section 3. In Section 4, the DAPD scheme is presented to disseminate program codes in a fast, reliable, and energy-efficient manner. Performance analysis of the DAPD scheme is provided in Section 5. Experimental results and comparisons are given in Section 6. Section 7 concludes the paper.

Related Work
Many program codes dissemination schemes have been proposed [21][22][23][24][25][26][27][28], each focusing on one or two specific challenges in the codes dissemination phase (e.g., latency, energy consumption, and reliability). Generally, these schemes can be divided into the following types according to their design purposes and requirements.
(1) The first type comprises schemes that focus on reducing latency. The objective of such schemes is to ensure the reliability of codes dissemination while reducing dissemination completion time. Deluge is an example of such schemes [17]. Deluge adopts a three-way handshake and an ACK-based protocol for reliability. In addition, transmission delay is improved by dividing codes into fixed-size pages. In Deluge, each node advertises its local pages. When one node (the receiver) learns that another node (the sender) has pages that it has not successfully received, it sends a request to the sender and prepares to receive the pages. Many other schemes are based on Deluge. For example, Rateless Deluge reduces latency further by using random linear codes to encode packets [21].
Zheng et al. proposed Survival of the Fittest (SurF) to solve the problem that negotiations between sensor nodes tend to incur a long dissemination completion time [23]. This scheme achieves a tradeoff between negotiation and flooding; flooding is another dissemination approach but is considered energy-consuming. SurF selectively adopts the two approaches (negotiation and flooding) to reduce dissemination completion time.
(2) The second type comprises schemes that focus on reducing energy consumption. The objective of such schemes is to prolong the lifetime of the wireless sensor network. In many energy-efficient schemes, sensor nodes alternate between an active state and a dormant state to reduce energy consumption [24,25,28,29]. Kulkarni and Wang proposed a scheme called MNP after finding that one main source of energy consumption in Deluge is a high degree of message collision [26]. In MNP, a sender selection algorithm is used to solve the message collision problem. In addition, a node goes into the dormant state if its neighbor nodes are transmitting packets that it already has. Because active radio time decreases, energy consumption is significantly reduced. Furthermore, experience shows that single-hop reprogramming may achieve better dissemination completion time and energy consumption than multihop reprogramming under certain conditions. Therefore, a scheme called DStream was proposed [27], which supports both single-hop and multihop dissemination.
(3) Another kind of dissemination scheme proposed recently is the link correlation-aware data dissemination scheme [19]. Because codes dissemination in this scheme is a kind of wireless broadcast, disseminated codes are received by more than one sensor in the broadcast domain. Basically, sensors with high link correlation are more likely to successfully (or unsuccessfully) receive packets with the same packet ID. In contrast, the lost packets of sensors with low link correlation tend to differ from each other. Therefore, this scheme groups sensors with high correlation together, which makes each retransmitted packet more likely to be needed by more than one sensor. Correlated dissemination performs well in terms of latency, energy consumption, and reliability when the network environment is stable.

System Models and Problem Statements
3.1. Network Model. The network model that we adopt is shown in Figure 1 and can also be found in [9]. The whole network is composed of one base station and many sensor nodes evenly distributed in the network. The energy of the base station is considered infinite. In contrast, sensor nodes are powered by batteries, and their total energy is limited.
The two main functions of the network are data collection and codes dissemination. During data collection, all sensor nodes transmit collected data back to the base station over multiple hops in unicast style. Because links are unreliable, sensors may have to transmit data repeatedly. Packets transmitted during data collection are called data packets in our scheme. During codes dissemination, the base station transmits code packets to all sensor nodes, and sensor nodes also participate in transmitting these code packets in broadcast style because the transmission radius of the base station is limited.

Link Quality Model.
In this section, the relation between link quality $q$ and transmitting power $P_t$ is introduced. Specifically, link quality is measured by packet reception rate (PRR). In [30], Zuniga and Krishnamachari analyzed channel parameters and proposed a formula to calculate PRR:

$$\mathrm{PRR} = \left(1 - \frac{1}{2}\exp\left(-\frac{\gamma}{2}\cdot\frac{B_N}{R}\right)\right)^{8f}, \qquad (1)$$

where $R$ is the data rate in bits, $B_N$ is the noise bandwidth, and $f$ is the frame size. The SNR (signal-to-noise ratio) $\gamma$ can be calculated through (2), given a specific transmitting power $P_t$ and the distance $d$ between transmitter and receiver:

$$\gamma(d) = P_t - PL(d_0) - 10 n \log_{10}\left(\frac{d}{d_0}\right) - N(0, \sigma) - P_n, \qquad (2)$$

where $d_0$ is a reference distance and $n$ is the path loss exponent. Its value can be set to 2-4 if transmission approximately follows the free space model. $PL(d_0)$ and $10 n \log_{10}(d/d_0)$ indicate the attenuation caused by the adopted log-normal shadowing path loss model [31,32]. In detail, path loss is influenced by signal diffusion and the characteristics of the channel, while the shadowing effect is caused by obstacles between transmitter and receiver [33]. $N(0, \sigma)$ is a zero-mean Gaussian random variable with standard deviation $\sigma$, and $P_n$ is the noise floor. Their values can be obtained through empirical measurements. The curves in Figure 2 show the relation between packet reception rate and $d$ under different transmitting powers $P_t$ in a static environment ($\sigma$ = 0).
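As an illustration of the link quality model, the following Python sketch evaluates PRR for two transmitting powers. It assumes the commonly cited form of the Zuniga and Krishnamachari model for NCFSK radios with NRZ encoding, PRR = (1 − ½·exp(−(γ/2)·(B_N/R)))^(8f), with γ converted from dB to a linear ratio before use. All numeric defaults (PL(d0) = 55 dB, path loss exponent 4, noise floor −105 dBm, B_N = 30 kHz, R = 19.2 kbps, 50-byte frames) are illustrative assumptions and are not taken from this paper.

```python
import math


def snr_db(pt_dbm, d, d0=1.0, pl_d0=55.0, n=4.0,
           noise_floor_dbm=-105.0, shadowing_db=0.0):
    """SNR in dB under a log-normal shadowing path loss model.

    shadowing_db stands in for one draw of N(0, sigma); 0.0 models the
    static environment (sigma = 0). All defaults are assumed values.
    """
    path_loss = pl_d0 + 10.0 * n * math.log10(d / d0) + shadowing_db
    return pt_dbm - path_loss - noise_floor_dbm


def prr(gamma_db, frame_bytes=50, bn_hz=30_000.0, rate_bps=19_200.0):
    """Packet reception rate for a frame of frame_bytes bytes."""
    gamma = 10.0 ** (gamma_db / 10.0)            # dB -> linear ratio
    bit_error = 0.5 * math.exp(-(gamma / 2.0) * (bn_hz / rate_bps))
    return (1.0 - bit_error) ** (8 * frame_bytes)


if __name__ == "__main__":
    for pt in (-4.0, 4.0):
        print(pt, "dBm ->", prr(snr_db(pt, d=10.0)))
```

With these assumed parameters, raising the transmitting power from −4 dBm to 4 dBm at a distance of 10 m moves the link from almost never delivering a frame to almost always delivering it, which is the kind of sharp transition the model is known for.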

Link Correlation Model.
The link correlation model is used to obtain link correlation before the codes dissemination phase. In this model, link correlation is obtained by broadcasting HELLO messages, and reception vectors record which messages were received. The simple example in Figure 3 demonstrates the construction of reception vectors and the calculation of link correlation. First, nodes A and B each broadcast 10 HELLO messages to the one-hop downstream sensors in their broadcast domains (C, D, and E). After successfully receiving one of these HELLO messages, C, D, and E reply with an ACK (an acknowledgement sent by a receiver to confirm that data has been received successfully). Second, A and B construct reception vectors for C, D, and E according to these ACKs: if A receives the ACK for the $i$th HELLO message from C, then the $i$th element in the reception vector that A constructs for C is 1. Otherwise, the $i$th element is 0, because the HELLO message or the ACK was lost during transmission. Link reliability is not guaranteed in these two steps so that link correlation is reflected correctly (link reliability means that all packets are eventually received successfully through mechanisms such as retransmission). Third, A and B broadcast packets that contain these reception vectors to their one-hop downstream nodes, which calculate link correlation according to (3). In (3), $N$ is the length of the reception vector (also the number of HELLO messages), $r_k(i)$ is the $i$th element in the reception vector of node $k$, and $u$ is the ID of the node that broadcasts HELLO messages. In particular, (3) takes the AND of the $i$th elements in the reception vectors of all one-hop downstream nodes in the broadcast domain, so link correlation increases only when a HELLO message is received by all one-hop downstream nodes. For example, after A constructs reception vectors for C and D, B constructs reception vectors for D and E, and both broadcast them, D receives the reception vectors of C, D, and E because its location is covered by the broadcast domains of both A and B. Then D uses (3) to calculate the link correlations for (A, D) and (B, D). Specifically, the link correlation for (A, D) is 1/5 = 20%, while that for (B, D) is 3/4 = 75%.
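The calculation can be sketched in Python as follows. The sketch assumes that link correlation for a receiver is the number of HELLO messages received by all one-hop downstream nodes of a transmitter (the AND of their reception vectors) divided by the number of HELLO messages the receiver itself received; this normalization is an assumption chosen because it reproduces ratios such as 1/5 and 3/4. The reception vectors below are hypothetical, since the contents of Figure 3 are not reproduced here.

```python
def link_correlation(vectors, node):
    """Estimate link correlation between a transmitter and `node`.

    vectors: dict mapping each one-hop downstream node ID to its reception
             vector (list of 0/1), all built by the same transmitter.
    node:    the receiver whose correlation with the transmitter is wanted.
    """
    received_by_node = sum(vectors[node])
    if received_by_node == 0:
        return 0.0
    # A HELLO message counts only if every downstream node received it.
    received_by_all = sum(1 for bits in zip(*vectors.values()) if all(bits))
    return received_by_all / received_by_node


# Hypothetical reception vectors for 10 HELLO messages.
from_a = {
    "C": [1, 1, 0, 0, 1, 0, 0, 1, 0, 0],
    "D": [1, 0, 1, 1, 0, 1, 0, 0, 1, 0],
}
from_b = {
    "D": [1, 1, 1, 0, 0, 1, 0, 0, 0, 0],
    "E": [1, 1, 1, 0, 1, 0, 0, 0, 0, 0],
}

if __name__ == "__main__":
    print(link_correlation(from_a, "D"))  # correlation for (A, D)
    print(link_correlation(from_b, "D"))  # correlation for (B, D)
```

Because the numerator uses the AND across all downstream nodes, a single poorly aligned neighbor lowers the correlation of every node under the same transmitter.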

Energy Consumption Model.
Two important sources of energy consumption are transmitting and receiving data. Their energy costs $E_t$ and $E_r$ can be estimated as follows:

$$E_t = P_t \cdot \frac{S_t}{R}, \qquad E_r = P_r \cdot \frac{S_r}{R}, \qquad (4)$$

where $P_t$ and $P_r$ are the transmitting power and receiving power, respectively, $S_t$ and $S_r$ denote the data sizes to be transmitted and received, and $R$ is the data rate.
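A minimal numeric sketch of this energy model, assuming that energy is power multiplied by the time the radio is busy (data size divided by data rate). The power and rate values in the example are hypothetical and serve only to show the units.

```python
def radio_energy_joules(power_watts, size_bits, rate_bps):
    """Energy spent moving size_bits at rate_bps with the radio at power_watts."""
    airtime_seconds = size_bits / rate_bps
    return power_watts * airtime_seconds


if __name__ == "__main__":
    # Hypothetical values: 50 mW transmit, 30 mW receive, 19.2 kbps radio.
    e_t = radio_energy_joules(0.05, 38_400, 19_200)
    e_r = radio_energy_joules(0.03, 38_400, 19_200)
    print(e_t, e_r)
```

The same function covers both transmission and reception because the two cases differ only in the power drawn by the radio.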

Problem Statements.
Delay-aware program dissemination (DAPD) focuses on reducing transmission delay during codes dissemination without affecting the lifetime of the network. Therefore, dissemination completion time (DCT) and residual energy in sensors are the two main concerns. The problem statements are as follows.
(1) To Minimize the Dissemination Completion Time. The codes dissemination phase starts when the base station begins broadcasting code packets and ends when all active sensors in the network have received the codes correctly. One aim of DAPD is to minimize the time between these two points.
(2) To Avoid Influencing the Lifetime of Network. Since one important operation in DAPD is to enhance transmitting power by using excess energy, overuse would reduce the lifetime of the network. Therefore, DAPD is designed on two principles. (a) The residual energy available to a sensor during the codes dissemination phase is the difference between its own residual energy $E_i$ and the minimum residual energy in the network $E_{\min}$ before this phase starts.
(b) The residual energy distribution should be as uniform as possible after the codes dissemination phase. These two statements are expressed in (5), where $E_i$ is the residual energy of sensor node $i$, $E_{\min}$ is the minimum residual energy in the network before disseminating codes, $M$ is the total number of active sensors, and $E_i'$ is the energy left in sensor node $i$ after the codes dissemination phase. During the data collection phase, the sensors neighboring the base station must forward data originating from sensors far from the base station in addition to transmitting their own data, which leads to an unbalanced distribution of data load and energy consumption. Figure 4 shows the unbalanced distribution of residual energy in sensors at different distances from the base station after the data collection phase. Furthermore, the lifetime of a wireless sensor network can be defined as the time that elapses until any sensor runs out of energy (the first failure) [34]. Therefore, far nodes tend to have much underutilized energy over the lifetime of the wireless sensor network if no additional scheme is adopted.
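The first-failure definition of lifetime, and the underutilized energy it leaves in far nodes, can be sketched as follows. The per-round energy costs are made-up numbers chosen only to mimic a network in which nodes near the base station spend more energy per data collection round than far nodes.

```python
def lifetime_rounds(initial_energy, per_round_cost):
    """Rounds completed before the first node runs out of energy."""
    return min(int(initial_energy // cost) for cost in per_round_cost.values())


def residual_at_first_failure(initial_energy, per_round_cost):
    """Energy left in every node when the first node dies."""
    rounds = lifetime_rounds(initial_energy, per_round_cost)
    return {node: initial_energy - rounds * cost
            for node, cost in per_round_cost.items()}


if __name__ == "__main__":
    # Hypothetical per-round costs in joules, all nodes starting at 100 J.
    costs = {"near": 2.0, "mid": 1.0, "far": 0.5}
    print(lifetime_rounds(100.0, costs))
    print(residual_at_first_failure(100.0, costs))
```

In this toy setting the far node still holds three quarters of its initial energy when the network lifetime ends, which is the kind of slack DAPD aims to exploit.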

Scheme Design
Studies show that such residual energy can account for more than half of the total energy in the network [34][35][36]. Hence, we use the residual energy in far nodes to enhance transmitting power during codes dissemination, and this enhancement leads to better link quality. Figure 5 presents the improvement in link quality after enhancing transmitting power: link quality improves greatly when transmitting power is raised from −4 dBm to 4 dBm, and therefore the performance of codes dissemination can be improved.
With a more reliable link, the number of retransmissions and the transmission delay can be reduced. Figure 6 illustrates the expected dissemination completion time (DCT) for one sensor node to transmit code packets to its child nodes with different transmitting powers.
Based on the analysis above, we conclude that residual energy in sensors far from the base station can be used to enhance transmitting power during codes dissemination, and thus the performance of codes dissemination can be improved.

Change on Link Correlation.
The use of link correlation during codes dissemination has been proven to be fast and energy-efficient. However, experiments also show that the performance of correlated dissemination degrades and is no better than Deluge in a highly dynamic environment. In detail, previous choices of parent nodes become outdated because link correlation changes with the environment. The example in Figure 7 illustrates the impact of changes in link correlation and the need to reselect parent nodes. Initially, C, D, and E have chosen A as their parent node according to link correlation, while F and G have chosen B. After that, A and B start to broadcast code packets (10 packets at a time) and do not broadcast the following packets until these packets are correctly received by all their child nodes. The receptions of these packets are shown in the right part of Figure 7. The receptions show that link correlation among C, D, and E (and between F and G) is high. To simplify the illustration, we assume that all packets subsequently retransmitted by A and B are successfully received by C, D, E, F, and G. Therefore, A needs to retransmit 5 packets (ID: 1, 3, 5, 7, and 10) to C, D, and E, while B needs to retransmit 6 packets (ID: 2, 4, 5, 7, 8, and 9) to F and G.
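The retransmission count in this example follows from a simple rule: a parent must rebroadcast every packet lost by at least one child, so the retransmission set is the union of the children's loss sets. The Python sketch below illustrates this with hypothetical loss sets, since the reception table of Figure 7 is not reproduced here; the sets were chosen so that their union matches the five packet IDs named in the text.

```python
def packets_to_retransmit(lost_by_child):
    """Union of the packet IDs lost by any child of one parent."""
    needed = set()
    for lost in lost_by_child.values():
        needed |= lost
    return sorted(needed)


def unneeded_for(child, lost_by_child):
    """Retransmitted packets that `child` had already received."""
    return [p for p in packets_to_retransmit(lost_by_child)
            if p not in lost_by_child[child]]


# Hypothetical losses for the children of A in the first batch of 10 packets.
losses_under_a = {
    "C": {1, 3, 5, 7},
    "D": {1, 3, 5, 10},
    "E": {3, 5, 7, 10},
}

if __name__ == "__main__":
    print(packets_to_retransmit(losses_under_a))
    print(unneeded_for("E", losses_under_a))
```

When children share most of their losses, the union stays small and few retransmitted packets are wasted on any one child; when losses diverge, the union grows and each child overhears more packets it does not need.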
Transmission of the next 10 packets will be similar to Figure 7 if the environment stays stable. However, this is not the case in the real world. One actual case after the first 10 packets have been transmitted is shown in Figure 8.
The environment changes and the link correlation for (B, E) is now higher than that for (A, E), which can be seen from the receptions of the following 10 code packets. If the parent nodes of C, D, and E are left unchanged, the earlier selection no longer reflects the current link correlation. Hence, reselecting parent nodes leads to better use of link correlation in the real world. Moreover, compared with disseminating codes, the cost of selecting a parent node is relatively small. Therefore, when the environment is highly dynamic, it is possible to make better use of link correlation at the expense of going through another phase to reselect parent nodes.

Design on Enhancing Transmitting Power.
In order to enhance transmitting power without reducing the lifetime of the wireless sensor network, we first need to estimate correctly the residual energy in sensors before disseminating codes. First, we analyze the data load on each sensor during the data collection phase. A timeout retransmission mechanism is adopted in our network model to ensure that all collected data is sent back to the base station correctly. The necessary notations are as follows:
$i$, $j$, $k$: sensor node IDs
$h_i$: hop count of node $i$
$P_t^i$: transmitting power of $i$
$P_r^i$: receiving power of $i$
$q_{ij}$: link quality between $i$ (transmitter) and $j$ (receiver)
$N_i$: number of packets that $i$ needs to transmit to its one-hop upstream node
$S_{DATA}$: size of data packets
$S_{ACK}$: size of ACKs
In addition, as shown in Figure 9, $i$, $j$, and $k$ are 3 neighbor sensors with different hop counts ($h_i = h_j + 1$; $h_j = h_k + 1$).
For example, $j$ needs to receive $N_i$ packets from $i$ and transmit $N_i$ ACKs back to $i$. The expected number of transmissions per packet is $1/(q_{ij} q_{ji})$ because the link is unreliable. On the other hand, $j$ needs to transmit $N_j$ packets to $k$ and receive $N_j$ ACKs from $k$, and the expected number of transmissions per packet is $1/(q_{jk} q_{kj})$. Therefore, we can obtain the amount of data that $j$ needs to transmit, $D_t^j$, and receive, $D_r^j$, provided that $N_i$ and $N_j$ are known; these are given in (6) and (7), respectively. $N_i$ and $N_j$ can be calculated through (8) [37]. Equation (8) estimates the data load on sensor nodes at different distances from the base station in Send-Wait style with an ACK protocol and no packet loss.
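The expected number of transmissions can be checked with a short sketch. It assumes a send-and-wait exchange in which a packet counts as delivered only when both the data packet and its ACK get through, so each attempt succeeds with the product of the two link qualities and the number of attempts is geometric. The helper `forwarding_bits` and the numbers in the example are hypothetical illustrations, not the paper's equations (6) and (7).

```python
def expected_transmissions(q_data, q_ack):
    """Expected attempts until a data packet and its ACK both succeed."""
    if not (0.0 < q_data <= 1.0 and 0.0 < q_ack <= 1.0):
        raise ValueError("link qualities must be in (0, 1]")
    return 1.0 / (q_data * q_ack)


def forwarding_bits(n_packets, s_data_bits, q_data, q_ack):
    """Expected data bits a node sends to push n_packets one hop upstream."""
    return n_packets * s_data_bits * expected_transmissions(q_data, q_ack)


if __name__ == "__main__":
    # Hypothetical link: 80% data delivery, 50% ACK delivery.
    print(expected_transmissions(0.8, 0.5))
    print(forwarding_bits(20, 100, 0.8, 0.5))
```

The sketch makes the cost of a poor reverse link visible: halving the ACK delivery rate doubles the expected number of data transmissions even when the forward link is unchanged.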
In (8), $l$ is the distance between $i$ and the base station, $r$ is the transmission radius, and $z$ is the largest integer that satisfies $l + zr < R_{net}$ (the radius of the network). Figure 10 shows the data load on sensors with $r$ = 20 m in a wireless sensor network with $R_{net}$ = 200 m. Second, the residual energy $E_i$ in sensor $i$ can be estimated by (9) according to the data load above, combined with the energy consumption model in Section 3.4. In (9), $E_0$ is the initial energy in sensors and $T$ is the number of data collection rounds. Figure 11 shows the residual energy in sensors after collecting data 50, 100, and 500 times.
To prevent the enhancement of transmitting power from affecting the lifetime of the network, an upper limit is set according to the sensors that have the minimum energy left after the data collection phase. From Figure 11, we can observe that such sensors tend to be deployed around the base station. Therefore, the residual energy $E_a^i$ that can be used to enhance transmitting power is calculated as

$$E_a^i = E_i - E_{\min}, \qquad (10)$$

where $E_i$ is the residual energy in sensor $i$ and $E_{\min}$ is the minimum residual energy in the network. $E_a^i$ is used to enhance transmitting power during the phase that disseminates codes and the phase that reselects parent nodes. Therefore, an additional variable $\beta$ is introduced to allocate $E_a^i$ between these two phases. For example, the enhanced transmitting power $P_e^i$ of sensor $i$ during codes dissemination is calculated according to (11), where $S_C$ is the total data size of code packets, $P_t$ is the initial transmitting power, and $\mathrm{PRR}_0(P_t)$ is the link quality under the initial $P_t$. Because less data needs to be transmitted in the phase that reselects parent nodes than in the codes dissemination phase, $\beta$ should be larger than 0.5. Figure 12 shows the enhanced transmitting power of sensors in a case where $S_C$ is 500 bytes and codes dissemination starts after collecting data 50 times.
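The energy budget described above can be sketched as follows. The sketch assumes that the usable energy of a sensor is its residual energy minus the minimum residual energy in the network, and that an allocation factor (called `beta` here, a name chosen for illustration) gives the larger share to codes dissemination and the remainder to parent node reselection. The residual energy values are hypothetical.

```python
def available_energy(residual):
    """Energy each node may spend without dropping below the network minimum."""
    e_min = min(residual.values())
    return {node: energy - e_min for node, energy in residual.items()}


def split_budget(e_available, beta):
    """Split a node's budget between dissemination and reselection.

    beta is the share for codes dissemination; the text requires it to be
    larger than 0.5 because dissemination moves more data.
    """
    if not 0.5 < beta <= 1.0:
        raise ValueError("beta must be in (0.5, 1]")
    return beta * e_available, (1.0 - beta) * e_available


if __name__ == "__main__":
    # Hypothetical residual energy in joules after data collection.
    residual = {"near": 10.0, "mid": 40.0, "far": 70.0}
    budget = available_energy(residual)
    print(budget)
    print(split_budget(budget["far"], beta=0.75))
```

Nodes that already sit at the network minimum receive a budget of zero, so their transmitting power stays at its initial value and the first-failure lifetime is not shortened.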

Our Methodology.
In this section, we show the design details of delay-aware program dissemination (DAPD). DAPD has three features. (1) It enhances the transmitting power of sensors that have residual energy to improve link quality. (2) It exploits link correlation to reduce dissemination completion time. (3) It selectively goes through a fast parent node reselection phase to adapt quickly to changes in the environment and recover the improvement brought by link correlation. DAPD is composed of three phases: the initial parent node selection phase, the codes dissemination phase, and the fast parent node reselection phase. The following sections describe these three phases in detail.

Initial Parent Node Selection Phase.
This phase is used to obtain link correlation and choose a parent node according to link correlation before disseminating codes.
First, the base station initiates a flooding that enables each sensor node to obtain its hop count. Second, each node broadcasts HELLO messages, which contain its own ID and hop count, to all one-hop downstream nodes in its broadcast domain. These one-hop downstream nodes reply with ACKs upon successfully receiving HELLO messages. Link reliability is not guaranteed here so that link correlation is reflected correctly. Third, the nodes that broadcast HELLO messages construct reception vectors for their one-hop downstream nodes. The construction of reception vectors is described in Section 3.3. Finally, these reception vectors are broadcast to the one-hop downstream nodes, which, upon receiving the packets that contain the reception vectors, calculate the link correlation between themselves and the transmitter according to (3).
Because the location of a one-hop downstream node may be covered by the broadcast domains of more than one transmitter, it may receive reception vectors from many transmitters. It chooses the transmitter with the highest link correlation as its parent node and sends a CHOSEN message to inform this node. To ensure link reliability, a timeout retransmission mechanism is adopted for the transmission of reception vectors and CHOSEN messages.
For example, in Figure 3, C receives the reception vectors of itself and D from A. Therefore, C only needs to calculate link correlation once (the link correlation for (A, C) is 1/4 = 25%), and it has to choose A as its parent node regardless of link correlation because C is covered only by the broadcast domain of A. However, D receives the reception vectors of all one-hop downstream nodes (C, D, and E). After using (3) to calculate the link correlations between itself and A and B, it chooses B as its parent node because the link correlation for (B, D) is much larger than that for (A, D).
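The selection rule in this example amounts to picking the candidate transmitter with the largest link correlation. A minimal Python sketch follows; the tie-break by smallest node ID is an assumption added for determinism, since the text does not say how ties are resolved.

```python
def choose_parent(correlations):
    """Return the transmitter with the highest link correlation.

    correlations: dict mapping candidate parent ID to link correlation.
    Ties are broken by the smallest ID (an assumption, not from the text).
    """
    if not correlations:
        raise ValueError("no candidate parent in range")
    return max(sorted(correlations), key=correlations.get)


if __name__ == "__main__":
    print(choose_parent({"A": 0.25}))             # C hears only A
    print(choose_parent({"A": 0.20, "B": 0.75}))  # D hears A and B
```

A node covered by a single transmitter has no real choice, which matches the case of C in the example.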

Codes Dissemination Phase.
After the initial parent node selection phase, each sensor node in the network knows the ID of its parent node and the IDs of all its child nodes. Next, the codes dissemination phase is initiated by the base station broadcasting code packets. Sensor nodes also start broadcasting code packets to their child nodes after successfully receiving all packets. During broadcasting, a node broadcasts a fixed number of packets at a time (10 in the example of Figure 7) and does not broadcast the following packets until ACKs for these packets have been received from all child nodes.
In addition, a node may receive packets from one-hop upstream nodes other than its parent node. In this case, it compares the packet ID with the IDs of the batch of packets that it can currently receive from its parent node. (1) If the packet is not in this batch or has already been successfully received, the node discards it. (2) If the packet is in this batch and has not yet been received, the node accepts it and sends an ACK for it to its own parent node. This mechanism reduces dissemination completion time further. Code packets are transmitted hop by hop according to the operations above until all sensor nodes have received the codes successfully.
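The two-case rule for packets overheard from a non-parent upstream node can be sketched as follows. The function name, the return labels, and the representation of the current batch as a set of packet IDs are illustrative assumptions.

```python
def handle_overheard(packet_id, current_batch, received):
    """Decide what to do with a packet overheard from a non-parent node.

    current_batch: set of packet IDs the parent is currently broadcasting.
    received:      set of packet IDs this node already holds (updated in place).
    Returns "discard" or "ack_parent".
    """
    if packet_id not in current_batch or packet_id in received:
        return "discard"        # case (1): out of batch or duplicate
    received.add(packet_id)     # case (2): accept the packet
    return "ack_parent"         # and acknowledge it to the node's own parent


if __name__ == "__main__":
    batch = set(range(11, 21))  # the parent is sending packets 11..20
    have = {11, 12}
    print(handle_overheard(12, batch, have))  # duplicate
    print(handle_overheard(15, batch, have))  # useful overheard packet
    print(handle_overheard(25, batch, have))  # outside the current batch
```

Sending the ACK to the node's own parent, rather than to the node that was overheard, lets the parent skip a retransmission it would otherwise have to make.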

Fast Parent Node Reselection Phase.
Section 4.1.2 shows that changes in the environment during the codes dissemination phase degrade the performance of correlated dissemination. The current link correlation may be very different from the link correlation calculated before dissemination. Therefore, previous choices of parent node can be outdated. To reuse link correlation, another parent node selection phase is necessary. However, unlike the phase in Section 4.3.2, which is initiated from the base station, the phase needed here only takes place between a node and its one-hop downstream nodes. In detail, a sensor node rebroadcasts HELLO messages to all one-hop downstream nodes in its broadcast domain before transmitting code packets. Then, again, these one-hop downstream nodes reply with an ACK for each HELLO message. The following steps are the same as in the initial parent node selection phase. Figure 13 shows the data transmission through which one-hop downstream nodes choose their parent nodes, including the packets that contain information on reception vectors and the extra time that makes sure all potential ACKs can arrive.
Since the incentive for going through this reselection phase is to reuse link correlation, sensor nodes must keep monitoring link correlation during the codes dissemination phase. When link correlation drops below a predefined threshold, a sensor node assumes that its surrounding environment has changed and that a fast parent node reselection phase is needed before transmitting code packets to child nodes. However, obtaining link correlation in the way described in Section 4.3.2 would be time-consuming and unreasonable, because it needs the parent node's participation. Instead, we estimate link correlation quickly by collecting statistics on the percentage of unneeded retransmitted code packets. In detail, after finishing one round of transmission or retransmission, (12) is used to estimate link correlation:

$$\eta = 1 - \frac{N_u}{N_{all}} \qquad (12)$$

where $N_u$ is the number of unneeded retransmitted code packets, $N_{all}$ is the total number of received code packets, and $\eta$ indicates the percentage of useful retransmitted code packets for one sensor node. When $\eta$ is lower than a predefined threshold, the sensor node goes through a fast parent node reselection phase before disseminating code packets to its child nodes. Take Figures 7 and 8 as an example: after the first round of retransmission, node E uses (12) to estimate link correlation. The value of $\eta$ in Figure 7 is 1 − 1/5 = 80%, while that in Figure 8 is 1 − 4/9 ≈ 56%. Therefore, node E in Figure 8 is more likely to go through a reselection phase than node E in Figure 7.
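The estimate in (12) and the threshold check can be sketched as follows. The form "one minus the share of unneeded packets" is taken from the two worked values for node E above; the function names and the threshold of 0.6 are hypothetical, since the paper only speaks of a predefined threshold.

```python
def useful_ratio(unneeded, total):
    """Share of received code packets that were actually needed.

    unneeded: number of unneeded retransmitted code packets received.
    total: total number of code packets received in this round.
    """
    if total <= 0:
        raise ValueError("at least one packet must have been received")
    return 1.0 - unneeded / total


def needs_reselection(unneeded, total, threshold):
    """True if the estimate falls below the (assumed) threshold."""
    return useful_ratio(unneeded, total) < threshold


THRESHOLD = 0.6  # hypothetical value, chosen only for illustration

ratio_fig7 = useful_ratio(1, 5)   # node E in Figure 7
ratio_fig8 = useful_ratio(4, 9)   # node E in Figure 8
reselect_fig7 = needs_reselection(1, 5, THRESHOLD)
reselect_fig8 = needs_reselection(4, 9, THRESHOLD)
```

With this illustrative threshold, only node E in Figure 8 would trigger a reselection, which matches the qualitative conclusion in the text.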
Obviously, this phase only prolongs the dissemination completion time if link correlation remains stable during codes dissemination. However, it improves the performance when the environment is highly dynamic. Therefore, it is necessary to strike a balance between the improvement brought by this phase and the extra delay that it introduces. A detailed analysis of the delay is given in Section 5.2, and its impact in our experiment is shown in Section 6.4.1.

The Delay-Aware Program Dissemination Algorithm.
See Scheme 1. According to (3), we can obtain the number of code packets that are successfully received by all child nodes at the first transmission.

Analysis on Delay
where    () is the reception on the th packet of child node   . The left part of (13) indicates the number of packets that need not be retransmitted. According to (14), we also have Therefore, the value of  is irrelevant to the number of packets that are successfully received by all child nodes in (13), which is a variable shared by all child nodes.
At the first time,  needs to transmit all  code packets. At the second time, the number is And, at the third time, the number is Besides, the expected number of transmissions  is 1/min(     ). Hence, the total number of code packets that parent node  needs to transmit can be obtained through the following equation: However, the number of packets calculated above can be imprecise in the real world, since link correlation will change during codes dissemination and link quality will not always equal the link quality estimated by (1).
After obtaining the total number of packets to transmit, the transmission delay can also be calculated. A complete transmission process is described as follows. First, the parent node broadcasts a batch of code packets continuously and then starts to receive ACKs from its child nodes. Upon receiving some or all of these packets, a child node replies with ACKs that indicate which packets were successfully received. After waiting long enough to ensure that all potential ACKs can arrive, the parent node rebroadcasts the remaining packets, that is, those not yet successfully received by all child nodes. This process repeats until all child nodes have received the whole batch; the parent node then starts to broadcast the next batch. Figure 14 is a sequence diagram of the transmission.
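The broadcast and retransmission cycle described above can be illustrated with a small simulation. This is a simplified sketch, not the scheme's actual model: it assumes independent packet losses per child (so it ignores link correlation), perfectly reliable ACKs, and hypothetical names (`broadcast_batch`, `link_quality`) and link-quality values.

```python
import random


def broadcast_batch(batch, children, link_quality, rng, max_rounds=1000):
    """Simulate one batch; return the number of packets sent in each round.

    batch: iterable of packet IDs in the batch.
    children: list of child node IDs.
    link_quality: dict mapping child ID to reception probability per packet.
    rng: a random.Random instance (seeded by the caller for repeatability).
    """
    batch = list(batch)
    missing = {child: set(batch) for child in children}
    pending = list(batch)          # packets to (re)broadcast this round
    sent_per_round = []
    while pending:
        if len(sent_per_round) >= max_rounds:
            raise RuntimeError("batch did not complete within max_rounds")
        sent_per_round.append(len(pending))
        for child in children:
            for pkt in pending:
                # A child only gains from packets it is still missing.
                if pkt in missing[child] and rng.random() < link_quality[child]:
                    missing[child].discard(pkt)
        # ACKs are assumed reliable: the parent learns exactly which
        # packets at least one child still lacks, and resends only those.
        still_missing = set().union(*missing.values())
        pending = [pkt for pkt in batch if pkt in still_missing]
    return sent_per_round


rounds = broadcast_batch(range(16), ["C", "D"], {"C": 0.9, "D": 0.6},
                         random.Random(0))
perfect = broadcast_batch(range(16), ["C", "D"], {"C": 1.0, "D": 1.0},
                          random.Random(0))
```

The returned list gives the size of the first broadcast followed by the sizes of successive retransmissions; its sum is the total number of packets the parent sent for this batch.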
Here, $t_{DATA}$ is the time spent on transmitting code packets, $t_{ACK}$ is the time spent on replying ACKs, and an extra time is added to ensure that all potential ACKs can arrive. Hence, the delay for one node to broadcast a batch of code packets to its child nodes is determined by the number of code packets that need to be transmitted in each round, the sizes of a code packet and of an ACK, and the data transmission rate. Figure 15 shows the transmission delay for one node to transmit ten code packets to all its one-hop downstream nodes in several cases where the link correlation between these nodes is manually set to 20%, 40%, 60%, and 80%. The transmitting power of the nodes in Figure 15 is the same as the green curve in Figure 12, while the code packet size, the ACK size, and the data transmission rate are the same as in Section 6.1. As the distance from the base station increases, the gaps between the transmission delays start to shrink, since a more reliable link tends to weaken the benefits brought by link correlation.

Analysis on Delay of Parent Node Reselection.
Because the environment of a wireless sensor network can be highly dynamic, the link correlation obtained in the initial parent node selection phase tends to change during the codes dissemination phase. An example in Section 4.1.2 shows that the improvement brought by adopting link correlation can be greater after reselecting parent nodes during codes dissemination. However, this new parent node selection phase also brings extra delay to codes dissemination. Therefore, the delay brought by the fast parent node reselection phase is analyzed in this section. The data transmission of this phase is shown in Figure 13. In this phase, the reliability of HELLO messages and ACKs is not guaranteed, while that of the packets containing information on reception vectors and of the CHOSEN message is assured through a timeout retransmission mechanism. First, sensor nodes continuously broadcast a series of HELLO messages to the one-hop downstream sensors in their broadcast domain. Second, they start to receive the ACKs sent by the one-hop downstream nodes. An extra time is also added here to ensure the arrival of ACKs. Then, they construct reception vectors according to these ACKs and broadcast the packets that contain information on reception vectors. After receiving these packets, each one-hop downstream node calculates its link correlation for every node that broadcast them and selects the node with which it has the highest link correlation as its parent node. Finally, a CHOSEN message is sent to inform this node that it has been chosen.
According to the description above, the expected delay brought by the fast parent node reselection phase depends on the data size of each type of packet involved and on the expected number of transmissions. Figure 16 shows the transmission delay for nodes to go through the fast parent node reselection phase; the additional delay brought by the phase decreases as the distance from the base station increases, since a higher transmitting power also benefits this phase.

Experimental Evaluation
In this section, a simulation experiment is given to evaluate delay-aware program dissemination (DAPD). First, the parameters of the network and sensor nodes are introduced. Second, we conduct the experiment according to these parameters, and the results are compared with Deluge and Link-Correlation-Aware Data Dissemination (CD). Third, we analyze the impact of changes in network parameters.

Parameters Setting.
The wireless sensor network is composed of one base station and thirty sensor nodes. Each node can have 1 to 4 links (represented by black lines) with one-hop downstream nodes, and the largest hop count in the network is set to 5. The network topology is generated randomly with the parameters above, and the final result is shown in Figure 17.
The energy of the base station is considered to be infinite; therefore its transmitting power is high enough. The initial energy of the sensors in the network is 0.5 J, and the transmission radius of all sensors is 20 m. The total size of the codes is 40 kB, while the sizes of each code packet and each ACK are 20 bytes and 5 bytes, respectively. Besides, the extra waiting time for ACKs is set to 5 ms. Parameter settings on sensors are shown in Table 1.
To ensure that all sensor nodes can be updated successfully, each sensor broadcasts 16 packets continuously and then starts to receive ACKs. The following packets are not transmitted until it has received the ACKs of all its child nodes for these 16 packets. Besides, changes in link correlation during codes dissemination are ignored here and analyzed independently in Section 6.4.1.
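As a quick sanity check on these settings, the number of code packets and of 16-packet batches per dissemination can be computed as below. This is a rough sketch under two assumptions that the text does not state: that 40 kB means 40,000 bytes, and that each 20-byte code packet carries 20 bytes of code with no header overhead. The function name `batches_needed` is hypothetical.

```python
def batches_needed(total_bytes, packet_bytes, batch_size):
    """Return (number of packets, number of batches), rounding both up."""
    # -(-a // b) is integer ceiling division without floating point.
    packets = -(-total_bytes // packet_bytes)
    batches = -(-packets // batch_size)
    return packets, batches


# Assumption: 40 kB = 40,000 bytes, 20-byte payload, batches of 16.
packets, batches = batches_needed(40_000, 20, 16)

# Alternative reading of "kB" as 1024 bytes, shown for comparison.
packets_kib, batches_kib = batches_needed(40 * 1024, 20, 16)
```

Under the first reading each parent node cycles through 125 batches; under the second, 128.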
The transmitting power of Deluge and CD is shown in Table 2, and that of DAPD, after enhancement without reducing the lifetime of the network, is shown in Table 3. Before reprogramming, in all three cases data packets have already been collected 100 times and transmitted back to the base station with a transmitting power of 0 dBm. For example, the reception vectors kept by nodes with hop count 1 are shown in Table 4.
In Table 4, each entry gives the ID of the transmitter, the ID of the receiver, and the elements of the reception vector (1 indicates a successfully received HELLO message, while 0 indicates a failure). Figure 18 shows the selected parent nodes after this phase. The top end of each red line denotes the selected parent of the node at its bottom end. Grey lines indicate that, although those one-hop upstream sensors are not parent nodes, the code packets they broadcast may still be received by the nodes at the bottom end of the grey lines.

Comparison with Deluge and CD
6.3.1. Evaluation on Delay. The average transmission delay of Deluge, CD, and DAPD is shown in Figure 19. Each value on the horizontal axis denotes the hop count of the receiving sensors; that is, codes are disseminated to them from sensors one hop closer to the base station. According to Figure 19, the average transmission delay of DAPD is 19.05% and 16.65% smaller than that of Deluge and CD, respectively. Two reasons account for this improvement: (1) unlike Deluge, DAPD adopts link correlation; (2) unlike CD, DAPD intelligently enhances transmitting power to reduce delay further. Figure 20 presents the distribution of link correlation and link quality collected from all reception vectors, from which we can observe that, although link correlation does not follow any regular distribution, link quality in DAPD is appreciably higher than in CD due to the enhancement of transmitting power.

Evaluation on Energy Consumption.
The two main sources of energy consumption are transmitting and receiving code packets and ACKs. Therefore, we compare the sum of these two sources. Results are shown in Figure 21. Because sensors with hop count 5 do not need to transmit code packets, their energy consumption is far smaller than that of upstream nodes and is therefore ignored. From Figure 21, we can see that, on one hand, the energy consumption of DAPD becomes larger as hop count increases, which is consistent with our scheme of utilizing excess energy in far nodes to enhance transmitting power; on the other hand, the energy consumption of both CD and DAPD is smaller than that of Deluge on sensors with hop count smaller than 4, where link correlation plays a more important role. Therefore, the adoption of link correlation can reduce delay and energy consumption simultaneously. Besides, Figure 22 shows the distribution of residual energy after disseminating codes once and three times; DAPD slightly relieves the unbalanced residual energy distribution of sensors.
6.4. Impact of Network Parameters

6.4.1. The Volatility of Environment. When the environment of the wireless sensor network is highly dynamic, parent nodes selected during the initial parent node selection phase may be outdated and unable to make full use of link correlation during the codes dissemination phase. For example, Figure 23 shows a possible situation where the average link correlations between sensors with hop counts 3 and 4 and between sensors with hop counts 4 and 5 are manually modified from 62.47% and 74.63% to 20%. The average transmission delay of CD and DAPD increases toward that of Deluge after the link correlations are changed manually.
By using (12) to estimate link correlation, sensor nodes with hop counts 3 and 4 will assume that it is necessary to go through a fast parent node reselection phase. The data sizes of the HELLO message, ACK, and CHOSEN message are all 5 bytes, while the size of a packet carrying reception vectors is set equal to that of a code packet for convenience. Compared with the time used for transmission, the time spent on constructing reception vectors and calculating link correlation is far smaller; therefore it is ignored when measuring delay. Figure 24 shows the delay before and after adopting the fast parent node reselection phase, from which we can see that, although transmission delay cannot recover to its previous unchanged level, it improves considerably compared with taking no action when link correlation changes. Besides, the transmitting power of sensors becomes higher as the distance from the base station increases, which leads to a lower delay, and this improvement also benefits the fast parent node reselection phase. Therefore, the delay brought by this phase decreases when it happens on sensors with a large hop count.

Number of Hop Counts and Nodes.
The scale of a wireless sensor network mainly depends on the number of hop counts and nodes, which tend to have a great impact on the delay. On one hand, an increase in hop count directly increases the number of times that sensor nodes transmit and retransmit packets. On the other hand, the amount of data generated during the data collection phase also increases as the number of nodes becomes larger, which leads to more energy consumption and a more unbalanced residual energy distribution. Therefore, it is meaningful to analyze the impact of the network's scale on our scheme. Since the distribution of link correlation is irregular according to Figure 20, the impact concentrates on the enhancement of transmitting power. First, Figure 27 shows the data load on nodes at different distances from the base station during the data collection phase (the baseline is the original density of nodes); the gaps between nodes near and nodes far from the base station become wider as the number of hop counts and nodes increases.
With a different distribution of data load after the scale of the network changes, the transmitting power during the codes dissemination phase will also differ from Table 3, as shown in Figure 28.

6.4.4. The Period of Codes Dissemination. Apart from hop count and number of nodes, the frequency of codes dissemination also influences our scheme. Generally, the enhancement of transmitting power is smaller with a higher frequency. The period  is defined as follows:

However, as mentioned in Section 6.3.2, one disadvantage of DAPD is that the residual energy in sensor nodes with the largest hop count is not exploited, since there is no need for these nodes to broadcast code packets during the codes dissemination phase, which is an important source of energy consumption for other nodes in the network.
Our future work includes integrating code packet transmission between sensor nodes with the same hop count into DAPD to improve the performance further, as well as comprehensive research on the relation between link quality and link correlation.

Figure 3: The illustration of link correlation.

Figure 4: Distribution of residual energy on sensors.

Figure 6: Expected dissemination completion time with different transmitting power.

Figure 9: Three neighbor sensors with different hop counts during data collection phase.

Figure 11: Residual energy in sensors after collecting data for different times.

Figure 12: The enhancement of transmitting power.
4.3.1. Overview. Delay-aware program dissemination (DAPD) is one kind of bulk data dissemination and it has 3 salient characteristics. (1) It exploits excess energy of sensors to enhance transmitting power during codes dissemination.

Figure 13: Data transmission during fast parent node reselection phase.

5.1. Analysis on Delay of Codes Dissemination. First, we analyze the transmission delay for one node to broadcast a batch of code packets to its child nodes. Changes in link correlation during codes dissemination are ignored here. The necessary notations cover the parent node, the number of child nodes, the child node ID, the link quality when the parent node transmits packets, the link quality when a child node replies with ACKs, the number of code packets that the parent node transmits, and the link correlation between child nodes.

Delay of Parent Node Reselection. Due to the fact that the environment of wireless sensor network

Figure 16: Expected delay brought by fast parent node selection phase.

Figure 25 shows dissemination completion time (DCT) of codes dissemination with different packet sizes, while Figure 26 shows DCT with different total sizes of data.

Figure 20: Distribution of link quality and link correlation.

Figure 23: Delay of CD and DAPD after modifying link correlation manually (vertical axis: delay (s); curves: CD and DAPD, each before and after modifying).

Figure 29: The enhancement of transmitting power with different periods.

Table 1: Parameter setting on sensors.

Table 2: Transmitting power of Deluge and CD.

Table 3: Transmitting power of DAPD, listing the transmitting power and the theoretical link quality calculated through (1).

6.2. Selected Parent Nodes. First, each sensor node will select its parent node during the initial parent node selection phase described in Section 4.3.2.

Table 4: Reception vectors kept by nodes.