A Doping Bits Based BP Decoding for Rateless Distributed Source Coding Applications

A novel doping bits based belief propagation decoding algorithm is proposed for rate-adaptive LDPC codes built on a fixed bipartite graph. The proposed work modifies the decoding algorithm by converting the punctured nodes to regular source nodes and by following the encoding rule at the decoder. Doping bits transmitted in place of punctured bits, together with the modified decoding algorithm, feed all the punctured nodes with reliable log likelihood ratios. This enables the decoder to recover all punctured nodes in the early iterations. The fast convergence reduces decoder complexity while providing a considerable improvement in performance.


Introduction
Conventional video coding technologies have been dominated by the MPEG and H.26x families of standards, of which the most recent, H.264/AVC [1], represents the state of the art. They offer high quality, low bitrate video for demanding applications, but at the cost of complex motion estimation and compensation for better rate-distortion performance. However, some emerging applications, such as low-power wireless video surveillance, wireless PC cameras, and mobile camera phones, require low complexity encoding. Distributed source coding (DSC) [2,3], a new paradigm in coding that allows for very low complexity encoding, is well matched to these applications because it shifts the complexity from the encoder to the decoder. Motion estimation [4] plays an important role in H.264 and MPEG-4 video compression. It estimates the motion vector between the intra- and interframe and transmits only the motion vector for the interframe, whereas the source and channel coded data are transmitted for the intraframe. The motion vector estimation module is what increases the complexity of the encoder. Distributed video coding (DVC) based on DSC does not depend on complex motion estimation.
The core idea of DVC comes from information theoretic results of the 1970s: the Slepian-Wolf [5] and Wyner-Ziv [6] theorems. The Slepian-Wolf theorem states that if $X$ and $Y$ are two correlated sources, $X$ can be encoded at a rate greater than or equal to its conditional entropy $H(X \mid Y)$. Though $H(X \mid Y)$ is less than its entropy $H(X)$, $X$ can be decoded provided that $Y$ was encoded at a rate $H(Y)$ and is available at the decoder as side information. Wyner pointed out the possibility of using linear error correcting codes for Slepian-Wolf coding in [7]. As a practical DVC system needs channel codes with high coding efficiency, implementation became possible only after the invention of turbo and LDPC codes [8,9], which approach Shannon's capacity limit at long block lengths. Although the invention of LDPC codes settled the issue of coding efficiency, the problem of assuming the conditional entropy at the encoder remains. As the conditional entropy determines the rate of the code to be used for $X$, the syndrome bits transmitted on the basis of this presumption may or may not be sufficient to decode $X$. If the bit error rate (BER) at the decoder exceeds a given value, more syndromes are requested from the encoder. In this circumstance, the code should be incremental; that is, the encoder should not reencode the data. More precisely, the first bits are kept and only additional bits are sent upon request, which is essentially what rate-adaptive codes do.
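As a rough illustration of the rate requirement above, the following Python sketch evaluates the Slepian-Wolf bound for one common special case. It assumes, beyond what the text states at this point, that the source is a uniform binary sequence and that the side information differs from it through a binary symmetric channel, so that the conditional entropy equals the binary entropy of the crossover probability. The function names are hypothetical.

```python
import math

def binary_entropy(p: float) -> float:
    """Binary entropy h(p) in bits, with h(0) = h(1) = 0 by convention."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)

def min_syndromes(n: int, crossover: float) -> int:
    """Smallest syndrome count m with m / n >= h(crossover).

    Assumption: uniform binary source, BSC correlation with the side information.
    """
    return math.ceil(n * binary_entropy(crossover))

# Bound for a block of 1584 source bits at a few crossover probabilities.
for p in (0.02, 0.06, 0.12):
    print(p, round(binary_entropy(p), 4), min_syndromes(1584, p))
```

A practical code needs more syndromes than this bound, which is why the decoder may have to request additional bits when the presumed correlation turns out to be too optimistic.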
The concept of rate-adaptive codes originated in the channel coding literature. One of the key techniques for achieving rate adaptivity is puncturing, which comes at a relatively small performance loss [10]. Ha et al. [10] investigated the existence of good puncturing patterns in LDPC codes for rate adaptivity, but for large code lengths. They also proposed a systematic way to find the locations of punctured symbols in short length LDPC codes [11], which is rate adaptive and minimizes the performance loss due to puncturing. Hyo et al. [12] proposed an algorithm that prevents the formation of a stopping set from the punctured variable nodes even when the amount of puncturing is quite large. In spite of the existing work on rate-adaptive channel codes, the design of rate-adaptive LDPC codes for Slepian-Wolf coding is a challenging task because of the wide range of compression rates needed to achieve the theoretical compression limit. It has been studied extensively by various authors. Varodayan et al. [13] proposed a rate-adaptive scheme that sends doping source bits in addition to the source bits in a deterministic manner. They also proposed a new construction of codes [14], called LDPC accumulate (LDPCA) codes, to fit Slepian-Wolf coding. The LDPCA code set is designed by a syndrome splitting method that keeps the size of the stopping sets. However, it does not carefully eliminate the short cycles for each member code, which results in low girth [15]. Rate-adaptive PEG codes, which increase the girth to reduce the cycles, have been suggested in [16]. The common shortcoming of these codes is that they handle poor side information at the decoder by sending more syndromes on request, with a different bipartite graph for each rate change.
Figure 1 demonstrates how the bipartite graph varies with each rate change. Figure 1(a) shows the Tanner graph for the base code, whose rate is one, and Figures 1(b) and 1(c) show it for the rate 1/2 code. Figure 1(b) combines the nodes connected to each odd indexed syndrome with those of the even indexed one to achieve the rate 1/2 code. Figure 1(c) transmits the even indexed syndromes to achieve the rate 1/2 code, which leaves some source nodes unconnected, as shown. This does not help decoding, as the source nodes are already available as side information at the decoder. The varying structure of the Tanner graph shown in Figures 1(a), 1(b), and 1(c) requires extensive memory management at the encoder. If the same LDPCA codes have to be used at the Slepian-Wolf theoretical compression limit with a variable number of source nodes, zeros have to be padded up to the actual length of the information bits. Hence LDPCA not only needs more complex memory management but also does not support rate adaptivity for variations in block length. Skipping highly correlated colocated blocks in block based video coding leads to a variable block length and hence a variable number of source nodes. Rate-adaptive codes for Slepian-Wolf coding therefore have to serve two purposes: handling a variable number of source nodes and a variable number of check nodes.
Chen et al. [17] showed that, under density evolution, each Slepian-Wolf coding problem is equivalent to a channel coding problem for a binary-input output-symmetric channel. On this basis, any rate-adaptive channel code design can be directly converted to a Slepian-Wolf code design. Jiang et al. addressed the problem of the variable Tanner graph and presented a solution based on a fixed Tanner graph in [18,19], using the analytical results on punctured LDPC codes in [17,20]. They also showed that any arbitrarily small gap to capacity of the original code can be preserved after puncturing on the binary erasure channel (BEC) and that punctured LDPC codes are as good as ordinary LDPC codes. The issue of the variable Tanner graph is solved in [18] by designing the code with extra (20-25%) punctured source and check nodes, which may or may not carry information. This method uses standard BP decoding to recover the extended punctured parity nodes, leading to slow convergence. The main contribution of this paper is an algorithm for [18] that speeds up the convergence of decoding and at the same time improves the BER performance. The work presented here gives a new construction method to ease the code design and modifies the decoding algorithm. It thereby improves some results from [18], where an earlier version of the algorithm was presented. The developed algorithm is called the doping bits based fixed Tanner graph algorithm, since it sends doping bits instead of extended parity bits on an additional syndrome request. PEG constructed LDPC codes [21] are used, as bit plane processing in DSC applications requires short length codes. The iterative belief propagation (BP) decoder [22] with estimated side information and doping bits leads to performance improvement in terms of decoder complexity and BER.
The rest of the paper is organized as follows. Some important definitions and notations used throughout this paper are given in Section 2, and the novel doping bits based decoding algorithm is introduced in Section 3. Simulation results are discussed in Section 4, and the conclusion is presented in Section 5.

Definitions and Notations
The notations and definitions used throughout the paper are introduced here. LDPC codes are represented by a Tanner graph [23], which consists of two sets of nodes connected by edges: the variable nodes correspond to the $n$ columns of the parity-check matrix $H$, and the $m$ check nodes represent the rows of $H$. There are two ways to represent an LDPC code for DSC applications: (1) by $(n, m)$, where $n$ (input) is the number of information bits and $m$ (output) is the number of syndromes, that is, only the syndromes are transmitted; (2) by $(k, n-k)$, where $k$ (input) is the number of information bits and $n-k$ (output) is the number of parity bits, that is, only the parity bits are transmitted. The work proposed here is based on the syndrome method. Since the $n$ variable nodes are used for information bits, they are called source nodes from now on. Apart from the $n$ source nodes and $m$ syndromes of the usual scheme, there are $p$ extended source nodes, called parity nodes, and $p$ extended syndrome nodes for rate adaptation. Figure 2 shows the Tanner graph of the system described in [18]. Finally, $m_{eq} = m + p$ represents the total number of syndromes and $n_{eq} = n + p$ represents the total number of source nodes.
The source nodes are a combination of information bits $x_1$ to $x_6$ and parity bits $p_1$ to $p_3$. The check nodes are a combination of check nodes $c_1$ to $c_3$ and extended check nodes $EC_1$ to $EC_3$. The extended check nodes are equal in number to the parity nodes.

Doping Bits Based Rate-Adaptive LDPC Codes
Algorithm.
LDPC codes in syndrome form have been used effectively in fixed-rate distributed source coding because they achieve a higher compression ratio than the parity based scheme. The BP decoding is modified slightly in order to take the syndrome information into account [22]. LDPC codes used as Slepian-Wolf codes have to be rate adaptive in two ways: with respect to variation in source length and with respect to variation in the quality of the side information estimate. Filling extra source nodes with zeros is one way to adapt to a variable source length, but it results in more encoding complexity. The design used in [18] provides a solution for source length variation as well as for variation in side information quality. Side information quality variation is handled by incremental syndromes. The encoding steps are as follows: (1) generation of the parity sequence such that the extended syndromes become zero; (2) generation of the regular syndromes from the actual source using the parity sequence generated in step 1; (3) mixing a variable portion of the source bits with the generated parity sequence according to a predefined arithmetic operation; (4) in addition, the proposed scheme sends uncoded source bits, called doping bits, according to the desired compression rate. The reason for this name is explained later in the decoding section.
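One common way to take the syndrome into account in BP decoding is to flip the sign of the check node message whenever the received syndrome bit is 1. The Python sketch below shows that tanh-rule update. It is a generic illustration under this assumption, not necessarily the exact update used in [22], and `check_to_source` is a hypothetical name.

```python
import math

def check_to_source(incoming_llrs, syndrome_bit):
    """Check node message to one source node, with a syndrome sign flip.

    incoming_llrs: LLRs arriving from the other source nodes on this check.
    syndrome_bit:  received syndrome value (0 or 1) for this check.
    """
    prod = 1.0
    for llr in incoming_llrs:
        prod *= math.tanh(llr / 2.0)
    # Clip so that atanh stays finite when the inputs are very reliable.
    limit = 1.0 - 1e-12
    prod = max(min(prod, limit), -limit)
    return (1 - 2 * syndrome_bit) * 2.0 * math.atanh(prod)

# Same incoming messages, opposite syndrome bits: opposite signs.
print(check_to_source([2.0, 3.0], 0), check_to_source([2.0, 3.0], 1))
```

With a syndrome bit of 0 this reduces to the usual channel decoding update, so only the sign handling differs from a standard BP decoder.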
A modified decoding method at the decoder exploits these doping bits. Computation of the parity sequence, that is, $p_1$ to $p_p$, involves solving $p$ linear equations associated with the corresponding check nodes. The basic idea is that, by way of modified decoding, the punctured parity nodes become reliable and help the decoding process converge in the early iterations. The pattern of the doping bits over the source nodes is known to the decoder. Two schemes are explained: variable source length and fixed syndrome coding (Scheme A) and variable source length and variable syndrome coding (Scheme B). With $n$ source nodes, $m$ syndromes, $n_{eq}$ total source nodes, and $m_{eq}$ total syndromes, the supported rates range from $m/n$ to $m/n_{eq}$ for Scheme A and from $m/n$ to $m_{eq}/n_{eq}$ for Scheme B. The doping technique is applied when the decoder requires additional syndromes.
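The parity computation described above amounts to solving a small linear system over GF(2). The following Python sketch shows one way to do it with Gauss-Jordan elimination. The matrix layout (`a` for the parity node connections and `b_src` for the source node connections of the extended checks) and the function names are assumptions made for illustration; they are not taken from the paper.

```python
def solve_gf2(a, b):
    """Solve a @ p = b over GF(2) for a square 0/1 matrix `a`.

    Raises ValueError when the check equations are not independent.
    """
    size = len(a)
    # Augmented matrix [a | b], copied so the inputs stay untouched.
    rows = [list(a[r]) + [b[r]] for r in range(size)]
    for col in range(size):
        pivot = next((r for r in range(col, size) if rows[r][col]), None)
        if pivot is None:
            raise ValueError("extended check equations are not independent")
        rows[col], rows[pivot] = rows[pivot], rows[col]
        for r in range(size):
            if r != col and rows[r][col]:
                rows[r] = [u ^ v for u, v in zip(rows[r], rows[col])]
    return [rows[r][size] for r in range(size)]

def parity_bits(a, b_src, source):
    """Parity bits that force every extended syndrome to zero."""
    rhs = [sum(h & x for h, x in zip(row, source)) % 2 for row in b_src]
    return solve_gf2(a, rhs)

# Hypothetical toy graph: EC1 sees parity 1 and source 6,
# EC2 sees parities 1 and 2 and source 3.
a = [[1, 0], [1, 1]]
b_src = [[0, 0, 0, 0, 0, 1], [0, 0, 1, 0, 0, 0]]
print(parity_bits(a, b_src, [1, 0, 1, 1, 0, 0]))
```

When each extended check touches exactly one parity node, `a` is the identity matrix and the elimination reduces to a direct XOR of the connected source bits.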

Scheme A: Variable Source Length and Fixed Syndrome Coding.
This scheme comes into the picture when the source statistics change and cause variations in source length. The proposed scheme punctures the extended parity bits according to the compression ratio and estimates the punctured parity bits using the modified decoding scheme at the decoder, whereas the algorithm described in [18] does not perform this estimation. Estimation makes each punctured node a reliable node, and hence the LLRs computed for these bits are highly reliable. The reliable punctured nodes obtained by modified decoding lead to early convergence at the decoder. For variable source length coding, $i$ additional source bits are sent, so that the source length becomes
$$n_i = n + i, \quad i = 0, \dots, p. \qquad (1)$$

Scheme B: Variable Source Length and Variable Syndrome Coding.
This scheme is applied whenever the decoder requests supplementary incremental syndromes because of poor side information estimation. For every additional syndrome requested, instead of sending the parity bits $p_1$ to $p_p$, the encoder sends uncoded source bits to the decoder. Combined with the $p_1$ to $p_p$ estimated at the decoder, this improves the performance and also speeds up the decoding. In a nutshell, the role of the rate-adaptive decoder is to recover the source $X$ from the uncoded source bits and the syndrome bits received so far from the encoder, in conjunction with the block-candidate side information $Y$, for any data length and number of syndromes. As these uncoded source bits enrich the performance, they are called doping bits.
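The rate ranges of the two schemes can be checked with a short calculation. The sketch below uses 1500 source nodes, 750 syndromes, and 300 extended nodes, which correspond to an (1800, 1050) extended code with base rate 1/2. The specific counts of 100 extra source bits and 150 doping bits are assumptions chosen for illustration, and the function names are hypothetical.

```python
from fractions import Fraction

def scheme_a_rate(m: int, n: int, extra_source: int) -> Fraction:
    """Fixed syndromes m, source length grown by `extra_source` bits."""
    return Fraction(m, n + extra_source)

def scheme_b_rate(m: int, n: int, extra_source: int, doping: int) -> Fraction:
    """Scheme A plus `doping` uncoded source bits sent as extra syndromes."""
    return Fraction(m + doping, n + extra_source)

n, m, p = 1500, 750, 300                        # base code and extended nodes
print(float(scheme_a_rate(m, n, 100)))          # 0.46875
print(float(scheme_b_rate(m, n, 100, 150)))     # 0.5625
print(scheme_a_rate(m, n, p), scheme_b_rate(m, n, p, p))
```

Under these assumptions Scheme A moves the rate below the base rate by lengthening the source, while Scheme B moves it back up as doping bits are added.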

Decoding.
The key to the proposed solution is the modified decoding algorithm, which follows the encoding steps. It differs from [18] in the way the parity nodes are computed and in the BP decoding of the doping bits at the decoder.
Once the parity nodes have been used to obtain the extended zero syndromes, they are used, in combination with the other source symbols, to find the other nonzero syndromes. The encoding part is explained first in order to appreciate the decoding. For simplicity, an (8,4) code is assumed in which 20% of the nodes, that is, 2 nodes, are punctured. The new code dimension is (10,6); that is, 2 syndrome nodes must be made zero. The Tanner graph is drawn partially in Figure 3 in order to highlight only the computation of the zero syndrome nodes. In Figure 3, $EC_1$ and $EC_2$ are zero syndrome nodes and $p_1$ and $p_2$ are parity nodes. With $N(EC_j)$ denoting the set of source nodes connected to $EC_j$, the equation for the zero syndrome nodes at the encoder is given by
$$EC_j = p_j \oplus \bigoplus_{i \in N(EC_j)} x_i = 0, \quad j = 1, 2. \qquad (2)$$
At the encoder, $p_1$ and $p_2$ are computed such that $EC_1$ and $EC_2$ are zero, with $x_6$ and $x_3$ known. The same can be reversed at the decoder, as the core idea of DSC is that the side information is correlated with the source being encoded. Hence at the decoder the equation to find the parity bits from the side information bits $y_i$ is given by
$$\hat{p}_j = \bigoplus_{i \in N(EC_j)} y_i, \quad j = 1, 2. \qquad (3)$$
As the punctured nodes are computed to have a value, they are not fed with zero as LLR. This modification raises the reliability of the punctured parity nodes and hence the decoding performance. It transforms the intentional puncturing, with its heterogeneous parallel channel (BSC/BEC) parameters [20], into a homogeneous channel. The strengthened LLR values obtained with the modified algorithm at the decoder lead to fast convergence. Eliminating the punctured parity nodes as described above leaves room to send other bits in place of the extended syndromes, and these bits are called doping bits. As the doping bits are 100% reliable source bits, the BP decoding can be modified; that is, the message from a doping bit node to the check nodes is unaltered and is always
$$L^{\mathrm{out}} = (1 - 2d)\,\infty, \qquad (4)$$
where $L^{\mathrm{out}}$, the output, is the message from the source node to the check nodes and $d$ is the doping bit [13]. This not only leads to fast convergence but also reduces the computational complexity.
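The decoder-side initialization described above can be sketched as follows. The code builds the initial LLRs for source nodes and parity nodes: side information bits get a BSC log likelihood ratio, doping bits get an infinite LLR of the appropriate sign, and each parity node is estimated by XORing the side information bits on its extended check. Giving the estimated parity bit the same BSC reliability as the side information is an assumption of this sketch, as are all function and parameter names.

```python
import math

def bsc_llr(bit: int, crossover: float) -> float:
    """LLR log(P(x=0|y) / P(x=1|y)) for an observed bit over a BSC."""
    return (1 - 2 * bit) * math.log((1.0 - crossover) / crossover)

def initial_llrs(side_info, ext_checks, doping, crossover):
    """Initial LLRs for the source nodes followed by the parity nodes.

    side_info:  side-information bits, one per source node.
    ext_checks: for each parity node, the source indices on its extended check.
    doping:     dict {source index: known bit} for the doped positions.
    """
    llrs = [bsc_llr(y, crossover) for y in side_info]
    for idx, bit in doping.items():
        llrs[idx] = (1 - 2 * bit) * math.inf    # doping bits are known exactly
    for neighbours in ext_checks:
        estimate = 0
        for idx in neighbours:
            estimate ^= doping.get(idx, side_info[idx])
        # Assumption: the estimate inherits the side-information reliability
        # instead of the zero LLR that a punctured node would receive.
        llrs.append(bsc_llr(estimate, crossover))
    return llrs

llrs = initial_llrs([0, 1, 1, 0, 0, 1], [[5], [2]], {0: 1}, 0.1)
print(llrs)
```

In a full decoder the infinite entries would simply be held fixed during message passing, which is what makes the doped positions cheap to process.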

Code Design.
The properties that the system in [18] requires of the code design, beyond those of a standard LDPC code, are that (1) the extended source nodes and extended check nodes should be placed at the end in order to allow for puncturing within the binary erasure channel; (2) in addition to the extended source nodes, the other source nodes connected to the extended check nodes should have a one to one connection, as shown in Figure 3. Here the parity bits are directly connected to the extended check nodes, and only a few other source nodes are connected, to make the computation easier. If $EC_2$ had been connected to two parity nodes as in Figure 2, then the equation for $EC_2$ would change to
$$EC_2 = p_2 \oplus p_k \oplus \bigoplus_{i \in N(EC_2)} x_i = 0, \qquad (5)$$
where $p_k$ is the second parity node connected to $EC_2$ and $N(EC_2)$ is the set of source nodes connected to $EC_2$. $p_1$, $p_2$, and $p_3$ are computed in such a way that $EC_1$, $EC_2$, and $EC_3$ are zero in each input cycle. Solving these equations is computationally intensive if the three equations are not independent. It is very hard to place both the extended source nodes and the syndrome nodes at the end with independent check node equations for a specified BER parameter. As the scheme is based on a fixed Tanner graph, the extended syndrome nodes can be placed anywhere, with the only restriction being independent check node equations. Selecting the extended syndromes in this way opens up many choices that are not available with the direct design. The position of the extended syndrome nodes has to be known to the decoder in this case. This is not very complicated, as it is a one time process. The new construction is shown in Figure 4.
The selection of the number of codes is explained with an example using a QCIF video. The total number of blocks of size 4×4 in a QCIF (176×144) video frame is 1584. The number of source nodes, when processed in bit planes without compression, is therefore 1584. A further example is a 512 × 512 hyperspectral image [24], which can also be DSC coded because of the high correlation between neighboring bands.
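The block count quoted above follows directly from the frame and block dimensions, as the short check below shows. The helper name `block_count` is hypothetical.

```python
def block_count(width: int, height: int, block: int = 4) -> int:
    """Number of non-overlapping block x block tiles in a frame."""
    if width % block or height % block:
        raise ValueError("frame size must be a multiple of the block size")
    return (width // block) * (height // block)

qcif_blocks = block_count(176, 144)   # 44 columns * 36 rows
print(qcif_blocks)                    # 1584
```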
In skip mode the number of source nodes changes because blocks that are highly correlated with the colocated blocks in the previous frame are skipped. This can be explained further with the help of the mother-daughter video, taking the $Y$ component of the 3rd and 4th frames, and a hyperspectral image, taking the 28th and 29th bands. As shown in Figures 5(c) and 5(d), since the grandma sequence is a slow motion video, all the blocks are almost the same and the PSNR between them is more than 45 dB. The difference between the frames is due only to the camera lighting conditions. Skipping these blocks reduces the number of source bits for LDPC coding. Changing the PSNR threshold of the skip blocks for a specified rate-distortion performance also changes the number of source bits for coding. Another factor contributing to a variable number of source nodes is the quantized DCT coefficients that can be discarded because of their low magnitude. Apart from the above, variations in correlation between adjacent frames of a video lead to a variable source length. Because of the similarity of the decoding algorithm, the degree distributions optimized for LDPC channel codes (e.g., using [25]) can be applied directly to the proposed rate-adaptive codes.
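The skip decision described above can be sketched as a PSNR test between colocated blocks. The code below assumes 8-bit samples (peak value 255), blocks given as flat lists of pixel values, and a 45 dB threshold. These representation choices and the function names are illustrative assumptions.

```python
import math

def psnr(block_a, block_b, peak=255.0):
    """PSNR in dB between two equally sized blocks (flat pixel lists)."""
    mse = sum((a - b) ** 2 for a, b in zip(block_a, block_b)) / len(block_a)
    return math.inf if mse == 0 else 10.0 * math.log10(peak * peak / mse)

def coded_blocks(current, previous, threshold_db=45.0):
    """Indices of blocks that still need coding (PSNR at or below threshold)."""
    return [i for i, (c, p) in enumerate(zip(current, previous))
            if psnr(c, p) <= threshold_db]

same = [100] * 16                 # identical to the colocated block
near = [100] * 15 + [101]         # one pixel off by one
far = [110] * 16                  # every pixel off by ten
print(coded_blocks([same, near, far], [same, same, same]))
```

The number of indices returned is what sets the source length for the frame, so raising or lowering the threshold changes how many source bits reach the LDPC encoder.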

Results and Discussion
The performance of the doping bits based rateless code is analyzed on the basis of storage complexity, encoder-decoder complexity, and BER performance. The experimental setup parameters are as follows.
(i) The channel is modeled as multiple BSCs in order to capture the correlation of each bit plane independently [26].
(ii) The channel code for the proposed system has been designed for the BSC with 20% punctured or extra source nodes using density evolution. The threshold crossover probability of the resulting code is 0.12.
(iii) Both the standard and the PEG constructions are used for the code design.
(iv) The virtual correlation channel is modeled as BSC/BEC when punctured nodes are involved and as BSC when the punctured nodes are estimated.
(v) Maximum source length considered is 1584 and the minimum one considered is 500 as explained in Section 3.
(vi) A set of 5 codes has been designed with base code rate 1/2, within the range 500-1584, as in Table 1.

Storage Complexity.
There is no difference between the number of codes used in the proposed scheme and in reference [18]; however, to show the advantage of the proposed scheme over rate-adaptive codes that perform only syndrome adaptation, Table 1 compares the number of codes. For a source length variation from 500 to 1584, other rate-adaptive codes need 11 codes, whereas the proposed algorithm needs only 6. Also, the resolution step is 100 for regular codes and 1 for the proposed algorithm.
Tables 2 and 3 illustrate the number of iterations required for the (1800, 1050) rate-adaptive code, where the number of extended data and syndrome nodes is 300. Decoder complexity reduction is achieved by the modified decoding.
Figure 6 shows the BER performance comparison under variable source length and fixed syndrome (Scheme A). The rates are modified by changing the number of source nodes while keeping the syndromes constant. The performance improvement in low rate codes, due to the smaller number of punctured nodes, is almost offset by the smaller number of zero syndromes. The rate 0.5 and 0.46875 codes show only a small performance gap for both the proposed scheme and that of reference [18], and this gap disappears at high correlation, as punctured nodes converge faster with highly correlated side information. However, the comparatively higher gain of the proposed scheme is due to the estimation of punctured nodes at the decoder; that is, the proposed scheme reaches a BER of $10^{-5}$ at a crossover probability of 0.06, whereas the scheme of reference [18] reaches it only at a crossover probability of 0.025.

Figure 7 shows the BER performance comparison under variable source length and variable syndrome (Scheme B). The rates are modified by changing the number of syndrome nodes and the source length. High rate codes perform better because the doping bits are sent as the requested extra syndromes, in addition to the modified decoding. The performance gap keeps widening as the number of doping bits increases.
The performance of the proposed Scheme A and Scheme B is compared with that of reference [18] in Figure 8. The rate 0.46875 code was used for comparison. Scheme A, with estimated punctured nodes, performs better than [18]. With the additional syndromes the rate becomes 0.5625 for Scheme B, and unsurprisingly it performs better than the other schemes because of the doping bits. Figure 9 compares the performance of LDPC codes constructed by the standard algorithm and by the PEG construction algorithm; as expected, the PEG based codes outperform the standard ones because of their higher girth.

Figure 1: Decoding graphs if the encoder transmits (a) entire syndrome, (b) accumulated syndrome, and (c) even indexed syndrome bits.

Figure 5: (a) and (b) 28th and 29th bands of hyperspectral image. (c) and (d) $Y$ component of 3rd and 4th frames of grandma video.

Figure 6: Scheme A: comparison of BER versus crossover probability for rate adaptation.

Figure 7: Scheme B: comparison of BER versus crossover probability for rate adaptation.

Table 1: Comparison of the number of codes required for the proposed scheme with other codes.
Complexity of Encoder and Decoder.
Encoder complexity is given by $s = H \cdot x$, where $s$ is the syndrome, $x$ is the vector of information bits, and $H$ is the parity check matrix. Let $d_v$ and $d_c$ be the source and check node degrees, respectively; that is, $d_c$ source nodes are connected to each check node.
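Because the parity check matrix is sparse, the syndrome can be formed with a handful of XORs per check node. The sketch below stores each check node as the list of source indices connected to it and counts the XOR operations, which come to one fewer than the check degree per syndrome bit. The data layout, the toy graph, and the function name are assumptions for illustration.

```python
def syndrome(check_rows, source_bits):
    """Syndrome of `source_bits` and the number of XORs used.

    check_rows[j] lists the source indices connected to check node j.
    """
    out = []
    xor_count = 0
    for row in check_rows:
        bit = 0
        for idx in row:
            bit ^= source_bits[idx]
        out.append(bit)
        xor_count += max(len(row) - 1, 0)   # degree - 1 XORs per check
    return out, xor_count

# Hypothetical graph with 6 source nodes and 3 check nodes of degree 3.
rows = [[0, 1, 2], [1, 3, 4], [2, 4, 5]]
print(syndrome(rows, [1, 0, 1, 1, 0, 0]))   # ([0, 1, 1], 6)
```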

Table 2: Number of iterations required for Scheme A.

Table 3: Number of iterations required for Scheme B.
The improvement in Table 2 is due only to modified decoding and is hence a moderate 16.66%, whereas the improvement for item 2 in Table 3 is due to two factors: modified decoding for 50% of the nodes and 50% doping bits in place of extended syndromes.

The simulation parameters are listed in Table 4. The virtual correlation channel is modeled as a BSC because of bit plane coding. The variable conditional entropy was simulated by adding noise with various crossover probabilities to the motion compensated frame 1.
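Side information of varying quality can be generated by passing the source bits through a BSC with different crossover probabilities. The following sketch does this for a random bit sequence and reports the measured bit disagreement. The use of a random sequence in place of a motion compensated frame, the seed, and the function name are assumptions made for illustration.

```python
import random

def bsc(bits, crossover, rng):
    """Flip each bit independently with probability `crossover`."""
    return [b ^ int(rng.random() < crossover) for b in bits]

rng = random.Random(2024)                        # fixed seed for repeatability
source = [rng.getrandbits(1) for _ in range(20000)]
for p in (0.02, 0.06, 0.12):
    side_info = bsc(source, p, rng)
    measured = sum(x != y for x, y in zip(source, side_info)) / len(source)
    print(f"target {p:.2f}  measured {measured:.4f}")
```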

Table 4: Simulation parameters for rate compatibility.