Construction and Decoding of Rate-Compatible Globally Coupled LDPC Codes

This paper presents a family of rate-compatible (RC) globally coupled low-density parity-check (GC-LDPC) codes, which is constructed by combining an algebraic construction method with graph extension. Specifically, the highest rate code is constructed using the algebraic method and the codes of lower rates are formed by successively extending the graph of the higher rate codes. The proposed rate-compatible codes provide more flexibility in code rate and guarantee the structural property of algebraic construction. It is confirmed, by numerical simulations over the AWGN channel, that the proposed codes perform better than their counterpart GC-LDPC codes formed by the classical method and exhibit an approximately uniform gap to the capacity over a wide range of rates. Furthermore, a modified two-phase local/global iterative decoding scheme for GC-LDPC codes is proposed. Numerical results show that the proposed decoding scheme can reduce the unnecessary cost of the local decoder at low and moderate SNRs, without any increase in the number of decoding iterations in the global decoder at high SNRs.


Introduction
Globally coupled low-density parity-check (GC-LDPC) codes, which were proposed by Li et al. in [1][2][3][4], are a special type of LDPC codes designed for correcting random symbol errors and bursts of errors or erasures. They have a different structure from the conventional LDPC block codes and the spatially coupled LDPC (SC-LDPC) codes [5][6][7][8][9][10][11][12]. From the perspective of the Tanner graph, a GC-LDPC code uses a group of global check nodes (CNs) to couple (or connect) a set of disjoint Tanner graphs called local graphs. We refer to such codes as CN-based globally coupled LDPC (CN-GC-LDPC) codes. Due to this special structure, CN-GC-LDPC codes not only perform well on both the additive white Gaussian noise (AWGN) channel and the binary erasure channel (BEC) but also are effective for correcting burst erasures.
For time-varying channels, wireless communication theory tells us to adapt the rate according to the available channel state information (CSI); such an error control strategy is referred to as rate adaptability [13]. Rate-compatible (RC) channel codes with incremental redundancy are often used in conjunction with the hybrid automatic repeat request (HARQ) strategy [14][15][16][17][18][19][20]. Most recently, RC-LDPC codes have been adopted by the 3rd Generation Partnership Project (3GPP) as the channel coding scheme for the 5G enhanced mobile broadband (eMBB) data channel [21]. Such codes, with a wide range of rates and block lengths, are a family of nested codes which can be interpreted as a graph extension of high-rate codes [17,22,23].
SC-LDPC codes have some conventional design methods for RC-LDPC codes, such as puncturing variable nodes from the codes with low rate and extending variable nodes to the codes with high rate. In contrast, the classical GC-LDPC codes are mostly restricted to an invariant code rate [11,20,22,23,24,25], and the existing construction methods of GC-LDPC codes are not flexible enough for RC code design. In this paper, we present a family of RC CN-GC-LDPC codes. The proposed construction is based on combining an algebraic construction method with graph extension. The highest rate code, which can be called the mother code, is constructed using the algebraic method. The codes of lower rates are formed by successively extending the graph of the higher rate codes. Compared to the original CN-GC-LDPC codes, the proposed family of RC CN-GC-LDPC codes not only provides more flexibility in code rate and allows more efficient encoding but also guarantees the structural property of algebraic construction. Examples show that the proposed codes perform well and exhibit an approximately uniform gap to the capacity over a wide range of rates.
Furthermore, localization is a significant advantage of CN-GC-LDPC codes which allows us to process a consecutive sequence by some local sequences. If a decoder uses the two-phase local/global iterative scheme, its decoding latency and memory requirements are much less than those of the classical decoding scheme, without performance degradation [1,2,26,27]. However, the local decoder performs many useless operations at low and moderate SNRs, which causes unnecessary decoder cost. In light of this, we present a modified two-phase local/global iterative decoding scheme for CN-GC-LDPC codes which reduces the unnecessary cost of the local decoder at low and moderate SNRs.
The remainder of this paper is organized as follows. In Section 2, we give a brief introduction to CN-GC-LDPC codes. In Section 3, we introduce the concept of graph extension through a four-edge-type LDPC code and propose a simple construction method of a four-edge-type LDPC code family. In Section 4, we present a modified two-phase local/global iterative decoding scheme for CN-GC-LDPC codes. Numerical results for RC GC-LDPC codes are presented in Section 5. Finally, Section 6 concludes this paper.

CN-Based QC-GC-LDPC Codes.
A CN-based QC-GC-LDPC code C  is constructed using a base matrix B  which consists of two subarrays as shown: The upper subarray of B  is a  ×  diagonal array with  ×  matrices B  on its main diagonal and the lower part is an × matrix X  , where 0 ≤  ≤  − 1. So, B  is an ( + ) ×  matrix.
From the perspective of the graph, let G  be the Tanner graph associated with B  for 0 ≤  ≤  − 1. We call G  a local Tanner graph. The different G  are pairwise disjoint (for 0 ≤ ,  ≤ −1, and  ≠ , G  and G  are disjoint). The  global CNs connect all these local graphs, composing the Tanner graph associated with B  , denoted as G  , as shown in Figure 1 (in this figure, the edge labels are ignored). We see that the  global CNs provide the only connections between any two disjoint local Tanner graphs.

The Basic Method of Code Construction.
Li et al. presented two types of CN-based QC-GC-LDPC codes and their construction methods in [1,2,4]. In this subsection, we briefly review the construction method for the first type of QC-GC-LDPC codes in [1]. Let GF(q) be a nonbinary (NB) field with q elements, where q is a power of a prime. Let α be a primitive element of GF(q). Then, the powers of α, namely, α^0, α^1, α^2, . . ., α^(q−2), give all the nonzero elements of GF(q) and α^(q−1) = α^0 = 1. We construct the following matrix B  over GF(q) using the method in [2]:

Suppose q − 1 can be factored as the product of two positive integers  and ; that is, q − 1 = . The matrix B  which has a cyclic structure is then an  ×  square matrix. The first  rows in B  comprise an  ×  matrix, denoted by W 0 . Then, W 0 can be partitioned into  submatrices of size  × , denoted by W 0,0 , W 0,1 , . . ., W 0,−1 . Based on the cyclic structure of B  , we can obtain the next  rows in B  by cyclically shifting all the rows of W 0 = [W 0,0 , W 0,1 , . . ., W 0,−1 ]  positions to the right, denoted by W 1 = [W 0,−1 , W 0,0 , . . ., W 0,−3 , W 0,−2 ]. And so, the matrix B  can be partitioned into  submatrices of size  × , denoted by W 0 , W 1 , . . ., W  , . . ., W −1 , where W  = [W 0,− , . . ., W 0,−1 , W 0,0 , . . ., W 0,−−1 ], 0 <  ≤  − 1. Hence, we can form the following block-cyclic structured matrix B  from B  :

For 0 ≤  <  and 1 ≤ ,  ≤ , we take an  ×  submatrix R 0, from W 0, . For the submatrix R 0, , we restrict the locations of its entries in W 0, to be identical (i.e., for 0 ≤  1 ,  2 < , the entries of R 0, 1 and R 0, 2 are taken from identical locations in W 0, 1 and W 0, 2 ). Then, we form the following  ×  array B  of  ×  submatrices over GF(q) with a block-cyclic structure:

We denote the set of rows in the first row blocks of B  by Γ = [R 0,0 , R 0,1 , . . ., R 0,−1 ]. So, B  is a submatrix of B  . In forming the  ×  array B  of  ×  submatrices over GF(q) given by (4), there are − rows in each row-block and ( − ) columns in each column-block of B  which are unused. We denote the set of ( − ) unused rows of B  in each row-block by Π 1 . So, there are  sections which have a total of  components in each row of Π 1 . For each section in a row, we remove the  −  components which are not used in forming the array B  from B  . We denote the resulting set of ( − ) rows by Π * 1 . The rows in set Π * 1 and the rows in set Γ are disjoint. Let  and  be two positive integers with 1 ≤  ≤ ( − ) and 1 ≤  ≤ . We remove the last  −  sections from Π * 1 and take  rows from the remaining sections of Π * 1 to form an  ×  matrix X  over GF(q). By taking the × diagonal array from the main diagonal of B  and appending the matrix X  to the bottom of it, we form the following base matrix of a QC-GC-LDPC code:

Each entry in the upper subarray of B , is equal to R 0,0 which is an × matrix, and this subarray is called the local part. Note that we can use  different member matrices in the set {R 0,0 , R 0,1 , . . ., R 0,−1 } as the matrices on the main diagonal of the  ×  upper subarray of B , as well. We call the lower subarray of B , the global part. The (q − 1)-fold dispersion of B , results in an (+)× array H  of (q−1)×(q−1) CPMs and/or ZMs. The null space of H  gives a CN-based QC-GC-LDPC code whose Tanner graph has a girth of at least 6.
Let   be the design rate of a ( V ,  V ;   ,   ) binary CN-GC-LDPC code, where  V and  V denote the variable node (VN) degrees of the local part and the global part, respectively, and   and   denote the CN degrees of the local part and the global part, respectively. Then, the design rate is determined by these four degrees.
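The design rate can be illustrated with an edge-counting argument: in a regular code the ratio of CNs to VNs in each part equals the ratio of VN degree to CN degree in that part. The sketch below is a minimal illustration rather than the paper's own formula; the names `dv_local`, `dv_global`, `dc_local`, and `dc_global` are hypothetical, and the split of the column weight 6 into 5 local edges and 1 global edge is an assumption chosen because it reproduces the 3072 × 15552 matrix (row weights 27 and 81) used in Section 3.

```python
from fractions import Fraction


def design_rate(dv_local, dv_global, dc_local, dc_global):
    """Design rate of a regular (dv_local, dv_global; dc_local, dc_global) code.

    Counting edges from both sides gives
        (#local CNs)  / (#VNs) = dv_local  / dc_local
        (#global CNs) / (#VNs) = dv_global / dc_global
    so the design rate is 1 minus the sum of the two ratios.
    """
    return 1 - Fraction(dv_local, dc_local) - Fraction(dv_global, dc_global)


# ASSUMPTION: column weight 6 = 5 local edges + 1 global edge per VN.
rate_from_degrees = design_rate(5, 1, 27, 81)

# Rate implied by the matrix dimensions, assuming the matrix has full rank.
rate_from_size = 1 - Fraction(3072, 15552)

print(rate_from_degrees, float(rate_from_degrees))
```

Both computations give the same fraction, which supports the assumed degree split but does not prove it is the one used by the authors.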

Rate-Compatible GC-LDPC Codes
In this section, we first introduce the concept of graph extension through a four-edge-type LDPC code. Then, we present a construction method of RC GC-LDPC codes.

Four-Edge-Type LDPC Codes.
The Tanner graph of a four-edge-type LDPC code is illustrated in Figure 2. We partition the VNs into  sections, denoted by p(0), p(1), . . ., p( − 1), each containing   VNs. Every p() is then split into two parts: the mother VNs p  () of length   () and the extended VNs p  () of length   (); that is, p() = [p  (), p  ()]. We denote the corresponding coded symbols of VNs as a vector v of size . At the th section, we denote the corresponding symbol vectors of p  () and p  () , respectively. Then, we denote by q  the   CNs in the lower part which are called global CNs. In the upper part, we partition the CNs into  sections, each containing   CNs, denoted by q(0), q(1), . . ., q( − 1). We split each section of such CNs into two parts: the mother CNs q  () of length   () and the extended CNs q  () of length   (), where 0 ≤  < . We refer to the edges connecting the mother VNs p  () and the mother CNs q  () as the type 1 edges, the edges connecting the mother VNs p  () and the extended CNs q  () as the type 2 edges, the edges connecting the extended VNs p  () and the extended CNs q  () as the type 3 edges, and the edges connecting the mother VNs p  () and the global CNs q  as the type 4 edges.
From Figure 2, we can see that the mother VNs are connected not only to the global CNs by the type 4 edges but also to the mother CNs and the extended CNs by the type 1 edges and the type 2 edges, respectively. The extended VNs, which are equal in number to the extended CNs, are connected only to the extended CNs by the type 3 edges. This means that we can form GC-LDPC codes by connecting the mother CNs and the global CNs to the mother VNs in the graph. By adding new nodes and then new edges of type 2 and type 3, we can successively form new GC-LDPC codes. Note that the type 2 edges are the only edges connecting the mother nodes and the extended nodes, so they are quite different from the type 1 edges. Since there are four different types of edges in the Tanner graph, we refer to such LDPC codes as four-edge-type LDPC codes.
For 0 ≤  < , the submatrices which correspond to the four types of edges are denoted by

Then, the parity-check matrix H FET of four-edge-type LDPC codes can be represented by

Analogous to GC-LDPC codes, we call the upper subarray the local part and the lower subarray the global part. We extract H  (), H → (), and H  () from H FET to form an  local () ×  local () matrix over GF(q) as follows:

where 0 is an all-zero matrix,  local () =   () +   (), and  local () =   () +   () for 0 ≤  < . So, the corresponding base matrices of H local () and H FET are B  and B  , respectively. Then, the four-edge-type LDPC code satisfies the following parity-check equation:

We denote the LDPC code given by the null space of H FET as C FET and the LDPC code given by the null space of H local () as C local (), where 0 ≤  < . Notice that H  () is square and has a full rank for 0 ≤  < , and the entries above H  () have to be all zero to achieve rate compatibility; thus, C local () is an RC-LDPC code [16]. We extract H  () and H → () from H FET to form an (  + ∑    ()) × ∑    () matrix A over GF(q) as follows:

Then, based on (8) and (9), we have

This means that H FET is an RC parity-check matrix obtained progressively by extension, in order of decreasing rate [17]. Furthermore, all the mother VNs are connected to q  and there are no edges connecting p() and q() for  ≠ ; thus, the four-edge-type LDPC codes can be viewed as a type of RC CN-based QC-GC-LDPC codes.
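The block lower-triangular shape of the local matrix, with an all-zero block above the square full-rank extension block, can be sketched on a toy binary example. The matrices `H_M`, `H_ME`, and `H_E` below are hypothetical stand-ins for the type 1, type 2, and type 3 submatrices of one section, and the extension block is taken to be the identity purely for simplicity; none of these values come from the paper.

```python
def mat_vec_gf2(matrix, vector):
    """Multiply a binary matrix by a binary vector over GF(2)."""
    return [sum(a & b for a, b in zip(row, vector)) % 2 for row in matrix]


def extend_local_matrix(h_m, h_me, h_e):
    """Stack the blocks as [[H_m, 0], [H_me, H_e]] (list of rows)."""
    n_e = len(h_e[0])
    top = [row + [0] * n_e for row in h_m]             # zeros above H_e
    bottom = [r_me + r_e for r_me, r_e in zip(h_me, h_e)]
    return top + bottom


# Hypothetical toy mother code: one parity check on three mother VNs.
H_M = [[1, 1, 1]]
# Two extended CNs, tied to mother VNs (type 2) and to new VNs (type 3).
H_ME = [[1, 0, 1],
        [0, 1, 1]]
H_E = [[1, 0],
       [0, 1]]  # ASSUMPTION: identity, the simplest full-rank choice

H_LOCAL = extend_local_matrix(H_M, H_ME, H_E)


def encode_extension(v_m):
    """With H_e = I, the extended checks give v_e = H_me * v_m over GF(2)."""
    return v_m + mat_vec_gf2(H_ME, v_m)


v_m = [1, 1, 0]               # a codeword of the toy mother code
v = encode_extension(v_m)     # mother bits followed by extended bits
print(v, mat_vec_gf2(H_LOCAL, v))
```

Because the top-right block is zero, the first three bits of any extended codeword still satisfy the mother checks, which is the nesting property that rate compatibility relies on. Here the mother code has 2 information bits in 3 VNs and the extended code has the same 2 information bits in 5 VNs.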

Construction of Rate-Compatible GC-LDPC Codes.
We now present a method to construct RC GC-LDPC codes. Because the extended VNs are not connected to q  , the construction of RC GC-LDPC codes can be realized by successively extending the graph of the local part. The successive extension of the parity-check matrices of the local part in the th section is shown in Figure 3.
Assume that we intend to construct a family of four-edge-type LDPC codes containing  member codes C = {C FET,0 , C FET,1 , . . ., C FET,−1 }. The corresponding target rate set R = { FET,0 ,  FET,1 , . . .,  FET,−1 } is given. We denote H FET, as the  FET, ×  FET, parity-check matrix of C FET, , where 0 ≤  < . At the local part of H FET, , there are  submatrices on the main diagonal of size  local, ()× local, () and we denote each submatrix as H local, (), where 0 ≤  <  and 0 ≤  < . So, H local,0 () is the mother matrix of H local,1 (). And the code rate of C FET,0 is

Suppose H FET,0 has a full rank:  combine the  extended submatrices to compose the global part of H FET,1 . Suppose H FET,1 has a full rank; we have

Recursively, for the code C FET,+1 , we can obtain its parity-check matrix H FET,+1 from the previously generated parity-check matrix H FET, for C FET, , 0 ≤  ≤ −2. Suppose H FET,+1 has a full rank; the rate of the code C FET,+1 satisfies

In particular, it is not necessary for H FET,1 to be a full-rank matrix for constructing the RC GC-LDPC codes. Using elementary row and column operations of H FET, , we have

where

and 0 <  ≤  − 1. Consider H , () as a nonsingular matrix, where 0 ≤  ≤  − 1; the rank Rank(H FET, ) of H FET, can be written as

where 0 <  ≤  − 1. It is clear that H FET, can be a full-rank matrix if and only if H FET,0 is a full-rank matrix. The construction for the parity-check matrix of the RC GC-LDPC codes is described in detail in Algorithm 1.
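The rank statement can be checked numerically on small binary matrices: when the extension block is nonsingular, the rank of the extended matrix equals the rank of the mother matrix plus the number of extended rows, so the extended matrix has full rank exactly when the mother matrix does. The following is a minimal sketch with hypothetical toy matrices and a helper `gf2_rank` written for this illustration; it is not taken from the paper.

```python
def gf2_rank(rows):
    """Rank over GF(2) of a binary matrix given as a list of 0/1 rows."""
    # Pack each row into an integer so elimination is a single XOR.
    masks = [int("".join(map(str, row)), 2) for row in rows]
    rank = 0
    while masks:
        pivot = masks.pop()
        if pivot == 0:
            continue
        rank += 1
        low_bit = pivot & -pivot  # lowest set bit acts as the pivot column
        masks = [m ^ pivot if m & low_bit else m for m in masks]
    return rank


def extend(h_m, h_me, h_e):
    """Block matrix [[H_m, 0], [H_me, H_e]]."""
    n_e = len(h_e[0])
    return ([row + [0] * n_e for row in h_m]
            + [a + b for a, b in zip(h_me, h_e)])


H_ME = [[1, 0, 1], [0, 1, 1]]          # hypothetical type 2 block
H_E = [[1, 1], [0, 1]]                 # hypothetical nonsingular block

full_rank_mother = [[1, 1, 1]]         # rank 1, one row: full rank
deficient_mother = [[1, 1, 1],
                    [1, 1, 1]]         # rank 1, two rows: rank deficient

for h_m in (full_rank_mother, deficient_mother):
    h_ext = extend(h_m, H_ME, H_E)
    print(gf2_rank(h_m), gf2_rank(h_ext), len(h_ext))
```

In both cases the rank grows by exactly the two extended rows, so only the extension of the full-rank mother matrix is itself full rank.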
Furthermore, the RC CN-based QC-GC-LDPC codes with extension structure allow more efficient encoding. In particular, for 0 ≤  ≤ −1 and 0 ≤  ≤ −1, if H , () is an identity matrix, the encoding of such RC CN-based QC-GC-LDPC codes is more efficient: after encoding the mother VNs, the encoding of the extended VNs only involves XOR operations on the precode output symbols.

Algorithm 1: Construction of the parity-check matrix of the RC GC-LDPC codes.
Input: , , , , and R
Output: H FET
(1) Use the construction method described in Section 2 to form an  FET,0 ×  FET,0 matrix H FET,0 which specifies a GC-LDPC code C FET,0 with rate  FET,0 .
(2) for  = 0 :  − 2 do
(3)   Use H FET, as the mother matrix.
(4)   for  = 0 :  − 1 do
(5)     Generate a matrix H →,+1 () of size  ,+1 () ×  local, ().
(6)     Generate a matrix H ,+1 () of size  ,+1 () ×  ,+1 () which is square and has a full rank.
(7)   for  = 0 :  − 1 do
(8)     Compose the parity-check matrix H local,+1 () of the form
(9)   for  = 0 :  − 1 do
(10)    Compose the matrix H →,+1 () of the form H →,+1 () = [H →, () 0], where 0 is an all-zero matrix of size   ×  ,+1 ().
(11)  Compose the matrix H FET,+1 of the form
      update  :  =  + 1.

Example 2. Based on the construction method described in Section 2, we form a 3072 × 15552 binary matrix H ,3 (192,192). It has one column weight, 6, and two row weights, 27 and 81. Based on base graph 1 in the 5G standard, we construct a masking matrix D local of size 960 × 5184 for each submatrix on the main diagonal at the local part of H ,3 (192,192) [28]. In particular, the first 2×192 columns of D local are punctured columns. The degree distribution for D local is λ(x) = 0.0127 + 0.0759x + 0.7975x^2 + 0.0506x^3 + 0.0633x^4, ρ(x) = 0.0380x^2 + 0.962x^18, where λ(x) and ρ(x) are the variable and check degree distributions from the edge perspective. We can construct a 3072 × 15552 matrix H FET,0 using D local to mask each submatrix on the main diagonal at the local part of H ,3 (192,192). Suppose  = 20. Based on the construction method described in Algorithm 1, we construct a family of four-edge-type LDPC codes. In particular, by applying graph extension, we obtain the protomatrix of H →, () and H , (), where 1 ≤  ≤ 19 and 0 ≤  ≤ 2. Here, p  () and q  () are connected by one edge if and only if  = . In the terminology of protograph construction, lifting the protograph of H →, () and H , () is equivalent to dispersing their base matrix [11]. Then, we can use the method in [29] to find circulants for the protomatrix of H →, () and H , (). We refer to the th member of the resulting family of four-edge-type LDPC codes as C FET, , where 0 ≤  ≤ 19. Because the four-edge-type LDPC code is constructed based on a finite field, which guarantees the structural property of algebraic construction, the Tanner graphs of such codes have a girth of at least 6. The parameters of this family of four-edge-type LDPC codes are summarized in Table 1 and the diagram of its matrix is shown in Figure 4.
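The edge-perspective degree distribution quoted for D local can be converted into node counts as a sanity check. The sketch below assumes the usual convention that the coefficient of x^(d−1) is the fraction of edges attached to degree-d nodes, and assumes a 5 × 27 base matrix for D local (960/192 rows and 5184/192 columns, taking 192 as the circulant size); both assumptions are illustrative and are not stated explicitly in the text.

```python
# Edge-perspective coefficients, keyed by node degree d (coefficient of x^(d-1)).
LAMBDA = {1: 0.0127, 2: 0.0759, 3: 0.7975, 4: 0.0506, 5: 0.0633}
RHO = {3: 0.0380, 19: 0.962}

BASE_ROWS, BASE_COLS = 960 // 192, 5184 // 192   # ASSUMPTION: 5 x 27 base matrix


def node_fractions(edge_dist):
    """Convert edge-perspective fractions into node-perspective fractions."""
    weights = {d: frac / d for d, frac in edge_dist.items()}
    total = sum(weights.values())
    return {d: w / total for d, w in weights.items()}


vn_counts = {d: round(BASE_COLS * f) for d, f in node_fractions(LAMBDA).items()}
cn_counts = {d: round(BASE_ROWS * f) for d, f in node_fractions(RHO).items()}

vn_edges = sum(d * c for d, c in vn_counts.items())
cn_edges = sum(d * c for d, c in cn_counts.items())
print(vn_counts, cn_counts, vn_edges, cn_edges)
```

Under these assumptions the two sides of the base graph carry the same number of edges, which suggests that the rounded coefficients in the text are mutually consistent.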
Example 3. In order to improve the flexibility of the code rate, we can puncture parity bits from a family of four-edge-type LDPC codes. Based on H FET,1 presented in Example 2, for instance, we set the last 98 columns of H ,1 () as punctured columns, where 0 ≤  ≤ 2. Then, we obtain a (14682, 12480) GC-LDPC code with a rate of 0.85 and denote it as C 3 .
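The length and rate stated in Example 3 can be reproduced with simple arithmetic. The sketch below assumes that H FET,0 has full rank (so that 15552 − 3072 information bits are carried), that the 2 × 192 punctured columns of each D local are not transmitted, and that one extension step adds 192 VNs per section; the last value is an inference chosen because it yields the stated length 14682, not a figure given in the text.

```python
from fractions import Fraction

SECTIONS = 3
MOTHER_VNS = 15552
INFO_BITS = MOTHER_VNS - 3072            # full rank assumed
PUNCTURED_MOTHER = SECTIONS * 2 * 192    # first 2 x 192 columns of each D_local
EXTENSION_PER_SECTION = 192              # ASSUMPTION: inferred, not stated


def transmitted_length(num_extensions, extra_punctured_per_section=0):
    """Number of transmitted bits after extension and puncturing."""
    total_vns = MOTHER_VNS + SECTIONS * EXTENSION_PER_SECTION * num_extensions
    return (total_vns - PUNCTURED_MOTHER
            - SECTIONS * extra_punctured_per_section)


n_c3 = transmitted_length(num_extensions=1, extra_punctured_per_section=98)
rate_c3 = Fraction(INFO_BITS, n_c3)
print(n_c3, INFO_BITS, round(float(rate_c3), 4))
```

Puncturing fewer or more of the extended columns moves the rate between those of two neighbouring family members, which is how puncturing refines the rate granularity of the family.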

Local/Global Two-Phase Decoding Scheme
For the classical iterative decoding scheme, the decoder includes all the VNs in a block and performs all decoding operations in one phase [30]. We refer to such an iterative decoding scheme as the one-phase iterative scheme. In contrast, Li et al. devised a two-phase local/global iterative scheme for CN-GC-LDPC codes [1,2]. Taking advantage of the cascading structure of the local part, we can split the local part into  independent sections, and each section can use an independent decoder at the local phase. If all sections of the local part are successfully decoded and the locally decoded codeword satisfies the parity-check constraints in the global part, the locally decoded codeword is delivered to the user. Otherwise, the global decoder starts to process the received codeword from the local decoder. In a good channel environment, we only need to use a group of ( or less) local decoders to process a group of ( or less) consecutive received sections in parallel. This means that we can process a consecutive sequence by some local sequences for a GC-LDPC code. The advantage of this decoding scheme is that the decoder has lower latency and smaller memory requirements. We refer to such a scheme as the normal two-phase local/global iterative scheme. However, in a bad channel environment, we find that the local decoder performs many useless operations, which causes unnecessary decoder cost. Therefore, we present a modified two-phase local/global iterative decoding scheme for CN-GC-LDPC codes.

Modified Local/Global Two-Phase Iterative Decoding Scheme.
Let z = (z 0 , z 1 , . . ., z −1 ) be the received sequences. Firstly, in the local phase of decoding, the  received sequences are decoded by  independent decoders with the maximum iteration number  max 1 . If all the sections are successfully decoded and the locally decoded codeword satisfies the parity-check constraints in the global part, the locally decoded codeword is delivered to the user. If one of the decoders fails to decode a received section, we switch the decoding from the local phase to the global phase. If all sections are successfully decoded, but the locally decoded codeword does not satisfy the parity-check constraints in the global part, we save the decoded information (s) and return to the local decoder, with the maximum iteration number of the local decoders now set to  max 2 . Then, if one of the decoders fails to decode a received section or the locally decoded codeword does not satisfy the parity-check constraints in the global part, we switch the decoding from the local phase to the global phase.
In the global phase of decoding, a global decoder is activated. It processes the received vector z with the channel information and the combined decoded information (s) of successfully decoded sections as inputs. The diagram of the modified two-phase local/global iterative decoding scheme is illustrated in Figure 5.
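The control flow of the modified scheme can be sketched as follows. This is only an outline under stated assumptions: the decoders are passed in as hypothetical callables (`local_decode`, `global_check`, `global_decode`), the way saved decoded information is fed back into the second local pass is an assumption, and the behaviour when the second local pass does not yield a valid codeword is assumed to be a switch to the global phase.

```python
def modified_two_phase_decode(sections, local_decode, global_check,
                              global_decode, i_max1, i_max2):
    """Control-flow sketch of a modified two-phase local/global decoder.

    Hypothetical callables (not from the paper):
      local_decode(section, max_iter, start) -> (success, decoded_section)
      global_check(decoded_sections)         -> True if global checks hold
      global_decode(sections, side_info)     -> decoded codeword
    """
    # First local pass with the smaller iteration budget i_max1.
    results = [local_decode(z, i_max1, None) for z in sections]
    if not all(ok for ok, _ in results):
        # A section failed: go to the global phase immediately, passing
        # along whatever sections were decoded successfully.
        side_info = [d if ok else None for ok, d in results]
        return "global", global_decode(sections, side_info)

    decoded = [d for _, d in results]
    if global_check(decoded):
        return "local-1", decoded

    # All sections decoded but global checks fail: second local pass.
    # ASSUMPTION: the saved decisions are reused as the starting point.
    results = [local_decode(z, i_max2, d) for z, d in zip(sections, decoded)]
    decoded = [d for _, d in results]
    if all(ok for ok, _ in results) and global_check(decoded):
        return "local-2", decoded

    side_info = [d if ok else None for ok, d in results]
    return "global", global_decode(sections, side_info)


# Stub decoders that only exercise the branches.
always_ok = lambda section, max_iter, start: (True, section)
outcome = modified_two_phase_decode([[0, 1], [1, 0]], always_ok,
                                    lambda decoded: True,
                                    lambda sections, info: sections, 30, 20)
print(outcome[0])
```

The early exit after the first pass is what saves local iterations on a bad channel: a failed section costs at most the first iteration budget before the global decoder takes over.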
We define the order of decoding complexity as the number of operations required per information bit and denote the order of decoding complexity for the normal two-phase decoding algorithm as O normal . Suppose the number of operations for one iteration of the global part is   and the number of operations for one iteration in the th section of the local part is   (), where 0 <  ≤  − 1. As stated in [1][2][3], we have

Wireless Communications and Mobile Computing
where  is the length of the information bits,   () (0 ≤   () ≤  max ) is the number of iterations involving updates of variables in the th section of the local part, and   is the number of iterations involving updates of variables in the global part.Then, the order of decoding complexity of the modified two-phase decoding algorithm, which is denoted as O modified , can be summarized as where

Numerical Results
In this section, we first present the simulation performance for RC GC-LDPC codes over the AWGN channel. Then, we compare the decoding complexity of different decoding schemes presented in Section 4. The decoding latency with different decoding schemes is also discussed.

Error-Correcting Performance.
We now provide the simulated BER and BLER performances for RC GC-LDPC codes over the AWGN channel with QPSK signaling. Unless otherwise specified, all the simulations are performed using the belief propagation (BP) algorithm with a maximum iteration number of 50. The BER and BLER performances for different code rates are plotted in Figure 6 together with the corresponding Shannon limits. The iterative decoding thresholds achieved by the proposed RC GC-LDPC codes are summarized in Table 2. It can be seen that the gaps between the iterative decoding thresholds and the Shannon limits are very small. Figure 7 depicts the BER and BLER performances

Decoding Complexity.
Figures 10 and 11 depict the decoding complexity of C 1 and C FET,0 with one-phase, normal two-phase local/global, and modified two-phase local/global iterative schemes based on the BP decoding algorithm, respectively. All operations associated with modulo-2 arithmetic have been neglected, as is conventionally done. The decoding complexity associated with the BP algorithm is evaluated based on the forward and backward recursions proposed in [7].

Decoding Latency.
The decoding delay in a data transmission system is defined in [31] as the delay incurred in receiving the coded bits before decoding takes place plus the ensuing decoder processing delay. In this paper, we assume that all schemes being compared have approximately the same decoding complexity and that the decoder processing time is negligible. For the one-phase iterative scheme, no information symbols are decoded until an entire block is received. Thus, the maximum decoding delay experienced by an information bit when an LDPC code is used with the one-phase iterative scheme is the arrival time of one incoming block. Suppose its number of iterations is  total . Then, the total decoding latency in received symbols and the total number of soft received values that must be stored in the decoder memory at any given time (decoding latency for short) can be represented by   total , where  is the total number of symbols. This means that the latency and memory requirements of the two-phase local/global iterative scheme are much less than those of the one-phase iterative scheme when the channel environment is better.

Conclusion
In this paper, we introduced the graph extension through a four-edge-type LDPC code and presented a family of RC GC-LDPC codes; they are constructed by combining algebraic method and graph extension.It was shown that the proposed family of RC CN-GC-LDPC codes can provide more flexibility in code rate and guarantee the structural property of algebraic construction.It is confirmed, by numerical simulations over the AWGN channel, that the proposed RC GC-LDPC codes outperform their counterpart QC-GC-LDPC codes formed by the method in [1,2] in terms of waterfall performance and exhibit an approximately uniform gap to the capacity over a wide range of rates.Moreover, we presented a modified two-phase local/global iterative scheme which can reduce unnecessary cost of local decoders at low and moderate SNRs.

Figure 2: The Tanner graph of four-edge-type LDPC codes. Node groups labeled in the figure for section 0: mother VNs p m (0), extended VNs p e (0), mother CNs q m (0), extended CNs q e (0), and global CNs q g (0).

Figure 3: Parity-check matrix extension in the th section of local part.

Figure 4: Extension structure of parity-check matrix of the RC GC-LDPC codes in Example 2.

Figure 6: The BER and BLER performances for different code rates of C FET,0 , C FET,2 , C FET,4 , C FET,8 , and C FET,19 over the AWGN channel with QPSK signaling.

Figure 7: The BER and BLER performances of QC-GC-LDPC codes of C 1 , C 2 , C FET,0 , and C 3 over the AWGN channel with QPSK signaling.
Figure legend: one-phase scheme; normal two-phase scheme (global, local); modified two-phase scheme (global, local). Horizontal axis: E b /N 0 (dB).

Figure 9: Average iteration number of one-phase, normal two-phase local/global, and modified two-phase local/global iterative scheme based on BP decoding algorithm with C FET,0 .

Table 1: Parameters of an RC GC-LDPC code.
and   2 () (0 ≤   2 () ≤  max 2 ) are the numbers of iterations involving updates of variables in the th section of the local part in which the maximum iteration numbers are  max 1 and  max 2 , respectively. As can be seen, in a bad channel environment, few sections are successfully decoded in the first local phase, and each successfully decoded section performs no more than  max 1 iterations. So, we have

Considering  max 1 (0) ≪  max , we have O modified ≪ O normal . Moreover, in a good channel environment, most successfully decoded sections satisfy the parity-check constraints in the global part. No more than ( max 1 +  max 2 ) iterations are required in each successfully decoded section. For  max =  max 1 +  max 2 , we have O modified ≈ O normal .
Figures 8 and 9 depict the average iteration number of C 1 and C FET,0 with one-phase, normal two-phase local/global, and modified two-phase local/global iterative schemes based on the BP decoding algorithm, respectively. For both C 1 and C FET,0 , the maximum iteration number of the one-phase iterative scheme is set to 50. The maximum iteration numbers in the local decoder and the global decoder of the normal two-phase local/global iterative scheme are 50 and 100, respectively. For the modified two-phase local/global iterative scheme,  max 1 ,  max 2 , and the maximum iteration number in the global decoder of C 1 are 30, 20, and 50, respectively, and  max 1 ,  max 2 , and the maximum iteration number in the global decoder of C FET,0 are 60, 40, and 100, respectively. Based on Figures 8 and 9, we conclude that the normal two-phase local/global iterative scheme requires a significantly higher number of iterations than the modified two-phase local/global iterative scheme at the local phase and needs approximately the same number of iterations as the modified scheme at the global phase, especially at low and moderate SNRs. At high SNRs, two-phase local/global iterative scheme requires

Table 2: Parameters of an RC GC-LDPC code.
For C 1 , the total complexity associated with one iteration of BP consists of 877,338 real multiplications, 104,328 real divisions, and 282,366 real additions at the global phase. At the local phase, it consists of 237,258 real multiplications, 29,736 real divisions, and 79,002 real additions in each local decoder. For C FET,0 , the total complexity associated with one iteration of BP consists of 559,872 real multiplications, 76,608 real divisions, and 198,720 real additions at the global phase. At the local phase, it consists of 129,984 real multiplications, 20,352 real divisions, and 20,352 real additions in each local decoder. Based on Figures 10 and 11, we conclude that the normal two-phase local/global iterative scheme requires significantly more operations than the modified two-phase local/global iterative scheme at low and moderate SNRs.

Figure 8: Average iteration number of one-phase, normal two-phase local/global, and modified two-phase local/global iterative scheme based on BP decoding algorithm with C 1 .
For the two-phase local/global iterative scheme, all information symbols are assigned to  local decoders. Suppose the number of iterations in the global phase is  global . For the th local decoder in the two-phase local/global iterative scheme, suppose the number of iterations is   . So,  global  + ∑ −1 =0    local represents the decoding latency of the two-phase local/global iterative scheme. By using  local decoders in fully parallel local phase decoding, the maximum decoding delay experienced by an information bit is the arrival time of each incoming block in the local decoder. Then, the decoding latency reduces to  global  + max{   local , 0 ≤  ≤  − 1}. Note that  global decreases with the increase of SNR. At moderate and high SNRs, the decoding latency approaches max{   local , 0 ≤  ≤  − 1} when  global → 0. Since  local ≪  and  total ≈ max{  , 0 ≤  ≤  − 1}, then max{   local , 0 ≤  ≤  − 1} ≪  total .

Figure 11: Decoding complexity of C FET,0 with one-phase, normal two-phase local/global, and modified two-phase local/global iterative scheme based on BP decoding algorithm.
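The latency expressions above can be compared numerically. In the sketch below the function and parameter names are hypothetical, the block and section lengths (15552 and 5184 symbols) are borrowed from the three-section matrix of Example 2, and the iteration counts are made-up values used only to show the trend; they are not simulation results from the paper.

```python
def one_phase_latency(n_total, i_total):
    """Latency of the one-phase scheme: block length times iterations."""
    return n_total * i_total


def two_phase_latency(n_total, n_local, i_global, local_iters, parallel=True):
    """Latency of the two-phase scheme.

    Sequential local decoding sums the per-section terms; fully parallel
    local decoding is limited by the slowest local decoder.
    """
    local_terms = [i * n_local for i in local_iters]
    local_part = max(local_terms) if parallel else sum(local_terms)
    return i_global * n_total + local_part


N_TOTAL, N_LOCAL = 15552, 5184          # three sections of 5184 symbols
LOCAL_ITERS = [8, 10, 9]                # hypothetical iteration counts

one_phase = one_phase_latency(N_TOTAL, i_total=10)
parallel_high_snr = two_phase_latency(N_TOTAL, N_LOCAL, 0, LOCAL_ITERS)
sequential_high_snr = two_phase_latency(N_TOTAL, N_LOCAL, 0, LOCAL_ITERS,
                                        parallel=False)
print(one_phase, parallel_high_snr, sequential_high_snr)
```

With no global iterations and comparable iteration counts, the parallel two-phase latency shrinks roughly by the number of sections, which matches the qualitative conclusion drawn in the text for moderate and high SNRs.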