Construction of Quasi-Cyclic LDPC Codes Based on Fundamental Theorem of Arithmetic

Quasi-cyclic (QC) LDPC codes play an important role in 5G communications and have been chosen as the standard codes for the 5G enhanced mobile broadband (eMBB) data channel. In this paper, we study the construction of QC LDPC codes based on an arbitrary given expansion factor (or lifting degree). First, we analyze the cycle structure of QC LDPC codes and give the necessary and sufficient condition for the existence of short cycles. Based on the fundamental theorem of arithmetic in number theory, we divide the integer factorization into three cases and present three classes of QC LDPC codes accordingly. Furthermore, a general construction method for QC LDPC codes with girth of at least 6 is proposed. Numerical results show that the constructed QC LDPC codes perform well over the AWGN channel when decoded with iterative algorithms.


Introduction
Low-density parity-check (LDPC) codes [1] are a class of modern channel codes. Because they approach the Shannon capacity and admit iterative decoding algorithms of low complexity, LDPC codes have attracted great interest from both industry and academia. For various specific communication systems [2][3][4], LDPC codes have been well designed and chosen as the standard codes. As an important scenario of 5G communications, the enhanced mobile broadband (eMBB) data channel has adopted the LDPC coding scheme [5], and the LDPC codes were determined after several rounds of discussions [6][7][8][9][10][11][12]. However, the other two scenarios of 5G communications, that is, ultrareliable and low latency communications (URLLC) and massive machine-type communication (mMTC), have no candidate channel coding at present. The promising coding techniques for 5G communication systems are turbo codes, binary/nonbinary LDPC codes, spatially coupled (SC) LDPC codes [13], block Markov superposition transmission (BMST) [14], and polar codes. Comparisons of their encoding/decoding complexity, performance, spectral efficiency, and robustness can be found in [15]. Recently, some low-complexity decoding algorithms for these modern channel codes have been proposed [16,17]. These significant works can facilitate and accelerate the application of these modern coding techniques in 5G communications. According to the definition and description of URLLC and mMTC provided by ITU-R [18], these two scenarios require low latency and high reliability. That is, short-packet communication with no visible error floor down to a block error rate (BLER) of $10^{-5}$ should be considered. Research results [19] show that LDPC codes perform well in both the waterfall and error-floor regions. Moreover, LDPC codes are robust [15,20], so their good performance can also be obtained over various channels. Hence, LDPC coding remains strongly competitive for URLLC and mMTC applications.
LDPC codes can be divided into two major classes: (1) random-like codes constructed by computer search using efficient algorithms [21,22] and (2) structured codes constructed from algebraic tools, combinatorial structures, and graphs, such as finite geometries [23], finite fields [24], balanced incomplete block designs (BIBDs) [20], resolvable group divisible designs (RGDDs) [25], and protographs [26,27]. Research results show that well-designed algebraic LDPC codes have no error floor down to a bit error rate (BER) of $10^{-15}$ [28]. To facilitate implementation, LDPC codes usually have some special structure, such as a diagonal structure or a quasi-cyclic (QC) structure. In general, QC LDPC codes [29] offer low-complexity encoding and decoding [30,31], easy hardware implementation [32], and good iterative performance [33], and they have therefore attracted comprehensive attention.
To support data packets of various lengths in the eMBB scenario of 5G communications, the designed 5G LDPC codes are chosen as rate-compatible (RC) QC LDPC codes. Notice that the number of expansion factors (or lifting degrees) of 5G QC LDPC codes is limited. On the other hand, some encoding algorithms [34] are only suitable for QC LDPC codes with certain expansion factors (or lifting degrees). Furthermore, the encoding and decoding of QC LDPC codes whose expansion factors (or lifting degrees) are powers of two can be easily implemented by linear shift registers. Hence, it is interesting to construct QC LDPC codes from an arbitrary given expansion factor (or lifting degree).
In this paper, we focus on the construction of QC LDPC codes from given expansion factors (or lifting degrees). We first introduce the fundamental theorem of arithmetic in number theory and divide the integer factorization into three categories. By analyzing the cycle structure of QC LDPC codes, we present three classes of QC LDPC codes based on three families of integers. Furthermore, a general construction of QC LDPC codes with girth of at least 6 based on the fundamental theorem of arithmetic is proposed. Finally, in order to show the good performance of our constructed QC LDPC codes, numerical simulation results are provided.
The rest of this paper is organized as follows. Section 2 introduces the fundamentals of number theory, the definitions, basic concepts, and cycle structure of QC LDPC codes. Section 3 presents three classes of QC LDPC codes and a general construction method. Numerical results are also provided in this section. Finally, Section 4 concludes this paper.

Fundamentals of Number Theory
Theorem 1. Every composite number, which is greater than one, factors uniquely as a product of prime numbers.
This theorem is the well-known fundamental theorem of arithmetic in number theory, and it was proved by Gauss and Clarke [36]. It is also called the unique factorization theorem. That is, every integer greater than one is either prime itself or a product of prime numbers, and this product is unique up to the order of the factors. For example, $1400 = 14 \times 100 = 2 \times 7 \times 2 \times 2 \times 5 \times 5 = 2^3 \times 5^2 \times 7$. The theorem is twofold: first, 1400 can be written as a product of primes, and second, no matter how this is done, there will always be three 2s, two 5s, one 7, and no other primes in the product. Hence, a given integer $L \ge 2$ can be represented by the unique product; that is,

$$L = p_1^{e_1} p_2^{e_2} \cdots p_t^{e_t},$$

where $p_1, p_2, \ldots, p_t$ are prime numbers.
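The factorization in the example above can be reproduced with a few lines of code. The following is a minimal sketch using plain trial division; the function name `prime_factorization` is an illustrative choice and does not come from the paper.

```python
def prime_factorization(n):
    """Return {prime: exponent} for an integer n >= 2, using trial division."""
    if n < 2:
        raise ValueError("n must be an integer >= 2")
    factors = {}
    d = 2
    while d * d <= n:
        # Divide out the current candidate as often as possible.
        while n % d == 0:
            factors[d] = factors.get(d, 0) + 1
            n //= d
        d += 1
    if n > 1:
        # Whatever remains is a prime larger than the square root of the input.
        factors[n] = factors.get(n, 0) + 1
    return factors


print(prime_factorization(1400))  # {2: 3, 5: 2, 7: 1}
```

Trial division is sufficient here because expansion factors are small integers; for very large integers a dedicated factoring routine would be preferable.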
The following three lemmas and one theorem are useful for constructing QC LDPC codes with girth of at least 6.

QC LDPC Codes and Their Associated Tanner Graphs.
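A $(J, K)$-regular quasi-cyclic (QC) LDPC code [29] of length $KL$ can be completely specified by the null space of the following matrix over GF(2):

$$\mathbf{H} = \begin{bmatrix} I(p_{0,0}) & I(p_{0,1}) & \cdots & I(p_{0,K-1}) \\ I(p_{1,0}) & I(p_{1,1}) & \cdots & I(p_{1,K-1}) \\ \vdots & \vdots & \ddots & \vdots \\ I(p_{J-1,0}) & I(p_{J-1,1}) & \cdots & I(p_{J-1,K-1}) \end{bmatrix}, \tag{6}$$

where, for $0 \le i \le J-1$ and $0 \le j \le K-1$, $I(p_{i,j})$ is an $L \times L$ circulant permutation matrix (CPM) with a one at column $(r + p_{i,j}) \bmod L$ for row $r$, $0 \le r \le L-1$, and zero elsewhere. It is clear that $I(0)$ represents the $L \times L$ identity matrix. Notice that the parameter $L$ is referred to as the expansion factor (or lifting degree) [37]. It can be easily observed that the positions of nonzero elements in H are uniquely determined by the following matrix, called the permutation shift matrix or exponent matrix:

$$\mathbf{P} = \begin{bmatrix} p_{0,0} & p_{0,1} & \cdots & p_{0,K-1} \\ p_{1,0} & p_{1,1} & \cdots & p_{1,K-1} \\ \vdots & \vdots & \ddots & \vdots \\ p_{J-1,0} & p_{J-1,1} & \cdots & p_{J-1,K-1} \end{bmatrix}. \tag{7}$$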
That is, there is a one-to-one correspondence between P and H. An LDPC code is commonly described by a bipartite graph known as the Tanner graph [38] in coding theory. The Tanner graph of H, denoted by $G(V, C)$, consists of a set $V$ of variable nodes (containing code symbols of a codeword) and a set $C$ of check nodes (containing local check-sum constraints on the code symbols). An edge in $G(V, C)$ connects the variable node $v_j$ to the check node $c_i$ if and only if the element at column $j$ and row $i$ of H is nonzero. A cycle is formed by a sequence of vertices (or edges) in $G(V, C)$ which starts and ends at the same vertex (or edge) and contains other vertices (or edges) not more than once. A cycle of length $\ell$ is called an $\ell$-cycle for short, and the length of the shortest cycle is called the girth of $G(V, C)$ (or of the LDPC code). As an example, Figure 1 shows the Tanner graph of H and an associated 6-cycle.
In graph theory, the biadjacency matrix $\mathbf{A} = [a_{i,j}]$ of a bipartite graph $G(V, C)$ can be constructed as follows. The rows of A are labeled by the $|C|$ vertices in $C$ and the columns are labeled by the $|V|$ vertices in $V$. The element at the row labeled by the vertex $c_i \in C$ and the column labeled by the vertex $v_j \in V$ is 1 if and only if there exists an edge between the vertices $c_i$ and $v_j$, and 0 otherwise. In fact, for an LDPC code given by the null space of H, H is the biadjacency matrix of its Tanner graph $G(V, C)$.
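Since H is the biadjacency matrix of the Tanner graph, the girth can be computed directly from H. The sketch below runs a breadth-first search from every vertex, which is a standard way to find the shortest cycle of a simple graph; the function name `tanner_girth` and the small test matrices are illustrative assumptions, not taken from the paper or its Figure 1.

```python
from collections import deque


def tanner_girth(H):
    """Length of the shortest cycle in the Tanner graph of H (None if acyclic).

    H is a list of rows over {0, 1}; rows are check nodes, columns are
    variable nodes.
    """
    m, n = len(H), len(H[0])
    # Vertices 0..n-1 are variable nodes, n..n+m-1 are check nodes.
    adj = [[] for _ in range(n + m)]
    for i in range(m):
        for j in range(n):
            if H[i][j]:
                adj[j].append(n + i)
                adj[n + i].append(j)

    best = None
    for root in range(n + m):
        dist = {root: 0}
        parent = {root: -1}
        queue = deque([root])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    parent[w] = u
                    queue.append(w)
                elif parent[u] != w:
                    # A non-tree edge closes a cycle through the BFS tree.
                    cycle = dist[u] + dist[w] + 1
                    if best is None or cycle < best:
                        best = cycle
    return best


# Two variable nodes sharing two check nodes form a 4-cycle.
print(tanner_girth([[1, 1, 0], [1, 1, 1]]))             # 4
# A single 6-cycle: v0-c0-v1-c1-v2-c2-v0.
print(tanner_girth([[1, 1, 0], [0, 1, 1], [1, 0, 1]]))  # 6
```

The search costs one BFS per vertex, which is adequate for matrices of the sizes considered here.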
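Moreover, an isomorphism theory of QC LDPC codes was proposed in [39][40][41] based on the isomorphism of graphs in graph theory. According to the isomorphism of QC LDPC codes, the parity-check matrix in (6) can be simplified as the following matrix:

$$\mathbf{H} = \begin{bmatrix} I(0) & I(0) & \cdots & I(0) \\ I(0) & I(p_{1,1}) & \cdots & I(p_{1,K-1}) \\ \vdots & \vdots & \ddots & \vdots \\ I(0) & I(p_{J-1,1}) & \cdots & I(p_{J-1,K-1}) \end{bmatrix}. \tag{8}$$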
That is why the elements in the first row and first column of the exponent matrix P are usually set to 0 in the research process [29,42]. Hence, we only consider such H and P in the following discussions.
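The correspondence between an exponent matrix and its parity-check matrix can be made concrete with a short sketch. It assumes the convention stated above, namely that the CPM with shift p has its one in column (r + p) mod size for row r; the helper names `cpm` and `expand` and the small exponent matrix are illustrative only.

```python
def cpm(shift, size):
    """size x size circulant permutation matrix: row r has its 1 in column (r + shift) mod size."""
    return [[1 if c == (r + shift) % size else 0 for c in range(size)]
            for r in range(size)]


def expand(exponent_matrix, size):
    """Replace every exponent p by the CPM with shift p and return the binary matrix."""
    H = []
    for row in exponent_matrix:
        blocks = [cpm(p, size) for p in row]
        for r in range(size):
            # Concatenate row r of every block in this block-row.
            H.append([bit for block in blocks for bit in block[r]])
    return H


# Hypothetical 2 x 3 exponent matrix with zero first row and first column.
P = [[0, 0, 0],
     [0, 1, 2]]
H = expand(P, 5)
print(len(H), len(H[0]))  # 10 15
```

Every row of the resulting matrix has weight 3 and every column has weight 2, matching the number of block columns and block rows, respectively.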

Cycle Structure of QC LDPC Codes.
Consider a QC LDPC code C given by the null space of H in (8). It can be seen from [29] that a cycle in the Tanner graph of C is associated with a family of the ordered CPMs in H. As shown in [29], a 2 -cycle in the Tanner graph of the code C (or H) is represented by an ordered sequence of CPMs The above sequence can be simplified as . (11) It can be seen that such a 2 -cycle corresponds to the elements 0 , 0 , 1 , 1 , 2 , 2 , . . . , −1 , −1 in the exponent matrix P. Furthermore, short cycles of QC LDPC codes can be determined by the elements of P [39,40]. Let be the girth of the code C. It can be seen from [43] that, for ≤ 2 ≤ 2 −2, the necessary and sufficient condition for the existence of a 2 -cycle in the Tanner graph of the code C (or H) can be generalized as follows: with 0 = , 0 = , ̸ = +1 , and ̸ = +1 . Note that (12) is not the sufficient condition for the existence of a 2cycle in the Tanner graph of the code C (or H) for 2 ≥ 2 , but it is the necessary condition. Here we give a counterexample. Consider a 2 -cycle ( ≥ ) whose cycle structure is given in Figure 2. Clearly, (12) is satisfied. Let , = + , + for 0 ≤ ≤ − 1, and 0 = 2 , 0 = 2 , ̸ = +1 , ̸ = +1 . According to (12), we have That is, the ordered elements (12) hold, but they do not determine a 4 -cycle. A visual representation is depicted in Figure 2. Therefore, (4) in [29] and (3) in [39] are not applicable to the cycles with lengths larger than 2 − 2.
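For the shortest case, cycles of length 4, the condition from [29] involves only four entries of the exponent matrix: two rows i0, i1 and two columns j0, j1 give a 4-cycle exactly when the alternating sum of the four exponents is zero modulo the CPM size. The sketch below checks this by exhaustive search. The function name `has_4_cycle` is an assumption, and the product-form matrix used in the demonstration is a hypothetical example chosen for illustration; it is not claimed to be the construction of the paper.

```python
from itertools import combinations


def has_4_cycle(P, size):
    """True if the QC LDPC code with exponent matrix P and CPM size `size` has a 4-cycle."""
    rows, cols = len(P), len(P[0])
    for i0, i1 in combinations(range(rows), 2):
        for j0, j1 in combinations(range(cols), 2):
            # Alternating sum around the four CPMs at the corners of the rectangle.
            if (P[i0][j0] - P[i1][j0] + P[i1][j1] - P[i0][j1]) % size == 0:
                return True
    return False


# Hypothetical 4 x 8 exponent matrix with entries i * j (illustration only).
P = [[i * j for j in range(8)] for i in range(4)]
print(has_4_cycle(P, 32))  # False
print(has_4_cycle(P, 12))  # True
```

For this illustrative matrix the alternating sum equals (i0 - i1)(j0 - j1), whose magnitude is at most 3 x 7 = 21, so no 4-cycle can occur when the CPM size is 32, whereas rows 0 and 3 with columns 0 and 4 give 12, which vanishes modulo 12.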

Construction of Quasi-Cyclic LDPC Codes with Girth of at Least 6
As discussed above, there exists a one-to-one correspondence between the exponent matrix P and the parity-check matrix H of a QC LDPC code. Hence, construction of a QC LDPC code is equivalent to the design of its exponent matrix P. In this section, we present three classes of QC LDPC codes with girth of at least 6 and then give a general construction of QC LDPC codes with girth of at least 6 based on an arbitrary integer.
Second, we replace the 0s and $p_{i,j}$ in the designed exponent matrix P with the CPMs $I(0)$ and $I(p_{i,j})$ of the same size $L \times L$, respectively, and then obtain a $J \times K$ array H of $L \times L$ CPMs. This array is a $JL \times KL$ matrix over GF(2) with column and row weights $J$ and $K$, respectively. The null space of this matrix gives a $(J, K)$-regular QC LDPC code.
Remark 6. As shown in [44], girth and short cycles play an important role in the design of LDPC codes. If the above constructed $(J, K)$-regular QC LDPC code does not have good iterative performance, we can replace some CPMs in the above array H with zero matrices (ZMs) of the same size to reduce the number of short cycles and possibly enlarge the girth. This replacement is called masking. On the other hand, if the lengths of the desired QC LDPC codes are shorter than $KL$ or they require much higher code rates, then we can take a $J' \times K'$ subarray of the designed array H, where $J' \le J$ and $K' \le K$. This subarray can be obtained in two steps: (1) choose the first $J'$ row-CPMs of the designed array H; (2) select $K'$ column-CPMs from the $K$ column-CPMs of the designed array H. In this paper, both the masking technique and the selection method in [43] are employed to construct (or further optimize) QC LDPC codes.
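The two operations in Remark 6 can be sketched at the level of the exponent matrix. In this sketch a masked position is represented by `None`, standing for a zero matrix; the function names, the 1-based column convention, and the small matrices are illustrative assumptions. How the columns and the masking matrix are chosen (the method of [43]) is not reproduced here.

```python
def select_subarray(P, num_rows, columns):
    """Keep the first num_rows rows and the listed columns (1-based) of P."""
    return [[P[i][c - 1] for c in columns] for i in range(num_rows)]


def mask(P, M):
    """Apply a 0/1 masking matrix M: positions with M = 0 become None (a zero matrix)."""
    return [[p if m else None for p, m in zip(p_row, m_row)]
            for p_row, m_row in zip(P, M)]


# Hypothetical 3 x 4 exponent matrix and 2 x 3 masking matrix.
P = [[0, 0, 0, 0],
     [0, 1, 2, 3],
     [0, 2, 4, 6]]
sub = select_subarray(P, 2, [1, 3, 4])
print(sub)                               # [[0, 0, 0], [0, 2, 3]]
print(mask(sub, [[1, 0, 1], [1, 1, 0]])) # [[0, None, 0], [0, 2, None]]
```

Keeping the mask separate from the exponents makes it easy to try several masking matrices against the same selected subarray.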

Three Classes of QC LDPC Codes with Girth of at Least 6.
Based on (12), we can see that the Tanner graph of the designed array H contains a 4-cycle if and only if the following equation is satisfied: where $i_0 \ne i_1$ and $j_0 \ne j_1$. It can be observed that the existence of 4-cycles in the Tanner graph of the designed array H is related to $L$. According to the fundamental theorem of arithmetic, the values of $L$ can be divided into three categories, and three classes of QC LDPC codes with girth of at least 6 are proposed. Notice that in all numerical simulations in the following examples, binary phase shift keying (BPSK), the additive white Gaussian noise (AWGN) channel, and the sum-product algorithm (SPA) are assumed.

Example 7. Consider $L = 256 = 2^8$. Let $J = 2^2$ and $K = 2^6$. According to (14), we can obtain the exponent matrix P of size 4 × 64. By employing the method in [43], we select the first 4 rows and the 2nd, 16th, 19th, 35th, 50th, 55th, 62nd, and 63rd columns of P and construct a 4 × 8 array H of 256 × 256 CPMs by replacing the elements of the selected submatrix with the corresponding CPMs. By using the matrix respectively. It can be observed that these two codes have similar performance when decoded using the SPA with various iterations. It is well known that algebraic-based LDPC codes have fast decoding convergence [19,45,46]. The SPA decoding of the proposed LDPC code also converges fast, as shown in Figure 3. We can see that the performance gap between 20 and 50 iterations is less than 0.15 dB at the BER of $10^{-6}$, and the gap is also less than 0.25 dB at the BER of $10^{-7}$; hence this code achieves a fast rate of decoding convergence.

Here, $p_1$ and $p_2$ are two different prime numbers and $e_1$ and $e_2$ are two positive integers. Assume $J = \min\{p_1^{e_1}, p_2^{e_2}\}$ and $K = \max\{p_1^{e_1}, p_2^{e_2}\}$. Without loss of generality, $p_1^{e_1} < p_2^{e_2}$ is assumed. Since $1 \le i_0, i_1 \le p_1^{e_1} - 1$, $1 \le j_0, j_1 \le p_2^{e_2} - 1$, $i_0 \ne i_1$, and $j_0 \ne j_1$, then $1 \le i_0 - i_1 \le p_1^{e_1} - 1$ and $1 \le j_0 - j_1 \le p_2^{e_2} - 1$, where the calculation is taken modulo $p_1^{e_1}$ and modulo $p_2^{e_2}$, respectively.
Hence, (15) is not satisfied according to Lemma 3. That is, the Tanner graph of the designed array H does not contain 4-cycles and the girth of the constructed QC LDPC codes is at least 6.

Example 8. Consider $L = 72 = 2^3 \times 3^2 = 4 \times 18$. Since 12 < 18, let $J = 2^2$ and $K = 18$. According to (14), we can obtain the exponent matrix P of size 4 × 18. By employing the method in [43], we select the first 4 rows and the 1st, 2nd, 3rd, 4th, 5th, 6th, 12th, 13th, 15th, 16th, 17th, and 18th columns of P and construct a 4 × 12 array H of 72 × 72 CPMs by replacing the elements of the selected submatrix with the corresponding CPMs. By using the matrix to mask H, a 288 × 864 matrix with column and row weights 3 and 9, respectively, is obtained. This matrix gives a (3, 9)-regular (864, 576) QC LDPC code. The bit/word error rates (BERs/WERs) of this code decoded by the SPA with 50 iterations are shown in Figure 4. Also shown in Figure 4 is the performance of the (3, 9)-regular (864, 576) algebraic QC LDPC code constructed from the finite field GF(73) [35]. The exponent and masking matrices of this algebraic QC LDPC code are respectively. Notice that the CPM size of this algebraic code is 72 × 72. It can be observed that these two codes also have similar performance. Moreover, the cycle distributions of these two codes are given in Table 1. We can see that although the proposed code has fewer shortest cycles than the algebraic QC LDPC code, the proposed code has many more cycles of length 8 than the algebraic QC LDPC code. That is why the proposed code does not perform better than the algebraic QC LDPC code in the high-SNR region.

Example 9. Consider $L = 105 = 3 \times 5 \times 7$. Since 10 < 21, let $J = 5$ and $K = 21$. According to (14), we can obtain the exponent matrix P of size 5 × 21.
By employing the method in [43], we select the first 5 rows and the 1st, 2nd, 3rd, 6th, 7th, 13th, 16th, 17th, 20th, and 21st columns of P and construct a 5 × 10 array H of 105 × 105 CPMs by replacing the elements of the selected submatrix with the corresponding CPMs. By using the matrix to mask H, a 525 × 1050 matrix with column and row weights 3 and 6, respectively, is obtained. This matrix gives a (3, 6)-regular (1050, 525) QC LDPC code of girth 8. For comparison, we also present the simulation results for the (3, 6)-regular (1050, 525) LDPC code constructed with the progressive edge-growth (PEG) algorithm [22]. The bit/word error rates (BERs/WERs) of these two codes decoded with the SPA (50 iterations) are shown in Figure 5. It can be seen that although these two codes have similar performance in the waterfall region, the proposed code performs better than the PEG-LDPC code in the high-SNR region.
Example 10. Consider $L = 127 > 4 \times 31$, and let $J = 4$, $K = 31$. According to (14), we can obtain the exponent matrix P of size 4 × 31. By employing the method in [43], we select the first 4 rows and the 1st, 2nd, 6th, 7th, 22nd, 26th, 29th, and 31st columns of P and construct a 4 × 8 array H of 127 × 127 CPMs by replacing the elements of the selected submatrix with the corresponding CPMs. By using the method in [43], we design a masking matrix to mask H, and a 508 × 1016 matrix with column and row weights 3 and 6, respectively, is obtained. This matrix gives a (3, 6)-regular (1016, 508) QC LDPC code of girth 8. For comparison, we also construct a (3, 6)-regular (1016, 508) QC LDPC code based on the partial geometry [28]. Note that the exponent matrix of this code is and the masking matrix is also $\mathbf{M}_{4 \times 8}$ in Example 7. The bit/word error performance of these two codes decoded by the SPA with 50 iterations is shown in Figure 6. It can be seen that these two codes have similar performance. We can also observe from Figure 6 that, for the proposed QC LDPC code, there are no error floors in the BER curves down to BER = $2.27 \times 10^{-7}$ and in the WER curves down to WER = $3.5 \times 10^{-6}$.
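The matrix sizes quoted in Examples 8 to 10 follow directly from the array dimensions and the CPM size, which the small sketch below recomputes. The function name `code_parameters` is an illustrative assumption, and the returned dimension n - m is exact only when the parity-check matrix has full row rank; in general it is a lower bound on the code dimension.

```python
def code_parameters(block_rows, block_cols, size):
    """Return (m, n, n - m) for a block_rows x block_cols array of size x size CPMs.

    m and n are the numbers of rows and columns of the parity-check matrix;
    n - m equals the code dimension only if the matrix has full row rank.
    """
    m = block_rows * size
    n = block_cols * size
    return m, n, n - m


print(code_parameters(4, 12, 72))   # (288, 864, 576)   Example 8
print(code_parameters(5, 10, 105))  # (525, 1050, 525)  Example 9
print(code_parameters(4, 8, 127))   # (508, 1016, 508)  Example 10
```

The agreement with the reported (864, 576), (1050, 525), and (1016, 508) parameters is consistent with the masked matrices in those examples having full row rank.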

Conclusion
In this paper, based on the fundamental theorem of arithmetic, we presented a method for constructing QC LDPC codes with girth of at least 6 from an arbitrary integer. According to the integer factorization, we divided the integers into three categories and then constructed three classes of QC LDPC codes. Furthermore, a general construction of QC LDPC codes with girth of at least 6 was proposed. Numerical results show that the constructed QC LDPC codes have good performance over the AWGN channel and converge fast under iterative decoding. In other words, for an arbitrary integer $L$ ($\ge 6$), we can easily construct QC LDPC codes whose parity-check matrices consist of several CPMs and/or zero matrices of size $L \times L$, and the proposed method ensures that the resultant QC LDPC codes have girth of at least 6. Moreover, the proposed QC LDPC codes perform as well as the algebraic QC LDPC codes.

Figure 6: The bit error performance of the proposed (3, 6)-regular (1016, 508) QC LDPC code and the comparable (3, 6)-regular (1016, 508) QC LDPC code constructed from partial geometry [28] in Example 10. (Curves: BER and WER of the partial-geometry code; BER and WER of the proposed code.)