PB: A Product-Bitmatrix Construction to Reduce the Complexity of XOR Operations of PM-MSR and PM-MBR Codes over GF(2)

Edge computing, as an emerging computing paradigm, aims to reduce network bandwidth transmission overhead while storing and processing data on edge nodes. However, the storage strategies required for edge nodes differ from those of existing data centers. Erasure code (EC) strategies have been applied in some decentralized storage systems to ensure the privacy and security of data storage. Product-matrix (PM) regenerating codes (RGCs), a state-of-the-art EC family, are designed to minimize the repair bandwidth overhead or the storage overhead. Nevertheless, the PM framework involves more finite-field multiplication operations than classical ECs, which heavily consumes computational resources at the edge nodes. In this paper, a theoretical derivation of each step of the PM minimum storage regeneration (PM-MSR) and PM minimum bandwidth regeneration (PM-MBR) codes is performed, and the XOR complexity over finite fields is analyzed. On this basis, a new construction called product bitmatrix (PB) is designed to reduce the complexity of XOR operations in the PM framework, and two heuristics are used to further reduce the XOR counts of the PB-MSR and PB-MBR codes, respectively. The evaluation results show that the PB construction significantly reduces the XOR count compared to the PM-MSR, PM-MBR, Reed–Solomon (RS), and Cauchy RS codes while retaining optimal performance and reliability.


Introduction
Edge computing has emerged as a new paradigm for addressing local computing needs and moving data computation or storage to an edge node near the end user [1][2][3]. In contrast to cloud storage built on a network, edge computing-based storage is distributed among the edges of the network structure [4,5]. Consequently, these edge nodes are particularly in need of fault-tolerant techniques to ensure system reliability and availability and, even more importantly, data privacy and security [6]. Nowadays, whether in a distributed storage system [7] or a decentralized system [8,9], the storage reliability of the source data must be guaranteed, and these systems have used an erasure code (EC) strategy as opposed to a replication strategy, which has a higher storage overhead [10]. The well-known Reed–Solomon (RS) codes, or more generally maximum distance separable (MDS) codes, have been applied in many storage systems [11]. In short, an EC is essentially a particular linear combination of data symbols that involves a large number of finite field matrix product operations, and the computational complexity over the finite field is high; there have been many studies on this issue [12][13][14]. In addition to the high computational complexity over GF(2^w), another problem of ECs is that their repair bandwidth equals the size of the entire source data [15]. It has been reported that, based on RS coding, an average of 95,500 coded blocks per day need to be recovered, and more than 180 TB of data per day must be transmitted through the TOR switch for data recovery [16]. Recently, a new class of ECs called regenerating codes (RGCs) has emerged that trades off storage overhead against repair bandwidth [17]. RGCs maintain the same reliability as ECs, but both are computed over a finite field.
There are two principal RGC classes, namely, minimum storage regenerating (MSR) and minimum bandwidth regenerating (MBR) codes, which correspond to the two extreme points of a trade-off known as the storage-repair bandwidth trade-off. Many frameworks have been designed for MSR and MBR codes, respectively [7], [18][19][20], but as far as we know, the product-matrix (PM) framework proposed by Rashmi et al. is the only framework that constructs both codes in a unified way [21], and the PM framework provides exact repair of PM-MSR (2k − 2 ≤ d) and PM-MBR (k ≤ d ≤ n − 1) codes. While there has been some research on optimizations and extensions of this framework, such as reducing the disk I/O overhead or dynamically changing the number of helper nodes according to system requests and network changes, these studies have not analyzed and optimized the framework itself [22][23][24]. In this paper, we focus on the PM framework, analyzing the computational complexity of the encoding, decoding, and repair processes over GF(2^w). The PM framework incurs a large number of XORs at each step; in particular, in the decoding process of the PM-MSR code, the computational complexity of inverting the Vandermonde matrix is O(n^3).
Therefore, we propose a new construction called product bitmatrix (PB). Our contributions include the following: (i) the computational complexity of the PM-MSR and PM-MBR codes is rigorously derived in combination with finite field arithmetic operations; (ii) a new construction called product bitmatrix for MSR and MBR codes is designed, and we elaborate on its encoding, decoding, and repair processes in detail; (iii) two new heuristic algorithms are presented that find a locally optimal PB with few XORs, thereby reducing computational complexity. The experimental results show that the number of XORs dropped by 11.73% for the PB-MSR code and by 20.17% for the PB-MBR code. The remainder of this paper is organized as follows. The background of the two codes and an arithmetic complexity analysis over the finite field are given in Section 2. Section 3 analyzes the computational complexity of the encoding, decoding, and repair processes of the PM framework. In Section 4, the new PB construction is presented and implemented. In Section 5, two different heuristic algorithms are designed to find the locally optimal product bitmatrix with the fewest XORs. Then, the results of the evaluation experiments conducted to verify the performance of the optimization are presented in Section 6. Finally, Sections 7 and 8 present related work and conclusions.

MSR and MBR Codes.
RGCs are state-of-the-art ECs that aim to reduce the amount of data transferred during the repair process: a new replacement node connects to any d helper nodes and downloads β symbols from each node. As shown in Figure 1, a comparison between the reconstruction and repair processes is presented. Any kα symbols in an α-level stripe are transferred to reconstruct the original symbols, while the dβ symbols downloaded for repair are much smaller than the total size B. This is captured by the cut-set bound of [17]:

B ≤ Σ_{i=0}^{k−1} min{α, (d − i)β}. (1)

Since both the storage overhead α and the bandwidth overhead β are costs, it is desirable to minimize both α and β. In [17], it is shown that when (n, k, d, B) are fixed, there is a trade-off between α and β; the two extreme points of this trade-off are termed the MSR and MBR points. At the minimum storage point, (α, β) is given by (B/k, α/(d − k + 1)), and at the minimum bandwidth point of the trade-off, (α, β) is given by (dβ, 2B/(k(2d − k + 1))). An RGC over GF(2^w) is associated with a set of parameters (n, k, d, α, β, B). A comparison of the metrics of replication, EC, MSR, and MBR is shown in Table 1.
As shown in Table 1, the upper bound of the storage efficiency R of MBR codes, reached when k = d = n − 1, is 1/2, but there is no such limitation on R for MSR codes. For the sake of simplicity, if the file size is B = 6 and (n, k, d) = (6, 3, 5), the repair bandwidth γ of EC, MSR, and MBR is 6, 3.33, and 2.5, respectively, and the total storage overhead nα of EC, MSR, and MBR is 12, 12, and 15, respectively. Although the EC and MSR codes use the same storage, the repair bandwidth of EC is higher than that of MSR. The MBR code allows the storage space to expand, but its repair bandwidth is at the lower bound. The MSR codes minimize γ subject to the optimal α, while the MBR codes minimize α under the optimal γ.
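The numbers in the example above can be checked directly from the two trade-off points. The following is a small sketch using exact fractions; the parameters are those of the example in the text, not an implementation from the paper:

```python
from fractions import Fraction as F

# Example from the text: B = 6, (n, k, d) = (6, 3, 5).
B, n, k, d = 6, 6, 3, 5

# MSR point: alpha = B/k, beta = alpha/(d - k + 1), repair bandwidth dβ.
alpha_msr = F(B, k)                        # 2
beta_msr = alpha_msr / (d - k + 1)         # 2/3
gamma_msr = d * beta_msr                   # 10/3 ≈ 3.33

# MBR point: beta = 2B/(k(2d - k + 1)), alpha = d*beta = repair bandwidth.
beta_mbr = F(2 * B, k * (2 * d - k + 1))   # 1/2
alpha_mbr = d * beta_mbr                   # 5/2 = 2.5

# Classical EC: repair bandwidth equals the whole file B = 6,
# and per-node storage is B/k = 2.
print(n * alpha_msr, n * alpha_mbr)        # total storage: 12 and 15
```

This reproduces the repair bandwidths 3.33 and 2.5 and the total storage overheads 12 and 15 quoted above.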

Finite Field Arithmetic Complexity.
Since a finite field of prime size is not well suited to computers, a finite field whose size is a power of 2 is generally preferred. In GF(2^w), each element is represented by a formal power series, which is a way of writing sequences of bits. For example, let a and b be bit sequences over the finite field GF(2), a = (a_0, a_1, a_2, …, a_{l−1}), where l is the length of the sequence. Addition of a and b can easily be done by bitwise XOR, but multiplication is more involved. The notation a(z) is used as a formal power series (polynomial) to represent a sequence:

a(z) = a_0 + a_1 z + a_2 z^2 + ⋯ + a_{l−1} z^{l−1}. (2)
The multiplication over GF(2^w) is performed by multiplying two polynomials with coefficients in the binary field GF(2) and then reducing the product by an irreducible polynomial [25]. Hence, a polynomial represents a field element, and the set of polynomials is denoted by F_2[z]. The degree deg a(z) of a(z) is at most w − 1, and the number of nonzero terms of the polynomial, the weight of a(z), is denoted ‖a(z)‖ and is at most w. Addition is a bitwise XOR operation whose complexity is w.
For multiplication, where c(z) := a(z)b(z), the complexity is computed in two steps. Step 1: in the product stage, the complexity is that of the coefficient XORs of the polynomial product. Step 2: in the reduction stage, if deg c(z) ≥ w, c(z) is reduced by an irreducible polynomial g(z) until deg c(z) < w. Let l = deg c(z), and convert the term c_l z^l to c_l (g(z) − z^w) z^{l−w}, where w′ denotes deg(g(z) − z^w). Divide c(z) into three polynomials: a no-reduce part d(z), a once-reduce part e(z), and a twice-reduce part f(z), as shown in Table 2. When w − 2 + w′ ≤ w − 1, no second reduction occurs.
Because c(z) = d(z) + e(z) + f(z), the reduction part is obtained by summing the reductions of e(z) and f(z).

Figure 1: An example with parameters (n = 10, k = 5, d = 9): reconstruction by the data collector (DC) is shown on the left, and repair by a new node after an old node has failed, transferring Bd/(k(d − k + 1)) symbols from d helper nodes, is shown on the right.
Therefore, the once-reduce part e(z) costs w′^2 XORs. Then, we count the XORs of the twice-reduce part. Rewrite f(z) in terms of a polynomial n(z) with deg n(z) = 2w′ − 2. Thus, the twice-reduce part f(z) needs w′^2 + (w − 2)^2 + 2w′ − 1 XORs. Consequently, the total number of XORs μ of a field multiplication over GF(2^w), including the two steps and the additions combining the three parts, is O(w^2); generally, a multiplication operation over GF(2^w) takes O(w^2) bit operations. For example, when w = 3, the multiplication results are as shown in Table 3. Because a finite field of size 2^8 corresponds to a byte in the computer, w is usually set to an integer multiple of 8 to make encoding operations convenient.
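The two-step multiplication just described can be sketched as follows. This is a minimal illustration, not the paper's implementation; for w = 3 the irreducible polynomial g(z) = z^3 + z + 1 (bit pattern 0b1011) is the one used for Table 3:

```python
def gf_mul(a: int, b: int, w: int = 3, g: int = 0b1011) -> int:
    """Multiply two elements of GF(2^w) given as bit-packed polynomials."""
    # Step 1: carry-less polynomial product -- coefficient XORs, no carries.
    prod = 0
    for i in range(w):
        if (b >> i) & 1:
            prod ^= a << i
    # Step 2: reduce modulo g(z) until deg c(z) < w, replacing each term
    # c_l * z^l by c_l * (g(z) - z^w) * z^(l-w).
    for deg in range(2 * w - 2, w - 1, -1):
        if (prod >> deg) & 1:
            prod ^= g << (deg - w)
    return prod

# Example: 3 * 2 = (z+1)z = z^2 + z = 6, and 3 * 4 = (z+1)z^2 = z^3 + z^2,
# which reduces to z^2 + z + 1 = 7 under g(z) = z^3 + z + 1.
```

For instance, `gf_mul(3, 2)` gives 6 and `gf_mul(3, 4)` gives 7, matching the entries for element 3 discussed with Table 3.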
Matrix inversion is often used in EC decoding and repair processes. Suppose A is an (n × n) matrix (n ≥ 3) and A^{−1} = A*/|A|; the determinant of A is computed by reduction to a triangular matrix. The solution process involves additions and multiplications, which require Σ_{i=2}^{n} i(i − 1)μ + Σ_{i=2}^{n} i(i − 1)w XORs. The adjoint matrix A* is an (n × n) matrix composed of algebraic cofactors in which each element A_{i,j} equals the determinant of A with row i and column j removed. Therefore, computing A* requires n^2 Σ_{i=2}^{n−1} i(i − 1)μ + n^2 Σ_{i=2}^{n−1} i(i − 1)w XORs. Consequently, the XOR count of matrix inversion is the sum of these two parts:

Σ_{i=2}^{n} i(i − 1)(μ + w) + n^2 Σ_{i=2}^{n−1} i(i − 1)(μ + w).
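For illustration, here is matrix inversion over GF(2^3). Note that this sketch uses Gauss–Jordan elimination rather than the adjugate-and-determinant route whose XORs are counted above, and `gf_inv` finds inverses by exhaustive search, which is fine for tiny fields:

```python
def gf_mul(a, b, w=3, g=0b1011):
    """Carry-less product followed by reduction modulo g(z)."""
    prod = 0
    for i in range(w):
        if (b >> i) & 1:
            prod ^= a << i
    for deg in range(2 * w - 2, w - 1, -1):
        if (prod >> deg) & 1:
            prod ^= g << (deg - w)
    return prod

def gf_inv(a, w=3, g=0b1011):
    """Multiplicative inverse by exhaustive search (unique by Lemma 1)."""
    return next(x for x in range(1, 1 << w) if gf_mul(a, x, w, g) == 1)

def mat_inv(A, w=3, g=0b1011):
    """Invert A over GF(2^w) by Gauss-Jordan elimination on [A | I]."""
    n = len(A)
    M = [row[:] + [int(i == j) for j in range(n)] for i, row in enumerate(A)]
    for col in range(n):
        piv = next(r for r in range(col, n) if M[r][col])   # nonzero pivot
        M[col], M[piv] = M[piv], M[col]
        inv = gf_inv(M[col][col], w, g)
        M[col] = [gf_mul(x, inv, w, g) for x in M[col]]     # scale pivot row
        for r in range(n):
            if r != col and M[r][col]:
                f = M[r][col]                               # eliminate column
                M[r] = [x ^ gf_mul(f, y, w, g) for x, y in zip(M[r], M[col])]
    return [row[n:] for row in M]
```

For example, `mat_inv([[1, 1], [1, 2]])` returns `[[7, 6], [6, 6]]` over GF(2^3), which multiplies back to the identity.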

The Computational Complexity of PM-MSR and PM-MBR
The famous PM-RGC construction provides MBR constructions for all feasible values of (n, k, d) and MSR constructions for all (n, k, d ≥ 2k − 2) [21]. A detailed description of the complexity of the PM-MSR and PM-MBR codes is presented in this section; the matrices used in the framework are listed in Table 4.
For encoding, the matrix Ψ = [Φ ΛΦ] is constructed, where each element is chosen to satisfy the following properties: (1) any d rows of Ψ are linearly independent; (2) any α rows of Φ are linearly independent; and (3) the n diagonal elements of Λ are distinct. Let the i-th row of Ψ be the (1 × d) vector ψ_i^t, the i-th row of Φ be the (1 × α) vector φ_i^t, and the i-th diagonal element of Λ be λ_i. The α coded symbols stored by the i-th node are given by c_i^t = ψ_i^t M. The complexity of encoding equals the sum of addition and multiplication XORs, where the MSR encoding complexity is nα[dμ + (d − 1)w] and encoding each data symbol needs (n/k)(dμ + (d − 1)w) XORs.
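The encoding step can be sketched as follows. This is a toy instance with our own parameter and data choices, not taken from the paper: k = 3, α = 2, d = 4, n = 5 over GF(2^3), using a Vandermonde-style Φ and Λ = diag(x_i^α), a combination for which the three properties above hold:

```python
def gf_mul(a, b, w=3, g=0b1011):
    prod = 0
    for i in range(w):
        if (b >> i) & 1:
            prod ^= a << i
    for deg in range(2 * w - 2, w - 1, -1):
        if (prod >> deg) & 1:
            prod ^= g << (deg - w)
    return prod

def gf_pow(a, e):
    r = 1
    for _ in range(e):
        r = gf_mul(r, a)
    return r

def matmul(A, B):
    """Matrix product over GF(2^w); addition is XOR."""
    n, m, p = len(A), len(B), len(B[0])
    C = [[0] * p for _ in range(n)]
    for i in range(n):
        for j in range(p):
            s = 0
            for t in range(m):
                s ^= gf_mul(A[i][t], B[t][j])
            C[i][j] = s
    return C

k, alpha, d, n = 3, 2, 4, 5
xs = [1, 2, 3, 4, 5]                                      # distinct elements
Phi = [[gf_pow(x, j) for j in range(alpha)] for x in xs]  # n x alpha
Lam = [gf_pow(x, alpha) for x in xs]                      # distinct diagonal
Psi = [Phi[i] + [gf_mul(Lam[i], p) for p in Phi[i]] for i in range(n)]
# Message matrix M = [S1; S2] with S1, S2 symmetric (alpha x alpha);
# the entries are arbitrary illustrative symbols.
S1 = [[1, 2], [2, 3]]
S2 = [[4, 5], [5, 6]]
M = S1 + S2                                               # d x alpha
C = matmul(Psi, M)     # row i holds the alpha coded symbols of node i
```

Each row of C is c_i^t = ψ_i^t M; node 1 here (ψ_1^t = [1, 1, 1, 1]) stores the XOR of the corresponding rows of M.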
Decoding is performed by the DC to reconstruct B. This process has three steps. (1) Let Ψ_DC = [Φ_DC Λ_DC Φ_DC] be the submatrix of Ψ corresponding to the k connected nodes, and obtain the encoded data Ψ_DC M. (2) Then, k(k − 1)/2 sets of equations are derived, one set for each pair of nodes (i, j). This step involves matrix inversion; the XORs to compute one pair (P_ij, Q_ij) are 16μ + 8w. Consequently, the DC solves the values of P_ij, Q_ij for all i ≠ j. (3) After the elements of P are known, the i-th row of P (excluding the diagonal element) is determined; then [φ_1 … φ_α]^t S_1 is computed, from which S_1 is recovered. S_2 is similarly recovered from Q. The sum of XORs during decoding is 126μ + 77w.

Table 3: Multiplication over GF(2^3) with irreducible polynomial g(z) = z^3 + z + 1.

Repair regenerates the coded symbols on a failed node f; the aim is to reconstruct c_f = ψ_f^t M. There are two roles in this stage: helper (h) and replacement (r) nodes. First, the i-th helper node (h_i, i ∈ [d]) computes the inner product ψ_{h_i}^t M φ_f and transmits this value to the r node. The r node then assembles the received values into a vector and recovers the content c_f. The repair process is not complicated. The number of XORs on each h node is αμ + (α − 1)w, and the total is d[αμ + (α − 1)w]. The computational complexity at r includes matrix inversion, matrix product, and coefficient product. For MBR codes, all feasible values of (n, k, d) have β = 1, and the message symbols are used to construct the matrix T.
Encoding obtains the coding matrix C as the matrix product of Ψ and M. Define the encoding matrix Ψ in the form Ψ = [Φ Δ], where each element is chosen such that (1) any d rows of Ψ are linearly independent and (2) any k rows of Φ are linearly independent. Then, C is given by C = ΨM. Decoding is performed by the DC to recover the original symbols B by connecting to any k nodes; because the i-th node stores ψ_i^t M, the DC multiplies on the left of Φ_DC T to recover T first. Repair recovers the ψ_f^t M symbols stored in the failed node to a replacement node by connecting to an arbitrary set {h_j | j = 1, …, d} of d helper nodes. First, each h_j computes the inner product ψ_{h_j}^t M ψ_f and sends the result to the replacement node. The replacement node thus obtains the d symbols Ψ_repair M ψ_f and then multiplies on the left by Ψ_repair^{−1} to recover M ψ_f. Since M is symmetric, ψ_f^t M = (M ψ_f)^t. The computational complexity of the PM-MSR and PM-MBR codes is shown in Table 5. As seen from Table 5, the PM framework requires multiple matrix inversions, and the number of XORs in the decoding process is very high.

The New Product Bitmatrix Ψ Constructed by Cauchy Matrix
In this section, we first introduce the process of converting finite field elements into bitmatrices. Then, we construct a new Ψ with a Cauchy matrix and transform this matrix into a bitmatrix called PB. The encoding, decoding, and repair processes of the MSR and MBR codes based on PB are introduced in detail.

Transforming Finite Field Elements Using Bitmatrix.
Through the analysis of finite field arithmetic in the previous section, each element e in GF(2^w) is represented as a formal power series of length w. In [26], a 1 × w row vector V(e) and a w × w matrix M(e) are described that represent an element of GF(2^w) in a new representation over GF(2)[z]. For any e ∈ GF(2^w), M(e) is the matrix whose i-th column (1 ≤ i ≤ w) is V(e · z^{i−1}); M(1) is the identity matrix and M(0) is the all-zero matrix. For example, Figure 2 shows bitmatrices over GF(2^3). The bitmatrix of e = 3 has 1st column 011, 2nd column 3 · z = 3 · 2 = 110, and 3rd column 3 · z^2 = 3 · 4 = 111. There are two forms of multiplication of bitmatrices, as shown in Figure 3. Using the bitmatrix representation, the encoding and decoding of PB-MSR and PB-MBR are accomplished by XOR operations, together with some copying operations.

Table 5: The number of XORs of the PM framework in the encoding, decoding, and repair processes.
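The bitmatrix representation described above can be sketched as follows. This is a minimal illustration over GF(2^3), with bit vectors written least-significant coefficient first:

```python
def gf_mul(a, b, w=3, g=0b1011):
    prod = 0
    for i in range(w):
        if (b >> i) & 1:
            prod ^= a << i
    for deg in range(2 * w - 2, w - 1, -1):
        if (prod >> deg) & 1:
            prod ^= g << (deg - w)
    return prod

def V(e, w=3):
    """Bit vector of element e, least-significant coefficient first."""
    return [(e >> i) & 1 for i in range(w)]

def bitmatrix(e, w=3, g=0b1011):
    """M(e): column i is V(e * z^i), stored row-major."""
    cols = [V(gf_mul(e, 1 << i, w, g), w) for i in range(w)]
    return [[cols[j][i] for j in range(w)] for i in range(w)]

def bitmat_vec(M, v):
    """Bitmatrix-vector product over GF(2): each output bit is an XOR."""
    return [sum(m & x for m, x in zip(row, v)) & 1 for row in M]

# The columns of M(3) encode 3, 3*z = 6, and 3*z^2 = 7, matching the text,
# and multiplying by 3 in GF(2^3) becomes a pure-XOR bitmatrix product.
M3 = bitmatrix(3)
```

As a check, `bitmat_vec(M3, V(5))` equals `V(4)`, since 3 · 5 = 4 in GF(2^3).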
In [21], the classical Vandermonde matrix is adopted to construct Ψ, but inverting an n × n Vandermonde-based matrix there takes O(n^3) time. One well-known alternative is the Cauchy matrix, whose inverse can be computed in O(n^2) time. Let X = {x_1, …, x_α} and Y = {y_1, …, y_n}, where the x_i and y_j are distinct elements over GF(2^w) and X ∩ Y = ∅. Then, the Cauchy matrix defined by X and Y has 1/(x_i + y_j) in element (i, j). Clearly, any submatrix of a Cauchy matrix is still a Cauchy matrix. If a Cauchy matrix is used as Ψ and converted to a bitmatrix, the number of ones in the bitmatrix determines the number of XOR operations in encoding, and o denotes the average number of ones per row of the matrix. Choosing different X and Y yields different o. Figure 4 shows an instance over the finite field GF(8).
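This construction, and the quantity the later heuristics try to minimize, can be sketched as follows (X and Y here are illustrative choices, not the paper's):

```python
def gf_mul(a, b, w=3, g=0b1011):
    prod = 0
    for i in range(w):
        if (b >> i) & 1:
            prod ^= a << i
    for deg in range(2 * w - 2, w - 1, -1):
        if (prod >> deg) & 1:
            prod ^= g << (deg - w)
    return prod

def gf_inv(a, w=3, g=0b1011):
    return next(x for x in range(1, 1 << w) if gf_mul(a, x, w, g) == 1)

def cauchy(X, Y, w=3, g=0b1011):
    """Element (i, j) is 1/(x_i + y_j); addition in GF(2^w) is XOR."""
    return [[gf_inv(x ^ y, w, g) for y in Y] for x in X]

def ones(e, w=3, g=0b1011):
    """Ones in the w x w bitmatrix of e: the XOR cost of multiplying by e."""
    return sum(bin(gf_mul(e, 1 << i, w, g)).count("1") for i in range(w))

X, Y = [1, 2], [0, 3, 4, 5, 6]          # disjoint sets of distinct elements
C = cauchy(X, Y)
total_ones = sum(ones(e) for row in C for e in row)
```

Since M(1) is the identity, `ones(1)` is w = 3, the smallest possible; dense elements such as 3 cost 7 XOR-relevant ones.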

New Product Bitmatrix Ψ of MSR Codes.
For the PB-MSR code in the upper part of Figure 4, X = {1, 2} and Y = {0, 3, 4, 5, 6} are used to construct the (n × α) matrix Φ, and Λ is an (n × n) diagonal matrix. Because Ψ = [Φ ΛΦ], any d rows of Ψ are linearly independent, any α rows of Φ are linearly independent, and the n diagonal elements of Λ are distinct. According to these three conditions, the encoded matrix C = Ψ × M is generated, and each row of coded symbols in C is stored on the corresponding node; that is, the i-th row is stored on the i-th node.
In the reconstruction scenario, the DC links to any three nodes and downloads kα = 6 symbols. After the DC recomputes S_1 φ_f and S_2 φ_f, the content of node 5 is recovered. However, finding a suitable Λ by random enumeration based on the third condition is not practical.

Lemma 1.
In a finite field, any element a (except 0) has a unique multiplicative inverse element a^{−1}, so the formula a · a^{−1} = a^{−1} · a = 1 holds.
If X = {x_1, x_2} and Y = {y_1, y_2, y_3, y_4}, where the x_i and y_j are distinct elements over GF(2^w), then a Cauchy matrix is formed as Φ with element 1/(x_i + y_j). Because every 1/(x_i + y_j) is distinct, their multiplicative inverses are also distinct according to Lemma 1. Then, the square of the multiplicative inverse of the element in the i-th row and j-th column of Φ is used to construct the diagonal matrix Λ, from which Ψ = [Φ ΛΦ] follows. Each column of Ψ is linearly independent. Nevertheless, for every operation over the finite field GF(2^w), when w or the primitive polynomial changes, the results of all operations change. Hence, matrix inversion is needed to check whether the current structure is invertible. If the current Ψ has an irreversible submatrix, each element of Λ is changed by multiplying by a power to obtain a new, completely different element. Since matrix inverses occur only a few times in encoding, decoding, and repair, while matrix products occur countless times, it is important to optimize the number of XORs of the matrix. The other details are covered in the next section.
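The Λ construction described above can be sketched as follows. This is a toy instance with our own X and Y over GF(2^3); squaring is injective in characteristic 2, so the distinct inverses give distinct diagonal entries:

```python
def gf_mul(a, b, w=3, g=0b1011):
    prod = 0
    for i in range(w):
        if (b >> i) & 1:
            prod ^= a << i
    for deg in range(2 * w - 2, w - 1, -1):
        if (prod >> deg) & 1:
            prod ^= g << (deg - w)
    return prod

def gf_inv(a, w=3, g=0b1011):
    return next(x for x in range(1, 1 << w) if gf_mul(a, x, w, g) == 1)

X, Y = [1, 2], [3, 4, 5, 6]               # disjoint, distinct elements
# Cauchy matrix Phi with element 1/(x_i + y_j); rows indexed by Y here.
Phi = [[gf_inv(y ^ x) for x in X] for y in Y]
# Diagonal of Lambda: squares of the multiplicative inverses of one column.
col = [row[0] for row in Phi]
Lam = [gf_mul(gf_inv(c), gf_inv(c)) for c in col]
# Lemma 1: the inverses are distinct, hence so are their squares.
```

Here the diagonal entries come out pairwise distinct, as the third PM condition on Λ requires.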

Optimization of PB Framework Based on Minimizing the Number of Ones
Arbitrary Cauchy matrices built from different X and Y contain various numbers of ones, and the impact on performance differs significantly [27]. It is necessary to find a better bitmatrix Ψ that has fewer ones. Therefore, in this section, we optimize the two new Ψ with two new heuristic algorithms.

Optimizing the Number of Ones in Bitmatrix Ψ of PB-MSR.
The number of ones of each element over GF(2^w) is easy to determine. Figure 2 shows that elements 1–7 have numbers of ones represented by the array [3, 4, 7, 5, 4, 7, 6]. A bitmatrix with fewer ones reduces the encoding computation, and element 1 has the fewest ones. For each column of Φ, candidate scalings are tried, the ones of the resulting column are counted, and the variant with the fewest ones is kept; repeating these steps for all columns yields a new matrix Φ′ with a minimal number of ones that retains the invertibility property.
Next, it is necessary to find a Λ_min such that Λ_min × Φ′ has a minimal number of ones. Let λ = {Λ_1, Λ_2, …, Λ_{d−k+1}} denote a set of (n × n) diagonal matrices, and let each Λ_j consist of the n multiplicative inverses of the j-th column of Φ′. For each Λ_j in λ, do the following: (1) form Ψ_j = [Φ′ Λ_j Φ′]; (2) determine whether the new Ψ_j is invertible: if it is, continue to the next step; if it is irreversible, modify Λ_j and retry; (3) count the number of ones of Λ_j × Φ′; (4) repeat the first three steps for every Λ_j, and whichever Λ_min gives the minimal number of ones, form Ψ from this Λ_min × Φ′.
Algorithm 1 outlines the procedure to generate the set λ and the process of finding the best Λ_min matrix.

Optimizing the Number of Ones in Bitmatrix Ψ of PB-MBR.

Let λ = {Λ_1, …, Λ_d} (d is the number of helper nodes) denote a set of (n × n) diagonal matrices, and let each Λ_j consist of the n multiplicative inverses of the j-th column of Ψ. For each Λ_j, do the following: (1) compute Ψ′ = Λ_j Ψ; (2) count the number of ones of the new Ψ′; (3) repeat the first two steps d times and keep the best Ψ′.

After one column of Ψ′ has become all element 1, Algorithm 2 does the same for the remaining d − 1 columns: each column is optimized by dividing the values in that column so that its number of ones is minimized. The detailed process is given in Algorithm 2.
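A simplified sketch of the per-column idea shared by both heuristics follows. This is our own compact reformulation, not the paper's exact algorithm: multiplying a column of an invertible matrix by a nonzero constant preserves invertibility, so each column is scaled by whichever nonzero element minimizes the ones of its bitmatrix:

```python
def gf_mul(a, b, w=3, g=0b1011):
    prod = 0
    for i in range(w):
        if (b >> i) & 1:
            prod ^= a << i
    for deg in range(2 * w - 2, w - 1, -1):
        if (prod >> deg) & 1:
            prod ^= g << (deg - w)
    return prod

def ones(e, w=3, g=0b1011):
    """Ones in the w x w bitmatrix of e."""
    return sum(bin(gf_mul(e, 1 << i, w, g)).count("1") for i in range(w))

def best_column_scaling(M, w=3, g=0b1011):
    """Scale each column by the nonzero constant minimizing bitmatrix ones."""
    out = [row[:] for row in M]
    for j in range(len(M[0])):
        def cost(s):
            return sum(ones(gf_mul(row[j], s, w, g), w, g) for row in M)
        s = min(range(1, 1 << w), key=cost)     # exhaustive over GF(2^w)*
        for row in out:
            row[j] = gf_mul(row[j], s, w, g)
    return out
```

For example, `best_column_scaling([[2, 3], [2, 5]])` scales the first column to all ones (cost 3 per entry, the minimum) and the second to [5, 4].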

Experiment
In this section, we implement the new PB-MSR and PB-MBR codes in C/C++ and employ the Jerasure 2.0 [28] and GF-Complete [29] libraries for finite field arithmetic operations. All evaluations over GF(2^w) concern the reduction of XORs and the encoding, decoding, and repair performance. All tests were conducted on macOS Catalina with an 8-core Intel Core i9 at 2.3 GHz and 16 GB of 2667 MHz memory. In Section 6.1, we analyze the improvement in the number of XORs under the new PB construction. In Sections 6.2–6.4, we report experiments on encoding, decoding, and repair performance. The encoding performance = total encoded file size/encoding time, and the decoding performance = total decoded file size/decoding time. Finally, the influence of the finite field size w is analyzed. All experimental results are averages computed after the maximum and minimum values were removed.

Analyzing New Product Bitmatrix.
To better understand the effect of the new PB construction in speeding up the encoding computation, evaluation experiments were designed on reducing the number of XORs, with the results reported in Table 6. The baseline is the original Cauchy matrix without any additional optimization. Original PB-MSR and original PB-MBR each have two subcolumns of data, which are the sums of XORs of Cauchy matrices formed by different combinations of X and Y. Since the original PB-MBR code is constructed from the original Cauchy matrix, the two columns before PB-MBR are the actual values of the original Cauchy matrices. The latter two columns are the reductions of PB-MSR and PB-MBR, respectively. Improvement is measured in terms of the decrease in XORs. The last row of the table gives the average improvement over all tested parameters.
Since the combinations of X and Y differ and the construction of the PB-MSR product matrix is more involved, each change brings completely different results, so this experiment observes only the overall trend. The size of the finite field depends on the computer byte size, i.e., 2^8, 2^16, and 2^32, and the choice of w must satisfy n + k ≤ 2^w − 1 when d = 2k − 2. It can be seen from Table 6 that PB-MSR provides an 11.73% reduction in the number of XORs and PB-MBR provides a 20.17% reduction. Because the optimization of PB-MBR optimizes the whole matrix Ψ, its degree of optimization is greater. Figure 8 shows the XOR change curves: the two line graphs are the original PB-MSR and PB-MBR codes, which serve as baselines, and the optimized XOR counts of the two codes lie below these lines.

Encoding Performance Evaluation.
The RS and Cauchy codes use the Vandermonde and Cauchy matrices, which are the most classic experimental benchmarks for erasure codes. Figure 9 shows the comparison of the encoding performance of the PB-MSR, PM-MSR, PB-MBR, PM-MBR, RS, and Cauchy codes when the total number of coded blocks n is fixed and r equals 2, 2.5, or 3. Cauchy-good is the optimized Cauchy-matrix encoding scheme provided by the Jerasure 2.0 library. Since encoding performance is directly related to the number of XORs, when the redundancy rate r is at its maximum, the coding matrix of any coding scheme is smallest. Therefore, when r = 3, all encoding performances are optimal. As r decreases, the multiplicative operations in the finite field gradually increase, and all coding rates decrease. It can be seen from the figure that as r becomes smaller, the PB Ψ becomes larger and more optimization time is required; therefore, the coding rates of the PB-MSR and PB-MBR codes decrease faster than those of RS or Cauchy. According to the analysis in the previous section, different Cauchy matrices generate different optimization results, so the rise of the PB-MBR code at r = 3 or r = 2.5 is not proportional. However, the PB-MSR encoding rate is not as high as that of PM-MSR using the Vandermonde matrix directly when the coded object is small. This is because more time is required to generate the optimized bitmatrix. The encoding rate of the PB-MBR code is lower than that of PB-MSR because more redundancy must be written to disk during the encoding process of the PB-MBR code. From Figure 9(b), it can be seen that as the coding object grows, the proportion of time spent generating or optimizing the matrix becomes relatively small, so the rates of all coding strategies increase.

Decoding Performance Evaluation.
In the decoding/reconstruction experiments, the decoding performance for various values of r and fixed n was tested for all coding schemes. The decoding performance during reconstruction was measured in terms of the amount of data decoded per unit time. Figure 10 shows the comparison results. It can be observed that the decoding performance of all codes improves with increasing redundancy r. From Figure 10(a), we see that the decoding performance of the PB-MSR code increased from 186.72 MB/s at r = 2 to 300.27 MB/s at r = 3, and the PB-MBR code performed 11.95% better at r = 3 than at r = 2. From Figure 10(b), it can be seen that from r = 2 to r = 3, the decoding performance of the PB-MSR code increased by 48.26% and that of the PB-MBR code by 14.78%. Since the computational complexity of PB-MSR and PB-MBR decoding is closely related to k, the smaller k is, the lower the complexity and the faster the decoding. Although the PB-MBR codes are similar to the PB-MSR codes, they are simpler and therefore have a higher decoding rate. As can be seen from Figure 10, the PB-MSR decoding performance is about 17.35% higher than that of PM-MSR, and PB-MBR is about 15.29% higher than PM-MBR.

(1) O_m is the number of ones of a matrix
(2) O_cj is the number of ones of the j-th column
(3) Use a Cauchy matrix to construct Φ and count O_Φ
(4) for each column j = 1 to d − k + 1 do
(5)   for each row i = 1 to n do
(6)     for each row k = 1 to n do
(7)       Calculate …
(8)     Count O_cj of this new column
(9)   end for
(10) end for
(11) Choose the column with min O_cj as the new j-th column of Φ′
(12) end for
(13) Construct Φ′ so that each column has min O_cj
(14)–(21) …
(22) if Ψ_j = [Φ′ Λ_j × Φ′] is invertible then
(23)   Count O_{Λ_j × Φ′}
(24) else
(25)   repeat …

ALGORITHM 1: Finding the optimized bitmatrix Ψ of PB-MSR.

(1) O_m is the number of ones of a matrix
(2) O_cj is the number of ones of the j-th column
(3) Use a Cauchy matrix to construct Ψ and count O_Ψ
(4) λ = {Λ_1, Λ_2, …, Λ_d} is the set of (n × n) diagonal matrices, each formed from multiplicative inverses of Ψ
(5) for each Λ_j ∈ λ do
(6)–(8) …
(9) Choose the best Ψ′ which has the min O_Ψ′
(10) The column of the matrix whose elements are all 1 is the x-th
(11) The set η includes all columns except the x-th column
(12) for each column j ∈ η do
(13)   for each row i = 1 to n do
(14)     for each row k = 1 to n do
(15)       Calculate …
(16)     end for
(17)     Count O_cj of this new column
(18)   end for
(19)   Choose the column with min O_cj as the new j-th column of Ψ″
(20) end for
(21) Choose the columns with the min O_cj to construct Ψ″
(22) Count O_Ψ″

ALGORITHM 2: Finding the optimized bitmatrix Ψ of PB-MBR (the algorithm first denotes a set λ with each Λ_j ∈ λ and finds the best Ψ′ = Λ_j Ψ which has the minimal number of ones).

Figure 11 shows the data transferred across the network to repair one failed node. It can be seen that both the PB-MSR and PB-MBR codes have significantly lower data transfer than RS codes. As can be seen from the bar chart in Figure 11(a), when the encoding object is 5 MB, the classical RS code needs 1.5 times the data-transfer overhead of the PB-MSR code and 2.25 times that of the PB-MBR code. It can also be observed from Figure 11(a) that when the redundancy ratio r is fixed, increasing d results in a smaller amount of data transferred across the network. When d = n − 1 in Figure 11(a), the data-transfer overhead of the PB-MSR and PB-MBR codes decreased by 35.74% and 16.22% compared to that of d = 2k − 2.

Repair Performance Evaluation.
Additionally, the repair performance was tested, as shown in Figure 12. Overall, while the value of n is fixed, as the redundancy rate increases, k gradually decreases and the per-node storage capacity increases; disk I/O overhead increases, and the overall repair rate decreases. In contrast to Figure 11, the MSR point always has the smallest per-node storage capacity α; thus, the data that must be written to disk is also the smallest. Besides, based on the repair-complexity analysis in Table 5, it can be seen that PB-MSR has lower computational complexity than PB-MBR, so its repair rate is slightly higher.

e Effect of Finite Field GF(2 w ) on Coding Performance.
Based on the analysis thus far, we know that the complexity of the encoding, decoding, and repair processes is related to the finite field size w, which means every symbol x ∈ [0, 2^w). Therefore, the experiment in this section analyzes the influence of the finite field size on the performance of ECs. The values of (n, k, d) were fixed, and the performance for w = 8, w = 16, and w = 32 was tested, as shown in Figure 13. Figures 13(a), 13(b), and 13(c) compare encoding, decoding, and repair performance, respectively. From these figures, we see that as the finite field size w gradually increases, the computational complexity increases and the rates decline to varying degrees.

Related Works
Regenerating codes are state-of-the-art ECs with optimal repair bandwidth that have been used in some storage systems. The NCCloud storage system was one of the earliest implementable designs for 2-parity functional MSR (F-MSR) codes, which maintain the same data redundancy level and storage requirement as traditional ECs but use less repair traffic [30]. In [31], Pamies-Juarez et al. presented the evaluation of a novel MSR code known as the Butterfly code in both Ceph and HDFS; their analysis shows that Butterfly codes are capable of reducing network traffic and read I/O access during repairs. In [20], the Coupled-Layer (Clay) code is an MSR code that offers a simplified construction for decoding/repair by using pairwise coupling across multiple stacked layers of any single MDS code and has been evaluated over an Amazon AWS cluster. The Clay code is simultaneously optimal in terms of storage overhead, repair bandwidth, optimal access, and sub-packetization level. In [22], Rashmi et al. […] [33]. In [34], Shah et al. presented (n = d + 1, k, d) exact-repair (ER) MBR codes without arithmetic operations in the repair process. In [35], the authors introduced a family of repair-by-transfer (RBT) codes, a class of (n, k, d = n − 1) ER-MBR codes constructed from congruent transformations applied to a skew-symmetric matrix of message symbols. In this construction, the encoding complexity decreases from n^4 to n^3, and based on the new coding matrix for PM-MBR, the minimum finite field size is reduced from n − k + d to n. In [18], the authors first introduced a more natural extension of the classical PM-MBR framework, modified to provide flexibility in the choice of the number of helpers during node repair and to permit a certain number of error-prone nodes during repair. This was achieved by proving the nonsingularity of a family of matrices in large enough finite fields.
To reduce the high coding and repair complexity that involves expensive multiplication operations in a finite field, another new class of RGC was proposed. In [36][37][38], the authors proposed a new framework of linear codes with binary parity-check code, named Binary Addition and Shift Implementable Cyclic-convolutional (BASIC) codes which enable coding and repair by XOR and bitwise cyclic shift.
Although some systems are currently using RGC, they are designed or optimized for coding structure, bandwidth overhead, storage overhead, and I/O overhead. However, there is still a lack of some classical coding framework complexity analysis from the perspective of the finite field.
ECs rely on finite field operations that can be performed using XOR operations. Many acceleration techniques have been proposed: optimizing the bitmatrix design, optimizing the computation schedule, common XOR-operation reduction, cache management, and vectorization techniques [12]. In [13], two new heuristics called Uber-CHRS and X-Sets were derived to schedule encoding and decoding bitmatrices by reducing the number of XORs, and several related techniques using different heuristic algorithms were introduced in the same work. In addition to smart scheduling and matching algorithms for reducing the number of XORs, other solutions are based on improving hardware performance. In [14], Plank et al. vectorized finite field operations directly based on single-instruction-multiple-data (SIMD) instructions.

Conclusion
In this paper, we have combined finite field arithmetic operations to elaborate the computational complexity of the famous PM framework in the encoding, decoding, and repair processes and have proposed a new construction called product bitmatrix (PB). Based on two heuristic algorithms, the PB-MSR and PB-MBR codes find locally minimal XOR counts that improve encoding, decoding, and repair performance. Although PB improves on the original framework in computation and has advantages in the decoding and repair processes, when the encoding object is small, the optimization time occupies a portion of the total encoding time, so its advantage over the Vandermonde-based PM framework is limited in that regime.
In future research, in addition to theoretical work, we will conduct in-depth research on the fault-tolerance technology of edge computing nodes in combination with real environments, for instance, improving network transmission performance [39], adapting data-reliability storage algorithms recommended for edge computing [40,41], or designing new nonlinear coding methods with deep neural networks [42,43].

Data Availability
The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest
The authors declare that they have no conflicts of interest.