Efficient and Memory Saving Method Based on Pseudoskeleton Approximation for Analysis of Finite Periodic Structures

An efficient and memory saving method based on pseudoskeleton approximation (PSA) is presented for the effective and accurate analysis of finite periodic structures. Different from the macro basis function analysis model, our proposed method uses the formulations derived by the local Rao-Wilton-Glisson basis functions. PSA is not only used to accelerate the matrix-vector product (MVP) inside the single unit but also adopted to decrease the calculation burden of the coupling between the different cells. Moreover, the number of decomposed coupling matrices is minimized due to the displacement invariance of the periodic property. Consequently, even compared with the multilevel fast multipole algorithm (MLFMA), the new method saves much more memory resources and computation time, which is also demonstrated by the numerical examples.


Introduction
Periodic structures have recently found wide applications in the electromagnetic engineering such as antenna arrays and metamaterials with negative permittivity and negative permeability.Hence, the accurate and efficient analysis of periodic structures becomes quite essential.If the periodic structure is an infinite array, simple methods can be applied based on Floquet's theorem [1] or periodic Green's function [2], where only a single cell of the periodic structure is the domain of interest.
However, all periodic structures have finite size in reallife problems, although the size may be very large.Therefore, the numerical algorithms which accurately consider the mutual coupling between all cells should be used if the accurate results are required.The method of moments (MoM) [3] and its fast algorithms such as fast multipole method (FMM) [4], adaptive cross approximation (ACA) [5], and FFT-based methods [6] are flexible approaches to study the surface problems.However, the efficiency of numerical methods is still limited since the periodic property is not used in the algorithm framework.Recently, a hybrid method combining the accurate MoM and periodic method of moment (PMM) [7] has been proposed which can gain the balance between the two methods.Moreover, some physically based entire-domain basis functions [8] have been developed to reduce the number of unknowns.Further, the FMM and FFT techniques are integrated to accelerate the calculation [9].
Compared to the ACA method, pseudoskeleton approximation (PSA) [10] is also an efficient low-rank-based algebraic fast algorithm which makes it a really competitive alternative.In this paper, we propose an efficient method with low-memory requirement based on PSA to perform the analysis for finite periodic structures effectively and accurately.In consideration of the accuracy of the mutual interactions [8] and the simplicity of implementation, our proposed method uses the formulations derived from the local basis functions instead of macro basis function (MBF) [11,12].In this paper, PSA is not only used to accelerate the matrixvector product (MVP) inside the single unit but also adopted to decrease the calculation burden of the coupling between the different cells.Moreover, the number of decomposed coupling matrices is minimized due to the displacement invariance of the periodic property.With these improvements, an efficient method with low-memory usage of finite periodic objects can be achieved.Several numerical examples are given to show the priority of the proposed method compared to the conventional multilevel fast multipole algorithm (MLFMA) [13] for periodic structures.

MoM and PSA Formulation
In this section, the basis principle of MoM and PSA is briefly introduced at first.Then, the choice of arguments in PSA is discussed.
2.1.MoM Equations and Its Fast Algorithms.Consider a time-harmonic e −jωt electromagnetic wave scattering or radiation problem of an arbitrary perfect electrically conducting (PEC) object.The object is excited by an incident electric field E inc r , then the electric field integral equation (EFIE) associated with the surface equivalent currents J r can be expressed by where t is the tangential unit vector on the surface S ′ of the object and ω and μ stand for the angular frequency and permeability, respectively.G r, r ′ = e −jk r−r ′ / 4π r − r ′ is 3D scalar Green's function.The linear system of MoM is obtained by discretizing the unknown vector J r with Rao-Wilton-Glisson (RWG) [14] basis functions and applying Galerkin's testing method.Let ZI = V represent the above EFIE matrix system.The MVP process can be accelerated by fast algorithms which can be written as follows: where Z near is the matrix of near field interactions which are directly computed and stored and Z far stands for couplings between the far-field interactions which will be accelerated together with Z far I in the iterative solving process.
2.2.Basic PSA Frameworks.According to the low-rank decomposition, the far-field interaction matrix Z far with rank-deficient property can be approximated by a product of two much smaller submatrices U and V: where r is the effective rank and satisfies r ≪ m, n.While in the skeleton approximation (SA) theory [10], there is a nonsingular r × r submatrix Ẑ in Z far .Denote the rectangular matrices as C and R which contain the columns and rows of Ẑ, respectively, then Z far is expressed as where C and R have dimensions with m × r and r × n, respectively.In the PSA method, Z far is reevaluated as where G is the pseudoinverse of Ẑ with dimensions of p × p p > r , then p columns and p rows are chosen from Z far to get the C and R. Three aspects need to be noted here: (i) Ẑ−1 is the inverse of Ẑ and the computation of inverse is very expensive; (ii) the determination of the value of p is a balance between accuracy and efficiency, where p is a number large enough so that r most important bases will be embedded; (iii) how to choose those working columns and rows of C and R.
For the (ii) and (iii) problems, we will discuss them in the next subsection.For the first problem, assuming that Ẑ can be decomposed via singular value decomposition (SVD) as where P and Q are p × p unitary matrices, Σ is a diagonal p × p matrix with nonnegative real numbers, and Q * is the complex conjugate transpose of Q.In the actual implementation, the dimension of P, Q, and Σ can be further decreased by a preset threshold [15].Let P, Σ, and Q represent the reduced submatrix of P, Σ, and Q, respectively.Then, calculation of the pseudoinverse of Ẑ (i.e., G) is straightforward: By combining ( 5) and ( 7), the original far-field interaction matrix Z far can be decomposed as As mentioned in the previous subsection, the choice of the value of p and the specific sampling rows and columns determine the performance of PSA.In the randomized PSA (RPSA) [15], p is equal to 2r.
Then, the problem of (ii) transforms into how to estimate the rank r of the original matrix.In [16], Chai and Jiao give the approximation of rank of the 3D EM problem by Rank 3D ~O k 0 , where k 0 is the studying wave number.In this paper, the rank is approximated by where d is the diameter of the bounding box corresponding to the octree structure and α is a preset positive parameter.
The larger the α is, the more accurate the matrix decomposition is.In this paper, when α is set as 3, the satisfactory accuracy can be guaranteed.
For the problem of (iii), instead of using random numbers in RPSA, we use a strategy analogous to ACA in this paper.Firstly, initialize from the 0th row as the first row index.Then, find the largest entry in this row, and the corresponding column value in which this entry is located is chosen as the next column index.Similarly, find the largest entry in the current column and get the next row index which should be different from all previous row indexes.This process is carried out iteratively until p rows and columns are found and stored.Please refer to [17] for more information.

Proposed Method for Periodic Structures
We consider a case of arbitrarily shaped PEC patch, for example, refer in Figure 1.The total impedance matrix 2 International Journal of Antennas and Propagation contains two parts: self-coupling blocks and mutual coupling blocks.Therefore, both the blocks are analyzed and decomposed by PSA to gain better efficiency and low-memory usage.
3.1.PSA for Decomposition of Self-Coupling Matrix.In the previous mentioned periodic algorithms, the self-coupling block elements are calculated by a direct MoM method which is not efficient for a large array unit.Therefore, PSA is used to decompose the self-coupling matrix.As the same to all fast algorithms of MoM, the low-rank decompositions based on PSA are performed on the far-field groups while the nearfield elements are directly computed and saved.Moreover, some operations are taken out to further improve the efficiency of PSA.Firstly, the pseudoinverse of the Ẑ (i.e., G) will not be directly calculated since the complexity of SVD is very high.Instead, the LU decomposition will be performed when the dimension grows up.Different from the previously proposed PSA, C, R, and the LU decomposition of Ẑ will be stored.Moreover, C and R in (8) are further decomposed by ACA technique.Finally, the original farfield coupling matrix can be decomposed into six submatrices which is showed by the following formula: where and L ẐU Ẑ is the LU decomposition of Ẑ.Note that the inverse of LU matrix of Ẑ is not directly computed.Actually, the LU back substitute is performed after the MVP based on the matrix U R V R in each MVP process.

PSA for Decomposition of Mutual Coupling Matrices.
In the traditional fast algorithms, the matrix decompositions are only performed on the far-field groups especially for the physically related methods such as MLFMA.However, the mutual interactions between two different cells (off-diagonal blocks) are all computed by PSA in our implementation.This treatment has been also used in [18] and demonstrated to be more efficient but losing little accuracy.Moreover, the displacement invariance property is explored in our implementation.Since the mutual coupling remains the same when the distance between two groups' centers is not changed, the corresponding matrix will be calculated and stored only once under the index of relative distance.In the MVP process, the stored coupling matrices may be used several times when the relative distances are the same.By using this scheme, the number of mutual coupling matrices can be decreased dramatically.For example, the original number of off-diagonal matrix blocks of Figure 1 is 20 × 20 − 20 = 380 while the reduced number will be 62 if the displacement invariance is used.

Preconditioner Considerations.
The preconditioning stage should be also considered carefully while dealing with complex structures.In this paper, the preconditioning matrix M for impedance matrix Z is built based on the self-coupling matrix M 0 .
Moreover, the inner-outer iteration scheme can be also applied when facing ill-conditioned matrix systems such as EFIE for radiation analysis.In our implementation, the inner iteration is the solution of the self-coupling matrices.Certainly, the inner solution area can be also extended to improve the convergence rate of outer iteration.

Numerical Results
In this section, several numerical experiments are presented to demonstrate the efficiency and low-memory requirement of the proposed method.For all the simulations, the mesh sizes are no less than 0.1λ (λ represents the wave length).The biconjugate gradient stabilized method (BiCGSTAB) is adopted as the iterative solver for the matrix equations, and the threshold of the iterative stopping criterion for residuum is set to 0.001.The sparse approximate inverse preconditioner is used in all simulations.All the computations were carried out on a workstation with four Xeon E5-4620 CPU and 256 GB of RAM with OpenMP technique, and the digits were stored in double precision.

Scattering Simulation.
In order to verify the accuracy and efficiency of the proposed method, we first consider a periodic structure consisting of N 0 = 4 × 5 = 20 element cells (also shown in Figure 1), where the unit cell is a 0.5 m PEC sphere.The working frequency is set to be 1 GHz which leads to 20286 unknowns (degrees of freedom) in each patch.Hence, the total number of unknowns is N = 20 × 20286 = 416520.The gap between two unit cells is set to be 0.5 m.The radar cross sections (RCS) computed by MLFMA, the proposed method (periodic PSA), and the commercial software (FEKO) are illustrated in Figure 2(a).It could be clearly seen that the numerical results from the proposed method have excellent agreement with both the conventional MLFMA and FEKO.While as in Table 1, both the computation time and peak memory usage by periodic PSA are less than the MLFMA method.What is more, another case is considered when the gap between two unit cells is decreased to be 0.1 m.The RCS results and the use of computer resources are shown in Figure 2(b) and Table 1, respectively.Although the rank of mutual coupling matrix gets bigger due 3 International Journal of Antennas and Propagation to the tight coupling interactions, the proposed method still shows the good efficiency and low-memory usage with losing little accuracy.
Furthermore, the scattering of a large-scale periodic structure consisting of N 0 = 32 × 32 = 1024 element cells is considered.The working frequency is set to be 1 GHz.Different radii of the sphere are considered which lead to about 0.81, 3.38, and 5.46 million numbers of unknowns corresponding to 0.1 m, 0.2 m, and 0.25 m models, respectively.The distance between two centers of neighboring sphere is set to be a fixed value of 1.0 m.Table 2 shows the information of models and the numerical performance between MLFMA and the periodic PSA.It can be seen that for a large-scale problem, the proposed PSA method still needs less computation time and much less memory requirement.It should be noted that in the 3rd example in Table 2 when the sphere radius is set to 0.25 m, the finest box in MLFMA has to be set to 0.20λ to avoid low-frequency breakdown.Figure 3 shows that the RCS results of 0.25 m sphere models computed from the periodic PSA still agree well with FEKO and MLFMA.

4
International Journal of Antennas and Propagation between two cells in this model is 0.005 m which is smaller than 0.1λ.Since EFIE is the only choice for radiation problems, the inner-outer iteration scheme is adopted to accelerate the calculation process.Figure 4 shows the current density distribution of the antenna array.The far-field gain patterns computed by the proposed method and MLFMA are shown in Figure 5.
Obviously, the two methods give nearly the same results.However, as in Table 3, the computation time has been reduced from 419 seconds to 172 seconds in solving matrix equations for the radiation problem.Moreover, the memory usage for MLFMA is 8841 MB while the value is only 358 MB for the proposed method.Again, the proposed method performs much better than MLFMA.The first reason is that only 360 off-diagonal blocks are need to be calculated and stored in our method while the original value is 100 × 100 − 100 = 9900.Secondly, the mesh size is about 0.01λ for the antenna cell (dense mesh).As it is known, the lowest-level box size is limited to be no less than 0.20λ since MLFMA suffers from the low-frequency breakdown, which results in a heavy near-field matrix consumption.

Conclusion
In this paper, an efficient and memory saving method based on pseudoskeleton approximation (PSA) for the effective and accurate analysis of finite periodic structures is presented.The appropriate choice of sampling rows and columns as well as sampling number in PSA is discussed and confirmed.Based on the algebraic fast algorithm PSA, we have carefully handled the self-coupling and mutualcoupling blocks of the original impedance matrix.Moreover, the number of decomposed coupling matrices is minimized by the employment of the displacement invariance of the periodic property.The simulation results of large-scale scattering and radiation examples show that the proposed periodic PSA method needs less computation time and much less memory compared to conventional MLFMA.Hence, numerical results demonstrate the efficiency and superiority of the proposed method.

Table 1 :
Computation time and memory usage of MLFMA and periodic PSA for scattering of 4 × 5 elements.

Table 2 :
Computation time and memory usage of MLFMA and periodic PSA for scattering of 32 × 32 elements.

Table 3 :
Computation time and memory usage of MLFMA and periodic PSA for radiation of 10 × 10 antenna array.