RCS Computation by Parallel MoM Using Higher-Order Basis Functions

A Message-Passing Interface (MPI) parallel implementation of an integral equation solver that uses the Method of Moments (MoM) with higher-order basis functions is proposed to compute the Radar Cross-Section (RCS) of various targets. The block-partitioned scheme for the large dense MoM matrix is designed to achieve excellent load balance and high parallel efficiency. Numerical results demonstrate that the higher-order basis in this parallelized scheme is more efficient than the conventional RWG method and is able to efficiently analyze the RCS of various electrically large platforms.


Introduction
Radar Cross-Section (RCS) computation of electrically large platforms has attracted a great deal of attention in the past few decades. One traditional and widely adopted method is the method of moments (MoM) [1]. However, when the operating frequency is high, MoM based on the Rao-Wilton-Glisson basis functions (RWGs) [2,3] produces a very large number of unknowns for electrically large structures. To reduce the number of unknowns and to accelerate the computation, the fast multipole method (FMM) is a feasible approach. Although this technique can achieve the goal to some extent, convergence problems may arise when the model to be simulated is complex. Another choice is to use higher-order polynomials over wires and quadrilateral plates as basis functions over larger subdomain patches [4,5]. The use of higher-order basis functions significantly reduces the number of unknowns. However, higher-order bases are well suited to large smooth structures and offer little benefit for finely detailed structures. In addition, to reduce the total wall clock time, the parallel method divides the large dense MoM matrix into a number of small block matrices that are nearly equal in size and distributes them among all the available processes.
In this paper, a parallel in-core MoM solver combined with higher-order polynomial basis functions (HOBs) is employed on high-performance clusters, which significantly extends the capability of MoM. This technique is capable of solving electrically large scattering problems [3,5,6] of several hundred wavelengths in the maximum dimension.
Section 2 of this paper presents the basic theory of higher-order basis functions and the matrix partition scheme. The computation platforms are described in Section 3. Section 4 gives numerical examples that validate the accuracy, efficiency, and applicability of the method: Section 4.1 demonstrates the convergence of the higher-order basis MoM; Section 4.2.1 validates its accuracy through comparison with measurement results; Section 4.2.2 demonstrates that higher-order bases significantly reduce the number of unknowns and effectively shorten the computation time; Section 4.3 checks the parallel efficiency of the parallel scheme; and Section 4.4 illustrates a real-life problem, the RCS computation of a missile whose maximum electrical dimension exceeds one hundred wavelengths. Finally, Section 5 presents the conclusions, followed by the acknowledgements.

Basic Theory
2.1. Higher-order Basis Functions. Flexible geometric modeling can be achieved by using truncated cones for wires and bilinear patches to characterize surfaces [4]. The surface current over a bilinear surface is decomposed into its p- and s-components, as shown in Figure 1(a). However, the p-current component can be treated as the s-current component defined over the same bilinear surface with an interchange of the p and s coordinates. The approximations for the s-components of the electric and magnetic currents over a bilinear surface are typically defined by where $c_{i1}$, $c_{i2}$ ($i = 0, 1, \ldots, N_p$) are defined as The edge basis functions $E_i(p, s)$ and the patch basis functions $P_{ij}(p, s)$ ($i = 0, \ldots, N_p$, $j = 2, \ldots, N_s$) are expressed by (3) and (4), respectively, where $\alpha_p$, $\alpha_s$ are the unitary vectors defined as The parametric equation of such an isoparametric element can be written in the following form: where $\mathbf{r}_{11}$, $\mathbf{r}_{12}$, $\mathbf{r}_{21}$, $\mathbf{r}_{22}$ are the position vectors of its vertices and $p$ and $s$ are the local coordinates. A right-truncated cone is determined by the position vectors and the radii of its beginning and its end, $\mathbf{r}_1$, $a_1$, $\mathbf{r}_2$, $a_2$, respectively, as shown in Figure 1(b). Generalized wires (i.e., wires that have a curvilinear axis and a variable radius) can be approximated by right-truncated cones.
Currents along wires are approximated by polynomials and can be written as where node basis functions, $N(s)$, and segment basis functions, $S_i(s)$, are expressed as respectively, and where $a_i$ ($i = 2, \ldots, N_s$) are the coefficients, and $I_1 = I(-1)$, $I_2 = I(1)$ are the values of the currents at the wire ends, respectively. The parametric equation of the cone surface can be written as where $\phi$ is the circumferential angle, measured from the x-axis, and $\mathbf{i}_\rho(\phi)$ is the radial unit vector, perpendicular to the cone axis.
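As an illustration of the bilinear patch geometry described above, the following minimal sketch evaluates a point on a patch and its two tangent (unitary) vectors. It assumes the standard bilinear interpolation over local coordinates p, s in [-1, 1] and a particular vertex ordering; both are assumptions made for this example and are not taken from the equations of this paper. The function names are hypothetical.

```python
def bilinear_point(r11, r12, r21, r22, p, s):
    """Point on a bilinear patch for local coordinates p, s in [-1, 1].

    Assumed vertex convention (not from the paper):
    r11 at (p, s) = (-1, -1), r12 at (-1, +1),
    r21 at (+1, -1), r22 at (+1, +1).
    """
    w11 = 0.25 * (1 - p) * (1 - s)
    w12 = 0.25 * (1 - p) * (1 + s)
    w21 = 0.25 * (1 + p) * (1 - s)
    w22 = 0.25 * (1 + p) * (1 + s)
    return tuple(w11 * a + w12 * b + w21 * c + w22 * d
                 for a, b, c, d in zip(r11, r12, r21, r22))


def unitary_vectors(r11, r12, r21, r22, p, s):
    """Tangent vectors dr/dp and dr/ds of the assumed bilinear form."""
    a_p = tuple(0.25 * ((1 - s) * (c - a) + (1 + s) * (d - b))
                for a, b, c, d in zip(r11, r12, r21, r22))
    a_s = tuple(0.25 * ((1 - p) * (b - a) + (1 + p) * (d - c))
                for a, b, c, d in zip(r11, r12, r21, r22))
    return a_p, a_s


# Example: a flat unit square in the z = 0 plane.
square = ((0.0, 0.0, 0.0), (0.0, 1.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0))
center = bilinear_point(*square, 0.0, 0.0)
a_p, a_s = unitary_vectors(*square, 0.0, 0.0)
print(center, a_p, a_s)
```

For a flat rectangular patch the two tangent vectors are constant and orthogonal; for a general (twisted) bilinear patch they vary with p and s.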

2.2. The Matrix Partition Scheme. Because the matrix A is large and dense, it can be divided into smaller blocks that are distributed over a process grid [6]. For explanation purposes, the MoM matrix equation is rewritten in a general form as $AX = B$, where $A$ denotes the complex dense matrix, $X$ is the unknown vector to be determined, and $B$ denotes the given source vector. Assume that the matrix A is divided into 6 × 6 blocks, which are distributed to 6 processes in a 2 × 3 process grid, as illustrated in Figure 2(a). Figure 2(b) shows to which process the blocks of A are distributed using ScaLAPACK's distribution methodology.
In Figure 2(a), the outermost numbers denote the row and column indices of the process coordinates. The top and bottom numbers in any block of Figure 2(b) denote the process rank and the process coordinate of a certain process, respectively, corresponding to the block of the matrix shown in Figure 2(a). By varying the dimensions of the blocks of A and those of the process grid, different mappings can be obtained. This scheme is referred to as a block-cyclic distribution.
Load balancing is critical to the efficient operation of a parallel code. This parallel scheme achieves good load balancing, and little communication between processes is necessary during the matrix filling process [4].
It should also be mentioned that the order of the higher-order basis on each plate is determined by the maximum edge length of that plate. The same load-balancing scheme is used regardless of the order of the basis functions.
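The mapping of the 6 × 6 blocks onto the 2 × 3 process grid can be sketched as follows. This is a minimal illustration of a two-dimensional block-cyclic layout, not the solver's code: the function names are hypothetical, indices are 0-based, the first block is assumed to belong to process (0, 0), and the row-major rank numbering is an assumption that may differ from the ranks shown in Figure 2(b).

```python
from collections import Counter


def block_owner(i, j, prow, pcol):
    """Process coordinates owning block (i, j) in a block-cyclic layout."""
    return (i % prow, j % pcol)


def rank_of(coord, pcol):
    """Row-major rank of a process coordinate (assumed ordering)."""
    return coord[0] * pcol + coord[1]


def global_to_local(g, nb, nprocs):
    """Map a global index g along one dimension to (process, local index).

    nb is the block size and nprocs the number of processes along
    that dimension of the grid.
    """
    block = g // nb
    return block % nprocs, (block // nprocs) * nb + g % nb


NBLOCKS, PROW, PCOL = 6, 2, 3
owners = [[block_owner(i, j, PROW, PCOL) for j in range(NBLOCKS)]
          for i in range(NBLOCKS)]

# Number of blocks held by each rank: equal counts mean balanced storage.
load = Counter(rank_of(c, PCOL) for row in owners for c in row)

for row in owners:
    print(" ".join(f"{rank_of(c, PCOL)}{c}" for c in row))
print(dict(load))
```

With these dimensions every process holds the same number of blocks, which is the property the load-balancing argument relies on.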

Description of the Computation Platforms
To illustrate the versatility of the solver, two representative computer platforms have been chosen.
(1) Personal computer: Quad core Intel I5 processor (2.67 GHz) with 4 GB RAM and 500 GB of hard disk.
(2) Shanghai Supercomputer Center (SSC): 37 nodes of the Magic-cube machine with a total of 592 AMD CPU cores (1.9 GHz per CPU and 4 cores on each CPU), 16 CPU cores on each node, and 4 GB RAM per core, for a total amount of RAM of approximately 2.3 TB (592 cores × 4 GB per core).

Numerical Results and Discussion
4.1. The Accuracy versus the Order of Basis Function. In this benchmark, a PEC sphere model is used to test the relationship between the simulation accuracy and the order of the basis functions. The radius of the sphere is 1.0 m and the simulation frequency is 1 GHz. The incident direction is along the x-axis and the observation plane is XOY, as illustrated in Figure 3. The parallel higher-order MoM with 512 CPUs is employed to calculate the bistatic RCS (dB). The results obtained for different orders of basis are compared in Figure 4, and the information on the simulations is listed in Table 1. Figure 4 shows that the results are stable when the order of the basis functions ranges from two to five; for this model, any order from two to five is therefore a reasonable choice. Table 1 shows that the number of unknowns increases with the order of the basis, and so does the required simulation time.

4.2. Comparison with the Measurement Results and Parallel FMM with RWG Basis. To validate the accuracy and efficiency of the proposed parallel higher-order basis MoM methodology, two benchmarks, a truncated cone and a Y-8 plane, are simulated to calculate their RCS.

4.2.1. Truncated Cone [7]. This benchmark is an end-capped truncated cone oriented along the z-axis and centred in the plane z = 0 (illustrated in Figure 5). The elevation angle (θ) is taken from the positive z-axis and the azimuth angle (φ) from the positive x-axis. There are several interesting points in this target. First, it shows the RCS response of targets with single curvature (common in structural parts of an aircraft, such as the fuselage). It is also important to know the diffraction mechanism at curved edges. Reflection from planar surfaces with curved edges can also be observed. Therefore, this target is especially suitable for the validation of the prediction of objects with flat surfaces delimited by curved edges and for evaluation of curved edge contributions.
This model has been simulated at 7 GHz. The RCS pattern, for HH polarization, corresponds to φ = 0°, and θ ranges from 0° to 180° with a 1° step. The incident direction is perpendicular to the generatrix. The simulation is performed on the first kind of computer platform described above.
Figure 6 shows the RCS pattern of the truncated cone for HH polarization at 7 GHz. Three main lobes are clearly defined. Two of them correspond to the specular reflection from the two bases of the cone. The minor one corresponds to θ = 0° and the major one to θ = 180°, with the different levels resulting from the different areas of the corresponding bases. The other main lobe corresponds to the angle at which the generatrix is perpendicular to the incident direction. Diffraction from the curved edges becomes important in the intermediate region between the main lobes. The RCS pattern is compared with measured results and good agreement is seen.

4.2.2. Y-8 Plane. This benchmark is a real aircraft, the Y-8 (illustrated in Figure 7). The elevation angle (θ) is taken from the positive x-axis and the azimuth angle (φ) also from the positive x-axis. The operating frequency is 100 MHz. The airplane model is 36.2 m long, 38 m wide, and 10.5 m high. The corresponding electrical sizes of the model are 12.1λ, 12.7λ, and 3.5λ, where λ is the free-space wavelength at the operating frequency. The incident wave with HH polarization is along the negative x-axis.
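The electrical sizes quoted above follow directly from the physical dimensions and the operating frequency. The short check below reproduces them; the function name is hypothetical and the only physical input is the speed of light in vacuum.

```python
C0 = 299_792_458.0  # speed of light in vacuum, m/s


def electrical_size(length_m, freq_hz):
    """Length expressed in free-space wavelengths at freq_hz."""
    return length_m * freq_hz / C0


# Y-8 model: 36.2 m x 38 m x 10.5 m at 100 MHz.
y8 = [round(electrical_size(d, 100e6), 1) for d in (36.2, 38.0, 10.5)]
print(y8)  # [12.1, 12.7, 3.5]
```

The same function applied to 11.3 m, 7 m, and 2.85 m at 1.25 GHz gives 47.1λ, 29.2λ, and 11.9λ.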
In this simulation, the order of the higher-order basis is three, and the FMM parameters are as follows.
(2) Top level: 3.
The simulation is performed on the first kind of computer platform described above. The bistatic RCS results obtained by using the proposed parallel higher-order basis MoM method and the parallel FMM method are plotted together in Figure 8. As shown in this figure, the results agree with each other very well from 15° to 345°. The only considerable discrepancy between them occurs in the nose region of the plane, for angles from 0° to 15° and from 345° to 360°.
Some computation parameters of the two methods are compared in Table 2.
From Table 2, one can see that the higher-order basis adopted in the proposed method results in fewer unknowns than the RWG basis used with FMM. The total computation time of the proposed method is only about 23.6% of the time required by the parallel FMM method, which implies that the proposed method is about 4.2 times faster. This benefit comes not only from the smaller number of unknowns needed when using the higher-order basis but also from the parallel matrix partition scheme.

4.3. Parallel Efficiency of Mirage's RCS Computation. In this benchmark, the parallel efficiency of Mirage's RCS computation has been measured with respect to different numbers of processes, as shown in Table 3. The model of the Mirage aircraft is shown in Figure 9, and its geometric dimensions are 11.3 m × 7 m × 2.85 m. The operating frequency is 1.25 GHz. Thus the corresponding electrical dimensions are 47.1λ × 29.2λ × 11.9λ, where λ is the wavelength in free space. The model is placed along the x-axis and is excited by a plane wave with VV polarization propagating along the negative x-axis. Taking the time for 16 processes as a reference, the parallel efficiencies for this simulation are shown in Figure 10, and the simulation times for different numbers of CPUs are listed in Table 3. In the cases of 16, 32, 64, and 128 processes, the parallel efficiencies for the wall time are higher than 80%, which demonstrates that the proposed method reaches an excellent parallel efficiency and is capable of effectively reducing the computation time.
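The parallel efficiency relative to a reference run can be computed as in the sketch below. It assumes the usual definition, in which the efficiency at P processes is the ratio of the reference cost (reference time multiplied by the reference process count) to the actual cost at P; the paper does not state its formula explicitly. The wall times in the example are hypothetical placeholders and are not the values of Table 3.

```python
def parallel_efficiency(ref_procs, ref_time, procs, time):
    """Efficiency relative to a reference run: (T_ref * P_ref) / (T * P)."""
    return (ref_time * ref_procs) / (time * procs)


def speedup(ref_time, time):
    """Speedup relative to the reference run."""
    return ref_time / time


# Hypothetical wall times in seconds (NOT measured data from this paper).
timings = {16: 1000.0, 32: 520.0, 64: 275.0, 128: 150.0}
ref_p = 16

efficiencies = {
    p: parallel_efficiency(ref_p, timings[ref_p], p, t)
    for p, t in timings.items()
}
for p, eff in efficiencies.items():
    print(f"{p:4d} processes: speedup = {speedup(timings[ref_p], timings[p]):.2f}, "
          f"efficiency = {eff:.1%}")
```

By construction the reference run has an efficiency of 100%, so the other values show how much of the ideal linear speedup is retained as processes are added.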

4.4. RCS Computation of Benchmark with Electrically Large Dimension. In the following examples, the elevation angle (θ) is taken from the positive x-axis toward the z-axis and the azimuth angle (φ) from the positive x-axis. The simulations are performed on the second kind of computer platform described in Section 3.
(1) In the first part of this section, the parallel speedup ratio for computing a real missile's bistatic RCS is tested.
The missile model is placed along the x-axis, as illustrated in Figure 11, and Table 4 gives the corresponding parameters of this model. The benchmark operates at 5.0 GHz, for which the number of unknowns is 69247. The simulation times for different numbers of processes and the resulting parallel speedup ratios are listed in Table 5 and Figure 12, respectively. Taking the time for 64 processes as a reference, Figure 12 shows that the parallel speedup ratio is nearly linear. However, in the cases of 192 and 256 processes, the parallel efficiency falls below the theoretical value. This is expected, because the ratio of the communication volume to the computation increases with the number of processes for this problem. Nevertheless, the parallel efficiencies in all four cases remain higher than 80%, which indicates the good parallel performance of the method.
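The in-core storage implied by this problem size can be estimated with a short calculation. The sketch below assumes that each matrix entry is a double-precision complex number (16 bytes) and that the matrix is spread evenly over the processes; neither assumption is stated in the paper, and work buffers are ignored.

```python
def dense_matrix_bytes(n_unknowns, bytes_per_entry=16):
    """Storage of an n x n dense matrix; 16 bytes assumes complex double."""
    return n_unknowns * n_unknowns * bytes_per_entry


N = 69247  # number of unknowns of the missile benchmark at 5.0 GHz
total = dense_matrix_bytes(N)

print(f"total: {total / 2**30:.1f} GiB")
for procs in (64, 192, 256):
    # Even split over processes, as a balanced block-cyclic layout approximates.
    print(f"{procs:3d} processes: {total / procs / 2**30:.2f} GiB per process")
```

Under these assumptions the per-process share is on the order of 1 GiB even at 64 processes, which is consistent with solving the problem in-core on a platform with 4 GB of RAM per core.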
(2) Consider next the missile's bistatic RCS and its surface current distribution.
Table 6 gives the parameters of this benchmark, and Table 7 summarizes the computation results. The RCS results and the surface current distribution of this benchmark are shown in Figures 13 and 14, respectively. This benchmark shows that the proposed method is able to handle electrically large scattering problems of hundreds of wavelengths in the maximum dimension.

Conclusion
In this paper, RCS computation of electrically large platforms using a parallel MoM technique with higher-order basis functions is presented. A load-balanced parallel method is achieved by a matrix partition scheme, so the total wall clock time for solving a large dense matrix is shortened. Its accuracy, efficiency, and applicability are validated through several numerical examples. In conclusion, the method proposed in this paper can solve some electrically large problems with high accuracy and short computation time, which cannot be achieved by the conventional RWG MoM method. This paper also contributes to ongoing research efforts on developing numerically accurate solutions for electrically large problems.

Figure 1: Geometric model: (a) a bilinear surface defined by four vertices; (b) a right-truncated cone defined by position vectors and radii of its beginning and end.

Figure 2: Block-cyclic distribution of a matrix [4]: (a) a matrix consisting of 6 × 6 blocks; (b) rank and coordinates of each process owning the corresponding blocks in (a).

Figure 5: Truncated cone description: size (in mm). The height of the target is 200 mm. The major diameter is 200 mm and the minor one is 100 mm.

Figure 14: Surface current distribution of the missile.

Table 1: Information of the simulation processes.

Table 2: Comparison between the proposed and FMM method.

Table 3: Comparison of the simulation time of Y-8's RCS with respect to processes.

Table 4: Parameters of the model.

Table 5: Comparison of the simulation time of missile's RCS with respect to processes.

Table 6: Parameters of the simulated benchmark.

Table 7: Computation results of the simulated benchmark.