In this paper, we propose a high-performance parallel strategy for implementing a fast direct solver based on the hierarchical matrices method. Our goal is to directly solve electromagnetic integral equations involving electrically large and geometrically complex targets, which are traditionally difficult to solve with iterative methods. The parallel method of our direct solver combines OpenMP shared-memory programming with MPI message passing for running on a computer cluster. With modifications to the core direct-solving algorithm, hierarchical LU factorization, the new fast solver scales well in parallel implementation despite the sequential nature of the factorization. Numerical experiments demonstrate the accuracy and efficiency of the proposed parallel direct solver for analyzing electromagnetic scattering problems of complex 3D objects with nearly 4 million unknowns.

Computational electromagnetics (CEM), driven by the rapid development of computing technology and a wealth of novel fast algorithms in recent years, has become important to the modeling, design, and optimization of antennas, radars, high-frequency electronic devices, and electromagnetic metamaterials. Among the three major approaches for CEM, finite-difference time-domain (FDTD) [

Traditionally, iterative solvers are combined with fast algorithms to solve the MoM system. Despite the availability of many efficient fast algorithms, the iterative solution of discretized IEs still faces challenges. One major problem is slow convergence. For various reasons, such as complex target geometry, fine attachments, dense discretization, and/or nonuniform meshing, the spectral condition of the impedance matrix of discretized IEs can deteriorate severely. The iteration then converges so slowly that an accurate solution cannot be obtained within a reasonable period of time. To overcome this difficulty, preconditioning techniques are usually employed to accelerate convergence. Popular preconditioners include diagonal blocks inverse [

So far, several works in the literature have discussed the parallelization of direct solvers for electromagnetic integral equations [

The rest of this paper is organized as follows. Section

In this section, we give a brief review of surface IEs and their discretized forms [

Generally, we can solve the linear system (

In order to implement

After the

Overall procedure of

From

Based on the result of

The parallelization of

Since our FDS is executed on a computer cluster, the computing data must be stored distributively in an appropriate way, especially for massive computing tasks. The main data of

Data distribution strategy for the forward/impedance matrix and for the LU-factorized matrix.
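As a minimal sketch of distributed storage, assume the matrix is split into contiguous block rows, one block per MPI node. This layout and the helper name `partition_rows` are illustrative assumptions and are not taken from the distribution strategy of this paper; the sketch only shows how row ownership can be balanced when the row count is not divisible by the number of nodes.

```python
def partition_rows(n_rows, n_nodes):
    """Split n_rows into n_nodes contiguous half-open ranges (start, stop).

    Hypothetical block-row layout: the first (n_rows % n_nodes) nodes
    receive one extra row so that range sizes differ by at most one.
    """
    if n_nodes <= 0:
        raise ValueError("n_nodes must be positive")
    base, extra = divmod(n_rows, n_nodes)
    ranges = []
    start = 0
    for rank in range(n_nodes):
        size = base + (1 if rank < extra else 0)
        ranges.append((start, start + size))
        start += size
    return ranges


def owner_of_row(row, ranges):
    """Return the rank whose range contains the given global row index."""
    for rank, (start, stop) in enumerate(ranges):
        if start <= row < stop:
            return rank
    raise IndexError("row outside the partitioned range")
```

For example, `partition_rows(10, 4)` returns `[(0, 3), (3, 6), (6, 8), (8, 10)]`, so global row 7 would be owned by rank 2 under this assumed layout.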

The

During the

According to the recursive procedure of

For the sake of convenience, we use

After the triangular systems have been solved, the updating for

The overall strategy for parallelizing _{12} and

Parallel strategy/approach for
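The recursive order of operations in a 2-by-2 block LU factorization can be illustrated with a small dense analogue. The sketch below is not the hierarchical-matrix implementation of this paper: it uses plain Python lists, no pivoting, and no low-rank compression, and all function names are hypothetical. It shows only the dependency chain (factor the leading block, solve two triangular systems for the off-diagonal blocks, update the trailing block, recurse) that makes the factorization sequential in nature.

```python
def matmul(a, b):
    """Dense matrix product of two list-of-lists matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def solve_unit_lower(l, b):
    """Solve L X = B for X, where L is unit lower triangular."""
    x = [row[:] for row in b]
    for i in range(len(l)):
        for k in range(i):
            x[i] = [xi - l[i][k] * xk for xi, xk in zip(x[i], x[k])]
    return x


def solve_upper_right(u, b):
    """Solve X U = B for X, where U is upper triangular."""
    x = [row[:] for row in b]
    for row in x:
        for j in range(len(u)):
            row[j] = (row[j] - sum(row[k] * u[k][j] for k in range(j))) / u[j][j]
    return x


def block_lu(a):
    """Recursive 2x2 block LU without pivoting; returns (L, U) with A = L U."""
    n = len(a)
    if n == 1:
        return [[1.0]], [[a[0][0]]]
    h = n // 2
    a11 = [r[:h] for r in a[:h]]
    a12 = [r[h:] for r in a[:h]]
    a21 = [r[:h] for r in a[h:]]
    a22 = [r[h:] for r in a[h:]]
    l11, u11 = block_lu(a11)            # step 1: factor the leading block
    u12 = solve_unit_lower(l11, a12)    # step 2: L11 U12 = A12
    l21 = solve_upper_right(u11, a21)   # step 3: L21 U11 = A21
    prod = matmul(l21, u12)
    schur = [[x - y for x, y in zip(ra, rp)] for ra, rp in zip(a22, prod)]
    l22, u22 = block_lu(schur)          # step 4: factor the updated block
    lower = [r + [0.0] * (n - h) for r in l11] + [r1 + r2 for r1, r2 in zip(l21, l22)]
    upper = [r1 + r2 for r1, r2 in zip(u11, u12)] + [[0.0] * h + r for r in u22]
    return lower, upper
```

In this dense analogue, steps 2 and 3 are independent of each other once step 1 has finished, while step 4 must wait for both.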

To analyze the parallelization efficiency under ideal circumstances, we assume that the load is perfectly balanced and that the computer cluster is equipped with a high-speed network, so that there is no latency for remote response and the internode communication time is negligible. Suppose there are

Apparently, for the process of generating forward/impedance matrix,

For the process of

According to our analysis, for an IE problem of a given size, there is an optimal

Similarly, the number of OpenMP threads

The cluster on which we test our code has 32 physical nodes in total; each node has 64 GB of memory and four quad-core Intel Xeon E5-2670 processors with a 2.60 GHz clock rate. For all tested IE cases, the discrete mesh size is

First, we test the parallelization scalability for solving scattering problems of PEC spheres with different electrical sizes. The radii of these spheres are

Parallelization efficiency for solving scattering problems of PEC spheres of different electrical sizes.
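For reference, parallelization efficiency is conventionally computed as the speedup divided by the number of workers. The short sketch below uses that conventional definition, which we assume corresponds to the quantity reported here; the function names and the timings in the usage note are hypothetical and are not measured results.

```python
def speedup(t_serial, t_parallel):
    """Ratio of single-worker wall time to parallel wall time."""
    return t_serial / t_parallel


def efficiency(t_serial, t_parallel, n_workers):
    """Conventional parallel efficiency: speedup divided by worker count.

    A value of 1.0 corresponds to ideal linear scaling.
    """
    return speedup(t_serial, t_parallel) / n_workers
```

With hypothetical timings of 100 s on one worker and 12.5 s on 16 workers, `efficiency(100.0, 12.5, 16)` evaluates to 0.5, that is, half of the ideal linear speedup.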

Next, we investigate the hybrid parallelization efficiency for different MPI nodes (

Parallelization efficiency for different allocations of MPI nodes and OpenMP threads.

Ideally, larger

Figures

Memory cost for different proportions of MPI nodes to OpenMP threads.

Solving time for different proportions of MPI nodes to OpenMP threads.

In this part, we test the accuracy and efficiency of our proposed parallel FDS and compare its results to analytical solutions. A scattering problem of one PEC sphere with radius

Bistatic RCS of PEC sphere with

Finally, we test our solver with a more realistic case. The monostatic RCS of an airplane model with the dimension of

Monostatic RCS of the airplane model solved by the proposed hybrid parallel FDS and the FMM iterative solver.

In this paper, we present an

The authors declare that there is no conflict of interest regarding the publication of this paper.

This work is supported by the China Scholarship Council (CSC), the Fundamental Research Funds for the Central Universities of China (E022050205), the NSFC under Grant 61271033, the Programme of Introducing Talents of Discipline to Universities under Grant b07046, and the National Excellent Youth Foundation (NSFC no. 61425010).
