An Efficient Algorithm for EM Scattering from Anatomically Realistic Human Head Model Using Parallel CG-FFT Method

Research Article

Lei Zhao and Gen Chen

Center for Computational Science and Engineering, School of Mathematics and Statistics, Jiangsu Normal University, Xuzhou 221116, China (xznu.edu.cn)

Academic Editor: Gaobiao Xiao

International Journal of Antennas and Propagation (IJAP), Volume 2014, Article ID 495057, doi:10.1155/2014/495057. ISSN 1687-5877, 1687-5869. Hindawi Publishing Corporation.

Received 1 December 2013; Revised 2 February 2014; Accepted 16 February 2014; Published 24 March 2014

Copyright © 2014 Lei Zhao and Gen Chen. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

An efficient algorithm is proposed to analyze electromagnetic scattering from a high-resolution head model given in pixel data format. The algorithm combines parallel techniques with the conjugate gradient (CG) method and the fast Fourier transform (FFT). With the parallel CG-FFT method, the algorithm can solve electrically very large problems that cannot be solved with the conventional CG-FFT method on a personal computer. The accuracy of the proposed algorithm is verified by comparing numerical results with analytical Mie-series solutions for dielectric spheres. Numerical experiments demonstrate that the proposed method achieves good parallel efficiency.

1. Introduction

In recent years, there has been increasing effort to achieve efficient numerical analysis of large-scale electromagnetic problems, which usually require long computation times and large amounts of memory. An efficient numerical method for large and complex bodies is important for many practical applications. The method of moments (MoM) has become one of the most popular methods for computing scattering problems in a variety of applications. However, MoM requires $O(N^2)$ memory and $O(N^3)$ operations to solve the matrix equation by LU decomposition or Gaussian elimination, where $N$ is the number of unknowns. To reduce the computation time, the CG-FFT method is employed to solve the MoM matrix equation; it is one of the most efficient ways to solve the volume integral equation for dielectric targets and reduces the computational complexity to $O(N \log N)$ per iteration. For most practical EM problems, an ordinary computer is not sufficient because of its limited memory and performance. New developments in parallel-processing techniques and high-performance computing (HPC) systems make it possible to solve large problems that were unattainable in the past. It is therefore increasingly important to develop and parallelize fast algorithms with high parallel performance so that they can exploit the large memory and the many processors of HPC systems.
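As a rough illustration of the storage argument above, consider a hypothetical problem with $N = 4 \times 10^{8}$ unknowns, the order of magnitude of the benchmark reported in this paper, and assume single-precision complex storage at 8 bytes per entry. Both numbers are assumptions made here for illustration, not figures taken from the authors' code.

```latex
\begin{align*}
\text{dense MoM matrix:}\quad & N^{2} \times 8\ \text{bytes}
  = (4 \times 10^{8})^{2} \times 8\ \text{bytes}
  = 1.28 \times 10^{18}\ \text{bytes},\\
\text{one vector of unknowns:}\quad & N \times 8\ \text{bytes}
  = 4 \times 10^{8} \times 8\ \text{bytes}
  = 3.2 \times 10^{9}\ \text{bytes}.
\end{align*}
```

Under these assumptions the dense matrix would need on the order of an exabyte, while a vector of unknowns needs only a few gigabytes, which is why an iterative solver that never forms the matrix explicitly is attractive at this scale.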

In the past few decades, the energy absorbed in the human head exposed to radio-frequency (RF) electromagnetic radiation has raised concern about the possible consequences of electromagnetic radiation for human health. Many studies have calculated the power absorbed in a human body exposed to the electromagnetic (EM) field emitted by radio-communication equipment. In this paper, EM scattering from a high-resolution 3D anatomically realistic model of the human head is considered. Volume integral equations are used to describe the problem. MoM is then used to discretize the coupled integral equations, and a CG-FFT algorithm is used to solve the resulting discrete linear system. Parallelization techniques are applied to speed up the FFT calculation, the vector-vector product, and the matrix-vector product during the CG iteration. The paper gives a detailed description of the proposed parallel implementation of the CG-FFT algorithm with pulse basis functions. The different stages of the parallel algorithm are described, and its overall parallel performance is analyzed. With this implementation, we have run a benchmark model with more than 400 million unknowns and solved a practical EM scattering problem with more than 40 million unknowns on an HPC system with 27 nodes. Each node of the cluster has two Intel Xeon E5520 CPUs and 12 GB of memory, and the nodes are connected by a 10 Gbps Ethernet network. We have verified the accuracy and efficiency of the algorithm by comparing the numerical results with analytical results for dielectric spheres. Numerical results show that the proposed method has good parallel performance.

2. Theory and Methods

Consider a 3D dielectric object of arbitrary shape embedded in a homogeneous space characterized by relative permittivity $\varepsilon_b$; here the homogeneous space is taken to be free space, $\varepsilon_b = 1$. The object, with complex permittivity $\varepsilon_r(\bar{r})$, is enclosed by a cuboid of size $L_x \times L_y \times L_z$. The time dependence $e^{-i\omega t}$ is assumed and suppressed. Under illumination by the incident electric field, the total electric field $\bar{E}$ inside the dielectric object is determined by the following volume integral equation:

$$\bar{E}(\bar{r}) + \frac{1}{4\pi\varepsilon_b} \int_V \bar{\bar{G}}(\bar{r}, \bar{r}') \cdot \left( \varepsilon_r(\bar{r}') - \varepsilon_b \right) \bar{E}(\bar{r}') \, d\bar{r}' = \bar{E}^{\mathrm{inc}}(\bar{r}), \tag{1}$$

where

$$\bar{\bar{G}}(\bar{r}, \bar{r}') = \frac{e^{i k_b R}}{R^5} \begin{bmatrix} G_{xx} & G_{xy} & G_{xz} \\ G_{yx} & G_{yy} & G_{yz} \\ G_{zx} & G_{zy} & G_{zz} \end{bmatrix} \tag{2}$$

is the dyadic Green's function in homogeneous space, with $R = |\bar{r} - \bar{r}'|$ and $k_b$ the wavenumber of the background medium. Its elements are given by

$$\begin{aligned} G_{\xi\zeta} &= (\xi - \xi')(\zeta - \zeta') \left[ (k_b R)^2 + 3i(k_b R) - 3 \right], && (\xi \neq \zeta), \\ G_{\xi\zeta} &= (\xi - \xi')^2 \left[ (k_b R)^2 + 3i(k_b R) - 3 \right] - R^2 \left[ (k_b R)^2 + i(k_b R) - 1 \right], && (\xi = \zeta). \end{aligned} \tag{3}$$

The equivalent equation for the induced current $\bar{J}$ follows from multiplying (1) by $\chi(\bar{r})$:

$$\bar{J}(\bar{r}) + \frac{1}{4\pi} \chi(\bar{r}) \int_V \bar{\bar{G}}(\bar{r}, \bar{r}') \cdot \bar{J}(\bar{r}') \, d\bar{r}' = \bar{J}^{\mathrm{inc}}(\bar{r}), \tag{4}$$

where $\chi(\bar{r}) = \varepsilon_r(\bar{r}) / \varepsilon_b - 1$, and

$$\bar{J}(\bar{r}) = \chi(\bar{r}) \bar{E}(\bar{r}), \qquad \bar{J}^{\mathrm{inc}}(\bar{r}) = \chi(\bar{r}) \bar{E}^{\mathrm{inc}}(\bar{r}) \tag{5}$$

are the normalized electric current inside the dielectric object and the equivalent incident current, respectively.

A box of size $L_x \times L_y \times L_z$ is used to bound the dielectric target and is discretized into $N_x \times N_y \times N_z$ cuboidal cells. The volume of each cell is $\Delta V = \Delta x \Delta y \Delta z$, where $\Delta\xi = L_\xi / N_\xi$ $(\xi = x, y, z)$ and $N_\xi$ is the number of divisions in the $\xi$-direction. Choosing pulse functions as the basis and testing functions, we obtain the discrete form of (4) as

$$J^{D}_{\xi}(m, n, k) + \frac{1}{4\pi} \chi(m, n, k) \sum_{\zeta = x, y, z} \sum_{m' = 0}^{N_x - 1} \sum_{n' = 0}^{N_y - 1} \sum_{k' = 0}^{N_z - 1} G^{D}_{\xi\zeta}(m - m', n - n', k - k') \, J^{D}_{\zeta}(m', n', k') = J^{iD}_{\xi}(m, n, k), \tag{6}$$

in which

$$\begin{aligned} G^{D}_{\xi\zeta}(m - m', n - n', k - k') &= \Delta V \int_{m'\Delta x}^{(m'+1)\Delta x} \int_{n'\Delta y}^{(n'+1)\Delta y} \int_{k'\Delta z}^{(k'+1)\Delta z} G_{\xi\zeta}\!\left( \left(m + \tfrac{1}{2}\right)\Delta x - x', \left(n + \tfrac{1}{2}\right)\Delta y - y', \left(k + \tfrac{1}{2}\right)\Delta z - z' \right) dx' \, dy' \, dz', \\ J^{iD}_{\xi}(m, n, k) &= \int_{m\Delta x}^{(m+1)\Delta x} \int_{n\Delta y}^{(n+1)\Delta y} \int_{k\Delta z}^{(k+1)\Delta z} J^{\mathrm{inc}}_{\xi}(x, y, z) \, dx \, dy \, dz. \end{aligned} \tag{7}$$

We remark that, because pulse basis functions are used, the formulations (6)-(7) effectively describe scattering by small particles of size $\Delta V$, although the dielectric target itself may be continuous. Equation (6) can be written as a linear system of equations

$$Z \cdot I = V, \tag{8}$$

where $Z$ is an $N \times N$ system matrix, $I$ is a column vector containing the coefficients of the unknown currents, and $V$ is a column vector associated with the incident fields in the dielectric object. Here $N = 3 N_x N_y N_z$ is the total number of unknowns. The inner products in (6) are all 3D summations of products of discrete Green's functions and discrete electric currents, which are expensive in both time and memory. For electrically large problems, $N$ is very large and it is very difficult to solve (8) directly. To compute the products of Green's functions and electric currents quickly, the discrete Green's functions are extended to a larger computational domain as

$$G^{e}_{\xi\zeta}(m, n, k) = \pm G^{D}_{\xi\zeta}(m_0, n_0, k_0), \tag{9}$$

where $0 \le m \le 2N_x - 1$, $0 \le n \le 2N_y - 1$, $0 \le k \le 2N_z - 1$, and

$$m_0 = \begin{cases} m, & 0 \le m \le N_x - 1, \\ 2N_x - m, & N_x \le m \le 2N_x - 1, \end{cases} \quad n_0 = \begin{cases} n, & 0 \le n \le N_y - 1, \\ 2N_y - n, & N_y \le n \le 2N_y - 1, \end{cases} \quad k_0 = \begin{cases} k, & 0 \le k \le N_z - 1, \\ 2N_z - k, & N_z \le k \le 2N_z - 1. \end{cases} \tag{10}$$

The signs of the extended discrete Green's functions are determined by the even or odd symmetry of each component with respect to the coordinates in the different extended subdomains. After the extended Green's functions are defined, the equivalent electric current $J^{D}_{\xi}(m, n, k)$ is defined in the extended domain by zero padding as

$$J^{De}_{\xi}(m, n, k) = \begin{cases} J^{D}_{\xi}(m, n, k), & 0 \le m \le N_x - 1, \ 0 \le n \le N_y - 1, \ 0 \le k \le N_z - 1, \\ 0, & \text{else}. \end{cases} \tag{11}$$
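The index folding of (10), the sign choice of (9), and the zero padding of (11) can be sketched in C as follows. The function names, the 0/1/2 component indexing, and the row-major array layout are assumptions made for illustration. The sign rule is read off from (3): a diagonal element is even in every coordinate, while an off-diagonal element $G_{\xi\zeta}$ is odd in the $\xi$ and $\zeta$ coordinates and even in the third.

```c
#include <complex.h>
#include <stddef.h>

/* Index folding of (10): map an extended index m in [0, 2N-1] back to
 * the index m0 used to look up the original discrete Green's function. */
int fold_index(int m, int n)
{
    return (m < n) ? m : 2 * n - m;
}

/* Sign in (9) for component (xi, zeta), with 0 = x, 1 = y, 2 = z.
 * neg[c] is 1 when the extended index along axis c lies in the folded
 * half [N, 2N-1], i.e. when it represents a negative displacement. */
int extended_sign(int xi, int zeta, const int neg[3])
{
    if (xi == zeta) return 1;      /* diagonal: even in all coordinates */
    int s = 1;
    if (neg[xi])   s = -s;         /* odd in the xi coordinate */
    if (neg[zeta]) s = -s;         /* odd in the zeta coordinate */
    return s;
}

/* Zero padding of (11): copy an nx*ny*nz array (row-major, k fastest)
 * into the corner of a (2nx)*(2ny)*(2nz) array and clear the rest. */
void zero_pad(const double complex *j, double complex *je,
              int nx, int ny, int nz)
{
    int ey = 2 * ny, ez = 2 * nz;
    size_t total = (size_t)8 * nx * ny * nz;
    for (size_t i = 0; i < total; ++i) je[i] = 0.0;
    for (int m = 0; m < nx; ++m)
        for (int n = 0; n < ny; ++n)
            for (int k = 0; k < nz; ++k)
                je[((size_t)m * ey + n) * ez + k] =
                    j[((size_t)m * ny + n) * nz + k];
}
```

With this layout the extended arrays hold eight times as many entries as the original grid, which is the price paid for turning the Toeplitz-structured sums in (6) into circular convolutions.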

Using the convolution theorem and the FFT, we obtain the discrete form of the integral equation (6) [18, 19]:

$$J^{D}_{\xi}(m, n, k) + \frac{1}{4\pi} \chi(m, n, k) \, \mathcal{F}^{-1}\!\left\{ \sum_{\zeta = x, y, z} \tilde{G}^{e}_{\xi\zeta}(i, j, l) \, \tilde{J}^{De}_{\zeta}(i, j, l) \right\} = \chi(m, n, k) E^{\mathrm{inc}}_{\xi}(m, n, k), \tag{12}$$

where $\tilde{G}^{e}_{\xi\zeta}(i, j, l)$ and $\tilde{J}^{De}_{\zeta}(i, j, l)$ are the discrete Fourier transforms (DFT) of $G^{e}_{\xi\zeta}(m, n, k)$ and $J^{De}_{\zeta}(m, n, k)$, respectively. The corresponding adjoint operations can also be performed using the FFT. As a consequence, (12) can be solved rapidly with the CG-FFT algorithm. To speed up the FFT calculation, a parallel FFT is used for both the forward and inverse transforms. In the proposed algorithm, both transforms are implemented with the FFTW library, a C subroutine library for computing the discrete Fourier transform in one or more dimensions that supports distributed-memory operation based on the message passing interface (MPI). The vector-vector product in the CG-FFT method, for example, can be parallelized by calling MPI_Allreduce(), as shown in Algorithm 1.

Algorithm 1

    /* Vector updates and local residual norm on each process */
    rnorm = 0.0f;  /* reset the local sum (assumed not done elsewhere) */
    for (int i = 0; i < all; i++)
    {
        /* update the current: J = J + am * P */
        cjx[0][0][i] = cjx[0][0][i] + am * cpx[0][0][i];
        cjy[0][0][i] = cjy[0][0][i] + am * cpy[0][0][i];
        cjz[0][0][i] = cjz[0][0][i] + am * cpz[0][0][i];

        /* update the residual: R = R - am * T */
        crx[0][0][i] = crx[0][0][i] - am * ctx[0][0][i];
        cry[0][0][i] = cry[0][0][i] - am * cty[0][0][i];
        crz[0][0][i] = crz[0][0][i] - am * ctz[0][0][i];

        /* accumulate the squared magnitude of the local residual */
        reax = abs(crx[0][0][i]);
        reay = abs(cry[0][0][i]);
        reaz = abs(crz[0][0][i]);
        rnorm = rnorm + reax * reax + reay * reay + reaz * reaz;
    }

    /* Sum the local norms of all processes into rnorm1 */
    MPI_Allreduce(&rnorm, &rnorm1, 1, MPI_FLOAT, MPI_SUM, world);

3. Numerical Results

To illustrate the accuracy and efficiency of the proposed parallel CG-FFT algorithm, we first consider EM scattering by a dielectric sphere illuminated by a plane wave, for which a closed-form solution exists. In the following examples, the background is free space. The dielectric sphere has relative permittivity $\varepsilon_r = 4$ and radius $r = 0.2$ m. The incident wave is polarized in the $x$ direction and propagates in the $z$ direction, and the operating frequency is 0.3 GHz. Figure 1 compares the internal electric fields computed by parallel CG-FFT with the analytical results and shows good agreement between the two. We have also computed the scattered electric fields from the dielectric object on the observation plane $z = 1.0$ m and compared them with the exact solution, as shown in Figure 2.

Figure 1: Electric field distribution on the center line of the plane $z = 0.1$. (a) $N_x = N_y = N_z = 32$. (b) $N_x = N_y = N_z = 64$.

Figure 2: Electric fields on the plane $z = 1.0$ m. (a) Parallel CG-FFT results. (b) Analytical results.
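A quick check of the grid resolution in the sphere example can be worked out as follows. Two assumptions are made here that the text does not state: the speed of light is rounded to $c \approx 3 \times 10^{8}$ m/s, and the bounding box side is taken equal to the sphere diameter, $L = 2r = 0.4$ m.

```latex
\begin{align*}
\lambda_0 &= \frac{c}{f} \approx \frac{3 \times 10^{8}}{0.3 \times 10^{9}} = 1\ \text{m},
&
\lambda_d &= \frac{\lambda_0}{\sqrt{\varepsilon_r}} = \frac{1}{\sqrt{4}} = 0.5\ \text{m},\\
N_x = 32:\quad \Delta x &= \frac{0.4}{32}\ \text{m} = 12.5\ \text{mm},
&
\frac{\lambda_d}{\Delta x} &= 40,\\
N_x = 64:\quad \Delta x &= \frac{0.4}{64}\ \text{m} = 6.25\ \text{mm},
&
\frac{\lambda_d}{\Delta x} &= 80.
\end{align*}
```

Under these assumptions both grids resolve the wavelength inside the dielectric with several tens of cells.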

Next, we test the parallel performance on an HPC system with 27 nodes, described in Table 1, in which the nodes are connected by 10 Gbps Ethernet. The benchmark model is a homogeneous dielectric cube with $\varepsilon_r = 4$ and an edge length of 0.4 m. The incident wave is the same plane wave as in Figure 1. We compare the network latency inside a node with the latency between nodes; that is, we measure the latency on one node and between two nodes. Figure 3 shows the results for both cases. From Figure 3, communication inside a node is about 4 times faster than communication between nodes, so internode communication will be a bottleneck for the parallel CG-FFT method. To evaluate the performance of the parallel CG-FFT code, we define the performance as

$$\text{Performance (cells/s)} = \frac{N_x \times N_y \times N_z \times \text{Number of iterations}}{\text{Simulation time (s)}}. \tag{13}$$

Table 1: The HPC hardware information.

Item                Value
CPU type            Intel Xeon E5520
Clock speed         2.67 GHz
Number of nodes     27
Available memory    30 × 12 GB (DDR3 1067 MHz)
Operating system    CentOS (Linux)
Network system      BNT 10 Gbps Ethernet

Figure 3: Network latency of nodes.

The performance test results for the parallel CG-FFT method are shown in Figure 4, and the detailed data are listed in Table 2. From Figure 4, the performance increases as long as no more than 8 nodes are used and decreases when 10 nodes are used. The reason is that network latency becomes significant when more than 8 nodes are used. The parallel efficiency is also tested; it is defined as

$$\text{Parallel Efficiency} = \frac{T_1}{P \times T_P}, \tag{14}$$

where $P$ is the number of processes, $T_1$ is the running time with one process, and $T_P$ is the running time with $P$ processes. Figure 5 shows the parallel efficiency of parallel CG-FFT for different discretizations and numbers of processes, and the detailed results are listed in Table 3. From Figure 5 and Table 3, the parallel efficiency is above 60% when no more than 8 nodes are used.

Table 2: Performance test.

Cells              Processes   Number of iterations   Time (s)
64 × 64 × 64       1           30                     22.8
128 × 128 × 128    2           30                     90.4
256 × 256 × 128    4           30                     258.9
256 × 256 × 256    6           30                     367.9
256 × 256 × 512    8           30                     612.3
512 × 512 × 512    10          30                     3289.7
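The performance metric of (13) can be evaluated directly from the rows of the performance table above. The helper below is a minimal sketch; the function name and signature are hypothetical.

```c
/* Performance metric of (13): cells processed per second of wall time.
 *   nx, ny, nz : grid divisions in each direction
 *   iterations : number of CG iterations performed
 *   seconds    : measured simulation time in seconds */
double performance_cells_per_s(long nx, long ny, long nz,
                               int iterations, double seconds)
{
    return (double)nx * (double)ny * (double)nz * iterations / seconds;
}
```

Applied to the tabulated data, the metric is roughly $3.4 \times 10^{5}$ cells/s for the 64 × 64 × 64 run on one process, rises to about $1.6 \times 10^{6}$ cells/s for the 256 × 256 × 512 run on 8 processes, and drops to about $1.2 \times 10^{6}$ cells/s for the 512 × 512 × 512 run on 10 processes, in line with the trend described for Figure 4.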

Table 3: Parallel efficiency.

Cells              Processes   Time (s)    Parallel efficiency
64 × 64 × 64       1           22.8        1
                   2           12.38       0.92
                   4           7.2         0.80

128 × 128 × 128    1           168.1       1
                   2           90.4        0.93
                   4           50.6        0.83
                   8           31.8        0.66

256 × 256 × 512    1           2988.3      1
                   2           1641.8      0.91
                   4           940.1       0.79
                   8           612.3       0.61

512 × 512 × 512    1           19409.2     1
                   2           10904.6     0.89
                   4           6469.7      0.75
                   10          3289.7      0.59
                   12          3171.4      0.51
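The parallel efficiency of (14) is a one-line computation, sketched here with a hypothetical function name so that individual table entries can be checked.

```c
/* Parallel efficiency of (14): T_1 / (P * T_P).
 *   t1 : running time with one process (s)
 *   p  : number of processes
 *   tp : running time with p processes (s) */
double parallel_efficiency(double t1, int p, double tp)
{
    return t1 / (p * tp);
}
```

For example, the 128 × 128 × 128 case with 8 processes gives 168.1 / (8 × 31.8) ≈ 0.66, and the 512 × 512 × 512 case with 12 processes gives 19409.2 / (12 × 3171.4) ≈ 0.51, matching the tabulated values.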

Figure 4: The performance of parallel CG-FFT.

Figure 5: Parallel efficiency for the different cases.

Finally, we use the proposed parallel CG-FFT method to simulate EM scattering from a 3D anatomically realistic human head model exposed to a plane wave at 900 MHz. The popular HUGO model, with a resolution of $1 \times 1 \times 1$ mm, as shown in Figure 6, includes 16 different tissues and organs. The electromagnetic properties ($\varepsilon_r$ and $\sigma$) of the 16 tissues in the model are taken from FCC published data and are listed in Table 4. In our simulation, 4 nodes are used and the computation time is about 65 minutes. Figure 7 shows the electric field on the head surface. Because the HUGO model is object oriented, the field distribution over a specific organ can be examined. The electric field distributions on the eyes, bone, and brain are shown in Figures 8, 9, and 10, respectively.

Table 4: Tissue parameters for HUGO model.

Tissue type            Relative permittivity   Relative permeability   Conductivity (S/m)
Skin                   41.405334               1                       0.86678
Fat                    11.333888               1                       0.109162
Muscle                 56.879063               1                       0.995364
Cartilage              42.653103               1                       0.782333
Cerebrospinal fluid    68.638336               1                       2.412575
Sclera                 55.27013                1                       1.166726
Vitreous               68.90184                1                       1.636162
Lens nucleus           35.841595               1                       0.484917
Grey matter            52.724701               1                       0.942193
White matter           38.886288               1                       0.590815
Nerve                  32.530067               1                       0.573612
Thyroid                59.683323               1                       1.038448
Tongue                 55.27013                1                       0.936192
Bone                   20.787804               1                       0.339975
Blood                  61.360718               1                       1.538069
Air                    1                       1                       0
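To use the tabulated $\varepsilon_r$ and $\sigma$ in the contrast $\chi$ of (4), the two values have to be combined into a complex relative permittivity. The paper does not spell this step out, so the sketch below is an assumption: it uses the standard relation $\varepsilon_c = \varepsilon_r + i\sigma/(\omega\varepsilon_0)$, which matches the $e^{-i\omega t}$ time convention stated earlier. The function names are hypothetical.

```c
#include <complex.h>

/* Complex relative permittivity under the e^{-i omega t} convention
 * (assumed mapping, not taken from the paper):
 *   eps_c = eps_r + i * sigma / (omega * eps0)
 *   eps_r   : relative permittivity
 *   sigma   : conductivity in S/m
 *   freq_hz : frequency in Hz */
double complex complex_permittivity(double eps_r, double sigma, double freq_hz)
{
    const double pi = 3.14159265358979323846;
    const double eps0 = 8.8541878128e-12;   /* vacuum permittivity, F/m */
    double omega = 2.0 * pi * freq_hz;
    return eps_r + I * (sigma / (omega * eps0));
}

/* Contrast chi = eps / eps_b - 1, as defined after (4). */
double complex contrast(double complex eps, double eps_b)
{
    return eps / eps_b - 1.0;
}
```

Under this assumption, skin at 900 MHz has an imaginary part of about 17.3, so lossy tissue contributes a contrast whose imaginary part is comparable in size to its real part, while air gives zero contrast and carries no induced current.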

Figure 6: A cut plane of the HUGO human head model.

Figure 7: The total electric field distribution on the head surface.

Figure 8: The total electric field distribution on the eyes.

Figure 9: The total electric field distribution on bone.

Figure 10: The total electric field distribution on the brain.

4. Conclusion

In this paper, we have analyzed the performance of an efficient MPI parallel implementation of the CG-FFT algorithm on HPC systems. The code can run on both shared-memory and distributed-memory machines and scales well. Special attention was paid to communication during the matrix-vector product and the vector-vector product, which is a key factor in parallel performance. We solved a problem with more than 400 million unknowns on an HPC system with 27 nodes.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

This work was supported in part by the National Science Foundation of China under Grant no. 61372057, in part by Natural Science Foundation of the Jiangsu Higher Education Institutions under Grant no. 10KJD180004, and in part by Postgraduate Innovation Project of Jiangsu Province under Grant no. CXZZ13_0973.

References

[1] R. F. Harrington, Field Computation by Moment Methods, MacMillan, New York, NY, USA, 1968.
[2] P. M. Goggans, A. A. Kishk, and A. W. Glisson, "Electromagnetic scattering from objects composed of multiple homogeneous regions using a region-by-region solution," IEEE Transactions on Antennas and Propagation, vol. 42, no. 6, pp. 865–871, 1994. doi:10.1109/8.301713
[3] R. D. Graglia, P. L. E. Uslenghi, and R. S. Zich, "Moment method with isoparametric elements for three-dimensional anisotropic scatterers," Proceedings of the IEEE, vol. 77, no. 5, pp. 750–760, 1989. doi:10.1109/5.32065
[4] J. M. Jarem, "Method-of-moments solution of a parallel-plate waveguide aperture system," Journal of Applied Physics, vol. 59, no. 10, pp. 3566–3570, 1986. doi:10.1063/1.336779
[5] D. E. Livesay and K. Chen, "Electromagnetic field induced inside arbitrarily shaped biological bodies," IEEE Transactions on Microwave Theory and Techniques, vol. 22, no. 12, pp. 1273–1280, 1974. doi:10.1109/TMTT.1974.1128475
[6] T. K. Sarkar and E. Arvas, "An integral equation approach to the analysis of finite microstrip antennas: volume/surface formulation," IEEE Transactions on Antennas and Propagation, vol. 38, no. 3, pp. 305–312, 1990. doi:10.1109/8.52238
[7] H. Gan and W. C. Chew, "A discrete BCG-FFT algorithm for solving 3D inhomogeneous scatterer problems," Journal of Electromagnetic Waves and Applications, vol. 9, no. 10, pp. 1339–1357, 1995.
[8] T. J. Cui, "Fast algorithm for electromagnetic scattering by buried 3-D dielectric objects of large size," IEEE Transactions on Geoscience and Remote Sensing, vol. 37, no. 5, pp. 2597–2608, 1999. doi:10.1109/36.789654
[9] L. Zhao and T. J. Cui, "CG-FFT algorithm for EM scattering by small dielectric particles with high permittivity and permeability," Microwave and Optical Technology Letters, vol. 49, no. 2, pp. 305–310, 2007. doi:10.1002/mop.22116
[10] L. Zhao, T. J. Cui, and W. D. Li, "An efficient algorithm for EM scattering by electrically large dielectric objects using MR-QEB iterative scheme and CG-FFT method," Progress in Electromagnetics Research, vol. 67, pp. 341–355, 2007. doi:10.2528/PIER06121902
[11] W. Yu, R. Mittra, T. Su, Y. Liu, and X. Yang, Parallel Finite Difference Time Domain Method, Artech House, Norwood, Mass, USA, 2006.
[12] W. Yu, X. Yang, Y. Liu, R. Mittra, D.-C. Chang, C. H. Liao, M. Akira, W. Li, and L. Zhao, "New development of parallel conformal FDTD method in computational electromagnetics engineering," IEEE Antennas and Propagation Magazine, vol. 53, no. 3, pp. 15–41, 2011. doi:10.1109/MAP.2011.6028417
[13] J. M. Taboada, M. G. Araujo, F. O. Basteiro, J. L. Rodriguez, and L. Landesa, "MLFMA-FFT parallel algorithm for the solution of extremely large problems in electromagnetic," Proceedings of the IEEE, vol. 101, no. 2, pp. 350–363, 2013. doi:10.1109/JPROC.2012.2194269
[14] O. P. Gandhi, G. Lazzi, and C. M. Furse, "Electromagnetic absorption in the human head and neck for mobile telephones at 835 and 1900 MHz," IEEE Transactions on Microwave Theory and Techniques, vol. 44, no. 10, pp. 1884–1897, 1996. doi:10.1109/22.539947
[15] G. Lazzi and O. P. Gandhi, "Realistically tilted and truncated anatomically based models of the human head for dosimetry of mobile telephones," IEEE Transactions on Electromagnetic Compatibility, vol. 39, no. 1, pp. 55–61, 1997. doi:10.1109/15.554695
[16] A. K. Lee, H. D. Choi, and J. I. Choi, "Study on SARs in head models with different shapes by age using SAM model for mobile phone exposure at 835 MHz," IEEE Transactions on Electromagnetic Compatibility, vol. 49, no. 2, pp. 302–312, 2007. doi:10.1109/TEMC.2007.897124
[17] Q.-X. Li and O. P. Gandhi, "Thermal implications of the new relaxed IEEE RF safety standard for head exposures to cellular telephones at 835 and 1900 MHz," IEEE Transactions on Microwave Theory and Techniques, vol. 54, no. 7, pp. 3146–3154, 2006. doi:10.1109/TMTT.2006.877050
[18] T. J. Cui and W. C. Chew, "Fast algorithm for electromagnetic scattering by buried 3-D dielectric objects of large size," IEEE Transactions on Geoscience and Remote Sensing, vol. 37, no. 5, pp. 2597–2608, 1999. doi:10.1109/36.789654
[19] J. Weaver, Applications of Discrete and Continuous Fourier Analysis, John Wiley & Sons, New York, NY, USA, 1983.
[20] P. Bernardi, M. Cavagnaro, S. Pisa, and E. Piuzzi, "Specific absorption rate and temperature increases in the head of a cellular-phone user," IEEE Transactions on Microwave Theory and Techniques, vol. 48, no. 7, pp. 1118–1126, 2000. doi:10.1109/22.848494
[21] http://www.fcc.gov/fcc-bin/dielec.sh