3D Parallel Monte Carlo Simulation of GaAs MESFETs

We have investigated three-dimensional (3D) effects in sub-micron GaAs MESFETs using a parallel Monte Carlo device simulator, PMC-3D [1]. The parallel algorithm couples a standard Monte Carlo particle simulator for the Boltzmann equation with a 3D Poisson solver using spatial decomposition of the device domain onto separate processors. The scaling properties of the small signal parameters have been simulated for both the gate width in the third dimension as well as the gate length. For realistic 3D device structures, we find that the main performance bottleneck is the Poisson solver rather than the Monte Carlo particle simulator for the parallel successive overrelaxation (SOR) scheme employed in [1]. A parallel multigrid algorithm is reported and compared to the previous SOR implementation, where considerable speedup is obtained.


INTRODUCTION
As semiconductor device dimensions continue to shrink in ultra-large-scale integration technology, there is an increasing need for full three-dimensional (3D) device models to accurately represent the physical characteristics of the device. Solution of the Boltzmann equation using Monte Carlo methods is currently one of the most widespread techniques used in device simulation at this level of modeling [2]. In the Monte Carlo method, the motion of charge carriers (electrons and holes) is assumed to be given by classical trajectories interrupted by random, instantaneous scattering events which change the energy and momentum of the particles. In a device simulation, the forces governing the free trajectories are determined by the electric fields obtained by solving Poisson's equation on a mesh over the device domain. The random scattering events are generated stochastically using a random number generator and the quantum mechanical scattering probabilities for all possible mechanisms in the semiconductor. In a Monte Carlo device simulation, the solution of the particle motion is synchronized with the solution of Poisson's equation so as to provide an accurate representation of the time-dependent evolution of the fields in the semiconductor.
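The free-flight and scattering selection described above can be illustrated with a minimal Python sketch. It assumes a constant total scattering rate (as is commonly arranged with a self-scattering technique, which the text does not discuss), and the rate values are hypothetical placeholders rather than GaAs data. The function names are ours and are not taken from PMC-3D.

```python
import math
import random

def free_flight_time(total_rate, rng):
    """Sample a free-flight duration (s) for a constant total scattering rate (1/s)."""
    # 1 - rng.random() lies in (0, 1], so the logarithm is always defined.
    return -math.log(1.0 - rng.random()) / total_rate

def select_mechanism(rates, rng):
    """Pick a mechanism index with probability proportional to its rate."""
    threshold = rng.random() * sum(rates)
    cumulative = 0.0
    for index, rate in enumerate(rates):
        cumulative += rate
        if threshold < cumulative:
            return index
    return len(rates) - 1  # guard against floating-point round-off

rng = random.Random(0)
rates = [2.0e12, 5.0e12, 3.0e12]  # hypothetical rates in 1/s, for illustration only
flights = [free_flight_time(sum(rates), rng) for _ in range(20000)]
mean_flight = sum(flights) / len(flights)  # approaches 1 / sum(rates)
counts = [0, 0, 0]
for _ in range(20000):
    counts[select_mechanism(rates, rng)] += 1
```

In a full simulator each selected mechanism would also update the carrier energy and momentum, and the carrier would be accelerated by the local field during each free flight; both steps are omitted here.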
Parallel or multiprocessor computers provide some relief to the computational requirements of Monte Carlo device simulation. We have previously developed a parallel 3D Monte Carlo device simulator, PMC-3D [1], which was implemented on the distributed-memory nCUBE multiprocessor. In this algorithm, a subspace decomposition of the 3D device domain was performed, in which the particles and mesh nodes were distributed in a load-balanced way among the individual processors. During each time step, the particle motion and field calculation are performed locally, and the results are communicated to neighboring processors at the end of the time step. In order to parallelize the solution of Poisson's equation in this initial implementation, an iterative successive overrelaxation (SOR) method with a red-black ordering scheme was used. We have obtained good efficiencies using this algorithm, up to 70% with 512 processors.
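To make the red-black ordering concrete, the following is a minimal serial sketch in Python, not the PMC-3D implementation. It assumes a uniform grid, unit permittivity, and fixed (Dirichlet) values on all faces; the relaxation factor and tolerance are illustrative choices. Points with even i + j + k ("red") depend only on points with odd i + j + k ("black") and vice versa, which is why each colour can be updated concurrently on separate processors.

```python
import math

def red_black_sor(phi, rho, h, omega=1.5, tol=1e-10, max_sweeps=10000):
    """Solve -laplace(phi) = rho in place; boundary values of phi stay fixed."""
    nx, ny, nz = len(phi), len(phi[0]), len(phi[0][0])
    for sweep in range(1, max_sweeps + 1):
        largest_change = 0.0
        for colour in (0, 1):  # 0 = red points, 1 = black points
            for i in range(1, nx - 1):
                for j in range(1, ny - 1):
                    # First interior k with (i + j + k) % 2 == colour.
                    k_start = 1 + ((i + j + 1 + colour) % 2)
                    for k in range(k_start, nz - 1, 2):
                        gauss_seidel = (phi[i - 1][j][k] + phi[i + 1][j][k]
                                        + phi[i][j - 1][k] + phi[i][j + 1][k]
                                        + phi[i][j][k - 1] + phi[i][j][k + 1]
                                        + h * h * rho[i][j][k]) / 6.0
                        change = omega * (gauss_seidel - phi[i][j][k])
                        phi[i][j][k] += change
                        largest_change = max(largest_change, abs(change))
        if largest_change < tol:
            return sweep
    return max_sweeps

# Test problem: a discrete eigenfunction of the 7-point Laplacian on the unit cube,
# so the converged grid solution equals `exact` up to the iteration tolerance.
n = 9
h = 1.0 / (n - 1)
lam = 3.0 * (2.0 - 2.0 * math.cos(math.pi * h)) / (h * h)
exact = [[[math.sin(math.pi * i * h) * math.sin(math.pi * j * h) * math.sin(math.pi * k * h)
           for k in range(n)] for j in range(n)] for i in range(n)]
rho = [[[lam * exact[i][j][k] for k in range(n)] for j in range(n)] for i in range(n)]
phi = [[[0.0] * n for _ in range(n)] for _ in range(n)]
sweeps = red_black_sor(phi, rho, h)
max_error = max(abs(phi[i][j][k] - exact[i][j][k])
                for i in range(n) for j in range(n) for k in range(n))
```

In a distributed-memory version, each processor would hold one subdomain and exchange the boundary layer with its neighbours after each colour pass; that communication step is not shown.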

3D MESFET SIMULATION
As an application of PMC-3D, we have studied the scaling behavior of GaAs MESFETs as both the length and the width of the gate are scaled. Figure 1 shows the result for the particle distribution for a planar and a recessed-gate structure under bias. The Monte Carlo model for bulk GaAs includes all the pertinent scattering mechanisms and a three-valley conduction band. Dirichlet boundary conditions are assumed for the three electrodes, and Neumann boundary conditions are assumed elsewhere, except on the top surface, where Fermi-level pinning of the GaAs-air interface is assumed. The MESFET structure shown in Fig. 1 is doped $2 \times 10^{17}$ cm$^{-3}$ in the region shown, and is surrounded by a semi-insulating substrate which here is assumed doped p-type at $1 \times 10^{15}$ cm$^{-3}$.
We have studied small-signal parameters such as the transconductance and voltage gain as a function of gate width and length scaling. Results for the transconductance are shown in Fig. 2. As can be seen, significant deviations from the linear scaling relation with gate length are observed as the gate width is reduced. The major effect responsible for this deviation is a stronger shift in the threshold voltage with gate length for narrow gate widths, which moves the operating characteristics off the optimum operating point.

THE MULTIGRID METHOD
We have found in the above simulations that the parallel SOR solver consumes over 90% of the computation time in a typical run, motivating the search for a more efficient solution method for Poisson's equation. The multigrid technique is a well-established approach for solving ordinary and partial differential equations. Its main advantage over other iterative methods such as SOR is that its convergence rate is largely insensitive to an increasing number of grid points and/or tighter convergence thresholds [3].
In the multigrid method, the convergence of the Gauss-Seidel iteration is accelerated through the use of coarser grids on which the residual is solved. Improvement of the 2D Poisson-Monte Carlo algorithm has been reported by Saraniti et al. using multigrid methods [4], in which speedups of 10-20 times were reported for a sequential code compared to the SOR method.
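The idea of solving the residual equation on coarser grids can be sketched with a recursive V-cycle. The example below is a deliberately simplified one-dimensional, serial illustration for -u'' = f with fixed end values; the transfer operators (full-weighting restriction, linear interpolation) and the two pre- and post-smoothing sweeps are our assumptions and are not details taken from the 3D parallel solver.

```python
import math

def smooth(u, f, h, sweeps):
    """Red-black Gauss-Seidel sweeps for -u'' = f with fixed end values."""
    for _ in range(sweeps):
        for start in (1, 2):  # odd-indexed points first, then even-indexed points
            for i in range(start, len(u) - 1, 2):
                u[i] = 0.5 * (u[i - 1] + u[i + 1] + h * h * f[i])

def residual(u, f, h):
    """Return r = f - A u on interior points (zero at the two ends)."""
    r = [0.0] * len(u)
    for i in range(1, len(u) - 1):
        r[i] = f[i] - (2.0 * u[i] - u[i - 1] - u[i + 1]) / (h * h)
    return r

def restrict(r):
    """Full-weighting transfer of a fine-grid residual to the next coarser grid."""
    nc = (len(r) - 1) // 2 + 1
    rc = [0.0] * nc
    for i in range(1, nc - 1):
        rc[i] = 0.25 * r[2 * i - 1] + 0.5 * r[2 * i] + 0.25 * r[2 * i + 1]
    return rc

def prolong(ec, n_fine):
    """Linear interpolation of a coarse-grid correction to the fine grid."""
    e = [0.0] * n_fine
    for i in range(len(ec)):
        e[2 * i] = ec[i]
    for i in range(1, n_fine, 2):
        e[i] = 0.5 * (e[i - 1] + e[i + 1])
    return e

def v_cycle(u, f, h, pre=2, post=2):
    """One V-cycle, updating u in place; len(u) must be of the form 2**k + 1."""
    if len(u) == 3:  # coarsest grid: one unknown, solved exactly
        u[1] = 0.5 * (u[0] + u[2] + h * h * f[1])
        return
    smooth(u, f, h, pre)
    rc = restrict(residual(u, f, h))
    ec = [0.0] * len(rc)
    v_cycle(ec, rc, 2.0 * h, pre, post)  # solve the residual equation on the coarser grid
    e = prolong(ec, len(u))
    for i in range(1, len(u) - 1):
        u[i] += e[i]
    smooth(u, f, h, post)

n = 129
h = 1.0 / (n - 1)
lam = (2.0 - 2.0 * math.cos(math.pi * h)) / (h * h)  # discrete eigenvalue
exact = [math.sin(math.pi * i * h) for i in range(n)]
f = [lam * value for value in exact]
u = [0.0] * n
for _ in range(12):
    v_cycle(u, f, h)
max_error = max(abs(a - b) for a, b in zip(u, exact))
```

The smoother removes the high-frequency error on each grid, while the smooth remainder is represented and corrected cheaply on the coarser grids.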
The first task in the multigrid method is to create a hierarchical set of grids ranging from the densest (level $n$) to the coarsest possible (level $k$). The coarsening factor we used is 1/2, which implies that the grid spacing of level $n-1$ is twice as large as the grid spacing of level $n$. Fig. 3 illustrates the two-dimensional representation of the multiprocessor coarsening scheme. Choosing the number of points of the form $2^k + 1$ for all three directions (but not necessarily with equal $k$ values) improves the convergence ratio of the Poisson solver.
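As a small illustration of why point counts of the form $2^k + 1$ are convenient, the Python sketch below lists the grids obtained by repeatedly doubling the spacing. It assumes that all three axes are coarsened together and that coarsening stops once any axis is down to three points; the text does not state how the actual solver treats axes with unequal $k$, so this stopping rule is only an assumption.

```python
def grid_hierarchy(shape):
    """List grid shapes from densest to coarsest, doubling the spacing at each level."""
    levels = [tuple(shape)]
    # An axis with n points has n - 1 intervals; halving requires an even interval
    # count and must leave at least one interior point on the coarser grid.
    while all(n > 3 and (n - 1) % 2 == 0 for n in levels[-1]):
        levels.append(tuple((n - 1) // 2 + 1 for n in levels[-1]))
    return levels

hierarchy = grid_hierarchy((129, 65, 33))
single_level = grid_hierarchy((100, 65, 33))
```

For the 129 x 65 x 33 grid this yields five levels, ending at 9 x 5 x 3, whereas a 100-point axis (99 intervals) cannot be halved even once under this rule.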
The main goal of the relaxation scheme is to reduce the high-frequency components of the error on any given grid. There can be several suitable relaxation schemes for a specific problem, depending on the boundary conditions and/or coarsening method. As the multigrid solver is designed to be a replacement for the former SOR solver, we chose to use a pointwise red-black ordered Gauss-Seidel relaxation scheme and restricted the grids to be homogeneous and uniformly spaced along all three dimensions. Several parallel implementations of the multigrid method have been reported in the literature [5, 6]. Our parallelization of the multigrid code is essentially the same as the former SOR implementation. The partitioning and the communication routines are extended to service the hierarchical set of grids.

[6] A. Greenbaum, "A multigrid method for multiprocessors," Applied Mathematics and Computation, 19:75-88, 1986.

Figure 1: Particle distribution in steady state for a planar GaAs MESFET device ($L_g = 0.25\,\mu$m) and a recessed-gate device ($L_g = 0.5\,\mu$m).

Figure 2: DC transconductance variation with gate length for three different gate widths for the planar device shown in Fig. 1.

Figure 3: Two-dimensional representation of the multiprocessor coarsening scheme. Here level $n$ is the densest, level $n-1$ is the next coarser, and level $n-2$ is the coarsest grid.

Table I: The timings of the PMC-3D device simulator with SOR and MG solvers. The simulation is run for 100 time steps for 20,000 particles with different convergence thresholds on a $129 \times 65 \times 33$ homogeneous grid with uniform grid spacings on a 32-node nCUBE multiprocessor. The timings are ...

VLSI Design, 6(1-4):273-276, 1998.