A Self-Consistent Model for Thermal Oxidation of Silicon at Low Oxide Thickness

Thermal oxidation of silicon belongs to the most decisive steps in microelectronic fabrication because it allows creating electrically insulating areas which enclose electrically conductive devices and device areas, respectively. Deal and Grove developed the first model (DG-model) for the thermal oxidation of silicon describing the oxide thickness versus oxidation time relationship with very good agreement for oxide thicknesses of more than 23 nm.Their approach named as general relationship is the basis of many similar investigations. However, measurement results show that the DG-model does not apply to very thin oxides in the range of a few nm. Additionally, it is inherently not self-consistent. The aim of this paper is to develop a self-consistent model that is based on the continuity equation instead of Fick’s law as the DG-model is. As literature data show, the relationship between silicon oxide thickness and oxidation time is governed—down to oxide thicknesses of just a few nm—by a power-of-time law. Given by the time-independent surface concentration of oxidants at the oxide surface, Fickian diffusion seems to be neglectable for oxidant migration. The oxidant flux has been revealed to be carried by non-Fickian flux processes depending on sites being able to lodge dopants (oxidants), the so-called DOCC-sites, as well as on the dopant jump rate.


Introduction
Thermal oxidation of silicon is one of the most decisive fabrication steps in microelectronics because it allows creating electrically insulating areas which enclose electrically conductive devices and device areas, respectively [1].Thermal oxidation enables the formation of thin, almost defect-free, and reliable silicon oxide films.This is the reason for the dominance of silicon in semiconductor industry in comparison to other semiconductive materials with better electrical properties than silicon (e.g., germanium of which the first transistors were made [2]).With increasing miniaturization and ongoing downscaling of transistor dimensions, thinner and thinner oxide films are gaining increasing importance because the thickness of gate oxide layers already went down to just a few atoms [3].
Thermal oxidation of silicon is based on the migration of oxidant species through the already formed silicon oxide layer and following interfacial reactions.The resulting silicon/silicon oxide interface strongly influences the device properties, for example, of metal-oxide-semiconductor fieldeffect transistors (MOSFETs).The thinner the silicon oxide film is the more decisive this influence becomes.The silicon/silicon oxide interface comprises a very thin suboxide layer of less than two monolayers and a small region of about three monolayers of distorted silicon in the substrate below the two-monolayer suboxide [4].At the interface a large number of dangling bonds (unsatisfied valences on immobilized atoms) occur with electronic states within the oxide energy gap.The defect density (about 10 10 cm −2 ) can be reduced by means of annealing by two or more orders of magnitude [4].
Deal and Grove derived a very simple formula describing the relation between oxide thickness  and oxidation time  [5].Their theory is based on simplified assumptions like linear dependence of the mobile oxidant concentration on

The Deal-Grove-(DG-) Model
Deal and Grove describe the oxidation process by the following stages of the inward movement of the transported species of the oxidant (Figure 1) [5]: (i) Transport from the bulk of the oxidizing gas to the outer surface where it reacts or is adsorbed: the steady-state flux  1 of the oxidant is proportional to the difference of the equilibrium concentration  * of Gas Oxide Silicon the oxidant in the oxide and the concentration  0 of the oxidant at the outer surface of the oxide at any given time: where ℎ is a gas-phase transport coefficient.
(ii) Transport of the oxidant across the oxide film towards the silicon: the corresponding flux  2 is assumed to be Fickian: where  is the effective diffusion coefficient, d/d the concentration gradient in the oxide,   the concentration of the oxidant near the oxide-silicon interface, and  the oxide thickness.The right part of (2) follows from the assumed steady-state oxidation condition with  2 / = 0.
(iii) Oxidant reaction at the oxide-silicon interface to form a new layer of SiO 2 as a first-order reaction with the reaction rate : Considering steady-state fluxes and eliminating   and  0 , it yields The growth rate  = d/d of the oxide layer follows directly from the flux  considering  as the number of oxidant molecules which form a unit volume of the oxide layer: Assuming an initial thickness   =  ( = 0), the final oxide thickness  can be calculated from ( 6) by integration: resulting in where Introducing a time  corresponding to the initial oxide thickness becomes The oxide thickness yields / and  are called the linear and the parabolic rate constant, respectively.However, this equation is not self-consistent.If we consider the continuity equation then it follows from  2 / = 0 directly: for all 0 ≤  ≤ .The concentration  0 of the oxidant at the outer surface of the oxide became time-independent: and-given by ( 4)-all fluxes were  =  1 =  2 =  3 ̸ = ().Then, (6) gave or which is not observed at long times and thick oxides.Following (2), the concentration   of the oxidant near the oxide-silicon interface should decrease with time due to an increasing oxide thickness ().This is contradictory to (3) where the flux  3 ∼   should remain constant.This shows that the DG-model is not self-consistent [15].Grove himself agreed to that in a personal communication with one of the authors (Karl Maser): "As the oxide layer grows it will contain more and more oxidant, therefore in reality / is larger than 0. This would correspond to a small curvature in the distribution of oxidant instead of it being a straight line.
Because of this approximation, strictly speaking, the theory is not internally consistent.However, neglecting this curvature introduces little practical error into the calculation of the rate of oxidation" [16].Summarizing, it can be concluded that the DG-model is limited to the following postulates [5]: (i) linear dependence of the mobile oxidant concentration on position in the oxide, (ii) a constant diffusion coefficient, (iii) a constant reaction coefficient in the mass-action law governing the first-order reaction of silicon oxide formation, (iv) steady-state conditions [5].
Consequently, that model is not able to describe processinherent changes of interactions and non-Fickian effects in the dynamic system of thermal silicon oxidation.

Other Models for Thermal Oxidation
Since Deal and Grove have determined   (from the intercept of the linear straight line with the thickness axis) to be about 23 nm, the applicability of the DG-model is restricted to correspondingly higher oxide thicknesses [17].The DGmodel underestimates oxidation rates for thinner oxides significantly.To overcome this discrepancy numerous models were proposed.These models can be distinguished into two classes [10,17]: (i) models modifying the DG-model to achieve a better match with experimental data, (ii) models assuming particular, different physical mechanisms from the DG-model.
The models use either (iii) empirical or semiempirical expressions describing the underlying mechanisms or (iv) generic expressions like power laws of oxide thickness on time.
The latter do not presume any detailed knowledge about interactions of dopant (oxidant) migrations and reaction kinetics.However, because they are model-free, they may give indications of new insights into underlying processes.Many models consider parts of the three steps of the DGmodel, in particular the transport of the oxidant across the oxide film towards the silicon and the oxidant reaction at the oxide-silicon interface to form SiO 2 .
In [18] a simple extension is proposed in which the diffusion through the amorphous oxide happens via molecular oxygen and the silicon oxidation through the reaction of a small concentration of atomic oxygen.This leads to the following relationship between oxide thickness  and oxidation time: with  ∼ sinh .
The model in [19] assumes two parallel, competing reactions to occur.Molecular oxygen reacts directly with silicon to form both silicon dioxide and atomic oxygen.This is accompanied by the dissociation of O 2 as second reaction.The atomic oxygen thus formed either reacts with silicon or recombines to molecular oxygen.The fact that the processes occur in parallel instead of consecutively was explained by the concave curvature of the Arrhenius plot of the observed rate constants [20].
Reference [21] describes the oxidation as a process where two-noninteracting-oxidants instead of only one diffuse through the oxide and react at the oxide-silicon interface.Hence, the total oxidation rate yields with  1 ,  1 ,  2 , and  2 as the constants corresponding to (9).This leads to the following oxide thickness versus oxidation time relationship: with Here, the linear and the logarithmic term of  in ( 21) describe together the rapid initial oxidation regime.The rate constants  1 ,  2 , and  2 / 2 show Arrhenius behavior proportional to exp(−  /) with   the respective activation energy.
Reference [22] (see also [17]) follows the basic assumptions of the DG-model by considering diffusion of molecular oxygen through the oxide followed by a single-step reaction.However, the diffusive flux  2 of the oxidant is related to the difference in the concentration of oxygen between one interface and another newly forming interface instead of the difference between the surface and the interface oxidesilicon: Finally, the growth law yields with The model in [23] (see also [17]) leaves the steady-state assumption for the diffusion of molecular oxygen through the oxide making it consistent with the mass balance at the oxidesilicon interface.As a result of the mass conservation of each species of the system, the oxygen concentration at the oxidesilicon interface becomes almost zero.Solving Fick's law with the boundary condition that the concentration at the oxidesilicon interface is zero, the growth law yields with The symbols  O 2 eq and  SiO 2 mean the molar concentrations of O 2 in SiO 2 at the interface between gas oxygen and silicon dioxide and of SiO 2 , respectively.Unfortunately, (26) does not lead to similar simple analytical expressions as in the models above.Several authors, for example, the ones of [24], have attributed the enhanced diffusion in very thin silicon oxides to space charge effects.Wolters and Zegers-van Duynhoven [24] compare thermal oxidation of silicon with the classical metal oxidation theory and describe the oxidation process as an electrochemical cell.Oxygen ions move into the silica towards the silicon enabled by thermal energy and driven by the concentration gradient.They charge the silicon negatively and built up an electrical field.Several mechanisms affect the equilibrium of electronic and ionic transport, thus influencing the transport of oxygen and, hence, the oxidation rate.In the case of coupled ionic and electronic currents the kinetics of oxide growth follows a power-of-time law [24][25][26]: The exponent  amounts to values between 0.25 (purely parabolic growth) and 1.0.Other authors relate the differences of the oxide growth in thin thermal oxides to compressively strained oxide layers near the silicon oxide-silicon interface which suppress diffusivity [27].The thickness   of the interfacial strained layer is estimated as 0.55-1.03nm for oxidation temperatures between 800 and 1200 ∘ C. The inverse growth rate is given as where the diffusion constant is a function of the depth . 0 itself is proportional to exp(−/  ).Δ is the incremental barrier of the diffusivity at the SiO 2 -Si interface and   the Boltzmann constant.The reaction at the oxide-silicon interface causes a considerably large change in molar volume by a factor of ca.2.2 which is totally relieved at high temperatures above 900 ∘ C and only partially relieved at lower temperatures [28] so that the initial rate constant is deformation-dependent and the parabolic rate constant is stress-dependent.Irene [28] as well as Revesz and Hughes [29] proposed also the transport in micropores within the oxide layer as a reason for the differences between the DG-model and experimental results.Parallel to diffusive flux according to (2), a micropore transport flux  pores occurs which contributes to the total reaction flux  3 =  2 +  pores .
Summarizing, all the proposed mechanisms and extensions to the DG-model described above have been strongly doubted by many authors [17,30].Up to now, none of these models is widely accepted [10].In practice, only empirical or semiempirical models are used which introduce additional terms to the DG-model [31].In [30] it is emphasized that most authors have tried to maintain the validity of the DG-model and have underestimated its limitations.As we have already mentioned above the DG-model is not self-consistent which is agreed on by [24].

Einstein's Deduction of Fick's Law
Fick's law [32] and the constancy of diffusivities were first theoretically founded by Einstein [33] with the help of a model decisively involving symmetric probability functions of dopant jump distances.Denoting (, ) as the dopant concentration at position  and time , then its amount will be altered to (,  + ) by jumps of dopant particles which have reached the reference position  from their starting points ( + ) during the time interval (,  + ).Here  means the jump distance of the dopant particles.Provided that their probability distribution  is only a function of  but is independent of position, concentration, and so forth, it leads to [34] with Following Einstein, the diffusion coefficient  can be interpreted as The constancy of diffusivities is inevitably connected with symmetric probability functions of dopant jump distances, for which Einstein excluded any interactions of these particles with each other.If this assumption does not hold, (31) becomes Since the continuity / relates to the divergence of the dopant flux , / = −/, the first term of the righthand side of (34) represents a non-Fickian dopant flux which is not only caused by the doping gradient /.Accordingly, the quotients ⟨⟩/ and ⟨ 2 ⟩/2 are variables, which may depend on position, concentration, and so forth.Therefore, dopant transport processes in solids have to be considered in a more generic mechanism than only the Fickian diffusion.
For instance, neutron activation analyses of the drive-in process of phosphorus in silicon at 1250 ∘ C under pure oxygen ambient atmosphere reveal that [35] (i) uphill-migration of phosphorus in the interior of the silicon crystal takes place near to the Si-SiO 2 interface, (ii) the total amount of phosphorus contained in the dopant zone of the silicon crystal is independent of the duration of drive-in diffusion.There is not any exodus of phosphorus from the silicon crystal into the SiO 2 -layer.
It has been assumed that the uphill-accumulation is caused by an inhomogeneous distribution of point defects, for example, vacancies, generating a non-Fickian dopant flux component.Non-Fickian phenomena of dopant transport in solids have to be considered on the base of a more generic approach like the concept of occupiable sites for dopant species (dopantoccupiable or DOCC-sites concept, resp.)[34].This atomistic Advances in Condensed Matter Physics kinetic framework considers dopant migration in solids as discrete jumps of dopant particles via distinct sites instead of assuming a continuous course of solid-state diffusion.It is based on two conditions: (i) Sites suitable for occupation by particles of the mobile dopant species must exist-in other words, sites being able to lodge dopants, the so-called DOCC-sites.
(ii) Particles of the mobile dopant species must have sufficient energy to occupy DOCC-sites.
Elementary place change interactions which can occur individually or simultaneously are direct exchanges of atoms and mechanisms involving point defects, for example, vacancy, interstitial, and divacancy mechanisms [36].
In [34,37] the transport of dopant particles via point defects in a cubic lattice is considered in detail.The relationship for the dopant particle flux density comprises four components: ) . ( The two contributions in the middle of the parentheses represent non-Fickian parts and are also called crossover components.
The symbol  means the lattice constant assuming jumps from one lattice plane to the next.Although silicon oxide is not crystalline in the narrower sense, [38] presents results showing that a transition from the crystalline into the amorphous phase proceeds at the SiO 2 /Si interface via the crystalline or ordered phase of SiO 2 . is the dopant DOCCsite neighborhood constant, which is designated as  in original papers [34,37].The dopant jump rate  describes the probability of the jumping process for a preferential dopant particle to jump into the neighboring lattice plane within a considered time interval. is the charge of the dopants,  the temperature, and   the Boltzmann constant.
From this generic model, numerous other effects can be derived like the ones of [39][40][41] as well as familiar Fick's law that would follow as a special case for timely and spatially constant parameters  =  0 and  =  0 .The diffusion constant would yield here  0 =  2  0  0 .
In [28] it is stated that Fickian transport of oxygen is not dominant at oxidation temperatures below 1000 ∘ C. In [42] it is shown by ab initio-modelling of the silicon oxide/silicon interface that thermodynamically stable configurations exhibit defects in the form of three-coordinated silicon atoms, five-coordinated silicon atoms, threefoldcoordinated oxygen atoms, and displaced oxygen atoms.Unfortunately, the corresponding parameters of the migration flux parts in (34) are not known in detail for thermal oxidation.

A Consistent Model for Thermal Oxidation of Silicon
It is the goal of the following considerations to give a summary formulation for the effective oxidant flux (i.e., the flux  2 or , resp.) in the silicon oxide at the interface to silicon.Details of rate process mechanisms, of the oxidation kinetics, and of the nature of the oxidant species will not be considered.We regard the total oxidant flux at the silicon oxide/silicon interface as the decisive parameter determining the growth rate d()/d of the oxide thickness ().
Thermal oxidation of silicon is a dynamic process.The stress-strain differences within the oxide as well as between oxide and silicon, which decisively depend on temperature and oxide thickness, alter oxidant migration and reaction kinetics.In this way, the growing oxide is both a reason for and a result of the continuously changing conditions.
To develop a model for the thermal oxidation of silicon that is self-consistent, the following assumptions should be made: (i) non-Fickian oxidant flux contributions in the oxide layer corresponding to the DOCC-sites concept of dopant migration in solids (see Section 5), (ii) a variable reaction coefficient  in the first-order reaction of silicon oxide formation at the silicon/silicon oxide interface, (iii) a consistent approach for the dependence of the oxidant concentration on depth in the oxide layer.
(a) Continuity Equation.Due to the assumptions made above, oxidant migration has to be described in a different way than by Fick's law and the DG-model.For that reason, continuity equation ( 14) is used to allow the consideration of non-Fickian fluxes of oxidants in the growing oxide.
(b) Growth Law for Silicon Oxide.As Blanc outlines in [6] and the references related to Section 2 of this paper show, there have been performed many studies on the rate of growth of oxide thickness.We will here refer to the raw data in [6] for the same reasons as Blanc did.These values are based on oxidation experiments by Hopper et al. [7] who took care to remove the native oxide prior to oxidation.The oxide thicknesses were measured by means of ellipsometry.The measurements cover different oxygen pressure values, temperatures between 780 ∘ C and 930 ∘ C, and oxidation times starting at a few seconds (corresponding to ca. 1 nm) (Table 1).The double logarithmic plots of oxide thickness  versus time  reveal power law relationships (Figure 2, taken from Figure 1 in [6]) Table 1: Experimental data from [7] in [6].where  and  are constants within distinct frame conditions (Table 2).For the example of (111)-Si at 870 ∘ C and 1.56 atm, the exponent  amounts to 0.688 (Figure 2).The linearity of the relationship ln(/nm) versus ln(/s) holds down to thickness values of about 5 nm, hence defining the limit of its applicability.Lower thicknesses are difficult to describe not only by the given validity but also due to surface nucleation effects determining the initial stage of silicon surface oxidation up to the first microns.The approach of (36) corresponds to similar power-oftime approaches in [43][44][45].Reisman et al. [43] found values for  which are neither unity nor 0.5, as might be expected from the DG-model, but exhibited a value of ca.0.8.Their literature survey presents values of  between 0.25 and 0.8 for dry oxidation depending only on oxidation temperature and oxygen partial pressure.Ngau et al. [45] expanded (36) by a time  corresponding to the growth of any existing oxide on the silicon: In [24,25] a power series similar to (36) was proposed which was related to coupled ionic and electronic currents in growing oxides (see Section 3).The reciprocal relationship shows close similarities to the DG-model.Interestingly, it is shown in [25] that 1/ and / are coupled linearly and are also interrelated with .It is stated there that this is strongly in favor of models which predict oxidation by power laws.
(c) Concentration of the Mobile Oxidant Species in the Oxide.
In the following we will use a new generic approach for the concentration of the oxidant: where () is parameter changing by time.The often used ansatz (, ) = (/ √ ) would-contrary to the given assumptions-lead again to Fick's equation.Several experiments to determine oxidant profiles in thermal oxides were performed by means of secondary-ion mass spectroscopy (SIMS) using O-16 and O-18 isotopes, respectively [13,21,[46][47][48].The investigations relate to double and triple oxidation steps executed in different isotopic atmospheres.It can be seen in all publications that the resulting profiles differ significantly from Fickian diffusion as is assumed in the DG-model.
The measurements in [13, Figure 2] show a clear exponential decay of O-18 up to 40 nm in depth.Extending the second oxidation step from 10 min to 35 min discloses a modified but again exponential profile [13, Figure 3].The experiments in [21,47,48] reveal the O-18 species to be predominantly bonded to lattice sites, so that the interstitial concentration is an insignificant fraction of the migration process.A third oxidation step performed in O-16 after the initial O-16/O-18 oxidation steps shows that the O-18 species stayed with their positions and did not migrate into the new oxide region [21].The investigations in [46] reveal that the oxide growth obeys a linear behavior up to 50 nm in thickness apart from an initial region of more rapid growth.The O-18 silicon oxide is found to grow also at the outer oxide surface (gas-oxide interface) which corresponds to an uphill-migration of O-18.
(d) Oxidant Flux in the Oxide.The oxidant flux   at  = () is given as where  = 4.6 ⋅ 10 22 O-atoms/cm 3 is the stoichiometric concentration of oxygen atoms of the SiO 2 network.Equation (36) as model-free approximation function () and the corresponding flux   are the essential bases for a consistent model of the oxidation process.
To solve continuity equation ( 14) a function (, ) according to (40) has to be found which fits (36) and (40) This formula is valid for the entire oxide layer (0 ≤  ≤ ()) and satisfies continuity equation ( 14) for the oxidant concentration (, ).The function () has to be determined from the boundary conditions at  = () and the position of the oxide-silicon phase boundary.Here, (39), (35), and (44) yield: It can be seen easily that a relation of the type is the sought-after solution of (45).Therefore, one gets the implicit relationship between the constants   , , , and : Table 3: Ratio / calculated from (47) and the data from Table 2.
/ 5.5 ⋅ 10 16 cm −3 [8] 2.392 ⋅ 10 −6 19.4 ⋅ 10 16 cm −3 [9] 8.435 ⋅ 10 −6 Noteworthily, any connections between parameters named in (47) have been unknown up to the present.Because the values  (from Table 2) as well as   and  are known, the ratio / is calculable (Table 3).It is much smaller than unity.From ( 41), (44), and ( 46) one gets In the previous representation, we intentionally avoided introducing Fickian diffusion as the mechanism for dopant migration as the DG-model does.For that reason, we have now to consider what mechanisms govern the migration processes at thermal oxidation.Equation ( 49) will be the starting point for these considerations.Inserting  = () as well as ( 36) into (49) gives where   means the oxidant concentration at the siliconsilicon oxide interface.Putting measured and calculated values of the ratio / of Table 3 in (50) gives the decisive result: In other words, the mobile oxidant concentration throughout the oxide layer has here been revealed to obey the relations or at 0 ≤  ≤ ().This means that the Fickian diffusion is neglectable, so that the flux   (45) has to be formulated in terms of non-Fickian components explained in the DOCCsites concept (Section 5, (35)).Inserting ( 52) into (48) yields the relation which describes empirically the experimental results.More detailed knowledge about the process kinetics has been deduced from several theoretical investigations [9, 49-52], which disclose neutral oxygen molecules as that species carrying mainly the oxidant flux through the oxide network.Hence, the drift component can be neglected, The postulate is inferred from numerical analyses, which take into account the difference between the barrier height of atomic oxygen migration and that of molecules.Results of Hoshino et al. [52] based on density functional calculations favor firmly the molecular oxidant migration.The postulate is numerically verified for both -quartz and -cristobalite to simulate approximately the silicon dioxide of thermal silicon oxidation.Noteworthily, these studies are more precise than those of Bongiorno and Pasquarello [49] using first-principle calculations and other simulations.
The vanishing Fickian oxidant flux following from ( 51)-( 53) gives rise to consideration of the oxidant migration in view of the abovementioned DOCC-sites concept, which takes into account non-Fickian dopant flux components in solids-here oxidant in oxide.
applies down to thickness values of 6.6 nm, satisfying the requirements of the present study.Figures 3 and 4 show selected intervals of the relationship between the mean values of ln(/nm) and ln(/s), respectively, corresponding to the power law of oxide thickness  and oxidation time  with respect to (36). Figure 3 and Table 4 refer to Blanc's data set A8, whereas Figure 4 relates to data set A11 for thermal oxidation of (111)-Si at 832 ∘ C in oxygen at 1.0.atm pressure.The latter shows an impressive case of dynamics in the oxidation process.The exponent  of the power law of (36) changes its value significantly within the thickness interval between 13.14 nm and 17.4 nm.

Figure 1 :
Figure 1: Model of Deal and Grove for the oxidation of silicon (after [5, Figure 3]).
(i) Fickian flux due to the dopant concentration gradient   = /, (ii) flux driven by the gradient   = / of the dopant jump rate , (iii) flux driven by the gradient   = / of the DOCCsites concentration , (iv) drift of particles due to the interior electrical field strength :  (, ) =  2  ⋅ (−  −    +    +

Table 2 :
[6])es of  and  in(36), calculated from the data in[6]. min is the minimum silicon oxide thickness where (36) holds.
[8]develop the present model has been found intuitively by trial and error (Karl Maser).  is a time-independent surface concentration.Bongiorno and Pasquarello calculated in[9]the O 2 solubility in amorphous SiO 2 and estimated a value for   of 19.4 ⋅ 10 16 cm −3 for a pressure of 1 atm and a temperature of 1078 ∘ C.They found very satisfactory agreement with the corresponding experimental value of 5.5⋅10 16 cm −3 in[8]that corresponds to a deviation of 0.1 eV on the energy scale of the energy minima for the O 2 molecule in a-SiO 2 which is within the uncertainty of their level of theory.

Table 4 :
Oxide growth rate d()/d calculated for distinct oxide thickness intervals for Blanc's data set A8 (Figure3).All rate values are calculated at  = 6.65 nm.