Tensorial Model for Photolithography Aerial Image Simulation

In this paper, we propose to adapt multilinear algebra tools to the tensor of Transmission Cross-Coefficients (TCC) values for aerial image simulation, in order to keep the data tensor as a whole entity. This new approach implicitly extends the singular value decomposition (SVD) to tensors, namely the Higher Order SVD, or TUCKER3 tensor decomposition, which is used to obtain a lower rank-(K1, K2, K3, K4) tensor approximation (LRTA-(K1, K2, K3, K4)). This model requires an Alternating Least Squares (ALS) process known as the TUCKALS3 algorithm. The needed number of kernels is estimated using two adapted criteria that are well known in signal processing and information theory. For a runtime improvement, we use a fixed point algorithm to calculate only the needed eigenvectors. This new approach leads to a fast and accurate algorithm to compute aerial images.


Introduction
Over the last decades, photolithography has been instrumental to the historical trend, better known as Moore's law, of doubling chip density roughly every eighteen months while maintaining a nearly constant chip price. Going down to the 32 nm node without changing the wavelength, we move closer and closer to the theoretical optical resolution limit. Therefore Resolution Enhancement Techniques (RETs) have been developed in order to print all shapes of an integrated circuit properly on the silicon wafer. One of the most well-known RETs is Optical Proximity Correction (OPC), which consists of pre-compensating the mask for all nonlinear effects, optical diffraction, and interference effects, the aim being to correct the image distortions between the mask and the silicon caused by this resolution limit. Model-Based Optical Proximity Correction (MBOPC) treats layouts by deforming mask patterns to improve the resolution on silicon. As current masks contain several billions of segments that have to be moved during the several iterations necessary to reach convergence, fast and accurate algorithms are mandatory to perform OPC on a mask in a time reasonable for industry.
To overcome this limitation, mathematical simplifications have to be made. As imaging with a lithography system (see Figure 1) is analogous to microscopy, the theory used in MBOPC is drawn from works originally designed for microscopy. Fourier optics was first used to describe the image formed by a microscope [1, 2]. In 1951, Hopkins developed another formulation (1), which has afterwards been used in other works involving image formation [3][4][5][6]. This formulation has the advantage of a four-way transmission function independent of the shape of the object to image. However, for aerial image computation in photolithography this method remains highly runtime-consuming and needs a numerical approximation in order to meet production constraints.
In the solution called sum of coherent systems (SOCS) provided by Cobb in [7], the Transmission Cross-Coefficients (TCCs) (2) are written in matrix format; the Singular Value Decomposition (SVD) of this matrix, resulting in (3), then allows a huge amount of time to be saved while preserving good accuracy, as only the eigenvectors associated with the most energetic eigenvalues are used in the computation.
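As an illustration of this truncation idea, the sketch below keeps only the K dominant singular triplets of a matrix and measures the resulting relative error. The data is a random symmetric stand-in, not a physical TCC matrix, and K = 10 is an illustrative kernel count (the paper estimates it with AIC/MDL).

```python
import numpy as np

# Truncated SVD of an unfolded TCC-like matrix: keep only the K dominant
# kernels, as in SOCS. T is random symmetric stand-in data, and K = 10
# is an illustrative kernel count.
rng = np.random.default_rng(0)
R = rng.standard_normal((64, 64))
T = R @ R.T                      # symmetric, like a Hermitian TCC matrix

U, s, Vh = np.linalg.svd(T)
K = 10
T_approx = U[:, :K] @ np.diag(s[:K]) @ Vh[:K, :]   # rank-K approximation

rel_err = np.linalg.norm(T - T_approx) / np.linalg.norm(T)
```

Only the K retained columns of U are then needed in the image computation, which is where the runtime saving comes from.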
In what follows, we use Hopkins' formulation, with the notation that a calligraphic X denotes a tensor, a bold uppercase X a matrix, a bold lowercase x a vector, and x a scalar.
The intensity image at the image plane is given by

I(x, y) = ∬∬ Γ(x1 − x2, y1 − y2) F(x − x1, y − y1) F*(x − x2, y − y2) E(x1, y1) E*(x2, y2) dx1 dy1 dx2 dy2, (1)

where I(·) is the intensity image at the image plane, E(·) the object being imaged (the mask), Γ(·) the mutual intensity function describing the coherence properties of the illumination, and F(·) the coherent point spread function describing the properties of the projection system. Rewriting (1) in the spatial frequency domain yields the Transmission Cross-Coefficients

TCC(f1, g1; f2, g2) = ∬ Γ^(f, g) F^(f + f1, g + g1) F^*(f + f2, g + g2) df dg, (2)

where hats denote Fourier transforms; after discretization, the four frequency variables (f1, g1, f2, g2) correspond to the four indices (i1, i2, i3, i4) of the TCC tensor.
Typically, I1 = I2 = I3 = I4 = I for nonastigmatic systems, for symmetry reasons. These values are spatial frequency ranges: they correspond to the range of indices such that TCC(i1, i2, i3, i4) ≠ 0, and they depend on the optical system [8]. In his approach, Cobb [7] unfolded the tensor TCC to obtain a matrix representation T; its eigendecomposition (SVD) then leads to

T = Σ_k σ_k u_k u_k^H, (3)

where the superscript "H" denotes the transpose conjugate.
Then the lower rank matrix approximation of T gives the "kernels" used in the aerial image calculation. According to (2), the TCCs depend on four indices; thus in this paper we propose to represent them as a four-way array. I_n will denote the size of tensor TCC in dimension n. The tensor approach permits treating a four-way array as a whole entity. In the considered problem, this approach preserves interdimensional correlations, avoiding possible data loss [9]. Low rank approximations, such as the Singular Value Decomposition for a two-dimensional array, separate the significant part of the data from the remainder; they are based on the selection of the most significant features of the data. The eigendecomposition of the matrix provides eigenvectors [10]. The K_n eigenvectors associated with the K_n largest (dominant) eigenvalues span the so-called "useful subspace." Useful subspace-based methods are applied to source localization in array processing and to image denoising. These methods were adapted to four-way arrays (tensors) [9]. The tensor model extends the classical matrix model [9]. This approach implicitly extends the SVD to tensors, that is, the Higher Order SVD (HOSVD).
A fixed point algorithm then replaces the HOSVD algorithm to spare some computational load, as already shown in [11].
The remainder of the paper is organized as follows. In Section 2, some tensor definitions and properties are recalled. Section 3 proposes an original lower rank tensor approximation (LRTA) adapted to TCC data approximation. In Section 4, we propose a runtime improvement based on an original method. After presenting experimental results in Section 5, we conclude the paper in Section 6.

Tensor Definitions and Properties
In the following we recall some tensor definitions. For more details the reader is referred to the papers [12][13][14].

Tensor Definition.
We consider an N-order tensor to represent an N-way array. Each of its elements is accessible through N indices (i1, i2, ..., iN). This tensor is written A ∈ C^(I1×···×IN), and each element a_{i1,...,iN}. Each dimension is called an "n-mode", referring to the nth index of the tensor. A zero-order tensor is a scalar, a first-order tensor is a vector, and a second-order tensor is a matrix.
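For concreteness, a fourth-order tensor is nothing more than a four-way array; a minimal sketch with purely illustrative sizes I1, ..., I4:

```python
import numpy as np

# A fourth-order tensor is a four-way array: every element is addressed by
# four indices (i1, i2, i3, i4). The sizes below are purely illustrative.
I1, I2, I3, I4 = 3, 4, 5, 6
A = np.arange(I1 * I2 * I3 * I4, dtype=float).reshape(I1, I2, I3, I4)

order = A.ndim            # the order N = 4
a_elem = A[1, 2, 3, 4]    # one element a_{i1, i2, i3, i4}
```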
To ease understanding this paper, some useful definitions are presented in the following.

Unfolding of Tensor A.
To study multiway data properties in a particular n-mode direction, as shown in Figure 2, we define the n-mode flattening (unfolding) matrix of tensor A, written A_n ∈ C^(In×Mn), where

Mn = I1 ··· I_(n−1) I_(n+1) ··· IN. (4)

2.3. n-Mode Product. The ×n operator defines the n-mode product. It generalizes the matrix product to the product between a tensor and a matrix in a particular n-mode. For a matrix U ∈ C^(J×In), each element with index (i1, ..., i_(n−1), j, i_(n+1), ..., iN) of the product A ×n U is given by Σ_(in=1..In) a_{i1,...,in,...,iN} u_{j,in}.
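A possible numerical sketch of the n-mode unfolding and the n-mode product, assuming the convention that mode n becomes the rows (the column ordering inside Mn may differ between references):

```python
import numpy as np

def unfold(A, n):
    """n-mode unfolding A_n: mode n becomes the rows, shape (I_n, M_n)."""
    return np.moveaxis(A, n, 0).reshape(A.shape[n], -1)

def fold(M, n, shape):
    """Inverse of unfold for a tensor of the given shape."""
    full = [shape[n]] + [s for i, s in enumerate(shape) if i != n]
    return np.moveaxis(M.reshape(full), 0, n)

def nmode_product(A, U, n):
    """A x_n U: multiply every n-mode vector of A by the matrix U."""
    out_shape = list(A.shape)
    out_shape[n] = U.shape[0]
    return fold(U @ unfold(A, n), n, out_shape)
```

With this convention, A ×n U is just an ordinary matrix product applied to the n-mode unfolding, followed by a refolding.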
2.4. Tensorial Scalar Product. Suppose two N-order tensors A and B ∈ C^(I1×···×IN). Their scalar product is defined as ⟨A, B⟩ = Σ_{i1,...,iN} a_{i1,...,iN} b*_{i1,...,iN}.

2.5. Useful Subspace Definition. Let E^(n) be the nth-mode vector space of dimension In, associated with the nth-mode of tensor A. By definition, E^(n) is generated by the column vectors of the nth-mode flattening matrix A_n ∈ C^(In×Mn). The columns of A_n are the In-dimensional vectors obtained from A by varying index i_n while keeping the other indices fixed; in A_n these vectors are ordered with respect to their indices for every mode but the nth. They are called in the following the nth-mode vectors of tensor A.

The lower rank-(K1, K2, ..., KN) approximation of tensor A is obtained by extending the classical subspace approach to tensors, assuming that each nth-mode vector space E^(n) of dimension In is the superposition of two orthogonal subspaces: (i) the useful (signal) subspace E^(n)_1, of dimension Kn, generated by the Kn singular vectors of matrix A_n associated with the Kn largest singular values; (ii) the complementary subspace E^(n)_2, of dimension In − Kn, generated by the In − Kn singular vectors of matrix A_n associated with the smallest singular values. The dimensions K1, K2, ..., KN can be estimated by means of the well-known AIC (Akaike Information Criterion) or MDL (Minimum Description Length) criteria [10]. These are entropy-based information criteria which have had a fundamental impact on statistical model evaluation problems.
2.8. Tensor Rank. For matrices, rank plays an important role in the SVD, in the canonical decomposition, and in lower rank approximation. Two definitions of tensor rank can be considered.

Classical Rank. Consider a tensor A of order N. From the tensor definition above, A is built on N vector spaces E^(1), ..., E^(N) of dimensions I1, ..., IN. Tensor A has rank one if, for all n = 1, ..., N, there exist N vectors u^(n) ∈ E^(n) such that A is equal to their outer product: A = u^(1) ∘ u^(2) ∘ ··· ∘ u^(N). By extension, A is a tensor of rank K if K is the minimal number of rank-one tensors which, by summation, give A. We write the rank of tensor A as Rank(A) = K.

n-Mode Rank. The n-mode rank of a tensor generalizes the column (respectively row) rank of a matrix: the n-mode rank of A is the rank of its n-mode flattening matrix A_n.
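A small sketch illustrating that a rank-one tensor is an outer product of vectors, and that each of its n-mode unfoldings has matrix rank one (the vector values are arbitrary):

```python
import numpy as np

# A rank-one tensor of order N is the outer product of N vectors
# (values below are arbitrary illustrative numbers).
u1 = np.array([1.0, 2.0])
u2 = np.array([1.0, -1.0, 3.0])
u3 = np.array([2.0, 5.0])
A = np.einsum('i,j,k->ijk', u1, u2, u3)   # A = u1 o u2 o u3

# Every n-mode unfolding of a rank-one tensor has matrix rank one,
# so all its n-mode ranks equal 1 as well.
nmode_ranks = [
    np.linalg.matrix_rank(np.moveaxis(A, n, 0).reshape(A.shape[n], -1))
    for n in range(3)
]
```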

Tensor Approximation Based on Lower Rank Tensor Approximation (LRTA) Algorithm
The main objective of this paper is to adapt multilinear algebra tools to approximate the transmission tensor TCC using the Lower Rank Tensor Approximation (LRTA) algorithm. In the rest of the paper we consider N = 4, which corresponds to the fourth-order tensor of TCC values, TCC ∈ C^(I1×I2×I3×I4). The corresponding n-mode rank values are denoted K1, K2, K3, K4. We assume that the value of Kn is known or estimated for all n = 1 to 4 (see Section 4.2 for how to estimate them).
As an extension of the vector and matrix cases, in the tensor formulation the projectors on the nth-mode vector spaces are determined by computing the rank-(K1, K2, K3, K4) approximation of TCC in the least-squares sense. From a mathematical point of view, the rank-(K1, K2, K3, K4) approximation of TCC is the tensor TCC^ which minimizes the quadratic tensor Frobenius norm ||TCC − TCC^||^2 subject to the constraint that TCC^ has n-mode rank Kn for all n = 1 to 4. The procedure is the following.

(1) Input: data tensor TCC, and the dimensions K1, K2, K3, K4 of all n-mode signal subspaces.
(2) Initialization, k = 0: for n = 1 to 4, calculate the projectors P^(n)_0 given by the HOSVD-(K1, K2, K3, K4):
(a) n-mode unfold TCC into matrix TCC_n;
(b) compute the SVD of TCC_n;
(c) compute the matrix U^(n)_0 formed by the Kn eigenvectors associated with the Kn largest singular values of TCC_n; U^(n)_0 is the initial matrix of the n-mode signal subspace orthogonal basis vectors;
(d) form the initial orthogonal projector P^(n)_0 = U^(n)_0 U^(n)H_0 on the n-mode signal subspace;
(e) compute the HOSVD-(K1, K2, K3, K4) of tensor TCC given by B_0 = TCC ×1 P^(1)_0 ×2 ··· ×4 P^(4)_0.
(3) ALS loop: repeat until convergence, that is, for example, while ||B_(k+1) − B_k||^2 > ε, ε > 0 being a prior fixed threshold:
(a) for n = 1 to 4:
(i) form B^(n),k = TCC ×m P^(m)_l for all m ≠ n, with l = k + 1 if m < n and l = k if m > n;
(ii) n-mode unfold B^(n),k into matrix B^(n),k_n;
(iii) compute C^(n),k = TCC_n [B^(n),k_n]^H;
(iv) compute the matrix U^(n)_(k+1) composed of the Kn eigenvectors associated with the Kn largest eigenvalues of C^(n),k; U^(n)_(k+1) is the matrix of the n-mode signal subspace orthogonal basis vectors at the (k+1)th iteration;
(v) form P^(n)_(k+1) = U^(n)_(k+1) U^(n)H_(k+1);
(b) compute B_(k+1) = TCC ×1 P^(1)_(k+1) ×2 ··· ×4 P^(4)_(k+1) and increment k.
(4) Output: the estimated signal tensor is obtained through TCC^ = TCC ×1 P^(1)_(kstop) ×2 ··· ×4 P^(4)_(kstop), where kstop is the index of the last iteration after the convergence of the TUCKALS3 algorithm.
The TUCKER3 tensor decomposition, also known as the Higher-Order SVD (HOSVD) [15], and the lower rank-(K1, K2, K3, K4) tensor approximation (LRTA-(K1, K2, K3, K4)) have recently been used as multimode PCA (Principal Component Analysis): in seismics for wave separation based on a useful subspace method, and in image processing for face recognition, expression analysis, and noise filtering of color images. Tensor TCC can be expressed with the TUCKER3 model [12] as

TCC = S ×1 U^(1) ×2 U^(2) ×3 U^(3) ×4 U^(4),

where S is the core tensor and U^(n) the matrix of n-mode singular vectors. The least-squares solution implies that the best lower rank approximation of tensor TCC, denoted TCC^, is obtained by orthogonally projecting, for every n-mode, the vectors of tensor TCC on the nth-mode useful subspace E^(n)_1, for all n = 1 to 4 [9, 13]:

TCC^ = TCC ×1 P^(1) ×2 P^(2) ×3 P^(3) ×4 P^(4). (14)

In (14), P^(n) = U^(n) U^(n)H is the projector onto the nth-mode useful subspace. The estimation of the projectors is a difficult nonlinear least-squares problem that is generally solved by means of an ALS algorithm referred to as the TUCKALS3 algorithm [12, 16].
The TUCKALS3 algorithm [12, 16] is an optimal algorithm to estimate the different n-mode useful subspaces. The description of the TUCKALS3 algorithm, used in the rank-(K1, K2, K3, K4) approximation, is provided in Algorithm 1.
In this algorithm, the second-order statistics come from the SVD of matrix TCC_n at step 2b, which is equivalent, up to a 1/Mn multiplicative factor, to the estimation of the covariance matrix of the n-mode vectors of tensor TCC [9]. The definition of Mn is given in Section 2.2. In the same way, at step 3(a)iii, matrix C^(n),k is, up to a 1/Mn multiplicative factor, the estimation of the covariance matrix between the n-mode vectors of tensors TCC and B^(n),k. According to step 3(a)i, B^(n),k represents data tensor TCC filtered in every mth mode but the nth mode, by the projection filters P^(m)_l, with m ≠ n, l = k if m > n and l = k + 1 if m < n. The TUCKALS3 algorithm has recently been used to process a multimode PCA in order to perform white noise removal in color images [9]. A good approximation of the rank-(K1, K2, K3, K4) approximation can simply be achieved by computing the HOSVD-(K1, K2, K3, K4) of tensor TCC [9, 15]. Indeed, the HOSVD-(K1, K2, K3, K4) of TCC consists of the initialization step of the TUCKALS3 algorithm, and hence can be considered as a suboptimal solution for the rank-(K1, K2, K3, K4) approximation of tensor TCC [15]. This HOSVD-based technique has recently been used in [9] for image denoising.
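The whole rank-(K1, ..., KN) approximation can be sketched as an HOSVD initialization followed by the ALS refinement described above. The snippet below is a simplified, real-valued illustration of this scheme (of the kind sometimes called HOOI), not the authors' implementation; function names and the fixed iteration count are illustrative.

```python
import numpy as np

def unfold(A, n):
    """n-mode unfolding: mode n becomes the rows."""
    return np.moveaxis(A, n, 0).reshape(A.shape[n], -1)

def lrta(A, ranks, n_iter=10):
    """Rank-(K1..KN) approximation: HOSVD init + ALS refinement (sketch)."""
    N = A.ndim
    # Initialization: Kn dominant left singular vectors of each unfolding.
    U = [np.linalg.svd(unfold(A, n))[0][:, :ranks[n]] for n in range(N)]
    for _ in range(n_iter):
        for n in range(N):
            # Project A in every mode but the nth (the B^(n),k tensor).
            B = A
            for m in range(N):
                if m != n:
                    P = U[m] @ U[m].T
                    B = np.moveaxis(np.tensordot(P, B, axes=(1, m)), 0, m)
            # Covariance-like matrix between A and B n-mode vectors,
            # then keep its Kn dominant eigenvectors.
            C = unfold(A, n) @ unfold(B, n).T
            w, V = np.linalg.eigh(C)
            U[n] = V[:, np.argsort(w)[::-1][:ranks[n]]]
    # Final projection of A onto the estimated useful subspaces.
    out = A
    for n in range(N):
        P = U[n] @ U[n].T
        out = np.moveaxis(np.tensordot(P, out, axes=(1, n)), 0, n)
    return out
```

On a tensor whose multilinear rank actually equals (K1, ..., KN), this sketch recovers the tensor exactly, which is a convenient sanity check.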

Runtime Improvement: Fixed Point Algorithm
As computing the LRTA involves computing four Singular Value Decompositions for the TCC decomposition, we propose a fixed point algorithm based on Gram-Schmidt orthogonalisation to lower the runtime.

Fixed Point Algorithm.
We propose to adapt a fixed point algorithm [17] to replace the SVD in the LRTA (Algorithm 1), as in [11]. It spares a large amount of computational load while preserving a high level of accuracy, and it has not been used in the optical domain before. Although the SVD needs to be computed only once, this method is very useful during model tuning, when several model variants are compiled.
One way to compute the K orthonormal basis vectors is to use the Gram-Schmidt method (Algorithm 2).
The eigenvector with the dominant eigenvalue is estimated first. Similarly, the remaining K − 1 basis vectors (orthonormal to the previously estimated ones) are estimated one by one, in decreasing order of dominance. The previously estimated p − 1 basis vectors are used to find the pth basis vector. The algorithm for the pth basis vector converges when the new value u+_p and the old value u_p are such that u+T_p u_p = 1. In practice it is economical to use a finite tolerance for the convergence criterion, |u+T_p u_p − 1| < η, where η is a prior fixed threshold.
Let U = [u_1, u_2, ..., u_K] be the matrix whose columns are the K orthonormal basis vectors. Then UU^T is the projector onto the subspace spanned by the K eigenvectors associated with the K largest eigenvalues. This subspace is also called the "signal subspace." It can be used in the LRTA-(K1, K2, K3, K4) to retrieve the basis vectors U^(n)_0 in step 2c of Algorithm 1. Thus, the initialization step is faster since it needs only the Kn first basis vectors instead of all In, and it skips step 2b, that is, the SVD of the data tensor n-mode unfolding matrix TCC_n.
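A sketch of this fixed point procedure for a real symmetric matrix, following the structure of Algorithm 2 (power iteration with Gram-Schmidt deflation); variable names and tolerances are illustrative:

```python
import numpy as np

def fixed_point_basis(T, K, eta=1e-12, max_iter=1000):
    """Estimate the K dominant eigenvectors of a symmetric matrix T one by
    one, by power iteration with Gram-Schmidt deflation (Algorithm 2 sketch)."""
    d = T.shape[0]
    rng = np.random.default_rng(0)
    U = np.zeros((d, K))
    for p in range(K):
        u = rng.standard_normal(d)         # step 2: random initialization
        u /= np.linalg.norm(u)
        for _ in range(max_iter):
            u_new = T @ u                  # step 3: u_p <- T u_p
            for j in range(p):             # step 4: Gram-Schmidt deflation
                u_new -= (u_new @ U[:, j]) * U[:, j]
            u_new /= np.linalg.norm(u_new) # step 5: normalize
            converged = abs(u_new @ u) > 1 - eta   # |u+^T u| close to 1
            u = u_new
            if converged:
                break
        U[:, p] = u
    return U
```

The returned U can replace the SVD output of step 2c: U @ U.T is the projector onto the signal subspace, and the cost is linear in the number K of requested eigenvectors.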
According to what has been explained above, the fixed point algorithm replaces the SVD in the LRTA algorithm in order to improve runtime. However, a prerequisite of this algorithm is the knowledge of each n-mode rank. A method to determine them is developed in the next subsection.

n-Mode Rank Estimation.
Whereas a common way to obtain the K1, K2, K3, K4 values is to use empirical data [18], we present here a principled tool optimizing the calculation. Each projector P^(n), n = 1, 2, 3, 4, is estimated by the truncation of the unfolding matrix TCC_n using the SVD, that is, by keeping the Kn eigenvectors associated with the Kn largest singular values of TCC_n, n = 1, 2, 3, 4. In order to estimate the value of Kn for each n-mode, we extend the well-known detection criteria [10]. These criteria were initially developed for estimating the number of dominant eigenvalues of a matrix. The optimal signal subspace dimension is obtained merely by minimizing either the AIC (Akaike's Information Criterion) [19] or the MDL (Minimum Description Length) [20] criterion. This approach does not require any subjective threshold settings. The number of eigenvectors can be determined from the eigenvalues of the n-mode unfolding matrix TCC_n. Thus, this method presents the advantage of being self-consistent, as it computes the optimal number of kernels for any illumination configuration.
Consequently, for each n-mode unfolding of TCC, the AIC detection criterion can be expressed as

AIC(k) = −2 Mn (In − k) ln( G(k) / A(k) ) + 2 k (2 In − k),

and the MDL criterion is given by

MDL(k) = −Mn (In − k) ln( G(k) / A(k) ) + (1/2) k (2 In − k) ln Mn,

where G(k) and A(k) are the geometric and arithmetic means, respectively, of the In − k smallest eigenvalues; (λ_i), 1 ≤ i ≤ In, are the In eigenvalues of the n-mode unfolding matrix TCC_n sorted in decreasing order, and Mn is the number of columns of the n-mode unfolding TCC_n.
The n-mode rank Kn is the value of k (k = 1, ..., In − 1) which minimizes the AIC or MDL criterion.
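The two criteria can be sketched directly from the sorted eigenvalues of one n-mode unfolding, following the classical Wax-Kailath form with Mn observation columns. This is a hedged reconstruction; the exact constants may differ from the authors' implementation.

```python
import numpy as np

def aic_mdl(eigvals, M):
    """AIC(k) and MDL(k) from the I_n eigenvalues of one n-mode unfolding;
    M plays the role of M_n (number of columns). Sketch only."""
    lam = np.sort(np.asarray(eigvals, dtype=float))[::-1]
    I = lam.size
    aic, mdl = [], []
    for k in range(I):                     # k = candidate subspace dimension
        tail = lam[k:]                     # the I - k smallest eigenvalues
        G = np.exp(np.mean(np.log(tail)))  # geometric mean
        A = np.mean(tail)                  # arithmetic mean
        L = -M * (I - k) * np.log(G / A)   # log-likelihood term (>= 0)
        aic.append(2.0 * L + 2.0 * k * (2 * I - k))
        mdl.append(L + 0.5 * k * (2 * I - k) * np.log(M))
    return np.array(aic), np.array(mdl)
```

The estimated n-mode rank is then the argmin of either curve; for a spectrum with three clearly dominant eigenvalues above a flat noise floor, both criteria pick k = 3.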

Experimental Results
The proposed methods can be applied to any four-way data set, such as images, multicomponent seismic signals, or hyperspectral images.
We wish to compare the benefits of the LRTA algorithm over the SOCS algorithm. As a data set, we use square matrices with an increasing number of rows and columns to compare the runtime and accuracy of the algorithms.

Runtime Improvement.
We first focus on the runtime gain of the fixed point algorithm over the SVD algorithm. Figure 3 shows runtime curves for the decomposition of matrices of increasing size; the Matlab SVD algorithm is used. With the fixed point algorithm, we have to compute about 10 eigenvectors for each matrix. This value is provided by the AIC criterion and corresponds to an average number of eigenvectors used in classical OPC models. Moreover, as the fixed point algorithm runtime is linear in the number of eigenvectors to compute, Table 1 shows that this number can be increased up to around 300 eigenvectors for 3000 × 3000 matrices and up to around 100 eigenvectors for 120 × 120 matrices. The computations have been run with Matlab on a 2.4 GHz dual-core Pentium with 4 GB RAM under Windows XP. These curves are helpful to understand that the rank given by the AIC and MDL criteria is really connected to the eigenvectors associated with the largest eigenvalues.

Fixed Point Algorithm Reconstruction Error.
In order to assess the precision of our algorithm against the SVD decomposition, Figure 5 shows the reconstruction error of both the fixed point and SVD algorithms, ||T − T^|| / ||T||, where T denotes the original matrix and T^ the reconstructed matrix. For consistency with the fixed point algorithm, the same number of eigenvectors is used for both algorithms. It is difficult to distinguish the two curves in Figure 5; the fixed point algorithm reconstruction error remains only very slightly above the SVD reconstruction error.
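The relative error metric used here is simply the Frobenius-norm ratio; a minimal sketch:

```python
import numpy as np

def rel_error(T, T_hat):
    """Relative reconstruction error ||T - T_hat|| / ||T|| (Frobenius norm)."""
    return np.linalg.norm(T - T_hat) / np.linalg.norm(T)
```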

Comparison between LRTA and SOCS Algorithms.
Figure 6 shows a comparison of the reconstruction error of the SOCS and LRTA algorithms, as well as of both algorithms' runtime.
To obtain these curves, random tensors of increasing size have been created. They have been unfolded following Cobb's method [7], decomposed using the SVD algorithm, and recomposed using only the dominant eigenvectors.
At the same time, they have been decomposed using the LRTA algorithm with the fixed point method and likewise recomposed using only the dominant eigenvectors.
In both cases, the error function is the same as that used in Section 5.3 to compare the fixed point algorithm against the SVD algorithm.
The curves clearly show the benefit of our method based on the tensor data representation and on multilinear algebra tools.
A lower number of kernels is sufficient for the LRTA algorithm to give a result close to the full Hopkins computation. This is confirmed by the results shown in Figure 8, obtained with the same calculus using only 4 kernels for an L-shaped geometry. Note that this sufficient number is estimated using the AIC criterion.

Comparison of Aerial Images
The full Hopkins computation gives the aerial image as close as possible to the image actually obtained on the wafer. One can see that the aerial image obtained using the LRTA approximation looks more like the real image than the one obtained with the SOCS algorithm.

Conclusion
We have proposed here a new approach to the approximation of the Transmission Cross-Coefficients data set in the diffraction theory of optical images, based on multilinear algebra tools. The goal of this work is twofold: to adapt complex physical equations to tensor computation, and to improve runtime by using the fixed point algorithm instead of the HOSVD while preserving accuracy through the LRTA and the AIC criterion. Our method has proven to be faster and more accurate than the existing SOCS method, as we obtain a lower reconstruction error than the SOCS algorithm, whatever the size of the input data. These methods have already proven their efficiency in other domains, such as four-way image restoration and denoising.

(1) Choose K, the number of principal axes or eigenvectors to estimate. Consider the matrix T and set p ← 1.
(2) Initialize the eigenvector u_p, of size d × 1, for example randomly.
(3) Update u_p as u_p ← T u_p.
(4) Apply the Gram-Schmidt orthogonalization process: u_p ← u_p − Σ_(j=1..p−1) (u_p^T u_j) u_j.
(5) Normalize u_p by dividing it by its norm: u_p ← u_p / ||u_p||.
(6) If u_p has not converged, go back to step 3.
(7) Increment the counter p ← p + 1 and go to step 2 until p equals K.
Algorithm 2: Fixed point.

Figure 3: Computational times (in sec.) as a function of the number of rows and columns.

Figure 4(a) shows the AIC and MDL functions for a fourth-order tensor filled with random data. The minimum of the AIC and MDL curves gives the rank estimation value of the considered mode. Figure 4(b) shows the normalized energy of the eigenvalues of the same array.

Figure 4: Energy of matrix eigenvalues and comparison between AIC and MDL criteria.

Figure 5: Reconstruction error comparison between the SVD and fixed point algorithms.

Figure 7 shows aerial images computed using the full Hopkins equation, the SOCS algorithm, and the LRTA algorithm, for 4 (Figure 7(a)), 10 (Figure 7(b)), and 20 kernels (Figure 7(c)). The structure used is a T-shaped geometry of total size 720 nm × 720 nm. The model corresponds to a 193 nm wavelength illumination, 0.75 NA and 0.6 σ; the optical diameter is 1280 nm, and the grid size is 1 nm.

Table 1: Some numerical values corresponding to Figure 3.