Stability Analysis of a Variant of the Prony Method

Prony-type methods are used in many engineering applications to determine the exponential fit corresponding to a data set. In this paper we study a variant of Prony's method that was used by Martín-Landrove et al. in a process of segmentation of T2-weighted MRI brain images. We show the equivalence between that method and the classical Prony method and study the stability of the computed solutions with respect to noise in the data set. In particular, we show that the relative error in the calculation of the exponential fit parameters is linear with respect to the noise in the data. Our analysis is based on classical results from linear algebra, matrix computation theory, and the theory of stability for roots of polynomials.


Introduction
In this paper we consider the problem of recovering the parameters b, λ_j, and C_j, j = 1, ..., k, from a set of measurements y_i, i = 1, ..., n, with n ≥ 2k + 1, and the exponential model

μ(t) = b + Σ_{j=1}^{k} C_j e^{−λ_j t}. (1.1)

The measurement y_i corresponds to the value of expression (1.1) for t = iΔt, Δt > 0. Hence we get the set of equations

y_i = b + Σ_{j=1}^{k} C_j e^{−iλ_j Δt}, i = 1, ..., n. (1.2)
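For concreteness, measurements of the form (1.2) can be generated as in the following sketch; the parameter values are illustrative, not taken from the paper.

```python
# Sketch: generate measurements y_i from the model (1.2).
# The parameter values below are illustrative, not from the paper.
import numpy as np

def exponential_model(t, b, C, lam):
    """Evaluate mu(t) = b + sum_j C_j * exp(-lambda_j * t)."""
    t = np.asarray(t, dtype=float)
    return b + sum(Cj * np.exp(-lj * t) for Cj, lj in zip(C, lam))

k = 2
b, C, lam = 5.0, [100.0, 150.0], [2.0, 10.0]   # illustrative values
dt = 0.04
n = 2 * k + 1                                   # minimal data size n >= 2k + 1
t = dt * np.arange(1, n + 1)                    # t_i = i * dt
y = exponential_model(t, b, C, lam)             # noiseless measurements y_i
```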
This kind of exponential fit appears in the case of T_2-weighted magnetic resonance images of the brain. T_2 is the transverse relaxation time in the process of magnetic resonance; it measures for how long the transverse magnetization lasts in a uniform external magnetic field. Two parameters, TR (repetition time) and TE (echo time), characterize the acquisition of the signal; T_2-weighted refers to a signal produced with long TR and TE.
The exponents λ_j are the nonlinear parameters and correspond to the relaxation rates associated with the different tissues present in the images. The coefficients b, C_1, ..., C_k are the linear parameters and are related to the noise and to the proportions of those tissues in the given images. In previous works, Martín-Landrove et al. [1, 2] considered a variant of the Prony method to recover the relaxation rates, the noise, and the proportions. Paluszny et al. [3] compared that variant of the Prony method with the Variable Projection Method of Golub and Pereyra [4–6]. The method proved to be reliable for low noise levels and k = 1, 2, 3. The Variable Projection Method proved to be more robust in the presence of larger noise levels, but required ten times more computational time to get results comparable to those of the variant of the Prony method. In this paper we first show that the method proposed by Martín-Landrove is equivalent to the Prony method described by Kahn et al. in [7], and then we study the behavior of the linear systems and the conditioning of the polynomial roots that have to be computed to obtain the model parameters using the method.

Prony Method
In the model formulation (1.1) set b = C_0 and λ_0 = 0, and write

μ(t) = Σ_{j=0}^{k} C_j e^{−λ_j t}. (2.1)

Prony-type methods, also known as polynomial methods, use the fact that μ(t) satisfies a difference equation of the form

P(E) μ(t) = 0, (2.2)

where E is the forward shift operator and the values β_j = e^{−λ_j Δt} are the roots of the polynomial

P(z) = Σ_{j=1}^{k+2} δ_j z^{j−1}, (2.4)

which is called the characteristic polynomial of the associated difference equation (2.2). Evaluating (2.2) for t_i = iΔt, i = 1, ..., n, we get a set of linear equations that can be rewritten in matrix form as X_δ^T μ = 0, where μ = (μ(t_1), ..., μ(t_n))^T and the entries of X_δ depend on the coefficients δ_j. If there is no noise in the observation data, then y_i = μ(t_i) and the coefficients δ_j can be determined from the equivalent system of equations

X_δ^T y = 0, (2.10)

where y = (y_1, ..., y_n)^T. Then the β_j are computed as the roots of P(z) and finally

λ_j = −(log β_j)/Δt, j = 0, ..., k. (2.11)
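The root-finding step above can be sketched numerically. The following is a minimal sketch, not the authors' implementation: it assumes the standard Hankel form of the Prony equations (Σ_j δ_j y_{i+j−1} = 0) and uses a couple of extra samples so that the null vector δ is uniquely determined; all parameter values are illustrative.

```python
# Sketch of classical Prony recovery for noiseless data, assuming the Hankel
# formulation sum_j delta_j * y_{i+j-1} = 0. Values are illustrative.
import numpy as np

def prony(y, k, dt):
    """Recover exponents lambda_j from samples y_i = sum_{j=0}^k C_j beta_j^i."""
    m = k + 2                                   # number of coefficients delta_j
    rows = len(y) - m + 1
    H = np.array([y[i:i + m] for i in range(rows)])   # Hankel data matrix
    # A null vector of H gives the coefficients of the characteristic polynomial.
    _, _, vt = np.linalg.svd(H)
    delta = vt[-1]                              # singular vector for smallest singular value
    beta = np.roots(delta[::-1])                # roots of P(z) = sum_j delta_j z^(j-1)
    return np.sort(-np.log(np.abs(beta)) / dt)  # lambda_j = -log(beta_j)/dt

dt, k = 0.04, 2
lam_true = np.array([0.0, 2.0, 10.0])           # lambda_0 = 0 models the offset b
C = np.array([5.0, 100.0, 150.0])
i = np.arange(1, 2 * k + 3)                     # a few extra samples for the null space
y = (C[None, :] * np.exp(-np.outer(i * dt, lam_true))).sum(axis=1)
lam_est = prony(y, k, dt)
```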
In the presence of noisy data, instead of solving the system (2.10), we consider the nonlinear optimization problem

min_δ y^T X_δ X_δ^T y. (2.12)
In order to obtain a nontrivial solution it is necessary to impose restrictions on the parameters δ_j; each choice of restrictions characterizes a particular version of the Prony method. For example, the modified Prony method described in Osborne and Smyth [8, 9] uses

min_δ y^T X_δ X_δ^† y, (2.13)

where X_δ^† is the Moore–Penrose generalized inverse of X_δ. Osborne et al. considered X_δ to be of full rank, hence X_δ^† = (X_δ^T X_δ)^{−1} X_δ^T. We consider a different optimization problem, which appears in the literature, to compare with the Martín-Landrove method:

min_δ y^T X_δ X_δ^T y subject to δ_{k+2} = 1. (2.14)
The above methods and others, such as the classical Prony method, Pisarenko's method, and the linear predictor method, are described in [7–11]. Once the nonlinear parameters have been found, the linear ones are computed as the least squares solution of the linear system obtained by replacing the nonlinear parameters in (1.2), a separation-of-variables approach.
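The final linear step can be sketched as an ordinary least squares solve; in the sketch below the λ_j are treated as already known, and all numbers are illustrative.

```python
# Sketch of the separation-of-variables step: with the nonlinear parameters
# lambda_j fixed, the linear parameters (b, C_j) solve a linear least squares
# problem derived from (1.2). Values are illustrative.
import numpy as np

dt, lam = 0.04, np.array([2.0, 10.0])
i = np.arange(1, 8)
t = i * dt
y = 5.0 + 100.0 * np.exp(-2.0 * t) + 150.0 * np.exp(-10.0 * t)

# Design matrix: a column of ones for b and one column per exponential term.
A = np.column_stack([np.ones_like(t)] + [np.exp(-l * t) for l in lam])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
b_est, C_est = coef[0], coef[1:]
```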

An Alternative Formulation of the Prony Method
To simplify the explanation, let us consider the case k = 3 and n ≥ 7. Then the system of equations (1.2) can be written as

y_i = C_0 + C_1 β_1^i + C_2 β_2^i + C_3 β_3^i, i = 1, ..., n. (3.1)
From (3.1), and defining q_i = y_i − y_{i+1}, we get

q_i = C_1(1 − β_1)β_1^i + C_2(1 − β_2)β_2^i + C_3(1 − β_3)β_3^i, i = 1, ..., n − 1. (3.2)
The dimension of the system can be reduced by using the transformation q_j ↦ q_{j+1} − q_j β_1, to get

Mathematical Problems in Engineering 5
We now apply the transformation q_j ↦ q_{j+1} − q_j β_2, and the new set can be written as

3.5
The equations above are equivalent to

3.6
Finally, we use the transformation q_j ↦ q_{j+1} − q_j β_3. At this final step we get the following system:

3.7
In matrix notation we have

where

If the data values y_i are noiseless, the values w_1, w_2, and w_3 may be obtained from the previous system of equations, and the β_j are the roots of the polynomial

3.10
Once the roots have been computed, the nonlinear parameters can be calculated using (2.11), and likewise the linear parameters, as stated before.

In general, for an arbitrary k, let M_k be the matrix of the corresponding system. In this case the coefficients of the polynomial α(z) are the symmetric functions of β_1, ..., β_k, defined as

3.12
These coefficients are determined by the solution of the system (3.13). Finally, the β_j are the roots of the polynomial

3.14
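Under the assumption that M_k is the k × k Hankel matrix built from the differenced data q_i (an assumption consistent with the factorization M_k = GDH established in Lemma 4.1 below), the procedure for general k can be sketched as follows; the model values are illustrative, not those of the paper.

```python
# Sketch of the alternative formulation for general k, assuming the Hankel
# structure implied by Lemma 4.1: differencing removes b, and the coefficients
# w solve a k x k Hankel system. Values are illustrative.
import numpy as np

def alternative_prony(y, k, dt):
    q = y[:-1] - y[1:]                          # q_i = y_i - y_{i+1}, removes b
    M = np.array([q[i:i + k] for i in range(k)])      # k x k Hankel matrix M_k
    Q = q[k:2 * k]                              # right-hand side Q_k
    w = np.linalg.solve(M, Q)                   # recurrence coefficients
    # alpha(z) = z^k - w_k z^(k-1) - ... - w_1 has roots beta_j = exp(-lambda_j dt)
    beta = np.roots(np.concatenate(([1.0], -w[::-1])))
    return np.sort(-np.log(np.abs(beta)) / dt)

dt, k = 0.04, 2
i = np.arange(1, 2 * k + 2)                     # minimal case n = 2k + 1
y = 5.0 + 100.0 * np.exp(-2.0 * i * dt) + 150.0 * np.exp(-10.0 * i * dt)
lam_est = alternative_prony(y, k, dt)
```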
Next we will study the relationship between the solution obtained by the procedure described above and the Prony method described in Section 2.

Theorem 3.1. Let R be the k × k matrix defined as follows: for k = 1 set R = 1, and for k > 1

3.15
In addition, let P(z) and α_k(z) be the polynomials defined in (2.4) and (3.14), respectively. If δ = (δ_1, ..., δ_{k+1}, 1) is the solution of the optimization problem (2.14), then

Moreover,

(3.17)

Proof. The solution δ = (δ_1, ..., δ_{k+1}, δ_{k+2}) of (2.14) satisfies δ_{k+2} = 1. In the case we are considering, β_0 = 1 is a root of P(z), which implies that

(3.18)

Then we have

3.19
Then, using (2.9), it follows that

3.20
Now, for the polynomial P(z) we have

Proof. Let δ ∈ R^{k+2} be the solution of the optimization problem (2.14), let ζ = R^{−1}(δ_k, ..., δ_1)^T, and let ψ be the least squares solution of the linear system (3.13). From Theorem 3.1 it follows that

In a similar way we can prove that γ in (3.24) is the solution of the optimization problem (2.14) whenever ψ is the least squares solution of the linear equation (3.13).

Stability of the Alternative Prony Method
In (3.13), the arrays M_k and Q_k depend on the vector of measurements y. In the presence of noisy data, the equation that really holds is

where the two perturbation terms depend on the level of noise. Then it is necessary to determine the condition number of the matrix in order to assess the accuracy of the coefficients given by w_k. Let us consider (1.2) for n = 2k + 1. In this situation M_k is a nonsingular square matrix of dimension k × k. In the following lemma we establish a factorization of the matrix M_k that will be used in the stability analysis of the system (3.13). Our analysis is similar to the one described in [12].
Lemma 4.1. Let us consider n = 2k + 1. Let G and H be the k × k matrices defined by

and let D be the k × k diagonal matrix given by d_{i,i} = C_i(1 − β_i)β_i. Then

M_k = GDH. (4.3)
Proof. By definition, q_i = y_i − y_{i+1}, i = 1, ..., 2k. Then

To get (4.6) it is enough to observe that

4.8
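A quick numerical check of the factorization is possible if one assumes the natural Vandermonde-type factors G[i, l] = β_l^{i−1} and H[l, j] = β_l^{j−1}; this choice is an assumption consistent with d_{i,i} = C_i(1 − β_i)β_i, not a formula quoted from the paper.

```python
# Numerical check of M_k = G D H (Lemma 4.1), assuming Vandermonde-type factors
# G[i,l] = beta_l^(i-1), H[l,j] = beta_l^(j-1); this structure is an assumption
# consistent with d_ii = C_i (1 - beta_i) beta_i. Values are illustrative.
import numpy as np

k, dt = 3, 0.04
lam = np.array([2.0, 10.0, 18.0])
C = np.array([70.0, 85.0, 95.0])
beta = np.exp(-lam * dt)

i = np.arange(1, 2 * k + 2)                    # n = 2k + 1 samples
y = 5.0 + (C[None, :] * beta[None, :] ** i[:, None]).sum(axis=1)
q = y[:-1] - y[1:]                             # q_i = y_i - y_{i+1}
M = np.array([q[r:r + k] for r in range(k)])   # Hankel matrix M_k

G = beta[None, :] ** np.arange(k)[:, None]     # G[r,l] = beta_l^r (0-based)
H = beta[:, None] ** np.arange(k)[None, :]     # H[l,c] = beta_l^c (0-based)
D = np.diag(C * (1 - beta) * beta)
factorization_ok = np.allclose(M, G @ D @ H)
```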
The following theorem establishes an estimate for the variation Δw_k of the vector w_k in (3.13), in terms of the noise level in the vector of measurements y = (y_1, ..., y_n)^T.

If ‖Δy‖_∞ < δ, then the variation Δw_k in the solution of the perturbed system satisfies

4.13
In a similar way we get

4.14
By using (4.6) we see that

From (4.13), (4.14), and (4.15), and using Theorem 2.7.2 in [13], it follows that

4.16
Once we have estimated the vector w̃_k = w_k + Δw_k, it remains to consider the corresponding impact on the calculation of the roots of the polynomial

(4.17)

Let β_j be one of the roots of the polynomial α_k(z), and suppose that β̃_j = β_j + Δβ_j is the root of the perturbed polynomial closest to β_j. The following is an estimate for Δβ_j, which follows from the theory of stability for simple roots of polynomials; see [14].

4.19
From (4.12) we see that there is a constant Λ_3 such that

Thus, if δ in Theorem 4.3 is sufficiently small such that

then the conditions of Lemma 4.4 are satisfied and therefore

4.22
With the computed β̃_j, 1 ≤ j ≤ k, we should analyze the least squares solution of the perturbed system, which will be written as B̃ C̃ = y. The following theorem provides an estimate of ‖ΔC‖_2, where C̃ = C + ΔC is the solution of the perturbed system. Note that the following result is the first one in this paper that requires the nonlinear parameters in the model (1.1) to be positive.

4.24
Let us suppose that δ satisfies (4.10) and (4.21), and let

where

and y* is the orthogonal projection of the vector y onto the subspace spanned by the columns of B.
Proof. Let ρ(δ) = δ/Λ_4. Using the mean value theorem, we see that

with β*_j between β_j and β_j + Δβ_j, and 0 < β_j < 1. Then

4.29
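The mean value theorem step can be made explicit. The following is a hedged sketch of the estimate, using only the stated facts that β*_j lies between β_j and β_j + Δβ_j and that 0 < β_j < 1:

\[
\tilde{\beta}_j^{\,i} - \beta_j^{\,i} \;=\; i\,\bigl(\beta_j^{*}\bigr)^{i-1}\,\Delta\beta_j
\qquad\Longrightarrow\qquad
\bigl|\tilde{\beta}_j^{\,i} - \beta_j^{\,i}\bigr| \;\le\; i\,\lvert\Delta\beta_j\rvert
\quad\text{whenever } 0 < \beta_j^{*} < 1 .
\]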
From (4.20) and (4.22) it follows that

4.30
On the other hand

4.31
By definition

If the value of ρ(δ) in (4.25) is given by

and sin σ = 0, then

This estimate is similar to the one obtained in [12], which studies the stability of a confluent Prony system. Since D is a diagonal matrix, we have

η_D = max_j |C_j β_j (1 − β_j)| / min_j |C_j β_j (1 − β_j)|. (4.37)
We do not know a reasonable upper bound for the condition number η_B. It follows that the method is well conditioned if each of the following conditions is satisfied: there is σ > 0 such that β_j ≥ σ for 1 ≤ j ≤ k; the powers β_j are widely separated; and the number of nonlinear parameters remains small. The proximity of two β_j values leads to a large value of the condition number η_G of G, and hence to ill-conditioning of the algorithm. This characteristic is consistent with the observations of Varah [16], who remarked on the ill-conditioned behaviour of any exponential fitting algorithm whenever two parameters β_j are very close.
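The effect of clustered β_j on the conditioning of G can be observed directly; the following sketch assumes a Vandermonde-type G (an assumption, as discussed around Lemma 4.1) and uses illustrative node values.

```python
# Illustration of the ill-conditioning discussed above: the condition number of
# a Vandermonde-type factor G grows sharply when two beta_j values are close.
# The beta values below are illustrative.
import numpy as np

def vandermonde_cond(beta):
    beta = np.asarray(beta, dtype=float)
    k = len(beta)
    G = beta[None, :] ** np.arange(k)[:, None]   # G[i,j] = beta_j^i
    return np.linalg.cond(G, np.inf)             # condition number, infinity norm

well_separated = vandermonde_cond([0.2, 0.5, 0.9])
nearly_equal = vandermonde_cond([0.64, 0.65, 0.9])   # two close beta_j
```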

Numerical Results
In this section we present some numerical experiments to illustrate the actual behavior of our implementation of the alternative Prony method described above. We present results for models with two and three exponentials.
Since the application that motivated our work is related to T_2 magnetic resonance brain images, we take the linear parameters C_j to be positive and such that Σ_{j=1}^k C_j = 255, and the nonlinear ones to lie in the interval [0.65, 26]; see [17].
Given a particular model y(t), the data values y_i are generated by evaluating y(t_i), with Δt = 0.04 and t_i = iΔt, and adding noise. We have used two different kinds of noise: Gaussian and Rician.
In MRI, data are intrinsically complex valued and corrupted with zero-mean Gaussian distributed noise of equal variance [18]. MR magnitude images are formed by taking, pixel by pixel, the magnitude (the square root of the sum of squares) of the two independent Gaussian random variables, the real and imaginary parts. For an MR magnitude image, the Rician probability density function of the measured pixel intensity x is given by

p(x) = (x/δ²) exp(−(x² + y²)/(2δ²)) I₀(xy/δ²), x ≥ 0,

where I₀ is the modified zeroth-order Bessel function of the first kind, y is the underlying noise-free signal amplitude, and δ denotes the standard deviation of the Gaussian noise in the real and imaginary parts.
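Rician noise of this kind can be simulated by adding independent Gaussian noise to the real and imaginary channels and taking the magnitude; the function name and parameter values below are illustrative.

```python
# Sketch of Rician data generation as described above: the magnitude of a
# signal whose real and imaginary parts carry i.i.d. Gaussian noise.
# Function name and values are illustrative.
import numpy as np

def add_rician_noise(y, sigma, rng=None):
    """Return |(y + n_re) + i*n_im| with n_re, n_im ~ N(0, sigma^2)."""
    if rng is None:
        rng = np.random.default_rng(0)          # fixed seed for reproducibility
    real = y + rng.normal(0.0, sigma, size=np.shape(y))
    imag = rng.normal(0.0, sigma, size=np.shape(y))
    return np.hypot(real, imag)                 # pixel-by-pixel magnitude

y = np.array([100.0, 150.0, 200.0])             # noise-free amplitudes
noisy = add_rician_noise(y, sigma=1.0)
```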
The figures are log-log graphs: the horizontal axis corresponds to the noise level, and the vertical axis corresponds to the relative errors in the computations of the linear and nonlinear parameters. For each example we first show the results for one run corresponding to a particular value of the noise, and then the average value of the relative errors over one hundred runs.
The Gaussian noise has zero mean, and its standard deviation δ varies between 10⁻¹⁰ and 1. For the Rician noise we use (y_i, δ), with δ varying between 10⁻¹⁰ and 1, as the parameters for each simulation.

5.3
Note the linear dependence of the errors with respect to the level of noise. We also noticed that, in general, the errors in the nonlinear parameters are slightly smaller than the errors in the linear parameters. Figures 1 and 2 show the results using Gaussian noise. Figures 3 and 4 show the results for the same model with Rician noise; as the behaviour is similar for Gaussian and Rician noise, we consider only Rician noise in the rest of the experiments.
Next we present the results for the model

y(t) = 5 + 100e^{−11t} + 150e^{−12t}. (5.4)

In Figures 5 and 6 the graphs show a deterioration, as the theory suggests, because the nonlinear parameters are closer than in the previous example. In this case the errors in the linear parameters are greater, by more than one order of magnitude, than the errors in the nonlinear parameters.
Finally, we use the model

y(t) = 5 + 70e^{−2t} + 85e^{−10t} + 95e^{−18t}. (5.5)

The results are shown in Figures 7 and 8. Again, there is a deterioration with respect to the first example, caused by the addition of one term to the exponential model. Table 1 shows the computed condition numbers of the relevant matrices for each of the three examples.

Conclusions
In this work we have developed a stability analysis for the alternative formulation of the Prony method presented by Martín-Landrove et al. [1, 2]. The analysis shows the linear dependence between the perturbation of the data and the relative errors in the computed values of the linear and nonlinear model parameters. It is also shown that the errors in the linear parameters depend on both the number and the closeness of the nonlinear parameters.

Theorem 4.5. Let n, y, Δy, and δ be as in the statement of Theorem 4.3, with ‖Δy‖_∞ < δ. Let η_B be the condition number of the matrix B in the Euclidean norm, and consider Λ_4 = min{‖B‖_2, ‖y‖_2}.
Let us suppose that there exists a unique solution to the optimization problem (2.14). A vector δ ∈ R^{k+2} is the solution of the optimization problem (2.14) if and only if R^{−1}(δ_k, ..., δ_1)^T is the least squares solution of the linear equation (3.13).

Lemma 4.2. Let us consider n = 2k + 1. Let η_{M_k}, η_G, and η_D be the condition numbers in the infinity norm of the matrices M_k, G, and D, respectively. Then

Proof. Let η_H be the condition number in the infinity norm of the matrix H. Using Lemma 4.1, it follows that