A NOVEL INTERPRETATION OF LEAST SQUARES SOLUTION

We show that the well-known least squares (LS) solution of an overdetermined system of linear equations is a convex combination of all the non-trivial solutions, weighted by the squares of the corresponding denominator determinants of Cramer's rule. This Least Squares Decomposition (LSD) gives an alternative statistical interpretation of least squares, as well as another geometric meaning. Furthermore, when the singular values of the matrix of the overdetermined system are not small, the LSD may still provide flexible solutions. As an illustration, we apply the LSD to interpret the LS-solution in the problem of source localization.

1. INTRODUCTION. Consider an overdetermined system of linear equations
$$Ax = b \tag{1}$$
where $A = [a_{ij}]$ is an $m \times n$ matrix with $m > n$ and $\operatorname{rank}(A) = n$, $b = [b_i]$ is an $m$-column vector, and $x = [x_j]$ is the unknown $n$-column vector. For simplicity, we consider only real numbers. There are at most $\binom{m}{n}$ non-trivial (NT) solutions, obtained from Cramer's rule by choosing $n$ of the $m$ rows:
$$x_j[i_1 \ldots i_n] = \frac{\det[A: i_1 \ldots i_n;\, b: j]}{\det[A: i_1 \ldots i_n]}, \qquad j = 1, \ldots, n, \quad 1 \le i_1 < \cdots < i_n \le m, \tag{2}$$
where $\det[A: i_1 \ldots i_n]$, assumed non-zero, is the $n \times n$ minor formed from $A$ by taking the rows $i_1, \ldots, i_n$, and $\det[A: i_1 \ldots i_n;\, b: j]$ is the same determinant with its $j$th column replaced by the corresponding entries $b_{i_1}, \ldots, b_{i_n}$ of the vector $b$.
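To make Eq. (2) concrete, here is a minimal Python sketch (ours, not from the paper; the helper name nt_solutions and the NumPy dependency are our own choices) that enumerates all $\binom{m}{n}$ NT-solutions by Cramer's rule:

```python
# Minimal sketch (ours): enumerate the NT-solutions of Ax = b via Eq. (2).
from itertools import combinations
import numpy as np

def nt_solutions(A, b):
    """Yield (rows, denominator determinant, NT-solution) for each n-row choice."""
    m, n = A.shape
    for rows in combinations(range(m), n):
        S = A[list(rows), :]              # the n x n subsystem, det[A: i1...in]
        d = np.linalg.det(S)              # Cramer denominator determinant
        if abs(d) < 1e-12:                # skip (nearly) singular row choices
            continue
        x = np.empty(n)
        for j in range(n):
            Sj = S.copy()
            Sj[:, j] = b[list(rows)]      # jth column replaced by b_{i1},...,b_{in}
            x[j] = np.linalg.det(Sj) / d  # Eq. (2)
        yield rows, d, x
```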
The least squares (LS) solution is [3]:
$$x_{\mathrm{LS}} = (A'A)^{-1} A' b, \tag{3}$$
where $(A'A)^{-1}A'$ is the generalized inverse of $A$. Using our notation, the LS-solution can be rewritten as
$$x_j[\mathrm{LS}] = \frac{\det[A'A;\, A'b: j]}{\det[A'A]}, \qquad j = 1, \ldots, n. \tag{3'}$$
Furthermore, it is well known that the generalized inverse and the LS-solution can be expressed in terms of the singular-value decomposition (SVD) [3], which suggests a way of obtaining more accurate solutions by taking the sum over the large singular values only.

2. LEAST SQUARES DECOMPOSITION. THEOREM (LSD). With the assumptions of Section 1,
$$\det[A'A;\, A'b: j] = \sum_{1 \le i_1 < \cdots < i_n \le m} \det[A: i_1 \ldots i_n] \, \det[A: i_1 \ldots i_n;\, b: j], \tag{4}$$
$$\det[A'A] = \sum_{1 \le k_1 < \cdots < k_n \le m} \left( \det[A: k_1 \ldots k_n] \right)^2, \tag{5}$$
and therefore
$$x_j[\mathrm{LS}] = \frac{\sum_{1 \le i_1 < \cdots < i_n \le m} \left( \det[A: i_1 \ldots i_n] \right)^2 x_j[i_1 \ldots i_n]}{\sum_{1 \le k_1 < \cdots < k_n \le m} \left( \det[A: k_1 \ldots k_n] \right)^2}, \qquad j = 1, \ldots, n. \tag{6}$$

PROOF. We have assumed that $A'A$ is non-singular (since $\operatorname{rank}(A) = n$). Note that Eq. (5) is a special case of Eq. (4), and Eq. (6) is the consequence of Eqs. (2), (3), (4) and (5). The case $m = n$ is obvious, because in this case there is only one term in the summations. The case $n = 2$ can be verified easily by direct evaluation. The proof for the general case is notationally lengthy; in order to illustrate the spirit, we prove Eq. (4) for $m = 4$ and $n = 3$. (Note that if $A$ is complex, we simply replace all the transposes by conjugate transposes.) Expanding $\det[A'A;\, A'b: 1]$ by the multilinearity of its columns over the summed row indices yields, for each choice of rows $i_1 i_2 i_3$, a group of determinants $D_{i_1 i_2 i_3}$, which are reduced by column operations: for the third determinant in the group, multiply Column 2 by $a_{4,2}$ and add it to Column 3; for the fourth, multiply Column 1 by $a_{4,2}$ and add it to Column 2, then multiply Column 1 by $a_{4,3}$ and add it to Column 3. We then obtain an expression for $D_{123}$, and similarly for $D_{124}$, $D_{134}$ and $D_{234}$. Adding up all these expressions, the sum of the second determinants gives $-\det[A'A;\, A'b: 1]$, and similarly for the third and the fourth determinants. Therefore
$$\det[A'A;\, A'b: 1] = \sum_{1 \le i_1 < i_2 < i_3 \le 4} \det[A: i_1 i_2 i_3]\, \det[A: i_1 i_2 i_3;\, b: 1],$$
which is the required LSD (Eq. (4)).
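As a quick numerical check of the theorem (ours; it reuses the nt_solutions sketch above on a randomly generated system), note that Eq. (5) is the classical Cauchy-Binet identity for $\det(A'A)$, and Eq. (6) says the det$^2$-weighted average of the NT-solutions reproduces the LS-solution:

```python
# Numerical check (ours) of Eqs. (5) and (6) on a random 5 x 3 system.
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 3))
b = rng.standard_normal(5)

x_ls = np.linalg.lstsq(A, b, rcond=None)[0]     # Eq. (3)

num, den = np.zeros(3), 0.0
for _, d, x in nt_solutions(A, b):
    num += d**2 * x                             # det^2-weighted NT-solutions
    den += d**2
print(np.isclose(np.linalg.det(A.T @ A), den))  # Eq. (5): prints True
print(np.allclose(x_ls, num / den))             # Eq. (6): prints True
```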
If all the singular values of $A$ are not small, we cannot reduce the summation in the SVD. However, the Least Squares Decomposition (LSD) (Eq. (6)) suggests that we may still get a better answer by summing only those NT-solutions whose Cramer denominator determinants are large in magnitude. We will verify the LSD formulas and this idea via an example in the next section. In fact, the LSD has the same form as a well-known result in statistics: if $\hat{\theta}_1, \ldots, \hat{\theta}_N$ are unbiased estimators of $\theta$ with variances $\sigma_1^2, \ldots, \sigma_N^2$ respectively, then the linear unbiased minimum-variance estimator of $\theta$ is well known to be [2]:
$$\hat{\theta} = \frac{\sum_{j=1}^{N} \sigma_j^{-2} \hat{\theta}_j}{\sum_{j=1}^{N} \sigma_j^{-2}}. \tag{7}$$
Eq. (7) is also the result of minimizing
$$E = \sum_{j=1}^{N} \frac{(\hat{\theta}_j - \theta)^2}{\sigma_j^2} \tag{8}$$
by the method of least squares [2]. Thus, if we interpret the squares of the Cramer denominator determinants as the $\sigma_j^{-2}$'s, and the NT-solutions as estimates of the "true" solution, then Eq. (6) and Eq. (7) are identical. Therefore, the LSD has a second meaning of "least squares" (Eq. (8))!
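As a toy illustration of Eq. (7) (our numbers, chosen arbitrarily): combining unbiased estimates 10 and 14 with variances 1 and 4 gives $(10/1 + 14/4)/(1 + 1/4) = 10.8$, pulled toward the low-variance estimate, exactly as the LSD pulls the LS-solution toward NT-solutions with large denominator determinants:

```python
# Toy illustration (ours) of Eq. (7): inverse-variance weighting.
import numpy as np

theta = np.array([10.0, 14.0])      # unbiased estimates of the same theta
var   = np.array([1.0, 4.0])        # their variances sigma_j^2
w = 1.0 / var                       # weights sigma_j^{-2}
print((w * theta).sum() / w.sum())  # 10.8
```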
Eq. (6) also gives a simple geometric interpretation: the LS-solution is the "center of mass" of the NT-solutions. The LS-solution therefore lies near those NT-solutions whose denominator determinants are large in magnitude (that is, whose variances are small).
3. AN EXAMPLE. Consider the overdetermined $4 \times 3$ system given in Table 1. There are 4 NT-solutions, computed by Cramer's rule from each choice of 3 rows (Columns 1 to 4 in Table 1(a)); the LS-solution, computed from $(A'A)x = A'b$, is shown in Column 5. Hence all the LSD formulas have been verified. Furthermore, the ratios of the squared magnitudes (or just the magnitudes) of the Cramer denominator determinants, taken in descending order, indicate the number of significant terms in the LSD sum. In our case the denominator determinants are 44, 23, $-5$ and 1, and the ratios of their squares are
$$1 : 0.27324 : 0.01291 : 0.00052.$$
Thus we may define a "condition number" as the ratio of the largest determinant squared to the smallest (here $44^2/1^2 = 1936$).
The singular values can be computed [1]: $s_1 = 7.501111$, $s_2 = 2.926371$, $s_3 = 2.273693$. Note that $(s_1 s_2 s_3)^2 = 2491 = 44^2 + 23^2 + (-5)^2 + 1^2$, in agreement with Eq. (5), since $\det(A'A) = (s_1 s_2 s_3)^2$. None of these singular values is small, because the rank of the matrix is 3. In this case the SVD gives no further improvement, but the LSD is still flexible, as illustrated in Table 1(b), where the LSD sums the NT-solutions with large denominator determinants in descending order of magnitude. Thus, we may first find the SVD of the system to see how many singular values are small and perform the necessary rank reduction, and then apply the LSD for a further improvement.
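Since the paper's example matrix appears only in Table 1, the following sketch (ours) runs the same checks on a hypothetical 4 x 3 stand-in: by Cauchy-Binet, the product of the squared singular values must equal the sum of the squared Cramer denominators (Eq. (5)), and the squared ratios and "condition number" then follow:

```python
# Sketch (ours, with a hypothetical matrix; the paper's own is in Table 1).
from itertools import combinations
import numpy as np

A = np.array([[2., 1., 0.],
              [1., 3., 1.],
              [0., 1., 2.],
              [1., 0., 1.]])

dets = np.array([np.linalg.det(A[list(r), :])
                 for r in combinations(range(4), 3)])
s = np.linalg.svd(A, compute_uv=False)

print(np.isclose(np.prod(s)**2, (dets**2).sum()))  # Eq. (5): prints True
print(np.sort(dets**2)[::-1] / (dets**2).max())    # squared ratios, descending
print((dets**2).max() / (dets**2).min())           # the "condition number"
```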
4. AN APPLICATION TO SOURCE LOCALIZATION. In navigation and sonar we deploy sensors (at least 3) to receive signals from a source. Let us first consider a 2-dimensional problem with a constant sound speed for simplicity. Suppose we can estimate the time delay between every two sensors. The locus of the source then falls on a hyperbola with these two sensors as its foci; thus every two sensors determine a hyperbola. With $n$ sensors ($n \ge 3$), the intersection of all the hyperbolas gives the source location. This is the well-known technique of hyperbolic fixing. However, due to noisy time-delay measurements, the hyperbolas do not intersect at a unique point. Usually the source is far away from the sensors, and the hyperbolas may be approximated by their asymptotes. The problem then reduces to an overdetermined system of linear equations in the two unknown coordinates, and a least-squares solution gives the source location. With the LSD theorem, we interpret the LS-solution as the weighted sum of all possible source locations (the pairwise intersections of the asymptotes) according to their squared denominator determinants. Each denominator is proportional to the sine of the angle between the two lines; thus, if two hyperbolas intersect at nearly a right angle, the resulting source estimate is more accurate than those from intersections at small angles. This angle interpretation is simple and intuitive, and it is justified by the LSD theorem. Furthermore, we may select only those solutions with large denominators. Note that the SVD method offers no improvement here because the rank is always 2. The optimal location is the "center of mass" of the possible locations.
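Here is a toy sketch (ours, not the paper's algorithm) of this interpretation: three hypothetical bearing lines with (approximately) unit normals stand in for the asymptotes; each pair of lines yields a candidate fix weighted by its squared 2 x 2 determinant, i.e. by $\sin^2$ of the intersection angle, and the weighted "center of mass" reproduces the LS fix by the LSD theorem:

```python
# Toy sketch (ours): LSD view of line-intersection fixing in 2-D.
from itertools import combinations
import numpy as np

# Hypothetical lines n_k . p = c_k with (approximately) unit normals n_k.
N = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [0.7071, 0.7071]])
c = np.array([3.0, 2.0, 3.6])                # made-up offsets

num, den = np.zeros(2), 0.0
for i, j in combinations(range(len(N)), 2):
    M = N[[i, j], :]
    d = np.linalg.det(M)                     # ~ sin(angle between the lines)
    p = np.linalg.solve(M, c[[i, j]])        # candidate source location
    num += d**2 * p                          # weight: sin^2 of the angle
    den += d**2
print(num / den)                             # LSD "center of mass" = LS fix
```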
Moreover, instead of using hyperbolas, Schmidt [4] has shown that the source location is the focus of a conic passing through 3 sensors, so the source lies on the focal line. With more than 3 sensors we have more than one focal line, and the intersections of these focal lines give the source location(s). Thus we are actually solving linear equations, not just an approximation using asymptotes as in the hyperbolic-fixing technique. We can again use the LSD to interpret the angles between the focal lines as a measure of the accuracy of the solutions. Formulations of 3-dimensional localization using Schmidt's method and other equivalent methods that yield LS-solutions have been given [5], [6]; we can interpret all these LS-solutions using the LSD just as in the 2-dimensional case.



Table 1. Example of Least Squares Decompositions. (a) The NT-solutions obtained by Cramer's rule for each choice of rows (Columns 1 to 4) and the LS-solution (Column 5). (b) Partial LSD sums over the NT-solutions with the largest denominator determinants, in descending order of magnitude.
