Is the Best Fitting Curve Always Unique?

This is a continuation of our paper [1], where we studied the problem of existence of the best fitting curve. Here we deal with its uniqueness. Our interest in these problems comes from applications where one describes a set of points $P_1, \ldots, P_n$ (representing experimental data or observations) by simple geometric shapes, such as lines, circular arcs, elliptic arcs, and so forth. The best fit is achieved when the geometric distances from the given points to the fitting curve are minimized, in the least squares sense. Finding the best fit reduces to the minimization of a least squares objective function.


Introduction
This is a continuation of our paper [1], where we studied the problem of existence of the best fitting curve. Here we deal with its uniqueness.
Our interest in these problems comes from applications where one describes a set of points $P_1, \ldots, P_n$ (representing experimental data or observations) by simple geometric shapes, such as lines, circular arcs, elliptic arcs, and so forth. The best fit is achieved when the geometric distances from the given points to the fitting curve are minimized, in the least squares sense. Finding the best fit reduces to the minimization of the objective function
$$\mathcal{F}(S) = \sum_{i=1}^{n} \bigl[\operatorname{dist}(P_i, S)\bigr]^2, \tag{1}$$
where $S$ denotes the fitting curve (line, circle, ellipse, etc.).
Here $\operatorname{dist}(P, S) = \min_{Q \in S} \rho(P, Q)$ denotes the shortest distance from $P$ to $S$, and $\rho$ stands for the Euclidean metric in $\mathbb{R}^2$. We refer the reader to [1] for the background of the geometric fitting problem. Most publications on the fitting problem are devoted to practical algorithms for finding the best fitting curve minimizing (1) or to statistical properties of the resulting estimates. Fundamental issues such as the existence and uniqueness of the best fit are very rarely addressed. If these issues do come up, one either assumes that the best fit exists and is unique or just points out examples to the contrary without deep investigation.
In our previous paper [1] we investigated the existence of the best fit. Here we address the issue of uniqueness. These issues turn out to be quite nontrivial and lead to unexpected conclusions. As a glimpse of our results, here and in [1], we provide a table summarizing the state of affairs in the problem of fitting the most popular 2D objects (here Yes means the best fitting object exists or is unique in all respective cases; No means the existence/uniqueness fails in some of the respective cases).
We see that the existence and uniqueness of the best fitting object cannot be just taken for granted. Actually 2/3 of the answers in Table 1 are negative. In particular, the uniqueness can never be guaranteed. (For the exact meaning of all cases and typical cases we refer the reader to [1].)
The uniqueness of the best fit is not only of theoretical interest but also practically relevant. Nonuniqueness means that the best fitting object may not be stable under slight perturbations of the data points. An example is described by Nievergelt [2]. He presented a set of $n = 4$ points that can be fitted by three different circles equally well. Then by arbitrarily small changes in the coordinates of the points, one can make any of these three circles fit the points a bit better than the other two circles; thus the best fitting circle will change abruptly.
A similar example was described by Chernov in [3, Section 2.2], where the best fitting line to a given data set is horizontal, but after an arbitrarily small change in the coordinates of the data points, it turns by $90^\circ$ and becomes vertical.
Such examples show that the best fitting object may be extremely sensitive to small numerical errors in the data or round-off errors of the calculation.

Uniqueness of the Best Fitting Line
We begin our study of the uniqueness problem with the simplest case: fitting straight lines to data points. We first introduce relevant statistical symbols and notation.
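Given data points $(x_1, y_1), \ldots, (x_n, y_n)$, we denote by $\bar{x}$ and $\bar{y}$ the sample means
$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i, \qquad \bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i, \tag{2}$$
and by
$$s_{xx} = \sum_{i=1}^{n} (x_i - \bar{x})^2, \qquad s_{yy} = \sum_{i=1}^{n} (y_i - \bar{y})^2, \qquad s_{xy} = \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y}) \tag{3}$$
the components of the so-called "scatter matrix"
$$\mathbf{S} = \begin{pmatrix} s_{xx} & s_{xy} \\ s_{xy} & s_{yy} \end{pmatrix}, \tag{4}$$
which characterizes the "spread" of the data set about its centroid $(\bar{x}, \bar{y})$. This matrix is symmetric and positive semidefinite. The scatter matrix $\mathbf{S}$ defines the so-called scattering ellipse whose center is $(\bar{x}, \bar{y})$ and whose axes are spanned by the eigenvectors of $\mathbf{S}$ (the major axis is spanned by the eigenvector corresponding to the larger eigenvalue).
Next we find the best fitting line, following [3, Chapter 2]. We will describe lines in the $xy$ plane by the equation
$$ax + by + c = 0, \tag{5}$$
where $a$, $b$, and $c$ are the parameters of the line. Under the normalization $a^2 + b^2 = 1$, the sum of squared distances from the data points to the line (5) equals $\sum_{i=1}^{n} (a x_i + b y_i + c)^2$. For fixed $a$ and $b$ its minimum over $c$ is attained at $c = -a\bar{x} - b\bar{y}$, so the line passes through the centroid. Now the best fitting line is found by minimizing the objective function
$$\mathcal{F}(a, b) = s_{xx} a^2 + 2 s_{xy} ab + s_{yy} b^2 = A^T \mathbf{S} A, \tag{9}$$
where $A = (a, b)^T$ denotes the parameter vector. Minimizing (9) subject to the constraint $\|A\| = 1$ is a simple problem of matrix algebra; its solution is the unit eigenvector of the scatter matrix $\mathbf{S}$ corresponding to the smaller eigenvalue. Observe that the parameter vector $A$ is orthogonal to the line (5); thus the line itself is parallel to the other eigenvector. In addition, it passes through the centroid; hence it is the major axis of the scattering ellipse.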
The above observations are summarized as follows.
Theorem 1. Every best fitting line passes through the centroid and coincides with the major axis of the scattering ellipse.
For typical data sets, the above procedure leads to a unique best fitting line. But there are certain exceptions.
If the two eigenvalues of the scatter matrix $\mathbf{S}$ coincide, then every vector $A \neq 0$ is its eigenvector, and the function $\mathcal{F}(a, b)$ is actually constant on the unit circle $\|A\| = 1$. In that case all the lines passing through the centroid of the data minimize $\mathcal{F}$; hence, the problem has multiple (infinitely many) solutions. This happens if and only if $\mathbf{S}$ is a scalar matrix, that is,
$$s_{xx} = s_{yy}, \qquad s_{xy} = 0. \tag{10}$$
The above observations are summarized as follows.
Theorem 2. A best fitting line is not unique if and only if the eigenvalues of the scatter matrix $\mathbf{S}$ coincide. In this case the scattering ellipse becomes a circle. Moreover, in this case every line passing through the centroid $(\bar{x}, \bar{y})$ is one of the best fitting lines.
Thus we have a dichotomy: either there is a single best fitting line or there are infinitely many best fitting lines. In the latter case, all the lines in the bundle passing through the centroid $(\bar{x}, \bar{y})$ are best fitting lines.
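The dichotomy above can be illustrated with a short numerical sketch. It is not taken from the paper: the function name `best_fit_line` and the relative tolerance `tol` are hypothetical choices made for this illustration, and the scatter matrix is formed from sums of centered products without a $1/n$ factor, which does not affect the eigenvectors.

```python
import numpy as np

def best_fit_line(points, tol=1e-12):
    """Orthogonal least-squares line (illustrative helper, not from the paper).

    Returns (centroid, direction, unique), where `direction` is a unit vector
    along the major axis of the scattering ellipse and `unique` is False when
    the two eigenvalues of the scatter matrix coincide up to `tol`.
    """
    pts = np.asarray(points, dtype=float)
    centroid = pts.mean(axis=0)
    centered = pts - centroid
    # Scatter matrix [[s_xx, s_xy], [s_xy, s_yy]].
    scatter = centered.T @ centered
    # eigh returns eigenvalues in ascending order for a symmetric matrix.
    eigvals, eigvecs = np.linalg.eigh(scatter)
    direction = eigvecs[:, 1]  # eigenvector of the larger eigenvalue
    scale = max(eigvals[1], 1.0)
    unique = bool(eigvals[1] - eigvals[0] > tol * scale)
    return centroid, direction, unique

# A generic data set has a single best fitting line ...
_, _, unique_generic = best_fit_line([(0, 0), (1, 0.1), (2, -0.1), (3, 0.2)])
# ... while the vertices of a square have a scalar scatter matrix.
_, _, unique_square = best_fit_line([(1, 1), (-1, 1), (-1, -1), (1, -1)])
print(unique_generic, unique_square)
```

In floating-point arithmetic exact coincidence of eigenvalues is rarely observed, so a tolerance is used here; how to choose it in practice depends on the scale and noise of the data.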
A simple example of a data set for which there are multiple best fitting lines is $n$ points placed at the vertices of a regular polygon with $n$ vertices ($n$-gon). Rotating the data set around its center by the angle $2\pi/n$ takes the data set back to itself. So if there is one best fitting line, then by rotating it through the angle $2\pi/n$ we get another line that fits equally well. Thus the best fitting line is not unique.
It is less obvious (but true, according to Theorem 2) that every line passing through the center of our regular polygon is a best fitting line; they all minimize the objective function.
Data points placed at vertices of a regular polygon seem like a very exceptional situation. However, multiple best fitting lines are much more common. The following is true.
Theorem 3. Given any data points $(x_1, y_1), \ldots, (x_n, y_n)$, one can always move one of them so that the new data set will admit multiple best fitting lines. Precisely, there are always $x'_n$ and $y'_n$ such that the set $(x_1, y_1), \ldots, (x_{n-1}, y_{n-1}), (x'_n, y'_n)$ admits multiple best fitting lines.
In other words, the $n - 1$ points can be placed arbitrarily, without any regular pattern whatever, and then we can add just one extra point so that the set of all $n$ points will admit multiple best fitting lines, that is, will satisfy (10).
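One way to see why such an extra point exists is the standard rank-one update of the scatter matrix: if $m$ points have centroid $c$ and scatter matrix $\mathbf{S}_0$, adding a point $p$ yields the scatter matrix $\mathbf{S}_0 + \frac{m}{m+1}(p - c)(p - c)^T$. The sketch below uses this identity to place the extra point along the minor axis of the existing points. It is an illustration under that assumption and not necessarily the argument used by the authors; the function names are hypothetical.

```python
import numpy as np

def scatter_matrix(points):
    """Scatter matrix about the centroid (sums of centered products)."""
    pts = np.asarray(points, dtype=float)
    centered = pts - pts.mean(axis=0)
    return centered.T @ centered

def extra_point_for_nonuniqueness(points):
    """Return a point whose addition makes the scatter matrix scalar.

    Hypothetical helper: it relies on the rank-one update
    S_new = S_0 + m/(m+1) * d d^T with d = p - centroid.
    """
    pts = np.asarray(points, dtype=float)
    m = len(pts)
    centroid = pts.mean(axis=0)
    eigvals, eigvecs = np.linalg.eigh(scatter_matrix(pts))  # ascending order
    gap = eigvals[1] - eigvals[0]
    # Move along the minor axis far enough to raise the smaller eigenvalue
    # up to the larger one.
    step = np.sqrt((m + 1) / m * gap)
    return centroid + step * eigvecs[:, 0]

rng = np.random.default_rng(0)
given = rng.normal(size=(7, 2))          # arbitrary n - 1 = 7 points
p = extra_point_for_nonuniqueness(given)
S_new = scatter_matrix(np.vstack([given, p]))
print(S_new)  # diagonal entries agree and off-diagonal entries vanish, up to rounding
```

The point on the opposite side of the centroid (using `-step`) works equally well, so the extra point is itself not unique.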
Still, the existence of multiple best fitting lines is a very unlikely event in probabilistic terms. If data points are sampled randomly from an absolutely continuous probability distribution, then this event occurs with probability zero. Indeed, (10) specifies a subsurface (submanifold) in the $2n$-dimensional space with coordinates $x_1, y_1, \ldots, x_n, y_n$. That submanifold has zero volume; hence, for any absolutely continuous probability distribution, its probability is zero.
However, if the data points are obtained from a digital image (say, they are pixels on a computer screen), then the chance of having (10) may no longer be negligible and may have to be reckoned with. For instance, a simple configuration of 4 pixels making a $2 \times 2$ square satisfies (10), and thus the orthogonal fitting line is not uniquely defined.
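The pixel example can be checked directly by sweeping the direction of a line through the centroid and evaluating the sum of squared orthogonal distances. The sketch below is only an illustration; the name `line_objective` and the particular pixel coordinates are assumptions made here.

```python
import numpy as np

def line_objective(points, theta):
    """Sum of squared orthogonal distances to the line through the centroid
    whose direction makes the angle `theta` with the x axis."""
    pts = np.asarray(points, dtype=float)
    centered = pts - pts.mean(axis=0)
    normal = np.array([-np.sin(theta), np.cos(theta)])  # unit normal to the line
    return float(np.sum((centered @ normal) ** 2))

# Four pixels forming a 2 x 2 square (coordinates chosen for illustration).
pixels = [(0, 0), (1, 0), (0, 1), (1, 1)]
thetas = np.linspace(0.0, np.pi, 181)
values = [line_objective(pixels, t) for t in thetas]
print(min(values), max(values))  # the two agree up to rounding
```

For this configuration the objective function does not depend on the direction, so every line through the center of the square fits equally well.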

Uniqueness of the Best Fitting Circle
We have seen in Section 2 that the simplest fitting problem, that of fitting straight lines, can have multiple solutions, so it may not be too surprising to find out that more complicated problems also can have multiple solutions (we emphasize that the best fitting circle minimizes the sum of squares of geometric distances, as defined in the Introduction). Here we demonstrate the multiplicity of the best fit for circles.
However, we cannot describe all data sets for which the best fitting circle is not unique in the same comprehensive manner as we did for lines in Section 2. We can only give some examples of such data sets.
All the known examples are based on the rotational symmetry of the data set. We already used this idea in Section 2. Suppose the data set can be rotated around some point $C$ through the angle $2\pi/k$ for some integer $k \geq 2$, and after the rotation it comes back to itself. Then, if there is a best fitting circle, rotating it around $C$ through the angle $2\pi/k$ would give us another circle that would fit the data set equally well. This is how we get more than one best fitting circle.
This is a nice idea, but it breaks down instantly if the center of the best fitting circle happens to coincide with the center of rotation $C$. Then we would rotate the circle around its own center and obviously would get the same circle again. Thus one has to construct a rotationally symmetric data set more carefully to avoid best fitting circles centered on the natural center of symmetry of the set.
Figure 1: Four data points and three fitting circles.
The earliest and simplest example was given by Nievergelt [2]. He chose $n = 4$ data points as follows:
$$(0, 0), \quad (0, 2), \quad (\sqrt{3}, -1), \quad (-\sqrt{3}, -1). \tag{11}$$
The last three points are at the vertices of an equilateral triangle centered on $(0, 0)$. So the whole set can be rotated around the origin $(0, 0)$ through the angle $2\pi/3$, and it will come back to itself. Nievergelt claimed that the best fitting circle has center $(0, -3/4)$ and radius $R = 7/4$. This circle passes through the last two data points and cuts right in the middle between the first two. So the first two points are at distance $d = 1$ from that circle, and the last two are right on it (their distance from the circle is zero). Thus the objective function is
$$\mathcal{F} = 1^2 + 1^2 = 2. \tag{12}$$
It is easy to believe that Nievergelt's circle is indeed the best, as any attempt to perturb its center or radius would only make the fit worse (the objective function would grow). However, a complete mathematical proof of this claim would perhaps be prohibitively difficult, so we leave it out.
Our goal is actually more modest than finding the best fitting circle in Nievergelt's example. Our goal is to show that there are multiple best fitting circles (without finding them explicitly). And the multiplicity here can be proven as follows.
According to our general results [1], for every data set the best fit exists, which may be a circle or a line. If the best object is a circle, then its center is either at $(0, 0)$ or elsewhere. So we have three possible cases: (i) the best fitting object is a line, (ii) the best fitting object is a circle centered on $(0, 0)$, and (iii) the best fitting object is a circle with a center different from $(0, 0)$. In the last case our rotational symmetry will work, as explained above, and prove the multiplicity of the best fitting circle. So we need to rule out the first two cases.
Consider any circle of radius $R$ centered on $(0, 0)$. It is easy to see that the respective objective function is
$$\mathcal{F}(R) = R^2 + 3(2 - R)^2. \tag{13}$$
Its minimum is attained at $R = 3/2$, and its minimum value is
$$\mathcal{F} = 3. \tag{14}$$
This is larger than $\mathcal{F} = 2$ in (12). Thus circles centered on the origin cannot compete with Nievergelt's circle and should be ruled out.
Next we consider all lines. As we have seen in Section 2, for rotationally symmetric data sets, all the best fitting lines pass through the center. All of those lines fit equally well. Taking the $x$ axis, for example, it is easy to see that the corresponding objective function is
$$\mathcal{F} = 0^2 + 2^2 + 1^2 + 1^2 = 6. \tag{15}$$
This is greater than $\mathcal{F} = 2$ in (12) and even greater than $\mathcal{F} = 3$ in (14). Thus lines are even less competitive than circles centered on the origin, so they are ruled out as well. The proof is finished.
Therefore, the best fitting circle has a center different from $(0, 0)$. Thus by rotating this circle through the angles $2\pi/3$ and $4\pi/3$, we get two more circles that fit the data equally well. So the circle fitting problem has three distinct solutions. The alleged best fitting circles are shown in Figure 1.
After Nievergelt's example, two other papers presented, independently, similar examples of nonunique circle fits.
Chernov and Lesort [4] used a perfect square, instead of Nievergelt's regular triangle. They placed four points at the vertices of the square and another four points at its center, so the data set consisted of $n = 8$ points total. Then they used the above strategy to prove that at least four different circles achieve the best fit.
Zelniker and Clarkson [5] used a regular triangle again and placed three points at its vertices and three more points at its center (so that the data set consisted of $n = 6$ points). Then they showed that at least three different circles achieve the best fit.
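Returning to Nievergelt's four-point example, the values used in the comparison above can be reproduced numerically. The sketch below only evaluates the objective function at the candidate circles and at the $x$ axis; it does not prove that any of them is optimal. The helper names are hypothetical.

```python
import numpy as np

SQ3 = np.sqrt(3.0)
points = np.array([[0.0, 0.0], [0.0, 2.0], [SQ3, -1.0], [-SQ3, -1.0]])

def circle_objective(points, center, radius):
    """Sum of squared geometric distances from the points to a circle."""
    d = np.linalg.norm(points - np.asarray(center, dtype=float), axis=1)
    return float(np.sum((d - radius) ** 2))

def rotate(v, angle):
    """Rotate the 2D vector v counterclockwise about the origin."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([c * v[0] - s * v[1], s * v[0] + c * v[1]])

center = np.array([0.0, -0.75])   # circle through (+-sqrt(3), -1) and (0, 1)
radius = 1.75

f_nievergelt = circle_objective(points, center, radius)
f_rotated = [circle_objective(points, rotate(center, k * 2 * np.pi / 3), radius)
             for k in (1, 2)]
f_origin = circle_objective(points, (0.0, 0.0), 1.5)   # best circle centered on the origin
f_x_axis = float(np.sum(points[:, 1] ** 2))            # the x axis as a fitting line

print(f_nievergelt, f_rotated, f_origin, f_x_axis)
```

The three circles obtained by rotation give the same value of the objective function, which is smaller than the values for the origin-centered circle and for the line.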
These examples lead to an interesting fact that may seem rather counterintuitive. Let $O$ be a circle of radius $R$ with center $C$. Let us place a large number of data points on $O$ and a single data point at the center $C$. Suppose the points on $O$ are placed uniformly (say at the vertices of a regular polygon). Then it seems like $O$ is an excellent candidate for the best fitting circle: it interpolates all the data points and misses only at $C$, so $\mathcal{F} = R^2$. It is hard to imagine that any other circle or line can do any better.
However, a striking fact proved by Nievergelt (a lemma in [6]) says that the center of the best fitting circle cannot coincide with any data point. Therefore in our example, $O$ cannot be the best fitting circle. Hence some other circle with center $C' \neq C$ fits the data set better. And again, rotating the best circle about $C$ gives other best fitting circles, so those are not unique.
Rotationally symmetric data sets described above are clearly exceptional; small perturbations of data points easily destroy the symmetry. But there are probably many other data sets, without any symmetries, that admit multiple circle fits, too. We believe that they are all unusual and can be easily destroyed by small perturbations. Below is our argument.
Suppose a set of data points $P_1, \ldots, P_n$ admits two best circle fits, and denote those circles by $O_1$ and $O_2$. First consider a simple case: $O_1$ and $O_2$ are concentric, that is, have a common center, $C$. Let $r_i$ denote the distance from the point $P_i$ to the center $C$. By direct inspection, for any circle of radius $R$ centered on $C$ the objective function is
$$\mathcal{F}(R) = \sum_{i=1}^{n} (r_i - R)^2. \tag{16}$$
This is a quadratic polynomial in $R$, so it cannot have two distinct minima. So the two best fitting circles cannot be concentric.
Now suppose the circles $O_1$ and $O_2$ are not concentric, that is, they have distinct centers, $C_1$ and $C_2$. Let $L$ denote the line passing through $C_1$ and $C_2$. Note that the data points cannot all be on the line $L$ (because if the data points were collinear, the best fit would be achieved by the interpolating line and not by two circles). So there exists a point $P_i$ that does not lie on the line $L$. Hence we can move it slightly toward the circle $O_1$ but away from the circle $O_2$. Then the objective function $\mathcal{F}$ changes slightly, and it will decrease at one minimum (on $O_1$) and increase at the other (on $O_2$). This will break the tie and ensure the uniqueness of the global minimum.
Figure 2: The best fit to a uniform distribution in a square.

Uniqueness of the Best Fitting Ellipse
Based on the previous two sections, we should expect that data sets exist for which the best fitting ellipse is not unique. However, we could not find any explicit examples in the literature, so we supply our own.
Our previous paper [1] was the first to provide an example of that sort. We fitted conics to a uniform distribution in a perfect square, $[0, 1] \times [0, 1]$. We found, quite unexpectedly, that the best fit was achieved by two distinct ellipses; they were geometrically equal (i.e., they had the same major axis and the same minor axis), and they had a common center, but one was oriented vertically and the other horizontally. See Figure 2.
Strictly speaking, in this example we did not have a data set; we replaced it with a uniform distribution that is obtained as a limit of large samples, as $n \to \infty$. But we would get the same picture, two best fitting ellipses, if we placed $k \times k$ data points in the square arranged as a perfect square lattice (e.g., the points have coordinates $(i, j)$, where $i = 1, \ldots, k$ and $j = 1, \ldots, k$).
A more elegant example can be constructed as follows. Recall (Section 3) that Nievergelt's example of multiple fitting circles consisted of $n = 4$ data points; three were placed at vertices of an equilateral triangle and the fourth one at its center.
Note that a circle has three independent parameters, but an ellipse has five. So it is natural to generalize Nievergelt's example by placing five data points at vertices of a regular pentagon and the sixth one at its center. Thus we have $n = 6$ data points as follows:
$$(0, 0), \quad (0, 2), \quad \bigl(\pm 2\sin\tfrac{2\pi}{5},\ 2\cos\tfrac{2\pi}{5}\bigr), \quad \bigl(\pm 2\sin\tfrac{4\pi}{5},\ 2\cos\tfrac{4\pi}{5}\bigr). \tag{17}$$
We strongly believe that the best fitting ellipse passes through the last four data points and the point $(0, 1)$. These five points determine the ellipse uniquely. It is obviously symmetric about the $y$ axis, so its major axis is horizontal. This ellipse cuts right in the middle between the first two data points. So those two points are at distance $d = 1$ from that ellipse and the last four are right on it (the distance is zero). Thus the objective function is
$$\mathcal{F} = 1^2 + 1^2 = 2. \tag{18}$$
Below we provide a partial proof of our claim that the above ellipse is the best. We also designed a full computer-assisted proof that involves extensive numerical computations. Lastly, by rotating this ellipse through the angles $2\pi k/5$ for $k = 1, 2, 3, 4$ we get four more ellipses that fit the data equally well. So the ellipse fitting problem has five distinct solutions; see Figure 3.
We will compare our ellipse to the best fitting circle centered on the origin and the best fitting lines. Consider any circle of radius $R$ centered on $(0, 0)$. It is easy to see that the respective objective function is
$$\mathcal{F}(R) = R^2 + 5(2 - R)^2. \tag{19}$$
Its minimum is attained at $R = 5/3$, and its minimum value is
$$\mathcal{F} = \frac{10}{3}. \tag{20}$$
This is larger than $\mathcal{F} = 2$ in (18). Thus circles centered on the origin cannot compete with our ellipse.
Consider all lines. As we have seen in Section 2, for rotationally symmetric data sets all the best fitting lines pass through the center, and all of those lines fit equally well. Taking the $x$ axis, for example, it is easy to see that the corresponding objective function is
$$\mathcal{F} = \sum_{i=1}^{6} y_i^2 = 10. \tag{21}$$
This is greater than $\mathcal{F} = 2$ in (18) and even greater than $\mathcal{F} = 10/3$ in (20). Thus lines are even less competitive than circles centered on the origin.
Also, in the ellipse fitting problem, pairs of parallel lines are legitimate model objects; see [1]. We examined the fits achieved by pairs of parallel lines. The best fit we found was by two horizontal lines $y = y_1$ and $y = y_2$, where
$$y_1 = \frac{1 + \sqrt{5}}{4} \approx 0.809, \qquad y_2 = 2\cos\frac{4\pi}{5} = -\frac{1 + \sqrt{5}}{2} \approx -1.618. \tag{22}$$
Note that $y_1$ is the average $y$-coordinate of the first four points in our sample. Thus the first line is the best fitting line for the first four points, and the second line passes through the last two points. The objective function for this pair of lines is
$$\mathcal{F} = \frac{11 - 3\sqrt{5}}{2} \approx 2.146. \tag{23}$$
This is pretty good, better than the best fitting circle in (20). But still it is a little worse than the best fitting ellipse in (18).
Thus our ellipse fits better than any circle centered on the origin, any line, and any pair of parallel lines. In order to conclude that it is really the best fitting ellipse, we would have to compare it to all other ellipses and parabolas. This task seems prohibitively difficult if one uses only theoretical arguments as above. Instead, we developed a computer-assisted proof. It is a part of the Ph.D. thesis by Q. Huang, which we plan to post on the web [7].
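The three competitors considered above (circles centered on the origin, lines through the center, and the pair of horizontal lines) can be evaluated numerically for the pentagon example. The sketch below is an illustration only: it reproduces the comparison values but says nothing about general ellipses, and the variable names and the way the upper and lower points are selected are assumptions made here.

```python
import numpy as np

# Center plus the vertices of a regular pentagon of circumradius 2,
# with one vertex at (0, 2).
angles = 2 * np.pi * np.arange(5) / 5
pentagon = 2.0 * np.column_stack([np.sin(angles), np.cos(angles)])
points = np.vstack([[0.0, 0.0], pentagon])

# Best circle centered on the origin: F(R) = sum (r_i - R)^2 is minimized
# at the mean of the distances r_i.
r = np.linalg.norm(points, axis=1)
R_best = r.mean()
f_circle = float(np.sum((r - R_best) ** 2))

# The x axis as a fitting line: squared distances are y_i^2.
f_line = float(np.sum(points[:, 1] ** 2))

# Pair of horizontal lines: y1 fits the four upper points, y2 passes
# through the two lowest pentagon vertices.
y = points[:, 1]
upper = y[y > -1.0]
y1 = upper.mean()
y2 = y.min()
f_pair = float(np.sum(np.minimum(np.abs(y - y1), np.abs(y - y2)) ** 2))

print(R_best, f_circle, f_line, y1, y2, f_pair)
```

Each point is assigned to the nearer of the two parallel lines, which is how the geometric distance to a pair of lines is understood in this sketch.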