We discuss computation of Gröbner bases using approximate arithmetic for coefficients. We show how certain considerations of tolerance, corresponding roughly to absolute and relative error from numeric computation, allow us to obtain good approximate solutions to problems that are overdetermined. We provide examples of solving overdetermined systems of polynomial equations. As a secondary feature we show handling of approximate polynomial GCD computations, using benchmarks from the literature.

Gröbner bases provide a means for solving a myriad of problems in computational algebra. In the original formulation, all arithmetic was carried out exactly, over the rational numbers. This was necessary in order to know when combinations of polynomial coefficients cancel. In 1993 Shirayanagi [

Work on numeric Gröbner bases begins with [

A typical situation in which one might desire to work with approximate coefficients is in solving polynomial systems of equations. A common method for such solving sets up an eigendecomposition problem; this is done either with resultants [

This work provides a description and empirical study of methods that extend Gröbner bases to handle approximately consistent polynomial systems. We define these, informally, as inconsistent systems for which there exists a “small” perturbation of input coefficients that makes them consistent. This is, in effect, the opposite situation to that of an artificial structural discontinuity of a Gröbner basis as discussed in [

Several of the ideas we present have been developed independently in the cited literature. Our main contribution is to show how they can be made to work effectively in practice. We provide several nontrivial examples from the literature on numerical polynomial system solving and approximate GCD computation to illustrate the merit of this work.

Gröbner bases are a tool used universally in computational commutative algebra. Among several excellent references we single out [

One first defines a term order on the exponent vectors of power products of a polynomial (these are simply products of powers of variables, e.g.,

One next has a notion of reduction by rewriting a given polynomial. This can happen if the leading term of one polynomial divides the leading term of another; we can “reduce” the second one by subtracting an appropriate multiple of the first so that the resulting polynomial has a smaller leading monomial. Using any variant of the algorithm developed by Buchberger (as in the references above) one then rewrites the given set of polynomials to obtain a new set called a Gröbner basis. A key step in this process, well explained in the literature, is to methodically generate new polynomials that cannot be reduced by the old ones.
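A single reduction step of the kind just described can be sketched in Python. This is an illustration only, not code from this paper: the dictionary representation of polynomials, the choice of lexicographic order, and the names `leading_term` and `reduce_once` are all assumptions made for the sketch, and exact rational coefficients are used so that cancellation is unambiguous.

```python
from fractions import Fraction

# A polynomial is a dict mapping exponent tuples to coefficients,
# e.g. x^2*y + 3 in variables (x, y) is {(2, 1): 1, (0, 0): 3}.

def leading_term(p):
    """Return (exponents, coefficient) of the lexicographically largest term."""
    exps = max(p)  # tuple comparison in Python is lexicographic
    return exps, p[exps]

def reduce_once(f, g):
    """Cancel the leading term of f using g, if lead(g) divides lead(f).

    Returns the reduced polynomial, or None when no reduction applies.
    """
    fe, fc = leading_term(f)
    ge, gc = leading_term(g)
    if any(a < b for a, b in zip(fe, ge)):
        return None  # leading monomial of g does not divide that of f
    shift = tuple(a - b for a, b in zip(fe, ge))
    factor = Fraction(fc) / Fraction(gc)
    result = dict(f)
    for e, c in g.items():
        key = tuple(a + b for a, b in zip(e, shift))
        result[key] = result.get(key, 0) - factor * c
        if result[key] == 0:
            del result[key]  # exact cancellation removes the term
    return result

# f = x^2*y + x and g = x*y - 1, so f - x*g = 2*x.
f = {(2, 1): Fraction(1), (1, 0): Fraction(1)}
g = {(1, 1): Fraction(1), (0, 0): Fraction(-1)}
r = reduce_once(f, g)
```

With approximate coefficients the test `result[key] == 0` is exactly the point at which cancellation must be recognized by some other criterion.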

This new set generates the same polynomial ideal and hence has the same solution set. If computed with respect to a lexicographic term ordering, one has in effect triangulated the system, in a form analogous to a row reduced system of linear equations. If computed by a degree-based term order one has in the basis at least one ideal member of smallest total degree. Each of these can be useful in various situations. We avail ourselves of both in this paper.

An issue with these bases is that often they are strenuous to compute. One cause can be a form of size increase wherein coefficients grow quite large compared to inputs. Use of approximate numbers of a specified maximal size helps to combat this. Moreover there are situations in which inputs are only known approximately to begin with; in such cases it makes little sense to do exact computations if that can be avoided. In [

We begin with the observation that there are two variants of approximate Gröbner basis computations. In one we assume that coefficients of input are exactly known, and we use approximate numbers in order to avoid either intermediate swell of integers or difficult computations with algebraic numbers.

The first scenario is in some sense quite nice insofar as theorems can be proven regarding the quality of result based on input precision. This is covered in some detail in [

In Section

While the case of exact input is certainly of interest, here we are primarily interested in a different setting. Coefficients are known only approximately, and moreover we may have an overdetermined system. The rest of this section pertains to both settings. The sequel is then devoted to the case of interest.

Gröbner bases computation using approximate arithmetic can be subject to several problems. First, as noted above, is the issue of recognizing when a cancellation has occurred. The model of approximate arithmetic we use, significance arithmetic, turns out to be quite good at handling this. Indeed, over a decade of experience suggests this poses no issue, provided we do not work with an overdetermined system [

A secondary issue is that, with this choice of arithmetic, precision gradually erodes over the course of a computation (as the arithmetic first order error estimates grow). What this means in practice is that often one must start with high precision input (say, a few hundred digits). Clearly this is well beyond the precision one can expect from input that arises as measurements of data. Again, when the problem at hand is not overdetermined, this is not a serious issue. One simply adds digits, arbitrarily, to the input coefficients. If the problem is not ill conditioned then when finished we know we have solved a nearby system. In practice one observes that residuals from such a solution, used in the original input, are typically small. If so desired, they can be further improved via local refinement methods.

Yet another problem, one particularly associated to use of significance arithmetic, is that in rare cases a decision might be made that a full cancellation took place, when in an exact computation perhaps a small but nonzero value would be obtained. This is discussed in [

We end this section with a historical note. As mentioned earlier, the first reported implementation of numerical Gröbner bases (of which this author is aware) is due to Shirayanagi [

We have just given a brief overview of how we can manage approximate coefficient arithmetic reliably when handling nonoverdetermined (and reasonably well conditioned) systems. Indeed this suffices for many practical sorts of computations. But there is a growing body of literature involving overdetermined systems. It thus becomes important to consider ways in which Gröbner bases can be extended to address them. To motivate this we begin by describing a few sources of such systems.

One place where overdetermined problems are encountered is in best fitting of data. While local methods are typically used, there are cases where one might not have adequate information to give a starting point such that convergence will be attained. For these situations one can utilize an approximate solution to an overdetermined system, obtained with help of a Gröbner basis computation.

A related common scenario is when one uses an overdetermined system in order to rule out undesired solutions. An example is in camera pose estimation [

Another source of overdetermined problems arises in trying to find “approximate” polynomial greatest common divisors [

Once we go from an exactly determined to an overdetermined system, high precision approximate arithmetic in computing a Gröbner basis no longer suffices to catch cancellation of coefficients. The problem is that we need to expand the size of what we might regard to be zero, as it is now on a scale with the precision of our input.

We are thus faced with a situation where we need to coarsen our classification of what will be regarded as full cancellation. We note that one must be a bit careful in terminology at this point; “zeros” can refer to approximate solutions to a system of equations or to coefficient combinations that cancelled (see [

We discuss in brief notions of tolerance as applied to both absolute and relative accuracy. This is entirely informal; the purpose is simply to motivate our approach to zero recognition. By tolerance we typically have in mind a small threshold, below which we regard values as zero.

Recall that the key operations in Gröbner basis computations are forming of S-polynomials and reduction thereof [

As noted above, we will utilize both relative and absolute tolerance values. Recall that in Gröbner basis computations all manipulations involving coefficient arithmetic arise from addition of pairs of polynomials. Prior to performing such an operation we compute the average magnitude of the coefficients in these polynomials; we will refer to this value as IPCA, for “input polynomial coefficient average”. If, after addition, a resulting coefficient is smaller in magnitude than the relative error tolerance times this IPCA, we regard it as zero and remove it. If in fact all coefficients are smaller than the absolute tolerance times the IPCA, then we regard the entire resulting polynomial as zero. In short, we employ the relative error mode to remove coefficients that are small relative to other coefficients, and we use the absolute error to justify removing an entire polynomial when all coefficients are small in absolute magnitude. We again note that the latter assumes some sort of normalization is in place for the polynomials that gave rise to the removed polynomial sum, since now comparison is not with other coefficients of the same polynomial, but rather with a pair of different polynomials whose sum generated the one under scrutiny.
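The two tolerance tests can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation used for the computations in this paper: the dictionary representation, the function names `coefficient_average` and `add_with_tolerance`, and the decision to apply the whole-polynomial test before pruning individual coefficients are all choices made for the sketch.

```python
def coefficient_average(*polys):
    """IPCA: mean absolute value of all coefficients in the given polynomials."""
    mags = [abs(c) for p in polys for c in p.values()]
    return sum(mags) / len(mags)

def add_with_tolerance(p, q, rel_tol, abs_tol):
    """Add two polynomials (dicts: exponent tuple -> float), then prune.

    The whole sum is treated as zero if every coefficient is below
    abs_tol * IPCA; otherwise coefficients below rel_tol * IPCA are dropped.
    (The order of the two tests is an assumption of this sketch.)
    """
    ipca = coefficient_average(p, q)
    total = dict(p)
    for e, c in q.items():
        total[e] = total.get(e, 0.0) + c
    if all(abs(c) < abs_tol * ipca for c in total.values()):
        return {}
    return {e: c for e, c in total.items() if abs(c) >= rel_tol * ipca}

# One coefficient nearly cancels; the others survive.
p1 = {(1, 0): 1.0, (0, 1): 2.0, (0, 0): 0.5}
q1 = {(1, 0): -1.0 + 1e-9, (0, 1): 1.0}
partial = add_with_tolerance(p1, q1, rel_tol=1e-6, abs_tol=1e-5)

# Every coefficient nearly cancels; the sum is regarded as zero.
p2 = {(1, 0): 1.0, (0, 0): 2.0}
q2 = {(1, 0): -1.0 + 1e-9, (0, 0): -2.0 - 1e-9}
whole = add_with_tolerance(p2, q2, rel_tol=1e-6, abs_tol=1e-5)
```

In the first call only the term with exponents `(1, 0)` is removed; in the second the entire sum is discarded.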

As a practical matter working with these tolerances can pose difficulties. For example, there are many problems where, even after scaling of variables, coefficient sizes will be orders of magnitude apart. Thus a relative tolerance can remove coefficients that are actually needed. Cases where no such tolerance can discern between those coefficients to keep and the ones to discard are, for purposes of this method, ill conditioned.

The absolute tolerance is typically less prone to misuse (at least in the types of examples we will present). While some of the examples do in fact require trial-and-error selection of tolerances, many do not, and experience indicates one can often base a sensible setting on the precision of the input. Typical values for the sort of problems in the examples that follow, with machine numbers for input, tend to be around

We mention that Kondratyev and coauthors [

The next few sections are organized around examples. All computations were with version 8.0.4 of

We begin with some classical numeric systems that are not overdetermined, in order to indicate that no special handling is needed (at least for the Gröbner basis phase of the computations). These provide a baseline in contrast with later computations. As they are exactly determined rather than overdetermined, in these initial examples we use no tolerancing.

First we will show the Cassou-Noguès system [

We check that the residuals are indeed small.

Observe that the resulting residuals, while small, are many times larger than the precision. This simply indicates that precision loss occurred in parts of the computation. As the computation is relatively fast, and the actual basis computation takes but a small fraction of the total time spent, it does not appear that precision loss is from the overall number of arithmetic operations. We suspect rather that this loss is due to the appearance of approximate clone polynomials (as defined in [

We next have an example that is considerably slower: the Caprasse system. It is troublesome because several roots have multiplicity, and moreover the multiplication (endomorphism) matrices utilized in the solver are derogatory (this circumstance is known to make trouble for the eigendecomposition method).

Now we show a small perturbation of this troublesome system. This moves the system to one that is nearly but not exactly derogatory. The numerical solver again obtains good results in reasonable time.

While we do not show the Gröbner bases computed, we again remark that they have different structure. From either one we have recovered solutions that give small residuals and thus can be validated a posteriori (moreover they can be further refined using local root-finding methods).

Here we tackle another common benchmark in the literature, the substantially larger Katsura-8 system.

The previous section shows several standard examples that use numeric Gröbner basis computations. Those cases were exactly determined and, in some sense, well conditioned. They required no special tolerancing. Indeed, while the Caprasse system and the perturbed variant have exact or near multiplicity in roots, this can be handled directly by the numerical arithmetic as described in [

We will now show several examples that require tolerancing. There are two reasons this might be necessary. One is when the system is both overdetermined and only known to low approximation. In such cases a nontolerancing approach will result in (1), that is, a determination that the polynomials generate the entire ring. Another situation where tolerancing is important is when coefficients for our system lie approximately on a discriminant variety in the parameter space of all possible coefficients [

We first work with a system that is in fact exactly determined, but nonetheless shows quite interesting behavior if not handled with tolerancing. It is a kinematics problem for a certain type of Stewart platform. The polynomial system comes from [

With default settings, NSolve will find 80 solutions. This is twice the number given at [

The crux is that the input describes a numerically unstable situation, wherein coefficients need to satisfy certain algebraic constraints in order to correctly specify the type of platform in question. In making the coefficients machine doubles, they become perturbed slightly and now we have a system with more solutions. Those that give large residuals at machine precision are in fact not wanted; they are the artifacts of having approximated the polynomial coefficients and thereby moved from a singular manifold to the generic case. We emphasize that this is not a situation where numeric difficulties arise from an artificial discontinuity in a Gröbner basis. To the contrary, we seek a Gröbner basis that is different in structure, rather than one coming from the generic case arising for nearby systems.

The tolerancing that repairs this is quite straightforward. We use

Around the time this paper was first submitted a similar but smaller example appeared on the internet StackExchange forum

We check explicitly that all residuals are zero (to close approximation).

When this is solved without tolerancing there are eight rather than two solutions. Six give residuals that indicate they are not of high quality. If one instead does the untoleranced computation after first raising precision of the inputs, then residuals become small. Nevertheless they remain notably larger than both solution precision and residuals from the “good” results, by a power-of-two factor roughly comparable to the number of bits of machine floats. This helps to explain how these artifact solutions arise. They come about from a system with coefficients that originate on the discriminant variety. They then become slightly removed due to round-off error in representing them as machine numbers. The distance they lie from the discriminant variety is manifested in the sizes of these solutions and in the discrepancy between solution precision and residual size.

Here is an example from [

From the plot (Figure

Figure

Here are the residuals of the polynomial and first derivatives, evaluated at the approximate zero of that system.

We now show an overdetermined camera pose problem from [

We check that the worst residual is not terribly large.

There is a vast literature on ways to compute approximate polynomial GCDs. Most involve reformulations as linear algebra problems and make use of numeric algorithms well suited to computing matrix rank reliably in the presence of approximate input. For background on such methods, see [

For univariate polynomials it is well known that we can extract a GCD via simple Gröbner basis computation. This is in effect a form of polynomial remainder sequence and thus bears similarity to the univariate case of the method discussed by T. Sasaki and F. Sasaki in [

Here is another example from [

We see it corresponds closely to the GCD of the “obvious” polynomial pair formed by rounding coefficients.
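The remainder-sequence view of univariate approximate GCDs can be sketched in Python. This is an illustration only: it uses a plain Euclidean remainder sequence with a threshold in place of the Gröbner basis computation used for the examples in this paper, and the names `trim`, `remainder`, and `approx_gcd`, the test polynomials, and the scaling of the threshold by the largest coefficient of the dividend are assumptions of the sketch.

```python
def trim(p, tol):
    """Drop leading coefficients whose magnitude is below tol."""
    i = 0
    while i < len(p) and abs(p[i]) < tol:
        i += 1
    return p[i:]

def remainder(a, b):
    """Remainder of a divided by b (coefficient lists, highest degree first)."""
    a = list(a)
    while len(a) >= len(b):
        factor = a[0] / b[0]
        for i in range(len(b)):
            a[i] -= factor * b[i]
        a.pop(0)  # leading coefficient has been cancelled
    return a

def approx_gcd(a, b, tol):
    """Euclidean remainder sequence; small remainders are treated as zero."""
    a, b = trim(list(a), tol), trim(list(b), tol)
    while b:
        scale = max(abs(c) for c in a)
        r = trim(remainder(a, b), tol * scale)
        a, b = b, r
    return [c / a[0] for c in a]  # normalize to a monic polynomial

# Noisy versions of (x^2 - 3x + 2)(x + 3) and (x^2 - 3x + 2)(x - 5).
noisy_a = [1.0, 1e-8, -7.0, 6.0 + 2e-8]
noisy_b = [1.0, -8.0, 17.0 - 1e-8, -10.0]
g = approx_gcd(noisy_a, noisy_b, tol=1e-6)
```

Here `g` is a monic quadratic close to x^2 - 3x + 2. With a tolerance below the noise level the final remainder would not be discarded and the sequence would instead terminate in a constant.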

Now we show an example from [

Multivariate approximate polynomial GCD algorithms are presented in [

We show some examples below. Note that we make no effort to locally improve the result, for example, by Newton’s method.

This is example exF07 from [

This is an example from [

We create a pair of polynomials with prescribed GCD. We readily recover it using approximate arithmetic.

Here we show that, with some amount of noise thrown in, we can still recover a reasonable approximate GCD.

A natural question, which we now consider, is what this result might represent. We will show that it is an exact (up to the numerical precision in use) GCD for a nearby set of inputs. For purposes of assessing proximity we will use the customary 1-norm of a given polynomial, defined as the sum of the absolute values of its coefficients. To gauge the quality of our approximated GCD we will try to find a nearby set of inputs that has this GCD exactly. This is straightforward to do using the generalized division with remainder (also called polynomial reduction) of Buchberger’s algorithm [

We compute the generalized remainders.

Here are the norms of the inputs and also of these remainders.

So the perturbation polynomials, quot1 and quot2, are indeed small compared to the input polynomials.

Now we check the perturbed inputs formed by subtracting these remainders from their respective inputs; each is divisible by the GCD polynomial. This is shown by the fact that the divisions give only constant terms on the order of the computational error one expects from the precision we used.
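For a univariate analogue, the check described above can be sketched in Python. This is an illustration under stated assumptions rather than the multivariate computation performed in this paper: ordinary polynomial division stands in for generalized division, and the names `one_norm`, `divmod_poly`, and `subtract_low_order`, as well as the sample polynomials, are hypothetical.

```python
def one_norm(p):
    """Sum of absolute values of the coefficients."""
    return sum(abs(c) for c in p)

def divmod_poly(a, b):
    """Quotient and remainder of a by b (coefficients, highest degree first)."""
    a = list(a)
    quotient = []
    while len(a) >= len(b):
        factor = a[0] / b[0]
        quotient.append(factor)
        for i in range(len(b)):
            a[i] -= factor * b[i]
        a.pop(0)
    return quotient, a

def subtract_low_order(p, r):
    """Subtract the remainder r from the low-order coefficients of p."""
    out = list(p)
    for i in range(1, len(r) + 1):
        out[-i] -= r[-i]
    return out

gcd_poly = [1.0, -3.0, 2.0]            # candidate GCD, x^2 - 3x + 2
noisy_input = [1.0, 1e-4, -7.0, 6.0002]  # noisy (x^2 - 3x + 2)(x + 3)

_, rem = divmod_poly(noisy_input, gcd_poly)
perturbed = subtract_low_order(noisy_input, rem)
_, rem_check = divmod_poly(perturbed, gcd_poly)

norm_input = one_norm(noisy_input)
norm_rem = one_norm(rem)
norm_check = one_norm(rem_check)
```

The remainder `rem` is the perturbation: its 1-norm is small compared to that of the input, and after subtracting it the division by `gcd_poly` leaves only rounding-level residue.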

One can moreover check that the approximate GCD computed from these perturbed inputs, using much higher tolerances, agrees (up to constant multiple) with the GCD found in the noisy input problem.

We remark that one can use optimization methods such as iterative refinement in order to attempt to improve the perturbations by making them smaller in norm. In this setting one might also allow for perturbing the GCD, so long as it remains an exact GCD of the (newly) perturbed inputs. Such ideas appear in [

We have demonstrated how relative and absolute error from numerical computation can be adapted to the setting of numerical Gröbner bases. While by no means flawless, we see from numerous examples that these approaches hold promise for handling overdetermined systems of algebraic equations. These computational methods also apply to other problems from hybrid symbolic-numeric computation, such as finding approximate polynomial GCDs.

While most examples covered seem to work efficiently and give reasonable results, it remains an open question how competitive these methods are in regard to speed and quality of results, as compared to other approaches. An advantage to Gröbner bases is that polynomial algebra is carried out in a sparse setting; many methods based on linear algebra require dense matrix manipulation. The examples presented offer evidence that, when working with input of modest degree, Gröbner basis methods are viable. That the coding is simple makes them all the more attractive.

An open area for further work is in determining, in some automated fashion (perhaps based on problem type), what reasonable tolerances for a specific problem are. A possible approach would be to set up an outer level optimization, wherein one strives to maximize the degree of a candidate GCD, or the (finite) number of solutions to an overdetermined system, and has for parameters these tolerances. This is another place where SVD-based matrix approaches have an advantage; a “natural” tolerance is generally revealed from the largest ratio in consecutive singular values (possibly excepting cases where a jump is from a very small singular value to zero). At present all Gröbner basis methods need some prespecification of tolerance.
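The singular value heuristic mentioned above can be sketched in Python. This is only an illustration of the idea as it is used in SVD-based approaches, not part of the methods of this paper; the function name `rank_from_largest_gap`, the `floor` parameter, and the sample values are assumptions.

```python
def rank_from_largest_gap(singular_values, floor=1e-300):
    """Numerical rank suggested by the largest ratio of consecutive values.

    Values below `floor` are treated as exact zeros, so a jump from a
    small singular value down to zero is ignored.
    """
    s = sorted(singular_values, reverse=True)
    best_ratio, rank = 0.0, len(s)
    for k in range(len(s) - 1):
        if s[k + 1] < floor:
            break
        ratio = s[k] / s[k + 1]
        if ratio > best_ratio:
            best_ratio, rank = ratio, k + 1
    return rank, best_ratio

# Hypothetical singular values with a clear gap after the third one.
sample = [12.3, 4.1, 0.9, 3e-9, 1e-9, 0.0]
rank, gap = rank_from_largest_gap(sample)
```

Any tolerance chosen between the values on either side of the gap (here between `3e-9` and `0.9`) yields the same numerical rank, which is what makes the choice "natural" in that setting.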

Another avenue for future work is to adapt methods from [

It is also an open question whether symbolic “epsilon” powers can be used to improve the methods of this paper. The idea, roughly, is to replace coefficients that are deemed “small” (according to some relative error tolerance, say) by suitable powers of a variable that is local in the term ordering sense (hence monomials having powers of this variable are smaller than any monomial not containing it, including constants). Variants of this idea are discussed in [

Based on experimentation and comparison of timings with other methods reported, we state a tentative conclusion. The methods of this paper are viable and effective when the problem at hand is unperturbed from an exactly solvable variant. They often give good results when the problem is overdetermined, provided the noise is modest relative to an exactly solvable nearby problem, and the scale of coefficients does not vary too much. In other situations it is not clear whether our methods can be adapted so readily.

Below is code used in computations in this paper.

Here are inputs for several of the examples.

Here are equations from

We can run the solver as follows.

If instead we run at default settings we get some “spurious” solutions, as explained earlier in the paper. We show here that both solution size and residual size can be quite large.

We can get a smaller residual by solving a nearby exact system, as below. One may notice that the residuals of the unwanted solutions (all but the last two) are around machine precision larger in scale than the residuals of the two good solutions at the end.

Of course if we plug the solutions to the exact problem into the original one, the residuals from the bad solutions are again quite large. This goes to illustrate the extreme ill conditioning that gave rise to the unwanted solutions.

The following code is for the camera pose problem.

I thank the reviewers of this and earlier drafts for providing several helpful comments, suggestions, and references. It is my hope that the attempt to address their remarks has improved this paper.