IJMMS International Journal of Mathematics and Mathematical Sciences 1687-0425 0161-1712 Hindawi Publishing Corporation 353917 10.1155/2012/353917 353917 Research Article A New Proof of the Pythagorean Theorem and Its Application to Element Decompositions in Topological Algebras Greensite Fred Gilányi Attila Department of Radiological Sciences University of California Irvine Medical Center Orange CA 92868 USA uci.edu 2012 1 8 2012 2012 28 02 2012 03 06 2012 2012 Copyright © 2012 Fred Greensite. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

We present a new proof of the Pythagorean theorem which suggests a particular decomposition of the elements of a topological algebra in terms of an “inverse norm” (addressing unital algebraic structure rather than simply vector space structure). One consequence is the unification of Euclidean norm, Minkowski norm, geometric mean, and determinant, as expressions of this entity in the context of different algebras.

1. Introduction

Apart from being unital topological *-algebras, matrix algebras, special Jordan algebras, and Cayley-Dickson algebras would seem to have little else in common. For example, the matrix algebra $\mathbb{R}^{n\times n}$ is associative but noncommutative, the special Jordan algebra derived from it is nonassociative but commutative (and so are the spin factor Jordan algebras), and Cayley-Dickson algebras are both nonassociative and noncommutative (apart from the three lowest dimensional instances). However, these latter sets of algebras share an interesting feature. Each is associated with a function $f(s)$ that vanishes on the nonunits and provides a decomposition of every unit as $$s = f(s)\,\nabla^{*}f(s^{-1}), \tag{1.1}$$ with $$f\big(\nabla^{*}f(s^{-1})\big) = 1, \tag{1.2}$$ where $\nabla^{*}f(s^{-1})$ indicates that the gradient $\nabla f$ is evaluated at $s^{-1}$, following which the involution $*$ is applied. For example,

on an $n$-dimensional Cayley-Dickson algebra, $f(s)$ is the quadratic mean of the components multiplied by $\sqrt{n}$, that is, $f(s)$ is the Euclidean norm,

on the algebra $\mathbb{R}^n$ with component-wise addition and multiplication, $f(s)$ is the geometric mean of the absolute values of the components multiplied by $\sqrt{n}$,

on the matrix algebra $\mathbb{R}^{n\times n}$, $f(s)$ is the (principal) $n$th root of the determinant multiplied by $\sqrt{n}$,

on the spin factor Jordan algebras, f(s) is the Minkowski norm.

Looked at another way, the Euclidean norm, the geometric mean, the $n$th root of the determinant, and the Minkowski norm are all expressions of the same thing in the context of different algebras. With respect to topological *-algebras, this “thing” supersedes the Euclidean norm and determinant since neither is meaningful in all settings for which the solution to (1.1) and (1.2) is meaningful.

There is another aspect relevant to the Cayley-Dickson algebras. In addition to (1.1) and (1.2), there is a function $f$ on the elements of the algebra such that for any unit $s$ in the algebra, $$s = f(s)\,\nabla f(s), \tag{1.3}$$ with $$f\big(\nabla f(s)\big) = 1. \tag{1.4}$$ This equation set makes no reference to multiplicative structure; that is, it is a general property of the underlying vector space. Indeed, $f(s)$ is again the Euclidean norm. In fact, (1.3) and (1.4) can be derived from the (Hilbert formulation) axioms of plane Euclidean geometry without use of the Pythagorean theorem, and as such can be used as the centerpiece of a new proof of that theorem.

So, we first prove the Pythagorean Theorem by deriving (1.3) and (1.4), we then use the latter equations to develop (1.1) and (1.2), following which we demonstrate the assertions of the first paragraph of this Introduction. Ultimately, existence of the decomposition (1.1), (1.2) is forwarded as a kind of surrogate for the Pythagorean theorem in the context of topological algebras.

It will also be seen that there is a hierarchy related to the basic equations, evidenced by progressively more structure accompanying the solution function on particular algebras. Equations (1.1) and (1.2) have a clear analogy with the form of (1.3) and (1.4). Both cases present a decomposition of the units of an algebra as the product of a particular function's value at that point multiplied by a unity-scaled orientation point dependent on the function's gradient. The equations are nonlinear, in general. However, along with the prescription that $f(s)$ vanishes on the nonunits, the above particular matrix, Cayley-Dickson, and spin factor Jordan algebras also happen to satisfy the additional property that there is a real constant $\alpha$ such that for any unit $s$ in the algebra, $$f(s)\,f(s^{-1}) = \alpha. \tag{1.5}$$ Replacing $s$ with $s^{-1}$ in (1.1), the above implies a decomposition of the inverse of a point as $$s^{-1} = \alpha\,\frac{\nabla^{*}f(s)}{f(s)}. \tag{1.6}$$ Furthermore, multiplying both sides of (1.6) by $f(s)\,s$ indicates that the above is a linear equation for $f$. One can be even more restrictive and consider the set of algebras on which a function exists satisfying all of the above where, in addition, (1.5) is strengthened to $f(s_1)f(s_2) = \sqrt{\alpha}\,f(s_1 s_2)$ for any two units $s_1, s_2$. In this case, the units form a group and $f(s)/\sqrt{\alpha}$ is a homomorphism on this group. The octonions occupy a special place as an algebra satisfying this prescription that is not a matrix subalgebra.
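The constant in (1.5) and the strengthened multiplicative relation can be checked numerically on two small cases. The sketch below is illustrative only and makes two assumptions: it reads the strengthened relation as $f(s_1)f(s_2) = \sqrt{\alpha}\,f(s_1 s_2)$, and it takes $f$ to be the Euclidean norm on the complex numbers (so $\alpha = 1$) and $\sqrt{2}\,\sqrt{\det}$ on $2\times 2$ real matrices with positive determinant (so $\alpha = 2$). All helper names are hypothetical.

```python
import math

def f_complex(z):
    """Euclidean norm on the complex numbers (a Cayley-Dickson algebra)."""
    return abs(z)

def det2(m):
    """Determinant of a 2x2 matrix given as nested lists."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def inv2(m):
    """Inverse of a 2x2 matrix via the adjugate."""
    d = det2(m)
    return [[m[1][1] / d, -m[0][1] / d], [-m[1][0] / d, m[0][0] / d]]

def matmul2(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def f_matrix(m):
    """Assumed form sqrt(n) * det^(1/n) with n = 2; needs det(m) > 0."""
    return math.sqrt(2.0) * math.sqrt(det2(m))

# Complex numbers: expect alpha = 1.
z1, z2 = complex(1.0, 2.0), complex(-0.5, 3.0)
alpha_c = f_complex(z1) * f_complex(1 / z1)
mult_c = f_complex(z1) * f_complex(z2) - math.sqrt(1.0) * f_complex(z1 * z2)

# 2x2 matrices with positive determinant: expect alpha = 2.
m1 = [[2.0, 1.0], [0.5, 3.0]]
m2 = [[1.0, -1.0], [2.0, 4.0]]
alpha_m = f_matrix(m1) * f_matrix(inv2(m1))
mult_m = f_matrix(m1) * f_matrix(m2) - math.sqrt(2.0) * f_matrix(matmul2(m1, m2))

print(alpha_c, mult_c, alpha_m, mult_m)
```

The two residuals `mult_c` and `mult_m` should vanish up to rounding, while `alpha_c` and `alpha_m` show that the constant depends on the algebra.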

2. Euclidean Decomposition

Theorem 2.1 (Pythagoras).

In a space satisfying the axioms of plane Euclidean geometry, the square of the hypotenuse of a right triangle is equal to the sum of the squares of its two other sides.

The theorem hypothesis is assumed to indicate the Hilbert formulation of plane Euclidean geometry. We will refer to the point set in question as the “Euclidean plane.” Points will be denoted by lower case Roman letters and real numbers by lower case Greek letters.

Proof. We begin by providing an outline of the proof.

A vector space structure is defined on the Euclidean plane $E$ after identifying one of the vertices of the hypotenuse of the given right triangle as an origin $o$. We define the Euclidean norm implicitly as the function $f: E \to \mathbb{R}$ giving the length of a line segment from the origin to any given point in the plane. Since the Hilbert formulation includes continuity axioms, we can employ the usual notions relating to limits and thereby define directional derivatives. The crux of the proof, Lemma 2.3, is the demonstration that the parallel axiom implies the existence and continuity of the directional derivatives at points other than the origin, and that the largest directional derivative at a point $s$ associated with a unit length direction line segment has unit value and is such that its direction line segment is collinear with line segment $\overline{os}$. The novelty lies in the necessity that this be accomplished in the absence of an explicit formula for the Euclidean norm. A Cartesian axis system is now generated from the two sides of the given triangle forming the right angle, following which an isomorphism from $E$ to the vector space $\mathbb{R}^2$ is easily demonstrated. Using this isomorphism, Lemma 2.3 is seen to imply the existence of the gradient $\nabla f(t) \in E$ for $t$ not at the origin, with its characteristic property regarding generation of a directional derivative from a particular specified direction line segment, and such that the origin, $t$, and $\nabla f(t)$ are collinear. It is then a simple matter to show $t = f(t)\,\nabla f(t)$ and $f(\nabla f(t)) = 1$. For $t = (\tau_1, \tau_2)$, a solution to the latter partial differential equation is supplied by $f(\tau_1, \tau_2) = \sqrt{\tau_1^2 + \tau_2^2}$. This solution is unique because $t = f(t)\,\nabla f(t)$ implies that $t$ and $\nabla f(t)$ are collinear with the origin, so that the equation can be written as an ordinary differential equation, which is easily shown to have a unique solution. This explicit representation of the Euclidean norm implies the Pythagorean theorem, thus concluding the proof.

We now fill in the details of the above argument.

Given a right triangle $\Delta\{o,a,s\}$ with line segment $\overline{os}$ as the hypotenuse, we define a function $f(t)$ that gives the length of the line segment $\overline{ot}$ for any point $t$ in the plane, with the convention that $f(o) = 0$. The continuity axioms that are part of the Hilbert formulation support the equivalent of the least upper bound axiom for the real number system, and we have the usual continuity properties related to $\mathbb{R}$, whose elements identify lengths of line segments in particular. Thus, the usual notion of limit can be defined and is assumed.

Definition 2.2.

A direction line segment with respect to a particular point is any line segment one of whose endpoints is the given point. If the following limit exists, the directional derivative of $f$ at $t$ with respect to direction line segment $\overline{tw}$ is $$D_{\overline{tw}}f(t) \equiv \lim_{\epsilon\to 0}\frac{f(t \oplus \epsilon\,\overline{tw}) - f(t)}{\epsilon}, \tag{2.1}$$ where $t \oplus \epsilon\,\overline{tw}$ formally denotes a point on the line containing $\overline{tw}$ such that the line segment defined by this point and $t$ has length given by the product of $|\epsilon|$ and the length of $\overline{tw}$, and $t$ lies between this point and $w$ if and only if $\epsilon < 0$.

Lemma 2.3.

For any point $w$ different from $s$, the directional derivative of $f$ at $s$ specified by $\overline{sw}$ exists and is continuous at $s$. Furthermore, the largest directional derivative at $s$ associated with a direction line segment $\overline{sw}$ of a particular fixed length is such that $o, s, w$ are collinear with $s$ between $o$ and $w$, and if the direction line segment has unit length, then the largest directional derivative has unit value.

Proof of Lemma 2.3.

Let the length of $\overline{sw}$ be $\lambda$. If $w$ lies on the line containing $\overline{os}$ with $s$ between $o$ and $w$, it is clear that the directional derivative exists and has a value given by $\lambda$, since the expression inside the limit in (2.1) has this value for all $\epsilon \neq 0$. In the same way, we can demonstrate that the directional derivative exists when $w$ is on the line containing $\overline{os}$ but $s$ is not between $o$ and $w$, in which case the directional derivative has value $-\lambda$.

Thus, suppose $\overline{sw}$ is not on the line containing $\overline{os}$. Figure 1 can be used to keep the following constructions in context. Consider the orthogonal projection of the point $s \oplus \epsilon\,\overline{sw}$ to the line containing $\overline{os}$, that is, the point $p$ on the line containing $\overline{os}$ such that $\angle\{o,p,s \oplus \epsilon\,\overline{sw}\}$ is a right angle (the expression $\angle\{o,p,s \oplus \epsilon\,\overline{sw}\}$ denotes the angle within the triangle $\Delta\{o,p,s \oplus \epsilon\,\overline{sw}\}$ that is formed by the two line segments $\overline{op}$ and $\overline{p(s \oplus \epsilon\,\overline{sw})}$). Consider the circle centered at $o$ of radius $f(s \oplus \epsilon\,\overline{sw})$. We claim that $p$ is in the region enclosed by the circle, that is, $f(p) < f(s \oplus \epsilon\,\overline{sw})$. Indeed, suppose this is false. Consider the circle with center at $o$ of radius $f(p)$, and let the tangent line to this circle at the point $p$ be denoted $T_p$. Being a tangent, all of the points $y$ of $T_p$ other than $p$ will be such that $$f(y) > f(p) \geq f(s \oplus \epsilon\,\overline{sw}). \tag{2.2}$$ Note that $T_p$ intersects the line containing $\overline{os}$ at a right angle (a tangent to a circle at a particular point is perpendicular to the circle radius at that point). But there is only one line through $p$ that meets the line containing $\overline{os}$ at a right angle, and that is the line containing $\overline{p(s \oplus \epsilon\,\overline{sw})}$, as previously defined. So $T_p$ must contain $\overline{p(s \oplus \epsilon\,\overline{sw})}$, that is, $s \oplus \epsilon\,\overline{sw}$ is on $T_p$ but is not the point $p$. The first inequality of (2.2) would then imply that $y = s \oplus \epsilon\,\overline{sw} \in T_p$ is such that $f(s \oplus \epsilon\,\overline{sw}) > f(p)$. This contradicts the second inequality of (2.2). Hence, we have established our claim that $f(p) < f(s \oplus \epsilon\,\overline{sw})$.

Let $P_{\overline{os}}$ be the line perpendicular to $\overline{os}$ at $o$. Since $s$ is not the point $o$, the line containing $\overline{sw}$ intersects $P_{\overline{os}}$ in at most one point. In what follows, we assume $|\epsilon|$ is small enough so that no point of $\overline{s(s \oplus \epsilon\,\overline{sw})}$ is on $P_{\overline{os}}$. Again consider the circle centered at $o$ with radius $f(s \oplus \epsilon\,\overline{sw})$. Its tangent at $s \oplus \epsilon\,\overline{sw}$ must intersect the line containing $\overline{os}$ at a point $r$ (since $s \oplus \epsilon\,\overline{sw}$ is not on $P_{\overline{os}}$, the tangent cannot be parallel to $\overline{os}$). Being on a tangent but not the point of tangency itself, $r$ is necessarily external to the region enclosed by the circle ($f(r)$ is greater than the circle radius). This circle intersects the line containing $\overline{os}$ at two points defining a diameter of the circle. We denote as $q$ the one of these two points such that $q$ is between $p$ and $r$.

We are first required to show the existence of the limit in (2.1) (for $t = s$), which we can write as $\lim_{\epsilon\to 0}(f(q)-f(s))/\epsilon$, since $q$ and $s \oplus \epsilon\,\overline{sw}$ both lie on the aforementioned circle. To do this, we will initially assume that $\epsilon > 0$ and show that $\lim_{\epsilon\to 0^{+}}(f(q)-f(s))/\epsilon$ exists, after which it will be clear that an entirely analogous argument establishes the same value for $\lim_{\epsilon\to 0^{-}}(f(q)-f(s))/\epsilon$.

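We have $$\frac{f(q)-f(s)}{\epsilon\lambda} = \frac{f(q)-f(p)}{\epsilon\lambda} + \frac{f(p)-f(s)}{\epsilon\lambda}, \tag{2.3}$$ $$\frac{f(q)-f(p)}{\epsilon\lambda} \leq \frac{f(r)-f(p)}{\epsilon\lambda}, \tag{2.4}$$ since $q$ is between $p$ and $r$. Note that $\epsilon\lambda$ is the length of $\overline{s(s \oplus \epsilon\,\overline{sw})}$, according to Definition 2.2. The triangle $\Delta\{p,o,s \oplus \epsilon\,\overline{sw}\}$ is similar to $\Delta\{p,s \oplus \epsilon\,\overline{sw},r\}$ (ultimately, because the tangent line to the circle at $s \oplus \epsilon\,\overline{sw}$ implies that $\angle\{o,s \oplus \epsilon\,\overline{sw},r\}$ is a right angle). Let $\mu$ be the length of $\overline{p(s \oplus \epsilon\,\overline{sw})}$. It follows that $$\frac{f(r)-f(p)}{\mu} = \frac{\mu}{f(p)}. \tag{2.5}$$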

If $s$ and $p$ are ever the same point for some value of $\epsilon$, then $\overline{s(s \oplus \epsilon\,\overline{sw})}$ is perpendicular to $\overline{os}$, and will remain so for any other $\epsilon$, so that $s$ and $p$ will always be the same point. In that case, $f(s) = f(p)$ and $\mu = \epsilon\lambda$. It then follows that the right-hand side of (2.5) tends to zero with $\epsilon$ (since this right-hand side is $(\epsilon\lambda)/f(s)$), so the right-hand side of (2.4) also tends to zero with $\epsilon$, which means that the first term on the right-hand side of (2.3) tends to zero with $\epsilon$. But since $f(s) = f(p)$, the second term on the right-hand side of (2.3) is zero. It then follows that $\lim_{\epsilon\to 0}(f(q)-f(s))/\epsilon = 0$ and, in particular, the required limit exists.

Thus, suppose $s$ and $p$ are different, and consider $\Delta\{p,s,s \oplus \epsilon\,\overline{sw}\}$. This defines a set of similar triangles for all values of $\epsilon \neq 0$, because the angle $\angle\{p,s,s \oplus \epsilon\,\overline{sw}\}$ does not change as $\epsilon$ varies and $\angle\{s,p,s \oplus \epsilon\,\overline{sw}\}$ remains a right angle. Consequently, $\mu/(\epsilon\lambda)$ is a nonzero constant for all $\epsilon > 0$ since this is the ratio of two particular sides of each triangle in this set of similar triangles. This means that $\lim_{\epsilon\to 0^{+}}\mu = 0$, since $\mu/(\epsilon\lambda)$ could not otherwise remain a constant because $\epsilon\lambda$ tends to zero with $\epsilon$. On the other hand, because $|f(p)-f(s)|/(\epsilon\lambda)$ is also constant as $\epsilon$ varies (being a ratio of a different combination of sides of these same triangles), it must also follow that $\lim_{\epsilon\to 0^{+}}f(p) = f(s) \neq 0$ (which also means that $o$ is not between $p$ and $s$ for small $\epsilon$). So, it must be that the right-hand side of (2.5) tends to zero as $\epsilon > 0$ tends to zero. Consequently, the right-hand side of (2.4) also tends to zero as $\epsilon > 0$ tends to zero (because, as we have noted, $\mu/(\epsilon\lambda)$ is constant as $\epsilon > 0$ varies). Hence, the first term on the right-hand side of (2.3) also tends to zero as $\epsilon > 0$ becomes small. This means that the left-hand side of (2.3) has the same limit as $\epsilon > 0$ tends to zero as the second term on the right-hand side of (2.3) (assuming the limit exists). But we have already noted that the term $|f(p)-f(s)|/(\epsilon\lambda)$ is a constant as $\epsilon$ varies, since it is determined by the ratio of particular sides of $\Delta\{p,s,s \oplus \epsilon\,\overline{sw}\}$, and (as we have noted) for any $\epsilon \neq 0$ all such triangles are similar. In fact (removing the absolute value sign in the numerator), we further claim that $$\kappa \equiv \frac{f(p)-f(s)}{\epsilon\lambda} \tag{2.6}$$ is a constant for all small $\epsilon > 0$. To see this, recall that for small $\epsilon > 0$, $o$ is not between $s$ and $p$, and $\angle\{p,s,s \oplus \epsilon\,\overline{sw}\}$ is constant. Now, if $p$ changes from being between $o$ and $s$ versus not being between $o$ and $s$ as $\epsilon > 0$ varies, the latter angle must change from being an acute angle to being an obtuse angle, or vice versa, which contradicts the fact that the angle is constant for all $\epsilon > 0$. For small $\epsilon > 0$, it follows that $s$ is always between $o$ and $p$, or $p$ is always between $o$ and $s$, which establishes our claim that (2.6) is constant. Therefore, $$\lim_{\epsilon\to 0^{+}}\frac{f(q)-f(s)}{\epsilon} = \lambda\kappa. \tag{2.7}$$ An analogous argument establishes $\lim_{\epsilon\to 0^{-}}(f(q)-f(s))/\epsilon = \lambda\kappa$, because $\Delta\{p,s,(s \oplus \epsilon\,\overline{sw})\}$ are similar triangles for all nonzero values of $\epsilon$. This establishes the limit in (2.1).

Next, we establish the continuity of each directional derivative. Consider any sequence of line segments $\{\overline{s_i w_i}\}$ such that $\{s,w,w_i,s_i\}$ form a parallelogram for all $i$, and the limit of the length of the line segments $\{\overline{s_i s}\}$ is zero (which implies that the limit of the length of the line segments $\{\overline{w_i w}\}$ is also zero). To establish the continuity of a directional derivative at $s$ it is required to show that the left-hand side of the following equation is zero: $$\lim_{i\to\infty}\left|D_{\overline{sw}}f(s) - D_{\overline{s_i w_i}}f(s)\right| = \lim_{i\to\infty}|\kappa-\kappa_i|\,\lambda. \tag{2.8}$$ For each $i$, $\kappa_i$ is the ratio of particular sides of any member of a particular set of similar triangles, as is also the case for $\kappa$ as above. Furthermore, as $i$ increases, the ratio of the lengths of the sides of the triangle relevant to $\kappa_i$ converges to the same ratio as that of the corresponding triangles relevant to $\kappa$, since $s_i$ converges to $s$ and $w_i$ converges to $w$. Therefore, $|D_{\overline{sw}}f(s) - D_{\overline{s_i w_i}}f(s)|$ necessarily tends to zero, implying continuity of the directional derivative at $s$.

We next show that the largest directional derivative at $s$ associated with a direction line segment $\overline{sw}$ of a particular fixed length is such that $o, s, w$ are collinear with $s$ between $o$ and $w$. We have from (2.7) (and the equation in the following sentence) that the directional derivative is $\lambda\kappa$. Thus, we only need to show that $|f(p)-f(s)| \leq \epsilon\lambda$, with equality if and only if $\overline{sw}$ is on the line containing $\overline{os}$, because once that is established it is easy to show that if $o, s, w$ are collinear, then $\kappa$ will be negative if $s$ is not between $o$ and $w$, and will be positive otherwise.

Now, $\lambda\epsilon$ is the length of the hypotenuse of right triangle $\Delta\{s,p,s \oplus \epsilon\,\overline{sw}\}$, and $|f(p)-f(s)|$ is the length of the side that is on the line containing $\overline{os}$. So, we only need to show that a nonhypotenuse side of a right triangle has a shorter length than the hypotenuse. Thus, consider any right triangle $\Delta\{a,b,c\}$ with $\angle\{a,b,c\}$ being the right angle (vertex $a$ here has nothing to do with vertex $a$ of our earlier given right triangle $\Delta\{o,a,s\}$). Consider a circle centered at $a$ having radius $\overline{ac}$. This circle intersects the line containing $\overline{ab}$ at a point $b'$. Suppose the length of $\overline{ab'}$ is less than the length of $\overline{ab}$, meaning that the length of the hypotenuse is smaller than the length of one of the other sides. Then $b$ is external to the circle. So, consider a second circle centered at $a$ but now with radius given by the length of $\overline{ab}$. This circle intersects the line containing $\overline{ab'}$ at the point $b$. The two circles are concentric, with the circle containing $b$ lying wholly external to the circle containing $c$. Now, the second circle (containing $b$) has a tangent at $b$ making a right angle with the line containing $\overline{ab}$ (tangents are perpendicular to the radius at the point of tangency). But the line containing $\overline{bc}$ is also perpendicular to the line containing $\overline{ab}$ (since $\angle\{a,b,c\}$ is already given as a right angle). So the point $c$ (which lies on our first circle of radius given by the length of $\overline{ac}$) must also be a point on the tangent line at the point $b$ of our second circle. This is impossible since, given two concentric circles, a tangent to a point on the circle of greater radius cannot intersect the circle of smaller radius. This contradicts the assumption that the hypotenuse is smaller than the length of one of the sides.
Furthermore, the hypotenuse cannot equal the length of one of the other sides of the right triangle, because in that case a circle centered at $a$ with radius equal to the length of the hypotenuse would intersect the right triangle at the two points $b$ and $c$, which would require that a tangent line to the circle at $b$ (making a right angle with the line containing $\overline{ab}$) would have another point of intersection with the circle (i.e., at $c$, again since the line containing $\overline{bc}$ is also perpendicular to the line containing $\overline{ab}$ and there can be only one such perpendicular). Thus, unless $s, p, s \oplus \epsilon\,\overline{sw}$ are collinear, $|f(p)-f(s)| < \epsilon\lambda$, and it is immediately verified that $|f(p)-f(s)| = \epsilon\lambda$ if the points are collinear. It is then evident that the value of the largest directional derivative is $\lambda$. Thus, if $\lambda = 1$, the largest directional derivative has unit value. Hence, Lemma 2.3 is proved.
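The conclusion of Lemma 2.3 can be illustrated numerically. The sketch below is an illustration only, not part of the proof: it assumes Cartesian coordinates and uses the explicit length formula (`math.hypot`), which the proof deliberately avoids. The sample point, the segment length, the angular grid, and all function names are hypothetical choices. It scans direction segments of fixed length $\lambda$ at a point $s$ and records where the directional derivative of the length function is largest.

```python
import math

def f(t):
    """Length of the segment from the origin o = (0, 0) to t (coordinate formula)."""
    return math.hypot(t[0], t[1])

def directional_derivative(s, w, eps=1e-6):
    """Central-difference estimate of the derivative of f at s along segment sw."""
    d = (w[0] - s[0], w[1] - s[1])
    plus = (s[0] + eps * d[0], s[1] + eps * d[1])
    minus = (s[0] - eps * d[0], s[1] - eps * d[1])
    return (f(plus) - f(minus)) / (2.0 * eps)

s = (3.0, 4.0)      # sample point, f(s) = 5
lam = 2.0           # fixed length of every direction segment sw
steps = 3600        # angular grid of 0.1 degree

best_angle, best_value = 0.0, -math.inf
for k in range(steps):
    theta = 2.0 * math.pi * k / steps
    w = (s[0] + lam * math.cos(theta), s[1] + lam * math.sin(theta))
    value = directional_derivative(s, w)
    if value > best_value:
        best_angle, best_value = theta, value

radial_angle = math.atan2(s[1], s[0])   # direction of the ray from o through s

# Segment pointing from s back toward o: derivative should be -lam.
w_back = (s[0] - lam * s[0] / f(s), s[1] - lam * s[1] / f(s))
value_back = directional_derivative(s, w_back)

print(best_angle, radial_angle, best_value, value_back)
```

On this grid the maximizing direction should coincide with the ray from $o$ through $s$ to within one grid step, with maximal value close to $\lambda$, and the opposite direction should give a value close to $-\lambda$.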

Figure 1: Diagram relating to the proof that the function giving the length of a line segment with one endpoint at $o$ has continuous directional derivatives at points other than $o$ (Lemma 2.3). Three facts used in the proof are as follows: (1) $\Delta\{p,o,s \oplus \epsilon\,\overline{sw}\}$ is similar to $\Delta\{p,s \oplus \epsilon\,\overline{sw},r\}$ because $\angle\{o,s \oplus \epsilon\,\overline{sw},r\}$ is a right angle, (2) for all $\epsilon \neq 0$, the triangles $\Delta\{p,s,s \oplus \epsilon\,\overline{sw}\}$ are similar, (3) $q$ is between $p$ and $r$.

Definition 2.2 actually suggests two operations, and these will be referred to in Lemma 2.5 as “the Definition 2.2 associated operations.” That is, we define the (more general) expression $c \oplus \gamma\,\overline{uv}$ to represent $c \oplus \gamma v \oplus (-\gamma)u$ subject to the following.

The multiplication of a scalar with a point, $\gamma v$, is defined to be the point on the line containing $\overline{ov}$ such that $f(\gamma v) = |\gamma|\,f(v)$ and such that $o$ is between this new point and $v$ if and only if $\gamma$ is negative.

The “sum” of two points, $h_1 \oplus h_2$, is defined as follows.

If $o, h_1, h_2$ are not collinear, $h_1 \oplus h_2$ is defined to be the point $z$ such that the vertices $\{o,h_1,h_2,z\}$ form a parallelogram.

If either $h_1$ or $h_2$ is the point $o$, then $h_1 \oplus h_2$ is the point that is not $o$, or is $o$ if $h_1$ and $h_2$ are both $o$.

If $h_1$ and $h_2$ are the same point, then $h_1 \oplus h_2$ is the point on the line containing $\overline{oh_1}$ such that $f(h_1 \oplus h_2) = 2f(h_1)$ and $o$ is not between $h_1$ and $h_1 \oplus h_2$.

If $o, h_1, h_2$ are distinct and collinear,

when $o$ is not between $h_1, h_2$, then $h_1 \oplus h_2$ is the point on the line containing $\overline{h_1 h_2}$ such that $f(h_1 \oplus h_2) = f(h_1) + f(h_2)$ and $o$ is not between $h_1$ and $h_1 \oplus h_2$,

when $o$ is between $h_1, h_2$,

if $f(h_2) > f(h_1)$, then $h_1 \oplus h_2$ is the point on the line containing $\overline{h_1 h_2}$ such that $f(h_1 \oplus h_2) = f(h_2) - f(h_1)$ and $o$ is between $h_1$ and $h_1 \oplus h_2$,

if $f(h_2) < f(h_1)$, then $h_1 \oplus h_2$ is the point on the line containing $\overline{h_1 h_2}$ such that $f(h_1 \oplus h_2) = f(h_1) - f(h_2)$ and $o$ is between $h_2$ and $h_1 \oplus h_2$,

if $f(h_1) = f(h_2)$, then $h_1 \oplus h_2$ is $o$.

With this understanding, it is clear that $t \oplus \epsilon\,\overline{tw}$ as defined in Definition 2.2 is the same thing as $t \oplus \epsilon w \oplus (-\epsilon)t$. Of course, this suggests operations on a vector space.

Remark 2.4.

The central role of the derivative of the norm function as featured in Definition 2.2 is not without precedent. The derivative of the norm also plays an important role in semi-inner product spaces [2, 3] (and premanifolds), where the condition that the space be continuous (or, alternatively, uniformly continuous) can be shown to be equivalent to the condition that the norm is Gâteaux differentiable (or, alternatively, uniformly Fréchet differentiable). Naturally, once the isomorphism between $E$ and $\mathbb{R}^2$ is established, our definition is seen to be analogous to the standard one.

Lemma 2.5.

There is an isomorphism between the vector space $\mathbb{R}^2$ with component-wise addition of elements and component-wise multiplication of elements by scalars, and the Euclidean plane with the Definition 2.2 associated operations.

Proof of Lemma 2.5.

We first identify a particular Cartesian axis system on the plane. One Cartesian system is already present, consisting of the lines containing the line segments forming the right angle of the right triangle given at the outset of this proof (i.e., $\overline{oa}$ and $\overline{as}$). However, since the length function $f(w)$ is referenced to $o$, we use the latter axis system to set up a different Cartesian system at $o$. Using the parallel axiom, consider the line through $o$ that is parallel to the line containing $\overline{as}$, and furthermore (again using the parallel axiom) consider a point $b$ on this new parallel line such that the vertices $\{o,a,s,b\}$ form a rectangle. The lines containing $\overline{oa}$ and $\overline{ob}$ are our Cartesian system (the “$\overline{oa}$-axis” and the “$\overline{ob}$-axis”).

We identify $o$ with the ordered pair $(0,0)$. Any point $t$ in the plane different from $o$ is associated with a unique ordered pair $(\tau_a,\tau_b)$ implied by the line segment $\overline{ot}$. That is, we take the orthogonal projection of $t$ to the line containing $\overline{oa}$ and take $|\tau_a|$ to be the length of the line segment formed by $o$ and this projection of $t$. $\tau_a$ is negative or positive depending on whether or not $o$ is between $a$ and the projection of $t$ to the line containing $\overline{oa}$. $\tau_b$ is defined analogously with respect to $b$ and the $\overline{ob}$-axis. Conversely, every ordered pair of real numbers is associated with a point in the plane. That is, for $(\chi_a,\chi_b)$ we find the point $x_a$ on the $\overline{oa}$-axis with $f(x_a) = |\chi_a|$ such that $o$ is between $x_a$ and $a$ if the sign of $\chi_a$ is negative, and $o$ is not between $x_a$ and $a$ if the sign of $\chi_a$ is positive, and similarly for a point $x_b$ on the $\overline{ob}$-axis relating to $\chi_b$. Then the point $x$ associated with $(\chi_a,\chi_b)$ is the point such that $\{o,x_a,x,x_b\}$ is a rectangle. Its existence and uniqueness is guaranteed by the parallel axiom. It is further obvious that the first construction associates $t$ to $(\tau_a,\tau_b)$ if and only if the second construction associates $(\tau_a,\tau_b)$ to $t$.

The above is therefore a one-to-one mapping between the points of the Euclidean plane and the points of $\mathbb{R}^2$ (ordered pairs of real numbers), explicitly employing the parallel and betweenness axioms. $\mathbb{R}^2$ becomes a vector space once we specify that ordered pairs (i.e., vectors) representing points in the plane can be added together component-wise and multiplied by scalars component-wise. The isomorphism between the vector space $\mathbb{R}^2$ and $E$ with the Definition 2.2 associated operations is then easily verified. Thus, Lemma 2.5 is established.

It follows that the isomorphism in the above lemma leads to an expression for the directional derivative in the vector space $\mathbb{R}^2$ that gives the same result as it did in the original Euclidean plane. In particular, basic arguments from multivariable calculus establish the existence of a total derivative, the gradient $\nabla f(s)$, as the ordered pair of directional derivatives of $f$ at $s$ with direction line segments defined by unit length line segments parallel to the $\overline{oa}$-axis and $\overline{ob}$-axis, such that the directional derivatives are the inner product of $\nabla f(s)$ with the ordered pair in $\mathbb{R}^2$ corresponding to a particular direction line segment. For example, this is seen from $$f(a_1,b_1)-f(a_0,b_0) = [f(a_1,b_1)-f(a_0,b_1)] + [f(a_0,b_1)-f(a_0,b_0)] = \frac{\partial f(\xi_a)}{\partial a}(a_1-a_0) + \frac{\partial f(\xi_b)}{\partial b}(b_1-b_0), \tag{2.9}$$ with $\xi_a, \xi_b$ given by the mean value theorem. Applying the definition of directional derivative to both sides above, one sees that the directional derivative is given by the usual inner product of $\nabla f(s) = (\partial f(s)/\partial a, \partial f(s)/\partial b)$ with the direction line segment. This is accomplished without use of the Euclidean norm or prior use of the inner product operation. Being an ordered pair, $\nabla f(s)$ is a point in the plane, and we can refer to $f(\nabla f(s))$. Furthermore, we have already established that the largest directional derivative of $f$ at $s$ associated with a line segment $\overline{sw}$ of length $\lambda$ is such that $o, s, w$ are collinear with $s$ between $o$ and $w$. A standard argument establishes that $o$, $\nabla f(s)$, and $w$ are then collinear, and $o$ is not between $s$ and $w$ (because the gradient is proportional to the direction line segment associated with the greatest directional derivative and, according to Lemma 2.3, this direction line segment $\overline{sw}$ lies on the line containing $\overline{os}$ with $o$ not between $s$ and $w$). So, we have $s = \beta\,\nabla f(s)$, for $\beta > 0$.
Furthermore, a standard multivariable calculus argument establishes that the magnitude of $\nabla f(s)$ (the length of $\overline{o\nabla f(s)}$, i.e., $f(\nabla f(s))$) is the value of the largest directional derivative associated with a direction line segment of length unity, which according to Lemma 2.3 is unity. Thus, $f(\nabla f(s)) = 1$ (the magnitude of $\nabla f(s)$ is unity), so that $s = f(s)\,\nabla f(s)$ since $f(s)$ is the length of the hypotenuse $\overline{os}$.

In fact, it is easy to see that the equations of the last sentence of the prior paragraph pertain not just to $s$ but to any point $t$ in the plane different from $o$. They hold trivially if $t$ is a point on the $\overline{oa}$-axis or $\overline{ob}$-axis. For any other point $t$, we can consider $\overline{ot}$ to be the hypotenuse of a right triangle $\Delta\{o,a_t,t\}$, where $a_t$ is the orthogonal projection of $t$ to the $\overline{oa}$-axis, and then proceed in the same manner as we have already done for $\Delta\{o,a,s\}$ (noting that our axes are unchanged). Also, as stated at the outset, $f(o) = 0$, and lately we have the identification of $o$ as the ordered pair $(0,0)$. Including the latter, the equations in the last sentence of the prior paragraph constitute a partial differential equation. For $t$ identified in our Cartesian system as $(\tau_a,\tau_b)$, it is easily verified by standard differentiation that one solution is $f(t) = \sqrt{\tau_a^2+\tau_b^2}$. To show that this solution is unique, consider that $t = f(t)\,\nabla f(t)$ means that $t$, $\nabla f(t)$, and $(0,0)$ are collinear. Thus, for the points $x$ on the line containing $\overline{ot}$, we have the ordinary differential equation $x = f(x)\,[df(x)/dx]$ with $f(df(x)/dx) = 1$ and $f(0,0) = 0$. This has a unique solution, so that the already identified solution, $f(t) = \sqrt{\tau_a^2+\tau_b^2}$, must be the only solution. Thus, we have derived the Euclidean norm, and hence proved the Pythagorean theorem.

Equations (1.3) and (1.4) could also be used in the definition of the Euclidean norm as follows.

Corollary 2.6.

A function $f:\mathbb{R}^n \to \mathbb{R}$ is the Euclidean norm if and only if it is continuous, vanishes at the origin, and at any other point it satisfies (1.3) and (1.4).

Equations (1.3) and (1.4) indicate that, from a differentiable viewpoint, the Euclidean norm is a scaling-orientation function in the decomposition of a point as the product of a scalar with a unity-scaled orientation point derived from the function's gradient. We can consider this set of equations to represent “Euclidean decomposition.”
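As a quick numerical sanity check of Euclidean decomposition, the following sketch approximates the gradient of the Euclidean norm by central differences and tests both equations, read here as $s = f(s)\,\nabla f(s)$ and $f(\nabla f(s)) = 1$. The sample point, the step size, and the helper name `gradient` are illustrative assumptions.

```python
import math

def f(t):
    """Euclidean norm on R^n."""
    return math.sqrt(sum(x * x for x in t))

def gradient(func, t, h=1e-6):
    """Central-difference approximation of the gradient of func at t."""
    grad = []
    for i in range(len(t)):
        up, down = list(t), list(t)
        up[i] += h
        down[i] -= h
        grad.append((func(up) - func(down)) / (2.0 * h))
    return grad

s = [1.5, -2.0, 0.5, 3.0]                 # any point other than the origin
g = gradient(f, s)

reconstructed = [f(s) * gi for gi in g]   # right-hand side of (1.3)
norm_of_gradient = f(g)                   # left-hand side of (1.4)

print(reconstructed, norm_of_gradient)
```

Up to finite-difference error, `reconstructed` reproduces `s` and `norm_of_gradient` is 1.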

But the Euclidean norm does not address the multiplicative structure of an algebra and so does not have an essential role in most algebras. Instead, we shall see that the role of $\nabla f(s)$ on the vector space $\mathbb{R}^n$ is taken up by $\nabla^{*}f(s^{-1})$ on topological *-algebras over $\mathbb{R}^n$, and we will consider (1.3) and (1.4) so modified to represent “Jacobian decomposition.”

3. Jacobian Decomposition and Inverse Norm

The “defining” equations of the Euclidean norm, (1.3) and (1.4), make no reference to the multiplicative structure of an algebra. Nevertheless, the Euclidean norm has application to the Cayley-Dickson algebras (an unending sequence of real unital topological *-algebras beginning with the only four real normed division algebras, $\mathbb{R}$, $\mathbb{C}$, the quaternions, and the octonions). Each of these algebras is characterized by a basis $\{e_0,\ldots,e_m\}$, with $m = 2^k - 1$ for any nonnegative integer $k$. Multiplication is distributive and thus defined by a multiplication table relevant to $\{e_i\}$. In particular, $e_0^2 = 1$ and $e_i^2 = -1$ for $i \neq 0$. Any point $s = \alpha_0 e_0 + \cdots + \alpha_m e_m$ (with each $\alpha_i \in \mathbb{R}$) has a conjugate $s^{*} \equiv \alpha_0 e_0 - \alpha_1 e_1 - \cdots - \alpha_m e_m$, with $*$ evidently an involution. In particular, $ss^{*} = \alpha_0^2 + \cdots + \alpha_m^2 = [f(s)]^2$, where $f(s)$ is the Euclidean norm. Thus, $s^{-1} = s^{*}/[f(s)]^2$. Hence, the Euclidean norm in this case helps express the inverse of a point. Being the Euclidean norm, $f(s)$ satisfies (1.3) and (1.4), and so also defines a Euclidean decomposition of the point. But given the above involution, it is easy to show that $$\nabla f(s) = \nabla^{*}f(s^{-1}) \tag{3.1}$$ (where, as always, $\nabla^{*}f(s^{-1})$ represents evaluation of the gradient $\nabla f$ at the point $s^{-1}$ followed by application of the involution). Substituting the above into (1.3), we obtain $s = f(s)\,\nabla^{*}f(s^{-1})$. Since (1.4) holds for all $s \neq 0$, and all such points have inverses, we must also have $f(\nabla^{*}f(s^{-1})) = 1$. Now we have a formulation for the Euclidean norm that makes reference to unital algebraic structure. On Cayley-Dickson algebras, it is equivalent to the Pythagorean theorem. When this latter decomposition exists on a topological *-algebra but $f(s)$ is not the Euclidean norm, it can be considered to be an algebraic ghost of the Pythagorean theorem.
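The quaternions give a concrete instance on which these relations can be checked. The sketch below is a minimal illustration under stated assumptions: quaternions are stored as 4-tuples, the Hamilton product is written out by hand, and the gradient of the Euclidean norm is taken in its closed form $\nabla f(s) = s/f(s)$ (a standard fact, equivalent to (1.3)). The function names are hypothetical.

```python
import math

def qmul(p, q):
    """Hamilton product of two quaternions given as (a, b, c, d) tuples."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
            a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
            a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
            a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2)

def conj(q):
    """Conjugation: the involution * on the quaternions."""
    return (q[0], -q[1], -q[2], -q[3])

def f(q):
    """Euclidean norm of the coefficient vector."""
    return math.sqrt(sum(x * x for x in q))

def inverse(q):
    """Inverse via conjugate divided by squared norm."""
    n2 = f(q) ** 2
    return tuple(x / n2 for x in conj(q))

def grad_f(q):
    """Closed-form gradient of the Euclidean norm, q / f(q), for q != 0."""
    n = f(q)
    return tuple(x / n for x in q)

s = (1.0, -2.0, 0.5, 3.0)

identity_check = qmul(s, inverse(s))          # expect (1, 0, 0, 0)
lhs = grad_f(s)                               # gradient at s
rhs = conj(grad_f(inverse(s)))                # gradient at s^{-1}, then involution
reconstructed = tuple(f(s) * x for x in rhs)  # f(s) times rhs, expect s
unit_value = f(rhs)                           # expect 1

print(identity_check, lhs, rhs, reconstructed, unit_value)
```

Here `lhs` and `rhs` agree, which is the content of (3.1), and the last two quantities reproduce the decomposition obtained by substituting (3.1) into (1.3) and (1.4).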

Definition 3.1.

For a topological *-algebra $A$ defined on $\mathbb{R}^n$, a continuous function $f:\mathbb{R}^n \to \mathbb{R}$ is an inverse norm if it is zero on the nonunits of $A$, as a function restricted to the domain $\mathbb{R}^n$ it is differentiable on the units of $A$, and for any unit $s$, $$s = f(s)\,\nabla^{*}f(s^{-1}), \tag{3.2}$$ with $$f\big(\nabla^{*}f(s^{-1})\big) = 1. \tag{3.3}$$

The above equations mimic the equations for the Euclidean norm referred to in Corollary 2.6, but instead decompose a point as a function's value at the point multiplied by a unity-scaled orientation point dependent on the function's gradient at the inverse of the point (or, alternatively to the Euclidean norm's direct expression of a point, (3.2) and (3.3) express the inverse of a point—i.e., substituting s-1 for s in the latter equations). Thus, we use the term “inverse norm.”

Of course, from (3.1) we have already shown the following.

Theorem 3.2.

The Euclidean norm is an inverse norm on the Cayley-Dickson algebras.

However, inverse norms have applicability well beyond the Cayley-Dickson algebras.

Theorem 3.3 (Jacobi).

For $s$ a member of the algebra of real matrices $\mathbb{R}^{n\times n}$, let $f(s)\equiv\det(s)$. If $s$ is a unit, then $$s=f(s)\,{*}\nabla f(s^{-1}), \tag{3.4}$$ where $*$ indicates matrix transpose.

The above is a well-known immediate consequence of Jacobi's formula in matrix calculus (the latter expresses the gradient of the determinant in terms of the adjugate matrix).

Corollary 3.4.

For the algebra of real matrices $\mathbb{R}^{n\times n}$, $$f(s)\equiv\sqrt{n}\,(\det(s))^{1/n} \tag{3.5}$$ is an inverse norm.

Proof.

$f(s)$ is evidently continuous everywhere, as well as differentiable on the units (the invertible matrices), and it vanishes on the nonunits. For any unit $s$ of this algebra, we have $\det(s)\neq 0$, and $${*}\nabla f(s)=\left[\frac{\partial f(s)}{\partial s_{ij}}\right]^{*}=\frac{\sqrt{n}}{n}(\det(s))^{-1+1/n}\,(\nabla(\det(s)))^{*}. \tag{3.6}$$ For a unit $s$, (3.4) implies $(\nabla(\det(s)))^{*}=s^{-1}\det(s)$. Substituting this into (3.6), and using (3.5), we obtain $${*}\nabla f(s)=\frac{f(s)}{n}\,s^{-1}=\frac{s^{-1}}{f(s^{-1})}, \tag{3.7}$$ where the second equality uses $f(s)f(s^{-1})=n$, which follows from (3.5). If we evaluate ${*}\nabla f$ on the left-hand side above at the point $s^{-1}$ instead of evaluating it at $s$, we obtain (3.2). Applying $f$ in (3.5) to both sides of (3.7), we obtain (3.3).
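
As a sanity check on Theorem 3.3 and Corollary 3.4, the decompositions can be tested numerically for $n=2$. This is only a sketch under stated assumptions: the gradient is approximated by central differences rather than taken from Jacobi's formula, the sample matrix is chosen with positive determinant so that $(\det(s))^{1/n}$ is real, and all helper names (`det2`, `inv2`, `num_grad`, `star_grad_at_inverse`) are illustrative.

```python
import math

N = 2  # matrix size assumed throughout this sketch

def det2(m):
    (a, b), (c, d) = m
    return a * d - b * c

def inv2(m):
    (a, b), (c, d) = m
    k = det2(m)
    return ((d / k, -b / k), (-c / k, a / k))

def transpose(m):
    return tuple(zip(*m))

def scale(k, m):
    return tuple(tuple(k * x for x in row) for row in m)

def num_grad(func, m, h=1e-6):
    """Central-difference gradient of func with respect to each entry of m."""
    rows = []
    for i in range(N):
        row = []
        for j in range(N):
            plus = [list(r) for r in m]
            minus = [list(r) for r in m]
            plus[i][j] += h
            minus[i][j] -= h
            row.append((func(plus) - func(minus)) / (2 * h))
        rows.append(tuple(row))
    return tuple(rows)

def f(m):
    """Candidate inverse norm (3.5); this sketch assumes det(m) > 0."""
    return math.sqrt(N) * det2(m) ** (1.0 / N)

def star_grad_at_inverse(func, m):
    """*grad func(m^{-1}): gradient at the inverse, then transpose."""
    return transpose(num_grad(func, inv2(m)))

s = ((2.0, 1.0), (0.5, 3.0))                              # det(s) = 5.5 > 0
jacobi = scale(det2(s), star_grad_at_inverse(det2, s))    # right side of (3.4)
u = star_grad_at_inverse(f, s)
recon = scale(f(s), u)                                    # right side of (3.2)
unit_value = f(u)                                         # left side of (3.3)
```

Both `jacobi` and `recon` should reproduce `s`, and `unit_value` should be close to 1, within the accuracy of the finite-difference approximation.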

In analogy with Euclidean decomposition (1.3) and (1.4), we can consider the equations of Definition 3.1 to represent “Jacobian decomposition” (i.e., in view of Theorem 3.3).

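For the algebra $\mathbb{R}^n$ with component-wise addition and multiplication, it is also easy to show that an inverse norm is given by the product of $\sqrt{n}$ with the geometric mean of the absolute values of the components of a point. That is, for a point $s=(s_1,\dots,s_n)$, set $$f(s)\equiv\sqrt{n}\left(\prod_{i=1}^{n}|s_i|\right)^{1/n}. \tag{3.8}$$ Note that $f$ is continuous and $f$ vanishes on the nonunits. If $s$ is a unit, then $$\frac{\partial f(s)}{\partial s_j}=\frac{\sqrt{n}}{n}\left(\prod_{i=1}^{n}|s_i|\right)^{-1+1/n}\operatorname{sgn}(s_j)\prod_{i\neq j}|s_i|=\frac{\sqrt{n}}{n}\left(\prod_{i=1}^{n}|s_i|\right)^{1/n}\frac{1}{s_j}=\frac{f(s)}{n}\left(\frac{1}{s_j}\right). \tag{3.9}$$ It is then a simple task to verify that $f(s)$ satisfies the requirements of Definition 3.1 (with $*$ as the identity).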
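
The "simple task" of verification can be illustrated numerically. The sketch below assumes a sample point in $\mathbb{R}^3$ with no zero components and uses the closed-form gradient (3.9); the function names are illustrative, not taken from the source.

```python
import math

def f(s):
    """(3.8): sqrt(n) times the geometric mean of |s_i|."""
    n = len(s)
    return math.sqrt(n) * math.prod(abs(x) for x in s) ** (1.0 / n)

def grad_f(s):
    """(3.9): df/ds_j = f(s) / (n * s_j), valid when every s_j != 0."""
    n = len(s)
    fs = f(s)
    return tuple(fs / (n * x) for x in s)

def inverse(s):
    """Component-wise inverse of a unit (all components nonzero)."""
    return tuple(1.0 / x for x in s)

s = (2.0, -0.5, 4.0)                 # arbitrary unit of the algebra
u = grad_f(inverse(s))               # involution is the identity here
recon = tuple(f(s) * x for x in u)   # right side of (3.2)
unit_value = f(u)                    # left side of (3.3)
```

Here `recon` should reproduce `s` (including the sign of the negative component), and `unit_value` should equal 1 up to rounding.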

Now we turn to Jordan algebras.

Theorem 3.5.

The Minkowski norm is an inverse norm on the spin factor Jordan algebra.

Proof.

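Thinking of $\mathbb{R}^n$ in the format of $\mathbb{R}\oplus\mathbb{R}^{n-1}$, write its points as $s=w+\mathbf{z}$, with $w\in\mathbb{R}$ and $\mathbf{z}=(z_1,\dots,z_{n-1})\in\mathbb{R}^{n-1}$. We introduce a multiplication operation such that $$(w_a+\mathbf{z}_1)(w_b+\mathbf{z}_2)\equiv(w_aw_b+\mathbf{z}_1\cdot\mathbf{z}_2)+(w_a\mathbf{z}_2+w_b\mathbf{z}_1), \tag{3.10}$$ where “$\cdot$” is the usual inner product on $\mathbb{R}^{n-1}$. This multiplication defines a commutative but nonassociative algebra, the spin factor Jordan algebra. The multiplicative identity element is evidently the point where $w=1$ and $\mathbf{z}=0$. An inverse element exists for points $w+\mathbf{z}$ such that $\mathbf{z}\cdot\mathbf{z}\neq w^2$. That is, $(w+\mathbf{z})(w+\mathbf{z})^{-1}=1$ for $$(w+\mathbf{z})^{-1}=\frac{-w+\mathbf{z}}{-w^2+\mathbf{z}\cdot\mathbf{z}}, \tag{3.11}$$ when $\mathbf{z}\cdot\mathbf{z}\neq w^2$.

Now we define $f:\mathbb{R}^n\to\mathbb{C}$ such that for $s=(w,z_1,\dots,z_{n-1})\in\mathbb{R}^n$, $$f(s)=\sqrt{-w^2+z_1^2+\cdots+z_{n-1}^2}, \tag{3.12}$$ where the above square root represents the principal value. On the domain comprised of the units of the spin factor Jordan algebra (the points $w+\mathbf{z}$, with $w\in\mathbb{R}$ and $\mathbf{z}\in\mathbb{R}^{n-1}$, such that $w^2\neq\mathbf{z}\cdot\mathbf{z}$), we have $$\nabla f(w+\mathbf{z})=\frac{-w+\mathbf{z}}{f(w+\mathbf{z})}=(w+\mathbf{z})^{-1}f(w+\mathbf{z})=\frac{(w+\mathbf{z})^{-1}}{f((w+\mathbf{z})^{-1})}, \tag{3.13}$$ where the three equalities follow from (3.11) and (3.12). Hence, on the units, we have $s^{-1}=f(s^{-1})\nabla f(s)$. Taking the involution $*$ to be the identity, the latter equation is equivalent to the first equation of Definition 3.1.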

Applying $f$ to both sides of the first equality in (3.13) and using (3.12), we obtain $f(\nabla f(s))=1$ for any unit $s$. This is equivalent to the second equation of Definition 3.1.
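
The steps of this proof can be traced numerically. The sketch below is restricted, as an assumption of the example, to a unit with $\mathbf{z}\cdot\mathbf{z}>w^2$, so that the square root in (3.12) is real and ordinary floating-point arithmetic suffices; the function names and the sample point in $\mathbb{R}^4$ are illustrative.

```python
import math

def mul(a, b):
    """Spin factor product (3.10) on points stored as (w, z_1, ..., z_{n-1})."""
    wa, za = a[0], a[1:]
    wb, zb = b[0], b[1:]
    dot = sum(x * y for x, y in zip(za, zb))
    return (wa * wb + dot,) + tuple(wa * y + wb * x for x, y in zip(za, zb))

def inverse(s):
    """(3.11): (-w + z) / (-w^2 + z.z), defined when z.z != w^2."""
    w, z = s[0], s[1:]
    q = -w * w + sum(x * x for x in z)
    return (-w / q,) + tuple(x / q for x in z)

def f(s):
    """Minkowski norm (3.12); this sketch assumes z.z > w^2 (real root)."""
    w, z = s[0], s[1:]
    return math.sqrt(-w * w + sum(x * x for x in z))

def grad_f(s):
    """Gradient of (3.12): (-w, z_1, ..., z_{n-1}) / f(s)."""
    fs = f(s)
    return (-s[0] / fs,) + tuple(x / fs for x in s[1:])

s = (1.0, 2.0, -1.0, 0.5)            # z.z = 5.25 > w^2 = 1
identity = mul(s, inverse(s))        # should be (1, 0, 0, 0)
u = grad_f(inverse(s))               # involution is the identity here
recon = tuple(f(s) * x for x in u)   # right side of (3.2)
unit_value = f(u)                    # left side of (3.3)
```

For this sample point, `identity` should match the multiplicative identity, `recon` should match `s`, and `unit_value` should be 1, up to rounding.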

On the other hand, the Jordan algebra obtained from the algebra of matrices $\mathbb{R}^{n\times n}$ has the product of two of its members $A,B$ given by $A\circ B\equiv(AB+BA)/2$, where $AB$ and $BA$ indicate the usual matrix product. We then have $\mathbf{1}=A\circ A^{-1}=AA^{-1}$, where $\mathbf{1}$ is the identity element in the algebra (in this case, $\mathbf{1}=\operatorname{diag}\{1,1,\dots,1\}$). Therefore, $A^{-1}$ is the usual matrix inverse. Consequently, the associated inverse norms are the same as those for the algebra of matrices $\mathbb{R}^{n\times n}$. Thus, Jacobian decomposition holds for the Jordan algebra obtained from the matrix algebra.
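
The claim that the Jordan inverse coincides with the usual matrix inverse can be illustrated for a $2\times 2$ example. This is a small sketch with illustrative helper names (`matmul`, `jordan`, `inv2`) and an arbitrary invertible sample matrix.

```python
def matmul(a, b):
    """Usual matrix product of square matrices stored as tuples of rows."""
    n = len(a)
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n))
        for i in range(n)
    )

def jordan(a, b):
    """Jordan product (AB + BA) / 2."""
    ab, ba = matmul(a, b), matmul(b, a)
    n = len(a)
    return tuple(
        tuple((ab[i][j] + ba[i][j]) / 2 for j in range(n)) for i in range(n)
    )

def inv2(m):
    """Usual inverse of an invertible 2x2 matrix."""
    (a, b), (c, d) = m
    k = a * d - b * c
    return ((d / k, -b / k), (-c / k, a / k))

A = ((2.0, 1.0), (0.5, 3.0))
J = jordan(A, inv2(A))   # Jordan product of A with its usual inverse
```

The result `J` should be the $2\times 2$ identity matrix up to rounding, consistent with $A\circ A^{-1}=AA^{-1}$.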

Supplying an inverse norm nominally requires solution of a nonlinear partial differential equation (3.2). However, if we apply further restrictions on the nature of $f(s)$, one can obtain a linear equation. In particular, for each algebra example considered so far there is a constant $\alpha$ such that $$f(s)f(s^{-1})=\alpha, \tag{3.14}$$ so that (3.2) evaluated at $s^{-1}$ instead of $s$ implies (1.6) and $$\alpha\,s\,[{*}\nabla f(s)]=f(s)\mathbf{1}. \tag{3.15}$$
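
As an illustration of (3.14) and (3.15), consider again the algebra $\mathbb{R}^n$ with component-wise operations and the inverse norm (3.8), for which the involution is the identity and $\mathbf{1}=(1,\dots,1)$. The following sketch assumes a sample unit in $\mathbb{R}^3$ and uses the closed-form gradient (3.9); variable names are illustrative.

```python
import math

def f(s):
    """(3.8): sqrt(n) times the geometric mean of |s_i|."""
    n = len(s)
    return math.sqrt(n) * math.prod(abs(x) for x in s) ** (1.0 / n)

def grad_f(s):
    """(3.9): df/ds_j = f(s) / (n * s_j), valid when every s_j != 0."""
    n = len(s)
    fs = f(s)
    return tuple(fs / (n * x) for x in s)

s = (2.0, -0.5, 4.0)                       # arbitrary unit
s_inv = tuple(1.0 / x for x in s)

alpha = f(s) * f(s_inv)                    # (3.14); equals n for this algebra
# Left side of (3.15): alpha * s * grad f(s), with the component-wise product.
lhs = tuple(alpha * x * g for x, g in zip(s, grad_f(s)))
# Right side of (3.15): f(s) times the identity element (1, ..., 1).
rhs = tuple(f(s) for _ in s)
```

For this algebra `alpha` should equal $n=3$, and `lhs` should agree with `rhs` component by component.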

However, not all unital algebras have an inverse norm satisfying (3.15). First, since an inverse norm is zero on nonunits, and a nonunital algebra consists only of nonunits, the inverse norm on a nonunital algebra is identically zero. On the other hand, one might ask whether an inverse norm satisfying (3.15) exists on the unital hull of a nonunital topological algebra.

Theorem 3.6.

The unital hull of a nonunital topological algebra does not have an inverse norm satisfying (3.15).

Proof.

The unital hull of a nonunital algebra $A$ is defined by elements $\hat{s}\equiv(\sigma,s)$ for $\sigma\in\mathbb{R}$ and $s=(s_1,\dots,s_n)\in A$, with component-wise addition of elements and multiplication defined by $$(\sigma,s)(\tau,t)\equiv(\sigma\tau,\ \sigma t+\tau s+st), \tag{3.16}$$ where $st$ indicates the product between elements of $A$. The identity element is $\mathbf{1}=(1,0)$. Equation (3.15) requires that the units satisfy $$(\sigma,s)\left(\frac{\partial f(\hat{s})}{\partial\sigma},\ \nabla f(\hat{s})\right)=\frac{f(\hat{s})}{\alpha}\mathbf{1}, \tag{3.17}$$ where $\nabla\equiv(\partial/\partial s_1,\dots,\partial/\partial s_n)$. Using the multiplication rule, we can write this as $$\left(\sigma\frac{\partial f(\hat{s})}{\partial\sigma},\ \sigma\nabla f(\hat{s})+s\frac{\partial f(\hat{s})}{\partial\sigma}+s\nabla f(\hat{s})\right)=\left(\frac{f(\hat{s})}{\alpha},\ 0\right). \tag{3.18}$$ Thus, it is required that $\sigma[\partial f(\hat{s})/\partial\sigma]=f(\hat{s})/\alpha$, so that $f(\hat{s})=\sigma^{1/\alpha}h(s)$, where $h(s)$ indicates some function independent of $\sigma$. In addition, the right-hand side of (3.18) requires that the second component of the left-hand side of (3.18) be zero. But with regard to variation in $\sigma$, the requirement that $f(\hat{s})=\sigma^{1/\alpha}h(s)$ means that the first term of this second component is $O(\sigma^{1+1/\alpha})$, the second term is $O(\sigma^{-1+1/\alpha})$, and the third term is $O(\sigma^{1/\alpha})$. It is thus impossible for this second component to remain zero as $\sigma$ varies unless $f$ is identically zero. But in that case, the equations of Definition 3.1 cannot be satisfied for the units. Thus, an inverse norm satisfying (3.15) does not exist.

One can get even more restrictive and consider algebras for which not only is $f(s)f(s^{-1})$ constant on the units (i.e., (3.14) is satisfied) but in addition the units constitute a group on which a multiple of $f(s)$ is a homomorphism. In fact, the inverse norm on the first four Cayley-Dickson algebras satisfies this prescription (the latter are the only real normed division algebras by Hurwitz's theorem, and the Euclidean norm is a homomorphism on their units). The inverse norm $f(s)=\sqrt{n}\,(\det(s))^{1/n}$ on the matrix algebra $\mathbb{R}^{n\times n}$ also satisfies this requirement (i.e., $f(s)/\sqrt{n}$ is a homomorphism on the group of units). Along these lines, we observe that the Cayley-Dickson algebras and the set of subalgebras of the real matrix algebra $\mathbb{R}^{n\times n}$ overlap on the algebras of the real numbers, the complex numbers, and the quaternions, which happen to be the only associative real normed division algebras. The Cayley-Dickson algebra sequence contains the only other real normed division algebra, namely the (noncommutative and nonassociative) algebra of octonions. Thus, the octonions provide an example of an algebra with an inverse norm that is a homomorphism on the group of units, but without a representation as a matrix subalgebra. This point of nonoverlap is typical of the exceptionalism of the octonions.
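
The homomorphism property of $f(s)/\sqrt{n}$ on the matrix algebra can be illustrated for $n=2$. The sketch below assumes two sample matrices with positive determinant (so that the $n$th root is real); the names `det2`, `matmul`, and `g` are illustrative.

```python
N = 2  # matrix size assumed in this sketch

def det2(m):
    (a, b), (c, d) = m
    return a * d - b * c

def matmul(a, b):
    """Usual matrix product of N x N matrices stored as tuples of rows."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(N)) for j in range(N))
        for i in range(N)
    )

def g(m):
    """f(m) / sqrt(n) = det(m)^(1/n); this sketch assumes det(m) > 0."""
    return det2(m) ** (1.0 / N)

s = ((2.0, 1.0), (0.5, 3.0))     # det = 5.5
t = ((1.0, -1.0), (2.0, 4.0))    # det = 6.0

lhs = g(matmul(s, t))            # g applied to the product
rhs = g(s) * g(t)                # product of the values of g
```

Since the determinant is multiplicative, `lhs` and `rhs` should agree up to rounding.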

Hilbert D. The Foundations of Geometry. LaSalle, Ill, USA: The Open Court; 1950.

Giles J. R. Classes of semi-inner product spaces. Transactions of the American Mathematical Society. 1967;129(3):436-446.

Horváth A. G. Semi-indefinite inner product and generalized Minkowski spaces. Journal of Geometry and Physics. 2010;60(9):1190-1208. doi:10.1016/j.geomphys.2010.04.006.

Horváth A. G. Premanifolds. Note di Matematica. 2011;31(2):17-51.

McCrimmon K. A Taste of Jordan Algebras. New York, NY, USA: Springer; 2004.

Baez J. C. The octonions. Bulletin of the American Mathematical Society. 2002;39(2):145-205. doi:10.1090/S0273-0979-01-00934-X.