Nonlinear Alignment and Its Local Linear Iterative Solution

In manifold learning, the aim of alignment is to derive the global coordinate of a manifold from the local coordinates of the manifold's patches. At present, most manifold learning algorithms assume that the relation between the global and local coordinates is locally linear and, based on this linear relation, align the local coordinates of the manifold's patches into the global coordinate of the manifold. There are two contributions in this paper. First, the nonlinear relation between the manifold's global and local coordinates is deduced by making use of the differentiation of local pullback functions defined on the differential manifold. Second, the method of local linear iterative alignment is used to align the manifold's local coordinates into the manifold's global coordinate. The experimental results presented in this paper show that the errors of noniterative alignment are considerable and can be reduced to almost zero within the first two iterations. The large errors of noniterative/linear alignment verify the nonlinear nature of alignment and justify the necessity of iterative alignment.


Introduction
Several papers published in Science in 2000 started the research on manifold learning [1, 2]. Since then, manifold learning has made great progress and produced many representative algorithms, as well as many improvements around them, such as ISOMAP [2], Locally Linear Embedding (LLE) [1, 3], Hessian LLE (HLLE) [4], Local Tangent Space Alignment (LTSA) [5], Laplacian Eigenmaps (LE) [6], Diffusion Maps [7], and Maximum Variance Unfolding (MVU) [8]. As a branch of machine learning, manifold learning has achieved good performance in many applications.
It may be worth noting that although many machine learning algorithms claim to be manifold learning algorithms, they seem to have nothing to do with the topological manifolds defined in mathematics, not to mention differential manifolds. The manifold learning algorithm proposed in this paper is based on the mathematical characteristics of topological/differential manifolds. This kind of algorithm can be divided into two stages: local homeomorphism and alignment. The mathematical foundation of local homeomorphism is the definition of topological manifolds, while the mathematical foundation of alignment rests on the characteristics of differential manifolds. In local homeomorphism, a manifold is divided into a finite number of overlapping local regions, each homeomorphic to an open set of Euclidean space. The local regions are called patches of the manifold, and the open sets homeomorphic to the patches are called local coordinates of the patches, or local coordinates of the manifold directly. In alignment, the local coordinates are aligned in Euclidean space to form an area corresponding to the manifold; this area is called the global coordinate of the manifold. This paper focuses only on alignment; that is, we assume that the local coordinates have already been obtained during local homeomorphism, and under this assumption we study only how to derive the global coordinate from the local coordinates.
The remaining sections are arranged as follows. In Section 2, some related works are reviewed briefly. In Section 3, the mathematical foundations of manifold learning are laid. In Section 4, the local nonlinear relation between the global and local coordinates is deduced mathematically. In Section 5, the local linear iterative alignment (LLIA) solution to nonlinear alignment is proposed. In Section 6, the experimental results are presented. In Section 7, some conclusions are given.

Related Works
There are three kinds of alignment in manifold learning: local coordinate alignment, patch alignment, and manifold alignment. The alignment in the algorithm proposed in this paper belongs to local coordinate alignment.
The local coordinate alignment is based on the mathematical definition of manifold. According to this definition, a manifold can be divided into a number of overlapping patches, each homeomorphic to an open set of Euclidean space. These open sets are called local coordinates of the manifold. Local coordinate alignment aligns the local coordinates in Euclidean space to form a larger open set which corresponds to the manifold globally. This larger open set is called the global coordinate of the manifold. The local coordinate alignment can be done by aligning local coordinates one by one in the light of geometrical intuition [9-11] or by turning alignment into an eigenvalue problem with differential geometry [5, 12-15].
The patch alignment [16] is more algebraic than geometrical. Before alignment, the given data $X = \{x_1, \ldots, x_N\}$ have to be divided into overlapping patches $X_i = \{x_{i1}, \ldots, x_{ik}\} \subseteq X$, $i = 1, \ldots, N$. The patch alignment is then formulated as the minimization of $\sum_{i=1}^{N} \mathrm{tr}(Y_i L_i Y_i^{T})$, where $Y_i$ is the low-dimensional representation of the patch $X_i$ and $L_i$ is called the local pattern of $X_i$, $i = 1, \ldots, N$. How to calculate $L_i$ from $X_i$ depends on the particular algorithm; in the LE algorithm [6], for example, $L_i$ is built from the heat-kernel weights between $x_i$ and its neighbours. In supervised or semisupervised learning, besides the inherent attributes of the data, there are discriminative labels assigned to the data. Many recent researches in patch alignment try to incorporate the discriminative labels into the calculation of $L_i$. For example, in the MPA algorithm [20], the neighbours of $x_i$ are first divided into two groups: those sharing the label of $x_i$ and those with different labels. More schemes in this respect can be found in [20, 21].
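The whole-alignment trick behind this framework can be sketched in a few lines of numpy: each per-patch local pattern $L_i$ is summed into a single global alignment matrix through its selection indices, and the global coordinates are read off the bottom eigenvectors. This is a minimal sketch under simplifying assumptions, not the implementation of any cited algorithm; `accumulate_alignment` and `embed` are illustrative names, and the centring matrices used as local patterns below are a stand-in for algorithm-specific choices.

```python
import numpy as np

def accumulate_alignment(N, patches, local_patterns):
    """Sum per-patch local patterns L_i into one global alignment
    matrix L = sum_i S_i L_i S_i^T, where S_i selects the columns
    belonging to patch i (given here as an index list)."""
    L = np.zeros((N, N))
    for idx, Li in zip(patches, local_patterns):
        L[np.ix_(idx, idx)] += Li
    return L

def embed(L, d):
    """Global coordinates: the eigenvectors of the d smallest
    nonzero eigenvalues of L (the constant eigenvector is skipped)."""
    w, V = np.linalg.eigh(L)
    return V[:, 1:d + 1].T  # shape (d, N)
```

With centring matrices as local patterns, the accumulated matrix is symmetric, has zero row sums, and its bottom nonconstant eigenvectors give the embedding.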
The manifold alignment involves a number of high-dimensional datasets taken from different manifolds. These datasets are all reduced to a low-dimensional Euclidean space and aligned according to certain rules, typically by requiring that the matching points of different datasets coincide after reduction. How to determine the matching points and how to calculate the similarity between them are application-dependent. More complicated manifold alignment algorithms can be found in [22-27].

Mathematical Foundations of Manifold Learning
Manifold learning can be divided into two stages: local homeomorphism and alignment. In this section we elaborate the mathematical foundations of both stages. It is clear that $\bar{\varphi}_x$ is also a homeomorphic mapping between $U_x$ and another open set $\bar{\Theta}_x$. An important feature of $\bar{\varphi}_x$, which will facilitate the mathematical deduction of nonlinear alignment, is that $\bar{\varphi}_x(x) = 0$. Hereafter, all the local homeomorphic mappings used in this paper are assumed to have this feature.

Mathematical Foundations of Local Homeomorphism.
Furthermore, the set $\{U_x \mid x \in M\}$ is an open cover of $M$; that is, $M = \bigcup_{x \in M} U_x$. If $M$ is compact, then there must be a finite subset of $\{U_x \mid x \in M\}$ which is also an open cover of $M$ [28]. More specifically, if $M$ is a $d$-dimensional compact manifold, then there must be a finite number of points $x_1, \ldots, x_N$ and corresponding neighbourhoods $U_1, \ldots, U_N$ such that (1) $U_i$ is homeomorphic to an open set $\Theta_i$ of $d$-dimensional Euclidean space $R^d$, $i = 1, \ldots, N$, and (2) $M = U_1 \cup \cdots \cup U_N$. In practice, the given data $x_1, \ldots, x_N$ are always assumed to be taken from a compact manifold $M$ and to have neighbourhoods $U_1, \ldots, U_N$ which are locally homeomorphic to open sets $\Theta_1, \ldots, \Theta_N$ of Euclidean space. The aim of manifold learning during the stage of local homeomorphism is to find these neighbourhoods $U_1, \ldots, U_N$ and then derive their local coordinates $\Theta_1, \ldots, \Theta_N$. The existence of the neighbourhoods and their local coordinates is guaranteed by the mathematical definition of topological manifold. At present, the most commonly used method of finding the neighbourhoods is the K Nearest Neighbours (KNN) method, and the most commonly used method of deriving the local coordinates of the neighbourhoods is the so-called tangent space method, that is, the local PCA method [5, 29].
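The KNN-plus-local-PCA recipe described above can be sketched as follows. This is a minimal sketch assuming the data matrix stores points as columns, as in the paper's notation; `local_coordinates` is an illustrative name, not a function from any cited implementation.

```python
import numpy as np

def local_coordinates(X, i, k, d):
    """Find the K-nearest-neighbour patch of x_i and derive its
    local coordinates by local PCA (the tangent-space method):
    project the centred neighbours onto the top-d left singular
    vectors of the centred patch."""
    # X: (D, N) data matrix, columns are points
    diffs = X - X[:, [i]]
    idx = np.argsort((diffs ** 2).sum(axis=0))[:k]  # x_i itself included
    Xi = X[:, idx]
    centred = Xi - Xi.mean(axis=1, keepdims=True)
    U, s, Vt = np.linalg.svd(centred, full_matrices=False)
    Theta = U[:, :d].T @ centred  # (d, k) local coordinates
    return idx, Theta
```

For data lying exactly on a $d$-dimensional affine subspace, this local chart is an isometry, so within-patch distances are preserved exactly.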

Mathematical Foundations of Alignment.
Let $M$ be a topological manifold and $\{(U_x, \varphi_x) \mid x \in M\}$ an atlas of $M$. If, for any two charts $(U_\alpha, \varphi_\alpha)$ and $(U_\beta, \varphi_\beta)$ such that $U_\alpha \cap U_\beta \neq \emptyset$, the transition mapping $\varphi_\beta \circ \varphi_\alpha^{-1} : \varphi_\alpha(U_\alpha \cap U_\beta) \to \varphi_\beta(U_\alpha \cap U_\beta)$ is $C^\infty$ differentiable, then $M$ is called a $C^\infty$ differential manifold (see Figure 2). In the rest of this paper the manifolds are all assumed to be $C^\infty$ differential manifolds.
Let $f : M \to R$ be a function defined on the differential manifold $M$. Generally speaking, $f$ cannot be differentiated directly on $M$. In order to define the differentiability of $f$ on $M$, we first have to define its local pullback function. Let $x \in M$. According to the definition of manifold, there must be a neighbourhood $U_x \subseteq M$ of $x$ such that $U_x$ is homeomorphic to an open set $\Theta_x$ of Euclidean space $R^d$, where $d$ is the dimension of $M$. Let $\varphi_x : U_x \to \Theta_x$ be the homeomorphic mapping between $U_x$ and $\Theta_x$; the so-called local pullback function of $f$ is then defined as $f_x = f \circ \varphi_x^{-1} : \Theta_x \to R$ (see Figure 3).
Note that, in the theory of topological spaces, being homeomorphic means being topologically identical. Also note that a manifold is a kind of topological space. Therefore, when restricted to the local regions $U_x$ and $\Theta_x$, $f$ can be regarded as the same as its local pullback function $f_x$, and $f_x$ is completely determined by $f$. This is the reason why the differentiability of $f$ within the local region $U_x$ is defined by the differentiability of its local pullback function $f_x = f \circ \varphi_x^{-1}$ within the local region $\Theta_x$. More specifically, by making use of $\varphi_x(x) = 0$, that is, $x = \varphi_x^{-1}(0)$, the gradient vector and Hessian matrix of $f$ at $x$ can be expressed as $\nabla f(x) = \nabla f_x(0)$ and $H_f(x) = H_{f_x}(0)$. If $f$ is $C^\infty$ differentiable, $f$ can even be Taylor-expanded in the neighbourhood of $x$: for all $x_j \in U_x$,

$$f(x_j) = f_x(\theta_j) = f_x(0) + \nabla f_x(0)^T \theta_j + \frac{1}{2}\,\theta_j^T H_{f_x}(0)\,\theta_j + o(\|\theta_j\|^2), \qquad (6)$$

where $\theta_j = \varphi_x(x_j) \in \Theta_x$. The differentiation of local pullback functions lays the mathematical foundation of alignment.
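The Taylor expansion (6) of a pullback function can be checked numerically on the simplest possible example: the unit circle with an arc-length chart centred at $x = (1, 0)$ and $f(p) = p_1$, whose pullback is $f_x(\theta) = \cos\theta$. The setup below is my illustrative choice, not an example from the paper.

```python
import numpy as np

# Pullback of f(p) = p_1 on the unit circle through the chart
# phi_x(p) = arc length from x = (1, 0), so f_x(theta) = cos(theta)
# and phi_x(x) = 0 as required.
f_x = np.cos
grad = 0.0   # f_x'(0)  = -sin(0)
hess = -1.0  # f_x''(0) = -cos(0)

def taylor2(theta):
    """Second-order Taylor expansion (6) of f_x around theta = 0."""
    return f_x(0.0) + grad * theta + 0.5 * hess * theta ** 2

theta = 0.1
err = abs(f_x(theta) - taylor2(theta))  # should be O(theta^4) here
```

The second-order error at $\theta = 0.1$ is about $\theta^4/24 \approx 4 \times 10^{-6}$, far smaller than the zeroth-order error, which illustrates why the expansion is accurate inside a small patch.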

Nonlinear Alignment
In manifold learning, it is always assumed that the given data $x_1, \ldots, x_N \in R^D$ are taken from a $d$-dimensional differential manifold $M$ with $d \ll D$, and that the aim is to find the global coordinates $y_i = f(x_i) \in R^d$, $i = 1, \ldots, N$. Obviously, the component functions of $f$ can be regarded as functions defined on the manifold $M$ and can be locally pulled back to the $d$-dimensional Euclidean space $R^d$ (see Figure 4). According to (6), deduced in Section 3.2, the component functions can be Taylor-expanded in the neighbourhood $U_x$ of $x$: for all $x_j \in U_x$,

$$f_k(x_j) = f_k(x) + \nabla g_k(0)^T \theta_j + \frac{1}{2}\,\theta_j^T H_{g_k}(0)\,\theta_j + o(\|\theta_j\|^2), \quad k = 1, \ldots, d, \qquad (7)$$

where $g_k = f_k \circ \varphi_x^{-1}$ is the local pullback of the $k$th component function $f_k$ of $f$ and $\theta_j = \varphi_x(x_j)$. Stacking the $d$ component expansions (7) gives the nonlinear relation (8) between the global coordinate $f(x_j)$ and the local coordinate $\theta_j$.

Figure 3: The local pullback of functions defined on manifolds; $f$ is defined on $U_x$ and $f \circ \varphi_x^{-1}$ is defined on $\Theta_x$. Since $U_x$ and $\Theta_x$ are homeomorphic, the differentiation of $f$ on $U_x$ can be defined by using the differentiation of its pullback function $f \circ \varphi_x^{-1}$ on $\Theta_x$.

The Local Linear Iterative Solution to Nonlinear Alignment
Applying the Taylor expansion (8) to each patch gives the nonlinear relation of alignment

$$y_{jl} = F_j(\theta_{j,l}), \quad l = 1, \ldots, k, \qquad (9)$$

where $F_j = f \circ \varphi_j^{-1}$ is the local pullback function of $f$ from $U_j$ to $\Theta_j$, $j = 1, \ldots, N$.
Equation (9) is a nonlinear equation. From the perspective of computational mathematics, a nonlinear equation can be solved by linear iteration. Therefore, (9) first has to be linearized:

$$y_{jl} \approx c_j + A_j\,\theta_{j,l}, \quad l = 1, \ldots, k, \qquad (10)$$

where $c_j + A_j\theta$ is the linear part of $F_j(\theta)$ and $c_j = (1/k)\sum_{l=1}^{k} y_{jl}$. The error incurred by the linearization is reduced again and again during the linear iteration.
Equation (10) can be rewritten in matrix form:

$$Y_j \approx c_j \mathbf{1}_k^T + A_j \Theta_j, \qquad (11)$$

where $Y_j = [y_{j1} \cdots y_{jk}]$, $\Theta_j = [\theta_{j,1} \cdots \theta_{j,k}]$, $\mathbf{1}_k \in R^k$ is the all-ones vector, and $c_j \in R^d$. In geometry, (11) means that $Y_j$, the global coordinate of the patch $U_j$, can be locally approximated by a translation, rotation, and scaling of $\Theta_j$, the local coordinate of the patch $U_j$.
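The local linear model (11) can be fitted by ordinary least squares: centring both coordinate sets and multiplying by a pseudoinverse recovers the translation $c_j$ and the linear map $A_j$. A minimal numpy sketch, with an illustrative function name:

```python
import numpy as np

def fit_affine(Yj, Theta_j):
    """Least-squares fit of Y_j ≈ c 1^T + A Theta_j (cf. (11)):
    centre both coordinate sets, solve for A via the Moore-Penrose
    pseudoinverse, then recover the translation c."""
    cy = Yj.mean(axis=1, keepdims=True)
    ct = Theta_j.mean(axis=1, keepdims=True)
    A = (Yj - cy) @ np.linalg.pinv(Theta_j - ct)
    c = cy - A @ ct
    return c, A
```

When the data are generated exactly by an affine map and the centred local coordinates have full row rank, this fit recovers the map exactly.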

Local Linear Iterative Alignment (LLIA) Algorithm.
In the proposed LLIA algorithm, the global coordinate $Y$ is approximated iteratively. The initial iterate $Y^{(0)}$ is set locally to the local coordinates, $Y_j^{(0)} = \Theta_j$, while each subsequent iterate $Y^{(t+1)}$ is derived from the previous iterate $Y^{(t)}$ based on the local linear relation of alignment (11), where $\hat{Y}_j^{(t)}$ denotes the centred global coordinates of the patch $U_j$ at iteration $t$ and $(\hat{Y}_j^{(t)})^+$, the Moore-Penrose pseudoinverse, is the solution to the corresponding least-squares problem. The framework of iterative alignment is as follows: (1) initialize $Y_j^{(0)} = \Theta_j$ for every patch; (2) for each patch $U_j$, fit the local linear model (11) between $\Theta_j$ and the current iterate $Y_j^{(t)}$ and update the global coordinates accordingly; (3) repeat step (2) until the average alignment error stops decreasing.
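One plausible realization of this iteration, sketched in numpy under simplifying assumptions: patches are given as index lists with precomputed local coordinates, and each update averages the per-patch affine predictions at every point. All names are illustrative, and the update rule is a reasonable reading of the framework rather than the paper's exact formula.

```python
import numpy as np

def fit_affine(Yj, Th):
    """Least-squares affine fit Yj ≈ c 1^T + A Th, cf. (11)."""
    cy = Yj.mean(axis=1, keepdims=True)
    ct = Th.mean(axis=1, keepdims=True)
    A = (Yj - cy) @ np.linalg.pinv(Th - ct)
    return cy, ct, A

def llia(patches, Thetas, N, d, iters=5):
    """Sketch of local linear iterative alignment (LLIA): seed the
    global iterate with each patch's local coordinates, then
    repeatedly re-fit the local linear model on every patch and
    average the per-patch predictions at each point."""
    Y = np.zeros((d, N))
    for idx, Th in zip(patches, Thetas):  # Y_j^(0) = Theta_j
        Y[:, idx] = Th
    errs = []
    for _ in range(iters):
        acc = np.zeros((d, N))
        cnt = np.zeros(N)
        err = 0.0
        for idx, Th in zip(patches, Thetas):
            Yj = Y[:, idx]
            cy, ct, A = fit_affine(Yj, Th)
            pred = cy + A @ (Th - ct)  # best affine image of Theta_j
            err += np.linalg.norm(Yj - pred)
            acc[:, idx] += pred
            cnt[idx] += 1
        Y = acc / cnt                  # average over overlapping patches
        errs.append(err)
    return Y, errs
```

When the patches' local coordinates are already globally consistent, the exact alignment is a fixed point of this iteration and the alignment error stays at zero.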

Experimental Results
Here the experimental results of 8 manifold learning algorithms on toy and real-world data are presented. The 8 manifold learning algorithms are the proposed local linear iterative alignment (LLIA) algorithm, LLE, HLLE, LTSA, Diffusion Map, MVU, ISOMAP, and LE. The last 7 manifold learning algorithms are implemented by using MANI, a widely used platform for manifold learning which can be downloaded from the Internet [30].

Toy Data.
The toy data used in the experiments are Trefoil, Punctured Sphere, and Toroidal Helix, all produced by using MANI. Figures 5, 6, and 7 show the experimental results of the 8 manifold learning algorithms on the three toy datasets. At present there are no widely accepted quantitative evaluation criteria in manifold learning [31]; the evaluation of manifold learning algorithms is commonly based on visual perception. At the top of Figure 5, an image of Trefoil and some coloured data taken from Trefoil are presented. Since Trefoil is constructed by twisting a planar coil into a three-dimensional body, a good manifold learning algorithm should be able to recover the planar coil from the given coloured data and keep the order of colours unchanged. It can be seen from Figure 5 that the proposed LLIA algorithm does a good job in this respect.
The so-called Punctured Sphere is a spherical surface whose top has been cut off. The colours of the given data shown in Figure 6 indicate the positional relationship among the data points. The aim of manifold learning here is to produce an overhead view of the Punctured Sphere on a plane. It can be seen from Figure 6 that the proposed algorithm achieves much better results than the other manifold learning algorithms.
Toroidal Helix is shown at the top of Figure 7. Although Toroidal Helix is much more complicated than Trefoil, it is also constructed by twisting a planar coil into a three-dimensional body. The aim of manifold learning is again to recover the planar coil from the given coloured data. In doing so, the order of the data colours must be kept unchanged, because this order indicates the positional relationship of the data, which should not be damaged during dimension reduction. It can be seen from Figure 7 that the performance of the proposed LLIA algorithm is among the best of the 8 manifold learning algorithms. Table 1 shows the average alignment errors of the proposed LLIA algorithm during the iterations. It can be seen from Table 1 that the initial alignment errors are considerably large and are reduced sharply to almost zero during the first 2 iterations. The fact that the initial alignment errors are considerably large indirectly verifies the nonlinearity of alignment and justifies the necessity of iterative alignment. Furthermore, the fact that the errors are reduced sharply to almost zero during the first 2 iterations means that the time complexity of the proposed LLIA algorithm is quite low.

Real-World Data.

Figure 8 shows the experimental results of the 8 manifold learning algorithms on the Frey Face dataset. Frey Face is often used to test the effects of manifold learning algorithms in the academic literature [1, 3, 32, 33]. The photos in Frey Face are all taken from the same person with different head poses and facial expressions. Therefore, although the dimension of the photos is 560 (20 × 28 pixels), their intrinsic dimension is two: head pose and facial expression. In Figure 8, the photos are dimensionally reduced from 560 to 2 by using the 8 manifold learning algorithms, including the proposed LLIA algorithm. In the images of Figure 8, the facial expression changes along the vertical direction, while the head pose changes along the horizontal direction.
It can be seen that the proposed LLIA algorithm achieves the best visual effect among the 8 manifold learning algorithms.
Figure 9 shows the experimental results on the plaster dataset. Plaster is also a dataset often used to test the effects of manifold learning algorithms in the academic literature [2, 33]. The photos of plaster are all taken of the same plaster statue from different shooting angles and under different illumination. Therefore, although the dimension of the photos is 4096 (64 × 64 pixels), their intrinsic dimension is two: shooting angle and illumination. In Figure 9, the photos are dimensionally reduced from 4096 to 2 by the manifold learning algorithms. In the images of Figure 9, the shooting angle changes along the vertical direction, while the shooting illumination changes along the horizontal direction. Again, it can be seen that the proposed LLIA algorithm achieves the best visual effect among the 8 manifold learning algorithms.
As shown in Figures 5, 6, 7, 8, and 9, the effects of manifold learning algorithms are evaluated qualitatively, not quantitatively. At present, there are no widely accepted quantitative evaluation criteria in manifold learning [31]. However, manifold learning algorithms can be evaluated quantitatively in an indirect way by applying them to practical applications. In Table 2, the manifold learning algorithms are applied to a clustering task. USPS is an image dataset of handwritten numerals and has been used in much of the academic literature on manifold learning [3, 34]. In Table 2, the images are first dimensionally reduced by the manifold learning algorithms and then clustered by the K-means algorithm. The clustering precisions shown in Table 2 are then used to evaluate the effects of the manifold learning algorithms indirectly. It can be seen from Figure 11 that the proposed LLIA algorithm is superior to LLE, HLLE, LTSA, and Diffusion Map but inferior to ISOMAP and LE. Note that both ISOMAP and LE belong to the so-called global preserving algorithms, while the proposed LLIA, LLE, HLLE, LTSA, and Diffusion Map belong to the so-called local preserving algorithms. Table 2 shows that, in the clustering of handwritten numerals, although the local preserving algorithms are inferior to the global preserving algorithms, the proposed LLIA achieves the best result among the local preserving algorithms. There are two Matlab sentences in MANI which can be used to add noise to the manifold data: noiseSigma = handles.noise; handles.X = handles.X + noiseSigma * randn(N,3).
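The indirect quantitative evaluation described above (embed, cluster with K-means, score the clusters against the true digit labels) can be sketched with a small self-contained k-means. `clustering_precision` is an illustrative name for a majority-vote score, not a function from the paper, and the farthest-point initialisation is a simplifying choice to keep the sketch deterministic.

```python
import numpy as np

def kmeans(X, k, iters=50):
    """Plain k-means on the rows of X, with deterministic
    farthest-point initialisation."""
    centres = [X[0]]
    for _ in range(k - 1):
        d2 = np.min([((X - c) ** 2).sum(axis=1) for c in centres], axis=0)
        centres.append(X[np.argmax(d2)])
    centres = np.array(centres, dtype=float)
    labels = np.zeros(len(X), dtype=int)
    for _ in range(iters):
        labels = np.argmin(((X[:, None, :] - centres[None, :, :]) ** 2).sum(-1), axis=1)
        for j in range(k):
            if np.any(labels == j):
                centres[j] = X[labels == j].mean(axis=0)
    return labels

def clustering_precision(pred, truth):
    """Majority-vote precision: each cluster is credited with its
    most frequent ground-truth label."""
    correct = sum(np.bincount(truth[pred == c]).max() for c in np.unique(pred))
    return correct / len(truth)
```

On a 2D embedding with well-separated classes the score is 1.0; degraded embeddings mix the classes and lower it, which is what makes it a usable indirect criterion.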

Robustness Experiments.

In the Matlab sentences quoted above, the variable noiseSigma controls the noise strength. In Figures 12 and 13, the noise strength is varied from 0 to 18%.
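The MANI noise injection translates directly to numpy; the sweep below mirrors the 0-18% range used in Figures 12 and 13. The array size and seeds are illustrative, not values from the experiments.

```python
import numpy as np

def add_noise(X, noise_sigma, rng):
    """Python analogue of MANI's
    handles.X = handles.X + noiseSigma * randn(N, 3)."""
    return X + noise_sigma * rng.standard_normal(X.shape)

# A stand-in for 3D manifold data; in the experiments X would be
# the Toroidal Helix or Punctured Sphere samples from MANI.
base = np.random.default_rng(0).standard_normal((100, 3))
levels = [0.0, 0.06, 0.12, 0.18]  # 0 to 18%, as in Figures 12 and 13
noisy = [add_noise(base, s, np.random.default_rng(7)) for s in levels]
```

Each noisy copy would then be fed to the manifold learning algorithms to trace how the embedding quality degrades with noiseSigma.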
It can be seen in Figure 12 that, on Toroidal Helix, the proposed LLIA algorithm is quite robust against noise, while the other manifold learning algorithms deteriorate quickly as the noise is steadily strengthened. On the other hand, on Punctured Sphere, ISOMAP, Diffusion Map, and the proposed LLIA perform stably as the noise changes. In any case, it can be concluded from Figures 12 and 13 that the proposed LLIA algorithm is robust against noise on different manifolds and at different noise strengths.

Conclusions
(1) At present, many machine learning algorithms, including the famous LLE algorithm [1, 35], claim to be manifold learning algorithms. However, the manifolds mentioned in these algorithms seem to have nothing to do with the manifolds defined in mathematics. In contrast, the LLIA algorithm proposed in this paper is constructed on the mathematical characteristics (local homeomorphism, local pullback, etc.) of differential manifolds.
(2) Manifold learning can be divided into two stages: local homeomorphism and alignment. In local homeomorphism, a Taylor expansion between a patch and its local coordinate can be deduced by using a one-parameter transformation group acting on the manifold [5, 36]. In this paper, a Taylor expansion between the global and local coordinates of a patch is also deduced, by using the local pullback of functions defined on the differential manifold. It should be pointed out that these two Taylor expansions are completely different: the former reveals the relation between the high-dimensional data and the low-dimensional data, while the latter reveals the relation between two kinds of low-dimensional data.
(3) In the proposed LLIA algorithm, the nonlinear problem is solved by using a linear iteration method.
There are two issues to be addressed here: the convergence of the iteration and the time complexity of the algorithm. In computational mathematics, linear iterative methods are often used to solve nonlinear problems, and the convergence of such iterations has been thoroughly discussed [37]. In practice, the experimental results presented in this paper show that the alignment errors would be considerably large without iteration and are reduced sharply during the first two iterations. This fact also implies that the time complexity of the proposed LLIA algorithm is quite low, because only a few iterations are needed.
(4) At present, the algorithms of local coordinate alignment can be divided into two categories: one-time alignment and gradual alignment. In one-time alignment, the manifold's global coordinate is obtained at one time by solving an eigenvalue problem [3, 5, 12-14], while in gradual alignment, the manifold's global coordinate grows gradually from a patch's local coordinate [9-11]. Both are based on the local linear model of alignment. The proposed LLIA algorithm is based on the nonlinear model of alignment and therefore belongs to neither one-time alignment nor gradual alignment.

Figure 7: The experimental results on Toroidal Helix.

Figure 8: The experimental results on Frey Face.
Figure 9: The experimental results on the plaster dataset.
Figure 1: Definition of topological manifolds; $U_x$ is the neighbourhood of $x$, $\Theta_x$ is an open set in $R^d$, and $\varphi_x$ is a homeomorphic mapping between $U_x$ and $\Theta_x$.

Local Homeomorphism. Let $M$ be a Hausdorff topological space; if, for every $x \in M$, there is a neighbourhood $U_x \subseteq M$ of $x$ such that $U_x$ is homeomorphic to an open set $\Theta_x$ of $d$-dimensional Euclidean space $R^d$, then $M$ is called a $d$-dimensional topological manifold (see Figure 1). Now let $\varphi_x : U_x \to \Theta_x$ be the homeomorphic mapping between $U_x$ and $\Theta_x$; $(U_x, \varphi_x)$ is called a chart of $M$ and $\{(U_x, \varphi_x) \mid x \in M\}$ is called an atlas of $M$. In manifold learning, the neighbourhood $U_x$ is called a patch of $M$ and the open set $\Theta_x = \varphi_x(U_x)$ is called the local coordinate of $U_x$. Furthermore, let $\bar{\Theta}_x = \Theta_x - \varphi_x(x)$ and $\bar{\varphi}_x : U_x \to \bar{\Theta}_x$ be such that, for all $x_j \in U_x$, $\bar{\varphi}_x(x_j) = \varphi_x(x_j) - \varphi_x(x)$.

In Section 4, the column vectors of $X = [x_1 \cdots x_N] \in R^{D \times N}$ are assumed to be taken from a $d$-dimensional, compact, $C^\infty$ differential manifold $M \subseteq R^D$ which is embedded into the $D$-dimensional Euclidean space $R^D$, where $d \ll D$; the manifold learning algorithms want to find a matrix $Y = [y_1 \cdots y_N] \in R^{d \times N}$ such that $y_i = f(x_i)$ is the global coordinate of $x_i$, $i = 1, \ldots, N$.

Figure 4: The local pullback of the component functions of the global mapping; $f_k$ is the $k$th component function of $f$ and $f_k \circ \varphi_x^{-1}$ is the pullback function of $f_k$ defined on $\Theta_x$. Generally speaking, $f$ is nonlinear, and so is $f_k \circ \varphi_x^{-1}$. If $f_k \circ \varphi_x^{-1}$ is differentiable, then it can be Taylor-expanded in $\Theta_x$.

Linearization of Nonlinear Alignment. We first locally linearize the nonlinear relation of alignment deduced in Section 4. For each data point $x_j$, let $X_j = [x_{j1} \cdots x_{jk}] = X S_j$ be the neighbourhood of $x_j$, where $S_j \in R^{N \times k}$ is the selection matrix in which the $j_l$th element of the $l$th column is 1 and the other elements are 0, $l = 1, \ldots, k$; the neighbourhood includes $x_j$ itself. Let $\bar{x}_j = (1/k)\sum_{l=1}^{k} x_{jl}$ be the centre of $X_j$. From now on, $X_j$ is regarded as the neighbourhood of $\bar{x}_j$, no longer the neighbourhood of $x_j$. Let $Y_j = [y_{j1} \cdots y_{jk}] = Y S_j$ be the global coordinate of $X_j$, that is, $y_{jl} = f(x_{jl})$, $l = 1, \ldots, k$, and $\Theta_j = [\theta_{j,1} \cdots \theta_{j,k}]$ the local coordinate of $X_j$, that is, $\theta_{j,l} = \varphi_j(x_{jl})$, $l = 1, \ldots, k$, where $\varphi_j$ is the local homeomorphic mapping between $X_j$ and $\Theta_j$. According to (8) derived in Section 4, one obtains the nonlinear relation (9). The deduction of $Y^{(t+1)}$ from $Y^{(t)}$ then proceeds patch by patch through the centred patch matrices $\hat{Y}^{(t)}_j$ and $\hat{Y}^{(t+1)}_j$.

Table 1: Average alignment errors of the LLIA algorithm on Trefoil, Punctured Sphere, and Toroidal Helix during the iterations.

Table 2: Clustering accuracy of USPS by using the 2D embedding.