Dimensionality Reduction by Weighted Connections between Neighborhoods



Introduction
Real-world data, such as speech signals, digital photographs, or fMRI scans, usually have high dimensionality. In order to handle such data adequately, their dimensionality needs to be reduced. Ideally, the reduced representation should have a dimensionality that corresponds to the intrinsic dimensionality of the data. Fortunately, in many of these applications, the components of the high-dimensional data vectors are not all independent of each other, and in many cases the data points can be considered as lying on or close to a low-dimensional nonlinear manifold embedded in a high-dimensional space. Dimensionality reduction methods can also be thought of as a principled way to understand high-dimensional data.
Mathematically, the problem of dimensionality reduction can be described briefly as follows. Assume we have a dataset X with n samples in D-dimensional space. Assume further that this dataset has intrinsic dimensionality d (where d < D and often d ≪ D). The intrinsic dimensionality means that the points in the dataset lie on or near a manifold with dimensionality d that is embedded in the D-dimensional space. The manifold may be non-Riemannian because of discontinuities. Dimensionality reduction techniques transform the dataset with dimensionality D into a new dataset with dimensionality d, while retaining the geometry of the data as much as possible [1].
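To make the notion of intrinsic dimensionality concrete, the sketch below generates data like the "Swiss-roll" example used in Section 4 (the sampling scheme here is a hypothetical one, not the authors' exact construction): every 3-dimensional point is controlled by only two parameters, so D = 3 while the intrinsic dimensionality is d = 2.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000  # sample size, matching the Swiss-roll example in Section 4

# Intrinsic 2-D coordinates (t, h) on the manifold.
t = 1.5 * np.pi * (1.0 + 2.0 * rng.random(n))  # angle along the roll
h = 21.0 * rng.random(n)                       # height along the roll

# Embedding in D = 3 dimensions: every coordinate is a smooth function
# of (t, h), so the data lie exactly on a 2-D manifold in R^3.
X = np.column_stack([t * np.cos(t), h, t * np.sin(t)])
print(X.shape)  # (2000, 3)
```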
A large number of nonlinear techniques for dimensionality reduction have been proposed in the past decades. Among them, principal component analysis (PCA) [2], linear discriminant analysis (LDA) [3], and multidimensional scaling (MDS) [4] are the three most classical techniques for dimensionality reduction. PCA finds a low-dimensional embedding of the data points that best preserves their variance as measured in the high-dimensional input space. Classical MDS finds an embedding that preserves the interpoint distances, and is equivalent to PCA when those distances are Euclidean. LDA searches for the projection axes on which points from different classes are far from each other while points from the same class are close to each other. In 2000, two novel methods for manifold learning, the isometric feature mapping method (Isomap) [5] and the locally linear embedding method (LLE) [6], drew great interest. Unlike other nonlinear dimension reduction methods, both LLE and Isomap emphasize simple algorithmic implementation and avoid nonlinear optimization formulations that are prone to local minima [7]. In Isomap, the geodesic distances between the data points are computed by constructing a neighborhood graph G, in which every data point is connected with its K nearest neighbors. The geodesic distances between all data points in the dataset are computed, thereby forming a pairwise geodesic distance matrix. The low-dimensional representations are obtained by applying classical MDS to the resulting pairwise geodesic distance matrix. In LLE, the local properties of the data manifold are captured by writing each high-dimensional data point as a linear combination of its nearest neighbors. In the low-dimensional representation of the data, LLE attempts to retain the reconstruction weights of the linear combinations as well as possible. van der Maaten et al.
[1] presented a review and systematic comparison of a variety of nonlinear dimensionality reduction techniques, including kernel PCA, maximum variance unfolding, diffusion maps, Laplacian eigenmaps, local tangent space analysis, Sammon mapping, multilayer autoencoders, locally linear coordination, manifold charting, Isomap, and LLE. The performances of these nonlinear techniques are investigated on artificial and natural tasks.
Recently, a great many dimensionality reduction methods based on manifold learning have been introduced, and some classical techniques have been further improved [8][9][10][11][12][13][14][15]. In [8], Hu et al. proposed a new dimensionality reduction algorithm called discriminant multidimensional mapping (DMM), which combines the advantages of multidimensional scaling (MDS) and LDA. DMM is effective for small sample datasets with high dimensionality. A dimensionality reduction technique named orthogonal isometric projection (OIP) is proposed in [9]. In contrast with Isomap, which learns the low-dimensional embedding under the classical multidimensional scaling (MDS) framework, the authors consider an explicit linear projection that captures the geodesic distance, which is able to handle new data straightforwardly and leads to a standard eigenvalue problem. To address the problem of finding a template function that represents the common pattern of a sample of curves, a novel algorithm based on a robust version of the isometric feature mapping (Isomap) algorithm was developed by Dimeglio et al. [11]. H. Choi and S. Choi [12] presented a robust kernel Isomap which relates Isomap to Mercer kernel machines through kernel principal component analysis, so that the generalization property naturally emerges. However, it remains a challenging problem to propose a novel method that reduces a dataset in high-dimensional space into a desired dataset of low dimensionality while preserving the local topology of the dataset.
Motivated by the Isomap and LLE dimensionality reduction methods proposed by Tenenbaum et al. [5] and Roweis and Saul [6], respectively, in this paper we improve the K-Isomap algorithm by weighting the local connections between neighbors. The nearness of the neighbors of a data point can be reflected well by the proposed technique. The desired dimensionality reduction results have been obtained by the proposal on three classical examples.
The paper is organized as follows. Section 2 introduces the classical MDS and Isomap dimensionality reduction techniques. An improvement of the K-Isomap algorithm is described in Section 3. Section 4 shows the validity of the proposal on two well-known datasets which are widely used by a large number of dimensionality reduction techniques. The conclusion is given in the last section.

MDS and Isomap Techniques
2.1. Classical MDS. MDS [4, 8, 16] is one of the global nonlinear techniques for dimensionality reduction which attempts to preserve the global properties of the data. It maps the high-dimensional data representation to a low-dimensional representation while retaining the pairwise distances between the data points as faithfully as possible.
Suppose we have a set of n points X = {x_1, x_2, ..., x_n}, x_i ∈ R^D. The pairwise Euclidean distance between the data points x_i and x_j is

d_{ij} = \sqrt{ \sum_{k=1}^{D} ( x_i^{(k)} - x_j^{(k)} )^2 },

where x_i^{(k)} denotes the kth component of x_i. Without loss of generality, we can assume that the x_i's are centered.
Notice that the squared pairwise distance is

d_{ij}^2 = x_i^T x_i + x_j^T x_j - 2 x_i^T x_j.

Then, the squared-distance matrix D = [d_{ij}^2]_{n \times n} can be rewritten as

D = \psi 1^T + 1 \psi^T - 2 X^T X,

where 1 is the n-dimensional vector of all ones, \psi = (x_1^T x_1, ..., x_n^T x_n)^T, and X = [x_1, x_2, ..., x_n].
The mean-centered inner product matrix is defined as

B = -\frac{1}{2} H D H,

where H = I - \frac{1}{n} 1 1^T is the mean-centering matrix.
To recover X, let the eigendecomposition of B be

B = \sum_i \lambda_i v_i v_i^T,

where \lambda_i is the ith largest positive eigenvalue and v_i is the corresponding orthonormal eigenvector.
Then, the required d-dimensional embedding vectors y_i are given by the columns of the following matrix:

Y = [\sqrt{\lambda_1} v_1, \sqrt{\lambda_2} v_2, ..., \sqrt{\lambda_d} v_d]^T.

MDS is widely used for the visualization of data and in molecular modeling. The popularity of MDS has also led to the proposal of its variants.
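The steps above can be sketched in a few lines of NumPy (a minimal illustration, not the authors' code; the function name `classical_mds` is chosen here):

```python
import numpy as np

def classical_mds(D2, d):
    """Classical MDS from a matrix D2 of squared pairwise distances.

    B = -1/2 * H * D2 * H with H = I - (1/n) 11^T; the embedding is built
    from the d largest positive eigenpairs of B.
    """
    n = D2.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n        # mean-centering matrix
    B = -0.5 * H @ D2 @ H                      # mean-centered inner products
    w, V = np.linalg.eigh(B)                   # eigenvalues in ascending order
    idx = np.argsort(w)[::-1][:d]              # d largest eigenvalues
    return V[:, idx] * np.sqrt(np.maximum(w[idx], 0.0))  # n x d embedding

# Sanity check: on exact Euclidean squared distances, classical MDS
# recovers a configuration with the same pairwise distances.
X = np.random.default_rng(1).normal(size=(50, 3))
D2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
Y = classical_mds(D2, 3)
D2y = ((Y[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
print(np.allclose(D2, D2y, atol=1e-6))  # True
```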

2.2. Isomap.
When the data points lie on or close to a low-dimensional nonlinear manifold embedded in a high-dimensional space and the nonlinear structure cannot be adequately represented by a linear approximation, classical MDS as mentioned above usually fails to recover the low-dimensional structure of the nonlinear manifold. For example, the data points lying on an underlying two-dimensional "Swiss-roll" manifold embedded in a three-dimensional space cannot be reduced effectively by the MDS method.
To deal with this problem, a class of nonlinear embedding techniques has recently been designed to discover the structure of high-dimensional data and find their embedding in a low-dimensional Euclidean space. Among them, in 2000, Tenenbaum et al. [5] proposed a well-known dimensionality reduction technique, named Isomap, which combines the major algorithmic advantages of PCA and MDS. In the Isomap algorithm, geodesic distances between points are extracted instead of simply taking the Euclidean distance. The geodesic distances are computed by constructing a sparse graph in which each node is connected only to its closest neighbors. The geodesic distance between each pair of nodes is taken to be the length of the shortest path in the graph that connects them. These approximated geodesic distances are then used as input to classical MDS. Up to now, Isomap has proven to be successful in many applications, such as wood inspection, visualization of biomedical data, and head pose estimation.
The Isomap algorithm can be written briefly as follows.
Step 1 (construct neighborhood graph). Determine which points are neighbors on the manifold M. Two simple methods are to connect each point to all points within some fixed radius ε (ε-Isomap) or to all of its K nearest neighbors (K-Isomap). These neighborhood relations are represented as a weighted graph G over the data points, with edges of weight d_X(i, j) between neighboring points.
Step 2 (compute shortest paths). Isomap estimates the geodesic distances d_M(i, j) between all pairs of points on the manifold M by computing their shortest path distances d_G(i, j) in the graph G.
Step 3 (construct d-dimensional embedding). Apply classical MDS to the matrix of graph distances D_G = [d_G(i, j)], constructing an embedding of the data in a low-dimensional Euclidean space Y that best preserves the estimated intrinsic geometry of the manifold. K-Isomap means that the K-nearest-neighbor method is used in Step 1 of the Isomap algorithm. The detailed procedure of the Isomap algorithm can be found in [5].
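The three steps can be sketched as follows (an illustrative implementation, not the authors' code; the function name `isomap` and the helix toy data are chosen here). Infinite entries mark missing edges, and SciPy's Dijkstra shortest paths supply the geodesic estimates fed to classical MDS:

```python
import numpy as np
from scipy.sparse.csgraph import shortest_path
from scipy.spatial.distance import pdist, squareform

def isomap(X, K, d):
    """K-Isomap sketch: K-NN graph -> graph shortest paths -> classical MDS."""
    n = X.shape[0]
    D = squareform(pdist(X))                     # Euclidean distances
    G = np.full((n, n), np.inf)                  # inf = no edge
    nn = np.argsort(D, axis=1)[:, 1:K + 1]       # K nearest neighbors of each point
    rows = np.repeat(np.arange(n), K)
    G[rows, nn.ravel()] = D[rows, nn.ravel()]    # Step 1: neighborhood graph
    DG = shortest_path(G, method="D", directed=False)  # Step 2: geodesics
    H = np.eye(n) - np.ones((n, n)) / n          # Step 3: classical MDS
    B = -0.5 * H @ (DG ** 2) @ H
    w, V = np.linalg.eigh(B)
    idx = np.argsort(w)[::-1][:d]
    return V[:, idx] * np.sqrt(np.maximum(w[idx], 0.0))

# Toy run on points along a helix (a connected curve in R^3).
t = np.linspace(0.0, 3.0, 60)
Xh = np.column_stack([np.cos(2 * t), np.sin(2 * t), t])
Y = isomap(Xh, K=8, d=2)
print(Y.shape)  # (60, 2)
```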

An Improvement of 𝐾-Isomap
The basic idea of the Isomap algorithm is that, for a neighborhood of points on a manifold, the Euclidean distances provide a fair approximation of the geodesic distances. For faraway points the geodesic distance is estimated by the shortest path through neighboring points. It should be noted that the neighbor relation is symmetric in the ε-Isomap algorithm: if data point x_j is in the neighbourhood of data point x_i, then x_i must be in the neighbourhood of x_j, since the distance between x_i and x_j is less than or equal to ε. However, the symmetry of the neighbor relation no longer holds in the K-Isomap method. As shown in Figure 1, if we take the K = 5 nearest neighbors of x_i to construct the neighborhood graph in K-Isomap, x_j is the fifth neighbor of x_i, but x_i is not one of the five nearest neighbors of x_j.
In this case, it is more reasonable to express the relation between x_i and x_j through intermediate points, even though the direct distance between them is small. To preserve the close relation between neighbors properly in the procedure of dimensionality reduction, a novel technique is described below to reduce high-dimensional data into a low-dimensional Euclidean space.
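The asymmetry of the K-nearest-neighbor relation described above can be reproduced on a tiny toy example (the values here are chosen purely for illustration): with K = 2, the outlying point 10.0 selects 2.0 and 1.0 as its neighbors, but no other point selects 10.0 in return.

```python
import numpy as np

x = np.array([0.0, 1.0, 2.0, 10.0])          # four points on a line
D = np.abs(x[:, None] - x[None, :])          # pairwise distances
K = 2
knn = np.argsort(D, axis=1)[:, 1:K + 1]      # indices of each point's K-NN

print(knn[3])                  # [2 1]: 10.0 takes 2.0 and 1.0 as neighbors
print((knn[:3] == 3).any())    # False: no point takes 10.0 as a neighbor
```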
Suppose the data consist of n real-valued vectors, each of dimensionality D, sampled from some underlying manifold. Provided there is sufficient data, we expect each data point and its neighbors to lie on or close to a locally linear patch of the manifold. We assign neighbors to each data point x_i based on the distances d_{ij} between pairs of points x_i and x_j, using the K nearest neighbors in the input space. To reflect the nearness of a neighbor x_j to the data point x_i, a weight w_{ij} is defined in terms of d_{ij}, the distance between the neighbor x_j and the data point x_i. If x_j does not belong to the set of neighbors of x_i, then w_{ij} = ∞.
It is obvious that w_{ij} is generally not equal to w_{ji}. In order to use the MDS algorithm, therefore, we define the symmetric matrix W as follows:

W_{ij} = \frac{w_{ij} + w_{ji}}{2}.

The elements W_{ij} of the matrix W indicate the average weights between x_i and x_j. The neighborhood relations are represented by a weighted graph G over the data points, with edges of weight W_{ij} between neighboring points. The graph G constructed by the proposed method reflects the neighborhood relations between data points more effectively than the one defined by K-Isomap. For instance, the edge connecting the data points x_i and x_j in Figure 1 does not exist according to our method, because the relation between them is not close.
The geodesic distances d_M(i, j) between all pairs of points on the manifold M can then be obtained by computing their shortest path distances d_W(i, j) in the graph G. As in the K-Isomap algorithm, we apply classical MDS to the matrix D_W = [d_W(i, j)], constructing an embedding of the data in a low-dimensional Euclidean space that best preserves the manifold's estimated intrinsic geometry.
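A sketch of the weighted neighborhood graph follows. The exact formula for w_{ij} did not survive in this text, so the sketch makes the simple (assumed) choice w_{ij} = d_{ij} for neighbors; the key mechanism is the averaging, under which a finite weight averaged with ∞ stays infinite, deleting one-sided edges like the one between x_i and x_j in Figure 1.

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform

def weighted_knn_graph(X, K):
    """Symmetrized weighted K-NN graph (sketch; w_ij = d_ij for neighbors
    is an assumption, since the text omits the exact weight formula)."""
    n = X.shape[0]
    D = squareform(pdist(X))
    w = np.full((n, n), np.inf)                  # w_ij = inf for non-neighbors
    nn = np.argsort(D, axis=1)[:, 1:K + 1]
    rows = np.repeat(np.arange(n), K)
    w[rows, nn.ravel()] = D[rows, nn.ravel()]    # assumed: w_ij = d_ij
    W = (w + w.T) / 2.0                          # average weights; one-sided
    np.fill_diagonal(W, 0.0)                     # edges average to infinity
    return W

# On toy line data, 10.0 picks 2.0 as a neighbor but not vice versa,
# so the averaged weight between them is infinite: no edge remains.
X = np.array([[0.0], [1.0], [2.0], [10.0]])
W = weighted_knn_graph(X, K=2)
print(W[0, 1])             # 1.0: mutual neighbors keep a finite weight
print(np.isinf(W[2, 3]))   # True: the one-sided edge is removed
```

Shortest paths in this graph then replace d_G in Step 2 of Isomap, and classical MDS is applied as before.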

Experiments
To verify the dimensionality reduction capability of the introduced method, we test it on three classical examples which are widely used to evaluate the effectiveness of dimensionality reduction techniques. The experimental platform is Windows 7 with an AMD Phenom(tm) II P960 quad-core processor at 1.8 GHz and 2.00 GB of memory. The programming language is Matlab R2011a.
In what follows, two classical pattern classification problems, face recognition and handwritten digit recognition, are considered in order to analyze the performance of our proposal.
One of the classical examples of dimensionality reduction from the domain of visual perception is face recognition. Multiple photographs (n = 698) of the same face with different poses and lighting directions are digitized as 64 × 64 = 4096 gray-scale images. All of the images, which are saved in no particular order, can be considered as points in a high-dimensional vector space lying on an intrinsically three-dimensional manifold. We test the improved method on this dataset (http://isomap.stanford.edu/datasets.html) and take K = 8. A two-dimensional projection is shown in Figure 2.

The MNIST database of handwritten digits, available from http://yann.lecun.com/exdb/mnist/, has a training set of 60,000 examples and a test set of 10,000 examples. The digits have been size-normalized and centered in a fixed-size image. Each digit set in the database consists of about 1100 instances of a given numeral ("0" to "9") scanned in as 28 × 28 = 784 pixel gray-scale images. These handwritten digits vary in slope, angle, and stroke thickness. In order to evaluate the performance of our proposal, n = 1032 instances of the digit "2", written in various forms, are selected from the MNIST database. The major features of the "2"s are the bottom loop, the top arch, and the thickness. Applied to this dataset, the proposed method (K = 19) learns a two-dimensional embedding of the data's intrinsic geometric structure. It is easy to see from Figure 3 that similar symbols are near each other. This fact indicates that the neighbor relations of data points in high-dimensional space still hold in two-dimensional space under the introduced approach.
A low-dimensional nonlinear manifold embedded in a high-dimensional space is generally invisible, and therefore it is difficult to see the dimensionality reduction clearly. To illustrate it effectively, we introduce the artificial dataset "Swiss-roll", three-dimensional data (n = 2000) sampled from a two-dimensional manifold, as shown in Figure 4(a). The colors of the data points in Figure 4 indicate their neighborhood relations. It is easy to see from Figure 4(b) that the "Swiss-roll" is unfolded by the proposed approach (K = 15) into two-dimensional space. The example shows visually that the neighborhood relations of all data points in high-dimensional space are preserved well while reducing them into low-dimensional space by our proposal. The reduced results of the three classical examples account for the validity of our proposed method.
It is well known that dimensionality reduction is a stage of data preprocessing in data mining, not our final destination. It is worthwhile to investigate how to apply the reduced results to classify or cluster real and artificial datasets. Our future work will focus on this problem.

Figure 1: Five nearest neighbors of x_i and x_j.

Figure 2: A two-dimensional projection of the face dataset in 4096-dimensional space.
In Figure 2, the circles indicate the original input images and the points reflect the neighbor relations of the images. The direction of the coordinate axis represents the gradual change of poses. The results show that the relations of images with similar poses are preserved perfectly on this dataset in the procedure of dimensionality reduction.