Traditional manifold learning algorithms, such as locally linear embedding, Isomap, and Laplacian eigenmap, only provide embeddings for the training samples. To solve the out-of-sample extension problem, spectral regression (SR) learns an embedding function through a regression framework, which avoids the eigendecomposition of dense matrices. Motivated by the effectiveness of SR, we incorporate multiple kernel learning (MKL) into SR for dimensionality reduction. The proposed approach (termed MKLSR) seeks an embedding function in the reproducing kernel Hilbert space (RKHS) induced by multiple base kernels, further improving the performance of kernel-based SR (KSR). Moreover, MKLSR can be applied in supervised, unsupervised, and semisupervised settings. Experimental results on supervised and semisupervised classification demonstrate the effectiveness and efficiency of our algorithm.
In real applications, data representations are often high dimensional, and practical algorithms usually behave poorly when faced with many irrelevant features. Hence, transforming the data into a unified lower-dimensional space can facilitate underlying tasks such as pattern recognition and regression. Dimensionality reduction (DR) techniques have been widely used in many fields of information processing; they include unsupervised, supervised, and semisupervised methods, depending on the assumptions about the data distribution and the availability of label information.
In order to handle the data sampled from a nonlinear low dimensional manifold, many manifold learning techniques, such as ISOMAP [
The paper is structured as follows. In Section
Since the relevant literature is quite extensive, our survey instead emphasizes the key concepts crucial to the establishment of the proposed framework.
In traditional spectral dimensionality reduction algorithms, seeking an embedding function that minimizes the objective function involves the eigendecomposition of dense matrices, which has a high computational cost in both time and memory. The SR algorithm instead obtains the projection directions by least squares, avoiding the eigendecomposition of dense matrices, and thus learns much faster. An affinity graph over the training samples is constructed first, as described below.
Given a training set of $n$ samples $\{x_1, \dots, x_n\} \subset \mathbb{R}^m$, SR proceeds as follows.

(1) Constructing the adjacency graph $G$ with $n$ nodes, the $i$th node corresponding to $x_i$:
If the labels of $x_i$ and $x_j$ are available and identical, put an edge between nodes $i$ and $j$.
If the labels are unavailable, put an edge between nodes $i$ and $j$ when $x_i$ is among the $p$ nearest neighbors of $x_j$, or vice versa.
Otherwise, if neither condition holds, leave nodes $i$ and $j$ unconnected.

(2) Constructing the weight matrix $W \in \mathbb{R}^{n \times n}$:
If there is no edge between nodes $i$ and $j$, set $W_{ij} = 0$.
Otherwise, if both $x_i$ and $x_j$ belong to the $k$th class, which contains $l_k$ samples, set $W_{ij} = 1/l_k$; in the unsupervised case, set $W_{ij} = e^{-\|x_i - x_j\|^2 / 2\sigma^2}$.

(3) For eigendecomposition, let $D$ be the diagonal matrix with $D_{ii} = \sum_j W_{ij}$, and solve the generalized eigenproblem $W\mathbf{y} = \lambda D\mathbf{y}$ to obtain the $d$ largest eigenvectors $\mathbf{y}_1, \dots, \mathbf{y}_d$.

(4) Calculate the projection vectors by regularized least squares: $\mathbf{a}_k = \arg\min_{\mathbf{a}} \sum_{i=1}^{n} (\mathbf{a}^{\top} x_i - y_{ik})^2 + \alpha \|\mathbf{a}\|^2 = (XX^{\top} + \alpha I)^{-1} X \mathbf{y}_k$, where $X = [x_1, \dots, x_n]$.

Let $A = [\mathbf{a}_1, \dots, \mathbf{a}_d]$; a sample $x$ is then embedded as $z = A^{\top} x$.
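As a concrete illustration, the supervised SR steps can be sketched in a few lines of NumPy. This is a minimal sketch with variable names of our own; the trivial constant eigenvector is removed by centering the top eigenspace:

```python
import numpy as np
from scipy.linalg import eigh

def spectral_regression(X, labels, alpha=0.01):
    """Supervised SR sketch.  X is (m, n) with samples as columns."""
    n = X.shape[1]
    # Steps 1-2: supervised weight matrix, W_ij = 1/l_k when i and j share class k.
    W = np.zeros((n, n))
    for c in np.unique(labels):
        idx = np.where(labels == c)[0]
        W[np.ix_(idx, idx)] = 1.0 / len(idx)
    # Step 3: generalized eigenproblem W y = lambda D y with D_ii = sum_j W_ij.
    D = np.diag(W.sum(axis=1))
    _, vecs = eigh(W, D)                      # eigenvalues in ascending order
    c = len(np.unique(labels))
    Y = vecs[:, -c:]                          # top eigenspace, one vector per class
    Yc = Y - Y.mean(axis=0)                   # project out the trivial constant vector
    U, _, _ = np.linalg.svd(Yc, full_matrices=False)
    Y = U[:, : c - 1]                         # d = c - 1 useful response vectors
    # Step 4: ridge regression a_k = (X X^T + alpha I)^{-1} X y_k.
    A = np.linalg.solve(X @ X.T + alpha * np.eye(X.shape[0]), X @ Y)
    return A                                  # embed a sample x as z = A^T x
```

On two well-separated classes, the single learned direction maps the classes to clearly different embedding values.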
Next, we briefly discuss kernel spectral regression (KSR). If we choose a nonlinear embedding function in an RKHS, that is, $f(x) = \sum_{i=1}^{n} \alpha_i K(x_i, x)$ for a kernel function $K$, the regression step becomes kernel ridge regression.
Let $K$ also denote the $n \times n$ kernel matrix with $K_{ij} = K(x_i, x_j)$; the coefficient vector for the $k$th response is then $\boldsymbol{\alpha}_k = (K + \delta I)^{-1} \mathbf{y}_k$, where $\delta > 0$ is a regularization parameter.
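A minimal sketch of the KSR regression step, assuming an RBF base kernel (the kernel choice and names are ours, not prescribed by the paper):

```python
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    """RBF base kernel between sample matrices with samples as columns."""
    d2 = (A * A).sum(axis=0)[:, None] + (B * B).sum(axis=0)[None, :] - 2.0 * A.T @ B
    return np.exp(-gamma * np.maximum(d2, 0.0))

def ksr_fit(X, Y, delta=0.1, gamma=1.0):
    """KSR sketch: coefficients alpha_k = (K + delta I)^{-1} y_k per response column."""
    K = rbf_kernel(X, X, gamma)
    return np.linalg.solve(K + delta * np.eye(K.shape[0]), Y)

def ksr_embed(X_train, X_test, coeffs, gamma=1.0):
    """Embed test samples: f_k(x) = sum_i alpha_ik K(x_i, x)."""
    return rbf_kernel(X_test, X_train, gamma) @ coeffs
```

By construction the coefficients satisfy $(K + \delta I)\boldsymbol{\alpha}_k = \mathbf{y}_k$ exactly, which makes the fit easy to verify numerically.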
MKL learns a kernel machine with multiple kernel functions or kernel matrices. Recent studies have shown that MKL not only increases recognition accuracy but also enhances the interpretability of the resulting classifiers. Given a set of base kernel functions $\{K_m\}_{m=1}^{M}$, a common parameterization is the convex combination $K = \sum_{m=1}^{M} \beta_m K_m$ with $\beta_m \ge 0$ and $\sum_{m} \beta_m = 1$.
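Forming the convex combination of precomputed base kernel matrices is straightforward; the small helper below is ours, introduced purely for illustration:

```python
import numpy as np

def ensemble_kernel(base_kernels, beta):
    """Convex combination K = sum_m beta_m K_m of precomputed base kernel matrices;
    the combination stays positive semidefinite when every K_m is."""
    beta = np.asarray(beta, dtype=float)
    if np.any(beta < 0) or not np.isclose(beta.sum(), 1.0):
        raise ValueError("beta must be nonnegative and sum to one")
    return sum(b * Km for b, Km in zip(beta, base_kernels))
```

Because each $\beta_m \ge 0$, positive semidefiniteness of the base kernels carries over to the ensemble kernel, so it remains a valid kernel.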
In recent years, dimensionality reduction methods based on multiple kernels have been proposed to improve the performance of those using a single kernel. In [
We first explain how to integrate MKL and SR for dimensionality reduction. Then, we propose an optimization procedure to complete the framework.
Suppose that the ensemble kernel is a convex combination of the base kernels, $K = \sum_{m=1}^{M} \beta_m K_m$ with $\beta_m \ge 0$.
The additional constraints in (
Observe from (
The optimization problem (
Since direct optimization to (
We can indirectly utilize
Solving the problem (
Let
We have
Theorem
Solve the eigenproblem in (
Find
Since the matrix
By fixing
The additional constraints
The training procedure of MKLSR is summarized as follows.
Input: a set of training data, base kernel matrices $\{K_m\}_{m=1}^{M}$, and the regularization parameter $\delta$.
Output: sample coefficient vectors $\{\boldsymbol{\alpha}_k\}_{k=1}^{d}$ and the kernel weight vector $\boldsymbol{\beta}$.
Make an initial guess for $\boldsymbol{\beta}$, and compute the $d$ largest eigenvectors of the graph eigenproblem. Then repeat until convergence:
(1) Compute the ensemble kernel matrix $K = \sum_{m=1}^{M} \beta_m K_m$.
(2) Fix $\boldsymbol{\beta}$ and solve for the coefficient vectors $\{\boldsymbol{\alpha}_k\}$.
(3) Compute the updated kernel weights $\boldsymbol{\beta}$ with $\{\boldsymbol{\alpha}_k\}$ fixed.
(4) Check for convergence; if not converged, return to step (1).
After accomplishing the training procedure of MKLSR, we can project a testing sample $x$ into the learned subspace via $z_k = \sum_{i=1}^{n} \alpha_{ik} \sum_{m=1}^{M} \beta_m K_m(x_i, x)$.
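Since the exact update equations are omitted in this copy, the sketch below only illustrates the generic alternating scheme: fix $\boldsymbol{\beta}$ and solve a kernel ridge problem for the coefficients, then fix the coefficients and refresh $\boldsymbol{\beta}$. The $\boldsymbol{\beta}$ update shown is a simple alignment heuristic of our own, standing in for the paper's exact rule:

```python
import numpy as np

def mklsr_train(base_kernels, Y, delta=0.1, iters=10):
    """Alternating MKL-SR sketch.  base_kernels is a list of (n, n) base kernel
    matrices and Y the (n, d) spectral responses.  NOTE: the beta update below
    is an illustrative heuristic, not the paper's exact rule."""
    M = len(base_kernels)
    n = base_kernels[0].shape[0]
    beta = np.full(M, 1.0 / M)                      # initial guess for beta
    A = None
    for _ in range(iters):
        K = sum(b * Km for b, Km in zip(beta, base_kernels))  # (1) ensemble kernel
        A = np.linalg.solve(K + delta * np.eye(n), Y)         # (2) fix beta, solve alpha
        # (3) fix alpha, refresh beta: reward kernels whose predictions align with Y
        fit = np.array([max(1e-12, np.trace(A.T @ Km @ Y)) for Km in base_kernels])
        beta = fit / fit.sum()                                # (4) renormalize and repeat
    return A, beta

def mklsr_project(test_kernels, A, beta):
    """Project test samples; test_kernels[m] holds K_m(x_test, x_train)."""
    K = sum(b * Km for b, Km in zip(beta, test_kernels))
    return K @ A
```

Projecting a test sample only requires its base kernel evaluations against the training samples, which is exactly the out-of-sample extension the framework provides.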
To perform the unsupervised learning task, we used five data sets derived from the UCI machine learning repository (ionosphere, letter, digits, and satellite). For the letter and satellite data sets, we used only their first two classes, and several multiclass data sets were created from the digits data. The experiments on supervised and semisupervised classification were performed on the CMU PIE face data set and the extended Yale B data set [
Description of the datasets.

Data  Size ($n$)  Feature ($m$)  Class
Ionosphere  351  34  2
Letter AB  1555  16  2
Satellite C1C2  2236  36  2
Digits 0689  713  64  4
Digits 1279  718  64  4
CMU PIE  850  1024  5
Extended Yale B  2114  1024  38
To validate that MKLSR is effective for an unsupervised dimensionality reduction task, we applied the proposed algorithm as a tool to learn an appropriate kernel function for KSR. Each data set was reduced by SR, single-kernel-based SR (KSR), kernel principal component analysis (KPCA), and MKLSR, respectively. The normalized cut spectral clustering (NC) algorithm was adopted to evaluate the clustering performance on the reduced data. We set the number of clusters equal to the true number of classes and compared the clusters generated by these algorithms with the true classes by computing the clustering accuracy measure.
To obtain stable results, for each data set we computed the average results of each algorithm over 20 runs. For comparison, we also performed the NC algorithm in the original data space (Baseline). For SR, KSR, and MKLSR, the dimension of the subspace equals the number of categories. For KPCA, we tested its performance with all possible dimensions and reported the best result. For SR, KSR, and MKLSR, we simply set the value of the parameter
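The clustering accuracy measure, whose defining equation is omitted in this copy, is commonly computed by matching clusters to classes one-to-one so that the number of correctly grouped samples is maximized. Assuming this standard definition, it can be sketched with the Hungarian algorithm:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def clustering_accuracy(true_labels, cluster_labels):
    """Fraction of samples correctly grouped under the best one-to-one
    mapping from clusters to classes (Hungarian algorithm)."""
    true_labels = np.asarray(true_labels)
    cluster_labels = np.asarray(cluster_labels)
    classes = np.unique(true_labels)
    clusters = np.unique(cluster_labels)
    # Contingency table: C[i, j] = number of samples in cluster i with true class j.
    C = np.zeros((len(clusters), len(classes)), dtype=int)
    for i, cl in enumerate(clusters):
        for j, c in enumerate(classes):
            C[i, j] = int(np.sum((cluster_labels == cl) & (true_labels == c)))
    rows, cols = linear_sum_assignment(-C)          # maximize matched samples
    return C[rows, cols].sum() / len(true_labels)
```

For example, a clustering that merely permutes the class labels scores 1.0, while a clustering independent of the classes scores near chance level.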
Table
Clustering accuracy (in percent) based on different DR methods.

Data  Baseline  SR  KSR  KPCA  MKLSR
Ionosphere
Letter AB
Satellite C1C2
Digits 0689
Digits 1279
In this experiment, we mainly compared MKLSR with the following approaches: KPCA, LDA, SR, and KSR. To evaluate these algorithms, we performed the SVM algorithm in the original face image space (baseline) and in the KPCA, LDA, SR, KSR, and MKLSR subspaces. The kernels and parameters were set in the same way as in the unsupervised learning experiments. From each class of the CMU PIE face data set, we randomly selected
For each given
Recognition accuracy rates on PIE (mean ± std-dev %).

Train size  Baseline  KPCA  LDA  SR  KSR  MKLSR
The key parameter in MKLSR is the regularization parameter
Performance of MKLSR with respect to
In the semisupervised case, we compared the performance of MKLSR with KPCA and semisupervised KSR. For comparison, we performed the SVM algorithm in the original face image space (baseline) and in the KPCA, semisupervised KSR, and semisupervised MKLSR subspaces. For KSR and MKLSR, we simply set the value of the parameter
We averaged the classification accuracy over 30 random splits for each given
Recognition accuracy rates on Extended Yale B (mean ± std-dev %).

The number of labeled samples  Baseline  KPCA  Semisupervised KSR  Semisupervised MKLSR
In this paper, we propose a new dimensionality reduction framework called MKLSR. By means of SR, we solve the out-of-sample extension problem by seeking an embedding function in the RKHS induced by multiple kernels. This method can not only construct a nonlinear embedding function in the form of a convex combination of base kernels but also improve on single-kernel-based SR in the supervised, semisupervised, and unsupervised cases. Experimental results validate the effectiveness and efficiency of the MKLSR algorithm. In the future, we will explore how to integrate other MKL methods into our model.
The authors declare that there are no conflicts of interest regarding the publication of this paper.
This work was supported by the Fundamental Research Funds for the Central Universities under Grant no. 2013XK10.