The problem of learning the kernel function as a linear combination of multiple kernels has attracted considerable attention recently in machine learning. Specifically, by imposing an

Kernel methods such as Support Vector Machines (SVMs) have been extensively applied to supervised learning tasks such as classification and regression. The performance of a kernel machine largely depends on the data representation via the choice of kernel function. Hence, one central issue in kernel methods is the problem of kernel selection; a great many approaches to selecting the right kernel have been studied in the literature [
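To make the kernel-selection issue concrete, the following sketch compares a few candidate kernels by held-out validation error for kernel ridge regression (the closed-form least squares kernel machine). The data, the candidate kernels, and the regularization value are all hypothetical choices for illustration, not the paper's experimental setup.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy regression data (illustrative only): y = sin(2*pi*x) + noise.
n = 80
X = rng.uniform(0.0, 1.0, size=(n, 1))
y = np.sin(2 * np.pi * X[:, 0]) + 0.1 * rng.standard_normal(n)
Xtr, ytr, Xva, yva = X[:60], y[:60], X[60:], y[60:]

def gram(kernel, A, B):
    """Gram matrix K[i, j] = k(A[i], B[j])."""
    return np.array([[kernel(a, b) for b in B] for a in A])

# A hypothetical dictionary of candidate kernels.
kernels = {
    "linear":     lambda a, b: float(a @ b),
    "poly(3)":    lambda a, b: (1.0 + float(a @ b)) ** 3,
    "gauss(0.2)": lambda a, b: float(np.exp(-np.sum((a - b) ** 2) / (2 * 0.2 ** 2))),
}

lam = 1e-2  # regularization parameter (learning-theory scaling: n * lam below)
errors = {}
for name, k in kernels.items():
    Ktr = gram(k, Xtr, Xtr)
    # Kernel ridge solution: alpha = (K + n*lam*I)^{-1} y.
    alpha = np.linalg.solve(Ktr + len(Xtr) * lam * np.eye(len(Xtr)), ytr)
    pred = gram(k, Xva, Xtr) @ alpha
    errors[name] = float(np.mean((pred - yva) ** 2))

best = min(errors, key=errors.get)
```

On this smooth periodic target, the Gaussian kernel unsurprisingly outperforms the linear one; the point is only that kernel choice can be cast as model selection over a candidate set.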

We begin with reviewing the classical supervised learning setup. Let

In this paper, we assume that

By restricting the regularization to be the form

Because of their simplicity and generality, kernels and the associated RKHSs play an increasingly important role in machine learning, pattern recognition, and artificial intelligence. When the kernel is fixed, an immediate concern is the choice of the regularization parameter
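For a fixed kernel, the dependence on the regularization parameter can be probed by a simple grid search on validation error; as the sketch below illustrates with a hypothetical Gaussian kernel and grid, too much regularization oversmooths and degrades the fit. None of the specific values come from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100
X = rng.uniform(-1.0, 1.0, size=n)
y = X ** 2 + 0.1 * rng.standard_normal(n)       # toy target (illustrative)
Xtr, ytr, Xva, yva = X[:70], y[:70], X[70:], y[70:]

def gauss_gram(A, B, width=0.3):
    """Gaussian Gram matrix for 1-d inputs (hypothetical width)."""
    return np.exp(-(A[:, None] - B[None, :]) ** 2 / (2 * width ** 2))

Ktr = gauss_gram(Xtr, Xtr)
Kva = gauss_gram(Xva, Xtr)

val_mse = {}
for lam in [1e-6, 1e-4, 1e-2, 1.0, 100.0]:
    # Kernel ridge with learning-theory scaling n * lam.
    alpha = np.linalg.solve(Ktr + len(Xtr) * lam * np.eye(len(Xtr)), ytr)
    val_mse[lam] = float(np.mean((Kva @ alpha - yva) ** 2))
```

The validation curve is typically U-shaped in the regularization parameter, which is exactly why its choice is a central practical concern.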

Kernel learning can range from the width parameter selection of Gaussian kernels [

In this paper, we mainly focus on the

The optimization problem subsumes state-of-the-art approaches to multiple kernel learning, covering sparse and nonsparse MKL by arbitrary
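To give a feel for \(\ell_p\)-norm multiple kernel learning, here is a deliberately simplified alternating scheme: fix the kernel weights, solve a kernel ridge problem for the combined kernel, then re-weight kernels by their quadratic "fit" and renormalize to the \(\ell_p\) unit sphere. The weight update is a heuristic stand-in for the closed-form updates in the MKL literature (e.g. Kloft et al.), and all data, widths, and parameters are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)
n, p = 60, 2.0                      # p: the l_p exponent on kernel weights (illustrative)
X = rng.uniform(-1, 1, size=n)
y = np.sin(3 * X) + 0.1 * rng.standard_normal(n)

def gauss_gram(A, B, w):
    return np.exp(-(A[:, None] - B[None, :]) ** 2 / (2 * w ** 2))

# M base kernels: Gaussians of different widths (a hypothetical dictionary).
widths = [0.05, 0.2, 1.0]
Ks = [gauss_gram(X, X, w) for w in widths]
M = len(Ks)

lam = 1e-2
d = np.full(M, M ** (-1.0 / p))     # uniform start on the unit l_p sphere

for _ in range(10):
    K = sum(dm * Km for dm, Km in zip(d, Ks))               # combined kernel
    alpha = np.linalg.solve(K + n * lam * np.eye(n), y)     # kernel ridge step
    # Heuristic re-weighting: favor kernels with larger alpha' K_m alpha,
    # then renormalize so that ||d||_p = 1 (sparser d for smaller p).
    fit = np.array([float(alpha @ Km @ alpha) for Km in Ks]).clip(min=1e-12)
    d = fit / (np.sum(fit ** p) ** (1.0 / p))
```

Taking \(p = 1\) pushes the weights toward sparsity, while larger \(p\) yields the nonsparse combinations mentioned above; this is the range of behaviors that the optimization problem in the text subsumes.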

It should be pointed out that the Tikhonov regularization in (

The following Lemma (see [

If

Hence, Lemma

In the following, we assume that

In this paper, we focus only on the least square loss $\ell(f(x), y) = (f(x) - y)^2$:

The projection operator
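In least squares analyses where the output is almost surely bounded by some constant \(M\), the projection (truncation) operator clips function values into \([-M, M]\), which tightens the sample error bounds without increasing the risk. A minimal sketch, assuming this standard definition:

```python
import numpy as np

def project(f_values, M):
    """Truncation operator pi_M: clip f(x) into [-M, M] coordinatewise."""
    return np.clip(f_values, -M, M)

vals = np.array([-3.5, -0.2, 0.0, 1.7, 9.9])
clipped = project(vals, 2.0)
```

Since \(|y| \le M\) almost surely, replacing a predictor \(f\) by \(\pi_M(f)\) can only decrease the least squares risk, which is why the projected estimator appears in the error analysis.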

The target of error analysis is to understand how

To show some ideas of our error analysis, we first state learning rates of (

Let

Theorem

Our main result is about learning rates of (

The approximation ability of the hypothesis space

The regularization error of the triple
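For the least squares loss, the regularization error of a triple \((\mathcal{H}_K, \rho, \lambda)\) is commonly defined as below; this is a hedged reconstruction of the standard definition, consistent with the truncated statement above.

```latex
\mathcal{D}(\lambda)
  = \inf_{f \in \mathcal{H}_K}
    \Bigl\{ \|f - f_\rho\|_{L^2_{\rho_X}}^2 + \lambda \|f\|_K^2 \Bigr\}
```

It measures how well the hypothesis space approximates the regression function \(f_\rho\) at regularization level \(\lambda\), and its decay as \(\lambda \to 0\) is what the approximation assumptions quantify.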

Our assumption implies that when

If

Next we define the truncated sample error as

The function

A useful approach for regularization schemes with sample-independent hypothesis spaces such as RKHSs is an error decomposition, which decomposes the total error
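A typical such decomposition (sketched here under standard notation: \(f_z\) the regularized empirical minimizer, \(f_\lambda\) a regularizing function in the hypothesis space, \(\mathcal{E}\) and \(\mathcal{E}_z\) the expected and empirical risks) bounds the total error by a sample error term plus a regularization error term:

```latex
\mathcal{E}(f_z) - \mathcal{E}(f_\rho)
  \le \underbrace{\bigl\{\mathcal{E}(f_z) - \mathcal{E}_z(f_z)\bigr\}
      + \bigl\{\mathcal{E}_z(f_\lambda) - \mathcal{E}(f_\lambda)\bigr\}}_{\text{sample error}}
  + \underbrace{\mathcal{E}(f_\lambda) - \mathcal{E}(f_\rho)
      + \lambda \|f_\lambda\|_K^2}_{\text{regularization error}}
```

The inequality follows by adding and subtracting the empirical risks and using that \(f_z\) minimizes the regularized empirical risk, so \(\mathcal{E}_z(f_z) + \lambda\Omega(f_z) \le \mathcal{E}_z(f_\lambda) + \lambda\|f_\lambda\|_K^2\).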

Let

We can decompose

We are in a position to estimate the sample error

Let

Define the random variable

From the definition of

Observe that

Next we estimate the first term

Let

The

Let

Denote by

There exists an exponent

For any function

Note that for any function set

Let

Our concentration estimate for the sample error dealing with

Let

Denote the set of functions

If

Consider the set

Applying Lemma

We are now in a position to obtain the learning rates of the projected algorithm (

Following the error decomposition scheme in Proposition

Suppose that

Following Propositions

When

Our learning rates below in Corollary

Suppose that

Recall a result from [

On the other hand, observe Lemma

Let us compare our learning rates with the existing results.

In [

In [

When

As for empirical risk minimization (ERM), classical results on the analysis of ERM schemes give error bounds between the empirical target function and the regression function. In particular, learning rates of the type

By our assumptions on M different kernels

Let

If there exists a function

In other words, if

From the statistical effective dimension point of view, we will discuss the impact of the multikernel class

Let us compare the multikernel class regularization with Tikhonov regularization in

In this case, our analysis implies that we should use an alternative kernel with faster eigenvalue decay when the spectral coefficients of the target function decay faster: for example, using
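The eigenvalue-decay comparison above can be illustrated numerically: the spectrum of a Gaussian Gram matrix decays rapidly but has many nonzero eigenvalues, whereas a degree-3 polynomial kernel has rank at most 4, so its spectrum is cut off exactly. The sample, width, and degree below are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(3)
X = rng.uniform(-1, 1, size=200)

def spectrum(K):
    """Eigenvalues of the normalized Gram matrix K/n, sorted in decreasing order."""
    ev = np.linalg.eigvalsh(K / len(K))
    return np.sort(ev)[::-1]

# Gaussian vs. polynomial Gram matrices on the same sample.
K_gauss = np.exp(-(X[:, None] - X[None, :]) ** 2 / (2 * 0.5 ** 2))
K_poly = (1.0 + X[:, None] * X[None, :]) ** 3

ev_gauss, ev_poly = spectrum(K_gauss), spectrum(K_poly)
# (1 + xy)^3 expands into 4 monomial features, so K_poly has rank <= 4:
# its 5th and later eigenvalues vanish up to numerical error, while the
# Gaussian spectrum decays (near-exponentially) without a hard cutoff.
```

This is the sense in which a kernel with faster eigenvalue decay is better matched to a target function whose spectral coefficients decay quickly.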

In general, for the sample error, there exist rates of convergence which hold independently of the underlying distribution

In the last section, we exclusively discuss sparsity in the case of the square loss regularization functional in (

For any kernel

According to Lemma

We assume that

For

The authors would like to thank the two anonymous referees for their valuable comments and suggestions which have substantively improved this paper. The research of S.-G. Lv is supported partially by 211 project for the Southwestern University of Finance and Economics, China [Project no. 211QN2011028] as well as Annual Cultivation of the Fundamental Research Funds for the Central Universities [Project no. JBK120940].