Block Empirical Likelihood for Longitudinal Single-Index Varying-Coefficient Model

Research Article

Song Yunquan (1, 2), Jian Ling (2), Lin Lu (1)

1 Shandong University Qilu Securities Institute for Financial Studies and School of Mathematics, Shandong University, Jinan 250100, China
2 College of Science, China University of Petroleum, Qingdao 266580, China

ORCID: http://orcid.org/0000-0003-1803-5394

Academic Editor: Gao Zhiwei

Journal of Applied Mathematics, vol. 2013, Article ID 792196, Hindawi Publishing Corporation. ISSN 1687-0042, 1110-757X. doi:10.1155/2013/792196

Received 26 May 2013; Accepted 15 August 2013; Published 3 October 2013

Copyright © 2013 Yunquan Song et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

In this paper, we consider a single-index varying-coefficient model with application to longitudinal data. To accommodate the within-group correlation, we apply the block empirical likelihood procedure to the longitudinal single-index varying-coefficient model and prove a nonparametric version of Wilks' theorem, which can be used to construct a block empirical likelihood confidence region with asymptotically correct coverage probability for the parametric component. In contrast to normal approximations, the proposed method does not require a consistent estimator of the asymptotic covariance matrix, which makes inference for the model's parametric component easier. Simulations illustrate the finite-sample performance of the proposed method.

1. Introduction

The single-index varying-coefficient model, proposed by Huang [1], is an important tool for exploring dynamic patterns in complex dynamic systems arising in fields such as economics, finance, politics, epidemiology, medical science, and ecology. As mentioned in Gao et al. [2], complex dynamic systems come in many varieties. Such systems are often concurrent and distributed, because they have to react to various kinds of events, signals, and conditions. They may be characterized by uncertainties, time delays, stochastic perturbations, hybrid dynamics, distributed dynamics, chaotic dynamics, and a large number of algebraic loops. Related work includes Jian et al. [3] and Hu and Gao [4]. Single-index varying-coefficient models are one way to describe such systems. They are natural extensions of classical parametric models, retain good interpretability, and are becoming increasingly popular in data analysis.

Longitudinal data arise frequently in many scientific studies. For longitudinal data, we know that the data that are collected from the same subject at different times are correlated and that the observations from different subjects are often independent. Therefore, it is of great interest to estimate the regression function incorporating the within-subject correlation to improve the efficiency of estimation. The single-index varying-coefficient model is a popular nonparametric fitting technique; it is easily interpreted in real applications because it has the features of the single-index model and the varying-coefficient model. In addition, the single-index varying-coefficient model may include cross-product terms of some components of covariates. Hence, it has considerable flexibility to cater for a complex multivariate nonlinear structure.

Without loss of generality, we consider a longitudinal study with $N$ subjects and $n_i$ observations over time for the $i$th subject ($i=1,\ldots,N$), giving a total of $n=\sum_{i=1}^{N}n_i$ observations. In this article, we propose a single-index varying-coefficient model for longitudinal data of the form
$$y_{ij}=g^{T}(\beta^{T}x_{ij})z_{ij}+\varepsilon_{ij},\quad i=1,\ldots,N;\ j=1,\ldots,n_i, \tag{1}$$
where $(x_{ij},z_{ij})\in R^{p}\times R^{q}$ is a vector of covariates, $y_{ij}$ is the $j$th measurement on the $i$th unit, $\beta$ is a $p\times 1$ vector of unknown parameters, $g(\cdot)$ is a $q\times 1$ vector of unknown functions, and $\varepsilon_{ij}$ is a random error with mean 0 and finite variance $\sigma^{2}$ that is independent of $(x_{ij},z_{ij})$. For identifiability, it is often assumed that $\|\beta\|=1$ and that the first nonzero element of $\beta$ is positive, where $\|\cdot\|$ denotes the Euclidean norm.

Model (1) includes a class of important statistical models. For example, if $q=1$ and $z_{ij}=1$, model (1) reduces to the single-index longitudinal data model studied by Bai et al. [5], who estimate the index coefficient and the unknown link function by combining penalized splines and quadratic inference functions. If $p=1$ and $\beta=1$, (1) is the varying-coefficient longitudinal data model studied by Chiang et al. [6], Huang et al. [7], and Qu and Li [8], among others. Model (1) is therefore easily interpreted in real applications because it has the features of both the single-index and the varying-coefficient longitudinal data models. In addition, model (1) may include cross-product terms of some components of $x_{ij}$ and $z_{ij}$. Hence, it has considerable flexibility to cater for complex multivariate nonlinear structure.

When $n_i=1$, model (1) reduces to the nonlongitudinal single-index varying-coefficient model, whose estimation and application have been studied by several authors. Recently, empirical likelihood methods have been applied to this model. For example, Xue and Wang [9] developed statistical techniques for the unknown coefficient functions and single-index parameters. They first estimate the nonparametric component via local linear fitting, then construct an estimated empirical likelihood ratio function, and hence obtain a maximum empirical likelihood estimator for the parametric component. The motivation is that empirical likelihood based inference has many desirable statistical properties. For example, it does not involve any variance estimation, which is rather complicated in nonparametric or semiparametric regression settings, and hence it is robust against heteroscedasticity; moreover, a confidence region based on the empirical likelihood method does not have a predetermined symmetry, so that it can better reflect the true shape of the underlying distribution. Owen [10, 11] and many others developed empirical likelihood into a general methodology; see, for example, Wang and Jing [12], Chen and Qin [13], Shi and Lau [14], and Xue and Zhu [15–17], among others. A survey of empirical likelihood can be found in the monograph of Owen [18]. Further methods for the single-index varying-coefficient model have been proposed by, for example, Huang and Zhang [19] and Feng and Xue [20]. When $n_i>1$, model (1) is the longitudinal single-index varying-coefficient model. The usual empirical likelihood method cannot be applied to this model, however, because of the correlation within groups. In this paper, we propose a block empirical likelihood procedure to accommodate this correlation. A nonparametric version of Wilks' theorem is derived, which can be used to construct confidence regions with asymptotically correct coverage probabilities for the parametric component in the model. Compared with normal approximations, our method has the appealing feature that it does not require one to construct a consistent estimator of the asymptotic covariance matrix. Furthermore, the block empirical likelihood method avoids the intensive Monte Carlo simulations usually required by the bootstrap method.

The rest of the paper is organized as follows. Section 2 introduces the estimated block empirical likelihood method. Section 3 derives the nonparametric version of Wilks’ theorem. Section 4 provides a data-driven procedure to choose the tuning parameters. A simulation study is given in Section 5. Proof of the main result is relegated to Section 6.

2. Block Empirical Likelihood Method

In this section, we extend the results of You et al. [21] and Xue and Wang [9] to the single-index varying-coefficient longitudinal data model.

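To apply the block empirical likelihood method to model (1), we introduce the auxiliary random vector
$$\eta_{ij}(\beta)=\{y_{ij}-g^{T}(\beta^{T}x_{ij})z_{ij}\}\,\dot{g}^{T}(\beta^{T}x_{ij})z_{ij}\,x_{ij}\,\omega(\beta^{T}x_{ij}), \tag{2}$$
where $\dot{g}(\cdot)$ stands for the derivative of the function vector $g(\cdot)$, and $\omega(\cdot)$ is a bounded weight function with bounded support $\mathcal{U}_{\omega}$, which is introduced to control the boundary effect in the estimation of $g(\cdot)$ and $\dot{g}(\cdot)$. For convenience, we take $\omega(\cdot)$ to be the indicator function of the set $\mathcal{U}_{\omega}$. Note that $E\{\eta_{ij}(\beta)\}=0$ if $\beta=\beta_0$. Hence, the problem of testing whether $\beta$ is the true parameter is equivalent to testing whether $E\{\eta_{ij}(\beta)\}=0$ for $i=1,\ldots,N$; $j=1,\ldots,n_i$. Because $g(\cdot)$ and $\dot{g}(\cdot)$ are unknown, we cannot directly use the block empirical likelihood method to make statistical inference on $\beta$. A natural remedy is to replace $g(\cdot)$ and $\dot{g}(\cdot)$ by their estimators. In this paper, we estimate the vector functions $g(\cdot)$ and $\dot{g}(\cdot)$ via the local linear regression technique (see, e.g., Fan and Gijbels [22]). For a fixed $\beta$, the local linear estimators of $g(u)$ and $\dot{g}(u)$ are defined as $\hat{g}(u;\beta)=\hat{a}$ and $\hat{\dot{g}}(u;\beta)=\hat{b}$, where $\hat{a}$ and $\hat{b}$ minimize the weighted sum of squares
$$\sum_{i=1}^{N}\sum_{j=1}^{n_i}\left[y_{ij}-\{a+b(\beta^{T}x_{ij}-u)\}^{T}z_{ij}\right]^{2}K_h(\beta^{T}x_{ij}-u), \tag{3}$$
where $K_h(\cdot)=h^{-1}K(\cdot/h)$, $K(\cdot)$ is a kernel function, and $h=h_n$ is a bandwidth sequence that decreases to 0 as $n$ increases to $\infty$. It follows from least squares theory that
$$\left(\hat{g}^{T}(u;\beta),\,h\,\hat{\dot{g}}^{T}(u;\beta)\right)^{T}=S_n^{-1}(u;\beta)\,\xi_n(u;\beta), \tag{4}$$
where
$$S_n(u;\beta)=\begin{pmatrix}S_{n,0}(u;\beta) & S_{n,1}(u;\beta)\\ S_{n,1}(u;\beta) & S_{n,2}(u;\beta)\end{pmatrix},\qquad \xi_n(u;\beta)=\begin{pmatrix}\xi_{n,0}(u;\beta)\\ \xi_{n,1}(u;\beta)\end{pmatrix}, \tag{5}$$
with
$$S_{n,k}(u;\beta)=\frac{1}{n}\sum_{i=1}^{N}\sum_{j=1}^{n_i}z_{ij}z_{ij}^{T}\left(\frac{\beta^{T}x_{ij}-u}{h}\right)^{k}K_h(\beta^{T}x_{ij}-u),\qquad \xi_{n,k}(u;\beta)=\frac{1}{n}\sum_{i=1}^{N}\sum_{j=1}^{n_i}z_{ij}y_{ij}\left(\frac{\beta^{T}x_{ij}-u}{h}\right)^{k}K_h(\beta^{T}x_{ij}-u). \tag{6}$$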
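As an illustration of (4)–(6), the following is a minimal NumPy sketch of the local linear estimator for a fixed $\beta$. The function name `local_linear_vc`, the pooled array layout, and the use of a Gaussian kernel (the kernel used in the simulations of Section 5) are assumptions made for this example; they are not prescribed by the text.

```python
import numpy as np


def gaussian_kernel(t):
    """Standard normal density, used here as the kernel K (an assumption)."""
    return np.exp(-0.5 * t ** 2) / np.sqrt(2.0 * np.pi)


def local_linear_vc(u, beta, x, z, y, h):
    """Local linear estimates of g(u) and its derivative for a fixed beta.

    x: (n, p) covariates, z: (n, q) covariates, y: (n,) responses,
    pooled over all subjects and time points (hypothetical layout).
    Returns (g_hat, g_dot_hat), each of length q.
    """
    n, q = z.shape
    t = (x @ beta - u) / h            # (beta^T x_ij - u) / h
    w = gaussian_kernel(t) / h        # K_h(beta^T x_ij - u)

    def S(k):                         # S_{n,k}(u; beta) in (6)
        return (z * (w * t ** k)[:, None]).T @ z / n

    def xi(k):                        # xi_{n,k}(u; beta) in (6)
        return z.T @ (w * t ** k * y) / n

    S_n = np.block([[S(0), S(1)], [S(1), S(2)]])
    xi_n = np.concatenate([xi(0), xi(1)])
    sol = np.linalg.solve(S_n, xi_n)  # (g_hat^T, h * g_dot_hat^T)^T as in (4)
    return sol[:q], sol[q:] / h
```

When the coefficient functions are exactly linear in the index and the data are noise-free, a local linear fit reproduces them, which gives a simple sanity check for an implementation of this kind.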

Remark 1.

Since the estimator of $\dot{g}_0(u)$ converges more slowly than the estimator of $g_0(u)$ when the same bandwidth is used, the resulting estimator $\hat{\beta}$ of $\beta$ converges at a rate slower than $\sqrt{n}$. To increase the convergence rate of the estimator of $\dot{g}_0(u)$, we introduce another bandwidth $h_1$ in place of $h$ in $\hat{\dot{g}}(u;\beta)$ and denote the resulting estimator by $\hat{\dot{g}}_{h_1}(u;\beta)$.

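Similar to the approaches of Owen and of Shi and Lau [14], $\{\hat{r}_{ij}(\beta)=y_{ij}-\hat{g}^{T}(\beta^{T}x_{ij};\beta)z_{ij},\ i=1,\ldots,N;\ j=1,\ldots,n_i\}$ can be treated as a random sieve approximation of the random error sequence $\{\varepsilon_{ij},\ i=1,\ldots,N;\ j=1,\ldots,n_i\}$. To deal with the correlation within groups, we use the block empirical likelihood procedure proposed by You et al. [21]. Unlike the usual empirical likelihood method, the block empirical likelihood procedure treats the "data" $\hat{r}_{ij}(\beta)$, $j=1,\ldots,n_i$, of each subject as a whole. Let $\hat{\eta}_{ij}(\beta)=\hat{r}_{ij}(\beta)\,\hat{\dot{g}}^{T}(\beta^{T}x_{ij};\beta)z_{ij}\,x_{ij}\,\omega(\beta^{T}x_{ij})$ be $\eta_{ij}(\beta)$ with $g(\beta^{T}x_{ij})$ and $\dot{g}(\beta^{T}x_{ij})$ replaced by $\hat{g}(\beta^{T}x_{ij};\beta)$ and $\hat{\dot{g}}(\beta^{T}x_{ij};\beta)$, respectively, for $i=1,\ldots,N$; $j=1,\ldots,n_i$. Then an estimated block empirical likelihood function for $\beta$ is defined as
$$\hat{L}(\beta)=\max\left\{\prod_{i=1}^{N}p_i\ \middle|\ p_i\ge 0,\ \sum_{i=1}^{N}p_i=1,\ \sum_{i=1}^{N}p_i\sum_{j=1}^{n_i}\hat{\eta}_{ij}(\beta)=0\right\}. \tag{7}$$
For a given $\beta$, a unique maximum exists provided that 0 is inside the convex hull of the points $\sum_{j=1}^{n_i}\hat{\eta}_{ij}(\beta)$, $i=1,\ldots,N$. The maximum in (7) may be found via the method of Lagrange multipliers. The optimal value of $p_i$ in (7) can be shown to be
$$p_i=\frac{1}{N}\cdot\frac{1}{1+\lambda^{T}\sum_{j=1}^{n_i}\hat{\eta}_{ij}(\beta)}, \tag{8}$$
where the Lagrange multiplier $\lambda=(\lambda_1,\ldots,\lambda_p)^{T}$ is the solution of the equation
$$0=\frac{1}{N}\sum_{i=1}^{N}\frac{\sum_{j=1}^{n_i}\hat{\eta}_{ij}(\beta)}{1+\lambda^{T}\sum_{j=1}^{n_i}\hat{\eta}_{ij}(\beta)}. \tag{9}$$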

Since $p_1\times\cdots\times p_N$ is maximized at $p_i=1/N$ in the absence of parametric constraints, we define the corresponding estimated profile block empirical log-likelihood ratio as
$$\hat{l}(\beta)=-\sum_{i=1}^{N}\log\left[1+\lambda^{T}\sum_{j=1}^{n_i}\hat{\eta}_{ij}(\beta)\right]. \tag{10}$$

We will show in the next section that, if $\beta_0$ is the true parameter vector, $-2\hat{l}(\beta_0)$ is asymptotically distributed as a weighted sum of independent $\chi_1^2$ variables.
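The computation in (8)–(10) can be sketched as follows, assuming the vectors $\hat{\eta}_{ij}(\beta)$ have already been evaluated and are stored per subject. The function name `block_el_statistic`, the input layout, and the damped Newton iteration used to solve (9) are choices made for this illustration; the text does not specify a numerical solver.

```python
import numpy as np


def block_el_statistic(eta_blocks, max_iter=100, tol=1e-12):
    """Return (-2 * l_hat(beta), lambda) from per-subject eta_hat arrays.

    eta_blocks: list of N arrays; the i-th has shape (n_i, p) and holds
    eta_hat_ij(beta), j = 1..n_i, for subject i (hypothetical layout).
    """
    # Block sums T_i = sum_j eta_hat_ij(beta): one p-vector per subject.
    T = np.array([np.sum(block, axis=0) for block in eta_blocks])
    lam = np.zeros(T.shape[1])

    def dual(l):
        d = 1.0 + T @ l
        return -np.inf if np.any(d <= 0.0) else np.sum(np.log(d))

    for _ in range(max_iter):
        d = 1.0 + T @ lam
        Td = T / d[:, None]
        score = Td.sum(axis=0)                    # N times the right side of (9)
        if np.linalg.norm(score) < tol:
            break
        step = np.linalg.solve(Td.T @ Td, score)  # Newton direction
        s, current = 1.0, dual(lam)
        # Halve the step until every 1 + lambda^T T_i stays positive
        # and the dual objective does not decrease.
        while dual(lam + s * step) < current and s > 1e-10:
            s *= 0.5
        lam = lam + s * step
    return 2.0 * np.sum(np.log1p(T @ lam)), lam
```

For two subjects with block sums $-1$ and $2$ (scalar case), equation (9) gives $\lambda=1/4$, weights $p_1=2/3$ and $p_2=1/3$, and $-2\hat{l}=2\log(1.125)$, which the sketch reproduces.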

3. Theoretical Properties

Throughout this article, we assume that $N$ increases so that the total sample size $n=n_1+\cdots+n_N$ grows, while each $n_i$ stays fixed. To establish the nonparametric Wilks' theorem for $\hat{l}(\beta_0)$, we first make the following assumptions.

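(A.1) The density function $f(u)$ of $\beta^{T}x_{ij}$ is bounded away from zero for $u\in\mathcal{U}_{\omega}$ and $\beta$ near $\beta_0$ and satisfies the Lipschitz condition of order 1 on $\mathcal{U}_{\omega}$, where $\mathcal{U}_{\omega}$ is the support of $\omega(u)$.

(A.2) The functions $g_k(u)$, $1\le k\le q$, have continuous second derivatives on $\mathcal{U}_{\omega}$, where $g_k(u)$ is the $k$th component of $g(u)$.

(A.3) $E(\|x_{ij}\|^{6})<\infty$, $E(\|z_{ij}\|^{6})<\infty$, and $E(|\varepsilon_{ij}|^{6})<\infty$.

(A.4) $Nh^{2}/(\log N)^{2}\to\infty$, $Nh^{4}\log N\to 0$; $Nhh_1^{3}/(\log N)^{2}\to\infty$, $Nh_1^{5}=O(1)$.

(A.5) The kernel $K(\cdot)$ is a symmetric probability density function with bounded support, satisfies the Lipschitz condition of order 1, and $\int u^{2}K(u)\,du\neq 0$.

(A.6) The matrix $D(u)=E(z_{ij}z_{ij}^{T}\mid\beta_0^{T}x_{ij}=u)$ is positive definite, and each entry of $D(u)$ and $C(u)=E(v_{ij}z_{ij}^{T}\mid\beta_0^{T}x_{ij}=u)$ satisfies the Lipschitz condition of order 1 on $\mathcal{U}_{\omega}$, where $v_{ij}=x_{ij}\,\dot{g}_0^{T}(\beta_0^{T}x_{ij})z_{ij}\,\omega(\beta_0^{T}x_{ij})$, and $\mathcal{U}_{\omega}$ is defined in (A.1).

(A.7) The matrices $B(\beta_0)=E(v_{ij}v_{ij}^{T})$ and $B^{*}(\beta_0)=B(\beta_0)-E\{C(\beta_0^{T}x_{ij})\,\dot{g}_0(\beta_0^{T}x_{ij})\,E(x_{ij}^{T}\mid\beta_0^{T}x_{ij})\}$ are positive definite, where $v_{ij}$ is defined in (A.6).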

Remark 2.

Condition (A.1) bounds the density function of $\beta^{T}x_{ij}$ away from zero. This ensures that the denominators of $\hat{g}(u;\beta)$ and $\hat{\dot{g}}(u;\beta)$ are, with probability one, bounded away from 0 for $u\in\mathcal{U}_{\omega}$. The second-derivative requirement in (A.2) is a standard smoothness condition. Conditions (A.3)–(A.5) are needed for the asymptotic normality or the uniform consistency of the estimators. It should be pointed out that condition (A.3) can be replaced by $E(\|x_{ij}\|^{6+\delta})<\infty$, $E(\|z_{ij}\|^{6+\delta})<\infty$, and $E(|\varepsilon_{ij}|^{6+\delta})<\infty$ for some $\delta>0$. In the current work, the moment order is set to 6 because it is the smallest value under which the asymptotic normality or the uniform consistency of the estimators is obtained. Conditions (A.6) and (A.7) ensure that the asymptotic variance of the estimator of $\beta_0$ exists.

Let $\mathcal{B}=\{\beta\in R^{p}:\|\beta\|=1$, and the first nonzero element is positive$\}$. Then $\beta_0$ is an inner point of the set $\mathcal{B}$. The following theorem shows that $-2\hat{l}(\beta_0)$ is asymptotically distributed as a weighted sum of independent $\chi_1^2$ variables.

Theorem 3.

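Suppose that (A.1)–(A.7) hold. Then, as $N\to\infty$,
$$-2\hat{l}(\beta_0)\xrightarrow{D}\omega_1\chi_{1,1}^{2}+\cdots+\omega_p\chi_{1,p}^{2}, \tag{11}$$
where $\xrightarrow{D}$ denotes convergence in distribution, $\chi_{1,1}^{2},\ldots,\chi_{1,p}^{2}$ are independent $\chi_1^2$ variables, and the weights $\omega_j$, $1\le j\le p$, are the eigenvalues of $G(\beta_0)=B^{-1}(\beta_0)A(\beta_0)$. Here $B(\beta_0)$ is defined in condition (A.7),
$$A(\beta_0)=B(\beta_0)-E\{C(\beta_0^{T}x_{ij})D^{-1}(\beta_0^{T}x_{ij})C^{T}(\beta_0^{T}x_{ij})\}, \tag{12}$$
and $C(u)$ and $D(u)$ are defined in condition (A.6).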

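To apply Theorem 3 to construct a confidence region or interval for $\beta_0$, we need to estimate the unknown weights $\omega_j$ consistently. By the plug-in method, $A(\beta_0)$ and $B(\beta_0)$ can be consistently estimated by
$$\hat{A}(\hat{\beta})=\frac{1}{N}\sum_{i=1}^{N}\sum_{j=1}^{n_i}\left\{\hat{v}_{ij}\hat{v}_{ij}^{T}-\hat{C}(\hat{\beta}^{T}x_{ij})\hat{D}^{-1}(\hat{\beta}^{T}x_{ij})\hat{C}^{T}(\hat{\beta}^{T}x_{ij})\right\}, \tag{13}$$
$$\hat{B}(\hat{\beta})=\frac{1}{N}\sum_{i=1}^{N}\sum_{j=1}^{n_i}\hat{v}_{ij}\hat{v}_{ij}^{T}, \tag{14}$$
respectively, where $\hat{\beta}$ is the maximum empirical likelihood estimator of $\beta_0$ defined by (9), $\hat{v}_{ij}=x_{ij}\,\hat{\dot{g}}^{T}(\hat{\beta}^{T}x_{ij};\hat{\beta})z_{ij}\,\omega(\hat{\beta}^{T}x_{ij})$, $\hat{C}(\cdot)=\sum_{i=1}^{N}\sum_{j=1}^{n_i}W_{nij}(\cdot)\hat{v}_{ij}z_{ij}^{T}$, and $\hat{D}(\cdot)=\sum_{i=1}^{N}\sum_{j=1}^{n_i}W_{nij}(\cdot)z_{ij}z_{ij}^{T}$ with
$$W_{nij}(\cdot)=\frac{K_1\left((\hat{\beta}^{T}x_{ij}-\cdot)/b_n\right)}{\sum_{k=1}^{N}\sum_{l=1}^{n_k}K_1\left((\hat{\beta}^{T}x_{kl}-\cdot)/b_n\right)}, \tag{15}$$
where $K_1(\cdot)$ is a kernel function and $b_n$ is a bandwidth with $0<b_n\to 0$.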

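This implies that the eigenvalues of $\hat{G}(\hat{\beta})=\hat{B}^{-1}(\hat{\beta})\hat{A}(\hat{\beta})$, say $\hat{\omega}_j$, consistently estimate $\omega_j$ for $j=1,\ldots,p$. Let $\hat{c}_{1-\alpha}$ be the $1-\alpha$ quantile of the conditional distribution of the weighted sum $\hat{s}=\hat{\omega}_1\chi_{1,1}^{2}+\cdots+\hat{\omega}_p\chi_{1,p}^{2}$ given the data. Then an approximate $1-\alpha$ confidence region for $\beta_0$ can be defined as
$$\mathcal{R}(\alpha)=\{\beta:-2\hat{l}(\beta)\le\hat{c}_{1-\alpha}\}. \tag{16}$$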

In practice, the conditional distribution of the weighted sum $\hat{s}$, given the sample $\{(y_{ij},x_{ij},z_{ij}),\ 1\le i\le N;\ 1\le j\le n_i\}$, can be approximated by Monte Carlo simulation, by repeatedly generating independent samples $\chi_{1,1}^{2},\ldots,\chi_{1,p}^{2}$ from the $\chi_1^2$ distribution.
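The Monte Carlo step just described can be sketched with the Python standard library alone. The function name `weighted_chi2_quantile`, the number of draws, and the fixed seed are illustrative assumptions; the estimated weights $\hat{\omega}_j$ are taken as given.

```python
import random


def weighted_chi2_quantile(weights, level=0.95, n_draws=20000, seed=0):
    """Monte Carlo quantile of sum_j weights[j] * chi2_1 with independent terms."""
    rng = random.Random(seed)
    # Each chi2_1 draw is the square of a standard normal draw.
    draws = sorted(
        sum(w * rng.gauss(0.0, 1.0) ** 2 for w in weights)
        for _ in range(n_draws)
    )
    # Order statistic approximating the `level` quantile.
    index = min(n_draws - 1, int(level * n_draws))
    return draws[index]
```

The returned value plays the role of $\hat{c}_{1-\alpha}$ in (16): a candidate $\beta$ belongs to the region when $-2\hat{l}(\beta)$ does not exceed it. With a single unit weight the result should be close to the usual $\chi_1^2$ critical value of about 3.84.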

In addition to the above direct way of approximating the asymptotic distribution, we can also consider the following alternative, which is motivated by the results of Rao and Scott [24]. We propose an adjusted empirical log-likelihood whose asymptotic distribution is chi-squared with $p$ degrees of freedom. The adjustment technique was developed by Wang and Rao [25] using an approximation result in Rao and Scott [24]. Note that $\hat{\rho}(\beta)$ can be written as
$$\hat{\rho}(\beta)=\frac{\mathrm{tr}\{\hat{A}^{-}(\beta)\hat{A}(\beta)\}}{\mathrm{tr}\{\hat{B}^{-1}(\beta)\hat{A}(\beta)\}}. \tag{17}$$

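By examining the asymptotic expansion of $-2\hat{l}(\beta)$, which is specified in the proof of Theorem 4 below, we define the adjustment factor
$$\hat{r}(\beta)=\frac{\mathrm{tr}\{\hat{A}^{-}(\beta)\hat{\Sigma}(\beta)\}}{\mathrm{tr}\{\hat{B}^{-1}(\beta)\hat{\Sigma}(\beta)\}}, \tag{18}$$
obtained by replacing $\hat{A}(\beta)$ in $\hat{\rho}(\beta)$ by $\hat{\Sigma}(\beta)$, where $\hat{\Sigma}(\beta)=\left\{\sum_{i=1}^{N}\sum_{j=1}^{n_i}\hat{\eta}_{ij}(\beta)\right\}\left\{\sum_{i=1}^{N}\sum_{j=1}^{n_i}\hat{\eta}_{ij}(\beta)\right\}^{T}$. The adjusted empirical log-likelihood ratio is defined by
$$\hat{l}_a(\beta)=\hat{r}(\beta)\{-2\hat{l}(\beta)\}, \tag{19}$$
where $\hat{l}(\beta)$ is defined in (10).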

Theorem 4.

Suppose that conditions (A.1)–(A.6) hold. Then $\hat{l}_a(\beta_0)\xrightarrow{D}\chi_p^{2}$.

According to Theorem 4, $\hat{l}_a(\beta)$ can be used to construct an approximate confidence region for $\beta_0$. Let
$$\mathcal{R}_a(\alpha)=\{\beta:\hat{l}_a(\beta)\le\chi_p^{2}(1-\alpha)\}. \tag{20}$$
Then $\mathcal{R}_a(\alpha)$ gives a confidence region for $\beta_0$ with asymptotically correct coverage probability $1-\alpha$.

4. Bandwidth Selection

For practical implementation, the tuning parameters need to be chosen. We employ a data-driven procedure to choose the tuning parameter $h$, which controls the smoothness of $\hat{g}(\cdot)$ and $\hat{\dot{g}}(\cdot)$. Various existing bandwidth selection techniques for nonparametric regression, such as cross-validation, generalized cross-validation, and the modified multifold cross-validation criterion, can be adapted for the estimation of $\hat{g}(\cdot)$ and $\hat{\dot{g}}(\cdot)$. Because the modified multifold cross-validation algorithm proposed by Cai et al. [26] is simple and quick, we use it throughout the empirical studies in this paper. Specifically, let $m$ and $M$ be two given positive integers with $n>mM$. The basic idea is first to use $M$ subseries of lengths $n-km$ ($k=1,\ldots,M$) to estimate the unknown coefficient functions and then to compute the one-step forecasting error for the next section of the sample of length $m$ based on the estimated models. More precisely, we choose the $h$ that minimizes
$$\mathrm{AMS}(h)=\sum_{k=1}^{M}\frac{1}{m}\sum_{i=n-km+1}^{n-km+m}\sum_{j=1}^{n_i}\left\{y_{ij}-\sum_{l=1}^{q}\hat{g}_{l,k}(\beta^{T}x_{ij})z_{ij}^{(l)}\right\}^{2}, \tag{21}$$
where the $\hat{g}_{l,k}(\cdot)$ are computed from the sample $\{(y_{ij},x_{ij},z_{ij}),\ 1\le i\le n-km;\ 1\le j\le n_i\}$ with bandwidth equal to $h\,(n/(n-km))^{1/5}$. Note that for different sample sizes, we rescale the bandwidth according to its optimal rate, that is, $h\propto n^{-1/5}$. Since the selected bandwidth does not depend critically on the choice of $m$ and $M$, for computational expediency we take $m=[0.1n]$ and $M=5$ in our simulation.
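The structure of criterion (21) can be sketched as follows. This is only a skeleton under stated assumptions: the fitting and prediction routines are passed in as hypothetical callbacks `fit` and `predict`, the data are held as a time-ordered list of units, and $n$ in (21) is read as the number of such units.

```python
def ams(h, subjects, m, M, fit, predict):
    """Accumulated one-step forecasting error, in the spirit of (21).

    subjects: list of units in time order; each unit is a list of
        (covariates, y) pairs (hypothetical layout).
    fit(train_subjects, bandwidth) -> model and
    predict(model, covariates) -> fitted value are user-supplied callbacks.
    """
    n = len(subjects)
    if n <= m * M:
        raise ValueError("need n > m * M")
    total = 0.0
    for k in range(1, M + 1):
        split = n - k * m
        train, test = subjects[:split], subjects[split:split + m]
        h_k = h * (n / split) ** 0.2      # rescaled bandwidth h (n/(n-km))^{1/5}
        model = fit(train, h_k)
        sq_err = sum(
            (y - predict(model, cov)) ** 2
            for subject in test
            for cov, y in subject
        )
        total += sq_err / m
    return total
```

A bandwidth could then be chosen by evaluating `ams` over a grid of candidate values of `h` and keeping the minimizer; the grid itself is left to the user.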

Let $h_{\mathrm{opt}}$ be the bandwidth obtained by minimizing (21) with respect to $h>0$; that is, $h_{\mathrm{opt}}=\arg\min_{h>0}\mathrm{AMS}(h)$. Then $h_{\mathrm{opt}}$ is the optimal bandwidth for estimating $g(\cdot)$. When calculating the block empirical likelihood ratios and the estimator of $\beta_0$, we use the approximate bandwidths
$$h=h_{\mathrm{opt}}\,n^{-1/20}(\log n)^{-1/2},\qquad h_1=h_{\mathrm{opt}}, \tag{22}$$
because this ensures that the required bandwidth has the correct order of magnitude for optimal asymptotic performance (see, e.g., Carroll et al. [27]) and that the bandwidth $h$ satisfies condition (A.4).
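As a small worked example of (22), the helper below converts a cross-validated bandwidth into the pair $(h,h_1)$. The function name `working_bandwidths` is hypothetical, and the natural logarithm is assumed for $\log n$.

```python
import math


def working_bandwidths(h_opt, n):
    """Return (h, h1) as in (22) for a cross-validated bandwidth h_opt."""
    h = h_opt * n ** (-1.0 / 20.0) * math.log(n) ** (-0.5)
    h1 = h_opt
    return h, h1
```

For $n=100$ the multiplier $n^{-1/20}(\log n)^{-1/2}$ is about 0.37, so the bandwidth used for $\hat{g}$ is noticeably smaller than $h_{\mathrm{opt}}$, while the derivative estimator keeps $h_1=h_{\mathrm{opt}}$.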

5. A Simulation Study

In this section, we carry out some simulations to study the finite sample performance of the estimated block empirical likelihood method.

Example 5.

The data are generated from
$$Y_{ij}=g_0(\beta X_{ij})+g_1(\beta X_{ij})Z_{ij}+\varepsilon_{ij},\quad i=1,\ldots,N,\ j=1,\ldots,n_i, \tag{23}$$
where $X_{ij}\sim N(0,1)$, $Z_{ij}\sim N(0,1)$, $g(t)=\sin(2\pi t)$, $\varepsilon_{ij}=a\,\varepsilon_{i,j-1}+e_{ij}$, and the $e_{ij}$ are i.i.d. $N(0,1)$. For each combination of $N$, $n_i$, and $a$, 1000 samples are generated from the above model. For each sample, a 95% confidence interval for $\beta=2$ is computed using our estimated block empirical likelihood method. For the smoother, we use a local linear smoother with the Gaussian kernel $K_h(t)=(h\sqrt{2\pi})^{-1}\exp\{-t^{2}/(2h^{2})\}$, with the bandwidth chosen by the modified multifold cross-validation criterion in all smoothing steps. Some representative coverage probabilities (CP) and confidence intervals (CI) are reported in Table 1. The simulation results show that our estimated block empirical likelihood confidence regions have coverage probabilities close to the nominal 95% level (between 0.927 and 0.943 in Table 1) and short average confidence interval lengths.

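Table 1: The selection probabilities of adaptive EL shrinkage estimation.

| $a$ | $N$ | Number of replicates | CI | CP |
|---|---|---|---|---|
| 0.2 | 50 | $n_1=\cdots=n_{25}=4$, $n_{26}=\cdots=n_{50}=4$ | [1.645015, 2.349996] | 0.9372 |
| 0.2 | 50 | $n_1=\cdots=n_{25}=4$, $n_{26}=\cdots=n_{50}=5$ | [1.634619, 2.355068] | 0.9270 |
| 0.2 | 100 | $n_1=\cdots=n_{25}=4$, $n_{26}=\cdots=n_{100}=4$ | [1.634619, 2.355068] | 0.9343 |
| 0.2 | 100 | $n_1=\cdots=n_{25}=4$, $n_{26}=\cdots=n_{100}=5$ | [1.634619, 2.355068] | 0.9424 |
| 0.4 | 50 | $n_1=\cdots=n_{25}=4$, $n_{26}=\cdots=n_{50}=4$ | [1.634619, 2.355068] | 0.9428 |
| 0.4 | 50 | $n_1=\cdots=n_{25}=4$, $n_{26}=\cdots=n_{50}=5$ | [1.634619, 2.355068] | 0.9334 |
| 0.4 | 100 | $n_1=\cdots=n_{25}=4$, $n_{26}=\cdots=n_{100}=4$ | [1.634619, 2.355068] | 0.9427 |
| 0.4 | 100 | $n_1=\cdots=n_{25}=4$, $n_{26}=\cdots=n_{100}=5$ | [1.634619, 2.355068] | 0.9351 |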
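One replicate of the design in Example 5 could be generated along the following lines. The text gives a single function $g(t)=\sin(2\pi t)$, so this sketch assumes $g_0=g_1=g$; it also assumes that each subject's AR(1) error starts from $\varepsilon_{i,0}=0$. Both choices, and the function name `simulate_example5`, are assumptions of the illustration.

```python
import math
import random


def link(t):
    """g(t) = sin(2*pi*t), used here for both g0 and g1 (an assumption)."""
    return math.sin(2.0 * math.pi * t)


def simulate_example5(block_sizes, a, beta=2.0, seed=0):
    """Generate one data set from (23).

    block_sizes: list of n_i, one entry per subject.
    Returns a list of subjects; each subject is a list of (x, z, y, eps) tuples.
    """
    rng = random.Random(seed)
    data = []
    for n_i in block_sizes:
        eps_prev, subject = 0.0, []           # eps_{i,0} = 0 (assumption)
        for _ in range(n_i):
            x, z = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
            eps = a * eps_prev + rng.gauss(0.0, 1.0)   # AR(1) within subject
            y = link(beta * x) + link(beta * x) * z + eps
            subject.append((x, z, y, eps))
            eps_prev = eps
        data.append(subject)
    return data
```

For instance, `simulate_example5([4] * 25 + [5] * 25, a=0.2)` mimics the second row of Table 1, with 50 subjects of which 25 have four observations and 25 have five.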

Example 6.

Consider the regression model
$$Y_{ij}=g_0(\beta_0^{T}X_{ij})+g_1(\beta_0^{T}X_{ij})Z_{ij}^{(1)}+g_2(\beta_0^{T}X_{ij})Z_{ij}^{(2)}+\varepsilon_{ij}, \tag{24}$$
where $\beta_0=(1/3,2/3)^{T}$ and the $\varepsilon_{ij}$ are independent $N(0,1^{2})$ random variables. The sample $\{X_{ij}=(X_{ij}^{(1)},X_{ij}^{(2)})^{T};\ 1\le i\le N,\ 1\le j\le n_i\}$ was generated from a bivariate uniform distribution on $[-1,1]^{2}$ with independent components, and $\{Z_{ij}=(Z_{ij}^{(1)},Z_{ij}^{(2)})^{T};\ 1\le i\le N,\ 1\le j\le n_i\}$ was generated from a bivariate normal distribution $N(0,\Sigma)$ with $\mathrm{var}(Z_{ij}^{(1)})=\mathrm{var}(Z_{ij}^{(2)})=1$ and correlation coefficient $\rho=0.5$ between $Z_{ij}^{(1)}$ and $Z_{ij}^{(2)}$. In model (24), the coefficient functions are $g_0(u)=8\exp(-2u^{2})$, $g_1(u)=6u^{2}$, and $g_2(u)=4\sin(\pi u)$.

For the smoother, we use a local linear smoother with the Gaussian kernel $K_h(u)=(h\sqrt{2\pi})^{-1}\exp\{-u^{2}/(2h^{2})\}$, and we use the modified multifold cross-validation criterion proposed by Cai et al. [26] to select the bandwidth in all smoothing steps because the algorithm is simple and quick. We take the weight function $\omega(u)=I_{[-1,1]}(u)$. The sample size of the simulated data is 100, and each simulation is repeated 1000 times.

The confidence regions of β0 and their coverage probabilities, with nominal level 1-α=0.95, were computed from 1000 runs. The estimated block empirical likelihood was used to construct the confidence regions. The simulated results are given in Figure 1. Simulation results show that our block empirical likelihood confidence regions have high coverage probabilities and short average confidence interval lengths.

Figure 1: Averages of the 95% confidence regions of $(\beta_1,\beta_2)$ based on EEL (solid curve) and AEL (dashed curve) when $n=100$ in Example 6.

The histograms of the 1000 estimators of the parameters β1 and β2 are in Figures 2(a) and 2(b), respectively. The Q-Q plots of the 1000 estimators of the parameters β1 and β2 are in Figures 3(a) and 3(b), respectively. Figures 2 and 3 show empirically that these estimators are asymptotically normal. The means of the estimates of the unknown parameters β1 and β2 are 0.33342 and 0.66673, respectively, and their biases (standard deviations) are 0.000128 (0.00308) and 0.000603 (0.00352), respectively.

Figure 2: The histograms of the 1000 estimators of each parameter, with the estimated density curve (solid curve) and the normal density curve (dashed curve): (a) for $\beta_1$ and (b) for $\beta_2$.

Figure 3: The Q-Q plots of the 1000 estimators of each parameter: (a) for $\beta_1$ and (b) for $\beta_2$.

We also consider the average estimates of the coefficient functions $g_0(u)$, $g_1(u)$, and $g_2(u)$ over the 1000 replicates. The estimators $\hat{g}_j(\cdot)$ are assessed via the root mean squared errors (RMSE); that is, $\mathrm{RMSE}=\sum_{j=0}^{2}\mathrm{RMSE}_j$, where
$$\mathrm{RMSE}_j=\left[n_{\mathrm{grid}}^{-1}\sum_{k=1}^{n_{\mathrm{grid}}}\{\hat{g}_j(u_k)-g_j(u_k)\}^{2}\right]^{1/2}, \tag{25}$$
and $\{u_k,\ k=1,\ldots,n_{\mathrm{grid}}\}$ are regular grid points. The boxplot of the 1000 RMSEs is given in Figure 4. From Figures 4(a)–4(c) we see that every estimated curve agrees very closely with the true function curve. Figure 4(d) shows that all RMSEs of the estimates of the unknown functions are very small.

Figure 4: (a)–(c) The true curve (solid curve) and the estimated curve (dashed curve); (d) the boxplots of the 1000 RMSE values in the estimation of $g_0(\cdot)$, $g_1(\cdot)$, and $g_2(\cdot)$ and of the sum of the three RMSEs.
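The error measure in (25) is straightforward to compute once the estimated and true curves are evaluated on a common grid. The helper names `rmse` and `total_rmse` below are illustrative, and the estimated values are assumed to be supplied by the user.

```python
import math


def rmse(g_hat_values, g_true_values):
    """RMSE_j in (25) from estimated and true values on the same grid."""
    n_grid = len(g_true_values)
    sq = sum((gh - g) ** 2 for gh, g in zip(g_hat_values, g_true_values))
    return math.sqrt(sq / n_grid)


def total_rmse(estimates, truths):
    """Sum of RMSE_j over the coefficient functions."""
    return sum(rmse(gh, g) for gh, g in zip(estimates, truths))
```

As a check, shifting the three true curves of Example 6 by the constants 0.1, $-0.2$, and 0 yields individual RMSEs of 0.1, 0.2, and 0, and hence a total of 0.3.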

Example 7.

We now apply the block empirical likelihood method to analyze the data from a longitudinal hormone study. The study involved 34 women whose urine samples were collected in one menstrual cycle and whose urinary progesterone was assayed on alternate days. A total of 492 observations were obtained, with each woman contributing from 11 to 28 observations over time. Each woman's cycle length was standardized uniformly to a reference 28-day cycle, since the change of the progesterone level for each woman depends on time during a menstrual cycle. We consider the following model:
$$Y_{ij}=g_0(\beta_1\mathrm{AGE}_{ij}+\beta_2\mathrm{BMI}_{ij})+g_1(\beta_1\mathrm{AGE}_{ij}+\beta_2\mathrm{BMI}_{ij})\,t_{ij}+\varepsilon_{ij}, \tag{26}$$
where $Y_{ij}$ is the $j$th log-transformed progesterone value measured at standardized day $t_{ij}$ since menstruation for the $i$th woman, and $\mathrm{AGE}_{ij}$ and $\mathrm{BMI}_{ij}$ are the age and body mass index of the $i$th individual at day $t_{ij}$, respectively.

We apply the block empirical likelihood method to fit the data. Because we focus on $\beta_1$ and $\beta_2$, we summarize only their estimators in Figure 5. We denote by $\beta_{\mathrm{ind}}$ and $\beta_{\mathrm{AR}}$ the estimators of $\beta=(\beta_1,\beta_2)$ when the correlation structures are specified as independence and first-order autoregressive, respectively. We see from Figure 5 that both $\beta_{\mathrm{ind}}$ and $\beta_{\mathrm{AR}}$ are significant for neither of confidence regions for the two estimators including $(0,0)$. Therefore, we conclude that the parameters $\beta_1$ and $\beta_2$ are not significant, which is consistent with the conclusion of Zhang et al. [28].

Figure 5: The 0.95 confidence regions for the regression coefficients $\beta_1$ and $\beta_2$ with the correlation structure specified as independence (dotted curve) and first-order autoregressive (solid curve).

6. Proof of the Theorem

To prove Theorem 3, we introduce several lemmas. The following lemma gives the uniform convergence rates of $\hat{g}(u;\beta)$ and $\hat{\dot{g}}(u;\beta)$; it is a straightforward extension of known results in nonparametric function estimation. Moreover, the proofs of Lemmas 9 and 10 are similar to those of the corresponding lemmas of Xue and Wang [9]. We hence omit these proofs.

Lemma 8.

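Let $\mathcal{B}_n=\{\beta:\|\beta-\beta_0\|\le c_0n^{-1/2}\}$ for some positive constant $c_0$. Suppose that conditions (A.1)–(A.3), (A.5), and (A.6) hold. Then
$$\sup_{u\in\mathcal{U}_{\omega},\,\beta\in\mathcal{B}_n}\|\hat{g}(u;\beta)-g_0(u)\|=O_p\left(\left\{\frac{\log(1/h)}{nh}\right\}^{1/2}+h^{2}\right),\qquad \sup_{u\in\mathcal{U}_{\omega},\,\beta\in\mathcal{B}_n}\|\hat{\dot{g}}(u;\beta)-\dot{g}_0(u)\|=O_p\left(\left\{\frac{\log(1/h)}{nh^{3}}\right\}^{1/2}+h\right). \tag{27}$$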

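To state Lemma 9, we use the following notation. Denote $\mathcal{G}=\{g:\mathcal{U}_{\omega}\times\mathcal{B}_n\to R^{q}\}$ and $\|g\|_{\mathcal{G}}=\sup_{u\in\mathcal{U}_{\omega},\,\beta\in\mathcal{B}_n}\|g(u;\beta)\|$. From Lemma 8, we have $\|\hat{g}-g_0\|_{\mathcal{G}}=o_p(1)$ and $\|\hat{\dot{g}}-\dot{g}_0\|_{\mathcal{G}}=o_p(1)$; hence, we can assume that $g$ lies in $\mathcal{G}_{\delta}$ with $\delta=\delta_n\to 0$ and $\delta>0$, where
$$\mathcal{G}_{\delta}=\{g\in\mathcal{G}:\|g-g_0\|_{\mathcal{G}}\le\delta,\ \|\dot{g}-\dot{g}_0\|_{\mathcal{G}}\le\delta\}. \tag{28}$$
Let $g_0(\beta^{T}\mathcal{X};\beta)=E\{g_0(\beta_0^{T}\mathcal{X})\mid\beta^{T}\mathcal{X}\}$ and $\dot{g}_0(\beta^{T}\mathcal{X};\beta)=E\{\dot{g}_0(\beta_0^{T}\mathcal{X})\mid\beta^{T}\mathcal{X}\}$,
$$Q(g,\beta)=E\left[\{\mathcal{Y}-g^{T}(\beta^{T}\mathcal{X};\beta)\mathcal{Z}\}\,\dot{g}^{T}(\beta^{T}\mathcal{X};\beta)\mathcal{Z}\,\mathcal{X}\,\omega(\beta^{T}\mathcal{X})\right], \tag{29}$$
$$Q_n(g,\beta)=\frac{1}{n}\sum_{i=1}^{N}\sum_{j=1}^{n_i}\{y_{ij}-g^{T}(\beta^{T}x_{ij};\beta)z_{ij}\}\,\dot{g}^{T}(\beta^{T}x_{ij};\beta)z_{ij}\,x_{ij}\,\omega(\beta^{T}x_{ij}). \tag{30}$$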

Lemma 9.

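Suppose that conditions (A.1)–(A.6) hold. Let
$$\begin{aligned}
J_1(g,\beta)&=Q_n(g,\beta)-Q(g,\beta)-Q_n(g_0,\beta_0),\\
J_2(\hat{g},\beta)&=Q(g,\beta)-Q(g_0,\beta)-\varpi(g_0(\beta^{T}\mathcal{X};\beta);\beta)\{g(\beta^{T}\mathcal{X};\beta)-g_0(\beta^{T}\mathcal{X})\},\\
J_3(g,\beta)&=\varpi(g_0(\beta^{T}\mathcal{X}),\beta)\{g(\beta^{T}\mathcal{X};\beta)-g_0(\beta^{T}\mathcal{X})\}-\varpi(g_0(\beta^{T}\mathcal{X};\beta),\beta_0)\{g(\beta_0^{T}\mathcal{X};\beta_0)-g_0(\beta_0^{T}\mathcal{X};\beta)\},\\
J_4(\hat{g},\beta_0)&=Q_n(g_0,\beta_0)+\varpi(g_0(\beta^{T}\mathcal{X}),\beta)\{g(\beta^{T}\mathcal{X};\beta)-g_0(\beta^{T}\mathcal{X})\}.
\end{aligned} \tag{31}$$
Then
$$\sup_{(g,\beta)\in\mathcal{G}_{\delta}\times\mathcal{B}_n}\|J_1(g,\beta)\|=o_p(n^{-1/2}), \tag{32}$$
$$\sup_{\beta\in\mathcal{B}_n}\|J_2(\hat{g},\beta)\|=o_p(n^{-1/2}), \tag{33}$$
$$\sup_{(g,\beta)\in\mathcal{G}_{\delta}\times\mathcal{B}_n}\|J_3(g,\beta)\|=o(n^{-1/2}), \tag{34}$$
$$\sqrt{n}\,J_4(\hat{g},\beta_0)\xrightarrow{D}N(0,\sigma^{2}A(\beta_0)), \tag{35}$$
where $A(\beta_0)$ is defined in (12).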

Lemma 10.

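Suppose that conditions (A.1)–(A.6) hold. Then
$$\sup_{\beta\in\mathcal{B}_n}\|Q_n(\hat{g},\beta)\|=O_p(n^{-1/2}), \tag{36}$$
$$\sup_{\beta\in\mathcal{B}_n}\|R_n(\beta)-\sigma^{2}B(\beta_0)\|=o_p(1), \tag{37}$$
$$\sup_{\beta\in\mathcal{B}_n}\max_{1\le i\le N,\,1\le j\le n_i}\|\hat{\eta}_{ij}(\beta)\|=o_p(n^{1/2}), \tag{38}$$
$$\sup_{\beta\in\mathcal{B}_n}\|\lambda(\beta)\|=O_p(n^{-1/2}), \tag{39}$$
where $Q_n(\hat{g},\beta)$ is defined in (30), $R_n(\beta)=n^{-1}\sum_{i=1}^{N}\sum_{j=1}^{n_i}\hat{\eta}_{ij}(\beta)\hat{\eta}_{ij}^{T}(\beta)$, $B(\beta_0)$ is defined in condition (A.7), and $\hat{\eta}_{ij}(\beta)$ is the estimated version of $\eta_{ij}(\beta)$ in (2).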

Proof of Theorem 3.

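Note that Lemma 10 also holds when $\beta=\beta_0$. Applying the Taylor expansion to (10) and invoking Lemma 10, we obtain
$$-2\hat{l}(\beta_0)=2\sum_{i=1}^{N}\sum_{j=1}^{n_i}\left[\lambda^{T}\hat{\eta}_{ij}(\beta_0)-\frac{1}{2}\{\lambda^{T}\hat{\eta}_{ij}(\beta_0)\}^{2}\right]+o_p(1). \tag{40}$$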

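By (9) and Lemma 10, we have
$$\sum_{i=1}^{N}\sum_{j=1}^{n_i}\{\lambda^{T}\hat{\eta}_{ij}(\beta_0)\}^{2}=\sum_{i=1}^{N}\sum_{j=1}^{n_i}\lambda^{T}\hat{\eta}_{ij}(\beta_0)+o_p(1),\qquad \lambda=\left\{\sum_{i=1}^{N}\sum_{j=1}^{n_i}\hat{\eta}_{ij}(\beta_0)\hat{\eta}_{ij}^{T}(\beta_0)\right\}^{-1}\sum_{i=1}^{N}\sum_{j=1}^{n_i}\hat{\eta}_{ij}(\beta_0)+o_p(n^{-1/2}). \tag{41}$$
This together with (40) proves that
$$-2\hat{l}(\beta_0)=n\,Q_n^{T}(\hat{g},\beta_0)R_n^{-1}(\beta_0)Q_n(\hat{g},\beta_0)+o_p(1), \tag{42}$$
where $Q_n(\hat{g},\beta_0)$ and $R_n(\beta_0)$ are defined in (30) and (37), respectively. From (37) of Lemma 10 and (42), we obtain
$$-2\hat{l}(\beta_0)=\left\{(\sigma^{2}A)^{-1/2}\sqrt{n}\,Q_n(\hat{g},\beta_0)\right\}^{T}G(\beta_0)\left\{(\sigma^{2}A)^{-1/2}\sqrt{n}\,Q_n(\hat{g},\beta_0)\right\}+o_p(1), \tag{43}$$
where $G(\beta_0)=A^{1/2}(\beta_0)B^{-1}(\beta_0)A^{1/2}(\beta_0)$. Let $G_0=\mathrm{diag}(\omega_1,\ldots,\omega_p)$, where $\omega_i$, $1\le i\le p$, are the eigenvalues of $G(\beta_0)$. Then there exists an orthogonal matrix $H$ such that $H^{T}G_0H=G(\beta_0)$. Using the notation of Lemma 9, we have
$$Q_n(\hat{g},\beta)=J_1(\hat{g},\beta)+J_2(\hat{g},\beta)+J_3(\hat{g},\beta)+J_4(\hat{g},\beta)+Q(g_0,\beta). \tag{44}$$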

Noting that $Q(g_0,\beta_0)=0$, from (44) and Lemma 9 we have
$$Q_n(\hat{g},\beta_0)=J_4(\hat{g},\beta_0)+o_p(n^{-1/2}). \tag{45}$$

Hence, by (35) of Lemma 9, we have
$$H\{\sigma^{-2}A^{-}(\beta_0)\}^{1/2}\sqrt{n}\,Q_n(\hat{g},\beta_0)\xrightarrow{D}N(0,I_p), \tag{46}$$
where $I_p$ is the $p\times p$ identity matrix. This together with (43) proves Theorem 3.

Proof of Theorem 4.

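By Lemma 10 and arguments similar to the proof of (42), we obtain
$$\hat{l}(\beta)=-\frac{n}{2}Q_n^{T}(\hat{g},\beta)\{\sigma^{2}B(\beta)\}^{-1}Q_n(\hat{g},\beta)+o_p(1), \tag{47}$$
uniformly for $\beta\in\mathcal{B}_n$, where $o_p(1)$ tends to 0 in probability uniformly for $\beta\in\mathcal{B}_n$. Note that $\hat{A}(\beta_0)\xrightarrow{p}A(\beta_0)$ and $\hat{B}(\beta_0)\xrightarrow{p}B(\beta_0)$. By the expansion of $\hat{l}_a(\beta_0)$, defined in (19), and (47), we get
$$\hat{l}_a(\beta_0)=n\,Q_n^{T}(\hat{g},\beta_0)\{\sigma^{-2}A^{-}(\beta_0)\}Q_n(\hat{g},\beta_0)+o_p(1). \tag{48}$$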

This together with (44) and (48) proves Theorem 4.

This completes the proof.

Acknowledgments

This research was supported by NNSF project (11171188 and 11231005) of China, Mathematical Finance-Backward Stochastic Analysis and Computations in Financial Risk Control of China (11221061), NSF and SRRF projects (ZR2010AZ001 and BS2011SF006) of Shandong Province of China, K C Wong-HKBU Fellowship Programme for Mainland China Scholars 2010-11, and the Fundamental Research Funds for the Central Universities (27R1310008A).

References

[1] Huang Z. Empirical likelihood for single-index varying-coefficient models with right-censored data. Journal of the Korean Statistical Society. 2010;39(4):533–544. doi:10.1016/j.jkss.2009.12.002. MR2780223.
[2] Gao Z., Kong D., Gao C. Modeling and control of complex dynamic systems: applied mathematical aspects. Journal of Applied Mathematics. 2012;2012: Article ID 869792, 5 pages. doi:10.1155/2012/869792. MR3005231.
[3] Jian L., Shen S., Song Y. Improving the solution of least squares support vector machines with application to a blast furnace system. Journal of Applied Mathematics. 2012;2012: Article ID 949654, 12 pages. doi:10.1155/2012/949654.
[4] Hu J., Gao Z. Modules identification in gene positive networks of hepatocellular carcinoma using pearson agglomerative method and pearson cohesion coupling modularity. Journal of Applied Mathematics. 2012;2012: Article ID 248658, 21 pages. doi:10.1155/2012/248658.
[5] Bai Y., Fung W. K., Zhu Z. Y. Penalized quadratic inference functions for single-index models with longitudinal data. Journal of Multivariate Analysis. 2009;100(1):152–161. doi:10.1016/j.jmva.2008.04.004. MR2460484.
[6] Chiang C. T., Rice J. A., Wu C. O. Smoothing spline estimation for varying coefficient models with repeatedly measured dependent variables. Journal of the American Statistical Association. 2001;96(454):605–619. doi:10.1198/016214501753168280. MR1946428.
[7] Huang J. Z., Wu C. O., Zhou L. Varying-coefficient models and basis function approximations for the analysis of repeated measurements. Biometrika. 2002;89(1):111–128. doi:10.1093/biomet/89.1.111. MR1888349.
[8] Qu A., Li R. Quadratic inference functions for varying-coefficient models with longitudinal data. Biometrics. 2006;62(2):379–391. doi:10.1111/j.1541-0420.2005.00490.x. MR2227487.
[9] Xue L., Wang Q. Empirical likelihood for single-index varying-coefficient models. Bernoulli. 2012;18(3):836–856. doi:10.3150/11-BEJ365. MR2948904.
[10] Owen A. B. Empirical likelihood ratio confidence intervals for a single functional. Biometrika. 1988;75(2):237–249. doi:10.1093/biomet/75.2.237. MR946049.
[11] Owen A. Empirical likelihood ratio confidence regions. The Annals of Statistics. 1990;18(1):90–120. doi:10.1214/aos/1176347494. MR1041387.
[12] Wang Q. H., Jing B. Y. Empirical likelihood for partial linear models with fixed designs. Statistics & Probability Letters. 1999;41(4):425–433. doi:10.1016/S0167-7152(98)00230-2. MR1666112.
[13] Chen S. X., Qin Y. S. Empirical likelihood confidence intervals for local linear smoothers. Biometrika. 2000;87(4):946–953. doi:10.1093/biomet/87.4.946. MR1813987.
[14] Shi J., Lau T. S. Empirical likelihood for partially linear models. Journal of Multivariate Analysis. 2000;72(1):132–148. doi:10.1006/jmva.1999.1866. MR1747427.
[15] Xue L., Zhu L. Empirical likelihood for a varying coefficient model with longitudinal data. Journal of the American Statistical Association. 2007;102(478):642–654. doi:10.1198/016214507000000293. MR2370858.
[16] Xue L., Zhu L. Empirical likelihood semiparametric regression analysis for longitudinal data. Biometrika. 2007;94(4):921–937. doi:10.1093/biomet/asm066. MR2416799.
[17] Xue L. G., Zhu L. Empirical likelihood for single-index models. Journal of Multivariate Analysis. 2006;97(6):1295–1312. doi:10.1016/j.jmva.2005.09.004. MR2279674.
[18] Owen A. B. Empirical Likelihood. London, UK: Chapman & Hall; 2001.
[19] Huang Z., Zhang R. Testing for the parametric parts in a single-index varying-coefficient model. Science China: Mathematics. 2012;55(5):1017–1028. doi:10.1007/s11425-011-4336-0. MR2912492.
[20] Feng S., Xue L. Variable selection for single-index varying-coefficient model. Frontiers of Mathematics in China. 2013;8(3):541–565. doi:10.1007/s11464-013-0284-z. MR3044669.
[21] You J. H., Chen G. M., Zhou Y. Block empirical likelihood for longitudinal partially linear regression models. The Canadian Journal of Statistics. 2006;34(1):79–96. doi:10.1002/cjs.5550340107. MR2267711.
[22] Fan J., Gijbels I. Local Polynomial Modelling and Its Applications. Monographs on Statistics and Applied Probability, vol. 66. London, UK: Chapman & Hall; 1996. MR1383587.
[23] You J., Zhou Y. Empirical likelihood for semiparametric varying-coefficient partially linear regression models. Statistics & Probability Letters. 2006;76(4):412–422. doi:10.1016/j.spl.2005.08.029. MR2232483.
[24] Rao J. N. K., Scott A. J. The analysis of categorical data from complex sample surveys: chi-squared tests for goodness of fit and independence in two-way tables. Journal of the American Statistical Association. 1981;76(374):221–230. MR624328.
[25] Wang Q., Rao J. N. K. Empirical likelihood-based inference in linear errors-in-covariables models with validation data. Biometrika. 2002;89(2):345–358. doi:10.1093/biomet/89.2.345. MR1913963.
[26] Cai Z., Fan J., Yao Q. Functional-coefficient regression models for nonlinear time series. Journal of the American Statistical Association. 2000;95(451):941–956. doi:10.2307/2669476. MR1804449.
[27] Carroll R. J., Fan J., Gijbels I., Wand M. P. Generalized partially linear single-index models. Journal of the American Statistical Association. 1997;92(438):477–489. doi:10.2307/2965697. MR1467842.
[28] Zhang D., Lin X., Raz J., Sowers M. Semiparametric stochastic mixed models for longitudinal data. Journal of the American Statistical Association. 1998;93(442):710–719. doi:10.2307/2670121. MR1631369.