Blind Deconvolution for Jump-Preserving Curve Estimation

In many applications, observed signals are contaminated by both random noise and blur. This paper proposes a blind deconvolution procedure for estimating a regression function with possible jumps preserved, by removing both noise and blur when recovering the signals. Our procedure is based on three local linear kernel estimates of the regression function, constructed from observations in a left-side, a right-side, and a two-side neighborhood of a given point, respectively. The estimated function at the given point is then defined by one of the three estimates with the smallest weighted residual sum of squares. To better remove the noise and blur, this estimate can also be updated iteratively. Performance of this procedure is investigated by both simulation and real data examples, from which it can be seen that our procedure performs well in various cases.


Introduction
Nonparametric regression analysis provides us statistical tools for estimating regression functions from noisy data 1 .When the underlying regression function has jumps, the estimated functions by conventional nonparametric regression procedures are not statistically consistent at the jump positions.However, the problem to estimate jump regression functions is important because the true regression functions are often discontinuous in applications 2 .This paper focuses on estimation of the jump regression function when the observed data are contaminated by both random noise and blur.
In the literature, there are some existing procedures for estimating regression curves with jumps preserved in cases when only random noise is present in observed data.These procedures include the one-sided kernel estimation methods e.g., 3-9 , the local leastsquares estimation procedures e.g., 10-12 , the wavelet transformation method 13 , the spline smoothing method 14 , and the direct estimation methods e.g., 15, 16 .In some applications, our observations are both blurred and contaminated by pointwise noise e.g., signals of groundwater levels in geothermy .It is, therefore, important to remove both noise and blur when estimating the true regression function.In the nonparametric regression literature, we have not seen any discussion about this problem yet.In the context of image processing, which can be regarded as a two-dimensional nonparametric regression problem 2 , there is a related research area concerned about deblurring images.Most existing image deblurring procedures assume that the pointspread function psf , which describes the blurring mechanism, either is known or has a parametric form, and this function is homogeneous in the entire image e.g., 17-20 .In many applications, however, it is difficult to specify the psf completely or partially by a parametric model.Our major goal in this paper is to provide a method to estimate the true regression function properly in cases when both noise and blur are present and the psf is unspecified.
The remaining part of the paper is organized as follows.In next section, our proposed method is discussed in detail.Some comparative results are presented in Section 3 with several simulated examples.A real data application is presented in Section 4. Some concluding remarks are included in Section 5.

Our Proposed Method
Suppose that the regression model concerned is   where K is a density kernel function with support −1, 1 and h n is a bandwidth parameter.Then the solution of 2.3 to a, denoted as a c x , is defined as the LLK estimator of f x .In Figure 1 b , the dotted curve denotes the blurred regression function, and the dashed curve denotes the LLK estimate of f.It can be seen that the LLK estimate removes the noise well; but the blur is not removed and it is actually made more serious around the jump point.
Qiu 15 finds that the major reason why the conventional LLK estimate could not preserve the jump at the jump point is that it uses a "continuous" function i.e., a linear function locally to approximate the jump regression function.To overcome this limitation, Qiu suggests fitting a piecewise linear function around x as follows: where I • is an indicator function defined by I u 1 if u ≥ 0 and 0 otherwise.The solution to a l and a r are denoted as a l x and a r x .It is easy to see that a l x and a r x are actually LLK estimates of f x constructed from observations in the left-sided neighborhood x − h n , x and the right-sided neighborhood x, x h n , respectively, with kernel functions K l x K x I −x and K r x K x I x .Then, Qiu suggests the following jump-preserving estimate of f x : where WRMS l and WRMS r are the Weighted Residual Mean Squares WRMSs of the leftsided and right-sided estimates, respectively, defined by Qiu 15 proves that f 1 is a consistent estimate of f when there is no blurring in the observed data.
Since only part of observations is actually used in f 1 , this estimator would be quite noisy in continuity regions of f.To overcome this problem, similar to the method in 16 , we propose a modification as follows:

2.7
By 2.7 , when x is far away from any jump points, f x would be estimated by the conventional LLK estimate.It would still be estimated by one of the one-sided estimates around the jump points.Therefore, it is expected that f 2 x would be better for estimating f x than f 1 x .The solid curve in Figure 2 denotes f 2 constructed from the data shown in either plot of Figure 1.The two dotted curves show a l and a r , respectively.It can be seen that f 2 indeed estimates f well in this case.The estimate defined in 2.7 can also be updated iteratively as follows.The estimated values { f 2 x i , i 1, 2, . . ., n} can be used as observed data, and the estimate f 2 can be updated by 2.7 with all quantities on the right-hand side of 2.7 computed from { f 2 x i , i 1, 2, . . ., n}, and this process can continue iteratively.Numerical results in the next section suggest that a good estimate can usually be generated after about 5 iterations in all the cases considered there.
In our procedure, the bandwidth parameter h n should be chosen properly.To this end, we use the following cross-validation CV procedure: 2.8 where f −i x is the estimate of f x using all of the data points except the ith point x i , y i .

Simulation Study
In this section, some simulated examples are presented concerning the numerical performance of our proposed procedure.In all numerical examples presented in this paper, the Epanechnikov kernel function K x 3/4 1−x 2 I |x| ≤ 1 is used.We consider the following two true regression functions:

3.1
The function f 1 has two jumps at 0.4, 0.8, and a roof discontinuity i.e., jump in the slope at 0.2.The function f 2 has a jump at 0.78, and an unbalanced cusp i.e., a sharp angle at 0.26.These functions are shown by the solid curves in Figure 3.Our observations are generated from model 2.1 with random noise from the N 0, σ 2 distribution and the psf h 1 − x 2 1 cos 1.5xπI |x| ≤ 1 .One set of observations from each regression function is shown by the little pluses in Figure 3.
For the proposed procedure, its Mean-Squared Error MSE values with several different n, σ combinations are presented in Figure 4 as functions of the number of iterations, where the bandwidth h n is chosen by the CV procedure in each iteration.All of the MSE values are computed based on 100 replications.From the plots in Figure 4, it can  be seen that, for each n, σ combination, MSE values first decrease and then increase with the iteration number.The optimal number of iteration is around 5 in each case, which is the number that we recommend to use in applications.Next, we compare the proposed procedure denoted as NEW with the conventional local linear kernel LLK smoothing procedure and the jump-preserving curve estimation JPCE procedure by Qiu 15 .Figure 5 presents the estimated regression functions by all three methods from the observed data shown in Figure 3.For procedure NEW, 5 iterations are used.From the plots in Figure 5, it can be seen that LLK blurs the jumps, JPCE preserves the jumps well but its estimates are quite noisy in continuity regions, and our proposed procedure NEW preserves the jumps well and also removes noise efficiently.
Tables 1 and 2 present the MSE values and the corresponding standard errors of the three methods in various cases.We use 5 iterations in the proposed procedure NEW.From Tables 1 and 2, it can be seen that procedure NEW performs the best in all cases.

An Application
In this section, we apply our proposed method to a groundwater level data.Possible jumps in groundwater level arise from changes in subsurface fluid currents, which has become an    important predictor of earthquakes.In Figure 6, little pluses denote the groundwater level observed by the Seismograph Network Stations of China Earthquake Center during January and May 2007.The solid curve denotes the estimated regression curve by our proposed procedure in which all parameters are chosen in the same way as in the simulation examples presented in Section 3. As indicated by the plot, the jumps around Mar 23, Aril 10, and Aril 17 are preserved well by our procedure.We checked the earthquake history and it is confirmed by China Earthquake Center that earthquakes with magnitude of more than 4.0 occurred in these periods in the area of observation.

Concluding Remarks
We have presented a blind deconvolution procedure for jump-preserving curve estimation when both noise and blur are present in the observed data.Numerical results show that it performs well in various cases.However, theoretical properties of the proposed method are not available yet, which requires much future research.We believe that the proposed method can be generalized to two-dimensional cases to solve problems such as image deblurring and restoration.

Figure 1 :
Figure 1: a Solid lines denote the true regression function, and dotted curve denotes the blurred regression function.b Solid lines denote the estimate of the regression function by the proposed method, dotted curve denotes the blurred true regression function, and dashed curve denotes the conventional local linear kernel estimate of the regression function.In both plots, little " "s denote a set of observations generated from model 2.1 .

Figure 2 :
Figure 2: The solid curve denotes f 2 constructed from the data shown in either plot of Figure 1.The two dotted curves show a l and a r , respectively.

Figure 3 :
Figure 3: a Solid curve denotes the true regression function f 1 , and little pluses denote a set of observations when n 400 and σ 0.2.b Solid curve denotes the true regression function f 2 , and little pluses denote a set of observations when n 400 and σ 0.2.

Figure 4 :
Figure 4: MSE values of the estimated regression function.a f f 1 and n 200, b f f 1 and n 400, c f f 2 and n 200, d f f 2 and n 400.

Figure 5 :
Figure 5: Solid curves denote the estimates by the proposed procedure NEW, dotted curves denote the estimates by LLK, and dashed curves denote the estimates by JPCE.

Figure 6 :
Figure 6: Little pluses denote the groundwater level observed by the Seismograph Network Stations of China Earthquake Center during January and May 2007, and solid curve denotes the estimated regression curve by our proposed procedure.

Table 1 :
Comparison of the MSE values of the three methods in various cases when f f 1 .The numbers in parentheses are the standard errors.

Table 2 :
Comparison of the MSE values of the three methods in various cases when f f 2 .The numbers in parentheses are the standard errors.