Numerical Differentiation of Noisy , Nonsmooth Data

In many scientific applications, it is necessary to compute the derivative of functions specified by data. Conventional finite-difference approximations will greatly amplify any noise present in the data. Denoising the data before or after differentiating does not generally give satisfactory results see an example in Section 4 . A method which does give good results is to regularize the differentiation process itself. This guarantees that the computed derivative will have some degree of regularity, to an extent that is often under control by adjusting parameters. A common framework for this is Tikhonov regularization 1 of the corresponding inverse problem. That is, the derivative of a function f , say on 0, L , is the minimizer of the functional


Introduction
In many scientific applications, it is necessary to compute the derivative of functions specified by data.Conventional finite-difference approximations will greatly amplify any noise present in the data.Denoising the data before or after differentiating does not generally give satisfactory results.(See an example in Section 4.) A method which does give good results is to regularize the differentiation process itself.This guarantees that the computed derivative will have some degree of regularity, to an extent that is often under control by adjusting parameters.A common framework for this is Tikhonov regularization [12] of the corresponding inverse problem.That is, the derivative of a function f , say on [0, L], is the minimizer of the functional where R(u) is a regularization or penalty term that penalizes irregularity in u, Au(x) = x 0 u is the operator of antidifferentiation, DF (Au − f ) is a data fidelity term that penalizes discrepancy between Au and f , and α is a regularization paramater that controls the balance between the two terms.
as is appropriate if f has additive, Gaussian noise.(See [8] for an alternative in the case of Poisson noise.)In [12], the regularization term is the squared L 2 norm; this controls the size of u, without forcing minimizers to be regular.Tikhonov regularization was first applied to numerical differentiation by Cullum [4], where the regularization is the squared This forces minimizers to be continuous, as is required for the H 1 norm to be finite.This prevents the accurate differentiation of functions with singular points.
Other variational methods have the same drawback of forcing smoothness.An approach that penalizes the L 2 norm of u forces the minimizer to be a cubic spline (see [11,9,6]).The variational approach of Knowles and Wallace [7] does not fall into the category of Tikhonov regularization, but explicitly assumes that u is smooth.

Total-variation regularization
We propose to use total-variation regularization in (1).We will thus compute the derivative of f on [0, L] as the minimizer of the functional We assume f ∈ L 2 (an empty assumption in the discrete case), and for convenience that f (0) = 0. (In practice we simply subtract f (0) from f .) The functional F is defined on BV [0, L], the space of functions of bounded variation.It is in fact continuous on BV , as BV is continuously embedded in L 2 , and A is continuous on L 2 (being an integral operator with bounded kernel).Existence of a minimizer for F follows from the compactness of BV in L 2 [1, p. 152] and the lower semicontinuity of the BV seminorm [1, p. 120].This and the strict convexity of F are sufficient to guarantee that F has a unique minimizer u * .Use of total variation accomplishes two things.It suppresses noise, as a noisy function will have a large total variation.It also does not suppress jump discontinuities, unlike typical regularizations.This allows for the computation of discontinuous derivatives, and the detection of corners and edges in noisy data.
Total-variation regularization is due to Rudin, Osher, and Fatemi in [10].It has since found many applications in image processing.Replacing A in the two-dimensional analog of (2) with the identity operator gives a method for denoising an image f .See [3,2] for an example where A is the Abel transform, giving a method for regularizing Abel inversion.

Numerical implementation
A simple approach to minimizing (2) is gradient descent.This amounts to evolving to stationarity the PDE obtained from the Euler-Lagrange equation: where Replacing the |u | in the denominator with (u ) 2 + for some small > 0 avoids division by zero.Typically, (3) is implemented with explicit time marching, with u t discretized as (u n+1 − u n )/∆t for some fixed ∆t.
The problem with (3) is that convergence is slow.A faster algorithm is the lagged diffusivity method of Vogel and Oman [14].The idea is to replace at each iteration of ( 3 Our discrete implementation of the algorithm is straightforward, and follows [13].We assume that u is defined on a uniform grid {x i } L 0 = {0, ∆x, 2∆x, . . ., L}. Derivatives of u are computed halfway between grid points as centered differences.Integrals of u are likewise computed halfway between grid points, using the trapezoid rule.The matrix of the differentiation operator is denoted D; the matrix of the antidifferentiation operator A is denoted K. Let E n be the diagonal matrix whose ith entry is The matrix H n is an approximation to the Hessian of F at u n .The update s n = u n+1 − u n is the solution to H n s n = −g n , where Less straightforward is the choice of the regularization parameter α.We use the discrepancy principle: the mean-squared difference between Au * and f should equal the variance of the noise in f .This has the effect of choosing the most regular solution to the ill-posed inverse problem Au = f that is consistent with the data f .In practice, the noise in f is not generally known, so we estimate the noise variance by comparing f with a smoothed version of f .

Example
Let f 0 (x) = |x − 1/2| on [0, 1].We define f at 100 evenly-spaced points in [0, 1], and add Gaussian noise of standard deviation 0.05. Figure 1 shows the resulting f .First, we show in Figure 2 the result of computing f by simple centered differencing.The noise has been greatly amplified, so much that denoising the result is hopeless.We compare with this the result in Figure 3 of denoising f before computing f by differencing.The denoising is done by total variation regularization, as in (2) but with A replaced by the identity operator.The residual noise in the denoised f is still amplified enough by the differentiation process to give an unsatisfactory result.Now we implement our total-variation regularized differentiation, (2).The result is in Figure 4.The overall shape of f 0 is captured almost perfectly.The jump is correctly located.The one inaccuracy is the size of the jump: there is a loss of contrast, which is typical of total-variation regularization in the presence of noise.Decreasing the size of the jump reduces the penalty term in (2), at the expense of increasing the data-fidelity term.We also show the result of applying the antidifferentiation operator to the computed f , and compare with f 0 in Figure 5.The corner is sharp and the lines are straight, though a little too flat.[7] I. Knowles and R. Wallace, A variational method for numerical differentiation, Numer.Math., 70 (1995), pp.91-110.
[8] T. Le, R. Chartrand, and T. J. Asaki, A variational approach to reconstructing images corrupted by Poisson noise.Submitted for publication.
) the nonlinear differential operator u → d dx u |u | with the linear operator u → d dx u |u n | .The algorithm has been proven to converge to the minimizer of F [5].

Figure 2 :
Figure 2: Computing f with finite differences greatly amplifies noise.

Figure 3 :
Figure 3: The function f is denoised, then differentiated with finite differences.The result is noisy and inaccurate.

Figure 4 :
Figure 4: Regularizing the differentiation process with total-variation produces a noiseless derivative with a correctly located, sharp jump.The discrepancy of the values from ±1 are due to contrast loss, an artifact of total variation methods in the presence of noise.

Figure 5 :
Figure 5: The function f 0 (solid line) and the antidifferentiated numerical derivative (circles).The numerically computed function is very similar to the exact one.