Smooth Kernel Estimation of a Circular Density Function: A Connection to Orthogonal Polynomials on the Unit Circle

In this note we provide a simple approximation-theoretic motivation for circular kernel density estimation and further explore the usefulness of the wrapped Cauchy kernel in this context. The wrapped Cauchy kernel is seen to arise as a natural candidate in connection with orthogonal series density estimation on the unit circle. This adds further weight to the considerable role of the wrapped Cauchy distribution in circular statistics.


Introduction
Consider an absolutely continuous (with respect to the Lebesgue measure) circular density $f(\theta)$, $\theta \in [-\pi, \pi]$, i.e., $f(\theta)$ is $2\pi$-periodic,
$$f(\theta) \ge 0 \ \text{for } \theta \in \mathbb{R} \quad \text{and} \quad \int_{-\pi}^{\pi} f(\theta)\, d\theta = 1. \tag{1.1}$$
The literature on modeling circular data, starting from the classical text of Mardia (1972), includes many standard texts, such as Fisher (1993), Jammalamadaka and SenGupta (2001) and Mardia and Jupp (2000), that cover parametric models along with many inference problems. More recently, various alternatives to these classical parametric models, exhibiting asymmetry and multimodality, have been investigated with respect to their mathematical properties and goodness of fit to some real data; see Abe and Pewsey (2011), Jones and Pewsey (2012), Kato and Jones (2015), Kato and Jones (2010), Minh and Farnum (2003) and Shimizu and Iida (2002). In cases where multimodal and/or asymmetric models may be appropriate, semiparametric or nonparametric modelling may be considered more appropriate. Fernández-Durán (2004) and Mooney et al. (2003) considered semiparametric analysis based on mixtures of circular normal and von Mises distributions, and Hall et al. (1987), Bai et al. (1988), Fisher (1989), Taylor (2008) and Klemelä (2000) have considered nonparametric approaches.
Given a random sample $(\theta_1, \ldots, \theta_N)$ from the density (1.1), the circular kernel density estimator is given by
$$\hat f(\theta; h) = \frac{1}{N} \sum_{i=1}^{N} k_h(\theta - \theta_i), \tag{1.2}$$
where $k_h(\theta - \phi)$ is a circular kernel density function that is concentrated around $\phi$ as $h \to h_0$ for some known $h_0$. As motivated in Taylor (2008), a natural choice for the kernel function is one of the commonly used circular probability densities, such as the wrapped normal distribution or the von Mises distribution. Taylor (2008) investigated the use of the von Mises kernel, in which case the density estimator is given by
$$\hat f_{vM}(\theta; \nu) = \frac{1}{2\pi N I_0(\nu)} \sum_{i=1}^{N} \exp\{\nu \cos(\theta - \theta_i)\}, \tag{1.3}$$
where $I_r(\nu)$ is the modified Bessel function of order $r$ and $\nu$ is the concentration parameter. Di Marzio et al. (2009) considered the application of circular kernels to circular regression while extending the use of von Mises kernels to more general circular kernels.
In the present note we demonstrate that the wrapped Cauchy kernel presents itself as the kernel of choice by considering an estimation problem on the unit circle. We also show that this approach leads to orthogonal series density estimation in which no truncation of the series is required. It may be noted that the wrapped Cauchy distribution with location parameter $\mu$ and concentration parameter $\rho$ is given by
$$f_{WC}(\theta; \mu, \rho) = \frac{1}{2\pi} \, \frac{1 - \rho^2}{1 + \rho^2 - 2\rho \cos(\theta - \mu)}, \tag{1.4}$$
which becomes degenerate at $\theta = \mu$ as $\rho \to 1$. The estimator of $f(\theta)$ based on the above kernel is given by
$$\hat f_{WC}(\theta; \rho) = \frac{1}{2\pi N} \sum_{i=1}^{N} \frac{1 - \rho^2}{1 + \rho^2 - 2\rho \cos(\theta - \theta_i)}. \tag{1.5}$$
In Section 2, we provide a simple approximation theory argument behind nonparametric density estimators of the type introduced in (1.2) and (1.3). In Section 3, we first present some basic results from the literature on orthogonal polynomials on the unit circle and then introduce the strategy of estimating $f(\theta)$ by estimating the expectation of a specific complex function, which in turn produces the nonparametric circular kernel density estimator in (1.5). The next section shows that the circular kernel density estimator is equivalent to orthogonal series estimation in a limiting sense.
This equivalence establishes a kind of qualitative superiority of the kernel estimator over the orthogonal series estimator: the latter requires the series to be truncated, whereas the kernel estimator has no such restriction.
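The wrapped Cauchy density and the estimator (1.5) can be sketched in a few lines of plain Python. This is only an illustrative sketch: the function names, the sample of angles and the value of $\rho$ are hypothetical choices, not taken from the references.

```python
import math

def wrapped_cauchy_pdf(theta, mu, rho):
    """Wrapped Cauchy density with location mu and concentration 0 <= rho < 1."""
    return (1.0 - rho**2) / (
        2.0 * math.pi * (1.0 + rho**2 - 2.0 * rho * math.cos(theta - mu))
    )

def wc_kde(theta, sample, rho):
    """Kernel estimator: average of wrapped Cauchy densities centred at the data."""
    return sum(wrapped_cauchy_pdf(theta, t, rho) for t in sample) / len(sample)

# Hypothetical sample of angles in radians (illustrative only).
sample = [-2.1, -0.4, 0.1, 0.3, 0.5, 2.8]
rho = 0.8  # assumed smoothing value; larger rho means less smoothing

# Riemann sum over one full period to check that the estimate has unit mass.
m = 2000
step = 2.0 * math.pi / m
grid = [-math.pi + k * step for k in range(m)]
values = [wc_kde(t, sample, rho) for t in grid]
total_mass = sum(values) * step
```

Because each summand is itself a density, the estimate is nonnegative and integrates to one for any $\rho \in [0, 1)$.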

Motivation for the Circular Kernel Density Estimator
The starting point of nonparametric density estimation is the theorem given below from approximation theory (see Mhaskar and Pai (2000)). Before giving the theorem we need the following definition.
Definition 2.1. Let $\{K_n\} \subset C^*$, where $C^*$ denotes the set of periodic analytic functions with period $2\pi$. We say that $\{K_n\}$ is an approximate identity if
A. $K_n(\theta) \ge 0$ for all $\theta \in [-\pi, \pi]$;
B. $\frac{1}{2\pi} \int_{-\pi}^{\pi} K_n(\theta)\, d\theta = 1$;
C. $\lim_{n \to \infty} \max_{|\theta| \ge \delta} K_n(\theta) = 0$ for every $\delta > 0$.
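The definition above is motivated by the following theorem, which is similar to the one used in the theory of linear kernel estimation (see Prakasa Rao (1983)).
Theorem. Let $f \in C^*$, let $\{K_n\}$ be an approximate identity, and for $n = 1, 2, \ldots$ set
$$f^*(\theta) = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(\eta)\, K_n(\eta - \theta)\, d\eta.$$
Then $\lim_{n \to \infty} \sup_{\theta \in [-\pi, \pi]} |f^*(\theta) - f(\theta)| = 0$.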
Note that, taking a sequence of concentration coefficients $\rho \equiv \rho_n$ such that $\rho_n \to 1$, $2\pi$ times the wrapped Cauchy density centred at zero satisfies the conditions in the definition and may be used in place of $K_n$ in the integral in the above theorem. In general, $K_n/(2\pi)$ may be replaced by a sequence of periodic densities on $[-\pi, \pi]$ that converge to a degenerate distribution at $\theta = 0$.
For a given random sample $\theta_1, \ldots, \theta_N$ from the circular density $f$, the Monte Carlo estimate of $f^*$ is given by
$$\tilde f(\theta) = \frac{1}{2\pi N} \sum_{j=1}^{N} K_n(\theta_j - \theta),$$
where the suffix $n$ of the kernel $K_n$ may be a function of the sample size $N$. The kernel given by the wrapped Cauchy density satisfies the assumptions of the above theorem, which yields the estimator proposed in (1.5).
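Conditions B and C of the definition can be checked numerically for the wrapped Cauchy kernel. The sketch below is a rough numerical illustration, not a proof; it takes $K(\theta) = (1 - \rho^2)/(1 + \rho^2 - 2\rho\cos\theta)$, i.e. $2\pi$ times the wrapped Cauchy density centred at zero, and the values of $\rho$, $\delta$ and the grid sizes are arbitrary assumptions.

```python
import math

def poisson_kernel(theta, rho):
    """K(theta) = 2*pi times the wrapped Cauchy density centred at zero."""
    return (1.0 - rho**2) / (1.0 + rho**2 - 2.0 * rho * math.cos(theta))

def normalised_mass(rho, m=20000):
    """Riemann-sum approximation of (1/(2*pi)) * integral of K over [-pi, pi]."""
    step = 2.0 * math.pi / m
    total = sum(poisson_kernel(-math.pi + k * step, rho) for k in range(m))
    return total * step / (2.0 * math.pi)

def tail_max(rho, delta, m=2000):
    """Grid approximation of max K(theta) over delta <= |theta| <= pi (K is even)."""
    return max(
        poisson_kernel(delta + (math.pi - delta) * k / m, rho) for k in range(m + 1)
    )

rhos = [0.9, 0.99, 0.999]                            # concentration tending to 1
masses = [normalised_mass(rho) for rho in rhos]      # condition B: each close to 1
tails = [tail_max(rho, delta=0.5) for rho in rhos]   # condition C: shrinking to 0
```

Condition A holds by inspection, since the numerator is positive and the denominator is at least $(1 - \rho)^2$.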
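(3.3)
The Poisson representation says that if $g$ is analytic in a neighborhood of $\bar{\mathbb{D}}$ with $g(0)$ real, then for $z \in \mathbb{D}$ (the open unit disc),
$$g(z) = \int \frac{e^{i\theta} + z}{e^{i\theta} - z}\, \mathrm{Re}\big(g(e^{i\theta})\big)\, \frac{d\theta}{2\pi} \tag{3.4}$$
(see Simon (2005, p. 27)). This representation leads to the result (see (ii) in §5 of Simon (2005)) that for Lebesgue a.e. $\theta$,
$$F(e^{i\theta}) \equiv \lim_{r \uparrow 1} F(re^{i\theta}) \tag{3.5}$$
exists, and if $d\mu = w(\theta)\, \frac{d\theta}{2\pi} + d\mu_s$ with $d\mu_s$ singular, then
$$w(\theta) = \mathrm{Re}\, F(e^{i\theta}), \tag{3.6}$$
where
$$F(z) = \int \frac{e^{i\theta} + z}{e^{i\theta} - z}\, d\mu(\theta). \tag{3.7}$$
Our strategy for smooth estimation rests on the fact that for $d\mu_s = 0$ we have
$$f(\theta) = \frac{1}{2\pi} \lim_{r \uparrow 1} \mathrm{Re}\, F(re^{i\theta}). \tag{3.8}$$
We define the estimator of $f(\theta)$, motivated by an estimator of $F(z)$ together with the identities (3.6) and (3.8), as
$$\hat f_r(\theta) = \frac{1}{2\pi}\, \mathrm{Re}\, F_N(re^{i\theta}), \tag{3.11}$$
where $r$ has to be chosen appropriately. Recognize that
$$F_N(z) = \frac{1}{N} \sum_{j=1}^{N} \frac{\omega_j + z}{\omega_j - z}, \tag{3.12}$$
where $\omega_j = e^{i\theta_j}$; then, using (3.4), we have
$$\mathrm{Re}\, F_N(re^{i\theta}) = \frac{1}{N} \sum_{j=1}^{N} \frac{1 - r^2}{1 + r^2 - 2r\cos(\theta - \theta_j)}$$
and therefore
$$\hat f_r(\theta) = \frac{1}{2\pi N} \sum_{j=1}^{N} \frac{1 - r^2}{1 + r^2 - 2r\cos(\theta - \theta_j)},$$
which is of the same form as in (1.5).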
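The step from the complex function to the wrapped Cauchy form can be checked numerically. The sketch below assumes the empirical version $F_N(z) = N^{-1} \sum_j (\omega_j + z)/(\omega_j - z)$ with $\omega_j = e^{i\theta_j}$, evaluates $\mathrm{Re}\, F_N(re^{i\theta})/(2\pi)$ with complex arithmetic, and compares it with the closed form of (1.5) with $\rho = r$. The sample, the value of $r$ and the function names are hypothetical.

```python
import cmath
import math

def F_N(z, sample):
    """Empirical version of F: mean of (w_j + z) / (w_j - z) with w_j = exp(i*theta_j)."""
    total = 0j
    for t in sample:
        w = cmath.exp(1j * t)
        total += (w + z) / (w - z)
    return total / len(sample)

def f_hat_via_F(theta, sample, r):
    """Density estimate obtained as Re F_N(r * exp(i*theta)) / (2*pi)."""
    return F_N(r * cmath.exp(1j * theta), sample).real / (2.0 * math.pi)

def f_hat_wrapped_cauchy(theta, sample, r):
    """Wrapped Cauchy kernel estimate with concentration r."""
    total = sum(
        (1.0 - r * r) / (1.0 + r * r - 2.0 * r * math.cos(theta - t)) for t in sample
    )
    return total / (2.0 * math.pi * len(sample))

sample = [-2.1, -0.4, 0.1, 0.3, 0.5, 2.8]  # hypothetical angles in radians
r = 0.9                                     # assumed radius, must satisfy 0 <= r < 1
grid = [-math.pi + 2.0 * math.pi * k / 200 for k in range(200)]
max_gap = max(
    abs(f_hat_via_F(t, sample, r) - f_hat_wrapped_cauchy(t, sample, r)) for t in grid
)
```

The two evaluations agree up to floating-point rounding, which reflects the identity $\mathrm{Re}[(\omega + z)/(\omega - z)] = (1 - r^2)/(1 + r^2 - 2r\cos(\theta - \theta_j))$ for $z = re^{i\theta}$ and $\omega = e^{i\theta_j}$.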

Orthogonal Series Estimation
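We get the orthogonal expansion of $F(z)$ with respect to the basis $\{1, z, z^2, \ldots\}$ as
$$F(z) = 1 + 2 \sum_{n=1}^{\infty} c_n z^n, \tag{4.1}$$
where $c_j = \int e^{-ij\theta}\, d\mu(\theta)$ is the $j$th trigonometric moment. In orthogonal series estimation the series is truncated at some term $N^*$ so that the error is negligible. However, we show below that on estimating the trigonometric moments $c_n$, $n = 1, 2, \ldots$, by
$$\hat c_n = \frac{1}{N} \sum_{j=1}^{N} \bar\omega_j^{\,n}, \qquad \omega_j = e^{i\theta_j},$$
the estimator of $F(z)$ is the same as that given in the previous section. This can be shown by writing, for $|z| < 1$,
$$\hat F(z) = 1 + 2 \sum_{n=1}^{\infty} \hat c_n z^n = 1 + \frac{2}{N} \sum_{j=1}^{N} \sum_{n=1}^{\infty} (\bar\omega_j z)^n = 1 + \frac{2}{N} \sum_{j=1}^{N} \frac{\bar\omega_j z}{1 - \bar\omega_j z} = \frac{1}{N} \sum_{j=1}^{N} \frac{\omega_j + z}{\omega_j - z},$$
which is the same as $F_N(z)$ given in (3.12). This ensures that the orthogonal series estimator of the density coincides with the circular kernel estimator. The smoothing constant may be determined by the cross-validation method outlined in Taylor (2008).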
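The coincidence of the two estimators of $F(z)$ can be illustrated numerically. The sketch below assumes the moment estimates $\hat c_n = N^{-1} \sum_j e^{-in\theta_j}$ and compares a long partial sum of $1 + 2\sum_n \hat c_n z^n$ with the closed form $N^{-1} \sum_j (\omega_j + z)/(\omega_j - z)$ at a point inside the unit disc. The sample, the evaluation point and the number of terms are arbitrary assumptions.

```python
import cmath

def c_hat(n, sample):
    """Empirical n-th trigonometric moment: mean of exp(-i*n*theta_j)."""
    return sum(cmath.exp(-1j * n * t) for t in sample) / len(sample)

def F_series(z, sample, terms):
    """Partial sum 1 + 2 * sum_{n=1}^{terms} c_hat_n * z**n."""
    return 1.0 + 2.0 * sum(c_hat(n, sample) * z**n for n in range(1, terms + 1))

def F_closed(z, sample):
    """Closed form: mean of (w_j + z) / (w_j - z) with w_j = exp(i*theta_j)."""
    total = 0j
    for t in sample:
        w = cmath.exp(1j * t)
        total += (w + z) / (w - z)
    return total / len(sample)

sample = [-2.1, -0.4, 0.1, 0.3, 0.5, 2.8]  # hypothetical angles in radians
z = 0.8 * cmath.exp(0.7j)                   # a point with |z| = 0.8 < 1
gap = abs(F_series(z, sample, terms=200) - F_closed(z, sample))
```

Since $|\hat c_n| \le 1$, the neglected tail of the series is bounded by $2|z|^{201}/(1 - |z|)$, which is far below rounding error for $|z| = 0.8$; the bound degrades as $|z|$ approaches 1.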
Remark: Note that the simplification used in the above formulae does not work for $r = 1$, because the series expansion is not guaranteed to converge on the unit circle $|z| = 1$. Even so, the limiting form of (4.1) is used to define an orthogonal series estimator, given by
$$\hat f_S(\theta) = \frac{1}{2\pi} \left[ 1 + \frac{2}{N} \sum_{n=1}^{n^*} \sum_{j=1}^{N} \cos\{n(\theta - \theta_j)\} \right],$$
where $n^*$ is chosen according to some criterion, for example to minimize the integrated squared error. Thus the above discussion presents two contrasting situations: in one we have to determine the number of terms in the series, and in the other the number of terms in the series is allowed to be infinite, but we choose to evaluate $\mathrm{Re}\, F(re^{i\theta})$ for some $r$ close to 1 as an approximation to $\mathrm{Re}\, F(e^{i\theta})$.
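The two contrasting situations can be placed side by side in a short sketch. It assumes the truncated estimator takes the form $(2\pi)^{-1}[1 + 2\sum_{n \le n^*} N^{-1}\sum_j \cos n(\theta - \theta_j)]$, and uses the fact that evaluating at radius $r < 1$ multiplies the $n$th term by $r^n$, so that the untruncated damped series should reproduce the wrapped Cauchy kernel estimate. The sample, $n^*$, $r$ and all function names are hypothetical.

```python
import math

def mean_cos(n, theta, sample):
    """Real part of c_hat_n * exp(i*n*theta): mean of cos(n * (theta - theta_j))."""
    return sum(math.cos(n * (theta - t)) for t in sample) / len(sample)

def series_estimate(theta, sample, n_star):
    """Truncated orthogonal series estimate with n_star terms (the r = 1 form)."""
    total = 1.0 + 2.0 * sum(mean_cos(n, theta, sample) for n in range(1, n_star + 1))
    return total / (2.0 * math.pi)

def damped_series_estimate(theta, sample, r, terms):
    """Series with weights r**n; for r < 1 a long partial sum stands in for the full series."""
    total = 1.0 + 2.0 * sum(
        r**n * mean_cos(n, theta, sample) for n in range(1, terms + 1)
    )
    return total / (2.0 * math.pi)

def kernel_estimate(theta, sample, r):
    """Wrapped Cauchy kernel estimate with concentration r."""
    total = sum(
        (1.0 - r * r) / (1.0 + r * r - 2.0 * r * math.cos(theta - t)) for t in sample
    )
    return total / (2.0 * math.pi * len(sample))

sample = [-2.1, -0.4, 0.1, 0.3, 0.5, 2.8]  # hypothetical angles in radians
m = 1000
step = 2.0 * math.pi / m
grid = [-math.pi + k * step for k in range(m)]
series_mass = sum(series_estimate(t, sample, n_star=5) for t in grid) * step
gap = max(
    abs(damped_series_estimate(t, sample, 0.8, 300) - kernel_estimate(t, sample, 0.8))
    for t in grid[::50]
)
```

In this sketch the truncated estimator is tuned through the integer `n_star`, while the kernel form is tuned through the continuous parameter `r`; both integrate to one over a full period.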