Convex and radially concave contoured distributions

Integral representations of the locally defined star-generalized surface content measures on star spheres are derived for boundary spheres of balls being convex or radially concave w.r.t. a fan in R^n. As a result, the general geometric measure representation of star-shaped probability distributions and the general stochastic representation of the corresponding random vectors allow additional specific interpretations in the two mentioned cases. Applications to estimating and testing hypotheses on scaling parameters are presented, and two-dimensional sample clouds are simulated.


Introduction
The families of multivariate Gaussian and elliptically contoured distributions have served for a long time as the main basis of numerous probabilistic models and their many successful applications. Basics of estimation theory of elliptically contoured distributions can be found in [5], [1], [3] and, e.g., [9]. Advancing needs of statistical practice as well as longstanding challenging mathematical questions stimulated the development of larger classes of probability laws containing many well known distributions as particular elements. We note that [2] surveys a large part of the distribution theory in R^2. Numerous authors contributed to establishing the class of multivariate star-shaped distributions. For a recent review of this development, see [14]. Estimating level sets of star-shaped densities has been dealt with in [15], [16], [17] and [4].
Several aspects of analyzing a cloud of sample points may be of importance for the process of defining a class of probability laws. The visual impression of star-shaped figures formed by the points of a sample cloud may lead to the idea that the boundaries of star-bodies, henceforth called star-spheres, represent density level sets of a probability law.
Counting the sample points belonging to thin layers about star-spheres then leads to the idea that a certain function that assigns a nonnegative number to every such star-sphere serves as the density generating function (dgf) of a non-negative random variable (rv), or more generally, as a function being proportional to the Radon-Nikodym density of a multivariate probability law w.r.t. a certain σ-finite measure defined on the sample space. We call such a function a univariate level density function of the multivariate probability distribution.
The combination of the aspects of defining level sets of a multivariate density and of assigning a non-negative level to every such set will be reflected here in a new method of integration. This method may be considered as a heightening and generalization of the classical principle of Cavalieri which was modified by Torricelli. Combining integration on level sets with that along the levels may also be considered as a geometric disintegration method. This method is essentially based upon certain non-Euclidean surface measures on star-spheres. It is one of the main aims of this paper to further develop this theory for two important types of star-spheres. Convex bodies and star-bodies being radially concave w.r.t. a fan in R^n form these two classes of star-bodies. Thus, in this paper, the focus is on probability laws having the boundaries of such sets as their density level sets or contour sets. Actually, the results in Section 3 are mainly restricted to, possibly shifted, symmetric contour sets being norm or antinorm spheres.
There are different ways to introduce a dgf of a continuous probability law. Looking through the statistical and mathematical literature, one finds many interesting non-negative and suitably integrable functions which may serve as a dgf. Another way to introduce a dgf is to analyze the structure of a known multivariate density and to extract from it, if possible, a function which does not depend on the surface measure on the star-spheres but depends exclusively on the levels of the multivariate density.
It is well known that the definition of a dgf is not unique, and it is also known how to deal with this circumstance. Densities with heavy tails may be of interest in (re-)insurance, and densities with light tails may be of interest in reliability theory. Both types of densities can be modeled by choosing a suitable dgf in each case.
Modeling the density level sets and the univariate level density of a multivariate distribution can be done in a combined way or separately from each other. Sometimes a parameter may influence both the level density and the density contour sets of a multivariate distribution. Another parameter may be of importance for only one of these two aspects.
The paper is organized as follows. Some basic facts from the theory of star-shaped distributions, with an emphasis on geometric measure representations, are collected in Section 2. New geometric descriptions of surface measures on boundary spheres of balls being radially concave w.r.t. a fan in R^n, or convex, are presented in Section 3. Moreover, new statistical applications of geometric measure representations of norm and antinorm contoured distributions and of stochastic representations of correspondingly distributed random vectors are discussed there. In particular, distributions are illustrated by simulated sample clouds. The basics for estimating and testing hypotheses on scaling parameters are presented at the end of Section 3. Section 4 deals with proving the new results, and a discussion of the results can be found in the final Section 5.

Star-shaped distributions
Geometric measure representations and stochastic representations of corresponding random vectors have been proved in [14] for general star-shaped distributions, making essential use of the notion of a star-generalized surface content measure. The latter is defined in a local way by taking derivatives of sector volumes and is known to be equivalently defined in an integral (in dimension two even explicitly differential geometric) way in the cases of l_{n,p}-spheres and ellipsoids. For recent results and a survey of their probabilistic and statistical applications we refer to [14]. Here, some basic facts from star-shaped distribution theory and its applications will be summarized.
Let a random vector Y follow the probability density function (pdf) ϕ_{g,K,ν}(x) = C(g,K) g(d_K(x − ν)), x ∈ R^n, where ν ∈ R^n is a vector of location, K ⊂ R^n is a star-body having the origin as an inner point, d_K is the distance function, or Minkowski functional, of the star-body K, g : R_+ → R_+ satisfies 0 < I(g) < ∞ with I(g) = ∫_0^∞ r^{n−1} g(r) dr, and the normalizing constant allows the representation C(g,K) = 1/(O_S(S) I(g)).
Assuming that the technical Assumption 1 in [14], which deals with a certain smoothness property of the boundary S of K, is satisfied, O_S denotes the star-generalized surface content measure defined on the Borel subsets of S. The probability measure Φ_{g,K,ν} corresponding to ϕ_{g,K,ν} allows the geometric measure representation or disintegration formula Φ_{g,K,ν}(B) = C(g,K) ∫_0^∞ r^{n−1} g(r) O_S([(1/r)(B − ν)] ∩ S) dr, B ∈ B^n, (1) where C(g,K) denotes the normalizing constant of the density and B^n the Borel σ-field in R^n. K is called the density contour defining star body of this distribution, and any g under consideration a density generating function (dgf). The sets (B − ν) ∩ S(r) with S(r) = rS may be considered as playing the role of the indivisibles within a generalized principle of Cavalieri (which was modified by Torricelli). The random vector Y satisfies the stochastic representation Y \stackrel{d}{=} ν + R · U_S, (2) where R and U_S are stochastically independent, R has the pdf f(r) = r^{n−1} g(r) / ∫_0^∞ ρ^{n−1} g(ρ) dρ, r > 0, and U_S follows the star-generalized uniform probability distribution ω_S on the Borel σ-field B_S = B^n ∩ S, ω_S(A) = O_S(A)/O_S(S), A ∈ B_S. (3) Because of (3), U_S is called the star-generalized uniform basis of Y. The symbol X \stackrel{d}{=} Y means that the random vectors X and Y follow the same probability law while X ∼ Q indicates that the random vector X follows the probability distribution Q.
For A ∈ B_S, we introduce the central projection cone and the star sector of star radius r, where is a star ball of star radius r. Let µ be the Lebesgue measure in R^n. Then the star-generalized surface measure is defined on rB_S in a local approach by Making use of the star-sphere intersection proportion function (ipf) of a set A, the disintegration representation of Φ_{g,K,ν} may be written The most immediate applications of this formula appear in cases where the ipf is a constant or an indicator of an interval. If, for a certain set B, the ipf takes a constant value, C say, then Φ_{g,K,ν}(B) is just equal to this value C.
If, for a statistic T , B(t) = {T < t}, t ∈ R, and the ipfs of all sets B(t) take the constant value, C(t) say, then the statistic T is robust w.r.t. the dgf g, i.e. the distribution of T does not depend on g.
If, for a certain set A(x), the ipf is the indicator function of an interval, [0, b(x)] say, then For specific statistical examples of such type we refer to Section 3. Applications of (5) in cases where the ipf is more structured are often more involved. Such a situation will be considered in Example 3.4.
The main aim of this paper, however, is not only to give attractive examples where the geometric measure representation applies but also to give non-trivial explanations of the locally defined surface measure O_S on the basis of an integral (or differential-geometric) approach. This will be done in the first two parts of Section 3 for the two important cases where K is a norm or antinorm ball. As a result, in formulas (1)-(3) and (5), O_S will afterwards allow additional specific integral (or differential geometric) interpretations in the two mentioned cases. The class where int K means the interior of K is called the class of continuous star-shaped distributions. A random vector Y is said in [14] to belong to the bigger class of star-shaped distributions StSh^(n) if there are a vector ν ∈ R^n, a star body K with 0 ∈ int K (and boundary S), and a non-negative random variable (rv) R with cumulative distribution function (cdf) F such that and R and U_S are independent. In this case, we write and The random vector U_S is called the star-generalized uniform basis of the class For the latter notions, see [11]. Note that l_{n,p}-symmetric distributions are norm or antinorm contoured if p ≥ 1 or 0 < p ≤ 1, respectively. We shall study general convex or norm contoured distributions in Section 3.

Results
We start the presentation of new results with a remark on asymmetric distribution laws which seems to be very useful: a distribution being star-shaped w.r.t. a fan F may be restricted to arbitrary unions of elements of F.
Here, I is a suitable index set. The proof of this result follows immediately by conditioning. We call the collection of all such distributions the class of fan restricted star laws and denote it by StL. Elements of this distribution class are not symmetric, in general.

Norm contoured distributions
Let K be convex and symmetric w.r.t. the origin throughout this section. Our consideration is therefore restricted here to the class of norm contoured distributions, Let the system of Borel sets from the upper half of the sphere S be B^+_S. For A ∈ B^+_S, put and denote, wherever it exists, the outer normal vector to the norm sphere S at the point (ϑ^T, η)^T ∈ S by N(ϑ). Wherever the outer normal vector is not defined, let N(ϑ) denote the zero element of R^n. Note that the set of boundary points of K where ∇η does not exist is countable and thus without any influence on the value of the integral in the following theorem. We recall that the surface content measure O_S was locally defined in (4). We shall refer to this result as the integral or differential geometric approach to measuring surface content on a norm sphere based upon the dual norm geometry. We mention that a similar representation of O_S(A) follows for arbitrary A ∈ B_S. Due to Theorem 3.1, if K is convex, the surface measure O_S henceforth allows both the local and the integral interpretation in formulas (1)-(3) and (5). Moreover, Theorem 3.1 reflects a certain specific aspect of duality theory for norms.
In the next section we will deal in an analogous way with balls being radially concave w.r.t. a fan in R^n.
Figures 1-3 show sample clouds of size k = 2000 of p-generalized Gaussian distributed two-dimensional (n = 2) random vectors for different choices of p, p ≥ 1. Notice that the six frames reflect different scalings of the clouds due to different values of p. While the sample cloud in Figure 1(a) might seem to be similar to the illustration of the Gaussian case, the shape of the sample cloud approaches that of an axes-aligned square if p increases (or even tends to infinity). At the same time, the cloud (probability mass) becomes more and more concentrated. If, however, p ≥ 1 tends to one, then the shape of the sample cloud approaches that of the diamond. At the same time, probability mass becomes much less concentrated and the contour of the sample cloud appears to be not as sharp as in the opposite

Antinorm contoured distributions
Throughout the present section, let K denote a star body having a positive and continuous radial function and being symmetric w.r.t. the origin and radially concave w.r.t. a fan The Minkowski functional or distance function of K is then an antinorm, d_K = ., i.e. a continuous, positively homogeneous, non-degenerate, and in F superadditive function. For more details we refer to [11]. According to all the assumptions made so far, our consideration is restricted here to the class of antinorm contoured distributions, Moreover, we assume that K belongs to a particular class of antinorm balls, AN 1, meaning that 1) for every i there is a 1-1-map x_{S,i} : where S^(n−1) denotes the Euclidean unit sphere in R^n and 2) for every i, for all u ∈ S^(n−1) ∩ C_i there is a hyperplane T(u) satisfying T(u) ⊥ u and being an inner tangent plane to (the boundary of) K at x_{S,i}(u).
We define the anti-support function of K w.r.t. F by and the anti-polar set of K, Let N(ϑ) be the inner normal vector to S at (ϑ^T, η)^T ∈ S. In the sense of Remark 1 in [14], in what follows we will simply write Theorem 3.2. In formulas (1)-(3) and (5), the surface content measure O_S allows the representation In addition to the general local definition in formula (4), the result of this theorem allows the integral or differential geometric interpretation of the surface measure O_S in the geometry having K^o as its unit ball. Note that the special case where surface content of subsets of the boundary of the antinorm ball K = B_{a,p} = {x ∈ R^n : ( ∀i is measured based upon the corresponding semi-antinorm geometry has been dealt with already in [14]. Figures 5 and 6 show sample clouds of the same sample size and from the same distribution class as in Section 3.1 but with parameter p chosen from the interval (0, 1). A large proportion of the probability mass can be observed moving to the far tails of such a distribution as p approaches zero. At the same time one can identify several points which could be considered outliers if they had appeared under other circumstances. Hence, the parameter p of such a distribution might be called a shape-tail parameter.

Statistical applications
This section deals with several examples where formula (5) applies.The first three examples present relatively immediate applications while the last example concerns a more advanced situation for calculating the ipf.
Example 3.1. Let X_1, ..., X_n be independent rv's following the common density of the power exponential (or p-generalized Gaussian, or p-generalized Laplace) distribution, where the location parameter µ ∈ R and the shape-concentration or shape-tail parameter p > 0 are known and the scaling parameter σ is unknown. The maximum-likelihood estimator (mle) of σ is and the random vector X_(n) = (X_1, ..., X_n)^T follows the density ϕ_{g_PE,K,ν} where ν = (µ, ..., µ)^T, |x|_p = (Σ_{i=1}^n |x_i|^p)^{1/p}, and The set B_p is convex if p ≥ 1, and radially concave w.r.t. the standard fan in follows a centered convex or radially concave contoured distribution if p ≥ 1 or 0 < p ≤ 1, respectively. It turns out that where I(g_PE) = p^{n/p−1} Γ(n/p), Γ denotes the Gamma function, S_p = ∂B_p is the p-sphere, i.e. the set of boundary points of B_p, and the p-sphere ipf of t^{1/p} B_p is Consequently, i.e. n(σ̂/σ)^p follows the χ^p_{g_PE}-density with n d.f. introduced in [12], t → t^{n/p−1} e^{−t/p} / (p^{n/p} Γ(n/p)) = f_{χ_{n,g_PE,p}}(t), t > 0.
This exact distributional result allows constructing confidence intervals for and testing hypotheses on the scaling parameter σ.
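The construction of such a confidence interval can be sketched in Python. The sketch rests on an assumption that is not spelled out above: the common density is taken to be proportional to exp(−|x − µ|^p/(p σ^p))/σ, so that the mle is σ̂ = ((1/n) Σ |X_i − µ|^p)^{1/p}. Under this assumption the pivot n(σ̂/σ)^p has the density given in Example 3.1, which is a Gamma density with shape n/p and scale p. The function names and the Monte Carlo approximation of the pivot quantiles are illustrative choices only.

```python
import random


def mle_sigma(xs, mu, p):
    """MLE of sigma, assuming a density proportional to
    exp(-|x - mu|^p / (p sigma^p)) / sigma (assumption, see lead-in)."""
    n = len(xs)
    return (sum(abs(x - mu) ** p for x in xs) / n) ** (1.0 / p)


def pivot_quantiles(n, p, alpha, draws=100_000, seed=1):
    """Monte Carlo quantiles of the pivot n (sigma_hat / sigma)^p.

    The pivot density t^(n/p-1) exp(-t/p) / (p^(n/p) Gamma(n/p)) is a Gamma
    density with shape n/p and scale p, which random.gammavariate samples.
    """
    rng = random.Random(seed)
    sims = sorted(rng.gammavariate(n / p, p) for _ in range(draws))
    lower = sims[int(draws * alpha / 2)]
    upper = sims[int(draws * (1 - alpha / 2)) - 1]
    return lower, upper


def confidence_interval(xs, mu, p, alpha=0.05):
    """Approximate (1 - alpha) confidence interval for sigma."""
    n = len(xs)
    s = mle_sigma(xs, mu, p)
    q_lo, q_hi = pivot_quantiles(n, p, alpha)
    # q_lo <= n (s / sigma)^p <= q_hi  is equivalent to
    # s (n / q_hi)^(1/p) <= sigma <= s (n / q_lo)^(1/p)
    return s * (n / q_hi) ** (1.0 / p), s * (n / q_lo) ** (1.0 / p)
```

A test of H0: σ = σ0 at level alpha can be read off the same pivot by rejecting whenever σ0 lies outside the interval. Exact Gamma quantiles from a statistics library could replace the simulated ones.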
Example 3.2. We consider independent rv's X_1, ..., X_n following the densities where µ_i ∈ R, p > 0, a_i > 0 are known and σ > 0 is an unknown common scaling parameter.
The mle of σ allows the functional representation where The density of where B_{a,p} = {x ∈ R^n : |x|_{a,p} ≤ 1}. Note that F_{S_{a,p}}(t^{1/p} B_{a,p}, r) does not depend on a. Hence, n(σ̂/σ)^p follows the χ^p_{g_PE}-distribution with n d.f., independently of a = (a_1, ..., a_n)^T.
While Examples 3.1 and 3.2 are restricted to the p-generalized Gaussian or Laplace distribution which is defined using the dgf g_PE, p > 0, Example 3.3 deals with measuring the same sets as before but with measures having another level distribution, especially allowing lighter and heavier distribution tails. Such tails may be of interest in various types of applications.
Example 3.3. Let us assume that Y follows the probability distribution law Φ_{g,K,ν} with ν ∈ R^n, K = B_{σa,p} and dgf g. Examples of dgfs are, besides g_PE, the Kotz type dgf g_K defined by g_K(r) = r^{M−1} e^{−β r^γ}, β, γ > 0, M + n > 1, and the Pearson-VII-type dgf g_PT7 defined by g_PT7(r) = (1 + r/m)^{−M}, M > n, m > 0. Note that the Student- and Cauchy-type dgfs are special cases of g_PT7, and that where B denotes the Beta function. It follows with B_{σa,p} = σB_{a,p} and Note that our earlier representation of this density in [13] differs from the present one because of the (slightly) different notation for the dgf g.
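How a change of the dgf alters only the radial part, while the contour defining ball stays fixed, can be sketched in Python. The sketch makes several assumptions that go beyond the text of the example: n = 2, a = (1, 1) and σ = 1 (so K is the unit p-ball B_p), the Kotz type dgf g_K, and a radius R whose density is proportional to r^{n−1} g(r) as in the stochastic representation of Section 2. It also uses the known fact for l_{n,p}-symmetric distributions that a p-generalized Gaussian vector divided by its p-functional is p-generalized uniformly distributed on the p-sphere. All function names are hypothetical.

```python
import random


def p_functional(x, p):
    """|x|_p = (sum |x_i|^p)^(1/p); a norm if p >= 1, an antinorm if 0 < p <= 1."""
    return sum(abs(c) ** p for c in x) ** (1.0 / p)


def uniform_basis(p, n, rng):
    """Draw U_S on the p-sphere by normalizing a p-generalized Gaussian vector."""
    x = []
    for _ in range(n):
        # |X|^p / p ~ Gamma(1/p, 1) if X has density proportional to exp(-|x|^p / p)
        magnitude = (p * rng.gammavariate(1.0 / p, 1.0)) ** (1.0 / p)
        x.append(magnitude if rng.random() < 0.5 else -magnitude)
    norm = p_functional(x, p)
    return [c / norm for c in x]


def kotz_radius(n, M, beta, gamma, rng):
    """Draw R with density proportional to r^(n-1) g_K(r) = r^(n+M-2) exp(-beta r^gamma).

    Then beta * R^gamma ~ Gamma((n + M - 1) / gamma, 1), which requires M + n > 1.
    """
    t = rng.gammavariate((n + M - 1.0) / gamma, 1.0)
    return (t / beta) ** (1.0 / gamma)


def sample_kotz_p_contoured(nu, p, M, beta, gamma, rng):
    """One draw of Y = nu + R * U_S with independent R and U_S."""
    n = len(nu)
    r = kotz_radius(n, M, beta, gamma, rng)
    u = uniform_basis(p, n, rng)
    return [v + r * c for v, c in zip(nu, u)]
```

Replacing `kotz_radius` by a sampler for another radial law changes the tail behaviour of Y, whereas the shape of the contours is governed by p alone.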
We define X = (X_1, ..., X_{n_1})^T and Y = (Y_1, ..., Y_{n_2})^T. It follows from the well known theory of exponential families that the statistic Thus, for every measurable function h : R^2_+ → R, A non-constant ipf of the set B(t) has been dealt with for different functions h and under different parameter assumptions in earlier papers of the author and several coauthors, see [14].
Example 3.5. For data in [8] reporting the profits at the box office and the number of sold home videos, the authors of [10] study fitting a linear regression model with random errors distributed according to an exponential power distribution. When analyzing the residuals they present Q-Q-plots for both the exponential power distribution with the estimated parameter p = 2.386877 and the normal distribution (p = 2). The observer's subjective impression after a visual inspection of these plots may be that one cannot really be sure which of the two error models to prefer. This example, which was presented by the authors only to show the use of the functions they implemented for fitting a linear regression model with possibly exponential power distributed errors, raises, in a similar two-dimensional situation, the following standard question of statistical practice: how large should a sample size be to make a practical decision based upon a visual inspection "relatively safe"? In particular, how large should a sample size be for the observer to be able to choose visually between the two two-dimensional p-generalized normal distributions with parameters p = 2 and p = 2.388677?
This question, clearly, is not formulated in a rigorous mathematical way, and will not be answered in such a way here. Instead, we present Figures 6-8 showing that one can hardly distinguish in this way between the parameters p = 2 and p = 2.388677 of the two-dimensional p-generalized normal distribution even if sample sizes are large. As a consequence, one may ask, e.g., for a mathematical method yielding a sure decision about the first decimal place of the parameter p, say. For a certain general 20-percent sensitivity and g-robustness principle, established when dealing with another particular problem, we refer to Application 2 and Section 3 in [6].
Example 3.6. (a) Simulation in dimension one. There are several possibilities for simulating the p-generalized normal distribution. The p-generalized polar method and the p-generalized rejecting polar method are established in [7] and compared with several methods known from the literature. Moreover, the resulting recommendation of which method to use in which situation is implemented by S. Kalke in the R package 'pgnorm'. are introduced in [12] and applied to the class of l_{n,p}-symmetric distributions in [13], and the polar angle Φ has the pdf For a graphical representation of this function, we refer to [7].
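A minimal Python sketch of one such simulation possibility is given below. It is neither the p-generalized polar method of [7] nor the implementation in 'pgnorm'; instead it uses the standard gamma transformation, assuming the one-dimensional density is proportional to exp(−|x|^p/p), in which case |X|^p/p follows a Gamma distribution with shape 1/p and scale 1. The function names are hypothetical.

```python
import random


def sample_pgnorm(p, rng):
    """One draw from the density proportional to exp(-|x|^p / p), p > 0."""
    t = rng.gammavariate(1.0 / p, 1.0)  # T = |X|^p / p ~ Gamma(1/p, 1)
    magnitude = (p * t) ** (1.0 / p)
    # attach a random sign, since the density is symmetric about zero
    return magnitude if rng.random() < 0.5 else -magnitude


def sample_cloud(p, k, seed=0):
    """k independent points in R^2 with i.i.d. p-generalized normal coordinates."""
    rng = random.Random(seed)
    return [(sample_pgnorm(p, rng), sample_pgnorm(p, rng)) for _ in range(k)]


# a two-dimensional sample cloud of size k = 2000 in the antinorm case p = 0.5
cloud = sample_cloud(p=0.5, k=2000)
```

Plotting `cloud` for several values of p gives pictures of the kind described in Sections 3.1 and 3.2. For p = 2 the sampler reduces to the standard normal case.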

Proofs
The general method of proof in this paper can be divided into two main parts. Using the properties of the support function of a convex body, h_K, in the first part it will be shown that the absolute value of the Jacobian of a certain transformation may be interpreted in terms of the normal vector N to the boundary of K. According to Lemma 1 in [14], this allows representing the surface measure O_S as an integral of h_K(N). The second part of the method of proof deals with a relation between the functional h_K(N) and the Minkowski functional of a suitably defined set K^*, or K^o. While, in the convex case, K^* is extensively studied, K^o has yet to be found in the most general case when K is radially concave w.r.t. a fan in R^n.
Proof of Theorem 3.1. The support function of the convex body K is defined as Recall that if u ∈ S^(n−1) then h_K(u) describes the distance from the origin to the hyperplane with outer normal vector u and supporting K. For compactness and continuity reasons, the supremum is always attained, The set of all such points x_S is called the supporting set of K at u. If the norm is smooth then x_S(u) ∈ S ∩ T(u) where T(u) is the tangent hyperplane to S at the point x_S(u) with T(u) being orthogonal to u. If K is strongly convex then the supporting set of K at u consists of just one point, so that x_S(u) is then always uniquely defined. Note that it may happen that a point ξ ∈ S satisfies ξ = x_S(u) for more than one point u ∈ S^(n−1). To see this, assume that S contains corner points, and let ξ be such a corner point of S. As a consequence, even the union of all supporting sets of a convex body may be finite. According to Lemma 1 in [14], where the function ϑ → η(ϑ) is chosen such that d_K((ϑ^T, η(ϑ))^T) = 1 describes the boundary S of K. Note that here x = (ϑ^T, η(ϑ))^T ∈ S and a.e. N(ϑ) = (∇η(ϑ), −1)^T is the outer normal vector of S at (ϑ^T, η(ϑ))^T, thus (ϑ^T, η(ϑ)) N(ϑ) > 0. It follows from (10) that The proof will be finished by the following lemma. Proof. The radial function of the radially concave star-shaped set K^o is, on the one hand, by definition , u ∈ S^(n−1), and allows, on the other hand, the representation

Discussion
To make both the similarity and the difference between Theorems 3.1 and 3.2 more visible, let us remark that K^* allows a representation looking similar to that of K^o, K^* = {y ∈ R^n : y^T x ≤ 1, ∀x ∈ K} = {λ(u)u : 0 ≤ λ(u) h_K(u) ≤ 1, u ∈ S^(n−1)}.
For dealing with a combined example where both Theorems 3.1 and 3.2 apply, we recall that the function x → |x|_{a,p} is a norm if p ≥ 1 and, according to [11], an antinorm if 0 < p ≤ 1. Thus, K = B_{a,p} is convex if p ≥ 1 and radially concave w.r.t. the standard fan F if 0 < p ≤ 1.
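The dichotomy between the norm case and the antinorm case can be checked numerically on a small example. The sketch below assumes the form |x|_{a,p} = (Σ |x_i/a_i|^p)^{1/p} for the functional, which is not written out above, and uses two points from the cone of non-negative coordinates; the function name and the chosen points are illustrative.

```python
def functional_ap(x, a, p):
    """|x|_{a,p} = (sum |x_i / a_i|^p)^(1/p) (assumed form of the functional)."""
    return sum(abs(xi / ai) ** p for xi, ai in zip(x, a)) ** (1.0 / p)


a = (2.0, 1.0)
x, y = (2.0, 0.0), (0.0, 1.0)  # both lie in the cone of non-negative coordinates
s = tuple(xi + yi for xi, yi in zip(x, y))

# p >= 1: triangle inequality (norm), so this gap is non-negative
norm_gap = functional_ap(x, a, 2.0) + functional_ap(y, a, 2.0) - functional_ap(s, a, 2.0)

# 0 < p <= 1: superadditivity inside one cone of the fan (antinorm),
# so this gap is non-negative as well
antinorm_gap = functional_ap(s, a, 0.5) - functional_ap(x, a, 0.5) - functional_ap(y, a, 0.5)
```

Superadditivity is only claimed for points of the same cone of the fan. For points from different cones, such as x and −y here, it can fail.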
Let q be defined by the equation 1/p + 1/q = 1, and 1/a = (1/a_1, ..., 1/a_n)^T. Note that if p > 1 then K^* = B_{1/a,q}, and if 0 < p < 1 then q < 0 and is a semi-antinorm ball. For an illustration of such sets, see [11]. Note that a and p are independently dealt with when constructing the sets K^* and K^o.
Remark 5.1. In the applications of Theorem 3.1, the surface content of [(1/r)(B − ν)] ∩ S (which may be considered as an indivisible of the set B − ν) is always measured w.r.t. the metric d_{K^*}. Successful applications of this generalized method of indivisibles are surveyed in [14]. The representation of Theorem 3.1 generalizes those presented in the latter and several earlier papers.
Remark 5.2. The set K^o in Section 3.2 is radially concave, thus d_{K^o} is a semi-antinorm.
Proof. We show that if x_1 and x_2 are from (K^o)^c ∩ C for some C ∈ F then λx_1 + (1−λ)x_2 ∈ (K^o)^c ∩ C for 0 < λ < 1 where (K^o)^c means the complement of the set K^o. Let x_1 = λ_1 u_1, x_2 = λ_2 u_2 with u_i ∈ S^(n−1), λ_i ≥ 1 1 and distributions being radially concave w.r.t. a fan in R^n, or antinorm contoured, in Section 3.2. The main aim of these two sections is to give closer descriptions of O_S being basic for both the general stochastic representation of Y in (2) and the specific geometric measure representations of ω_S in (3) and Φ_{g,K,ν} in (1) and (5) in case Y has a density. Moreover, two-dimensional distributions are illustrated by graphics showing simulated sample clouds. Applications to estimating and testing hypotheses on scaling parameters are demonstrated in Section 3.3. The proofs of the results from Sections 3.1 and 3.2 will be presented in Section 4, and a final discussion of the results follows in Section 5.

Figure 2: Convex contoured cases far from the normal one

Lemma 4.1. The anti-support function of K w.r.t. F is equal to the distance function of K^o, h^F_K = d_{K^o}.