An Estimator of Heavy Tail Index through the Generalized Jackknife Methodology

In practice, the data can sometimes be divided into several blocks, with only a few of the largest observations within each block available for estimating the heavy tail index. To address this problem, we propose a new class of estimators through the Generalized Jackknife methodology, based on Qi's estimator (2010). These estimators are proved to be asymptotically normal under suitable conditions. Compared to Hill's estimator and Qi's estimator, our new estimator has better asymptotic efficiency in terms of the minimum mean squared error for a wide range of the second order shape parameter. For finite samples, our new estimator still compares favorably to Hill's estimator and Qi's estimator, providing stable sample paths as a function of the number of blocks, smaller estimation bias, and smaller MSE.


Introduction
Heavy tailed distributions have been applied in many fields, such as finance, insurance, telecommunications, natural calamities, and environmental science. The heavy tail index plays a very important role: the larger the tail index, the heavier the distributional tail and the more frequent the rare events. Thus, how to estimate the tail index of a heavy tailed distribution has attracted much attention in the literature. Since the 1970s, [1][2][3][4][5][6][7] have proposed various parametric or semiparametric estimators. These estimators are constructed from the upper order statistics exceeding a certain threshold.
However, sometimes only the information on the largest value is recorded, or only the several largest observations are available for analysis. Specifically, the data can sometimes be divided into several blocks, with only a few of the largest observations within each block usable for inference. For example, in financial data it is very common that only the information on the few largest quoted prices is reported to the public (see [8,9]). In meteorology, only the highest and lowest temperatures of each day are forecast. In many athletic competitions, only the scores of the few best players are observed, and these can be considered the largest observations within each game. Other practical situations are mentioned in [10][11][12].
Thus, Davydov et al. [13] propose a new estimator for the tail index. In their approach, observations are divided into several blocks, and the estimator of the tail index is constructed from the ratios of the largest and second largest terms within blocks. Since the Davydov-Paulauskas-Račkauskas (DPR) approach does not use all the upper order statistics, it may not be as efficient, in the sense of the minimum mean squared error (MSE), as Hill's estimator (see [1]), the best-known estimator of the tail index. However, when only the several largest observations within each block are available for analysis, the DPR approach has an advantage over the others, since none of the aforementioned methods is applicable.
A similar idea to DPR's is used by [14], who study the limiting distribution of Galton's ratio computed from each of the blocks in the entire sample and develop a parallel procedure to test whether the underlying distribution is in the domain of attraction of the Gumbel distribution. Paulauskas [15] studies the properties of the DPR estimator and shows that, besides the simplicity of the statistic used, its large sample performance is good. After investigating the asymptotic properties of the DPR estimator, Paulauskas and Vaičiulis [16] and Vaičiulis [17] propose a class of modifications of the DPR estimator with better asymptotic properties but a nonnull bias. Qi [18] proposes a new class of estimators using a setup similar to DPR's, exploiting the fact that only the several largest observations within each block can be used for inference. Qi's estimator is more efficient than DPR's in the sense that it has a smaller asymptotic variance under second order regular variation, but it has a nonnull asymptotic bias dependent on the number of largest random variables used for inference within each block.
The main purpose of this paper is to propose a new class of estimators for the tail index, with a null asymptotic bias and a smaller asymptotic variance than the aforementioned methods, through the Generalized Jackknife methodology. The Generalized Jackknife methodology, based on nonparametric resampling techniques, reduces the bias of an estimator by considering a combination of two suitable estimators. Apart from the application in this paper, the first estimator obtained through the Generalized Jackknife methodology is the one introduced by [19], in a different context. Gomes et al. [20] propose several Generalized Jackknife estimators, through suitable Generalized Jackknife methodologies associated with de Vries' estimator (see [21]) and Hill's estimator. They find that these statistics can be used to reduce bias, preferably without increasing the MSE (not an easy goal to achieve for all values of the second order shape parameter), and that their finite sample performance is closely related to the sample size. Gomes et al. [22] propose a class of Generalized Jackknife estimators associated with any two members of the class of Hill's estimators and improve on the well-known bias-variance trade-off of Hill's estimator, both asymptotically and for finite samples, when the underlying distribution is in Hall's class of models.
The Jackknife methodology may easily be generalized to other semiparametric estimators of the tail index. Falk [23] studies convex combinations of two members of the class of Pickands' estimators, showing their superiority over Pickands' estimator. However, the simulation results presented in [24] show that convex combinations of two members of the class of Hill's estimators do not greatly improve the behavior of the original Hill's estimator. Similar studies based on the Hill estimator are carried out in [25,26], providing a new class of estimators for $\gamma \in \mathbb{R}^{+}$ under second order regular variation. Thus, motivated by the better asymptotic efficiency of Qi's estimator and the bias-reducing capability of the Generalized Jackknife methodology, we propose a Jackknife estimator associated with Qi's estimators at two different levels. Asymptotic comparisons and simulation studies show that the new estimator offers an improvement in terms of the minimum MSE, for a wide range of the second order shape parameter, over the well-known Hill's estimator and the original Qi's estimator.
The rest of the paper is organized as follows. Section 2 briefly introduces some necessary preliminaries. In Section 3, the new estimators are introduced and discussed asymptotically. In Section 4, asymptotic comparisons of the tail index estimators under study are provided. In Section 5, their finite sample performance is illustrated through the Monte Carlo technique. Finally, in Section 6, some conclusions are given.

Preliminaries
To derive the asymptotic properties of our new estimator and compare its asymptotic efficiency with that of other well-known estimators, we first collect some necessary preliminaries on regular variation and on the asymptotic properties of the other estimators.
Let $X_1, X_2, \ldots, X_n$ be a set of $n$ independent and identically distributed (iid) random variables with a common distribution function (df) $F$ satisfying
$$1 - F(x) = x^{-1/\gamma} L(x)$$
for large $x$, where $L(x)$ is a slowly varying function; that is, for every $t > 0$, $L(tx)/L(x) \to 1$ as $x \to \infty$. Consequently, $1 - F \in RV_{-1/\gamma}$, where $RV_{-1/\gamma}$ stands for the class of regularly varying functions at infinity with index of regular variation equal to $-1/\gamma$. Denote the associated ascending order statistics (o.s.) by $X_{1,n} \le X_{2,n} \le \cdots \le X_{n,n}$. If the maximum, linearly normalized by real constant sequences $\{a_n > 0\}$ and $\{b_n \in \mathbb{R}\}$, is such that $(X_{n,n} - b_n)/a_n$ converges in distribution to a nondegenerate limit, that limit is the generalized extreme value (GEV) distribution
$$EV_{\gamma}(x) = \exp\left(-(1 + \gamma x)^{-1/\gamma}\right), \quad 1 + \gamma x > 0,$$
and $F$ is then said to be in the max-domain of attraction of $EV_{\gamma}$, denoted by $F \in D_M(EV_{\gamma})$.
For $\gamma > 0$, both the first and second order behavior of df's in the domain of attraction of $EV_{\gamma}$ are well understood. With $U(t) = F^{\leftarrow}(1 - 1/t)$, $t \ge 1$, the first order behavior is given by
$$\lim_{t \to \infty} \frac{U(tx)}{U(t)} = x^{\gamma}, \quad x > 0, \qquad (3)$$
and the conditions in (3) characterize completely the first order behavior of $U(\cdot)$. To make inference about $\gamma$, second order behavior stronger than the first order behavior is required, as in [27]. Throughout this paper, we assume that there exists a function $A(t) \to 0$ as $t \to \infty$ such that
$$\lim_{t \to \infty} \frac{U(tx)/U(t) - x^{\gamma}}{A(t)} = x^{\gamma}\,\frac{x^{\rho} - 1}{\rho} \qquad (5)$$
for all $x > 0$, where $|A(t)|$ must then be regularly varying with index $\rho$, that is, $|A| \in RV_{\rho}$ (see [28]), and $\rho$ is a second order shape parameter, which eventually also needs to be properly estimated from the original sample; its estimation will be addressed in another paper. In this paper, we assume that (5) holds with $\rho < 0$ and that we can choose $A(t) = \gamma \beta t^{\rho}$ with $\beta \ne 0$, a second order scale parameter.
As the most popular semiparametric estimator of the tail index $\gamma$, Hill's estimator enjoys weak consistency, strong consistency, and asymptotic normality. Based on the $k$ largest order statistics, the Hill estimator is defined by
$$\hat{\gamma}^{H}_{n}(k) = \frac{1}{k}\sum_{i=1}^{k}\left(\log X_{n-i+1,n} - \log X_{n-k,n}\right).$$
For any intermediate sequence $k = k(n)$, that is, a sequence such that
$$k \to \infty, \quad k/n \to 0, \quad \text{as } n \to \infty, \qquad (7)$$
the following distributional representation for the Hill estimator holds under the second order condition in (5):
$$\hat{\gamma}^{H}_{n}(k) \stackrel{d}{=} \gamma + \frac{\gamma}{\sqrt{k}}Z^{(1)}_{k} + \frac{A(n/k)}{1-\rho}\left(1 + o_{p}(1)\right),$$
where $Z^{(1)}_{k}$ is asymptotically a standard normal r.v., and the corresponding asymptotic mean square error (AMSE) is given by
$$\mathrm{AMSE}\left(\hat{\gamma}^{H}_{n}(k)\right) = \frac{\gamma^{2}}{k} + \left(\frac{A(n/k)}{1-\rho}\right)^{2}.$$
Davydov et al. [13] and Paulauskas [15] propose a different estimator of the tail index, as follows. First, divide the sample $X_{1}, \ldots, X_{n}$ into $k_{n}$ blocks $V_{1}, \ldots, V_{k_{n}}$, each block containing $m_{n} = [n/k_{n}]$ observations, where $[x]$ denotes the integer part of $x > 0$. Denote by $M^{(1)}_{j} \ge M^{(2)}_{j} \ge \cdots$ the descending order statistics of the observations in the $j$th block, and set
$$S_{n} = \frac{1}{k_{n}}\sum_{j=1}^{k_{n}}\frac{M^{(2)}_{j}}{M^{(1)}_{j}}, \qquad \hat{\gamma}^{DPR}_{n} = \frac{S_{n}}{1 - S_{n}},$$
an estimator of $\gamma$ whose behavior is established under the second order condition in (5). Qi [18] proposes a new class of estimators using a setup similar to DPR's, which may depend on more information on the largest observations in each block. Let $m \ge 1$ be an integer and assume that the $m+1$ largest random variables within each of the $k_{n}$ blocks are used to estimate $\gamma$:
$$\hat{\gamma}^{Q}_{n}(k_{n}, m) = \frac{1}{m k_{n}}\sum_{j=1}^{k_{n}}\sum_{i=1}^{m}\left(\log M^{(i)}_{j} - \log M^{(m+1)}_{j}\right),$$
where $k_{n}$ satisfies the intermediate condition as in (7).
If $\sqrt{m k_{n}}\,A(n/k_{n}) \to \lambda$, finite, as $n \to \infty$, then under the second order condition in (5) a distributional representation analogous to the Hill case holds,
$$\hat{\gamma}^{Q}_{n}(k_{n}, m) \stackrel{d}{=} \gamma + \frac{\gamma}{\sqrt{m k_{n}}}Z^{(2)}_{k_{n}} + b_{m,\rho}\,A(n/k_{n})\left(1 + o_{p}(1)\right),$$
where $Z^{(2)}_{k_{n}}$ is asymptotically standard normal and $b_{m,\rho}$ is a bias coefficient depending on $m$ and $\rho$, and the asymptotic mean square error is given by
$$\mathrm{AMSE}\left(\hat{\gamma}^{Q}_{n}(k_{n}, m)\right) = \frac{\gamma^{2}}{m k_{n}} + b_{m,\rho}^{2}\,A^{2}(n/k_{n}).$$
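As a concrete illustration of the two estimators above, the following is a minimal Python sketch on a simulated strict Pareto sample; the form of Qi's block estimator here follows our reading of [18] and should be checked against the original:

```python
import math
import random

def hill_estimator(sample, k):
    """Hill's estimator based on the k largest order statistics."""
    xs = sorted(sample)
    n = len(xs)
    top = xs[n - k:]                     # the k largest observations
    threshold = xs[n - k - 1]            # X_{n-k,n}, the (k+1)-th largest
    return sum(math.log(x) - math.log(threshold) for x in top) / k

def qi_estimator(sample, k_blocks, m):
    """Block estimator in the spirit of Qi (2010): within each of k_blocks
    blocks, average the log-excesses of the m largest observations over the
    (m+1)-th largest (form assumed here)."""
    n = len(sample)
    size = n // k_blocks
    total = 0.0
    for j in range(k_blocks):
        block = sorted(sample[j * size:(j + 1) * size], reverse=True)
        ref = block[m]                   # (m+1)-th largest within the block
        total += sum(math.log(block[i]) - math.log(ref) for i in range(m))
    return total / (m * k_blocks)

random.seed(1)
gamma = 0.5
# Strict Pareto sample with 1 - F(x) = x^{-1/gamma}, by inverse transform
sample = [(1.0 - random.random()) ** (-gamma) for _ in range(20000)]
print(hill_estimator(sample, 200))
print(qi_estimator(sample, 100, 1))
```

For a strict Pareto parent, both estimates should fall close to the true value $\gamma = 0.5$, since the second order term vanishes.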

Our New Estimators
The main objective of the Jackknife methodology (see [29]) is to reduce the bias of an estimator by combining two different estimators with similar asymptotic properties. Specifically, as a particular case of the Jackknife theory, suppose there exist two biased consistent estimators $\hat{\gamma}^{(1)}_{n}$ and $\hat{\gamma}^{(2)}_{n}$ of $\gamma$, with asymptotic biases $d_{1}b(n)$ and $d_{2}b(n)$ sharing the same function $b(n)$. Choosing the weight $q = d_{1}/d_{2}$ between $\hat{\gamma}^{(1)}_{n}$ and $\hat{\gamma}^{(2)}_{n}$ eliminates the dominant asymptotic bias: the Generalized Jackknife statistic associated with $(\hat{\gamma}^{(1)}_{n}, \hat{\gamma}^{(2)}_{n})$,
$$\hat{\gamma}^{GJ}_{n} = \frac{\hat{\gamma}^{(1)}_{n} - q\,\hat{\gamma}^{(2)}_{n}}{1 - q},$$
is an asymptotically unbiased consistent estimator of $\gamma$, provided $q \ne 1$ for every $n$.
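The bias cancellation can be checked with a toy numerical example; the numbers below are illustrative placeholders, not quantities from the paper:

```python
# Two estimators of gamma with biases d1*b and d2*b sharing the same factor b
# (playing the role of A(n/k)); the Generalized Jackknife combination
# (g1 - q*g2)/(1 - q) with q = d1/d2 cancels that bias exactly.
gamma = 0.5
b = 0.08            # common bias factor
d1, d2 = 1.0, 2.0   # distinct bias coefficients, so q = d1/d2 != 1
g1 = gamma + d1 * b
g2 = gamma + d2 * b
q = d1 / d2
gj = (g1 - q * g2) / (1 - q)
print(gj)           # recovers gamma up to floating point rounding
```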
It is not difficult to obtain information about the asymptotic bias of estimators in extreme value theory (EVT), so one can use this information to build new estimators with reduced asymptotic bias. In this paper, we work with the estimator $\hat{\gamma}^{Q}_{n}(k_{n}, m)$ proposed by [18] and build the associated Generalized Jackknife estimator, which may provide stable sample paths as functions of the implied parameters and a flatter mean square error. The estimator $\hat{\gamma}^{Q}_{n}(k_{n}, m)$ involves two different parameters, namely the number of blocks $k_{n}$ and the number $m$ of largest random variables used for inference within each block, which generate three classes of different Generalized Jackknife statistics.
The first class is associated with the pair $(\hat{\gamma}^{Q}_{n}(k_{n}, m), \hat{\gamma}^{Q}_{n}([k_{n}/2], m))$, Qi's estimator computed at the two levels $k_{n}$ and $[k_{n}/2]$, with $q = 2^{-\rho}$; that is,
$$\hat{\gamma}^{GJ}_{n,\rho}(k_{n}, m) = \frac{\hat{\gamma}^{Q}_{n}(k_{n}, m) - 2^{-\rho}\,\hat{\gamma}^{Q}_{n}([k_{n}/2], m)}{1 - 2^{-\rho}},$$
dependent on the second order parameter $\rho$, which eventually needs to be estimated by a consistent estimator $\hat{\rho}$.

Remark 1. We may also estimate $\rho$ adequately, either internally as in [30,31], or externally as done successfully in [32], through any of the $\rho$-estimators available in the literature, like the ones in [33,34].

Theorem 2. Under the second order condition in (5), with $k_{n} \to \infty$, $k_{n}/n \to 0$, and $\sqrt{k_{n}}\,A(n/k_{n}) \to \lambda$, finite, as $n \to \infty$, $\sqrt{k_{n}}\left(\hat{\gamma}^{GJ}_{n,\rho}(k_{n}, m) - \gamma\right)$ is asymptotically normal with a null mean value.

Proof. This asymptotic normality can be argued briefly as follows.
The estimator $\hat{\gamma}^{GJ}_{n,\rho}(k_{n}, m)$ is thus asymptotically unbiased, with an asymptotic variance inversely proportional to $m$; indeed, with $\mathrm{Cov}_{\infty}(\hat{\gamma}^{Q}_{n}(k_{n}, m), \hat{\gamma}^{Q}_{n}([k_{n}/2], m)) = \gamma^{2}/(m k_{n})$, the coefficient of $\gamma^{2}$ in the asymptotic variance depends on both $\rho$ and $m$. The Hill estimator, like many other semiparametric estimators of the tail index, is consistent for intermediate ranks but has high bias for large values of $k$ and high variance for small values of $k$. Does the estimator $\hat{\gamma}^{GJ}_{n,\rho}(k_{n}, m)$, as a function of $k_{n}$ and $m$, exhibit a similar behavior to the Hill estimator? We will answer this question in the remaining parts of this paper.
Remark 3. Since the estimation of the second order shape parameter $\rho$ is still problematic, it is useful to analyze the behavior of $\hat{\gamma}^{GJ}_{n,\rho}(k_{n}, m)$ for a nonoptimal choice of $\rho$. Due to the high bias and variance of the existing estimators of $\rho$, we will not estimate the value of $\rho$ for $\hat{\gamma}^{GJ}_{n,\rho}(k_{n}, m)$ in this paper.
For the new estimator $\hat{\gamma}^{GJ}_{n}(k_{n}, m)$, increasing the value of $m$ decreases the asymptotic variance of the estimator while increasing the asymptotic bias when the bias is not negligible, just as for Qi's estimator $\hat{\gamma}^{Q}_{n}(k_{n}, m)$. Therefore, one has to be cautious in selecting the value of $m$ in practice under the optimal mean squared error criterion. The new estimator $\hat{\gamma}^{GJ}_{n}(k_{n}, m)$ is the particular case of the Generalized Jackknife estimator $\hat{\gamma}^{GJ}_{n,\rho}(k_{n}, m)$ obtained by assuming the known value $\rho = -1$. Because the second order shape parameter is thus fixed rather than consistently estimated, $\hat{\gamma}^{GJ}_{n}(k_{n}, m)$ has a nonnull asymptotic bias, as presented in (34), unlike the asymptotically unbiased $\hat{\gamma}^{GJ}_{n,\rho}(k_{n}, m)$. Even so, the asymptotic bias of our new estimator $\hat{\gamma}^{GJ}_{n}(k_{n}, m)$ is always smaller than the asymptotic bias of Qi's estimator, since $2^{\rho+1} - 1 < 1$ for $\rho < 0$, when the same value of $m$ is used in both estimators. However, compared to Qi's estimator, our new estimator $\hat{\gamma}^{GJ}_{n}(k_{n}, m)$ increases the asymptotic variance by a factor of five.
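Under this reading of the construction, the $\rho = -1$ case reduces to the simple combination $2\hat{\gamma}^{Q}_{n}([k_{n}/2], m) - \hat{\gamma}^{Q}_{n}(k_{n}, m)$. A minimal sketch, with Qi's estimator in the form assumed earlier and a Burr-type parent whose true second order parameter is $\rho = -1$:

```python
import math
import random

def qi_estimator(sample, k_blocks, m):
    """Assumed form of Qi's estimator: mean of log-excesses over the
    (m+1)-th largest within each of k_blocks blocks."""
    n = len(sample)
    size = n // k_blocks
    total = 0.0
    for j in range(k_blocks):
        block = sorted(sample[j * size:(j + 1) * size], reverse=True)
        total += sum(math.log(block[i] / block[m]) for i in range(m))
    return total / (m * k_blocks)

def gj_estimator(sample, k_blocks, m):
    """Generalized Jackknife of Qi's estimator at levels k and [k/2],
    with the second order parameter fixed at rho = -1."""
    return 2.0 * qi_estimator(sample, k_blocks // 2, m) \
        - qi_estimator(sample, k_blocks, m)

random.seed(2)
gamma = 1.0
# Parent with 1 - F(x) = 1/(1 + x): tail index gamma = 1, rho = -1
sample = [1.0 / (1.0 - random.random()) - 1.0 for _ in range(20000)]
print(gj_estimator(sample, 200, 1))
```

Because the true $\rho$ equals the assumed value $-1$ here, the dominant bias cancels and the estimate should sit near $\gamma = 1$, at the price of the five-fold variance noted above.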

Asymptotic Comparison of the Estimators at Optimal Levels
If the asymptotic bias of the estimator $\hat{\gamma}^{GJ}_{n}(k_{n}, m)$ is not negligible, we should compare the asymptotic efficiency of the new estimator with that of other estimators. The well-known Hill's estimator and the original Qi's estimator are considered here because of their outstanding asymptotic properties.
Under the second order condition in (5), the semiparametric tail index estimators $\hat{\gamma}^{H}_{n}(k)$, $\hat{\gamma}^{Q}_{n}(k_{n}, m)$, and $\hat{\gamma}^{GJ}_{n}(k_{n}, m)$ admit the following general distributional representation:
$$\hat{\gamma}^{\bullet}_{n}(k) \stackrel{d}{=} \gamma + \frac{\sigma_{\bullet}}{\sqrt{k}}Z^{\bullet}_{k} + b_{\bullet}\,A(n/k)\left(1 + o_{p}(1)\right), \qquad (37)$$
where $Z^{\bullet}_{k}$ is asymptotically standard normal and possibly $b_{\bullet} \ne 0$; that is, the estimator $\hat{\gamma}^{\bullet}_{n}(k)$ has a nonnull asymptotic bias whenever $b_{\bullet} \ne 0$ and the level $k$ is chosen so that $\sqrt{k}\,A(n/k)$ converges to a finite $\lambda \ne 0$, with "$\bullet$" denoting $H$, $Q$, and $GJ$, respectively. Thus, the asymptotic mean square error (AMSE) of $\hat{\gamma}^{\bullet}_{n}(k)$ is given by
$$\mathrm{AMSE}\left(\hat{\gamma}^{\bullet}_{n}(k)\right) = \frac{\sigma_{\bullet}^{2}}{k} + b_{\bullet}^{2}\,A^{2}(n/k).$$
Generally, whenever we have, as $n \to \infty$, an AMSE of this type, there exists a positive, decreasing, regularly varying function of $n$ minimizing it (e.g., [20]), and at the optimal level $k^{\bullet}_{0}$ the minimal AMSE decays at the rate $n^{2\rho/(1-2\rho)}$; this result comes from Lemma 2.8 of [29]. Following closely the results in [15], such an optimal level exists for the estimator $\hat{\gamma}^{\bullet}_{n}(k)$ whenever $b_{\bullet} \ne 0$. Given two biased estimators $\hat{\gamma}^{(1)}_{n}(k)$ and $\hat{\gamma}^{(2)}_{n}(k)$ for the tail index with the same type of distributional representation as in (37), computed at their optimal levels $k^{(1)}_{0}$ and $k^{(2)}_{0}$, respectively, define the asymptotic efficiency of $\hat{\gamma}^{(1)}_{n}(k^{(1)}_{0})$ relative to $\hat{\gamma}^{(2)}_{n}(k^{(2)}_{0})$ as
$$\mathrm{AEFF}_{1|2} = \frac{\mathrm{AMSE}\left[\hat{\gamma}^{(1)}_{n}(k^{(1)}_{0})\right]}{\mathrm{AMSE}\left[\hat{\gamma}^{(2)}_{n}(k^{(2)}_{0})\right]},$$
the ratio of the two asymptotic mean squared errors computed at the optimal levels. We say that $\hat{\gamma}^{(1)}_{n}(k^{(1)}_{0})$ is more efficient than $\hat{\gamma}^{(2)}_{n}(k^{(2)}_{0})$ if $\mathrm{AEFF}_{1|2} < 1$; in other words, $\hat{\gamma}^{(1)}_{n}(k^{(1)}_{0})$ has a smaller AMSE than $\hat{\gamma}^{(2)}_{n}(k^{(2)}_{0})$; otherwise it is less efficient. For $\hat{\gamma}^{H}_{n}(k)$, $\hat{\gamma}^{Q}_{n}(k_{n}, m)$, and $\hat{\gamma}^{GJ}_{n}(k_{n}, m)$, we obtain the asymptotic efficiencies at the optimal levels, for a suitable range of $\rho$, independently of $\gamma$: the comparison between $\hat{\gamma}^{H}_{n}(k)$ and $\hat{\gamma}^{Q}_{n}(k_{n}, m)$ follows directly from the corresponding pairs $(\sigma_{\bullet}, b_{\bullet})$, and we also obtain the asymptotic efficiency of $\hat{\gamma}^{H}_{n}(k)$ relative to $\hat{\gamma}^{GJ}_{n}(k_{n}, m)$,
considering particular cases for $\hat{\gamma}^{GJ}_{n}(k_{n}, m)$, that is, setting the value of $m$ equal to 1, 2, and 3, respectively. Besides, the asymptotic efficiency of $\hat{\gamma}^{GJ}_{n}(k_{n}, m)$ relative to $\hat{\gamma}^{Q}_{n}(k_{n}, m)$ can be computed in the same way. Among the estimators considered, none dominates over the whole $(\rho, m)$-plane from the asymptotic point of view. The asymptotic efficiency of $\hat{\gamma}^{H}_{n}(k)$ relative to $\hat{\gamma}^{Q}_{n}(k_{n}, m)$ indicates that the optimal mean squared error of Hill's estimator is smaller than that of Qi's estimator in the whole available $(\rho, m)$-plane. For several reasonable values of $m$, the new estimator $\hat{\gamma}^{GJ}_{n}(k_{n}, m)$ in (31) compares favorably, asymptotically, to Hill's estimator for a reasonably wide range of $\rho$ values. When the data can be divided into several blocks but only a few of the largest observations within blocks are available for analysis, our new estimator $\hat{\gamma}^{GJ}_{n}(k_{n}, m)$ is more efficient in the sense of the minimum MSE whenever $\rho < -0.35$ but $\rho \ne -1$, for all available values of $m$.
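The comparison at optimal levels can be illustrated numerically by scanning $k$ for the minimum of an AMSE of the form above; the variance and bias coefficients below are illustrative placeholders, not the paper's values:

```python
# For AMSE(k) = sigma^2/k + (b*A(n/k))^2 with A(t) = beta*t^rho, locate the
# minimal AMSE over k for two hypothetical estimators and form the
# efficiency ratio AEFF (< 1 means estimator 2 wins at optimal levels).
def amse(k, n, sigma2, b, beta, rho):
    A = beta * (n / k) ** rho       # second order function A(n/k)
    return sigma2 / k + (b * A) ** 2

n, beta, rho = 10000, 1.0, -1.0
ks = range(10, 5001)
# estimator 1: unit variance constant, Hill-type bias coefficient 1/(1-rho)
amse1 = min(amse(k, n, 1.0, 1.0 / (1.0 - rho), beta, rho) for k in ks)
# estimator 2: five-fold variance constant but a much smaller bias coefficient
amse2 = min(amse(k, n, 5.0, 0.1, beta, rho) for k in ks)
print(amse1, amse2, amse2 / amse1)
```

With these placeholder coefficients the two minimal AMSEs come out comparable, showing how a variance increase can be traded against a bias reduction at the optimal level.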

Simulation Study for Finite Sample
Our new estimator $\hat{\gamma}^{GJ}_{n}(k_{n}, m)$, based on a suitable Generalized Jackknife statistic, relies on both $k_{n}$ and $m$, so it is necessary to explore the impact of $k_{n}$ and $m$ on the estimator. Like many other semiparametric estimators of $\gamma$, the new estimator has the same type of behavior: consistency for intermediate ranks, high variance for small values of $k_{n}$, and high bias for large values of $k_{n}$. Consequently, an obvious question immediately arises: is it possible to provide stable sample paths of our estimator as a function of $k_{n}$ and a flatter MSE at the optimal sample fraction? This question is answered in this section. In the situation where the data can be divided into several blocks but only a few of the largest observations within each block are available for analysis, both $\hat{\gamma}^{Q}_{n}(k_{n}, m)$ and $\hat{\gamma}^{GJ}_{n}(k_{n}, m)$ are feasible. Thus, we choose $m = 1$, that is, the two largest random variables within each block, to infer the heavy tail index with the two estimators.
We have implemented simulation experiments based on $R = 100$ replicas of samples of size $n = 10000$ to present the finite sample performance of $\hat{\gamma}^{GJ}_{n}(k_{n}, 1)$, $\hat{\gamma}^{Q}_{n}(k_{n}, 1)$, and $\hat{\gamma}^{H}_{n}(k)$ for Fréchet, generalized Pareto (GP), and Burr underlying models, with distribution functions
Fréchet model: $F(x) = \exp(-x^{-1/\gamma})$, $x \ge 0$;
GP model: $F(x) = 1 - (1 + \gamma x)^{-1/\gamma}$, $x \ge 0$;
Burr model: $F(x) = 1 - (1 + x^{-\rho/\gamma})^{1/\rho}$, $x \ge 0$, $\rho < 0$.
In Figures 1-6, we compare the estimation bias and sample paths, for $k_{n}$-values from 1 to 200, of $\hat{\gamma}^{GJ}_{n}(k_{n}, 1)$ and $\hat{\gamma}^{Q}_{n}(k_{n}, 1)$. In Figure 1, we study the Fréchet model with $\gamma = 1$. The estimation results for $\hat{\gamma}^{GJ}_{n}(k_{n}, 1)$ show deviations from the true value $\gamma = 1$ that are smaller than those of $\hat{\gamma}^{Q}_{n}(k_{n}, 1)$ for a wide range of $k_{n}$-values, and they vary in a much smaller span around $\gamma = 1$ as $k_{n}$ changes. Thus, our new estimator based on the Generalized Jackknife methodology also provides stable sample paths as a function of $k_{n}$. In Figure 2, the performances of the two estimators for the Fréchet model with $\gamma = 2$ are similar to those in Figure 1. In Figures 3 and 4, we present the results for the GP model with $\gamma = 1$ and $\gamma = 2$, respectively. Whether $\gamma = 1$ or $\gamma = 2$, the performances in Figures 3 and 4 are similar to those simulated for the Fréchet model, showing that our new estimator provides stable sample paths as a function of $k_{n}$ for the GP model as well. In Figures 5 and 6, we present the results for the Burr model with $\gamma = 1$ and $\gamma = 2$, respectively. For the Burr model, our new estimator shows even more convincing results, both in estimation bias and in the sample paths.
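Samples from the three models can be drawn by inverse transform; the GP and Burr df's in the sketch are the standard forms assumed above, since the source formulas were garbled:

```python
import math
import random

def rfrechet(gamma, u):
    """Frechet: F(x) = exp(-x^(-1/gamma)), x >= 0."""
    return (-math.log(u)) ** (-gamma)

def rgp(gamma, u):
    """Generalized Pareto: F(x) = 1 - (1 + gamma*x)^(-1/gamma), x >= 0."""
    return ((1.0 - u) ** (-gamma) - 1.0) / gamma

def rburr(gamma, rho, u):
    """Burr: F(x) = 1 - (1 + x^(-rho/gamma))^(1/rho), x >= 0, rho < 0."""
    return ((1.0 - u) ** rho - 1.0) ** (-gamma / rho)

def median(xs):
    return sorted(xs)[len(xs) // 2]

random.seed(3)
us = [random.random() for _ in range(5000)]
fre = [rfrechet(1.0, u) for u in us]
gp = [rgp(1.0, u) for u in us]
bur = [rburr(1.0, -1.0, u) for u in us]
print(median(fre), median(gp), median(bur))
```

A quick sanity check: with $\gamma = 1$ the theoretical medians are $1/\log 2 \approx 1.443$ for Fréchet and $1$ for both GP and Burr (with $\rho = -1$), so the sample medians should land nearby.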
In general, if only the several largest observations are available in a practical setting, it is not possible to compare our new estimator with Hill's estimator, since Hill's estimator is not applicable to such incomplete data. Our new estimator is designed for the case where the data can be divided into several blocks with only the several largest observations within each block available for analysis, while Hill's estimator is constructed from the upper order statistics, exceeding a certain threshold, of the full data set. Instead of providing sample paths as a function of the number of blocks (the horizontal coordinate $k_{n}$ in the figures), we can compare our new estimator with Hill's estimator in terms of the mean squared errors at their optimal levels. Thus, we generate $R = 100$ replicas of samples of size $n = 10000$ from the Fréchet, GP, and Burr models and display the simulated mean values ($\mathrm{E}[\cdot]$) and simulated mean squared errors ($\mathrm{MSE}[\cdot]$) for Hill's estimator $\hat{\gamma}^{H}_{n}(k_{0})$, Qi's estimator $\hat{\gamma}^{Q}_{n}(k_{0}, 1)$, and our new estimator $\hat{\gamma}^{GJ}_{n}(k_{0}, 1)$ at their optimal levels. The simulated mean value of an estimator $\hat{\gamma}^{\bullet}_{n}(k)$ is the average of the $R = 100$ replica values $\hat{\gamma}^{\bullet}_{n,r}(k)$, that is,
$$\mathrm{E}\left[\hat{\gamma}^{\bullet}_{n}(k)\right] = \frac{1}{R}\sum_{r=1}^{R}\hat{\gamma}^{\bullet}_{n,r}(k),$$
and the simulated MSE is the average of the squares of the differences $\hat{\gamma}^{\bullet}_{n,r}(k) - \gamma$, that is,
$$\mathrm{MSE}\left[\hat{\gamma}^{\bullet}_{n}(k)\right] = \frac{1}{R}\sum_{r=1}^{R}\left(\hat{\gamma}^{\bullet}_{n,r}(k) - \gamma\right)^{2}.$$
The results in Table 1 show that the bias reduction holds not only for the Fréchet model with $\gamma = 1$ and $\gamma = 2$, respectively, but also for the GP model and the Burr model. Compared to the estimator $\hat{\gamma}^{Q}_{n}(k_{0}, 1)$, the MSEs of our estimator presented in Table 1 also demonstrate its superiority. As before, we also report the corresponding results for Hill's estimator: Hill's estimator compares favorably to our new estimator at the optimal levels for the Fréchet model and the GP model, but is inferior to our new estimator for the Burr model. The results listed in Table 1 are of great importance from a practical point of view.
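The Monte Carlo summaries $\mathrm{E}[\cdot]$ and $\mathrm{MSE}[\cdot]$ above can be sketched directly; the Hill estimator is used here for illustration, with a small $R$ and $n$ for speed (the paper uses $R = 100$ and $n = 10000$):

```python
import math
import random

def hill(sample, k):
    """Hill's estimator based on the k largest order statistics."""
    xs = sorted(sample)
    n = len(xs)
    return sum(math.log(xs[n - i - 1] / xs[n - k - 1]) for i in range(k)) / k

# Simulated mean E[.]: average of the R replica estimates.
# Simulated MSE[.]: average squared deviation from the true gamma.
random.seed(4)
gamma, R, n, k = 1.0, 50, 2000, 100
reps = []
for _ in range(R):
    sample = [(1.0 - random.random()) ** (-gamma) for _ in range(n)]  # strict Pareto
    reps.append(hill(sample, k))
mean_hat = sum(reps) / R
mse_hat = sum((g - gamma) ** 2 for g in reps) / R
print(mean_hat, mse_hat)
```

For a strict Pareto parent the Hill estimator is unbiased, so the simulated mean should be near $\gamma = 1$ and the MSE near the variance $\gamma^{2}/k = 0.01$.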

Conclusions
In this paper, we propose an estimator of the tail index through the Generalized Jackknife methodology for the situation where the data can be divided into several blocks but only a few of the largest observations within each block are available; the estimator is also robust to the way the sample is divided into blocks for the underlying models considered. However, the new class of estimators $\hat{\gamma}^{GJ}_{n,\rho}(k_{n}, m)$ proposed here through the Generalized Jackknife methodology depends on the second order shape parameter $\rho$, which eventually needs to be estimated. Due to the high bias and variance of the existing estimators of $\rho$, we take the value $\rho = -1$ as a reasonable general choice. Unsatisfactory but unsurprising, this simple and convenient choice leads to a nonnull bias, both asymptotically and in practice. A natural next step is to estimate $\rho$ consistently through an adequate estimator $\hat{\rho}$. Moreover, the robustness of our estimation results with respect to the parameter $\rho$ is a direction for future research.

Table 1: The simulated mean values and MSEs of $\hat{\gamma}^{H}_{n}(k_{0})$, $\hat{\gamma}^{Q}_{n}(k_{0}, 1)$, and $\hat{\gamma}^{GJ}_{n}(k_{0}, 1)$.

We can see that there exists a significant difference between the behaviors of the statistics under study: at the optimal levels, that is, at the most appropriate numbers of blocks, our new estimator $\hat{\gamma}^{GJ}_{n}(k_{0}, 1)$ has much smaller bias than the estimator $\hat{\gamma}^{Q}_{n}(k_{0}, 1)$.

Figure 5: Underlying Burr parent with $\gamma = 1$.

In terms of the simulated mean values and mean squared errors, our new estimator, using the first and second largest random variables within each block for inference, compares favorably to the Hill estimator. Besides, our new estimator also behaves better than Qi's estimator in the simulated results.