The Use of Generalized Means in the Estimation of the Weibull Tail Coefficient

Due to the specificity of the Weibull tail coefficient, most of the estimators available in the literature are based on the log excesses and are consequently quite similar to the estimators used for the estimation of a positive extreme value index. The interesting performance of estimators based on generalized means leads us to base the estimation of the Weibull tail coefficient on the power mean-of-order-p. Consistency and asymptotic normality of the estimators under study are put forward. Their performance for finite samples is illustrated through a Monte Carlo simulation. In all simulated cases, it was possible to find a negative value of p (contrary to what happens with the mean-of-order-p estimator for the extreme value index) such that, for adequate values of the threshold, there is a reduction in both bias and root mean square error.


Introduction and Preliminaries
Statistics of extremes, either univariate or multivariate, has recently faced many different challenges, which have made it possible to better understand the complexity of extreme events in the most diverse areas of application, such as biostatistics, dynamical systems, environment, finance, insurance, and structural engineering, among other fields. Risky events are commonly located in the tails of the underlying distribution, and there are usually only a few observations in those tails. Consequently, considering only the univariate situation, estimates either far above the observed maximum or far below the observed minimum are often required. It is thus necessary to consider models for the tails, and those models are most often based on asymptotic results.
Let us assume that, possibly after an adequate transformation, the available sample, $\mathbf{X}_n = (X_1, \dots, X_n)$, can be regarded as a sample of size $n$ of independent, identically distributed (IID) random variables (RVs) from a cumulative distribution function (CDF) $F$. More generally, $\mathbf{X}_n$ can be assumed to be a sample of stationary weakly dependent RVs from $F$. Let us use the notation $X_{1:n} \le \dots \le X_{n:n}$ for the associated ascending order statistics (OSs). Further assume that there exist sequences of real constants $\{a_n > 0\}$ and $\{b_n \in \mathbb{R}\}$ such that the linearly normalized maximum, $(X_{n:n} - b_n)/a_n$, converges weakly to a nondegenerate RV. Then (see Gnedenko [1]), the limiting CDF is necessarily of the type of the general extreme value (GEV) CDF, given by
$$G_\xi(x) = \begin{cases} \exp\left(-(1 + \xi x)^{-1/\xi}\right), & 1 + \xi x > 0, & \text{if } \xi \ne 0, \\ \exp\left(-\exp(-x)\right), & x \in \mathbb{R}, & \text{if } \xi = 0. \end{cases} \tag{1}$$
In that case, $F$ is said to belong to the max-domain of attraction of $\mathrm{GEV}_\xi$, and the notation $F \in \mathcal{D}_M(\mathrm{GEV}_\xi)$ is used. The $\mathrm{GEV}_\xi$ model, in (1), is perhaps the most relevant univariate asymptotic model in statistical extreme value theory (EVT). For other relevant asymptotic models and different approaches to statistics of univariate extremes, see the reasonably recent overviews in [2-5]. The parameter $\xi$ is the extreme value index (EVI), one of the most relevant parameters of large events. This parameter measures the heaviness of the right-tail function $\bar{F}(x) := 1 - F(x)$, and the heavier the right tail, the larger $\xi$ is. Heavy-tailed models, i.e., Pareto-type underlying CDFs, with a positive EVI, belong to $\mathcal{D}_M^+ := \mathcal{D}_M(\mathrm{GEV}_{\xi > 0})$, with $\mathrm{GEV}_\xi \equiv G_\xi$ defined in (1). Note that, in a univariate framework and with $R_a$ denoting the class of regularly varying functions at infinity with an index of regular variation $a$, i.e., positive measurable functions $g$ such that $\lim_{t \to \infty} g(tx)/g(t) = x^a$, for all $x > 0$ (see Bingham et al. [6], for details on regular variation), and with the notation
$$U(t) := F^{\leftarrow}(1 - 1/t), \quad t \ge 1, \qquad F^{\leftarrow}(y) := \inf\{x : F(x) \ge y\}, \tag{2}$$
the following equivalences hold:
$$F \in \mathcal{D}_M^+ \iff \bar{F} \in R_{-1/\xi} \iff U \in R_{\xi}. \tag{3}$$
As an example of a CDF in $\mathcal{D}_M^+$, and among many others, we mention the Fréchet CDF, $F(x) = \exp(-x^{-\alpha})$, $x \ge 0$, $\alpha > 0$ ($\xi = 1/\alpha$).
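As a quick numerical illustration of regular variation of a heavy right tail, the following sketch evaluates the ratio of right-tail functions for the Fréchet CDF, whose right tail is regularly varying with index $-\alpha = -1/\xi$. The function names and the values $\alpha = 2$, $x = 3$ are illustrative assumptions and are not taken from the study.

```python
import math

def frechet_sf(x, alpha):
    """Right-tail function 1 - F(x) of the Frechet CDF F(x) = exp(-x**(-alpha))."""
    # -expm1(-y) equals 1 - exp(-y), computed without cancellation for small y
    return -math.expm1(-x ** (-alpha))

def tail_ratio(t, x, alpha):
    """Ratio (1 - F(t*x)) / (1 - F(t)); it should approach x**(-alpha) as t grows."""
    return frechet_sf(t * x, alpha) / frechet_sf(t, alpha)

alpha, x = 2.0, 3.0  # illustrative values only; here xi = 1/alpha = 0.5
for t in (1.0, 10.0, 100.0, 1000.0):
    # second column: finite-t ratio, third column: the limit x**(-alpha)
    print(t, tail_ratio(t, x, alpha), x ** (-alpha))
```

The printed ratios can be compared with the limit in the last column to see how quickly the regular-variation limit is approached for this model.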
In this paper, our interest lies essentially in the estimation of the Weibull tail coefficient (WTC), another relevant parameter of extreme events. Regularly varying cumulative hazard functions $H(x) := -\ln(1 - F(x))$ will thus be considered. Indeed, the WTC is the parameter $\theta$ in a right-tail function of the type:
$$\bar{F}(x) := 1 - F(x) = \exp(-H(x)), \quad H \in R_{1/\theta}, \quad \theta > 0. \tag{4}$$
The class of models with a Weibull-type tail is quite broad and includes, among others, the normal, the gamma, the Weibull, and the logistic distributions. This type of model is quite useful in several areas of application such as hydrology, meteorology, environmental sciences, and nonlife insurance (see de Wet et al. [7]). Further note that condition (4) is equivalent to assuming that the inverse cumulative hazard function $H^{\leftarrow}$ is a regularly varying function with index $\theta$. Thus,
$$H^{\leftarrow}(t) = t^{\theta} L(t), \tag{5}$$
with $L \in R_0$ a slowly varying function.
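To make the definition concrete, the sketch below checks numerically that the cumulative hazard function of the standard normal distribution behaves like a regularly varying function with index $1/\theta = 2$, i.e., $\theta = 1/2$. The helper names and the evaluation points are assumptions chosen for illustration; the finite-$t$ slope is only a proxy for the limiting index.

```python
import math

def cumulative_hazard_normal(x):
    """H(x) = -ln(1 - F(x)) for the standard normal CDF F."""
    # 1 - F(x) = 0.5 * erfc(x / sqrt(2)); erfc stays accurate far in the tail
    return -math.log(0.5 * math.erfc(x / math.sqrt(2.0)))

def local_index(H, t, x=2.0):
    """Slope ln(H(t*x) / H(t)) / ln(x), a finite-t proxy for the index of regular variation."""
    return math.log(H(t * x) / H(t)) / math.log(x)

for t in (2.0, 5.0, 10.0, 15.0):
    idx = local_index(cumulative_hazard_normal, t)
    # idx should drift towards 2, so 1/idx drifts towards theta = 1/2
    print(t, idx, 1.0 / idx)
```

The slow drift of the slope towards 2 reflects the slowly varying factor, which is one reason why bias is a concern in WTC estimation.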

Semiparametric Estimators of the WTC.
Regarding the estimation of the WTC, one of the first estimators in the literature was based on record values [8]. The use of the $k$ upper order statistics in the sample was considered in Broniatowski [9], Beirlant et al. [10,11], and Dierckx et al. [12]. Most WTC estimators are based on the relative excesses,
$$U_{ik} := \frac{X_{n-i+1:n}}{X_{n-k:n}}, \quad 1 \le i \le k < n, \tag{6}$$
or on the log excesses,
$$V_{ik} := \ln U_{ik} = \ln X_{n-i+1:n} - \ln X_{n-k:n}, \quad 1 \le i \le k < n. \tag{7}$$
Indeed, Beirlant et al. [10] proposed the estimator $\hat{\theta}_B(k)$, in (8), and Beirlant et al. [13] and Girard [14] considered the WTC estimator $\hat{\theta}_G(k)$, in (9). Weighted versions of the estimator $\hat{\theta}_G(k)$ can be found in Gardes and Girard [15] and Goegebeur et al. [16]. The following Hill-type WTC estimator was studied in Gardes and Girard [17]:
$$\hat{\theta}_{GG}(k) := \ln(n/k)\, H(k), \tag{10}$$
with $H(k)$ being the classical Hill (H) [18] EVI estimator for heavy-tailed models, which can be written as the average of the log excesses, i.e.,
$$H(k) := \frac{1}{k} \sum_{i=1}^{k} V_{ik}, \tag{11}$$
with $V_{ik}$ defined in (7). Consistency of the Hill estimator for $\xi$ holds if $k = k(n)$ is an intermediate sequence, i.e., if
$$k = k(n) \to \infty \quad \text{and} \quad k/n \to 0, \quad \text{as } n \to \infty. \tag{12}$$
Recent developments in the estimation of the WTC can be found in [19-23].
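The Hill estimator, as the average of the log excesses, and the Hill-type WTC estimator of Gardes and Girard can be sketched in a few lines. This is a minimal illustration, assuming a positive sample and the scaling by $\ln(n/k)$ described in the text; the function names, the seed, and the choices $n = 5000$, $k = 250$ are hypothetical.

```python
import math
import random

def log_excesses(sample, k):
    """V_ik = ln X_{n-i+1:n} - ln X_{n-k:n}, i = 1..k, for a positive sample."""
    xs = sorted(sample)
    n = len(xs)
    if not 1 <= k < n:
        raise ValueError("k must satisfy 1 <= k < n")
    threshold = xs[n - k - 1]  # X_{n-k:n} (0-based index n-k-1)
    if threshold <= 0:
        raise ValueError("threshold must be positive")
    return [math.log(x / threshold) for x in xs[n - k:]]

def hill(sample, k):
    """Hill estimator: average of the k log excesses."""
    return sum(log_excesses(sample, k)) / k

def theta_gg(sample, k):
    """Hill-type WTC estimator ln(n/k) * H(k)."""
    return math.log(len(sample) / k) * hill(sample, k)

random.seed(1)  # arbitrary seed, for reproducibility only
data = [random.expovariate(1.0) for _ in range(5000)]  # exp(1) sample, theta = 1
print(theta_gg(data, 250))
```

At a fixed level $k$ the estimate need not be close to $\theta = 1$, which is in line with the bias issues that motivate the alternatives discussed in this paper.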
The quite positive performance of most of the EVI estimators based on generalized means (GMs) leads to the consideration of a simple generalization of the H EVI estimators, in (11), studied in Brilhante et al. [24], and almost simultaneously in Paulauskas and Vaiciulis [25] and in Beran et al. [26] (see also Segers [27]). Such a generalization leads to the so-called power mean-of-order-$p$ ($H_p$) EVI estimators. Indeed, on the basis of (11), it is possible to write
$$H(k) = \sum_{i=1}^{k} \ln U_{ik}^{1/k} = \ln \left( \prod_{i=1}^{k} U_{ik} \right)^{1/k}. \tag{13}$$
Since the H EVI estimators are the logarithm of the geometric mean (or mean-of-order-0) of $U_{ik}$, $1 \le i \le k$, defined in (6), the mean-of-order-$p$ of $U_{ik}$, for any real $p$ (see Gomes and Caeiro [28] and Caeiro et al. [29], among others), can be more generally considered. This leads to the mean-of-order-$p$ EVI estimators:
$$H_p(k) := \begin{cases} \dfrac{1}{p} \left( 1 - \left( \dfrac{1}{k} \sum_{i=1}^{k} U_{ik}^{p} \right)^{-1} \right), & \text{if } p \ne 0, \\ H(k), & \text{if } p = 0. \end{cases} \tag{14}$$
As mentioned above, GMs have recently been used with great success in the estimation of a positive EVI, allowing one to obtain reduced-bias estimators of $\xi$. An adequate choice of $p$, in (14), enables such bias reduction for the mean-of-order-$p$ EVI estimators. Due to the specificity of the WTC, its relevance, and its deep link to a positive EVI, the GMs, in (14), will now be used for the estimation of the WTC, with the consideration of
$$\hat{\theta}_p(k) := \ln(n/k)\, H_p(k), \tag{15}$$
with $H_p(k)$ defined in (14), for any real $p$. Notice that the estimator $\hat{\theta}_{GG}(k)$ in (10) is the particular case $p = 0$ of $\hat{\theta}_p(k)$.
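The mean-of-order-$p$ construction can be illustrated with a short sketch. It assumes the usual form of the mean-of-order-$p$ EVI estimator, $(1 - (k^{-1}\sum_i U_{ik}^p)^{-1})/p$ for $p \ne 0$ and the Hill estimator for $p = 0$, multiplied by $\ln(n/k)$ to give the WTC estimator; the function names and the toy sample are hypothetical.

```python
import math

def mean_of_order_p_evi(sample, k, p):
    """Mean-of-order-p EVI estimator H_p(k); p = 0 gives the Hill estimator."""
    xs = sorted(sample)
    n = len(xs)
    if not 1 <= k < n:
        raise ValueError("k must satisfy 1 <= k < n")
    threshold = xs[n - k - 1]                      # X_{n-k:n}
    u = [x / threshold for x in xs[n - k:]]        # relative excesses U_ik
    if p == 0:
        return sum(math.log(ui) for ui in u) / k   # log of the geometric mean
    mean_pth_powers = sum(ui ** p for ui in u) / k # (1/k) * sum of U_ik^p
    return (1.0 - 1.0 / mean_pth_powers) / p

def theta_p(sample, k, p):
    """Mean-of-order-p WTC estimator ln(n/k) * H_p(k)."""
    return math.log(len(sample) / k) * mean_of_order_p_evi(sample, k, p)

toy = [1.0, 1.0, 2.0, 2.0]  # toy sample: both relative excesses equal 2 for k = 2
for p in (-1, 0, 1):
    print(p, mean_of_order_p_evi(toy, 2, p), theta_p(toy, 2, p))
```

Treating $p = 0$ as a separate branch avoids a division by zero and matches the limit of the general expression as $p \to 0$.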
In Section 2 of this paper, after a few comments on the role of the WTC and some preliminary results, a few details on the asymptotic behaviour of the WTC estimators in (8), (9), and (15) are provided. Again, a high variance for small k and a high bias for large k can appear, and thus it is necessary to reduce bias and/or properly choose the tuning parameters in play. Section 3 is dedicated to an extensive Monte Carlo simulation of the WTC estimators under study. Regarding the mean-of-order-p estimation, it was always possible to find a value of p (negative, contrary to what happens with the mean-of-order-p EVI estimation) such that, for adequate values of the threshold, there is a reduction in both bias and root mean square error (RMSE). Finally, in Section 4, a few overall conclusions are drawn. One of the main points of the article is that, since even asymptotically equivalent estimators may exhibit very different finite sample properties, it is always sensible to work, in practice, with a few WTC estimators, possibly dependent on tuning parameters, which make them more flexible.

Preliminary Results.
To study the nondegenerate asymptotic behaviour of the estimators, a second-order condition is required to specify the bias term. This condition can be expressed in terms of the slowly varying function $L(\cdot)$ in (5). Let us assume that the rate of convergence of $\ln L(tx) - \ln L(t)$ towards 0 is ruled by a function $B$. Then, there exists $\beta \le 0$ such that, for all $x > 0$,
$$\lim_{t \to \infty} \frac{\ln L(tx) - \ln L(t)}{B(t)} = \frac{x^{\beta} - 1}{\beta}, \tag{16}$$
with the right-hand side read as $\ln x$ when $\beta = 0$, and $|B| \in R_{\beta}$. This second-order parameter $\beta$ quantifies the rate of convergence of $\ln(L(tx)/L(t))$ to 0. The closer $\beta$ is to 0, the slower the convergence.
Next, we provide some information regarding the distributional behaviour of $V_{ik}$, defined in (7). Suppose that $Y_{1:n}, Y_{2:n}, \dots, Y_{n:n}$ are the order statistics generated by $n$ independent standard Pareto random variables with CDF $F_Y(y) = 1 - 1/y$, $y \ge 1$. Then, with $U$ the tail quantile function of $F$, $X_{i:n} \stackrel{d}{=} U(Y_{i:n})$, $1 \le i \le n$, and $Y_{n-i+1:n}/Y_{n-k:n} \stackrel{d}{=} Y_{k-i+1:k}$, $1 \le i \le k$. If $k$ is intermediate, a distributional representation of $V_{ik}$ in terms of these order statistics holds. Hence, since $E_i = \ln Y_i$ are independent, identically exponentially distributed with mean one (see [33]), the representation (22) of $V_{ik}$ follows for $i = 1, 2, \dots, k$. Results for $U_{ik}$ can be easily deduced from the relation $U_{ik} = e^{V_{ik}}$, which gives the representation (23).

Asymptotic Behaviour of the Estimators.
The next theorem establishes the limit distribution of $\hat{\theta}_{GG}(k)$.

Theorem 2. For intermediate values of $k$ as in (12), the estimator $\hat{\theta}_{GG}(k)$, in (10), is consistent for the estimation of $\theta$. More than that, the distributional representation (24), written in terms of a standard normal RV $P_k$, holds.

Proof. Using (22) and the result $E_{n-k:n} \sim \ln(n/k) \to \infty$, the representation (24) and the consistency of $\hat{\theta}_{GG}(k)$ follow straightforwardly.

The asymptotic behaviour of the new class of mean-of-order-$p$ WTC estimators, in (15), is next stated and proven.

Theorem 3. Under (4) and (16), with $k$ being a sequence of intermediate values, as in (12), the asymptotic distributional representation (26) holds for the mean-of-order-$p$ WTC estimator, $\hat{\theta}_p(k)$, in (15), with $P_k$ being the standard normal RV in (24).

Proof. It is only necessary to prove (26) for $p \ne 0$, since the case $p = 0$ was already derived in Theorem 2. By using (23) and the result $E_{n-k:n} \sim \ln(n/k) \to \infty$, we obtain a distributional representation which, under the validity of (12) and using (22) and the same results used in the proof of Theorem 2, leads, through the definition of $H_p(k)$ in (14), to an asymptotic representation of $H_p(k)$. The result in (26) then follows straightforwardly from the definition $\hat{\theta}_p(k) = \ln(n/k)\, H_p(k)$, in (15).
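Both proofs rely on the result $E_{n-k:n} \sim \ln(n/k)$ for the exponential order statistics. The following sketch gives a small simulation check of that approximation, using a single replicate per level. The values of $n$ and $k$, the seed, and the function name are illustrative assumptions.

```python
import math
import random

def exp_order_stat(n, k, rng):
    """Simulate E_{n-k:n}, the (n-k)-th ascending order statistic of n IID exp(1) RVs."""
    e = sorted(rng.expovariate(1.0) for _ in range(n))
    return e[n - k - 1]  # 0-based index of the (n-k)-th smallest value

rng = random.Random(2024)  # arbitrary seed, for reproducibility only
n = 20000
for k in (50, 200, 1000):
    ratio = exp_order_stat(n, k, rng) / math.log(n / k)
    print(k, round(ratio, 3))  # ratios near 1 support E_{n-k:n} ~ ln(n/k)
```

Equivalently, one could simulate standard Pareto variables $Y_i$ and take logarithms, since $E_i = \ln Y_i$ is standard exponential.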

Remark 4.
Under the conditions of Theorem 3, the asymptotic distributional representation of the WTC estimators in (10) and (15) is the same. Since the representation in (26) does not depend on the real tuning parameter p associated with the mean-of-order-p, it cannot be used to determine the optimal p value, i.e., the value of p that cancels the asymptotic bias or minimizes the RMSE of the mean-of-order-p WTC estimator. However, dependence on p can appear if higher-order terms are considered in the expansion of the tail quantile function.
Next, we state the asymptotic behaviour of the WTC estimators in (8) and (9). The limit results are stated for intermediate sequences $k$ such that $\sqrt{k}\, B(\ln(n/k)) \to \lambda \in \mathbb{R}$ and $\sqrt{k}/\ln(n/k) \to 0$.
Proof. For the proof of the first limit result, we refer to Theorem 3.2 of Beirlant et al. [10], with some trivial modifications. The second limit result is a particular case of Theorem 1 in Gardes and Girard [15].
Remark 7. Although Corollary 5 and Proposition 6 provide similar asymptotic distributions for the WTC estimators considered in this work, the same cannot be guaranteed about their finite sample performance. It is known that asymptotically equivalent estimators of the WTC can behave differently for small sample sizes (see Goegebeur et al. [16], p. 3697). Indeed, a similar comment applies to estimators of any parameter of rare events.

Monte Carlo Simulation Study
In this section, the finite sample performance of the class of estimators $\hat{\theta}_p(k)$ is evaluated through a Monte Carlo simulation study. For comparative purposes, the WTC estimators $\hat{\theta}_B(k)$ and $\hat{\theta}_G(k)$, in (8) and (9), respectively, were also included in the study. The values for the parameter $p$ were selected from a preliminary simulation study. The value $p = 0$ was always used, since it provides the estimator in (10). The value $p = 1$ was also used to illustrate the effect of a positive value of the parameter. The following Weibull-type models were considered:
(1) Exponential distribution, exp(1), with CDF $F(x) = 1 - \exp(-x)$, $x \ge 0$. The WTC is $\theta = 1$.
(2) Gamma distribution, $\Gamma(0.75, 1)$. The WTC is $\theta = 1$.
(3) Weibull distribution, $W(2, 1)$. The WTC is $\theta = 0.5$.
(4) Half-normal distribution. The WTC is $\theta = 0.5$.
For each model, the simulated mean values and RMSEs of the estimators were obtained. In addition, the simulated optimum levels, $k_0$, were computed. Figures 1-4 are related to the behaviour of the aforementioned class of WTC estimators $\hat{\theta}_p(k)$, as a function of $k$. At the left, the simulated mean values are presented, and, at the right, the corresponding estimates of the RMSE.

Table 2: Simulated optimal sample fraction (OSF), $k_0/n$, mean value, and RMSE (both computed at the optimal level) of $\hat{\theta}_p(k)$ for $p = -3, -2, -1, 0$, and $1$, $\hat{\theta}_B(k)$, and $\hat{\theta}_G(k)$ from $\Gamma(0.75, 1)$ underlying parents ($\theta = 1$).

Although the simulation is limited to these selected Weibull tail models, the following comments can be drawn.
(ii) It appears that, in all the simulated cases, it was always possible to find a negative value of $p$ that drastically reduces the absolute bias and the RMSE. This is the opposite of what typically happens with the mean-of-order-$p$ EVI estimator, where there is a reduction of bias as well as of RMSE for positive values of $p$. For such a value of $p$, $\hat{\theta}_p(k)$ strongly beats the estimator $\hat{\theta}_{GG}(k)$ considered by Gardes and Girard [17];
(iii) The estimators $\hat{\theta}_B(k)$ and $\hat{\theta}_G(k)$, in (8) and (9), beat the class of mean-of-order-$p$ WTC estimators in terms of bias and RMSE for the exponential and Weibull parents under study. For these two parents, the best estimator was the one proposed by Girard [14];
(iv) For the gamma and half-normal parents considered here, it is always possible to find a value of $p$ such that the estimator $\hat{\theta}_p(k)$ outperforms both $\hat{\theta}_B(k)$ and $\hat{\theta}_G(k)$ in bias and in RMSE;
(v) Algorithmic details on the choice of the tuning parameters $p$ and $k$ are still under development but can be easily devised, similarly to what is done in Caeiro and Gomes [34] or Gomes et al. [35].
In Tables 1-4, the simulated values of the optimal sample fraction (OSF, the optimal level divided by the sample size), of the mean value (E), and of the RMSE of the estimators under study are presented. For each model, the mean value closest to the target value $\theta$ and the smallest RMSE are highlighted.

Table 3: Simulated optimal sample fraction (OSF), $k_0/n$, mean value, and RMSE (both computed at the optimal level) of $\hat{\theta}_p(k)$ for $p = -20, -12, -5, 0$, and $1$, $\hat{\theta}_B(k)$, and $\hat{\theta}_G(k)$ from $W(2, 1)$ underlying parents ($\theta = 0.5$).

Observe that, to reach the smallest absolute bias or the smallest RMSE, it is necessary to use a larger sample fraction than the one required by $\hat{\theta}_{GG}(k)$ ($p = 0$). The smallest absolute bias and RMSE are always achieved by $\hat{\theta}_p(k)$ with $p < 0$ for the gamma and half-normal models. Also, the optimal $p$ decreases as the sample size $n$ increases. For large sample sizes, the choices $p = -10$, $-3$, $-20$, and $-1.5$ seem to provide an overall good performance for the exponential, gamma, Weibull, and half-normal models, respectively. For the exponential and Weibull models, the smallest absolute bias and RMSE are always achieved by $\hat{\theta}_G(k)$.
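The kind of computation summarized in the tables (simulated mean value, RMSE, and optimal sample fraction) can be sketched as follows for the exponential model. This is a minimal illustration, not the design used in the study: the sample size, number of replicates, grid of levels, values of $p$, and the definition of the optimal level as the minimizer of the simulated RMSE are all assumptions made here.

```python
import math
import random

def theta_p(xs_sorted, k, p):
    """Mean-of-order-p WTC estimate from an ascending-sorted positive sample."""
    n = len(xs_sorted)
    thr = xs_sorted[n - k - 1]                      # X_{n-k:n}
    u = [x / thr for x in xs_sorted[n - k:]]        # relative excesses
    if p == 0:
        h = sum(math.log(ui) for ui in u) / k       # Hill estimator
    else:
        h = (1.0 - k / sum(ui ** p for ui in u)) / p
    return math.log(n / k) * h

def simulate(theta, sampler, n, ks, ps, runs, seed=0):
    """Return {(p, k): (mean, rmse)} over `runs` simulated samples of size n."""
    rng = random.Random(seed)
    est = {(p, k): [] for p in ps for k in ks}
    for _ in range(runs):
        xs = sorted(sampler(rng) for _ in range(n))
        for p in ps:
            for k in ks:
                est[(p, k)].append(theta_p(xs, k, p))
    out = {}
    for key, vals in est.items():
        mean = sum(vals) / runs
        rmse = math.sqrt(sum((v - theta) ** 2 for v in vals) / runs)
        out[key] = (mean, rmse)
    return out

def optimal_level(results, p, ks):
    """Level k minimising the simulated RMSE for a given p (assumed criterion)."""
    return min(ks, key=lambda k: results[(p, k)][1])

n, runs = 200, 200                 # small illustrative sizes
ks = range(5, 150, 5)
ps = (-3, -1, 0, 1)
res = simulate(1.0, lambda rng: rng.expovariate(1.0), n, ks, ps, runs)
for p in ps:
    k0 = optimal_level(res, p, ks)
    mean, rmse = res[(p, k0)]
    print(p, k0 / n, round(mean, 3), round(rmse, 3))  # p, OSF, mean, RMSE
```

Other Weibull-type parents can be explored by passing a different `sampler` and the corresponding value of `theta`.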

Conclusions
In this paper, the estimation of the WTC, a parameter of high interest when working with Weibull-type models, is the main topic under discussion. Due to the similarity between the WTC estimation and the EVI estimation and the good performance of the EVI estimators based on GMs, a new class of WTC estimators was introduced based on the power mean-of-order-$p$. The consistency and asymptotic normality of the new class of estimators were obtained under adequate conditions. The finite sample behaviour of the estimators was evaluated through a Monte Carlo simulation study applied to some selected Weibull-type models. The choice of the tuning parameter $p$, in (14), remains open, and this topic should be addressed in future work. Looking at the simulated values, values of $p$ different from the ones considered in the Monte Carlo simulation may also be worth considering.

Table 4: Simulated optimal sample fraction (OSF), $k_0/n$, mean value, and RMSE (both computed at the optimal level) of $\hat{\theta}_p(k)$ for $p = -2, -1.5, -1, 0$, and $1$, $\hat{\theta}_B(k)$, and $\hat{\theta}_G(k)$ from half-normal underlying parents ($\theta = 0.5$).

Data Availability
The simulated data used in this study is available from the authors upon request.

Conflicts of Interest
The authors declare that they have no conflicts of interest.