Actuarial Measures, Regression, and Applications of Exponentiated Fréchet Loss Distribution

In this study, a new loss distribution, called the exponentiated Fréchet loss distribution, is developed and studied. The plots of the density function show that the distribution can exhibit different shapes, including right-skewed and decreasing shapes, and various degrees of kurtosis. Several properties of the distribution are obtained, including moments, mean excess function, limited expected value function, value at risk, tail value at risk, and tail variance. The estimators of the parameters of the distribution are obtained via the maximum likelihood, maximum product spacing, ordinary least squares, and weighted least squares methods. The performances of the various estimators are investigated using simulation studies. The results show that the estimators are consistent. The new distribution is extended into a regression model. The usefulness and applicability of the new distribution and its regression model are demonstrated using actuarial data sets. The results show that the new loss distribution can be used as an alternative for modelling actuarial data.


Introduction
In actuarial practice, data sets need to be modelled appropriately. Doing so can lead to optimal capital allocation through accurate calculation of risk measures and insurance premiums. This is essential for risk management purposes. Probability distributions are therefore essential in actuarial practice. Several distributions have been used in actuarial practice, including the Pareto, gamma, beta, Fréchet, exponential, and Weibull distributions. However, given the nature of actuarial data, specifically loss data, some of these distributions are not able to model such data appropriately. For instance, loss data are observed to be heavy-tailed and require distributions that exhibit this property [1,2]. Thus, several new distributions have been developed and studied by researchers over the decades for modelling loss data.
In this study, a new extension of the exponentiated Fréchet (EF) distribution, known as the exponentiated Fréchet loss (EFL) distribution, is developed and studied. The new distribution is developed using a family of loss distributions proposed by Ahmad et al. [20]. Regression models are essential for relating a response variable to one or more independent variables. Letting the response variable follow the EFL distribution, an EFL regression model can be developed. Thus, in this study, an EFL regression model is developed and its application demonstrated. The rest of the article is organized as follows: Section 2 presents the EFL distribution. Some statistical properties, including moments and moment generating function, are presented in Section 3. Section 4 presents some actuarial properties, including mean excess function, limited expected value function, value at risk, tail value at risk, and tail variance. Section 5 presents four methods for estimating the parameters of the distribution. Monte Carlo simulation studies to assess the performance of the estimators are carried out in Section 6. A new regression model based on the EFL distribution is given in Section 7. The usefulness of the new distribution and its regression model is demonstrated on real data sets in Section 8. Section 9 presents the conclusion of the study.

Exponentiated Fréchet Loss Distribution
Let the random variable X follow the family of loss distributions proposed by Ahmad et al. [20]. Then its cumulative distribution function (CDF) is given as follows:
where G(x) is the CDF of the baseline distribution. In this study, the EF distribution is used as the baseline distribution. The CDF of the EF distribution is given as follows:
$G(x) = 1 - \left[1 - e^{-(\beta/x)^{\lambda}}\right]^{\theta}, \quad x > 0, \qquad (2)$
where β is a scale parameter and θ and λ are shape parameters. Substituting equation (2) into equation (1) gives the CDF of the EFL distribution as follows:
Differentiating equation (3) gives the probability density function (PDF) of the EFL distribution as follows:
The hazard rate function of the EFL distribution is obtained as follows:
It should be noted that when θ = 1 in the EFL distribution, the Fréchet loss distribution is obtained. To show the flexibility of the EFL distribution, plots of its PDF and hazard rate function are obtained for some parameter values and shown in Figure 1. It can be observed that the PDF can exhibit right-skewed, decreasing, and approximately symmetric shapes, with various degrees of kurtosis. Also, the hazard rate function exhibits decreasing, increasing, and reverse J shapes.
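The role of the baseline distribution can be illustrated with a minimal Python sketch. It assumes the standard form of the EF CDF, $G(x) = 1 - [1 - e^{-(\beta/x)^{\lambda}}]^{\theta}$, and covers only the EF baseline, not the EFL distribution itself, since the EFL CDF depends on the family in [20]. The function names are hypothetical and chosen for illustration.

```python
import math


def ef_cdf(x, beta, theta, lam):
    """Assumed EF baseline CDF: G(x) = 1 - (1 - exp(-(beta/x)^lam))^theta, x > 0."""
    if x <= 0:
        return 0.0
    return 1.0 - (1.0 - math.exp(-((beta / x) ** lam))) ** theta


def ef_pdf(x, beta, theta, lam):
    """Density obtained by differentiating ef_cdf with respect to x."""
    if x <= 0:
        return 0.0
    e = math.exp(-((beta / x) ** lam))
    return (theta * lam * beta ** lam * x ** (-(lam + 1.0))
            * e * (1.0 - e) ** (theta - 1.0))


def ef_hazard(x, beta, theta, lam):
    """Hazard rate g(x) / (1 - G(x)) of the EF baseline."""
    return ef_pdf(x, beta, theta, lam) / (1.0 - ef_cdf(x, beta, theta, lam))


def frechet_cdf(x, beta, lam):
    """Frechet CDF, the theta = 1 special case of the EF baseline."""
    return math.exp(-((beta / x) ** lam)) if x > 0 else 0.0
```

Setting theta to 1 in `ef_cdf` reproduces `frechet_cdf` up to rounding, which mirrors the reduction of the EFL distribution to the Fréchet loss distribution noted in the text.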

Expansion of PDF.
The expansion of the PDF of the EFL distribution is obtained in this subsection. The expansion is useful for obtaining quantities of the distribution that involve integrals of the PDF or its functions, such as the moments, the moment generating function, and other properties. Using the generalized binomial expansion defined as
Also, using the binomial expansion
International Journal of Mathematics and Mathematical Sciences
Substituting equations (6) and (7) into the PDF of the EFL distribution in equation (4) gives
Using the expansion
Substituting equations (9) and (10) into equation (8), and after some algebraic manipulations, gives the PDF of the EFL distribution as follows:
where

Statistical Properties
In this section, some statistical properties of the EFL distribution are obtained. These include the quantile function, ordinary and incomplete moments, and moment generating function.

Quantile Function.
The quantile function of a distribution can be used to characterize the distribution. It can also be used to generate random numbers from the distribution and to obtain quantile-based quantities, such as the skewness and kurtosis, of the distribution. The quantile function is obtained as the inverse of the CDF of a distribution. That is, $Q_X(u) = F^{-1}(u)$. For the EFL distribution, the quantile function is obtained as follows:
where $W(xe^{x}) = x$ is the Lambert function. Substituting $u = 0.5$ into the quantile function gives the median of the EFL distribution as follows:
Moors' kurtosis (K) and Bowley's skewness (S) can be defined using the quantile function, respectively, as follows:
$K = \dfrac{Q(7/8) - Q(5/8) + Q(3/8) - Q(1/8)}{Q(6/8) - Q(2/8)}, \qquad S = \dfrac{Q(3/4) - 2Q(1/2) + Q(1/4)}{Q(3/4) - Q(1/4)}.$
Plots of the kurtosis and skewness of the EFL distribution are obtained and shown in Figure 2 for α = 8.1, β = 1.8, and a range of values for θ and λ. It can be observed that the EFL distribution can assume various degrees of kurtosis and both negative and positive skewness.
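The quantile-based measures can be computed from any quantile function. The sketch below is a hedged Python illustration of the standard Bowley and Moors definitions. Because the EFL quantile function requires the Lambert function and the family in [20], the example uses the quantile function of the EF baseline, $Q(u) = \beta[-\log(1 - (1-u)^{1/\theta})]^{-1/\lambda}$, as a stand-in; this stand-in and all function names are assumptions made for illustration.

```python
import math


def bowley_skewness(q):
    """Bowley's skewness from a quantile function q(u)."""
    return (q(0.75) - 2.0 * q(0.5) + q(0.25)) / (q(0.75) - q(0.25))


def moors_kurtosis(q):
    """Moors' kurtosis from a quantile function q(u), based on octiles."""
    return (q(7 / 8) - q(5 / 8) + q(3 / 8) - q(1 / 8)) / (q(6 / 8) - q(2 / 8))


def ef_quantile(u, beta, theta, lam):
    """Quantile of the EF baseline (stand-in for the EFL quantile), 0 < u < 1."""
    t = -math.log(1.0 - (1.0 - u) ** (1.0 / theta))
    return beta * t ** (-1.0 / lam)


# Illustrative parameter values only; they are not taken from the study.
def q_ef(u):
    return ef_quantile(u, beta=1.8, theta=2.0, lam=2.5)


skew_ef = bowley_skewness(q_ef)
kurt_ef = moors_kurtosis(q_ef)
```

Passing a uniform variate through `ef_quantile` also gives a simple inverse-transform sampler for the baseline.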

Moments and Moment Generating Function.
The ordinary moments, incomplete moments, and moment generating function (MGF) of the EFL distribution are given in this section. These properties are useful for characterizing the distribution and for obtaining other properties of the distribution, such as the variance, coefficients of skewness and kurtosis, and mean excess function.

Ordinary Moments.
The ordinary moment of a distribution is defined as $E[X^r] = \int_0^{\infty} x^r f(x)\,dx$, $r = 1, 2, 3, \ldots$. Thus, the ordinary moment of the EFL distribution is obtained by substituting the PDF of the distribution in equation (11) into the definition. This is given as follows:
Letting $z = (i + j + k)(\beta/x)^{\lambda}$ and $y = (i + j + k + 1)(\beta/x)^{\lambda}$ in the first and second integrals, respectively, in equation (16), and after some algebraic manipulations, gives the ordinary moment of the EFL distribution as follows:
where $\Gamma(a) = \int_0^{\infty} x^{a-1} e^{-x}\,dx$ is the gamma function and $\Pi_{jkm}$ and $\Pi^{*}_{ijkm}$ are as defined in equation (11). The mean of the EFL distribution is obtained by letting $r = 1$ in equation (17). Thus, the mean of the EFL distribution is given as follows:
Important measures such as the standard deviation, coefficient of variation (CV), coefficient of skewness (CS), and coefficient of kurtosis (CK) of the EFL distribution can be obtained from the ordinary moments of the distribution. The standard deviation and CV are measures of risk and are defined as $\sigma = \sqrt{\mu_2' - \mu^2}$ and $\mathrm{CV} = \sigma/\mu$, respectively. Also, CS and CK are defined, respectively, as $\mathrm{CS} = (\mu_3' - 3\mu_2'\mu + 2\mu^3)/\sigma^3$ and $\mathrm{CK} = (\mu_4' - 4\mu_3'\mu + 6\mu_2'\mu^2 - 3\mu^4)/\sigma^4$. Table 1 shows the first four moments, σ, CV, CS, and CK of the EFL distribution for three sets of parameter values. Again, it can be observed that the EFL distribution can exhibit various degrees of kurtosis and skewness, including negative skewness.
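The conversion from raw moments to σ, CV, CS, and CK can be sketched in a few lines of Python. This is a minimal illustration of the formulas above; the function name is hypothetical, and the raw moments of the standard exponential distribution ($\mu_r' = r!$) are used only as a convenient check, since its skewness and kurtosis are known to be 2 and 9.

```python
import math


def moment_measures(m1, m2, m3, m4):
    """Standard deviation, CV, CS and CK from the first four raw moments."""
    sd = math.sqrt(m2 - m1 ** 2)
    cv = sd / m1
    cs = (m3 - 3.0 * m2 * m1 + 2.0 * m1 ** 3) / sd ** 3
    ck = (m4 - 4.0 * m3 * m1 + 6.0 * m2 * m1 ** 2 - 3.0 * m1 ** 4) / sd ** 4
    return {"sd": sd, "cv": cv, "cs": cs, "ck": ck}


# Check case: standard exponential raw moments 1, 2, 6, 24.
exp_measures = moment_measures(1.0, 2.0, 6.0, 24.0)
```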

Incomplete Moments.
The incomplete moment of a distribution with PDF f(x) is defined as $m_r(y) = \int_0^{y} x^r f(x)\,dx$, $r = 1, 2, 3, \ldots$. Substituting the PDF of the EFL distribution in equation (11) into the definition gives
After substitutions similar to those used for the rth ordinary moment, and some algebraic manipulations, the incomplete moment of the EFL distribution is obtained as follows:
where $\Gamma(a, y) = \int_y^{\infty} x^{a-1} e^{-x}\,dx$ is the upper incomplete gamma function. The first incomplete moment of the EFL distribution is given as follows:
The moment generating function (MGF) of a distribution is defined as $M_X(t) = E[e^{tX}]$ and is useful in obtaining moments of the distribution. Using the Taylor series expansion, the MGF can be written as $M_X(t) = \sum_{r=0}^{\infty} \frac{t^r}{r!} E[X^r]$. The MGF of the EFL distribution is obtained by substituting the ordinary moment in equation (17) into this expression. This gives the MGF of the EFL distribution as follows:

Actuarial Properties
In this section, some actuarial properties of the EFL distribution are obtained. These include mean excess function, limited expected value function, value at risk, tail value at risk, and tail variance.

Mean Excess Function.
The mean excess function is useful in many fields. It is also known as the mean residual life function or complete expectation of life. In an insurance context, for a policy with a fixed deductible, say t, the mean excess function is the expected payment, with losses less than t not paid. In a mortality context, it is the expected remaining lifetime of an individual, given that the individual has attained a particular age, say t. The mean excess function is defined as follows:
Using the PDF of the EFL distribution, letting $z_1 = (i + j + k)(\beta/x)^{\lambda}$ and $z_2 = (i + j + k + 1)(\beta/x)^{\lambda}$, and after some algebraic manipulations, we have
where $\gamma(a, y) = \int_0^{y} x^{a-1} e^{-x}\,dx$ is the lower incomplete gamma function. Substituting equation (25) into equation (23) gives the mean excess function of the EFL distribution as follows: (26)
Figure 3 shows some plots of the mean excess function for three sets of parameter values of the EFL distribution. It can be observed that the mean excess function generally increases and can assume both linear and nonlinear shapes.
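The deductible interpretation can be made concrete with an empirical version of the mean excess function. The Python sketch below is an illustration on a sample of losses, not the closed-form EFL expression; the function name and the toy losses are hypothetical.

```python
def empirical_mean_excess(losses, t):
    """Average of (x - t) over the losses x that exceed the deductible t."""
    excesses = [x - t for x in losses if x > t]
    if not excesses:
        raise ValueError("no loss exceeds the threshold t")
    return sum(excesses) / len(excesses)


# Toy losses for illustration only.
toy_losses = [1.0, 2.0, 3.0, 4.0, 5.0]
excess_at_2 = empirical_mean_excess(toy_losses, 2.0)  # mean of 1, 2, 3
```

Evaluating the function over a grid of thresholds and plotting the result is a common diagnostic: an increasing curve is usually read as a sign of a heavy right tail.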

Limited Expected Value Function.
Given a policy limit, or a deductible from a reinsurance perspective, say u, a limited loss random variable is defined as follows:
$X \wedge u = \begin{cases} X, & X < u, \\ u, & X \ge u. \end{cases}$
The limited expected value function is defined as the expectation of the limited loss random variable, given as follows:
$E[X \wedge u] = m_1(u) + u\,[1 - F(u)],$
where $m_1(u)$ is the first incomplete moment given in equation (21). Substituting equations (3) and (21) into the definition gives the limited expected value function of the EFL distribution as follows:

Value at Risk.
Value at risk (VaR) is a commonly used risk measure. VaR is defined as the loss that will not be exceeded with a given probability. Mathematically, given a probability p, $\mathrm{VaR} = \inf\{x \in \mathbb{R} : P(X \le x) \ge p\}$. Thus, VaR is also known as a quantile risk measure and is defined as $\mathrm{VaR} = F^{-1}(p)$ for a continuous distribution. VaR of the EFL distribution with probability p is defined as follows:
Substituting equation (25), with t = VaR, into the definition of the tail value at risk (TVaR) gives the TVaR of the EFL distribution as follows:
Figure 4 shows the plots of simulated values of VaR and TVaR for different parameter values and a range of confidence levels. It can be observed that increasing confidence levels are associated with increasing VaR and TVaR. This is consistent with practice, as more capital would have to be allocated for risk management purposes if a company wants to be safer at a higher probability.
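Simulated values of these risk measures can be obtained directly from a sample of losses. The following Python sketch shows one way to compute the empirical limited expected value, VaR, and TVaR. The choice of the empirical quantile (the smallest order statistic whose empirical CDF reaches p) and of TVaR as the mean of losses strictly above VaR are assumptions, and the function names are hypothetical.

```python
import math


def limited_expected_value(losses, u):
    """Sample mean of min(X, u), the limited loss at limit u."""
    return sum(min(x, u) for x in losses) / len(losses)


def empirical_var(losses, p):
    """Smallest observed loss x with empirical P(X <= x) >= p, for 0 < p < 1."""
    xs = sorted(losses)
    # The small offset guards against floating-point error in len(xs) * p.
    k = max(math.ceil(len(xs) * p - 1e-12), 1)
    return xs[k - 1]


def empirical_tvar(losses, p):
    """Mean of the losses that strictly exceed the empirical VaR at level p."""
    var_p = empirical_var(losses, p)
    tail = [x for x in losses if x > var_p]
    if not tail:
        raise ValueError("no loss exceeds VaR at this level")
    return sum(tail) / len(tail)


# Toy losses 1, 2, ..., 10 for illustration only.
toy_losses = [float(i) for i in range(1, 11)]
var_50 = empirical_var(toy_losses, 0.5)
tvar_50 = empirical_tvar(toy_losses, 0.5)
lev_3 = limited_expected_value(toy_losses, 3.0)
```

On the toy sample, raising p from 0.5 to 0.9 increases both measures, in line with the behaviour described for Figure 4.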

Tail Variance.
TVaR measures the expectation of losses exceeding VaR but does not measure the variability of these losses. Tail variance (TV) measures the conditional variance of losses given that they exceed VaR at a given probability. TV at a probability of p is defined as follows: Using the PDF of the EFL distribution in equation (1), we have

With the necessary substitutions and algebraic manipulations, equation (33) gives the TV of the EFL distribution as follows:
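As an empirical counterpart to the tail variance, the conditional variance of the losses beyond VaR can be computed from a sample. This Python sketch assumes the same empirical VaR convention as an order statistic and uses the population form of the variance (divisor equal to the number of tail losses); both are illustrative choices, and the function names are hypothetical.

```python
import math


def empirical_var(losses, p):
    """Smallest observed loss x with empirical P(X <= x) >= p, for 0 < p < 1."""
    xs = sorted(losses)
    k = max(math.ceil(len(xs) * p - 1e-12), 1)
    return xs[k - 1]


def empirical_tail_variance(losses, p):
    """Variance of the losses that strictly exceed the empirical VaR at level p."""
    var_p = empirical_var(losses, p)
    tail = [x for x in losses if x > var_p]
    if not tail:
        raise ValueError("no loss exceeds VaR at this level")
    tail_mean = sum(tail) / len(tail)  # this is the empirical TVaR
    return sum((x - tail_mean) ** 2 for x in tail) / len(tail)


# Toy losses 1, 2, ..., 10 for illustration only.
toy_losses = [float(i) for i in range(1, 11)]
tv_50 = empirical_tail_variance(toy_losses, 0.5)
```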

Parameter Estimation Methods
Estimators of the parameters of the EFL distribution are presented in this section. Four different estimation methods including maximum likelihood, maximum product spacing, least squares, and weighted least squares estimation methods are presented.

Maximum Likelihood Estimation.
Let $x_1, x_2, \ldots, x_n$ be a random sample of n independent and identically distributed observations from the EFL distribution with parameter vector $\phi = (\alpha, \beta, \theta, \lambda)'$. The total log-likelihood function of the density of the distribution given in equation (4) is obtained as follows:
Equating the score functions, which are obtained by differentiating equation (37) with respect to each parameter, to zero and solving them simultaneously gives the maximum likelihood estimates (MLE) of the parameters of the EFL distribution. Numerical methods are employed to obtain the parameter estimates since the equations do not have closed-form solutions.

Maximum Product Spacing Estimation.
The maximum product spacing (MPS) method of estimating parameters is an alternative to the maximum likelihood method. Let the ordered random sample from the EFL distribution be given as $x_{(1)}, x_{(2)}, x_{(3)}, \ldots, x_{(n)}$ with CDF F(x) given in equation (3). Define the uniform spacing as follows:
where $F(x_{(0)} \mid \phi) = 0$, $F(x_{(n+1)} \mid \phi) = 1$, and $\sum_{i=1}^{n+1} H_i(\phi) = 1$. The MPS estimates of the parameters are obtained by maximizing the function as follows:
with respect to each parameter.
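The MPS criterion can be sketched generically for any CDF. The Python example below assumes the usual definition of the spacings, $H_i(\phi) = F(x_{(i)} \mid \phi) - F(x_{(i-1)} \mid \phi)$ for $i = 1, \ldots, n+1$, and takes the objective to be the mean of the log spacings. The function name is hypothetical, and the small floor applied to the spacings is an illustrative safeguard against tied observations rather than part of the method.

```python
import math


def mps_objective(cdf, sample):
    """Mean log-spacing (1 / (n + 1)) * sum(log H_i) for a candidate CDF."""
    xs = sorted(sample)
    # Boundary conventions: F(x_(0)) = 0 and F(x_(n+1)) = 1.
    f = [0.0] + [cdf(x) for x in xs] + [1.0]
    spacings = [f[i] - f[i - 1] for i in range(1, len(f))]
    # Floor zero spacings (ties) so that the logarithm stays finite.
    return sum(math.log(max(h, 1e-300)) for h in spacings) / len(spacings)


def uniform_cdf(x):
    """CDF of the uniform distribution on (0, 1), used as a toy model."""
    return min(max(x, 0.0), 1.0)


even = mps_objective(uniform_cdf, [0.75, 0.25, 0.5])
uneven = mps_objective(uniform_cdf, [0.1, 0.2, 0.3])
```

In practice the CDF would depend on the parameter vector, and the objective would be passed (with its sign flipped) to a numerical minimizer.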

Ordinary and Weighted Least Squares Estimation.
Let the ordered sample from the EFL distribution be given as $x_{(1)}, x_{(2)}, x_{(3)}, \ldots, x_{(n)}$. The ordinary least squares (OLS) estimates of its parameters are obtained by minimizing the function as follows:
with respect to the parameters of the distribution. Also, the weighted least squares (WLS) estimates are obtained by minimizing the following function with respect to the parameters of the distribution: (41)
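A hedged Python sketch of the two least squares criteria is given below. It assumes the commonly used forms, in which the fitted CDF at the ith order statistic is compared with $i/(n+1)$ and the WLS weights are $(n+1)^2(n+2)/[i(n-i+1)]$, the reciprocal variances of uniform order statistics. The function names are hypothetical.

```python
def ols_objective(cdf, sample):
    """Sum of squared gaps between F(x_(i)) and i / (n + 1)."""
    xs = sorted(sample)
    n = len(xs)
    return sum((cdf(x) - i / (n + 1)) ** 2 for i, x in enumerate(xs, start=1))


def wls_objective(cdf, sample):
    """Same gaps as OLS, weighted by (n + 1)^2 (n + 2) / (i (n - i + 1))."""
    xs = sorted(sample)
    n = len(xs)
    total = 0.0
    for i, x in enumerate(xs, start=1):
        weight = (n + 1) ** 2 * (n + 2) / (i * (n - i + 1))
        total += weight * (cdf(x) - i / (n + 1)) ** 2
    return total


def uniform_cdf(x):
    """CDF of the uniform distribution on (0, 1), used as a toy model."""
    return min(max(x, 0.0), 1.0)


ols_value = ols_objective(uniform_cdf, [0.2, 0.9])
wls_value = wls_objective(uniform_cdf, [0.2, 0.9])
```

The weights are largest for the extreme order statistics, so WLS penalizes misfit in the tails more heavily than OLS.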

Simulation Studies
Simulation studies are carried out in this section to assess the performance of the parameter estimators. The R program with the nlminb function is used for the simulation. The function uses the L-BFGS-B optimization method. The simulation procedure is given as follows:
(i) Generate N = 3000 samples of sizes n = 20, 50, 100, 250, 500 from the EFL distribution using its quantile function in equation (12).
(ii) Compute the MLE, MPS, OLS, and WLS parameter estimates of the samples obtained in the previous step.
(iii) For each parameter estimate, obtain the average estimate (AE), absolute bias (AB), and root mean square error (RMSE), defined as follows:
where $\phi = (\alpha, \beta, \theta, \lambda)$.
(iv) Steps (i) to (iii) are repeated for two parameter sets.
Tables 2 and 3 show the simulation results. It can be observed that all the estimation methods are consistent, since their AE gets closer to the true parameter values while AB and RMSE approach zero as the sample size increases. However, for smaller sample sizes, WLS generally performed better for α and λ, while MPS and MLE performed better for β and θ, respectively, for both sets of simulations. For larger sample sizes, MLE and MPS generally performed better for all the parameters in both simulations. Due to the desirable properties of MLE, it will be used to estimate the parameters of the distribution for application purposes.
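The structure of steps (i) to (iii) can be sketched in Python with a stand-in model. The example below is an assumption-laden illustration: it uses the exponential distribution, whose rate has a closed-form MLE, instead of the EFL distribution; it takes AB as the mean absolute deviation from the true value and RMSE as the root of the mean squared deviation; and it uses far fewer replications than the study. All names are hypothetical.

```python
import math
import random


def summarize(estimates, true_value):
    """Average estimate, absolute bias and RMSE over simulation replicates."""
    n = len(estimates)
    ae = sum(estimates) / n
    ab = sum(abs(e - true_value) for e in estimates) / n
    rmse = math.sqrt(sum((e - true_value) ** 2 for e in estimates) / n)
    return ae, ab, rmse


def simulate_rate_mle(rate, n, reps, rng):
    """Draw exponential samples by inverse transform and return the rate MLEs."""
    estimates = []
    for _ in range(reps):
        sample = [-math.log(1.0 - rng.random()) / rate for _ in range(n)]
        estimates.append(n / sum(sample))  # closed-form MLE of the rate
    return estimates


rng = random.Random(2024)  # fixed seed so the sketch is reproducible
results = {
    n: summarize(simulate_rate_mle(2.0, n, 300, rng), 2.0)
    for n in (20, 100, 500)
}
```

For a consistent estimator, the AB and RMSE entries in `results` shrink as n grows, which is the pattern the text reports for Tables 2 and 3.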

EFL Regression Model
Regression analysis plays an important role in data analysis in most fields, including actuarial science. In this section, a new regression model with the response variable following the EFL distribution is given. The regression structure $h(\pi_i) = x_i'\delta$ is used, where $x_i = (1, x_{i1}, x_{i2}, \ldots, x_{ip})'$ is the ith vector of independent variables and $\delta = (\delta_0, \delta_1, \delta_2, \ldots, \delta_p)'$ is the vector of parameters. Here $h(\pi_i)$ is known as a link function and links the response variable to the independent variables. Generally, the response variable is linked to the independent variables via the mean. However, the response variable can also be linked to the independent variables via a quantile or a model parameter. When a model parameter is used, it is a scale or shape parameter [2]. In this study, the shape parameter λ is used, together with the log link function. This gives the response variable $Z \mid x_i$ following the EFL distribution with parameters α, β, θ, and $\lambda_i = \exp(x_i'\delta)$, $i = 1, 2, \ldots, n$. The PDF of the EFL regression model is given as follows:
The parameters of the EFL regression model can be obtained via the maximum likelihood method by maximizing the log-likelihood function given by
For practical purposes, after fitting a model, residual analysis is used to diagnose the model and assess its adequacy. In this study, Cox-Snell [23] residual analysis is employed. Cox-Snell residuals are defined as $e_i = -\log(1 - F(z_i; \hat{\phi}))$, $i = 1, 2, \ldots, n$, where $\hat{\phi}$ is a vector of estimated parameters. The Cox-Snell residuals follow the standard exponential distribution if the model fits the data. The adequacy of a model can also be investigated graphically using the Cox-Snell residuals.
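The log link and the Cox-Snell residuals can be sketched generically in Python. The example assumes that the fitted CDF values $F(z_i; \hat{\phi})$ are already available, and uses the plotting positions $(i - 0.5)/n$ for the P-P comparison, which is one common convention rather than necessarily the one used in the study. Function names are hypothetical.

```python
import math


def log_link_parameter(x_row, delta):
    """Shape parameter under a log link: lambda_i = exp(x_i' delta)."""
    return math.exp(sum(xj * dj for xj, dj in zip(x_row, delta)))


def cox_snell_residuals(cdf_values):
    """Cox-Snell residuals e_i = -log(1 - F(z_i)) from fitted CDF values."""
    return [-math.log(1.0 - f) for f in cdf_values]


def pp_points(residuals):
    """Pairs (empirical probability, standard exponential probability)."""
    ordered = sorted(residuals)
    n = len(ordered)
    return [((i - 0.5) / n, 1.0 - math.exp(-e))
            for i, e in enumerate(ordered, start=1)]


# Stand-in check: if the fitted model is standard exponential, e_i equals z_i.
z = [0.5, 1.0, 2.0]
fitted_cdf = [1.0 - math.exp(-zi) for zi in z]
residuals = cox_snell_residuals(fitted_cdf)
points = pp_points(residuals)
```

Points from `pp_points` that lie close to the diagonal indicate residuals that are compatible with the standard exponential distribution.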

Simulation Studies.
A Monte Carlo simulation study is carried out to assess the maximum likelihood estimators of the parameters of the EFL regression model. Three independent variables are considered in this simulation. Thus, the regression structure used is
The process used for the simulation is as follows:
(i) Generate 3,000 samples of sizes n = 25, 50, 150, 300, 600 from the EFL distribution using its quantile function, with the independent variables $x_1$, $x_2$, and $x_3$ drawn from a uniform U(0, 1) distribution.
(ii) Obtain the maximum likelihood estimates of the parameters $\alpha, \beta, \theta, \delta_0, \delta_1, \delta_2$, and $\delta_3$.
The results of the simulation study are given in Table 4. It can be observed that the estimators of the parameters are consistent, as the AE gets closer to the true parameter values while AB and RMSE decrease with increasing sample size.

Applications to Real Data
In this section, the applications of the EFL distribution and EFL regression model to real data are demonstrated. The first data set consists of the cost associated with natural catastrophic disasters in Australia from 1967 to 2014. The normed cost in millions of 2014 Australian dollars (AUD), computed as the inflated cost using the consumer price index, is used. The data can be found in the CASdatasets package [24] of the R program with the name auscathist. Table 5 shows the descriptive statistics of the data. The data set has 206 observations with a wide range of values. The median is less than the mean, which suggests that the data are right-skewed. Figure 5 shows the histogram and box plot of the data. Both plots confirm that the data are right-skewed. This suggests that the EFL distribution can be used to model the data.
The parameter estimates and their corresponding standard errors, in brackets, of the EFL distribution and the other competing distributions are shown in Table 6. Table 7 shows the goodness-of-fit measures of the distributions. It can be observed that the EFL distribution fits the data better than the competing distributions, as it has the smallest values of all the goodness-of-fit measures with large corresponding p-values. Figure 6 shows the PDF plot superimposed on the histogram of the data, the CDF, and probability-probability (P-P) plots of the EFL distribution. It can be observed that the EFL distribution fits the data.

Data 2: Automobile Collision Data.
The second data set consists of severity, the average amount of claims (in pounds sterling) adjusted for inflation, of automobile collisions in the United Kingdom. The data can be found in the insuranceData package [25] of the R program with the name AutoCollision. Table 8 shows the descriptive statistics of the data. The data set consists of 31 observations and indicates positive skewness, as its mean is greater than its median. Figure 7 shows the histogram and box plot of the data. The data can be observed to be positively skewed, which confirms the observation made from Table 8. Table 9 shows the parameter estimates of the EFL distribution and the other competing distributions with their standard errors in brackets.
The goodness-of-fit measures of the fitted distributions are shown in Table 10 with their corresponding p-values. It can be observed that the EFL distribution has the smallest values of the measures and the largest p-value. Figure 8 shows the histogram of the data with the fitted PDF, the CDF, and P-P plots of the EFL distribution. It can be observed that the EFL distribution can be used to model the automobile collision data.

Application of EFL Regression Model.
This subsection presents an application of the EFL regression model to a real data set. The data are obtained from the insuranceData package [25] in the R program with the name dataOhlsson and come from the former Swedish insurance company Wasa. The data set contains aggregated data on all insurance policies and claims from 1994 to 1998. In this data set, the variables used are the claim cost (z) in 10,000 Swedish krona, vehicle age ($x_1$), and MC class, a classification by the so-called EV ratio, defined as (engine power in kW × 100)/(vehicle weight in kg + 75), rounded down to the nearest integer. The 75 kg represents the average driver weight. The EV ratios are divided into seven classes. This data set was analyzed in a regression context by Gündüz and Genç [2]. The descriptive statistics of the claims and frequencies of the MC class are given in Table 11. It can be observed that there are 670 observations with more than zero claims. Also, MC class 6 can be observed to have the highest number of occurrences, with class 7 having the least. The independent variable MC class is a categorical variable with seven levels and is coded using indicator variables for the regression model. Given a categorical variable with p levels, p − 1 new indicator variables are introduced. In such a case, one of the categories is chosen as a reference level. Usually, the level with the highest frequency is used as the reference level. Similar to Gündüz and Genç [2], level 6 of the MC class is chosen as the reference level because it has the highest number of occurrences, as shown in Table 11. In this scenario, the following levels and their corresponding indicator variables are used: and $(7, x_7)$. Hence, the regression model considered is given as follows:
The performance of the EFL regression model is compared with the EF regression model with parametrization I as defined by Gündüz and Genç [2].
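The indicator coding with a reference level can be sketched as follows in Python. The function names are hypothetical, and the toy observations are invented solely to show how the most frequent level would be chosen as the reference; they are not the Wasa data.

```python
from collections import Counter


def most_frequent_level(observations):
    """Level with the highest frequency, a common choice of reference level."""
    return Counter(observations).most_common(1)[0][0]


def indicator_coding(level, levels, reference):
    """Code one categorical value as p - 1 indicators, omitting the reference."""
    levels = list(levels)
    if level not in levels:
        raise ValueError("unknown level")
    return [1 if level == other else 0 for other in levels if other != reference]


mc_levels = range(1, 8)                 # seven MC classes
toy_classes = [6, 6, 3, 1, 6, 7]        # invented observations
reference = most_frequent_level(toy_classes)
coded_class_3 = indicator_coding(3, mc_levels, reference)
coded_reference = indicator_coding(6, mc_levels, reference)
```

An observation in the reference class is coded as all zeros, so each coefficient of an indicator measures the difference between that class and the reference class.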
Table 12 shows the parameter estimates of the EFL and EF regression models with their corresponding standard errors (SE) and p-values. The average marginal estimate (AME) of each independent variable, which measures its average contribution, is also presented in Table 12, together with the negative log-likelihood (−ℓ), Akaike information criterion (AIC), and Bayesian information criterion (BIC). It can be observed from Table 12 that vehicle age is significant and MC class 3 is significantly different from MC class 6 at the 5% significance level for both regression models. Both of these variables have a negative impact on the claims, as can be observed from their AME for the EFL regression. However, vehicle age contributes positively to the claims in the EF regression model, while MC class 3 contributes negatively. Finally, the EFL regression model performs better in modelling the data than the EF model, as it has the smallest values of −ℓ, AIC, and BIC.
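The two information criteria follow directly from the negative log-likelihood. The Python sketch below uses the standard definitions $\mathrm{AIC} = 2k + 2(-\ell)$ and $\mathrm{BIC} = k \log n + 2(-\ell)$, where k is the number of estimated parameters and n the number of observations. The numerical inputs are hypothetical and are not values from Table 12.

```python
import math


def aic(neg_loglik, k):
    """Akaike information criterion from the negative log-likelihood."""
    return 2.0 * k + 2.0 * neg_loglik


def bic(neg_loglik, k, n):
    """Bayesian information criterion from the negative log-likelihood."""
    return k * math.log(n) + 2.0 * neg_loglik


# Hypothetical inputs for illustration only.
example_aic = aic(neg_loglik=100.0, k=3)
example_bic = bic(neg_loglik=100.0, k=3, n=50)
```

Both criteria penalize additional parameters, so a model with more parameters is preferred only if its log-likelihood improves by enough to offset the penalty.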
Cox-Snell residual analysis is performed on the fitted models to evaluate their fit. Figure 9 shows the P-P plots of the empirical probabilities of the residuals against the theoretical probabilities from the standard exponential distribution. It can be observed that the EFL regression model has more plotted points close to the diagonal than the EF regression model. This confirms that the EFL regression model performed better than the EF regression model in modelling the data.

Conclusion
A new loss distribution, called the exponentiated Fréchet loss distribution, is developed and studied. Various statistical properties, including the quantile function, moments, and moment generating function, are obtained. Also, some actuarial properties, including value at risk, tail value at risk, and tail variance of the distribution, are obtained. Four estimation methods are used to obtain the estimators of the parameters of the loss distribution. Simulation studies are performed to assess the performance of the estimators. The new loss distribution is extended into a regression model. The usefulness of the new loss distribution and its regression model is demonstrated using real data sets. The results show that the exponentiated Fréchet loss distribution and its regression model can serve as an alternative for modelling loss data.
Data Availability
The data used for the analysis are openly available, and the sources are stated in the manuscript.

Conflicts of Interest
The author declares that there are no conflicts of interest with respect to this research.