Extension and Application of Credibility Models in Predicting Claim Frequency

In nonlife actuarial science, credibility models are one of the main methods of experience ratemaking. The Bühlmann-Straub credibility model can be expressed as a special case of linear mixed models (LMMs) with the underlying assumption of normality. In this paper, we extend the assumption of the Bühlmann-Straub model to include Poisson and negative binomial distributions, as they are more appropriate for describing the distribution of the number of claims. By using the framework of generalized linear mixed models (GLMMs), we obtain generalized credibility premiums that contain other credibility premiums in the literature as particular cases. Compared to generalized linear mixed models, our extended credibility models also have the advantage that the credibility factor falls in the range from 0 to 1. The performance of our models in comparison with an existing model in the literature is also evaluated through numerical studies, which show that our approach produces premium estimates close to the optimum. In addition, our proposed model can also be applied to the most commonly used ratemaking approach, namely, the optimal Bonus-Malus system.


Introduction
Credibility theory is one of the key topics in the field of actuarial science. Since the beginning of the twentieth century, the greatest-accuracy credibility theory based on the Bayesian philosophy has gradually taken the place of the limited fluctuation credibility theory and has become widely used in nonlife ratemaking. In credibility theory, the insurance premium for the individual contract is derived from a convex combination of the prior mean, say $\mu$, and the mean of claim experience for each contract, say $\bar{Y}$, by $P = Z\bar{Y} + (1 - Z)\mu$, where $Z$ represents the credibility factor, ranging from 0 to 1. Jewell [1] shows that the credibility formula can be derived from Bayes' theorem by using a Poisson-gamma model. To simplify the credibility factor in the Bayesian method, Bühlmann [2] developed a linear credibility formula under the principle of minimum mean square error (least squares). Without assuming a prior distribution with specified distributional parameters, Bühlmann considered the best linear estimator based on observed claims, which can be estimated consistently by the method of moments [3]. Frangos and Vrontos [4] and Tzougas et al. [5] proposed an explicit expression for the Bayesian credibility model using the conjugate prior distribution, but this had limited applicability because the required integral is difficult to compute when a nonconjugate prior distribution is used. Frees et al. [6] discussed the credibility model as a special case of the linear mixed model, introducing the fixed effect and the random effect to represent, respectively, the overall mean of claims over the collection of subjects and the deviation of the individual mean of a specific subject from the overall mean. It is also well known that different methods can lead to an exact expression of the credibility factor. These methods include, among others, a generalization of distribution-free approaches [7], the posterior regret Γ-minimax approaches [8], Bayesian nonparametric methods [9, 10], and an approximate credibility formula [11].
While linear mixed models have applications in many other areas, they have limited applications in actuarial science. Because insurance data are usually right-skewed or discrete (as in claim frequency models), linear mixed models have to be extended to generalized linear mixed models (GLMMs); see Antonio and Beirlant [12]. These models no longer require the normal distribution assumption and can accommodate distributions in the exponential family, such as the Poisson and gamma distributions. With the introduction of random effects, the predictors in the generalized linear mixed models are similar to the predictors in the credibility model. The credibility factor can be deduced from the GLMM predictor, but it cannot be guaranteed to fall in the range from 0 to 1 (Meng, 2014), which violates a basic principle of the credibility model.
In this paper, we propose a new approach to credibility modeling, using the generalized linear mixed models framework for analyzing longitudinal claims data. The contributions of this article are as follows. First, we generalize the underlying normal assumption in the Bühlmann-Straub model to the Poisson and negative binomial distributions, which are more appropriate for describing claim frequency, thereby overcoming a shortcoming of the Bühlmann-Straub model. Second, we borrow the basic framework of generalized linear mixed models, using the fixed effect to describe the overall mean of claims over the collection of insurance contracts and the random effect to describe the deviation of the individual mean of a specific contract from the overall mean. We derive explicit expressions for the credibility formula under the Poisson and negative binomial assumptions and ensure that the credibility factor falls in the interval [0, 1], which makes the credibility predictor easier to interpret. Third, we provide a numerical method that gives practitioners access to the optimal Bonus-Malus system based on our proposed credibility models, allowing them to adjust the premium in the next year based on claim experience.
The remainder of the article is organized as follows. Section 2 introduces our notation and describes the relationship between the linear mixed model and the Bühlmann-Straub model. The new formulas for the credibility predictor and credibility factor for claim frequency under the Poisson and negative binomial distribution assumptions are derived in Section 3. Section 4 provides implementation details for parameter estimation. In Section 5, we give a numerical example to show the benefit of our model compared to others in the literature. Some concluding remarks are given in Section 6.

Basic Assumption
Consider a portfolio of insurance contracts of $N$ insureds, each with an available history of $T$ time periods. Denote by $[Y_{it} : i = 1, 2, \dots, N;\ t = 1, 2, \dots, T]$ the number of claims for individual $i$ during the time period $t$. Let $[\Theta_i : i = 1, 2, \dots, N]$ be the risk parameters that take into account all the common characteristics of the individual risks, which are referred to as potential individual characteristics in the Bühlmann-Straub model. Assume that, for every $i$, $(\Theta_i, Y_{it})$ are independent with $E(\Theta_i) = 1$.
In credibility theory, the credibility predictor has the following linear form:

$$\hat{Y}_{i,T+1} = Z_i \bar{Y}_i + (1 - Z_i)\mu, \tag{1}$$

where $\bar{Y}_i = (1/T) \sum_{t=1}^{T} Y_{it}$ is the average of the historical claims for the individual $i$ and $\mu$ is the overall mean in the insurance portfolio. The credibility factor $Z_i$ is a weight assigned to the individual's own claims experience.
The Bühlmann-Straub model can be considered as a special case of linear mixed models [13]. The credibility predictor is equivalent to that of a linear mixed model with only an intercept and a random effect. In actuarial practice, the parameters of the Bühlmann-Straub model are generally estimated using the method of moments, while in linear mixed models one uses the method of maximum likelihood. The two sets of estimates may differ, in particular for the credibility factor and the credibility predictor.
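As a minimal sketch of the moment-based estimation mentioned above, the following Python code computes the Bühlmann credibility factor for balanced data with unit exposure. The function name `buhlmann_moments`, the toy claim counts, and the truncation of the between-individual variance estimate at zero are illustrative assumptions and are not taken from the empirical study in this paper.

```python
def buhlmann_moments(claims):
    """Method-of-moments Buhlmann estimates for balanced data.

    claims: list of N lists, each holding T >= 2 claim counts (unit exposure).
    Returns (overall mean, credibility factor Z, list of predictions).
    """
    n, t = len(claims), len(claims[0])
    ind_means = [sum(row) / t for row in claims]
    mu = sum(ind_means) / n
    # Expected process variance: pooled within-individual variance.
    s2 = sum((y - m) ** 2
             for row, m in zip(claims, ind_means)
             for y in row) / (n * (t - 1))
    # Variance of hypothetical means, truncated at zero (assumption).
    a = max(sum((m - mu) ** 2 for m in ind_means) / (n - 1) - s2 / t, 0.0)
    z = t * a / (t * a + s2) if (t * a + s2) > 0 else 0.0
    preds = [z * m + (1 - z) * mu for m in ind_means]
    return mu, z, preds


toy = [[0, 1, 0], [2, 1, 3], [0, 0, 0]]  # hypothetical claim counts
mu_hat, z_hat, predictions = buhlmann_moments(toy)
```

A maximum likelihood fit of the corresponding linear mixed model would generally return somewhat different variance components, and hence a different credibility factor, for the same data.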
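Since the numbers of claims are discrete data, the implicit normal distribution assumption in the linear mixed models is not appropriate. Therefore, we generalize the normal distribution to include the Poisson and negative binomial distributions, which leads to generalized linear mixed models. For an individual $i$ in the year $t$, given the risk parameter $\Theta_i$, let the conditional random variable $Y_{it} \mid \Theta_i$ follow a distribution from the exponential family with probability density function

$$f(y_{it} \mid \theta_{it}, \phi) = \exp\left\{ \frac{y_{it}\theta_{it} - \psi(\theta_{it})}{\phi} + c(y_{it}, \phi) \right\}, \tag{2}$$

where $\phi$ is the dispersion parameter, $\theta_{it}$ is the natural parameter, and $\psi(\cdot)$ and $c(\cdot)$ are known functions. The mean and variance of the conditional random variable $Y_{it} \mid \Theta_i$ can be expressed as

$$\mu_{it} = E(Y_{it} \mid \Theta_i) = \psi'(\theta_{it}), \qquad \mathrm{Var}(Y_{it} \mid \Theta_i) = \phi\,\psi''(\theta_{it}) = \phi\,V(\mu_{it}), \tag{3}$$

where $V(\cdot)$ is the variance function.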
In the generalized linear mixed models framework with intercept only (similar to the credibility model), we model the relationship between the conditional mean and a predictor consisting of a fixed effect $\beta_0$ and an unobserved random effect $b_i$ through a link function $g(\cdot)$:

$$g(\mu_{it}) = \beta_0 + b_i. \tag{4}$$

In actuarial applications, the log link function is usually used, that is,

$$\mu_{it} = E(Y_{it} \mid b_i) = \exp(\beta_0 + b_i) = \exp(\beta_0)\exp(b_i), \tag{5}$$

where $\exp(\beta_0)$ represents the overall mean over the whole insurance portfolio. The prediction of the number of claims for an individual $i$ in the next year is obtained by applying the adjustment factor $\exp(b_i)$ to this overall mean.
To ensure that a priori ratemaking is correct on average, we need to apply the following constraint on the random effect in (5): $E(e^{b_i}) = 1$. The a priori premium is then given by

$$E(Y_{it}) = E[E(Y_{it} \mid b_i)] = e^{\beta_0} E(e^{b_i}) = e^{\beta_0}. \tag{6}$$

In (5), if the random effect, $b_i$, follows a normal distribution with zero mean, $b_i \sim N(0, \sigma^2)$, then these models are generalized linear mixed models with intercept only. However, under the constraint $E(e^{b_i}) = 1$, the random effect follows a normal distribution with a nonzero mean, $b_i \sim N(-\sigma^2/2, \sigma^2)$. Under this assumption, the estimator of the overall mean is $e^{\beta_0}$ and the estimator for the $i$th individual is $e^{\beta_0} \times e^{b_i}$. The estimator for the $i$th individual can thus be expressed as a ratio adjustment to the overall mean with the adjusting factor $e^{b_i}$.
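The effect of the mean shift can be checked numerically. The sketch below, written under the assumption that the random effect is normal with mean −σ²/2 and variance σ², compares the closed-form lognormal moments of the exponentiated random effect with a simulated average; the function names, the value σ² = 0.25, the seed, and the sample size are arbitrary illustrative choices.

```python
import math
import random


def exp_random_effect_moments(sigma2):
    """Closed-form mean and variance of exp(b) for b ~ N(-sigma2/2, sigma2)."""
    mean = math.exp(-sigma2 / 2 + sigma2 / 2)      # lognormal mean, equals 1
    var = (math.exp(sigma2) - 1.0) * mean ** 2     # lognormal variance
    return mean, var


def simulated_mean(sigma2, n=200_000, seed=1):
    """Monte Carlo average of exp(b) under the same shifted normal law."""
    rng = random.Random(seed)
    mu, sd = -sigma2 / 2, math.sqrt(sigma2)
    return sum(math.exp(rng.gauss(mu, sd)) for _ in range(n)) / n


mean_exact, var_exact = exp_random_effect_moments(0.25)
mean_mc = simulated_mean(0.25)
```

With a zero-mean normal random effect the same calculation would give a mean of exp(σ²/2) rather than 1, which is why the shift is needed for the a priori premium to be correct on average.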
A number of commonly used probability distributions (discrete or continuous) follow the form given in (2); these include the normal, Poisson, binomial, gamma, inverse Gaussian, and geometric distributions. In this paper, we focus on two distributions commonly used to model the number of claims, namely, the Poisson and negative binomial distributions.

Extended Credibility Model
The random variable, $Y_{it}$, represents the number of claims for the $i$th individual insurance contract in the $t$th year. The credibility model is aimed at predicting the next year's loss, $Y_{i,T+1}$, based on the historical observations $(Y_{i1}, Y_{i2}, \dots, Y_{iT})$. The estimator, $\hat{Y}_{i,T+1}$, for the next year in a credibility model is a function of the historical claims, that is, $\hat{Y}_{i,T+1} = f(Y_{i1}, Y_{i2}, \dots, Y_{iT})$. In the greatest-accuracy credibility approach [2], the credibility estimator is constrained to be linear in the historical observations,

$$\hat{Y}_{i,T+1} = a_0 + \sum_{t=1}^{T} a_t Y_{it}, \tag{7}$$

where the coefficients $a_0, \dots, a_T$, for $i = 1, \dots, N$, are estimated under the quadratic loss function: that is, find $a_0, \dots, a_T$ such that the expected squared difference between $Y_{i,T+1}$ and $\hat{Y}_{i,T+1}$ is minimum:

$$\min_{a_0, \dots, a_T} E\left[ \left( Y_{i,T+1} - a_0 - \sum_{t=1}^{T} a_t Y_{it} \right)^2 \right]. \tag{8}$$

Unlike the estimation method in the Bühlmann-Straub model, we estimate $\hat{Y}_{i,T+1}$ and $a_0, \dots, a_T$ using the framework of generalized linear mixed models. First, we present the following result as a lemma.

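Lemma 1. Let the random variable $Y_{it}$ denote the number of claims. In the framework of generalized linear mixed models, its expectation and variance are given by

$$E(Y_{it}) = \lambda_0, \tag{9}$$

$$\mathrm{var}(Y_{it}) = \phi\,E[V(\mu_{it})] + \lambda_0^2 V. \tag{10}$$

The credibility estimator for individual $i$ in the period $T + 1$ is given by

$$\hat{Y}_{i,T+1} = \lambda_0\, \frac{\phi\,E[V(\mu_{it})] + \lambda_0 V \sum_{t=1}^{T} Y_{it}}{\phi\,E[V(\mu_{it})] + T \lambda_0^2 V}, \tag{11}$$

where $V$ (without an argument) denotes the variance of the exponential transformation of the random effect, $V := \mathrm{var}(e^{b_i})$, $V(\cdot)$ denotes the variance function, and $\lambda_0$ denotes the overall mean over the insurance contracts, $\lambda_0 := e^{\beta_0}$.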
We then obtain the final expression for the coefficients $a_0, \dots, a_T$. Substituting this result in (13) leads to the credibility predictor in (11). This suggests a more flexible modeling approach that can take the place of the Bühlmann credibility model. Note that the form of the credibility estimator in our extended credibility model varies with the expectation and the variance function $V(\cdot)$ of the response variable $Y_{it}$. In the following sections, we mainly discuss two special cases of extended credibility models for predicting the claim frequency: the extended Poisson credibility model and the extended negative binomial credibility model.
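The minimization step can be sketched as follows. This is the standard Bühlmann-type argument, written under the added assumption that, given the random effect $b_i$, the counts $Y_{i1}, \dots, Y_{i,T+1}$ are independent with common conditional mean $\lambda_0 e^{b_i}$, where $\lambda_0 = e^{\beta_0}$ and $V = \mathrm{var}(e^{b_i})$. The shorthand $s^2$ and the coefficient names $a_0, a_t$ are introduced here for illustration and need not match the intermediate equations of the original proof.

$$
\begin{aligned}
s^2 &:= E[\mathrm{Var}(Y_{it} \mid b_i)], \qquad \mathrm{Cov}(Y_{it}, Y_{is}) = \mathrm{Var}(\lambda_0 e^{b_i}) = \lambda_0^2 V \quad (t \neq s), \\
\frac{\partial}{\partial a_0}:&\quad a_0 = \lambda_0 \Bigl( 1 - \sum_{t=1}^{T} a_t \Bigr), \\
\frac{\partial}{\partial a_s}:&\quad \lambda_0^2 V = a_s s^2 + \lambda_0^2 V \sum_{t=1}^{T} a_t \qquad (s = 1, \dots, T), \\
\Rightarrow\quad a_1 &= \dots = a_T = \frac{\lambda_0^2 V}{s^2 + T \lambda_0^2 V}, \qquad Z = \sum_{t=1}^{T} a_t = \frac{T \lambda_0^2 V}{s^2 + T \lambda_0^2 V}, \\
\hat{Y}_{i,T+1} &= Z \bar{Y}_i + (1 - Z) \lambda_0 .
\end{aligned}
$$

Because $s^2 \geq 0$ and $V \geq 0$, the weight $Z$ obtained in this way always lies in $[0, 1]$.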

Extended Poisson Credibility Model.
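Let the number of claims follow a Poisson distribution, that is, $Y_{it} \mid b_i \sim \mathrm{Poisson}(\lambda_0 e^{b_i})$, with probability function

$$P(Y_{it} = y \mid b_i) = \frac{(\lambda_0 e^{b_i})^{y} \exp(-\lambda_0 e^{b_i})}{y!}, \qquad y = 0, 1, 2, \dots$$

In this case, the variance function for a Poisson distribution is $V(\mu_{it}) = \mu_{it}$, and the expectation and variance of $Y_{it}$ take the exact form

$$E(Y_{it}) = \lambda_0, \qquad \mathrm{var}(Y_{it}) = \lambda_0 + \lambda_0^2 V.$$

Substituting the above expectation and variance of $Y_{it}$ into (11), the credibility predictor simplifies to

$$\hat{Y}_{i,T+1} = \lambda_0\, \frac{1 + V \sum_{t=1}^{T} Y_{it}}{1 + T \lambda_0 V}, \tag{27}$$

where $V := \mathrm{var}(e^{b_i})$ and the overall mean is $\lambda_0 := e^{\beta_0}$. Note that this credibility predictor can be decomposed into the familiar form of the Bühlmann credibility predictor, a weighted average of the overall mean $\lambda_0$ over the collection of insurance contracts and the averaged experience loss $\bar{Y}_i$,

$$\hat{Y}_{i,T+1} = Z \bar{Y}_i + (1 - Z) \lambda_0, \tag{28}$$

where the credibility factor $Z$ is

$$Z = \frac{T \lambda_0 V}{1 + T \lambda_0 V}. \tag{29}$$

Extended Negative Binomial Credibility Model.
Let the number of claims follow a negative binomial distribution with conditional mean $\mu_{it} = \lambda_0 e^{b_i}$ and scale parameter $k$. In this case, the variance function is given by $V(\mu_{it}) = \mu_{it} + k \mu_{it}^2$, and the expectation and variance of $Y_{it}$ take the exact form

$$E(Y_{it}) = \lambda_0, \qquad \mathrm{var}(Y_{it}) = \lambda_0 + k \lambda_0^2 (1 + V) + \lambda_0^2 V.$$

Substituting the above expectation and variance into (11), the credibility predictor simplifies to

$$\hat{Y}_{i,T+1} = \lambda_0\, \frac{1 + k \lambda_0 (1 + V) + V \sum_{t=1}^{T} Y_{it}}{1 + k \lambda_0 (1 + V) + T \lambda_0 V}, \tag{33}$$

where $V := \mathrm{var}(e^{b_i})$ and the overall mean is $\lambda_0 := e^{\beta_0}$. The extended negative binomial credibility predictor can likewise be written in the linear form found in Section 2, $\hat{Y}_{i,T+1} = Z \bar{Y}_i + (1 - Z) \lambda_0$, where the credibility factor $Z$ is given by

$$Z = \frac{T \lambda_0 V}{1 + k \lambda_0 (1 + V) + T \lambda_0 V}. \tag{34}$$

It is clear from (29) and (34) that the extended claim frequency model we propose results in a credibility factor that lies within the interval [0, 1].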
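The two special cases can be put side by side in a few lines of Python. This is a sketch under stated assumptions: the conditional variance is taken to be μ for the Poisson case and μ + kμ² for the negative binomial case, the overall mean is λ0 = exp(β0), V is the variance of the exponentiated random effect, and the credibility factor follows from the Bühlmann formula Z = T·a / (T·a + s²) with a = λ0²V and s² = λ0 + kλ0²(1 + V). The function names and the numerical inputs are hypothetical.

```python
def credibility_factor(lam0, V, T, k=0.0):
    """Credibility factor for T observed periods.

    k = 0 gives the Poisson case; k > 0 gives the negative binomial case
    with conditional variance mu + k * mu**2.
    """
    # Expected process variance divided by lam0.
    process_var = 1.0 + k * lam0 * (1.0 + V)
    return T * lam0 * V / (process_var + T * lam0 * V)


def credibility_predictor(claims, lam0, V, k=0.0):
    """Predicted claim frequency for next period from a non-empty history."""
    T = len(claims)
    z = credibility_factor(lam0, V, T, k)
    return z * (sum(claims) / T) + (1.0 - z) * lam0


# Hypothetical inputs: three years of claims for one policyholder.
history = [0, 2, 1]
z_poisson = credibility_factor(0.1, 1.5, len(history))
z_negbin = credibility_factor(0.1, 1.5, len(history), k=0.2)
pred_poisson = credibility_predictor(history, 0.1, 1.5)
```

For fixed λ0, V, and T, a positive k enlarges the expected process variance, so the negative binomial factor is never larger than the Poisson one; both stay between 0 and 1 whenever λ0, V, and k are nonnegative.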

Relationship between Extended Model and Bayesian Credibility Model.
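In Bayesian credibility theory, the Poisson-gamma credibility model is used to predict claim frequency [14]. It assumes that the numbers of claims follow a Poisson distribution with mean parameter $\lambda_0 \Theta_i$, where $\Theta_i \sim \mathrm{Gamma}(\alpha, \alpha)$, so that $E(\Theta_i) = 1$. The credibility estimator in the Poisson-gamma credibility model is given by the expectation of the posterior distribution,

$$\hat{Y}_{i,T+1} = \lambda_0\, \frac{\alpha + \sum_{t=1}^{T} Y_{it}}{\alpha + T \lambda_0} = Z \bar{Y}_i + (1 - Z) \lambda_0,$$

where the credibility factor in the Poisson-gamma credibility model is

$$Z = \frac{T \lambda_0}{\alpha + T \lambda_0}.$$

In the extended Poisson credibility model, if the risk parameter $\Theta_i$ and the random effect $b_i$ satisfy the relationship $\Theta_i = e^{b_i}$ with $E(\Theta_i) = E(e^{b_i}) = 1$, then the credibility factor in Section 3.1 can be expressed in a form similar to the credibility factor in the Poisson-gamma credibility model:

$$Z = \frac{T \lambda_0 V}{1 + T \lambda_0 V} = \frac{T \lambda_0}{1/V + T \lambda_0}, \qquad V = \mathrm{var}(\Theta_i) = \frac{1}{\alpha}.$$

This is identical to the Bayesian Poisson-gamma credibility model. Furthermore, if the random effect is normally distributed, $b_i \sim N(-\sigma^2/2, \sigma^2)$, then the credibility factor of the Poisson model can be rewritten as

$$Z = \frac{T \lambda_0 (e^{\sigma^2} - 1)}{1 + T \lambda_0 (e^{\sigma^2} - 1)}.$$

In this case the model reduces to the Poisson-lognormal model for claims count data [15].

Mathematical Problems in Engineering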
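The stated equivalence can be checked numerically. The sketch below assumes a Gamma prior with shape and rate both equal to α (so that its mean is 1 and its variance is 1/α) on a multiplicative risk parameter, with Poisson counts of mean λ0 times that parameter; the function names and the numerical inputs are hypothetical.

```python
import math


def poisson_gamma_posterior_mean(claims, lam0, alpha):
    """Posterior mean of next-period frequency under a Gamma(alpha, alpha) prior.

    The posterior of the risk parameter is
    Gamma(alpha + sum(claims), alpha + T * lam0).
    """
    return lam0 * (alpha + sum(claims)) / (alpha + len(claims) * lam0)


def extended_poisson_predictor(claims, lam0, V):
    """Linear credibility predictor with V = var(exp(b))."""
    return lam0 * (1.0 + V * sum(claims)) / (1.0 + len(claims) * lam0 * V)


def lognormal_credibility_factor(lam0, sigma2, T):
    """Credibility factor when b ~ N(-sigma2/2, sigma2), so V = exp(sigma2) - 1."""
    V = math.exp(sigma2) - 1.0
    return T * lam0 * V / (1.0 + T * lam0 * V)


history = [1, 0, 2]          # hypothetical claim counts
lam0, alpha = 0.15, 0.8      # hypothetical parameter values
bayes = poisson_gamma_posterior_mean(history, lam0, alpha)
linear = extended_poisson_predictor(history, lam0, 1.0 / alpha)
z_lognormal = lognormal_credibility_factor(lam0, 0.9, len(history))
```

The two predictors agree exactly in the gamma case because the Bayesian posterior mean is itself linear in the observations; with a lognormal random effect the linear predictor is only the best linear approximation to the posterior mean.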

Parameter Estimation in GLMMs Framework
In order to obtain the credibility estimator $\hat{Y}_{i,T+1}$, one has to obtain estimates of the variance component, $\hat{V}$, and of the overall mean, $\hat{\lambda}_0$, in the extended Poisson credibility model. In addition, one has to estimate the scale parameter, $\hat{k}$, in the extended negative binomial credibility model. The estimator of the variance component, $\hat{V}$, is determined by the variance of the random effect, while $\hat{\lambda}_0$ can be obtained as the exponential transformation of the estimate of the fixed effect in the framework of generalized linear mixed models. The joint probability density function for the model can be expressed as $f(y_{it} \mid \beta_0, \phi, b_i)\, f(b_i)$. The parameters of the model can be estimated based on the marginal log-likelihood function

$$\ell = \sum_{i=1}^{N} \log \int \prod_{t=1}^{T} f(y_{it} \mid \beta_0, \phi, b_i)\, f(b_i)\, db_i,$$

where $f(\cdot \mid \cdot)$ denotes the probability distribution of the response variable and $f(b_i)$ denotes the density function of the random effect. The integral required in the log-likelihood function has no closed form in general. Thus, we apply a numerical approximation algorithm, adaptive Gaussian quadrature, which also enables us to apply the likelihood ratio test in our model [16]. While a number of software packages are available to solve this problem, we use the NLMIXED procedure in SAS, which we found the most convenient.
The parameters to be estimated in both the Poisson and negative binomial credibility models include the fixed effect, $\hat{\beta}_0$, and the variance component of the random effect, $\mathrm{Var}(b_i) = \hat{\sigma}^2$, together with the scale parameter $\hat{k}$ in the negative binomial credibility model. Since the random effect in our model follows a normal distribution $N(-\sigma^2/2, \sigma^2)$, the exponential transformation of the random effect, $\exp(b_i)$, follows a lognormal distribution, which leads to the parameter $\hat{V} = \exp(\hat{\sigma}^2) - 1$. Finally, the credibility predictor and credibility factor in (11) can be obtained by substituting $\hat{V}$, $\hat{\lambda}_0$, and $\hat{k}$ into the extended credibility model (28).
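To make the structure of the marginal likelihood concrete, the following sketch evaluates it for the Poisson case with a lognormal random effect. It deliberately replaces adaptive Gaussian quadrature with a plain trapezoidal rule on a fixed grid, which is cruder but needs only the standard library; it is not a substitute for PROC NLMIXED. The function names, the grid settings, and the toy data are assumptions made for illustration.

```python
import math


def poisson_logpmf(y, mu):
    """Log of the Poisson probability of count y with mean mu."""
    return y * math.log(mu) - mu - math.lgamma(y + 1)


def marginal_loglik(beta0, sigma, data, n_grid=801, z_max=8.0):
    """Marginal log-likelihood with b = -sigma^2/2 + sigma*z, z ~ N(0, 1).

    data: list of per-policyholder lists of claim counts.
    The integral over z uses the trapezoidal rule on [-z_max, z_max].
    """
    h = 2.0 * z_max / (n_grid - 1)
    total = 0.0
    for claims in data:
        integral = 0.0
        for j in range(n_grid):
            z = -z_max + j * h
            mu = math.exp(beta0 - 0.5 * sigma ** 2 + sigma * z)
            cond_loglik = sum(poisson_logpmf(y, mu) for y in claims)
            density = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
            weight = 0.5 if j in (0, n_grid - 1) else 1.0
            integral += weight * h * math.exp(cond_loglik) * density
        total += math.log(integral)
    return total


toy = [[0, 0, 0], [3, 2, 4], [0, 0, 0], [0, 1, 0]]  # hypothetical counts
beta0 = math.log(10 / 12)                            # log of the sample mean
ll_mixed = marginal_loglik(beta0, 1.0, toy)
ll_near_poisson = marginal_loglik(beta0, 1e-8, toy)
ll_poisson = sum(poisson_logpmf(y, 10 / 12) for row in toy for y in row)
```

In practice the function would be passed to a numerical optimizer over the fixed effect and the standard deviation of the random effect, and the estimate of V would then follow from the relation V = exp(σ²) − 1 given above.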

Empirical Study
In this section, we implement the estimation procedures from Section 4 and show how to use the resulting estimates to produce the credibility predictor and credibility factor under both the Poisson and the negative binomial distributions from Section 3. We used a claims dataset that was collected at a Chinese auto insurance company. It is a balanced longitudinal dataset and contains claims information from the calendar years 2006 to 2008. The dataset contains 9712 policyholders that stayed with the company for the complete 3-year period, resulting in 36748 insurance contracts, each with a risk exposure of 1. The mean of the number of claims is 0.2884783 and the variance is 0.4072942, which implies overdispersion. Table 1 presents the claim frequency distribution over time. In each year, the number of claims has a significant fraction of zeros. This is consistent with insurance practice, where insurers manage the risk pool through the diversification effect.
The parameter estimates are $\hat{\beta}_0 = -2.387$ and $\hat{V} = 1.455$ in the extended Poisson credibility model and $\hat{\beta}_0 = -1.942$, $\hat{V} = 1.281$, and $\hat{k} = 0.1434$ in the extended negative binomial credibility model. To demonstrate the advantages of our model, we also compare the results of the proposed credibility model with the linear mixed model (LMM), which has exactly the same form as the Bühlmann-Straub model following Frees (1998), and with optimal Bonus-Malus systems using finite mixture models following Tzougas et al. [5]. The differences between models produce different results.
We compare the goodness of fit of the competing models by using the AIC and BIC statistics based on the sample. As credibility models can be used to establish the optimal Bonus-Malus system (BMS), we derive the optimal BMS from our extended credibility models following Lemaire [17] and Frangos and Vrontos [4]. The BMS is defined from (27) and (33) and is presented in Tables 3 and 4. This BMS can be considered generous with good drivers and strict with bad drivers. Take the optimal BMS based on the Poisson distribution, for example: the bonus given for the first claim-free year is 12% of the basic premium, while drivers who have three accidents in the first year will have to pay a malus of 373% of the basic premium. Compared to the Poisson distribution, the optimal BMS based on the negative binomial credibility model has higher punishment and award of the premium; for example, the bonus given for the first claim-free year is 15% of the basic premium, and drivers who have three accidents in the first year will have to pay a malus of 292% of the basic premium. The results of the optimal BMS based on the Bühlmann-Straub model are also given in Table 5.
Compared to our extended credibility model, the optimal BMS based on the Bühlmann-Straub model has milder bonuses and maluses for the premium.
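The Poisson figures quoted above can be reproduced with a short calculation. The sketch assumes that the BMS premium is the Poisson credibility predictor expressed as a percentage of the base premium λ0, that is, 100·(1 + V·K)/(1 + t·λ0·V) for K total claims in t years, with λ0 = exp(β0) and the reported estimates β0 = −2.387 and V = 1.455. The function name and the table layout are illustrative.

```python
import math


def relative_premium(total_claims, years, lam0, V):
    """Premium as a percentage of the base premium (Poisson case)."""
    return 100.0 * (1.0 + V * total_claims) / (1.0 + years * lam0 * V)


lam0 = math.exp(-2.387)   # overall mean implied by the reported beta0
V = 1.455                 # reported variance component

# Rows: years of experience 1..3; columns: total claims 0..3.
bms_table = {
    (years, claims): round(relative_premium(claims, years, lam0, V))
    for years in range(1, 4)
    for claims in range(0, 4)
}

for years in range(1, 4):
    print(years, [bms_table[(years, claims)] for claims in range(4)])
```

Under these assumptions a claim-free first year gives a premium of about 88% of the base premium (a bonus of about 12%), and three claims in the first year give about 473% (a malus of about 373%), in line with the values stated in the text.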

Conclusion
The Bühlmann-Straub model is widely used in experience ratemaking but has a significant disadvantage: it incorporates an implicit normal distribution assumption, which is a poor model for discrete claim frequency.
To address this problem, we assumed Poisson and negative binomial models, which are more appropriate distributions for the claim frequency than the normal distribution.
The extended credibility model we propose has a more general credibility expression, which contains the Bayesian credibility model as a special case. When the exponential transformation of the random effect follows a gamma distribution, the extended credibility model reduces to the Poisson-gamma credibility model; when the random effect follows a normal distribution, the extended credibility model reduces to the Poisson-lognormal credibility model.
We have assumed that the claim frequency follows either a Poisson distribution or a negative binomial distribution and have provided a new approach to the credibility model. The new formulas for the credibility predictor and credibility factor for claim frequency under the Poisson and negative binomial distribution assumptions are derived, and the optimal Bonus-Malus system is modeled by using a claims dataset that was collected at a Chinese auto insurance company in the years 2006-2008. The empirical results show that the goodness-of-fit statistics of our proposed credibility models are much lower than those of the linear mixed model, the Bühlmann-Straub model, and the finite mixture model in the literature, which implies that our proposed models fit the data better.
Compared to the Bühlmann-Straub model, the optimal Bonus-Malus system based on our extended credibility models imposes a more severe malus and grants a larger bonus on the premium. In addition, compared to the generalized linear mixed model under the assumption of a Poisson or negative binomial distribution, the extended claim frequency credibility model not only yields an explicit credibility factor but also ensures that the credibility factor falls in the range from 0 to 1, whereas the generalized linear mixed model cannot be applied in the optimal Bonus-Malus system.

Table 1: Number and percentage of claims over the years.
Table 2 reports that our proposed models perform the best and that the AIC and BIC values under the Poisson and negative binomial distributions are almost the same, which indicates that there is no significant difference between our two extended credibility models. The linear mixed model (or Bühlmann-Straub model) performs worst, which indicates that it may not be appropriate for our dataset. This is not surprising because, as mentioned earlier, the Bühlmann-Straub model implicitly assumes a normal distribution for the response variable. We also fit this dataset

Table 3: Optimal BMS based on extended Poisson credibility model.

Table 4: Optimal BMS based on extended negative binomial credibility model.