A Bayesian Approach for Evaluation of Determinants of Health System Efficiency Using Stochastic Frontier Analysis and Beta Regression

Public expenditure on health is one of the most important issues facing governments today. Rising expenditure is putting pressure on public budgets. Health policy makers have therefore focused on the performance of their health systems, and many countries have introduced reforms to improve it. This study investigates the most important determinants of healthcare efficiency for OECD countries using a second-stage approach for Bayesian Stochastic Frontier Analysis (BSFA). The study has two steps. First, we measure the healthcare efficiency of 29 OECD countries by BSFA using data from the OECD Health Database. In the second stage, we examine the multiple relationships between healthcare efficiency and the characteristics of healthcare systems across OECD countries using Bayesian beta regression.


Introduction
Public expenditures on health are a matter of concern for governments in today's world. These expenditures have recently accelerated and are putting pressure on public budgets. Therefore, health policy makers have focused on the performance of their health systems, and many countries have introduced reforms to improve it. After two reports published by the World Health Organization (WHO), which estimated health system efficiency for 191 countries between 1993 and 1997, studies on health system efficiency increased significantly [1,2]. The WHO reports used a panel fixed-effects model to create a production frontier with a time-invariant inefficiency term. They modelled health outcome as the output of the production function, with health expenditures, their square, and educational levels as the inputs. Hollingsworth and Wildman [3] and Greene [4] used the same variables together with additional variables to estimate health system efficiency.
Stochastic Frontier Analysis (SFA) can be used for efficiency comparisons in the performance of health systems. Since the WHO reports in 2000, SFA, which is a parametric approach, has been used only a few times in the literature to compare health system efficiency in OECD countries. In this study, one of our main aims is to use SFA with a Bayesian approach. Despite its many advantages, there are few studies on SFA with a Bayesian approach. The Bayesian approach to SFA was first introduced by van den Broeck et al. [5]. Koop et al. [6] describe the use of Markov Chain Monte Carlo (MCMC) as a numerical integration method in a stochastic frontier framework. Further developments in BSFA are given in [7][8][9]. More recently, Griffin and Steel [10] describe MCMC methods for Bayesian analysis of stochastic frontier models using the freely available WinBUGS package. Tsionas and Papadakis [11] provide a Bayesian approach to the problem organized around simulation techniques. Tabak and Langsch Tecles [12] use a Bayesian stochastic frontier for cost and profit efficiencies of the Indian banking sector.
The recent literature includes further studies investigating the relationship between health spending and health outcomes. The stochastic frontier allows for the estimation of a potential level of outcome given expenditures in a country, where the deviation of the actual from the potential level is used in the computation of an inefficiency score. Wranik [13] assesses which policy-relevant characteristics of a healthcare system contribute to health system efficiency, measured using the stochastic frontier approach. de Cos and Moral-Benito [14] investigate the most important determinants of healthcare efficiency for OECD countries. In their study, the countries' health efficiencies were first estimated and ranked using alternative parametric and nonparametric indices. In the second stage, they regress efficiency scores on a set of 20 indicators representing health system characteristics, based on the database by Paris et al. [15]. They use Data Envelopment Analysis (DEA) and SFA to estimate health system efficiency, and least squares and Bayesian model averaging as regression methods.
In many research areas, the dependent variable commonly takes values in the standard unit interval (0, 1), as with rates, proportions, percentages, and fractions. Applying the linear regression model to this kind of dependent variable has some drawbacks. The major drawback is that fitted values for the dependent variable can exceed its lower and upper bounds. One way to overcome this is to transform the dependent variable so that it takes values on the real line. However, this approach has its own obstacles. The major one is that the model parameters cannot be interpreted in terms of the original dependent variable. Ferrari and Cribari-Neto [16] introduced a beta regression model, which is based on the assumption that the dependent variable is beta-distributed. Despite the many benefits associated with Bayesian beta regression, there are only a few studies that use it, such as [17][18][19][20].
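To make the boundary problem concrete, the following minimal sketch fits an ordinary least squares line to a handful of made-up proportions and extrapolates it. The data values and the helper names (`ols_fit`, `logit`, `inv_logit`) are illustrative assumptions and are not taken from the study. The same data are also fitted on the logit scale, where the back-transformed prediction stays inside (0, 1) but the slope no longer refers to the original scale.

```python
import math

def ols_fit(x, y):
    """Return (intercept, slope) of the least squares line y = a + b*x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope

def logit(p):
    return math.log(p / (1.0 - p))

def inv_logit(z):
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical proportions that approach the upper bound as x grows.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [0.55, 0.70, 0.82, 0.90, 0.95]

# Linear model on the raw scale, extrapolated to x = 8.
a, b = ols_fit(x, y)
linear_pred = a + b * 8.0

# Linear model on the logit scale, mapped back to the unit interval.
a_t, b_t = ols_fit(x, [logit(v) for v in y])
logit_pred = inv_logit(a_t + b_t * 8.0)
```

Here `linear_pred` exceeds 1, which is not a valid proportion, while `logit_pred` remains between 0 and 1.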
As mentioned earlier, two main alternative approaches are considered in the literature for estimating efficiency in the performance of health systems: DEA and SFA. The Bayesian approach requires a specific functional form for the production frontier and a source of randomness in production. The SFA approach includes both components, so we prefer SFA with a Bayesian approach to DEA in the first step of this study. For the second stage, classical linear regression and Tobit-type approaches have been used widely in the recent literature, and a few recent studies have used Bayesian model averaging. We prefer Bayesian beta regression for its robustness, since the other regression approaches produce more biased estimates.
In this study, the first aim is to estimate health system efficiency across OECD countries using BSFA. The second aim is to select the factors affecting the health system efficiency using Bayesian beta regression. We used the extended data set used by de Cos and Moral-Benito [14].

Bayesian Stochastic Frontier Model. The SFA approach proposed by Aigner et al. [21] can be specified as follows:

$$y_i = x_i^{\top}\beta + v_i - u_i,$$

where $y_i$ is log output, $x_i$ is a vector of input measures, $\beta$ is the vector of coefficients, $v_i$ is an independent and identically distributed error term, and $u_i \geq 0$ is technical inefficiency. The composed error term $\varepsilon_i = v_i - u_i$ combines the symmetric noise $v_i$ with the one-sided inefficiency $u_i$.
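The posterior distribution is shown as follows:

$$p(\beta, \theta, u \mid y, X) \propto p(y \mid \beta, u, X)\, p(u \mid \theta)\, p(\beta)\, p(\theta),$$

where $\beta$ is the vector of coefficients, $\theta$ is the vector of parameters in the prior distribution of $u$, and $X$ is the matrix of the inputs.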
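As a rough illustration of the data-generating process behind a stochastic frontier, the sketch below simulates log output as a linear frontier plus symmetric noise minus a non-negative inefficiency term. The half-normal choice for the inefficiency, the single input, the parameter values, and the function name `simulate_frontier` are all assumptions made for illustration only.

```python
import random

def simulate_frontier(n, beta0, beta1, sigma_v, sigma_u, seed=0):
    """Simulate y_i = beta0 + beta1*x_i + v_i - u_i with half-normal u_i."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = rng.uniform(0.0, 2.0)          # a single (log) input
        v = rng.gauss(0.0, sigma_v)        # symmetric noise
        u = abs(rng.gauss(0.0, sigma_u))   # one-sided inefficiency, u >= 0
        rows.append((x, beta0 + beta1 * x + v - u))
    return rows

rows = simulate_frontier(n=500, beta0=1.0, beta1=0.5, sigma_v=0.05, sigma_u=0.2)

# Share of observations that fall below the deterministic frontier.
below = sum(1 for x, y in rows if y < 1.0 + 0.5 * x)
share_below = below / len(rows)
```

Because the inefficiency term can only pull output down, most simulated observations lie below the deterministic part of the frontier; the noise term is what occasionally places a unit above it.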

Beta Regression Model. Let $y$ be a continuous variable which takes values in the unit interval (0, 1). The variable is assumed to be beta-distributed with the following parameterization:

$$f(y; \mu, \phi) = \frac{\Gamma(\phi)}{\Gamma(\mu\phi)\,\Gamma((1-\mu)\phi)}\, y^{\mu\phi-1}(1-y)^{(1-\mu)\phi-1}, \quad 0 < y < 1,$$

where $0 < \mu < 1$ and $\phi > 0$. Practitioners often model variables of this kind, such as proportions, percentages, or rates, with regression analysis [22]. Beta regression is used when the response variable is restricted to the interval (0, 1). The regression model is obtained from the parameterization in terms of the mean and precision, with $E(y) = \mu$ and $\operatorname{Var}(y) = \mu(1-\mu)/(1+\phi)$. The mean response is related to the linear predictor through a monotonic and twice differentiable link function such that

$$g(\mu_i) = x_i^{\top}\beta = \eta_i,$$

where $\beta = (\beta_1, \beta_2, \ldots, \beta_k)^{\top}$ represents the regression coefficients, $x_i = (x_{i1}, x_{i2}, \ldots, x_{ik})^{\top}$ denotes the predictor variables in the mean model, and $g(\cdot)$ is the link function. Several link functions are available, such as the logit, probit, and complementary log-log. We use the logit link for the regression analysis. The beta regression model also contains a precision parameter $\phi$; the dispersion parameter is its reciprocal, $\phi^{-1}$. For this study, we assume that the precision parameter is constant across all observations. The beta coefficients are estimated from the log-likelihood function of the model, which is

$$\ell(\beta, \phi) = \sum_{i=1}^{n} \ell_i(\mu_i, \phi),$$

$$\ell_i(\mu_i, \phi) = \log\Gamma(\phi) - \log\Gamma(\mu_i\phi) - \log\Gamma((1-\mu_i)\phi) + (\mu_i\phi - 1)\log y_i + ((1-\mu_i)\phi - 1)\log(1-y_i).$$

The beta coefficients of the mean model can be estimated from score functions derived from the log-likelihood using numerical optimization. The most commonly used methods for maximum likelihood estimation of the model are the Fisher scoring and Newton-Raphson algorithms [23].
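The beta log-likelihood with a logit link and constant precision, as described above, can be written in a few lines. This is a minimal sketch using only the standard library; the function names are illustrative assumptions and no optimizer is included.

```python
import math

def inv_logit(eta):
    """Inverse of the logit link: maps the linear predictor to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-eta))

def beta_logpdf(y, mu, phi):
    """Log density of the beta distribution with mean mu and precision phi."""
    a, b = mu * phi, (1.0 - mu) * phi
    return (math.lgamma(phi) - math.lgamma(a) - math.lgamma(b)
            + (a - 1.0) * math.log(y) + (b - 1.0) * math.log(1.0 - y))

def beta_variance(mu, phi):
    """Variance implied by the mean-precision parameterization."""
    return mu * (1.0 - mu) / (1.0 + phi)

def beta_regression_loglik(beta, phi, X, y):
    """Log-likelihood with logit(mu_i) = x_i' beta and constant precision phi."""
    total = 0.0
    for xi, yi in zip(X, y):
        eta = sum(bj * xj for bj, xj in zip(beta, xi))
        total += beta_logpdf(yi, inv_logit(eta), phi)
    return total
```

A quick sanity check: with mean 0.5 and precision 2 the distribution reduces to the uniform on (0, 1), so `beta_logpdf` returns zero for any `y` in the interval and `beta_variance(0.5, 2.0)` equals 1/12.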
Computational and Mathematical Methods in Medicine 3

Results and Discussion
Our sample covers 29 OECD countries. The variables used in the first step for the estimation of efficiency indices are mainly taken from the OECD Health Database and cover the period between 1997 and 2009; in contrast, the variables referring to health system characteristics are obtained from Paris et al. [15] and represent a cross section for the year 2009. The measurement of productive efficiency is based on the relationship between output produced and inputs required for production. In this paper we consider life expectancy as output in the same way as [13,24]. Turning to the inputs in the health production function, we consider per capita GDP, per capita health expenditures, education, tobacco consumption, alcohol consumption, fruit and vegetables consumption, and nitrogen oxide emissions.
The information referring to health system characteristics draws on a survey in which health authorities of each country respond to 269 questions on their healthcare system. Paris et al. [15] then summarise the information in 20 healthcare policy indicators that take values between 0 (minimum) and 6 (maximum). These indicators include information on the influence of the market and regulations on healthcare users, insurers and suppliers, the characteristics of basic healthcare coverage, the management of the healthcare budget, and the decision-making process in the provision of healthcare systems. de Cos and Moral-Benito [14] give a brief description of each of the 20 indicators. Those variables are choice of insurer ($x_1$), insurer level for competition ($x_2$), over-the-basic coverage ($x_3$), degree of private provision ($x_4$), volume incentives ($x_5$), regulation of prices billed by providers ($x_6$), user information on quality and prices ($x_7$), regulation of the workforce and equipment ($x_8$), patient choice among providers ($x_9$), gatekeeping ($x_{10}$), price signals on users ($x_{11}$), priority setting ($x_{12}$), stringency of the budget constraint ($x_{13}$), regulation of prices paid by third-party payers ($x_{14}$), degree of decentralization ($x_{15}$), degree of delegation to insurers ($x_{16}$), consistency in responsibility ($x_{17}$), breadth ($x_{18}$), scope of basic coverage ($x_{19}$), and depth of coverage ($x_{20}$).
The proposed model at the first stage is specified as follows:

$$y_{it} = \beta_0 + \beta_1 \mathrm{GDP}_{it} + \beta_2 \mathrm{HEX}_{it} + \beta_3 \mathrm{EDU}_{it} + \beta_4 \mathrm{TOC}_{it} + \beta_5 \mathrm{ALC}_{it} + \beta_6 \mathrm{VEC}_{it} + \beta_7 \mathrm{NIT}_{it} + \beta_8 t + v_{it} - u_{it},$$

where, for the $i$th country in the $t$th year, $y_{it}$ is the natural logarithm of life expectancy, GDP is the natural logarithm of per capita GDP, HEX is the natural logarithm of per capita health expenditures, EDU is education, TOC is tobacco consumption, ALC is alcohol consumption, VEC is fruit and vegetables consumption, NIT is nitrogen oxide emissions, and $t$ is a time trend. $v_{it}$ is a symmetric disturbance capturing the effect of noise and, as usual, is assumed to be normal. $u_{it}$ is interpreted as an indicator of inefficiency in the production of life expectancy and is assumed to follow a half-normal distribution, i.i.d. $u_{it} \sim N^{+}(0, \lambda)$. For Bayesian inference we need to specify prior distributions for all unknown parameters. All parameters are assumed to be a priori independent. We assign $\beta \sim N(0, \sigma^{2})$. For the half-normal distribution, we assign a Gamma prior to the precision, that is, $\lambda^{-1} \sim \mathrm{Ga}(a, b)$. The complexity of these models makes it necessary to use numerical integration methods such as MCMC and, in particular, the Gibbs sampling algorithm with data augmentation as introduced by Koop et al. [6]. For the proposed model, implementation was carried out using the WinBUGS package.
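Whatever sampler is used, it targets the joint posterior of the coefficients, the latent inefficiencies, and the half-normal scale. The sketch below writes out an unnormalised log posterior for a model of this shape: normal noise, half-normal inefficiency with variance `lam`, normal priors on the coefficients, and a Gamma prior on `1/lam`. It is not the WinBUGS code used in the study. The noise variance is held fixed for brevity, and the hyperparameter values and the name `log_joint` are placeholders chosen for illustration.

```python
import math

def normal_logpdf(z, mean, var):
    return -0.5 * (math.log(2.0 * math.pi * var) + (z - mean) ** 2 / var)

def log_joint(beta, u, lam, X, y, sigma_v2=0.01, prior_var=1e6, a=1.0, b=0.01):
    """Unnormalised log posterior of (beta, u, lam) for y = X beta + v - u."""
    if lam <= 0.0 or any(ui < 0.0 for ui in u):
        return -math.inf                                    # outside the support
    lp = 0.0
    for xi, yi, ui in zip(X, y, u):
        fitted = sum(bj * xj for bj, xj in zip(beta, xi)) - ui
        lp += normal_logpdf(yi, fitted, sigma_v2)           # likelihood given u
        lp += math.log(2.0) + normal_logpdf(ui, 0.0, lam)   # half-normal prior on u
    lp += sum(normal_logpdf(bj, 0.0, prior_var) for bj in beta)
    # Ga(a, b) prior (shape a, rate b) on the precision tau = 1/lam,
    # with the Jacobian term for working in terms of lam.
    tau = 1.0 / lam
    lp += (a - 1.0) * math.log(tau) - b * tau - 2.0 * math.log(lam)
    return lp

# Tiny hypothetical example: one observation, intercept plus one input.
value = log_joint(beta=[1.0, 0.2], u=[0.1], lam=0.04, X=[[1.0, 0.5]], y=[1.0])
```

Treating the inefficiencies `u` as extra unknowns alongside the coefficients is the data augmentation idea: conditional on `u`, the model is an ordinary normal regression.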
Posterior summaries and densities for the proposed model, after running the MCMC algorithm for 37500 iterations and discarding the initial 12000, are provided in Table 1. Table 1 shows the posterior mean, sample standard deviation (SD), Monte Carlo error (MC error), and median with a 95% posterior credible interval. One way to assess the accuracy of the posterior estimates is to calculate the MC error for each parameter. This is an estimate of the difference between the mean of the sampled values and the true posterior mean. As a rule of thumb, the simulation should be run until the MC error for each parameter of interest is less than about 5% of the sample standard deviation. As seen in Table 1, the MC error for each parameter of interest is less than about 5% of the SD. Table 1 shows that GDP, health expenditure, tobacco consumption, fruit and vegetables consumption, nitrogen oxide emissions, and time are significant, whereas the other parameters are not significant.
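The 5% rule of thumb can be checked directly on a chain of draws. The sketch below estimates the MC error with batch means, which is one common estimator and is assumed here for illustration; the draws are simulated stand-ins, not output from the fitted model, and the function names are hypothetical.

```python
import math
import random

def mc_error_batch_means(draws, n_batches=50):
    """Estimate the Monte Carlo standard error of the mean by batch means."""
    size = len(draws) // n_batches
    means = [sum(draws[k * size:(k + 1) * size]) / size for k in range(n_batches)]
    grand = sum(means) / n_batches
    var_of_means = sum((m - grand) ** 2 for m in means) / (n_batches - 1)
    return math.sqrt(var_of_means / n_batches)

def sample_sd(draws):
    m = sum(draws) / len(draws)
    return math.sqrt(sum((d - m) ** 2 for d in draws) / (len(draws) - 1))

def passes_rule_of_thumb(draws, threshold=0.05):
    """True when the MC error is below the given fraction of the sample SD."""
    return mc_error_batch_means(draws) < threshold * sample_sd(draws)

# Stand-in for 25500 retained draws (37500 iterations minus 12000 discarded).
rng = random.Random(1)
draws = [rng.gauss(0.0, 1.0) for _ in range(25500)]
ok = passes_rule_of_thumb(draws)
```

For strongly autocorrelated chains the batch means estimate grows, so the same number of iterations may no longer satisfy the rule.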
The health system efficiencies of the 29 OECD countries are measured using BSFA. Technical efficiency scores are estimated for each country and the resulting ranking of health system efficiency is presented in Table 2.
Based on technical efficiency, the most efficient country was Australia and the least efficient was Turkey.
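When output is measured in logs, a standard way to turn sampled inefficiency terms into a technical efficiency score is to average exp(-u) over the MCMC draws and then rank the units. The sketch below assumes that convention; the unit names and draw values are fictitious and are not the results reported in Table 2.

```python
import math

def posterior_mean_efficiency(u_draws):
    """Average exp(-u) over MCMC draws of the inefficiency term."""
    return sum(math.exp(-u) for u in u_draws) / len(u_draws)

# Hypothetical draws of u for three fictitious units (not the paper's results).
draws = {
    "Unit A": [0.02, 0.03, 0.01],
    "Unit B": [0.10, 0.12, 0.08],
    "Unit C": [0.05, 0.06, 0.04],
}

scores = {name: posterior_mean_efficiency(d) for name, d in draws.items()}
ranking = sorted(scores, key=scores.get, reverse=True)   # most efficient first
```

Scores lie in (0, 1], with values closer to 1 indicating units nearer the frontier, which also makes them natural inputs for a beta-distributed response.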
At the second stage, Bayesian beta regression was performed to identify the factors related to the efficiency scores. The beta regression model used is as follows:

$$g(\mu_i) = x_i^{\top}\beta = \sum_{j=1}^{20} \beta_j x_{ij},$$

where $\beta = (\beta_1, \beta_2, \ldots, \beta_{20})^{\top}$ represents the regression coefficients, $x_i = (x_{i1}, x_{i2}, \ldots, x_{i20})^{\top}$ denotes the 20 healthcare policy indicators as the predictor variables, and $g(\cdot)$ is the link function. We prefer the logit link for the sake of simplicity. In this regression model, the 20 healthcare policy indicators described above are the independent variables and the efficiency scores obtained from BSFA are the dependent variable. These efficiency scores are assumed to be beta-distributed.
For Bayesian inference, we need to specify prior distributions for all regression parameters. All parameters are assumed to be a priori independent. We assign all $\beta_j \sim N(0, \sigma^{2})$. Posterior summaries and densities for the Bayesian beta regression model, after running the MCMC algorithm for 48000 iterations and discarding the initial 9500, are provided in Table 3. The posterior mean, SD, and MC error with a 95% posterior credible interval are presented in Table 3. As seen in Table 3, the MC error for each parameter of interest is less than about 5% of the SD, so convergence is considered satisfied for all parameters.
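To show how a Bayesian beta regression with independent normal priors can be fitted in principle, the sketch below uses a plain random-walk Metropolis sampler on simulated data with one predictor. This swaps in a simpler sampler than the WinBUGS run described above and is not the authors' implementation. The prior variance, the normal prior on the log precision, the step size, and the simulated coefficients are all assumptions chosen for illustration.

```python
import math
import random

def inv_logit(eta):
    return 1.0 / (1.0 + math.exp(-eta))

def log_post(theta, x, y, prior_var=100.0):
    """Unnormalised log posterior for theta = (b0, b1, log phi)."""
    b0, b1, log_phi = theta
    phi = math.exp(log_phi)
    lp = -(b0 ** 2 + b1 ** 2 + log_phi ** 2) / (2.0 * prior_var)  # normal priors
    for xi, yi in zip(x, y):
        mu = inv_logit(b0 + b1 * xi)              # logit link for the mean
        a, b = mu * phi, (1.0 - mu) * phi
        lp += (math.lgamma(phi) - math.lgamma(a) - math.lgamma(b)
               + (a - 1.0) * math.log(yi) + (b - 1.0) * math.log(1.0 - yi))
    return lp

def metropolis(x, y, n_iter, burn_in, step, rng):
    """Random-walk Metropolis; returns the draws kept after burn-in."""
    theta = [0.0, 0.0, math.log(10.0)]
    current = log_post(theta, x, y)
    kept = []
    for it in range(n_iter):
        proposal = [t + rng.gauss(0.0, step) for t in theta]
        cand = log_post(proposal, x, y)
        if rng.random() < math.exp(min(0.0, cand - current)):  # accept or reject
            theta, current = proposal, cand
        if it >= burn_in:
            kept.append(list(theta))
    return kept

# Simulated data with known values b0 = 0.5, b1 = -1.0, phi = 20.
rng = random.Random(42)
x = [rng.uniform(-1.0, 1.0) for _ in range(150)]
y = []
for xi in x:
    mu = inv_logit(0.5 - 1.0 * xi)
    y.append(rng.betavariate(mu * 20.0, (1.0 - mu) * 20.0))

draws = metropolis(x, y, n_iter=6000, burn_in=2000, step=0.05, rng=rng)
post_mean_b1 = sum(d[1] for d in draws) / len(draws)
```

On the simulated data the posterior mean of the slope should land near the value used to generate it. With 20 predictors and 29 observations, as in the study, a tuned or component-wise sampler would likely be needed in practice.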

Conclusion
In this study, we used a Bayesian approach for the stochastic frontier model to determine health system performance. For the second step, we regressed efficiency scores on a set of 20 indicators capturing health system characteristics. Unlike other similar studies, this study used Bayesian beta regression, for the benefit of robustness, to overcome the problems of lack of degrees of freedom and model uncertainty in this regression. The Bayesian beta regression model was constructed to identify the factors related to the efficiency scores. Regulation of prices billed by providers ($x_6$), regulation of the workforce and equipment ($x_8$), patient choice among providers ($x_9$), price signals on users ($x_{11}$), degree of decentralization ($x_{15}$), and scope of basic coverage ($x_{19}$) were found to be related to health system efficiency in the Bayesian beta regression, whereas the other variables were not related to the efficiency scores. In particular, the least efficient countries, such as Turkey, Slovak Republic, Hungary, and Italy, should reconsider their health system policy and take measures to improve health system efficiency, paying attention to the six factors highlighted above.