Performance Analysis in Production Systems with Uncertain Data: A Stochastic Data Envelopment Analysis Approach

Determining an optimal benchmark for inefficient decision-making units (DMUs) is an important issue in the field of performance analysis. Previous methods for determining the projection points of inefficient DMUs have focused on a single objective and ignored other features. This paper determines the best projection point for each DMU when the input and output data are stochastic, and presents an alternative definition of the best projection that considers three main aspects as far as possible: technical efficiency, minimal cost, and maximal revenue. Considering the important role of the electricity industry in the economic growth of each country, a practical example has been implemented on 16 regional electricity companies in Iran over 9 consecutive periods. The efficiency scores and projection points of the stochastic technical (BCC model of Banker et al. (1984)), cost, and revenue models are compared with the projection point obtained from the model presented in this article, which meets these three objectives simultaneously, showing the improvement of the companies' performance.


Introduction
Data envelopment analysis (DEA) was first introduced by Charnes et al. [1] (CCR model) to calculate the relative efficiency of a set of homogeneous decision-making units (DMUs) and was extended by Banker et al. [2] (BCC model). Traditional DEA models (the BCC and CCR models) rely on past information and are developed for the deterministic case, so the uncertainty and stochasticity of the data are ignored. In the real world, we face many decisions, most of which are associated with uncertainty. Stochastic data envelopment analysis models that address these shortcomings were proposed by Charnes and Cooper [3]. In the last two decades, performance analysis in an uncertain environment has been studied frequently. See, for instance, Hossain et al. [4], Mo et al. [5], Rong et al. [6], Amirteimoori et al. [7], and Ghasemi et al. [8].
Some of the differences between DEA and stochastic DEA (SDEA) models are presented in Table 1. The SDEA method is based on stochastic programming, which is a branch of mathematical programming. Stochastic programming models are divided into E-models and P-models according to the type of objective function. E-model-based stochastic programming models are used to obtain the expected value (mathematical expectation), so the objective function of such models does not include a random variable. P-model-based stochastic programming models are used to obtain the highest probability of an event occurring, so the objective function of such models may include a random variable. Charnes and Cooper [9] proposed a stochastic programming model with probabilistic (chance) constraints. Land et al. [10] then introduced an E-model-based stochastic programming model. The efficiency measure that requires the input prices of the DMUs in addition to the input and output data, and whose goal is cost minimization, is called cost efficiency; it was introduced by Farrell [11]. Färe et al. [12] developed cost efficiency further. By studying data envelopment analysis and stochastic cost functions, Cooper and Ton [13] identified some specific problems in stochastic frontier analysis (SFA), which is a parametric method. Shiraz et al. [14] formulated the stochastic cost model in a nonlinear form.
Färe et al. [12] have played a key role in the development of revenue efficiency. In addition to the input and output data, they also take the output prices of the decision-making unit into account, with the aim of maximizing revenue. Lin [15] proposed a method to set revenue targets on the efficient frontier. Shiraz et al. [14] also presented a stochastic revenue model that is nonlinear.
In data envelopment analysis, DMUs are divided into two categories: efficient and inefficient units. Inefficient units can become efficient by decreasing their input levels and increasing their output levels.
Achieving optimal scale size (OSS) has always been of interest to researchers. Cesaroni and Giovannola [16] introduced a new definition of optimal scale size based on minimizing the cost of each unit, so that average cost productivity combines scale productivity and allocation and generalizes economic scale size in productivity analysis. Subsequently, Haghighatpisheh et al. [17] proposed a new definition of optimal scale size that uses both the cost of inputs and the revenue of outputs. This measure of average cost-revenue efficiency (ACRE) shows the ratio of profitability efficiency to average productivity. The industrialization of developed countries shows that proper planning, optimal use of resources, and determining an appropriate pattern were the main goals in the development process of these countries. The electricity industry has an infrastructural and influential role in the economic growth of any country. In recent years, per capita electricity consumption in Iran has increased significantly in parallel with its production rate. Accordingly, efficiency and determining an appropriate model for companies have always been a concern of managers in the electricity industry. In this article, a real example based on Iran's regional electricity distribution companies is presented. It uses random data, which can provide a useful outlook for the electricity industry and help in solving future challenges.
Most studies that find the projection point of an inefficient unit have focused on only one goal. For example, the projection point obtained from the cost model is an efficient point on the efficient frontier found with the cost minimization approach. The projection point obtained from the revenue model moves towards the frontier with the revenue maximization approach. Bagheri et al. [18] presented a method in which the projection point of each unit meets three perspectives simultaneously as far as possible: cost minimization, revenue maximization, and the shortest distance to the efficient frontier. It was proved that the efficiency of the point obtained for an inefficient unit is higher than that of the unit under evaluation. The results of that method were based on past information, so we use stochastic programming instead. In this paper, a projection point is obtained for each inefficient unit that is important in two respects. First, the resulting projection for each unit meets the three technical, cost, and revenue concepts as far as possible. Secondly, random data and stochastic programming in linear form are used. Therefore, it can help managers to make better decisions in the future and improve the performance of units.

This paper is organized as follows: technical, cost, and revenue efficiency with deterministic and random data are examined in Sections 2 and 3, respectively. The linear forms of the stochastic models are also presented in Section 3. In Section 4, the method proposed by Bagheri et al. [18] to determine the ideal projection point is implemented with stochastic programming. In Section 5, we provide a real and practical example, and finally, in Section 6, the conclusion of the article is presented.

Deterministic DEA Models
In this section, we briefly review some basic DEA models in the deterministic environment. The technical efficiency measure was first introduced by Debreu [19] and Farrell [11]. The traditional DEA models, namely the CCR model of Charnes et al. [1] and its subsequent extension, the BCC model of Banker et al. [2], were developed on the classical efficiency analysis model of Farrell [11].

Table 1: Differences between DEA and SDEA models.
DEA | SDEA
Relies on past information. | Allows performance in the future to be predicted.
The efficiency frontier is very sensitive to small changes in inputs and outputs. | The efficiency frontier is less sensitive to changes in inputs and outputs.
The error level in such models is considered zero. | Efficiency is defined according to the level of error in the model.
The correlation between inputs and outputs is not considered. | The correlation between inputs and outputs is considered in such models.

The BCC model for evaluating a specific DMU $p$ is formulated in the input-oriented deterministic environment as follows: In the above model, $\theta$ is the input abatement factor. DMU $p$ is said to be technically efficient if and only if $\theta = 1$ and all slack variables are equal to zero. Corresponding to each inefficient point, a frontier point is determined by reducing inputs. Removing the third (convexity) constraint $\sum_{j=1}^{n} \lambda_j = 1$ leads to the CCR model.
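For reference, the input-oriented BCC model discussed above is commonly written in the following envelopment form. This is a sketch of the standard textbook formulation, not a reproduction of the original display: the slack symbols $s_i^-$ and $s_r^+$ are assumed names, and $x_{ij}$ and $y_{rj}$ are taken to denote the $i$-th input and $r$-th output of DMU $j$.

```latex
\begin{aligned}
\min\ & \theta \\
\text{s.t.}\ & \sum_{j=1}^{n} \lambda_j x_{ij} + s_i^- = \theta x_{ip}, && i = 1,\ldots,m, \\
& \sum_{j=1}^{n} \lambda_j y_{rj} - s_r^+ = y_{rp}, && r = 1,\ldots,s, \\
& \sum_{j=1}^{n} \lambda_j = 1, \\
& \lambda_j \ge 0,\quad s_i^- \ge 0,\quad s_r^+ \ge 0 .
\end{aligned}
```

In this form the third constraint is the convexity constraint, so dropping it gives the constant-returns-to-scale (CCR) counterpart.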
For each inefficient unit, there are one or more efficient units on the efficient frontier, which together are called its reference set. The reference set of a specific inefficient DMU (denoted by $E_o$) consists of the units that receive a positive weight at optimality, i.e., $E_o = \{ j : \lambda_j^* > 0 \}$.

Technical efficiency models use only limited information on the DMUs: we know the quantities of the inputs and outputs, but there is no information on their prices. More information is available if we also have the costs of the inputs and the prices of the outputs. In this case, we are motivated to calculate the allocative efficiency of the DMUs. We first introduce the cost model. Suppose that $C = (c_1, c_2, \ldots, c_m)^t$ is the vector of input prices and $X = (x_1, x_2, \ldots, x_m)^t$ is the cost-minimizing vector of input quantities. Farrell (1957) proposed a measure of cost efficiency. The cost efficiency model in the VRS environment is formulated as follows: The cost efficiency of DMU $p$ is defined as the ratio of the optimal cost to the actual cost, i.e., $CE_p = \sum_{i=1}^{m} c_i x_i / \sum_{i=1}^{m} c_i x_{ip}$. Clearly, $CE_p \le 1$, and when this score is equal to one, DMU $p$ is called cost efficient; otherwise, we say DMU $p$ is cost inefficient.
Now suppose that we are given information on the prices of the outputs. In this case, the revenue efficiency model in the VRS environment is formulated as follows: In the above equation, $P = (p_1, p_2, \ldots, p_s)^t$ is the vector of output prices and $Y = (y_1, y_2, \ldots, y_s)^t$ is the revenue-maximizing vector of output quantities. The revenue efficiency of DMU $p$ is defined as the ratio of the optimal revenue to the actual revenue, i.e., $RE_p = \sum_{r=1}^{s} p_r y_r / \sum_{r=1}^{s} p_r y_{rp}$. It is easy to see that $RE_p \ge 1$. If this score is equal to one, DMU $p$ is called revenue efficient; otherwise, we say DMU $p$ is revenue inefficient.
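The two ratios above are simple to evaluate once the optimal input and output vectors are known. The following minimal sketch assumes those optimal vectors have already been obtained from the cost and revenue models; the function names and all numbers are hypothetical and serve only to illustrate the arithmetic.

```python
def cost_efficiency(input_prices, optimal_inputs, observed_inputs):
    """CE_p: optimal (minimum) cost divided by observed cost."""
    optimal_cost = sum(c * x for c, x in zip(input_prices, optimal_inputs))
    observed_cost = sum(c * x for c, x in zip(input_prices, observed_inputs))
    return optimal_cost / observed_cost


def revenue_efficiency(output_prices, optimal_outputs, observed_outputs):
    """RE_p: optimal (maximum) revenue divided by observed revenue."""
    optimal_revenue = sum(p * y for p, y in zip(output_prices, optimal_outputs))
    observed_revenue = sum(p * y for p, y in zip(output_prices, observed_outputs))
    return optimal_revenue / observed_revenue


# Hypothetical data for one DMU with two inputs and two outputs.
input_prices = [2.0, 5.0]
observed_inputs = [10.0, 4.0]   # observed cost = 2*10 + 5*4 = 40
optimal_inputs = [8.0, 3.0]     # optimal cost  = 2*8  + 5*3 = 31

output_prices = [3.0, 1.0]
observed_outputs = [6.0, 2.0]   # observed revenue = 3*6 + 1*2 = 20
optimal_outputs = [7.0, 4.0]    # optimal revenue  = 3*7 + 1*4 = 25

ce = cost_efficiency(input_prices, optimal_inputs, observed_inputs)
re_score = revenue_efficiency(output_prices, optimal_outputs, observed_outputs)
print(f"CE = {ce:.3f}, RE = {re_score:.3f}")
```

With these numbers the cost ratio is below one and the revenue ratio is above one, which matches the bounds $CE_p \le 1$ and $RE_p \ge 1$ stated in the text.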

Basic DEA Models in Stochastic Environment
As stated before, many real applications involve uncertainty, so it is necessary to generalize the models to the uncertain case, in particular the stochastic case. SDEA models were first proposed by Charnes and Cooper (1959). Land et al. (1993) introduced the stochastic CCR model, which uses the inputs and outputs together with an estimate of their probability distribution to estimate future efficiency; adding the convexity condition yields the input-oriented stochastic BCC model. In this model, the random input and output values are denoted by $\tilde{x}_{ij}$ and $\tilde{y}_{rj}$, respectively, and $x_{ij}$ and $y_{rj}$ denote their mathematical expectations. Also, $a_{ij}$ and $b_{rj}$ are the standard deviations of the random inputs and outputs, respectively. Using a random variable error structure, Cooper et al. [20] proposed the following input-oriented DEA model: The first two constraints in model (4) are related to the input variables and the next two constraints are related to the output variables. The goal is to minimize the level of inputs through the abatement factor $\theta$. $\sigma$ has a standard normal distribution and $\Phi^{-1}$ denotes the inverse of the standard normal distribution function. The positive deviation variables $p_i^+$, $p_i^-$, $q_r^+$, $q_r^-$ are used to transform the model into a linear form.

Corresponding to each inefficient DMU, the projection point is obtained from the above stochastic model through the formula $\left( \sum_{j=1}^{n} \lambda_j x_{ij}, \sum_{j=1}^{n} \lambda_j y_{rj} \right)$. Producing a certain level of outputs at the least cost of inputs is called cost efficiency, which takes information about the prices of the inputs into account. In the following, we present the cost efficiency model in the stochastic environment. Following the deterministic form of the model, we assume that $c_i$ is the price of the $i$-th input, whose value is the same for all units. Also, $x_i$ is the ideal level of the $i$-th input, whose value is unknown.
Suppose that $x_{ij}$ and $y_{rj}$ are the mathematical expectations of the random inputs and outputs $\tilde{x}_{ij}$ and $\tilde{y}_{rj}$, respectively. Moreover, suppose $a_{ij}$ and $b_{rj}$ are their corresponding standard deviations. The cost efficiency model is formulated as follows: In this model, the objective is to minimize the total cost of the inputs. This model is formulated under variable returns to scale, and the cost efficiency of DMU $p$ is defined as follows: Corresponding to each inefficient point, the efficient projection point is obtained from the above stochastic model through the formula $\left( \sum_{j=1}^{n} \lambda_j x_{ij}, \sum_{j=1}^{n} \lambda_j y_{rj} \right)$. It can be easily shown that the obtained projection point is efficient at the desired confidence level. $\sigma$ has a standard normal distribution and $\Phi^{-1}$ denotes the inverse of the standard normal distribution function in the above linear model.

Now, we propose the revenue efficiency model. As is known in the DEA literature, producing the highest level of outputs from a certain level of inputs is called revenue efficiency. Assume that $w_r$ is the price of the $r$-th output, whose value is the same for all units. Also, $y_r$ is the optimal level of the $r$-th output, whose value is unknown. As before, $x_{ij}$ and $y_{rj}$ are the mathematical expectations of the random inputs and outputs $\tilde{x}_{ij}$ and $\tilde{y}_{rj}$, and $a_{ij}$ and $b_{rj}$ are their standard deviations, respectively. The revenue efficiency model is formulated in a linear form in the stochastic environment as follows: The first two constraints in the above model are related to the input variables and the next two constraints are related to the output variables. In this model, the goal is to maximize the revenue of the outputs. This model is formulated under variable returns to scale and its efficiency score is obtained from the following formula. The projection point is obtained from the above linearized stochastic revenue model through the formula $\left( \sum_{j=1}^{n} \lambda_j x_{ij}, \sum_{j=1}^{n} \lambda_j y_{rj} \right)$. It can be easily proved that the obtained projection point is efficient at the desired confidence level. $\sigma$ has a standard normal distribution and $\Phi^{-1}$ denotes the inverse of the standard normal distribution function in the above linear model.
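The role of $\Phi^{-1}$ in the models above can be made concrete by evaluating it at the three error levels used later in the example ($\alpha = 0.05$, 0.3, and 0.5). The sketch below uses only Python's standard library; the helper name `phi_inverse` is an assumption made for illustration. In chance-constrained formulations of this kind, $\Phi^{-1}(\alpha)$ typically multiplies a standard deviation term, so its sign and size determine how far the stochastic constraints move away from their deterministic counterparts.

```python
from statistics import NormalDist

# Standard normal distribution (mean 0, standard deviation 1).
standard_normal = NormalDist(mu=0.0, sigma=1.0)


def phi_inverse(alpha):
    """Inverse of the standard normal distribution function at level alpha."""
    return standard_normal.inv_cdf(alpha)


error_levels = [0.05, 0.3, 0.5]
quantiles = {alpha: phi_inverse(alpha) for alpha in error_levels}

for alpha, z in quantiles.items():
    print(f"alpha = {alpha}: Phi^-1(alpha) = {z:.4f}")
```

The quantile is negative for $\alpha < 0.5$ and equals zero at $\alpha = 0.5$, so at the error level 0.5 any term multiplied by $\Phi^{-1}(\alpha)$ vanishes.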

The Best Projection Point
In this section, the best projection point in a stochastic environment is determined. For any inefficient unit there is an ideal pattern, namely the projection point obtained for that decision-making unit. Bagheri et al. [18] presented a method in which an ideal pattern is obtained for each unit. First, this pattern dominates the unit under evaluation and is therefore more efficient. Secondly, it lies in the convex combination of the projection points obtained from the cost, revenue, and technical models, so it is examined simultaneously from the technical, minimum cost, and maximum revenue perspectives. One of the defects of such models is that they rely on past information, so they may not provide the desired result.
One way to solve this problem is to use stochastic data envelopment analysis. In such models, a random error is added to the model as a random component, the efficiency is defined according to the error level ($\alpha$), and the correlation between the input and output variables is taken into account. Toward this end, we proceed as follows:
Step 1: First, we obtain the projection points of the stochastic technical, cost, and revenue models in linear form under variable returns to scale (models (4)-(6)), which were described in the previous section. The stochastic cost and revenue models in linear form are an innovation of this article.
Step 2: Then, using the proposed model, we consider the distances of the point $(x_i, y_r)$ from the projection points of the linearized stochastic BCC, cost, and revenue models (models (4)-(6)), which are denoted by $d_1$, $d_2$, $d_3$, respectively, and are minimized under the 2-norm in the objective function of model (8), so that the new projection point lies in the convex hull of these points and at the same time dominates the unit under evaluation. Therefore, the pattern obtained for each inefficient unit is more efficient than the unit under evaluation.
Table 2: Input and output variables.

Since the expressions under the radicals in the first three constraints are greater than or equal to zero, the model can be rewritten as follows:
\[ \lambda_1 x^{T}_{ip} + \lambda_2 x^{C}_{ip} + \lambda_3 x^{R}_{ip} = x_i, \quad i = 1, \ldots, m, \]
\[ \lambda_1 y^{T}_{rp} + \lambda_2 y^{C}_{rp} + \lambda_3 y^{R}_{rp} = y_r, \quad r = 1, \ldots, s. \]
In the above model, we minimize the sum of the distances between the ideal point and each of the projection points obtained from the stochastic BCC, cost, and revenue models under the 2-norm (L2) in the objective function, so that the ideal point lies in the convex combination of these points and at the same time dominates the unit under evaluation. The projection point obtained for each unit simultaneously meets the three technical, cost, and revenue objectives as far as possible. Moreover, random data and the stochastic technical, cost, and revenue models are used in a linear form, which has not been addressed so far, so obtaining such a pattern with these scores is very important.
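The idea behind the proposed model can be illustrated numerically. The sketch below is not the nonlinear program of the paper: as a swapped-in technique, it uses a coarse grid search over the weights $(\lambda_1, \lambda_2, \lambda_3)$ of the convex combination, keeps only candidates that dominate the unit under evaluation, and picks the one with the smallest sum of Euclidean distances to the three projection points. All function names and data are hypothetical.

```python
import math
from itertools import product


def convex_combination(weights, points):
    """Weighted sum of equally long vectors; the weights are assumed to sum to one."""
    size = len(points[0])
    return [sum(w * p[k] for w, p in zip(weights, points)) for k in range(size)]


def dominates(candidate, unit, n_inputs, tol=1e-9):
    """True if the candidate uses no more of any input and produces no less of any output."""
    inputs_ok = all(c <= u + tol for c, u in zip(candidate[:n_inputs], unit[:n_inputs]))
    outputs_ok = all(c >= u - tol for c, u in zip(candidate[n_inputs:], unit[n_inputs:]))
    return inputs_ok and outputs_ok


def best_projection(points, unit, n_inputs, steps=20):
    """Grid search over the weight simplex for the dominating point closest to all three points."""
    best = None
    for a, b in product(range(steps + 1), repeat=2):
        if a + b > steps:
            continue
        weights = (a / steps, b / steps, (steps - a - b) / steps)
        candidate = convex_combination(weights, points)
        if not dominates(candidate, unit, n_inputs):
            continue
        total = sum(math.dist(candidate, p) for p in points)  # d1 + d2 + d3
        if best is None or total < best[0]:
            best = (total, weights, candidate)
    return best


# Hypothetical unit with two inputs and one output, stored as [x1, x2, y1].
unit = [10.0, 8.0, 5.0]
technical = [8.0, 7.0, 5.0]   # projection from the technical (BCC) model
cost = [7.0, 8.0, 6.0]        # projection from the cost model
revenue = [9.0, 6.0, 7.0]     # projection from the revenue model
points = [technical, cost, revenue]

total_distance, weights, ideal_point = best_projection(points, unit, n_inputs=2)
print("weights:", weights)
print("ideal point:", ideal_point)
print("sum of distances:", round(total_distance, 4))
```

A grid search is used here only because it needs no solver; for real data, a nonlinear or second-order cone solver would be the natural choice.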

Practical Example
The electricity industry has an infrastructural and influential role in the economic growth of any country. Iran Regional Electricity Company is one of the most important companies in the electricity industry, whose main task is to produce, distribute, and transmit electricity. In this practical example, fuel consumption and nominal power are used in the power generation process. The nominal power of a device is the power stated by the manufacturer on its specification plate for certain conditions, in horsepower or megawatts; for small machines, the nominal power is specified in kilowatts. Power transmission is carried out by the substations and lines of the transmission network. A substation is a part of the network that is concentrated in a given location and is used to selectively connect and disconnect electrical circuits within the network. Power distribution is used to send and sell energy to companies. Iranian electricity distribution companies are responsible for managing and coordinating subsidiary units and for the production, transmission, and sale of electricity, and they are subsidiaries of the Tavanir specialized parent company.
Considering the importance of this industry in the economic growth and development of the country, the model is evaluated based on data related to 16 regional electricity companies in Iran, in a 9-year period (from 2005 to 2014), derived from the statistical yearbook of the Ministry. Therefore, for each company, six inputs and two outputs have been considered, which are defined as follows.
The selection of these indicators is based on a study of research conducted in the electricity industry.
Input and output variables have been organized in Table 2.
The input and output indices have a normal distribution; their mean values and standard deviations are shown in Tables 3 and 4, respectively. Table 5 shows the efficiency scores obtained from the stochastic technical, cost, and revenue models at three error levels ($\alpha = 0.05$, 0.3, and 0.5). According to the results obtained from this table, units 1, 3, 4, 6, 7, 8, 9, 10, 13, and 16 [...]. The projections obtained from the stochastic technical, cost, and revenue models at the three error levels can be seen in Tables 6-8. For each company, the projection points of the technical, cost, and revenue models are shown in three rows labeled TE, CE, and RE, respectively. The results are given in separate tables for the three error levels. These tables are presented in the appendix.
Also, the projection point of the proposed model for each of the electricity distribution companies at the three error levels is shown in Tables 9-11, which are attached to the article. According to the results, for most units the projection point obtained from the proposed model dominates the unit under evaluation. Dominance means having fewer inputs and more outputs than the unit under evaluation. For example, consider the electricity distribution company of Khorasan province (unit 5) at the error level of 0.05. The projection point of the model presented in this paper shows that all inputs have decreased and all outputs have increased compared to the unit under evaluation. These results also hold for units 10, 12, 14, and 15. For units 1, 3, 4, 7, 8, 9, 11, and 13, the projection point obtained from the model presented in this paper is equal to the unit under evaluation at the error level of 0.05. The results for the other error levels can be analyzed in the same way.
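The three cases described above (the projection dominates the unit, equals it, or does neither) can be checked mechanically. The following is a minimal sketch with a hypothetical helper and made-up numbers; it assumes each point is stored as a list with the inputs first and the outputs last, and it treats dominance in the weak sense (no input larger, no output smaller) with a small numerical tolerance.

```python
def compare_with_unit(projection, unit, n_inputs, tol=1e-9):
    """Classify a projection point as 'equal', 'dominates', or 'neither' relative to the unit."""
    # Positive gain means an input was reduced or an output was increased.
    input_gains = [u - p for p, u in zip(projection[:n_inputs], unit[:n_inputs])]
    output_gains = [p - u for p, u in zip(projection[n_inputs:], unit[n_inputs:])]
    gains = input_gains + output_gains
    if all(abs(g) <= tol for g in gains):
        return "equal"
    if all(g >= -tol for g in gains):
        return "dominates"
    return "neither"


# Hypothetical unit with two inputs and one output, stored as [x1, x2, y1].
observed = [10.0, 4.0, 6.0]
print(compare_with_unit([8.0, 3.0, 7.0], observed, n_inputs=2))   # all inputs down, output up
print(compare_with_unit([10.0, 4.0, 6.0], observed, n_inputs=2))  # identical to the unit
print(compare_with_unit([11.0, 3.0, 7.0], observed, n_inputs=2))  # first input increased
```

The "neither" case corresponds to a projection in which some inputs increase or some outputs decrease; such a point may still have a lower total cost or a higher total revenue than the observed unit.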
In Table 12, the ideal cost and revenue are compared with the observed cost and revenue at three levels of error for all companies.
The results in this table show that, for all companies at all three error levels, the ideal cost is not more than the observed cost and the ideal revenue is not less than the observed revenue of each unit, which indicates an improvement in the performance of all companies. For example, consider the Electricity Distribution Company of Khuzestan Province (DMU 6) at the error level of 0.05. Although the second and fifth inputs have increased in the proposed model, the ideal cost of this unit is less than its observed cost. Also, despite the decrease in the first output in the proposed model for this province, its ideal revenue is greater than its observed revenue. This analysis can be generalized to the other units. The input price vector and the output price vector are given fixed values for all units as follows: For example, the ideal cost and the observed cost, along with the ideal revenue and the observed revenue, for DMU 11 at the error level of 0.03 are obtained from the following formula, which can be generalized to all units.

Conclusions
Decision-making units are divided into two categories: efficient and inefficient units. For each inefficient DMU, there is a projection point on the efficient frontier, so that the inefficient unit can move towards the efficient frontier with different approaches and become efficient. In this paper, in the first step, we identify the projection points of each DMU using the stochastic technical, cost, and revenue efficiency models in linear form under variable returns to scale.
Then, through the presented model, a projection point is obtained whose distance from each of the projection points obtained in the first step is the least under the 2-norm. The projection point obtained for each inefficient DMU is important in two ways. First, the projection obtained for each DMU simultaneously meets the three main aspects of technical efficiency, minimal cost, and maximal revenue as far as possible. Secondly, random data and stochastic programming in linear form are used. Therefore, it can help managers to make better decisions in the future and to benefit from stochastic planning. Considering the importance of the electricity industry in the economic growth and development of each country, a practical example has been implemented on 16 regional electricity companies in Iran over 9 periods. Comparing the ideal cost and ideal revenue with the observed cost and revenue showed that the ideal cost of each unit is less than or equal to its observed cost and the ideal revenue of each unit is greater than or equal to its observed revenue at the three error levels. According to the obtained results, the managers of Iranian electricity distribution companies can plan by identifying inefficient companies,