An Interval Efficiency Measurement in DEA When Considering Undesirable Outputs

Data envelopment analysis (DEA) is a popular mathematical tool for analyzing the relative efficiency of homogeneous decision-making units (DMUs). However, existing DEA models cannot simultaneously handle imprecise data, negative data, and undesirable outputs. Thus, we introduce undesirable outputs into the modified slack-based measure (MSBM) model and propose an interval-modified slack-based measure (IMSBM) model, which extends the application of interval DEA (IDEA) to fields concerned with reducing undesirable outputs. The novelties of the model are that it considers undesirable outputs while dealing with imprecise and negative data, and that it is slack-based. Furthermore, the model with undesirable outputs is proven to be translation-invariant and unit-invariant. A numerical example illustrates how the lower and upper bounds of the efficiency score change after undesirable outputs are considered. The empirical results show that, without considering undesirable outputs, most of the lower bounds of the efficiency scores are overestimated when the DMUs are weakly efficient or inefficient. The upper bound also changes after considering undesirable outputs when the DMU is inefficient. Finally, an improved degree of preference approach is introduced to rank the DMUs.


Introduction
Data envelopment analysis (DEA) is a popular mathematical tool for analyzing the relative efficiency of homogeneous decision-making units (DMUs). With multiple inputs and outputs, DEA measures the relative efficiency of DMUs by using the ratio of the weighted sum of outputs to the weighted sum of inputs. An efficient DMU consumes less input to produce a specific amount of output or produces more output from an equal amount of input. However, the conventional DEA models of CCR [1] and BCC [2] rest on two a priori assumptions that limit their application: the input and output data should be, first, precise and, second, nonnegative.
Obtaining precise data in real-life situations is not always possible, so bounded (interval), ordinal, and ratio-bounded data are often used in applications [3,4]. This precise data assumption can, in some cases, limit the applications of conventional DEA models. Cooper et al. [5] first introduced imprecise (interval) DEA (IDEA) to cope with imprecise data, and many scholars have since contributed to the theoretical development of this method. Despotis and Smirlis [6] transformed a CCR model to handle interval data, which naturally yields lower and upper bounds of the efficiency scores. However, the transformation was only applied to variables. Entani et al. [7] formulated dual models of the IDEA with an interval efficiency obtained from both optimistic and pessimistic viewpoints. Based on this, Wang et al. [8] developed a pair of interval models to convert ordinal preference information and fuzzy data into interval data through scale transformation and an α-level set, respectively. Wang et al. [9] further introduced a virtual anti-ideal DMU into a bounded DEA model to unify the best and the worst relative efficiencies under optimistic and pessimistic situations. However, Azizi and Jahed [10] pointed out that this assumed virtual anti-ideal DMU makes no sense when the input is zero and proposed a pair of improved IDEA models that make it possible to conduct a DEA analysis using the concepts of the best and worst relative efficiencies. Toloo et al. [11] constructed a pair of IDEA models based on pessimistic and optimistic standpoints to identify the unique status of each imprecise dual-role factor. Amir et al. [12] addressed the managerial and technical issues in allocating weights and in handling imprecise data through a total cost of ownership (TCO)-based DEA approach. However, these models are not slack-based and can only deal with nonnegative data, so they can only measure radial efficiency with nonnegative data.
In addition to the assumption of precise data, conventional DEA models assume that all DMU inputs and outputs are nonnegative. However, this is not always possible in real-life problems when losses occur, such as with profit or noninterest income. Traditionally, negative data are eliminated or transformed into positive values through data transformation [13,14]. However, eliminating the negative data loses some DMU information, and the data transformation affects the solution of the objective function. Pastor [15] was the first to use the translation invariance property of DEA models when addressing negative data, which does not require the data to be eliminated or transformed. Halme et al. [16] introduced the property to radial models for dealing with interval data, including negative data. Hatamimarbini et al. [17] developed the interval semioriented radial measure (SORM) model to evaluate efficiency in the presence of interval data without sign restrictions. Cheng et al. [18] developed a variant of radial measure (VRM) to address variables that may be negative or nonnegative for different DMUs, but the efficiencies produced by the input-oriented VRM model may be negative [19], and those from the output-oriented VRM model can lie in the range [0.5, 1] [20]. To avoid such drawbacks, Tung [20] further defined two efficiency measures for input-oriented and output-oriented VRM models. Although the models mentioned above can deal with negative data, and some are translation-invariant and/or unit-invariant, they are still not slack-based models and ignore the inefficiency caused by nonradial slacks.
Thus, most developed models have addressed only imprecise data or only negative data rather than both simultaneously, and none are slack-based. Tone [21] proposed a slack-based measure, SBM (a), of efficiency that puts aside assumptions about proportionate changes in inputs and outputs and deals directly with the input excesses and the output shortfalls of DMUs. Lotfi et al. [22] integrated the SBM (a) model into IDEA to address interval data from the optimistic perspective and defined the upper and lower bounds of the SBM-efficiency scores to classify DMUs into three subsets. Azizi et al. [23] formulated SBM (a) models in IDEA from both optimistic and pessimistic perspectives to measure the overall performance of DMUs. These SBM (a)-based IDEA models measure the nonradial efficiency with interval data but do not consider negative data. Sharp et al. [24] introduced the idea of the range of possible improvement into the SBM (a) model and developed a modified slack-based measure (MSBM) model to evaluate DMUs with negative data. The MSBM considers input and output slacks and is translation-invariant. Tone et al. [25] proposed base point SBM (BP-SBM) models, which are consistent with ordinary SBM (a) models, to deal with negative data. Both the MSBM model and the BP-SBM models are slack-based and can handle negative data. However, they ignore imprecise data. Yang and Mo [26] considered these three characteristics simultaneously and extended the MSBM model to the interval MSBM (IMSBM) model, a slack-based model that evaluates the efficiency of particular DMUs with imprecise and negative data. However, the IMSBM model does not consider undesirable outputs. Tone [27] developed a new SBM (b) model from the SBM (a) model to measure efficiency in the presence of undesirable outputs. However, SBM (b) cannot yet deal with imprecise data.
This study develops the IMSBM model to address undesirable outputs, which extends the application of IDEA to fields concerned with reducing undesirable outputs, such as air pollutants, hazardous wastes, and nonperforming loans. Our new IMSBM model is based on SBM (b), unlike the current IDEA models, and thus it considers undesirable outputs and both radial and nonradial efficiencies from the perspective of slacks. We also confirm that the new model is unit-invariant and translation-invariant. In Table 1, we compare the new IMSBM model with the other DEA models mentioned above. The remainder of this paper is organized as follows. In Section 2, the IMSBM model with undesirable outputs is presented. Section 3 classifies DMUs into three subsets and introduces an improved degree of preference approach to rank the interval efficiencies. In Section 4, the IMSBM model with undesirable outputs is applied to evaluate the interval efficiency of Chinese city commercial banks. The final section presents our conclusions.

The MSBM and IMSBM Models with Undesirable Outputs
Färe et al. [28] pointed out that the assumption of constant returns to scale (CRS) implies that any DMU can be radially expanded or contracted to form other feasible DMUs, which is inconsistent with negative data. This is not the case under variable returns to scale (VRS), so the models below are formulated under VRS.

The MSBM Model with Undesirable Outputs

First, we extend the MSBM proposed by Sharp et al. [24] to deal with undesirable outputs. Consider a set of $n$ homogeneous units under analysis, each of which consumes varying amounts of $m$ different inputs to produce $s$ different outputs ($s = s_r + s_l$), where $s_r$ is the number of good outputs and $s_l$ is the number of bad (undesirable) outputs. Specifically, DMU$_j$ ($j = 1, 2, \ldots, n$) consumes $x_{ij}$ ($i = 1, 2, \ldots, m$) of each input to produce $y^g_{rj}$ ($r = 1, 2, \ldots, s_r$) of each good output and $y^b_{lj}$ ($l = 1, 2, \ldots, s_l$) of each bad output. The inputs, good outputs, and bad outputs can be represented by three matrices $X = (x_{ij}) \in R^{m \times n}$, $Y^g = (y^g_{rj}) \in R^{s_r \times n}$, and $Y^b = (y^b_{lj}) \in R^{s_l \times n}$, respectively. Then, the production possibility set ($P$) under the VRS assumption is defined as

$$P = \Big\{ (x, y^g, y^b) \;\Big|\; x \ge X\lambda,\; y^g \le Y^g\lambda,\; y^b \ge Y^b\lambda,\; \sum_{j=1}^{n} \lambda_j = 1,\; \lambda \ge 0 \Big\}, \tag{1}$$

where $\lambda = (\lambda_1, \lambda_2, \ldots, \lambda_n)^T$ is the intensity vector and $\sum_{j=1}^{n} \lambda_j = 1$ keeps $P$ under the VRS assumption. A DMU$_k$ $(x_k, y^g_k, y^b_k)$ is efficient in the presence of undesirable outputs if there is no vector $(x, y^g, y^b) \in P$ such that $x_k \ge x$, $y^g_k \le y^g$, and $y^b_k \ge y^b$ with at least one strict inequality [27]. When considering input and output slacks, i.e., input excesses ($s^-$), good output shortfalls ($s^{g+}$), and undesirable output excesses ($s^{b-}$), the production possibility set ($P'$) under the VRS assumption can be defined as

$$P' = \Big\{ (x_k, y^g_k, y^b_k) \;\Big|\; x_k = X\lambda + s^-,\; y^g_k = Y^g\lambda - s^{g+},\; y^b_k = Y^b\lambda + s^{b-},\; \sum_{j=1}^{n} \lambda_j = 1,\; \lambda \ge 0,\; s^- \ge 0,\; s^{g+} \ge 0,\; s^{b-} \ge 0 \Big\}. \tag{2}$$

We now introduce the ideal point into the MSBM model with undesirable outputs. For a given dataset, the ideal point is considered as $I = (\min_j x_{ij}\ (i = 1, 2, \ldots, m),\ \max_j y^g_{rj}\ (r = 1, 2, \ldots, s_r),\ \min_j y^b_{lj}\ (l = 1, 2, \ldots, s_l))$. Therefore, for DMU$_k$, the range of possible improvement is defined as

$$R^-_{ik} = x_{ik} - \min_j x_{ij}, \qquad R^{g+}_{rk} = \max_j y^g_{rj} - y^g_{rk}, \qquad R^{b-}_{lk} = y^b_{lk} - \min_j y^b_{lj}. \tag{3}$$

Obviously, $R^-_{ik}, R^{g+}_{rk}, R^{b-}_{lk} \ge 0$.
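As a small illustration of the range of possible improvement, the following sketch computes the distance of one DMU from the ideal point for each input, good output, and bad output. The data values and the function name `improvement_ranges` are hypothetical and are used only for illustration; the good output deliberately contains a negative value to show that the ranges remain nonnegative.

```python
def improvement_ranges(X, Yg, Yb, k):
    """Return (R_minus, R_gplus, R_bminus) for DMU k.

    X[i][j], Yg[r][j], Yb[l][j] hold input i, good output r, and bad
    output l of DMU j (rows are variables, columns are DMUs).
    """
    R_minus = [row[k] - min(row) for row in X]     # x_ik - min_j x_ij
    R_gplus = [max(row) - row[k] for row in Yg]    # max_j y^g_rj - y^g_rk
    R_bminus = [row[k] - min(row) for row in Yb]   # y^b_lk - min_j y^b_lj
    return R_minus, R_gplus, R_bminus


# Hypothetical data: 2 inputs, 1 good output, 1 bad output, 3 DMUs.
X = [[4.0, 6.0, 5.0],
     [10.0, 8.0, 12.0]]
Yg = [[-2.0, 3.0, 1.0]]   # negative values are allowed
Yb = [[0.5, 0.2, 0.9]]

ranges_dmu0 = improvement_ranges(X, Yg, Yb, k=0)
print(ranges_dmu0)
```

A range of zero (here the first input of the first DMU) means the DMU already sits at the ideal value for that variable.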
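Replacing the corresponding terms in the SBM (b) model with $R^-_{ik}$, $R^{g+}_{rk}$, and $R^{b-}_{lk}$, the MSBM model with undesirable outputs is thus

$$\min \rho_k = \frac{1 - \sum_{i=1}^{m} w_i s^-_{ik} / R^-_{ik}}{1 + \sum_{r=1}^{s_r} v_r s^{g+}_{rk} / R^{g+}_{rk} + \sum_{l=1}^{s_l} v_l s^{b-}_{lk} / R^{b-}_{lk}}, \tag{4}$$

subject to

$$x_{ik} = \sum_{j=1}^{n} \lambda_j x_{ij} + s^-_{ik}, \quad y^g_{rk} = \sum_{j=1}^{n} \lambda_j y^g_{rj} - s^{g+}_{rk}, \quad y^b_{lk} = \sum_{j=1}^{n} \lambda_j y^b_{lj} + s^{b-}_{lk}, \quad \sum_{j=1}^{n} \lambda_j = 1, \quad \lambda_j, s^-_{ik}, s^{g+}_{rk}, s^{b-}_{lk} \ge 0. \tag{5}$$

According to Tone [29] and Cooper et al. [30], in formula (4), the minimization of the numerator can be interpreted as the MSBM-input-efficiency, that is,

$$\rho^I_k = \min \Big( 1 - \sum_{i=1}^{m} w_i s^-_{ik} / R^-_{ik} \Big).$$

In addition, the reciprocal of the maximization of the denominator can be interpreted as the MSBM-output-efficiency, that is,

$$\rho^O_k = \frac{1}{\max \Big( 1 + \sum_{r=1}^{s_r} v_r s^{g+}_{rk} / R^{g+}_{rk} + \sum_{l=1}^{s_l} v_l s^{b-}_{lk} / R^{b-}_{lk} \Big)}.$$

Therefore, the MSBM nonoriented efficiency can be defined as $\min \rho_k$ through multiplying $\rho^I_k$ by $\rho^O_k$, and $\min \rho_k$ is subject to $P'$. In formulas (4) and (5), $s^-_{ik}$, $s^{g+}_{rk}$, and $s^{b-}_{lk}$ are the slacks in the $i$th input, $r$th good output, and $l$th bad output of DMU$_k$, respectively. The weights of each input $w_i$, good output $v_r$, and bad output $v_l$ are determined subjectively by decision-makers and are subject to $\sum_{i=1}^{m} w_i = 1$ and a corresponding normalization of the output weights $v_r$ and $v_l$.

For the IMSBM model with undesirable outputs, the inputs, good outputs, and bad outputs are assumed to be interval variables denoted as $x_{ij} \in [\underline{x}_{ij}, \overline{x}_{ij}]$, $y^g_{rj} \in [\underline{y}^g_{rj}, \overline{y}^g_{rj}]$, and $y^b_{lj} \in [\underline{y}^b_{lj}, \overline{y}^b_{lj}]$, where an underline denotes the lower bound and an overline the upper bound of the variable; for example, $\overline{y}^b_{lj}$ is the upper bound of $y^b_{lj}$. In this case, the ideal point in the IMSBM model with undesirable outputs is considered as $I = (\min_j \underline{x}_{ij}\ (i = 1, 2, \ldots, m),\ \max_j \overline{y}^g_{rj}\ (r = 1, 2, \ldots, s_r),\ \min_j \underline{y}^b_{lj}\ (l = 1, 2, \ldots, s_l))$. Consequently, for DMU$_k$, the range of possible improvement is defined relative to this ideal point, and the IMSBM model with undesirable outputs is given by models (7) and (8). If any of $R^-_{ik}$, $R^{g+}_{rk}$, or $R^{b-}_{lk}$ is zero, the corresponding term is assumed to be dropped from the numerator or denominator.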
The lower bound of the interval efficiency, $\underline{\rho}_k$, is obtained under the most unfavourable situation for DMU$_k$. Thus, DMU$_k$ consumes $\overline{x}_{ik}$ to produce $\underline{y}^g_{rk}$ and $\overline{y}^b_{lk}$, while DMU$_j$ consumes $\underline{x}_{ij}$ to produce $\overline{y}^g_{rj}$ and $\underline{y}^b_{lj}$ ($j \ne k$). Symmetrically, the upper bound of the efficiency, $\overline{\rho}_k$, is obtained under the most favourable situation for DMU$_k$. Thus, DMU$_k$ consumes $\underline{x}_{ik}$ to produce $\overline{y}^g_{rk}$ and $\underline{y}^b_{lk}$, while DMU$_j$ consumes $\overline{x}_{ij}$ to produce $\underline{y}^g_{rj}$ and $\overline{y}^b_{lj}$ ($j \ne k$). Therefore, models (7) and (8) interpret the IMSBM model with undesirable outputs as a whole, including the relative efficiencies under the most unfavourable and favourable situations. Subsequently, they can be divided into a pair of precise models: the lower efficiency model and the upper efficiency model. Models (9) and (10) give the lower efficiency under the most unfavourable situation for DMU$_k$; conversely, models (11) and (12) give the upper efficiency under the most favourable situation [26,31].

According to the Charnes and Cooper transformation [32] and referring to [33-35], the IMSBM model with undesirable outputs can be transformed into a linear programming form. We multiply both the numerator and the denominator of the objective function of (9) by a scalar variable $t$ ($t > 0$), which does not change $\rho_k$. By choosing $t$ so that the denominator equals 1, the denominator can be treated as a constraint, and the objective function minimises the corresponding numerator. The resulting formulation (14) of the lower bound of the IMSBM model with undesirable outputs is still a nonlinear programming problem because of its nonlinear terms, and some definitions are needed to transform it into a linear programming problem. With the substitutions in (15), problem (14) becomes the linear programme (16) and (17). Given the optimal solution of (16) and (17), the optimal solution of (9) and (10) can be recovered numerically according to (15). Symmetrically, models (11) and (12) can be transformed in the same way.
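The scaling step of the Charnes and Cooper transformation can be checked numerically. The sketch below assumes the fractional objective takes the form described in the text (one minus the weighted input slack ratios, divided by one plus the weighted good and bad output slack ratios, with zero-range terms dropped). The slack values, ranges, weights, and function names are hypothetical; the code only evaluates the objective at a given feasible point and does not solve the linear programme.

```python
def weighted_ratio_sum(slacks, ranges, weights):
    """Sum of weight * slack / range, skipping terms whose range is zero."""
    return sum(w * s / R for s, R, w in zip(slacks, ranges, weights) if R != 0)


def msbm_fraction(s_in, s_good, s_bad, R_in, R_good, R_bad, w, v_good, v_bad):
    """Return (numerator, denominator) of the fractional objective."""
    numerator = 1.0 - weighted_ratio_sum(s_in, R_in, w)
    denominator = (1.0
                   + weighted_ratio_sum(s_good, R_good, v_good)
                   + weighted_ratio_sum(s_bad, R_bad, v_bad))
    return numerator, denominator


def charnes_cooper(numerator, denominator):
    """Pick t so that t * denominator = 1; the objective becomes t * numerator."""
    t = 1.0 / denominator
    return t, t * numerator


# Hypothetical slacks, ranges, and weights for one DMU.
num, den = msbm_fraction(
    s_in=[0.0, 1.0], s_good=[2.0], s_bad=[0.15],
    R_in=[0.0, 2.0], R_good=[5.0], R_bad=[0.3],
    w=[0.5, 0.5], v_good=[0.5], v_bad=[0.5],
)
rho = num / den
t, linear_objective = charnes_cooper(num, den)
print(rho, t, linear_objective)
```

Because the ratio is unchanged when numerator and denominator are multiplied by the same positive $t$, the transformed objective equals the original fractional value at every feasible point.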

Properties of the IMSBM Model with Undesirable Outputs

The following properties are considered the bases of designing an efficiency measure [1].

Property 1 (translation-invariant).
This is critical, particularly when input-output data contain zero or negative values.

Property 2 (unit-invariant).
This is considered an important property in DEA; in general mathematical terms, it is referred to as being dimensionless.

Theorem 1. The IMSBM model with undesirable outputs is translation-invariant.
Proof. A measure is translation-invariant if and only if the model is equivalent before and after the translation [36].
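The intuition behind translation invariance can be illustrated numerically. Under VRS, the intensity weights sum to one, so adding a constant to every DMU's value of a variable shifts both sides of the corresponding constraint equally and leaves the slack unchanged; the range of possible improvement is a difference and is therefore unchanged as well. The data, the intensity vector, and the function names below are hypothetical.

```python
def input_ranges(X, k):
    """R^-_ik = x_ik - min_j x_ij for every input i."""
    return [row[k] - min(row) for row in X]


def input_slacks(X, lam, k):
    """s^-_ik = x_ik - sum_j lam_j * x_ij for a given intensity vector."""
    return [row[k] - sum(l * x for l, x in zip(lam, row)) for row in X]


# Hypothetical inputs (rows) of 3 DMUs (columns); one row has a negative value.
X = [[4.0, 6.0, 5.0],
     [-1.0, 2.0, 0.5]]
shift = [10.0, 3.0]                      # constant added to each input row
X_shifted = [[x + c for x in row] for row, c in zip(X, shift)]

lam = [0.5, 0.0, 0.5]                    # convex weights, sum to 1 (VRS)
k = 1

ranges_before, ranges_after = input_ranges(X, k), input_ranges(X_shifted, k)
slacks_before, slacks_after = input_slacks(X, lam, k), input_slacks(X_shifted, lam, k)
print(ranges_before, ranges_after, slacks_before, slacks_after)
```

If the weights did not sum to one, as under CRS, the shifted slack would differ from the original one, which is why the argument relies on the VRS constraint.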

Theorem 2. The IMSBM model with undesirable outputs is unit-invariant.
Proof. Consider the $g$th input, the $h$th good output, and the $q$th bad output in models (9) and (10). Rescale both bounds of the $g$th input by a scalar $\alpha > 0$, both bounds of the $h$th good output by a scalar $\beta > 0$, and both bounds of the $q$th bad output by a scalar $\gamma > 0$. The ideal point becomes $I_{\alpha,\beta,\gamma} = (\min_j x_{ij}\ (i = 1, 2, \ldots, m,\ i \ne g),\ \min_j (\alpha x_{gj}),\ \max_j y^g_{rj}\ (r = 1, 2, \ldots, s_r,\ r \ne h),\ \max_j (\beta y^g_{hj}),\ \min_j y^b_{lj}\ (l = 1, 2, \ldots, s_l,\ l \ne q),\ \min_j (\gamma y^b_{qj}))$. It can be proven that $R^-_{gk,\alpha} = \alpha R^-_{gk}$, $s^-_{gk,\alpha} = \alpha s^-_{gk}$, $R^{g+}_{hk,\beta} = \beta R^{g+}_{hk}$, $s^{g+}_{hk,\beta} = \beta s^{g+}_{hk}$, $R^{b-}_{qk,\gamma} = \gamma R^{b-}_{qk}$, and $s^{b-}_{qk,\gamma} = \gamma s^{b-}_{qk}$. Each ratio of slack to range is therefore unchanged, so the rescaling does not affect $\rho_k$. Thus, models (9) and (10) are unit-invariant.
Models (11) and (12) can be proven similarly. Therefore, the IMSBM model with undesirable outputs is unit-invariant. This completes the proof.
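The key step of the unit-invariance argument, that a slack and its range scale by the same factor, can be checked with a short numerical sketch. The input values, the intensity vector, and the scaling factor are hypothetical; the example rescales a single input, as if its unit were changed from thousands to units.

```python
def slack_to_range_ratio(values, lam, k):
    """Return s / R for one input, given intensity weights lam and DMU index k."""
    slack = values[k] - sum(l * x for l, x in zip(lam, values))   # s^-_gk
    rng = values[k] - min(values)                                 # R^-_gk
    return slack / rng


# Hypothetical input g across 3 DMUs, and a convex intensity vector.
x_g = [4.0, 6.0, 5.0]
lam = [0.5, 0.0, 0.5]
alpha = 1000.0                            # change of measurement unit
x_g_scaled = [alpha * x for x in x_g]

ratio_before = slack_to_range_ratio(x_g, lam, k=1)
ratio_after = slack_to_range_ratio(x_g_scaled, lam, k=1)
print(ratio_before, ratio_after)
```

Since the efficiency score depends on the data only through such ratios, the same check applies to the good and bad outputs with their own scaling factors.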

Classification and Ranking of the DMUs
The efficiency scores measured by the IMSBM model with undesirable outputs are calculated in an interval form, and thus a simple and practical approach is required to compare and rank the performance of the DMUs.
Haghighat and Khorram [37] noted that DMUs can be classified into three subsets according to the interval efficiency. The first is the strictly efficient subset, $E^{++} = \{\mathrm{DMU}_j \mid \underline{\rho}_j = 1,\ \overline{\rho}_j = 1,\ j = 1, 2, \ldots, n\}$. The second is the weakly efficient subset, $E^{+} = \{\mathrm{DMU}_j \mid \underline{\rho}_j < 1,\ \overline{\rho}_j = 1,\ j = 1, 2, \ldots, n\}$. The third is the inefficient subset, $E^{-} = \{\mathrm{DMU}_j \mid \underline{\rho}_j < 1,\ \overline{\rho}_j < 1,\ j = 1, 2, \ldots, n\}$. Ranking the DMUs within a subset is difficult when the subset contains more than one DMU. Wang et al. [38] proposed the degree of preference approach for ranking interval data. However, although this approach is suitable for pairwise comparison, it is less convenient in a complex system. Thus, we introduce an improved degree of preference approach to rank interval efficiency scores.

Suppose there are two interval efficiencies, denoted as $\rho_i = [\underline{\rho}_i, \overline{\rho}_i]$ and $\rho_j = [\underline{\rho}_j, \overline{\rho}_j]$. Then, the degree of preference of $\rho_i$ over $\rho_j$ ($\rho_i \succ \rho_j$) can be defined as $P_{ij} = P(\rho_i \succ \rho_j)$, which reflects the interrelationship between $\rho_i$ and $\rho_j$. Accordingly, the degree of preference of $\rho_j$ over $\rho_i$ ($\rho_j \succ \rho_i$) can be defined as $P_{ji} = P(\rho_j \succ \rho_i)$. Besides the above two options in (26) and (27) ($\rho_i \succ \rho_j$ and $\rho_i \prec \rho_j$), the interrelationship between $\rho_i$ and $\rho_j$ has a third option, that is, $\rho_i = \rho_j$ ($\underline{\rho}_i = \underline{\rho}_j$ and $\overline{\rho}_i = \overline{\rho}_j$). It is easy to verify that if $\rho_i = \rho_j$, then $P_{ij} = P_{ji} = 0$. According to (26) and (27), if $P_{ij} + P_{ji} = 1$ for all $i \ne j$, then the following $n \times n$ matrix that consists of $P_{ij}$ is an antisymmetric matrix:

$$P_{n \times n} = \begin{pmatrix} - & P_{12} & \cdots & P_{1n} \\ P_{21} & - & \cdots & P_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ P_{n1} & P_{n2} & \cdots & - \end{pmatrix}, \tag{28}$$

where $P_{ik} = P(\rho_i \succ \rho_k)$ and $P_{ik} = P(\rho_i = \rho_k) = 0$, $i, k = 1, 2, \ldots, n$, $i \ne k$.
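The three-way classification by interval efficiency described at the start of this section can be sketched as a small function. The numerical tolerance `tol` and the function name `classify` are assumptions added for illustration, since computed efficiency scores are rarely exactly equal to one in floating-point arithmetic.

```python
def classify(lower, upper, tol=1e-9):
    """Classify a DMU by its interval efficiency [lower, upper].

    Returns "E++" (strictly efficient), "E+" (weakly efficient),
    or "E-" (inefficient). tol is a hypothetical numerical tolerance.
    """
    if upper < 1.0 - tol:
        return "E-"      # upper bound below 1: inefficient
    if lower < 1.0 - tol:
        return "E+"      # upper bound 1, lower bound below 1: weakly efficient
    return "E++"         # both bounds equal to 1: strictly efficient


# Hypothetical interval efficiency scores.
examples = [(1.0, 1.0), (0.5, 1.0), (0.4, 0.8)]
labels = [classify(lo, hi) for lo, hi in examples]
print(labels)
```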
According to transitivity, it can be verified that the degree of preference approach possesses the following property, stated in terms of $r_i$ and $r_j$, the sums of the degrees of preference in the different rows of the matrix, that is,

$$r_i = \sum_{k=1,\, k \ne i}^{n} P_{ik}, \quad i = 1, 2, \ldots, n, \tag{29}$$

where $r_i$ and $r_j$ can be collected in the vector $R = (r_1, r_2, \ldots, r_n)^T$. We can verify from the property that if $\rho_i \succ \rho_j$, then $r_i > r_j$, and if $\rho_i = \rho_j$, then $r_i = r_j$. Therefore, the different interval efficiency scores in subsets $E^{+}$ and $E^{-}$ can be ranked through the vector $R = (r_1, r_2, \ldots, r_n)^T$ owing to the transitivity.
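The row-sum ranking can be sketched as follows. Because the exact form of the improved degree of preference is not reproduced here, the sketch assumes the commonly cited formula of Wang et al. for two intervals, $P(a \succ b) = [\max(0, \overline{a} - \underline{b}) - \max(0, \underline{a} - \overline{b})] / [(\overline{a} - \underline{a}) + (\overline{b} - \underline{b})]$, combined with the convention stated in the text that identical intervals receive $P_{ij} = P_{ji} = 0$. The handling of two distinct point values, the function names, and the sample scores are hypothetical.

```python
def preference(a, b):
    """Degree of preference of interval a over interval b, each (lower, upper)."""
    (a_lo, a_hi), (b_lo, b_hi) = a, b
    if (a_lo, a_hi) == (b_lo, b_hi):
        return 0.0                       # identical intervals: P_ij = P_ji = 0
    width = (a_hi - a_lo) + (b_hi - b_lo)
    if width == 0:
        return 1.0 if a_lo > b_lo else 0.0   # assumption: two distinct point values
    return (max(0.0, a_hi - b_lo) - max(0.0, a_lo - b_hi)) / width


def row_sums(intervals):
    """r_i = sum over k != i of P_ik (the diagonal is excluded)."""
    n = len(intervals)
    return [sum(preference(intervals[i], intervals[k]) for k in range(n) if k != i)
            for i in range(n)]


def rank(intervals):
    """Indices of the DMUs ordered from best to worst by r_i."""
    r = row_sums(intervals)
    return sorted(range(len(intervals)), key=lambda i: r[i], reverse=True)


# Hypothetical interval efficiencies: inefficient, strictly efficient, weakly efficient.
scores = [(0.4, 0.8), (1.0, 1.0), (0.5, 1.0)]
r = row_sums(scores)
order = rank(scores)
print(r, order)
```

Under this assumed formula, a strictly efficient DMU with interval [1, 1] is fully preferred to every DMU whose lower bound is below one, so its row sum equals the number of other DMUs.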

Application to Chinese City Commercial Banks
In this section, we implement the proposed IMSBM model with undesirable outputs to evaluate the efficiency scores and the classification of Chinese city commercial banks in 2017.
Our study represents the first attempt to measure the interval efficiency of these banks with both negative data and undesirable outputs. Based on the availability of data, we evaluate the interval efficiency scores of 99 city commercial banks, each of which is associated with two inputs (staff costs (COST) and total assets (ASST)) and three outputs (noninterest income (NINT), interest income (INTE), and nonperforming loans (NPL)). To simplify the problem, the inputs and outputs are weighted equally, as presented in the parentheses in Table 2. Due to space limitations, we only give the DMUs with negative data in Table 2. Four DMUs in the sample have negative outputs. DMU 87 (Cang Zhou bank) has negative noninterest income at both the lower and the upper bound. Three other banks (DMU 66 (Xia Men bank), DMU 77 (Ying Kou bank), and DMU 97 (Gui Zhou bank)) have negative noninterest income at the lower bounds. The remaining 95 banks have positive inputs and outputs at both bounds and are not included in Table 2.
The resulting interval efficiency scores and corresponding classification evaluated by the IMSBM model with undesirable outputs are shown in Table 3, and those for the model without undesirable outputs are given in the two adjacent columns for comparison. Table 3 shows that, for the IMSBM model with undesirable outputs, only DMU 18 (Liang Shan Zhou bank) is strictly efficient, 82 banks are weakly efficient, and the remaining 16 are inefficient.
A comparison of the efficiency scores evaluated by the IMSBM models with and without undesirable outputs shows that the strictly efficient DMUs in the two models are the same (i.e., DMU 18). However, it is important to note that the interval efficiency scores of the weakly efficient and the inefficient DMUs changed after considering the undesirable outputs. When the DMUs are weakly efficient, the lower bound of the efficiency score decreased after considering undesirable outputs, except for DMU 3, DMU 22, DMU 71, and DMU 85, while the upper bound of the efficiency score remained unchanged. When the DMUs are inefficient, the lower bound of the efficiency score also decreased after considering undesirable outputs, while the change in the upper bound of the efficiency score is more complicated. The upper bounds of the efficiency score of 8 banks increased, and for the other 13 banks, the opposite is observed. Therefore, without considering the undesirable outputs, the lower bound of the efficiency score is overestimated as a whole when the DMUs are weakly efficient or inefficient. In addition, the upper bound of the efficiency score changes after considering the undesirable outputs when the DMUs are inefficient.

In addition to the classification, the details of the performance ranking are required. According to (26) and (27), the interrelationship among DMUs can be established through the degree of preference $P_{ij}$, which constitutes a 99 × 99 matrix. Due to space limitations, the matrix is not shown in this paper. The sum value $r$ of the degree of preference can then be calculated according to (29), and all of the DMUs are ranked based on this value.
The $r$ value and the corresponding rank of each DMU are shown in Table 4, where $r^*_i$ and $R^*$ denote the sum values of the degree of preference and the rank of each DMU, respectively, under the IMSBM model with undesirable outputs, and the corresponding $r_i$ and $R$ under the model without undesirable outputs are given in the adjacent two columns.
As shown in Table 4, DMU 18 is strictly efficient under both models; therefore, the sum values $r^*_i$ and $r_i$ are both equal to 98, excluding the value on the leading diagonal. From $r^*_i$ and $r_i$, DMU 18 is found to be ranked in the top position under both models. We can then examine the ranks of the other 10 DMUs below DMU 18. This indicates that the IMSBM model with undesirable outputs changes the ranks of the weakly efficient and inefficient DMUs.

Conclusion and Discussion
This study develops the IMSBM model to address undesirable outputs, which extends the application of IDEA to fields concerned with reducing undesirable outputs, such as air pollutants, hazardous wastes, and nonperforming loans. Several models in the literature have been developed to handle problems of imprecise and/or negative data, but few models handle imprecise and negative data simultaneously. These models also ignore undesirable outputs. Thus, we propose the IMSBM model with undesirable outputs. The model is novel in that it considers undesirable outputs while dealing with imprecise and negative data, and it is slack-based, which ensures that efficiency is obtained when considering both radial and nonradial slacks.
This study establishes that the IMSBM model with undesirable outputs is translation-invariant and unit-invariant. The model is applied to evaluate the interval efficiency scores of Chinese city commercial banks, which are compared with those evaluated by the IMSBM model without considering undesirable outputs.
The empirical results show that the IMSBM model with undesirable outputs reduces the lower bounds of the efficiency scores of the weakly efficient and inefficient DMUs as a whole. Therefore, without considering undesirable outputs, most of the lower bounds of the efficiency scores are overestimated when the DMUs are weakly efficient or inefficient. In addition, the model leads to changes in the upper bounds of the efficiency scores of inefficient DMUs. Finally, the interval efficiency scores are ranked with an improved degree of preference approach. The proposed IMSBM model with undesirable outputs is formulated under VRS rather than CRS. Therefore, the interval efficiency scores evaluated by the model are pure technical efficiencies (PTE). In addition, the resulting interval efficiency scores are in the range [0, 1], and the upper bound of each cannot be greater than one, so the strictly efficient DMUs cannot be ranked. Thus, in future studies, we will focus on the IMSBM model with undesirable outputs under CRS to evaluate the technical efficiency (TE). In addition, we will develop a superefficiency model from our model to rank the strictly efficient DMUs.

Data Availability
All of the data used to support the application of the model were collected by the authors from the annual reports and audit reports of Chinese city commercial banks.