Data Envelopment Analysis (DEA) is a nonparametric technique for estimating the current level of efficiency of a set of entities. DEA also provides information on how to remove inefficiency through the determination of benchmarking information. This paper is devoted to studying DEA models based on closest efficient targets, which are related to the shortest projection to the production frontier and allow inefficient firms to find the easiest way to improve their performance. To date, these models have been solved by unsatisfactory methods, since all of them are related in some sense to a combinatorial NP-hard problem. In this paper, the problem is approached by genetic algorithms and parallel programming. In addition, to produce reasonable solutions, a particular metaheuristic is proposed and checked through some numerical instances.
1. Introduction
In the literature, technologies have been estimated using many different approaches over the past 50 years [1]. The two principal methods are stochastic frontiers, which resort to econometric techniques, and Data Envelopment Analysis (DEA), which involves mathematical programming models. In particular, DEA is a nonparametric technique based on mathematical programming for the evaluation of the technical efficiency of a set of decision making units (DMUs) that consume inputs to produce outputs [2]. In contrast to other efficiency methodologies, for example, stochastic frontiers, DEA simultaneously provides both an efficiency score and benchmarking information through efficient targets. These two pieces of information are usually inseparable in DEA. Indeed, the efficiency score is obtained from the distance between the assessed DMU and a point on the frontier of the technology, which serves as efficient target for the assessed DMU. Information on targets can play a relevant role since it indicates how inefficient units can improve their performance.
Traditional DEA measures of technical efficiency yield targets that are associated with the furthest efficient projection of the evaluated DMU [3]. However, several authors (see [3, 4], to name but a few) argue that the distance to the efficient projection point should be minimized, instead of maximized, so that the resulting targets are as similar as possible to the observed inputs and outputs of the evaluated DMU. The cornerstone of this approach is that closer targets suggest directions of improvement for inputs and outputs that lead to efficiency with less effort than other alternatives.
In general, closest targets are easily attainable and provide the most relevant solution to remove inefficiency from the production process. The determination of closest targets has attracted increasing interest from researchers in the recent DEA literature [4–7]. In particular, reviewing the literature, we find that Frei and Harker [8] determined closest targets by minimizing the Euclidean distance to the efficient frontier. Gonzalez and Alvarez [9] proposed to minimize the sum of input contractions required to reach the boundary of the technology. Portela et al. [4] used the software QHull, which allows determining all facets of a polyhedron, with the aim of getting closest targets. More recently, Baek and Lee [10], Pastor and Aparicio [7], Amirteimoori and Kordrostami [5], and Aparicio and Pastor [6] focused their corresponding analyses on a weighted version of the Euclidean distance, while Jahanshahloo et al. [11] provided methods for both obtaining the minimum distance from DMUs to the frontier and evaluating group performance of DMUs. On the other hand, Briec [12], Briec and Lesourd [13], and Briec and Lemaire [14] obtained the minimum distance to the frontier using Hölder norms, introducing in this way the Hölder distance functions in the literature related to efficiency measurement. Coelli [15] proposed an alternative to the second stage of the process for solving radial models based on closest targets. Cherchye and Van Puyenbroeck [16] maximized the cosine of the angle between the input vectors of the evaluated DMU and its corresponding projection on the boundary. Finally, other related papers are those by Lozano and Villa [17], who determined a sequence of targets to be achieved in successive leaps which finally converge to the efficient frontier, and by Charnes et al. [18] and Takeda and Nishino [19], who use techniques for assessing sensitivity of efficiency classification in DEA based on minimizing distances.
Regarding papers that have studied the computational aspects of DEA models associated with the determination of closest targets, we may cite several references: Aparicio et al. [3] and Jahanshahloo et al. [11, 20–22]. Overall, some of these approaches are based on Mixed Integer Linear Programming or Bilevel Linear Programming while others are derived from algorithms that allow the determination of all the facets of a polyhedron. As we will argue in detail in Section 2, all these approaches present some strong points and weaknesses and, consequently, currently there is no approach accepted as the best solution to the problem.
In view of this discussion, determining closest targets has clearly been one of the important issues in the DEA literature (see Cook and Seiford [23] for a recent survey on DEA). Nevertheless, from a computational point of view, the determination of closest targets remains a difficult problem, and this fact justifies the effort to apply new methods to overcome it.
In this paper, we use several genetic algorithms with the aim of determining closest targets in DEA. In particular, we resort to the approach by Aparicio et al. [3], which is based on Mixed Integer Linear Programming, and try to find feasible solutions of the model that these authors introduced by means of genetic algorithms. The difficulty of the problem requires us to study the genetic algorithm in stages, incorporating constraints progressively as the results are analyzed.
The remainder of the paper is organized as follows. In Section 2, a brief introduction to the main notions associated with Data Envelopment Analysis is presented; additionally, the existing approaches for determining closest targets are outlined. In Section 3, the genetic algorithm is developed. After that, a random search is used to improve this algorithm, obtaining a hybrid metaheuristic: the idea is to use a random search method in some parts of the genetic algorithm to better explore the space of feasible solutions. To reduce the execution time, a parallel shared-memory algorithm is developed and studied in Section 4. In Section 5, the results of some experiments are summarized. Finally, Section 6 concludes the paper.
2. Data Envelopment Analysis and Closest Targets
Recent years have seen a great variety of applications of DEA to evaluating the efficiency of many different types of entities [2]. DEA models are based on mathematical programming. While other methods, such as stochastic frontiers, require the specification of a functional form for the production function (such as the Cobb-Douglas form), DEA is a nonparametric technique that estimates a piecewise-linear convex technology without this requirement, constructed such that no observation of the sample of data lies above it (refer to Figure 1). In Figure 1, we represent a sample of several firms (DMUs) that use one input (x) to produce one output (y). Each firm is represented by a point in the figure. We also show both the Cobb-Douglas function estimated from the data, if the analyst assumes such a functional form, and the piecewise-linear convex estimation of the frontier of the technology obtained by DEA techniques.
Figure 1: Stochastic frontiers versus DEA.
DEA involves the use of Linear Programming to construct the nonparametric piecewise surface over the data. Technical efficiency measures associated with the performance of each DMU are then calculated relative to this surface, as a distance to it. Although Farrell [24] was the first to introduce these ideas in the literature, the method did not receive wide attention until the paper by Charnes et al. [25], in which the term Data Envelopment Analysis was coined. Since then there have been a large number of papers which have extended, adapted, and applied this methodology in the fields of economics, operations research, and management science.
Before continuing, we need to define some notation. Assume that there are data on m inputs and s outputs for each of n DMUs. For the jth DMU these are represented by xij ≥ 0, i = 1,…,m, and yrj ≥ 0, r = 1,…,s, respectively.
Many DEA technical efficiency measures exist in the literature. The basic DEA models are the CCR [25] and the BCC [26] models, both based on radial projections to the production frontier. However, many other approaches give freedom to the projection, so that the final efficient targets do not preserve the mix of inputs and outputs. In particular, the Enhanced Russell Graph measure [27], or slacks-based measure [28], may be calculated for DMU k, k = 1,…,n, as follows:
$$
\begin{aligned}
\min\quad & \frac{1-\dfrac{1}{m}\sum_{i=1}^{m} s_{ik}^{-}/x_{ik}}{1+\dfrac{1}{s}\sum_{r=1}^{s} s_{rk}^{+}/y_{rk}} \\
\text{s.t.}\quad & \sum_{j=1}^{n}\lambda_{jk}x_{ij}=x_{ik}-s_{ik}^{-},\quad \forall i,\\
& \sum_{j=1}^{n}\lambda_{jk}y_{rj}=y_{rk}+s_{rk}^{+},\quad \forall r,\\
& \lambda_{jk},\,s_{ik}^{-},\,s_{rk}^{+}\ge 0,\quad \forall j,i,r.
\end{aligned}
\tag{1}
$$
Equation (1) is a linear fractional program that can be easily transformed into a linear program by a change of variables, as described in Pastor et al. [27]. Specifically, (1) is equivalent to the following linear program:
$$
\begin{aligned}
\min\quad & \beta_{k}-\frac{1}{m}\sum_{i=1}^{m}\frac{t_{ik}^{-}}{x_{ik}} \\
\text{s.t.}\quad & \beta_{k}+\frac{1}{s}\sum_{r=1}^{s}\frac{t_{rk}^{+}}{y_{rk}}=1,\\
& -\beta_{k}x_{ik}+\sum_{j=1}^{n}\alpha_{jk}x_{ij}+t_{ik}^{-}=0,\quad \forall i,\\
& -\beta_{k}y_{rk}+\sum_{j=1}^{n}\alpha_{jk}y_{rj}-t_{rk}^{+}=0,\quad \forall r,\\
& \beta_{k},\,\alpha_{jk},\,t_{ik}^{-},\,t_{rk}^{+}\ge 0,\quad \forall j,i,r.
\end{aligned}
\tag{2}
$$
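For completeness, the Charnes-Cooper change of variables behind this equivalence can be sketched as follows (a standard derivation, stated here in our own notation):

```latex
% Charnes-Cooper change of variables turning (1) into (2):
\beta_k = \Bigl(1 + \frac{1}{s}\sum_{r=1}^{s} \frac{s_{rk}^{+}}{y_{rk}}\Bigr)^{-1}, \qquad
\alpha_{jk} = \beta_k \lambda_{jk}, \qquad
t_{ik}^{-} = \beta_k s_{ik}^{-}, \qquad
t_{rk}^{+} = \beta_k s_{rk}^{+}.
% Multiplying the numerator and denominator of (1) by \beta_k yields the linear
% objective \beta_k - \frac{1}{m}\sum_{i=1}^{m} t_{ik}^{-}/x_{ik} together with
% the normalization constraint \beta_k + \frac{1}{s}\sum_{r=1}^{s} t_{rk}^{+}/y_{rk} = 1,
% while the envelopment constraints of (1) scale into those of (2).
```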
The Enhanced Russell Graph measure, defined as the optimal value of the above model, satisfies several interesting properties from a mathematical and economic point of view. However, it presents a weakness: (1), or equivalently (2), yields efficient targets (points on the piecewise-linear frontier) which are far away from the inputs and outputs of the evaluated DMU k. To illustrate this idea, note that the targets for this model are defined as ∑j=1n λjk xij = xik − sik− for inputs and ∑j=1n λjk yrj = yrk + srk+ for outputs, and that the slacks sik− and srk+ appear in the objective function. Consequently, since we are minimizing, the model seeks slacks as large as possible and, therefore, (1) yields the furthest targets for DMU k with original inputs and outputs (x1k,…,xmk,y1k,…,ysk). In order to determine closest targets instead of furthest targets, it might seem enough to change "minimizing" to "maximizing." However, this is not the case: it can be shown that the targets generated by such a model would not be technically efficient but inefficient [3] and, therefore, could not serve as benchmarks for the assessed DMU.
This problem was the main reason behind the introduction of different approaches in the literature to determine closest targets. On the one hand, a group of researchers [20, 21] focus their work on finding all the faces of the polyhedron that defines a technology estimated by DEA. For example, in Figure 1, we show five of these faces. The computing time of these algorithms increases significantly as the problem size (n+m+s) grows, since this issue is closely related to a combinatorial NP-hard problem. On the other hand, a second group proposes to determine closest targets through mathematical programming [3, 22]. In particular, Aparicio et al. [3] introduced the following Mixed Integer Linear Program to overcome the problem for DMU k:
$$
\begin{aligned}
\max\quad & \beta_{k}-\frac{1}{m}\sum_{i=1}^{m}\frac{t_{ik}^{-}}{x_{ik}} \\
\text{s.t.}\quad & \beta_{k}+\frac{1}{s}\sum_{r=1}^{s}\frac{t_{rk}^{+}}{y_{rk}}=1, &&\text{(c.1)}\\
& -\beta_{k}x_{ik}+\sum_{j=1}^{n}\alpha_{jk}x_{ij}+t_{ik}^{-}=0,\ \forall i, &&\text{(c.2)}\\
& -\beta_{k}y_{rk}+\sum_{j=1}^{n}\alpha_{jk}y_{rj}-t_{rk}^{+}=0,\ \forall r, &&\text{(c.3)}\\
& -\sum_{i=1}^{m}\nu_{ik}x_{ij}+\sum_{r=1}^{s}\mu_{rk}y_{rj}+d_{jk}=0,\ \forall j, &&\text{(c.4)}\\
& \nu_{ik}\ge 1,\ \forall i, &&\text{(c.5)}\\
& \mu_{rk}\ge 1,\ \forall r, &&\text{(c.6)}\\
& d_{jk}\le M b_{jk},\ \forall j, &&\text{(c.7)}\\
& \alpha_{jk}\le M(1-b_{jk}),\ \forall j, &&\text{(c.8)}\\
& b_{jk}\in\{0,1\},\ \forall j, &&\text{(c.9)}\\
& \beta_{k}\ge 0, &&\text{(c.10)}\\
& t_{ik}^{-}\ge 0,\ \forall i, &&\text{(c.11)}\\
& t_{rk}^{+}\ge 0,\ \forall r, &&\text{(c.12)}\\
& d_{jk}\ge 0,\ \forall j, &&\text{(c.13)}\\
& \alpha_{jk}\ge 0,\ \forall j. &&\text{(c.14)}
\end{aligned}
\tag{3}
$$
Regarding (3), some comments are in order. First, note that (c.1)–(c.3) are the same constraints as those used in (2), which implies that we are considering the same set of feasible points; note also, however, that the objective function is maximized in this case instead of minimized. Second, with (c.4)–(c.6) we are considering supporting hyperplanes such that all the points of the estimated technology (a polyhedron) lie on or below these hyperplanes. Third, (c.7) and (c.8) are the key conditions that connect the two previous sets of constraints: specifically, they prevent the targets from corresponding to interior points of the estimated technology. Finally, (c.9) defines bjk as a binary variable and (c.10)–(c.14) are the usual nonnegativity constraints.
A weakness of the approach in Aparicio et al. [3] is that it uses a "big M" to model the key constraints (c.7) and (c.8), which link djk to αjk by means of the binary variable bjk. However, the value of M can be calculated if and only if we previously determine all the faces that define the technology. Accordingly, this alternative is again associated with a combinatorial NP-hard problem. In the same manner, Jahanshahloo et al. [22] resorted to Linear Bilevel Programming and a big M to determine closest targets and, therefore, their approach presents a similar drawback.
In view of the preceding discussion, from a computational point of view, the determination of closest targets in DEA has not yet been satisfactorily solved, which justifies the effort to apply new methods to overcome the problem. In this paper, we apply genetic algorithms to solve (3). In particular, given the difficulty of the problem, we will focus on finding valid solutions of (3) in the following sections.
3. Genetic Algorithms for Determining Closest Targets
Genetic algorithms [29] are used here to obtain a satisfactory solution of (3). A population of chromosomes representing particular valid solutions of (2) is explored. Each chromosome represents a candidate solution and is composed of βk, αjk, tik−, trk+ ∈ ℝ+ and bjk ∈ {0,1}, with i = 1,…,m, j = 1,…,n, r = 1,…,s. Each chromosome is evaluated using the objective function of (3). The chromosome layout is

$$
\bigl(\beta_{k}\ \big|\ b_{1k}\cdots b_{nk}\ \big|\ t_{1k}^{-}\cdots t_{mk}^{-}\ \big|\ t_{1k}^{+}\cdots t_{sk}^{+}\ \big|\ \alpha_{1k}\cdots \alpha_{nk}\bigr).
\tag{4}
$$
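As an illustration, in C (the language of the experiments reported later) this layout and its evaluation by the objective function of (3) could be sketched as follows; the struct and function names are ours, not taken from the authors' implementation:

```c
/* Hypothetical C layout of one chromosome for the evaluated DMU k
   (names are ours; array lengths follow the layout in (4)). */
typedef struct {
    double beta;    /* beta_k */
    int    *b;      /* b_jk in {0,1}, length n */
    double *t_in;   /* t_ik^-, length m */
    double *t_out;  /* t_rk^+, length s */
    double *alpha;  /* alpha_jk, length n */
} Chromosome;

/* Objective of model (3): beta_k - (1/m) * sum_i t_ik^- / x_ik,
   where x[i] holds the inputs of the evaluated DMU k. */
double evaluate_chromosome(const Chromosome *c, const double *x, int m)
{
    double sum = 0.0;
    for (int i = 0; i < m; i++)
        sum += c->t_in[i] / x[i];
    return c->beta - sum / (double)m;
}
```

The binary part b and the α, t+ components do not enter the objective directly; they matter for checking the constraints of (3).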
Algorithm 1 shows the general scheme of a genetic algorithm, together with the functions used in each step.
<bold>Algorithm 1: </bold>Scheme of a genetic algorithm.
(1) Initialize(S)
(2) while NotEndConditions(S) do
(3)   Evaluate(S)
(4)   SS1 = Select the best ranking of S
(5)   SS2 = Crossover and Mutation(SS1)
(6)   S = IncludeSolutions(SS2)
(7) end while
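As an illustration only, the scheme of Algorithm 1 might be organized in C along these lines; all names are placeholders mirroring the scheme, not the authors' code:

```c
#include <stdbool.h>

/* Placeholder population type mirroring the scheme of Algorithm 1. */
typedef struct {
    int generation;
    int max_generations;
} Population;

static void initialize(Population *S)         { S->generation = 0; }
static bool not_end_conditions(Population *S) { return S->generation < S->max_generations; }
static void evaluate_step(Population *S)      { (void)S; /* score chromosomes with the objective of (3) */ }
static void select_crossover_mutate(Population *S)
{
    /* Steps (4)-(6): best-ranking selection, crossover, mutation,
       and inclusion of the new solutions in S. */
    S->generation++;
}

/* Skeleton of the genetic algorithm: steps (1)-(7) of Algorithm 1. */
void genetic_algorithm(Population *S)
{
    initialize(S);
    while (not_end_conditions(S)) {
        evaluate_step(S);
        select_crossover_mutate(S);
    }
}
```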
3.1. Defining a Valid Chromosome
A valid chromosome has to satisfy at least the following constraints of (3): (c.2), (c.3), (c.8), (c.9), (c.10), (c.11), (c.12), and (c.14). Four methods to generate the initial population of chromosomes have been tested. Methods 1, 2, and 4 are independent, while Method 3 is an extension of Method 2.
Method 1.
The parameters of the chromosomes are generated randomly.
Method 2.
The process starts by obtaining a random βk and a set of αjk values using Algorithm 2. Next, the bjk values are generated from the αjk in order to satisfy (c.8). The values trk+ and tik− are deduced from the input and output matrices and the previously generated βk and αjk, using (c.2) and (c.3). If a valid solution is obtained the algorithm ends; otherwise, Algorithm 3 is used.
Algorithm 3 decreases and increases βk in order to obtain the minimal βk that satisfies (c.11). Two parameters are used: a factor q and the maximal number of iterations of the process, which also determines the amount by which βk is increased or decreased. To increase βk, a similar algorithm is used, working in the same way as for decreasing βk but performing an addition in the first inner loop and using constraint (c.12) instead of (c.11) as the condition for both inner loops.
<bold>Algorithm 2: </bold>Generate alpha.
Require: X ∈ ℝ+^(m,n), Y ∈ ℝ+^(s,n), DMU k.
Ensure: α1k, …, αnk ∈ ℝ+, with αjk ≥ 0 for all j.
Compute Vx, Vy ∈ ℝ+^n such that Vjx = (1/m) ∑i=1..m Xi,j and Vjy = (1/s) ∑r=1..s Yr,j
Sort Vx and Vy in decreasing order.
for j := 1, …, n do
  if Vjx < ⌊n/2⌋ and Vjy ≥ ⌊n/2⌋ then
    αjk ← 0
  else if Vjx ≥ ⌊n/2⌋ and Vjy < ⌊n/2⌋ then
    αjk ← generate 0.5 ≤ αjk ≤ 1 randomly
  else if (Vjx ≥ BoundPoint and Vjy ≥ BoundPoint) or (Vjx < BoundPoint and Vjy < BoundPoint) then
    αjk ← generate 0 ≤ αjk ≤ 0.25 randomly
  end if
end for
Find the minimum αjk and modify its value in order to satisfy (3)
<bold>Algorithm 3: </bold>Obtain a minimal βk.
Ensure: A minimal βk for the chromosome that satisfies (c.11): tik− ≥ 0 for all i.
d ← 0
while d ≤ MaxIter do
  while tik− ≥ 0 for all i do
    {Decrease βk while (c.11) still holds}
    βk ← βk − qd
    Generate tik− using (c.2).
  end while
  {At this point (c.11) is not satisfied because of the decrease of βk. Increase βk until it is satisfied again}
  repeat
    βk ← βk + qd
    Generate tik− using (c.2).
  until tik− ≥ 0 for all i
  d ← d + 1
end while
Method 3.
This method is an extension of Method 2, obtained by adding a third process that tweaks the αjk values (Algorithm 4).
<bold>Algorithm 4: </bold>Generate a chromosome (Method 3).
Require: X ∈ ℝ+^(m,n), Y ∈ ℝ+^(s,n), q ∈ ℝ, DMU k, MaxIter ∈ ℤ.
Ensure: Chromosome c.
repeat
  Generate 0 ≤ βk ≤ 1 randomly.
  Generate α1k, …, αnk using Algorithm 2 and t1k−, …, tmk− and t1k+, …, tsk+ using expression (2).
  while ∃ tik− ≤ 0 and ∃ trk+ ≤ 0 and (number of nonzero αjk) > 2 do
    if ∃ tik− ≤ 0 and ∀ trk+ ≥ 0 then
      Increase βk
    end if
    if ∀ tik− ≥ 0 and ∃ trk+ ≤ 0 then
      Decrease βk
    end if
    if ∃ tik− ≤ 0 or ∃ trk+ ≤ 0 then
      Choose a nonzero αjk randomly and set it to zero, decreasing the remaining αjk by p, and find the minimum αjk and modify its value in order to satisfy (3)
    end if
  end while
until a valid chromosome is obtained or Iterations ≥ MaxIter or the number of nonzero αjk is lower than n − 2
Method 4.
Method 4 consists of a hyperheuristic (a metaheuristic that operates over another metaheuristic) [30]. A genetic algorithm has been used to produce sets of αjk and βk for the initial population which satisfy (c.2), (c.3), (c.11), (c.12), and (c.14). The evaluation function in the hyperheuristic is the sum of the negative values of trk+ and tik−, the chromosome with the higher score being the better candidate. Moreover, βk and αjk are taken into account, penalizing values close to 1.
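Our reading of that evaluation function can be sketched in C as follows; the penalty weight w and the function name are assumptions of ours, not taken from the paper:

```c
/* Hypothetical fitness for the hyperheuristic of Method 4 (our sketch).
   It sums the negative slack values t_ik^- and t_rk^+ (so values closer to 0
   score higher) and penalizes beta_k and alpha_jk values close to 1. */
double hyper_fitness(double beta, const double *alpha, int n,
                     const double *t_in, int m, const double *t_out, int s)
{
    const double w = 0.1;  /* penalty weight: an assumption, not from the paper */
    double f = 0.0;
    for (int i = 0; i < m; i++)
        if (t_in[i] < 0.0) f += t_in[i];   /* negative t_ik^- lowers fitness */
    for (int r = 0; r < s; r++)
        if (t_out[r] < 0.0) f += t_out[r]; /* negative t_rk^+ lowers fitness */
    f -= w * beta;                          /* beta_k near 1 is penalized */
    for (int j = 0; j < n; j++)
        f -= w * alpha[j];                  /* alpha_jk near 1 is penalized */
    return f;
}
```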
3.2. Select the Best Ranking, Crossover, and Mutation
In each generation, a proportion of the existing population is selected to breed a new generation. The evaluations of all the chromosomes in the population are compared in each generation, and only part of them (those in the best ranking) survive. The number of chromosomes which survive in each population (called SurvSize) is preset. For each two new solutions to be produced ("son" and "daughter"), a pair of "parent" chromosomes ("father" and "mother") is selected from the set of chromosomes.
Due to the number of constraints to be satisfied, a traditional crossover method would produce an offspring with a high rate of invalid chromosomes. To avoid this, a crossover with different "levels" is introduced in Algorithm 5. The defined levels are 1 for βk, 2 for αjk, 3 for tik−, and 4 for trk+. A level is chosen randomly; the values in that level are crossed, and those in lower levels are deduced so as to satisfy (3).
<bold>Algorithm 5: </bold>Crossover.
Require: Chromosome c1, Chromosome c2.
Ensure: Chromosome c3, Chromosome c4.
Generate 0 ≤ γ ≤ 1 randomly.
Generate 0 ≤ ϕ ≤ 4 randomly.
if ϕ = 0 then
  {Crossing βk}
  βk3 ← βk1·γ + (1−γ)·βk2
  βk4 ← βk2·γ + (1−γ)·βk1
else if ϕ ≤ 1 then
  {Crossing αjk}
  for j := 1, …, n do
    αjk3 ← αjk1·γ + (1−γ)·αjk2
    αjk4 ← αjk2·γ + (1−γ)·αjk1
  end for
else if ϕ ≤ 2 then
  {Crossing tik−}
  for i := 1, …, m do
    tik−,3 ← tik−,1·γ + (1−γ)·tik−,2
    tik−,4 ← tik−,2·γ + (1−γ)·tik−,1
  end for
else if ϕ ≤ 3 then
  {Crossing trk+}
  for r := 1, …, s do
    trk+,3 ← trk+,1·γ + (1−γ)·trk+,2
    trk+,4 ← trk+,2·γ + (1−γ)·trk+,1
  end for
end if
Deduce αjk, bjk, tik− and trk+ for c3 and c4 in order to satisfy (3)
In each iteration, a chromosome is randomly chosen to be mutated. The procedure is similar to crossover, considering the dependencies between variables and defining different levels of mutation.
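For instance, the arithmetic blend applied at a single level in Algorithm 5 could be written in C as follows (the function name is ours):

```c
/* Arithmetic ("blend") crossover at one level of Algorithm 5: each child
   component is a convex combination of the parents', weighted by gamma. */
void cross_level(const double *p1, const double *p2,
                 double *c1, double *c2, int len, double gamma)
{
    for (int i = 0; i < len; i++) {
        c1[i] = p1[i] * gamma + (1.0 - gamma) * p2[i];
        c2[i] = p2[i] * gamma + (1.0 - gamma) * p1[i];
    }
}
```

Because each child lies between its parents componentwise, nonnegativity of the crossed level is preserved automatically; only the deduced lower levels need repair.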
4. Parallel Algorithms
The algorithms proposed for the GA are very time-consuming when the problem size and PopSize increase. Thus, we propose parallel algorithms with the aim of reducing the execution time. The parallelization is carried out on a shared-memory model, but a design for distributed memory could be generated similarly. Thus, we assume a shared-memory computer with p processors. The parallel programming environment used on this computer is OpenMP [31]. There are multithreaded implementations of the basic linear algebra library BLAS [32], which can be used by higher-level libraries (such as LAPACK [33]) to take advantage of this architecture through multithreading. For example, the Intel MKL library and the ATLAS [34] package have efficient multithreaded versions of BLAS.
The costliest parts of the genetic algorithm are Evaluate, Crossover, and Mutation; they have been parallelized simply by assigning a subset of the chromosomes to each processor. The population is initialized only once, but this initialization also has a high cost and is parallelized in the same way.
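A minimal sketch of this chromosome-per-thread assignment with OpenMP (names and the toy evaluation function are ours; compile with -fopenmp to enable the pragma, which is otherwise ignored and the loop runs serially):

```c
#define POP_SIZE 100

/* Toy stand-in for evaluating one chromosome with the objective of (3). */
static double toy_eval(int chrom_idx) { return 2.0 * chrom_idx; }

/* Evaluate the whole population, assigning a share of the chromosomes to
   each thread; iterations are independent, so no synchronization is needed. */
void evaluate_population(double fitness[POP_SIZE], double (*eval)(int))
{
    #pragma omp parallel for schedule(static)
    for (int c = 0; c < POP_SIZE; c++)
        fitness[c] = eval(c);
}
```

Crossover and mutation parallelize the same way, since each pair of parents produces its children independently.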
5. Experimental Results
Experiments have been carried out on a computer with dual-core Itanium-2 processors. The code has been written in C, using version 11.1 of the Intel compiler and the Intel MKL library (version 10.2.2.025), which contains a multithreaded implementation of LAPACK.
Some experiments have been carried out to tune certain parameters of the algorithm. In all of them, PopSize = 100, SurvSize = PopSize/2, and MaxIter = 1000. Moreover, for all the experiments, the data have been simulated.
The experiments have three objectives: first, to compare the four methods of creating valid chromosomes; second, to study the genetic algorithm; and third, to study the execution cost when the parallel algorithm is used, all for varying problem sizes.
Table 1 shows both the averaged execution time (in seconds) and the averaged percentage of valid chromosomes for the four aforementioned methods, together with the corresponding standard deviations. Method 1 works very poorly for all six instances. Method 2 performs better than Method 1; however, its averaged percentage of valid chromosomes decreases as the problem size increases, reaching zero for the biggest numerical example. The performance of Method 4 is also worse than that of Method 2, even with respect to the execution cost. On the other hand, Method 3 is clearly the best approach for determining valid chromosomes, although the size of the problem adversely affects its performance. Figure 2 illustrates the superiority of the third method compared to the other three.
Table 1: Execution cost and % of valid chromosomes when the four methods of initialization are used, for varying problem sizes. Standard deviations are given in parentheses.

| m | n | s | Method 1 Time | Method 1 % val. | Method 2 Time | Method 2 % val. | Method 3 Time | Method 3 % val. | Method 4 Time | Method 4 % val. |
|---|---|---|---|---|---|---|---|---|---|---|
| 2 | 15 | 1 | 0.003 (0.004) | 0.75 (1.71) | 0.008 (0.005) | 50.83 (41.92) | 26.423 (51.440) | 82.08 (38.58) | 17.244 (3.566) | 10.58 (3.12) |
| 3 | 25 | 2 | 0.004 (0.004) | 0.00 (0.00) | 0.010 (0.006) | 33.55 (38.24) | 6.722 (16.025) | 90.05 (30.46) | 21.283 (0.801) | 0.80 (0.83) |
| 4 | 30 | 2 | 0.004 (0.005) | 0.00 (0.00) | 0.022 (0.008) | 26.87 (29.09) | 0.223 (0.584) | 100.00 (0.00) | 29.521 (10.540) | 1.57 (1.20) |
| 5 | 40 | 3 | 0.004 (0.003) | 0.00 (0.00) | 0.019 (0.003) | 13.90 (23.90) | 13.125 (20.640) | 73.90 (43.40) | 18.187 (1.209) | 0.00 (0.00) |
| 6 | 60 | 4 | 0.006 (0.000) | 0.00 (0.00) | 0.032 (0.001) | 0.03 (0.16) | 2.066 (1.132) | 34.74 (44.07) | 103.698 (0.185) | 0.18 (0.46) |
| 10 | 100 | 10 | 0.011 (0.000) | 0.00 (0.00) | 0.092 (0.002) | 0.00 (0.00) | 8.426 (3.235) | 32.65 (41.44) | 302.316 (0.713) | 0.00 (0.00) |
Figure 2: A comparison of the percentage of valid chromosomes obtained with each generation method.
As for the genetic algorithm, Table 2 shows the averaged execution cost, the averaged percentage of valid chromosomes, and the averaged best solution (the objective function value in (3)) when the genetic algorithm is used together with Method 3. The use of Method 3 is justified by its superiority over Methods 1, 2, and 4. Regarding the percentage of valid chromosomes, the algorithm generates about 84% in the best case (m=4, n=30, and s=2) and about 30% in the worst case (the two most complex instances). Although there is no clear trend with respect to the size of the analyzed problems, the results show that the averaged percentage of valid chromosomes is smaller for m=6, n=60, s=4 and m=10, n=100, s=10. Moreover, although the last instance (m=10, n=100, and s=10) is considerably more complex than the remaining simulations, the genetic algorithm performs no worse than in the preceding numerical example.
Table 2: Execution cost, percentage of valid chromosomes, and solution value of the genetic algorithm (using Method 3), for varying problem sizes. Standard deviations are given in parentheses.

| m | n | s | Time | % val. | Sol. |
|---|---|---|---|---|---|
| 2 | 15 | 1 | 0.207 (0.302) | 70.00 (41.68) | 0.840 (0.100) |
| 3 | 25 | 2 | 0.297 (56.900) | 56.90 (40.99) | 0.806 (0.101) |
| 4 | 30 | 2 | 0.173 (0.174) | 84.22 (24.79) | 0.731 (0.100) |
| 5 | 40 | 3 | 0.773 (0.536) | 46.52 (41.26) | 0.708 (0.120) |
| 6 | 60 | 4 | 2.109 (1.138) | 30.84 (41.47) | 0.721 (0.062) |
| 10 | 100 | 10 | 8.516 (3.325) | 30.51 (38.67) | 0.660 (0.077) |
Table 3 shows the execution time and speedup for the parallel version of the genetic algorithm. In general, the speedup values are satisfactory and improve as the problem size increases. For small problems, the speedup when using 16 processors is lower than that obtained with 8 or fewer; therefore, the selection of the number of processors is essential for resource optimization.
Table 3: Execution time and speedup of the parallel version of the genetic algorithm, for varying problem sizes. Averages and standard deviations (in parentheses) over the DMUs are shown. In each case, p is the number of processors used, Time is the execution time in seconds, and sp is the speedup.

| m | n | s | p | Time | sp |
|---|---|---|---|---|---|
| 2 | 15 | 1 | 1 | 0.21 (0.30) | — |
| | | | 2 | 0.14 (0.19) | 1.24 (1.24) |
| | | | 4 | 0.09 (0.12) | 1.85 (0.59) |
| | | | 8 | 0.08 (0.10) | 2.03 (1.23) |
| | | | 16 | 0.12 (0.17) | 1.53 (0.50) |
| 3 | 25 | 2 | 1 | 0.30 (0.27) | — |
| | | | 2 | 0.18 (0.15) | 1.55 (0.22) |
| | | | 4 | 0.11 (0.09) | 2.46 (0.67) |
| | | | 8 | 0.08 (0.07) | 3.22 (1.37) |
| | | | 16 | 0.10 (0.09) | 2.59 (0.86) |
| 4 | 30 | 2 | 1 | 0.17 (0.17) | — |
| | | | 2 | 0.10 (0.10) | 1.65 (0.19) |
| | | | 4 | 0.06 (0.05) | 2.59 (0.46) |
| | | | 8 | 0.05 (0.05) | 3.17 (0.85) |
| | | | 16 | 0.06 (0.05) | 2.62 (0.87) |
| 5 | 40 | 4 | 1 | 0.81 (0.55) | — |
| | | | 2 | 0.46 (0.30) | 1.72 (0.21) |
| | | | 4 | 0.25 (0.17) | 3.04 (0.44) |
| | | | 8 | 0.19 (0.12) | 4.19 (0.93) |
| | | | 16 | 0.17 (0.12) | 4.71 (1.44) |
| 6 | 60 | 4 | 1 | 2.11 (1.14) | — |
| | | | 2 | 1.13 (0.60) | 1.85 (0.10) |
| | | | 4 | 0.61 (0.32) | 3.36 (0.33) |
| | | | 8 | 0.38 (0.20) | 5.38 (0.96) |
| | | | 16 | 0.29 (0.16) | 7.11 (1.84) |
| 10 | 100 | 10 | 1 | 0.66 (0.08) | — |
| | | | 2 | 4.37 (1.68) | 1.94 (0.13) |
| | | | 4 | 2.32 (0.83) | 3.59 (0.37) |
| | | | 8 | 1.31 (0.46) | 6.33 (0.70) |
| | | | 16 | 0.80 (0.27) | 10.27 (1.55) |
6. Conclusions and Future Works
Determining benchmarking information through closest efficient targets is one of the relevant topics in the recent Data Envelopment Analysis literature. However, from a computational point of view, it has so far been addressed by unsatisfactory approaches since, as was argued, all of the existing methods are related to a combinatorial NP-hard problem.
In this paper, for the first time, the determination of closest targets has been approached by means of genetic algorithms and parallel programming. Specifically, a first step has been carried out towards solving the problem by obtaining valid solutions for the mathematical programming model proposed by Aparicio et al. [3]. In this respect, four different methods to generate the initial population of chromosomes were proposed and tested by means of simulated experiments. The third method, based on three algorithms that yield suitable parameters of the problem, clearly presented the best performance. In a subsequent phase, this third method was used to generate valid chromosomes in the genetic algorithm. Finally, the execution time and speedup for the parallel version of the genetic algorithm were reported, showing in particular that the speedup improves as the problem size increases.
The approach introduced in this paper only takes into account eight of the fourteen constraints in (3). Although the complexity of the problem justifies this first step, the development of a method considering the remaining restrictions may be a good avenue for follow-up research. In particular, improving the method that generates the initial population of chromosomes and testing different heuristics could help to achieve this specific objective. A deeper study of the relation between the different initial parameters, the size of the problem, the computing time required, and the effectiveness of the algorithm is also planned. Finally, it is worth mentioning that (3) was associated with the Enhanced Russell Graph measure [27]. However, many other DEA measures can be used for measuring technical efficiency through closest targets, so adapting the genetic algorithm approach to solve all of them can be seen as suitable future work.
Conflict of Interests
The authors declare that they have no conflict of interests regarding the publication of this paper.
Acknowledgments
The authors gratefully acknowledge the computer resources, technical expertise, and assistance provided by the Parallel Computing Group of the Murcia University. Additionally, Juan Aparicio and Jose Lopez-Espin are grateful to the Generalitat Valenciana for supporting this research with Grant no. GV/2013/112.
[1] T. Coelli, D. S. P. Rao, and G. E. Battese, An Introduction to Efficiency and Productivity Analysis.
[2] W. W. Cooper, L. M. Seiford, and K. Tone, Data Envelopment Analysis: A Comprehensive Text with Models, Applications, References and DEA-Solver Software.
[3] J. Aparicio, J. L. Ruiz, and I. Sirvent, "Closest targets and minimum distance to the Pareto-efficient frontier in DEA."
[4] M. C. A. S. Portela, P. C. Borges, and E. Thanassoulis, "Finding closest targets in non-oriented DEA models: the case of convex and non-convex technologies."
[5] A. Amirteimoori and S. Kordrostami, "A Euclidean distance-based measure of efficiency in data envelopment analysis."
[6] J. Aparicio and J. T. Pastor, "On how to properly calculate the Euclidean distance-based measure in DEA."
[7] J. T. Pastor and J. Aparicio, "The relevance of DEA benchmarking information and the least-distance measure: comment."
[8] F. X. Frei and P. T. Harker, "Projections onto efficient frontiers: theoretical and computational extensions to DEA."
[9] E. Gonzalez and A. Alvarez, "From efficiency measurement to efficiency improvement: the choice of a relevant benchmark."
[10] C. Baek and J.-d. Lee, "The relevance of DEA benchmarking information and the least-distance measure."
[11] G. R. Jahanshahloo, J. Vakili, and S. M. Mirdehghan, "Using the minimum distance of DMUs from the frontier of the PPS for evaluating group performance of DMUs in DEA."
[12] W. Briec, "Hölder distance functions and measurement of technical efficiency."
[13] W. Briec and J. B. Lesourd, "Metric distance function and profit: some duality results."
[14] W. Briec and B. Lemaire, "Technical efficiency and distance to a reverse convex set."
[15] T. Coelli, "A multi-stage methodology for the solution of orientated DEA models."
[16] L. Cherchye and T. Van Puyenbroeck, "A comment on multi-stage DEA methodology."
[17] S. Lozano and G. Villa, "Determining a sequence of targets in DEA."
[18] A. Charnes, J. J. Rousseau, and J. H. Semple, "Sensitivity and stability of efficiency classifications in data envelopment analysis."
[19] A. Takeda and H. Nishino, "On measuring the inefficiency with the inner-product norm in data envelopment analysis."
[20] G. R. Jahanshahloo, F. Hosseinzadeh Lotfi, and M. Zohrehbandian, "Finding the piecewise linear frontier production function in data envelopment analysis."
[21] G. R. Jahanshahloo, F. Hosseinzadeh Lotfi, H. Zhiani Rezai, and F. Rezai Balf, "Finding strong defining hyperplanes of production possibility set."
[22] G. R. Jahanshahloo, J. Vakili, and M. Zarepisheh, "A linear bilevel programming problem for obtaining the closest targets and minimum distance of a unit from the strong efficient frontier."
[23] W. D. Cook and L. M. Seiford, "Data envelopment analysis (DEA)—thirty years on."
[24] M. J. Farrell, "The measurement of productive efficiency."
[25] A. Charnes, W. W. Cooper, and E. Rhodes, "Measuring the efficiency of decision making units."
[26] R. D. Banker, A. Charnes, and W. W. Cooper, "Some models for estimating technical and scale inefficiencies in data envelopment analysis."
[27] J. T. Pastor, J. L. Ruiz, and I. Sirvent, "An enhanced DEA Russell graph efficiency measure."
[28] K. Tone, "A slacks-based measure of efficiency in data envelopment analysis."
[29] M. Mitchell, An Introduction to Genetic Algorithms.
[30] E. Burke, E. Hart, G. Kendall, J. Newall, P. Ross, and S. Schulenburg, "Hyperheuristics: an emerging direction in modern search technology."
[31] R. Chandra, R. Menon, L. Dagum, et al., Parallel Programming in OpenMP.
[32] J. J. Dongarra, J. Du Croz, S. Hammarling, and R. J. Hanson, "An extended set of FORTRAN basic linear algebra subroutines."
[33] E. Anderson, Z. Bai, C. Bischof, J. Demmel, J. J. Dongarra, et al., LAPACK Users' Guide.
[34] R. C. Whaley, A. Petitet, and J. Dongarra, "Automated empirical optimizations of software and the ATLAS project."