Two-sided matching is an active topic in the field of operations research and decision analysis. This paper reviews the typical two-sided matching models and their limitations in specific contexts and then puts forward a new decision model based on uncertain preference sequences. In this model, we first design a data processing method to obtain the preference ordinal values in an uncertain preference sequence, then compute the preference distance of each matching pair from these ordinal values, set the objectives as maximizing the matching number and minimizing the total sum of the preference distances of all matching pairs under a threshold constraint on the matching effect of each pair, and finally solve the model with a branch-and-bound algorithm. We take two numerical cases as examples and analyze the different matching solutions obtained with the one-norm, two-norm, and positive-infinity-norm distances, respectively. We also compare our decision model with two other approaches and summarize their characteristics for two-sided matching.
1. Introduction
Two-sided matching is an important research branch in the field of operations research and decision analysis and has been applied to many areas of engineering and economics, such as commerce trading [1, 2], work assignment [3, 4], and resource allocation [5, 6]. The two-sided matching decision problem derives from Gale and Shapley's 1962 research on stable marriage matching and the college admission problem [7]. Based on their pioneering work, Roth first gave a precise conception of two-sided matching: "two-sided" refers to the fact that agents in such markets belong to one of two disjoint sets, for example, firms or workers, that are specified in advance, and matching refers to the bilateral nature of exchange in these markets: for example, I am employed by the University of Pittsburgh only if the University of Pittsburgh employs me [8]. In many actual two-sided matching cases, owing to the difficulty of information acquisition and the fuzziness of information identification, it is much easier for a decision maker to acquire the preference sequence of each element than other kinds of information, such as the weight of the connection between two elements of the two disjoint sides, so the preference sequence is often the essential and even the only basis for decision making. However, when the scale of data grows rapidly, it becomes nearly impossible to collect complete preference sequences, and, moreover, two or more elements in a preference sequence may not be distinguishable as ranking higher or lower because they have the same preference degree for the preference subject.
We now define some conceptions about preference sequences. If the preference sequence of an element in one set over the other disjoint set includes all the elements of the latter set, we call it a complete preference sequence; otherwise it is an incomplete preference sequence. If no two elements in such a preference sequence have the same preference degree, we call it a strong preference sequence; otherwise it is a weak preference sequence. We call the preference sequence of an element in one set certain only when every element of the other set has a direct and certain ordinal value in accordance with this sequence; otherwise it is an uncertain preference sequence. Obviously, a preference sequence is certain only when it is both complete and strong; otherwise it is uncertain. The decision model constructed in this paper is set against exactly this background: the only information given for two-sided matching is the uncertain preference sequences.
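These definitions can be checked mechanically. The sketch below is our own illustrative encoding (the function name and the tier representation are not from the paper): a preference sequence is a list of tiers, a tier being a tuple of tied elements, and a sequence is certain exactly when it is both complete and strong.

```python
def classify(seq, universe):
    """Classify a preference sequence over `universe`.
    seq: list of tiers; each tier is a tuple of elements with equal preference.
    Returns (complete, strong, certain)."""
    listed = [e for tier in seq for e in tier]
    complete = set(listed) == set(universe)      # every element appears
    strong = all(len(tier) == 1 for tier in seq)  # no ties anywhere
    return complete, strong, complete and strong

# P3 = {Y2, (Y1, Y4), Y5} over a 5-element Y: incomplete and weak, hence uncertain
print(classify([("Y2",), ("Y1", "Y4"), ("Y5",)], ["Y1", "Y2", "Y3", "Y4", "Y5"]))
# A full ranking with no ties: complete and strong, hence certain
print(classify([("a",), ("b",), ("c",)], ["a", "b", "c"]))
```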
2. Research on Two-Sided Matching
In general, we can categorize the two-sided matching problem into three typical models in terms of their decision objectives: stable matching, maximum cardinality matching, and maximum weight matching. In the first model, the objective is to seek a stable matching solution; a solution counts as stable only when there exists no alternative pairing (A, B) in which A and B are both individually better off than with the elements they are currently matched to. Gale and Shapley put forward an approach, now called the Gale-Shapley algorithm, to obtain a stable matching solution from the perspective of mathematics and game theory, which marks the beginning of two-sided matching research and prompted subsequent scholars to pay more attention to this topic. In the second model, the objective is to seek a solution that maximizes the number of matching pairs. This kind of problem has been widely applied in graph theory, and one common solving approach is the Hungarian algorithm put forward by Edmonds in 1965 on the basis of Hall's theorem [9]. The key point of this algorithm is to seek an augmenting path, and this kind of two-sided matching problem is equivalent to maximum matching in a bipartite graph. Maximum weight matching is sometimes also called optimal weight matching, since minimum weight matching is easy to transform into maximum weight matching and has the same solving approach. In this model, each matching pair consisting of two elements from the two disjoint sets has a corresponding weight value, and the objective is to maximize or optimize the total weight sum of all matching pairs. This matching model can be classified as the assignment problem in operations research, and one common solving approach is the Hungarian method, first put forward by Kuhn in 1955 on the basis of a theorem due to the Hungarian mathematician König [10].
It is worth mentioning that the approaches put forward by Edmonds and Kuhn have similar names because of "Hungarian," but they are two totally different methods. Meanwhile, the assignment problem can also be regarded as a maximum weight matching problem in a bipartite graph, for which one common solving approach is the Kuhn-Munkres algorithm put forward by Munkres in 1957 [11].
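For the first model, the Gale-Shapley deferred-acceptance procedure can be sketched in a few lines. The index-based encoding below (proposer and reviewer preference lists as permutations of indices) is our own illustrative choice, not a formulation from the paper:

```python
def gale_shapley(prop_prefs, rev_prefs):
    """Deferred acceptance: proposers propose in preference order,
    reviewers hold on to the best offer received so far."""
    n = len(prop_prefs)
    # rank[w][m] = position of proposer m in reviewer w's preference list
    rank = [{m: pos for pos, m in enumerate(prefs)} for prefs in rev_prefs]
    next_choice = [0] * n        # next list index each proposer will try
    held = [None] * n            # held[w] = proposer currently held by reviewer w
    free = list(range(n))
    while free:
        m = free.pop()
        w = prop_prefs[m][next_choice[m]]
        next_choice[m] += 1
        if held[w] is None:
            held[w] = m                           # first offer is always held
        elif rank[w][m] < rank[w][held[w]]:
            free.append(held[w])                  # reviewer trades up, old partner freed
            held[w] = m
        else:
            free.append(m)                        # proposal rejected
    return {held[w]: w for w in range(n)}
```

The returned matching is stable and optimal for the proposing side; with n proposers and n reviewers, the loop terminates after at most n² proposals.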
Building on this earlier work, current research on two-sided matching is usually conducted in two ways. The first is to seek more effective methods or to analyze particular algorithms for the typical models, especially the stable matching model; for example, Roth puts forward the hospital-resident algorithm for the many-to-one matching case [12], Knoblauch studies the behavior of the Gale-Shapley algorithm under randomly distributed preference ordinal values [13], McVitie and Wilson put forward a new algorithm based on the "breakmarriage" operation for the situation where the two sides are of unequal size [14], and Teo et al. study strategic issues in the Gale-Shapley stable marriage model [15]. The second is application to specific decision contexts; each context has its own characteristics, and decision makers must take its distinctive constraints into consideration, extending or revising the typical two-sided matching models; for example, van Raalte and Webers study a two-sided market where one type of agent needs the service of a middleman or matchmaker in order to be matched with the other type [16], and Sarne and Kraus address the problem of agents in a distributed, costly two-sided search for pairwise partnerships in a multiagent system [17]. The research in this paper follows the second way.
In many actual decision situations, most two-sided matching cases cannot easily be fit into the typical models mentioned above. For example, it is often hard to set an appropriate weight value for each matching pair directly. Although quantitative techniques such as AHP [18] can help determine weight values, they are of little use when the scale of data is very large or the available information is very scarce. Many researchers take stable matching as the most important objective; however, stable matching under uncertain preference sequences has limitations. When the preference sequence information is incomplete but strong, the Gale-Shapley algorithm still works: the solution is still stable, although the number of matching pairs may be reduced. In fact, when the preference sequence information is incomplete but strong, or weak but complete, the problem can be solved in polynomial time [19]; when the preference sequence information is incomplete and weak at the same time, however, the problem is NP-hard, and the common approach is to relax some constraints or to adopt approximation algorithms, as in [20, 21]. Moreover, the decision objective of two-sided matching is sometimes not a stable matching, because stable matching neither considers the benefit of both sides simultaneously nor maximizes the total utility in economic or other terms. One common objective of two-sided matching from the rational economic perspective is to maximize total utility, so a decision maker generally transforms the preference sequence information into utility values with some data processing method and then solves the problem with the typical maximum weight matching model. Li et al. replace utility with satisfaction and try to maximize its total sum.
Based on the hypothesis that the satisfaction degree decreases, at an ever slower rate, as the preference ordinal value grows, Li et al. construct a transformation between preference ordinal values and satisfaction values [22]. In their research, however, not only must the preference sequences be complete, but the matching solution also depends strongly on the transfer function between preference ordinal value and satisfaction value and may change if any parameter of the function, or the function itself, changes. Regarding this situation, we study two-sided matching under uncertain preference sequence information and put forward a new decision model that integrates the typical maximum cardinality model and the optimal weight matching model. The objectives of the integrated model are to maximize the matching number and to minimize the distance from the ideal status.
3. Uncertain Preference Sequences and Ordinal Value
The two-sided matching model constructed in this paper computes the preference distance of any two elements of the two disjoint sets on the basis of their preference ordinal values, so we first design a data processing method to obtain the preference ordinal values in an uncertain preference sequence. We denote the two disjoint sets by X and Y, with m and n elements, respectively: X={X1,X2,…,Xm} and Y={Y1,Y2,…,Yn}, where Xi and Yj are the elements of X and Y, i∈I, j∈J, I=[1,m], J=[1,n]. We denote the preference sequence of Xi over Y by Pi (or Pi∗). Specifically, only when the preference sequence of Xi over Y is incomplete and Xi would never accept the elements outside its preference sequence do we label it Pi∗; otherwise we label it Pi. Pi (or Pi∗) consists of elements of Y sorted in descending order of the preference degree of Xi. If some elements in Pi (or Pi∗) have the same preference degree for Xi, we enclose them in parentheses. For example, suppose X and Y both have 5 elements and the preference sequence of X3 over Y is {Y2,(Y1,Y4),Y5}. This means that, for X3, Y2 is its first preference item in Y, Y1 and Y4 are tied for second, and Y5 is fourth. This preference sequence is incomplete, since Y3 is not in it. To handle such cases, we classify incomplete preference sequences into two types: in the first, after considering the elements in its preference sequence preferentially, the element would accept elements that are not in the sequence; in the second, it would never accept elements outside its preference sequence.
We label the latter situation with ∗. For example, if we denote the preference sequence of X3 by P3, it means that X3 would accept Y3 as a preference item after considering Y2, Y1, Y4, and Y5 preferentially; if we denote it by P3∗, it means that X3 would never accept Y3.
We denote the real ordinal value of Yj in Pi (or Pi∗) by oij: when Yj is one of the elements of Pi (or Pi∗), oij takes a value in 1,2,…,n, and when Yj is not in Pi (or Pi∗), oij is ⌀. Take P3={Y2,(Y1,Y4),Y5} above as an example: the real ordinal values of Y2, Y1, Y4, and Y5 are o32=1, o31=2, o34=3, and o35=4, and the real ordinal value of Y3 is o33=⌀. We denote the preference ordinal value of Yj in Pi (or Pi∗) by rij; according to the data processing method designed in this paper, the transition between oij and rij is defined as follows: (a) when oij is ⌀, if the preference sequence of Xi is labeled Pi, then rij=(|Pi|+1+n)/2, where |Pi| is the number of elements in Pi (this is the average of the unoccupied positions |Pi|+1,…,n); if the preference sequence of Xi is labeled Pi∗, then rij=n+1; (b) when oij is not ⌀, if no other element is tied with Yj in Pi (or Pi∗), then rij=oij; if some elements are tied with Yj, we denote the tied set including Yj by Y′, and rij equals the arithmetic average of the real ordinal values of all elements of Y′, written avg(Y′). The definition of oij and the transition between oij and rij are as follows:

$$o_{ij}=\begin{cases}1,2,3,\dots,n, & Y_j\in P_i\ (\text{or }P_i^{*})\\[2pt] \varnothing, & Y_j\notin P_i\ (\text{or }P_i^{*}),\end{cases}\qquad
r_{ij}=\begin{cases}\dfrac{|P_i|+1+n}{2}, & o_{ij}=\varnothing,\ \exists P_i\\[4pt] n+1, & o_{ij}=\varnothing,\ \exists P_i^{*}\\[2pt] o_{ij}, & o_{ij}\neq\varnothing,\ \text{no element tied with }Y_j\\[2pt] \operatorname{avg}(Y'), & o_{ij}\neq\varnothing,\ \text{some element tied with }Y_j.\end{cases}\tag{1}$$

Similarly, we denote the preference sequence of Yj over X by Qj (or Qj∗). Specifically, we label it Qj∗ only when the preference sequence of Yj over X is incomplete and Yj would never accept the elements outside its preference sequence; otherwise we label it Qj.
We denote the real ordinal value of Xi in Qj (or Qj∗) by sij and the preference ordinal value of Xi in Qj (or Qj∗) by tij; the transition between sij and tij is analogous: (a) when sij is ⌀, if the preference sequence of Yj over X is labeled Qj, then tij=(|Qj|+1+m)/2, where |Qj| is the number of elements in Qj; if it is labeled Qj∗, then tij=m+1; (b) when sij is not ⌀, if no other element is tied with Xi in Qj (or Qj∗), then tij=sij; if some elements are tied with Xi, we denote the tied set including Xi by X′, and tij equals the arithmetic average of the real ordinal values of all elements of X′, written avg(X′). The definition of sij and the transition between sij and tij are as follows:

$$s_{ij}=\begin{cases}1,2,3,\dots,m, & X_i\in Q_j\ (\text{or }Q_j^{*})\\[2pt] \varnothing, & X_i\notin Q_j\ (\text{or }Q_j^{*}),\end{cases}\qquad
t_{ij}=\begin{cases}\dfrac{|Q_j|+1+m}{2}, & s_{ij}=\varnothing,\ \exists Q_j\\[4pt] m+1, & s_{ij}=\varnothing,\ \exists Q_j^{*}\\[2pt] s_{ij}, & s_{ij}\neq\varnothing,\ \text{no element tied with }X_i\\[2pt] \operatorname{avg}(X'), & s_{ij}\neq\varnothing,\ \text{some element tied with }X_i.\end{cases}\tag{2}$$
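Rule (1) can be implemented directly. In the sketch below (our own encoding, not the paper's: a sequence is a list of tiers, a tier a tuple of tied indices, and `hard=True` marks a Pi∗-type sequence), the function returns the preference ordinal value rij for every j in 1..n:

```python
def pref_ordinals(seq, n, hard=False):
    """Convert an uncertain preference sequence to preference ordinal values.
    seq: list of tiers over indices 1..n; hard=True means a P*-type sequence
    (unlisted elements are never accepted). Returns {j: r_ij}."""
    real = {}
    pos = 1
    for tier in seq:
        ranks = list(range(pos, pos + len(tier)))
        # tied elements share the average of their real ordinal values
        value = sum(ranks) / len(ranks) if len(tier) > 1 else ranks[0]
        for j in tier:
            real[j] = value
        pos += len(tier)
    listed = pos - 1                                  # |P_i|
    # unlisted elements: n+1 for P*-type, else the average unoccupied position
    fill = n + 1 if hard else (listed + 1 + n) / 2
    return {j: real.get(j, fill) for j in range(1, n + 1)}
```

For a sequence {Y6, Y9, Y5} over 10 elements, the three listed elements get ordinals 1, 2, 3 and every unlisted element gets (3+1+10)/2 = 7, as in rule (1).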
4. Preference Distance
Since we cannot measure the relationship between preference ordinal value and preference utility exactly, no universal transition function between them is widely accepted; whether such a function is linear or nonlinear depends on the specific decision background. The preference ordinal value is, however, a good unit for measuring the distance from the ideal status, which we name the preference distance: the larger the ordinal value, the greater the distance and the lower the matching effect. Our model therefore first computes the preference distances from the ideal matching status on the basis of the preference ordinal values and then minimizes the total sum of these distances.
Take the elements Xi and Yj of X and Y as an example. In Pi, the preference sequence of Xi, the preference ordinal value of Yj is rij; in Qj, the preference sequence of Yj, the preference ordinal value of Xi is tij. We use (rij, tij) to represent the matching status of Xi and Yj. In the ideal matching status, Xi and Yj are each other's first preference items, so we label the ideal matching status (1, 1). The preference distance of Xi and Yj is the distance between the real matching status and the ideal matching status of this pair, denoted dij. Following Minkowski's definition of distance, dij is computed as

$$d_{ij}=\left(\left(r_{ij}-1\right)^{q}+\left(t_{ij}-1\right)^{q}\right)^{1/q}.\tag{3}$$

If we take the different importance of the two sides into consideration, the computation can be modified as

$$d_{ij}=\left(\left(r_{ij}-1\right)^{q}+\alpha^{q}\left(t_{ij}-1\right)^{q}\right)^{1/q},\tag{4}$$

where α is an importance factor balancing the two sides. In this paper we ignore this difference and assume the two sides are equally important, so α is always 1 and the computations (3) and (4) are equivalent. Different values of q yield different kinds of distance: theoretically q can be any real number from 1 to +∞, but the most common values are 1, 2, and +∞.
When q=1, dij is the one-norm distance, also called the Manhattan distance: the distance between two points is the sum of the absolute differences of their Cartesian coordinates. In this case dij between (rij, tij) and (1, 1) is rij+tij−2, a linear expression in the coordinates. A common criterion for evaluating a matching solution is the total sum of the preference ordinal values of all matched elements. This total sum and the total sum of the one-norm distances are linearly equivalent: the feasible solutions minimizing the one and minimizing the other are exactly the same, and both correlate negatively with matching effect. The one-norm distance suits decision situations where the ranges of the preference ordinal values of the two sides do not differ significantly and the two sides share roughly the same metric, so that a simple additive relationship can represent the whole matching effect.
When q=2, dij is the two-norm distance, also called the Euclidean distance: the distance between two points is the length of the line segment connecting them, computed by the Pythagorean formula. In this case dij between (rij, tij) and (1, 1) is the square root of (rij−1)²+(tij−1)². Since the coordinates are generally integers, dij involves floating-point computation when q=2, so from the perspective of solving efficiency the solving time increases rapidly as the scale of data grows. The two-norm distance therefore suits situations where the dimensions represented by the two sides are independent and the data scale is within an acceptable range.
When q=+∞, dij is the positive-infinity-norm distance, also called the Chebyshev distance: the distance between two points is the greatest of their differences along any coordinate dimension. In this case dij between (rij, tij) and (1, 1) equals the larger of rij−1 and tij−1. Geometrically, the points at Chebyshev distance d from the origin form a square centered at the origin, with each side of length 2d and parallel to a coordinate axis. The positive-infinity-norm distance suits situations where the balanced performance of the two sides is important and their difference should not be too large.
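The three cases of (3)/(4) can be computed by one function; the sketch below (the function name and default arguments are our own choices) accepts q = 1, 2, or infinity:

```python
def pref_distance(r, t, q=1, alpha=1.0):
    """Distance of matching status (r, t) from the ideal status (1, 1):
    Manhattan (q=1), Euclidean (q=2), Chebyshev (q=inf), per (3)/(4)."""
    if q == float("inf"):
        return max(r - 1, alpha * (t - 1))   # limit case: the worse side dominates
    return ((r - 1) ** q + (alpha * (t - 1)) ** q) ** (1 / q)
```

For example, the status (3, 4) is at one-norm distance 2+3 = 5 from the ideal, (4, 5) is at two-norm distance 5 (a 3-4-5 triangle), and (3, 5) is at Chebyshev distance max(2, 4) = 4.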
5. Modeling Construction and Solving
In the typical optimal weight matching model, the main constraint is that any element of one set can match at most one element of the other set, and the objective is to optimize the total sum of the weight values; for example, taking the preference distance of each matching pair as its weight, the objective is to minimize the total sum of the preference distances of all matching pairs. The matching number is also a constraint, although its value is obvious in the typical model: if the two sides have the same number of elements, the matching number equals that number; otherwise it equals the smaller one. In the model put forward in this paper, to prevent any matching pair from performing too badly, we set a threshold on matching performance: the preference distance of every matching pair must not exceed a value set in advance. With this new constraint, the matching number is no longer a constant; it depends on how restrictive the threshold is. We therefore set two objectives in the decision model: to maximize the number of matching pairs and to minimize the total sum of the preference distances of all matching pairs.
We use xij to denote the matching relationship between Xi and Yj, xij∈{0,1}: xij=1 means that Xi and Yj match each other and make up a matching pair, and xij=0 means that they do not. The constraint that any element of one set can match at most one element of the other set is

$$0\le\sum_{i\in I}x_{ij}\le 1,\ \forall j\in J,\qquad 0\le\sum_{j\in J}x_{ij}\le 1,\ \forall i\in I.\tag{5}$$

We denote the maximum value of dij by dmax and the minimum by dmin; their specific values depend on the parameter q and the preference sequences. We denote by λ the threshold factor of matching performance, λ∈[0,1]. When λ=1 there is no threshold constraint; as λ decreases, the threshold becomes more restrictive; when λ=0, the constraint is strictest. The threshold constraint is

$$x_{ij}d_{ij}\le d_{\min}+\lambda\left(d_{\max}-d_{\min}\right),\ \forall i\in I,\ \forall j\in J.\tag{6}$$

Since an element should never be matched to an element whose starred preference sequence excludes it (that element would never accept the match), we add a further constraint: if rij equals n+1, xij must be zero, and if tij equals m+1, xij must also be zero.
As the maximum values of rij and tij are n+1 and m+1, respectively, and the second largest values are n and m, this constraint can be written linearly as

$$0\le x_{ij}\le n+1-r_{ij},\qquad 0\le x_{ij}\le m+1-t_{ij},\qquad \forall i\in I,\ \forall j\in J.\tag{7}$$

Maximizing the matching number and minimizing the total sum of preference distances are expressed as

$$\max\sum_{i\in I}\sum_{j\in J}x_{ij},\qquad \min\sum_{i\in I}\sum_{j\in J}x_{ij}d_{ij}.\tag{8}$$

As this is a multiobjective optimization, there are in general three ways to deal with it. The first is to transform the multiobjective problem into a single-objective problem, for instance by the simple linear additive weight method, the maximal-minimal method, or the TOPSIS method; the critical point of this way is to ensure that the optimal solution of the new single-objective problem is also a noninferior solution of the original multiobjective problem. The second is to transform it into multiple single-objective problems solved in a special order, for instance by the hierarchical method or the interactive programming method; the optimal solution of the original problem is obtained by solving these single-objective problems one by one and coincides with the optimal solution of the last of them. The third comprises nonuniform methods such as the multiplication-division method and the efficiency coefficient method. In this paper, we hold that maximizing the number of matching pairs is the main objective and minimizing the total sum of preference distances is the secondary objective, so we use the second way: we first obtain the optimal value of the main objective, add it as a new constraint, and then obtain the optimal solution of the secondary objective. The two objectives can be integrated into one linear objective as

$$\max\ y=M_{1}\sum_{i\in I}\sum_{j\in J}x_{ij}-M_{2}\sum_{i\in I}\sum_{j\in J}x_{ij}d_{ij},\qquad M_{1}\gg M_{2},\tag{9}$$

where M1 and M2 are both positive real numbers and M1≫M2 means that M1 is far greater than M2.
We could also assign specific values to M1 and M2 if the relative importance of the two objectives could be measured exactly, but that is beyond the scope of this paper. In conclusion, the decision model constructed in this paper is

$$\begin{aligned}\max\ y={}&M_{1}\sum_{i\in I}\sum_{j\in J}x_{ij}-M_{2}\sum_{i\in I}\sum_{j\in J}x_{ij}d_{ij}\\ \text{s.t.}\quad &0\le\sum_{i\in I}x_{ij}\le 1,\ \forall j\in J\\ &0\le\sum_{j\in J}x_{ij}\le 1,\ \forall i\in I\\ &x_{ij}d_{ij}\le d_{\min}+\lambda\left(d_{\max}-d_{\min}\right),\ \forall i\in I,\ \forall j\in J\\ &0\le x_{ij}\le n+1-r_{ij},\ \forall i\in I,\ \forall j\in J\\ &0\le x_{ij}\le m+1-t_{ij},\ \forall i\in I,\ \forall j\in J\\ &x_{ij}=0\text{ or }1\\ &d_{ij}=\left(\left(r_{ij}-1\right)^{q}+\left(t_{ij}-1\right)^{q}\right)^{1/q}\\ &\lambda\in[0,1],\ q\in[1,+\infty),\ M_{1},M_{2}\in\mathbb{R}^{+},\ M_{1}\gg M_{2}.\end{aligned}\tag{10}$$

In the typical maximum weight matching model, since the maximum matching number is constant and obvious, the model is easily converted into a standard assignment model, in which the two sides have the same number of elements and every element of one set is matched with one element of the other, zero elements being added to balance the sides if necessary. When the scale is not too large, the Hungarian method is a common way to solve this kind of model; it rests on two theorems: (a) if a number is added to or subtracted from all elements of one row or column of the efficiency matrix, the optimal solutions of the original and new matrices are the same; (b) the maximum number of independent zero elements in the efficiency matrix equals the minimum number of lines covering all zero elements. Many publications have studied and promoted this method and provide implementations in languages and platforms such as C, Java, and MATLAB, so we do not discuss it further.
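On small instances, the hierarchical treatment of model (10) can be checked against brute force. The sketch below is our own illustrative code, not the paper's CPLEX formulation: it enumerates the sub-matchings of a square distance matrix, keeps only the pairs respecting the threshold of (6), and applies the lexicographic objective (maximize the pair count, then minimize the total distance).

```python
from itertools import permutations

def solve_two_stage(d, lam):
    """Brute-force counterpart of model (10) on a small square instance d."""
    n = len(d)
    vals = [v for row in d for v in row]
    cap = min(vals) + lam * (max(vals) - min(vals))   # threshold of constraint (6)
    best_key, best = (0, 0.0), (0, 0.0, ())
    for perm in permutations(range(n)):
        # keep only pairs whose distance respects the threshold
        pairs = [(i, perm[i]) for i in range(n) if d[i][perm[i]] <= cap]
        total = sum(d[i][j] for i, j in pairs)
        key = (len(pairs), -total)        # lexicographic: count first, then -distance
        if key > best_key:
            best_key, best = key, (len(pairs), total, tuple(pairs))
    return best
```

For example, with d = [[1, 4, 5], [4, 1, 5], [5, 5, 9]], λ = 1 keeps all three pairs at minimum total distance 11, while λ = 0 admits only distance-1 pairs and leaves a matching of two pairs.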
For nonlinear optimization, the common solving approaches are heuristic algorithms such as the genetic algorithm, simulated annealing, and tabu search. However, if we first compute dij from rij and tij and then input the dij matrix and the values of dmax, dmin, and λ, all constraints and objectives involving xij are linear expressions, so the model we construct is a standard 0-1 integer linear programming model, which can be solved with the branch-and-bound method even when the scale of data is very large. Branch and bound is an algorithm design paradigm for discrete optimization problems as well as general real-valued problems. It was first proposed by Land and Doig in 1960 for discrete programming [23] and has become the most commonly used tool for solving integer programming and NP-hard problems. It has two procedures, branching and bounding: branching divides the original problem into subproblems whose solution sets jointly cover all feasible solutions of the original problem, and bounding computes an upper bound and a lower bound on the optimal objective value. The main idea of the branch-and-bound algorithm is to raise the lower bound and reduce the upper bound iteratively until the optimal value is obtained. It can be classified into specific variants according to the branching search strategy. Modern linear programming software, such as CPLEX, solves integer programming models with a branch-and-bound package. In this paper, we also use this algorithm to solve our model and analyze the solutions in the following section.
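To illustrate the branching and bounding idea (this is our own minimal sketch, not CPLEX's implementation, and it solves only the assignment relaxation: a complete matching minimizing total distance, with no threshold), rows are branched on in order, and a branch is pruned when its accumulated cost plus an optimistic row-minimum bound cannot beat the incumbent:

```python
import math

def bb_assignment(d):
    """Minimal branch and bound for the min-cost assignment on square matrix d."""
    n = len(d)
    row_min = [min(row) for row in d]          # optimistic completion bound per row
    best = {"cost": math.inf, "cols": None}

    def branch(i, used, cost, cols):
        # bounding: prune if even the optimistic completion cannot improve
        if cost + sum(row_min[i:]) >= best["cost"]:
            return
        if i == n:
            best["cost"], best["cols"] = cost, tuple(cols)
            return
        # branching: try each still-free column for row i
        for j in range(n):
            if j not in used:
                branch(i + 1, used | {j}, cost + d[i][j], cols + [j])

    branch(0, set(), 0, [])
    return best["cost"], best["cols"]
```

Each improvement tightens the incumbent (the upper bound), so later branches are cut earlier; real solvers add linear-programming relaxations and stronger bounds on top of this skeleton.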
6. Numerical Cases and Analysis
We first give a numerical case, named Case 1, in which each side has 10 elements. The sets X and Y and their preference sequences P and Q are shown in Table 1. The real ordinal values of Yj in Pi, the preference sequence of Xi, make up a matrix O=(oij)10×10, and the preference ordinal values make up a matrix R=(rij)10×10. Similarly, the real ordinal values of Xi in Qj, the preference sequence of Yj, make up a matrix S=(sij)10×10, and the preference ordinal values make up a matrix T=(tij)10×10.
Table 1: Matching sets X and Y and preference sequences P and Q in Case 1.

Xi    P                                                 Yi    Q
X1    P1 = {Y6, Y9, Y5}                                 Y1    Q1 = {X8, (X9, X10), X5, X4, X7, X1, X3}
X2    P2∗ = {Y5, (Y7, Y6), Y9, Y10, Y2, Y3}             Y2    Q2∗ = {X8, X10, X7, X6, X3}
X3    P3∗ = {Y7, Y5, Y8, Y1, Y3, Y2}                    Y3    Q3 = {(X9, X7, X8), X5, X6, X4, X10, X3, X2, X1}
X4    P4 = {Y5, Y6, Y7, (Y9, Y10), Y3, Y1}              Y4    Q4 = {X7, X9, X8, X6}
X5    P5 = {Y7, Y5, Y6, Y8, Y3}                         Y5    Q5∗ = {X7, X9, X8, X6, X5, X1}
X6    P6 = {Y5, Y7, Y6, Y10, Y8, Y9, Y4, Y2, Y1, Y3}    Y6    Q6 = {X8, X6, X3}
X7    P7 = {Y6, Y4, Y1, Y2}                             Y7    Q7∗ = {X8, X9, X7, X6, X10, X2, X1}
X8    P8∗ = {Y6, Y8, Y9, Y1, Y3, Y2, Y10, Y4}           Y8    Q8 = {X9, X7, X10, X6, X4, X5, X1, X2}
X9    P9 = {Y5, Y7, (Y8, Y6, Y10), Y9, Y4, Y2, Y3, Y1}  Y9    Q9 = {X7, X8, (X9, X10), X4, X5, X2, X6, X1, X3}
X10   P10 = {Y7, Y5, Y6, Y4, Y2, Y8, Y3, Y1}            Y10   Q10 = {X7, X9, X8, X10, X4, X5, X1}
The four matrices O, R, S, and T are as follows:

$$O=\begin{pmatrix}
\varnothing&\varnothing&\varnothing&\varnothing&3&1&\varnothing&\varnothing&2&\varnothing\\
\varnothing&6&7&\varnothing&1&3&2&\varnothing&4&5\\
4&6&5&\varnothing&2&\varnothing&1&3&\varnothing&\varnothing\\
7&\varnothing&6&\varnothing&1&2&3&\varnothing&4&5\\
\varnothing&\varnothing&5&\varnothing&2&3&1&4&\varnothing&\varnothing\\
9&8&10&7&1&3&2&5&6&4\\
3&4&\varnothing&2&\varnothing&1&\varnothing&\varnothing&\varnothing&\varnothing\\
4&6&5&8&\varnothing&1&\varnothing&2&3&7\\
10&8&9&7&1&4&2&3&6&5\\
8&5&7&4&2&3&1&6&\varnothing&\varnothing
\end{pmatrix},\qquad
R=\begin{pmatrix}
7&7&7&7&3&1&7&7&2&7\\
11&6&7&11&1&2.5&2.5&11&4&5\\
4&6&5&11&2&11&1&3&11&11\\
7&9&6&9&1&2&3&9&4.5&4.5\\
8&8&5&8&2&3&1&4&8&8\\
9&8&10&7&1&3&2&5&6&4\\
3&4&7.5&2&7.5&1&7.5&7.5&7.5&7.5\\
4&6&5&8&11&1&11&2&3&7\\
10&8&9&7&1&4&2&4&6&4\\
8&5&7&4&2&3&1&6&9.5&9.5
\end{pmatrix},\tag{11}$$

$$S=\begin{pmatrix}
7&\varnothing&10&\varnothing&6&\varnothing&7&7&9&7\\
\varnothing&\varnothing&9&\varnothing&\varnothing&\varnothing&6&8&7&\varnothing\\
8&5&8&\varnothing&\varnothing&3&\varnothing&\varnothing&10&\varnothing\\
5&\varnothing&6&\varnothing&\varnothing&\varnothing&\varnothing&5&5&5\\
4&\varnothing&4&\varnothing&5&\varnothing&\varnothing&6&6&6\\
\varnothing&4&5&4&4&2&4&4&8&\varnothing\\
6&3&2&1&1&\varnothing&3&2&1&1\\
1&1&3&3&3&1&1&\varnothing&2&3\\
2&\varnothing&1&2&2&\varnothing&2&1&3&2\\
3&2&7&\varnothing&\varnothing&\varnothing&5&3&4&4
\end{pmatrix},\qquad
T=\begin{pmatrix}
7&11&10&7.5&6&7&7&7&9&7\\
9.5&11&9&7.5&11&7&6&8&7&9\\
8&5&8&7.5&11&3&11&9.5&10&9\\
5&11&6&7.5&11&7&11&5&5&5\\
4&11&4&7.5&5&7&11&6&6&6\\
9.5&4&5&4&4&2&4&4&8&9\\
6&3&2&1&1&7&3&2&1&1\\
1&1&2&3&3&1&1&9.5&2&3\\
2.5&11&2&2&2&7&2&1&3.5&2\\
2.5&2&7&7.5&11&7&5&3&3.5&4
\end{pmatrix}.$$

After computing the matrices O, R, S, and T, we obtain the matrix (dij) according to (3) for q=1, q=2, or q=+∞. We first set λ=1, that is, no threshold constraint; the results solved through CPLEX with the branch-and-bound algorithm are as follows: when q=1, the optimal solution, labeled Solution 1, has x16, x27, x31, x4,10, x53, x65, x74, x89, x98, and x10,2 equal to 1 and the rest 0; when q=2, the optimal solution, labeled Solution 2, has x15, x27, x31, x4,10, x53, x66, x74, x89, x98, and x10,2 equal to 1 and the rest 0; when q=+∞, the optimal solution, labeled Solution 3, has x11, x29, x32, x4,10, x53, x67, x74, x86, x95, and x10,8 equal to 1 and the rest 0. We now compare these three solutions under six different criteria.
Criterion 1 (concordance rate on the first item, labeled c1). We define N1 as the number of matching pairs in which an element matches its first preference item (including tied rankings, and likewise below) and N as the number of all matching pairs; c1 is the ratio of N1 to N. Obviously, c1 correlates positively with matching effect: the higher the rate, the better the matching effect. Take Solution 1 as an example: in set X, only X1 and X6 match their first preference items, so the concordance rate of X is 20%; in set Y, only Y4 and Y8 match their first preference items, so the concordance rate of Y is also 20%. Hence the average concordance rate on the first item in Solution 1 is 20%. Similarly, the average concordance rates on the first item in Solutions 2 and 3 are 10% and 20%, respectively.
Criterion 2 (concordance rate on the first three items, labeled c2). We define N3 as the number of matching pairs in which an element matches one of its first three preference items (including tied rankings), with N still the number of all matching pairs; c2 is the ratio of N3 to N and also correlates positively with matching effect. In Solution 1, in set X, the elements X1, X2, X6, X7, X8, and X9 match elements among their first three preference items, so the concordance rate of X is 60%; in set Y, the elements Y2, Y4, Y8, and Y9 do, so the concordance rate of Y is 40%. Hence the average concordance rate on the first three items in Solution 1 is 50%. Similarly, the average concordance rates on the first three items in Solutions 2 and 3 are 60% and 40%, respectively.
Criterion 3 (concordance rate on all the items, labeled c3). We define the number of matching pairs in which an element matches an item in its preference sequence as N0; the number of all matching pairs is still denoted by N. c3 is the ratio of N0 to N and is also positively correlated with matching effect. In Solution 1, all elements in set X match elements in their preference sequences, so the concordance rate of X is 100%; in set Y, all elements except Y6 match elements in their preference sequences, so the concordance rate of Y is 90%. Hence the average concordance rate on all the items in Solution 1 is 95%. Similarly, the average concordance rates on all the items in Solutions 2 and 3 are 100% and 95%, respectively.
Criteria 4 and 5 (average preference ordinal value and its standard deviation, labeled c4 and c5, resp.). In Solution 1, the preference ordinal values of the elements in set X toward set Y are 1, 2.5, 4, 4.5, 5, 1, 2, 3, 4, and 5, so the average preference ordinal value and standard deviation are 3.2 and 1.53; the preference ordinal values of the elements in set Y toward set X are 7, 6, 8, 5, 4, 4, 1, 2, 1, and 2, so the average preference ordinal value and standard deviation are 4 and 2.49; taking sets X and Y as a whole, the average preference ordinal value and standard deviation are 3.6 and 2.06. Similarly, we can get the average preference ordinal value and standard deviation in Solutions 2 and 3. Obviously, both c4 and c5 are negatively correlated with matching effect: a lower average ordinal value means elements are matched closer to the top of their preference sequences, and a lower standard deviation means the matching quality is more evenly distributed.
Criterion 6 (the total sum of disagreements of two sides, labeled c6). It equals the total sum of the absolute differences between the two sides' preference ordinal values, which measures the divergence between the two sides. So c6 is defined as follows:
(12) c6 = ∑i∈I ∑j∈J |tij − rij| xij.
Criterion 6 is also negatively correlated with matching effect: the lower the total sum of disagreements, the better the matching effect. The values of Criterion 6 in Solutions 1, 2, and 3 are 26, 21, and 12.5, respectively.
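The criteria above can be computed mechanically from the ordinal matrices; the sketch below evaluates Criteria 1, 4, and 6 for a given matching. The 3 × 3 matrices and the pairing are illustrative toy data, not the paper's Case 1 instance.

```python
def criteria(pairs, R, T):
    """Evaluate a matching on Criteria 1, 4, and 6.
    pairs: list of matched index pairs (i, j); R[i][j] and T[i][j] are
    the two sides' preference ordinal values (hypothetical small data)."""
    n = len(pairs)
    # c1: share of pairs where X_i got its first-choice item (ordinal 1)
    c1 = sum(1 for i, j in pairs if R[i][j] == 1) / n
    # c4: average preference ordinal value over both sides together
    ordinals = [R[i][j] for i, j in pairs] + [T[i][j] for i, j in pairs]
    c4 = sum(ordinals) / len(ordinals)
    # c6: total disagreement, Eq. (12): sum of |t_ij - r_ij| over matches
    c6 = sum(abs(T[i][j] - R[i][j]) for i, j in pairs)
    return c1, c4, c6

# Toy ordinal matrices and a matching (not the paper's data)
R = [[1, 2, 3], [2, 1, 3], [3, 2, 1]]
T = [[2, 1, 3], [1, 3, 2], [2, 3, 1]]
pairs = [(0, 1), (1, 0), (2, 2)]
print(criteria(pairs, R, T))
```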
The values of the six criteria for the three solutions are shown in Table 2, and we can see that no solution outperforms the rest on all the criteria. Solution 1 performs better on Criteria 1 and 4, for it has a higher concordance rate on the first item and the lowest average preference ordinal value when taking X and Y as a whole; Solution 3 performs better on Criteria 1 and 6, for it matches Solution 1 on Criterion 1 and has the lowest total sum of disagreements of the two sides; Solution 2 performs better on Criteria 2, 3, and 5, for it has the highest concordance rates on the first three items and on all the items, and the lowest standard deviation, whether on set X, set Y, or sets X and Y together. What is more, Solution 2 is also the only one in which all matched elements appear in each other's preference sequences. As q varies over 1, 2, and +∞, the value of Criterion 4 increases while the value of Criterion 6 decreases; that is, the performance on Criteria 4 and 6 moves in opposite directions; the value of Criterion 5, however, first goes down and then rises. Among Criteria 4~6, each solution performs better on one criterion and worse on the other two. So which kind of preference distance should be used depends on the actual decision background; the decision maker should first weigh the importance of each evaluation criterion and then choose an appropriate solution according to its performance on these criteria.
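Weighing the criteria and then choosing a solution can be sketched as weighted scoring after min-max normalisation. The equal weights below are purely illustrative (the paper gives no weights); the criteria values are the ones reported above for Solutions 1-3 on Criteria 1, 2, 3 (percentages), and 6.

```python
def score(solutions, weights, senses):
    """Weighted scoring across criteria after min-max normalisation.
    senses[k] is +1 when higher is better on criterion k, -1 when
    lower is better. Weights are illustrative, not from the paper."""
    names = list(solutions)
    k = len(weights)
    cols = [[solutions[s][c] for s in names] for c in range(k)]

    def norm(v, col, sense):
        lo, hi = min(col), max(col)
        x = (v - lo) / (hi - lo) if hi > lo else 1.0
        return x if sense > 0 else 1.0 - x

    totals = {s: sum(w * norm(solutions[s][c], cols[c], senses[c])
                     for c, w in enumerate(weights))
              for s in names}
    return max(totals, key=totals.get), totals

# Reported values for Solutions 1-3 in Case 1: c1, c2, c3 (in %), c6
sols = {"Solution 1": [20, 50, 95, 26],
        "Solution 2": [10, 60, 100, 21],
        "Solution 3": [20, 40, 95, 12.5]}
best, totals = score(sols, weights=[1, 1, 1, 1], senses=[+1, +1, +1, -1])
print(best)   # with equal weights, Solution 2 comes out on top
```

Changing the weights shifts the choice, which is exactly the point: the ranking of solutions is a function of the decision maker's priorities.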
Solutions and the value on six criteria in Case 1.
When λ varies from 0 to 1, the matching number also varies owing to the influence of the threshold constraint. The matching number rate is the ratio of the actual number of matching pairs in the optimal solution to the maximum possible number of matching pairs; the latter obviously equals the smaller of m and n. Table 3 shows the rate for different values of λ and q. We can see that the range of λ over which the matching number rate is affected is wider for a high-norm distance than for a low-norm distance; that is, the matching number is more sensitive to λ as q grows. This has guiding significance for setting an appropriate value of λ in an actual decision situation: the threshold constraint becomes stricter as q grows, so to keep the same level of matching number rate under different norm distances, we should set a larger λ as q increases.
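One assumed form of such a threshold, admitting a pair only when its distance is at most λ times the largest distance (the paper's exact constraint is given earlier in the model), already reproduces the qualitative behaviour of Table 3: as λ shrinks, fewer pairs are admissible and the maximum matching shrinks too. The distance matrix below is a toy example; the maximum bipartite matching is counted with Kuhn's augmenting-path algorithm.

```python
def max_matching_rate(D, lam):
    """Matching-number rate under a threshold: pair (i, j) is admissible
    only when D[i][j] <= lam * max(D) (an assumed form of the paper's
    lambda-threshold). Counts the maximum bipartite matching among
    admissible pairs via augmenting paths (Kuhn's algorithm)."""
    n = len(D)
    dmax = max(max(row) for row in D)
    ok = [[D[i][j] <= lam * dmax for j in range(n)] for i in range(n)]
    match_of_y = [None] * n

    def try_augment(i, seen):
        for j in range(n):
            if ok[i][j] and j not in seen:
                seen.add(j)
                if match_of_y[j] is None or try_augment(match_of_y[j], seen):
                    match_of_y[j] = i
                    return True
        return False

    matched = sum(try_augment(i, set()) for i in range(n))
    return matched / n

D = [[3, 4, 6], [3, 4, 5], [5, 5, 2]]   # toy distance matrix
print(max_matching_rate(D, 1.0), max_matching_rate(D, 0.5))
```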
Matching number rate when λ and q vary in Case 1.

q      | λ ∈ [60%, 1] | λ = 50% | λ = 40% | λ = 30% | λ = 20% | λ = 10% | λ = 0
q = 1  | 100%         | 100%    | 90%     | 60%     | 40%     | 30%     | 10%
q = 2  | 100%         | 90%     | 60%     | 40%     | 40%     | 30%     | 10%
q = +∞ | 100%         | 70%     | 40%     | 40%     | 30%     | 10%     | 10%
We also take the numerical case in [22] as an example and name it Case 2, for [22] compares its own approach with the approach put forward in [14]. The preference sequences are complete, but the numbers of elements on the two sides are not equal; the matching sets and their preference sequences are shown in Table 4. We label the solving approaches for q = 1, q = 2, and q = +∞ as Approaches 1, 2, and 3, and the approaches in [22] and [14] as Approaches 4 and 5, respectively. Using these five approaches to solve the two-sided matching problem in Table 4 yields three distinct solutions, labeled Solutions A, B, and C. Each solution and its values on the six criteria mentioned above are presented in Table 5.
Matching sets X and Y and preference sequences P and Q in Case 2.

Xi | P                                     | Yi | Q
X1 | P1 = Y1, Y3, Y4, Y2, Y8, Y5, Y6, Y7   | Y1 | Q1 = (X2, X6), X1, X5, X3, X4
X2 | P2 = Y1, (Y4, Y3), Y2, Y7, Y6, Y8, Y5 | Y2 | Q2 = X1, X3, X2, X5, X6, X4
X3 | P3 = Y4, Y1, Y2, (Y3, Y5), Y7, Y6, Y8 | Y3 | Q3 = X5, X4, (X2, X6), X3, X1
X4 | P4 = Y3, Y1, Y4, Y5, Y6, Y2, Y7, Y8   | Y4 | Q4 = X3, X2, X4, X1, X5, X6
X5 | P5 = Y3, Y4, Y2, Y1, Y7, Y8, Y6, Y5   | Y5 | Q5 = X2, (X3, X6), X5, X1, X4
X6 | P6 = (Y2, Y6), Y1, Y4, Y5, Y7, Y8, Y3 | Y6 | Q6 = X1, X2, X5, X3, X4, X6
   |                                       | Y7 | Q7 = X6, X3, X2, X1, (X4, X5)
   |                                       | Y8 | Q8 = X3, (X5, X6), X1, X4, X2
Solutions and the value on six criteria in Case 2.

                 | Solution A                               | Solution B                               | Solution C
Solution (λ = 1) | X1~Y2, X2~Y1, X3~Y4, X4~Y3, X5~Y8, X6~Y7 | X1~Y2, X2~Y1, X3~Y4, X4~Y6, X5~Y3, X6~Y5 | X1~Y2, X2~Y1, X3~Y4, X4~Y5, X5~Y3, X6~Y6
Approach         | Approach 1                               | Approaches 2, 3, and 5                   | Approach 4
Criterion 1      | X: 50%, Y: 50%, Avg.: 50%                | X: 50%, Y: 50%, Avg.: 50%                | X: 67%, Y: 67%, Avg.: 67%
Criterion 2      | X: 67%, Y: 100%, Avg.: 83%               | X: 67%, Y: 83%, Avg.: 75%                | X: 67%, Y: 67%, Avg.: 67%
Criterion 3      | X: 100%, Y: 100%, Avg.: 100%             | X: 100%, Y: 100%, Avg.: 100%             | X: 100%, Y: 100%, Avg.: 100%
Criterion 4      | X: 3.17, Y: 1.50, X and Y: 2.33          | X: 2.83, Y: 2.00, X and Y: 2.42          | X: 2.08, Y: 2.75, X and Y: 2.42
Criterion 5      | X: 2.48, Y: 0.63, X and Y: 1.93          | X: 2.04, Y: 1.58, X and Y: 1.79          | X: 1.50, Y: 2.52, X and Y: 2.01
Criterion 6      | 13                                       | 6                                        | 10
When comparing these three solutions, is there one that definitely performs better than the rest? Unfortunately, we cannot identify one without a determinate evaluation system for decision making. It is obvious that each approach makes the value of its own optimal objective better than the other approaches; for example, Approach 1 minimizes the total sum of one-norm distances, which is equivalent to minimizing the total sum of preference ordinal values, so Approach 1 attains the minimum total sum of preference ordinal values among the approaches. Approaches 2 and 3 are similar: they perform better on the total sum of two-norm distances or positive-infinity-norm distances, respectively. Approach 4 achieves a better satisfaction degree, which depends heavily on the transition function it introduces. Approach 5 guarantees that the solution is a stable matching. We therefore list some criteria different from the approaches' original optimal objectives to evaluate the matching effect. The three solutions perform identically on Criterion 3 because the preference sequences are complete. Solutions A and B perform better than Solution C on Criterion 2 but worse on Criterion 1. Solution A has the lowest average preference ordinal value, and Solution B has the lowest standard deviation of preference ordinal value, when taking sets X and Y together. Meanwhile, Solution B also has the lowest total sum of disagreements of the two sides. It is worth mentioning that, when the number of matching pairs is constant, Criterion 4 is linearly equivalent to the total sum of one-norm distances. After comparing on these criteria, we find that Solutions A, B, and C, solved by five different approaches, have their own advantages. Without more specific evaluation information, we cannot regard any of them as better than the rest.
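The claim that no solution outperforms the rest can be checked directly with a Pareto-dominance filter over the averaged criteria values from Table 5, where higher is better on Criteria 1-3 and lower is better on Criteria 4-6:

```python
def pareto_front(solutions, senses):
    """Return the names of non-dominated solutions. senses[k] is +1
    when higher is better on criterion k, -1 when lower is better."""
    def dominates(a, b):
        # a dominates b: at least as good everywhere, strictly better somewhere
        ge = all(s * (x - y) >= 0 for s, x, y in zip(senses, a, b))
        gt = any(s * (x - y) > 0 for s, x, y in zip(senses, a, b))
        return ge and gt

    return [n for n, v in solutions.items()
            if not any(dominates(w, v)
                       for m, w in solutions.items() if m != n)]

# Averaged values of Criteria 1-6 from Table 5 (Case 2)
sols = {"A": [50, 83, 100, 2.33, 1.93, 13],
        "B": [50, 75, 100, 2.42, 1.79, 6],
        "C": [67, 67, 100, 2.42, 2.01, 10]}
senses = [+1, +1, +1, -1, -1, -1]
print(pareto_front(sols, senses))   # all three solutions are non-dominated
```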
We should still choose an appropriate approach according to the specific decision background, even though some different approaches yield the same solution, as with Solution B.
7. Conclusion
Two-sided matching based on preference sequences has different solutions under different optimal objectives. We list six criteria to evaluate matching solutions: concordance rate on the first item, concordance rate on the first three items, concordance rate on all the items, average preference ordinal value, standard deviation of the preference ordinal value, and total sum of disagreements of the two sides. All six criteria are determined by the preference ordinal values of the elements in the matching sets. As no approach outperforms the others on all or most of the criteria, we conclude that none of the approaches above applies to every two-sided matching decision situation. In fact, if we regard these six criteria as decision objectives, Solutions 1~3 and Solutions A~C are all noninferior solutions. There are three ways to deal with a multiobjective optimization problem, as mentioned in Section 4. As we have no further information about the weight of each objective, Approaches 1~5 are all conducted in the second way. The model we construct holds that maximizing the matching number and minimizing the total sum of preference distances are the most important and appropriate objectives for the two-sided matching problem when uncertain preference sequences are the only source of decision information.
Compared with other approaches, the model we construct extends the preference sequence from a complete and strict one to an incomplete and weak one, and adds the threshold constraint to guarantee that the matching effect of each matching pair is not too poor. By adjusting λ from 0 to 1, we can tighten or relax the threshold constraint. We also adopt different norms to represent different kinds of distance, which broadens the model's applicability: by choosing q as 1, 2, or +∞, this universal decision model can be used in different decision settings, which is an advantage over other approaches. Besides, the model we construct is succinct and explicit; it belongs to the class of 0-1 integer programming problems in linear optimization and can be solved with the branch-and-bound algorithm. We believe it is a better method for solving two-sided matching problems with uncertain preference sequences in actual decision situations.
Conflict of Interests
The authors declare that there is no conflict of interests regarding the publication of this paper.
Acknowledgments
The authors would like to thank the anonymous reviewers for their valuable and insightful comments for further improving the quality of this work. The authors also would like to thank Dr. Wu Xiang for his great and sincere help. This research was financially supported by Key Projects in the National Science & Technology Pillar Program during the Twelfth Five-Year Plan Period of China (2012BAJ05B06).
References
[1] M. Janssen and A. Verbraeck, "Comparing the strengths and weaknesses of Internet-based matching mechanisms for the transport market," Transportation Research Part E, vol. 44, no. 3, pp. 475–490, 2008.
[2] M. Sorensen, "How smart is smart money? A two-sided matching model of venture capital," The Journal of Finance, vol. 62, no. 6, pp. 2725–2762, 2007.
[3] H.-T. Lin, "A job placement intervention using fuzzy approach for two-way choice," Expert Systems with Applications, vol. 36, no. 2, pp. 2543–2553, 2009.
[4] D. K. Huang, H. N. Chiu, R. H. Yeh, and J. H. Chang, "A fuzzy multi-criteria decision making approach for solving a bi-objective personnel assignment problem," Computers & Industrial Engineering, vol. 56, no. 1, pp. 1–10, 2009.
[5] C. Haas, S. O. Kimbrough, S. Caton, and C. Weinhardt, "Preference-based resource allocation: using heuristics to solve two-sided matching problems with indifferences," Lecture Notes in Computer Science, vol. 8193, pp. 149–160, Springer, Berlin, Germany, 2013.
[6] A. Abdulkadiroğlu, P. A. Pathak, and A. E. Roth, "The New York City high school match," American Economic Review, vol. 95, no. 2, pp. 364–367, 2005.
[7] D. Gale and L. S. Shapley, "College admissions and the stability of marriage," The American Mathematical Monthly, vol. 69, no. 1, pp. 9–15, 1962.
[8] A. E. Roth, "Common and conflicting interests in two-sided matching markets," European Economic Review, vol. 27, no. 1, pp. 75–96, 1985.
[9] J. Edmonds, "Maximum matching and a polyhedron with 0,1-vertices," Journal of Research of the National Bureau of Standards B, vol. 69B, no. 1-2, pp. 125–130, 1965.
[10] H. W. Kuhn, "The Hungarian method for the assignment problem," Naval Research Logistics Quarterly, vol. 2, pp. 83–97, 1955.
[11] J. Munkres, "Algorithms for the assignment and transportation problems," Journal of the Society for Industrial and Applied Mathematics, vol. 5, no. 1, pp. 32–38, 1957.
[12] A. E. Roth, "On the allocation of residents to rural hospitals: a general property of two-sided matching markets," Econometrica, vol. 54, no. 2, pp. 425–427, 1986.
[13] V. Knoblauch, "Marriage matching and gender satisfaction," Social Choice and Welfare, vol. 32, no. 1, pp. 15–27, 2009.
[14] D. G. McVitie and L. B. Wilson, "The stable marriage problem," Communications of the ACM, vol. 14, no. 7, pp. 486–490, 1971.
[15] C.-P. Teo, J. Sethuraman, and W.-P. Tan, "Gale-Shapley stable marriage problem revisited: strategic issues and applications," Management Science, vol. 47, no. 9, pp. 1252–1267, 2001.
[16] C. van Raalte and H. Webers, "Spatial competition with intermediated matching," Journal of Economic Behavior & Organization, vol. 34, no. 3, pp. 477–488, 1998.
[17] D. Sarne and S. Kraus, "Managing parallel inquiries in agents' two-sided search," Artificial Intelligence, vol. 172, no. 4-5, pp. 541–569, 2008.
[18] I. Korkmaz, H. Gökçen, and T. Çetinyokuş, "An analytic hierarchy process and two-sided matching based decision support system for military personnel assignment," Information Sciences, vol. 178, no. 14, pp. 2915–2927, 2008.
[19] K. Iwama, D. Manlove, S. Miyazaki, and Y. Morita, "Stable marriage with incomplete lists and ties," Lecture Notes in Computer Science, vol. 1644, pp. 443–452, Springer, Berlin, Germany, 1999.
[20] D. F. Manlove, R. W. Irving, K. Iwama, S. Miyazaki, and Y. Morita, "Hard variants of stable marriage," Theoretical Computer Science, vol. 276, no. 1-2, pp. 261–279, 2002.
[21] M. Halldorsson, K. Iwama, S. Miyazaki, and Y. Morita, "Inapproximability results on stable marriage problems," Lecture Notes in Computer Science, vol. 2286, pp. 554–568, Springer, Berlin, Germany, 2002.
[22] M. Li, Z. Fan, and Y. Liu, "A method for two-sided matching based on preference ordinal information," vol. 21, no. 4, pp. 112–118, 2012.
[23] A. H. Land and A. G. Doig, "An automatic method of solving discrete programming problems," Econometrica, vol. 28, no. 3, pp. 497–520, 1960.