A New Distributed Approximation Algorithm for the Maximum Weight Independent Set Problem

Maximum weight independent set (MWIS) is a combinatorial optimization problem that naturally arises in many applications, especially wireless networking. This paper studies distributed approximation algorithms for finding the MWIS in a general graph. In the proposed algorithm, each node keeps exchanging messages with its neighbors, where each message contains partial solutions of the MWIS optimization program. A parameter H is introduced to achieve different tradeoffs between approximation accuracy and space complexity. Theoretical analysis shows that the proposed algorithm is guaranteed to converge to an approximate solution after a finite number of iterations; in particular, the proposed algorithm is guaranteed to converge to the optimal solution with H = +∞. Simulation results confirm the effectiveness of the proposed distributed algorithm in terms of weight sum, message size, and convergence performance.


Introduction
Consider a graph G = (V, E) with a set V of nodes and a set E of edges. For each node i ∈ V, there is a positive weight w_i. A subset of V can be represented by binary variables x_i, 1 ≤ i ≤ |V|, where x_i is 1 if node i is in the subset and 0 otherwise. A subset is called an independent set if no two nodes in the subset are connected by an edge. We are interested in finding the maximum weight independent set (MWIS) [1], which can be expressed as an integer program:

max Σ_{i ∈ V} w_i x_i
s.t. x_i + x_j ≤ 1, (i, j) ∈ E,
x_i ∈ {0, 1}, i ∈ V. (1)

The MWIS problem has been extensively studied in the literature. For example, it is known to be solvable in polynomial time for many cases including perfect graphs [2], interval graphs [2], disk graphs [3], claw-free graphs [4], fork-free graphs [5], trees [6], sparse random graphs [7, 8], circle graphs [9], and growth-bounded graphs [10]. Moreover, there has been extensive work on approximating the MWIS [11], and specialized algorithms have been developed for exactly computing the MWIS [12-15].
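On small instances, the integer program in (1) can be solved exactly by brute force, which gives a reference point for the distributed algorithm studied later. The sketch below enumerates all 2^|V| assignments; the 4-cycle graph and its weights are a hypothetical example chosen only for illustration.

```python
from itertools import product

def mwis_bruteforce(weights, edges):
    """Exactly solve the MWIS integer program (1) by enumerating all
    binary assignments x in {0,1}^|V| and keeping the best assignment
    that satisfies x_i + x_j <= 1 for every edge (i, j)."""
    n = len(weights)
    best_x, best_w = None, -1.0
    for x in product((0, 1), repeat=n):
        if all(x[i] + x[j] <= 1 for i, j in edges):  # independence check
            w = sum(weights[i] * x[i] for i in range(n))
            if w > best_w:
                best_x, best_w = x, w
    return best_x, best_w

# Hypothetical 4-node graph (a 4-cycle) with weights 3, 4, 5, 5.
x, w = mwis_bruteforce([3, 4, 5, 5], [(0, 1), (1, 2), (2, 3), (3, 0)])
print(x, w)  # (0, 1, 0, 1) 9 -- the second and fourth nodes, weight 4 + 5
```

Enumeration is exponential in |V|, of course; it serves only to check small examples against the distributed algorithm's output.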
Further, the MWIS problem naturally arises in many applications, especially wireless networking, which require distributed solutions. For example [16-18], in scenarios involving resource scheduling in wireless networks that lack a centralized infrastructure and where nodes can only communicate with local neighbors, the MWIS problem needs to be solved in a distributed manner. Fundamentally, any two wireless nodes that transmit on the same resource will interfere with each other if they are located close by. The scheduling problem is to decide which nodes should transmit on the given resource so that there is no interference and nodes with long queue lengths are given priority. If each node is given a weight equal to its queue length, it is optimal to schedule the set of nodes with the highest total weight. If a conflict graph is built, with an edge between each pair of interfering nodes, the scheduling problem is exactly the MWIS problem on the conflict graph. The lack of an infrastructure and the local nature of the communication make a distributed algorithm necessary.

Notation
This section introduces the notation used throughout this paper. The symbols are summarized in the List of Symbols.

Partial Solution.
Let X = {x_i, 1 ≤ i ≤ |V|} denote the full variable set of the MWIS problem. Assuming S is a subset of X, a partial solution of the MWIS problem is expressed as p = {x_i = v_i, x_i ∈ S}, where v_i is 0 or 1. We call S the partial variable set of the MWIS problem. The objective associated with p is expressed as f(p) = Σ_{x_i ∈ S} w_i v_i. Further, given a partial variable set S, the partial solution set P(S) is the set of all partial solutions over S. To help understand this, we use the graph shown in Figure 1 as an example. For this graph, there are four nodes and the weights are w_1 = 3, w_2 = 4, w_3 = 5, and w_4 = 5. The full variable set is X = {x_1, x_2, x_3, x_4}. Assuming the partial variable set S = {x_1, x_2} ⊆ X, then p = {x_1 = 1, x_2 = 0} is a partial solution of the MWIS problem. The objective associated with this partial solution is f(p) = 3 · 1 + 4 · 0 = 3.
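In code, a partial solution can be represented as a mapping from node indices to 0/1 values; the helper name `objective` below is our own, not from the paper.

```python
def objective(partial, weights):
    """Objective f(p) = sum of w_i * v_i over the variables assigned
    in the partial solution.  `partial` maps node index -> 0/1 value."""
    return sum(weights[i] * v for i, v in partial.items())

weights = {1: 3, 2: 4, 3: 5, 4: 5}   # w_1..w_4 from the Figure 1 example
p = {1: 1, 2: 0}                      # partial solution x_1 = 1, x_2 = 0
print(objective(p, weights))          # 3
```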

Combination. For any two partial solutions p_1 and p_2, if there exists a common variable x_i to which p_1 and p_2 assign different values, then we say these two partial solutions are contradictory; otherwise, if no such common variable exists, we say these two partial solutions are compatible. The combination of compatible partial solutions is the partial solution obtained by taking the union of their variable assignments. For more than two partial solutions, if there exist two partial solutions which are contradictory, we say these partial solutions are contradictory; otherwise, we say these partial solutions are compatible.
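The compatibility test and the combination step can be sketched as follows, assuming (as the section implies) that combining compatible partial solutions simply merges their variable assignments.

```python
def contradictory(p1, p2):
    """Two partial solutions are contradictory if some common
    variable is assigned different values by the two of them."""
    return any(p1[i] != p2[i] for i in p1.keys() & p2.keys())

def combine(p1, p2):
    """Combine two compatible partial solutions by merging their
    variable assignments; return None if they are contradictory."""
    if contradictory(p1, p2):
        return None
    merged = dict(p1)
    merged.update(p2)
    return merged

p1 = {1: 1, 2: 0}
p2 = {2: 0, 3: 1}
p3 = {2: 1}
print(combine(p1, p2))   # {1: 1, 2: 0, 3: 1}
print(combine(p1, p3))   # None -- they disagree on x_2
```

Note that two partial solutions with disjoint variable sets are always compatible, since there is no common variable to disagree on.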

Feasible Partial Solution.
Assuming S is a subset of X, define the associated constraint set C(S), which contains all constraints of (1) involving only variables in S. Then, define the associated feasible partial solution set F(S) = {p : p ∈ P(S), p ≻ C(S)}, where p = (v_i : x_i ∈ S) and p ≻ C(S) represents the notion that p satisfies the constraints in C(S). We have the following two lemmas.
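The feasible partial solution set F(S) can be enumerated directly for a small variable subset; this sketch uses the edge-constraint form of (1), keeping only the constraints whose two endpoints both lie in S.

```python
from itertools import product

def feasible_partial_solutions(S, edges):
    """Enumerate F(S): all 0/1 assignments of the variable subset S
    that satisfy every constraint x_i + x_j <= 1 whose two endpoint
    variables both belong to S."""
    S = sorted(S)
    members = set(S)
    constraints = [(i, j) for i, j in edges if i in members and j in members]
    feasible = []
    for values in product((0, 1), repeat=len(S)):
        p = dict(zip(S, values))
        if all(p[i] + p[j] <= 1 for i, j in constraints):
            feasible.append(p)
    return feasible

# Variables x_1, x_2 with an edge (1, 2): the assignment (1, 1) is excluded.
print(feasible_partial_solutions({1, 2}, [(1, 2), (2, 3)]))
```

Out of the four assignments of {x_1, x_2}, only the three with x_1 + x_2 ≤ 1 survive; the constraint (2, 3) is ignored because x_3 ∉ S.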
Lemma 1. For two subsets S_1 and S_2, one has

Proof. See Appendix A.

The Proposed Algorithm
The proposed algorithm is performed in an iterative manner. Let N_i denote the set of neighbors of node i. At each iteration t, node i sends a message m_i^t to each neighbor in N_i, where the message m_i^t is actually a partial solution set. The details of the proposed distributed algorithm are presented as follows.
3.1. Algorithm. The operation of node i consists of two steps, which are presented in sequence as follows.
Step 1 (the combination operation). Node i receives the message m_j^{t-1} from each neighbor j ∈ N_i and then generates a partial solution set by combining its own partial solutions with the compatible partial solutions contained in the received messages. Specifically, for t = 1, node i initializes m_i^1 from its own feasible partial solutions. For convenience, we express m_i^t as a partial solution set over S_i^t, where S_i^t is the partial variable set associated with node i in the t-th iteration.
Step 2 (the truncation operation). The number of elements of m_i^t can be very large. The parameter H is introduced to control it: node i retains at most H partial solutions of m_i^t and deletes the rest, where H is a parameter known by each node. Node i then broadcasts the message m_i^t to its neighbors.
Each node shall repeat the above steps until convergence. As proved in Section 4 (Theorem 5), this process converges after a finite number of iterations. Upon convergence, let m_i denote the message node i sends to each neighbor, and let S_i denote the partial variable set associated with m_i. Node i shall then determine its own variable value and estimate the achieved objective from m_i. If we set H = +∞, node i will not delete any partial solution in the truncation operation, and the algorithm will converge to the optimal solution, as proved in Section 4 (Theorem 6). However, the message size will increase exponentially with the node number |V|, as shown in Section 5 (Figure 2). On the other hand, for general H, node i will delete a part of the partial solutions in the truncation operation to keep the message size small, and the algorithm will converge to an approximate solution with acceptable loss of optimality, as shown in Section 5 (Figures 3 and 4). Therefore, the proposed algorithm can achieve different tradeoffs between optimality and complexity.
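The two-step iteration above can be sketched in a simplified synchronous form. This is not a reproduction of the paper's exact update rules: the initialization, the merge of neighbor messages, and the truncation rule (keep the H partial solutions with the largest objectives) are our assumptions for illustration.

```python
def mwis_distributed(weights, edges, H, iters):
    """Simplified synchronous sketch of the message-passing scheme:
    each node holds a set of partial solutions (dicts node -> 0/1),
    merges compatible partial solutions from its neighbors
    (combination), and keeps at most H of them ranked by objective
    (truncation).  H = None means no truncation (H = +infinity)."""
    n = len(weights)
    nbrs = {i: set() for i in range(n)}
    for i, j in edges:
        nbrs[i].add(j)
        nbrs[j].add(i)

    def obj(p):
        return sum(weights[i] * v for i, v in p.items())

    # Initialization: each node starts from its own variable's assignments.
    msgs = {i: [{i: 0}, {i: 1}] for i in range(n)}
    for _ in range(iters):
        new_msgs = {}
        for i in range(n):
            sols = msgs[i]
            for j in nbrs[i]:                      # combination step
                merged = []
                for p in sols:
                    for q in msgs[j]:
                        if all(p[k] == q[k] for k in p.keys() & q.keys()):
                            m = {**p, **q}
                            # drop assignments violating an edge constraint
                            if all(m.get(a, 0) + m.get(b, 0) <= 1
                                   for a, b in edges):
                                merged.append(m)
                sols = merged
            sols.sort(key=obj, reverse=True)       # truncation step
            new_msgs[i] = sols if H is None else sols[:H]
        msgs = new_msgs
    # Read out the best surviving partial solution across all nodes.
    return max((p for sols in msgs.values() for p in sols), key=obj)

best = mwis_distributed([3, 4, 5, 5], [(0, 1), (1, 2), (2, 3), (3, 0)],
                        H=None, iters=2)
print(best)  # a partial solution with x_2 = x_4 = 1 (0-indexed 1 and 3)
```

With H = None the sketch reaches the optimal set of this hypothetical 4-cycle; passing a small H keeps the per-node lists short at the cost of possibly discarding the optimum, mirroring the tradeoff described above.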

Example 1:
The Synchronous Case. To illustrate the above procedure, we use the graph shown in Figure 1 as an example. It can be verified that the maximum weight independent set is {node 2, node 4} (i.e., x_1 = 0, x_2 = 1, x_3 = 0, and x_4 = 1) and the associated objective is 9. For this example, we assume all nodes operate synchronously. Each node i sends the message m_i^t to its neighbors at the t-th iteration. We consider H = +∞ and H = 3, respectively.
Comparing these two cases, we can observe that, for H = 3, each node deletes some partial solutions in the truncation operation to keep the message size small.

Example 2:
The Asynchronous Case. The proposed algorithm can be run fully asynchronously. This is because the combination operation, which is the core of the algorithm, does not require all nodes to act synchronously. We use the graph shown in Figure 1 again as the example. For this example, we assume node 1 is not synchronized with the other nodes; specifically, node 1 does not generate messages until t = 2. We use the proposed algorithm to solve this problem, assuming H = +∞ so that no partial solution is deleted in the truncation operation.

Theoretical Analysis
This section presents the theoretical analysis of the proposed distributed algorithm.
Recall that S_i^t is the partial variable set associated with node i in the t-th iteration. Firstly, we have the following two lemmas.

Lemma 3. For general H, one has
Proof. We prove these two equations by induction on t. For t = 1, the equations hold according to the algorithm. Assuming the equations hold at iteration t, it follows from the combination and truncation operations that (22) is correct for t + 1. This completes the proof of this lemma.

Lemma 4. For H = +∞, one has
Proof. We prove this equation by induction on t. For t = 1, since H = +∞, the equation holds according to the algorithm. Assuming the equation holds at iteration t, it follows that (23) is correct for t + 1. This completes the proof of this lemma.
Based on these two lemmas, we have the following two theorems. Let diam(G) denote the diameter of the graph G.

Theorem 5. For general H, the proposed algorithm will converge to a feasible solution after 2 × diam(G) + 1 iterations.
The theoretical analysis shows that the proposed algorithm is guaranteed to converge to an independent set and that the tradeoff between approximation accuracy and algorithm complexity is controlled by the parameter H, which is confirmed by the simulation results reported in Section 5.

Performance Evaluation
In this section, we evaluate the proposed algorithm via computer simulations.We first describe the simulation methodology and then present and analyze the simulation results.
We develop a simulator based on MATLAB. In the simulator, a total of |V| nodes are randomly distributed over a 10-meter × 10-meter flat field. For each node, the weight is a random number ranging from 0 to 1. For any two nodes, if the distance between them is less than 6 meters, there is an edge connecting them. Thus, we obtain a general graph. For this graph, we use five methods to find the MWIS. The first is the full version of the proposed algorithm (H = +∞), the second is the general version of the proposed algorithm with the parameter H ranging from 5 to 30, the third is the greedy algorithm proposed in [19], the fourth is the message-passing algorithm proposed in [30], and the fifth is the MATLAB function bintprog applied to the integer program in (1). For the first two methods, we collect four metrics that characterize the performance of the proposed algorithm: (1) the weight sum of the output set, (2) the number of iterations needed to converge, (3) the diameter of the graph, and (4) the message size, that is, the number of elements contained in each message, upon convergence. For the other three methods, we only collect the weight sum of the output set. The procedure is repeated 1000 times for each value of |V|, each time with a newly generated graph and weights, and the results are averaged.
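The simulation topology described above can be reproduced as follows. This is a sketch of the stated setup (10 m × 10 m field, uniform weights in [0, 1], an edge whenever the distance is below 6 m), not the authors' MATLAB code; the function name and seeding are our own.

```python
import math
import random

def generate_instance(n, field=10.0, radius=6.0, seed=None):
    """Random geometric graph: n nodes placed uniformly at random on a
    field x field square, uniform weights in [0, 1], and an edge
    between any two nodes closer than `radius`."""
    rng = random.Random(seed)
    pos = [(rng.uniform(0, field), rng.uniform(0, field)) for _ in range(n)]
    weights = [rng.random() for _ in range(n)]
    edges = [(i, j) for i in range(n) for j in range(i + 1, n)
             if math.dist(pos[i], pos[j]) < radius]
    return weights, edges

weights, edges = generate_instance(10, seed=1)
print(len(weights), len(edges))
```

With a 6 m radius on a 10 m square, such graphs are fairly dense, which is what makes the message size the critical resource in the experiments.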
We first report that the proposed algorithm is convergent for all simulation runs and that the output set is always independent. We then present the simulation results, which are organized in four figures. Figure 2 shows the average message size upon convergence as the node number |V| varies. It is worth mentioning that, since the message-passing algorithm does not always converge and its output set is not always independent, we only collect its results when it converges and the output set is independent. In Figure 2, for H = +∞, the message size grows exponentially with the node number, while for general H the message size is well controlled. Figures 3 and 4 show the average weight sum performance as the node number |V| and the parameter H vary. In these two figures, the proposed algorithm with general H converges to an approximate solution with acceptable loss of optimality when the parameter H is appropriately selected. Additionally, in Figure 3, the curve corresponding to H = +∞ coincides with the curve of the optimal solution; thus, the optimality part of Theorem 6 is verified. Finally, Figure 5 shows the average number of iterations needed to converge as the node number |V| varies. In this figure, the curve corresponding to H = +∞ is always under the upper bound predicted by Theorem 6 (i.e., the diameter plus 1), while the curves corresponding to general H are always under the upper bound predicted by Theorem 5 (i.e., two times the diameter plus 1); thus, Theorem 5 and the convergence part of Theorem 6 are verified. In summary, the simulation results show that the proposed algorithm with H = +∞ is a distributed optimal algorithm, while the proposed algorithm with general H is a distributed approximation algorithm which, by changing the parameter H, can achieve different tradeoffs between approximation accuracy and space complexity.
Finally, we empirically study how to set the value of the parameter H, that is, how to estimate a good H for a given weighted graph. We propose to set H to an integer multiple of the number of nodes. The simulation results are summarized in Table 4. According to these results, we suggest setting H to 2-4 times the number of nodes.

Conclusions
A new distributed algorithm for finding the maximum weight independent set in a general graph has been presented. The proposed algorithm runs iteratively: each node receives a message from each neighbor, updates its own message, and sends it to each neighbor. It is shown that the proposed algorithm converges to an independent set after a finite number of iterations and can achieve different tradeoffs between approximation accuracy and space complexity. In particular, the proposed algorithm with H = +∞ is guaranteed to converge to the optimal solution.

A. Proof of Lemma 1
Proof. Assume (

Figure 1: An example of the graph.

Figure 2: The average message size performance.

Figure 3: The average weight sum performance versus the node number.

Figure 4: The average weight sum performance versus the parameter H.

Figure 5: The average iteration number performance.

Table 4: Simulation results with different H.