Congestion Control and Traffic Scheduling for Collaborative Crowdsourcing in SDN Enabled Mobile Wireless Networks

1College of Computer Science and Engineering, Northeastern University, Shenyang 110819, China 2Key Laboratory of Vibration and Control of Aero-Propulsion System of Ministry of Education, Northeastern University, Shenyang 110819, China 3College of Jangho Architecture, Northeastern University, Shenyang 110819, China 4State Key Laboratory of Rolling and Automation, Northeastern University, Shenyang 110819, China


Introduction
In recent years, crowdsourcing, a term originally proposed by American journalist Jeff Howe in 2006, has received extensive attention from industry and academia. Crowdsourcing means that tasks previously performed by employees of a company or institution are outsourced to an unspecified public over the network in a free and voluntary form. Crowdsourcing tasks are usually undertaken by individuals. If a task requires several people to collaborate, it may instead take the form of individual production that depends on open source. Many tasks, such as image labeling, commodity evaluation, and entity recognition, cannot be achieved through a simple algorithm. These problems are difficult for machines to handle but can be solved with crowdsourcing. Crowdsourcing publishes tasks directly to the Internet and gathers unknown people on the Internet to solve problems that are difficult for computers alone; examples include Wikipedia, reCAPTCHA [1], image tagging, and language translation. According to the form of public participation, crowdsourcing can be divided into collaborative crowdsourcing and crowdsourcing contests. In collaborative crowdsourcing, tasks require collaboration among the participants, who usually receive no reward. Crowdsourcing can effectively solve machine-hard tasks by leveraging both machines and a large group of people on the web.
Software-Defined Networking (SDN) is a new network architecture developed from OpenFlow technology [2]. SDN allows data forwarding to be controlled programmatically through software, which ultimately enables flexible control of data transfer. Because SDN has outstanding advantages in flow control, we use it to address congestion control and traffic scheduling [3] in crowdsourcing-based Mobile Wireless Networks (MWN).
Currently, a number of crowdsourcing-based mobile applications targeted at real-time services and recommendation, for example, Uber, Elance, Amazon, and Airbnb, have been deployed in mobile networks and the Internet of Things (IoT). Their frequent information exchanges and data transmissions inject heavy traffic into current communication networks [4], which poses great challenges for congestion control and traffic scheduling [5] in Mobile Wireless Networks. To address these challenges, this paper focuses on the traffic scheduling and load balancing problem in software-defined Mobile Wireless Networks for collaborative crowdsourcing. This paper first presents a network model for the traffic engineering problem and then designs a hybrid routing and forwarding scheme as well as a congestion control algorithm to obtain a feasible solution. To validate the performance of the proposals, extensive simulation experiments are carried out.
The rest of this paper is organized as follows. Related work in recent years is reviewed in Section 2. The network model is formulated in Section 3. In Section 4, the design of the congestion control and traffic scheduling schemes is presented in detail. Simulation results and analysis are discussed in Section 5. Finally, conclusions are given in Section 6.

Related Work
At present, some researchers have summarized the research work of crowdsourcing from different perspectives.
Yuen et al. in [6] summarized the progress of crowdsourcing in terms of applications, algorithms, performance, and data sets. Kittur et al. in [7] explained the challenges of crowdsourcing in 12 aspects such as synchronous collaboration, real-time response, and dynamic machines. Doan et al. in [8] reviewed crowdsourcing systems on the World Wide Web and classified them according to the problem type and the way of collaboration. Zhao and Zhu in [9] reviewed crowdsourcing research from four perspectives: information, technology, the public, and organization. Kittur et al. in [10] studied how to decompose complex tasks and how to integrate workers' answers to complete the initial tasks, and proposed a MapReduce framework to achieve the decomposition of tasks. However, their method is only suitable for specific types of tasks, its general effect is unsatisfactory, and scalability still needs to be solved. References [11][12][13] focus on combining machines and humans for the join operation in a crowdsourcing environment: a machine algorithm first filters the problems, and the remaining problems are then assigned to the workers. The authors of [12] used the transitive relationships of entities to further reduce the number of tasks, thereby saving cost. Lofi et al. in [14] reduced the cost of the task by preprocessing data sets containing missing data through the "error model" and getting the answers from workers. Sakamoto et al. in [15] studied the ways in which crowdsourcing participants often interact in different task types. Heer et al. in [16] studied how to carry out a survey and, through questionnaires, identified interface designs that are more suitable for crowdsourcing workers. The authors in [17] proposed a task allocation method based on random map generation and messaging. The limitation of this method is that it can only be used for a specific type of task, depending on the difficulty of the task. However, crowdsourcing platforms host various types of tasks, and some tasks, such as language translation, need special professional knowledge. Liu et al. in [18] implemented a data analysis system whose main goal is to ensure the quality of the results: it first uses a forecasting model to determine the number of tasks to assign and then, during task execution, uses online quality assessment to decide whether to terminate the task ahead of time, thus saving cost and time. The authors in [19] proposed a new worker model for crowdsourcing, through which the workers' quality can be computed accurately and in a timely manner. For big data tasks, the number of tasks affects the overall cost. The number of tasks can be reduced by designing the tasks effectively, thus saving cost. Marcus et al. in [20] proposed a strategy that transforms the problem of each task into multiple subproblems. However, when a task contains a large number of subproblems, the price of the task needs to be raised; otherwise, only a small number of workers are likely to select the task. That is to say, even though such an approach reduces the number of tasks, the overall cost is not guaranteed to be reduced. The authors of [21] presented a comprehensive system model of Crowdlet that defines the task, worker arrival, and worker ability models. In [22], the authors designed an approximate task allocation algorithm that is near optimal with polynomial-time complexity and used it as a building block to construct a complete randomized auction mechanism. Compared with deterministic auction mechanisms, the proposed randomized auction mechanism increases the diversity of contributing users for a given sensing job. The authors of [23] presented a new participant recruitment strategy for vehicle-based crowdsourcing. This strategy guarantees that the system can perform well with the currently recruited participants for a period of time in the future. The authors in [24] focused on a more realistic scenario where users arrive one by one online in a random order. The authors in [25] focused on how to efficiently distribute a crowdsourcing task and recruit participants based on D2D communications. In [26], existing definitions of crowdsourcing were analyzed to extract common elements and to establish the basic characteristics of any crowdsourcing initiative. Based on these existing definitions, an exhaustive and consistent definition of crowdsourcing was presented and contrasted in eleven cases. In [27], the authors defined traffic engineering as a large-scale network engineering effort that addresses performance evaluation and network optimization. In [28], traffic engineering is further explained as a route optimization method that improves the quality of network service by avoiding link congestion in the network.

Network Model
After a crowdsourcing task has selected its assignment object in the mobile wireless network, there may be several possible next hops, and different next hop choices affect the load balancing in the network. As shown in Figure 1, if the crowdsourcing task source node forwards the task to the destination node through next hop 1, the maximum link utilization in the network is 0.6. If next hop 2 is chosen to forward the task, the maximum link utilization in the network is 0.4. Therefore, the SDN controller needs to calculate the next hop periodically to achieve load balancing in the network.
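The next hop comparison above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the link names, capacities, background loads, and the demand value are hypothetical numbers chosen so that the two candidates reproduce the 0.6 and 0.4 values mentioned for Figure 1; they are not taken from the figure itself.

```python
def max_link_utilization(loads, capacities):
    """Largest load/capacity ratio over all links in the network."""
    return max(loads.get(e, 0) / capacities[e] for e in capacities)


def best_next_hop(base_loads, capacities, candidate_paths, demand):
    """Return (next_hop, utilization) for the candidate whose path gives
    the smallest maximum link utilization after adding the new task."""
    best_hop, best_u = None, None
    for hop, path in candidate_paths.items():
        loads = dict(base_loads)          # copy the background traffic
        for e in path:
            loads[e] = loads.get(e, 0) + demand
        u = max_link_utilization(loads, capacities)
        if best_u is None or u < best_u:
            best_hop, best_u = hop, u
    return best_hop, best_u


# Hypothetical four-link network: source s, next hops h1/h2, destination d.
capacities = {("s", "h1"): 10, ("h1", "d"): 10, ("s", "h2"): 10, ("h2", "d"): 10}
base_loads = {("s", "h1"): 4, ("h1", "d"): 2, ("s", "h2"): 2, ("h2", "d"): 1}
candidate_paths = {
    "h1": [("s", "h1"), ("h1", "d")],
    "h2": [("s", "h2"), ("h2", "d")],
}
choice = best_next_hop(base_loads, capacities, candidate_paths, demand=2)
print(choice)
```

With these assumed numbers, forwarding through h1 raises the busiest link to 0.6, whereas forwarding through h2 keeps the maximum at 0.4, so h2 is selected.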
Because a routing infrastructure already exists in current mobile wireless networks, replacing all the wireless nodes with SDN nodes would require a great deal of manpower and resources [29]. Therefore, we consider a scenario in which SDN nodes are only partially deployed in the mobile wireless network. We assume that the nodes in the mobile wireless network run the OSPF protocol, so the SDN controller can collect the load information of the links in the network, and the SDN nodes can obtain the utilization of all the links in the network. When a crowdsourcing task leaves an SDN node, it may pass through other SDN nodes on its forwarding path. These nodes can also have multiple next hops, as shown in Figure 2, where the yellow nodes represent SDN nodes and the white nodes represent non-SDN nodes. The solid lines represent the forwarding path, and the dotted lines represent other possible forwarding paths. Assume that the current task node 1 is an SDN node and that it selects node 2 as its next hop. There is another SDN node, node 4, on the forwarding path, and the next hop of node 4 may be node 5 or node 6. Through the coordination of multiple SDN nodes distributed in the Mobile Wireless Network, multiple possible forwarding paths are available to carry the crowdsourcing tasks and achieve load balancing for the whole network.
Therefore, we first need to find all the possible paths over which a task packet can be forwarded. We use a tree structure to build all possible forwarding paths [30]. First, we take the source node of the task as the root of the tree. Each node in the tree is either an SDN node or a non-SDN node. An SDN node can have multiple child nodes, whereas a non-SDN node has only one child node. We assume that, when the task packet is forwarded, each node in the network injects an identity packet for the current node. When the task passes through an SDN node, we check this identity packet and remove every branch path that contains nodes already recorded in it, which ensures that no loop is generated when the task packet is forwarded. For the example in Figure 2, the tree of all possible forwarding paths can be constructed as shown in Figure 3.
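The tree construction described above can be sketched as a depth-first enumeration. The following Python fragment is a minimal illustration, not the authors' implementation: the function name, the dictionaries `sdn_next_hops` and `ospf_next_hop`, and the seven-node topology are all hypothetical, and the tuple of visited nodes plays the role of the identity packet used for loop pruning.

```python
def enumerate_paths(node, dest, sdn_next_hops, ospf_next_hop, visited=()):
    """Return all loop-free forwarding paths from node to dest.

    sdn_next_hops: SDN node -> list of candidate next hops (multiple children)
    ospf_next_hop: non-SDN node -> its single OSPF next hop (one child)
    visited: nodes already traversed, standing in for the identity packet
    """
    visited = visited + (node,)
    if node == dest:
        return [list(visited)]
    if node in sdn_next_hops:
        candidates = sdn_next_hops[node]
    else:
        nxt = ospf_next_hop.get(node)
        candidates = [nxt] if nxt is not None else []
    paths = []
    for nxt in candidates:
        if nxt in visited:
            continue  # prune the branch: it would create a loop
        paths.extend(
            enumerate_paths(nxt, dest, sdn_next_hops, ospf_next_hop, visited)
        )
    return paths


# Hypothetical topology: nodes 1 and 4 are SDN nodes, the rest run OSPF only.
sdn_next_hops = {1: [2, 3], 4: [5, 6]}
ospf_next_hop = {2: 4, 3: 7, 5: 7, 6: 7}
paths = enumerate_paths(1, 7, sdn_next_hops, ospf_next_hop)
for p in paths:
    print(p)
```

Each returned list corresponds to one root-to-leaf branch of the forwarding tree; only SDN nodes contribute more than one branch.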
In the description above, there is only one crowdsourcing task in the mobile wireless network, but, in reality, there can be multiple crowdsourcing tasks in the network [31].
Assume that there is no interference between nodes and links. Suppose $M$ is the crowdsourcing task matrix and the task set is $(m^{1}_{s_1 d_2}, m^{2}_{s_1 d_2}, \ldots, m^{k}_{sd})$, where $s$ is the crowdsourcing task source node and $d$ is the crowdsourcing task destination node. The amount of a task is $f(m^{k}_{sd})$. Define $c(e)$ as the capacity of link $e$. Define the link utilization as $u(e)$, which can be formulated as

$$u(e) = \frac{1}{c(e)} \sum_{m^{k}_{sd} \,\text{on}\, e} f(m^{k}_{sd}), \qquad (1)$$

where the sum runs over all tasks whose forwarding path contains link $e$. When a crowdsourcing task passes through a node $v$, all possible forwarding paths are added to the set $P^{m^{k}_{sd}}_{v}$. There are two scenarios in the mobile wireless network. When crowdsourcing tasks pass through non-SDN nodes, they can only be forwarded to the next hop in accordance with the OSPF protocol. When crowdsourcing tasks pass through SDN nodes, multiple forwarding paths are possible. In the mobile wireless network, we can only control the SDN nodes. Therefore, when the crowdsourcing task traffic passes through an SDN node, the problem we need to solve is as follows: given $M$, $f(m^{k}_{sd})$, $c(e)$, and the candidate paths $p$, how do we schedule each task over a path $p$ with link capacity $c(e)$ so as to minimize the maximum link utilization $U$ and thereby achieve load balancing? We describe it as the problem $(M, f(m^{k}_{sd}), c(e), p, U)$. Given the definitions above, the problem can be formalized as follows:

$$\text{Minimize } U \qquad (2)$$

subject to

$$\sum_{m^{k}_{sd} \,\text{on}\, e} f(m^{k}_{sd}) \le U \cdot c(e) \quad \forall e, \qquad (3)$$

$$f(m^{k}_{sd}) \ge 0 \quad \forall p, \qquad (4)$$

$$m^{k}_{sd} \ge 0. \qquad (5)$$

Formula (3) indicates that the amount of task on any link is less than or equal to the maximum link utilization in the network multiplied by the link capacity. Formula (4) indicates that the amount of task on any forwarding path should be nonnegative. Formula (5) indicates that a task should be nonnegative. When the crowdsourcing task traffic passes through a non-SDN node, we use the OSPF protocol to perform the next hop routing. When the crowdsourcing task traffic passes through an SDN node, we face the problem $(M, f(m^{k}_{sd}), c(e), p, U)$. Consider the special case of this problem where $p = 1$ and $U = 1$. The problem $(M, f(m^{k}_{sd}), c(e), 1, 1)$ is in NP, and the well-known 0-1 knapsack problem [32] can be reduced to it. Therefore, $(M, f(m^{k}_{sd}), c(e), 1, 1)$ is NP-hard, and so is the more general problem $(M, f(m^{k}_{sd}), c(e), p, U)$. This means that the computation cannot be completed in a reasonable time for large networks. Therefore, we develop a heuristic algorithm with polynomial-time complexity for this problem.

Congestion Control and Traffic Scheduling Schemes
For the algorithm, we make the following assumptions:
(1) The SDN control center can obtain the relevant information in the network correctly and in a timely manner.
(2) The network topology is stable over a short time, and we do not consider interference in the wireless network.
(3) Apart from the SDN nodes, all the nodes in the mobile wireless network run the standard OSPF protocol.
(4) The mobile wireless network has only one SDN controller.
(5) In the process of routing, an SDN node selects only one forwarding path when processing a crowdsourcing task flow.
(6) The task flow is forwarded hop by hop.
In this case, we assume that none of the links in the network will be congested; that is, the crowdsourcing task traffic on a link never exceeds the capacity of the link. Therefore, when an SDN node forwards crowdsourcing tasks, we can sort the tasks according to their load. Then, following a greedy algorithm, each crowdsourcing task is assigned to the link that keeps the maximum link utilization in the network as small as possible.
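The sort-then-greedy idea can be sketched as follows. This is a hedged illustration rather than the paper's Algorithm 1: the function name `greedy_assign`, the data layout, and the two-link example are assumptions introduced here, and ties are broken by taking the first candidate path.

```python
def greedy_assign(tasks, candidate_paths, capacities, base_loads=None):
    """Assign tasks (sorted by ascending size) to candidate paths, each time
    choosing the path that yields the smallest maximum link utilization.

    tasks: task id -> size; candidate_paths: task id -> list of paths,
    where a path is a tuple of link ids; capacities: link id -> capacity.
    """
    loads = dict(base_loads or {})
    assignment = {}
    for task, size in sorted(tasks.items(), key=lambda kv: kv[1]):
        best_path, best_u = None, None
        for path in candidate_paths[task]:
            trial = dict(loads)
            for e in path:
                trial[e] = trial.get(e, 0) + size
            u = max(trial.get(e, 0) / capacities[e] for e in capacities)
            if best_u is None or u < best_u:
                best_path, best_u = path, u
        for e in best_path:               # commit the chosen path
            loads[e] = loads.get(e, 0) + size
        assignment[task] = best_path
    max_u = max(loads.get(e, 0) / capacities[e] for e in capacities)
    return assignment, max_u


# Hypothetical instance: two parallel links of capacity 10, three tasks.
capacities = {"a": 10, "b": 10}
tasks = {"t1": 2, "t2": 3, "t3": 4}
candidate_paths = {t: [("a",), ("b",)] for t in tasks}
assignment, max_u = greedy_assign(tasks, candidate_paths, capacities)
print(assignment, max_u)
```

In this toy instance the greedy result has a maximum utilization of 0.6, while placing t3 alone on one link and t1 and t2 on the other would give 0.5. The gap illustrates that the scheme is a polynomial-time heuristic and does not guarantee the optimum of the min-max problem.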
The hybrid routing and forwarding algorithm is given in Algorithm 1.
Algorithm 1: Hybrid routing and forwarding.
(1) Begin
(2) Input: mobile wireless network topology graph G(V, E), crowdsourcing task flow matrix M;
(3) for each row in M do
(4) if v is a non-SDN node then
(5) assign the task flow to its next hop forwarding link;
(6) repeat
(7) ∀v ∈ V, v++;
(8) until all non-SDN nodes are traversed;
(9) if v is an SDN node then
(10) sort the tasks in ascending order according to the load of the task flow;
(11) compute all possible forwarding paths P;
(12) use the greedy algorithm to assign each task to its next hop forwarding link;

Since we define the utilization of a link as the ratio of the data flow on the link to the link capacity, the link utilization exceeds 1 whenever the data flow is greater than the link capacity. The maximum link utilization of the network would then be greater than 1, which is contrary to the idea of load balancing in traffic engineering. Therefore, the crowdsourcing task traffic matrix cannot be generated arbitrarily. We generate the task flow sizes with formula (6), following the method described in [33]. Formula (6) is expressed in terms of the size of the traffic flow from the source node s to the destination node d, a random number in the interval [0, 1], the link capacity between the source node s and its neighboring node i, the link capacity between the destination node d and its neighboring node j, and the capacity of the link (i, j). We generate 40 sets of crowdsourcing task flow matrices as simulation data according to formula (6). Under these conditions, we simulate the proposed algorithm.

Design of Congestion Control Algorithm.
As mentioned above, we assumed that there is no congestion in the mobile wireless network, but in fact congestion is inevitable when crowdsourcing tasks are forwarded at scale. Therefore, the problem $(M, f(m^{k}_{sd}), c(e), p, U)$ becomes $(M, f(m^{k}_{sd}), c(e), e_p, 1)$, because the maximum utilization of a link is 1 and $e_p$ is the first link of the possible path $p$. In this case, when an SDN node forwards crowdsourcing tasks, it first needs to select a subset of its task set $N = \{1, 2, \ldots, n\}$. The selected tasks are then assigned to the possible forwarding links $e_p$ so that the total amount of assigned tasks is maximized under the capacity limit of each link. This is a multiple knapsack problem. The Multiple Knapsack Problem (MKP) refers to selecting a subset of items from an item set $\{1, 2, \ldots, n\}$ to be loaded into the knapsacks $\{1, 2, \ldots, m\}$. The purpose is to maximize the total value of the selected items while the total volume loaded into each knapsack does not exceed its capacity. Here we use the AFSA algorithm in [34] to solve this problem. The Artificial Fish Swarm Algorithm (AFSA) is a swarm intelligence optimization algorithm inspired by the behavior of fish schools, and the behavior of the artificial fish makes it suitable for solving large-scale complex optimization problems. We assign as many crowdsourcing tasks as possible to a link without exceeding the link capacity. According to this heuristic rule, if we want to assign task $i$ to link $j$, there are two possibilities. One is that the link capacity $c(j) < f(i)$, in which case we cannot assign the task to the link. The other is that the link capacity $c(j) \ge f(i)$. Let $c_r(j)$ represent the remaining capacity of link $j$. There are two conditions:
(1) $c_r(j) \ge f(i)$. If task $i$ has never been assigned to any link, then task $i$ is assigned to link $j$, and $c_r(j) = c_r(j) - f(i)$. If task $i$ was assigned to link $k$ ($k \ne j$), we first execute TakeOut($i$, $k$), which means taking task $i$ out of link $k$ so that $c_r(k) = c_r(k) + f(i)$. Then we assign task $i$ to link $j$, and the remaining capacity of link $j$ decreases by $f(i)$.
(2) $c_r(j) < f(i)$. We execute TakeOut($l$, $j$), where $l$ is any task that is assigned to link $j$, until $c_r(j) \ge f(i)$, and then we execute (1).
The artificial fish is always kept at a feasible solution and close to the constraint boundary. The artificial fish are optimized effectively under the guidance of the behavior strategy through the preying, following, and swarming behaviors.

Algorithm 2: Congestion control.
(1) Begin
(2) Input: mobile wireless network topology graph G(V, E), crowdsourcing task flow matrix M;
(3) for each row in M do
(4) if v is a non-SDN node then
(5) assign the task flow to its next hop forwarding link;
(6) repeat
(7) ∀v ∈ V, v++;
(8) until all non-SDN nodes are traversed;
(9) if v is an SDN node then
(10) compute all possible forwarding paths;
(11) compute the link capacity of the first link of each possible path;
(12) use the AFSA algorithm to assign the task flow to its next hop forwarding link;
(13) repeat
(14) ∀v ∈ V, v++;
(15) until all SDN nodes are traversed;
(16) update the crowdsourcing task traffic matrix, increase the associated counter by 1, and return to step (3).
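The two assignment conditions and the TakeOut operation can be sketched in Python as below. This is a simplified illustration of the repair rule only, not of the full AFSA search in [34]: the function names, the dictionaries, and the choice to evict the earliest-assigned task (the text allows any task on the link) are assumptions made for this example.

```python
def take_out(task, link, assigned, remaining, size):
    """TakeOut: remove task from link and give its capacity back."""
    assigned[link].remove(task)
    remaining[link] += size[task]


def assign(task, link, assigned, remaining, size, capacity):
    """Try to place task on link following conditions (1) and (2).
    Returns False when the link can never hold the task."""
    if capacity[link] < size[task]:
        return False                      # task is larger than the link
    if task in assigned[link]:
        return True                       # already placed on this link
    # Condition (2): free capacity by taking out tasks already on the link.
    while remaining[link] < size[task]:
        take_out(assigned[link][0], link, assigned, remaining, size)
    # Condition (1): if the task sits on another link, take it out first.
    for other, tasks in assigned.items():
        if other != link and task in tasks:
            take_out(task, other, assigned, remaining, size)
    assigned[link].append(task)
    remaining[link] -= size[task]
    return True


# Hypothetical instance: two links of capacity 10 and three tasks.
capacity = {"j": 10, "k": 10}
size = {"a": 6, "b": 5, "c": 4}
remaining = dict(capacity)
assigned = {"j": [], "k": []}

assign("a", "j", assigned, remaining, size, capacity)  # a fits on j
assign("b", "j", assigned, remaining, size, capacity)  # evicts a, then places b
assign("c", "j", assigned, remaining, size, capacity)  # fits next to b
assign("b", "k", assigned, remaining, size, capacity)  # moves b from j to k
print(assigned, remaining)
```

Because every placement is preceded by the necessary TakeOut calls, the remaining capacity of a link never becomes negative, which mirrors the statement that a solution is always kept feasible.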
Since we can only control the SDN nodes in the network, we first account for the traffic that non-SDN nodes place on the forwarding links. The remaining capacity of each link is the knapsack capacity in our multiple knapsack problem. We also assume that a crowdsourcing task flow cannot be split. When the amount of tasks on a link exceeds the capacity of the link, the excess task is discarded and needs to be sent again, and we record the number of times that each crowdsourcing task has been forwarded. Finally, we evaluate our congestion control algorithm by calculating the link throughput, using formula (7) to compute the throughput of the network. The congestion control algorithm is given in Algorithm 2.

Numerical Results and Analysis
The simulation is coded in C/C++ and carried out mainly with Visual Studio 2010. We build our mobile wireless network on the IEEE 802.11b wireless standard [35], which has a maximum bandwidth of 11 Mbps, so the maximum link capacity can be set to 11 Mbps. We use the method described in [36] to set the link capacities in the mobile wireless network. First, all the nodes are divided into two classes according to their degree: class A contains the nodes whose degree is less than 3, and class B contains all other nodes. If both end nodes of a link belong to class B, the link capacity is 11 Mbps; if at least one end node belongs to class A, the link capacity is set to 6 Mbps. The simulated mobile wireless network topologies are shown in Figures 4 and 5, where yellow nodes represent SDN nodes and white nodes represent non-SDN nodes. We run the experiments while gradually increasing the number of SDN nodes.
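The degree-based capacity rule can be sketched as follows. The function name, parameter names, and the small edge list are hypothetical; only the threshold of degree 3 and the 11 Mbps and 6 Mbps values come from the text.

```python
def link_capacities(edges, high=11.0, low=6.0, degree_threshold=3):
    """Assign a capacity (in Mbps) to each undirected link.

    Nodes with degree below degree_threshold form class A, all others
    class B. A link gets `high` only if both end nodes are in class B.
    """
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    capacities = {}
    for u, v in edges:
        both_class_b = (degree[u] >= degree_threshold
                        and degree[v] >= degree_threshold)
        capacities[(u, v)] = high if both_class_b else low
    return capacities


# Hypothetical five-node topology (not one of the paper's topologies).
edges = [(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (4, 5)]
caps = link_capacities(edges)
for link, cap in caps.items():
    print(link, cap)
```

Here nodes 1, 2, and 4 have degree 3 and fall into class B, so only the links among them receive 11 Mbps; every link touching node 3 or node 5 receives 6 Mbps.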
For the hybrid routing and forwarding algorithm, we gradually increase the number of SDN nodes in the network and compare the results against a network with no SDN nodes, in which all nodes forward traffic according to the OSPF protocol. Panels (a) to (d) compare the maximum link utilization of our proposed hybrid routing and forwarding scheme with that of the OSPF protocol as the number of SDN nodes increases. The simulation results are shown in Figures 6 and 7.
Figures 6 and 7 present the maximum link utilization for different SDN node deployments in Topology 1 and Topology 2. In both figures, the y-axis represents the maximum link utilization, and the x-axis represents the number of crowdsourcing task flow matrices. The results show that, as the number of SDN nodes increases, the overall trend of the maximum link utilization in the mobile wireless network is decreasing. However, the figures also show that, when there are relatively few SDN nodes in the network, the maximum link utilization obtained with the hybrid routing and forwarding algorithm is almost the same as that obtained with the OSPF routing algorithm. This is because the SDN controller can only manipulate the traffic in the network through the SDN nodes. When the SDN nodes are few, most of the traffic in the network is beyond its control. Traffic through the SDN nodes can still be controlled, so the maximum utilization of local links is reduced and load balancing is achieved locally, but it is difficult to achieve load balancing for the whole network. In addition, a comparison of Topology 1 and Topology 2 shows that the benefits of deploying SDN nodes become more apparent as the number of nodes in the network increases and the network topology becomes more complex. For the congestion control algorithm, we gradually increase the number of SDN nodes in the mobile wireless network, calculate the throughput of the network with formula (7), and compare the results. The simulation results are shown in Figure 8.
Figure 8 shows the throughput for different SDN node deployments in Topology 1 and Topology 2. The throughput of both topologies improves as the number of SDN nodes increases. From the comparison in Figure 8, it can be concluded that our congestion control algorithm effectively improves the network throughput.

Conclusion
At present, a large number of crowdsourcing-based mobile applications targeted at real-time services and recommendation have been deployed in mobile networks and the IoT. The frequent information exchanges and data transmissions in collaborative crowdsourcing are continually injected into current communication networks, which poses great challenges for Mobile Wireless Networks (MWN). This paper focuses on the traffic scheduling and load balancing problem in software-defined MWN and designs a greedy heuristic algorithm as well as a congestion control algorithm to obtain feasible solutions. The proposed traffic scheduling algorithm sorts the tasks in ascending order according to their amount and then assigns them using the greedy scheme. Each task packet is assigned to the corresponding link for forwarding so that the maximum link utilization in the MWN is minimized. In the proposed congestion control scheme, the traffic assignment is transformed into a multiple knapsack problem, and the AFSA algorithm is then employed to solve it. The node selects a subset of its feasible task set and assigns it to the p links so that the amount of allocated tasks is maximized without exceeding the limited capacity of each link. The simulation results demonstrate that, compared with the traditional schemes, the proposed congestion control and traffic scheduling methods can achieve load balancing, reduce the probability of network congestion, and improve the network throughput.

Figure 1: The illustration of the next hop selection in our network model.

4.1. Design of Hybrid Routing and Forwarding Algorithm. In our model, we divide the nodes in the mobile wireless network into two categories: SDN nodes and non-SDN nodes.

Figure 6: Comparison of the maximum utilization with different SDN nodes deployment in Topology 1.

Figure 7: Comparison of the maximum utilization with different SDN nodes deployment in Topology 2.
Compute link utilization on all links in the network.Get the maximum link utilization ; (17) Update the crowdsourcing task traffic matrix  to   ; (18) If   ≥  then