Joint Task Partition and Resource Allocation for Multiuser Cooperative Mobile Edge Computing

Exploiting the idle computation resources distributed at wireless devices (WDs) can enhance the mobile edge computing (MEC) computation performance. This paper studies a multiuser cooperative computing system consisting of one local user and multiple helpers, in which the user solicits multiple nearby WDs acting as helpers for cooperative computing. We design an efficient orthogonal frequency-division multiple access (OFDMA)-aided three-phase transmission protocol, under which the user's computation-intensive tasks can be executed in parallel by local computing and offloading. Under this setup, we study the energy consumption minimization problem by optimizing the user's task partition, jointly with the communication and computation resource allocation for task offloading and results downloading, subject to the user's computation latency constraint. For the nonconvex problem, we first transform the original problem into a convex one and then use the Lagrange duality method to obtain the globally optimal solution. Numerical results validate the effectiveness of the proposed joint task partition and resource allocation (JTPRA) scheme compared with other benchmark schemes.


Introduction
The real-time communication and computation of massive numbers of wireless devices (WDs) (e.g., smart wearable devices and laptops) promote the rapid growth of emerging applications (e.g., face recognition, smart grid, and autonomous driving) [1]. These applications or tasks may be computation-intensive and latency-critical, whereas WDs are generally small and have only limited battery power. Hence, enhancing their computation capabilities and reducing the computation latency is a crucial but challenging task. To deal with such limitations, mobile edge computing (MEC) has been proposed as a promising technology that provides cloud-like computing at the network edge (e.g., base stations (BSs) and access points (APs)) [2].
Various efforts have been devoted to handling the technical challenges under different computation task models. Two task models extensively adopted in current research are partial offloading and binary offloading. Note that in partial offloading, the mutual dependency of the computing tasks significantly affects the computation offloading process [3]. Accordingly, partial offloading can be classified into the task-call graph model and the data-partition model. Also, there exist different types of MEC system architectures, such as single-user single-server [4-7], multiuser single-server [8-14], and single/multiuser multiserver [15-18].
Due to the increasing number of WDs, resource contention may occur on MEC servers. Under this circumstance, cooperative computing provides a viable solution by utilizing the abundant idle computation resources distributed at WDs [19, 20]. In a basic two-user device-to-device (D2D) cooperative computing system, [19] jointly optimized both users' local computing and task offloading decisions over time, in order to minimize their weighted sum-energy consumption. Under a single-user single-helper single-server setup, [21, 22] jointly optimized the communication and computation resource allocation at both the user and helper based on time-division multiple access (TDMA) and nonorthogonal multiple access (NOMA), respectively. In a cellular D2D MEC system, [23] proposed a joint task management architecture to achieve efficient information interaction and task management. Also, by integrating D2D into the MEC system, [24] jointly optimized D2D pairing, task split, and the communication and computation resource allocation, in order to improve the system computation capacity.
In the above research works, the user is mostly assumed to cooperate with a single helper at a time. In practical designs, multiple helpers can simultaneously share their own computation and communication resources to help the user [20, 25-27]. In the D2D MEC system, [25] jointly optimized the helpers' selection and the communication and computation resource allocation to minimize the energy consumption. In [26], a multihelper MEC system with NOMA-based cooperative edge computing was presented to maximize the total offloaded data subject to the latency constraints. However, the system models in [25, 26] ignore results downloading, which may not be applicable in practical designs. Thus, [20, 27] focus on joint task offloading and results downloading. In the D2D-enabled multihelper MEC system, [27] jointly optimized the time and rate for task offloading and results downloading, as well as the computation frequency for task execution, in order to minimize the computation latency. Unlike the binary offloading model in [20,27], it investigated a multiuser computational offloading scheme, in which the controlling user partially distributes its computing tasks to multiple trusted helpers. Also, [20] ignores the computation resource of the local user and the dynamic management of the computation frequency.
Despite the recent research progress, cooperative computing still faces some technical challenges. First, previous research works mostly consider cooperating with the MEC server or a single helper. When multiple helpers share unused resources to help the user, how to effectively coordinate the cooperation between the user and multiple helpers to achieve computing diversity remains challenging, especially when the number of helpers becomes large. Second, previous research works generally ignore the potential performance improvement brought by the dynamic management of the computation frequency as well as by results downloading. When the MEC system considers these options, solving the resulting complex problem is also challenging.
Motivated by this, we consider a multiuser cooperative MEC system consisting of one user and multiple nearby WDs serving as helpers. The user has individual computation tasks to be executed within a given time block. To implement the cooperation between the user and the helpers, the time block is divided into three phases. In the first phase, the user simultaneously offloads the computing tasks to multiple nearby helpers. In the second phase, the helpers execute their assigned computation bits. In the third phase, the helpers send the computation results back to the user. Under this setup, this paper develops an energy-efficient multiuser cooperative MEC design by optimizing the user's task partition, jointly with the communication and computation resource allocation for task offloading and results downloading. The main contributions of this paper are summarized as follows.
(1) We propose an MEC framework for multiuser cooperative computing, in which the user can simultaneously offload the computing tasks to multiple nearby helpers.
(2) We design an OFDMA-aided three-phase transmission protocol involving results downloading, which efficiently coordinates the cooperation between the user and multiple nearby helpers.
(3) For the energy consumption minimization problem, we optimize the user's task partition, jointly with the communication and computation resource allocation for task offloading and results downloading. Due to the nonconvexity of this problem, we first transform it into a convex one and then use the Lagrange duality method to obtain the globally optimal solution.
The rest of this paper is organized as follows. Section 2 introduces the system model. The proposed joint task partition and resource allocation problem is formulated in Section 3. The joint task partition and resource allocation algorithm is presented in Section 4. Section 5 provides numerical results, followed by the conclusion in Section 6.
Notation is as follows: we employ uppercase boldface letters and lowercase boldface ones for matrices and vectors, respectively. $\triangleq$ means "denoted by". $[x]_b^a$ and $[x]^+$ denote $\max\{b, \min\{a, x\}\}$ and $\max\{0, x\}$, respectively. A continuous random variable $z$ uniformly distributed over $[a, b]$ is denoted by $z \sim U[a, b]$. $|\mathbf{A}|$ denotes the determinant of a matrix $\mathbf{A}$. Moreover, $\mathbb{R}^K_{\geq 0}$ and $\mathbb{R}_+$ stand for the sets of nonnegative real vectors of dimension $K$ and positive real numbers, respectively.

System Model
As shown in Figure 1, we consider a multiuser cooperative MEC system, which consists of one user and a set $\mathcal{K} \triangleq \{1, \dots, K\}$ of nearby helpers, all equipped with a single antenna. We focus on a time block of length $T$, within which the user must execute computing tasks of data size $L_u$ (in bits). Here, $T$ is no larger than the channel coherence time [21]. Suppose that a central controller is responsible for collecting the network information, such as the global channel state information (CSI). Accordingly, the central controller can send the optimized strategies to the user and helpers for them to act upon [21, 22]. For ease of implementation, channel reciprocity between the task offloading and result downloading links is further assumed in this paper [20].
Specifically, the $L_u$ bits can generally be divided into $K + 1$ independent parts for local computing and offloading to the helpers, respectively. Let $l_u \geq 0$ and $l_1 \geq 0, \dots, l_K \geq 0$ denote the numbers of bits for local computing at the user and offloading to the $K$ helpers, respectively. Then, we have $l_u + \sum_{k=1}^{K} l_k = L_u$. (1)
2.1. Local Computing.
The $l_u$ bits are executed locally with the optimal central processing unit (CPU) frequency $f_u$ given in (2) [21], where $c_u$ denotes the number of CPU cycles for computing 1-bit input data at the user. Note that $f_u$ is subject to the maximum frequency constraint (3). Accordingly, the user's energy consumption for local computing, $E_u^{\mathrm{comp}}$, is given by (4), where $\gamma_u$ denotes a constant related to the user's hardware architecture [21]. Replacing $f_u$ in (4) with (2), $E_u^{\mathrm{comp}}$ can be re-expressed as (5).
2.2. Remote Computing at Helpers.
The OFDMA-aided three-phase transmission protocol is shown in Figure 2. First, in the task offloading phase, the user offloads $l_k$ bits to the $k$-th helper with duration $t_k^{\mathrm{off}}$ via OFDMA, $k \in \mathcal{K}$. Then, in the task execution phase, the $k$-th helper executes its assigned bits with duration $t_k^{\mathrm{comp}}$. Finally, in the result downloading phase, the $k$-th helper sends the computation results back to the user with duration $t_k^{\mathrm{dl}}$ via OFDMA. Note that the user's cooperation with one helper does not affect its cooperation with the others. To meet the user's latency requirement, the time constraint (6) must hold. In the following, we describe the OFDMA-aided three-phase transmission protocol in detail.
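Before turning to the protocol details, the local-computing model of Section 2.1 can be illustrated numerically. The sketch below is not the paper's code: it assumes the commonly used model in which the CPU runs at the constant frequency $f_u = c_u l_u / T$ and consumes $\gamma_u c_u l_u f_u^2$ joules, which gives $\gamma_u c_u^3 l_u^3 / T^2$ after substitution. The value of `gamma_u` is a hypothetical placeholder, and 0.2 Mb is read as $0.2 \times 10^6$ bits.

```python
def local_cpu_frequency(l_u: float, c_u: float, T: float) -> float:
    """Constant CPU frequency (cycles/s) that finishes c_u * l_u cycles exactly at time T."""
    return c_u * l_u / T


def local_energy(l_u: float, c_u: float, gamma_u: float, T: float) -> float:
    """Energy in joules under the ASSUMED model E = gamma * (total cycles) * f^2."""
    f_u = local_cpu_frequency(l_u, c_u, T)
    return gamma_u * c_u * l_u * f_u ** 2


# c_u, f_u_max, l_u and T follow the simulation section of the paper;
# gamma_u = 1e-28 is a hypothetical placeholder, not a value from the paper.
c_u, f_u_max, gamma_u = 1e3, 2e9, 1e-28
l_u, T = 0.2e6, 0.15

f_u = local_cpu_frequency(l_u, c_u, T)
feasible = f_u <= f_u_max          # maximum frequency constraint
energy = local_energy(l_u, c_u, gamma_u, T)
print(f"f_u = {f_u:.3e} Hz, feasible = {feasible}, energy = {energy:.4f} J")
```

Under this assumed model the energy grows with the cube of the locally computed bits, which is why moving even a modest share of the bits to helpers can pay off.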

2.2.1. Phase I (Task Offloading).
Let $h_k^{\mathrm{off}}$ denote the channel power gain from the user to the $k$-th helper, $k \in \mathcal{K}$. The achievable offloading rate at the $k$-th helper, $r_k^{\mathrm{off}}$, is given by (7), where $W$ (in Hz) denotes one frequency resource block, $p_k^{\mathrm{off}}$ is the transmit power for offloading data to the $k$-th helper, and $\sigma_k^2$ is the power of the additive white Gaussian noise (AWGN) at the $k$-th helper. Hence, the number of bits $l_k$ offloaded from the user to the $k$-th helper is given by (8). Accordingly, the total energy consumed by the user for task offloading is expressed in (9).
2.2.2. Phase II (Task Execution).
After receiving the $l_k$ bits, the $k$-th helper executes them with the optimal CPU frequency $f_k$ given in (10), where $c_k$ denotes the number of CPU cycles for computing 1-bit input data at the $k$-th helper. Similarly to (3), $f_k$ is subject to the maximum frequency constraint (11). Consequently, the energy consumption for cooperative computation at the $k$-th helper is expressed in (12), where $\gamma_k$ denotes a constant related to the $k$-th helper's hardware architecture [21].
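The offloading phase can be made concrete with a small numerical sketch. It assumes the Shannon-type rate $W \log_2(1 + p h / \sigma^2)$ over one resource block and that the offloaded bits equal rate times duration; the function names and all numerical values (`W`, `sigma2`, `h`, `l_k`) are illustrative assumptions rather than settings taken from the paper.

```python
import math


def offloading_rate(p: float, h: float, W: float, sigma2: float) -> float:
    """Achievable rate in bits/s (ASSUMED Shannon form over one resource block)."""
    return W * math.log2(1.0 + p * h / sigma2)


def min_offloading_power(l_k: float, t_off: float, h: float, W: float, sigma2: float) -> float:
    """Smallest transmit power (W) that delivers l_k bits within t_off seconds."""
    return (sigma2 / h) * (2.0 ** (l_k / (t_off * W)) - 1.0)


def offloading_energy(l_k: float, t_off: float, h: float, W: float, sigma2: float) -> float:
    """Transmit energy (J) = duration * power when the rate constraint is tight."""
    return t_off * min_offloading_power(l_k, t_off, h, W, sigma2)


# Hypothetical values: 1 MHz block, -120 dBm noise, helper 10 m away with
# path loss 1e-3 * d^-3, and 5e4 offloaded bits.
W, sigma2 = 1e6, 1e-15
h = 1e-3 * 10.0 ** (-3)
l_k = 5e4

for t_off in (0.02, 0.04, 0.08):
    e = offloading_energy(l_k, t_off, h, W, sigma2)
    print(f"t_off = {t_off:.2f} s -> offloading energy = {e:.3e} J")
```

In this sketch, stretching the offloading duration lowers the transmit energy but leaves less time for execution and downloading, which is the trade-off that couples time and power in the joint problem.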

2.2.3. Phase III (Result Downloading).
After executing the user's assigned bits, the $k$-th helper sends the computation results back to the user via OFDMA. Let $h_k^{\mathrm{dl}}$ denote the channel power gain from the $k$-th helper to the user. The achievable downloading rate from the $k$-th helper, $r_k^{\mathrm{dl}}$, is given by (13), where $p_k^{\mathrm{dl}}$ is the transmit power of the $k$-th helper, and $\sigma_0^2$ is the power of the AWGN at the user. The size of the corresponding computation results is thus given by (14), where $q \in \mathbb{R}_+$ denotes the normalized ratio between the size of the computation results and the size of the computing tasks [20]. The energy consumed by the $k$-th helper for results downloading is expressed in (15).
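To tie the three phases together, the following sketch evaluates the energy one helper spends on execution and on results downloading for a given time split. It is a minimal illustration under stated assumptions: the execution time is taken as $T - t_k^{\mathrm{off}} - t_k^{\mathrm{dl}}$ (the latency constraint met with equality), the computing energy follows the same assumed cubic model as local computing, the downloading rate is of Shannon form, and the result size is $q\, l_k$ bits. The function name and all numerical inputs are hypothetical.

```python
def helper_energy(l_k, t_off, t_dl, T, c_k, gamma_k, f_k_max, q, h_dl, W, sigma2_0):
    """Return (computing energy, downloading energy) in joules for one helper.

    ASSUMPTIONS: execution uses all time left after offloading and downloading,
    computing energy is gamma * cycles * f^2, and the downlink rate is
    W * log2(1 + p * h / sigma^2).
    """
    t_comp = T - t_off - t_dl
    if t_comp <= 0:
        raise ValueError("no time left for task execution")
    f_k = c_k * l_k / t_comp                      # constant CPU frequency
    if f_k > f_k_max:
        raise ValueError("required CPU frequency exceeds the helper's maximum")
    e_comp = gamma_k * c_k * l_k * f_k ** 2
    # Power needed to return q * l_k result bits within t_dl seconds.
    p_dl = (sigma2_0 / h_dl) * (2.0 ** (q * l_k / (t_dl * W)) - 1.0)
    e_dl = t_dl * p_dl
    return e_comp, e_dl


# Hypothetical example: 5e4 bits, 20 ms offloading, 10 ms downloading, T = 150 ms.
e_comp, e_dl = helper_energy(
    l_k=5e4, t_off=0.02, t_dl=0.01, T=0.15,
    c_k=1e3, gamma_k=1e-28, f_k_max=1.6e9,
    q=0.2, h_dl=1e-6, W=1e6, sigma2_0=1e-15,
)
print(f"computing energy = {e_comp:.3e} J, downloading energy = {e_dl:.3e} J")
```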

Problem Formulation
In this paper, we aim to minimize the total energy consumption of the multiuser cooperative MEC system (i.e., the sum of the computation and transmission energy consumed by the user and the helpers) by jointly optimizing the user's task partition, the task offloading time, the result downloading time, and the transmit power of the user and helpers, subject to the user's computation latency constraint $T$. Specifically, the energy consumption minimization problem is formulated as problem (P1). In problem (P1), (1) denotes the user's task partition constraint, (3) and (11) denote the maximum CPU frequency constraints at the user and helpers, respectively, (16a) and (16b) denote the constraints for data transmission between the user and the helpers, and (16c) and (16d) denote the transmission energy consumption constraints at the user and helpers, respectively. Note that in problem (P1), the equalities in (8) and (14) are replaced by the inequality constraints (16a) and (16b), respectively. Constraints (16a) and (16b) should be met with equality at the optimality of problem (P1). This is consistent with intuition, since otherwise the transmit power, and hence the energy consumption, could be reduced without violating the constraint. Because $t_k^{\mathrm{off}}$ and $p_k^{\mathrm{off}}$, as well as $t_k^{\mathrm{dl}}$ and $p_k^{\mathrm{dl}}$, are coupled in the objective function and in constraints (16a) and (16b), problem (P1) is nonconvex.

Feasibility of Problem (P1).
Before solving problem (P1), we need to guarantee its feasibility so that the multiuser cooperative MEC system can support the latency-constrained task execution. Let $L_{\max}$ denote the maximum data size in bits supported by the proposed MEC system within duration $T$. Problem (P1) is feasible if and only if $L_{\max} \geq L_u$. Hence, we check the feasibility of problem (P1) by determining $L_{\max}$. Intuitively, $L_{\max}$ is obtained when the user and helpers make full use of the communication and computation resources in the proposed MEC system. This corresponds to letting constraints (3) and (11) be met with equality in problem (P1). The resulting data maximization problem is formulated as problem (P2). Due to the similarity between problems (P1) and (P2), problem (P2) can be solved in the same way as problem (P1). By comparing $L_{\max}$ and $L_u$, we finally check the feasibility of problem (P1).

Optimal Solution
In this section, we first transform problem (P1) into a convex one and then present an efficient algorithm to obtain the globally optimal solution.
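To accomplish this, we introduce two auxiliary variable vectors $\mathbf{y}^{\mathrm{off}}$ and $\mathbf{y}^{\mathrm{dl}}$, which transform problem (P1) into problem (P1.1). Lemma 1 establishes the convexity of problem (P1.1).
Proof. The function $r_k^{\mathrm{off}}(x)$ is concave with respect to $x \geq 0$. As the perspective operation preserves concavity, $x\, r_k^{\mathrm{off}}(y/x)$ is jointly concave with respect to $x \geq 0$ and $y \geq 0$ [28]. The same applies to $x\, r_k^{\mathrm{dl}}(y/x)$. Therefore, the set defined by the corresponding constraints is convex. The leading principal minors of the matrix $\mathbf{H}$ are then examined to validate the remaining claim.
In view of Lemma 1, and to gain engineering insights, we next leverage the Lagrange duality method to solve problem (P1.1) [28]. Let $\boldsymbol{\lambda}_1 \in \mathbb{R}^K_{\geq 0}$ and $\boldsymbol{\lambda}_2 \in \mathbb{R}^K_{\geq 0}$ denote the dual variables associated with the constraints in (18a) and (18b), respectively, and let $\mu_1 \in \mathbb{R}_{\geq 0}$, $\mu_2 \in \mathbb{R}$, and $\boldsymbol{\mu}_3 \in \mathbb{R}^K_{\geq 0}$ be the dual variables associated with the constraints in (18c), (1), and (18d), respectively. We write $(\boldsymbol{\lambda}, \boldsymbol{\mu})$ for the collection of these dual variables. The dual function of problem (P1.1) is denoted by $g(\boldsymbol{\lambda}, \boldsymbol{\mu})$, and the corresponding dual problem is referred to as problem (P1.1-dual). Let $\Psi$ denote the feasible set of $(\boldsymbol{\lambda}, \boldsymbol{\mu})$, and let $(\boldsymbol{\lambda}^{\mathrm{opt}}, \boldsymbol{\mu}^{\mathrm{opt}})$ denote the optimal dual variables for problem (P1.1-dual).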
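The role of the perspective operation in the convexity argument of this section can be checked numerically. The sketch below assumes that the auxiliary variable is the product of power and time, $y = p\,t$, which the form $x\, r(y/x)$ used in the proof suggests, and uses a normalized rate $r(p) = \log_2(1 + p)$; both are assumptions made for illustration. It tests midpoint concavity, a necessary condition for concavity, for the coupled form $t\, r(p)$ and for the perspective form $t\, r(y/t)$.

```python
import math
import random


def rate(p: float) -> float:
    """Normalized rate (ASSUMED: W = 1 and h / sigma^2 = 1)."""
    return math.log2(1.0 + p)


def coupled(t: float, p: float) -> float:
    """Bits delivered in the original variables (time, power)."""
    return t * rate(p)


def perspective(t: float, y: float) -> float:
    """Bits delivered after the assumed substitution y = p * t."""
    return t * rate(y / t) if t > 0 else 0.0


def midpoint_gap(f, a, b) -> float:
    """f(midpoint) - average of f at the endpoints; >= 0 for a concave f."""
    mid = tuple((x + z) / 2.0 for x, z in zip(a, b))
    return f(*mid) - 0.5 * (f(*a) + f(*b))


# A single pair of points already shows that t * r(p) is not jointly concave.
gap_coupled = midpoint_gap(coupled, (0.1, 0.1), (1.1, 1.1))

# The perspective form passes the midpoint test on many random pairs.
rng = random.Random(0)
worst = min(
    midpoint_gap(
        perspective,
        (rng.uniform(0.01, 1.0), rng.uniform(0.0, 5.0)),
        (rng.uniform(0.01, 1.0), rng.uniform(0.0, 5.0)),
    )
    for _ in range(10000)
)
print(f"coupled gap = {gap_coupled:.4f}, worst perspective gap = {worst:.3e}")
```

A passing random test is only supporting evidence, not a proof; the formal argument is the perspective property cited from [28].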
Lemma 3 gives, for a given $(\boldsymbol{\lambda}, \boldsymbol{\mu}) \in \Psi$, the optimal solution $l_u^*$ to problem (P1.1-sub $K+1$).
Remark 4. Note that in (28) and (29), if $\rho_{k,i} = 0$ (for any $i \in \{1, 2\}$), the optimal $(t_k^{\mathrm{off}})^*$ and $(t_k^{\mathrm{dl}})^*$ are not unique, and any admissible choice can be used for evaluating $g(\boldsymbol{\lambda}, \boldsymbol{\mu})$. Since such a choice may not be feasible or optimal for problem (P1.1), we add an additional step to find the primal optimal $(t_k^{\mathrm{off}})^{\mathrm{opt}}$ and $(t_k^{\mathrm{dl}})^{\mathrm{opt}}$, as will be shown in Section 4.3.

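4.3. Finding the Optimal Primal Solution to (P1).
Having obtained $(\boldsymbol{\lambda}^{\mathrm{opt}}, \boldsymbol{\mu}^{\mathrm{opt}})$, we still need to solve problem (P1.1). By replacing $(\boldsymbol{\lambda}, \boldsymbol{\mu})$ with $(\boldsymbol{\lambda}^{\mathrm{opt}}, \boldsymbol{\mu}^{\mathrm{opt}})$ in Lemmas $2_1, \dots, 2_K$, and 3, we obtain the corresponding optimal solution of the other primal variables. The optimal $(t_k^{\mathrm{off}})^{\mathrm{opt}}$ and $(t_k^{\mathrm{dl}})^{\mathrm{opt}}$, $k \in \mathcal{K}$, are then obtained from problem (50). Since problem (50) is a linear program (LP), it can be solved by the interior-point method [28]. Finally, we obtain the globally optimal solution for problem (P1). The proposed joint task partition and resource allocation (JTPRA) scheme is summarized in Algorithm 1.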
Remark 5. With Lemmas $2_1, \dots, 2_K$, and 3, the following insights can be obtained.
(1) As for local computing, it is observed from Lemma 3 that $l_u^{\mathrm{opt}}$ generally increases as $T$ becomes large. This indicates that the user prefers executing more tasks locally when its computation latency constraint becomes loose.
(2) As for cooperative computing, it is evident from Lemma 2 that the offloading power $(p_k^{\mathrm{off}})^{\mathrm{opt}}$ increases as the channel power gain $h_k^{\mathrm{off}}$ becomes stronger. That is, the user prefers offloading more tasks to the closer helper, in order to reduce the marginal energy consumption for offloading. The same applies to $(p_k^{\mathrm{dl}})^{\mathrm{opt}}$.
4.4. Complexity.
The complexity of the ellipsoid method is $O(N^2)$, where $N$ is the number of dual variables and $N = 3K + 2$ in (23) [29]. Moreover, the complexity of the interior-point method is $O(M^{3.5} \log(1/\varepsilon))$, where $M$ is the number of optimization variables, $\log(1/\varepsilon)$ is the iteration complexity order, and $M = 2K$ in (50) [28]. Hence, the total complexity of Algorithm 1 is $O(K^{3.5} \log(1/\varepsilon))$.

Simulation Results
We provide simulation results to verify the effectiveness of the proposed joint task partition and resource allocation (JTPRA) scheme in Section 4, as compared against the following five benchmark schemes:
(1) Local Computing with Optimal Frequency (LCOF): the computation tasks are executed locally with the optimal CPU frequency, which yields the optimal energy consumption for local computing.
(2) Local Computing with Fixed Frequency (LCFF): the computation tasks are executed locally with the maximum CPU frequency, which determines the corresponding energy consumption for local computing.
(3) Full Offloading (FO): the computation tasks are partitioned into $K$ parts for offloading to nearby helpers, which corresponds to solving problem (P1) by setting $l_u = 0$.
(4) Joint Offloading Ratio and CPU Frequency (JORCF) [7]: in this scheme, the user adjusts both the offloading ratio and the CPU frequency to cooperate with the MEC server.
(5) Fixed Frequency (FF) [20]: constraints (3) and (11) are met with equality. This corresponds to solving problem (P1) with the CPU frequencies of the user and helpers fixed at their maximum values.
In the simulation, the distance between the user and each helper is $d \sim U[d_{\min}, d_{\max}]$ meters, where $d_{\max} = 30$ meters and $d_{\min} = 1$ meter. The path loss between any two nodes is modeled as $b d^{-\varphi}$, where $b = 10^{-3}$ corresponds to the path loss at a reference distance of 1 meter, $d$ denotes the distance from the user to a helper, and $\varphi = 3$ is the path loss exponent [21]. Also, the helpers' maximum CPU frequencies are assumed to be uniformly chosen from the set {1.6, 2.4, 3} GHz. The other parameters are set as shown in Table 1 unless otherwise specified.
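The random part of this setup can be reproduced with a few lines of code. The sketch below draws helper distances, path-loss gains, and maximum CPU frequencies as described above. It is an illustration only: it models path loss without small-scale fading (the text does not say whether fading is included), uses one gain per helper on the basis of the channel reciprocity assumption, and the function name, dictionary keys, and random seed are hypothetical choices.

```python
import random

CPU_CHOICES_HZ = (1.6e9, 2.4e9, 3.0e9)  # helpers' maximum CPU frequencies


def draw_helpers(rng, K, d_min=1.0, d_max=30.0, b=1e-3, phi=3.0):
    """Draw one realization of K helpers: distance, channel gain, max CPU frequency.

    ASSUMPTION: the channel gain is pure path loss b * d^(-phi); by reciprocity
    the same gain is used for offloading and for results downloading.
    """
    helpers = []
    for _ in range(K):
        d = rng.uniform(d_min, d_max)
        helpers.append({
            "distance_m": d,
            "gain": b * d ** (-phi),
            "f_max_hz": rng.choice(CPU_CHOICES_HZ),
        })
    return helpers


rng = random.Random(2024)  # fixed seed so the draw is repeatable
realizations = [draw_helpers(rng, K=3) for _ in range(500)]

mean_gain = sum(h["gain"] for r in realizations for h in r) / (500 * 3)
print(f"mean channel gain over 500 realizations: {mean_gain:.3e}")
```

Averaging a scheme's energy over such realizations gives the "average energy consumption" reported in the figures, assuming each realization is solved independently.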
Figure 3 shows the maximum data size versus the block length $T$, where $L_u = 0.2$ Mb and $K = 3$. In the following simulations, under a given $T$, we set $L_u$ smaller than $L_{\max}$ to guarantee the feasibility of problem (P1).
Figure 4 shows the average energy consumption versus the data size $L_u$, where $T = 0.15$ sec and $K = 3$. It is observed that the proposed JTPRA achieves the lowest energy consumption among all schemes. Moreover, we have the following observations.
(1) The average energy consumption of all the schemes increases as $L_u$ increases. JTPRA achieves a significant performance gain over FF. This indicates the benefit of computation frequency optimization for energy saving in MEC.
(2) Among the schemes with unchanged CPU frequency, FF achieves a lower energy consumption than LCFF. This is because the user prefers offloading the computing tasks to the helpers whose maximum CPU frequencies are below that of the user, rather than computing locally.
(3) FO achieves a significant energy reduction over LCOF, because the helpers' optimal CPU frequencies are far below that of the user. Also, compared with JORCF, JTPRA reduces the energy consumption by about 69.9% on average. This indicates the performance gain brought by proximity.
Table 1:
Noise power: -120 dBm [22]
Computation intensities $c_u = c_1 = \dots = c_K$: $10^3$ cycles/bit [21]
User's maximum CPU frequency $f_u^{\max}$: 2 GHz [21]
Available energy for data transmission $E_{\max}^{\mathrm{off}} = E_{\max}^{\mathrm{dl}}$: 0.5 joule
Normalized ratio between the size of computation results and the size of computing tasks $q$: 0.2
Number of channel realizations: 500
Figure 5 shows the average energy consumption versus the block length $T$, where $L_u = 0.2$ Mb and $K = 3$. The observations in Figure 5 are generally similar to those in Figure 4. Specifically, the proposed JTPRA reduces the energy consumption by about 52.4% on average, compared with JORCF. Moreover, we have the following observations.
(1) Among the schemes with unchanged CPU frequency, LCFF remains unchanged as $T$ increases, while FF remains almost unchanged. This indicates that there is no need for the user and helpers to increase the transmission rate when the latency requirement is loose.
(2) For the schemes that optimize the CPU frequency, the average energy consumption decreases as $T$ increases, because the user and helpers can lower their optimal CPU frequencies and thus consume less energy. However, the energy consumption of JORCF decreases very slowly with $T$. This indicates that fixing the transmission rate is not conducive to improving the MEC performance.
Figure 6 shows the average energy consumption versus the frequency resource block $W$, where $T = 0.15$ sec, $L_u = 0.2$ Mb, and $K = 3$. The observations in Figure 6 are generally similar to those in Figure 5. Specifically, the proposed JTPRA reduces the energy consumption by about 67.5% on average, compared with JORCF. Moreover, for the schemes that optimize the CPU frequency, other than LCOF and JORCF, the average energy consumption first steadily decreases and then remains almost unchanged as $W$ increases. This is because a large $W$ not only provides a high transmission rate but also reduces the transmission energy consumption between the user and the helpers.
Figure 7 shows the average energy consumption versus the helper number $K$, where $L_u = 0.2$ Mb and $T = 0.15$ sec. The observations in Figure 7 are generally similar to those in Figure 6. Specifically, the proposed JTPRA reduces the energy consumption by about 65.1% on average, compared with JORCF. For the schemes with unchanged CPU frequency, the energy consumption of FF decreases as $K$ becomes large. In addition, among the schemes that optimize the CPU frequency, JTPRA achieves a lower energy consumption than both LCOF and FO. When $K < 2$, JORCF achieves a lower energy consumption than JTPRA, while the reverse is true when $K$ becomes large. This is because having more helpers whose optimal CPU frequencies are below that of the user yields a more significant energy reduction.
Figure 8 shows the average energy consumption versus the maximum distance $d_{\max}$ from the user to the helpers, where $L_u = 0.2$ Mb, $T = 0.15$ sec, and $K = 3$. The observations in Figure 8 are generally similar to those in Figure 7. Specifically, the proposed JTPRA reduces the energy consumption by about 15.1% on average, compared with JORCF. Among the schemes that optimize the CPU frequency, the energy consumption of FO first grows rapidly as $d_{\max}$ increases and then becomes even worse than that of both JORCF and LCOF, while that of JTPRA increases steadily. This is because as $d_{\max}$ increases, the channel gain between the user and the helpers becomes smaller, which increases the transmission energy consumption between the user and the helpers; thus, local computing is more beneficial than computation offloading at large $d_{\max}$ values.

Conclusion
In this paper, we have proposed a novel joint task partition and resource allocation (JTPRA) scheme, in which nearby helpers share their own communication and computation resources to help the user. By considering an efficient OFDMA-aided three-phase transmission protocol, we proposed an energy-efficient design framework that jointly optimizes the user's task partition and the communication and computation resource allocation for task offloading and results downloading, subject to the user's computation latency constraint. Based on convex optimization methods, we presented an efficient algorithm to obtain the globally optimal solution. Extensive numerical results demonstrated the merits of the proposed JTPRA scheme over alternative benchmark schemes.
Due to space limitations, some other challenging problems are not handled in this paper. They are discussed as follows to inspire future work.
(1) Although this paper considered a single-user multihelper model, our results are extendable to more general multiuser multihelper models. In this case, we can design a helper selection policy to pair each user with one or multiple helpers, such that the helpers can use the proposed JTPRA scheme to help the computation of the paired user. However, how to efficiently handle the joint optimization of helper selection and resource allocation is a challenging problem worthy of further study.
(2) Owing to the easy implementation of OFDMA, we designed the proposed protocol based on it in this paper. To further improve the system performance, other multiple access schemes can be exploited, e.g., NOMA schemes [22] and sparse code multiple access (SCMA) [30].
(3) In terms of energy saving, we achieved the expected goal. For MEC standardization, however, how to improve the implementation of the proposed scheme in the manner of D2D, e.g., symbol synchronization and signaling interaction, is a difficult problem worth pursuing in the future [31].

Figure 1: System model of multiuser cooperative MEC.

Figure 2: An illustration of the OFDMA-aided three-phase transmission protocol.

and $\boldsymbol{\mu}_3 \triangleq [\mu_{3,1}, \dots, \mu_{3,K}]$. The partial Lagrangian of problem (P1.1) is given by opt, respectively. However, due to the nonuniqueness of $(t_k^{\mathrm{off}})^*$ and $(t_k^{\mathrm{dl}})^*$, $k \in \mathcal{K}$, we implement an extra procedure to obtain the optimal solution of the other variables for problem (P1). With $p_{\mathrm{off}}^{\mathrm{opt}}$, $p_{\mathrm{dl}}^{\mathrm{opt}}$, $M^{\mathrm{opt}}$, and $l_u^{\mathrm{opt}}$, the optimal solution must satisfy l

Figure 3: The maximum data-size versus the block length T.

Figure 4: The average energy consumption versus data size L u .

Figure 5: The average energy consumption versus the block length T.

Figure 6: The average energy consumption versus the frequency resource block W.

Figure 7: The average energy consumption versus the helper number K.

Figure 8: The average energy consumption versus the maximum distance d max from the user to the helpers.