Convergence Rate Analysis of the Proximal Difference of Convex Algorithm

In this paper, we study the convergence rate of the proximal difference of convex algorithm for problems whose objective is built from a strongly convex function and two convex functions. By making full use of the special structure of the difference of convex decomposition, we prove that the proximal difference of convex algorithm converges linearly, as measured by the objective function value.


Introduction
Difference of convex programming (DCP) is an important class of optimization problems in which the objective function can be written as the difference of convex (DC) functions.
Up to now, one of the classical algorithms for DCP is the DC algorithm (DCA) [7], in which the nonconvex part of the objective function is replaced by a linear approximation. With DCA, only a convex optimization subproblem needs to be solved at each iteration. Since then, DCA has attracted a lot of researchers. Le Thi et al. [8] proved the linear convergence rate of DCA by employing the Kurdyka-Lojasiewicz inequality. Assuming that the subproblem of DCA can be solved easily [6], Gotoh et al. [4] proposed the proximal DC algorithm (PDCA) for solving DCP, in which the nonconvex part of the objective function is replaced by the same linearization as in DCA and the convex part is replaced by a quadratic approximation. The PDCA reduces to the classical proximal gradient algorithm for convex programming if the nonconvex part of the objective function is void [9]. To accelerate the PDCA, Wen et al. [10] introduced a proximal DC algorithm with extrapolation (PDCAe). The convergence rate of PDCAe depends heavily on the Kurdyka-Lojasiewicz inequality, under which PDCAe converges linearly in general [10].
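To make the DCA iteration concrete, the following is a minimal sketch on a toy one-dimensional DC function; the decomposition g(x) = x², h(x) = 2|x| is an assumed example, not taken from the paper:

```python
import numpy as np

# Toy DC objective (assumed example, not from the paper):
#   F(x) = g(x) - h(x), with g(x) = x^2 (convex) and h(x) = 2|x| (convex),
# so F(x) = x^2 - 2|x|, with stationary points at x = -1, 0, 1.

def dca(x0, iters=50):
    """Classical DCA: linearize h at x_k, then solve the convex subproblem."""
    x = x0
    for _ in range(iters):
        s = 2.0 * np.sign(x)      # a subgradient of h(x) = 2|x| at x
        # subproblem: min_y y^2 - s*y, solved in closed form by y = s/2
        x = s / 2.0
    return x

print(dca(0.5))   # converges to the stationary point x = 1
```

Starting from any positive point, the linearization of h is fixed after one step, so the iteration reaches a stationary point immediately; this is the "solve one convex subproblem per iteration" structure described above.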
In this paper, we study the linear convergence rate of PDCA by exploiting the structure of the problem, which is different from the techniques in [8,10]. Under the condition that the objective function can be decomposed using a strongly convex function and two convex functions with Lipschitz continuous gradients, we prove the linear convergence rate of PDCA, measured by the objective function value. The remainder of the paper is organized as follows. In Section 2, several useful preliminaries are recalled. In Section 3, the DC optimization problem is described in more detail, and the PDCA proposed in [4] is recalled for completeness. The linear convergence rate of the PDCA is established in Section 4. Final remarks are given in Section 5.

Preliminaries
In this section, we recall some useful definitions and properties.
Let f: R^n ⟶ [−∞, +∞] be an extended real-valued function. The domain of f is denoted by

dom f = {x ∈ R^n: f(x) < +∞}. (1)

If f(x) never equals −∞ for all x ∈ dom f and dom f ≠ ∅, we say that f is a proper function. If a proper function is lower semicontinuous, then it is called a closed function. A proper closed function f(x) is said to be level bounded if every lower level set of f (i.e., {x ∈ R^n: f(x) ≤ r} for r ∈ R) is bounded.

Let f: R^n ⟶ R ∪ {+∞} be a proper closed function. Then, the limiting subdifferential of f at x ∈ dom f is defined as follows:

∂f(x) = {v ∈ R^n: ∃ x^k ⟶ x, f(x^k) ⟶ f(x), and v^k ⟶ v with v^k ∈ ∂̂f(x^k)}, (2)

where ∂̂f(x) is the Fréchet subdifferential of f at x,

∂̂f(x) = {v ∈ R^n: liminf_{y⟶x, y≠x} (f(y) − f(x) − ⟨v, y − x⟩)/‖y − x‖ ≥ 0}. (3)

It is well known that the limiting subdifferential reduces to the classical subdifferential of convex analysis when f(x) is a convex function, that is,

∂f(x) = {v ∈ R^n: f(y) ≥ f(x) + ⟨v, y − x⟩ for all y ∈ R^n}.

Furthermore, if f is continuously differentiable, then the limiting subdifferential reduces to the gradient of f, denoted by ∇f, that is, ∂f(x) = {∇f(x)}.
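As a concrete illustration of the convex subdifferential characterization above, one can verify numerically that the subdifferential of |x| at 0 is the interval [−1, 1]; the example function is assumed here for illustration:

```python
import numpy as np

# Assumed example: for the convex function f(x) = |x|, the classical
# subdifferential at 0 is the interval [-1, 1]; every v with |v| <= 1
# satisfies the subgradient inequality f(y) >= f(0) + v*(y - 0) for all y.
f = abs
ys = np.linspace(-2.0, 2.0, 401)
for v in (-1.0, -0.3, 0.0, 0.5, 1.0):          # candidate subgradients, |v| <= 1
    assert all(f(y) >= f(0.0) + v * y - 1e-12 for y in ys)
# a slope outside [-1, 1] violates the inequality at some y
assert any(f(y) < 1.5 * y for y in ys)
print("subdifferential of |x| at 0 contains [-1, 1] but not 1.5")
```

Away from 0, f(x) = |x| is differentiable and the subdifferential collapses to the singleton {sign(x)}, matching the gradient reduction stated above.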

DC Programming and PDCA
In this section, we consider the DC programming problem:

min_{x∈R^n} F(x) ≔ f(x) + g(x) − h(x), (4)

where f: R^n ⟶ R is a strongly convex function with modulus a > 0 and g, h: R^n ⟶ R are convex functions whose gradients are Lipschitz continuous with constants L_g > 0 and L_h > 0, respectively. Throughout the paper, we assume that F(x) is level bounded and that a > 1. Apparently, F is a DC function, since both f + g and h are convex.
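A concrete instance of this problem class can be sketched as follows; all specific choices of f, g, and h are assumptions for illustration, picked so that the modulus a = 2 > 1 and the Lipschitz conditions hold:

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed concrete instance of problem (4):
#   f(x) = ||x||^2                strongly convex with modulus a = 2 > 1
#   g(x) = 0.5*||A x - b||^2      convex, gradient Lipschitz (L_g = ||A^T A||)
#   h(x) = sqrt(1 + ||x||^2)      convex, gradient Lipschitz (L_h = 1)
A = rng.standard_normal((5, 3))
b = rng.standard_normal(5)

f = lambda x: x @ x
g = lambda x: 0.5 * np.sum((A @ x - b) ** 2)
h = lambda x: np.sqrt(1.0 + x @ x)
F = lambda x: f(x) + g(x) - h(x)

grad_h = lambda x: x / np.sqrt(1.0 + x @ x)

x, y = rng.standard_normal(3), rng.standard_normal(3)

# Strong convexity of f with a = 2 holds with equality for f = ||.||^2:
#   f(y) = f(x) + <grad f(x), y - x> + (a/2)*||y - x||^2
assert abs(f(y) - (f(x) + 2 * (x @ (y - x)) + (y - x) @ (y - x))) < 1e-10

# Lipschitz continuity of grad h with L_h = 1
assert np.linalg.norm(grad_h(x) - grad_h(y)) <= np.linalg.norm(x - y) + 1e-12
print("the assumed instance satisfies the standing assumptions")
```

Since f + g grows quadratically while h grows only linearly, F is level bounded for this instance, as required.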

The Convergence Rate of PDCA
In this section, we establish the linear convergence rate of PDCA. To proceed, the following lemma is useful.

Since ∇h(x) is Lipschitz continuous with constant L_h > 0, we have

that is,

Since g is a convex function, we have

Summing (7), (9), and (10), we get

On the other hand, since h is a convex function, we have

which is equivalent to the following form:

Since ∇g(x) is Lipschitz continuous with constant L_g > 0, by (5), there exists 0 < μ ≤ 1/L_g such that

Mathematical Problems in Engineering
Summing (13) and (14), we have

(15)

Adding f(x) to both sides of (15), we get

Taking x = x^{k+1}, it follows that

By the optimality conditions of Algorithm 2, we know that

where ξ^{k+1} ∈ ∂f(x^{k+1}), which means that

By (11) and (17), it holds that

where the first equality follows from (19) and the last inequality follows from a > 1. The desired result follows.

Now, we are in a position to prove the main theorem.
□

Theorem 1. Let {x^k} be the sequence generated by Algorithm 2. Then,

where x^* is the stationary point of (4).
Proof. By Lemma 2, let x = x^k, and we have that

(1) Initial step: choose ε > 0 and x^0 ∈ R^n, and set k = 0.
(2) Iterative step: compute the new point by the following formula:

x^{k+1} ∈ argmin_{x∈R^n} {f(x) + g(x) − h(x^k) − ⟨∇h(x^k), x − x^k⟩}.

ALGORITHM 1: DCA for problem (4).
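Algorithm 2 (PDCA) is referenced in the proofs above; based on its description in the introduction (linearize h as in DCA and replace g by a quadratic approximation with stepsize μ ≤ 1/L_g), one plausible sketch of its iteration, on an assumed toy instance with f(x) = ‖x‖², is:

```python
import numpy as np

# Plausible PDCA iteration for F = f + g - h (form inferred from the
# introduction; the instance below is assumed for illustration):
#   x_{k+1} = argmin_x f(x) + <grad g(x_k) - grad h(x_k), x> + ||x - x_k||^2/(2 mu)
# With f(x) = ||x||^2 the subproblem has the closed-form solution below.

def pdca(x0, grad_g, grad_h, mu, iters=200):
    x = x0
    for _ in range(iters):
        c = grad_g(x) - grad_h(x)
        x = (x - mu * c) / (2.0 * mu + 1.0)   # minimizer of the strongly convex subproblem
    return x

# Assumed toy data: g(x) = 0.5*||x - b||^2 (L_g = 1), h(x) = sqrt(1 + ||x||^2) (L_h = 1)
b = np.array([3.0, 0.0])
grad_g = lambda x: x - b
grad_h = lambda x: x / np.sqrt(1.0 + x @ x)

x_star = pdca(np.zeros(2), grad_g, grad_h, mu=1.0)
# Stationarity for F: 0 = grad f(x) + grad g(x) - grad h(x) = 2x + grad g(x) - grad h(x)
residual = np.linalg.norm(2 * x_star + grad_g(x_star) - grad_h(x_star))
print(residual < 1e-8)   # True: the iteration reaches a stationary point of F
```

Because f is strongly convex, each subproblem is solved exactly in closed form, which mirrors the standing assumption that the PDCA subproblem is easy to solve.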

Then, it follows from μ > 0 that F(x^k) ≥ F(x^{k+1}), which means that the sequence {F(x^k)} is nonincreasing. Then, for any k_0 ∈ N, it follows that

By Lemma 2 again, let x = x^*, and we have that

By (23) and (25), we obtain

and the desired result follows. □
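The monotonicity F(x^k) ≥ F(x^{k+1}) used in this proof can be checked numerically on a toy instance; all function choices below are assumed for illustration:

```python
import numpy as np

# Numerical check, on an assumed toy instance, that the PDCA objective
# values F(x^k) are nonincreasing (f = ||.||^2, g = 0.5*||x - b||^2,
# h = sqrt(1 + ||x||^2), stepsize mu = 1 <= 1/L_g).
b = np.array([3.0, 0.0])
F = lambda x: x @ x + 0.5 * np.sum((x - b) ** 2) - np.sqrt(1.0 + x @ x)
grad_g = lambda x: x - b
grad_h = lambda x: x / np.sqrt(1.0 + x @ x)

x = np.zeros(2)
vals = [F(x)]
for _ in range(50):
    x = (x - (grad_g(x) - grad_h(x))) / 3.0   # PDCA step for f = ||.||^2, mu = 1
    vals.append(F(x))

# the sequence F(x^k) never increases along the iterates
assert all(v1 >= v2 - 1e-12 for v1, v2 in zip(vals, vals[1:]))
print("F(x^k) is nonincreasing")
```

On this instance the objective drops quickly in the first few iterations and then flattens out, consistent with the linear rate asserted by Theorem 1.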

Conclusions
In this paper, we establish the linear convergence rate of PDCA for the case in which the objective function is decomposed using a strongly convex function and two convex functions. Different from the methods in [8,10], which depend heavily on the Kurdyka-Lojasiewicz inequality, we give a simple proof based on the special structure of the optimization problem. There may also be other potential applications of the PDCA; we leave this for future work. For example, we will further study applications of the PDCA to nonconvex problems [11,12], tensor optimization problems [13,14], and related topics [15-18].

Data Availability
The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest
The authors declare that there are no conflicts of interest regarding the publication of this paper.