Optimization Control of Cooperative Trajectory towards Dual Arms Based on Time-Varying Constrained Output State

Aiming at the impact and disturbance of dual-arm robots in the process of coordinated transportation, a dual-arm cooperative trajectory optimization control based on time-varying constrained output state is proposed. According to the constraint relationship of the end-effector trajectory of the dual-arm coordinated transportation, the joint space trajectory mathematical model of the dual-arm coordinated transportation was established by using the master-slave construction method. Based on the time impact optimization index of joint trajectory, a multiobjective nonlinear equation is established. Using random probability distribution to extract the interpolation features of nonuniform quintic B-spline trajectory, the feature optimization target is selected, and the Newton numerical algorithm is used for iterative optimization. At the same time, it is combined with an elite retention genetic algorithm to further optimize the target. Based on the disturbance and tracking problem, a PD control method based on time-varying constrained output state is proposed, and the control law is designed. Its convergence is verified by establishing the Lyapunov function equation and asymmetric term. (e trajectory optimization results show that the proposed trajectory optimization method can increase the individual diversity and enhance the individual local optimization, thus avoiding the premature impact of the elite retention genetic algorithm. Finally, the proposed control method is simulated on the platform of Gazebo; compared with the traditional PD control method, the results show that the proposed control algorithm has high robustness, and the rationality of the coordinated trajectory control method is verified by the double-arm handling experiment.


Introduction
With the diversification of production tasks, the drawback of single operation for the traditionally mechanical arm has gradually emerged in recent years. In comparison, dual-arm robot possesses better flexibility and load capacity and is more competent for cooperative tasks. Particularly, in the face of some complex operation tasks, the advantages of dual-arm robots are more obvious, such as carrying heavy objects and flexible assembly. us, it has attracted more attention of researchers [1][2][3]. e dual-arm robot is not just a simple superposition of two single-arm robots but cooperates with each other in the same system. Compared with the single-arm operation, the dual-arm operation expands the cooperation between the dual arm. us, the coordinated control between two manipulators and adaptability of the environment is very important. e trajectory optimization and coordinated control between the dual-arm have been focused widely [4][5][6]. e coordinated operation consists of two parts: one is the coordinated trajectory planning, and the other is the coordinated motion control method.
Motion planning is the basis of dual-arm control. e dual-arm cooperative motion planning methods mainly include master-slave motion planning, obstacle avoidance based on motion planning, multipriority motion planning, and other planning methods. Trajectory optimization is a key research field of manipulator trajectory planning. To realize it, lots of methods have been developed so far. e optimal trajectory proposed by researchers in the last century is time-optimal trajectory. e joint motion angular velocity and angular acceleration continuous cubic spline interpolation curve were used. Under the speed constraint, the speed was adjusted to minimize the motion time. Because of the quick development of robotic techniques, the multiobjective optimization algorithm is more and more popular in the field of manipulator trajectory. Among them, the genetic algorithm is widely used. Particularly, to avoid the defects, it is usually employed by mixing with other optimization algorithms. In the previous research [7,8], for the optimization of the quintic B-spline trajectory with the goal of motion time, energy consumption, and jerky, an improved immune clonal selection algorithm was proposed. In order to overcome the deficiency of local optimization ability of the algorithm, He et al. [9] put forward a difference algorithm combined with the nonlinear program to optimize the objective function. To prevent the evolution of the elite population from being destroyed and affecting the global planning later, Gu's group utilized the elite retention strategy in the hybrid algorithm to avoid premature population [10]. All the above researches belong to the population competition strategy, and the search time is too long.
Owing to the friction, model error, and unknown disturbance, the end-effector trajectory of the manipulator will produce obvious deviation. erefore, the control of trajectory motion is particularly important. At present, the main control methods used in the manipulator are masterslave control, position/force hybrid control, impedance control, adaptive control [11][12][13], neural network control [14][15][16], fuzzy control [17], and so on. In [18], the authors adopted a PD model compensation synovial control law to achieve dual joint trajectory tracking, but it was not suitable for a high degree of freedom obviously. Wei et al. [19] proposed that the chattering of the system could be easily suppressed by using traditionally active compliance control, but in that case, an accurate dynamic model was needed. By virtue of constrained neural network position control [20,21], the anti-interference ability could be also enhanced and the chattering of the system could be suppressed effectively. However, for the high degree of freedom robot, the function of neural network fitting unknown term was not obvious. Besides, a position/force hybrid control method was also reported [22], in which the desired position and contact force at the end of the dual arm were taken as the input control variables, and the position control and force control modes were switched by selecting the matrix. However, this switching control mode was prone to make the system unstable and had poor suppression to external interference. Position/force hybrid control is to put force control and position control in orthogonal space to satisfy force control or position control at the expense of position in a certain direction.
In this paper, to solve the impact and disturbance problem of dual-arm coordinated transportation, a dualarm cooperative trajectory optimization control method based on time-varying constraint output state is proposed, and three aspects of coordination planning, trajectory optimization, and joint control are discussed in detail. Firstly, the master-slave kinematics modeling method is used to plan the dual-arm trajectory and establish the time impact trajectory optimization equation. According to the coordinated transportation, the constraint equations are established to obtain the pose, velocity, and acceleration of the master-slave arm. en, under the trajectory of the dual-arm robot that is well planned, the angle, angular velocity, and angular acceleration are optimized. To avoid the premature problem caused by the lack of local optimization ability, the genetic algorithm with elite mechanism and the random feature selection combined with Newton numerical iteration method [23,24] are employed as the global optimization and the local optimization, respectively. At the same time, a PD joint position control method with time-varying output constraints is proposed to avoid the problem of complicated modeling on the premise of overcoming the interference. A dual-arm control system is built by the combination with the master-slave framework. Also in our case, the rationality of the optimization algorithm is further verified by trajectory optimization experiments, and the effectiveness of the trajectory optimization control method of dual-arm coordinated transportation is tested by using Gazebo simulation. Figure 1 shows a way of coordinated handling. When the arms start to handle objects, two end-effectors are relatively stationary. us, the two end-effectors and the object can be regarded as a whole. In working, the master-slave method can be used to plan the coordinated transportation of the dual arm with the left arm as the main arm and the right arm as the slave arm. e movement tracking of the main arm is preset, and the movement tracking of the slave arm is obtained from the movement track of the main arm according to the constraint conditions of the dual arm. Figure 1, the pose constraint relationship between the object and the endeffector of the main arm is given by

Position and Orientation Constraints. In
where T Similarly, the pose relationship between master arm and slave arm can be expressed as follows: where T As R f L f can be obtained according to the initial state of cooperation between dual arm. us, equation (2) can be changed as follows: 2 Complexity In order to facilitate the experimental results, it only needs to plan the position and pose of the main arm end-effector. e position and pose matrix of the end-effector of the main arm can be obtained by equation (3) and the corresponding joint angle can be calculated by the inverse solution.

Speed Constraint.
e relationship between the position and posture and the differential motion of the main arm end-effector is as follows: where T next L is the next moment pose of the main arm endeffector, T cur L is the current moment pose of the main arm end-effector, and dT L is transformed from the differential motion (dq L ) of the main arm joint. Here, dq L can be calculated from equation (5), where _ q L is the joint velocity planned by the main arm, Δt is the instantaneous time difference, and acceleration has little effect on the joint control feedback. In order to improve the reliability of T next L and ignore the influence of instantaneous joint angular acceleration, an appropriate Δt can be set to reduce the influence of acceleration.
en, the next moment position and posture from the end-effector of the slave arm can be obtained from equation (3): where T next R is the position and posture of the slave manipulator end-effector at the next moment. e joint angle and joint differential (dq R ) at the next moment can be obtained by inverse solution, and thus, the joint angular velocity ( _ q R ) could be obtained.

Acceleration Constraint.
In order to analyze the acceleration constraint relationship of the dual-arm robots in coordinated transportation, equation (5) can be changed into the relationship among the displacement, velocity, and acceleration of the main arm, as shown in the following equation: where € q L is the angular acceleration of the main arm joint and Δt a is the instantaneous time difference, which is a small value. To ignore the instantaneous acceleration, here, Δt a should be many times larger than the value of Δt in equation (5). e corresponding joint angular acceleration ( € q R ) of the slave arm can be calculated by repeating equation (6).
To sum up, through the planning of the main arm position and pose, the corresponding main arm joint angle can be obtained. en, the joint angle and angular velocity as well as angular acceleration of the main and slave arm in the joint space can be further planned.

Establishing the Optimization Equation.
e masterslave motion planning method used in this paper focuses on the main arm. To ensure the great accuracy of angular displacement, the instantaneous variables of the slave arm are calculated at the expense of the accuracy of the next order variables. By analogy, the impact of the slave arm joint is not accurate. erefore, the coordinated motion trajectory of the dual arm is adjusted indirectly only by optimizing the time impact equation of the main arm. e B-spline curve has the advantage of local adjustment, which can improve the resolution on the premise of ensuring the accuracy of the fitting curve [25,26]. erefore, in our case, to establish a multiobjective nonlinear optimization equation about time impact, a nonuniform quintic B-spline curve to interpolate the joint trajectory of the main arm is adopted and shown in the following equation: where t i+1 − t i is time interval; t f is the execution time of the arm; v and a are the angular velocity and angular acceleration variables respectively; V max and A max are the constraint angular velocity and angular acceleration variable limits; J(x) is the joint acceleration (i.e., impact); S 2 is the square sum integral of impact.
In order to obtain the minimum value and increase the constraint-specific gravity, equation (8) is simplified to equation (9), in which w 1 , w 2 , m, n are the weights. Owing to the limit of angular velocity and angular acceleration of the slave arm, V max and A max can be reduced appropriately.

Genetic Algorithm Based on Elite
Mechanism. e genetic algorithm based on the elite mechanism extends the binary tournament mechanism and elite population retention mechanism on the basis of the genetic algorithm, which greatly improves the convergence ability. However, the binary tournament mechanism and elite population retention mechanism will lead to the decline of population diversity, thus ignoring the potential of some individuals and resulting in a premature algorithm, as shown in Figure 2.
e crossover operator can be described as equation (10), in which r is a random number of 0∼1 and x 1 , Considering that the time interval variables are all positive, the mutation operator can be changed as follows: where r 1 is a random number of 0∼1 and x 3 is the mutation position individual, f(p) � (1 − (p 1 /p 2 )) 2 , p 1 is the current iteration number, p 2 is the maximum iteration number, and abs is the absolute value.

Newton Numerical Iteration
Method. Newton numerical iteration is a method to get the root of an equation in nonlinear programming. For the equation with a positive objective function, the root obtained is minimum. e expression of Newton's solution is equation (12), is the variable solution of iterative update time interval, and k is the number of iterations.

Algorithm
Principle. e improved algorithm includes two aspects. On the one hand, an elite selection genetic algorithm is used to optimize the population. On the other hand, the Newton numerical iteration method is used to optimize individuals. As shown in Figure 3, our algorithm first optimizes individual operations to tap individual potential before retaining the excellent population. en, the local search ability is enhanced and the impact of precocity could be reduced.
As there are many time interval variables in the nonuniform quintic B-spline curve equation, the random sampling method can be used to jump out of the local extremum of local optimization [27]. Before Newton numerical iteration method, the random probability distribution is used to extract a single characteristic variable. e expression is as follows: where x n is the nth time interval variable; max x 1 , . . . , x n is the maximum value of time interval variable; min x 1 , . . . , x n is the minimum value of time interval variable; rand x 1 , . . . , x n is the random value of time interval variable; r is a random number of 0∼1; p 1 and p 2 are the distribution probability boundary. rough the combination of equations (12) and (13), we can optimize the objective function by taking the time discontinuous individual as the variable, limit the number of iterations, find the optimal variable from the Newton numerical iteration process, and optimize the individual. So as to overcome the potential neglect problem caused by the decline of population diversity.

Realization of Optimization Steps
Step 1. Population initialization. Set population size PS groups, the time interval variable range, and maximum number of evolution (gen max ). According to the left arm joint position (q L ), randomly select PS groups of time interval variables x 1 , . . . , x n . Set fitness as optimization objective equation.
Step 2. Individual single variable optimization. e number of iterations is limited to k. e Newton numerical iteration method and random probability distribution are used to select a single variable in the individual for iterative optimization. e fitness is used as the measurement standard, and the optimal solution is selected to replace the current individual to form a new population.
Step 3. Select the operation. With fitness as a measure, a binary tournament mechanism is adopted and the number of repeated selection candidates is limited to select the better candidate to enter the new population.
Step 4. Use Step 2 to optimize individuals.
Step 5. Cross operation. Two individuals are randomly selected and the crossover position is randomly selected to form a new individual by using the crossover operator.

Complexity
Step 6. Use Step 2 to optimize individuals.
Step 7. Mutation operation. A new individual is formed by randomly selecting an individual and a mutation position.
Step 8. Use Step 2 to optimize individuals.
Step 9. Elite retention. e offspring and parents were combined, and the best population was selected according to fitness.
Step 10. Evolutionary loop operation. While the current evolution number gen cur is less than the maximum evolution number gen max , return to Step 1.

Problem Description.
To overcome the disturbance of PD control and reduce the modeling difficulty of active compliance control, a constraint output state is proposed. It is assumed that the constraint condition of the manipulator output angular displacement is k m (t) ≤ θ(t) ≤ k m (t). In order to ensure the convergence of the closed-loop system, the following error constraints are given: Let angular displacement error e 1 � [e 11 , e 12 , . . . , e 1n ] � θ − θ d and angular velocity error e 2 � [e 21 , e 22 , . . . , e 2n ] � _ θ − α, in which α is the virtual controller. Define α: where k 1 � diag(k 11 (t), . . . , k 1n (t)); s � diag(s 11 (t), . . . , s 1n (t)).
Let s 1i (t) � In order to realize the convergence of the closed-loop system, the following two proved lemmas are given [28,29].
By Lemma 2, the output state is included in the timevarying constraints (− k a < e 1 < k b ). Obviously, the constraints include the constant case mentioned in literature [29,30]. erefore, the time-varying constraints are more flexible.

Controller Design and Stability Analysis. Lyapunov matrix equation is as follows:
en, where _ . en, equation (18) can be changed to where W, C, and G are inertia matrix, centrifugal and Coriolis force term matrix, and gravity moment matrix in the dynamic model. ey are all positive definite matrixes. By using PD control strategy, the controller can be designed as follows: where A � [u 1 e 2p− 1 11 , . . . , u n e 2p− 1 1n ] and K 2i > 0.5, as control gain matrix.
In this paper, considering the complexity of dynamic modeling, the inertia estimation matrix K f is selected, and the controller is changed to Combining equation (19) with equation (21)  Complexity where ε is the approximation error. From Young's inequality, equation (23) is obtained: According to Lemma 1, we can get the following result: where k * 2i − 0.5 > 0. Obviously, where p � 2pk 1i , 2k 2i and C � (1/2)ε 2 . us, equation (26) is easy to be obtained: Based on the constructed Lyapunov function, the following result can be obtained: Obviously, So when the initial ξ(0) < 1, based on Lemma 2, we can determine ξ(t) < 1, − k ai < e 1i < k bi , that is, the angular displacement error does not exceed the error constraint limit, so e 1i is convergence. It is easy to conclude that α is convergence, too.
From equation (27), the following equation is established: erefore, the output angular velocity error converges. To sum up, all output convergence can be gained by constraining the output state.

Dual-Arm Coordinate Control.
In this paper, we choose the dual-arm robots as the experiment object. Using the master-slave programming method with the left arm as the main arm and the right arm as the slave arm, the expected angular displacement, angular velocity, and angular acceleration of the dual arm are planned as the input value. Under the constraint of the constraint equation, the dual arm is controlled, respectively, according to the output value feedback. e coordinated control framework is shown in Figure 4.
As shown in Figure 4, the trajectory of the main arm is optimized and then input into the joint controller to complete the coordinated transportation operation.

Initial State and Trajectory Setting of Dual Arm. In this paper, dual-arm robots are composed of two iiwa14 manipulators, and its D-H model is built by using Robotics
Toolbox. e base position of the left and right arms in the global coordinate system is as follows: Let the starting angle of the left arm be q L � [− 1.768, − 0.321, 0, 1.588, 0, − 1.233, − 1.768] (in radian system), q R � − q L , and the Cartesian space trajectory of the endeffector position of the left arm (stationary during the period) is as follows: where

Optimization Calculation.
e weights w 1 , w 2 , m, n in the objective equation are set to be 1, and the constraint conditions V max and A max are set to be 0.4 (rad/s) and 1 (rad/s 2 ). e initial velocity, initial acceleration, terminal velocity, and terminal acceleration of each joint are set to be 0 by using a 5-degree nonuniform B-spline curve interpolation trajectory. e optimal retention coefficient of the optimization algorithm, the upper limit of replication times of the binary tournament, the population size, the maximum evolution algebra gen max , and the cross probability are 1/2, 2, 30, 100, and 0.6, respectively. e mutation probability is given by p mutation � 0.2 + ((0.001 − 0.2) * gen/gen max ). In Newton's iterative method, the upper limit of iteration times is 3, and the maximum as well as minimum features of trajectories are extracted by random probability distribution. e distribution probability boundaries p 1 and p 2 are 0.5 and 1, respectively. After the interval variable area is set to (0.05, 1), the results of 5 random trials are shown in Table 1, and the fitness changes of test 1 in three algorithms (the improved optimization algorithm in this paper, the elite genetic algorithm (ESGA), and the standard genetic algorithm (SGA)) are shown in Figure 5.
It can be seen from Table 1 and Figure 5 that both the optimal fitness and average fitness of the improved optimization algorithm as well as ESGA are almost the same in 6 Complexity the end. ey converge at about 50 generations, indicating that the algorithm with elite mechanism will reduce the population diversity. Table 1 and Figure 5 show that the population diversity of SGA is better than that of ESGA and our optimization algorithm and the evolution direction is limited due to the uneven population. As also shown in Table 1 and Figures 5(a) and 5(b), ESGA adds a competition mechanism and retains the elite part; compared with the improved optimization algorithm, the local optimization ability is poor; thus, it is unable to tap individual potential and avoid premature ESGA. In a word, the improved optimization algorithm can enhance the local search ability to optimize individuals and the influence of premature ESGA can be avoided, but the disadvantage is also clear; namely, the optimization time is longer. To further optimize the algorithm, we need to increase the diversity of individual variables. ere are three schemes, scheme 1 is to randomly extract the maximum and minimum features of the trajectory Scheme 2 takes random extraction of maximum and minimum features as the main method and random extraction of features as the secondary method. Scheme 3 is to randomly extract trajectory features. (p 1 , p 2 ) of the three schemes are (0.5, 1), (0.45, 0.9), and (0, 0), respectively. e results of three randomized trials are displayed in Table 2, and the fitness changes of test 1 in three schemes are shown in Figure 6.
From Table 2 and Figure 6, it can be seen that the evolution of scheme 1 is stagnant in about 50 generations, while schemes 2 and 3 are still evolving slowly and have better optimization ability. erefore, random feature extraction can increase the diversity of individual variables, so that individual variables can continue to be optimized. From Table 2 and Figures 6(b) and 6(c), in comparison, the stability of optimization variables of scheme 3 is not as good as scheme 2. Also, the average optimization ability and average time of scheme 2 are better than those of scheme 3 in the three tests. erefore, scheme 2 is the best. e optimal test (test 2 of scheme 3) is selected from the tested tests to plan the trajectory of dual-arm robots. e planned trajectory is shown in Figure 7. Figures 7(a)-7(c) show that the improved optimization algorithm does not exceed the motion constraints. Moreover, the motion trajectory is smooth and does not produce mutation, which verifies the rationality of the improved optimization algorithm again. Figures 7(a) and 7(d) suggest that the coordinated movement can be completed by coordinating the pose constraint relationship. Figures 7(a)-(7c) indicate that the coordinated motion trajectory can be obtained by combining the differential motion with the constraint relationship of coordinated motion, which is the basis of dynamic simulation and can be determined by the results of Gazebo simulation.

Simulation of Coordinated Operation
To verify the feasibility of the dual-arm coordinated motion trajectory and control method, the information of dual-arm coordinated motion trajectory is connected with the plan interface of Moveit, and the coordinated handling control simulation is realized on the Gazebo platform.
In order to verify the tracking performance of the proposed control method, a comparative test is carried out with a single arm as the research object. e expected angular displacement, angular velocity, and angular acceleration of the joint are sin(t) (rad), cos(t) (rad/s), and − sin(t) (rad/s 2 ), respectively. e constrained boundary k m (t) is (0.2θ d − 3, 1.2θ d + 0.8). In the virtual controller, k 1 and β are diag(0.1,. . ., 0.1) and 0.2, respectively. e factors k 3 and k 4 in the adaptive law are 5 and 1, respectively. In the controller, k 2 is 30 and k f is 0.4 * diag (3,4,4,4,4,3,2). e real-time ratio when debugging Gazebo ranges from 0.92 to 0.97, and the comparison effect with the PD control method of selecting appropriate PID parameters is shown in Figure 7.
In Figure 8, the initial and end time of planning trajectory are about 4 and 14 s, respectively. Because the initial and end speed of the planned trajectory are inconsistent with the set speed in Gazebo, there will be sudden changes at the beginning and end of the trajectory. e comparison between Figures 8(a)-8(d) shows that the tracking error of PD control in joint position and speed is far more than that of the proposed control method, testing the effectiveness of the proposed control method indirectly.
For the dual-arm coordinated transportation experiment, the mass of the object is set as 10 kg, the expected trajectory of the dual arm is the planned dual-arm coordinated trajectory, and the other control parameters remain unchanged. e transportation process and trajectory are shown in Figures 9 and 10. Figure 8 shows that the task of dual-arm coordinated transportation can be completed by the dual-arm coordinated trajectory planning and the dual-arm coordinated transportation method proposed in this paper. e mass of the block in Figure 8 is 10 kg, which indirectly verifies the Complexity strong anti-interference ability of the proposed control algorithm in the process of coordinated transportation. To further verify the transportation effect of the dual arm, the joint information is derived from the Gazebo, and the tracking error of the dual arm, the joint torque, and the trajectory of the transportation end are drawn in Figure 9.
It can be seen from Figures         smooth state as a whole, and the maximum error difference is less than 0.004 rad. In addition, the tracking speed error curve is in a small vibration curve, the maximum error is not more than 0.03 rad/s, and the average error is less than 0.01 rad/s. In Figure 10(g), the end-effector trajectory is almost the same as the expected trajectory. In conclusion, the effectiveness of the coordinated planning method, the high tracking accuracy, and the robustness of the control method are all verified. Figures 10(c) and 10(f ) show that the output torque of the manipulator is smooth without mutation, confirming the rationality of joint trajectory optimization control.

Conclusion
Aiming at the problem of impact and disturbance of dualarm coordinated transportation, a dual-arm cooperative trajectory optimization control method based on timevarying constraint output state was proposed, and three aspects of coordination planning, trajectory optimization, and joint control were mainly discussed.
(1) Considering the joint impact problem in the next step, this paper adopts the master-slave path planning method, plans the left arm joint trajectory with nonuniform fifth B-spline, and takes the coordinated motion as the constraint condition. And the joint displacement, joint velocity, and joint acceleration of the right arm by combining the instantaneous differential motion relationship were calculated. us, the calculation was simplified and the accuracy of the Jacobian inverse matrix could be avoided. (2) To solve the joint impact problem, the main arm was taken as the joint optimization objective and the joint optimization objective equation was established. At the same time, to overcome the local extremum defect of traditional genetic algorithm with elite strategy, a genetic algorithm based on elite mechanism and individual random variable optimization was also proposed. (3) For the disturbance and trajectory tracking problems, a PD joint position control method with timevarying output constraints was proposed on the premise of overcoming the complex modeling problems of various controllers. By the combination with the master-slave framework, a dual-arm robot control system was built. (4) e rationality of the proposed optimization algorithm was verified by trajectory optimization experiments. Based on the optimization algorithm, three schemes were compared and analyzed, and the second scheme was the best one. (5) e effectiveness of this method was tested by the simulation of Gazebo.

Data Availability
e data used to support this study are included within the article.

Conflicts of Interest
e authors declare that there are no conflicts of interest regarding the publication of this paper.