Human-Robot Interaction and Demonstration Learning Mode Based on Electromyogram Signal and Variable Impedance Control

In this research, the properties of a variable admittance controller and a variable impedance controller were first simulated in MATLAB; the simulations showed that both controllers perform well in trajectory tracking and physical interaction. Secondly, a new mode of learning from demonstration (LfD) that conforms to human intuition and has good interaction performance was developed by combining electromyogram (EMG) signals with a variable impedance (admittance) controller in dragging demonstration. In this mode, demonstrators can not only interact with the manipulator intuitively but also transmit end-effector trajectories and impedance gain scheduling to the manipulator for learning. A dragging demonstration experiment in 2D space was carried out with this learning mode. Experimental results revealed that the designed human-robot interaction and demonstration mode lets demonstrators control the interaction performance of the manipulator directly, which improves the accuracy and time efficiency of the demonstration task. Moreover, the trajectory and impedance gain scheduling can be retained for the subsequent learning process in autonomous compliant operations of the manipulator.


Introduction
With the intensifying aging trend and the proposal of new industrial development strategies in recent years, cooperative robots that have variable impedance compliant operation abilities and can accomplish complicated interaction tasks with human beings are attracting wide attention. Nevertheless, conventional robot programming and the teach pendant clearly cannot meet the need for learning and demonstrating diversified human-robot collaboration or interaction tasks. Therefore, it is necessary to develop new human-robot interaction and task learning modes for robots.
Robot learning and implementation of complicated variable impedance human-robot interaction and cooperation tasks involve the following research contents.
(1) The variable impedance control algorithm is the basis for compliant operation of robots and can also be used to improve the intuitiveness of the interaction and demonstration process.
(2) A new human-robot interaction and demonstration mode can help the robot obtain better learning data and reduce the workload of demonstrators by using appropriate data collection (vision and EMG signals as well as the encoders and force sensor of the robot [1,2]) and control methods (variable impedance or admittance control).
(3) The collected data are converted by an appropriate learning algorithm into commands that the robot can understand and execute. At the same time, the robot is given the ability to adapt to changes in the task environment.
Generally speaking, variable impedance operation of a manipulator can be realized either by variable impedance joints in the mechanical structure [3,4] or by a variable impedance algorithm running on rigid or flexible joints [5,6]. The latter not only avoids the complicated engineering problems of the former and is conducive to thorough theoretical research, but can also reproduce the interaction effect of the former to a large extent [5,7]. Of course, the variable impedance control algorithm has its own problems. For instance, the stability of variable impedance control has attracted extensive attention from scholars in recent years [8,9].
How to acquire an appropriate impedance gain scheduling is an important problem of the variable impedance control algorithm. The impedance gain scheduling represents the interaction performance of the manipulator in a specific interaction task. Acquiring it requires the design and study of a human-robot interaction and demonstration mode and of a learning algorithm. Existing studies mainly include the following: (1) Impedance parameters were adjusted based on intelligent control methods, such as adaptive control, optimal control, fuzzy control, and machine learning [10-12]. For example, Dimeas et al. adjusted the impedance gain scheduling through fuzzy learning and reinforcement learning [11,13,14].
Learning from demonstration and developing new controllers by observing and summarizing the laws of human motion [19] are two promising directions for developing new variable impedance control algorithms and human-robot interaction modes. Firstly, the human demonstrator's learning and cognition ability in variable impedance operation provides the robot with high-quality initial learning data [20], which can save learning time and enhance the final learning effect of the robot. Secondly, the movement trajectory and impedance parameters of human arms are easy to acquire in many ways. Trajectory data can be acquired by dragging demonstration [6], vision [21], and teleoperation [18], while impedance parameters can be acquired through EMG signals [21] or kinesthetic teaching [15]. Combined with a learning from demonstration algorithm, demonstrators can teach intuitively and the teaching process is simplified as far as possible. Moreover, robots can learn not only the representation (trajectory) of human actions but also the dynamic changes and intentions (impedance parameters of human arms) of the human during movement. If the robot can extract some skills of human operation, its reaction ability and fault tolerance in independent execution of tasks might be improved.
This study mainly addresses how humans can effectively transmit the complex operation ability of the human arm to a manipulator and how robots can learn and reproduce human operation skills. The main research contents are as follows.
(1) The influences of parameter changes of the variable impedance controller and the variable admittance controller on trajectory tracking and physical interaction were analyzed through simulations, which gives an intuitive understanding of the control methods and shows the interaction effect of the variable impedance (admittance) controller.
(2) A new human-robot interaction and demonstration mode that combines variable impedance (admittance) control and EMG signals makes interaction and demonstration more intuitive. While the demonstrator drags the manipulator, the end-effector trajectory of the manipulator is recorded and the end-point stiffness of the demonstrator's arm is estimated from EMG signals collected at the same time. This yields the desired end-effector trajectory and the impedance gain scheduling of a task as planned by the human brain.
(3) Existing styles of learning from demonstration are introduced. By searching for and discussing an appropriate LfD algorithm, the data recorded from the variable impedance controller and the human-robot interaction mode can be used to generate commands for independent robot operation; this part of the work lays the foundation for future studies.
The remainder of this paper is organized as follows. Section 2 introduces the proposed method, including the theoretical derivation of variable impedance control and the end-point stiffness estimation of the human arm. Section 3 presents the simulations and experiments and generates an executable end-effector trajectory and impedance gain scheduling for the manipulator. Section 4 shows the results of analysis and discussion. Section 5 gives the conclusion and introduces the future work.

Proposed Approach
2.1. Impedance and Admittance Control.
Impedance control has two feasible forms, namely, impedance control based on robot position control and impedance control based on joint torque control. Some authors refer to the former as admittance control and to the latter as impedance control.

Cartesian Space Admittance Control.
Firstly, the Cartesian space admittance control of the robot was simulated. The trajectory changes were observed while the impedance parameters were adjusted, so that the compliant control effect can be experienced intuitively; the result also helps in selecting impedance parameters later. Admittance control involves no dynamic control of the manipulator; it makes the manipulator compliant by setting position commands on the basis of the admittance control law. Therefore, the robotic toolbox of MATLAB was used in this part of the work. The 7 degrees of freedom (DOFs) manipulator model of the iiwa was constructed for simulation.
Generally, the expected dynamic equation of the end-effector realized by admittance control is

$$M_d\,\Delta\ddot{x} + D_d\,\Delta\dot{x} + K_d\,\Delta x = F_{ext}, \tag{1}$$

where $M_d, D_d, K_d \in \mathbb{R}^{6\times 6}$ are the expected inertia matrix, expected damping matrix, and expected stiffness matrix, respectively. The expected damping matrix and the expected stiffness matrix can vary with tasks. $\Delta x$, $\Delta\dot{x}$, and $\Delta\ddot{x}$ are the position deviation, velocity deviation, and acceleration deviation at the end-effector of the manipulator:

$$\Delta x = x_r - x_d, \qquad \Delta\dot{x} = \dot{x}_r - \dot{x}_d, \qquad \Delta\ddot{x} = \ddot{x}_r - \ddot{x}_d, \tag{2}$$

where $x_d$ is the expected trajectory, i.e., the set of expected points specified in the program, and $x_r$ is the reference position, which is solved from the expected dynamic model and used as the target of the actual movement of the manipulator.
In other words, the position controller of the manipulator drives the joints to move to this reference position. Therefore, the end-effector of the manipulator exhibits the characteristics of a spring-damper second-order system. Figure 1 shows the concept of admittance control.
To simplify the programming while retaining the elasticity, the expected dynamic equation of the end-effector was modified as

$$D_d\,(\dot{x}_r - \dot{x}_d) + K_d\,(x_r - x_d) = F_{ext}. \tag{3}$$

For the manipulator, the Jacobian matrix is needed to convert the reference motion in Cartesian space into the reference motion in joint space. For the manipulator of 7 DOFs, the pseudoinverse Jacobian matrix $J^{\dagger}$ is

$$J^{\dagger} = J^{T}\left(J J^{T} + \lambda^{2} I\right)^{-1},$$

where $J$ is the Jacobian matrix of the manipulator, $J^{T}$ is the transpose of the Jacobian matrix, $I \in \mathbb{R}^{6\times 6}$ is the unit matrix, and $\lambda$ is a small damping factor.
Given the end-effector force $F_{ext}$, the expected position $x_d$, the expected velocity $\dot{x}_d$, and the last reference position $x_r^{p}$ of the manipulator, the reference position $x_r$ of the manipulator end-effector is obtained by solving (3). The reference velocity $\dot{x}_r$ is then mapped to the joint velocity through the Jacobian pseudoinverse:

$$\dot{q}_r = J^{\dagger}\,\dot{x}_r. \tag{5}$$
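The update described above can be sketched for a single Cartesian axis. This is a minimal illustration, not the authors' MATLAB code: it assumes the simplified law has the damping-stiffness form $D_d(\dot{x}_r - \dot{x}_d) + K_d(x_r - x_d) = F_{ext}$ with scalar gains, and it approximates $\dot{x}_r$ by a backward difference over the sampling period. The function name `admittance_step` and all numeric values are hypothetical.

```python
def admittance_step(f_ext, x_d, xdot_d, x_r_prev, d_gain, k_gain, ts):
    """One update of the assumed admittance law on a single axis.

    Assumed law: d*(xdot_r - xdot_d) + k*(x_r - x_d) = f_ext, with
    xdot_r approximated by the backward difference (x_r - x_r_prev)/ts.
    Returns the new reference position and reference velocity.
    """
    if d_gain <= 0.0 or k_gain < 0.0 or ts <= 0.0:
        raise ValueError("need d_gain > 0, k_gain >= 0 and ts > 0")
    c = d_gain / ts
    # Solve the linear equation for x_r after inserting the backward difference.
    x_r = (f_ext + d_gain * xdot_d + c * x_r_prev + k_gain * x_d) / (c + k_gain)
    xdot_r = (x_r - x_r_prev) / ts
    return x_r, xdot_r


# Hypothetical numbers: 5 N push, target at rest in the origin, 1 ms sample time.
x_r, xdot_r = admittance_step(f_ext=5.0, x_d=0.0, xdot_d=0.0, x_r_prev=0.0,
                              d_gain=50.0, k_gain=500.0, ts=0.001)
# The residual checks that the returned pair satisfies the assumed law.
residual = 50.0 * (xdot_r - 0.0) + 500.0 * (x_r - 0.0) - 5.0
print(f"x_r = {x_r:.6e} m, xdot_r = {xdot_r:.6f} m/s, residual = {residual:.2e}")
```

With zero external force and the previous reference already on the target, the step returns the target unchanged, so the controller only deviates from the expected trajectory while a force is applied.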
With reference to the previous joint position $q_p$, the position command $q_r$ that is sent to the manipulator at the next moment is

$$q_r = \dot{q}_r\,T_s + q_p, \tag{6}$$

where $T_s$ is the sampling period of the controller. Another modification, taken from [6], lets the robot be dragged easily by a human in free-space cooperation:

$$D_d\,\dot{x}_r = F_{ext}. \tag{4}$$

In this case the reference velocity $\dot{x}_r$ can be calculated from (4) directly, and applying (5) and (6) again gives the position command $q_r$.
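The joint-space part of this step (reference velocity to joint velocity, then $q_r = \dot{q}_r T_s + q_p$) can be illustrated on a much smaller arm. The sketch below is an assumption-laden toy: it uses a planar 2-link arm instead of the 7-DOF iiwa, a damped pseudoinverse $J^{T}(JJ^{T} + \lambda^{2} I)^{-1}$ with a hypothetical damping factor `lam`, and hypothetical link lengths.

```python
import math

L1, L2 = 0.4, 0.4  # hypothetical link lengths [m]


def forward_kinematics(q):
    """End-effector position of the planar 2-link arm."""
    return [L1 * math.cos(q[0]) + L2 * math.cos(q[0] + q[1]),
            L1 * math.sin(q[0]) + L2 * math.sin(q[0] + q[1])]


def jacobian(q):
    """2x2 Jacobian of the planar 2-link arm."""
    s1, c1 = math.sin(q[0]), math.cos(q[0])
    s12, c12 = math.sin(q[0] + q[1]), math.cos(q[0] + q[1])
    return [[-L1 * s1 - L2 * s12, -L2 * s12],
            [L1 * c1 + L2 * c12, L2 * c12]]


def damped_pinv_apply(jac, v, lam):
    """Return J^T (J J^T + lam^2 I)^-1 v for a 2x2 Jacobian."""
    (a, b), (c, d) = jac
    # A = J J^T + lam^2 I is symmetric positive definite for lam > 0.
    a11 = a * a + b * b + lam * lam
    a12 = a * c + b * d
    a22 = c * c + d * d + lam * lam
    det = a11 * a22 - a12 * a12
    y0 = (a22 * v[0] - a12 * v[1]) / det
    y1 = (-a12 * v[0] + a11 * v[1]) / det
    return [a * y0 + c * y1, b * y0 + d * y1]


def joint_command(q_prev, xdot_r, ts, lam=0.01):
    """Position command q_r = qdot_r * ts + q_prev."""
    qdot_r = damped_pinv_apply(jacobian(q_prev), xdot_r, lam)
    return [qp + qd * ts for qp, qd in zip(q_prev, qdot_r)]


q_p = [0.3, 0.9]
q_r = joint_command(q_p, xdot_r=[0.05, -0.02], ts=0.001)
step = [b - a for a, b in zip(forward_kinematics(q_p), forward_kinematics(q_r))]
print("joint command:", q_r)
print("Cartesian step:", step)
```

The Cartesian step produced by one command is close to `xdot_r * ts`; the small mismatch comes from the damping term, which is the price paid for keeping the command finite near singular configurations.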
These two admittance controllers serve different purposes. The first can be used to execute interaction tasks while ensuring that the robot does not generate excessive interaction force. The second can be used for demonstration teaching, where changing the impedance gain gives the demonstrator a different interaction feel.

Joint Space Impedance Control.
Next, the impedance control of the manipulator was simulated. Impedance control must be realized on the basis of the manipulator dynamics: appropriate feedforward and feedback terms are introduced to compensate the dynamics of the manipulator and to generate the desired impedance characteristic. If the end-effector trajectory is not considered, a single-joint simulation can approximately represent the simulation of the whole manipulator. In this paper, the impedance control law is given for the 7-DOF case, while a single-joint model was constructed in Simulink and an impedance controller was designed to control that single joint. The concept of impedance control is shown in Figure 2.
Firstly, it is assumed that the rigid body dynamics of the manipulator joints are known accurately:

$$M_r(q)\,\ddot{q} + C_r(\dot{q}, q)\,\dot{q} + G(q) = \tau_c + \tau_e,$$

where $q, \dot{q}, \ddot{q} \in \mathbb{R}^{N}$ are the position, velocity, and acceleration of each joint of the manipulator. $M_r(q) \in \mathbb{R}^{N\times N}$ is the positive-definite inertia matrix of the robot in joint space. $C_r(\dot{q}, q) \in \mathbb{R}^{N\times N}$ is the Coriolis and centrifugal matrix. $G(q) \in \mathbb{R}^{N}$ is the gravity torque on the joints, while $\tau_c \in \mathbb{R}^{N}$ and $\tau_e \in \mathbb{R}^{N}$ are the control torque and the external torque of the robot, respectively.
In the joint space, the expected dynamic relationship of impedance control can be expressed as

$$M_d\,(\ddot{q} - \ddot{q}_d) + D_d\,(\dot{q} - \dot{q}_d) + K_d\,(q - q_d) = \tau_e,$$

where $M_d$, $D_d$, and $K_d \in \mathbb{R}^{N\times N}$ are the expected inertia, expected damping, and expected stiffness, respectively, and $q_d$ is the expected joint position. The expected inertia is a fixed value, but the damping and stiffness matrices are variable.
The inverse dynamics command $\tau_c$ is set as

$$\tau_c = M_r(q)\,u + C_r(\dot{q}, q)\,\dot{q} + G(q) - \tau_e.$$

It can be understood as a feedforward term that offsets the influences of the manipulator dynamics and of the external force on the control torque. Next, a new control law $u$ was designed:

$$u = \ddot{q}_d + M_d^{-1}\left[\tau_e - D_d\,(\dot{q} - \dot{q}_d) - K_d\,(q - q_d)\right].$$

Obviously, when the expected inertia is equal to the actual inertia of the manipulator, $M_d = M_r$, the external force terms cancel and it is not necessary to measure the external force in this controller. This hypothesis is set true in the single-joint simulation.
Based on the Simulink model derived above, sinusoidal trajectory tracking by a single joint and its interaction with an external force under variable impedance control were tested. Simulation results are introduced in Section 4.
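A single-joint version of this control scheme can be sketched in a few lines. The code below is an illustration under stated assumptions, not the authors' Simulink model: the plant is taken as $M_r\ddot{q} + G(q) = \tau_c + \tau_e$ with a hypothetical gravity torque, the control torque is taken as $\tau_c = M_r u + C_r\dot{q} + G - \tau_e$ with $u = \ddot{q}_d + M_d^{-1}[\tau_e - D_d(\dot{q} - \dot{q}_d) - K_d(q - q_d)]$, and all gains and inertias are made-up values.

```python
import math


def impedance_torque(q, qdot, q_d, qdot_d, qddot_d, tau_e,
                     m_r, c_r, g_r, m_d, d_d, k_d):
    """Control torque of one joint under the assumed inverse-dynamics impedance law."""
    u = qddot_d + (tau_e - d_d * (qdot - qdot_d) - k_d * (q - q_d)) / m_d
    return m_r * u + c_r * qdot + g_r - tau_e


def simulate(k_d, d_d, tau_e, m_r=0.5, m_d=0.5, t_end=5.0, dt=0.001):
    """Hold q_d = 0 under a constant external torque; return the final joint angle."""
    q, qdot = 0.0, 0.0
    for _ in range(int(round(t_end / dt))):
        grav = 2.0 * math.sin(q)  # hypothetical gravity torque of the link
        tau_c = impedance_torque(q, qdot, 0.0, 0.0, 0.0, tau_e,
                                 m_r, 0.0, grav, m_d, d_d, k_d)
        qddot = (tau_c + tau_e - grav) / m_r  # assumed plant model
        qdot += qddot * dt                    # semi-implicit Euler integration
        q += qdot * dt
    return q


# With m_d == m_r the commanded torque does not depend on the measured tau_e.
t_no_force = impedance_torque(0.1, 0.2, 0.0, 0.0, 0.0, 0.0,
                              0.5, 0.0, 0.3, 0.5, 20.0, 100.0)
t_with_force = impedance_torque(0.1, 0.2, 0.0, 0.0, 0.0, 3.0,
                                0.5, 0.0, 0.3, 0.5, 20.0, 100.0)
print(t_no_force, t_with_force)
print("steady deflection:", simulate(k_d=100.0, d_d=20.0, tau_e=2.0))
```

Under a constant external torque the joint settles at a deflection of `tau_e / k_d`, which is the spring-like behaviour the expected impedance relationship prescribes.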

Human-Robot Interaction and Demonstration Mode Based on EMG Signals.
In this section, the end-point stiffness of the human arm was estimated by measuring the EMG signals of a pair of antagonistic muscles and then mapped onto the control parameters of the robot. The flexor carpi radialis (FCR) and extensor carpi radialis (ECR) were chosen as the antagonistic muscles for estimating the end-point impedance parameters of the human arm. The locations of these two muscles are shown in Figure 3.
Based on the measured EMG signal $E_i$ of each of these two muscles and the corresponding maximum voluntary contraction value $E_i^{MVA}$, the activation percentage $a_i$ of muscle $i$ is

$$a_i = \frac{E_i}{E_i^{MVA}} \times 100\%.$$

Next, the co-contraction degree of the muscles, $c_W$, is calculated from the two activation percentages and used to estimate the end-point stiffness $K$ of the arm.
In fact, $c_W$ represents the stiffness of the wrist joint. The dynamics of the arm were neglected during slow movement, and it is simply assumed that $c_W$ can be used to estimate $K$. Once $K$ is obtained, two functions can be realized.
(1) The first function is real-time adjustment of the interaction performance in interactive teaching through changes of the stiffness of the human arm. The estimated end-point stiffness of the arm is used as a reflection of the movement intention of the demonstrator. When the end-point stiffness of the demonstrator's arm is high, the demonstrator is attempting to guide the manipulator through an action that requires accurate positioning or strong force interaction with the environment, e.g., passing through a narrow gap or inserting a hinge pin into a hole. Therefore, the manipulator shall be kept in the high damping state of the impedance controller in order to ensure movement accuracy and interaction stability. On the contrary, if the end-point stiffness of the demonstrator's arm is low, the demonstrator has no requirement on the trajectory accuracy of the manipulator but focuses on flexibility and fast movement. Hence, the manipulator shall be in the low damping state.
(2) Moreover, the impedance parameter changes and the trajectory of the manipulator in the demonstration process are stored and can be used in the next learning step. In this process, the human arm stiffness level should be transformed into the end-effector stiffness and damping of the robot, so that the robot can show good interaction performance in an interaction task by imitating the human demonstrator.
In this way, the demonstrator can not only transmit the impedance gain scheduling and trajectory easily in demonstration teaching but also obtains a very intuitive and comfortable interaction. Figure 4 shows the basic structure of the variable impedance interaction and demonstration mode based on human arm end-point stiffness estimation from EMG signals.
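The mapping from EMG amplitude to controller damping described in this section can be sketched as follows. Several parts are assumptions made only for illustration: the co-contraction index is taken as the smaller of the two activation levels (one common choice, not necessarily the one used here), the damping is interpolated linearly between two hypothetical limits `d_low` and `d_high`, and the drag behaviour is modelled as a pure damping admittance.

```python
def activation(emg, emg_mvc):
    """Activation level in [0, 1]: EMG amplitude divided by its MVC amplitude."""
    if emg_mvc <= 0.0:
        raise ValueError("MVC amplitude must be positive")
    return min(max(emg / emg_mvc, 0.0), 1.0)


def cocontraction(a_fcr, a_ecr):
    """Assumed index: the activation shared by both antagonistic muscles."""
    return min(a_fcr, a_ecr)


def damping_from_emg(emg_fcr, emg_ecr, mvc_fcr, mvc_ecr,
                     d_low=10.0, d_high=80.0):
    """Linear map from co-contraction to drag damping (hypothetical limits)."""
    c_w = cocontraction(activation(emg_fcr, mvc_fcr),
                        activation(emg_ecr, mvc_ecr))
    return d_low + (d_high - d_low) * c_w


def drag_velocity(f_ext, damping):
    """Reference velocity of a pure-damping admittance: xdot_r = f_ext / damping."""
    return f_ext / damping


relaxed = damping_from_emg(0.05, 0.10, 1.0, 1.0)  # relaxed arm
tense = damping_from_emg(0.80, 0.70, 1.0, 1.0)    # co-contracted arm
print("damping:", relaxed, tense)
print("velocity for a 10 N pull:", drag_velocity(10.0, relaxed), drag_velocity(10.0, tense))
```

For the same pulling force, the tense arm produces a larger damping and therefore a slower, more precise motion, while the relaxed arm lets the manipulator move quickly.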

Learning and Reproduction of Trajectory.
A certain number of demonstration trajectories can be acquired easily based on the abovementioned design and simulation of the variable impedance controller and the design and realization of the human-robot interaction mode. These trajectories include the end-effector trajectory planned by the demonstrator using the complicated human sensory perceptual system and the end-point stiffness variation sequence of the human arm generated by the human central nervous system. Therefore, the learning from demonstration (LfD) mode collects appropriate end-effector trajectories and impedance variation sequences that stem from the sophisticated perception ability of humans. On the one hand, these data conform to human habits and are compatible with the robot dynamics, so they serve well as initial learning data. On the other hand, the kinematics and dynamic characteristics of the manipulator are different from those of a human. Therefore, self-learning based on the characteristics of the manipulator is needed in the future, aiming to acquire a moving trajectory of the manipulator and an impedance gain scheduling of the controller that match the operation characteristics of the manipulator and can even adapt to changes of task requirements. Currently, popular learning algorithms for this purpose include GMM-GMR [15] and DMP [2]. After learning with these algorithms, the manipulator can derive from the demonstration trajectory a task command suitable for its own movement, and can even adapt the trajectory and impedance scheduling to changing task requirements.

Simulation and Experiments

Simulation of Admittance Control.
The admittance controller (see (3)) of the manipulator was simulated using the MATLAB robotic toolbox according to the theoretical derivation in Section 2; the other admittance controller (see (4)) was used for demonstration (see part 2.2 of the video in the Supplementary Material for the effect of this controller). In the simulation, the damping matrix $D_d$ and the stiffness matrix $K_d$ are set as multiples of the unit matrix $I \in \mathbb{R}^{6\times 6}$, with one pair of values when the end-effector of the manipulator is below the $z = 0.2$ plane and a different pair when it is above that plane. In addition, an external force of 5 N was applied in each of these two regions at a certain moment along one coordinate axis and held for one second to observe the response. The simulation results are shown in Figure 5. See part 1 of the video in the Supplementary Material for details of this simulation.
Changes of the joint velocity, constraining force, and Cartesian position of the manipulator are shown in Figure 6.
Hence, the influences of damping and stiffness on the response of the manipulator to disturbances can be seen intuitively. The manipulator responds less to a disturbance under larger stiffness and damping. This is similar to the operation mechanism of human arms. The joint stiffness of human arms increases upon contraction of antagonistic muscles, which increases the end-point stiffness and weakens the influence of disturbances on the end-point position of the human hand. Furthermore, given high end-point stiffness, the interaction force in an interaction task influences the end-point trajectory of the human hand or hand tool only slightly.
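The effect of region-dependent gains on a short push can be reproduced with a one-axis toy simulation. This is only a sketch: the gain values, the switching rule in `gains_for_height`, and the step counts are placeholders, since the actual matrices of the simulation are not reproduced here, and the law is assumed to be the damping-stiffness admittance $D_d\dot{x}_r + K_d x_r = F_{ext}$ around a fixed target, discretized with a backward difference.

```python
def gains_for_height(z, z_switch=0.2):
    """Hypothetical schedule: soft gains below the switching plane, stiff above."""
    if z < z_switch:
        return 40.0, 200.0    # damping [N*s/m], stiffness [N/m], placeholder values
    return 120.0, 1000.0


def push_response(z, force=5.0, ts=0.001,
                  n_before=1000, n_push=1000, n_after=2000):
    """Apply a force pulse of n_push samples; return (peak, final) deflection."""
    d_gain, k_gain = gains_for_height(z)
    c = d_gain / ts
    x_r, peak = 0.0, 0.0
    for i in range(n_before + n_push + n_after):
        f_ext = force if n_before <= i < n_before + n_push else 0.0
        # Backward-difference solution of d*xdot_r + k*x_r = f_ext.
        x_r = (f_ext + c * x_r) / (c + k_gain)
        peak = max(peak, abs(x_r))
    return peak, abs(x_r)


peak_low, rest_low = push_response(z=0.1)
peak_high, rest_high = push_response(z=0.3)
print(f"soft region:  peak {peak_low * 1000:.2f} mm, residual {rest_low * 1000:.4f} mm")
print(f"stiff region: peak {peak_high * 1000:.2f} mm, residual {rest_high * 1000:.4f} mm")
```

The same 5 N pulse deflects the reference far less in the stiff region, and in both regions the reference returns to the target once the force is removed.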

Simulation of Impedance Control.
The single-joint impedance control simulation was carried out with Simulink in MATLAB. Since impedance control needs an accurate dynamic model for feedforward control, a dynamic model of a rigid joint with a force sensor was constructed in Simulink. Three Simulink simulations were chosen for this paper. In the first simulation, the single-joint impedance controller tracks a sinusoidal trajectory $q_d(t)$ under a time-varying stiffness $K_d(t)$; both are plotted in Figure 7. It can be seen from Figure 7 that the variable impedance controller tracks the expected trajectory well in the given stiffness range. Although the trajectory error is relatively large in the beginning, it converges gradually as time goes on. This simulation reflects the tracking stability of the variable impedance controller under specific changes of the impedance parameters. According to [8], the variable impedance controller has certain stability problems, which, however, are not discussed thoroughly in this paper.
The second simulation focuses on two disturbances of the static manipulator by an applied force under a changing stiffness $K_d(t)$, which is plotted in Figure 8. An external force of 20 N was applied at $t = 5$ s and $t = 38$ s and held for 5 s and 20 s, respectively. It can be seen from Figure 8 that, due to the changes of the impedance parameter, the external force disturbs the trajectory differently each time, but the trajectory returns to the stable position after the external force is removed. This reflects the ability of the variable impedance controller to cope with different disturbances of the external force.
The third simulation extends the second one by letting the joint track a sinusoidal trajectory while the disturbances are applied. Similarly, it can be seen from Figure 9 that the variable impedance controller interacts with external disturbances well. The joint can still maintain stable tracking of the trajectory and cope with external disturbances when the impedance changes.
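The disturbance behaviour seen in these simulations can be approximated by integrating the expected error dynamics $M_d\ddot{e} + D_d\dot{e} + K_d(t)\,e = \tau_e(t)$ directly, where $e$ is the deviation from the expected trajectory. The sketch below makes several assumptions: the stiffness schedule in `stiffness` is invented (the actual schedule is only available from the figures), the inertia and damping values are placeholders, and only the pulse timing (start at 5 s and 38 s, duration 5 s and 20 s, amplitude 20) follows the text.

```python
import math


def stiffness(t):
    """Hypothetical smooth stiffness schedule between 50 and 150."""
    return 100.0 + 50.0 * math.sin(0.2 * t)


def external_torque(t):
    """Pulses of amplitude 20 starting at t = 5 s (5 s long) and t = 38 s (20 s long)."""
    if 5.0 <= t < 10.0 or 38.0 <= t < 58.0:
        return 20.0
    return 0.0


def simulate_error(t_end=70.0, dt=0.001, m_d=1.0, d_d=30.0):
    """Integrate the error dynamics; return a list of (time, error) samples."""
    e, edot = 0.0, 0.0
    log = []
    for i in range(int(round(t_end / dt))):
        t = i * dt
        eddot = (external_torque(t) - d_d * edot - stiffness(t) * e) / m_d
        edot += eddot * dt  # semi-implicit Euler integration
        e += edot * dt
        log.append((t + dt, e))
    return log


log = simulate_error()
peak = max(abs(e) for _, e in log)
print(f"largest deviation: {peak:.3f}, deviation at the end: {log[-1][1]:.2e}")
```

Because the stiffness differs between the two pulses, the same disturbance amplitude produces different deviations, and the deviation decays back to zero after each pulse ends.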

3.2. Experiment.
This experiment verifies whether, with the assistance of the constructed variable impedance human-robot interaction mode, the operator can increase the dragging damping of the manipulator by increasing arm stiffness, so that the robot achieves high trajectory accuracy along the copper wire path, and whether decreasing the dragging damping by relaxing the arm helps the operator move quickly with less effort. The experiment also aims to show that the variable impedance controller offers better interaction performance during demonstration teaching than a fixed-gain impedance controller. To accomplish this experiment, a dragging handle for demonstration was designed and manufactured, and an open circular ring was installed at the end of the handle (Figure 10). This open circular ring was used as the constraint in the trajectory demonstration, and the reference trajectory is a piece of copper wire shaped as shown in Figure 10. A KUKA IIWA manipulator was used in this experiment and controlled through ROS. The impedance gains of the robot can be changed by ROS commands.
The collection principle of sEMG signals is shown in Figure 11. After getting the EMG signal, we use two ROS communication mechanisms, Publisher/Subscriber and Service/Client, to build the connection between the EMG signal and the iiwa stack software package (which can send commands to KUKA Sunrise and get state feedback from it). A C++ program then calculates the impedance gain on the basis of the EMG signal and sends the command to the KUKA Sunrise controller, thereby changing the impedance gain according to the EMG signal. The interaction and demonstration experiment is shown in Figure 13. See part 3 of the video in the Supplementary Material for details of this experiment. The circular ring was placed around the copper wire and the operator dragged the manipulator to move along the trajectory of the copper wire. The manipulator started from the start point and ran through the trajectory of the copper wire as quickly as possible. At the same time, the centre of the ring had to follow the copper wire as closely as possible, preventing the ring from touching the copper wire. Following this rule, the manipulator arrived at the destination and the circular ring was removed from the copper wire. Next, the manipulator returned to the starting point quickly and started to move again. This process was repeated several times. The experiments were divided into two groups: (1) the fixed admittance control experiment and (2) the variable admittance control experiment.

Fixed Admittance Control Experiment.
This experiment verifies the performance of admittance controllers with low and high fixed damping parameters in the demonstration process.
Two admittance controllers, one with high and one with low fixed damping, were set up for the abovementioned trajectory demonstration, and the demonstration trajectory was recorded. For each admittance controller, 20 trajectory demonstration experiments were carried out. In each experiment, the mean absolute error between the demonstration trajectory and the reference trajectory (the trajectory of the copper wire) was calculated and the demonstration time was recorded. The demonstration trajectories of the manipulator end-effector for the admittance controllers with low and high damping are shown in Figures 14 and 15, respectively.
Over the repeated experiments, the mean demonstration time under low damping parameters was 5 s and the mean trajectory error was 17 mm, whereas the mean demonstration time under high damping parameters was 12 s and the mean trajectory error was 8 mm.
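One way to compute such a trajectory error is sketched below. How the error was actually evaluated is not specified in detail, so this is an assumption: each recorded point of the demonstration is compared with the closest point on a 2D reference polyline, and the distances are averaged. The function names and the sample coordinates are hypothetical.

```python
import math


def point_segment_distance(p, a, b):
    """Distance from point p to the line segment a-b in the plane."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        return math.hypot(p[0] - a[0], p[1] - a[1])
    # Projection parameter of p onto the segment, clamped to [0, 1].
    t = ((p[0] - a[0]) * dx + (p[1] - a[1]) * dy) / seg_len_sq
    t = min(1.0, max(0.0, t))
    return math.hypot(p[0] - (a[0] + t * dx), p[1] - (a[1] + t * dy))


def mean_path_error(demo, reference):
    """Mean distance of the demonstrated points to the reference polyline."""
    if not demo or len(reference) < 2:
        raise ValueError("need at least one demo point and two reference points")
    total = 0.0
    for p in demo:
        total += min(point_segment_distance(p, a, b)
                     for a, b in zip(reference, reference[1:]))
    return total / len(demo)


# Hypothetical data in millimetres: an L-shaped wire and three recorded points.
wire = [(0.0, 0.0), (100.0, 0.0), (100.0, 100.0)]
recorded = [(10.0, 5.0), (50.0, -3.0), (104.0, 50.0)]
print("mean error [mm]:", mean_path_error(recorded, wire))
```

Measuring against the nearest point of the path rather than against a time-indexed reference keeps the error independent of the dragging speed, which matters here because demonstration time is evaluated separately.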

Experiment of Variable Admittance Control.
This experiment uses sEMG measurement to map stiffness changes of the human arm onto the controller, enabling online adjustment between the high and low damping parameters. As a result, operators can accomplish the trajectory demonstration task quickly and accurately. The experimental process was the same as in the fixed admittance control experiment. One demonstration trajectory is shown in Figure 16. The mean demonstration time over the repeated experiments was 8 s and the mean trajectory error was 11 mm.
It can be concluded from the experimental data that the admittance controller with low damping parameters has a high demonstration speed but a large demonstration trajectory error. On the contrary, the admittance controller with high damping parameters has a high demonstration accuracy but a low demonstration speed. The variable admittance controller combines the advantages of these two controllers: its mean demonstration time (8 s) is one-third shorter than that of the high damping controller (12 s), and its mean trajectory error (11 mm) is about 35% lower than that of the low damping controller (17 mm).

Conclusion and Future Work
Some conclusions can be drawn from the above simulations and experiments.
(1) Impedance control of the manipulator is a bionic control method that conforms to biological organisms.
(2) Real-time human arm stiffness estimation and interactive demonstration, achieved by combining variable impedance control with the human-robot interaction mode, lead to better interaction and demonstration effects. They not only increase the efficiency and accuracy of human-robot interaction and make the interaction more intuitive, but also lower the requirements on the operation experience of demonstrators as well as the energy consumption of demonstrators in the interaction and demonstration process.
(3) Skill features of human movement (e.g., the planned trajectory and impedance gain scheduling) can be collected reasonably through demonstration based on the designed human-robot interaction mode and applied to the manipulator in later processing. Therefore, a control command suitable for operation of the manipulator can be obtained by later learning and optimization of the data.
This paper only involves a simple end-point stiffness estimation of human arms and a simple production of demonstration trajectories. In future studies, the mapping accuracy of the human arm end-point stiffness shall be increased appropriately to conform better to the principle of imitating human behaviours and learning by demonstration. For example, an accurate mapping from the EMG signals and human arm postures to the end-point stiffness of the human arm can be constructed through an identification method and then used in the interaction mode. As a result, the intention of the demonstrator can be reflected more truly and thoroughly, which helps the manipulator imitate more faithfully. On the other hand, it is necessary to learn the acquired end-effector trajectory and impedance gain scheduling by searching for and using appropriate learning algorithms, so as to obtain a controller that can execute tasks effectively and can even cope with changes of task conditions as well as disturbances in a certain range. Moreover, future studies shall pay attention to the stability of the manipulator throughout the imitation of the human expected variable impedance trajectory. Offline calculation or online adjustment of the executed trajectories and impedances should be studied, aiming to protect stability and safety during autonomous execution of tasks by the manipulator.

Figure 4: Variable impedance interaction and demonstration mode based on human arm end-point stiffness estimation of EMG signals.

Figure 7: Trajectory, trajectory error, and stiffness of controller during first simulation.

Figure 8: Trajectory, external torque, and stiffness of controller during second simulation.

Pasting positions of the surface electrodes are shown in Figure 12. The skin surface shall be cleaned before signal acquisition. The EMG signal collection module rectifies and amplifies the surface electrode signals and outputs two signal modes: (1) integrated EMG signals and (2) raw EMG signals. An Arduino development board converts the analog voltage signals from the EMG sensor into digital signals and then transmits the data to the computer for later processing.

Figure 11: Principle of sEMG signal acquisition and conversion.

Figure 12: EMG sensor locations of FCR and ECR.

Figure 13: Variable admittance flexible control experiment based on sEMG.