Semiparametric Deep Learning Manipulator Inverse Dynamics Modeling Method for Smart City and Industrial Applications



Introduction
Smart cities and factories contain "intelligent" things that can autonomously and collaboratively enhance the quality of living and working conditions, save human lives, and act as a sustainable resource ecosystem. Implementing advanced collaborative technologies such as drones, robots, artificial intelligence, and the Internet of Things requires increasing the "intelligence" of smart cities and factories by improving connectivity, energy efficiency, and quality of service [1].
There have been many excellent application cases, such as [2][3][4]. In particular, intelligent robotic platforms are increasingly used in smart cities and factories, where constantly changing application scenarios place higher demands on robot control. Specifically, in motion control systems, the transmission of feedback information is subject to a time delay, so smooth motion cannot be achieved by feedback control alone. Therefore, feedforward control becomes particularly important. In robotics, feedforward control usually refers to model-based control, which involves the dynamics of the robotic platform. The accuracy of such dynamical models is critical to the development of control laws that are compliant, energy efficient, and safe [5].
There are two major approaches for modeling robot dynamics: parametric and nonparametric. Parametric approaches rely on parameterized Newtonian physics models of the robot dynamics. Common methods for physics-based dynamics modeling can be found in the literature. These methods require the mechanical parameters of the rigid bodies composing the robot to be identified [6][7][8][9] and then employed in model-based control and state estimation schemes [10]. The advantage of these models is that they represent a global and unique relationship between the joint trajectory $(q, \dot{q}, \ddot{q})$ and the torques $\tau_{RBD}$. This type of inverse dynamics model can be computed efficiently and employed in real time. Thus, a great deal of prior knowledge is incorporated without the need for data. For example, it is well known that robots are subject to gravitational forces, viscous forces, and joint constraints, so it is wasteful to go through a laborious data-gathering and machine learning process to rediscover these well-known constraints. The disadvantage of parametric models is that they are only crude idealizations of the actual system dynamics: assumptions such as rigid links or a simple analytical form of friction may not be accurate in real systems. In the case of traditional industrial robots, these unmodeled dynamics can often be ignored. However, for modern robotic platforms, these omissions and simplifications result in significant control inefficiencies.
Alternatively, the model can be obtained from experimental data using machine learning techniques, resulting in a nonparametric model. Nonparametric methods, based on algorithms such as Support Vector Regression (SVR) [11][12][13], Neural Network (NN) [14][15][16], Local Weighted Projection Regression (LWPR) [17][18][19], Independent Joint Learning (IJL) [20][21][22], or Gaussian Processes Regression (GPR) [23][24][25][26][27], can model dynamics by extracting the input-output relationship directly from the available data. If a suitable kernel function or learning architecture is selected, the nonparametric model is a universal approximator that can account for dynamics factors not considered by the parametric model. Therefore, nonparametric methods can be more flexible to use and are powerful in capturing higher order nonlinearities, resulting in faster model approximation and higher learning accuracy. When learning inverse dynamics, the nonparametric methods approximate a function describing the relationship $(q, \dot{q}, \ddot{q}) \rightarrow \tau$, including all nonlinearities encoded by the sampled data.
However, nonparametric methods attempt to learn the model from scratch and thus do not make use of any knowledge available from analytical robotics, which leads to several drawbacks. First, very large amounts of data are necessary for obtaining a sufficiently accurate model and predictions on the entire input space [28]. Second, since nonparametric models rely on local neighborhood training data to make predictions, they do not generalize well to unexplored state regions, where little or no training data are available. Covering the entire state space becomes exponentially harder as the complexity and number of degrees of freedom of the robot system increase. Thus, if only small and relatively poor data sets are available, nonparametric models will not generalize well to unknown data. Third, it is wasteful to go through a laborious data-gathering and machine learning process to rediscover well-known prior knowledge such as Rigid Body Dynamics.
Thus, it appears quite desirable to combine the benefits of parametric and nonparametric approaches to address the aforementioned issues. However, doing so in an efficient way is not trivial. A reasonable approach would be to first fit a parametric model and then fit a nonparametric model to the errors made by the parametric model. Nguyen-Tuong et al. [29] present a learning technique which combines prior knowledge about the physical structure of the mechanical system and learning from available data using Gaussian Process Regression (GPR) [30]. Similar approaches are presented in [20] and [31]. In [32], an incremental semiparametric robot dynamics learning scheme, based on Locally Weighted Projection Regression (LWPR) and initialized using a linearized parametric model, is presented [33]. However, this approach uses a fixed parametric model that is not updated as new data become available. Moreover, LWPR has been shown to underperform with respect to other methods (e.g., [34]). The semiparametric methods described above cannot benefit from simultaneous optimization of the parametric and nonparametric models. Instead, the nonparametric model is applied after parametric identification, which may result in a suboptimal model. In addition, to the best of the authors' knowledge, there is no semiparametric method based on deep learning.
Deep learning is a new approach in machine learning, which has been widely applied in smart cities and factories [35]. Deep learning has turned out to be very good at discovering intricate structures in high-dimensional data and is therefore applicable to many domains of science, business, and government. Since it requires very little engineering by hand, it can easily take advantage of increasing amounts of available computation resources and data [36]. In this work, a method based on deep learning and a semiparametric approach is presented. The method is formalized in the framework of what is called Semiparametric Deep Learning (SDL), designed for optimal inference using combinations of parametric RBD and Nonparametric Deep Learning (NDL) models. Key properties of this method are (1) an appropriate deep learning frame for a semiparametric approach and (2) features that can be optimized simultaneously for the parametric and nonparametric models. The proposed method is validated using experiments performed on a UR5 robot. The article is organized as follows. In Section 2, a complete description of the proposed Semiparametric Deep Learning framework is introduced. Section 3 presents the validation of the proposed method on the UR5 robotic platform. Finally, Section 4 summarizes the content of the presented work.

Methodology
The parametric modeling method and NDL, which form the basis of the SDL method, have been elaborated in previous publications [37,38]. Therefore, this section only briefly reviews these two methods and then analyzes the SDL modeling method proposed in this paper.

Parametric Robot Dynamics Model.
It is well known that the robot dynamics can be modeled as follows [39]:

$$\tau = M(q)\ddot{q} + C(q, \dot{q}) + G(q) + \varepsilon(q, \dot{q}, \ddot{q}) \tag{1}$$

in which $q, \dot{q}, \ddot{q} \in \mathbb{R}^{m \times 1}$ are the joint positions, velocities, and accelerations of the robot, respectively, $\tau \in \mathbb{R}^{m \times 1}$ denotes the joint torques, $M(q)$ is the generalized inertia matrix of the robot, $C(q, \dot{q})$ are the Coriolis and centripetal forces, and $G(q)$ is gravity. As shown in equation (1), the robot dynamics equation contains the Rigid Body Dynamics (RBD) model:

$$\tau_{RBD} = M(q)\ddot{q} + C(q, \dot{q}) + G(q) \tag{2}$$

The model errors $\varepsilon(q, \dot{q}, \ddot{q})$ are caused by unmodeled dynamics (e.g., hydraulic tubes, actuator dynamics, and flexibility and dynamics of the cable drives), ideal-joint assumptions (e.g., no friction and clearance), and inaccuracies in the RBD model parameters. The RBD model of a manipulator is well known to be linear in the parameters $\beta$ [39], i.e.,

$$\tau_{RBD} = \varphi(q, \dot{q}, \ddot{q})\,\beta \tag{3}$$

in which $\varphi$ is a matrix containing nonlinear functions of joint angles, velocities, and accelerations, often called basis functions. Modeling the robot dynamics using the RBD model in equation (3) requires the identification of the dynamics parameters $\beta$. For the 6 Degree-of-Freedom (DoF) UR5 robot, for example, 60 dynamics parameters are to be identified (for each DoF, there are 10 parameters that could ideally be obtained directly from the CAD data).
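To make the linear-in-parameters property concrete, the following minimal sketch identifies $\beta$ by least squares for a hypothetical frictionless 1-DoF pendulum, $\tau = I\ddot{q} + mgl\sin(q)$, so that $\varphi = [\ddot{q}, \sin(q)]$ and $\beta = [I, mgl]$. The pendulum, the parameter values, and the function names are illustrative assumptions; this is not the identification procedure used for the UR5.

```python
import math
import random


def regressor(q, qdd):
    """Basis functions phi for a hypothetical 1-DoF pendulum (assumption)."""
    return [qdd, math.sin(q)]


def identify(samples):
    """Least-squares estimate of beta from (q, qdd, tau) samples.

    Solves the 2x2 normal equations (Phi^T Phi) beta = Phi^T tau in closed form.
    """
    a11 = a12 = a22 = b1 = b2 = 0.0
    for q, qdd, tau in samples:
        p1, p2 = regressor(q, qdd)
        a11 += p1 * p1
        a12 += p1 * p2
        a22 += p2 * p2
        b1 += p1 * tau
        b2 += p2 * tau
    det = a11 * a22 - a12 * a12
    if det == 0.0:
        raise ValueError("trajectory does not excite both parameters")
    return [(a22 * b1 - a12 * b2) / det, (a11 * b2 - a12 * b1) / det]


# Synthetic, noise-free data generated from made-up "true" parameters.
rng = random.Random(0)
beta_true = [0.5, 2.0]  # [inertia, m*g*l], illustrative values only
samples = []
for _ in range(50):
    q = rng.uniform(-math.pi, math.pi)
    qdd = rng.uniform(-2.0, 2.0)
    phi = regressor(q, qdd)
    tau = phi[0] * beta_true[0] + phi[1] * beta_true[1]
    samples.append((q, qdd, tau))

beta_hat = identify(samples)
print(beta_hat)
```

Because the torque is linear in $\beta$, identification reduces to ordinary least squares on the stacked basis functions; for a 6-DoF arm the same idea applies with a much larger regressor matrix.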

NDL Formulation.
The inverse dynamics model, in model-based control, is described as the mapping from joint positions, velocities, and accelerations to torques, as shown in equation (1). The aim of the nonparametric learning model is to predict the torque value of the $i$-th joint, $\tau_i \in \mathbb{R}$, as the response to the query point $(q, \dot{q}, \ddot{q}) \in \mathbb{R}^{3m \times 1}$, by using the $n$ given training data points. Since the problem can be considered a supervised learning problem, any supervised learning technique can be employed for the learning process. Deep learning methods are representation-learning methods with multiple levels of representation, obtained by composing simple but nonlinear modules, each transforming the representation at one level (starting with the raw input) into a representation at a higher, slightly more abstract level. With the composition of enough such transformations, very complex robotic dynamics functions can be learned [36,40]. Accordingly, deep learning has found numerous applications, such as [41][42][43][44][45].
The sequential nature of manipulator inverse dynamics suggests that, to predict the joint torque, it is important to model the relationship among sequential data points [39]. Recurrent neural networks (RNNs), a type of deep learning network, can be seen as very deep feedforward networks in which all the layers share the same weights. Although their main purpose is to learn long-term dependencies, theoretical and empirical evidence shows that it is difficult for them to learn to store information for a very long time [46]. Long Short-Term Memory (LSTM) networks have subsequently proved to be more effective than conventional RNNs, especially when there are several layers at each time step [48], enabling an entire speech recognition system that goes all the way from acoustics to the sequence of characters in the transcription. LSTM networks or related forms of gated units are also currently used for the encoder and decoder networks that perform very well in machine translation [47][48][49]. Moreover, the studies presented in [38] and [50] confirmed the validity of applying LSTM to the prediction of manipulator inverse dynamics. Therefore, in this section, the LSTM network is proposed as the nonparametric learning technique for modeling the inverse dynamics of the manipulator.

NDL Model Architecture.
In this paper, the proposed architecture of the Nonparametric Deep Learning network has one input layer, one LSTM layer, one fully connected layer, one dropout layer, and one output layer, as shown in Figure 1. Only one LSTM layer is used here, because our previous studies verified that, with the same number of neurons, the fewer the layers, the better the prediction performance [38].
The input layer has 18 neurons (the manipulator's 6 joint positions, 6 velocities, and 6 accelerations, as shown in Figure 2). The state activation functions of the LSTM cells are set to "tanh," while the gate activation functions are set to "sigmoid." The input weights are initialized according to the Glorot initializer. The forget gate bias is initially set to 1 and the remaining biases are set to 0. The training algorithm adopts back-propagation through time.
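The cell settings described above (tanh state activation, sigmoid gates, forget-gate bias of 1, other biases 0) can be illustrated with a single LSTM time step written in plain Python. This is only a sketch under stated assumptions: the hidden size of 4, the Glorot fan-in/fan-out choice, and the zero recurrent weights are hypothetical simplifications, not the trained network from the paper.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(matrix, vector):
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def init_params(n_in, n_hidden, rng):
    """Per-gate (input weights, recurrent weights, bias) for gates i, f, g, o."""
    limit = math.sqrt(6.0 / (n_in + n_hidden))  # Glorot-uniform bound (assumed fan sizes)
    params = {}
    for gate in ("i", "f", "g", "o"):
        w = [[rng.uniform(-limit, limit) for _ in range(n_in)] for _ in range(n_hidden)]
        u = [[0.0] * n_hidden for _ in range(n_hidden)]  # simplification: zero recurrent weights
        bias = [1.0 if gate == "f" else 0.0] * n_hidden  # forget-gate bias 1, others 0
        params[gate] = (w, u, bias)
    return params


def lstm_step(x, h, c, params):
    """One LSTM time step with sigmoid gates and tanh state activation."""
    pre = {}
    for gate, (w, u, bias) in params.items():
        pre[gate] = [a + b + d for a, b, d in zip(matvec(w, x), matvec(u, h), bias)]
    i = [sigmoid(v) for v in pre["i"]]
    f = [sigmoid(v) for v in pre["f"]]
    g = [math.tanh(v) for v in pre["g"]]
    o = [sigmoid(v) for v in pre["o"]]
    c_new = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, g)]
    h_new = [ok * math.tanh(ck) for ok, ck in zip(o, c_new)]
    return h_new, c_new


n_in, n_hidden = 18, 4  # 18 inputs as in the text; hidden size is hypothetical
params = init_params(n_in, n_hidden, random.Random(0))
x = [0.0] * n_in  # zero input isolates the effect of the biases
h_new, c_new = lstm_step(x, [0.0] * n_hidden, [1.0] * n_hidden, params)
print(c_new)
```

With zero input, each cell keeps the fraction sigmoid(1) of its previous state, which shows why a forget-gate bias of 1 encourages the network to retain information early in training.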

Proposed Semiparametric Deep Learning (SDL) Model.
In this section, the proposed semiparametric model, based on deep learning and RBD, is described in detail. The method is formalized in the framework of the so-called Semiparametric Deep Learning (SDL), which can be used to predict the joint torque of a robotic arm more accurately. First, the formulation of SDL is introduced in Section 2.3.1, and then the specific model architecture of SDL is described in Section 2.3.2.

SDL Formulation.
Using the Nonparametric Deep Learning (DL) framework, the robot dynamics can be modeled by $\tau \sim \mathrm{DL}(q, \dot{q}, \ddot{q})$, in which $(q, \dot{q}, \ddot{q})$ is the input and $\tau$ is the output of the deep learning model. Consequently, the DL model does not make use of any prior knowledge, which allows it to reproduce arbitrary functions. One way to include the RBD model, as shown in equation (5), is to set $\tau_{RBD}$ as an input to the DL model. This approach is equivalent to a semiparametric model:

$$\tau \sim \mathrm{DL}(\tau_{RBD}, q, \dot{q}, \ddot{q}) \tag{5}$$

The resulting dynamics model, as described in equation (5), is a semiparametric frame which contains a parametric part, i.e., the RBD model. When comparing equation (5) to the robot dynamics in equation (1), it becomes evident that the main purpose of the nonparametric term is to absorb the unmodeled dynamics $\varepsilon(q, \dot{q}, \ddot{q})$. To approximate the unmodeled dynamics, an appropriate deep learning model, such as a DNN or an LSTM, can be used. Key properties of this semiparametric learning method are (1) the employment of an appropriate deep learning frame as the nonparametric part and (2) features that can be optimized simultaneously for the parametric and nonparametric models; i.e., the contribution of $\tau_{RBD}$ is also updated through weight changes as the learning process evolves.
If the RBD model perfectly describes the robot dynamics, the error $\varepsilon(q, \dot{q}, \ddot{q})$ in equation (1) disappears and the prediction depends only on the RBD part; in that case, the deep learning network is very easy to train. Equation (5) also shows that if the query point is far away from the training data, the resulting torque prediction will mainly depend on the RBD part. This property is important, as the complete state space can never be fully covered by finite (and possibly small) training data sets. If the robot moves to regions of the state space not covered by the sampled data (i.e., regions where the learned nonparametric models may not generalize well), the torque prediction will rely on the parametric RBD part.

SDL Model Architecture.
The effectiveness of the NDL model has been verified in previous work presented in [40]. The SDL model proposed in this article adds an RBD term to the NDL architecture. Its specific model architecture is shown in Figure 3. The input of the SDL architecture is still $[q, \dot{q}, \ddot{q}]$. However, unlike in NDL, the input first passes through the RBD term, forming the new input vector $[\tau_{RBD}, q, \dot{q}, \ddot{q}]$. Next, the new input vector enters the LSTM hidden layer. The advantage of this architecture is that the parametric and nonparametric parts can be optimized simultaneously; i.e., the weight of $\tau_{RBD}$ in the feature vector is also updated during network training. The grid search method is used to optimize the hyperparameters of the SDL model.
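The construction of the augmented input vector described above can be sketched as follows. The function `rbd_torque` is a hypothetical stand-in (one decoupled pendulum per joint) rather than the identified UR5 model, and the sketch assumes that all six RBD torques are prepended, giving 24 features per sample; the paper does not state the exact feature count.

```python
import math


def rbd_torque(q, qd, qdd, beta):
    """Hypothetical stand-in for the RBD model: one decoupled pendulum per joint.

    beta holds an (inertia, gravity_term) pair per joint; qd is unused here
    because this toy model has no velocity-dependent terms.
    """
    return [inertia * acc + grav * math.sin(pos)
            for pos, acc, (inertia, grav) in zip(q, qdd, beta)]


def sdl_features(q, qd, qdd, beta):
    """Build the SDL input [tau_RBD, q, qd, qdd] from the NDL input [q, qd, qdd]."""
    return rbd_torque(q, qd, qdd, beta) + list(q) + list(qd) + list(qdd)


# Illustrative joint state and made-up parameters for a 6-joint arm.
q = [0.0] * 6
qd = [0.1] * 6
qdd = [1.0] * 6
beta = [(0.5, 2.0)] * 6

features = sdl_features(q, qd, qdd, beta)
print(len(features), features[:6])
```

Feeding $\tau_{RBD}$ as a feature, rather than subtracting it from the target beforehand, lets the network learn how much to rely on the parametric prediction during training.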

Evaluation
The proposed SDL method is verified on a UR5 collaborative robot, and the torque prediction results are compared to those provided by the NDL method. The prediction performance for training and for generating predictions in rhythmic motor tasks is evaluated. The joint angles, joint velocities, joint accelerations, and joint torques are recorded through PolyScope, the robot's graphical user interface (GUI), which makes it easy to program the robot to move the tool along a desired trajectory.

Experimental Setup.
UR5 is a 6-DoF collaborative robot with extruded aluminum tubes and joints. It has six rotary joints, and its structure is shown in Figure 4. The UR5 robot has a joint rotation range of $[-2\pi, 2\pi]$ rad and a joint acceleration range of $[0, \pi]$ rad/s². The UR5 robot is very popular in the robot research field. According to the robot rotation angle and installation restrictions, the robot workspace is selected to be a hemisphere with an approximate radius of 850 mm above the installation plane. The range of joint motion is shown in Table 1. In order to best approximate the actual working situation of the robot, 1000 points are randomly selected within the chosen workspace. According to the actual use requirements of the robot, the joint speed range is [0.8, 2] rad/s, while the acceleration range is [1, 1.8] rad/s². The robot is commanded to run in a rhythmic way along the set trajectory. The joint position, speed, and servo motor current data along the robot trajectory are delivered by the robot controller at a frequency of 100 Hz. Since the UR5 robot is not equipped with a torque sensor, the measured torque is obtained indirectly through the motor current at each joint. The relationship between torque and current is as follows [51]:

$$\tau_i = N_i k_i I_i \tag{6}$$

in which $N_i$ is the gear ratio, $N_i = 101$ for $i = 1, \dots, 6$; $k_i$ is the motor constant, $k_i = 0.125$ Nm/A for $i = 1, \dots, 3$ and $k_i = 0.0922$ Nm/A for $i = 4, \dots, 6$; and $I_i$ is the motor current (A).
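As a worked instance of the torque-current relationship, the short sketch below converts six motor currents into joint torques using the gear ratio and motor constants quoted above. The function name and the example currents are illustrative assumptions.

```python
GEAR_RATIO = 101.0  # N_i, identical for all six joints
MOTOR_CONSTANTS = [0.125, 0.125, 0.125, 0.0922, 0.0922, 0.0922]  # k_i in Nm/A


def currents_to_torques(currents):
    """Convert motor currents (A) to joint torques (Nm): tau_i = N_i * k_i * I_i."""
    if len(currents) != len(MOTOR_CONSTANTS):
        raise ValueError("expected one current per joint (6 values)")
    return [GEAR_RATIO * k * current for k, current in zip(MOTOR_CONSTANTS, currents)]


# Example: 1 A on every joint (made-up values).
torques = currents_to_torques([1.0] * 6)
print(torques)
```

For a current of 1 A, the first three joints produce 101 × 0.125 = 12.625 Nm, while the last three produce about 9.31 Nm.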
During actual operation, the robot is affected by noise, which makes the sampled data fluctuate. In that case, there are large fluctuations and ripples in the measured current, which seriously affect the accuracy of the torque prediction. Averaging the data can increase the signal-to-noise ratio [52], reduce the influence of noise, and improve the prediction accuracy. The average joint position $\bar{q}$ can be expressed as follows:

$$\bar{q}(k) = \frac{1}{M} \sum_{m=1}^{M} q_m(k) \tag{7}$$

in which $M$ is the number of running trajectories, $q_m(k)$ is the $k$-th sampling point of the $m$-th running trajectory, and $\bar{q}(k)$ is the position after averaging over the $M$ runs. The same method is used for speed and current. A zero-phase low-pass Butterworth filter (forward and reverse IIR Butterworth filters) with a cutoff frequency of 1 Hz is used to process the averaged position and velocity $(q, \dot{q})$. Acceleration is obtained by the central difference method [53]. Data processing is important for the accuracy of parameter identification, and the above processing is appropriate because it avoids large deviations in identification.
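The averaging and differentiation steps can be sketched in a few lines. The sketch assumes the repeated runs are already time-aligned and equally long, uses one-sided differences at the two end points (a choice not specified in the text), and omits the Butterworth filtering stage; the function names and sample values are illustrative.

```python
def average_trajectories(runs):
    """Average M time-aligned runs sample by sample: mean over m of q_m(k)."""
    count = len(runs)
    return [sum(samples) / count for samples in zip(*runs)]


def central_difference(values, dt):
    """Differentiate a uniformly sampled signal.

    Interior points use (x[k+1] - x[k-1]) / (2*dt); the two end points fall
    back to one-sided differences (an assumption of this sketch).
    """
    n = len(values)
    if n < 3:
        raise ValueError("need at least three samples")
    derivative = [0.0] * n
    derivative[0] = (values[1] - values[0]) / dt
    derivative[-1] = (values[-1] - values[-2]) / dt
    for k in range(1, n - 1):
        derivative[k] = (values[k + 1] - values[k - 1]) / (2.0 * dt)
    return derivative


DT = 0.01  # 100 Hz sampling, as in the experimental setup

# Two made-up noisy velocity recordings of the same motion.
run_a = [0.00, 0.11, 0.19, 0.31, 0.40]
run_b = [0.00, 0.09, 0.21, 0.29, 0.40]
mean_velocity = average_trajectories([run_a, run_b])
acceleration = central_difference(mean_velocity, DT)
print(mean_velocity)
print(acceleration)
```

Averaging before differentiating matters because numerical differentiation amplifies high-frequency noise, so reducing the noise first yields a cleaner acceleration estimate.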
A total of 1,000 groups of valid samples (108,008 samples in total) were obtained. They were divided into training and testing sets at a ratio of 80% to 20% for use in K-fold cross-validation. In addition, the test data are ensured to be sufficiently different from the training data, in order to highlight the generalization ability of the learned models. The above sample set is used to train and test the NDL model and the SDL model, and their prediction performance is analyzed and compared.

All input and output data are normalized to a consistent scale for the learning model. After the prediction of inverse dynamics, the real value is restored. The normalization equation is as follows:

$$x_n = \frac{x_r - x_{\min}}{x_{\max} - x_{\min}} \tag{8}$$

in which $x_n$ represents the normalized value; $x_r$ denotes the real value; and $x_{\min}$ and $x_{\max}$ are the minimum and maximum real values, respectively. The performance of manipulator inverse dynamics predictions is evaluated using the Root Mean Square Error (RMSE), which is defined as follows:

$$\mathrm{RMSE} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} (p_i - y_i)^2} \tag{9}$$

in which $p_i$ and $y_i$ represent the $i$-th predicted value and real value, respectively, and $N$ is the total number of data points.
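The normalization, restoration, and RMSE steps can be sketched as below. The sketch assumes min-max scaling to the interval [0, 1], which is consistent with the symbols $x_{\min}$ and $x_{\max}$ defined in the text but is not confirmed by it; function names and numbers are illustrative.

```python
import math


def normalize(x_real, x_min, x_max):
    """Min-max scaling to [0, 1] (assumed form of the normalization)."""
    return (x_real - x_min) / (x_max - x_min)


def denormalize(x_norm, x_min, x_max):
    """Inverse of normalize: restore the real value after prediction."""
    return x_norm * (x_max - x_min) + x_min


def rmse(predicted, real):
    """Root Mean Square Error between predicted and real values."""
    if len(predicted) != len(real) or not real:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((p - y) ** 2 for p, y in zip(predicted, real))
    return math.sqrt(squared / len(real))


# Made-up torque values in Nm.
measured = [-10.0, 0.0, 5.0, 10.0]
x_min, x_max = min(measured), max(measured)
scaled = [normalize(v, x_min, x_max) for v in measured]
restored = [denormalize(v, x_min, x_max) for v in scaled]
error = rmse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])
print(scaled, restored, error)
```

Computing the RMSE on the restored values rather than on the normalized ones keeps the error in physical units (Nm), which makes results comparable across joints.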

Results
The training and prediction in this paper were performed with MATLAB 2019a on an ordinary personal computer. Computer hardware has a strong influence on training time. In this work, the models are trained on a CPU with a clock speed of 2.7 GHz. The structure and hyperparameters of the NDL and SDL models were initially set according to previous work, while the final settings were determined by fivefold cross-validation. The prediction results of the RBD, NDL, and SDL models are listed in Table 2, which shows the RMSE of the three models on the different robot axes and for the different cross-validation folds. Based on Table 2, the following conclusions apply:
(1) The results of "cross average" (Figure 5) indicate that SDL is generally more accurate than NDL, and NDL is generally more accurate than RBD. Also, the prediction accuracy of the SDL model for the first 3 axes of the robot is significantly improved.
(2) The results of "all axes" (Figure 6) indicate that the prediction accuracy of the NDL model is sometimes better than that of the RBD model (cross 3, 4, 5) and sometimes worse (cross 1, 2). However, the NDL model is better than the RBD model on average, while the prediction accuracy of the SDL model is always better than that of the other two models.
(3) All the data in Table 2 show that the semiparametric model is able to combine the strengths of both models, i.e., the parametric RBD model and the Nonparametric Deep Learning model. The prediction accuracy of the SDL model is always better than that of the RBD and NDL models.
RMSE alone is not sufficient to fully represent the performance of torque prediction, because the range of torque variation differs greatly between the joints of the robot. Therefore, the ratio of the cross-average RMSE to the range of measured torque values was used to further analyze the predictive performance. According to Table 3, the torque prediction performance of the first three (elbow) joints is better than that of the last three (wrist) joints. Part of the measured values of each joint torque, together with the RBD, NDL, and SDL predicted values, are plotted in Figure 7. Figures 7(a)-7(f) show the predicted torques of axis 1 to axis 6, respectively; Figures 7(g)-7(l) show the corresponding prediction errors of axis 1 to axis 6, respectively. The purple curve represents the measured torque value, the blue curve represents the torque value calculated using the RBD method, the red curve represents the NDL predicted torque value, and the yellow curve represents the SDL predicted torque value. Figure 7 shows that the SDL model combines the advantages of the RBD model and the NDL model. For example, as shown by the red dashed box in Figure 7, when the torque prediction error of the RBD is large, the nonparametric part takes effect, which greatly improves the prediction accuracy of the SDL model. As another example, shown by the green dotted box in Figure 7, when the learned nonparametric models do not generalize well to certain state space regions, the torque prediction relies on the parametric RBD part.
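The range-normalized error used above can be computed as in the following minimal sketch, which divides an RMSE value by the span of the measured torque for one joint. The function name and the numbers are illustrative assumptions and do not reproduce any entry of Table 3.

```python
def rmse_to_range_ratio(rmse_value, measured_torques):
    """Ratio of an RMSE to the range (max - min) of the measured torque."""
    span = max(measured_torques) - min(measured_torques)
    if span == 0:
        raise ValueError("measured torque has zero range")
    return rmse_value / span


# Made-up example: the same absolute error on a large-torque and a small-torque joint.
ratio_large_joint = rmse_to_range_ratio(0.5, [-10.0, 0.0, 10.0])
ratio_small_joint = rmse_to_range_ratio(0.5, [-1.0, 0.0, 1.0])
print(ratio_large_joint, ratio_small_joint)
```

The same absolute RMSE corresponds to a ten times larger relative error on the joint with the smaller torque range, which is why the ratio is a fairer basis for comparing joints.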

Conclusion
In this work, a Semiparametric Deep Learning (SDL) method is proposed to model robot inverse dynamics for smart city and industrial applications. The SDL model takes advantage of the global characteristics of the classic RBD model and the powerful fitting capabilities of deep learning methods. Moreover, the parametric and nonparametric parts of the SDL model can be optimized simultaneously instead of separately. The results on the UR5 robot show that the SDL models provide higher accuracy and better generalization compared to RBD and NDL. The essence of the SDL model is to fully utilize the a priori information encoded in the parameterized model, overcome the limitations of the NDL method, and consistently show good learning performance. In future work, flexibility factors in the dynamics will be added to the SDL model in order to further improve the prediction accuracy.

Data Availability
The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest
The authors declare that they have no conflicts of interest.