An Enhanced GRU Model With Application to Manipulator Trajectory Tracking

Service robots, e.g. massage robots, have attracted increasing attention in recent years, and the most widely studied problem in this field is trajectory tracking. Owing to the practical demands on service robots, a trajectory-tracking solution requires fast convergence and high accuracy. To address these issues, this paper proposes an enhanced Gated Recurrent Unit (GRU) for the trajectory-tracking tasks of robot manipulators. The main feature of the enhanced GRU is that it combines a cell state with several gate units to build a novel neural cell. Moreover, the enhanced GRU alleviates the large memory occupancy of general neural network models. The derivations of the computational process for the cell state and the mixed hidden state of the proposed model are then presented. Finally, three trajectory-tracking applications, a comparison, and a visual simulation verify the feasibility and superiority of the enhanced GRU model. Received on 22 November 2021; accepted on 23 December 2021; published on 07 January 2022.


Introduction
Robot manipulator motion tracking, a fundamental branch of robot motion planning and control, has long drawn researchers' attention [4,5]. As service robots, especially massage robots, attract growing public interest, the applications of manipulator motion tracking become more extensive [1]. Fig. 1 presents a classic application of trajectory tracking in massage robot production [2]. The robot acquires trajectory information from historical memory, which could be a previous masseuse's technique or a customer's usage trace. A self-learning method is then used to build a motion imitation model, and finally the massage robot delivers a scheduled massage to customers via the pre-trained model. However, to improve the customer experience, trajectory tracking in a massage robot demands real-time response and accuracy above all other elements [3]. Therefore, a feasible solution must offer both.

Figure 1. A classic application of motion tracking in massage robot production. The robot acquires trajectory data from historical memory, which could be a previous masseuse's technique or a customer's usage data. A self-learning method is then used to build a motion imitation model, and finally the massage robot delivers a scheduled massage to customers via the pre-trained model.
Considerable approaches and technologies for tracking a robot manipulator's motion have been comprehensively researched, spanning algorithms extended from general inverse kinematics [11], optimization theory [12], machine vision [13], and adaptive control [10]. The traditional inverse kinematics solution has existed for decades, but owing to its multiple solutions and singular points it does not perform well in continuous work. Over the past ten years, machine learning has been the focus of practitioners on account of its favourable performance and convenient end-to-end modality. The most famous of these neural networks is the convolutional neural network (CNN); however, the majority of CNN models are applied to image processing [20]. To achieve applications in sequence processing, M. Wang et al. [15] proposed a novel CNN model (genCNN) able to predict the next word from a history of words of variable length. The recurrent neural network (RNN), introduced in 1990 but initially attracting little interest, regained attention when it reached unexpected results in natural language processing [21]. Y. Li et al. [21] considered an advanced RNN structure to improve control precision and enhance adaptiveness for robot motion. Compared with classical RNN models, long short-term memory (LSTM) can effectively impede gradient explosion or vanishing and achieve better results by making good use of previous cell states. In [16], Sepp Hochreiter et al. first reported LSTM, which resolves long-term dependence by storing information over extended time. With the introduction of the forget gate and the sigmoid activation function, LSTM is equipped to deal with long-term dependence. As LSTM gradually came to researchers' attention, numerous developments based on it have been studied.
The Gated Recurrent Unit (GRU) is an outstanding variation of LSTM whose performance is similar to LSTM's but with less computation [17]. Various novel works on RNNs have been published. S. Li et al. [22], for instance, proposed a new RNN design to achieve efficient kinematic control of redundant manipulators in the presence of noise. P. Shrey et al. [18] introduced a robot learning-from-demonstration paradigm based on LSTM to imitate a therapist's actions. D. Robert et al. [19] compared four RNN architectures (simple RNNs, LSTM, GRU, and mixed-history RNNs) for recognizing complex actions from robot kinematics and reported their differing performances.
Although these advanced techniques have achieved tremendous results, the utilization of RNNs in robot manipulator trajectory tracking has reached a plateau in terms of large memory bandwidth, weakness on very long sequences, and vast computation cost [23,25,34-36]. In fact, trajectory tracking requires real-time response, continuity, and accuracy [24,37,38]. Therefore, this paper presents a novel enhanced GRU model to achieve lower latency, smaller size, higher precision, and continuous solutions in trajectory-tracking tasks. For a better understanding of the advantages and disadvantages of various solutions for robot manipulator trajectory tracking, comparisons are given in Table 1 [27-29]. The main innovations and contributions of this paper are summarized below.
• This is an attempt to design a robot manipulator trajectory-tracking model by uniting a cell state with the GRU. The proposed enhanced GRU model converges in less time, offers excellent prediction ability, and resolves the slightly low performance of the standard GRU.
• A bridge from the various gate units to the neural cell state and hidden state is successfully built, which gives researchers more possibilities to develop further studies on top of it.
• A complete formula derivation of the enhanced GRU gives researchers a convenient and clearer path to improving its performance.
The remainder of this paper is arranged as follows. Section 2, as research background, explains the principles of three inverse kinematics solutions. Section 3 details the training process and performance of the enhanced GRU model. In Section 4, three trajectory-tracking applications, comparisons, and simulations in V-rep are presented. Section 5 concludes the study.

Related works
The general inverse kinematics solution, using a three-connected-rod structure as an example, is first presented in this section. Afterwards, the principles and processes of applying CNN, LSTM, and GRU to trajectory tracking are described.

General Inverse Kinematics Solution
A three-connected-rod structure (D-H parameters are shown in Fig. 2) is utilized here to explain how to resolve each joint's value from the transformation matrix [27]. The kinematics formula of this mechanism can be presented as below, where $^W_B T$ is the transformation matrix from rod $B$ to rod $W$, $c_{123}$ means $\cos(\theta_1 + \theta_2 + \theta_3)$ and $s_{123}$ stands for $\sin(\theta_1 + \theta_2 + \theta_3)$:

$$^W_B T = \begin{bmatrix} c_{123} & -s_{123} & 0 & l_1 c_1 + l_2 c_{12} \\ s_{123} & c_{123} & 0 & l_1 s_1 + l_2 s_{12} \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \quad (1)$$

To simplify the operation process, the target pose is written in the same shape, with orientation $\phi$ and position $(x, y)$:

$$^W_B T = \begin{bmatrix} c_\phi & -s_\phi & 0 & x \\ s_\phi & c_\phi & 0 & y \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \quad (2)$$

Now the algebraic method is applied to resolve (2). Equating the position elements of (1) and (2) gives

$$x = l_1 c_1 + l_2 c_{12}, \qquad y = l_1 s_1 + l_2 s_{12} \quad (3)$$

From (3), squaring and summing both equations yields

$$c_2 = \frac{x^2 + y^2 - l_1^2 - l_2^2}{2 l_1 l_2} \quad (4)$$

The condition for the above method to have a solution is that the right-hand side of (4) must lie between $-1$ and $1$. Physically, if (4) does not satisfy this condition, the target location and pose are a destination the manipulator cannot reach. Supposing the target point is in the reachable space of the robot manipulator, the expression of $s_2$ is

$$s_2 = \pm\sqrt{1 - c_2^2} \quad (5)$$

At last, the inverse tangent formula is applied to (5):

$$\theta_2 = \operatorname{atan2}(s_2, c_2) \quad (6)$$

The $\pm$ in (5) corresponds to the multiple solutions, which appear in this problem as the location of the manipulator's elbow (up or down). After getting the value of $\theta_2$, the value of $\theta_1$ can be resolved by transforming (3) as below:

$$x = k_1 c_1 - k_2 s_1, \qquad y = k_1 s_1 + k_2 c_1, \qquad k_1 = l_1 + l_2 c_2, \quad k_2 = l_2 s_2 \quad (7)$$

If we suppose $r$ as

$$r = \sqrt{k_1^2 + k_2^2} \quad (8)$$

and

$$\gamma = \operatorname{atan2}(k_2, k_1) \quad (9)$$

then

$$k_1 = r \cos\gamma, \qquad k_2 = r \sin\gamma \quad (10)$$

so (7) can be changed as below:

$$x/r = \cos\gamma \cos\theta_1 - \sin\gamma \sin\theta_1, \qquad y/r = \cos\gamma \sin\theta_1 + \sin\gamma \cos\theta_1 \quad (11)$$

hence

$$\cos(\gamma + \theta_1) = x/r, \qquad \sin(\gamma + \theta_1) = y/r \quad (12)$$

Applying the double-variable inverse tangent equation to (12),

$$\theta_1 = \operatorname{atan2}(y/r,\, x/r) - \gamma \quad (13)$$

At last, by combining the values of $\theta_1$ and $\theta_2$ with the orientation in (2), we can easily obtain $\theta_3 = \phi - \theta_1 - \theta_2$.
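The closed-form steps above can be sketched in Python. The helper below is a minimal illustration (the function name, the `elbow_up` flag, and link lengths `l1`, `l2` are assumptions for this sketch, not the paper's code), using `atan2` for the two inverse tangent steps:

```python
import numpy as np

def ik_3link_planar(x, y, phi, l1, l2, elbow_up=True):
    """Closed-form IK for a planar three-link arm (hypothetical helper).

    x, y: target position; phi: target orientation theta1+theta2+theta3;
    l1, l2: link lengths. Returns (theta1, theta2, theta3) in radians.
    """
    c2 = (x**2 + y**2 - l1**2 - l2**2) / (2.0 * l1 * l2)
    if not -1.0 <= c2 <= 1.0:
        # target outside the reachable workspace
        raise ValueError("target location and pose are unreachable")
    s2 = np.sqrt(1.0 - c2**2)          # the +/- choice: elbow up or down
    if not elbow_up:
        s2 = -s2
    theta2 = np.arctan2(s2, c2)
    k1 = l1 + l2 * c2
    k2 = l2 * s2
    # theta1 = atan2(y, x) - gamma, with gamma = atan2(k2, k1)
    theta1 = np.arctan2(y, x) - np.arctan2(k2, k1)
    theta3 = phi - theta1 - theta2
    return theta1, theta2, theta3
```

A forward-kinematics check (recomputing the end position from the returned angles) is the simplest way to validate such a solver.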

Convolutional Neural Network
A CNN, one of the representative algorithms of deep learning, is a feed-forward neural network that contains convolutional computation and a deep architecture [30]. The essential convolutional computation can be presented as

$$x_j^l = f\Big(\sum_i x_i^{l-1} * k_{i,j}^l + b_j^l\Big) \quad (14)$$

where $x_j^l$ is feature map $j$ in layer $l$, $k_{i,j}^l$ is convolutional kernel $j$ in layer $l$, $b_j^l$ is the $j$th bias in layer $l$, and $f$ stands for the activation function. The tanh function is applied here:

$$f(a) = \tanh(a) = \frac{e^a - e^{-a}}{e^a + e^{-a}} \quad (15)$$

As shown in (14), in order to determine the values of $k$ and $b$, a gradient-descent optimization algorithm is utilized. Adam [31] is one of the most commonly used adaptive algorithms. The gradient is calculated as

$$g_t = \nabla_\theta f_t(\theta_{t-1}) \quad (16)$$

where $g_t$ represents the gradient of the objective $f_t(\theta)$ at epoch $t$. The major step to decrease the gradient is

$$m_t = \beta_1 m_{t-1} + (1 - \beta_1) g_t, \qquad v_t = \beta_2 v_{t-1} + (1 - \beta_2) g_t^2 \quad (17)$$

$$\theta_t = \theta_{t-1} - \alpha_t \frac{m_t}{\sqrt{v_t} + \epsilon}, \qquad \alpha_t = \alpha \frac{\sqrt{1 - \beta_2^t}}{1 - \beta_1^t} \quad (18)$$

where $\beta_1$ and $\beta_2$ are user-specified numbers in $[0, 1)$, $m_t$ is the exponential moving average of the gradient, $v_t$ is that of the squared gradient, and $\alpha_t$ is the step size whose exponential attenuation is controlled by $\beta_1$ and $\beta_2$.
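As a concrete illustration of the Adam update described above, here is a minimal sketch of one step (the function name is an assumption; parameter names follow the usual Adam notation, and this is not the paper's implementation):

```python
import numpy as np

def adam_step(theta, grad, m, v, t, alpha=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update for parameters theta given gradient grad at step t >= 1."""
    m = beta1 * m + (1 - beta1) * grad        # EMA of the gradient
    v = beta2 * v + (1 - beta2) * grad**2     # EMA of the squared gradient
    m_hat = m / (1 - beta1**t)                # bias correction
    v_hat = v / (1 - beta2**t)
    theta = theta - alpha * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v
```

Running this on a simple quadratic objective, e.g. `f(theta) = theta**2` with gradient `2 * theta`, drives `theta` toward zero.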

Gated Recurrent Unit
As a popular variant of LSTM, the GRU was introduced to reach similar performance to LSTM with fewer computational resources. Compared with LSTM, which contains three gates, the GRU has a simpler structure with only two gates: an update gate and a reset gate [33].
The expression of the GRU is as below:

$$z_t = \sigma(W_z \cdot [h_{t-1}, x_t]), \qquad r_t = \sigma(W_r \cdot [h_{t-1}, x_t]),$$
$$\tilde h_t = \tanh(W \cdot [r_t \odot h_{t-1}, x_t]), \qquad h_t = (1 - z_t) \odot h_{t-1} + z_t \odot \tilde h_t \quad (19)$$

The $r_t$ in (19) belongs to the reset gate, which controls the extent to which the previous cell state affects the current cell state. The function of $z_t$ in the update gate is to forget or choose to memorize information from the last cell.
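The GRU expressions above can be condensed into a single forward step. The numpy helper below is a minimal illustration (the weight names and the omission of bias terms are simplifying assumptions of this sketch):

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def gru_step(x_t, h_prev, Wz, Wr, Wh):
    """One standard GRU step; each weight matrix acts on [h_{t-1}, x_t]."""
    hx = np.concatenate([h_prev, x_t])
    z = sigmoid(Wz @ hx)                                  # update gate
    r = sigmoid(Wr @ hx)                                  # reset gate
    h_tilde = np.tanh(Wh @ np.concatenate([r * h_prev, x_t]))  # candidate state
    return (1 - z) * h_prev + z * h_tilde                 # mix old and candidate
```

Because the output is a convex combination of the previous state and a tanh-bounded candidate, the hidden state stays in (-1, 1) when started from zeros.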

Cell State and Gate Units
The cell state and gate units were introduced to solve the long-term dependency problem [32]. The cell state can be deemed to store historical information and record the last cell's situation. However, a portion of invalid information still exists in the cell state, which is why gate units were presented to alleviate this problem. The most notable gate units are the forget gate, the input gate, and the output gate. The first, the forget gate, is utilized to drop information from the cell state:

$$f_t = \sigma(W_f \cdot [h_{t-1}, x_t] + b_f) \quad (20)$$

where $f_t$ gives the percentage of retained information from the previous cell state, $h_{t-1}$ is the hidden state of the last cell, and $\sigma$ in (20) stands for the sigmoid function. Next, the input gate, which includes two procedures, is used to update the information in the cell state:

$$i_t = \sigma(W_i \cdot [h_{t-1}, x_t] + b_i), \qquad \tilde C_t = \tanh(W_C \cdot [h_{t-1}, x_t] + b_C) \quad (21)$$

where $i_t$ indicates the percentage of retained addable information and $\tilde C_t$ is the addable information. After obtaining the outputs of the forget gate and input gate, they are combined into the cell state:

$$C_t = f_t \odot C_{t-1} + i_t \odot \tilde C_t \quad (22)$$

In the last step, once the cell state has been updated, $h_{t-1}$ and $x_t$ are taken as input through the output gate:

$$o_t = \sigma(W_o \cdot [h_{t-1}, x_t] + b_o) \quad (23)$$

$$h_t = o_t \odot \tanh(C_t) \quad (24)$$

where $o_t$ is the reserved part of the cell state and $h_t$ is the hidden state of the cell.
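How the three gates edit the cell state can be condensed into a single step. The sketch below is a minimal numpy illustration of the standard LSTM update (weight names are assumptions, and biases are omitted for brevity):

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def lstm_step(x_t, h_prev, C_prev, Wf, Wi, Wc, Wo):
    """One LSTM step; each weight matrix acts on [h_{t-1}, x_t]."""
    hx = np.concatenate([h_prev, x_t])
    f = sigmoid(Wf @ hx)            # forget gate: fraction of old cell state kept
    i = sigmoid(Wi @ hx)            # input gate: fraction of candidate info added
    C_tilde = np.tanh(Wc @ hx)      # candidate (addable) information
    C = f * C_prev + i * C_tilde    # cell state update
    o = sigmoid(Wo @ hx)            # output gate: fraction of cell state exposed
    h = o * np.tanh(C)              # new hidden state
    return h, C
```

The cell state `C` is the additive "memory lane" that lets gradients flow over long sequences, which is exactly the property the enhanced GRU later borrows.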

Experimental Method
As the research background of this study, the principles and solving processes of four popular solutions were introduced in the previous section. In this section, two neural network models are established for resolving robot manipulator trajectory tracking. For the trajectory-tracking task, we want to obtain the transformation from the transformation matrix $T$ to the joint angles $\vec\theta$. The relation is

$$T = \begin{bmatrix} a_{1,1} & a_{1,2} & a_{1,3} & x \\ a_{2,1} & a_{2,2} & a_{2,3} & y \\ a_{3,1} & a_{3,2} & a_{3,3} & z \\ 0 & 0 & 0 & 1 \end{bmatrix} \;\longrightarrow\; \vec\theta \quad (25)$$

where the $a_{i,j}$ form the rotation matrix and $x$, $y$, $z$ indicate the location of the manipulator end.

CNN for Trajectory Tracking
A CNN architecture is proposed in this subsection to predict joint angles according to $^0_7T$; it contains six main layers, shown in Fig. 2 (one reshape layer, four convolution layers, and one dense layer).
As shown in (25), after removing the bottom row of $^0_7T$, which has no physical meaning, the remaining part is the input data. A reshape layer converts this 3×4 matrix into a 12-dimensional vector. One convolution layer with 12 kernels then convolves the vector, and the next convolution layer with 24 kernels convolves its output. The following convolution layer also has 24 kernels but applies dropout with rate 0.2. A standard CNN relies on the gradient of each parameter to reduce the difference between prediction and real data, which may cause over-fitting due to complex co-adaptation. The detailed operation of dropout is to drop a portion of neural cells with probability $p$ and reserve the others with probability $1 - p$. The deeper convolution layer has 36 kernels and a 0.1 dropout. Finally, all the feature maps go through the fully connected layer, which maps the 1×36 feature to a 1×7 output vector: the joint values.
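The dropout operation described above (drop with probability $p$, keep with probability $1-p$) can be sketched as follows. The inverted-dropout rescaling by $1/(1-p)$ is a common convention assumed here, not necessarily the paper's exact variant:

```python
import numpy as np

def dropout(x, p, rng, train=True):
    """Inverted dropout: zero each activation with probability p and rescale
    survivors by 1/(1-p) so the expected activation is unchanged at test time."""
    if not train or p == 0.0:
        return x
    mask = rng.random(x.shape) >= p     # True with probability 1 - p
    return x * mask / (1.0 - p)
```

At inference (`train=False`) the layer is the identity, so no rescaling is needed there.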

Enhanced GRU for Trajectory Tracking
The principles of the GRU and the cell state were explained in Section 2. Appending a cell state and gate units to the GRU is a valid approach to improving model performance [39-41]. A novel enhanced GRU is therefore introduced here to acquire excellent convergence, real-time ability, and strong performance.
The detailed architecture and components of the enhanced GRU are indicated in Fig. 3, including one cell state, one hidden cell state, and five gate units. This architecture integrates the reset, update, forget, input, and output gates to help the cell state and hidden cell state add or subtract information. The unit spans two time steps: its inputs are the last unit's hidden state $h_{t-1}$, the raw data $x_t$ and $x_{t+1}$, and the last unit's cell state $C_{t-1}$, while its outputs are the current unit's cell state $C_{t+1}$ and hidden state $h_{t+1}$.
The derivation of the unit formulas proceeds as follows. The expression of the intermediate quantity $h_t$ is the same as that in (19). Merging (20) with (19) gives the forget-gate output; then, according to (22) and (19), $i_{t+1}$ and $\tilde C_{t+1}$ can be calculated, whereupon $C_{t+1}$ follows; finally, with (24) and (19), the mixed hidden state $h_{t+1}$ is obtained. This completes the solution procedure of the enhanced GRU. To meet the enhanced GRU's solution modality, $^0_7T$ is flattened to a 12-dimensional tensor. In the enhanced GRU model, two cascaded enhanced GRUs with dropout layers are utilized to extract sequence features from the input tensor, as shown in Fig. 3. At last, seven fully connected neural cells transform the feature state into the joint angles $\vec\theta$.
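The combination described above can be sketched in code. Since the paper's exact equations are given only by reference here, the sketch below is one plausible reading of Fig. 3, under the stated assumption that LSTM-style forget/input/output gates edit a cell state while GRU-style reset/update gates form the mixed hidden state; the weight dictionary `W` and all names are hypothetical, not the authors' implementation:

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def enhanced_gru_step(x_t, h_prev, C_prev, W):
    """One hypothetical enhanced-GRU step combining five gates and a cell state."""
    hx = np.concatenate([h_prev, x_t])
    f = sigmoid(W["f"] @ hx)          # forget gate on the cell state
    i = sigmoid(W["i"] @ hx)          # input gate
    o = sigmoid(W["o"] @ hx)          # output gate
    r = sigmoid(W["r"] @ hx)          # reset gate
    z = sigmoid(W["z"] @ hx)          # update gate
    C_tilde = np.tanh(W["c"] @ np.concatenate([r * h_prev, x_t]))
    C = f * C_prev + i * C_tilde      # cell state carries long-term history
    h = (1 - z) * h_prev + z * (o * np.tanh(C))   # mixed hidden state
    return h, C

def enhanced_gru_unit(x_t, x_t1, h_prev, C_prev, W):
    """The unit of Fig. 3 spans two time steps: inputs h_{t-1}, C_{t-1},
    x_t, x_{t+1}; outputs h_{t+1}, C_{t+1}."""
    h_t, C_t = enhanced_gru_step(x_t, h_prev, C_prev, W)
    return enhanced_gru_step(x_t1, h_t, C_t, W)
```

The design point being illustrated is that the cell state gives the GRU an additive memory path (as in LSTM) without abandoning the GRU's cheap two-gate hidden-state mixing.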

Applications, Comparisons and Tests
Performance is analysed by using a KUKA iiwa LBR 7 as the experimental robot manipulator in the V-rep simulation environment, via three different trajectory-tracking tasks, to indicate the feasibility of the enhanced GRU model. As the interaction between the coding environment and the V-rep simulation environment is shown in Fig. 4, the peripheral controller obtains the joint angles $\vec\theta$ while three varieties of methods are utilized to resolve the transformation matrix. $\vec\theta$ is then sent to the V-rep simulation environment through the Python remote client, and the location of the manipulator end is returned. Afterwards, comparisons of accuracy, measured as Cartesian space distance, against the other inverse kinematics solutions are recorded. Due to the discontinuity of the general inverse kinematics, it did not complete the test and is not used as a comparison object in Section 4.2; its real performance is presented at the end of this section.

Enhanced GRU Training Process
TensorFlow 2.0 is applied in this paper to build the enhanced GRU model of Fig. 3. Before training, the input data should be standardized to improve the generalization of the deep learning model; the standardization method is

$$\hat x = \frac{x - \bar x}{x_{max} - x_{min}} \quad (32)$$

where $\hat x$ is the standardized data, $x$ is the original data, $\bar x$ is the average value of $x$, and $x_{max}$ and $x_{min}$ are the maximum and minimum values of $x$ respectively. The standardized data are then split into a training dataset and a validation dataset in the proportion four to one, which measures performance on unseen data and meanwhile enhances the generalization ability of the model. Throughout training, the mean absolute error (MAE) is utilized to analyse the quality of the model:

$$mae = \frac{1}{nm} \sum_{i=1}^{n} \sum_{j=1}^{m} \left| \hat\theta_{i,j} - \theta_{i,j} \right| \quad (33)$$

where $n$ in (33) is the number of samples in one batch, $m$ is the number of joints, $\hat\theta_{i,j}$ is the predicted $j$th joint value in the $i$th sample, and $\theta_{i,j}$ is the real $j$th joint angle in the $i$th sample [42,43]. Each of the three trajectories consists of 450 points and is presented as sub-figure (a) in Fig. 6, Fig. 7, and Fig. 8. Trajectory 1 has the largest span along the Z-axis and a small inflection point, while the main part of trajectory 2 is S-shaped with little span along the Z-axis and various inflection points. From the top view, trajectory 3 is M-shaped and its latter half is a straight-line motion.
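Equations (32) and (33) translate directly into code. The helpers below are illustrative sketches (the function names are assumptions):

```python
import numpy as np

def standardize(x):
    """Eq. (32): centre by the mean, scale by the range."""
    return (x - x.mean()) / (x.max() - x.min())

def mae(theta_pred, theta_true):
    """Eq. (33): mean absolute error over n samples and m joints."""
    return np.mean(np.abs(theta_pred - theta_true))
```

After (32), the data are zero-mean with a range of exactly one, which keeps all 12 input features on a comparable scale.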
The whole training process contains 6000 epochs with the Adam loss optimizer and a batch size of 128. The loss curves over the whole training process are illustrated in Fig. 5, where (a), (b), and (c) present the training on trajectories 1 to 3 respectively. As Fig. 5(a) indicates, the loss on the training dataset reached slightly less than 0.1 at 1000 epochs and settled near 0.05 at about 5000 epochs. The situation is similar on the validation dataset of trajectory 1 but with a lower loss. Due to the higher difficulty of trajectory 2, the loss of both its datasets is slightly over 0.1 at 1000 epochs, declining to approximately 0.07 at 6000 epochs. The loss curve of the last training run is somewhat zigzag; it reaches roughly 0.1 near 1000 epochs and is cut in half by 3000 epochs.

Comparison with Standard GRU and CNN Model
In order to investigate the differences between the enhanced GRU, standard GRU, and CNN models, the same trajectory-tracking tasks were tested on all of them (the two other neural network models' training processes were copied from the enhanced GRU's). Their theoretical performances are calculated by forward kinematics and indicated in Fig. 6 to Fig. 8. The trajectories generated by the standard GRU model are not shown in these figures on account of their visual closeness to the enhanced GRU's, but a quantitative analysis (mean space distance) of the discrepancy between them is given in Table 4. In the 3D view of sub-figure (a) in Fig. 6, the real trajectory and the prediction trajectory almost overlap, while the prediction trajectory is slightly offset at the corner on the top. The situation becomes clearer in the top view of sub-figure (b) in Fig. 6: the real trajectory is slightly crooked, while the prediction trajectory slides diagonally down from the top at the corner on the left. After the first corner, the gap between the two trajectories narrows. In the 3D view of sub-figure (a) in Fig. 7, the prediction trajectory is also similar to the real trajectory, though some points depart from the desired trajectory. When the viewpoint changes to the top, the sawtooth of the prediction trajectory becomes more distinct, but the overall trajectories resemble each other. It should be pointed out that the prediction trajectory has a distinct radian when it comes to a corner. Some scattered points of the prediction trajectory are particularly evident in the 3D view.

Figure 6. Motion results of the enhanced GRU model and the CNN model with a robot manipulator tracking trajectory 1 (the trajectories generated by the standard GRU model are not shown on account of their visual closeness to the enhanced GRU's; the quantitative analysis, mean space distance, of the discrepancy is given in Table 4).
The result of the general IK solution is not presented because its multiple solutions and singularities cause V-rep to become stuck. The trajectory generated by the enhanced GRU model performs better, although at the first corner there is still a slight difference from the desired trajectory. In contrast, the prediction trajectory of the CNN shows that the CNN model is under-fitting.

Figure 7. Motion results of the enhanced GRU model and the CNN model with a robot manipulator tracking trajectory 2 (the trajectories generated by the standard GRU model are not shown on account of their visual closeness to the enhanced GRU's; the quantitative analysis, mean space distance, of the discrepancy is given in Table 4).

For trajectory 2, the trajectory generated by the enhanced GRU model again performs better, though at the corners there are still slight differences from the desired trajectory, while the prediction trajectory of the CNN shows that the CNN model did not converge. For a better analysis of the difference between the prediction and real trajectories, the average Cartesian space distance is calculated and presented in Table 4. The largest space distance for both models occurs in the trajectory 2 tracking task, meaning its difficulty is the greatest: the average space distances between the prediction and desired trajectories are 181.56 mm for the enhanced GRU model and 1130.00 mm for the CNN model. The smallest space distance appears in the trajectory 1 tracking task, with values of 126.85 mm and 219.97 mm respectively. Notably, all the trajectory-tracking tasks via the enhanced GRU model stay below 200 mm, which demonstrates the feasibility of the enhanced GRU model in trajectory tracking.
To intuitively investigate the trajectories, this paper applied these models to the trajectory-tracking case for visualization and simulation in V-rep.

Figure 8. Motion results of the enhanced GRU model and the CNN model with a robot manipulator tracking trajectory 3 (the trajectories generated by the standard GRU model are not shown on account of their visual closeness to the enhanced GRU's; the quantitative analysis, mean space distance, of the discrepancy is given in Table 4). The result of the general IK solution is not presented because its multiple solutions and singularities cause V-rep to become stuck.

In trajectory 3, the gap between the desired trajectory and the prediction trajectory of the enhanced GRU is smaller. Although the difference is very small, one point should be pointed out: the enhanced GRU performs better on arc angles. The CNN model would require more than 6000 epochs to converge, given its poor performance in the trajectory 3 tracking task. As illustrated in Fig. 9, the best tracking performance among the three solutions belongs to the enhanced GRU, while the general IK solution reached the worst result. Because the solution generated by the general IK solution is not related to the solution at the previous time $t-1$, the pose changes abruptly, which causes V-rep to become stuck even though the trajectory is correct. The manipulator controlled by the CNN has a pose resembling the desired motion but with low accuracy with respect to the real trajectory. Only one comparison in V-rep is shown in this section; a full view of the performance can be found in the enclosed video.
To summarize, the general IK solution is excellent at solving a single problem but weak at dealing with continuous issues. In addition, due to its multiple solutions and singularities, it is not a good choice for trajectory tracking. The standard GRU converges promptly but lacks precision in detail, especially at corners. The potential of the CNN was not fully developed in this section on account of its long convergence time. The enhanced GRU model showed the best performance among the solutions, in the training process, in trajectory error, and in visual simulation.

Conclusion
In conclusion, this paper introduces a novel enhanced GRU to solve the trajectory tracking of robot manipulators. By incorporating a cell state and gate units into the GRU model, the enhanced GRU achieves fast convergence and high accuracy in trajectory tracking and pose imitation. The main principle of this study is to unite the mixed hidden state and the cell state into one hybrid unit to resolve the slightly low performance of the standard GRU. The derivations of the calculation of the cell state and the mixed hidden state of the presented model have been given. Finally, three trajectory-tracking tasks, a comparison, and a visual simulation have verified the feasibility and superiority of the enhanced GRU model.