Optimal Search Strategy of Robotic Assembly Based on Neural Vibration Learning

This paper presents the implementation of an optimal search strategy (OSS) for verifying an assembly process based on neural vibration learning. The application problem is the complex robot assembly of miniature parts, illustrated by mating the gears of a multistage planetary speed reducer. Assembly of the tube over the planetary gears was identified as the most difficult step of the overall assembly. A favourable influence of vibration and rotation movement on tolerance compensation was also observed. With the proposed neural-network-based learning algorithm, it is possible to find an extended scope of vibration state parameters. Using an optimal search strategy based on the minimal-distance path between vibration parameter stage sets (amplitudes and frequencies of the robot gripper vibration), together with a recovery parameter algorithm, we can improve the robot assembly behaviour, that is, allow the fastest possible mating. We have verified with simulation programs that the search strategy is suitable for unexpected events due to uncertainties.


Introduction
Planning is a key ability of intelligent systems, increasing their autonomy, reliability, efficiency, and flexibility through the construction of sequences of actions to achieve their goals [1]. In artificial intelligence, planning originally meant a search for a sequence of logical operators or actions that transform an initial world state into a desired goal state. Robot motion planning usually ignores dynamics and considers other aspects, such as uncertainties, differential constraints, modeling uncertainties, and optimality. Robotic assembly, wheelchair navigation, sewer inspection robots, autonomous driving systems in urban and off-road environments, and machine task planning for robotic systems are all examples of autonomous systems that solve path planning/replanning problems [2,3].
Dynamic replanning is necessary because at any time during the execution of its tasks the robot might unexpectedly run into problems [2]. The typical approach to replanning relies on repair plans, which are prepared in advance and invoked to deal with specific exceptions during execution. This class of approaches may work well in relatively static and predictable environments. In more dynamic and uncertain environments, where it is hard to anticipate possible exceptions, replanning generates a (partially) new plan when one or more actions encounter problems during execution [4].
A very interesting area of research is the use of planning strategies in robot assembly. Components can be assembled faster, more gently, and more reliably using intelligent techniques. In order to create robot behaviours that are similarly intelligent, we seek inspiration from human strategies [5]. The working theory is that a human accomplishes an assembly in phases, with a defined behaviour and a subgoal in each phase. The human changes behaviours according to events that occur during the assembly, and the behaviour is consistent between events. The human's strategy is similar to a discrete event system in that the human progresses through a series of behavioural states separated by recognizable physical events. The primary source of difficulty in automated assembly is the uncertainty in the relative position of the parts being assembled [6]. The crucial issue in robot assembly is how to enable a robot to accomplish a task successfully in spite of the inevitable uncertainties. Often a robot motion may fail and result in some unintended contact between the part held by the robot and the environment.
There are generally three types of approaches to tackle this problem. One is to model the effect of uncertainties in the off-line planning process, but computability is the crucial issue. A different approach is to rely on on-line sensing to identify errors caused by uncertainties in a motion process and to replan the motion in real time based on the sensed information. The third approach is to use task-dependent knowledge to obtain efficient strategies for specific tasks rather than focusing on generic, task-independent strategies.
A systematic replanning approach, consisting of patch planning based on contact analyses and motion strategy planning based on constraints on nominal and uncertainty parameters of sensing and motion, is introduced in [7]. In order to test the effectiveness of the replanning approach, the authors developed a general geometric simulator, SimRep, which implements the replanning algorithms, allows flexible design of task environments and modeling of nominal and uncertainty parameters to run the algorithms, and simulates the kinematic robot motions guided by the replanning algorithms in the presence of uncertainties.
Another possibility for achieving acceptably fast robot behavior while assuring contact stability is learning the unstructured uncertainties in robot manipulators. Many promising intelligent control methods have been investigated [5,8]. For example, the work in [9] describes an intelligent mechanical assembly system in which the correct assembly path is chosen by a form of genetic algorithm search, so that new vectors are evolved from the most successful "parents." The paper [10] presents the implementation of an intelligent search strategy based on a genetic algorithm for verifying an assembly process in the presence of uncertainties.
The main contribution of our work is the use of an optimal search strategy in combination with robot learning from an experimental setup. The research platform is the complex robot assembly of miniature parts, illustrated by mating the gears of a multistage planetary speed reducer. Assembly of the tube over the planetary gears was identified as the most difficult step of the overall assembly, and a favorable influence of vibration and rotation movement on tolerance compensation was also observed. For robotic assembly, tolerance is an especially difficult problem because it must be compensated during mating, which takes time and requires corresponding algorithms. In order to compensate tolerance during robot assembly, we plan the motion over path alternatives so as to yield the minimum distance. Neural-network-based learning gave us new successful vibration solutions for each stage of the reducer [11]. In this paper, we introduce an optimal search strategy based on the vibration parameter states in order to overcome uncertainties during motion planning.

Robot Assembly
The first part of our research was the complex robot assembly of miniature parts, illustrated by mating the gears of a multistage planetary speed reducer. The main difficulty in the assembly of planetary speed reducers is the installation of the tube over the planetary wheels. Namely, the teeth of all three planetary wheels must be mated with the toothed tube. Figure 1 presents the planetary speed reducer (cross-section 20 mm, height of five stages 36 mm) that was used for the experiments.
This research did not consider the complete assembly of each part of the planetary reducer, but only the process of connecting the toothed tube to the five-stage planetary reducer. Once the problem of gear assembly is solved, the complete assembly of the planetary speed reducer poses no further difficulty.
For the assembly process, a vertical articulated robot with six degrees of freedom (type S-420i from FANUC) was used, complemented by a vibration module (Figure 2) developed at Fraunhofer-IPA in Stuttgart, Germany.
Namely, the analysis of the assembly process showed that movement based on vibration and rotation had a positive effect on the course of the process. The vibration module produced vibration in the x- and y-directions and rotation around the z-axis.
When the robot started working, the vibration module vibrated with a set amplitude (±2 mm) and frequency (up to 10 Hz) for each stage of the reducer. Ideal Lissajous figures (double eight, circle, and line) were used as vibration figures in extensive experiments. The horizontal EIGHT vibration figure (Figure 3) was selected for later experiments because it gave the best performance in the assembly process. In that case, the frequency ratio between the lower and upper plates was 2:1, as in the pairs $f_D/f_U$ = 4/2, 6/3, 8/4, and 10/5 used in the experiments.
During the robot assembly of two or more parts, we encountered the problem of tolerance compensation.
According to how they function, systems for tolerance compensation can be divided into: (i) controllable (active) systems, in which the movement is corrected on the basis of sensor information about the tolerance; (ii) uncontrollable (passive) systems, in which the orientation of the external parts is achieved by means of a predetermined search strategy or is forced by the connection forces; (iii) combinations of the two cases above.
For this assembly system, a passive mechanism of tolerance compensation was used, with specially adjusted vibration of the installation tools [12]. The assembly process started with the gripper positioned, together with the toothed tube, exactly 5 mm above the base part of the planetary reducer and continued in the direction of the negative z-axis (Figure 4). In order to compensate tolerance during robot assembly, the experimental setup used a "search strategy" that adjusted amplitudes and frequencies gained from experimental experience (amplitudes of the upper and lower plates, frequencies of the upper and lower plates). The optimum amplitudes of the lower and upper plates, valid for all stages of the reducer, were $A_D = A_U = 0.8$ mm.
From the experiments, we found that lower vibration frequencies were better ($f_D/f_U$ = 4/2 or 6/3) for stages 1-2 (stages are counted from top to bottom), while for each subsequent stage the assembly process improved with higher frequencies ($f_D/f_U$ = 8/4 or 10/5).
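The horizontal-eight vibration figure with a 2:1 frequency ratio can be illustrated with a short sketch. This is only an idealised Lissajous curve, not the controller of the vibration module: the function name `horizontal_eight`, the sampling scheme, and the assignment of the slow component to x and the fast component to y are assumptions made for illustration.

```python
import math


def horizontal_eight(amplitude_mm=0.8, f_slow_hz=2.0, f_fast_hz=4.0, n_samples=200):
    """Sample one period of a 2:1 Lissajous figure (a horizontal eight).

    Assumption: the slow component drives x and the fast component drives y;
    the text does not state which plate moves along which axis.
    """
    period = 1.0 / f_slow_hz  # one full period of the slower component
    points = []
    for i in range(n_samples + 1):
        t = period * i / n_samples
        x = amplitude_mm * math.sin(2 * math.pi * f_slow_hz * t)
        y = amplitude_mm * math.sin(2 * math.pi * f_fast_hz * t)
        points.append((x, y))
    return points


# Amplitude 0.8 mm and the 4/2 frequency pair reported for stages 1-2.
path = horizontal_eight()
```

The curve starts at the origin, crosses it again at half the period, and stays within the 0.8 mm amplitude on both axes, which gives the two lobes of the horizontal eight.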
In case of jamming for various physical reasons (position, friction, force, etc.), the robot returned to the beginning of the reducer stage in which the jamming occurred. It exploited the technique of blind search in the optimal parameter space, with repeated trials of the manipulation task. Once the jamming had been overcome, the robot kept moving until it reached the final point of the assembly.
Robot speeds from 4 to 20 mm/s were used. The time of the complete assembly process for a given range of speeds was a function of the frequency and amplitude of the upper and lower plates of the vibration module, the amplitude and frequency of the motor rotation, and the speed of movement in the z-direction. The fastest assembly was obtained at a robot movement speed of 16 mm/s, for which the complete assembly process took only 4 s.
Extensive and complex experimental investigations were carried out to find the optimum solution, because many parameters had to be specified in order to complete the assembly process within the defined time. However, tuning those parameters experimentally for improved performance is a time-consuming process. To make this search strategy more intelligent, additional learning software was created to enable improvements in performance.

Advanced Replanning
Planning involves the representation of actions and world models, reasoning about the effects of actions, and techniques for efficiently searching the space of possible plans. Well-known search algorithms tailored to planning problems are heuristic search algorithms (A*, D*, and Dijkstra's algorithm) and their variants.
Planning under uncertainty is a hard problem and requires a replanning task structure. For example, the robot has to be able to plan the demonstrated task before executing it if the state of the environment has changed after the demonstration took place. The objects to be manipulated are not necessarily at the same positions as during the demonstration, and thus the robot may face a starting configuration it has never seen before. Replanning is used as a specific case of the planning process (in case of jamming). Combined with the planning operation, the replanning strategy can be described as follows.
(1) Given the initial planning problem $P_a = (S, G)$, where $S$ is an initial state parameter and $G$ is a goal state parameter, a plan $P_a$ is a network of actions that lead from $S$ to $G$ (the result of the optimal search strategy is a set of state parameters).
In order for robots to react to stochastic and dynamic environments, they need to learn how to optimally adapt to uncertainty and unforeseen changes [13]. Robot learning covers a rather large field, from learning to perceive, to plan, to make decisions, and so forth. Learning control is concerned with learning control in simulated or actual physical robots. It refers to the process of acquiring a control strategy for a particular control system and a particular task by trial and error.
In our research, we use the concepts of state and action to describe the relationships between the parts being assembled. Namely, the states are assembly parameters (vibration amplitudes and frequencies for each planetary reducer stage), and transition actions (minimal paths) are used to move through the assembly process from one stage of the planetary reducer to the next.
Our planning/replanning search strategy consists of three phases: (i) a learning phase (assembly through the states in X, trying various actions, and collecting data), (ii) a planning phase, and (iii) a replanning phase.
We use neural-network-based learning, which gives us new successful vibration solutions for each stage of the reducer. With these extended vibration parameters as source information for the planning/replanning task, we introduce an optimal search strategy for robot assembly (Figure 5).
The error model is used to model various dynamic effects of uncertainties and physical constraints caused by jamming. By combining the efforts of the planner and the learned optimal values, the replanner is expected to guarantee that the agent system enters the region of convergence of its final target location.
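The plan, fail, replan, resume cycle described in this section can be sketched as a small loop. This is a minimal illustration under stated assumptions, not the authors' simulator: the planner `plan` simply returns one action per reducer stage, and the set `fails` of (stage, attempt) pairs is a hypothetical stand-in for the error event signal raised on jamming.

```python
def plan(start, goal):
    """Hypothetical planner: one action per stage from `start` up to `goal`."""
    return list(range(start, goal))


def execute_with_replanning(start, goal, fails, max_replans=10):
    """Execute a plan and replan from the failed stage on each error event.

    `fails` holds (stage, attempt) pairs for which the action fails; it is an
    illustrative stand-in for the jamming / error event signal.
    """
    current_plan = plan(start, goal)  # initial plan P_a = (S, G)
    attempts = {}
    log = []
    replans = 0
    while current_plan:
        stage = current_plan[0]
        attempt = attempts.get(stage, 0)
        attempts[stage] = attempt + 1
        if (stage, attempt) in fails:  # an action u fails: replanning area RA = {u}
            replans += 1
            if replans > max_replans:
                raise RuntimeError("replanning limit reached")
            log.append(("replan", stage))
            current_plan = plan(stage, goal)  # new problem P_b = (S', G)
            continue  # resume execution with P_b
        log.append(("done", stage))
        current_plan = current_plan[1:]
    return log


# One jamming event on the first attempt at stage 1 of a three-stage task.
events = execute_with_replanning(0, 3, {(1, 0)})
```

In this example the log records stage 0 as done, a replan at stage 1, and then stages 1 and 2 as done, mirroring a return to the beginning of the stage in which the jamming occurred.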

Neural Aspects of Robot Assembly Learning
Machine learning usually refers to changes in systems that perform tasks associated with artificial intelligence [14]. The changes might be either enhancements to already performing systems or the synthesis of new systems.
Artificial neural networks are capable of modeling complex mappings between the inputs and outputs of a system up to an arbitrary precision [15]. The process of "capturing" the unknown information is called "learning of the neural network" or "training of the neural network". In mathematical formalism, to learn means to adjust the free parameters (synaptic weight coefficients and bias levels) in such a way that certain conditions are fulfilled [16].
There exist many types of neural networks, but the basic principles are very similar. Neural-network-based learning is used in this research to generate a wider scope of parameters in order to improve the robot behaviour. The amplitude and frequency vibration data are collected during the assembly experiments and are used as sources of information for the learning algorithm.
In our research, we used multilayer feed-forward neural networks (MLF) and Elman neural networks. MLFs trained with a backpropagation learning algorithm are the most popular neural networks. The Elman neural network differs from conventional ones in that the input layer has a recurrent connection with the hidden one. Therefore, at each time step, the output values of the hidden units are copied to the input units, which store them and use them for the next time step. This process allows the network to memorize some information from the past and thus to better detect periodicity in the patterns [17].
We expected the Elman neural network to perform better than a standard MLF in our application, but we obtained better results with the MLF. Namely, the MLF is better at learning to extend the range of learned parameters.
In our research, we used an MLF neural network containing 10 tansig neurons in the hidden layer and 1 purelin neuron in the output layer. The feed-forward neural networks were formed and tested for each stage of the assembly process. Each one was initialized with random amplitudes $A_U = A_D = A_i$ between 0 and 2 and frequency values $f_i$ between 0 and 4. Namely, the range of the frequency measurements is normalized by mapping the frequency ratios $f_U/f_D$ = (4/2, 6/3, 8/4, 10/5) onto the range of the state frequency values (0 through 4).
To train the MLF networks, we used 35 vibration sets for each of the 5 phases of assembly. The target mean square errors (MSE) of the 5 MLF networks were reached within 7-10 training epochs. Two thousand data points were taken as a testing sample.
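The network shape described above (10 hidden neurons with a tanh-type activation, which is what tansig computes, and one linear output neuron, as purelin does) can be sketched in plain Python. Several things here are assumptions for illustration only: the single input (an amplitude) and single output (a normalized frequency value), the five training pairs, the function names, and the use of plain stochastic gradient descent, which is swapped in for simplicity and is not claimed to be the training algorithm used in the paper.

```python
import math
import random


def init_network(n_hidden=10, seed=0):
    """Create a 1-input, n_hidden tanh, 1 linear-output network."""
    rng = random.Random(seed)
    return {
        "w1": [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)],
        "b1": [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)],
        "w2": [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)],
        "b2": 0.0,
    }


def forward(net, x):
    """Return the hidden activations and the linear output for input x."""
    hidden = [math.tanh(w * x + b) for w, b in zip(net["w1"], net["b1"])]
    output = sum(w * h for w, h in zip(net["w2"], hidden)) + net["b2"]
    return hidden, output


def mse(net, samples):
    """Mean square error over (input, target) pairs."""
    return sum((forward(net, x)[1] - y) ** 2 for x, y in samples) / len(samples)


def train(net, samples, epochs=200, lr=0.01):
    """Backpropagation with per-sample gradient descent (an assumed optimizer)."""
    for _ in range(epochs):
        for x, y in samples:
            hidden, out = forward(net, x)
            err = out - y
            for j, h in enumerate(hidden):
                # Gradient through tanh, computed before w2[j] is updated.
                grad_hidden = err * net["w2"][j] * (1.0 - h * h)
                net["w2"][j] -= lr * err * h
                net["w1"][j] -= lr * grad_hidden * x
                net["b1"][j] -= lr * grad_hidden
            net["b2"] -= lr * err
    return net


# Hypothetical (amplitude, normalized frequency) pairs, not measured data.
samples = [(0.6, 1.0), (0.8, 1.5), (1.0, 2.0), (1.2, 2.4), (1.4, 3.0)]
net = init_network()
mse_before = mse(net, samples)
train(net, samples)
mse_after = mse(net, samples)
```

After training, querying `forward(net, a)[1]` for amplitudes between the training points is one way to read off additional candidate parameter values, which is the sense in which learning extends the scope of the vibration sets.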
Figures 6, 7, and 8 present the learning of new optimal stage vibration sets for the respective stages.
The results show that the scope of adjusted vibration parameters obtained from autonomous learning is extended with respect to the adjusted vibration sets from the experimental robot assembly. We can see that the critical moment in the assembly process is the second phase, which occupies the medium cluster position of the optimal vibration parameter sets across the stages. The second phase presents a discontinuity between the first and third phases in the cluster space.
The search strategy used in the assembly experiments exploited the technique of blind search for optimal vibration values in repeated trials in each stage. If the selected optimal value lies in the discontinuity area, then the path between one selected optimal stage parameter set and another will be outside the cone (Figure 9).
In this case, tolerance compensation is not achieved, because the position tolerance $D$ of some stage is greater than the admitted position tolerance $D_0$. In order to solve this problem, we introduce the optimal search strategy.

Optimal Search Strategy
Rather than being satisfied with any sequence of actions that leads to the goal set, we would like to propose a solution that optimizes some criterion, such as time, distance, or energy consumed; that is, we consider optimal planning [1].
Consider representing the optimal planning problem in terms of states and state transitions. Let $X$ be a nonempty, finite set of states, which is called the state space. Let $x \in X$ denote a specific state, $x_I$ denote the initial state, and $x_G \in X$ denote a goal state.
For each $x \in X$, let $U(x)$ denote a nonempty, finite set of actions. Let $L$ denote the cost function, which is applied to the sequence $u_1, \ldots, u_K$ of applied actions and the achieved states $x_1, \ldots, x_G$. The task is to find a sequence of actions $u_1, \ldots, u_K$ that minimizes the cost of each segment $L_k$. A path $L$ is defined as a series of linear segments $L_k$ connecting state points $(P_k, P_{k+1})$, $k = 1, \ldots, N$.
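As a worked form of the segment cost, assume that each state point is a vibration parameter pair $P_k = (A_k, f_k)$ and that the cost of a segment is its Euclidean length in the amplitude-frequency plane. Both are assumptions made here for illustration; the text specifies only linear segments and a minimal-distance criterion. Under these assumptions the path cost can be written as:

```latex
L = \sum_{k=1}^{N} L_k, \qquad
L_k = \lVert P_{k+1} - P_k \rVert
    = \sqrt{(A_{k+1} - A_k)^2 + (f_{k+1} - f_k)^2}.
```

Because amplitude and frequency have different units, a practical implementation would likely work with normalized values, as is done for the frequencies in the learning stage.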
In our research, the problem with the search strategy applied in the experiments arose in the case of behaviour switching (assembly jamming). The search strategy tried to continue the assembly process with another optimal, but blindly chosen, parameter state value. With the optimal search strategy, in contrast, we use the transition action with minimal distance between vibration state sets. An algorithm finds the minimal distance vector from the selected optimal value $(A_i, f_i)$, $i = 1, \ldots, N$, in the current extended vibration state $s_k$ gained from the learning process towards the next vibration state $s_{k+1}$. The minimal path between two phases lies inside the cone, and the tolerance is compensated ($D < D_0$); see Figure 10.
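The minimal-distance transition between two vibration state sets can be sketched as a nearest-point lookup. The function name `nearest_in_next_stage`, the Euclidean metric, and the sample values are assumptions for illustration; they are not taken from the experiments.

```python
import math


def nearest_in_next_stage(current, next_stage_set):
    """Return the (A, f) pair in `next_stage_set` closest to `current`."""
    return min(next_stage_set, key=lambda p: math.dist(current, p))


# Hypothetical (amplitude, normalized frequency) values for two stages.
current_value = (1.0, 1.0)
next_stage = [(1.5, 2.0), (1.1, 1.2), (0.2, 3.0)]
target = nearest_in_next_stage(current_value, next_stage)
```

Here the selected target is (1.1, 1.2), the candidate with the shortest segment from the current value, in contrast to a blind choice among all candidates of the next stage.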
In case of jamming (in our simulator, an error event signal), we propose a recovery parameter algorithm with learned optimal values, which offers a new plan for path tracking during the simulation of robot assembly.
We can explain this with the following example. Figure 10 presents the following situation: the system detects an error event during the second state of assembly, and the strategy tries to continue the assembly process with another optimal set value $(A_2, f_2)$ from state $s_2$. This value is the optimal parameter value whose distance is the mean of all distances from state $s_1$ to state $s_2$. In this way we move far enough away from the critical optimal point to another optimal solution (Figure 11). After that, the strategy establishes an action between the values $(A_2, f_2)$ and $(A_3, f_3)$.
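One possible reading of this recovery rule can be sketched as follows: after an error event, the failed value is excluded and the replacement is the candidate whose distance from the previous-stage value is closest to the mean of all distances into the stage. This interpretation, the function name `recovery_value`, and the sample values are assumptions; the text does not give the rule in algorithmic form.

```python
import math


def recovery_value(previous, stage_set, failed):
    """Pick a replacement for `failed` whose distance from `previous` is
    closest to the mean distance from `previous` to all values in `stage_set`.
    """
    candidates = [p for p in stage_set if p != failed]
    if not candidates:
        raise ValueError("no alternative value available in this stage")
    distances = [math.dist(previous, p) for p in stage_set]
    mean_distance = sum(distances) / len(distances)
    return min(candidates, key=lambda p: abs(math.dist(previous, p) - mean_distance))


# Hypothetical values: the distances from `previous` are 1, 2, 3, and 6 (mean 3).
previous = (0.0, 0.0)
stage_two = [(1.0, 0.0), (2.0, 0.0), (3.0, 0.0), (6.0, 0.0)]
replacement = recovery_value(previous, stage_two, failed=(1.0, 0.0))
```

Choosing a value near the mean distance, rather than the nearest remaining one, keeps the new attempt at some offset from the point at which the jamming occurred.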
The optimal cost of each segment is formulated backward from the goal.
To demonstrate the validity of this paradigm, we present test results obtained by implementing a robot assembly agent in Matlab. Some results of using the optimal search strategy are shown in Table 1. In the third case, the agent starts with vibration value (1.36, 1.70). When an error event signal is detected in the second state, the deterministic search strategy tries, instead of the optimal value (1.32, 1.75), to continue the assembly process with another optimal assembly vibration parameter stage set value (1.29, 0.72)*. A new transition action is then made from this new optimal value in the current state, with minimal path distance towards the optimal vibration parameter stage set in the next state, until the agent reaches the final point of the assembly simulation.
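A backward cost-to-go computation of the kind referred to in this section can be sketched with standard backward dynamic programming over the stage sets. This is a generic formulation under assumptions (Euclidean segment costs, zero terminal cost, hypothetical function names and data) and is not claimed to reproduce the authors' exact equations or their Matlab agent.

```python
import math


def backward_cost_to_go(stages):
    """Compute the optimal cost-to-go for every value in every stage.

    `stages` is a list of lists of (A, f) points, one list per reducer stage.
    Returns (cost, best_next), where cost[k][i] is the minimal remaining path
    length from point i of stage k and best_next[k][i] is the index of the
    best successor in stage k + 1.
    """
    n = len(stages)
    cost = [[0.0] * len(s) for s in stages]  # terminal cost assumed to be zero
    best_next = [[None] * len(s) for s in stages]
    for k in range(n - 2, -1, -1):  # sweep backward from the goal stage
        for i, p in enumerate(stages[k]):
            options = [
                (math.dist(p, q) + cost[k + 1][j], j)
                for j, q in enumerate(stages[k + 1])
            ]
            cost[k][i], best_next[k][i] = min(options)
    return cost, best_next


def optimal_path(stages, start_index):
    """Follow the stored successors forward from a start value in stage 0."""
    cost, best_next = backward_cost_to_go(stages)
    path, i = [], start_index
    for k in range(len(stages)):
        path.append(stages[k][i])
        i = best_next[k][i]
    return path, cost[0][start_index]


# Hypothetical three-stage example.
stages = [[(0.0, 0.0)], [(1.0, 0.0), (0.0, 3.0)], [(2.0, 0.0)]]
best_path, best_cost = optimal_path(stages, 0)
```

In this example the path through (1.0, 0.0) has total length 2.0 and is preferred over the path through (0.0, 3.0). After an error event the same tables can be reused from the replacement value, since the cost-to-go is already known for every value in every stage.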

Conclusion
In this paper, the problem of path planning/replanning due to unexpected events during robot assembly is presented. As an example of robot assembly, the complex assembly of a toothed tube over planetary gears was studied. An important contribution of the paper is solving the tolerance compensation problem using a combination of a search strategy and a neural learning approach. Namely, we used an approach with task-dependent knowledge to obtain an efficient strategy for a specific task.
Supervised neural-network-based learning is used to generate a wider scope of vibration state parameters in order to accommodate the uncertainty in the complex assembly of the tube over the planetary gears in case of jamming. The optimal search strategy is used to reach the goal mating point with minimum segment cost, that is, with minimal robot assembly time.
In order to verify this approach, we tested several sets of model data by computer simulation. The results show that this approach satisfactorily solves the complex problem of tolerance compensation under uncertainty. This intelligent path replanner is suitable for a number of forms of robot planning/replanning tasks.

Figure 4: Particular phases of assembly process.
(2) If an action $u$ in $P_a$ fails, we define a replanning area $RA = \{u\}$.
(3) $RA$ is treated as a partial or new plan, and we construct a planning problem $P_b = (S', G)$, where $S'$ is a new start point used by $RA$. $P_b$ is a partial set of state parameters, produced by $RA$ as the effect of the new optimal search strategy.
(4) We search for a plan for $P_b$. $P_b$ replaces $P_a$ in $RA$, and we go to step 5. If a new action in $P_a$ fails, we go to step 2.
(5) Resume the execution of $P_b$.

Figure 7: Results of neural network training for second stage.

Figure 8: Results of neural network training for third stage.

Figure 12: Presentation of the optimal search strategy without error event signals.

Table 1: Examples of using optimal search strategy.