We used evolutionary computation to predict the trajectories of surface drifters. The data used to build the predictive model comprise the hourly positions of the drifters, the flow and wind velocities at those positions, and the positions predicted by the MOHID model. In contrast to existing numerical models, which use the Lagrangian method, we used optimization algorithms to predict the trajectory. Two evaluation measures were used: the Mean Absolute Error (MAE), which gives a better score when the difference between the predicted and actual positions at each time is smaller, and the Normalized Cumulative Lagrangian Separation (NCLS), which is widely used to evaluate drifter trajectories. We tested Differential Evolution (DE), Particle Swarm Optimization (PSO), Covariance Matrix Adaptation Evolution Strategy (CMAES), and ensembles of these methods, and found the DE&PSO ensemble to be the best prediction model. The objective was to find parameters that minimize a fitness function defined as the average difference between the predicted and actual positional changes. With this objective, the model yielded better results than the existing numerical model in three of the four test cases, with average improvements of 19.36% in MAE and 5.96% in NCLS. The model trained with this fitness function thus also improved NCLS, which indicates that NCLS is adequate as an evaluation measure.
The technology for predicting particle trajectories in the ocean can be used in a variety of ways. For example, it can help track objects at sea during a distress situation or an accident from the last observed time and location, and it can predict the paths of icebergs floating at sea. It also makes it possible to trace pollutants after accidents such as the 2010 Deepwater Horizon oil spill in the Gulf of Mexico; as a result, numerous studies have been conducted on the matter [
To predict particle trajectories in the ocean, we used several drifters such as those shown in Figure
Surface drifters [
The remainder of this paper is organized as follows. Section
To design the prediction model in this study, seven drifters were deployed at different locations from November 6 to November 16, 2015, off the coast of Sosan in Korea. The location of each drifter was recorded hourly from the time it was first dropped into the ocean. The period and the number of data points measured are shown in Table
Period and number of datasets measured.
Case  Case  Case  Case  Case  Case  Case
Period  9 d 23 h  1 d 2 h  4 d 13 h  4 d 16 h  4 d 17 h  1 d 15 h  1 d 21 h
Number of data  239  26  109  112  113  39  45
Wind velocity data were obtained from the Korea Meteorological Administration (KMA). The flow velocity was taken from reanalyzed data provided by ARA Consulting & Technology [
Observed trajectories of the seven drifters used to build our predictive models.
The training data for predicting drifter movement have the same attributes as those in Table
Attributes of the training data (examples).
Year  Month  Day  Hour  Observed longitude  Observed latitude  Wind U  Wind V  Flow U  Flow V
2015  11  6  8  125.0796  36.5788  0.1770  −0.0933  −6.2055  −0.1881
2015  11  6  9  125.0731  36.5750  0.1339  −0.0046  −5.6724  0.5072
2015  11  6  10  125.0690  36.5750  0.0909  0.0841  −5.1393  1.2025
2015  11  6  11  125.0633  36.5816  0.0481  0.1726  −4.6124  1.8818
In other words, the location of the drifters should be determined from the wind and flow velocities. However, because velocity is itself a rate of change, the value that can be computed from the wind and flow velocities is the change in position, not the position itself. As the current data contain only the absolute position of the drifter, the observed-location part of the training data should be converted from absolute position to positional change. Therefore, it can be said that the change of the location as shown in Table
Attributes of the transformed training data (examples).
Year  Month  Day  Hour  Observed longitude change  Observed latitude change  Wind U  Wind V  Flow U  Flow V
2015  11  6  9  −0.0065  −0.0039  0.1770  −0.0933  −6.2055  −0.1881
2015  11  6  10  −0.0041  0.0000  0.1339  −0.0046  −5.6724  0.5072
2015  11  6  11  −0.0057  0.0066  0.0909  0.0841  −5.1393  1.2025
2015  11  6  12  −0.0083  0.0106  0.0481  0.1726  −4.6124  1.8818
Each observed location is subtracted from the location in the next row, and the result is the positional change. The value computed from the wind and flow velocities in one row therefore corresponds to the positional change in that row, which is a form suitable for training. Unfortunately, data from other studies predicting trajectory [
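The transformation from absolute positions to positional changes can be sketched as follows. This is a minimal illustration, not the authors' implementation; the function name `to_position_changes` and the tuple layout are assumptions made for the example. The input rows are taken from the example training data, and the two location columns are kept in the order in which they appear there. Because the published positions are already rounded, a recomputed change may differ from the tabulated value in the last digit.

```python
def to_position_changes(rows):
    """Convert absolute hourly positions into per-hour position changes.

    Each row is (hour, x, y, wind_u, wind_v, flow_u, flow_v), where x and y
    are the two observed-location columns in table order (assumed layout).
    """
    changes = []
    for prev, curr in zip(rows, rows[1:]):
        dx = round(curr[1] - prev[1], 4)
        dy = round(curr[2] - prev[2], 4)
        # The wind and flow measured at the earlier hour are paired with
        # the move that ends at the later hour.
        changes.append((curr[0], dx, dy) + tuple(prev[3:]))
    return changes


rows = [
    (8, 125.0796, 36.5788, 0.1770, -0.0933, -6.2055, -0.1881),
    (9, 125.0731, 36.5750, 0.1339, -0.0046, -5.6724, 0.5072),
    (10, 125.0690, 36.5750, 0.0909, 0.0841, -5.1393, 1.2025),
]

for row in to_position_changes(rows):
    print(row)
```

Each output row carries the hour at which the move ends, the two positional changes, and the wind and flow velocities of the preceding hour, which matches the layout of the transformed training data.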
The conventional prediction model with which the results of this experiment are compared is the MOHID (MOdelo HIDrodinâmico) model. MOHID was developed in 1985 at the Marine and Environmental Technology Research Center (MARETEC) of the Instituto Superior Técnico (IST) of the University of Lisbon, Portugal [
Observed drifter trajectory (OBS) and the trajectory predicted by the MOHID model, for comparison with our predictive models (Case
In this study, we formulated an equation that uses wind and flow velocity to predict the trajectory of drifters. In the equation, the wind and flow velocities are the variables, and they are combined with constant parameters to calculate the positional change. Finding these parameters is a real-parameter optimization problem, for which we used evolutionary methods. Specifically, we used Differential Evolution (DE), Particle Swarm Optimization (PSO), and Covariance Matrix Adaptation Evolution Strategy (CMAES), which have been successful on such problems in the past.
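To make the idea of combining velocities with constant parameters concrete, the sketch below uses a purely hypothetical linear form in which each component of the positional change is a weighted sum of the wind and flow components. The actual equation used in this study is not reproduced here; the parameter layout `(a_u, a_v, b_u, b_v)` and the function names are assumptions chosen only for illustration.

```python
def predict_change(params, wind, flow):
    """Hypothetical linear model: change = a * wind + b * flow, per axis."""
    a_u, a_v, b_u, b_v = params
    return (a_u * wind[0] + b_u * flow[0],
            a_v * wind[1] + b_v * flow[1])


def predict_trajectory(params, start, forcing):
    """Accumulate predicted changes from a known start position.

    forcing is a sequence of (wind, flow) pairs, one per time step,
    where wind and flow are (u, v) tuples.
    """
    x, y = start
    path = [(x, y)]
    for wind, flow in forcing:
        dx, dy = predict_change(params, wind, flow)
        x, y = x + dx, y + dy
        path.append((x, y))
    return path


# Two identical time steps with made-up forcing values.
forcing = [((1.0, 2.0), (0.5, -0.5))] * 2
print(predict_trajectory((0.1, 0.1, 1.0, 1.0), (0.0, 0.0), forcing))
```

Under this assumed form, an optimizer only has to search a small real-valued parameter vector, which is the kind of problem DE, PSO, and CMAES are designed for.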
Price et al. [
Kennedy and Eberhart [
Proposed by Hansen et al. [
An ensemble is a technique that generates multiple models and combines the predicted results of each model to generate new results [
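A minimal sketch of such a combination is given below. It assumes that the models are combined in an equal ratio, as stated in the concluding remarks of this paper, and that "combining" means averaging the predicted positions at each time step; the latter is an assumption, and the function name `equal_weight_ensemble` is hypothetical.

```python
def equal_weight_ensemble(trajectories):
    """Average several predicted trajectories point by point.

    trajectories is a list of trajectories of equal length; each trajectory
    is a list of (x, y) positions. Every model receives the same weight.
    """
    if not trajectories:
        raise ValueError("at least one trajectory is required")
    n_steps = len(trajectories[0])
    if any(len(t) != n_steps for t in trajectories):
        raise ValueError("all trajectories must have the same length")
    n_models = len(trajectories)
    combined = []
    for step in range(n_steps):
        x = sum(t[step][0] for t in trajectories) / n_models
        y = sum(t[step][1] for t in trajectories) / n_models
        combined.append((x, y))
    return combined


# Made-up predictions from two models.
model_a = [(0.0, 0.0), (1.0, 1.0)]
model_b = [(0.0, 0.0), (3.0, 2.0)]
print(equal_weight_ensemble([model_a, model_b]))
```

A weighted variant would replace the plain mean with a weighted mean, one weight per model.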
The above algorithms are suitable for real number parameter optimization problems and can all have a fitness function. The attributes that can be used as parameters in Table
All experiments were tested through cross validation. Cross validation is a process in which all of the data are divided into
Number of tuples on the training data.
Data  Case  Case  Case  Case  Case  Case  Case
The number of tuples  238  25  108  111  112  38  44
Cases
Each evolutionary method has parameters that can be set to suit the experimental situation. In this experiment, we searched for the most suitable parameters of each method and then identified the best method by comparing the representative results of each. The best fitness and the average fitness of the population at each generation are also plotted, so that the performance is shown graphically. The computer used in this experiment had an Intel i7 7700k (4.2 GHz) CPU, and the evolutionary methods were implemented in the C language.
As shown in Figure
Longitude error and latitude error of predicted location.
This can be expressed as in
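A mean-absolute-error measure of this kind can be sketched as follows. The sketch assumes that the absolute errors of the two coordinates are averaged over both coordinates and over all time steps; this normalization is an assumption, and the function name is hypothetical.

```python
def mean_absolute_error(predicted, observed):
    """Mean of the absolute per-coordinate errors over all time steps.

    predicted and observed are equal-length lists of (x, y) positions.
    """
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for (px, py), (ox, oy) in zip(predicted, observed):
        total += abs(px - ox) + abs(py - oy)
    # Two coordinates per time step contribute to the sum.
    return total / (2 * len(predicted))


# Made-up positions for illustration.
print(mean_absolute_error([(1.0, 1.0), (2.0, 2.0)], [(0.0, 0.0), (2.0, 0.0)]))
```

A smaller value means that the predicted positions are closer to the observed ones, so an optimizer would minimize this quantity.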
The principle of operation of the algorithm is almost the same as that of Section
Error defined by Euclidean distance between the predicted location and the observed one.
This can be expressed by
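A Euclidean-distance error of this kind can be sketched as follows, assuming that the straight-line distance between the predicted and observed positions is averaged over all time steps and that distances are measured directly in coordinate units. Both points are assumptions, and the function name is hypothetical.

```python
import math


def mean_euclidean_error(predicted, observed):
    """Mean straight-line distance between predicted and observed positions."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    distances = [math.dist(p, o) for p, o in zip(predicted, observed)]
    return sum(distances) / len(distances)


# Made-up positions for illustration.
print(mean_euclidean_error([(3.0, 4.0), (0.0, 0.0)], [(0.0, 0.0), (0.0, 0.0)]))
```

If the MAE averages the absolute errors over both coordinates, this distance can never be smaller than the MAE, which agrees with the ordering of the MAE and Euclid values reported in the result tables.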
NCLS, also called skill score, was developed by Liu and Weisberg [
Formula for calculating skill score (ss) of NCLS.
The tolerance threshold is a number that prevents the skill score from being low when
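The skill score can be sketched as follows, using the commonly cited form of Liu and Weisberg's measure: the cumulative separation between predicted and observed positions is divided by the cumulative length of the observed trajectory, and the result is scaled by the tolerance threshold. The sketch measures distances directly in coordinate units, which is a simplifying assumption. The original definition sets the score to zero when the normalized separation exceeds the threshold; because negative scores appear in the result tables of this paper, clipping is left optional here (`clip=False` by default), which is also an assumption.

```python
import math


def ncls_skill_score(observed, predicted, tolerance=1.0, clip=False):
    """Skill score ss = 1 - s / tolerance, with s = sum(d_i) / sum(l_i).

    d_i is the separation between predicted and observed positions at step i;
    l_i is the length of the observed trajectory from the start up to step i.
    Both trajectories share the same start point at index 0.
    """
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need two equal-length trajectories of >= 2 points")
    separation_sum = 0.0
    length_sum = 0.0
    travelled = 0.0
    for i in range(1, len(observed)):
        travelled += math.dist(observed[i - 1], observed[i])
        length_sum += travelled
        separation_sum += math.dist(observed[i], predicted[i])
    if length_sum == 0.0:
        raise ValueError("observed trajectory does not move")
    score = 1.0 - (separation_sum / length_sum) / tolerance
    return max(score, 0.0) if clip else score


# Made-up trajectories for illustration.
obs = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
good = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
poor = [(0.0, 0.0), (1.0, 3.0), (2.0, 6.0)]
print(ncls_skill_score(obs, good), ncls_skill_score(obs, poor))
```

A perfect prediction gives a score of 1, and the score decreases as the predicted trajectory drifts away from the observed one.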
To utilize DE, a program distributed by Storn [
Strategies of DE.
Exponential  DE/best/1  DE/rand/1  DE/randtobest/1  DE/rand/2  DE/best/2
Binomial  DE/best/1  DE/rand/1  DE/randtobest/1  DE/rand/2  DE/best/2
As shown in the table, “Exponential” and “Binomial” can be selected. As there is no difference between the two methods, “Exponential” was tested; the results obtained are shown in Table
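For readers unfamiliar with the strategy names, the following is a minimal sketch of DE/rand/1 with both crossover types. It is not the distributed program used in this study; the function names, the population size, and the values of the scale factor `f` and crossover rate `cr` are assumptions, and the sphere function stands in for the real fitness function. In the binomial variant each component is taken from the mutant independently, whereas in the exponential variant a run of consecutive components is taken.

```python
import random


def de_rand_1(fitness, bounds, crossover="exp", pop_size=30,
              generations=200, f=0.5, cr=0.9, seed=0):
    """Minimize fitness with DE/rand/1 and "exp" or "bin" crossover."""
    rng = random.Random(seed)
    dim = len(bounds)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    scores = [fitness(x) for x in pop]
    for _ in range(generations):
        next_pop, next_scores = [], []
        for i in range(pop_size):
            # Mutation: base vector plus one scaled difference vector.
            others = [k for k in range(pop_size) if k != i]
            a, b, c = rng.sample(others, 3)
            mutant = [pop[a][j] + f * (pop[b][j] - pop[c][j])
                      for j in range(dim)]
            trial = list(pop[i])
            start = rng.randrange(dim)
            if crossover == "bin":
                for j in range(dim):
                    if j == start or rng.random() < cr:
                        trial[j] = mutant[j]
            elif crossover == "exp":
                j, copied = start, 0
                while True:
                    trial[j] = mutant[j]
                    j = (j + 1) % dim
                    copied += 1
                    if copied >= dim or rng.random() >= cr:
                        break
            else:
                raise ValueError("crossover must be 'exp' or 'bin'")
            # Selection: keep the trial only if it is not worse.
            trial_score = fitness(trial)
            if trial_score <= scores[i]:
                next_pop.append(trial)
                next_scores.append(trial_score)
            else:
                next_pop.append(pop[i])
                next_scores.append(scores[i])
        pop, scores = next_pop, next_scores
    best = min(range(pop_size), key=scores.__getitem__)
    return pop[best], scores[best]


def sphere(x):
    """Toy fitness function with its minimum of 0 at the origin."""
    return sum(v * v for v in x)


for mode in ("exp", "bin"):
    print(mode, de_rand_1(sphere, [(-5.0, 5.0)] * 3, crossover=mode)[1])
```

The "best" strategies differ only in using the current best individual as the base vector, and the "/2" strategies add a second difference vector.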
Results on parameters of DE.
Strategy  Evaluation  Case 1  Case 5  Case 6  Case 7
DE/best/1  MAE  0.1351  0.0761  0.0494  0.1308
DE/best/1  Euclid  0.1772  0.1218  0.0840  0.1931
DE/best/1  NCLS  0.8879  0.8153  0.8895  0.7627
…  MAE  0.0920  …  …  …
…  Euclid  0.15402  …  …  …
…  NCLS  0.9025  …  …  …
…  MAE  0.0920  …  …  …
…  Euclid  0.1540  …  …  …
…  NCLS  0.9025  …  …  …
DE/best/2  MAE  …  0.0842  0.0825  0.0966
DE/best/2  Euclid  …  0.1364  0.1259  0.1400
DE/best/2  NCLS  …  0.7854  0.8344  0.8280
DE/rand/2  MAE  17.0368  9.6299  19.0175  2.8862
DE/rand/2  Euclid  27.0270  14.6791  28.9367  4.2267
DE/rand/2  NCLS  −16.1006  −21.0442  −37.1143  −4.1928
Each value is the evaluation score of the best individual found over 20,000 generations. “MAE” means the average error value from (
Line graph showing prediction errors over the number of generations in DE on the four test cases (one panel per case).
The average fitness and best fitness of each population were not significantly different, and it was found that they converged relatively quickly. Table
Computing time of DE.
Data  Case  Case  Case  Case
Computing time (CPU seconds)  19.4  23.8  23.9  24.3
A program distributed by Kyriakos was used for PSO [
Results on parameters of PSO.
Inertia  Evaluation  Case 1  Case 5  Case 6  Case 7
…  MAE  …  …  …  …
…  Euclid  …  …  …  …
…  NCLS  …  …  …  …
…  MAE  0.0914  0.0822  0.0386  0.0643
…  Euclid  0.1529  0.1331  0.0586  0.0942
…  NCLS  0.9033  0.7982  0.9230  0.8843
…  MAE  0.0916  0.0822  0.0386  0.0643
…  Euclid  0.1531  0.1332  0.0585  0.0942
…  NCLS  0.9032  0.7981  0.9230  0.8843
…  MAE  0.0917  0.0823  0.0385  0.0643
…  Euclid  0.1534  0.1332  0.0583  0.0943
…  NCLS  0.9030  0.7980  0.9233  0.8842
The results did not differ significantly across the tested inertia values, although larger inertia values gave slightly worse results. We therefore chose an inertia of 0.3, which showed the best performance. Figure
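The role of the inertia parameter can be seen in a minimal sketch of the standard PSO update with an inertia weight. This is not the distributed program used in this study; the function names, the swarm size, and the acceleration coefficients `c1` and `c2` are assumptions, and only the inertia value of 0.3 comes from the experiment above. The sphere function stands in for the real fitness function.

```python
import random


def update_velocity(v, x, pbest, gbest, inertia, c1, c2, r1, r2):
    """One-dimensional velocity update: the inertia scales the old velocity."""
    return inertia * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)


def pso_minimize(fitness, bounds, n_particles=20, iterations=100,
                 inertia=0.3, c1=1.5, c2=1.5, seed=0):
    """Minimize fitness; returns (best position, best score, score history)."""
    rng = random.Random(seed)
    dim = len(bounds)
    xs = [[rng.uniform(lo, hi) for lo, hi in bounds]
          for _ in range(n_particles)]
    vs = [[0.0] * dim for _ in range(n_particles)]
    pbest = [list(x) for x in xs]
    pbest_score = [fitness(x) for x in xs]
    g = min(range(n_particles), key=pbest_score.__getitem__)
    gbest, gbest_score = list(pbest[g]), pbest_score[g]
    history = [gbest_score]
    for _ in range(iterations):
        for i in range(n_particles):
            for j in range(dim):
                vs[i][j] = update_velocity(
                    vs[i][j], xs[i][j], pbest[i][j], gbest[j],
                    inertia, c1, c2, rng.random(), rng.random())
                xs[i][j] += vs[i][j]
            score = fitness(xs[i])
            if score < pbest_score[i]:
                pbest[i], pbest_score[i] = list(xs[i]), score
                if score < gbest_score:
                    gbest, gbest_score = list(xs[i]), score
        history.append(gbest_score)
    return gbest, gbest_score, history


def sphere(x):
    """Toy fitness function with its minimum of 0 at the origin."""
    return sum(v * v for v in x)


best, best_score, history = pso_minimize(sphere, [(-5.0, 5.0)] * 2)
print(best_score)
```

A small inertia damps the previous velocity quickly, so the particles are pulled toward the best known positions sooner.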
Line graph showing prediction errors over the number of generations in PSO on the four test cases (one panel per case).
The initial error was very high compared to DE, but, after 100 generations, it was close to zero. Therefore, PSO can also search the parameter space sufficiently. The execution speed was faster than DE. Table
Computing time of PSO.
Data  Case  Case  Case  Case
Computing time (CPU seconds)  5.3  6.6  6.7  7.0
In the CMAES experiment, a source code developed directly by Hansen was used [
Results on parameters of CMAES.
Weight  …  Evaluation  Case 1  Case 5  Case 6  Case 7
Log  …  MAE  …  …  …  …
Log  …  Euclid  …  …  …  …
Log  …  NCLS  …  …  …  …
Log  200  MAE  0.1303  0.1393  0.0762  0.1304
Log  200  Euclid  0.2165  0.2300  0.1190  0.2199
Log  200  NCLS  0.8631  0.6511  0.8438  0.7299
Linear  100  MAE  0.1303  0.1393  0.0761  0.1304
Linear  100  Euclid  0.2165  0.2300  0.1189  0.2200
Linear  100  NCLS  0.8631  0.6510  0.8438  0.7299
Linear  200  MAE  0.1303  0.1393  0.0760  0.1304
Linear  200  Euclid  0.2165  0.2300  0.1188  0.2199
Linear  200  NCLS  0.8631  0.6510  0.8435  0.7299
The default values (
Line graph showing prediction errors over the number of generations in CMAES on the four test cases (one panel per case).
The fewer the iterations, the faster the execution. Table
Computing time of CMAES.
Data  Case  Case  Case  Case
Computing time (CPU seconds)  3.5  3.2  2.9  2.8
Four ensembles were assembled from DE, PSO, and CMAES (DE&PSO, PSO&CMAES, CMAES&DE, and DE&PSO&CMAES), as described in Section
The data can be visualized as shown in Figure
Bar graph of MAE and Euclidean distance for all the tested predictive models on four test cases. The lower the values, the better. Panels: MAE; Euclidean distance.
Compared with the existing MOHID model, the performance improvement can be examined through
In terms of MAE, Ensemble (DE&PSO) performed best, improving on MOHID by 19.36% on average. In terms of Euclidean distance, Ensemble (DE&PSO) also performed best, with an average improvement of 18.71% over MOHID. For NCLS, also called the skill score, larger values are better. The results are shown in Figure
Bar graph of NCLS skill score for all the tested predictive models on four test cases. The higher the values, the better.
Compared with the existing MOHID model, the performance improvement value can be identified through
The definitions of the variables are the same as in (
CMAES exhibited the worst performance, whereas the Ensemble (DE&PSO), which performed well in MAE and Euclidean distance, also showed the best performance in NCLS, at 5.96% better than the MOHID model. Tables
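The improvement percentages can be reproduced from the reported scores with the relative-change formulas sketched below. The formulas are inferred from the tabulated values rather than quoted from the paper, and the function names are hypothetical: for error measures (MAE, Euclidean distance) the improvement is the relative reduction with respect to MOHID, and for NCLS it is the relative increase. The inputs are Case 1 values from the results of all tested methods.

```python
def improvement_lower_is_better(baseline, model):
    """Percentage improvement for error measures such as MAE or Euclid."""
    return (baseline - model) / baseline * 100.0


def improvement_higher_is_better(baseline, model):
    """Percentage improvement for scores such as NCLS."""
    return (model - baseline) / baseline * 100.0


# Case 1: MOHID MAE 0.1352 vs. DE&PSO MAE 0.0900.
mae_gain = improvement_lower_is_better(0.1352, 0.0900)
# Case 1: MOHID NCLS 0.8633 vs. PSO NCLS 0.9044.
ncls_gain = improvement_higher_is_better(0.8633, 0.9044)
print(round(mae_gain, 2), round(ncls_gain, 2))
```

Rounded to two decimals, these give 33.43 and 4.76, which match the corresponding entries in the MAE and NCLS improvement tables.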
Results on parameters of all the tested methods.
Method  Evaluation  Case 1  Case 5  Case 6  Case 7
DE, PSO  MAE  0.0900  …  0.0334  0.0611
DE, PSO  Euclid  …  …  0.0514  0.0896
DE, PSO  NCLS  …  …  0.9324  0.8900
PSO, CMAES  MAE  0.0962  0.1051  0.0391  0.0851
PSO, CMAES  Euclid  0.1656  0.1754  0.0587  0.1431
PSO, CMAES  NCLS  0.8963  0.7338  0.9228  0.8242
CMAES, DE  MAE  0.0962  0.1048  0.0386  0.0867
CMAES, DE  Euclid  0.1656  0.1756  0.0580  0.1459
CMAES, DE  NCLS  0.8953  0.7334  0.9237  0.8207
DE, PSO, …  MAE  …  0.0943  …  0.0751
DE, PSO, …  Euclid  0.1537  0.1593  …  0.1224
DE, PSO, …  NCLS  0.9028  0.7582  …  0.8495
DE (rand/1)  MAE  0.0920  0.0828  0.0392  0.0653
DE (rand/1)  Euclid  0.1541  0.1342  0.0593  0.0956
DE (rand/1)  NCLS  0.9026  0.7965  0.9220  0.8826
PSO (Inertia = 0.3)  MAE  0.0907  0.0820  0.0385  0.0622
PSO (Inertia = 0.3)  Euclid  0.1512  0.1330  0.0584  0.0913
PSO (Inertia = 0.3)  NCLS  0.9044  0.7984  0.9232  0.8878
CMAES (weight = log, …)  MAE  0.1303  0.1393  0.0761  0.1304
CMAES (weight = log, …)  Euclid  0.2165  0.2300  0.1189  0.2200
CMAES (weight = log, …)  NCLS  0.8631  0.6513  0.8437  0.7297
MOHID model [  MAE  0.1352  0.1238  0.0656  …
MOHID model [  Euclid  0.2161  0.1890  0.1155  …
MOHID model [  NCLS  0.8633  0.7134  0.8480  …
Performance improvement with respect to MAE (%).
MAE  Case  Case  Case  Case  Average
DE&PSO  33.43  …  49.09  …  …
PSO&CMAES  28.85  15.11  40.40  …  −2.93
CMAES&DE  28.85  15.35  41.16  …  −3.60
DE&PSO&CMAES  …  23.83  …  …  11.12
DE  31.95  33.12  40.24  …  13.71
PSO  32.91  33.76  41.31  …  16.17
CMAES  3.62  …  −16.01  …  …
Performance improvement with respect to Euclidean distance (%).
Euclidean distance  Case  Case  Case  Case  Average
DE&PSO  …  …  55.50  …  …
PSO&CMAES  23.37  7.20  49.18  …  …
CMAES&DE  23.37  7.09  49.78  …  …
DE&PSO&CMAES  28.88  15.71  …  …  3.68
DE  28.69  28.99  48.66  …  13.53
PSO  30.03  29.63  49.44  …  15.93
CMAES  …  …  …  …  …
Performance improvement with respect to NCLS (%).
NCLS  Case  Case  Case  Case  Average
DE&PSO  …  …  9.95  …  …
PSO&CMAES  3.82  2.86  8.82  …  1.20
CMAES&DE  3.71  2.80  8.93  …  1.09
DE&PSO&CMAES  4.58  6.28  …  …  3.64
DE  4.55  11.65  8.73  …  5.14
PSO  4.76  11.91  8.87  …  5.44
CMAES  …  …  …  …  …
The trajectory (DE&PSO) predicted by the DE&PSO ensemble showing the actual movement path (OBS) in Cases
Comparison of the trajectory predicted by our DE&PSO ensemble, the trajectory predicted by the existing numerical model (MOHID), and the observed trajectory for the four major drifters (one panel per case).
In this paper, we proposed a novel method for predicting drifter trajectories using evolutionary computation. The study is significant in that it is the first to perform parameter optimization using evolutionary computation in predicting particle trajectories in the ocean. In three of the four test cases, the predicted trajectory was more accurate than that of the existing MOHID model. The fitness function of the evolutionary computation was set as the difference between the observed positional change and the positional change predicted from the flow velocity and wind speed. On average, the resulting model improved on MOHID by 19.36% in MAE and 5.96% in NCLS. This indicates that the fitness function can be used to increase the NCLS score. In the future, we plan to use machine learning techniques instead of evolutionary methods, along with more data. Furthermore, in the current ensemble, all algorithms are combined in an equal ratio. We plan to use the weighted voting ensemble [
The authors declare that they have no conflicts of interest.
This research was supported by a grant [KCG01201705] through the Disaster and Safety Management Institute funded by Korea Coast Guard of Korean government. The authors would like to thank Mr. DoYoun Kim, a director in ARA Consulting & Technology, for providing the drifter data.