High frequency (HF) radar installations are becoming essential components of operational real-time marine monitoring systems. The underlying technology is being further enhanced to fully exploit the potential of mapping sea surface currents and wave fields over wide areas with high spatial and temporal resolution, even in adverse meteo-marine conditions. Data applications are extending to many different sectors, reaching beyond research and monitoring to downstream services in support of key national and regional stakeholders. In the CALYPSO project, the HF radar system composed of CODAR SeaSonde stations installed in the Malta Channel specifically assists in the response to marine oil spills and supports search and rescue at sea. One key drawback is the sporadic inconsistency in the spatial coverage of radar data, which is dictated by the sea state as well as by interference from unknown sources that may be competing with transmissions in the same frequency band. This work investigates the use of Machine Learning techniques to fill in missing data in a high resolution grid. Past radar data and wind vectors obtained from satellites are used to predict missing information and provide a more consistent dataset.
The risk of oil from marine spillages beaching on shores, hitting important economic resources, and causing irreversible environmental damage is a very real threat in the Malta Channel, the stretch of sea between Malta and Sicily. In a small island state like Malta, where economic assets are concentrated in space, the damage would be even more devastating. Moreover, this region is situated along the main shipping lanes of the Mediterranean Sea.
Risks can be greatly reduced by using the best tools for surveillance and operational monitoring against pollution threats, together with a capacity to respond with informed decisions in case of emergency. In the CALYPSO project, top-end technology consisting of an array of HF radars was installed to monitor meteo-marine surface conditions in the Malta Channel in real time. The collected measurements continuously provide accurate information to monitor and respond effectively to threats from oil spills. Observed sea surface currents can be coupled with Lagrangian particle models to compute the hindcast trajectory of any detected spill. If such information is coupled with historic data from Vehicle Tracking Systems (VTS), marine vessels intersecting the predicted spill movement can be identified and the source of the pollution may be determined. Moreover, data from the HF radar provides an avenue for a wider range of applications including search and rescue and safer navigation.
While the 13.5 MHz radar frequency used in CALYPSO provides good spatial coverage and resolution over the required domain, considerable interference with the radar signals, noted in the area especially in the early afternoon, results in a significant loss of spatial coverage. In this work, an intelligent gap filling technique that makes use of past sea currents as well as wind measurements recorded by satellite is proposed. A mesh of neural networks is trained, one at each grid cell, to model the circulation patterns from recent observations. The zonal (
McCulloch and Pitts introduced neural networks to the field of artificial intelligence around 1943, when they modelled the switching activity of neurons in the human brain [
The performance of the proposed gap filling technique is compared to an existing method which computes missing values by a Data Interpolating Empirical Orthogonal Function (DINEOF) algorithm [
In the following section, the CALYPSO and CALYPSO Follow On projects as well as the outcomes and deliverables are described briefly. In Section
CALYPSO was a two-year project partly financed by the EU under the Operational Programme Italia-Malta 2007–2013 [
The HF radar data are primarily intended to support applications and optimise intervention in case of oil spill response, as well as to support tools for search and rescue (SAR), maritime security, safer navigation, improved meteo-marine forecasts, monitoring of sea conditions in critical areas such as the proximity of ports, and better management of the marine space between Malta and Sicily. A key service consists of direct access to the HF radar data by the Armed Forces of Malta (AFM) through the Search and Rescue Optimal Planning System (SAROPS). Based on the US Coast Guard model, this software is used to support SAR missions. In case of an accident, past and real-time met-ocean data are automatically obtained from the Environmental Data Server (EDS), which is linked to the CALYPSO system. SAROPS can then utilise the high resolution local data to identify the “most likely” location of missing persons or drifting objects based on drift models. The search pattern, probability of success, and probability of containment are computed and given to the authorities [
The CALYPSO project also served to build capacity in the monitoring of the coastal seas and adjoining resources. The measured data are providing new insights into the dynamics of the sea in this part of the Mediterranean, leading to research efforts related to improved forecasting of the marine environment, protection from oil spills, search and rescue, and fisheries.
The CALYPSO Follow On project was a six-month extension project that improved on the achievements of the original project. After its completion, a more robust HF radar monitoring system was established and downstream services were delivered to targeted users, including the launch of a smartphone application for use by mariners. The location of the radial sites as well as the data recorded on 17/06/2016 at 00:00 is presented in Figure
Data from the CALYPSO HF Radar Network for the 27/04/2016 at 03:00 UTC.
Validation of the observed HF radar currents was done through 27 Surface Velocity Program (SVP) drifters that were released in five different deployments along a chosen transect in the Malta Channel. The Iridium satellite constellation was used to track the position of each buoy with a temporal frequency of one hour. Apart from the geographical coordinates with an accuracy of about 10 m, the transmitted data included battery level, sea surface temperature, and an indication of the presence of the underwater drogue that reduces the wind influence on the followed path. By using consecutive points, the zonal and meridional velocity components at each transmitted location were calculated and compared to the remotely sensed currents recorded by the radar network. Accuracy in the surface layer was found to be between 1 cm/s and 3 cm/s [
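The drifter velocity calculation described above can be illustrated with a minimal sketch. It assumes a local flat-Earth approximation between two consecutive hourly fixes, which is a simplification chosen here for brevity; the function name `drifter_velocity` and the Earth radius constant are illustrative and are not taken from the project.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius (assumed value)

def drifter_velocity(lon1, lat1, lon2, lat2, dt_seconds=3600.0):
    """Return (u, v) in cm/s from two consecutive fixes given in degrees.

    u is the zonal (eastward) and v the meridional (northward) component.
    A local flat-Earth approximation is used, which is reasonable for the
    short displacements covered between hourly transmissions.
    """
    mean_lat = math.radians(0.5 * (lat1 + lat2))
    # Eastward and northward displacements in metres.
    dx = EARTH_RADIUS_M * math.cos(mean_lat) * math.radians(lon2 - lon1)
    dy = EARTH_RADIUS_M * math.radians(lat2 - lat1)
    # Convert m/s to cm/s to match the units used for the radar currents.
    return 100.0 * dx / dt_seconds, 100.0 * dy / dt_seconds

# Hypothetical pair of fixes one hour apart.
u, v = drifter_velocity(14.3600, 36.0600, 14.3650, 36.0620)
```

The resulting pair can then be compared against the radar current at the nearest grid node and timestamp.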
After the installation and calibration of the first radars, strong interference at the same frequency was noted. Spectral analysis revealed external transmissions at 13.5 MHz which are active every day in the afternoon. According to Resolution 612 of the International Telecommunication Union (ITU), this band can be used by oceanographic radars on a secondary basis and hence other transmissions are allowed. The operation of the CALYPSO HF Radar Network involves weak signals of about 40 W. Scattered waves from long range cells can therefore be corrupted and cannot be correctly interpreted by the radar. Figures
Normal radial coverage at Ta’ Barkat on 11/01/2014 at 00:00 UTC (b) and reduced coverage due to external interference on 28/09/2013 at 20:00 UTC (a).
Typical spectra at Ta’ Barkat on 11/01/2014 at 00:00 UTC (a) and noisy spectra due to external interference on 28/09/2013 at 20:00 UTC (b).
Instances of typical (a) and reduced (b) spatial data coverage by the CALYPSO HF Radar Network on the 11/01/2014 at 00:00 UTC and on 28/09/2013 at 20:00 UTC, respectively.
Such data gaps in both space and time severely restrict the quality of the service provided to users. HF radar data streams therefore need to be processed to fill in the gaps with reliable estimates. An off-the-shelf interpolation technique was initially applied using the DINEOF algorithm made available by the GeoHydrodynamics and Environment Research (GHER) lab [
Examples of observed HF radar sea surface current vector fields (black) overlaid on the results by the DINEOF gap filling technique (red).
Inconsistencies between the radar data (black) and the interpolated vectors using the DINEOF technique (red).
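To make the idea behind EOF-based gap filling concrete, the following is a simplified sketch and not the GHER implementation. It assumes a time-by-space matrix with NaN marking the gaps, removes a single global mean, and keeps a fixed number of modes, whereas DINEOF itself selects the number of modes by cross-validation. The function name `eof_gap_fill` and all parameter values are illustrative assumptions.

```python
import numpy as np

def eof_gap_fill(field, n_modes=2, n_iter=200, tol=1e-8):
    """Fill NaN gaps in a (time x space) matrix by iterated truncated SVD."""
    missing = np.isnan(field)
    mean = np.nanmean(field)
    # First guess: gaps are set to a zero anomaly about the mean.
    filled = np.where(missing, 0.0, field - mean)
    for _ in range(n_iter):
        u, s, vt = np.linalg.svd(filled, full_matrices=False)
        # Reconstruction from the leading modes only.
        recon = (u[:, :n_modes] * s[:n_modes]) @ vt[:n_modes]
        change = np.max(np.abs(recon[missing] - filled[missing])) if missing.any() else 0.0
        filled[missing] = recon[missing]  # observed values are never altered
        if change < tol:
            break
    return filled + mean

# Synthetic example: a smooth field with a rectangular gap.
t = np.linspace(0.0, 2.0 * np.pi, 24)
truth = np.outer(np.sin(t), np.linspace(1.0, 2.0, 10))
gappy = truth.copy()
gappy[5:8, 2:4] = np.nan
filled = eof_gap_fill(gappy, n_modes=2)
```

Because the reconstruction relies on spatial modes computed over the whole domain, its quality depends on how much of the field is observed at each time step.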
An alternative data filling method based on Machine Learning techniques was then assessed. Neural networks provide a good basis for generating missing meridional and zonal sea current components through a learning process that makes use of previously observed HF radar and wind fields. Such networks connect a number of elements (neurons) in a structure that takes a set of inputs and produces a single real number. The learning algorithm determines the numeric weights to apply to the connections between these neurons to obtain the desired output. One main advantage of this technique is that it can produce good results even when it is supplied with noisy and incomplete data.
Missing current vectors at a particular time are generated by processing the HF radar observations from the few hours preceding the gap. Sea currents in the Malta Channel are the expression of a number of factors influencing the motion of the water at different temporal and spatial scales. The general circulation is dictated by the slow basin scale (vertical) thermohaline structure of the Mediterranean and exhibits known seasonal characteristics. The spatial scale of these circulation patterns is captured at the level of the full HF radar domain. However, the circulation is also modified by strong mesoscale signals in the form of eddy, meander, and filament patterns. These mesoscale processes are triggered by the synoptic scale atmospheric forcing. The heat and momentum fluxes at the air-sea interface represent the dominant factor in the mixing and preconditioning of the surface of the Atlantic Water that crosses the Malta-Sicily Channel on its way to the Eastern Mediterranean [
The preference for wind velocity over wind stress rests on the linear relationship between currents and friction velocity evidenced in elaborate models of wind driven currents [
Satellite wind data (Level 4) were acquired for the dataset period from the Copernicus Marine Environment Monitoring Service [
Original IFREMER CERSAT global blended mean wind fields (blue) and upsampled (red) satellite wind data. The arrow scales of the two datasets are not the same and have been set for better visualisation of the overall wind pattern circulation.
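One common way to upsample a coarse wind field onto a finer grid is bilinear interpolation, sketched below for a single wind component. The method actually used for the upsampling is not assumed to be this one; the function name `bilinear`, the 0.25 degree spacing of the coarse grid, and the sample values are all illustrative assumptions.

```python
def bilinear(lon, lat, lon0, lat0, step, grid):
    """Bilinearly interpolate grid[j][i] at (lon, lat).

    Row j indexes latitude and column i indexes longitude, both starting
    at (lon0, lat0) and increasing by `step` degrees. The point is assumed
    to lie inside the grid.
    """
    x = (lon - lon0) / step
    y = (lat - lat0) / step
    # Index of the south-west corner of the enclosing cell, clamped so that
    # points on the outer edge still use a valid cell.
    i = max(0, min(int(x), len(grid[0]) - 2))
    j = max(0, min(int(y), len(grid) - 2))
    fx, fy = x - i, y - j
    south = grid[j][i] * (1 - fx) + grid[j][i + 1] * fx
    north = grid[j + 1][i] * (1 - fx) + grid[j + 1][i + 1] * fx
    return south * (1 - fy) + north * fy

# Hypothetical 2 x 2 patch of one wind component on a 0.25 degree grid.
coarse_u = [[2.0, 3.0], [4.0, 5.0]]
value = bilinear(14.10, 36.05, 14.0, 36.0, 0.25, coarse_u)
```

Evaluating such a function at every node of the 0.04 degree radar grid would give a wind value co-located with each current vector.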
A dataset was initially created and provided as a training set to a number of Artificial Neural Networks (ANNs) to learn the local patterns. Since the gap pattern in the data is not constant, each grid point is treated separately. Different ANNs are defined and trained to predict the values for each cell without requiring data from adjacent points. This ensures that sea surface current values can always be computed irrespective of how large or long the HF radar data gaps are in space and time. Hourly data collected between 01/01/2013 at 00:00 UTC and 31/01/2015 at 23:59 UTC were considered. This resulted in a dataset of 18,264 raster sets collected over 25 months. A high resolution regular grid of 736 nodes with a spatial resolution of 0.04 degrees was defined over the Malta Channel. The domain extended between 13.6376°E and 15.3981°E in longitude and between 35.7263°N and 37.0192°N in latitude. Of the labelled datasets, 70% were used to build the models. In each iteration, a further 15% were used to validate the model and check for convergence. The remaining 15% were held out to quantify the accuracy of the system before processing the data gaps. Since real radar observations were available for this labelled set of vectors, the behaviour and accuracy of the model could be tested on unseen data.
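The figures quoted above can be checked with a short calculation. The sketch below counts the hourly rasters in the stated period and derives nominal sizes for a 70/15/15 partition; the variable names are illustrative, and rounding to whole samples is an assumption. How the samples were assigned to each subset (for example randomly or chronologically) is not specified here and is not modelled.

```python
from datetime import datetime, timedelta

start = datetime(2013, 1, 1, 0, 0)
end = datetime(2015, 2, 1, 0, 0)  # exclusive bound, one hour after 31/01/2015 23:00

# One raster set per hour over the whole period.
n_rasters = int((end - start) / timedelta(hours=1))

# Nominal subset sizes for a 70% / 15% / 15% partition.
n_train = round(0.70 * n_rasters)
n_val = round(0.15 * n_rasters)
n_test = n_rasters - n_train - n_val

print(n_rasters, n_train, n_val, n_test)
```

These are upper bounds per grid cell, since training vectors containing a missing value are discarded before training.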
To predict a current component at a particular point in time, the wind and current data for the past six hours were used. Table
Sample training dataset used for supervised learning of
Wind 1 | Wind 2 | Wind 3 | Wind 4 | Wind 5 | Wind 6 | Current 1 | Current 2 | Current 3 | Current 4 | Current 5 | Current 6 | Target current |
---|---|---|---|---|---|---|---|---|---|---|---|---|
4.17 | 2.42 | 2.77 | 3.12 | 3.47 | 3.82 | −29.77 | −34.68 | −31.94 | −31.61 | −33.78 | −31.75 | −27.25 |
2.42 | 2.77 | 3.12 | 3.47 | 3.82 | 4.17 | −34.68 | −31.94 | −31.61 | −33.78 | −31.75 | −27.25 | −25.12 |
2.77 | 3.12 | 3.47 | 3.82 | 4.17 | 4.14 | −31.94 | −31.61 | −33.78 | −31.75 | −27.25 | −25.12 | −27.22 |
3.12 | 3.47 | 3.82 | 4.17 | 4.14 | 4.12 | −31.61 | −33.78 | −31.75 | −27.25 | −25.12 | −27.22 | −24.96 |
3.47 | 3.82 | 4.17 | 4.14 | 4.12 | 4.10 | −33.78 | −31.75 | −27.25 | −25.12 | −27.22 | −24.96 | −26.17 |
3.82 | 4.17 | 4.14 | 4.12 | 4.10 | 4.07 | −31.75 | −27.25 | −25.12 | −27.22 | −24.96 | −26.17 | −22.13 |
4.17 | 4.14 | 4.12 | 4.10 | 4.07 | 4.05 | −27.25 | −25.12 | −27.22 | −24.96 | −26.17 | −22.13 | −20.87 |
4.14 | 4.12 | 4.10 | 4.07 | 4.05 | 4.03 | −25.12 | −27.22 | −24.96 | −26.17 | −22.13 | −20.87 | −21.26 |
4.12 | 4.10 | 4.07 | 4.05 | 4.03 | 4.29 | −27.22 | −24.96 | −26.17 | −22.13 | −20.87 | −21.26 | −20.28 |
4.10 | 4.07 | 4.05 | 4.03 | 4.29 | 4.55 | −24.96 | −26.17 | −22.13 | −20.87 | −21.26 | −20.28 | −18.65 |
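The rows of the table follow a sliding-window pattern: each row is the previous one shifted by one hour, with six wind values and six current values as inputs and the next current value as the target. A minimal sketch of how such rows could be assembled for one grid cell is given below. The function name `make_windows`, the use of `None` to mark gaps, and the exact time alignment between the wind and current windows are assumptions made for illustration.

```python
def make_windows(wind, current, lag=6):
    """Build (inputs, target) pairs from two equally long hourly series.

    Each input vector holds `lag` wind values followed by `lag` current
    values, ordered from oldest to latest. The target is the current at
    the following hour. Windows containing a gap (None) are skipped.
    """
    samples = []
    for t in range(lag, len(current)):
        features = wind[t - lag:t] + current[t - lag:t]
        target = current[t]
        if target is None or any(x is None for x in features):
            continue
        samples.append((features, target))
    return samples

# Values taken from the second and third rows of the table.
wind = [2.42, 2.77, 3.12, 3.47, 3.82, 4.17, 4.14, 4.12]
current = [-34.68, -31.94, -31.61, -33.78, -31.75, -27.25, -25.12, -27.22]
samples = make_windows(wind, current)
```

Skipping incomplete windows mirrors the treatment of training vectors with missing values, at the cost of fewer samples for cells with poor coverage.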
A separate network was generated for each grid cell. As shown in Figure
Schematic diagram of the 12:15:1 neural network used to learn the sea surface currents.
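A 12:15:1 network has 12 inputs, one hidden layer of 15 neurons, and a single output. The sketch below shows the forward pass of such a network in plain Python. The hyperbolic tangent activation, the linear output neuron, and the random initialisation are assumptions, since the activation functions and training algorithm are not given here; the names `init_network`, `predict`, and `n_parameters` are illustrative.

```python
import math
import random

N_IN, N_HIDDEN = 12, 15

def init_network(seed=0):
    """Create randomly initialised weights for a 12:15:1 network."""
    rng = random.Random(seed)
    return {
        "w1": [[rng.uniform(-0.5, 0.5) for _ in range(N_IN)] for _ in range(N_HIDDEN)],
        "b1": [0.0] * N_HIDDEN,
        "w2": [rng.uniform(-0.5, 0.5) for _ in range(N_HIDDEN)],
        "b2": 0.0,
    }

def predict(net, inputs):
    """Forward pass: tanh hidden layer (assumed) and a linear output neuron."""
    hidden = [
        math.tanh(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(net["w1"], net["b1"])
    ]
    return sum(w * h for w, h in zip(net["w2"], hidden)) + net["b2"]

def n_parameters(net):
    """Count all trainable weights and biases."""
    return len(net["w1"]) * len(net["w1"][0]) + len(net["b1"]) + len(net["w2"]) + 1

net = init_network()
print(n_parameters(net))  # 12*15 + 15 + 15 + 1 = 211
```

With one such network per grid cell and per current component, each model stays small and can be trained independently of its neighbours.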
Once all the data had been collected and processed, an ANN was trained for each grid cell of the HF radar domain. Each network was built using all the available information. Training vectors for which at least one value was missing were ignored. The converged ANNs were then run for the timestamps where radar data were missing in order to fill in the gaps. Points near the centre of the domain, for which a lot of information was available, converged in a few iterations with a Pearson correlation of 0.9867 (
Gap filling results by the ANN technique.
Strong correlation between the original radar data and the interpolated results by the ANN technique.
Figures
MSE and absolute residual averages between the DINEOF and ANN techniques for the grid point at 14.3601°E longitude and 36.0659°N latitude (corresponding to Figure
Metric | Zonal | Meridional | Zonal | Meridional |
---|---|---|---|---|
Mean square error | | | | |
Absolute residual average (cm/s) | 3.6815 | 3.9142 | 1.2029 | 1.1193 |
MSE and absolute residual averages between the DINEOF and ANN techniques for the grid point at 14.888269°E longitude and 36.296182°N latitude (corresponding to Figure
Metric | Zonal | Meridional | Zonal | Meridional |
---|---|---|---|---|
Mean square error | 0.0022 | 0.0015 | | |
Absolute residual average (cm/s) | 3.4598 | 2.8885 | 1.2478 | 1.0943 |
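The comparison statistics used in this evaluation, namely the mean square error, the absolute residual average, and the Pearson correlation, can be computed as in the sketch below. The function names and the sample values are illustrative, and the units or scaling applied to the values reported in the tables are not assumed.

```python
import math

def mse(observed, predicted):
    """Mean of the squared residuals."""
    return sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)

def mean_abs_residual(observed, predicted):
    """Mean of the absolute residuals."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)

def pearson(x, y):
    """Pearson correlation coefficient between two equally long series."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Hypothetical radar observations and gap-filled estimates for one component.
radar = [-25.12, -27.22, -24.96, -26.17]
estimate = [-24.50, -27.90, -25.30, -25.80]
print(mse(radar, estimate), mean_abs_residual(radar, estimate), pearson(radar, estimate))
```

Each statistic is evaluated separately for the zonal and meridional components at a given grid point.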
Time series plot of DINEOF interpolated, ANN interpolated, and original radar data (a) and residuals between real and interpolated data (b) for the grid point at 14.3601°E longitude and 36.0659°N latitude.
Time series plot of DINEOF interpolated, ANN interpolated, and original radar data (a) and residuals between real and interpolated data (b) for the grid point at 14.888269°E longitude and 36.296182°N latitude.
The routine acquisition of multidisciplinary, spatially widespread, long-term datasets of the ocean and coastal seas is expected to trigger an unprecedented leap in the economic value of ocean data and information and will additionally target multiple applications and users. The HF Radar Network installed during the CALYPSO projects puts Malta and Sicily at the forefront of such initiatives in the Mediterranean and will serve as a stepping stone for future extensions of the system to cover the full marine space around the Maltese Islands and the Sicilian perimeter, including the coastal areas.
To improve data quality, this work investigated the potential of using Machine Learning techniques to fill in gaps within the HF radar observed current maps. While further work is necessary to get the system running in an operational mode, the proof of concept has shown that very good results can be achieved using the latest six hours of observed current and wind data. Planned future work includes experimentation with other learning methods. The applicability of such techniques for short term forecasting will also be studied. In particular, the potential use of observed sea surface currents and wind vectors to predict surface state conditions of the sea over the next few hours will be investigated.
The authors declare that there is no conflict of interests regarding the publication of this paper.