Comparison between a Machine-Learning-Based Method and a Water-Index-Based Method for Shoreline Mapping Using a High-Resolution Satellite Image Acquired in Hwado Island, South Korea

Shoreline-mapping tasks using remotely sensed image sources have been carried out using machine learning techniques or water indices derived from the image sources. This research compared two different methods for mapping accurate shorelines using a high-resolution satellite image acquired in Hwado Island, South Korea. The first shoreline was generated using a water-index-based method proposed in previous research, and the second shoreline was generated using a machine-learning-based method proposed in this research. The statistical results showed that both shorelines had high accuracy in the well-identified coastal zones, while the second shoreline had better accuracy than the first in the coastal zones with irregular shapes and in the shaded areas not identified by the water-index-based method. Both shorelines, however, had low accuracy in the coastal zones with shaded areas that neither method identified.


Introduction
A coastal zone is defined as "the coastal waters (including the lands therein and thereunder) and the adjacent shorelands (including the waters therein and thereunder) strongly influenced by each and in proximity to the shorelines of several coastal states, and include islands, transitional and intertidal areas, salt marshes, wetlands, and beaches" [1]. Coastal erosion generally causes serious damage to the ecosystems and human lives in coastal zones [2]. A shoreline is defined as "the line along which a large body of water meets the land" [3]. Shoreline mapping is critical for the prevention of coastal erosion, the management of coastal zones, the preservation of coastal properties, and the description of detailed coastal shapes [4,5]. Historically, shoreline-mapping tasks have been carried out using ground-surveying methods, but due to the irregular coastal surfaces and the huge size of coastal areas, ground surveying is not an efficient method for shoreline mapping [6].
Research on shoreline mapping using remote sensing datasets has been carried out because such datasets are efficient for acquiring the surface and geometric information of wide coastal zones with high accuracy and without human access [4][5][6]. Li et al. (2001) and Guariglia et al. (2006) compared multiple techniques for mapping shorelines using different datasets [7,8]. Li et al. (2003) used high-resolution satellite imagery for mapping shorelines by using photogrammetry techniques [9]. Liu et al. (2009) and Choung et al. (2013) used airborne topographic LiDAR (light detection and ranging) data for mapping shorelines by using geometric analysis [4,10]. Lee (2012) utilized high-resolution satellite imagery for mapping shorelines by using the unsupervised segmentation method [6]. [...] 2015) utilized a high-resolution satellite image for mapping shorelines using a water index [15,16]. Bouchahma and Yan (2012) and Choung and Jo (2015) utilized Landsat imagery for mapping shorelines by using a water index [5,17].
Recent research on mapping shorelines using various remote sensing data has followed two different approaches: (1) mapping shorelines by a supervised approach such as machine learning techniques and (2) mapping shorelines by an unsupervised approach based on a water index derived from multispectral image sources. Comparisons of these two approaches for mapping shorelines using high-resolution image sources, however, have been limited. This research proposed a machine-learning-based method and compared it with the water-index-based method proposed by Choung and Jo (2015) for mapping accurate shorelines using a high-resolution satellite image.

Study Areas and Datasets
Hwado Island, South Korea, was selected as the study area in this research due to data availability (see Figure 1). The coastal zones of Hwado Island have a total shoreline length of approximately 7 km.
The orthorectified high-resolution satellite image was acquired by the WorldView-2 satellite on October 11, 2011. The given WorldView-2 image consists of the four available spectral bands (blue: 450-510 nm; green: 510-580 nm; red: 630-690 nm; and NIR (near infrared): 770-895 nm), and the ground resolution of the WorldView-2 image is 50 cm [18]. The horizontal datum of the given WorldView-2 image is WGS (World Geodetic System) 84, and the RMSE (root mean square error) of the image is 25 cm.

Methodology
This section illustrates the water-index-based method (the previous method) and the machine-learning-based method (the proposed method) for mapping shorelines using the given WorldView-2 image. Figure 2 presents a flowchart showing the procedure for mapping shorelines using the two methods. In the water-index-based method, the NDWI (normalized-difference water index) image was generated from the WorldView-2 image, and then the first binary image separating the land and water features was generated from the NDWI image by an adaptive thresholding method. The first shoreline was then extracted from the first binary image by selecting the boundary between the identified land and water features. In the machine-learning-based method, the coastal-surface classification map was generated from the given WorldView-2 image by the support vector machine (SVM) classifier, and then the second binary image was generated from the coastal-surface classification map by grouping the classified features into land (rock and vegetation) and water features. Morphological filtering was applied to refine the boundary between the land and water features in the second binary image, and then the second shoreline was extracted from the refined second binary image by selecting the boundary between the identified land and water features. Finally, the accuracy of both generated shorelines was measured using checkpoints to select the most appropriate approach for the shoreline-mapping task using the given WorldView-2 image.
3.1. Water-Index-Based Method for Mapping the First Shoreline. NDWI is a remote-sensing-derived index for detecting water features such as oceans, rivers, lakes, and reservoirs from multispectral image sources by using their spectral bands [5,16,19,20]. As the water features are enhanced while other features (e.g., land or vegetation features) are suppressed in NDWI, it is a widely used index for detecting water features from multispectral image sources [17][18][19]. In this study, the NDWI image was generated using the green and NIR bands of the WorldView-2 image as given by (1) [19][20][21][22].
$$\mathrm{NDWI} = \frac{\rho_{\mathrm{Green}} - \rho_{\mathrm{NIR}}}{\rho_{\mathrm{Green}} + \rho_{\mathrm{NIR}}} \tag{1}$$
where $\rho_{\mathrm{Green}}$ and $\rho_{\mathrm{NIR}}$ are the reflectance of the green and NIR bands of the given WorldView-2 image, respectively. The generated NDWI image is shown in Figure 3. As the NDWI is a normalized difference, in which the difference between the two bands at each pixel is divided by their sum, the minimum value in the NDWI image is −1, and the maximum value is 1. As can be seen in Figure 3, the pixels representing the water features (ocean) have relatively high values close to 1 while the pixels representing the land features (vegetation, roads, buildings, etc.) have relatively low values close to −1. The next step was to convert the NDWI image into the first binary image separating the land and water features. In the water-index-based method, the adaptive thresholding method is used to separate the land and water features in the NDWI image because it chooses the intensity threshold that minimizes the intraclass variance of the white and black pixel groups in the converted binary image [5,23]. Hence, the adaptive thresholding method was also used in this research to convert the NDWI image into the first binary image separating the land (the black pixels) and water (the white pixels) features (see Figure 4). As in the methodology proposed by Choung and Jo (2015), the conversion was carried out using the Matlab program (Matlab 2013b, Mathworks, Inc., Natick, Massachusetts, USA).
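The two steps above, computing NDWI from the green and NIR bands and then choosing a threshold that minimizes intraclass variance, can be sketched in a few lines. This is only an illustration, not the Matlab procedure used in the study. It assumes the thresholding criterion is the Otsu criterion (minimizing intraclass variance is equivalent to maximizing between-class variance). The function names, the 256-bin histogram, and the tiny reflectance arrays are all hypothetical.

```python
def ndwi(green, nir):
    """Per-pixel NDWI = (green - nir) / (green + nir) for two same-sized bands."""
    out = []
    for g_row, n_row in zip(green, nir):
        row = []
        for g, n in zip(g_row, n_row):
            total = g + n
            row.append((g - n) / total if total else 0.0)  # guard against 0/0
        out.append(row)
    return out


def otsu_threshold(values, bins=256, lo=-1.0, hi=1.0):
    """Return the threshold that maximizes between-class variance (Otsu criterion)."""
    width = (hi - lo) / bins
    hist = [0] * bins
    for v in values:
        hist[min(int((v - lo) / width), bins - 1)] += 1
    centers = [lo + (i + 0.5) * width for i in range(bins)]
    total = len(values)
    sum_all = sum(c * h for c, h in zip(centers, hist))
    best_t, best_var = lo, -1.0
    w0, sum0 = 0, 0.0
    for i in range(bins - 1):
        w0 += hist[i]                  # pixel count at or below bin i
        sum0 += centers[i] * hist[i]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue
        m0, m1 = sum0 / w0, (sum_all - sum0) / w1
        var_between = w0 * w1 * (m0 - m1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, lo + (i + 1) * width
    return best_t


# Hypothetical reflectance values (not taken from the WorldView-2 image).
green = [[0.30, 0.28, 0.10], [0.32, 0.12, 0.08]]
nir = [[0.05, 0.06, 0.40], [0.04, 0.45, 0.50]]

index = ndwi(green, nir)
threshold = otsu_threshold([v for row in index for v in row])
water_mask = [[v > threshold for v in row] for row in index]  # True = water
```

In this toy input the pixels with high green and low NIR reflectance end up above the threshold and are labelled as water, matching the behaviour described for Figure 3.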
Finally, the first shoreline was extracted from the first binary image by selecting the boundary between the land and water features (see Figure 5).
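One simple way to select the boundary between land and water in a binary image is to keep every land pixel that touches a water pixel. The sketch below does this with 4-neighbour connectivity. The source does not say how the boundary was traced, so the connectivity rule, the function name, and the toy mask are assumptions made for illustration.

```python
def boundary_pixels(water_mask):
    """Return (row, col) of land pixels that touch water through a 4-neighbour."""
    rows, cols = len(water_mask), len(water_mask[0])
    edge = []
    for r in range(rows):
        for c in range(cols):
            if water_mask[r][c]:
                continue  # only land pixels are candidate shoreline pixels
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols and water_mask[rr][cc]:
                    edge.append((r, c))
                    break
    return edge


# Hypothetical mask: a 3 x 3 block of land (False) surrounded by water (True).
mask = [[not (1 <= r <= 3 and 1 <= c <= 3) for c in range(5)] for r in range(5)]
shoreline_pixels = boundary_pixels(mask)
```

Here the eight outer pixels of the land block are returned and the single interior pixel is not. Converting such pixel chains into a vector shoreline would be a further step that this sketch does not cover.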

Machine-Learning-Based Method for Mapping the Second Shoreline. Machine learning is defined as "a branch of artificial intelligence in which a computer generates rules underlying or based on the raw data that have been fed into it," and the machine learning technique is defined as "the ability of a machine to improve its performance based on previous results" [24]. Machine learning has multiple advantages because it is useful for high-value prediction and real-time smart decision-making without human intervention [25]. The machine learning technique is generally used in many applications, such as fraud detection, face recognition, spam filtering, stock trading, and text categorization [26]. Recently, machine learning techniques have been used in remote sensing applications for detecting important features and for classifying land covers from remote sensing datasets [27].
Machine learning is classified into several methods according to the type of learning algorithm used, such as the supervised learning technique operated with training samples and the unsupervised learning technique not requiring training samples [28]. SVM, a machine learning technique, is a supervised learning algorithm for finding the hyperplane that maximizes the margin between two classes in n-dimensional space [29]. The SVM classifier has been widely used for land cover classification tasks using remote sensing datasets because it produces accurate classifications and reduces classification noise by using a separating hyperplane [30]. Considering these advantages, the SVM classifier was utilized in this study for generating the coastal-surface classification map ([...] Visual Information Solution, Inc., Boulder, Colorado, USA). Figure 6 shows one section of the generated coastal-surface classification map.
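For reference, the margin-maximizing hyperplane mentioned above is usually written as the following optimization problem. This is the standard hard-margin textbook form for two classes; the symbols ($\mathbf{w}$, $b$, $\mathbf{x}_i$, $y_i$, $m$) are generic notation and are not taken from the source, which does not state the kernel, the soft-margin parameters, or the multi-class scheme used for the water, rock, and vegetation classes.

```latex
\min_{\mathbf{w},\, b} \; \tfrac{1}{2} \lVert \mathbf{w} \rVert^{2}
\quad \text{subject to} \quad
y_i \left( \mathbf{w}^{\top} \mathbf{x}_i + b \right) \ge 1,
\qquad i = 1, \dots, m
```

Here $\mathbf{x}_i$ would be the spectral vector of training pixel $i$ and $y_i \in \{-1, +1\}$ its class label. The margin between the two classes is $2 / \lVert \mathbf{w} \rVert$, so minimizing $\lVert \mathbf{w} \rVert$ maximizes the margin.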
The next step was to convert the coastal-surface classification map into the second binary image. As the rock and vegetation features represent the land features, they were grouped into the land features in the converted binary image. Figure 7 shows the second binary image converted from the coastal-surface classification map. As seen in Figure 7, the land features were manually set to be the white pixels and the water features were manually set to be the black pixels in the second binary image.
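The grouping step amounts to a lookup from class label to a binary value. A minimal sketch follows, assuming the classification map is available as a grid of string labels. The label strings and the function name are hypothetical, while the white-for-land and black-for-water convention follows the text.

```python
LAND_CLASSES = {"rock", "vegetation"}  # classes grouped as land in the text


def to_binary(class_map):
    """Map each class label to 1 (land, white) or 0 (water, black)."""
    return [[1 if label in LAND_CLASSES else 0 for label in row]
            for row in class_map]


# Hypothetical 2 x 3 excerpt of a classification map.
class_map = [["water", "rock", "vegetation"],
             ["water", "water", "rock"]]
binary = to_binary(class_map)
```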
In the second binary image, the land features (the white pixels) often include small holes and gaps (the black pixels) due to the irregular shapes of the coastal zones, continuous wave action, or misclassification errors of the SVM classifier, and these generally cause errors in mapping accurate shorelines. Hence, in this research, morphological filtering was applied to the second binary image to remove the small holes and gaps in the land features while preserving their shapes. Morphological filtering is an image-processing technique for modifying the shapes of input objects by running structure elements with specific shapes over them [31,32]. It has recently been used to extract accurate boundaries of important features from remote sensing datasets [32]. In this research, morphological filtering was carried out in two steps (see Figure 8). In the first step, the image dilation filter was applied to the original binary image shown in Figure 8(a) to remove the small holes in the land features by expanding the land features using the structure element. As can be seen in Figure 8(b), the land features were expanded, and the small holes in the land features were removed by the image dilation filter. In the second step, the image erosion filter was applied to the dilated image with the same structure element to erode the outside of the expanded land features and thereby preserve their shapes. As can be seen in Figure 8(c), the outside of the expanded land features was eroded, and the shapes of the land features were preserved. The shape of the structure element is also significant for reshaping the original objects through morphological filtering [32]. As the shorelines have a linear structure, the shape of the structure element was set as a square, and its width was set as 2 m based on empirical analysis. In this research, the entire refinement of the second binary image by morphological filtering was carried out using the Matlab program (Matlab 2013b, Mathworks, Inc., Natick, Massachusetts, USA).
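Dilation followed by erosion with the same structure element is commonly called morphological closing. The sketch below illustrates the two steps on a toy binary image with a square structure element. It is not the Matlab routine used in the study: the function names, the 3 x 3 element, and the toy image are assumptions. At the 50 cm ground resolution, the 2 m width reported above would correspond to roughly 4 pixels.

```python
def _filter(image, size, combine):
    """Apply `combine` (max or min) over a size x size window around each pixel."""
    rows, cols = len(image), len(image[0])
    half = size // 2
    out = []
    for r in range(rows):
        row = []
        for c in range(cols):
            window = [image[rr][cc]
                      for rr in range(max(0, r - half), min(rows, r + half + 1))
                      for cc in range(max(0, c - half), min(cols, c + half + 1))]
            row.append(combine(window))
        out.append(row)
    return out


def dilate(image, size=3):
    """Expand land (1) pixels: a pixel becomes 1 if any pixel in its window is 1."""
    return _filter(image, size, max)


def erode(image, size=3):
    """Shrink land (1) pixels: a pixel stays 1 only if its whole window is 1."""
    return _filter(image, size, min)


def close_holes(image, size=3):
    """Dilation followed by erosion with the same square structure element."""
    return erode(dilate(image, size), size)


# Hypothetical 7 x 9 image: a 3 x 5 block of land with one single-pixel hole.
filled = [[1 if 2 <= r <= 4 and 2 <= c <= 6 else 0 for c in range(9)]
          for r in range(7)]
land = [row[:] for row in filled]
land[3][4] = 0  # the small hole to be removed

closed = close_holes(land, size=3)
```

In this example the hole is filled and the outline of the land block is unchanged, which is the behaviour shown in Figure 8(a) to (c).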
After the eroded image was generated through the morphological filtering process, the second shoreline was extracted from the eroded image by selecting the boundary between the land and water features (see Figure 9).

Accuracy Measurement of the Coastal-Surface Classification Map.
In this section, the accuracy of the coastal-surface classification map is assessed using 100 checkpoints, defined as the first checkpoint group, generated by manual digitization around the second shoreline. Table 1 shows the accuracy of the identified coastal surfaces classified by the SVM classifier.
In the generated coastal-surface classification map, there were some misclassification errors for the following reasons. First, some water features were misclassified into rock features due to the coastal materials located under the shallow water surfaces. Second, some rock features were misclassified into water features due to their similar reflectance characteristics caused by the shadows on the rock surfaces. Third, some vegetation features were misclassified into water features or vice versa, also due to their similar reflectance characteristics caused by the shadows on their surfaces. Figure 10 shows examples of the misclassification errors that occurred in the coastal-surface classification map.
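Checkpoint-based classification accuracy is typically summarized from a confusion matrix. The sketch below shows how overall, producer's, and user's accuracies could be computed for the three classes. The counts are entirely hypothetical (they only sum to 100 to mirror the number of checkpoints) and are not the values in Table 1, and the source does not state which of these measures Table 1 reports.

```python
CLASSES = ["water", "rock", "vegetation"]

# HYPOTHETICAL counts: rows = reference class, columns = classified class.
confusion = [
    [38, 2, 0],
    [3, 30, 1],
    [1, 1, 24],
]


def overall_accuracy(matrix):
    """Correctly classified checkpoints divided by all checkpoints."""
    total = sum(sum(row) for row in matrix)
    return sum(matrix[i][i] for i in range(len(matrix))) / total


def producers_accuracy(matrix, i):
    """Share of reference checkpoints of class i that were classified as i."""
    return matrix[i][i] / sum(matrix[i])


def users_accuracy(matrix, i):
    """Share of checkpoints classified as class i that truly belong to i."""
    return matrix[i][i] / sum(row[i] for row in matrix)


overall = overall_accuracy(confusion)
water_producer = producers_accuracy(confusion, CLASSES.index("water"))
water_user = users_accuracy(confusion, CLASSES.index("water"))
```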

Accuracy Measurement of the First and Second Shorelines.
In this section, the accuracy of the two shorelines generated by the water-index-based and machine-learning-based methods was measured through the following steps. First, 100 checkpoints, defined as the second checkpoint group, were generated by manual digitization, with an average spacing of 70 m between checkpoints (see Figure 11). Then the accuracy of both shorelines was assessed by measuring the shortest distance from each checkpoint of the second checkpoint group to the first and second shorelines, respectively. Table 2 shows the accuracy of both shorelines generated by the water-index-based and machine-learning-based methods, and Figure 12 shows the line graphs of the shortest distances from each checkpoint to the shorelines generated by the different methods.
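The shortest distance from a checkpoint to a shoreline can be computed by treating the shoreline as a polyline and taking the minimum point-to-segment distance. The sketch below does this and then summarizes the distances with a mean and an RMSE. The choice of summary statistics is an assumption, since the contents of Table 2 are not shown here, and the coordinates are hypothetical planar values in metres.

```python
import math


def point_segment_distance(p, a, b):
    """Shortest distance from point p to the segment a-b (planar coordinates)."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0:
        return math.hypot(px - ax, py - ay)  # degenerate segment
    # Project p onto the segment and clamp the projection to its end points.
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / seg_len_sq))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def shortest_distance(p, polyline):
    """Minimum distance from p to any segment of the polyline."""
    return min(point_segment_distance(p, a, b)
               for a, b in zip(polyline, polyline[1:]))


def summarize(distances):
    """Return (mean, RMSE) of a list of distances."""
    n = len(distances)
    return sum(distances) / n, math.sqrt(sum(d * d for d in distances) / n)


# Hypothetical shoreline vertices and checkpoints.
shoreline = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0)]
checkpoints = [(5.0, 3.0), (12.0, 5.0), (-3.0, -4.0)]

distances = [shortest_distance(p, shoreline) for p in checkpoints]
mean_error, rmse = summarize(distances)
```

Running the same computation against each of the two shorelines would give one distance series per method, comparable to the line graphs described for Figure 12.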

Discussions
As can be seen in Table 2, the second shoreline generated by the machine-learning-based method had better overall accuracy than the first shoreline generated by the water-index-based method. For a detailed examination of the comparison results, checkpoint indices located in the regions where the first or second shoreline had significant errors were selected (see Figure 13(a)). In Figure 12, the first shoreline had serious errors in checkpoint indices 1, 18, 23, 24, 42, 43, 52, 55, 56, 70, 71, 76, 79, 85, 98, and 99 while the second shoreline had serious errors in checkpoint indices 24, 52, and 55. In general, both shorelines had high accuracy in the well-identified coastal zones where the boundary between land and water was easily recognized in the given high-resolution satellite image. The selected Region 1 shows an example of the coastal zones where both shorelines had high accuracy (see Figure 13(b)). The second shoreline generally had better accuracy than the first shoreline for the following reasons: (1) the irregular coastal shapes that were hardly recognized in the NDWI image due to continuous wave action and similar effects (checkpoint indices 1, 18, 85, and 99) and (2) the shaded areas that were not identified in the NDWI image but were identified in the coastal-surface classification map generated by the SVM classifier (checkpoint indices 23, 42, 43, 56, 70, 71, 76, 79, and 98). The selected Region 2 (checkpoint index 85) shows an example of the coastal zones where the second shoreline had better accuracy than the first shoreline for the first reason (see Figure 13(c)), and the selected Region 3 (checkpoint indices 42 and 43) shows an example of the coastal zones where the second shoreline had better accuracy than the first shoreline for the second reason (see Figure 13(d)). Both shorelines, however, had low accuracy in the regions where the coastal zones were hardly identified due to the shaded areas that were identified neither in the NDWI image nor in the coastal-surface classification map (checkpoint indices 24, 52, and 55). The selected Region 4 (checkpoint index 24) shows an example of the coastal zones where both shorelines had low accuracy due to the shaded areas identified in neither the NDWI image nor the coastal-surface classification map (see Figure 13(e)).
In summary, both shorelines had high accuracy in the well-identified coastal zones, while the second shoreline generated by the machine-learning-based method had better accuracy than the first shoreline generated by the water-index-based method in the coastal zones with irregular shapes or light shadows. Both methods, however, performed poorly in mapping the shorelines in the coastal zones with heavy shadows that were not identified in the NDWI image or the coastal-surface classification map. In general, the pixels representing the shaded areas have lower intensity values than other pixels in all multispectral bands [33], which causes errors in identifying the features in the NDWI image or the coastal-surface classification map, both of which are generated from the multispectral bands of the image sources. Hence, additional data not affected by shadows would be needed for mapping the shorelines in the shaded areas.

Conclusions
The shoreline-mapping task using remotely sensed image sources is efficient for the estimation of shoreline positions without human access. This research compared two methods (the water-index-based method and the machine-learning-based method) for mapping accurate shorelines using a high-resolution satellite image. The water-index-based method is useful for separating the land and water features from multispectral image sources, but it is limited in identifying the various land covers that constitute the coastal zones. The machine-learning-based method is useful for identifying these various coastal features with different spectral-reflectance characteristics, which means that the machine-learning-based method is better suited than the water-index-based method for mapping shorelines using multispectral image sources. Improvements are required, however, in future research for the development of an automatic shoreline-mapping process and the estimation of shoreline positions in various coastal zones. First, different machine learning algorithms or other techniques should be applied to generate a more accurate coastal-surface classification map for mapping accurate shorelines in various coastal zones. Second, additional datasets not influenced by shadows should be integrated with the image sources for mapping accurate shorelines not only in the well-identified coastal zones but also in the shaded coastal zones. Third, ground truths acquired by the ground-surveying method should be used for measuring the accuracy of the generated shorelines and the coastal-surface classification map.

Figure 1: Hwado Island, South Korea, selected as the study area.

Figure 2: Flowchart showing the procedure for mapping shorelines using the two different methods.

Figure 4: First binary image converted from the NDWI image through the adaptive thresholding method.

Figure 5: First shoreline extracted from the first binary image by selecting the boundary between the land and water features.

Figure 6: One section of the generated coastal-surface classification map: (a) one section of the given high-resolution satellite image; (b) one section of the generated coastal-surface classification map.

Figure 7: Second binary image converted from the coastal-surface classification map.

Figure 8: Process showing morphological filtering applied to the second binary image: (a) one section of the original binary image; (b) one section of the dilated image; and (c) one section of the eroded image.

Figure 10(a) shows an example of the misclassification in the region where the water features were misclassified into rock features, Figure 10(b) shows an example of the misclassification in the region where the rock features were misclassified into water features, and Figure 10(c) shows an example of the misclassification in the region where the vegetation features were misclassified into water features.

Figure 9: Second shoreline extracted from the eroded image by selecting the boundary between the land and water features.

Figure 10: Examples of the misclassification errors that occurred in the coastal-surface classification map: (a) example of the misclassification in the region where the water features were misclassified into rock features; (b) example of the misclassification in the region where the rock features were misclassified into water features; and (c) example of the misclassification in the region where the vegetation features were misclassified into water features.

Figure 11: Locations of the second checkpoint group for the measurement of the accuracy of both shorelines.

Figure 13: Detailed examination of comparison results: (a) locations of the selected Regions 1, 2, 3, and 4 in the entire study area; (b) Region 1, where both shorelines had high accuracy; (c) Region 2, where the second shoreline had better accuracy for the first reason; (d) Region 3, where the second shoreline had better accuracy for the second reason; and (e) Region 4, where both shorelines had low accuracy due to the shaded areas that were not identified in the NDWI image or the coastal-surface classification map.

Table 1: Accuracy of the coastal surfaces classified by the SVM classifier.

Table 2: Accuracy of the first shoreline generated by the water-index-based method and the second shoreline generated by the machine-learning-based method.