Landslide Susceptibility Mapping Using GIS and Bivariate Statistical Models in Chemoga Watershed, Ethiopia

This study aimed to map landslide susceptibility in the Chemoga watershed, Ethiopia, using a Geographic Information System (GIS) and bivariate statistical models. Based on Google Earth imagery and field survey, 169 landslide locations were identified and randomly divided into training (70%) and test (30%) datasets. Eleven landslide conditioning factors (slope, elevation, aspect, curvature, topographic wetness index, normalized difference vegetation index, road, river, land use, rainfall, and lithology) were integrated with the training landslides to determine the weights of each factor and factor class using both the frequency ratio (FR) and information value (IV) models. The final landslide susceptibility map was classified into five classes: very low, low, moderate, high, and very high. Validation using the area under the curve (AUC) showed that the success rates of the FR and IV models were 87.00% and 90.10%, while the prediction rates were 88.00% and 92.30%, respectively. This type of study will be useful to the local government for future planning and decisions on landslide mitigation.


Introduction
Landslides are a major natural hazard that poses a significant threat to human lives and infrastructure [1,2]. Natural hazards such as landslides, floods, earthquakes, and droughts cannot be avoided completely, but their processes and consequences can be mitigated [3,4]. The Chemoga watershed, located in the northern part of Ethiopia, is prone to landslide hazards due to its steep slopes, rugged topography, and intense rainfall. Increasing population pressure and the rapid expansion of infrastructure have also contributed to the occurrence of landslides in the area [5,6].
In this study, we aimed to develop a landslide susceptibility map of the Chemoga watershed, Ethiopia, using GIS and bivariate statistical models. We collected landslide inventory data through field surveys and prepared thematic layers for slope, elevation, aspect, curvature, topographic wetness index (TWI), normalized difference vegetation index (NDVI), road, river, land use, rainfall, and lithology from the digital elevation model (DEM), satellite imagery, and other sources. Two bivariate statistical models, FR and IV, were used to analyze the relationships between landslide occurrences and the thematic layers. The accuracy of the models was evaluated using a validation dataset.
The results of this study can provide valuable information for land use planning and management in the Chemoga watershed. A landslide susceptibility map can help identify areas that are prone to landslide hazards and prioritize mitigation measures to reduce the risk of landslide disasters.

Materials and Methods
2.1. Description of the Study Area. The Chemoga watershed is located in the upper Abay River basin, Ethiopia, and covers an area of 1,414.85 km². In the UTM coordinate system (zone 37N), the watershed lies approximately between 330,000 and 380,000 m E and between 1,110,000 and 1,170,000 m N (Figure 1). Topographically, the altitude ranges from 863 to 3,946 m and the slope angle varies from 0° to 67°. In terms of land use, most of the watershed is covered by scrub/shrub and cropland. The study area receives a high amount of rainfall during the summer season. According to the Ethiopian National Meteorological Agency, the average annual precipitation and temperature of the area are 1,376 mm and 16.95°C, respectively.

Data Source and Methodology.
In this study, we used both primary and secondary data. The primary data were collected through field survey and observation, and the secondary data were acquired from governmental and nongovernmental institutions, journals, the internet, and other documents. The main data used were Sentinel-2 images, a 30 m DEM, Google Earth imagery, and the topographical map of the area. The land use and NDVI layers were derived from the Sentinel-2 images, and the DEM was used to create the slope, elevation, aspect, curvature, and TWI layers using spatial analysis tools. Annual rainfall data were obtained from the National Meteorological Agency of Ethiopia. The main roads and rivers were digitized from the topographical map of Ethiopia, and the geological map was used to create the lithology layer of the study area. All data layers were constructed and combined in ArcGIS 10.4. The FR and IV models were then used to generate the landslide susceptibility maps. The conditioning factors considered, their formats, and their sources are presented in Table 1, while the methodological workflow is shown in Figure 2.

Landslide Inventory Map.
Landslide inventory mapping is the systematic mapping of existing landslides in a region using techniques such as field survey, interpretation of aerial photographs or Google Earth imagery, satellite image interpretation, literature search (technical, scientific, and governmental reports), and interviews with experts [45,46]. In this study, a landslide inventory map with a total of 169 individual landslide locations was generated by integrating different data sources: Google Earth imagery digitized into points and field surveys, i.e., GPS points (collected between 2016 and 2022). Landslide types in the study area include rockslide, soil slide, debris flow, earth flow, rock fall, and rock toppling. Although there is no specific rule for allocating landslide occurrences to training and validation datasets [47], studies commonly use 70% of landslide events as training data and the remaining 30% to validate the output model [11,14,48]. In this study, 118
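The random 70/30 allocation described above can be sketched as follows. This is a minimal illustration, not the authors' ArcGIS workflow: the function name `split_inventory`, the seed, and the integer IDs standing in for the 169 mapped locations are all assumptions introduced here.

```python
import random


def split_inventory(points, train_fraction=0.7, seed=42):
    """Shuffle landslide points and split them into training and validation sets.

    The seed is an arbitrary choice made here for reproducibility.
    """
    shuffled = list(points)
    random.Random(seed).shuffle(shuffled)
    n_train = int(round(train_fraction * len(shuffled)))
    return shuffled[:n_train], shuffled[n_train:]


# Hypothetical IDs standing in for the 169 mapped landslide locations.
landslide_ids = list(range(169))
train, validation = split_inventory(landslide_ids)
print(len(train), len(validation))
```

With 169 locations and a 0.7 training fraction, rounding gives 118 training points and 51 validation points.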

Landslide Conditioning Factors.
Identifying the conditioning factors of landslide occurrence is complex because there is no standard rule for selecting which factors to use [49]. In this study, 11 conditioning factors were selected based on the literature, their effectiveness, data availability, and their relevance to landslide occurrence [23]. These conditioning factors are slope, elevation, aspect, curvature, TWI, NDVI, road, river, land use, rainfall, and lithology. Each factor was converted to raster format and classified using the Jenks natural breaks method in ArcGIS (Figure 3).
In landslide susceptibility studies, slope is considered one of the major contributing factors [21,50]. Given its importance for landslide occurrence, the slope data were classified into five classes. As the slope angle increases, the possibility of landslide occurrence increases [19,51,52]. Elevation is an important conditioning factor in landslide susceptibility mapping, and it also influences environmental conditions on slopes such as human activity, vegetation, soil moisture, and climate [53,54]. Curvature plays an important role in surface runoff and ground infiltration and thus affects surface erosion and the groundwater condition of the region [17]. The curvature map was classified into concave (negative), convex (positive), and flat (zero) surfaces. The more negative the curvature value, the higher the probability of landslide occurrence [29]. Aspect represents the direction that a slope faces [53]. Slope aspect affects erosion, surface evaporation, desertification, solar heating, and surface weathering, and thus the occurrence of landslides [50,55]. TWI is a widely used terrain attribute and one of the important factors responsible for landslides; it quantitatively expresses the control of terrain on the spatial distribution of soil moisture. The TWI conditioning factor was obtained from the DEM with 30 m spatial resolution using Equation (1):

$$\mathrm{TWI} = \ln\left(\frac{A_s}{\tan\beta}\right) \tag{1}$$

where $A_s$ is the specific catchment area (m²/m) and $\beta$ is the slope angle in degrees [56]. TWI is used to measure the topographic control of hydrological processes [57]. Rainfall is considered one of the conditioning factors of landslide occurrence. The rainfall map was prepared from five station locations in the study area by IDW interpolation of annual average precipitation (1990-2021). Road is one of the most influential factors in landslide occurrence [1]. Road construction near hillsides may change the natural conditions of an area. River networks play an important role in landslide occurrence because of their close relationship with surface water. The NDVI conditioning factor was obtained from Sentinel-2 satellite imagery with 30 m spatial resolution using Equation (2):

$$\mathrm{NDVI} = \frac{IR - R}{IR + R} \tag{2}$$

where IR is the infrared band and R is the red band of the electromagnetic spectrum. NDVI values range between −1.0 and 1.0: negative values are mainly generated by clouds, water, and snow, values near zero by rock and bare soil, and positive values indicate that the ground is covered by vegetation. Land use is an important conditioning factor that affects the occurrence of landslides. The land use map was derived from Sentinel-2 satellite imagery using a supervised classification technique and was classified into six classes. The study area is predominantly covered by cropland and scrub. Lithology was classified into four classes; the dominant lithology in the study area is Tertiary extrusive and intrusive rocks.
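The two index definitions in this section (TWI from the specific catchment area and slope angle, NDVI from the infrared and red bands) can be sketched per cell as below. This is an illustrative sketch only, not the ArcGIS processing used in the study. The function names, the `min_tan` guard for flat cells, and the NaN returned for zero total reflectance are assumptions introduced here.

```python
import math


def twi(specific_catchment_area, slope_deg, min_tan=1e-6):
    """Topographic wetness index: ln(A_s / tan(beta)), with beta in degrees.

    min_tan is an assumed guard against division by zero on flat cells;
    the source does not say how flat cells were handled.
    """
    tan_beta = max(math.tan(math.radians(slope_deg)), min_tan)
    return math.log(specific_catchment_area / tan_beta)


def ndvi(infrared, red):
    """Normalized difference vegetation index: (IR - R) / (IR + R)."""
    total = infrared + red
    if total == 0:
        # Assumed convention: no reflectance in either band gives no value.
        return float("nan")
    return (infrared - red) / total


# Hypothetical cell values, chosen only for illustration.
wet_cell = twi(specific_catchment_area=100.0, slope_deg=45.0)
vegetated_cell = ndvi(infrared=0.5, red=0.1)
water_cell = ndvi(infrared=0.02, red=0.06)
```

For the hypothetical values above, the vegetated cell has a positive NDVI and the water cell a negative one, matching the interpretation of the NDVI range given in the text.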
2.5. Landslide Susceptibility Modeling
2.5.1. Frequency Ratio (FR) Model. FR is one of the most widely adopted methods for landslide susceptibility assessment [14,16,58]. The FR is the ratio of the area where landslides occurred to the total study area, and also the ratio of the probability of landslide occurrence to that of non-occurrence for a given attribute [59,60]. Generally, a greater ratio indicates a stronger relationship between a conditioning factor and landslides, and vice versa. An FR value greater than 1 indicates a high probability of landslide occurrence, and a value less than 1 indicates a weak relationship with landslide occurrence. The landslide susceptibility map (LSM) is calculated by summing the FR values of all the factors considered, as in Equation (3):

$$\mathrm{LSM} = \sum_{i=1}^{n} \mathrm{FR}_i \tag{3}$$

where LSM is the landslide susceptibility map, $\mathrm{FR}_i$ is the frequency ratio of each factor type or class, and n is the number of factors. The FR is obtained by Equation (4):

$$\mathrm{FR} = \frac{N_{\mathrm{pix}}(SX_i) \Big/ \sum_{i=1}^{m} N_{\mathrm{pix}}(SX_i)}{N_{\mathrm{pix}}(X_j) \Big/ \sum_{j=1}^{n} N_{\mathrm{pix}}(X_j)} \tag{4}$$

where $N_{\mathrm{pix}}(SX_i)$ is the number of landslide pixels in class i of factor X; $N_{\mathrm{pix}}(X_j)$ is the total number of pixels within factor $X_j$; m is the number of classes in factor $X_i$; and n is the total number of factors in the study area [60].
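The FR calculation and the summation over factors can be sketched as follows. This is a minimal illustration under stated assumptions: FR is computed as the share of landslide pixels in a class divided by the share of area in that class, and all class names and pixel counts are hypothetical rather than values from Table 2. The function names are introduced here for illustration.

```python
def frequency_ratio(landslide_pixels, class_pixels):
    """Return the FR of every class of one conditioning factor.

    FR = (landslide pixels in class / all landslide pixels)
         / (pixels in class / all pixels).
    """
    total_landslides = sum(landslide_pixels.values())
    total_pixels = sum(class_pixels.values())
    fr = {}
    for cls, n_class in class_pixels.items():
        landslide_share = landslide_pixels.get(cls, 0) / total_landslides
        area_share = n_class / total_pixels
        fr[cls] = landslide_share / area_share
    return fr


def susceptibility_index(cell_classes, fr_tables):
    """Sum the FR of the class that a cell falls into for every factor."""
    return sum(fr_tables[factor][cls] for factor, cls in cell_classes.items())


# Hypothetical pixel counts for a slope factor with four classes.
slope_class_pixels = {"gentle": 500, "moderate": 300, "steep": 150, "very_steep": 50}
slope_landslide_pixels = {"gentle": 5, "moderate": 10, "steep": 35, "very_steep": 50}
slope_fr = frequency_ratio(slope_landslide_pixels, slope_class_pixels)

# Hypothetical FR table for a second factor, given directly.
fr_tables = {"slope": slope_fr, "land_use": {"cropland": 1.2, "forest": 0.4}}
cell = {"slope": "very_steep", "land_use": "cropland"}
lsi = susceptibility_index(cell, fr_tables)
```

In this toy example the steepest class holds half of the landslide pixels on 5% of the area, so its FR is well above 1, while the gentlest class falls well below 1.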

Information Value (IV) Model.
The IV model is a bivariate statistical approach that objectively assesses landslide susceptibility using information theory, which is an advantage in identifying areas at risk of landslides. The model was originally proposed by [61] and later slightly modified by [46]. It is used to evaluate the spatial relationship between the conditioning factor classes and the probability of landslide occurrence. Generally, a higher IV corresponds to a stronger relationship between the probability of landslide occurrence and the conditioning factor class. An IV greater than 0 indicates a high probability of landslide occurrence, and a value less than 0 indicates a weak relationship with landslide occurrence. The LSM for each pixel was computed by summing the information values of each factor class as follows:

$$\mathrm{LSM} = \sum_{i=1}^{n} \mathrm{IV}_i$$

where LSM is the landslide susceptibility map, $\mathrm{IV}_i$ is the information value of each factor class, and n is the number of factors. IV was applied, and the weights were assigned to each class of each conditioning factor. The information value (IV) can be calculated using the following formula [61]:

$$\mathrm{IV} = \log\left(\frac{\mathrm{Nslpix}/\mathrm{Ncpix}}{\mathrm{Ntspix}/\mathrm{Ntapix}}\right)$$

where Nslpix is the number of landslide pixels in a given class, Ncpix is the number of pixels in a given class, Ntspix is the total number of landslide pixels in the study area, and Ntapix is the total number of pixels in the study area.
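The information value of a class can be sketched directly from the four pixel counts defined above. The logarithm base is an assumption: base 10 is used here because the slope class reported in the Results (FR = 9.27, IV = 0.967) is consistent with a base-10 logarithm of the FR value. The function name, the pixel counts, and the handling of classes without landslides are likewise illustrative choices, not taken from the source.

```python
import math


def information_value(n_sl_pix, n_c_pix, n_ts_pix, n_ta_pix, base=10.0):
    """IV = log((Nslpix / Ncpix) / (Ntspix / Ntapix)).

    Returns None for a class with no landslide pixels, where the logarithm
    is undefined; how such classes were treated in the study is not stated.
    """
    if n_sl_pix == 0:
        return None
    density_ratio = (n_sl_pix / n_c_pix) / (n_ts_pix / n_ta_pix)
    return math.log(density_ratio, base)


# Hypothetical counts chosen so that the landslide density ratio equals 9.27.
iv_steep = information_value(n_sl_pix=927, n_c_pix=10_000,
                             n_ts_pix=1_000, n_ta_pix=100_000)

# A class whose landslide density is below the watershed average gets IV < 0.
iv_flat = information_value(n_sl_pix=10, n_c_pix=50_000,
                            n_ts_pix=1_000, n_ta_pix=100_000)
```

Because the density ratio inside the logarithm is the same quantity as the FR of the class, an FR above 1 always maps to a positive IV and an FR below 1 to a negative IV.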

Results and Discussion
3.1. Application of Frequency Ratio (FR) Model. FR was calculated for each class of every landslide conditioning factor by dividing the landslide occurrence ratio by the area ratio. The FR results for each class of the conditioning factors are shown in Table 2. In general, an FR value of 1 indicates an average correlation between landslide occurrence and the conditioning factors. An FR value greater than 1 indicates a high likelihood of landslide occurrence, while an FR value less than 1 indicates a low likelihood [47]. For slope, the 33°-67° class has the highest FR value (9.27) among the slope classes. The remaining slope classes have low probabilities of landslide occurrence. In the study area, it was observed that the probability of landslide occurrence increased with slope gradient up to a certain extent and then decreased, consistent with results from other studies [20]. This is because higher slope values increase the effects of gravity and shear stress [46]. The relationship between landslide occurrence and elevation indicated that the range between
Therefore, the existing roads and ongoing construction disturb slope stability, thereby increasing the probability of landslide occurrence [19,20]. According to Guzzetti [62], landslide probability decreases with increasing distance from river networks. In this study area, a distance from the river network of 2,560-4,133 m exerts the highest influence on landslide occurrence. The reason is that permanent rivers are the main source of moisture for landslide occurrence. For NDVI, the FR value is greater than 1 in the NDVI classes −0.04 to 0.10 and 0.23-0.

Application of Information Value Model.
The information value of each conditioning factor class was calculated through Equation (5), and the spatial relationship between each conditioning factor and landslide occurrence is shown in Table 2.
If the IV of a factor class is negative, there is a low likelihood of landslide occurrence. If the value is positive, there is a high probability of landslide occurrence [46]. For slope, the 33°-67° class is highly prone to landslides, with the highest IV of 0.967, whereas flat slopes show a lower probability. The occurrence of landslides tends to increase with higher slopes and decrease with lower slopes. For elevation, the class 1,509-2,042 m (IV = 445) has a high probability of landslide occurrence, and all other classes have a very low impact. Generally, landslides mostly occur at higher elevations, but in this study the landslides occurred in the lower area.
In the present study, the performance of the LSM produced by the FR and IV models was evaluated using the area under the curve (AUC). The AUC indicates the accuracy of the landslide susceptibility maps through success and prediction rate curves [63]. The success rate curve represents the fit of the model to the existing landslides. The prediction rate curve indicates how well the model predicts future landslides [47]. The rate curves were drawn with the true positive rate on the y-axis and the false positive rate on the x-axis, using the training landslides for the success rate and the validation landslides for the prediction rate. The total AUC value can be used as a measure of the accuracy of the susceptibility map, where a larger value indicates higher accuracy. AUC values from 0.5 to 1.0 are used to evaluate the accuracy of the model [63]. The qualitative relationship between AUC and prediction accuracy can be classified as follows: excellent (0.9-1.0), very good (0.8-0.9), good (0.7-0.8), average (0.6-0.7), and fair (0.5-0.6) [63]. If the AUC value is close to 1.0, the model has ideal performance, whereas a value equal to or less than 0.5 indicates poor performance [64]. The AUC values for the success rate curves were 0.870 and 0.901 for the FR and IV models, respectively, which can be interpreted as success rates of 87.00% and 90.10% (Figure 5(a)). The AUC values for the prediction rate curves were 0.880 and 0.923 for the FR and IV models, respectively, which can be interpreted as prediction accuracies of 88.00% and 92.30% (Figure 5(b)). The success and prediction rates of the FR model fall between 0.8 and 0.9, indicating very good performance. The success and prediction rates of the IV model fall between 0.9 and 1.0, indicating excellent performance.
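The AUC and the qualitative rating bands can be sketched as below. This is an illustration under stated assumptions, not the procedure used to produce Figure 5: it assumes a ROC-style formulation in which susceptibility scores at landslide cells are compared with scores at non-landslide cells, and it computes the AUC with the pairwise (rank-based) definition, which equals the area under the ROC curve. The function names, the toy scores, and the treatment of band boundaries are choices made here.

```python
def auc(positive_scores, negative_scores):
    """Probability that a landslide cell outscores a non-landslide cell.

    Ties count as half a win. This pairwise form is equivalent to the
    area under the ROC curve.
    """
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))


def rating(auc_value):
    """Map an AUC value to the qualitative bands quoted in the text [63]."""
    if auc_value >= 0.9:
        return "excellent"
    if auc_value >= 0.8:
        return "very good"
    if auc_value >= 0.7:
        return "good"
    if auc_value >= 0.6:
        return "average"
    if auc_value > 0.5:
        return "fair"
    return "poor"


# Hypothetical susceptibility scores, for illustration only.
toy_auc = auc([0.9, 0.8, 0.4], [0.7, 0.3, 0.2])

# Ratings for the AUC values reported in this section.
reported = {"FR success": 0.870, "IV success": 0.901,
            "FR prediction": 0.880, "IV prediction": 0.923}
ratings = {name: rating(value) for name, value in reported.items()}
```

Applied to the reported values, the bands reproduce the interpretation given in the text: both FR rates fall in the very good band and both IV rates in the excellent band.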

Conclusion
The use of GIS and bivariate statistical models proved to be an effective approach to mapping landslide susceptibility in the Chemoga watershed, Ethiopia. The study considered several factors that influence landslide occurrence in the watershed: slope, elevation, aspect, curvature, TWI, NDVI, road, river, land use, rainfall, and lithology. A landslide inventory map was prepared using Google Earth imagery and field survey, through which 169 landslide locations were identified and mapped. The susceptibility maps produced with the FR and IV models were divided into five susceptibility classes: very low, low, moderate, high, and very high. The AUC rate curves quantitatively indicate the performance of the susceptibility maps. The results showed that the IV model outperformed the FR model, with success rates of 90.10% and 87.00% and prediction rates of 92.30% and 88.00%, respectively. The findings of this study can contribute to the development of a comprehensive disaster risk reduction strategy in the study area and in other landslide-prone regions of Ethiopia.
For aspect, the flat (IV = −0.534), north (IV = −0.434), and northeast (IV = −0.166) classes have the lowest values, indicating a low probability of landslide occurrence. The remaining aspect classes have positive IV values, indicating a high probability of landslide occurrence. In terms of curvature, the flat class has the lowest IV (−0.156), indicating a low probability of landslide occurrence, while the convex and concave classes have higher IV values (0.142 and 0.139, respectively), indicating a high probability. For distance from the road, the class 6,985-11,577 m has the highest IV (0.312), indicating a high probability of landslide occurrence. The distance to river factor has a high IV (0.178) for the subclass 2,560-4,133 m, while the remaining subclasses have low IV values, indicating a low probability of landslide occurrence. The NDVI classes −0.04 to 0.10 and 0.23-0.48 have positive IV values, indicating a high probability of landslide occurrence, while the remaining NDVI classes have negative IV values

TABLE 1: Type of conditioning factors, format, and source.

TABLE 2: Spatial relationship between each conditioning factor and landslide occurrence using FR and IV models.
the other classes have negative IV values, indicating a low probability of landslide occurrence. The other important conditioning factor in this study is lithology. Among the lithology classes, Precambrian and Triassic and Permian (IV = 0.278) and Triassic and Permian (IV = 0.360) have the highest values, indicating a high probability of landslide occurrence.
were classified into five susceptibility classes (very low, low, moderate, high, and very high) in both models using the geometrical interval method for visual interpretation (Table 3).
3.4. Validation of Landslide Susceptibility Maps. The FR and IV models were validated to check their reliability and performance.

TABLE 3: Landslide susceptibility classes and summary of FR and IV models.