Classification and Regression Tree Models for Remote Recognition of Black and Odorous Water Bodies Based on Sensor Networks

Black and odorous water bodies are a topic of significant interest in the field of water pollution prevention and control. Remote sensing technology is increasingly exploited for monitoring such water bodies because of its high efficiency and large-scale monitoring potential. In the present study, Sentinel-2A imagery data were combined with measured spectral properties of black and odorous water bodies to produce an improved remote sensing recognition method for such water bodies based on a classification and regression tree (CART) model. This method transforms the traditional single-feature empirical threshold segmentation algorithm into a multi-feature fuzzy decision-tree classification algorithm. The results reveal overall accuracy values of 84.78%, 92.85%, and 72.23% for the CART decision-tree algorithm, the confidence zone classification, and the fuzzy zone node classification, respectively. The method proposed in the present study enables the highly precise extraction of features representing black and odorous water bodies from satellite imagery. The characterization of confidence and fuzzy zones minimizes the need for field inspections and enhances the efficiency of diverse applications, including engineering.


Introduction
Black and odorous characteristics, that is, a dark color associated with an unpleasant odor, reflect extreme organic pollution of a water body [1]. When anthropogenic activities discharge large amounts of exogenous organic matter into a water body, the biochemical activity of aerobic microorganisms consumes the oxygen, thereby creating anoxic or anaerobic conditions. These conditions promote the death and decomposition of algae and other aquatic organisms. The associated processes produce gases with obvious odors, such as H₂S and NH₃, while metals, such as Fe and Mn, are reduced to dark-colored sulfides. In recent years, because of rapid economic and societal development, severe black and odorous water problems have been experienced in many cities in China, and these seriously threaten the urban ecological environment and the health and safety of residents [2,3]. In 2015, the State Council of the PRC issued the Action Plan for Prevention and Control of Water Pollution, in which the monitoring and treatment of black and odorous water bodies were included as important measures of water pollution prevention and control. Based on this document, black and odorous water bodies in built-up areas in cities across the country must be eliminated before the end of 2030. To implement the Action Plan, the Ministry of Housing and Urban-Rural Development and the Ministry of Environmental Protection jointly issued the Guidelines for the Remediation of Urban Black and Odorous Water (hereinafter referred to as the Guide), in which several technical issues associated with the monitoring and treatment of black and odorous water bodies were clarified [1].
Conventional black and odorous water monitoring relies on manual sampling and verification, and the results obtained are usually based on the experience of the on-site staff and the chemical data for the water samples. This method, however, is time-consuming and labor-intensive, and thus, it is challenging to apply to regional monitoring. In addition, because black and odorous water bodies are usually small and dispersed, their identification using manual methods, which are often characterized by blind areas and dead ends, is difficult. In contrast, satellite remote sensing technology offers continuous and large-scale monitoring. The spectral differences between black and odorous and regular water bodies can facilitate the extraction of features representing the former from satellite imagery, thereby providing a rapid and reliable method for the identification and monitoring of black and odorous water bodies in urban areas [4]. In 2016, the Institute of Remote Sensing and Digital Earth of the Chinese Academy of Sciences, in collaboration with the Satellite Environment Application Center of the Ministry of Environmental Protection, conducted a remote sensing screening and verification study of black and odorous water bodies in Beijing, Shenyang, Taiyuan, and other cities. A related study was considered one of the ten major investigations in the field of remote sensing in China.
At present, the shallow learning models commonly employed in China for recognizing black and odorous water bodies through remote sensing mainly involve (1) threshold segmentation based on a single feature and (2) empirical decision trees based on multiple features. The former approach relies on indexes generated from the spectral differences between black and odorous and regular water bodies. These indexes are obtained through statistical analysis of data from field samples. For example, Wen et al. proposed a band ratio method for the extraction of black and odorous water features from remote sensing imagery [5]. Li et al. proposed the WCI and combined several remote sensing imagery signals to distinguish the two types of water [6]. Li et al. also utilized the Nemerow comprehensive pollution index (NCPI) to characterize the extent of pollution of urban water bodies, compared the results retrieved using six regression models, and obtained a regression model suitable for calculating the NCPI of a scene to detect black and odorous water bodies [7]. Yao et al. employed verification data from Shenyang to improve the band ratio method and then introduced the BOI algorithm [8]. Furthermore, Yao et al. proposed an HI threshold segmentation method based on PlanetScope images [9], while Zhang et al. enhanced the HI by suggesting the HCI [10]. Nevertheless, most of these methods utilize a single feature to perform the threshold segmentation based on a spectral analysis or a comprehensive comparison. In contrast, Li et al. proposed a classification based on an empirical decision tree involving multiple features, namely the DBWI, GR-NIR AWI, NDBWI, and green band features [11], and set reasonable thresholds to facilitate the identification and classification of black and odorous water bodies.
Although high extraction accuracy values were achieved in some areas according to previous investigations, empirical methods were employed in most of these studies for the selection of features and determination of thresholds, and thus, the results involve uncertainties. In addition, because black and odorous water bodies originate from multiple causes, mildly black and odorous water bodies can be difficult to distinguish from regular water bodies, which creates a so-called "fuzzy area" in classifications. Therefore, existing methods, especially single-feature threshold segmentation, exhibit shortcomings. To address these limitations, in the present study, the CART decision tree was applied to remote sensing and field data for Langfang in Hebei Province to propose an improved remote sensing method for recognizing black and odorous water bodies. The Gini index minimization criterion and binary recursive segmentation were used to determine the features and thresholds, and then a decision-tree model was constructed for classification. The category attributes of leaf nodes were defined by calculating the degree of membership, and this created fuzzy and confidence zones. The proposed method is characterized by high classification accuracy, and the fuzzy and confidence zones generated can facilitate field inspections and improve the efficiency of different applications, including engineering.
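The Gini minimization criterion with binary splitting mentioned above can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `gini` and `best_threshold` and the feature values are hypothetical, and only one split on one feature is shown, whereas CART repeats this search recursively over all features.

```python
from typing import Sequence, Tuple


def gini(labels: Sequence[int]) -> float:
    """Gini impurity of binary labels (1 = black and odorous, 0 = regular)."""
    n = len(labels)
    if n == 0:
        return 0.0
    p1 = sum(labels) / n
    return 1.0 - p1 ** 2 - (1.0 - p1) ** 2


def best_threshold(values: Sequence[float], labels: Sequence[int]) -> Tuple[float, float]:
    """Return (threshold, weighted Gini) of the best split 'value <= threshold'."""
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best = (float("nan"), float("inf"))
    for i in range(1, n):
        lo, hi = pairs[i - 1][0], pairs[i][0]
        if lo == hi:
            continue  # no threshold can separate identical values
        thr = (lo + hi) / 2.0  # candidate: midpoint between neighbouring values
        left = [lab for v, lab in pairs if v <= thr]
        right = [lab for v, lab in pairs if v > thr]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best[1]:
            best = (thr, score)
    return best


# Hypothetical feature values (e.g. a visible-band reflectance sum), not study data.
values = [0.010, 0.012, 0.015, 0.030, 0.034, 0.040]
labels = [1, 1, 1, 0, 0, 0]
thr, score = best_threshold(values, labels)
print(thr, score)
```

In this toy example the two classes are perfectly separable, so the best split has a weighted Gini impurity of zero; with real samples the minimum is usually above zero, which is what gives rise to mixed leaf nodes.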

Materials and Methods
In the present study, pits and ponds in Langfang were utilized to evaluate the suitability of remote sensing for the identification and monitoring of black and odorous water bodies in areas characterized by high pollution, scattered water bodies, and manual verification challenges. A total of 94 samples were collected from these water bodies in 2021 [12]. The transparency (SD), dissolved oxygen (DO), and redox potential (ORP) of the waters were measured on-site, while the ammonia nitrogen (NH₃-N) was determined in the laboratory using the samples collected [13-16]. The samples were divided into black and odorous and regular water bodies based on the four physicochemical parameters measured (see Table 1). According to the criteria in Table 1, 47 of the water bodies were regular, while the other 47 were black and odorous (Figure 1). Among the 94 samples, 47 were randomly selected as the training set for data analysis and model training, while the remaining 47 served as the validation set for accuracy assessment. Concurrently, Sentinel-2 imagery was employed for monitoring the water bodies using remote sensing technology. Owing to its high spatiotemporal resolution, Sentinel-2 imagery is widely utilized in monitoring the land surface, such as vegetation, soil cover, and water bodies. The Sentinel-2 image contains 13 bands, and the blue (B), green (G), red (R), and near-infrared (NIR) bands share an identical resolution of 10 m. Because the pits and ponds studied in Langfang cover >3,000 m², the 10 m band resolution of the Sentinel-2 imagery data was suitable for the present study. To match the field sampling time, Sentinel-2A data in synchronous transit were utilized to generate spectral data for the field sampling points, to create sample sets and images, and to perform the analysis. The spectral curves of the two types of water bodies based on preprocessing of data for the visible (VIS) and NIR bands are displayed in Figures 2 and 3.
As shown in Figure 2, compared with the regular water bodies, the black and odorous water bodies exhibit low reflectivity values, and the associated spectral curves for the Rrs (G) − Rrs (B) and Rrs (G) − Rrs (R) display relatively gentle changes. Generally, the optical characteristics of water are determined by algal pigment contents, suspended solids, and colored dissolved organic matter. Because of extreme organic pollution, black and odorous water bodies are often enriched in organic pollutants and nutrients. Consequently, black and odorous water bodies usually have higher contents of organic matter and suspended solids than regular water bodies. In addition, nutrients promote the growth of algae, thereby increasing the algal pigment contents. These pollutants and pigments account for changes in the optical properties of water bodies. For example, the high absorption and low backscattering of the colored dissolved organic matter and algal pigments significantly reduce the reflectivity of water in the VIS bands, so black and odorous water bodies usually have lower reflectance in the blue, green, and red bands. Thus, the Rrs (B) + Rrs (G) + Rrs (R) was also utilized as a feature of black and odorous water bodies.
These observations are consistent with those reported in previous studies. Owing to spectral differences between the two types of water bodies in the VIS and NIR bands, multiple indexes for the extraction of features representing black and odorous water bodies were derived from sampling data by setting empirical thresholds (see Table 2). However, because black and odorous water bodies originate from diverse causes and the optical properties of some are comparable to those of regular water bodies, fuzzy areas are not resolved if just one spectral index is used for the extraction of features, and this negatively affects the classification accuracy. Based on data generated in the present study, four typical indexes including the band ratio (BD), BOI, HI, and WCI (Table 2) were tested, and the results are shown in Figure 4. In the spectral index range between N1 and N2, several black and odorous water samples overlap with regular water samples, and this range is termed the fuzzy area. This fuzzy area causes misclassification of black and odorous waters and elevates the uncertainty in the selection of the threshold. The threshold selection is then often subjective, and this affects the classification accuracy.
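The overlap described above can be made concrete with a small sketch. It assumes, as an interpretation not stated in the text, that N1 and N2 are the lower and upper bounds of the index range in which both classes occur; the function name `fuzzy_interval` and the index values are hypothetical.

```python
from typing import Optional, Sequence, Tuple


def fuzzy_interval(black: Sequence[float],
                   regular: Sequence[float]) -> Optional[Tuple[float, float]]:
    """Return (N1, N2), the index range shared by both classes, or None."""
    n1 = max(min(black), min(regular))  # lower bound of the overlap
    n2 = min(max(black), max(regular))  # upper bound of the overlap
    return (n1, n2) if n1 <= n2 else None


# Hypothetical single-index values for the two classes, not study data.
black_idx = [0.02, 0.05, 0.08, 0.11]
regular_idx = [0.09, 0.12, 0.15, 0.20]

interval = fuzzy_interval(black_idx, regular_idx)
if interval is not None:
    n1, n2 = interval
    # Samples inside [N1, N2] cannot be separated by any single threshold.
    ambiguous = [v for v in black_idx + regular_idx if n1 <= v <= n2]
    print(interval, ambiguous)
```

Any threshold placed inside the returned interval misclassifies at least one of the samples that fall within it, which is why a single index leaves the choice of threshold subjective.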
Clearly, the effective identification of black and odorous water bodies using a single feature is difficult. Therefore, in the present study, multiple features are exploited to establish a remote sensing recognition model for black and odorous water bodies. In previous studies, the Rrs (NIR) was utilized as an analysis feature, but because algal bloom and duckweed can cause regular water bodies to exhibit optical properties comparable to those of black and odorous water bodies, the Rrs (NIR) was not considered in the selection of features in the present study to prevent the introduction of additional errors [19-22]. Therefore, based on the analysis of spectral features, the Rrs (G) − Rrs (B), Rrs (G) − Rrs (R), and Rrs (B) + Rrs (G) + Rrs (R) were used to extract features for the recognition of black and odorous water bodies through remote sensing. These combinations reflect spectral differences between the two types of water bodies better for the following reasons: the Rrs (B) + Rrs (G) + Rrs (R) is the sum of reflectance in the visible light bands, in which a black and odorous water body is characterized by a low reflectivity; the Rrs (G) − Rrs (B) and Rrs (G) − Rrs (R) are the reflectance differences between the G and B and the G and R bands, respectively, which reflect the smoothness of curves in the band ranges. In the present study, all features were expressed in identical units (sr⁻¹), while the training and test data sets were generated by random sampling.
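As an illustration, the three features listed above can be computed per sample from the B, G, and R remote sensing reflectance values. The sketch below is a minimal example; the names `Features` and `spectral_features` and the reflectance values are hypothetical and are not taken from the study.

```python
from typing import NamedTuple


class Features(NamedTuple):
    g_minus_b: float  # Rrs(G) - Rrs(B)
    g_minus_r: float  # Rrs(G) - Rrs(R)
    vis_sum: float    # Rrs(B) + Rrs(G) + Rrs(R)


def spectral_features(rrs_b: float, rrs_g: float, rrs_r: float) -> Features:
    """Compute the three visible-band features; inputs and outputs are in sr^-1."""
    return Features(
        g_minus_b=rrs_g - rrs_b,
        g_minus_r=rrs_g - rrs_r,
        vis_sum=rrs_b + rrs_g + rrs_r,
    )


# Hypothetical reflectance values (sr^-1) for one sampling point.
f = spectral_features(rrs_b=0.010, rrs_g=0.020, rrs_r=0.015)
print(f)
```

A flat, dark spectrum yields small differences and a small sum, which is the pattern the text associates with black and odorous water.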

Results
The CART model decision tree obtained in the present study is shown in Figure 6. According to the Rrs (G) − Rrs (B), Rrs (R) − Rrs (G), and Rrs (B) + Rrs (G) + Rrs (R) features, a classification involving categories A-D was produced.
This classification from the CART model is based on the degree of membership, which represents the probability that a set belongs to the black and odorous water body class (Table 3) and takes values in the [0, 1] interval. Initially, each set was assigned to a confidence or a fuzzy area according to its degree of membership [26,27]. A set with a degree of membership of 0 or 1 was assigned to the confidence area, and its category attribute was then determined: 1 was assigned to a black and odorous water body, while 0 was attributed to a regular water body [28]. A set with a degree of membership strictly between 0 and 1 was assigned to the fuzzy area, and its category attribute was uncertain. Following the principle of maximum likelihood classification, a set in the fuzzy area with a degree of membership of at least 0.5 was defined as a black and odorous water body, while a set with a degree of membership below 0.5 was defined as a regular water body. The attributes of categories A-D, which are based on these principles, are presented in Table 3.
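The zone assignment described above can be sketched as follows. The sketch assumes that the degree of membership of a leaf node is the fraction of its training samples that are black and odorous, which is a common reading of CART leaf probabilities but is not spelled out in the text; the function names and leaf counts are hypothetical.

```python
from typing import Tuple


def leaf_membership(n_black: int, n_total: int) -> float:
    """Assumed definition: share of black and odorous training samples in a leaf."""
    return n_black / n_total


def classify_leaf(membership: float) -> Tuple[str, str]:
    """Return (zone, label) for a leaf with the given degree of membership."""
    # Membership of exactly 0 or 1 means the leaf is pure: confidence zone.
    zone = "confidence" if membership in (0.0, 1.0) else "fuzzy"
    # Maximum likelihood rule: at least 0.5 is black and odorous.
    label = "black and odorous" if membership >= 0.5 else "regular"
    return zone, label


# Hypothetical leaves given as (black and odorous samples, total samples).
leaves = {"leaf 1": (12, 12), "leaf 2": (0, 15), "leaf 3": (7, 10), "leaf 4": (3, 10)}
for name, (n_black, n_total) in leaves.items():
    m = leaf_membership(n_black, n_total)
    print(name, round(m, 2), classify_leaf(m))
```

Only the leaves that land in the fuzzy zone would then be flagged for field verification or visual interpretation.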

Model Accuracy Evaluation and Analysis.
The test set was used to verify the accuracy of the model, which was calculated using the following expression:
$$\text{Overall accuracy} = \frac{M}{n} \times 100\%,$$
where n represents the total number of sample points and M is the number of correctly classified sample points. Among the 47 sample points in the test set, 39 were correctly classified, which produced an overall accuracy of 82.97%. The associated kappa coefficient of 0.7571 highlights the consistency of the data and the accuracy of the model. The results from the CART model were then compared with those obtained from other approaches commonly used to extract features representing black and odorous water bodies. For the existing methods, the training set served for recalibration of the thresholds and the extraction of features associated with black and odorous water bodies, whereas the test set was utilized to evaluate the accuracy. According to the results, the CART model proposed in the present study produced the highest accuracy (82.97%), followed by the multi-feature decision tree model of Li (74.19%), and then the single-feature threshold segmentation models (BOI = 73.12%, WCI = 72.04%, and BD = 63.44%).
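The accuracy expression above, together with the standard definition of Cohen's kappa, can be sketched as follows. The counts 39 and 47 come from the text; the 2x2 confusion matrix is hypothetical, because the per-class counts of the test set are not given here, so the kappa printed below is not the reported value of 0.7571.

```python
def overall_accuracy(correct: int, total: int) -> float:
    """Overall accuracy in percent: M / n * 100."""
    return 100.0 * correct / total


def cohen_kappa(tp: int, fn: int, fp: int, tn: int) -> float:
    """Cohen's kappa for a 2x2 confusion matrix (standard definition)."""
    n = tp + fn + fp + tn
    p_observed = (tp + tn) / n
    # Chance agreement from the row (actual) and column (predicted) totals.
    p_expected = ((tp + fn) * (tp + fp) + (tn + fp) * (tn + fn)) / (n * n)
    return (p_observed - p_expected) / (1.0 - p_expected)


# Counts from the text: 39 of 47 test samples were correctly classified.
acc = overall_accuracy(39, 47)
print(f"overall accuracy = {acc:.2f}%")

# Hypothetical confusion matrix, used only to show the kappa calculation.
kappa = cohen_kappa(tp=20, fn=5, fp=5, tn=20)
print(f"kappa (hypothetical matrix) = {kappa:.2f}")
```

The ratio 39/47 is approximately 0.8298, which agrees with the reported 82.97% up to the last digit.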
In the present study, the extraction accuracies of the confidence and fuzzy zones were also evaluated. For the test set, among the 32 samples extracted into the confidence zone, 29 were correctly classified, which represents an accuracy of 90.63%. Similarly, out of the 15 samples extracted into the fuzzy area, 10 were correctly classified, yielding an accuracy of 66.67%. Evidently, the error involved in the CART model originates largely from the fuzzy zone. This is mainly because black and odorous water bodies are linked to multiple causes, and thus, some are mistaken for regular water bodies in the feature space.
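As a quick consistency check on the zone figures above, the following sketch recomputes the per-zone accuracies from the counts in the text and shows how they combine into the overall accuracy. The variable names are hypothetical.

```python
# (correctly classified, total) test samples per zone, as reported in the text.
zones = {"confidence": (29, 32), "fuzzy": (10, 15)}


def accuracy_pct(correct: int, total: int) -> float:
    return 100.0 * correct / total


zone_acc = {name: accuracy_pct(c, t) for name, (c, t) in zones.items()}

total_correct = sum(c for c, _ in zones.values())
total = sum(t for _, t in zones.values())
overall = accuracy_pct(total_correct, total)

# The overall accuracy equals the sample-weighted mean of the zone accuracies.
weighted = sum(t / total * zone_acc[name] for name, (_, t) in zones.items())

# Share of all misclassified samples that fall in the fuzzy zone.
errors = {name: t - c for name, (c, t) in zones.items()}
fuzzy_error_share = 100.0 * errors["fuzzy"] / sum(errors.values())

print(zone_acc, overall, weighted, fuzzy_error_share)
```

The two zones together account for 39 correct samples out of 47, matching the overall result, and the fuzzy zone contributes 5 of the 8 misclassified samples although it holds only 15 of the 47 samples.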

Temporal and Spatial Characteristics of a Black and Odorous Water Body.
Based on the decision-tree model established, remote sensing monitoring was performed from July to September 2021 (Figures 7-9), and 30 points were randomly selected for field verification each month. The accuracy values from the monitoring and verification study are presented in Table 4. The accuracy of this model in application is good, and thus, it is suitable for engineering endeavors requiring the identification of black and odorous water bodies. Beyond the classification of water bodies, the monitoring shows that the overall distribution of black and odorous water bodies decreased each month from July to September 2021. At the end of September, no black and odorous water body was present in the built-up areas of any county (city or district), and thus, this problem was effectively controlled.
This effectiveness is attributed to the intensive measures introduced in all localities in recent years. However, an imbalance in the treatment of black and odorous water bodies in Langfang still exists. Several black and odorous water bodies are present in rural areas and at boundaries between urban and rural areas, which highlights a pattern of overall dispersion and local aggregation. Hotspots are concentrated in the north, central, east, and south areas of Langfang. The terrain in the central, east, and south areas is relatively low, and pits and ponds are common. In the north, the animal husbandry and food processing industries are relatively developed in Sanhe and Dachang County, while the central, east, and south areas are characterized by concentrated enterprises and high population densities. Therefore, activities associated with production and daily life, which involve the discharge of sewage, are more intense in these regions, and these elevate the probability of creating black and odorous water bodies.

Analysis of the Cause of a Black and Odorous Water Body.
Evidently, from July to September 2021, black and odorous water bodies decreased significantly in Langfang. Because of the regulations implemented in Langfang in recent years, black and odorous water bodies were eliminated in built-up areas (cities and districts); major bodies disappeared, while small local bodies remained in rural and urban-rural areas. In counties (cities and districts) and other built-up areas, the treatment of black and odorous water bodies has been adequate, and the remediation effect is obvious. At present, almost no black and odorous water body is present in these areas; however, in rural and urban-rural areas, black and odorous water bodies are more difficult to control because of the high traffic and poor underground pipe network [29-34]. Therefore, although black and odorous water bodies have been significantly reduced in these areas, their elimination still requires time. Considering the field verification results (Figure 10), the formation of black and odorous water bodies in rural areas of Langfang is attributed mainly to the following:
(1) Garbage removal and management problems: owing to the untimely removal and transportation of garbage in rural areas, pits and ponds are the main dumping places, and thus, leachates from domestic garbage enter water bodies through processes at the surface.
(2) Domestic sewage discharge problem: in rural areas, because of the poor underground pipe network, domestic sewage is commonly discharged into pits and ponds, and this promotes the accumulation of organic matter.
(4) Poor fluidity and insufficient self-purification capacity of the pits and ponds: pits and ponds are abundant in rural areas of Langfang, and in low-lying areas, the waters in these originate mostly from surface runoff. These bodies generally occupy small areas, which are characterized by poor fluidity and inadequate self-purification capacities, and these limitations are favorable for the production of black and odorous water bodies.

Discussion
In the engineering application of remote sensing monitoring of black and odorous water bodies, after the extraction of the associated features, field verification or remote sensing interpretation marks are also required for discrimination and to improve the accuracy. The CART model utilized in the present study adequately differentiates the extracted features associated with black and odorous water bodies and, thus, improves the classification accuracy. According to the results, the accuracy of extracting features representing black and odorous water bodies in the confidence zone is high. In engineering applications, such features can therefore be extracted without field verification or visual interpretation. Results for the fuzzy area reveal samples that can easily be confused with regular water samples, and this area is characterized by a relatively low accuracy. Field verification or visual interpretation, however, can be performed as needed to improve the accuracy. The CART model and the confidence and fuzzy zones proposed in the present study can lessen the field verification burden for diverse applications. The proposed approach can improve the efficiency of engineering applications because of its high accuracy for the extraction of features representing black and odorous waters.

Conclusions
In the present study, water samples collected from 94 pits and ponds in Langfang, Hebei Province, in 2021 were characterized. These data were combined with data from the Sentinel-2A imagery for the same period to highlight spectral differences between black and odorous water and regular water bodies. Based on the CART model algorithm, an improved method for identifying black and odorous water bodies through remote sensing was proposed. Three feature extraction parameters, including the Rrs (R) − Rrs (G), Rrs (B) + Rrs (G) + Rrs (R), and Rrs (NIR), were used to create a decision-tree model. The results produced a classification, which facilitated the extraction of information associated with black and odorous water bodies. The main findings of the present study are summarized as follows:
(1) Spectral differences in the visible bands distinguished black and odorous water bodies from others. Overall, black and odorous water bodies produced weak reflectance values in the visible bands, and the spectral curve variation was relatively gentle. The Rrs (R) − Rrs (B), Rrs (R) − Rrs (G), and Rrs (B) + Rrs (G) + Rrs (R) exhibited potential for adequate characterization of these features, and thus, these can be exploited for the extraction of black and odorous water body information from remote sensing data.
(2) The CART model was constructed based on the Gini index minimization criterion, and the classification reflected the degree of membership of leaf nodes. Features associated with a degree of membership value ≥0.5 were considered as black and odorous water bodies, while regular water bodies showed values <0.5. According to the test set data, the overall accuracy of the CART model was 84.78%, while the kappa coefficient was 0.783, which highlights a strong potential for the extraction of black and odorous water features from satellite imagery.
(3) In the present study, the confidence and fuzzy regions were defined according to the degree of membership. The classification accuracy associated with the confidence region was 90.63%, while that of the fuzzy area was 66.67%. The classification method proposed in the present study reduces the field verification and satellite data interpretation needed to identify black and odorous water bodies, which enhances the efficiency of diverse applications, especially engineering.

Data Availability
The experimental data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest
The authors declare that there are no conflicts of interest.

Authors' Contributions
Q.Z. and X.D. conceived and designed the study; Q.Z., X.D., and Y.Q. collected and analyzed the data; Q.Z. and X.D. wrote the manuscript; and G.L. and Y.J. reviewed and edited the manuscript. All authors have read and agreed to the published version of the manuscript.