Research on Sewage Monitoring and Water Quality Prediction Based on Wireless Sensors and Support Vector Machines

Water resource protection has an important impact on ecosystem security and human survival, so water quality testing and early warning of sewage status are receiving increasing attention. To address the information transmission delay and insufficient water quality prediction of current water quality monitoring, this paper proposes a wireless sensor-based dynamic water quality monitoring and prediction technology. First, wireless sensor technology and the ZigBee protocol are used to establish a sewage monitoring model that dynamically monitors total nitrogen, total phosphorus, ammonia nitrogen, and other water quality indicators of the basin in real time. Second, on the basis of wireless monitoring, a support vector machine algorithm is used to construct a water quality prediction model that makes reasonable predictions of the water quality of the watershed. Finally, the actual test results show that the technology can monitor the water quality of the watershed automatically and in real time and meets the requirements of water quality monitoring in practical applications.


Introduction
The water environment refers to the environment in which lakes, rivers, oceans, and other water bodies are located. Changes in the water environment have a serious impact on water quality [1]. Whether the environment is contaminated is determined by testing the physical and chemical properties of the water. The water environment is an inseparable part of the ecosystem and is the basis for human survival and development. However, with the advancement of human science and technology, the water environment is increasingly polluted [2]. The Netherlands, Japan, the United States, and other countries have conducted research on water quality automatic monitoring systems and applied them to actual water quality monitoring [3]. The Netherlands has established an expensive water quality monitoring system [4] in the lower reaches of the Ames River. This system monitors the parameters of ammonia, nitrogen, total phosphorus, and pH in real time and includes an alarm device. A number of automatic water quality monitoring stations are used to monitor the water quality changes of the Windlass River in the United Kingdom [2]. This system uses wireless sensor network technology, and monitoring personnel can control the monitoring stations remotely.
As wireless sensor technology has developed, automatic water quality monitoring systems have also adopted wireless sensor networks. Typical representatives are the distributed sensor network water quality automatic monitoring system designed in the United States [5], the wireless sensor network lake water quality system of the University of Virgin Mary [6], and the intelligent coastal water quality automatic monitoring system designed in Ireland [7]. Wireless sensor networks are widely used in these systems, reflecting their advantages of wide distribution, high density, low energy consumption, and low cost [8]. Compared with traditional sensor networks, the construction of wireless sensor networks does not damage the natural environment. Because sensor nodes communicate with one another, the automatic water quality monitoring system can cover a wide monitoring area [9]. When a node in the wireless sensor network fails or its power supply is insufficient, new sensor nodes can be added to the network to keep communication smooth.
Although the above water quality automatic monitoring systems all use wireless sensor networks, many problems remain [10]. Current wireless sensor water quality automatic monitoring systems mainly suffer from high power consumption and unreliable communication and need to be improved in both respects. Moreover, they mainly monitor oceans and lakes and rarely address automatic water quality monitoring in watersheds [11]. Therefore, this paper proposes a sewage monitoring and water quality prediction model based on wireless sensors and support vector machines. The technical scheme of water quality monitoring is studied by combining wireless sensor hardware design with data processing and optimization techniques such as support vector machines.

Sewage Detection Based on Wireless Sensor Networks
With urbanization, the discharge of domestic and factory sewage and the loss of farmland pesticides and fertilizers cause eutrophication of the water in the basin. When the total nitrogen, total phosphorus, and ammonia nitrogen in the water of the watershed exceed the watershed's standard, algae and other plankton multiply, the dissolved oxygen content in the water drops sharply, and fish and other organisms die [12,13]. Total nitrogen, total phosphorus, and ammonia nitrogen are difficult to decompose naturally, so contaminated water may flow into the drinking water system and threaten people's lives and health. Monitoring total nitrogen, total phosphorus, ammonia nitrogen, and chemical oxygen demand (COD) is therefore of great significance. The COD content reflects the overall status of the water quality. By monitoring the content of total nitrogen, total phosphorus, and ammonia nitrogen, the types of pollutants can be judged and corresponding treatment can be applied to ensure the safety of the water [14]. This paper therefore proposes a watershed water quality automatic monitoring system based on wireless sensors that monitors the four indicators of total nitrogen, total phosphorus, ammonia nitrogen, and COD in real time. The water quality monitoring subnode sends the collected data to the water quality monitoring base station through the ZigBee protocol [15]. This paper mainly designs the wireless sensor network water quality monitoring base stations and monitoring nodes and establishes a model for the distribution of wireless sensor monitoring nodes in order to improve the utilization of sensor nodes, reduce network costs, expand the monitoring range of the wireless sensor network, and improve the ability to control pollutants on site.
2.1. Overall Design. The overall structure of the wireless sensor network water quality automatic monitoring system is shown in Figure 1. The system contains three main subsystems: a data acquisition subsystem, a control and communication subsystem, and a data management subsystem. The data monitoring nodes are distributed on the sections that are easily contaminated. The four indicators of total nitrogen, total phosphorus, ammonia nitrogen, and COD are monitored in real time, and an alarm is issued as soon as any of them is found to exceed the standard [16]. In this way, the monitoring personnel can deal with the pollutants in time to prevent them from spreading and to ensure the safety of the water quality.
Wireless Communications and Mobile Computing
The data acquisition subsystem consists of sensors that measure total nitrogen, total phosphorus, ammonia nitrogen, and COD. The working principle of these four sensors is that when the sensor probe is immersed in water, a current will be generated on the two diaphragms. The magnitude of the current depends on the content of the detected object in the water. The microcontroller measures the magnitude of the current through an analog input circuit and performs temperature compensation on this value to obtain an accurate current value. The system includes data monitoring subnodes and data base stations. The control and communication subsystem is composed of the ZigBee communication module of wireless sensor network monitoring subnodes, ZigBee communication module of data base station, and 5G communication module [17]. The wireless sensor network subnode communicates with the data base station through the ZigBee protocol. The data base station sends data to the remote server through the 5G module. Users can use a computer to access the data collected by the wireless sensor network subnode. The data monitoring node can be controlled by the remote client or by the touch screen of the data base station. Users can reduce the power consumption and increase the service life of the data monitoring node by controlling the sampling frequency and working mode of the data monitoring node.
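The measurement step described above (probe current, temperature compensation, and an alarm when an indicator exceeds its standard) can be illustrated with a minimal Python sketch. The linear calibration model, the `Calibration` fields, the coefficient values, and the 1.5 mg/L limit are all assumptions made for illustration; the paper does not give the calibration law or the thresholds it uses.

```python
from dataclasses import dataclass


@dataclass
class Calibration:
    """Hypothetical linear calibration for one probe (values are not from the paper)."""
    slope_mg_per_l_per_ua: float  # concentration change per microampere of probe current
    offset_mg_per_l: float        # concentration reported at zero current
    temp_coeff_per_c: float       # fractional current change per degree Celsius
    ref_temp_c: float = 25.0      # temperature at which the probe was calibrated


def compensate_current(current_ua, temp_c, cal):
    """Refer the measured current back to the calibration temperature."""
    return current_ua / (1.0 + cal.temp_coeff_per_c * (temp_c - cal.ref_temp_c))


def to_concentration(current_ua, temp_c, cal):
    """Convert a probe current (uA) into a concentration (mg/L)."""
    corrected = compensate_current(current_ua, temp_c, cal)
    return cal.slope_mg_per_l_per_ua * corrected + cal.offset_mg_per_l


def exceeds_limit(concentration, limit):
    """Alarm condition: the indicator is above its permitted limit."""
    return concentration > limit


# Illustrative ammonia nitrogen reading taken at 30 degrees Celsius.
ammonia_cal = Calibration(slope_mg_per_l_per_ua=0.05,
                          offset_mg_per_l=0.0,
                          temp_coeff_per_c=0.02)
reading = to_concentration(current_ua=44.0, temp_c=30.0, cal=ammonia_cal)
alarm = exceeds_limit(reading, limit=1.5)  # the limit is a placeholder value
```

A real probe would need its own calibration curve from the sensor data sheet; the point of the sketch is only the order of operations: compensate, convert, compare.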
The data management subsystem is based on a remote server, and remote clients can access the server to obtain the water quality status. The data management subsystem not only receives new water quality data in real time and updates the database but also saves previous data for a period of time. Through the comparison of the data, the trend of the water quality in the river can be obtained. In order to fully grasp the status of river water quality, the wireless sensor network automatic monitoring system needs to monitor different sections of the river [18]. Considering the problems of cost and utilization efficiency of sensor nodes, it is necessary to model the distribution of wireless sensor networks to achieve the optimal distribution of sensors.

Wireless Sensor Network Data Monitoring Subnode Structure
The data monitoring subnode is the basic component of the water quality automatic monitoring system based on the basin's wireless sensor network. It has four main functions. The first function is to collect the total nitrogen, total phosphorus, ammonia nitrogen, and COD content of the water. Each data monitoring subnode is equipped with a sensor for measuring these four indicators and performs linear processing and temperature compensation on the data returned by the sensor. The second function is to store the collected data. The data monitoring node does not send the data to the data base station immediately after processing; it first stores the data in a storage area of the microcontroller and sends it once the data base station's wireless communication module is idle [19]. The third function is to receive commands from the data base station. According to these commands, each data monitoring subnode changes its sampling frequency and working mode to reduce power consumption and extend its service life. The fourth function is to communicate wirelessly with the data base station. The ZigBee communication protocol is used here because ZigBee technology has the advantages of low energy consumption, low cost, and fast communication. The ZigBee-based data monitoring subnodes form a tree-shaped communication network, which not only has a wide monitoring range but also makes it easy to integrate new data monitoring nodes. The structure of the data monitoring subnode is shown in Figure 2.
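The store-then-send behaviour and the command handling described above can be sketched as follows. This is a simplified Python model, not firmware for the actual microcontroller; the class name `MonitoringSubnode`, the buffer capacity, the command keys, and the callback signatures are hypothetical.

```python
from collections import deque


class MonitoringSubnode:
    """Toy model of a subnode that buffers readings until the base station is idle."""

    def __init__(self, capacity=64, sample_interval_s=300):
        # A bounded buffer: when it is full, the oldest reading is discarded.
        self.buffer = deque(maxlen=capacity)
        self.sample_interval_s = sample_interval_s
        self.mode = "active"

    def store(self, reading):
        """Second function: keep the processed reading in local storage."""
        self.buffer.append(reading)

    def flush(self, base_station_idle, send):
        """Send buffered readings, oldest first, only while the base station is idle."""
        sent = 0
        while self.buffer and base_station_idle():
            send(self.buffer.popleft())
            sent += 1
        return sent

    def handle_command(self, command):
        """Third function: apply a base station command (a dict in this sketch)."""
        if "sample_interval_s" in command:
            self.sample_interval_s = command["sample_interval_s"]
        if "mode" in command:
            self.mode = command["mode"]


node = MonitoringSubnode()
for value in (1.2, 1.3, 1.1):
    node.store(value)

received = []
node.flush(base_station_idle=lambda: False, send=received.append)  # link busy, nothing sent
node.flush(base_station_idle=lambda: True, send=received.append)   # link idle, buffer drained
node.handle_command({"sample_interval_s": 600, "mode": "standby"})
```

Lengthening the sampling interval and switching to a standby mode are the two levers the text names for reducing power consumption.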
The ZigBee protocol includes four layers of the 7-layer OSI network communication model. The physical layer provides the physical medium for transmitting and receiving ZigBee data packets and performs the conversion of electrical signals. The MAC layer determines the ZigBee communication address when sending data and, when receiving data, determines whether a data packet is destined for the node itself. The network layer determines the forwarding direction of data packets [20]. The application layer is exposed to users so that different user needs can be met. With the help of the ZigBee protocol stack, users only need to work at the application layer to realize wireless communication. The data monitoring subnode is mainly composed of six modules: the ATMEGA module, sensor module, I/V module, gain control amplification module, A/D module, and ZigBee wireless communication module. The sensor module returns a current signal, which is converted into a voltage signal by the I/V module. Because the voltage signal may not be within the input range of the A/D module, it needs to be amplified or attenuated [21]. The I/V module, gain control module, and A/D module operate under the control of the microcontroller to ensure the accuracy of the data. The ATMEGA processor module receives the digital signal from the A/D module and passes it to the ZigBee wireless communication module over the UART. The ZigBee wireless communication module sends data at the set transmission frequency. If it cannot communicate directly with the data base station, the data is relayed to the data base station through other nodes.
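The analog signal chain described above (current to voltage, gain selection, A/D conversion) can be illustrated numerically. The sketch below is in Python for readability; the feedback resistance, the available gain steps, the 3.3 V full-scale range, and the 10-bit resolution are assumptions and are not specified in the paper.

```python
def current_to_voltage(current_a, feedback_ohms):
    """I/V stage: an ideal transimpedance conversion, V = I * R."""
    return current_a * feedback_ohms


def choose_gain(voltage, full_scale_v, gains=(0.5, 1, 2, 4, 8, 16)):
    """Pick the largest gain that keeps the signal inside the A/D input range.

    If even the smallest gain overflows, fall back to the smallest gain.
    """
    usable = [g for g in gains if abs(voltage) * g <= full_scale_v]
    return max(usable) if usable else min(gains)


def adc_code(voltage, full_scale_v, bits=10):
    """A/D stage: clamp to the input range and quantize to an integer code."""
    clamped = min(max(voltage, 0.0), full_scale_v)
    return round(clamped / full_scale_v * (2 ** bits - 1))


# Illustrative values: a 40 uA probe current into a 10 kOhm I/V stage.
v_in = current_to_voltage(40e-6, 10_000)      # about 0.4 V
gain = choose_gain(v_in, full_scale_v=3.3)    # amplify to use more of the A/D range
code = adc_code(v_in * gain, full_scale_v=3.3)
```

Choosing the largest non-saturating gain uses as much of the converter's resolution as possible, which is the reason the text gives for placing a gain stage before the A/D module.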

Data Base Station Structure of Wireless Sensor Networks Based on Watersheds
The wireless sensor network data base station is the core of the basin-based wireless sensor network water quality automatic monitoring system. Its main functions are to control the data monitoring subnodes, collect their data, display the data on the touch screen, and send it to the remote client through the 5G module. A tree-like routing protocol is used to transmit monitoring data from the child nodes to the base station node. Each node sends its information through its parent node until the data is received by the data base station. Another feature of this tree routing protocol is that the nodes stay in standby mode most of the time, which extends their service life [22,23]. To achieve this, each node runs a real-time system to ensure that it can return to its normal working state under the control of the data base station. The data base station can display the collected total nitrogen, total phosphorus, ammonia nitrogen, and COD information on the touch screen, so users can understand the water quality status from the collected information. The data base station can also send the collected data to the remote client through the 5G module so that researchers can monitor the data monitoring nodes remotely. The internal mechanical structure of the sensor is shown in Figure 3.
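The tree routing rule described above, in which each node hands its data to its parent until the base station is reached, can be sketched in a few lines of Python. The node names and the `parent_of` table are hypothetical; a real ZigBee stack derives parent relationships during network formation.

```python
def route_to_base_station(node, parent_of, base_station="BS"):
    """Return the hop-by-hop path from a node to the base station.

    parent_of maps each node to its parent in the routing tree.
    """
    path = [node]
    seen = {node}
    while path[-1] != base_station:
        parent = parent_of.get(path[-1])
        if parent is None:
            raise ValueError(f"{path[-1]} has no parent and is not the base station")
        if parent in seen:
            raise ValueError("routing loop detected")
        path.append(parent)
        seen.add(parent)
    return path


# Hypothetical tree: two sensors behind router R1, one behind R2, R2 behind R1.
parent_of = {"S1": "R1", "S2": "R1", "S3": "R2", "R2": "R1", "R1": "BS"}
path_s3 = route_to_base_station("S3", parent_of)
```

Because every node only needs to know its parent, a new monitoring node can join by attaching to any node already in the tree.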
The data base station structure of the wireless sensor network based on the watershed contains five modules, including the ZigBee communication module, PXA270 module, touch screen display module, and 5G remote data communication module [24]. The ZigBee wireless module receives the data sent by the data monitoring subnodes and passes it to the PXA270 processor through the SPI bus. The PXA270 processor analyzes the received data; extracts the data reflecting total nitrogen, total phosphorus, ammonia nitrogen, and COD; and then displays the extracted data on the touch screen through the Modbus bus. The touch screen can also receive user operation commands and display the data of the node selected by the user.

Research on Water Quality Prediction Based on Support Vector Machines
In the field of automatic water quality monitoring, the high-cost monitoring equipment of the past will be replaced by a large number of wireless sensor network nodes with low prices, superior performance, and strong mobility. This not only reduces the cost of monitoring but also increases the scope of water quality monitoring. Initially, sensors were distributed randomly, and the sensors in the area to be measured were deployed by airplane spreading or artillery ejection [25]. However, random distribution does not give efficient coverage, especially when the wireless sensors are concentrated in one place and few sensors fall in the critical part of the area to be measured. Therefore, algorithms must be used to optimize sensor deployment and data processing in order to improve sensor coverage and effective utilization. This section proposes a support vector machine-based water quality prediction and wireless sensor network distribution algorithm, which solves the problems of traditional algorithms in water quality monitoring applications [26]. The support vector machine (SVM) is an algorithm proposed by Vapnik et al. on the basis of VC dimension theory and the structural risk minimization principle of statistical learning theory [27]. Vapnik et al. also described regression estimation and signal processing methods based on support vector machines in detail, which broadened the research field of support vector machines. Because the algorithm has outstanding regression and classification performance, it has been widely used and studied in many fields. In recent years, it has moved to the forefront of research in artificial intelligence and complex nonlinear science [28]. Support vector machines can be used to solve recognition and regression problems, both of which are treated as quadratic programming problems [28].
Support vector machines work with data that is linearly separable in the feature space: an input space that is not linearly separable is mapped to a high-dimensional feature space through a kernel function. Based on limited sample information, the method seeks the best compromise between the complexity of the model (that is, the learning accuracy on the given training samples) and its learning ability (that is, the ability to recognize arbitrary samples without error) in order to obtain the best generalization ability [29].
3.1. Support Vector Machine Algorithm. The basic idea of the support vector machine (SVM) is to find the optimal separating hyperplane in the original space when the samples are linearly separable [30]. When they are not linearly separable, slack variables are added, and the samples of the low-dimensional input space are mapped to a high-dimensional feature space through a nonlinear mapping so that they become linearly separable. This makes it possible to apply linear algorithms in the high-dimensional feature space to analyze samples that are nonlinear in the input space and to find the optimal hyperplane in the feature space [31].
One of the core ideas of the SVM method is to find the optimal classification surface for a two-class classification problem, which leads to the concept of support vectors [32]. Another core idea is to map the sample set to a high-dimensional or even infinite-dimensional Hilbert space (called a feature space) through a nonlinear mapping, so that a problem that is nonlinear in the sample space becomes linear in the high-dimensional space, thus solving the nonlinear problem [33]. Unlike the classification problem, regression has only one type of sample point. The optimal hyperplane sought does not separate two types of sample points as far as possible; instead, it minimizes the "total deviation" of all sample points from the hyperplane. In this case, all the sample points lie between the two boundary lines, and finding the optimal regression hyperplane is also equivalent to finding the maximum interval [34].
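A toy example may make the mapping idea concrete. In the Python sketch below, four one-dimensional samples cannot be separated by any single threshold, but after the hypothetical feature map φ(x) = (x, x²) a linear rule separates them. The data, the feature map, and the weights are chosen purely for illustration and do not come from the paper.

```python
def separable_by_threshold(xs, labels):
    """Check whether any single threshold on the line separates the two classes."""
    pts = sorted(xs)
    candidates = [pts[0] - 1.0]
    candidates += [(a + b) / 2.0 for a, b in zip(pts, pts[1:])]
    candidates += [pts[-1] + 1.0]
    for t in candidates:
        for sign in (1, -1):  # try both orientations of the threshold
            predicted = [1 if sign * (x - t) > 0 else -1 for x in xs]
            if predicted == list(labels):
                return True
    return False


def phi(x):
    """Hypothetical nonlinear feature map from 1-D input to a 2-D feature space."""
    return (x, x * x)


def linear_rule(z, w=(0.0, 1.0), b=-2.5):
    """A linear classifier in the feature space: sign(w . z + b)."""
    return 1 if w[0] * z[0] + w[1] * z[1] + b > 0 else -1


xs = [-2.0, -1.0, 1.0, 2.0]
labels = [1, -1, -1, 1]  # outer points are one class, inner points the other

separable_in_input_space = separable_by_threshold(xs, labels)
separable_in_feature_space = all(linear_rule(phi(x)) == y for x, y in zip(xs, labels))
```

In the feature space the rule reduces to comparing x² with 2.5, which is a straight line in the (x, x²) plane even though it is not a single threshold on x.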
3.1.1. ε-Insensitive Loss Function. The support vector machine regression algorithm (SVR) is a regression algorithm based on support vector machines. It needs a suitable loss function in order to preserve the important properties of support vector machines [35]. SVR uses Vapnik's ε-insensitive function as its error function (i.e., when the error is less than $\varepsilon$, it is regarded as no error):

$$|y - f(x)|_{\varepsilon} = \max\{0,\ |y - f(x)| - \varepsilon\}.$$
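The effect of the ε-insensitive loss can be seen in a small Python sketch. It defines the loss and then fits a one-dimensional linear model by subgradient descent on a regularized average of that loss. Note that subgradient descent is a simplification swapped in for illustration; it is not the quadratic programming solution discussed in this paper. The synthetic data, the regularization weight `lam`, the step size, and the iteration count are all assumptions.

```python
def epsilon_insensitive_loss(y_true, y_pred, epsilon):
    """Errors smaller than epsilon cost nothing; larger errors cost the excess."""
    return max(0.0, abs(y_true - y_pred) - epsilon)


def train_linear_svr(xs, ys, epsilon=0.05, lam=1e-3, lr=0.01, epochs=5000):
    """Fit y ~ w*x + b by minimizing lam/2 * w^2 + mean epsilon-insensitive loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w, grad_b = lam * w, 0.0
        for x, y in zip(xs, ys):
            residual = y - (w * x + b)
            if residual > epsilon:       # prediction too low: push it up
                grad_w -= x / n
                grad_b -= 1.0 / n
            elif residual < -epsilon:    # prediction too high: push it down
                grad_w += x / n
                grad_b += 1.0 / n
            # samples inside the epsilon tube contribute no gradient
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Synthetic noise-free data on the line y = 2x + 1 (not measurements from the paper).
xs = [i / 10 for i in range(10)]
ys = [2.0 * x + 1.0 for x in xs]
w, b = train_linear_svr(xs, ys)
worst_error = max(abs((w * x + b) - y) for x, y in zip(xs, ys))
```

The fitted line does not have to reproduce the data exactly: any line that keeps all samples inside the ε-tube has zero loss, and the regularization term then prefers the flattest such line.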
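Let the training set be

$$T = \{(x_1, y_1), \ldots, (x_l, y_l)\}, \quad x_i \in \mathbb{R}^n,\ y_i \in \mathbb{R},\ i = 1, \ldots, l.$$

Using the above ε-insensitive loss function and restricting the regression estimation function to the set of linear functions $f(x) = w \cdot x + b$, the optimal regression hyperplane is sought on the basis of the structural risk minimization principle. When the distance from all sample points to the required hyperplane is less than $\varepsilon$, the problem is transformed into solving the following convex quadratic programming problem [36]:

$$\min_{w,\,b}\ \frac{1}{2}\|w\|^2.$$

The constraints are as follows:

$$|y_i - (w \cdot x_i + b)| \le \varepsilon, \quad i = 1, \ldots, l.$$

When the distance from some sample points to the optimal hyperplane is greater than $\varepsilon$, slack variables $\xi_i, \xi_i^*$ are introduced and a fault tolerance penalty coefficient $C$ is constructed. The optimization problem is then transformed into [37]

$$\min_{w,\,b,\,\xi,\,\xi^*}\ \frac{1}{2}\|w\|^2 + C \sum_{i=1}^{l} (\xi_i + \xi_i^*).$$

Restrictions are as follows:

$$y_i - w \cdot x_i - b \le \varepsilon + \xi_i, \qquad w \cdot x_i + b - y_i \le \varepsilon + \xi_i^*, \qquad \xi_i,\ \xi_i^* \ge 0, \quad i = 1, \ldots, l.$$

The solution of SVM to nonlinear problems is shown in Figure 4.
Introduce the Lagrange function to get its dual form:

$$\max_{\alpha,\,\alpha^*}\ -\frac{1}{2} \sum_{i=1}^{l} \sum_{j=1}^{l} (\alpha_i - \alpha_i^*)(\alpha_j - \alpha_j^*)(x_i \cdot x_j) - \varepsilon \sum_{i=1}^{l} (\alpha_i + \alpha_i^*) + \sum_{i=1}^{l} y_i (\alpha_i - \alpha_i^*).$$

Restrictions are as follows:

$$\sum_{i=1}^{l} (\alpha_i - \alpha_i^*) = 0, \qquad 0 \le \alpha_i,\ \alpha_i^* \le C, \quad i = 1, \ldots, l.$$

According to the KKT conditions, there are

$$\alpha_i\,(\varepsilon + \xi_i - y_i + w \cdot x_i + b) = 0, \qquad \alpha_i^*\,(\varepsilon + \xi_i^* + y_i - w \cdot x_i - b) = 0,$$

$$(C - \alpha_i)\,\xi_i = 0, \qquad (C - \alpha_i^*)\,\xi_i^* = 0.$$

From the above formula, with $w = \sum_{x_j \in SV} (\alpha_j - \alpha_j^*)\,x_j$, $b$ can be obtained as follows:

$$b = \frac{1}{N_{NSV}} \left\{ \sum_{0 < \alpha_i < C} \left[ y_i - \sum_{x_j \in SV} (\alpha_j - \alpha_j^*)(x_j \cdot x_i) - \varepsilon \right] + \sum_{0 < \alpha_i^* < C} \left[ y_i - \sum_{x_j \in SV} (\alpha_j - \alpha_j^*)(x_j \cdot x_i) + \varepsilon \right] \right\}.$$

In the above formula, $SV$ is the set of support vectors, the outer sums run over the standard support vectors (those with $0 < \alpha_i < C$ or $0 < \alpha_i^* < C$), and $N_{NSV}$ is the number of standard support vectors.
The resulting linear regression function of the hyperplane is

$$f(x) = \sum_{i=1}^{l} (\alpha_i - \alpha_i^*)(x_i \cdot x) + b.$$

3.1.2. Solution of Nonlinear Regression. To solve a nonlinear regression problem, the training set is mapped to a high-dimensional space, and the linear regression method is then applied in that space. Assuming that there is a mapping function $\varphi$, the input set in the Euclidean space $\mathbb{R}^n$ can be mapped to a Hilbert space $H$ to get the corresponding original problem. The original problem with the feature mapping function is as follows:

$$\min_{w,\,b,\,\xi,\,\xi^*}\ \frac{1}{2}\|w\|^2 + C \sum_{i=1}^{l} (\xi_i + \xi_i^*)$$

$$\text{s.t.} \quad y_i - w \cdot \varphi(x_i) - b \le \varepsilon + \xi_i, \qquad w \cdot \varphi(x_i) + b - y_i \le \varepsilon + \xi_i^*, \qquad \xi_i,\ \xi_i^* \ge 0.$$

After applying the Lagrange multiplier method to the original problem, the dual problem with the feature mapping function is obtained:

$$\max_{\alpha,\,\alpha^*}\ -\frac{1}{2} \sum_{i=1}^{l} \sum_{j=1}^{l} (\alpha_i - \alpha_i^*)(\alpha_j - \alpha_j^*)\,\bigl(\varphi(x_i) \cdot \varphi(x_j)\bigr) - \varepsilon \sum_{i=1}^{l} (\alpha_i + \alpha_i^*) + \sum_{i=1}^{l} y_i (\alpha_i - \alpha_i^*)$$

$$\text{s.t.} \quad \sum_{i=1}^{l} (\alpha_i - \alpha_i^*) = 0, \qquad 0 \le \alpha_i,\ \alpha_i^* \le C.$$

In some cases, the curse of dimensionality arises in the feature mapping process. Therefore, when solving the regression problem, the kernel function method is usually used to compute the inner product of the mapping function. If, for any $x, x' \in \mathbb{R}^n$, the function $K$ meets the condition shown in the following formula, it is called a kernel function:

$$K(x, x') = \varphi(x) \cdot \varphi(x').$$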
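Once the dual problem has been solved, prediction needs only kernel evaluations against the support vectors, that is, the regression function with each inner product replaced by the kernel. The Python sketch below shows this with a Gaussian (RBF) kernel. The choice of kernel, the `gamma` value, the support vectors, and the coefficients are hypothetical illustration values; the paper does not state which kernel or parameters were used.

```python
import math


def rbf_kernel(x, z, gamma=1.0):
    """Gaussian kernel K(x, z) = exp(-gamma * ||x - z||^2)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * squared_distance)


def svr_predict(x, support_vectors, dual_coefs, b, kernel=rbf_kernel):
    """Evaluate f(x) = sum_i (alpha_i - alpha_i*) K(x_i, x) + b.

    dual_coefs[i] holds the difference (alpha_i - alpha_i*) for support vector i.
    """
    return sum(c * kernel(sv, x) for sv, c in zip(support_vectors, dual_coefs)) + b


# Hypothetical solved model: two support vectors whose coefficients sum to zero,
# as the equality constraint of the dual problem requires.
support_vectors = [(0.0,), (1.0,)]
dual_coefs = [0.5, -0.5]
bias = 1.0

prediction = svr_predict((0.0,), support_vectors, dual_coefs, bias)
```

The feature map never appears in the code: the kernel value stands in for the inner product in the feature space, which is how the kernel method sidesteps the high dimensionality of the mapping.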

Sewage Testing Actual Case Verification
In order to discover existing or potential problems in the system software and hardware and to ensure the stable and effective operation of the system, the system was tested in two stages. The first stage was testing and analysis in a laboratory environment. Once the first-stage test had passed, the second stage, testing and analysis at the sewage monitoring site, was carried out.

Comparison of Water Quality Prediction.
To predict water quality and thereby guard against sewage pollution and similar conditions, this paper uses a support vector machine algorithm for prediction. On the basis of wireless monitoring, a support vector machine algorithm is used to construct a water quality prediction model that makes reasonable predictions of water quality in the basin. The prediction results and comparison chart are shown in Figure 5. It can be seen from Figure 5 that the three methods all perform well in water quality prediction, with the maximum error within 10. Among them, the support vector machine algorithm performs best, with an average error of less than 2%. Therefore, it can be concluded that the support vector machine algorithm can be effectively applied to the water quality prediction of the river basin.
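The two figures of merit quoted above, a maximum error and an average percentage error, can be computed as in the following Python sketch. The exact error definitions used for Figure 5 are not stated in the paper, so the mean relative error and maximum absolute error below are assumed definitions, and the sample values are invented for illustration.

```python
def mean_relative_error_pct(actual, predicted):
    """Average of |predicted - actual| / |actual|, expressed in percent."""
    ratios = [abs(p - a) / abs(a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(ratios) / len(ratios)


def max_abs_error(actual, predicted):
    """Largest absolute deviation between prediction and measurement."""
    return max(abs(p - a) for a, p in zip(actual, predicted))


# Invented sample values (mg/L); not data from the paper.
actual = [2.0, 4.0, 5.0]
predicted = [2.1, 3.9, 5.0]

avg_err = mean_relative_error_pct(actual, predicted)
max_err = max_abs_error(actual, predicted)
```

Reporting both numbers is useful because a small average can hide a single large miss, which matters for an early-warning system.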

Phase 1 Testing.
In order to verify that the design of the system satisfies the functional requirements of the monitoring system in terms of data collection and network communication, the network was first simulated under laboratory conditions. The test platform at this stage consisted of 4 sensor nodes, 1 routing node, and 1 coordinator node. Each of the four sensor nodes collected data from one of four kinds of sensors. The routing node served as the medium for information exchange between the sensor nodes and the coordinator node. After receiving the sensor data, the coordinator node communicated with the PC and displayed the collection results in the serial debugging assistant software. Because the pH sensor and the dissolved oxygen sensor were not yet available during the first stage of testing, the pH and dissolved oxygen values were collected using an analog voltage input method. The laboratory does not have a Presell water tank, so the flow value was likewise replaced by the liquid level value. To speed up the system test, the collection interval in this phase was set to 1 minute rather than 5 minutes. After collection, the data was transmitted to the coordinator node through the routing node. The results of the first-stage testing of water quality monitoring indicators are shown in Figure 6.
After testing, it was shown that the monitoring network was successfully established in the laboratory environment and the data transmission was stable. After actual measurement, the data collected by the sensor node is accurate and reliable, which meets the design requirements.

Second-Stage Test.
In the laboratory, environmental noise interference is relatively small, and the working environment is close to ideal. To verify the system's operating status and performance at the monitoring site and to further test its stability and reliability, the second stage of testing installed the sensors at the sewage monitoring site. The test network consisted of 1 coordinator node, 10 routing nodes, and 50 sensor nodes. The client software was developed using LabVIEW for real-time display of monitoring data. The sensor nodes were installed at the sewage index collection site. The routing nodes and the sensor nodes were placed within a distance of 1 km of each other without obstruction. Each sensor node was connected to the nearest routing node according to the principle of proximity, thus forming the test network. The results of the second-stage test of water quality monitoring indicators are shown in Figure 7.
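The proximity rule used to form the test network can be sketched as follows in Python. Node coordinates are hypothetical planar positions in metres, and treating the 1 km figure as a hard link limit is an assumption made for the sketch.

```python
import math


def assign_to_nearest_router(sensors, routers, max_range_m=1000.0):
    """Link every sensor node to its closest routing node.

    sensors and routers map node names to (x, y) positions in metres.
    A sensor whose closest router is beyond max_range_m is left unlinked (None).
    """
    links = {}
    for sensor_id, sensor_pos in sensors.items():
        router_id, distance = min(
            ((rid, math.dist(sensor_pos, rpos)) for rid, rpos in routers.items()),
            key=lambda pair: pair[1],
        )
        links[sensor_id] = router_id if distance <= max_range_m else None
    return links


# Hypothetical layout along a river bank.
routers = {"R1": (100.0, 0.0), "R2": (1000.0, 0.0)}
sensors = {"S1": (0.0, 0.0), "S2": (900.0, 0.0), "S3": (5000.0, 0.0)}

links = assign_to_nearest_router(sensors, routers)
```

A sensor left unlinked in such a check would indicate that another routing node is needed to cover that stretch of the site.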
The second phase of the test was conducted continuously for 10 days under the monitoring site environment, and the test data was displayed in real time through the client software. Figure 7 shows the test data over a period of time.

The test data shows that node networking succeeded in the monitoring site environment and that data could be transmitted. However, some problems remained. First, when the pH sensor was read, its output value clearly exceeded the measurement range of the sensor. The hardware wiring diagram was checked carefully and found to be correct, and the sensor data sheet indicated that the sensor itself was faulty; replacing the sensor solved the problem immediately. Second, data transmission was not sufficiently stable. Analysis showed that the causes included poor antenna performance and the complicated environment of the monitoring site. During the test, antennas were purchased from professional antenna manufacturers, and the antennas of the routing nodes and sensor nodes were elevated, which effectively improved the stability of data transmission. The next step is to look for other causes and further improve the stability of data transmission.

Conclusion
In view of the problems faced by water quality monitoring in river basins, this paper proposes an automatic water quality monitoring system based on wireless sensor networks after studying the development trend of water quality monitoring at home and abroad. The system uses wireless sensor network technology, which enables wireless communication among data collection nodes and between data collection nodes and data base stations, reducing damage to the environment. In order to improve the utilization rate of data collection nodes, this paper also optimizes the processing of the collected data. Wireless sensor technology and the ZigBee protocol are used to establish a sewage monitoring model that dynamically monitors total nitrogen, total phosphorus, ammonia nitrogen, and other water quality indicators of the watershed in real time. On the basis of wireless monitoring, a support vector machine algorithm is used to construct a water quality prediction model that makes reasonable predictions of water quality in the basin. Finally, the actual test results show that the technology can monitor the water quality of the watershed automatically and in real time and meets the requirements of water quality monitoring in practical applications.
Although this paper has achieved some results in the automatic monitoring of water quality with wireless sensor networks, many problems remain. One is the power supply of the water quality automatic monitoring system: if the electric energy cannot be regenerated, the system will have a relatively short service life. Another is the spatial dimension of monitoring: water quality should be monitored at different depths in the basin, which would improve the reliability and integrity of water quality monitoring.

Data Availability
All data are available from the corresponding author.

Conflicts of Interest
The authors declare that there are no conflicts of interest regarding the publication of this paper.