Machine Learning-Based Sine-Cosine Algorithm for Wastewater Quality Assessment Using Activated Carbon

Activated carbon is one of the most widely proven adsorbents for removing organic chemicals from wastewater. It acts as a filter and adsorbs various chemicals from the wastewater. It has a large pore size and strong adsorptive capacity. The quality of wastewater is generally determined by chemical oxygen demand (COD), biochemical oxygen demand (BOD5), total suspended solids (TSS), total phosphorus (TP), and total nitrogen (TN). Measuring wastewater contaminants is important for protecting aquatic life and reusing treated water. Machine learning algorithms are used to predict the adsorption of the contaminants that contribute to these wastewater quality indicators. Much research has been done, but existing approaches to modeling contaminant adsorption by activated carbon in wastewater management remain inefficient and time consuming. To overcome these issues, this paper introduces a hybrid technique that combines a Voting-Based Extreme Learning Machine with the sine-cosine algorithm (VELM-SCA). The VELM-SCA algorithm achieves an improved accuracy of 0.97 in classifying water quality status.


Introduction
Adsorption is the removal of dissolved chemical components and pollutants from wastewater by a surface phenomenon. Wastewater treatment is a complicated process that ensures effective purification of wastewater. Adsorption of organic pollutants is essential for keeping pollutant levels low in a wastewater treatment plant. The operation of a wastewater treatment plant depends on the values of the indicator parameters that describe wastewater quality. To maintain adequate levels of these indicators and so reduce the pollutants in wastewater, many techniques are available for predicting both the quality and the quantity of influent wastewater. The wastewater quality indicator parameters are total nitrogen (TN), total phosphorus (TP), biochemical oxygen demand (BOD5), total suspended solids (TSS), and chemical oxygen demand (COD).
United Nations Sustainable Development Goal 6, "ensure access to water and sanitation for all," motivates our research on wastewater treatment. Surveys indicate that one person in three lacks safe drinking water infrastructure. Water is a basic necessity for all living creatures, so it is essential to monitor wastewater and the contaminants present in it.
Adsorption using carbon-based materials such as activated carbon and biochar is one of the prominent methods in wastewater treatment [1]. Many existing algorithms, such as random forest (RF), decision tree (DT), and the gradient boosting algorithm (GBA), are available for modeling it. These existing algorithms are efficient but inaccurate in predicting the reduction of pollutants in wastewater plants. To overcome this issue, this paper proposes a hybrid technique that combines a Voting-Based Extreme Learning Machine with the sine-cosine algorithm (VELM-SCA). It produces high variation in the prediction of carbon in wastewater plants and in the quality and quantity of pollutant reduction in the wastewater. Voting ELM is a majority voting system that classifies the adsorption of carbon in the wastewater; the voting approach is used to improve the accuracy rate [2]. To improve exploitation, the sine-cosine algorithm (SCA) selects the prediction of carbon through search operations based on the mathematical sine and cosine functions [3].
Machine learning is a powerful tool for interpreting complicated multivariate relationships and for optimizing the adsorption of dissolved components in a wastewater system. Algorithms such as response surface methodology (RSM), boosted regression tree (BRT), and general regression neural network (GRNN) have been applied to predict carbon in wastewater models [4,5]. The work in [6] reports that persistent contaminants, which can easily enter surface water and groundwater, pollute the natural environment for long periods. Activated carbon is the most powerful and reusable adsorbent for wastewater treatment [7]. The main contributions of this paper are as follows:
(1) A hybrid technique is implemented using VELM, which predicts carbon adsorption in the wastewater by taking the majority decision of a voting system.
(2) To improve accuracy, the sine-cosine algorithm, an optimization algorithm based on mathematical functions, is implemented.
The paper is organized as follows: Section 2 reviews the literature, Section 3 introduces the prediction of wastewater treatment using VELM-SCA, Section 4 discusses the experimental results, and Section 5 concludes the paper with future directions.

Review of Literature
Adsorption is the removal of organic and inorganic pollutants onto the surface of an adsorbent, and it is widely used in wastewater treatment. In recent years, researchers have searched for low-cost adsorbents for the pollutants present in water and wastewater treatment, and activated carbon plays a vital role in adsorbing them [8]. Adsorbents fall into two types: natural and synthetic. Natural adsorbents include clays, charcoal, zeolites, clay minerals, and ores. Synthetic adsorbents are produced from household waste, agricultural waste, industrial waste, polymeric adsorbents, and sewage sludge.
Activated carbon serves as an adsorbent in both the natural and synthetic categories because it is cheap, easily available, and has enhanced adsorption capability. Similarly, granular activated carbon (GAC) and powdered activated carbon (PAC) are commonly available adsorbents for removing organic and inorganic pollutants and for treating industrial wastewater on the land surface [9]. For carbon-based adsorbents, the adsorption capacity (Q, mg/g) is predicted and related to various properties of the carbon-based material: total carbon content (C, wt.%), the molar ratio of oxygen to carbon (O/C), the molar ratio of hydrogen to carbon (H/C), the molar ratio of oxygen and nitrogen to carbon [(O+N)/C], ash content (ash, wt.%), point of zero charge (pHpzc), and Brunauer-Emmett-Teller surface area (SBET, m2/g) [1]. The work in [10] proposed the sine-cosine algorithm (SCA) for improving metaheuristic search. Voting ELM (VELM) can be used to improve on the basic ELM. VELM exploits differences between class samples, which form various vector set values. Multiple independent ELMs are trained, and their decisions are combined by majority voting [11]. Table 1 surveys work on activated carbon using various techniques.

Proposed Methodology
Nowadays, domestic and industrial waste pollutes the environment, and these pollutants can be adsorbed by activated carbon. This waste contains soluble and insoluble content and chemicals. In this work, we implement a hybrid of the Voting-Based Extreme Learning Machine with the sine-cosine algorithm (VELM-SCA) to improve the efficiency of prediction for wastewater treatment plants [20-29]. The architecture of the proposed work (VELM-SCA) is given in Figure 1, which comprises three phases: data collection, preprocessing, and analysis by VELM-SCA.

Data Collection.
In this work, the data were collected from a wastewater treatment plant (WWTP) in Busan, Korea, over three years from January 2008 to December 2010 [19]. The data comprise total nitrogen (TN), total phosphorus (TP), chemical oxygen demand (COD), and biochemical oxygen demand (BOD).

Preprocessing.
To enhance the quality of the prediction of carbon in the wastewater plant, normalization is needed as a preprocessing technique. Data normalization is based on a linear normalization function.

Adsorption Science & Technology

Table 1: Survey on activated carbon in various techniques.
Author (year) | Reference | Topic
(not given) | [12] | Using morphological operator for activated carbon for dye removal
Alshabib et al. (2021) | [13] | Removal of methylene blue from wastewater using DFT in activated carbon
Wang et al. (2021) | [14] | Adsorption of dyes on an activated carbon
Zhu et al. (2021) | [1] | Carbon-based materials for tetracycline and sulfamethoxazole adsorption
F. Yang et al. (2020) | [15] | Biochar derived dissolved organic matter on adsorption of sulfamethoxazole
Li et al. (2020) | [16] | Magnetic biochar in water treatment
Xiang et al. (2019) | [17] | Carbon-based materials as adsorbents for antibiotic removal
Hasan et al. (2019) | [18] | Activated carbon composite and its testing for methyl orange removal
Kaetzl et al. (2019) | [19] | Biochar using wastewater treatment

Input: training dataset $S = \{(a_n, b_n)\}$, $n = 1, 2, \dots, N$, with labels $b_n \in \{1, 2, \dots, C\}$; $w(t)$ is the water quality indicator of wastewater plant $t$; $w$ is the total number of organic materials in the wastewater at $t$; $p$ is the crossover probability and its mutation.
Output: prediction of carbon in the wastewater plant using the max voting decision.
Step 1: randomly select the water quality indicator parameter $w(t)$.
Step 2: evaluate the fitness value of $w(t)$ using Equation (2).
Step 3: choose the next water quality indicator parameter $w(t)$ by applying crossover and mutation to $w(t)$. Repeat the following steps until the stopping criterion of the genetic algorithm (GA) is met.
(1) With crossover probability $prob_c$, $(1 - prob_c)w$ members of $w(t)$ are selected probabilistically according to their fitness values and passed to $w(t + 1)$.
(2) With mutation probability $prob_m$, chromosomes in $w(t + 1)$ are mutated, and the weights $wt$ are stored.
Step 4: the output of the voting-based ELM is the decision made by majority voting over the multiple ELMs.
Algorithm 1: VELM.

In the normalization function, $nor_{i,j}$ is the normalized value of each data point, $ver_{i,j}$ is the sample data, and $Min_k$ and $Max_k$ are the minimum and maximum values.
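The normalization formula itself is not reproduced in the text. A minimal sketch, assuming the linear normalization is the usual min-max scaling $(ver - Min) / (Max - Min)$ applied per indicator, is given below. The function name, the column layout, and the sample readings are illustrative assumptions and are not taken from the Busan dataset.

```python
def min_max_normalize(rows):
    """Scale every column of `rows` to [0, 1] using (v - min) / (max - min).

    Assumption: this is the 'linear normalization' the text refers to.
    """
    columns = list(zip(*rows))
    mins = [min(col) for col in columns]
    maxs = [max(col) for col in columns]
    normalized = []
    for row in rows:
        normalized.append([
            # A constant column has no spread, so map it to 0.0 instead of dividing by zero.
            (v - lo) / (hi - lo) if hi != lo else 0.0
            for v, lo, hi in zip(row, mins, maxs)
        ])
    return normalized


# Hypothetical influent readings; columns are TN, TP, COD (illustrative values only).
samples = [[30.0, 4.0, 200.0], [40.0, 6.0, 300.0], [50.0, 5.0, 400.0]]
scaled = min_max_normalize(samples)
```

Scaling each indicator separately keeps a large-valued indicator such as COD from dominating smaller-valued ones such as TP during training.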

Analysis of VELM-SCA
3.3.1. Evolutionary-Based Voting ELM.
The extreme learning machine (ELM) is a feedforward neural network with an input layer and a single hidden layer. Voting-based ELM trains multiple independent ELM models, and the final output decision is based on majority voting. The main advantage of VELM is higher quality in detecting the carbon available in the wastewater plant; in addition, it lowers the variance. The steps of VELM are given in Algorithm 1. In Algorithm 1, as per Equation (3), $S$ contains the samples $S = \{(a_n, b_n)\}$, $n = 1, 2, 3, \dots, N$, where $b_n$ is the class label. Assume that $a$ is an input and its label is predicted by $\varphi_{wt}(a, S)$. Voting-based ELM combines the outputs of multiple trained ELM models. For the $k$-th classifier, $D_{k,j} \in \{0, 1\}$ records its decision for class $j$, with $k = 1, 2, 3, \dots, K$, and the ensemble decision (max voting) is defined by

$$\varphi_{wt}(a, S) = \arg\max_{j = 1, \dots, C} \sum_{k=1}^{K} wt_k^{opt} D_{k,j}.$$
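The max-voting rule can be illustrated with a short sketch. It assumes that each of the $K$ trained ELMs has already produced one class label for the input, and it uses 0-based class labels where the paper uses $1, \dots, C$. The function name and the example votes and weights are hypothetical.

```python
def weighted_vote(predictions, weights, num_classes):
    """Return the class j that maximizes sum_k weights[k] * D[k][j].

    predictions[k] is the label chosen by the k-th classifier, so D[k][j] is 1
    only when predictions[k] == j and 0 otherwise.
    """
    scores = [0.0] * num_classes
    for label, weight in zip(predictions, weights):
        scores[label] += weight
    # Ties are broken in favor of the lowest class index (an arbitrary choice).
    return max(range(num_classes), key=lambda j: scores[j])


# Hypothetical votes from K = 5 independently trained ELMs over C = 3 classes.
votes = [0, 2, 2, 1, 2]
winner = weighted_vote(votes, [1.0] * 5, 3)                       # plain majority voting
weighted_winner = weighted_vote(votes, [3.0, 1.0, 1.0, 0.5, 0.5], 3)
```

With equal weights the rule reduces to plain majority voting; with unequal weights a single highly weighted classifier can override the majority, which is the role the optimized weights $wt_k^{opt}$ play in the ensemble decision.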

Sine-Cosine Algorithm (SCA).
To obtain the best prediction of carbon adsorption in the wastewater plant, the SCA function is implemented. It is based on the mathematical sine and cosine functions. SCA carries out its search using the sine or cosine update, Equations (2) and (3), where $X_d$ is the destination solution, $X_i(f)$ is the current solution, $|\cdot|$ denotes the absolute value, and $rnd_1$, $rnd_2$, $rnd_3$, and $rnd_4$ are random variables. The parameter $rnd_1$ determines the direction of the next solution, which may lie between $X_i(f)$ and $X_d(f)$ or outside that region; that is, it switches between exploration and exploitation. The update of $rnd_1$ depends on a constant value $c$, the maximum number of iterations $T$, and the current iteration. The random variable $rnd_2$ identifies the next movement towards the destination (best) solution $X_d(f)$. The random variable $rnd_3$ adds a random weight to $X_d(f)$, stochastically emphasizing it ($rnd_3 > 1$) or de-emphasizing it ($rnd_3 < 1$). The random variable $rnd_4$ is used to switch between the sine and cosine functions using Equation (4). Algorithm 2 describes the SCA.

Step 1: initialize the random population of solutions $X_1, X_2, \dots, X_n$.
Step 2: by using Equation (4), update $rnd_1$, which is used to switch from exploration to exploitation; it gradually decreases from 2 to 0.
Step 4: compute the fitness value $fit$ for each solution.
Step 5: report $X_d(f)$, the solution that has the best fitness value $fit_d$.
Algorithm 2: SCA.

Input: training dataset $S = \{(a_n, b_n)\}$, $n = 1, 2, \dots, N$, with labels $b_n \in \{1, 2, \dots, C\}$; $w(t)$ is the water quality indicator of wastewater plant $t$; $w$ is the total number of organic materials in the wastewater at $t$; $p$ is the crossover probability and its mutation.
Output: prediction of carbon in the wastewater plant.
Step 1: randomly select the water quality indicator parameter $w(t)$.
Step 2: evaluate the fitness value of $w(t)$ using Equation (2).
Step 3: choose the next water quality indicator parameter $w(t)$; crossover and mutation on $w(t)$ are calculated.
Step 4: using Algorithm 2, determine the best fitness $fit_d$.
Step 5: update the best solution among all water quality indicator parameter values.
Step 6: update the position of adsorption of carbon in the wastewater plant. If $rnd_3 > 1$, then $rnd_3 + X_d(f)$; this merges all related carbon particles.
Algorithm 3: VELM-SCA.
Algorithm 2 shows that SCA starts from randomly chosen solutions drawn from a solution set of various dimensions. It computes a fitness value for every solution, and its output is the solution with the best fitness.
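The SCA search loop can be sketched as follows. Because the update equations are not reproduced in the text, the sketch assumes the commonly published form of SCA: each coordinate moves by $rnd_1 \sin(rnd_2)\,|rnd_3 X_d - X_i|$ or the cosine counterpart, chosen by $rnd_4$, with $rnd_1 = c - f \cdot c / T$ at iteration $f$. The sampling ranges of $rnd_2$, $rnd_3$, and $rnd_4$, the clamping to bounds, the convention that lower fitness is better, and the sphere test function are all assumptions made for illustration.

```python
import math
import random


def sca_minimize(fitness, dim, bounds, n_agents=20, max_iter=200, c=2.0, seed=0):
    """Sine-cosine search sketch: returns (best solution, best fitness)."""
    rng = random.Random(seed)
    lo, hi = bounds
    # Step 1: random initial population X_1, ..., X_n.
    agents = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_agents)]
    best = min(agents, key=fitness)[:]
    best_fit = fitness(best)
    for it in range(max_iter):
        # rnd1 shrinks linearly from c towards 0 (exploration to exploitation).
        rnd1 = c - it * c / max_iter
        for agent in agents:
            for d in range(dim):
                rnd2 = rng.uniform(0.0, 2.0 * math.pi)  # assumed range
                rnd3 = rng.uniform(0.0, 2.0)            # weight on the destination
                rnd4 = rng.random()                     # sine/cosine switch
                wave = math.sin(rnd2) if rnd4 < 0.5 else math.cos(rnd2)
                step = rnd1 * wave * abs(rnd3 * best[d] - agent[d])
                # Clamp to the search bounds (an added safeguard, not from the paper).
                agent[d] = min(hi, max(lo, agent[d] + step))
            # Keep the destination solution X_d as the best point seen so far.
            fit = fitness(agent)
            if fit < best_fit:
                best, best_fit = agent[:], fit
    return best, best_fit


def sphere(x):
    """Hypothetical fitness function used only to exercise the search."""
    return sum(v * v for v in x)


best, best_fit = sca_minimize(sphere, dim=2, bounds=(-10.0, 10.0))
```

In VELM-SCA the fitness function would score a candidate set of water quality indicator weights rather than the sphere function used here.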

Proposed Hybrid of Voting ELM with Sine-Cosine Algorithm (VELM-SCA).
VELM-SCA enhances the accuracy and efficiency of wastewater quality classification and reduces the computation time for both the training and testing datasets. The procedure for VELM-SCA is given in Algorithm 3.
Algorithm 3 trains on the dataset $S = \{(a_n, b_n)\}$ with the input parameter values. It calculates the fitness for each parameter and chooses the best fitness, which denotes the carbon content in the wastewater plant. It then updates the value and chooses the next set of data. Finally, it updates the position of the carbon particle using a new random variable $rnd_3 > 1$, which merges all related carbon particles.

Result and Discussion
Spearman's Rank Correlation Coefficient (SRCC).
It measures the relationship between the actual and predicted variables, which may be positive or negative. In its formula, $d$ is the difference in ranks between the two variables, and $n$ is the number of samples. The range of values is between -1 and +1.
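A rank correlation defined through rank differences $d$ and sample count $n$ is normally computed as $1 - 6 \sum d^2 / (n(n^2 - 1))$; the sketch below assumes that standard form, which is exact only when there are no tied values. The helper names and the sample values are illustrative.

```python
def ranks(values):
    """Assign ranks 1..n in ascending order (assumes no tied values)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        result[idx] = rank
    return result


def spearman(actual, predicted):
    """Rank correlation via 1 - 6 * sum(d^2) / (n * (n^2 - 1))."""
    n = len(actual)
    d_squared = sum((a - p) ** 2 for a, p in zip(ranks(actual), ranks(predicted)))
    return 1.0 - 6.0 * d_squared / (n * (n * n - 1))


# Hypothetical actual and predicted indicator values.
rho_same = spearman([10, 20, 30, 40], [1.0, 2.5, 3.0, 9.0])      # identical ordering
rho_reversed = spearman([10, 20, 30, 40], [9.0, 3.0, 2.5, 1.0])  # reversed ordering
```

Identical orderings give +1 and fully reversed orderings give -1, which matches the stated range.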

Coefficient of Determination ($R^2$).
It explains the fraction of variation in the output. The maximum and minimum values of $R^2$ are 1 and 0; the higher the value, the better the model.
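A minimal sketch of this metric is given below, assuming the usual definition $R^2 = 1 - SS_{res} / SS_{tot}$, where $SS_{res}$ is the sum of squared prediction errors and $SS_{tot}$ is the sum of squared deviations of the actual values from their mean. The function name and the sample values are illustrative only.

```python
def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((y - y_hat) ** 2 for y, y_hat in zip(actual, predicted))
    ss_tot = sum((y - mean_actual) ** 2 for y in actual)
    return 1.0 - ss_res / ss_tot


# Hypothetical actual and predicted values.
r2_perfect = r_squared([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 4.0])
r2_mean_only = r_squared([1.0, 2.0, 3.0, 4.0], [2.5, 2.5, 2.5, 2.5])
r2_partial = r_squared([2.0, 4.0, 6.0, 8.0], [2.0, 4.0, 6.0, 10.0])
```

A perfect fit scores 1, and a model that always predicts the mean of the actual values scores 0.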

Mean Absolute Error (MAE).
The MAE is the average of the absolute differences between the model outputs and the actual outputs. In its formula, $\hat{y}$ is the predicted value, and $y$ is the actual value.

Root Mean Square Error (RMSE).
It is a measure of prediction error that captures the overall error of the system. In this work, the performance of VELM-SCA is also measured by accuracy, specificity, and sensitivity. Table 2 shows the performance metrics of VELM-SCA.
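The two error measures, MAE and RMSE, can be sketched as follows. MAE follows the definition given in the text; RMSE is assumed to take its standard form, the square root of the mean squared difference between actual and predicted values. The sample values are illustrative and are not results from the paper.

```python
import math


def mae(actual, predicted):
    """Mean of the absolute differences |y - y_hat|."""
    return sum(abs(y - y_hat) for y, y_hat in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Square root of the mean of the squared differences (y - y_hat)^2."""
    squared = sum((y - y_hat) ** 2 for y, y_hat in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


# Hypothetical actual and predicted indicator values.
actual = [3.0, 5.0, 7.0, 9.0]
predicted = [2.0, 5.0, 9.0, 9.0]
mae_value = mae(actual, predicted)
rmse_value = rmse(actual, predicted)
```

Because RMSE squares each error before averaging, it is never smaller than MAE on the same data and it penalizes large individual errors more heavily.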
Table 2 reports SRCC, RMSE, MAE, and $R^2$ for VELM, SCA, and our proposed work (VELM-SCA). The SRCC shows a strong correlation of 0.96. For RMSE, VELM obtained 4.56, SCA 3.87, and VELM-SCA 2.45. For $R^2$, the proposed VELM-SCA obtained 0.97. For MAE, VELM obtained 2.43, SCA 1.67, and VELM-SCA 1.22.
Table 3 shows that the accuracy of VELM-SCA (proposed work) is higher than that of the other classifiers, VELM and SCA. The next highest accuracy is that of SCA, which is close to VELM-SCA. Figure 2 shows the computation time for CO2 adsorption for VELM, SCA, and VELM-SCA; the proposed work needs less computation time for the adsorption of carbon. Table 4 shows the ranges of variation established for the influent parameter values of wastewater quantity and quality [20]. The parametric values in Table 5 indicate substantial variation in TN, TP, BOD, COD, TSS, and Q values. These wastewater quality indicators constitute the input data for the VELM-SCA models, which describe the changes in carbon, nitrogen, and phosphorus and the biochemical changes in the compounds in the bioreactors.
Using the parametric values in Table 5, VELM-SCA predicted the activated carbon-based TN, TP, BOD, COD, and TSS values, evaluated by mean absolute error (MAE), mean absolute percentage error (MAPE), and coefficient of determination ($R^2$). The analysis of these parameters is shown in Table 6.
Figure 3 indicates that VELM-SCA gives a better probability prediction for wastewater treatment plants. Table 6 shows the effectiveness of various machine learning algorithms in terms of various influent indicators; in the effectiveness measure, N is the total number of testing data, and L is the total number of losing test data. Table 7 indicates that the proposed work produces better results than the other existing algorithms. Figure 4 shows the VELM-SCA model for predicting the influent water based on activated carbon with various indicators.
Figure 4 shows the convergence curves of VELM-SCA for all datasets, indicating that the proposed work gives an effective prediction of the WWTP across the measured indicators. The proposed work converges faster than the other two algorithms.

Conclusion
Prediction for wastewater treatment plants relies on the adsorption of unwanted influent indicators using activated carbon. In this work, VELM-SCA predicts the influent indicators in the wastewater and identifies the maximum and minimum point values within set boundary values. It covers parameters such as total phosphorus (TP), total nitrogen (TN), BOD, COD, and TSS. VELM-SCA accurately predicts the unwanted influent indicators, which are filtered using activated carbon, with minimum prediction error, and it shows efficient and robust performance. The VELM-SCA algorithm achieves the highest accuracy, 0.97, in classifying water quality status, and it outperforms the existing classifiers. In future work, this study will be extended using other ML algorithms and a wider range of influent and effluent indicators for predicting wastewater treatment in plants.

Data Availability
The data that support the findings of this study are available from the corresponding author upon reasonable request.

Conflicts of Interest
The authors declare that they have no conflicts of interest.