Computational Intelligence-Based Structural Health Monitoring of Corroded and Eccentrically Loaded Reinforced Concrete Columns

in dealing with such complex databases. In this article, ML-based artificial neural network (ANN), Gaussian process regression (GPR), and support vector machine (SVM) algorithms have been applied to estimate the residual strength of corroded and eccentrically loaded RC columns. The performance of the analytical and ML models is assessed using commonly used performance indices, namely, the coefficient of determination ($R^2$), root mean square error (RMSE), mean absolute error (MAE), mean absolute percentage error (MAPE), a-20 index, and Nash-Sutcliffe efficiency (NS). The results of the proposed ANN model have been compared with the existing analytical model to identify the most suitable model. Based on the performance analysis, the precision of the GPR and SVM models is lower than that of the ANN model. The results revealed that the $R^2$ values of the ANN model for the training, testing, and validation datasets are 0.9908, 0.9757, and 0.9855, respectively. The MAPE, MAE, RMSE, NS, and a-20 index for all the datasets are 8.31%, 48.35 kN, 72.53 kN, 0.9886, and 0.8978, respectively. The precision of the ANN model in terms of the coefficient of determination is 225.77% higher than that of the analytical model. The sensitivity analysis demonstrates that the compressive strength of concrete plays the most significant role in the load-carrying capacity of corroded and eccentrically loaded RC columns. The proposed ANN model is reliable, accurate, fast, and cost-effective. This model can also be used as a structural health-monitoring tool to detect early damage in RC columns.


Introduction
In RC structures, columns are essential to transfer the superstructure load to the foundation. The deterioration of columns may affect the overall performance and service life of the structure. Damage in a column may reduce the structure's residual life, and as a result, the structure may perform poorly during seismic activity.
Rapid growth in industrialization and vehicular traffic in developing South Asian countries is a matter of concern [1]. The leading gases produced by industries are carbon dioxide (CO₂) and sulphur. Carbon dioxide from the atmosphere tends to enter RC structures through the voids of the concrete. The CO₂ that has penetrated the concrete reacts with iron in the reinforcement to form iron oxides (FeO₂), commonly referred to as rust or corrosion. The loss of alkalinity due to carbonation, the loss of alkalinity due to chloride attack, cracks in concrete, and insufficient cover are some of the other factors that lead to corrosion of embedded reinforcement in concrete structures [2]. The most common cause of degradation and sudden collapse of structures during their service life is corrosion of concrete-embedded reinforcement bars [3,4].
Steel reinforcement corrosion can be categorized into two types: chloride-induced and carbon dioxide-induced corrosion [5,6]. Carbonation-induced corrosion leads to uniform corrosion of the steel bar throughout its length, whereas chloride-induced corrosion leads to the formation of pits. Pits are the critical sections in the reinforcement bar where heavy cross-sectional losses take place due to corrosion [7,8]. Under the action of loads, these sections form high-stress zones (as the effective area of steel is reduced at pit locations) and may lead to sudden failure of the reinforcement bar at that particular cross-section. In this way, the process of corrosion in RC columns leads to a reduction in the diameter of the reinforcement bar. Even a small loss in the cross-sectional area of a steel bar significantly reduces the load-carrying capacity of the structural element [9]. Corrosion of the stirrups diminishes the overall buckling resistance or load-carrying capacity of the column [10,11]. Stirrups, owing to their closer proximity to the outer atmosphere (compared with the main bars) and smaller cover, suffer more corrosion than the main bars, significantly degrading the shear capacity of the column [12]. Also, the corrosion percentage of the stirrups has higher significance than the corrosion level of the main steel bars, as it governs the failure mode of columns [13]. The bond strength of concrete is also affected by the volumetric expansion that accompanies the formation of rust products, whose volume is about 3-6 times that of the steel-reinforcing bars [2]. This volumetric expansion leads to cracking of the concrete due to internal stresses and to spalling of the concrete, which further accelerates the corrosion process [2,8]. More corrosion leads to more and wider cracks on the outer surface of the concrete. Corrosion also changes the failure mode of the column from flexure-dominated failure to flexure-shear or pure shear-based failure [14,15]. All these factors combined slowly degrade the structure, which leads to ultimate failure during the service life of the structure.
In general, the concrete around a steel bar protects it from corrosion: the concrete creates an alkaline environment for the embedded steel reinforcement, which resists corrosion of the steel surface, and corrosive species such as O₂, Cl⁻, CO₂, and H₂O are prevented from entering. Despite these factors that contribute to the prevention of steel corrosion in concrete, cases of RC structure failure have been reported in alarming numbers from certain regions characterized by high humidity, weather, and high wet conditions [3].
Despite the recommendations provided by the codes of practice, steel corrosion is quite evident in structures. To fully understand the behavior of corroded structures, much effort is needed in estimating the remaining strength of corroded members with varying degrees of corrosion. This is also important for the development of rehabilitation and retrofitting techniques that extend the service life of the structure.
The available analytical model for the estimation of the remaining strength of eccentrically loaded and corroded RC columns is typically generated using a small number of column databases with constrained ranges of weight loss; thus, its accuracy is questionable. Moreover, analytical approaches are often complex and require additional assumptions, which makes them difficult to apply in actual practice.
Recently, soft computing tools such as artificial intelligence (AI) have gained much importance. ML algorithms have been tried and tested in different domains of civil engineering for optimization and prediction of results [16]. ML models have repeatedly produced better predictions than existing analytical models. The robustness of a model depends on several factors such as the problem domain, the size and quality of the dataset, and the choice of hyperparameters. In general, support vector machines (SVMs) and artificial neural networks (ANNs) are known to perform well in a wide range of problems and have been widely used in various domains. Multivariate adaptive regression splines (MARS) is a nonparametric regression method that can handle nonlinear relationships between features and target variables. It has been shown to perform well in problems with complex relationships. The adaptive neuro-fuzzy inference system (ANFIS) is a hybrid model that combines the strengths of both fuzzy systems and ANNs. It has been applied to various problems and has shown good performance in some cases. Extreme gradient boosting (XGBoost) is an ensemble learning method that has been widely used in civil engineering and has shown strong performance in many applications. The estimated probability of recurrence (EPR) is a method for predicting the probability of recurrence of an event, and it is typically used in earthquake engineering. Therefore, the robustness of a model ultimately depends on the problem domain and the quality of the data, and the model that works best for one problem may not be the best for another. Recently, many data-driven ML-based algorithms have been introduced in the civil engineering sector, and some of the studies are elaborated hereunder [17-22].

Shock and Vibration

Xu et al. [13] predicted the residual load-carrying capacity of RC columns using the ML approach. In their study, 180 cyclic loading test specimens were collected from the literature. The study used six ML models to find the most effective model for estimating the failure mechanism and load-carrying capacity of corroded columns. Among the six algorithms, CatBoost and RF had the highest accuracy of 89% in estimating the seismic mode of failure of corroded RC columns. For the prediction of the axial capacity of a corroded RC column, the CatBoost model was found to be the best, with an $R^2$ value of 0.92. Also, the RF algorithm showed a shift in the failure mode of the column during its service life, which was also mentioned by Al-Osta et al. [23].
Imam et al. [24] investigated the residual strength of corroded RC beams using ANNs. Four different algorithms were developed in this study; each algorithm was created using two hidden layer configurations, with two input parameters and one output parameter. The two input parameters were the diameter of the bar and the corrosion activity index of the beam, while the output parameter corresponded to one of the two outputs considered. The dataset was split into two sets, 70% and 30% for training and testing, respectively. The results showed that the ANN algorithms used to estimate $C_f$ performed better than the available analytical models. The ANN algorithms had 92% accuracy when compared to experimental equations.
Gupta et al. [25] predicted the mechanical properties of rubberized concrete exposed to high temperatures using ANNs. The total dataset contains 324 specimens, divided into 70%, 15%, and 15% for the training, validation, and testing datasets, respectively. The input layer consists of four input parameters, the hidden layer consists of seven neurons, and there are five output parameters. The statistical parameters show excellent results for the training dataset, indicating good performance of the trained ANN model. The average correlation coefficient for the developed ANN model is 0.9923, which shows good precision.

Tran et al. [26] predicted the punching shear strength (PSS) of a two-way RC slab using ANNs. The study included 218 datasets of RC slabs that were collected from the literature to develop the ANN model. The dataset was split in the same way as in the study by Gupta et al. [25]. Several ANN models were proposed (changing the neurons in the hidden layer), and the model with ten neurons in the hidden layer was found to be the best according to the MSE and R values. The highest R value observed for the ANN model was 0.995, which is very close to 1, thus confirming the efficiency of the proposed model.
Mai et al. [27] proposed an ANN-based model for estimating the $f_{ck}$ of concrete with fly ash (FA) and blast furnace slag (BFS). The dataset was randomly split into two parts, training and testing, with a percentage of 70% and 30%, respectively. The best results were obtained with twenty-four neurons in the hidden layer. The proposed ANN model is easy to use and reduces the cost of practical experiments. The results showed that the ANN model is very effective in predicting the concrete compressive strength, with an $R^2$, MAE, and RMSE of 0.9285, 3.29 MPa, and 4.42 MPa, respectively.

The feasibility and consistency of the ANN model in estimating the remaining strength of deteriorated RC columns are studied in this work. This study aims to calculate the remaining strength of deteriorated RC columns under axial loads through AI- and ML-based ANNs. For this purpose, a database of 137 experimentally tested columns was collected from the literature. The collected columns were corroded to various levels of corrosion and tested under eccentric-loading conditions. All the test specimens were corroded using an accelerated corrosion process, which is widely used by researchers in the literature. The impressed currents of all the specimens in the collected database lie within an appropriate range of 0.1 mA/cm² to 4 mA/cm², as mentioned in [28,29]. The specimens were cast, then corroded, and later tested experimentally; corrosion levels were quantified based on the percentage weight loss of the embedded steel-reinforcing bars. Corroded reinforcement bars were prepared according to ASTM G1-03 [30] to calculate the average percentage weight loss.
Najafzadeh et al. [31] predicted the scour in long contractions using ANFIS and SVM algorithms. The average flow velocity, critical threshold velocity of sediment movement, flow depth, median particle diameter, geometric standard deviation, and uncontracted and contracted channel widths are the input factors that influence the scour phenomenon in the modeling of ANFIS and SVM. The performance of the ANFIS model was 1.14% higher than that of the SVM model. According to the results of a sensitivity analysis, the ANFIS model's scour depth modeling relies heavily on the uncontracted and contracted channel widths. Additionally, the parametric analysis revealed that shear stress resulting from bed sediment motion at the contracted zone was a significant component in explaining how the input parameters affected the scour depth in long contractions.

To estimate the scour depth around bridge piers, Najafzadeh and Barani [32] compared the group method of data handling (GMDH) based on genetic programming (GP) with GMDH based on backpropagation (BP). The results showed that the GMDH-GP algorithm is more complex and time-consuming than GMDH-BP, and that GMDH-BP performed better at both the training and testing stages in predicting the scour depth. The sensitivity analysis revealed that the relationship between the pier diameter and flow depth is the most important factor for the scour depth. Najafzadeh and Azamathulla [33] predicted the scour of pile groups due to waves using the neuro-fuzzy (NF)-GMDH algorithm. The findings suggested that predictions made using NF-GMDH models could be more precise than those made using model trees and conventional equations.

The maximum scour depth around piers with debris accumulation was estimated by Najafzadeh et al. [34] using GEP, EPR, and MT models. The testing performance of these models was compared with that of conventional methods based on regression techniques. The prediction uncertainty of the MT was also quantified and compared with that of the other models. In another study, Najafzadeh and Tafarojnoruz [35] used a neuro-fuzzy GMDH model based on particle swarm optimization (NF-GMDH-PSO) to predict the longitudinal dispersion coefficient in rivers. The results of differential evolution (DE), MT, the genetic algorithm (GA), ANN, and conventional empirical equations were compared with the performance of the NF-GMDH-PSO model. The analysis revealed that the DE and GA approaches outperformed the other procedures when applied to equations based on an AI methodology. The NF-GMDH-PSO network can be used as an alternative to the successful formulas discussed above because it accurately predicted the longitudinal dispersion coefficient.
For estimating the remaining strength of corroded and eccentrically loaded columns, no computer-based model is currently available in the literature. This study is structured as follows: In Section 2, the importance of this study is discussed. The details of the collected database are explained in Section 3, along with the standardization of the collected database and the performance indices. In Section 4, an available analytical model for the calculation of the residual strength of the deteriorated column is described. Section 5 gives a thorough description of the ML algorithms (ANN, SVM, and GPR) used for the prediction and the development of the ANN model. The findings of the proposed predictive model and the comparison with the existing analytical model are discussed in Section 6. The conclusions of this study are summarized in the last section.

Research Significance
It is important to determine the strength of a corroded column for the repair and rehabilitation of deteriorated columns. However, there is no provision for the estimation of the remaining strength of corroded and eccentrically loaded columns in the current codes of practice. Thus, it becomes important to evaluate the strength of the corroded column with precision. The complexity of eccentrically loaded and corroded columns makes it very difficult to calculate the axial load-carrying capacity of the RC columns. To address this issue, ML-based algorithms have been utilized in this study to calculate the axial load-carrying capacity of RC columns. To the authors' knowledge, this paper examines for the first time the axial capacity of corroded and eccentrically loaded RC columns by utilizing ML algorithms (ANN, SVM, and GPR).

Methods
The collection of the database is a very important step in developing machine-learning models. A database of 137 specimens was collected from previous studies to aid in the learning of the proposed ANN algorithm [36-46]. The details of the collected database are shown in Table 1. In total, eleven input parameters, namely, breadth ($b$), width ($h$), eccentricity ($e$), concrete compressive strength ($f_{ck}$), the tensile strength of longitudinal bars ($f_{yt}$), concrete cover ($c$), longitudinal bar diameter ($d_m$), the diameter of lateral bars ($d_s$), percentage of reinforcement ($\rho$), percentage weight loss ($s$), and stirrup spacing ($S_v$), and one output parameter, the axial load ($P_u$), are collected from previous studies. The selection of appropriate input parameters is crucial in the formulation of a machine-learning model, as it significantly affects the results generated by the model [47]. The input parameters should be selected in such a way that they are easily measurable on-site with minimal destruction of the structure, which allows easy application of the proposed model in real-life situations for performance assessment [48,49]. Figure 1 shows the distribution of the axial load ($P_u$) against the input parameters, and Figure 2 shows the correlation coefficient (R) plot of all the selected parameters. The statistical parameters, such as the minimum, average, maximum, and standard deviation of all the specimens in the collected database, are shown in Table 2. Of the eleven input parameters, five were concrete parameters, while the other six were steel parameters, and the axial compressive load was the outcome. All these input parameters have been carefully chosen, and they are also based on the availability of sufficient data for the best formulation of the proposed model. To propose any ML-based model, the number of specimens should be greater than ten times the number of input parameters [50-52]. With 137 specimens for eleven input parameters (a minimum of 110), this requirement is met, so the database is adequate for developing the ANN algorithm for the calculation of the residual strength of the corroded column.
The methodology of the present work is shown in Figure 3, which first involves a literature review, followed by the collection and standardization of the database. The selected database was then split into three parts (training, testing, and validation); the values predicted by the analytical model and by the ANN model were then compared based on the performance indices.

Standardization of Data.
Standardization is the process of making the data unitless so that it is easily handled by machine-learning algorithms. It is a technique in which all values are scaled to lie between two numbers, such as 0 and 1 as used in this work. In the absence of normalization, inputs with large values have a much larger effect on training than inputs with small values; this may bias the training of the model and result in inaccurate outputs. Therefore, standardization of data was needed in our study. In this work, standardization has been performed as follows [53]:

$$Z_{normalized} = \frac{x - x_{min}}{x_{max} - x_{min}} \quad (1)$$

where $Z_{normalized}$ is the normalized outcome, $x$ is the value to be normalized in the selected dataset, $x_{min}$ is the minimum value in the selected dataset, and $x_{max}$ is the maximum value in the selected dataset.
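The min-max scaling described above can be illustrated with a minimal Python sketch. The function names and the sample strength values are hypothetical and are not taken from the collected database; the inverse mapping is included on the assumption that predictions are reported back in physical units.

```python
def min_max_normalize(values):
    """Scale a list of numbers to the range [0, 1] using min-max normalization."""
    x_min, x_max = min(values), max(values)
    if x_max == x_min:
        # A constant column carries no information; map it to zeros
        # instead of dividing by zero.
        return [0.0 for _ in values]
    return [(x - x_min) / (x_max - x_min) for x in values]


def denormalize(z, x_min, x_max):
    """Invert the scaling so a normalized value is expressed in original units."""
    return z * (x_max - x_min) + x_min


# Hypothetical concrete compressive strengths in MPa, for illustration only.
f_ck = [20.0, 25.0, 30.0, 40.0]
scaled = min_max_normalize(f_ck)
print(scaled)
```

Each input column would be scaled independently with its own minimum and maximum, so that no single parameter dominates training because of its units.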
The mathematical equations used to evaluate all performance indices are given in equations (2)-(7). All these indices have also been used to compare the performance of the analytical as well as the ML-based models. The coefficient of determination ($R^2$) measures the correlation between outputs and targets; an $R^2$ value of 1 means a close relationship, and 0 indicates a random relationship. The mean absolute percentage error is a measure of the prediction accuracy of a forecasting method in statistics [58]. A Nash-Sutcliffe value close to 1 indicates good performance of the model, and the accuracy of the model decreases as the value approaches zero [59]. In the case of a perfect model with a zero-estimate error variance, the Nash-Sutcliffe efficiency (NSE) equals one (NSE = 1). The a-20 index is a recently introduced statistical engineering measure that can be used to evaluate AI models by counting the samples whose estimated values lie within 20% of the experimental values [60-62]. $R^2$, NS, and a-20 index values closer to 1 indicate the best correlation between the estimated and the experimental results. The lower the values of the errors (MAPE, MSE, and RMSE), the better the performance of the model. These performance indices are based on the statistical assessment of the predicted values and the available experimental values. In addition, the scatter index (SI) is also used to assess the performance of the developed ML models. The SI is a measure of the dispersion of data points in a dataset around their mean or average. It is a statistical tool used to assess the degree of spread of data and how far apart the data points are from one another. A scatter index of 0 indicates that data points are perfectly clustered around the mean, while a higher scatter index value implies a greater spread of data points and lower clustering around the mean. The specific calculation of the scatter index may vary depending on the type of data and the method used, but it is generally a useful tool for understanding the distribution of a dataset and making inferences based on it [63]. The SI can be computed by dividing the RMSE value by the average of the observed dataset.

In equations (2)-(7), $E_i$ refers to the experimental value, $S_i$ is the predicted value, $\bar{E}$ and $\bar{S}$ refer to the averages of all experimental and estimated values, respectively, $m20$ is the number of specimens for which the ratio $E_i/S_i$ lies between 0.8 and 1.2, and $N$ is the total number of specimens in the selected dataset.
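The following Python sketch shows how the performance indices described above could be computed. It relies on commonly used definitions, which are assumptions here and may differ in detail from equations (2)-(7): $R^2$ is taken as the squared Pearson correlation, NS as one minus the ratio of the squared-error sum to the variance sum of the experimental values, the a-20 index as $m20/N$, and SI as RMSE divided by the experimental mean. The function name and the sample values are hypothetical.

```python
import math


def performance_indices(experimental, predicted):
    """Return R2, RMSE, MAE, MAPE (%), NS, a20, and SI for paired values."""
    n = len(experimental)
    e_mean = sum(experimental) / n
    s_mean = sum(predicted) / n
    errors = [e - s for e, s in zip(experimental, predicted)]

    sse = sum(err ** 2 for err in errors)                    # squared-error sum
    sst = sum((e - e_mean) ** 2 for e in experimental)       # spread of E
    var_s = sum((s - s_mean) ** 2 for s in predicted)        # spread of S
    cov = sum((e - e_mean) * (s - s_mean)
              for e, s in zip(experimental, predicted))

    rmse = math.sqrt(sse / n)
    # m20: specimens whose E_i / S_i ratio lies between 0.8 and 1.2.
    m20 = sum(1 for e, s in zip(experimental, predicted) if 0.8 <= e / s <= 1.2)

    return {
        "R2": cov ** 2 / (sst * var_s),   # assumed: squared Pearson correlation
        "RMSE": rmse,
        "MAE": sum(abs(err) for err in errors) / n,
        "MAPE": 100.0 * sum(abs(err / e) for err, e in zip(errors, experimental)) / n,
        "NS": 1.0 - sse / sst,
        "a20": m20 / n,
        "SI": rmse / e_mean,
    }


# Hypothetical axial loads in kN, for illustration only.
experimental = [100.0, 200.0, 300.0, 400.0]
predicted = [110.0, 190.0, 310.0, 390.0]
indices = performance_indices(experimental, predicted)
print(indices)
```

The indices are computed on unnormalized values so that RMSE and MAE carry the physical unit of the axial load.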

Analytical Model
The model given by Azad for estimating the residual strength of corroded columns is simple to apply; however, the predicted results are inaccurate [37]. In this model, the author introduced a reduction factor, $\alpha$, which is to be applied while calculating the strength of the corroded and eccentrically loaded column. The author proposed a two-step analytical method for the estimation of the strength of eccentrically loaded corroded columns. The first step primarily involves the calculation of $P^{*}$ based on conventional codes of practice, using the reduced cross-sectional area $A_s'$. All other factors that influence the strength reduction, such as bond degradation, crack damage, and loss of yield strength, are lumped into a single reduction factor $\alpha$.
The residual strength of the eccentrically loaded and corroded RC columns may be calculated using the following equation:

$$P_{res} = \alpha P^{*} \quad (8)$$

where $P_{res}$ is the residual compressive strength of deteriorated columns, $\alpha$ is the combined reduction factor, and $P^{*}$ is the strength of an uncorroded column calculated with the help of the conventional code of practice using the reduced cross-sectional area $A_s'$. The reduction factor $\alpha$ is calculated using equation (9), in which $e$ is the eccentricity, $h$ is the width of the column section, and $D'$, $D$, $I_{corr}$, and $T$ are the reduced diameter of the longitudinal reinforcing bars, the diameter of the uncorroded reinforcing bar, the impressed current, and the time period, respectively. In this equation, the coefficients $x_0 = 1.323$, $x_1 = 0.14$, $x_2 = 1.48$, and $x_3 = 0.192$ were adjusted so that the predicted residual strength lies within the range of 70% to 110% of the experimental value, making the predictions reliable and practical to use.
The calculation of $P^{*}$ involves the use of the reduced cross-sectional area $A_s'$, which is calculated using equation (10), where $A_s$ is the cross-sectional area of the original bar and $c$ is a variable that may be calculated using equation (11), where $P_r$ is the penetration rate of the corroded bars, $T$ is the time period of corrosion, and $D$ is the original uncorroded diameter of the reinforcing bars. The penetration rate $P_r$ is given by equation (12).

In equation (12), $I_{corr}$ refers to the impressed current density.
With the help of equations (11) and (12), $D'$, which is the reduced diameter of the corroded reinforcement bar, can be calculated using equation (13).

Artificial Neural Network (ANN).
ANNs are computational models that use supervised ML algorithms to adjust and self-program in order to produce certain output parameters. Neural networks are advanced computational tools with the capability of solving multidimensional nonlinear problems [64]. The network mainly comprises individual elements called nodes, linked to each other in a particular predefined architecture. ANNs attempt to simulate the logical decision-making capability of the brain to create a correlation between all the input parameters and the corresponding output. The processors, or neurons, are arranged in layers that function in parallel to accomplish the desired results. ANNs mainly comprise an (n + 2)-layer structure, where n represents the number of hidden layers in the network, while the other two layers are the input and output layers [58]. The number of hidden layers and the number of neurons in each hidden layer can be adjusted according to the demands placed on the neural network. A neural network stands out because of its capability to complete tasks based on logical decisions with infinite permutations and combinations, just like the human brain.
There are a few attributes of neural networks, such as adaptive learning, self-organization, real-time operation, and fault tolerance, which set them apart and make them extremely powerful [65]. Neural networks essentially program themselves based on learning datasets and thus do not require every step to be performed manually with human intervention, as is the case with conventional computers. Artificial neurons in the hidden layer receive inputs depending on the synaptic weight associated with that neuron, which is nothing but the amplitude or intensity of a connection between two particular neurons [66].

Introduction to ANN.
Here, in equation (14), $z_j$ may be compared with any one input parameter, for instance, the breadth of the column. $w_j$ and $k$ are the internal weight and bias, which are adjusted by trial and error during the training process. The ANN was developed in 1958 by Frank Rosenblatt and was called a perceptron; this formed the basis of ANNs [67]. In the literature, there exist many variations of artificial neural networks, such as the recurrent neural network (RNN), feedforward neural network (FFNN), and spiking neural network (SNN). The FFNN algorithm is the simplest among these algorithms and is based on connecting the inputs and outputs through one-way connections. FFNNs can be classified into the following two types: the single-layer perceptron (SLP) and the multilayer perceptron (MLP). An SLP is simpler but is only capable of dealing with linear problems. Multilayer perceptrons are widely used due to their capability to solve complex problems and their ability to deal with nonlinear problems.
The ANN is primarily based on the biological neural network, which is responsible for the functioning of the human brain. Biological neurons comprise soma cells, dendrites, synapses, and axons, which are replaced by nodes, inputs, weights, and outputs, respectively. Multiple nodes work together in parallel to estimate the output. The input received by each neuron of the input layer is adjusted by weight and bias and is then sent to each neuron in the hidden layer. Similarly, the output signal from the hidden layer is sent to every neuron in the output layer. The essential part of the neuron is the learning procedure, which is mainly a form of supervised learning. In the case of supervised learning, a dataset consisting of inputs along with the corresponding outputs is fed into the neural network for learning purposes.
Multiple inputs, $z_1, z_2, z_3, z_4, \ldots, z_j$, are sent to the summing junction along with the corresponding weights from each neuron. The summing junction receives the inputs in the form of $w_1 z_1, w_2 z_2, w_3 z_3, \ldots, w_j z_j$, whose sum is nothing but the dot product of the weight and input vectors. The neuron has its own bias/offset, which is added to the dot product of inputs and weights, ultimately forming the predicted output $y_j$. The basic mathematical equation of a working neuron is as follows [24,68,69]:

$$y_j = f\left(\sum_{j} w_j z_j + k\right) \quad (14)$$

where $y_j$ is the desired output of the neuron, $f(\cdot)$ is a unit step function or transfer function, $w_j$ is the weight connected with the $j$th input to the neuron, $z_j$ is the input to the neuron, and $k$ is the offset, bias, or threshold.
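The single-neuron computation described above can be sketched in a few lines of Python. The hyperbolic tangent is used as the transfer function purely as an assumption for illustration, and the input, weight, and bias values are hypothetical.

```python
import math


def neuron_output(inputs, weights, bias, transfer=math.tanh):
    """Compute f(sum_j w_j * z_j + k) for one artificial neuron."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    # Summing junction: dot product of weights and inputs plus the bias k.
    weighted_sum = sum(w * z for w, z in zip(weights, inputs)) + bias
    # The transfer function maps the weighted sum to the neuron's output.
    return transfer(weighted_sum)


# Hypothetical normalized inputs, weights, and bias, for illustration only.
z = [0.2, 0.5, 0.8]
w = [0.4, -0.6, 0.3]
k = 0.1
y = neuron_output(z, w, k)
print(y)
```

A hidden layer applies this computation once per neuron, each with its own weights and bias, and the output layer repeats it on the hidden-layer outputs.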

Development of the ANN Model.
In this study, the dataset used to train the ANN algorithm has already been explained in Section 3. The ANN model is trained in the MATLAB 2021a environment on an Intel(R) Xeon(R) W-2145 @ 3.70 GHz system with 32 GB RAM. The input layer consists of eleven input parameters, and the number of hidden layer neurons is varied between three and fifteen, as shown in Figure 4, which depicts the structure of the developed ANN model. A single hidden layer gave satisfactory results in terms of R and MSE, and it is easy to develop the ANN model with a single hidden layer between the input and output layers. In general, the accuracy of the ANN model improves with each additional neuron in the hidden layer, but at the same time, the complexity of the model increases; therefore, only one hidden layer is adopted.
The standardized dataset has been randomly divided into three different portions: training, testing, and validation. In this study, the training dataset comprises 70% of the total dataset, which is 95 samples out of a total of 137 samples. The validation dataset was utilized to measure network generalization and comprises 15% of the total dataset, which is 21 samples; the testing dataset has the same number of samples. The network output was repeatedly altered by changing the number of neurons in the hidden layer from 3 to 15.
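The 95/21/21 partition described above can be reproduced with a short Python sketch. The study performed the split in MATLAB; the helper below, including its name, rounding choices, and fixed random seed, is a hypothetical illustration and not the authors' procedure.

```python
import random


def split_indices(n_samples, train_frac=0.70, val_frac=0.15, seed=0):
    """Randomly split sample indices into training, validation, and testing sets."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # fixed seed for a repeatable shuffle

    n_train = int(n_samples * train_frac)   # 137 * 0.70 = 95.9, truncated to 95
    n_val = round(n_samples * val_frac)     # 137 * 0.15 = 20.55, rounded to 21
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]        # the remaining samples
    return train, val, test


train_idx, val_idx, test_idx = split_indices(137)
print(len(train_idx), len(val_idx), len(test_idx))
```

Shuffling before slicing keeps the three subsets disjoint while ensuring that every specimen is used exactly once.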
For training ANNs, three different algorithms are available in MATLAB, namely, scaled conjugate gradient (SCG), Bayesian regularization (BR), and Levenberg-Marquardt (LM). In this case, only the LM algorithm was used because, with this algorithm, training stops automatically when generalization stops improving, as indicated by an increase in the MSE of the validation samples. In comparison to the other two gradient descent approaches, the LM technique is more powerful, fast, and accurate [70,71]. The LM backpropagation algorithm has been the most widely utilized by researchers to train ANN models in the past [72,73]. As this algorithm requires less time than the other two, it was adopted for the training of the neural network.
Based on these outputs, three different values of the correlation coefficient and mean square error for training, validation, and testing were established, as shown in Table 3. With the help of the obtained data, rank analysis was performed for the selection of the best ANN model. Although the rank analysis was performed, other performance indices were also calculated to confirm the selection of the best ANN model.
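The source does not spell out the ranking scheme, so the following Python sketch shows only one plausible form of rank analysis, stated here as an assumption: each candidate model is ranked on every criterion (rank 1 is best, with higher R and lower MSE preferred), the ranks are summed, and the model with the lowest total is selected. The model labels and metric values are hypothetical and are not taken from Table 3.

```python
def rank_analysis(results):
    """Rank models by summed per-metric ranks; the lowest total rank is best."""
    models = list(results)
    totals = {m: 0 for m in models}
    metrics = next(iter(results.values())).keys()
    for metric in metrics:
        # Correlation coefficients should be high; mean square errors low.
        higher_is_better = metric.startswith("R_")
        ordered = sorted(models, key=lambda m: results[m][metric],
                         reverse=higher_is_better)
        for rank, model in enumerate(ordered, start=1):
            totals[model] += rank
    return sorted(totals.items(), key=lambda item: item[1])


# Hypothetical candidate models and metric values, for illustration only.
results = {
    "A": {"R_train": 0.990, "R_val": 0.970, "R_test": 0.960,
          "MSE_train": 0.0010, "MSE_val": 0.0040, "MSE_test": 0.0050},
    "B": {"R_train": 0.985, "R_val": 0.975, "R_test": 0.970,
          "MSE_train": 0.0015, "MSE_val": 0.0030, "MSE_test": 0.0035},
    "C": {"R_train": 0.988, "R_val": 0.985, "R_test": 0.975,
          "MSE_train": 0.0012, "MSE_val": 0.0020, "MSE_test": 0.0030},
}
ranking = rank_analysis(results)
print(ranking)
```

Summing ranks across the training, validation, and testing subsets favors models that generalize consistently over models that excel on the training data alone.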
Performance indices for the unnormalized results were calculated for all thirteen ANN models obtained by varying the number of neurons in the hidden layer, as shown in Table 4.
The MSE obtained with different numbers of hidden-layer neurons for the training, testing, and validation datasets is shown in Figure 5. Figure 5(a) shows that the lowest MSE was observed with 5 neurons in the hidden layer and the highest with 8 neurons. Figure 5(b) shows the MSE of testing; here, the lowest MSE was observed with 15 neurons in the hidden layer and the highest with 11 neurons. Figures 5(c) and 5(d) show the MSE of validation and of all datasets, respectively. The ANN model with 13 neurons in the hidden layer was selected as the best ANN model.

Gaussian Process Regression (GPR).
GPR is a Bayesian nonparametric machine-learning method used for regression analysis. It was introduced in the mid-90s by Carl Edward Rasmussen and Christopher K. I. Williams. GPR is based on the idea of modeling a function as a Gaussian process, which is a collection of random variables where any finite number of them has a joint Gaussian distribution. This allows for the modeling of the uncertainty in predictions rather than just the mean value as in traditional linear regression. Formally, GPR involves defining "a prior over functions," which is then updated based on the observed data in order to obtain a posterior distribution over functions. This posterior is used to make predictions by computing the expected value of the function at any new input points along with the uncertainty of the prediction. The prediction is a Gaussian distribution with a mean and covariance, which encode the uncertainty in the prediction. In summary, GPR is a powerful method for function approximation and uncertainty quantification and has applications in various fields such as geostatistics, civil engineering, time series analysis, and robotics.
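The prior-to-posterior update described above can be made concrete with a small one-dimensional sketch. It assumes a squared-exponential (RBF) kernel with unit variance and unit length scale and a small noise term; these choices, the helper names, and the toy data are illustrative assumptions and do not reflect the configuration of the GPR model used in this study.

```python
import math

def rbf(a, b, length=1.0, variance=1.0):
    """Squared-exponential covariance between two scalar inputs."""
    return variance * math.exp(-0.5 * ((a - b) / length) ** 2)

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def gp_predict(x_train, y_train, x_new, noise=1e-6):
    """Posterior mean and variance of a zero-mean GP at a single new input."""
    K = [
        [rbf(xi, xj) + (noise if i == j else 0.0) for j, xj in enumerate(x_train)]
        for i, xi in enumerate(x_train)
    ]
    k_star = [rbf(x_new, xi) for xi in x_train]
    alpha = solve(K, y_train)   # (K + noise*I)^-1 y
    v = solve(K, k_star)        # (K + noise*I)^-1 k_star
    mean = sum(k * a for k, a in zip(k_star, alpha))
    var = rbf(x_new, x_new) - sum(k * vi for k, vi in zip(k_star, v))
    return mean, var

x_train = [0.0, 1.0, 2.0]   # toy inputs
y_train = [0.0, 1.0, 0.5]   # toy targets
mean_at_data, var_at_data = gp_predict(x_train, y_train, 1.0)
mean_far, var_far = gp_predict(x_train, y_train, 10.0)
```

Near a training point the posterior mean follows the observation and the variance collapses towards zero, whereas far from the data the prediction reverts to the prior mean (zero) with the prior variance. This is the uncertainty information that a point-estimate regression does not provide.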

Support Vector Machine (SVM).
SVM is a type of supervised learning algorithm used for classification and regression analysis. It was first introduced by Vladimir N. Vapnik and Alexey Ya. Chervonenkis in the mid-90s. SVM is based on the idea of finding a hyperplane that separates the data into different classes in the case of classification or predicts the target value in the case of regression. The hyperplane is chosen such that it maximizes the margin, that is, the distance between the hyperplane and the closest data points, known as support vectors. These support vectors are the key elements that determine the position of the hyperplane. Formally, SVM solves a quadratic optimization problem to find the optimal hyperplane. In the case of linear separability, the hyperplane can be represented by a simple equation, while in the case of nonlinear separability, the data are transformed into a high-dimensional feature space using techniques such as the kernel trick, and the problem is solved in this space. In summary, SVM is a robust and effective machine-learning algorithm, particularly for high-dimensional data and when the classes are well separated. It has found applications in various fields such as text classification, bioinformatics, and computer vision.
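The margin idea can be illustrated with a toy linear classifier. Note the substitution: instead of solving the quadratic program mentioned above, this sketch minimizes the regularized hinge loss by batch subgradient descent, which is a simpler route to a soft-margin linear SVM. The data, hyperparameters, and function names are hypothetical, and the example concerns classification rather than the regression task of this study.

```python
def train_linear_svm(points, labels, lam=0.01, lr=0.1, epochs=200):
    """Soft-margin linear SVM via subgradient descent on the hinge loss.

    Minimizes lam/2 * ||w||^2 + mean(max(0, 1 - y * (w.x + b))).
    Labels must be +1 or -1.
    """
    w = [0.0] * len(points[0])
    b = 0.0
    n = len(points)
    for _ in range(epochs):
        grad_w = [lam * wi for wi in w]
        grad_b = 0.0
        for x, y in zip(points, labels):
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            if margin < 1.0:  # only margin violators contribute
                for k, xk in enumerate(x):
                    grad_w[k] -= y * xk / n
                grad_b -= y / n
        w = [wi - lr * g for wi, g in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b

def predict(w, b, x):
    """Classify a point by the side of the hyperplane on which it falls."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0 else -1

# Hypothetical, linearly separable toy data.
points = [(2, 2), (3, 3), (2, 3), (-2, -2), (-3, -1), (-1, -3)]
labels = [1, 1, 1, -1, -1, -1]
w, b = train_linear_svm(points, labels)
predictions = [predict(w, b, x) for x in points]
```

Only points with a margin below one influence the update, which mirrors the role of support vectors in fixing the position of the hyperplane.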

Results and Discussion
The predicted results obtained from ANNs and analytical models are presented in this section. Performance indices have been used to analyze the performance of the individual models.
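The performance indices named in this article can be computed as in the sketch below. The definitions are the commonly used ones and are assumptions here, since the paper's own equations are not reproduced in this section: R² is taken as the squared Pearson correlation, NS as one minus the ratio of squared errors to the variance of the experimental values, and the a-20 index as the fraction of samples whose predicted-to-experimental ratio lies between 0.8 and 1.2. The sample values are invented.

```python
import math

def performance_indices(experimental, predicted):
    """Return R2, RMSE, MAE, MAPE (%), a-20 index, and NS for paired values."""
    n = len(experimental)
    errors = [e - p for e, p in zip(experimental, predicted)]
    mean_e = sum(experimental) / n
    mean_p = sum(predicted) / n
    sse = sum(err ** 2 for err in errors)
    sst = sum((e - mean_e) ** 2 for e in experimental)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    cov = sum((e - mean_e) * (p - mean_p) for e, p in zip(experimental, predicted))
    within_20 = sum(1 for e, p in zip(experimental, predicted) if 0.8 <= p / e <= 1.2)
    return {
        "R2": cov ** 2 / (sst * var_p),
        "RMSE": math.sqrt(sse / n),
        "MAE": sum(abs(err) for err in errors) / n,
        "MAPE": 100.0 * sum(abs(err) / abs(e) for err, e in zip(errors, experimental)) / n,
        "a20": within_20 / n,
        "NS": 1.0 - sse / sst,
    }

# Hypothetical capacities in kN, for illustration only.
experimental = [100.0, 200.0, 300.0, 400.0]
predicted = [110.0, 190.0, 330.0, 400.0]
indices = performance_indices(experimental, predicted)
```

With the error defined as the experimental minus the predicted value, positive errors correspond to conservative predictions, which matches the sign convention used in the discussion of the analytical model.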
The dataset was passed through the analytical model presented by Najafzadeh and Barani [32], and the processed results are depicted in Figure 6. Figure 6(a) shows the scatter plot of the predicted against the experimental values calculated from Azad's model. Most of the specimens lie below the linear-fitting line, which reflects that the values predicted by the analytical model are conservative. For about 64% of the specimens, the predicted residual strength was less than half of the actual experimental value. Figure 6(b) shows the line plot of the variation in the experimental and predicted results along with the errors. The blue line shows the experimental results, the pink-dotted line shows the predicted values, and the diamond symbols on the red line at the top mark the observed errors. The line of the predicted results lies below that of the experimental results, confirming that the analytical model is too conservative. Figure 6(b) also shows that most of the errors lie above the zero line and that the predicted values deviate considerably from the experimental ones.
Figure 7(a) shows the range of errors in the analytical model. This figure plots the error in the results predicted by the analytical model against the number of samples. The larger the deviation between the experimental and predicted values, the larger the error, and vice versa.
Most of the errors observed are positive, confirming the conservativeness of the predicted results. The maximum observed error is 2802.65 kN, while the lowest observed error is −108.64 kN. Figure 7(b) shows the frequency distribution histogram of the errors in the analytical model.
The scatter plot, the line plot showing the variation in the experimental and predicted values with respect to their errors, the range of the errors, and the frequency distribution plot of the respective errors of the SVM model are shown in Figures 8(a)-8(d), respectively. In the SVM model, the errors range between −327 kN and 399 kN, as shown in Figure 8(d). The coefficient of determination of the SVM model is 0.9828, and the other performance indices are shown in Table 5. Similarly, the scatter plot, the line plot showing the variation in the experimental and predicted values with respect to their errors, the range of the errors, and the frequency distribution plot of the respective errors of the GPR model are shown in Figures 9(a)-9(d), respectively. [...] higher than that of the SVM model. The MAPE value of the GPR model is 37.50% lower than that of the SVM model, demonstrating the superior accuracy of the GPR model.
The highest error accumulation is observed at −70 kN for about 42 specimens, followed by 190 kN for 33 specimens. For only about 28% of the specimens do the predicted values lie within 0.8 to 1.2 times the experimental results. The predicted results also show that the discussed model has high errors and very low accuracy. Thus, on the basis of the predicted results, this analytical model is very conservative and leads to uneconomical solutions for repair and rehabilitation purposes. It is therefore not suitable for practical engineering purposes.
The ANN model with 13 neurons in the hidden layer was selected as the best model for the prediction, as this model had the highest coefficient of determination (R 2 ) of 0.9887, as shown in Table 4.
Figure 10(a) shows the plot of the experimental against the estimated results for the training dataset, which contains 95 samples from the selected dataset. It also includes a line plot showing the variation between the experimental and predicted results. The predicted line (pink-dotted line) almost coincides with the experimental line (blue line), which shows that the predicted values for the training dataset are very close to the experimental values. Similarly, Figures 10(b) and 10(c) show the validation and testing datasets, respectively. However, as shown in Figure 10(c), the predicted value line does not completely coincide with the experimental value line for the testing dataset, which also shows higher errors than the training and validation datasets. Overall, the blue and pink-dotted lines for the experimental and predicted results almost coincide with each other, showing that the estimated values are very close to the measured values.
The red error line in all the plots also lies very close to the zero-error line, showing minimal errors in the predicted values. Figure 10(d) shows the results for all datasets. The highest frequency of the error is observed at −20 kN, followed by +20 kN. The red line shows the normal distribution of errors across the plot. All of these plots show that the ANN model has been successful in predicting the results accurately. This model shows minimal errors and a high correlation with the experimental values.
To select the best model for calculating the remaining strength of corroded and eccentrically loaded columns, the analytical model is compared with the formulated ANN model. The analytical model performed very poorly in all the performance indices when compared with the ANN model. It showed high errors in the predicted results, which casts doubt on its ability to predict strength in practical engineering. In contrast, the ANN model gave reliable predictions with minimal errors, and its performance indices were better than those of the existing analytical model. For the analytical model, the R 2 value is 0.3035, which is 69.30% less than that of the proposed ANN model, which is 0.9887. Similarly, the R 2 value of the ANN model is 0.12% higher than that of the GPR model. For the proposed ANN model, the MAPE value is 8.31%, which is 83.81% less than that for the analytical model, which is 51.94%. Similarly, the MAPE value of the ANN model is 7.12% lower than that of the GPR model. The MAE and RMSE values of the ANN model are 48.34 kN and 72.52 kN, respectively, which are 89.85% and 90.4% less than those of the analytical model. The analytical model values of the NS and a-20 index are 0.1653 and 0.2700, respectively, which are 83.27% and 69.92% less than those of the ANN model, which are 0.9886 and 0.8978, respectively. The scatter index of the ANN model is 18.97% and 6% lower than that of the SVM and GPR models, respectively. The details of the performance indices for both the analytical model and the proposed ANN model are presented in Table 5. The performance indices show that the proposed ANN model has substantially outperformed the analytical model. The developed model gives better predictions with only small errors and can therefore be used reliably in practical engineering.
Figure 12 compares the models on the basis of the frequency of errors. The first rectangular box shows the frequency of the errors in the analytical model (AM). The other three boxes show the frequency of the errors of the SVM, GPR, and ANN models. Green-, blue-, and turquoise-colored rectangular boxes show the errors in the SVM, GPR, and ANN models, respectively. This graphical representation of the error frequencies also shows that the ANN model is accurate and efficient in calculating the axial load-carrying capacity of eccentrically loaded and corroded columns.
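The percentage comparisons quoted in this section depend on which model is used as the reference. The short sketch below, using the R² values reported above and two hypothetical helper functions, shows how the same pair of values yields both the "69.30% less" figure and the "225.77% higher" figure quoted in the abstract.

```python
def percent_lower(value, reference):
    """How much lower `value` is than `reference`, as a percentage of `reference`."""
    return (reference - value) / reference * 100.0

def percent_higher(value, reference):
    """How much higher `value` is than `reference`, as a percentage of `reference`."""
    return (value - reference) / reference * 100.0

R2_ANN = 0.9887         # reported for the proposed ANN model
R2_ANALYTICAL = 0.3035  # reported for the analytical model

# Analytical model relative to the ANN model (ANN value as the reference).
r2_analytical_lower = percent_lower(R2_ANALYTICAL, R2_ANN)

# ANN model relative to the analytical model (analytical value as the reference).
r2_ann_higher = percent_higher(R2_ANN, R2_ANALYTICAL)
```

The two results are approximately 69.30% and 225.77%; they describe the same gap, measured against different baselines.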

Proposed Formulation.
Based on the ANN model, the following formulation can be utilized to estimate the axial capacity of eccentrically loaded and corroded RC columns. The generalized equation of the ANN algorithm is expressed as
$$\text{Output} = \mathrm{purelin}\left(\sum_{i} W_{HO,i}\, Y_i + B_{HO}\right),$$
where the activation function used to model this ANN algorithm is "purelin", expressed as
$$\mathrm{purelin}(x) = x,$$
where $W_{HO}$ are the weights between the hidden layer and the output layer and $B_{HO}$ is the bias between the hidden layer and the output layer. $Y_i$ is the coefficient that depends on the following expression:
$$Y_i = \mathrm{tansig}\left(\sum_{j} W_{IO,ij}\, x_j + B_{IO,i}\right),$$
where the activation function used to model this ANN algorithm is "tansig", expressed as
$$\mathrm{tansig}(x) = \frac{2}{1 + e^{-2x}} - 1,$$
where $W_{IO}$ is the weight between the input layer and the hidden layer and $B_{IO}$ is the bias between the input layer and the hidden layer. The final formulation for the estimation of the deteriorated capacity of the columns is expressed in terms of the values $D_1$ to $D_{13}$, which are calculated from the input parameters.

Sensitivity Analysis.
The sensitivity analysis was conducted to assess the influence of the input variables on the output of the ANN model. This was performed by analyzing the current weights in the neural network, as suggested by Milne [74]. The relative impact of each input variable on the network output was determined by using the linking weights between input neurons, hidden neurons, and output neurons. The equation used for this purpose takes into account the weights (w ij and v jk ) for each hidden neuron in the network.
In the equation, $\sum_{r=1}^{N} |w_{rj}|$ is the sum of the weights linking the N input neurons to hidden neuron j. $Q_{ik}$ represents the proportion of the influence of the input variable $x_i$ on the output variable in relation to the other remaining inputs. It is important to note that the total sum of the $Q_{ik}$ index for all input variables must be equal to 100%. Figure 13 illustrates the relative impact of the input variables on P u in this study.
As depicted in Figure 13, the compressive strength of concrete and the tensile strength of longitudinal bars have the greatest and the least impact on P u , with relative importances of 11.69% and 7.56%, respectively. The remaining input variables, in decreasing order of influence on P u , are stirrup spacing (S v ), diameter of lateral bars (d s ), percentage of reinforcement (ρ), concrete cover (c), width (h), longitudinal bar diameter (d m ), eccentricity (e), percentage weight loss (s), and breadth (b).
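A weight-based importance measure of the kind described above can be sketched as follows. This is a Garson-style variant that uses absolute weights throughout, chosen so that the indices sum to 100%; the exact form of the equation used in the study is not reproduced in this section, so this should be read as an assumed illustration. The weights are hypothetical, with two inputs and two hidden neurons only.

```python
def relative_importance(w, v):
    """Relative importance (%) of each input from network weights.

    w[i][j]: weight linking input i to hidden neuron j.
    v[j]:    weight linking hidden neuron j to the single output.
    Each input's share of a hidden neuron is |w[i][j]| divided by the sum of
    |w[r][j]| over all inputs r, weighted by |v[j]| and then normalized.
    """
    n_inputs, n_hidden = len(w), len(v)
    column_sums = [sum(abs(w[r][j]) for r in range(n_inputs)) for j in range(n_hidden)]
    contributions = [
        sum(abs(w[i][j]) / column_sums[j] * abs(v[j]) for j in range(n_hidden))
        for i in range(n_inputs)
    ]
    total = sum(contributions)
    return [100.0 * c / total for c in contributions]

# Hypothetical weights for illustration.
w_example = [[1.0, 1.0],
             [1.0, 3.0]]
v_example = [1.0, 1.0]
importance = relative_importance(w_example, v_example)
```

For these weights the first input receives 37.5% and the second 62.5%. In the study's setting, `w` would have eleven rows and thirteen columns.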

Conclusions
This paper develops three machine learning-based models, namely, SVM, GPR, and ANN models, to forecast the residual strength of corroded and eccentrically loaded RC columns. A total of 137 experimental datasets of corroded RC columns were gathered from previous studies to establish the SVM, GPR, and ANN models. The number of neurons in the hidden layer of the ANN model was varied in order to find the most effective ANN-based model. The developed model consisted of eleven input parameters such as breadth (b), width (h), eccentricity (e), and concrete compressive strength.
The proposed ANN model in this work is efficient in estimating the residual strength of deteriorated RC columns, but more datasets should be employed in future research to improve the precision of the machine-learning algorithms. Also, a majority of the test specimens are scaled-down columns and may not accurately reflect the parameters involved in real-scale specimens; therefore, the experimental outcomes of large-scale corroded RC columns might be utilized to encourage the use of machine learning-based models. Other ML techniques should be researched, and the impact of the input parameters on model performance should be analyzed for the development of more resilient and efficient models. The results of the proposed ANN model are only valid for data falling within the ranges of the input and output parameters, which is a limitation of this study. This model can also be used as a structural health-monitoring tool to detect early damage in RC columns.

Figure 1: Distribution of P u against the input parameters (a) b vs. P u ; (b) h vs. P u ; (c) e vs. P u ; (d) f ck vs. P u ; (e) f y vs. P u ; (f) c vs. P u ; (g) d m vs. P u ; (h) d s vs. P u ; (i) s vs. P u ; (j) ρ vs. P u ; (k) S v vs. P u .

Figure 3: Methodology chart of the present work.

Figure 4: The architecture of the ANN.

Figure 6: Results of the analytical model. (a) The scatter plot and (b) the line plot to show the variation in the experimental and predicted values with respect to their errors.
Figure 10(d) shows the results for all datasets; the red error line lies close to the zero-error baseline, and the experimental and predicted value lines almost overlap each other, showing close agreement. Figure 11(a) shows the variation of the error with the number of samples. The red line shows the error between the predicted and experimental values. The highest observed error is 231.29 kN, while the lowest observed error is −219.19 kN. Figure 11(b) shows the frequency histogram of the error.

Figure 7: (a) Range of the errors in the analytical model. (b) The frequency distribution plot of the respective errors.

Figure 8: Results of the SVM model. (a) The scatter plot, (b) the line plot to show the variation in experimental and predicted values with respect to their errors, (c) the range of the errors, and (d) the frequency distribution plot of the respective errors.

Figure 9: Results of the GPR model. (a) The scatter plot, (b) the line plot to show the variation in experimental and predicted values with respect to their errors, (c) the range of the errors, and (d) the frequency distribution plot of the respective errors.

Figure 13: The significance of the input variables in relation to P u .

(f ck ), tensile strength of longitudinal bars (f yt ), concrete cover (c), diameter of longitudinal bars (d m ), diameter of lateral bars (d s ), percentage of reinforcement (ρ), percentage weight loss (s), and stirrup spacing (s v ) and one output parameter (eccentric compressive load). The reliability of the selected ANN algorithm for the estimation of the residual strength of the deteriorated RC column is compared with that of the existing analytical model and the developed SVM and GPR models to obtain the best model:
(i) The analytical model has poor precision in estimating the residual strength of the deteriorated RC columns, as only 28% of the predicted values lie in the range from 0.8 to 1.2 times the experimental value.
(ii) The SVM and GPR models have coefficients of determination of 0.9829 and 0.9875, respectively. The MAPE value of the GPR model is lower than that of the SVM model, demonstrating the GPR model's superiority.
(iii) The ML-based ANN model has good precision in estimating the residual strength of corroded and eccentrically loaded columns, as over 88% of the predicted values lie in the range from 0.8 to 1.2 times the experimental value.
(iv) The analytical model was found to be highly conservative; for about 64% of the specimens, the predicted values were less than 50% of the experimental values.
(v) The performance of the ANN model is superior to that of the SVM, GPR, and current analytical models, according to the performance indices and graphical representation.
(vi) The sensitivity analysis illustrates that the compressive strength of concrete has the most substantial impact (11.69%) on the load-carrying capacity of corroded and eccentrically loaded RC columns.
The ranges of the input parameters b, h, e, f ck , f yt , c, d m , d s , ρ, s, and s v and of the output parameter P u are from 100 to 250 mm, 100 to 350 mm, 0 to 157 mm, 17.70 to 63.50 MPa, 354.44 to 550 MPa, 15 to 54 mm, 9.20 to 20 mm, 6 to 16 mm, 1 to 3.88, 0 to 20, 50 to 100 mm, and 42.02 to 3530 kN, respectively, as shown in Table

Table 1: Details of the collected database.

Table 2: Statistical properties of the selected dataset.

Table 3: Selection of the best neuron based on R and MSE.

Table 4: Performance indices of the selected neuron and selection of the best neuron.

Table 5: Results of the analyzed models.