Prediction of Ultimate Load of Rectangular CFST Columns Using Interpretable Machine Learning Method

The ultimate compressive load of concrete-filled steel tubular (CFST) structural members is recognized as one of the most important engineering parameters for the design of such composite structures. Therefore, this paper deals with the prediction of the ultimate load of rectangular CFST structural members using the adaptive neurofuzzy inference system (ANFIS) surrogate model. To this end, compression test data on CFST members were extracted from the available literature, including (i) the mechanical properties of the constituent materials (i.e., the steel's yield strength and the concrete's compressive strength) and (ii) the geometric parameters (i.e., column length, width and height of the cross section, and steel tube thickness). The ultimate load is the output response of the problem. The ANFIS model was trained using a hybrid of the least-squares and backpropagation gradient descent methods. Quality assessment criteria such as the coefficient of determination (R²), root mean square error (RMSE), and slope of linear regression were used for error measurement. An 11-fold cross-validation technique was employed to evaluate the performance of the model. Results showed that, for the training process, the average R², RMSE, and slope were 0.9861, 89.83 kN, and 0.9861, respectively. For the validation process, the average R², RMSE, and slope were 0.9637, 140.242 kN, and 0.9806, respectively. Therefore, the ANFIS model may be considered valid because it performs well in predicting the ultimate load on the validation data. Moreover, partial dependence (PD) analysis was employed to interpret the "black-box" ANFIS model. It is observed that PD enabled us to locally track the influence of each input variable on the output response. Besides reliable prediction of ultimate load, ANFIS can also provide maps of ultimate load. Finally, the ANFIS model developed in this study was compared with other works in the literature, showing that it could improve the accuracy of ultimate load prediction in comparison to previously published results.


Introduction
Concrete-filled steel tubular (CFST) structural members exhibit very interesting properties, as they combine the advantages of the two constituent materials. In such composite structures, the tensile strength of the steel tube and the compressive strength of the concrete core combine to enhance many properties and structural performances of the members, such as strength [1,2], ductility [3,4], load-bearing capacity [5,6], fire resistance [7,8], earthquake resistance [9,10], energy absorption capacity [11,12], and so on. To date, rectangular CFST members have been employed in many constructions such as buildings, bridges, and underground stations because of their strong moment resistance [13] and simple beam-column joints [10,14]. Moreover, for a given sectional size, rectangular CFST members exhibit greater stiffness than circular or elliptical members [15][16][17]. Although the design process for rectangular CFST columns is set forth in many current codes such as Eurocode 4 [18], AISC [19], and ACI [20], the axial behavior of rectangular CFST members continues to receive considerable attention from researchers and engineers. The main reason is that current codes do not necessarily cover the full range of material strengths or geometrical dimensions [21][22][23][24]. As indicated in Xiong et al. [22], Eurocode 4 is only applicable to CFST members with steel yield strength in the range of 235 MPa to 460 MPa and concrete cylinder compressive strength from 20 MPa to 50 MPa. In the case of AISC, the yield strength of steel may be up to 525 MPa, whereas the cylinder compressive strength of concrete may be up to 70 MPa. As axial compression of composite columns is a complex problem, a range of questions still needs to be investigated. Indeed, many variables are involved in this problem, including geometrical parameters and mechanical properties of the constituent materials [25].
As CFST members are composite structures, the relationship between variables and macroscopic properties must be established in order to accurately investigate their mechanical performance and failure. Therefore, there are many ongoing theoretical, numerical, and experimental studies to obtain a better understanding of the axial behavior of rectangular CFST members. Experimental investigations are normally the best approach to study the behavior of CFST members. However, experimental design is often carried out over a small range of parameters, leading to a limited number of specimens [13]. In addition, extensive experimental studies are costly and time-consuming [25,26]. In terms of numerical modeling, An and Han [27] put forward a finite element (FE) model for investigating CFST members under both compression and bending. The model was used for a parametric study of the parameters influencing the strength of the composite structures. In another study, Zhou and Han [28] also employed the FE method to model the fire behavior of CFST members. Nguyen et al. [29] developed an FE model taking account of the interface properties between steel and concrete in CFST columns. The FE technique has also been used in many other works to numerically model the axial behavior of CFST columns [30][31][32][33]. There are also several empirical formulae in the available literature, such as Ding et al. [1], Wang et al. [23], Tran et al. [21], and Han et al. [34], for predicting the ultimate load of rectangular CFST members. However, these equations have been derived on the basis of simple assumptions and observations. Consequently, their applicability beyond the conditions for which they were derived is not guaranteed. The aforesaid studies have provided significant contributions to progress in modeling and prediction of the axial behavior of CFST members. However, there is a need for a more efficient and robust way to characterize the mechanical performance of such composite structures, including the influence of variables on their macroscopic properties.
Artificial intelligence (AI)-based models have received significant attention from researchers all around the world, especially in civil engineering-related problems [35][36][37][38][39][40][41][42][43][44][45][46]. For single-material structures, various studies have set out to predict (i) the buckling capacity of steel members [47][48][49][50] and (ii) the compressive strength of concrete [51][52][53][54][55]. For composite structures, Sarir et al. [36] proposed a tree-based and whale optimization model for predicting the load-bearing capacity of circular CFST members. In addition, Ahmadi et al. [56,57] applied an artificial neural network to predict the axial capacity of short CFST columns. Güneyisi et al. [58,59] derived a gene expression programming model to predict the load-bearing capacity of circular CFST members. The performance of such a model has been shown to be better than that of the formulae found in the existing literature. Al-Khaleefi et al. [60] introduced a neural network model for studying the fire resistance of CFST members, taking account of different structural factors, material factors, and loading conditions. Moon et al. [61] developed a fuzzy logic model for predicting the strength of circular CFST stub columns. The study investigated the effect of concrete confinement on the axial capacity of the columns. Despite the importance of rectangular CFST columns, most AI-based studies so far have concentrated on members with a circular cross section [36,58,61]. Most recently, a few studies have been published involving square cross sections. Ren et al. [35] employed a support vector machine and particle swarm optimization to investigate the axial capacity of square CFST members. Tran et al. [21] developed a neural network-based model to predict the load-bearing capacity of square CFST columns. Therefore, more investigations are required to assess the potential applications of AI-based models for studying the axial behavior of rectangular CFST columns, especially in the highly topical context of high-rise construction.
This work is devoted to the prediction of the ultimate load of rectangular CFST columns, and to the influence of the input variables on it, using the adaptive neurofuzzy inference system (ANFIS) model. It should be noted that ANFIS has not yet been used in the literature for studying rectangular CFST members and highlighting the influence of variables on the macroscopic properties. The reason for selecting the interpretable ANFIS technique is described in Section 2.2. Section 2.1 introduces the database used to train and validate the developed ANFIS model. In Section 2.2, details of the considered variables and the reasons for their selection are presented. Section 3 presents the training phase and the 11-fold cross-validation of the ANFIS model, together with regression analysis. Finally, partial dependence (PD) analysis is applied in order to interpret the "black-box" ANFIS model and to elucidate the influence of each variable on the output response.
(2) Manufacture of steel tube (cold-formed or welded).
In this work, 99 compression tests on CFST members were collected from the literature (data summarized in Table 1). Table 2 shows the initial analysis, including the notation, unit, minimum, maximum, average, standard deviation, and coefficient of variation of all variables in the database. Those variables are the height of the cross section H, width of the cross section W, thickness of the steel tube t, length of the column L, yield stress of steel f_y, compressive strength of concrete f'_c, and ultimate load N_u, respectively (see Figure 1 for the geometric description). It was assumed that the influence of initial geometric imperfections and residual stress was negligible compared to the major geometric parameters and mechanical properties of the constituent materials [36]. In addition, the specimens have no steel reinforcement in the concrete core and no stiffeners on the tube. Finally, only uniaxial monotonic compression tests were considered.

Machine Learning Method: Adaptive Neurofuzzy Inference System.
The adaptive neurofuzzy inference system (ANFIS) combines the learning rules of adaptive networks with a fuzzy inference system and is designed to make precise predictions in many fields of knowledge. The inference system is based on if-then rules [73], while the adaptive network is based on the gradient descent and the chain rule introduced in [74]. Figure 2 shows the basic structure of the ANFIS algorithm in a simplified case with only two inputs. The algorithm extends in a straightforward manner to more complicated cases with a large number of inputs. Basically, the ANFIS structure consists of five main layers (shown in Figure 2), each layer containing node functions of the same function family [75].
The layers are as follows [41]:
Layer 1. Each node in this layer corresponds to a node function, which can be chosen to be bell-shaped with a minimum value of 0 and a maximum value of 1, for example, a Gaussian function of the problem input x with parameters a_i and b_i. In fact, any continuous and differentiable function can be chosen for the nodes in this layer.
Layer 2. Each node in this layer is a node function that multiplies the incoming inputs and sends the results to the next layer.
Layer 3. Each node in this layer computes the ratio between the i-th rule's firing strength and the sum of all rules' firing strengths.
Layer 4. Each node in this layer is a node function applied to the normalized firing strength received from Layer 3.
Layer 5. The circle node in this layer calculates the sum of all incoming results and exports the overall output.
ANFIS has been especially helpful in various engineering applications where conventional techniques failed or were too complicated to be used [76]. Compared with other machine learning methods, the main advantages of the ANFIS model are (i) simplicity, (ii) computational efficiency, and (iii) adaptability [77]. Indeed, ANFIS constructs an input-output mapping based on human knowledge and generates output responses by using the backpropagation algorithm [78]. After training, validating, and testing, the ANFIS model can be employed to recognize data that are similar to any of the specimens seen during the training process. The ANFIS model is more effective than either of the two standalone models (i.e., artificial neural network and fuzzy logic), as shown in many studies such as Aditya et al. [79] and Nayak et al. [80]. ANFIS is increasingly employed in the field of structural engineering [78][81][82][83][84][85]. These investigations found that the ANFIS model yielded superior accuracy with respect to experimental data points compared with other machine learning techniques. However, the ANFIS model suffers from a number of limitations; for instance, it is weak in finding the optimal firing strength [86,87]. Metaheuristic optimization techniques such as the genetic algorithm or simulated annealing, as in [88,89], make it possible to search for and better determine the firing strengths.
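The five layers described above can be illustrated with a short forward pass. This is a minimal sketch, assuming the classic two-input, two-rule first-order Sugeno layout in which rule i pairs the i-th membership function of each input; the function names, the (center, sigma) parameterization of the Gaussian, and the numerical values are illustrative assumptions and are not taken from the model trained in this study.

```python
import numpy as np

def gaussian_mf(x, center, sigma):
    """Layer 1 node: Gaussian membership value, between 0 and 1."""
    return np.exp(-0.5 * ((x - center) / sigma) ** 2)

def anfis_forward(x1, x2, premise, consequent):
    """Forward pass of a two-input, two-rule first-order Sugeno ANFIS.

    premise: dict with keys "A" and "B", each a list of (center, sigma)
             pairs, one pair per rule (assumed layout).
    consequent: array of shape (2, 3), row i holding (p_i, q_i, r_i).
    Returns the scalar output and the normalized firing strengths.
    """
    # Layer 1: membership degrees of each input
    mu_a = np.array([gaussian_mf(x1, c, s) for c, s in premise["A"]])
    mu_b = np.array([gaussian_mf(x2, c, s) for c, s in premise["B"]])
    # Layer 2: firing strength of each rule (product of memberships)
    w = mu_a * mu_b
    # Layer 3: normalized firing strengths
    w_bar = w / w.sum()
    # Layer 4: normalized strength times the linear rule output p*x1 + q*x2 + r
    f = consequent @ np.array([x1, x2, 1.0])
    rule_out = w_bar * f
    # Layer 5: overall output is the sum of all rule contributions
    return rule_out.sum(), w_bar

# Illustrative parameters only (not fitted values)
premise = {"A": [(0.0, 1.0), (1.0, 1.0)], "B": [(0.0, 1.0), (1.0, 1.0)]}
consequent = np.array([[1.0, 1.0, 0.0], [2.0, 0.0, 1.0]])
y, w_bar = anfis_forward(0.5, 0.5, premise, consequent)
```

At the midpoint between the two centers both rules fire equally, so the output is simply the average of the two linear rule outputs. In the six-input problem of this paper the same structure applies with more membership functions and rules.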

K-Fold Cross-Validation.
In this work, the K-fold cross-validation technique was employed to evaluate the performance of the ANFIS model. Such a technique can reduce overfitting as well as the effect of the random assignment of data to training and test sets [90]. Moreover, this technique is also efficient for small datasets [91]. Various investigations have pointed out that 10 is the optimal number of folds for obtaining a suitable result within an acceptable range of error [90,92]. Therefore, in this study, given the number of data points, the 11-fold cross-validation technique was adopted to evaluate the efficiency of the ANFIS model, following the procedure described in Bui et al. [92]. The diagram of the 11-fold cross-validation technique is shown in Figure 3. More precisely, the procedure is as follows. The indices of the 99 data points in the initial dataset were randomly shuffled and split into 11 different subsets or folds. In the first run, the first fold was used to test the model while the 10 remaining subsets were employed for training the model. The process was then repeated so that each fold served once as the test set. Hence, the ANFIS model was trained 11 times using 11 different training and testing datasets, i.e., all data points were used in both training and testing phases. In each run, the performance of the model was recorded in order to evaluate the overall performance of the model.
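The splitting procedure described above can be sketched in a few lines. This is a minimal illustration, assuming equal-sized folds (99 points in 11 folds gives 9 per fold) and an arbitrary random seed; the helper name `kfold_indices` is hypothetical and the actual implementation used in the study is not specified.

```python
import numpy as np

def kfold_indices(n_samples, n_folds, seed=0):
    """Yield (train_idx, test_idx) pairs for K-fold cross-validation."""
    rng = np.random.default_rng(seed)
    shuffled = rng.permutation(n_samples)        # random order of sample indices
    folds = np.array_split(shuffled, n_folds)    # K nearly equal subsets
    for k in range(n_folds):
        test_idx = folds[k]                      # fold k is held out for testing
        train_idx = np.concatenate(
            [folds[j] for j in range(n_folds) if j != k]
        )                                        # the other K-1 folds train the model
        yield train_idx, test_idx

# 99 specimens, 11 folds, as in the text
splits = list(kfold_indices(99, 11))
```

Each of the 11 runs then trains on 90 specimens and tests on the remaining 9, and every specimen appears in exactly one test fold.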

Quality Assessment Criteria.
In the present work, statistical criteria, namely, the coefficient of determination (R²), mean absolute error (MAE), and root mean square error (RMSE), have been used in order to validate and test the AI models developed. R² measures the statistical relationship between two data series. This measurement of the linear correlation yields a value between 0 and 1 inclusive, where 0 is no correlation and 1 is total correlation. R² can be calculated using the following equation [93][94][95][96][97]:

$$R^2 = \left[ \frac{\sum_{k=1}^{N} (p_k - \bar{p})(w_k - \bar{w})}{\sqrt{\sum_{k=1}^{N} (p_k - \bar{p})^2 \, \sum_{k=1}^{N} (w_k - \bar{w})^2}} \right]^2$$

where $N$ is the number of observations, $p_k$ and $\bar{p}$ are the predicted and mean predicted values, and $w_k$ and $\bar{w}$ are the measured and mean measured values of ultimate load, respectively ($k = 1, \ldots, N$). In the case of MAE, a low value indicates good accuracy of the model predictions. MAE can be calculated using the following equation [98][99][100][101]:

$$\mathrm{MAE} = \frac{1}{N} \sum_{k=1}^{N} \left| p_k - w_k \right|$$

where $p_k$ and $w_k$ are the predicted and observed values, respectively ($k = 1, \ldots, N$). The formulation of RMSE is described by the following equation [102][103][104][105][106]:

$$\mathrm{RMSE} = \sqrt{\frac{1}{N} \sum_{k=1}^{N} (p_k - w_k)^2}$$

Finally, the slope criterion is defined as the slope of the linear regression fit between the predicted and observed vectors.
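The criteria defined in this section can be computed directly from the predicted and measured vectors. The sketch below is illustrative: R² is taken as the squared linear correlation, consistent with the definition in terms of both mean values, and the slope is obtained by regressing predicted on measured values, which is an assumption since the text does not state the direction of the regression. The function names and the sample numbers are hypothetical.

```python
import numpy as np

def r_squared(pred, meas):
    """Squared linear correlation between predicted and measured values."""
    pred, meas = np.asarray(pred, float), np.asarray(meas, float)
    dp, dw = pred - pred.mean(), meas - meas.mean()
    return (dp @ dw) ** 2 / ((dp @ dp) * (dw @ dw))

def mae(pred, meas):
    """Mean absolute error."""
    return np.mean(np.abs(np.asarray(pred, float) - np.asarray(meas, float)))

def rmse(pred, meas):
    """Root mean square error."""
    return np.sqrt(np.mean((np.asarray(pred, float) - np.asarray(meas, float)) ** 2))

def regression_slope(pred, meas):
    """Slope of the least-squares line pred = slope * meas + intercept (assumed direction)."""
    slope, _intercept = np.polyfit(np.asarray(meas, float), np.asarray(pred, float), 1)
    return slope

# Made-up ultimate loads in kN, for illustration only
measured = [100.0, 200.0, 300.0, 400.0]
predicted = [110.0, 190.0, 310.0, 390.0]
scores = {
    "R2": r_squared(predicted, measured),
    "MAE": mae(predicted, measured),
    "RMSE": rmse(predicted, measured),
    "slope": regression_slope(predicted, measured),
}
```

Because every prediction in this toy example is off by exactly 10 kN, MAE and RMSE coincide; in general RMSE is at least as large as MAE and penalizes large individual errors more heavily.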

Interpretation of Machine Learning Method: Partial Dependence Analysis.
In this work, partial dependence (PD) analysis was used to interpret the AI-based model [107,108]. To this end, individual conditional expectation (ICE) [109] was first investigated to generate all possible partial responses. By design, ICE allows us to track any changes to the output response by varying a given input variable while the other inputs remain unchanged. Consequently, ICE responses may be highly heterogeneous [109,110]. PD was then defined as the average of all partial responses. In that way, PD reduces the complexity of the modeled relationship by graphing the significant relationship between the predicted output and the predictors. More details on the calculation of ICE and PD can be found in Goldstein et al. [109] and Molnar [111].
Figure 2: Illustration of basic ANFIS structure with two input parameters.
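The ICE and PD definitions given in this section translate into a short routine. This is a minimal sketch, assuming a generic `predict` callable that maps an array of inputs to predictions; the toy predictor below is a stand-in chosen so the result can be checked by hand and is not the ANFIS model of this study.

```python
import numpy as np

def ice_curves(predict, X, feature, grid):
    """One ICE curve per sample: vary one input over `grid`, keep the others fixed."""
    X = np.asarray(X, float)
    curves = np.empty((X.shape[0], len(grid)))
    for j, value in enumerate(grid):
        X_mod = X.copy()
        X_mod[:, feature] = value      # overwrite the chosen input for every sample
        curves[:, j] = predict(X_mod)  # partial response of each sample
    return curves

def partial_dependence(predict, X, feature, grid):
    """PD curve: average of all ICE curves at each grid value."""
    return ice_curves(predict, X, feature, grid).mean(axis=0)

# Toy stand-in predictor with two inputs (illustrative only)
def toy_predict(X):
    return 2.0 * X[:, 0] + X[:, 1] ** 2

X_toy = np.array([[0.0, 1.0], [0.0, 3.0]])
grid = np.array([0.0, 1.0, 2.0])
ice = ice_curves(toy_predict, X_toy, feature=0, grid=grid)
pd_curve = partial_dependence(toy_predict, X_toy, feature=0, grid=grid)
```

Here the two ICE curves are parallel because the toy predictor has no interaction between its inputs; heterogeneous ICE curves, as mentioned in the text, would indicate interactions that the averaged PD curve hides.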

Training and Performance of ANFIS.
This section presents the ANFIS training procedure. Before training, an initial Sugeno-type fuzzy inference system (FIS) was generated, as illustrated in Figure 4. The parameters of this initial FIS are indicated in Table 3, showing the membership function type and the number of linear and nonlinear parameters. A hybrid combination of least-squares and backpropagation gradient descent methods was used to optimize the initial FIS in accordance with the collection of input-output data. The cost function was the root mean square error, with 1000 iterations chosen as the stopping criterion. Figures 5(a) and 5(b) show the training performance for one of the 11 cross-validation folds in terms of cost function and step size, respectively. The cost function starts at a large value and decreases to a smaller one, and Figure 5(a) shows that convergence is obtained after about 600 iterations. The step size of the ANFIS model initially increases, reaches a maximum, and then decreases for the rest of the training. Figures 6(a)-6(c) show the regression graphs using the training data, testing data, and all data, whereas Figures 7(a)-7(c) show both actual and predicted ultimate load as a function of sample index, respectively, for one of the 11 folds. Figures 6 and 7 show a strong correlation between the actual and predicted ultimate load. The average values of all quality assessment criteria at the end of the training process over the 11 testing folds are given in Table 4. The average values of R², RMSE, ErrorStD, and slope for training are 0.9861, 89.93 kN, 90.3333 kN, and 0.9861, respectively. As indicated in Table 4, the overall responses confirm that the training process provides the optimal results. Finally, without requiring a complex architecture, the proposed ANFIS model was able to produce these results efficiently, avoiding costly computation.
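The least-squares half of the hybrid training rule mentioned in this section can be illustrated in isolation. This is a minimal sketch, assuming a one-input first-order Sugeno model with Gaussian membership functions whose centers and widths are held fixed during the step; the function names, the one-input simplification, and the synthetic data are illustrative assumptions and do not reproduce the FIS of Table 3. The gradient descent half, which updates the membership parameters, is not shown.

```python
import numpy as np

def normalized_firing(x, centers, sigmas):
    """Normalized Gaussian firing strengths, shape (n_samples, n_rules)."""
    w = np.exp(-0.5 * ((x[:, None] - centers[None, :]) / sigmas[None, :]) ** 2)
    return w / w.sum(axis=1, keepdims=True)

def fit_consequents(x, y, centers, sigmas):
    """Least-squares estimate of the linear rule parameters (p_i, r_i).

    With the membership parameters fixed, the model output
    y = sum_i w_bar_i * (p_i * x + r_i) is linear in p_i and r_i.
    """
    w_bar = normalized_firing(x, centers, sigmas)
    design = np.hstack([w_bar * x[:, None], w_bar])  # columns for p_i, then r_i
    theta, *_ = np.linalg.lstsq(design, y, rcond=None)
    n_rules = len(centers)
    return theta[:n_rules], theta[n_rules:]

# Synthetic check: generate data from known rule parameters and recover them
x = np.linspace(0.0, 10.0, 50)
centers, sigmas = np.array([2.0, 8.0]), np.array([2.0, 2.0])
p_true, r_true = np.array([1.0, -0.5]), np.array([0.0, 3.0])
w_bar = normalized_firing(x, centers, sigmas)
y = (w_bar * (x[:, None] * p_true[None, :] + r_true[None, :])).sum(axis=1)
p_hat, r_hat = fit_consequents(x, y, centers, sigmas)
```

Solving for the linear parameters in closed form at each pass, and leaving only the nonlinear membership parameters to gradient descent, is what makes the hybrid rule faster than pure backpropagation.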

Comparison with Literature.
Various investigations have been reported in the literature on predicting the ultimate load of CFST members using AI-based approaches. A summary of previous studies, including the reference, the model used, the cross section geometry, the number of data, the number of inputs, and the quality assessment criteria, is given in Table 5. Various AI methods have been employed, such as particle swarm optimization, support vector machine, gene expression programming, artificial neural network, and so on. In addition, the cross section may be circular or square. In terms of the values of the quality assessment criteria, the ANFIS model improves the ultimate load prediction, making it more accurate than previously published results.

PD Analysis and Surface Mapping.
Based on the validated ANFIS model developed previously, PD analysis is employed in this section to interpret the machine learning "black-box" model. Figures 8(a)-8(f) show the PD curves for H, W, t, L, f_y, and f'_c, respectively. It should be noted that a best-fit curve was also computed for each case. PD allows us to locally track the impact of each predictor on the output result. As an example, Figure 8(c) shows that the relationship between N_u and t can be approximated by a quadratic equation, y = 17.837x^2 + 51.447x + 611.2. That means the ultimate load of the columns increases nonlinearly with the thickness of the steel tube. The same conclusion (i.e., a quadratic fit) was obtained for the cases of H, W, f_y, and f'_c, but with different amplitudes (see Figures 8(a), 8(b), 8(e), and 8(f) for details of the equations). In the case of L, by contrast, a third-order equation should be used to describe the relationship between N_u and L. It is observed that the effect of H, W, t, f_y, and f'_c on N_u is positive. However, Figure 8(d) shows that L exhibits a negative effect on N_u. These observations are in accordance with the literature. If the length increases, the column becomes slender, and thus the ultimate load decreases. On the other hand, the ultimate load increases when increasing all other variables, especially those defining the cross-sectional area (i.e., H, W, and t) [21,35,36]. The PD analysis presented herein demonstrates that the machine learning technique can assist in the design of rectangular CFST members. In addition to a reliable prediction of ultimate load, as presented above, ANFIS can also assist in the creation of ultimate load maps, as illustrated in Figure 9. In particular, four input values are kept constant and a performance map is created, which depicts the influence of the other two input parameters on ultimate load. Thus, the proposed ANFIS model can create a huge number of maps, each time selecting the parameters that will be kept constant in order to examine the influence of the other two parameters on ultimate load.
In Figure 9, four maps of ultimate load are presented (same color range), involving the relationship between ultimate load and t-L, t-W, t-f_y, and t-f'_c, respectively. Figure 9(a) illustrates that a maximum value of ultimate load can be obtained if t reaches its maximum and L reaches its minimum value. On the other hand, the ultimate load reaches its minimum if L reaches its highest value and t reaches its smallest value. This map confirms the negative effect of L as identified by PD previously. In Figure 9(b), the ultimate load increases when increasing both t and W (i.e., increasing the cross-sectional area). The ultimate load is small when t and W are small. In Figures 9(c) and 9(d), the same remarks as in Figure 9(b) apply. However, the ultimate load may not reach maximum values as high as those in Figures 9(a) and 9(b). This remark confirms that the geometric parameters of the cross section are more important than the mechanical properties of the constituent materials. Generally speaking, these observations are in close agreement with the experimental results in the literature [21,35,36]. The maps presented herein aim exclusively to demonstrate the advantage of the proposed machine learning approach. More datasets covering a wider range are urgently required in order to deliver reliable maps, and this will be the main goal for future work.
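The map construction described above, with four inputs held constant and two varied over a grid, can be sketched as follows. This is an illustration only: `toy_predict` is a stand-in that returns the plain squash load (steel area times yield stress plus concrete area times compressive strength), not the trained ANFIS model, and the baseline values and grid ranges are made-up numbers rather than values from the database.

```python
import numpy as np

def toy_predict(X):
    """Stand-in predictor (NOT the trained ANFIS): squash load in kN.

    Columns of X: H, W, t, L (mm) and f_y, f'_c (MPa). L is ignored here.
    """
    H, W, t, L, fy, fc = X.T
    area_concrete = (H - 2 * t) * (W - 2 * t)
    area_steel = H * W - area_concrete
    return (fy * area_steel + fc * area_concrete) / 1000.0

def load_map(predict, base, idx_a, grid_a, idx_b, grid_b):
    """Evaluate `predict` on a 2-D grid of two inputs, other inputs fixed at `base`."""
    A, B = np.meshgrid(grid_a, grid_b, indexing="ij")
    X = np.tile(np.asarray(base, float), (A.size, 1))  # one row per grid point
    X[:, idx_a] = A.ravel()
    X[:, idx_b] = B.ravel()
    return predict(X).reshape(A.shape)                  # rows follow grid_a

# Hypothetical baseline specimen: H, W, t, L, f_y, f'_c
base = [150.0, 150.0, 4.0, 600.0, 300.0, 40.0]
t_grid = np.linspace(2.0, 8.0, 4)      # tube thickness, mm
w_grid = np.linspace(100.0, 200.0, 5)  # section width, mm
Z = load_map(toy_predict, base, 2, t_grid, 1, w_grid)
```

Replacing `toy_predict` with any fitted model that accepts the same six-column input would produce a t-W map analogous to Figure 9(b); the resulting array can be passed to a contour or surface plotting routine.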
To quantify the level of influence (i.e., sensitivity rate) of each input variable, the integral of each PD curve was computed and served as an indicator of importance. Figure 10 plots the area under the PD curve for each of the six input variables as a bar graph (normalized to 1). The ANFIS model demonstrates that the geometric parameters of the cross section (i.e., t, W, and H) are the most important variables, followed by L, f'_c, and f_y, respectively. Overall, without solving

Conclusion and Outlook
In this work, an ANFIS model was developed and trained to predict the ultimate load of rectangular CFST structural members under compression. Various statistical criteria such as the coefficient of determination (R²), root mean square error (RMSE), and slope of linear regression were employed for error assessment. A hybrid combination of least-squares and backpropagation gradient descent methods was used to train the ANFIS model. An 11-fold cross-validation technique was employed to evaluate the overall performance of the model. In comparison with the literature, the ANFIS model exhibited excellent potential as a surrogate model for the prediction of the ultimate load of rectangular CFST columns. Moreover, the ANFIS model allowed us to quantitatively explore the influence of each input variable on the output response through PD analysis. In addition, many ultimate load maps were created using the ANFIS model. Such analysis could be useful in structural engineering design and evaluation. The developed ANFIS model could be useful in the predesign process, by providing initial estimates of the ultimate load before any experimentation is conducted.
However, the application of an AI-based model is not always relevant for practical engineering. In further studies, an explicit empirical equation based on the ANFIS model developed here should be derived for better use in design and analysis. In addition, a numerical finite element scheme should be investigated for studying the mechanical behavior of composite structures at both micro and macro scales. The finite element scheme could also be coupled with AI approaches to shed further light on the relationship between the micro and macro behaviors of CFST members. In future research, a broader database should be used in order to cover more material properties and geometric ranges.
Finally, the methodology used in this work could be extended to estimate macroscopic properties of CFST members under different loads and boundary conditions (bending, eccentric compression, beam-column joint, etc.).

Data Availability
The Excel data used to support the findings of this study are available from the corresponding author upon request.

Advances in Civil Engineering