Shear Strength of Internal Reinforced Concrete Beam-Column Joints: Intelligent Modeling Approach and Sensitivity Analysis

In this paper, an intelligent modeling approach is presented to predict the shear strength of internal reinforced concrete (RC) beam-column joints and to analyze the sensitivity of the shear strength to its influencing factors. The proposed approach is based on a well-known boosting-family ensemble machine learning (ML) algorithm, i.e., the gradient boosting regression tree (GBRT), which generates a strong predictive model by integrating several weak predictors obtained by well-known individual ML algorithms, e.g., DT, ANN, and SVM. The strong model is boosted because each weak predictor has its own weight in the final combination according to its performance. Compared with conventional mechanical-driven shear strength models, e.g., the well-known modified compression field theory (MCFT), the proposed model avoids the complicated derivation of the shear mechanism and the calibration of the empirical parameters involved; thus, it provides a more convenient, fast, and robust alternative for predicting the shear strength of internal RC joints. To train and test the GBRT model, a total of 86 internal RC joint specimens are collected from the literature, and four traditional ML models and the MCFT model are also employed for comparison. The results indicate that the GBRT model is superior to both the traditional ML models and the MCFT model, as its degree of fitting is the highest and its prediction dispersion is the lowest. Finally, the model is used to investigate the influences of different parameters on the shear strength of the internal RC joint, and the sensitivity and importance of the corresponding parameters are obtained.


Introduction
The reinforced concrete (RC) beam-column joint, or connection, is one of the most critical and vulnerable components in RC structures. The failure of RC beam-column joints could seriously affect the overall safety of the structure. In particular, a joint will suffer shear failure if the transverse reinforcement is insufficient and/or the material properties have deteriorated due to aging. Shear failure is a brittle failure type that occurs without warning. Therefore, it is vital to accurately predict the shear strength of RC beam-column joints in the design procedure in order to avoid shear failure and ensure the safety of the structure.
In general, there are three commonly used approaches to assess the shear strength of RC joints, i.e., experimental study, numerical simulation, and theoretical analysis. The experimental study is the most direct and classical way, which can be traced back to the 1970s [1]. However, it is costly in both time and money and difficult to operate. Numerical simulation, e.g., the finite element method (FEM), is also widely adopted for its low cost [2,3]. Nevertheless, it usually involves several simplifications, and some mechanisms are hard to reflect in the FEM framework, e.g., multistress state behavior, shear behavior, and interfacial bond-slip behavior. Apart from the experimental and numerical studies, numerous theoretical models have also been proposed to evaluate the performance of RC beam-column joints, for instance, the well-known modified compression field theory (MCFT) [4], the strut-and-tie method (STM) [5], etc. These models are derived from the shear mechanisms of fundamental RC elements and can be widely used to evaluate the behavior of any type of shear-dominated RC member, including beam-column joints [6]. A detailed review of the theoretical and empirical models for RC joints can be found in [7].
In the last five years, there have been several new developments in RC joint models. Eom et al. [8] developed an energy-based hysteresis model for RC beam-column joints by using the energy function and the existing backbone curve of ASCE/SEI 41-06 [9]. Hwang et al. [10] proposed a shear strength degradation model for performance-based design of interior beam-column joints. In the model, all possible failure mechanisms of beams and joints, including flexural yielding of the beam end, diagonal cracking and concrete crushing in the joint panel, bar bond-slip, and bar elongation, are considered. Later, Hwang and Park [11] developed design equations of the joint shear strength and hoop requirement for the performance-based design of interior RC beam-column joints by considering the diagonal strut mechanism and truss mechanism. The target drift ratio and bar bond parameters are defined as the requirements of the joint shear strength and hoop strength. More recently, Hwang and Park [12] modified the shear strength degradation model for interior RC joints and applied it to exterior RC joints with standard hooked bars. Hwang et al. [13] simplified the softened strut-and-tie model to facilitate design practice for the strength prediction of discontinuity regions such as RC beam-column joints. The shear-resisting mechanisms suggested by the softened strut-and-tie model are considered in the simplified model. Similarly, Huang and Kuang [14] proposed a shear strength model for exterior RC wide beam-column joints by introducing the softened strut-and-tie concept. Hassan and Moehle [15] collected a database of exterior and corner beam-column joints without transverse reinforcement. Based on the database, they evaluated several existing shear strength models and developed a strut-and-tie model based on the ACI 318 [16] strut-and-tie modeling provisions, as well as an empirical model that considers the effects of joint aspect ratio, column axial load, and concrete compressive strength.
Although the above empirical or theoretical approaches offer a simple and clear explanation of the shear mechanism, they also introduce empirical assumptions that reduce their accuracy. Moreover, the derivations tend to be complicated, since an iteration process is usually involved and some of the parameters are empirical and need to be determined according to the user's experience.
In recent years, with the flourishing of artificial intelligence (AI), a new approach has emerged, i.e., using machine learning (ML) techniques to predict the shear strength of RC beam-column joints. ML is a type of AI that serves various functions, e.g., classification, regression, and clustering. ML can learn the characteristics of a certain type of data from an existing database and then classify, summarize, and predict the quantities of interest. Prediction of the shear strength of RC joints is essentially a regression problem.
There are already some successful applications of ML-based prediction in structural engineering, for instance, evaluating the cement strength via fuzzy logic, artificial neural network (ANN), and gene expression programming (GEP) [17,18], modeling the concrete properties via ANN and support vector machine (SVM) [19][20][21][22][23], simulating the failure of brittle anisotropic materials such as masonry via ANN [24,25], predicting the structural member capacities via hybrid ML algorithms [26,27], detecting structural damage via GEP [28,29], etc. A detailed state-of-the-art review of the applications of ML in structural engineering is given in [30].
However, the majority of the ML algorithms used in the previous studies were individual-type learning algorithms such as the ANN family [31], the SVM family [32], and the decision tree (DT) family [33]. The disadvantages of the individual-type learning algorithms are their instability and low accuracy. To improve their performance, a new type of learning algorithm known as ensemble learning has recently been proposed and successfully applied in various fields. The basic idea of ensemble learning is to combine several weak learners generated by individual learning algorithms into a strong one. In brief, ensemble learning algorithms are more stable and accurate than individual learning algorithms [34]. There are mainly two categories of ensemble learning algorithms: bagging and boosting. For the bagging family, the weak learners are produced in parallel, while for the boosting family, they are produced in sequence.
Theoretically, bagging is more efficient, since its weak learners can be trained in parallel, and it effectively reduces the variance of the prediction, whereas boosting is relatively less efficient, since its weak learners are trained in sequence, but it reduces the bias. In practice, boosting is superior to bagging in terms of accuracy in general cases. Therefore, one of the most typical boosting ensemble learning algorithms, referred to as the gradient boosting regression tree (GBRT) [35] algorithm, is used in this study.
In this paper, we aim to develop a GBRT-based intelligent method for predicting the shear strength of RC beam-column joints and to compare the proposed data-driven model with some traditional ML-based models as well as the conventional mechanical-driven MCFT model. Firstly, some individual-type ML techniques, including linear regression (LR), SVM, ANN, and DT, are briefly reviewed. Then, the mathematical background and implementation of GBRT are introduced. Afterwards, the shear strength data of 86 internal RC beam-column joints are collected from the literature. Based on the database, the prediction results from the GBRT-based model are verified by a 10-fold cross-validation test and compared with those from the individual-type ML models. In addition, one of the representative conventional mechanical-driven approaches, i.e., MCFT, is briefly summarized and also compared with the GBRT model. Finally, a sensitivity analysis of the input variables is conducted for the GBRT model to quantify the influences of different parameters.

Linear Regression (LR).
Linear regression (LR) is one of the most widely used statistical analysis techniques for determining the quantitative relationship between two or more variables. In general, the least squares method is adopted to solve the LR problem. If only one independent variable and one dependent variable are considered and the relationship between them is approximately linear, this type of regression analysis is called simple linear regression (SLR). In contrast, if two or more independent variables are included and the relationship between the independent and dependent variables is approximately linear, the regression analysis is called multiple linear regression (MLR). For the prediction problem considered in this study, more than two input parameters are assigned as independent variables, so it belongs to MLR.
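As an illustration of MLR solved by the least squares method, the following minimal sketch fits an intercept and two slopes through the normal equations. The function names `fit_mlr` and `predict_mlr` and the toy data are assumptions made for illustration only; they are not taken from the study.

```python
def fit_mlr(X, y):
    """Ordinary least squares for y ~ b0 + b1*x1 + ... via the normal equations."""
    n, p = len(X), len(X[0]) + 1
    A = [[1.0] + list(row) for row in X]  # design matrix with an intercept column
    # Augmented matrix of the normal equations (A^T A) b = A^T y.
    M = [[sum(A[k][i] * A[k][j] for k in range(n)) for j in range(p)]
         + [sum(A[k][i] * y[k] for k in range(n))] for i in range(p)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(p):
            if r != col:
                factor = M[r][col] / M[col][col]
                M[r] = [a - factor * b for a, b in zip(M[r], M[col])]
    return [M[i][p] / M[i][i] for i in range(p)]


def predict_mlr(coef, row):
    """Evaluate the fitted linear model for one sample."""
    return coef[0] + sum(c * v for c, v in zip(coef[1:], row))


# Hypothetical toy data generated from y = 1 + 2*x1 + 3*x2 (not the joint database).
X = [[0, 0], [1, 0], [0, 1], [1, 1], [2, 1]]
y = [1 + 2 * a + 3 * b for a, b in X]
coef = fit_mlr(X, y)
```

Because the toy targets are exactly linear in the inputs, the recovered coefficients should reproduce the generating intercept and slopes up to rounding error.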

Support Vector Machine (SVM).
Based on the statistical learning theory proposed by Vapnik [36], the support vector machine (SVM) is an effective optimization tool that improves the generalization performance and obtains a globally optimal and unique solution. In implementing SVM regression, the primary goal is to minimize an upper bound of the generalization error based on structural risk minimization. The essence of SVM regression is to map the input variables into a high-dimensional feature space by a nonlinear mapping and then conduct linear regression in that space.

Artificial Neural Networks (ANN).
The artificial neural network (ANN) is a complex information processing system composed of a large number of interconnected processing elements (neurons) arranged in layers. It is an abstraction, simplification, and simulation of the structure and mechanism of biological nervous systems such as the human brain. Just as in the learning process of biological systems, training an ANN involves adjustments to the synaptic connections between the neurons. When applied to engineering problems, a neural network acts as a vector mapper that maps one or more input vectors to one or more output vectors.

Decision Trees (DT).
The decision tree (DT) is one of the basic classification and regression methods. The DT regression approach mainly refers to one of the binary tree structures, i.e., the classification and regression tree (CART) algorithm, in which the test at each internal node has only two outcomes, yes or no. The main task of CART is to divide the feature space into several units, each of which has a certain output. As each node is judged by yes or no, the dividing boundaries are parallel to the coordinate axes. Any testing sample falls into one unit according to its features and thus obtains the corresponding output.
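The axis-parallel, yes-or-no splitting described above can be sketched as a search for the single threshold that minimizes the squared error of the two resulting units. This is only an illustrative sketch of one CART split under the squared-error criterion; the helper name `best_split` and the toy data are hypothetical.

```python
def best_split(X, y):
    """Return (feature index, threshold, squared error) of the best axis-parallel split."""
    def sse(values):
        # Sum of squared deviations from the mean output of one unit.
        if not values:
            return 0.0
        mean = sum(values) / len(values)
        return sum((v - mean) ** 2 for v in values)

    best = None
    for j in range(len(X[0])):
        levels = sorted(set(row[j] for row in X))
        # Candidate thresholds are midpoints between consecutive distinct values.
        for lo, hi in zip(levels, levels[1:]):
            t = (lo + hi) / 2
            left = [yi for row, yi in zip(X, y) if row[j] <= t]   # answer "yes"
            right = [yi for row, yi in zip(X, y) if row[j] > t]   # answer "no"
            err = sse(left) + sse(right)
            if best is None or err < best[2]:
                best = (j, t, err)
    return best


# Hypothetical toy data: the output depends only on the first feature.
X = [[1.0, 5.0], [2.0, 3.0], [3.0, 8.0], [4.0, 1.0]]
y = [10.0, 10.0, 20.0, 20.0]
feature, threshold, error = best_split(X, y)
```

A full regression tree would apply the same search recursively inside each unit until a stopping rule such as a maximum depth or a minimum number of samples is met.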

Boosted ML Approach: Gradient Boosting Regression Tree (GBRT)
Though the abovementioned traditional ML methods have already been applied in several aspects of structural engineering, including predicting the behavior of structural members, some drawbacks remain. In some cases, a "best" model may not be easily obtained using those algorithms. Meanwhile, models built by different algorithms have their own hypotheses, which may lead to great model uncertainty. Therefore, this paper employs ensemble learning to generate the predictive model for the joint shear strength. Specifically, the boosting-family gradient boosting regression tree (GBRT) is adopted. The ensemble learning method is superior to the individual learning method since it offers a powerful framework to obtain a strong estimator (or learner) by integrating several weak estimators (or learners) produced by the individual learning method, so both accuracy and robustness are enhanced. The boosting idea is reflected in the weights of the weak learners: the one with a higher score gets a higher weight in the final strong learner. The theoretical background and the implementation procedure are presented herein.

Gradient Boosting Framework of Ensemble ML.
As mentioned before, ensemble learning is not an individual-type ML method. It is accomplished by integrating multiple weak learners into a strong one. Boosting is a major group of ensemble learning algorithms, which generates the weak learners sequentially and can be interpreted as an optimization algorithm on a suitable cost function. The basic idea of boosting is to update the weight of each weak learner by its learning error. If a weak learner has a large learning error, it will be assigned a large weight so that it is paid more attention in the subsequent training process. Like other boosting methods, gradient boosting integrates several weak learners into a single strong learner in an iterative way.
Suppose that M steps are required to find the final strong learner and that, at the $m$-th step ($m \in [1, M]$), we have an imperfect model $f_m(x)$, which is the sum of the weak learners in the previous steps:

$$f_m(x) = \sum_{i=1}^{m-1} \alpha_i h_i(x), \tag{1}$$

where $x$ is the vector containing the input variables; $h_i(x)$ and $\alpha_i$ are the weak learner and the corresponding weight at the $i$-th step. Then, the optimization problem becomes how to find $h_m(x)$. The solution of gradient boosting starts with the observation that a perfect $h_m(x)$ would imply

$$f_{m+1}(x) = f_m(x) + h_m(x) = y, \tag{2}$$

where $y$ is the target output or the tested value of the output. Equation (2) can be equivalently expressed as

$$h_m(x) = y - f_m(x). \tag{3}$$

Therefore, the gradient boosting algorithm fits $h_m(x)$ to the residual $y - f_m(x)$. As in other boosting algorithms, $f_{m+1}(x)$ attempts to correct the errors of its predecessor $f_m(x)$. It is observed that the residual $y - f_m(x)$ is the negative gradient of the squared loss function $\frac{1}{2}[y - f_m(x)]^2$ with respect to $f_m(x)$, so the residual can be replaced by the negative gradient for other kinds of loss functions. In other words, the gradient boosting algorithm is a gradient descent algorithm, which can be generalized by varying the loss function and its gradient.

Gradient Boosting Regression Tree (GBRT).
As can be seen in the previous section, gradient boosting is a framework for ensembling numerous weak learners rather than a specific learning algorithm. Theoretically, any individual algorithm from the ANN, SVM, and DT families can be used to train the weak learners. However, unlike in other boosting algorithms, the individual algorithm for training the weak learners in gradient boosting is restricted to DT algorithms; thus, the method is called GBRT. In each step (or iteration), a new DT is established by fitting the negative gradient of the loss function. The number of DTs is determined by the number of iterations.
The GBRT model superimposes multiple DTs and is expressed as

$$f_{M+1}(x) = \sum_{m=1}^{M} T(x; \Theta_m), \tag{4}$$

where $T(x; \Theta_m)$ represents the weak learner given by a DT; $\Theta_m$ denotes the parameters of the $m$-th DT; and $M$ is the number of DTs. For a dataset $D = \{(x_1, y_1), (x_2, y_2), \ldots, (x_N, y_N)\}$, where $N$ denotes the number of samples, the essence of training the boosted DT model is selecting the optimal parameters of the DTs

$$\{\hat{\Theta}_m\}_{m=1}^{M} = \arg\min_{\{\Theta_m\}} \sum_{i=1}^{N} L\left(y_i, \sum_{m=1}^{M} T(x_i; \Theta_m)\right). \tag{5}$$

Here, the loss function $L$ is used to reflect the difference between the sample real value $y_i$ and the output of the GBRT. Note that the GBRT model in equation (4) can also be written in a forward stepwise way and expressed as

$$f_{m+1}(x) = f_m(x) + T(x; \Theta_m), \quad m = 1, 2, \ldots, M. \tag{6}$$

Therefore, training of the GBRT model can be achieved by $M$ iteration steps. Specifically, at the initial step, we define $f_1(x) = 0$, and for the $m$-th iteration step, a new $T(x; \Theta_m)$ is generated. The parameters $\Theta_m$ of $T(x; \Theta_m)$ should be obtained to minimize the loss function

$$\hat{\Theta}_m = \arg\min_{\Theta_m} \sum_{i=1}^{N} L\left(y_i, f_m(x_i) + T(x_i; \Theta_m)\right), \tag{7}$$

where $\hat{\Theta}_m$ are the optimal DT parameters. If the squared loss function is used, then one obtains

$$L\left(y_i, f_m(x_i) + T(x_i; \Theta_m)\right) = \tfrac{1}{2}\left[y_i - f_m(x_i) - T(x_i; \Theta_m)\right]^2 = \tfrac{1}{2}\left[r_{m,i} - T(x_i; \Theta_m)\right]^2, \tag{8}$$

where $r_{m,i} = y_i - f_m(x_i)$ is the residual of the current model. Therefore, minimizing equation (8) amounts to selecting appropriate $\Theta_m$ to minimize the difference between the residual $r_{m,i}$ and the output $T(x_i; \Theta_m)$ of the DT. Equivalently, $\{(x_i, r_{m,i})\}_{i=1,2,\ldots,N}$ can be used as the sample set of the decision tree $T(x; \Theta_m)$, and the optimal parameters $\hat{\Theta}_m$ are obtained according to the conventional DT generation process.
Moreover, in a more generalized sense, the negative gradient of the loss function can be used to represent the residual of the model, i.e.,

$$r_{m,i} = -\left[\frac{\partial L\left(y_i, f(x_i)\right)}{\partial f(x_i)}\right]_{f(x) = f_m(x)}, \quad i = 1, 2, \ldots, N. \tag{9}$$

With $\{(x_i, r_{m,i})\}_{i=1,2,\ldots,N}$, we can fit the $m$-th DT $h_m$, whose leaf nodes can be represented by $R_{m,j}$, $j = 1, 2, \ldots, J$, where $J$ indicates the number of leaf nodes of the DT. For each leaf node of the regression tree $h_m$, the optimal fitting value $c_{m,j}$ is calculated as

$$c_{m,j} = \arg\min_{c} \sum_{x_i \in R_{m,j}} L\left(y_i, f_m(x_i) + c\right). \tag{10}$$

Then, the weak learner for this step can be written as

$$h_m(x) = \sum_{j=1}^{J} c_{m,j}\, I\left(x \in R_{m,j}\right), \tag{11}$$

where $I(\cdot)$ is the indicator function, and the updated strong learner till this step is

$$f_{m+1}(x) = f_m(x) + \sum_{j=1}^{J} c_{m,j}\, I\left(x \in R_{m,j}\right). \tag{12}$$

After $M$ steps, the strong learner is finally obtained by

$$f_{M+1}(x) = f_1(x) + \sum_{m=1}^{M} \sum_{j=1}^{J} c_{m,j}\, I\left(x \in R_{m,j}\right). \tag{13}$$

The procedure of the GBRT algorithm can be summarized as follows:
(1) Initialization of the function for the weak learner
(2) For each iteration $m = 1, 2, \ldots, M$, calculation of the negative gradient using equation (9), fitting of the $m$-th DT, calculation of the optimal fitting values of the leaf nodes using equation (10), and update of the strong learner using equation (12)
(3) After M iterations, the strong learner is obtained using equation (13)

Implementation of GBRT.
In this study, one of the most widely used DTs, i.e., CART, is employed as the individual learning algorithm. The implementation of the GBRT can be summarized in the following four steps:
(1) Collect and process the data, including the setting of input/output variables and the grouping of the training/testing datasets
(2) Train the regression model using the GBRT with the training dataset
(3) Validate the trained model with the testing dataset
(4) Apply the model to realistic problems
The flowchart of the abovementioned procedure is depicted in Figure 1.
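The boosting iteration described in this section can be sketched in a few lines for the squared loss, for which the negative gradient equals the residual. The sketch below is a simplified stand-in rather than the implementation used in the study: it uses depth-1 trees (stumps) instead of full CART trees, it adds a learning-rate factor to each tree, and the names `fit_stump` and `fit_gbrt` as well as the toy data are hypothetical.

```python
def fit_stump(X, r):
    """Fit a depth-1 regression tree (stump) to the targets r by least squares."""
    best = None
    for j in range(len(X[0])):
        levels = sorted(set(row[j] for row in X))
        for lo, hi in zip(levels, levels[1:]):
            t = (lo + hi) / 2
            left = [ri for row, ri in zip(X, r) if row[j] <= t]
            right = [ri for row, ri in zip(X, r) if row[j] > t]
            # For the squared loss, the optimal leaf value is the mean residual.
            c_left, c_right = sum(left) / len(left), sum(right) / len(right)
            err = (sum((v - c_left) ** 2 for v in left)
                   + sum((v - c_right) ** 2 for v in right))
            if best is None or err < best[0]:
                best = (err, j, t, c_left, c_right)
    _, j, t, c_left, c_right = best
    return lambda row: c_left if row[j] <= t else c_right


def fit_gbrt(X, y, n_trees=50, learning_rate=0.1):
    """Gradient boosting with squared loss: each stump fits the current residuals."""
    trees = []
    pred = [0.0] * len(y)  # the initial model is set to zero
    for _ in range(n_trees):
        # Negative gradient of 1/2 (y - f)^2 with respect to f is the residual.
        residual = [yi - pi for yi, pi in zip(y, pred)]
        tree = fit_stump(X, residual)
        trees.append(tree)
        pred = [pi + learning_rate * tree(row) for pi, row in zip(pred, X)]
    return lambda row: sum(learning_rate * tree(row) for tree in trees)


def mse(model, X, y):
    return sum((model(row) - yi) ** 2 for row, yi in zip(X, y)) / len(y)


# Hypothetical toy data (a quadratic trend), not the joint database.
X = [[float(i)] for i in range(10)]
y = [float(i * i) for i in range(10)]
weak = fit_gbrt(X, y, n_trees=1)
strong = fit_gbrt(X, y, n_trees=200)
```

On this toy problem the training error of the 200-tree model is far below that of a single tree, which illustrates how the sequence of weak learners is accumulated into a strong one.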
Another important issue associated with the implementation of GBRT is the determination of the model parameters, which exist at two levels, i.e., the framework level and the level of the individual learning algorithm. At the framework level, there are two main parameters, i.e., the number of iterations (number of weak learners) and the learning rate, which is used to avoid overfitting. At the level of the individual learning algorithm, there are four primary parameters, i.e., the maximum depth of the tree, the minimum number of samples for a split, the minimum number of samples at a leaf node, and the minimum change in impurity. The values of these parameters are selected based on previous studies in the literature and practical modeling experience, as shown in Table 1.

Collection of Experimental Data for Shear Strength of Internal RC Beam-Column Joints
In implementing the ML techniques for prediction of the shear strength of RC joints, an experimental database is required to train and validate the predictive model. Therefore, a database including the experimental results of 86 internal RC beam-column joints was collected for this purpose. In the database, there are 10 input parameters covering the material properties, geometric dimensions, and reinforcing details of the test specimens, i.e., the concrete strength $f_c$, the section width of the column $b_c$, the section height of the column $h_c$, the section width of the beam $b_b$, the section height of the beam $h_b$, the yielding strength of the beam longitudinal bar $f_{y,b}$, the yielding strength of the column longitudinal bar $f_{y,c}$, the yielding strength of the joint transverse bar $f_{yv}$, the transverse bar ratio $\rho_v$, and the axial load ratio $n$. The only output is the joint shear strength $\tau$. The statistical information of these parameters, e.g., mean and standard deviation (St.D.), and their distributions are given in Table 2 and Figure 2. The details of the tested specimens in the database are given in Table 3.

10-Fold Cross-Validation Results.
To validate the proposed method, the 10-fold cross-validation method is first used to evaluate the model's performance. The 10-fold cross-validation method is designed to minimize the bias associated with random sampling of the training and testing datasets. It divides the experimental data samples into 10 subsets, and for each run, 9 are set as training subsets and 1 is set as the validating subset. Repeating this 10 times is considered able to represent the generalization and reliability of the predictive model. Moreover, three commonly used indicators are introduced to assess the prediction performance, which are respectively defined as follows.
Coefficient of determination R-squared ($R^2$):

$$R^2 = 1 - \frac{\sum_{i=1}^{N} (P_i - T_i)^2}{\sum_{i=1}^{N} (T_i - \bar{T})^2}. \tag{14}$$

Root mean squared error (RMSE):

$$\mathrm{RMSE} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} (P_i - T_i)^2}. \tag{15}$$

Mean absolute error (MAE):

$$\mathrm{MAE} = \frac{1}{N} \sum_{i=1}^{N} \left|P_i - T_i\right|, \tag{16}$$

where $P_i$ and $T_i$ are the predicted and tested values, respectively; $\bar{T}$ is the mean value of all the tested values; $N$ is the total number of samples in the dataset. Among the three indicators, $R^2$ indicates the degree of linear correlation between the predicted and tested values, RMSE shows the deviation between the predicted and tested values, and MAE reflects the average magnitude of the prediction error. The closer $R^2$ is to 1 and the smaller the RMSE and MAE are, the better the prediction model performs. Table 4 shows the 10-fold cross-validation statistical results of the GBRT model.
It can be drawn from

Figure 1: Flowchart of implementation of GBRT.

i.e., 80% of the data is used for training and 20% of the data is used for testing. Figure 3 shows the prediction results of the GBRT model and the four general ML models for the testing dataset. It is clear that the GBRT model has a stronger linear correlation than the other four ML models. The reason is that the GBRT is an ensemble learning algorithm that yields a strong learner, while the other four models use individual-type learning algorithms with weak learners. Table 5 exhibits the prediction performance of the five ML models by providing the average statistical indices of the 10-fold cross-validation results. The GBRT model has the $R^2$ closest to 1 and the smallest values of RMSE and MAE among the five ML models. This further verifies the superiority of the GBRT model over the general individual-type ML models.
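The three indicators and the 10-fold partition can be sketched as follows. This is an illustrative sketch, assuming the usual definitions of $R^2$ (one minus the ratio of residual to total sum of squares), RMSE, and MAE; the function names, the random seed, and the three-point example are hypothetical and unrelated to the reported results.

```python
import math
import random


def r2_score(pred, true):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_t = sum(true) / len(true)
    ss_res = sum((p - t) ** 2 for p, t in zip(pred, true))
    ss_tot = sum((t - mean_t) ** 2 for t in true)
    return 1.0 - ss_res / ss_tot


def rmse(pred, true):
    """Root mean squared error."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, true)) / len(true))


def mae(pred, true):
    """Mean absolute error."""
    return sum(abs(p - t) for p, t in zip(pred, true)) / len(true)


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs; each sample is tested exactly once."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, folds[i]


# Hypothetical predicted and tested values, used only to exercise the metrics.
pred, true = [2.0, 4.0, 7.0], [2.0, 5.0, 8.0]
splits = list(k_fold_indices(86, k=10))
```

In a full run, a model would be trained on each `train` index set and scored on the matching `test` set, and the three indicators would then be averaged over the 10 folds.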

Comparison with Conventional Mechanical-Driven Approach
Typical Mechanical-Driven Approach: MCFT.
In this section, the derivation of MCFT is briefly summarized, as it is a representative conventional mechanical-driven shear strength prediction method. A basic assumption of MCFT is that the crack direction of an RC plane element coincides with the direction of the principal compressive stress and varies accordingly. The definitions of stress, strain, rotational angle, and principal direction are illustrated in Figure 4, where the x-y coordinate system is the local system and the 1-2 coordinate system indicates the principal tensile strain-principal compressive strain system. The strain vector and stress vector of the RC element in the local system are denoted as $[\varepsilon_x \;\; \varepsilon_y \;\; \gamma_{xy}]^T$ and $[f_x \;\; f_y \;\; v_{xy}]^T$, respectively. The derivation of the MCFT includes three parts, i.e., compatibility equations, equilibrium equations, and constitutive equations. The detailed formulations are given as follows.

Compatibility Equations.
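According to Mohr's circle of strain, the principal tensile strain $\varepsilon_1$ and the principal compressive strain $\varepsilon_2$ of the element are calculated as

$$\varepsilon_{1,2} = \frac{\varepsilon_x + \varepsilon_y}{2} \pm \sqrt{\left(\frac{\varepsilon_x - \varepsilon_y}{2}\right)^2 + \left(\frac{\gamma_{xy}}{2}\right)^2}. \tag{17}$$

Accordingly, the rotational angle $\theta$ from the principal strain direction to the x-axis can be obtained by

$$\theta = \frac{1}{2} \arctan\left(\frac{\gamma_{xy}}{\varepsilon_x - \varepsilon_y}\right). \tag{18}$$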
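The compatibility step can be illustrated with the standard Mohr's circle relations for plane strain. The sketch below assumes the engineering shear strain $\gamma_{xy}$ and measures the angle between the x-axis and the principal tensile direction; the sign convention and the function name `principal_strains` are assumptions and may differ from those used in the paper.

```python
import math


def principal_strains(eps_x, eps_y, gamma_xy):
    """Return (eps_1, eps_2, theta) from the plane strain state via Mohr's circle."""
    center = (eps_x + eps_y) / 2
    radius = math.hypot((eps_x - eps_y) / 2, gamma_xy / 2)
    # atan2 keeps the angle well defined when eps_x equals eps_y.
    theta = 0.5 * math.atan2(gamma_xy, eps_x - eps_y)
    return center + radius, center - radius, theta


# Pure shear: the principal directions are inclined at 45 degrees.
eps_1, eps_2, theta = principal_strains(0.0, 0.0, 2.0e-3)
```

For pure shear the principal strains are equal and opposite, which is a convenient check of the implementation.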

Equilibrium Equations.
The basic element consists of steel bars and concrete, so its equilibrium condition can be derived from the stress state shown in Figure 4 and expressed as

$$f_x = \sigma_{cx} + \rho_{sx}\sigma_{sx}, \quad f_y = \sigma_{cy} + \rho_{sy}\sigma_{sy}, \quad v_{xy} = \tau_{cxy}, \tag{19}$$

where $\sigma_{cx}$ and $\sigma_{cy}$ are the normal stresses of concrete in the x and y directions, respectively; $\tau_{cxy}$ is the shear stress of concrete; $\rho_{sx}$ and $\rho_{sy}$ denote the reinforcement ratios in the x and y directions, respectively; $\sigma_{sx}$ and $\sigma_{sy}$ are the normal stresses of the steel bars in the x and y directions, respectively.
Considering the condition of Mohr's circle of stress, the normal stresses and shear stress of concrete are obtained by

$$\sigma_{cx} = \frac{\sigma_{c1} + \sigma_{c2}}{2} + \frac{\sigma_{c1} - \sigma_{c2}}{2}\cos 2\theta, \quad \sigma_{cy} = \frac{\sigma_{c1} + \sigma_{c2}}{2} - \frac{\sigma_{c1} - \sigma_{c2}}{2}\cos 2\theta, \quad \tau_{cxy} = \frac{\sigma_{c1} - \sigma_{c2}}{2}\sin 2\theta, \tag{20}$$

where $\sigma_{c1}$ and $\sigma_{c2}$ are the principal stresses in the 1 and 2 directions.

Constitutive Equations.
With equations (19) and (20), it is found that the stress vector of the RC element can be obtained from the stress states of concrete and steel. Therefore, the constitutive stress-strain relations of these two materials are necessary for the state determination of the element. Specifically, the steel bars are assumed to be in a uniaxial stress state, and the concrete is subjected to a biaxial stress state, which can be described in the two principal directions. For the reinforcing steel, the uniaxial elastic-perfectly-plastic model is adopted, which is given by

$$\sigma_{sx} = E_{sx}\varepsilon_{sx} \le f_{yx}, \quad \sigma_{sy} = E_{sy}\varepsilon_{sy} \le f_{yy}, \tag{21}$$

where $E_{sx}$, $\varepsilon_{sx}$, and $f_{yx}$ are the elastic modulus, strain, and yielding strength of the steel bar in the x direction, respectively; $E_{sy}$, $\varepsilon_{sy}$, and $f_{yy}$ are the elastic modulus, strain, and yielding strength of the steel bar in the y direction, respectively.
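The uniaxial elastic-perfectly-plastic steel model can be written as a one-line clamp of the elastic stress at the yield strength. The sketch assumes the same yield strength in tension and compression; the function name `steel_stress` and the numerical values (in MPa) are illustrative assumptions.

```python
def steel_stress(strain, elastic_modulus, yield_strength):
    """Elastic-perfectly-plastic stress, assumed symmetric in tension and compression."""
    return max(-yield_strength, min(yield_strength, elastic_modulus * strain))


# Hypothetical bar with E = 200000 MPa and a yield strength of 400 MPa.
elastic = steel_stress(0.001, 200000.0, 400.0)     # below the yield strain
yielded = steel_stress(0.010, 200000.0, 400.0)     # beyond the yield strain
compressed = steel_stress(-0.010, 200000.0, 400.0)
```

The stress grows linearly with strain up to the yield strain and then stays constant, which is the plateau assumed for the reinforcing bars.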
For concrete, the shear stress state is distinctly different from the uniaxial stress state. Since the tensile stress perpendicular to the principal compressive direction influences the compressive behavior of concrete, it is recommended to use modified uniaxial stress-strain relationships to represent the stress-strain relationship of the RC plane element subjected to a combined stress state, which are given as follows, in the form proposed in [4].
Stress-strain relationship in the tensile principal direction:

$$\sigma_{c1} = \begin{cases} E_c \varepsilon_1, & \varepsilon_1 \le \varepsilon_{cr}, \\ \dfrac{f_t}{1 + \sqrt{200\,\varepsilon_1}}, & \varepsilon_1 > \varepsilon_{cr}. \end{cases} \tag{22}$$

Stress-strain relationship in the compressive principal direction:

$$\sigma_{c2} = f_{c2\max}\left[2\left(\frac{\varepsilon_2}{\varepsilon_0}\right) - \left(\frac{\varepsilon_2}{\varepsilon_0}\right)^2\right], \tag{23}$$

with

$$f_{c2\max} = \frac{f_c'}{0.8 - 0.34\,\varepsilon_1/\varepsilon_0} \le f_c', \tag{24}$$

where $E_c$ is the elastic modulus of concrete; $f_t$ and $f_c'$ are the tensile and compressive strengths of concrete, respectively; $\varepsilon_{cr}$ and $\varepsilon_0$ are the strains corresponding to the tensile strength and the compressive strength, respectively, with compressive strains taken as negative; $f_{c2\max}$ is the maximum compressive stress in the principal compressive direction. It is clear that the modification in equation (24) accounts for the reduction of the concrete compressive strength due to the existence of tensile stress.

Crack Check.
Note that the abovementioned equations deal with the global behavior of the element in an average sense and cannot describe the local behavior. The local equilibrium across a crack should also be satisfied. In this local equilibrium condition, $\sigma_{sxcr}$ and $\sigma_{sycr}$ are the steel stresses at the crack, and $f_{ci}$ and $v_{ci}$ are the local compressive and shear stresses at the crack, respectively. The condition can be satisfied if there are no local compressive and shear stresses. However, a constraint should be ensured that the steel stresses at the crack do not exceed the yield strengths of the steel, i.e., $\sigma_{sxcr} < f_{yx}$ and $\sigma_{sycr} < f_{yy}$.
Therefore, if this condition is not satisfied, the local stresses should be calculated iteratively. In the expressions for the local stresses, $w$ is the crack width and $a$ is the maximum aggregate size; $f_{ci}$ is calculated according to [4]. The whole process of applying MCFT to the shear strength of internal RC joints is depicted in Figure 5. More details can be found in [4].

Comparison between GBRT and MCFT.
To further evaluate the performance of the GBRT model, the conventional MCFT is also used to predict the shear strength of the 86 internal RC beam-column joints. The statistical results from the MCFT model are compared with those of the GBRT model in Table 6. Note that, to compare the performance of the two models fairly, the prediction results in the 10 testing sets of the 10-fold cross-validation process are used for the GBRT model. As can be seen from Table 6, the determination coefficient of the GBRT model is 25.4% higher than that of the mechanical-driven MCFT model and closer to 1, and the other two indicators drop by more than 50%. In other words, the ML-based method performs clearly better than the MCFT-based method. Furthermore, the predicted and tested values are plotted in Figure 6. Evidently, the GBRT results match the experimental results much better than those of the MCFT model. Table 7 gives the statistical results of the predicted value/tested value ratios for the MCFT and GBRT models. It can be concluded from Table 7 that the GBRT model statistically underestimates the shear strength because the mean value is less than 1, while the MCFT model slightly overestimates the shear strength. The mean predicted value/tested value ratio for the GBRT approach is closer to 1 with less dispersion (St.D.) compared to the MCFT method. Figure 7 further illustrates the predicted value/tested value ratios for the GBRT and MCFT models. The solid line, the top dotted line, and the bottom dotted line represent the mean value, the mean value plus St.D., and the mean value minus St.D., respectively. Evidently, better prediction performance is achieved by the GBRT model.

Sensitivity of Input Parameters.
With the developed GBRT model, it is convenient to investigate the influences of different parameters on the shear strength of the internal RC joint and even to quantify these influences. In this study, 10 input variables with different value ranges are adopted to conduct a comprehensive parametric analysis. In the parametric analysis, the control variable method is used, i.e., one control parameter varies while the other parameters are fixed. Specimen J6 of [44] is used as the reference model. The numerical ranges of the 10 inputs are shown in Table 8. Figure 8 shows the predicted shear strength of the internal RC joints with different input variables by using the GBRT model. It can be drawn from Figure 8

Figure 5: Process of using MCFT for the shear strength of internal RC joints (recoverable steps: check whether the steel stresses at the crack, $\sigma_{sxcr}$ and $\sigma_{sycr}$, are below $f_{yx}$ and $f_{yy}$; integrate the stress state $(\sigma_x, \sigma_y, \tau_{xy})^T$ using equation (19); calculate the steel stresses at the crack $(\sigma_{sxcr}, \sigma_{sycr})^T$).

Feature Importance.
Feature importance, which is used to quantify the importance of the input variables (or features), is evaluated to further investigate the sensitivity of the shear strength of the internal RC joints to each input variable. The calculation of feature importance can be summarized as follows. Firstly, some out-of-bag (OOB) samples are selected. Secondly, the values of the target input variable are randomly shuffled while the other inputs remain unchanged. Then, the feature importance is calculated as the accuracy difference between the two predictions of the GBRT model. Figure 9 shows the relative feature importance of all input variables. It is clear that the concrete strength $f_c$ is the key feature determining the shear strength of the internal RC joints, which is in accordance with the conclusion obtained in the previous subsection. The influences of the yielding strength of the joint transverse bar $f_{yv}$, the transverse bar ratio $\rho_v$, and the axial load ratio $n$ on the shear strength are subdominant. The remaining input variables are insignificant features.
The feature importance results are also in accordance with the sensitivity results obtained before.
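The shuffle-based importance calculation described above can be sketched as a permutation test on any trained predictor. In this sketch the accuracy difference is measured as the increase in mean squared error, the evaluation samples are simply whatever set is passed in (the selection of OOB samples is not reproduced), and the linear `model` is a hypothetical stand-in for a trained GBRT model; all names and data are assumptions.

```python
import random


def permutation_importance(model, X, y, n_repeats=20, seed=0):
    """Mean increase in squared error when one input column is shuffled."""
    rng = random.Random(seed)

    def mse(rows):
        return sum((model(row) - yi) ** 2 for row, yi in zip(rows, y)) / len(y)

    baseline = mse(X)
    importances = []
    for j in range(len(X[0])):
        increase = 0.0
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)  # shuffle only the target input variable
            shuffled = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            increase += mse(shuffled) - baseline
        importances.append(increase / n_repeats)
    return importances


# Hypothetical predictor: strong dependence on x0, weak on x1, none on x2.
model = lambda row: 5.0 * row[0] + 0.5 * row[1]
rng = random.Random(1)
X = [[rng.random(), rng.random(), rng.random()] for _ in range(50)]
y = [model(row) for row in X]
scores = permutation_importance(model, X, y)
```

Dividing each score by the sum of all scores would give relative importances of the kind plotted in a bar chart; an input that the model ignores receives a score of zero.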

Conclusions
This paper presents an ML-based approach to predict the shear strength of internal RC beam-column joints. One of the well-known ensemble learning methods, GBRT, is employed to solve the prediction problem. A database of 86 sets of internal RC joint tests is collected from the literature. Some individual-type ML methods and the conventional MCFT method are adopted for comparison with the developed GBRT prediction model. A sensitivity analysis of the input parameters is conducted for the proposed GBRT-based model. Based on the results, the following conclusions can be drawn:
(1) The GBRT model can accurately and efficiently predict the shear strength of internal RC beam-column joints with given input variables.
(2) If 80% of the whole dataset is used to train the GBRT model, the average determination coefficient $R^2$ of the 10-fold cross-validation is 0.875, which means that the prediction error is low. Meanwhile, the average RMSE and MAE are 0.948 MPa and 0.722 MPa, indicating that the prediction model has a low prediction deviation.

Data Availability
The data will be available from the corresponding author upon request.

Conflicts of Interest
The authors declare that there are no conflicts of interest regarding the publication of this paper.