Prediction of bridge component condition is fundamental for well-informed decisions regarding the maintenance, repair, and rehabilitation (MRR) of highway bridges. The National Bridge Inventory (NBI) condition rating is a major source of bridge condition data in the United States. In this study, a type of generalized linear model (GLM), the ordinal logistic statistical model, is presented and compared with the traditional regression model. The proposed model is evaluated in terms of reliability (the ability of a model to accurately predict bridge component ratings, or the agreement between predictions and actual observations) and model fitness. Six criteria were used for evaluation and comparison: prediction error, bias, accuracy, out-of-range forecasts, Akaike’s Information Criterion (AIC), and log likelihood (LL). An external validation procedure was developed to quantitatively compare the forecasting power of the models for highway bridge component deterioration. The GLM method described in this study allows modeling of ordinal and categorical dependent variables and shows slightly but significantly better model fitness and prediction performance than the traditional regression model.
The highway bridge system is generally considered an essential part of the US transportation infrastructure. The efficient use of public funds for repairing and maintaining bridges requires an effective bridge asset management framework. Transportation management agencies worldwide have begun to adopt bridge management systems (BMS) to determine the optimum future bridge maintenance, repair, and rehabilitation (MRR) strategy at the lowest possible life-cycle cost based on forecasted bridge conditions [
In the United States, highway bridge ratings typically consist of three major components: deck, superstructure, and substructure. The current method for monitoring them relies heavily on visual inspections which only take into account the observed physical health of the bridge. During visual inspections, a condition rating of the three major components is given on an integer scale of 0 to 9 with 8 equal interim levels. On this scale, 0 is failure and 9 is near-perfect condition [
A review of the literature reveals that many researchers have used various models to predict the future condition rating of bridges ([
Applications of deterioration models such as Markov chains and simulation are gaining popularity in forecasting bridge condition ratings; however, they are limited by their inability to provide specific information on the deterioration of an individual infrastructure element [
Multinomial regression is a variant of nonlinear regression that is capable of handling discrete dependent variables with multiple levels. However, bridge condition ratings are commonly represented as variables that are both discrete and ordinal in nature. In multinomial logistic regression, values of the dependent variable do not indicate any order or ranking. Ordinal logistic regression is an extension of multinomial regression that is believed to be theoretically appropriate and practically feasible for modeling bridge component rating changes. Those logistic models have been widely adopted in modeling discrete choices in motor vehicle crash severity and, to a lesser degree, in pipeline deterioration and wastewater utility deterioration [
Moreover, the accuracy of the decision-making relies heavily on the outcomes of a reliable bridge condition forecasting model ([
In this research, the authors evaluate the model's forecasting capability using several measures, including relative closeness and exact accuracy. Moreover, this research validates the model's forecasting power not only with in-sample data but also with external data for the three bridge component ratings.
In this study, an ordinal logistic regression method was developed to predict network-level bridge component ratings with North Dakota 2012 NBI data. A multiple linear regression model was also developed with the same data set as a reference for comparing model fitness and forecasting skill. As stated earlier, the linear model is not perfectly suited to ordinal data; however, it can be used for comparison because this type of model is popular among engineers and is straightforward to develop and use. Six criteria were used to evaluate and compare the two models: prediction error, bias, accuracy, out-of-range forecasts, Akaike’s Information Criterion (AIC), and log likelihood (LL). The developed model was validated with North Dakota 2013 and 2014 NBI data. The application of the model to predicting the MAP-21 bridge performance indicators is also presented and discussed.
Ordinal logistic regression is used to model the relationship between an ordered multilevel dependent variable and independent variables. In this type of model, the values of the dependent variable have a natural order or ranking. One example of an ordinal variable is the bridge component rating (ranging from 0 to 9, with 0 being failed and 9 being near-perfect). When the response categories are ordered, the event being modeled in ordinal logistic regression is not only an outcome in a particular category; the model also preserves the information carried by the ordering of the response categories. Ordinal logistic regression models, also known as proportional odds models, have the following general form [ j= 1,2,…,k-1;
Multinomial logit models do not assume proportional odds and ignore the ordering of the response categories. For k possible outcomes, k-1 independent binary logistic regression models are run, in which one outcome, say k, is chosen as a reference and the other k-1 outcomes are separately regressed against that reference outcome. The general form is given by the following equation:
The restriction of ordinal regression originates from the proportional odds assumption even though ordinal regression takes care of ordinal relationship between levels of the dependent variable [
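To make the proportional odds idea concrete, the following minimal sketch converts cumulative logits into category probabilities. It assumes the common parameterization logit P(Y ≤ j | x) = α_j − β·x with increasing thresholds α_j; the sign convention, the function names, and all numeric values are illustrative assumptions and are not taken from the study.

```python
import math


def ordinal_probabilities(thresholds, coefficients, x):
    """Category probabilities under a proportional odds model.

    Assumed form (not the paper's notation):
        logit P(Y <= j | x) = thresholds[j] - sum(coefficients * x)
    thresholds must be strictly increasing; k-1 thresholds give k categories.
    """
    eta = sum(b * v for b, v in zip(coefficients, x))
    # Cumulative probabilities P(Y <= j) for the first k-1 categories.
    cumulative = [1.0 / (1.0 + math.exp(-(a - eta))) for a in thresholds]
    cumulative.append(1.0)  # P(Y <= highest category) is always 1
    probabilities = []
    previous = 0.0
    for c in cumulative:
        probabilities.append(c - previous)  # P(Y = j) = P(Y <= j) - P(Y <= j-1)
        previous = c
    return probabilities


def most_likely_category(probabilities, labels):
    """Return the label with the highest predicted probability."""
    best = max(range(len(probabilities)), key=probabilities.__getitem__)
    return labels[best]


# Hypothetical four-level example: ratings 4..7, one predictor (age in years).
labels = [4, 5, 6, 7]
thresholds = [-3.0, -1.0, 1.0]   # illustrative alpha_j values
coefficients = [-0.05]           # illustrative beta: older bridges rate lower

new_bridge = ordinal_probabilities(thresholds, coefficients, [0.0])
old_bridge = ordinal_probabilities(thresholds, coefficients, [40.0])
```

Because a single coefficient vector is shared by all thresholds, increasing the age shifts every cumulative probability in the same direction, which is the proportional odds restriction discussed above.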
The National Bridge Inventory (NBI) ASCII database is a unified database compiled by the Federal Highway Administration (FHWA) for all bridges and tunnels in the United States that have public roads passing above or below them [
As stipulated in the National Bridge Inspection Standards, bridges are inspected at least once every 24 months. During these inspections, the conditions of the three major bridge components (deck, superstructure, and substructure) are rated using a standard scale developed by Federal Highway Administration (Table
Condition ratings used in the National Bridge Inventory (NBI).
Code | Meaning | Description |
---|---|---|
9 | Excellent | As new |
8 | Very Good | No problems noted. |
7 | Good | Some minor problems. |
6 | Satisfactory | Structural elements show some minor deterioration. |
5 | Fair | All primary structural elements are sound but may have minor section loss, cracking, spalling or scour. |
4 | Poor | Advanced section loss, deterioration, spalling or scour. |
3 | Serious | Loss of section, deterioration, spalling or scour has seriously affected primary structural components. Local failures are possible. Fatigue cracks in steel or shear cracks in concrete may be present. |
2 | Critical | Advanced deterioration of primary structural elements. Fatigue cracks in steel or shear cracks in concrete may be present or scour may have removed substructure support. Unless closely monitored it may be necessary to close the bridge until corrective action is taken. |
1 | Imminent Failure | Major deterioration or section loss present in critical structural components or obvious vertical or horizontal movement affecting structure stability. Bridge is closed to traffic but corrective action may put it back in light service. |
0 | Failed | Out of service, beyond corrective action. |
Source: United States Department of Transportation. Recording and Coding Guide for the Structure Inventory and Appraisal of the Nation's Bridges. Washington, D.C., 1995, page 38.
In this study, an in-sample fitness assessment is conducted with the data set used to construct the model, and external forecasting validation is conducted with two separate data sets, along with MAP-21 indicators, to explore the model’s forecasting reliability. The ND 2012 data set is used to construct the models, and the ND 2013 and 2014 data sets are used for external forecasting validation. The validation procedures can, however, be demonstrated with any available data set.
Bridge distributions by three component ratings for ND 2012 are displayed in Figure
Bridge distribution by component ratings for ND 2012.
In this study, ordinal logistic regression and multiple linear regression models were constructed to forecast bridge conditions. Model fitness and forecasting skills are evaluated and compared between the two types of models. Several criteria were selected for evaluating and comparing the two models and are introduced in the following section.
The following measures were considered in this research: prediction error (PE), bias, accuracy, out-of-range forecasts, percent of correct estimation, Akaike’s Information Criteria (AIC), and log likelihood (LL). The models were constructed with the same dataset and compared in two senses: model fitness and prediction performance. All seven proposed measurements can be used to assess model fitness with the same data set that was used to build the model. The first five measures can be used to evaluate model forecasting performance with external evaluation data.
The prediction error, also known as residuals, is a measure of the discrepancy between the observed data and an estimated value which can be mathematically expressed as (
Bias indicates, on average, how much a model overpredicts (where bias >1) or underpredicts (where bias <1) the observed data [
The accuracy measurement indicates, on average, how much the prediction differs from observed data [
Fit criteria such as Akaike’s Information Criteria (AIC) are also selected to compare model fitness between the two models. AIC is a common measure of model fit that balances model fit against model simplicity. The model with the smallest AIC is deemed the “best” model based on apparent validation. In other words, a smaller AIC value indicates a better model/predictor. This can be mathematically expressed as
Out-of-range forecasts were counted when the forecasted value for a bridge component was greater than 9 or less than 0. This issue exists only for multiple linear regression; for ordinal regression, the number of out-of-range forecasts is always zero. Percent of correct estimation assesses model performance and fitness by examining the ratio of predictions that agree with the actual observations.
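The residual-based and agreement-based measures described above can be sketched as follows. This is an illustration under stated assumptions rather than the authors' implementation: continuous predictions are rounded half up to the nearest integer rating before being compared with observations, the range check is applied to the unrounded prediction, and AIC uses the standard definition 2k − 2·LL. The function names and sample values are hypothetical.

```python
import math


def round_half_up(value):
    """Round to the nearest integer rating (assumed rounding rule)."""
    return int(math.floor(value + 0.5))


def evaluation_summary(observed, predicted, low=0, high=9):
    """Summarize prediction quality for one bridge component.

    observed  : integer condition ratings
    predicted : model outputs (continuous for linear regression,
                integer for ordinal regression)
    """
    n = len(observed)
    rounded = [round_half_up(p) for p in predicted]
    abs_errors = [abs(o - r) for o, r in zip(observed, rounded)]
    out_of_range = sum(1 for p in predicted if p < low or p > high)
    return {
        "sum_abs_residuals": sum(abs_errors),
        "sum_sq_residuals": sum(e * e for e in abs_errors),
        "pct_out_of_range": 100.0 * out_of_range / n,
        "pct_exact": 100.0 * sum(1 for e in abs_errors if e == 0) / n,
        "pct_within_1": 100.0 * sum(1 for e in abs_errors if e <= 1) / n,
        "pct_within_2": 100.0 * sum(1 for e in abs_errors if e <= 2) / n,
    }


def aic(log_likelihood, n_parameters):
    """Standard Akaike information criterion: 2k - 2*LL (smaller is better)."""
    return 2 * n_parameters - 2 * log_likelihood


# Hypothetical ratings and linear-regression outputs for five bridges.
observed = [7, 6, 5, 8, 4]
predicted = [6.6, 6.2, 3.4, 9.7, 4.0]
summary = evaluation_summary(observed, predicted)
```

The same function can be applied to the data set used for fitting (in-sample assessment) or to a later inspection year (external validation); only the AIC and LL require the fitted likelihood.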
Multiple regression models for forecasting bridge component rating are still used by some transportation agencies such as North Dakota DOT to assist in bridge inventory management [
Description of variables used in analysis.
Name of variable | Description of variable |
---|---|
Reconstruction | Reconstruction record: Yes, No (binary variable) |
Bridge Material Type | Structure materials: Steel, Concrete, Timber (dummy variable) |
District | Highway districts: Bismarck, Devils Lake, Dickinson, Grand Forks, Minot, Valley City, Williston, Fargo (dummy variable) |
Age | Bridge age: inspection year minus construction year, or inspection year minus reconstruction year (continuous variable) |
Age2 | Bridge age squared (continuous variable) |
ADT | Annual daily traffic per lane (continuous variable) |
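As an illustration of how the variables in the table could be turned into a model input row, the sketch below derives the age from the inspection and (re)construction years and expands the categorical variables into dummy indicators. The record field names, the choice of the first level as the reference category, and the example values are assumptions made for this sketch; the study does not specify them.

```python
MATERIALS = ("Steel", "Concrete", "Timber")
DISTRICTS = ("Bismarck", "Devils Lake", "Dickinson", "Grand Forks",
             "Minot", "Valley City", "Williston", "Fargo")


def bridge_age(inspection_year, construction_year, reconstruction_year=None):
    """Age counted from reconstruction if one is recorded, else from construction."""
    reference_year = (construction_year if reconstruction_year is None
                      else reconstruction_year)
    return inspection_year - reference_year


def dummy_indicators(value, levels):
    """Dummy-code a categorical value, dropping the first level as reference."""
    if value not in levels:
        raise ValueError(f"unknown level: {value!r}")
    return [1.0 if value == level else 0.0 for level in levels[1:]]


def encode_bridge(record):
    """Build one numeric row: reconstruction, material, district, age, age^2, ADT."""
    reconstruction_year = record.get("reconstruction_year")
    age = bridge_age(record["inspection_year"],
                     record["construction_year"],
                     reconstruction_year)
    row = [1.0 if reconstruction_year is not None else 0.0]
    row += dummy_indicators(record["material"], MATERIALS)
    row += dummy_indicators(record["district"], DISTRICTS)
    row += [float(age), float(age ** 2), float(record["adt_per_lane"])]
    return row


# Hypothetical bridge record (field names are illustrative).
example = {
    "inspection_year": 2012,
    "construction_year": 1970,
    "reconstruction_year": 2000,
    "material": "Concrete",
    "district": "Fargo",
    "adt_per_lane": 350,
}
row = encode_bridge(example)
```

With one reference level dropped per categorical variable, each row has 13 entries: 1 for reconstruction, 2 for material, 7 for district, and 3 continuous terms.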
Forward stepwise regression based on adjusted R-squared, the Akaike information criterion, and the Bayesian information criterion was used to select the “best” multiple regression model. Detailed regression model selection techniques and theories are outside the scope of this study, and readers are referred to Draper and Smith [
Two candidate models were constructed for predicting deck, superstructure, and substructure component performance ratings, respectively, with North Dakota NBI 2012 data. To illustrate the model performance, the models were first evaluated and compared by an in-sample validation method with previously introduced measurements and then ND NBI data from 2013 and 2014 were used to further conduct external prediction validation. Significant parameters for the two sets of models were tested at 90% confidence level as shown in Table
Significant parameters and statistics with 2012 data.
Model Statistics | Multiple Linear: Deck | Multiple Linear: Superstructure | Multiple Linear: Substructure | Ordinal: Deck | Ordinal: Superstructure | Ordinal: Substructure |
---|---|---|---|---|---|---|
Sample Size | 2354 | 2991 | 2991 | 2354 | 2991 | 2991 |
Reconstruction | Significant | Significant | Significant | Significant | Significant | Significant |
Bridge Material Type | Significant | Significant | Significant | Significant | Significant | Significant |
District | Significant | Significant | Significant | Significant | Significant | Significant |
ADT | Significant, Negative | Not Significant, Negative | Significant, Negative | Significant, Negative | Significant, Negative | Significant, Negative |
Age | Significant, | Significant, | Significant, | Significant, | Significant, | Significant, |
Age2 | Significant, | Significant, | Significant, | Significant, | Significant, | Significant, |
Note: all independent variables are significant at the 90% confidence level.
As shown in Table
To assess how well the model fit the 2012 data, Table
Model comparison statistics with 2012 data.
Model Statistics | Multiple Linear: Deck | Multiple Linear: Superstructure | Multiple Linear: Substructure | Ordinal: Deck | Ordinal: Superstructure | Ordinal: Substructure |
---|---|---|---|---|---|---|
Sample Size | 2354 | 2991 | 2991 | 2354 | 2991 | 2991 |
Sum of Absolute Residuals | 1,656 | 1,906 | 2,380 | 1,433 | 1,494 | 2,200 |
Sum of Residual Squares | 2,446 | 2,722 | 4,090 | 1,933 | 1,986 | 3,580 |
Bias | 1.087 | 1.090 | 1.128 | 1.026 | 1.023 | 1.078 |
Accuracy | 1.117 | 1.112 | 1.180 | 1.103 | 1.081 | 1.174 |
AIC | 5,946 | 6,875 | 8,571 | 5,416 | 5,994 | 7,541 |
LL | -2,957 | -3,421 | -4,270 | -2,845 | -3,131 | -3,950 |
Out-of-range Forecasts | 0.38% | 0.87% | 1.07% | 0% | 0% | 0% |
Percent Exact Estimations | 44% | 47% | 41% | 48% | 56% | 44% |
Percent Estimation within 1 Rating Difference | 87% | 90% | 83% | 92% | 95% | 89% |
Percent Estimation within 2 Rating Differences | 96% | 96% | 93% | 99% | 100% | 98% |
Of the predictions from the linear regression models, 0.38%, 0.87%, and 1.07% fall outside the valid range (0 to 9) for deck, superstructure, and substructure, respectively. The percentages of exact-match predictions (prediction equal to observation) by the three multiple linear regression models are 44.18%, 47.51%, and 41.66%, while the ordinal logistic predictions are better: 48.3%, 56.74%, and 44.13%. The same conclusion holds for the percentage of estimations within one condition-rating difference and within two condition-rating differences: the three ordinal logistic models place a higher percentage of component rating predictions within one or two ratings of the observed rating. The bias and accuracy indicators for the ordinal logistic regression models are all slightly closer to 1 than those for the multiple linear regressions. The sum of absolute residuals, sum of residual squares, AIC, and LL consistently indicate that all ordinal regression models perform better than the multiple regression models. The ordinal models improve the model performance in terms of the sum of residual squares by 20.97%, 27.04%, and 12.47% compared to the multiple regression models for deck, superstructure, and substructure, respectively. Detailed improvement percentage values for all four indicators are shown in Table
Performance improvements by ordinal model compared with multiple linear model.
Model Statistics | Improvement: Deck | Improvement: Superstructure | Improvement: Substructure |
---|---|---|---|
Sum of Absolute Residuals | 13.47% | 21.62% | 7.56% |
Sum of Residual Squares | 20.97% | 27.04% | 12.47% |
AIC | 8.91% | 12.81% | 12.02% |
LL | 3.79% | 8.48% | 7.49% |
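The improvement percentages in the table above are consistent with a relative reduction of each statistic, computed as the difference between the linear and ordinal values divided by the linear value, with magnitudes used for the negative LL. That formula is inferred from the reported numbers rather than stated in the text, so the sketch below should be read as a check on the arithmetic; the dictionary layout is an illustrative choice.

```python
def improvement_pct(linear_value, ordinal_value):
    """Relative reduction of a statistic, in percent, using magnitudes."""
    linear_mag = abs(linear_value)
    ordinal_mag = abs(ordinal_value)
    return 100.0 * (linear_mag - ordinal_mag) / linear_mag


# 2012 deck-model statistics as reported in the comparison table:
# (multiple linear regression value, ordinal regression value)
deck_2012 = {
    "Sum of Absolute Residuals": (1656, 1433),
    "Sum of Residual Squares": (2446, 1933),
    "AIC": (5946, 5416),
    "LL": (-2957, -2845),
}

deck_improvements = {
    name: round(improvement_pct(linear, ordinal), 2)
    for name, (linear, ordinal) in deck_2012.items()
}
```

Applying the same function to the superstructure and substructure columns reproduces the remaining entries, for example (2722 − 1986) / 2722 ≈ 27.04% for the superstructure sum of residual squares.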
To further illustrate the external validation method result, the same ordinal logistic and multiple linear regression models from 2012 data are validated and compared with all ND NBI 2013 and 2014 deck, superstructure, and substructure observed data. Model performances were compared for ordinal logistic and multiple linear regressions by comparing sum of absolute residuals, sum of residual squares, bias, accuracy, out-of-range forecasts, and percentage of estimations which are within one or two rating differences compared with observed component ratings and exact forecasts. The performance results are shown in Tables
Model comparison statistics with 2013 data.
Model Statistics | Multiple Linear: Deck | Multiple Linear: Superstructure | Multiple Linear: Substructure | Ordinal: Deck | Ordinal: Superstructure | Ordinal: Substructure |
---|---|---|---|---|---|---|
Sample Size | 2309 | 2943 | 2943 | 2309 | 2943 | 2943 |
Sum of Absolute Residuals | 1568 | 1832 | 2316 | 1360 | 1480 | 2207 |
Sum of Residual Squares | 2328 | 2642 | 4072 | 1830 | 1950 | 3677 |
Bias | 1.096 | 1.099 | 1.147 | 1.023 | 1.022 | 1.097 |
Accuracy | 1.126 | 1.119 | 1.202 | 1.099 | 1.081 | 1.200 |
Out-of-range Forecasts | 0% | 0.2% | 0.2% | 0% | 0% | 0% |
Percent Exact Forecasts | 45% | 48% | 42% | 49% | 56% | 43% |
Percent Estimation within 1 Rating Difference | 88% | 90% | 84% | 93% | 94% | 87% |
Percent Estimation within 2 Rating Differences | 98% | 97% | 94% | 100% | 100% | 97% |
Model comparison statistics with 2014 data.
Model Statistics | Multiple Linear: Deck | Multiple Linear: Superstructure | Multiple Linear: Substructure | Ordinal: Deck | Ordinal: Superstructure | Ordinal: Substructure |
---|---|---|---|---|---|---|
Sample Size | 2281 | 2906 | 2906 | 2281 | 2906 | 2906 |
Sum of Absolute Residuals | 1550 | 1793 | 2247 | 1350 | 1441 | 2129 |
Sum of Residual Squares | 2350 | 2493 | 3841 | 1832 | 1823 | 3417 |
Bias | 1.087 | 1.079 | 1.105 | 1.025 | 1.022 | 1.053 |
Accuracy | 1.115 | 1.098 | 1.156 | 1.101 | 1.080 | 1.151 |
Out-of-range Forecasts | 0% | 0% | 0.1% | 0% | 0% | 0% |
Percent Exact Forecasts | 46% | 48% | 43% | 50% | 56% | 44% |
Percent Estimation within 1 Rating Difference | 86% | 91% | 84% | 93% | 95% | 87% |
Percent Estimation within 2 Rating Differences | 96% | 98% | 94% | 100% | 99% | 97% |
One can tell from Tables
Performance improvements by ordinal model compared with multiple linear model.
Model Statistics | 2012 Deck | 2012 Superstruc. | 2012 Substruc. | 2013 Deck | 2013 Superstruc. | 2013 Substruc. | 2014 Deck | 2014 Superstruc. | 2014 Substruc. |
---|---|---|---|---|---|---|---|---|---|
Sum of Absolute Residuals | 13.5% | 21.6% | 7.56% | 13.3% | 19.2% | 4.71% | 12.9% | 19.6% | 5.25% |
Sum of Residual Squares | 21.0% | 27.0% | 12.5% | 21.4% | 26.2% | 9.7% | 22.0% | 26.8% | 11.0% |
Bias | 5.61% | 6.15% | 4.43% | 6.66% | 7.01% | 4.36% | 5.7% | 5.28% | 4.71% |
Accuracy | 1.25% | 2.79% | 0.51% | 2.4% | 3.4% | 0.17% | 1.26% | 1.64% | 0.43% |
Exact Forecasts | 9.33% | 19.4% | 5.93% | 8.01% | 15.6% | 1.59% | 6.46% | 16.0% | 1.5% |
Some interesting observations were obtained in the analysis. Table
The MAP-21 rules require all states to report percentage of national highway system bridges classified in good condition and poor condition. Bridge condition can be determined based on an assessment of the deck, superstructure, and substructure. The method used under the Highway Bridge Program is selected to determine bridge conditions: components with condition ratings of no less than 7 are rated as “Good” and no greater than 4 are rated as “Poor”. When all three components are rated as “Good” the overall bridge condition rating can be coded as “Good” and when all three components are rated as “Poor” the overall bridge condition rating can be coded as “Poor”. The observed bridge condition measures and the forecasted measures are listed in Table
Bridge condition measures required by MAP-21 comparison results.
Data | Deck: Good | Deck: Poor | Superstructure: Good | Superstructure: Poor | Substructure: Good | Substructure: Poor | Overall: Good | Overall: Poor |
---|---|---|---|---|---|---|---|---|
2012 Observed | 48.14% | 3.14% | 70.38% | 3.58% | 54.26% | 11.84% | 44.82% | 1.87% |
2012 Estimate (Multilinear Regression Model) | 60.85% | 0.00% | 81.71% | 0.00% | 62.89% | 0.90% | 54.42% | 0.00% |
2012 Estimate (Ordinal Model) | 51.32% | 1.10% | 73.65% | 0.53% | 61.95% | 2.67% | 48.94% | 0.42% |
2013 Observed | 47.07% | 3.11% | 69.58% | 3.51% | 52.56% | 12.07% | 44.48% | 2.04% |
2013 Estimate (Multilinear Regression Model) | 58.38% | 0.00% | 80.68% | 0.00% | 62.39% | 2.24% | 54.79% | 0.00% |
2013 Estimate (Ordinal Model) | 49.98% | 0.13% | 73.72% | 0.27% | 60.22% | 5.32% | 49.11% | 0.13% |
2014 Observed | 45.14% | 3.21% | 68.37% | 3.18% | 51.55% | 11.63% | 43.05% | 1.75% |
2014 Estimate (Multilinear Regression Model) | 57.14% | 0.00% | 80.17% | 0.00% | 61.38% | 1.60% | 54.49% | 0.00% |
2014 Estimate (Ordinal Model) | 48.85% | 0.10% | 72.99% | 0.50% | 59.59% | 5.15% | 48.27% | 0.09% |
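The classification rule described before the table (a component is “Good” at a rating of 7 or higher and “Poor” at 4 or lower; a bridge is “Good” or “Poor” overall only when all three components share that class) can be sketched as below. The label "Fair" for everything in between, the function names, and the sample ratings are assumptions added for illustration.

```python
def component_condition(rating):
    """Classify one component rating on the 0-9 NBI scale."""
    if rating >= 7:
        return "Good"
    if rating <= 4:
        return "Poor"
    return "Fair"  # assumed label for ratings 5 and 6


def overall_condition(deck, superstructure, substructure):
    """Overall class per the rule stated in the text: all three must agree."""
    classes = {component_condition(r)
               for r in (deck, superstructure, substructure)}
    if classes == {"Good"}:
        return "Good"
    if classes == {"Poor"}:
        return "Poor"
    return "Fair"


def condition_shares(bridges):
    """Percent of bridges classified Good and Poor overall."""
    overall = [overall_condition(*ratings) for ratings in bridges]
    n = len(overall)
    return {
        "Good": 100.0 * overall.count("Good") / n,
        "Poor": 100.0 * overall.count("Poor") / n,
    }


# Hypothetical (deck, superstructure, substructure) ratings for four bridges.
sample = [(7, 8, 7), (4, 4, 3), (6, 7, 8), (9, 9, 9)]
shares = condition_shares(sample)
```

Running the same function on observed ratings and on each model's predicted ratings gives directly comparable "Good" and "Poor" shares of the kind listed in the table.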
The above analysis shows that the ordinal logistic models are always better at predicting bridge conditions and measurements with both the in-sample and external validation data sets.
It is worth noting that all the models underestimate the share of bridges in poor condition because of the nature of the data distribution. The bridge condition data form an imbalanced data set; in other words, the number of observations belonging to one category is significantly lower than the number belonging to the other categories. In this situation, a predictive model developed using any GLM, or even conventional machine learning algorithms, could be biased and inaccurate. Handling imbalanced classification and improving the forecast of rare events is a separate research topic and should be investigated in future research.
Eight model evaluation criteria were used to compare the goodness of fit and the forecasting power of the models with both in-sample and external validation data sets for deck, superstructure, and substructure condition. The following are the main findings of the study. The analysis shows agreement among all indicators, for all three component models, and for all three yearly data sets. All the comparison results indicate a clear improvement of the ordinal logistic model over the multiple linear regression model. Some indicators show significant improvement, such as sum of absolute residuals, sum of residual squares, AIC, and exact forecasts (about 10% improvement). Other indicators show only slight improvement, such as bias, accuracy, and LL (less than 10% improvement). Superstructure models show the greatest improvement for almost all performance criteria, followed by deck models and substructure models. To investigate this issue further, time series data need to be tested to confirm that the superstructure model consistently performs better than the other two models. There is no clear trend in model performance improvement by year. According to Table Ordinal logistic models will not produce out-of-range estimations, which the multiple linear regression model does not prevent.
This paper proposes and demonstrates an ordinal logistic regression model for forecasting bridge component ratings. The model is preferred for its ability to handle the ordinal nature of bridge component ratings, the explanatory power of the regression analysis, and its accurate predictions. In this study, both ordinal logistic regression and multiple linear regression models were generated for predicting the three main bridge component ratings. The ordinal logistic model demonstrated in this research can be easily applied to element-level data when they become available. In addition to assessing model performance, both in-sample and external validation analyses were performed for all eight evaluation criteria. Finally, it is determined that the ordinal logistic regression method is a better approach than the multiple linear regression method for forecasting bridge component ratings. It has the inherent advantage of always making meaningful predictions, and its predictions are closer to the observations.
The data used to support the findings of this study are available from the corresponding author upon request.
The authors declare that they have no conflicts of interest.