Nonparametric Monitoring for Geotechnical Structures Subject to Long-Term Environmental Change

A nonparametric, data-driven methodology for monitoring geotechnical structures subject to long-term environmental change is discussed. Because it avoids physical assumptions and excessive simplification of the monitored structure, the methodology presented in this paper provides reliable performance-related information, particularly when the available sensor data are limited. To validate the methodology, a field case study was performed on a full-scale retaining wall that had been monitored for three years using three tilt gauges. Using these very limited sensor data, it is demonstrated that important performance-related information, such as drainage performance and sensor damage, can be disentangled from significant daily, seasonal, and multiyear environmental variations. An extensive literature review of recent developments in parametric and nonparametric data processing techniques for geotechnical applications is also presented.


Introduction
Restoring and improving urban infrastructure is recognized by the National Academy of Engineering as one of the fourteen grand challenges for engineering (NAE, [1]), and according to the 2009 ASCE Report Cards for America's Civil Infrastructure, the current condition of U.S. infrastructure is rated "D" [2]. Aging civil infrastructure in the US, including bridges, levees, and dams, calls for urgent measures focusing on maintenance, repair, and renovation. Geotechnical structures, compared to other types of civil infrastructure, are more vulnerable to natural and human-induced hazards. For example, landslides in the Pacific Coast, the Rocky Mountains, the Appalachian Mountains, Hawaii, and Puerto Rico regions cause 25 to 50 fatalities per year and direct and indirect economic losses of up to $3 billion per year [3].
Structural health monitoring (SHM) is an emerging technique for the assessment of structural condition, hazards, and risks. It consists of three major components: sensing and instrumentation, data communication and archiving, and data analysis and interpretation. With the advent of today's powerful digital media and the Internet, the needs for the first two components have been readily met in many cases, but serious technical challenges remain for the third component: how can voluminous sensor data be processed to obtain critical information for decision making? The research community is often overwhelmed by the complex and extensive nature of field data, which reflect the many factors involved in geotechnical phenomena. Some important challenges in processing field measurements are as follows.
(1) How can performance-related information (e.g., condition of drainage systems) be disentangled from the effects of various environmental factors (e.g., diurnal and seasonal temperature change)?
(2) Field measurements are expensive and technically difficult, especially when the monitoring is long term. How can one perform reliable estimation with insufficient sensor data without sacrificing accuracy?
(3) Extensive modeling efforts are required in current structural health monitoring practices for geotechnical structures. How can one reduce modeling efforts for geotechnical structures, whose material and structural characteristics vary widely?
(4) How can one deal with unavoidable and unpredictable sensor/instrument network problems and the loss of subsets of sensor data, which are commonly encountered in field data collection?
This paper discusses a reliable monitoring methodology for geotechnical structures that are subject to long-term environmental change and instrumented with very limited sensor measurements. The objective of the methodology is to indicate when, where, and with what confidence field engineers should be deployed to the monitoring site to address potential hazards to structural performance. The methodology should be robust enough to deal with the unavoidable malfunctioning of instrumentation devices during data collection.
This paper is organized as follows. Some definitions and dilemmas in current monitoring practices are discussed in Section 2. Sensing and modeling strategies of monitoring for complex geotechnical systems are discussed in Section 3. Because understanding system identification techniques is important for developing a reliable monitoring methodology, recent developments of modeling and system identification techniques are reviewed: parametric approaches in Section 4 and nonparametric approaches in Section 5. A case study was conducted to demonstrate how the monitoring methodology developed by the authors can be applied to realistic problems. The analysis results for a full-scale retaining wall subject to long-term environmental change are discussed in Section 6.

Some Definitions and Dilemma in Current Monitoring Practices
Inverse analysis and system identification techniques are necessary tools to evaluate the current performance of civil infrastructure systems using field measurement data. A system in inverse analysis can be expressed with a cause-response model, which consists of the causative force, the system characteristic function, and the system response, as shown in Figure 1. The causative force is usually an external force (e.g., soil pressure), and the system response is usually the resulting deformation (e.g., displacement). The system characteristic function describes the system properties through linear or nonlinear relationships between the system input and output, which are associated with the spatial and temporal variation of soil properties and highly variable soil conditions. When earth structures are exposed to significant environmental variation (e.g., temperature and precipitation), system identification becomes more complicated because the system response reflects the combined effects of loads and environmental factors. This is where the conventional parametric approaches of system identification become difficult to implement.
The nonparametric methods, on the other hand, are data-driven identification techniques that do not require a priori knowledge of the physics of target systems. Consequently, without relying on idealization and simplification in modeling, the same data processing methodology is applicable to different structure types. The nonparametric methods are also advantageous in dealing with deteriorating structures, since nonparametric models are more flexible in dealing with time-varying systems than parametric ones, which are built on physical assumptions that may no longer be valid once target structures are damaged. So far, system identification of geotechnical structures has been done primarily using parametric methods. In long-term monitoring of geotechnical systems, however, there can be significant discrepancy between system behavior and the corresponding models for two reasons. First, soil conditions are highly variable. Although high-fidelity models coupled with complex soil behavior are already available (e.g., coupled thermo-hydro-mechanical models), collecting all the sensor data necessary for parametric identification is very expensive and usually not feasible. Because the data are insufficient for sophisticated models, simpler models that ignore many significant environmental factors are often employed. Consequently, parameter estimation becomes inaccurate due to oversimplification. Second, structures deteriorate over time. A common challenge in modeling deteriorating systems is that deterioration can result not only in changes in system parameter values but also in the transformation of the monitored system into a different class of nonlinear system. Moreover, the characteristics of the damaged systems are usually unknown, so that the systems cannot be parametrically modeled prior to the occurrence of actual damage.
One drawback of existing nonparametric approaches is that physical interpretation of identification results is not as straightforward as with the parametric methods, whose system parameters possess physical meaning (e.g., Young's modulus). Although some nonparametric approaches have been used in geotechnical applications, obtaining important performance-related information for maintenance decision making has rarely been emphasized in this class of methods. For example, the nonparametric Artificial Neural Networks technique that will be described in Section 5.3 has been employed as an alternative to parametric regression methods using soil constitutive models (e.g., elastoplastic models), described in Section 4.1, to identify the complex nonlinear stress-strain relationship of soil. When soil strength is degraded, the nonparametric method can detect the change in soil mechanical properties, but, unlike the parametric methods, it cannot determine from the identification results what type of physical change has occurred. In order to overcome the above dilemma in current monitoring practices, it is desirable to take the advantages of both sides: modeling flexibility from the nonparametric methods and physical interpretation from the parametric methods.

Sensing and Modeling Strategies
To reduce the high cost of sensor data collection associated with the high degree of spatial and temporal variability of geotechnical structures, the selection of what to measure is a critical issue. Three options are possible in sensing: the causative forces, the environmental factors, and the system response in Figure 1. Measuring the system response is preferred because the other two do not contain information about the system characteristics; the system response carries the most abundant information about the entire system, since it contains the effects of all components: the causative forces, the environment, and the system characteristic function. Using data that contain the information of the system characteristics is particularly important when one deals with deteriorating structures. A challenge in dealing with system response data, however, is that raw sensor data are usually difficult to interpret directly due to the interrelated effects of the components in the system. Thus, some kind of disentanglement technique is needed to decompose the data into more easily manageable and physically understandable forms.

Figure 1: A schematic of the cause-response system model consisting of the causative forces (thermal pressure, soil weight, service loads, etc.), system response, environmental effects (ambient temperature, rain, snow, humidity, etc.), and system characteristic function [4].

Figure 2: Procedures of conventional parametric approaches and proposed nonparametric approaches. Using output-only (or response-only) data in modeling, the proposed approach does not require defining explicit relationships between the system input, environmental effects, and the system output, which are required in conventional parametric approaches.

To explain modeling strategies, Figure 2 summarizes the differences in system identification between parametric and nonparametric methods.
In nonparametric methods, response-only (or output-only) data are processed to find mathematical relationships embedded in the data. In order to deal with complicated raw system response (or system output) data, disentanglement techniques are applied prior to modeling. Once the system response data are processed, additional data on the causative forces (or system input) and/or environmental factors can be used as a posteriori information for physical interpretation. In model construction, therefore, the monitoring methodology does not require explicit relationships between the system input, environment, and system output, which are generally not known in geotechnical applications.
The above sensing and modeling methodology has several important advantages over existing (parametric) approaches, particularly in monitoring applications.
(1) Oversimplification problems can be avoided, especially when actual systems are complex and data are insufficient for sophisticated (parametric) input-output models, since the modeling process is solely data driven using response-only data.
(2) Modeling time and effort can be reduced significantly by using the same data processing procedures for different structure types, since the proposed approach is not limited to a specific type of structure (i.e., the model is not based on physical assumptions). For the same reason, the same procedures can be used for different sensor types.
(3) The proposed approach is more advantageous than conventional parametric approaches in dealing with deteriorating structures, which are often associated with unknown time-varying system characteristics.

Review of Parametric Approaches
In this section, recent developments of the parametric approaches are reviewed to provide background on parametric modeling, estimation, and optimization techniques.
4.1. Modeling. Two parametric modeling approaches for geotechnical systems are discussed: soil constitutive models and coupled thermo-hydro-mechanical models.

Soil Constitutive Models.
There exist various soil constitutive models. In the elastic model, the simplest constitutive model, strain is sustained only while the load is applied. The elastic strain is therefore reversible: if the applied load is removed, the material springs back to its undeformed condition. Elastoplastic models increase the level of model complexity by adding the effects of irreversible plastic strains, so that the soil is assumed to sustain both elastic and plastic strain. Therefore, if the load is removed, the soil retains a permanent plastic deformation, whereas the elastic strain is recovered. Consequently, a key issue in elastoplastic modeling lies in describing the material plasticity. One branch of plasticity modeling is based on the concept of perfect plasticity [5]. Some examples include the Tresca model and the von Mises model for perfect plasticity in cohesive soils, and the Mohr-Coulomb model, Drucker-Prager model, Lade-Duncan model, Matsuoka-Nakai model, and Hoek-Brown model for perfect plasticity in frictional material. Another branch of plasticity modeling adopts the concept of critical states. In this modeling approach, the soil is characterized with three major parameters: the mean effective stress, shear stress, and soil volume (or void ratio) [6]. The original Cam clay model and the modified Cam clay model belong to this category. The original Cam clay model was developed by researchers at Cambridge University as the first critical-state model; it predicts unlimited soil deformations without change in stress or volume once the critical state is reached in soft soil [7]. The modified Cam clay model assumes that the voids between the solid particles are filled only with water (i.e., the soil is fully saturated). The modified Cam clay models are formulated based on plasticity theory; when the soil is loaded, the pore water is expelled from the voids between the solid particles, and, consequently, significant irreversible plastic volume change occurs. Some limitations of the Cam clay models are described in Yu [5]. General descriptions of soil constitutive models can be found in Yu [5], Ling et al. [8], and Hicher and Shao [9].
The coupled thermo-hydro-mechanical (THM) models express the sophisticated coupled relationships of heat and moisture transfer in deformable, partially saturated soil [15]. The freezing process is influenced by the interactions between water, temperature, and stresses in the soil: water migrates to freezing fronts, the frozen soil can contain unfrozen water below the freezing temperature, and the freezing of water is influenced by the state of stress [38]. The formulation usually involves interrelated PDEs from the thermoelasticity of solids (T-M) (interaction between the stress/strain and temperature fields through thermal stress and expansion) and poroelasticity theory (H-M) (interaction between the deformability and permeability fields of porous media). The conservation equations of mass, energy, and momentum are usually obtained with Hooke's law of elasticity, Darcy's law of flow in porous media, and Fourier's law of heat conduction [39]. The effects of precipitation on the moisture content in the soil were studied by Troendle and Reuss [40], D'Odorico et al. [41], and Longobardi [42]. For the numerical solution of the conservation equations, the finite element method (FEM) is usually employed [39,43].

Parameter Estimation.
For parametric models, the cause-response system can be expressed as
$$y_k = \hat{y}_k + \tilde{y}_k, \qquad \hat{y}_k = h_k(\theta;\, x_k, x_{k-1}, \ldots, x_{k-l}), \tag{1}$$
where $y_k$: observed (or measured) system output at time step $k$, in which the dimension of $y_k$ is $(1 \times m)$, and $m$ is the total number of observational points or number of sensors in in-situ measurements; $\hat{y}_k$: estimated system output based on the employed geomaterial constitutive models. In geotechnical engineering, the finite element method (FEM) is commonly used for the numerical solution of the constitutive equations, thus yielding $\hat{y}_k$; $\tilde{y}_k$: residual between the observed output $y_k$ and the estimated output $\hat{y}_k$. The residual includes the modeling error $\eta_k$ and the measurement error $\varepsilon_k$, which are combined and usually indistinguishable for field measurements. In many applications, the residual is assumed to follow $\tilde{y}_k \sim N(0, \Sigma_y)$, in which $\Sigma_y$ is the $(m \times m)$ covariance matrix of $\tilde{y}_k$; $h_k$: system function for a given system parameter vector $\theta$. In the most general case, $h_k$ is a stochastic, time-varying, nonlinear dynamic function; $\theta$: $(p \times 1)$ system parameter vector to be estimated; $x$: known system input vector with memory of the $l$-th order. For static systems, $l = 0$. The goal of system identification is to find the "best" estimates of the system parameters $\theta$ that minimize the residual $\tilde{y}_k$. Many optimal estimation algorithms are available for this purpose, and they are usually classified into two approaches: parameter estimation methods and state estimation methods. The parameter estimation methods (also referred to as variational methods in some of the geotechnical literature) are described in this section, and the state estimation methods (also referred to as sequential methods in some of the geotechnical literature) will be described in Section 4.3.
In parameter estimation, the most general objective function can be expressed as
$$J(\theta) = J_o(\theta) + \beta J_p(\theta), \tag{2}$$
with
$$J_o(\theta) = \sum_k (y_k - \hat{y}_k)\, W_o^{-1}\, (y_k - \hat{y}_k)^T, \qquad J_p(\theta) = (\theta - \theta_p)^T\, W_p^{-1}\, (\theta - \theta_p), \tag{3}$$
where $J_o(\theta)$: objective function for the observational (or measurement) information of the system output, written here for the $(1 \times m)$ row vectors $y_k$ and $\hat{y}_k$; $J_p(\theta)$: objective function for the prior information of the system parameters; $\beta$: a positive scalar parameter, which adjusts the significance (weighting) between the observational information $J_o(\theta)$ and the prior information $J_p(\theta)$; $W_o$: covariance matrix of the measurement error, whose dimension is $(m \times m)$; $W_p$: covariance matrix of the prior information error of the system parameters, whose dimension is $(p \times p)$ for the $(p \times 1)$ parameter vector $\theta$; $\theta_p$: previously known means of the system parameters $\theta$. Three parameter estimation methods are usually employed in geotechnical applications: (1) least square estimation, (2) maximum likelihood estimation, and (3) Bayesian estimation.

The Least Square Estimation (LSE).
The objective function of the LSE corresponds to the case in which the adjusting scalar parameter $\beta = 0$ in (2) and the inverse covariance matrix of the measurement error $W_o^{-1} = I$ in (3), where $I$ is an $(m \times m)$ identity matrix, thus resulting in
$$J(\theta) = \sum_k (y_k - \hat{y}_k)(y_k - \hat{y}_k)^T. \tag{4}$$
With the condition of $\beta = 0$, no prior information of the system parameters is used during the parameter estimation. With the condition of $W_o^{-1} = I$, all observation values are weighted with the same significance. Thus, the LSE requires the least amount of information among the parameter estimation methods.

The Maximum Likelihood Estimation (MLE).
In the MLE method, the observational information of the measurements is used, and the measurement data are weighted according to their significance (i.e., $W_o^{-1} \neq I$), but no prior information of the system parameters is used in the parameter estimation (i.e., $\beta = 0$). Therefore, the LSE can be seen as a special case of the MLE. The objective function of the MLE is
$$J(\theta) = \sum_k (y_k - \hat{y}_k)\, W_o^{-1}\, (y_k - \hat{y}_k)^T. \tag{5}$$
Some examples of using the MLE for geotechnical engineering applications are Ledesma et al. [57], Honjo and Darmawan [58], Ledesma et al. [59], Ledesma et al. [60], and Gens et al. [61].

The Conventional Bayesian Estimation (BE) and Extended Bayesian Estimation (EBE).
In the BE method, the system parameters are estimated using both the observational information of the measurements and the prior information of the system parameters, with the same significance given to these two sources of information (i.e., $\beta = 1$), as
$$J(\theta) = J_o(\theta) + J_p(\theta). \tag{6}$$
The objective function of the EBE is more general than that of the BE, with a nonunit positive scalar adjusting parameter $\beta$, as
$$J(\theta) = J_o(\theta) + \beta J_p(\theta), \qquad \beta > 0,\ \beta \neq 1. \tag{7}$$
If the adjusting parameter $\beta$ is small, the prior information $\theta_p$ contributes less to the estimation of $\theta$, and vice versa. Optimal values of the adjusting parameter $\beta$ can be determined, for example, with the cross-validation method [62], the ridge regression method [63], and the Akaike Information Criterion (AIC) [64-66].
The conventional BE and EBE methods are more sophisticated than the other estimation methods, but they also require more information: both observational measurements and prior knowledge of the system parameters. Therefore, the availability of the necessary information determines whether the Bayesian methods can be applied.
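The relationship between the four estimators can be illustrated with a short numerical sketch. It assumes the weighted quadratic forms commonly used for $J_o$ and $J_p$, stores the observations as a (K × m) array with one row per time step, and uses a hypothetical two-parameter `toy_model` in place of the finite element solution; none of these names come from the original study.

```python
import numpy as np

def objective(theta, y_obs, model, W_o_inv=None, beta=0.0,
              theta_p=None, W_p_inv=None):
    """Evaluate J(theta) = J_o(theta) + beta * J_p(theta).

    y_obs   : (K, m) array, row k holds the m sensor readings at step k
    model   : callable theta -> (K, m) array of predicted outputs
              (hypothetical stand-in for an FEM solution)
    W_o_inv : (m, m) inverse measurement covariance; identity if None (LSE)
    beta    : weight of the prior term (0 = LSE/MLE, 1 = BE, other > 0 = EBE)
    """
    theta = np.asarray(theta, dtype=float)
    resid = np.asarray(y_obs, dtype=float) - model(theta)   # (K, m) residuals
    if W_o_inv is None:
        W_o_inv = np.eye(resid.shape[1])
    # Sum over time steps of the quadratic form r_k W_o^{-1} r_k^T
    J_o = float(np.einsum("ki,ij,kj->", resid, W_o_inv, resid))
    if beta == 0.0:
        return J_o
    d = theta - np.asarray(theta_p, dtype=float)
    J_p = float(d @ np.asarray(W_p_inv, dtype=float) @ d)
    return J_o + beta * J_p

# Toy setup: two "sensors" whose readings grow linearly in time.
t = np.arange(3.0)

def toy_model(theta):
    return np.column_stack([theta[0] * t, theta[1] * t])

y_obs = toy_model(np.array([1.0, 2.0]))      # noise-free synthetic data
theta0 = [0.0, 0.0]                          # trial parameter vector
W_o_inv = np.diag([1.0, 0.25])               # sensor 2 trusted less
theta_p, W_p_inv = [1.0, 1.0], np.eye(2)     # assumed prior mean and weight

J_lse = objective(theta0, y_obs, toy_model)
J_mle = objective(theta0, y_obs, toy_model, W_o_inv)
J_be = objective(theta0, y_obs, toy_model, W_o_inv, 1.0, theta_p, W_p_inv)
J_ebe = objective(theta0, y_obs, toy_model, W_o_inv, 0.5, theta_p, W_p_inv)
print(J_lse, J_mle, J_be, J_ebe)             # 25.0 10.0 12.0 11.0
```

The same function covers all four cases, which mirrors the statement that the LSE, MLE, and BE are special cases of the EBE objective.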

State Estimation.
In state estimation methods, the system can be identified by estimating its state at each time step using so-called filters. Therefore, the state estimation method is also referred to as the sequential estimation method. Among the numerous types of filters, the Kalman filter-based algorithms are the most widely used in geotechnical applications, including (1) the linear Kalman filter method and (2) the extended Kalman filter method. Some application examples of the Kalman filter methods for geotechnical applications are given in the work of Murakami and Hasegawa [68], Kim and Lee [69], and Zheng et al. [70]. More general descriptions and details concerning the Kalman filter can be found in Mendel [71].

The Linear Kalman Filter.
The underlying system model of the linear Kalman filter is based on the assumption of a recursive linear dynamic system discretized in the time domain as
$$z_k = A_k z_{k-1} + B_k x_k + w_k, \tag{8}$$
where $z_k$: true internal state at time step $k$, which evolves from the previous state $z_{k-1}$; $x_k$: known system input at time step $k$; $w_k$: stochastic process noise with a zero-mean, multivariate normal distribution $w_k \sim N(0, \Sigma_{w_k})$; $A_k$: linear state transition matrix, which is applied to the previous state $z_{k-1}$; $B_k$: input matrix, which is applied to the current system input $x_k$.
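The observational (or measured) state of the system output can be expressed as
$$y_k = C_k z_k + v_k, \tag{9}$$
where $y_k$: observational system output; $C_k$: observational matrix, which maps the true state space of $z_k$ into the observed space of $y_k$; $v_k$: stochastic observational noise, a zero-mean Gaussian white noise $v_k \sim N(0, \Sigma_{v_k})$. Using this underlying system model, the estimate of the state and the error covariance matrix of the estimated state can be determined, in the standard form of the filter, as
$$\hat{z}_{k|k-1} = A_k \hat{z}_{k-1|k-1} + B_k x_k, \qquad P_{k|k-1} = A_k P_{k-1|k-1} A_k^T + \Sigma_{w_k}, \tag{10}$$
$$K_k = P_{k|k-1} C_k^T \left( C_k P_{k|k-1} C_k^T + \Sigma_{v_k} \right)^{-1}, \qquad \hat{z}_{k|k} = \hat{z}_{k|k-1} + K_k \left( y_k - C_k \hat{z}_{k|k-1} \right), \qquad P_{k|k} = (I - K_k C_k) P_{k|k-1}, \tag{11}$$
where $\hat{z}_{k|k}$: updated state at time step $k$ given observations up to and including time step $k$; $P_{k|k}$: updated error covariance matrix of $\hat{z}_{k|k}$; $\hat{z}_{k|k-1}$: predicted state at time step $k$ given observations up to and including time step $k-1$; $P_{k|k-1}$: predicted error covariance matrix; $K_k$: Kalman gain. The Kalman filter shown in (11) is the optimal estimator in the minimum mean-square-error sense for the estimation error $z_k - \hat{z}_{k|k}$.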
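A minimal sketch of one predict/update cycle of the linear Kalman filter for the state-space model described above is given here. It follows the standard textbook recursion; the function name `kalman_step` and the shorthand `Q` and `R` for the process and observation noise covariances are assumptions made for this illustration.

```python
import numpy as np

def kalman_step(z_prev, P_prev, x_k, y_k, A, B, C, Q, R):
    """One cycle of the linear Kalman filter (standard textbook form).

    z_prev, P_prev : previous updated state estimate and its covariance
    x_k, y_k       : known input and measured output at the current step
    A, B, C        : state transition, input, and observation matrices
    Q, R           : process and observation noise covariances (assumed names)
    """
    # Prediction from the dynamic model
    z_pred = A @ z_prev + B @ x_k
    P_pred = A @ P_prev @ A.T + Q
    # Correction using the new measurement
    S = C @ P_pred @ C.T + R                    # innovation covariance
    K = P_pred @ C.T @ np.linalg.inv(S)         # Kalman gain
    z_new = z_pred + K @ (y_k - C @ z_pred)
    P_new = (np.eye(len(z_pred)) - K @ C) @ P_pred
    return z_new, P_new

# Scalar example: a constant state observed directly with unit noise variance.
one = np.array([[1.0]])
zero = np.array([[0.0]])
z1, P1 = kalman_step(z_prev=np.array([0.0]), P_prev=one,
                     x_k=np.array([0.0]), y_k=np.array([2.0]),
                     A=one, B=zero, C=one, Q=zero, R=one)
print(z1, P1)   # [1.] [[0.5]]
```

With equal prior and measurement variances, the gain is 0.5, so the updated estimate lies halfway between the prediction (0) and the measurement (2), and the error variance is halved.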

Extended Kalman Filter (EKF).
In the EKF, the underlying linear dynamic models are extended to nonlinear models as
$$z_k = f(z_{k-1}, x_k) + w_k, \qquad y_k = h(z_k) + v_k, \tag{12}$$
where $f$ and $h$ are nonlinear functions. Instead of the matrices $A_k$ and $C_k$ used in the linear Kalman filter method, the EKF uses the Jacobian matrices $\partial f/\partial z$ and $\partial h/\partial z$. In summary, the system in state estimation can be identified by estimating its state at each time step using filters. Using the Kalman filter methods, it is possible to incorporate prior information together with the observation data during the state estimation. Since the underlying system model of the linear Kalman filter method is a linear dynamic system, this method is usually not applicable to nonlinear geotechnical systems. The extended Kalman filter method can be used to identify such nonlinear systems.
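The substitution of Jacobians for the linear system matrices can be sketched as follows. This is an illustrative first-order EKF step in its common textbook form, with user-supplied Jacobian functions; `ekf_step`, `F_jac`, `H_jac`, `Q`, and `R` are hypothetical names, and the quadratic observation function is chosen only to make the example nonlinear.

```python
import numpy as np

def ekf_step(z_prev, P_prev, x_k, y_k, f, h, F_jac, H_jac, Q, R):
    """One cycle of a first-order extended Kalman filter.

    f(z, x), h(z)     : nonlinear state transition and observation functions
    F_jac(z, x)       : Jacobian of f with respect to z
    H_jac(z)          : Jacobian of h with respect to z
    Q, R              : process and observation noise covariances
    """
    # Prediction: propagate the state through f, the covariance through F
    z_pred = f(z_prev, x_k)
    F = F_jac(z_prev, x_k)
    P_pred = F @ P_prev @ F.T + Q
    # Correction: linearize h about the predicted state
    H = H_jac(z_pred)
    S = H @ P_pred @ H.T + R
    K = P_pred @ H.T @ np.linalg.inv(S)
    z_new = z_pred + K @ (y_k - h(z_pred))
    P_new = (np.eye(len(z_pred)) - K @ H) @ P_pred
    return z_new, P_new

# Scalar example: static state observed through h(z) = z**2.
z1, P1 = ekf_step(
    z_prev=np.array([1.0]), P_prev=np.array([[1.0]]),
    x_k=np.array([0.0]), y_k=np.array([2.0]),
    f=lambda z, x: z,
    h=lambda z: z ** 2,
    F_jac=lambda z, x: np.eye(1),
    H_jac=lambda z: np.array([[2.0 * z[0]]]),
    Q=np.array([[0.0]]), R=np.array([[1.0]]),
)
print(z1, P1)   # approximately [1.4] [[0.2]]
```

When `f` and `h` are linear, the Jacobians reduce to constant matrices and the step coincides with the linear Kalman filter.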

Optimization.
Once an objective function with respect to the unknown system parameters is constructed as shown in Section 4.2, the solution procedure uses standard optimization techniques to find the optimal values of the system parameters. Numerous optimization algorithms have been developed and used for general purposes in every field of science and engineering. General descriptions of optimization algorithms can be found in Bertsekas [72].
In geotechnical applications, the aim of the optimization process is usually to calibrate geotechnical models by finding a set of optimal values of the model parameters. These values can be found with various optimization algorithms by minimizing the residuals between the measurement data (usually obtained from field or laboratory testing) and the synthetic data (usually obtained from finite element analysis for the numerical solution of the geotechnical models). In many geotechnical applications, however, the optimization surface is nonconvex and contains many local minima due to the complexity of material behavior and the coupled effects of temperature, moisture, and loads. Some examples of optimization algorithms used in geotechnical studies include the Newton method [73], quasi-Newton method [53], Gauss-Newton method [56,73], conjugate gradient method [47], simplex method [45,54], complex method [74], random search method [75,76], and, more recently, evolutionary algorithms such as the genetic algorithm [77-80] and the particle swarm optimization method [81].
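The practical consequence of local minima can be shown with a small sketch. The one-parameter function `J` below is a made-up nonconvex objective, not a geotechnical model, and the fixed-step gradient descent and the list of starting points are assumptions chosen for illustration. A single local search started on the wrong side converges to the poorer minimum, whereas restarting from several initial guesses recovers the better one.

```python
def J(theta):
    """Illustrative nonconvex objective with two minima (hypothetical)."""
    return (theta ** 2 - 1.0) ** 2 + 0.3 * theta

def dJ(theta):
    """Analytical derivative of J."""
    return 4.0 * theta * (theta ** 2 - 1.0) + 0.3

def gradient_descent(theta0, step=0.01, n_iter=2000):
    """Plain fixed-step gradient descent from a single starting point."""
    theta = theta0
    for _ in range(n_iter):
        theta -= step * dJ(theta)
    return theta

def multi_start(starts):
    """Run a local search from each start and keep the lowest objective."""
    candidates = [gradient_descent(s) for s in starts]
    return min(candidates, key=J)

theta_local = gradient_descent(2.0)                 # trapped near theta = +0.96
theta_best = multi_start([-2.0, -0.5, 0.5, 2.0])    # finds the minimum near -1.04
print(round(theta_local, 3), round(theta_best, 3))
```

Global strategies such as the random search, genetic algorithm, and particle swarm methods cited above address the same difficulty in a more systematic way than this simple restart scheme.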

Review of Nonparametric Approaches
Nonparametric approaches have also been applied to a variety of geotechnical problems. In this section, recent developments of nonparametric data processing techniques for geotechnical systems are reviewed.

Time Series Analysis.
In time series analysis, the dynamic response of target systems can be analyzed with a discrete time series expansion model of the system input and output. One such time series model is the autoregressive moving average (ARMA) model, which can be formulated as
$$\sum_{i=0}^{na} a_i\, y_{k-i} = \sum_{i=0}^{nb} b_i\, x_{k-i} + e_k,$$
where $x_k$: observed (or measured) system input at time step $k$; $y_k$: observed (or measured) system output at time step $k$; $na$: order of the autoregression (AR), $\sum_{i=0}^{na} a_i y_{k-i}$; $nb$: order of the moving average (MA), $\sum_{i=0}^{nb} b_i x_{k-i}$; $e_k$: white, exogenous noise.
Using the ARMA model, the characteristics of the measured time histories of the system input and output can be determined by identifying the expansion coefficients (the $a$'s and $b$'s) from the measured system input and output. The optimal coefficient values can be determined using various optimization algorithms, as discussed in Sections 4.2 and 4.4. A general description of time series analysis methods can be found in Box and Jenkins [82].
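As a hedged illustration of the coefficient identification, the sketch below fits the expansion coefficients by ordinary least squares. It assumes the normalization $a_0 = 1$, so that the current output is regressed on past outputs and on current and past inputs, and it uses synthetic noise-free data generated from known coefficients; the function name `fit_arx` and the data are not from the cited studies.

```python
import numpy as np

def fit_arx(x, y, na, nb):
    """Least-squares fit of a_1..a_na and b_0..b_nb, assuming a_0 = 1.

    Model: y_k = -sum_{i=1}^{na} a_i y_{k-i} + sum_{i=0}^{nb} b_i x_{k-i} + e_k
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    start = max(na, nb)                     # first step with full history
    rows = []
    for k in range(start, len(y)):
        past_outputs = [-y[k - i] for i in range(1, na + 1)]
        inputs = [x[k - i] for i in range(0, nb + 1)]
        rows.append(past_outputs + inputs)
    Phi = np.array(rows)                    # regression matrix
    coef, *_ = np.linalg.lstsq(Phi, y[start:], rcond=None)
    a = np.concatenate(([1.0], coef[:na]))  # prepend the fixed a_0 = 1
    b = coef[na:]
    return a, b

# Synthetic data from known coefficients a = [1, -0.5], b = [2, 1].
rng = np.random.default_rng(0)
x = rng.standard_normal(200)
y = np.zeros(200)
for k in range(200):
    y[k] = 2.0 * x[k]
    if k >= 1:
        y[k] += 1.0 * x[k - 1] + 0.5 * y[k - 1]

a_hat, b_hat = fit_arx(x, y, na=1, nb=1)
print(a_hat, b_hat)   # close to [1, -0.5] and [2, 1]
```

With measurement noise present, the same regression gives approximate coefficients, and the model orders `na` and `nb` would have to be selected, for example with an information criterion.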
Some application examples of the time series analysis methods for geotechnical systems include Glaser [83], Glaser and Leeds [84], Glaser and Baise [85], Baise et al. [86], and Glaser [87]. Glaser and Baise [85] discussed a technique for mapping the identified time series coefficients to relevant soil physical properties, which they consider to be a parametric approach.
For any arbitrary time series $x(t)$, an analytic signal $z(t)$ can be obtained using the Hilbert transform. Let $y(t)$ be the Hilbert transform of $x(t)$,
$$y(t) = \frac{1}{\pi}\, P \int_{-\infty}^{\infty} \frac{x(\tau)}{t - \tau}\, d\tau, \tag{15}$$
where $P$ is the Cauchy principal value, and
$$z(t) = x(t) + i\, y(t) = a(t)\, e^{i\theta(t)}, \tag{16}$$
where
$$a(t) = \sqrt{x^2(t) + y^2(t)}, \qquad \theta(t) = \arctan\!\left( \frac{y(t)}{x(t)} \right). \tag{17}$$
In (15), it should be noted that the Hilbert transform is the convolution of $x(t)$ with $1/t$, which emphasizes the local properties of $x(t)$. In addition, (17) provides the best local fit of $x(t)$ using the time-dependent functions $a(t)$ and $\theta(t)$. Finally, the instantaneous frequency is defined as
$$\omega(t) = \frac{d\theta(t)}{dt}. \tag{18}$$
In order to obtain physically meaningful instantaneous frequencies, Huang et al. [88] suggested decomposing a complicated original time series into multiple so-called intrinsic mode functions (IMFs) that represent the oscillatory modes embedded in the original signal; the instantaneous frequencies are then determined for the decomposed IMFs. The signal $x(t)$ can be expressed using the series of IMFs as
$$x(t) = \sum_{k=1}^{m} \mathrm{IMF}_k(t) + r(t), \tag{19}$$
where $\mathrm{IMF}_k$ is the $k$-th intrinsic mode function, $m$ is the number of IMFs, and $r(t)$ is the residual.
An IMF is defined to have a local zero mean and the same numbers of zero crossings and extrema throughout the time series, so that it contains only one mode of oscillation without complex riding waves. A difference from the Fourier-based signal processing methods is that an IMF is not restricted to a single frequency band and can be nonstationary. Several empirical mode decomposition (EMD) algorithms have been developed using the so-called sifting process [104,105].
The Hilbert-Huang transform (HHT) is a time-frequency analysis technique; combined with the EMD, a time-frequency plot can be obtained for each IMF to visualize frequency change over time. The HHT is similar to the wavelet transform (WT) as a nonstationary data processing technique, but, unlike the WT, the HHT is not limited by a set of underlying basis functions.
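The Hilbert-transform step applied to a single oscillatory component can be sketched numerically. The example below builds the discrete analytic signal with the FFT (a common numerical construction, assumed here rather than taken from the cited work) and estimates the instantaneous amplitude and angular frequency of a pure 5 Hz cosine. The sifting process that extracts the IMFs is not implemented, and the function names are illustrative.

```python
import numpy as np

def analytic_signal(x):
    """Discrete analytic signal z = x + i*H[x] via the FFT.

    Negative-frequency bins are zeroed and positive-frequency bins doubled,
    which is the usual discrete counterpart of the Hilbert transform.
    """
    x = np.asarray(x, dtype=float)
    n = len(x)
    X = np.fft.fft(x)
    weight = np.zeros(n)
    weight[0] = 1.0                      # keep the mean (DC) term once
    if n % 2 == 0:
        weight[n // 2] = 1.0             # keep the Nyquist term once
        weight[1:n // 2] = 2.0
    else:
        weight[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(X * weight)

def instantaneous_frequency(x, dt):
    """Instantaneous angular frequency d(theta)/dt in rad per unit time."""
    z = analytic_signal(x)
    theta = np.unwrap(np.angle(z))       # remove 2*pi jumps in the phase
    return np.gradient(theta, dt)

# One-second record of a 5 Hz cosine sampled at 1 kHz (whole number of cycles).
dt = 0.001
t = np.arange(1000) * dt
x = np.cos(2.0 * np.pi * 5.0 * t)

amplitude = np.abs(analytic_signal(x))
omega = instantaneous_frequency(x, dt)
print(amplitude.mean(), omega.mean() / (2.0 * np.pi))   # about 1.0 and 5.0
```

For records that do not contain a whole number of cycles, this FFT-based construction shows end effects, so the estimates near the record boundaries should be treated with caution.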

Black-Box Methods.
One technical difficulty in the identification of complex (nonlinear) geotechnical systems is that the system characteristic function in Figure 1 is usually unknown beforehand, so that it is not possible to establish explicit relationships between the system input and system output. This case is often encountered when the systems to be identified are under field conditions subject to various environmental effects, or when systems evolve into a different class of nonlinearity after unpredictable, unknown structural damage. The black-box methods can be used when the physical relationships between the system input and the system output are unknown.
The Artificial Neural Networks (ANNs) technique, inspired by biological neural networks, has been shown to be a powerful tool for developing model-free representations of nonlinear systems. ANNs consist of an interconnected group of artificial neurons that form the input layer, hidden layers, and output layer for arbitrary multi-input multi-output (MIMO) systems, as shown in Figure 3. Employing various optimization algorithms, the input-output relationships can be determined by finding the optimal values of the weights and biases of the artificial neurons. Detailed descriptions of the ANN method can be found in Fausett [106] and Gurney [107].
The ANN techniques have been used in a wide range of geotechnical applications including pile capacity, settlement of foundations, characterization of soil properties and behavior, liquefaction, site characterization, earth retaining structures, slope stability, tunnels, and underground openings [103]. Some technical challenges for ANN modeling in geotechnical engineering are discussed in Jaksa et al. [108].

Response-Only Models.
Response-only methods are defined as methods that use no system information in their data processing procedures. Blind source separation (BSS) belongs to this class. BSS is a family of multivariate, nonparametric techniques that separate the unknown system input (or "sources") based on the observed system output (or "response") with little or no information about the system input or system function. BSS includes several response-only techniques, such as the principal component analysis (PCA) for statistically uncorrelated multivariate system input and the independent component analysis (ICA) for statistically independent multivariate system input. General descriptions of the PCA and ICA methods can be found in Hyvärinen et al. [109].
The principal component analysis (PCA) method, also known as the proper orthogonal decomposition (POD) or the Karhunen-Loève transform, is a multivariate statistical technique [110]. Two algebraic solutions of the PCA are commonly used: (1) the eigenvector decomposition of the covariance matrix and (2) the singular value decomposition approach. The first solution is described in this section. For an $(m \times n)$ observation data set $X = [x_1^T; \ldots; x_m^T]$, where $x_i$ is an $(n \times 1)$ vector associated with sensor $i$ (with its mean removed), the goal of the algebraic solution is to find the orthonormal matrix of the principal components $P$, where
$$Y = P X,$$
which renders the covariance matrix $C_Y$ diagonal. The covariance matrix can be determined from
$$C_Y = \frac{1}{n-1}\, Y Y^T = \frac{1}{n-1}\, P \left( X X^T \right) P^T = \frac{1}{n-1}\, P A P^T,$$
such that
$$A = X X^T = V \lambda V^T,$$
where $A$ is an $(m \times m)$ symmetric matrix, $V$ is the $(m \times m)$ matrix of eigenvectors arranged as columns, and $\lambda$ is the $(m \times m)$ diagonal matrix of the eigenvalues. Choosing $P = V^T$ then gives $C_Y = \lambda/(n-1)$, which is diagonal. The PCA is limited by its global linearity because it removes only linear correlations among the observed data and is sensitive only to second-order statistics [111,112]. Some geotechnical applications of the PCA include Dai and Lee [113], Komac [114], and Folle et al. [115].
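The eigenvector-decomposition solution can be sketched in a few lines. The example assumes that the mean of each sensor record is removed before forming the covariance, uses the $1/(n-1)$ normalization, and applies the procedure to three synthetic channels that share one daily oscillation; the data and the function name `pca_eig` are illustrative and unrelated to the case study measurements.

```python
import numpy as np

def pca_eig(X):
    """PCA by eigen-decomposition of A = X X^T for an (m, n) data matrix.

    Rows are sensors, columns are time samples. Returns the orthonormal
    matrix P (rows = principal directions), the transformed data Y = P X,
    the component variances, and the covariance matrix of Y.
    """
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=1, keepdims=True)   # zero-mean rows (assumption)
    n = Xc.shape[1]
    A = Xc @ Xc.T                            # (m, m) symmetric matrix
    eigvals, V = np.linalg.eigh(A)           # ascending eigenvalues
    order = np.argsort(eigvals)[::-1]        # sort descending
    eigvals, V = eigvals[order], V[:, order]
    P = V.T                                  # principal components as rows
    Y = P @ Xc
    C_Y = (Y @ Y.T) / (n - 1)                # should be diagonal
    return P, Y, eigvals / (n - 1), C_Y

# Three synthetic channels driven by one common 24-sample oscillation.
rng = np.random.default_rng(0)
t = np.arange(500)
common = np.sin(2.0 * np.pi * t / 24.0)
X = np.vstack([1.0 * common, 0.6 * common, 0.2 * common])
X = X + 0.05 * rng.standard_normal((3, 500))

P, Y, variances, C_Y = pca_eig(X)
print(variances / variances.sum())   # first component dominates
```

Because the three channels share a single source, almost all of the variance is captured by the first principal component, which is the kind of separation that response-only processing relies on.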

Case Study: Monitoring for Full-Scale Retaining Walls Subject to Long-Term Environmental Change
In order to demonstrate the benefits of the nonparametric methodologies discussed in Section 2, a case study was conducted using a full-scale reinforced concrete retaining wall with a height of 13.59 m (Figure 4). Because the wall is located only 9.5 m from a high-rise residential apartment building, its collapse would be catastrophic. The characteristics of the backfill soil were not known, and the soil behavior (e.g., pore water pressure or soil temperature) was not monitored. The material properties of the reinforced concrete were also unknown, and the plan of the retaining wall was not available. The retaining wall was monitored for three years with three tilt sensors located at the upper, middle, and lower parts of the wall (13.14 m, 6.55 m, and 1.68 m from the ground). The surface temperatures were also measured at the same locations as the tilt gauges. Therefore, a total of six sensors (i.e., three tilt gauges and three surface temperature sensors) were used and wired to a data logger equipped with a digitizer and a local storage device. All channels were sampled once every hour (1 sample/hr). Consequently, because of the limited measurement types, the limited temporal and spatial resolution of the measurements, and the lack of information on the monitored structure, conventional parametric identification approaches could not be used in this study. Furthermore, although the wall surface temperature data were collected, only the tilt data were used in this analysis, in order to demonstrate that important performance-related information on the retaining wall can be obtained from response-only data without relying on additional data on the causative forces and environment in the data processing procedures. As described in Section 3, the inverse analysis using response-only data is not based on explicit input-output relationships of the system, which cannot be accurately determined given the limited information on structural characteristics and sensor measurements; the oversimplification problem often observed in conventional parametric approaches can therefore be avoided. Environmental measurements are used as a posteriori information for the physical interpretation of the inverse analysis results, which is commonly not straightforward in other nonparametric approaches. If this approach is successful, the cost of expensive data collection could also be reduced.
The tilt time histories measured from the retaining wall are shown in Figure 5. The slope is given in microradians, and a positive sign indicates a slope toward the apartment side. The slope signals at all three locations were significantly affected by seasonal and daily variations: they decreased during summer and increased during winter, and they decreased during the day and increased during the night, as reflected in the daily trends (not clearly visible in the figure because of the scale). During the three-year monitoring period, the wall behavior was affected by temperature change as well as by rainfall and snowfall, freeze-thaw of the backfill soil, soil-structure interaction, and so on.
Figure 5 also shows that the collected sensor data are partially incomplete. The lower sensor failed in Q1 2006 (after approximately one year). Data from all sensors were missing for about three months in Q4 2006 because of instrument failure. Such unavoidable and unpredictable sensor and instrumentation problems are frequently encountered in long-term field measurements, and the proposed nonparametric methodology should be robust enough to handle them. The figure thus illustrates how little data are available relative to the complexity of the given problem, a situation commonly encountered in many geotechnical applications.
Three nonparametric data processing techniques were used: the empirical mode decomposition (EMD) and the Hilbert-Huang transform (HHT) for single-channel (or univariate) analysis, and the principal component analysis (PCA) for multichannel (or multivariate) analysis. A summary of the proposed nonparametric data processing approaches is provided in Table 1.
A brief description of the EMD-HHT was given in Section 5.2, and the analysis procedures of the EMD-HHT are summarized in Figure 6. Due to the complexity of the geotechnical system coupled with long-term environmental variation, the raw sensor data shown in Figure 6 are usually too complicated to be interpreted for performance assessment. Thus, a daily trend was disentangled using the EMD based on its period of one day out of the raw signal even with missing data for three months in the second year, and a sample result is shown in Figure 6(b). The disentangled daily trend of the slope is mostly influenced by the daily fluctuation of the wall surface temperature (i.e., the wall inclined toward the apartment during daytime and toward the backfill during night time). Once the daily trend was disentangled, the instantaneous frequency of the daily trend was obtained using the HHT as shown in the time-frequency plot of Figure 6(c).

Table 1: A summary of the nonparametric identification approaches employed in the case study for a full-scale retaining wall, subjected to long-term environmental variations.

| Technique | Data type | Purposes |
| --- | --- | --- |
| Empirical mode decomposition (EMD) | Univariate | (1) To decompose nonlinear and nonstationary environmental variations of daily, seasonal and long-term trends from raw sensor measurements. (2) To decompose complex raw measurements into simpler and physically "well-behaving" intrinsic mode functions for better understanding of the system. |
| Hilbert-Huang transform (HHT) | Univariate | (1) To obtain the instantaneous frequencies for nonlinear, non-stationary, time-varying systems. (2) The obtained instantaneous frequencies could be used to detect changes in "abnormal" system characteristics in time. |
| Principal component analysis (PCA) | Multivariate | (1) To find interchannel relationships with multi-input data (note that the EMD and HHT are single-channel data processing techniques). (2) To visualize the mode shapes of the system decomposed by the corresponding orthogonal principal components. (3) To quantify the energy of inter-channel motions for each mode shape and find the dominant one. |
The time history of the daily trend has a period of one day, and the corresponding instantaneous frequency has a baseline of one cycle per day, as shown in Figures 6(b) and 6(c). Occasional amplitude reductions are observed in the time history in Figure 6(b) (e.g., 3/11, 3/15, 3/21, and 4/5 through 4/9), and during these times the corresponding instantaneous frequencies become significantly larger than the baseline frequency. Hourly precipitation records collected separately at the weather station nearest to the wall site are plotted in Figure 6(d). The precipitation data were not used in our analysis. Interestingly, the comparison with the instantaneous frequency in Figure 6(c) shows that the peaks of the instantaneous frequency coincide with precipitation events, and the frequency decreases back to the baseline (i.e., one cycle per day) when the precipitation stops.
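The instantaneous-frequency step can be illustrated with a short sketch. It assumes that the daily trend has already been extracted by the EMD (the sifting step is not shown), builds the analytic signal with a standard FFT construction, and differentiates the unwrapped phase. The function names and the synthetic signal, in which the frequency doubles during a hypothetical two-day disturbance, are illustrative assumptions rather than the study's data or code.

```python
import numpy as np

def analytic_signal(x):
    """Analytic signal x + i*H[x], built by zeroing negative FFT frequencies."""
    x = np.asarray(x, dtype=float)
    n = x.size
    spectrum = np.fft.fft(x)
    weights = np.zeros(n)
    weights[0] = 1.0                          # keep the DC term
    if n % 2 == 0:
        weights[n // 2] = 1.0                 # keep the Nyquist term
        weights[1:n // 2] = 2.0               # double the positive frequencies
    else:
        weights[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(spectrum * weights)

def instantaneous_frequency(imf, samples_per_day=24):
    """Instantaneous frequency in cycles/day from the unwrapped phase."""
    phase = np.unwrap(np.angle(analytic_signal(imf)))
    return np.diff(phase) / (2.0 * np.pi) * samples_per_day

# Synthetic "daily trend" (hypothetical): 30 days of hourly samples at
# 1 cycle/day, except for days 10-12 where the frequency is 2 cycles/day.
hours = np.arange(24 * 30)
true_freq = np.ones(hours.size)
true_freq[24 * 10:24 * 12] = 2.0
phase = 2.0 * np.pi * np.cumsum(true_freq) / 24.0
daily_imf = np.cos(phase)

f_inst = instantaneous_frequency(daily_imf)
baseline = np.median(f_inst)                               # robust baseline estimate
disturbed = np.median(f_inst[24 * 10 + 12:24 * 12 - 12])   # interior of the disturbance
print(round(float(baseline), 2), round(float(disturbed), 2))
```

The median is used for the baseline because the Hilbert-based estimate can ripple near abrupt changes; a field signal would also show the amplitude reduction described above, which this synthetic example does not model.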
These results demonstrate an important advantage of the nonparametric techniques over conventional parametric methods in monitoring applications. Without a priori information, physical assumptions, or oversimplification of the monitored structure, the daily trend can be disentangled from a complicated raw slope signal. When precipitation occurs, the normal pattern in the slope signal (i.e., the system response in Figure 1) is "disturbed" because the structural characteristics (i.e., the system characteristics function) change with the increased water content in the backfill. Consequently, the pattern of the disentangled daily trend is also disturbed in its amplitude and frequency. After the precipitation stopped, the pattern in the raw slope time history returned to the normal condition with a working drainage system, which drains away the excess water in the soil, and so did the pattern of the disentangled daily trend. If the pattern of the disentangled signal did not return to normal after the precipitation stopped (i.e., the instantaneous frequency in Figure 6(c) did not return to the baseline frequency), it could be concluded that the drainage system is not working properly. A critical difference between the raw and the processed signals is that the precipitation effect cannot be recognized in the raw signal, because it is overshadowed by other dominant non-performance-related effects, such as temperature, as shown in Figure 6(a); in contrast, the important drainage-related information can be extracted from the disentangled signal, as shown in Figure 6(c).
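The drainage reasoning above can be expressed as a simple rule, sketched below. The function name `drainage_recovers`, the tolerance, and the 48-hour recovery window are hypothetical choices made for illustration, not values from the study. Precipitation records enter only as a posteriori information, consistent with the response-only analysis.

```python
import numpy as np

def drainage_recovers(inst_freq, rain, baseline=1.0, tol=0.2, max_recovery_hours=48):
    """For each end of a rain event, report whether the instantaneous frequency
    comes back to within `tol` of `baseline` inside the recovery window.

    Returns a list of (hour_index_when_rain_stopped, recovered) pairs.
    """
    inst_freq = np.asarray(inst_freq, dtype=float)
    raining = np.asarray(rain, dtype=float) > 0.0
    # A rain event ends where a wet hour is followed by a dry hour.
    ends = np.flatnonzero(raining[:-1] & ~raining[1:]) + 1
    results = []
    for end in ends:
        window = inst_freq[end:end + max_recovery_hours]
        near_baseline = np.abs(window - baseline) <= tol
        results.append((int(end), bool(near_baseline.any())))
    return results

# Hypothetical hourly records: ten days with one 12-hour rain event.
n_hours = 24 * 10
rain = np.zeros(n_hours)
rain[48:60] = 2.0

freq_ok = np.ones(n_hours)
freq_ok[48:66] = 2.5          # disturbed during rain, back to baseline 6 h later

freq_bad = freq_ok.copy()
freq_bad[60:] = 2.5           # never returns to baseline after the rain

print(drainage_recovers(freq_ok, rain))    # recovery observed
print(drainage_recovers(freq_bad, rain))   # no recovery: drainage worth inspecting
```

A stricter variant could require the frequency to stay near the baseline for a sustained period rather than touching it once; the choice depends on how noisy the instantaneous frequency is.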
The principal component analysis (PCA) technique was used as a multi-sensor analysis method. A brief description of the PCA was provided in Section 5.4. In order to find the optimal window size, the statistics of the first PCA mode shape, which is associated with the largest contribution to the energy of the total wall motion, were calculated. Figure 7 shows the mean values of the eigenvectors as dashed lines, with the one-standard-deviation (1σ) uncertainty shown as shaded areas. The statistics were calculated with different window sizes (i.e., numbers of days) up to 60 days; a window of one day contains 24 data points at the given sampling rate of 1 sample/hr. Since the expectation of the PCA mode shape becomes statistically unbiased after about 14 days (i.e., the mean and standard deviation values begin to saturate), a window size of two weeks was selected for the PCA in this study.
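The window-size study can be sketched as follows. This is an illustrative reconstruction under stated assumptions: non-overlapping windows, a sign convention for the eigenvectors, and synthetic three-channel data with a shared daily cycle and white noise. None of these details are specified in the text, and the function names are hypothetical.

```python
import numpy as np

def first_mode(X):
    """Unit eigenvector of the covariance matrix with the largest eigenvalue."""
    Xc = X - X.mean(axis=1, keepdims=True)
    cov = Xc @ Xc.T / (X.shape[1] - 1)
    _, V = np.linalg.eigh(cov)               # eigenvalues come back in ascending order
    v = V[:, -1]
    # An eigenvector's sign is arbitrary; fix it so windows can be averaged.
    return v if v[np.argmax(np.abs(v))] > 0 else -v

def first_mode_statistics(X, window_days, samples_per_day=24):
    """Mean and standard deviation of the first mode over non-overlapping windows."""
    width = window_days * samples_per_day
    n_windows = X.shape[1] // width
    modes = np.array([first_mode(X[:, k * width:(k + 1) * width])
                      for k in range(n_windows)])
    return modes.mean(axis=0), modes.std(axis=0)

# Hypothetical data: 364 days of hourly samples on three channels.
rng = np.random.default_rng(1)
t = np.arange(24 * 364)
daily = np.sin(2 * np.pi * t / 24)
X = np.outer([3.0, 2.0, 1.0], daily) + rng.standard_normal((3, t.size))

for days in (1, 2, 7, 14, 28):
    mean, std = first_mode_statistics(X, days)
    print(days, np.round(mean, 3), np.round(std, 3))

mean_1, std_1 = first_mode_statistics(X, 1)
mean_14, std_14 = first_mode_statistics(X, 14)
```

In this synthetic setting the scatter of the first mode shrinks as the window grows, which is the saturation behavior used above to justify a two-week window.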
Figure 8 shows the PCA mode shapes with error bars of one standard deviation (1σ). In the figure, the mode shapes of the wall slopes were converted to displacements using the known heights of the sensor locations. The μ and σ in parentheses are the mean and standard deviation of the eigenvalue corresponding to each mode, normalized to the sum of the eigenvalues of all modes. Although no information on physical characteristics was used, Figures 8(a)-8(c) illustrate that the PCA mode shapes agree with the first, second, and third bending modes of a cantilever. The PCA eigenvalues show that the motion of the first mode is dominant: it accounts for 97.3% of the entire motion energy, with a standard deviation of 2.1%. This dominant motion is associated with the significant daily and seasonal trends shown in Figure 5, which are most likely due to diurnal and seasonal temperature variation. For the purpose of structural health monitoring, this dominant low-order mode is less interesting, since the important information for condition assessment is performance related, not environment related. In addition, structural damage is usually a localized phenomenon, so higher modes would offer better spatial resolution for detecting it.
Figures 8(a)-8(c) were created using the data from 2005, before the bottom tilt gauge was damaged. The same PCA procedures were applied to the data recorded after the tilt gauge was permanently damaged in Q1 2006, and the results are compared in Figures 8(d)-8(f). The first mode after the damage (Figure 8(d)) is similar to the one before the damage (Figure 8(a)), except that the deviation of the mode shape increased after the damage. The comparison of Figures 8(b) and 8(e) shows that an excessive amount of movement, unusual for a cantilever-type wall structure, appeared after the bottom sensor was damaged. The mean contribution of the first mode to the total energy of the wall motion decreased from 97.3% (standard deviation 2.1%) to 82.3% (standard deviation 14.3%), and that of the second mode increased from 2.3% (standard deviation 2.0%) to 14% (standard deviation 14.4%), while the energy contribution and shape of the third mode remained similar, as shown in Figures 8(c) and 8(f).
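To put the reported shift in perspective, the change in each mode's mean energy contribution can be expressed in units of the pre-damage standard deviation. The short calculation below uses only the percentages quoted above; treating this ratio as a z-score is an illustrative heuristic (normalized eigenvalues are bounded and need not be Gaussian), not a test performed in the study.

```python
def z_score(value, baseline_mean, baseline_std):
    """Shift of `value` from the baseline mean, in baseline standard deviations."""
    return (value - baseline_mean) / baseline_std

# Mean energy contributions (%) before and after the sensor damage, from the text.
z_mode1 = z_score(82.3, baseline_mean=97.3, baseline_std=2.1)   # first mode
z_mode2 = z_score(14.0, baseline_mean=2.3, baseline_std=2.0)    # second mode

print(f"mode 1: {z_mode1:.2f} sigma, mode 2: {z_mode2:.2f} sigma")
```

Both shifts are several pre-damage standard deviations in magnitude, which is why the change stands out against the 2005 baseline.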
Figure 9 shows the time histories of the PCA eigenvalues. In the figure, the time history of the first mode is shown as a solid line (black), the second mode as a dashed line (red), and the third mode as a dash-dot line (blue). The eigenvalues of Modes 1 and 2 changed significantly from March 2006, the same time at which the bottom sensor was damaged.
Based on the single-channel and multi-channel analysis results discussed in Section 6, the following important conclusions can be drawn for general monitoring applications of geotechnical structures.
(i) From the time histories of Figures 6(c) and 9, it can be determined when abnormal behaviors of the wall occur. These abnormal behaviors are related to the performance of the structure and are commonly overshadowed by the significant effects of environmental variation. The disentanglement techniques, such as the EMD and the PCA, allow filtering out the environment-related information and focusing on the performance-related information.
(ii) From the mode shapes at the lower sensor in Figures 8(d)-8(f) (particularly Figure 8(e)) for the PCA, or using the information of the upper sensor location in the EMD-HHT in Figure 6, it can also be determined where the abnormal behaviors occur.
(iii) Using the statistics (e.g., error bars) of the eigenvalues and eigenvectors of the PCA modes in Figure 8, the confidence levels of detecting abnormal behaviors can be quantified in combination with standard statistical hypothesis tests or classification techniques. It should be noted that, since the PCA modes are statistically uncorrelated (or statistically independent for the independent component analysis), uncertainty quantification for statistical tests can be carried out with three separate single integrals (one for each of the three slope measurements) rather than a triple integral. For example, it was observed that the cross-correlation values of the PCA eigenvalues between different modes are low (less than 0.6404), as summarized in Table 2. This property is particularly important when a large number of sensors are used.

Summary and Conclusions
The modeling procedures of the nonparametric methods are data driven, not based on a priori physical knowledge of the monitored structure. Therefore, the methodology developed by the authors is not limited to a specific type of structure and could be applicable to a wide range of monitoring applications for different geotechnical structures. Given the diversity of the characteristics of geotechnical structures, the nonparametric methodology could significantly reduce the modeling effort in various monitoring applications, which has been a technical barrier for conventional parametric approaches.
The important performance-related information (e.g., effects of drainage or malfunctioning sensors) could be obtained using a very limited amount of response-only sensor data (i.e., three tilt time histories). The decomposition techniques used in this study could disentangle the response deformation data of a complex system subject to long-term environmental variations without information on the causative force, environment, or structural characteristics. For example, since the precipitation records were not used in the EMD-HHT, it was demonstrated that oversimplification problems could be avoided using response-only analysis techniques that are not based on explicit input-output relationships. Therefore, the nonparametric methodology discussed in this paper could provide the important information of when, where, and how confidently engineers should be deployed to the site for potential performance hazards of monitored structures, using very little information and without sacrificing the accuracy of the inverse analysis. The common practical problem of unpredictable sensor and instrument network malfunctions could also be dealt with effectively using the nonparametric methodology.

Figure 3: A schematic of the typical structure of Artificial Neural Networks (modified from Shahin et al. [103]).

Figure 4: A full-scale retaining wall used in this study. The wall is an L-type cantilever reinforced concrete wall 13.59 m high. The retaining wall is subject to long-term environmental variations [4].

Figure 6: The procedures of the single-sensor analysis using the empirical mode decomposition (EMD) and the Hilbert-Huang transform (HHT), compared with the precipitation records that were separately measured at a weather station near the retaining wall site. Note that the precipitation data were not used in the time-frequency analysis. (a) The one-month-long raw tilt time history shows the dominant daily trend mixed with various nonlinear, non-stationary events due to unknown factors over time. The material properties of the reinforced concrete and backfilled soil were unknown. The raw sensor data are too complicated to understand what happens to the wall. (b) The daily trend was disentangled from the complex raw signal using the EMD. (c) The disentangled daily trend was processed using the HHT to obtain instantaneous frequencies over time. The baseline frequency remains at one per day, but some peaks were observed occasionally. (d) Precipitation records measured separately at a weather station near the wall site were compared on the same time scale. Concurrence was observed between the peaks in the instantaneous frequencies and the precipitation records. The peak of the instantaneous frequency increased when precipitation began and decreased when precipitation stopped, which implies that the drainage system is performing satisfactorily.

Figure 7: The statistics of the first eigenvector of the PCA mode shapes for different window sizes (i.e., number of days). The statistics were obtained using the data in year 2005 (before the lower tilt gauge was damaged).

Figure 8: The PCA mode shapes with error bars of one standard deviation before and after the damage of the bottom tilt gauge. The μ and σ are the mean and standard deviation of the corresponding eigenvalue, which is normalized to the sum of the eigenvalues of all modes. (a)-(c) show the mode shapes before the bottom tilt gauge was permanently damaged in Q1 2006, and (d)-(f) show the mode shapes after the sensor was damaged.

Figure 9: Time histories of the PCA eigenvalues normalized to the sum of the eigenvalues of all modes.The time history of the first mode is shown in the solid line (black), the second mode in the dashed line (red), and the third mode in the dash-dot line (blue).

Table 2: Cross-correlation coefficients of the PCA eigenvalues between different modes.