Managing Information Uncertainty in Wave Height Modeling for the Offshore Structural Analysis through Random Set

This chapter presents a reliability study for an offshore jacket structure with emphasis on the features of nonconventional modeling. First, a random set model is formulated for the random waves at an ocean site. Then, a jacket structure is investigated in a pushover analysis, based on the ultimate base shear strength, to identify the critical wave direction and the key structural elements. Selected probabilistic models are adopted for the important structural members, and the wave direction is set to the weakest direction of the structure for a conservative safety analysis. The wave height model is processed in P-box format when it is used in the numerical analysis. The models are applied to find the bounds of the failure probability for the jacket structure. The propagation of the wave model uncertainty to the results is investigated with both an interval analysis and Monte Carlo simulation. The results are compared in the context of information content and numerical accuracy. Further, the failure probability bounds are compared with those from the conventional probabilistic approach.


Introduction
Reliable estimation of extreme values of wave height is an important prerequisite for the design of coastal and offshore structures [1,2]. Many estimation methods have been adopted by researchers; they are summarized in Muir and El-Shaarawi [3], Goda [4], Guedes Soares [5], and Zhang and Lam [6]. These cover a wide range of statistical models fitted to measured ocean parameters, such as the lognormal [7], Weibull [8], Generalized Gamma [9], and Beta [10] probability distributions. Besides these, the Peak over Threshold (POT) method is considered quite powerful for modeling extreme wave height [11][12][13][14][15] (Zhang and Cao 2015). The main problem in the POT method is the choice of a suitable threshold, a matter that is currently being investigated by many researchers [16]. Since the threshold determines which data are treated as "extremes," the prediction of long-term extreme values depends on this choice and is not reliable. A generic methodology for selecting an accurate threshold is still lacking. The review indicates that traditional statistical methods, which rely on assumptions, are not suitable for handling such uncertainty. Thus, this study explores nontraditional models as an alternative for modeling extreme wave height.
The analysis and design of offshore structures treat waves as the decisive loads. Realistic modeling of the wave loads is particularly important to ensure a sufficiently reliable performance of these structures [17,18]. Unfortunately, there are large variations in the wave load. The wave height, which is a major factor in the wave load, is believed to be time-varying and can differ significantly under different climate conditions [19]. These variations can be caused by a wide range of factors such as interseasonal, interannual, and interdecadal changes [20].

Complexity
This has been noticed by many climatologists in recent times [21][22][23][24]. The impact of these variations on marine applications is also mentioned in [25].
Meanwhile, it is recognized that engineering safety assessment may involve various types of uncertainty models. In practical problems, different forms of model may be needed in the same system [26]. This requires new developments in modeling that can combine different types of information. One such model is the random set, which is a type of imprecise probability (Walley 1990). The random set, sometimes known as a P-box, is an extension of traditional probability theory that allows for intervals or sets of probabilities [27]. It can represent not only distributions with unknown parameters, but also distributions with unknown modes or unknown dependency parameters. The use of the P-box has shown significant advantages in many engineering case studies investigated by previous researchers [28][29][30]. The computational techniques associated with imprecise probability in structural analysis have been developed in parallel. Ferson and Donald [31] developed a formal probability bounds analysis that facilitates computation. Other similar methods can be found in Berleant [32]. These were shown to be quite useful in engineering design work [33]. The algorithms in these methods mainly belong to interval arithmetic or Monte Carlo simulation.
The main focus of this study is to investigate the uncertainties in wave height modeling and their application in offshore engineering. Time series data measured by a buoy located off the west coast of the US are selected for extreme value modeling. The original data are first analyzed with parametric models to estimate the extremes. The Peak over Threshold (POT) method is then applied to different data sets. To account for information uncertainty, a nonstationary Poisson point process is used to characterize the occurrence rate of the extremes. The stability of the Pareto family used in extreme value modeling is investigated from several statistical points of view to obtain a feasible range for the threshold. Random set theory is then used to formulate an imprecise probability model for the extremes from a set of thresholds, and the constructed uncertainty model is introduced to represent the extreme wave height. A further aim is to investigate the handling of imprecise probabilistic information and to show that imprecise probability can provide a framework for processing incomplete information in engineering analysis. Therefore, the uncertainty model is carried through a reliability analysis that also considers several uncertainties in mechanical properties. Finally, the conclusions are presented.

Data Used
The data used in this study were downloaded from the National Data Buoy Center (NDBC) for buoy 46029 (http://www.ndbc.noaa.gov/; accessed Nov 2010), which is located west of the Columbia River Mouth. The exact measured location is 46°8′37″ N, 124°30′37″ W. The water depth is 135.3 m and the watch circle radius is 48.3 km (281 yards). The recorded significant wave height (Hs) data date back to 1984 and are available up to recent times.
The collected Hs data for each year contain 8766 observations based on hourly records. They show clearly that the winter period, from September to April, is the roughest time of the year. This dominant season of strongest storms can cause the Hs to differ significantly from the other times and is treated as a separate data set. Thus, the data corresponding to this period, with a total sample size of 5088, are chosen for the investigation. For a more accurate analysis, only the parts of the data with a high degree of completeness are extracted for this study. These correspond to the years 01-02, 02-03, 03-04, 05-06, 06-07, and 09-10, which have percentages of missing data of 2.79%, 0.33%, 0.94%, 3.01%, 2.14%, and 1.10%, respectively. The time series data are plotted in Figures 1-6. The next section presents the statistical modeling of extreme values in the wave height record.

Application of POT in Modeling the Extreme Wave Height
Compared to the traditional extreme statistical models, POT does not require the data to have stringent statistical similarities [34]. As long as the data are stationary with a weak dependence structure during the reference period, the POT method is appropriate for modeling the extremes. In this case, since the original data are selected from a specified sample period, the winter, the statistical properties of the time series are assumed to be stationary. Therefore, the uncertainties resulting from different populations can be largely eliminated. Before the POT method is applied to the time series, a declustering scheme is carried out. This includes the selection of an appropriate threshold and time span Δ to separate the "extreme events" from the original data (see Figure 7). Both parameters are time dependent because of nature's variability. For the selection of a suitable time span Δ, a wide range of values has been suggested by researchers (see [26, 35, 36, 37]). The suggested optimum is the minimum time span that still guarantees a persistent Poisson process for the extremes in the time series. In this study, the intensity function is modeled as a constant, implying that the occurrence rate remains unchanged throughout the reference time. Here, three values of Δ have been selected for testing the temporal dependence between extremes: 1 day, 3 days, and 10 days. The threshold used in this declustering is 4.0. The case with no separation, that is, a zero time span, is also included to assess the effect of dependence on the modeling. As Δ increases, the number of extreme events decreases, which leads to a larger estimate of the return value. The same happens when the threshold is increased. As shown in Figure 8, incorporating a time span in the separation of extremes greatly improves the fit of the theoretical Poisson model. It turns out that the dependence is high when the time span is not applied. However, the fit becomes poor once the time span is increased to 10 days. The reason is that too few data remain for the model to be fitted efficiently. Therefore, a time span Δ of 1 day or 3 days is considered appropriate for the Poisson model. In this study, a time span Δ = 1 day is adopted.
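The declustering step described above can be sketched as follows. This is a minimal illustration, not the procedure actually used in the study: the function name `decluster`, the convention that a new cluster starts when the gap since the previous exceedance is larger than the time span, and the sample series are all assumptions made for the example.

```python
def decluster(series, threshold, min_gap):
    """Return one peak (index, value) per cluster of threshold exceedances.

    series    : hourly Hs values; None marks a missing record (assumed convention)
    threshold : declustering threshold, in the same unit as the series
    min_gap   : time span in hours; an exceedance no more than min_gap hours
                after the previous exceedance is assigned to the same cluster
    """
    peaks = []
    last_exceed = None   # index of the most recent exceedance
    current = None       # (index, value) of the running cluster maximum
    for i, h in enumerate(series):
        if h is None or h <= threshold:
            continue
        if last_exceed is not None and i - last_exceed <= min_gap:
            # Same storm: keep only the largest value seen so far.
            if h > current[1]:
                current = (i, h)
        else:
            # New storm: store the previous cluster maximum, start a new one.
            if current is not None:
                peaks.append(current)
            current = (i, h)
        last_exceed = i
    if current is not None:
        peaks.append(current)
    return peaks


# Hypothetical hourly record, threshold 4.0, time span of 3 hours.
demo = [1, 5, 6, 2, 2, 5.5, 1, 1, 1, 1, 7, 1]
demo_peaks = decluster(demo, 4.0, 3)
```

With a zero time span every exceedance becomes its own event, which corresponds to the "no separation" case; larger spans merge neighboring exceedances and reduce the number of events.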
The selection of an appropriate threshold is determined by the stability of the Pareto distribution in the modeling. The crucial aspects of the generalized Pareto distribution (GPD) model are tested by increasing the threshold. The change of the extreme occurrence rate with the threshold is summarized in Table 1. It is clear that the stability of the GPD model is well maintained for thresholds from 2.4 to 6.4, except for small deviations at 2.6 and 2.8. In fact, the threshold is not considered suitable when it is below 3.5. As a result, a lower limit of 3.5 is required for the threshold.
Finally, after inspection of these plots, it can be concluded that a range of [3.6, 6.4] for the threshold is adequate for the Pareto model. Uncertainty still exists in the selection of the threshold within this range.
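A threshold scan of this kind can be sketched in a few lines. The example below is only illustrative: it fits the GPD by the method of moments (the study does not state its estimator), uses the standard POT return level expression u + (σ/ξ)[(λN)^ξ − 1] with λ the mean number of exceedances per year, and all function names and sample data are hypothetical.

```python
import math
import random
import statistics


def fit_gpd_moments(excesses):
    """Method-of-moments GPD fit (valid for shape < 0.5); returns (shape, scale)."""
    m = statistics.fmean(excesses)
    v = statistics.variance(excesses)
    ratio = m * m / v
    shape = 0.5 * (1.0 - ratio)
    scale = 0.5 * m * (1.0 + ratio)
    return shape, scale


def return_level(threshold, shape, scale, rate_per_year, years):
    """N-year return level for a POT model with a constant occurrence rate."""
    n = rate_per_year * years
    if abs(shape) < 1e-9:
        # Exponential limit of the GPD.
        return threshold + scale * math.log(n)
    return threshold + scale / shape * (n ** shape - 1.0)


def threshold_scan(peaks, thresholds, n_years, years=25, min_count=10):
    """Fit the GPD above each threshold; rows are (u, count, shape, scale, level)."""
    rows = []
    for u in thresholds:
        excesses = [p - u for p in peaks if p > u]
        if len(excesses) < min_count:
            continue  # too few exceedances for a meaningful fit
        shape, scale = fit_gpd_moments(excesses)
        rate = len(excesses) / n_years
        rows.append((u, len(excesses), shape, scale,
                     return_level(u, shape, scale, rate, years)))
    return rows


# Synthetic declustered peaks (hypothetical), scanned over two thresholds.
rng = random.Random(0)
demo_peaks = [3.0 + rng.expovariate(1.0) for _ in range(500)]
demo_rows = threshold_scan(demo_peaks, [3.6, 4.0], n_years=6)
```

If the fitted shape and the resulting return level change little from one threshold to the next, the GPD can be regarded as stable over that range, which is the kind of evidence the threshold range above rests on.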

Proposed Random Set and P-Box Modeling
The GPD models are found to be valid for a set of threshold values. The Poisson process does not allow a definite judgment on the use of a specific threshold. This imprecise information in the threshold cannot be eliminated in a traditional statistical way, as the threshold is not consistent along the time series data. Therefore, a single threshold value is not recommended for an accurate analysis.
In contrast to the traditional statistical method, the random set can be combined with the POT model to quantify the uncertainties associated with the threshold. From this point of view, a nonconventional model is proposed that takes advantage of the random set. Different thresholds are considered as different information sources. Each of these information sources is represented by a random set $(\Im_i, m_i)$. These are combined by applying an averaging procedure to the probability mass assignment. Here, the feasible range for the threshold, [3.6, 6.4], is divided into 14 intervals and 15 thresholds are obtained from these intervals. Table 2 shows the focal sets for the 25-year return value obtained from each threshold, together with the probability mass assignment. Based on the Dempster-Shafer theory, the bounds for the cumulative probabilities can be obtained from these random sets (see Figure 9). The models are also calculated for the 50- and 100-year return values and presented in the figure. It can be seen that the bounds on the estimates become more imprecise as the return period increases. The belief function in this case shows larger variations than the plausibility function and thus gives a more imprecise upper bound for the return value. Figure 9 gives a comprehensive representation of the uncertainties within the GPD model. The sensitivity of the estimated return level to the threshold can easily be seen from the plot. It also indicates whether further information is needed to reduce the uncertainties in the model formulation, and it leaves enough flexibility for engineering decisions. The random set approach here provides a general combination rule that quantifies the statistical uncertainties met in the POT model. The imprecise bounds can capture the full scope of uncertainty in the selection of the threshold. The analysis suggests the possibility and advantages of using the proposed nonconventional model for the prediction of extreme significant wave height.
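The combination of sources and the resulting probability bounds can be sketched as follows. The sketch assumes that each threshold contributes one or more focal intervals for the return value, that averaging means dividing each source's masses by the number of sources, and that the bounds on the cumulative probability are the belief and plausibility of the event X ≤ x. The interval values are made up for illustration and are not those of Table 2.

```python
def average_masses(sources):
    """Combine random sets by averaging their probability mass assignments.

    sources : list of random sets, each a list of ((low, high), mass) pairs
              whose masses sum to 1.
    """
    n = len(sources)
    combined = []
    for focal_sets in sources:
        for (low, high), mass in focal_sets:
            combined.append(((low, high), mass / n))
    return combined


def belief_cdf(focal, x):
    """Lower bound on P(X <= x): mass of focal intervals lying entirely at or below x."""
    return sum(m for (low, high), m in focal if high <= x)


def plausibility_cdf(focal, x):
    """Upper bound on P(X <= x): mass of focal intervals that reach down to x or below."""
    return sum(m for (low, high), m in focal if low <= x)


# Three hypothetical thresholds, each giving one focal interval (metres).
demo_sources = [
    [((9.0, 11.0), 1.0)],
    [((9.5, 12.0), 1.0)],
    [((10.0, 13.0), 1.0)],
]
demo_focal = average_masses(demo_sources)
demo_bounds = (belief_cdf(demo_focal, 11.5), plausibility_cdf(demo_focal, 11.5))
```

Evaluating the two functions over a grid of x values traces out the pair of step curves that bound the cumulative distribution of the return value.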

Reliability Analysis of Offshore Structures: P-Box Approach
In practical problems, the random set model needs to be used in a structural numerical analysis. This requires new developments in the management of uncertainty information from an engineering perspective. This section presents a P-box reliability study for an offshore jacket structure with an emphasis on the features of nonconventional random set modeling.

Structure Description.
A realistic North Sea jacket structure taken from the USFOS example model is analyzed [24]. The structure is shown in Figure 10.
The structure is an 8-leg jacket, designed for a water depth of 110 meters. The legs are arranged in a two by four rectangular grid with the central pair of legs on the platform. Overall, the dimensions at top elevation are 27 × 54 m, with launch legs twenty meters apart, and the dimensions at the bottom line are 56 × 70 m. Total height is 142 m, with horizontal bracings at 5 levels (see Figure 11). More detailed structural descriptions can be found in the USFOS Manual [38].

Static Pushover Analysis.
The ultimate strength of the jacket structure is determined through a static pushover analysis in USFOS. For this example, the wave and current are considered as the only external loads and other environmental factors are ignored. The base shear strength is considered as the resistance in the reliability analysis.
Particular attention was paid to the effects of the phase and direction of the incoming wave. Here, the jacket structure is first analyzed with various wave directions and phases to identify the critical response in the base shear. The investigation is carried out in all wave directions with full consideration of the wave phases (Figure 12). It shows that the direction of 180°, which has the smallest ultimate base shear strength, is the most critical state; this direction is used in the following analysis as a conservative choice.
Besides the modeling of variations in the wave characteristics, several uncertainties associated with the mechanical properties of the key structural members are also considered. In the ultimate strength analysis of the platform, the diagonal members below sea level showed a high degree of plasticity utilization before the base shear failure of the jacket (see Figure 13). This indicates the importance of the diagonal members over the other members for the ultimate strength of the whole structure. Thus, in the present study, selected uncertainties associated with manufacturing and the corrosion effect (reduction in thickness) are applied to the key elements of the jacket structure. The yield strength of the steel BS 968 for high strength tubes is described by a lognormal distribution with a coefficient of variation of 0.05 to 0.08 (Baker 1969). The uncertainty in the thickness associated with the corrosion effect is modeled by a normal distribution with a coefficient of variation of 0.17. This is based on an experimental study [24]. A detailed summary of the data is given in Table 3.
The ultimate base shear resistance is determined by the environmental design load multiplied by the Reserve Strength Ratio (RSR), which is defined as

$$\mathrm{RSR} = \frac{\text{ultimate resistance of intact structure}}{\text{design environmental load}}.$$

In this example, a design environmental wave load corresponding to a wave height of 25 m and a period of 16 seconds is used. A constant current of 2 m/s is also included in the design environmental load. The response surface method is used to approximate this response value, taking the yield strength and thickness as inputs [39]. A quadratic polynomial with cross terms is selected to approximate the relationship between RSR and these two inputs in (3), where the yield strength is in units of $10^{8}$ N/mm$^{2}$ and the thickness is in units of $10^{-2}$ m. The adequacy of the selected model can be checked with the residual plots (see Figure 14). The response base shear can be approximated by (4), which relates it to the wave height and current [40], where $V$ is the speed of the current and $c_1$, $c_2$, and $c_3$ are constants. As the speed of the current is assumed to be fixed, the factors related to the current speed can be considered as constants. These constants are determined by a curve fitting procedure (Figure 15). This gives the approximated equation (5) for the response base shear, where the response base shear is in MN and the wave height is in meters. The wave period is taken to be 16 seconds and is assumed constant in the analysis. However, there are many other methods that could be used to predict the extreme wave load [41][42][43][44][45][46][47][48][49]. For example, a NewWave profile [50] gives the most probable shape of a large linear crest elevation. The quality of the curve fitting is examined through the residuals between the computed and fitted values. The residual plots are presented in Figure 16. It can be seen that the residuals closely follow a normal distribution. The variance is generally quite small, which indicates a well fitted model. However, the errors contained in this formulation consist not only of the error from (3) but also of the error from (5). Both equations, approximating the strength and the load, contain errors. The discussion later in this paper should keep this error propagation in mind.
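For orientation, the general shapes of the two approximations described above can be written out as follows. This is only a sketch of the functional forms. The symbols $\sigma_y$ (yield strength), $t$ (thickness), $S$ (response base shear), $H$ (wave height), and the coefficient names $a_0, \ldots, a_5$ are notation assumed here, and the fitted numerical values are not reproduced. The second expression is a power-law form often used for jacket base shear under combined wave and current; it matches the description of three constants and a current speed $V$, but it should be read as an assumption about the form of (4).

```latex
\begin{align*}
\mathrm{RSR} &\approx a_0 + a_1\,\sigma_y + a_2\,t + a_3\,\sigma_y^{2} + a_4\,t^{2} + a_5\,\sigma_y\,t, \\
S &\approx c_1\,\bigl(H + c_2\,V\bigr)^{c_3}.
\end{align*}
```

With $V$ fixed, the second expression depends on $H$ only, so the constants can be obtained from a one-dimensional curve fit.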
Finally, the performance function is formulated from the ultimate base shear resistance and the response base shear. Consequently, the ultimate base shear failure probability is obtained as $P_f = \Pr(g < 0)$, where $g$ denotes the performance function. As both a probabilistic model and a P-box model exist in the system, the failure probability investigation may involve interval probability analysis. This can be viewed as a general mapping from the input interval of the wave height to the failure probability. The main difference in this case is that the upper and lower bounding values are not precisely determined (P-box approach).
A P-box describes a mixed case: it specifies bounds on the probability of an uncertain quantity whose underlying randomness is not known in detail. Suppose $\overline{F}$ and $\underline{F}$ are nondecreasing functions mapping the real line $\mathbb{R}$ into [0, 1] and $\underline{F}(x) \le \overline{F}(x)$ for all $x \in \mathbb{R}$. Let $[\underline{F}, \overline{F}]$ denote the set of all nondecreasing functions $F$ from the reals into [0, 1] such that $\underline{F}(x) \le F(x) \le \overline{F}(x)$. When the functions $\underline{F}$ and $\overline{F}$ circumscribe an imprecisely known probability distribution, the model $[\underline{F}, \overline{F}]$, specified by the pair of functions, is called a "probability box" or imprecise probability (Ferson 1998) for that distribution. This means that, if $[\underline{F}, \overline{F}]$ is a "probability box" for a random variable $X$ whose distribution $F$ is unknown except that it lies within the "probability box", then $\underline{F}(x)$ is a lower bound on $F(x)$, which is the (imprecisely known) probability that the random variable $X$ is smaller than $x$. Likewise, $\overline{F}(x)$ is an upper bound on the same probability. From a lower probability measure $\underline{P}$ for a random variable $X$, one can compute upper and lower bounds on distribution functions using the following [27]:

$$\overline{F}(x) = 1 - \underline{P}(X > x), \qquad \underline{F}(x) = \underline{P}(X \le x).$$

As shown in Figure 17, the left bound $\overline{F}$ is an upper bound on probabilities and a lower bound on quantiles (that is, the $x$-values). The right bound $\underline{F}$ is a lower bound on probabilities and an upper bound on quantiles.
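The relation between the two bounding functions and the quantile bounds can be made concrete with a small sketch. Here the P-box is built, purely for illustration, as the pointwise envelope of two hypothetical normal distributions; this construction, the parameter values, and the function names are assumptions and are not the random set model of the wave height.

```python
import math


def normal_cdf(x, mean, std):
    """Cumulative distribution function of a normal variable."""
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))


def pbox_at(x, cdfs):
    """Return (lower, upper) bounds on P(X <= x) as the envelope of candidate CDFs."""
    values = [cdf(x) for cdf in cdfs]
    return min(values), max(values)


def quantile_bounds(p, grid, cdfs):
    """Bounds on the p-quantile, searched on an ascending grid.

    The upper (left) CDF reaches p first and gives the lower quantile bound;
    the lower (right) CDF reaches p last and gives the upper quantile bound.
    """
    lower_q = upper_q = None
    for x in grid:
        f_low, f_up = pbox_at(x, cdfs)
        if lower_q is None and f_up >= p:
            lower_q = x
        if upper_q is None and f_low >= p:
            upper_q = x
            break
    return lower_q, upper_q


# Two hypothetical candidate distributions bounding the unknown one.
candidates = [
    lambda x: normal_cdf(x, 10.0, 1.0),
    lambda x: normal_cdf(x, 12.0, 1.0),
]
grid = [8.0 + 0.01 * k for k in range(801)]
median_bounds = quantile_bounds(0.5, grid, candidates)
```

In this example the median is only known to lie between roughly 10 and 12, while at any fixed x the probability P(X ≤ x) is bracketed by the two curves.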
Therefore, the results cannot simply be calculated from a single interval analysis. The P-box model for the wave height obtained in Section 4 is in a discrete form. The final results can thus be calculated with the arithmetic of P-box structures discussed in the work of Tucker and Ferson [51]. The interval analysis is carried out to find the failure probability bound $P_f^{i}$ ($i = 1, \ldots, 15$) corresponding to each of the 15 discretized intervals ($i = 1, \ldots, 15$) in the P-box. Equal probabilities are assigned to the results $P_f^{i}$ ($i = 1, \ldots, 15$), without consideration of dependence within the system. The results are then obtained by grouping the 15 response intervals in a stacked P-box (Figure 18).
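The interval propagation can be sketched as below. Everything numerical here is hypothetical: the load is taken as a power law in the wave height with made-up constants, the resistance as a single lognormal variable, and the focal intervals are invented. The sketch relies on the load increasing with wave height, so that the end points of each wave height interval map directly to the bounds of the failure probability.

```python
import math


def std_normal_cdf(z):
    """Standard normal CDF written with erfc to keep accuracy in the lower tail."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def base_shear(h, c1=0.05, c2=1.0, c3=2.0, v=2.0):
    """Assumed load model (MN): increasing in wave height h (m); constants are made up."""
    return c1 * (h + c2 * v) ** c3


def failure_probability(h, median_r=60.0, cov_r=0.1):
    """P(R < S(h)) for a lognormal resistance R with assumed median and CoV."""
    sigma_ln = math.sqrt(math.log(1.0 + cov_r ** 2))
    z = (math.log(base_shear(h)) - math.log(median_r)) / sigma_ln
    return std_normal_cdf(z)


def propagate(focal_intervals):
    """Map each wave height interval to an interval of failure probability."""
    return [(failure_probability(lo), failure_probability(hi))
            for lo, hi in focal_intervals]


def envelope(bounds):
    """Smallest lower bound and largest upper bound over all intervals."""
    return min(lo for lo, _ in bounds), max(hi for _, hi in bounds)


# Hypothetical focal intervals of wave height (m), each with equal mass.
demo_intervals = [(24.0, 27.0), (25.0, 29.0), (26.0, 31.0)]
demo_bounds = propagate(demo_intervals)
demo_envelope = envelope(demo_bounds)
```

Sorting the lower and upper end points of the resulting intervals and giving each an equal probability step produces the stacked P-box for the failure probability.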
The minimum and maximum values of the P-box, $1.12 \times 10^{-11}$ and $2.77 \times 10^{-8}$, respectively, represent an envelope for the failure probability $P_f$. This is essentially a constraint on the failure probability when the wave height probability is specified in an imprecise form. The general aggregation formulation in this case is the envelope of the 15 interval results, that is, the range from their smallest lower bound to their largest upper bound. It should be noted that this aggregation rule may not hold if the mapping function is changed. Alternatively, the failure probability bounds can be computed by an interval Monte Carlo method. This procedure is based on repeated calculations for a series of probability functions within the bounded P-box region. It is quite efficient for a P-box model established from parameter uncertainties, for example, in the mean or variance, but it is not feasible for a P-box model constructed from random sets. However, the Monte Carlo simulation can still be carried out with the aid of parametric models fitted to the bounding functions. Here, three typical probability distribution functions, Gamma, Lognormal, and Weibull, are used. Although the choice of these distributions may be debatable, they are needed to evaluate the performance function by Monte Carlo simulation. The failure probabilities obtained with these parametric models are comparable. They are summarized in Table 4. The results obtained from the Monte Carlo simulations are in good agreement with the results calculated from the interval analysis. A general comparison between the bounds approximated by the two approaches is illustrated in Figure 19. The good agreement further supports the applicability of the proposed P-box approach.
The random set approach proposed in this study provides a general characterization rule for quantifying the statistical uncertainties associated with the POT model. The imprecise bounds can capture the full scope of uncertainty in the extreme value modeling [52]. These results suggest the possibility and advantages of using the random set model for uncertainty information management in offshore engineering.
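The Monte Carlo route with parametric fits to the bounding functions can be sketched as follows. This is a crude sampling illustration under loud assumptions: the two Weibull distributions standing in for the left and right bounds, the load formula, and the resistance parameters are all hypothetical, and they are deliberately chosen to give failure probabilities far larger than those reported in the text so that plain sampling can resolve them.

```python
import math
import random


def mc_failure_probability(sample_height, n, rng, median_r=40.0, cov_r=0.1):
    """Estimate P(g < 0) with g = R - S(H) by crude Monte Carlo sampling.

    sample_height : function drawing one wave height (m) from rng
    median_r, cov_r : assumed lognormal resistance parameters (MN)
    """
    sigma_ln = math.sqrt(math.log(1.0 + cov_r ** 2))
    mu_ln = math.log(median_r)
    failures = 0
    for _ in range(n):
        h = sample_height(rng)
        load = 0.05 * (h + 2.0) ** 2          # assumed load model, increasing in h
        resistance = rng.lognormvariate(mu_ln, sigma_ln)
        if resistance - load < 0.0:
            failures += 1
    return failures / n


def sample_left_bound(rng):
    """Hypothetical Weibull fit to the left bound (smaller wave heights)."""
    return rng.weibullvariate(20.0, 8.0)      # (scale, shape)


def sample_right_bound(rng):
    """Hypothetical Weibull fit to the right bound (larger wave heights)."""
    return rng.weibullvariate(24.0, 8.0)


pf_lower = mc_failure_probability(sample_left_bound, 20000, random.Random(1))
pf_upper = mc_failure_probability(sample_right_bound, 20000, random.Random(2))
```

Sampling only the two bounding distributions brackets the failure probability here because the assumed load increases with wave height; for a nonmonotone performance function, distributions inside the P-box would also have to be examined.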

Conclusion
In this study, a nonconventional random set model that accounts for nonstationarity is established for the prediction of extreme wave height. Random sets are introduced to combine a set of estimates that result from the threshold uncertainty. The applicability of the nonconventional model is investigated in terms of the probability bounds for the return values. It is further examined in an offshore structural reliability analysis carried out with various wave height models. The impact of nonstationarity in the extreme wave height distribution is investigated through the failure probability bounds, which are obtained from the computational procedures in the structural analysis. The key findings are as follows.
(i) The P-box model possesses significant advantages over traditional probability. It provides a convenient and comprehensive way to represent the nonstationary distribution model. The estimated bounds for the return level are found to be sufficiently conservative compared with the estimates from traditional probabilistic models.
(ii) The P-box model is propagated through the structural analysis by an interval analysis and by Monte Carlo simulation. The result obtained in this reliability analysis is also represented in P-box format. Compared with the traditional probabilistic approach, the P-box failure probability gives a more flexible answer, and certain nonstationary effects are revealed by the imprecise bounds. This provides more information for engineers to make decisions, especially for an existing structure that is exposed to a changing environment.
To the best of our knowledge, this study addresses the uncertainty information problem in offshore engineering through random set theory. Further, the findings from the case analysis show its benefits for practical use in reliability engineering. Future research can extend the approach to other reliability and uncertainty analyses beyond offshore engineering applications.

Figure 1: Plot of the September-April Hs time series of 01-02.

Figure 8: Comparison of several time spans in the fitting to stationary and nonstationary Poisson model.

Figure 12: Comparison of jacket structure's ultimate strength with various directional wave loads (direction angle is from -axis to axis).

Figure 14: Residual plots for the adequacy check of the quadratic polynomial function.

Figure 16: Residual plots for the adequacy check of the approximation equation for base shear.

Figure 18: P-box for the response failure probability.

Table 1: Summary of exceedances for various thresholds in POT.

Table 2: Random sets and mass assignment for different thresholds.

Table 4: Failure probability obtained from selected models. The K-S test is conducted between the Monte Carlo simulated results (sample size = 100000) and the interval analysis results. The hypothesis is that they are from different continuous distributions. Significance level is at 5%.