A New Spectral Shape-Based Record Selection Approach Using Np and Genetic Algorithms

With the aim of improving code-based criteria for selecting real records, an approach inspired by a proxy parameter of spectral shape, named Np, is analyzed. The procedure is based on several objectives aimed at minimizing the record-to-record variability of the ground motions selected for seismic structural assessment. To select the best set of ground motion records for use as input to nonlinear dynamic analysis, an optimization approach based on genetic algorithms is applied, focused on finding the set of records most compatible with a target spectrum and target Np values. The results of the new Np-based approach suggest that the real accelerograms obtained with this procedure reduce the scatter of the response spectra compared with the traditional approach; furthermore, the mean spectrum of the set of records is very similar to the target seismic design spectrum in the period range of interest, and, at the same time, Np values similar to those of the target spectrum are obtained for the selected records.


Introduction
The selection of appropriate earthquake ground motion record sets to be used as input for nonlinear dynamic analysis has been thoroughly discussed in the literature, and several methodologies have been proposed to reduce the number of ground motion records needed for a reliable prediction of the seismic response of structures [1][2][3]. Building design codes such as FEMA 450 [4] and Eurocode 8 (EC8) [5] suggest the use of seven ground motion records as input for the seismic performance assessment of structures, so that the average values of the structural responses can be considered. In particular, the seismic design code guidelines rely mainly, among other criteria, on the use of records matching the pseudo-acceleration spectral shape in a range of periods. Because of the known high variability of the nonlinear seismic response of structures, the average spectrum of the set of records must be as close as possible to the seismic design code spectrum in order to estimate the structural response with relatively high confidence given only seven analyses. To this aim, it is important to define ground motion intensity measures representative of the spectral shape, which may help in record selection driven by a target spectrum because of the ability of spectral shape to predict nonlinear structural response [6][7][8][9][10][11][12]. In fact, to improve the efficiency in the prediction of structural response, vector-valued and scalar ground motion intensity measures based on a parameter that characterizes the spectral shape, named Np, have recently been proposed by Bojórquez and Iervolino [10]. This parameter has proved very effective in characterizing the spectral shape, even for different types of earthquake ground motion records, and in predicting the nonlinear structural response in terms of peak and seismic energy demands. Moreover, it can predict structural failure in terms of peak and energy demands with good accuracy [12].
Motivated by the purpose of improving the strategies for code-based record selection inspired by the spectral shape, a procedure based on Np is proposed in this study. To this aim, a total of 1024 earthquake ground motions with different characteristics, taken from the NGA database and recorded on different types of soil, are used. Genetic Algorithms, an optimization approach inspired by the process of natural selection, are used to solve the optimization problem of finding the best set of seven records compatible with the parameter Np and, at the same time, with a target spectrum in a given range of periods. Genetic Algorithms have been used satisfactorily in engineering problems for structural optimization [13][14][15] and in record selection for nonlinear dynamic analysis [1,2,16,17].
The present study shows that it is possible to find a set of records compatible with a target spectrum and, at the same time, with several target Np values that account for higher modes and nonlinear structural response effects. Note that response spectra can be easily predicted with computational tools such as neural networks [18]. The set of records obtained with the new approach may reduce the record-to-record variability of the spectra and, most importantly, may significantly reduce the variability of the structural response with respect to intensity measures currently in wide use. It is important to emphasize that the approach is based chiefly on the proxy of the spectral shape named Np, and that Genetic Algorithms are used only as an optimization tool; several other techniques could be applied, such as neural networks, gradient-based search (which in some cases finds only a local solution of the problem), and many heuristic methods such as the firefly algorithm recently developed by Yang [19], inspired by the behaviour of fireflies in nature.

Np-Based Record Selection Using Genetic Algorithms: Methodology

Record Selection Criterion.
Since the most relevant parameters for predicting nonlinear structural response appear to be those that capture the shape of the elastic acceleration response spectrum in a range of oscillation periods [9][10][11], seismic design codes usually require, for nonlinear dynamic analysis, the selection of a set of seven ground motion records whose average spectrum matches the seismic design code spectrum in the period range between the lower limit T0 and the upper limit TN; the aim is to use the mean seismic response parameters for structural assessment and to reduce the uncertainty in the prediction of nonlinear structural response. Recently, however, the efficiency of the Np parameter in predicting the nonlinear structural response [10] has been observed, compared with the approach in which the records are scaled only to the spectral acceleration at the first mode of vibration, Sa(T1). This implies that records selected to match a similar value of Np can reduce the record-to-record variability in the structural response when the records are scaled to the same spectral acceleration. Note that Np is defined in (1):
$$N_p = \frac{Sa_{avg}(T_1, \ldots, T_N)}{Sa(T_1)}, \tag{1}$$
where Saavg(T1, ..., TN) represents the geometric mean of the spectral accelerations between the periods T1 and TN. The information given by this equation is that, if one or n records have a mean Np value close to one, the average spectrum can be expected to be approximately flat in the period range between T1 and TN. For a mean Np lower than one, an average spectrum with negative slope is expected. As an example, the mean value of Np for a group of 191 ordinary records in the period range from T1 = 0.6 s to TN = 2T1 is 0.39. The average spectrum of this set is illustrated in Figure 1(a). For Np values larger than one, the spectra tend to increase beyond T1. This can be appreciated for a set of 31 narrow-band records, for which the mean value of Np is 1.9 for T1 = 1.2 s and TN = 2T1: the average spectrum shows a zone of increasing accelerations (see Figure 1(b)). A very important issue in calculating Np is the number of periods to be used in the range of interest. This should be addressed through a study to find the number of spectral ordinate points that maximizes the efficiency of the parameter as a predictor of nonlinear structural response. The first author of [10,12] and his students have been working on this issue, and they have found, for moment-resisting steel frames with periods between 0.5 and 2 seconds, that using spectral ordinates at period increments of 0.1 s gives the parameter Np good efficiency in the prediction of seismic response in terms of maximum interstory drifts, hysteretic energy, and damage indexes, which indicates that the number of points depends on the structural vibration period. This consideration was used in the present study.
Finally, the normalization by Sa(T1) makes Np independent of the scaling level of the records based on Sa(T1); more importantly, it improves the knowledge of the path of the spectrum from period T1 to TN, which is related to nonlinear structural response. To illustrate this, Figure 2 shows two response spectra with similar values of Saavg(T1, ..., TN); the spectral shapes of the two records are clearly different, which suggests that different responses will be obtained. This demonstrates the potential of the normalization with respect to Sa(T1).
More details regarding this parameter are given by Bojórquez and Iervolino [10].
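As an illustration of the definition of Np, the following minimal sketch computes the ratio of the geometric mean of the spectral ordinates to Sa(T1) for a tabulated spectrum. The function names, the tolerance, and the two sample spectra are hypothetical and are not taken from the study; the 0.1 s period increment follows the consideration stated above.

```python
import math

def geometric_mean(values):
    """Geometric mean of positive numbers, computed in log space."""
    return math.exp(sum(math.log(v) for v in values) / len(values))

def compute_np(periods, sa, t_start, t_end):
    """Np = Sa_avg(t_start..t_end) / Sa(t_start) for a tabulated spectrum.

    periods, sa: matching lists of periods (s) and spectral accelerations.
    t_start must coincide with one tabulated period.
    """
    tol = 1e-9
    window = [s for t, s in zip(periods, sa) if t_start - tol <= t <= t_end + tol]
    sa_start = next(s for t, s in zip(periods, sa) if abs(t - t_start) < tol)
    return geometric_mean(window) / sa_start

# Hypothetical spectra sampled every 0.1 s from 0.6 s to 1.2 s (T1 = 0.6 s, TN = 2*T1).
periods = [0.6 + 0.1 * i for i in range(7)]
flat = [0.5] * 7                       # constant spectrum
decaying = [0.3 / t for t in periods]  # Sa proportional to 1/T

np_flat = compute_np(periods, flat, 0.6, 1.2)
np_decaying = compute_np(periods, decaying, 0.6, 1.2)
```

Under these assumptions the flat spectrum gives Np equal to one and the decaying spectrum gives a value below one, in line with the interpretation of Np given above.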
The main objective of this study is to find, using Genetic Algorithms, the best combination of seven real ground motion records, namely, the combination that minimizes the following parameters.
(1) The difference between the average Np value of the set of real records and the Np value of a target seismic design spectrum in the range between T1 and TN (where T1 is the fundamental period of the structure under consideration and TN = 2 * T1 is used, in accordance with [10]). Hereinafter, the term Np1 will be used to denote this value of Np. The normalized difference (or error) between the average Np1 and the target Np1 is obtained with the parameter given by (2), where Np1 ave is the average Np1 value of the real records and the target value is the Np1 value of the target design spectrum.
Figure 2: Elastic response spectra for two records scaled to the same geometric mean Saavg(T1, ..., TN), which show different spectral shapes.
(2) The difference between the individual Np value of each record in the set and the Np value of a target seismic design spectrum in the range between T1 and TN. The normalized difference between an individual Np1 and the target Np1 is obtained with the parameter given by (3), where Np1 is the individual value of a real record.
(3) The normalized difference between the average Np value of the set of real records and the Np value of a target seismic design spectrum in the range between T0 and T1 (where T0 is the initial period under consideration, usually equal to 0.2 * T1). In this case, Np is used to incorporate the higher mode effects, and it is denoted Np2 both for a real record and for the target design spectrum. The normalized difference between the average Np2 and the target Np2 is given by (4), where Np2 ave is the average Np2 value of the spectra obtained with the real records and the target value is the Np2 value of the target design spectrum.
(4) The normalized difference between the individual Np2 value of each record in the set and the Np2 value of a target seismic design spectrum in the range between T0 and T1, obtained by (5), where Np2 is the individual value of a real record.
(5) The normalized difference between the average spectrum of the set of real records (Sa ave) and the target seismic design spectrum in the range between T0 and TN (where T0 = 0.2 * T1 and TN = 2 * T1), obtained by (6), where Sa ave(Tj) is the average pseudo-acceleration ordinate at period Tj for the seven real records, SaT(Tj) is the spectral acceleration ordinate of the target spectrum at period Tj, and n is the number of spectral ordinate points in the range of periods.
(6) The normalized difference between the spectrum of an individual real record and the target seismic design spectrum in the range between T0 and TN (where T0 = 0.2 * T1 and TN = 2 * T1), obtained by (7), where Sai(Tj) is the pseudo-acceleration ordinate of the spectrum of real record i at period Tj.
Note that points (5) and (6) are usually chosen as record selection criteria [20].
Finally, it is important to note that although records can be selected to match a target spectrum by means of (6) and (7) in Section 2.1, they do not necessarily have similar values of Np or similar spectral shapes. This is because two records can show a similar difference between their spectra and a target spectrum in a range of periods, as measured by (6) and (7), and yet have different spectral shapes and Np values [10]. For this reason, as previously discussed, the normalization of Np with respect to spectral acceleration is very helpful for describing the path of a spectrum.
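The six error terms listed above can be sketched as follows. The expressions (2) to (7) are not reproduced in this text, so the sketch assumes a relative absolute difference for the Np terms and a root-mean-square relative difference for the spectral terms; it also assumes that the individual terms are reduced to the worst record with max(). All function names and numbers are hypothetical.

```python
import math

def relative_difference(value, target):
    """Assumed normalized difference: |value - target| / target."""
    return abs(value - target) / target

def spectrum_difference(sa, sa_target):
    """Assumed normalized spectral difference: RMS of relative ordinate errors."""
    terms = [((s - t) / t) ** 2 for s, t in zip(sa, sa_target)]
    return math.sqrt(sum(terms) / len(terms))

def mean(values):
    return sum(values) / len(values)

def selection_errors(np1, np2, spectra, np1_target, np2_target, sa_target):
    """Six error terms for one candidate set of records (items 1 to 6).

    np1, np2: per-record Np values in the ranges T1..TN and T0..T1.
    spectra: per-record lists of Sa ordinates on the target's period grid.
    Individual terms are reduced with max(), which is an assumption.
    """
    sa_mean = [mean(col) for col in zip(*spectra)]
    return {
        "np1_ave": relative_difference(mean(np1), np1_target),
        "np1_ind": max(relative_difference(v, np1_target) for v in np1),
        "np2_ave": relative_difference(mean(np2), np2_target),
        "np2_ind": max(relative_difference(v, np2_target) for v in np2),
        "sa_ave": spectrum_difference(sa_mean, sa_target),
        "sa_ind": max(spectrum_difference(s, sa_target) for s in spectra),
    }

# Hypothetical two-record set on a three-point period grid.
errors = selection_errors(
    np1=[0.9, 1.1], np2=[1.2, 1.4],
    spectra=[[0.8, 1.0, 1.2], [1.2, 1.0, 0.8]],
    np1_target=1.0, np2_target=1.3, sa_target=[1.0, 1.0, 1.0],
)
```

In this toy set the average terms are essentially zero while the individual terms are not, which mirrors the remark that matching the average spectrum does not guarantee similar individual spectral shapes.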

The Use of Genetic Algorithms.
Genetic algorithms are heuristic methods used to solve optimization problems, based on Darwin's principles of natural selection [21][22][23]. Their main characteristic is the principle of survival and adaptation. Their advantage is the use of a population of possible solutions instead of a single-point solution. A genetic algorithm begins with the random generation of a population of guesses, or possible solutions, for a given problem, usually as binary encodings. A typical genetic algorithm uses three operators: selection, crossover, and mutation [24]. Selection attempts to apply pressure upon the population in a manner similar to that of natural selection found in biological systems. Poorer performing individuals are weeded out, and better performing, or fitter, individuals have a greater average chance of passing the information they contain on to the next generation. Crossover allows solutions to exchange information in a way similar to that used by a natural organism undergoing sexual reproduction. Finally, mutation is applied randomly and changes (flips) the value of a single bit within individual strings. After the three operators are applied, a new population is obtained.
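The crossover and mutation operators described above can be sketched on binary strings as follows. This is a minimal illustration, not the program used in the study: the function names are hypothetical, and mutation is written here as an independent flip of each bit with a given probability, which is a common variant and an assumption.

```python
import random

def single_point_crossover(parent_a, parent_b, rng):
    """Swap the tails of two equal-length bit strings at a random cut point."""
    cut = rng.randint(1, len(parent_a) - 1)
    return parent_a[:cut] + parent_b[cut:], parent_b[:cut] + parent_a[cut:]

def mutate(bits, p_mutation, rng):
    """Flip each bit independently with probability p_mutation."""
    return "".join(
        ("1" if b == "0" else "0") if rng.random() < p_mutation else b
        for b in bits
    )

rng = random.Random(0)  # fixed seed so the sketch is repeatable
child_a, child_b = single_point_crossover("0000000000", "1111111111", rng)
mutated = mutate(child_a, 0.025, rng)
```

Each child keeps the head of one parent and the tail of the other, so the two children together contain exactly the bits of the two parents.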
Genetic Algorithms are used to select the best suite of ground motion records. For this aim, 1024 records taken from the NGA database are considered in the analysis, each represented by a binary codification of 10 bits. The response spectra of the selected ground motion records, in terms of spectral acceleration Sa as a percentage of gravity for 5% of critical damping, and the distribution of the records in terms of moment magnitude and epicentral distance are provided in Figures 3 and 4, respectively. In Figure 3, a large record-to-record variability in the earthquake response spectra is observed. Hence the importance of record selection strategies for choosing the best ground motion set to be used as input for nonlinear dynamic analysis. Moreover, Figure 4 illustrates that the selected records were obtained at different epicentral distances and from different events with magnitudes Mw between 4 and 7.5.
The operators and the procedure of the Genetic Algorithms employed here for record selection are as follows.
(1) Initial Population. The first generation, or initial population, is randomly defined. Each individual in the population is a combination of 7 different records, where each record is represented by a binary codification of 10 bits. For example, the binary number 0000000000 represents the first ground motion record of the database. A constant number of individuals (200 in total) is used in all the generations. Typical values for the number of individuals lie in the interval from 20 to 1000 [24], depending on the problem under consideration.
(2) Response Spectrum Parameters. After the population is created, the seismic response spectra and the parameters Np1 and Np2 are obtained for each selected record.
(3) Objective Function. The objective of the genetic algorithm is the minimization of the square root of the sum of the squared errors given by (2) to (7).
(4) Selection. This operator is based on elitism, which is represented by the individual with the smallest difference, given by the square root of the sum of the squared errors obtained from (2) to (7).
(5) Crossover. A single-point crossover was considered. The crossover was performed between the specific records of an individual. Typical values of the probability of crossover (Pc) are around 0.4 to 0.9 [24]. In the present work, Pc equal to 0.65 was considered.
(6) Mutation. Mutation is used to guarantee the diversity of the set of records obtained in each generation. The process is applied in all the generations, and it consists of changing a specific bit by inverting its value. For example, a bit equal to 1 can be changed to a bit equal to 0 with a given probability of mutation. In the present study, a probability of mutation equal to 0.025 was selected.
(7) New Generation. Once the evolution procedure is complete, a new generation is obtained and the process returns to step (2); this is repeated for a specified number of generations (for this study, 300 generations were considered).
In summary, the relevant values considered in the genetic procedure are (a) population of individuals: 200; (b) number of generations: 300; (c) probability of crossover: 0.65; and (d) probability of mutation: 0.025. These values have been successfully used in other studies for ground motion record selection [16].
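The seven steps above can be sketched as a compact loop. The population size, number of generations, and probabilities of crossover and mutation are the values stated in the text; everything else is an assumption made for illustration: the function names, the truncation selection of parents from the better half, the rejection of sets with repeated records, the production of a single child per crossover, and the toy cost function that stands in for the root-sum-square of the errors (2) to (7).

```python
import random

N_RECORDS, BITS, SET_SIZE = 1024, 10, 7

def decode(individual):
    """Split a 70-bit string into seven record indices (0 = first record)."""
    return [int(individual[i:i + BITS], 2) for i in range(0, SET_SIZE * BITS, BITS)]

def random_individual(rng):
    """Encode seven distinct, randomly chosen records as one bit string."""
    return "".join(format(i, f"0{BITS}b") for i in rng.sample(range(N_RECORDS), SET_SIZE))

def crossover(a, b, p_cross, rng):
    """Single-point crossover applied record by record with probability p_cross."""
    child = []
    for i in range(0, SET_SIZE * BITS, BITS):
        ga, gb = a[i:i + BITS], b[i:i + BITS]
        if rng.random() < p_cross:
            cut = rng.randint(1, BITS - 1)
            ga = ga[:cut] + gb[cut:]
        child.append(ga)
    return "".join(child)

def mutate(individual, p_mut, rng):
    """Flip each bit independently with probability p_mut."""
    return "".join(("1" if b == "0" else "0") if rng.random() < p_mut else b for b in individual)

def run_ga(cost, pop_size=200, generations=300, p_cross=0.65, p_mut=0.025, seed=0):
    """Minimize cost(list_of_record_indices); returns (best_indices, best_cost)."""
    rng = random.Random(seed)

    def fitness(ind):
        idx = decode(ind)
        # Sets with repeated records are rejected (assumed handling).
        return cost(idx) if len(set(idx)) == SET_SIZE else float("inf")

    population = [random_individual(rng) for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(population, key=fitness)
        parents = ranked[: pop_size // 2]        # truncation selection (assumed)
        population = [ranked[0]]                 # elitism: best set survives unchanged
        while len(population) < pop_size:
            a, b = rng.sample(parents, 2)
            population.append(mutate(crossover(a, b, p_cross, rng), p_mut, rng))
    best = min(population, key=fitness)
    return decode(best), fitness(best)

# Toy stand-in for the selection error: prefer records whose indices are near 500.
toy_cost = lambda idx: sum(abs(i - 500) for i in idx) / 500.0
best_set, best_cost = run_ga(toy_cost, pop_size=40, generations=30)
```

Because the best individual of each generation is carried over unchanged, the best cost cannot increase from one generation to the next; in an actual application the toy cost would be replaced by a function that evaluates the spectra and Np values of the decoded records.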

Np-Based Record Selection Using Genetic Algorithms: A Numerical Example
As a numerical example, the selection of a set of seven ground motion records is considered to match the seismic design spectrum of ASCE 7-05 [25] for site class B, for a framed building with a first-mode vibration period (T1) of 0.6 s. Two approaches were used for the selection of the set of records. The first one was based exclusively on the selection criteria given by (6) and (7) in Section 2.1. This approach is commonly used for ground motion selection (here it will be named traditional). The second procedure considers all the equations given in Section 2.1 and therefore takes into account the values of Np1 and Np2 (Np-based approach). In this example, the target values of Np are Np1 = 0.9611 and Np2 = 1.3461, obtained from the seismic design spectrum of ASCE 7-05.
Figure 5: Set of seven earthquake response spectra obtained with the traditional approach and comparison between the average and the target spectra.

Results for the First Approach (Traditional).
The results for the first approach (traditional procedure) are presented first. Figure 5 compares the set of seven records obtained with the methodology under consideration. In particular, it can be observed that the mean spectrum of the set of records is very similar to the target seismic design spectrum in the range of periods under consideration. This suggests the effectiveness of Genetic Algorithms as a tool for record selection.
The computer program requires only a few seconds to complete the evolutionary procedure. The set of ground motion records obtained and the computed errors are given in Table 1; the largest value of the Sa error obtained was 0.4257, which is an acceptable error value according to other studies [20]. Table 1 also lists the Np errors, which will be discussed in the next section. Table 2 compares the Np values obtained with the target Np values. A large difference is observed between them, which reflects the insufficiency of the traditional approach in achieving compatibility of Np values. Finally, the evolution in each generation of the average error for Sa ave and of the total error, obtained as the square root of the sum of the squares of (6) and (7), is shown in Figure 6.
It is observed that the error is reduced in each generation, and that the procedure requires fewer than 50 generations for the minimization.

Results for the Np-Based Approach.
The results for the proposed Np-based approach are presented in this section. Figure 7 compares the set of seven records obtained with the Np procedure. The results suggest that the records are less scattered than those obtained with the traditional approach, and that the mean spectrum of the set of records is very similar to the target seismic design spectrum in the period range of interest. The set of ground motion records obtained and the computed errors are given in Table 3. The largest value of the Sa error obtained was 0.3603, which is smaller than that corresponding to the traditional approach (0.4257). It is therefore possible to conclude that records can be selected by incorporating several objective parameters in the optimization procedure while still obtaining satisfactory values of the error. Table 3 also shows very small values of the Np errors, which means that most of the records in the selected set have Np values similar to the target Np (see Table 4). The evolution in each generation of the average error for Sa ave and of the total error, obtained as the square root of the sum of the squares of (2) to (7), is illustrated in Figure 8. It is also confirmed that the error is reduced in each generation, and that the procedure requires fewer than 50 generations for the minimization of the error. Note that the Sa error is very similar to the value obtained using the traditional approach, even though in this case more parameters were incorporated into the optimization procedure.
Figure 7: Set of seven earthquake response spectra obtained with the Np-based approach and comparison between the average and the target spectra.
Moreover, only a few seconds of computational time were required for the analyses. Finally, it is important to note that the commonly used uniform hazard spectrum is an envelope of the spectral accelerations at all periods that are exceeded with a specified rate, as computed using probabilistic seismic hazard analysis. Probabilistic seismic hazard analysis already accounts for the variability in spectral accelerations at each period considered, and the construction of the uniform hazard spectrum is a conservative method of combining these spectral values [26].

Analysis of a Moment-Resisting Steel Frame
To provide more information about a possible acceptance criterion for the Sa error, a moment-resisting steel frame with two bays of 8 m, two stories of 3.5 m, and a structural period of 0.6 s was subjected to the set of ground motion records obtained with the Np-based record selection approach. The frame was analyzed with the nonlinear dynamic analysis computer program RUAUMOKO [27], assuming columns fixed at the base and a bilinear hysteretic model with 3% postyield stiffness. The variability of the structural response of the steel frame under these records produces a standard error close to 10%, which is adequate (see Table 5). Note that the standard error associated with a sample of size n can be expressed as in (8) [28]:
$$SE = \frac{\sigma_{\ln(MIDR)}}{\sqrt{n}}, \tag{8}$$
where σ ln(MIDR) is the standard deviation of the natural logarithms of the maximum interstory drift MIDR. Moreover, this standard error is obtained with values of the Sa error smaller than 0.36 (see Table 3 for the Np-based approach), which suggests that an acceptable value of the Sa error for practical earthquake engineering applications could be 0.4. More studies are necessary to define this target acceptable error.
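The standard error of (8) can be evaluated with a few lines of code. In this minimal sketch the function name and the drift values are hypothetical (they are not the values of Table 5), and the standard deviation is computed with the sample (n - 1) denominator, which is an assumption.

```python
import math

def standard_error(values):
    """Standard error of ln(values): sample std of the logs divided by sqrt(n)."""
    logs = [math.log(v) for v in values]
    n = len(logs)
    mean = sum(logs) / n
    sigma = math.sqrt(sum((x - mean) ** 2 for x in logs) / (n - 1))
    return sigma / math.sqrt(n)

# Hypothetical maximum interstory drifts for seven records (not the values of Table 5).
midr = [0.010, 0.012, 0.009, 0.015, 0.011, 0.013, 0.008]
se = standard_error(midr)
```

For a set of seven records, a standard error of about 0.10 corresponds to a standard deviation of ln(MIDR) of about 0.10 times the square root of 7, that is, roughly 0.26.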

Conclusions
Owing to recent advances in computer technology, the use of nonlinear time history dynamic analysis for earthquake-resistant design is becoming more popular. However, one of the main challenges in this type of analysis is the selection of appropriate earthquake ground motion record sets to be used as input. Building design codes such as FEMA 450 and Eurocode 8 suggest the use of seven ground motion records as input for the seismic performance assessment of structures, so that the average values of the structural responses can be considered. In particular, the seismic design code guidelines rely mainly, among other criteria, on the use of records matching the pseudo-acceleration spectral shape in a range of periods. In this paper, a new approach for the selection of real earthquake ground motion records, based on a proxy parameter for the spectral shape named Np, was proposed. The approach was compared with the traditional code-based procedure of record selection for nonlinear dynamic analysis. For both approaches, a genetic algorithm was used in the optimization problem to minimize the main parameters of the methodologies analyzed, and a computer program was developed for the analyses. The genetic algorithm is observed to be very efficient, since it requires only a few seconds to find the best set of records. Further, the convergence of the algorithm requires fewer than 50 generations for both approaches under consideration. The results of the comparison suggest that records selected with the traditional approach, which aims to match the average spectrum of the set of earthquake ground motions with a target spectrum, do not necessarily result in Np values for the spectra of the selected records that are similar to that of the target seismic design spectrum. This is very important because Np is a parameter directly related to the spectral shape and, in particular, to the nonlinear structural response, as recent studies suggest. On the other hand, the results of the proposed Np-based approach suggest that the records are less scattered than those obtained with the traditional approach, and that the mean spectrum of the set of records is very similar to the target seismic design spectrum in the period range of interest. In this case, the maximum Sa error obtained for the set of ground motion records was smaller than that of the traditional approach. Finally, the use of the new approach provides records with similar spectral shapes and values of Np, which is crucial for reducing the uncertainty in the prediction of the nonlinear structural response of buildings.

Figure 1: Mean elastic response spectra for a set of (a) ordinary records with Np = 0.39 and (b) narrow-band records with Np = 1.9.

Figure 3: Elastic response spectra for the 1024 records under consideration.

Figure 6: Evolution of the errors for Sa ave and Sa in each generation (traditional approach).

Figure 8: Evolution of the errors for Sa ave and Sa in each generation (Np-based approach).

Table 1: Set of records and errors obtained with genetic algorithms using the traditional approach.

Table 2: Set of records and Np values obtained with genetic algorithms using the traditional approach.

Table 3: Set of records and errors obtained with genetic algorithms using the Np-based approach.

Table 4: Set of records and Np values obtained with genetic algorithms using the Np-based approach.

Table 5: Analysis results of the steel frame with a period of 0.6 s.