An Interpretable Method for Operational Modal Analysis in Time-Frequency Representation and Its Applications to Railway Sleepers

,


Introduction
Modal analysis is widely used in structural dynamics and structural health monitoring (SHM). In many cases of experimental modal analysis (EMA), it is difcult or expensive to manually excite a structure with a hammer or shaker [1] or to analyze changes in modal properties under varying loading conditions. Operational modal analysis (OMA) enables modal characteristics to be identifed solely based on the response of a structure under operational loads and conditions. It does not require manual generation and sensing of excitations and is becoming increasingly popular in SHM.
OMA methods are generally classifed into time-domain methods and frequency-domain methods. Time-domain methods are based on the analysis of time histories or correlation functions, e.g., natural excitation techniques, autoregressive moving average, stochastic subspace identifcation, blind source separation, and the Bayesian timedomain approach [2][3][4]. Tey are usually computationally demanding and require proper selection of model order and the exclusion of spurious modes due to numerical computation [1,2]. Frequency-domain methods are based on the Fourier spectrum or power spectral density (PSD), which are naturally more interpretable. Te most basic frequencydomain method is the peak-picking method [1,2], which considers one mode at a time. Least square frequency methods [5,6] identify multiple modes together by iteratively estimating a parameterized spectrum. Furthermore, Bayes' theorem is incorporated to infer probability distributions of modal parameters [7], such as the spectral density approach [8], Fourier transform approach [9], Markov chain Monte Carlo approach [10], and expectation-maximization approach [11]. Tese Bayesian methods provide a rigorous formulation that makes full use of measurement data, but they face many challenges in solving ill-conditioned problems and estimating closely spaced modes [12].
Frequency-domain decomposition (FDD) is an extension of the peak-picking method, which can identify closely spaced modes and does not require numerical iterations [2]. Te method was frst used for modal analysis in [13] and then systematized to identify natural frequencies and mode shapes under broadband excitations in [14]. Since then, it has been applied to the SHM of many engineering structures [15][16][17][18][19]. Meanwhile, many variants of the FDD method have been proposed in the literature. Te estimation of damping ratios is achieved by converting the PSD back to the time domain (known as enhanced FDD) [20,21] or by ftting the PSD in the narrow frequency band of a mode [22,23]. Te FDD method is further adapted for nonstationary responses or heavily damped structures by jointly using two PSD estimates and detrending the correlation function [24,25]. Moreover, model errors and measurement noise are incorporated into the analysis by estimating the PSD matrix via maximum likelihood [26].
Te FDD-based methods can produce a spectrum describing the dominance of modes in frequency but cannot directly capture the change of modal characteristics over time. To address this issue, a time-frequency representation is needed. Time-frequency methods based on blind source separation were developed [27,28], but they may produce spurious modes, and the numerical accuracy is sensitive to the number of sensors. As a popular time-frequency analysis method, continuous wavelet transform (CWT) was combined with ridge extraction in [29][30][31][32], but the performance of ridge extraction is sensitive to noise. Furthermore, CWT was combined with singular value decomposition, but proper selection of the mother wavelet and its parameters can be tricky because they are not directly related to structural dynamics, and a nonstationary signal over a long time horizon still needs to be split into segments [33,34]. Tus, spurious modes and implicit parameter design reduce the physical interpretability of OMA. To the best of our knowledge, no existing method can produce a time-frequency representation indicating both the dominance of structural modes and the correlation of their mode shapes.
Tis paper develops a new OMA method suitable for strongly nonstationary responses by extending the FDD method to a time-frequency representation. A study of railway sleepers under train passage is used to showcase the proposed method. EMA, especially hammer tests, has been widely applied in the modal analysis and SHM of railway tracks [35][36][37], whereas the application of OMA is rare [38]. Furthermore, the characteristics of the train-induced load on a sleeper vary considerably as the train approaches, passes, and leaves, which further afect the stifness and damping of track components [39][40][41]. As a result, the sleeper vibration in response to train passage is signifcantly nonstationary. In addition, the damping efect from rail pads and ballast makes modal identifcation more challenging.
Te main contributions of this paper are summarized as follows. First, the proposed method incorporates the interpretability of FDD into time-frequency representation. It can produce spectrograms indicating the dominance of structural modes and also the local similarity of their mode shapes. All the parameters explicitly afect the timefrequency representation and can be selected based on the physical information from the user's prior knowledge. Second, the proposed method provides not only a global view of the modal characteristics over time and frequency but also estimates of the modal parameters. It is applicable to strongly nonstationary responses under time-varying loads and conditions and is also robust to the length of signals. Tird, the proposed method is demonstrated to identify the rigid-body motions and bending modes of railway sleepers in diferent train passage phases and at diferent speeds. Tis paper also showcases its capability to distinguish closely spaced modes using numerical simulation.
Te remainder of this paper is organized as follows. In Section 2, the fundamentals of FDD are briefy introduced. In Section 3, the new OMA method and the corresponding parameter selection strategy are proposed. In Section 4, the proposed method is validated through theoretical analysis and laboratory experiments. In Section 5, the proposed method is applied to feld tests and compared with EMA. Finally, some discussions are presented in Section 6, and the conclusions are summarized in Section 7.

Fundamentals of FDD
Structural responses are usually measured by accelerometers at a sampling frequency f s . Estimating the PSD matrix of the response is the frst step in FDD, for which Welch modifed periodogram method [42] is widely used owing to its computational efciency [2]. First, the measured response is divided into n c overlapped (overlap ratio α b ) segments of equal length n b , and a window function tapers each segment to reduce the leakage efect. Te recommended overlap ratio 2 Structural Control and Health Monitoring α b and window function are 0.5 and Hanning window, respectively [42]. Ten, the modifed periodogram I k (f n ) is calculated for each windowed segment based on the fast Fourier transform (FFT), where f n is the n-th discrete frequency as follows [42]: (1) Next, the PSD matrix at each frequency f n is estimated by averaging the periodograms over all n c segments as follows [42]: Once S yy (f n ) is obtained, singular value decomposition (SVD) is applied as follows [2]: where V(f n ) is a diagonal matrix with singular values arranged in descending order and U(f n ) is a unitary matrix containing the singular vectors corresponding to the singular values. At each frequency f n , a singular value in V(f n ) indicates the contribution of the corresponding singular vector in U(f n ), just as a modal response indicates the contribution of the corresponding mode shape based on the modal expansion of the structural response [2]. Mathematically, the number of segments n c in PSD estimation determines the number of nonzero singular values in V(f n ) at each frequency [14], and multiple nonzero singular values allow for the identifcation of closely spaced modes. All nonzero singular values can be plotted in a spectrum, where a resonance peak indicates the possible existence of a mode at the corresponding frequency. To further confrm such a mode, the singular vector of the peak is usually compared with other singular vectors at its adjacent frequencies. A popular scheme to quantify the similarity between two vectors is the modal assurance criterion (MAC) [2], denoted as MAC(p, q), which is equal to 0 (or 1) when the two vectors, p and q, are orthogonal (or proportional). If the singular vectors in the vicinity of a peak are of high similarity (MAC greater than a threshold), they are identifed as belonging to the same dominant mode [2,24].
Once a mode is confrmed, its modal parameters can be estimated following the strategy of enhanced FDD [20,21]. First, an auto PSD function is created using the identifed singular values at the corresponding frequencies, representing an equivalent single degree of freedom system. Ten, an inverse FFT is applied to the auto PSD to obtain an autocorrelation function in the time domain. Te zero crossings of the autocorrelation function can give an estimate of the damped natural frequency, while the extremes are used to estimate the logarithmic decrement δ through linear regression. Further, the damping ratio is calculated as follows [20,21]: Finally, a real-valued mode shape vector can be obtained from each of the identifed singular vectors. A simple approach [2] is to normalize the complex singular vector by the maximum absolute value of its components and then rotate each component to 0°(or 180°) if its phase lies in the frst or fourth (or the second or third) quadrant. A mode shape vector can be displayed with respect to sensor positions in a static plot. It is noteworthy that the problem of spatial aliasing can occur when the number of sensors is insufcient. In this case, the identifed mode shape should be interpreted carefully.
In general, the validity of FDD is based on the assumptions of white noise excitations, low structural damping ratios, and orthogonal mode shapes for closely spaced modes [24]. If these assumptions are not fully satisfed or if measurement noise is present, the identifcation result is an approximation to real modal characteristics [24,43].

Time-Frequency Representation of OMA (TFOMA)
Tis paper develops a new OMA method, named, TFOMA, by extending the FDD method to a time-frequency representation. Figure 1 shows its framework. First, short-time FDD and a convolution-based strategy are proposed to obtain singular values and local mode shape similarity, respectively. Ten, they are fused into mode indicators by a fuzzy-based strategy, and modal parameters are further estimated. Sections 3.1-3.3 will introduce these main steps, and Section 3.4 will discuss the parameter selection strategy.

Short-Time FDD (STFDD).
In FDD, the Fourier transform is used to average the frequency information over the entire signal time, which is theoretically applicable to stationary processes. Te short-time Fourier transform [44] is a modifed version of the Fourier transform for strongly nonstationary signals. We apply a similar strategy to FDD as follows and name it short-time FDD (STFDD): where y(s) is the vector of synchronized measurement from multiple sensors at sampling time s and r(s − t) represents a rectangular window centered at t. As illustrated in Figure 2, y(s) is broken into segments of equal length with an overlap ratio α s , and the procedures of PSD estimation and SVD are applied to each segment, producing a series of singular value matrices V(t, f ) and singular vector matrices U(t, f). Te k-th diagonal term in V(t, f ) is denoted as v k (t, f), and the k-th singular vector in U(t, f ) is denoted as u k (t, f ). At the k-th level, a singular value spectrogram can be obtained by plotting v k (t, f) over time and frequency with color mapping. According to Section 2, the number of efective spectrograms is equal to the number of nonzero singular values, which is further equal to n c used Structural Control and Health Monitoring 3 in PSD estimation. It is noteworthy that STFDD also holds the drawbacks of STFT, the most signifcant of which is the trade-of between time and frequency resolutions (discussed in Section 3.4).

Local Mode Shape Similarity (LMSS).
A peak in a singular value spectrogram indicates the possible existence of a structural mode. According to MAC, a mode is efectively dominant when the singular vector of the peak has a high similarity to the singular vectors in its vicinity. To enable comparisons in both time and frequency dimensions, we propose a convolution-based strategy to quantify the local similarity of singular vectors. In a two-dimensional representation, e.g., an image, convolution works by applying a kernel to each location and evaluating the central element based on all elements in the kernel [45,46]. In the timefrequency representation, we adapt it to compute the following scalar, named, local mode shape similarity (LMSS): where a and b are the half kernel sizes (number of elements) in time and frequency, respectively, Δt and Δf are the time and frequency resolutions, respectively, and ω(dt, df ) represents the weight assigned for each element in the kernel.
LMSS is a weighted sum of MAC values between the central element and all other elements in a kernel. In this paper, a separable kernel with Gaussian functions [46,47] is used, and the weights are determined as follows: , otherwise, where the weight of the central element is zero and ω 0 (dt, df ) is the unnormalized weight calculated based on the following Gaussian functions:

Convolutionbased strategy
Fuzzy-based strategy where dv (t, f ) (dt, df ) denotes the logarithmic diference in singular values with respect to the central element, as calculated as follows:

Modal parameters
Furthermore, σ t , σ f , and σ v are standard deviations characterizing the decay rates of the weight as dt, df, and dv increase, respectively. We recommend determining them as follows: Equations (8)-(10) refect that, from the kernel center to the kernel boundary, the weight decreases from one to zero as dt or df increases, according to the three-sigma rule. Tis property allows comparisons to be made in a localized manner with smooth transitions at kernel boundaries. Additionally, the weight is lower when an element's singular value deviates more from the center's. Tis property enhances the adaptability of LMSS to structural modes with diferent bandwidths since the weight decays faster in the case of a sharper resonance peak and vice versa. Meanwhile, it allows LMSS to better capture the shift in resonance frequency over time. Figure 3 illustrates the calculation process of LMSS at the k-th level. An LMSS spectrogram can be obtained by plotting l k (t, f ) over time and frequency with color mapping. Te value of each point indicates the similarity of mode shapes between that point and its vicinity, and a peak region indicates a high local similarity at that time and frequency, which can help to confrm the dominance of a mode.

Mode Indicator (MI).
After obtaining the singular value spectrogram and the LMSS spectrogram, structural modes can be identifed from regions with both large singular values Structural Control and Health Monitoring and large LMSS. In this paper, a fuzzy-based fusion strategy is proposed to fuse v k (t, f) and l k (t, f) at each time and frequency into a mode indicator (MI), denoted as MI k (t, f ). Fuzzy set theory quantifes the membership of an element in a set through a membership function [48], which is usually used to handle vague information, e.g., fusing multiple images [49][50][51]. It is suitable for computing MI because there is no precise relationship to determine the existence of modes based on v k (t, f) and l k (t, f) but rather a soft and fexible thresholding strategy according to MAC. First, all v k (t, f ) and l k (t, f ) are normalized as follows: Ten, we use the following membership function to compute the MI: where λ v and λ l are the contribution exponents of v' k (t, f) and l' k (t, f), respectively. Figure 4 illustrates the calculation process of MI, in which the membership function with λ v � 1 and λ l � 3 is shown as an example. It can be seen that f) owing to the selection of λ v and λ l . As a consequence, when l′ k (t, f ) is less than 0.45, MI k (t, f ) is lower than 0.1 even if v k ′ (t, f ) is large. Te selection of λ v and λ l enables the proposed fuzzy-based strategy to mimic a thresholding strategy of MAC and provide a soft and interpretable fusion between v k (t, f ) and l k (t, f ). Figure 4, the computed MIs at the k-th level MI k (t, f ) are plotted as an MI spectrogram with peak regions indicating the presence of structural modes. Ten, a frequency band that peaks continuously over time at physically meaningful frequencies can be selected for each mode. To further estimate the modal parameters of a mode, the singular values in its frequency band with MIs greater than a threshold are selected to create auto PSD functions, which can then be used to estimate the natural frequency and damping ratio at each time instant (see Section 2). Meanwhile, the singular vectors corresponding to the selected singular values can be converted into mode shape vectors (see Section 2). Terefore, the proposed method provides not only a global view of the modal characteristics over time and frequency but also estimates of the modal parameters.

Parameter Selection Strategy.
Te time-frequency representation of the proposed method depends on the selection of its parameters. Sufcient time and frequency resolutions are necessary for clear visualization of modal characteristics and accurate estimation of the modal parameters. Te frequency resolution Δf and the time resolution Δt are determined as follows: We propose the following strategy to select all the parameters of the TFOMA method.
Step 1. Select the number of segments n c in PSD estimation. As mentioned in Sections 2 and 3.1, n c determines the number of nonzero singular values. For structures with separated modes, n c can be set to 1, whereas in cases of closely spaced modes, n c should be greater than the maximum number of physical modes in each identifed frequency band.
Time t Compute weights Kernel * Figure 3: Local mode shape similarity and its spectrogram.
Step 2. Select the segment length n b in PSD estimation and the overlap ratio α s in STFDD. According to (13), they directly afect Δf and Δt: larger n b leads to smaller Δf but larger Δt, and larger α s leads to smaller Δt but higher computational costs. We recommend frst selecting n b to provide sufcient frequency resolution, e.g., at least fve discrete frequencies in the frequency band of a mode. Ten, α s can be selected to provide sufcient time resolution, e.g., Δt to be shorter than the nonstationary behavior of the signal.
Step 3. Select the half kernel sizes a and b. Under defned Δt and Δf, the kernel lengths in time and frequency are (2a + 1) · Δt and (2b + 1) · Δf, respectively. For comparisons in a localized manner, we recommend setting a and b as small integers, such as 3∼10, to ensure that (2a + 1) · Δt and (2b + 1) · Δf are shorter than the duration and bandwidth of each mode, respectively.
Step 4. Select the contribution exponents λ v and λ l . As exemplifed in Section 3.3, we recommend setting λ l > λ v � 1 to mimic a thresholding strategy of MAC. Te larger λ l is, the greater the infuence of LMSS on MI, i.e., a higher LMSS is required to reach a certain level of MI. All parameters in the TFOMA method explicitly afect the time-frequency representation. Tey can be selected and tuned according to the user's prior knowledge of the structural dynamics and goals of analyses.

TFOMA of an In Situ
Sleeper. We validate the proposed TFOMA method on the V-Track test rig at TU Delft [52], which resembles train-track interaction. As shown in Figure 5, the track consists of rails, sleepers, foundations, and fastening systems. A beam is driven by a motor to rotate around the central axis of the test rig, and the wheel at the end of the beam rolls along the track. Te wheel is vertically loaded by a suspension system, and the static wheel load is 1,300 N in our measurement. Tere are four rail joints in the test rig. When the wheel passes over these joints, signifcant impacts occur.
We instrument one sleeper with eight accelerometers (PCB 356B21) on its top surface. Te vertical accelerations are measured at the sampling frequency of f s � 102,400 Hz. Te running speed of the wheel is 8 km/h. Figure 6(a) plots the measured sleeper accelerations with four phases distinguished, which shows signifcant nonstationarity. Phase A is caused by the passage over a joint that is several sleepers away from the instrumented sleeper. It is similar to the response of a hammer test. Phases B∼D belong to the response caused by the wheel passage, divided into prepassage, underpassage, and postpassage phases.
We apply the TFOMA method to the measured data with the parameters listed in Table 1. Te spectrograms of singular value, LMSS, and MI are shown in Figure 6 at frequencies up to 6,500 Hz. As shown in Figures 6(c) and 6(d), most of the large singular values are located below 2,000 Hz, while large LMSS is present throughout the frequency range. By fusing the singular values with the LMSS, we obtain the MI spectrogram in Figure 6(b), which shows peak regions with sharper edges than those in Figures 6(c) and 6(d), making them easier to be picked up.
In the MI spectrogram, the impact response in Phase A produces a vertical ridge, along which the MI peaks at some frequencies. In Phases B and D, a number of peak bands can be observed, which continuously dominate at frequencies close to the peaks in Phase A. Te peaks in Phase C are less clear as they belong to the forced response phase. We select ten peak bands in Phases A, B, and D, as labeled in Figure 6(b), where the label height represents the bandwidth. At low (or high) frequencies, the density of peak bands is high (or low), and their bandwidths are narrow (or wide). In each selected band, we use points with MI greater than 0.

EMA and Teoretical Analysis of a Free Sleeper.
To verify the above identifcation results, we perform hammer tests on a free sleeper of the same type on an elastic foundation. Since the sleeper is free of rail fasteners, it is instrumented with more distributed accelerometers (PCB 356B21), as shown in Figure 8(a). We use a small hammer (Brüel & Kjaer 8206-003) to generate the impact at each of the four locations.
To reduce the efect of leakage and noise, the measured response from each sensor is tapered by an exponential window, and the measured force is tapered by the same exponential window and also a force window [53]. Ten, for the i-th sensor (i � 1, . . ., 9) in response to the impact at the jth location (j � 1, . . ., 4), we compute the cross-spectrum between the acceleration and the force S a i p j (f) and the autospectrum of the force S p j p j (f) through the Fourier transform. Furthermore, the frequency response function (FRF), more specifcally the receptance function, is calculated as follows [54]: An FRF is a complex function of frequency that describes the response of a structure at the sensor position to excitation at the impact location. A resonance peak indicates the presence of a structural mode at the corresponding frequency. Te mode shape vector can be obtained by combining the imaginary parts of the FRFs from diferent sensors as follows [54]: At each impact location, we repeat the test three times and average the FRFs as the fnal result. For example, Figure 8(b) plots the magnitude of the FRFs for all sensors in response to Impact 2. Four resonance peaks are identifed from all FRFs at diferent impact locations, labeled as P0∼P3, and their mode shapes and average frequencies are shown in Figure 8(c).
Meanwhile, we calculate the theoretical mode shapes by simplifying the sleeper as a free-free beam. Based on its boundary conditions, the n-th order mode shape is given as follows [55]: w n (x) � sinh k n x + sin k n x + sin k n L − sinh k n L cosh k n L − cos k n L cosh k n x + cos k n x , where L is the beam length, x is the coordinate along the beam (0 ≤ x ≤ L), sinh and cosh are hyperbolic functions, and k n is the n-th (numerical) solution of the following equation of k: cosh(kL)cos(kL) � 1.
Te mode shapes of a free-free beam with L � 25 cm are computed and plotted in Figure 8(c). Clearly, the mode shapes of P1∼P3 are in good agreement with the theoretical mode shapes of the frst three bending modes, respectively, and P0 is the rigid-body motion of the sleeper.    Figure 9(a). We repeat the test fve times at each location, which is more than that of the free sleeper due to lower repeatability. Te average FRFs for the frst two impact locations are shown in Figure 9(b) as examples. Seven resonance peaks are identifed, labeled as Q1∼Q7. Te corresponding natural frequencies and mode shapes are shown in Figure 9(c).
Compared with the free sleeper, the in situ sleeper shows more resonance peaks below 2,000 Hz, and their mode shapes deviate for diferent impact locations. Q1∼Q4 correspond to rigid-body motions but are not comparable to P0 due to diferent boundary conditions. Q5∼Q7 correspond to P1∼P3 (the frst three bending modes), respectively. Te results of Q5 and P1 show signifcant deviations. Te frequencies of Q6 and P2 are consistent, while those of Q7 and P3 deviate slightly. Besides, the peaks of the in situ sleeper are smoother due to the damping efect. Te above fndings refect the diferences in modal characteristics due to different boundary conditions and also the infuence of other track components. In summary, TFOMA provides comparable identifcation results to EMA. Te diferences in the identifed modal parameters refect the infuence of a moving train load on track dynamics. Among the three phases in TFOMA, the impact response and the prepassage phase outperform the postpassage phase in terms of mode shape consistency with EMA.

TFOMA of an In Situ
Sleeper. We test the proposed method using sleeper vibrations measured at the Faurei Railway Test Center in Romania. Te track consists of UIC60 E1 rails, Vossloh W14 fastening systems, and B70-W60 prestressed concrete sleepers. As shown in Figure 10, four accelerometers (Brüel & Kjaer 4514-004) are mounted on a sleeper. A train passes over the instrumented sleeper at three diferent speeds −15 km/h, 80 km/h, and 200 km/h. Te vertical accelerations are recorded at a sampling frequency of 25,600 Hz. We fnd that Sensor L2 was not functional, most likely due to a loose installation, so we use the data from the other three functional sensors for analysis.
Te TFOMA method is applied to the measured data with the parameters listed in Table 2. According to Section 4, only the prepassage phases are studied, while diferent lengths of signals are used due to the diference in speeds. Te raw data and the corresponding MI spectrograms up to    Figure 11. Generally, the patterns of MI are similar at diferent speeds. Some peak bands are wide in frequency, whereas others are narrow. Te low-frequency bands are more pronounced at low speeds, especially when the train is close to the sleeper, whereas the high-frequency bands are more pronounced at high speeds and continuously dominant even when the train is still far away from the sleeper. In addition, some peak bands are not horizontal, i.e., their frequencies change as the train approaches. We select fourteen peak bands at each speed, labeled as O1∼O14. Te frst four columns of Table 3 present the characteristics of each peak band and also the average natural frequencies and mode shapes. In each plot, the identifed mode shapes at a certain speed are plotted in a light color, and their average is plotted in a dark color. In general, the identifed frequencies and mode shapes are similar at diferent speeds while varying slightly due to the infuence of train speed and noise. More discussion is provided in Section 5.3.

EMA of the In Situ Sleeper.
For comparison, we perform hammer tests with the same setup in Figure 10. All the four sensors were functional in the tests. We generate impacts at fve locations using a big hammer (PCB 086D50) and a small hammer (PCB 086D05). At each location, we repeat the test fve times with each hammer. Considering their diferent excitation frequencies [56], the results of the big and small hammers are used for analyses below 2,000 Hz and above 500 Hz, respectively. Te average FRFs are plotted in Figure 12, and eleven resonance peaks are identifed, labeled as E1∼E11. Compared to the sleeper on the test rig, the natural frequencies of the real sleeper are much lower due to its size and material. Most of the resonance peaks, especially at high frequencies, are smooth, which is consistent with the fnding in Section 4.3. Te average frequency and mode shapes for each resonance peak are shown in Table 3, where the identifed mode shapes deviate for diferent hammers and impact locations.

Comparisons between TFOMA and EMA.
Moreover, we compute the theoretical mode shapes of a free-free beam of length 2.5 m according to (16) and (17). Furthermore, in Table 3, we match the identifed modes of TFOMA with those of EMA and theoretical analysis while referring to the     Sensor L1 Sensor R1 Sensor R2 Sensor L2 Impact 3 Impact 2 Impact 1 Impact 4 Impact 5 Figure 10: Te instrumented sleeper in feld tests.  Structural Control and Health Monitoring characteristics of sleeper modes reported in [35,57]. Te average MAC in Table 3 quantifes the consistency of mode shapes between TFOMA and EMA. Te main fndings are summarized below: (i) TFOMA identifes the rigid-body motions of the sleeper at frequencies lower than those of the bending modes, which is consistent with [35,57]. Te bounce motion is more pronounced, which is consistent with the laboratory test. Te rigid-body motions are not observed in EMA because the impact forces cannot efectively excite such modes. (ii) In terms of mode shapes, both TFOMA and EMA consistently (with high MAC values) identify the 1 st , 2 nd , 4 th , 5 th , 7 th , 8 th , and 10 th bending modes. However, neither identifes the 3 rd , 6 th , and 9 th bending modes, probably because these modes are less dominant or the sensors are close to the nodes. (iii) Te frequencies of E1, E2, and E4 are close to those reported in [35,57] under unloaded conditions. For the 1 st and 2 nd bending, the frequencies of TFOMA deviate from those of EMA, refecting the infuence of the train load. For high-order modes, the frequencies of TFOMA and EMA are very close.
(iv) Both TFOMA and EMA identify extra modes probably related to other components.
Furthermore, the pros and cons of TFOMA and EMA are discussed as follows: (i) TFOMA can capture the change of modal characteristics over time and frequency, whereas EMA cannot.
(ii) TFOMA works under operational loads in a broad frequency range, but the excitation spectrum is usually not fat, which can cause errors in modal identifcation. EMA works under controlled excitations but requires manual impacts and also different hammers for diferent frequency ranges. (iii) For a complex coupled system (e.g., a train-track system), the response of a component (e.g., a sleeper) depends not only on its own modal characteristics but also on the dynamical infuence of other components (e.g., trains, rails, fasteners, and ballast). As a consequence, extra modes can be more pronounced in OMA than in EMA. (iv) For each mode, the mode shapes identifed by TFOMA spread within a certain variance, while     using the traditional peak-picking method [58]. Ten, for all the matched modes in Section 4.4, the estimated damping ratios are plotted against their natural frequencies in Figure 13(a). For most modes, TFOMA in diferent phases produces damping ratio estimates similar to EMA while underestimating those at low frequencies.
Similarly, the damping ratios of the sleeper in the feld tests are estimated and plotted in Figure 13(b). Te results of TFOMA are similar at diferent speeds, but the estimated damping ratios are lower than those of EMA. Tese deviations may come from two sources. On the one hand, the diferent loading conditions can lead to diferent modal characteristics, including damping ratios. Tis efect is pronounced for railway tracks since the train load is enormous. On the other hand, the estimation based on a truncated spectrum (either in TFOMA or EMA) can cause errors, especially when the frequency resolution is low or adjacent modes afect each other [21,23]. In summary, TFOMA can provide accurate damping estimates in the case of well-separated modes, but it needs further improvement to handle structures with signifcant nonlinearity and dense modes.

Identifcation of Closely Spaced Modes.
In this paper, the proposed method is applied to the modal identifcation of railway sleepers, where the bending modes of diferent orders are separated. It has the potential to identify closely spaced modes by involving multiple nonzero singular values. Tis section aims to demonstrate such capability using a simulation example. As shown in Figure 14(a), a rectangular plate suspended by springs and dampers vibrates in the x-y plane with three degrees of freedom −x, y, and θ. External excitation forces are applied at the upper right corner, and the equations of motion are given as follows: m€ y + 2k y y + 2c y _ y � P y , Based on the parameters and excitations defned in Table 4, (18) is solved numerically using the Newmark-beta method [59] with a time step of 0.2 ms. Te bidirectional accelerations of the four edge centers are fed into the TFOMA method with the parameters in Table 5. Te number of segment n c � 2 is used to distinguish the two translational modes, which are closely spaced since they have equal natural frequencies due to equal stifness.
Two MI spectrograms are obtained, as shown in Figures 14(b)-14(c), with eight peak bands (with MI > 0.8) identifed in diferent phases of the response. Te estimated natural frequencies and mode shapes are shown in Figure 14(d). In 0∼2 s, the translational mode in the x direction is identifed (X1), whereas the one in the y direction is not identifed since the excitation is applied only in the x direction. When the excitation is applied only in the y direction in 2∼4 s, the translational mode in the y direction is identifed (Y1), while the one in the x direction is still identifable from the decay response (Y3). When the excitations are applied in both directions, the two translational modes are identifed (XY1 and XY3), and XY1 (in the 1 st spectrogram) is more dominant than XY3 (in the 2 nd spectrogram) since the excitation in the y direction has greater power. Moreover, the rotational mode is identifed in all three phases (X2, Y2, and XY2). For all modes, the estimated frequencies are consistent with the engine frequencies calculated from the model parameters. Te simulation result demonstrates that the proposed method    Structural Control and Health Monitoring  Structural Control and Health Monitoring can distinguish closely spaced modes under nonstationary excitations as long as the modes are efectively excited. We expect the validity of this capability to hold in real-world scenarios, while it remains to be demonstrated.

Conclusions
Tis paper presents an interpretable OMA method in timefrequency representation. Short-time FDD and a convolutionbased strategy are proposed to obtain singular values and local mode shape similarity, respectively, which are further fused into mode indicators by a fuzzy-based strategy. TFOMA is an explicit tool that provides a global view of modal characteristics and estimates of modal parameters. It is applicable to strongly nonstationary responses under time-varying loads and conditions and is robust to the length of signals due to its discrete and localized nature. Its interpretability is enhanced by including physical information from the user's prior knowledge in selecting parameters and peak bands. In this paper, TFOMA identifes the rigid-body motions and bending modes of the sleepers at frequencies up to 6,500 Hz in the laboratory tests and 4,500 Hz in the feld tests. Te passage response provides similar results to the impact response, while the prepassage phase slightly outperforms the postpassage phase. TFOMA works efectively at speeds up to 200 km/h by using only three sensors, and some highfrequency modes are identifable when the train is 150 m away. TFOMA provides identifcation results comparable to EMA, while their deviations refect the dynamical infuence of train loading and other track components.
In future research, theoretical analyses and numerical simulations will be carried out to investigate the characteristics of train loads on tracks and decouple the dynamics of diferent components in a train-track system. Meanwhile, the proposed method will be further improved to provide adaptive time and frequency resolutions (especially for signals with signifcant nonstationarity or nonuniform frequency bands) and to more accurately estimate damping ratios for nonlinear structures with dense modes. Moreover, the uncertainty of modal analysis will be quantifed so as to assess the confdence level of identifcation results.

Data Availability
Te data used to support the fndings of this study are available from the corresponding author upon reasonable request.

Conflicts of Interest
Te authors declare that there are no conficts of interest regarding the publication of this article.