A New Approach to Investigation of the Relationship of VLF Signals by Using Longitudinal Analysis Models



Introduction
VLF radio waves (3-30 kHz) are guided by the spherical waveguide formed between the Earth's surface and the lower ionosphere and can propagate over long distances [1]. VLF stations located around the world have been used for different purposes, such as communication with submerged military submarines and remote sensing of transient disturbances in the ionospheric D region [2]. The observed properties of VLF signals can be used to determine the spatial and temporal structure of localized disturbances in the lower ionosphere [3]. These disturbances are often manifested as sudden changes in the amplitude or phase of a VLF signal. Using VLF remote sensing techniques, D region disturbances have been observed in association with solar flares, meteor showers, auroral enhancements, gamma ray flares of extraterrestrial origin, and the direct and indirect effects of lightning [4,5].
Statistical studies provide a very important "baseline" reference for case studies and a global view that can be used to identify different processes for a variety of geophysical conditions [6]. Statisticians, as well as others who apply statistical methods to data, have often made preliminary examinations of data in order to explore their behavior. In this sense, exploratory data analysis has long been a part of statistical practice [7]. In statistics, exploratory data analysis (EDA) is an approach to analyzing datasets that summarizes their main characteristics in an easy-to-understand form, often with visual graphs, without using a statistical model or having formulated a hypothesis. Exploratory data analysis was promoted by Hoaglin et al. to encourage statisticians to examine their datasets visually and to formulate hypotheses that could be tested on new datasets [8].
The possibility of correlations between responses needs to be taken into account in the data analysis. Various models can be used to handle such correlations. Longitudinal studies are increasingly widespread in many fields of scientific research. The aim of a longitudinal study is to examine the variation of a process across time periods [9]. From this point of view, marginal modeling allows for inferences about parameters averaged over the whole population. Random-effects modeling provides inferences about the variability between respondents. Transitional modeling investigates the reasons for the change in the responses [10].
This paper presents a comparative study of longitudinal analysis techniques and models applied to experimental VLF signals and investigates the relationship between two VLF transmitters.
The organization of this paper is as follows. The experimental systems, including the VLF transmitters and receiver, are introduced in Section 2. An initial EDA is given in Section 3. Section 4 discusses various possible models (marginal, random-effects, and transitional models) in some detail. Some concluding remarks are given in Section 5.

Experimental Systems
The experimental VLF signals used in this study were obtained from the VLF receiver system located at Firat University, Elazig, Turkey (38.40° N, 39.12° E). The wideband signal is detected by two orthogonal magnetic loop antennas, which are right-isosceles triangles (2.6 m base, 12 turns). The data are band-pass filtered to a range of 9-45 kHz and sampled at 100 kHz with 16-bit resolution and GPS timing. The demodulated amplitude and phase of the narrowband signals are typically recorded during 13:00-06:00 [1]. Only the amplitude data are included in this study. The great circle paths (GCPs) of the Elazig VLF receiver system are shown in Figure 1 and summarized in Table 1.

Exploratory Data Analysis (EDA)
Exploratory data analysis emphasizes looking at and analyzing residuals to help understand data and to investigate models. Fitting a description to the data and examining the residuals often gives guidance for improving that description. Strong patterns in the residuals usually indicate that further refinement of the fit will help in understanding the structure of the raw data. A regression setting illustrates the use of residuals and some associated plots. EDA advocates exploring data for patterns and relationships without requiring prior hypotheses. EDA suggests that analyses are more scientifically useful and productive when data have been transformed to comply with basic assumptions. Exploratory analyses can incorporate methods of statistical inference but use them more as indicators of the strength of a relationship or the fit of a model than as confirmation of a hypothesis [7].
In this study, we analyzed four consecutive days of VLF signals from June 2008. The parameters used for the analysis are day, station name, signal amplitude, and start hour. We used exploratory graphics to summarize our dataset. As a preliminary step, the mean amplitude of the VLF signals was calculated for each station and each day; the results are given in Table 2.
Spaghetti plots and draftsman's displays, which are part of the exploratory analysis, are given in Figures 2 and 3.
As can be seen in Figure 2, there is not much difference between the signals for the DHO station, but some deviation is seen in the HWU signal. In particular, the lines do not run in the same direction between the 2nd and 3rd days. We can therefore say that some physical events occurred in the Earth-ionosphere waveguide on these days. The scatter plot shows the relationship between two signals. For example, the relationship between the 1st and 2nd days is shown by the circular area in Figure 3. The pink and blue points are gathered separately at the corners. This means that there is a strong relationship between days and signals.
Correlation and covariance matrices are given in Tables 3 and 4, respectively. The signal correlation is extremely high, positive, and linear. All correlations are close to 0.99, while the variances increase from 288.47 to 482.29. The compound symmetry (exchangeable) covariance structure is more suitable for these data.
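The preliminary averaging step can be sketched as follows. This is a minimal illustration that assumes the data are available as (day, station, start hour, amplitude) records; the record layout, the function name, and all numbers are hypothetical and are not the values of Table 2.

```python
from collections import defaultdict

# Hypothetical records: (day, station, start_hour, amplitude).
# The values are invented for illustration only.
records = [
    (1, "DHO", 13, 50.0), (1, "DHO", 14, 52.0),
    (2, "DHO", 13, 51.0), (2, "DHO", 14, 53.0),
    (1, "HWU", 13, 40.0), (1, "HWU", 14, 44.0),
    (2, "HWU", 13, 38.0), (2, "HWU", 14, 40.0),
]


def mean_amplitude_by_station_day(rows):
    """Return {(station, day): mean amplitude} for the given records."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for day, station, _hour, amplitude in rows:
        sums[(station, day)] += amplitude
        counts[(station, day)] += 1
    return {key: sums[key] / counts[key] for key in sums}


means = mean_amplitude_by_station_day(records)
for (station, day), value in sorted(means.items()):
    print(f"{station} day {day}: mean amplitude {value:.2f}")
```

Grouping by station and day keeps the repeated-measures structure visible, which is the same structure the spaghetti plots display.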
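A between-day correlation or covariance matrix of this kind can be computed as in the sketch below. It assumes one list of amplitudes per day, aligned by observation time; the three series are invented and the helper names are illustrative. Under a compound symmetry structure, the diagonal entries of the covariance matrix would be roughly equal to each other, and so would the off-diagonal entries.

```python
import math


def covariance(x, y):
    """Sample covariance of two equally long sequences (n - 1 denominator)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)


def correlation(x, y):
    """Pearson correlation coefficient."""
    return covariance(x, y) / math.sqrt(covariance(x, x) * covariance(y, y))


def pairwise_matrix(series, func):
    """Apply func to every pair of series and return a square matrix."""
    return [[func(a, b) for b in series] for a in series]


# Hypothetical amplitudes for three days, aligned by observation time.
day1 = [10.0, 20.0, 30.0, 40.0]
day2 = [12.0, 24.0, 36.0, 48.0]
day3 = [15.0, 24.0, 37.0, 44.0]

cov_matrix = pairwise_matrix([day1, day2, day3], covariance)
corr_matrix = pairwise_matrix([day1, day2, day3], correlation)
```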

Longitudinal Approach for the Analysis of VLF Signal Data
The longitudinal approach combines regression and time-series analysis. Longitudinal models have an important role in the literature. Nevertheless, they are used mostly in the social sciences, economics, sociology, and medicine. Longitudinal data analysis methods have been developed because many databases have become available for experimental investigations [11]. This section studies new approaches to the analysis of VLF data. We consider three longitudinal analysis techniques: marginal, random-effects, and transitional models, respectively.

Marginal Model.
In the marginal model, the relation between the response vector and the explanatory variables at each time, and how this relation changes over time, is described. In marginal modeling, the joint outcome is modeled by using a marginal probability for each response variable, and the correlation structure, by using the odds ratio, parameterizes the relation between the two dependent variables [10]. The parameters are interpreted as the constant (intercept) and slopes of each fixed-effect predictor, such as time (day), signal (signal amplitude), and the interaction of time and signal. The intercept is interpreted as the mean of the VLF signals when all the predictors have a value of zero. A one-unit increase in the predictor time to new experiences time corresponds

Random-Effects Model.
In the random-effects model, natural heterogeneity across individuals in their regression coefficients is investigated, and this heterogeneity can be represented by a probability distribution. Correlation among observations for the same individual arises from their shared unobservable variables [10]. The random-effects model has the same form as the marginal model, and its predictor coefficients are interpreted in the same way as the coefficients in the marginal model. The fixed coefficients for the random-effects model are given in Table 6. The sign of the signal coefficient is as expected. The interaction between signal and time is not statistically significant, because the test statistic (0.90) is below the critical value of 1.96.
The random-effects part of the output provides estimates for the random effects in the form of variances and standard deviations (Table 7). The variance is equal to 0. The variance estimates are of interest here because we can add them together to find the total variance and then divide each variance component by that total to see what proportion of the variance is attributable to each random effect (similar to $R^2$ in traditional regression). We sum the variance components, 0 + 55.28 = 55.28, and divide our variance by this total to obtain the proportion of variance accounted for, which indicates whether or not the random effect is meaningful (0/55.28 = 0.0). If the proportions for all random effects are equal to zero, then the random effects are not present and linear mixed modeling is not appropriate. In that case, the random effects should be removed from the model and general linear or generalized linear modeling should be used instead.
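One common way to write a random-intercept model with the predictors named above (time, signal, and their interaction) is sketched below. The notation is an assumption made for illustration: $i$ indexes the series, $j$ indexes the days, and $\mathrm{signal}_i$ is taken to be an indicator for the station.

```latex
% Illustrative notation only; all symbols here are assumptions.
\[
Y_{ij} = \beta_0 + \beta_1\,\mathrm{time}_{ij} + \beta_2\,\mathrm{signal}_{i}
       + \beta_3\,\mathrm{time}_{ij}\,\mathrm{signal}_{i} + b_i + \varepsilon_{ij},
\qquad b_i \sim N(0, \sigma_b^2), \quad \varepsilon_{ij} \sim N(0, \sigma^2).
\]
```

In this form, the fixed part $\beta_0 + \beta_1\,\mathrm{time}_{ij} + \beta_2\,\mathrm{signal}_i + \beta_3\,\mathrm{time}_{ij}\,\mathrm{signal}_i$ is the population mean that a marginal model would also describe, and $\sigma_b^2 = 0$ removes the random intercept altogether.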
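The proportion calculation described above amounts to the following arithmetic. The sketch assumes that the zero variance belongs to the random intercept and that 55.28 is the residual variance; the labels are illustrative.

```python
def variance_proportions(components):
    """Return each variance component as a share of the total variance."""
    total = sum(components.values())
    return {name: value / total for name, value in components.items()}


# Variance components quoted in the text: 0 + 55.28 = 55.28.
# The assignment of 55.28 to the residual is an assumption.
components = {"random_intercept": 0.0, "residual": 55.28}
proportions = variance_proportions(components)
print(proportions)
```

A share of zero for the random intercept means that none of the total variance is attributed to differences between individuals.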
In addition, the output provides the correlations among the fixed effects, which are given in Table 8.
These can be used to assess multicollinearity. Time and the intercept are strongly correlated (−0.913), as are signal and the interaction of signal and time. Therefore, multicollinearity is a concern. The point estimates for the fixed effects are exactly the same as in the marginal model.
The two rows of fit statistics in Table 9 show the Akaike Information Criterion, the Bayesian Information Criterion, the log-likelihood, the deviance for the maximum likelihood criterion, and the deviance for the REML criterion.
A smaller deviance indicates a better fit, so it may be useful to add a random slope to obtain a better fit. We therefore add a random slope to the model. The deviance for time is smaller than for the other candidates (time, signal, signal * time), so we select the time covariate as the random part. The sign of the signal coefficient is as expected. The variances for both the random intercept (0.0) and the random slope (4.2152e−14) are very low. Since the random-effect variances are very low, it may not be beneficial to add the random coefficient for these data. To test this, we can use ANOVA. If we compare the random-effects models with ANOVA, the corresponding p value of 1 implies that estimating the additional coefficient is not beneficial at the 5% significance level. The histograms of the residuals for the two models are given in Figure 4. The histograms of residuals are exactly the same for the two random-effects models.
In this study, we also studied a subsequence of the original data and carried out a simulation study using a Bayesian procedure. The purpose of the Bayesian approach is to obtain the posterior variance of a model factor from the likelihood function and the prior variance [12]. We can estimate the random- and fixed-effect coefficients by using the highest posterior density (HPD) intervals for the parameters of an MCMC distribution. The random-effect coefficients do not differ between the VLF signals. They range between 0.3 and 0.4. We can estimate both lower and upper bounds for the coefficients. The lower bound of the intercept is equal to 46.84 and the upper bound of the intercept is equal to 50.36. Similarly, the lower bound of the time coefficient is equal to 1.12 and the upper bound is equal to 2.41.
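The fit statistics listed here are all simple functions of the maximized log-likelihood. The sketch below uses the standard definitions; the log-likelihood, parameter count, and sample size are hypothetical and are not the values of Table 9.

```python
import math


def deviance(log_likelihood):
    """Deviance as minus twice the maximized log-likelihood."""
    return -2.0 * log_likelihood


def aic(log_likelihood, n_params):
    """Akaike Information Criterion: 2k - 2 ln L."""
    return 2.0 * n_params - 2.0 * log_likelihood


def bic(log_likelihood, n_params, n_obs):
    """Bayesian Information Criterion: k ln(n) - 2 ln L."""
    return n_params * math.log(n_obs) - 2.0 * log_likelihood


# Hypothetical fit: log-likelihood -100 with 5 parameters and 100 observations.
example_loglik, example_k, example_n = -100.0, 5, 100
print(deviance(example_loglik), aic(example_loglik, example_k),
      bic(example_loglik, example_k, example_n))
```

For all three quantities, smaller values indicate a preferable model; AIC and BIC add a penalty for the number of parameters, and BIC penalizes more heavily as the sample size grows.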
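A comparison of two nested models of this kind is commonly reported as a likelihood ratio test on the difference in deviance. The sketch below assumes that form of comparison; the deviances are hypothetical, and the closed-form chi-square tail probabilities are implemented only for 1 and 2 degrees of freedom.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable (df = 1 or 2 only)."""
    if df == 1:
        return math.erfc(math.sqrt(x / 2.0))
    if df == 2:
        return math.exp(-x / 2.0)
    raise ValueError("this sketch supports df = 1 or df = 2 only")


def likelihood_ratio_test(deviance_simple, deviance_complex, df):
    """Return (test statistic, p value) for two nested models."""
    statistic = max(0.0, deviance_simple - deviance_complex)
    return statistic, chi2_sf(statistic, df)


# Hypothetical deviances: the random slope does not reduce the deviance at all.
stat, p_value = likelihood_ratio_test(250.0, 250.0, df=2)
print(stat, p_value)
```

When the added random slope leaves the deviance unchanged, the statistic is zero and the p value is 1, which matches the situation described above where the estimated slope variance is essentially zero.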
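An HPD interval can be estimated from MCMC draws as the shortest interval that contains the required share of the samples. The sketch below is a generic empirical version under that assumption, not the procedure of any particular package, and the draws are invented.

```python
import math


def hpd_interval(samples, mass=0.95):
    """Shortest interval containing at least `mass` of the samples."""
    ordered = sorted(samples)
    n = len(ordered)
    window = max(1, math.ceil(mass * n))  # number of draws inside the interval
    best = None
    for start in range(n - window + 1):
        low, high = ordered[start], ordered[start + window - 1]
        if best is None or high - low < best[1] - best[0]:
            best = (low, high)
    return best


# Hypothetical posterior draws for one coefficient.
draws = [30, 12, 0, 14, 60, 10, 40, 13, 50, 11]
interval = hpd_interval(draws, mass=0.5)
print(interval)
```

The lower and upper ends of such an interval play the role of the lower and upper values quoted for the intercept and the time coefficient.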

Transitional Model.
In transitional models, the probability distribution of the outcome of individual $i$ at time $j$, $Y_{ij}$, is a function of the individual's covariates at time $j$, $x_{ij}$, and the individual's outcome history $Y_{i1}, \ldots, Y_{i,j-1}$, $j > 1$. The lag-1 response and the covariates have been included in the transitional model, and the coefficient of $Y_{i,j-1}$ is −1.04. This value is not an expected result. The parameter estimates changed considerably in this model. This is probably a result of the high correlation among these variables. The residuals and coefficients are given in Table 10.
The time coefficient of signal is negative, and this is an expected result. The model coefficients are statistically significant, because the robust test statistics are large. These data may be modeled with a transitional model, but there is a multicollinearity problem.
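Including the lag-1 response as a predictor requires pairing each observation with the previous observation of the same series. The sketch below shows that data preparation step only; the tuple layout, the function name, and the values are hypothetical.

```python
from collections import defaultdict


def add_lag1(observations):
    """Pair each response with the previous response of the same series.

    observations: iterable of (series_id, time, y) tuples.
    Returns (series_id, time, y, y_prev) tuples. The first time point of each
    series is dropped because it has no outcome history.
    """
    by_series = defaultdict(list)
    for series_id, time, y in observations:
        by_series[series_id].append((time, y))

    lagged = []
    for series_id, points in by_series.items():
        points.sort()  # order each series by time
        for (_, y_prev), (time, y) in zip(points, points[1:]):
            lagged.append((series_id, time, y, y_prev))
    return lagged


# Hypothetical daily amplitudes for two stations.
observations = [
    ("DHO", 1, 50.0), ("DHO", 2, 51.5), ("DHO", 3, 49.0),
    ("HWU", 1, 40.0), ("HWU", 2, 38.0),
]
lagged_rows = add_lag1(observations)
```

Because the lagged response is usually highly correlated with the current covariates in slowly varying series, this construction is also where the multicollinearity noted above can enter the model.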

Result and Discussion
The main aim of this paper is to illustrate the use of various models in the analysis of VLF signals. We used four days of VLF signal data received by the Elazig receiver system from transmitter stations located in two different countries. The aim of this study is to use a new statistical method for the analysis of VLF signal datasets and to compare the signals taken from two different stations. Therefore, longitudinal analysis techniques were applied to VLF signal data taken from two different stations. The longitudinal approach covers studies in which the values of the outcome and the covariates are observed over time. Moreover, it allows changes in the response and in its relationships to be examined and compared over time.
The outcomes of the study include the correlation and covariance matrices for the two stations, spaghetti (longitudinal) plots and a scatter plot of VLF signals for the two stations, the residuals and coefficients for the marginal model, the fixed effects for the random-effects model, the residuals and coefficients for the transitional model, and a comparison of the residuals of two models.
The correlation and covariance matrices computed for the two stations reveal a very close relationship between signal and data. The spaghetti (longitudinal) and scatter plots show that there are differences between the stations. For the DHO station, a high correlation between day and signal is found, but for the HWU station deviations are detected on some days.
As can be seen in Figure 4 and Table 6, the random-effects models and the coefficients of each VLF signal do not differ from each other, because the random-effects model has not reduced the deviations which occurred in the marginal model. Moreover, these models show that the simulation study and the subsequence of the data are not suitable for modeling VLF signals. For the transitional model, there is a multicollinearity problem. On the other hand, as can be seen from the results in Table 5, the marginal model is more appropriate than the other models for modeling the VLF signal data. As a result of this process, a new method has been added to the analysis of VLF signal data.

Figure 1: The GCPs connecting the Elazig receiver with the VLF stations.

Figure 2: Spaghetti plots of VLF signals for the two stations.

Figure 3:

Figure 4: Comparing residuals of the two models.

Table 1: The list of transmitters and receiver.

Table 2: The means of the VLF signals.

Table 3: Correlation matrix of the two stations.

Table 4: Covariance matrix of the two stations.

.74 increase in the outcome VLF signals for each station. Since the coefficient for the HWU signal is negative, the HWU signal has lower VLF times compared to the DHO signal when all other factors are kept constant. In other words, the VLF signal time is expected to be lower when the HWU signal is used. This is an expected result for the station. The model coefficients are statistically significant, because the robust test statistics are large. The residuals seem high. The residuals and coefficients for the marginal model are given in Table 5.

Table 5: The residuals and coefficients for marginal model.

Table 6: Fixed effects for random-effects model.

Table 7: The variances and standard deviations for random-effects model.

Table 8: Correlation of fixed effects.

Table 9: Model selection criteria.

Table 10: The residuals and coefficients for transitional model.