Multiple Gate Delay Tracking Structures for GNSS Signals and Their Evaluation with Simulink, SystemC, and VHDL

.


BACKGROUND AND MOTIVATION
The main algorithms used nowadays for GPS and Galileo code tracking are based on what is typically called a feedback delay estimator, and they are implemented based on a feedback loop.The most known feedback delay estimators are the delay locked loops (DLLs) and the today's GNSS receiver that typically use a particular DLL structure, called the narrow correlator or narrow early-minus-late (NEML) delay tracker, which proved to give good results in multipath environments [1][2][3].Another class of enhanced DLL structures is the socalled double-delta correlator class [4], which started to gain more and more attention during last years.Examples belonging to this class are: the high resolution correlator (HRC) [3,5], the strobe correlator [2,4,6], the pulse aperture correlator (PAC) [7], the multipath mitigation correlator [8], and the modified correlator reference waveform [2,9].Most of the double-delta correlators as well as the narrow correlator are patented or under patent applications [4,5,7,10,11].
An alternative to the above-mentioned feedback loop solutions is based on the open-loop (or feedforward) solutions, which refer to the solutions which make the delay estimation in a single step, without requiring a feedback loop.A general classification of open-loop solutions for CDMA communication applications can be found in [12,13] and for GNSS applications in [14].However, for the purpose of lowcost mass-market receiver implementation, feedback delay tracking structures are still the preferred ones, and they will be the focus of our paper.
We introduce here the flexible multiple gate delay (MGD) structure with adjustable parameters, and we present a method to optimize these parameters.We show the performance of MGD structures in multipath channels, with a particular attention to the situations with more than 2 paths (which are typically neglected in the literature, when analyzing the multipath error envelopes of delay tracking units).We also present, for the first time to the authors' knowledge, a comparison between using squared-envelopes versus envelopes before noncoherent integration stage as well International Journal of Navigation and Observation as a comparison between using uniform versus nonuniform gate spacings in delay tracking units.
We then validate the MGD structures via implementation in a Simulink-based navigation tool, GRANADA.Based on the presented MGD structures, we also develop a flexible delay tracking prototype receiver (in SystemC and VHDL) for Galileo and GPS signals.The main focus is on sine-BOC and BPSK-modulated signals, but the design steps shown here can be extended in a straightforward manner to other BOC modulations (cosine BOC, multiplexed BOC, alternate BOC, etc.).
In the delay tracking receiver prototyping, we focus on the implementability, complexity, and flexibility of the proposed MGD structures.First, we present the implementations and discuss about the flexibility and the restrictions caused mostly by the digital hardware characteristics.Then, we verify the implementability of the chosen algorithms with a SystemC model.After that, the complexity of the implemented prototype hardware is evaluated as VHDL synthesis results.

COMMON DELAY TRACKING STRUCTURES FOR GNSS SIGNALS
The most common delay tracking loops for GNSS signals are based on feedback delay locked loop (DLL)-like structures.
All the above-mentioned methods have a common underlying structure, in the sense that they are based on different weighted combinations of early and late samples of the correlation function with different chip-spacings between these samples.In what follows, we will first introduce the signal model for Galileo and GPS signals, then, we present the above-mentioned methods in more detail.We will then show that most of the currently used delay tracking structures (i.e., those mentioned above) can be unified under a generic structure, namely the multiple gate delay (MGD) structure, whose parameters are to be optimized in Section 3.
Typical satellite positioning signals, such as those used for GPS and Galileo, employ the direct-sequence code division multiple access (DS-CDMA) technique, where a PRN code is spreading the navigation data over S F chips (or over a code epoch length) [20,21].In what follows, we adopt, for clarity reasons, a baseband model.Also, the delay tracking estimation in nowadays receivers is typically done in digital domain (using the baseband correlation samples).The time notation t stands for discrete time.The transmitted signal x(t) can be written as the convolution between the modulating waveform s mod (t), the PRN code, including data modulation, and the pulse shaping filter p TB (t) [22]: where E b is the data bit energy, is the convolution operator, b n is the nth complex data symbol, T c = 1/ f c is the chip period, S F is the spreading factor, T sym is the symbol period (T sym = S F T c ), c k,n is the kth chip corresponding to the nth symbol, δ(t) is the Dirac pulse, and p TB (t) is the pulse shaping filter applied to pulses of duration where r(t) is the received signal, L is the number of channel paths, α l is the amplitude coefficient of the lth path, θ l is the phase of the lth path, τ l is the channel delay introduced by the lth path, f D is the Doppler shift introduced by the channel, and η(t) is the complex additive Gaussian noise of zero mean and double-sided power spectral density N 0 .
Typically, the signal-to-noise ratios for GNSS signals are expressed with respect to the code epoch bandwidth B w , under the name of carrier-to-noise ratio (CNR).The relationship between CNR and bit-energy-to-noise ratio is [23] CNR The delay tracking is typically based on the code epochby-code epoch correlation R(•) between the incoming signal and the reference x ref (•) modulated PRN code, with a certain candidate Doppler frequency f D and delay τ: where m is the code epoch index, and E(•) is the expectation operation, with respect to the PRN code, and where b n are the estimated data bits.For Galileo signals, a separate pilot channel is transmitted, thus the data bits are known at the receiver [21].In order to reduce the noise level, we can use coherent and/or noncoherent integration.The averaged coherent correlation function R c ( τ, f D ) can be written as where N c is the coherent integration time (expressed in code epochs or milliseconds for GPS/Galileo signals), and the averaged noncoherent correlation function R nc ( τ, f D ) can be written as where N nc is the noncoherent integration time, expressed in blocks of length N c milliseconds (for clarity of presentation, we dropped the block indexes used in the noncoherent summation), and pow nc is a power index used for noncoherent summation.The most encountered variants are: pow nc = 1 (i.e., sum of absolute values) and pow nc = 2 (i.e., sum of squared-absolute values).
The DLL-like structures form a discriminator function D( τ) based on the early and late correlations, and they estimate the channel first path delay from the zero crossings of this discriminator function.The discriminator functions for NEML [1-3, 10, 11] and HRC [3,5,19] are well defined in literature and their expressions as equations are, for NEML: and for HRC: In single-path channels (L = 1), the mentioned discriminator functions cross the zero level when τ = τ 1 .That is, the zero-crossings show the presence of a channel path.However, due to BOC modulation, we might have more zero-crossings present, and the search range should be restricted to the linear range of the discriminator function (for SinBOC(1,1)), this linear range goes from about −0.05 till about 0.05 chip error.In multipath channels, we also want to have D(τ 1 ) = 0, τ 1 being the true line-of-sight (LOS) delay, in order to estimate correctly the first path delay.However, this is not always possible, and an estimation error might happen due to multipath presence, that is, D(τ 1 + e me ) = 0.The term e me is the multipath error.An example is shown in Figure 1 for two in-phase paths of amplitudes 0 and −1 dB and path spacing of 0.2 chips.In this example, e me = 0.01 chips for HRC, and e me = 0.04 chips for NEML (in singlepath channel, we had e me = 0 chips for both structures).The maximum and minimum multipath errors define the multipath error envelopes (MEEs), as it will be discussed in more detail in Section 3.
While it is generally known that the performance of coherent correlators outperforms that of the noncoherent correlators in ideal conditions (e.g., absence of fading or clock synchronization errors, perfect data bit estimation, etc.), the nonidealities of practical channels make that the structures of choice in most nowadays receivers are the noncoherent ones.This motivates our choice of noncoherent correlator gates in what follows.

Proposed architecture
The proposed generalization of the NEML and doubledelta structures (which cover most of the state-of-art delay tracking techniques used nowadays in industrial implementations) follows in a straightforward manner: Above, we have a weighted sum of N g correlation pairs (or gates), with weighting factors a i , i = 1, . . ., N g , and spacings between the ith early and the ith late gate equal to Δ i .Uniform spacing between the gates (as that one used in NEML and double-delta correlators) means that Δ i = iΔ 1 , i = 2, . . ., N g .However, we need not to restrict our structure to uniform spacing alone.The above discriminator function characterizes the proposed multiple gate delays (MGDs).The first coefficient a 1 is normalized, in what follows, to 1 without loss of generality.An example of the discriminator function for MGD with uniform and nonuniform spacings is shown in Figure 2.For 2-path channel, the same channel profile as in Figure 1 was used.The multipath errors in these cases are: e me = −0.0025chips for MGD with uniform spacing and e me = 0.0150 chips for MGD with nonuniform spacing.The block diagram of the generic MGD structures is shown in Figure 3.The incoming signal is correlated with the reference, BOC or BPSK-modulated PRN code, via N g gates or correlator pairs, and, then, it is coherently and noncoherently integrated.The coherent and noncoherent integration blocks are optional, but they usually should be International Journal of Navigation and Observation Discriminator output (non-coherent) 2 in-phase path channel, with amplitudes 0 and −1 dB Figure 2: Examples of noncoherent discriminator outputs (pow nc = 2) for single-path and two-path channels, for 2 types of MGD correlators, each with N g = 2 gate pairs: uniform spacing (Δ 2 = 2Δ 1 ) versus decreasing spacing (Δ 2 = 1.5Δ 1 ).SinBOC(1,1) signal, early-late spacing Δ 1 = 0.1 chips.a 1 = 1 for both structures.employed for a better robustness against noise.The type of nonlinearity that can be used in the implementation is determined by the factor pow nc , with typical values: pow nc = 1 (envelope) or pow nc = 2 (squared envelope).The choice of nonlinearity type is usually motivated by the design constraints (e.g., complexity of squaring versus taking absolute value, possible need for analytical models, which are easier to derive in the case of squared envelopes, via chisquared statistics, etc.), therefore we will analyze both cases (pow nc = 1, 2) in what follows.To the authors' knowledge, a comparison between squared envelopes and envelopes used in noncoherent integration is not yet available in the GNSS literature.
We remark that the structure shown in Figure 3 is not the only one possible; we might, in fact, combine the early-late gates after the discriminator function.Such structures have been analyzed in [24] and were shown to give worse results than the MGD structure selected here.
We also notice that the term of MGD has been used before in [15,16].We kept the same MGD nomination, since it is quite a generic one, but, by difference with our proposed MGDs, the discriminator formed in [15,16] is a normalized discriminator, and the choice of the weighting parameters is not optimized.It is not surprising then, that, while getting rid of the false lock point problem, the MGD structures proposed in [15,16] have even poorer code tracking performance than the narrow correlator [16].
We also remark that the linear combination of weighted correlation in order to shape the discriminator function has been also considered in [25].There, the coefficients are optimized reducing the value of the correlation function outside the region ± 1 chip therefore consequently reducing the multipath error envelope area.However, the approach presented in [25] has been tested only for 2-path channels, with second path weaker than LOS path, and the optimization steps for other multipath scenarios seem to depend on previous knowledge about multipath profiles, which is not usually available.Our approach is different in the sense that we do not try to reach an optimal discriminator shape, but the optimization is done according to the estimated multipath errors, in such a way to minimize them, on average (i.e., under the assumption of various statistical distributions of channel paths, this optimization is performed, and the MGD parameters are found).
The next step is to choose the MGD parameters, namely the number of gating pairs N g , the weighting coefficients a i , and the gate spacings Δ i .This choice is done according to an optimization criterion defined in the presence of multipath channels, as given in Section 3.2.

Optimization criterion
The typical criterion to evaluate the performance of a delay tracking unit in the presence of multipaths is the multipath error envelope (MEE) [1].Typically, two paths, either inphase or out-of-phase, are assumed to be present, and the multipath error is computed versus the path spacing.The upper error envelope is obtained when the paths are inphase and the lower error envelope when the paths have 180 • phase difference.The MEEs depend on the type and length of the PRN codes, on the additive white Gaussian noise (AWGN) level, and on the residual Doppler shift errors coming from the acquisition stage.However, in order to distinguish the performance deterioration due to multipath Re-estimate τ based on the discriminator function Doppler estimation comes from the acquistion stage . .errors only, several simplifying assumptions can be made, such as: zero AWGN, ideal infinite-length PRN codes, and zero residual Doppler ( f D = f D ).Under these assumptions, after straightforward manipulations of (1), ( 2), ( 4), ( 5), (6), and ( 7), for noncoherent integration we obtain the following: where R mod (τ) is the autocorrelation function of the modulated PRN code, given by [22] R mod (τ) = Λ TB (t) where Λ TB (t) = p TB p TB is the triangular pulse of support 2T B , shown in Figure 4.The MEEs can be then computed straightforwardly, under these ideal conditions, from (10), (11), and ( 12) (noncoherent structures), by considering two-paths in-phase and out-of-phase channels.However, since the multipath profiles cannot be known in advance, we can compute some averaged MEEs, when the second-path amplitude varies.The approach selected by us was to consider that the first channel path has a unit amplitude, and the second-path amplitude varies uniformly between 0.3 and 1.0.The final MEEs will be obtained as an average of all MEEs for each channel profile.
A good delay tracking structure should furnish small average errors, small worst errors, and small maximum multipath spacing after which MEE becomes 0. The proposed optimization criterion, derived by intuitive reasoning is the area enclosed by the absolute value of the upper MEE and the absolute value with minus sign of the lower MEE.The illustration of this "enclosed area" principle is shown in Figure 5 for a MGD structure with 3 gate pairs, squared absolute value (pow nc = 2), and delta spacings and weighting coefficients shown in the figure's caption.The "enclosed area" is shown in dashed lines.We remark that the units International Journal of Navigation and Observation to measure this area are the units of MEEs (e.g., chips or meters); here the errors are shown in meters, knowing that one chip error corresponds to 293.25 m (if the chip rate is 1.023 MHz).

Tables with optimized parameters and interpretation of results
As mentioned before, the parameters to be optimized are: the number of gate pairs N g , the delta (or early-late) spacings Δ i , the weighting coefficients a = {a i } i=1,...,Ng , and the type of nonlinearity pow nc .Three types of delta spacings have been studied here.
(1) Uniform spacing: (2) Decreasing spacing: (3) Increasing spacing: The target was to minimize the area enclosed by the averaged MEEs, when the amplitude of the second channel path varied between 0.3 and 1.0 (linear scale), and the multipath spacing varied between 0 and 1.5 chips (with a step of 0.01).For convenience and without loss of generality, we normalized the weighting coefficients with respect to the first one.Thus, a 1 = 1, and the search ranges for a i were between −1 and +1, with a step of 0.1.
First, we had a look at the minimum enclosed areas for N g = 2 and N g = 3 (in order to see the effect of increasing the number of gating pairs), and for the two types of nonlinearities pow nc = 1 and pow nc = 2.For SinBOC(1,1) modulation, the minimum enclosed areas are shown in Tables 1 and 2, and they correspond to the optimum coefficients given (partly) in Table 3 (only the most illustrative cases, i.e., uniform and decreasing spacings, are shown in this last referenced table).
If we compare Table 1 (N g = 2) with Table 2 (N g = 3), we remark that, by increasing the number of gate pairs, we may decrease the enclosed MEE area, and, thus, we may increase the multipath robustness.In the worst case, the areas remain the same when going from N g = 2 to N g = 3 gate pairs, which means that the optimum is already achieved with a double-delta correlator-like structure.In this situation, the optimum is typically given by HRC (see the last column of Tables 1 and 2).We also remark that the reduction of the enclosed area is not very large when we increase the number of gate pairs, which might justify the fact that we limit our structures to a maximum of N g = 3 gate pairs (further increase in the number of gate pairs will boost the complexity, while providing only marginal benefit in terms of robustness against multipaths).
It is also seen from Tables 1 and 2 that using envelopes (pow nc = 1) instead of squaring envelopes (pow nc = 2) gives better results.Also, using a decreasing delta spacing instead of uniform delta spacing is generally better.Similar conclusions have been achieved also for GPS BPSK-modulated signals.
The optimum pairs of coefficients for the two nonlinearity types are shown in Table 3, for SinBOC(1,1) modulation, and in Table 4 for BPSK modulation.Only uniform and decreasing delta spacings are considered here, since the increasing delta spacing was clearly much worse than the other two types of spacing (as seen in Tables 1 and 2).
An illustration of the averaged MEEs for the narrow correlator, high resolution correlator, MGD with uniform spacing (a 1 = 1, a 2 = −0.7, a 3 = −0.2),and MGD with decreasing spacing (a 1 = 1, a 2 = −0.9, a 3 = 0.2) is shown in Figure 6, for SinBOC(1,1) signal, envelope-based nonlinearity (pow nc = 1), and 0.25 chips minimum earlylate spacing.The average is done with respect to the second channel path amplitude, which varies uniformly between 0.3 and 1.0 (when first channel path has unit amplitude).As discussed before, the best results among these 4 algorithms are obtained with the decreasing spacing, but the differences between the 4 considered tracking structures are not very large.
The values shown in Tables 3 and 4 give the designer the possibility of a wide choice of MGD parameters, according to the desired nonlinearity type (imposed, for example, by hardware restrictions) and to the desired minimum earlylate spacing Δ 1 .As seen in Tables 1 and 2, the smaller the minimum early-late spacing, the better the multipath performance.However, as mentioned in [23], the delay tracking error decreases with the early-late spacing only if we assume infinite bandwidth.If the bandwidth is limited, there is a lower bound limit on the minimum early-late spacing.
Although closed form expressions for this limit do not exist, a coarse limitation of the order of Δ 1 = 1/B rx has been derived in [26], where B rx is the receiver front-end bandwidth.For example, if the receiver bandwidth is limited to 20 MHz, the minimum early-late spacing that we can use will be around Δ 1 = 0.05 chips.Decreasing the early-late spacing below this limit will not provide any additional benefit in terms of code tracking error, it will only decrease the linear range of the discriminator.A large linear range of the discriminator curve is also important, since it is directly related to the International Journal of Navigation and Observation ability of the loop to keep the lock.The linear range is directly proportional with half of the early-late spacing Δ 1 /2, as illustrated in Figure 7 (there, the linear range for Δ 1 = 0.1 chips goes from −0.05 till 0.05 chips, and the linear range for Δ 1 = 0.3 chips goes from −0.1 till 0.1 chips, with high likelihood that the loop will not lose lock as long as the error is below 0.15 chips in absolute value, due to the piecewise linear and monotonic shape of the discriminator in the region −0.15 to 0.15 chips).An approximation of the linear range of the discriminator is therefore given by Δ 1 /2.Thus, when choosing Δ 1 , the designer should take into account the multipath performance, on one hand, and the bandwidth limitations and linear range constraints, on the other hand.

MEEs for more than 2 paths
When we want to analyze the MEEs in channels with more than 2 paths, there are no analytical expressions to compute them, due to the complexity of channel interactions.Thus, we cannot know if the "worst" case errors happen when all the paths are in phase or when they have alternate phases, and so forth.The solution we propose here in order to compute MEEs for multiple-paths channels is based on Monte-Carlo simulations: we generate a sufficient number of random channel realizations N random , and we look at the highest positive and negative multipath errors over the N random points.The goal is to study the MGD performance in multipath channels with more than 2 channel paths (which may occur especially in and urban indoor scenarios).For this purpose, we consider that the channel impulse response h(t) is given by (same notations from (2) are used here) We made the following assumptions during the following simulations: that the channel has a decaying power delay profile (PDP), meaning that α l = α 1 e −μ(τl−τ1) , where μ is the PDP coefficient (assumed in the simulations to be uniformly distributed in the interval [0.5; 1] when the path delays are expressed in samples), that the channel path phases θ l are uniformly distributed in the interval [0; 2π], that the number of channel paths L is uniformly distributed between 2 and L max (with L max = 3, 4, . . ., n), and that the successive path spacing τ l − τ l−1 is uniformly distributed in the interval [1/N s N B ; x max ], where N s is the oversampling factor or number of samples per BOC interval (a parameter which defines the resolution of the delay estimates), and x max is the maximum value of the successive path spacing (which will define the multipath delay axis in the MEE curves).It follows that, for each channel realization (meaning a combination of amplitudes α = α 1 , . . ., α L , phases θ = θ 1 , . . ., θ L , path spacings, and number of channel paths L), a certain LOS delay is estimated τ 1 (α, θ, L) from the zero crossing of the discriminator function (D(τ)| τ1(α,θ,L) = 0), searched in the linear region of D(•).The LOS estimation error is thus τ 1 (α, θ, L) − τ 1 , where τ 1 is the true LOS path delay.The multipath error envelopes (upper and lower) for a  3 (for uniform and decreasing spacings).pow nc = 1, Δ 1 = 0.25 chips, and SinBOC(1,1) signals.S curve for single path, MGDs with decreasing spacing, SinBOC(1, 1) Linear range Figure 7: Illustration of the linear range of the discriminator, when we increase the minimum early-late spacing.MGDs with N g = 3, decreasing spacings, pow nc = 1, and optimal parameters as given in Table 3.
particular path spacing x max can be therefore computed as  The results based on the above rule are shown in Figure 8 for L max = 6 maximum channel paths.Similar results have been achieved also for L max between 3 and 5 paths, with the only difference that the MEE levels are increasing when the number of path increases (this can be noticed also if we compare Figure 8 with Figure 6).Several structures with optimized parameters as given in Table 3 and different nonlinearity types were used here.The surprising result is that the higher the number of channel paths is, the more the performance of various MGD structures becomes similar for all the considered algorithms (and they all reach the performance of the narrow correlator).It follows that the main advantage of the proposed MGD structures comes from the fact that they offer patent-free alternatives to the current narrow and double-delta correlators, by preserving the same performance in realistic multipath channels.

Model description
The Galileo receiver analysis and design application (GRANADA), developed by Deimos Space within GARDA project, is one of the popular GNSS simulation tools nowadays.It consists of two parts: Bit-true GNSS SW receiver simulator and GNSS Environment and Navigation simulator.Since the Bit-true GNSS SW receiver simulator is created based on the Simulink/Matlab, it is easy to be modified for new receiver technologies.This simulator is currently used by several universities and researchers [16,[27][28][29].
The GRANADA Bit-true GNSS SW receiver simulator is made up by three parts: the transmitter block, the propagation channel block, and receiver block, as shown in Figure 9.The transmitter block includes the code generation, BOC modulation, and channel multiplexing.The propagation channel model takes into consideration the multipaths, the AWGN noise, and a few other possible sources of interference, such as the wideband interference from other satellites.The receiver block contains basically receiver front end, acquisition, and code tracking blocks.The general architecture of receiver is shown in Figure 10.After some modification in GRANADA version 2.02, which is distributed under Galileo supervisory authority (GSA) licenses, it can be used for testing the performance of MGD structure.The modifications made to GRANADA tool are explained with details in [29,30].

Results in AWGN and multipath static and fading channels
In order to evaluate the performance of the new structures, root mean square error (RMSE) between the estimated delay and the true LOS delay is calculated.In order to test the DLL performance in the noise presence, we chose three kinds of channel profiles: single-path static channel, two-path static channel, and four-path fading channel, as shown in Table 5. Figures 11,12,and 13 show the RMSE values of different algorithms in the different channel settings.Since the received signal cannot get synchronized in the acquisition stage of GRANADA when CNR is below 35 dB-Hz, we calculate the RMSE values from 35 dB-Hz to 50 dB-Hz.Structures with early-late spacing Δ 1 = 0.1 chips have been selected for comparison purpose, but similar results (which are in accordance with the models given in Section 3) were obtained for other early-late spacings as well.
Besides the MGD structures described in Section 3, we also considered here a normalized MGD structure, where the discriminator function is normalized by the weighted sum of early and late correlations, similar with [15,16]: The purpose of including the normalized MGD in the comparison was to show that the normalized MGD structures of [15,16] have worse performance than the un-normalized structures proposed by us.
The delay error between the initial code replica in the receiver and the received signal has not been taken into account.The estimated delay values used for calculating RMSE are taken after the transient stage in the beginning of the tracking stage.From Figures 11 and 12, the simulation results in the static channel show that as CNR increases, the estimation delay errors converge to the corresponding value in the MEEs.For scenario 1, the estimated delay errors are caused by the noise only, since there is only LOS signal in the propagation channel.As the CNR increases, the RMSE value of each algorithm gets close to 0. When CNR is equal to 50 dB-Hz, the RMSE values are below 0.5 meters.From the single path simulation results, we notice that all these algorithms have similar performance in the AWGN channel, as desired.

International Journal of Navigation and Observation
For scenario 2, as the CNR increases, the RMSE value of each algorithm converge to different values.This is because the RMSE value takes both bias and variance into account.The variance is caused by the noise and decreases when CNR increases.However, the bias is caused by the multipath in the channel and is equal to the corresponding point in the MEEs.For instance, as the CNR increases, the RMSE values of NEML algorithm converge to 11 meters, which is the same value in the MEEs according to the channel profile of scenario 2. The normalized MGD has worse behavior than an un-normalized MGD with the same parameters.
From Figure 12, it is clear that the HRC algorithm and MGD algorithm with weighting factors a = (1, −0.6, 0) show better performance than NEML algorithm, and MGD algorithm with a = (1, −0.7, 0.1) (i.e., optimum parameters) shows the best performance among all considered algorithms (which is in accordance with the theoretical derivations in Section 3.2).
In the multipath fading channel, the LOS signal follows Rician distribution, and the NLOS signals follow Rayleigh distribution.The mean power and delay of each ray are described in Table 5. Figure 13 shows that the RMSE value of NEML is much higher than other algorithms, especially when CNR is 35 dB-Hz, it gets till 172 meters (not shown in the figure in order to get a better scale).An MGD structure with weighting factor a = (1, −0.7, 0.1) shows again the best performance among the algorithms, as expected, according to the optimization results given in Section 3. The RMSE performance of normalized MGD algorithm is quite poor in fading channels.

ALGORITHM TESTING/PROTOTYPING
From the MGD structure optimization results (Tables 2  and 3), we chose the MGD algorithm with N g = 3 to be investigated further in prototype stage.Both uniform and decreasing spacings with the Δ 1 = 0.1 and Δ 1 = 0.25 chips were chosen to be studied.The purpose here is to show that the chosen tracking algorithms are feasible to be implemented on actual devices.One of the targets of this study was to see if the behavior of the proposed algorithms does change due to the restrictions given by the hardware implementation.These restrictions include finite computation accuracy, and the effect of quantization due to the bit-width of the signals and the limitation caused by the operation frequency of the synchronous digital system.
We also focus on the design complexity issue, which characterizes the algorithm development especially in the low cost receivers.Since the trend in price of the satellite navigation receivers is currently descending [31], the manufacturers of these low cost, mass market, receivers will most likely reject the algorithms with high implementation complexity and cost.In the satellite navigation receiver, the signal tracking is performed by hardware and software signal processing [23].On the evolving field of satellite navigation systems, the issue of flexility has become more and more important.Flexible designs allow algorithm updates if the specifications of upcoming systems (like Galileo) change suddenly.Flexible designs are usually relying on softwarebased implementation [32].For receivers, the software-based implementation is declared to be minimizing the area and cost.On the other hand, the computation burden of the realtime tracking algorithms is too high for most of the handheld device processors, and thus hardware-based computation acceleration is also required.The division between hardware and software implementation may vary in different cases and from the cost perspective it has quite an important role.For this software versus hardware division, one approach in the literature has been the division where the correlation of incoming signal and the reference code are implemented as a specific hardware accelerator, and the computation of discriminators for the feedback loop is done by software running on a digital signal processor (DSP) or some specific processor [20,33].In the commercial receiver chip sets, this division is usually implemented as a specific hardware GPS accelerator, engine or core, which is connected to an embedded processor [34,35].
We chose this approach with the focus on the hardware complexity for our algorithm prototyping implementation.We used the hardware synthesis results (i.e., resource consumption on target FPGA) to estimate the relative complexity of the implemented algorithms.

Implemented architecture
We implemented the chosen MGD algorithm in both Sys-temC and VHDL hardware description languages.The hardware was implemented as a Galileo/GPS tracking structure with processes of carrier wipe-off, code tracking correlation, and result integration.The architecture of the implemented hardware delay tracking channel is illustrated in Figure 14.The number of correlators is related to the algorithm used.For the chosen MGD structure with N g = 3, seven correlators are needed to form three correlator pairs and the prompt correlator.
The implemented tracking architecture contains the following functional units: numerically controlled oscillators (NCOs) are used to create the desired frequencies inside the system for the replica code and carrier generation.The code generator is used to generate the replica PRN code for tracking.The carrier NCO outputs sine and cosine waves, which are used to strip the intermediate frequency (IF) carrier from the incoming signal.The sine and cosine multiplications make also the division between in-phase and quadrature phase channels.Seven correlators in both channels are used to correlate the incoming signal with the delayed versions of locally generated code.The amount of delay between the code generation outputs defines the spacings (Δ 1 , Δ 2 , and Δ 3 ) (a) Delay registers with uniform spacing Ref code between the correlators.Discriminator function is computed from the accumulated (integrated) correlator outputs.Since we decided to implement both uniform and decreasing spacing algorithms, two versions of the delay line in the code generator output was constructed.The main difference between these delay lines is illustrated in Figure 15.The uniform delay spacing is created simply by feeding the reference code chip value from the code generator through a delay shift register, where all delays are equal (e.g., for Δ 1 = 0.25 chips, we have: Z −1 = 0.125 chips).The decreasing delay spacing implementation needs additional registers between the very-very-early (VVE) and very-very-late (VVL) outputs to align the delays correctly (e.g., for Δ 1 = 0.25 chips, we have: Z −1 = 0.03125 chips, Z −2 = 2Z −1 = 0.0625 chips, and Z −4 = 4Z −1 = 0.125 chips).One may notice that the decreasing delay spaced register implementation needs much smaller uniform delay Z −1 .The relationship of smallest uniform delay Z −1 in cases of uniform and decreasing delay spacings is

SystemC verification of the architecture
We started the prototyping task by creating a high-level SystemC model of the hardware tracking channel.SystemC is a C++ library extension which can be used, for example, to cycle accurate hardware architecture modeling [36].The similarity of the syntax of the hardware description language with C++ allowed fast prototype generation.Another benefit of using SystemC is that it contains the simulator itself, thus a stand-alone executable can be created for the simulations.The developed model was based on the one published previously in [37].In [37], the SystemC hardware description language was used to model an inter-operative GPS/Galileo code correlator channel.For the MGD tracking algorithm testing, a carrier wipe-off process was included to this newer version of model.We developed a Matlab code to represent the software part of the proposed MGD tracking algorithms.Matlab was also used for generation of the input signals for the test simulations.The division of resources between SystemC model and Matlab software environment is illustrated in Figure 16.The implemented SystemC model contains the same functional blocks as are illustrated in Figure 14 and, together with the surrounding Matlab environment, principally the same functionality as in Figure 3.
We used this SystemC model to see how the MEE curves of the proposed MGD algorithms behave when the hardware model is used.HRC and NEML algorithms were implemented for reference purposes.At first stage of MEE testing, we noticed that the envelopes (pow nc = 1) generated with the SystemC model did have a constant negative offset.This can be seen clearly in Figure 17, where the blue-squared curve illustrating the SystemC hardwarebased MEE of NEML (Δ 1 = 0.1 chips) has a negative offset when comparing to the black-star, plain Matlab based, and reference curve.On the other hand, the hardware model's MEE shares the same shape with the ideal reference one.
The reason for this behavior was found to be the imperfect frequency generation inside the hardware tracking channel.When both code generating and sampling frequency are generated with the NCO, there is a possibility to a sample slip if the NCO's frequency resolution is too low.With no noise condition (as MEEs are generated), this has an effect on the shape of ideal autocorrelation function curve, making it not to have identical sides.We improved the output accuracy of NCOs by increasing the accumulation register size from 24 to 32 bits.This removed the offset from the discriminator output as can be seen from Figure 18, where red-diamond curve presents the MEE result with the new NCO size of 32 bits and blue-squared with NCO size of 24 bits.
After the issue of the NCO size was dealt with, we made a conclusion that the proposed MGD algorithms are implementable, and the implemented hardware architecture is solid for this purpose.An example curve for the uniformly spaced (Δ 1 = 0.1, Δ 2 = 0.2, Δ 3 = 0.The blue-squared curve presents the case when NCO register size was 24 bits; the red-diamond line is for the case of 32 bits.

VHDL implementation and synthesis
After the architecture of the hardware tracking channel and its functionality with the proposed MGD structure were verified with the SystemC hardware model, we build a VHDL model of the tracking channel.VHSIC hardware description language (VHDL) is a language designed and optimized for describing the behavior of the digital systems, and it is one of the standard languages among the electronic engineers [16].Since the VHDL needs a simulator software for simulation, we used ModelSim software and tool command language (TCL) scripts to run the simulations for MEE generation.The VHDL hardware-based MEE curve of the proposed MGD with both uniform (a = [1, −0.7, −0.2]) and decreasing (a = [1, −0.9, 0.2]) spacing implementations are illustrated in Figure 20.The blue-squared curve presents the uniform spaced MGD and the red-circled the decreasing spaced MGD, for both curves the common parameters were N g = 3, pow nc = 1, and Δ 1 = 0.25 chips.From the figure, we can see that the MEE curves of the hardware implemented MGDs are active in the limits set by the theoretical ones, illustrated in Figure 6.We used the synthesis results of the VHDL model to evaluate the implementation complexity of the proposed algorithms.The synthesis was done by using the Xilinx ISE software.We varied the number of correlators, since it is the characterizing quantity when choosing the MGD algorithm to be implemented (N g ).Our target device was the Xilinx Virtex II PRO field programmable gate array (FPGA).FPGAs are reprogrammable digital devices which can be used in tasks requiring a high processing speed, like tracking process [32].
The synthesis results are subjected to the target platform and, therefore, they can not be generalized.Because of this, we focused on the comparison between the complexity of uniform and decreasing delay spaced implementations, with a varying number of correlators.We synthesized only the delay register part of the hardware architecture since it is the only part that differs.The results are illustrated in Figure 21 and in Table 6.These results indicate that the hardware  complexity, measured as usage of target FPGA resources (equivalent logic gate count, logic slices, flip flops, and lookup tables), increases linearly with respect to amount of correlators (N g ) in uniform delay spaced implementations.In cases of decreasing delay spaced implementations,.the complexity increase is much faster.One must note that the left out part of the system adds a constant positive offset to the synthesis results.
Another difference between the implementations of uniform and decreasing delay spacings is in the increase of the generated frequencies when using the decreasing one.The proposed decreasing spacing structure with N g = 3 requires approximately four times higher frequency to be generated than uniformly spaced MGD structure with equivalent N g .This is because the smallest common uniform delay factor with the uniform spacing of Δ 1 = 0.25 chips is Δ 1 /2 = 0.125, but for the proposed decreasing spacing structure of Δ 1 = 0.25 chips it is Δ 1 /8 = 0.03125.This equals to the reference code delay register frequency increase from 8.184 MHz up to 32.736 MHz with Galileo E1 and GPS C/A signals, when their fundamental frequency is 1.023 MHz.Also the limitation caused by the RF front-end bandwidth is met much faster when using the decreasing spacing, compared with uniform spacing.

CONCLUSIONS
In this paper, a comprehensive description of Multiple Gate Delay tracking structures for GNSS signals in multipath environments has been introduced, covering all the steps from theoretical derivation and choice of design parameters till the final stage of prototyping.We showed that the proposed structures are implementable and that they have a high flexibility.We also explained in detail the design steps that should be taken in order to derive easily new MGD structures according to the target constraints (e.g., desired number of gate pairs, sampling frequencies, available bandwidths, etc.).We have discussed as well some aspects not taken into account in previous research papers, such as the effect of the nonlinearity type on the system performance, the design of gate spacings in multiple gate structures, and the effect of realistic PRN code lengths on the multipath error envelope analysis.We compared the MGD structures with uniform and decreasing spacings in terms of complexity, and we showed that the slightly better performance of MGDs with decreasing spacings is counter-balanced by a higher complexity, especially when the number of gate pairs increases.We showed that the state-of-art delay trackers, such as narrow correlator and double-delta correlators, can be seen as particular cases of MGD structures.We saw that the best choices in terms of two-path error envelopes are the MGDs with decreasing gate spacings and envelope nonlinearity.However, we also showed that, when the number of channel path increases, various MGD structures start to have equal performance, and the performance gap between narrow correlator and MGD structures disappears.Nevertheless, the main advantage of the proposed MGD structures is that they offer a large set of unpatented choices (at least according to the very best of authors' knowledge) that can be used for the design of multipath delay trackers for mass-market GPS and Galileo receivers.
phase path channel, with amplitudes 0 and −1 dB NEML HRC

Figure 3 :
Figure 3: Block diagram of MGD delay tracking structures.

Figure 4 :
Figure 4: Illustration of a triangular pulse Λ TB (t) of support 2T B .

Figure 9 :Figure 10 :
Figure 9: The basic diagram of GRANADA Bit-true software receiver simulator.

Figure 14 :
Figure 14: Implemented hardware tracking architecture with seven correlators.

Figure 16 :
Figure 16: Block diagram of the implemented high-level SystemC hardware model inside the Matlab software.

Figure 17 :
Figure 17: Difference between the hardware model and Matlabbased MEE curves of NEML discriminator (Δ 1 = 0.1).The black line presents the behavior of reference Matlab simulation, and the blue curve illustrates the behavior of the SystemC model.

Figure 18 :
Figure18: Effect of the NCO register length to the MEE curves of NEML discriminator (Δ 1 = 0.1 chips) created by the SystemC model.The blue-squared curve presents the case when NCO register size was 24 bits; the red-diamond line is for the case of 32 bits.

Figure 21 :
Figure 21: Synthesis of the architecture to the target device: effect of the number of correlators.
T B = T c /N B .Here, N B is a modulation-related parameter that is detailed in what follows.For example, if infinite bandwidth is assumed, p TB (t) is a rectangular pulse of unit amplitude if 0 ≤ t ≤ T B and 0 otherwise.
Delay registers with decreasing spacingFigure 15: Implemented delay registers: (a) uniform delay spacing, (b) decreasing delay spacing.Ref code is reference code chip value from code generator, VVE, VE, E are early, P is prompt, and L, VL, VVL late outputs of the delay register.The relative correlator spacings (Δ 1 , Δ 2 , Δ 3 ) are illustrated on top.Z −1 is the smallest uniform delay.