Fourier Spectroscopy: A Bayesian Way

The concepts of standard analysis techniques applied in the field of Fourier spectroscopy treat fundamental aspects insufficiently. For example, the spectra to be inferred are influenced by the noise contribution to the interferometric data, by nonprobed spatial domains which are linked to Fourier coefficients above a certain order, by the spectral limits which are in general not given by the Nyquist assumptions, and by additional parameters of the problem at hand like the zero-path difference. To consider these fundamentals, a probabilistic approach based on Bayes’ theorem is introduced which exploits multivariate normal distributions. For the example application, we model the spectra by the Gaussian process of a Brownian bridge stated by a prior covariance. The spectra themselves are represented by a number of parameters which map linearly to the data domain. The posterior for these linear parameters is analytically obtained, and the marginalisation over these parameters is trivial. This allows the straightforward investigation of the posterior for the involved nonlinear parameters, like the zero-path difference location and the spectral limits, and hyperparameters, like the scaling of the Gaussian process. With respect to the linear problem, this can be interpreted as an implementation of Ockham’s razor principle.


Introduction
Fourier spectroscopy is a diagnostic application which reveals information about spectral quantities like refractive index, absorption, and transmission of a medium under test. In addition, the characterisation in absolute terms is possible for broadband spectra, for example, emitted by electrons of a high-temperature plasma, being magnetically confined [1].
Commonly, an interferometer diagnostic, say of the Michelson [2] or Martin-Puplett [3] design type, probes the Fourier transform of a spectral quantity. The corresponding interferometric data is a discrete set of finite length and includes noise contributions. Standard Fourier data analysis techniques [4][5][6] have been developed. These techniques fail to properly describe and capture several fundamental aspects, such as the noisy nature of measured data, possible spectral limits, and their impact on the spectral quantity to be inferred.
One misconception, arising from the standard formulation, is that certain spectral information must be lost inherently, because only a finite amount of data is acquired. This is proven in standard literature by evaluating the convolution function which has a finite full width at half maximum (FWHM), implying that only a finite amount of Fourier coefficients is accessible via measurements. While this conclusion remains valid when a continuous spectrum is probed, the reasoning does not hold in general for a discrete spectrum. This fact was exploited to develop a (self-) deconvolution procedure, so that some discrete lines which were separated by less than the FWHM width of the convolution function have been inferred [7].
Opposed to the standard data analysis techniques, a probabilistic ansatz was introduced to estimate model parameters, like amplitude spectra and frequencies, in the field of Fourier spectroscopy [8,9]. For a spectroscopic problem, a model is formulated via Bayes' theorem which allows to state prior information about model parameters. Furthermore, a Gaussian likelihood connects functionally the parameters with the noisy interferometric data. Then, after having measured a noisy data set and framing the reality by a certain model, 2 International Journal of Spectroscopy the knowledge about model parameters is expressed in probabilistic terms by the posterior. This approach [8,9] demonstrated that the uncertainty on the posterior mean of a frequency to be estimated for a single-frequency problem can be orders of magnitude below the FWHM width of the equivalent convolution function. On top of that, a criterion has been derived how far frequencies need to be separated, so that the Bayesian approach is still able to make a distinction. This separation can be well below the FWHM width of the convolution function. These findings are a direct consequence of the use of probabilistic theory.
One of the advantages of the Bayesian approach is that different models and, hence, their assumptions can be compared with each other, an approach also known as Ockham's razor. This enables the identification of the best, that is, the most likely, model complying with the data. Given this context, the (self-)deconvolution procedure mentioned above is interpreted here as an optimisation to find a minimal set of discrete frequencies that describes the data sufficiently. In general, the most fundamental issue is whether the spectrum to be inferred is discrete or continuous. If a discrete spectrum is more likely or follows from a physics model, how many discrete frequencies are involved, and what are their estimates including uncertainties? Each frequency is associated with an amplitude and a phase which need to be estimated as well. If a continuous spectrum is at hand, then the spectral limits are of interest. In addition, in case the underlying physics process is understood, what is the uncertainty on the spectrum and the phase following from experimentally inaccessible regions in the data domain?
The investigation of the fundamental issues listed above is quite challenging from a numerical point of view. However, the computational effort is largely reduced when even and odd amplitudes are used instead of the phase and amplitude. This ensures a linear dependence between the even and odd amplitude parameters and the data. Formulating the prior information about all even and odd amplitudes as a multivariate normal with a specific prior mean and covariance straightforwardly gives the posterior mean and covariance. Furthermore, the marginalisation can be carried out analytically for these linear parameters. The remaining posterior quantity carries information about the nonlinear model parameters, like frequencies or spectral limits, and so-called hyperparameters, which enter merely in the prior.
In Section 2, the basic equations for Fourier spectroscopy and their implications are generally investigated, and the main concepts of the standard analysis and their drawbacks are pointed out. Section 3 of this paper presents a Bayesian formalism, so that it lends itself to applications in the field of Fourier spectroscopy for Gaussian (white) noise. The fundamental information about the spectral quantity, that it must vanish at a lower and upper spectral limit, can be stated by the covariance function of a Brownian bridge process as described in Section 4. This covariance is used as prior in the example application of the Bayesian approach presented in Section 5 to infer continuous but band-limited spectral quantities given an actually measured interferometric data set. Thereby, some diagnostic imperfections like a drifting signal offset, the zero-path difference, and a nonuniform spatial sampling are also taken into account. Section 6 discusses the strategy for a plausibility study of models, using different priors for the spectral quantities, and attempts to compare results and computational efforts obtained with the Bayesian model and with a standard model. The last section presents the conclusions.

General Definitions.
Commonly, the complementary coordinates used for Fourier transformations are the frequency $\nu$ and time $t$, or the wavenumber $k$ and a spatial coordinate $x$. For the moment, the latter pair is used to state the basic operations of Fourier transformation. Afterwards, the wavenumber is replaced via $k = \nu/c$ for convenience.
The real-valued continuous functions ( ) and ( ) form a Fourier transformation pair stated by The inverse operation reads Note that so far ̸ = holds. To find a relation, one inserts (1) in the above expression. After applying trigonometric identities, the spatial integral becomes follow, because the two wavenumber coordinates equal each other. In fact, the delta distribution occurs, because is a distribution.
Replacing with / gives in the spectral domain the quantity ( ) = / . This gives and rescales the inverse operation like [ cos (2 ) + sin (2 )] , (7) and it becomes clear that the even and the odd parts of and are connected. Another representation uses the amplitude | | and the phase by setting Because the model representations (5) and (7) are linear in , , and , both are favoured over the formulation (9), when linear inversion techniques are to be applied. Of both favoured representations, the one, using even and odd parts, has preferred properties. Since the orders of magnitude of the amplitudes can be quite different for the even and odd functions, a separation is logical.
Here, the assumption must be mentioned that both functions have the same spectral limits. This is assumed in the following but is not mandatory. Furthermore, for any combination of centre frequency $\nu_c$ and bandwidth $\Delta\nu$, the relation $\Delta\nu \le 2\nu_c$ must be fulfilled to be meaningful.
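As a minimal sketch of this band parametrisation, the following Python snippet converts a centre frequency and a bandwidth into lower and upper spectral limits and enforces the condition that the bandwidth must not exceed twice the centre frequency (so that the lower limit is not negative). The function name `spectral_limits` and the choice of GHz as unit are assumptions made for illustration only.

```python
def spectral_limits(centre_ghz, bandwidth_ghz):
    """Return (lower, upper) spectral limits in GHz for a band given by
    its centre frequency and bandwidth (hypothetical helper)."""
    if bandwidth_ghz < 0.0:
        raise ValueError("bandwidth must be non-negative")
    # A meaningful band has a non-negative lower limit:
    # bandwidth <= 2 * centre frequency.
    if bandwidth_ghz > 2.0 * centre_ghz:
        raise ValueError("bandwidth exceeds twice the centre frequency")
    half_width = 0.5 * bandwidth_ghz
    return centre_ghz - half_width, centre_ghz + half_width
```

For the band quoted for Figure 1 (centre frequency 500 GHz, bandwidth 1000 GHz), `spectral_limits(500.0, 1000.0)` gives the interval from 0 GHz to 1000 GHz.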

Relation: Fourier Transform-Fourier Coefficients.
When the bandwidth is finite, one can express the spectral functions by Fourier coefficients multiplied each with the associated sinusoidal basis function of order (∈ ). These coefficients are defined here by the integrals which carry the unit as and . The coefficients ( ),0 label the mean values of and in the spectral domain covered. Then, one can replace the even and odd functions with which allows performing the spectral integration in (10) analytically. Since the result with follows, it becomes clear that the Fourier transform of a bandlimited function can be expressed by Fourier coefficients scaled with the bandwidth and multiplied each with the associated continuous basis function in the spatial domain. These basis functions have two contributions. The first is a sum/difference of two sinc ± functions which depend on the order , the spatial coordinate, and the bandwidth. The latter quantity determines the spatial width of sinc ± . Furthermore, the localisation is permitted at = ± /Δ , where a coefficient for a given mainly acts. Hence, increasing the order implies the localisation at a larger distance from the spatial origin. This explains the occurrence of factor 2 for ( ),0 for which both sinc functions coincide.
The second contribution causes a modulation of sinc ± and is given by a sine/cosine with the centre frequency and spatial coordinate in the argument. This dependency makes the basis function for ,0 vanish at the spatial origin. With respect to the spatial origin, the transformed basis functions of the coefficients for and are symmetric and antisymmetric, respectively. Some basis functions in the spatial domain are shown in Figures 1(a) and 1(b) for = 500 GHz and Δ = 1000 GHz.

Figure 1: Basis functions in the spatial domain of the Fourier coefficients (of order $n$ = 0, 1, and 12) for the even spectral function (a) and for the odd spectral function (b). The spectral functions are finite for the interval from 0 GHz to 1000 GHz, for which the centre frequency $\nu_c = 500$ GHz and the bandwidth $\Delta\nu = 1000$ GHz follow. The basis functions in the spatial domain are given by a sum or difference of two sinc functions, each depending on the optical path difference $x$, $\Delta\nu$, and $n$. Furthermore, the sinc functions are modulated by a sine or cosine with $\nu_c$ and $x$ in the argument. The sinc functions for fixed $n$ become unity at $x = \pm nc/\Delta\nu$, obviously increasing in $n$.

Embedding into Larger Spectral Domain.
The functions and may be finite in the spectral domain with limits and or centre frequency and bandwidth Δ . Embedding this domain in a larger one with limits < and > (Δ < Δ ), another set of Fourier coefficients ( ), and ( ), is obtained with associated basis functions for the spectral domain. Without going into more detail here, these coefficients can be evaluated from ( ), and ( ), and the scalar products of the basis functions labeled with and . For instance, one finds for the ratio of the means ( ),0 / ( ),0 = Δ /Δ < 1. Basically, ( ), and ( ), maximise the information per coefficient when and are known. For example, a function which is constant inside a spectral domain and zero outside appears as a boxcar function from outside this domain. Thus, the only coefficient ( ),0 is mapped to an infinite number of coefficients ( ), and ( ), which are mandatory to capture both discontinuities.
In the spatial domain, the basis functions for the coefficients of the embedded domain behave differently from the ones for the coefficients of the larger domain. It is important to mention the effect of the larger bandwidth: the sinc functions of the larger domain are spatially narrower than those of the embedded domain. In addition, the number of coefficients per spatial domain increases, which is expressed by the ratio of the two bandwidths. For a centre frequency of 1873.7 GHz and a bandwidth of twice this value, Figure 2 shows some basis functions (orders 0, 1, and 2) in the spatial domain. Indeed, the zeroth-order basis function for a centre frequency of 500 GHz and a bandwidth of 1000 GHz (see Figure 1(a)) is broader.

Figure 2: Basis functions in the spatial domain of the Fourier coefficients with order 0, 1, and 2 for the even spectral function, being finite in the spectral domain with a centre frequency of 1873.7 GHz and a bandwidth of twice this value. For this domain, the sinc functions contributing to the basis functions are narrower than for a centre frequency of 500 GHz and a bandwidth of 1000 GHz (see Figure 1(a)). As an example, the zeroth-order basis function for the latter domain (dashed black) is given. Since the larger bandwidth is approximately 4 times the smaller one, approximately 4 times more Fourier coefficients are located inside a given spatial domain. The larger bandwidth was chosen to equal the Nyquist frequency $\nu_{Ny} = c/(2\Delta x)$, having set the spatial sampling increment $\Delta x = 40$ µm (black dots).
2.4. Parseval's Theorem.
Parseval's theorem states abstractly that the length of a function in the spectral domain equals the length of its Fourier transform counterpart in the spatial domain. The length of the band-limited function is stated by because only the term remains odd and cancels by the integration. Furthermore, the scaling by the factor appears which originates in / = . Replacing ( ) with the expression (12) and exploiting that the basis functions are perpendicular in the spectral domain leaves Thus, the length is given by the sum of the Fourier coefficients squared. According to Parseval's theorem, the length in the spectral and spatial domain remains unchanged. Inserting (13) in the above expression implies that the spatial basis functions must be orthogonal for 1 ̸ = 2 , and the spatial integral yields /(2Δ ) for 1 = 2 = 0 and /Δ for 1 = 2 > 0. Analytically, this is hard to prove; however, this was numerically investigated and is considered to be valid.

Square-Integrable Functions.
The function is said to be square-integrable, when the condition 2 < ∞ holds. Furthermore, if is square-integrable, then the Fourier series representations in (12) converges towards and almost everywhere in the spectral domain as the order grows [10]. Hence, the requirement on to be square-integrable seems reasonable.

Ideal and Real-World Interferometer.
The Fourier transform can be performed by an interferometer, achieving an optical path difference between two partial beams, and the real-valued interferogram can be sampled as a function of this optical path difference. From a theoretical point of view, with an ideal interferometer diagnostic, a purely symmetric and noiseless interferogram is acquired. However, a real-world interferometer suffers from diagnostic imperfections such as dispersion of any kind and/or misalignment. As a consequence, any acquired interferogram is to some degree asymmetric, and, hence, an odd feature is inherent due to the measuring principle. Furthermore, a measurement always involves noise.

Spatial Sampling and Implications.
In the spatial domain, ( ) is sampled at a finite set of optical path difference locations with ∈ [1, ], and marks the number of sample points. Usually, the sampling with constant increment Δ = +1 − between subsequent locations is preferred which puts constraints on the diagnostic design. Furthermore, the spatial origin is most likely missed by the sampling, and, thus, the absolute value of might be unknown. If so, it is mandatory to introduce the zero-path difference 0 which is in the following set that 1 = 0 holds.
The finite spatial sampling leaves the interferogram undetermined between the sampling nodes and outside the limits $x_1$ and $x_N$. Assuming $\Delta x = \text{const.}$ holds, the Nyquist theorem states that the maximum frequency accessible is given by the Nyquist frequency $\nu_{Ny} = c/(2\Delta x)$. Hence, for the even and odd spectral quantities to be inferred, a maximum for the upper spectral limit, equal to $\nu_{Ny}$, follows from sampling theory. To prevent aliasing, $\Delta x$ needs to be chosen small enough so that both spectral quantities vanish below $\nu_{Ny}$. If required, the upper spectral limit can be acted on by reducing the diagnostic throughput via optical filters, the transmission line, the detector sensitivity, and postdetection amplifier settings. In fact, solely by these precautions, one can make sure that no other band well above $\nu_{Ny}$ contributes to the interferogram. If and only if no such band exists, the interferogram is smooth with respect to the chosen sampling nodes, and failing to sample exactly at the spatial origin has no profound impact.
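The Nyquist limit for a constant optical path increment can be checked with a few lines of Python. This is a minimal sketch; the function names are hypothetical, and the increment of 40 µm is the value quoted elsewhere in this paper for the spatial sampling.

```python
SPEED_OF_LIGHT = 299_792_458.0  # m/s, exact by SI definition


def nyquist_frequency_hz(path_increment_m):
    """Nyquist frequency c / (2 * dx) for a constant optical path
    difference increment dx given in metres."""
    if path_increment_m <= 0.0:
        raise ValueError("increment must be positive")
    return SPEED_OF_LIGHT / (2.0 * path_increment_m)


def is_alias_free(upper_limit_hz, path_increment_m):
    """True if the upper spectral limit does not exceed the Nyquist
    frequency of the chosen sampling increment."""
    return upper_limit_hz <= nyquist_frequency_hz(path_increment_m)
```

With an increment of 40 µm, `nyquist_frequency_hz(40e-6)` evaluates to about 3747.4 GHz, so a band ending at 1000 GHz is sampled without aliasing, whereas a band extending to 5000 GHz is not.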
A diagnostic limitation is that the distance − 1 is finite, and, thus, no sampling is achieved below and above these limits. To gain information about and or the phase (see (8)), ( ) needs to be sampled on both sides of the spatial origin, so that the asymmetric feature in the interferogram is captured. Hence, in the following 1 and, thus, 0 are set to be negative, and the double-sided region is identified for the locations | | ≤ | 0 |. This diminishes the maximum optical path difference achievable, being positive, and, thus, the length of the single-sided domain is identified by the relation | 0 | < ≤ . Because scales with the order of the Fourier coefficients (see Section 2.3), only a finite number of coefficients can be probed. According to Parseval's theorem (see Section 2.4), information about the total length is missing. Furthermore, the Gibbs phenomenon, that is, a ringing, is present when and are inferred. Hence, should be maximal, so that as many as possible coefficients can be probed to decrease the loss of information. However, a trade-off between lengths of the single-sided and double-side domains is inevitable, depending on the level of the asymmetric imperfection.

Noise Contribution.
Since any measurement has a noise contribution, each measured data value can be written as the sum of the noiseless interferogram value at the sampling location and a noise contribution $\varepsilon_i$, and the actual interferometric data is expressed by a vector which is the sum of the noiseless data vector and the noise vector $\vec{\varepsilon}$. As spectral quantities are investigated, photons are involved in the measuring principle, and, hence, a part of $\varepsilon_i$ has a Poissonian origin. However, the diagnostic investigated later probes broadband spectra in the microwave and far-infrared range, and, thus, a large number of photons is present. Hence, the central limit theorem suggests that $\varepsilon_i$ is a sample of a normal distribution with vanishing mean and a certain variance given by the squared noise level $\sigma^2$. In any case, dedicated diagnostic tests are mandatory to characterise $\varepsilon_i$ for a given interferometer. The basic model is a starting point and must be amended by diagnostic imperfections and specifics of the interferometer design type.
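The additive Gaussian noise model can be illustrated by a short simulation. The sketch below is not the diagnostic model of this paper; it only adds zero-mean normal noise with a given noise level to a noiseless sequence. The names `add_gaussian_noise` and `sample_std` are hypothetical helpers.

```python
import math
import random


def add_gaussian_noise(noiseless, noise_level, seed=None):
    """Return noiseless values plus independent N(0, noise_level^2) noise."""
    rng = random.Random(seed)  # seeded generator for reproducibility
    return [value + rng.gauss(0.0, noise_level) for value in noiseless]


def sample_std(values):
    """Standard deviation of the values around their sample mean."""
    mean = sum(values) / len(values)
    return math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
```

Applying `sample_std` to a long noisy sequence generated from zeros recovers the chosen noise level to within the statistical scatter, which is the kind of check a dedicated noise characterisation would start from.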

Inferring Spectra by Standard Analysis Techniques.
To infer the spectral quantities and from an interferometric data set → , the standard techniques rely on a noiseless model and follow a hierarchical ansatz. After making assumptions on the spectral limits and , the zero-path difference location 0 is estimated. The next step evaluates a phase which is a measure of the ratio / , relying on the data located in the double-sided region. Given the model, the spectral limits, the spatial origin, and the phase, and are estimated up to the Nyquist frequency from the whole data set. To reduce the Gibbs phenomenon on the inferred spectral quantities, window functions are multiplied to the interferometric data. In the following, the weak points of the standard analysis techniques are described. Each step relies on model assumptions, which are not questioned or tested in any way, and results of previous steps, which carry an unstated uncertainty. This hierarchical ansatz lacks the uncertainty propagation onto and entirely.

Spectral Limits: Nyquist Assumptions.
Two fundamental assumptions, called Nyquist assumptions in the following, are made by setting the spectral limits to 0 and the Nyquist frequency Ny = /(2Δ ) (see Section 2.5.2). Hence, the chosen spatial sampling would determine the bandwidth of the spectrum which is a misconception. Furthermore, the associated Fourier coefficients are located Δ apart via their basis functions in the spatial domain, and the maximum order probed is artificially blown up to /(2Δ ) due to the Nyquist assumptions (see Section 2. The uncertainty of a Fourier coefficient, relying on the Nyquist assumptions, scales like / Ny (noise level/maximum bandwidth) which follows from the linear uncertainty propagation for (13). But for the band-limited case, the uncertainty would scale like (Δ / Ny ) 1/2 /Δ , where the square root term states that more than one data point is related to one coefficient. Hence, if a band-limitation exists but is not taken into account, then the uncertainty is maximised on the inferred spectral quantities.

Estimation of Spatial Origin.
The spatial origin or zero-path difference $x_0$ is most likely missed by the spatial sampling. One of the standard approaches to estimate $x_0$ fits a parabola to the main interferogram peak without any information about the even and odd spectra themselves. However, as one can see from (13), the basis functions for the even and odd absolute terms (the zeroth-order coefficients) are of leading order close to the spatial origin. Hence, information about the zeroth-order coefficients and the spectral limits should be at hand for the estimation of $x_0$.
With $x_0$ available, the double-sided and single-sided regions are identified. However, a systematically affected estimate of the origin causes an additional asymmetry in the interferometric data, which would result in an increase of the odd and a decrease in the even spectral function, usually interpreted as a phase ramp feature. Hence, the origin should be determined with the criterion that it minimises the odd spectral function.
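The parabola-based standard approach mentioned above can be sketched as a three-point interpolation around the largest sample. This is a generic illustration under the assumption of equidistant samples, not the exact procedure of any particular analysis code; the function name is hypothetical.

```python
def parabola_peak_location(samples, increment):
    """Locate the peak of an equidistantly sampled signal by fitting a
    parabola through the maximum sample and its two neighbours.
    The location is measured from the position of the first sample."""
    k = max(range(len(samples)), key=lambda i: samples[i])
    if k == 0 or k == len(samples) - 1:
        raise ValueError("maximum lies on the boundary of the data set")
    y_left, y_mid, y_right = samples[k - 1], samples[k], samples[k + 1]
    curvature = y_left - 2.0 * y_mid + y_right
    if curvature >= 0.0:
        raise ValueError("no unique maximum around the largest sample")
    # Vertex of the parabola through the three points.
    offset = 0.5 * increment * (y_left - y_right) / curvature
    return k * increment + offset
```

If the first sample is taken to sit at the zero-path difference, as assumed in this paper, the estimate of the zero-path difference is the negative of the returned peak location. Note that this estimate uses no information about the spectra, which is the weakness pointed out above.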

Windowing.
Having only a finite number of Fourier coefficients probed causes the Gibbs phenomenon to appear for the spectral quantities inferred. To reduce this ringing feature, window functions are applied in the spatial domain to bring the interferometric data smoothly to zero towards the sampling limits. More precisely, probed Fourier coefficients of higher orders are damped out, and a window function corresponds to a certain convolution function in the spectral domain. Hence, a weighted averaging of the spectral quantities is carried out which reduces the ringing.
This approach can give a good global approximation of and for regions with no significant gradients. However, the damping of Fourier coefficients worsens the convergence of the inferred quantities in regions with considerable gradients.
Implicitly, the application of window functions excludes the investigation of the uncertainty on and introduced by nonprobed Fourier coefficients. Hence, the requirement of square-integrability of the spectral functions is not taken into account.
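To make the windowing step concrete, the sketch below multiplies a data sequence by a Hann window, which brings the data smoothly to zero at both ends. The Hann window is only one common choice and is an assumption here, since the text does not single out a specific window; for simplicity the window is centred on the middle of the sequence rather than on the zero-path difference.

```python
import math


def hann_window(n_points):
    """Hann window with n_points samples, zero at both ends."""
    if n_points < 2:
        raise ValueError("at least two points are required")
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * i / (n_points - 1))
            for i in range(n_points)]


def apply_window(data, window):
    """Multiply data and window element by element."""
    if len(data) != len(window):
        raise ValueError("data and window must have the same length")
    return [d * w for d, w in zip(data, window)]
```

The damping towards the ends of the sequence is exactly what suppresses the higher-order Fourier coefficients, and with it any information those coefficients carry about steep spectral gradients.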

Bayesian Formalism
3.1. Bayes' Theorem.
The joint probability density function (pdf) $p(d, v)$ captures the chance that the outcome $d$, let us say a data value or set, and the outcome $v$, a single model parameter or a set, are realised simultaneously.
The product rule
$$p(d, v) = p(d \mid v)\,p(v) = p(v \mid d)\,p(d)$$
introduces the conditional probabilities for finding the outcome $d$ if the outcome $v$ were true and vice versa. By the theorem of Bayes,
$$p(v \mid d) = \frac{p(d \mid v)\,p(v)}{p(d)},$$
one conditional probability can be expressed by the other when the marginal distributions $p(d) = \int p(v \mid d)\,p(d)\,\mathrm{d}v$ and $p(v) = \int p(d \mid v)\,p(v)\,\mathrm{d}d$ are known. Hence, Bayes' theorem captures the information/knowledge gained for $v$ when a certain outcome for $d$ has manifested. For the pdfs occurring in Bayes' theorem, common names are used, that is, the posterior $p(v \mid d)$, the likelihood $p(d \mid v)$, the evidence $p(d)$, and the prior $p(v)$. The link or functional dependence $d = f(v) + \varepsilon$ enters in the likelihood, which takes into account known uncertainties such as measurement noise. Any knowledge about $v$ before new data is available can be found in the prior $p(v)$.
Bayes' rule can be extended by introducing a set of hyperparameters $h$ which enters by definition solely in the prior $p(v \mid h)$. The additional pdf $p(h)$ is called hyperprior, which allocates trust in $h$. Apart from having the posterior for the parameters $v$, the marginalisation with respect to $v$ reveals the posterior for $h$, which measures the plausibility of an outcome of the hyperparameter given the data. Since $p(d)$ does not depend on $h$, the most likely hyperparameter set is identified by the maximum of $p(d \mid h)$, assuming $p(h)$ is uniform.

Multivariate Normal.
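Let the joint pdf for the random vector $\vec{x}$ ($\in \mathbb{R}^{n}$) be a multivariate normal with mean $\vec{\mu}$ ($\in \mathbb{R}^{n}$) and covariance matrix $\Sigma$ ($\in \mathbb{R}^{n \times n}$); then the pdf becomes
$$N(\vec{x}; \vec{\mu}, \Sigma) = \frac{1}{(2\pi)^{n/2}\,|\Sigma|^{1/2}} \exp\left[-\frac{1}{2}\,(\vec{x} - \vec{\mu})^{T}\,\Sigma^{-1}\,(\vec{x} - \vec{\mu})\right]$$
with the determinant $|\Sigma|$.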
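For numerical work, the logarithm of the multivariate normal pdf is usually evaluated via a Cholesky factorisation of the covariance, which yields both the determinant and the quadratic form without an explicit matrix inverse. The following pure-Python sketch illustrates this; the function names are hypothetical, and a numerical library would normally be used for larger problems.

```python
import math


def cholesky(matrix):
    """Lower-triangular Cholesky factor of a symmetric positive definite matrix."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            partial = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                pivot = matrix[i][i] - partial
                if pivot <= 0.0:
                    raise ValueError("covariance is not positive definite")
                lower[i][j] = math.sqrt(pivot)
            else:
                lower[i][j] = (matrix[i][j] - partial) / lower[j][j]
    return lower


def mvn_log_pdf(x, mean, covariance):
    """Logarithm of the multivariate normal pdf evaluated at x."""
    n = len(x)
    lower = cholesky(covariance)
    residual = [xi - mi for xi, mi in zip(x, mean)]
    # Forward substitution: solve lower * z = residual.
    z = []
    for i in range(n):
        partial = sum(lower[i][k] * z[k] for k in range(i))
        z.append((residual[i] - partial) / lower[i][i])
    quadratic_form = sum(zi * zi for zi in z)
    log_determinant = 2.0 * sum(math.log(lower[i][i]) for i in range(n))
    return -0.5 * (n * math.log(2.0 * math.pi) + log_determinant + quadratic_form)
```

Working with the logarithm avoids numerical underflow, which matters when the data vector has several hundred components.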

Model for Linear Problem.
If the dependency between the data and the parameters of interest is linear, and the likelihood and the prior can be expressed by multivariate normals, then the evaluation of the posterior is analytically straightforward. Such a model is the starting point for investigating a more complex model which includes parameters with a nonlinear mapping to the data domain and/or hyperparameters.
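(a) Gaussian Likelihood. The data may be represented by the vector $\vec{d}$ ($\in \mathbb{R}^{N}$). The parameters of interest $\vec{a}$ ($\in \mathbb{R}^{K}$) map linearly to the data domain like
$$\vec{d} = M\vec{a} + \vec{\varepsilon},$$
where the $N \times K$ dimensional matrix $M$ encodes the linear mapping, and $\vec{\varepsilon}$ captures the random noise contribution. When the data is acquired independently, and the noise is independent for each datum and follows a Gaussian $\varepsilon_i = N(0, \sigma^2)$ with vanishing mean and standard deviation $\sigma$ (noise level), then the Gaussian likelihood
$$p(\vec{d} \mid \vec{a}) = \frac{1}{(2\pi)^{N/2}\,|\Sigma_d|^{1/2}} \exp\left[-\frac{1}{2}\,(\vec{d} - M\vec{a})^{T}\,\Sigma_d^{-1}\,(\vec{d} - M\vec{a})\right]$$
can be found with the covariance matrix $\Sigma_d = \sigma^2 \mathbb{1}$.
The prior information about $\vec{a}$ may be expressed by the multivariate normal
$$p(\vec{a}) = \frac{1}{(2\pi)^{K/2}\,|\Sigma_{Pr}|^{1/2}} \exp\left[-\frac{1}{2}\,(\vec{a} - \vec{\mu}_{Pr})^{T}\,\Sigma_{Pr}^{-1}\,(\vec{a} - \vec{\mu}_{Pr})\right]$$
with the prior mean $\vec{\mu}_{Pr}$ and the prior covariance $\Sigma_{Pr}$.
After some algebra, one can show that the posterior is a multivariate normal with posterior mean and covariance
$$\vec{\mu}_{Po} = \Sigma_{Po}\left(M^{T}\Sigma_d^{-1}\vec{d} + \Sigma_{Pr}^{-1}\vec{\mu}_{Pr}\right), \qquad \Sigma_{Po} = \left(M^{T}\Sigma_d^{-1}M + \Sigma_{Pr}^{-1}\right)^{-1},$$
which are both analytically obtained. Furthermore, the evidence reads
$$p(\vec{d}) = \frac{\exp\left[-\frac{1}{2}\,\vec{d}^{T}\Sigma_d^{-1}\vec{d}\right]}{(2\pi)^{N/2}\,|\Sigma_d|^{1/2}} \times \left(\frac{|\Sigma_{Po}|}{|\Sigma_{Pr}|}\right)^{1/2} \exp\left[\frac{1}{2}\,\vec{\mu}_{Po}^{T}\Sigma_{Po}^{-1}\vec{\mu}_{Po} - \frac{1}{2}\,\vec{\mu}_{Pr}^{T}\Sigma_{Pr}^{-1}\vec{\mu}_{Pr}\right],$$
where the first part depends explicitly on the measured data, and the second part, being dimensionless, incorporates the ratio dependent on the means and covariances of the prior and posterior.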
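The structure of the analytic posterior is easiest to see in the special case of a single linear parameter, where all matrices reduce to scalars: the posterior precision is the sum of the prior precision and the data precision, and the posterior mean is the precision-weighted combination of prior mean and data. The sketch below implements only this scalar case of the standard Gaussian result; the function name and argument names are hypothetical.

```python
import math


def scalar_posterior(data, mapping, noise_level, prior_mean, prior_std):
    """Posterior mean and standard deviation of one amplitude a for the
    model d_i = m_i * a + noise, with Gaussian noise and Gaussian prior."""
    prior_precision = 1.0 / prior_std ** 2
    data_precision = sum(m * m for m in mapping) / noise_level ** 2
    posterior_variance = 1.0 / (prior_precision + data_precision)
    weighted_sum = (prior_mean * prior_precision
                    + sum(m * d for m, d in zip(mapping, data)) / noise_level ** 2)
    return posterior_variance * weighted_sum, math.sqrt(posterior_variance)
```

With a very broad prior the result approaches the least-squares estimate, and with an uninformative mapping (all entries zero) the prior is returned unchanged, which is the regularising behaviour exploited when parameters are poorly constrained by the data.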

Model for Linear, Nonlinear, and Hyperparameter Problem.
The linear model is amended by hyperparameters, entering in some way in the prior, and parameters with a nonlinear connection to the data domain. Such a model is then applicable in the field of Fourier spectroscopy. (b) Priors. The Gaussian prior for → should be given by where the prior mean and covariance depend on some of the hyperparameters → ℎ . Similarly, a prior ( → | → ℎ ) follows for the nonlinear parameters. Finally, the hyperparameters have an assigned prior ( → ℎ ).
(c) Posteriors and Evidence. According to Bayes' theorem, one can write the joint posterior like and the conditional amplitude posterior for → becomes a multivariate normal Thus, both, the conditional posterior mean → Po and covariance Σ Po evaluated by (29) and (30), depend on the nonlinear parameters and hyperparameters. After the trivial marginalisation with respect to → , the joint posterior for → and → ℎ remains. By expressing the posterior, named settings posterior in the following, like the evidence is identified with Note, that the dimensionless constant , and, thus, the evidence depend on the chosen model, including likelihood and priors. Hence, is of importance, when the model is even further abstracted or compared with alternative models.
(d) Role of Settings Posterior. The optimisation, that is, finding the maximum of the settings posterior, can be interpreted as an implementation of Ockham's razor principle and/or as a regularisation procedure. This is essential when the number of parameters exceeds the number of data points.
Unfortunately, a general analytical expression is not available for this posterior, and, thus, it needs to be investigated numerically for the problem at hand. In order to do so, the quantity is of interest, because it is numerically accessible. In case, × has a well distinguishable global maximum, × can be approximated by a multivariate normal which is estimated by evaluating the Hessian matrix. Thus, one finds the posterior accordingly. The prior mean → Pr is set to 0, and the priors ( → | → ℎ ) and ( → ℎ ) are chosen to be uniform. Then, one can

Brownian Bridge Covariance
The continuous even and odd spectral functions to be inferred can each be modelled by a Gaussian process [11]. The Brownian bridge process is a good starting point because it exploits a fundamental condition to prevent aliasing in Fourier spectroscopy applications. This condition states that the spectrum and, thus, the even and odd spectral functions must vanish at the spectral origin and at an upper limit which is smaller than the Nyquist frequency (see Section 2.6.3). However, this information is usually not taken into account any further in the analysis. In contrast, a Brownian bridge and its associated covariance function fulfil the boundary conditions for any lower and upper limit. Hence, the covariance can be used in the Gaussian prior for the even and odd spectral functions. In addition, this process has only one scaling hyperparameter, which makes it attractive from a data analysis point of view. This scaling can be estimated as well from the Fourier coefficients probed. In fact, this reveals information about the nonprobed coefficients and gives an additional uncertainty on the even and odd spectral functions. After presenting some properties of the Brownian bridge covariance, it is used as prior covariance in the example application (see Section 5).

Standard Definition.
The Brownian bridge is a continuous stochastic process for an interval, say from 0 to $T$. This bridge is constructed by tying down a Brownian motion process to 0 at the end of the interval in question. Furthermore, the tie-down at the beginning of the interval is inherited from the Brownian motion process. The covariance function for the bridge is defined in standard literature by
$$\Sigma_{BB}(s, t) = \min(s, t) - \frac{s\,t}{T}$$
for $s$ and $t \in [0, T]$.
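The standard Brownian bridge covariance, min(s, t) - s*t/T on the interval from 0 to T, is simple to evaluate, and a short sketch makes the tie-down at both ends explicit. The function names below are hypothetical, and the unit scaling of the process is assumed to be one.

```python
def brownian_bridge_covariance(s, t, interval_length):
    """Covariance min(s, t) - s * t / T of a standard Brownian bridge
    on the interval [0, T]."""
    if not (0.0 <= s <= interval_length and 0.0 <= t <= interval_length):
        raise ValueError("s and t must lie inside the interval")
    return min(s, t) - s * t / interval_length


def covariance_matrix(nodes, interval_length):
    """Covariance matrix of the bridge evaluated at the given nodes."""
    return [[brownian_bridge_covariance(s, t, interval_length) for t in nodes]
            for s in nodes]
```

The covariance vanishes whenever one argument sits at either end of the interval, and the variance is largest at the midpoint, where it equals T/4.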

Adapted
the modified covariance becomes where [Σ BB ] = Hz −2 has now the proper unit with respect to the spectral scale. The parameters and of unit [ ( ) ] = V 2 are introduced which are defined each as a scaling factor for the associated process. With these scalings the covariances Σ BB,

Covariance for Fourier Coefficients.
The Brownian bridge covariance function for the spectral domain can be studied in the domain of the Fourier coefficients via the coordinate transform stated in (11) [11]. Compactly written, one finds the infinite-dimensional covariance matrix for the Fourier coefficients analytically by for all , ≥ 1, and similarly Σ BB, ( , ) follows. The only finite off-diagonal elements occur for the absolute term in connection with the higher order terms for the even coefficients captured by the infinite-dimensional row and , respectively. This is caused by the condition that ( ) vanishes at the spectral boundaries where the sine vanishes intrinsically for any but the cosine takes values either 1 or −1 for even and odd orders, respectively. Hence, the covariance imposes the boundary condition.

Square-Integrable Property.
According to Parseval's theorem (see (16)), the squared length of a spectral function is evaluated by summing the squares of the Fourier coefficients. Because the entries of the main diagonal in the covariances for the even and odd Fourier coefficients drop with the order squared, the Brownian bridge process ensures square-integrability of the even and odd spectral functions as long as both scalings remain finite.
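The convergence argument can be illustrated numerically: if the variance of the coefficient of order n drops like 1/n², the expected sum of squared coefficients is proportional to the sum of 1/n², which is finite. The sketch below uses a unit prefactor as an assumption and does not reproduce the actual scalings of the covariance; the function name is hypothetical.

```python
def partial_sum_inverse_squares(max_order):
    """Sum of 1/n^2 for n = 1 .. max_order."""
    return sum(1.0 / (n * n) for n in range(1, max_order + 1))
```

The partial sums increase monotonically but stay below the known limit π²/6 ≈ 1.645, so nonprobed coefficients of arbitrarily high order contribute only a finite amount to the expected squared length.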

Signal Envelope.
For the even process, the signal level can be estimated by an envelope in the data domain. The starting point is the square root of the main diagonal of the Brownian bridge covariance for the Fourier coefficients. Since the argument of the sinc functions (see (14)) localises the even and odd Fourier coefficients at a fixed order $n = \pm x\,\Delta\nu/c$ in the same data domain, the even and odd contributions of the main diagonal must be added for $n \ge 1$. As can be seen from (13), the mapping of the absolute term to the data domain already includes the factor 2. In addition, the mapping comprises the bandwidth $\Delta\nu$. In the resulting envelope expression, the factor 2 was chosen so that the envelope captures most of the signal. An approximation might be convenient, because $12^{1/2}/\pi \approx 1$. For the envelope of the odd process, the same reasoning can be applied with one modification. The mapping demands that the contribution of the absolute term at the spatial origin vanishes (see (13)). Both envelopes drop with $1/|x|$, and, thus, most of the signal associated with each process would be located close to the spatial origin in the data domain.

Example Application
The Martin-Puplett interferometer diagnostic [12] at the fusion device JET (Culham, UK) probes the spectrum emitted by a broadband source and performs the Fourier transform. The interferogram data is acquired in terms of volts dependent on the optical path difference $x$. However, two different sources are probed for 20 minutes subsequently to remove a class of diagnostic imperfections not treated here any further. By subtraction of the corresponding two interferograms, the data becomes available in the form of the difference interferogram $d(x_i) = d_2(x_i) - d_1(x_i)$ acquired at the spatial grid node $x_i$. Then, the abstract model for the Martin-Puplett interferometer is stated using the total amplification of the detection system. Furthermore, the offset marks a diagnostic imperfection which varies with $x$. The Gaussian noise contributes to each data sample described by $N(0, \sigma^2)$. The unknown quantities in the diagnostic model are the spatial grid, the lower and upper spectral boundaries of the Fourier transform integral, the even and odd functions dependent on frequency $\nu$, and the offset.

Figure 3: Difference interferogram (a) and difference quotient versus optical path difference (b). Since the zig-zag pattern in the point-to-point variation appears only for the first case (black), the standard model $x_{i+1} - x_i = \text{const.}$ seems to be inadequate. Instead, a model seems appropriate for which $x_{2i+1} - x_{2i-1} = x_{2i+2} - x_{2i} = \text{const.}$ holds.

Interferometric Data.
The data set → , that is, the difference interferogram, consists of = 788 values (see Figure 3(a)). Merely for graphical presentation, a certain is chosen, derived from the standard approach (see Section 5.1.3). Globally, the data shows an upward trend with respect to the zero baseline.
The components are measured independently of each other, and the noise level for each is captured by = 132.29 V, and, thus, the variance of the whole data vector is stated by the matrix Σ = 2 .

Optical Path Difference.
The diagnostic is set up so that the sampling of the interferogram is ideally triggered when the optical path difference has changed by the increment Δ = 40 µm. Hence, the standard model = ( − 1)Δ + 0 is obvious with = 1, . . . , , and the zero-path difference 0 is a free parameter. However, this model is not accurate. Applying the standard approach, which fits a second-order polynomial to the maximum max( → ) and its two nearby values, 0 = −127.95Δ = −5.118 mm is inferred for the data set shown in Figure 3(a). Furthermore, the difference quotient evaluated by ( +1 − )/Δ is presented versus the optical path difference = ( +1 + )/2 in Figure 3(b). The point-to-point variation of the quotient has a zig-zag pattern which implies that the assumption Δ = const. is incorrect. Indeed, a model in which the increment is constant only between every second grid value seems more appropriate, making use of two free parameters: the zero-path difference 0 and a shift for every other grid value. The priors ( 0 ) and ( ) are set to be uniform.
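The two ingredients of this paragraph, the three-point parabolic estimate of the zero-path difference and a grid with a shift on every other node, can be sketched as follows. This is a minimal illustration and not the diagnostic's actual code: the function names are hypothetical, lengths are given in micrometres, and applying the shift to the even-numbered nodes is an assumption of this sketch.

```python
def parabolic_peak_offset(y_left, y_peak, y_right):
    """Vertex of the parabola through three equally spaced samples.

    Returned in units of the sample spacing, relative to the middle sample.
    """
    curvature = y_left - 2.0 * y_peak + y_right
    if curvature == 0.0:
        raise ValueError("samples are collinear; no unique vertex")
    return 0.5 * (y_left - y_right) / curvature

def path_difference_grid(n, step, zero_path, shift=0.0):
    """Grid x_i = (i - 1) * step + zero_path for i = 1..n.

    ASSUMPTION: the shift is added to every second node (even i); which
    nodes carry the shift is a choice of this sketch.
    """
    return [(i - 1) * step + zero_path + (shift if i % 2 == 0 else 0.0)
            for i in range(1, n + 1)]

if __name__ == "__main__":
    # Samples of y = -(x - 0.3)**2 at x = -1, 0, 1: the vertex lies at 0.3.
    print(parabolic_peak_offset(-1.69, -0.09, -0.49))
    # Step 40 um, zero-path difference -5118 um, illustrative shift 1 um.
    grid = path_difference_grid(6, 40.0, -5118.0, shift=1.0)
    print([b - a for a, b in zip(grid, grid[1:])])  # spacings alternate
```

With a nonzero shift the node spacing alternates between two values, while the spacing between every second node stays constant, which is the behaviour suggested by the zig-zag pattern of the difference quotient.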
The joint prior for the two vectorial quantities → and → is factorised, and each prior is chosen as a multivariate normal distribution with vanishing mean. Since the Brownian bridge covariance Σ BB (see Section 4) describes functions which vanish at the boundaries and and are square-integrable, and its signal envelope decays with the optical path difference like the interferometric data at hand, the priors are chosen by should be constant, so that any combination of Δ , , and has the same probability. Furthermore, the conditions 0 ≤ Δ ≤ − , 0 ≤ ≤ , and 0 ≤ ≤ Δ (global upper limit) must be fulfilled. For example, the upper limit of is set here to the Nyquist frequency Δ = Ny = /(2Δ ) = 3747.4 GHz. (2 ) (2 +3)/2 Σ Pr

Bayes
for → , using the (2 +3)×(2 +3) dimensional covariance matrix One gets the joint posterior with the conditional amplitude posterior given by the multivariate normal where the constant is unknown so far.
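For a model that is linear in its amplitudes, with Gaussian noise and a zero-mean Gaussian prior, both the conditional amplitude posterior and the constant obtained by integrating out the amplitudes have closed forms. The sketch below reduces the problem to a single amplitude so that no matrix library is needed; the function names and the scalar setting are assumptions made for illustration only.

```python
import math

def amplitude_posterior(design, data, noise_sigma, prior_var):
    """Posterior mean and variance of one amplitude a in data = a*design + noise.

    Prior: a ~ N(0, prior_var); noise: independent N(0, noise_sigma**2).
    """
    s2 = noise_sigma ** 2
    precision = 1.0 / prior_var + sum(g * g for g in design) / s2
    variance = 1.0 / precision
    mean = variance * sum(g * d for g, d in zip(design, data)) / s2
    return mean, variance

def log_evidence(design, data, noise_sigma, prior_var):
    """ln p(data | settings) after integrating out the amplitude.

    The data are then N(0, s2 * I + prior_var * g g^T); determinant and
    inverse follow from the rank-one update formulas.
    """
    n = len(data)
    s2 = noise_sigma ** 2
    gg = sum(g * g for g in design)
    gd = sum(g * d for g, d in zip(design, data))
    dd = sum(d * d for d in data)
    log_det = n * math.log(s2) + math.log(1.0 + prior_var * gg / s2)
    quad = (dd - prior_var * gd * gd / (s2 + prior_var * gg)) / s2
    return -0.5 * (n * math.log(2.0 * math.pi) + log_det + quad)

if __name__ == "__main__":
    design = [1.0, 0.5, -0.5]
    data = [2.1, 0.9, -1.2]
    print(amplitude_posterior(design, data, noise_sigma=0.1, prior_var=4.0))
    # The evidence as a function of a hyperparameter (here the prior variance).
    for prior_var in (0.01, 1.0, 4.0, 100.0):
        print(prior_var, log_evidence(design, data, 0.1, prior_var))
```

Scanning `log_evidence` over the prior variance shows the behaviour exploited later for the settings: values that are too small cannot explain the data, and values that are too large are penalised through the determinant term.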

Conditional Amplitude Posterior for Chosen Settings.
To give some insight, the conditional posterior for the amplitudes is evaluated given the specific  → Po contribution to the data has an upward trend but is small (see Figure 5(b)). For the spectral quantities, the square root of the main diagonal elements of Σ Po is of the order of some 10^−18 V/Hz (for one sample function drawn from the conditional amplitude posterior, see Figure 5(a)). Hence, a considerable deviation from the posterior mean is possible. With the specific values of the settings, one gets the number ln × ≈ 1.0488 × 10^6.

Settings Posterior.
As the problem is formulated, the settings posterior is proportional to × . Its optimisation, that is, finding the global maximum of the settings posterior, is a 10-dimensional problem. Since large numbers are involved, one has to investigate ln × . In general, if only one parameter is changed, a well distinguishable peak in ln × is found. Hence, the optimisation is currently carried out by varying each parameter separately (coordinate descent algorithm). Thereby, ln × increases when Δ becomes smaller. This implies an increase of the dimensionality of the involved covariances Σ Pr and Σ Po and, thus, a prolonging of the optimisation procedure. To demonstrate this procedure, the parameters = 0 GHz, = = 10^−8 V^2, 0 = −5.118 mm, = 0 mm, 0 ,Pr = 1 V, 1 ,Pr = 1 V/m, and 2 ,Pr = 1 V/m^2 are set to the values used for the nonoptimised case (see Section 5.3.1). Scanned roughly in , ln × shows a peak close to 1000 GHz (see Figure 6(a)). Furthermore, this peak increases for Δ ≤ 5 GHz to ln × ≈ 10^6. The peak is localised at 860 GHz, which moves to about 910 GHz for the reduced values = = 10^−11 V^2 (see Figure 6(b)). To ease the computational effort while still being able to characterise ln × , its maximum is determined dependent on Δ , which is scanned over the values 5.68, 4, 3, 2, 1, and 1/2 GHz. Each maximum is captured by the sets ,Po , listed in Table 1. All maxima are located at very similar sets which seem to converge as Δ becomes smaller. In relative terms, however, the maximum for a smaller Δ has higher odds (see Figure 7). For example, the odds read 1 : 0.011 when the maxima at Δ = 1/2 GHz and 5.68 GHz are related. Furthermore, taking the values for the maximum at Δ = 1/2 GHz to evaluate ln × at Δ = 1/3 GHz and 1/4 GHz gives the odds 1 : 1.013 : 1.018. Thus, the global maximum is located somewhere in the range Δ ≤ 1/4 GHz. But this range cannot be investigated in more detail from a numerical point of view.
However, the increase in ln × when Δ is decreased below 1/2 GHz is interpreted as confirmation that a continuous spectrum is indeed probed.
Figure 7: Odds for maxima of × dependent on spectral increment Δ . The maximum at Δ = 1/2 GHz is used as a reference (unity odds). For Δ = 1/3 GHz and 1/4 GHz, the maximising parameter set (see Table 1) for Δ = 1/2 GHz is used to evaluate the associated odds. The global maximum of × is obtained for very small spectral increments.
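Varying one parameter at a time, as described above for the settings posterior, can be sketched generically. The text calls the procedure coordinate descent; since a maximum is sought, the sketch is written as coordinate ascent. The objective `toy_log_posterior` is a hypothetical stand-in for the logarithm of the settings posterior, and the step-halving rule is an assumption of this sketch, not a detail taken from the paper.

```python
def coordinate_ascent(log_post, start, steps, sweeps=200, shrink=0.5, tol=1e-10):
    """Maximise log_post by changing one parameter at a time.

    start: initial parameter values; steps: initial step size per parameter.
    A step size is reduced whenever neither direction improves the objective.
    """
    point = list(start)
    steps = list(steps)
    best = log_post(point)
    for _ in range(sweeps):
        improved = False
        for i in range(len(point)):
            for direction in (1.0, -1.0):
                trial = list(point)
                trial[i] += direction * steps[i]
                value = log_post(trial)
                if value > best:
                    point, best, improved = trial, value, True
                    break
            else:
                # No improvement in either direction: refine this step size.
                steps[i] *= shrink
        if not improved and max(steps) < tol:
            break
    return point, best

def toy_log_posterior(p):
    """Hypothetical smooth objective with a single peak at (1.0, -0.5)."""
    return -(p[0] - 1.0) ** 2 - 2.0 * (p[1] + 0.5) ** 2

if __name__ == "__main__":
    print(coordinate_ascent(toy_log_posterior, [0.0, 0.0], [1.0, 1.0]))
```

Such a scheme works well when, as reported above, each one-dimensional scan shows a single well distinguishable peak; strongly correlated parameters would slow it down considerably.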
For Δ = 5.68 GHz, the odds read 1 : 10^−1390 when the associated maximum (ln × ≈ 1.0520 × 10^6) is related to the nonoptimised case (ln × ≈ 1.0488 × 10^6) investigated in Section 5.3.1. This is caused by having chosen some parameters like ( ) several orders of magnitude too large with respect to the maximum settings values.
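The quoted odds follow directly from the difference of the two logarithms, converted to base 10. A short check with the rounded values given in the text (the helper name `log10_odds` is hypothetical):

```python
import math

def log10_odds(log_value, log_reference):
    """Base-10 logarithm of exp(log_value) / exp(log_reference)."""
    return (log_value - log_reference) / math.log(10.0)

if __name__ == "__main__":
    # Nonoptimised case relative to the maximum at the increment 5.68 GHz.
    print(log10_odds(1.0488e6, 1.0520e6))  # about -1389.7 with rounded inputs
```

Working with logarithms is essential here: the ratio itself is far below the smallest representable double-precision number.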
For Δ = 1 GHz, Figures 8(a)-8(f) summarise scans in the parameter pairs ( , ), ( , ), ( 0 , ), ( 0 ,Pr , 1 ,Pr ), ( 0 ,Pr , 2 ,Pr ) and in 0 ,Pr while the remaining parameters are held at the maximum values. In 0 ,Pr , 1 ,Pr , and 2 ,Pr , a skewed distribution is found. The remaining six-dimensional posterior distribution has a high probability in a narrow region for a given Δ . This distribution is well approximated by a multivariate normal. Its mean is given by the maximum values listed in Table 1. The posterior covariance is estimated from the inverse of the Hessian matrix evaluated numerically via the second-order partial derivatives in the vicinity of the maximum. The off-diagonal elements of this covariance are negligible, so that one can factorise the posterior as a product of individual Gaussians. The posterior standard deviations ,Po , ,Po , ,Po , ,Po , 0 ,Po , and ,Po vary little when Δ is changed (see Table 2). The spectral boundaries are well determined within an interval of some GHz. While the uncertainty in the scaling is small compared to its posterior mean value, is more uncertain. The uncertainty in the zero-path difference is of the order of some µm, and the shift is quite certain within some hundreds of nanometers.
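The curvature-based estimate of the posterior standard deviations can be sketched with a central second difference. The sketch assumes that the Hessian refers to the logarithm of the settings posterior and that correlations are negligible, so that only the diagonal is needed; `laplace_sigma` and the toy objective are hypothetical names.

```python
import math

def laplace_sigma(log_post, peak, index, h):
    """Standard deviation of one parameter from the curvature at the peak.

    Uses a central second difference of log_post with step h along the
    coordinate `index` and ignores off-diagonal Hessian elements.
    """
    up = list(peak)
    down = list(peak)
    up[index] += h
    down[index] -= h
    second = (log_post(up) - 2.0 * log_post(peak) + log_post(down)) / h ** 2
    if second >= 0.0:
        raise ValueError("not a maximum along this coordinate")
    return 1.0 / math.sqrt(-second)

def toy_log_posterior(p):
    """Hypothetical Gaussian log-density with widths 0.2 and 3.0."""
    return -0.5 * ((p[0] - 1.0) / 0.2) ** 2 - 0.5 * (p[1] / 3.0) ** 2

if __name__ == "__main__":
    peak = [1.0, 0.0]
    print(laplace_sigma(toy_log_posterior, peak, 0, 1e-3))
    print(laplace_sigma(toy_log_posterior, peak, 1, 1e-3))
```

If the off-diagonal elements were not negligible, the full matrix of second derivatives would have to be inverted instead of taking each diagonal element separately.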

Conditional Amplitude Posterior for Maximising Settings.
In the following, the amplitude posterior is investigated for the maximising settings given Δ = 1/2 GHz (see Table 1). For the listed settings, the double-sided domain is identified by | | ≤ | 0,Po | ≈ 5.115 mm; the single-sided domain is bounded by the lower limit > | 0,Po | and the upper limit of about 26.36 mm, the centre frequency   Table 1). Furthermore, the absolute and the linear mean values are similar to the ones obtained for the nonoptimised case (see Section 5.   Table 1).   Table 1). Most likely, the spectral quantities This is seen by mapping both spectral means to the data domain which gives the even and odd quantities → ,Po and → ,Po (see Figures 9(b)-9(d)). In addition, the single-sided domain is described mainly by → ,Po , because in this domain only the envelope , for the even process is of the order of the interferometric data (see Figure 9(d)). This differs from the findings for the nonoptimised case ( = ) for which → ,Po and → ,Po determine the single-sided region almost equally.
The histogram of the residuals (see Figure 9(e)) is approximated very well by the normal distribution N(−1.7 × 10^−4, 0.94^2). Since the mean almost vanishes and the standard deviation is very close to unity, the data set is well described by the model and the posterior means. Most likely, the data point located at ≈ 7 mm (see Figure 9(f)) is an outlier, because its residual is outside the 3.5-band.
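A residual check of this kind can be sketched in a few lines. The sketch assumes that the residuals are normalised by the noise standard deviation and that the band of 3.5 is measured in these units; the function name and the example numbers are hypothetical.

```python
import statistics

def residual_summary(data, model, sigma, band=3.5):
    """Mean and standard deviation of normalised residuals, plus outliers.

    Residuals are (data - model) / sigma; indices whose residual magnitude
    exceeds `band` are reported as outlier candidates.
    """
    residuals = [(d - m) / sigma for d, m in zip(data, model)]
    outliers = [i for i, r in enumerate(residuals) if abs(r) > band]
    return statistics.mean(residuals), statistics.pstdev(residuals), outliers

if __name__ == "__main__":
    data = [0.2, -0.4, 1.1, 4.0, -0.9]
    model = [0.0, 0.0, 0.0, 0.0, 0.0]
    print(residual_summary(data, model, sigma=1.0))
```

A mean near zero and a standard deviation near one indicate that the model and the stated noise level are consistent with the data.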

(b) Posterior Covariance. For the linear parameters the posterior covariance matrix is written like
and taking the square root of an element of the main diagonal gives the posterior standard deviation for the th parameter. The correlation coefficient between two parameters 1 and 2 can be evaluated by 1 , 2 = Σ 1 , 2 ,Po /(  Figure 11(a) shows one sample for each of the even and odd spectral functions, Figure 11(b) presents 100 samples. The samples form a band around the corresponding posterior mean → ( ),Po with the width of about twice the standard deviation → ( ) ,Po as shown in Figure 10(d). Hence, the band for the odd function is much smaller.
The samples mapped to the data domain give → ,Po,Sa , → ,Po,Sa , and → Off,Po,Sa (see Figure 11(c)) which form much narrower bands when compared to the nonoptimised case (see Figure 5(b)). In particular, the transition from the double- to the single-sided domain is smooth. This is a consequence of having used the most likely scalings ,Po and ,Po and the associated but quite different signal envelopes inferred from the data. Furthermore, this explains why the band for → ,Po,Sa is much smaller than the one for
Figure 11: The uncertainty at a given frequency is larger for the even function, because for the associated scalings of the two processes ,Po ≫ ,Po holds. (c) 100 spectral samples and 100 offset coefficients mapped to probed data domain.
needs to be carried out. In general, the pdf ( → , → | Δ , → ) is not Gaussian but can be approximated by a multivariate normal. Performing the marginalisation, that is, the uncertainty propagation, is analytically not possible but achievable numerically, as described below for Δ = 1/2 GHz. To reduce the numerical effort, only the most important parameters , , 0 , , , and are taken into consideration for the marginalisation, and the remaining three parameters 0 ,Pr , 1 ,Pr , and 2 ,Pr are held at the maximum values listed in Table 1. In doing so, the marginalisation with respect to → is achieved implicitly.
(see Figure 13(c)) shows that towards the spectral limits the propagation of the settings posterior uncertainties has little effect. Furthermore, the correlation , ,Ma is high for a broad spectral range in the central region (see Figure 13(d)). The samples → ,Po,Ma,Sa have a wider spread in the centre of the spectral domain than → ,Po,Sa (see Figure 11(b)), mainly caused by the uncertainty in the zero-path difference.
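One generic way to propagate settings uncertainties onto a linear parameter numerically is to average the conditional Gaussian moments over samples of the settings. The sketch below illustrates this with the law of total variance for a single amplitude. It is not necessarily the procedure used for the results above, and `marginal_moments` and the toy conditional are hypothetical.

```python
import statistics

def marginal_moments(conditional, setting_samples):
    """Mean and variance of an amplitude after averaging over settings.

    conditional(s) returns (mean, variance) of the amplitude given setting s.
    The equally weighted mixture over the samples has
      mean     = average of the conditional means,
      variance = average conditional variance + variance of the means.
    """
    moments = [conditional(s) for s in setting_samples]
    means = [m for m, _ in moments]
    variances = [v for _, v in moments]
    mean = statistics.fmean(means)
    within = statistics.fmean(variances)
    between = statistics.pvariance(means, mu=mean)
    return mean, within + between

if __name__ == "__main__":
    # Toy conditional: the mean depends on the setting, the variance does not.
    toy = lambda s: (2.0 * s, 1.0)
    print(marginal_moments(toy, [-1.0, -0.5, 0.0, 0.5, 1.0]))
```

The second variance term is the contribution of the settings uncertainty; it is largest where the conditional mean reacts most strongly to a change of the settings.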

Figure of Merit for Real-World Interferometer.
For the ideal Martin-Puplett interferometer, the odd spectral function and, hence, the odd process must vanish from a theoretical point of view. However, imperfections of a real-world interferometer leave the odd contribution finite in general. This imperfection is captured by the scaling and in terms of signal by the envelope , ∝ 1/2 . By relating the square root of the scalings like one can define the figure of merit which expresses by a number the signal deviation of a real-world instrument from the ideal case. For an ideal interferometer, = 1 holds ( = 0). For the interferometer investigated here, the settings posterior (see Section 5.3.2) carries the information to state the mean as = 0.933 with the uncertainty of about 0.01. Hence, about 7% of the signal is converted from the ideal even process to the odd process by the real-world diagnostic.

Choice of Spectral Priors or Model Plausibility.
The results obtained in the previous section rely on the model BB presented here with the assumption that the even and odd spectral functions can be described each by a Brownian bridge process and its associated prior covariance. An alternative model, let us say , with certain assumptions on the spectral functions, leads to different prior covariances and to a different posterior. In addition, the model might be more or less plausible when compared to the model BB .
In principle, the plausibility of a model relative to an alternative can be investigated within the Bayesian framework by raising the abstraction level. Starting from (36), a further factorisation needs to be carried out with respect to the used models BB or . Basically, one can assign the model posterior by ( BB | → ) ∝ BB ( BB ) and ( | → ) ∝ ( ) with the model priors ( ) and ( BB ). The dimensionless constants BB and may be obtained by marginalising over all linear and nonlinear parameters and hyperparameters. Then, the model plausibility is captured by the ratio ( BB | → )/ ( | → ). This ratio becomes BB / , if no model is preferred a priori for which one sets ( ) = ( BB ) = 1/2. Such a model plausibility study was not carried out, because it demands an investigation of its own along with a costly numerical treatment. However, some aspects of a plausibility study will be discussed below.
The signal envelope, corresponding to the used prior covariance for the even and odd spectral functions, is expected to be an indicator for a competitive model. This envelope should be able to resemble the global trend of the interferometric data (see Figure 3(a)). The investigated model BB seems to have these desired characteristics (see Figure 9).
For the even and odd spectral functions, an alternative prior choice could be Σ ( , ) = ( ) ( − ) which assigns no correlation. For example, if the function ( ) is chosen to remain constant or as a triangular function, centred at and approaching 0 towards the spectral limits, then the corresponding covariance Σ ( , ) for the Fourier coefficients has a constant value along the main diagonal for all orders. This can be shown analytically by performing the operation stated in (43) on Σ ( , ). As a consequence, no decay is imposed on the Fourier coefficients with rising order, which is incompatible with square-integrability. Furthermore, the alternative signal envelope is constant in the optical path difference domain, opposing the fall-off in the interferometric data (see Figure 3(a)). Thus, the data should not be described better by either of the two suggested alternative models.
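The constant diagonal for an uncorrelated prior can be illustrated numerically. The sketch assumes a delta-correlated prior on the unit interval and cosine coefficients with period one, which is a simplified stand-in for the transformation referred to in (43); all names are hypothetical.

```python
import math

def cosine_coefficient_variance(profile, k, n=2000):
    """Variance of the k-th cosine coefficient for a delta-correlated prior.

    ASSUMPTION: covariance profile(t) * delta(t - t') on the unit interval.
    The coefficient c_k = integral of f(t) * cos(2*pi*k*t) dt then has the
    variance integral of profile(t) * cos(2*pi*k*t)**2 dt (midpoint rule).
    """
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * h
        total += profile(t) * math.cos(2.0 * math.pi * k * t) ** 2
    return total * h

def constant(t):
    """Constant variance profile."""
    return 1.0

def triangle(t):
    """Triangular profile centred at 1/2 and vanishing at both limits."""
    return 1.0 - abs(2.0 * t - 1.0)

if __name__ == "__main__":
    for k in (1, 2, 5, 20):
        print(k, cosine_coefficient_variance(constant, k),
              cosine_coefficient_variance(triangle, k))
```

In this setting both profiles give a variance that does not depend on the order k, in contrast to the 1/k² decay of the Brownian bridge.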
A competitive model could use a prior covariance Σ ( , ) for the Fourier coefficients which has a dropping amplitude when the order of the coefficients rises. For instance, the main diagonal of Σ ( , ) could be chosen like 1/( ) , and a transformation back to the spectral domain would allow further investigations of the properties of Σ ( , ). For different but positive exponents , the model plausibility could be examined.

Comparison with Standard Model.
The standard analysis approach for the interferometer investigated here relies on a different model which is set up hierarchically [12], and no covariances of the parameters involved are available. Basically, the quantity of interest, that is, the spectrum, is determined conditionally on the inferred voltage offset Off , the zero-path difference 0 , and the phase, while setting the shift = 0. Furthermore, spatial window functions are multiplied to the data to retrieve the phase and spectrum, using discrete fast Fourier transformation (DFFT) routines under the Nyquist assumptions. A consideration of nonprobed Fourier coefficients, located outside the experimentally accessible spatial domain, is missing completely, as is the influence of the measurement noise. Hence, a comparison of the standard model with the Bayesian model BB is not straightforward. In order to perform a reasonable comparison, the spectral grid with the increment Δ = 3.66 GHz, which follows from the use of DFFT by , and the limits = 0 GHz, = 3747.4 GHz are the same for both models. In addition, settings are determined for BB which come close to the settings for (see second column of Table 3) to resemble the standard model. Given these settings, the even and odd spectra are compared. Then, the plausibility of these settings can be obtained with respect to the maximum of the settings posterior for BB by evaluating ln × and the corresponding odds.
Table 3: The standard model (second column) is well approximated by BB when the settings of the third column are chosen. However, an aliasing feature is present in the spectral domain 3000-3500 GHz (see Figure 14(a)). This feature disappears when the spatial settings are changed (fourth column) to the values at the maximum of the settings posterior of BB (fifth column). The maximum of the settings posterior reveals that the Nyquist assumptions ( = 0 GHz and = 3747.4 GHz) are not plausible, because an overfitting of the data is made.

Model
The only window function applied here in the standard analysis weights the single-sided data domain twice as strongly as the double-sided domain. Furthermore, the settings of the standard approach 0 = −5.118 mm and vanishing are transferred to the model BB , and the remaining settings of BB are optimised (see third column of Table 3). The even and odd quantities → , and → , are inferred with , and the means → ,Po and → ,Po of the conditional amplitude posterior of the model BB are evaluated. Figure 14(a) shows the results which are similar in amplitude. For both models, an aliasing feature can be found for the odd spectral quantity in the spectral domain from 3000 GHz to 3500 GHz. This feature originates in the assumed uniform optical path difference grid ( = 0). The differences → , − → ,Po and → , − → ,Po settle mostly in the range ±0.1 × 10^−18 V/Hz (see Figure 14(b)).
Keeping all settings in BB fixed but choosing 0 and as present at the maximum of the settings posterior for Δ = 3.66 GHz (see fourth column of Table 3), the aliasing feature vanishes completely, because the nonuniformity in the optical path difference grid is taken into account properly. This small change in the spatial settings would make the conditional amplitude posterior about 47 times more likely.
The odds 1 : 10^−278.53 for the values at the maximum of the settings posterior (see fifth column of Table 3) with respect to the settings for the model BB which mimics mark the Nyquist assumptions made by the standard approach as very unlikely. This is explained when the corresponding residuals are investigated, which show an overfitting of the data if the Nyquist assumptions are used (similar to Figure 5(c)).

Computational Time.
The algorithms for the model BB and the standard model are implemented in Scilab [13]. To obtain the even and odd spectra with via DFFT routines, a computational time of about 4 ms is measured. This fast analysis time is exceeded by several orders of magnitude, when the problem is investigated with the Bayesian model.
For the model BB , the number = 2( − )/Δ + 3 of linear parameters, dependent on the spectral domain and increment, gives the dimension of the prior and posterior covariance matrices. Hence, determines the computational times , to evaluate the mean and covariance of the conditional amplitude posterior, and ln × , to investigate the settings posterior at one point in the parameter space. For the implemented algorithm and the maximising settings listed in Table 1 with decreasing Δ ( increases by about one order of magnitude), and ln × measured increase at least quadratically with (see Table 4). This is caused mainly by the need to invert numerically prior and posterior covariance matrices.
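The size of the linear problem and the Nyquist frequency quoted earlier can be reproduced with a few lines of arithmetic. The sketch assumes that the missing factor in the Nyquist expression is the speed of light and that the sampling increment is 40 µm; both assumptions reproduce the quoted 3747.4 GHz. The function names are hypothetical.

```python
SPEED_OF_LIGHT = 299_792_458.0  # m/s

def nyquist_frequency_ghz(sampling_increment_m):
    """Nyquist frequency c / (2 * dx) in GHz for a path-difference step dx."""
    return SPEED_OF_LIGHT / (2.0 * sampling_increment_m) / 1e9

def linear_parameter_count(lower_ghz, upper_ghz, increment_ghz):
    """Number of linear parameters, 2 * (upper - lower) / increment + 3."""
    return 2.0 * (upper_ghz - lower_ghz) / increment_ghz + 3.0

if __name__ == "__main__":
    upper = nyquist_frequency_ghz(40e-6)
    print(upper)  # about 3747.4 GHz
    for increment in (3.66, 1.0, 0.5):
        count = linear_parameter_count(0.0, upper, increment)
        # With a cost rising like count**2, halving the increment costs
        # roughly a factor of four.
        print(increment, round(count), count ** 2)
```

The limits used here are the Nyquist assumptions, not the maximising settings of Table 1; a narrower band reduces the count proportionally.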
The characterisation of the settings posterior (see Section 5.3.2) requires a duration of at least 10^3 ln × , which adds up to hours for small Δ (large ) and to much less than one hour for Δ = 4 GHz.
The numerical marginalisation described in Section 5.3.4 takes about half a day for Δ = 1/2 GHz.
Figure 14 (see Table 3): (a) Even and odd spectra. Similar results follow from both models. The elevated amplitude in the odd spectra between 3000 GHz and 3500 GHz is caused by aliasing due to the assumption that the optical path difference grid is uniform. (b) Absolute differences of spectral quantities inferred by both models. Except in the vicinity of the water vapour absorption line at 557 GHz, the difference is small in absolute terms.
Table 4: Measured computational times and ln × of used algorithm for Bayesian model BB dependent on number of linear parameters and discretisation increment Δ for maximising settings listed in Table 1. and ln × rise approximately with 2 . The investigation of the settings posterior demands a time of ln × multiplied with a factor of at least 10^3, which becomes hours for Δ = 1/2 GHz.

Conclusions
The Fourier transform is the heart of Fourier spectroscopy applications. Thereby, the interferometric data has a linear dependence on the even and odd continuous spectra to be inferred. Standard analysis techniques lack appropriate handling of fundamental aspects like noisy measurements, the influence of nonprobed spatial domains linked to Fourier coefficients above a certain order, the estimation of spectral limits, and the propagation of uncertainties of additional parameters like the zero-path difference onto the inferred spectra. For instance, the Nyquist assumption implies the fundamental misconception that the upper spectral limit of spectra to be inferred would depend on the spatial sampling. In addition, a broad spectral bandwidth would follow which increases artificially the number of Fourier coefficients necessary to describe the data. On the contrary, it can be shown analytically that a band-limitation causes spatially extended basis functions (modulated sinc functions) assigned to the Fourier coefficients in the data domain. Thus, several nearby data points are captured sufficiently by fewer coefficients. This example demonstrates that interferometric data contains more information than usually extracted.
As an alternative to the standard analysis techniques, a probabilistic ansatz, relying on Bayes' theorem, was proposed which is able to capture the fundamental aspects listed above. In general, Bayes' theorem relates the posterior probability density function of model parameters to the product of the likelihood and the prior probability density function for these parameters. The ansatz presented here uses multivariate normal distributions for the likelihood and the prior for parameters which map linearly to the data domain. This gives straightforwardly an analytical solution for the posterior of these linear parameters in form of a multivariate normal. However, this amplitude posterior is conditional on the settings parameters, summarising all nonlinear model parameters and hyperparameters. After the trivial marginalisation over the linear parameters, the remaining quantity can be scanned in the settings parameters to investigate their joint posterior. This can be understood as a means of applying Ockham's razor for the linear problem. With the settings posterior at hand, the marginalisation projects the uncertainties in the settings onto the linear parameters.
The example application for the Bayesian approach infers even and odd spectra, which qualify as linear parameters, in the microwave and far-infrared spectral domain and several settings parameters, like the spectral discretisation increment, the spectral limits, the scalings of the even and odd processes, the zero-path difference, and a shift correction to the spatial sampling, given a measured interferometric data set. Each spectrum is modelled by a scaled Brownian bridge process which is able to capture a band-limitation, and the associated covariance is used in the Gaussian prior. This covariance assigns a broadband correlation, but its transform to the domain of Fourier coefficients reveals no correlation (vanishing off-diagonal elements except in connection with the zeroth-order term) between the coefficients. Furthermore, the diagonal elements drop with the square of the order of the coefficients. Hence, the prior information stated by the Brownian bridge covariance considers functions which are square-integrable and, thus, converge globally in the limit when the discretisation increment approaches zero and the order of the Fourier coefficients tends to infinity. In addition, these functions vanish smoothly at the lower and upper spectral boundaries. In the data domain, a signal envelope follows from the Brownian bridge process. This envelope decays with the optical path difference and the spectral bandwidth.
For the linear parameters like the even and odd spectra, a conditional amplitude posterior was briefly examined, relying on the Nyquist assumptions. Due to the large upper spectral limit, all noise contributions to the interferometric data are captured by the posterior mean of the linear parameters. This implies an overfitting. Because large and equal values are taken for the two Brownian bridge scalings (large signal envelopes), the mapped posterior means of the spectra describe the even and odd parts of the interferometric data to equal parts in the single-sided domain, while the even part dominates in the double-sided domain. This is an indicator that the Fourier coefficients located in the single-sided domain are underestimated and overestimated for the even and odd spectra, respectively. The posterior samples for both spectra show large deviations from the means, and the even and odd contributions, obtained by mapping the samples, form much wider bands than the measurement uncertainty, especially in the single-sided domain. This indicates an unnecessarily expanded solution space for the problem. Only by the posterior covariance of the linear parameters, the sum of the mapped samples complies with the data and its uncertainty band. The listed features mark a very unlikely conditional amplitude posterior which is revealed by the settings posterior.
The settings posterior for the most important settings is well approximated by the product of individual normal distributions, because no significant correlations could be found. The corresponding posterior means and standard deviations take reasonable values. These values are affected little by the discretisation increment which tends to be small, confirming the proposition that continuous spectral quantities are probed. The upper spectral limit is smaller than the Nyquist frequency by about a factor of four, and the lower limit is well separated from zero. This reduction of the bandwidth implies that the interferometric data can be described by a number of Fourier coefficients with associated spatial basis functions which is about one-quarter of the number of data points. The scaling of the even process exceeds the one for the odd process by about two orders of magnitude.
For values corresponding to the maximum of the settings posterior at a small discretisation increment, the conditional amplitude posterior was investigated. By the discretisation, the number of the linear parameters exceeds the one of the Fourier coefficients, which is mandatory to describe the data points within the measurement uncertainty, by one order of magnitude. However, Ockham's razor principle implemented by the settings posterior limits the solution space, so that the posterior means and samples for the linear spectral parameters, mapped to the data domain, have a smooth transition between the double- and single-sided regions. Due to the much larger scaling with respect to the odd process, the even process and, thus, the even spectrum describe most of the single-sided and double-sided interferogram region. While the probed interferometric data is well described within the uncertainty by the means and samples, the nonprobed data domain, corresponding to Fourier coefficients above a certain order, is filled broadly by these mapped samples. This filling decays with increasing optical path difference and is limited by the signal envelopes which follow from the estimated scalings of the Brownian bridge processes and the bandwidth. Because the spread of the mapped samples in the nonprobed domain exceeds the one in the probed region, the main uncertainties for the even and odd spectra originate in nonprobed Fourier coefficients.
The numerically costly marginalisation over some of the settings shows that the zero-path difference changes the covariance for the odd spectral parameters significantly. Basically, a broad increase of the posterior uncertainties and correlation was found.
A figure of merit was introduced which states the deviation of a real-world interferometer from an ideal diagnostic. By relating the scalings of the even and odd processes, the used interferometer is characterised as being close to the ideal case for which the odd process must vanish.

Disclosure
The views and opinions expressed herein do not necessarily reflect those of the European Commission.