Testing gaussianity, homogeneity and isotropy with the cosmic microwave background

We review the basic hypotheses that motivate the statistical framework used to analyze the cosmic microwave background, and show how that framework can be enlarged as we relax those hypotheses. In particular, we try to separate as much as possible the questions of gaussianity, homogeneity and isotropy from each other. We focus both on isotropic estimators of non-gaussianity and on statistically anisotropic estimators of gaussianity, placing particular emphasis on their signatures and on the enhanced "cosmic variances" that become increasingly important as our putative Universe becomes less symmetric. After reviewing the formalism behind some simple model-independent tests, we discuss how these tests can be applied to CMB data when searching for large-scale "anomalies".


Introduction
According to our current understanding of the Universe, the morphology of the cosmic microwave background (CMB) temperature field, as well as all cosmological structures that are now visible, like galaxies, clusters of galaxies and the whole web of large-scale structure, are probably the descendants of quantum processes that took place some 10⁻³⁵ seconds after the Big Bang. In the standard lore, the machinery responsible for these processes is termed cosmic inflation and, in general terms, what it means is that microscopic quantum fluctuations pervading the primordial Universe are stretched to what correspond, today, to cosmological scales (see [1,2,3] for comprehensive introductions to inflation). These primordial perturbations serve as initial conditions for the process of structure formation, which enhances them through gravitational instability. The subsequent (classical) evolution of these instabilities preserves the main statistical features of the fluctuations that were inherited from their inflationary origin - provided, of course, that we restrict ourselves to linear perturbation theory.
However, given that matter has a natural tendency to cluster, and this inevitably leads to non-linearities (not to mention the sorts of complications that come with baryonic physics), the structures which are visible today are far from ideal probes of those statistical properties. CMB photons, on the other hand, to an excellent approximation experience free streaming since the time of decoupling (z ≈ 1100), and are therefore exempt from these non-linearities (except, of course, for secondary anisotropies such as the Rees-Sciama effect or the Sunyaev-Zel'dovich effect), which implies that they constitute an ideal window to the physics of the early Universe - see, e.g., [4,5,6]. In fact, we can determine the primary CMB anisotropies, as well as most of the secondary anisotropies on large scales, such as the Integrated Sachs-Wolfe effect, completely in terms of the initial conditions by means of a linear kernel:

Θ(n̂) = Σᵢ ∫ dη d³x Kᵢ(n̂; x, η) Sᵢ(x) ,

where η is conformal time, and the Sᵢ denote the initial conditions of all matter and metric fields (as well as their time derivatives, if the initial conditions are non-adiabatic). Here Kᵢ is a linear kernel, or a retarded Green's function, that propagates the radiation field to the time and place of its detection, here on Earth. Since that kernel is insensitive to the statistical nature of the initial conditions (which can be thought of as constants which multiply the source terms), those properties are transferred precisely to the CMB temperature field Θ.
The statistical properties of the primordial fluctuations are, to lowest order in perturbation theory, quite simple: because the quantum fluctuations that get stretched and enhanced by inflation are basically harmonic oscillators in their ground state, the distribution of those fluctuations is Gaussian, with each mode an independent random variable. The Fourier modes of these fluctuations are characterized by random phases (corresponding to the random initial values of the oscillators), with zero mean, and variances which are given simply by the field mass and the mode's wavenumber k = 2π/λ.
The presence of higher-order interactions (which exist even for free fields, because of gravity) changes this simple picture, introducing higher-order correlations which destroy gaussianity - even in the simplest scenario of inflation [7,8,9]. However, since these interactions are typically suppressed by powers of the factor GH² ≈ 10⁻¹², where G is Newton's constant and H the Hubble parameter during inflation, the corrections are small - but, at least in principle, detectable [10,11,12].
Since these statistical properties are a generic prediction of (essentially) all inflationary models, they can also be inferred from two ingredients that are usually assumed as a first approximation to our Universe. First, since inflation was designed to stretch our Universe until it became spatially homogeneous and isotropic, it is reasonable to expect that all statistical moments of the CMB should be spatially homogeneous and rotationally invariant, regardless of their general form. Second, in linear perturbation theory [13], where we have a large number of cosmological fluctuations evolving independently, we can expect, based on the central limit theorem, that the fluctuations will obey a Gaussian distribution.
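The central-limit argument can be illustrated numerically: adding up many independent modes with random phases yields a one-point distribution that is very nearly Gaussian, even though each individual mode is far from Gaussian. A toy sketch (unit-amplitude modes and all sample sizes are illustrative choices, not a realistic spectrum):

```python
import numpy as np

rng = np.random.default_rng(2)

# Sum of many independent unit-amplitude modes with random phases,
# normalized to unit variance (the variance of cos(U) is 1/2).
n_modes, n_samples = 400, 20_000
phases = rng.uniform(0.0, 2 * np.pi, size=(n_samples, n_modes))
field = np.cos(phases).sum(axis=1) / np.sqrt(n_modes / 2)

print(field.mean(), field.std())  # ~0 and ~1

# Excess kurtosis vanishes for a Gaussian:
kurt = np.mean(field**4) / np.mean(field**2) ** 2 - 3.0
print(kurt)  # ~0, even though each mode alone is very non-Gaussian
```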
The power of this program lies, therefore, in its simplicity: if the Universe is indeed Gaussian, homogeneous and statistically isotropic (SI), then essentially all the information about inflation and the linear (low-redshift) evolution of the Universe is encoded in the variance, or two-point correlation function, of large-scale cosmological structures and/or the CMB. As it turns out, the five-year dataset from the Wilkinson Microwave Anisotropy Probe (WMAP) strongly supports these predictions [14,11]. Moreover, the measurements of the CMB temperature power spectrum by the WMAP team, alongside measurements of the matter power spectrum from existing galaxy surveys [15,16] and data from type Ia supernovae [17,18,19], have shown remarkable consistency with a concordance model (ΛCDM), in which the cosmos figures as a Gaussian, spatially flat, approximately homogeneous and statistically isotropic web of structures composed mainly of baryons, dark matter and dark energy.
However, while the detection of a nearly scale-invariant and Gaussian spectrum is a powerful boost to the idea of inflation, just knowing the variance of the primordial fluctuations is not sufficient to single out which particular inflationary model was realized in our Universe. For that we will need not only the 2-point function, but the higher moments of the distribution as well. Therefore, in order to break this model degeneracy we must go beyond the framework of the ΛCDM, Gaussian, spatially homogeneous and statistically isotropic Universe.
Reconstructing our cosmic history, however, is not the only reason to explore further the statistical properties of the CMB. The full-sky temperature maps by WMAP [20,11] have revealed the existence of a series of large-angle anomalies - which, incidentally, were (in hindsight) already visible in the lower-resolution COBE data [21]. These anomalies suggest that at least one of our cherished hypotheses underlying the standard cosmological model might be wrong - even as a first-order approximation. Perhaps the most intriguing anomalies (described in more detail in other review papers in this volume) are the low value of the quadrupole and its alignment (ℓ = 2) with the octupole (ℓ = 3) [22,23,24,25,26,27], the sphericity [26] (or lack of planarity [28]) of the multipole ℓ = 5, and the north-south asymmetry [29,30,31,32,33]. In the framework of the standard cosmological model, these are very unlikely statistical events, and yet the evidence that they exist in the real data (and are not artifacts of poorly subtracted extended foregrounds - e.g., [34]) is strong.
Concerning theoretical explanations, even though we have by now an arsenal of ad hoc models designed to account for the existence of these anomalies, none has yet quite succeeded in explaining their origin. Nevertheless, they all share the point of view that the detected anomalies might be related to a deviation from gaussianity and/or statistical isotropy.
In this review we will describe, first, how to characterize, from the point of view of the underlying spacetime symmetries, both non-gaussianity and statistical anisotropy. We will adopt two guiding principles. The first is that gaussianity and SI, being completely different properties of a random variable, should be treated separately whenever possible or practical. Second, since there is only one type of gaussianity and SI but infinitely many ways of deviating from them, it is important to try to measure these deviations without a particular model or anomaly in mind - although we may eventually appeal to particular models as illustrations or as a means of comparison. This approach is not new and, although not usually mentioned explicitly, it has been adopted in a number of recent papers [35,36].
One of the main motivations for this model-independent approach is the difficult issue of aprioristic statistics: one can only test the random nature of a process if it can be repeated a very large (formally, infinite) number of times. Since the CMB only changes on a timescale of tens of millions of years, waiting for our surface of last scattering to probe a different region of the Universe is not a practical proposition. Instead, we are stuck with one dataset (a sequence of apparently random numbers), which we can subject to any number of tests. Clearly, by sheer chance about 30% of the tests will give a positive detection with 70% confidence level (C.L.), 10% will give a positive detection with 90% C.L., and so on. With enough time, anyone can come up with detections of arbitrarily high significance -and ingenuity will surely accelerate this process. Hence, it would be useful to have a few guiding principles to inform and motivate our statistical tests, so that we don't end up shooting blindly at a finite number of fish in a small wheelbarrow.
This review is divided in two parts. We start Part I by reviewing the basic statistical framework behind linear perturbation theory (§2). This serves as a motivation for §3, where we discuss the formal aspects of non-Gaussian and statistically isotropic models (§3.1), as well as Gaussian models of statistical anisotropy (§3.2). Part II is devoted to a discussion of model-independent cosmological tests of non-gaussianity and statistical anisotropy and their application to CMB data. We focus on two particular tests, namely, the multipole vector statistics (§4) and functional modifications of the two-point correlation function (§5). After discussing how such tests are usually carried out when searching for anomalies in CMB data (§6.1), we present a new formalism which generalizes the standard procedure by including the ergodicity of cosmological data as a possible source of errors (§6.2). This formalism is illustrated in §7, where we carry out a search for planar-type deviations from isotropy in CMB data. We then conclude in §8.

Part I
The linearized Universe

General structure

We start by defining the temperature fluctuation field. Since the background radiation is known to have an average temperature of 2.725 K, we are interested only in deviations from this value at a given direction n̂ in the CMB sky. So let us consider the dimensionless function on S²:

Θ(n̂) ≡ [T(n̂) − T₀] / T₀ ,

where T₀ = 2.725 K is the blackbody temperature of the mean photon energy distribution - which, if homogeneity holds, is also equal to the ensemble average of the temperature.
In full generality, the fluctuation field is not only a function of the direction n̂, but also of the position and time at which our measurements are taken. In practice, these vary so slowly that we can ignore those dependences altogether. Therefore, we can equally well consider this function as one defined only on the unit-radius sphere S², for which the following decomposition holds:

Θ(n̂) = Σ_{ℓm} a_{ℓm} Y_{ℓm}(n̂) .

Since the spherical harmonics Y_{ℓm}(n̂) obey the symmetry Y*_{ℓm}(n̂) = (−1)^m Y_{ℓ,−m}(n̂), the fact that the temperature field is a real function implies the identity a*_{ℓm} = (−1)^m a_{ℓ,−m}. This means that each temperature multipole ℓ is completely characterized by 2ℓ + 1 real degrees of freedom.
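The reality condition is easy to verify numerically: populating the a_{ℓm}'s at random, subject to a*_{ℓm} = (−1)^m a_{ℓ,−m}, must produce a real field on the sphere. A minimal sketch (the helper Ylm, built on scipy's associated Legendre routine, and all parameter choices are illustrative):

```python
import numpy as np
from math import factorial
from scipy.special import lpmv

def Ylm(l, m, theta, phi):
    """Spherical harmonic, Condon-Shortley convention (lpmv already
    includes the (-1)^m phase); negative m via Y_{l,-m} = (-1)^m Y*_{lm}."""
    if m < 0:
        return (-1) ** (-m) * np.conj(Ylm(l, -m, theta, phi))
    norm = np.sqrt((2 * l + 1) / (4 * np.pi)
                   * factorial(l - m) / factorial(l + m))
    return norm * lpmv(m, l, np.cos(theta)) * np.exp(1j * m * phi)

rng = np.random.default_rng(0)
lmax = 4
alm = {}
for l in range(lmax + 1):
    alm[(l, 0)] = complex(rng.normal())  # a_{l0} must be real
    for m in range(1, l + 1):
        alm[(l, m)] = rng.normal() + 1j * rng.normal()
        alm[(l, -m)] = (-1) ** m * np.conj(alm[(l, m)])  # reality condition

theta = rng.uniform(0.0, np.pi, 5)      # polar angles of test directions
phi = rng.uniform(0.0, 2 * np.pi, 5)    # azimuthal angles
T = sum(a * Ylm(l, m, theta, phi) for (l, m), a in alm.items())
print(np.max(np.abs(T.imag)))  # zero up to roundoff: the field is real
```

Note that each multipole ℓ indeed carries 2ℓ + 1 real numbers here: one real a_{ℓ0} plus ℓ independent complex coefficients.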

From inhomogeneities to anisotropies: linear theory
The ultimate sources of anisotropies in the Universe are the inhomogeneities in the baryon-photon fluid, as well as their associated spacetime metric fluctuations. If the photons were in perfect equilibrium with the baryons up to a sharply defined moment in time (the so-called instant recombination approximation), their distribution would have only one parameter (the equilibrium temperature at each point), so that photons flying off in any direction would have exactly the same energies. In that case, the photons we see today coming from a line of sight n̂ would reflect simply the density and gravitational potentials (the "sources") at the position Rn̂, where R is the radius to that (instantaneous) last scattering surface. Evidently, multiple scatterings at the epoch of recombination, combined with the fact that anisotropies themselves act as sources for more anisotropies, complicate this picture, and in general the relationship of the sources with the anisotropies must be calculated from either a set of Einstein-Boltzmann equations or, equivalently, from the line-of-sight integral equations coupled with the Einstein, continuity and Euler equations [6].
Assuming for simplicity that recombination was instantaneous, at a time η_R, the linear kernels introduced above collapse onto the last scattering surface. The photon distribution that we measure on Earth would therefore be given by:

Θ(n̂) = Σᵢ βᵢ Sᵢ(Rn̂, η_R) .   (5)

We can also express this result in terms of the Fourier spectrum of the sources, using the plane-wave expansion:

e^{i k·x} = 4π Σ_{ℓm} i^ℓ j_ℓ(kx) Y*_{ℓm}(k̂) Y_{ℓm}(x̂) ,   (6)

where the j_ℓ(z) are the spherical Bessel functions. Substituting Eq. (6) into Eq. (5) we obtain:

a_{ℓm} = 4π i^ℓ ∫ d³k/(2π)³ j_ℓ(kR) Y*_{ℓm}(k̂) Θ(k) ,   (7)

where we have loosely collected the sources into the term Θ(k) ≡ Σᵢ βᵢ Sᵢ(k, η_R). This expression conveys well the simple relation between the Fourier modes and the spherical harmonic modes. Therefore, up to coefficients which are known given some background cosmology, the statistical properties of the harmonic coefficients a_{ℓm} are inherited from those of the Fourier modes Θ(k) of the underlying matter and metric fields. Notice that the properties of the a_{ℓm}'s under rotations, on the other hand, have nothing to do with the statistical properties of the fluctuations: they come directly from the spherical harmonic functions Y_{ℓm}.
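The geometric content of Eq. (7) is that the spherical Bessel function j_ℓ(kR) peaks near kR ≈ ℓ, so a Fourier mode of wavenumber k feeds mostly into multipoles ℓ ≈ kR. A quick numerical check (the grid and the multipole choices are arbitrary):

```python
import numpy as np
from scipy.special import spherical_jn

# j_l(z) is tiny for z < l, has its highest peak near z ~ l, and then
# decays while oscillating: a Fourier mode of wavenumber k projects
# mainly onto multipoles l ~ k*R.
z = np.linspace(0.1, 300.0, 20000)
peaks = {}
for ell in (10, 50, 100):
    peaks[ell] = z[np.argmax(spherical_jn(ell, z))]
    print(ell, round(peaks[ell], 1))  # peak location slightly above ell
```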

Statistics in Fourier space
The characterization of the statistics of random variables is most commonly expressed in terms of correlation functions. The two-point correlation function is the ensemble expectation value:

C(k, k') ≡ ⟨Θ(k) Θ(k')⟩ .

In the absence of any symmetries, this would be a generic function of the arguments k and k', with only two constraints: first, because Θ(x) is a real function, Θ*(k) = Θ(−k), hence in our definition C*(k, k') = C(−k, −k'); and second, because the expectation value is symmetric under exchange of its arguments, C(k, k') = C(k', k). It is obvious how to generalize this definition to 3, 4 or an arbitrary number of fields at different k's (or "points").

Let us first discuss the issue of gaussianity. If we say that the variables Θ(k) are Gaussian random numbers, then all the information that characterizes their distribution is contained in their two-point function C(k, k'). The probability distribution function (pdf) is then formally given by:

P[Θ] ∝ exp[ −(1/2) ∫ d³k d³k' Θ*(k) C⁻¹(k, k') Θ(k') ] .

In this case, all higher-order correlation functions are either zero (for odd numbers of points) or are simply connected to the two-point function by means of Wick's theorem:

⟨Θ(k₁) Θ(k₂) · · · Θ(k₂ₙ)⟩ = Σ_pairings C(k_{i₁}, k_{j₁}) · · · C(k_{iₙ}, k_{jₙ}) ,

where the sum runs over all distinct ways of partitioning the wave vectors into pairs.

Second, let's consider the issue of homogeneity. A field is homogeneous if its expectation values (or averages) do not depend on the spatial points where they are evaluated. In terms of the N-point functions in real space, we should have, for any displacement d:

⟨Θ(x₁ + d) Θ(x₂ + d) · · · Θ(x_N + d)⟩ = ⟨Θ(x₁) Θ(x₂) · · · Θ(x_N)⟩ .

Writing this expression in terms of the Fourier modes, we get that:

⟨Θ(x₁) · · · Θ(x_N)⟩ = ∫ d³k₁/(2π)³ · · · d³k_N/(2π)³ e^{i(k₁·x₁ + ... + k_N·x_N)} ⟨Θ(k₁) · · · Θ(k_N)⟩ .   (11)

Homogeneity demands that the expression in Eq. (11) be a function of the distances between the spatial points only, not of the points themselves. Hence, the expectation value in Fourier space on the right-hand side of this expression must be proportional to δ(k₁ + k₂ + . . . + k_N).
In other words, the hypothesis of homogeneity constrains the N-point function in Fourier space to be of the form:

⟨Θ(k₁) · · · Θ(k_N)⟩ = Ñ(k₁, . . . , k_N) δ(k₁ + . . . + k_N) .   (12)

Notice that the "(N − 1)-spectrum" in Fourier space, Ñ, can still be a function of the directions of the wavenumbers k̂ᵢ (it will be, in fact, a function of N − 1 such vectors, due to the global momentum conservation expressed by the δ-function). Models which realize the general idea of Eq. (12) correspond to homogeneous but anisotropic universes [37,38,39,40].
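Returning briefly to gaussianity: Wick's theorem above is easy to check by Monte Carlo for a handful of jointly Gaussian variables. The sketch below uses an arbitrary, positive-definite 4×4 covariance (purely illustrative, not tied to any cosmological model) and compares the sampled 4-point function with the sum over the three pairings:

```python
import numpy as np

rng = np.random.default_rng(1)

# Arbitrary positive-definite covariance for four zero-mean Gaussians.
cov = np.array([[2.0, 0.5, 0.3, 0.1],
                [0.5, 1.5, 0.4, 0.2],
                [0.3, 0.4, 1.2, 0.3],
                [0.1, 0.2, 0.3, 1.2]])
x = rng.multivariate_normal(np.zeros(4), cov, size=1_000_000)

# Wick: <x1 x2 x3 x4> = C12 C34 + C13 C24 + C14 C23 (three pairings).
lhs = np.mean(x[:, 0] * x[:, 1] * x[:, 2] * x[:, 3])
rhs = (cov[0, 1] * cov[2, 3] + cov[0, 2] * cov[1, 3]
       + cov[0, 3] * cov[1, 2])
print(lhs, rhs)  # agree to Monte Carlo precision (rhs = 0.25 exactly)
```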
There is a useful diagrammatic illustration for the N-point functions in Fourier space that enforce homogeneity. Notice that we could use the δ-function in Eq. (12) to integrate out any one of the momenta kᵢ in Eq. (11). Let us instead rewrite the δ-functions in terms of triangles, so for the 4-point function we have:

δ(k₁ + k₂ + k₃ + k₄) = ∫ d³p δ(k₁ + k₂ − p) δ(p + k₃ + k₄) ,

whereas for the 5-point function we have:

δ(k₁ + . . . + k₅) = ∫ d³p d³q δ(k₁ + k₂ − p) δ(p + k₃ − q) δ(q + k₄ + k₅) ,

and so on, so that the N-point δ-function is reduced to N − 2 triangles with N − 3 "internal momenta" (the idea is nicely illustrated in Fig. 1). Substituting this form of the N-point δ-function into Eq. (11) and integrating out all external momenta but the first (k₁) and last (k_N), one finds an expression which shows explicitly that the real-space N-point function does not depend on any particular spatial point, only on the intervals between points.

Finally, what are the constraints imposed on the N-point functions that come from isotropy alone? Clearly, no dependence on the directions defined by the points, xᵢ − xⱼ, can arise in the final expression for the N-point functions in real space, so from Eq. (11) we see that the N-point function in Fourier space should depend only on the moduli of the wavenumbers - up to some momentum-conservation δ-functions, which naturally carry vector degrees of freedom.
In this review we will mostly be concerned with tests of isotropy given homogeneity (but not necessarily Gaussianity), so in our case we will usually assume that the N -point function in Fourier space assumes the form given in Eq. (12).

Statistics in harmonic space
In the previous section we characterized the statistics of our field in Fourier space, which in most cases is most easily related to fundamental models such as inflation. Now we will change to the harmonic representation, which is what is most directly related to observations of the CMB temperature Θ(n̂), taken over the unit sphere S². We will discuss mostly the two-point function here, and defer a fuller discussion of N-point functions in harmonic space to Section 3.
From Eq. (7) we can start by taking the two-point function in harmonic space and computing it in terms of the two-point function in Fourier space:

⟨a_{ℓm} a*_{ℓ'm'}⟩ = (4π)² i^{ℓ−ℓ'} ∫ d³k/(2π)³ d³k'/(2π)³ j_ℓ(kR) j_{ℓ'}(k'R) Y*_{ℓm}(k̂) Y_{ℓ'm'}(k̂') ⟨Θ(k) Θ*(k')⟩ .

Under the hypothesis of homogeneity, this expression simplifies considerably, leading to:

⟨a_{ℓm} a*_{ℓ'm'}⟩ = (4π)² i^{ℓ−ℓ'} ∫ d³k/(2π)³ j_ℓ(kR) j_{ℓ'}(kR) Y*_{ℓm}(k̂) Y_{ℓ'm'}(k̂) Ñ₂(k) .

If, in addition to homogeneity, we also assume isotropy, then Ñ₂ → P(k), and the integration over angles factors out, leading to the orthogonality condition for spherical harmonics:

∫ dΩ_k Y*_{ℓm}(k̂) Y_{ℓ'm'}(k̂) = δ_{ℓℓ'} δ_{mm'} ,

and as a result the covariance of the a_{ℓm}'s becomes diagonal:

⟨a_{ℓm} a*_{ℓ'm'}⟩ = [ 4π ∫ d ln k ∆²_T(k) j²_ℓ(kR) ] δ_{ℓℓ'} δ_{mm'} ≡ C_ℓ δ_{ℓℓ'} δ_{mm'} ,   (18)

where we have defined the usual temperature power spectrum ∆²_T(k) = k³P(k)/2π² in the middle expression, and the angular power spectrum C_ℓ in the last equality of Eq. (18). As a pedagogical note, let's recall that the power spectrum basically expresses how much power the two-point correlation function has per unit log k:

⟨Θ²(x)⟩ = ∫ d ln k ∆²_T(k) .   (19)

In an analogous manner to what was done above, we can also construct the angular two-point correlation function in harmonic space:

C(n̂₁, n̂₂) ≡ ⟨Θ(n̂₁) Θ(n̂₂)⟩ = Σ_{ℓm} Σ_{ℓ'm'} ⟨a_{ℓm} a*_{ℓ'm'}⟩ Y_{ℓm}(n̂₁) Y*_{ℓ'm'}(n̂₂) .   (20)

The hypothesis of homogeneity by itself does not lead to significant simplifications, but isotropy leads to a very intuitive expression for the angular two-point function:

C(n̂₁, n̂₂) = Σ_ℓ [(2ℓ + 1)/4π] C_ℓ P_ℓ(n̂₁ · n̂₂) .   (21)

Clearly, not only is this expression the analogue on S² of Eq. (19), but in fact the Fourier power spectrum ∆²_T(k) and the angular power spectrum C_ℓ are defined in terms of each other, as already indicated in Eq. (18):

C_ℓ = 4π ∫ d ln k ∆²_T(k) j²_ℓ(kR) .

Now, using the facts that the spherical Bessel function of order ℓ peaks when its argument is approximately given by ℓ, and that ∫ d ln z j²_ℓ(z) = 1/[2ℓ(ℓ + 1)], we obtain:

C_ℓ ≈ [2π/ℓ(ℓ + 1)] ∆²_T(ℓ/R) .

Incidentally, from this expression it is clear why it is customary to plot the combination ℓ(ℓ + 1) C_ℓ / 2π, which directly estimates the power per logarithmic interval at the scale k ≈ ℓ/R. Using Eq. (11) we can easily generalize the results of this subsection to N-point functions in S² and in harmonic space; however, the assumption of isotropy alone does very little to simplify our life.
The hypothesis of homogeneity, on the other hand, greatly simplifies the angular N -point functions, and most of the work in statistical anisotropy of the CMB that goes beyond the two-point function assumes that homogeneity holds. Notice that the issue of gaussianity is, as always, confined to the question of whether or not the two-point function holds all information about the distribution of the relevant variables, and is therefore completely separated from questions about homogeneity and/or isotropy.
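As a practical aside, the isotropic two-point function of Eq. (21) is a one-liner to evaluate numerically as a Legendre series once the C_ℓ's are known. A sketch with a toy, roughly 1/[ℓ(ℓ + 1)] spectrum (an illustrative choice, not a computed cosmological spectrum):

```python
import numpy as np
from numpy.polynomial.legendre import legval

# C(theta) = sum_l (2l+1)/(4 pi) C_l P_l(cos theta), as a Legendre series.
lmax = 50
ells = np.arange(lmax + 1)
C_ell = np.zeros(lmax + 1)
C_ell[2:] = 1.0 / (ells[2:] * (ells[2:] + 1.0))  # toy spectrum, no monopole/dipole

coeffs = (2 * ells + 1) / (4 * np.pi) * C_ell
theta = np.linspace(0.0, np.pi, 181)
C_theta = legval(np.cos(theta), coeffs)

# Sanity check at zero lag: P_l(1) = 1, so C(0) = sum_l (2l+1) C_l / 4pi.
print(C_theta[0], coeffs.sum())
```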
Also notice that the separable nature of the definition (20) implies here as well, as in Fourier space, a reciprocity relation for the correlation function: C(n̂₁, n̂₂) = C(n̂₂, n̂₁).
This symmetry must hold regardless of underlying models, and is important in order to analyze the symmetries of the correlation function, as we shall see later.
Before we move on, it is perhaps important to mention that the decomposition (20) is not unique. In fact, instead of the angular momenta of the parts, (ℓ₁, m₁; ℓ₂, m₂), we could equally well have used the basis of total angular momentum (L, M; ℓ₁, ℓ₂) and decomposed that expression as:

C(n̂₁, n̂₂) = Σ_{LM} Σ_{ℓ₁ℓ₂} A^{LM}_{ℓ₁ℓ₂} {Y_{ℓ₁}(n̂₁) ⊗ Y_{ℓ₂}(n̂₂)}_{LM} ,   (25)

where the {Y_{ℓ₁}(n̂₁) ⊗ Y_{ℓ₂}(n̂₂)}_{LM} are known as the bipolar spherical harmonics, defined by [41]:

{Y_{ℓ₁}(n̂₁) ⊗ Y_{ℓ₂}(n̂₂)}_{LM} = Σ_{m₁m₂} C^{LM}_{ℓ₁m₁ℓ₂m₂} Y_{ℓ₁m₁}(n̂₁) Y_{ℓ₂m₂}(n̂₂) ,

where the C^{LM}_{ℓ₁m₁ℓ₂m₂} are Clebsch-Gordan coefficients, and L and M = m₁ + m₂ are the eigenvalues of the total and azimuthal angular momentum operators, respectively. This decomposition is completely equivalent to Eq. (20), and we can pass from one decomposition to the other by using the standard relation between the Clebsch-Gordan coefficients and the well-known Wigner 3-j symbols. At this point, it is only a matter of mathematical convenience whether we choose to decompose the correlation function as in (20) or as in (25). Although the bipolar harmonics behave similarly to the usual spherical harmonics in many respects, the modulations of the correlation function as described in this basis have a peculiar interpretation. We will not go into further detail about this decomposition here, as it is discussed at length in another review article in this volume.

Estimators and cosmic variance
Returning to the covariance matrix (18), we see that, if we assume gaussianity of the a_{ℓm}'s, then the angular power spectrum suffices to describe statistically how much the temperature fluctuates on any given angular scale; all we have to do is to calculate the average in (18). This can be a problem, though, since we have only one Universe to measure, and therefore only one set of a_{ℓm}'s. In other words, the average in (18) is poorly determined. At this point, the hypothesis that our Universe is spatially homogeneous and isotropic at cosmological scales comes not only as a simplifying assumption about the spacetime symmetries, but also as a remedy to this unavoidable smallness of the working cosmologist's sample space. If isotropy holds, different cosmological scales are statistically independent, which means that we can take advantage of the ergodic hypothesis and trade averaging over an ensemble for averaging over space. In other words, for a given ℓ we can consider each of the 2ℓ + 1 real numbers in the a_{ℓm}'s as statistically independent Gaussian random variables, and define a statistical estimator for their variances as the average:

Ĉ_ℓ ≡ 1/(2ℓ + 1) Σ_{m=−ℓ}^{ℓ} |a_{ℓm}|² .   (27)

The smaller the angular scales (larger ℓ), the larger the number of independent patches into which the CMB sky can be divided. Therefore, in this limit we should have:

Ĉ_ℓ → C_ℓ as ℓ → ∞ .

On the other hand, for large angular scales (small ℓ's), the number of independent patches of our Universe becomes smaller, and (27) becomes a weak estimation of the C_ℓ's. This means that any statistical analysis of the Universe on large scales will be plagued by this intrinsic cosmic sample variance. Notice that this is an unavoidable limit as long as we have only one observable Universe. Finally, it is important to keep in mind the clear distinction between the angular power spectrum C_ℓ and its estimator (27). The former is a theoretical variable which can be calculated from first principles, as we have shown in §2.1.
The latter, being a function of the data, is itself a random variable. In fact, if the a_{ℓm}'s are Gaussian, then we can rewrite expression (27) as:

Ĉ_ℓ = C_ℓ X / (2ℓ + 1) ,

where X is a chi-square random variable with 2ℓ + 1 degrees of freedom. According to the central limit theorem, when ℓ → ∞ the standardized variable X approaches a normal variable, which implies that Ĉ_ℓ will itself follow a Gaussian distribution. Its mean can easily be calculated using (18) and (27), and is of course given by:

⟨Ĉ_ℓ⟩ = C_ℓ ,

which shows that the Ĉ_ℓ's are unbiased estimators of the C_ℓ's. It is also straightforward to calculate its variance (valid for any ℓ):

Var(Ĉ_ℓ) = 2 C²_ℓ / (2ℓ + 1) .

Because this estimator does not couple different cosmological scales, it has the minimum cosmic variance we can expect from an estimator due to the finiteness of our sample, so it is optimal in that sense. Ĉ_ℓ is therefore the best estimator we can build to measure the statistical properties of the multipolar coefficients a_{ℓm} when both statistical isotropy and gaussianity hold. In later Sections we will explore angular or harmonic N-point functions for which the assumption of isotropy does not hold. However, it is important to remember at all times that we have only one map, which means one set of a_{ℓm}'s. The estimator for the angular power spectrum, Ĉ_ℓ, takes into account all the a_{ℓm}'s by dividing them into the different ℓ's and summing over all m ∈ [−ℓ, ℓ]. Clearly, it will inherit a sample variance for small ℓ's, when the a_{ℓm}'s can only be divided into a few "independent parts". As we try to estimate higher-order objects such as the N-point functions, we will have to subdivide the a_{ℓm}'s into smaller and smaller subsamples, which are not necessarily independent (in the statistical sense) of each other. So, the price to pay for aiming at higher-order statistics is a worsening of the cosmic sample variance.
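The chi-square character of the estimator and the cosmic-variance formula Var(Ĉ_ℓ) = 2C²_ℓ/(2ℓ + 1) can be checked by simulating many independent "skies" at a single multipole (all numbers below are illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(3)

# At a single multipole, the 2l+1 real degrees of freedom of the a_{lm}
# are independent Gaussians of variance C_l, so hat{C}_l is C_l times a
# chi-square with 2l+1 degrees of freedom, divided by 2l+1.
ell, C_ell, n_skies = 10, 1.0, 200_000
dof = 2 * ell + 1

a = rng.normal(scale=np.sqrt(C_ell), size=(n_skies, dof))
C_hat = (a**2).sum(axis=1) / dof   # the estimator of Eq. (27), one per sky

print(C_hat.mean())                      # ~ C_l: the estimator is unbiased
print(C_hat.var(), 2 * C_ell**2 / dof)   # both ~ 0.095: cosmic variance
```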

Correlation and statistical independence
The covariance given in Eq. (18) has two distinct, important properties. First, note that its diagonal entries, the C_ℓ's, are m-independent coefficients; this is crucial for statistical isotropy, as we will show later. Second, statistical isotropy at the Gaussian level implies that different cosmological "scales" (understood here as the modes with total angular momentum ℓ and azimuthal momentum m) should be statistically independent of each other - and this is represented by the Kronecker deltas in (18).
In fact, statistical independence of cosmological scales is a particular property of Gaussian and statistically isotropic random fields, and is not guaranteed to hold when gaussianity is relaxed. We will see in the next Section that the rotationally invariant 3-point correlation function (and in general any N > 2 correlation function) couples the three scales involved. In particular, if it happens that the Gaussian contribution of the temperature field is given by (18), but at least one of its non-Gaussian moments is nonzero, then the fact that a particular correlation is zero, like for example ⟨a_{2m₁} a*_{3m₂}⟩, does not imply that the scales ℓ = 2 and ℓ = 3 are (statistically) independent. This is just a restatement of the fact that, while statistical independence implies null correlation, the opposite is not necessarily true. This can be illustrated by the following example: consider a random variable α uniformly distributed in the unit interval:

P(α) = 1 for 0 ≤ α ≤ 1, and zero otherwise.

Let us now define two other variables, x = cos(2πα) and y = sin(2πα). From these definitions it follows that x and y are statistically dependent variables, since x² + y² = 1 and hence knowledge of x fixes y up to a sign. However, these variables are clearly uncorrelated:

⟨xy⟩ = ⟨cos(2πα) sin(2πα)⟩ = (1/2) ⟨sin(4πα)⟩ = 0 = ⟨x⟩⟨y⟩ .

Although correlations are among the cosmologist's most popular tools when analyzing CMB properties, statistical independence may turn out to be an important property as well, especially at large angular scales, where cosmic variance is more of a critical issue.
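This example can be checked directly by Monte Carlo (the sample size is arbitrary):

```python
import numpy as np

rng = np.random.default_rng(4)

# alpha uniform in [0,1]: x and y are uncorrelated but fully dependent.
alpha = rng.uniform(0.0, 1.0, 1_000_000)
x, y = np.cos(2 * np.pi * alpha), np.sin(2 * np.pi * alpha)

print(np.mean(x * y) - np.mean(x) * np.mean(y))  # ~0: uncorrelated
print(np.max(np.abs(x**2 + y**2 - 1)))           # 0 up to roundoff: dependent
```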
Until now we have been analyzing the properties of Gaussian and statistically isotropic random temperature fluctuations. This gives us a fairly good statistical description of the Universe in its linear regime, as confirmed by the astonishing success of the ΛCDM model. This picture is incomplete, though, and we have good reasons to search for deviations from either gaussianity and/or statistical isotropy. For example, the observed clustering of matter in galactic environments certainly goes beyond the linear regime where the central limit theorem can be applied, therefore leading to large deviations from gaussianity in the statistics of the matter density field. Besides, deviations from the cosmological principle may leave an imprint in the statistical moments of cosmological observables, which can be tested by searching for spatial inhomogeneities [42] or directionalities [43]. But how do we plan to go beyond the standard model, given that there is only one Gaussian and statistically isotropic description of the Universe, but infinitely many possibilities otherwise? This is in fact an ambitious endeavor, which may strongly depend on observational and theoretical hints about the type of signatures we are looking for. In the absence of extra input, it is important to classify these signatures in a general scheme, differentiating those which are non-Gaussian from those which are anisotropic. Furthermore, given that the signatures of non-gaussianity may in principle be quite different from those of statistical anisotropy, such a classification is crucial for data analysis, which requires sophisticated tools capable of separating these two issues.
We therefore start §3.1 by analyzing deviations from gaussianity when statistical isotropy holds. In §3.2 we keep the hypothesis of gaussianity and analyze the consequences of breaking statistical rotational invariance. We turn now to the question of non-Gaussian but statistically isotropic probability distributions. We will keep working with the N-point correlation function defined in harmonic space,

Non-Gaussian and SI models
since knowledge of these functions enables one to fully reconstruct the CMB temperature probability distribution. Specifically, we would like to know the form of any N-point correlation function which is invariant under arbitrary 3-dimensional spatial rotations. When rotated to a new (primed) coordinate system, the N-point correlation function transforms as:

⟨a_{ℓ₁m'₁} · · · a_{ℓ_Nm'_N}⟩ = Σ_{m₁...m_N} D^{ℓ₁}_{m'₁m₁}(α, β, γ) · · · D^{ℓ_N}_{m'_Nm_N}(α, β, γ) ⟨a_{ℓ₁m₁} · · · a_{ℓ_Nm_N}⟩ ,   (29)

where the D^{ℓ}_{m'm}(α, β, γ)'s are the coefficients of the Wigner rotation matrix, which depend on the three Euler angles α, β and γ characterizing the rotation. Notice that in this notation the primed (rotated) system is indicated by the primed m's. For the 2-point correlation function, we have already seen that the well-known expression does the job:

⟨a_{ℓ₁m₁} a*_{ℓ₂m₂}⟩ = C_{ℓ₁} δ_{ℓ₁ℓ₂} δ_{m₁m₂} .

Note the importance of the angular spectrum, C_ℓ, being an m-independent function. What about the 3-point function? In this case, the invariant combination is found to be:

⟨a_{ℓ₁m₁} a_{ℓ₂m₂} a_{ℓ₃m₃}⟩ = B_{ℓ₁ℓ₂ℓ₃} ( ℓ₁ ℓ₂ ℓ₃ ; m₁ m₂ m₃ ) ,

where ( ℓ₁ ℓ₂ ℓ₃ ; m₁ m₂ m₃ ) denotes a Wigner 3-j symbol; this can be verified by straightforward calculation. Again, the non-trivial physical content of this statistical moment is contained in an arbitrary but otherwise m-independent function: the bispectrum B_{ℓ₁ℓ₂ℓ₃} [45,46,47]. As we anticipated in Section 2, rotational invariance of the 3-point correlation is not enough to guarantee statistical independence of the three cosmological scales involved in the bispectrum, although in principle a particular model could be formulated to ensure that B_{ℓ₁ℓ₂ℓ₃} ∝ δ_{ℓ₁ℓ₂} δ_{ℓ₂ℓ₃}, at least for some subset of the general geometric configurations of the 3-point correlation function. These general properties hold for all the N-point correlation functions. For the 4-point correlation function, for example, Hu [48] found the following rotationally invariant combination:

⟨a_{ℓ₁m₁} a_{ℓ₂m₂} a_{ℓ₃m₃} a_{ℓ₄m₄}⟩ = Σ_{LM} (−1)^M Q_{ℓ₁ℓ₂ℓ₃ℓ₄}(L) ( ℓ₁ ℓ₂ L ; m₁ m₂ M ) ( ℓ₃ ℓ₄ L ; m₃ m₄ −M ) ,

where the function Q_{ℓ₁ℓ₂ℓ₃ℓ₄}(L) is known as the trispectrum, and L is an internal angular momentum needed to ensure parity invariance.
In a like manner, higher-order rotationally invariant expressions can be verified. The examples above should be enough to show how the general structure of these functions emerges under SI: apart from an m-independent function, every pair of momenta ℓ_i in these functions is connected by a triangle, which in turn connects itself to other triangles through internal momenta when more than 3 scales are present. In Fig. 2 we show some diagrams representing the functions above.
Although we have only shown N-point functions which are rotationally invariant, the procedure used for obtaining them was rather intuitive, and therefore does not offer a recipe for constructing general invariant correlation functions. Furthermore, it does not guarantee that this procedure can be extended to arbitrary N's. Here we will present a recipe for doing that, which also guarantees the uniqueness of the solution.
The general recipe for obtaining the rotationally invariant N-point function is as follows: starting from the expression (29) above, we contract every pair of Wigner functions, where by "contracting" we mean using the appropriate product identity, and where ω = {α, β, γ} is a shorthand notation for the three Euler angles. Once this contraction is done there will remain N/2 D-functions, which can again be contracted in pairs. This procedure should be repeated until there is only one Wigner function D^L_{M′M}(ω) left. Now, we see that the only way for this combination to be rotationally invariant is when this remaining D-function does not depend on ω, i.e., D^L_{M′M}(ω) = δ_{L0} δ_{M′0} δ_{M0}. Once this identity is applied to the geometrical factors, we are done, and the remaining terms inside the primed m-summation will give the rotationally invariant (N−1)-spectrum.
As an illustration of this algorithm, let us construct the rotationally invariant spectrum and bispectrum. For the 2-point function there is only one contraction to be done, and after we simplify the last Wigner function we arrive at the well-known definition of the temperature angular spectrum, C_ℓ. For the 3-point function there are two contractions, and from the resulting expression and the orthogonality of the 3-j symbols (see the Appendix), we can immediately identify the definition of the bispectrum. It should be mentioned that this recipe not only enables us to establish the rotational invariance of any N-point correlation function, but it also furnishes a straightforward definition of unbiased estimators for the N-point functions: all we have to do is to drop the ensemble average of the primed a_ℓm's, which gives the unbiased estimators for the 2- and 3-point functions above. Notice that isotropy plays the same role, in S², that homogeneity plays in R³. What enforces homogeneity in R³ are the Fourier-space δ-functions, as in the discussion around Fig. 1. However, in S² the equivalent of the Fourier modes are the harmonic modes, for which there is only a discrete notion of orthogonality, and no Dirac δ-function. What we found above is that the Wigner 3-j symbols play the same role as the Fourier-space δ-functions: they are the enforcers of isotropy (rotational invariance) for the N-point angular correlation function. Hence, the diagrammatic representations of the constituents of the N-point functions in Fourier (Fig. 1) and in harmonic space (Fig. 2) really do convey the same physical idea, one in R³, the other in S².
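As a concrete illustration, the unbiased spectrum estimator just described, Ĉ_ℓ = (2ℓ+1)⁻¹ Σ_m |a_ℓm|², can be sketched in a few lines of Python. This is a toy implementation with made-up coefficients, not a substitute for standard CMB analysis tools:

```python
def angular_spectrum_estimator(alm):
    """Unbiased estimator C_hat_ell = (1/(2*ell+1)) * sum_m |a_lm|^2.

    `alm` maps m = -ell..ell to the complex coefficients of one multipole.
    """
    ell = (len(alm) - 1) // 2
    return sum(abs(a) ** 2 for a in alm.values()) / (2 * ell + 1)

# Toy multipole ell = 2: five coefficients, m = -2..2
alm = {-2: 1 + 1j, -1: 0.5j, 0: 2.0, 1: -0.5j, 2: 1 - 1j}
C2 = angular_spectrum_estimator(alm)   # (2 + 0.25 + 4 + 0.25 + 2) / 5
```

Dropping the ensemble average in this way makes the estimator unbiased by construction: its expectation value is exactly C_ℓ for any statistically isotropic model.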

Gaussian and Statistically Anisotropic models
In the last section, we have developed an algorithm which enables one to establish the rotational invariance of any N -point correlation function. As we have shown, this is also an algorithm for building unbiased estimators of non-Gaussian correlations. In this section we will change the perspective and analyze the case of Gaussian but statistically anisotropic models of the Universe.
There are many ways in which statistical anisotropy may manifest itself in the CMB. From a fundamental perspective, a short phase of inflation which produces just enough e-folds to solve the standard Big Bang problems may leave imprints on the largest scales of the Universe, provided that the spacetime is sufficiently anisotropic at the onset of inflation [39]. Another source of anisotropy may result from our inability to efficiently clean foreground contamination from temperature maps. Usually, the cleaning procedure involves the application of a mask function in order to eliminate contamination by the galactic plane from raw data. As a consequence, this procedure may either induce or hide anomalies in CMB maps [28].
It is important to mention that these two examples can be treated as perfectly Gaussian: in the first case, the anisotropy of the spacetime can be established in the linear regime of perturbation theory, and therefore will not destroy the gaussianity of the quantum modes, provided that they are initially Gaussian. In the second case, the mask acts linearly on the temperature maps, therefore preserving their probability distribution [49].

Primordial anisotropy
Recently, there have been many attempts to test the isotropy of the primordial Universe through the signatures of an anisotropic inflationary phase [38,39,40,50,51]. A generic prediction of such models is the linear coupling of the scalar, vector and tensor modes through the spatial shear, which is in turn induced by the anisotropy of the spacetime [38]. Whenever that happens, the matter power spectrum, defined in a similar way as in Eq. (12), will acquire a directional dependence due to this type of see-saw mechanism. This dependence can be accommodated in a harmonic expansion of the form (30), where the reality of P(k⃗) requires that r_{ℓm}(k) = (−1)^m r*_{ℓ,−m}(k). Given that temperature perturbations Θ(x⃗) are real, their Fourier components must satisfy the relation Θ(k⃗) = Θ*(−k⃗). This property, taken together with the definition (12), restricts the ℓ's in (30) to even values. Also, note that by relaxing the assumption of spatial isotropy, we are only breaking a continuous spacetime symmetry, but discrete symmetries such as parity should still be present in this class of models. Indeed, by imposing invariance of the spectrum under the transformation z → −z, we find that (−1)^{ℓ−m} = 1. Similarly, invariance under the transformations x → −x and y → −y implies the conditions r_{ℓm} = (−1)^m r_{ℓ,−m} and r_{ℓm} = r_{ℓ,−m}, respectively. Gathering all these constraints with the parity of ℓ, we conclude that only the real, even-m coefficients survive (Eq. (32)). That is, from the initial 2ℓ+1 degrees of freedom, only ℓ/2+1 of them contribute to the anisotropic spectrum [39].
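The counting of surviving degrees of freedom can be checked with a small bookkeeping script. This is a sketch of the constraint-counting argument above; it encodes only the quoted symmetry conditions, not the underlying physics:

```python
def anisotropic_dof(ell):
    """Count the surviving real degrees of freedom of r_{ell m} after
    imposing the discrete symmetries discussed in the text:
      - reality:          r_{ell,-m} = (-1)^m r*_{ell m}
      - x, y, z parity:   r_{ell m} = (-1)^m r_{ell,-m} = r_{ell,-m}
    Together these kill all odd-m coefficients and force the even-m ones
    to be real, so only real r_{ell m} with even m >= 0 remain.
    """
    if ell % 2 == 1:        # odd ell is excluded altogether
        return 0
    return sum(1 for m in range(ell + 1) if m % 2 == 0)

# Reproduces the ell/2 + 1 counting quoted in the text
assert all(anisotropic_dof(l) == l // 2 + 1 for l in range(0, 20, 2))
```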

Signatures of statistical anisotropies
The selection rules (32) are the most generic predictions we can expect from models with a global breaking of isotropy. We will now work out the consequences of these rules for the temperature power spectrum, and check whether they can say something about the CMB large-scale anomalies. From the expressions (16) and (30) we can immediately calculate the most general anisotropic covariance matrix (33) [37], which involves the Gaunt coefficients resulting from the integral of three spherical harmonics (see the Appendix). These coefficients are zero unless the selection rules (35) are met. The remaining coefficients in (33) correspond to the anisotropic generalization of the temperature power spectrum (18).
The selection rules (35), taken together with (32), lead to important signatures in the CMB. In particular, since ℓ3 is even, the quantity ℓ1 ± ℓ2 must also be even, i.e., multipoles with different parity do not couple to each other in this class of models. This result is in fact expected on theoretical grounds, because by breaking a continuous symmetry (isotropy) we cannot expect to generate a discrete signature (parity) in the CMB. However, notice that the absence of correlations between, say, the quadrupole and the octupole, does not imply that there will be no alignment between them. One example of this would be a covariance matrix of the form C_{ℓm} δ_{ℓℓ′} δ_{mm′}. If the C_{ℓ,0} happen to be zero, for example, then all multipoles will present a preferred direction (in this case, the z-axis).

Isotropic signature of statistical anisotropy
We have just shown that a generic consequence of an early anisotropic phase of the Universe is the generation of even-parity signatures in CMB maps. Interestingly, these signatures may be present even in the isotropic angular spectrum, since the C_ℓ's acquire some additional modulations in the presence of statistical anisotropies. In principle, these modulations could be constrained by measuring an effective angular spectrum of the form (37), where we have introduced a small parameter to quantify the amount of primordial anisotropy.
In order to constrain these modulations we have to build a statistical estimator for ⟨a_ℓm a*_ℓm⟩. Since we are looking for the diagonal entries of the matrix (33), a first guess would be the m-averaged combination (38). To check whether this is an unbiased estimator of the effective angular spectrum, we apply it to (33) and take its average. Using the definition (34) of the Gaunt coefficients, the m-summation in the resulting expression makes the anisotropic terms vanish [41], because these involve only multipoles with ℓ > 0. At first sight, this result may seem innocuous, showing only that this is not an appropriate estimator for (37). Note however that (38) is in fact the estimator of the angular spectrum usually applied to CMB data under the assumption of statistical isotropy. In other words, by means of the usual procedure we may be neglecting important information about statistical anisotropy. Moreover, the cosmic variance induced by the application of this estimator on anisotropic CMB maps is small, as can easily be checked. This result shows that the construction of statistical estimators strongly depends on our prejudices about what non-gaussianity and statistical anisotropy should look like. Consequently, an estimator built to measure one particular property of the CMB may equally well hide other important signatures. One possible solution to this problem is to let the construction of our estimators be guided by what the observations seem to tell us, as we will do in the next Section.
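The blindness of the m-averaged estimator can be illustrated with a toy calculation. We assume, for simplicity, a diagonal covariance with an m-dependent modulation that sums to zero over m; this is an illustrative assumption, not the Gaunt-coefficient computation of the text:

```python
# Toy model: a diagonal but m-dependent covariance
#   <|a_lm|^2> = C * (1 + eps * f(m)),   with  sum_m f(m) = 0,
# mimicking an anisotropic modulation invisible to the m-average.
ell, C, eps = 4, 1.0, 0.1
f = {m: m for m in range(-ell, ell + 1)}          # sum_m f(m) = 0
var = {m: C * (1 + eps * f[m]) for m in f}        # anisotropic variances

# Expectation value of the usual estimator (1/(2*ell+1)) sum_m <|a_lm|^2>
C_hat_mean = sum(var.values()) / (2 * ell + 1)
# The modulation averages out: C_hat_mean == C, hiding the anisotropy
```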

Cosmological Tests
So much for mathematical formalism. We will now turn to the question of how the hypotheses of gaussianity and statistical isotropy of the Universe can be tested. Though we are primarily interested in applying these tests to CMB temperature maps, most of the tools we shall be dealing with can be applied to polarization (E-and B-mode maps) and to catalogs of large-scale structures as well.
Testing the gaussianity and SI of our Universe is a difficult task, especially because, as we have seen, there is only one Gaussian and SI Universe, but infinitely many universes which are neither Gaussian nor isotropic. So what type of non-gaussianity and statistical anisotropy should we test for? In order to attack this problem we can follow two different routes. In the bottom-up approach, models for the cosmic evolution are formulated in such a way as to account for some specific deviations from gaussianity and SI. The underlying physical principles range from non-trivial cosmic topologies [52,53,54], primordial magnetic fields [55,56,57,58,59], and local [60,43,61,62] or global [38,39,40,50] manifestations of anisotropy, to non-minimal inflationary models [51,63,64,65,66,67]. The main advantage of the bottom-up approach is that we know exactly what feature of the CMB is being measured. One of its drawbacks is the plethora of different models and mechanisms that can be tested.
The second possibility is the top-down, or model-independent, approach. Here, we are not concerned with the mechanisms responsible for deviations from gaussianity or SI, but rather with the qualitative features of any such deviation. Once these features are understood, we can use them as a guide for model building. Examples here include constructions of a posteriori statistics [29,35,36] and functional modifications of the two-point correlation function [28,37,68,69,70,71].
In the next section we will explore two different model-independent tests: one based on functional modifications of the two-point correlation function, and another one based on the so-called Maxwell's multipole vectors.

Multipole vectors
Multipole vectors were first introduced in cosmology by Copi et al. [23] as a new mathematical representation of the primordial temperature field, in which each of its multipoles is represented by unit real vectors. Later it was realized that this idea is in fact much older [72], having been proposed originally by J. C. Maxwell in his Treatise on Electricity and Magnetism.
The power of this approach is that the multipole vectors can be calculated entirely in terms of a temperature map, without any reference to external reference frames. This makes them ideal tools to test the morphology of CMB maps, like the quadrupole-octupole alignment.
The purpose of the following presentation is only completeness. A mathematically rigorous introduction to the subject can be found in references [73,72,74], as well as in other articles in this review volume.

Maxwell's representation of harmonic functions
We start our presentation of the multipole vectors by recalling some terminology. A harmonic function in three dimensions is any (twice differentiable) function h that satisfies Laplace's equation, ∇²h = 0, where ∇² is the Laplace operator. In spherical coordinates, the formal solution to Laplace's equation which is regular at the origin (r = 0) is a superposition of terms r^ℓ Y_ℓm. The functions r^ℓ Y_ℓm are known as the solid spherical harmonics [74]. Since they agree with the usual spherical harmonics on the unit sphere, it is sometimes stated in the literature that the latter form a set of harmonic functions. This is an abuse of nomenclature though, and the reader should be careful. Given the scalar nature of Laplace's operator, it is possible to find solutions to Eq. (39) in terms of Cartesian coordinates. Such solutions can be constructed by combining homogeneous polynomials of order ℓ. In three dimensions, the most general homogeneous polynomial of order ℓ contains (ℓ+2)!/(2! ℓ!) independent coefficients. However, since each polynomial must independently satisfy Eq. (39), precisely ℓ!/(2!(ℓ−2)!) of these coefficients will depend on each other. This constraint leaves us with (ℓ+1)(ℓ+2)/2 − ℓ(ℓ−1)/2 = 2ℓ+1 independent degrees of freedom in each multipole, which is, of course, the same number of independent degrees of freedom appearing in Eq. (40).
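The counting above is easy to verify directly. A minimal sketch, where `homogeneous_monomials` and `harmonic_dof` are illustrative names of ours:

```python
from math import comb

def homogeneous_monomials(ell):
    """Number of monomials x^a y^b z^c with a + b + c = ell (3 variables):
    (ell+1)(ell+2)/2."""
    return comb(ell + 2, 2)

def harmonic_dof(ell):
    """Independent coefficients of a harmonic homogeneous polynomial of
    order ell: its Laplacian is a homogeneous polynomial of order ell-2,
    and requiring it to vanish imposes one constraint per monomial of
    that lower order, i.e. ell(ell-1)/2 constraints."""
    constraints = comb(ell, 2) if ell >= 2 else 0
    return homogeneous_monomials(ell) - constraints

# Reproduces the 2*ell + 1 degrees of freedom of each multipole
assert all(harmonic_dof(l) == 2 * l + 1 for l in range(10))
```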
Based on this analysis, Maxwell introduced his own representation of harmonic functions. He noticed that by successively applying directional derivatives of the form v⃗ · ∇ ≡ ∇_v over the monopole potential 1/r, where r = (x² + y² + z²)^{1/2} and v⃗ is a unit vector, he could construct solutions of the form (41), as in Eq. (42), where the λ_ℓ are real constants. We are now going to show that this construction does indeed lead to solutions of the form (41). First, note that there is a pattern which emerges from successive applications of directional derivatives over the monopole function. By induction, one can show that the general expression will be of the form (44), where Q_{ℓ−2} is a homogeneous polynomial of order ℓ−2 which only involves combinations of the vectors v⃗_i and r⃗.
The numerator of the function f_ℓ given by (44) is clearly a homogeneous polynomial of order ℓ (as one can easily check for some ℓ's and also prove by mathematical induction). A not so obvious result is that this polynomial is also harmonic. To prove it, let us call this polynomial g_ℓ and consider the application of the operator ∇² over the combination r^α g_ℓ. Computing the x-component, repeating the process for the y- and z-components and then adding the results, we find an expression in which, in the last step, we use Euler's theorem on homogeneous functions, i.e., r⃗ · ∇g_ℓ = ℓ g_ℓ. If we now choose α = −(2ℓ+1), we find immediately that ∇²(r^α g_ℓ) = r^α ∇²g_ℓ. By construction, the left-hand side of this expression is equal to ∇²f_ℓ. But according to the definition (42) this quantity is also zero, since Laplace's operator commutes with directional derivatives and ∇²(1/r) = 0 for r > 0. Therefore, ∇²g_ℓ = 0 and g_ℓ is harmonic, which completes our proof.
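A quick numerical sanity check of Maxwell's construction for ℓ = 1: the dipole (v⃗ · ∇)(1/r) = −(v⃗ · r⃗)/r³ should be harmonic away from the origin. The sketch below verifies this with a finite-difference Laplacian; the function names are ours:

```python
import math

def dipole(v, x, y, z):
    """f = (v . grad)(1/r) = -(v . r)/r^3, the ell = 1 term of Maxwell's
    construction (up to an overall constant)."""
    r = math.sqrt(x * x + y * y + z * z)
    return -(v[0] * x + v[1] * y + v[2] * z) / r ** 3

def laplacian(f, x, y, z, h=1e-3):
    """Second-order central-difference approximation of the Laplacian."""
    return (f(x + h, y, z) + f(x - h, y, z)
            + f(x, y + h, z) + f(x, y - h, z)
            + f(x, y, z + h) + f(x, y, z - h)
            - 6 * f(x, y, z)) / h ** 2

v = (0.36, 0.48, 0.8)                     # a unit vector
g = lambda x, y, z: dipole(v, x, y, z)
# Harmonic away from r = 0: the numerical Laplacian is ~0
assert abs(laplacian(g, 1.0, 0.5, 0.3)) < 1e-4
```

The residual is dominated by the O(h²) truncation error of the stencil, so the tolerance is deliberately loose.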
In conclusion, Maxwell's construction of harmonic functions, Eq. (42), is completely equivalent to the standard representation in terms of spherical harmonics. More importantly, this gives a one-to-one relationship between temperature maps (given by the a_ℓm's) and the unit vectors v⃗_i. This means that the multipole vectors can be directly calculated from a CMB map, without any reference to external reference frames or additional geometrical constructs. The reader interested in algorithms to construct the vectors from CMB maps may check references [23,72], as well as the other articles in this volume that review this approach.

Multipole vectors statistics
It should be clear from the discussion above that the multipole vectors give an intuitive way to discover, interpret and visualize phase correlations between different multipoles in CMB maps. But they can also reveal intra-multipole features, such as planarity (when a given multipole presents a preferred plane). Some of the most conspicuous hints of statistical anisotropy in the CMB, like the quadrupole-octupole alignment, were indeed first found with less mathematically elegant methods [22,27], but are best described in terms of the multipole vector formalism [23,24,26,72]. However, this case should also sound an alarm, because some feature of a (presumably) random realization was found, then further scrutinized with a certain test which was, to some extent (intentionally or not), tailored to single out that very feature.
The pitfalls of aprioristic approaches are sometimes unavoidable, and all we can do is take a second look at our sample with a more generic set of tools, to try to assess how significant our result really is in the context of a larger set of statistical tests. Multipole vectors are, in fact, ideally suited to this, since they can be found for all multipoles, and it is relatively easy to construct scalar combinations of these vectors with the usual methods of linear algebra. Also convenient is the fact that simulating these vectors from maps, or even directly, is also relatively easy, so the standard model of a Gaussian random field can be easily translated into the pdf's for the tests constructed with the multipole vectors. An important drawback of the multipole vectors is that, because they are computed in terms of equations which are non-linear in the temperature fields, the distribution functions of statistical tests involving these vectors are highly non-Gaussian [75]. This means that these statistics usually have to be estimated by means of simulations (usually assuming that the underlying temperature field is itself Gaussian.) Another delicate issue is how stable the multipole vectors are to instrumental noise and sky cut -and here, again, we must rely on numerical simulations to compare the observations with theoretical models.
In order to see how one should go about constructing a general set of tests, it should be noted first that the multipole vectors define directions, but they have no sense (they are "headless vectors"). This follows from the fact that the signs of the constants λ_ℓ are degenerate with the signs of the vectors v⃗_{ℓ,p} (p = 1, …, ℓ). Hence, the first requirement is that our tests should be independent of this sign ambiguity. Notice that this ambiguity extends to ancillary constructs such as the normal vectors, defined through the vector products of pairs of multipole vectors. Mathematically, the sign ambiguity implies that the multipole vectors belong to the quotient space S²/Z₂ (also known as RP²), while the normal vectors belong to R³/Z₂. In principle, any positive-definite scalar constructed through the multipole or the normal vectors is "fair game", but there are some guidelines, e.g., one should avoid double-counting the same degrees of freedom. Below we review some of these tests, based on the work of Ref. [35].

The R statistic
The first example of a test involving the multipole vectors would be a scalar product, such as v⃗_{ℓ,i} · v⃗_{ℓ′,j}. The most natural test would involve asking whether the multipole vectors of a given multipole are especially aligned or not. This means computing a normalized sum of such products, where the normalization is introduced to make 0 ≤ R_ℓ ≤ 1. The idea can be generalized to test alignments between multipole vectors at different multipoles. In fact, the quadrupole-octupole alignment can already be seen with this simple test: for essentially all CMB maps the significance of the alignment as measured by R_{23} is of the order of 90-95% C.L. [35].
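A minimal sketch of an alignment statistic of this type; the precise normalization of Ref. [35] may differ, and the names are illustrative:

```python
import itertools

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def R_statistic(vectors):
    """Average |v_i . v_j| over distinct pairs of unit multipole vectors.
    The modulus removes the headless-vector sign ambiguity, and averaging
    over pairs keeps 0 <= R <= 1."""
    pairs = list(itertools.combinations(vectors, 2))
    return sum(abs(dot(u, v)) for u, v in pairs) / len(pairs)

# Two extreme toy configurations
aligned = [(0.0, 0.0, 1.0), (0.0, 0.0, -1.0)]       # same axis
orthogonal = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
assert R_statistic(aligned) == 1.0
assert R_statistic(orthogonal) == 0.0
```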

The S statistic
The second most natural test does not directly involve the multipole vectors themselves, but the normal vectors that can be produced by taking the vector products between the multipole vectors. Notice that the number of normal vectors for a given multipole is ℓ(ℓ−1)/2, so the number of normal vectors grows rapidly for larger multipoles, making the test harder to use and meaning that the same degrees of freedom may be overcounted, at least for ℓ > 3. Again, the best strategy is simplicity: we can ask whether the normal vectors are aligned, within and between multipoles. This means computing a normalized sum of scalar products of normal vectors, where the normalization is again introduced to make 0 ≤ S ≤ 1.
And yet again this idea is easily generalized to test alignments between normal vectors at different multipoles. With this test, the quadrupole-octupole alignment is much more significant: for essentially all CMB maps the test S_{23} deviates from the expected range at 98% C.L. [35].
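A similar sketch for the normal-vector test; again, the exact normalization of Ref. [35] may differ, and the toy vectors are purely illustrative:

```python
import itertools

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def normal_vectors(multipole_vectors):
    """All ell(ell-1)/2 normals w = v_i x v_j of a given multipole."""
    return [cross(u, v) for u, v in
            itertools.combinations(multipole_vectors, 2)]

def S_statistic(normals_a, normals_b):
    """Average |w_i . w'_j| between two sets of normal vectors.  Since
    each |w| <= 1 for unit multipole vectors, 0 <= S <= 1."""
    dots = [abs(sum(a * b for a, b in zip(w, wp)))
            for w in normals_a for wp in normals_b]
    return sum(dots) / len(dots)

quad = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]                   # toy ell = 2
octu = [(0.0, 1.0, 0.0), (1.0, 0.0, 0.0), (1.0, 0.0, 0.0)]  # toy ell = 3
S23 = S_statistic(normal_vectors(quad), normal_vectors(octu))
```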

Other tests with multipole vectors
One can go on and expand the types of tests using both multipole and normal vectors. One idea would be, e.g., to disregard the moduli of the normal vectors. Such a test is therefore insensitive to the relative angle between the multipole vectors that produce any given normal vector. This idea is similar to the planar modulations that will be discussed in the next Section. With this test, the quadrupole-octupole alignment is significant to about 95-98% C.L. The test can also be generalized to a self-alignment test (ℓ = ℓ′), with just an adjustment to the normalization. Another possibility would be to measure the alignment between multipole vectors and normal vectors; of course, such a test cannot be easily generalized to a self-alignment. With the B_{23} test, the quadrupole-octupole alignment is significant to about 95-98% C.L. We could go on here, but it should be clear that all the information (the 2ℓ+1 real degrees of freedom for each multipole ℓ) has already been exhausted in the tests above.

Temperature correlation function
Despite their strong cosmological appeal, the multipole vectors have some limitations. Not only are their directions in the CMB sky sometimes difficult to interpret physically [26], but they also have the additional drawback of mixing, in a non-trivial manner, information on both the gaussianity and the SI of the map being analyzed [26,75].
Another way of quantifying deviations in the standard statistical framework of cosmology is through functional modifications of the two-point correlation function [37,68]. Although this approach does not offer an optimal separation between gaussianity and SI (which is, by the way, an open problem in this field), working with the two-point correlation function makes it easier to test Gaussian models of statistical anisotropy [71,28].
The most general 2-point correlation function (2pcf) of two independent unit vectors is a function C : S² × S² → ℝ. We have seen in §2.3 that, if we choose spherical coordinates (θ_i, ϕ_i) to describe each vector n̂_i, this function can be decomposed either in terms of two spherical harmonics Y_{ℓ_i m_i} or in terms of the bipolar spherical harmonics Y^{LM}_{ℓ1ℓ2}. In any case, the 2pcf will have the functional dependence C(θ1, ϕ1, θ2, ϕ2). This function is absolutely general: if the Universe has any cosmological deviation from isotropy, whatever it is, it can be described by this function (see also Fig. 5). Unfortunately, it will be of limited theoretical interest unless we have some hints on how to select its relevant degrees of freedom. This difficulty is in fact a general characteristic of model-independent tools, which at some stage forces us to rely on our theoretical prejudices about the statistical nature of the Universe in order to construct estimators of non-gaussianity and/or statistical anisotropy. Nonetheless, it is still possible to construct statistical estimators of anisotropy based on Eq. (54). For example, Hajian and Souradeep [68] have constructed an unbiased estimator κ for this function based solely on the requirement that the estimator should be rotationally invariant. Although it is true that any statistically significant κ > 0 will point towards anisotropy, it is not clear what type of anisotropy is being detected by this estimator.
We can still use Eq. (54) to search for deviations from SI if we restrict its domain to a smaller and non-trivial sub-domain. For example, we can take the vectors n̂1 and n̂2 to be the same, and expand a function of the form (55), which is equivalent to C : S² → ℝ. This form of the 2pcf makes it ideal for searching for power multipole moments in the CMB, once a suitable estimator is defined [37]. Unfortunately, when we take n̂1 = n̂2 we are in fact considering a one-point correlation function, which by construction does not allow us to measure correlations between different points in the sky.
It seems in principle that the functions (55) and (54) are the only possibilities besides the isotropic 2pcf, Eq. (22). If not, what other combinations of the vectors n̂1 and n̂2 can we consider? As a matter of fact, these two vectors are geometrical quantities intuitively bound to our notion of two-point correlation functions on the sphere; from this perspective they are not fundamental quantities. In fact, we can equally well represent the 2pcf by a disc living inside the unit sphere, as shown in Fig. 4. In this representation, θ is the angle between the vectors n̂1 and n̂2, as usual. The normal to the plane, n̂, is represented by two spherical angles (Θ, Φ). Finally, there is an overall orientation ϕ of the disc around its unit vector which completes the four degrees of freedom contained in Eq. (54). We have found therefore another valid geometrical representation of the most general 2pcf, Eq. (56). The main advantage of this representation when compared to (54) is its straightforward geometrical interpretation. First, note that the angular separation θ of the isotropic 2pcf is trivially included in this definition, and does not need to be obtained as a consequence of rotational invariance (see the discussion in §3.1.1). Second, by characterizing the correlation function in terms of the geometrical components of a disc, we know exactly what the degrees of freedom involved are. This makes it easier to construct estimators of statistical anisotropy, alleviating the drawbacks of model-independent approaches mentioned above.

Anisotropy through planarity
An immediate application of the representation (56) is its use in the search for planar deviations from isotropy in the CMB [71,28]. Planar modulations of astrophysical origin may play an important role in the CMB morphology. One example is the role played by the galactic and ecliptic planes in the quadrupole-octupole and north-south anomalies [26]. Also, it is well known that our galactic plane is a significant source of foreground contamination in the construction of cleaned CMB maps. These hints indicate that CMB modulations induced by the disc in Fig. 4 are not only a mathematical possibility, but perhaps also a symmetry of cosmological relevance.
Since we are primarily interested in measuring planar modulations of the CMB, but including the usual angular modulation as the isotropic limit of the 2pcf, we can consider only the azimuthal average of Eq. (56). The resulting function can easily be expanded in terms of simple special functions, as in Eq. (58), where the restriction on the l-modes results from the symmetry n̂1 ↔ n̂2 [76]. The multipolar coefficients C_ℓ^{lm} correspond to a generalization of the usual angular power spectrum C_ℓ's. In fact, they can be seen as the coefficients of a spherical harmonic decomposition of the function C_ℓ(n̂), provided that this function suffers modulations as we sweep planes on the sphere.

Angular-planar power spectrum
Since we are restricting our analysis to the Gaussian framework, the set of coefficients C_ℓ^{lm} is all we need to characterize the two-point correlation function. However, the final products of CMB observations are temperature maps, and not correlation maps. What we need then is an algebraic relation between the multipolar coefficients C_ℓ^{lm} and the temperature coefficients a_ℓm defined in Eq. (3). At first sight, this relation could be obtained by equating expression (58) to its standard definition in Eq. (20), and then using the orthogonality of the special functions to isolate the C_ℓ^{lm}'s in terms of the a_ℓm's. But for that to work we need the relation between the sets of angles (Θ, Φ, θ) and (θ1, ϕ1, θ2, ϕ2) which, depending on the reference frame we choose, is extremely complicated. Fortunately, all we need is the relation n̂1 · n̂2 = cos θ = cos θ1 cos θ2 + sin θ1 sin θ2 cos(ϕ1 − ϕ2), together with a suitable choice of our coordinate system. For example, we can use the invariance of the scalar product n̂1 · n̂2 and choose our coordinate system such that the disc of Fig. 4 lies in the xy plane. With this choice we will have (Θ, Φ) = (0, 0) and cos θ = cos(ϕ1 − ϕ2), and the integration over θ becomes simple. Once this is done, we make a passive rotation of the coordinate system and then integrate over the remaining angles Θ and Φ, which will then be given precisely by the Euler angles used in the rotation. The details are rather technical and can be found in the Appendix. The final expression is Eq. (60), where the λ's form a set of coefficients resulting from the θ integration, which are zero unless i + m is even (see the Appendix for more details).
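The spherical-trigonometry identity used above is easy to check numerically; a self-contained sketch:

```python
import math

def sph_to_cart(theta, phi):
    """Unit vector with polar angle theta and azimuth phi."""
    return (math.sin(theta) * math.cos(phi),
            math.sin(theta) * math.sin(phi),
            math.cos(theta))

def cos_separation(theta1, phi1, theta2, phi2):
    """cos(theta1)cos(theta2) + sin(theta1)sin(theta2)cos(phi1 - phi2)."""
    return (math.cos(theta1) * math.cos(theta2)
            + math.sin(theta1) * math.sin(theta2) * math.cos(phi1 - phi2))

# Check n1 . n2 = cos(theta) against the explicit dot product
for (t1, p1, t2, p2) in [(0.3, 1.2, 2.1, 0.4), (1.0, 0.0, 1.0, 3.0)]:
    n1, n2 = sph_to_cart(t1, p1), sph_to_cart(t2, p2)
    dot = sum(a * b for a, b in zip(n1, n2))
    assert abs(dot - cos_separation(t1, p1, t2, p2)) < 1e-12
```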
Expression (60) is what we were looking for. With this relation, the angular-planar power spectrum C_ℓ^{lm} can be calculated from first principles for any model predicting a specific covariance matrix. Moreover, since the angular-planar function (58) is, after all, a correlation function, it should be possible to relate the angular-planar power spectrum C_ℓ^{lm} to the bipolar power spectrum A^{LM}_{ℓ1ℓ2} of Hajian and Souradeep. In fact, by inverting expression (26) and plugging the result into (60), we find a linear relation between these two sets of coefficients, involving a set of geometrical coefficients which play a similar role to the 3-j symbols in expression (26). Note also that the angular-planar power spectrum has only three free indices, while the bipolar power spectrum has four. This is a consequence of the azimuthal average we took in (57), which further constrains the degrees of freedom of the correlation function.

Statistical estimators and χ 2 analysis
We have shown that the angular-planar power spectrum (60) is given in terms of an ensemble average of temperature maps. Evidently, we cannot calculate it directly from data, for we have only one CMB map (the one taken from our own Universe.) The best we can do is to estimate the statistical properties of (60), like its mean and variance, and see whether these quantities agree, in the statistical sense, with what we would expect to obtain from a particular model of the Universe.
The reader should note that this procedure is not new; its limitation is due to the same cosmic variance which led us to construct an estimator for the angular power spectrum C_ℓ (see the discussion of §2.4). For the same reason, we will need to construct an estimator for the angular-planar power spectrum. An obvious choice is Eq. (62), for in this case we have an unbiased estimator regardless of the underlying model.
If we now have a model predicting the angular-planar power spectrum, we can ask how well this model fits the observational data once the spectrum is estimated using (62). All we need is a simple chi-square (χ²) goodness-of-fit test, which in our case can be written in the generalized form (63), in which ℓ and l are the angular and planar degrees of freedom, respectively, and where σ_ℓ^{lm} is just the standard deviation of the estimator of C_ℓ^{lm}. The (2l + 1)^{−1} factor accounts for the 2l + 1 planar degrees of freedom, and was introduced for later convenience.
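As an illustration of how such a reduced chi-square is computed in practice, here is a minimal numerical sketch. The names `C_est`, `C_model` and `sigma` are hypothetical stand-ins for the estimated coefficients, the model prediction and the standard deviations at fixed angular multipole; they are not part of the original analysis:

```python
import numpy as np

def chi2_planar(C_est, C_model, sigma):
    """Reduced chi-square over the 2l+1 planar modes of a single
    multipole l: (2l+1)^-1 * sum_m |C_est - C_model|^2 / sigma^2."""
    C_est, C_model, sigma = map(np.asarray, (C_est, C_model, sigma))
    l = (len(C_est) - 1) // 2              # 2l+1 values of m = -l..l
    return np.sum(np.abs(C_est - C_model) ** 2 / sigma ** 2) / (2 * l + 1)

# toy example: l = 2, five planar modes
rng = np.random.default_rng(0)
model = np.zeros(5)                        # isotropic expectation
est = rng.normal(0.0, 1.0, size=5)         # mock estimates, unit variance
print(chi2_planar(est, model, np.ones(5)))  # fluctuates around 1 if data match model
```

The (2l + 1)^{−1} normalization makes the statistic directly comparable across multipoles.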
In §7 we will apply this test to the 5-year WMAP temperature maps in order to check the robustness of the ΛCDM model against the hypothesis of SI. Before that, we shall stop and digress a little about how observational uncertainties should be included in our analysis.

Theory v. Observations: cosmic and statistical variances
Until now we have been concerned with the formal aspects of non-Gaussian and statistically anisotropic universes, and how model-independent tests might be designed to detect deviations from either gaussianity or statistical isotropy. We will now discuss how such tests can be carried out and interpreted once we possess cosmological data.
In a great variety of tests, statistical tools are designed to detect particular deviations from gaussianity or SI in cosmological data such as the CMB. The final outcome of these tests is usually a probability (a pure number), which should be interpreted as the chance that a Universe like ours might result from an ensemble of "equally prepared" Gaussian and SI universes. An anomaly in the CMB is usually understood as a measure of how unlikely a particular feature of our Universe is according to this specific test. The multipole vector statistics, for example, when applied to a large number of simulated (Gaussian and SI) CMB maps, show that only ∼ 0.01% of these maps have a quadrupole-octupole alignment as strong as that of the WMAP maps [26,23].
There are two points to keep in mind when carrying out this type of analysis. The first is that a particular "detection" may always turn out to be a statistical fluctuation revealed by one specific tool. The robustness of an anomaly then depends on the number of independent tests pointing to the same result. The second is that the implementation of statistical tools is sensitive to the way we extract information from data, requiring an accurate separation between cosmological signal and astrophysical/instrumental noise.
In this section we present a critical review of the standard procedure used to implement cosmological tests. We show that it does not account for the intrinsic uncertainties of cosmological observations, which may lead to an under- or overestimation of anomalies. We then present a generalization of this procedure which naturally accounts for these uncertainties.

Standard Calculations
Suppose x is a random variable predicted by a particular model 5 and that cosmological observations of this quantity returned the value x₀. We would like to calculate the probability, according to this model, that in a random Universe we would have x ≤ x₀. Assuming that P_th is the (normalized) probability density function (pdf) of x, this probability is commonly defined to be

P_≤ = ∫_{−∞}^{x₀} P_th(x) dx .    (64)

The probability of having x > x₀ is then simply given by P_> = 1 − P_≤. See the figure below.
If P_≤ is found to be too small (or, equivalently, P_> too high), we might be tempted to interpret x₀ as "anomalous" according to this model. However, this definition of probability assumes that x₀ was measured with infinite precision, and so it says nothing about an important question we must deal with: typically, the measurement of x₀ has an uncertainty which needs to be folded into the final probability that the observations match the theoretical expectations.
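Numerically, Eq. (64) is nothing more than the cumulative distribution function of P_th evaluated at x₀. A sketch, in which a chi-square distribution stands in for P_th (an arbitrary choice for illustration) and x₀ is a mock measurement:

```python
import numpy as np
from scipy import stats

# P_<= from Eq. (64): the cdf of the theoretical distribution at x0.
P_th = stats.chi2(df=5)   # illustrative choice of P_th
x0 = 11.07                # mock measured value, assumed infinitely precise
P_le = P_th.cdf(x0)       # probability of x <= x0
P_gt = 1.0 - P_le         # probability of x > x0
print(P_le, P_gt)
```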
Moreover, since no two equally prepared experiments will ever return the same value x₀, our measurements should also be regarded as random events. In a more rigorous approach, we would have to consider x₀ itself as one realization of a random variable, conditioned on the distribution of the signal. In the case of the CMB, however, this would be only part of the whole picture, since the randomness of the measurements of x₀ is also related to the way the data are reduced to their final form: different map-cleaning procedures will lead to slightly different values of x₀. This difference induces a variance in the data which reflects the remaining foreground contamination of the temperature map. We will elaborate on this point through a concrete example, after we show how Eq. (64) may be changed in order to include the indeterminacy of cosmological measurements.

Convolving probabilities
The question we want to answer is: how do we calculate the probability of x being smaller than our measurement when the latter is also a random event? Let us suppose that our measurement is described by the random variable y and that x₀ is its most probable value. The probability of having x ≤ y is simply the probability of having z ≡ x − y ≤ 0 for some particular realization of the variable y. It should be clear by now that if we know the pdf of z, the probability we are looking for is simply the area under this distribution for −∞ < z ≤ 0. The probability P_≤ of z being smaller than or equal to zero can be calculated as

P_≤ = ∫∫_{x−y≤0} P(x, y) dx dy .

Now, under the hypothesis of independence of x and y we have P(x, y) = P_th(x) P_obs(y), where P_obs is the (normalized) probability distribution function of the variable y. We can therefore rewrite the last expression as

P_≤ = ∫_{−∞}^{∞} dy P_obs(y) ∫_{−∞}^{y} dx P_th(x) .

If we now hold y fixed and make the change of variables x = u + y, we get

P_≤ = ∫_{−∞}^{0} du ∫_{−∞}^{∞} dy P_th(u + y) P_obs(y) ,

where we have changed the order of the integrals. The reader will notice that the inner integral is precisely the pdf we were looking for. Since there is nothing special about the variable y, we can equally well hold x fixed and repeat the calculation, obtaining the symmetric version of this result. In fact, the final pdf for z is nothing else than the convolution of the pdf's of each variable [77]:

P_z(z) = ∫_{−∞}^{∞} P_th(z ± y) P_obs(y) dy ,    (65)

where the plus (minus) sign refers to the difference (sum) of x and y. Integrating this pdf over (−∞, 0] we get our answer:

P_≤ = ∫_{−∞}^{0} P_z(z) dz .    (66)

As a consistency check, notice that in the limit where observations are made with infinite precision, P_obs(y) becomes a delta function centered at x₀, and we recover P_≤ = ∫_{−∞}^{x₀} P_th(x) dx, which agrees with our previous definition. The reader must be careful, though, not to think of Eq. (64) as some lower bound to Eq. (66). Since none of the pdf's appearing in Eq. (65) is necessarily symmetric, a large distance from x₀ to the most probable (not the mean!) value of x would not, by itself, constitute sufficient grounds to claim that the measured value of this observable is "unusual" in any sense, simply because a large overlap between the two pdf's can render the result usual according to Eq. (66).
To illustrate this point, let us calculate P_≤ using Eq. (64) and Eq. (66) for the pdf's shown in Fig. 6. For pedagogical reasons, we have chosen P_th and P_obs as positively (Maxwell distribution) and negatively (Gumbel distribution) skewed pdf's, respectively. The convolved distribution appears as the solid (black) line. For these pdf's, Eq. (64) gives a 99.2% chance of x being smaller than the observed x₀, while Eq. (66) gives 93.6%; all pdf's were normalized to one.
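The comparison is easy to reproduce numerically. The sketch below is a rough analogue of the Fig. 6 setup rather than an exact reproduction: the distribution parameters are arbitrary choices, so the numbers will differ from the 99.2%/93.6% quoted above. Eq. (66) is estimated by Monte Carlo sampling of z = x − y:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
P_th = stats.maxwell(loc=0.0, scale=1.0)     # positively skewed "theory" pdf
P_obs = stats.gumbel_l(loc=4.0, scale=0.5)   # negatively skewed "observation" pdf

x0 = 4.0                                     # most probable observed value
naive = P_th.cdf(x0)                         # Eq. (64): infinite precision

# Eq. (66) by Monte Carlo: P(z = x - y <= 0) with x ~ P_th, y ~ P_obs
x = P_th.rvs(size=200_000, random_state=rng)
y = P_obs.rvs(size=200_000, random_state=rng)
convolved = np.mean(x - y <= 0.0)

print(f"Eq. (64): {naive:.3f}   Eq. (66): {convolved:.3f}")
```

With these particular parameters the naive probability overstates the convolved one, but as stressed above this ordering is not guaranteed in general.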

A χ 2 test of statistical isotropy
Although the last example was constructed to emphasize an important feature of the formalism developed in §6, cosmological observables designed to measure deviations from either gaussianity or statistical isotropy will often follow asymmetric distributions. The intrinsic uncertainties of cosmological measurements, especially the ones originating from map-cleaning procedures, may be crucial when searching for anomalies in the maps.
We will now make a concrete application of this formalism using the angular-planar chi-square test developed in §5.1.

ΛCDM model
In the simplest realization of the ΛCDM model, the covariance matrix of temperature maps is determined by ⟨a_{ℓ₁m₁} a_{ℓ₂m₂}⟩ = (−1)^{m₂} C_{ℓ₁} δ_{ℓ₁ℓ₂} δ_{m₁,−m₂}. Using this expression in (60), we find the angular-planar spectrum (67). On the other hand, if the only non-zero C_ℓ^{lm}'s are those with l = m = 0, then C_ℓ^{00} = C_ℓ. Therefore, statistical isotropy is achieved if and only if the angular-planar power spectrum is of the form (67). Since we are only interested in nontrivial planar modulations, we will restrict our analysis to the cases where l ≠ 0, as in (68), where, we remind the reader, the parity constraint comes from the symmetry (24). For this particular model, we can also calculate the covariance matrix of the estimator (62) explicitly. Using the null hypothesis above, we find after some algebra the expression (69). This matrix has some interesting properties. First, note that the planar degrees of freedom are independent in this case (which justifies the (2l + 1)^{−1} factor introduced in Eq. (63)). Second, its diagonal elements are given by the variance (σ_ℓ^{lm})², which now becomes m-independent: σ_ℓ^{lm} → σ_ℓ^l, as given by (70). Therefore, for the particular case of the ΛCDM model, the chi-square test (63) gets even simpler: using (68) and (70), we find the form (71). It is now clear that if the data under analysis are really described by this model, then it must be true that ⟨(χ²)_ℓ^l⟩ = 1, where we have used (70). This shows that any large deviation of this test from unity will be an indication of planar modulation in temperature maps, up, of course, to error bars. For convenience, let us define the new quantity

χ̄_ℓ^l ≡ (χ²)_ℓ^l − 1 ,    (72)

which will quantify anisotropies whenever it is significantly positive or negative.
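The statement that the reduced chi-square averages to unity under the null hypothesis is easy to verify numerically. Below is a quick Monte Carlo sanity check with mock complex Gaussian coefficients standing in for the estimated angular-planar coefficients; the multipole, variance and ensemble size are illustrative choices only:

```python
import numpy as np

rng = np.random.default_rng(2)
l = 4                        # planar multipole: 2l+1 = 9 modes
n_real = 20_000              # number of mock "universes"
sigma = 1.0                  # common standard deviation per mode

# mock complex Gaussian estimates with zero mean and variance sigma^2
re = rng.normal(0, sigma / np.sqrt(2), (n_real, 2 * l + 1))
im = rng.normal(0, sigma / np.sqrt(2), (n_real, 2 * l + 1))
C = re + 1j * im

# reduced chi-square per realization; its ensemble mean should be ~1
chi2 = np.sum(np.abs(C) ** 2, axis=1) / ((2 * l + 1) * sigma ** 2)
print(chi2.mean())
```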
This generalized chi-square test furnishes a complete prescription for searching for planar modulations of temperature in CMB maps. We emphasize, though, that for a given CMB map the chi-square analysis must be done entirely in terms of that map's data. Since we are performing a model-independent test, we are not allowed to introduce fiducial biases in the analysis (for example, by calculating σ_ℓ^l using the C_ℓ's of a fiducial model), which would only include our a priori prejudices about what the map's anisotropies should look like. The C_ℓ's are not, by construction, a measure of statistical isotropy; consequently, an "anomalous" value of some C_ℓ is by no means a measure of statistical anisotropy, and it is the measured value of C_ℓ that should be used in (72) if we want to find deviations from isotropy, regardless of how high or low it is.

Searching for planar signatures in WMAP
In order to apply the test (72) to the 5-year WMAP data [14,11], we will define two new variables, x ≡ (χ²)_ℓ^l(th) and y ≡ (χ²)_ℓ^l(obs), which will be jointly analyzed using the formalism of §6. There remains the question of how to obtain their pdf's. These functions can be obtained numerically provided that the number of realizations of each variable is large enough, since in this case their histograms can be taken as piecewise constant functions which approximate the real pdf's. For the (theoretical) variable x defined above, we have run 2 × 10⁴ Monte Carlo simulations of Gaussian and statistically isotropic CMB maps using the ΛCDM best-fit C_ℓ's provided by the WMAP team [78]. With these maps we have then constructed 2 × 10⁴ realizations of the variable x.
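The core of this Monte Carlo step, drawing Gaussian, statistically isotropic a_{ℓm}'s from a given angular spectrum, can be sketched in a few lines. The toy C_ℓ values below are arbitrary placeholders, not the WMAP best fit:

```python
import numpy as np

def simulate_alm(Cl, rng):
    """One Gaussian, statistically isotropic realization: a_l0 is real
    with variance C_l; for m > 0, a_lm is complex with variance C_l
    split evenly between real and imaginary parts."""
    alm = {}
    for l, C in enumerate(Cl):
        alm[(l, 0)] = rng.normal(0.0, np.sqrt(C))
        for m in range(1, l + 1):
            alm[(l, m)] = (rng.normal(0, np.sqrt(C / 2))
                           + 1j * rng.normal(0, np.sqrt(C / 2)))
    return alm

rng = np.random.default_rng(3)
Cl = [0.0, 0.0, 1.0e3, 4.0e2, 2.0e2]   # toy spectrum, l = 0..4 (monopole/dipole removed)
maps = [simulate_alm(Cl, rng) for _ in range(2000)]

# sanity check: the sample variance of a_22 should approach C_2 = 1000
var22 = (np.var([m[(2, 2)].real for m in maps])
         + np.var([m[(2, 2)].imag for m in maps]))
print(var22)
```

From each realization one would then build the statistic x and histogram it.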
The simulation of the (observational) variable y is more difficult, and depends on the way we estimate the contamination from residual foregrounds in CMB maps. As is well known, not only instrumental noise, but also systematic errors (e.g., in the map-making process), the inhomogeneous scanning of the sky (i.e., the exposure function of the probe), and unremoved foreground emissions (even after applying a cut-sky mask) can corrupt the CMB data at distinct levels.
Foreground contamination, on the other hand, may have several different sources, many of which are far beyond our present scope. However, since different teams apply distinct procedures to the raw data in order to produce a final map, we will adopt the hypothesis that maps cleaned by different teams represent, to a good extent, "independent" CMB maps. Therefore, we can estimate the residual foreground contamination by comparing these different foreground-cleaned maps.
In fact, the WMAP science team has made substantial efforts to improve the data products by minimizing the contaminating effects caused by diffuse galactic foregrounds, astrophysical point sources, artifacts from the instruments and measurement process, and systematic errors [84,85]. As a result, multi-frequency foreground-cleaned full-sky CMB maps, named Internal Linear Combination (ILC) maps, were produced, corresponding to the three- and five-year WMAP data [14,79].

Table 1: Full-sky foreground-cleaned CMB maps from WMAP data used in our analysis to estimate the variable y (see the text for more details): the maps of Park et al. [82] and of Delabrouille et al. [83], among others. Note that reference [81] includes the analysis of maps from both the three- and five-year WMAP releases.

To account for the mentioned randomness, systematics, and contaminating effects of the CMB data, we will use in our analyses several full-sky foreground-cleaned CMB maps, listed in Table 1, which were produced using both the three- and five-year WMAP data.
The prescription we adopt to determine the distribution of the observational variable y is as follows: we simulate Gaussian random a_{ℓm}'s in such a way that their central values are given by the five-year ILC5 data [79,14], and with a variance which is estimated from the sample standard deviation over all the maps listed in Table 1. So, for example, suppose we have n different full-sky temperature maps at hand and we want to estimate the randomness inherent in the determination of, let us say, a_{32}. We then take

a_{32} ∼ N(a_{32}^{ILC5}, σ_{32}) ,    (73)

with σ_{32} the sample standard deviation of a_{32} over the n maps, and where N(µ, σ) represents a Gaussian distribution with mean µ and standard deviation σ. Note that if the residual contamination is indeed weak, then the sample variance above will be small, and our procedure will reduce to the standard way of calculating probabilities. As for the use of a Gaussian in (73), this choice was dictated not only by simplicity, but also by the fact that the propagation of uncertainties in physical experiments is usually assumed to follow a normal distribution. Note, however, that there are some instrumental uncertainties, such as beam or gain uncertainties, which will not in general follow normal distributions. In fact, some of them may even fail to be additive, meaning that our convolution formula will be inapplicable in these cases. In our analysis we have focused only on foreground residuals, for which the normality hypothesis is reasonable 6 .

Table 3: Final probabilities of obtaining, in a random ΛCDM Universe, a chi-square value smaller than or equal to (χ²)_ℓ^l(obs), as given by full-sky temperature maps.
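The prescription of Eq. (73) amounts to a few lines of code. In this sketch the numerical a_{32} values are invented placeholders for what would come from the maps of Table 1:

```python
import numpy as np

rng = np.random.default_rng(4)

# mock a_32 values from n = 5 hypothetical foreground-cleaned maps;
# in the real analysis these come from the maps listed in Table 1
a32_maps = np.array([12.1, 11.7, 12.4, 11.9, 12.0])
a32_ilc5 = 12.0                    # central value: the ILC5 measurement

sigma32 = a32_maps.std(ddof=1)     # sample standard deviation across maps

# one random realization of the "observed" a_32, per Eq. (73)
a32_random = rng.normal(a32_ilc5, sigma32)
print(sigma32, a32_random)
```

If the maps agree closely, sigma32 shrinks and a32_random collapses onto the ILC5 value, recovering the standard (infinite-precision) calculation.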

Full-sky maps
Following this procedure, we have used the full-sky maps listed in Table 1 to construct 10⁴ Gaussian random a_{ℓm}'s, which were then used to calculate 10⁴ realizations of y = (χ²)_ℓ^l(obs). With these variables we constructed histograms which, together with the histograms for the (full-sky) variable x, were used to calculate the final probability (66). We have restricted our analysis to the range of values (ℓ, l) ∈ [2, 10], since the low multipolar sector (i.e., large angular scales) is where most of the anomalies have been reported. The resulting histograms and pdf's are shown in Fig. 7, and the final probabilities we obtained are shown in Table 3; they correspond to the area under the solid curve from −∞ to 0, with all pdf's normalized to 1.
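The histogram-based evaluation of the final probability (66) can be sketched as follows, with Gamma-distributed toys standing in for the actual chi-square realizations (the shapes and scales are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(5)

# mock stand-ins for the theoretical and observational chi-square samples
x = rng.gamma(shape=4.5, scale=0.22, size=10_000)   # "(chi^2)_th"
y = rng.gamma(shape=4.5, scale=0.24, size=10_000)   # "(chi^2)_obs"

# histogram of z = x - y as a piecewise-constant approximation of P_z
z = x - y
counts, edges = np.histogram(z, bins=200, density=True)
widths = np.diff(edges)

# Eq. (66): area under the approximate pdf over the bins with z <= 0
mask = edges[1:] <= 0.0
P_le = np.sum(counts[mask] * widths[mask])
print(P_le)
```

With enough realizations the histogram estimate converges to the exact convolution integral; the bin width controls the residual discretization error.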
Overall, our results show no significant planar deviations from isotropy in the WMAP data. The most unlikely individual values in Table 3 are in the sectors (l, ℓ) given by (2,5), (10,5), (4,7) and (6,8), and all of them have a relative chance above 5% of being either too negative [(2,5), (10,5)] or too positive [(4,7), (6,8)]. However, it is perhaps worth mentioning that not only the individual values of (χ²)_ℓ^l are relevant: their coherence over a range of angular or planar momenta also carries interesting information. So, for example, a set of (χ²)_ℓ^l's which are all individually within the cosmic-variance bounds, but which are all positive (or negative), can be an indication of an excess (or lack) of planar modulation. This type of coherent behavior appears in the following cases: (χ²)_{ℓ=2}^l, (χ²)_{ℓ=3}^l and, to a lesser extent, (χ²)_ℓ^{l=4} (see Table 3). The angular quadrupole ℓ = 2, as well as the angular octupole ℓ = 3, have all-positive planar spectra (for all values of l which we were able to compute), indicated by probabilities larger than 50%. The planar hexadecupole l = 4 also has 8 out of 9 angular spectra assuming positive values (only ℓ = 5 is negative).
The data analyzed in this Section relates to the full-sky maps, which are certainly still affected by residual galactic foregrounds. The reader interested in the complete analysis, including data from masked CMB maps, can check reference [28].

Conclusions
We know our Universe is not perfectly Gaussian, or homogeneous, or isotropic. The deviations from an idealized picture (or the lack thereof), whether predictably small or surprisingly large, can tell us a great deal about the Universe we live in. Since the types of physical mechanisms behind deviations from perfect gaussianity, homogeneity or isotropy are typically very different, we should try to measure these individual features separately -whenever possible or practical.
Recently it has been suggested that some of the most discussed anomalies in the CMB can be explained away [86], or that the evidence for them is statistically weak [87]. But even if it turns out that our Universe is a plain vanilla kind of place, where everything goes according to the inflationary theorist's dreams, we would still need to analyze it with tools that allow us to check the standard picture against the data. In addition, local physics (related to the solar system, or our galaxy), as well as instrumental quirks, tend to leave imprints on the CMB which are clearly anisotropic, but have a certain coherence which can be detected, and possibly corrected for, with the help of these checks.
However, in an era where at least the large-scale maps of the CMB are likely to remain basically unchanged, we should be careful not to over-analyze the data with the benefit of an ever greater hindsight (put another way, a posteriori conundrums only get worse with time). This can only be achieved if we find natural and generally agreed-upon classifications of the types of deviations that may occur, without too much guidance from what the data is telling us. We believe that focusing on the possible underlying symmetries, with perhaps some guidance from group-theoretic arguments, is one way to settle these issues. We have presented a few methods along these lines, one using multipole vectors, the other using a natural generalization of the two-point correlation function; other methods have been presented in this Review.
Perhaps the best indication that we are on the right track is the fact that most of these methods are applicable in other areas of physics and astronomy -and that in some cases we have adapted tests of anisotropy from other areas, such as scattering theory and the theory of angular momentum in quantum mechanics. So, even if these anomalies eventually perish, they will be survived by the powerful methods that have been devised to test them.
We now need to integrate out the Θ and Φ dependence on the right-hand side of (77), which was hidden due to our choice of a particular coordinate system. In order to do that, we keep the vectors n̂₁ and n̂₂ fixed and rotate our coordinate system through the three Euler angles ω = {α, β, γ}. This rotation changes the coefficients C_ℓ^{lm} and a_{ℓm}: the multipolar coefficients in the new coordinate system are linear combinations of the old ones, with coefficients given by the elements D^l_{mm′}(ω) of the Wigner rotation matrix. The advantage of positioning the vectors n̂₁ and n̂₂ in the xy plane is that now the angles Θ and Φ are given precisely by the Euler angles β and γ, regardless of the value of α; in the last step we have used Y_{lm}(0, 0) = √((2l + 1)/4π) δ_{m0}. Therefore, in our new coordinate system we have (dropping the primes in our notation):

(1/2π) Σ_{l,m} C_ℓ^{lm} D^l_{0m}(ω) / √(2l + 1) = Σ_{ℓ₁,m₁} Σ_{ℓ₂,m₂} ⟨a_{ℓ₁m₁} a*_{ℓ₂m₂}⟩ (…) ,

where in the derivation above we have made use of the Fourier series expansion of the Legendre polynomials. From this we conclude the expression for the C_ℓ^{lm}'s which is needed in the derivation of (67).