A Computationally Efficient, Exploratory Approach to Brain Connectivity Incorporating False Discovery Rate Control, A Priori Knowledge, and Group Inference

Graphical models appear well suited for inferring brain connectivity from fMRI data, as they can distinguish between direct and indirect brain connectivity. Nevertheless, biological interpretation requires not only that the multivariate time series are adequately modeled, but also that there is accurate error-control of the inferred edges. The PCfdr algorithm, which was developed by Li and Wang, was to provide a computationally efficient means to control the false discovery rate (FDR) of computed edges asymptotically. The original PCfdr algorithm was unable to accommodate a priori information about connectivity and was designed to infer connectivity from a single subject rather than a group of subjects. Here we extend the original PCfdr algorithm and propose a multisubject, error-rate-controlled brain connectivity modeling approach that allows incorporation of prior knowledge of connectivity. In simulations, we show that the two proposed extensions can still control the FDR around or below a specified threshold. When the proposed approach is applied to fMRI data in a Parkinson's disease study, we find robust group evidence of the disease-related changes, the compensatory changes, and the normalizing effect of L-dopa medication. The proposed method provides a robust, accurate, and practical method for the assessment of brain connectivity patterns from functional neuroimaging data.


Introduction
The interaction between macroscopic brain regions has been increasingly recognized as being vital for understanding the normal brain function and the pathophysiology of many neuropsychiatric diseases. Brain connectivity patterns derived from neuroimaging methods are therefore of great interest, and several recently published reviews have described different modeling methods for inferring brain connectivity from fMRI data [1,2]. Specifically, graphical models which represent statistical dependence relationships between time series derived from brain regions, such as structural equation models [3], dynamic causal models [4], and Bayesian networks [5], appear to be well suited for assessing connectivity between brain regions.
Graphical models, when applied to functional neuroimaging data, represent brain regions of interest (ROIs) as nodes and the stochastic interactions between ROIs as edges. However, in most nonbrain imaging graphical model applications, the primary goal is to create a model that fits the overall multivariate data well, does not necessarily accurately reflect the particular connections between nodes. Yet in the applications of graphical models to brain connectivity, the neuroscientific interpretation is largely based on the pattern of connections inferred by the model. This places a premium on accurately determining the "inner workings" of the model such as accounting for the error rate of the edges in the model.
The false discovery rate (FDR) [6,7], defined as the expected ratio of spurious connections to all learned connections, has been suggested as a suitable error-rate control criterion when inferring brain connectivity. Compared with traditional type I and type II statistical error rates, the FDR is more informative in bioinformatics and neuroimaging, since it is directly related with the uncertainty of the reported positive results. When selecting candidate genes for genetic research, for example, researchers may want 70% of selected genes to be truly associated with the disease, that is, an FDR of 30%.
Naively controlling traditional type I and type II error rates at specified levels may not necessarily result in reasonable FDR rates, especially in the case of large, sparse networks. For example, consider an undirected network with 40 nodes, with each node interacting, on average, with 3 other nodes; that is, there are 60 edges in the network. An algorithm with the realized type I error rate of 5% and the realized power of 90% (i.e., the realized type II error rate = 10%) will recover a network with 60 × 90% = 54 correct connections and [40 × (40 − 1)/2 − 60] × 5% = 36 false connections, which means that 36/(36 + 54) = 40% of the claimed connections actually would not exist in the true network! This example, while relatively trivial, demonstrates that the FDR may not be kept suitably low by simply controlling traditional type I and type II error rates.
Recent work in the machine learning field has started to investigate controlling the FDR in network structures using a generic Bayesian approach and classical FDR assessment [8]. This work was subsequently extended to look specifically at graphical models where the FDR was assessed locally at each node [9].
Li and Wang proposed a network-learning method that allows asymptotically control of the FDR globally. They based their approach on the PC algorithm (named after Peter Spirtes and Clark Glymour), a computationally efficient and asymptotically reliable Bayesian network-learning algorithm. The PC algorithm assesses the (non)existence of an edge in a graph by determining the conditional dependence/independence relationships between nodes [10]. However, different from the original PC algorithm, which controls the type I error rate individually for each edge during conditional independence testing, the Li and Wang algorithm, referred as the PC fdr algorithm, is capable of asymptotically controlling the FDR under prespecified levels [11]. The PC fdr algorithm does this by interpreting the learning of a network as testing the existence of edges, and thus the FDR control of edges becomes a multiple-testing problem, which has a strong theoretical basis and has been extensively studied by statisticians [11].
Beside giving an introduction of these recent advancements, this paper will present two extensions to the original PC fdr algorithm, the combination of which leads to a multisubject brain connectivity modeling approach incorporating FDR control, a priori knowledge and group inference. One extension is an adaptation of a priori knowledge, allowing users to specify which edges must appear in the network, which cannot and which are to be learned from data. The resulting algorithm is referred to as PC + fdr algorithm in this paper. Many applications require imposing prior knowledge into network learning. For example, analyzing causal relationship in time series may forbid backward connections from time t + 1 to t, such as that in dynamic Bayesian networks. In some situations, researchers may want to exclude some impossible connections based on anatomical knowledge. Incorporating a priori knowledge into PC fdr algorithm allows for more flexibility in using the method and potentially leads to greater sensitivity in accurately discovering the true brain connectivity.
The second extension to PC fdr algorithm is a combination of the PC fdr algorithm and a mixed-effect model to robustly deal with intersubject variability. As neuroimaging research typically involves a group of subjects rather than focusing on an individual subject, group analysis plays an important role in final biological interpretations. However, compared with the extensive group-level methods available for analysis of amplitude changes in blood-oxygen-leveldependent (BOLD) signals (e.g., Worsley et al. [12], Friston et al. [13]), the problem of group-level brain connectivity analysis is less well studied. This is likely due to the fact that it requires not only accommodating the variances and the correlations across subjects, but also accounting for the potentially different structures of subject-specific brain connectivity networks. The proposed group-level exploratory approach for brain connectivity inference combines the PC fdr algorithm (or the extended PC + fdr algorithm if a priori knowledge is available) and a mixed-effect model, a widely used method for handling intersubject variability.
Several methods have been proposed to infer group connectivity in neuroimaging. Bayesian model selection [14] handles intersubject variability and error control; however, its current proposed implementation does not scale well, making it more suitable for confirmatory, rather than an exploratory research. Varoquaux et al. [15] propose a datadriven method to estimate large-scale brain connectivity using Gaussian modeling and deals with the variability between subjects by using optimal regularization schemes. Ramsey et al. [16] describe and evaluate a combination of a multisubject search algorithm and the orientation algorithm.
The major distinguishing feature of the proposed approach compared to these aforementioned approaches is that the current data-driven approach aims at controlling the FDR directly at the group-level network. We demonstrate that in simulations that, with a sufficiently large subject size, the proposed group-level algorithm is able to reliably recover network structures and still control the FDR around prespecified levels. When the proposed approach, referred as the gPC + fdr algorithm, is applied to real fMRI data with Parkinson's disease, we demonstrate evidence of direct and indirect (i.e., compensatory) disease-related connectivity changes, as well as evidence that L-dopa provides a "normalizing" effect on connectivity in Parkinson's disease, consistent with its dramatic clinical effect.

Preliminaries.
Graphical models, such as Bayesian networks, encode conditional independence/dependence relationships among variables graphically with nodes and edges according to the Markov properties [17]. The concept of conditional (in)dependence is very important for the inference of brain connectivity, as it assists in distinguishing between direct and indirect connectivity. For example, the activities in two brain regions are initially correlated, but become independent after all possible influences from other brain regions are removed, then this is an example of indirect connectivity, as the initial activity was actually induced by common input from another region(s). On the other hand, if the activities of two brain regions are correlated even after all possible influences from other regions are removed, then very likely there is a direct functional connection between them and hence is an example of direct connectivity. Conditional dependence is the real interest in learning brain connectivity because it implies that two brain regions are directly connected. Since a graphical model is a graphical representation of conditional independence/dependence relationships, the nonadjacency between two nodes is tested by inspecting their conditional independence given all other nodes. As multiple edges are tested simultaneously, FDR-control procedures should be applied to correct the effect of multiple testing.
Given two among N random variables, there are 2 N−2 possible subsets of the other N − 2 variables upon which the two variables could be conditionally independent. To avoid exhaustively testing such an exponential number of conditional independence relationships, the following proposition [10] can be employed [9,11]. Based on Proposition 1, nodes a and b can be disconnected once they are found conditionally independent upon a conditional node set C. As the tests of adjacency progress for every node pair, the neighborhood of nodes keeps shrinking, so an exhaustive search of the conditional node set C is avoided. This greatly reduces computation, especially for a sparse network.  [11] method, called the PC fdr algorithm, was proved to be capable of asymptotically controlling the FDR. Here we present an extension of the PC fdr algorithm which can incorporate a priori knowledge, which was not specified in the original PC fdr algorithm. We name the extension as the PC + fdr algorithm where the superscript "+" indicates that it is an extension. The pseudocode of the PC + fdr algorithm is given in Algorithm 1, and its Matlab implementation is downloadable at http://www.junningli.org/software. It handles prior knowledge with two inputs: E must , the set of edges assumed to appear in the true graph, and E test , the set of edges to be tested from the data. The original PC fdr algorithm can thus be regarded as a special case of the extended algorithm, by setting E must = ∅ and E test = {all possible edges}.

Asymptotic Performance.
Before we present theorems about the asymptotic performance of the PC + fdr algorithm and its heuristic modification, let us first introduce the assumptions related to the theorems.
(A1) The multivariate probability distribution P is faithful to a directed acyclic graph (DAG) whose skeleton is G true .
(A2) The number of vertices is fixed.
(A3) Given a fixed significance level of testing conditional independence, the power of detecting conditional dependence approaches 1 at the limit of large sample sizes.
(A4) The union of E must , the edges assumed to be true, and E test , the edges to be tested, covers E true , all the true edges; that is, Assumption (A1) is generally assumed when graphical models are applied, and it restricts the probability distribution P to a certain class. Assumption (A2) is usually implicitly stated, but here we emphasize it because it simplifies the proof. Assumption (A3) may seem overly restrictive, but actually can be easily satisfied by standard statistical tests, such as the likelihood ratio test introduced by Neyman and Pearson [18] and the partial-correlation test by Fisher [19], if the data are identically and independently sampled. Assumption (A4) relates to prior knowledge, which interestingly does not require that the assumed "true" edges E must be a subset of the true edges E true , but just that all true edges are included in the union of the assumed "true" edges and the edges to be tested.
The detection power of the PC + fdr algorithm and its heuristic modification at the limit of large sample sizes is elucidated in Theorem 2.
Theorem 2. Assuming (A1), (A2), and (A3), both the PC + fdr algorithm and its heuristic modification, the PC + fdr * algorithm, are able to recover all the true connections in E test with probability one as the sample size approaches infinity: It should be noted that Theorem 2 does not need Assumption (A4), which implies that the true edges in E test are still able to be recovered by the algorithms with probability one at the limit of large sample sizes, even if Input: the data D, the undirected edges E must that are assumed to exist in the true undirected graph G true according to prior knowledge, the undirected edges E test (E must ∩ E test = ∅) to be tested from the data D, and the FDR level q for making inference about E test . Output: an undirected graph G stop , that is, the value of G when the algorithm stops, or equivalently, E stop , the edges in G stop . Notations: D denotes the multivariate input data. a, b denote the vertices. E, C denote the vertex sets. a ∼ b denotes an undirected edge. adj(a, G) denotes vertices adjacent to a in graph G. a ⊥ b|C denotes the conditional independence between a and b given C.
Test hypothesis a ⊥ b | C and calculate the p value p a⊥b|C . (8) if p a⊥b|C > p max a∼b , then (9) Let p max a∼b = p a⊥b|C . (10) if every element of P max has been assigned a valid p value by step 9, then (11) Run the FDR procedure, Algorithm 2, with P max and q as the input. (12) if the non-existence of certain edges are accepted, then (13) Remove these edges from G. (14) Update G and E. (15) if a ∼ b is removed, then (16) break the for loop at line 6.
A heuristic modification, named the PC + fdr * algorithm, at step 14 removes p max a∼b from P max as well once a ∼ b is removed from G. the edges assumed to be present by users are not completely correctly specified. The FDR of the PC + fdr algorithm at the limit of large sample sizes is elucidated in Theorem 3.

Theorem 3. Assuming (A1), (A2), (A3), and (A4), the FDR of the set of edges inferred by the PC +
fdr algorithm about E test approaches a value not larger than the user-specified level q as the sample size m approaches infinity: Theorem 3 concerns the PC + fdr algorithm, and it requires Assumption (A4). We are still not sure whether similar FDR performance can be proved for the PC + fdr * algorithm. Assumption (A4) does not require that the assumed "true" edges E must is a subset of the true edges E true but only that all true edges are included in the union of the assumed "true" edges and the edges to be tested. This is particularly useful in practice, since it does not require users' prior knowledge to be absolutely correct, but allows some spurious edges to be involved in E must , once all true edges have been included in either E must or E test . Assumption (A4) can be satisfied by making E test ∪ E must large enough to cover all the true edges, but as shown in (4) this will increase the computational cost of the algorithm. Theorems 2 and 3 address the performance of the PC + fdr algorithm and its heuristic modification at the limit of large sample sizes. Because the PC + fdr algorithm is derived from the PC fdr algorithm, its performance should be very similar. The numerical examples of the PC fdr algorithm in Li and Wang's [11] work may provide helpful and intuitive understanding on the performance of the PC + fdr algorithm with moderate sample sizes.
The detailed proofs of Theorems 2 and 3 are provided in Appendix A.
Computational and Mathematical Methods in Medicine 5 Input: a set of p values {p i | i = 1, . . . , H}, and the threshold of the FDR q Output: the set of rejected null hypotheses (1) Sort the p-values of H hypothesis tests in the ascendant order as p (1) ≤ · · · ≤ p (H) .

Computational Complexity.
The majority of the computational effort in the PC + fdr is utilized in performing statistical tests of conditional independence at step 7 and the FDR at step 11. If the algorithm stops at the depth d = d max , then the number of conditional independence tests required is bounded by where |E test | is the number of edges to be tested, Δ is the maximum degree of graph G init (the graph formed at step 1) whose edges are E must ∩ E test , and C d Δ−1 is the number of combinations of choosing d unordered and distinct elements from Δ−1 elements. The bound usually is very loose, because it assumes that no edge has been removed until d = d max .
The computational complexity of the FDR procedure, Algorithm 2, invoked at step 11 of the PC + fdr algorithm is O(H log(H)) when it is invoked for the first time, where H = |E test | is the number of input p values and is O(H) later, with the optimization suggested in Appendix B. In the worst case that p a⊥b|C is always larger than p max a∼b , the complexity of the computation spent on the FDR control in total is bounded where T is the number of performed conditional independence tests (see (4)). This is a very loose bound because it is rare that p a⊥b|C is always larger than p max a∼b . In practice, the PC + fdr algorithm runs very quickly, especially for sparse networks. In our experiments (see Section 3.1), it took about 10 seconds to infer the structure of a first-order dynamic network with 20 nodes from data of 1000 time points.

Miscellaneous Discussions.
It should be noted that controlling the FDR locally is not equivalent to controlling it globally. For example, if it is known that there is only one connection to test for each node, then controlling the FDR locally in this case will degenerate to controlling the pointwise error rate, which cannot control the FDR globally.
Listgarten and Heckerman [8] proposed a permutation method to estimate the number of spurious connections in a graph learned from data. The basic idea is to repetitively apply a structure learning algorithm to data simulated from the null hypotheses with permutation. This method is generally applicable to any structure learning method, but permutation may make the already time-consuming structure learning problem even more computationally cumbersome, limiting its use in practical situations.

FDR-Controlled Group Brain Connectivity Inference with or without A Priori Knowledge.
In this section, we propose another extension to the PC fdr algorithm: from the single subject level to the group level. Assessing group-level activity is done by considering a mixed-effect model (Step 7 of Algorithm 3), and we name it the gPC fdr algorithm where "g" indicates that it is an extension at the group level. When also incorporating a priori knowledge, the resulting algorithm is named the gPC + fdr algorithm. Suppose we have m subjects within a group. Then for subject i, the conditional independence between the activities of two brain regions a and b given other regions C can be measured by the partial correlation coefficient between X a (i) and X b (i) given X C (i), denoted as r ab|C (i). Here X • denotes variables associated with a vertex or a vertex set, and index i indicates that these variables are for subject i. By definition, the partial correlation coefficient r ab|C (i) is the correlation coefficient between the residuals of projecting X a (i) and X b (i) onto X C (i) and can be estimated by the sample correlation coefficient as where For clarity, in the following discussion we omit the subscript "ab | C" and simply use index "i" to emphasize that a variable is associated with subject i. Input: the multisubject data D, undirected complete graph G, complete vertex set E and the FDR controlled level q for making inference about E. Output: the recovered undirected graph G stop . Notations: a, b denote the vertices. E, C denote the vertex set. a ∼ b denotes an undirected edge. adj(a, G) denotes vertices adjacent to a in graph. a ⊥ b | C denotes the conditional independence between a and b given C.
(1) Form an undirected graph G on the vertex set E.
(2) Initialize the maximum p values associated with the edges in E as each ordered pair of vertices a and b that a ∼ b ∈ E ∩ E and |adj(a, G) \ {b}| ≥ d do (6) for each subset C ⊆ adj(a, G) \ {b} and |C| = d do (7) Test hypothesis a ⊥ b | C for each subject and calculate the p value p a⊥b|C at the group level. (8) if p a⊥b|C > p max a∼b , then (9) Let p max a∼b = p a⊥b|C . (10) if every element of P max has been assigned a valid p value by step 9, then (11) Run the FDR procedure, Algorithm 2, with P max and q as the input. (12) if the non-existence of certain edges are accepted, then (13) Remove these edges from G. (14) Update G and E. (15) if a ∼ b is removed, then (16) break the for loop at line 6. (17) end if (18) end if (19) end if (20) end if (21) end for (22) end for (23) Let d = d + 1.
(24) until |adj(a, G) \ {b}| < d for every ordered pair of vertices a and b that a ∼ b is in E. Note: When a priori knowledge is available, we can also incorporate the prior knowledge into the gPC fdr algorithm to obtain the gPC + fdr algorithm, where the inputs are updated as follows: the multisubject data D, the undirected edges E must that are assumed to appear in the true undirected graph G true according to prior knowledge, the undirected edges E test whose existences are to be tested from the data, and the FDR controlled level q for making inference about E test .
To study the group-level conditional independence relationships, a group-level model should be introduced for r i . Since partial correlation coefficients are bounded and their sample distributions are not Gaussian, we apply Fisher's z-transformation to convert (estimated) partial correlation coefficients r to a Gaussian-like distributed z-statistic z, which is defined as where r is a (estimated) partial correlation coefficient and z is its z-statistic.
The group model we employ is where e i follows a Gaussian distribution N(0, σ 2 g ) with zero mean and σ 2 g variance. Consequently, the group-level testing of conditional independence is to be used to test the null hypothesis z g = 0.
Because z i is unknown and can only be estimated, the inference of z g should be conducted with z i = Z( r i ). If X a (i), X b (i), and X C (i) jointly follow a multivariate Gaussian distribution, then z i asymptotically follows a Gaussian distribution N(z i , σ 2 i ) with where N i is the sample size of subject i's data and p represents the number of variables in X C (i). Therefore, based on (8), we have where i follows N(0, σ 2 i ) and e i follows N(0, σ 2 g ). This is a mixed-effect model where i denotes the intrasubject randomness and e i denotes the intersubject variability. At the group level, z i follows a Gaussian distribution N(z g , σ 2 i + σ 2 g ). Note that unlike regular mixed-effect models, the intrasubject variance σ i 2 in this model is known, because N i and p are known given the data X(i) and C. In general, is not necessarily equal to σ 2 j for i / = j, and the inference of z g should be conducted in the manner of mixed models, such as estimating σ 2 g with the restricted maximum likelihood (ReML) approach. However, if the sample size of each subject's data is the same, then σ 2 i equals σ 2 j . For this balanced case, which is typically true in fMRI applications  and as well the case in this paper, we can simply apply a t-test to z i 's to test the null hypothesis z g = 0.
Replacing Step 7 of the single-subject PC fdr algorithm (i.e., the intrasubject hypothesis test) with the test of z g = 0, we can extend the single-subject version of the algorithm to its group-level version. We will employ this t-test in our simulations and in the real fMRI data analysis presented later in this paper. Such a testing approach significantly simplifies the estimation process, and our simulation results presented later demonstrate that this method can still control the FDR at a user specified error rate level.

Simulations for the PC +
fdr Algorithm. Here we compare the performances of the proposed PC + fdr algorithm and the original PC fdr algorithm, using time series generated from two dynamic Bayesian networks in Figure 1. One network has 20 nodes (10 channels) and 23 edges, and the other has 40 nodes (20 channels) and 56 edges. The dynamic Bayesian networks are assumed Gaussian, with connection coefficients uniformly distributed in [0.2, 0.6] with Gaussian noise whose amplitudes are uniformly distributed in [0.5, 1.1]. We use partial correlation coefficients to test conditional independence relationships. The target FDR for both methods is set as 5%. For the PC + fdr algorithm, one-third of the nonexisting connections are excluded as prior knowledge. Figure 1 shows the estimated FDR and detection power results, at sample sizes of 125, 250, 500, and 1000 time points and with 50 repetitive trials for each sample size. As shown in graphs (a) and (b), the PC + fdr and PC fdr algorithms can both control the FDR under or around 5%. For both methods, the detection power increases as the sample size increases. However, we can see that the PC + fdr algorithm yields higher detection power and lower FDR than the original PC fdr algorithm does. As mentioned earlier in the Introduction Section, the PC + fdr algorithm has the advantage of providing researchers more flexibility in using the method and higher accuracy in discovering brain connectivity.

3.2.
Simulations for the gPC f dr Algorithm. The simulations here serve two purposes: first, to verify whether the proposed gPC fdr algorithm for modeling brain connectivity can control the FDR at the group level, and second, to compare the gPC fdr algorithm with the single-subject PC fdr algorithm proposed in [11] and the state-of-art IMaGES algorithm investigated in Ramsey et al. [16] for inferring the structure of the group connectivity network.
The simulations were conducted as follows. First, a connectivity network is generated as the group-level model. Individual subject-level networks are then derived from the group-level model by randomly adding or deleting connections with a small probability, and subject-specific data are generated according to individual subject networks. Next, the network-learning methods, that is, the proposed gPC fdr algorithm, the single-subject PC fdr method with pooling together the data from all subjects, and the IMaGES algorithm, are applied to the simulated data. Finally, the outputs of the algorithms are compared with the true grouplevel network to evaluate their accuracy.
The data generation process is as follows.
(1) Randomly generate a directed acyclic graph (DAG) as the group-level network and associate each connection with a coefficient. The DAG is generated by randomly connecting nodes with edges and then orienting the edges according to a random order of the nodes. The connection coefficients are assigned as random samples from the uniform distribution U(β 1 , β 2 ), where β 1 and β 2 characterize the coefficient strength.
(2) For each subject, a subject-level network is derived from the group-level network by randomly adding and deleting connections. More specifically, for each of the existing connections, the connection is deleted with probability 0.05, and for each of the absent connections, a connection is added with probability 0.01. The corresponding connection coefficients are randomly sampled from the uniform distribution U(β 1 , β 2 ).
(3) Given a subject-level network, the subject-specific data are generated from a Gaussian Bayesian network, with the additional Gaussian noise following the standard Gaussian distribution N(0, 1).
In the first simulation, we compare the performances of the proposed gPC fdr algorithm, the original PC fdr algorithm, and the IMaGES algorithm [16], when using different connection coefficient strengths. In this example, the grouplevel network is the DAG in Figure 2(a). From this model, twenty subject-level models are derived, and for each subject, data with three hundred samples are simulated. To test the performances of the algorithms with a range of connection strengths, we vary the connection coefficient generating distribution U(β 1 , β 2 ) gradually from U(0.2, 0.3) to U(0.7, 0.8).
At the network-learning stage, we set the target FDR to be 5% for the gPC fdr algorithm. For reliable assessment, this procedure is repeated thirty times. Figures 2(b), 2(c), and 2(d) show the FDR and the type I error rate, and the detection power results as a function of connection strength. We note that all methods are relatively invariant to connection strength. The proposed gPC fdr algorithm steadily controls the FDR below or around the desired level and accurately makes the inference at the group level. The detection power of IMaGES algorithm is higher than that of gPC fdr algorithm, but it fails to control the FDR under the specified 5% level. Its higher detection power is achieved by sacrificing FDR. This is reasonable, since IMaGES is not specifically designed to control the FDR error rate.
In the second simulation, we test the performances of the algorithms as a function of the number of subjects within the group. The group-level network is the DAG in Figure 3(a), and the number of subjects increases from eight to twentyfive. At the network-learning stage, we set the target FDR to be 5%. This procedure is repeated thirty times. Figure 3(b) demonstrates the FDR results as a function of the number of subjects within the group. It is noted that the proposed gPC fdr algorithm is able to keep the FDR below or around the specified level. The detection power gradually increases as the number of subjects increases. When there are more than 15 subjects, the gPC fdr algorithm seems that it can achieve higher (better) detection power and lower (better) FDR and type I error rate than the IMaGES algorithm does. It suggests that when the number of subjects is large enough, the proposed gPC fdr algorithm can jointly address efficiency, accuracy, and intersubject variability. The original PC fdr algorithm of simply pooling the data together fails to control the FDR, and the resulting FDR does not decrease as the number of subject increases, probably due to the increasing heterogeneity within the group. In order to investigate the effects of the number of ROIs, we also investigate two networks with 15 and 25 nodes, respectively, and repeat the simulations (not shown here). The results are qualitatively similar to what we show here.

fMRI Application.
In order to assess the real-world application performance of the proposed method, we apply the gPC + fdr algorithm for inferring group brain connectivity network to fMRI data collected from twenty subjects. All experiments were approved by the University of British Columbia Ethics Committee. Ten normal people and ten Parkinson's disease (PD) patients participated in the study. During the fMRI experiment, each subject was instructed to squeeze a bulb in their right hand to control an "inflatable" ring so that it smoothly passed through a vertically scrolling a tunnel. The normal controls performed only one trial, while Parkinson's subjects performed twice, once before L-dopa medication and the other approximately an hour later, after taking medication.
Three groups were categorized: group N for the normal controls, group P pre for the PD patients before medication, and group P post for the PD patients after taking L-dopa medication. For each subject, 100 observations were used in the network modeling. For details of the data acquisition and preprocessing, please refer to Palmer et al. [20]. 12 anatomically defined regions of interest (ROIs) were chosen based on prior knowledge of the brain regions associated with motor performance (Table 1).
We utilized the two extensions of the PC fdr algorithm and learned the structures of first-order group dynamic Bayesian networks from fMRI data. Because the fMRI BOLD signal can be considered as the convolution of underlying neural activity with a hemodynamic response function, we assumed that there must be a connection from each region at time t to its mirror at time t + 1. We also assumed that there must be a connection between each region and its homologous region in the contralateral hemisphere. The TR interval (i.e., sampling period) was a relatively long, 1.985 seconds; we restricted ourselves to learn only connections between ROIs without time lags. In total, there are 12 + 6 = 18 predefined connections and 12 × (12 − 1) ÷ 2 − 6 = 60 candidate connections to be tested. The brain connectivity networks (with the target FDR of 5%) learned for the normal (group N) and PD groups before (group P pre ) and after (group P post ) medication are compared in Figure 4. Note the connection between the cerebellar hemisphere and contralateral thalamus in the normal subjects and between the supplementary motor area (SMA) and the contralateral putamen, consistent with prior knowledge. Interestingly, in P pre subjects, the left cerebellum now connects with the right SMA, and the right SMA ↔ left putamen connection is lost. Also, there are now bilateral primary motor cortex (M1) ↔ putamen connections seen in the P pre group, presumably as a compensatory mechanism. After medication (P post ), the left SMA ↔ left thalamus connection is restored back to be normal.

Discussion
Up to now, graphical models to infer brain connectivity from fMRI data have implicitly relied on the unrealistic assumption that if a model accurately represented the overall activity in several ROIs, the internal connections of such a model would accurately reflect underlying brain connectivity. The PC fdr algorithm was designed to loosen this overly restrictive assumption and asymptotically control the FDR of network connections inferred from data. In this paper, we first presented the PC + fdr algorithm, an extension of the PC fdr algorithm, which allows for incorporation of prior knowledge of network structure into  Figure 4: (a) Learned brain connectivity for the normal group (group N). (b) Learned brain connectivity for the PD group before medication (group P pre ). (c) Learned brain connectivity for the PD group after medication (group P post ). Here "L" and "R" refer to the left and right sides, respectively. The solid lines are predefined connectivity, and the dashed lines are learned connectivity. "l" or "r" in the abbreviations stands for "Left" or "Right," respectively. the learning process, greatly enhancing its flexibility in practice. The PC + fdr algorithm handles prior knowledge with two inputs: E must , which is the set of edges that are assumed to appear in the true graph, and E test , the set of edges that are to be tested from the observed data. We proved that, with mild assumptions and at the limit of large samples, the PC + fdr algorithm is able to recover all the true edges in E test and also curb the FDR of the edges inferred about E test .
It is interesting that the PC + fdr algorithm does not require the assumed "true" edges E must to be a subset of the true edges E true , but only that all true edges are included in the union of the assumed "true" edges and the edges to test. This is very useful in research practice, since it allows some spurious edges to be involved in E must , as long as all the true edges have been included in either E must or E test . Users can satisfy this requirement by making E test ∪ E must large enough to cover all the true edges.
When we compared the PC + fdr algorithm with the original PC fdr algorithm, both of them successfully controlled the FDR under the target threshold in simulations, providing a practical tradeoff between computational complexity and accuracy. However, the PC + fdr algorithm achieved better detection power and better FDR than the original PC fdr algorithm. Incorporating prior knowledge into PC fdr algorithm therefore enhances inference accuracy and improves the flexibility in using the method.
Another extension to PC fdr algorithm we described here was the ability to infer brain connectivity patterns at the group level, with intersubject variance explicitly taken into consideration. As a combination of the PC fdr algorithm and a mixed-effect model, the gPC fdr algorithm takes advantage of the error control ability of the PC fdr algorithm and the capability of handling intersubject variance. The simulation results suggest that the proposed method was able to accurately discover the underlying group network and steadily control the false discovery rate. Moreover, the gPC fdr algorithm was shown to be much more reliable than simply pooling together the data from all subjects. This may be especially important in disease states and older subjects. Compared with the IMaGES algorithm, gPC fdr demonstrated better control of the FDR.
As with all group models, a limitation of the proposed gPC fdr algorithm is the requirement of a sufficient number of subjects. While it is appreciated that in many biomedical applications data collection is resource intensive, and if the number of subjects is insufficient, the gPC fdr algorithm may give unreliable results. Nevertheless, the group extension to the PC fdr algorithm is one attempt to make brain connectivity inference using error-rate-controlled exploratory modeling.
When applying the proposed gPC + fdr to fMRI data collected from PD subjects performing a motor tracking task, we found group evidence of disease changes (e.g., loss of left cerebellar ↔ SMA connectivity), compensatory changes in PD (e.g., bilateral M1 ↔ contralateral putamen connectivity), and evidence of restoration of connectivity after medication (left SMA ↔ left thalamus). The tremendous variability in clinical progression of PD is likely due to variability not only in disease rate progression, but also in variability in the magnitude of compensatory changes.
This highlights the importance of the proposed method, as it allows robust estimation of disease effects, compensatory effects, and effects of medication, all with a reasonable sample size, despite the enhanced intersubject variability seen in PD.

A. Proof of Theorems
To assist the reading, we list below notations frequently used in the proof.
V : all the nodes in a graph, G true : the skeleton of the true underlying directed acyclic graph (DAG), A a∼b : the event that edge a ∼ b is in the graph recovered by the PC + fdr algorithm, then the probability of the joint of all these events approaches 1 as m approaches infinity: For the proof of this lemma, please refer to Li  For the proof of this lemma, please refer to Li and Wang's [11] work.
Proof of Theorem 2. If there is not any true edge in E true , that is, E true = ∅, then the proof is trivially E true = ∅ ⊆ E .
In the following part of the proof, we assume E true / = ∅.
For the PC + fdr algorithm and its heuristic modification, whenever the FDR procedure, Algorithm 2, is invoked, p max a∼b is always less than max C∈V \{a,b} {p a⊥b|C }, and the number of p values input to the FDR algorithm is always not more than |E test |. Thus, according to Lemma then all the true connections will be recovered by the PC + fdr algorithm and its heuristic modification. Let A a⊥b|C denote the event A E true denote the event of (A.3), and A E true denote the event that all the true connections in E test are recovered by the PC + fdr algorithm and its heuristic modification.
∵ A E true is a sufficient condition for A E true , according to Lemma 5.
∵ A E true is the joint of a limited number of events as and lim m → ∞ P(A a⊥b|C ) = 1 according to Assumption (A3). ∴ According to Lemma 4, lim m → ∞ P(A E true ) = 1. of rejected hypotheses when P is the input is a superset of those rejected when P is the input.
The corollary can be easily derived from Lemma 6.
Proof of Theorem 3. Let P stop = {p a∼b } denote the value of P max when the PC fdr -skeleton algorithm stops.
∵ The FDR procedure is invoked whenever P max is updated, and P max keeps increasing as the algorithm progresses. ∴ According to Corollary 7, E stop is the same as the edges recovered by directly applying the FDR procedure to P stop .
The theorem is proved through comparing the result of the PC + fdr algorithm with that of applying the FDR procedure to a virtual p value set constructed from P stop . The virtual p value set P * is defined as follows.
For a vertex pair a ∼ b that is not adjacent in G true , let C * a∼b denote a certain vertex set that d-separates a and b in the true graph and that is also a subset of either adj(a, G true )\{b} Though p a⊥b|C * a∼b may not be actually calculated during the process of the algorithm, p a⊥b|C * a∼b still can denote the value as if it was calculated.
Let us design a virtual algorithm, called Algorithm * , that infers true edges in E test by just applying the FDR procedure to P * , and let E * denote the edges in E test claimed to be true by this virtual algorithm. This algorithm is virtual and impracticable because the calculation of P * depends on the unknown E true , but this algorithm exists because E true exists.
For any vertex pair a and b that is not adjacent in G true , we have the following.
∵ X a and X b are conditional independent given X C * a∼b . ∴ p a⊥b|C * a∼b follows the uniform distribution on [0, 1]. ∴ The FDR of Algorithm * is under q.
When all the true edges in the test set are recovered by the PC + fdr algorithm, that is, E true ⊆ E stop , all the edges in G true are included in E stop due to Assumption (A4). In this case, the conditional independence between X a and X b given X C * a∼b is tested for all the falsely recovered edges a ∼ b ∈ E stop \ E true , because for these edges, subsets of adj(a, G true ) \ {b} and subsets of adj(a, G true ) \ {b} have been exhaustively searched and C * a∼b is one of them. Therefore, p a∼b ≥ p * a∼b for all a ∼ b ∈ E stop when event A E true happens. Consequently, according to Lemma

B. Computational Complexity
The PC + fdr algorithm spends most of its computation on performing statistical tests of conditional independence at step 7 and controlling the FDR at step 11. If the algorithm stops at the depth d = d max , then the number of conditional independence tests required is bounded by where |E test | is the number of edges to test, Δ is the maximum degree of graph G init (the graph formed at step 1 of the PC + fdr algorithm) whose edges are E must ∩ E test , and C d Δ−1 is the number of combinations of choosing d unordered and distinct elements from Δ − 1 elements. In the worst case that d max = Δ − 1, the complexity is bounded by 2|E test |2 Δ−1 = |E test |2 Δ . The bound usually is very loose, because it assumes that no edge has been removed until d = d max . In real-world applications, the algorithm is very fast for sparse networks.
The computational complexity of the FDR procedure, Algorithm 2, invoked at step 11 of the PC + fdr algorithm, in general is O(H log(H) + H) = O(H log(H)) where H = |E test | is the number of input p values. The main complexity H log(H) is at the sorting (step 1). However, if it is recorded the sorted P max of the previous invocation of the FDR procedure, then the complexity of keeping the updated P max sorted is only O(H). With this optimization, the complexity of the FDR-control procedure is O(H log(H)) at its first operation and is O(H) later. The FDR procedure is invoked only when p a⊥b|C > p max a∼b . In the worst case that p a⊥b|C is always larger than p max a∼b , the complexity of the computation spent on the FDR control in total is bounded by O(|E test | log(|E test |) + T|E test |) where T is the number of performed conditional independence tests. This is a very loose bound because it is rare that p a⊥b|C is always larger than p max a∼b .