A Window into the Brain: Advances in Psychiatric fMRI

Functional magnetic resonance imaging (fMRI) plays a key role in modern psychiatric research. It provides a means to assay differences in the brain systems that underlie psychiatric illness and treatment response, as well as properties of brain structure and function that confer risk for mental illness. Here we review recent advances in commonly used fMRI methods and the progress made in understanding the neural basis of mental illness. Drawing on concepts and findings from psychiatric fMRI, we propose that mental illness may not be associated with abnormalities in specific local regions but may instead correspond to variation in the overall organization of functional communication throughout the brain network. Future research may need to integrate neuroimaging information drawn from different analysis methods and delineate the spatial and temporal patterns of brain responses that are specific to particular types of psychiatric disorders.


Introduction
The human brain is among the most mysterious and vital of organs. Recent neuroimaging techniques, including functional magnetic resonance imaging (fMRI), electroencephalography (EEG), and magnetoencephalography (MEG), now allow us to probe the brain noninvasively at unprecedentedly high temporal or spatial resolution. Since the first fMRI brain scans of the early 1990s, scientists have made great progress not only in the technical procedures used to acquire brain imaging data but also in the data processing methods that reveal the brain from a variety of data perspectives. fMRI has become the dominant technique in neuroimaging due to its noninvasiveness, lack of radiation exposure, relatively good spatial and temporal resolution, and relative ease of acquisition.
In this paper, we review popular data processing methods used in task-based fMRI and resting-state fMRI (see Figure 1 for a summary of mainstream fMRI methods). Some methods introduced for task-based fMRI, such as MVPA, are also applied to resting-state fMRI data. Different analysis methods probe specific brain activity patterns, and the application of these methods to the investigation of psychiatric disorders is discussed in detail. We also point out that within neuropsychology there is an ongoing paradigm shift from identifying foci of abnormalities to delineating the functional connectivity among several brain regions, and further towards a global understanding of aberrations at the level of large-scale, whole-brain networks. The advantages and disadvantages of each neuroimaging method are discussed and compared in order to help researchers select the methods most appropriate to their purposes.

Task-Based fMRI
Because the fMRI signal is highly sensitive to the event-related changes in blood flow that accompany neuronal activity, we can compare blood-oxygen-level-dependent (BOLD) signal differences between patients with psychiatric disorders and healthy subjects while they perform different kinds of tasks, in order to elucidate how a brain in a state of disorder functions differently from a healthy one. In this section, we introduce several methods frequently used in task-related fMRI analyses and discuss the advantages and disadvantages of each method (see Table 1). Note that these techniques are not mutually exclusive: two or more of them may be applied to the same dataset, depending on the quality of the dataset and the purpose of the study. Importantly, we also discuss how these analyses can inform each other and note some caveats in using these methods, such as the choice of frequency bands in resting-state fMRI (R-fMRI).

Subtraction and Correlation.
The general linear model (GLM) has gained growing popularity in task-related fMRI analysis since its introduction into the neuroimaging community by Friston et al. [1] in 1994, due to its easy interpretability and fast computability. GLM provides a framework for most kinds of data modeling and can minimize confounding factors, such as head motion or respiration of the subject, provided that these data are modeled. The aim of the general linear model is to explain the variation of the time course, $y_1, \ldots, y_i, \ldots, y_n$, in terms of a linear combination of explanatory variables plus a Gaussian error term. The general linear model in matrix form can be written as

$$Y = X\beta + \varepsilon,$$

where $Y$ is the vector of observed pixel values, $\beta$ is the vector of parameters, and $\varepsilon$ is the vector of error terms. The matrix $X$ is known as the design matrix (Figure 2(a)). It has one row for every time point in the original data and one column for every explanatory variable in the model. In the GLM, the columns of $X$ contain vectors corresponding to the "on" and "off" elements of the stimulus presented. By finding the magnitude of the parameter in $\beta$ corresponding to these vectors, the presence or absence of activation can be detected. The aim of GLM analysis is to identify the brain regions that show significant signal change in response to the experimental conditions. Each pixel is assigned a value dependent on the likelihood of the null hypothesis being false. The null hypothesis is that the observed signal changes can be explained purely by random variation in the data.
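As an illustration of the model fit described above, the following is a minimal sketch for a single simulated voxel. It assumes a hypothetical block design with one "on"/"off" regressor and an intercept, uses ordinary least squares, and omits the hemodynamic response convolution and noise modeling used in practice; all names and numerical values are illustrative.

```python
import numpy as np

def fit_glm(y, X):
    """Ordinary least squares fit of y = X @ beta + error for one voxel."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    residuals = y - X @ beta
    dof = X.shape[0] - np.linalg.matrix_rank(X)
    sigma2 = residuals @ residuals / dof
    # Standard errors come from the diagonal of sigma^2 * (X'X)^-1
    cov = sigma2 * np.linalg.inv(X.T @ X)
    t_values = beta / np.sqrt(np.diag(cov))
    return beta, t_values

# Hypothetical block design: 10 scans "off", 10 scans "on", repeated 4 times
n_scans = 80
boxcar = np.tile(np.r_[np.zeros(10), np.ones(10)], 4)
X = np.column_stack([boxcar, np.ones(n_scans)])  # task regressor + intercept

# Simulated voxel time course (assumed effect size and noise level)
rng = np.random.default_rng(0)
y = 2.0 * boxcar + 100.0 + rng.normal(0.0, 1.0, n_scans)

beta, t_values = fit_glm(y, X)
print("task beta:", beta[0], "t value:", t_values[0])
```

With only a boxcar regressor and an intercept, the estimated task parameter equals the mean of the "on" scans minus the mean of the "off" scans, so the simple subtraction approach is a special case of this model.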
The brain image containing such information for all voxels is called a statistical parametric map [1]. One of the simplest methods for obtaining results from an fMRI experiment is to perform a simple subtraction between two experimental conditions. By averaging together all the images acquired during the "on" phase of the task and subtracting the average of all the "off" images, brain regions that are activated during the "on" phase of the task can be identified. Using a parametric design, researchers can also examine how brain activity correlates with a graded behavioral measure. GLM is also the basis of the majority of functional/effective connectivity estimation techniques, which are introduced in the following sections.
GLM is the dominant method used in task-based fMRI, including in psychiatric research. Studies on psychiatric disorders have used this method to compare brain activity induced by certain experimental manipulations in the patient group and in the control group. For example, Juckel et al. [2] scanned patients with schizophrenia and healthy subjects using fMRI while they performed a "monetary incentive delay" task, in which they anticipated potential monetary gain, loss, or neutral outcomes. Following preprocessing, the fMRI data were modeled by GLM with three explanatory variables ("gain," "loss," and "neutral outcome," indicating experimental cues) convolved with Cohen's gamma-function. Activations under different experimental conditions can be compared based on BOLD response differences, which can be assessed using linear combinations of the estimated GLM parameters ($\beta$ values). Within-group activation (e.g., "gain versus neutral outcome") and intergroup differences can be compared by including the BOLD response variations of all subjects in each group in a second-level random effects analysis. However, GLM has also drawn criticism, focused primarily on the assumptions the model makes [3]. Greater attention should be paid to checking the model's assumptions when applying GLM as a tool to analyze task-related fMRI data.
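The "linear combinations of the estimated GLM parameters" mentioned above are usually expressed as contrasts. As a hedged sketch in standard notation (the symbols and the column ordering are assumptions made here, not taken from the cited study): let $X$ be the design matrix with full column rank, $\hat{\beta}$ the least-squares parameter estimates, $\hat{\sigma}^2$ the residual variance estimate, and $c$ a contrast vector. Under independent Gaussian errors the contrast can be tested with

```latex
\hat{\beta} = (X^{\top}X)^{-1}X^{\top}Y, \qquad
t = \frac{c^{\top}\hat{\beta}}{\sqrt{\hat{\sigma}^{2}\, c^{\top}(X^{\top}X)^{-1}c}}
```

For a design whose columns are ordered "gain," "loss," and "neutral outcome," the comparison "gain versus neutral outcome" would correspond to $c = (1, 0, -1)^{\top}$.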

Psychophysiological Interaction (PPI).
One important goal of neuroimaging research is to describe the pattern of brain connectivity among different regions. Functional connectivity refers to undirected associations between brain regions, while effective connectivity reveals a directed and causal relationship. Psychophysiological interaction, in a clever use of the GLM, measures how functional connectivity is affected by psychological variables without specifying the directions of such influences [4]. It examines how brain activity can be explained by the interaction between an experimental variable (e.g., level of attention) and the coupling between signals from a particular brain area (the source area) and signals from voxels in the rest of the brain (Figure 2(b)). A psychophysiological interaction means that the contribution of one area to another changes significantly with the experimental or psychological context. In other words, regional responses in the source area to an experimental or psychological factor are modulated by signals from a distal brain region. Das et al. [5] used a behavioral task in which schizophrenia patients and healthy participants were asked to identify the emotions displayed on a series of facial images presented either supraliminally or subliminally during scanning. Subtraction analyses of fMRI data showed that, compared to healthy controls, schizophrenia patients showed reduced activity in the right amygdala and medial prefrontal cortex (MPFC) during conscious perception of fear (relative to neutral) and also in the bilateral amygdala and rostral ACC of the MPFC during subliminal perception of fear. PPI analyses revealed reduced neural activity in schizophrenia patients, relative to control subjects, in the pathway from the amygdala and its projection to the MPFC in response to fear perception. In another fMRI study, Wang et al. [6] applied PPI analysis to explore how abnormal functional connectivity of the MPFC in patients with schizophrenia was altered by psychological context. They found that patients with schizophrenia showed higher MPFC-LSTG connectivity under self-generated conditions than under other-generated conditions.

Table 1 (excerpt): Multivoxel pattern analysis.
Description: Applies pattern-classification algorithms to demonstrate the relationship between measures of brain activity and a perceptual state and provides an information-theoretic framework for the isolation of regions that uniquely represent a behavior.
Advantages: (i) Simultaneously examines the disparate signals carried within a set of voxels rather than examining individual voxels in parallel. (ii) Can decode more complex information due to improved sensitivity and use of spatial information.
Disadvantages: (i) The possibility of overfitting increases as the classifier becomes more complex, which may result in poor performance in tests of generalization.
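The PPI analysis described in this section amounts to adding an interaction regressor to the GLM. The following is a simplified sketch on simulated data: the interaction term is formed as the product of a mean-centered psychological variable and a mean-centered seed time course. The deconvolution step that established PPI implementations apply before forming the product is omitted, and the design, signals, and effect sizes are hypothetical.

```python
import numpy as np

def ppi_design(psych, seed):
    """Build a simplified PPI design matrix.

    Columns: psychological variable, seed (physiological) signal,
    their product (the PPI term), and an intercept.
    """
    psych_c = psych - psych.mean()
    seed_c = seed - seed.mean()
    interaction = psych_c * seed_c
    return np.column_stack([psych_c, seed_c, interaction, np.ones(len(seed))])

rng = np.random.default_rng(1)
n_scans = 200
psych = np.tile(np.r_[np.zeros(20), np.ones(20)], 5)  # hypothetical context (0/1)
seed = rng.normal(size=n_scans)                        # simulated source-area signal

# Simulated target voxel whose coupling with the seed is stronger when
# the context is "on" (assumed coupling strengths)
target = 0.2 * seed + 0.8 * psych * seed + rng.normal(0.0, 0.5, n_scans)

X = ppi_design(psych, seed)
beta, *_ = np.linalg.lstsq(X, target, rcond=None)
print("PPI (interaction) beta:", beta[2])
```

A reliably nonzero interaction parameter indicates that the seed-target coupling depends on the psychological context, which is the effect a PPI analysis is designed to detect.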

Structural Equation Model (SEM).
The structural equation model, which was developed in the field of econometrics and first applied to neuroimaging data in 1991 [7], is another way to measure effective connectivity. Like PPI, SEM is used to describe how connectivity is affected by experimental context. Compared to PPI, however, SEM is better at identifying causal relationships, and it combines covariances in activity between different brain areas with anatomical models of the connections between these areas [8,9]. SEM contains a group of regions and a group of directed connections, and these connections are presumed to represent causal relationships (Figure 2(c)). SEM requires an a priori assumption of causality, made without inference from the data, and on this basis builds a model of how the regions are connected to each other. The free parameters in these models are "path coefficients," which represent the strength of connections. This approach offers a move from correlational analysis (inherently bidirectional) to unidirectional connections (paths) which imply causality. One well-known strength of SEM is its ability to specify latent variable models that provide separate estimates of the relations among latent constructs and their manifest indicators (the measurement model) and of the relations among constructs (the structural model) [10]. Another strength of SEM is the availability of measures of global fit that can provide a summary evaluation of even complex models that involve a large number of linear equations [10]. It has proved useful in distinguishing a patient's neural network from a healthy subject's neural network in one fMRI simulation study [11].
In another fMRI study, patients with schizophrenia were scanned while performing a "2-back" working memory task. SEM was used to assess effective connectivity within a cortical-subcortical-cerebellar network for mnemonic information processing and to compare the groups [12].
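To make the notion of path coefficients concrete, the sketch below estimates them for a small recursive path model on simulated data. This is a deliberate simplification: a full SEM fits the model-implied covariance matrix (typically by maximum likelihood) and reports global fit measures, whereas here each region is simply regressed on the regions assumed a priori to drive it. The region names "A," "B," and "C" and the connection strengths are hypothetical.

```python
import numpy as np

def path_coefficients(data, parents):
    """Estimate path coefficients of a recursive path model by least squares.

    data:    dict mapping region name -> 1-D time series
    parents: dict mapping region name -> list of regions assumed to drive it
    """
    coefs = {}
    for region, drivers in parents.items():
        if not drivers:
            continue
        X = np.column_stack([data[d] - data[d].mean() for d in drivers])
        y = data[region] - data[region].mean()
        b, *_ = np.linalg.lstsq(X, y, rcond=None)
        for driver, value in zip(drivers, b):
            coefs[(driver, region)] = value
    return coefs

# Simulated regional time series following the assumed chain A -> B -> C
rng = np.random.default_rng(4)
n = 500
a = rng.normal(size=n)
b = 0.7 * a + 0.5 * rng.normal(size=n)
c = 0.5 * b + 0.5 * rng.normal(size=n)

data = {"A": a, "B": b, "C": c}
parents = {"A": [], "B": ["A"], "C": ["B"]}  # a priori anatomical model
coefs = path_coefficients(data, parents)
print(coefs)
```

The direction of each path is supplied by the analyst through `parents`; the data only determine the strength of the connections, which mirrors the a priori nature of causality in SEM.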

Dynamic Causal Model (DCM).
Similar to SEM, the dynamic causal model is an approach to estimating effective connectivity and how this connectivity is influenced by experimental variables. However, SEM and DCM rest on two very distinct generative models (see [10] for a comprehensive comparison of DCM and SEM). DCM treats the brain as a deterministic, nonlinear dynamic system in which the BOLD signal recorded by fMRI results from changes in neuronal activity caused by external inputs [6,13,14]. SEM, in contrast, does not distinguish "neuronal" levels from "hemodynamic" levels, so that in this method changes in effective connectivity lead directly to changes in the covariance structure of the observed hemodynamics. Considering that changes in effective connectivity in the brain occur at the neuronal level, DCM is arguably the better suited of the two methods for fMRI analysis.
The goal of DCM is to estimate and make inferences about the coupling among brain areas and how that coupling is influenced by changes in experimental context by building a reasonably realistic neuronal model of interacting brain regions. This model is then supplemented with a forward model of how neuronal or synaptic activity is transformed into a measured response such as the BOLD signal [13]. This enables estimation of the parameters of effective connectivity from observed data. With DCM, a causal model is built in which neuronal activity in a certain region causes changes in neuronal activity in other regions through interregional connections and self-connections that can be modulated by experimental variables (Figure 2(d)). Effective connectivity is parameterized in terms of coupling among unobserved brain states (e.g., neuronal activity in different regions). The objective is to estimate these parameters by perturbing the system and measuring the response. In brief, what distinguishes DCM from conventional approaches such as SEM and the Granger causality model (GCM) is that it attempts to model neuronal interactions instead of signals [15] and frames the estimation problem in terms of designed perturbations that accommodate experimental inputs. DCM has been broadly used in psychiatric fMRI. For example, in an fMRI study of stroke patients, Grefkes et al. [14] applied DCM to a bilateral network comprising M1, the lateral premotor cortex, and the supplementary motor area (SMA) to assess changes in the endogenous and task-dependent effective connectivity between the cortical motor areas activated by a hand movement task at baseline, following vertex stimulation, and following contralesional M1 stimulation with repetitive transcranial magnetic stimulation (rTMS). In another fMRI study, Roebroeck et al. [16] used DCM to examine the effects of Parkinson's disease and dopaminergic therapy and concluded that DCM model selection is robust and sensitive enough to study clinical populations and their pharmacological treatment.
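The neuronal model and forward model described in this section are commonly written in the following bilinear form. The notation is an assumption introduced here for illustration: $z$ denotes the unobserved neuronal states of the modeled regions, $u$ the experimental inputs, $A$ the endogenous coupling (including self-connections), $B^{(j)}$ the change in coupling induced by input $u_j$, $C$ the direct influence of inputs on regions, and $\lambda$ the hemodynamic forward model with parameters $\theta_h$.

```latex
\dot{z}(t) = \Bigl(A + \sum_{j=1}^{m} u_j(t)\, B^{(j)}\Bigr) z(t) + C\,u(t),
\qquad
y(t) = \lambda\bigl(z(t), \theta_h\bigr) + \varepsilon(t)
```

In this form, the task-dependent changes in effective connectivity that a study such as the rTMS example seeks to estimate would be carried by the entries of $B^{(j)}$, while the endogenous connectivity corresponds to $A$.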

Granger Causality Model (GCM).
GCM is another popular method of estimating effective connectivity [16][17][18], based on the reasoning that one time-series can be considered to cause another if using the past information of the former helps forecast the latter better than using only the past information of the latter [19]. GCM can provide an estimate of connection directionality when one time-series resembles a time-shifted version of the other, on the supposition that the one with temporal precedence caused the other [20] (Figure 2(e)). This method does not depend upon an a priori structural model containing preselected ROIs and connections between them, in contrast to SEM, whose goal is to test a predefined causal model against real datasets. Furthermore, GCM defines the causal relationship between two stochastic time-series purely on the basis of temporal precedence in their interdependency. Demirci et al. [21] scanned patients with schizophrenia and healthy subjects with fMRI while they performed a Sternberg item recognition paradigm (SIRP) and auditory oddball (AOD) tasks. The fMRI data were then decomposed into maximally independent spatial components and corresponding time courses by applying ICA. The time courses of the components most related to the cognitive task, with the most important and meaningful activation patterns, were then used as inputs to a Granger causality test that investigated group differences in causal relationships between independent components over a frequency spectrum. Granger causality can also be applied to resting-state fMRI data to infer instantaneous correlation and causal influences. Applying GCM, Hamilton et al. [22] measured BOLD signals of patients suffering from major depressive disorder during the resting state and found that hippocampal and vACC activation in depressed participants predicted subsequent decreases in dorsal cortical activity.
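The forecasting comparison that defines Granger causality can be sketched with two autoregressive fits: one that predicts the target from its own past only and one that also includes the past of the candidate source. The sketch below is time-domain and bivariate, with an assumed model order of 1 and simulated data; the frequency-resolved and multivariate variants used in the studies above are not shown.

```python
import numpy as np

def lagged(series, order):
    """Stack columns series[t-1], ..., series[t-order] for t = order..n-1."""
    n = len(series)
    return np.column_stack([series[order - k: n - k] for k in range(1, order + 1)])

def residual_variance(y, X):
    """Variance of the least-squares residuals of y on X plus an intercept."""
    X = np.column_stack([X, np.ones(len(y))])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return np.var(y - X @ beta)

def granger_index(source, target, order=1):
    """log(restricted variance / full variance); larger values mean the past
    of `source` improves prediction of `target` beyond target's own past."""
    y = target[order:]
    own_past = lagged(target, order)
    both_past = np.column_stack([own_past, lagged(source, order)])
    return np.log(residual_variance(y, own_past) / residual_variance(y, both_past))

# Simulated pair of time series in which x drives y with a lag of one sample
rng = np.random.default_rng(2)
n = 500
x = rng.normal(size=n)
y = np.zeros(n)
y[1:] = 0.8 * x[:-1] + 0.5 * rng.normal(size=n - 1)

gc_xy = granger_index(x, y)
gc_yx = granger_index(y, x)
print("x -> y:", gc_xy, " y -> x:", gc_yx)
```

Because the index relies only on temporal precedence, it is asymmetric by construction; in fMRI, regional differences in hemodynamic delay can therefore affect the inferred direction.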

Multivoxel Pattern Analysis (MVPA).
MVPA is gaining increasing interest in the neuroimaging community because it allows us to detect differences between conditions with higher sensitivity than conventional univariate analysis by focusing on the analysis and comparison of distributed patterns of activity (Figure 2(f)). In such a multivariate approach, data from individual voxels within a region are jointly analyzed. MVPA applies pattern-classification algorithms such as support vector machines (SVM) [23][24][25][26][27], neural networks [28][29][30], or linear discriminant analysis (LDA) [31,32] as classifiers to distinguish the spatial patterns of different mental states and decode the perceptual or cognitive states of an individual. In the analysis of fMRI data, the features that are descriptive of the objects are first chosen, and a subset of these features is then selected for classification. The data are divided into two parts: a "training set" and a "testing set." The pattern-classification algorithm uses the training set to train the classifier with the features and the prespecified classes of objects. The classifier thus "learns" a functional relationship between the features and the classes. Finally, the classifier is tested for its generalization capabilities on the testing set, and the percentage of correct classifications is measured.
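The train-and-test procedure described above can be sketched as follows. For brevity, a nearest-centroid classifier stands in for the SVM, neural network, or LDA classifiers mentioned in the text, feature selection is skipped, and the trial-by-voxel patterns are simulated; all sizes and effect magnitudes are assumptions.

```python
import numpy as np

def train_centroids(patterns, labels):
    """Learn one mean activity pattern (centroid) per class."""
    classes = np.unique(labels)
    centroids = np.array([patterns[labels == c].mean(axis=0) for c in classes])
    return classes, centroids

def predict(patterns, classes, centroids):
    """Assign each pattern to the class with the closest centroid."""
    # distances has shape (n_samples, n_classes)
    distances = np.linalg.norm(patterns[:, None, :] - centroids[None, :, :], axis=2)
    return classes[np.argmin(distances, axis=1)]

# Simulated multivoxel patterns: two mental states, 50 trials each
rng = np.random.default_rng(3)
n_trials, n_voxels = 100, 50
labels = np.repeat([0, 1], n_trials // 2)
signal = rng.normal(size=n_voxels)  # hypothetical state-specific spatial pattern
patterns = rng.normal(size=(n_trials, n_voxels)) + 0.5 * np.outer(labels, signal)

# Split trials into a training set and a held-out testing set
order = rng.permutation(n_trials)
train, test = order[:70], order[70:]

classes, centroids = train_centroids(patterns[train], labels[train])
accuracy = np.mean(predict(patterns[test], classes, centroids) == labels[test])
print("held-out accuracy:", accuracy)
```

Evaluating accuracy only on trials that were not used for training is what guards against the overfitting risk associated with more complex classifiers.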
Like other multivariate approaches (e.g., PCA and ICA), MVPA takes into account multivoxel patterns of brain activity or connectivity. Information contained in these patterns can then be decoded by applying powerful pattern-classification algorithms. This method thus incorporates spatially distributed patterns of activity into the analysis, unlike univariate methods, which treat every brain voxel independently. MVPA is often presented in the context of "brain reading" applications reporting that specific mental states or representational content can be decoded from fMRI activity patterns after a "training" or "learning" phase. MVPA has been successfully applied to identify functional connectivity differences between males and females [23], patterns in the perception of pain [33], moral intentions [34], consciousness [35,36], and brain maturity [37]. In a study of subjects with autism spectrum disorder (ASD), Coutanche et al. [38] found reliable correlations between MVPA classification performance and standardized measures of symptom severity that exceeded those observed using a univariate measure, which indicates that MVPA has the potential to predict clinical symptom severity.

Resting-State fMRI Analyses
Brain regions which are active when our minds wander may hold a key to understanding neurological disorders and even consciousness itself. Resting-state fMRI, which measures spontaneous low-frequency fluctuations (<0.1 Hz) in the BOLD signal, is a relatively new approach for evaluating regional interactions in the absence of tasks [39][40][41]. For a long time neuroscientists thought that the brain enters a "quiet" state while a person is not doing anything but remaining still. However, recent resting-state fMRI studies reveal that there is a persistent level of background activity in the brain during rest, which is called "the default mode" (DM) [8,11,41-43]. Some neuroscientists believe that the default mode network (DMN) may be critical in uncovering the neural mechanisms of disorders ranging from Alzheimer's disease to depression [44][45][46][47][48][49][50][51]. In addition, due to its capacity for exploring individual differences, as well as its ease of acquisition, resting-state fMRI has become one of the most popular techniques in neuroimaging. In this section we introduce several popular resting-state fMRI analysis methods and compare their advantages and disadvantages (see Table 2).

Table 2 (excerpt): Graph theory.
Description: Describes the topology of functional brain networks by calculating connectional characteristics of the graph comprised of nodes (voxels) and edges (connections between voxels).
Advantages: (i) Directly describes and compares different brain networks utilizing topological parameters.
Disadvantages: (i) Difficult to interpret.

Seed-Based FC Analysis.
The seed-based approach extracts BOLD time-series data from a "seed" (a voxel or ROI selected a priori) and assesses the correlation between the average BOLD signal of the seed and the time courses of all other brain voxels (Figure 3(a)). Seed-based analysis has been applied in resting-state fMRI to explore the relationships between resting-state brain activity and motor response regions [39], intelligence [52], descent into sleep [53], cognitive decline in normal aging [54], memory [55], task-related activation correlated with schizophrenia [56], and task-positive and task-negative networks [57]. In an fMRI study of patients with paranoid schizophrenia conducted by Zhou et al. [58], the right dorsolateral prefrontal cortex and the posterior cingulate cortex were selected as two seed regions. The investigators then computed a correlation map from the correlation coefficients between the reference time-series in the seed region and the time-series of all other brain voxels, from which they found abnormal interregional connectivity in the intrinsic organization of the brain in patients with paranoid schizophrenia. Parkinson's disease (PD) is characterized by motor symptoms resulting from the death of dopamine-generating cells. In previous studies, PD has been associated with abnormal task-related brain activation in sensory and motor regions as well as in the reward-related network. In a recent resting-state fMRI study of corticostriatal skeletomotor circuit dysfunction in PD, the putamen and supplementary motor area (SMA) were selected as seed regions due to their roles in reward processing and motor control [59]. Enhanced putamen-SMA functional connectivity was found in the PD group. Similarly, the periaqueductal gray (PAG) plays a key role in the descending modulation of pain, and its functional connectivity has been intensively examined in chronic pain patients [60].
While seed-based FC analysis has the advantages of statistical transparency and comprehensible results, it also suffers from the potential biases attached to a priori seed selection. For example, researchers examining the default network have used a variety of seeds and generated different versions of the default mode network [61].
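The computation behind a seed-based correlation map can be sketched as follows. The data are simulated, the array layout (time points by voxels) and the seed mask are assumptions, and the preprocessing and nuisance regression that precede this step in a real analysis are omitted.

```python
import numpy as np

def seed_correlation_map(data, seed_mask):
    """Pearson correlation between the mean seed time course and every voxel.

    data:      array of shape (n_timepoints, n_voxels)
    seed_mask: boolean array of shape (n_voxels,) marking the seed voxels
    """
    seed_ts = data[:, seed_mask].mean(axis=1)
    seed_z = (seed_ts - seed_ts.mean()) / seed_ts.std()
    data_z = (data - data.mean(axis=0)) / data.std(axis=0)
    return data_z.T @ seed_z / len(seed_z)

# Simulated data: voxels 0-9 share a common fluctuation, voxels 10-19 do not
rng = np.random.default_rng(5)
n_timepoints, n_voxels = 150, 20
common = rng.normal(size=n_timepoints)
data = rng.normal(size=(n_timepoints, n_voxels))
data[:, :10] += 2.0 * common[:, None]

seed_mask = np.zeros(n_voxels, dtype=bool)
seed_mask[:5] = True  # hypothetical seed ROI made of the first five voxels

r_map = seed_correlation_map(data, seed_mask)
print("connected voxel:", r_map[7], " unconnected voxel:", r_map[15])
```

Changing `seed_mask` changes the resulting map, which illustrates why the choice of seed is a source of bias in this approach.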

Regional Homogeneity (ReHo).
ReHo is another straightforward technique that uses Kendall's coefficient of concordance (KCC) to measure the similarity of the BOLD time-series of a given voxel to those of its nearest neighbors [62] (Figure 3(b)). Multiple studies which applied ReHo to resting-state fMRI data have shown diminished ReHo of specific regions in heavy male smokers [63], patients with Alzheimer's disease [64], patients with depression [65,66], patients with schizophrenia [67], patients with Parkinson's disease [68], children with ASD [69,70], adults with ADHD [49], and normally aging adults [68]. In contrast, ReHo of certain regions has been found to be positively associated with intelligence [71], early blindness [72], and internet addiction disorder [73]. ReHo is very useful in identifying regional abnormality in psychiatric disorders, which may guide further network-based analysis. For example, a recent study found that ReHo changes in schizophrenia are widespread [74], which has motivated brain-wide network analysis in schizophrenia [75,76].
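As a sketch of the quantity ReHo assigns to a voxel, the function below computes Kendall's coefficient of concordance for the time series of a voxel and its neighbors. The neighborhood size of 27 voxels is an assumption made for the example, the data are simulated, and no correction for tied ranks is applied.

```python
import numpy as np

def kendall_w(time_series):
    """Kendall's coefficient of concordance for an array of shape
    (n_voxels, n_timepoints). Tied values are not corrected for."""
    k, n = time_series.shape
    # Rank each voxel's time points from 1 to n
    ranks = np.argsort(np.argsort(time_series, axis=1), axis=1) + 1
    rank_sums = ranks.sum(axis=0)
    s = np.sum((rank_sums - rank_sums.mean()) ** 2)
    return 12.0 * s / (k ** 2 * (n ** 3 - n))

rng = np.random.default_rng(6)
n_neighbors, n_timepoints = 27, 100   # assumed 3 x 3 x 3 neighborhood
shared = rng.normal(size=n_timepoints)

# A locally coherent cluster versus a cluster of unrelated time series
coherent = shared + 0.3 * rng.normal(size=(n_neighbors, n_timepoints))
unrelated = rng.normal(size=(n_neighbors, n_timepoints))

w_coherent = kendall_w(coherent)
w_unrelated = kendall_w(unrelated)
print("coherent:", w_coherent, " unrelated:", w_unrelated)
```

The coefficient ranges from 0 to 1 and equals 1 when all time series in the neighborhood rank their time points identically.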

Amplitude of Low-Frequency Fluctuations (ALFF).
ALFF is an index that reflects the intensity of regional spontaneous brain activity, calculated as the voxel-wise magnitude of the signal within a defined low-frequency range (Figure 3(c)). In order to reduce ALFF's sensitivity to physiological noise, Zou et al. [77] proposed a fractional ALFF (fALFF) approach, which calculates the ratio of the power spectrum in the low-frequency range (0.01-0.08 Hz) to that of the entire frequency range. A number of resting-state fMRI studies have observed higher ALFF in DMN areas than in other areas [77][78][79]. Applications of ALFF in studies of conditions such as depression [80], ADHD [81], PTSD [82], normal aging [83], and schizophrenia [84] have also revealed some exciting findings. Recently, by decomposing R-fMRI low-frequency oscillations (LFOs; typically 0.01-0.1 Hz) into two distinct frequency bands, slow-5 (0.01-0.027 Hz) and slow-4 (0.027-0.073 Hz), researchers found that LFO amplitudes in the slow-4 band were higher than those in the slow-5 band in many brain regions [85,86]. It was further demonstrated that the abnormalities of LFOs in schizophrenia depend on the frequency band, which suggests that future studies should take the different frequency bands into account when measuring intrinsic brain activity [85].
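A minimal sketch of both indices for a single voxel is given below. It assumes that the amplitude spectrum (rather than the power spectrum) is summed, a repetition time of 2 s, and the 0.01-0.08 Hz band; detrending and the standardization across voxels usually applied in practice are omitted, and the test signals are synthetic sinusoids.

```python
import numpy as np

def alff_falff(ts, tr, band=(0.01, 0.08)):
    """Return (ALFF, fALFF) for one voxel time series sampled every `tr` seconds."""
    ts = ts - ts.mean()                          # remove the 0 Hz component
    freqs = np.fft.rfftfreq(len(ts), d=tr)
    amplitude = np.abs(np.fft.rfft(ts)) / len(ts)
    in_band = (freqs >= band[0]) & (freqs <= band[1])
    alff = amplitude[in_band].mean()
    falff = amplitude[in_band].sum() / amplitude[1:].sum()  # skip the 0 Hz bin
    return alff, falff

tr = 2.0                                         # assumed repetition time (s)
t = np.arange(200) * tr
slow = np.sin(2 * np.pi * 0.05 * t)              # oscillation inside the band
fast = np.sin(2 * np.pi * 0.20 * t)              # oscillation outside the band

alff_slow, falff_slow = alff_falff(slow, tr)
alff_fast, falff_fast = alff_falff(fast, tr)
print("fALFF slow:", falff_slow, " fALFF fast:", falff_fast)
```

Passing `band=(0.01, 0.027)` or `band=(0.027, 0.073)` restricts the same computation to the slow-5 or slow-4 band.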

Principal Component Analysis (PCA).
PCA is a data-driven method that does not require the input of any prior information about the connectivity pattern. It has been found useful in estimating whether there are functional regions with correlated signal responses in human brain mapping [87]. It was first formulated by Pearson [88] and then developed as a useful technique for reducing the dimensionality of complex data sets and for extracting new orthogonal variables identified as principal components [89]. The basic idea of PCA is to find a set of orthogonal bases that maximize the variance of the data and to separate the most meaningful information from the noise so as to uncover the hidden structure (Figure 3(d)). For fMRI data, PCA has the advantage of being able to verify differences in activation between conditions or groups without any prior knowledge of the form of the BOLD response or the structure of the experimental design [90]. In psychiatric fMRI analysis it is often combined with other techniques such as ICA and MVPA. For example, Shen et al. [91] aimed to classify individuals into schizophrenic and healthy control groups by a quantitative method. They collected fMRI data from patients with schizophrenia and healthy subjects and reduced the data size by using PCA decomposition. ICA was then employed to extract the functionally connected networks in the brain, yielding less noisy components, which were used as input to the classifier algorithm. However, the effectiveness of PCA rests on strong assumptions such as linearity, orthogonal principal components, and a high signal-to-noise ratio (SNR) [92], and data sets do not always satisfy these assumptions.
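The decomposition underlying PCA can be sketched with a singular value decomposition of the mean-centered data. The array layout (time points by voxels) and the simulated data with a single shared pattern are assumptions made for the example.

```python
import numpy as np

def pca(data, n_components):
    """PCA of data with shape (n_timepoints, n_voxels) via singular value
    decomposition of the column-centered data."""
    centered = data - data.mean(axis=0)
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    components = vt[:n_components]                     # orthonormal spatial patterns
    scores = u[:, :n_components] * s[:n_components]    # component time courses
    explained = (s ** 2) / np.sum(s ** 2)              # fraction of variance
    return components, scores, explained[:n_components]

# Simulated data: one shared spatial pattern with its time course, plus noise
rng = np.random.default_rng(7)
n_timepoints, n_voxels = 100, 30
latent = rng.normal(size=n_timepoints)
pattern = rng.normal(size=n_voxels)
data = 3.0 * np.outer(latent, pattern) + rng.normal(size=(n_timepoints, n_voxels))

components, scores, explained = pca(data, n_components=5)
print("variance explained by each component:", explained)
```

Keeping only the leading components is the dimensionality reduction step that, in pipelines such as the one described above, precedes ICA or classification.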

Independent Component Analysis (ICA).
As an extension of PCA, ICA is likewise a data-driven method that has been successfully used in describing fMRI data [93][94][95][96]. Both methods share the goal of finding a new set of variables with less redundancy that provides the best possible representation of the observed phenomena. ICA, however, measures redundancy by the much richer concept of independence (Figure 3(e)) and requires only relatively weak assumptions about the independence of the source signals, whereas PCA extracts the variables of interest on the basis of decorrelation and requires some more stringent assumptions [97]. In ICA, the independent components are assumed to be statistically independent. One of the most useful applications of ICA is reducing the negative effects of artifacts on standard GLM-based analysis by using the decomposition information [98,99]. Another useful application of ICA is in detecting resting-state functional connectivity and identifying resting-state networks (RSNs) [39,94,96,100,101]. ICA is also used in a task-related fMRI group analysis approach called FENICA [102]. ICA has been widely applied to the study of brain diseases, such as Alzheimer's disease [20,44], schizophrenia [21,22], bipolar disorder [2,103], and epilepsy [2].
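The generative assumption behind ICA can be stated compactly. In the notation assumed here for spatial ICA of fMRI data, $X$ is the data matrix (time points by voxels), the rows of $S$ are the statistically independent spatial components, $A$ is the mixing matrix whose columns are the component time courses, and $W$ is the unmixing matrix estimated by the algorithm:

```latex
X = A\,S, \qquad \hat{S} = W X, \quad W \approx A^{-1}
```

PCA fits the same linear form but only requires the recovered components to be uncorrelated and orthogonal, whereas ICA chooses $W$ so that the rows of $\hat{S}$ are as statistically independent as possible.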

Graph Theory.
Graph theory is a method that has recently attracted considerable attention in resting-state fMRI. It is a mathematical approach to studying graphs made up of nodes and edges and how the nodes connected by edges interact with each other [58,104] (Figure 3(f)). The brain network can be described as being analogous to a graph in which voxels can be viewed as nodes and connections between voxels as edges [105]. In fMRI studies, graph theory has been used by researchers seeking to present a comprehensive map of how the brain is organized. The unique characteristic of graph theory compared with the more traditional univariate fMRI methods is that it can serve as a tool to directly describe and compare different brain networks using topological parameters such as clustering coefficient, characteristic path length, degree of connectivity, centrality, and modularity [106]. Evidence from graph theory in fMRI studies has shown that the brain is structured in a highly efficient organization with both a small-world topology achieved through the presence of hubs and a scale-free topology [107,108]. Graph theory has been applied not only to resting-state and task-based fMRI to analyze the topology of functional brain networks [105,109] but also to studies of cortical thickness [110,111], surface area, and diffusion-weighted imaging data [91,112,113] to analyze the topology of structural brain networks. These studies have illustrated alterations in the arrangement of structural and functional brain networks associated with normal aging [114,115], multiple sclerosis [116,117], Alzheimer's disease [118][119][120], schizophrenia [121][122][123], depression [124,125], and epilepsy [110,126].
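Two of the topological parameters named above, degree and clustering coefficient, can be sketched for a binary undirected graph as follows. The thresholding rule used to turn a correlation matrix into a graph and the toy four-node network are assumptions made for illustration; path length, centrality, and modularity are not covered.

```python
import numpy as np

def threshold_graph(corr, threshold):
    """Binarize a correlation matrix: an edge exists where |r| > threshold."""
    adjacency = (np.abs(corr) > threshold).astype(int)
    np.fill_diagonal(adjacency, 0)  # no self-connections
    return adjacency

def graph_metrics(adjacency):
    """Degree and clustering coefficient of each node in a binary,
    undirected graph without self-connections."""
    a = np.asarray(adjacency, dtype=float)
    degree = a.sum(axis=1)
    triangles = np.diag(a @ a @ a) / 2.0      # triangles passing through each node
    possible = degree * (degree - 1) / 2.0    # pairs of neighbors of each node
    clustering = np.divide(triangles, possible,
                           out=np.zeros_like(triangles), where=possible > 0)
    return degree, clustering

# Toy network: nodes 0, 1, 2 form a triangle and node 3 is attached to node 0
toy = np.array([[0, 1, 1, 1],
                [1, 0, 1, 0],
                [1, 1, 0, 0],
                [1, 0, 0, 0]])
degree, clustering = graph_metrics(toy)
print("degree:", degree, " clustering:", clustering)
```

Nodes with fewer than two neighbors are assigned a clustering coefficient of zero here, which is a convention rather than a requirement.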

Conclusion
Over the past decades, the development of fMRI techniques has made great contributions to our understanding of the neural mechanisms underlying psychiatric disorders.
In the present review, we summarize several major MRI methods widely used in psychiatric neuroimaging. Some methods such as ReHo and VBM focus on regional changes, whereas others take a systematic approach and emphasize the whole brain network. Together, these methods can reveal abnormalities in brain structure and function in psychiatric disorders. However, the functional significance of many measures such as ReHo and ALFF is still not well understood. Psychiatric disorders may be associated with very subtle changes in the brain, and a single method may not be enough to fully capture the nature of such alterations. A systematic approach using multimodal neuroimaging and a variety of analysis methods has the potential to identify reliable biomarkers for specific psychiatric disorders. With ongoing progress in neuroimaging methods, neuroimaging holds clear promise in helping to diagnose and quantify psychiatric diseases.

Disclosure
The funders had no role in study design, decision to publish, or preparation of the paper.