Reciprocal Benefits of Mass-Univariate and Multivariate Modeling in Brain Mapping: Applications to Event-Related Functional MRI, H2 15O-, and FDG-PET

In brain mapping studies of sensory, cognitive, and motor operations, specific waveforms of dynamic neural activity are predicted based on theoretical models of human information processing. For example in event-related functional MRI (fMRI), the general linear model (GLM) is employed in mass-univariate analyses to identify the regions whose dynamic activity closely matches the expected waveforms. By comparison multivariate analyses based on PCA or ICA provide greater flexibility in detecting spatiotemporal properties of experimental data that may strongly support alternative neuroscientific explanations. We investigated conjoint multivariate and mass-univariate analyses that combine the capabilities to (1) verify activation of neural machinery we already understand and (2) discover reliable signatures of new neural machinery. We examined combinations of GLM and PCA that recover latent neural signals (waveforms and footprints) with greater accuracy than either method alone. Comparative results are illustrated with analyses of real fMRI data, adding to Monte Carlo simulation support.


INTRODUCTION
Inferential methods used in human brain mapping span a spectrum of experimental designs and statistical techniques. In the broadest terms the task is to recast the predictions of a theoretical description of neural information processing into testable properties of the neuroimaging data. A logical starting point is the mapping u ← F(ν, θ) in which u represents the data of a neuroimaging study, acquired using one of several imaging technologies; and ν represents the set of physiological mechanisms that have potentially influenced the measurement u. Manifest evidence that the latent processes ν of experimental interest are the actual determinants of the imaging data u is achieved through the activation and modulation of the latent physiological (neural) processes by means of parametric manipulations of the stimulus input. F is the model of the conjoint influences of the latent physiological activity on u. The vector θ quantifies both the relative strength of each mechanism's contribution to u and the strength of interactions among different latent mechanisms.
A current example is the acquisition of the BOLD MRI signal (blood oxygenation-level-dependent signal), a surrogate measure of local neural activity, in studies involving event-related experimental designs. In event-related functional MRI (fMRI), the mapping is expressed as u(s, t) ← F(ν(s, t), θ), in which each ν(s, t) may be thought of as a "movie" of an aspect of information processing whose neural signal is manifest at one or more locations s in the brain, at one or more time points during data acquisition interval T, that is, for times t ∈ T. In modeling neuroimaging data, the aim is to infer the spatiotemporal properties of the underlying operations ν(s, t), and how these ν(s, t) jointly determine the measured u(s, t).
Model construction also includes a quantitative account of the spatiotemporal filtering of F(ν(s, t), θ) introduced by the imaging technology. In the case of BOLD signal acquisition, F(ν(s, t), θ) must be transformed to represent the convolution of the hypothetical neural signal with the hemodynamic response function. Caveats are that the hemodynamic response may be different for different brain regions, and may also differ in the same brain region in different individuals, for example, in individuals of different ages.
In analyzing imaging data sets, there is a plethora of different observation models. From the perspective of inferential statistics, the choice of experimental design and observation model is dependent on both the abstract mapping u(s, t) ← F(ν(s, t), θ) and the spatiotemporal filtering of F(ν(s, t), θ) introduced by the imaging technology. The choices depend on (a) what is known a priori about ν(s, t) and F and θ; (b) which mechanisms ν(s, t) and properties of F(ν(s, t), θ) are of primary interest to the experimenter; and (c) the degree to which the features of interest are resolvable in the filtered representation of F(ν(s, t), θ). In current neuroscience studies of human sensory processing and cognitive and motor operations, the observation models that are ordinarily applied to the data are of two classes [1][2][3][4][5][6][7][8][9][10][11][12]: the general linear model (GLM) used in mass-univariate analysis and the multivariate models based on PCA or ICA decomposition.

Mass-univariate analysis
In mass-univariate analyses one or more hypothetical models F(ν(s, t), θ) are used to predict the data u(s, t). For each F(ν(s, t), θ) the observation model consists of a set of explanatory variables, or design matrix, that is assumed to be a set of known and fixed predictors, and the model is applied identically to all voxels in the brain. In the design matrix the primary design variables provide a detailed description of the predictions regarding the behavior of the hypothetical operations ν(s, t) in different experimental conditions (i.e., different temporal epochs); and the secondary design variables describe potential nuisance effects that, were they not taken into account, would inflate the GLM estimate of random error. Standard linear methods are used to quantify the contributions of the predictor variables to the temporal waveforms of individual voxels. The aim is to identify voxels for which one or more F(ν(s, t), θ) provide a plausible account of the local temporal activity in u(s, t). Moreover, in head-to-head comparisons of competing theoretical models the best-case scenario would be that in which only one model provides a high level of explanatory power.

Multivariate modeling
Multivariate models based on PCA or ICA decomposition have a somewhat different focus-on the waveform similarities in the dynamic neural activity of different brain regions. The underlying premise of this type of multivariate modeling is that multiple signals are generated in response to experimental stimulus input, and each signal is manifested in multiple brain regions. That is, similar neural trains of activity appear at multiple brain sites-with locations not only in sensory pathways, but also in limbic and temporoparietal pathways and areas of prefrontal cortex.
From this theoretical perspective the brain mines stimulus inputs using complementary inferential modes: (a) specialized sensory coding methods, for example, the array of feature-specific coding schemes known to be deployed in the initial processing of visual stimuli; and (b) the contextual guidance provided by working memory and executive systems that relate immediate stimulus events to organismgenerated goals. Concretely, predictive coding models suggest that signals generated in sensory pathways are likely to be fed forward for interpretation and synthesis to limbic and temporoparietal pathways associated with short-and longterm memories, and to the prefrontal cortices that are involved with working memory, including goal-directed response selection, motor planning, and error checking. Likewise, signals containing contextual and goal-specific information are fed back to sensory pathways, modifying sensorial representations of external stimulus events.
The observation models used in a multivariate analysis decompose the neuroimaging data u(s, t) into a series of components, in which each component represents a temporal waveform that is expressed to a stronger or weaker degree in a multiplicity of brain regions and not at all in other brain regions. In applications of unguided PCA and ICA, only mild constraints are imposed on the temporal waveforms and their respective spatial modes (i.e., topographic patterns of nonzero signal expression). Specifically in PCA, the series of waveforms are constrained to be mutually orthogonal, as are the series of spatial modes; and in ICA, either the series of temporal waveforms or the series of spatial modes are constrained to be maximally statistically independent.
Indeed, in the case of unguided PCA and ICA the individual components may, or may not, be related in a one-to-one fashion to either (a) the true neural signals ν(s, t) occurring in one or more task conditions (temporal epochs), or (b) particular behavioral and demographic experimental variables. On the other hand, these PCA and ICA decomposition methods are designed to provide an accurate approximation to the brain-wide footprint of the sites associated with the aggregate of latent neural signals ν(s, t). Ordinarily the experimental prediction is that the brain-wide footprint will be sparse in total anatomical extent, although spatially distributed.
Guided PCA and ICA observation models, on the other hand, are designed to further constrain the components of the PCA or ICA decomposition to spatiotemporal features of the data u(s, t) that most closely match the hypothesized neural processes ν(s, t) and their predicted activity in different experimental conditions.

Reciprocal benefits of mass-univariate and multivariate modeling
The practical reality is that no one modeling method alone will provide an exact description of the physiological mechanisms that are the actual determinants of the imaging data: neither the theoretical models F(ν(s, t), θ), nor their instantiation in GLM, nor the major components of unguided or guided PCA or ICA. However, there is a potential advantage to explore the footprint of a multivariate analysis with one or another theoretical model F(ν(s, t), θ). A theoretical model can provide a parsimonious account of regional activity for some portion of the voxels within the multivariate footprint. This account represents an implicit judgment of similarity between theoretical neuroscientific predictions and the latent processes actually operating within the footprint. The standard GLM calculation of goodness-of-fit represents the true explanatory power achieved by F(ν(s, t), θ) when contrasted with its distribution predicted by random Gaussian field theory. Goodness-of-fit is calculated on a voxel-by-voxel basis, but includes the footprint's anatomical extent and the level of type-I error protection as global parameters. The practical advantage in applying the mass-univariate analysis to a multivariate footprint-rather than brain wide-is that the spatially constrained analysis identifies additional voxels in which F(ν(s, t), θ) has actual explanatory power. The greatest reciprocal benefit is afforded when (a) the major temporal waveforms obtained from an PCA or ICA decomposition span the fixed predictor variables of the mass-univariate analysis; and (b) almost all, if not all regions for which the mass-univariate analysis provides a moderateto-high level of explanatory power lay within the multivariate footprint. Perhaps the multivariate analysis that is best equipped to take advantage of these potential benefits is the multivariate linear model (MLM), a guided PCA that was among the first multivariate methods applied to eventrelated fMRI data. The conjoint MLM and mass-univariate analysis is based on a theoretical model F(ν(s, t), θ) in which the temporal waveforms obtained from a MLM-PCA decomposition are constrained to match the fixed predictor variables of the mass-univariate analysis. MLM has the added virtue that the GLM-type mean contrast effects between experimental conditions are computed using a proper statistical method of whitening the data along the temporal dimension.
The essential strength of the MLM analysis is that, like mass-univariate analysis, it is based on our accrued knowledge about human information processing and the theoretical constructs derived there from. On the other hand, the strong reliance by MLM on current theory limits its capacity to uncover novel features of the data u(s, t) that reveal neural machinery not heretofore anticipated.

Utility of individual differences in brain mapping
The exploration of individual differences has been a dependable means for discovering novel neural machinery as chronicled in the research findings of cognitive psychology and clinical neuroscience [13,[19][20][21][22][23]. In brain mapping the main sources of information about individual differences are the interactions between brain regions, experimental task parameters, and endogenous variables. It is thus understandable that guided PCA methods were devised early on in the development of noninvasive brain imaging technologies to explore subject-related interaction effects. These models included the subprofile scaling model (SSM) [3,12,24,25] and the partial least squares methods [6,26]. These guided PCA were originally designed for application to data acquired with positron emission tomography (PET) with H 2 15 O perfusion and [ 18 F]Fluorodyoxyglucose, and topographic electroencephalography (EEG).
The authors and others [4,5,7,8,11,27] have extended these initial developments in guided PCA to take advantage of the higher temporal resolution of event-related fMRI. The clear benefit of higher resolution is that more experimental tasks, and greater numbers of comparisons between experimental conditions and their parametric controls, can be built into study designs. Two of the newest guided PCA are the generalized partial least squares (gPLS) and ordinal trends (OrT) analyses [5,7]. These guided PCA are designed to capture the joint influences of experimental task parameters and endogenous factors on latent neural signals of theoretical interest. In both gPLS and OrT the aim is to combine the verification of neural machinery that is reasonably well understood with the discovery of reliable signatures of new neural machinery.

Ordinal trends model
In this report we focus on the OrT analysis. The inferential strategy that is unique to OrT is its capacity to capture the joint influence of task parameters and endogenous factors on u(s, t) without resorting to classical latent variable modeling. In brain mapping, the latent variables are the neural processes ν(s, t) and their spatiotemporal properties; their observable counterparts are both the experimental predictor variables and subject variables, for example, indices of task performance, IQ, education and age. From the perspective of standard latent variable analysis [28], the method of estimating ν(s, t) relies on models that impose explicit constraints on the relationships among different ν(s, t) and between individual ν(s, t) and experimental predictor variables, behavioral scores and demographic factors. In contrast, an OrT analysis is based on the experimental design variables alone, without the use of either behavioral scores, demographic variables, or causal models that depict the relationship between latent brain circuitry and endogenous variables.
The OrT analysis is predicated on event-related experimental designs in which positive incremental changes in task parameters are expected to produce positive monotonic trends in the activity of individually targeted signals ν(s, t).
OrT performs a separate analysis for each ν(s, t) with the aim of identifying one or more topographic patterns in u(s, t) that expresses positive ordinal trends on a subject-by-subject basis. OrT is a guided PCA: a specially designed linear transformation is applied to the neuroimaging data with the effect that maximal salience is assigned to topographic patterns whose expressions are monotonic across a specified series of experimental conditions, corresponding to the positive incremental changes expected in the level of the targeted neural signal.
Algebraically speaking, the multiplication of the data matrix by the OrT design matrix differentially alters the voxel-by-condition-by-subject variance of three types of latent patterns: see the appendix. First, the OrT transformation discriminates among patterns that expressed mean trends in the predicted direction from patterns that expressed mean directional changes that are different from the predicted trend; and second, the transformation discriminates among different types of patterns within the first category. In the first category, the OrT design matrix discriminates among patterns in which the direction of the trend expressed is the same in all subjects from patterns that express conditionby-subject interactions in which the direction of the trend expressed is different for different subjects. Lastly, the design matrix is constructed to preserve the relative size of the voxel-by-task-by-subject variance accounted for by topographic patterns that express ordinal trends. On this basis the application of PCA, or singular value decomposition (SVD), to the transformed data set can be expected to produce major principal components that provided a good approximation to one or more target patterns, where each expresses ordinal trends on an individual subject basis.
Importantly, the data structure to which the OrT model is applied is not the raw fMRI data. Initially, the spatiotemporal data are preprocessed to remove the normal MR artifacts, for example, susceptibility and motion artifacts and artifacts associated with respiration and cardiac pulsations. Subsequently, a standard method of temporal averaging is applied to the "artifact-free" data to construct brain maps for individual subjects that represent the BOLD activity within different task conditions (epoch types) of the experimental design. This temporal averaging enhances the signal-tonoise characteristics of BOLD activity that is time-locked to stimulus-based, cue-based, and response-based epochs. These brain maps are the data structure to which the OrT model is applied, that is, the data consist of one brain map per subject per task condition (or epoch). More details of the time series modeling are provided in our example of an OrT analysis applied to real event-related fMRI data.
Robust inferential statistical methods have been designed for OrT applications to these types of data structures. Nonparametric statistics are used to control type-I error rates, for example, permutation test statistics and error statistics based on Monte Carlo simulations of random Gaussian fields; see the appendix. In addition, bootstrap resampling methods are applied to OrT topographic pattern estimates to evaluate the reliability of nonzero voxel weights. The reliability of individual voxel weights is computed as z-scores, where the higher the z-score the less likely it is that any subject is extraordinarily influential in determining voxel weight. The caveat is that in our current bootstrap procedure the areal extent of clustered voxels is not taken into account in calculating individual z-scores.
We suggest that OrT is likely to provide the greatest benefit in experiments that admit substantial interactions between experimental task parameters and endogenous variables. On the one hand, the OrT analysis is predicated on the notion that experimental control is sufficiently robust that positive incremental changes in task parameters produce positive ordinal trends in the activity of each targeted signal ν(s, t) of the theoretical model F(ν(s, t), θ). That is, the OrT analysis is designed to recover the footprint of each ν(s, t) for which every subject (or almost every subject) expresses a positive ordinal trend. In particular, footprint recovery is possible in data sets in which there is substantial variation in the trajectories of subjects' positive ordinal trends. The worstcase scenario for which recovery of ν(s, t) may be feasible are data sets in which interactions between task parameters and endogenous variables take the form of additional latent processes that had not been included in (i.e., were not predicted by) the theoretical model F(ν(s, t), θ). The additional latent processes may express mean trends similar to that of a targeted ν(s, t). But what distinguishes each of these latent processes from ν(s, t) is that the directional trend in task activity is different for different subjects. In other words, the experimental control over the operation of these latter latent processes is markedly less than that achieved with the targeted processes ν(s, t).
In applications to real data sets, for example, H 2 15 O PET data sets and event-related fMRI data sets, there are striking examples in which the OrT analysis appeared to provide a relatively unconfounded and unbiased estimator of a target pattern [7,29]. By contrast, the corresponding map of GLM mean trend statistics deviated markedly from the OrT estimate of the target footprint, suggesting that the GLM map estimate is influenced by interactions between task parameters and endogenous variables. We have implemented Monte Carlo methods to simulate data sets that manifest similar differences between OrT and mass-univariate analyses [7]. The simulated data sets represent the worst case scenario in which there is substantial variation in the subject trajectories of target ν(s, t) activity plus, the superposition of several "nuisance" latent processes. The inclusion of these interaction effects in simulated data sets results in maps of GLM mean trend statistics that contain significant contributions from both target and nuisance processes. By contrast, the OrT analysis provides a substantially less confounded estimate of the target footprint.
In sum, OrT is likely to provide the greatest benefit in studies in which (a) enrollment criteria create subject samples that reflect the population level of phenotypic variation, and (b) experimental control is sufficiently strong that the latent neural processes of primary theoretical interest exhibit positive ordinal trends. This potential advantage is particularly relevant to studies of learning and memory for which there are ordinarily inherited and acquired differences among individuals.

EXAMPLE OF AN OrT ANALYSIS APPLIED TO EVENT-RELATED fMRI
We demonstrate here the practical utility of an OrT analysis with its application to the event-related fMRI data from a study of visual recognition and perceptual adaptation [29].
We describe below the essential information about experimental goals and design, the fMRI acquisition and preprocessing steps, as well as the OrT analytic design and the patterns of regional activations that represented experimental effects. The OrT computational methods are outlined in the appendix that includes (a) a step-by-step recipe of the OrT computations, and (b) the attendant inferential statistical methods that are routinely applied.

Experimental aims
The aim of the fMRI study was to investigate the effects of stimulus repetition on behavioral and neurophysiological measures of adaptation. We used a modified, trial-based version of the possible/impossible object decision (IP-OD) task that was originally designed by Schacter et al. [30]. Unlike the original IP-OD task, our version was designed to measure repetition effects over delays of a few seconds rather than minutes. Our modified IP-OD task was benchmarked with the production of significant perceptual priming effects. Significant reaction time (RT) effects occurred for stimulus repetitions (p < 0.0001) and object type (p < 0.005), with a nonsignificant trend in the interaction between repetition number and object type (p < 0.08). In this two-alternative, forced-choice paradigm, the decision theoretic parameters of object discrimination, d' and bias, remained nearly constant across successive object presentations. The minimum d' and maximum bias were 2.92 ± 0.56 and −0.66 ± 0.57 (mean ± SD), recorded for initial presentations.
Ordinal trend analysis was applied to the fMRI BOLD signal. The analytic goal was to recover a latent component of the BOLD signal that appeared in multiple brain regions and that, with successive exposures of a test object, exhibited either a positive trend in every subject, or a negative trend in every subject. OrT was applied separately to possible and impossible objects [29]. We have limited our report here to the analysis of possible objects with the express purpose of illustrating the OrT methodology.

Subjects
Fourteen healthy, right-handed subjects (age = 22.8 ± 3.8 [Mean ± SD]), recruited from the Columbia University student population, participated in the experiment. All subjects supplied informed consent, as approved by the Internal Review Board of the College of Physicians and Surgeons of Columbia University. Volunteers were screened for psychiatric and neurological illness via a questionnaire.

Task procedures
The stimuli used in the visuo-perceptual task consisted of "possible" and "impossible" objects ( Figure 1). Possible objects were two-dimensional renderings of three-dimensional solid forms, where the latter are composed of a small number of intersecting planar surfaces. By contrast, the planar surfaces rendered in impossible objects did not come together to form actual 3D solid objects. With each stimulus presentation, that is, on each trial, the subject's task was to decide whether the visual stimulus was a possible or an impossible object-hence the term "object decision." Every trial was exactly-3000 milliseconds (ms) in duration: a trial began with a 500-ms ITI, followed by a fixation cue for 250 ms. Fifty milliseconds after fixation offset, the stimulus then appeared for 1000 ms; trials were terminated 1200 ms after stimulus offset. Practice trials were administered to confirm that participants understood what it meant to judge object type. Prior to commencement of fMRI scanning, subjects were told that (a) their memory of visual objects was being tested, (b) they would be viewing an extended series of object presentations, and (c) they should respond as quickly and as accurately as possible to each test object in the series. The PI-OD task consisted of three test blocks, each with a different set of 13 possible and 13 impossible objects. Within a block each test object was presented four times. Altogether a block consisted of 104 test objects. The PI-OD task design was counterbalanced to obviate confounds between experimental effects [29].
With subjects laying supine in the MR scanner, task stimuli were back-projected onto a screen located at the foot of the MRI bed using an LCD projector. Subjects viewed the screen via a mirror system located in the head coil. Responses were made on an LUMItouch response system (Photon Control Company). PsyScope [31] was used to control task events and collect subject responses (reaction time and accuracy). In addition PsyScope electronically synchronized task events with the MRI acquisition computer. 6 International Journal of Biomedical Imaging

fMRI data preprocessing
The several images acquired included T2 * -weighted functional images, T1 "scout" images, and T2 anatomical images. Details regarding the acquisition parameters for these different images are reported in Habeck et al. [29]. All image preprocessing and analysis was done using the SPM99 program (Wellcome Department of Cognitive Neurology) and other code written in MATLAB (Mathworks, Natick, Mass). The following steps were taken in turn for each subject's GE-EPI data set: data were corrected for the timing of slice acquisition, using the first slice acquired in the TR as the reference. All GE-EPI images were realigned to the first volume of the first session. The T2-weighted structural image was coregistered to the first EPI volume using the mutual information coregistration algorithm implemented in SPM99. The latter high-resolution image was then used to determine parameters (7 × 8 × 7 nonlinear basis functions) for transformation into Talairach standard space [32] defined by the Montreal Neurologic Institute (MNI) template brain supplied with SPM99. This transformation was then applied to the GE-EPI data, which were resliced using sinc-interpolation to 2 mm × 2 mm × 2 mm.

fMRI time-series and OrT modeling
A first-level, GLM-based, time series analysis was performed on individual subject image data [33] from which parameter images were constructed. A second-level OrT analysis was applied to these latter images for the group of 14 subjects. At the first level, the fMRI time-series analysis was applied voxel-wise, in which linearity and time-invariance were assumed in the physiological transformation of neural activity into a fMRI BOLD signal [34]. The steps in modeling followed the example of Friston et al. [35] and Zarahn [1,36]: GE-EPI time-series were simultaneously modeled with regressors that represented the hypothesized BOLD response to the individual PI-OD trial types-relative to a baseline of intertrial intervals. The individual GLM regressors were constructed as convolutions of an indicator sequence (i.e., a train of discrete-time delta functions) representing delayed trial onsets, an assumed BOLD impulse response function (as represented by default in SPM99), and a rectangular function of trial duration. A predictor variable was created for each of eight-trial types-two-object types times four-object presentations; and eight images of GLM parameter estimates were produced for a subject. Subject images were each intensity normalized (via voxel-wise division by the image time series mean) and spatially smoothed with an isotropic Gaussian kernel (full-width-at-half-maximum = 8 mm).
These images of GLM parameters were subsequently submitted to an OrT analysis. An OrT analysis was performed on the first three-object presentations, based on the information that the largest change in RT occurred between the first and second, or first and third presentations. OrT patterns were constructed from the first few principal components, and their significance was evaluated using nonparametric test statistics (see the appendix). As a source of independent  Figure 2: Results of an OrT guided PCA applied to the imaging data of 14 participants in the IP-OD study, for which negative ordinal trends were predicted across repeated object presentations (presentation number). A linear combination of the first two principal components (PCs) produced significant results: (a) negative monotonic trends exhibited by 12 of 14 subjects in the plot of presentation number versus pattern expression (p < 0.01); and (b) positive correlation (p < 0.0005) between the change score in OrT pattern expression (difference between first-and second-object presentations), and the corresponding change score in reaction time (index of perceptual repetition suppression). validation, change scores in OrT pattern expression were correlated with the perceptual measure of repetition suppression, that is, change scores in RT.

RESULTS
A statistically significant OrT topographic pattern was obtained using the first two-principal components. All but two of the 14 subjects expressed positive ordinal trends (Figure 2(a)) with stimulus repetition (p < 0.01). The OrT pattern accounted for 16% of the total voxel-by-conditionby-subject variance in the untransformed fMRI data set. The OrT pattern estimate identified not only areas exhibiting 7 repetition suppression, but also brain areas that were positively increasing with successive presentations of each possible object, that is, "repetition augmentation." In addition the index of perceptual repetition suppression, that is, the change score in the difference in reaction time between the firstand second-object presentations, was significantly correlated (R 2 = 0.67, p < 0.0005) with the corresponding change score in OrT pattern expression (Figure 2(b)). The one subject who did not show perceptual repetition suppression was an outlier in this correlation analysis. Indeed, this subject was an outlier in the OrT analysis as well-exhibiting a negative, rather than positive OrT change score. Notwithstanding, the correlation between OrT expression and RT was significant without this subject outlier.
Our bootstrap resampling method confirmed that many of the nonzero voxel weights of the OrT pattern estimate were reliable (Table 1). Figure 3 maps the voxels with bootstrap z-scores ≥ 3.09, which is associated with uncorrected p-values ≤ 0.001. The bootstrapped OrT pattern revealed experimental effects in several areas of the visual pathway, including primary visual cortex, the precuneus and supramarginal gyrus, fusiform gyrus and parahippocampus, and the inferior frontal gyrus. Areas of increasing activation with successive object presentations populated regions predominantly in the left hemisphere, although right BA 39 exhibited increasing activation as well (Figure 3(a)). In contrast, areas of decreasing activation populated posterior dorsolateral regions of both hemispheres, ventrolateral regions of the right hemisphere, and a portion of right BA 44 (Figure 3(b)). It is unlikely that any subject was extraordinarily influential in determining the voxel weight of these superthreshold regions.
Lastly, we also performed a mass-univariate analysis in which the predictor variable was the mean contrast most similar to the positive ordinal trend prediction, that is, the linear mean trend across three-object presentations. Two brain areas were identified with F-values > 5.61, which are associated with uncorrected p-values < 0.001: these regions revealed a mean repetition suppression effect, but not a common directional trend in all subjects. Moreover, no voxel survived an SPM99 correction for multiple comparisons. (This Bonferroni-like correction uses a random Gaussian field adjustment that properly accounts for spatial dependences in the data.)

DISCUSSION
In current neuroscience studies of human sensory processing and cognitive and motor operations, the observation models that are applied to data sets have usually been one of two kinds: the general linear model (GLM) used in massunivariate analysis and the multivariate models based on PCA or ICA decomposition. Although these two modeling strategies have an essential complementarity-in that the strengths of the one can be used to bolster the weaknesses of the other, it has been routine practice in brain mapping to apply these methods in isolation. The aim of this report has been to engender a better appreciation of the benefits of the complementarity between brain mapping methods. Table 1: Nearest gray-matter voxel locations assigned positive or negative weights (|Z| > 3.09) in the bootstrapped OrT pattern, which represents the neural effects of repeated presentations of "possible" objects. MNI coordinates, structure name, and Brodmann label are tabulated for (a) brain regions in which signal strength decreases with object repetition (repetition suppression); and (b) regions in which signal strength increases with object repetition (repetition augmentation). Localization with Talairach Demon available from http://ric.uthscsa.edu/projects/ talairachdaemon.html. It might come as a surprise that a similar kind of complementarity has previously been articulated in theories of predictive coding-as they are applied to the brain's mining of sensory inputs. Predictive coding describes a complementary set of inferential methods that are employed in human information processing to reconstruct external stimulus events from sensory signals. The latter spatiotemporal signals are those that are produced at the stage of sensory transduction, for example, in the retinal mosaic of the cone transduction of visual input. In the relationship between human information processing and brain mapping, these sensory signals correspond to the neuroimaging data  Table 1.)

u(s, t).
In human information processing, the goal is to extract information about external stimulus events that is relevant to both environment-organism homoeostasis and immediate goal-directed activity. Correspondingly, the goal of neuroscience is to mine the neuroimaging data u(s, t) for evidence that the latent processes of theoretical interest are indeed the neural processes that have been activated and modulated by the parametric manipulations of the stimulus input. In other words, the ν(s, t) of theoretical interest in brain mapping are analogous to the external stimulus events that are relevant to human thought and action.
In this analogy, the sensorial representations of external stimulus events correspond to the features of the neuroimaging data u(s, t) that are captured by the first principal components of PCA, or the task-related components of ICA. For example, in the simplest multivariate decomposition (e.g., unguided PCA), the neuroimaging data are encoded as a set of principal components without reference to experimental design variables or theoretical constructs. This type of coding would be analogous to sensorial representations in sensory pathways that are not modifiable by top-down, neural signals. But actually the brain has the capacity to modify sensorial representations with top-down signals: hence the better analogy is between modifiable sensorial representations and guided PCA and ICA, where the latter are designed to identify features of u(s, t) that share spatiotemporal features with the predicted neural signals ν(s, t). Theories of predictive coding emphasize the need to optimize the reciprocal flow of information between sensory pathways and brain areas associated with executive control as a means of maximizing the synthesis and interpretability of information about external stimulus events. The analogous concept is the aspect of brain mapping highlighted in this report, that is, conjoint multivariate and mass-univariate analysis.
The exploitation of conjoint multivariate and massunivariate analyses is expected to benefit significantly from the new developments in guided PCA that combine the capability to verify the activation of the neural machinery that we already understand with the capability to discover reliable signatures of new neural machinery. The OrT analysis is presented as the latest example of a guided PCA that combines these capabilities. The means by which OrT achieves its expanded capability was examined; and OrT's practical utility is demonstrated in a group analysis of an event-related fMRI data from a study of visuo-perceptual adaptation.

Utility of OrT for event-related fMRI
The substantive finding of the OrT analysis was that a statistically significant OrT topographic pattern was identified in which lateral occipital cortex was among the most salient regions that exhibited reductions in the BOLD signal with successive stimulus exposures. This finding is consistent with the results of similar types of visual adaptation studies that have reported group mean reductions in lateral occipital functional activity-in blood flow and the BOLD signal [30,37]. But, the OrT pattern is also consistent with the predictions of cognitive neuroscientists who argue that the neural correlates of visual adaptation and perceptual learning are not limited to neural response suppression in lateral occipital regions [38,39]. Consistent with these latter predictions, the OrT pattern revealed significant regional effects of stimulus repetition in temporoparietal and prefrontal areas. These brain regions support processing of higher-level perceptual attributes and spatial attention, and are distinguishable from the processes of preattentive feature extraction and visual imagery that take place in primary sensory pathways.
The difference between brain areas that reveal repetition suppression and those that exhibit increased activity with object repetition may reveal two different brain analyses that are performed on visual stimuli. We speculate that the predominantly left-hemisphere effects, which are associated with increased activity with object repetition, may be associated with analyses of intersecting curved and planar surfaces and their assignment to the same or different 3D solid objects. By contrast, the regions that show object suppression populate posterior regions of both hemispheres and may be associated with the operations of preattentive feature extraction and visual imagery. This interpretation is consistent with the Kosslyn et al. theory of object perception [40][41][42].
The relevance of the latent neural activity identified by OrT to perceptual repetition suppression was further affirmed by a strong, significant correlation between subject decreases in RT between the first-and second-object presentations and the corresponding change score in OrT pattern expression. On the other hand, the RT change score may have been influenced by endogenous factors unrelated to the OrT neural signal, as 33% of the subject variation in RT change scores was not accounted by OrT change scores. We therefore performed a brain-wide, mass-univariate search to detect the influence of perceptual or motor processing on RT via neural processes other than those captured by OrT. Correlations between the RT change score and regional activity was computed on a voxel-basis with the OrT change score partialled out. Two isolated brain areas were identified with F-values > 6.70, which are associated with uncorrected pvalues < 0.001. However, neither region survived an SPM99 correction for multiple comparisons.
Although the correlation between OrT pattern expression and RT change scores was quite strong, its interpretation is not altogether straightforward. There is the likelihood that activity of latent ν(s, t) revealed in the OrT pattern is different from the neural activity that is responsible for the perceptual suppression effects manifested in RT. The physiological events that are antecedents of response selection and response execution may be too brief to accurately resolve in the BOLD signal. On the other hand, the strong correlation between OrT pattern expression and RT reductions with stimulus repetition might be the result of a top-down process that operates over a more extended timeframe, for example, its operation may extend, say, from fifty milliseconds post stimulus onset to three hundred milliseconds post response initiation. In other words, the strong correlation between OrT pattern expression and RT change scores may reflect a functional coupling of two distinct aspects of learning and memory.
The question therefore remains as to whether the latent signal associated with the OrT pattern represents a bottomup flow of information from sensory cortex to limbic, temporoparietal and prefrontal cortices, or represents top-down feedback to sensory pathways, or a combination of these two signals. A more elaborate experimental design and a more elaborate OrT analysis is needed to answer this question. Indeed, a model of local neural processing with multiple inputs is needed, namely, a model that includes both bottomup and top-down input signals, and possibly a modulation of these inputs by hysteresis effects associated with prior stimulus events. Penny et al. [15] have described such a model, "bilinear dynamic systems." Were we to redesign our experiment to dissociate these different signals, OrT would be applied separately to the images of GLM trial parameters associated with the different input signals (each having first been convolved with the local hemodynamic response function). The resulting OrT analyses would likely provide more definitive answers regarding the nature of the latent signals that exhibited ordinal trends across successive stimulus repetitions.
Finally, the linear mean trend of the mass-univariate analysis was statistically nonsignificant. Moreover, of the two isolated regions that exhibited relatively large F-statistics neither manifested a significant correlation between RT change scores and the corresponding difference in voxel activity. The effect size of GLM mean contrasts appeared to have been diluted by features of the latent physiological (neural) processes that were not well described by the fixed predictor variables, including the contributions of subject-dependent factors.

Novel approaches to type-I error control
Of practical interest in inferential statistics is whether guided PCA and ICA-and OrT in particular-can augment the sensitivity of mass-univarate analysis while maintaining control over type-I errors. The facts are that in routine applications of mass-univariate methods, theoretical models oftentimes supply only rough approximations to the architecture of the underlying neural information processing. That is, the level of explanatory power is only modest to moderate for voxels containing real experimental effects. This practical reality collides with the need to control type-I error rates in brain-wide maps of GLM goodness-of-fit statistics. In order to control the false positive detection rate, mass-univariate analysis requires that a Bonferroni-like correction be applied. But the outcome of Bonferroni-like corrections for multiple comparisons is predictable, namely, a substantial portion of voxels that contain real experimental effects will not be identified as statistically significant.
Development of inferential methods that reduce the stiff penalty of high type-II error rates-in exchange for tight control over type-I errors-is an ongoing project in brain mapping, for example, type-I error control based on the statistics of false discovery rate (FDR) [43,44], conjunction analysis and meta-analyses [45,46]. But oftentimes investigators resort to less formal remedial approaches to further enhance the detection of voxels with real experimental effects: albeit they are willing to tolerate false-positive rates higher than p = 0.05. Currently researchers report, on a routine basis, brain maps of experimental effects based on single voxel statistics, for example, p < 0.001 for a standard For t-statistic-in lieu of imposing the more stringent, multivoxel Bonferroni-like correction.
However we would offer as an alternative to FDR, conjunction analysis and the informal approaches, conjoint multivariate and mass-univariate analyses. We suggest that multivariate modeling supplies essential information about latent neural processing that mass-univariate modeling lacks, namely, information about the similarities in u(s, t) activity between brain voxels. We anticipate that conjoint multivariate and mass-univariate modeling will provide real improvements in the detection of voxels with real experimental effects while maintaining control of the false-positive detection rate. Moreover, we expect these improvements will be realized in all multivariate methods including MLM, gPLS, OrT, and other forms of guided PCA and ICA, for example, probabilistic PCA and ICA.
Among multivariate methods the OrT analysis is unique in its method of controlling type-I errors. By comparison, in MLM and related PPCA and PICA, eigenvalue statistics are used to limit the number of principal components to the smallest set for which the complementary set of components is not distinguishable from the statistics of random Gaussian fields. In MLM specifically, the presumption is that on average the time-by-subject scores of the significant principal components account for at least a modest portion of the variance in a majority of the voxels that contain real experimental effects-specifically those effects described by the associated theoretical model F(ν(s, t), θ) and the corresponding GLM. Implicit in PPCA and PICA modeling-as well as in MLMis the presumption that all nuisance sources of region-bycondition-by-subject variance can be accurately articulated for inclusion in their respective observation models: of particular importance are the competing sources of variance with effect sizes that are comparable to those of main experimental interest, including nuisance sources that are partially correlated with the experimental design variables. By contrast, in an OrT analysis it is expected that across the spectrum of latent variable effects, the least is known about the spatiotemporal properties of nuisance effects: indeed, less is known about most nuisance effects than about the latent neural processes of experimental interest. For these reasons, OrT controls the type-I error rate using nonparametric statistics that are different from eigenvalue statistics [7]. Further in its applications to date, OrT analysis has appeared to provide relatively unconfounded and unbiased estimators of target patterns. One example of OrT pattern estimation is illustrated in our review of a group analysis of event-related fMRI data from a study of visuo-perceptual adaptation.

CONCLUSIONS
The aim of this report is to explicate the potential benefits of conjoint multivariate and mass-univariate analyses in human brain mapping. The practical reality is that neither modeling technique alone provides an exact description of the physiological mechanisms that are the actual determinants of the imaging data. We argue that it takes conjoint mass-univariate and multivariate analyses to determine the exactness of either modeling approach.
We began by reviewing the benefits that are afforded by MLM-a guided PCA approach that is strongly reliant on theoretical constructs of neural information processing, and speculated as to how MLM could best be combined with mass-univariate analysis to achieve a reciprocal advantage. On the other hand, because over reliance on conventional neuroscientific theory has its drawbacks, additional guided PCA methods are recommended to uncover novel features of the data u(s, t) that are associated with neural machinery not heretofore anticipated. The new OrT statistical analysis was presented as the latest example of a guided PCA that combines the capabilities not only to verify the activation of the neural machinery that we already understand, but also discover reliable signatures of new neural machinery. We examined the details as to how OrT achieves its expanded capacity through the exploration of individual differences and the interactions between experimental task parameters and endogenous factors. We suggest that OrT analysis, as well as several other guided PCA and ICA, is especially relevant to studies of memory and learning for which there are ordinarily inherited and acquired differences among individuals.
Finally we argue that conjoint multivariate and massunivariate modeling is a novel approach that significantly enhances the detection of real experimental effects while maintaining control of the false-positive detection rate. Moreover, we expect these improvements will be realized in all multivariate methods including MLM, partial least squares (PLS and gPLS), OrT and other forms of guided PCA and ICA.

APPENDIX
Listed below are the six computational steps of the OrT analysis. This computational recipe for OrT assumes that the imaging data have undergone sufficient preprocessing to yield one image per subject per task condition. Details are provided below for the case in which there are three-task conditions, denoted below as B, E 1 and E 2 . However, our recipe can be generalized to any number of task conditions (two or greater).
Step 1. Application of a projection operator, P, by multiplication from the right according to YP, to eliminate strictly task-independent effects: P is constructed from the set of 2N eigenimages of the Helmert-transformed data matrix H Y, where N is the group sample size. The Eigen decomposition can be written as Y HH YW = WΛ with the Helmert matrix The matrix W contains the 2N eigenimages as column vectors, and Λ is a 2N-diagonal matrix containing the nonzero eigenvalues. The matrix WW corresponds to the projection matrix P of the Helmert eigenimages. The modified data matrix YP has the same dimensions as the original data matrix Y. However, YP contains N fewer activation patterns and has rank 2N, that is, a lower rank than the matrix Y, which has rank 3N.
Removal of the task-independent subject effects is necessary in order to obviate their being confounded with the target patterns of experimental interest. Moreover, taskindependent subject effects are not usually of interest as they describe effects that remained unchanged by the experimental design manipulation.
Step 2. Application of the OrT design matrix, Q, by multiplication from the left according to [Q(Q Q) −1/2 ] YP, to increase the salience of ordinal trend effects. In the case of three-task conditions, Step 3. Singular value decomposition (SVD) is applied to the mean centered [Q(Q Q) −1/2 ] YP matrix. This is equivalent to applying principal components analysis (PCA), that is, in which V contains 2N orthogonal eigenimages as column vectors; and Σ is a 2N-diagonal matrix of the eigenvalues.
Step 4. The first K eigenimages are tested for the presence of an ordinal trend. For the first K singular images, a 2N × K predictor array is calculated according to [E 1 − B; E 1 + B − 2E 2 ]. B is obtained by projection of all K images onto the raw data pertaining to condition B, that is, B = Y (1 : N, :)V (:, 1 : K). Likewise for E 1 and E 2 , we have E 1 = Y (N + 1 : 2N, :)V (:, 1 : K), and E 2 = Y (2N + 1 : 3N, :)V (:, 1 : K). We then conduct a linear regression to best predict the dependent variable of the regression, which is a 2N column vector [1; −1], with the 2N × K predictor array described above, This linear multivariate regression analysis is a type of discriminant analysis that produces the linear combination of the K eigenimages, according to V (:, 1 : K)β, whose mean expression changes maximally across task conditions. For the test of significance of the ordinal trend, we compute the task-subject scores for this new linear-combination image according to the right-hand side of the above regression equation. The test of significance is based on the minimum number of exceptions to a perfect segregation of the two contrast scores C 1 and C 2 that are calculated from the resultant pattern's expression according to C 1 = E 1 − B and C 2 = E 1 + B − E 2 , respectively. The number of exceptions is an inverse correlate to the maximum number of subjects who exhibit monotonic task-activity curves as can be appreciated from Figure 4. Monte-Carlo simulations of random Gaussian fields provide the type-I error rate of ordinal trends based on the minimum number of exceptions to a perfect segregation of scores. Step 5. Bootstrap resampling methods [47] are applied to OrT topographic pattern estimates to evaluate the reliability of nonzero voxel weights. The reliability of individual voxel weights is computed as z-scores, where the higher the z-score the less likely it is that any subject is extraordinarily influential in determining voxel weights. In the bootstrap, Steps 1-4 that were performed on the original subject sample are repeated 100-1000 times on samples of subjects that have been chosen randomly with replacement from the original subject pool. The inverse coefficient of variation (ICV) serves as the measure of the reliability of the regional weight at each voxel in the topographic pattern. ICV is computed from the point estimate of the regional weights, w voxel , and the variability of the resampling process around this point estimate, captured as the standard deviation σ voxel , as and is approximately standard-normally distributed. The larger the absolute magnitude of ICV voxel , the smaller the relative variability of the regional weight about its point estimate value. Common benchmark thresholds are chosen as 1.64, 2.33, and 3.09, which corresponds to a one-tailed plevel of 0.05, 0.01, and 0.001, respectively.
Step 6. Forward application of pattern estimates into new data sets [48][49][50][51][52][53]: a pattern v from a guided PCA-in particular an OrT analysis-can be projected into any data matrix Y according to the algebraic rule Yv -provided that the row vector v and the vectorized images in Y are all coregistered to the same brain atlas; and the same voxel mask has been applied to every image. The resulting column vector consists of the levels of expression of v in the individual images in Y, for example, for each subject and experimental condition. The subject-by-condition scores v are normally used to evaluate correlations between the regional activity associated with a latent neural process ν(s, t) and (a) hypothetical responses of ν(s, t) to experimental task challenges, or (b) experimental relevant behavioral and demographic variables. In addition, the OT forward application is a useful tool for testing whether the topographic footprint of a latent process found in one parametric series of experimental conditions is also evident in the images of other task conditions within the same experiment, or in the images obtained in independent, but theoretically related experimental studies.