Compounding is the dominant morphological type in modern Chinese words; however, its brain mechanisms remain unspecified. Here, we aim to address this issue by manipulating three common morphological structures in Chinese disyllabic words in an fMRI study: parallel, biased, and monomorphemic. Behavioral analyses show no significant difference in reaction times and error rates among these three conditions. No difference in neural activation was observed in direct contrasts among these conditions in univariate contrast analyses. A support vector machine categorization analysis reveals that the left inferior frontal gyrus (LIFG) is the only region in the frontotemporal network that can differentiate the parallel from the biased disyllabic words in neural activation patterns. This finding indicates that the LIFG is the core region responsible for morphological representation universally across different language modalities and morphological structures.
Morphology is the branch of linguistics that studies the internal structure of words and the way morphemes combine. Morphemes are the minimal meaning-bearing linguistic elements, which convey the semantic and syntactic cues of written or spoken words. There are three major morphological structures across different languages: inflectional, derivational, and compound. Previous research has predominantly focused on inflectional and derivational morphology, which are prevalent in Indo-European languages. For isolating languages like Chinese, however, compounding is the predominant morphological structure, and it remains relatively neglected in the literature. To fill this gap, we aim to shed light on the brain mechanisms of compound morphological processing in Chinese written word recognition.
Inflectional morphology is composed of one stem and one or more inflectional affixes, such as regular past tense (e.g., “jumped = jump + ed”) and regular noun plurals (“dogs = dog +
Different from Indo-European languages, Chinese has almost no inflectional or derivational morphology, and over 70% of all Chinese words are compounds with two or more constituent morphemes [
In the present study, we aim to shed light on this issue by manipulating three typical morphological structures in disyllabic Chinese words, i.e., parallel compounds, biased compounds, and monomorphemic words. In each parallel compound, the two constituent morphemes contribute equally to the meaning of the whole word, while the meaning of each biased compound comes mainly from the second morpheme, with the first morpheme acting as a modifier. We also include a group of monomorphemic words as a baseline condition. By comparing these three conditions, we aim to reveal the neural network engaged in processing or representing different morphological structures, and to investigate whether compound morphological parsing is underpinned by the same left frontotemporal neural network that processes inflectional and derivational morphology, as shown in previous research on Indo-European languages. Given the automatic nature of morphological parsing, we expect to observe a weak or even null effect of morphological processing in our canonical neuroimaging analysis. To address this potential issue, we plan to adopt a machine learning approach, i.e., support vector machine (SVM) categorization analysis, to further explore the neural basis of morphological parsing. We expect that neural activation patterns in the left frontotemporal language network, particularly the LIFG, will differentiate the morphological structures.
Twenty young healthy adults (20–36 years, mean age = 24; 10 males) took part in this study. All were right-handed (Edinburgh Handedness Inventory, Oldfield [
To understand the neural mechanisms of morphological representation, we manipulated three types of morphological structure in common Chinese real words, yielding three conditions: parallel bimorphemic (PB), biased bimorphemic (BB), and monomorphemic (MM). In each PB word, the meanings of the two constituent morphemes contribute equally to the meaning of the whole word; for example, “父母” (parents) combines the first morpheme “父” (father) and the second morpheme “母” (mother). In contrast, the meaning of each BB word originates mainly from the second morpheme (i.e., the word head); for example, “红豆” (red bean) emphasizes the bean (豆), while red (红) is only a modifying feature. Each MM word also consisted of two characters but only a single morpheme; e.g., “坦克” (tank) cannot be divided grammatically into two morphemes “坦” and “克”. There were 88 words in each condition, matched across conditions on word frequency and stroke count (Table
Characteristics of the stimuli in each experimental condition (mean values).
Condition | Word frequency | Stroke | First character frequency | Second character frequency |
---|---|---|---|---|
PB | 9.64 | 18.45 | 126.72 | 111.77 |
BB | 9.64 | 16.29 | 502.23 | 314.88 |
MM | 9.64 | 20.9 | 54.34 | 20.95 |
Non-words | 0 | 17.92 | 231.79 | 144.82 |
Key: PB = parallel bimorphemic words, BB = biased bimorphemic words, MM = monomorphemic words.
All stimuli of each type were divided equally into four parts by word frequency and number of strokes, and each part was allocated to one experimental run. As a consequence, there were four runs in this fMRI experiment, each composed of 22 PB words, 22 BB words, 22 MM words, 33 nonwords, and 15 nonlinguistic symbols. Each stimulus was displayed in the center of the screen for 1000 ms, followed by a short period of blank screen (see Figure
Illustration of the experimental procedure with examples of stimuli.
All participants were scanned in a 3.0 Tesla GE MR750 whole-body human scanner (General Electric, Milwaukee, Wisconsin, USA) with an eight-channel head coil at Tongji University. We used a gradient-echo EPI sequence to collect functional scans, each of which consisted of 40 contiguous oblique axial slices with no gap between adjacent slices, voxel size = 3 × 3 × 3 mm, field of view (FOV) = 19.2 × 19.2 cm, time of repetition (TR) = 2 s, echo time (TE) = 23 ms, and flip angle = 77°. The slices in each scan were acquired in interleaved order, parallel to the AC-PC line. Each functional run comprised 248 brain volumes and lasted 8 minutes and 16 seconds. We also collected T1-weighted structural images using a 3D fSPGR pulse sequence for anatomical localization with 162 contiguous slices, voxel size = 1 × 1 × 1 mm, FOV = 25.6 cm2, TR = 7.64 s, TE = 2.94 ms, and flip angle = 12°.
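As a quick consistency check on the functional acquisition parameters, the reported run length and the implied in-plane matrix size can be recomputed from the values given above. This is only an arithmetic sketch; the variable names are illustrative, and the matrix size is derived here rather than stated in the text.

```python
# Values taken from the acquisition description above.
TR_S = 2.0          # repetition time in seconds
N_VOLUMES = 248     # volumes per functional run
FOV_MM = 192.0      # in-plane field of view (19.2 cm) in mm
VOXEL_MM = 3.0      # in-plane voxel size in mm
N_SLICES = 40       # contiguous slices, no gap

# Run duration: number of volumes times TR.
run_seconds = TR_S * N_VOLUMES
minutes, seconds = divmod(int(run_seconds), 60)

# Derived quantities (not stated in the text): matrix size and slab thickness.
matrix_size = round(FOV_MM / VOXEL_MM)
slab_mm = N_SLICES * VOXEL_MM

print(f"run length: {minutes} min {seconds} s")
print(f"in-plane matrix: {matrix_size} x {matrix_size}, slab: {slab_mm} mm")
```

The computed run length agrees with the 8 minutes and 16 seconds reported for each run.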
We performed preprocessing and statistical analysis on the collected functional and structural images in SPM12 (Wellcome Institute of Cognitive Neurology, London, UK.
In the fixed-effect analysis for each participant, all the experimental stimuli were modeled as six independent event types: PB, BB, MM, nonwords, visual fixation, and errors. The error event comprised both trials with incorrect responses and trials with RTs over 3000 ms, and accounted for 4.7% of all trials. A canonical hemodynamic response function (HRF) was used to model each trial. The onset of each trial was entered into the model with duration = 0, in order to flexibly detect the peak activation for each trial. The data for each run were first analyzed and then averaged across the four runs for each participant, and the activation maps for each contrast (e.g., PB minus null events, BB minus null events, and MM minus null events) in each participant were entered into a random-effects analysis at the group level. Significant activations were reported at
To perform the machine learning analysis, we defined four ROIs based on significant activation clusters in the major experimental contrast of real words minus null events. Neural activity within each ROI was extracted using MarsBaR (a region of interest toolbox for SPM) for each contrast of interest and each participant. Voxel activation values served as the input features to the support vector machine (SVM) classifier. In the present study, the dimension of the feature vector
Support Vector Machine (SVM) is a machine learning method developed on the basis of statistical learning theory. It has particular advantages in nonlinear, small-sample, and high-dimensional pattern recognition, and is therefore widely used in machine learning problems. For data that are not linearly separable in N dimensions, spaces above
In the current work, we used SVM for binary classification. The basic model is a linear classifier with the largest margin in the feature space; its learning strategy is to maximize this margin, which reduces to solving a convex quadratic programming problem. The goal of this analysis was to distinguish the activation patterns in the regions of interest of the 20 participants between two conditions, which is essentially a binary classification problem and is therefore well suited to SVM.
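For reference, the largest-interval (maximum-margin) linear classifier described above is commonly written as the following convex quadratic program. This is the standard textbook hard-margin form, given here as a sketch; the symbols are assumptions introduced for illustration and are not the paper's own notation: $\mathbf{x}_i$ is the voxel feature vector of observation $i$, $y_i \in \{-1, +1\}$ its condition label, and $\mathbf{w}$, $b$ the weights and bias of the separating hyperplane.

```latex
\begin{aligned}
\min_{\mathbf{w},\, b} \quad & \tfrac{1}{2}\,\lVert \mathbf{w} \rVert^{2} \\
\text{subject to} \quad & y_i\left(\mathbf{w}^{\top}\mathbf{x}_i + b\right) \ge 1,
\qquad i = 1, \dots, n .
\end{aligned}
```

The width of the interval between the two classes is $2 / \lVert \mathbf{w} \rVert$, so minimizing $\lVert \mathbf{w} \rVert^{2}$ maximizes it. A new observation $\mathbf{x}$ is then assigned to a condition by the sign of $\mathbf{w}^{\top}\mathbf{x} + b$.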
We evaluated the performance of the classifier using cross-validation. For each cross-validation run, 18 participants were chosen to train the classifier and the two remaining participants were used for testing. This procedure was repeated 190 times, with all possible combinations of two subjects considered in testing across the 190 cross-validation runs. The classifier accuracy was measured by the proportion of observations correctly classified.
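The count of 190 cross-validation runs follows from the number of ways to hold out 2 of 20 participants. A minimal sketch of this leave-two-out scheme is given below, using only the Python standard library. The function names are hypothetical, the classifier itself is omitted, and this is not the authors' analysis script.

```python
from itertools import combinations

N_SUBJECTS = 20  # participants, as reported in the text
TEST_SIZE = 2    # participants held out per cross-validation run


def leave_two_out_splits(n_subjects=N_SUBJECTS, test_size=TEST_SIZE):
    """Yield (train_ids, test_ids) for every possible held-out pair."""
    subjects = range(n_subjects)
    for test_ids in combinations(subjects, test_size):
        train_ids = tuple(s for s in subjects if s not in test_ids)
        yield train_ids, test_ids


def accuracy(true_labels, predicted_labels):
    """Proportion of observations classified correctly."""
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


splits = list(leave_two_out_splits())
print(len(splits))  # number of cross-validation runs
```

In each run, a classifier would be trained on the 18 participants in `train_ids` and evaluated on the two in `test_ids`, with `accuracy` aggregating the proportion of correct predictions.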
Response times and accuracy on all trials were recorded and averaged for each experimental condition (PB words: mean RT = 758 ms, error rate = 3.7%; BB words: RT = 739 ms, error rate = 2.8%; MM words: RT = 754 ms, error rate = 3.5%). We performed an ANOVA (Analysis of Variance) on RTs of correct trials among these three conditions, but found no significant difference (
The first step in the neuroimaging analysis was to test whether the task produced activations in the cortical regions typically engaged in written word recognition. We addressed this issue by comparing all words against the fixation baseline. As shown in Figure
Significant activation for the direct contrast of words minus fixation at a threshold of
Areas of activity for the contrast of words minus fixation.
Regions | BA | Extent | Max. | MNI x | MNI y | MNI z |
---|---|---|---|---|---|---|
LIFG | 47 | 61 | 3.87 | −39 | 29 | −16 |
LLOC | 18,19 | 311 | 4.57 | −36 | −85 | −7 |
RLOC | 18,19 | 179 | 4.63 | 33 | −88 | −10 |
SMA | 6 | 90 | 3.78 | −6 | 23 | −65 |
Key: LIFG = left inferior frontal gyrus, LLOC = left lateral occipital cortex, RLOC = right lateral occipital cortex, SMA = supplementary motor area.
To explore the neural substrates of morphological representation, we performed a one-way ANOVA with three morphological conditions as input levels: PB words minus null events, BB words minus null events, and MM words minus null events. No significant difference was found among these three conditions.
The null effects of morphological processing in the above univariate analyses indicate that PB, BB, and MM words might activate the left frontotemporal network to the same amplitude level. To test whether the neural activation patterns are the same across these three conditions, we performed SVM binary classification analysis which is sensitive to differences in pattern-information rather than activation magnitude (see Figure
Four ROIs were selected based on the significant clusters in the contrast of all words minus fixation: (a) LIFG (pink), (b) SMA (red), (c) LLOC (yellow), and (d) RLOC (purple). (e) SVM binary classification analysis: different activation patterns were extracted from different conditions within each ROI, and voxel values were defined as the characteristics of the classification of each ROI.
A whole-brain analysis includes all voxels in the brain, which might reduce the detection sensitivity of SVM, since some of the included brain regions might not be involved in morphological processing. To address this problem, we chose the four significant clusters from the canonical contrast of words minus fixation as regions of interest (ROIs): LIFG (BA47), left and right LOC (BA18/19), and SMA (BA6), and performed the SVM analysis in each ROI (Table
The SVM classification results with accuracy (accur.) and
Contrast | LIFG (BA47) Accur. | LIFG (BA47) p | LLOC (BA18/19) Accur. | LLOC (BA18/19) p | RLOC (BA18/19) Accur. | RLOC (BA18/19) p | SMA (BA6) Accur. | SMA (BA6) p |
---|---|---|---|---|---|---|---|---|
1 | 0.758 | 0.001 | 0.517 | 0.383 | 0.581 | 0.001 | 0.514 | 0.465 |
2 | 0.432 | 0.001 | 0.664 | 0.001 | 0.477 | 0.224 | 0.509 | 0.66 |
3 | 0.515 | 0.464 | 0.58 | 0.396 | 0.534 | 0.055 | 0.503 | 0.005 |
Contrast 1 means the classification of PB and BB words (PB words–MM words vs. BB words–MM words). Contrast 2 means the classification of BB and MM words (BB words–fixation vs. MM words–fixation). Contrast 3 means the classification of PB and MM words (PB words–fixation vs. MM words–fixation).
In this study, we manipulated three morphological structures in Chinese disyllabic words to explore the neural mechanisms of compounding morphology. We did not observe significant differences among these three conditions in canonical neuroimaging analyses but found that LIFG can differentiate the parallel from the biased morphological structures in an SVM analysis. This finding is in line with previous studies in that morphological parsing or representation–irrespective of whether it is inflectional, derivational, or compound–is supported by a left frontotemporal network [
Compounding is a special morphological structure that combines two morphemes directly without explicit changes in word form; therefore, decomposition of compound words cannot rely on word form (i.e., affixes), as it can in inflectional and derivational words, but more likely depends on the meaning of each constituent morpheme. This relatively implicit morphological structure (without explicit form changes) might explain, in part, the null effect of contrasts between different morphological structures in canonical fMRI analyses. Another possible explanation is that we used a lexical decision task in this study rather than a more explicit morphological priming paradigm, as used in previous behavioral research. No difference was found between disyllabic compounds and monomorphemic words, which could be explained in part by the explicit boundaries between constituent Chinese characters. From the decomposition point of view, the two constituent characters in a monomorphemic word might be processed separately and then combined into a single morpheme, in a process very similar to that found for disyllabic compounds.
In contrast to the decomposition hypothesis on compound word processing, there was another account in support of representation of compounds as whole-word units [
All fMRI and behavioral data, together with relevant analysis scripts and files, are available upon request from the corresponding author (e-mail:
This manuscript has not been published elsewhere nor is it currently under consideration for publication elsewhere.
The authors have no potential conflicts of interest regarding the publication of this study.
This study was funded by the Program to Professor of Special Appointment (Eastern Scholar) at Shanghai Institutions of Higher Learning (No. TP2018056), the Science and Technology Commission of Shanghai Municipality under Grant 18ZR1442700, and the China Electronics Technology Group Corporation (CETC). All authors have reviewed the contents of the manuscript, approved its contents, and validated the accuracy of the data. We thank Dr. Barry Devereux at Queen's University Belfast for his valuable comments on our manuscript.