Application Value of Machine Learning Method in Measuring Gray Matter Volume of AIDS Patients

Background. To investigate whether gray matter (GM) volume, analyzed with a machine learning approach, can distinguish HIV-positive patients with HIV-associated neurocognitive disorders (HAND) from healthy controls. Methods. Twenty-seven HIV-infected patients and 14 healthy controls were enrolled in our study. Each set of BRAVO images was postprocessed using DPARSF3.1 to coregister all brains to the MNI template, and the volumes of 90 brain regions were extracted using custom-designed code. Machine learning was performed using the PRoNTo2.1.1 toolbox. The differences in brain volume between the HAND and non-HAND groups were analyzed. Results. GM volume distinguished HIV-positive patients from healthy subjects with an AUC of 0.73. The sensitivity, specificity, and accuracy of the established classification were 85.19%, 42.86%, and 70.73%, respectively. The GM volume of the top ten brain regions was related to digit symbols, trail making test, digit span, vocabulary fluency, Stroop C time, Stroop CW time, CD4, and neuropsychological group. Conclusions. A machine learning approach based on MRI measurement of GM volume can facilitate the early diagnosis of HAND in HIV patients.


Introduction
Acquired immune deficiency syndrome (AIDS) is a disease of the immune system caused by human immunodeficiency virus (HIV) infection [1]. Guangxi has the second highest prevalence of and mortality from HIV infection in China [2]. HIV often involves the central nervous system (CNS) after initial infection [3]. Ongoing CNS inflammation may lead to cognitive and behavioral abnormalities, which are called HIV-associated neurocognitive disorders (HAND) [4]. It has been estimated that 15-55% of all HIV-1 cases have HAND [4][5][6]. Asymptomatic neurocognitive impairment (ANI), HIV-associated mild neurocognitive disorder (MND), and HIV-associated dementia (HAD) are the different forms of HAND [7]. Even after highly active antiretroviral therapy (HAART), HAND persists and affects survival, quality of life, and daily functioning [8]. Therefore, early and accurate diagnosis of HAND is a key factor in improving quality of life and prolonging life span. However, at present, the diagnosis of HAND relies mainly on neuropsychological (NP) tests, which are subjective and time-consuming [9,10]. Exploring more reliable and objective methods for diagnosing HAND is essential.
Recent studies have reported changes in gray matter (GM) volume in HIV-infected patients with HAND [11], which may provide clues for the diagnosis of HAND. However, subtle changes in GM volume might be missed. Machine learning offers an objective way to detect such changes in the high-resolution anatomical data provided by MR imaging and may therefore aid the diagnosis of HAND.
As a technique for identifying patterns that can be applied to medical images, machine learning allows computers to learn rules automatically from data and use them to make predictions on unseen data [12]. The machine learning method gives the weight of each brain region in discriminating between the two groups and identifies the brain regions that contribute most. The support vector machine (SVM) is a pattern recognition method based on statistical learning theory. SVM is mainly used to solve small-sample, high-dimensional, and nonlinear problems. Compared with traditional statistical analysis methods, SVM has better generalization ability. Therefore, machine learning may be a noninvasive and objective method for the early warning of HAND and the evaluation of treatment efficacy. The purpose of this study was to explore the application value of the machine learning method in measuring the GM volume of AIDS patients.

Methods
This study was approved by the Affiliated Tumor Hospital of Guangxi Medical University and the Fourth People's Hospital of Nanning. All participants provided signed informed consent.

Subjects.
Twenty-seven HIV-infected patients (14 males, 13 females; mean age: 42.48 ± 13.03 years; age range: 22-63 years) and 14 healthy controls (8 males, 6 females; mean age: 39.0 ± 13.02 years; age range: 22-63 years) were enrolled in our study. The HIV-infected patients were first diagnosed at the Fourth People's Hospital of Nanning between Sep. 2017 and Jan. 2019. The controls were closely matched with the patients for urban area, age, and gender. Patients were included if they could move freely and had not received HAART. The exclusion criteria for all subjects were any history of drug abuse and any obvious structural brain abnormalities or lesions, such as stroke or tumors.

Figure 1: A weight map of the brain regions contributing to the evaluation of the difference between the AIDS group and the control group by gray matter volume in machine learning. The larger the weight, the closer the color of the brain area is to red, and the smaller the weight, the closer the color of the brain area is to blue.

Image Preprocessing.
DICOM data of the subjects were obtained from the Siemens company's ADW42 workstation for image preprocessing. The images were preprocessed using statistical parametric mapping software (SPM12) and the data processing assistant for resting-state fMRI (DPARSF3.1) in matlab2013b. The specific preprocessing steps were as follows: first, we selected T1 DICOM to NIFTI to convert the DICOM format to NIFTI format; next, we readjusted the orientation of the T1 image (Reorient T1); then, we performed new segmentation and registration with the DARTEL template (New segment + DARTEL); finally, smoothing was performed to reduce the deformation and noise introduced by the spatial transformation. The signal-to-noise ratio was then evaluated.
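The abstract mentions that the volumes of 90 brain regions were extracted with custom-designed code, which is not shown. A minimal sketch of how such an extraction might look is given below, assuming a modulated GM map and an atlas label image that have already been resampled to the same voxel grid and flattened to one value per voxel. The function name `regional_gm_volumes`, the toy values, and the 1.5 mm voxel size are illustrative assumptions, not details from the study.

```python
from collections import defaultdict


def regional_gm_volumes(gm_map, atlas, voxel_volume_mm3):
    """Sum GM values per atlas label and scale by the voxel volume.

    gm_map and atlas are flat sequences with one entry per voxel.
    Label 0 is treated as background and skipped (an assumption).
    """
    if len(gm_map) != len(atlas):
        raise ValueError("gm_map and atlas must have the same number of voxels")
    totals = defaultdict(float)
    for gm_value, label in zip(gm_map, atlas):
        if label != 0:
            totals[label] += gm_value
    return {label: total * voxel_volume_mm3 for label, total in totals.items()}


# Toy example (hypothetical values): 6 voxels, two regions, 1.5 mm isotropic voxels.
gm = [0.5, 1.0, 0.9, 0.25, 0.75, 1.0]
labels = [1, 1, 0, 2, 2, 2]
volumes = regional_gm_volumes(gm, labels, voxel_volume_mm3=1.5 ** 3)
print(volumes)
```

With a 90-region atlas such as AAL, the returned dictionary would hold one volume per region label for each subject.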

Calculation and Presentation of Results.
The PRoNTo machine learning toolbox (PRoNTo 2.1.1) was used for the classification analysis. The specific steps were as follows. Load data: the data were entered as an AIDS group and a control group, and age was added to the modalities as a covariate to remove the influence of age. Prepare feature set: the voxel-based morphometry (VBM) measure was selected as the feature. Select and run model: classification with leave-one-subject-per-group-out cross-validation was selected, and the "normalize samples" and "regress out covariates (subject level)" operations were chosen for the data. Compute weights: the PRT map was generated and the AAL template was added. Display results: the PRT map generated in the previous step was imported to obtain the receiver operating characteristic (ROC) curve, the total accuracy, and the confusion matrix; the sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) of the VBM index were calculated from the confusion matrix. Display weights: the final PRT map was imported again and the weights per region were selected; the weights were ranked in order, and the top ten brain regions were taken as the result.
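To illustrate what regressing out age as a covariate means in general, the sketch below removes the linear effect of age from a single feature by ordinary least squares and keeps the residuals. It is a generic illustration, not PRoNTo's internal implementation; the function name `regress_out` and the toy ages and volumes are hypothetical.

```python
def regress_out(feature, covariate):
    """Return the residuals of `feature` after a least-squares fit on `covariate`."""
    n = len(feature)
    if n != len(covariate) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(covariate) / n
    mean_y = sum(feature) / n
    sxx = sum((x - mean_x) ** 2 for x in covariate)
    if sxx == 0:
        raise ValueError("covariate is constant")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(covariate, feature))
    slope = sxy / sxx
    # Residual = observed value minus the value predicted from the covariate.
    return [(y - mean_y) - slope * (x - mean_x) for x, y in zip(covariate, feature)]


# Hypothetical ages (years) and GM volumes for five subjects.
ages = [22, 30, 41, 55, 63]
gm_volume = [10.0, 9.6, 9.1, 8.2, 8.0]
residuals = regress_out(gm_volume, ages)
print(residuals)
```

The residuals have zero mean and no remaining linear relationship with age, so a classifier trained on them cannot separate the groups on a linear age trend alone.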

Statistical Analysis. The HIV-infected group was divided into a HAND group (comprising the ANI, MND, and HAD groups) and a non-HAND group according to the NP tests. The difference in GM volume on structural images between the two groups was analyzed with the independent-samples t test using SPSS19.0, and the diagnostic value of GM volume for HAND was investigated. The GM volumes of the top ten brain regions with the biggest differences were used to explore the relationship between GM volume and the clinical hematological indices and clinical scales [2]. Spearman rank correlation analysis was performed for ordinal data and Pearson linear correlation analysis for count data. A correlation coefficient R with an absolute value greater than 0.6 was considered highly correlated, 0.4-0.6 moderately correlated, and less than 0.4 mildly correlated. P < 0.05 was considered statistically significant.
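The correlation measures and strength thresholds described above can be sketched as follows. This is an illustrative pure-Python version, not the SPSS procedure used in the study. Applying the thresholds to the absolute value of R is an assumption, made because the results describe both positive and negative correlations as high. All function names are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson linear correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def average_ranks(values):
    """Rank values from 1 to n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))


def correlation_strength(r):
    """Map a coefficient to the categories used in the text (on |R|)."""
    magnitude = abs(r)
    if magnitude > 0.6:
        return "high"
    if magnitude >= 0.4:
        return "moderate"
    return "mild"
```

For example, `correlation_strength(-0.7)` falls in the "high" category, which matches the way negative correlations are reported as high in the results.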

Results
The Top Ten Brain Regions with Biggest GM Volume Differences between AIDS and Control by SVM. The top ten brain regions that contributed the most to the GM volume difference between the AIDS patients and the control group in the linear support vector machine classification were the right postcentral gyrus, left superior parietal gyrus, right paracentral lobule, right supplementary motor area, left inferior parietal gyrus (excluding the supramarginal and angular gyri), left Heschl gyrus, right inferior parietal gyrus (excluding the supramarginal and angular gyri), right superior parietal gyrus, right rolandic operculum, and left supramarginal gyrus. The corresponding weight values and region of interest (ROI) values are presented in Table 1 and Figure 1.

Evaluation of the Classification Effect of SVM on GM Volume. The AUC value, accuracy rate, sensitivity, specificity, positive predictive value, and negative predictive value were 0.73, 70.73%, 85.19%, 42.86%, 74.19%, and 60.00%, respectively. Specific results were shown in Figure 2.
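The reported metrics can be reproduced from a confusion matrix. The counts in the sketch below are not stated in the text; they are inferred from the reported percentages and group sizes (27 patients, 14 controls), namely 23 of 27 patients and 6 of 14 controls classified correctly, with patients treated as the positive class. The function name `classification_metrics` is hypothetical.

```python
def classification_metrics(tp, fn, tn, fp):
    """Standard metrics from the four cells of a binary confusion matrix."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


# Counts inferred from the reported percentages (an assumption, see above).
metrics = classification_metrics(tp=23, fn=4, tn=6, fp=8)
for name, value in metrics.items():
    print(f"{name}: {value * 100:.2f}%")
```

These counts give 85.19%, 42.86%, 70.73%, 74.19%, and 60.00%, matching the reported sensitivity, specificity, accuracy, PPV, and NPV. They also show that 8 of the 14 controls were classified as patients, which is what the specificity of 42.86% corresponds to.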

Evaluation of GM Volume Index on HAND Diagnosis.
The GM volume values of each brain region in the HAND group and the non-HAND group are shown in Table 2. GM volume was lower in the HAND group than in the non-HAND group. In the AAL template, the GM volume differences between the two groups were not statistically significant in brain regions nos. 42, 49, 52, 54, 73, 74, 75, and 76, whereas they were statistically significant in the remaining 82 brain regions, 45 of which had P < 0.005 (Table 2).
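The regional comparison between the HAND and non-HAND groups relies on the independent-samples t test. A minimal sketch of the test statistic for a single region is given below, assuming equal variances (Student's pooled-variance form); the source does not state which variant of the test was read from the SPSS output. The function name and toy values are hypothetical, and the P value, which requires the t distribution, is not computed here.

```python
import math
from statistics import mean, variance


def independent_t(group_a, group_b):
    """Student's independent-samples t statistic with pooled variance.

    Returns the t statistic and its degrees of freedom.
    """
    n_a, n_b = len(group_a), len(group_b)
    if n_a < 2 or n_b < 2:
        raise ValueError("each group needs at least two observations")
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / df
    standard_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(group_a) - mean(group_b)) / standard_error, df


# Hypothetical regional GM volumes for two small groups.
hand = [1.0, 2.0, 3.0]
non_hand = [4.0, 5.0, 6.0]
t_value, dof = independent_t(hand, non_hand)
print(t_value, dof)
```

A negative t value here means the first group (HAND in this toy example) has the smaller mean volume. In practice the same computation would be repeated for each of the 90 regions.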

Correlation Analysis. Correlation analysis was performed between the GM volume of the top ten brain regions and the clinical indices, clinical scales, and NP group in the AIDS and control groups. The GM volume of the top ten brain regions was related to digit symbols, trail making test, digit span, vocabulary fluency, Stroop C time, Stroop CW time, CD4, and NP group. The correlations were moderate or high, positive or negative, except for the right superior parietal gyrus. The specific results were as follows. The GM volume of the right postcentral gyrus was highly negatively correlated with Stroop C time. The GM volume of the left superior parietal gyrus was highly correlated with Stroop C time. The GM volumes of the right paracentral lobule and left inferior parietal gyrus were highly negatively correlated with Stroop C time, Stroop CW time, and NP group. The GM volume of the right supplementary motor area was highly positively correlated with digit symbols and highly negatively correlated with Stroop C time, Stroop CW time, and NP group. The GM volume of the left Heschl gyrus was highly negatively correlated with Stroop CW time. The GM volume of the right inferior parietal gyrus was highly positively correlated with CD4 and highly negatively correlated with Stroop C time, Stroop CW time, and NP group. The GM volume of the right superior parietal gyrus was highly positively correlated with digit symbols and highly negatively correlated with Stroop C time and Stroop CW time. The GM volume of the right rolandic operculum was highly positively correlated with digit symbols and vocabulary fluency and highly negatively correlated with Stroop C time, Stroop CW time, and NP group. The GM volume of the left supramarginal gyrus was mildly positively correlated with CD4/CD8, and its correlations with the other indicators were moderately positive or negative. All results are shown in Table 3.

Discussion
In this study, we evaluated the difference in GM volume between the AIDS group and the control group and explored the correlation of GM volume with the clinical indices and NP group. The results showed that GM volume was lower in the HAND group than in the non-HAND group, and that the top ten brain regions with the biggest GM volume differences between the AIDS and control groups were the right postcentral gyrus, left superior parietal gyrus, right paracentral lobule, right supplementary motor area, left inferior parietal gyrus, left Heschl gyrus, right inferior parietal gyrus, right superior parietal gyrus, right rolandic operculum, and left supramarginal gyrus. The top ten brain regions were concentrated in the bilateral frontal lobe, bilateral parietal lobe, and left temporal lobe. The GM volumes of the top ten brain regions were highly correlated with the clinical indices and NP group. The area under the ROC curve for GM volume in the machine learning classification of the AIDS group and the control group was 0.73.
At present, the diagnosis of HAND depends mainly on NP tests, which are subjective and lack accuracy. Additionally, subtle changes in GM are difficult for a human reader to detect, so the diagnosis of HAND is easily missed. GM volume measured with the machine learning method can directly reflect the extent of damage to the affected regions and their corresponding cognitive functions, which may play an important role in improving the clinical antiviral treatment program for patients, increasing the intervention measures for neurocognitive impairment, and reducing the occurrence and development of HAND. Correlation analysis of GM values in brain areas with positive findings may provide an objective way to evaluate treatment efficacy in patients. The area under the ROC curve of 0.73 indicates that changes in GM volume may help the early diagnosis of HAND. Recent studies reported that prefrontal GM atrophy in HIV patients is associated with prolonged disease duration and that motor dysfunction is associated with basal ganglia gray matter atrophy [13]. Therefore, applying the machine learning method to the measurement of GM volume is of great importance.
There are many studies on gray matter volume in HIV patients. Studies of early AIDS have indicated that cognitive impairment in HAND is associated with early subcortical and frontal lobe damage [6]. Becker et al. [14] found that HIV-related reductions in GM volume involve the posterior and inferior temporal lobe, parietal lobe, and cerebellum. Pluta et al. [15] found that the volumes of the caudate nucleus, hippocampus, insular lobe, inferior frontal gyrus, and GM were smaller in seropositive subjects than in healthy controls, and that these patients performed worse in cognitive fluency tasks. Küper et al. [13] reported that, compared with the control group, HIV-positive patients with cognitive impairment showed reduced GM in the anterior cingulate gyrus and temporal cortex and reduced white matter in the midbrain. In our study, the ten brain regions were mainly concentrated in the bilateral frontal lobe, bilateral parietal lobe, and left temporal lobe, which is consistent with previous studies [13][14][15][16]. However, a change in the GM volume of the right rolandic operculum in HIV-infected patients has not been reported before. Bilateral rolandic operculum damage leads to language suppression [17]. This finding requires further research to confirm.
This study has several limitations that need to be considered. First, the small participant cohort may have affected the power of the statistical analysis. However, SVM is suited to small samples, which may make our results more reliable. Establishing a database that contains all related information would be a good way to analyze and predict HAND. Second, our subjects included only adults, with a wide age range; children with AIDS were not included. Thus, further studies should include paediatric cohorts. Finally, an unavoidable limitation was that some HIV-infected patients were not suitable for MR imaging, so more comprehensive data could not be obtained.

Conclusions
Machine learning based on MRI measurement of GM volume is of value for classifying patients with AIDS and contributes to the early diagnosis of HAND.