Data Analysis and Tissue Type Assignment for Glioblastoma Multiforme

Glioblastoma multiforme (GBM) is characterized by high infiltration. The interpretation of MRSI data, especially for GBMs, is still challenging. Unsupervised methods based on NMF by Li et al. (2013, NMR in Biomedicine) and Li et al. (2013, IEEE Transactions on Biomedical Engineering) have been proposed for glioma recognition, but the tissue types is still not well interpreted. As an extension of the previous work, a tissue type assignment method is proposed for GBMs based on the analysis of MRSI data and tissue distribution information. The tissue type assignment method uses the values from the distribution maps of all three tissue types to interpret all the information in one new map and color encodes each voxel to indicate the tissue type. Experiments carried out on in vivo MRSI data show the feasibility of the proposed method. This method provides an efficient way for GBM tissue type assignment and helps to display information of MRSI data in a way that is easy to interpret.


Introduction
Glioblastoma multiforme (GBM), which typically consists of three tissue types (i.e., normal, tumor, and necrosis), is a type of extensively heterogeneous tumors. Accurate diagnosis of GBMs is of great importance for guiding therapy and planning operations. Being different from other brain tumors which present similar spectral patterns, GBMs are characterized by high infiltration [1,2]. Such characteristic brings huge difficulty in tumor typing and diagnosis.
Magnetic resonance spectroscopy imaging (MRSI) [3] is a very useful noninvasive tool for brain tumor diagnosis, especially for highly heterogeneous tumors like GBMs. Unlike magnetic resonance imaging (MRI) which only shows the brain structure, MRSI combines MRI and magnetic resonance spectroscopy (MRS) [4] to provide the localized biochemical information. By investigating the spectra from multivoxels, the clinicians could have a better insight into the pathological change of brain tissues.
However, the interpretation of MRSI data is still challenging which hinders its application in tumor diagnosis. Efforts for exploiting MRSI data have been made using both supervised and unsupervised methods. Nosologic imaging is created using linear discriminant analysis [5,6], canonical correlation analysis (CCA) [7,8], Bayesian frameworks [9,10], and nonnegative matrix factorization (NMF) [11]. NMF [11] is an alternative blind source separation technique with only nonnegative constraint. It has shown great potentials in brain tissue differentiation [2,[12][13][14]. In our previous work, we proposed an unsupervised method, namely, hierarchical nonnegative matrix factorization (hNMF), to interpret the MRSI data for GBMs without prior knowledge and provided an easy way to interpret MRSI data of GBMs for each tissue type [15].
Unlike the supervised classification methods, which labels each voxel based on large training sets [5][6][7][8][9][10], tissue typing for NMF tissue differentiation is not usually considered [12,13,16]. Recently, a tissue typing method was carried out by simply exploring which tissue contributes most to the voxel [14]. Such an approach ignored the voxels with intensively mixed tissues, that is, the different tissues contributing fairly equal. We tried to integrate the distribution information of each pure tissue in one image by encoding each of them as 2 BioMed Research International a color channel [16]. The obtained images, known as nosologic images, showed the spatial distribution of all tissue types. However, the tissue distribution is only shown in shading colors and the tissue type of each voxel is not indicated clearly.
In this paper, we improved upon [15] by proposing an approach for GBM tissue type recognition. The previous work is extended by analyzing both the pure and mixed data labeled by an expert. The spectral data labeled as each tissue type is analyzed and the relationship of different tissue types is studied. Then, we proposed criteria to assign each voxel to a certain tissue type (i.e., pure tissue normal, tumor, necrosis, mixed tissues normal/tumor, or tumor/necrosis, hereafter noted as "C", "T", "N", "C/T", and "T/N, " resp.) using the tissue distribution maps. In vivo experiments are performed using short-TE 1 H MRSI data from GBM patients. We then evaluate its performance using the expert labeling.

Data Acquisition Protocol.
The MRSI protocol had the same imaging parameters as in our previous work [15,16]. All the MRSI data were acquired at the University Hospital of Leuven (UZ Leuven, Belgium) on a 3 T MR scanner (Achieva, Philips, Best, The Netherlands). A body coil for transmission and eight-channel head coil for signal reception were used. The MRSI protocol had the following imaging parameters: point-resolved spectroscopy (PRESS) [17] that was used as the volume selection technique; TR/TE = 2000/35 ms; field of view, 16 cm × 16 cm; volume of interest, 8 cm × 8 cm (maximum size); slice thickness, 1 cm; acquisition voxel size, 1 cm × 1 cm; reconstruction voxel size, 0.5 cm × 0.5 cm; receiver bandwidth, 2000 Hz; samples, 2048; number of signal averages, 1; water suppression method, MOIST; shimming, pencil beam shimming; first-and second-order parallel imaging with SENSE factor: left-right, 2; anterior-posterior, 1.8; 10 circular 30 mm outer-volume saturation bands in order to avoid lipid contamination from the skull. Standard anatomical MR images were also acquired.

Patients and Data
. MRSI data sets from 6 GBM patients (typically present three tissue patterns, i.e., normal, tumor, and necrosis) were selected for this study. The MRSI data was acquired prior to any treatment from 6 patients with brain tumors that were subsequently diagnosed as GBM based on histological examination and followed the rules of the World Health Organization (WHO) classification for tumor grading [18]. The institutional review board approved the study. Written informed consent was obtained from all patients before their participation in the study. Data preprocessing was done as in our previous papers [15,16] using the in-house software SPID [19].

Expert
Labeling. MR spectra were judged by a spectroscopist (a radiologist with five years of experience). The expert spectroscopist was presented with the real spectra in a range from 4.3 to 0 ppm.
Firstly, spectral quality assessment was performed as recommended by Kreis [20]. Spectra were judged acceptable if the following criteria were met: FWHM of metabolites < 0.07-0.1 ppm, no unexplained features in the residuals, no doubled peaks or evidence for movement artifacts, symmetric lineshape, and no outer-volume ghosts or other artifacts present.
Afterwards, the spectra with acceptable quality were assigned to different tissue classes: normal appearing brain parenchyma, tumoral tissue, or necrosis, based on the spectra and the corresponding T1-weighted image after contrast administration.

Method
3.1. Spectra Investigation for GBMs Using Biomarkers. N-acetylaspartate (NAA), choline (Cho), and lipids are known to be the three most important biomarkers for investigating brain tumorigenesis. The concentration of these metabolites changes under disease condition. In the context of GBM spectroscopy, necrosis mostly contains lipids. NAA concentration is higher than Cho in normal tissue and gliomas are characterized by decreased NAA and increased Cho and lipids. But these biomarkers are not enough for MRSI spectra differentiation. In a specific frequency region of a spectrum, the peak height of the metabolite can be measured. Here, we use the NAA-to-Cho index (NCI) and NAA-to-Lips index (NLI), which measure the ratios of the peak heights of these components, to investigate the spectra for all GBM patients. We select all voxels containing pure tissues and mixed tissues to observe if the biomarkers are capable of clustering the same tissues. Each point represents a spectrum from a single voxel. Its coordinate values ( , ) correspond to the NCI value and NLI value, respectively. The points are colored using expert labeling to indicate their tissue types, blue for "C, " cyan for "C/T, " green for "T, " yellow for "T/N, " and red for "N. "

Spectra Variation Investigation Using Expert Labeling.
In this section, we investigate the relationship of spectral variation and expert labeling, including two aspects: (1) the variation of spectra labeled as the same tissue type and (2) the variation of spectral difference between two tissue types.
The expert labeled spectrum in each voxel as a certain tissue type "C, " "T, " "N, " "C/T, " and "T/N. " However, because of the voxel size of MRSI data and the infiltration property of GBMs, there is no clear boundary between different tissue types, especially in the area of tumor proliferation. Therefore, the spectra labeled as the same tissue type could have different profiles. In order to investigate the spectra variation, we plot all spectra of each pure tissue type and their mean spectra.
The correlation coefficients between normal and tumor spectra CT and the correlation coefficients between tumor and necrotic spectra TN can evaluate the spectral difference of different tissue types. For each spectrum labeled as a pure tissue type, we calculate the correlation coefficient of this spectrum and a spectrum labeled as another pure tissue type. With box plots, the variation of spectral difference between two different tissue types could be observed easily. Combined data of all patients and also that of individual patients are both analyzed to investigate the spectral difference between different tissue types.

Tissue Differentiation with hNMF.
Spectra from a MRSI grid can be approximated as a linear combination of constituent spectra. We define a data matrix containing all spectra from the voxels of interest (VOI). Each column of matrix represents a spectrum from one voxel. With conventional NMF, can be factorized into a new nonnegative matrix (each row represents a constituent spectrum of normal tissue or necrosis) and a new nonnegative matrix , The reshape of each row of , hereafter called "ℎ-map, " that is, the "tissue distribution maps" we mentioned before, gives the spatial distribution of the corresponding spectrum.
For the low grade gliomas, conventional NMF is able to differentiate normal and abnormal (i.e., tumor) tissues. While there are more than two tissue types (e.g., GBMs), the conventional NMF sometimes fails to recover the biomeaningful constituent spectra robustly. Therefore, a hierarchical approach based on NMF (i.e., hNMF) was proposed to recover the spectra of MRSI data for GBMs which contains 3 constituent spectra [15]. HNMF firstly differentiates the data matrix into normal and abnormal. Then, with an optimized threshold, the abnormal part is further differentiated into tumor and necrosis. As a result, the three constituent spectra of normal, tumor, and necrosis are recovered and their ℎmaps for different tissue types are obtained simultaneously. Note that, in each voxel, there are 3 values from the 3 ℎ-maps ℎ C , ℎ T , and ℎ N for each tissue type.

ℎ-Map Investigation Using Expert Labeling.
As introduced in the previous section, the ℎ-maps, which are normalized between 0 and 1, can be obtained from the result of hNMF. Then, the ℎ-map of each tissue type represents the tissue distribution using a number for a voxel. However, during expert labeling, each voxel is arbitrarily labeled as a certain tissue type instead of a number. It is obvious that the ℎ-values (hereafter, the value from a single voxel in an ℎ-map referred to as "ℎ-value") from the voxels, which are labeled as the same tissue type, could be different. In this section, we will exploit the ℎ-values of each tissue type. -values of all patients and each patient are also exploited to reveal the extent of individual difference.

Tissue Type Assignment.
Based on the data analysis of ℎmaps, each voxel could be assigned to a certain tissue type. In each voxel, there are 3 ℎ-values from 3 ℎ-maps for "C, " "T, " and "N, " respectively. Obviously, the ℎ-values of "C" (i.e., ℎ C ) should be bigger than the ℎ-values of "T" (i.e., ℎ T ) and the ℎ-values of "N" (i.e., ℎ N ) for normal voxels. Analogically, for voxels of tumor and necrosis tissue, ℎ T and ℎ N should overwhelm ℎ-values of other tissue types, respectively.
However, there are mixed tissues where ℎ-values of each tissue type vary significantly and thus the above criteria cannot be simply used to decide the tissue type. To properly separate the different tissues from the mixed tissues, a parameter should be added; that is, ℎ C should be bigger than ℎ T + CT for the voxel to be assigned to be "C. " Similarly, ℎ N should be bigger than ℎ T + TN for the voxel to be assigned to be "N. " Therefore, we make the following criteria for tissue type assignment.

The Rules for Tissue Type Assignment
While ℎ C > ℎ T , ℎ C > ℎ N , and ℎ C > ℎ T + CT , assign the voxel to be "C"; While ℎ N > ℎ T , ℎ N > ℎ C , and ℎ N > ℎ T + TN , assign the voxel to be "N"; While ℎ T > ℎ C + CT and ℎ T > ℎ N + TN , assign the voxel to be "T"; Else if ℎ N < ℎ C and ℎ N < ℎ T , assign the voxel to be "C/T"; Else if ℎ C < ℎ T and ℎ C < ℎ N , assign the voxel to be "T/N. " According to the above criteria, we can have all the voxels assigned to a certain type, including the ones originally labeled as "B" by expert.
3.6. Validation. The efficacy of the proposed tissue type assignment approach is validated using expert labeling information. The computed tissue type of each voxel is compared with the tissue type labeled by expert. We use the correct rate, false alarm rate, and the omission rate to evaluate the performance of the proposed approach.
Correct rate describes correct assignment among all the assignment, where assigned represents the number of voxels which are assigned to a certain tissue type using the proposed method. And correct represents the number of voxels assigned to a certain tissue type that our assignment is the same as that of an expert. False alarm rate describes the wrong assignments which should not be counted, where error represents the number of voxels which are assigned to be a certain tissue type using the proposed approach but not labeled by an expert as the same tissue type. Omission rate describes the wrong assignment which is missed, where omission represents the number of voxels which are labeled by an expert to be a certain tissue type but not assigned as the same tissue type using the proposed approach. -axis is the NAA-to-Cho index (NCI) and -axis is the NAAto-Lips index (NLI). Blue, green, and red points indicate normal, tumor, and necrotic tissue, respectively. Mixed colors represent mixed tissue. Blue for "C, " cyan for "C/T, " green for "T, " yellow for "T/N, " and red for "N. "

Spectra Investigation for GBMs Using Biomarkers.
We investigated all the 6 data sets which were pathologically confirmed to be GBM by clinicians. Figure 1 shows the overall tissue types of all the GBM data sets. Each point represents a spectrum from one voxel among all the data sets. The points are colored using expert labeling, same color for same tissue type. Due to the variation of the spectra, the distribution map shows serious overlap between tissue types. Though there are two vaguely centralized clusters for normal (higher NLI and NCI) and necrosis (very low NLI and lower NCI), there are no clear dividing lines between tissue types. Tumor cannot be separated from normal and necrosis. Mixed tissues cannot be differentiated from other tissue types, either.

Spectra Variation Investigation Using Expert Labeling.
The spectral variation of pure tissues labeled by an expert is investigated by plotting all spectra of the same tissue from all GBM patients in one figure. As shown in Figure 2, the green spectra are from all the voxels labeled as normal, tumor, and necrosis by an expert. Serious spectral variations for the same tissue type can be observed. It demonstrates that spectra for the same tissue type are possibly not identical. The red bold line plots the mean spectrum for normal, tumor, and necrosis, respectively. We can observe that most of the green spectra have great difference with the mean spectra.
The spectral relationships of different tissue types are investigated using correlation coefficients. The correlation coefficients of each spectrum labeled as "C" by expert and the spectrum labeled as "T" by expert, noted as CT , are calculated to investigate the difference of normal spectra and tumor spectra and its variation. Figure 3(a) shows the CT for all the GBM patients and each individual patient. As shown, most of the correlation coefficients CT are between 0.3 and 0.7. However, some values are extremely small or big because of the variation of tumor spectra. Differences between patients are not significant except for two patients, that is, patients 4 and 6. It demonstrates that the serious variation among patients is not common but possible. The lower quartile of CT , 1 CT = 0.2167, could be used to describe the relationship between normal and tumor. Similarly, the correlation coefficients of each spectrum labeled as "T" by expert and the spectrum labeled as "N" by expert, noted as TN , are calculated to investigate the difference of tumor spectra and necrotic spectra and its variation, as shown in Figure 3(b). Compared to CT , the variation of TN is more serious. However, the TN values of all patients inside the box are between 0.3 and 0.8. The lower quartile of TN , 1 TN = 0.2950, could be used to describe the relationship between tumor and necrosis.

ℎ-Maps Variation for Different Tissue Types.
The values in ℎ-maps for each labeled specific tissue type are analyzed. Figure 4 gives the ℎ-values from the 6 GBM patients. For the 6 data sets, there are 6 normal ℎ-maps, 6 tumor ℎ-maps, and 6 necrosis ℎ-maps. For ℎ-maps of each tissue type, we analyzed the data distribution of tissue types for all patients. Figure 4 illustrates the ℎ-values of normal ℎ-map. Each plot contains a box for all patients and 6 boxes for 6 GBM patients. Figure 4(a) depicts the ℎ-values taken from ℎ-maps of normal tissue. The values from voxels labeled as "C, " "T, " "N, " "C/T, " and "T /N" are depicted. As shown, the values for "C" are mostly between 0.6 and 1. The values for "N" and "T/N" are all quite small. It implies that good separation of normal and necrosis is possible using ℎ-maps. For the other two types "T" and "T/N", the values vary greatly. Figure 4(b) depicts the ℎ-values taken from tumor ℎ-maps. As shown, values for all tissue types vary seriously, even for tumor. Figure 4(c) depicts the ℎ-values taken from necrosis ℎ-maps. The values for "N" are mostly between 0.5 and 0.9. The values for "C" and "C/T" are quite small. It also implies that good separation of normal and necrosis is possible using ℎ-maps. The values for "T" and "T/N" vary greatly. In general, the ℎvalues taken from tumor ℎ-maps vary more seriously than the ℎ-values taken from ℎ-maps of normal and necrosis, and the values for "T, " "C/T, " and "T/N" taken from all three ℎ-maps vary significantly. It implies that the separation of tumor and mixed tissue is more difficult than normal and necrosis.

Tissue
Type Assignment for GBMs. The proposed tissue type assignment method described in Section 3.5 is applied to the ℎ-maps of 6 GBM data sets. 1 CT = 0.2167 and 1 TN = 0.2950 are used as CT and TN , respectively. The results are compared with expert labeling information. For both the results and the expert labeling, the distribution map is color-coded blue for "C, " cyan for "C/T, " green for "T, " yellow for "T/N, " red for "N, " and black for "B" which is spectra of low quality of which the tissue type cannot be decided by expert. As shown in Figure 5, the assigned tissue types are approximately in accordance with the expert labeling. The regions of normal and necrosis are more accurate than the regions of tumor and mixed tissues like "C/T" and "T/N. " This is mainly because the high infiltration character of gliomas brings higher variation to the spectral profiles of tumor and mixed tissues. The black voxels labeled as "B" by expert can be estimated using the proposed method. After analyzing localization of these voxels and their surrounding voxels, the assignment of these voxels was confirmed to be correct.

Validation.
For each patient, the correct rate, false alarm rate, and the omission rate are calculated for each pure tissue and mixed tissue types by comparing results with expert labeling. The "N/A" in Table 1 represents the situations which do not exist.
As we observe, the assignments of pure tissue "C" and "N" are almost always more accurate than "T. " The correct rate of "C" and "N" can be as good as above 0.9 or even 1. The correct rate of mixed tissues (i.e., C/T and T/N) is lower than pure tissues.
The omission rates of the pure tissue "C" and "N" for all patients are lower than 0.5, mostly lower than 0.4. But for "T" and mixed tissue "C/T, " "T/N, " the omission rate is higher.
For all results, the pure tissues "C" and "N" perform better than "T" and "T" performs better than the mixed tissues "C/T" and "T/N. " Inaccurate assignment of a tissue type influences the assignment of the tissues near it. In other words, the correct rate or error rate of "C, " "T, " and "N" will be affected by the inaccurate assignment of "C/T" and "T/N. "

Discussions
This study continued with our tissue typing work using hNMF [15]. We explored the possibility of only using several most representative biomarkers for tissue differentiation. The results showed that the different tissue types cannot be well separated, especially for tumor and mixed tissues. Therefore, a new approach for tissue type assignment using hNMF is developed.
Then we evaluated the relationship between spectra of different tissue types. The spectra labeled as a certain tissue type by expert are compared to the spectra labeled as another tissue type. The variation of the different correlation coefficients for both intra-and interpatient indicates the difference of spectra which are labeled as the same tissue type. This implies that the spectra are not identical even if they are labeled as the same tissue type, especially for tumoral spectra. This could be due to the fact that glioblastoma are known to be very heterogeneous lesions. Invasion, regions of increased cellularity, necrosis on a microscopic and a macroscopic scale,  and NAA. Tumoral regions with moderately elevated cellularity will be perceived as regions with only moderately elevated Cho and moderately lowered NAA. In the end, these spectra represent all tumoral tissue, as designated by the histopathologist as well as by the expert labeling in MR spectroscopy [21][22][23][24][25]. Therefore, spectral variation within the same tissue type, which is introduced by the nature of tumor proliferation and the volume of CSI voxels, could happen and influence the performance of tissue typing method. As demonstrated, the lower quartiles of correlation coefficients 1 CT and 1 TN could imply the "least spectral similarity" between different tissue types. Additionally, the scale of correlation coefficients CT and TN is in the same scale of ℎ-values. Therefore, in the tissue typing assignment experiment, where 1 CT = 0.2167 and 1 TN = 0.2950, which were calculated using 6 GBM patients, were used as the parameters CT and TN , respectively. Though the patients were few, the value of 1 CT and 1 TN will not change significantly since the voxel number for calculating them is large enough to be stable. Another important point we must stress is that, as long as we have decided the value for parameters CT and TN , we do not need to calculate 1 CT and 1 TN every time there is a new patient. The tissue assignment method is still automatic since the values used as CT and TN will be fixed numbers. Here, we just proposed a potential way to decide CT and TN . As the tissue type assignment is based on the ℎ-maps of hNMF, the results could be affected by both the ℎ-maps and the typing criteria. As shown, the results for tumor and mixed tissues are worse than the results for normal and necrosis. On the one hand, there is the serious variation of tumor spectra. On the other hand, the spectral profile of C/T and T/N is highly correlated with the tumor spectra. These facts lower the typing results. However, the assignments of each tissue type shown in Section 4.4 have shown the efficacy of the proposed method.

Conclusions
In this paper, we investigate the spectra variation with expert's labeling. Tissue type assignment criteria are proposed to assign each voxel to 5 different tissue types, including 3 pure tissue types "C, " "T, " and "N" and 2 mixed tissue types "C/T" and "T/N, " using the ℎ-maps of normal, tumor, and necrosis obtained by hNMF. Experiments show the feasibility of the proposed method for tissue type assignment.