Current and Potential Applications of Artificial Intelligence in Gastrointestinal Stromal Tumor Imaging

Gastrointestinal stromal tumors (GISTs) are the most common mesenchymal tumors; they have malignant potential and can occur anywhere along the gastrointestinal system. Imaging is indispensable for the diagnosis, risk staging, therapy, and follow-up of GISTs. According to current guidelines, computed tomography (CT) is the recommended imaging method for staging and follow-up. Artificial intelligence (AI) develops and applies theories, methods, techniques, and application systems for simulating, extending, and expanding human intelligence. Recently, a few studies have explored AI applications in GIST imaging. This article reviews the current AI studies in GIST imaging, covering preoperative diagnosis, risk stratification and prediction of prognosis, gene mutation, and targeted therapy response.


Introduction
The most frequent mesenchymal tumors of the gastrointestinal system are gastrointestinal stromal tumors (GISTs), with a prevalence of 14-20 cases per million [1]. GISTs can occur anywhere in the gastrointestinal system: 50-60% are located in the stomach, 30-35% in the small intestine, 5% in the colon and rectum, and less than 1% in the esophagus [2]. Surgery is the first-line treatment for GISTs with malignant potential.
Imaging is indispensable for the diagnosis, staging, and follow-up of GISTs and for monitoring the response to adjuvant therapy [3]. According to current guidelines, computed tomography (CT) is the recommended imaging method for GIST classification [4], while magnetic resonance imaging (MRI) or enhanced endoscopic ultrasonography (EUS) can be alternatives for patients who are pregnant or allergic to iodine. [18F]-fluorodeoxyglucose positron emission tomography (PET)-CT can be helpful for early monitoring of tumor response to tyrosine kinase inhibitor (TKI) therapy [4].
At present, clinical imaging practice depends mainly on radiologists' subjective interpretation of morphological signs such as location, margin, contour, size, attenuation, growth type, and enhancement degree. With the spread of high-end multislice spiral CT, high-quality images containing rich digital information have become widely available, encouraging the use of artificial intelligence (AI) techniques to mine and process the large amount of data contained in the images. Recently, AI research has grown explosively, particularly in the field of medicine.

Artificial Intelligence
As an information science, AI develops and applies theories, methods, techniques, and application systems for simulating, extending, and expanding human intelligence [5].
Lambin et al. [6] first proposed the notion of radiomics in 2012 as a computer-aided approach that extracts a large number of features from radiographic images. As a new imaging technology, it can provide objective image information that cannot be recognized by the naked eye and is more detailed than a radiologist's visual interpretation. Texture analysis can quantitatively extract and evaluate tumor characteristics and can assess tumor heterogeneity, which is related to the histopathological components of tumor tissue and is mainly influenced by tumor neovascularization (vessel formation and vascular permeability), cellular structure, cell density, and microcirculation changes. Such quantitative, feature-based methods could be clinically relevant to tumor diagnosis, staging, prognosis, and therapy.
In detail, radiomic texture mainly comprises statistical texture, morphology-based texture, and transform-based texture. Statistical texture treats texture as a measure of the statistical properties of the gray levels within the region of interest (ROI). It mainly includes (1) first-order statistical features, which describe the distribution of pixel values in an image, such as histogram analysis; (2) second-order statistical features, also called texture features, which represent the spatial relations between pixels and the frequencies of the corresponding pixel pairs, including the gray-level cooccurrence matrix (GLCM), gray-level difference matrix (GLDM), gray-level run-length matrix (GLRLM), gray-level size zone matrix (GLSZM), and neighborhood gray-tone difference matrix (NGTDM); and (3) higher-order statistical features, which are extracted after applying filters or mathematical transformations to the images, such as first-order and textural features extracted from wavelet-filtered, Laplacian of Gaussian (LoG)-filtered, or local binary pattern (LBP)-filtered images. Transform-based analysis extracts texture characteristics from the statistical properties of the frequency spectrum and characterizes the global periodicity of the gray levels through the high-energy peaks in the spectrum and their types.
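To make the second-order features concrete, the following is a minimal sketch of how a GLCM and a few features derived from it can be computed. It is illustrative only and is not taken from any of the reviewed studies: the function names, the tiny 4 x 4 ROI quantized to four gray levels, and the single horizontal offset are all assumptions chosen for readability.

```python
def glcm(image, levels, offset=(0, 1), symmetric=True):
    """Normalized gray-level co-occurrence matrix for one pixel offset.

    image  : 2D list of integer gray levels in range(levels) (assumed pre-quantized)
    offset : (row step, column step) between the two pixels of a pair
    """
    d_row, d_col = offset
    n_rows, n_cols = len(image), len(image[0])
    counts = [[0] * levels for _ in range(levels)]
    for r in range(n_rows):
        for c in range(n_cols):
            r2, c2 = r + d_row, c + d_col
            if 0 <= r2 < n_rows and 0 <= c2 < n_cols:
                i, j = image[r][c], image[r2][c2]
                counts[i][j] += 1
                if symmetric:  # also count the pair in the opposite direction
                    counts[j][i] += 1
    total = sum(sum(row) for row in counts)
    if total == 0:
        raise ValueError("no pixel pairs found for this offset")
    return [[value / total for value in row] for row in counts]


def glcm_features(p):
    """Contrast, inverse difference, and inverse difference normalized."""
    levels = len(p)
    cells = [(i, j) for i in range(levels) for j in range(levels)]
    return {
        "contrast": sum(p[i][j] * (i - j) ** 2 for i, j in cells),
        "inverse_difference": sum(p[i][j] / (1 + abs(i - j)) for i, j in cells),
        "inverse_difference_normalized": sum(
            p[i][j] / (1 + abs(i - j) / levels) for i, j in cells
        ),
    }


# Hypothetical ROI, already quantized to four gray levels (0-3).
roi = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
p = glcm(roi, levels=4)
features = glcm_features(p)
print(features)
```

In practice the matrix is usually computed for several offsets and directions over a three-dimensional ROI and the resulting features are averaged; this sketch keeps a single two-dimensional offset so that the pair counting stays easy to follow.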
The morphology-based method decomposes an image into basic units and determines the rules required to assemble a given image from these units. Each of the above methods comprises various descriptors; detailed descriptions are presented in Table 1. Put simply, a representative radiomics workflow is composed of four tasks: image acquisition, image segmentation, feature extraction, and statistical analysis (Figure 1).
Radiogenomics, a promising new paradigm, has the potential to extend traditional radiographic imaging into the field of molecular and genomic imaging [7]. It aims to correlate image features with patterns of gene expression, gene mutations, and other gene-associated traits, promoting a deeper explanation of tumor heterogeneity and the development of imaging biomarkers [8].
Deep learning is a group of machine learning algorithms that extract deep features from the input image via multiple hidden layers [9]. Such multilayered computational models progressively learn representations of data at multiple levels of abstraction [10]. The neural network is a branch of machine learning that forms the basic structure of a deep learning network [11]. The deep learning models used in medical image processing include the Sparse Autoencoder, Convolutional Neural Network (CNN), Deep Belief Network, Restricted Boltzmann Machine, and Residual Neural Network (ResNet) [10]. Among these, the CNN is the most popular architecture, and later improved networks include more computational layers.
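As a rough illustration of what a single hidden layer of a CNN computes, the sketch below slides a small kernel over an image patch and applies a ReLU activation. It is a simplified assumption-laden example, not the architecture of any reviewed study: the kernel values are fixed by hand here, whereas in a trained CNN they are learned from data, and the function names and the toy patch are hypothetical.

```python
def conv2d_valid(image, kernel):
    """Slide the kernel over the image without padding (cross-correlation,
    which is what CNN layers compute in practice)."""
    k_rows, k_cols = len(kernel), len(kernel[0])
    out_rows = len(image) - k_rows + 1
    out_cols = len(image[0]) - k_cols + 1
    return [
        [
            sum(
                image[r + i][c + j] * kernel[i][j]
                for i in range(k_rows)
                for j in range(k_cols)
            )
            for c in range(out_cols)
        ]
        for r in range(out_rows)
    ]


def relu(feature_map):
    """Keep positive responses and set negative ones to zero."""
    return [[max(0.0, value) for value in row] for row in feature_map]


# Hypothetical 4 x 4 patch with a vertical edge between columns 1 and 2.
patch = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
# Hand-picked kernel that responds to a left-to-right increase in intensity.
edge_kernel = [
    [-1, 1],
    [-1, 1],
]
feature_map = relu(conv2d_valid(patch, edge_kernel))
print(feature_map)
```

Stacking several such layers, each applied to the feature maps of the previous one, is what lets the network build progressively more abstract representations.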

Diagnosis and Differential Diagnosis
GISTs represent a distinct histopathological group of subepithelial tumors. A broad range of other mesenchymal tumors can show imaging features similar to those of GISTs, although the two groups differ in prognosis and treatment. Previous studies have differentiated GISTs from other mesenchymal tumors based on tumor location, margin, contour, size, attenuation, growth type, enhancement degree, and necrosis [12][13][14][15][16][17]. However, it is still difficult to discriminate GISTs with a diameter of less than 5 cm from other mesenchymal tumors by subjective imaging interpretation alone.
There is a gap in AI research in this area, and we look forward to more AI studies that mine new data in this field.
Clinically, the preoperative diagnosis of GISTs around the periampullary area is difficult with conventional imaging. Moreover, pancreatic ductal adenocarcinomas (PDACs), duodenal adenocarcinomas (DACs), and GISTs differ in surgical procedures and prognosis [15][16][17]. Recently, Lu et al. [18] retrospectively studied 74 patients with duodenal tumors around the periampullary area: 26 DACs, 20 DACs, and 28 GISTs. Volumetric histogram analysis based on tumor heterogeneity was performed on enhanced multidetector CT images. They concluded that some parameters of CT histogram analysis could be valuable for differentiating DACs, PDACs, and GISTs arising from the periampullary area. However, the sample size and the tumor types involved in this study were limited. Further studies with larger samples and more tumor types will strengthen the application of AI in GIST diagnosis.
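For readers unfamiliar with histogram analysis, the sketch below shows the kind of first-order parameters that such an analysis typically yields from the attenuation values inside a tumor volume. It is an assumption-based illustration, not the parameter set used by Lu et al. [18]: the voxel values, the 10-unit bin width, and the function names are hypothetical.

```python
import math
from collections import Counter


def percentile(sorted_values, q):
    """Percentile q (0-100) with linear interpolation between neighbors."""
    position = (q / 100) * (len(sorted_values) - 1)
    low = math.floor(position)
    high = min(low + 1, len(sorted_values) - 1)
    fraction = position - low
    return sorted_values[low] + fraction * (sorted_values[high] - sorted_values[low])


def first_order_features(voxels, bin_width=10):
    """Histogram (first-order) statistics of the voxel values in an ROI."""
    n = len(voxels)
    ordered = sorted(voxels)
    mean = sum(voxels) / n
    m2 = sum((v - mean) ** 2 for v in voxels) / n  # variance
    m3 = sum((v - mean) ** 3 for v in voxels) / n
    m4 = sum((v - mean) ** 4 for v in voxels) / n
    # Entropy of the discretized histogram (bin width is an assumption).
    bins = Counter(math.floor(v / bin_width) for v in voxels)
    entropy = -sum((c / n) * math.log2(c / n) for c in bins.values())
    return {
        "mean": mean,
        "sd": math.sqrt(m2),
        "skewness": m3 / m2 ** 1.5 if m2 > 0 else 0.0,
        "kurtosis": m4 / m2 ** 2 if m2 > 0 else 0.0,  # equals 3 for a normal distribution
        "p10": percentile(ordered, 10),
        "median": percentile(ordered, 50),
        "p90": percentile(ordered, 90),
        "entropy": entropy,
    }


# Hypothetical attenuation values (HU) of the voxels inside a segmented tumor.
tumor_voxels = [30, 40, 40, 50, 50, 50, 60, 60, 70]
features = first_order_features(tumor_voxels)
print(features)
```

Because these statistics ignore where each voxel lies, two tumors with very different spatial patterns can share the same histogram, which is why second-order texture features are often analyzed alongside them.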

Prediction of Risk Stratification and Prognosis
Several risk assessment systems for postoperative recurrence of GISTs have been proposed and refined over the years, including the National Institutes of Health (NIH) criteria, the Armed Forces Institute of Pathology (AFIP) standard, and the National Comprehensive Cancer Network (NCCN) risk classification. In 2008, modifications of the NIH criteria were proposed, which incorporated tumor location, size, mitotic count, and tumor rupture. These criteria categorize recurrence risk into four groups (very low, low, intermediate, and high risk) and are accepted worldwide [19]. Imaging can provide additional findings related to the risk stratification of GISTs.
According to previous studies [20][21][22][23][24], tumor growth mode is related to risk, and GISTs with an exophytic or mixed growth mode carry a high risk. It has also been suggested that enhancement type, boundary, enlarged blood vessels, necrosis, calcification, and invasion of adjacent organs are associated with tumor risk stratification. Interobserver variability in these subjective evaluations has urged researchers to find more stable and objective parameters and indicators. Texture analysis can extract information hidden in medical images that cannot be identified by subjective visual interpretation. In theory, texture analysis should stratify the risk of GISTs more effectively than conventional imaging [24,25]. Nine studies have investigated the performance of CT-derived radiomic signatures for risk stratification [24][25][26][27][28][29][30][31][32], and one study evaluated EUS-derived texture [33] in relation to risk stratification. The details are summarized in Table 2. Among the CT-derived analyses, four studies applied the NIH or modified NIH criteria for GIST malignant risk classification [25][26][27][28], three studies were based on the NCCN guideline [24,29,30], one study did not use a clearly stated guideline [33], and one study used the Ki-67 expression standard [32]. Two of the four NIH-based studies evaluated only CT textural parameters [26,27]. The remaining two studies combined and compared conventional visual CT findings and clinical index models [25,28].
In 2018, Feng et al. [26] retrospectively reviewed 90 patients with intestinal GISTs. CT-derived histogram features were compared across GIST risk levels defined by the modified NIH risk classification. They concluded that volumetric CT texture features are feasible biomarkers for distinguishing low-risk, intermediate-risk, and high-risk intestinal GISTs (area under the curve (AUC) = 0.830, P < 0.001). However, some studies have reported results that contradict this study [29,34,35,36]. We speculate that differences in ROI delineation methods and differences between enhanced and unenhanced CT-derived texture features might be responsible. In this study, the numbers of patients in some risk groups were limited, and the authors combined some groups. Moreover, this study applied only first-order statistical radiomics features. Further studies are required to resolve the controversy.
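Since AUC is the main figure of merit quoted throughout this review, a short sketch of how it can be computed may help interpret the values. The example uses the pairwise (Mann-Whitney) formulation: the AUC is the fraction of positive-negative pairs in which the positive case receives the higher score, with ties counted as one half. The labels and scores are invented for illustration and do not come from any cited study.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a positive case outranks a negative one."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes are needed to compute an AUC")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(positives) * len(negatives))


# Hypothetical example: 1 = high-risk GIST, 0 = lower-risk GIST.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]  # model outputs (invented)
auc = roc_auc(labels, scores)
print(round(auc, 3))
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, so the reported values of roughly 0.8 to 0.9 indicate that most, but not all, high-risk and lower-risk pairs are ordered correctly.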
Another study [27] constructed a radiomics model using multiple-order statistical radiomics features based on contrast-enhanced CT to noninvasively predict the malignant-transformation potential and mitotic index of GISTs. In this study, patients were classified into a low-malignant-transformation-potential group (very-low-risk, low-risk, and intermediate-risk GISTs) and a high-malignant-transformation-potential group (high-risk GISTs) based on the NIH criteria, and the sample size was enlarged to 333 patients in total (training cohort = 233 and validation cohort = 100). The radiomics model showed good predictive performance in differentiating high- from low-malignant-transformation-potential GISTs, with an AUC of 0.882 in the training group and 0.920 in the validation group.
The above two studies constructed only a radiomics model, and a radiomics model alone cannot incorporate or be compared with conventional imaging findings and clinical information in GIST risk stratification. The next two studies [25,28] compared the accuracy of CT-derived textural parameters, subjective CT parameters, and clinical index models in predicting risk stratification. Yan et al. [28] included 213 patients with intestinal GISTs to assess the predictive value of clinical findings, subjective imaging findings, and multidetector CT texture findings for preoperative risk stratification. They reported that the model combining clinical findings, conventional imaging findings, and multidetector CT texture features had an AUC of 0.943. They concluded that CT texture may be a useful integrated tool for preoperative risk stratification of intestinal GISTs. In 2019, Chen et al. [25] constructed a radiomics nomogram for predicting the malignant potential of GISTs. Compared with conventional CT parameters and clinical indexes, the radiomics model discriminated low- from high-malignant-potential GISTs with a higher AUC of 0.858. In addition, the resulting radiomic nomogram achieved the highest diagnostic performance, with AUCs of 0.867 and 0.847 in the internal and external cohorts, respectively.
The same limitations, namely the use of a radiomics model alone and a limited sample size, apply to the studies using NCCN guidelines [29,30]. Liu et al. [29] found meaningful texture parameters from various phases for differentiating the malignancy risk of GISTs based on NCCN risk stratification, which was consistent with a previous study [25]. However, the sample was small (no more than 100 patients), and this study applied only first-order statistical radiomics features. With a larger sample (total number = 140, training cohort = 100 and validation cohort = 40) and various statistical radiomics features, Zhang et al. [30] reported discriminative performance with an AUC of 0.935 and an accuracy of 90.2% in the validation set for distinguishing advanced from nonadvanced GISTs. Further, the radiomics model showed satisfactory discriminative performance for the four groups of GIST risk stratification, with an AUC of 0.809 and an accuracy of 67.5% in the validation set. Nevertheless, these studies did not examine any direct or indirect correlation among radiomics features, subjective imaging findings, and pathological results.
Choi et al. [24] then evaluated and compared the diagnostic performance of CT radiomics parameters and visual CT inspection in predicting the malignancy grade and mitotic index of GISTs. They found that the diagnostic accuracy of specific radiomics features was better than that of visual inspection.
However, the previous studies used radiomics methods alone for pattern classification, and radiomics parameters are relatively global and manually predefined. Researchers have also started to explore how well deep features obtained by deep learning networks classify GISTs. In 2019, Ning et al. [31] introduced an integrated structure that combined features from a radiomics model with features from deep convolutional models and used them together for GIST categorization. The hybrid structure combining radiomics and CNN features performed better (AUC = 0.882) than the conventional CT features model (AUC = 0.774), the radiomics model (global features) (AUC = 0.807), and the CNN model (local features) (AUC = 0.826). To our knowledge, this is the first and only study to apply both a radiomics model and CNNs to GIST risk stratification; it combined radiomics parameters derived from a global three-dimensional region with deep convolutional features derived from a local region. This integrated structure improves both model robustness and classifier efficiency.
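One simple way to combine hand-crafted and deep features is to standardize each feature across patients and concatenate the two vectors before classification. The sketch below illustrates that idea only; how Ning et al. [31] actually fused their features is not described here, so the concatenation strategy, the function names, and the feature values are all assumptions.

```python
import math


def zscore_columns(rows):
    """Standardize each feature (column) to zero mean and unit variance."""
    standardized_columns = []
    for column in zip(*rows):
        mean = sum(column) / len(column)
        sd = math.sqrt(sum((v - mean) ** 2 for v in column) / len(column))
        standardized_columns.append(
            [(v - mean) / sd if sd > 0 else 0.0 for v in column]
        )
    return [list(row) for row in zip(*standardized_columns)]


def fuse_features(radiomics_rows, deep_rows):
    """Concatenate standardized radiomics and deep features per patient."""
    if len(radiomics_rows) != len(deep_rows):
        raise ValueError("both feature sets must cover the same patients")
    radiomics_z = zscore_columns(radiomics_rows)
    deep_z = zscore_columns(deep_rows)
    return [r + d for r, d in zip(radiomics_z, deep_z)]


# Hypothetical features for three patients (one row per patient).
radiomics = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]  # global, hand-crafted
deep = [[0.1], [0.1], [0.4]]                         # local, learned by a CNN
fused = fuse_features(radiomics, deep)
print(fused)
```

Standardizing first keeps features measured on large scales from dominating the fused vector; the fused rows can then be passed to any classifier.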
In addition, risk-related molecules have also been predicted with radiomics methods. The Ki-67 index is an important marker of cell proliferation and tumor heterogeneity [37]. Ki-67 is highly expressed in most proliferating cells but not in G0 cells, and it is regarded as a global risk marker of malignant potential in GISTs [38]. Previous literature has also demonstrated that a high Ki-67 index is an independent risk marker for high-malignancy GISTs [39][40][41]. A multicenter study [32] demonstrated that a nomogram combining CT-based radiomics features with tumor size performed well in predicting Ki-67 expression in GISTs, with AUCs of 0.801, 0.828, and 0.784 in the training, internal validation, and external validation cohorts, respectively. This indicated that the Ki-67 expression rate in GISTs is potentially associated with the CT textural signature.
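A nomogram is a graphical form of a regression model, and in radiomics it is commonly drawn from a logistic regression that combines a radiomics score with clinical variables. The sketch below shows that general form under stated assumptions: the cited study's model type and coefficients are not given here, so the logistic form, the intercept, and both weights are hypothetical placeholders chosen only to make the example run.

```python
import math


def nomogram_probability(rad_score, tumor_size_cm,
                         intercept=-2.0, w_rad=1.5, w_size=0.2):
    """Predicted probability of high Ki-67 expression from a logistic model.

    All coefficients are hypothetical and are NOT taken from the cited study.
    """
    linear_predictor = intercept + w_rad * rad_score + w_size * tumor_size_cm
    return 1.0 / (1.0 + math.exp(-linear_predictor))


# Hypothetical patients: (radiomics score, tumor size in cm).
small_low_score = nomogram_probability(rad_score=-1.0, tumor_size_cm=3.0)
large_high_score = nomogram_probability(rad_score=1.0, tumor_size_cm=12.0)
print(round(small_low_score, 3), round(large_high_score, 3))
```

On a printed nomogram, each predictor's contribution to the linear predictor is drawn as a points scale, and the total points are mapped to the predicted probability.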
Radiomics methods have also been extended to other imaging modalities. For EUS-based radiomics, Li et al. [33] developed an EUS-derived radiomics model to differentiate GISTs of the higher-risk classification (intermediate-risk and high-risk) from those of the lower-risk classification (very-low-risk and low-risk). This model can support preoperative diagnosis and provide a useful reference for clinicians.
All of the above results suggest that radiomics is superior to traditional imaging description in predicting the risk stratification of GISTs, which lays a foundation for the future application of radiomics. However, the existing studies remain insufficient. Nearly all present studies evaluated CT-derived texture. MRI-derived texture analysis may have more potential to uncover hidden information, and quantitative imaging modalities may be useful for advancing precision medicine. It should also be noted that the sample sizes of most studies were limited. Scanners, scanning parameters, image acquisition protocols, lesion segmentation, ROI delineation, and statistical modeling were also inconsistent across studies. Selection bias in the extraction of texture parameters was also evident in the statistical analyses, which calls the reproducibility of the results into question. In addition, conclusions drawn from small samples generalize poorly to specific clinical applications. Furthermore, CNNs based on deep learning may substantially supplement and extend the applicability of radiomics in terms of the feature library or prediction accuracy, but their effectiveness remains to be verified.

Prediction of Gene Mutation
GISTs originate from the interstitial cells of Cajal in the gastrointestinal system [42], and 90% express the CD117 antigen (C-KIT) [43], a tyrosine kinase growth factor receptor [44]. GISTs with KIT exon 11 mutations are more responsive to imatinib therapy [45][46][47], while those with other molecular mutations respond more poorly to imatinib. In addition, GISTs with KIT exon 9 mutations are more responsive to sunitinib. The connection between CT findings and GIST genotype has been investigated [48]. Compared with GISTs with KIT exon 11 mutations, those with KIT exon 9 mutations were significantly associated with a tumor size of more than 10 cm, a higher enhancement grade, and a greater area of tumor necrosis (P < 0.05).
In 2018, Xu et al. [49] reported a radiogenomic study on GISTs. They included enhanced CT images of 86 GISTs and performed texture analysis. They found that texture analysis could help discriminate GISTs without KIT exon 11 mutations from those with KIT exon 11 mutations. In addition, a nongastric location, lower CD34 staining, and higher radiogenomic signature values were associated with GISTs without KIT exon 11 mutations, and the model achieved satisfactory diagnostic efficiency in the validation group (AUC = 0.904-0.962). However, the training and validation cohorts comprised 69 and 17 cases, respectively, and there were only four tumors without KIT exon 11 mutations in the validation cohort, which may have affected the accuracy of the results. Hence, a study with a large number of patients is required to validate these conclusions.
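A short calculation shows why so few cases without the mutation weaken the validation result. An empirical AUC is determined by the positive-negative pairs that can be formed, so with 17 validation cases, of which four lack the KIT exon 11 mutation, only 13 x 4 pairs exist. The helper name below is hypothetical; the cohort numbers are those reported above.

```python
def auc_resolution(n_with_mutation, n_without_mutation):
    """Number of comparable pairs and the AUC change caused by one misranked pair."""
    pairs = n_with_mutation * n_without_mutation
    return pairs, 1.0 / pairs


validation_cases = 17
without_mutation = 4
with_mutation = validation_cases - without_mutation

pairs, step = auc_resolution(with_mutation, without_mutation)
print(pairs, round(step, 4))
```

Each misranked pair therefore moves the validation AUC by about 0.02, so the reported range could shift noticeably with a handful of different cases.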

Response Evaluation of Targeted Therapy
Adjuvant TKI therapy is suggested for patients with GISTs at high risk of recurrence, and enhanced CT is the recommended imaging method for evaluating treatment response. The study in [50] constructed and validated a deep learning (ResNet model) nomogram for predicting recurrence-free survival (RFS) of GISTs after surgery without adjuvant treatment. The ResNet nomogram was built on enhanced CT and clinicopathological factors including the mitotic index, tumor location, and tumor size. Both the ResNet nomogram and the ResNet model showed significant prognostic capability for 3- and 5-year RFS on receiver operating characteristic curves. The authors suggested that the ResNet nomogram was superior to the existing risk stratification standards and to the clinicopathological nomogram across most of the range of reasonable threshold probabilities.
For metastatic GISTs undergoing TKI therapy, Ekert et al. [51] identified 25 patients with GISTs carrying KIT and PDGFR mutations. All patients received first-line imatinib therapy and different TKI therapies after disease progression. CT texture features were extracted and associated with response categories according to the modified Choi criteria. They concluded that some of the CT texture features (GLCM inverse difference, GLCM inverse difference normalized, GLRLM, and NGTDM) correlated with prognosis, progression-free survival, gene mutations, and treatment regimens.

Conclusions
Previous studies had some limitations. First, all of the above studies were retrospective. Most were single-center studies with limited sample sizes. The restricted number of samples not only limited the setting of threshold standards for imaging radiomics but also restricted the training of the models [52]. Second, several image acquisition scanners and parameters were used within the same study, which might reduce the reliability and reproducibility of the findings. Third, nearly all the studies evaluated CT-derived texture. MRI-derived texture analysis might have more potential to uncover hidden information, and quantitative imaging modalities may be useful for improving precision medicine. Finally, most of the significant texture features are statistical terms that lack a direct, explainable correlation with specific clinicopathological significance and biological characteristics, which limits the interpretability of AI in reproducible research and clinical application. The present studies demonstrated that AI methods, including radiomics and deep learning, have clinical value for GISTs and have laid a foundation for future application. Considering the limitations, prospective multicenter studies with large samples are needed. In addition, further standardization of examination techniques and deeper analysis of detailed imaging signs will improve our understanding of GIST imaging. The development of AI imaging in PET-CT and MRI will broaden this exploration. In the future, more AI studies and applications are expected in the preoperative prediction of various gene mutations and the evaluation of the efficacy of targeted therapies, making continuous progress towards the goal of individualized and accurate treatment.