Identification of Ultrasound-Sensitive Prognostic Markers of LAML and Construction of Prognostic Risk Model Based on WGCNA

Background Acute myeloid leukemia (LAML) is the most common acute leukemia in adults. Chemotherapy is the main treatment, but many patients who achieve remission eventually relapse, and the disease ultimately transforms into refractory leukemia. Identifying novel prognostic markers is therefore crucial for improving patients' clinical outcomes. Methods RNA-Seq data and clinical follow-up information for patients with acute myeloid leukemia were retrieved from The Cancer Genome Atlas (TCGA) and the Gene Expression Omnibus (GEO) databases, and samples that did or did not receive ultrasound treatment were compared by differential expression analysis. The ConsensusClusterPlus package was used for consensus clustering, weighted correlation network analysis (WGCNA) was used to find important modules, and the coexpression network of hub genes was generated using Cytoscape. The CIBERSORT, ESTIMATE, and xCell algorithms of the “IOBR” R package were employed to calculate the relative abundance of infiltrating immune cells, and mutation frequency was estimated with the “maftools” R package. Pathway enrichment scores were calculated using the single sample Gene Set Enrichment Analysis (ssGSEA) algorithm of the “Gene Set Variation Analysis (GSVA)” R package. Drug IC50 values were predicted with the “pRRophetic” R package. Prognosis-related signatures were selected by least absolute shrinkage and selection operator (Lasso) Cox analysis. Results Consensus clustering based on the ultrasound treatment differential genes divided the samples into two groups, Cluster 1 and Cluster 2. The prognosis of patients in Cluster 2 was better than that in Cluster 1, and the immune microenvironment differed considerably between Cluster 1 and Cluster 2.
Lasso analysis finally obtained an 8-gene risk model (GASK1A, LPO, LTK, PRRT4, UGT3A2, BLOCK1S1, G6PD, and UNC93B1). The model acted as an independent risk factor for the patients' prognosis, and it showed good robustness in different datasets. Considerable variations were observed in the abundance of immune cell infiltration, genome mutation, pathway enrichment score, and chemotherapeutic drug resistance between the low and high-risk groups in accordance with the risk score (RS). Additionally, model-based RSs in the immunotherapy cohort were significantly different between complete remission (CR) and other response groups. Conclusion The prognosis of people with LAML can be predicted using the 8-gene signature.


Introduction
Acute myeloid leukemia (LAML) is a heterogeneous hematological cancer characterized by blocked myeloid differentiation and the accumulation of immature blast cells in the bone marrow. Its main feature is the abnormal proliferation of myeloid blasts or leukocytes that cannot differentiate normally [1]. LAML, the most prevalent form of acute leukemia in adults, is highly heterogeneous and prone to recurrence [2,3]. According to the latest data, an estimated 20,050 new cases of LAML and about 11,540 deaths were expected in the United States alone in 2022 [4]. For decades, the conventional treatment of primary LAML has been induction chemotherapy to achieve complete remission (CR), followed by consolidation and intensive treatment after remission or stem cell transplantation at a selected time [5]. About 40-90% of LAML patients respond to the initial induction chemotherapy and achieve CR [6][7][8][9]. However, the remission rate of young patients is only 40%-50% [9]. Chemotherapy resistance is one of the major obstacles to long-term remission in patients with LAML [10,11]. Therefore, to better understand the molecular characteristics involved in the occurrence of LAML and to improve patients' clinical outcomes, it is essential to explore new prognostic markers.
Convenience and few side effects make ultrasound the second most widely used imaging method in the world. In addition to its wide use in diagnostic imaging, ultrasound is often used to treat various diseases [12]. With the development and clinical application of ultrasound molecular imaging in recent years, the technology can provide more accurate diagnosis and treatment for tumor patients and is expected to reduce treatment failure caused by chemotherapy resistance. Ultrasound-mediated targeted delivery (UMTD) is a novel ultrasound-based approach for delivering therapeutic materials that has enormous potential for effective drug delivery and for considerably enhancing the impact of drug treatment [13]. Using ultrasonic contrast agents (UCAs) as carriers to increase the distribution and absorption of chemotherapy drugs is essential for improving chemotherapy efficacy in tumor patients [12]. Advanced cervical cancer is treated with brachytherapy, and using ultrasonography during brachytherapy significantly improves the prognosis of cervical cancer patients [14]. Additionally, ultrasound combined with micro/nanobubbles can transfer genes and antigens into cells, which may effectively increase the response of tumors to immunotherapy [15]. Studies have also confirmed that ultrasound technology can enhance the therapeutic effect of radiotherapy and photodynamic therapy (PDT) [16,17]. Automated breast ultrasound (ABUS) and digital breast tomosynthesis (DBT) can be used to assess and track treatment efficacy and prognosis in breast cancer patients [18]. Ultrasound technology can also be used before treatment to predict a cancer patient's response to chemotherapy and to evaluate its correlation with long-term survival, helping to provide more accurate diagnosis and treatment [19,20].
In short, ultrasound diagnosis and treatment technology will benefit more and more tumor patients. Studying the principle and biological significance of its impact on prognosis and survival in cancer can further extend the role of this technology in tumor diagnosis and treatment.
In this study, differential expression analysis was used to identify genes of LAML that respond differentially to ultrasound treatment (ultrasound-sensitive genes). Two ultrasound-sensitive subtypes with significant prognostic differences were identified based on these genes. Furthermore, an 8-gene risk model including GASK1A, LPO, LTK, PRRT4, UGT3A2, BLOCK1S1, G6PD, and UNC93B1 was constructed to evaluate the prognosis of patients with LAML. The model showed good and stable prognostic performance.

Data Set Source and Preprocessing.
The expression profile data (FPKM values) and clinical data (Table 1) of LAML were accessed from The Cancer Genome Atlas (TCGA) database using the "TCGAbiolinks" R tool. The FPKM values were log2-transformed, and survival times were converted to a unified unit (days).
The expression profile and ultrasound grouping data for GSE10212 and the clinical information for GSE71014 were accessed from the Gene Expression Omnibus (GEO) database (https://www.ncbi.nlm.nih.gov/geo/) and processed as follows: (1) samples with no clinical follow-up information were discarded; (2) samples with no recorded survival time (<0 days) or no survival status were removed, and survival time was expressed in days; (3) probes were mapped to gene symbols; (4) probes associated with multiple genes were eliminated; (5) for gene symbols represented by multiple probes, the median expression value was used.
The "IMvigor210CoreBiologies" R package was used to download the expression profiles and the survival and response information of the IMvigor210 immunotherapy cohort (bladder cancer), and samples with both survival information and expression data were selected for analysis. Table 2 shows the clinical information.
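The GEO preprocessing steps (3)-(5) above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `collapse_probes`, the probe identifiers, and the toy annotation are hypothetical, and the study itself worked in R.

```python
from collections import defaultdict
from statistics import median


def collapse_probes(expr, probe_to_genes):
    """Collapse probe-level expression to gene level.

    expr: dict probe_id -> list of expression values (one per sample)
    probe_to_genes: dict probe_id -> list of gene symbols (hypothetical annotation)
    """
    by_gene = defaultdict(list)
    for probe, values in expr.items():
        genes = probe_to_genes.get(probe, [])
        # Steps (3)-(4): keep only probes that map to exactly one gene symbol.
        if len(genes) != 1:
            continue
        by_gene[genes[0]].append(values)
    # Step (5): per-sample median across probes that share a gene symbol.
    return {
        gene: [median(column) for column in zip(*rows)]
        for gene, rows in by_gene.items()
    }


# Toy data: two samples, four probes (values are made up).
expr = {
    "p1": [1.0, 2.0],
    "p2": [3.0, 6.0],
    "p3": [5.0, 5.0],  # maps to two genes, so it is dropped
    "p4": [4.0, 4.0],
}
annotation = {"p1": ["G6PD"], "p2": ["G6PD"], "p3": ["LTK", "LPO"], "p4": ["LTK"]}
gene_expr = collapse_probes(expr, annotation)
```

Here `gene_expr["G6PD"]` is the per-sample median of probes `p1` and `p2`, while the ambiguous probe `p3` contributes to neither gene.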

Differential Expression Analysis.
Differential expression analysis in this study was carried out with the "limma" R package. In the GSE10212 dataset, samples that received ultrasound treatment were compared with those that did not, and DEGs were identified with the thresholds p value <0.01 (the unadjusted p value was used because no gene met adjust.p value <0.05) and |log2FC| > 0.585. For the differential expression analysis of the TCGA ultrasound-sensitive subtypes, DEGs were identified with a Benjamini-Hochberg (FDR) corrected adjust.p value <0.01 and |log2FC| > 0.585.
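To make the two thresholding schemes concrete, the following Python sketch applies a Benjamini-Hochberg adjustment and the |log2FC| > 0.585 cutoff (0.585 is approximately log2(1.5), i.e., a 1.5-fold change). It is a minimal illustration, not the limma workflow used in the study; the function names, gene labels, and numbers are hypothetical.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def select_degs(genes, log2fc, pvalues, p_cut=0.01, lfc_cut=0.585, use_adjusted=True):
    """Genes passing the p-value (raw or BH-adjusted) and |log2FC| cutoffs."""
    stat = bh_adjust(pvalues) if use_adjusted else pvalues
    return [g for g, fc, p in zip(genes, log2fc, stat)
            if p < p_cut and abs(fc) > lfc_cut]


# Hypothetical toy results for five genes.
genes = ["A", "B", "C", "D", "E"]
log2fc = [1.2, -0.9, 0.3, 0.7, 2.0]
pvals = [0.001, 0.004, 0.0005, 0.009, 0.5]

degs_adjusted = select_degs(genes, log2fc, pvals)                  # TCGA-style rule
degs_raw = select_degs(genes, log2fc, pvals, use_adjusted=False)  # GSE10212-style rule
```

In this toy input, gene D passes the raw p value cutoff but not the adjusted one, which mirrors why a raw threshold retains more genes in a small dataset.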

GO and KEGG Enrichment Analysis.
GO and KEGG enrichment analyses were conducted with the "clusterProfiler" R package on the differential genes obtained from the two differential expression analyses. The p value adjustment method was set to BH, and an adjust.p value <0.05 was used as the cutoff to identify substantially enriched pathways. Among the pathways meeting this cutoff, the top 10 most significantly enriched were selected for visualization.

Unsupervised Cluster Analysis.
Consensus clustering was performed on the TCGA-LAML samples using the "ConsensusClusterPlus" R package to identify ultrasound-sensitive molecular subtypes. The analysis used an 80% resampling rate and 1000 repetitions to ensure classification stability. Kaplan-Meier (KM) survival curves were drawn with the "survival" R package, and the significance of the prognostic difference between subtypes was verified by the log-rank test. Finally, the clustering result with a good clustering effect and significant differences in survival among subtypes was selected as the subtype recognition result.

Weighted Gene Coexpression Network Analysis (WGCNA).
WGCNA is commonly used to characterize association patterns between gene expression in microarray or RNA-seq data. It divides the gene coexpression network of a complex biological process into a number of highly correlated feature modules, which represent groups of highly synergistic genes. These modules can be associated with specific clinical features to identify genes with key functions, helping to study the mechanisms underlying specific biological processes and to explore candidate biomarkers.
The "WGCNA" R package was applied to the hypervariable genes (adjust.p value <0.01) of the ultrasound-sensitive molecular subtypes in the TCGA-LAML cohort. Combined with clinical characteristics (age, sex, and survival status), a gene coexpression network was constructed, and key modules were identified through the correlation coefficients between clinical characteristics and modules. The hub genes of key modules were then identified according to their gene significance (GS) and module membership (MM) values, and the coexpression network of hub genes was created using Cytoscape software.

Construction of Prognostic Risk Model and Analysis of Survival Differences.
Univariate Cox analysis was conducted on the hub genes of the key modules to determine the signatures associated with the prognosis of LAML (p < 0.05). For each signature, the median expression was taken as the cutoff to divide the LAML samples into high and low expression groups. The KM method was used to draw survival curves, and the log-rank test was used to assess the significance of the difference. The Lasso regression method of the "glmnet" R package was used to identify the important prognosis-related genes and build the prognostic model. The tumor samples were divided into high-risk and low-risk groups using the median RS as the threshold; KM survival curves were generated, and the log-rank test was used to determine the significance of the difference. Receiver operating characteristic (ROC) curves were created with the "timeROC" R package to evaluate the predictive performance of the risk score model, the "ggplot2" R package was used to produce the scatter diagrams of survival time and survival state and of the sample scores, and the "pheatmap" R package was used to create the expression heat map of model genes. The risk value of the model was determined by multiplying the expression value of each model gene by its weight and summing the products. The formula is as follows: $\mathrm{RS} = \sum_{i=1}^{n} \mathrm{coef}(i) \times \mathrm{Exp}(i)$, where $n$ is the number of model genes, $\mathrm{coef}(i)$ is the weight of the $i$-th gene, and $\mathrm{Exp}(i)$ is its expression value.
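The risk score and the median split described above can be sketched as follows. The coefficients and expression values are invented for illustration (the fitted Lasso weights are not given in this section), and assigning samples strictly above the median to the high-risk group is an assumption about how ties are handled.

```python
from statistics import median


def risk_score(expression, coefficients):
    """RS = sum over model genes of coef(i) * Exp(i) for one sample."""
    return sum(coefficients[gene] * expression[gene] for gene in coefficients)


def split_by_median(scores):
    """Label each sample 'high' or 'low' using the median RS as the cutoff."""
    cutoff = median(scores.values())
    return {sample: ("high" if value > cutoff else "low")
            for sample, value in scores.items()}


# Hypothetical weights for two of the model genes (not the published values).
coefficients = {"G6PD": 0.5, "LTK": -0.2}
samples = {
    "s1": {"G6PD": 2.0, "LTK": 1.0},
    "s2": {"G6PD": 4.0, "LTK": 3.0},
    "s3": {"G6PD": 1.0, "LTK": 0.0},
    "s4": {"G6PD": 3.0, "LTK": 5.0},
}
scores = {name: risk_score(expr, coefficients) for name, expr in samples.items()}
groups = split_by_median(scores)
```

With these toy values, `s1` and `s2` fall in the high-risk group and `s3` and `s4` in the low-risk group.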

Estimation of Proportion of Immune Infiltrating Cells and Immune Score.
Three algorithms from the "IOBR" R package, CIBERSORT, ESTIMATE, and xCell, were employed to estimate immune cell infiltration from the expression profiles of the TCGA-LAML dataset [21]. CIBERSORT characterizes the cell composition of complex tissues from gene expression profiles. Its leukocyte signature matrix LM22 uses 547 genes to distinguish 22 immune cell types, comprising myeloid subpopulations, natural killer (NK) cells, plasma cells, naive and memory B cells, and 7 different types of T cells. The LM22 signature matrix and CIBERSORT were used to estimate the proportions of the 22 cell phenotypes in each sample; the proportions of all immune cell types sum to 1 in each sample.
The ESTIMATE algorithm was employed to calculate the immune score, tumor purity, matrix score, and ESTIMATE score of the tumor.
xCell performs cell type enrichment analysis from gene expression data for 64 different immune and stromal cell types. To reduce the correlation between closely related cell types, xCell uses gene signatures learned from thousands of different cell types from different sources. Through extensive computer simulation of signatures and cellular immune typing, xCell can reliably describe the landscape of cellular heterogeneity in tissue expression profiles.
... samples and to classify the samples with model grouping information to draw a waterfall diagram.

HALLMARK Pathway Enrichment Analysis.
With the ssGSEA algorithm of the "GSVA" R package, the enrichment scores of 50 hallmark pathways were calculated for each sample from the gene expression of the LAML samples. The correlations of the enrichment scores with the RS and with the expression of model genes were determined with the cor function, and the "corrplot" R package was used to visualize the results. Moreover, statistical tests were used to assess the differences in enrichment scores between the model groups, and an enrichment score heat map combined with the clinical characteristics of the samples was produced with the "pheatmap" R package.

Drug Sensitivity Analysis.
Based on the expression data of model genes, the sensitivity (IC50 value) to 138 drugs in the Genomics of Drug Sensitivity in Cancer (GDSC) dataset was predicted with the "pRRophetic" R package. The sensitivity of patients with LAML to drug treatment was evaluated by the IC50 value. The Wilcoxon test was used to compare the IC50 values between the two risk groups, and the drugs that differed substantially between the two groups were identified.

Statistical Test.
The Wilcoxon test was used to compare two groups of samples, and the Kruskal-Wallis test was used to compare more than two groups. Significance is marked as follows: ns represents p > 0.05, * represents p ≤ 0.05, ** represents p ≤ 0.01, *** represents p ≤ 0.001, and **** represents p ≤ 0.0001. A p value < 0.05 was considered significant.
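The significance annotation above amounts to a simple lookup, sketched here in Python. The function name `significance_label` is hypothetical; the cutoffs are the ones stated in the text.

```python
def significance_label(p):
    """Map a p-value to the star notation used in the figures."""
    if p <= 0.0001:
        return "****"
    if p <= 0.001:
        return "***"
    if p <= 0.01:
        return "**"
    if p <= 0.05:
        return "*"
    return "ns"


labels = [significance_label(p) for p in (0.2, 0.05, 0.01, 0.0008, 0.00005)]
```

The thresholds are checked from the strictest to the loosest so that each p value receives the largest number of stars it qualifies for.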

Identification of Ultrasound-Sensitive Genes in the GEO Dataset.
Differential expression analysis was performed on ultrasound and nonultrasound samples of the GSE10212 dataset to select ultrasound-related differential genes. A total of 227 significantly different genes were obtained, including 133 upregulated and 94 downregulated genes. A volcano plot and a heat map were generated to show the expression and distribution of DEGs between the two sample groups (Figures 1(a) and 1(b)). Moreover, KEGG enrichment analysis and GO function enrichment analysis were carried out on the identified DEGs. The results are shown in Figures 1(c)-1(f); for analyses with more than 10 enriched entries, the top 10 most significantly enriched entries were selected to draw a bubble diagram.

Identification of Ultrasound-Sensitive Subtypes in TCGA Cohort.
The ultrasound-sensitive DEGs detected in the GEO dataset were used to identify molecular subtypes in the TCGA-LAML cohort by consensus clustering. The clustering effect was best with KM (k-means) as the clustering algorithm, Euclidean distance, and K = 2 (Figures 2(a) and 2(d)). Figure 2(i) shows the cumulative distribution function (CDF) of the consensus clustering for different values of k, and Figure 2(j) shows the relative change in the area under the CDF curve for k compared with k − 1. Two independent ultrasound-sensitive molecular subtypes with significant survival differences were identified, and the prognosis of cluster2 (C2) was substantially better than that of cluster1 (C1) (Figures 2(e)-2(h)).
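The consensus index that underlies the CDF curves can be sketched as follows: for each pair of samples, it is the fraction of resampling runs containing both samples in which they were assigned to the same cluster. This is a simplified Python illustration, not the ConsensusClusterPlus implementation; the function names and the four toy runs are hypothetical, and the clustering step itself is assumed to have been done already.

```python
from itertools import combinations


def consensus_matrix(runs):
    """Pairwise consensus index from resampled clustering runs.

    runs: list of dict sample -> cluster label; each dict holds only the
    samples drawn in that subsample.
    """
    together, sampled = {}, {}
    for labels in runs:
        for a, b in combinations(sorted(labels), 2):
            sampled[(a, b)] = sampled.get((a, b), 0) + 1
            if labels[a] == labels[b]:
                together[(a, b)] = together.get((a, b), 0) + 1
    return {pair: together.get(pair, 0) / count for pair, count in sampled.items()}


def consensus_cdf(matrix, threshold):
    """Fraction of sample pairs whose consensus index is <= threshold."""
    values = list(matrix.values())
    return sum(value <= threshold for value in values) / len(values)


# Four toy runs, each clustering a subsample of four samples into k = 2 groups.
runs = [
    {"s1": 1, "s2": 1, "s3": 2},
    {"s1": 1, "s2": 1, "s4": 2},
    {"s2": 2, "s3": 1, "s4": 1},
    {"s1": 2, "s3": 1, "s4": 1},
]
matrix = consensus_matrix(runs)
cdf_at_half = consensus_cdf(matrix, 0.5)
```

A stable clustering yields consensus indices near 0 or 1, so its CDF is flat in the middle; the area under this curve is what is compared across values of k.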

Differences in the Expression of Ultrasound-Sensitive Molecular Subtypes and Immune Infiltration.
The clinical significance of the ultrasound-sensitive subtypes was explored by analyzing the differences in expression and in the immune microenvironment between subtypes. First, differential expression analysis between the subtypes identified a total of 1341 DEGs, including 375 upregulated and 966 downregulated genes. The volcano plot is shown in Figure 3(a). The DEGs were then subjected to KEGG and GO function enrichment analyses, and the top 10 most significantly enriched pathways were selected to draw bubble diagrams. The results are shown in Figures 3(b)-3(e); the DEGs were mainly enriched in items related to immune regulation, such as leukocytes, hematopoietic cells, immune response, and MHC molecules. Subsequently, the differences in the tumor immune microenvironment between subtypes were investigated through the degree of immune cell infiltration and the predicted immune-related scores. Of the 22 immune cell infiltration ratios predicted by the CIBERSORT algorithm, 13 differed substantially between subtypes (Figure 3(f)). Box diagrams of the matrix score, immune score, ESTIMATE score, and tumor purity are shown in Figure 3(g). The differences in all four scores were statistically significant: the three scores were higher in C1 than in C2, while the tumor purity of C1 was lower than that of C2.

WGCNA in Identifying Key Modules and Hub Genes.
The highly variable genes among the ultrasound-sensitive molecular subtypes in TCGA-LAML were selected for WGCNA analysis. The 130 LAML samples were clustered (Figure 4(a)), and the cut height was set to 8000 to eliminate outliers; finally, 126 samples were used for subsequent analysis. Figure 4(b) shows the clustering tree after removing outliers. As shown in Figure 4(c), the optimal soft threshold was 14, at which the correlation coefficient exceeded 0.9. Additionally, Figure 4(d) shows that k is negatively correlated with p(k) (correlation coefficient: 0.9), which indicates that a scale-free gene network can be established with the selected β value. Furthermore, the minimum number of genes per module was set to 30, the maximum module distance to 0.25, and Pearson correlation was used for both coexpression correlation and module-trait correlation. Figure 4(e) shows the module clustering tree, in which brown is one of the more important modules. The eigengene clustering tree and heat map were drawn (Figure 4(f)); modules with correlation coefficient >0.8 (dissimilarity coefficient <0.2) were merged in the subsequent analysis. Figure 4(g) is the heat map of the correlation between modules and traits, and it can be observed that the key traits are age and status and the key modules are brown and black. Scatter diagrams were drawn to show the linear relationship between GS and MM in the modules (Figures 4(h) and 4(j)), revealing correlation coefficients of 0.35 and 0.33, respectively. According to the distribution of GS and MM values of genes in the modules, the threshold was set to GS > 0.2 and MM > 0.8, and hub genes were selected from the key module of each key trait.
Journal of Oncology
Furthermore, 37 hub genes were selected from the age-brown module and 26 hub genes from the status-black module. Based on the edge file and node file produced by the export-network-to-Cytoscape function in WGCNA, the hub genes were imported into Cytoscape to construct the hub gene coexpression network diagram of each key trait (Figures 4(i) and 4(k)).
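Two steps of the WGCNA procedure lend themselves to a short sketch: soft thresholding of the correlation matrix with β = 14 and hub gene selection with GS > 0.2 and MM > 0.8. The Python below is a minimal illustration that assumes an unsigned network (adjacency equal to the absolute Pearson correlation raised to β), which this section does not state explicitly; the gene names, expression profiles, and GS/MM values are invented.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def soft_threshold_adjacency(expr, beta=14):
    """Unsigned adjacency a_ij = |cor(x_i, x_j)| ** beta (assumed network type)."""
    genes = list(expr)
    return {
        (g, h): abs(pearson(expr[g], expr[h])) ** beta
        for g in genes for h in genes if g != h
    }


def pick_hubs(gs, mm, gs_cut=0.2, mm_cut=0.8):
    """Genes whose gene significance and module membership exceed the cutoffs."""
    return [gene for gene in gs if gs[gene] > gs_cut and mm[gene] > mm_cut]


# Invented expression profiles over four samples.
expr = {"g1": [1, 2, 3, 4], "g2": [2, 4, 6, 8.5], "g3": [4, 1, 3, 2]}
adjacency = soft_threshold_adjacency(expr)

# Invented GS and MM values for the same genes.
gs = {"g1": 0.35, "g2": 0.15, "g3": 0.5}
mm = {"g1": 0.9, "g2": 0.95, "g3": 0.6}
hubs = pick_hubs(gs, mm)
```

Raising correlations to a high power keeps strongly coexpressed pairs close to 1 while pushing weak correlations toward 0, which is what allows the network to approximate a scale-free topology.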

Construction and Verification of the Ultrasound-Related Prognostic Signature
Recognition of Prognostic Signature Based on Hub Genes.
In TCGA-LAML, univariate Cox analysis was performed on the 63 hub genes with a threshold of p < 0.01, and 45 prognosis-related genes were obtained. The median expression of each gene was used as the cutoff between high and low expression groups, and KM survival curves were drawn. Subsequently, 7/10 of the overall TCGA-LAML set (n = 130) was randomly sampled as the training set (n = 91) with the seed set at 12110, and, on the basis of these 45 prognosis-related signatures, the Lasso regression method was used to remove redundant genes and build a risk model. The results are shown in Figures
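The seeded 7:3 partition described above can be illustrated with a short Python sketch. The sample identifiers are placeholders, and Python's random number stream differs from R's, so reusing the seed 12110 here does not reproduce the authors' actual training set; it only shows that a fixed seed makes the split repeatable.

```python
import random


def split_train_test(sample_ids, train_fraction=0.7, seed=12110):
    """Shuffle with a fixed seed and split into training and test sets."""
    rng = random.Random(seed)
    shuffled = list(sample_ids)
    rng.shuffle(shuffled)
    n_train = round(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]


# Placeholder identifiers standing in for the 130 TCGA-LAML samples.
samples = [f"sample_{i:03d}" for i in range(130)]
train, test = split_train_test(samples)
```

With 130 samples this yields 91 training and 39 test samples, matching the sizes reported in the text.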

Verification of Robustness of Risk Model by Internal Verification Set.
To further assess the impact on the overall survival (OS) of the training set of the model score constructed from the eight signatures, the median RS was taken as the cutoff, and the samples were divided into high-risk and low-risk groups. Scatter plots of the survival time and survival state of the training set and of the samples' RS were then drawn; together, these two scatter plots show the relationship between survival and score (Figures 6(a) and 6(b)), and the expression of model genes in the training set differs between the two risk groups (Figure 6(c)). The model's prognostic performance was checked with KM and ROC curves (Figures 6(d) and 6(e)). The prognosis of the samples in the high-risk group was worse, and p < 0.01 for the KM curves indicates a considerable difference in prognosis between the two groups. The AUC values of the risk model at 1, 3, and 5 years were 0.772, 0.802, and 0.904, respectively, indicating that the model score predicts survival well. Additionally, the test set of TCGA-LAML was used to check the ability of the RS to predict OS. Using the same method as for the TCGA training set, the samples were divided into high-risk and low-risk groups, and the survival differences were compared. The scatter diagram of survival time and survival state, the scatter diagram of sample RS, and the heat map of model gene expression in both risk groups of the test set are shown in Figures 7(a)-7(c). The prognosis of the high-risk group was worse than that of the low-risk group, with a substantial difference between the two groups (Figures 7(d) and 7(e)). In the TCGA test set, the AUC values at 1, 3, and 5 years were 0.733, 0.914, and 0.888, respectively. These results confirm that the prognostic performance of the model in the TCGA test set is stable and good. Finally, the whole TCGA-LAML set was used to check the ability of the RS to predict OS. With the same method as for the TCGA training set, the samples of the overall set were divided into high-risk and low-risk groups. The scatter diagram of survival time and survival state, the scatter diagram of sample RS, and the heat map of model gene expression in both risk groups of the whole set are shown in Figures 8(a)-8(c). The prognosis of the high-risk group was worse, with a considerable difference between the two risk groups (Figures 8(d) and 8(e)). In the overall TCGA set, the AUC values at 1, 3, and 5 years were 0.763, 0.827, and 0.905, respectively. These results confirm that the prognostic performance of the model in the overall TCGA set is stable and good.
Figure 3: ... (f) box diagram of immune infiltration differences between ultrasound-sensitive subtypes, red is C1 and green is C2; (g) box diagram of the distribution differences of matrix score, immune score, ESTIMATE score, and tumor purity between ultrasound-sensitive subtypes, red is C1 and green is C2.
... Combined with these two scatter plots, the relationship between survival and score can be observed. Figure 9(c) illustrates the expression of the model genes in both risk groups of the GSE71014 dataset. Figure 9(d) shows the KM curve of GSE71014. The samples in the high-risk group had a worse prognosis, and the KM curves of the two risk groups (p < 0.05) indicate a substantial difference in prognosis. In Figure 9(e), the AUC values at 1, 2, and 3 years are 0.616, 0.654, and 0.651, respectively, indicating that the model score retains prognostic value in this external dataset.
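The KM curves used throughout the validation can be computed with the product-limit estimator, sketched below in plain Python. This is an illustrative stand-in for the "survival" R package used in the study; the function name and the follow-up data are hypothetical.

```python
def kaplan_meier(times, events):
    """Product-limit survival estimate.

    times: follow-up in days; events: 1 = death observed, 0 = censored.
    Returns a list of (time, survival probability) at each distinct event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = removed = 0
        # Group all subjects sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve


# Invented follow-up data for five patients in one risk group.
curve = kaplan_meier(times=[5, 8, 8, 12, 20], events=[1, 1, 0, 1, 0])
```

Computing one such curve per risk group and comparing them with a log-rank test gives the kind of comparison reported for the training, test, and external sets.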

Prognostic Risk Models Associated with Multiple Tumor Characteristics
RS, an Independent Prognostic Factor.
The risk model constructed in this analysis showed good prognostic performance in the TCGA dataset and the GEO external validation set. To verify whether the RS can serve as an independent prognostic factor, it was combined with the age and gender of LAML patients in univariate and multivariate Cox regression analyses. Univariate Cox analysis was performed first, and the selected prognostic factors were then entered into the multivariate Cox analysis. The univariate Cox regression revealed significant differences for the prognostic model group and the age group compared with the reference, which indicates that they are independent prognostic factors (Figure 10(a)). Furthermore, nomograms (Figure 10(b)) were constructed from survival time and survival status in combination with clinical indicators. Age and RS were the clinical factors that contributed significantly. A calibration curve was constructed (Figure 10(c)) to assess the nomogram's accuracy; it showed that the prediction accuracy of the model at 1 and 3 years is high. Decision curve analysis (DCA) curves of different classification features were constructed to assess the prognostic accuracy of multiple clinical features. The results are illustrated in Figure 10(d).

The Model Risk Score Related to the Clinical Characteristics of the Tumor.
Based on the clinical features of age and gender in the TCGA-LAML dataset, the distribution differences of RS among clinical feature groups are shown in Figures 11(a) and 11(b); RS differed substantially between the age groups. In addition, for each of age, cluster, and gender, the TCGA dataset was divided into two subdatasets, and KM curves were drawn for each subdataset with the median RS as the cutoff. The KM curves revealed substantial differences in each subdataset, and the prognosis of the high-risk group was worse (Figures 11(c)-11(h)).

The Risk Model Related to the Expression of Immune Checkpoints.
Immune checkpoints are a group of molecules expressed in immune cells that regulate the level of immune activation and play a significant role in the occurrence of human autoimmunity. The correlation between the expression of five types of immune checkpoint genes (from TISIDB: chemokine, Immunoinhibitor, Immunostimulator, MHC, and receptor) and the expression of the eight model genes was analyzed, and a correlation heat map was constructed. Immunoinhibitor genes are the most commonly used immune checkpoints, and their correlation with model gene expression is shown in Figure 12(a); the model genes generally correlate strongly with the expression of immune checkpoints. In addition, box diagrams of four common immune checkpoints were drawn, and the differences in their expression between model groups were tested statistically. As shown in Figures 12(b)-12(e), the expression levels of CD274, BTLA, and CTLA4 differ considerably, and their expression is increased in the high-risk group.

Association between Model Grouping and the Proportion of Immune Infiltrating Cells.
Immune and stromal cells are the two primary categories of nontumor constituents in the tumor microenvironment, and both have the potential to be extremely helpful for tumor diagnosis and prognosis evaluation. Three algorithms were used to characterize immune infiltration. The ESTIMATE algorithm gave the immune score, matrix score, tumor purity, and ESTIMATE score: the three scores were noticeably higher in the high-risk group than in the low-risk group, while tumor purity was lower (Figure 13(a)). The difference in the proportion of immune cell infiltration between high-risk and low-risk groups was measured using the CIBERSORT and xCell algorithms. With CIBERSORT (Figure 13(b)), the infiltration proportions of 6 cell types differed substantially between the high-risk and low-risk groups; the heat map of infiltration proportions based on xCell (Figure 13(c)) shows that the infiltration proportions of 24 cell types differed considerably between the groups.

The Expression of Model Genes is Linked with the Proportion of Immune Cell Infiltration.
The grouping of the risk model depends on the expression of model genes. Studying the association between the immune microenvironment and model gene expression can show how gene expression affects cancer prognosis. Based on the immune cell infiltration proportions from the CIBERSORT algorithm, the correlation coefficients between the expression of model genes in LAML samples and the proportion of immune cell infiltration were calculated to represent the significance of gene expression in clinical immunology. The correlations between the 8 model genes and the infiltration proportions of 23 immune cell types are illustrated as a heat map (Figure 14(a)). Additionally, scatter diagrams showing the relationship between the ESTIMATE score and the expression of model genes were created, and two model genes with high correlation coefficients were selected for display (Figures 14(b) and 14(c)). See the annex for other results.
Figure 9: GEO dataset verifying the prognostic efficacy of the model: (a-c) the risk triple plot of the GSE71014 dataset, namely the risk score scatter plot, the survival time scatter plot, and the expression heat map of model genes in the RS groups, respectively. Yellow indicates the high-risk group and green the low-risk group; (d-e) KM curve (yellow for the high-risk group and green for the low-risk group) and ROC curve of the GSE71014 dataset.

Differences in Genomic Mutations.
Gene mutations can promote and lead to the occurrence of cancer or act together to drive its malignant proliferation. The investigation of genome-level mutations is crucial for the development of novel tumor therapies and tumor-targeted drugs. To show the distribution of somatic variation among samples in the high-risk and low-risk groups, and the distribution of gene mutations among samples with different clinical characteristics, the 30 genes with the highest mutation frequency in the high-risk and low-risk groups were selected to draw waterfall diagrams (Figures 15(a) and 15(b)). The frequency of gene mutation was observed to be substantially higher in the high-risk group than in the low-risk group.
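Selecting the most frequently mutated genes per risk group can be illustrated as follows. This is a sketch under stated assumptions rather than the maftools workflow used in the study: mutation frequency is taken to be the fraction of samples in a group carrying at least one mutation in the gene, and the sample identifiers and mutation records are hypothetical toy data.

```python
from collections import defaultdict


def top_mutated_genes(mutations, sample_group, top_n=30):
    """Rank genes by the fraction of mutated samples within each group.

    mutations    : iterable of (sample_id, gene) records, duplicates allowed
    sample_group : dict mapping every sample_id to its risk group
    """
    # Denominator: all samples in the group, including those with no mutation.
    group_sizes = defaultdict(int)
    for group in sample_group.values():
        group_sizes[group] += 1

    # Numerator: distinct samples with at least one mutation in the gene.
    mutated = defaultdict(lambda: defaultdict(set))
    for sample, gene in mutations:
        group = sample_group.get(sample)
        if group is not None:
            mutated[group][gene].add(sample)

    result = {}
    for group, genes in mutated.items():
        freqs = [(gene, len(samples) / group_sizes[group]) for gene, samples in genes.items()]
        # Highest frequency first; gene name breaks ties deterministically.
        freqs.sort(key=lambda item: (-item[1], item[0]))
        result[group] = freqs[:top_n]
    return result


# Hypothetical toy records (not data from the study).
sample_group = {"S1": "high", "S2": "high", "S3": "low", "S4": "low"}
mutations = [
    ("S1", "FLT3"), ("S1", "FLT3"),  # two variants in one sample count once
    ("S2", "FLT3"), ("S2", "NPM1"),
    ("S3", "NPM1"),
]
top = top_mutated_genes(mutations, sample_group, top_n=30)
for group, genes in top.items():
    print(group, genes)
```

Counting distinct samples rather than variants keeps a single heavily mutated sample from inflating a gene's frequency, which matches how waterfall plots usually report the percentage of altered samples per gene.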

Model Scores Correlated with Hallmark Pathway Enrichment.
Based on the expression profile of LAML samples, the hallmark pathway enrichment scores were calculated. Combined with the model score information, we explored the correlation between the RS and the enrichment scores, as well as the variation in pathway enrichment between the high-risk and low-risk groups, which helps to analyze the association between cancer hallmark pathways and prognosis. The RS was significantly positively correlated with the hallmark pathway score (Figure 16(a)), and the enrichment scores of 30 pathways (Figure 16(b)) differed significantly between the model groups.

Model Score Predicting the Therapeutic Effect of Patients
Analysis of Chemotherapeutic Drug Resistance. The expression profile data of TCGA-LAML were used to predict the IC50 values, a measure of drug sensitivity, of 138 drugs in the GDSC database. The IC50 values of 60 drugs differed significantly between the high-risk and low-risk groups (Figure 17(a)). According to the model grouping results and the IC50 values, six chemotherapeutic drugs were selected for display. The results revealed that the IC50 values of the high-risk group are generally higher than those of the low-risk group (Figures 17(b)-17(g)), whereas the results for other drugs are shown in the annex.
Figure 11: Clinical characteristics related to model scores: (a-b) the distribution of RS in age and gender groups, respectively; (c-h) KM curves of age, cluster, and gender subdatasets in feature grouping, respectively. The yellow color is for the high-risk group and the green is for the low-risk group.
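The per-drug comparison of predicted IC50 values between risk groups can be sketched with a rank-based two-sample test. The text does not name the test that was used, so the Mann-Whitney U test below is an assumption. The p-value uses the normal approximation without tie or continuity correction, which is only rough for small groups, and the IC50 values are hypothetical toy numbers.

```python
from math import erfc, sqrt


def mann_whitney_u(high, low):
    """Return (U, two-sided p) comparing IC50 values of two groups.

    U counts the (high, low) pairs in which the high-risk value is larger;
    tied pairs contribute 0.5. The p-value relies on the normal approximation
    and ignores the tie correction (assumption, adequate only as a sketch).
    """
    u = 0.0
    for x in high:
        for y in low:
            if x > y:
                u += 1.0
            elif x == y:
                u += 0.5
    n1, n2 = len(high), len(low)
    mean_u = n1 * n2 / 2
    sd_u = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mean_u) / sd_u
    p_two_sided = erfc(abs(z) / sqrt(2))
    return u, p_two_sided


# Hypothetical predicted IC50 values for one drug (not data from the study).
ic50_high_risk = [5.1, 6.0, 6.4, 7.2]
ic50_low_risk = [3.0, 3.8, 4.4, 4.9]

u_stat, p_value = mann_whitney_u(ic50_high_risk, ic50_low_risk)
print(f"U = {u_stat}, two-sided p = {p_value:.4f}")
```

When the same test is repeated across all 138 drugs, the resulting p-values would normally be adjusted for multiple testing before drugs are declared significantly different.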

Discussion
Acute myeloid leukemia (LAML) is a rapidly developing malignant tumor of the hematopoietic system. It is believed to originate from a single hematopoietic stem cell or progenitor cell. Although the normal differentiation process is blocked for various reasons, the cell still grows rapidly and divides continuously. The resulting cells are immature and lack normal function, thus affecting the hematopoietic function of the body [22]. Chemotherapy is a crucial therapy for the treatment of tumors, and the main cause of its failure is the development of resistance to chemotherapy in tumor cells [23,24]. In LAML patients, drug resistance to chemotherapy often manifests as relapse after remission and transformation into refractory leukemia [25]. At present, the molecular mechanism that mediates the transformation of LAML cells from chemotherapy sensitivity to drug resistance is still not completely clear. Therefore, finding biomarkers with prognostic value for LAML is very important for determining relevant drug targets for treatment intervention and for overcoming treatment resistance.
Figure 13: Variation in the proportion of immune infiltrating cells between model groups: (a) box diagrams of matrix score, immune score, ESTIMATE score, and tumor purity of the high-risk and low-risk groups (red for the high-risk group and green for the low-risk group); (b) box diagram of the proportion of immune infiltrating cells in the high-risk and low-risk groups according to the CIBERSORT algorithm (red for the high-risk group and green for the low-risk group); (c) heat map of the difference between the high-risk and low-risk groups in the proportion of immune infiltration according to the xCell algorithm.
Figure 17: Drug sensitivity differences between model groups: (a) IC50 heat map of the high-risk and low-risk groups in the TCGA-LAML cohort, in which red represents high drug sensitivity and green represents low sensitivity; (b-g) differences in the distribution of IC50 values of six chemotherapeutic drugs between the high-risk and low-risk groups (red for the high-risk group and green for the low-risk group).
Ultrasound examination can quickly and accurately assess the size and depth of tumors and clarify the extent of involvement of deep tissues [26]. Recently, ultrasonic medicine has broken through the limitations of traditional ultrasonic imaging diagnosis and entered the "nano" era [27,28]. Ultrasound-assisted tumor diagnostics and treatment have also given deep drug delivery in tumors a new direction, enhancing local drug concentration to achieve targeted therapeutic goals while minimizing side effects [29]. Compared with the commonly used response evaluation criteria in solid tumors (RECIST), ultrasound can evaluate the efficacy of antiangiogenesis drugs in tumor patients earlier and more conveniently [30]. Overall, ultrasound plays a significant role in the diagnosis, treatment, and prognosis of tumors. In this study, two ultrasound-sensitive molecular subtypes with significant survival differences were identified based on the differential genes of ultrasonic treatment of LAML, and the prognosis of patients in Cluster 2 was significantly better than that of patients in Cluster 1. Then, the immune cell infiltration of the two subtypes was further analyzed. The results revealed substantial variations in the immune microenvironment between the two subtypes, which may be the reason for the survival differences between them. Next, WGCNA of the dysregulated genes between subtypes identified two key modules associated with the two main clinical features of LAML. Finally, based on the prognostic factors significantly related to LAML, we constructed an 8-gene signature RS model composed of GASK1A, LPO, LTK, PRRT4, UGT3A2, BLOCK1S1, G6PD, and UNC93B1 to evaluate the prognosis of patients with LAML. As a secretory protein kinase, GASK1A is expressed in basal epithelial cells; it is not only related to the occurrence and development of tumors but may also cause chemotherapy resistance in some tumor patients [31,32].
Lipid peroxide (LPO) is formed when polyunsaturated fatty acids react with oxygen free radicals in the body, and its level is correlated with poor prognosis and disease invasiveness in breast cancer patients [33]. Studies have revealed that inducing a burst of LPO and ferroptosis in tumors can kill drug-resistant cancer cells and effectively improve the efficacy of treatment and the prognosis of chemotherapy-resistant patients [34]. The CLIP1-LTK fusion gene can be used as a therapeutic target of lorlatinib in patients with non-small cell lung cancer [35]. Abnormal activation and mutation of LTK regulate the growth and apoptosis of tumor cells and affect the occurrence and progression of many types of tumors [36]. LTK is a commonly upregulated target gene in stages I-IV of hepatocellular carcinoma and is mainly involved in tumor immunity and signal transduction [37]. LTK mutations may cause myeloma and can be used as biomarkers to detect specific targets of myeloma [38]. In addition, LTK is closely related to the pathogenesis of LAML [39]. PRRT4 is considered a new prognostic biomarker for gastric cancer [40]. UGT3A2 may detoxify polycyclic aromatic hydrocarbons in the human body, and its mutation increases the carcinogenic risk of polycyclic aromatic hydrocarbons [41,42]. The expression level of UGT3A2 is related to DNA methylation and affects the occurrence and development of LAML [43]. An 11-gene signature that includes UGT3A2, established based on the immune microenvironment, can be used to predict the prognosis of thymoma patients [44]. BLOCK1S1, also known as GCN5L1, is a new molecule homologous in sequence to the nuclear acetyltransferase GCN5, and it is involved in the regulation of mitochondrial autophagy, fatty acid oxidation, and other mitochondrial biological processes [45-47].
BLOCK1S1 can also regulate the occurrence and development of hepatocellular carcinoma through glutamine metabolism, and its expression level is related to the prognosis of patients [48]. G6PD plays a role in cell cycle regulation (cell growth and death) and is related to tumorigenesis and malignant progression; in addition, it is an indicator of poor tumor prognosis [49,50]. According to various studies, G6PD can promote the proliferation of LAML cells and drug resistance in patients [51]. The role and mechanism of UNC93B1 in tumors are still unknown, and no substantial study is available. In conclusion, nearly every one of the eight genes examined in this study is strongly linked to the development, progression, and prognosis of different cancers, with LTK, UGT3A2, and G6PD in particular thought to be crucial in LAML.
This study is the first to develop a prognosis model for LAML based on the differential genes of subtypes that are sensitive to ultrasound therapy, which offers a fresh perspective on the disease's molecular mechanism and prognosis prediction. The model was established through the comprehensive analysis of multiple datasets, which supports its reliability, and the multigene model has a higher prognostic value than a single gene. However, this study still has some limitations. Firstly, the clinical information used in this study comes from the TCGA and GEO databases, in which most patients are white, African, or Latin American; thus, care must be taken when applying our findings to patients of other races. Secondly, because this is a retrospective study, some data loss and selection bias cannot be avoided. Finally, the model is still in the theoretical stage, and more experiments are needed in the future to further verify its clinical prognostic value.

Conclusions
In this study, the ultrasound-sensitive subtypes of TCGA-LAML were identified for the first time based on ultrasound-sensitive genes, and an 8-gene signature RS model was constructed to evaluate the prognosis of patients with LAML.

Data Availability
The data used to support the findings of this study are available from the corresponding author on reasonable request.