Expression of Mucin Family Proteins in Non-Small-Cell Lung Cancer and its Role in Evaluation of Prognosis

Lung cancer is still the major contributor to cancer-related mortality. Over 85% of patients suffer from non-small-cell lung cancer (NSCLC). Mucins (MUCs) are large secreted or membrane-bound glycoproteins produced by epithelial cells in normal and malignant tissues. They are the major components of the mucous gel that covers the surface of the respiratory epithelium. Certain MUCs have been used or proposed as biomarkers for lung cancer. Nevertheless, the expression, messenger ribonucleic acid (mRNA) levels, and prognostic value of MUCs in NSCLC have yet to be investigated systematically. In this research, the biological information of MUC proteins in patients with NSCLC was examined using a series of databases. The results based on gene expression profiling interactive analysis (GEPIA) illustrated that the expression of MUC3A, MUC4, MUC5B, MUC13, MUC16, and MUC21 mRNAs was markedly upregulated in lung adenocarcinoma (LUAD) patients, whereas MUC1 expression was downregulated in lung squamous cell carcinoma (LUSC) patients. Kaplan–Meier plotter (KM Plotter) analysis revealed that elevated mRNA expression levels of MUC3A and MUC16 were linked to unfavourable overall survival (OS) in NSCLC, while increased mRNA expression of MUC1 and MUC15 was linked to good OS, especially in LUAD patients. In addition, differential expression of MUC1, MUC3A/3B, MUC8, MUC12, MUC15, and MUC16 mRNA was linked to the prognoses of NSCLC patients with varied clinical-pathological subtypes. According to the cBioPortal for cancer genomics, genetic alterations of MUCs in NSCLC primarily involved mutations, fusion, amplification, deep deletion, and multiple alterations. Therefore, we propose that combinations of MUC proteins can act as prognostic biomarkers and have therapeutic potential in NSCLC.


Introduction
Lung cancer continues to be one of the world's most fatal cancers. The most frequently diagnosed histological subtype of lung cancer is non-small-cell lung cancer (NSCLC), which is responsible for over 85 percent of all lung cancer cases. NSCLC has two main histological phenotypes, namely, lung adenocarcinoma (LUAD, accounting for around 50% of all cases) and lung squamous cell carcinoma (LUSC, accounting for around 40% of all cases) [1]. A majority of patients with early-stage lung cancer are typically asymptomatic or demonstrate distant metastasis at the first diagnosis. Patients diagnosed with metastatic NSCLC had a 5-year overall survival rate of lower than 5% in the previous decade [2]. Although the tumour-node-metastasis (TNM) staging system helps to decide suitable strategies for NSCLC treatment, the survival rates among NSCLC patients who are at the same stage and receiving the same therapy might vary remarkably [1]. Hence, it is crucial to explore effective tumour biomarkers for assisting early diagnosis, prognosis evaluation, and appropriate treatment for NSCLC.
Mucins (MUCs) are a group of high-molecular-weight glycoconjugates that protect epithelial cells by forming a physical barrier. However, recent research shows that they are involved in tumour development, tumour cell growth, and immune escape through altered localization or glycosylation patterns [3]. To date, 21 MUC genes have been discovered in humans and confirmed by the HUGO Gene Nomenclature Committee (HGNC). Some MUC genes have already been demonstrated to have prognostic value in different cancer types. For instance, MUC16 (CA125) is a well-known cancer biomarker contributing to disease progression and metastasis in several malignancies [4,5]. MUC12 was identified as a candidate gene involved in colorectal cancer (CRC) metastasis and was an independent prognostic factor in stages II and III CRC [6]. The elevated expression level of MUC15 was linked to survival in stomach adenocarcinoma [7]. MUC13 is commonly dysregulated in diverse epithelial carcinomas, including gastric, colorectal, and ovarian malignancies [8]. Jonckheere et al. discovered an MUC4/MUC16/MUC20 signature that was associated with poor survival in pancreatic, colon, and stomach cancers [9]. MUC21 was considered a potential biomarker for assisting LUAD diagnosis and treatment [10].
Nevertheless, the expression, prospective functions, and prognostic significance of MUCs in NSCLC are still contentious and have not been explored systematically. This might be attributed to the complexity of MUC biology and the existence of multiple MUCs with differing functions within different cells at various stages [11]. We hypothesised that combinations of MUC proteins could act as prognostic biomarkers for NSCLC treatment. Considering that MUCs located in the human airway are possibly involved in the development of NSCLC, this study selected 19 human MUC genes (MUC1, MUC2, MUC3A, MUC3B, MUC4, MUC5AC, MUC5B, MUC6, MUC7, MUC8, MUC12, MUC13, MUC15, MUC16, MUC17, MUC19, MUC20, MUC21, and MUC22) that are highly abundant in the human airway for bioinformatics analysis based on the expression profiles of LUSC and LUAD patients. Several online data-mining tools were used to investigate the expression, function, and prognostic value of the MUC family members in NSCLC (Figure 1).

Analysis of Gene Expression Profiles.
NSCLC cohorts with gene expression profiles, gene variation data, and clinical information were used in this study. GEPIA (gene expression profiling interactive analysis, https://gepia.cancer-pku.cn) [12] was employed to analyse the RNA sequencing expression profiles of NSCLC and adjacent tumour tissues from the Cancer Genome Atlas (TCGA) and Genotype-Tissue Expression (GTEx) project. The MUC expression in tumour and normal specimens was compared using the Student's t-test, and the MUC expression in various stages of NSCLC was investigated using the F-test. P < 0.01 and a fold change (FC) > 2 were set as the thresholds for a significant difference. Additionally, MUC protein expression profiles available from the Human Protein Atlas database (HPA) (https://www.proteinatlas.org/) were compared to identify matched expression at the protein and mRNA levels. In the annotated cell types, antibody (Ab) staining levels ranged from not detected through low and medium to high. The staining intensity and the proportion of stained cells were used to compute the score [13][14][15].
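The thresholding rule described above (Student's t-test, P < 0.01, FC > 2) can be illustrated with a minimal sketch. It is not GEPIA's implementation: the function names are hypothetical, the fold change is taken as a simple ratio of group means, and treating a decrease below 1/FC as significant is an assumption made here so that downregulated genes are also captured. The P-value itself would come from a t-distribution routine in a statistics package and is passed in as an argument.

```python
import math
from statistics import mean, variance


def student_t(tumour, normal):
    """Two-sample Student's t statistic (pooled variance) and degrees of freedom."""
    n1, n2 = len(tumour), len(normal)
    df = n1 + n2 - 2
    pooled = ((n1 - 1) * variance(tumour) + (n2 - 1) * variance(normal)) / df
    se = math.sqrt(pooled * (1 / n1 + 1 / n2))
    return (mean(tumour) - mean(normal)) / se, df


def fold_change(tumour, normal):
    """Ratio of mean tumour expression to mean normal expression."""
    return mean(tumour) / mean(normal)


def is_differential(p_value, fc, p_cut=0.01, fc_cut=2.0):
    """True when P < p_cut and expression changes by more than fc_cut.

    Assumption: changes in either direction count (fc > 2 or fc < 1/2).
    """
    return p_value < p_cut and (fc > fc_cut or fc < 1 / fc_cut)
```

A gene would be reported as differentially expressed only when both conditions hold, so a large fold change with a weak P-value is rejected.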

Prognostic Analysis.
The Kaplan-Meier (KM) plotter (https://www.kmplot.com) is a platform for analysing the impact of 54 k genes (protein, miRNA, and mRNA) on survival in 21 distinct kinds of cancer, such as gastric (n = 1,440), lung (n = 3,452), ovarian (n = 2,190), and breast (n = 6,234) cancers, using data from the Gene Expression Omnibus (GEO), the European Genome-phenome Archive (EGA), and TCGA databases. The KM plotter's primary objective is to undertake a meta-analysis-based identification and verification of survival biomarkers [16]. The KM plotter and GEPIA were both employed to assess the predictive significance of MUC mRNA expression. We also analysed the disease-free survival (DFS) and OS of NSCLC patients. Patient specimens were divided into high- and low-expression groups based on median mRNA expression, and log-rank P-values and hazard ratios (HR) with 95% confidence intervals (CI) were calculated [17,18]. Statistical significance was established as a log-rank P-value < 0.05. Univariate Cox analysis was undertaken in several subgroups defined by clinicopathological features, namely, sex, chemotherapy, clinical stage, and smoking status, among patients with NSCLC.
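The median split and log-rank comparison described above can be sketched as follows. This is an illustrative pure-Python version under stated assumptions, not the KM plotter's code: the function names are hypothetical, samples exactly at the median are assigned to the low group, and the hazard ratio and its confidence interval are not computed. The P-value uses the fact that, for one degree of freedom, the chi-square upper tail equals erfc(sqrt(x/2)).

```python
import math
from statistics import median


def median_split(expression):
    """Label each sample True (high) if its expression exceeds the cohort median."""
    cut = median(expression)
    return [x > cut for x in expression]


def logrank(times, events, high):
    """Two-group log-rank test; returns (chi-square statistic, P-value).

    times  : follow-up time per sample
    events : 1 if death observed, 0 if censored
    high   : True for the high-expression group
    """
    obs_minus_exp = 0.0
    var = 0.0
    event_times = sorted({t for t, e in zip(times, events) if e})
    for t in event_times:
        at_risk = [(ti, e, h) for ti, e, h in zip(times, events, high) if ti >= t]
        n = len(at_risk)
        n1 = sum(1 for _, _, h in at_risk if h)
        d = sum(1 for ti, e, _ in at_risk if e and ti == t)
        d1 = sum(1 for ti, e, h in at_risk if e and ti == t and h)
        obs_minus_exp += d1 - d * n1 / n
        if n > 1:
            var += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    if var == 0:
        return 0.0, 1.0
    chi2 = obs_minus_exp ** 2 / var
    return chi2, math.erfc(math.sqrt(chi2 / 2))


# Hypothetical toy cohort: high expressers die early, low expressers late.
expression = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
times = [20, 22, 24, 26, 28, 1, 2, 3, 4, 5]
events = [1] * 10
chi2, p = logrank(times, events, median_split(expression))
```

With a real cohort, a P-value below 0.05 from this test would correspond to the significance criterion stated above.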

Analyses of the Frequency of Gene Mutations.
MUC gene mutations in patients with NSCLC were visualized and analysed using the cBioPortal for cancer genomics (https://www.cbioportal.org) [19,20]. Genomic profiles were selected by querying individual MUC gene symbols across parameters such as cancer studies, levels of mRNA expression, putative copy-number alterations (CNA), and mutations.
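As a rough illustration of what an alteration frequency means in this context, the sketch below counts the fraction of samples carrying at least one alteration in a given gene. The data layout and sample identifiers are hypothetical and are not the cBioPortal data format or API.

```python
def alteration_frequency(profiles, gene):
    """Fraction of samples with at least one alteration recorded for `gene`."""
    if not profiles:
        return 0.0
    altered = sum(1 for calls in profiles.values() if calls.get(gene))
    return altered / len(profiles)


# Hypothetical toy data: sample ID -> {gene: list of alteration types}
toy_profiles = {
    "S1": {"MUC16": ["mutation"]},
    "S2": {"MUC16": ["mutation", "amplification"], "MUC4": ["amplification"]},
    "S3": {},
    "S4": {"MUC4": ["deep deletion"]},
}
muc16_frequency = alteration_frequency(toy_profiles, "MUC16")
```

A sample with several alteration types in the same gene (the "multiple alterations" category) is counted once.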

Bioinformatics Analysis and Functional Enrichment.
For gene-level correlation analysis, GeneMANIA (https://www.genemania.org), a biological network integrative platform for the prioritization of genes and prediction of their functions, was utilized [21]. We conducted gene ontology (GO) term and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analyses [22,23] with DAVID version 6.8 (https://david.ncifcrf.gov/tools.jsp).
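The idea behind term enrichment can be sketched with a plain hypergeometric upper-tail probability: given a background of genes, how likely is it to see at least the observed number of query genes annotated with a term by chance? DAVID reports a modified Fisher exact P-value (the EASE score), so the function below is a simplification, and its name and arguments are hypothetical.

```python
from math import comb


def enrichment_p(population, annotated, list_size, hits):
    """P(X >= hits) for a hypergeometric draw.

    population : number of background genes
    annotated  : background genes carrying the term
    list_size  : number of genes in the query list
    hits       : query genes carrying the term
    """
    total = comb(population, list_size)
    upper = min(list_size, annotated)
    tail = sum(
        comb(annotated, i) * comb(population - annotated, list_size - i)
        for i in range(hits, upper + 1)
    )
    return tail / total
```

A small P-value indicates that the term is over-represented in the query list relative to the background.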

Statistical Analysis.
All statistical analyses were performed within the online bioinformatics tools. Student's t-test was used for comparisons between two groups. The ANOVA test was used for comparisons among three or more groups. The log-rank test was used in Kaplan-Meier survival analysis. P < 0.05 was considered significant.
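For comparisons across three or more groups, such as expression across tumour stages, the one-way ANOVA F statistic can be computed as in the minimal sketch below. The function name is hypothetical, and the P-value, which requires the F distribution from a statistics package, is not computed here.

```python
from statistics import mean


def anova_f(groups):
    """One-way ANOVA: returns (F statistic, (df_between, df_within))."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand = mean(x for g in groups for x in g)
    # Variation of group means around the grand mean
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    # Variation of samples around their own group mean
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between, df_within = k - 1, n_total - k
    return (ss_between / df_between) / (ss_within / df_within), (df_between, df_within)
```

For example, `anova_f([stage_i, stage_ii, stage_iii, stage_iv])` would take four lists of expression values, one per stage.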

Levels of MUC mRNA in Patients with NSCLC.
GEPIA was utilized to analyse the relative MUC mRNA expression in LUSC and LUAD as opposed to that in normal tissues. MUC3A, MUC4, MUC5B, MUC13, MUC16, and MUC21 mRNA expression levels were considerably elevated in LUAD as opposed to those in normal lung specimens. In addition, the MUC20 mRNA expression level was remarkably increased in both LUSC and LUAD in contrast with that in normal lung specimens. Contrastingly, the MUC1 mRNA expression level was considerably reduced in LUSC as opposed to that in normal lung samples (Figure 2(a)).
MUC expression was also studied across stages I, II, III, and IV of NSCLC (Figure 2(b)). The findings illustrated that the levels of MUC1 and MUC5B mRNA expression changed considerably across various tumour stages (P < 0.05). In particular, the expression level of MUC5B in stage IV was almost twice that in stage II. However, the mRNA expression of other MUC genes did not differ among tumour stages. MUC mRNA expression at different clinical stages was also studied in LUSC and LUAD. The findings illustrated that the mRNA expression level of MUC1 changed significantly across various tumour stages in LUSC, being higher at stages I and IV than at stages II and III. Furthermore, the levels of

Prognostic Significance of MUC mRNA Levels in NSCLC.
MUC levels were evaluated for prognostic significance using KM plotter analysis in both the whole NSCLC cohort and the LUSC and LUAD subtypes. Increased MUC1 and MUC15 mRNA expression levels were linked to a favourable OS in the whole cohort. In contrast, increased MUC2, MUC3A, MUC12, MUC16, and MUC17 mRNA expression was strongly linked to unfavourable OS in NSCLC (Figures 3(a)-3(g)). In addition, increased MUC1 and MUC15 mRNA levels were linked to favourable OS, and elevated MUC3A, MUC8, MUC12, MUC13, MUC16, and MUC17 mRNA levels were linked to unfavourable OS among LUAD patients (Figures 4(a)-4(h)). Moreover, elevated mRNA levels of MUC19 were considerably linked to unfavourable OS among patients with LUSC (Figure 4(i)). Notably, these results indicated that MUC1, MUC3A, MUC8, MUC12, MUC13, MUC15, MUC16, and MUC17 perform different prognostic functions in LUAD.
Additionally, members of the MUC family were verified utilizing NSCLC data acquired from the GEPIA database. As depicted in Figure 5 Figure 5(b)). An increase in MUC5AC mRNA expression was linked to adverse DFS in LUAD patients, whereas an increase in MUC21 mRNA expression was related to good DFS in LUAD patients. Besides, increased MUC12 mRNA expression correlated with unfavourable DFS in LUSC patients.

MUC mRNA Level Prognostic Significance in NSCLC Subsets with Various Clinical-Pathological Characteristics.
The association between MUC mRNA expression levels and different clinical-pathological features, which include chemotherapy, clinical stage, smoking history, and sex, was evaluated in the NSCLC subsets. MUC3A and MUC3B were covered by the same probe in the KM plotter, whereas MUC21 and MUC22 were not available on the platform. It was observed that a high MUC15 mRNA level was linked to favourable OS in patients with a smoking history in LUAD. In contrast, high MUC12 and MUC16 mRNA levels were linked to unfavourable OS among patients with a smoking history in LUAD. High MUC8 mRNA levels were related to unfavourable OS in patients with a smoking history in LUSC, whereas high MUC15 mRNA levels were linked to favourable OS in smokers with LUAD (Supplementary Tables 1A-1C). MUC3A/3B, MUC5B, MUC8, MUC12, and MUC13 mRNA expression had a considerable link to unfavourable OS in patients with early-stage LUAD. However, MUC15 and MUC19 were linked to good OS in patients with stage I LUAD. MUC3A/3B and MUC19 were linked to unfavourable OS in patients with stage I and II LUSC, respectively. These findings indicated that MUC3A/3B and MUC19 performed a prognostic function in early-stage NSCLC (Supplementary Tables 2A-2C). High MUC1 and MUC3A/3B mRNA levels correlated with favourable OS, and increased levels of MUC16 mRNA were considerably linked to unfavourable OS in NSCLC patients without chemotherapeutic treatment. The prognostic value of MUC levels in LUSC and LUAD subsets of patients with or without chemotherapy could not be assessed because the total sample number was low (Supplementary Table 3). Interestingly, increased levels of MUC1 and MUC15 mRNA were remarkably related to favourable OS in male patients with LUAD. However, in female patients, MUC3A/3B, MUC8, and MUC12 correlated with unfavourable OS. Elevated MUC16 mRNA levels in male LUAD patients were substantially linked to poor OS (Supplementary Tables 4A-4C).

MUC Gene Alterations in NSCLC.
MUC genetic alterations that are regularly present in NSCLC patients were studied in the cBioPortal.

MUC Gene Enrichment Analysis in NSCLC.
The functions of MUC genes were analysed with DAVID. Twelve GO terms were found to be enriched (Figure 6(f)). MUC genes were enriched in the biological processes (BP) of O-glycan processing and maintenance of the gastrointestinal epithelium. The associated molecular functions (MF) were extracellular matrix structural constituent and lubricant activity. The Golgi lumen, extracellular exosome, extracellular space, extracellular region, apical plasma membrane, integral component of membrane, vesicle, and mucus layer were the cellular components (CC) associated with MUCs. The salivary secretion pathway was enriched for MUCs in KEGG.

MUCs in the Human Airway.
Inherited genetic susceptibility and environmental carcinogens are important factors in lung cancer aetiology. Differential expression of all these factors demonstrates population heterogeneity. MUCs are glycoproteins synthesized by mucosal epithelial cells. The expression of MUCs promotes cell invasion and metastasis and is regarded as a risk factor, indicating a poor prognosis. Lung cancer is among the most fatal tumours globally, and LUAD is the most prevalent subtype. Histological classification and early diagnosis are required for individualised treatments [24]. Various cancer treatment strategies, including molecular targeted therapy, stem cells, vaccines, oncolytic immunotherapy, and genetic therapy, are regarded as promising modalities, especially for patients whose lung cancer is at an advanced stage. Specific biomarkers and accurate diagnosis

MUCs in NSCLC.
It was shown that MUC1, the most highly expressed MUC in lung cancer, was expressed specifically in invasive lepidic predominant adenocarcinoma (LPA) [25]. Moreover, the depolarization of cells impacted MUC1 expression in lung cancer progression [11]. Besides, the specificity and efficacy of the prostate stem cell antigen (PSCA)- and MUC1-targeting chimeric antigen receptor (CAR) T cells against NSCLC cell lines in vitro were confirmed [26]. It is also known that MUC1-C ⟶ PD-L1 signaling promotes the inhibition of CD8 T cell activation [27]. Therefore, MUC1 would be a highly attractive antigen for the development of effective anticancer vaccines and a potential molecular target for reprogramming the tumour microenvironment. Our study demonstrated that the MUC1 mRNA expression was remarkably lower in LUSC as opposed to that in normal lung specimens, and differential MUC1 expression was observed during the tumour stage progressing from I to IV. Increased MUC1 and MUC15 mRNA levels were linked to favourable OS in LUAD patients.
MUC2 and MUC6 have been related to lymph node metastasis in LUAD patients [28]. Additionally, DNA hypomethylation was illustrated to perform an instrumental function in MUC3A expression in carcinomas [29]. Our study found increased MUC2 and MUC3A mRNA levels linked to unfavourable OS in LUAD patients.
MUC4 expression is independent of mucus secretion in both normal human airways and carcinomas before epithelial differentiation [30]. MUC4 correlated with a better OS; MUC4 seemed to play a potential protective role in early-stage LUAD [31,32]. MUC4-positive LUAD mediated by the human epidermal growth factor receptor (HER)2 signaling pathway might be a distinct LUAD subtype in patients with poor outcomes associated with smoking [33]. Our results showed that the MUC4 mRNA expression level in LUAD was considerably elevated as opposed to normal lung specimens. However, mRNA levels of MUC4 were not substantially linked to OS and DFS in patients with NSCLC. Owing to the conflicting evidence, further experiments are required to examine the molecular mechanism of whether MUC4 is oncogenic or tumour suppressive.
MUC5AC and MUC5B have been used as specific markers to detect central type LUAD and mucinous LUAD [34]. MUC5AC was found to be a significant determinant of a poor prognosis, especially in KRAS-mutant tumours [35]. In ALK+ lung cancer, there is a higher incidence of MUC1 and MUC5AC cytoplasmic expression, which, combined with a paucity of MUC2 and MUC6 expression, could lead to the biological aggressiveness of ALK+ cancer [36]. In a recent study, it was observed that histological subgroups were associated with ALK, KRAS, and MET mutations, and with immunohistochemical reactivity of MUC1, MUC5AC, and MUC6 among the Chinese population [36]. MUC production independently served as a prognostic indicator for the epidermal growth factor receptor (EGFR)-mutant LUAD that was characterised by negative MUC5AC staining and positive MUC5B staining [37]. Overexpressed MUC5AC in genetically engineered mouse LUAD tissues was associated with poor survival in comparison with normal lung tissues [38]. However, in our study, no difference in MUC5AC expression was observed between tumour and normal tissues or between tumour stages. Nonetheless, the elevated MUC5AC mRNA expression level was substantially linked to unfavourable DFS in patients with LUAD. Further research will help to clarify the exact role of the MUC5AC gene subtype. A combination of high expression of MUC5B with thyroid transcription factor (TTF)-1 negative cells was a valuable marker to predict a poor OS of patients with LUAD compared with that of patients with LUSC [39]. Besides, it has been shown that the polymorphism in the MUC5B promoter can act as a predictive marker of OS in NSCLC patients receiving radiotherapy [40]. In this investigation, the MUC5B mRNA expression level was almost twice as high in LUAD as in normal lungs and changed significantly across various tumour stages. In particular, the expression level of MUC5B in stage IV was almost twice that in stage II.
MUC6 was shown to be upregulated in the peritumoral epithelial tissues. Besides, the expression of MUC8, MUC5AC, and MUC4 was reduced in NSCLC [41]. MUC7 was related to cell differentiation in smoke-induced lung cancer [42]. In this study, the elevated MUC8 mRNA expression level in LUAD patients was notably linked to unfavourable OS.
Mutations in MUC12 have been observed at higher frequencies in familial lung cancer samples and lung cancer tissue than in the healthy population [43]. In this study, an increased expression of MUC12, MUC13, MUC16, and MUC17 mRNAs was substantially linked to unfavourable OS in patients with LUAD in the KM plotter. However, the result could not be fully confirmed in GEPIA; only MUC2, MUC12, and MUC16 could be associated with alteration in OS using this server. This might be attributed to the fact that the GEPIA database has a smaller sample size.
MUC16, which is primarily expressed on the human goblet cell surface, is overexpressed in patients with NSCLC and is often correlated with an unfavourable prognosis. MUC16 performs a meaningful function in metastasis and tumourigenesis in lung cancer by regulating TSPYL5 through JAK2/STAT3/GR [44]. A MUC16-mutant was resistant to matrix-metalloproteases (MMPs) that were released by LUAD cells. Furthermore, LUAD with both MMP- and MUC16-resistant mutant expression had an unfavourable prognosis [45]. The overexpression of MUC16 was correlated with familial lung cancer, indoor air pollution from coal combustion, higher metastasis, and an advanced stage. High MUC16 expression contributed to the capacity of lung cancer cells to proliferate, invade, resist chemotherapy, and migrate in experiments analyzing cell behaviour. However, the results varied among cell lines [46]. This research illustrated that the expression of MUC16 mRNA was higher in LUAD in contrast to that in normal tissues. MUC16 mRNA was remarkably linked to poor OS in patients with NSCLC and LUAD. Enhanced MUC16 mRNA expression was strongly linked to adverse OS in patients with smoking history, in those without chemotherapeutic treatment, and in males. The examination of the role of MUC20 in NSCLC is still incomplete. However, in endometrial cancer, MUC20 overexpression drives tumourigenesis and predicts poor survival [47], and it enhanced EGF-induced malignant phenotypes by activating the EGFR/STAT3 pathway [48]. Large-scale genomic dataset analyses demonstrated that the synergistic effect of MUC4, MUC16, and MUC20 was linked to a statistically significant reduction in OS and elevated HR in colon, stomach, and pancreatic cancers [9]. This research illustrated that the expression level of MUC20 mRNA was substantially elevated in NSCLC (both LUSC and LUAD) in contrast to that in normal lung samples, but exhibited no link to DFS or OS in NSCLC.
MUC21 is a novel transmembrane MUC that could be used as a negative immunohistochemical marker to differentiate mesothelioma from LUAD [49,50]. MUC21 could be a promising biomarker with potential diagnostic and therapeutic applications for LUAD showing cell incohesiveness [10]. MUC22 was shown to independently function as a specific prognostic indicator of OS in patients with LUSC [51].
Analyses of the link between MUC mRNA expression and DFS/OS in NSCLC patients, performed using two public datasets, exhibited similar results. However, the observations on MUC expression were not completely consistent among the different datasets. This could be attributed to the fact that GEPIA has a smaller sample size than the KM Plotter. It suggests that larger sample sizes and more detailed oncogenic driver-based subgroups should be considered in the future to improve the quality of the analysis. The link between MUC mRNA and a variety of clinical-pathological features was investigated. We found that MUC15 was linked to favourable OS in male LUAD patients and in those with a smoking history. In contrast, MUC12 was considerably related to unfavourable OS in LUAD patients with a smoking history, with stage I or II tumours, and in female patients.
Mutations in MUCs could be linked to tumourigenesis and cancer development, and MUCs may act as potential tumour suppressors and genetic biomarkers. Different types of commonly observed alterations in MUCs were analysed in NSCLC. High enrichment of amplification events in TCGA LUSC compared to other datasets suggests a role of MUC alterations in LUSC, especially those in MUC1, MUC4, and MUC20. However, the observed alterations did not have any correlation with OS or DFS. The results suggest that these gene mutations may not directly affect NSCLC prognosis. Additionally, MUC proteins were evaluated with network analysis to examine the potential molecular mechanisms of MUCs in NSCLC. MUC genes were primarily enriched in the O-glycan processing and maintenance of gastrointestinal epithelium pathways, highlighting their role as potential targets for anti-NSCLC therapeutics, especially for MUC-producing LUAD.

Prognostic Value of MUCs in NSCLC.
The study showed that the elevated expression of MUC1 and MUC15 was considerably linked to favourable OS in patients with NSCLC, particularly in patients with LUAD. The elevated MUC8, MUC12, and MUC16 expression levels were substantially related to poor OS in patients with LUAD (Supplementary Table 6). MUC4 and MUC16 were colocalized and coexpressed within the cell. The study points to individual clinical heterogeneity and the complexity of NSCLC signaling, and highlights the combination of associated MUCs as a potential tool for the determination of prognosis and for use in molecular targeted therapy for patients with lung cancer. More research will be required to investigate MUC protein expression in various oncogenic driver subtypes.
This study will aid in further evaluation of the molecular mechanisms of MUCs in NSCLC as well as in the exploration of the potential of MUC-based therapeutic targets for NSCLC treatment.

Data Availability
Publicly available datasets were used for this study. The analysis of gene expression profiling and prognosis was performed with the GEPIA database (https://gepia.cancer-pku.cn). Protein expression levels were obtained from HPA (https://www.proteinatlas.org/). The prognostic analysis was performed with the KM plotter tool (https://www.kmplot.com). The analysis of gene alteration frequency was performed using the cBioPortal for cancer genomics (https://www.cbioportal.org). Gene correlation analysis was conducted with GeneMANIA (https://www.genemania.org). The functional annotation and pathway enrichment analysis were performed with DAVID (https://david.ncifcrf.gov/).

Conflicts of Interest
The authors declare that they have no conflicts of interest. versity of Chinese Academy of Sciences for the support they have provided for the development of the study. The study was supported by the High-level Medical Research Personnel Training Project of Chongqing, P.R.C., Beijing Health Alliance Charitable Foundation of China (WS817D).