Gene Identification and Potential Drug Therapy for Drug-Resistant Melanoma with Bioinformatics and Deep Learning Technology

Background. Melanomas are malignant skin tumors that arise from melanocytes and are primarily treated with surgery, chemotherapy, targeted therapy, immunotherapy, and radiation therapy. Targeted therapy is a promising approach to treating advanced melanomas, but resistance always occurs. This study aimed to identify potential target genes and candidate drugs for drug-resistant melanoma using computational methods. Methods. Genes associated with drug-resistant melanoma were identified using the text mining tool pubmed2ensembl. Further gene screening was carried out by GO and KEGG pathway enrichment analyses. The PPI network was constructed using the STRING database and Cytoscape. GEPIA was used to perform the survival analysis and plot the Kaplan-Meier curves. Drugs targeting these genes were selected in Pharmaprojects. The binding affinity scores of drug-target interactions (DTI) were predicted by DeepPurpose. Results. A total of 433 genes were found to be associated with drug-resistant melanoma by text mining. The most significantly enriched GO and KEGG pathways contained 348 genes, and 27 hub genes were further screened out by MCODE in Cytoscape. Six genes were identified with statistical differences after survival analysis and literature review. Sixteen candidate drugs targeting the hub genes were found in Pharmaprojects under our restrictions. Finally, 11 ERBB2-targeted drugs with top affinity scores were predicted by DeepPurpose, including 10 ERBB2 kinase inhibitors and 1 antibody-drug conjugate. Conclusion. Text mining and bioinformatics are valuable methods for gene identification in drug discovery. DeepPurpose is an efficient and practical deep learning tool for predicting DTI and selecting candidate drugs.


Introduction
Melanoma is a severe malignant skin tumor that arises from melanocytes. It is the fifth most common malignant tumor in the United States and the leading cause of skin cancer-related deaths [1]. The prognosis of melanoma is highly correlated with the pathological stage at first diagnosis: patients with superficial melanoma (Breslow thickness ≤ 1 mm) have a higher cure rate [2]. Treatments of melanoma mainly include surgery, chemotherapy, targeted therapy, immunotherapy, oncolytic virus therapy, and radiation therapy [3]. Surgery remains an important treatment for melanoma. Wide excision is the classic surgical method for melanoma of the trunk and extremities [4][5][6][7][8]. Tumors growing in other body parts (head and neck, subungual, genitals, etc.) should be resected as thoroughly as possible, combined with postoperative reconstruction to improve appearance and function [9]. For advanced melanoma, resection of primary and metastatic tumors combined with adjuvant therapy (immunotherapy, targeted therapy, chemotherapy, etc.) has been shown to improve prognosis [10][11][12].
As mentioned above, advanced melanomas require comprehensive therapies. Chemotherapy has not been shown to improve survival in patients with advanced melanomas [3,13]. Radiation therapy is palliative for local symptoms [14,15]. To date, researchers have identified many abnormally expressed genes in melanoma, and databases such as GEO and TCGA contain sequencing data from melanoma specimens together with clinical data of patients. On this basis, new critical genes affecting patient survival can be screened out.
Figure 1: The research process of our study. From left to right, the text labels represent the analysis contents and corresponding tools.
Besides immunotherapy targets such as PD-1/PD-L1 and CTLA-4, the classic targets in melanoma are the BRAF- and MEK-related pathways. Mutations of BRAF (50%), NRAS (25%), and neurofibromin 1 (14%) are common in melanoma [16]. Small-molecule inhibitors include vemurafenib/dabrafenib for BRAF and trametinib/cobimetinib for MEK.
Figure 3: The protein-protein interaction of candidate genes. The protein-protein interaction analysis of candidate genes from STRING. Each circle represents one protein, and each line represents an interaction between proteins.
Disease Markers
However, resistance always occurs. The mechanisms of targeted drug resistance in melanoma are currently believed to include reactivation of the MAPK pathway [17], activation of alternative pathways (PI3K-mTOR pathway) [18][19][20][21], alteration of the tumor microenvironment [22][23][24], autophagy and ER stress of tumor cells [25][26][27], miRNA-mediated resistance [28,29], and therapy-mediated selection of resistant tumor cell subpopulations [30]. The mechanisms of immunotherapy resistance include the immune-desert and immune-excluded tumor phenotypes, increased regulatory cells in the tumor microenvironment, increased immunosuppressive cytokines, and upregulation of inhibitory receptors on T cells [31]. When targeted drugs or immune checkpoint inhibitors are ineffective, current regimens often combine drugs, such as MAPK pathway inhibitors and immune checkpoint drugs, to achieve better survival than single agents [32,33]. However, the shortcomings are apparent: the drugs available for melanoma are limited, and so is the benefit of combining them. If no new drugs are explored, patients will inevitably run out of treatment options.
Other targeted drugs, such as those directed at ERBB2, the classic target gene in breast cancer, have not been thoroughly studied or validated in the treatment of melanoma. These drugs therefore offer new hope and a convenient starting point for exploring new treatment options for melanoma.
Traditional approaches to discovering a new drug are time-consuming and expensive, which places a substantial financial burden on society and delays effective treatment for patients [34]. Finding a new drug is technically difficult, since the number of drug-like molecules is estimated to be as high as 10^60 [35]. In recent decades, emerging computational methods have been considered promising for the early stage of drug discovery [36]. Text mining is a technology based on massive data resources that allows quick analysis of potential information [37]. It has been highly developed and successfully applied in fields such as security, biomedicine, and sentiment analysis.
The application of artificial intelligence, particularly deep learning (DL), is increasingly impacting the field of biomedicine [38]. In drug research and development (R&D), the principal goal is to identify compounds that bind tightly and selectively to the target proteins. DL is a powerful in silico tool for this purpose, and many DL models have been built for predicting drug-target interaction (DTI), compound properties, and protein-protein interaction [39,40]. DeepPurpose is a deep learning library that provides a framework implementing over 50 advanced DL models built from 15 encoders (8 for drugs and 7 for proteins), with support for many databases (BindingDB, DAVIS, KIBA, etc.). It has been shown to perform comparably to state-of-the-art DL models (GraphDTA and DeepDTA) [41]. In DeepPurpose, each of the following steps corresponds to one line of code.
(i) Encoder Specification. We select a specific encoder for drugs in SMILES format and for proteins as amino acid sequences
(ii) Data Encoding and Split. We use the selected encoders to convert the data into a format that can be recognized and processed by DeepPurpose [42]
In the present study, we identified genes relevant to drug-resistant melanoma via text mining. We then screened the target genes with GO/KEGG/PPI/GEPIA analyses together with a literature review. This was the first time that DeepPurpose was used to discover medicines for drug-resistant melanoma. It would provide a reference value in
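The encoder specification and data encoding/split steps listed above can be illustrated without the library itself. The sketch below is not DeepPurpose's API: it uses a hypothetical character-level integer encoding (`encode_sequence`) and a hypothetical `split_pairs` helper only to show what "encoding" and "splitting" mean for drug-target pairs. The vocabularies, sequence lengths, split fractions, and toy inputs are all assumptions, and the labels are placeholders rather than measured affinities.

```python
import random

# Assumed toy vocabularies; real encoders use far richer representations.
SMILES_VOCAB = {ch: i + 1 for i, ch in enumerate("CNOSFPclnos()[]=#+-1234567890@H")}
AMINO_VOCAB = {ch: i + 1 for i, ch in enumerate("ACDEFGHIKLMNPQRSTVWY")}


def encode_sequence(text, vocab, max_len):
    """Map each character to an integer id (0 = unknown or padding), fixed length."""
    ids = [vocab.get(ch, 0) for ch in text[:max_len]]
    return ids + [0] * (max_len - len(ids))


def split_pairs(pairs, fractions=(0.7, 0.1, 0.2), seed=1):
    """Shuffle drug-target pairs and split them into train/validation/test lists."""
    shuffled = list(pairs)
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * fractions[0])
    n_val = round(len(shuffled) * fractions[1])
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_val],
            shuffled[n_train + n_val:])


# Toy inputs (hypothetical): five small SMILES strings and two short peptides.
toy_smiles = ["CCO", "CCN", "CC(=O)O", "c1ccccc1", "CCCC"]
toy_proteins = ["MKTAYIAK", "MELAALCR"]
raw_pairs = [(s, p, 0.0) for s in toy_smiles for p in toy_proteins]  # 0.0 = placeholder label

encoded = [(encode_sequence(s, SMILES_VOCAB, 16),
            encode_sequence(p, AMINO_VOCAB, 8),
            y) for s, p, y in raw_pairs]
train, val, test = split_pairs(encoded)
print(len(train), len(val), len(test))
```

Fixing the random seed keeps the split reproducible, which matters when several encoder combinations are compared on the same data.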

Materials and Methods
2.1. Text Mining. Three terms, "melanoma," "drug," and "resistance," were entered into pubmed2ensembl (http://pubmed2ensembl.ls.manchester.ac.uk/), a public resource for mining the biological literature on genes, to obtain the list of associated genes. We set "Homo sapiens" as the species and selected "Ensembl Gene ID," "MEDLINE: PubMed ID," and "Associated Gene Name." "Search for PubMed IDs" and "filter on Entrez: PMID" were chosen for each query [43][44][45][46].

2.2. Biological Process and Pathway Analysis. Gene ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analyses were conducted with the Database for Annotation, Visualization and Integrated Discovery (DAVID) [47,48]. The genes in the most significantly enriched GO terms and KEGG pathways (p < 10E-20) were selected for subsequent protein-protein interaction (PPI) analysis.
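To make the enrichment cutoff concrete, the following sketch computes a plain upper-tail hypergeometric p value for one pathway. It is an illustration only: DAVID applies its own modified Fisher exact statistic, so the numbers will not match DAVID's output, and the function name `enrichment_p_value` and the toy counts are assumptions.

```python
from math import comb

P_CUTOFF = 10e-20  # cutoff taken literally as written in the text ("10E-20")


def enrichment_p_value(background, annotated, selected, overlap):
    """Upper-tail hypergeometric probability P(X >= overlap).

    background: number of genes in the reference set
    annotated:  background genes annotated to the pathway
    selected:   size of the submitted gene list
    overlap:    submitted genes that fall in the pathway
    """
    max_overlap = min(annotated, selected)
    total = comb(background, selected)
    tail = sum(comb(annotated, k) * comb(background - annotated, selected - k)
               for k in range(overlap, max_overlap + 1))
    return tail / total


def is_enriched(p_value, cutoff=P_CUTOFF):
    """Keep only pathways whose p value falls below the cutoff."""
    return p_value < cutoff


# Toy example: 10 background genes, 4 in the pathway, 3 submitted, 2 overlapping.
p = enrichment_p_value(10, 4, 3, 2)
print(p, is_enriched(p))
```

With so small a toy universe the p value is large and the pathway is not kept; only very strong over-representation in a genome-scale background reaches a cutoff this strict.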

2.3. Protein-Protein Interaction. Protein-protein interaction (PPI) analysis was conducted in the Search Tool for the Retrieval of Interacting Genes/Proteins (STRING) database [49]. We imported the candidate genes from the previous step into the database and chose "Homo sapiens" as the organism. We then imported the start and end nodes of the STRING interactions into Cytoscape to build the PPI network and used the MCODE app to identify the hub genes [50,51].
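The idea of extracting a densely connected core from a PPI edge list can be sketched with a k-core decomposition, a concept that MCODE's vertex weighting builds on. This is a simplified stand-in, not the MCODE algorithm, and the node names `G1` to `G5` and their edges are hypothetical rather than taken from the study's network.

```python
def k_core(edges, k):
    """Return the nodes of the k-core: repeatedly drop nodes with degree < k."""
    neighbors = {}
    for a, b in edges:
        if a == b:
            continue  # ignore self-loops
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)

    changed = True
    while changed:
        changed = False
        for node in list(neighbors):
            if len(neighbors[node]) < k:
                # Remove the node and detach it from its remaining neighbors.
                for other in neighbors.pop(node):
                    neighbors[other].discard(node)
                changed = True
    return set(neighbors)


# Hypothetical edge list in the "start node, end node" form exported by STRING.
toy_edges = [
    ("G1", "G2"), ("G1", "G3"), ("G1", "G4"),
    ("G2", "G3"), ("G2", "G4"), ("G3", "G4"),
    ("G1", "G5"),  # G5 hangs off the dense cluster
]
print(sorted(k_core(toy_edges, 3)))
```

Here the peripheral node is pruned and the fully connected group of four remains, which is the intuition behind treating members of dense clusters as hub candidates.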

2.4. Survival Analysis. The clinical significance of the candidate genes was validated by Gene Expression Profiling Interactive Analysis (GEPIA) [52]. The survival analysis results in GEPIA were used to screen out genes with statistically significant differences in skin cutaneous melanoma.
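For readers unfamiliar with the curves that GEPIA reports, the sketch below implements the Kaplan-Meier product-limit estimator for a single group. It covers only the curve itself, not the comparison between high- and low-expression groups that a survival screen relies on, and the follow-up times and event flags are made-up toy values.

```python
def kaplan_meier(durations, events):
    """Return [(time, survival probability)] at each time with at least one event.

    durations: follow-up time for each patient
    events:    1 if the event (death) was observed, 0 if the patient was censored
    """
    records = sorted(zip(durations, events))
    at_risk = len(records)
    survival = 1.0
    curve = []
    i = 0
    while i < len(records):
        t = records[i][0]
        deaths = removed = 0
        # Group all patients that share the same follow-up time.
        while i < len(records) and records[i][0] == t:
            deaths += records[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve


# Toy cohort: four patients, one censored at time 3.
toy_curve = kaplan_meier([2, 3, 3, 5], [1, 1, 0, 1])
for time, prob in toy_curve:
    print(time, round(prob, 3))
```

Censored patients leave the risk set without lowering the curve, which is why the estimate at time 3 reflects one death among three patients still at risk.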
2.5. Drug-Gene Interaction. The Pharmaprojects database (https://pharmaintelligence.informa.com) was used to search for drugs targeting the hub genes [53,54]. For each hub gene, a list of drugs targeting it was generated. Drugs with available SMILES structures, a global status of "launched," "phase I/II/III clinical trial," "pre-registration," or "registered," and an "injectable" or "oral" delivery route were selected for the candidate drug lists.
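The three selection criteria can be expressed as a simple record filter. The sketch below is an assumption-laden illustration: the field names (`smiles`, `global_status`, `delivery_routes`), the exact status strings, and the records are hypothetical and do not reflect the actual Pharmaprojects export format.

```python
# Status and route labels follow the criteria in the text; spellings are assumed.
ALLOWED_STATUS = {
    "launched",
    "phase i clinical trial",
    "phase ii clinical trial",
    "phase iii clinical trial",
    "pre-registration",
    "registered",
}
ALLOWED_ROUTES = {"injectable", "oral"}


def is_candidate(record):
    """True if the drug has a SMILES string, an allowed status, and an allowed route."""
    has_smiles = bool(record.get("smiles"))
    status_ok = record.get("global_status", "").lower() in ALLOWED_STATUS
    routes = {route.lower() for route in record.get("delivery_routes", [])}
    return has_smiles and status_ok and bool(routes & ALLOWED_ROUTES)


# Hypothetical records, one per failure mode plus one that passes.
toy_records = [
    {"name": "drug_A", "smiles": "CCO", "global_status": "Launched",
     "delivery_routes": ["Oral"]},
    {"name": "drug_B", "smiles": "", "global_status": "Launched",
     "delivery_routes": ["Injectable"]},            # no structure available
    {"name": "drug_C", "smiles": "CCN", "global_status": "Preclinical",
     "delivery_routes": ["Oral"]},                  # status not allowed
    {"name": "drug_D", "smiles": "CCC", "global_status": "Phase II Clinical Trial",
     "delivery_routes": ["Topical"]},               # route not allowed
]
candidates = [r["name"] for r in toy_records if is_candidate(r)]
print(candidates)
```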
2.6. DeepPurpose. With the target genes and their potential drugs, we employed DeepPurpose to calculate the affinity scores between them [42]. Fourteen encoding combinations pretrained on DAVIS, BindingDB, or KIBA were chosen. Affinity scores were calculated by importing the SMILES structures of the drugs and the amino acid sequences of the target proteins into the pretrained models. We summarized the scores for each drug-target pair. Ultimately, we chose the drugs with affinity scores of at least 7.0 for models based on the DAVIS or BindingDB datasets and at least 12.1 for models based on the KIBA dataset [46].
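The dataset-specific cutoffs can be applied to a table of predicted scores as sketched below. The sketch does not call DeepPurpose; it assumes the scores have already been predicted and stored per dataset. How scores from several datasets are combined is not spelled out in the text, so the default here (a drug passes if any score meets its own cutoff) is an assumption, and `combine=all` gives a stricter reading. The drug names and scores are invented.

```python
# Cutoffs stated in the text, keyed by the dataset a pretrained model was built on.
THRESHOLDS = {"DAVIS": 7.0, "BindingDB": 7.0, "KIBA": 12.1}


def passes_cutoff(scores, combine=any):
    """scores maps dataset name -> predicted affinity score for one drug-target pair.

    combine=any (assumed default): one score meeting its cutoff is enough.
    combine=all: every available score must meet its cutoff.
    """
    return combine(score >= THRESHOLDS[dataset] for dataset, score in scores.items())


# Hypothetical predictions for three drugs against one target.
toy_scores = {
    "drug_A": {"DAVIS": 7.2, "BindingDB": 6.5, "KIBA": 11.8},
    "drug_B": {"DAVIS": 6.1, "BindingDB": 5.9, "KIBA": 12.0},
    "drug_C": {"DAVIS": 7.4, "BindingDB": 7.1, "KIBA": 12.3},
}
lenient = [name for name, s in toy_scores.items() if passes_cutoff(s)]
strict = [name for name, s in toy_scores.items() if passes_cutoff(s, combine=all)]
print(lenient, strict)
```

Keeping the cutoffs in one dictionary matters because KIBA scores sit on a different scale from DAVIS and BindingDB scores, so a single global threshold would not be meaningful.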

2.7. Immunohistochemistry. We used the Human Protein Atlas to compare immunohistochemical staining of key genes between melanomas and skin melanocytes [55,56].

Results
3.1. Identification of Targeted Genes. In pubmed2ensembl, 433 genes related to "drug-resistant melanoma" were obtained after deleting duplicates. We carried out the text mining and exported the related genes to Excel on December 24, 2021. The overall process is shown in Figure 1.

3.3. PPI Network Analysis of Candidate Genes. The candidate genes were imported into Cytoscape, and the resulting PPI network is shown in Figure 3. After screening by MCODE, 27 genes were obtained. Hub [...] Table 1 and were chosen for further exploration.
The candidate drugs for further DTI affinity score analysis. Each color represents a set of candidate drugs targeting the screened gene above. The area represents the proportion.

3.5. Targeted Drugs on Selected Genes in Pharmaprojects. In Pharmaprojects, 16 drugs targeting the selected genes met the requirements (Figure 5), which included 10 ErbB-2 [...]
3.6. Drug-Target Interaction Prediction by DeepPurpose. As shown in Table 2, the affinity scores calculated with models based on the DAVIS and BindingDB datasets ranged from approximately 3 to 9, while those based on the KIBA dataset ranged from 10 to 13. To identify high-affinity drugs, the baseline score was set to 7.0 for DAVIS or BindingDB and 12.1 for KIBA. Eleven drugs worth further clinical verification were screened out (Table 3). All of them were ERBB2-targeted drugs, including 10 ERBB2 kinase inhibitors and 1 antibody-drug conjugate.
3.7. The Protein Expression of ERBB2. After identifying ERBB2 as a promising target in drug-resistant melanoma by DeepPurpose, we used the Human Protein Atlas (HPA) database to examine its protein expression in melanomas (Figure 6(a)) and skin melanocytes (Figure 6(b)). As shown in Figure 6, the protein expression of ERBB2 was higher in melanomas than in melanocytes in normal skin tissue. In melanoma, ERBB2 was detected by antibody CAB000043 with low staining, moderate intensity, and <25% quantity, whereas in skin melanocytes, ERBB2 was not detected by the same antibody.
3.8. PPI Network Analysis of Six Hub Genes. Finally, we constructed and analyzed the PPI relationship of six hub genes in Figure 7. CASP8 and NFKBIA were closely related to ERBB2, while BAX, FAS, and CFLAR were indirectly related to ERBB2. EGF, FADD, HSP90AA1, NFKB1, REL, RIPK1, RIPK3, TNFRSF10A, TNFRSF10B, and TNFRSF1A formed an interaction network with the six hub genes.

Discussion
This study aims to repurpose existing drugs that have not yet been used for drug-resistant melanoma as new treatment options. Unlike previous methods of biomarker selection, our study did not focus on the mechanism of drug resistance but aimed at selecting the most promising gene targets through bioinformatics analysis. We first obtained a wide range of candidate genes associated with drug resistance in melanoma (433 genes). Then, through GO and KEGG enrichment analyses, we selected the genes in the most significant pathways for the next step (348 genes). Next, we screened 27 hub genes through PPI analysis and the MCODE app in Cytoscape. To make the selection more clinically meaningful, survival analysis was conducted for the candidate genes, and the six genes with statistical significance were searched for existing targeted drugs in Pharmaprojects. Finally, we calculated the DTI and obtained the drugs with the highest affinity scores. In total, 11 candidate compounds were identified, all of them ERBB2-targeted drugs: 10 ERBB2 kinase inhibitors and 1 antibody-drug conjugate.
The abnormal expression of the ERBB2 gene has been studied in melanoma. A study by Gottesdiener et al. included patients with nonuveal melanoma at Memorial Sloan Kettering Cancer Center from 2014 to 2018. In 732 melanoma cases, ERBB2 amplifications were detected in acral (3%) and mucosal (3%) melanomas. ERBB2 mutations were found in cutaneous (1%), acral (2%), and mucosal (2%) melanomas. Among 140 patients without canonical driver alterations, ERBB2 amplifications were detected in acral (7%) and mucosal (6%) melanomas. ERBB2 amplification was found in a patient resistant to checkpoint inhibition therapy, who showed a durable complete response to trastuzumab emtansine [57]. The research by Kluger et al. included 600 patients, 31 of whom had positive ERBB2 expression. In that study, 7% of patients had positive ERBB2 staining in primary cutaneous specimens, compared with 3.6% in recurrent or metastatic specimens. ERBB2 expression was associated with melanoma lesions with a Breslow depth of <2 mm [58]. In conclusion, abnormal expression of ERBB2 is associated with the development of melanoma and might be independent of the canonical drivers. As a target, it has shown preliminary efficacy in treating drug-resistant melanoma.
ERBB2 plays an important role in normal cell and tumor development. Erb-B2 receptor tyrosine kinase 2 (ERBB2) is a member of the epidermal growth factor receptor (EGFR) family, which contains four tyrosine kinase receptors: HER1, ERBB2, HER3, and HER4 [59]. The canonical structure of these receptors comprises a ligand-binding domain, a transmembrane domain, and a tyrosine kinase domain. To date, no endogenous ligand has been found for ERBB2. ERBB2 can be activated in a ligand-independent manner or through heterodimerization with other members of the EGFR family or the tyrosine kinase superfamily [60][61][62].
The ERBB2-MAPK pathway is associated with cell proliferation, growth, and survival [71]. Activated ERK phosphorylates Bim, promoting its ubiquitination and proteasomal degradation and thereby inhibiting apoptosis [72,73]. One study showed that ERBB2 suppresses apoptosis by directly causing Puma destabilization and proteasomal degradation [74].
In our study, six hub genes were believed to be associated with drug resistance in melanoma. Among them, CASP8 and NFKBIA were closely related to ERBB2, while BAX, FAS, and CFLAR were indirectly related to ERBB2. After further expanding the PPI relationship, we found that the essential proteins EGF, FADD, HSP90AA1, NFKB1, REL, RIPK1, RIPK3, TNFRSF10A, TNFRSF10B, and TNFRSF1A formed protein networks closely related to the six hub genes. This interaction relationship can serve as the basis for subsequent studies on the mechanism of drug resistance in melanoma, with ERBB2 as the entry point. The regulatory relationships among these proteins can be further verified through experiments.
Anti-ERBB2 therapy comprises three classes: ERBB2-targeted monoclonal antibodies, antibody-drug conjugates, and ERBB2 kinase inhibitors. Monoclonal antibodies include trastuzumab and pertuzumab. Tyrosine kinase inhibitors include lapatinib, neratinib, pyrotinib, and tucatinib. Antibody-drug conjugates include trastuzumab emtansine (T-DM1) and trastuzumab deruxtecan (DS-8201). The 11 drugs we screened have a high predicted affinity for ERBB2, which suggests that they can recognize and block it. These 11 ERBB2-targeted drugs, all registered in the Pharmaprojects database, comprise the antibody-drug conjugate trastuzumab deruxtecan and ten tyrosine kinase inhibitors.
None of the 11 drugs has an established role in the treatment of melanoma. Mobocertinib is currently used mainly to treat lung cancer, while the remaining 9 tyrosine kinase inhibitors and trastuzumab deruxtecan have been used for various solid tumors, such as breast cancer, lung cancer, bladder cancer, kidney cancer, gastrointestinal cancer, and nervous system malignancies.
As an essential factor in regulating cell death, ERBB2 plays a critical role in the occurrence of melanoma drug resistance. We screened out 11 drugs with the highest affinity for ERBB2 out of many existing drugs by deep learning algorithms. The treatment value for drug-resistant melanoma of these drugs deserves more exploration.
In conclusion, ERBB2 plays an essential role as a target in many tumors. Using machine learning, our study suggests that ERBB2-targeted drugs may play an important role in treating drug-resistant melanoma. However, research on the role of ERBB2 in melanoma is still insufficient, especially regarding the mechanism of drug resistance. More studies are needed on the relationship between ERBB2 and melanoma resistance and on translating these findings into therapies.

Conclusion
In the present study, we explored genes relevant to drug-resistant melanoma using text mining. A total of 433 genes were found by pubmed2ensembl. The most statistically significant processes (p < 10E-20) in the GO and KEGG analyses were then selected, and 348 genes were involved in these pathways. The PPI network was built in STRING and Cytoscape, where 27 genes were screened out by MCODE. Next, we obtained 6 genes with a statistical difference in survival analysis by GEPIA. To ensure practical applicability, 16 targeted drugs were identified in Pharmaprojects with a status of "launched," "phase I/II/III clinical trial," "pre-registration," or "registered." We employed DeepPurpose, a deep learning tool, to calculate the affinity scores, and 11 drugs were screened out: 10 ERBB2 kinase inhibitors and 1 antibody-drug conjugate. Our study provides a reference for the early stage of drug discovery. Nevertheless, the effectiveness of these drugs requires further validation through laboratory work and clinical trials.