Endometrial cancer (EC) is one of the most common gynecologic malignancies, and its incidence continues to increase [
At present, text mining (TM) technology is widely used in biomedical research to extract information from large quantities of biomedical literature and construct databases of disease-related genes, proteins, and molecular interactions [
The extraction of data by TM was based on natural language processing (NLP). Using “Endometrial Cancer” and “Endometrium Carcinoma” as search terms, we searched the PubMed database for article abstracts published before March 2014 and formatted the documents that were obtained. Genes and proteins that appeared in the abstracts of these documents were located and tagged using ABNER (A Biomedical Named Entity Recognizer; an open source tool for automatically tagging genes, proteins, and other entity names in text) [
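The counting step behind this kind of literature mining can be sketched as follows. This is a minimal illustration only: it assumes the abstracts are already available as plain strings and swaps in a simple dictionary matcher for ABNER, which is a separate Java tool and is not called here. The names `GENE_SYMBOLS` and `count_gene_mentions`, and the sample abstracts, are hypothetical.

```python
import re
from collections import Counter

# Hypothetical gene lexicon. A real run would rely on a named entity
# recognizer such as ABNER rather than a fixed list of symbols.
GENE_SYMBOLS = {"PTEN", "TP53", "PGR", "MLH1"}


def count_gene_mentions(abstracts, symbols=GENE_SYMBOLS):
    """Return a Counter mapping gene symbol -> number of abstracts mentioning it."""
    # Word boundaries keep "PTEN" from matching inside a longer token.
    pattern = re.compile(
        r"\b(" + "|".join(sorted(map(re.escape, symbols))) + r")\b"
    )
    counts = Counter()
    for text in abstracts:
        # Using a set means each abstract contributes at most once per gene.
        counts.update(set(pattern.findall(text)))
    return counts


# Invented sample abstracts, for illustration only.
abstracts = [
    "Loss of PTEN and mutation of TP53 were observed in endometrial carcinoma.",
    "PTEN inactivation is an early event; PTEN status was assessed by IHC.",
    "Expression of PGR predicted response to progestin therapy.",
]
mention_counts = count_gene_mentions(abstracts)
```

Counting abstracts rather than raw mentions is a design choice made for this sketch; the source does not state which of the two its "Count" column reports.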
Gene ontology (GO) analysis was performed using the GSEABase software package from the R statistical platform (
After the retrieval of documents from PubMed, 15157 abstracts were examined, and 832 genes were obtained. Eventually, a total of 489 genes were identified as EC-related with
The 20 most significant EC-related genes based on text mining.
Gene | Description | Count | |
---|---|---|---|
PGR | Progesterone receptor | 323 | 0 |
TP53 | Tumor protein p53 | 296 | 0 |
MLH1 | mutL homolog 1 | 150 | 0 |
PTEN | Phosphatase and tensin homolog | 130 | 0 |
MSH2 | mutS homolog 2 | 112 | 0 |
VEGFA | Vascular endothelial growth factor A | 82 | 0 |
ERBB2 | erb-b2 receptor tyrosine kinase 2 (HER2) | 77 | 0 |
MSH6 | mutS homolog 6 | 75 | 0 |
EGFR | Epidermal growth factor receptor | 68 | 0 |
MKI67 | Antigen identified by monoclonal antibody Ki-67 | 66 | |
BCL2 | B-cell CLL/lymphoma 2 | 54 | 0 |
CCND1 | Cyclin D1 | 53 | |
ESR1 | Estrogen receptor 1 | 48 | 0 |
TCEAL1 | Transcription elongation factor A (SII)-like 1 | 47 | 0 |
CDKN2A | Cyclin-dependent kinase inhibitor 2A (p16) | 39 | 0 |
CYP19A1 | Cytochrome P450, family 19, subfamily A, polypeptide 1 | 39 | 0 |
INS | Insulin | 36 | 0 |
PTGS2 | Prostaglandin-endoperoxide synthase 2 (COX2) | 34 | 0 |
PMS2 | Postmeiotic segregation increased 2 | 33 | 0 |
PCNA | Proliferating cell nuclear antigen | 32 | 0 |
Classification results for biological processes, cellular components, and molecular functions by GO analysis are presented in Table
Classification results for biological processes, cellular components, and molecular functions by GO analysis.
Term | Count | |
---|---|---|
Biological process | ||
Cell cycle and proliferation | 224 | |
Stress response | 160 | |
Developmental processes | 336 | |
RNA metabolism | 188 | 0.00031 |
DNA metabolism | 67 | 0 |
Protein metabolism | 254 | |
Other metabolic processes | 229 | |
Cell organization and biogenesis | 178 | |
Cell-cell signaling | 44 | |
Signal transduction | 245 | 0.00089 |
Cell adhesion | 51 | 0.00284 |
Death | 141 | |
Other biological processes | 436 | |
Molecular function | ||
Transcription regulatory activity | 107 | |
Signal transduction activity | 240 | |
Enzyme regulator activity | 48 | 0.01638 |
Nucleic acid binding activity | 194 | |
Kinase activity | 84 | |
Other molecular function | 744 | |
Cellular component | ||
Extracellular matrix | 34 | |
Nonstructural extracellular | 180 | |
Cytosol | 53 | |
Nucleus | 306 | |
Plasma membrane | 186 | 0.00014 |
Translational apparatus | 22 | 0.00148 |
Other cellular component | 446 | |
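Term counts such as those in the table above are commonly assessed for over-representation with a one-sided hypergeometric test. The source does not state which test was used, so the following is a sketch under that assumption rather than the authors' procedure. The function name `hypergeom_enrichment_p` and the background figures in the second call are hypothetical; only the gene list size (489) and the term count (67) come from the text.

```python
from math import comb


def hypergeom_enrichment_p(k, n, K, N):
    """Upper-tail probability P(X >= k) for a hypergeometric draw.

    N: number of genes in the background
    K: background genes annotated to the term
    n: size of the gene list under study
    k: genes in the list annotated to the term
    """
    upper = min(n, K)
    total = comb(N, n)
    # Sum the probability mass over every outcome at least as extreme as k.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


# Toy check: 10 background genes, 4 annotated, 3 drawn, at least 2 annotated.
p_toy = hypergeom_enrichment_p(k=2, n=3, K=4, N=10)

# Illustrative call: the background size and term size below are assumed
# values, not figures reported in the study.
p_term = hypergeom_enrichment_p(k=67, n=489, K=600, N=20000)
```

Exact integer arithmetic via `math.comb` avoids the floating-point underflow that can occur when the individual probabilities are very small.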
Following pathway analysis, 32 pathways were identified as significant (
The 20 most significant pathways in which EC-related genes were involved.
Term | Count | |
---|---|---|
Cytokine-cytokine receptor interaction | 64 | |
MAPK signaling pathway | 62 | |
Focal adhesion | 52 | |
Cell cycle | 48 | |
Regulation of actin cytoskeleton | 46 | |
Jak-STAT signaling pathway | 39 | |
Toll-like receptor signaling pathway | 36 | |
Chemokine signaling pathway | 36 | 0.00170 |
p53 signaling pathway | 34 | |
Apoptosis | 33 | |
T cell receptor signaling pathway | 33 | |
Insulin signaling pathway | 33 | |
ErbB signaling pathway | 32 | |
Wnt signaling pathway | 32 | |
Neurotrophin signaling pathway | 31 | |
Natural killer cell-mediated cytotoxicity | 28 | 0.00168 |
Steroid hormone biosynthesis | 26 | |
Adherens junction | 24 | |
Fc epsilon RI signaling pathway | 24 | |
NOD-like receptor signaling pathway | 23 | |
We constructed a network of EC-related proteins that included 271 interactions (Figure
Network analysis of EC-related genes.
Hub proteins for EC.
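One common way to pick hub proteins from an interaction network is to rank nodes by degree, that is, by their number of distinct interaction partners. The source does not say which criterion was applied, so the sketch below assumes a simple degree threshold. The function `find_hubs`, the `min_degree` parameter, and the edge list are hypothetical; the edges are toy data for illustration and are not taken from the study's network.

```python
from collections import defaultdict


def find_hubs(interactions, min_degree):
    """Return (hubs, degree) for an undirected protein interaction list.

    hubs is sorted by descending degree, then alphabetically.
    """
    neighbors = defaultdict(set)
    for a, b in interactions:
        if a == b:
            continue  # ignore self-interactions
        # Sets collapse duplicate or reversed edges into one partnership.
        neighbors[a].add(b)
        neighbors[b].add(a)
    degree = {protein: len(partners) for protein, partners in neighbors.items()}
    hubs = sorted(
        (p for p, d in degree.items() if d >= min_degree),
        key=lambda p: (-degree[p], p),
    )
    return hubs, degree


# Toy edges for illustration only (assumed, not the study's data).
interactions = [
    ("EGFR", "PIK3R1"),
    ("EGFR", "KRAS"),
    ("EGFR", "JAK2"),
    ("IGF1R", "PIK3R1"),
    ("MET", "PIK3R1"),
    ("KRAS", "MAPK3"),
    ("PIK3R1", "EGFR"),  # reversed duplicate, counted once
]
hubs, degree = find_hubs(interactions, min_degree=3)
```

Other centrality measures, such as betweenness, could replace degree here; the threshold itself is a tunable assumption.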
In the present study, by extracting information from biomedical literature, we obtained a dataset of EC-related proteins and identified 17 hub proteins. The relationships between EC and several of these hub proteins, such as EGFR, IGF1R, and MET, have been extensively studied, and these proteins are known to be closely related to the occurrence and development of EC. In contrast, PDGFRB, FGFR2, MAPK3, and JAK2 have been reported less frequently in the context of EC.
PI3K is a heterodimeric enzyme that consists of a regulatory subunit (p85) encoded by PIK3R1, PIK3R2, and PIK3R3 and a catalytic subunit (p110) encoded by PIK3CA, PIK3CB, and PIK3CD [
RAS is an oncogene that acts as a central node in many signal transduction pathways and is associated with a high percentage of human tumors. Activating mutations in KRAS have been observed in EC [
FGFR2 is a fibroblast growth factor receptor and a member of the receptor tyrosine kinase (RTK) family. RTKs are well known for their role in tumorigenesis [
PDGF is a major mitogen that mediates the growth of fibroblasts, smooth muscle cells, and other cells. This protein also has significant effects on angiogenesis through its action on endothelial cells. PDGF exerts its biological effects by binding to its two receptors,
JAK2, a member of the JAK family, is widely distributed in the cytoplasm. This protein is involved in signal transduction during hematopoiesis and in the immune system; in particular, JAK2 plays important roles in the production of red blood cells and the activation of immune cells. Research has demonstrated that JAK2 is associated with multiple tumors. The constitutive activation of JAK2 has been detected in many malignancies, including solid tumors such as colon cancer and head and neck cancer as well as hematologic diseases such as leukemia and multiple myeloma [
In summary, we systematically analyzed EC-related genes and identified hub proteins together with their pathways and networks. This systematic study may help to reveal the molecular mechanisms of EC development. However, the results were obtained by TM, which considers only previously published literature; thus, the correlations between certain proteins and EC require further investigation. Our data also have implications for targeted therapy for EC. With deeper insight into the EC-related signaling network, additional hub protein inhibitors with greater specificity may be developed, and drugs that target multiple hub proteins may have broad potential for tumor treatment.
The authors declare that there is no conflict of interest regarding the publication of this paper.
This study was supported in part by the Chinese High-tech R&D (863) Program. The authors also wish to express their gratitude to Shanghai Sensichip Co., Ltd., for bioinformatics analysis.