Hepatocellular carcinoma (HCC) is the major type of liver tumor (~80% of cases); other types include hepatoblastomas, angiosarcomas, and cholangiocarcinomas. In this study, we used a systems biology approach to construct protein-protein interaction networks (PPINs) for early-stage and late-stage liver cancer. Comparing the networks of the two stages revealed some common mechanisms and some significantly different ones. To obtain differential network structures between cancer and noncancer PPINs, we constructed cancer and noncancer PPINs for both stages of liver cancer with a systems biology method, using NGS data from cancer cells and adjacent noncancer cells. Using carcinogenesis relevance values (CRVs), we identified 43 and 80 significant proteins and their PPINs (network markers) for early-stage and late-stage liver cancer, respectively. To investigate the evolution of network biomarkers during carcinogenesis, we performed a primary pathway analysis, which showed that the pathways common to the early and late stages were those related to ordinary cancer mechanisms. A pathway specific to the early stage was the mismatch repair pathway, while pathways specific to the late stage were the spliceosome pathway, lysine degradation pathway, and progesterone-mediated oocyte maturation pathway. This study provides a new direction for cancer-targeted therapies at different stages.
Cancer is the leading cause of death worldwide, and its etiology occurs at the DNA, RNA, and protein levels. It is a very complex disease involving cascades of spatial and temporal changes in genetic networks and metabolic pathways [
Hepatocellular carcinoma (HCC) is a major liver tumor (~80%), besides hepatoblastomas, angiosarcomas, and cholangiocarcinomas. Compared to other types of cancer, liver cancer is the third most deadly cancer globally and caused about 700,000 deaths in 2011 [
Although the histological and molecular features leading to HCC initiation are still poorly understood, mounting evidence suggests that a gradual accumulation of mutations and genetic changes in hepatocytes, which form the liver lobule, may lead to the development of HCC [
Given the different etiologies and heterogeneous genomic alterations of HCC, a systems biology methodology that integrates omics data is well suited to developing accurate diagnoses, novel therapeutic targets, and efficient targeted therapies [
Chen et al. developed a dynamical network biomarker (DNB) that can serve as a general early-warning signal to indicate an imminent bifurcation or sudden deterioration before the critical transition occurs, which means that it can identify a predisease state using time series microarray data [
We reveal the carcinogenesis process from early-stage and late-stage liver cancer. A specific pathway of the early stage was the mismatch repair pathway, while specific pathways of the late stage were the spliceosome pathway, lysine degradation pathway, and progesterone-mediated oocyte maturation pathway.
We successfully used our methods to find core and specific network markers of four different kinds of cancer and the evolution of network markers from the early stage to late stage of liver cancer [
Flowchart illustrating construction of network markers in two stages of liver cancer and the investigation of carcinogenesis mechanisms. We integrated NGS data, the GO database, and protein-protein interaction (PPI) information to construct the PPI network. These data were used for pool selection, and the selected proteins and NGS data were then used to construct the PPI network (PPIN) by maximum-likelihood estimation and a model order detection method, resulting in a liver cancer PPIN (CPPIN) and a noncancer PPIN (NPPIN) for the early and late stages of liver cancer. The two constructed PPINs were used to determine significant proteins of tumorigenesis by examining differences between their PPI matrices. With the help of a differential PPI matrix (network) between CPPIN and NPPIN, a carcinogenesis relevance value (CRV) was computed for each protein, and significant proteins in carcinogenesis were determined based on
Liver RNA-seq data were collected from the liver HCC (LIHC) project of The Cancer Genome Atlas (TCGA), batch number 100. Normalized results, given as reads per kilobase of exon per million mapped reads (RPKM), were used to represent gene expression. The same dataset contained early-stage and late-stage liver cancer samples and noncancer samples. We used only data derived from nonprocessed primary biopsies to avoid the discrepancies in gene expression that are intrinsic to cell culture and fixation. The dataset therefore comprised primary tumor samples of both stages and adjacent nontumor tissue samples from the same cancer patients, which served as control samples. As shown in Table
Descriptive information on datasets extracted from the TCGA database used in this study. Cases are grouped by the type of cancer, and surrounding normal tissues came from human patients in the early or late stage of liver cancer.
Cancer type | TCGA dataset number | Early-stage samples | Late-stage samples | Normal samples | Platform |
---|---|---|---|---|---|
Liver cancer | Batch100 | 19 | 18 | 24 | Illumina |
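As a reminder of what the RPKM values mentioned above represent, the following minimal Python sketch computes RPKM from a raw read count. The function name and the example numbers are illustrative assumptions; they are not taken from the TCGA processing pipeline.

```python
def rpkm(read_count, exon_length_bp, total_mapped_reads):
    """Reads per kilobase of exon per million mapped reads."""
    if exon_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("exon length and library size must be positive")
    kilobases = exon_length_bp / 1_000
    millions = total_mapped_reads / 1_000_000
    return read_count / (kilobases * millions)


# Hypothetical gene: 500 reads, 2,000 bp of exon, 10 million mapped reads.
example = rpkm(500, 2_000, 10_000_000)  # 500 / (2 * 10)
```

Dividing by both exon length and library size is what makes expression values comparable across genes and across samples.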
PPI data for
To integrate gene expression with PPI data and construct the corresponding CPPINs and NPPINs, we set up a protein pool containing differentially expressed proteins. Gene expression values were assumed to correlate with protein expression levels. We used a one-way analysis of variance (ANOVA) to analyze the expression of each protein and selected proteins with differential expression levels. This method allowed determination of significant differences between cancer and noncancer datasets. The null hypothesis (H0) was that the mean protein expression levels of the cancer and noncancer sets were the same. A Bonferroni adjustment [
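The pool-selection step described above can be sketched as follows. This is a minimal illustration, not the authors' code: it computes the one-way ANOVA F statistic for one protein and applies a Bonferroni-adjusted cutoff to p values that are assumed to come from a statistics package. The function names and the significance level of 0.05 are assumptions made for the example.

```python
from statistics import fmean


def one_way_anova_f(*groups):
    """Return the one-way ANOVA F statistic for two or more sample groups."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    if k < 2 or n_total <= k:
        raise ValueError("need at least two groups and more samples than groups")
    means = [fmean(g) for g in groups]
    grand_mean = fmean(x for g in groups for x in g)
    # Between-group and within-group sums of squares.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)  # zero only if every group is constant
    return ms_between / ms_within


def bonferroni_select(p_values, alpha=0.05):
    """Keep proteins whose p value passes the Bonferroni-adjusted cutoff."""
    if not p_values:
        return set()
    cutoff = alpha / len(p_values)  # alpha divided by the number of tests
    return {name for name, p in p_values.items() if p < cutoff}


# Hypothetical expression values for one protein (cancer vs. noncancer).
f_stat = one_way_anova_f([1.0, 2.0, 3.0], [2.0, 3.0, 4.0])
# Hypothetical p values for three proteins.
pool = bonferroni_select({"A": 0.001, "B": 0.02, "C": 0.04})
```

With three tests the adjusted cutoff is 0.05 / 3, so only the hypothetical protein "A" enters the pool.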
On the strength of the significant pool and PPI information, candidate PPINs for early-stage and late-stage liver cancer were constructed for liver cancer and noncancer tissues by linking proteins that interacted with each other. In other words, proteins that had PPI information through the pool were linked together, resulting in candidate PPINs.
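The linking step described above amounts to keeping only those known interactions whose two partners are both in the selected pool. A minimal sketch follows; the function name, the decision to drop self-interactions, and the example interaction pairs are illustrative assumptions, not data from the PPI database.

```python
def build_candidate_ppin(pool, known_interactions):
    """Link pool proteins that share a recorded interaction."""
    pool = set(pool)
    network = {protein: set() for protein in pool}
    for a, b in known_interactions:
        # Keep an edge only when both partners passed pool selection.
        # Self-interactions are skipped here (an assumption of this sketch).
        if a in pool and b in pool and a != b:
            network[a].add(b)
            network[b].add(a)
    return network


# Hypothetical pool and interaction list.
pool = {"CDK1", "CDC20", "PCNA"}
interactions = [("CDK1", "CDC20"), ("PCNA", "TP53"), ("CDK1", "PCNA")]
candidate = build_candidate_ppin(pool, interactions)
```

In this toy input the PCNA-TP53 pair is discarded because TP53 is not in the pool.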
As the candidate PPIN included all possible PPIs under various environments, different organisms, and experimental conditions, it needed to be further confirmed by expression data to identify the PPIs appropriate to the biological processes relevant to cancer. To remove false-positive PPIs from each candidate PPIN for different biological conditions, we used both a PPI model identification scheme and a model order detection method to prune each candidate PPIN with the corresponding NGS expression data so that it approached the actual PPIN. Here, the PPIs of a target protein
After constructing (
Once the association parameters for all proteins in the candidate PPIN were identified for each protein, significant protein associations were determined using the interaction model order detection method based on estimated association abilities, that is, to detect the interaction number
After the interaction number
If there was no PPI between proteins
The different matrix
The
In order to investigate what proteins are more likely involved in
Based on
The intersection of the significant proteins in the early and late stages of liver cancer, together with their PPIs, constitutes the core network markers that appear in all stages of liver cancer. In contrast, the significant proteins unique to each stage, together with their PPIs, constitute the specific network markers of that stage. Of the 43 early-stage and 80 late-stage significant proteins, 27 were shared and could be classified as core network markers over the entire carcinogenesis process of liver cancer. The remaining 16 significant proteins were specific network markers of early-stage liver cancer, and the remaining 53 were specific network markers of late-stage liver cancer.
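The split into core and stage-specific network markers is a plain set operation. The sketch below assumes the significant proteins of each stage are available as collections of gene symbols; the function name is hypothetical and the two short lists are an illustrative subset, not the full marker sets.

```python
def split_network_markers(early, late):
    """Separate shared (core) proteins from stage-specific ones."""
    early, late = set(early), set(late)
    return {
        "core": early & late,            # significant in both stages
        "early_specific": early - late,  # significant only in the early stage
        "late_specific": late - early,   # significant only in the late stage
    }


# Illustrative subset of gene symbols.
markers = split_network_markers(
    early=["APP", "CDK1", "PRKDC"],
    late=["APP", "CDK1", "UBD"],
)
```

The PPIs attached to each protein would be carried along with the protein in the same way.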
Much valuable cellular information can be found using known pathways, which are useful for describing most “normal” biological phenomena. All of these known pathways are the result of repeated testing and verification, and the entire pathway network has defined most links. Therefore, the proteins we identified to be significant in the above network markers were mapped onto known pathway networks (e.g., the Kyoto Encyclopedia of Genes and Genomes (KEGG) and PANTHER pathways) to investigate significant pathways with network markers and explore relationships between these pathways and the carcinogenesis of liver cancer. This approach supports the view that systems biology can help identify significant network biomarkers in both normal and cancerous pathways and their roles in the pathogenesis of cancer.
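To make the idea of mapping significant proteins onto known pathways concrete, the sketch below computes a plain hypergeometric enrichment p value: the probability of seeing at least the observed number of pathway members in a protein list of a given size. This is an assumption-laden illustration; tools such as DAVID report a modified Fisher exact score, which this calculation does not reproduce, and the parameter names are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(population, annotated, sample, hits):
    """P(X >= hits) when `sample` proteins are drawn without replacement.

    population: number of proteins in the background set
    annotated:  background proteins that belong to the pathway
    sample:     size of the significant-protein list
    hits:       significant proteins that belong to the pathway
    """
    if not (0 <= annotated <= population and 0 <= sample <= population):
        raise ValueError("inconsistent set sizes")
    lower = max(hits, 0)
    upper = min(annotated, sample)
    total = comb(population, sample)
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(lower, upper + 1)
    )
    return tail / total


# Toy numbers: 10 background proteins, 4 in the pathway, 3 selected, 2 hits.
p_toy = hypergeom_enrichment_p(10, 4, 3, 2)
```

A small p value means the pathway holds more of the significant proteins than random selection would be expected to give.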
Together with comprehensive pathway databases such as the KEGG, we used a series of bioinformatics pathway analytical tools to identify biologically relevant pathway networks [
Our cancer PPI model was constructed from the differential expression of cancer and noncancer NGS data and data mining of PPI information from the BioGRID database. The early-stage and late-stage liver CPPINs and NPPINs were thus the results of our systems biology model using the original NGS data and PPI databases. Two key factors affected our final results.
The constructed cancer differential protein-protein interaction networks (DPPINs) for early, late, and total stages of liver cancer. This figure shows the DPPINs with edge and node information for the early, late, and total stages of liver cancer. The DPPIN is the difference between the cancer PPIN (CPPIN) and noncancer PPIN (NPPIN). The figures were created using Cytoscape.
DPPIN of early-stage liver cancer
DPPIN of late-stage liver cancer
DPPIN of total-stage liver cancer
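Since the DPPIN is described as the difference between the CPPIN and the NPPIN, the operation can be sketched as an element-wise matrix subtraction. The per-protein score in the second function (sum of absolute differences along a row) is only an assumed summary for illustration; the exact CRV formula is not reproduced in this text, and the matrices below are hypothetical.

```python
def differential_matrix(cancer, noncancer):
    """Element-wise difference between two equally sized PPI matrices."""
    if len(cancer) != len(noncancer) or any(
        len(c_row) != len(n_row) for c_row, n_row in zip(cancer, noncancer)
    ):
        raise ValueError("matrices must have the same shape")
    return [
        [c - n for c, n in zip(c_row, n_row)]
        for c_row, n_row in zip(cancer, noncancer)
    ]


def row_change_scores(diff):
    """Assumed per-protein summary: sum of absolute differences in each row."""
    return [sum(abs(d) for d in row) for row in diff]


# Hypothetical interaction strengths for three proteins.
cppin = [[0.0, 0.5, 0.0], [0.5, 0.0, 0.25], [0.0, 0.25, 0.0]]
nppin = [[0.0, 0.25, 0.0], [0.25, 0.0, 0.25], [0.0, 0.25, 0.0]]
dppin = differential_matrix(cppin, nppin)
scores = row_change_scores(dppin)
```

In this toy case only the interaction between the first two proteins changes, so the third protein receives a score of zero.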
Biosystems also evolve with time. Early-stage and late-stage patients present very different symptoms, and these are the key features used to classify early-stage and late-stage liver cancer. Because the symptoms of the two stages differ so much, the NGS data of the two patient groups are expected to differ substantially as well. As described above, protein expression levels derived from NGS data are one of the key factors in our systems biology model, which produces the final CPPINs and NPPINs, and these in turn yield the final network biomarkers. The evolution of the network biomarkers therefore chiefly reflects the evolution of the NGS data between the two stages of liver cancer, which is inherent in the expression of cancer-related genes altered by DNA mutations during carcinogenesis.
We built the DPPIN to examine the early, late, and total stages of liver cancer (Figure
After
The 27 identified significant proteins of core network marker in both early-stage and late-stage liver cancer (intersection).
Protein | CRV | p value | Cancer_AvgExp | Control_AvgExp | log2 fold change |
---|---|---|---|---|---|
APP | 46.52 | 0.00001 | 13724 | 14041 | −0.03 |
APP | 13.87 | 0.00066 | 15083 | 14041 | 0.1 |
ELAVL1 | 22.99 | 0.00033 | 1725 | 1319 | 0.39 |
ELAVL1 | 29.69 | 0.00003 | 1791 | 1319 | 0.44 |
KRTAP10-5 | 14.73 | 0.00098 | 1 | 1 | 0.04 |
KRTAP10-5 | 20.54 | 0.00022 | 1 | 1 | 0.06 |
H2AFX | 14.08 | 0.00105 | 608 | 271 | 1.17 |
H2AFX | 15.43 | 0.00053 | 806 | 271 | 1.58 |
CDK1 | 13.69 | 0.00108 | 335 | 34 | 3.32 |
CDK1 | 10.31 | 0.00119 | 596 | 34 | 4.15 |
ESR1 | 12.81 | 0.00119 | 228 | 1537 | −2.76 |
ESR1 | 34.1 | 0.00000 | 245 | 1537 | −2.65 |
EZH2 | 12.22 | 0.00127 | 310 | 49 | 2.66 |
EZH2 | 19.05 | 0.00027 | 394 | 49 | 3 |
CEP250 | 12.05 | 0.00130 | 716 | 268 | 1.42 |
CEP250 | 16.12 | 0.00045 | 627 | 268 | 1.23 |
AURKB | 11.41 | 0.00138 | 141 | 14 | 3.35 |
AURKB | 7.68 | 0.00289 | 199 | 14 | 3.85 |
CDC20 | 10.89 | 0.00152 | 359 | 20 | 4.17 |
CDC20 | 8.94 | 0.00168 | 688 | 20 | 5.11 |
E2F1 | 10.54 | 0.00158 | 476 | 43 | 3.46 |
E2F1 | 8.85 | 0.00174 | 673 | 43 | 3.95 |
OTX1 | 10.29 | 0.00166 | 31 | 2 | 4.27 |
OTX1 | 6.25 | 0.00811 | 26 | 2 | 4.02 |
SUMO1 | 9.86 | 0.00181 | 2422 | 2982 | −0.3 |
SUMO1 | 8.63 | 0.00188 | 2624 | 2982 | −0.18 |
MCM4 | 9.76 | 0.00186 | 1308 | 336 | 1.96 |
MCM4 | 6.76 | 0.00535 | 1327 | 336 | 1.98 |
HGS | 7.96 | 0.00312 | 2504 | 1167 | 1.1 |
HGS | 13.9 | 0.00066 | 2441 | 1167 | 1.06 |
KPNA2 | 7.86 | 0.00323 | 1927 | 689 | 1.48 |
KPNA2 | 10.43 | 0.00116 | 2533 | 689 | 1.88 |
UBC | 7.85 | 0.00323 | 34388 | 35490 | −0.05 |
UBC | 6.63 | 0.00591 | 36931 | 35490 | 0.06 |
SPRY2 | 7.5 | 0.00376 | 409 | 989 | −1.27 |
SPRY2 | 7.17 | 0.00399 | 328 | 989 | −1.59 |
TOPBP1 | 7.33 | 0.00406 | 799 | 346 | 1.21 |
TOPBP1 | 6.45 | 0.00698 | 875 | 346 | 1.34 |
SIRT7 | 7.11 | 0.00456 | 442 | 225 | 0.98 |
SIRT7 | 24 | 0.00012 | 456 | 225 | 1.02 |
WHSC1 | 6.67 | 0.00590 | 1217 | 447 | 1.44 |
WHSC1 | 12.06 | 0.00086 | 1369 | 447 | 1.61 |
MCM2 | 6.45 | 0.00666 | 1097 | 174 | 2.66 |
MCM2 | 8.48 | 0.00200 | 1351 | 174 | 2.96 |
AURKA | 6.4 | 0.00685 | 570 | 86 | 2.73 |
AURKA | 9.25 | 0.00154 | 516 | 86 | 2.58 |
COPS5 | 6.39 | 0.00691 | 1506 | 1184 | 0.35 |
COPS5 | 10 | 0.00128 | 1708 | 1184 | 0.53 |
PCNA | 6.29 | 0.00749 | 1652 | 904 | 0.87 |
PCNA | 15.95 | 0.00046 | 2290 | 904 | 1.34 |
BUB1B | 6.27 | 0.00762 | 179 | 12 | 3.89 |
BUB1B | 7.47 | 0.00327 | 320 | 12 | 4.73 |
DNMT1 | 6.02 | 0.00920 | 1376 | 455 | 1.6 |
DNMT1 | 10.63 | 0.00110 | 1383 | 455 | 1.6 |
Each protein has two rows: the upper row represents early-stage liver cancer and the lower row represents late-stage liver cancer.
AvgExp means average expression.
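The last column of the table appears consistent with the base-2 logarithm of the ratio of the two average expression values. The sketch below checks this reading on one row; it is an interpretation of the table, not a formula stated in the text, and the function name is hypothetical. Small deviations from the tabulated values are expected because the averages shown are rounded.

```python
from math import log2


def log2_fold_change(cancer_avg, control_avg):
    """Base-2 log ratio of cancer to control average expression."""
    if cancer_avg <= 0 or control_avg <= 0:
        raise ValueError("average expression must be positive")
    return log2(cancer_avg / control_avg)


# Early-stage CDK1 row: 335 (cancer) versus 34 (control); the table lists 3.32.
cdk1_early = log2_fold_change(335, 34)
# Early-stage ESR1 row: 228 (cancer) versus 1537 (control); the table lists -2.76.
esr1_early = log2_fold_change(228, 1537)
```

Positive values indicate higher average expression in cancer tissue and negative values indicate lower expression.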
Top 20 proteins of early-stage and late-stage liver cancer.
Protein | CRV | p value | Case_AvgExp | Control_AvgExp | log2 fold change |
---|---|---|---|---|---|
Top 20 proteins of early-stage liver cancer | |||||
APP | 46.52 | | 13724 | 14041 | −0.03 |
KRTAP4-12 | 33.47 | | 1 | 1 | −0.17 |
ELAVL1 | 22.99 | 0.000329 | 1725 | 1319 | 0.39 |
KRTAP10-1 | 17.17 | 0.000756 | 1 | 1 | 0.04 |
KRTAP10-5 | 14.73 | 0.000981 | 1 | 1 | 0.04 |
H2AFX | 14.08 | 0.001048 | 608 | 271 | 1.17 |
CDK1 | 13.69 | 0.001079 | 335 | 34 | 3.32 |
PRKDC | 12.91 | 0.001176 | 4011 | 1684 | 1.25 |
CUL3 | 12.83 | 0.001188 | 1376 | 1606 | −0.22 |
ESR1 | 12.81 | 0.001194 | 228 | 1537 | −2.76 |
EZH2 | 12.22 | 0.001274 | 310 | 49 | 2.66 |
CEP250 | 12.05 | 0.001298 | 716 | 268 | 1.42 |
AURKB | 11.41 | 0.001383 | 141 | 14 | 3.35 |
CDC20 | 10.89 | 0.001523 | 359 | 20 | 4.17 |
E2F1 | 10.54 | 0.001584 | 476 | 43 | 3.46 |
OTX1 | 10.29 | 0.001664 | 31 | 2 | 4.27 |
C19orf66 | 10.2 | 0.001688 | 1551 | 3166 | −1.03 |
SUMO1 | 9.86 | 0.00181 | 2422 | 2982 | −0.3 |
MCM4 | 9.76 | 0.001859 | 1308 | 336 | 1.96 |
GRB2 | 9.19 | 0.0022 | 3572 | 2608 | 0.45 |
Top 20 proteins of late-stage liver cancer | |||||
ESR1 | 34.1 | | 245 | 1537 | −2.65 |
ELAVL1 | 29.69 | | 1791 | 1319 | 0.44 |
UBD | 28.54 | | 20926 | 1781 | 3.55 |
YWHAZ | 27.31 | | 13323 | 6408 | 1.06 |
SIRT7 | 24 | 0.000122 | 456 | 225 | 1.02 |
HDAC5 | 22.07 | 0.000152 | 1696 | 840 | 1.01 |
KRTAP10-5 | 20.54 | 0.000224 | 1 | 1 | 0.06 |
EZH2 | 19.05 | 0.00027 | 394 | 49 | 3 |
ILF2 | 18.86 | 0.000283 | 4111 | 1737 | 1.24 |
CEP250 | 16.12 | 0.000452 | 627 | 268 | 1.23 |
PCNA | 15.95 | 0.00046 | 2290 | 904 | 1.34 |
SUMO2 | 15.77 | 0.00049 | 3305 | 2112 | 0.65 |
H2AFX | 15.43 | 0.000532 | 806 | 271 | 1.58 |
HSP90AB1 | 14.43 | 0.0006 | 30949 | 14629 | 1.08 |
HGS | 13.9 | 0.000659 | 2441 | 1167 | 1.06 |
APP | 13.87 | 0.000659 | 15083 | 14041 | 0.1 |
WHSC1 | 12.06 | 0.000861 | 1369 | 447 | 1.61 |
SETDB1 | 11.7 | 0.000912 | 1203 | 497 | 1.28 |
TRAF2 | 11.45 | 0.000938 | 665 | 277 | 1.26 |
SMARCA4 | 11.18 | 0.000992 | 2696 | 1115 | 1.27 |
SFN | 11.03 | 0.001026 | 1008 | 55 | 4.19 |
AvgExp means average expression.
Top 30 proteins of total-stage liver cancer.
Protein | CRV | p value | Case_AvgExp | Control_AvgExp | log2 fold change |
---|---|---|---|---|---|
APP | 81.53 | < | 14385 | 14041 | 0.03 |
ELAVL1 | | 0.000187 | 1757 | 1319 | 0.41 |
CCDC33 | | 0.000464 | 3 | 1 | 1.7 |
UBC | 25.19 | 0.000588 | 35625 | 35490 | 0.01 |
HIST3H3 | 24.3 | 0.000626 | 1 | 1 | 0.02 |
KRTAP10-1 | 23.96 | 0.000643 | 1 | 1 | 0.02 |
ESR1 | 23.79 | 0.000652 | 236 | 1537 | −2.7 |
UBD | 23.34 | 0.000682 | 14701 | 1781 | 3.04 |
H2AFX | 17.28 | 0.001125 | 705 | 271 | 1.38 |
KRTAP10-5 | 17.05 | 0.001133 | 1 | 1 | 0.05 |
HGS | 15.2 | 0.001329 | 2473 | 1167 | 1.08 |
TRAF2 | 14.28 | 0.001461 | 581 | 277 | 1.07 |
HSP90AB1 | 14.16 | 0.001483 | 29756 | 14629 | 1.02 |
PCNA | 13.58 | 0.001572 | 1962 | 904 | 1.12 |
SMARCA4 | 13.21 | 0.001611 | 2546 | 1115 | 1.19 |
TAF6 | 12.84 | 0.001636 | 1075 | 478 | 1.17 |
SUMO2 | 12.6 | 0.001704 | 2968 | 2112 | 0.49 |
UBQLN4 | 12.59 | 0.001713 | 1881 | 905 | 1.05 |
COPS5 | 12.45 | 0.001743 | 1604 | 1184 | 0.44 |
CEP250 | 12.41 | 0.001747 | 673 | 268 | 1.33 |
EZH2 | 12.15 | 0.001802 | 351 | 49 | 2.84 |
CDK1 | 11.69 | 0.001892 | 462 | 34 | 3.78 |
TCF3 | 11.35 | 0.00199 | 980 | 415 | 1.24 |
CDC20 | 11.28 | 0.002011 | 519 | 20 | 4.7 |
MCM2 | 10.9 | 0.002126 | 1220 | 174 | 2.81 |
GRB2 | 10.64 | 0.002233 | 3488 | 2608 | 0.42 |
WHSC1 | 10.5 | 0.002288 | 1291 | 447 | 1.53 |
MYOD1 | 10.5 | 0.002292 | 2 | 1 | 1.22 |
AURKB | 10.48 | 0.002305 | 169 | 14 | 3.61 |
SUMO1 | 10.36 | 0.002373 | 2520 | 2982 | −0.24 |
The intersection of the total-stage liver cancer with our previous result.
Protein | CRV (this study) | p value (this study) | CRV (previous result) | p value (previous result) |
---|---|---|---|---|
BUB1B | 7.58 | 0.005833 | 5.5696 | 0.00064 |
CDC20 | 11.28 | 0.002011 | 5.1507 | 0.00109 |
CDK2 | 6.9 | 0.007708 | 14.069 | < |
CUL3 | 8.16 | 0.004478 | 12.9519 | < |
E2F1 | 10.19 | 0.002497 | 3.9947 | 0.00862 |
ESR1 | 23.79 | 0.000652 | 10.3758 | < |
HDAC4 | 10.11 | 0.002531 | 5.8397 | 0.00048 |
HGS | 15.2 | 0.001329 | 4.6929 | 0.00232 |
MYC | 7.55 | 0.005884 | 10.7821 | < |
PCNA | 13.58 | 0.001572 | 15.1438 | < |
PRKDC | 8.19 | 0.004418 | 5.9369 | 0.00041 |
SMARCA4 | 13.21 | 0.001611 | 7.7449 | 0.0001 |
SUMO1 | 10.36 | 0.002373 | 15.8533 | < |
TRAF2 | 14.28 | 0.001461 | 4.7703 | 0.00207 |
UBC | 25.19 | 0.000588 | 137.284 | < |
We first analyzed the pathways of the total stage of liver cancer using the DAVID database. As stated above, the key aim of this research was to find the evolutionary mechanisms of liver cancer from the early to the late stage, but because the number of NGS samples was small, we had to combine the early and late stages to see the overall picture of the liver cancer network. The five key pathways we were interested in, which were selected by these 74 key proteins, are listed as follows: (1) 14 proteins in hsa04110 were associated with the cell cycle (Figure
(a) The pathways analysis for 74 significant proteins in the total stage carcinogenesis. (b) The pathway analysis and gene set enrichment analysis of the 74 proteins of total-stage liver cancer on (
Rank | Term | Count | Symbol | p value |
---|---|---|---|---|
KEGG | ||||
1 | hsa04110:Cell cycle | 14 | E2F1, CDK1, PRKDC, CDC20, MCM2, MCM3, YWHAE, MCM4, CDK2, MCM7, GSK3B, PCNA, BUB1B, MYC | |
2 | hsa05215:Prostate cancer | 7 | E2F1, HSP90AB1, GRB2, GSK3B, MAPK3, PTEN, CDK2 | |
3 | hsa04114:Oocyte meiosis | 7 | CDK1, PPP1CA, MAPK3, CDC20, AURKA, YWHAE, CDK2 | |
4 | hsa03030:DNA replication | 5 | MCM7, PCNA, MCM2, MCM3, MCM4 | |
5 | hsa05200:Pathways in cancer | 10 | E2F1, HSP90AB1, TRAF2, GRB2, MSH2, GSK3B, MAPK3, MYC, PTEN, CDK2 | |
6 | hsa05213:Endometrial cancer | 5 | GRB2, GSK3B, MAPK3, MYC, PTEN | |
7 | hsa05210:Colorectal cancer | 5 | GRB2, MSH2, GSK3B, MAPK3, MYC | 0.002457 |
8 | hsa05222:Small cell lung cancer | 5 | E2F1, TRAF2, MYC, PTEN, CDK2 | 0.002457 |
9 | hsa05214:Glioma | 4 | E2F1, GRB2, MAPK3, PTEN | 0.008946 |
10 | hsa04722:Neurotrophin signaling pathway | 5 | GRB2, GSK3B, MAPK3, YWHAE, TP73 | 0.009832 |
11 | hsa04115:p53 signaling pathway | 4 | CDK1, PTEN, CDK2, TP73 | 0.011029 |
12 | hsa05220:Chronic myeloid leukemia | 4 | E2F1, GRB2, MAPK3, MYC | 0.014381 |
13 | hsa04914:Progesterone-mediated oocyte maturation | 4 | HSP90AB1, CDK1, MAPK3, CDK2 | 0.020705 |
14 | hsa04012:ErbB signaling pathway | 4 | GRB2, GSK3B, MAPK3, MYC | 0.021345 |
15 | hsa04540:Gap junction | 4 | CDK1, GRB2, MAPK3, TUBA1B | 0.022657 |
16 | hsa05219:Bladder cancer | 3 | E2F1, MAPK3, MYC | 0.033366 |
17 | hsa04510:Focal adhesion | 5 | PPP1CA, GRB2, GSK3B, MAPK3, PTEN | 0.047655 |
18 | hsa05223:Non-small cell lung cancer | 3 | E2F1, GRB2, MAPK3 | |
19 | hsa05221:Acute myeloid leukemia | 3 | GRB2, MAPK3, MYC | |
20 | hsa04910:Insulin signaling pathway | 4 | PPP1CA, GRB2, GSK3B, MAPK3 | |
21 | hsa05218:Melanoma | 3 | E2F1, MAPK3, PTEN | |
22 | hsa04662:B cell receptor signaling pathway | 3 | GRB2, GSK3B, MAPK3 | |
BioCarta | ||||
1a | h_g1Pathway:Cell Cycle:G1/S Check Point | 4 | E2F1, CDK1, GSK3B, CDK2 | 0.012616 |
2a | h_p27Pathway:Regulation of p27 Phosphorylation during Cell Cycle Progression | 3 | E2F1, NEDD8, CDK2 | 0.020162 |
3a | h_p53Pathway:p53 Signaling Pathway | 3 | E2F1, PCNA, CDK2 | 0.033691 |
4a | h_her2Pathway:Role of ERBB2 in Signal Transduction and Oncology | 3 | GRB2, MAPK3, ESR1 | 0.049866 |
5a | h_ptenPathway:PTEN dependent cell cycle arrest and apoptosis | 3 | GRB2, MAPK3, PTEN | |
6a | h_RacCycDPathway:Influence of Ras and Rho proteins on G1 to S Transition | 3 | E2F1, MAPK3, CDK2 | |
7a | h_cellcyclePathway:Cyclins and Cell Cycle Regulation | 3 | E2F1, CDK1, CDK2 | |
The significant pathways via DAVID Bioinformatics database are selected for the 74 significant proteins in carcinogenesis. Bold indicates
GO:term | p value | Corrected p value | | | | | Term name |
---|---|---|---|---|---|---|---|
( | |||||||
GO:0051320 | | | 6357 | 9 | 12 | 4 | S phase |
GO:0000084 | | | 6357 | 9 | 12 | 4 | S phase of mitotic cell cycle |
GO:0006267 | | | 6357 | 9 | 16 | 4 | Prereplicative complex assembly |
GO:0065004 | | | 6357 | 9 | 68 | 5 | Protein-DNA complex assembly |
GO:0000727 | | | 6357 | 9 | 26 | 4 | Double-strand break repair via break-induced replication |
GO:0006270 | | | 6357 | 9 | 28 | 4 | DNA-dependent DNA replication initiation |
GO:0022616 | | | 6357 | 9 | 31 | 4 | DNA strand elongation |
GO:0006271 | | | 6357 | 9 | 31 | 4 | DNA strand elongation involved in DNA replication |
GO:0000724 | | | 6357 | 9 | 38 | 4 | Double-strand break repair via homologous recombination |
GO:0022402 | | | 6357 | 9 | 445 | 7 | Cell cycle process |
( | |||||||
GO:0042555 | | | 6357 | 9 | 6 | 4 | MCM complex |
GO:0005656 | | | 6357 | 9 | 16 | 4 | Prereplicative complex |
GO:0031261 | | | 6357 | 9 | 22 | 4 | DNA replication preinitiation complex |
GO:0031298 | | | 6357 | 9 | 25 | 4 | Replication fork protection complex |
GO:0032993 | | | 6357 | 9 | 46 | 4 | Protein-DNA complex |
GO:0043234 | | | 6357 | 9 | 1369 | 9 | Protein complex |
GO:0044454 | | | 6357 | 9 | 217 | 5 | Nuclear chromosome part |
GO:0044451 | | | 6357 | 9 | 275 | 5 | Nucleoplasm part |
GO:0044428 | | | 6357 | 9 | 1251 | 8 | Nuclear part |
GO:0044427 | | 0.0011 | 6357 | 9 | 302 | 5 | Chromosomal part |
( | |||||||
GO:0043566 | | | 6357 | 9 | 85 | 6 | Structure-specific DNA binding |
GO:0043138 | | | 6357 | 9 | 18 | 4 | 3′-5′ DNA helicase activity |
GO:0004003 | | | 6357 | 9 | 26 | 4 | ATP-dependent DNA helicase activity |
GO:0003682 | | | 6357 | 9 | 80 | 5 | Chromatin binding |
GO:0003688 | | | 6357 | 9 | 30 | 4 | DNA replication origin binding |
GO:0003678 | | | 6357 | 9 | 39 | 4 | DNA helicase activity |
GO:0003697 | | | 6357 | 9 | 44 | 4 | Single-stranded DNA binding |
GO:0016887 | | | 6357 | 9 | 278 | 6 | ATPase activity |
GO:0043140 | | | 6357 | 9 | 13 | 3 | ATP-dependent 3′-5′ DNA helicase activity |
GO:0008094 | | | 6357 | 9 | 67 | 4 | DNA-dependent ATPase activity |
Overview of significant pathways in network marker of total stage of liver cancer. Among KEGG pathways identified via the DAVID tool (Table
The proteins in the total stage liver cancer network marker are enriched in “hsa04110:Cell cycle” (Rank 1 in Table
The proteins in the total stage liver cancer network marker are enriched in “hsa04114:Oocyte meiosis” (Rank 3 in Table
The proteins in the total stage liver cancer network marker are enriched in “hsa03030:DNA replication” (Rank 4 in Table
The proteins in the total stage liver cancer network marker are enriched in “hsa05200:Pathways in cancer” (Rank 5 in Table
The proteins in the total stage liver cancer network marker are enriched in “hsa04115:p53 signaling pathway” (Rank 11 in Table
The proteins in the total stage liver cancer network marker are enriched in “h_g1Pathway:Cell Cycle: G1/S Check Point” (Rank 1a in Table
The proteins in the total stage liver cancer network marker are enriched in “h_p27Pathway:Regulation of p27 Phosphorylation during Cell Cycle Progression” (Rank 2a in Table
The proteins in the total stage liver cancer network marker are enriched in “h_p53Pathway:p53 Signaling Pathway” (Rank 3a in Table
The proteins in the total stage liver cancer network marker are enriched in “h_her2Pathway:Role of ERBB2 in Signal Transduction and Oncology” (Rank 4a in Table
Because the cell cycle is so crucial to cancer, we list other pathways given by BioCarta (Figures
Dituri et al. showed that changes in the cell cycle checkpoint frequently occur during HCC. They identified pathways different from ours and showed that the phosphatidylinositol-3-kinase (PI3K)/protein kinase B (AKT)/mammalian target of rapamycin (mTOR) pathway is the key pathway for HCC. They used the human PLC/PRF/5, Hep3B, HepG2, HLE, and HLF HCC cell lines and a normal human hepatocyte cell line. Although their group and ours did not identify the same targets, the key point is that an abnormal cell cycle is a complex mechanism. Their scope covered the G0-G1, G2-M, and G1-S phases [
Furuta et al. showed that micro- (mi)RNAs are key posttranscriptional regulators of gene expression and are usually deregulated in HCC. They identified four miRNAs, mir-101, mir-195, mir-378, and mir-497 that are always silenced in HCC. In this research, we did not include miRNAs when building the model [
When we tried to apply a systems biology approach to cancer therapy, understanding cancer hallmarks was the most important and basic step. There are many investigations of cancer systems biology based on Weinberg’s work. Negrini et al. discussed the genomic instability characteristics of cancers and evolving hallmarks of cancer [
Cancer is a network disease, that is, a dysregulation of the entire network. Determining how to build a cancer network is the first important task.
We have briefly reviewed and summarized the key points of Weinberg’s theory. We now focus on DNA replication stress, because it was identified as another hallmark of cancer. Macheret and Halazonetis claimed that the sustained proliferation hallmark can be attributed to mutations in oncogenes and tumor suppressor genes that are involved in the cell growth pathway. Mutations of the TP53, ATM, or MDM2 genes can allow cells to escape apoptosis. They discussed oncogene-induced DNA replication stress and the role it plays as a cancer progression driver [
Four of these five key pathways discussed in the total stage (cell cycle, DNA replication, oocyte meiosis, and p53 signaling) were also selected by both early and late stages, although the proteins involved in these pathways were not the same (Table
(a) The pathways analysis for 43 early stage significant proteins in carcinogenesis. (b) The pathway analysis and gene set enrichment analysis of the 74 proteins of total-stage liver cancer on (
Rank | Term | Count | Symbol | p value |
---|---|---|---|---|
KEGG | ||||
1 | hsa04110:Cell cycle | 11 | E2F1, CCNB1, CDK1, CDKN2A, PCNA, BUB1B, PRKDC, CDC20, MCM2, MCM4, MYC | |
2 | hsa03030:DNA replication | 4 | POLD1, PCNA, MCM2, MCM4 | |
3 | hsa05220:Chronic myeloid leukemia | 4 | E2F1, CDKN2A, GRB2, MYC | 0.004415 |
4 | hsa04114:Oocyte meiosis | 4 | CCNB1, CDK1, CDC20, AURKA | 0.01273 |
5 | hsa05219:Bladder cancer | 3 | E2F1, CDKN2A, MYC | 0.015099 |
6 | hsa05223:Non-small cell lung cancer | 3 | E2F1, CDKN2A, GRB2 | 0.024285 |
7 | hsa05214:Glioma | 3 | E2F1, CDKN2A, GRB2 | 0.03234 |
8 | hsa04115:p53 signaling pathway | 3 | CCNB1, CDK1, CDKN2A | 0.037212 |
9 | hsa03430:Mismatch repair | 2 | POLD1, PCNA | 0.09922 |
BioCarta | ||||
1a | h_cellcyclePathway:Cyclins and Cell Cycle Regulation | 4 | E2F1, CCNB1, CDK1, CDKN2A | 0.002246 |
2a | h_srcRPTPPathway:Activation of Src by Protein-tyrosine phosphatase alpha | 3 | CCNB1, CDK1, GRB2 | 0.004 |
3a | h_g2Pathway:Cell Cycle:G2/M Checkpoint | 3 | CCNB1, CDK1, PRKDC | 0.025668 |
4a | h_g1Pathway:Cell Cycle:G1/S Check Point | 3 | E2F1, CDK1, CDKN2A | 0.039621 |
5a | h_ptc1Pathway:Sonic Hedgehog (SHH) Receptor Ptc1 Regulates cell cycle | 2 | CCNB1, CDK1 | |
The significant pathways via DAVID Bioinformatics database are selected for the 43 significant proteins in carcinogenesis. Bold indicates
GO:term | p value | Corrected p value | | | | | Term name |
---|---|---|---|---|---|---|---|
( | |||||||
GO:0051320 | | 0.0028 | 6357 | 4 | 12 | 2 | S phase |
GO:0000084 | | 0.0028 | 6357 | 4 | 12 | 2 | S phase of mitotic cell cycle |
GO:0006267 | | 0.0052 | 6357 | 4 | 16 | 2 | Prereplicative complex assembly |
GO:0000727 | | 0.0142 | 6357 | 4 | 26 | 2 | Double-strand break repair via break-induced replication |
GO:0006270 | | 0.0165 | 6357 | 4 | 28 | 2 | DNA-dependent DNA replication initiation |
GO:0022616 | | 0.0203 | 6357 | 4 | 31 | 2 | DNA strand elongation |
GO:0006271 | | 0.0203 | 6357 | 4 | 31 | 2 | DNA strand elongation involved in DNA replication |
GO:0000724 | | 0.0306 | 6357 | 4 | 38 | 2 | Double-strand break repair via homologous recombination |
GO:0000725 | | 0.0470 | 6357 | 4 | 47 | 2 | Recombinational repair |
GO:0022403 | | 0.0515 | 6357 | 4 | 286 | 3 | Cell cycle phase |
( | |||||||
GO:0042555 | | | 6357 | 4 | 6 | 2 | MCM complex |
GO:0005656 | | 0.0010 | 6357 | 4 | 16 | 2 | Prereplicative complex |
GO:0031261 | | 0.0020 | 6357 | 4 | 22 | 2 | DNA replication preinitiation complex |
GO:0031298 | | 0.0026 | 6357 | 4 | 25 | 2 | Replication fork protection complex |
GO:0032993 | | 0.0091 | 6357 | 4 | 46 | 2 | Protein-DNA complex |
GO:0000151 | | 0.0171 | 6357 | 4 | 63 | 2 | Ubiquitin ligase complex |
GO:0043234 | 0.0021 | 0.0643 | 6357 | 4 | 1369 | 4 | Protein complex |
GO:0033597 | 0.0025 | 0.0754 | 6357 | 4 | 4 | 1 | Mitotic checkpoint complex |
GO:0031463 | 0.0031 | 0.0942 | 6357 | 4 | 5 | 1 | Cul3-RING ubiquitin ligase complex |
GO:0043596 | 0.0062 | 0.1883 | 6357 | 4 | 10 | 1 | Nuclear replication fork |
( | |||||||
GO:0043138 | | 0.0021 | 6357 | 4 | 18 | 2 | 3′-5′ DNA helicase activity |
GO:0004003 | | 0.0046 | 6357 | 4 | 26 | 2 | ATP-dependent DNA helicase activity |
GO:0003688 | | 0.0061 | 6357 | 4 | 30 | 2 | DNA replication origin binding |
GO:0003678 | | 0.0104 | 6357 | 4 | 39 | 2 | DNA helicase activity |
GO:0003697 | | 0.0133 | 6357 | 4 | 44 | 2 | Single-stranded DNA binding |
GO:0008094 | | 0.0310 | 6357 | 4 | 67 | 2 | DNA-dependent ATPase activity |
GO:0003682 | | 0.0443 | 6357 | 4 | 80 | 2 | Chromatin binding |
GO:0043566 | 0.0010 | 0.0500 | 6357 | 4 | 85 | 2 | Structure-specific DNA binding |
GO:0004842 | 0.0010 | 0.0500 | 6357 | 4 | 85 | 2 | Ubiquitin-protein ligase activity |
GO:0070035 | 0.0010 | 0.0511 | 6357 | 4 | 86 | 2 | Purine NTP-dependent helicase activity |
Overview of significant pathways of network markers of early-stage liver cancer. KEGG pathways accessed using the DAVID tool (Table
(a) The pathways analysis for 80 late stage significant proteins in carcinogenesis. (b) The pathway analysis and gene set enrichment analysis of the 74 proteins of total-stage liver cancer on (
Rank | Term | Count | Symbol | p value |
---|---|---|---|---|
KEGG | ||||
1 | hsa04110:Cell cycle | 16 | E2F1, CDK1, YWHAZ, YWHAB, CDC20, SFN, MCM2, MCM3, CDK4, MCM4, CDK2, MCM7, PCNA, YWHAQ, BUB1B, CCNA2 | |
2 | hsa04114:Oocyte meiosis | 8 | CDK1, YWHAZ, YWHAB, YWHAQ, CDC20, AURKA, PPP1CC, CDK2 | |
3 | hsa03030:DNA replication | 5 | MCM7, PCNA, MCM2, MCM3, MCM4 | |
4 | hsa04115:p53 signaling pathway | 4 | CDK1, SFN, CDK4, CDK2 | 0.020496 |
5 | hsa03040:Spliceosome | 5 | SNRPB, THOC4, LSM2, SF3A2, SF3B4 | 0.022722 |
6 | hsa05222:Small cell lung cancer | 4 | E2F1, TRAF2, CDK4, CDK2 | 0.035432 |
7 | hsa04914:Progesterone-mediated oocyte maturation | 4 | HSP90AB1, CDK1, CCNA2, CDK2 | 0.037607 |
8 | hsa00310:Lysine degradation | 3 | SETDB1, WHSC1, EHMT2 | |
9 | hsa05200:Pathways in cancer | 7 | E2F1, HSP90AB1, TRAF2, MSH2, BIRC5, CDK4, CDK2 | |
BioCarta | ||||
1a | h_p53Pathway:p53 Signaling Pathway | 4 | E2F1, PCNA, CDK4, CDK2 | 0.003378 |
2a | h_cellcyclePathway:Cyclins and Cell Cycle Regulation | 4 | E2F1, CDK1, CDK4, CDK2 | 0.010334 |
3a | h_g1Pathway:Cell Cycle:G1/S Check Point | 4 | E2F1, CDK1, CDK4, CDK2 | 0.015615 |
4a | h_rbPathway:RB Tumor Suppressor/Checkpoint Signaling in response to DNA damage | 3 | CDK1, CDK4, CDK2 | 0.019987 |
5a | h_ranMSpathway:Role of Ran in mitotic spindle regulation | 3 | RAN, AURKA, KPNA2 | 0.019987 |
6a | h_g2Pathway:Cell Cycle:G2/M Checkpoint | 3 | CDK1, YWHAQ, BRCA1 | |
7a | h_RacCycDPathway:Influence of Ras and Rho proteins on G1 to S Transition | 3 | E2F1, CDK4, CDK2 | |
The significant pathways for the 74 significant proteins in carcinogenesis were selected via the DAVID Bioinformatics database. Bold indicates
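Because several proteins recur across the enriched KEGG terms in the pathway table, a simple tally can show which symbols are shared most widely. The sketch below is illustrative only: the gene lists are transcribed from the KEGG rows of that table, while the names `KEGG_ROWS` and `tally_symbols` are hypothetical and the tally is a descriptive count, not part of the authors' analysis.

```python
from collections import Counter

# KEGG rows transcribed from the pathway table: term -> (Count column, Symbol column).
KEGG_ROWS = {
    "hsa04110:Cell cycle": (16, "E2F1, CDK1, YWHAZ, YWHAB, CDC20, SFN, MCM2, MCM3, "
                                "CDK4, MCM4, CDK2, MCM7, PCNA, YWHAQ, BUB1B, CCNA2"),
    "hsa04114:Oocyte meiosis": (8, "CDK1, YWHAZ, YWHAB, YWHAQ, CDC20, AURKA, PPP1CC, CDK2"),
    "hsa03030:DNA replication": (5, "MCM7, PCNA, MCM2, MCM3, MCM4"),
    "hsa04115:p53 signaling pathway": (4, "CDK1, SFN, CDK4, CDK2"),
    "hsa03040:Spliceosome": (5, "SNRPB, THOC4, LSM2, SF3A2, SF3B4"),
    "hsa05222:Small cell lung cancer": (4, "E2F1, TRAF2, CDK4, CDK2"),
    "hsa04914:Progesterone-mediated oocyte maturation": (4, "HSP90AB1, CDK1, CCNA2, CDK2"),
    "hsa00310:Lysine degradation": (3, "SETDB1, WHSC1, EHMT2"),
    "hsa05200:Pathways in cancer": (7, "E2F1, HSP90AB1, TRAF2, MSH2, BIRC5, CDK4, CDK2"),
}


def tally_symbols(rows):
    """Count, for each gene symbol, how many pathway terms list it."""
    counts = Counter()
    for term, (expected, symbols) in rows.items():
        genes = [s.strip() for s in symbols.split(",")]
        # Guard against transcription errors: the list must match the Count column.
        if len(genes) != expected:
            raise ValueError(f"{term}: expected {expected} symbols, got {len(genes)}")
        counts.update(set(genes))
    return counts


symbol_counts = tally_symbols(KEGG_ROWS)
for symbol, n_terms in symbol_counts.most_common(5):
    print(symbol, n_terms)
```

In this tally CDK2 is listed under six of the nine KEGG terms, more than any other symbol, while the spliceosome and lysine degradation terms share no symbols with the other terms.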
GO:term | | Corrected | | | | | Term name |
---|---|---|---|---|---|---|---|
( |
|||||||
GO:0051320 | | | 6357 | 10 | 12 | 4 | S phase |
GO:0000084 | | | 6357 | 10 | 12 | 4 | S phase of mitotic cell cycle |
GO:0006267 | | | 6357 | 10 | 16 | 4 | Prereplicative complex assembly |
GO:0000727 | | | 6357 | 10 | 26 | 4 | Double-strand break repair via break-induced replication |
GO:0006270 | | | 6357 | 10 | 28 | 4 | DNA-dependent DNA replication initiation |
GO:0022616 | | | 6357 | 10 | 31 | 4 | DNA strand elongation |
GO:0006271 | | | 6357 | 10 | 31 | 4 | DNA strand elongation involved in DNA replication |
GO:0000724 | | | 6357 | 10 | 38 | 4 | Double-strand break repair via homologous recombination |
GO:0006260 | | | 6357 | 10 | 120 | 5 | DNA replication |
GO:0000725 | | | 6357 | 10 | 47 | 4 | Recombinational repair |
|
|||||||
( |
|||||||
GO:0042555 | | | 6357 | 10 | 6 | 4 | MCM complex |
GO:0005656 | | | 6357 | 10 | 16 | 4 | Prereplicative complex |
GO:0031261 | | | 6357 | 10 | 22 | 4 | DNA replication preinitiation complex |
GO:0031298 | | | 6357 | 10 | 25 | 4 | Replication fork protection complex |
GO:0032993 | | | 6357 | 10 | 46 | 4 | Protein-DNA complex |
GO:0044454 | | 0.0113 | 6357 | 10 | 217 | 4 | Nuclear chromosome part |
GO:0044451 | | 0.0280 | 6357 | 10 | 275 | 4 | Nucleoplasm part |
GO:0044428 | | 0.0370 | 6357 | 10 | 1251 | 7 | Nuclear part |
GO:0044427 | | 0.0400 | 6357 | 10 | 302 | 4 | Chromosomal part |
GO:0043234 | | 0.0656 | 6357 | 10 | 1369 | 7 | Protein complex |
|
|||||||
( |
|||||||
GO:0043138 | | | 6357 | 10 | 18 | 4 | 3′-5′ DNA helicase activity |
GO:0004003 | | | 6357 | 10 | 26 | 4 | ATP-dependent DNA helicase activity |
GO:0003688 | | | 6357 | 10 | 30 | 4 | DNA replication origin binding |
GO:0043566 | | | 6357 | 10 | 85 | 5 | Structure-specific DNA binding |
GO:0003678 | | | 6357 | 10 | 39 | 4 | DNA helicase activity |
GO:0003697 | | | 6357 | 10 | 44 | 4 | Single-stranded DNA binding |
GO:0043140 | | | 6357 | 10 | 13 | 3 | ATP-dependent 3′-5′ DNA helicase activity |
GO:0008094 | | | 6357 | 10 | 67 | 4 | DNA-dependent ATPase activity |
GO:0003682 | | | 6357 | 10 | 80 | 4 | Chromatin binding |
GO:0070035 | | | 6357 | 10 | 86 | 4 | Purine NTP-dependent helicase activity |
Overview of significant pathways of network markers of late-stage liver cancer. KEGG pathways identified using the DAVID tool (Table
Tian et al. applied a phosphoproteomic analytical method to the highly metastatic HCC cell line MHCC97-H. They reported that phosphoproteins were also found in the spliceosome pathway and that these were related to liver cancer [
We found 15 proteins common to this study and our previous results (Table
Gene expression profiling by microarray has been successful at demonstrating the patterns of mRNAs within tissues and cells. Although microarray platforms showed similar levels of concordance with the RNA-seq data, next-generation sequencing (NGS) technologies provided higher sensitivity, specificity, and accuracy than the microarray platforms [
Liver cancer is the third most deadly cancer worldwide and caused about 700,000 deaths in 2011. This research addressed three topics. The first was to compare this study with our previous research on liver cancer microarray data, because microarray technology may someday be replaced by NGS technology. The two approaches agreed well, with many key proteins identified by both methods. The second was to reveal the carcinogenesis process from the early to the late stage of liver cancer. The pathway specific to the early stage was the mismatch repair pathway, and the pathways specific to the late stage were the spliceosome pathway, lysine degradation pathway, and progesterone-mediated oocyte maturation pathway. This suggests novel directions for choosing different targeted therapeutic strategies at different stages of cancer. Finally, compared with our previous results for bladder cancer, we found that the spliceosome pathway is a significant pathway in the late stage of both liver cancer and bladder cancer. Our future work will focus on the role of this pathway in various cancers and consider it as a potential new drug target for anticancer therapies.
The authors declare that there is no conflict of interests regarding the publication of this paper.
Yung-Hao Wong and Chia-Chou Wu contributed equally to this work.
The authors are grateful for the support provided by the Ministry of Science and Technology with Grants no. MOST-103-2745-E-007-001-ASP and no. 103-2221-E-038-013-MY2.