Breast cancer (BC) is the second most common cancer diagnosed in American women and is also the second leading cause of cancer death in women. Research has focused heavily on BC metastasis, and multiple signaling pathways have been implicated in regulating it. Our knowledge of the regulation of BC metastasis is, however, far from complete, and identifying new factors involved in metastasis is an essential step towards future therapy. Our labs have focused on Semaphorin 6D (SEMA6D), which has been implicated in immune responses, heart development, and neurogenesis. We therefore sought to characterize the SEMA6D-related genomic expression profile and its implications for clinical outcome. In this study, we examined the public datasets of breast invasive carcinoma from The Cancer Genome Atlas (TCGA). We analyzed the expression of SEMA6D along with its related genes, their functions, pathways, and potential as copredictors for BC patients’ survival. We found a 6-gene expression profile that can be used as such predictors. Our study provides evidence, for the first time, that breast invasive carcinoma may contain a subtype based on SEMA6D expression. The expression of the SEMA6D gene may play an important role in promoting patient survival, especially among triple negative breast cancer patients.
Breast cancer (BC) is the second most common cancer diagnosed in American women and is also the second leading cause of cancer death in women [
Semaphorins were initially recognized as phylogenetically conserved neuronal guidance cues, and their critical regulatory roles in BC metastasis have rapidly emerged in recent years. Based on their sequence similarity, Semaphorins are classified into eight classes: classes 1-2 are found in invertebrates, classes 3–7 comprise the vertebrate Semaphorins, and class V is encoded by viruses. Class 2, 3, and V Semaphorins are secreted, while all other members are membrane tethered through a single transmembrane domain [
Published studies regarding functions of Semaphorins in BC have mainly focused on class 3 secreted Semaphorins and SEMA4D [
In this study, we examined the public datasets from The Cancer Genome Atlas (TCGA), National Cancer Institute (NCI), for expression of SEMA6D along with genes that interact with SEMA6D. Other genes coregulated with SEMA6D were analyzed for their function, pathway, and potential as copredictors for BC patients’ survival. We found a 6-gene expression profile that can be used as such predictors. We also found that SEMA6D expression correlated with the status of the triple negative breast cancer (TNBC) markers (ER, PR, and HER2). The study shows the role of SEMA6D as a potential survival predictor, especially in TNBC patients.
The Cancer Genome Atlas (TCGA) Data Portal was used to download breast invasive carcinoma (BRCA) samples (
Gene-level normalized expression data were imported into Partek Genomics Suite (PGS, St. Louis, MO) for additional normalization, statistics, and annotation. Analysis of variance (ANOVA) was used for group comparisons. False discovery rate (FDR) correction (Benjamini-Hochberg method) was applied to account for multiple hypothesis testing. Other tools, such as SAS (Cary, NC) and Ingenuity Pathway Analysis (IPA, Redwood City, CA), were used for statistical analysis, pathway analysis, and building the gene-gene interaction network. The heatmap was generated by hierarchical clustering after z-normalization.
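The Benjamini-Hochberg adjustment mentioned above can be illustrated with a minimal sketch. This is not the Partek implementation; the function name `benjamini_hochberg`, the example p-values, and the 0.05 cutoff are assumptions made only for illustration.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical per-gene ANOVA p-values (illustrative, not from the study).
example_p = [0.001, 0.008, 0.039, 0.041, 0.20]
q_values = benjamini_hochberg(example_p)
# Assumed cutoff of 0.05 on the adjusted values.
significant = [q <= 0.05 for q in q_values]
```

In this toy input, two raw p-values just under 0.05 no longer pass after adjustment, which is the behavior the correction is meant to produce when many genes are tested at once.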
A total of 140 patients with available clinical outcome data (survival status, months of survival, demographics, ER, PR, and HER2 status, etc.) were included in the analysis. Among the genes that differed significantly between the SEMA6D-high and SEMA6D-low expression groups, we selected the top 20 genes with the highest or lowest expression levels to correlate with clinical outcomes. A base-2 logarithmic transformation was applied to each gene prior to any analysis. The correlation among these 20 genes was evaluated using the Pearson correlation coefficient, and summary statistics were presented, including mean with standard deviation, median, and range. Associations between gene expression level and overall survival (OS) were assessed with Kaplan-Meier (K-M) curves and log-rank tests. For the survival analysis, each gene was dichotomized as above or below its median expression level. Significance was determined at the 5% type I error level. Multiple comparisons were not explicitly controlled for because of the small sample size and the exploratory nature of the analysis.
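The median dichotomization and log-rank comparison described above can be sketched as follows. This is a simplified two-group log-rank test written with the standard library only, not the SAS procedure used in the study; the function names and the eight-patient dataset are hypothetical.

```python
import math
from statistics import median


def dichotomize_at_median(values):
    """Label each value 'high' if above the median, otherwise 'low'."""
    cut = median(values)
    return ["high" if v > cut else "low" for v in values]


def log_rank_test(times, events, groups):
    """Two-group log-rank test; returns (chi-square, p-value) with 1 df.

    times  : follow-up time per patient (e.g., months)
    events : 1 if death was observed, 0 if censored
    groups : 'high' or 'low' per patient
    """
    event_times = sorted({t for t, e in zip(times, events) if e == 1})
    obs_minus_exp = 0.0
    variance = 0.0
    for t in event_times:
        # Patients still at risk just before time t.
        n_total = sum(1 for x in times if x >= t)
        n_high = sum(1 for x, g in zip(times, groups) if x >= t and g == "high")
        # Deaths occurring exactly at time t.
        d_total = sum(1 for x, e in zip(times, events) if x == t and e == 1)
        d_high = sum(1 for x, e, g in zip(times, events, groups)
                     if x == t and e == 1 and g == "high")
        frac_high = n_high / n_total
        obs_minus_exp += d_high - d_total * frac_high
        if n_total > 1:
            variance += (d_total * frac_high * (1 - frac_high)
                         * (n_total - d_total) / (n_total - 1))
    if variance == 0:
        return 0.0, 1.0
    chi_square = obs_minus_exp ** 2 / variance
    # Upper tail of a chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi_square / 2))
    return chi_square, p_value


# Hypothetical log2 expression, follow-up months, and death indicators.
log2_expr = [7.2, 3.1, 6.8, 2.9, 8.0, 3.5, 7.5, 2.2]
months = [60, 12, 48, 20, 72, 8, 55, 15]
died = [0, 1, 1, 1, 0, 1, 0, 1]

groups = dichotomize_at_median(log2_expr)
chi_square, p_value = log_rank_test(months, died, groups)
```

Splitting at the median keeps the two groups close to equal in size, which is convenient for a small exploratory cohort, at the cost of discarding information about how far each patient lies from the cut point.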
Semaphorins, including members in subclass 3 and SEMA4D, have emerged as critical signaling molecules in regulating BC pathology [
Principal component analysis (PCA) of all samples.
Based on the genes that are differentially expressed in SEMA6D-high versus SEMA6D-low expression groups, we then performed a hierarchical clustering (Figure
Hierarchical clustering of significant genes of SEMA6D-H versus -L expression. Genes (vertical: high expression in red and low expression in green) and samples (horizontal: SEMA6D-high in green, SEMA6D-medium in blue, and SEMA6D-low in brown) were clustered based on Euclidean dissimilarity matrix.
Consistent with the PCA, the samples with high SEMA6D expression clustered together in the lower part of the figure, which indicates a clear separation of samples based on SEMA6D expression. In other words, BC samples may contain a subtype with high SEMA6D expression.
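The two ingredients named for the heatmap, z-normalization and a Euclidean dissimilarity matrix, can be sketched in a few lines. This is only an illustration of the inputs to hierarchical clustering, not the Partek workflow; the gene labels, values, and the use of the population standard deviation are assumptions.

```python
import math
from statistics import mean, pstdev


def z_normalize(values):
    """Center one gene's values to mean 0 and scale to unit standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)  # population SD; an assumption for this sketch
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def euclidean_distance_matrix(profiles):
    """Pairwise Euclidean distances between equal-length profiles."""
    n = len(profiles)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = math.dist(profiles[i], profiles[j])
            dist[i][j] = dist[j][i] = d
    return dist


# Hypothetical expression matrix: rows are genes, columns are four samples.
expression = {
    "gene_a": [8.0, 7.5, 2.0, 2.5],
    "gene_b": [1.0, 1.5, 6.0, 6.5],
}

# z-normalize each gene across samples, then read off one profile per sample.
z_rows = [z_normalize(row) for row in expression.values()]
sample_profiles = [list(column) for column in zip(*z_rows)]
sample_distances = euclidean_distance_matrix(sample_profiles)
```

A clustering routine would then merge the samples with the smallest distances first; here the first two samples lie much closer to each other than to the last two, which is the kind of separation the heatmap displays.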
We further examined SEMA6D levels by including SEMA6D-medium versus SEMA6D-low expression group comparison. As shown in the Venn diagram in Figure
Number of significant genes between the two comparisons: H versus L and M versus L. FC: fold change.
Among significant genes of SEMA6D-high versus SEMA6D-low comparison,
Biological process: SEMA6D high versus low comparison.
Biological process | Enrichment score | Enrichment |
---|---|---|
Multicellular organismal development | 34.34 | |
G-protein coupled receptor protein signaling pathway | 31.13 | |
Cell adhesion | 19.93 | |
Nervous system development | 19.92 | |
Mitotic cell cycle | 19.76 | |
Cell division | 18.62 | |
Mitosis | 18.54 | |
M phase of mitotic cell cycle | 18.29 | |
Ion transport | 17.24 | |
Response to drug | 16.08 | |
The semaphorins and their receptors, the neuropilins and the Plexins, are constituents of a complex regulatory system that controls axonal guidance [
Molecular function: SEMA6D high versus low comparison.
Molecular function | Enrichment score | Enrichment |
---|---|---|
Receptor activity | 22.90 | |
Sequence-specific DNA binding | 21.27 | |
Voltage-gated sodium channel activity | 20.10 | |
Signal transducer activity | 17.14 | |
Calcium ion binding | 16.56 | |
Heparin binding | 15.78 | |
Voltage-gated ion channel activity | 14.69 | |
Receptor binding | 14.43 | |
G-protein coupled receptor activity | 12.60 | |
Sequence-specific DNA binding TF activity | 11.60 | |
In cell adhesion, cell-cell interactions between cancer cells and the endothelium determine the metastatic spread. There are two major families of cell adhesion molecules, the selectins and the integrins, and accumulating evidence confirms that tumor cell interactions through them actively contribute to the metastatic spread of tumor cells [
Activation of SEMA6D and transcription. The gene-gene interaction network was built from direct interactions using the Ingenuity Pathway Analysis (IPA) suite. Red indicates increased expression in SEMA6D-high samples compared with SEMA6D-low samples. The numbers indicate the fold changes for this comparison.
As reported, Plexin-B1 is a receptor for the transmembrane semaphorin SEMA4D (CD100) [
Expression of top EMT-related genes in SEMA6D-H versus L comparison.
| Symbol | Description | | Fold change | Fold (description) |
|---|---|---|---|---|
| | Matrix metallopeptidase 9 (gelatinase B, 92 kDa gelatinase, 92 kDa type IV collagenase) | | | H down versus L |
| | Transmembrane protein 132A | | | H down versus L |
| | Bone Morphogenetic Protein 7 | | | H down versus L |
| | Desmocollin 2 | | | H down versus L |
| | Hypoxanthine phosphoribosyltransferase 1 | | | H down versus L |
| | Keratin 19 | | | H down versus L |
| | Secreted phosphoprotein 1 | | | H down versus L |
| | PPPDE peptidase domain containing 2 | | | H down versus L |
| | Keratin 7 | | | H down versus L |
| | Cadherin 1, type 1, E-cadherin (epithelial) | | | H down versus L |
| | Collagen, type III, alpha 1 | | | H up versus L |
| | Matrix metallopeptidase 2 (gelatinase A, 72 kDa gelatinase, 72 kDa type IV collagenase) | | | H up versus L |
| | Snail homolog 2 ( | | | H up versus L |
| | Microphthalmia-associated transcription factor | | | H up versus L |
| | Transcription factor 4 | | | H up versus L |
| | AHNAK nucleoprotein | | | H up versus L |
| | Zinc finger E-box binding homeobox 2 | | | H up versus L |
| | Zinc finger E-box binding homeobox 1 | | | H up versus L |
| | Guanine nucleotide binding protein (G protein), gamma 11 | | | H up versus L |
Correlation of SEMA6D with EMT gene expressions.
MMP family proteins, especially MMP9, were suggested to be involved in the process of metastasis of breast cancer to the brain [
On the other hand, our results also showed an increased expression among SEMA6D-high samples of some important tumor promoters such as ZEB1/2, which had been reported to promote EMT by modulating Zeb1/2 and TGF
Canonical signaling pathways associated with high SEMA6D expression.
Pathway name | | Ratio |
---|---|---|
cAMP-mediated signaling | | 53/222 |
G-Protein coupled receptor signaling | | 54/265 |
Granulocyte adhesion and diapedesis | | 41/176 |
Agranulocyte adhesion and diapedesis | | 41/187 |
Gαs signaling | | 27/119 |
Correlation of gene expression with patients’ survival.
| Variable | N | Mean | SD | Median | Min | Max | Log-rank |
|---|---|---|---|---|---|---|---|
| | 140 | 7.15 | 1.90 | 6.95 | 2.11 | 12.09 | 0.0156* |
| | 127 | 2.12 | 1.93 | 2.36 | −1.87 | 5.73 | 0.0308* |
| | 139 | 5.65 | 2.67 | 5.79 | 0.18 | 10.77 | 0.0564 |
| | 134 | 2.50 | 2.03 | 2.67 | −2.02 | 6.94 | 0.0019* |
| | 140 | 7.99 | 0.91 | 7.87 | 6.36 | 10.46 | 0.0397* |
| | 140 | 4.96 | 1.56 | 4.83 | 1.19 | 8.86 | 0.0003* |
| | 140 | 11.61 | 0.99 | 11.74 | 9.39 | 14.51 | 0.0162* |
| | 140 | 7.71 | 1.93 | 7.36 | 2.81 | 11.60 | 0.0709 |
OS: overall survival, stratified by high (>median) or low (<median) expression *
SEMA6D correlates with patient survival.
In addition, increased expression of SEMA6D, CLEC9A, COL4A6, and C10orf107 is associated with better survival, while decreased expression of DONSON, CHAC1, TUBA1C, and CBX2 also correlates with better survival. Figure
Interaction of SEMA6D with TNBC in patients’ survival.
These results strongly suggest that SEMA6D expression levels correlate with overall survival (Figure
Our study provides evidence that breast invasive carcinoma (BRCA) may contain a subtype based on SEMA6D expression. The expression of the SEMA6D gene may play an important role in promoting patient survival, especially among triple negative breast cancer (TNBC) patients.
TNBC: Triple negative breast cancer
BRCA: Breast invasive carcinoma
BC: Breast cancer
SEMA6D: Semaphorin 6D
BMP: Bone Morphogenetic Protein
EMT: Epithelial-mesenchymal transition
PCA: Principal component analysis
GO: Gene Ontology
TCGA: The Cancer Genome Atlas.
The authors declare that there is no conflict of interests regarding the publication of this paper.
Dongquan Chen and Yufeng Li contributed equally to the study.
The study was partially supported by institutional funding from the University of Alabama at Birmingham (UAB) to Dongquan Chen, a Faculty Development Grant from the UAB Comprehensive Cancer Center to Kai Jiao, an R01 (HL095783) to Kai Jiao, and an R21 (CA179282) to Lizhong Wang.