TP53 Mutation, Epithelial-Mesenchymal Transition, and Stemlike Features in Breast Cancer Subtypes

Altered p53 protein is prevalently associated with the pathologic class of triple-negative breast cancers, and loss of p53 function has recently been linked to the induction of an epithelial-mesenchymal transition (EMT) and the acquisition of stemness properties. We explored the association between TP53 mutational status and the expression of selected genes involved in the canonical TGF-β signaling pathway (the most potent EMT inducer) and in two early EMT-associated events: loss of cell polarity and acquisition of stemness-associated features. We used a publicly accessible microarray dataset consisting of 251 p53-sequenced primary breast cancers. Statistical analysis indicated that mutant p53 tumors (especially those harboring a severe mutation) were consistent with the aggressive class of triple-negative cancers and that, unlike cell cultures, surgical tumors underexpressed some TGF-β-related transcription factors known to be involved in EMT (ID1, ID4, SMAD3, SMAD4, SMAD5, ZEB1). These unexpected findings suggest an interesting relationship between p53 mutation, mammary cell dedifferentiation, and the concomitant acquisition of stemlike properties (as indicated by the overexpression of the PROM1 and NOTCH1 genes), which enhance tumor cell aggressiveness, as indicated by the overexpression of genes associated with cell proliferation (CDK4, CDK6, MKI67) and migration (CXCR4, MMP1).


Introduction
The TP53 tumor suppressor is the most commonly altered gene in human breast cancer, where it is mutated in about 30-40% of cases [1]. TP53 gene mutations result in altered and stable p53 proteins that act in a dominant-negative manner with gain-of-function properties, including drug resistance, and contribute to malignant progression with detrimental effects on patient outcome [2]. In particular, clinical evidence indicates that altered p53 proteins are prevalently associated with the pathologic class of triple-negative breast cancers, that is, tumors characterized by the immunohistochemical expression of basal cytokeratins and epidermal growth factor receptor (EGFR), but negative for estrogen receptor (ER), progesterone receptor (PR), and HER2 expression [3][4][5].
Recently, triple-negative tumors have also been associated with a new, less common subtype known as claudin-low [6].
Wild-type p53 functions as a sequence-specific DNA-binding transcription factor that regulates a plethora of target genes involved in DNA repair, cell cycle control, apoptosis, senescence, angiogenesis, and other fundamental biological processes [7]. Therefore, it is not surprising that mutations of the TP53 gene or inactivations of its signaling pathway are a prerequisite for the development of tumors. Most mutations occur within the central DNA-binding domain (exons 5-8) and, in particular, at several specific amino acids required for DNA binding. According to the type of mutation (point mutation, deletion, insertion, or stop codon), p53 protein synthesis may be totally inhibited or may generate functionally altered molecules that affect cell homeostasis in different ways [8]. In fact, it is well known that not all p53 mutations have equal effects: some confer loss of function, others have a dominant-negative effect, and still others are classified as wild-type-like proteins and represent mutant forms with a limited biological effect [9,10].
Recently, several papers have provided experimental evidence linking loss of p53 function to the induction of epithelial-mesenchymal transition (EMT) and the acquisition of stemness properties in different tumor cell lines [11][12][13]. EMT is a key program in embryonic development, the aberrant reactivation of which may induce progression, invasion, dissemination, and finally metastasis in cancer cells. The most evident features of a cell undergoing EMT are loss of the epithelial phenotype and acquisition of mesenchymal features and abnormal motility [14,15]. The most potent inducer of EMT is transforming growth factor-β (TGF-β), which triggers the activity of several transcription factors (ZEB1/TCF8, ZEB2/SIP1, Snail, Slug, Twist, and Ids), which in turn repress the expression of genes coding for epithelial markers and activate the expression of mesenchymal genes [16]. According to recent findings, p53 should prevent EMT by repressing ZEB1 and ZEB2 expression via miRNA activity. Consequently, p53 loss of function should downregulate miRNA expression, thereby permitting the transactivation of transcription factors promoting EMT and the emergence of tumor cells with stemlike properties [17][18][19].
So far, despite the availability of a huge amount of information on transcript profiles from microarray analysis, the interrelations among p53 mutations and genes involved in EMT have not been specifically assessed. Therefore, to investigate the association between TP53 mutational status and the EMT process, we interrogated a publicly accessible microarray dataset consisting of 251 p53-sequenced primary breast cancers [20]. Adopting an unconventional approach, we did not use the whole transcript profile; instead, we selected a priori a panel of genes experimentally recognized as involved in the canonical TGF-β signaling pathway and in two early events associated with EMT: loss of epithelial cell polarity and acquisition of stemness-associated features. To delineate a more comprehensive picture of the relationship among p53 mutation, EMT, and tumor aggressiveness, we also considered some genes involved in cell proliferation, apoptosis, and metastatic spread.

Materials.
As reported in the original paper [20], gene expression profiles were determined using the Affymetrix Human Genome HG-U133A and -B GeneChips, and the microarray dataset was available at the ArrayExpress website (http://www.ebi.ac.uk/arrayexpress/) under the accession number E-GEOD-3494. Patient and tumor characteristics were provided as supporting information in the original paper [20].

Gene Selection.
In line with the aim of the study, we selected 147 genes (Table 1). Specifically, the panel was composed of 27 genes recognized as involved in TGF-β-induced EMT, 57 involved in epithelial cell plasticity, 13 coding for stemlike properties, and 31 involved in cell proliferation, apoptosis, and metastatic spread. In addition, to describe breast cancer subtypes, 19 genes coding for luminal and basal markers were also considered. The 147 genes corresponded to 352 Affymetrix probe sets, as verified by the GeneAnnot system v2.0 (http://bioinfo2.weizmann.ac.il/geneannot/), which additionally provided information about the quality of each probe set in terms of sensitivity and specificity scores [21] (Supplementary Table 1).

Statistical Analysis.
Because some genes are recognized by more than one probe set, each characterized by its own sensitivity and specificity, which contribute differently to the gene expression value, a mean gene expression value was calculated after weighting each probe set by its own sensitivity and specificity scores. Specifically, each expression value (already log2 transformed in the original dataset) was multiplied by the semisum (half the sum) of the sensitivity and specificity scores of the corresponding probe set.
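The weighting described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name and the toy probe-set values are hypothetical, and it assumes that the gene-level value is the plain arithmetic mean of the weighted probe-set values (the text does not state whether the mean is further normalized by the sum of the weights).

```python
from statistics import mean

def weighted_gene_expression(probe_sets):
    """Combine the probe sets of one gene in one sample.

    probe_sets: list of (log2_expression, sensitivity, specificity) tuples.
    Each expression value is multiplied by the semisum of its sensitivity
    and specificity scores; the weighted values are then averaged
    (assumption: plain mean, no renormalization by the weights).
    """
    weighted = [x * (sens + spec) / 2.0 for x, sens, spec in probe_sets]
    return mean(weighted)

# Hypothetical gene measured by two probe sets of different quality.
toy_probe_sets = [(8.0, 1.0, 1.0), (6.0, 0.5, 0.5)]
toy_value = weighted_gene_expression(toy_probe_sets)
```

With this scheme a low-quality probe set contributes a proportionally smaller value to the gene-level mean.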
Prediction Analysis for Microarray (PAM) analysis was used to identify genes associated with the TP53 mutation status. PAM methodology minimizes the classification error using cross validation. For the selected genes, shrunken centroids across the different mutation groups were plotted. The FDR level was estimated through a permutation method.
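To make the shrunken-centroid idea concrete, the sketch below computes the thresholded class-versus-overall differences that PAM plots as bars. It assumes one common formulation of the method (pooled within-class standard deviation, offset equal to its median, soft thresholding); the function name, toy data, and threshold are hypothetical, and in practice the threshold would be chosen by cross validation rather than fixed by hand.

```python
from math import sqrt, copysign
from statistics import mean, median

def shrunken_differences(X, y, delta):
    """Return {class: [d'_kj]} for samples X (rows) with labels y.

    d_kj  = (class centroid - overall centroid) / (m_k * (s_j + s0))
    d'_kj = sign(d_kj) * max(|d_kj| - delta, 0)   (soft thresholding)
    Genes whose d'_kj is zero in every class do not contribute
    to the classification.
    """
    n, p = len(X), len(X[0])
    classes = sorted(set(y))
    overall = [mean(row[j] for row in X) for j in range(p)]
    members = {k: [row for row, lab in zip(X, y) if lab == k] for k in classes}
    centroid = {k: [mean(r[j] for r in rows) for j in range(p)]
                for k, rows in members.items()}
    # Pooled within-class standard deviation of each gene.
    s = [sqrt(sum((row[j] - centroid[lab][j]) ** 2 for row, lab in zip(X, y))
              / (n - len(classes))) for j in range(p)]
    s0 = median(s)  # offset that guards against genes with tiny variance
    result = {}
    for k in classes:
        m_k = sqrt(1.0 / len(members[k]) - 1.0 / n)
        result[k] = []
        for j in range(p):
            d = (centroid[k][j] - overall[j]) / (m_k * (s[j] + s0))
            result[k].append(copysign(max(abs(d) - delta, 0.0), d))
    return result

# Hypothetical data: gene 0 differs between groups, gene 1 does not.
toy_X = [[1.0, 5.0], [1.2, 5.2], [0.8, 4.8],
         [3.0, 5.1], [3.2, 4.9], [2.8, 5.0]]
toy_y = ["wt", "wt", "wt", "mut", "mut", "mut"]
toy_d = shrunken_differences(toy_X, toy_y, delta=2.0)
```

In this toy case only gene 0 survives the threshold, with a negative value for the wild-type group and a positive value for the mutant group.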
To identify the tumors characterized by a similar intrinsic phenotype (tumor subtypes), unsupervised hierarchical cluster analysis was performed using the subset of genes coding for luminal and basal markers, HER-2, and claudins. The choice of the number of clusters to be used was supported by mean silhouette values [22]. PAM methodology was used to detail differential gene expression among clusters [23].
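As an illustration of how mean silhouette values can support the choice of the number of clusters, the sketch below scores two candidate partitions of the same points. It is only a sketch under stated assumptions: Euclidean distance is assumed (the text does not specify the metric), and the function name and toy coordinates are hypothetical.

```python
from math import dist
from statistics import mean

def mean_silhouette(points, labels):
    """Mean silhouette width of a partition (Euclidean distance assumed)."""
    clusters = {}
    for point, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(point)
    scores = []
    for point, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)  # convention for singleton clusters
            continue
        # a: mean distance to the other members of the same cluster
        a = sum(dist(point, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(mean(dist(point, q) for q in other)
                for k, other in clusters.items() if k != lab)
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return mean(scores)

# Hypothetical points forming two well-separated groups.
toy_points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
two_clusters = mean_silhouette(toy_points, [0, 0, 0, 1, 1, 1])
three_clusters = mean_silhouette(toy_points, [0, 0, 1, 2, 2, 2])
```

The partition with the higher mean silhouette (here, the two-cluster one) would be preferred.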
Principal Components Analysis (PCA) and PCA-based biplots were used to assess gene expression among clusters [24]. Moreover, for evaluating the associations among genes, specific subsets, not used to build PCA, were passively projected over the PCA-based biplots of intrinsic phenotypes.
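The passive projection of supplementary genes can be sketched as below. This is a simplified illustration rather than the authors' procedure: only the first principal component is extracted (by power iteration, used here as a stand-in for a full eigendecomposition), and the coordinate of a passive gene is taken to be its correlation with the component scores, which is one common convention. All names and values are hypothetical.

```python
from math import sqrt

def center(values):
    m = sum(values) / len(values)
    return [v - m for v in values]

def first_component(active_genes, iterations=200):
    """active_genes: list of genes, each a list of values over the samples.
    Returns (loadings, scores) of the first principal component."""
    cols = [center(g) for g in active_genes]
    p, n = len(cols), len(cols[0])
    cov = [[sum(a * b for a, b in zip(cols[i], cols[j])) / (n - 1)
            for j in range(p)] for i in range(p)]
    v = [1.0] * p
    for _ in range(iterations):  # power iteration on the covariance matrix
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    scores = [sum(v[j] * cols[j][i] for j in range(p)) for i in range(n)]
    return v, scores

def passive_coordinate(passive_gene, scores):
    """Correlation between a gene not used to build the PCA and the scores."""
    g, s = center(passive_gene), center(scores)
    return (sum(a * b for a, b in zip(g, s))
            / sqrt(sum(a * a for a in g) * sum(b * b for b in s)))

# Hypothetical data: two active genes that rise together across five tumors
# and one passive gene that falls.
toy_active = [[1.0, 2.0, 3.0, 4.0, 5.0], [1.1, 1.9, 3.2, 3.9, 5.1]]
toy_passive = [5.0, 4.0, 3.0, 2.0, 1.0]
toy_loadings, toy_scores = first_component(toy_active)
toy_coord = passive_coordinate(toy_passive, toy_scores)
```

The passive gene does not influence the component itself; it is only positioned relative to it, which is why a strongly negative coordinate here indicates an association opposite to that of the active genes.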

Results and Discussion
As described in the original paper [20], the case series was composed of 251 tumors, 58 of which were characterized by a TP53 mutation. Of the 58 mutant tumors, 37 had point mutations and 21 had "severe" mutations, that is, insertions (n = 3), deletions (n = 11), and stop codons (n = 7), which result in frameshifts and truncations with deleterious functional consequences.
To identify genes differentially expressed between mutant and wild-type p53 tumors, we first applied a PAM analysis to the overall case series (Figure 1). Consistent with the expected loss of function, mutant p53 tumors were characterized by the underexpression of genes under p53 control (i.e., CDKN1A, coding for p21, and TP53INP1, coding for the p53-inducible nuclear protein 1). In addition, they underexpressed several genes coding for epithelial cell [...] Figure 1). Notably, in both datasets, p53 mutant tumors were associated with the overexpression of PROM1, supporting the experimental evidence for a relation between p53 mutation and the reacquisition of some stemlike properties through an EMT-like process [12,13].
As regards the genes related to the canonical TGF-β signaling pathway, mutant p53 tumors showed the downregulation of many genes coding for pivotal elements of this pathway (ID1, ID4, SMAD3, SMAD4, SMAD5, TGFBR2, TGFBR3, ZEB1) coupled with the overexpression of SMURF2, coding for a SMAD-specific E3 ubiquitin protein ligase, and TGIF2, coding for a transcriptional repressor interacting with TGF-β-activated SMAD proteins (Figure 2). A similar pattern of expression (i.e., downregulation of SMAD2, SMAD3, SMAD5, SMAD7, TGFBR2, and ZEB1, and overexpression of SMURF2 and TGIF2) was found in the Langerod dataset [3] (Supplementary Figure 2). This unexpected finding could be explained by considering that TGF-β is a multifunctional cytokine and a powerful tumor suppressor that governs many aspects of mammary epithelial cell physiology and homeostasis [25]. Consistent with the notion that the estrogen receptor and TGF-β signaling pathways are major regulators during mammary gland development [26], it is not surprising that p53 mutant tumors concomitantly underexpressed ESR1 (coding for ERα), TGFBR2 and TGFBR3 (coding for TGF-β receptors), and ID1, ID4, SMAD3, SMAD4, SMAD5, and ZEB1 (coding for key elements of the pathway).
[Figure panels: AKT2, BMP1, ID1, ID2, ID3, ID4, SMAD2, SMAD3, SMAD4, SMAD5, SMAD6, SMAD7, SMURF1, SMURF2, SNAI1, SNAI2, TCF3, TGFBI, TGFBR1, TGFBR2, TGFBR3, TGIF2, TWIST1, TWIST2, ZEB1; groups: All, WT, mut]
When severe and missense mutations were considered separately, PAM analysis provided evidence that severe TP53 mutations were responsible for the differential gene expression observed in mutant relative to wild-type p53 tumors, even though some important alterations were already present in tumors with missense TP53 mutations, for example, the downregulation of some apical junctional components or the overexpression of genes related to cell proliferation and invasion. As shown in Figure 3 [...]
As regards the association between missense or severe TP53 mutations and EMT-related genes, Figure 4 indicates that, with respect to tumors harboring a missense mutation, those with severe mutations were characterized by the overexpression of the SMURF2, SNAI1, and TGIF2 genes. The overexpression of SNAI1 is of particular interest because of the concomitant overexpression of NOTCH1 pointed out by PAM analysis in tumors with severe TP53 mutations (Figure 3). Indeed, the Notch signaling pathway, which is implicated as an important contributor to EMT in tumorigenesis, has recently been suggested to play a direct role in the expression of the Snail transcription factor [27].
With respect to missense TP53 mutations, severe ones were also characterized by an increased expression of PROM1, the gene encoding prominin, a pentaspan transmembrane glycoprotein (CD133) often overexpressed on cancer cells, where it is thought to maintain stem cell properties by suppressing differentiation. This finding is in agreement with the hypothesis that basal cancers, which have been proposed to have a stem cell origin, are virtually all TP53 mutants and express high levels of PROM1 transcript and protein [28]. Unfortunately, since severe mutations accounted for only three cases in the Langerod dataset [3], we were unable to verify all these observations in an independent dataset.
Figure 3: Shrunken centroids for wild-type TP53 and mutant (missense or severe mutation) TP53 tumors. Left-sided bars indicate lower expression in the subgroups relative to the overall centroid; right-sided bars indicate higher expression in the subgroups relative to the overall centroid.
One of the aims of the study was to explore the relationship among p53 mutation, EMT, and tumor aggressiveness, a peculiar characteristic of certain breast cancer subtypes (especially the basal-like phenotype). To this end, we performed an unsupervised hierarchical cluster analysis, using the subset of genes coding for luminal and basal markers, HER-2, and claudins, and we looked at the distribution of p53 mutations according to tumor subtype. The analysis indicated that mutant p53 tumors were distributed into three main clusters (Figure 5). Of the 58 mutant p53 tumors, 23 were included in Cluster 1, 17 in Cluster 2, and 18 in Cluster 3. However, relative to the total number of tumors in each cluster, only 17% (23/133) of Cluster 1 and 19% (18/95) of Cluster 3 tumors had p53 mutations, whereas 74% (17/23) of Cluster 2 tumors did. Notably, 10 of these 17 mutations were severe mutations.
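The cluster-wise proportions quoted above can be checked with a few lines of arithmetic. The counts are taken directly from the text; only the variable names are introduced here for illustration.

```python
# (mutant p53 tumors, total tumors) per cluster, as reported in the text.
cluster_counts = {
    "Cluster 1": (23, 133),
    "Cluster 2": (17, 23),
    "Cluster 3": (18, 95),
}

# Percentage of mutant tumors within each cluster, rounded to an integer.
mutant_percent = {name: round(100 * mutant / total)
                  for name, (mutant, total) in cluster_counts.items()}

total_mutants = sum(mutant for mutant, _ in cluster_counts.values())
total_tumors = sum(total for _, total in cluster_counts.values())

for name, pct in mutant_percent.items():
    print(f"{name}: {pct}% mutant")  # 17%, 74%, and 19%, respectively
print(f"{total_mutants} mutant tumors out of {total_tumors}")  # 58 out of 251
```

The absolute counts are similar across clusters, so it is the within-cluster percentage that singles out Cluster 2.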
PCA-based biplots, drawn using the same subset of genes as the hierarchical cluster analysis (Figure 6), showed that Cluster 2 tumors were positively associated with genes related to the basal phenotype (KRT5, KRT6A, KRT6B, KRT14, KRT17, EGFR) and with a panel of claudin-coding genes (CLDN1, CLDN6, CLDN10), whereas they were negatively associated with the majority of genes related to the luminal phenotype. Conversely, Cluster 3 tumors were positively associated with genes related to the luminal phenotype (ESR1, GATA3, MUC1, PGR, KRT18) and with a different panel of claudin-coding genes (CLDN3, CLDN4, CLDN7) and negatively associated with genes related to the basal phenotype. Cluster 1 tumors showed a less clear-cut phenotype, consistent with the more heterogeneous nature of this cluster, even though they appeared prevalently associated with genes related to the basal phenotype (KRT5, KRT6B, KRT14, KRT17, TP63). Remarkably, Cluster 2 tumors also showed the concomitant underexpression of the ERBB2 gene, providing evidence that these tumors had a gene expression profile consistent with the pathologic class of triple-negative tumors (Supplementary Figure 3), which are characterized by the expression of basal cytokeratins (mainly Krt5) and EGFR but do not express estrogen receptor, progesterone receptor, or HER2.
When we looked at the expression of EMT-associated genes according to clusters (Figure 7), we found that Cluster 2, consistent with the pathologic class of triple-negative cancers, showed a gene expression profile similar to that of tumors harboring a severe p53 mutation. Conversely, Cluster 3, consistent with the luminal-like phenotype, had a pattern of expression similar to that of wild-type p53 tumors, whereas the phenotypically heterogeneous Cluster 1 resembled the group of tumors with a missense p53 mutation. In particular, Cluster 2 (consistent with the pathologic class of triple-negative cancers and akin to severe p53 mutated tumors) was characterized by the underexpression of SMAD2, SMAD5, ZEB1, and TGFBR3 and the overexpression of SMURF2, TGIF2, and SNAI1, in agreement with the gene profile observed in tumors harboring a severe TP53 mutation (Figure 4).
[Figure panels: AKT1, AKT2, BMP1, ID1, ID2, ID3, ID4, SMAD2, SMAD3, SMAD4, SMAD5, SMAD6, SMAD7, SMURF1, SMURF2, SNAI1, SNAI2, TCF3, TGFBI, TGFBR1, TGFBR2, TGFBR3, TGIF2, TWIST1, TWIST2, ZEB1]
Notably, when the subset of genes related to stemness properties was passively projected over the PCA-based biplots provided in Figure 6, Cluster 2 tumors were positively associated with PROM1 and NOTCH1 and negatively associated with ALDH1A1, BMI1, and NUMB (Figure 8). Cluster 3 tumors showed a similar pattern of association but with the opposite sign. The imbalance in the Numb/Notch pathway observed in Cluster 2 tumors, associated with the overexpression of SNAI1, is of particular interest because of the involvement of this pathway in the differentiation program and in epithelial cancer progression and metastasis. Numb is an evolutionarily conserved protein that plays a critical role in cell-fate determination, including control of asymmetric cell division, endocytosis, cell adhesion, cell migration, and ubiquitination of specific substrates such as p53. Loss of Numb causes increased activity of the oncogene Notch1 and, for this reason, low expression of Numb and high levels of Notch1 have been associated with tumor progression and used as markers of tumor aggressiveness, especially in basal-like breast cancer [29]. The aggressiveness of this group of tumors was corroborated by the observation that Cluster 2 tumors were prevalently poorly differentiated (17/23 tumors were Grade III) with respect to Cluster 1 and Cluster 3 tumors, and by the positive association with genes promoting cell proliferation and metastatic spread. The overexpression of MMP1 and CXCR4 and the downregulation of TIMP1 can be viewed in this context. Indeed, MMP1 encodes a matrix metalloproteinase family member (specifically, a collagenase) involved in the breakdown of extracellular matrix, whereas TIMP1 encodes a specific tissue inhibitor of metalloproteinases, including MMP-1.
Because MMP1 is a target gene of wild-type p53 activity, the functional inactivation of the protein results in MMP1 overexpression, which allows tumor cell migration after degradation of the basement membrane and cell detachment [30,31]. The concomitant overexpression of CXCR4, due to a gain-of-function mutant p53 [32,33], further enhances tumor cell migration and metastatic spread [34]. In fact, CXCR4 encodes a C-X-C motif chemokine receptor specific for stromal cell-derived factor-1 (SDF-1/CXCL12), a member of the family of chemoattractant molecules physiologically involved in the migration of immune cells. The CXCL12/CXCR4 signaling axis is also known to be important for tumor cell migration: CXCR4 expressed on tumor cells provides a means of homing for metastatic cells to target organs [35]. Because of its involvement in tumor dissemination, CXCR4 overexpression has been linked to a poor prognosis in breast cancer patients [35].
Surprising, by contrast, is the negative association, pointed out by PCA-based biplots, between Cluster 2 tumors and BMI1 expression, given the role of the BMI1 gene in the self-renewal of stem cells and as an oncogene in many human cancers, where it induces EMT. Although Bmi1 overexpression has been correlated with poor prognosis in several tumor types, a recent study has indicated that, in breast cancer, high Bmi1 expression is limited to the luminal subtype and is associated with a good outcome [36]. In this light, the positive association that we observed between BMI1 expression and Cluster [...]
[Figure panels: AKT2, BMP1, ID1, ID2, ID3, ID4, SMAD2, SMAD3, SMAD4, SMAD5, SMAD6, SMAD7, SMURF1, SMURF2, SNAI1, SNAI2, TCF3, TGFBI, TGFBR1, TGFBR2, TGFBR3, TGIF2, TWIST1, TWIST2, ZEB1; groups: All, Clusters 1, 2, and 3]
Cluster 1 tumors, which are prevalently p53 wild-type, are more difficult to categorize. Unlike basal-like and luminal-like tumors, these tumors had an indefinite phenotype characterized by the coexpression of luminal and basal cytokeratins. In addition, the overexpression of several transcription factors (ID2, ID4, SNAI2, TWIST1, ZEB2), known to be under TGF-β control, and the concomitant overexpression of some genes coding for stemlike properties (ABCG2, JAG1, JAG2, NANOG, NOTCH4) make the results difficult to interpret. Indeed, it is not easy to establish whether such phenotypic heterogeneity represents an intermediate step of an EMT-like process, in which tumor cells gain characteristics of mesenchymal cells but have not completely lost epithelial characteristics, or whether it is simply due to the individual heterogeneity of the tumors forming the cluster.

Conclusions
The aim of this in silico study was to investigate the association between TP53 mutational status and the expression of a panel of genes related to TGF-β-induced EMT and stemlike features, using a publicly accessible microarray dataset consisting of 251 p53-sequenced primary breast cancers. On the basis of recent experimental evidence linking loss of p53 function, induction of EMT, and acquisition of stemness properties in different tumor cell lines [11][12][13], we expected an evident positive association between EMT-related genes and p53 mutations, in particular severe p53 mutations. In addition, since clinical evidence indicates that p53 mutations are prevalently associated with the pathologic class of triple-negative breast cancers, we expected an overexpression of EMT-related genes in this specific subset of tumors. Our analysis supports the notion that mutant p53 tumors (especially those harboring a severe p53 mutation) were consistent with the aggressive clinical class of triple-negative cancers, but it clearly indicates that, unlike cell cultures [11][12][13], surgical tumors did not overexpress TGF-β-related transcription factors. Taking into account the physiological role of TGF-β in mammary gland differentiation [25,26], these unexpected findings seem to suggest an interesting relationship between p53 mutation, mammary cell dedifferentiation, and the concomitant acquisition of stemlike properties, which enhance tumor cell aggressiveness.