The Transcribed-Ultraconserved Regions: A Novel Class of Long Noncoding RNAs Involved in Cancer Susceptibility

During recent years, novel approaches and new technologies have revealed a startling level of complexity in the transcriptome of higher eukaryotes. A large proportion of the transcriptional output is represented by protein noncoding RNAs (ncRNAs) that arise from the “dark matter” of the genome. Focus on such sequences has revealed numerous RNA subtypes with several functions in RNA processing and gene expression regulation, and deep sequencing studies imply that many remain to be discovered. This review describes the state of the art of a novel class of long ncRNAs known as transcribed-ultraconserved regions (T-UCRs). Recent studies show that they are significantly altered in adult chronic lymphocytic leukemias, carcinomas, and pediatric neuroblastomas, leading to the hypothesis that UCRs may play a role in tumorigenesis and pointing toward innovative T-UCR-based therapeutic approaches.


INTRODUCTION
Since their introduction in the mid-1990s, microarrays have rapidly become a high-throughput method of gene expression analysis in relation to physiology, development, and disease. Moreover, together with sequencing of the human genome as well as those of model organisms, they have contributed greatly to the exploration of the complexities of eukaryotic genomes [1]. In the last few years, there has been increasing evidence that ~98% of human DNA is transcribed into molecules that are protein noncoding RNAs (ncRNAs) [2,3]. Such a startling finding has revolutionized the central dogma of molecular biology, according to which information flows from DNA to protein through RNA as its intermediary [4]. From there, it was easy to generalize that "one gene equals one protein, one function". Generally, this holds true in prokaryotes, whose genomes consist of tightly packed protein-coding sequences, whereas complex eukaryotes have markedly different patterns of functional regulation. Thus, the modern view of the eukaryotic RNA world involves many ncRNAs, which process and regulate other RNA molecules by cleavage, nucleotide modification, transcription, and degradation [5]. Numerous subtypes of ncRNAs participate in such RNA settings, including rRNAs, mRNAs, tRNAs, mitochondrial ncRNAs [6], small nuclear RNAs (snRNAs), small nucleolar RNAs (snoRNAs), several classes of regulatory RNAs involved splicing, RNA recognition motifs (including the six members of the SFRS family), as well as DNA binding motifs, in particular the Homeobox domain. These attributes are enriched in the 225 protein-coding genes that are near the nonexonic UCRs as well, although less significantly, suggesting that exonic UCRs may be specifically associated with RNA processing and nonexonic elements with regulation of transcription at the DNA level [20].
The UCRs are frequently located at fragile sites and cancer-associated genomic regions (CAGRs), such as minimal regions of amplification and of loss of heterozygosity [29] (Supplementary Table 1). Intriguingly, the UCRs that are differentially expressed in human cancers are located in CAGRs specifically associated with that type of cancer. This is the case for uc.349A and uc.352, which are differentially expressed between normal and leukemic CD5-positive cells [30]: they are located within the 13q21.33-q22.2 chromosomal region, which has been linked to susceptibility to familial chronic lymphocytic leukemia [31]. Also, in pediatric neuroblastoma (NB) tumors, expression of seven UCRs correlated with their copy-number status [25]. Together, these data suggest that not only the protein-coding genes, but also the UCRs located in the CAGRs, could be candidate players in cancer susceptibility.

TRANSCRIBED-UCRs
A large fraction of UCRs are transcribed (so-called transcribed-UCRs, T-UCRs) in normal human tissues, and their expression levels show both a ubiquitous (for 34% of T-UCRs) and a tissue-specific pattern [30].
The T-UCRs show predominant transcription from one strand and only 9% of them are bidirectionally transcribed [30].
Interestingly, in addition to microarrays, Northern blot, and reverse transcription-quantitative real-time PCR (RT-qPCR) [29], a linear isothermal Ribo-SPIA™ RNA amplification method enables sensitive and accurate high-throughput interrogation of all 481 T-UCRs [32]. This is particularly important because transcription regulation studies are increasingly conducted on small samples of potential clinical interest, such as tumor biopsies and laser capture microdissected or sorted cell populations, for which only limited amounts of starting RNA are available.

T-UCR AND TUMORS
Calin et al. [30] were the first to investigate the expression of UCRs in human cancers, focusing on chronic lymphocytic leukemia, the most frequent adult leukemia in the Western world [33], on colorectal carcinoma, one of the most common cancers in industrialized countries [34], and on hepatocellular carcinoma, the most rapidly increasing type of cancer in the U.S. [35]. They found that, for all the tumor types examined, the malignant cells have a unique spectrum of expressed UCRs when compared with the corresponding normal cells, suggesting that variations in T-UCR expression are involved in the malignant process. Moreover, T-UCR expression signatures differed between leukemias and carcinomas, and thus they might offer a novel strategy for cancer diagnosis and prognosis.
Recently, we investigated T-UCR expression in NB tumors [36]. NB is a pediatric tumor of the sympathetic nervous system characterized by remarkably heterogeneous clinical behavior [37]. Patients with localized NB have a favorable outcome, and in infants with disseminated stage-4 tumor, the progression of disease is generally halted by a good response to therapy. Conversely, only 20-30% of children older than 12-18 months of age with a stage-4 tumor show progression-free and overall survival longer than 60 months, despite multimodal therapeutic protocols [38]. In recent years, several prognostic signatures derived from gene expression profiles, DNA abnormalities, and miRNAs have been proposed as sensitive indicators of tumor progression in NB patients [39,40,41,42,43]. Although efforts have been made to validate each gene classifier on independent patient cohorts [44,45], the major challenge remains the identification of additional tumor-specific prognostic markers for improved risk estimation at the time of diagnosis, especially in high-risk NB patients. For the first time, we defined a signature based on 28 T-UCRs that is associated with good outcome in noninfant patients diagnosed with metastatic NB [36]. More recently, Mestdagh et al. [25] confirmed that T-UCRs are widely expressed in NB tumors and correlate with clinical-genetic parameters, such as MYCN oncogene status.
Regarding dysregulation of T-UCR transcription in cancer, Calin et al. [30] demonstrated that transcription of tumor-associated T-UCRs in leukemias is negatively regulated by direct interaction with miRNAs. Similarly, we found negative correlations between expression values of nine specific T-UCRs and five predicted interactor miRNAs of the signature able to differentiate between long- and short-surviving high-risk NB patients [36]. In both studies, sequence complementarity gives rise to several miRNA:UCR interacting pairs, indicating complex redundancy in regulatory mechanisms between miRNAs and T-UCRs. Accordingly, these findings provide support for a model in which both coding and noncoding genes are involved in, and cooperate in, human tumorigenesis. Notably, it is now straightforward to match miRNA and UCR sequences using a dedicated database, Ucbase & miRfunc, which provides UCR sequence data and shows miRNA function [46].
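The two kinds of evidence described above, sequence complementarity and negatively correlated expression, can be illustrated with a minimal Python sketch. It is not the procedure used in the cited studies or by Ucbase & miRfunc: the seed definition (miRNA nucleotides 2-8), the function names, and the sequences and expression values in the usage lines are all illustrative assumptions.

```python
from math import sqrt

# Watson-Crick pairing for RNA; G:U wobble pairs are ignored in this sketch.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def seed_site(mirna: str) -> str:
    """Return the target site (5'->3') complementary to miRNA nucleotides 2-8.

    Assumption: a 7-nt seed match is used as a simple proxy for a putative
    miRNA:UCR interaction.
    """
    seed = mirna.upper().replace("T", "U")[1:8]
    return "".join(COMPLEMENT[base] for base in reversed(seed))


def has_seed_match(mirna: str, transcript: str) -> bool:
    """True if the transcript contains a perfect match to the miRNA seed site."""
    return seed_site(mirna) in transcript.upper().replace("T", "U")


def pearson(x: list[float], y: list[float]) -> float:
    """Pearson correlation coefficient between two equal-length expression vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


# Hypothetical toy data, not taken from any of the cited studies.
toy_mirna = "UAGCAGCACGUAAAUAUUGGCG"
toy_tucr = "GGAUGCUGCUAAC"
print(has_seed_match(toy_mirna, toy_tucr))          # seed site present in transcript?
print(pearson([1.0, 2.0, 3.0], [3.0, 2.0, 1.0]))    # strongly negative correlation
```

A candidate miRNA:T-UCR pair would, under these assumptions, be one that shows both a seed match and a negative expression correlation across samples; real analyses would add statistical testing and multiple-testing correction.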
To gain further insight into the initiation and regulation of T-UCR transcription, Mestdagh et al. [25] evaluated the chromatin state of the T-UCR genomic neighborhood. Both inter- and intragenic T-UCRs are significantly associated with active trimethylation of lysine 4 of histone H3 (H3K4me3), a marker for transcriptional initiation [47,48], but with a different distribution as compared with protein-coding genes, suggesting a difference in transcriptional organization between T-UCRs and protein-coding genes. In addition, H3K4me3 distance distributions for miRNAs and T-UCRs appear similar, suggesting common features of transcription organization for these two classes of ncRNA, with initiation sites several kilobases away [49].
Finally, epigenetic mechanisms have been evaluated as potential regulators of T-UCR expression. We found that 78% of intragenic T-UCRs deregulated in high-risk NBs are associated with CpG islands in the promoter region of their own host genes [36]. Therefore, much like CpG island hypermethylation-mediated silencing of miRNAs with tumor-suppressor features contributes to human cancer [50], the global DNA hypermethylation events in unfavorable NB [51,52] may also affect T-UCR host genes, and thus silence T-UCRs with a potential oncogenic role. Recently, a pharmacological and genomic approach confirmed the possible existence of an aberrant epigenetic pattern of T-UCRs. Indeed, Lujambio and coauthors [53] observed that while almost half of the UCR-associated CpG islands are unmethylated in all tissues, the other half show tissue-specific UCR CpG island methylation, as occurs with promoter CpG islands of coding genes [54] and miRNAs [55]. Moreover, treatment of cancer cells with the DNA methylation inhibitor 5-aza-2'-deoxycytidine disclosed that epigenetic inactivation by CpG island hypermethylation of a subset of T-UCRs occurs in a wide spectrum of human cancer cell lines and primary tumors. Taken together, these findings support a model in which epigenetic disruption of T-UCRs constitutes a hallmark of human tumorigenesis. Accordingly, tumor-specific CpG island hypermethylated UCRs may be useful biomarkers of disease. Table 1 summarizes the main studies in the field of UCRs and cancer, and the main findings of each of these studies.
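For readers unfamiliar with how a CpG island is recognized from sequence alone, the following Python sketch applies widely used sequence-based thresholds: length of at least 200 bp, GC content above 50%, and an observed/expected CpG ratio above 0.6. The studies cited above do not state in this text which criteria they applied, so the thresholds, function names, and test sequences here are assumptions for illustration only.

```python
def cpg_stats(seq: str) -> tuple[float, float]:
    """Return (GC fraction, observed/expected CpG ratio) for a DNA sequence.

    The expected CpG count is (number of C * number of G) / sequence length.
    """
    seq = seq.upper()
    n = len(seq)
    if n == 0:
        return 0.0, 0.0
    c_count = seq.count("C")
    g_count = seq.count("G")
    observed_cpg = seq.count("CG")
    gc_fraction = (c_count + g_count) / n
    expected_cpg = c_count * g_count / n
    ratio = observed_cpg / expected_cpg if expected_cpg else 0.0
    return gc_fraction, ratio


def looks_like_cpg_island(
    seq: str, min_len: int = 200, min_gc: float = 0.5, min_ratio: float = 0.6
) -> bool:
    """Apply the three sequence thresholds to a candidate region.

    Threshold defaults are assumptions; adjust them to the definition in use.
    """
    gc_fraction, ratio = cpg_stats(seq)
    return len(seq) >= min_len and gc_fraction > min_gc and ratio > min_ratio


# Hypothetical sequences: a CpG-dense repeat and an AT-only repeat.
print(looks_like_cpg_island("CG" * 100))
print(looks_like_cpg_island("AT" * 100))
```

Sequence criteria only identify candidate islands; whether a given island is methylated in a tumor has to be measured experimentally, as in the studies discussed above.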

FUNCTION OF THE ULTRACONSERVED ELEMENTS
The remarkably high degree of conservation across species implies that UCRs may have a fundamental functional importance for the ontogeny and phylogeny of mammals and other vertebrates. Although UCRs are significantly depleted among segmental duplications and copy-number variants [56], deletion of some of these regions in knock-out mice was not associated with any notable phenotypic abnormality [57]. Therefore, the role of UCRs in viability is still controversial. In cancer cells, T-UCRs might act as oncogenes. Indeed, functional analyses involving small interfering RNAs have identified one T-UCR in colorectal cancer, namely uc.73A, as oncogenic: it increases the number of malignant cells as a consequence of reduced apoptosis [30]. To examine in more depth the processes in which T-UCRs are involved, Mestdagh and colleagues [25] implemented an integrative genomic workflow to infer putative T-UCR functions using Gene Set Enrichment Analysis [66] and validated them using in vitro systems. For a large number of T-UCRs, they observed widespread association with numerous cancer-related cellular functions and pathways, such as proliferation, apoptosis, and differentiation. For example, the most prominent cluster identified using this methodology contained several T-UCRs significantly related to the expression of protein-coding genes involved in cell cycle, DNA replication, and DNA repair.
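The idea behind Gene Set Enrichment Analysis can be conveyed with a simplified running-sum statistic. The Python sketch below computes an unweighted, Kolmogorov-Smirnov-like enrichment score for a gene set within a ranked gene list. It assumes that genes have already been ranked, for example by the correlation of their expression with that of a given T-UCR; it omits the weighting and permutation-based significance testing of the full method and is not the workflow implemented in [25]. Gene names in the usage lines are placeholders.

```python
def enrichment_score(ranked_genes: list[str], gene_set: set[str]) -> float:
    """Unweighted running-sum enrichment score of gene_set in ranked_genes.

    Walking down the ranked list, the running sum rises by 1/n_hit at each
    gene in the set and falls by 1/n_miss otherwise. The score is the
    maximum deviation from zero (sign preserved).
    """
    hits = [gene in gene_set for gene in ranked_genes]
    n_hit = sum(hits)
    n_miss = len(ranked_genes) - n_hit
    if n_hit == 0 or n_miss == 0:
        return 0.0  # score is undefined without both hits and misses

    running = 0.0
    best = 0.0
    for is_hit in hits:
        running += 1.0 / n_hit if is_hit else -1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best


# Placeholder gene names; the ranking is assumed to come from an upstream step.
ranked = ["geneA", "geneB", "geneC", "geneD"]
print(enrichment_score(ranked, {"geneA", "geneB"}))  # set concentrated at the top
print(enrichment_score(ranked, {"geneC", "geneD"}))  # set concentrated at the bottom
```

A positive score indicates that the gene set is concentrated among the top-ranked genes, and a negative score that it is concentrated at the bottom; in practice the score is compared against a permutation-derived null distribution before any functional inference is drawn.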

CONCLUSION AND FUTURE PERSPECTIVES
It is now well accepted that T-UCRs are regulatory elements within the RNA-processing machinery that also play a critical role in human diseases such as cancer. Indeed, malignant cells show specific alterations not only in genes coding for oncoproteins or tumor suppressors, but also in several classes of ncRNAs. Dysregulation of T-UCRs is a common feature of human cancer. It offers the prospect of defining both tumor-specific signatures of T-UCRs and tumor-specific methylated UCRs that are associated with diagnosis, prognosis, and response to treatment. Above all, aberrant UCR methylation in transformed cells might provide a molecular basis for the innovative therapeutic use of DNA-demethylating compounds in the treatment of cancer patients. As a proof of principle, restoration of expression of a down-regulated T-UCR or, alternatively, inhibition of an overexpressed T-UCR by a small interfering RNA approach could reverse the tumor phenotype. Moreover, localization of such ncRNAs within CAGRs could open the way to T-UCR-based therapy trials.