Analysis of Highly Conserved Regions of the 3’UTR of MECP2 Gene in Patients with Clinical Diagnosis of Rett Syndrome and Other Disorders Associated with Mental Retardation

In this work we explored the role of the 3’UTR of the MECP2 gene in patients with a clinical diagnosis of RTT and mental retardation, focusing on regions of the 3’UTR with almost 100% conservation at the nucleotide level between mouse and human. The MECP2 3’UTR of a total of 66 affected females was studied by mutation scanning (DOVAM-S technique). Five 3’UTR variants in MECP2 were found in our group of patients (c.1461+9G>A, c.1461+98insA, c.2595G>A, c.9961C>G and c.9964delC). None of the variants found is located in a putative protein-binding site or predicted to have a pathogenic role. Our data suggest that mutations in this region do not account for a large proportion of the RTT cases without a genetic explanation.


Introduction
Mutations in the MECP2 gene are associated with Rett syndrome (RTT, OMIM # 312750) [1], a disorder affecting mainly females and presenting with mental retardation, autistic features and a movement disorder. The disease is characterized by an apparently normal initial developmental period, followed by a reduction in growth and loss of motor and cognitive skills around the age of 6 to 18 months [15,16]. Mutations in the coding region of MECP2 are identified in around 80-90% of classical RTT patients (presenting the phenotype described above) and in around 30% of patients with variant forms of the disease, which include more severe presentations such as neonatal encephalopathy and milder presentations such as the preserved speech variant or forme fruste [16,19]. Thus, the cause of disease in a significant proportion of patients with RTT remains unknown. It has been proposed that mutations in non-coding regions of MECP2 (the 5' and 3' untranslated regions and the introns) or in other genes may be the unidentified cause of the disorder in these cases.
The 3'UTR of a gene may regulate its expression at different levels, such as the "translatability" of the mRNA, its stability, its nuclear export or the sub-cellular localization of the translation event, thus affecting the levels of the resulting protein (for a review see [7]). Localized translation has been demonstrated in neurons for the immediate-early gene Arc (activity-regulated cytoskeleton-associated protein) and for the protein αCamKII (calcium/calmodulin-dependent protein kinase II-alpha), which are translated locally at or near the synapse [5].
Additionally, the 3'UTR of the CamKII gene has been linked to the regulation of activity-dependent CamKII protein expression, via activation of the N-methyl-D-aspartate glutamate receptor (NMDAR). This regulation underlies synaptic plasticity, learning and memory formation [28], all of which are disturbed in RTT.
A study based on the human gene mutation database (HGMD) estimated that around 0.2% of disease-associated mutations reside in the regulatory regions of the 3'UTR of genes [6]. Mutations in the 3'UTR have been identified as the genetic cause of a number of diseases with a neurological component: (1) myotonic dystrophy, a neuromuscular disorder characterized by hypotonia, mental retardation and muscle development defects, due to a CTG repeat expansion in the 3'UTR of the dystrophia myotonica protein kinase gene (DMPK) [12]; (2) IPEX syndrome (immune dysfunction, polyendocrinopathy, enteropathy, X-linked), caused by a mutation within the first polyadenylation signal of the forkhead box P3 gene (FOXP3) [3]; (3) Fukuyama-type congenital muscular dystrophy, presenting with mental retardation and brain defects, caused by a defect in the Fukutin gene [17]; and (4) familial Danish dementia, caused by a decamer duplication in the integral membrane protein 2B gene (BRI) [25]. Finally, (5) a nucleotide change found in the 3' regulatory region of the cyclin-dependent kinase 5, regulatory subunit 1 gene (CDK5R1) was also suggested to be a cause of non-syndromic mental retardation [24]. The mechanisms involved are variable: these mutations cause pathology by affecting (1) the splicing of other genes, as in myotonic dystrophy [20]; (2) mRNA maturation, as in IPEX syndrome [2]; (3) mRNA stability, as in Fukuyama-type congenital muscular dystrophy [17]; or (4) expression levels, as in familial Danish dementia [25]. However, variants in the 3'UTR of other genes were not found to be associated with disease, as is the case for the CAA insertion polymorphism in the NOGO gene in schizophrenia and bipolar disorder [14] and for variants in the B7-1 gene in patients with multiple sclerosis [27], among many others.
The MECP2 gene has eight different transcripts that result from alternative splicing and the use of different polyadenylation sites, and it has one of the longest known 3'UTR tails, at 8.5 kb [9]. The longest transcript is more than 10 kb long and has several blocks of residues that are highly conserved between the human and mouse genomes (Fig. 1). This argues in favour of a potential regulatory role of this 3'UTR in the function of the MeCP2 protein, in different cell types and at different developmental times. The longest transcript is also generally described as the predominant form in the brain [9,10,21].
In this work we explored the potential role of the extremely large and conserved 3'UTR of the MECP2 gene in patients with a clinical diagnosis of RTT and mental retardation. We performed a bioinformatics analysis in order to identify putative protein-binding sites, and we searched for mutations in selected regions of this 3'UTR that are highly conserved between mouse and human, in an attempt to identify sequence variations leading to disease.

Subjects
We included in the study a total of 66 affected females who had already been studied for MECP2 mutations in the coding region and exon-intron boundaries, including detection of large rearrangements by a robust dosage-PCR method; 33 were positive for MECP2 mutations in the coding region of the gene. Of the 66 patients, 51 had a clinical diagnosis of RTT (35 classical and 16 atypical, according to the revised criteria of Hagberg and colleagues [15,16]), 13 had mental retardation with autism and 2 had an Angelman syndrome-like clinical presentation. Whenever possible, both parents were also included in the study. A control population of 40 males and 83 females was used in order to characterize the frequency of the new variants found. All participants or their legal representatives gave informed consent.

Molecular analysis of the MECP2 3'UTR
Genomic DNA was extracted from the peripheral blood of the patients and available parents using the Puregene DNA extraction system (Gentra).
The non-coding regions of the MECP2 3'UTR to be analysed were selected on the basis of conservation among species (86%-98% conservation at the nucleotide level), giving a total of 9 blocks (Table 1).
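The conservation criterion can be illustrated with a minimal sketch, assuming a pre-computed pairwise human/mouse alignment supplied as two equal-length strings. The function names, the window size and the toy sequences are hypothetical and are not taken from the study; only the 86% lower threshold comes from the text.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percentage of alignment columns with identical nucleotides.

    Gap columns ('-') are counted as mismatches.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("empty alignment")
    matches = sum(
        1
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if a == b and a != "-"
    )
    return 100.0 * matches / len(aligned_a)


def conserved_windows(aligned_a: str, aligned_b: str,
                      window: int = 50, threshold: float = 86.0):
    """Return (start, identity) for non-overlapping windows at or above threshold."""
    hits = []
    for start in range(0, len(aligned_a) - window + 1, window):
        ident = percent_identity(aligned_a[start:start + window],
                                 aligned_b[start:start + window])
        if ident >= threshold:
            hits.append((start, ident))
    return hits


# Toy alignment (hypothetical sequences, not MECP2)
human = "ACGTACGTAC"
mouse = "ACGTACGTAA"
print(percent_identity(human, mouse))
print(conserved_windows(human, mouse, window=5, threshold=86.0))
```

Fixed, non-overlapping windows keep the sketch short; a sliding window would locate block boundaries more precisely.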
For the DNA amplification of the different 3'UTR blocks, a final volume of 25 µl PCR mixture (2.5 mM MgCl2, 0.2 mM dNTP, 2.5 µM of each primer and 2 U of AmpliTaq Gold (Applied Biosystems)) was used. The thermal cycling profile consisted of an initial denaturation for 10 min at 94 °C, 35 cycles of denaturation for 15 sec at 94 °C, annealing for 30 sec at 55 °C and extension for 1 min at 72 °C, and a final extension for 10 min at 72 °C.
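As a worked example, the cycling profile described above can be written down as data and its total programmed hold time summed. This is only an illustrative sketch: the `Step` class and the function name are hypothetical, and ramp times between temperatures, which depend on the instrument, are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str
    temp_c: int
    seconds: int


# Values taken from the protocol text
INITIAL = Step("initial denaturation", 94, 10 * 60)
CYCLE = [
    Step("denaturation", 94, 15),
    Step("annealing", 55, 30),
    Step("extension", 72, 60),
]
N_CYCLES = 35
FINAL = Step("final extension", 72, 10 * 60)


def total_hold_seconds(initial: Step, cycle: list, n_cycles: int, final: Step) -> int:
    """Sum of programmed hold times, excluding temperature ramping."""
    per_cycle = sum(step.seconds for step in cycle)
    return initial.seconds + n_cycles * per_cycle + final.seconds


total = total_hold_seconds(INITIAL, CYCLE, N_CYCLES, FINAL)
print(f"{total} s = {total / 60:.2f} min")  # 4875 s = 81.25 min
```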

Allele specific-PCR (AS-PCR)
The presence of the MECP2 3'UTR variants c.2595G>A and c.9961C>G was tested in a control population by AS-PCR. The primers used for allele specific amplification of the MECP2 3'UTR variants, as well as the PCR conditions used, are listed in Table 1.
Two PCR mixtures were prepared for each variant in order to amplify either the normal or the mutated allele, each in a final volume of 25 µl consisting of 0.5 mM MgCl2, 0.2 mM dNTP, 0.8 µM of each primer of the pair (either for the normal allele or for the mutated allele) and 1.5 U of Taq DNA polymerase (Fermentas). The thermal cycling profile consisted of an initial denaturation for 5 min at 95 °C, 35 cycles of denaturation for 1 min at 95 °C, annealing for 1 min at Ta °C (specific for each variant, see Table 1) and extension for 1 min at 72 °C, and a final extension for 5 min at 72 °C. PCR products were electrophoresed in a 2% agarose gel and visualized under UV light.
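The way the two reactions per sample are read can be sketched as a simple decision rule. This reflects the general logic of allele-specific PCR rather than a scoring procedure stated in the text; the function name and the result labels are hypothetical.

```python
def call_genotype(normal_band: bool, mutant_band: bool) -> str:
    """Interpret the normal-allele and mutated-allele reactions for one sample.

    Each argument says whether a product of the expected size was seen
    on the agarose gel for that reaction.
    """
    if normal_band and mutant_band:
        return "heterozygous"
    if normal_band:
        return "normal allele only"
    if mutant_band:
        return "variant allele only"
    # No product in either reaction suggests a failed amplification
    return "no amplification (repeat reaction)"


# Hypothetical control sample amplifying only with the normal-allele primers
print(call_genotype(normal_band=True, mutant_band=False))
```

Because males carry a single X chromosome, "heterozygous" is expected only in female samples.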

Bioinformatics
By computational analysis we identified several putative binding sites of trans-acting factors in the MECP2 3'UTR. We identified binding sites for the neuro-oncological ventral antigen (NOVA), the arginine/serine-rich splicing factors SRp40 and SRp55 and several heterogeneous nuclear ribonucleoproteins (hnRNPs), including the polypyrimidine tract-binding protein (PTB); all these proteins are known to be involved in the regulation of splicing and/or polyadenylation events. Although biochemical evidence of binding needs to be obtained, these findings reinforce the potential biological relevance of this untranslated region and prompted us to proceed to its genetic study.
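The text does not say which tool or motif definitions were used for this analysis. As a minimal sketch of how a putative site search might look, the example below scans a sequence for the YCAY element (Y = C or U) reported in the literature as a NOVA recognition motif. The choice of motif, the function name and the toy sequence are assumptions made for illustration only.

```python
import re

# Assumption: NOVA recognition element YCAY, written in DNA letters.
# The lookahead makes overlapping occurrences visible to finditer.
NOVA_MOTIF = re.compile(r"(?=([CT]CA[CT]))")


def find_motif_sites(sequence: str, motif: re.Pattern = NOVA_MOTIF):
    """Return (0-based position, matched site) for every motif occurrence."""
    seq = sequence.upper().replace("U", "T")  # accept RNA or DNA input
    return [(m.start(), m.group(1)) for m in motif.finditer(seq)]


# Toy sequence (hypothetical, not taken from MECP2)
toy_utr = "TTCATCACGGTCATAA"
for position, site in find_motif_sites(toy_utr):
    print(position, site)
```

A match of this kind only flags a candidate site; as noted above, binding would still need biochemical confirmation.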

MECP2 3'UTR variants
Variants in the 3'UTR of the MECP2 gene were searched for in 66 Portuguese patients with a clinical diagnosis of RTT or mental retardation by Detection of Virtually All Mutations-SSCP (DOVAM-S) [18,23]. We identified 5 alterations in the 3'UTR of the MECP2 gene: c.1461+9G>A, c.1461+98insA, c.2595G>A, c.9961C>G and c.9964delC (NM_004992.2, Fig. 1 and Table 2). None of the variants is located in any of the identified putative binding sites. The c.1461+9G>A and c.1461+98insA variants (identified by direct sequencing of exon/intron boundaries in our previous work) were already described in the literature as polymorphisms (see MECP2 mutation database, http://mecp2.chw.edu.au). The c.9964delC variant was previously detected in a Portuguese control population [8], and the c.2595G>A and c.9961C>G variants were identified for the first time in this study and were not found in 218 X chromosomes of a Portuguese control population. The variants c.9961C>G and c.9964delC were both present in the same patient (Table 2). The c.9961C>G variant was also present in the unaffected father of the patient, and is therefore very unlikely to be a pathogenic mutation; the c.9964delC variant was present in the unaffected mother of the patient, who had a balanced X-chromosome inactivation pattern in her lymphocytes. We could not test for the c.2595G>A variant in the parents of the patient, since their DNA was not available. However, this patient was shown to have another, causal mutation in trans (Table 2), a large rearrangement of the MECP2 gene [22], and so this variant is most likely not a pathogenic alteration. Since it is present only in this one patient, in whom a severe clinical presentation was expected, we cannot conclude whether this variant may behave as a modifier of the phenotype.
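The absence of the two novel variants from 218 control X chromosomes limits how common they can be in the population, but does not show that they are absent from it. A minimal sketch of that reasoning, assuming the control chromosomes are independent draws from the population (the function name is hypothetical and this calculation is not part of the original analysis):

```python
def upper_bound_allele_frequency(n_chromosomes: int, confidence: float = 0.95) -> float:
    """Upper confidence bound on allele frequency when 0 of n chromosomes carry it.

    Solves (1 - p) ** n = 1 - confidence for p.
    """
    if n_chromosomes <= 0:
        raise ValueError("n_chromosomes must be positive")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be between 0 and 1")
    return 1.0 - (1.0 - confidence) ** (1.0 / n_chromosomes)


bound = upper_bound_allele_frequency(218)
print(f"95% upper bound on allele frequency: {bound:.2%}")
```

Under this assumption a variant with a population frequency of up to roughly 1.4% could still be missed in a control sample of this size.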

Discussion
The long and highly conserved 3'UTR of the MECP2 gene suggested that mutations in this region could exist and underlie a percentage of the molecularly unexplained RTT cases, which amount to up to about 20% of classical and about 70% of atypical cases. The most interesting RTT group to study was undoubtedly the one without MECP2 mutations in the coding region, although 3'UTR variants could also be modulators of the phenotype caused by MECP2 coding region mutations. Our data suggest that mutations in this region do not account for a large proportion of the RTT cases without a genetic explanation.
In a previous study, a 300-bp segment of the 3'UTR around the 10 kb polyadenylation signal of the MECP2 gene was scanned for mutations in RTT patients and no pathogenic variants were found [4]. Shibayama and colleagues [23] studied a heterogeneous group of patients presenting with schizophrenia, autism and other psychiatric disorders and reported that 3'UTR variants in MECP2 seemed to be more frequent in autism patients than in the general population. However, screening of the entire 3'UTR of MECP2 in an autistic Portuguese population did not show any pathogenic variant that could be responsible for the pathology, and 3'UTR sequence variants were not over-represented in this patient population as compared to controls [8].
In our view, the contribution of the MECP2 3'UTR to RTT aetiology cannot be totally excluded without studying a larger number of RTT patients from different populations; analysis should focus on these "blocks" of high conservation that contain putative regulatory protein-binding sites, for which we have established a large-scale scanning method. Additionally, other methods, such as evaluation of mRNA levels and stability and of the binding of specific proteins by RNA protection assays, may also be considered.