Association between a Tetranucleotide Repeat Polymorphism of SPAG16 Gene and Cataract in Male Children

Purpose. Studies involving genotyping of STR markers at 2q34 have repeatedly found the region to host the disease haplotype for pediatric cataract. The present study investigated the association of the D2S2944 marker, in the sperm associated antigen 16 (SPAG16) gene, and the rs2289917 polymorphism, in the γ-crystallin B gene, with childhood cataract. Methods. 97 pediatric cataract cases and 110 children with no ocular defects were examined for the tetranucleotide repeat marker and the SNP using PCR-SSLP and PCR-RFLP techniques, respectively. Polymorphisms were assessed for association using contingency tables, and linkage disequilibrium among alleles of the markers was estimated. An energy-optimization program was used to predict secondary structure models of the D2S2944 repeats. Results. Seven alleles of D2S2944, with 9–15 "GATA" repeats, were observed. The frequency of the longer allele of D2S2944, ≥(GATA)13 repeats, was 0.73 in cases and 0.56 in controls (P = 0.0123). Male children bearing ≥(GATA)13 repeats showed a >3-fold higher risk for cataract (CI95% = 1.43–7.00, P = 0.0043, Pc = 0.0086) as compared to female children (OR = 1.19, CI95% = 0.49–2.92, P = 0.70). Carriers of the haplotype combining ≥(GATA)13 of D2S2944 and the "C" allele of rs2289917 had a higher risk for pediatric cataract (OR = 2.952, CI95% = 1.595–5.463, P = 0.000453). >(GATA)13 repeats formed an energetically more favorable stem-loop structure. Conclusion. Intragenic microsatellite repeat expansion in the SPAG16 gene increases predisposition to pediatric cataract, probably by interfering with posttranscriptional events and affecting the expression of adjacent lens transparency gene/s in a gender-biased manner.


Introduction
Cataract is a major cause of treatable childhood blindness, with a prevalence of around 5 to 15 cases per 10,000 children in India [1]. Cataract in children is particularly serious because it can inhibit visual development, resulting in permanent blindness and disability. Inherited cataracts represent 8–25% of infantile cataract cases [2]. Understanding the genetics of cataract will not only lead to better treatment approaches but also open avenues for effective counseling. Most inherited cataracts mapped to chromosome 2 are associated with a subgroup of genes, namely, the gamma-crystallins (CRYG) present at 2q33-35, which encode proteins important for the maintenance of lens transparency and homeostasis [3]. The chromosomal region from 198 Mb to 220 Mb on chromosome 2q has been repeatedly found to host the disease haplotype for pediatric cataract [4]. In fact, the Cat-Map database summary shows that a major portion of the mutations/variations associated with cataract in this region have been reported in Asians, mainly Indians and Chinese.
Since linkage studies cite marker/s cosegregating with large genomic regions, usually multiple Mb in size and including many genes, it becomes imperative to resolve the region by fine mapping. A marker that consistently falls within the disease-linked haplotype, or within the region showing the peak linkage signal in several studies, points towards a gene/s in the vicinity associated with the disease etiology. To our knowledge, genes in the 2q34 region have not yet been tested for their association with cataract or for the gender-specific effects observed for this locus. The location of the intragenic marker D2S2944 suggested that the SPAG16 gene may have a role in the pathophysiology of lens opacification, as several studies have demonstrated that microsatellites in noncoding regions may function in gene regulation [14]. SPAG16 protein exists in two isoforms, L and S. While the Spag16L mRNA has been detected in testis, brain, lung, oviduct, and other murine tissues containing cells with a "9 + 2" axoneme structure, Spag16S is expressed only in testis or male germ cells in mice [15]. Spag16S is a bifunctional protein: on one hand it interacts with MEIG1 (the meiosis expressed gene 1 product, which is involved in chromosome/chromatin binding and participates in the regulation of chromosome structure and/or gene expression), and on the other it acts as a transcription factor (TF) that transactivates the promoter of the L isoform [16]. Functionally, the L isoform is responsible for axoneme stability and sperm flagellar motility [16][17][18]. Mice chimeric for a mutation deleting the transcripts for both SPAG16L and SPAG16S have a profound defect in spermatogenesis [15]. We hypothesized that the D2S2944 microsatellite may act as an enhancer or repressor to regulate SPAG16 gene expression, which in turn impacts gene/s regulating lens transparency in a gender-specific manner.
In this study, we investigated the association of variations in the (GATA) repeats of the microsatellite marker D2S2944 in the SPAG16 gene, and of a tagged SNP, rs2289917, in the promoter of the γ-crystallin B gene, with childhood cataract.

Material and Method
All participants in this study were recruited after obtaining written informed consent. The patient cohort included unrelated pediatric patients who attended Dr. Shroff's Eye Hospital for cataract surgery. The type of cataract was recorded according to the morphological classification proposed by Merin [19]. Patients with uveitis, cataract due to trauma, steroid therapy, or infective etiology, cataract with associated glaucoma or retinal pathology, or subluxated lens, and patients positive for TORCH, were excluded from the present study. The control population comprised children with both lenses graded as having no opacities on slit-lamp observation and no history of congenital/infantile, juvenile, or traumatic/postsurgical cataract or any other detectable ocular defects. Controls were drawn from the same ethnic population and the same geographical region as the patients (subjects residing in and around Malka Ganj to Darya Ganj in Delhi). The study was conducted following the norms of the Declaration of Helsinki for human experimentation and was approved by the Institutional Human Ethics Committee (IHEC) of both BITS and the eye hospital. Genomic DNA was obtained from healthy children and children with cataract using a method described elsewhere [20]. Genomic DNA was subjected to PCR using the UniSTS primer set (http://www.ncbi.nlm.nih.gov/genome/sts/sts.cgi?uid=68648/), and the number of (GATA) repeats was determined by resolving the PCR amplicons by 12% nondenaturing polyacrylamide gel electrophoresis with commercial and in-house standard DNA ladders, as described by Mehra et al. (2012) [21]. rs2289917 was genotyped by PCR-RFLP as reported earlier [20]. Two independent observers assigned the genotypes, and unambiguous genotypes were assigned to 97 cases and 110 controls.
Chi-square test (χ²), trend test, Fisher's exact test, and odds ratios (OR) with 95% confidence intervals (CI95%) were used to test differences between the cases and controls, and P values were Bonferroni corrected (Pc) wherever applicable. All statistics were performed using MedCalc version 9.3.9.0. To elucidate the mechanism by which the microsatellite motif could be involved in regulating SPAG16 expression, the potential secondary structure/s in intron 10 of SPAG16A mRNA were predicted using the minimum free energy (ΔG; kcal/mol) method of the MFOLD program (version 3.1) (http://mfold.rit.albany.edu/?q=mfold/RNA-Folding-Form/). Haplotype frequencies for pairs of alleles of D2S2944 and rs2289917, as well as χ² values for allele associations and linkage disequilibrium (LD) coefficients, were estimated using the SHEsis software (http://analysis2.biox.cn/myAnalysis.php/).
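As an illustration of the odds-ratio calculation described above, the following minimal sketch computes an OR and its 95% confidence interval from a 2×2 allele-count table using the Woolf (logit) approximation. The function name `odds_ratio_ci` and the counts are hypothetical and are not data from this study; the exact procedure implemented in MedCalc may differ.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Return (OR, lower, upper) for a 2x2 table using the Woolf (logit) method.

    a: cases carrying the risk allele      b: cases not carrying it
    c: controls carrying the risk allele   d: controls not carrying it
    z: normal quantile (1.96 gives an approximate 95% interval)
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cell counts must be positive")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the Woolf approximation
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts, for illustration only (not taken from the study)
odds_ratio, lower, upper = odds_ratio_ci(30, 10, 20, 20)
print(f"OR = {odds_ratio:.2f}, CI95% = {lower:.2f}-{upper:.2f}")
```

An interval that excludes 1 corresponds to a nominally significant association at the 5% level, before any correction for multiple testing.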
Results

The control cohort, segregated on the basis of caste, showed no differences in longer allele frequency, indicating the absence of any effect of population structure. In a replication cohort from western India, that is, the Shekhawati region of Rajasthan, 16% of 107 healthy adult subjects had 13 "GATA" repeats of D2S2944, which is similar to the control frequency observed in the present cohort. With the given sample size, the study has a power of >80% at a longer-allele frequency of 0.56 in controls and a relative risk of 2 (risk of disease associated with the presence of the longer allele). No significant difference in the frequency of the longer allele exists between the various endophenotypes of cataract. The putative structures of the partial intronic sequence of the SPAG16 gene covering the microsatellite D2S2944 showed that the mRNA fragment had the potential to form stem loops in the microsatellite motif. The stem-loop structure was energetically more favorable when the number of (GATA) repeats was >13 (ΔG for (GATA)9: −19.90 kcal/mol; ΔG for (GATA)12: −21.40 kcal/mol; ΔG for (GATA)15: −22.60 kcal/mol), suggesting that the intronic region of SPAG16 may also regulate SPAG16 mRNA through posttranscriptional events (Figure 3).

Allele and genotype distribution in cases versus controls, respectively: χ² (trend) = 10.981, P = 0.0009, Pc = 0.0063; χ² (trend) = 9.669, P = 0.0019, Pc = 0.0133. a: OR = 0.45, CI95% = 0.25–0.81, P = 0.00[?]9, Pc = 0.0553; b: OR = 2.01, CI95% = 1.23–3.33, P = 0.005[?], Pc = 0.03[?]8. Note: <13 includes all homozygotes with both alleles less than 120 bp in PCR product length, <13/>13 includes heterozygotes with at least one allele of 120 bp PCR product length, and ≥13 includes homozygotes with both alleles more than 120 bp in PCR product length.
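The corrected values reported for the trend tests above appear consistent with a simple Bonferroni adjustment, in which each P value is multiplied by the number of comparisons and capped at 1. The sketch below assumes a correction factor of 7 (one per observed D2S2944 allele); that factor is inferred from the reported numbers rather than stated in the text, and the function name `bonferroni` is illustrative.

```python
def bonferroni(p_value, n_tests):
    """Bonferroni-corrected P value: p * n_tests, capped at 1.0."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p_value must lie between 0 and 1")
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return min(1.0, p_value * n_tests)


# Uncorrected trend-test P values reported above; the factor of 7 is an assumption
for p in (0.0009, 0.0019):
    print(f"P = {p:.4f} -> Pc = {bonferroni(p, 7):.4f}")
# prints:
# P = 0.0009 -> Pc = 0.0063
# P = 0.0019 -> Pc = 0.0133
```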

Discussion
About one-third to one-half of all bilateral pediatric cataracts have a genetic basis with Mendelian inheritance [23]. However, extensive screening for variations in known disease-associated genes could not identify the molecular lesion in a large fraction of the families with inherited cataract in independent studies of Australians, Indians, and Europeans [23][24][25]. Unidentified genes may plausibly be a more significant cause of cataract than previously thought. The present study assessed the effect of the D2S2944 marker on susceptibility to cataract in pediatric subjects. A significant difference in the allele frequencies of the longer (≥13 "GATA" repeats) and shorter alleles of D2S2944 was observed between cases and controls. Beem et al. have shown dominance of the (GATA) 1 tetra repeat allele of D2S2944 and its association with depressive individuals, who are known to be at a higher risk for developing cataract [26,27]. Our results are consistent with the findings of Kapoor et al. (2010) and Maher et al. (2010), who confirmed that allele 7 of the D2S2944 marker is associated with major depressive disorder and mood disorders, respectively, in a sex-specific manner [28,29].
Evidence from the literature indicates that expandable repeats, due to their unusual structural features, disrupt the cellular replication, repair, and recombination machineries, altering gene expression in human cells and leading to disease. Many of these debilitating diseases are caused by repeat expansions in the noncoding regions of their resident genes [30]. Analysis of the chromosome 2 contig shows that the intragenic marker D2S2944 lies 225 kb downstream of exon 10 of the SPAG16L transcript, where an "untranslated exon" ahead of the first coding exon of SPAG16S (exon 11 of SPAG16L) has been reported [16] (Figure 1(b)). Studies have shown that microsatellite motifs in the UTR form structural elements (stem loops) and contribute to mRNA regulation [14]. Our prediction using the MFOLD program showed not only stem-loop structures formed by sequences containing the D2S2944 microsatellite motif but also a more favorable free energy level for sequences with >(GATA)13 repeats as compared with sequences with <(GATA)13 repeats. We propose here that the stem-loop structure stabilized by microsatellite repeat expansion could affect the splicing mechanism that normally governs the formation of the SPAG16S and SPAG16L transcripts, consequently affecting the expression of neighboring genes involved in maintaining lens transparency during developmental stages. In fact, Zhang et al. (2007) could not detect the truncated SPAG16 protein in western blots of sperm extracts from human subjects carrying a heterozygous SPAG16 mutation (a mutation that disrupts the expression of both SPAG16L and SPAG16S), which supports instability of the mRNA transcripts [16]. Zhang et al. (2004) had earlier reported marked impairment of spermatogenesis in mice with a heterozygous mutation in exon 11 of the Spag16 gene, which affected the expression of both the L and S isoforms of SPAG16 protein [31]. However, in humans, haploinsufficiency of SPAG16L/SPAG16S does not impair male fertility [16].
This lends support to our observation that 60% of parents of simplex pediatric cataract cases were homozygous for the risk allele, and it supports the view that the mutation/s in the SPAG16 gene do not confer a reproductive disadvantage but rather may have a profound effect on cell viability. The loss of the L isoform of SPAG16 is responsible for instability of the central apparatus components of the sperm, while the loss of the SPAG16S transcript, in addition to affecting Spag16L mRNA expression, affects postmeiotic germ cell viability [15,16]. Further, the distinct colocalization of SPAG16S with SC35 in nuclear speckles (nonnucleolar domains containing splicing factors as well as TFs, RNA processing units, and structural scaffold proteins), which are linked to the development of a cell-type-specific genomic organization, explains the S isoform's indispensable role in early developmental processes [15].
It is noteworthy that the CRYGB gene lies 5,277,275 bases upstream of the D2S2944 marker (NCBI build 37.1) and has previously been reported to show strong gender differences in expression levels as well [32]. Both markers, that is, rs2289917 and D2S2944, fall in high-LD blocks according to HapMap (…, LOD > …). It is therefore possible that the combined effect of the risk alleles of D2S2944 and rs2289917 arises because there may be low LD (as shown by our in silico analysis) between some neighboring SNP and D2S2944, which would explain the association. We also observed novel sequence variations in the promoter region of the CRYGB gene in pediatric cataract patients, which affected putative TF binding sites in in silico analysis [4]. Being a WD-repeat protein, SPAG16 is known to interact dynamically and reversibly with TFs. Loss of SPAG16 protein together with a promoter polymorphism affecting TF binding in lens transparency maintaining gene/s (as shown by our results) can thus be either a founder or a disseminating event in early lens opacity progression. Recently, SPAG16 was found to be ubiquitously expressed in humans and to have a testis-associated alternative splice variant with oncogenic properties [33]. In addition, the expressed sequence tag profile of the SPAG16 gene at NCBI indicates that its highest restricted pool expression is in the fetus during developmental stages. It is thus plausible that SPAG16 affects the expression of neighboring genes that could be involved in maintaining lens transparency, that is, the CRYGB gene. In conclusion, to the best of our knowledge, this is the first example of a testis-specific gene conferring the ability to regulate lens transparency in developmental stages. However, a larger study is warranted to elucidate the molecular mechanism/s underlying the relationship between the SPAG16 gene and cataract.

Conflict of Interests
There is no conflict of interests among any of the authors of the paper being submitted.