Assessing Noncoding Sequence Variants of GJB2 for Hearing Loss Association

Involvement of GJB2 noncoding regions in hearing loss (HL) has not been extensively investigated. However, three noncoding mutations, c.-259C>T, c.-23G>T, and c.-23+1G>A, have been reported. In addition, c.-684_-675del, of uncertain pathogenicity, was found upstream of the basal promoter. We performed a detailed analysis of GJB2 noncoding regions in Portuguese HL patients (previously screened for GJB2 coding mutations and the common GJB6 deletions) and in control subjects, by sequencing the basal promoter and flanking upstream region, exon 1, and the 3'UTR. All individuals were genotyped for c.-684_-675del and 14 SNPs. Novel variants (c.-731C>T, c.-26G>T, c.*45G>A, and c.*985A>T) were found in controls. A hearing individual homozygous for c.-684_-675del was identified for the first time, supporting the nonpathogenicity of this deletion. Our data indicate linkage disequilibrium (LD) between SNPs rs55704559 (c.*168A>G) and rs5030700 (c.*931C>T) and suggest the association of the c.[*168G;*931T] allele with HL. The c.*168A>G change, predicted to alter mRNA folding, might be involved in HL.

Houseman and coworkers [16] analysed HL patients heterozygous for c.101T>C (p.Met34Thr), in whom no second GJB2 coding mutation had been detected, and identified a monoallelic 10 bp deletion, c.-684_-675del (first designated -493del10), upstream of the basal promoter. The deletion was also present in other hearing-impaired individuals as well as in control individuals, with or without c.101T>C. However, c.-684_-675del homozygosity was only observed in c.101T>C homozygous patients. The fact that in the control population 22 of the 25 (88%) c.101T>C heterozygotes carried the deletion suggested the existence of LD between c.101T>C and c.-684_-675del, later demonstrated by Zoll and coworkers [22]. Transcription was observed from alleles harbouring in cis the deletion and the variant c.101T>C, derived from keratinocytes and cell lines. However, any subtle differences would not have been detected, since this was not a quantitative analysis [16]. To date, the role of c.-684_-675del in HL has remained uncertain.
In the present study, we have analysed the basal promoter and the flanking upstream region, as well as exon 1 and the 3'UTR of the GJB2 gene, in 89 Portuguese HL patients. The same analysis was conducted on 91 normal hearing control individuals from the Portuguese population.

Subjects.
Eighty-nine Portuguese HL patients previously screened for mutations in the GJB2 coding region and acceptor splice site (by SSCP and/or sequencing) and for the del(GJB6-D13S1830) and del(GJB6-D13S1854) GJB6 deletions (using the methodology described in [5]) were enrolled in this study. Eight patients were heterozygous for a GJB2 coding mutation: c.71G>A (p.Trp24X; n = 1), c.35delG (n = 3), c.109G>A (p.Val37Ile; n = 1), c.380G>A (p.Arg127His; n = 1), c.457G>A (p.Val153Ile; n = 2), and one patient was heterozygous for the c.-22-12C>T variant (apparently a polymorphism; dbSNP accession number rs9578260). No patient harboured either of the known GJB6 deletions. The HL was nonsyndromic in all patients, except for one of them, who presented with Waardenburg syndrome. The patient was heterozygous for the controversial c.457G>A mutation and was thus included in the study. The patients presented with bilateral, mild to profound HL, and were either familial or sporadic cases. The familial cases predominantly showed a recessive pattern of inheritance. All patients were audiologically evaluated by pure tone audiometry.
The control sample was composed of 91 Portuguese individuals with apparently normal hearing. The c.101T>C GJB2 status of the control individuals harbouring c.-684_-675del, referred to here, had previously been investigated by sequencing as part of unpublished work. The status of the entire GJB2 coding region is not known for the vast majority of the 91 control individuals, who were included in this study blindly (that is, not on the basis of any available GJB2 coding region status).
Informed consent was obtained from all the participants.
Novel variants were submitted to dbSNP and the respective reference SNP (rs) accession numbers are provided within the text.
SNPs are referred to by the dbSNP reference SNP (rs) accession number whenever it was available, and by the HGVS recommended designation, relative to the aforementioned reference sequences. For the sake of simplicity, when describing the composite genotypes regarding SNPs rs73431557 (c.-410T>C), rs3751385 (c.*84T>C), rs55704559 (c.*168A>G), and rs5030700 (c.*931C>T), the genotype at each position, indicated in order from 5' to 3', is designated by A, C, G, or T if homozygous, or by a code letter, according to the IUPAC nucleotide ambiguity code, if heterozygous.
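The composite genotype notation can be illustrated with a minimal sketch. The function names and the input representation (one pair of alleles per SNP, listed 5' to 3') are assumptions made for this example only; they are not part of the study's methodology.

```python
# IUPAC ambiguity codes for the six possible heterozygous nucleotide pairs.
IUPAC_HET = {
    frozenset("AG"): "R",
    frozenset("CT"): "Y",
    frozenset("CG"): "S",
    frozenset("AT"): "W",
    frozenset("GT"): "K",
    frozenset("AC"): "M",
}


def site_code(allele1: str, allele2: str) -> str:
    """Return the base itself if homozygous, else the IUPAC ambiguity letter."""
    a, b = allele1.upper(), allele2.upper()
    if a == b:
        return a
    return IUPAC_HET[frozenset((a, b))]


def composite_genotype(sites) -> str:
    """Join the per-site codes; `sites` is a list of allele pairs, 5' to 3'."""
    return "".join(site_code(a, b) for a, b in sites)


# SNP order: c.-410T>C, c.*84T>C, c.*168A>G, c.*931C>T
# Heterozygous T/C at c.-410, homozygous C, A, and C at the other positions.
example = [("T", "C"), ("C", "C"), ("A", "A"), ("C", "C")]
print(composite_genotype(example))  # YCAC
```

Under this convention, an individual heterozygous at all positions except c.-410 (homozygous C) and c.*84 (homozygous T) would be written CTRY.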

Genotyping and Statistical Analysis.
The allelic frequencies of the deletion c.-684_-675del and the 14 SNPs were determined in the control population and used to test for Hardy-Weinberg equilibrium. The chi-square test was used to compare the allelic frequencies of the patients with those of the normal hearing individuals. Allelic frequencies of the control sample for the 14
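The two tests named above can be sketched as follows. This is an illustration under stated assumptions, not the procedure or software actually used in the study: it assumes a Pearson chi-square without continuity correction and one degree of freedom, and the function names are hypothetical. The genotype counts in the Hardy-Weinberg example are invented; the allele counts in the second example are derived from the c.-684_-675del carriers reported in the Results (two heterozygous and one homozygous patient among 89; six heterozygous and one homozygous control among 91).

```python
import math


def chi2_sf_1df(x: float) -> float:
    """Upper-tail probability of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))


def hardy_weinberg_chi2(n_AA: int, n_Aa: int, n_aa: int):
    """Chi-square goodness of fit of genotype counts to Hardy-Weinberg proportions."""
    n = n_AA + n_Aa + n_aa
    p = (2 * n_AA + n_Aa) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_AA, n_Aa, n_aa)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    return chi2, chi2_sf_1df(chi2)


def allele_chi2(minor_cases: int, total_cases: int, minor_ctrl: int, total_ctrl: int):
    """Pearson chi-square on the 2x2 table of allele counts (patients vs. controls)."""
    table = [
        [minor_cases, total_cases - minor_cases],
        [minor_ctrl, total_ctrl - minor_ctrl],
    ]
    n = total_cases + total_ctrl
    rows = [sum(r) for r in table]
    cols = [table[0][j] + table[1][j] for j in range(2)]
    if 0 in rows or 0 in cols:
        raise ValueError("chi-square is undefined when a row or column total is zero")
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            e = rows[i] * cols[j] / n
            chi2 += (table[i][j] - e) ** 2 / e
    return chi2, chi2_sf_1df(chi2)


# Hypothetical genotype counts (AA, Aa, aa) for a Hardy-Weinberg check.
print(hardy_weinberg_chi2(50, 40, 10))

# c.-684_-675del: 4 deletion alleles in 178 patient chromosomes,
# 8 deletion alleles in 182 control chromosomes (derived from the Results).
print(allele_chi2(4, 178, 8, 182))
```

With the deletion counts above, the statistic is about 1.29 and the P value is well above 0.05, which agrees with the reported absence of a statistically significant difference between patients and controls. When expected counts are small, as for a rare homozygous class, an exact test is usually preferable to the chi-square approximation.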

Results and Discussion
In the current study, 89 Portuguese HL patients, previously screened for mutations in the GJB2 coding region and acceptor splice site (80 patients presenting no mutation, plus eight heterozygous for coding mutations and one heterozygous for the noncoding variant c.-22-12C>T), and 91 hearing individuals were analysed with regard to the noncoding region immediately upstream of exon 1 (which includes the basal promoter), exon 1, and the whole 3'UTR of the GJB2 gene. All individuals were also genotyped for c.-684_-675del and 14 SNPs located therein.

DNA Sequence Variants.
No additional GJB2 variant was found in any of the eight patients previously found to be heterozygous for a coding GJB2 mutation or in the patient heterozygous for the c.-22-12C>T noncoding variant.
Among the remaining 80 patients, six presented noncoding variants that had already been reported (Table 1).
One patient, presenting with profound HL, was heterozygous for the donor splice site mutation c.-23+1G>A. The patient may simply be a carrier, or another GJB2 or GJB6 mutation might remain undetected. Another patient, presenting with moderate to severe HL, harboured in heterozygosity the c.-216T>G variant, located within the basal promoter, between two GT boxes [24,25]. This variant was previously identified in two HL patients, also in heterozygosity [26]. The c.-45C>A variant in exon 1 was found in heterozygosity in one individual with severe HL. This variant was referred to by Wilch and coworkers [8] as a SNP at position +94 in exon 1. These authors observed expression of the GJB2 allele harbouring the variant but, since a quantitative comparison with the wild-type allele was not performed, a possible contribution to HL cannot be excluded. Three affected individuals (two heterozygous and one homozygous) harboured the deletion c.-684_-675del.
No novel putatively pathogenic noncoding mutation was found in the patients, which might be due to the low number of monoallelic individuals and the small sample size. It is also possible that such mutations are simply very rare in our population.
Among controls, four novel noncoding variants were identified: c.-731C>T, c.-26G>T, c.*45G>A, and c.*985A>T (rs112400198, rs112875543, rs112399473, and rs111729919, respectively). Each of these variants was identified only once, in heterozygosity, and in different individuals (Table 1). The hearing individual harbouring the novel c.-731C>T variant was also heterozygous for the recessive c.670A>C (p.Lys224Gln) mutation (https:// .cchmc.org/LOVD/; phase unknown). One control individual harboured the c.-45C>A exon 1 variant in heterozygosity (Table 1). Interestingly, we found one control subject homozygous for c.-684_-675del (Table 1), which is, to our knowledge, the first case described to date of a normal hearing individual presenting this genotype. This individual did not harbour the c.101T>C mutation. Our finding, together with the previous report of transcription from alleles harbouring c.-684_-675del [16], suggests the nonpathogenicity of the deletion. In addition, six normal hearing heterozygotes for the deletion were also identified (Table 1), one of whom was also heterozygous for c.101T>C.
It should be noted that the pathogenic basal promoter mutation c.-259C>T, identified for the first time in a Portuguese family [18], was not found among the 89 patients and 91 normal hearing individuals analysed here, nor was it identified in the other studies that analysed the basal promoter [14,16-21]. Therefore, the known occurrence of c.-259C>T continues to be restricted to that Portuguese family. The allelic frequencies of the deletion c.-684_-675del in patients and controls are not statistically different (Table 2). The allelic frequency observed for this deletion in our control population is close to the one found in the British control population [16], and higher than the one determined in the German control population [22].

Genotypic Data and Statistical Analysis.
The allelic frequencies regarding SNPs c.-410T>C, c.*84T>C, c.*168A>G, and c.*931C>T were statistically different between patients and controls (Table 2). By sorting both patients and controls into groups reflecting the genotypes for these four SNPs altogether, eleven composite genotypes were identified (Figure 1). Comparison of the genotypic frequencies in controls and patients readily revealed an increased frequency in patients of the genotypes YYRY and CTRY, both heterozygous for SNPs c.*168A>G and c.*931C>T. Also, the genotype YCAC was identified in four patients but not found in controls. In contrast, a decrease was observed in the frequency of the three genotypes that are most represented in controls: TCAC, TYAC, and YYAC. Each of the remaining genotypes was scarcely represented in both controls and patients (0%-2%), and their frequency did not vary by more than 2% between the two groups; only 3% of controls and 4% of patients belong to one of these genotypes.
We also observed that SNPs c.*168A>G and c.*931C>T are in LD (Table 2, SNP pair 8:9). Interestingly, the overrepresentation of the c.[*168A>G(+)*931C>T] genotype among patients, when compared with hearing controls, is statistically highly significant (χ² = 28.159; P = 3.4 × 10⁻⁶), thus accounting for the statistically significant differences in the allelic frequencies of these two SNPs between patients and hearing controls.
The statistically significant differences also observed in the allelic frequencies of SNPs c.-410T>C and c.*84T>C seem to be due to the differential association of these SNPs with c.*168A>G and c.*931C>T (Table 2, SNP pairs 2:8, 2:9, 5:8, and 5:9). It should be noted that the presence of genotype YCAC among patients contributes somewhat to the difference in allelic frequencies between patients and controls regarding SNP c.-410T>C.
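For readers unfamiliar with how LD between two biallelic SNPs is quantified, the following minimal sketch computes the standard measures D, D', and r² from haplotype counts. It is not the authors' analysis: it assumes phased haplotype counts are available (in practice they usually have to be estimated from unphased genotypes), the function name is hypothetical, and the counts in the example are invented for illustration.

```python
def ld_measures(n_AB: int, n_Ab: int, n_aB: int, n_ab: int):
    """Return (D, D', r^2) for two biallelic loci from the four haplotype counts."""
    n = n_AB + n_Ab + n_aB + n_ab
    p_AB = n_AB / n
    p_A = (n_AB + n_Ab) / n  # frequency of allele A at the first locus
    p_B = (n_AB + n_aB) / n  # frequency of allele B at the second locus

    # D: deviation of the observed AB haplotype frequency from independence.
    d = p_AB - p_A * p_B

    # D': D scaled by its maximum attainable magnitude given the allele frequencies.
    if d >= 0:
        d_max = min(p_A * (1 - p_B), (1 - p_A) * p_B)
    else:
        d_max = min(p_A * p_B, (1 - p_A) * (1 - p_B))
    d_prime = d / d_max if d_max > 0 else 0.0

    # r^2: squared correlation between the alleles at the two loci.
    denom = p_A * (1 - p_A) * p_B * (1 - p_B)
    r2 = d * d / denom if denom > 0 else 0.0
    return d, d_prime, r2


# Invented counts: the two minor alleles always travel together (complete LD).
print(ld_measures(10, 0, 0, 90))

# Invented counts: the four haplotypes are equally frequent (no LD).
print(ld_measures(25, 25, 25, 25))
```

In the first example both D' and r² equal 1 (up to floating-point rounding), which is the pattern expected when one variant is found only on chromosomes carrying the other.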
The fact that the YCAC genotype is not represented among the 91 control individuals while it occurs in 4/89 patients is noteworthy. The presence of genotype YCAC implies the presence of haplotype CCAC, whose frequency is at least 2.2% among patients and is estimated to be null in the control population (as inferred from Table 3(a)). In order to validate a possible association of haplotype CCAC with HL, analysis of larger samples of patients and normal hearing individuals is necessary. Interestingly, one of the four patients with this composite genotype is a c.457G>A heterozygote (phase unknown).
The change c.*168A>G, regardless of the genotype at position c.*931, was predicted to alter mRNA folding. In contrast, the change c.*931C>T, regardless of the genotype at position c.*168, is not predicted to alter mRNA folding (Figure 2). The c.*168A was predicted to be located in an internal loop of a stem-loop structure (Figure 2). Regulatory motifs in the mRNA 3'UTR seem to function in the context of specific secondary structure [27]. Stem-loop structures occurring in the 3'UTR have been implicated in gene expression, with roles at the level of mRNA stability (e.g., the SLDE of the G-CSF gene [28], the CDE of the TNF-alpha gene [27,29], the complex structure integrating three C-rich elements of the alpha-globin gene, the histone mRNA 3' terminal stem-loops, and the IRE of the TFRC gene [27]) or translation (e.g., the common 30-37 nucleotide long element present in the target mRNAs of TIA-1, a translational repressor [30], and the SECIS element [27]). The disruption of the predicted stem-loop structure and/or other adjacent stem-loop structures (Figure 2), induced by the c.*168A>G change, might lead to deregulation of GJB2 gene expression, thus contributing to the hearing loss phenotype. It should be stressed that mRNA folding predictions are fallible. This fact notwithstanding, the simple change of sequence, without affecting the secondary structure, could conceivably disrupt a binding site for a trans-acting factor, also leading to gene expression deregulation. Regarding the c.*931C>T variant, despite the predictions that c.*931C occurs in a helix and that the change from C to T does not have structural implications, the in vivo situation might be different. Functional studies involving constructs containing a reporter gene's coding sequence fused with the GJB2 3'UTR could help elucidate the functional significance of these two sequence variants.
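The inference made earlier in this section, that genotype YCAC implies haplotype CCAC at a frequency of at least 2.2% among patients, can be made explicit with a short sketch. The function name is hypothetical. The code relies on the fact that a composite genotype with at most one heterozygous site can be phased unambiguously, whereas genotypes with two or more heterozygous sites (such as YYRY) cannot be resolved without further information.

```python
# The two nucleotides represented by each IUPAC heterozygote code.
IUPAC_PAIRS = {"R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC"}


def resolve_haplotypes(genotype: str):
    """Return the two haplotypes if phase is unambiguous, otherwise None.

    Phase is unambiguous when the genotype has at most one heterozygous site.
    """
    het_sites = [i for i, code in enumerate(genotype) if code in IUPAC_PAIRS]
    if len(het_sites) > 1:
        return None
    if not het_sites:
        return (genotype, genotype)
    i = het_sites[0]
    first, second = IUPAC_PAIRS[genotype[i]]
    return (
        genotype[:i] + first + genotype[i + 1:],
        genotype[:i] + second + genotype[i + 1:],
    )


print(resolve_haplotypes("YCAC"))  # ('CCAC', 'TCAC')
print(resolve_haplotypes("YYRY"))  # None

# Each of the 4 YCAC patients carries exactly one CCAC chromosome, and the
# 89 patients contribute 2 * 89 = 178 chromosomes in total.
carriers, patients = 4, 89
min_freq = carriers / (2 * patients)
print(f"{min_freq:.4f}")  # 0.0225
```

The value is a lower bound because CCAC chromosomes may also be present in patients whose genotypes have more than one heterozygous site and therefore cannot be phased directly.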

3'UTR Variants and mRNA Folding.
In this study, of a total of 15 patients presenting either a GJB2 coding mutation or a noncoding variant, 14 do not harbour either the c.*168A>G or the c.*931C>T change, whereas one patient, heterozygous for the controversial c.380G>A mutation, is a compound heterozygote regarding SNPs c.*168A>G and c.*931C>T (phase unknown). Therefore, our data do not allow conclusions to be drawn concerning a putative role of the two 3'UTR variants in the HL of some monoallelic patients. In this regard, the investigation of the genotypes regarding the c.*168A>G and c.*931C>T variants in larger samples of monoallelic patients would be interesting. Finally, the finding of one c.*168G homozygote (a c.*931C>T heterozygote, carrying no other GJB2 sequence variant) in our patient cohort might further support a possible role of c.*168G in HL.

Conclusion
This study suggests the association of the noncoding SNPs c.*168A>G and c.*931C>T with HL. The c.*168A>G change is predicted to alter mRNA folding, suggesting a putative role of this SNP in the pathology. Our data also point to a possible association with HL of the haplotype CCAC, comprising SNPs c.-410T>C, c.*84T>C, c.*168A>G, and c.*931C>T, respectively. However, this observation requires validation through analysis of a larger number of subjects. Targeted sequence capture combined with massively parallel sequencing makes it easy and cost-effective to screen large numbers of genes, and might cover the noncoding sequences of some of them, such as GJB2. This approach could prove very useful for genetic diagnosis in cases of nonsyndromic HL (NSHL) [31], with predictable benefits for genetic counselling of the affected families.