Targeted Genomic Sequencing of TSC1 and TSC2 Reveals Causal Variants in Individuals for Whom Previous Genetic Testing for Tuberous Sclerosis Complex Was Normal



Introduction
Tuberous sclerosis complex (TSC) is an autosomal dominant condition characterised by seizures, neuropsychiatric disorders, and the development of hamartomas in the brain, lungs, heart, skin and kidneys [1]. Loss-of-function variants in the TSC complex subunit 1 (TSC1; chromosome 9q34; OMIM 605284) or TSC complex subunit 2 (TSC2; chromosome 16p13.3; OMIM 191092) tumour suppressor genes cause TSC [1]. TSC1 consists of 23 exons that extend across 60 kb of genomic DNA and produce an 8.5 kb mRNA encoding the 130 kDa TSC1 protein. The 46 kb TSC2 locus consists of 42 exons that produce a 5.5 kb mRNA encoding the 200 kDa TSC2 protein. TSC1 and TSC2 interact to form the TSC complex, a negative regulator of the mechanistic target of rapamycin (mTOR) complex 1 (TORC1). Signal transduction through TORC1 controls key aspects of metabolism [2], and constitutive TORC1 activation is a hallmark of TSC-associated lesions.
The manifestations of TSC and their severity vary widely, and the identification of an inactivating TSC1 or TSC2 variant can help establish a diagnosis and enable cascade, preimplantation and prenatal genetic testing [3]. Some disease-associated TSC1 and TSC2 variants are found in multiple, unrelated individuals with TSC, but often, a unique variant is identified, and in most cases, the identified variant is the result of a de novo mutation [4,5], either in a gamete or during (early) post-zygotic development [6][7][8]. The TSC1 and TSC2 Leiden Open Variation Databases (LOVD; http://www.lovd.nl/TSC1 and http://www.lovd.nl/TSC2) list many of the variants identified to date, alongside reports of predicted pathogenicity and functional test results. The wide variety of mutation types, ranging from single nucleotide changes to extensive chromosomal rearrangements, combined with the size and complexity of the TSC1 and TSC2 loci and the occurrence of mosaicism, makes the comprehensive identification of variants that cause TSC challenging. Indeed, in 10-15% of individuals with a clinically definite diagnosis of TSC, no causal variant is detected [4][6][7][8]. These individuals are usually referred to as TSC "no mutation identified" (NMI). The failure to identify a causal variant can be due to technical issues associated with the screening method(s) employed or because the variant is located outside the screened region. Next-generation sequencing (NGS) has proven to be effective at overcoming some of these limitations [5,6], and both whole exome sequencing (WES) and whole genome sequencing (WGS) are increasingly being applied as first-line diagnostic tests to identify individuals with TSC [5]. However, WES is not able to detect variants located deep within intronic sequences, and neither WES nor WGS is optimised for the efficient detection of post-zygotic mutations.
HaloPlex custom capture NGS relies on the specific capture of both ends of restriction-digested genomic DNA fragments from a region of interest, simplifying data analysis [9]. Previously, we showed in a small cohort of 6 TSC NMI individuals that HaloPlex custom capture could identify post-zygotic and deep intronic variants [10]. Here, we apply the same approach to a much larger TSC NMI cohort. Our data show that HaloPlex custom capture is an effective approach for the identification of otherwise difficult-to-detect TSC1 and TSC2 variants, particularly post-zygotic mutations. Where possible, we confirmed the HaloPlex results with a complementary DNA-based test and performed functional experiments to obtain evidence for pathogenicity at the mRNA or protein level. Our findings support the utility of bespoke NGS-based genetic analysis for variant detection in TSC and demonstrate the importance of functional approaches towards helping determine variant pathogenicity.

Editorial Policies and Ethical Considerations.
Informed consent was provided by all subjects. All individuals had requested genetic testing of TSC1 and TSC2 for diagnostic purposes, and informed consent was provided as required by the institutional review board of the Erasmus Medical Center (EMC)(METC-2012-387), the NHS research ethics committee for Wales (REC 11WA0276), and the referring institution, according to standard diagnostic protocols.

Patient Cohort. Subjects had been referred for testing to the EMC, Rotterdam, Netherlands, or the Institute of Medical Genetics, Cardiff, UK, either because of a diagnosis of definite or possible TSC [3] or because they were suspected of having TSC but had inadequate clinical details for classification. All were TSC NMI after diagnostic testing that included analysis of all coding exons and intron-exon boundaries by PCR and Sanger sequencing, and multiplex ligation-dependent probe amplification (MLPA) for detection of large rearrangements.
2.3. DNA and RNA Isolation. Genomic DNA and total RNA were extracted from peripheral blood, affected and normal skin samples, and/or cultured skin fibroblasts using standard procedures. DNA quality and concentration were checked with the Quant-iT PicoGreen dsDNA Kit (Invitrogen, Carlsbad, USA).

HaloPlex Custom Capture NGS. Genomic DNA samples were subjected to customised HaloPlex or HaloPlex HS target enrichment assays (Agilent Technologies, Santa Clara, USA) encompassing the TSC1 and TSC2 genomic loci [9,10]. See Supplementary Information, Methods for details.
2.6. Validation of Identified Variants. Likely germline changes were validated using a combination of PCR and Sanger sequencing. Post-zygotic changes were validated by allele-specific (AS) PCR, droplet digital (DD) PCR, or Nextera XT NGS. See Supplementary Methods for details.
To investigate effects on pre-mRNA splicing, RNA was isolated from blood or cultured skin fibroblasts, converted to cDNA using a cDNA synthesis kit (PCR Biosystems), and amplified by PCR. PCR products were analysed by agarose gel electrophoresis and Sanger sequencing. In some cases where no RNA was available, effects on pre-mRNA splicing were investigated using an in vitro exon trapping approach, as described previously [12]. See Supplementary Information, Methods, and Supplementary Tables S7 and S8 for details. Transcriptome sequencing was performed as described previously [13]. The effects of missense and in-frame deletion variants on the TSC complex and on TORC1 activity were assessed in vitro, as described previously [12].

TSC NMI Cohort Characteristics. The cohort consisted of 155 TSC NMI individuals. According to the current clinical criteria [3], 113 (73%) had definite TSC, 34 (22%) had possible TSC, and 8 (5%) were suspected of TSC, but details of their clinical findings were not available to us. The clinical findings are summarised in the Supplementary Information, Tables S4-S6. In addition to testing single individuals, we tested 2 affected duos, 7 duos consisting of an affected subject plus an unaffected first-degree relative, and 38 trios consisting of an affected subject and both unaffected parents. In 6 cases, multiple genomic DNA samples from different tissues of a single individual were analysed.
3.2. TSC1 and TSC2 Variant Identification. We used 5 different HaloPlex custom capture designs, as detailed in the Supplementary Information, Methods, and Table S1. For each design, we obtained an average of >95% coverage of both target regions at a minimum depth of 20 reads per nucleotide, >85% coverage at a depth of 100 reads, and >50% coverage at a read depth of 300 (Supplementary Information, Table S2; the median read depth and range per subject sample are provided in Supplementary Information, Table S3).
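The coverage figures above are fractions of target positions that reach a given read depth. As a minimal sketch of how such a summary can be computed, assuming per-nucleotide depths are already available as a list of integers (the function name `coverage_fractions` and the toy depths are hypothetical and are not taken from the authors' pipeline):

```python
from typing import Dict, Iterable, Sequence


def coverage_fractions(
    depths: Sequence[int],
    thresholds: Iterable[int] = (20, 100, 300),  # depths quoted in the text
) -> Dict[int, float]:
    """Return the fraction of positions with depth >= each threshold."""
    if not depths:
        raise ValueError("depths must contain at least one position")
    n_positions = len(depths)
    return {t: sum(d >= t for d in depths) / n_positions for t in thresholds}


# Hypothetical per-nucleotide depths for a tiny 10-position target.
toy_depths = [350, 320, 310, 150, 120, 110, 90, 40, 25, 5]
fractions = coverage_fractions(toy_depths)
for threshold, fraction in fractions.items():
    print(f"depth >= {threshold}: {fraction:.0%} of positions")
```

In practice the depths would come from a per-base depth report for the TSC1 and TSC2 target regions; how the authors generated theirs is described only in their Supplementary Information.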
First, we searched for likely germline, inactivating TSC1 and TSC2 variants. We defined a minimum threshold of 50 reads (total) and a variant allele frequency (VAF) >40%, in line with a previous study [6]. In 2 affected individuals, from a 4-generation family with TSC, an obligate germline variant was identified with a VAF <40%, most likely due to reduced capture of restriction fragments containing the variant (Table 1; and see Supplementary Information, Figures S1 and S4). We identified from 0 to >70 germline variants per locus per individual, mostly known benign single nucleotide variants (SNVs), often present in multiple individuals in our cohort. Variants were classified according to the criteria of the American College of Medical Genetics and Genomics (ACMG) [14] and following recommendations from the TSC1 and TSC2 LOVD (http://www.lovd.nl/TSC1 and http://www.lovd.nl/TSC2). We identified a (likely) inactivating germline variant in 29 individuals: 7 in TSC1 and 22 in TSC2 (Table 1, Figure 1). In each case, we confirmed the presence of the variant by (i) visual inspection of the reads in the Integrative Genomics Viewer (IGV) (http://www.broadinstitute.org/igv/) and (ii) PCR of genomic DNA from the corresponding individual, followed by Sanger sequencing. To support the pathogenicity of variants predicted to affect TSC complex function or pre-mRNA splicing, functional testing (2 cases) or analysis of subject RNA (5 cases) was performed (Table 1; Figure 2; and see Supplementary Information, Figure S2).
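The read-count and VAF thresholds can be expressed as a simple first-pass triage. The sketch below is illustrative only: the 50-read minimum and the 40% VAF cut-off come from the text, but the class and label names are hypothetical, applying the read minimum to calls below 40% is an assumption, and a VAF of exactly 40% (which the text does not assign to either group) is placed in the lower group here.

```python
from dataclasses import dataclass

MIN_TOTAL_READS = 50      # minimum total reads, as stated in the text
GERMLINE_VAF_CUTOFF = 0.40  # VAF > 40% treated as likely germline


@dataclass(frozen=True)
class VariantCall:
    ref_reads: int
    alt_reads: int

    @property
    def total_reads(self) -> int:
        return self.ref_reads + self.alt_reads

    @property
    def vaf(self) -> float:
        """Proportion of reads that carry the variant."""
        return self.alt_reads / self.total_reads if self.total_reads else 0.0


def triage(call: VariantCall) -> str:
    """First-pass label from read counts alone (hypothetical labels)."""
    if call.total_reads < MIN_TOTAL_READS:
        return "insufficient_depth"
    if call.vaf > GERMLINE_VAF_CUTOFF:
        return "likely_germline"
    # Assumption: everything at or below the cut-off is a mosaic candidate.
    return "candidate_post_zygotic"


print(triage(VariantCall(ref_reads=50, alt_reads=50)))  # VAF 50%
print(triage(VariantCall(ref_reads=90, alt_reads=10)))  # VAF 10%
```

Such a label is only a starting point. As noted above, two obligate germline variants had a VAF <40% because of reduced fragment capture, so family data and orthogonal confirmation remain necessary.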
Next, to identify post-zygotic TSC1 and TSC2 mutations, we searched for variants with a VAF <40%. Candidate (likely) causal variants were confirmed by visual inspection in the IGV and by either AS-PCR, DD-PCR, or Nextera XT NGS analysis of genomic DNA from the corresponding individual, together with appropriate controls (Table 2; Figure 2). Additional support for variant pathogenicity was sought, either by in vitro functional assessment of TSC complex activity (2 cases; see Supplementary Information, Figure S2), analysis of subject RNA (6 cases), or by in vitro exon trap experiments (6 cases; see Supplementary Information, Tables S7 and S8). To identify deletions >150 base pairs (bp) and other rearrangements that prevented fragment capture, we compared VAFs for SNVs across both loci and compared read depths using a z-score analysis [15]. We identified 2 post-zygotic TSC2 deletions: subjects 2.52 and 2.53, estimated VAF: 15% and 10%, respectively. Both events were confirmed by MLPA or SNP array analysis (Table 2; and see Supplementary Information, Figure S3 and Table S9). In total, 54 (likely) inactivating post-zygotic variants were identified, 1 in TSC1 and 53 in TSC2, accounting for 35% of the cohort (Table 2; Figure 1). In 5 individuals with an apparent inactivating post-zygotic variant, we did not (yet) confirm the variant using a second test (Table 3), and in 13 individuals, we identified variants of uncertain significance (VUS) (Table 3; Figure 1).
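The text does not spell out the z-score analysis of read depth (it cites [15]), so the following is only a generic sketch of the idea, not the authors' method. It assumes per-region depths for one test sample and several control samples, normalises each sample by its own median depth, and scores each region against the controls. All names, the toy data and the cut-off of -3 are hypothetical.

```python
from statistics import mean, median, stdev
from typing import List, Sequence


def normalise(depths: Sequence[float]) -> List[float]:
    """Scale depths by the sample median to remove overall yield differences."""
    m = median(depths)
    if m == 0:
        raise ValueError("median depth is zero; sample cannot be normalised")
    return [d / m for d in depths]


def region_z_scores(
    sample: Sequence[float], controls: Sequence[Sequence[float]]
) -> List[float]:
    """z-score of each region's normalised depth relative to the controls."""
    sample_norm = normalise(sample)
    controls_norm = [normalise(c) for c in controls]
    scores = []
    for i, value in enumerate(sample_norm):
        reference = [c[i] for c in controls_norm]
        sd = stdev(reference)
        # With no spread among controls the z-score is undefined: report NaN.
        scores.append((value - mean(reference)) / sd if sd > 0 else float("nan"))
    return scores


# Hypothetical depths for 5 regions; region index 2 is depleted in the sample.
controls = [
    [100, 110, 90, 105, 95],
    [200, 210, 190, 205, 195],
    [150, 160, 140, 155, 145],
]
sample = [120, 130, 60, 125, 115]

z = region_z_scores(sample, controls)
flagged = [i for i, score in enumerate(z) if score < -3]  # NaN never passes
print(flagged)
```

Regions flagged in this way would still need confirmation by an independent method, as was done here with MLPA or SNP array analysis.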

Individuals with Multiple Genomic DNA Samples. In 6 cases, genomic DNA samples from different tissues from a single individual were tested.
In subject 3.7 with a SEGA but no other signs of TSC, a TSC2 c.4375C>T, p.(Arg1459*) variant was identified in the SEGA DNA (VAF 53%) but was absent from peripheral blood DNA (Table 3).
A TSC2 c.5024C>T, p.(Pro1675Leu) variant (VAF 2%) was identified in genomic DNA isolated from a shagreen patch that was the only clinical sign of TSC in subject 3.19, but not in genomic DNA isolated from peripheral blood or from fibroblasts cultured from a biopsy of normal skin, either by HaloPlex NGS or by AS-PCR. This variant is likely a somatic event, specific to the shagreen patch (Table 3).
In subject 3.20, a novel variant in the overlapping 3′ UTR of TSC2 and the polycystin 1, transient receptor [...] Table S6). An inactivating TSC2 c.1331del, p.(Asn444Thrfs*5) variant (VAF 3%) was identified in genomic DNA isolated from the angiofibroma but was absent from genomic DNA isolated from blood (Table 3) and is, therefore, likely to be a lesion-specific, second-hit mutation. In subject 3.5 with definite TSC, a TSC2 c.599+4A>G variant was detected in genomic DNA isolated from formalin-fixed paraffin-embedded (FFPE) SEGA tissue (VAF 30%), but not in genomic DNA isolated from peripheral blood. We failed to confirm the presence of the variant in the SEGA DNA, either by standard PCR followed by Sanger sequencing, or by AS-PCR. Therefore, this individual remained NMI (Table 3).

Table 1: Inactivating, likely germline TSC1 and TSC2 variants identified using HaloPlex custom capture NGS. Individuals fulfilling the clinical criteria for definite TSC [3] are indicated with "TSC"; those fulfilling only criteria for possible TSC are indicated with "?"; individuals for whom clinical information was not available to us are indicated with "n/a." VAF, variant allele frequency, refers to the proportion of reads containing the corresponding variant. Cases for which multiple family members or multiple DNA samples were tested are indicated. Evidence for effects on pre-mRNA splicing was obtained by analysis of subject RNA isolated from either peripheral blood (RNA 1. ) or cultured skin fibroblasts [...]

Cases with Genomic DNA Samples from Multiple Family Members. We analysed 9 duos and 38 trios (see Tables 1-3). In 6 cases, a likely de novo germline variant was identified (Table 1). In 2 cases, the variant cosegregated with TSC: subjects 1.10 and 1.11 (Table 1) were both from a 4-generation family with TSC (see Supplementary Information, Figure S4) and subject 1.5 (Table 1) inherited an inactivating variant from subject 2.7 (Table 2), who was mosaic for the variant. In 16 cases, an affected child of healthy parents was mosaic for a TSC2 variant (Table 2). In the remaining cases with multiple family members, no inactivating TSC1 or TSC2 variant was identified (see Supplementary Information, Table S6).

[...] Table 1) are shown above the corresponding gene; post-zygotic variants (see Table 2) are below. Variants of uncertain clinical significance and unconfirmed variants (see Table 3) are shown in italics.

Discussion
We investigated a cohort of 155 individuals with a clinical diagnosis of definite or possible TSC, or with suspected TSC but with inadequate clinical details for classification, in whom previous genetic testing had not identified a causal variant. We identified an inactivating TSC1 or TSC2 variant in 83 (54%), including 65/113 (58%) of those with clinically definite TSC and 18/42 (43%) with possible TSC, or suspected of TSC but without sufficient clinical information for classification (Tables 1 and 2; and Supplementary Information, Tables S4 and S5). In 4 cases, we identified an inactivating variant in genomic DNA isolated from affected tissue, but not in genomic DNA isolated from peripheral blood (Table 3). These most likely represent lesion-specific and/or second-hit events. In 13 cases (8%), we identified a variant but did not obtain sufficient evidence to establish or exclude pathogenicity (Table 3). Identification of an inactivating variant provided diagnostic certainty for the 18 individuals in whom TSC was suspected or could be defined only as "possible," and in 83 cases, it provides the potential for prenatal or preimplantation genetic diagnostics and cascade testing for other family members, which was previously not possible.

Variants were classified according to the American College of Medical Genetics and Genomics (ACMG) criteria [15].
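The yield figures quoted in the opening paragraph of the Discussion follow directly from the counts reported in the Results. A short check of the arithmetic, using only numbers stated in the text (the variable and function names are illustrative):

```python
def percent(numerator: int, denominator: int) -> int:
    """Percentage rounded to the nearest whole number."""
    return round(100 * numerator / denominator)


cohort = 155
definite, possible_or_suspected = 113, 42  # 34 possible + 8 suspected
germline, post_zygotic = 29, 54            # Tables 1 and 2
solved = germline + post_zygotic           # individuals with an inactivating variant

print(solved, percent(solved, cohort))         # overall yield
print(percent(65, definite))                   # definite TSC
print(percent(18, possible_or_suspected))      # possible or suspected TSC
print(percent(post_zygotic, cohort))           # post-zygotic share of the cohort
print(percent(13, cohort))                     # variants of uncertain significance
```

The two subgroup counts (65 and 18) also add up to the 83 solved cases, and the subgroup sizes (113 and 42) add up to the cohort of 155.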

Human Mutation
Similar to a previous study of TSC NMI cases [6], 19/29 (66%) of the identified inactivating germline variants were located within sequences that had been screened during previous diagnostic testing, suggesting that simple technical issues account for a proportion of apparent TSC NMI cases. For example, we identified benign SNVs in cis that could have interfered with PCR primer binding (data not shown). In contrast, variants located deep within introns that interfere with TSC1 or TSC2 pre-mRNA splicing will never be identified by exon- or exome-based approaches. In 19 cases, we identified deep intronic variants (>10 nucleotides up- or downstream from an exon), accounting for 12% of the cohort and 16/113 (14%) of the NMI cases with a clinical diagnosis of definite TSC. Evidence for or against variant pathogenicity was obtained either by family studies, analysis of RNA, or by in vitro exon trap experiments (Tables 1-3; see Supplementary Information, Tables S7 and S8). Notably, 2 recurrent deep intronic variants, TSC2 c.2838-122G>A and TSC2 c.848+281C>T, were identified in 10 unrelated cases, accounting for 6% of the cohort. We had originally identified the TSC2 c.2838-122G>A variant in another individual [10] and have subsequently identified 2 further unrelated cases after targeted testing in our diagnostic laboratories (data not shown). The TSC2 c.848+281C>T variant was reported previously in a separate study [6].
We identified an apparent post-zygotic mutation (VAF <40%) in 54 individuals (35% of the cohort), consistent with earlier reports of frequent mosaicism in TSC [6][7][8]16] (Figure 1(c)). Detection of low-level mosaicism requires high-quality reads, deep coverage, and careful analysis of the data and is, therefore, easy to miss using routine diagnostic applications of WES or WGS [17]. The depth of coverage and the quality of the sequence reads following HaloPlex capture were variable and, in contrast to other studies [7,16], we could not reliably detect variants with VAF <1%. Coverage at read depths >1000 was limited (Supplementary Information, Tables S1-S3), and although we did not observe a strong correlation between the median read depth per sample and the identification of a variant (Supplementary Information, Figure S5), it is likely that some low-frequency variants escaped detection. In mosaic individuals, the VAF may vary considerably between tissues, and testing multiple tissues, including hamartoma in which at least a proportion of cells should contain the first post-zygotic mutation, has been shown to be a fruitful approach [6][7][8]16] and could also help resolve some of the additional remaining NMI cases in our cohort. Nonetheless, we identified and confirmed post-zygotic variants in genomic DNA from a significant proportion of the subjects (Table 2).
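Why read depth limits the detection of low-VAF variants can be illustrated with a simple binomial sampling model. This is a back-of-the-envelope sketch rather than part of the study: it assumes that each read independently carries the variant with probability equal to the VAF, it ignores sequencing error and capture bias, and the minimum number of supporting reads (`min_alt_reads`) is a hypothetical calling rule that the text does not specify.

```python
from math import comb


def detection_probability(depth: int, vaf: float, min_alt_reads: int) -> float:
    """P(at least min_alt_reads variant reads) under Binomial(depth, vaf)."""
    if depth < 0 or min_alt_reads < 0:
        raise ValueError("depth and min_alt_reads must be non-negative")
    if not 0.0 <= vaf <= 1.0:
        raise ValueError("vaf must lie between 0 and 1")
    # Sum P(X = k) for k below the required count, capped at the read depth.
    upper = min(min_alt_reads, depth + 1)
    p_fewer = sum(
        comb(depth, k) * vaf**k * (1.0 - vaf) ** (depth - k) for k in range(upper)
    )
    return 1.0 - p_fewer


# Hypothetical comparison: a 1% VAF variant, requiring 5 supporting reads.
for depth in (100, 300, 1000):
    p = detection_probability(depth, vaf=0.01, min_alt_reads=5)
    print(f"depth {depth}: detection probability {p:.3f}")
```

Under these assumptions the chance of sampling enough variant reads rises steeply with depth, which is consistent with the observation that limited coverage at depths >1000 restricts sensitivity for variants with very low VAF.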
In addition to the limitations discussed above, there are 2 other reasons for our failure to detect a causal variant in all cases. First, some individuals who were tested might not have TSC (see Supplementary Information, Table S6). Second, the HaloPlex method is not able to efficiently capture junction fragments created by DNA rearrangements affecting >150 bp and is, therefore, not suited to detection of the large deletions and rearrangements that account for 3% (214/8202; search 1/6/2022) of the pathogenic TSC2 variants and 0.5% (16/2964; search 1/6/2022) of the pathogenic TSC1 variants listed in the TSC2 and TSC1 LOVD. We only identified 2 large post-zygotic TSC2 deletions, accounting for <2% of our cohort (Table 2; and Supplementary Information, Figure S3), and failed to identify a known inversion at the TSC2 locus in a control sample (data not shown).
Despite these caveats, our work shows the benefit of detailed analysis of the TSC1 and TSC2 genomic loci for TSC molecular diagnostics and indicates that targeted genomic NGS with high-quality reads and high read depth is an appropriate molecular screening method for individuals where there is a clinical suspicion of TSC, allowing reliable detection of both deep intronic variants that affect pre-mRNA splicing and low-frequency post-zygotic changes. The implementation of similar approaches in diagnostic laboratories could circumvent the requirements for either labour-intensive PCR-based exon-specific screening or inefficient WES/WGS approaches. However, the low number of cases identified with a VAF <1%, or with a large DNA rearrangement, suggests that other high read-depth approaches, particularly of genomic DNA isolated from multiple affected tissues [6][7][8]16], might help solve more TSC NMI cases. Finally, our work has increased the spectrum of inactivating TSC1 and TSC2 variants associated with TSC and provides insight into the mechanisms of TSC pathogenesis.

Data Availability
Variants have been deposited in the TSC1 and TSC2 LOVD [https://databases.lovd.nl/shared/genes/TSC1 and https://databases.lovd.nl/shared/genes/TSC2]. Primer sequences are available on request. The data that support the findings of this study are available from the corresponding authors, with the exception of primary patient sequencing data, which are derived from patient samples with unique variants and for which anonymity cannot be guaranteed. Our institutional guidelines do not allow sharing of these raw data, as this is not part of the patient consent procedure.