Next Generation Sequencing to Determine the Cystic Fibrosis Mutation Spectrum in the Palestinian Population

An extensive molecular analysis of the cystic fibrosis transmembrane conductance regulator (CFTR) gene was performed to establish the CFTR mutation spectrum and frequencies in the Palestinian population, which is understudied. We used a targeted next-generation sequencing (NGS) approach to sequence the entire coding region and adjacent sequences of the CFTR gene, combined with MLPA analysis, in 60 unrelated CF patients. Eighteen different CF-causing mutations, including one previously undescribed mutation, p.(Gly1265Arg), were identified. The overall detection rate was 67%; when only CF patients with sweat chloride concentrations >70 mEq/L are considered, the detection rate rises to 92%. Whereas p.(Phe508del) is the most frequent allele (35% of the positive cases), 3 other mutations, c.2988+1Kbdel8.6Kb, c.1393-1G>A, and p.(Gly85Glu), showed frequencies higher than 5%, and a total of 9 mutations account for 84% of the mutations. This limited spectrum of CF mutations is in agreement with the homogeneous ethnic origin of the Palestinian population. The relatively large proportion of patients without a mutation is most likely due to clinical misdiagnosis. Our results will be important in the development of an adequate molecular diagnostic test for CF in Palestine.


Introduction
Cystic fibrosis (CF) is a severe, life-threatening genetic disease that is most common among Caucasians, with an incidence ranging from 1 in 2500 to 1 in 3600 [1]. CF is inherited in an autosomal recessive manner. The responsible gene, the cystic fibrosis transmembrane conductance regulator gene (CFTR), is located on chromosome 7q31.2 [2] and encodes a transmembrane protein that functions as a chloride channel and as a regulator of other channels across the epithelial cell membrane. The defective protein impairs water movement across epithelia, leading to the formation of viscous mucus that obstructs the airways of the lungs and the ducts of the pancreas. CF is characterized by progressive lung disease, pancreatic dysfunction, elevated sweat electrolytes, and male infertility [3].
So far, more than 1900 different CFTR mutations have been reported [4]. Although most mutations are rare, the three-base-pair deletion p.(Phe508del) is the most common in the Caucasian population, affecting about 70% of the patients, whereas in the Jewish population p.(Trp1282*) is the most prevalent, with a frequency of 60% [5]. This clearly indicates that the occurrence of mutations is highly population specific. For many ethnic or geographic populations, the mutation spectrum has been determined [6][7][8][9][10][11][12][13][14][15].
Recently, CF has been reported in the Middle East with an incidence ranging from 1 in 2,500 to 1 in 16,000 and with different mutation frequencies according to the ethnic origin of the populations [16]. However, reliable information about the frequency of CF among Palestinians is still lacking, and the spectrum and nature of the mutations have not yet been documented, which hampers molecular diagnostics. Good insight into the nature and frequency of the mutations in a specific population is a prerequisite for setting up adequate and cost-effective molecular diagnostics.
The aim of this study was to determine the CF mutation spectrum among the Palestinian patient population. Samples from 60 unrelated CF patients residing in the West Bank and Gaza were collected and their respective CF mutations were determined. Subsequently, the mutation spectrum was compared with that of other ethnic groups within the Arab population.

Patients and Sample Collection.
A total of 60 unrelated Palestinian CF patients participated in this study, 19 residing in the West Bank and 41 residing in Gaza; 34 were male and 26 were female. Most of the participants (97%) were children younger than 18 years. Inclusion in this study was based on the clinical diagnosis. Typical pulmonary and/or gastrointestinal tract manifestations and/or elevated sweat chloride values (>60 mEq/L, Table 1) were the main criteria. Whole blood (3 mL) was collected in EDTA vacutainer tubes (BD). Participation in this project was voluntary. Signed consent was obtained from each participant and/or the guardian.

DNA Extraction and Polymerase Chain Reaction (PCR).
Genomic DNA was extracted and purified from whole EDTA-blood by the automated extraction apparatus Autopure LS (Qiagen) using the PureGene DNA Purification Kit (Qiagen).
Amplification of the coding region and flanking introns of the CFTR gene was conducted using the 2720 thermal cycler (Applied Biosystems). A total of 28 primer sets were designed, each positioned at least 25 intronic nucleotides away from the boundaries of the 27 exons of the CFTR gene. Primers are listed in supplemental Table 1 (see supplemental Table 1 in Supplementary Material available online at http://dx.doi.org/10.1155/2015/458653). Amplification was performed with 2.5 μL of the purified DNA template (50 ng/μL) in a total reaction volume of 10 μL. The complete mix consisted of the following components: 5 μL of 2X KAPA2G Robust Hot Start Ready Mix (Kapa Biosystems), 1.25 μL upstream primer (0.1 μM), and 1.25 μL downstream primer (0.1 μM). The following amplification conditions were used: 95 °C for 3 min; 32 cycles of 95 °C for 15 sec, 60 °C for 10 sec, and 72 °C for 15 sec; and a final extension at 72 °C for 1 min. PCR products were assessed by capillary electrophoresis on the LabChip GX (PerkinElmer, USA).
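The reaction setup above can be checked with a few lines of arithmetic. The following is a minimal sketch, assuming that the listed volumes are in microlitres and the template concentration is in ng per microlitre; the function names are illustrative and are not part of the original protocol. The cycling time excludes instrument ramp times, which the text does not specify.

```python
# Per-reaction volumes as listed in the text (units assumed to be microlitres).
REACTION_UL = {
    "2X KAPA2G Robust Hot Start Ready Mix": 5.0,
    "upstream primer": 1.25,
    "downstream primer": 1.25,
    "template DNA": 2.5,
}


def total_volume_ul(components):
    """Sum of all component volumes in one reaction."""
    return sum(components.values())


def template_mass_ng(volume_ul=2.5, conc_ng_per_ul=50.0):
    """Mass of template DNA added to one reaction."""
    return volume_ul * conc_ng_per_ul


def cycling_time_s(initial_s=180, step_s=(15, 10, 15), cycles=32, final_s=60):
    """Programmed hold time only; ramping between temperatures is ignored."""
    return initial_s + cycles * sum(step_s) + final_s


print(total_volume_ul(REACTION_UL))  # 10.0, matching the stated 10 uL reaction
print(template_mass_ng())            # 125.0 ng of template per reaction
print(cycling_time_s())              # 1520 seconds, roughly 25 minutes
```

The component volumes add up to the stated total, so under these assumptions no additional water is needed in the mix.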

NGS Sequencing.
For each patient, all PCR products were pooled in equimolar amounts before they entered the Nextera sample preparation protocol (Nextera XT DNA Sample Prep Kit (Illumina, Inc., San Diego, CA)), followed by 250 bp single-end sequencing on a MiSeq instrument (Illumina, Inc., San Diego, CA). All 60 samples were labeled using 60 different index tags (Nextera, Epicentre Biotechnologies).

Data Analysis.
Reads were aligned to the human genome (hg19/GRCh37) using the CLCBio software package (CLC Genomics Workbench 6.0.2). All exons with a coverage lower than 20x were analysed by Sanger sequencing. We first filtered all variants classified as deleterious according to the CFTR mutation database (ENST00000003084) [4]. To identify new

NGS Analysis.
We applied an NGS screening strategy to efficiently identify causative mutations in the 60 unrelated Palestinian patients with a clinical diagnosis of CF. Approximately 98% of all reads on the MiSeq were successfully mapped to the reference genome. The overall mean read depth in the target area was 344x. A read depth of at least 10x was obtained for 92% of the bases and at least 20x for 90% of the bases. Patient P9 had the best coverage, with all amplicons covered at 20x or more, whereas patient P7 had the lowest number of amplicons (13) with a coverage higher than 20x (data not shown).
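Coverage figures of this kind (mean depth and the fraction of target bases at or above a depth threshold) can be derived from per-base read depths. The sketch below is illustrative only: the function name and the toy depth values are assumptions and do not come from the study data or from the CLC Genomics Workbench output.

```python
from statistics import mean


def coverage_summary(depths, thresholds=(10, 20)):
    """Mean depth and fraction of bases covered at >= each threshold."""
    if not depths:
        raise ValueError("depths must not be empty")
    n = len(depths)
    summary = {"mean_depth": mean(depths)}
    for t in thresholds:
        # Count bases whose read depth reaches the threshold.
        summary[f"frac_ge_{t}x"] = sum(d >= t for d in depths) / n
    return summary


# Toy per-base depths, invented for illustration (not study data).
toy_depths = [0, 5, 12, 25, 400, 350, 18, 30, 22, 9]
print(coverage_summary(toy_depths))
# {'mean_depth': 87.1, 'frac_ge_10x': 0.7, 'frac_ge_20x': 0.5}
```

The toy example also shows why the mean depth alone is not sufficient: a few very deeply covered positions can raise the mean while a substantial fraction of bases remains below the 20x threshold used here to trigger Sanger sequencing.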

Identification of Causative Mutations.
We identified 17 different mutations in 40 patients that have previously been described as CF-causing mutations, including 3 splice site mutations, 5 missense mutations, 4 frameshift mutations, 3 stop codons, and 2 exon-spanning deletions (Table 2). A homozygous exon-spanning deletion was suspected when we were unable to amplify exons 19, 20, and 21 by PCR and consequently did not obtain any reads for these amplicons by NGS. The presence of the deletion was confirmed by MLPA analysis.
In addition, we identified one novel, potentially deleterious homozygous missense mutation, p.(Gly1265Arg) (phyloP: 5.53, Grantham dist.: 125, Sift: Deleterious, Mutation Taster: disease causing). The NGS approach, at least with the PCR-enrichment approach applied in this strategy, does not allow detection of large heterozygous deletions or insertions. Therefore, we further investigated the remaining 19 negative patients and the 3 heterozygous patients for the presence of deletions/insertions using MLPA analysis. This revealed two heterozygous deletions in patients P24 and P49. In total, we were able to detect 81 CFTR mutations on a total of 120 alleles from 60 CF patients. Mutations were compared between CF patients originating from the West Bank and those from Gaza, as the two regions have been physically separated for over 66 years. Data are represented in Figure 1.
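The allele-level detection rate follows directly from these counts: each patient carries two CFTR alleles, so 60 patients correspond to 120 alleles. A minimal sketch of the calculation (the function name is illustrative):

```python
def allele_detection_rate(mutant_alleles_found, patients):
    """Fraction of alleles on which a mutation was identified.

    CF is autosomal recessive, so every patient contributes two alleles.
    """
    total_alleles = 2 * patients
    if not 0 <= mutant_alleles_found <= total_alleles:
        raise ValueError("allele count out of range")
    return mutant_alleles_found / total_alleles


rate = allele_detection_rate(mutant_alleles_found=81, patients=60)
print(f"{rate:.1%}")  # 67.5%
```

This agrees with the overall detection rate of 67% reported in the text.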

Discussion
This is the first study to investigate the CFTR mutation spectrum in the Palestinian population; it was conducted on 60 unrelated CF patients residing in the West Bank and Gaza, Palestine. In 41 of these probands we identified 81 CFTR mutations, of which 18 are different. All of these patients are homozygous or compound heterozygous, except patient 18, in whom we identified only a single mutation. This may be explained by the fact that the technology used in this approach does not allow detection of deep intronic mutations. The five most common mutations are p.    [1,15], the deletion of exons 19-21 (c.2988+1Kbdel8.6Kb) was only present in Palestinian Arabs, which may indicate that this mutation is a founder mutation in this population [18,19]. The two mutations p.(Gly85Glu) and c.1585-1G>A, which were identified in this study, were also found among the Jewish CF populations of Balkan (9.5%) and Ashkenazi (1%) origin, respectively [15]. On the other hand, one of the most frequent mutations in the Jewish CF population, p.(Trp1282*) (31%), was found to be less frequent in this study (4%), similar to other Arab CF populations such as Tunisians (4.4%) and Algerians (4.2%) [8,9,15]. The rate of the p.(Asn1303Lys) mutation was about the same in Ashkenazi Jews and in the Palestinian CF population tested in this study (5%), as well as in the Iranian (4.3%) and Tunisian (6.6%) CF populations [8,11,15]. An allele-specific primer-based CF assay for the Palestinian population that includes the 9 most frequent mutations, a technology that can be easily implemented in any diagnostic setting, would result in a test that identifies 84% of the Palestinian mutations.
The differences in mutation rates between CF patients residing in the West Bank and those in Gaza are remarkable. For example, mutations c.1393-1G>A, p.(Gly85Glu), p.(Trp1282*), and p.(Asn1303Lys) were only present among Palestinians in the West Bank, while mutations p.(Lys684Serfs*38) and c.1585-1G>A were only present among Palestinians residing in Gaza (Figure 1). Interestingly, the p.(Phe508del) mutation was more prevalent among patients living in Gaza (68%) than among patients in the West Bank (32%). This may be due to the heterogeneous Arab population and the political barriers that have been imposed on this population for many decades, limiting freedom of movement and concentrating the population into small communities in different geographic areas.
NGS technology combined with MLPA analysis proved to be a very efficient and cost-effective way to identify CFTR mutations. It should be noted that heterozygous exon-spanning deletions or insertions are not detected with the PCR-based enrichment strategy followed by NGS analysis; the combination of the NGS approach with MLPA analysis is therefore highly recommended. The mutations included in the INNO-LIPA CFTR17 and CFTR19 kit cover only 66% of the mutations and therefore have limited diagnostic value for the Palestinian population.
The relatively low mutation detection rate of 67% among the CF patients tested in this study is most likely caused by clinical misdiagnosis based on the sweat chloride test. Values of this test can be influenced by nutritional state, skin condition, age, and many other factors, resulting in a false-positive rate as high as 15% [20,21]. All patients in whom CF mutations were identified and for whom a sweat chloride test had been performed had a sweat chloride value higher than 70 mEq/L. On the other hand, most patients without a mutation had sweat chloride values around or below 60 mEq/L, which makes a clinical diagnosis of cystic fibrosis less likely. If we only consider the former group of patients, with a sweat chloride value higher than 70 mEq/L, the mutation detection rate rises from 67% to 92%.
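The effect of restricting the cohort to patients above a sweat chloride threshold can be illustrated with a short calculation. This is a sketch at the patient level under stated assumptions: the record layout, the function name, and the toy records are hypothetical and are not the study data, so the resulting percentages do not reproduce the 67% and 92% reported above.

```python
def detection_rate(patients, min_sweat_chloride=None):
    """Fraction of patients with an identified mutation.

    patients: list of (sweat_chloride_mEq_per_L or None, mutation_found) tuples.
    min_sweat_chloride: if given, keep only patients with a measured value
    strictly above this threshold.
    """
    selected = [
        (value, found)
        for value, found in patients
        if min_sweat_chloride is None
        or (value is not None and value > min_sweat_chloride)
    ]
    if not selected:
        raise ValueError("no patients match the selection")
    return sum(1 for _, found in selected if found) / len(selected)


# Invented toy records for illustration only (None = no sweat test available).
toy = [(95, True), (110, True), (80, True), (72, False),
       (58, False), (61, False), (None, True), (45, False)]

print(detection_rate(toy))                         # 0.5
print(detection_rate(toy, min_sweat_chloride=70))  # 0.75
```

Patients without a recorded sweat test are excluded once a threshold is applied, which is one reason the stratified rate should be read as conditional on having a measurement.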
In conclusion, we identified the most common CF mutations and their respective frequencies in the Palestinian population. This knowledge is important not only for the families themselves but also as a prerequisite for setting up a reliable and sensitive diagnostic test for CF in this population. Genetic testing for recessive disorders in this area is highly recommended because of the high rates of consanguineous marriage among Palestinians (25%-65%) [22], which result in a high risk for genetic diseases including CF [23].