Characterization of Ancestral Origin of Cystic Fibrosis of Patients with New Reported Mutations in CFTR

The incidence of cystic fibrosis (CF) and the frequency of the variants reported for CFTR depend on the population; furthermore, CF is characterized by obstructive lung disease and pancreatic insufficiency, among other symptoms, which depend on the individual's genotype. The Ecuadorian population is a mixture of Native Americans, Europeans, and Africans. That population admixture could be the reason for the new mutations reported in a previous study by Ruiz et al. (2019). A panel of 46 Ancestry Informative Markers was used to estimate the ancestral proportions of each available sample (12 samples in total). The Native American ancestry proportion was the most prevalent in almost all individuals, except for three patients from Guayaquil with the mutation [c.757G>A:p.Gly253Arg; c.1352G>T:p.Gly451Val], who had the highest European composition.


Introduction
Cystic fibrosis (CF) is an autosomal recessive disorder that has been extensively studied among populations [1]. It is characterized by obstructive lung disease, pancreatic insufficiency, diabetes, and liver disease, among others [2]. The most frequent mutation worldwide in the Cystic Fibrosis Transmembrane Conductance Regulator (CFTR) gene is c.1521_1523delCTT (p.Phe508del) [3], which originated between 11,000 and 34,000 years ago in Europeans and then spread across Europe [4]. CF occurs in 1 out of 2,500 live births, with high prevalence in populations of European ancestry, and the frequency of heterozygotes has been reported as 1 in 25 in Europeans [5] [6]. Many studies have addressed CF, yet most were conducted in Europeans, leaving Latin Americans underrepresented [4] [6] [7] [8]. In the United States, a study reported the CF incidence to be 1 in 9,200 in Hispanics and 1 in 10,900 in Native Americans, yet the USA has a different population structure from that of South America [1] [6] [9]. In Latin America in general, the incidence is 1 per 6,000 live newborns; Ecuador specifically exhibits an incidence of 1 in 11,252 newborns [10] [11] [12].
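The reported incidence of 1 in 2,500 and the heterozygote frequency of about 1 in 25 are consistent with each other under Hardy-Weinberg proportions. The following minimal sketch shows the arithmetic; it assumes random mating and a single pooled disease allele, which is a simplification for an admixed population, and the function name `carrier_frequency` is hypothetical. The Ecuadorian line is only an illustration of the same formula, not a figure reported in the cited studies.

```python
import math

def carrier_frequency(incidence):
    """Expected heterozygote (carrier) frequency 2pq under Hardy-Weinberg
    proportions, where the incidence of the recessive disorder equals q**2."""
    q = math.sqrt(incidence)   # pooled frequency of disease alleles
    p = 1.0 - q                # frequency of the normal allele
    return 2.0 * p * q

# Incidences quoted in the text; the HWE assumption is ours, not the source's.
for label, incidence in [("Europeans", 1 / 2500), ("Ecuador", 1 / 11252)]:
    carriers = carrier_frequency(incidence)
    print(f"{label}: about 1 carrier in {1 / carriers:.1f}")
```

For an incidence of 1 in 2,500, q is 0.02 and 2pq is 0.0392, that is, roughly 1 carrier in 25.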
Ecuador, located in the northwest of South America, has a mixed population composed of Native Americans, Europeans who arrived in the 16th century during the conquest, and Africans who came with them as slaves. According to the last census, the population projection for 2020 was 17,510,643 Ecuadorians. Moreover, Ecuadorians self-identified as "mestizos" (71.9%), "montubios" (7.4%), Afro-Ecuadorians (7.2%), "indígenas" (7%), "blancos" (6.1%), and others (0.4%) [13]. There are also reports of Ecuadorian ancestry based on AIMs in the mestizo population, in which Native American was the most prevalent ancestry (59.6%), followed by European (28.8%) and lastly African (11.6%) [14] [15].
Like other South American countries, Ecuador is underrepresented in cystic fibrosis research, and none of the existing studies compares the mutations with the population's origin. Paz-y-Miño et al. (1999) reported 10 cases of Ecuadorian CF patients; at least 60% of the mutations differed from c.1521_1523delCTT (p.Phe508del) [16]. Valle et al. (2007) analyzed 62 Ecuadorian CF patients; the most prevalent mutation was F508del (37.1%) [12]. The last report, by Ortiz et al. (2017), included 48 Ecuadorian individuals with CF and found F508del to have the highest frequency (20.27%) [17]. These studies, however, focused mainly on the F508del mutation and showed that its frequency is not as high as in Europeans. The incidence and the frequency of CF mutations depend on the population under study; Ecuadorians are a mestizo population, and the population's composition is not yet clear.
Here, we provide ancestry data from 46 Ancestry Informative Markers for the individuals with the new mutations reported in a previous study of CF patients from Ecuador [18]. We aimed to elucidate whether the reported mutations are mainly of European ancestry, given that previous data show the highest CF incidence in populations of European ancestry.

Main Text
2.1. Methods
2.1.1. Samples and DNA Extraction. Twelve CF patients from Guayaquil (coast) and Cuenca (highland) who were available and had new CFTR disease-causing variants reported in a previous study were selected [18]: one patient from Guayaquil with c.1473T>A:p.Cys491*, one patient from Guayaquil and two from Cuenca with c.2672del:p.Asp891Alafs*15, one patient from Cuenca with c.1486T>C:p.Trp496Arg, and six patients from Guayaquil and one from Cuenca with [c.757G>A:p.Gly253Arg; c.1352G>T:p.Gly451Val]. DNA was extracted using Chelex 100 (Bio-Rad) (10%) from peripheral blood samples collected on FTA cards (GE Healthcare Life Sciences) and quantified using NanoDrop (Thermo Scientific). To protect the identity of the individuals, the samples were anonymized.
The 46 Ancestry Informative Markers were amplified [19] in one multiplex reaction, following the standardized protocol of the laboratory. Fragment separation was carried out on a 3500 Genetic Analyzer (Applied Biosystems). Data were collected with Data Collection v3 and visualized with GeneMapper v5.

Statistical Analyses. Data were analyzed with Structure v2.3.4 to estimate the ancestral proportions in the population; the runs consisted of a burn-in length of 10,000 followed by 10,000 Markov Chain Monte Carlo (MCMC) iterations. The admixture model was used with the option "use population information to test for migrants". The numbers of clusters considered for the analysis were one to three (k = 1, k = 2, and k = 3), based on the historical background of the Ecuadorian population and on the cluster identification by Zambrano et al. and Evanno et al. [20] [14].
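The cluster identification attributed to Evanno et al. is commonly summarized by the ΔK statistic, which compares the log-likelihoods of replicate runs across successive values of k. The sketch below uses one widely used formulation, the absolute second-order difference of the mean log-likelihoods divided by the standard deviation at k; whether the authors computed it this way is an assumption. The function name `evanno_delta_k` and the input values are hypothetical, not data from this study.

```python
from statistics import mean, stdev

def evanno_delta_k(lnp_by_k):
    """Return {k: delta_k} for every k that has both neighbours k-1 and k+1.

    lnp_by_k maps each k to the list of Ln P(D) values from replicate runs.
    delta_k = |mean L(k+1) - 2 * mean L(k) + mean L(k-1)| / sd(L(k)).
    """
    means = {k: mean(values) for k, values in lnp_by_k.items()}
    delta = {}
    for k in sorted(lnp_by_k):
        if k - 1 in means and k + 1 in means:
            spread = stdev(lnp_by_k[k])
            if spread == 0:
                raise ValueError(f"replicates for k={k} are identical; delta K undefined")
            second_diff = abs(means[k + 1] - 2 * means[k] + means[k - 1])
            delta[k] = second_diff / spread
    return delta

# Hypothetical log-likelihoods for k = 1..3 (two replicate runs each).
example = {1: [-100.0, -102.0], 2: [-80.0, -82.0], 3: [-78.0, -80.0]}
print(evanno_delta_k(example))
```

With only k = 1 to k = 3 evaluated, ΔK can be computed for k = 2 alone, because the statistic needs a neighbour on each side.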
Principal component analysis (PCA) was performed with RStudio v1.1.453 to visualize the structure of the CF individuals, that is, the relationship between the individuals under analysis and the reference populations from HGDP-CEPH (Native Americans, Europeans, and Africans), subset H952 [19].
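As an illustration of what such a PCA computes, the following sketch projects a genotype matrix onto its leading principal components and reports the fraction of variance each one explains. It is written in Python with NumPy rather than the RStudio environment used in the study, and the 0/1/2 allele-count coding, the function name `pca_scores`, and the toy matrix are all assumptions made for the example.

```python
import numpy as np

def pca_scores(genotypes, n_components=2):
    """PCA by singular value decomposition of the column-centred matrix.

    genotypes: individuals x markers array (assumed coded as allele counts).
    Returns (scores, explained_ratio) for the first n_components.
    """
    x = np.asarray(genotypes, dtype=float)
    centered = x - x.mean(axis=0)                 # centre each marker
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    variances = s ** 2 / (x.shape[0] - 1)         # variance along each component
    explained = variances / variances.sum()       # fraction of total variance
    scores = u[:, :n_components] * s[:n_components]
    return scores, explained[:n_components]

# Toy matrix: 3 hypothetical individuals x 3 hypothetical markers.
toy = [[0, 0, 2], [1, 1, 1], [2, 2, 0]]
scores, explained = pca_scores(toy)
print(scores.shape, explained.sum())
```

Summing the first two entries of `explained` gives the share of total variance captured by the two main components, which is the kind of quantity reported alongside a two-dimensional PCA plot.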

Results
The DNA concentrations (5-20 ng/μl) were adequate to perform the PCR. After the amplification, complete profiles were obtained. A total of 339 individuals (reference populations and samples) were analyzed assuming three clusters and using the population information to test for migrants, with a burn-in of 10,000 followed by 10,000 MCMC iterations; the resulting bar plot shows the main ancestral populations analyzed (Figure 1).
Principal component analysis (PCA) showed that the three reference populations are clearly differentiated from one another. The Ecuadorian CF individuals lie among them, mainly between the European and Native American reference populations. The first two principal components represented 38.86% of the total variance (Figure 2).
The ancestral composition of each individual was obtained as percentages; these were heterogeneous, varying with the individual and the region under study, which reflects the historical admixture of the Ecuadorian population (Table 1). The global ancestry composition of the CF patients was Native American 50% (standard deviation of 14.03), European 35% (standard deviation of 15.5), and African 11.5% (standard deviation of 7.82). Native American was the predominant ancestry in almost all individuals, except for three patients from Guayaquil with the mutation [c.757G>A:p.Gly253Arg; c.1352G>T:p.Gly451Val], who had the highest European composition.
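The global composition and its standard deviations can be derived from the per-individual ancestry proportions as sketched below. The three rows are made-up values used only to show the calculation, not entries from Table 1, and the names `summarize` and `predominant` are hypothetical; the sample standard deviation is assumed, since the text does not state which estimator was used.

```python
from statistics import mean, stdev

ANCESTRIES = ("Native American", "European", "African")

def summarize(q_rows):
    """Mean and sample standard deviation of each ancestry proportion."""
    columns = list(zip(*q_rows))
    return {name: (mean(col), stdev(col)) for name, col in zip(ANCESTRIES, columns)}

def predominant(q_row):
    """Name of the ancestry with the largest proportion for one individual."""
    best = max(range(len(q_row)), key=lambda i: q_row[i])
    return ANCESTRIES[best]

# Hypothetical proportions for three individuals (each row sums to 1).
rows = [(0.60, 0.30, 0.10), (0.40, 0.50, 0.10), (0.50, 0.40, 0.10)]
for name, (avg, sd) in summarize(rows).items():
    print(f"{name}: mean {avg:.1%}, SD {sd:.1%}")
print([predominant(r) for r in rows])
```

Applying `predominant` to each row is how individuals whose largest component is European rather than Native American would be flagged.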

Discussion
The present study is the first report of the ancestral composition of Ecuadorian CF patients with new CFTR mutations. Many CF studies in different populations have revealed differences by gender, age, and symptoms among CF patients. Some studies compared CF patients of different ages and gender describing that the incidence  [23]. Moreover, other studies revealed the incidence among diverse ethnicities: as an example, the prevalence of CF reported by Rohlfs

BioMed Research International
Native American [2]. That study clearly revealed the ethnic differences in the incidence and the distribution of CF worldwide.
In Ecuador, there are studies of ancestral origin; for instance, the Ecuadorian mestizo population was reported to be composed of 59.6% Native American, 28.8% European, and 11.6% African ancestry [14] [15].
In addition to the variable predisposition to CF among populations, a total of 2,063 mutations are listed in the CFTR mutation database [24], while in the CFTR2 database, the most recent file, updated on 8 December 2017, shows a total of 374 variants [25]. Those variants were identified in different populations at diverse frequencies. For instance, the frequency of the most common variant, c.1521_1523delCTT (p.Phe508del), depends on ethnicity; it was reported as 72% in US Caucasians, ~41% in African Americans, and 18% in Iranians, yet it also differs among Caucasians [26] [2] [27] [1] [28].
Some mutations have been commonly reported in particular ethnic groups. For example, c.1624G>T (p.Gly542X) was reported at a frequency of 43% in patients of Turkish origin [29], while in a study of Peruvian patients, the frequency was 6.9% [30]. c.3846G>A (p.Trp1282Ter) was reported in 43% of Ashkenazi patients [31], and c.2988+1G>A (3120+1G>A) was reported in 12.3% of native African CF patients [32] [33]. c.3909C>G (p.Asn1303Lys) was described in 1.7% of the total number of CF patients analyzed from the European and United States populations [34], while in the Algerian population, the frequency was 20% [35]. c.1652G>A (p.Gly551Asp) presented a frequency of 3% in a north Brazilian population [36]. Furthermore, some mutations have been found in a specific ethnic group: c.16C>G (p.Leu6Val) was found in one Argentinian and c.3294G>C (p.Trp1098Cys) in one Mexican, among other variations described [4], and c.3276C>G (p.Tyr1092Ter) was found in Jews from Iraq [31] [2].
In conclusion, the identification of ethnicity-dependent mutations would be an important aspect of CF testing in Ecuador. The present study showed a predominantly Native American ancestral composition, followed by European and lastly African; the mixed origin of the population could possibly explain the new CF mutations reported.

Limitations.
Although we determined the ancestral proportions of the majority of CF patients with the new mutations previously reported, we could not access all the samples because of the patients' availability. Moreover, a larger study of CF patients with the commonly reported mutation should be conducted to better approximate the ancestral proportions of the patients.

Data Availability
The authors confirm that the data supporting the findings of this study are available within the article. The complete raw data that support the findings of this study are available from the corresponding author upon reasonable request.

Ethical Approval
The patients were registered with the Ecuadorian Cystic Fibrosis Foundation at Guayaquil and Cuenca. The investigation was approved with the number 2018-127E by the "Comité de Ética de Investigación en Seres Humanos Universidad San Francisco de Quito."

Consent
Patients provided informed consent, including signed parental authorization to participate in the study.

Conflicts of Interest
The authors declare that they have no competing interests.

Authors' Contributions
César Paz-y-Miño was responsible for coordination and followed up on the development of the article. Ana Karina Zambrano was responsible for the design, experimental procedure, data analysis, and writing. Juan Carlos Ruiz-Cabezas participated in sample collection and in editing the manuscript. Isaac Armendáriz-Castillo participated in editing the manuscript. Jennyfer García-Cárdenas participated in editing the manuscript. Santiago Guerrero participated in editing the manuscript. Andrés López-Cortés participated in editing the manuscript. Andy Pérez-Villa participated in the writing and formatting of the article. Patricia Guevara-Ramírez participated in editing the manuscript. Verónica Yumiceba participated in editing the manuscript. Paola E. Leone participated in editing the manuscript. César Paz-y-Miño and Ana Karina Zambrano contributed equally to this work.