Genetic Diversity of Echinococcus granulosus in Southwest China Determined by the Mitochondrial NADH Dehydrogenase Subunit 2 Gene

We evaluated the genetic diversity and structure of Echinococcus granulosus by analyzing the complete mitochondrial NADH dehydrogenase subunit 2 (ND2) gene in 51 isolates of E. granulosus sensu stricto metacestodes collected at three locations in Southwest China. We detected 19 haplotypes, which formed a distinct clade with the standard sheep strain (G1). Hence, all 51 isolates were identified as E. granulosus sensu stricto (G1–G3). Genetic relationships among haplotypes were not associated with geographical divisions, and fixation indices (Fst) among sampling localities were low. Hence, regional populations of E. granulosus in Southwest China are not differentiated, as gene flow among them remains high. This information is important for formulating unified region-wide prevention and control measures. We found large negative Fu's Fs and Tajima's D values and a unimodal mismatch distribution, indicating that the population has undergone a demographic expansion. We observed high genetic diversity among the E. granulosus s. s. isolates, indicating that the parasite population in this important bioregion is genetically robust and likely to survive and spread. The data from this study will be valuable for future studies on improving diagnosis and prevention methods and developing robust control strategies.


Introduction
Cystic echinococcosis (hydatid disease) is an important and globally distributed parasitic zoonosis caused by the larval stage of cestode parasites of the Echinococcus granulosus complex [1]. Intermediate hosts, which include humans, sheep, goats, cattle, yaks, camels, and wild mammals, become infected by ingesting the parasite's eggs shed by infected carnivores (the definitive hosts). Subsequently, a larval stage (metacestode) develops as a cyst in the internal organs (mainly the liver and lungs) of the intermediate host.
In China, cystic echinococcosis has been reported in more than twenty provinces and is particularly prevalent [14,15]. However, to date, infections have been ascribed to just two E. granulosus genotypes: G1 (a sheep strain) and G6 (a camel strain) [16]. Southwest China is one of the areas of China most seriously affected by E. granulosus infection. Past geologic events and climate fluctuations have led to a high biodiversity of species in this area [17,18]. In addition, E. shiquicus, a new species of Echinococcus, has recently been discovered in this region [19]. Recently, the first human CE case in Asia infected with the G5 genotype (cattle strain) was also reported [20]. For these reasons, it is critical to understand the genetic composition and structure of the E. granulosus complex in this region. In this study, we provide the first investigation of the molecular diagnostics of cystic echinococcosis infections in Southwest China. Mitochondrial DNA has been widely used in population genetics to elucidate phylogenies, because its high mutation rate and low recombination rate make it well suited to reflecting population genetic structure, population differentiation, and species relationships [21]. The NADH dehydrogenase subunit 2 gene (ND2 gene) evolves faster than other mitochondrial genes and is widely applied in molecular systematics and population genetics studies [22-25]. We used the ND2 gene as a genetic marker to investigate the genetic diversity and structure of Echinococcus granulosus within Southwest China. This information will be essential for further studies investigating the biology and transmission dynamics of these parasites, especially to humans, and will underpin research on the diagnosis, control, and prevention of this disease [2, 26-29].
The Scientific World Journal

PCR Amplification, Purification, and Sequencing.
Total DNA was extracted using standard phenol-chloroform techniques [30] and then stored at −20 °C. The complete ND2 gene was amplified using primers (P1: 5′-ATTGGACATTGTGTCTAGG-3′ and P2: 5′-GTTACTCCCATCAATGAGA-3′) that were designed based on the G1 genotype of E. granulosus (AF297617). The PCR mixture was prepared in a final volume of 25 μL containing 1 μL of template DNA, 1 μL of each primer, 12.5 μL of 2× Taq PCR Master Mix, and 9.5 μL of the reaction buffer supplied by the CoWin Company (Beijing). Thermal cycling was performed with initial denaturation for 4 min at 94 °C followed by 35 cycles of 50 s at 94 °C, 45 s at 48 °C, 50 s at 72 °C, and a final extension of 10 min at 72 °C. PCR products were sequenced three times by the Invitrogen Trading Company (Shanghai).
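As a quick consistency check on the protocol above, the following minimal Python sketch sums the listed reaction components and the programmed hold times of the thermal cycling profile. The dictionary keys and function names are illustrative assumptions, and the time estimate ignores temperature ramping, which depends on the instrument.

```python
# Reaction components in microlitres, as listed in the protocol text.
# The key names are illustrative labels, not vendor product names.
REACTION_UL = {
    "template DNA": 1.0,
    "primer P1": 1.0,
    "primer P2": 1.0,
    "2x Taq PCR Master Mix": 12.5,
    "reaction buffer": 9.5,
}

# Hold times in seconds for one amplification cycle (94, 48 and 72 degrees C).
CYCLE_STEPS_S = {"denaturation": 50, "annealing": 45, "extension": 50}
N_CYCLES = 35
INITIAL_DENATURATION_S = 4 * 60
FINAL_EXTENSION_S = 10 * 60


def total_volume_ul(components):
    """Return the summed volume of all reaction components."""
    return sum(components.values())


def program_hold_seconds():
    """Return the total programmed hold time, excluding ramping."""
    per_cycle = sum(CYCLE_STEPS_S.values())
    return INITIAL_DENATURATION_S + N_CYCLES * per_cycle + FINAL_EXTENSION_S


if __name__ == "__main__":
    print("Total reaction volume (uL):", total_volume_ul(REACTION_UL))
    print("Programmed hold time (min):", program_hold_seconds() / 60)
```

The component volumes add up to the stated 25 μL final volume, which confirms that the recipe as described is internally consistent.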
Data Analysis.
The DNAMAN program was used to align and compare the reference sequences with the nucleotide sequences identified in our study (Table 2). The percentage divergence of nucleotide sequences was determined using MEGA 5.0 [31], applying Kimura's two-parameter model with a gamma shape parameter (alpha = 0.05) [32]. The maximum-likelihood tree was also constructed using MEGA 5.0 [31], applying Kimura's two-parameter model with 1000 bootstrap replications. Bayesian phylogenetic analyses were performed and tested using MrBayes version 3.1.2 [33]. We used TCS version 1.21 to construct networks based on the criterion of statistical parsimony [34]. The Arlequin package (version 3.5.1.2) was used to calculate genetic diversity indices (number of haplotypes, haplotype diversity, and nucleotide diversity) [35]. Pairwise fixation indices (Fst), which measure the degree of genetic differentiation between two populations and thus indirectly reflect gene flow, were also calculated with the Arlequin package. Mismatch distributions were used to test for demographic signatures of population expansions within mtDNA lineages [36]. The neutrality indices Tajima's D and Fu's Fs were calculated using Arlequin [37,38]. (Table 3).
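To make the distance measure concrete, the sketch below computes the plain Kimura two-parameter distance between two aligned sequences. It is only an illustration under stated assumptions: it omits the gamma correction used in the analysis, it skips any site where either sequence has a non-ACGT character, and the function name `k2p_distance` is hypothetical rather than part of MEGA.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1, seq2):
    """Kimura two-parameter distance between two aligned DNA sequences.

    Sites with gaps or ambiguous bases are ignored. No gamma correction
    is applied, so values will differ from a gamma-corrected analysis.
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue
        compared += 1
        if a == b:
            continue
        pair = {a, b}
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1    # A<->G or C<->T
        else:
            transversions += 1  # purine <-> pyrimidine
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared    # proportion of transitional differences
    q = transversions / compared  # proportion of transversional differences
    if 1 - 2 * p - q <= 0 or 1 - 2 * q <= 0:
        raise ValueError("sequences too divergent for the K2P formula")
    return -0.5 * math.log(1 - 2 * p - q) - 0.25 * math.log(1 - 2 * q)


if __name__ == "__main__":
    # Toy sequences (not from the study): one transition in ten sites.
    print(k2p_distance("AAAAAAAAAA", "GAAAAAAAAA"))
```

Separating transitions from transversions is what distinguishes this model from a simple proportion of differing sites, since the two classes of substitution are allowed different rates.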

Variations in
No insertion or deletion (indel) sites were detected.

Phylogenetic Analyses and Genotyping.
We detected 19 mtDNA haplotypes (labeled H1 to H19; GenBank ID: KC897670-KC897688) within the 51 isolates (the localities of the haplotypes are shown in Figure 1). The maximum-likelihood phylogram clearly showed that the ND2 haplotypes formed one clade and were not divided into regional clades according to the allopatric distributions of the isolates (Figure 2(a)). The haplotypes grouped with the standard sheep strain (G1) in a large clade, which was distinct from the other strains. Accordingly, all 51 isolates were classified as E. granulosus sensu stricto (genotypes G1–G3). The Bayesian tree showed a topology similar to that of the maximum-likelihood phylogram (Figure 2(b)).

Genetic Polymorphism Analysis and Population Expansion.
The overall haplotype diversity of E. granulosus in Southwest China was high (Hd = 0.898 > 0.5), although nucleotide diversity was low (Pi = 0.005 < 5%). The genetic distance between haplotypes (Kimura two-parameter model) ranged from 0.001 to 0.028 (average genetic distance = 0.007; Table 4). AMOVA showed that most of the variation in E. granulosus s. s. existed within populations: 90.73% of the variation occurred within populations and 9.27% among the Southwest China subpopulations (Table 5). Assuming that the ancestral haplotype is still present in the population, statistical parsimony networks were constructed to discern the genealogical relationships among the haplotypes. Haplotypes from different regions were mixed together, and there was no relationship between haplotype affinities and their locations (Figure 3). The resulting network showed a starlike pattern, with one common ancestral haplotype (H2) occupying the center of the network (Figure 3). There were one to ten mutational steps between the ancestor and the other haplotypes.
Fst values among the three sampling regions of Southwest China were low and ranged from 0.037 to 0.143 (Table 6), indicating that the geographical populations were not genetically differentiated from one another. In addition, based on the significantly large negative Fu's Fs and Tajima's D values (Table 4), we can infer that the population of E. granulosus in Southwest China has undergone a demographic expansion. The unimodal mismatch distribution supports this hypothesis of a sudden-expansion model (Figure 4).
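For readers unfamiliar with the two diversity indices reported above, the following sketch shows how haplotype diversity and nucleotide diversity can be computed from a set of aligned sequences using Nei's standard estimators. The function names and toy sequences are illustrative assumptions; Arlequin's own implementation may differ in details such as the treatment of gaps and missing data.

```python
from collections import Counter
from itertools import combinations


def haplotype_diversity(seqs):
    """Nei's haplotype diversity: n/(n-1) * (1 - sum of squared frequencies)."""
    n = len(seqs)
    if n < 2:
        raise ValueError("need at least two sequences")
    counts = Counter(seqs)  # identical sequences count as one haplotype
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1 - sum_sq)


def nucleotide_diversity(seqs):
    """Mean number of pairwise differences per site over all sequence pairs."""
    n = len(seqs)
    if n < 2:
        raise ValueError("need at least two sequences")
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to the same length")
    total_diffs = sum(
        sum(a != b for a, b in zip(s1, s2))
        for s1, s2 in combinations(seqs, 2)
    )
    n_pairs = n * (n - 1) / 2
    return total_diffs / (n_pairs * length)


if __name__ == "__main__":
    # Toy alignment (not from the study): three haplotypes in four sequences.
    toy = ["AAAA", "AAAA", "AAAT", "AATT"]
    print("Hd:", haplotype_diversity(toy))
    print("Pi:", nucleotide_diversity(toy))
```

A population with many closely related haplotypes gives a high first index and a low second one, which is the combination reported here.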

Discussion
Southwest China is becoming a model region for biodiversity research and is a major area for echinococcosis within China. This study used the ND2 gene to investigate the molecular systematics of E. granulosus in this region for the first time and revealed considerable genetic diversity within the species complex in this region but no evidence of complete population differentiation.
We observed a distinct anti-G bias in the nucleotide sequences of the ND2 gene of E. granulosus. A similar result was found for other Echinococcus genes by Nakao et al. [40]. Variable sites occurred mainly in the third codon position, while the second codon position exhibited the least variation, supporting observations that the third codon position of mitochondrial protein genes evolves fastest, while the second codon position evolves slowest [41]. We observed 39 transitions and 16 transversions (a transition/transversion ratio of 2.4 > 2.0), indicating that mutations in the mitochondrial ND2 gene of E. granulosus are not saturated and that the gene is suitable for the analysis of genetic variation [42].
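The codon-position pattern described above can be tallied with a short script. The sketch below is a minimal illustration that assumes an in-frame, gap-free alignment starting at the first codon position; the function name and toy sequences are hypothetical. It also reproduces the transition/transversion ratio from the two counts given in the text.

```python
def variable_sites_by_codon_position(seqs):
    """Count variable alignment columns at codon positions 1, 2 and 3.

    Assumes the alignment is in frame, starts at codon position 1,
    and contains no gaps.
    """
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to the same length")
    counts = {1: 0, 2: 0, 3: 0}
    for i in range(length):
        column = {s[i] for s in seqs}
        if len(column) > 1:          # more than one base seen at this site
            counts[i % 3 + 1] += 1   # map 0-based index to codon position
    return counts


# Transition and transversion counts reported in the text.
TRANSITIONS = 39
TRANSVERSIONS = 16
ts_tv_ratio = TRANSITIONS / TRANSVERSIONS  # about 2.4

if __name__ == "__main__":
    # Toy alignment (not from the study) of two codons per sequence.
    toy = ["ATGGCA", "ATGGCT", "ATAGCT"]
    print(variable_sites_by_codon_position(toy))
    print(round(ts_tv_ratio, 2))
```

In the toy alignment both variable columns fall at third codon positions, the same qualitative pattern the text describes for ND2.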
Of the 19 haplotypes we defined (from the 51 E. granulosus isolates), one (H2) was shared between the Qinghai and Sichuan populations and two (H5 and H7) were shared between the Qinghai and Tibet populations, while no haplotypes were shared between the Sichuan and Tibet populations. The haplotype phylogenetic tree showed that E. multilocularis was genetically distinct from E. granulosus sensu lato (G1, G4, G5, G6, G7, and G8) and that all 19 haplotypes (and thus all isolates) grouped with the standard sheep strain (G1) and were thus classified as E. granulosus s. s. (genotypes G1–G3). This finding is consistent with previous studies showing that E. granulosus s. s. (G1–G3) comprises the major genotypes of E. granulosus throughout China [16, 43-45]. However, in this study, the G1–G3 isolates displayed distinct nucleotide differences from the reference sequences of G1–G3. Further studies are needed to ascertain the reasons for these differences.
Both the phylogenetic tree and the parsimony network showed that genetic relationships among haplotypes were not associated with geographical divisions, as haplotypes from all regions grouped together. This finding indicates that the regional populations of Southwest China are not fully differentiated from each other. The low pairwise Fst values we found are consistent with this observation of low genetic differentiation among the three sampling locations. Interestingly, an analogous genetic structure was found for E. granulosus on the Tibetan Plateau by Yan et al. [45] using 28 ND1+ATP6 gene haplotypes and in eastern Tibet and Xinjiang by Nakao et al. [46] using 43 CO1 gene haplotypes. These results suggest that the population structure of E. granulosus s. s. may be highly uniform throughout China. However, a larger number of samples from more regions would need to be collected and analyzed to confirm this idea.
Overall, the haplotypes showed low nucleotide diversity (Pi < 5%) and high haplotype diversity (Hd > 0.5), consistent with the hypothesis of sudden demographic expansion [47]. This hypothesis was supported by the large negative Fu's Fs and Tajima's D values and the single peak observed in the mismatch distribution [48]. This proposed demographic expansion may have been caused by the migration of large numbers of host species (sheep, yaks, and dogs) to new areas in response to environmental change or by the artificial introduction of new hosts.
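The link between negative Tajima's D and expansion can be seen from how the statistic is built: it contrasts the mean number of pairwise differences with the number of segregating sites, and an excess of rare variants (as expected after expansion) pushes it below zero. The following sketch implements Tajima's standard formula for a gap-free alignment; the function name and toy data are illustrative assumptions, and no significance test is included.

```python
import math
from itertools import combinations


def tajimas_d(seqs):
    """Tajima's D for a gap-free alignment of at least four sequences."""
    n = len(seqs)
    if n < 4:
        raise ValueError("need at least four sequences")
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to the same length")

    # S: number of segregating (variable) sites.
    seg_sites = sum(1 for i in range(length) if len({s[i] for s in seqs}) > 1)
    if seg_sites == 0:
        raise ValueError("no segregating sites; D is undefined")

    # k: mean number of pairwise nucleotide differences.
    n_pairs = n * (n - 1) / 2
    k = sum(
        sum(a != b for a, b in zip(s1, s2))
        for s1, s2 in combinations(seqs, 2)
    ) / n_pairs

    # Constants from Tajima's variance formula.
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)

    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    return (k - seg_sites / a1) / math.sqrt(variance)


if __name__ == "__main__":
    # Toy data (not from the study): every variant is a singleton,
    # mimicking the excess of rare alleles seen after an expansion.
    print(tajimas_d(["AAAA", "AAAT", "AATA", "ATAA"]))
```

In the toy example every variable site is a singleton, so the statistic is negative, whereas a sample split evenly between two divergent haplotypes would give a positive value.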
The ND2 gene similarity of the 51 E. granulosus isolates was found to be high (96.71% to 100%), though no sequence was 100% identical to the G1 genotype. The genetic diversity observed among the isolates in this study was significantly higher than that reported in a previous study that used the CO1 gene to investigate genetic polymorphisms in E. granulosus in China [46]. However, the number of haplotypes identified in the previous study was much higher than that identified in the current study (43 versus 19). These differences are likely a consequence of the faster evolutionary rate of the ND2 gene compared with the CO1 gene of E. granulosus. Nevertheless, the high degree of genetic diversity in E. granulosus in this region revealed by all the studies indicates that the populations are genetically robust and likely to survive and spread.
In summary, our study provides baseline information on the genetic diversity of E. granulosus isolates from different regions of Southwest China, which helps to clarify the genetic structure of E. granulosus populations and the transmission dynamics of echinococcosis in these regions. With this information, efforts should now be directed toward strengthening disease surveillance in these regions and improving diagnostic and prevention methods, in addition to developing a robust control strategy.