Molecular Characterization of Sudanese and Southern Sudanese Chicken Breeds Using mtDNA D-Loop

The objective of this study was to assess the genetic relationships and diversity, and to estimate the amount of gene flow, among five chicken populations from Sudan and South Sudan and a commercial egg-line strain of White Leghorn chickens. The chicken populations were genotyped using the mtDNA D-loop as a molecular marker. The PCR product of the mtDNA D-loop segment was 600 bp, and 14 haplotypes were identified. The neighbor-joining phylogenetic tree indicated that the indigenous Sudanese chickens can be grouped into only two clades, IV and IIIa. Median-joining network analysis showed that haplotype LBB49 has the highest frequency. The hierarchical analysis of molecular variance (AMOVA) showed that genetic variation within populations was 88.6% and differentiation among populations was 11.4%. When the populations were redefined into two geographical zones, rich and poor Savanna, the variation was partitioned into three fractions: between individuals within populations 95.5%, between populations within groups 0.75%, and between groups 3.75%. The pairwise $F_{ST}$ showed high genetic differentiation between the Betwil population and the rest, with $F_{ST}$ ranging from 0.1492 to 0.2447. We found a large amount of gene exchange within the Sudanese indigenous chicken (Nm = 4.622).


Introduction
Chicken genetic resources comprise a wide range of breeds and populations, including red jungle fowl, native and fancy breeds, middle-level food producers, industrial stocks, and specialized lines. Sudan possesses potential genetic resources of local poultry, although most of them have not been genetically or phenotypically characterized. Sudanese researchers [1,2], international researchers, and the FAO have investigated the genetic and phenotypic potential of the local chickens, but further identification and characterization are still needed as a prerequisite for their conservation and utilization. The assessment of genetic distance by means of molecular marker techniques may provide useful information for the initial evaluation of chicken genetic resources. Microsatellites have been successfully used in chicken genetic diversity studies.
Genetic diversity measures based on highly polymorphic variable-number tandem repeat loci have yielded reliable and accurate information for the study of genetic relationships between chicken populations. Sequencing a specific fragment of mtDNA (e.g., the D-loop) gives more accurate information on evolution and genetic diversity [3]. The D-loop region does not encode protein and evolves much faster than other regions of the mtDNA genome. For the past 20 years, mtDNA, and particularly D-loop sequences, have been used in phylogenetic analysis [4]. Evidence from mtDNA D-loop variation in European, African, and Indian cattle breeds indicates independent domestications of Bos taurus and Bos indicus cattle in two separate locations [5]. D-loop sequences have also been used to unravel the domestication and diversity of dogs [6], horses [7], and goats [8] and of the Australian red kangaroo (Macropus rufus) [9]. A study of variation among 398 African indigenous chickens from 12 countries, using the HVI domain of the mtDNA D-loop region, found that polymorphic sites accounted for 12.59% of the 397 bp sequenced fragment [10], while variation rates among 25 individuals from six native Chinese chicken populations were recorded as 7.05% and 5.54%, respectively [11,12]. The mitochondrion plays a role in metabolism, apoptosis, disease, and aging, and it is the site of oxidative phosphorylation, which is essential for the production of ATP, and of a variety of other biochemical functions. As a marker, mtDNA is highly polymorphic compared to nuclear DNA, its evolutionary rate being 5-10 times faster than that of the nuclear genome [13], probably due to the lack of a replication repair mechanism [14]. An understanding of phylogeography will elucidate the demographic history, origin, and expansion of livestock species. Network analysis has supplemented phylogenetic trees to overcome the problem of parallel mutations and lineage exchange between divergent populations [15].
Therefore, the aim of the present study is to evaluate the genetic variability within and between Sudanese native chickens using mtDNA D-loop region.

Experimental Animals.
The study was conducted in four states of Sudan, chosen because they have only indigenous chickens and no exotic breeds have been introduced. A total of 81 blood samples were collected from five chicken populations. Betwil (BTW, n = 21) was collected from El Dilling Locality in South Kordofan State, and Large Beladi of Bahri (LBB, n = 12) was collected from Khartoum North Locality. Large Beladi of Abu-Neama (LB, n = 22) and Bare Neck (BRN, n = 12) were both collected from Abu-Neama Locality in Sinnar State. The fifth population (SUD, n = 14) was from Malakal Locality; this sample was retrieved from the International Livestock Research Institute (ILRI) database. In addition, a commercial egg-line strain of the White Leghorn type (COML, n = 9) was genotyped concurrently with the Sudanese chickens.
Genetic Analysis.
DNA was extracted using the chloroform method [16]. PCR was performed in a 30 µL reaction containing 2.5 mM of each dNTP, 14 pmol of each primer, 1.5 mM MgCl2, 1× PCR buffer, 1.25 U Taq DNA polymerase (Roche Applied Sciences, Germany, or Promega, Madison, USA), and 1 µL genomic DNA. The mtDNA D-loop (HV1) region was amplified using specific primers based on the partial chicken mitochondrial genome (GenBank accession number AB098668) and the complete chicken mitochondrial genome (GenBank accession number NC 001323) (Table 1). PCR amplification was carried out on a GeneAmp PCR 9700 (Applied Biosystems) thermocycler. PCR conditions were as follows: initial denaturation at 94 °C for 2 min, followed by 10 cycles at 94 °C for 15 s, 58 °C for 30 s, and 72 °C for 40 s. The amplified fragments were electrophoresed on a 1.5% agarose gel at 100 volts for one hour; the gel was stained with 0.4 µL/mL ethidium bromide and visualized under UV light. PCR products were purified using the QIAquick PCR purification kit (QIAGEN GmbH, Germany). Direct sequencing of the HV1 segment of the D-loop region was performed using two [...] [17] was used for analysis after editing the sequences of the amplified D-loop fragments. Multiple alignments of the sequences were performed using the ClustalX 1.83 [18] and Muscle 3.52 [19] programs.

Phylogenetic and Molecular Evolution Analysis.
The evolutionary relationships of indigenous Sudanese chicken haplotypes were established using Molecular Evolutionary Genetics Analysis (MEGA) version 3.0 [20]. Genetic distances between the haplotypes were calculated using Kimura's two-parameter model and used to construct a neighbor-joining phylogenetic tree. Two D-loop sequences of Gallus were included, one from the wild jungle fowl Gallus gallus gallus and the other from Gallus gallus bankiva (GenBank accession numbers AB007720 and AB007718, respectively), together with seven reference sequences (Clades I, II, IIIa, IIIb, IIIc, IIId, and IV) that correspond to the clades determined previously in Asian samples [10] (Table 2).
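The Kimura two-parameter distance mentioned above can be sketched in a few lines of Python. This is only an illustration of the standard formula, d = -(1/2) ln[(1 - 2P - Q) sqrt(1 - 2Q)], where P and Q are the proportions of transitional and transversional differences; the function name and the toy sequences are hypothetical and are not taken from the study, which used MEGA for this step.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1: str, seq2: str) -> float:
    """Kimura two-parameter distance between two aligned DNA sequences.

    Hypothetical helper for illustration; sites with gaps or ambiguous
    bases are skipped (an assumption, not the study's stated procedure).
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguous bases
        compared += 1
        if a == b:
            continue
        # Purine<->purine or pyrimidine<->pyrimidine is a transition.
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared    # proportion of transitional differences
    q = transversions / compared  # proportion of transversional differences
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


# Toy example: one A/G transition across ten aligned sites.
d = k2p_distance("AACTATGAAT", "GACTATGAAT")
```

A matrix of such pairwise distances is the kind of input a neighbor-joining algorithm takes to build a tree.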

Haplotypes Diversity.
Haplotype diversity was illustrated using network analysis implemented in NETWORK 4.1.0.8 [15]. Diversity indices of the mtDNA D-loop sequences were determined to elucidate the sequence polymorphism and the extent of genetic variability in the chicken populations. The population indices include the number of segregating sites ($S$), the number of haplotypes, haplotype diversity (Hd), and nucleotide diversity ($\pi$) as explained by Nei [21]. The analysis was conducted using DnaSP software version 4.0 [22]. Alignment gaps arising from a deletion event were excluded from the calculations. The average number of nucleotide differences per site between two sequences, known as nucleotide diversity ($\pi$), is defined as $\pi = \frac{n}{n-1}\sum x_i x_j \pi_{ij}$ or $\pi = \sum \pi_{ij} / n_c$, where $n$ is the number of DNA sequences examined, $x_i$ and $x_j$ are the frequencies of the $i$th and $j$th type of DNA sequences, respectively, in the sample, $\pi_{ij}$ is the proportion of different nucleotides between the respective types of DNA sequences, and $n_c$ is the total number of sequence comparisons [21]. Average heterozygosity or haplotype diversity, $h$, is defined according to the formula of Nei [21], $h = 2n(1 - \sum x_i^2)/(2n - 1)$, where $x_i$ is the frequency of haplotype $i$ and $n$ is the sample size. The degree of genetic differentiation among the populations was estimated using gene or haplotype frequencies. Population genetic structure was investigated by the $F_{ST}$ significance test [23] and the companion ST statistic of [24] using Arlequin software version 2.000 [25].
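The two diversity definitions above can be illustrated with a minimal Python sketch. The function names and toy data are hypothetical; the study itself used DnaSP. The haplotype diversity function follows the formula exactly as stated in the text (with 2n), and the nucleotide diversity function uses the second form, the mean of the pairwise proportions of differing sites. Gaps are skipped pair by pair here, which is a simplification of the gap exclusion described above.

```python
from collections import Counter
from itertools import combinations


def haplotype_diversity(haplotypes):
    """h = 2n(1 - sum(x_i^2)) / (2n - 1), as given in the text."""
    n = len(haplotypes)
    if n == 0:
        raise ValueError("no haplotypes supplied")
    freqs = [count / n for count in Counter(haplotypes).values()]
    return 2 * n * (1 - sum(x * x for x in freqs)) / (2 * n - 1)


def nucleotide_diversity(sequences):
    """pi = sum(pi_ij) / n_c over all n_c pairwise sequence comparisons."""
    pairs = list(combinations(sequences, 2))
    if not pairs:
        raise ValueError("need at least two sequences")
    total = 0.0
    for s1, s2 in pairs:
        # Assumption: sites with a gap in either sequence are ignored.
        sites = [(a, b) for a, b in zip(s1, s2) if a != "-" and b != "-"]
        if not sites:
            raise ValueError("a sequence pair has no comparable sites")
        total += sum(a != b for a, b in sites) / len(sites)
    return total / len(pairs)


# Toy data only: four individuals carrying two haplotypes,
# and three short aligned sequences.
hd = haplotype_diversity(["H1", "H1", "H2", "H2"])
pi = nucleotide_diversity(["AAAA", "AAAT", "AAAA"])
```

Some software applies an n/(n - 1) correction instead of 2n/(2n - 1) for haploid markers such as mtDNA; that variant is not implemented in this sketch.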

Analysis of Molecular Variance (AMOVA).
Maternal genetic differentiation was further quantified using hierarchical analysis of molecular variance (AMOVA) [24], performed using Arlequin version 2.000 [25]. The Sudanese chicken population was first considered as one single population and was later subdivided into two geographical areas, rich and poor Savanna.
Multiple sequence alignment was performed for the fourteen haplotypes identified in the Sudanese and South Sudanese indigenous chickens. The D-loop sequences were aligned to a reference sequence (GenBank accession number AB098668) using ClustalX 1.83. Two units of an invariant tetradecamer, 5′-AACTATGAATGGTT-3′, were detected at positions 267 to 280 and 328 to 341. In the first unit, two transitions were observed, G/A for SUD71 and T/C for LB41, at positions 268 and 272, respectively, while in the second unit a T/C transition was observed for SUD40 and SUD13 at position 330. In addition, the following domains and motifs were observed at the 5′ end of the D-loop. An interrupted thymine string (AATTTTATTTTTT) was conserved in all the individuals studied. There was also an interrupted poly-C sequence (5′-CCCCCCCTTTCCCCCCC-3′), which is widely conserved, and downstream of it a conserved sequence known as poly-G (5′-AGGGGGGGT-3′). Two conserved motifs, 5′-TACAT-3′ and 5′-TATAT-3′, were also found in all individuals: six TATAT motifs and two TACAT motifs occur within the 397 bases of the D-loop. The first 166 base pairs adjacent to tRNA-Glu were highly conserved in all individuals except for one substitution, a T/G transversion in SUD40 at position 33 (data not shown). The nucleotide substitutions found in the 14 haplotypes comprised one G/T and two C/A transversions; the rest were all transitions, of which six were A/G substitutions and ten were C/T substitutions. This demonstrates a strong bias towards transitions. C/T substitutions are more common than A/G substitutions (Table 3).
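The transition bias described above follows directly from the reported substitution counts. The short Python sketch below classifies each substitution type and tallies the counts stated in the text; the helper name is hypothetical and the code is illustrative rather than part of the original analysis.

```python
PURINES = frozenset("AG")
PYRIMIDINES = frozenset("CT")


def classify_substitution(base1: str, base2: str) -> str:
    """Return 'transition' or 'transversion' for a pair of differing bases."""
    pair = {base1.upper(), base2.upper()}
    if len(pair) != 2 or not pair <= PURINES | PYRIMIDINES:
        raise ValueError("expected two different bases from A, C, G, T")
    # Both purines or both pyrimidines: transition; otherwise transversion.
    if pair <= PURINES or pair <= PYRIMIDINES:
        return "transition"
    return "transversion"


# Substitution counts as reported in the text above.
reported = {("G", "T"): 1, ("C", "A"): 2, ("A", "G"): 6, ("C", "T"): 10}

totals = {"transition": 0, "transversion": 0}
for (b1, b2), count in reported.items():
    totals[classify_substitution(b1, b2)] += count

ratio = totals["transition"] / totals["transversion"]
print(totals, round(ratio, 2))
```

With the reported counts this gives 16 transitions against 3 transversions, a ratio of about 5.3.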

Phylogenetic Analysis of the Haplotypes.
A neighbor-joining dendrogram showed the genetic relationships among the twelve haplotypes identified in Sudanese indigenous chicken from Sudan. The commercial egg-line strain was included, two haplotypes of the genus Gallus (Gallus gallus gallus and Gallus gallus bankiva, GenBank accession numbers AB007720 and AB007718, resp.) were retrieved from GenBank and used as outgroups, and seven clade reference haplotypes (Clades I, II, IIIa, IIIb, IIIc, IIId, and IV) were also included. The dendrogram revealed that the 11 haplotypes identified in the Sudanese indigenous chicken were placed into two clusters with the domestic chicken Gallus gallus gallus. This indicates a very close relationship between the Sudanese indigenous chickens and Gallus gallus gallus, while they are relatively distant genetically from Gallus gallus bankiva. Alignment with the reference lineage haplotypes from Asia showed that all Sudanese indigenous chickens were grouped into clade IV (Figure 2(a)), while the commercial egg-line strain studied concurrently with the Sudanese indigenous chickens fell into clade IIIc. When 3 haplotypes from Upper Nile State of Southern Sudan were combined with the 11 haplotypes from Sudan and aligned again with Gallus gallus gallus and Gallus gallus bankiva, the resulting dendrogram placed the Sudanese chicken haplotypes into three clusters with the domestic haplotype from GenBank. This once more suggests that the Sudanese indigenous chicken is more closely related to Gallus gallus gallus, while it is relatively distant genetically from Gallus gallus bankiva. Alignment with the reference haplotypes from Asia produced a neighbor-joining tree that grouped the Sudanese indigenous chickens into two clades, clade IV and clade IIIa (Figure 2(b)).

Network Analysis.
Median-joining networks were drawn for the 12 haplotypes identified from the Sudanese indigenous chickens from Northern Sudan and one haplotype of commercial layers, based on the variable characters of the complete alignment, using the computer program NETWORK 4.1.0.8 [15]. The results showed that haplotype LBB49 has the highest frequency and that the other haplotypes connect to it in a star-like pattern with LBB49 in the centre. LBB49 has mutational links to ten haplotypes, five of which are singletons, and can therefore be regarded as an interior or ancestral haplotype. No median vector (mv) separating the clade was observed; all eleven haplotypes identified in the Sudan region belong to clade IV, marked in yellow in Figure 3(a), while the commercial egg-line chicken haplotype belongs to a different clade, IIIc, marked in green, and is separated from BRN62 and LBB49 by seven and eight mutations, respectively.
Median-joining network analysis was also carried out with the haplotypes from the Southern States of Sudan together with those from the Northern Sudan States. The results illustrate that, of the 14 identified Sudanese haplotypes, only one haplotype (SUD71) from South Sudan (Malakal) was unique. It fell into a different clade (IIIa), marked in red, while two haplotypes (SUD13 and SUD40) share clade IV, marked in yellow, with the other haplotypes from the Sudan region (Figure 3).
Betwil, Beladi Bahri, and Bare Neck had an equal number of haplotypes (four each), which is regarded as the lowest haplotype number. Haplotype diversity (Hd) was highest in the Betwil population (Hd = 0.724), followed by Beladi Malakal, while it was lower in the Beladi Bahri and Bare Neck populations (Hd = 0.455). The average overall haplotype diversity was approximately 0.577 across the 81 chicken sequences. The average nucleotide diversity detected for the 81 D-loop sequences of the indigenous Sudanese chicken population was estimated to be 0.00282 substitutions per site. The highest nucleotide diversity was found in the Malakal population ($\pi$ = 0.00603), followed by the Betwil population ($\pi$ = 0.00259) and Beladi Neama ($\pi$ = 0.00179), while Beladi Bahri and Bare Neck recorded the lowest nucleotide diversity ($\pi$ = 0.00126) (Table 4).

Genetic Differentiation.
$F_{ST}$ and a second ST statistic were computed using DnaSP version 4.0 [22]; their averages were similar (0.098), while Nm was 4.622 for both approaches. This indicates that 9.8% of the maternal genetic differentiation estimated in Sudanese indigenous chicken resulted from variation among populations, while 90.2% was due to genetic divergence among individuals within populations. The highest genetic differentiation was observed between the Betwil population and the rest of the populations, with $F_{ST}$ values ranging from 0.1493 to 0.2450, while the remaining populations showed relatively little maternal genetic subdivision (Table 5).
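The link between the reported differentiation (0.098) and gene flow (Nm = 4.622) can be checked with a small Python sketch. It assumes the island-model relation for a haploid marker, Nm = (1/F_ST - 1)/2; the text does not state which relation was applied, so this is an assumption, and the function names are hypothetical.

```python
def nm_from_fst(fst: float, haploid: bool = True) -> float:
    """Island-model estimate of Nm from F_ST.

    Assumption: divisor 2 for a haploid marker such as mtDNA,
    divisor 4 for a diploid nuclear marker.
    """
    if not 0 < fst < 1:
        raise ValueError("F_ST must lie strictly between 0 and 1")
    divisor = 2 if haploid else 4
    return (1 / fst - 1) / divisor


def fst_from_nm(nm: float, haploid: bool = True) -> float:
    """Inverse of nm_from_fst under the same assumption."""
    divisor = 2 if haploid else 4
    return 1 / (1 + divisor * nm)


nm = nm_from_fst(0.098)    # about 4.60 from the rounded F_ST
fst = fst_from_nm(4.622)   # about 0.0976, which rounds to 0.098
print(round(nm, 3), round(fst, 4))
```

Under this assumption the rounded value 0.098 gives Nm of about 4.60, and the reported Nm of 4.622 corresponds to a differentiation of about 0.0976, so the small gap is consistent with rounding. The diploid form (divisor 4) would give roughly 2.3 and does not match the reported value.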

Analysis of Molecular Variance (AMOVA).
Maternal genetic differentiation within and among populations of the Sudanese chicken was quantified using hierarchical analysis of molecular variance (AMOVA) on Kimura two-parameter distances, considering the Sudanese populations as one single group. The genetic variation within populations was 88.6% and the genetic differentiation among populations was 11.4%. When the Sudanese chicken population was subdivided into two geographical groups, rich and poor Savanna, the variation was partitioned into three fractions. The variation between individuals within populations was 95.5%, between populations within groups was 0.75%, and between groups was 3.75% (Table 6).
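As a worked illustration of how such a three-level partition is usually summarized, the Python sketch below converts the percentages reported above into the standard AMOVA fixation indices. The definitions (Φ_CT, Φ_SC, Φ_ST as ratios of variance components) are the usual ones for hierarchical AMOVA; the function name is hypothetical, and the resulting indices are arithmetic consequences of the reported percentages rather than values stated in the study.

```python
def phi_statistics(among_groups: float,
                   among_pops_within_groups: float,
                   within_pops: float) -> dict:
    """Fixation indices from three AMOVA variance components.

    Inputs may be raw variance components or percentages of variation;
    only their ratios matter.
    """
    total = among_groups + among_pops_within_groups + within_pops
    if total <= 0:
        raise ValueError("variance components must sum to a positive value")
    return {
        # Differentiation among groups relative to the total.
        "phi_ct": among_groups / total,
        # Differentiation among populations within groups.
        "phi_sc": among_pops_within_groups
        / (among_pops_within_groups + within_pops),
        # Differentiation among populations relative to the total.
        "phi_st": (among_groups + among_pops_within_groups) / total,
    }


# Percentages reported for the rich/poor Savanna grouping.
two_groups = phi_statistics(3.75, 0.75, 95.5)
print({name: round(value, 4) for name, value in two_groups.items()})
```

The three indices satisfy (1 - Φ_ST) = (1 - Φ_SC)(1 - Φ_CT), which is a convenient consistency check on any reported partition.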

Discussion
The complete alignment revealed very high variability in the mtDNA D-loop region between bases 167 and 391; this variation constitutes 23.5% of the 81 sequences. This rate is extremely high compared to that of the native Chinese chicken breeds, 5.54-7.05% [11,12]. It is also higher than the 12.59% reported for 398 African domestic chickens from 12 countries [10]. This high rate of mtDNA D-loop variation may be attributed to migration and exploratory movement of humans into Sudan, which was the largest country in Africa and shared borders with nine countries. The base composition of the Sudanese domestic chicken D-loop HVI shows an A + T content of 50.46% and a G + C content of 49.54%; similar results were noted by Ruokonen and Kvist [26]. The two units of the invariant tetradecamer 5′-AACTATGAATGGTT-3′, observed in this study at positions 267 to 280 and 328 to 341, were conserved in all but four of the 14 haplotypes identified in the Sudanese indigenous chicken. Each of these four haplotypes differed by one base substitution: SUD71 with a G/A transition at position 268 and LB41 with a T/C transition at position 272 in the first tetradecamer unit, and SUD13 and SUD40 both with a T/C transition at position 330 in the second tetradecamer unit. This type of tetradecamer duplication was also observed by Fumihito et al. [3] and found to be a specific trait of Gallus gallus gallus. This result therefore indicates a close genetic relationship between the Sudanese indigenous chicken and the genus Gallus. In addition, at the 5′ end of the D-loop HVI domain, an interrupted thymine string (AATTTTATTTTTT), an interrupted poly-C (5′-CCCCCCCTTTCCCCCCC-3′), and a poly-G (5′-AGGGGGGGT-3′) were conserved in all the individuals. These conserved features have been described in many avian orders other than Galliformes, including Struthioniformes, Falconiformes, and Sphenisciformes [27].
Moreover, the presence of the cytosine and guanine strings in proximity to each other in the D-loop sequence of the Sudanese indigenous chicken makes the formation of a stable hairpin structure possible [28]. The conserved sequence motifs TACAT and TATAT were found in all domestic chickens of Sudan. These motifs are described as termination-associated sequences (TASs), elements involved in the termination of mtDNA synthesis [29]. The presence of TASs in both Galliformes and mammals may suggest a strong structural function of the D-loop region in the two groups, while the lack of variation in TASs among the Galliformes may be due to selective functional constraints. Phylogenetic analysis of the 14 haplotypes identified from the Sudanese indigenous chicken illustrated their evolutionary relationships. All Sudanese chicken populations from the Northern states fell into two clusters. However, when a second phylogenetic tree was constructed including the population from Upper Nile State (Malakal) in South Sudan, the dendrogram placed the Sudanese indigenous chickens into three clusters and two different clades, meaning that Sudanese chickens came from two different maternal lineages out of the seven clades. Although some populations have more than one haplotype, they fell into the same maternal lineage; that is, all Sudanese haplotypes fall in Clade IV except haplotype SUD71, which fell in Clade IIIa. This result may suggest that these populations shared the same maternal ancestor and that their descendants have accumulated mutations to become distinct lineages.
Total nucleotide diversity among the Sudanese chickens was found to be $\pi$ = 0.00282 nucleotide substitutions per site; it was higher in the Malakal and Betwil populations and lower in the Beladi Bahri and Bare Neck populations. The low nucleotide diversity in the Large Beladi of Bahri and Bare Neck populations may indicate a loss of gene diversity in these populations, which may be attributed to a relatively recent population bottleneck. On the other hand, the high nucleotide diversity in the Malakal and Betwil populations may suggest that these populations are more ancient [11]. Network analysis showed that the Large Beladi haplotype LBB49 has the highest frequency and is connected with the largest number of haplotypes, forming a star-like structure. Such a pattern has been found in different species of birds, including the red-winged blackbird, redpoll finches, and the greenfinch [30,31]. The analysis also revealed convergent or reverse mutation among haplotypes LBB40, LBB56, and BTW1. Convergent mutation is common where mutation rates are heterogeneous across nucleotide sites. Under such circumstances, accumulation of mutations at a small number of fast sites leads to reverse mutation [32]. Moreover, the network analysis showed that there is probably more than one maternal origin of the Sudanese indigenous chicken populations, as one haplotype from the Malakal population fell into a different clade (IIIa).
The hierarchical analysis of molecular variance (AMOVA) and the ST-statistic significance tests indicate that 9.8% of the maternal genetic differentiation in Sudanese indigenous chicken populations resulted from variation among populations, while 90.2% was due to genetic divergence among individuals within populations. The highest genetic differentiation was observed between the Betwil population and the rest of the populations, followed by the Malakal population. The $F_{ST}$ value found in this study is close to the value reported in African cattle breeds ($F_{ST}$ = 0.060) [33] but smaller than that reported among 78 Chinese indigenous chicken breeds ($F_{ST}$ = 0.106) [34]. When hierarchical analysis was used as a second tool to give more insight into genetic differentiation between individuals within populations and to confirm the $F_{ST}$ results, the Sudanese chicken population was divided into two geographical groups or regions, rich and poor Savanna. The genetic variation between individuals within populations was 95.5% and that between populations within groups was 0.75%, while the genetic variation between groups accounted for 3.75%. The low genetic variation between the geographically defined groups may suggest a weak phylogeographic structure of Sudanese chicken and may be an indicator of a common maternal origin. When the Sudanese chicken population was considered as one single group, the genetic variation within populations accounted for 88.6% of the total variance, while the proportion between populations was 11.4%. This indicates that the Sudanese chicken populations are genetically differentiated along geographical localities. Finally, the study concludes that the region of the mtDNA D-loop HVI that ranges from 167 to 397 shows high variation among Sudanese domestic chicken populations.