cpDNA-Gene-Sequence-Based Genetic Diversity, Population Structure, and Gene Flow Analysis of Ethiopian Lowland Bamboo (Bambusinea: Oxytenanthera abyssinica (A. Rich.) Munro)

Assosa University (ASU), Assosa, Ethiopia Addis Ababa University (AAU), Addis Ababa, Ethiopia Institute of Biotechnology (IoB), Addis Ababa University (AAU), Addis Ababa, Ethiopia Ethiopian Biotechnology Institute (EBTi), Ministry of Science and Technology (MoST), Addis Ababa, Ethiopia Central Ethiopia Environment and Forest Research Centre (CE-EFRC), Addis Ababa, Ethiopia International Network for Bamboo and Rattan (INBAR), Beijing, China COMSATS University Islamabad (CUI), Park Road, Islamabad, Pakistan


Introduction
Bamboo is an arborescent perennial, giant, and woody grass belonging to order Poales (monocotyledon), family Poaceae (grass family), subfamily Bambusoideae, and tribe Bambuseae, encompassing ca. 1,662 species in 121 genera [1] and about 100 species are under commercial cultivation [2]. Bamboo is the fastest-growing plant, 100 cm per day, in the world [3] and one of the most important nontimber forest resources or a potential alternative to wood and wood products [4].
Bamboo is widely distributed in Asia, Latin America, and Africa from sea level to highlands in tropical, subtropical, and temperate countries [5,6]. According to the world bamboo resources assessment report, Ethiopia, Kenya, and Uganda possess most of the bamboo resources in Africa [7]. Ethiopia contributes to the leading coverage constituting more than 1.44 million hectares [8]. is constitutes about 86% of the total area of bamboo on the continent and 7% of the world [9].
ere are two indigenous woody bamboo species in Ethiopia: the African Alpine Bamboo or highland bamboo (Yushania alpina K. Shumann Lin; synonym: Arundinaria alpina K. Schumann) and the monotypic genus lowland bamboo (Oxytenanthera abyssinica (A. Richard) Munro).
ese species occur in some other African countries, but nowhere other than the continent of Africa [9,10]. ey are indigenous to Ethiopia and endemic to Africa, confined to the sub-Saharan region [9]. e lowland bamboo covers a range of elevation between 540 and 1750 m and highland bamboo at a higher elevation above 2,480 m [8]. e lowland bamboo (O. abyssinica) in Ethiopia accounts for 85% of the total national coverage of bamboo, and the remaining 15% is covered by highland bamboo (A. alpina) [9,11].
Genetic erosion of bamboo and their wild relatives are accelerating at a high rate because of human activities such as deforestation, wild firing, overexploitation, and introduction of exotic species without investigation and proper research on the potential impact of genetic pollution and general problems associated with transfer of exotic germplasm [1]. Ethiopia introduced around 23 new species of bamboo in two rounds since 2007. e first entries were by the Ministry of Agriculture and INBAR. Seven species of the 1st entries have been tested for their adaptability and growth performance in different locations without adequate studies and inquiry. e second entries that comprise 16 species were introduced by Morel Agroindustries LTD. ese species are under multiplication at Holetta and Gurd-shola nurseries of CE-EFRC, Addis Ababa [12].
As many as half of the world's woody bamboo species have become vulnerable to extinction as a result of massive forest destruction [13]. e most important problems currently faced by bamboos in Ethiopia are related with high abandonment of bamboo plant due to lack and/or gap of knowledge on its biology and genetics, presence of high genetic erosion, and destruction of the plant due to human activities (Grand Ethiopian Renaissance Dam (GERD) with a catchment area of 172, 250 km 2 [14] is built primarily on major lowland bamboo-growing areas of Metekel Zone of Benishangul, Gumuz Region (BGR)).
Relatively limited numbers of molecular finger printing studies have been carried out to assess the genetic diversity of bamboo species [2], and no genetic diversity study on O. abyssinica in Ethiopia has been conducted so far. e lack of research studies conducted in Ethiopia, especially on the diversity and systematics of O. abyssinica (the species with great ecological and industrial benefit and great coverage in Africa) at the DNA level, prompted the commencement of this research. erefore, for the present study, we sequenced coding (matK and ndhF) and noncoding (rps16) regions of cpDNA genes aimed to assess the genetic diversity, population structure, and gene flow analysis of O. abyssinica collected from lowland bamboo-growing areas of the country.

Plant Material Collection.
Young fresh leaves from thirteen natural O. abyssinica-growing areas across the country were collected. Representative young leaves (3)(4)(5) from three independent bamboos for each population were immersed and preserved in a 15 ml falcon tube containing 2% CTAB (2% CTAB, 100 mM Tris-Base pH 8.0, 25 mM Na 2 -EDTA, 2 M NaCl, 250 mg/mL PVP, and 2% β-mercaptoethanol) solution. Areas of sample collection along with GPS data and altitudinal information are presented in Table 1. Maps showing sample coverage and collection site are described in Figure 1.

DNA Extraction and PCR Amplification.
Genomic DNA from three independent samples of each sample was isolated using a modified CTAB method [15] at the Department of Biosciences, Laboratory of Biochemistry, Molecular Biology, and Biotechnology, COMSATS University, Islamabad, Pakistan. Leaves immersed in falcon tubes were taken out and crushed by using an autoclaved mortar and pestle using new fresh 1-2 ml of 2% CTAB solution. Only 1 ml of the crushed tissue was transferred to a new sterile centrifuge tube using blue pipette tips which were cut. Seven hundred microliter of chloroform was added to the crushed tissue and mixed thoroughly and centrifuged at 16000g for 10 minutes 26°C. Six hundred microliter of the supernatant (only clear mix) was transferred to the new fresh Eppendorf tube, and 60 μl of 3M sodium acetate (pH 5.2) was added and thoroughly mixed. Six hundred microliter of ice-cold isopropanol was added and gently mixed by inverting the tubes 3-5 times, and then, the tubes were placed in a refrigerator (−20°C) for 2 h. e mix was centrifuged at 16000g at 4°C for 5 min to precipitate the DNA. e supernatant was discarded, and the DNA was washed by 1 ml 70% ethanol by dissolving the pallet completely in the wash buffer and centrifuged at 16000 × g for 3 min. at 4°C. e wash step was repeated by cold absolute ethanol (1 ml), and the pellet was air-dried. e pellet was dissolved in 60 μl 0.1X TE (10 mM Tris-HCl pH 8.0 and 1 mM EDTA pH 8.0) buffer containing RNase. Test gel electrophoresis in 1% agarose (0.5x TBE buffer was used for gel preparation and run) and nanodrop (Implen Nanophotometer 190-1100 nm spectrophotometer) of each sample were measured, and those with high DNA quality were used for PCR amplification. Among ten primers that amplify ten specific cpDNA regions selected from [16,17] and designed via Primer3 primer designing program (http://bioinfo.ut.ee/primer3-0.4.0/), three chloroplast sequences with coding (matK and ndhF) and noncoding (rps16 intron) were used ( Table 2).   3.0 μl PCR products were analyzed using a standard 1% agarose gel electrophoresis and purified and paired-end sequenced at Macrogen, Inc. Seoul, Korea. e sequence of both strands of every fragment amplified from each sample was assembled separately using DNA Dragon version 1.6.0, and each sampling district was represented by a single sequence with the one very identical and informative among the triplicate samples. BLAST searches for each target sequence were used to confirm probable homology (query cover, identity percentage, E-value, and direction of the strand). e assembled sample sequence was then submitted to the GenBank and used for sequence analysis. Sequences of three markers from each samples were arranged in FASTA format and aligned in MAFFT (multiple sequence alignment software version 7) [18] and realigned again using MUSCLE (multiple sequence comparison by log-expectation) of MEGA X [19] to increase the quality of final alignment product. Sequence information for submission to NCBI was prepared by Sequin v5.51. An accurate and complete GenBank record or accession number is reported in Table 3.

Evolutionary Tree Construction and Network Analysis.
e sequence alignments with MUSCLE were used to identify the best nucleotide substitution model and construct the neighbor-joining (NJ) tree ( Figure 2) using MEGA version 7.0. e NJ tree analysis was conducted using Kimura's two-parameter distance correction mode to build a phylogenetic tree. e evolutionary distances were calculated based on the Maximum Composite Likelihood (MCL) method [20] and are in the units of the number of base substitutions per site. Gamma distribution (shape parameter � 1) was used for the rate variation among sites. For each sequence pair, all ambiguous positions have been removed, and for the final dataset, there were a total of 3,694 positions. DnaSP version 6.10.01 was used to generate haplotype files (with and without the use of the indels) for network analysis. e generated sequence files were saved as a Roehl format, and network analyses were performed via Network version 5.0.1.1 (http://www.fluxus-engineering.com/sharenet.htm).

Nucleotide Diversity, InDel Polymorphism, and Gene Flow and Genetic
Differentiation. DnaSP version 6.10.01 was used to calculate and analyze (1) nucleotide diversity including an average number of nucleotide difference (k), nucleotide diversity (π), and population mutation rates per 100 sites (θw) for total sequence (coding + noncoding regions together) and coding and noncoding regions separately, (2) InDel polymorphism including the number of sites with fixed gaps, total number of (InDel and non-InDel) sites, average InDel length, InDel diversity k (i), and InDel diversity per sites π (i), and (3) gene flow and genetic differentiation including gene flow via genetic differences among population (Gamma St) and average level of gene flow (Fst) and genetic differentiation via nucleotide-sequence-based statistics (Ks) and average number of nucleotide differences between population 1 and population 2 (Kxy).

Results and Discussion
ere were 13 nucleotide sequences involved in the analysis of the current study. For each pair of sequences, each ambiguous position was removed, and in the last dataset, there was an aggregate of 3,694 positions. Evolutionary history using NJ together with nucleotide diversity analysis, DNA divergence between populations, InDel polymorphism, and gene flow and genetic differentiation analysis on O. abyssinica was investigated to analyze on this paper which is the first to study the plants genetic diversity, population structure, and gene flow using coding (matK and ndhF) and noncoding (rps16) regions of cpDNA genes.   observed and the result of each samples was close to the average.

Neighbor-Joining Method Evolutionary History of Ethiopian Lowland
Bamboo. e evolutionary history of bamboo was inferred based on the Neighbor-Joining [21] method. NJ shows the optimal tree with the total length of the branch � 0.01179101 ( Figure 2). Except samples collected from the Oromia Region that show mixing with Kemash and Metekel groups, other populations form their own group which secures the distinctness of their population and effectiveness of genes used for the study.
ree major clusters and three subclusters with five clades (Assosa Zone, Kemash Zone, Metekel Zone, the distant, and the intermixed Oromia Region samples) were formed in the final NJ analysis. e three cpDNA genes (matK, ndhF 3′, and rps16) were analyzed separately and then combined into a single data to

Network
Analysis. e total number of mutations disregarding the torso (130, 27), estimated number of mutations of shortest tree within the torso (73, 0), estimated number of mutations of shortest tree (203, 27), total number of taxa (13,13), and total number of haplotypes (13,11) were observed on gaps considered and not considered network analysis. Number values before parenthesis are for gaps considered and after parenthesis are for gaps not considered. Haplotype 8 (Kurmuk sample of Assosa Zone) from gaps considered (Figure 4(a)) and haplotype 2 (Bambasi sample of Assosa Zone) from gaps not considered (Figure 4(b)) have descendants around them, many of which differ from them by nucleotide change (at sites shown by numbers), and some are more distantly related. Both forms approve that Assosa Zone is the root of O. abyssinica and others have diverged from this zone. e distant populations H_3 (Abol sample of Gambella Region) and H_11 (Koyshe sample of SNNPs Region) from gaps not considered and H_3 (Abol sample of Gambella Region) and H_13 (Koyshe sample of SNNPs Region) from gaps considered showed to be distant from the root.
Distant populations (11.000, 0.00300 ± 0.00150, and 11.00000 ± 0.00222) followed by Oromia, Assosa Zone, and Kemash Zone were found to be lower in k, π, and θw ( Table 5). is implicates that gene differentiation in Metekel Zone and the distant populations is higher and that makes them a diverse population as compared to others.

InDel Polymorphism Analysis.
e average InDel length was higher on Kemash (15.33) and Metekel Zones (12.21), and the lower value was observed on Assosa (7.14), distant, and Oromia populations ( Figure 5), whereas the total number of (InDel and non-InDel) sites analyzed were higher for populations of Metekel (3,692), distant population (3,679), and Assosa Zone (3,678), but Oromia and Kemash Zone populations show a relatively smaller value ( Figure 5). Higher frequency of InDel makes Metekel, the distant population, and Assosa Zone populations more diverse than Oromia (3,654) and Kemash Zone (3,668) populations. From this, we can conclude InDel plays a significant role in genetic differentiation and population structure of O. abyssinica populations.

3.5.
Analysis of DNA Divergence. DNA divergence between populations was higher between populations of Metekel, Oromia, Kemash, and Assosa Zone vs. distant populations in chronological order ( Figure 6). e k was higher between Metekel Zone vs. distant population (9.5) followed by Oromia vs. distant population (8.67) and Kemash Zone vs. distant population (8.47). e least k was found between populations of Assosa vs. Kemash Zone and between Assosa Zone vs. Oromia. e π (t) was higher between Metekel Zone vs. distant population (0.0026) followed by Oromia vs. distant population (0.0024) and Kemash Zone vs. distant population (0.0023). e least π (t) was found between    Oromia. e MP in P1, but monomorphic in P2, was higher between Metekel Zone vs. distant population (10) followed by Oromia vs. distant population (7) and Metekel Zone vs. Oromia (5). e MP in P1, but monomorphic in P2, was found to be zero between populations of Kemash vs. Metekel Zone and between Kemash Zone vs. Oromia. e MP in P2, but monomorphic in P1, was higher between Assosa Zone vs. Metekel Zone (11) and between Kemash Zone vs. Metekel Zone (11) followed by Kemash Zone vs. the distant population (10). e least MP in P2, but monomorphic in P1, was found between populations of Metekel Zone vs. Oromia followed by between Assosa Zone vs. Kemash Zone. e π between populations was higher between distant populations vs. Kemash Zone (11.5) followed by distant populations vs. Metekel Zone (9.5) and distant population vs. Assosa Zone (9.33). e least π between populations was found between populations of Assosa Zone vs. Kemash Zone and between Assosa Zone vs. Oromia.
Any population compared to the distant populations was found to have larger DNA divergence than others. is tells us DNA of the distant population is distinctly different from that of other populations. DNA divergence between Assosa vs. Kemash Zone, Assosa vs. Oromia Zone, and Kemash Zone vs. Oromia was found to be smaller. ose three zones (Assosa, Kemash, and Oromia) are very close to each other, and there might be seed and/or seedling transfer between 8.00 0.0022 ± 0.0011 8.00 ± 0.0016 6.00 0.0021 ± 0.0011 6.00 ± 0.0016 2.00 0.0023 ± 0.0012 2.00 ± 0.0020 Distant pop.

Gene Flow and Genetic Differentiation Analysis.
Extremely higher frequency of genetic differentiation was found between Metekel Zone vs. the distant population (51.63) and between Assosa Zone vs. Metekel Zone (51.5). e least frequency was observed between Kemash Zone vs. Oromia (19.25) and between Assosa Zone vs. Oromia (23.0). Higher frequency of gene flow was found between Assosa Zone vs. Oromia (0.61) and between Kemash Zone vs. Oromia (0.51), and a minimum frequency of gene flow was observed between Metekel Zone vs. the distant population (0.023) and between Oromia vs. the distant populations (Table 6). is implicates that gene flow to Metekel Zone and the distant populations is rare and that makes diverse population as compared to others.
Genetic diversity is the basis for the ability of an organism to adapt to environmental changes and can be influenced by many factors [23]. Geographical factors (e.g., landscape, latitude, longitude, and altitude) and environmental factors (e.g., temperature and precipitation) influence the genetic diversity and population structure of a species and the individuals among populations [24,25]. Biological factors such as mutation, genetic drift, mating system, pollination mode, gene flow, and selection also influence the diversity of the plant species and population [26]. Mainly, mutations within the DNA are the source of variations. InDels usually occur by reason of certain cellular mechanisms, including transposable elements movement, replication slippage, and crossing over imbalanced within genomes [27]. InDels are an essential phenomenon that can have a harmful or advantageous effect on specific genomes' loci [28,29]. InDels are one the main variation sources found within the genomes of various species of plants, including Arabidopsis [30], tomato [31,32], rice [33], chickpea [34], moso bamboo [35], Sorghum species [36], and Gastrodia elata (Orchidaceae) [37]. InDels also used for identifying the genetic variants underlying phenotypic variation in plants without complete genomes [38]. Examination of DNA differences among closely related species or among polymorphic DNA variations of a species will provide insight into the mutation nature and evolution process [39]. InDel analysis on O. abyssinica also shows that gene flow between distant populations was rare and genetic differentiation becomes higher.
Several morphological and molecular marker techniques such as random amplified polymorphic DNA (RAPD), inter simple sequence repeats (ISSR), amplified fragment length polymorphism (AFLP), simple sequence repeat (SSR), expressed sequence-tag-derived simple sequence repeat (EST-SSR), sequence-related amplified polymorphism (SRAP), restriction fragment length polymorphism (RFLP), and interretrotransposon amplified polymorphism (IRAP) have been routinely used for genetic diversity study, population structure analysis, and characterization of bamboo germplasm [2,[40][41][42][43][44][45][46][47][48][49]. But so far, there is no study report on the molecular genetic diversity of O. abyssinica. Hence, as the first step in the development of genomic tools and resources that could contribute to the development of strategies for effective conservation and sustainable utilization of this bamboo for ecological and economic gains by better understanding the genetic diversity profile at the species and population level, we examined and assessed thirteen O. abyssinica populations using three chloroplast gene sequences. ese molecular markers have been successfully utilized for assessing the genetic diversity and revealed a remarkable molecular genetic diversity among the O. abyssinica populations. As the evolutionary rates of cpDNA are highly conserved with structural changes and nucleotide substitution [50], cpDNA regions that are coding (matK and ndhF 3′ end) and noncoding (rps16) genes were used to examine genetic diversities, population structure, and gene flow analysis of O. abyssinica. e effectiveness of cpDNA regions for genetic diversity and phylogenetics studies are well approved and tested in many plants including temperate woody bamboos [16] and paleotropical woody bamboos (Poaceae: Bambusoideae: Bambuseae) [17], level phylogenetics relationships within the bamboos [51], for evaluating plant phylogeny at low taxonomic levels and for DNA barcoding [52], and for phylogenetic relationships among the one-flowered, determinate genera of Bambuseae [53].
Populations of O. abyssinica collected in Ethiopia show clear diversity based on their geographic location. Oumer et al. [15] also found the same result. is might be because the plant is indigenous to the country [9] and largely abundant with broad geographical coverage for a long time, cross-pollination nature of the flower, and the distance between sample collections sites, especially zones which were far and distant. Sample collection sites that have relatively closer distance such as Assosa, Kemash, and Oromia, showing smaller nucleotide diversity (Table 2), InDel (Table 3 and Figure 5), genetic differentiation (Table 4), and DNA divergence ( Figure 6) but higher gene flow between populations (Table 4). ese three sites are neighbor to each other. Assosa and Kemash zones are under the administrative system of BGR and neighbor to each other with relatively closer distance as compared to other sampling sites. Samples from Oromia are also neighbor to Kemash Zone. us, both geographic range and distribution of populations influence patterns of genetic diversity, differentiation, and gene flow of O. abyssinica. Indeed, cross pollination may have contributed to increasing the heterogeneity of bamboo seedlings [54]. is result is more in line with the study on African tropical forest refugia using chloroplast and nuclear DNA phylogeography [55], classification of the Chloridoideae (Poaceae) based on multigene phylogenetic trees [56], Chinese Cherry revealed by chloroplast DNA trnQ-rps16 intergenic spacer [57], natural populations of Oxalis laciniata from Patagonia Argentina using ISSR and cpDNA-based markers [58], wild soybean evaluated by chloroplast and nuclear gene sequences from 44 Chinese, four Japanese and five Korean populations [59], and endangered basal angiosperm Brasenia schreberi (Cabombaceae) in China using microsatellites [60].
Network analysis shows that haplotypes 8 and 2 (both from Assosa Zone) are the root of O. abyssinica and others diverge from this zone. In Amharic Ambesa Chacka (meaning lion's forest because of high abundance of lions in the forest) was one of the sample collection sites from Assosa Zone of Bambasi district. Years ago, the area was known for its wide and dense lowland bamboo coverage. But now, due to poor protection and human's effect on the forest, even bamboos are at risk and vulnerable to extinction.

Conclusions
Based on the results retrieved, populations of O. abyssinica collected in Ethiopia show clear diversity based on their geographic location. Metekel Zone found the most diverse population, and thus, government sectors and other stakeholders recommended focusing on conservation of lowland bamboo of Metekel Zone. Assosa Zone has found the source of evolution of O. abyssinica, and Gambella population shows a difference from other O. abyssinica populations found in Ethiopia. e current study was performed only by few chloroplast coding and noncoding genes; thus, additional coding, noncoding, and spacer chloroplast, mitochondrial, and nuclear-gene-based study on Ethiopian lowland bamboo (O. abyssinica) might give additional information on the population genetic diversity, structure, and gene flow.
Data Availability e cpDNA sequences from the current study are available from the corresponding author upon request. GenBank accession numbers for each cpDNA genes are provided in Table 3.

Conflicts of Interest
e authors declare that they have no conflicts of interest. Note. Ks denotes nucleotide-sequence-based statistics, Kxy: average number of nucleotide differences between population 1 and population 2, GammaSt: genetic differences among population, and Fst: average level of gene flow.