Comparative Analysis of the Complete Chloroplast Genomes in Allium Subgenus Cyathophora (Amaryllidaceae): Phylogenetic Relationship and Adaptive Evolution

Recent advances in molecular phylogenetics have clarified the taxonomy and evolution of Allium L., including the subgenus Cyathophora, which is considered monophyletic and contains five species. However, previous studies detected distinct incongruence between the nrDNA and cpDNA phylogenies, and the interspecific relationships of this subgenus need to be further resolved. In this study, we newly assembled the whole chloroplast genomes of four species in subgenus Cyathophora and two allied Allium species. The complete cp genomes possessed a quadripartite structure, and the genome size ranged from 152,913 to 154,174 bp. Among these cp genomes, there were subtle differences in gene order, gene content, and GC content. Seven hotspot regions (infA, rps16, rps15, ndhF, trnG-UCC, trnC-GCA, and trnK-UUU) with nucleotide diversity greater than 0.02 were discovered. The selection analysis showed that some genes have elevated Ka/Ks ratios. Phylogenetic analyses based on the complete chloroplast genome (CCG), the intergenic spacer regions (IGS), and the coding DNA sequences (CDS) showed the same topologies with high support, which revealed that subgenus Cyathophora is a monophyletic group containing four species and that A. cyathophorum var. farreri is sister to A. spicatum with a 100% bootstrap value. Our study revealed that selective pressure may act on several genes of the six Allium species, which may help them adapt to their specific living environments. We have well resolved the phylogenetic relationships of species in the subgenus Cyathophora, which will contribute to future evolutionary studies or phylogeographic analyses of Allium.


Introduction
Subgenus Cyathophora (R. M. Fritsch) R. M. Fritsch is a small group of Allium that was proposed relatively recently [1]. According to Li et al. [2], subgenus Cyathophora contains about six species and one variety. A. spicatum (Prain) N. Friesen has a wide distribution range, extending from China to Nepal, while the remaining taxa are endemic to China and mainly distributed along the southeastern margin of the Qinghai-Tibet Plateau (QTP): A. mairei Lév, A. kingdonii Stearn, A. rhynchogynum Diels, A. trifurcatum (F. T. Wang and Tang) J. M. Xu, and A. cyathophorum Bur. and Franch and its variety A. cyathophorum var. farreri (Stearn) Stearn. Although it contains a small number of species, the boundary of subgenus Cyathophora and its constituent species have been revised several times with the development of molecular biology. In previous studies, A. spicatum was classified at different taxonomic levels because of its distinctive spicate inflorescence, based on morphological and molecular evidence [3]. Huang et al. [4] proposed five species for subgenus Cyathophora: A. mairei, A. rhynchogynum, A. cyathophorum, A. cyathophorum var. farreri, and A. spicatum, while A. kingdonii and A. trifurcatum did not belong to this group. Micromorphological and cytological features supported that subgenus Cyathophora is monophyletic and contains five species [5,6]. Among species of the subgenus, A. spicatum grows in the arid western QTP and has a highly unusual spicate inflorescence [3], while A. cyathophorum and A. cyathophorum var. farreri, with umbel inflorescences, extend to the moist Hengduan Mountains Region (HMR) [7] (Figure 1). Furthermore, Li et al. [6] suggested that A. cyathophorum and A. farreri were independent species based on molecular phylogeny and striking differences in micromorphology. A. rhynchogynum has never been sampled since it was published in 1912 [8]. We also carried out extensive field work to collect it but failed.
The Flora of China records that A. rhynchogynum is distributed only in northwestern Yunnan Province, China. Therefore, we speculate that A. rhynchogynum may have become extinct or that it was misidentified in previous studies. Li et al. [6] performed phylogenetic and biogeographic analyses of A. cyathophorum and A. spicatum based on chloroplast and nuclear ribosomal DNA and detected distinctly different topologies between the two data sets: A. cyathophorum showed a close relationship with A. spicatum in the nuclear DNA tree but was sister to A. cyathophorum var. farreri in the cpDNA tree [4,6]. In addition, the relationships among these species have not been fully determined. Phylogenetic analyses using single or several combined chloroplast fragments have not resolved the problem effectively, whereas the complete cp genome offers the potential to resolve the relationships within subgenus Cyathophora. Hence, it is imperative to reconstruct the relationships of subgenus Cyathophora and clarify its constituent species using complete chloroplast genomes. To evaluate the resources of subgenus Cyathophora comprehensively, we also need more efficient molecular markers.
The chloroplast is one of the basic organelles in plant cells and is responsible for photosynthesis in green plants [9]. Chloroplast genomes have a highly conserved structure and gene content, with a quadripartite structure composed of large single-copy (LSC) and small single-copy (SSC) regions separated by two copies of an inverted repeat (IR) [10,11]. Previous studies suggested that the chloroplast genome size of angiosperms ranges from 120 kb to 170 kb, with gene number varying from 120 to 130 [12]. The complete chloroplast genome has long been central to plant molecular evolution and systematic studies because of its simple structure, highly conserved sequence, and maternal inheritance [13]. Because complete cp genome analysis provides more genetic information than single or a few cpDNA fragments [14], many long-standing phylogenetic problems in angiosperms at various taxonomic levels have been successfully resolved using cp genome sequences [15][16][17][18][19][20].
Beyond phylogenetic studies, the whole cp genome is important for revealing the photosynthetic mechanisms, metabolic regulation, and adaptive evolution of plants. Research has shown that adaptive evolution is mainly driven by evolutionary processes such as natural selection, which acts on genetic changes caused by recombination and mutation [21]. Many recent studies have used complete chloroplast genomes to analyze the selection pressures that species have undergone during evolution. For example, positive selection on the atpF gene may indicate that it played an important role in the divergence of deciduous and evergreen oak trees [9], and positive selection has also been detected on ycf2 in watercress chloroplasts [22]. With the development of sequencing technology, the number of cp genome sequences has increased dramatically in recent years. However, only a few plastid genomes of Allium have been reported so far, and more complete chloroplast genomes of Allium are needed for future phylogenetic and evolutionary research.
In this report, we assembled and characterized the complete cp genome sequences of six Allium species using next-generation sequencing technologies to (1) reveal common structural patterns and hotspot regions, (2) gain a better understanding of the relationships within subgenus Cyathophora based on complete chloroplast genomes, and (3) investigate adaptive evolution in the cp genomes of the six Allium species. We hope our study will provide valuable genetic resources for further evolutionary studies of subgenus Cyathophora.
(Table 1). Morphological characters were measured using karyotype [23]. Healthy leaves were immediately dried with silica gel for DNA extraction. The voucher specimens were deposited in the Herbarium of Sichuan University (SZ Herbarium). Total genomic DNA was extracted from the sampled leaves according to the manufacturer's instructions for the Plant Genomic DNA Kit (Tiangen Biotech, Beijing, China). Genomic DNA was indexed with tags and pooled in one lane of an Illumina HiSeq platform for sequencing (paired-end, 350 bp) at Novogene (Beijing, China).

Chloroplast Genome Sequence Assembly and Annotation.
We first used FastQC v0.11.7 to assess the quality of all reads [24]. To select the best reference, we filtered the chloroplast-related reads by mapping all reads to the published chloroplast genome sequences of Allium. SOAPdenovo2 was used to assemble all relevant reads into contigs [25]. The clean reads were assembled using NOVOPlasty [26] with the complete chloroplast genome of the close relative A. cepa as the reference (GenBank accession no. KM088014). Geneious 11.0.4 was used to annotate the assembled chloroplast genomes, and the annotations were corrected manually after comparison with references [27]. The circular plastid genome maps were generated with the OGDRAW program [28]. The GenBank accession numbers of A. cyathophorum, A. cyathophorum var. farreri, A. spicatum, A. mairei, A. trifurcatum, and A. kingdonii are MK820611, MK931245, MK931246, MK820615, MK931247, and MK294559, respectively.

Codon Usage Analysis.
Codon usage of the species in subgenus Cyathophora was analyzed with CodonW [31]. Protein-coding genes (CDS) were selected with the following filters: (1) each CDS was longer than 300 nucleotides [18,32]; (2) repeated sequences were deleted. In total, 53 CDS from each Allium species were selected for further study.

Genome Comparison (IR Contraction and Expansion).
The mVISTA program was used in Shuffle-LAGAN mode [33] to analyze whole-sequence similarity among the chloroplast genomes of all six Allium species, with A. cyathophorum as the reference. The boundaries between the single-copy regions (LSC and SSC) and the inverted repeat (IR) regions of the six chloroplast genome sequences were compared using Geneious v11.0.4 [27].

Hotspot Regions Identification in Subgenus Cyathophora.
To analyze nucleotide diversity (Pi), we extracted the 112 genes shared by the six Allium species after alignment. DnaSP 5.10 was used to calculate nucleotide variability [34].
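As an illustration of what a nucleotide diversity value represents, the sketch below computes Pi for one aligned gene as the average number of pairwise differences per site. It is not the DnaSP implementation: the function names are hypothetical, and the choice to skip alignment columns containing gaps or ambiguous bases is an assumption made for this example. The 0.02 cutoff is the hotspot threshold used in this study.

```python
from itertools import combinations


def nucleotide_diversity(aligned_seqs):
    """Average pairwise differences per site (Pi) for aligned, equal-length sequences.

    Assumption for this sketch: columns with a gap or ambiguous base in any
    sequence are skipped entirely.
    """
    if len(aligned_seqs) < 2:
        raise ValueError("need at least two sequences")
    seqs = [s.upper() for s in aligned_seqs]
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to the same length")

    # Keep only columns where every sequence has an unambiguous base.
    valid = [i for i in range(length) if all(s[i] in "ACGT" for s in seqs)]
    if not valid:
        return 0.0

    pairs = list(combinations(seqs, 2))
    differences = sum(
        sum(1 for i in valid if a[i] != b[i]) for a, b in pairs
    )
    return differences / (len(pairs) * len(valid))


def hotspots(pi_by_gene, threshold=0.02):
    """Return gene names whose Pi exceeds the threshold, sorted by name."""
    return sorted(g for g, pi in pi_by_gene.items() if pi > threshold)
```

For example, `hotspots({"infA": 0.08125, "rrn16": 0.00041})` keeps only `infA`, matching the two extreme values reported in Table S2.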

Gene Selective Pressure Analysis of Six Allium Plastomes.
To investigate selection pressures, nonsynonymous (Ka) and synonymous (Ks) substitution rates of 65 selected protein-coding genes were calculated between the cp genomes of subgenus Cyathophora and the other two Allium species using KaKs Calculator version 2.0 [35].

Subgenus Cyathophora Phylogenomic Analysis Based on Chloroplast Genome.
Phylogenetic analysis of subgenus Cyathophora was based on twenty-nine complete chloroplast genome sequences: twenty-one species of Allium (6 newly assembled species and 15 other species obtained from NCBI), six species of Lilium, and two species of Asparagus as the outgroups (Table S1). Three different datasets were used to build the phylogenetic trees (the complete genome sequences, the IGS sequences, and all CDS sequences), and three different methods were applied: Bayesian inference (MrBayes v3.2), maximum parsimony (PAUP version 4.0), and maximum likelihood (RAxML 8.0). The sequences were aligned using MAFFT [36] in Geneious 11.0.4 with the set parameters and manually trimmed. GTR + I + G was selected as the best model using ModelTest v3.7 [37]. Maximum likelihood (ML) analyses were performed using RAxML 8.0 with 1000 bootstrap replicates [38]. PAUP was used to conduct maximum parsimony (MP) analyses [39]. MP was run using a heuristic search with 1000 random addition sequence replicates and tree-bisection-reconnection (TBR) branch swapping. Bayesian inference (BI) was executed with MrBayes v3.2 [40], and the Markov chain Monte Carlo (MCMC) analysis was run for 1 × 10^8 generations. The trees were sampled every 1000 generations: the first 25% were discarded as burn-in, and the remaining trees were used to construct a 50% majority-rule consensus tree. Stationarity was considered to be reached when the average standard deviation of split frequencies remained below 0.001.

Chloroplast Genome Organization and Gene Content in Six Species.
These six newly acquired Allium cp genomes had the circular DNA structure typical of angiosperm cp genomes, comprising LSC, SSC, and two IR regions (Figure 2). The sizes of the six cp genomes ranged from 152,913 bp in A. mairei to 154,174 bp in A. cyathophorum, similar to other Allium cp genomes [41]. (Table 2). The overall GC content of the cp genome sequences was 36.8-36.9%, and the GC contents of the LSC, SSC, and IR regions were 34.6-34.8%, 29.5-31.2%, and 42.7-43.1%, respectively. A total of 132 genes were identified in each complete cp genome: 8 ribosomal RNA (rRNA) genes, 86 protein-coding genes, and 38 transfer RNA (tRNA) genes (Table 3).
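The GC percentages above follow the usual definition: G plus C bases as a share of all unambiguous bases. A minimal sketch under that assumption is given below; the function name is hypothetical, and applying it per region would additionally require the LSC, SSC, and IR coordinates from the annotation, which are not reproduced here.

```python
def gc_content(seq):
    """Return GC content as a percentage of unambiguous (A, C, G, T) bases."""
    seq = seq.upper()
    unambiguous = sum(seq.count(base) for base in "ACGT")
    if unambiguous == 0:
        return 0.0  # nothing to measure (empty or fully ambiguous sequence)
    return 100.0 * (seq.count("G") + seq.count("C")) / unambiguous
```

Calling `gc_content` on a slice of the genome string, for example `genome[start:end]` with region coordinates taken from the annotation, would give a per-region value comparable to those listed above.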

Repeat and Simple Sequence Repeat (SSR) Analysis.
Many studies of cp genomes have shown that repeat sequences are widely used in phylogeny, population genetics, and other research [42]. Four types of repeats (forward, reverse, complement, and palindromic) were detected in the six Allium species. Only A. cyathophorum had complement repeats (3), while the other species had none. The number of repeats varied from 37 to 77 in the six species. A. cyathophorum had the most repeats, comprising 29 palindromic, 40 forward, 5 reverse, and 3 complement repeats. The number of forward repeats ranged from 15 to 40, the number of palindromic repeats ranged from 17 to 29, and the number of reverse repeats ranged from 1 to 5 (Figure 3). The lengths of forward, palindromic, and reverse repeats ranged from 30 to 267 bp; most were 30-50 bp long (81.48%), while those of 50-70 bp (9.09%), >100 bp (6.40%), and 70-90 bp (3.03%) were less common (Figure S1). Earlier reports suggest that the presence of repeats marks a locus as a hotspot for genome rearrangement [43][44][45]. Moreover, these repeats are valuable for developing genetic markers in population genetics studies [46,47].
SSRs, also called microsatellites, are repeating sequences with 1- to 6-bp motifs that are widely distributed in the chloroplast genome. SSRs are highly polymorphic and codominant, which makes them valuable markers for studies of gene flow, population genetics, and gene mapping [48]. In this study, six classes of SSRs (mono-, di-, tri-, tetra-, penta-, and hexanucleotide repeats) were found in the cp genomes of the six species. The number of hexanucleotide repeats ranged from 1 to 3 in the six species, and pentanucleotide repeats occurred only in A. kingdonii and A. spicatum. The total number of SSRs was 185 in A. cyathophorum, 158 in A. cyathophorum var. farreri, 159 in A. spicatum, 171 in A. mairei, 201 in A. trifurcatum, and 165 in A. kingdonii (Figure 4(a)). Mononucleotide repeats were the most numerous, accounting for about 30.41% of the total SSRs (Figure 4(c)); their number ranged from 36 in A. kingdonii to 71 in A. trifurcatum, and all mononucleotide repeats were composed of A or T bases. This is consistent with previous findings that SSRs in cp genomes usually consist of short polyA or polyT repeats [49]. Dinucleotide (28.20%), trinucleotide (20.79%), and tetranucleotide (19.15%) repeats followed, while pentanucleotide (0.38%) and hexanucleotide (1.06%) repeats were the least abundant. SSRs were much more numerous in the LSC region than in the SSC and IR regions (Figure 4(b)), which agrees with previous reports that SSRs are unevenly distributed in cp genomes [50].
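To make the six SSR classes concrete, the following sketch scans a sequence for perfect repeats with motif lengths of 1 to 6 bp using regular expressions. The tool and the minimum repeat counts used in this study are not stated in the text, so the thresholds below (10, 5, 4, 3, 3, 3) are an assumption chosen only for illustration, and `find_ssrs` is a hypothetical helper rather than the method actually applied.

```python
import re

# Assumed minimum repeat counts per motif length (not taken from the paper).
MIN_REPEATS = {1: 10, 2: 5, 3: 4, 4: 3, 5: 3, 6: 3}


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs."""
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in sorted(min_repeats.items()):
        # A motif of unit_len bases followed by at least (min_rep - 1) copies.
        pattern = re.compile(
            r"(([ACGT]{%d})\2{%d,})" % (unit_len, min_rep - 1)
        )
        for match in pattern.finditer(seq):
            motif = match.group(2)
            # Skip motifs that are repeats of a shorter unit, such as "AA"
            # or "ATAT", so each tract is reported under its shortest motif.
            if any(
                motif == motif[:k] * (unit_len // k)
                for k in range(1, unit_len)
                if unit_len % k == 0
            ):
                continue
            hits.append(
                (match.start(), motif, len(match.group(1)) // unit_len)
            )
    return sorted(hits)
```

Counting the returned tuples by motif length gives per-class totals analogous to those in Figure 4, although the absolute numbers depend on the thresholds chosen.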

Codon Usage Analysis.
Codon usage bias is the phenomenon in which synonymous codons are used at different frequencies in plant genomes, and it is shaped by evolutionary factors such as mutation and selection [51,52]. The relative synonymous codon usage (RSCU) measures nonuniform synonymous codon usage in coding sequences: an RSCU value of 1 indicates no bias, a value less than 1 indicates that a codon is used less frequently than expected, and a value greater than 1 indicates that it is used more frequently than expected. Based on the sequences of 53 protein-coding genes (CDS), the codon usage frequency was calculated for the six Allium cp genomes (Table 4). Altogether, the number of codons ranged from 23791 in A. mairei to 33058 in A. kingdonii. In addition, a total of 13218 codons encoded leucine and 1453 codons encoded cysteine in the cp genomes of the six species, making them the most and least common amino acids, respectively. As reported for other plant cp genomes, our study revealed a preference in the use of synonymous codons for all amino acids except tryptophan and methionine; the RSCU value of 30 codons exceeded 1 in each species, and these were A- or T-ending codons. This result agrees with other studies showing a codon usage preference for A/T-ending codons in plants [53][54][55].
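The RSCU of a codon is its observed count divided by the count expected if all synonymous codons for that amino acid were used equally. The sketch below implements this definition with the standard genetic code. It is not CodonW, and `filter_cds` and `rscu` are hypothetical names. The 300-nucleotide length filter mirrors the selection rule described in the methods.

```python
from collections import Counter

# Standard genetic code, with codons enumerated in T, C, A, G order.
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = dict(
    zip((a + b + c for a in _BASES for b in _BASES for c in _BASES), _AMINO)
)


def filter_cds(cds_list, min_length=300):
    """Keep coding sequences longer than min_length nucleotides."""
    return [cds for cds in cds_list if len(cds) > min_length]


def rscu(cds_list):
    """Return {codon: RSCU} for every codon of each amino acid observed."""
    counts = Counter()
    for cds in cds_list:
        cds = cds.upper()
        # Walk the sequence codon by codon, ignoring a trailing partial codon.
        for i in range(0, len(cds) - len(cds) % 3, 3):
            codon = cds[i:i + 3]
            if codon in CODON_TABLE:
                counts[codon] += 1

    values = {}
    for amino_acid in set(CODON_TABLE.values()):
        family = [c for c, a in CODON_TABLE.items() if a == amino_acid]
        total = sum(counts[c] for c in family)
        if total == 0:
            continue  # amino acid (or stop) not observed in the input
        for codon in family:
            # Observed count over the count expected under equal usage.
            values[codon] = counts[codon] * len(family) / total
    return values
```

Methionine (ATG) and tryptophan (TGG) each have a single codon, so their RSCU is always 1, which is why they are excluded when discussing codon preference.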

Comparative Analysis of the Chloroplast Genomes among Six Species in Allium.
The mVISTA online software in Shuffle-LAGAN mode was used to analyze overall sequence divergence among the six Allium chloroplast genomes, with the annotation of A. cyathophorum as the reference. The whole chloroplast genome alignment showed high sequence similarity among the six cp genomes, indicating that Allium cp genomes are highly conserved (Figure 5). Among the six cp genomes, the IR regions were more conserved than the LSC and SSC regions, as in other plants [56,57]. Furthermore, as found in other angiosperms, the coding regions were more conserved than the noncoding regions, and there was more variation in the intergenic spacers of the LSC and SSC regions, whereas the IR regions showed lower sequence divergence [58,59]. A. cyathophorum var. farreri had the highest sequence similarity to A. cyathophorum in the sequence identity analysis. Noncoding regions displayed varying degrees of sequence divergence in these six Allium cp genomes, including trnK-rps16, trnS-trnG, atpH-atpI, petN-psbM, trnT-psbD, trnF-ndhJ, accD-psaI, and petA-psbL. The coding regions with marked divergence included the matK, rps16, rpoC2, infA, ycf1, ndhF, and rps15 genes. The highly divergent regions found in this study may be used to develop molecular markers that improve the efficiency of phylogenetic studies within Allium.
Though the cp genome is usually well conserved, with a typical quadripartite structure, gene number, and gene order, a phenomenon known as ebb and flow exists, in which the IR region expands or contracts [60]. Expansion and contraction of the IR region are related to size variation in the cp genome and differ greatly among lineages during evolution [61,62]. We compared the IR/SC boundary regions of the six Allium cp genomes and found obvious differences at the IR/LSC and IR/SSC junctions (Figure 6). At the LSC/IRa junction, the distance of the rps19 gene from the boundary ranged from 1 to 81 bp among species, while that of the rpl22 gene ranged from 29 to 273 bp. At the LSC/IRb junction, the psbA gene lay 108 to 605 bp from the border. The IRb/SSC border was located within a coding region: the portion of the ycf1 gene located in the SSC region ranged from 4193 to 5223 bp among the six species, and the ycf1 gene of A. trifurcatum was located entirely in the SSC region. The shorter ycf1 copy crossed the IRa/SSC boundary, with 56-919 bp located in the SSC region. The ndhF gene was situated in the SSC region, at a distance of 1 to 1962 bp from the IRa/SSC boundary. These shifts in the IR/SC boundaries contribute to the differences in total length among the six cp genomes.

Hotspot Regions Identification in Subgenus Cyathophora.
We extracted the 112 genes shared by the chloroplast genomes of the six species; the nucleotide variability (Pi) ranged from 0.00041 (rrn16) to 0.08125 (infA) among these shared genes (Table S2). Seven genes (infA, rps16, rps15, ndhF, trnG-UCC, trnC-GCA, and trnK-UUU) were considered hotspot regions, with nucleotide diversity greater than 0.02. These regions can be used to develop markers for phylogenetic analysis and for distinguishing species in Allium.

Synonymous (Ks) and Nonsynonymous (Ka) Substitution Rate Analysis.
The Ka/Ks ratio is an important index for understanding the evolution of protein-coding genes: it is used to assess gene divergence rates and to determine whether positive, purifying, or neutral selection has acted. A Ka/Ks ratio > 1 indicates positive selection, Ka/Ks < 1 indicates purifying selection, and Ka/Ks close to 1 indicates neutral evolution [63]. In our study, the Ka/Ks ratio was calculated for 65 shared protein-coding genes in all six chloroplast genomes (Table S3), and the results are shown in Figure 8. The most conserved genes, with Ka/Ks ratios of 0.01 indicating strong purifying selection, were rpl2, rpl32, psaC, psbA, rpoC2, petN, psbZ, psaB, psaJ, and psbT, whereas the averaged Ka/Ks values of ycf1 and ycf2 were > 1, which suggests that they may have undergone some selective pressure among the six Allium species. Ka/Ks ratios ranging from 0.5 to 1 were found for matK, rps16, psaI, cemA, petA, and rpl20, representing relaxed selection. The majority (56 of 65 genes) had an average Ka/Ks ratio ranging from 0 to 0.49 across the pairwise comparisons, indicating that most genes were under purifying selection. In addition, four genes (matK, rpoB, petA, and rpoA) had Ka/Ks > 1 in one or more pairwise comparisons (Figure 8), suggesting that these genes may be under selective pressure of unknown origin, which is relevant to research on species evolution.
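The interpretation bands used above can be written down explicitly. The sketch below is a minimal illustration, assuming that Ka/Ks values per pairwise comparison have already been obtained (for example from KaKs Calculator output). The function names are hypothetical, and treating every value below 0.5 as purifying is a simplification of the 0 to 0.49 band reported in the text.

```python
from itertools import combinations
from statistics import mean

SPECIES = [
    "A. cyathophorum",
    "A. cyathophorum var. farreri",
    "A. spicatum",
    "A. mairei",
    "A. trifurcatum",
    "A. kingdonii",
]

# Six species give 15 unordered pairwise comparisons.
PAIRS = list(combinations(SPECIES, 2))


def classify_kaks(ratio):
    """Map a Ka/Ks ratio onto the bands used in this section."""
    if ratio > 1.0:
        return "positive"
    if ratio >= 0.5:
        return "relaxed"
    return "purifying"


def summarize_gene(pairwise_ratios):
    """Return (mean ratio, band of the mean, whether any pair exceeds 1)."""
    average = mean(pairwise_ratios)
    exceeds_one = any(r > 1.0 for r in pairwise_ratios)
    return average, classify_kaks(average), exceeds_one
```

Reporting both the average band and the per-pair flag reflects the distinction drawn in the text between genes whose average Ka/Ks exceeds 1 (ycf1 and ycf2) and genes that exceed 1 only in some pairwise comparisons (matK, rpoB, petA, and rpoA).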

Phylogenetic Analysis of Subgenus Cyathophora Depends on Chloroplast Genome.
Many previous reports have shown that cp genome sequences are valuable for constructing phylogenetic relationships and exploring evolutionary history [64,65]. To explore the phylogenetic relationships of the six Allium species, we constructed phylogenetic trees using three different methods and datasets containing twenty-one Allium species, six Lilium species, and two Asparagus species as the outgroups (Figure 9). The three datasets (complete genome sequences, IGS sequences, and all CDS sequences) analyzed with the MP, BI, and ML methods all showed the same topologies with high support (Figures S2 and S3). The results strongly supported subgenus Cyathophora as a monophyletic group comprising A. cyathophorum, A. cyathophorum var. farreri, A. spicatum, and A. mairei, with a 100% bootstrap value. Subgenus Cyathophora does not contain A. kingdonii and A. trifurcatum, and the phylogenetic tree indicates that A. cyathophorum var. farreri is directly sister to A. spicatum, which is in accordance with the results of previous molecular studies [4,6]. The sister relationship of A. cyathophorum var. farreri and A. spicatum strongly suggests that A. spicatum is closely related to subgenus Cyathophora, although it is a distinctive species with an unusual spicate inflorescence compared with other species that have capitate or umbellate inflorescences. Furthermore, Allium kingdonii was the closest relative of Allium paradoxum and Allium ursinum.

Variations among the Six Allium Species.
In this research, we assembled the complete cp genomes of six Allium species. They were highly conserved in genome structure and size, showing a typical circular DNA structure and similar cp genome lengths, ranging from 152,913 bp in A. mairei to 154,174 bp in A. cyathophorum.
The six species had identical numbers of protein-coding, tRNA, and rRNA genes. There was some expansion or contraction of the IRs among these species (Figure 6); the expansion and contraction of IR regions are related to differences in chloroplast genome size [66] and, to some extent, contribute to cp genome variation and evolution. In addition, variations in the IR/SC boundaries in the six cp genomes lead to differences in total sequence length [61]. Previous studies showed that SSRs are widely recognized as important sources of molecular markers and have been broadly applied in phylogenetic and biogeographic studies [67,68]. We surveyed the numbers and distributions of SSRs in the six Allium species: mononucleotide repeats were the most numerous SSR type, and SSRs were much more abundant in the LSC region than in the SSC and IR regions (Figure 4), showing that SSRs are unevenly distributed in the cp genome [50]. Additionally, we identified seven common genes (infA, rps16, rps15, ndhF, trnG-UCC, trnC-GCA, and trnK-UUU) with nucleotide diversity greater than 0.02 in the six Allium cp genome sequences. Among them, trnK-UUU, trnG-UCC, ndhF, and rps15 were previously known as hypervariable regions in Allium [17], and we consider that these SSRs and genes with higher nucleotide diversity can serve as useful DNA barcodes for identifying species in Allium.

Phylogenetic Relationships.
The results of the phylogenetic analysis clearly show that Allium subgenus Cyathophora is a monophyletic group comprising four species (A. cyathophorum, A. farreri, A. spicatum, and A. mairei); A. cyathophorum var. farreri was raised to species level as A. farreri in a recent study [69]. In addition, A. farreri is directly sister to A. spicatum with a 100% bootstrap value, whereas a previous study found low bootstrap support with the combined plastid dataset (trnL-F + rpl32-trnL) [4]. Currently, most phylogenetic relationships are inferred from chloroplast fragments, but a single ITS region, a single chloroplast fragment, or combined chloroplast fragments perform less well in phylogenetic analysis than the whole cp genome. We are convinced that complete chloroplast genomes have more power to resolve the phylogenetic issues in subgenus Cyathophora. In previous studies, phylogenetic problems in many plants have been successfully resolved using complete cp genome sequences [18,19,70], and recently published articles on Allium have also resolved its phylogenetic relationships well [17,71]. Although A. farreri and A. spicatum are clearly different in morphology, with A. spicatum having a distinctive spicate inflorescence and A. farreri a hemispheric umbel, our results showed that A. farreri is directly sister to A. spicatum, in accordance with Li et al. [6]. According to a previous study, this difference may imply that the umbel inflorescence was replaced by a spicate inflorescence as an adaptation to the harsh environment [6]. The phylogenetic tree revealed that A. cyathophorum is more closely related to A. spicatum and A. farreri than to A. mairei. Furthermore, subgenus Cyathophora does not contain A. kingdonii and A. trifurcatum; A. kingdonii was the closest relative of A. paradoxum and A. ursinum, which is consistent with previous studies [4,6].
Overall, our study constructed a reliable phylogeny of subgenus Cyathophora using complete cp genome data.
Synonymous nucleotide substitutions occur at a higher frequency than nonsynonymous substitutions, and thus Ka/Ks ratios are < 1 in most genes [9,74]; our results agree with this. The Ka/Ks ratios were usually specific to particular regions and genes (Figure 8). Most conserved genes (56 of 65 genes) had an average Ka/Ks value ranging from 0 to 0.49 across the fifteen comparison groups, indicating that most genes were under purifying selection. In contrast, the average Ka/Ks values of the ycf1 and ycf2 genes were > 1 in the fifteen comparison groups, suggesting that some selective pressure may act on them in the six Allium species. Previous studies have shown that ycf1 and ycf2 are two large open reading frames; gene knockout experiments in tobacco showed that they play an important role in maintaining a healthy cell [75]. Hu et al. [76] suggested that plants have a variety of adaptation strategies in response to unforeseen environmental conditions. Recent studies of Allium species also suggested that selective pressure on chloroplast genomes plays an important role in the adaptation and evolution of Allium species [17,71]. In our field investigations, the species of subgenus Cyathophora grew on slopes or grasslands at altitudes ranging from 2700 m to 4800 m. The elevated Ka/Ks ratios observed for some genes in the six Allium species may be related to their specific living environments. Moreover, four genes (matK, rpoB, petA, and rpoA) had Ka/Ks > 1 in one or more pairwise comparisons (Figure 8, Table S3), and among these genes, rpoA has also undergone positive selection in species of Annonaceae [77].
A previous study demonstrated that rpoA encodes the α subunit of the plastid-encoded RNA polymerase (PEP), which is responsible for the expression of most photosynthesis-related genes [78]. It is generally believed that low temperature and strong ultraviolet radiation are not conducive to effective photosynthesis; therefore, plants that survive and reproduce at high altitudes need a special photosynthetic protection strategy [79,80]. In this study, the populations of subgenus Cyathophora are mainly distributed in the Qinghai-Tibet Plateau and its adjacent high-altitude regions [2]. Therefore, we speculate that the positive selection on these genes may be related to differences among their optimal growth environments.

Conclusions
Here, we sequenced, assembled, and annotated six chloroplast genomes of Allium with high-throughput sequencing technology. The gene contents and orders of the cp genomes were highly conserved, and all of the cp genomes had a quadripartite structure. Repeated sequences and SSRs are useful sources for developing new molecular markers. Codon usage analyses showed that some amino acids of the six species had distinct codon usage preferences, and understanding codon usage bias helps in studying the evolutionary process. We also discovered seven highly variable common genes that can be used to develop markers for phylogenetic analysis and for distinguishing species in Allium. The Ka/Ks analysis indicated that some selective pressure may act on several genes in the chloroplast genomes of the six Allium L. species. The maximum likelihood (ML), BI, and MP phylogenetic results clearly showed that subgenus Cyathophora comprises the four assembled species A. cyathophorum, A. cyathophorum var. farreri, A. spicatum, and A. mairei, and that A. cyathophorum var. farreri is most closely related to A. spicatum. This study will not only provide insights into the cp genome characteristics of species in subgenus Cyathophora but also supply useful genetic resources for phylogenetic analysis of the genus Allium.