Comparative Plastome Analysis of Three Amaryllidaceae Subfamilies: Insights into Variation of Genome Characteristics, Phylogeny, and Adaptive Evolution

In the latest APG IV classification system, Amaryllidaceae is placed in the order Asparagales and includes three subfamilies: Agapanthoideae, Allioideae, and Amaryllidoideae, which include many economically important crops. With the development of molecular phylogenetics, research on the phylogenetic relationships of Amaryllidaceae has become more convenient. However, a comparative analysis of Amaryllidaceae at the whole chloroplast genome level is still lacking. In this study, we sequenced 18 Allioideae plastomes and combined them with publicly available data (a total of 41 plastomes), including 21 Allioideae species, 1 Agapanthoideae species, 14 Amaryllidoideae species, and 5 Asparagaceae species. Comparative analyses covered basic characteristics of genome structure, codon usage, repeat elements, IR boundaries, and genome divergence. Phylogenetic relationships were inferred using single-copy genes (SCGs) and ribosomal internal transcribed spacer (ITS) sequences, and the branch-site model was employed for the positive selection analysis. The results indicated that all Amaryllidaceae species showed a highly conserved, typical quadripartite structure. The GC content and five codon usage indexes in Allioideae species were lower than those in the other two subfamilies. Bayesian and ML phylogenies based on SCGs strongly support the monophyly of the three subfamilies and the sister relationships among them. In addition, positively selected genes (PSGs) were detected in each of the three subfamilies. Almost all genes with significant posterior probabilities for codon sites were associated with self-replication and photosynthesis. Our study investigated the three subfamilies of Amaryllidaceae at the whole chloroplast genome level and suggested a key role of selective pressure in the adaptation and evolution of Amaryllidaceae.

Currently, more than 1,800 species have been recorded in Amaryllidaceae [14]; among them, the subfamily Allioideae occupies 13 genera and more than 900 species [15], which are widely distributed in the Northern Hemisphere and include many economically important crops, such as garlic, leek, onion, and shallot [16,17]. The subfamily Amaryllidoideae also has approximately 900 species, which include many famous ornamental plants, such as Crinum asiaticum, Clivia miniata, and Hippeastrum rutilum [10,18]. Agapanthoideae is a small subfamily of Amaryllidaceae, and only approximately 10 species have been reported, which are also famous ornamental cultivars and are widely cultivated worldwide. For the significant edible, medicinal, and ornamental values of species in Amaryllidaceae, research on these species has never stopped, which also provides valuable information for us to perform further research.
Beyond the phylogenetic studies conducted on the three subfamilies of Amaryllidaceae, genome and transcriptome data were also used to perform evolutionary and adaptive analyses on Amaryllidaceae species in recent years [11][12][13][19][20][21]. Complete plastome sequences, which have a highly conserved genome structure and gene content and a low substitution rate, offer effective approaches for investigating the phylogeny, species divergence, and adaptive evolution of plant species [12,[22][23][24][25][26]. In particular, the substitution rates of the plastome are several times lower in the inverted repeat (IR) than SSC (small single-copy) regions [11,23,[27][28][29]. We found that species from Allioideae exhibit lower GC content than relatives and lost some genes (e.g., rps2). Further studies suggested that 27 genes of Amaryllidaceae species possess positively selected sites (e.g., matK, petD, and rbcL), and 10 of them are owned by Allioideae species [12]. Of course, some Amaryllidoideae and Agapanthoideae plastome sequences have been released [30,31]. However, most of the public chloroplast genomes are annotated with different methods, which will result in more or less annotation errors, and most previous studies have focused on Allioideae. No studies have investigated the difference in plastome structure and adaptive evolution among the three subfamilies.
In this study, a total of 36 chloroplast genomes were collected and reannotated using a uniform approach, including 21 Allioideae species (18 of which were sequenced and assembled here), one Agapanthoideae species, and 14 Amaryllidoideae species. Comparative plastome analyses were performed, and our objectives were to (1) gain insights into the plastome structure features of Amaryllidaceae; (2) investigate the genome variation among the three subfamilies; (3) reconstruct the phylogenetic relationships of Amaryllidaceae species; and (4) explore adaptive evolution based on selective analysis. Our studies will contribute to a comprehensive understanding of plastome evolution in Amaryllidaceae.

Materials and Methods
2.1. Taxon Sampling. In this study, we collected 41 plastid genomes representing three subfamilies of Amaryllidaceae and an outgroup of Asparagaceae: 21 Allioideae species, 1 Agapanthoideae species, 14 Amaryllidoideae species, and 5 Asparagaceae species (GenBank accessions: Supplementary Table 2). Of the 41 plastomes, 18 were newly assembled in this study; for these, fresh leaves were collected from the wild and then desiccated and stored in silica gel (Supplementary Table S1). Total genomic DNA was extracted from the silica-dried leaves with a modified CTAB method [32]. Voucher specimens were deposited in the Sichuan University Herbarium (SZ). In addition, we downloaded 38 ITS sequences of Amaryllidaceae and Asparagaceae species from GenBank (GenBank accessions: Supplementary Table 3).
2.2. Plastome Genome Sequencing, Assembling, and Annotation. Total genomic DNA was sent to Novogene Technologies, Inc. (Beijing, China) for genome library construction and sequencing. The sequencing library was generated using the NEB Next® Ultra™ DNA Library Prep Kit for Illumina (NEB, United States) according to the manufacturer's recommendations, and index codes were added to each sample. Sequencing was executed using an Illumina Nova-Seq 2500 sequencer (Illumina, San Diego, CA, United States). Then, the plastomes were de novo assembled by NOVOPlasty v2.7.1 [33] with clean data. To minimize the impact of distant starting seed sequences on the plastomes, we used a consistent seed sequence (A. cepa, GenBank No. KF769495) within species as a reference sequence. The bases or sequences that could not be confirmed were modified by designing primers for PCR amplification and performing first-generation sequencing. Gene annotations and IR region searches were undertaken using PGA software [34]. Three chloroplast genomes (A. cepa, A. sativum, and A. chinense) were set as reference sequences, and the results were adjusted manually in GENEIOUS R11 [35] based on comparisons with homologous genes of other species' plastomes. Circular plastome maps were drawn using the online program OGDRAW [36].

2.3. Sequence Basic Information and Sequence Divergence. Basic statistics for all chloroplast sequences were calculated using GENEIOUS R11, including the length and GC content of the genome sequences and the number of CDSs and genes in each category. Using A. listera as the reference, mVISTA [37] was used to construct and visualize the whole-genome alignment of the 36 plastomes.
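As an illustration of the length and GC content statistics described above, a minimal Python sketch is given here. It is not the GENEIOUS procedure used in the study; the function name `gc_content` and the choice to ignore ambiguous bases (e.g., N) are assumptions made for this example.

```python
def gc_content(seq: str) -> float:
    """Return the GC percentage of a nucleotide sequence.

    Assumption for this sketch: ambiguous symbols (N, gaps, etc.) are
    ignored, so the denominator counts only A, C, G, and T.
    """
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    acgt = gc + seq.count("A") + seq.count("T")
    if acgt == 0:
        raise ValueError("sequence contains no A/C/G/T bases")
    return 100.0 * gc / acgt


if __name__ == "__main__":
    toy = "ATGCGCATTA"  # hypothetical toy sequence, not real plastome data
    print(len(toy), round(gc_content(toy), 1))
```

Applied to a whole plastome, or separately to its coding and noncoding parts, the same function would give values comparable to the ranges reported in Table 1.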

2.4. Contraction and Expansion of IRs and Repeat Element Analysis. The program IRscope (https://irscope.shinyapps.io/irapp/) [38] was used to compare the boundaries between the IR and SC regions of the 36 species, and the results were then corrected manually. The Perl script MISA [39] was used to count the plastid SSRs, with the minimum repeat thresholds set as follows: 10 repeats for mononucleotides, 5 for dinucleotides, 4 for trinucleotides, and 3 for tetranucleotides, pentanucleotides, and hexanucleotides. We used the online REPuter program [40] to identify repeat sequences, including forward, palindromic, reverse, and complementary repeats. The parameters were set as follows: (1) repeat sizes longer than 30 bp; (2) sequence identity between the two repeat copies exceeding 90%; and (3) Hamming distance = 3. All overlapping repeat sequences in the results were removed.
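The SSR thresholds above can be illustrated with a short regular-expression sketch in Python. This is not the MISA script itself and does not reproduce its handling of compound SSRs; the function name `find_ssrs` and the rule of reporting each tract under its shortest motif are assumptions made for this example.

```python
import re

# Minimum repeat counts per motif length, as stated in the text:
# mono 10, di 5, tri 4, tetra/penta/hexa 3.
MIN_REPEATS = {1: 10, 2: 5, 3: 4, 4: 3, 5: 3, 6: 3}


def find_ssrs(seq):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs.

    start is 0-based. Motifs that are themselves repeats of a shorter
    unit (e.g. 'ATAT' or 'AAA') are skipped, so each tract is reported
    only under its shortest motif.
    """
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in MIN_REPEATS.items():
        # One motif copy followed by at least (min_rep - 1) further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            redundant = any(
                motif == motif[:k] * (unit_len // k)
                for k in range(1, unit_len)
                if unit_len % k == 0
            )
            if redundant:
                continue
            hits.append((m.start(), motif, len(m.group(0)) // unit_len))
    return sorted(hits)


if __name__ == "__main__":
    toy = "GGC" + "A" * 12 + "CCG" + "AT" * 6 + "GG"  # hypothetical sequence
    for start, motif, count in find_ssrs(toy):
        print(start, motif, count)
```

Counting the returned motifs by length gives the mono- to hexanucleotide proportions of the kind reported for each subfamily.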

BioMed Research International

2.5. Indices of Codon Usage. The protein-coding genes from the 36 plastomes were extracted, and all overlapping genes were removed for codon analysis. The final dataset included 65 consensus protein-coding genes for each species. Six values were used to estimate the degree of codon preference: relative synonymous codon usage (RSCU), codon adaptation index (CAI), codon bias index (CBI), effective number of codons (ENC), GC content of synonymous third codon positions (GC3s), and frequency of optimal codons (Fop) [41]. All the above values were calculated by the CodonW v1.4.2 program [42], and the heat map of all RSCUs was drawn using TBtools [43].
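To make the RSCU index concrete, the following Python sketch computes it under the usual definition: the observed count of a codon divided by the count expected if all synonymous codons for that amino acid were used equally. It is an illustration only and not the CodonW implementation; the function name `rscu` and the use of the standard codon assignments are assumptions made for this example.

```python
from collections import Counter
from itertools import product

# Standard genetic code in TCAG order ('*' marks stop codons).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def rscu(cds_list):
    """Return {codon: RSCU} pooled over a list of in-frame coding sequences.

    Stop codons and codons with ambiguous bases are ignored. Amino acids
    that never occur in the input are left out of the result.
    """
    counts = Counter()
    for cds in cds_list:
        cds = cds.upper()
        for i in range(0, len(cds) - len(cds) % 3, 3):
            codon = cds[i:i + 3]
            if CODON_TABLE.get(codon, "*") != "*":
                counts[codon] += 1

    # Group synonymous codons by amino acid.
    families = {}
    for codon, aa in CODON_TABLE.items():
        if aa != "*":
            families.setdefault(aa, []).append(codon)

    values = {}
    for codons in families.values():
        total = sum(counts[c] for c in codons)
        if total == 0:
            continue
        for c in codons:
            # observed / (total / number of synonymous codons)
            values[c] = counts[c] * len(codons) / total
    return values


if __name__ == "__main__":
    toy_cds = ["ATGTTATTATTGCTT"]  # hypothetical sequence: M L L L L
    result = rscu(toy_cds)
    print(result["TTA"], result["ATG"])
```

Because methionine (ATG) and tryptophan (TGG) each have a single codon, their RSCU is always 1 whenever they occur, which is why they show no bias.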
2.6. Phylogenetic Analyses. We reconstructed the phylogenetic relationships of Amaryllidaceae species based on two datasets: a 41-taxon plastome dataset and a separate dataset comprising 38 nuclear ITS sequences. For the plastomes, all shared single-copy genes (SCGs) were extracted from the 41 taxa and aligned using MAFFT [44]. We adjusted all alignments manually in GENEIOUS R11 [34] and concatenated them into a plastid supermatrix using PhyloSuite [45]. The ITS sequences were aligned with MAFFT [44] and then adjusted manually in GENEIOUS R11 [34]. Maximum likelihood (ML) analyses of the two datasets were performed using RAxML v7.2.8 [46] under the GTRGAMMA model with 1000 bootstrap replicates. Bayesian inference (BI) was performed on the two datasets using MrBayes v3.2.7 [47] with the GTR+G substitution model. The Markov chain Monte Carlo (MCMC) algorithm was run for 2 × 10^7 generations, with one tree sampled every 1000 generations. Convergence of the MCMC was assessed with the average standard deviation of split frequencies, and stationarity was considered to be reached when this value fell below 0.01 and ESS > 200. We discarded the first 25% of the trees as burn-in and used the remaining trees to generate the 50% majority-rule consensus tree.
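The sampling arithmetic and the 50% majority-rule step described above can be sketched as follows. This is a simplified illustration and not MrBayes code: trees are represented as sets of clades (each clade a frozenset of taxon labels), the initial state at generation 0 is not counted as a sample, and the function names are hypothetical.

```python
from collections import Counter


def retained_samples(generations, sample_every, burnin_fraction):
    """Return (total, discarded, kept) numbers of sampled trees per run."""
    total = generations // sample_every
    discarded = int(total * burnin_fraction)
    return total, discarded, total - discarded


def majority_rule_clades(sampled_trees, threshold=0.5):
    """Return {clade: frequency} for clades found in more than `threshold`
    of the sampled trees. Each tree is a set of frozenset clades."""
    counts = Counter(clade for tree in sampled_trees for clade in tree)
    n = len(sampled_trees)
    return {clade: c / n for clade, c in counts.items() if c / n > threshold}


if __name__ == "__main__":
    # Settings taken from the text: 2 x 10^7 generations, sampling every
    # 1000 generations, first 25% of trees discarded as burn-in.
    print(retained_samples(2 * 10**7, 1000, 0.25))

    # Three hypothetical post-burn-in trees over taxa A, B, C, D.
    trees = [
        {frozenset("AB"), frozenset("ABC")},
        {frozenset("AB"), frozenset("ABD")},
        {frozenset("AC"), frozenset("ABC")},
    ]
    for clade, freq in majority_rule_clades(trees).items():
        print(sorted(clade), round(freq, 2))
```

Under these settings each run yields 20,000 sampled trees, of which 5,000 are discarded and 15,000 contribute to the consensus; the clade frequencies among the retained trees correspond to the posterior probabilities (PP) shown on the BI tree.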

2.7. Positive Selection Pressure Analyses. The single-copy CDSs of all 36 species were extracted and aligned using MUSCLE v3.6 [48]. The codon alignments were trimmed with TRIMAL v1.2 [49], and the final alignments were used for the positive selection analyses. The optimized branch-site model and the Bayesian empirical Bayes (BEB) method [50][51][52] were used to perform the analysis. To identify genes under positive selection among the three subfamilies, the species of each subfamily were set as the foreground branch and compared with the other two subfamilies under the optimized branch-site model. The ratio (ω) of the nonsynonymous substitution rate to the synonymous substitution rate (Ka/Ks) was calculated using the PAML v4.8 package with the branch-site model [51]. The likelihood ratio test (LRT) was used to compare the fit of the competing models [53]. The BEB method was used to identify sites under positive selection (posterior probability ≥ 95%). Values of ω < 1, ω = 1, and ω > 1 were taken to indicate negative selection, neutral evolution, and positive selection, respectively [54]. A gene showing positive selection with an LRT p value < 0.05 was considered a positively selected gene (PSG) [40].
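The LRT and the ω classification described above can be sketched in Python as follows. The text does not state which reference distribution was used; this sketch assumes the commonly used chi-square distribution with 1 degree of freedom, and the log-likelihood values in the example are hypothetical numbers, not results from this study.

```python
import math


def branch_site_lrt(lnl_null, lnl_alt):
    """Return (test statistic, p value) for a likelihood ratio test.

    The statistic is 2 * (lnL_alt - lnL_null). Assumption of this sketch:
    it is compared against a chi-square distribution with 1 degree of
    freedom, whose survival function is erfc(sqrt(x / 2)).
    """
    stat = max(0.0, 2.0 * (lnl_alt - lnl_null))
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


def classify_omega(omega, tol=1e-9):
    """Label an omega (Ka/Ks) value following the convention in the text."""
    if abs(omega - 1.0) <= tol:
        return "neutral"
    return "positive" if omega > 1.0 else "negative"


if __name__ == "__main__":
    # Hypothetical log-likelihoods of the null and alternative models.
    stat, p = branch_site_lrt(lnl_null=-12345.67, lnl_alt=-12342.10)
    print(round(stat, 2), round(p, 4), classify_omega(2.5))
```

Under the criterion in the text, a gene would be called a PSG only when the p value returned here is below 0.05 and the foreground ω exceeds 1; sites are then screened separately by their BEB posterior probabilities.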

2.8. Ancestral Character-State Reconstructions. We reconstructed two vegetative features, namely, (i) bulb shape and (ii) leaf shape. All morphological information came from field observations, specimen studies, or the literature [55][56][57][58][59][60]. Details of the two characters are provided in Supplementary Table 11. The RASP v4 software [61] was used to reconstruct the ancestral traits of the leaf and bulb types. Amaryllidaceae bulbs were divided into three types, namely, (i) spherical, (ii) cylindrical, and (iii) ovoid, coded as A, B, and C, respectively. The leaves were divided into six types, namely, (i) ribbon, (ii) wide bar, (iii) wide line, (iv) oval, (v) bar, and (vi) lanceolate, coded as a-f, respectively (Supplementary Table 11). The MCMC chains were run for 100 million iterations and sampled every 10,000 iterations. The first 50,000 iterations were discarded as burn-in.

The GC contents of plastomes in Allioideae, Agapanthoideae, and Amaryllidoideae were 36.8-37.1%, 37.5%, and 37.7-38.0%, respectively. The gene number in the three subfamilies ranged from 131 to 137. The detailed statistical information of the plastome sequences is summarized in Table 1.

Contraction and Expansion of IRs and Sequence Divergence. We found that the chloroplast genomes of Amaryllidaceae plants were relatively conserved at the IR boundaries but that the positions of the four regions of the chloroplast genome varied among subfamilies and species. From Figure S1, we found that in the chloroplast genomes of all species in the three subfamilies, the junction between the LSC region and the IRa region (LR line) generally traversed the rpl22 gene or the intergenic region between the rpl22 gene and the rps19 gene. The junction between the IRa and the SSC (RS line) was located in the region of the ycf_like gene in the genomes of all subfamily species (except Narcissus poeticus), but its position on the pseudogene differed.
In addition, we also found that there were a certain number of species in the three subfamilies that existed  Figure S1). We used mVISTA to visualize the chloroplast genome sequence diversity of the 36 species. The results showed that species from different subfamilies had obvious differences in both the coding and noncoding regions of the chloroplast genome (Supplementary Figure S2). When comparing the chloroplast genomes of different species in the same subfamily, we found a high degree of similarity across the whole sequences.

Repeat Element Analysis and Codon Usage. SSRs were detected in the three subfamilies (Supplementary Table S4). There were 1377 simple sequence repeats (SSRs) detected in the 21 Allioideae species, and the most abundant type was mononucleotide repeats (65.6%), followed by dinucleotides (17.1%), tetranucleotides (12.8%), trinucleotides (2.8%), pentanucleotides (1.0%), and hexanucleotides (0.7%). This composition was similar to that of the 717 SSRs detected in Amaryllidoideae, whereas only three types of repeats were found in Agapanthoideae.
We analyzed the CDSs of the 36 plastomes separately, and six values were used to estimate the degree of codon preference. The RSCU values for all codons are shown as heat maps (Figure 4), which showed that most codon usage preferences remained at a consistent level in the three subfamilies, approximately half of the codons were used more frequently (RSCU > 1), and only two codons (ATG and TGG) had no bias (RSCU = 1). After statistical analysis, the other five parameters were displayed with box plots (Figure 5). We found that these five parameters differed significantly among the three subfamilies and that Allioideae had the lowest values for all five parameters, followed by Agapanthoideae and Amaryllidoideae (Supplementary Table S7).

Phylogenetic Relationships.
We refer to the tree built with the chloroplast data as the CP tree. The CP trees reconstructed using the two methods (ML and BI) were topologically consistent with each other (Figure 6), and well-supported branches differed little in terms of ML bootstrap support values (BS) or BI posterior probabilities (PP). The shared SCG data strongly supported the monophyly of each subfamily (Figure 6). Amaryllidoideae was supported as the sister of Allioideae, and Agapanthus coddii from Agapanthoideae was strongly supported as sister to Allioideae plus Amaryllidoideae (Figure 6). The ITS tree (Figure 7 and Supplementary Figure 3) was roughly comparable to the CP tree regarding subfamilies and intergeneric relationships but was weakly supported at the interspecific level and showed some inconsistencies.
3.6. Ancestral Character-State Reconstructions. Specific information and coding for the two traits of Amaryllidaceae species are presented in Supplementary Table 11, and the trait reconstructions are presented in Figure 8. For bulbs, the results from RASP proposed one possible evolutionary route for Amaryllidaceae bulbs. The most recent common ancestor (MRCA) of Amaryllidaceae probably had spherical, ovoid, and cylindrical bulbs at the

Discussion
Currently, plastome data have been used to evaluate genetic variation in different orders, such as Pilostyles, Salvia, Leguminosae, and Dipsacales [45,[62][63][64]. The plastome sizes of all tested species varied from 152748 to 160099 bp, which was consistent with the length of most angiosperms [65]. It is striking that the plastome length of Amaryllidoideae and Agapanthoideae species was significantly longer than that of Allioideae species. Further statistics and comparison revealed that the difference in plastome length mainly results from the noncoding region length variation of LSC and SSC regions (Table 1), which is shorter in Allioideae species than in Amaryllidoideae and Agapanthoideae species. The results were in line with the widespread conservation that is characteristic of plastid genes (coding regions), especially photosynthesis-related genes [66], and has been reported in other plants [67]. Additionally, Amaryllidoideae species had the highest GC content not only in the whole chloroplast genome but also in the coding region and the noncoding region, followed by Agapanthoideae and Allioideae. Two reasons may explain this phenomenon: the selection of translation efficiency may result in a lack of G and C in the plastome [68,69], and neutral mutation processes such as AT-biased gene conversion and AT-mutation pressure may cause lower GC content [70][71][72]. Similar results have been reported in other Allioideae species [11]. Large repeat sequences play an important role in sequence divergence and promote plastome rearrangement [73][74][75]. Here, we detected 1,199 long repeat sequences in the three subfamilies and found that the number of long repeat types was similar. Further analyses showed that most of the repeats are 30-45 bp, and the palindromic and forward types accounting for the largest proportion were similar to many other plastomes [76][77][78]. 
SSRs are considered to be potential resources in evolutionary studies and are effective in species discrimination and population genetic analyses exploring the biogeography of allied taxa [79][80][81][82][83][84]. From the SSR results, we found that some repeat types were specific to Amaryllidoideae species, such as ATT, TTCT, CGAAA, and TTTCG, and others occurred in Allioideae species, for example, TTA, ATTT, CGAT, and TAAA (Figure 2). These special SSRs can be used for the identification and classification of species within the Amaryllidaceae. Many SSRs have been detected and used for species identification and delimitation (e.g., Lycoris, Psidium, and Asparagus) [85][86][87]. Therefore, we believe that the repeat sequences detected in this study will provide useful information for future studies of Amaryllidaceae.

Figure 6: The phylogenetic relationships of 36 Amaryllidaceae species based on the whole plastid genomes. The phylogenetic tree is inferred from Bayesian inference (BI) and maximum likelihood (ML) analyses. Inconsistencies between PP and BS are marked separately at each node. Unmarked nodes have maximum support in both analyses.
Codon usage is closely related to gene expression and natural selection pressure [88,89]. We found that in all three subfamilies 30 codons were used frequently (RSCU > 1) and that all biased codons ended with A or T. Codons with a higher AT content are usually preferred in plastomes, and the tendency to use A/T in the third codon position is more pronounced than that for G/C [24,90,91]. Codons that encode leucine were the most numerous, and the order of codon bias was TTA > CTT > TTG > CTA > CTC > CTG, which was consistent with the results found in other plants, such as Ligusticum and Geraniaceae [78,92]. The codon GCA was used less in Amaryllidoideae species than in the other two subfamilies, while TCC was used more in Amaryllidoideae species (Figure 4). The five parameters describing codon usage bias were lowest in Allioideae species, while Amaryllidoideae species had the highest values, followed by Agapanthoideae (Figure 5). The calculated values revealed that the diverse codon usage patterns of different species may also be helpful for species identification and classification [93,94].
Appropriate and multiple gene combinations are particularly important and efficient for accurate phylogenetic estimation. Nuclear ribosomal DNA regions (e.g., ITS and ETS), many cpDNA fragments (e.g., rps16, matK, and trnL-trnF), and chloroplast genomes have been used to infer the phylogeny of plants [12,13,17,95,96]. In this study, ML analysis and Bayesian inference were performed with two datasets (chloroplast SCGs and nrDNA ITS) to explore and reconstruct the phylogenetic relationships of Amaryllidaceae species. Our plastome analyses inferred well-supported relationships among the subfamilies of Amaryllidaceae (Figures 6 and 7). The monophyly and sister relationships of the three subfamilies were reconfirmed [12,17,97]. According to previous ITS-based studies, the Allium (Allioideae) species were divided into three evolutionary lineages (clade 1, clade 2, and clade 3) [17]. Here, our plastome phylogenomic analysis based on the SCGs provided strong support for the monophyly of Allium (Allioideae) and the other Amaryllidaceae subfamilies (Figures 6 and 7, Supplementary Figure 3), which was in agreement with previous studies [12,13,17,96,98]. In addition, we detected new species relationships within the three evolutionary lineages with high support values, including Allium fasciculatum in the first clade and Allium funckiifolium, Allium listera, Allium ovalifolium var. cordifolium, and Allium ovalifolium var. leuconeurum in the second clade. Previous studies performed the phylogenetic analysis of Amaryllidoideae using limited ITS or matK sequences and detected weaker support for phylogenetic relationships [99,100].

Figure 7: The phylogenetic relationships of 36 Amaryllidaceae species based on ITS. The phylogenetic tree is inferred from Bayesian inference (BI), and the posterior probabilities (PP) are marked separately at each node. Bars indicate the subfamily to which each species belongs; the color of each bar is consistent with the species color.
Our plastome analysis based on SCGs revealed well-supported generic relationships inside Amaryllidoideae. Relationships among the five genera of Amaryllidoideae are well supported and generally in line with previous studies [95,97,[99][100][101][102]. Our ITS tree (Figure 7 and Supplementary Figure 3) provided strongly supported relationships among the subfamilies of Amaryllidaceae and was highly consistent with the CP trees (Figure 6). However, the bootstrap support values of the ML tree for some genera and species were significantly lower than the posterior probability values of the BI tree. This may result from the use of different statistical inference methods. Relevant studies have shown that the BI method is more efficient, that node support in BI analyses is higher than the corresponding support from other algorithms, and that the BI method works better for sequences of closely related species [103][104][105]. All of the above results may indicate that the species relationships of Amaryllidaceae are complex. Although we detected some new species relationships with high support, relationships among species of Amaryllidaceae are still not well resolved (especially for species in Lycoris and in the third clade of Allioideae). In general, our plastome phylogenetic analysis reconstructed a well-supported tree for Amaryllidaceae and contributed to a better understanding of the Amaryllidaceae phylogeny. More extensive geographic information and genomic sampling are required for further investigation in the future.

We conducted a further selective pressure analysis on the three subfamilies. The 60 screened protein-coding genes of each subfamily, which may have evolved to adapt to changing environmental conditions, were used to estimate the selective pressures.
Several genes were found to have significant posterior probabilities for codon sites under the BEB test in each of the three subfamilies, although the test for positive selection was not significant for any gene (p value > 0.05), which may suggest that they were under purifying selection (Table 3 and Supplementary Tables S8 and S9). This result reflects the typical evolutionary conservation of plant plastid genes [106,107]. Previous research has shown that codon sites with higher posterior probability can be regarded as positively selected sites, which means that genes possessing positively selected sites may have evolved under positive selection pressure [50]. Based on these results, it is worth noting that seven genes with positively selected sites are related to photosynthesis in Allioideae, and eight and four such genes were detected in Amaryllidoideae and Agapanthoideae, respectively. Through further analysis, we found that these genes are associated with photosystem II subunits, subunits of NADH-dehydrogenase, subunits of the cytochrome b/f complex, and subunits of ATP synthase (Table 2). Photosystem II is the site of the photosynthetic light reaction in plants, where integral membrane protein complexes use light energy to produce the high-energy carriers ATP and NADPH [108][109][110]. Subunits of ATP synthase, subunits of NADH-dehydrogenase, and subunits of the cytochrome b/f complex are necessary for the generation of ATP in the electron transport chain [108,[111][112][113]. The genes mentioned above are all necessary for photosynthesis and participate in important physiological processes of plants [114]. These photosynthesis-related PSGs were found in all three subfamilies, which may be closely related to the widespread distribution of Amaryllidaceae species on Earth [1].
Species of the three subfamilies are distributed in various environments, such as low-temperature areas [58], temperate humid forest areas [15], hot arid and semiarid areas [115], and tropical grassland climate areas [116]. The requirement for sufficient light for photosynthesis might have exerted strong selective forces on these genes, and in turn, these positively selected genes might help species of the three subfamilies adapt better to various environments. This phenomenon was also found in the genera Siraitia and Urophysa [20,117].
In addition, we also detected a series of genes related to self-replication in each subfamily. Plastid protein synthesis plays an essential role in plant development [118,119]. Among the genes with positively selected sites, the rpoA gene has the most such sites in Agapanthoideae, suggesting that rpoA may play a pivotal role in the adaptive evolution of Agapanthoideae species. Studies have shown that plastid chromosomes encode four RNA polymerase genes, designated rpoA, rpoB, rpoC1, and rpoC2 [120]. Notably, half of them (rpoA and rpoC2) were detected in the selective pressure analysis within Agapanthoideae species. Both have been reported in Annonaceae and Rehmannia [121,122]. The rpoA and rpoC2 genes encode subunits α and β″ of plastid-encoded plastid RNA polymerase (PEP), respectively, which is believed to be a vital protein responsible for most photosynthetic gene expression [123]. In addition, the RNA polymerase β″ encoded by rpoC2 may play an important role in the regulation of developmental pollination [117,124]. The finding of these two genes under selective pressure indicated that they might be essential for growth and reproduction in Agapanthoideae. The clpP gene encodes a Clp protease and belongs to a gene family with six members (clpP1-clpP6) in Arabidopsis of the mustard family Brassicaceae [125]. It was only found under positive selection pressure in Agapanthoideae. The gene is detected in the chloroplast genome of all higher plants and is involved in various biological processes, ranging from plant growth changes to stress tolerance [125,126]. It has been suggested that the clpP gene is essential for plant cell viability [127,128], and the rapid evolution of the clpP gene in Agapanthoideae species may help them adapt to their environment [129]. The accD gene, which encodes a subunit of acetyl-CoA carboxylase, was only found in Allioideae, with one positively selected site.
Plastid accD is essential for plant leaf development or viability and fitness and has deep effects on leaf longevity and seed yield [130,131]. It has been reported that the accD gene shows an accelerated rate of evolution [65,132,133] and may be a useful marker for plastid evolution [134][135][136]. Allioideae species have many types of leaf morphology and physiological characteristics to adapt to different environments [96], and the accD gene may play an indispensable role in their adaptation process. We found the ccsA gene with one positively selected site in Amaryllidoideae, which encodes a protein that is required for heme attachment to C-type cytochrome and may be closely related to photosynthesis [137,138]. It is generally present in land plants, while it is absent from the plastome of Physcomitrella patens [139].
In previous studies, most of the genes mentioned above have been reported to be under positive selection pressure [11,[140][141][142]. Species in Amaryllidaceae are mostly characterized by tunicate bulbs, rhizomes, or tubers and narrow linear basal leaves, but in different environments, many Amaryllidaceae species have evolved very different leaf and rhizome morphologies [98,143]. The bulb and leaf are important taxonomic identifiers of Amaryllidaceae species, and they are also vital evidence and tools for species adaptation to various habitats [59,60]. We reconstructed the evolution of bulb traits in Amaryllidaceae. The results show that their MRCA may have had several types of bulbs and that the bulb type then diverged in the three subfamilies (Figure 8). Allium L. (Allioideae) is one of the largest genera of monocotyledons and is distributed in a variety of habitats including cliffs, shrubs, forests, and high-altitude grassy slopes [1,15]. They usually embed their entire bulbs between stone crevices and bush roots to hold themselves and absorb water [96]. Allium (Allioideae) species are dominated by slender cylindrical bulbs and usually have well-developed root systems, which may help them anchor themselves more easily (Figure 8). Through reconstructing the leaf traits, we found that the leaves of Agapanthoideae and Amaryllidoideae are generally differentiated into ribbons, while the leaves of Allioideae are mainly differentiated into two types, bar-shaped and oval. We found that all species with oval leaves belonged to sect. Anguinum (marked by red shading), which were almost exclusively found in moist understory habitats [15,96]. We speculate that the wide leaves may help Anguinum species utilize the weak light in the forest and transpire more efficiently and thus perform better photosynthesis [144][145][146].
These characteristics may be the key traits that will help them adapt to various harsh environments, such as severe cold, drought, saline soil, and high altitude, and enable them to produce and maintain a high level of plant diversity [147][148][149]. We suggest that these ecological characteristics of Amaryllidaceae reflect their remarkable adaptability to various environments due to diverse positive selection pressure on genes in the plastid, while most PSGs detected may play critical roles in the adaptation of plants in the Amaryllidaceae during the evolution process. Therefore, it is necessary to further investigate the important role of positive selection in the plastid genes of Amaryllidaceae species.

Conclusions
In this study, we investigated 36 complete chloroplast genomes of species from the three Amaryllidaceae subfamilies. All chloroplast genomes exhibited a typical quadripartite structure and had highly similar genomic structures. SSRs, long repeats, and genes with positively selected sites were identified across the chloroplast genomes, which may be helpful for species identification or classification and can also be used as potential markers for phylogenetic investigations and population genetics studies. The monophyly of the three subfamilies was confirmed, and the phylogenetic analysis resolved the sister relationships among them. Positive selection analysis identified some PSGs in each subfamily. These results provide a better understanding of the chloroplast genome characteristics in the three subfamilies, contribute to a better understanding of the Amaryllidaceae phylogeny, and afford more genomic information for further evolutionary investigations of Amaryllidaceae species.

Data Availability
The assembled plastid genome sequences of the 18 Allium species used in this study are available at the National Center for Biotechnology

Conflicts of Interest
The authors declare no conflict of interest.

Supplementary Materials
Figure S1: comparison of the border regions among the 36 Amaryllidaceae plastid genomes. Figure S2: VISTA-based sequence identity plot of the 36 Amaryllidaceae plastid genomes using Allium fasciculatum as a reference. Figure S3: ML tree based on ITS. Table S1: information and GenBank accessions for sample collection. Table S8: the 65 protein-coding genes. Table S9: the potential positive selection test based on the branch-site model in Amaryllidoideae. Table S10: the potential positive selection test based on the branch-site model in Agapanthoideae.