Intraspecific DNA Barcoding and Variation Analysis for Citri Reticulatae Pericarpium of Citrus reticulata “Chachi”

Citri Reticulatae Pericarpium, the desiccated mature peel of Citrus reticulata Blanco or its cultivated varieties, is a national geographical indication product that serves as both medicine and foodstuff. Its primary source is Citrus reticulata “Chachi,” known as “Guang chenpi.” Because “Chachi” plants differ in variety, propagation method, grafting rootstock, and tree age, the hereditary stability of their biological information across intraspecific plants deserves attention. Homology analysis of 4 DNA barcodes from nuclear ribosomal DNA or the chloroplast (ITS2, rbcL, matK, and psbA-trnH) showed that their homology across 22 samples was 100.00%, 99.97%, 99.99%, and 99.81%, respectively, indicating that the 4 DNA barcodes maintained a high degree of genetic stability in Citrus reticulata “Chachi.” ITS2 was also considered suitable for distinguishing Citrus reticulata “Chachi” from other varieties because it showed not only low variability within a taxon but also a high level of interspecies variability. In addition, variant sites of Citrus reticulata “Chachi” were detected by comparison with the reference Citrus reticulata genome: 2652697 SNP sites and 533906 InDel sites were detected from whole-genome resequencing data of the 22 samples, providing data resources and a theoretical foundation for future studies of molecular markers relevant to “Guang chenpi.”


Introduction
Citri Reticulatae Pericarpium (CRP), a traditional Chinese medicine that serves as both medicine and foodstuff, is the desiccated mature peel of Citrus reticulata Blanco or its cultivated varieties. Among these, Citrus reticulata "Chachi," the main cultivated variety of Citrus reticulata Blanco, is the primary source of the genuine Chinese medicinal material "Guang Chenpi" [1]. As a national product of geographical indication, "Guang Chenpi" is widely used in clinical applications and by-product processing because of its superior quality among CRP products [2][3][4], and studies have shown that its pharmacological activities include antiasthmatic effects, antineuroinflammatory activity, antioxidant ability, and anticancer activity [5][6][7].
To date, a large number of studies have characterized CRP and its chemical compounds by morphological identification, microscopic identification, TLC, UV, HPLC, GC-MS, and LC-MS [8][9][10], but these methods do not distinguish well among different cultivars or among different varieties of Citrus reticulata "Chachi." As an emerging approach to identifying food and natural medicinal materials, molecular markers (DNA barcodes, SNPs, and InDels) have considerable untapped potential in the quality control and origin identification of food and medicinal materials. DNA barcoding, an important tool for ecological research, has been widely used in species identification [11][12][13][14]. A number of studies have shown that plant DNA core barcodes are used internationally in the fields of species discovery, taxonomy, flora, and ecology [15][16][17]. Nevertheless, molecular markers for different cultivars of CRP or different varieties of Citrus reticulata "Chachi" have been little studied. A previous study reported that the ITS2 region was selected for discrimination of four CRP cultivars; however, that study did not take the intraspecific variation of Citrus reticulata "Chachi" into consideration [18]. Individual plants of Citrus reticulata "Chachi" differ in tree age and in variety, including big-leaf and small-leaf types. In addition, Citrus reticulata "Chachi" is propagated either by layering on the maternal plant or by grafting onto different rootstocks such as Citrus limonia Osbeck, Citrus reticulata Blanco, and Poncirus trifoliata (L.) Raf.
Herein, 4 DNA barcodes (ITS2, rbcL, matK, and psbA-trnH) were chosen for analysis of biological evolutionary information in Citrus reticulata "Chachi" of different propagation methods, different tree ages, different varieties, and different rootstocks. Among them, ITS2 is a segment of nuclear ribosomal DNA [19], and rbcL, matK, and psbA-trnH are DNA fragments in the chloroplast.
In addition to the study of the genetic stability of the 4 barcodes, genetic diversity analysis of Citrus reticulata "Chachi" was carried out through whole-genome resequencing with the DNBSEQ-T7 platform. The data were compared with the published reference genome of Citrus reticulata from the NCBI (GenBank accession number ASM325862v1) [20] to mine single-nucleotide polymorphism (SNP) sites and insertion-deletion (InDel) sites from the whole-genome resequencing data of 22 Citrus reticulata "Chachi" samples. The objective of this work was to investigate the hereditary stability of 4 DNA barcodes (ITS2, rbcL, matK, and psbA-trnH) in different Citrus reticulata "Chachi" plants, which can provide a screening indicator for choosing DNA barcodes that distinguish Citrus reticulata "Chachi" from other varieties of CRP. In addition, variant detection based on whole-genome resequencing data provides more potential molecular markers for distinguishing Citrus reticulata "Chachi" among intraspecific plants or from other cultivars, laying a foundation for the further development of "Guang chenpi."

Biological Materials.
Twenty-two batches of biological materials were collected from the Germplasm Source and Seedling Breeding Center of "Guang chenpi" (Table 1). The 22 Citrus reticulata "Chachi" samples differed in variety, propagation method, rootstock, and tree age.

DNA Extraction.
Genomic DNA was extracted using the plant DNA extraction kit (TSP101-200) of Tsingke. The quality of the extracted genomic DNA was checked by 1% agarose gel electrophoresis with the DL2000 DNA marker, and its concentration was measured with the NanoDrop 1000 (Thermo Fisher Scientific, Waltham, Massachusetts, US).

PCR and Sequencing of DNA Barcodings.
Genomic DNA was diluted to 15 ng/μL and then amplified by polymerase chain reaction (PCR) using the 4 pairs of universal DNA barcode primers listed in Table 2 [18]. PCR was performed under the following conditions: initial denaturation at 98°C for 2 min, followed by 30 cycles of denaturation at 98°C for 10 s, annealing at the melting temperature (Tm) listed in Table 2 for 10 s, and extension at 72°C for 10 s. The final extension step was performed for 5 min at 72°C. Next, an aliquot of the amplification product was resolved by 1% agarose gel electrophoresis, documented with a gel documentation system, and further analyzed by sequencing.

DNA Library Construction and Illumina Sequencing.
Genomic DNA was randomly fragmented, the ends were repaired, an "A" overhang was added, and the adapter specific to the DNBSEQ-T7 sequencer was ligated. Then, DNA libraries were constructed by PCR enrichment. Finally, each DNA library was denatured, circularized, and digested to obtain single-stranded circular DNA. The single-stranded circular DNA was amplified by rolling circle amplification (RCA) to produce DNA nanoballs (DNBs). Sequencing was performed on the DNBSEQ-T7 sequencer after the DNA libraries passed quality control.

Whole-Genome Resequencing Data Quality and Filtering.
To exclude bias from low-quality reads that arise from base-calling errors or adapter contamination, the raw data obtained by whole-genome resequencing were quality-filtered until the Q30 value exceeded 85%. The clean reads were used for subsequent bioinformatics analysis. For further analysis, we downloaded the previously published genome of Citrus reticulata from the NCBI (GenBank accession number ASM325862v1). We mapped the high-quality data of each individual to the reference Citrus reticulata genome using the Burrows-Wheeler Aligner (BWA) software [21]; the sequencing read depth and genomic coverage of each sample were then calculated, and variants were detected.
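The Q30 criterion above can be illustrated with a minimal Python sketch. It assumes Phred+33 quality encoding, which is common for FASTQ files but is not stated in this article; the function names `q30_fraction` and `passes_q30` are hypothetical and do not come from the software used in this work.

```python
def q30_fraction(quality_strings, phred_offset=33):
    """Return the fraction of bases whose Phred quality score is >= 30.

    quality_strings: iterable of FASTQ quality lines (one string per read).
    phred_offset: ASCII offset of the encoding (assumed Phred+33 here).
    """
    total = 0
    passing = 0
    for qual in quality_strings:
        for ch in qual:
            total += 1
            if ord(ch) - phred_offset >= 30:
                passing += 1
    return passing / total if total else 0.0


def passes_q30(quality_strings, threshold=0.85):
    """True if the Q30 fraction exceeds the threshold (85% in this work)."""
    return q30_fraction(quality_strings) > threshold


# Toy example: 'I' encodes Q40, '?' encodes Q30, '5' encodes Q20.
print(q30_fraction(["II5?"]))  # 3 of 4 bases reach Q30
```

A Q30 base has an estimated error probability of at most 1 in 1000, so the 85% threshold bounds the share of bases that fall below that accuracy.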

SNP and InDel Calling.
SNPs and InDels can be called by mapping the unitigs against a reference genome. The main calling procedures were as follows: (1) for the BWA alignment results, the MarkDuplicates tool of Picard was used to remove duplicates and eliminate the influence of PCR duplication; (2) the Genome Analysis Toolkit (GATK) software [22] was used to perform InDel realignment, that is, local realignment of the regions around aligned InDels to correct alignment errors caused by the InDels; (3) GATK was used for base recalibration to calibrate the base quality scores; (4) variant calling of SNPs and InDels was performed with GATK; and (5) SNPs and InDels with any of the following features were filtered out: two SNPs within 5 bp; SNPs within 5 bp of an InDel; and two InDels within 10 bp [23].
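The proximity filters in step (5) can be sketched in Python as follows. This is an illustration under stated assumptions, not the pipeline used in this work: variants are reduced to hypothetical `(chromosome, position, kind)` tuples, each InDel is represented by a single position, "within N bp" is read as a distance of at most N, every member of a clustered group is dropped, and an InDel is kept when its only close neighbor is a SNP.

```python
from bisect import bisect_left
from collections import defaultdict


def has_neighbor(sorted_positions, pos, max_dist, exclude_self):
    """True if a position in sorted_positions lies within max_dist bp of pos."""
    i = bisect_left(sorted_positions, pos - max_dist)
    while i < len(sorted_positions) and sorted_positions[i] <= pos + max_dist:
        if not (exclude_self and sorted_positions[i] == pos):
            return True
        i += 1
    return False


def filter_clustered(variants, snp_gap=5, indel_gap=10):
    """Drop SNPs near another SNP or an InDel, and InDels near another InDel.

    variants: list of (chrom, pos, kind) with kind "SNP" or "INDEL";
    positions are assumed unique per kind and chromosome.
    """
    snps = defaultdict(list)
    indels = defaultdict(list)
    for chrom, pos, kind in variants:
        (snps if kind == "SNP" else indels)[chrom].append(pos)
    for table in (snps, indels):
        for positions in table.values():
            positions.sort()

    kept = []
    for chrom, pos, kind in variants:
        if kind == "SNP":
            clustered = (has_neighbor(snps[chrom], pos, snp_gap, True)
                         or has_neighbor(indels[chrom], pos, snp_gap, False))
        else:
            clustered = has_neighbor(indels[chrom], pos, indel_gap, True)
        if not clustered:
            kept.append((chrom, pos, kind))
    return kept


calls = [("chr1", 100, "SNP"), ("chr1", 103, "SNP"), ("chr1", 200, "SNP"),
         ("chr1", 204, "INDEL"), ("chr1", 300, "INDEL"), ("chr1", 309, "INDEL"),
         ("chr1", 400, "SNP"), ("chr1", 500, "INDEL")]
print(filter_clustered(calls))
```

In the toy input, the SNPs at 100 and 103 remove each other, the SNP at 200 is removed because of the InDel at 204, and the InDels at 300 and 309 remove each other.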

Quality and Concentration of the Extracted DNA.
In this work, genomic DNA was extracted from tender leaves of 22 Citrus reticulata "Chachi" samples using the plant DNA extraction kit, and the OD value (A260/280) and DNA concentration are shown in Table 3. The results showed that the DNA concentration was sufficient for subsequent experiments.

Sequence Features and Homologous Analysis of DNA Barcodings.
According to the agarose gel electrophoresis of the PCR amplification products (Figure 1), ITS2, rbcL, matK, and psbA-trnH produced amplification bands of approximately 750 bp, 750 bp, 1000 bp, and 500 bp, respectively. The electrophoresis bands of each sample were uniform and bright, without nonspecific bands, indicating that the success rate of sequence amplification was 100% and that the products could be further analyzed by sequencing. The four barcodes (ITS2, rbcL, matK, and psbA-trnH) were analyzed with DNAMAN software for the length and base composition of each sequence fragment and further identified by BLAST in GenBank. The success rates of PCR amplification and sequencing of 3 DNA barcodes (ITS2, rbcL, and matK) of 22 samples were 100%. However, because a large number of fragments were missing in the psbA-trnH sequencing of two samples (A16 and A21), 20 psbA-trnH sequences were actually obtained in the experiment. The PCR success rate, barcode length, GC content, variable sites, and BLAST rate of the 4 DNA barcodes are listed in Table 4. The aligned partial sequences had lengths of 232 bp, 680-682 bp, 1059 bp, and 537-538 bp for ITS2, rbcL, matK, and psbA-trnH, respectively. Among them, the ITS2 barcode had the advantages of the shortest sequence length and a high GC content (71.60%), followed by the psbA-trnH barcode, which had the next shortest sequence length.
Moreover, homologous analysis of the 4 DNA barcodes across the 22 batches of Citrus reticulata "Chachi" samples was carried out with DNAMAN software. The homology of ITS2 across the 22 samples was 100.00%, which indicated that ITS2 maintained a high degree of genetic stability in Citrus reticulata "Chachi" of different propagation methods, different tree ages, different varieties, and different rootstocks. The homology of rbcL, matK, and psbA-trnH across the samples was 99.97%, 99.99%, and 99.81%, respectively.
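As an illustration of how a homology percentage of this kind can arise, the sketch below computes the mean pairwise identity of pre-aligned sequences in Python. This is an assumption for illustration only: the article does not state the formula DNAMAN applies, and the function names and toy sequences are hypothetical.

```python
from itertools import combinations


def pairwise_identity(a, b):
    """Percent of aligned columns at which two equal-length sequences match.

    Gap characters ('-') are compared like any other symbol, so a gap
    opposite a base counts as a mismatch.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    if not a:
        raise ValueError("sequences must not be empty")
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return 100.0 * matches / len(a)


def mean_identity(aligned):
    """Mean pairwise identity (%) over all pairs of aligned sequences."""
    pairs = list(combinations(aligned, 2))
    if not pairs:
        raise ValueError("need at least two sequences")
    return sum(pairwise_identity(a, b) for a, b in pairs) / len(pairs)


# Toy alignment: two identical sequences and one with a single substitution.
toy = ["ACGTACGT", "ACGTACGT", "ACGTACGA"]
print(round(mean_identity(toy), 2))
```

Under this definition, a single variable site in a long alignment lowers the mean only slightly, which is consistent with homology values just below 100% for barcodes that carry one or two variable sites.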
In addition, among the 22 batches of CRP samples, 2 SNP sites were identified in the matK barcode and 1 InDel site was identified in the psbA-trnH barcode. These results showed that some variation remains within Citrus reticulata "Chachi," indicating that molecular breeding of Citrus reticulata "Chachi" and its distinction from related species require more informative molecular markers. Therefore, whole-genome resequencing was also performed on the 22 Citrus reticulata "Chachi" samples of different propagation methods, different tree ages, different varieties, and different rootstocks, which provided a broader scientific basis for molecular breeding.
A previous study reported that the ITS2 region was selected for discrimination of four CRP cultivars, namely Citrus reticulata "Chachi," Citrus reticulata "Dahongpao," Citrus reticulata "Unshiu," and Citrus reticulata "Tangerina," while ITS, trnH-psbA, and rbcL could not distinguish these CRP samples [18]. Unlike that study, this work focuses on the hereditary stability of 4 barcodes (ITS2, rbcL, matK, and psbA-trnH) in Citrus reticulata "Chachi" with different varieties, propagation methods, grafting rootstocks, and tree ages. Because DNA degrades over time in moderately or highly processed products, PCR amplification of standard-length (around 650 bp) barcodes is a major challenge [24]. Combining existing research with the results of this work, ITS2 was considered a useful DNA barcode for distinguishing Citrus reticulata "Chachi" from other varieties, because it showed not only low variability within a taxon but also a high level of interspecies variability. This work also indicated that matK is less suitable because of its long length and its variable sites within the taxon, while rbcL and psbA-trnH have the potential to distinguish Citrus reticulata "Chachi" from other varieties. Indeed, combining ribosomal and chloroplast DNA barcodes makes species identification of plants more convincing [25,26].

Quality Analysis of Whole-Genome Resequencing Data.
The genomes of 22 Citrus reticulata "Chachi" samples were sequenced, generating 158 Gb of raw data. The base coverage depth distribution curve and the coverage distribution curve indicated that the coverage depth of bases across the genome was evenly distributed. The insert fragment size distribution showed a single peak and fit a normal distribution, indicating that the construction of the DNA libraries was reliable. The chromosome coverage depth map showed that the genome was evenly covered, indicating good randomness of sequencing.
A summary of the clean sequencing data for the 22 Citrus reticulata "Chachi" samples is given in Table 5. The size of the Citrus reticulata reference genome is 344.27 Mb (assembly level: scaffold). In this work, the average coverage depth was 16X, and the Q30 value reached 88.94%. The average mapped ratio and genome coverage of all the samples were 98.97% and 93.55%, respectively. The average GC content of Citrus reticulata "Chachi" was 38.68%, in line with the Citrus reticulata reference genome.

SNP and InDel Calling of Citrus reticulata "Chachi".
In this study, variant sites of Citrus reticulata "Chachi," the source of the national geographical indication product CRP, were analyzed for the first time by SNP and InDel calling from whole-genome resequencing data (Table 6). Despite the high genetic stability of the 4 barcodes (ITS2, rbcL, matK, and psbA-trnH), the 22 Citrus reticulata "Chachi" samples also showed genetic diversity across different propagation methods, different tree ages, different varieties, and different rootstocks.
A total of 2652697 SNP sites were detected among the 22 Citrus reticulata "Chachi" samples, of which 1741507 were transitions (Ti), 902182 were transversions (Tv), and 9008 were transition or transversion. These SNP sites had a Ti/Tv ratio of 1.93, which is in line with the general rules of base mutation in natural organisms [27]. In the evolution of Citrus reticulata "Chachi," transitions occur much more frequently than transversions. This suggests that evolution tends to tolerate substitutions between two purines or between two pyrimidines, whereas substitutions between a purine and a pyrimidine are more likely to be deleterious and have mostly been eliminated by selection.
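The Ti/Tv figure can be reproduced from the counts reported here. The Python sketch below classifies a single substitution and recomputes the ratio; the function name `classify_snp` and the variable names are hypothetical, and the 9008 sites that the text describes as "transition or transversion" are kept as a separate count rather than assigned to either class.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_snp(ref, alt):
    """Classify a single-base substitution as a transition or a transversion."""
    ref, alt = ref.upper(), alt.upper()
    valid = PURINES | PYRIMIDINES
    if ref not in valid or alt not in valid or ref == alt:
        raise ValueError("ref and alt must be two different bases from A, C, G, T")
    # Transition: purine <-> purine or pyrimidine <-> pyrimidine.
    same_class = {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES
    return "transition" if same_class else "transversion"


# Counts as reported in the text.
transitions = 1_741_507
transversions = 902_182
transition_or_transversion = 9_008

total_snps = transitions + transversions + transition_or_transversion
ti_tv_ratio = transitions / transversions

print(classify_snp("A", "G"), classify_snp("A", "C"))
print(total_snps, round(ti_tv_ratio, 2))
```

The three categories sum to the reported total of 2652697 SNP sites, and the ratio rounds to 1.93. If all twelve possible substitutions were equally likely, the expected ratio would be 0.5, because each base has one transition partner and two transversion partners.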
In addition, InDel sites, as codominant molecular markers, are widely distributed in the genome at high density and are therefore suitable for genome-wide molecular marker exploration. A total of 533906 InDel sites were detected among the 22 Citrus reticulata "Chachi" samples, of which 275380 were insertions, 241768 were deletions, and 9008 were insertion or deletion. Unlike the transition bias seen at SNP sites, insertions and deletions occurred at broadly similar frequencies.

Conclusions
Overall, our work indicated that 4 DNA barcodes (ITS2, rbcL, matK, and psbA-trnH) maintained a high degree of genetic stability in Citrus reticulata "Chachi" of different propagation methods, different tree ages, different varieties, and different rootstocks. Because ITS2 showed not only low variability within a taxon but also a high level of interspecies variability, it was considered a useful DNA barcode for distinguishing Citrus reticulata "Chachi" from other varieties. Moreover, 2652697 SNP sites and 533906 InDel sites were detected from whole-genome resequencing data of 22 Citrus reticulata "Chachi" samples, reflecting the genetic diversity of Citrus reticulata "Chachi" with different varieties or propagation methods. To mine more useful molecular markers for distinguishing Citrus reticulata "Chachi" among intraspecific plants or from other cultivars, DNA barcoding analysis and variant detection of Citrus reticulata "Chachi" were carried out for the first time in this investigation, which laid a foundation for the biological information analysis of Citrus reticulata "Chachi" as the source of the national geographical indication product Citri Reticulatae Pericarpium.

Data Availability
The data used to support the findings of this study are included within the article.

Conflicts of Interest
The authors declare no conflicts of interest.

Authors' Contributions
Mengshi Liu and Kanghui Wang conceptualized the study, performed visualization, conducted investigation, and wrote the original draft of the manuscript. Baizhong Chen, Yi Cai, Chuwen Li, and Wanling Yang formulated the methodology, were responsible for software, and curated data. Minyan Wei and Guodong Zheng formulated the methodology, were responsible for software, curated data, performed validation, supervised the work, and reviewed and edited the manuscript. All authors agreed to be accountable for all aspects of work ensuring integrity and accuracy. Mengshi Liu and Kanghui Wang contributed equally to this work.