Barcoding the Dendrobium (Orchidaceae) Species and Analysis of the Intragenomic Variation Based on the Internal Transcribed Spacer 2

Many species belonging to the genus Dendrobium are of great commercial value. However, their difficult growth conditions and high demand have caused many of these species to become endangered. In addition, counterfeit Dendrobium products are common, especially in medicinal markets. This study aims to assess the suitability of the internal transcribed spacer 2 (ITS2) region as a marker for identifying Dendrobium and to evaluate its intragenomic variation in Dendrobium species. In total, 29,624 ITS2 copies from 18 species were obtained using 454 pyrosequencing to evaluate intragenomic variation. In addition, 513 ITS2 sequences from 26 Dendrobium species were used to assess its identification suitability. The highest intragenomic genetic distance was observed in Dendrobium chrysotoxum (0.081). The average intraspecific genetic distances of each species ranged from 0 to 0.032. Phylogenetic trees based on ITS2 sequences showed that most Dendrobium species are monophyletic. The intragenomic and intraspecific divergence analysis showed that greater intragenomic divergence is mostly correlated with larger intraspecific variation. As a major ITS2 variant becomes more common in the genome, there are fewer intraspecific variable sites in ITS2 sequences at the species level. The results demonstrated that the multiple intragenomic copies of ITS2 did not affect species identification.


Introduction
Dendrobium is one of the three largest genera of the Orchidaceae family and comprises more than 1,000 species distributed throughout the Asian tropical and subtropical regions as well as Oceania, with 78 species of this genus recorded in China alone [1]. The flowers of Dendrobium come in a rich variety of colors and shapes, and in recent years they have increased significantly in commercial value as ornamental flowers. Dendrobium is also well known for its medicinal value. In fact, one of the earliest records of Orchidaceae plants in ancient Chinese literature is Shen Nong's classic herbal text written approximately 1,500 years ago. Approximately 33 species of Dendrobium are used as clinical medications [2], including Dendrobium officinale, also known as "Tie Pi Feng Dou," and Dendrobium nobile, also known as "Jin chai shi hu," as described in the Chinese Pharmacopoeia. Each year, large numbers of Dendrobium plants are needed for both the flower and medicinal markets.
Adulterants and substitutes have become common in the markets, especially among products sold for medicinal purposes. An effective method of species identification is therefore needed.
In eukaryotic genomes, rDNA arrays are often present in hundreds of copies, with copy number varying among different species [3][4][5]. As a tool to study evolution, the rDNA copy number per genome and sequence variation between species can be used to study phylogenetic relationships and biodiversity [4]. The internal transcribed spacer (ITS) is part of a multicopy gene that encodes ribosomal RNA subunits in all eukaryotic genomes. ITS regions have been used to study biodiversity in bacteria [6], insects [7], marine organisms [8][9][10], and plants [11], as well as many others. Due to their powerful discriminatory ability and stability among Dendrobium species, rDNA sequences have been used for identification and classification purposes [12,13]. Among the numerous Dendrobium species, D. officinale has received the greatest amount of attention due to its high medicinal value in China. Ding et al. established a database that included 21 Dendrobium species labelled "Feng Dou" herbs on the market and proposed that rDNA ITS sequences could be used to identify Dendrobium species with high accuracy [14]. Indeed, Zhang et al. accurately identified D. officinale from its adulterants using full-length ITS regions [15]. Furthermore, Li et al. performed phylogenetic analyses and identified Dendrobium species using rDNA ITS sequences, and their classification based on ITS sequences was identical to traditional classifications for most species [16].
ITS2 is commonly used to infer phylogenetic relationships and has been employed as a DNA barcode for identification purposes. The genes in this region are thought to have evolved in concert, leading to a homogenization of all copies of this gene across the genome [17,18]. To date, the ITS2 region has been used to identify plants [19][20][21], fungi [22][23][24], and insects [25]. Although ITS/ITS2 is extremely useful for both species identification and phylogenetic analyses, it does have drawbacks. One significant problem is that it is present in multiple copies in the genome. Phylogenetic studies typically use consensus sequences that average over all copies in a genome, thereby concealing most intragenomic variation. Indeed, the intragenomic variation and intraspecific divergence in ITS2 present significant challenges for genetic diversity analyses and species identification. Conversely, evaluating ITS2 sequences for identification and phylogenetic purposes might prove useful for deeper research into intragenomic and intraspecific diversity. While intraspecific divergence in Dendrobium has been studied, the intragenomic diversity arising from multiple copies has received increased attention with the development of next-generation sequencing technology. Here we used pyrosequencing to sequence 18 selected species of Dendrobium to perform ITS2 intragenomic diversity analysis. Intra- and interspecific variations among different species were also evaluated using ITS2 sequences in 26 species of Dendrobium. Our results indicate that the ITS2 region is a valuable tool for identifying species and analyzing phylogenetic relationships.

Sampling, DNA Extraction, PCR Amplification, and Sequencing. Fresh leaves and stems of plants of the genus Dendrobium were obtained from different locations (see Appendix S1, in Supplementary Material available online at https://doi.org/10.1155/2017/2734960). Samples were dried at a temperature of 45 °C prior to genomic DNA extraction. DNA extraction, PCR amplification, and sequencing were performed as described in previous studies [26,27]. Approximately 15 mg of dried leaves or 20 mg of dried stems was ground for two minutes (30 revolutions/second) in a Fast-Prep bead mill (MM400, Retsch, Haan, Germany). DNA was extracted using the Plant Genomic DNA Kit (Tiangen Biotech Co., Beijing, China). Universal primers for the ITS2 region (ITS2F/3R) were used for amplification [27]. Sequencing of the PCR products was performed bidirectionally with the same primers used for the PCR amplification using a 3730XL sequencer (Applied Biosystems, Foster City, California, USA). The intragenomic data used were from a previous study by our group [28]. Other sequences were obtained from GenBank (see Appendix S2). Twenty-six Dendrobium species with more than ten sequences each were selected for identification analysis.

Data Analysis. ITS2 sequences in this study were subjected to hidden Markov model (HMM) [29] analysis to remove the conserved 5.8S and 26S rRNA genes. Intragenomic and intraspecific Kimura 2-parameter distances were computed using the MEGA
In this study, 454 pyrosequencing of the D. officinale sample yielded a total of 2,554 reads representing ten different variant patterns. The major variant accounted for 93.89% of the ITS2 sequences in the entire genome. After alignment, the consensus sequence of the ten variants from the D. officinale genome was 246 bp in length, with 13 variable sites, including two indels. The dominant sequence patterns in D. officinale were consistent with the sequences obtained via direct PCR sequencing.
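The variant tally described above can be illustrated with a minimal sketch. It assumes the reads have already been trimmed to the ITS2 region and that identical strings represent the same variant. The function name `variant_frequencies` and the toy reads are hypothetical and are not data or code from this study.

```python
from collections import Counter


def variant_frequencies(reads):
    """Return (sequence, count, fraction) tuples, most common variant first."""
    counts = Counter(reads)
    total = sum(counts.values())
    return [(seq, n, n / total) for seq, n in counts.most_common()]


# Toy reads (hypothetical): one dominant variant and two rarer ones.
reads = ["ACGT"] * 939 + ["ACGA"] * 56 + ["ACTT"] * 5

table = variant_frequencies(reads)
major_seq, major_count, major_fraction = table[0]

# Variants below 1% of reads, the level at which direct PCR or clone
# sequencing is reported to have difficulty detecting them.
minor_variants = [seq for seq, _, fraction in table if fraction < 0.01]

print(f"major variant {major_seq}: {major_fraction:.2%} of {len(reads)} reads")
print("variants below 1%:", minor_variants)
```

Real pyrosequencing reads would additionally need quality filtering and alignment before counting, which this sketch leaves out.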

Analysis of the ITS2 Region at Intra- and Interspecific Levels. In total, 513 ITS2 sequences from 26 species of Dendrobium were analyzed for intraspecific genetic distances (IS-GDs). The average IS-GD for each species ranged from 0 to 0.032. The average IS-GD value in seven species (D. herbaceum, D. macrostachyum, D. amoenum, D. aqueum, D. bicameratum, D. barbatulum, and D. peguanum) was zero, and the highest average IS-GD value (0.032) was found in D. hancockii (Figure 1). The number of variable sites in ITS2 sequences of each species was also calculated (

The Neighbor-Joining Tree Based on ITS2 Sequences. A neighbor-joining tree (NJ tree) was built based on the intragenomic data to determine the phylogenetic relationships between the Dendrobium species. Previous studies have shown that minor variants present below 1% are difficult to detect directly with PCR or clone sequencing. Thus, we first selected the major variants for analysis (Figure 2).
[Figure residue: tree tip labels for Dendrobium huoshanense, accessions KC331005 through KC331012.]
To better clarify the relationship among these species in the main clade, these six species were used to build a separate NJ tree based on their ITS2 sequences (Figure 3). We divided this NJ tree into two major clades (Clades I and II

Discussion
Ribosomal DNA (rDNA) is present in multiple copies of tandem repeats per genome [31], and two noncoding spacers (internal transcribed spacers 1 and 2) divide each transcriptional unit into three subunits: 18S, 5.8S and 28S. Each tandem repeat can contain variations, thus leading to intragenomic variation. Many studies have addressed genomic divergence in Dendrobium, but most of these have focused on intra- and interspecific levels of variation [32][33][34][35][36]. It is thought that biodiversity at the species level is generally overestimated due to intragenomic variation [37]. In this study we therefore focused on the intragenomic level, aiming to identify relationships between intragenomic diversity and intraspecific diversity. Sequence-based methods have replaced many traditional approaches such as allozyme or restriction enzyme polymorphisms, a substitution that is valid as long as appropriate markers are selected [38]. Traditional approaches (e.g., RAPD, AP-PCR, and AFLP) generally require high-quality DNA for amplification, which can lead to problems with reproducibility and accuracy. Sequence-based methods should be more objective and stable, enhancing our ability to assess biodiversity and identify species [39]. In addition, experimental error and subjective factors such as scoring PCR bands on a gel are eliminated or minimized in sequence-based protocols.
The ITS2 locus has already been proposed as a universal DNA barcode, particularly in plants, and it has been shown that plants can be identified at the species and genus level with more than 97% accuracy [27,40]. Although the China Plant BOL Group suggested ITS as the core barcode for seed plants, ITS2 has several advantages compared with the full-length ITS region [41]. First, ITS2 is shorter than ITS, which simplifies PCR amplification. Moreover, ITS2 has secondary structure in all eukaryotes [42,43]. This molecular morphological characteristic strengthens its discriminatory power. In addition to species identification applications, ITS2 and its secondary structure have been used as effective tools for phylogenetic analyses in insects, corals, and yeast [44][45][46][47]. As these transcribed spacers are highly divergent, they can also be used to estimate low levels of genetic diversity among related species [48]. Liu et al. evaluated the resolution of five regions (rbcL, matK, ITS, ITS2, and trnH-psbA), ultimately suggesting an rbcL + ITS2 barcode combination as the most suitable marker for analyzing biodiversity in the Dinghushan National Nature Reserve (DNNR) in China [49].
Among all the Dendrobium species, D. officinale is undoubtedly the most valuable, owing to its low production, high price, and clinical efficacy. Previous studies using ISSR, RAPD, and SRAP revealed distinct genetic differences and extensive genetic diversity among different populations of D. officinale [34,50,51]. However, the intraspecific genetic diversity of D. officinale (intraspecific genetic distance, average: 0.001; max: 0.013) as revealed by ITS2 sequences turned out to be relatively low compared with results from other approaches. From the 61 ITS2 sequences obtained from D. officinale, only five variable sites were detected after alignment. Across the whole genome, D. officinale has a single dominant variant that represents 93.89% of ITS2 sequences. These results indicate that the ITS2 regions are relatively conserved among different populations of D. officinale. Due to the low production and high price of D. officinale, many closely related species appear as adulterants in the herbal market. These adulterants have morphological characteristics similar to each other, making traditional taxonomic identification difficult, particularly after processing into medicinal slices. These results suggest that ITS2 can be an effective molecular tool for identifying commercial D. officinale and other Dendrobium species.
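The two summary statistics quoted above, pairwise Kimura 2-parameter (K2P) distance and the number of variable sites in an alignment, can be sketched as follows. This is only an illustration under stated assumptions: the study computed its distances with MEGA, whereas this sketch uses the standard K2P formula d = -1/2 ln((1 - 2P - Q) sqrt(1 - 2Q)), with P and Q the proportions of transitions and transversions, skips sites with gaps or ambiguous bases in each pair (pairwise deletion), and counts any column with more than one state, including gaps, as variable. The function names and the toy alignment are hypothetical.

```python
import math
from itertools import combinations

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(a, b):
    """Kimura 2-parameter distance between two aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for x, y in zip(a.upper(), b.upper()):
        if x not in "ACGT" or y not in "ACGT":
            continue  # pairwise deletion of gaps and ambiguous bases
        compared += 1
        if x == y:
            continue
        if {x, y} <= PURINES or {x, y} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    # math.log raises ValueError if the sequences are too divergent (saturated).
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


def count_variable_sites(alignment):
    """Number of alignment columns containing more than one character state."""
    return sum(1 for column in zip(*alignment) if len(set(column)) > 1)


# Toy alignment (hypothetical, not sequences from this study).
alignment = [
    "ACGTACGTAC",
    "ACGTACGTAC",
    "ACGTATGTAC",  # one transition (C to T)
    "ACGTACGTAA",  # one transversion (C to A)
]

distances = [k2p_distance(a, b) for a, b in combinations(alignment, 2)]
mean_distance = sum(distances) / len(distances)
max_distance = max(distances)
variable_sites = count_variable_sites(alignment)

print(f"mean K2P = {mean_distance:.4f}, max K2P = {max_distance:.4f}")
print("variable sites:", variable_sites)
```

Averaging over all pairs within a species gives a mean intraspecific distance of the kind reported above; the exact gap handling in MEGA depends on the options chosen there, so values from this sketch need not match it.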
In the Chinese Pharmacopoeia (2015 edition), D. officinale is described as an independent species that is the source of the herbal medicine "Tie Pi Shi Hu." However, this species has already been accepted as a synonym of D. catenatum, D. tosaense, and several others in the Flora of China and other research [52]. In our study, ITS2 sequences from D. officinale and D. tosaense were grouped into a single clade with 100% bootstrap support. The NJ tree described here demonstrates that, at the very least, D. officinale and D. tosaense are extremely closely related at the genetic level, consistent with other results from China. Therefore, we agree that D. officinale and D. tosaense should be accepted as synonyms of D. catenatum. In a previous study, a phylogenetic tree including twelve samples of Dendrobium species was constructed [50]. The three species D. moniliforme, D. hercoglossum, and D. nobile were grouped in the same clade, similar to classifications based on inflorescence color and the results from this study.

Conclusion
In this study, we analyzed intragenomic and intraspecific divergence and found that, in most cases, greater intragenomic divergence is correlated with larger intraspecific variation. The results support the credibility of the direct PCR sequencing data, because the dominant sequences recovered by high-throughput sequencing in each species were also detected by direct PCR. Thus, the multiple copies of ITS2 did not affect species identification in Dendrobium, and ITS2 is an effective tool for Dendrobium species identification.