De novo Transcriptome Reveals Gene Changes in the Development of the Endosperm Chalazal Haustorium in Taxillus chinensis (DC.) Danser

Loranthus (Taxillus chinensis) is a facultative, hemiparasite and stem parasitic plant that attacks other plants for living. Transcriptome sequencing and bioinformatics analysis were applied in this study to identify the gene expression profiles of fresh seeds (CK), baby (FB), and adult haustoria tissues (FD). We assembled 160,571 loranthus genes, of which 64,926, 35,417, and 47,249 were aligned to NR, GO, and KEGG pathway databases, respectively. We identified 14,295, 15,921, and 16,402 genes in CK, FB, and FD, respectively. We next identified 5,480 differentially expressed genes (DEGs) in the process, of which 258, 174, 81, and 94 were encoding ribosomal proteins (RP), transcription factors (TF), ubiquitin, and disease resistance proteins, respectively. Some DEGs were identified to be upregulated along with the haustoria development (e.g., 68 RP and 26 ubiquitin genes). Notably, 36 RP DEGs peak at FB; 10 ER, 5 WRKY, 6 bHLH, and 4 MYB TF genes upregulated only in FD. Further, we identified 4 out of 32 microRNA genes dysregulated in the loranthus haustoria development. This is the first haustoria transcriptome of loranthus, and our findings will improve our understanding of the molecular mechanism of haustoria.


Introduction
Taxillus chinensis (DC.) Danser, also called loranthus or "San Ji Sheng" (in Chinese), is a member of Loranthaceae family and mainly distributed in the southern and southwestern areas of China. It has a long history of being used in the Chinese traditional medicine, mainly because its stems and leaves can be used for the treatment of rheumatoid arthralgia, threat of abortion, and hypertension [1,2]. Loranthus is a parasitic plant that attacks other plants, such as Aceraceae, Anacardiaceae, Euphorbiaceae, Fabaceae, Fagaceae, Juglandaceae, Moraceae, Rosaceae, and Rutaceae [2]. The successful parasitism is a key process for the plants to obtain water and nutrients from the host plants via specialized feeding structures called haustoria.
In plants, approximately 4,500 parasitic species belonging to 28 families, representing 1% of the dicotyledonous angiosperm species, have been reported [3]. Depending on the attachment site in the host plants, parasitic plants can be classified in to two groups-stem and root parasites. Also, according to the degree of host dependency, parasites can be facultative or obligate. Facultative parasites can live autotrophically but latter cannot, such as Triphysaria spp. and Phtheirospermum spp., while obligate parasites have to parasitize a host in order to complete their life cycles, for example, Viscum spp., Cuscuta spp., Orobanche spp., and Striga spp.. Further, parasitic plants can be classified as hemiparasites or holoparasites based on whether they have retained or completely lost the photosynthetic activity [3]. Based on these characteristics, loranthus is a facultative, hemiparasite, and stem parasite.
It has been reported that after seed germination, most parasitic plants will develop a functional haustoria depending on a second chemical signal also derived from the host exudate, such as 2,6-dimethoxy-p-benzoquinone (DMBQ), phenolic acids, and flavonoids (a haustoria-inducing factors (HIFs)) [4]. Some studies have shown the mechanisms of haustoria development in parasitic plants. For example, a single-electron reducing quinone oxidoreductase (TvPirin) is required to trigger the haustoria development in the roots of Triphysaria versicolor [5]. The seeds of Santalum album, an aggressive root hemiparasite, can germinate in sand or in vitro on Murashige and Skoog medium after a pretreatment of 2~8 mM GA 3 for 12 h and then develop the haustoria within one month without the need for induction by HIFs [6][7][8]. Many is unknown about haustoria development in loranthus.
Transcriptome sequencing has been used to identify differentially expressed genes in the process of parasitism of Cuscuta pentagona, including genes encoding plant hormone (e.g., auxin, gibberellin, and strigolactones), transporters, and genes associated with cell wall modifications [9]. Also, it has been used to show that genes involved in cell wall metabolism, protein metabolism, and mitochondrial electron transport, genes related to auxin signaling and genes encoding nodulin-like proteins, were important for the haustoria development in Santalum album [4]. Transcriptome analysis also found that genes related to protein turnover, detoxification of reactive oxygen species, and fungal pathogenesis are abundant in the haustoria of Golovinomyces orontii [10]. Recently, small RNA sequencing characterized that some dodders' (Cuscuta spp.) microRNAs (miRNAs) could target the host (Arabidopsis thaliana) genes and further improve the parasitism [11].
In the present study, we constructed a transcriptome profile of haustoria development and identified genes encoding ribosomal proteins (RPs), transcription factors (TFs), ubiquitin, and disease-resistant proteins (DRPs) which might be involved in the loranthus haustoria development. Our results provide a valuable resource for further exploration and a basis towards understanding the molecular mechanisms of the haustoria development and underlying host-parasite interaction in angiosperms. . Then, the seeds were peeled, washed with sterile water, placed on a germination dish, and incubated under the environment of 25°C temperature and 80% moisture. Every day, the seeds were lighted under 2000 Lx for 10 h. Three fresh seeds were collected as control (CK). After 10 days of incubation, three seeds with protruding seed-type radicle and tiny suction device were randomly collected (FB). After 20 days of incubation, the loranthus haustoria was formed and elongated, and the true leaves began to grow. Three of them were collected as adult haustoria (FD).

Total RNA Extraction and Transcriptome Sequencing.
Total RNA was extracted using TRIzol reagent, as described [1,12]. After the quality and quantity were determined by Agilent 2100 Bioanalyzer, total RNA (1 μg) of each sample was used to construct the cDNA library using the TruSeq RNA Library Preparation Kit v2 protocol (Illumina), as described [13]. Then, cDNA libraries were quality controlled by the Agilent 2100 Bioanalyzer and qRT-PCR, followed by sequencing on the Illumina HiSeq2500 platform with paired-end 100 strategy.
Next, we annotated the assembled loranthus genes using KEGG pathway Gene Ontology (GO) databases. BLAST software was used to map the assembled genes to the NR database and the hits with and e-value of >1 × 10 −5 were filtered. Remaining genes were processed to retrieve GO annotation in terms of biological process, cellular component, and molecular function by BLAST2GO [19]. Using the enzyme commission numbers produced by BLAST2GO, we mapped the assembled transcriptome to KEGG pathway database and obtained the pathway annotation.
2.5. Noncoding Gene and miRNA Annotation. Unannotated loranthus genes were processed by the Coding Potential Calculator (CPC, v2) with default parameters to identify potential long noncoding genes [20]. Then, all the plant mature microRNAs (miRNAs) were mapped to these noncoding genes to identify loranthus miRNAs using SOAP2 with maximal two mismatches [21]. Then, MIREAP was used to predict the miRNA precursor sequences, and psRobot was used to predict the target genes of miRNAs [22].

Gene Expression Profile and Differential Expression
Analysis. Bowtie2 and RSEM tools were used to align clean reads to the assembled transcriptome and to profile the gene expression for each sample, respectively, [23]. Transcriptsper-million (TPM) reads method was for normalization, and lowly expressed genes (TPM < 5) were filtered. Then, differential expressed genes (DEGs) were identified using edgeR [24] with a strict criteria: log2 fold change ðLog2FCÞ > 1 or <−1 and false discovery rate (FDR) of <0.05.

Functional
Analysis. p value calculated using Fisher's exact test and q value calculated by the R package "qvalue" were used to identify enriched GO terms and KEGG 2 BioMed Research International pathways (p value of <0.05 and q value of <0.05). Human or other animal-related GO terms and pathways were filtered.
2.8. qRT-PCR. We randomly selected 9 genes for qRT-PCR validation, and 18S rRNA was used as internal control. Forward and reverse primers were predicted using Primer3 and synthesized at BGI-Shenzhen. The procedure of qRT-PCR experiment was same as our previous study [1]. The expression of genes was shown in ΔCt. ΔΔCt was used to present the difference of gene expression between two samples. Then, we used relative normalized expression (RNE) to show the gene expression changes: RNE = 2 −ΔΔCt . p values were calculated using the multiple t tests function in Prism GraphPad 8.0.

Results
3.1. Plants, Sequencing, and De Novo Analysis. Compared to CK, the green colors of FB and FD seeds were darker (Figure 1(a)). In addition, FB seeds produced seed-type radicle and tiny suction device. FD seeds formed and elongated the haustoria, and their true leaves began to grow. We generated a total of~322.92 million reads (average:~35.88 million reads) for these samples. After data cleaning,~321.92 million reads were obtained, and Trinity assembled 160,571 loranthus genes that can produce 266,379 transcripts ( Table 1). The size of the loranthus haustoria transcriptome was~110 Mb, the GC percentage was 42.83%, the N50 was 1,191 bp, which revealed that 50% of the assembled loranthus genes were >1,191 bp, and the average gene length was 685.36 bp. Furthermore, gene length distribution ( Figure 1(b)) showed 106,251 (66.17%) genes between 200 bp to 500 bp and 8,199 (5.11%) genes longer than 3000 bp.
RNAMMER predicted 19 genes that can produce ribosomal RNAs in the assembled genes. Next, CPC identified 99,817 potential long noncoding genes in the unannotated    genes, of which 32 were predicted to encode microRNAs from 19 microRNA families (Supplementary Table S1). Interestingly, we found that 3,457 protein coding genes might be specific to loranthus according to the CPC label.

Gene Expression Profile and Differential Expression
Analysis. After lowly expressed genes (TPM < 5) were filtered, we identified 14,295, 15,921, and 16,402 genes in CK, FB, and FD, respectively, and found 12,888 genes commonly detected in all three samples. Next, we performed DEG analysis to identify genes involved in the loranthus haustoria development. Using edgeR, we identified 3,749 and 4,139 DEGs in FB and FD, respectively, compared to CK (Supplementary Table S2). Among these DEGs, 1,543 upregulated and 1,086 downregulated genes were common to FB and FD. Pathway analysis showed that metabolism and environmental adaptation pathways were common to FB and FD, "amino sugar and nucleotide sugar metabolism" (ko00520) specific to FB and "mineral absorption" (ko04978) was specific to FD. GO enrichment analysis identified that "regulation of flower development" (GO:0009909), "cell tip growth" (GO:0009932), and "glycerol ether metabolic process" (GO:0006662) were shared by FB and FD DEGs; the top three biological processes specific to FB were "protein phosphorylation" (83 genes, GO:0006468), "single-organism cellular process" (72 genes, GO:0044763) and "defense response" (33 genes, GO:0006952); and "cellular metabolic process" (39 genes, GO:0044237), "cellular macromolecule metabolic process" (26 genes, GO:0044260), and "proteolysis" (20 genes, GO:0006508) were the top three biological processes specific to FD.

Ribosomal Protein.
Among the 2,576 RP genes, 255 were differentially expressed in the loranthus haustoria development (Figure 3(a)). In details, 7 and 5 RP genes (2 shared) were downregulated in FB and FD, respectively, compared to CK. Interestingly, CK 123 and 200 RP genes were upregulated in FB and FD, respectively, of which 79 were shared. Furthermore, out of the 121 upregulated RP genes exclusively in FD, 116 were downregulated in FB relative to FD. The expression profiles of RP genes in FB and FD indicate that they might have different functions in the development and formation of loranthus haustoria.

Transcription
Factor. We identified 863 TF genes in the loranthus haustoria, of which 174 were dysregulated in the developmental process. We found that most of the dysregulated TFs were shared (Figure 3(b)) by FB and FD. Next, we analyzed the expression changes of some TF subfamilies ( Table 2, Figure 3(c)), including ethylene-responsive (ER), MYB, WRKY, and bHLH. Compared to CK, 9 ER, 5 MYB, 6 WRKY, and 11 bHLH TF genes were upregulated in both FB and FD (Figure 3(c)). Some TF genes were specifically upregulated in FB or FD. were also downregulated in FD and the other 6 were increased but had no significance in FD, compared to CK. While out of the 38 upregulated ubiquitin genes in FB compared to CK, 26 were upregulated in FD as well (Figure 3(d)). It is notable that 6 ubiquitin genes were increased along with the haustoria developmental process, including 3 polyubiquitin, 2 ubiquitin-40S RP, and 1 E3 ubiquitin-protein ligase genes.

Disease Resistance Protein.
We assembled 226 genes encoding DRPs in the loranthus haustoria, of which 94 were dysregulated ( Table 2). It is notable that most of the DRP genes were upregulated. Figure 3(e) reveals 87 (out of 94 DEGs) were upregulated in FB and FD, of which 51 were shared. We only identified 15 DEGs (7 upregulated and 8 downregulated) in FD relative to FB (Supplementary Table S2). The upregulation of ubiquitin genes (Figure 3(e)) revealed that they might be functional in the loranthus haustoria development.

miRNA Host Genes and Their Target Genes.
We identified 32 miRNA host genes (Supplementary Table S1) and 4 (miR156c, miR156d, miR166d, and miR396a) were dysregulated in the process (Figure 4(a)). We next predicted the target genes for these miRNAs and found that they had no common target genes except miR156a and miR156c (Figure 4(b)). The dysregulation of miRNA host genes might explain the change of their target genes, such as TRINITY_DN3353_c2_g1, TRINITY_DN2184_c0_g1, Ethylene-responsive  (Figure 4(c)).

Discussion
Some studies have shown TFs' function in both parasitic plants and their hosts during the infection. For example, the upregulation of AtWRKY is important for the seeding site establishment of plant-parasitic nematodes [25]. Nearly one-half of the mobile mRNAs transferred from tomato or pumpkin to their parasitic plant Cuscuta pentagona were regulatory genes such as TFs and calmodulin proteins [26]. These evidences suggest that endogenous or exogenous TFs are important for the interaction of parasitic plants. We identified the dysregulation of bHLH, ER, MYB, and WRKY TFs (Table 2, Figure 3(c)), which may function in the formation and development of endosperm chalazal haustorium in Taxillus chinensis. These TFs have been reported to be inducible by the various environmental stresses, such as cold, drought, pathogen infection, and wounding, and be functional in the plant defense [27]. In parasitic plants, RP genes might play a key role in the survival and development. During the evolutional of Epifagus virginiana, although some RP genes are deleted, the E. virginiana plastid genomes are still transcribed and translated due to the fulfilled function by the nuclear components [28]. In addition, RPs have shown higher level of accumulation in resistant sunflower plants after the sunflower broomrape infection [29]. We found 258 out of 2,576 RP genes differentially expressed during the loranthus haustoria developmental and most are upregulated (Figure 3

BioMed Research International
Globodera rostochiensis can promote successful plant parasitism through suppressing the plant's defense through the suppression of plant immunity and can further generate within root tissue the feeding cells essential for nematode development [30]. Rhiannon reported that E2 and E3 ubiquitin proteins secreted by the parasitic nematode Trichinella spiralis have the capacity of modifying the host skeletal muscle cells [31]. In this study, we identified 66, 176, and 540 genes encoding E1, E2, and E3 ubiquitin enzymes, respectively. Among them, 81 were differentially expressed (Table 2, Figure 3(d)), including 8 E2 and 29 E3 ubiquitin genes (Supplementary Table S2). Based on these evidences, we assume that the secretion of ubiquitin genes and proteins by loranthus has positive efforts in the parasitism. While further experiments are required to study the functions of ubiquitin genes and proteins in the parasitism of loranthus.
A recent study reported that Arabidopsis thaliana mRNAs are targeted by miRNAs produced by Cuscuta campestris during the parasitism, resulting in mRNA cleavage, secondary siRNA production, and decreased mRNA accumulation [11]. Here, we predicted 32 miRNA host genes in the loranthus haustoria (Supplementary Table S1) and identified the dysregulation of miR156c, miR156d, miR166d, and miR396a ( Figure 4). Due to the limited information of loranthus genes, we only found a few genes targeted by these four miRNAs (Figure 4). Further experiments are required to identify the mature miRNA sequences and their function in the loranthus haustoria development.

Conclusions
In conclusion, we studied the transcriptome profiles of the loranthus haustoria development. We assembled 160,571 loranthus genes and annotated them by aligning them to NR, GO, KEGG, UniProt/Swiss-Prot, Pfam, and EggNOG databases. After lowly expressed genes were filtered, we identified 18,360 genes in the loranthus haustoria, of which 3,749 and 4,139 were dysregulated in FB and FD, respectively, compared to CK. Some important gene families were found to be related to the loranthus haustoria development, such as transcription factor, ubiquitin, ribosomal protein, and diseaseresistant protein. Further, 32 miRNA host genes were identified and the dysregulation of 4 miRNA host genes might be one of the reasons for some genes which are dysregulated as well in the process. This is the first time to report the transcriptome of loranthus haustoria. It will provide valuable resources to other studies. More importantly, the findings of this study will improve our understanding of parasitism and contribute to the breeding program of loranthus.

Data Availability
The raw sequencing data can be accessed from the NCBI Sequence Read Archive (SRA) platform (https://trace.ncbi .nlm.nih.gov/Traces/sra/) under the accession number SRA896707. The assembled transcriptome of loranthus haustoria can be accessed in the TSA database of NCBI under the accession number GHNL00000000.