A Comparative Analyses of the Complete Mitochondrial Genomes of Fungal Endosymbionts in Sogatella furcifera, White-Backed Planthoppers

Sogatella furcifera Horvath, commonly known as the white-backed planthopper (WBPH), is an important pest in East Asian rice fields. Fungal endosymbiosis is widespread among planthoppers in the infraorder Fulgoromorpha and suborder Auchenorrhyncha. We successfully obtained the complete mitogenomes of five WBPH fungal endosymbionts, belonging to the Ophiocordycipitaceae family, from next-generation sequencing (NGS) reads obtained from S. furcifera samples. These five mitogenomes range in length from 55,390 bp to 55,406 bp, which is shorter than the mitogenome of the fungal endosymbiont found in Ricania speculum, the black planthopper. Twenty-eight protein-coding genes (PCGs), 12 tRNAs, and 2 rRNAs were found in the mitogenomes. Two single-nucleotide polymorphisms, two insertions, and three deletions were identified among the five mitogenomes, fewer than the intraspecific variations found in four species of Ophiocordycipitaceae: Ophiocordyceps sinensis, Hirsutella thompsonii, Hirsutella rhossiliensis, and Tolypocladium inflatum. Noticeably short simple sequence repeats (up to 18 bp) were identified in the five WBPH fungal endosymbiont mitogenomes. Phylogenetic analysis based on conserved PCGs across 25 Ophiocordycipitaceae mitogenomes revealed that the five mitogenomes clustered with that of the R. speculum endosymbiont, forming an independent clade. In addition to providing the full mitogenome sequences, obtaining complete mitogenomes of WBPH endosymbionts from host NGS reads can provide insights into their phylogenetic positions without the need to isolate mtDNA from the host. This advantage is of value to future studies involving fungal endosymbiont mitogenomes.


Introduction
Sogatella furcifera Horvath, commonly known as the white-backed planthopper (WBPH), is a planthopper belonging to the infraorder Fulgoromorpha [1] and suborder Auchenorrhyncha [2]. It has migrated from subtropical regions to temperate climates and become a major pest in rice fields across East Asia [3-6]. In particular, its migration from China to Japan via the Korean Peninsula highlights the extent of its spread across the region [7]. Sogatella furcifera has already been registered in the National Species List of Korea [8], indicating that this species is frequently found within the country. It damages rice plants by feeding directly on them, producing a characteristic symptom known as hopper burn [9]. Because WBPH is an important threat to agriculture, both the mitochondrial genome (mitogenome) and the whole genome of S. furcifera have been sequenced [10,11]. The fundamental background of WBPH genomic research is, therefore, well established. For example, the complete genome sequence of the Cardinium bacterial endosymbiont of S. furcifera was assembled from the same raw reads generated by the whole genome project [12]. Another bacterial endosymbiont of WBPH, Wolbachia, which alters host reproduction in arthropods through parthenogenesis, feminization, male-killing, and induction of cytoplasmic incompatibility [13], also causes cytoplasmic incompatibility in WBPH together with the Cardinium endosymbiont [14]. Besides these bacterial endosymbionts, a fungal endosymbiont has been identified by PCR in the planthopper Ricania japonica [15]. This yeast-like endosymbiont uses the enzyme uricase to recycle uric acid secreted by the host species, assisting in metabolic processes [15]. In addition, yeast-like symbionts have been identified in Nilaparvata lugens, the brown planthopper [16,17], which also support the host's uric acid metabolism [18].
However, no sequence information was available for these fungal endosymbionts until a complete fungal mitogenome was obtained from the raw reads of R. speculum [19]. This mitogenome was identified as belonging to an Ophiocordycipitaceae species by comparing it with several known complete mitogenomes in this family [19]. This result suggests that next-generation sequencing technology, which provides a large number of short reads, can be used to provide evidence for the existence of endosymbiont species using DNA extracted from insect species. These results are comparable to those of previous studies that have successfully identified multiple complete organelle or bacterial genomes from one NGS library [12, 19-37].
Here, we report the first complete mitogenomes of the fungal WBPH endosymbiont, obtained from five WBPH samples isolated in Korea and China. The five mitogenomes range from 55,390 to 55,406 bp in length, shorter than that of the R. speculum fungal endosymbiont [19]. Intraspecific variations among the five mitogenomes are fewer than those of the four Ophiocordycipitaceae species. Phylogenetic analysis based on conserved PCGs across Ophiocordycipitaceae mitogenomes shows that the five mitogenomes clustered with that of R. speculum, forming an independent clade. Once additional planthopper fungal endosymbiont mitogenomes become available, their phylogenetic relationships as well as evolutionary histories based on their complete mitogenomes will become clearer.
[38] after filtering raw reads using Trimmomatic v0.33 [39]. After obtaining mitogenome contig sequences under the condition that sequence coverage is more than 60x, gaps were filled with GapCloser v1.12 [40], and all bases of the assembled sequences were confirmed by checking each base in the alignment (tview mode in SAMtools v1.9 [41]) against the assembled mitogenome generated with BWA v0.7.17 [42]. The circular form of the mitogenomes was confirmed by paired-end reads connecting both ends of each mitogenome. All these bioinformatic analyses were conducted in the environment of the Genome Information System (GeIS; http://geis.infoboss.co.kr/), as in previous mitogenome studies [19, 21-24, 26, 28, 30, 32, 33, 36, 43-91].
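The coverage condition described above can be illustrated with a minimal Python sketch. It assumes that per-base depths are available as tab-separated lines of contig name, position, and depth (the three-column layout produced by samtools depth); the function names and the toy input are hypothetical and are not taken from the authors' pipeline, which is not described in that detail here.

```python
from collections import defaultdict

MIN_COVERAGE = 60.0  # threshold stated in the text ("more than 60x")


def mean_depth_per_contig(depth_lines):
    """Return {contig: mean depth} from 'contig<TAB>position<TAB>depth' lines.

    Positions that are missing from the input are simply not counted.
    """
    totals = defaultdict(int)
    counts = defaultdict(int)
    for line in depth_lines:
        line = line.strip()
        if not line:
            continue
        contig, _position, depth = line.split("\t")
        totals[contig] += int(depth)
        counts[contig] += 1
    return {contig: totals[contig] / counts[contig] for contig in totals}


def contigs_passing(depth_lines, min_coverage=MIN_COVERAGE):
    """Return the sorted names of contigs whose mean depth exceeds min_coverage."""
    means = mean_depth_per_contig(depth_lines)
    return sorted(c for c, mean in means.items() if mean > min_coverage)


# Toy input (hypothetical values): contig_A averages 90x, contig_B averages 25x.
example = [
    "contig_A\t1\t80",
    "contig_A\t2\t100",
    "contig_B\t1\t20",
    "contig_B\t2\t30",
]
print(contigs_passing(example))  # ['contig_A']
```

Mean depth is only one possible reading of "sequence coverage"; a stricter filter could instead require every position to exceed the threshold.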

Complete Mitogenome of Fungal WBPH Endosymbionts.
We successfully assembled fungal endosymbiont mitogenomes from four WBPH samples isolated in Korea and China and one public dataset of NGS raw reads (Table 1). These are the first WBPH fungal endosymbiont mitogenomes identified. Their lengths ranged from 55,390 bp to 55,406 bp (Table 1), which is shorter than that of the R. speculum fungal endosymbiont (66,785 bp) [19]. These mitogenomes contained 28 protein-coding genes (PCGs), 12 tRNAs, and 2 rRNAs (Table 2). Some of the PCGs were LAGLIDADG endonucleases, which are usually found in intronic regions of various fungal mitogenomes and contribute to the expansion of their length [19, 55, 108-112]. In comparison to the previously sequenced mitogenome of the fungal endosymbiont of R. speculum, slightly fewer PCGs and tRNAs were found in the WBPH endosymbiont mitogenomes. There were three fewer PCGs as the net result of three differences: a smaller number of LAGLIDADG endonucleases, the absence of one endonuclease and a GIY-YIG endonuclease, and the presence of two additional PCGs, a hypothetical protein and a LAGLIDADG/HNH endonuclease. Such differences in PCG configuration are commonly observed in other fungal mitogenomes; for example, two mitogenomes of Fusarium oxysporum (GenBank accessions MN259514 and MN259515) display two completely different PCGs in each mitogenome [54,56]. There are also five fewer tRNAs because of the different configurations: tRNA-Asp, tRNA-Cys, tRNA-Ile, and two tRNA-Ser (also found in the mitogenome of the fungal symbiont of R. speculum [19]). This difference in tRNA configuration between the two fungal symbionts suggests that tRNA configuration may not be critical, because essential tRNAs absent from the fungal mitogenome can be supplied by the nuclear genome [113].
Several PCGs in fungal mitogenomes have been invaded by introns multiple times. For example, COX1 contains three introns and COB contains five introns in the Hirsutella thompsonii mitogenome [114]. This phenomenon contributes to increased fungal mitogenome length: the mitogenomes of Aspergillus pseudoglaucus and Aspergillus egyptiacus are longer than the other Aspergillus mitogenomes because of the presence of many introns in major PCGs [55,115]. The fungal mitogenomes examined in this study also contain many introns in PCGs, including COB, COX1, NAD1, ATP8, COX3, COX2, and NAD2 (Figure 1), which, together with endonucleases, is a major reason for the expansion of fungal mitogenomes.
The gene order of the WBPH and R. speculum fungal symbiont mitogenomes was the same when PCGs (excluding endonucleases) and rRNAs are considered. However, the intron structures of COX1, COX2, NAD2, NAD3, NAD5, and the ATP synthase F0 subunit differ between the two mitogenomes (Figure 2). The intron structures of NAD5 and NAD2 show a reduction in the number of exons via removal of intron regions in the WBPH fungal endosymbiont mitogenome (Figures 2(a) and 2(d)), whereas those of COX2, NAD3, and the ATP synthase F0 subunit display insertions of one intron in the WBPH fungal endosymbiont mitogenome (Figures 2(b), 2(c), and 2(e)). This indicates that the reduction in the total length of the WBPH fungal symbiont mitogenome is not primarily caused by a reduction in the number of exons, unlike in Aspergillus mitogenomes [55,116]. In addition, COX1, which contains the largest number of exons in these mitogenomes, has lost the sixth and seventh exons of the R. speculum fungal endosymbiont mitogenome in the mitogenome of the WBPH endosymbiont (Figure 2(f)). However, the total length of COX1, including introns, is 1 kb longer in the WBPH fungal endosymbionts than in the R. speculum fungal endosymbiont (Figure 2(f)), reflecting complex events that occurred during the evolution of both mitogenomes. Additional studies are required to identify the correct exons of the COX1 gene of this fungal endosymbiont. For example, alignment of RNA-Seq raw reads against this mitogenome could reveal its expressed regions.
Once more fungal symbiont mitogenomes are available, patterns of presence and absence of tRNAs, additional endonucleases, and intron structures of PCGs in endosymbiont mitogenomes will elucidate a detailed evolutionary history of these genes.

Identification of Intraspecific Variations on Fungal WBPH Endosymbiont Mitogenomes.
We identified two SNPs, three insertions, and two deletions via multiple sequence alignments of the five fungal mitogenomes (Table 3). One of the two SNPs was identified in KR.5D WBPH and changes leucine (L) to glutamine (Q) in the ATP synthase F0 subunit (Table 3). One 10 bp insertion in the intergenic space was found in KR.1D WBPH, while the remaining two insertions and all three deletions were 1 to 3 bp in length (Table 3).
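As an illustration of how SNPs, insertions, and deletions can be read off an alignment, the following Python sketch compares two gapped sequences of equal length. It is a simplified pairwise version written for this explanation, not the authors' procedure; the function name and the toy sequences are hypothetical, and consecutive gap columns are merged into a single insertion or deletion event.

```python
def classify_variants(ref_aln, qry_aln, gap="-"):
    """Classify differences between two aligned (gapped) sequences.

    Returns a list of (kind, reference_position, detail) tuples, where
    reference_position is 1-based on the ungapped reference. For an
    insertion it is the reference base immediately before the inserted bases.
    """
    if len(ref_aln) != len(qry_aln):
        raise ValueError("aligned sequences must have equal length")
    variants = []
    ref_pos = 0  # number of reference bases consumed so far
    i, n = 0, len(ref_aln)
    while i < n:
        r, q = ref_aln[i], qry_aln[i]
        if r == gap and q == gap:
            i += 1  # column with gaps in both sequences carries no information
        elif r == gap:
            # Bases present only in the query: one insertion event per gap run.
            j = i
            while j < n and ref_aln[j] == gap and qry_aln[j] != gap:
                j += 1
            variants.append(("insertion", ref_pos, qry_aln[i:j]))
            i = j
        elif q == gap:
            # Bases present only in the reference: one deletion event per gap run.
            j = i
            while j < n and qry_aln[j] == gap and ref_aln[j] != gap:
                j += 1
            variants.append(("deletion", ref_pos + 1, ref_aln[i:j]))
            ref_pos += j - i
            i = j
        else:
            ref_pos += 1
            if r.upper() != q.upper():
                variants.append(("SNP", ref_pos, f"{r}>{q}"))
            i += 1
    return variants


# Toy alignment (hypothetical sequences): one SNP, one 1 bp insertion, one 2 bp deletion.
reference = "ACGT-ACGTA"
query = "ACTTGAC--A"
for variant in classify_variants(reference, query):
    print(variant)
```

With five mitogenomes, the same comparison could be run for each sequence against a chosen reference row of the multiple alignment.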
The proportions of these intraspecific SNPs, insertions, and deletions in these fungal mitogenomes were 0.0036%, 0.020%, and 0.012%, respectively. The proportions of insertions and deletions were higher than that of SNPs. Interestingly, there is geographical variation in the fungal symbiont mitogenomes. The endosymbiont mitogenomes from the sample used for whole genome sequencing (WGS) and from the KR.11D isolate were identical to that of KR, while the other three WBPHs captured in other locations in Korea displayed intraspecific variations. The sample used for WGS originated from the University of Science and Technology of China (Anhui province, China), suggesting that the KR.11D and KR WBPH samples obtained in Korea may have migrated from a region similar to that of the WGS sample. However, further analyses of their complete mitogenomes or whole genomes will be needed to provide more supportive data for identifying their origins.
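The SNP proportion reported above can be reproduced with a short Python calculation. This is a sketch under one assumption: the denominator is taken to be a single mitogenome length (55,390 bp or 55,406 bp), since the text does not state whether the alignment length or a mitogenome length was used. The function name is hypothetical.

```python
def variant_proportion(variant_bp, genome_bp):
    """Return the share of the genome affected by variants, in percent."""
    if genome_bp <= 0:
        raise ValueError("genome length must be positive")
    return 100.0 * variant_bp / genome_bp


# Two SNPs (2 bp) against the shortest and longest mitogenome lengths in the text.
for length in (55_390, 55_406):
    print(length, f"{variant_proportion(2, length):.4f}%")
```

Both denominators round to the same value, so this level of precision does not distinguish between them.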
A relatively small number of intraspecific SNPs and INDELs was identified in these fungal mitogenomes in comparison to those of other fungal mitogenomes [56]. They are also fewer than those identified in insect mitogenomes [10, 22, 23, 43, 45-51]. Among the 25 available complete fungal mitogenomes in Ophiocordycipitaceae, four species, Ophiocordyceps sinensis, Hirsutella thompsonii, Hirsutella rhossiliensis, and Tolypocladium inflatum, are represented by more than one complete mitogenome (Table 4). We investigated intraspecific variations in the mitogenomes of these four species (Table 5). There are significantly more INDELs than SNPs in the four fungal species, the same trend as that observed in the mitogenomes of the WBPH fungal endosymbiont, although the absolute amounts differ. Moreover, there were at least three times more SNPs and INDELs in these fungal mitogenomes than in those of the fungal symbiont of WBPH. This phenomenon can be explained by two major factors: first, the geographical distribution or genetic background of the WBPH samples is relatively limited in comparison to those of the four fungal species, and second, the surroundings of fungal endosymbionts are less dynamic than those of free-living fungal species, resulting in low selection pressure from the environment. This second factor is supported by two studies: first, the bacterial genome of the aphid endosymbiont Buchnera aphidicola (Aphis gossypii) displays a low level of intraspecific variation in comparison to that of the host mitogenome (Bae et al., under revision), and second, the whole genome of the endosymbiont of Pediculus humanus capitis also shows a low level of intraspecific variation in comparison to that of the host whole genomes [117].

Identification and Comparative Analysis of Simple Sequence Repeats on the Five WBPH Fungal Endosymbiont Mitogenomes.
Simple sequence repeats (SSRs) identified from organellar genomes have been utilized as molecular markers in various species, such as plant species [99, 118-122], suggesting that SSRs on fungal endosymbiont mitogenomes can be used as molecular markers to identify the geographical origins of WBPH. In total, 23 normal and 6 extended SSRs were identified in each fungal endosymbiont mitogenome (Figure 3(b)), with the exception of the fungal endosymbiont mitogenome of WBPH KR.1D, which displays 24 normal and 6 extended SSRs (Table 6). The fungal endosymbiont mitogenome of WBPH KR.1D has one more monoSSR (Table 6), with a unit sequence of C and a length of 15 bp, caused by one insertion (Table 3). In addition, 140 potential SSRs were identified in the five mitogenomes (Table 6). The SSRs identified in the mitogenomes were distributed evenly (Figure 3(a)), suggesting that there is no hot spot of SSRs in these fungal mitogenomes.
The identified SSRs are relatively short (a maximum length of 18 bp; Figure 4(a)) in comparison to those of other fungal species in the same family, such as Ophiocordyceps sinensis (up to 24 bp) [123], as well as fungal species in other families, such as Pestalotiopsis fici (up to 45 bp) [124]. Moreover, the maximum length of SSRs identified in the mitogenome of the R. speculum fungal endosymbiont (NC_049089) [19] was also 18 bp, suggesting that this short SSR length may be linked to the evolution of endosymbiont mitogenomes.
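To make the notion of an SSR concrete, the following Python sketch finds perfect tandem repeats with a regular expression. The minimum repeat counts in MIN_REPEATS are hypothetical, because the thresholds used to call normal, extended, and potential SSRs are not stated here; the function name and the toy sequence are likewise illustrative only.

```python
import re

# Hypothetical minimum repeat counts per unit length (1 to 6 bp); the text
# does not state the thresholds used to call SSRs.
MIN_REPEATS = {1: 10, 2: 5, 3: 4, 4: 3, 5: 3, 6: 3}


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, unit, repeat_count, length) tuples, 1-based start."""
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in sorted(min_repeats.items()):
        # A unit of unit_len bases followed by at least (min_rep - 1) copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for match in pattern.finditer(seq):
            unit = match.group(1)
            # Skip units that are themselves repeats of a shorter motif
            # (for example "CC" or "ATAT"); those runs belong to the shorter unit.
            if any(
                unit == unit[:k] * (unit_len // k)
                for k in range(1, unit_len)
                if unit_len % k == 0
            ):
                continue
            total = len(match.group(0))
            hits.append((match.start() + 1, unit, total // unit_len, total))
    return sorted(hits)


# Toy sequence (hypothetical): a 15 bp mono-C run and a 12 bp AT dinucleotide run.
toy = "GATTACA" + "C" * 15 + "GGC" + "AT" * 6 + "GCA"
for ssr in find_ssrs(toy):
    print(ssr)
```

The 15 bp run of C in the toy sequence mirrors the kind of monoSSR described for WBPH KR.1D, although the coordinates here are invented.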
Out of 191 normal SSRs, extended SSRs, and potential SSRs, 84 (43.98%) are located in the genic region (genic and intronic ORF categories in Figure 4(b); Table 7). The intronic ORF category refers to PCGs located within the introns of other PCGs, most of which are LAGLIDADG endonucleases (Table 2). Nearly half of the SSRs are in PCGs, which are conserved in comparison to intron and intergenic regions, indicating that these SSRs can be utilized to distinguish taxa at the species level or at higher ranks. In the intronic region, there were 61 SSRs (31.94%), whereas only 24 SSRs (12.57%) were found in the intergenic region (Figure 4(b); Table 7). These SSRs are located in relatively nonconserved regions in comparison to PCG regions, suggesting that they can be used to distinguish intraspecific groups, such as populations or geographical origins. Once more endosymbiont mitogenomes are available, these SSRs can be evaluated for their use in identifying species and their geographical origins as well as the evolutionary history of their mitogenomes.
In the genic region, 84 SSRs were distributed in 24 different genes consisting of 21 PCGs, 2 rRNAs, and 1 tRNA (Figure 4(c); Table 7). The large subunit RNA contained the most SSRs, and the genes COX1, COX3, NAD3, two LAGLIDADG endonucleases, intron-encoded nuclease aI1, hypothetical protein, and tRNA-Glu contained the fewest (Figure 4(c); Table 7). Considering the length of these genes, some, including the large subunit RNA, NAD2, LAGLIDADG endonuclease (QPC56057.1), NAD1, NAD6, ATP synthase F0 subunit a, and LAGLIDADG/HNH endonuclease, displayed a relatively large number of SSRs (Figure 4(c); Table 7). Meanwhile, the remaining genes have a relatively low number of SSRs. This uneven distribution of SSRs among PCGs can be another useful characteristic for developing efficient molecular markers. In addition, SSRs in PCGs are known to affect the functions of those PCGs, especially for adaptation to environmental factors in fungi [125-127], suggesting that these SSRs can also affect the functions of mitochondrial PCGs.

Phylogenetic Analysis of 25 Fungal Mitogenomes of Ophiocordycipitaceae.
We constructed bootstrapped maximum-likelihood (ML) and Bayesian inference (BI) phylogenetic trees using 26 fungal mitogenomes: 25 mitogenomes in the Ophiocordycipitaceae family, including the 5 mitogenomes obtained in this study, and 1 outgroup species (Fusarium graminearum) [128]. Due to the incomplete annotation of the Ophiocordyceps sinensis fungal mitogenome (KP835313), five intron-containing PCGs, NAD5, COB, COX1, NAD1, and NAD4, are not correctly annotated. Therefore, only five conserved PCGs, ATP8, COX2, NAD2, NAD3, and NAD4L, were selected and aligned individually. Subsequently, these alignments were concatenated to construct three phylogenetic trees. The five fungal endosymbiont mitogenomes of WBPH clustered with the fungal symbiont mitogenome of R. speculum (NC_049089) [19] with high support values (Figure 5). This indicates taxonomic similarity between the R. speculum endosymbiont and the five WBPH endosymbionts, suggesting that other fungal endosymbionts may also cluster independently from the other fungal species in the same family, Ophiocordycipitaceae. In terms of evolution, this can be explained by two hypotheses: (i) independent evolution once this endosymbiont entered the host insect species or (ii) independent taxonomic groups of Ophiocordycipitaceae entering host insect species multiple times during evolution. To determine which hypothesis is more likely, we would need more endosymbiont mitogenomes from various host insect species of the infraorder Fulgoromorpha and suborder Auchenorrhyncha, as well as mitogenomes from neighboring fungal species that are not insect endosymbionts.
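The concatenation of individually aligned genes described above can be sketched in Python as follows. This is an illustration only: the data layout (a dictionary of per-gene alignments keyed by taxon), the function name, and the toy sequences are assumptions made for this example, and the tools actually used for alignment and tree construction are not shown.

```python
def concatenate_alignments(gene_alignments, gene_order, gap="-"):
    """Join per-gene alignments into one supermatrix.

    gene_alignments: {gene: {taxon: aligned_sequence}}
    Returns ({taxon: concatenated_sequence}, [(gene, start, end), ...]) with
    1-based, inclusive partition coordinates. A taxon missing from a gene
    alignment is padded with gap characters for that gene.
    """
    taxa = sorted({taxon for aln in gene_alignments.values() for taxon in aln})
    pieces = {taxon: [] for taxon in taxa}
    partitions = []
    start = 1
    for gene in gene_order:
        aln = gene_alignments[gene]
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) != 1:
            raise ValueError(f"{gene}: sequences are not aligned to equal length")
        length = lengths.pop()
        for taxon in taxa:
            pieces[taxon].append(aln.get(taxon, gap * length))
        partitions.append((gene, start, start + length - 1))
        start += length
    matrix = {taxon: "".join(parts) for taxon, parts in pieces.items()}
    return matrix, partitions


# Toy data (hypothetical sequences) for two of the five genes named in the text.
toy_alignments = {
    "ATP8": {"taxon_A": "ATG-A", "taxon_B": "ATGCA"},
    "COX2": {"taxon_A": "TTT"},
}
matrix, partitions = concatenate_alignments(toy_alignments, ["ATP8", "COX2"])
print(matrix)
print(partitions)
```

Keeping the partition coordinates makes it possible to assign a separate substitution model to each gene in downstream ML or BI analyses, should that be desired.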
The four fungal species used to investigate intraspecific variations in mitogenomes, Hirsutella thompsonii, Hirsutella rhossiliensis, Ophiocordyceps sinensis, and Tolypocladium inflatum, also form well-defined clades covering all mitogenomes of each species with high support values (Figure 5). The three mitogenomes of Ophiocordyceps sinensis clustered with the longest branch length among the four species, and those of Hirsutella thompsonii had the second longest (Figure 5). These branch lengths were not proportional to the ratio of SNPs and INDELs (Table 4). The topology of the Tolypocladium genus was not congruent between the ML and BI trees and had low bootstrap values (Figure 5), indicating that additional conserved gene sequences are required to resolve this clade properly.

Conclusions
We successfully elucidated five complete mitogenomes of the fungal endosymbiont of WBPH from various sources of NGS raw reads obtained from WBPH samples. These five complete mitogenomes show both shared and unique characteristics in comparison to the previously elucidated complete mitogenome of the R. speculum fungal endosymbiont [19]. There were fewer intraspecific variations in the five WBPH endosymbiont mitogenomes than in the four Ophiocordycipitaceae fungal species, Ophiocordyceps sinensis, Hirsutella thompsonii, Hirsutella rhossiliensis, and Tolypocladium inflatum. This can be explained by the narrow geographical distribution and/or genetic background of the samples and the low selection pressure on endosymbionts. We identified 191 SSRs in each WBPH fungal symbiont complete mitogenome, except for the WBPH_KR.1D mitogenome, which presented an additional SSR. These SSRs are relatively short (a maximum length of 18 bp) compared to those of other fungal mitogenomes. Nearly half of the SSRs are in the genic region, suggesting that these SSRs may be more conserved and may affect the functionality of PCGs. Based on the phylogenetic trees of 5 conserved PCGs from 26 fungal mitogenomes, including one outgroup species, the WBPH fungal endosymbiont mitogenomes clustered with that of R. speculum with high support values. This suggests that these insect-hosted fungal endosymbionts have evolved independently from the other fungal species in the Ophiocordycipitaceae family. Owing to the advantage of NGS raw reads, which can capture sequences from unknown or unexpected organisms [12, 19-37], we successfully identified the complete mitogenomes of WBPH fungal endosymbionts within the NGS raw reads, suggesting that the phylogenetic positions of fungal symbionts can be understood with high resolution without the need to isolate the symbiont from the host.
Furthermore, our study shows that NGS raw reads of insects generated in the future can be used to pinpoint further fungal endosymbionts that have previously been difficult to identify. This method could provide novel insights into their phylogenetic positions as well as interactions with their host species.

Data Availability
The mitochondrial genome sequences used in this study can be accessed via accession numbers MW115131, MW373710, MW373711, MW376862, and BK059186 in NCBI GenBank.

Conflicts of Interest
The authors declare that they have no competing interests.
International Journal of Genomics