The Complete Mitochondrial Genome of the American Palm Cixiid, Haplaxius crudus (Hemiptera: Cixiidae)

Background: Haplaxius crudus (the American palm cixiid) is a major insect pest of various economically important palms. H. crudus persists in tropical and subtropical regions, where it is known to transmit the lethal yellowing (LY) phytoplasma. It has also been implicated as the putative vector of lethal bronzing (LB), a destructive phytoplasma-induced palm disease affecting over 16 species of ornamental and agricultural palms. To date, no mitochondrial genomes have been sequenced for species in the family Cixiidae. Analysis of mitochondrial DNA sequences of H. crudus has proven useful for proper species diagnosis and population studies, which could benefit management programs aimed at moving infective insects. This study describes the first mitochondrial genome from the American palm cixiid, Haplaxius crudus, and the first from an insect in the family Cixiidae.

Results: In this study, the mitogenome was assembled and characterized from PacBio Sequel II long-read sequencing data using the University of Florida's HiPerGator supercomputer. The circular mitogenome of H. crudus is 15,845 bp long and encodes 37 mitochondrial genes (13 protein-coding genes (PCGs), 22 tRNAs, and 2 rRNAs) in addition to a putative non-coding internal control region. The nucleotide composition of the H. crudus mitogenome is asymmetric, with a bias toward A and T (44.8% A, 13.4% C, 8.5% G, and 33.3% T). The PCGs possess the standard invertebrate mitochondrial start codons with few exceptions, while the gene content and order of the H. crudus mitogenome are identical to those of most completely sequenced insect mitochondrial genomes. Phylogenetic analysis indicated that H. crudus is closely related to the planthopper Nilaparvata lugens in the family Delphacidae, which is the established sister group to Cixiidae.

Conclusions: Our study presents the first reference mitochondrial genome of Haplaxius crudus, providing structural analysis of the circular genome and encoded gene regions. The present results provide future opportunities to assess the diversity and origin of H. crudus.
This study demonstrates the significance


Background
The invention of high-throughput genome sequencing technologies has greatly altered the understanding of the biology, diversity, and relationships between insect vectors and their associated pathogens [1]. The mitochondrial genome in particular is the most commonly used molecular marker for phylogenetic studies and for assessing population dynamics, and is therefore an important component of next-generation sequencing [2]. The insect mitochondrial genome is compact, generally spanning 14 to 20 kilobases (kb), with a set of encoded genes that is extremely conserved [10]. The animal mitochondrial genome typically encodes 13 protein-coding genes (PCGs), 2 ribosomal RNAs (rRNAs), and 22 transfer RNAs (tRNAs), together with a non-coding control region [3]. In addition, the specific gene order within mitochondrial DNA, which can vary among insects, is a key feature that can provide important evidence for establishing evolutionary relationships among taxa at both high and low taxonomic levels [7,8,9].
As of April 2020, 5,178 complete or nearly complete insect mitochondrial genome sequences were available on GenBank, and the number continues to grow as sequencing technologies become more cost-effective and time-effective. However, fewer complete mitochondrial genomes of Hemipteran insects, specifically auchenorrhynchans, have been sequenced or published in GenBank (https://www.ncbi.nlm.nih.gov/).
Hemipterans are one of the largest groups of hemimetabolous insects [14] and include three suborders: Auchenorrhyncha, Sternorrhyncha, and Heteroptera [11,12]. The Auchenorrhyncha and Sternorrhyncha are well suited to transmit plant pathogens because of the morphology of their piercing-sucking mouthparts and their feeding behavior [13]. Within Hemiptera, notable insect pests are planthoppers, leafhoppers, aphids, and whiteflies. The planthopper family Cixiidae comprises more than 2,000 species in 150 genera [29].
The American palm cixiid, Haplaxius crudus (Fig. 1), is among the most important Auchenorrhynchan insect pests of palms, ranging from the subtropical United States to the tropical regions of Central and South America [21,22]. Because H. crudus is a confirmed vector of the palm disease termed lethal yellowing (LY), caused by the 16SrIV-A phytoplasma on the American continent [49,50], it is implicated as the putative vector of lethal bronzing (LB), a devastating palm disease caused by the lethal bronzing phytoplasma (16SrIV-D subgroup). Phytoplasmas are related to gram-positive bacteria and are obligate intracellular parasites of plants that are transmitted by phloem-feeding hemipteran insects, including leafhoppers, planthoppers, and psyllids [25,26]. The LY phytoplasma causes yellowing, wilting, and death of palms and has caused major outbreaks that resulted in the loss of millions of coconut palms (Cocos nucifera), as well as other palm species, throughout the Caribbean basin. Furthermore, H. crudus is widespread and abundant in the Southeastern United States.
In this study, we present the complete mitochondrial genome of Haplaxius crudus to investigate its mitogenomic structure, function, and phylogenetic relationships. The aims of this study are to (i) present a complete and annotated mitochondrial genome sequence of H. crudus, (ii) compare the mitochondrial genome of H. crudus with those of other related insects to identify common and novel characteristics, and (iii) serve as a baseline dataset for future population genetics studies that will help supplement area-wide management options.

Genome Size, Organization, and Structure
The assembled contig demonstrated that the mitochondrial genome of H. crudus is a circular DNA molecule 15,845 bp in length. The mitochondrial genome includes 37 genes: 13 PCGs, 22 tRNA genes, and 2 rRNA genes (Fig. 2, Table 1). The new sequence was submitted to GenBank under accession number MT385107. The major strand (α strand) carries 6 PCGs and 13 tRNAs, while the remaining genes are encoded on the minor strand (β strand). The AT-rich region of the mitogenome spans positions 14,540 to 15,845 and is located between rrnS and tRNA-Ile (Fig. 2). The nucleotide composition of the H. crudus mitochondrial DNA is A = 7,097 (44.8%), T = 5,279 (33.3%), G = 1,341 (8.5%), and C = 2,128 (13.4%) of the 15,845 nucleotides present. The genome organization follows the standard order of the ancestral insect mitochondrial genome plan (Fig. 3).
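The composition figures above can be checked, and summarized as AT content and strand skew, with a few lines of Python. The sketch below is illustrative only: the function name `composition` and the skew definitions, AT skew = (A - T)/(A + T) and GC skew = (G - C)/(G + C), are assumptions of this example rather than part of the analysis pipeline described here, and the synthetic sequence merely reproduces the reported base counts.

```python
from collections import Counter


def composition(seq):
    """Summarize base counts, percentages, AT content, and AT/GC skew."""
    counts = Counter(seq.upper())
    a, c, g, t = (counts[base] for base in "ACGT")
    total = a + c + g + t
    if total == 0:
        raise ValueError("sequence contains no A, C, G, or T")
    return {
        "length": total,
        "percent": {b: 100.0 * n / total for b, n in zip("ACGT", (a, c, g, t))},
        "at_content": 100.0 * (a + t) / total,
        # Skew definitions are an assumption of this sketch.
        "at_skew": (a - t) / (a + t),
        "gc_skew": (g - c) / (g + c),
    }


# Synthetic sequence with the base counts reported for H. crudus (not the real sequence).
reported = "A" * 7097 + "T" * 5279 + "G" * 1341 + "C" * 2128
summary = composition(reported)
print(summary["length"])                 # 15845
print(round(summary["at_content"], 1))   # 78.1
print(round(summary["at_skew"], 3))      # 0.147
print(round(summary["gc_skew"], 3))      # -0.227
```

Under these definitions the reported counts give a positive AT skew and a negative GC skew on the strand that was sequenced. For a real analysis the sequence would be read from the GenBank record rather than synthesized from counts.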

Protein-Coding Genes
The mitochondrial DNA of H. crudus contains the full set of PCGs usually present in animal mitochondrial DNA. PCGs are arranged along the genome according to the standard order of insects (Fig. 3) [1]. The putative start codons of the PCGs are those previously known for animal mitochondrial DNA, i.e., ATG, ATT, ATA, ATC, GTG, TTG, and GTT (Table 1) [17]. The common start codon ATG could be assigned to most of the protein-coding sequences, with few exceptions. In multiple cases, coding units overlapped by 1 to 70 bp (Table 1). An overlap was observed between tRNA-Ile and tRNA-Gln, which are encoded in opposite directions and therefore produce separate transcripts [37,38]. As in other insect species, two pairs of protein-coding regions, ATP8/ATP6 and ND4/ND4L, overlap and are translated from the same cistronic mRNAs [37,38,39,40].
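Overlaps and intergenic spacers of the kind reported in Table 1 can be derived directly from annotation coordinates. The following is a minimal sketch, assuming features are given as (name, start, end) tuples with 1-based inclusive coordinates on a linearized genome; the function name `junctions` and the toy coordinates are hypothetical and are not taken from the H. crudus annotation.

```python
def junctions(features):
    """Return (left, right, gap) for each pair of adjacent features.

    gap > 0  : intergenic spacer of that many bp
    gap == 0 : features abut
    gap < 0  : features overlap by abs(gap) bp
    Coordinates are 1-based and inclusive; strand is ignored.
    """
    ordered = sorted(features, key=lambda f: f[1])
    result = []
    for (left, _, left_end), (right, right_start, _) in zip(ordered, ordered[1:]):
        result.append((left, right, right_start - left_end - 1))
    return result


# Hypothetical coordinates, for illustration only.
toy = [("trnI", 1, 65), ("trnQ", 63, 130), ("trnM", 135, 200)]
for left, right, gap in junctions(toy):
    kind = "overlap" if gap < 0 else "spacer"
    print(f"{left}/{right}: {kind} of {abs(gap)} bp")
```

The sketch does not handle the junction that wraps around the origin of a circular genome; that pair would need to be computed separately using the total genome length.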
In addition to the control region, we observed 18 non-coding regions ranging from 1 to 1,210 bp (Fig. 2, Table 1). The non-coding control region in the H. crudus mitochondrial genome extends over 1,129 bp and is located between rrnS and the Ile-Gln-Met tRNA cluster. The H. crudus mitochondrial genome sequence contains many unique TA-dinucleotide and TTA-trinucleotide repeats that resemble microsatellite sequences. No significant sequence similarity was found within the Haplaxius mitochondrial genome or to other published sequences.
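Runs of TA and TTA repeats such as those described above can be located with a simple regular-expression scan. This is a sketch under stated assumptions: the function name `find_tandem_repeats`, the minimum of four consecutive copies, and the toy sequence are all choices made for this example and do not reflect the thresholds or tools used in the study.

```python
import re


def find_tandem_repeats(seq, motifs=("TA", "TTA"), min_copies=4):
    """Find perfect tandem repeats of each motif.

    Returns a list of (motif, start, end, copies) sorted by start,
    with 1-based inclusive coordinates.
    """
    seq = seq.upper()
    hits = []
    for motif in motifs:
        # e.g. (?:TA){4,} matches four or more consecutive copies of TA
        pattern = re.compile(f"(?:{re.escape(motif)}){{{min_copies},}}")
        for match in pattern.finditer(seq):
            copies = (match.end() - match.start()) // len(motif)
            hits.append((motif, match.start() + 1, match.end(), copies))
    return sorted(hits, key=lambda hit: hit[1])


# Toy sequence, for illustration only.
toy = "GGC" + "TA" * 5 + "CCG" + "TTA" * 4 + "G"
for motif, start, end, copies in find_tandem_repeats(toy):
    print(f"({motif}){copies} at {start}-{end}")
```

Only perfect repeats are reported; interrupted or imperfect microsatellites would require a dedicated tandem-repeat tool.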

Phylogenetic Analysis
The phylogenetic analysis showed that Haplaxius crudus resolved with Nilaparvata lugens (Delphacidae) with strong bootstrap support (100) (Figure 4). There was also strong support (100) for Aphis aurantii (Aphididae) resolving near both H. crudus and N. lugens. In general, there is strong support (100) for each clade that comprises an order of insects: the Hemiptera clade, which includes H. crudus, N. lugens, A. aurantii, Dolycoris baccarum, and Magicicada tredecassini; the Coleoptera clade, which includes Sitophilus oryzae and Chauliognathus opacus; the Odonata clade, which includes Nannophya pygmaea; and the Diptera clade, which includes Drosophila melanogaster (Figure 4). Based on the pairwise comparison, N. lugens also shows the highest level of sequence similarity to H. crudus among the analyzed taxa, differing from H. crudus by 28.3% (Table 2). All other taxa differ from H. crudus by at least 30.7% (Table 2).
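The percent differences in Table 2 are p-distances, that is, the proportion of aligned sites at which two sequences differ. A minimal sketch of that calculation follows; it assumes the two sequences are already aligned to equal length and that sites with a gap or ambiguous base in either sequence are skipped (pairwise deletion). The function name `p_distance` and the gap-handling choice are assumptions of this example and may differ from the settings used in MEGA X.

```python
def p_distance(seq_a, seq_b, skip_chars="-N?"):
    """Proportion of compared aligned sites at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for x, y in zip(seq_a.upper(), seq_b.upper()):
        if x in skip_chars or y in skip_chars:
            continue  # pairwise deletion of gapped or ambiguous sites
        compared += 1
        if x != y:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


# Toy alignment, for illustration only: 2 differences over 8 comparable sites.
print(p_distance("ACGTACGT--", "ACGTTCGAAC"))  # 0.25
```

Multiplying the result by 100 gives a percent difference comparable in form to the values reported in Table 2.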

Discussion
We assembled the complete mitogenome of Haplaxius crudus using PacBio Sequel II SMRTbell™ sequencing technology. The H. crudus mitochondrial DNA demonstrated the typical Hemipteran gene order [8,9], which follows the ancestral gene order of insects [1]. Gene rearrangement is not uncommon in Hemipteran families such as flat bugs (Aradidae), aphids (Aphididae), and whiteflies (Aleyrodidae) [30,31]. The H. crudus mitochondrial genome is similar in gene content to those of the taxonomically related N. lugens and L. striatellus, but differs in size. However, both of these species have a unique gene order relative to H. crudus as well as to most other taxa used in the analysis. Cixiidae is accepted as the sister group to Delphacidae based on morphology and sequence homology. The length of the H. crudus mitochondrial genome falls within the range observed for most insect mitochondrial genomes and those of other arthropods [41]. The nucleotide composition of the H. crudus mitochondrial genome is AT-biased, which is generally observed in insect mitochondrial genomes [12]. Hemipteran insects from the suborders Fulgoromorpha, Coleorrhyncha, and Heteroptera are typically AC-skewed [42]. The control region in the H. crudus mitochondrial genome corresponds to the control region of vertebrate mitogenomes and contains the origin sites for transcription and replication [48]. This region corresponds to the transcriptional and replicational control region typical of insect mitochondrial genomes, also referred to as the AT-rich region. In addition to the control region, 18 non-coding regions ranging from 1 to 1,210 bp were observed in the H. crudus mitochondrial genome. This is not uncommon among insects, as the Adoxophyes mitochondrial genome has 26 non-coding intergenic regions [37]. In this insect, the control region is less likely to be variable than the coding regions of the mitochondrial genome because of its high AT content, which consequently limits its usefulness as a diagnostic marker [43].
The unique TA-dinucleotide and TTA-trinucleotide repeats within the H. crudus mitochondrial genome sequence resemble microsatellite sequences and show potential as species identification markers, since they do not appear in the closely related Hemipteran species; however, further analysis is required to assess the efficacy of microsatellites at these loci. Phylogenetic analyses in MEGA X demonstrated that H. crudus forms a monophyletic group with Hemipterans in the families Delphacidae and Aphididae. Our ML analysis using the Tamura-Nei model confirms the putative lineage of H. crudus within the suborder Auchenorrhyncha. The ingroup taxa A. aurantii (MN397939.1) and N. lugens (NC021748.1) form a monophyletic group with H. crudus (MT385107). Mitochondrial genomes may provide a better approach to resolving intractable phylogenetic relationships than single-gene analyses; however, comprehensive phylogenetic analysis is required to recover strong support for the definitive relationships among insects in the suborder Auchenorrhyncha [43,44,45]. Considering the diversity of the family Cixiidae and the limitations of the present molecular information, more conclusive phylogenetic results will be achievable as genomic data become increasingly available.
This study will assist in more conclusive phylogenetic results and future studies on taxonomy, phylogeny, and systematics of cixiid insects. These results shed new light on the evolution of mitochondrial genomes from the family Cixiidae, allowing for deeper insights into the mitochondrial genomes and ancestral lineages of Hemipteran insects. Lastly, understanding the structure and function of the mitochondrial genome of H. crudus is essential to the formulation of effective pest management and control strategies in ornamental and agricultural ecosystems and is necessary for proper species determination and diagnosis of pathogenicity.

Conclusions
Our study presents the mitochondrial genome of H. crudus, which has typical gene content and organization. The mitochondrial genome of H. crudus should be used for developing mitogenome genetic markers for species identification and insect-transmitted pathogen diagnostics. Other molecular markers, especially nuclear ones, should be developed for species identification in parallel with mitogenome markers to address Hemipteran phylogenies with expanded data sets. The phylogenetic results inferred from mitochondrial genomes support a close relationship between the genera Haplaxius and Nilaparvata. Although the data collected thus far could not resolve the phylogenetic relationships within Cixiidae, this study will assist in future inference of evolutionary relationships, pathogen diagnostic studies, and species determination.

Sample collection and DNA extraction
High-quality, high-molecular-weight genomic DNA was prepared from a mature wild-type H. crudus female collected in Davie, Florida. The specimen was morphologically identified, preserved in 100% ethanol, and stored at -80 °C in the Insect Vector Ecology Laboratory, Fort Lauderdale Research and Education Center (FLREC), University of Florida. The whole-body insect tissue was homogenized with a sterile pestle in 2 ml of liquid nitrogen. After homogenization, total genomic DNA was extracted from the frozen adult using the Qiagen Gentra Puregene® Genomic DNA kit supplemented by the 10X Genomics® whole genome extraction protocol.

PCR amplification and sequencing
Prior to sequencing, PCR amplification with species-specific primers was used to confirm the identity of the insect and to detect foreign microorganisms that might introduce inconsistent sequences into the sample. In addition, the H. crudus DNA was tested for the presence of the lethal yellowing (LY) phytoplasma using qPCR with the LY16Sf and LY16sr probe primers. Lethal bronzing (LB) phytoplasma testing was done with the P1/P7 and R16F2n/R16R2 forward and reverse primers [51]. PCR was performed in a 25 μL reaction volume containing Promega PCR systems GoTaq® 5x Flexi buffer (pH 8.5), GoTaq G2 Flexi buffer, MgCl2 (25 mM), PVP-40 (10%), dNTPs, 2 μL DNA, and 2 μL of each primer (10 μM). The PCR was performed under the following conditions: an initial denaturation at 94 °C for 4 min, followed by 35 cycles of 30 s at 94 °C, 40 s at 49-58 °C (depending on primer combination), and 1-3 min at 72 °C (depending on the putative length of the fragments), and a final extension step at 72 °C for 10 min. The PCR products were then analyzed by 1.5% agarose gel electrophoresis. Purity and concentration tests were performed using a NanoDrop™ Microvolume Spectrophotometer and Qubit® Fluorometric Quantification technologies (Thermo Fisher Scientific, https://www.thermofisher.com/us/en/home.html). A purity ratio of 1.8 and a genomic DNA concentration >30 ng/μL were required for whole genome sequencing. Once cleared for purity and concentration, the sample was sent directly to the University of Florida Interdisciplinary Center for Biotechnology Research (UF-ICBR) for sequencing. The sample was sequenced using PacBio Sequel II SMRTbell® long-read sequencing technology with 40x coverage (Pacific Biosciences, https://www.pacb.com/).

Genome assembly and annotation
Sequence assembly, annotation, alignment, and nucleotide composition calculations were conducted on the University of Florida's supercomputer, HiPerGator 3.0.
HiPerGator's cluster offers processors and nodes for memory-intensive computations through basic bash command-line operations. De novo assembly of sequence reads was performed with Canu v2.0, an assembler specialized for PacBio sequences that operates in three phases: correction, trimming, and assembly [52]. Annotation of assembled sequences was performed on the same HiPerGator interface using the Prokka gene prediction software and the average nucleotide identity (ANI) commands [53]. Prokka is a rapid genome annotation tool, typically used for prokaryotic and mitochondrial genomes, that uses Prodigal gene prediction parameters for the kingdoms Archaea and Bacteria. In addition to the Prokka annotation, protein-coding genes and tRNAs were identified in tandem with NCBI's ORFfinder for invertebrate mitochondrial genes (NCBI, https://www.ncbi.nlm.nih.gov/orffinder/). The putative control region was assumed to be present between rrnS and the Ile-Gln-Met tRNA cluster. Nucleotide diversity and composition analyses were also performed on HiPerGator. Post-annotation alignment was performed using the MUSCLE algorithm in MEGA X [36].
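The kind of open reading frame search performed by ORFfinder for invertebrate mitochondrial genes can be illustrated with a short six-frame scan. The sketch below is not the ORFfinder algorithm and was not part of the study's pipeline. It assumes the start codons ATG, ATA, ATT, ATC, GTG, and TTG and the stop codons TAA and TAG of the invertebrate mitochondrial genetic code (NCBI translation table 5), treats the sequence as linear, and ignores the incomplete stop codons (T or TA) that are common in mitochondrial genes. The names `find_orfs` and `min_codons` are hypothetical.

```python
STARTS = {"ATG", "ATA", "ATT", "ATC", "GTG", "TTG"}  # assumed table 5 initiators
STOPS = {"TAA", "TAG"}  # in table 5, TGA codes for Trp and AGA/AGG for Ser


def revcomp(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def _scan(strand_seq, min_codons):
    """Yield (begin, end, codons) with 0-based half-open coordinates."""
    n = len(strand_seq)
    for frame in range(3):
        open_start = None
        for i in range(frame, n - 2, 3):
            codon = strand_seq[i:i + 3]
            if codon in STOPS:
                if open_start is not None:
                    codons = (i - open_start) // 3  # stop codon not counted
                    if codons >= min_codons:
                        yield open_start, i + 3, codons
                open_start = None
            elif open_start is None and codon in STARTS:
                open_start = i


def find_orfs(seq, min_codons=30):
    """Yield (strand, start, end, codons); 1-based inclusive forward coordinates."""
    seq = seq.upper()
    n = len(seq)
    for begin, end, codons in _scan(seq, min_codons):
        yield ("+", begin + 1, end, codons)
    for begin, end, codons in _scan(revcomp(seq), min_codons):
        yield ("-", n - end + 1, n - begin, codons)


# Toy sequence, for illustration only: one ORF of 6 codons plus a stop.
toy = "CC" + "ATG" + "GCT" * 5 + "TAA" + "GG"
print(list(find_orfs(toy, min_codons=5)))  # [('+', 3, 23, 6)]
```

Because the scan is linear, an ORF that spans the arbitrary origin of the circular contig would be missed; candidate ORFs would in any case need to be checked against homologous PCGs before annotation.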

Phylogenetic analysis
Phylogenetic relationships were assessed using nine insect mitochondrial nucleotide sequences. Of the nine insect species, four non-Hemipteran insects and four Hemipterans were selected for analysis (Table 3). Mitochondrial genome sequences and annotations for eight insect species were downloaded from GenBank in FASTA file format and were aligned with the H. crudus mitochondrial FASTA file using the MUSCLE algorithm in MEGA X with default settings [36]. Once aligned, the evolutionary history was inferred using the maximum likelihood (ML) method with the Tamura-Nei model [32]. The tree with the highest log likelihood (-140533.20) is shown, with the percentage of trees in which the associated taxa clustered together indicated at the branches. Initial trees for the heuristic search were obtained by applying the Neighbor-Join and BioNJ algorithms to a matrix of pairwise distances estimated using the maximum composite likelihood (MCL) approach, and then selecting the topology with the superior log likelihood value. Branch lengths are measured in the number of substitutions per site. The analysis involved a total of 10 nucleotide sequences. Codon positions included were 1st, 2nd, 3rd, and noncoding. There was a total of 21,821 positions in the final dataset.
Evolutionary analyses were conducted in MEGA X [33,34]. Additionally, a pairwise comparison using the p-distance method was performed to show the percent difference among mitochondrial genomes.

Figure 4. Maximum likelihood phylogenetic reconstruction at 1,000 bootstrap replicates of ten insect species based on the mitochondrial genome using the Tamura-Nei model.