Preliminary Characterization of Mitochondrial Genome of Melipona scutellaris, a Brazilian Stingless Bee

Bees are manufacturers of relevant economical products and have a pollinator role fundamental to ecosystems. Traditionally, studies focused on the genus Melipona have been mostly based on behavioral, and social organization and ecological aspects. Only recently the evolutionary history of this genus has been assessed using molecular markers, including mitochondrial genes. Even though these studies have shed light on the evolutionary history of the Melipona genus, a more accurate picture may emerge when full nuclear and mitochondrial genomes of Melipona species become available. Here we present the assembly, annotation, and characterization of a draft mitochondrial genome of the Brazilian stingless bee Melipona scutellaris using Melipona bicolor as a reference organism. Using Illumina MiSeq data, we achieved the annotation of all protein coding genes, as well as the genes for the two ribosomal subunits (16S and 12S) and transfer RNA genes as well. Using the COI sequence as a DNA barcode, we found that M. cramptoni is the closest species to M. scutellaris.


Introduction
Melipona scutellaris, popularly known as uruçu bee, is a stingless bee species profusely found from Bahia to Pernambuco Brazilian states. They are present in urban and rural environments and their pollinator role is pivotal to the ecosystems in which they live [1,2].
The genus Melipona has long been target of ecological, genetic, and especially behavioral and pollination studies. Existing molecular phylogenetic studies have shown that the Melipona genus clusters with other neotropical Meliponini. Furthermore, Melipona spp. form a well-supported monophyletic group, as determined by recent studies. Stingless bees, especially, compose an ancient group, distributed worldwide around tropics, making them important species for phylogenetic relationships studies. Thus, the elucidation of social behavioral evolution relies on a better understanding of phylogenetic relationships for eusocial insects [3,4]. For this reason, there is an increasing interest on getting more accurate phylogenetic reconstructions, based on molecular aspects.
Phylogenetic studies have explored nuclear ribosomal genes. Since 18S and 28S subunits are short genes they are therefore easily amplified and sequenced. Besides, such genes are located in very informative regions, where conserved or variable sequences may enlighten phylogenetic relations between species [5,6]. The mitochondrial genome has also been included in such studies, since its potential for providing evolutionary information was recognized. Metazoan mitogenomes are about 16 kb long and contain 37 genes: 22  2   BioMed Research International   tRNAs, 2 rRNAs (16S and 12S subunits), and 13 oxidative  phosphorylation proteins-7 from complex I (ND1, ND2,  ND3, ND4, ND4L, ND5, and ND6), one from complex III  (cytochrome B), three from complex IV (COI, COII, and  COIII), and two from complex V (ATPase6 and ATPase8) [7].
The tRNA genes are embedded in variable regions. Through evolution, these regions underwent rearrangements more often than protein coding regions. Therefore, tRNA order is a tool for comparative phylogenetic analysis between species [8]. Additionally, the high copy number per cell, low recombination rate, high mutational rate, and dominantly maternal heritance have made the mitochondrial genome a powerful tool for evolutionary studies.
DNA barcoding, or taxon identification using a standardized genomic region, was initially used in the study of animal specimens [9] and later for a wider diversity of organisms. DNA barcoding based on the cytochrome c oxidase subunit 1 (COI) gene has since then become a widely accepted molecular marker for species identification. This 650 bp long sequence is a simple and reliable tool for metazoan species identification, indicating their molecular divergences or similarities [4]. Being such a short sequence, it is useful for robust phylogeny analysis and important for tracking and measuring ecosystems biodiversity, what has lately been referred to as metabarcoding [10].
Other genes may be employed for DNA barcoding, but, especially among Arthropoda, the interest on COI over other genes is mainly due to its well conserved and single copy sequence, which makes the amplification by PCR reaction easier, allowing the usage of a small set of primers. Still, COI sequence presents faster substitution rates than nuclear genes and its variations are remarkably more interspecific than intraspecific [10,11].
DNA barcoding applications for insects have been very successful [11]. This technique allows species identification in different life stages (eggs, larvae, nymphs, and pupae), when morphologic characteristics are not easily identified [12,13]. Besides, it makes species identification possible with only tissues or fragmented parts of the insect [14]. Further, the importance of phylogenetic relationship analysis relies on understanding ecosystems biodiversity, taking into account that insects are important at pollination, decomposition, pest control, and even disease vectors [10].
The present study characterized the draft of M. scutellaris mitogenome annotation according to its gene order, gene conservation, and taxonomic characterization by DNA barcoding.

Biological Material.
Total DNA was extracted from a pool of five male individuals from the Meliponary UFU at Universidade Federal de Uberlândia, campus Umuarama (S 180 55 /W 450 17 ). DNA extraction was performed with CTAB buffer, which consists of 2% (w/v) CTAB diluted in 100 mM Tris-HCl, 20 mM EDTA, and 1.4 M NaCl. Immediately before maceration, 0.2% (v/v) -mercaptoethanol was added; 150 L of CTAB buffer was added and maceration was performed manually with a pestle. Then 350 L of CTAB buffer and 5 L of RNase solution (100 mg/mL) were added, following incubation at 37 ∘ C for 1 hour. Later, 5 L of proteinase K solution (20 mg/mL) was added with additional incubation at 50 ∘ C for 1 hour. For homogenate extraction, addition of 240 L of Phenol/Chloroform/Isoamyl alcohol (25 : 24 : 1) and then centrifugation at 12,000 ×g for 10 minutes. Supernatant was transferred for a new tube and DNA was precipitated with 500 L of absolute ethanol, following centrifugation at 12,000 ×g for 15 minutes. Ethanol 70% was used for pellet washing, with a volume of 500 L and centrifugation at 12,000 ×g for 3 minutes. This step was repeated [15]. Pellet was dried at room temperature overnight and eluted into 100 L of MiliQ water.

Mitochondrial Genome
Sequencing. The total genome sequencing of M. scutellaris was performed at the René Rachou Research Center, Fiocruz Minas (Belo Horizonte, MG) using an Illumina platform (MiSeq) and a paired-end strategy. The library was constructed with the Nextera XT DNA Sample Preparation Kit, following the manufacturer's instructions. Fragments of 404 bp long were carried out for sequencing. The average read length was 250 bp and the final throughput was 8.4 Gbp.

Genome Assembly.
In order to achieve more reliable and accurate results, two kinds of assembly software were employed: SOAPdenovo2 [16] and Velvet [17]. The first M. scutellaris mitogenome created in this work was made using contigs generated by the SOAPdenovo2 software with varying kmer parameter between 23 and 127.
The assembly by Velvet generated assemblies for different kmer values: 31, 41, 51, 61, 71, 81, 91, or 99. The MuMmer Package 3.0 [18] aligned the contigs from each pair of assemblies with the reference sequence and show-tiling was used to complete the scaffolding of the set of contigs. Combinations of the following set of parameters were used in show-tiling: -i: 50, 70, or 90; -V: 0. 5 or 10; -v: 10, 50, or 90; -c: included or not.
The generated sequences were filtered according to the following conditions: (i) sequence size between 14,000 and 17,000 base pairs; (ii) AT content greater than or equal to 84%; (iii) maximum gap tolerance of 100 nucleotides.
The remaining sequences were submitted to MITOS [19], for functional annotation. Steps for Velvet assembly and subsequent annotation with MITOS are schemed in Figure 1. The reference mitogenome of M. bicolor was also submitted in order to provide a more reliable basis for further analysis. The data from annotations were used for analysis of gene order and similarity to reference sequence.
The genes from the reference genome of M. bicolor are listed in the first column of Table 1. Relatively to each gene, the adjacent genes were analyzed (upstream and downstream) according to their frequencies (occurrence: "OCR" in Table 1). The annotated genes from M. scutellaris were locally aligned against the reference mitogenome, highlighting identity and E-value. Transfer RNAs secondary structures were predicted using tRNASCAN-SE software [20].

Results and Discussion
In total, 36 assemblies were performed, generating 36 annotations which allowed inferring that all protein-coding genes and tRNA genes were annotated. Using all annotations, the gene order analysis revealed synteny between M. scutellaris mitogenome and the reference genome. Ribosomal genes identities were also analyzed (Table 1), as well as the conservation of secondary structure for tRNA.

Organization and Partial Characterization of M. scutellaris
Mitogenome. In Table 1, the "reference gene" column corresponds to the mitochondrial gene order of M. bicolor. Considering the genes that are found in the reference genome, column "OCR" (second column) shows how many times each gene is annotated for M. scutellaris, within the 36 annotations. "Upstream gene" and "downstream gene" columns refer to which genes are found in those respective positions, relative to the reference gene. Finally, "OCR" columns (fifth and sixth columns) mean how many times a gene is found in such position, relative to another gene, also considering the total of 36 annotations. M. scutellaris mitogenome shows an overall high synteny when compared to M. bicolor mitogenome. Figure 2 is an illustrative scheme which gathers the annotated genes under the most frequent order, among all 36 assemblies. Although all genes were found, they are not all present in the following scheme, as long as some few ones must yet have their position validated.
Regarding tRNA genes, aligning the annotated genes against the reference genome did not generate considerable identity values for all cases. However, submitting the M. scutellaris sequences to tRNAScan-SE, it was possible to infer that the tRNA secondary structures of M. scutellaris are viable and well conserved. An example is displayed in Figure 3.
In Table 1 it is possible to notice that in some cases there is more than one gene in the upstream and/or downstream position, providing different possibilities of gene organization. Such genes are tRNAA, tRNAK, tRNAG, tRNAR, tRNAT, and tRNAL1.
The coding-protein genes for M. scutellaris are syntenic to their homologous in M. bicolor mitogenome. Some tRNA genes are also syntenic, namely, tRNAM, tRNAW, tRNAY, tRNAL2, tRNAD, tRNAF, tRNAP, and tRNAS2. However, it is also possible to infer that some tRNA genes have underwent rearrangement, which are tRNAV, tRNAS1, tRNAN, and tRNAA. Some other tRNA genes must yet have their position validated, since their occurrence (OCR) is very low compared to the average, such as tRNAI, tRNAG, tRNAE, and tRNAL1 (not shown in Figure 2). It is known that tRNAs order is a particular feature of each insect species [8], and tRNAs distribution, copy number, and codon usage patterns are especially important in evolutionary studies [21].
Our results provided evidences of possible duplication of two genes in M. scutellaris mitogenome in comparison to M. bicolor. In Table 1 it is possible to see the occurrences for tRNAQ downstream to ND5 and 12S genes: 35/36 and 29/36, respectively. Still, tRNAA has also high occurrences for being upstream to 12S (30/36), composing the following cluster: 16S tRNAN tRNAA 12S. But there is also a high occurrence for being downstream to tRNAT (27/36), whose position is not yet clear, although it is already unlikely to be part of the cluster mentioned above.
A single gene found in M. bicolor mitogenome was not found in M. scutellaris, which is tRNAH. On the other hand, gene tRNAX was annotated, suggesting a tRNA that is not identified, requiring further validation.
Moreover, submitting the mitogenome sequence of M. bicolor to MITOS, as a support for further analysis, a gene that is not found in the currently available sequence was annotated, namely, tRNAC. This gene was also annotated in M. scutellaris mitogenome, in the same position. Such synteny suggests that tRNAC gene may be present in both genomes and this information provided by MITOS may have been omitted by other kinds of annotation software.   Concerning the protein-coding regions, they are probably not yet complete. Genes' lengths are not all as the expected, preventing a conclusion about the codon usage in M. scutellaris' mitogenome.
Finally, comparing both assemblies, N50 values adopted were 343 for SOAPdenovo2 and 722-362 for Velvet. The longer scaffolds achieved had close lengths, which are 15580 and 15206, for SOAPdenovo2 and Velvet, respectively. In general, all assemblies generated scaffolds around 14000-15580 bp long. However, despite the good quality of the data (phred > 30), it was not possible to achieve any significant scaffold in a first approach. For this reason, the present methodology was adopted in order to reach the evidences of genes' presence. As discussed above, all protein-coding sequences were annotated, as well as transfer RNA-coding and ribosomal subunits genes. It is assumed that using a single library of inserts 400 bp long may have influences over the results under the expected, requiring a complementary library for subsequent analyses.  [23] have demonstrated by cytogenetic studies that among Melipona species there are different heterochromatin content and distribution patterns. Two groups may be distinguished for such characteristics that may be considered for evolutionary analysis as well.

DNA
Rocha et al. [23] established M. bicolor as belonging to Group I, which comprises species with less than 50% of heterochromatin. Species belonging to such group display heterochroma-tin content concentrated in one portion of the chromosome. On the other hand, Group II is composed of species with heterochromatin content higher than 50%, spread along the chromosome extension. By cytogenetic studies, M. scutellaris and M. rufiventris were classified as being part of Group II. Therefore, DNA barcoding result revealing a closer relation between M. scutellaris and M. rufiventris rather than M. bicolor confirms cytogenetic studies. However, the close relation between M. scutellaris and M. cramptoni has not yet been investigated.

Conclusion
The results generated to the mitogenome of M. scutellaris in this study suggest a conserved character in comparison to M. bicolor. Protein coding genes order especially appears to be well conserved. Some tRNA genes underwent rearrangement, but a conservation pattern could also be analyzed as well.
Through taxonomic identification search on BOLD Systems database it was possible to assume that M. cramptoni is the closest species to M. scutellaris (98.99%), although M. bicolor is also found among closely related species (95.2% identity). The second closest species to M. scutellaris found at BOLD database is M. rufiventris, what is confirmed by cytogenetic studies.