microRNA and Other Small RNA Sequence Profiling across Six Tissues of Chinese Forest Musk Deer (Moschus berezovskii)

The Chinese forest musk deer (Moschus berezovskii) is an economically important species distributed throughout southwest China and northern Vietnam. Occurrence and development of disease are aggravated by inbreeding and genetic diversity declines in captive musk deer populations. Deep transcriptomics investigation may provide a promising way to improve genetic health of captive and wild FMD population. MicroRNAs (miRNAs), which regulate gene expression by targeting and suppressing of mRNAs, play an important role in physiology and organism development control. In this study, RNA-seq technology was adopted to characterize the miRNA transcriptome signature among six tissues (heart, liver, spleen, lung, kidney, and muscle) in Chinese forest musk deer at two years of age. Deep sequencing generated a total of 103,261,451 (~87.87%) good quality small RNA reads; of them 6,622,520 were unique across all six tissues. A total of 2890 miRNAs were identified, among them 1129 were found to be expressed in all tissues. Moreover, coexpression of 20 miRNAs (>2000RPM) in all six tissues and top five highly expressed miRNAs in each tissue implied the crucial and particular function of them in FMD physiological processes. Our findings of forest musk deer miRNAs supplement the database of transcriptome information for this species and conduce to our understanding of forest musk deer biology.


Introduction
The forest musk deer (Moschus berezovskii, FMD) is commercially famous for musk production, due to its cosmetic (perfume industry) and alleged pharmaceutical properties [1]. Notably, it is regrettable that the musk was secreted by preputial gland of adult male FMD in a very limited amount. In past decades, the population of wild FMD has been declined rapidly due to considerable poaching and massive habitat destruction and become extraordinarily endangered. The wild FMD listed in Appendix II in CITES and under class I state key protection in China (http://www.iucnredlist.org/). Since the 1960s, FMD farms have been set up to conserve wild populations and obtain musk resources in a safe and sustainable way in China [2]. However, survival of captive musk deer is considerably affected by high incidence of diseases, like Diarrhea, gastrointestinal diseases, dyspepsia, pneumonia, metritis, urinary stones, and abscesses which gravely affect the population growth [3].
MiRNAs, belonging to small, noncoding RNAs (short sequence of 18-25nt), have a vital role in posttranscriptional mechanism by recognizing specific mRNA targets [4]. Thus, microRNAs expression profiling is of global interest due to their key role in the regulation of miscellaneous biological processes, such as body development, cell differentiation, proliferation, apoptosis, immune response, reproductive system development, gametogenesis, and organogenesis [5][6][7][8][9], although few transcriptomics had been made to demonstrate the expression of miRNA in heart, musk gland, and blood of FMD [10,11]. Thus, it is essential to establish depth exploration of miRNA transcriptome across a wide range tissue in FMD. To date, a total of 38,589 mature miRNAs sequence have been discovered from 271 species (miRBase 22, July 2018). Recently, the first genome sequence and gene annotation for the FMD has been published [12]. The study of the FMD miRNAs and their interactions with target genes will provide further insight into several physiological processes of FMD.
In this study, we carried out the high-throughput sequencing analysis on small RNAs of six tissues (heart, liver, spleen, lung, kidney, and muscle) of forest musk deer (Moschus berezovskii) using the RNA-seq technique on the Hi-seq 2500 sequencing platform. The aim of this study was to discover and characterize FMD specific miRNA in a large number of tissues in order to provide an extensive repertoire of expression, illustrating the potential role of miRNAs and their targets on FMD biological processes.

Materials and Methods
. . Sample Collection and RNA Extraction. The experimental Chinese forest musk deer was reared in Chongqing Institute of Medicinal Plant Cultivation (Chongqing, China). The six tissues (heart, liver, spleen, lung, kidney and muscle) were collected from one sexually mature male individual, which died from earthquake at April 20 th , 2013. Total RNA was extracted from six samples with RNAiso reagent (TaKaRa, Japan) according to the manufacturer's instructions. Following isolation, the purity (absorbance ratios, 260/280) and yield of RNA were determined using with a Qubit 3.0 Fluorometer (Thermo Fisher Scientific, Waltham, USA) and the RNA integrity number (RIN) [13] was determined using the Agilent Bioanalyzer, which reported an RNA integrity number of >8.5 for all samples. Total RNA samples were stored at −80 ∘ C until use. This study was carried out in accordance with the recommendations of the Animal Care and Committee of Sichuan Agricultural University under permit number DKY-20145020.
. . Small RNA Sequencing and Data Analysis. The purified RNA samples were sent for sequencing (RNA-seq) on Hi-seq 2500 sequencing platform at Sangon Biotech (Shanghai) Co. Ltd. (Shanghai), China. After getting the raw data, Cutadapt v1.9.1 software [14] used to remove the ligated adapter sequences. Additional filtering was applied to discard lowquality reads, insufficient and overlong sequences. rRNAs, tRNAs, snoRNAs, snRNAs, and other noncoding RNAs were distinguished and eliminated based on reference gene annotated in GenBank (http://www.ncbi.nlm.nih.gov/) and Rfam databases (http://rfam.janelia.org/). Known miRNAs in all samples were identified by comparison with the specified range in the miRBase version 22 (http://www.mirbase.org). Through BLASTN, the remaining reads were compared with all nonredundant mature miRNAs, which are obtained from miRBase 20.0 database (http://www.mirbase.org/) [15]. Fragment per kilobase of exon per million fragments mapped (FPKM) values was used to evaluate differential expression [16]. P values were calculated using multiple hypothesis testing. P⩽ 0.05 and |Log2FoldChange| ⩾1 were employed to evaluate differentially expressed miRNAs. Next, miRanda (http://www.microrna.org/) was used to predict miRNA target genes. These genes were mapped to the Gene Ontology (GO) project (http://www.geneontology.org/) and the Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways [17] behind it. Meanwhile, the significantly enriched GO terms were identified by P value ⩽ 0.001, and the P value cut-off was 0.01 for KEGG terms.

Results and Discussion
. . Overview of Small RNA Profiling. The sRNA profiles were generated by sequencing six tissues namely, heart, liver, spleen, lung, kidney, and muscle of FMD and their length distribution summarized in Table 1 and Figure S1. A total of 118 million reads were retrieved from the sequencing. After trimming, 103 million good quality reads (average 17 million reads per tissue; 87.87%) were obtained. The results indicated that majority of sRNAs in each library were between 20 and 24 nt in length, which is in line with the typical size range of small RNAs generated by Dicer. Moreover, the clean reads of sRNA (20-24 nt length) were most abundant in lung (93.3%), followed by liver (89.15%), kidney (87.6%), spleen (72.8%), heart (63%), and muscle (62.3%) shown in Figure S1.
To provide more comprehensive information, we determined the distribution and frequency for each sRNAs types within unique as well as all sRNAs (see Figure 1). The result showed that, among unique sRNAs, the miRNA account for a small proportion (1.19%), and the most sRNA were found unannotated (89.73%). On the other hand, in all sRNAs, miRNA contributes major (62.07%) among all sRNAs followed by unannotated RNAs (26.52%).
The detailed data illustrated that miRNAs are major composition of all sRNAs in each tissue samples. Small RNAseq data contained significant representation of miRNAs in most of the tissues, with the maximum amount in the lung (83.26%), kidney (71.92%), and liver (71.69%) while miRNAs make up a tiny portion of unique sRNAs (less than 4%) shown in Table 3 and Figure 2. The tRNAs were maximum in small RNA-seq data of muscle (25.90%), followed by spleen (7.73%), kidney (7.57%), and heart (6.02%) and minimum in lung (0.89%). rRNA, snoRNA, and snRNAs were the minimal RNA species in all the tissues ranging between 0.33-7.86%, 0.12-0.83%, and 0.01-0.59%, respectively. The percentage of unannotated mapped reads was the highest in heart (47.60%) and the lowest in lung (14.11%). With a proportion being more than 79%, the unannotated RNA shares the major composition of unique sRNAs in all samples.

. . Comprehensive Analysis of miRNAs Expression Profiles.
In the present study, a total of 2,890 known miRNAs were identified in all six tissues. Among them, 1,129 (39.066%) known miRNAs were found to be coexpressed (see Figure 3). In the known miRNA expression profile, the reads numbers of the top 20 miRNAs accounted for each tissue as heart (51.028%), liver (53.988%), spleen (48.736%), lung (41.653%), kidney (45.090%), and muscle (44.028%). The expression profile indicated that a small portion of miRNA genes expresses most miRNAs.
The miRNA expression was normalized based on reads of exon model per million mapped reads (RPM) values. The expression of miRNAs, having RPM value more than 2000, was defined as highly coexpressed among examined tissue. Furthermore, 20 miRNAs were found to be highly coexpressed in all six tissues, and this suggested the crucial role of them for forest musk deer physiological processes (see Table 4).
. . Differential Expression of miRNAs. In order to identify differentially expressed miRNAs between tissues of musk deer, known miRNAs were compared in pairs to identify differentially expressed miRNA by multiple hypothesis testing. The results displayed that the number of differentially expressed miRNAs (P≤ 0.05) ranged from 912 (Lung versus Muscle and Liver versus Kidney) to 1,220 (Heart versus Liver), shown in Figure 4.
. . Gene Functional Annotation. In order to explore the function of target genes of these differentially expressed miRNAs, we mapped all of the target genes to terms in the GO and KEGG databases. Interestingly, in "Heart versus Liver" group, the GO enrichment analysis shows that the target genes were functionally enriched in organelle, cytoplasm, single−organism process, biological regulation, and others, although the functional enrichment in the other tissues were almost the same (see Figure S2). KEGG pathway analysis of these target genes in all fifteen groups revealed that they were mainly regulating endocytosis, hedgehog signaling pathway, glycosaminoglycan biosynthesis-chondroitin sulfate/dermatan sulfate, and PI3K-Akt signaling pathway. The significant pathways (P<0.01) in each group are listed in Table S1.
Next-generation sequencing (NGS) technology is an advance method for analyzing transcriptome and has been extensively used to explore several kinds of miRNAs and their role in many biological processes [18]. Based on this technology, a number of transcriptome data have been revealed from organisms [19]. Transcriptome data can provide a convenience for further functional researches. Previous study has reported the transcriptome expression of Chinese giant salamander liver, spleen, and muscle using NGS technology [20], as a representative example in rare animals.   Figure 4: Level of differentially expressed miRNAs between different FMD tissues. In this figure, each pair of individuals compared is plotted on the x-axis, and the number of differently expressed miRNAs is plotted on the y-axis. In "1 versus 2" group, the red column represents miRNAs upregulated in the 1 compared to the 2 group, and the green column represents miRNAs downregulated in the 1 compared to the 2 group.
In this study, high-throughput sequencing of small RNA was extensively used to characterize specific miRNA and differential expressions of small RNAs from several tissues of forest musk deer. We identified a total of 103,261,451 (∼ 87.87%) good quality small RNA reads; of them 6,622,520 were unique across all tissues. The small RNAs in each library have most clean reads between 20 and 24 nt in length. Moreover, 30.5% unannotated sRNA clean reads of 30-33 nt were identified specific to muscle tissue (see Figure S1). Moreover, tRNA and its derived small RNAs account a large proportion of the all small RNAs of muscle tissue. Previously, some studies on mature mouse sperm reported novel tRNA-derived small RNAs which were 30-34 nt in length [21] and sperm tRNA-derived small RNAs may be a paternal epigenetic factor, which have effects on the development of fertilized egg [22]. Nonetheless, the diversity of small noncoding RNAs of mammalian viscera provides an opportunity to identify a specific subset of RNAs from muscle tissue, certainly warrants further investigation. Intriguingly, we found 20 microRNAs were coexpressed in all six tissues and this suggests that vital role of them for FMD biological processes. Most abundant cluster of microRNA in all six tissues was predicted as miR-26 family (see Table 4), which was recently proven to be crucial roles in numerous biological processes such as cell proliferation, apoptosis, tumorigenesis at different stages of nontumor diseases, growth and development of normal tissues, and other biological processes by regulating some complex signaling pathway [23,24]. The downregulation of miR-26 was observed in many tumor types, such as bladder cancer and breast cancer, whereas ectopic expression of miR-26 inhibits proliferation and decreases in various tumor types [25][26][27][28][29].
There are also some studies that revealed that the expression of miR-26 was upregulated in tumors such as glioma [30].
Let-7miRNA family also has various functions; it has been reported to control cell growth as a "posttranscriptional gatekeeper" of certain genes [31] and therefore likely represents a potential biomarker. The downregulation of let-7 in breast cancer cell lines caused let-7 to lose its ability to restrain Ras mRNA, resulting in the activation of p-Ras and p-ErK, as reported in a study by Sun et al. [32]. A recent study in humans further reported that the association between the loss of let-7 expression and metastatic events is so strong that may indicates the potential prognostic role of let-7 in patient stratification and, hence, optimum selection for treatment strategies [33].
Furthermore, miR-24-3p was shown to play a vital role as an oncogenic miRNA in lung cancer [34]. Highly expressed miR-191 is a key regulator of naive, memory, and regulatory T cell homeostasis [35]. Expression of the miR-99 family of microRNAs had been shown to be related to radiation sensitivity [36], proliferation of c-Src-transformed cells through targeting mTOR, and prostate cancer cells that are inhibited by miR-99a and miR-99b [37,38]. miR-100-5p is a tumor oncogenic and could be used as a diagnostic biomarker for renal cell carcinoma [39]. Moreover, 11 miRNAs were found to be high expression in specific tissue. In them, miR-133 family (efu-miR-133-3p, chi-miR-133a-3p, dme-miR-133-3p, gga-miR-133c-3p, and mmu-miR-133a-3p) showed specifically high expression in heart, which has been function in the cardiomyocytes proliferation and suppresses muscle gene expression in the heart, cardiac development, and apoptosis [40,41]. A number of miRNAs have been discovered that play a key role in regulating liver development, regeneration, and metabolic functions. miR-122 is known as a biomarker of hepatic disorders such as chronic hepatitis, nonalcoholic fatty-liver disease, and drug-induced liver disease [42]. Some recent studies indicated that rat liver miR-122 expression may be upregulated by bisphenol A, while doses of crocin can downregulate miR-122 expression in rat with hepatic injuries inducing by ischemia-reperfusion or bisphenol A [43,44].
Moreover, studies showed that miRNA-146 negatively regulates the production of proinflammatory cytokines via NF-B signaling in human gingival fibroblasts [45], miR-26a regulates tissue, and cell growth and differentiation by acting to posttranscriptionally repress Zeste homolog 2 [46].

Conclusions
In summary, the study reveals the first comprehensive transcriptome profile in six tissues (heart, liver, spleen, lung, kidney, and muscle) of Chinese forest musk deer. We have identified several differentially expressed microRNAs and they were mainly implicated in endocytosis, hedgehog signaling pathway, glycosaminoglycan biosynthesis-chondroitin sulfate/dermatan sulfate, and PI3K-Akt signaling pathway. The dataset of assembled FMD unigenes should serve as advance in the research of forest musk deer biology and further contribute to forest musk deer breeding and conservation. Nevertheless, the validation of the relationship between forest musk deer miRNAs and target mRNAs in the regulation of specific physiological processes needs further explored.

Data Availability
The transcriptome data are available at NCBI with the accession number PRJNA541415.