Transcriptional Profiling of Exosomes Derived from Staphylococcus aureus-Infected Bovine Mammary Epithelial Cell Line MAC-T by RNA-Seq Analysis

Mastitis is a common disease in the dairy industry that causes huge economic losses worldwide. Exosomes (carrying proteins, miRNA, lncRNA, etc.) play a vital role in the regulation of immune response. lncRNA can play a variety of regulatory roles by combining with protein, RNA, and DNA. The expression of mRNA and lncRNA in exosomes derived from bovine mammary epithelial cells infected by S. aureus is rarely understood. To explore this issue, RNA sequencing analysis was performed on exosomes derived from S. aureus-infected and noninfected MAC-T cells. Analysis of the sequencing results showed that there were 186 differentially expressed genes, 431 differentially expressed mRNAs and 19 differentially expressed lncRNAs in the exosomes derived from S. aureus-infected and noninfected MAC-T cells. By predicting lncRNA target genes, it was found that 19 differentially expressed lncRNAs all acted on multiple mRNAs in cis and trans. GO analysis revealed that differentially expressed genes and lncRNA target genes played significant roles in such metabolism (reactive oxygen species metabolic processes), transmembrane transport, cellular response to DNA damage stimulus, and response to cytokines. KEGG enrichment indicated that lncRNA target genes gathered in the TNF pathway, Notch pathway, MAPK pathway, NF-kappa B pathway, Hippo pathway, p53 pathway, reactive oxygen species metabolic processes, and longevity regulating pathway. In summary, all data indicated that differentially expressed gene, mRNA, and lncRNA in transcriptional profiling of exosomes participated in bacterial invasion and adhesion, oxidative stress, inflammation, and apoptosis-related signaling pathway. The data obtained in this study would provide valuable resource for understanding the lncRNA information in exosomes derived from dairy cow mammary epithelial cells and conduced to the study of S. aureus infection in dairy cow mammary glands.


Introduction
Mastitis is a common and widespread infectious disease in dairy farms all over the world, which is characterized by inflammation of the mammary glands, resulting in a decrease in milk production and quality [1][2][3]. There are many kinds of pathogens of mastitis, and Staphylococcus aureus (S. aureus) is one of the important pathogens in subclinical mastitis [4]. S. aureus often causes clinical or subclinical mastitis, and the intramammary infections are usually characterized as mild, chronic, or persistent [5].
Bovine mammary epithelial cells (BMEC) are not only essential for the synthesis of milk components but also the main line of defense against pathogen invasion [3,6]. Milk contains a large number of exosomes, which are mainly derived from BMEC. The cargo carried by milk exosomes has been suggested to play a role in cell growth, development, immune regulation, and regulation [7]. Exosomes are tiny extracellular vesicles with a diameter ranging from 40 to 150 nm, which originate from the internal vesicles of multivesicles in almost all cell types [8]. Typical exosomes are surrounded by phospholipid membranes, which contain proteins, lipids, and nucleic acid molecules (including miRNA, cicrRNA, and lncRNA) [9]. With the deepening of the understanding of exosomes, exosomes are used as carriers to deliver drugs for targeted therapy. Because exosomes have been proven to tolerate the harsh environment in the intestine, they are absorbed by various cell types, cross biological barriers, and reach surrounding tissues [7]. Exosomes can also initiate a signal cascade of neighboring cells through the paracrine pathway. Therefore, it is very valuable to explore the RNA sequence in exosomes derived from MAC-T cells infected by S. aureus.
Long noncoding RNA (lncRNA) is a type of RNA that does not have the ability to encode proteins with more than 200 nucleotides in length [10]. LncRNA can bind to DNA, RNA, and proteins to play a role by mediating DNA methylation, mRNA degradation, chromatin remodeling, histone modification, and protein modification [10,11]. There have been research evidences that lncRNA plays a key role in mastitis caused by S. aureus, Escherichia coli, Mycoplasma bovis, and other pathogens though regulating the expression of specific genes. LRRC75A-AS1 in bovine mammary epithelial cells and breast tissue was downregulated under inflammatory conditions, and knocking out LRRC75A-AS1 in MAC-T cell line inhibited the adhesion and invasion of S. aureus and attenuated the activation of the NF-κB pathway [12]. LncRNA-TUB was the higher expression in the mammary epithelial cells of dairy cows that received proinflammatory stimulation, and knockout of lncRNA-TUB indicated that it played a key role in the morphology, proliferation, migration, and β-casein secretion of mammary epithelial cells [13].
Due to the spongy effect of exosomes, we speculated that the expression profile of lncRNA had changed in the exosomes derived from MAC-T cells infected with S. aureus. The purpose of this study was to determine the expression profile of lncRNA in exosomes derived from S. aureusinfected and noninfected MAC-T cells to provide new targets for the link between bovine mastitis caused by S. aureus and lncRNA for future research.

Exosomes Isolation.
Cell line exosomes were separated by ultracentrifugation as previously described [14]. The cell supernatant was sequentially subjected to 300×g, 10 min; 2000×g, 10 min; 10,000×g, 30 min in a large-capacity highspeed centrifuge (AVANTI J-26XPN, Beckman, USA). After collecting the supernatant, ultrahigh-speed centrifuge with SW32Ti rotor (optima XE-90, Beckman, USA) was hired to ultracentrifuge at 100,000×g for 70 min. Pellet was collected and resuspended in PBS and then performed ultracentrifugation again at 100,000×g for 70 min. Finally, the exosomal pellet was resuspended in PBS and collected for subsequent analysis.

Transmission Electron
Microscopy. Exosomes were mixed with an equal volume of 4% paraformaldehyde. 100 μL mixed liquid was sucked off and added dropwise to the copper net; then, phosphotungstic acid was added for negative staining. After drying, transmission electron microscope was used to observe exosomal morphology at 80 kV.

Particle Size Distribution.
Exosomes obtained by ultracentrifugation was diluted by PBS to the optimal concentration for analysis. ZETASIZER Nanoparticle Tracking Analysis (NTA) instrument (Malvern, Worcestershire, England) was performed to analyze particle size of exosome. According to the Stokes-Einstein principle, dynamic light scattering detected Brownian motion of particles to generate trajectories and displacements to determine particle size distribution.
2.5. Flow Cytometry Analysis. Exosomes acquired by ultracentrifugation separation were incubated with CD63 (#557288, BD Bioscience) or CD81 (#555676, BD Bioscience), and exosomes without antibody incubation were used as a negative control. BD accuri C6 flow cytometer was used for sample detection according to the operating instructions and then analyzed with FlowJo software.
2.6. RNA Extraction and lncRNA Sequencing. Total RNA was extracted from exosomal samples using Trizol reagent (Invitrogen, Carlsbad, CA). Total RNA is used as an input material for constructing cDNA library. Mainly, exosomal trace RNA purification, cDNA synthesis through reverse transcription, end repair and adaptor, PCR amplification, and other steps build a library and then go through the library quality inspection and sequence on an Illumina HiSeq 4000 platform (Illumina Inc., San Diego, CA).

Sequence Data Quality
Control. The original image file (BCL) obtained by sequencing was converted into raw data in FASTQ format after base recognition. Quality analysis on the original data was performed to assess whether it was suitable for bioinformatics analysis. Quality analysis mainly included sequencing quality and base composition analysis. Clean data was obtained by removing linker sequence and low quality sequence from raw data. The quality of clean data was then evaluated using FASTQC v0. 10 2.9. Differential Expression Analysis. Quantitative analysis of genes and transcripts was performed via htseq v0.11.2. DESeq2 (1.18.1) is used to determine differential expression in transcripts or gene expression data. It is determined that the transcripts or genes with P value < 0.05 are differentially expressed in biological replication. Q value < 0.05 and |log2 ðfold changeÞ | <1 are considered as thresholds for significant differential expression in abiotic replication.
2.10. Prediction of lncRNA-Gene Interactions. In order to predict the targets of lncRNAs in MAC-T cells and thus understand potential functions of lncRNAs, method as in previous research [15], firstly, we searched the 10 k/100 k coding gene upstream and downstream of lncRNA to find cis role lncRNA. Then, construction logarithm was modeled as log ðP/ð1 − PÞÞ = a + bx + cy, where a, b, and c were model parameters, y represented the percentage of GC content in the sequence, and P represents lncRNA binding to a 10 kb long sequence odds ratio. Predict statistical significance by using Wald test on Z statistics (P value <0.05).
Only lncRNAs with P value <0.05 and z score > 0 were annotated as trans-lncRNA.

Kyoto Encyclopedia of Genes and Genomes (KEGG)
Biological Pathway Enrichment and Gene Ontology (GO) Functional Enrichment Analysis. The GOseq R package was used to analyze GO enrichment for differentially expressed genes or lncRNA target genes [15]. GO terms with Q values < 0.05 were considered significantly enriched by differentially expressed genes. The summary differential expression genes and lncRNA target genes in KEGG pathway enrichment were analyzed by KOBAS (v3.0) software.
2.12. RT-qPCR Validation. Total exosomal RNA samples from S. aureus-infected and noninfected MAC-T cells for the lncRNA-seq were analyzed by RT-qPCR. Then, cDNA was obtained by reverse transcriptase reagent according to the instruction manual. The qPCR was performed using the SYBR Green Plus Reagent Kit with the Light Cycler 96 instrument (Roche, Basel, Switzerland) following the instructions of the manufacturer. GAPDH was used as an internal reference gene. The primer sequences used in the study were listed in Table 1.
2.13. Statistical Analysis. IBM SPSS 20 was used to perform statistical analyses. The one-way analysis of variance (ANOVA) was performed to detect statistical differences in lncRNAs and mRNAs between S. aureus-infected and noninfected groups. P < 0:05 and P < 0:01 were considered as significant differences.  Total reads: statistics of total reads, each adjacent four lines contains the information of one read, and the total read number of each file is calculated; total base: the number of all bases in total data; clean reads: filter the original data, remove the linker sequence, remove the contaminated part, and remove the sequence containing more low-quality bases to obtain clean reads; GC%: the percentage of G and C in all bases; error rate: the error rate of sequencing; Q20, Q30: the percentage of total number of bases where the Phred score is greater than 20 and 30 which indicates base call accuracy.

Identification of Exosomes.
In order to directly observe the morphology of exosomes, we employed transmission electron microscopy to observe negatively stained exosomes samples. As shown in Figure 1(a), it had a very obvious single-layer membrane structure under the electron microscope, which appeared as a saucer-shaped or disc-shaped vesicle with one side recessed. NTA analysis was performed to further determine the size of exosomes. The main peak of the particle size of the exosomes we separated was 70.55 nm, and the 80.3% particle size was between 40 and 150 nm (Figure 1(b)). Flow cytometry was used to detect the surface characteristic markers of exosomes. The results revealed that the positive rates of CD81 and CD63 were 92.7% and 85.9%, respectively (Figure 1(c)).

Sequencing Data Statistics of Exosomes Derived from MAC-T Cells.
In the current study, a total of 4 cDNA libraries were constructed by isolating total RNA from exosomes derived from S. aureus-infected and noninfected MAC-T cells. The raw reads in the cDNA library of different samples were cont1, 67572136; cont2, 66474707; infect1, 69033293; and infect2, 70292999. GC content (%) percentages were found to be cont1, 48.61; cont2, 47.63; infect1, 49.75; and infect2, 49.98 (Table 2). We conducted quality control analysis on raw data, and the results are as follows. The quality of the bases in raw reads was counted in each sequencing cycle. The base composition of each cycle was shown in Figure 2(a). As shown in Figure 2(b), the blue line represented the average base quality of the cycle in the base quality distribution box plot. Base error rate is limited to 0.11-0.13 in raw reads (Figure 2(c)). We filtered raw data to get clean data, and clean reads, GC%, Q20, Q30, and clean bases were counted in Table 3, and the quality control results of clean reads were shown in Figures 2(d)-2(f). In addition, when the clean reads were aligned with the bovine reference genome using HISAT2 software by an improved BWT algorithm (FM index), it was determined that the Total mapped reads or fragments belonging to all the samples were above 80% and was mapped in the reference genome (Table 4).

Distribution of Reads in the Chromosomes and Known
Types of RNAs. After mapped in the reference genome, we counted the position information of the genomes corresponding to all reads to evaluate the depth of sequencing data coverage. The distribution of reads on each chromosome was displayed in Figure 3(a). Subsequently, the reads aligned on the chromosome were annotated to exonic, intronic, and Raw reads: statistics of raw reads, each adjacent four lines contains the information of one read, and the total read number of each file is calculated; raw base: the number of all bases in raw data; clean reads: filter the original data, remove the linker sequence, remove the contaminated part, and remove the sequence containing more low-quality bases to obtain clean reads; GC%: the percentage of G and C in all bases; Q20, Q30: the percentage of total number of bases where the Phred score is greater than 20 and 30 which indicates base call accuracy; clean bases: the number and length of sequences in clean reads calculate the number of bases; clean bases%: percentage of clean bases/raw base. The effective reads: the remaining number of clean reads after removing rRNA reads will be used for subsequent genome alignment; total mapped: the number of sequencing sequences that can be aligned on the genome; multiple mapped: the number of sequencing sequences with multiple alignment positions on the sequence of reference; uniquely mapped: the number of sequencing sequences with unique alignment positions on the reference sequence; read-1, read-2 mapped: pair-end sequence, whose two parts that can be located on the number of genome, respectively; the statistical ratio of the two parts should be roughly the same; reads map to "+": the number of reads aligned to the positive strand of the genome; reads map to "-": the number of reads aligned to the genome on the negative strand; reads mapped in proper pairs: the relative distance of the pair end sequence mapping to the genome conforms to the length distribution of the sequenced fragments. 6 Oxidative Medicine and Cellular Longevity intergenic. Reads compared to the genome counted the distribution (Figure 3(b)). And the read distributions in the known RNA types were shown in Figure 3(c).

Differentially Expressed Gene Analysis in Exosomes
Derived from S. aureus-Infected and Noninfected MAC-T Cells. DESeq2 was used to screen differentially expressed genes. We selected the differentially expressed genes between samples through the two levels of multiple of difference (|log2 ðFold ChangeÞ | >1) and significance level (Q value < 0.05) [16][17][18]. The statistics of the number of differentially expressed genes in exosomes derived from S. aureus-infected and noninfected MAC-T cells were shown in Figure 4(a) and Table 5. There were a total of 186 gene expression disorders, of which 31 genes were significantly upregulated and 155 genes were significantly downregulated. We used the RPKM/COUNT of the differential gene as the expression level. Hierarchical clustering analysis was performed to cluster genes with the same or similar expression patterns into clusters and used different colored regions to represent different clustering grouping information to determine clustering mode of different sample control modes [15,19]. In this study, the hierarchical clustering analysis of differentially expressed genes is shown in Figure 4(b).       9 Oxidative Medicine and Cellular Longevity GO enrichment analyses were conducted to search for the biological processes, cellular components, and molecular functions of differentially expressed genes in more detail [15,20]. GO analysis was performed on the upregulated and downregulated genes. According to the enrichment point, all disordered genes are classified according to GO terms (Figures 4(c) and 4(d)). Moreover, KEGG pathway analysis results were calculated according to the enrichment points, and the best gene pathways related to the upregulated and downregulated genes were listed in Figures 4(e) and 4(f) and Table 5 3.6. Prediction of lncRNA Target Genes and mRNAs. The mode of action of lncRNA regulating target genes could be divided into two categories: cis regulation (In Cis) and transregulation (In Trans). We made predictions about how the differentially expressed lncRNAs regulated the target genes. Differentially expressed lncRNAs were selected to draw a network diagram of the interaction between these lncRNAs and their target genes (Figure 6), so as to provide reference and help for the overall analysis of the functions of lncRNAs in samples. As illustrated in Figure 6

GO Enrichment and KEGG Enrichment
Analysis of lncRNA Target Genes and mRNAs. GO analysis was performed on the target gene and took the P < 0:05 of GO annotation enrichment as the significance threshold. GO terms of biological processes, cellular components, and molecular functions were shown 10 GO terms, respectively (Figure 7(a)). KEGG analysis was performed on the target gene and used the P < 0:05 of the enrichment degree of KEGG pathway as the significance threshold to obtain the analysis result. KEGG-enriched target gene pathways were classified according to environmental information processing, human diseases, metabolism, and biological systems. It was found that lncRNA target genes were most closely related to human diseases (Figure 7(b)). KEGG pathway analysis results are shown in the bubble diagram of the KEGG pathway (Figure 7(c)). Multiple target genes in the KEGG enrichment were involved in oxidative stress (FoxO signaling pathway (ATM, SKP2); longevity regulating pathway, cAMP signaling pathway, and calcium signaling pathway (CAMK4); reactive oxygen species metabolic pathways (CYP1A2); oxidative phosphorylation (NDUFA6)) and inflammation (TNF signaling pathway (FAS, MLKL); cytokine-cytokine receptor interaction (FAS, IL1A, and TNFRSF13C); chemokine signal-ing pathway (GRK4); MAPK signaling pathway (FAS, IL1A); and NF-kappa B signaling pathway (ATM, TNFRSF13C)) and apoptosis (p53 signaling pathway (ATM, FAS, SER-PINB5, GORAB); apoptosis (ATM, FAS)).

Discussion
Mastitis is an inflammation of the udders of dairy cows that is almost always caused by pathogenic microorganisms [21]. It seriously harms dairy farming and the dairy industry. Cow mammary epithelial cells can produce exosomes and excrete with milk during lactation, and exosomes can also communicate between cells through paracrine pathways. The cargo      14 Oxidative Medicine and Cellular Longevity carried in the exosomes is the switch that initiates communication. The aim of this study was to investigate the expression of mRNA and lncRNA in exosomes derived from bovine mammary epithelial cells infected by S. aureus.
In this study, we used ultracentrifugation to separate the exosomes in the MAC-T cell culture medium. We directly observed saucer-shaped vesicles through transmission electron microscopy. The NTA instrument detected 80% of the extracellular vesicles with a size of 40-150 nm. In addition, the positive rates of CD81 and CD63, which are characteristic markers on the surface of exosomes, were more than 80% by flow cytometry. All the above data implied that exosomes were successfully isolated.
After the total exosomal RNA extracted, high-throughput sequencing was performed on the Illumina platform to reveal the lncRNA and gene expression profiles in the exosomes derived from S. aureus-infected and noninfected MAC-T cells. It was found that186 differentially expressed genes, 431 differentially expressed mRNAs, and 19 differentially expressed lncRNAs were screened out. Through GO function enrichment, most of the differentially expressed genes were involved in reactive oxygen species metabolic processes,  signal transduction of gene expression regulation, Wnt signaling pathway, planar cell polarity pathway, regulation of macrophage activation, and various biochemical metabolic processes (carboxylic acid, oxoacid and lactate metabolic process, ketone and steroid catabolic process, etc.). In the enrichment of the KEGG pathway, the differentially expressed genes participated in the signal pathways were mainly concentrated in three aspects, including pathogen invasion and adhesion (bacterial invasion of epithelial cells, cell adhesion molecules, and tight junctions), metabolism (oxidative phosphorylation, reactive oxygen species metabolism), inflammation, and apoptosis (p53 signaling pathway, Ras signaling pathway, and MAPK signaling pathway). In addition to differentially expressed genes, the function of most of these lncRNAs was unclear. We hypothesized that lncRNA regulated the expression of these genes, so we predicted the function of lncRNA based on its closely related coding genes [22]. In this study, we estimated the target genes of differentially expressed lncRNA using cis and trans, then GO and KEGG analyses were performed on the target gene to find the potential role of lncRNA [15]. We found that these differentially expressed target genes were mainly related to inflammation or apoptosis signaling pathways through enrichment analysis of GO and KEGG pathways and coding-noncoding coexpression network analysis. Differentially expressed lncRNA XR_001500758.2 and XR_003029422.1 were enriched in TNF signaling pathway, Notch signaling pathway, p53 signaling pathway, RIG-I-like receptor signaling pathway, NF-kappa B signaling pathway, MAPK signaling pathway, and PI3K-Akt signaling pathway. These signaling pathways are closely related to inflammation or apoptosis [23]. The target genes predicted for XR_ 001500758.2 included ATM, FAS, SERPINB5, PPIP5K2, SNW1, MLKL, IL1A, and PLA1A. ATM protein kinase is a product encoded by the telangiectasia ataxia-mutated gene ATM, which senses DNA damage, transmits DNA damage signals to downstream target genes, initiates stress systems, and produces cycle arrest, cell repair, and apoptosis [24,25]. FAS, adapter proteins associated with the death domain (FADD and TRADD), belongs to TNFR family. FAS could activate NF-kappa B, MAPK, and PI3K-Akt signaling pathway to initiate transcription of inflammatory genes [26]. In addition, the target genes predicted for XR_003029422.1 included CAMK4, CYP2R1, SKP2, NDUFA6, BOLA-DOA, TNFRSF13C, GORAB, and ITGA1. CAMK4 was involved in the longevity regulating pathway, cAMP signaling pathway, and calcium signaling pathway in KEEG pathway enrichment. Previous studies have confirmed that the CAMK4 gene was closely related to human longevity and hypertension and regulated the immune response by activating transcription factors to regulate gene expression in immune cells [27,28]. CYP1A2 was enriched in reactive oxygen species metabolic pathways. It has been reported that CYP1A2 could regulate various metabolic processes to participate in the production of reactive oxygen species during oxidative stress [29,30]. SKP2 helps to inhibit oxidative stress induced-DNA damage and cell apoptosis, which limits the misincorporation of ROS-damaged dNTPs into genomic DNA, making cells more resistant to such stress [31]. NDUFA6 is a mitochondrial protein involved in mitochondrial energy metabolism. The dysregulation of NDUFA6 leads to mitochondrial dysfunction and oxidative stress [32]. Based on predictions, we found that lncRNA had multiple targets and was involved in the complex process of regulating the immune response.
In conclusion, the molecular mechanism of S. aureus infecting dairy cow mammary epithelial cells involves the paracrine mode of exosomal cargo lncRNA playing a regulatory role in pathogen invasion and adhesion, oxidative stress, inflammation, and apoptosis. The data obtained in this study