The SARS-CoV-2 UTRs Intrude on Host RBPs and Modulate Cellular Splicing

SARS-CoV-2 is a novel coronavirus that causes a potentially fatal respiratory disease known as coronavirus disease 2019 (COVID-19) and is responsible for the ongoing pandemic with increasing mortality. Understanding the host-virus interactions involved in SARS-CoV-2 pathophysiology will clarify the mechanistic basis of COVID-19. In particular, characterizing post-transcriptional gene regulatory networks, especially pre-mRNA splicing, and identifying the host proteins that interact with the 5′ and 3′UTRs of SARS-CoV-2 will improve our understanding of post-transcriptional gene regulation during SARS-CoV-2 pathogenesis. Here, we demonstrate that either SARS-CoV-2 infection or exogenous overexpression of the 5′ and 3′UTRs of the viral genomic RNA results in reduced mRNA levels, possibly due to modulation of host cell pre-mRNA splicing. Further, we have investigated the potential RNA-binding proteins interacting with the 5′ and 3′UTRs using in-silico approaches. Our results suggest that the 5′ and 3′UTRs indeed interact with many RNA-binding proteins. Our results provide a primer for further investigations into the UTR-mediated regulation of splicing and related molecular mechanisms in host cells.


Introduction
SARS-CoV-2 is an enveloped, positive-sense, single-stranded RNA virus that is highly transmissible among humans. Exponential transmission of SARS-CoV-2 raised serious concerns regarding public health, and COVID-19 was proclaimed a global pandemic on 11 March 2020 by the World Health Organization (WHO) [1][2][3]. The positive-sense RNA genome of SARS-CoV-2 can be directly recognized by host ribosomes as an mRNA, which is followed by the expression of viral proteins pivotal for replication [4]. As with other RNA viruses, SARS-CoV-2 relies on host cell factors involved in major cellular processes, such as RNA localization, processing, translation, and stability regulation, to complete its life cycle [5].
To date, in vitro studies of SARS-CoV-2 infection in human and Vero (African green monkey kidney) cells have been performed to characterize either host cell gene expression [6,7] or the global dysregulation of pre-mRNA alternative splicing and its correlation with antiviral immunity [8][9][10]. In addition, protein-protein interaction studies have indicated multiple targets that could be exploited to inhibit viral replication using repurposed drugs [9]. Among RNA viruses, specific interactions between different host proteins and the viral genomic RNA may play a crucial role in the life cycle. Further, such cross-talk between the viral genome and host trans-acting factors may lead to altered regulatory pathways in the host during infection, and these need to be characterized [11,12]. RNA-binding proteins (RBPs) are host cell proteins that can bind to single- and double-stranded RNA molecules, forming ribonucleoprotein complexes, and their dysregulation has been shown to significantly alter the regulatory networks in a range of diseases, including cancer, genetic diseases, and viral diseases [13,14].
From tiny microbes to multicellular organisms, gene expression is tightly regulated and coupled with all the crucial processes necessary for survival. In eukaryotes, protein-coding genes are interrupted by noncoding segments called introns, and the coding sequence is dispersed in the form of short fragments called exons. The introns are removed from precursors of messenger RNAs (pre-mRNAs) in a reaction catalyzed by a multimegadalton RNA-protein complex called the spliceosome, via the process known as splicing [15][16][17]. RNA processing plays an essential role in virtually all cellular processes and is vital for gene expression. Inside the cell, RNA molecules are associated with RNA-binding proteins and form dynamic ribonucleoprotein particles (RNPs) that affect most steps of RNA metabolism [18]. Specific RBPs congregate on pre- and mature mRNAs, governing gene regulation at the post-transcriptional level, and mutations affecting the function of RBPs lead to diseases [19][20][21]. Apart from this, noncoding RNA has also been reported as a post-transcriptional gene regulator in other viral infections [22,23].
In human cells, most introns are removed during pre-mRNA splicing by the canonical U2-dependent major spliceosome. However, over 0.5% of human introns are excised by an alternative splicing pathway catalyzed by the minor spliceosome, which depends on U12 snRNA [24,25]. Although U12-dependent introns are rare, they are often located in genes with critical cellular functions, and mutations in the minor spliceosome components lead to several genetic disorders [24,26,27]. Multiple studies have reported interactions between coronavirus genomic RNA and human RBPs, including serine/arginine-rich splicing factor (SC-35), the heterogeneous nuclear ribonucleoprotein family (hnRNPA1 and hnRNPAQ), transformer 2 alpha homolog (TRA2A), and polypyrimidine tract binding protein (PTBP-1) [28,29].
The SARS-CoV-2 genome contains a 272-nucleotide untranslated region (UTR) at the 5′ end and a 207-nucleotide UTR at the 3′ end. The sequence of these genomic RNA UTRs shows considerable homology with that of other beta coronaviruses, such as MERS- and SARS-related beta coronaviruses [30]. Within the secondary structure of the UTR RNA, stem-loops and hypervariable regions are known to mediate steps that are critical for the viral life cycle, such as subgenomic RNA synthesis, discontinuous transcription of ORFs, viral RNA packaging, and pathogenesis [30][31][32].
The interaction of host RBPs with the 5′ and 3′UTR RNA of the viral genome is crucial for the pathogenesis of many RNA viruses. Earlier, it was reported that PTBP1 interacts with the 5′UTR of the mouse hepatitis virus (MHV), a member of the Coronaviridae family, and regulates replication of the viral RNA [33,34]. There is, however, a considerable gap in our knowledge of the composition of these RNA-protein complexes and of their functional implications for the viral life cycle and pathogenesis.
Here, our study suggests that SARS-CoV-2 infection results in a reduction in mRNA levels, possibly due to modulated pre-mRNA splicing in the host cells. Interestingly, exogenous overexpression of the SARS-CoV-2 5′UTR and 3′UTR also leads to reduced mRNA levels. Overall, our results hint at deregulation of host mRNA splicing. Additionally, overexpression of the 3′UTR caused a more pronounced effect than that induced by the 5′UTR. To gain insight into the molecular mechanism of this deregulation, we have also investigated the potential interaction between host RBPs and the viral UTR RNAs using the catRAPID tool. Our results suggest that many RBPs indeed interact with the UTRs and may be involved in modulating various molecular mechanisms, including pre-mRNA splicing.

Cell Lines and Transfections.
Vero cells were used for SARS-CoV-2 infection studies. Briefly, cells were infected with SARS-CoV-2 (the Wuhan strain) at an MOI of 0.01, and mock-infected cells were taken as the control. At 24 hours postinfection, total RNA from uninfected or infected cells was extracted using Trizol and purified. The qualitative and quantitative estimation of RNA was performed using a NanoDrop spectrophotometer. Purified total RNA was used for reverse transcription, followed by PCR for estimation of splicing.
The human embryonic kidney cell line HEK293T was used for exogenous overexpression of SARS-CoV-2 UTR RNA. A monolayer of HEK293T cells was grown in advanced complete Dulbecco's modified Eagle's medium (DMEM) (Invitrogen) containing 10% foetal bovine serum (FBS) and antibiotics (penicillin and streptomycin, Sigma).

Splicing Assay.
The splicing efficiency of candidate genes was tested by RT-PCR analysis of total RNA. Briefly, total RNA extracted from transfected cells was treated with RQ1 RNase-free DNase (Promega, Cat No. M610A). First-strand cDNA was synthesized from 500 ng of total RNA using random hexamers (Cat No. 48190011), followed by PCR using exon-specific primers with Phire polymerase (Thermo, Cat No. F122L). Amplified PCR products were resolved in a 1.5% agarose gel, stained with ethidium bromide, and visualized using an advanced gel imaging system with chemiluminescence detection and a laser diode (GBOX XX9; Syngene). TIF images from the gel documentation system were analyzed and the RT-PCR products quantified using ImageJ software (NIH), with β-Actin as the loading control. The mRNA level of each gene was normalized to that of β-Actin mRNA, and the figures depict the normalized mRNA as a percentage.
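The normalization described above can be illustrated with a minimal sketch. It assumes that each band is summarized by a single ImageJ integrated density and that the percentage is expressed relative to the control lane; the function names and the intensity values are hypothetical and are not taken from the study.

```python
"""Sketch: normalize RT-PCR band intensities to a beta-actin loading control."""


def actin_normalized(band: float, actin: float) -> float:
    """Return the band intensity divided by the beta-actin intensity of the same lane."""
    if actin <= 0:
        raise ValueError("loading-control intensity must be positive")
    return band / actin


def percent_of_control(sample_band: float, sample_actin: float,
                       control_band: float, control_actin: float) -> float:
    """Express the actin-normalized sample signal as a percentage of the control lane."""
    control_ratio = actin_normalized(control_band, control_actin)
    if control_ratio == 0:
        raise ValueError("control band intensity must be non-zero")
    return 100.0 * actin_normalized(sample_band, sample_actin) / control_ratio


# Hypothetical integrated densities in arbitrary units (not data from the study).
value = percent_of_control(sample_band=6000.0, sample_actin=10000.0,
                           control_band=8000.0, control_actin=10000.0)
print(f"normalized mRNA: {value:.1f}% of control")
```

Dividing by the loading control first means that lane-to-lane differences in RNA input cancel out before the sample is compared with the control.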

Statistical Analysis.
Two-way ANOVA was performed using GraphPad Prism 5.01 (GraphPad Software, San Diego, CA, USA) to measure the statistical significance of splicing changes during 5′UTR and 3′UTR overexpression. P values of <0.05 in the Student's t-test were considered to be significant.
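For readers unfamiliar with the test, the following is a minimal sketch of the F statistics of a balanced two-way ANOVA with replication. It is not the GraphPad Prism implementation used in the study; the factor names (construct and gene) and all values are hypothetical. Converting the F statistics into P values additionally requires the F distribution (for example, `scipy.stats.f.sf`), which is left out to keep the sketch dependency-free.

```python
"""Sketch: F statistics for a balanced two-way ANOVA with replication."""
from statistics import mean


def two_way_anova_f(cells):
    """cells[a][b] is a list of replicates for level a of factor A and level b of factor B.

    Returns the F statistics for factor A, factor B, and their interaction.
    """
    a_levels = list(cells)
    b_levels = list(cells[a_levels[0]])
    n = len(cells[a_levels[0]][b_levels[0]])
    na, nb = len(a_levels), len(b_levels)
    if na < 2 or nb < 2 or n < 2:
        raise ValueError("need at least 2 levels per factor and 2 replicates per cell")
    if any(len(cells[a][b]) != n for a in a_levels for b in b_levels):
        raise ValueError("design must be balanced")

    grand = mean(x for a in a_levels for b in b_levels for x in cells[a][b])
    mean_a = [mean(x for b in b_levels for x in cells[a][b]) for a in a_levels]
    mean_b = [mean(x for a in a_levels for x in cells[a][b]) for b in b_levels]

    # Sums of squares for the two main effects, the interaction, and the residual.
    ss_a = nb * n * sum((m - grand) ** 2 for m in mean_a)
    ss_b = na * n * sum((m - grand) ** 2 for m in mean_b)
    ss_cells = n * sum((mean(cells[a][b]) - grand) ** 2
                       for a in a_levels for b in b_levels)
    ss_ab = ss_cells - ss_a - ss_b
    ss_err = sum((x - mean(cells[a][b])) ** 2
                 for a in a_levels for b in b_levels for x in cells[a][b])
    if ss_err == 0:
        raise ValueError("zero residual variance; F statistics are undefined")

    ms_err = ss_err / (na * nb * (n - 1))
    return {
        "A": (ss_a / (na - 1)) / ms_err,
        "B": (ss_b / (nb - 1)) / ms_err,
        "AxB": (ss_ab / ((na - 1) * (nb - 1))) / ms_err,
    }


# Hypothetical normalized mRNA percentages: construct (factor A) x gene (factor B).
example = {
    "vector": {"gene1": [100.0, 98.0], "gene2": [101.0, 99.0]},
    "3UTR": {"gene1": [62.0, 60.0], "gene2": [66.0, 64.0]},
}
print(two_way_anova_f(example))
```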

catRAPID Analysis.
The catRAPID algorithm was used to predict host proteins that can potentially bind to RNA corresponding to the 5′ and 3′UTRs of the SARS-CoV-2 genomic RNA [35,36]. The catRAPID algorithm can predict RNA-protein interactions in multiple species through an analysis of secondary structure, hydrogen bonding, and van der Waals interactions. For the analysis, the sequence of either an RNA or a protein can be used as the input query.
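As an illustration of how the RNA queries could be prepared, the sketch below cuts the two UTRs out of a genome sequence using the lengths given in the Introduction (272 nucleotides at the 5′ end and 207 nucleotides at the 3′ end) and formats them as FASTA records. The function names are hypothetical, the toy genome is a placeholder rather than the viral sequence, and the assumption that the query is submitted as FASTA text is ours.

```python
"""Sketch: extract the 5' and 3' UTR sequences from a genome string for use as queries."""


def extract_utrs(genome: str, five_len: int = 272, three_len: int = 207) -> dict:
    """Return the first `five_len` and last `three_len` nucleotides of the genome."""
    seq = "".join(genome.split()).upper()  # drop whitespace and line breaks
    if len(seq) < five_len + three_len:
        raise ValueError("genome is shorter than the two UTRs combined")
    return {"5UTR": seq[:five_len], "3UTR": seq[-three_len:]}


def to_fasta(name: str, seq: str, width: int = 60) -> str:
    """Format one sequence as a FASTA record with fixed-width lines."""
    lines = [f">{name}"]
    lines += [seq[i:i + width] for i in range(0, len(seq), width)]
    return "\n".join(lines)


# Placeholder sequence (800 nt), standing in for the real genomic RNA.
toy_genome = "ACGU" * 200
for utr_name, utr_seq in extract_utrs(toy_genome).items():
    print(to_fasta(f"SARS-CoV-2_{utr_name}", utr_seq))
```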

SARS-CoV-2 Infection and Overexpression of SARS-CoV-2 5′UTR and 3′UTR Affect Host Splicing.
Interactions of viral cis-acting RNA elements and nonstructural proteins with host trans-acting factors play a key role in host-virus interaction. A transcriptional or post-transcriptional modulation in the expression of a crucial RBP-encoding gene in infected host cells can have a significant impact on the outcome of viral pathogenesis. To investigate any effect of SARS-CoV-2 infection on pre-mRNA splicing, total RNA was extracted from Vero cells that were either mock-infected or infected with the virus at 0.01 MOI. This was followed by RT-PCR using primers located in the specific exons flanking U12-type introns, so that the resulting PCR products reveal either the pre-mRNA or the mRNA levels of the given genes. The candidate genes selected for the splicing assay were MORC3, THOC2, CRNKL1, HNRNPLL, and DDX54 (Figure 1). The results showed that, compared to mock-infected cells, the mRNA from each gene was reduced by approximately 25%. This indicated a possible reduction in the splicing of the mRNAs of these genes caused by SARS-CoV-2 replication in these cells.
To further investigate a possible involvement of the SARS-CoV-2 UTRs in modulating host cell splicing as observed above, pre-mRNA splicing of these genes was quantified upon exogenous overexpression of RNA corresponding to either the 5′UTR or the 3′UTR of the SARS-CoV-2 genome. For this purpose, a eukaryotic expression vector encoding either the 5′UTR or the 3′UTR of the SARS-CoV-2 genome was transfected into HEK293T cells, and overexpression of the cognate RNA relative to untransfected cells was assessed by RT-PCR of purified total RNA (Figure 2). Subsequently, a splicing assay was performed to investigate the mRNA levels of the candidate genes mentioned earlier, namely MORC3, THOC2, CRNKL1, and HNRNPLL. As shown in Figure 3, the mRNA levels were reduced during overexpression of the 5′ and 3′UTRs of SARS-CoV-2 compared to the vector control. Interestingly, overexpression of the 3′UTR resulted in a greater reduction of mRNA than overexpression of the 5′UTR (Figure 3). For statistical analysis of the results, a two-way ANOVA test was performed. Our observations suggest that the splicing efficiency of MORC3, THOC2, CRNKL1, and HNRNPLL was reduced by up to 40% during 3′UTR overexpression as compared to the vector control. In contrast, overexpression of the 5′UTR of SARS-CoV-2 reduced the mRNA of the same genes by about 20%. We hypothesize that SARS-CoV-2 UTRs overexpressed in HEK293T cells could act as a sponge, titrating away host RNA-binding proteins involved in splicing and other gene regulatory pathways.

In-Silico Approaches to Study UTR-RBP Interaction.
mRNA splicing involves a complex interaction of multiple host splicing factors. Earlier studies have reported that host cellular factors, specifically RBPs, actively participate in every crucial process during virus infection [37][38][39]. To investigate the possible RNA-binding proteins interacting with the UTRs, we performed computational studies using catRAPID omics, which we used extensively to analyze RBPs that interact with the SARS-CoV-2 UTRs. We have listed the top interactions in descending order of interaction strength (Table 2). These top hits indicate high specificity for the interaction. This study suggests that both the 3′ and 5′UTRs of SARS-CoV-2 could interact with a vast array of RBPs (Table 3). The interaction profile also indicates that the 5′UTR interacts with RBPs with higher specificity than the 3′UTR. In addition, we have evaluated the functional implications of these RBPs and found that the PM14, MGN2, and MGN proteins interacting with the 3′UTR could be important determinants of pre-mRNA splicing. PM14 is a component of the splicing factor SF3B complex, and MGN and MGN2 are components of the EJC (Exon Junction Complex). The RBPs interacting with the 5′UTR are highly diverse in function. We have further compared our results with recently published studies in which the RaPID assay was used to identify the host interacting partners of the SARS-CoV-2 UTRs [27]. The RaPID assay involves BirA ligase-mediated biotinylation of host proteins that interact with cloned RNA sequences. However, none of the proteins observed in the RaPID assay featured in the catRAPID-predicted list of UTR-interacting proteins. A possible explanation is that the RaPID assay preferentially detects interacting proteins that are cytosolic rather than nuclear. Further studies are needed to experimentally validate these interactions and their significance under disease conditions.
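The ranking and the comparison with the experimental list can be sketched as follows. This is only an illustration under stated assumptions: the `Hit` record, the score values, and the experimental protein names are hypothetical placeholders, and only the protein identifiers PM14, MGN, MGN2, DDX1, and DDX3X are taken from the text.

```python
"""Sketch: rank predicted RBP-UTR interactions and compare them with an experimental list."""
from typing import NamedTuple


class Hit(NamedTuple):
    protein: str
    utr: str      # "5UTR" or "3UTR"
    score: float  # hypothetical interaction-strength score


def top_hits(hits, utr, n=10):
    """Return up to n hits for one UTR in descending order of interaction score."""
    selected = [h for h in hits if h.utr == utr]
    return sorted(selected, key=lambda h: h.score, reverse=True)[:n]


def overlap(predicted, experimental):
    """Return the proteins present in both the predicted hits and the experimental list."""
    return {h.protein for h in predicted} & set(experimental)


# Hypothetical scores; they are not the values reported in Tables 2 and 3.
predictions = [
    Hit("MGN", "3UTR", 0.71),
    Hit("PM14", "3UTR", 0.93),
    Hit("MGN2", "3UTR", 0.82),
    Hit("DDX1", "5UTR", 0.88),
    Hit("DDX3X", "5UTR", 0.79),
]
experimental_list = ["PROTEIN_A", "PROTEIN_B"]  # placeholder names

ranked = top_hits(predictions, "3UTR")
print([h.protein for h in ranked])
print(overlap(predictions, experimental_list))
```

An empty intersection, as in this toy input, corresponds to the situation reported above, where no protein is shared between the two lists.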

Discussion
Eukaryotic RNA splicing is a major mechanism that directly contributes to human proteome diversity and is associated with several chronic disease conditions, including cancers of different types, atherosclerosis, and so on [40,41].
Figure 2: Overexpression of SARS-CoV-2 5′UTR and 3′UTR in HEK293T cells: cells were transfected with plasmid pcDNA3.1 encoding either the 5′UTR or the 3′UTR of SARS-CoV-2. Total RNA was extracted using Trizol reagent. The overexpression was validated by RT-PCR using 5′UTR- and 3′UTR-specific primers (lanes 6 and 7). Lane 4 depicts a positive control PCR reaction in which pcDNA3.1 constructs carrying the 5′UTR or 3′UTR were used (Table 1).
Advances in Virology
[...] the alternative splicing events of host cells. This is because of intrinsic factors like polymorphism at the splice sites or due to direct interference of virulence factors [42,43]. Similarly, viral components, including UTRs and structural and nonstructural proteins, often interact with the host cellular machinery. In various disease conditions, from cancer to viral infection, host factors, including transcription factors, splicing components, and ncRNAs, are potent targets for modulation of the cellular environment [44], and there are various transcriptomics approaches to study these ncRNAs in disease conditions [45]. Dengue virus, HIV, Zika virus, and SARS-CoV-2 have been shown to hijack the host splicing machinery, which leads to regulation of the physiologically important immune response, specifically during infection [46][47][48]. Various viral proteins have been shown to interact with host splicing components and can alter host splicing. Earlier studies have shown that the immediate early infected cell protein ICP27 (IE63) from the Herpesviridae family interacts with host SRSF2 and SRSF3 and causes the hypophosphorylation of SR proteins. Additionally, SR proteins also inhibit the splicing of the host cell at an early stage of spliceosome assembly [49,50].
Similarly, 3Dpol from picornaviruses has RdRp activity and localizes to the nucleus. It associates with Prp8, one of the major components of the spliceosome, and blocks the second step of catalysis, which causes lariat-form splicing intermediates to accumulate [51]. Host splicing modulation has also been demonstrated for the influenza A virus (IAV). Nonstructural protein-1 (NS1) of IAV is a vital multifunctional protein that is required for viral replication and suppresses the innate antiviral host response. Moreover, NS1 contains an RNA-binding domain that binds to a specific region of U6 snRNA and acts as a potent inhibitor of host splicing [52][53][54].

SARS-CoV-2 Virus and UTRs Modulate Pre-mRNA Splicing.
Here, we have investigated the effect of SARS-CoV-2 infection and overexpression of UTRs on the mRNA levels of U12 intron-containing genes. Also, we have identified the functional role of the 5′ and 3′ [...]
Alternate splicing of MORC3, THOC2, HNRNPLL, and CRNKL1 was probed using RT-PCR. Actin was used as a loading control. Statistical significance was determined using two-way ANOVA: P < 0.05; P < 0.01; P < 0.001.
Recently, Banerjee et al. have shown that the nonstructural protein NSP16 of SARS-CoV-2 interacts with the major spliceosomal snRNAs U1 and U2, particularly at their mRNA recognition sites, and disrupts host pre-mRNA splicing [55]. Similarly, the nonstructural protein 5 (NS5) of DENV has been shown to interact with U5 snRNP, a component common to both the U2 and U12 spliceosomes. This interaction leads to reduced host pre-mRNA splicing [47]. Our results suggest that overexpression of the 3′UTR reduces mRNA levels more than overexpression of the 5′UTR (compare Figures 3(b) and 3(c)). This hints that any effect of the 3′UTR on pre-mRNA splicing would be greater than that of the 5′UTR. Our study suggests that SARS-CoV-2 may modulate host pre-mRNA splicing in various ways. In addition to NSP16-mediated regulation, UTR-mediated modulation may be an additional way of regulating the post-transcriptional machinery. Currently, we do not know the primary target genes for either mechanism of spliceosome regulation. Because the regulatory mechanisms differ, the target genes may also be different.

catRAPID Profiling Suggests That Various RBPs Interact with UTRs.
We have performed an in-silico study to analyze the interaction profile of the SARS-CoV-2 3′ and 5′UTRs with RBPs. The analysis indicates that the SARS-CoV-2 UTRs can interact with a wide range of RBPs. The predicted interactions are much more specific for the 5′UTR than for the 3′UTR. In the prediction table, we searched for RBPs that act as splicing factors and found that a few of the RBPs interacting with the 3′UTR do so, namely PM14, MGN2, and MGN. PM14 is a component of the splicing factor SF3B complex [56], MGN2 is a constituent of the EJC (Exon Junction Complex) [10], and MGN participates in RNA splicing [57]. These proteins play important roles in processes ranging from spliceosome complex formation to exon junction recognition, and hence they could regulate cellular splicing on a large scale. The top hits listed as interacting partners of the 5′UTR are more diverse in function, with none of them known to function as a splicing regulator. Most of these proteins are ATP-dependent helicases, including DDX43, DDX1, DDX51, DDX18, and DDX3X. The rest participate in other aspects of cellular function, including DNA repair (ALKB8), the signal recognition particle (SRP72), the synaptonemal complex (RED1), pseudouridylation of RNA (PUS7L), and E3 ubiquitin ligation (TRI32). This analysis supports our experimental observation that the splicing defects are much more pronounced under 3′UTR overexpression than under 5′UTR overexpression.

Data Availability
All data used to support the findings of this study are included within the article.

Disclosure
Anjali Singh and Kush Kumar Pandey contributed equally to this work and are co-first authors.

Conflicts of Interest
The authors declare that there are no conflicts of interest.