Appearance of L90I and N205S Mutations in Effector Domain of NS1 Gene of pdm (09) H1N1 Virus from India during 2009–2013

In the present study, full length sequencing of NS gene was done in 91 samples which were obtained from patients over the time period of five years from 2009 to 2013. The sequencing of NS gene was undertaken in order to determine the changes/mutations taking place in the NS gene of A H1N1 pdm (09) since its emergence in 2009. Analysis has shown that the majority of samples belong to New York (G1 type) strain with valine at position 123. Effector domain of NS1 protein displays the appearance of three mutations L90I, I123V, and N205S in almost all the samples from 2010 onwards. Phylogenetic analysis of available NS1 sequences from India has grouped all the sequences into four clusters with mean genetic distance ranging from 12% to 24% between the clusters. Variability in length of NS1 protein was seen in sequences from these clusters, 230-amino-acid-residue NS1 for all strains from year 2007 to 2008 and for 21 strains from year 2009 and 219-residue products for 37 strains from year 2009 and all strains from year 2010 to 2013. Mutations like K62R, K131Q, L147R, and A202P were observed for the first time in NS1 protein and their function remains to be determined.


Introduction
Influenza viruses are responsible for acute respiratory infection and are a source of seasonal epidemics and occasional pandemics. Influenza A viruses are classified into subtypes based on the combinations of hemagglutinin (HA) and neuraminidase (NA) that occur. So far 18 HA and 11 NA subtypes have been reported from hosts ranging across aquatic, avian, and human species [1,2]. Segment 8 of influenza A (H1N1) encodes two proteins, NS1 (nonstructural protein 1) and NEP (nuclear export protein), by alternative splicing. The mRNAs of both proteins share 56 nucleotides at the 5′ end, resulting in both proteins sharing 10 amino acids at the N terminus. NS1 protein is encoded by the collinear mRNA from segment 8 of the influenza virus genome and has a strain-specific length ranging from 230 to 237 amino acid residues. It is expressed exclusively in the infected cells [3]. NS1 can be divided into two functional domains: (i) the N-terminal RNA binding domain (residues 1-73) and (ii) the C-terminal effector domain (residues 74-230), which interacts with several host factors [3-6].
NS1 is a multifunctional protein involved in regulating immune responses. It functions as an interferon (IFN) antagonist, which allows efficient virus replication in IFN-competent hosts. NS1 targets both IFN-α/β production and the activation of IFN-induced antiviral genes [6]. The RNA binding domain (RBD) of NS1 binds to both ssRNA and dsRNA, thereby sequestering them and preventing their recognition by RIG1 (retinoic acid inducible gene), resulting in inhibition of IFN-α and IFN-β expression [7,8]. NS1 protein is also involved in inhibiting 3′-end processing of host mRNA by binding to CPSF30 (cleavage and polyadenylation specificity factor 30) and PABPN1 (poly(A) binding protein nuclear 1) [9]. Sequestering of dsRNA by the RBD of NS1 from 2′-5′ oligoadenylate synthetase (OAS) is essential for inhibition of the ribonuclease L (RNase L) pathway, which is involved in the degradation of viral RNA. NS1 binds directly to the regulatory subunit of protein kinase R (PKR) and therefore regulates the effectors of the IFN response and controls apoptosis, cell growth, cell proliferation, cytokine production, and signaling [10]. NS1 interacts with eIF4GI and PABP1 (poly(A) binding protein 1) and enhances viral protein synthesis in comparison to host cell protein synthesis. In this way, NS1 inhibits the innate immune response of the host by suppressing interferon release. It also inhibits adaptive immunity by restricting maturation of human dendritic cells and induction of the T-cell response [11].
NS2 (NEP) is involved in the export of viral RNP from the nucleus to the cytoplasm through its nuclear export signal and via interaction with Crm1 protein. NEP can be divided into a protease-sensitive N-terminal domain (amino acids 1-53) and a protease-resistant C-terminal domain (amino acids 54-121) [12]. Of the two domains, the N-terminal domain has been reported to contain a nuclear export signal between residues 12 and 21, which interacts with the nuclear export protein Crm1 and facilitates the export of viral RNPs [13].
In the present study, full length sequencing of pdm H1N1 (09) virus for NS gene was performed in samples collected from years 2009 to 2013 in order to determine the mutations taking place in the NS gene of pdm H1N1 (09) virus since its emergence in year 2009. Genetic and phylogenetic analyses of previously studied sequences reported from India and other countries were done, based on available literature in order to determine their phylogeny and sites under selection pressure (contributing towards the evolution of virus) and to study the possible effect of mutations on virulence and pathogenicity of influenza virus.

Materials and Methods
Samples (nasal and throat swabs in viral transport medium (VTM)) from years 2009 to 2013 (details given in Table 1) were collected from patients with symptoms of fever, cough, sore throat, nasal catarrh, or shortness of breath at hospitals of Delhi, together with outbreak samples from other states received for H1N1 testing at the National Centre for Disease Control (NCDC), New Delhi, India. The study was approved by the institutional ethical committee and all the samples were processed in a high containment facility (a biosafety level-3 laboratory) at NCDC, New Delhi. Viral RNA was extracted using the QIAamp Viral RNA Mini Kit (Qiagen, Germany) according to the manufacturer's protocol. Finally, RNA was eluted in 50 μL of elution buffer and stored at −80 °C until use. The initial detection of influenza viruses was done by the WHO/CDC RT-PCR protocol for detection of influenza A (H1N1) pdm (09) [14,15].
For sequencing, viral genes were amplified as described earlier [16,17]. Nucleotide (nt) sequencing was carried out on an Applied Biosystems 3130xl Genetic Analyzer (Applied Biosystems, Foster City, CA, USA), using gene specific primers. Nucleotide and protein sequence BLAST (Basic Local Alignment Search Tool) searches were performed against the GenBank database using the BLAST server of the National Center for Biotechnology Information (NCBI), National Institutes of Health, Bethesda, MD [18]. Sequences for phylogenetic analyses were retrieved and multiple sequence alignments were performed on the Influenza Virus Resource (IVR) at NCBI and the Influenza Research Database (IRD) at http://www.fludb.org/ [19,20]. Phylogenetic analysis was done by MEGA v6.0, using the maximum likelihood method and 500-replicate bootstrapping. Mean genetic distance within the cluster and between the clusters was determined by MEGA v6.0 [21].
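The within-cluster and between-cluster mean distances can be illustrated with a minimal Python sketch. It assumes the simplest measure, the p-distance (proportion of differing aligned sites); the distance model actually selected in MEGA is not stated in the text, and the function names and toy sequences below are hypothetical.

```python
from itertools import combinations, product


def p_distance(seq_a, seq_b):
    """Proportion of differing sites between two aligned sequences.

    Columns with a gap ('-') in either sequence are skipped (an assumption).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = differing = 0
    for x, y in zip(seq_a, seq_b):
        if x == "-" or y == "-":
            continue
        compared += 1
        if x != y:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


def mean_within(group):
    """Mean p-distance over all pairs of sequences inside one cluster."""
    pairs = list(combinations(group, 2))
    if not pairs:
        raise ValueError("need at least two sequences")
    return sum(p_distance(a, b) for a, b in pairs) / len(pairs)


def mean_between(group_a, group_b):
    """Mean p-distance over all pairs drawn from two different clusters."""
    pairs = list(product(group_a, group_b))
    if not pairs:
        raise ValueError("both groups must be non-empty")
    return sum(p_distance(a, b) for a, b in pairs) / len(pairs)


# Hypothetical toy alignment, not real NS1 data.
cluster_a = ["MDSNTVSSFQ", "MDSNTVSSFQ", "MDSNTMSSFQ"]
cluster_b = ["MDPNTVSSFQ", "MDPNTVSSFL"]
print("within A:", mean_within(cluster_a))
print("between A and B:", mean_between(cluster_a, cluster_b))
```

Multiplying the returned fractions by 100 gives percentages comparable in form to the cluster distances reported in this study, although the values would differ under another substitution model.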
Metadata-driven comparative analysis of study samples against all NS1 protein sequences available in the Influenza Research Database (IRD) at http://www.fludb.org/ up to 21 December 2013 was performed with the Meta-CATS tool [22] on IRD at a P value threshold of 0.05 (the P value threshold is the maximum probability that a position differs among the groups simply by chance) in order to identify sites that differ significantly between group 1 (database sequences) and group 2 (study samples).
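The idea of testing one alignment position for a nonrandom distribution of residues between two groups can be sketched as follows. This is an illustration only: it assumes a Pearson chi-square test of independence on the residue counts of a single column, and it is not the Meta-CATS implementation. The function names and the toy counts are hypothetical.

```python
import math
from collections import Counter


def column_chi_square(group1_residues, group2_residues):
    """Pearson chi-square statistic and degrees of freedom for a 2 x k table.

    Each argument holds the residues observed at one alignment position,
    one character per sequence in that group.
    """
    counts1, counts2 = Counter(group1_residues), Counter(group2_residues)
    n1, n2 = sum(counts1.values()), sum(counts2.values())
    if n1 == 0 or n2 == 0:
        raise ValueError("both groups must contain at least one sequence")
    total = n1 + n2
    residues = sorted(set(counts1) | set(counts2))
    chi2 = 0.0
    for residue in residues:
        column_total = counts1[residue] + counts2[residue]
        for observed, group_size in ((counts1[residue], n1), (counts2[residue], n2)):
            expected = group_size * column_total / total
            chi2 += (observed - expected) ** 2 / expected
    return chi2, len(residues) - 1


def p_value_one_df(chi2):
    """Upper-tail probability of a chi-square variate with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2.0))


# Hypothetical counts at one position: database group versus study group.
database_column = "I" * 90 + "V" * 10
study_column = "I" * 2 + "V" * 18
statistic, dof = column_chi_square(database_column, study_column)
if dof == 1:
    print("significant at 0.05:", p_value_one_df(statistic) < 0.05)
```

The closed-form P value is given only for one degree of freedom (two residue types at the position); positions with more residue types would need a general chi-square distribution function.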
Selection pressure acting on the codons of the NS (nonstructural) gene of H1N1 pdm virus was analyzed using the HyPhy open-source software package available on the Datamonkey web server (http://www.datamonkey.org/) [23]. Analysis was performed using reference sequences [n = 72 (NS)] including Indian H1N1 pdm virus. A separate analysis for NS1 and NEP genes was also carried out by including 44 Indian H1N1 pdm viruses. The ratio of nonsynonymous (dN) to synonymous (dS) substitutions per site (dN/dS or ω) was estimated using five different approaches: single likelihood ancestor counting (SLAC), fixed effects likelihood (FEL), random effects likelihood (REL), mixed effects model of evolution (MEME), and fast unbiased Bayesian approximation (FUBAR). The best nucleotide substitution model for each data set, as determined with the model selection tool on the Datamonkey server, was adopted in the analysis.
The amino acid changes observed are given in Tables 2 and 3. I123V mutation was seen in 88 samples, along with other common amino acid changes like E55Q, L90I, and N205S (30-50% samples). Changes like D53N, K62R, S73T, T94A, E96K, R108K, I111T, V129I, V129A, K131E, T143N, I145V, L147R, T151P, E172K, A202P, and N209D were rare and observed only in a few samples. Mutations I43N, D53N, T94A, R108K, and E172K were observed only in 1-4 samples from year 2009, while two samples had mutation N209D and three had E55Q mutation, similar to year 2010. E55Q mutation was also observed in a varied number of samples (given in Tables 2 and 3). Mutations E96K (2 samples) and I145L and T151P (1 sample each) were detected only in samples from 2010. E55Q mutation was the second most common amino acid change after I123V and was seen in 16 samples of year 2010, while three mutations K62R, S73T, and I145V were common among 2010 and 2011 samples (given in Tables 2 and 3).
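The substitution notation used here (for example I123V: isoleucine in the reference replaced by valine at position 123) and the per-mutation sample frequencies can be derived from aligned protein sequences as in the following sketch. It assumes the sample is aligned to the reference without gaps in the reference, so that alignment columns equal residue numbers; the function names and toy sequences are hypothetical.

```python
from collections import Counter


def call_substitutions(reference, sample):
    """List substitutions such as 'I123V' using 1-based reference positions.

    Columns with a gap ('-') or an unknown residue ('X') are ignored.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    calls = []
    for position, (ref_aa, sample_aa) in enumerate(zip(reference, sample), start=1):
        if ref_aa == sample_aa:
            continue
        if ref_aa in "-X" or sample_aa in "-X":
            continue
        calls.append(f"{ref_aa}{position}{sample_aa}")
    return calls


def substitution_frequencies(reference, samples):
    """Fraction of samples carrying each substitution relative to the reference."""
    if not samples:
        raise ValueError("need at least one sample")
    counts = Counter()
    for sample in samples:
        counts.update(call_substitutions(reference, sample))
    return {mutation: count / len(samples) for mutation, count in counts.items()}


# Hypothetical six-residue toy sequences, not real NS1 data.
toy_reference = "MDSNTL"
toy_samples = ["MDSNTL", "MDSNTI", "MESNTI"]
for mutation, fraction in sorted(substitution_frequencies(toy_reference, toy_samples).items()):
    print(mutation, f"{fraction:.0%}")
```

Grouping the samples by collection year before calling `substitution_frequencies` would give a year-wise tabulation of the kind summarized in Tables 2 and 3.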

Mutations Seen in the NS1 Gene of Influenza
Mutations V129I (6 samples) and L147R (1 sample) were observed only in samples from 2011. Other mutations observed in a few samples in 2011 were E55Q (6 samples) and S73T (3 samples), while mutations K62R and I145V were also seen in 2011 samples.
[Table column headings (year and number of samples): 2009 (18), 2010 (21), 2011 (17), 2012 (22), 2013 (13)]
RNA Binding Domain (Residues 1-73).
The RNA binding domain is involved in binding and sequestering dsRNA from its recognition by RIG1 and OAS, thereby inhibiting the IFN response against the virus. In samples from 2009 to 2013, synonymous mutations were seen at ten positions (14, 18, 27, 31, 36, 38, 44, 53, 68, and 71) and nonsynonymous mutations were noticed as I43N, D53N, E55Q, K62R, and S73T, as given in Table 2. Among these, E55Q was the most common change, found in 3 samples from 2009, most samples of 2010 (16), 6 samples from 2011, and 1 sample from 2013. Other changes were rare and only seen in two or three samples.
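The distinction between synonymous and nonsynonymous changes drawn above can be made explicit with a short sketch that compares two aligned coding sequences codon by codon under the standard genetic code. The function name and the toy sequences are hypothetical, and codons containing gaps or ambiguous bases are simply skipped.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def classify_codon_changes(reference_cds, sample_cds):
    """Return (synonymous codon positions, nonsynonymous substitutions).

    Positions are 1-based codon numbers; nonsynonymous changes are written
    in the usual notation, for example 'N4S'.
    """
    if len(reference_cds) != len(sample_cds) or len(reference_cds) % 3:
        raise ValueError("need equal-length, in-frame coding sequences")
    synonymous, nonsynonymous = [], []
    for start in range(0, len(reference_cds), 3):
        ref_codon = reference_cds[start:start + 3].upper()
        sample_codon = sample_cds[start:start + 3].upper()
        if ref_codon == sample_codon:
            continue
        if ref_codon not in CODON_TABLE or sample_codon not in CODON_TABLE:
            continue  # gap or ambiguous base: not classified
        position = start // 3 + 1
        ref_aa, sample_aa = CODON_TABLE[ref_codon], CODON_TABLE[sample_codon]
        if ref_aa == sample_aa:
            synonymous.append(position)
        else:
            nonsynonymous.append(f"{ref_aa}{position}{sample_aa}")
    return synonymous, nonsynonymous


# Hypothetical four-codon example: GAT>GAC keeps D, AAC>AGC changes N to S.
print(classify_codon_changes("ATGGATTCTAAC", "ATGGACTCTAGC"))
```

This pairwise classification only labels observed differences; the dN/dS estimates from SLAC, FEL, REL, MEME, and FUBAR rest on codon substitution models and a phylogeny, which this sketch does not attempt to reproduce.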
Metadata-driven comparative analysis (Meta-CATS) of NS1 protein sequences was performed between all database sequences and study sample sequences to identify amino acid positions that differ significantly between two or more groups of virus sequences. A total of 79 sites were identified by Meta-CATS as having a significant nonrandom distribution between the specified groups (database sequences and study sequences). Of these 79 sites, 18 coincided with sites showing amino acid changes in study samples, and most of the changes seen in the samples were common to sequences in the database. However, mutations like E96K and V129A were rare and seen only in a limited number of samples in the database, while changes like K62R and K131Q were unique and seen only in one or two study samples.

Selection Pressure Analysis.
Selection pressure analysis of NS gene of influenza A H1N1 pdm virus strain revealed 8 positively selected sites. Integrated analysis was performed for differential selection pressure acting on NS1 (219 codons) and NEP (121 codons) proteins (shown in Table 4). Out of seven NS1 sites, one was located in RBD and six in ED. Analysis of NEP protein gene revealed a single position, 49, to be under positive selection. A specific selection pressure analysis for Indian isolates (n = 44 for NS1 and n = 21 for NEP gene) revealed 3 sites in NS1 and 1 site in NEP gene under positive selection.
NS1 protein encoded by clusters KOL 507, KOL 596, and KOL 989 was 230 amino acid residues in length, whereas NS1 protein encoded by the NIV 6196 cluster was 219 amino acid residues in length. Due to this difference in length, 12 sites were seen only in clusters encoding the 230-residue NS1 protein. The C-terminal amino acid sequence of avian influenza A (H5N1) virus NS1 protein is reported to be associated with virulence and pathogenicity [30]. NS1 proteins encoded by the clusters have different C-terminal amino acid sequences: KOL 507 and KOL 596 have RSEV, KOL 989 has RSKV, and NIV 6196 has PEQK.
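Protein length and the four-residue C-terminal sequence, the two properties compared between clusters above, can be read from unaligned protein sequences as in this minimal sketch. The function names, strain labels, and short toy sequences are hypothetical; only the motif strings RSEV and PEQK are taken from the text.

```python
from collections import defaultdict


def c_terminal_motif(protein, length=4):
    """Last `length` residues, ignoring a trailing stop symbol or gap padding."""
    sequence = protein.rstrip("*-")
    if len(sequence) < length:
        raise ValueError("sequence shorter than requested motif length")
    return sequence[-length:]


def group_by_length_and_motif(named_sequences):
    """Map (protein length, C-terminal motif) to the names sharing that pair."""
    groups = defaultdict(list)
    for name, protein in named_sequences.items():
        sequence = protein.rstrip("*-")
        groups[(len(sequence), c_terminal_motif(sequence))].append(name)
    return dict(groups)


# Hypothetical, heavily shortened toy sequences (not real NS1 proteins).
toy = {
    "strain_a": "MDSNTVRSEV*",
    "strain_b": "MDPNTVRSEV*",
    "strain_c": "MDSNPEQK*",
}
for (length, motif), names in group_by_length_and_motif(toy).items():
    print(length, motif, names)
```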

Analysis of Available
Multiple sequence alignment of 120 sequences from strains of all clusters from India (2007 to 2013) showed differences in amino acid sequence at 100 sites between the clusters when compared with the KOL 507 cluster as reference; some of these sites were cluster specific (shown in Table 5), while others were common between clusters (shown in Table 6).
KOL 507 and KOL 596 clusters have no year-specific distribution of mutations or signature sequence within the cluster. KOL 989 cluster has one such pattern in sequences from year 2009, which have arginine at position 135 and glycine at position 139 in place of serine and aspartic acid. The NIV 6196 cluster has isoleucine at position 90 and serine at position 205 in place of leucine and asparagine in the majority of the samples from 2011 to 2013.
Mean distance in NS1 protein sequence between clusters with reference to KOL 507 cluster was approximately 12% for KOL 596 cluster, 17% for KOL 989 cluster, and 25% for NIV 6196 cluster. All clusters have a maximum sequence dissimilarity of 1% between the sequences within the cluster except NIV 6196, which has a maximum dissimilarity of 2% within the cluster. It has been observed that all study samples (2009-2013) belonged to the NIV 6196 cluster and no circulation of strains similar to KOL 507, KOL 596, and KOL 989 has been seen in the last four years. Mutations in the two functional domains of NS1 protein were observed between the clusters, which may affect their function. The RNA binding domain of NS1 protein has mutations at positions 41, 44, and 67, which are involved in binding dsRNA. Mutations at positions 41 and 44 were noticed in KOL 596 and KOL 989 clusters, whereas the change at position 67 was seen in clusters KOL 989 and NIV 6196. The effector domain of NS1 protein has mutations at 12 positions (91, 95, 98, 101, 117, 119, 123, 125, 135, 143, 144, and 145), which may affect its interaction with host proteins. Mutations at positions 95, 143, and 145 were seen in all the clusters. Some changes were cluster specific (98, 135, and 144 in KOL 989 cluster; 91, 119, and 123 in NIV 6196 cluster), while others were common between two clusters (101 and 117 in KOL 596 and KOL 989 clusters; 125 in KOL 989 and NIV 6196 clusters).
Table 5: Cluster-specific amino acid changes, which were seen in all the sequences of a cluster and were absent from sequences of all other clusters.

Discussion
NS1 protein is responsible for regulation of antiviral immune response in the host cells and a number of NS1 molecular markers are reported to be associated with increased virulence and pathogenicity like R38, F103, and M106 [7,24]. NS1 protein is functionally divided into two domains: RNA binding domain (RBD) and effector domain (ED). RBD is mainly involved in sequestering of dsRNA from OAS and RIG1. In the present study, analysis of sequencing data from NS1 gene showed relatively conserved RBD in comparison to ED (shown in Table 2). Only five amino acid changes were seen in RBD, out of which E55Q was the most common change in comparison to other mutations which were rare and occurred in two or three samples only. None of the changes occurred in positions reported to be involved in RNA binding [7].
Effector domain is involved in interactions with the host factors, associated with cell signaling and immune response.
I123V mutation was seen in the ED of almost all study samples (shown in Tables 2 and 3), which places these samples among the New York (G1 type) strains [25]. Apart from this mutation, L90I and N205S mutations were found to occur over three years in a large number of samples. Other mutations which were detected in 20% or more samples were R108K (year 2009), I145V (year 2010), V129I (year 2011), and K131E (year 2013). The rest of the mutations were seen to occur only in one or two samples in all years.
Glutamate at position 96 is functionally important for binding of NS1 to CPSF30 and necessary for interaction with TRIM25, a ubiquitin ligase which mediates the ubiquitination of RIG1 (a viral RNA sensor) in order to facilitate IFN production. It has been reported that E96A mutants were ineffective in blocking the TRIM25 mediated IFN response [8,26]. E96K substitution was noted in 2 samples from 2010, while the remaining 89 samples had E96, which indicates that the majority of viruses in circulation carry E96 and are competent to inhibit the TRIM25 mediated immune response and replicate efficiently in the host cell.
It has been reported that interaction of NS1 residues 123-127 with PKR prevents PKR activation, and thereby eIF2α phosphorylation and the resulting inhibition of viral protein synthesis, indicating that NS1-PKR binding is necessary and sufficient to block PKR activation in influenza A virus-infected cells [27]. In the present study eighty-eight samples were seen to have the I123V mutation in this region. I123V mutation may therefore affect the inactivation of PKR by NS1 protein.
It has also been reported that NS1 protein with R108, E125, and G189 is unable to block host gene expression, resulting in inefficient replication of the virus. This inhibitory effect could be restored by replacing these residues with the residues corresponding to the human H1N1 virus consensus sequence [28]. A mutation at one of these positions, R108K, was seen in 4 samples from year 2009. It has been reported that the influenza A (H5N1) NS1 protein interacts with eukaryotic translation initiation factor 4GI (eIF4GI) via its eIF4GI binding domain (residues 81-113), resulting in the preferential translation of viral mRNA in comparison to host mRNA [29]. Therefore, mutation in this domain may result in impaired ability of the virus to inhibit interferon production, which may result in inefficient virus replication. L90I and T94A mutations may, therefore, affect interferon response and virus replication. Similarly, it has been reported that human (H5N1) virus with asparagine (N) at position 205 of NS1 protein shows enhanced type I IFN antagonism, leading to high virulence in ferrets [30]. In the present study, we have seen the N205S mutation in all samples from 2011 onwards.
In this study the nuclear export signal of NEP displayed an M14I mutation in 2 samples from 2009. The C-terminal domain of NEP, which has been reported to interact with the nuclear localization signal of the viral matrix protein M1 [31], showed two mutations: T48A in almost all samples from 2011-2013 and S60N in 50% of samples from 2013. These mutations may affect nuclear transport and release of virus from the cell.
Selection pressure analysis of NS gene of influenza A H1N1 pdm virus strain revealed 8 positively selected sites (shown in Table 4). Positions 108, 123, 145, 147, and 205 were noted to be situated in NS1 protein host factor interaction domains. Analysis of NEP protein gene revealed a single position, 49, to be under positive selection. A specific selection pressure analysis for Indian isolates revealed 3 sites in NS1 and 1 site in NEP gene to be under positive selection. Positions 55, 129, and 145 in NS1 gene were found to be common between India specific isolates and reference strain isolates. This showed that positive selection on NS1 gene was stronger than that on NEP, and that a large number of the selected NS1 sites were located in host factor interaction domains, which are reported to be associated with virulence and pathogenicity of influenza virus [3, 26-29].
Phylogenetic analysis of study samples on the basis of NS1 gene of influenza A (H1N1) virus broadly grouped all sequences into two major branches (shown in Figure 2). However, 2011 samples in group two formed two separate branches: one at the base of group two containing two samples and the other with four samples, which showed homology to A/Boston/DOA2-099/2012 strain. Samples from year 2012 showed homology to A/India/Nsk12388/2012 strain, while 2013 samples had homology with A/Helsinki/405/2013 and A/New Jersey/NHRC403730/2013 strains. Mutations I123V and N205S in NS1 protein observed in the present study have also been observed in a large number of sequences from Europe, America, Africa, and Asia. In contrast, L90I (NS1 protein) was seen in only a limited number of samples from Europe, America, and Africa, and T48A (NEP protein) was seen in only a few samples from Europe, Asia, Africa, and America. An earlier study on H1N1 pdm (09) sequences from India involving 13 samples has also reported the mutations reported in this study [32]. However, in comparison to that study, the present study has used a larger number of samples and found additional mutations in NS1 gene.
Phylogenetic analyses of 120 full length NS1 sequences from India during the time period 2007-2013 (retrieved from the Influenza Research Database (IRD) at http://www.fludb.org/) grouped the sequences into four clusters, as shown earlier in Figure 1. NS1 protein encoded by KOL 507, KOL 596, and KOL 989 clusters was seen to be 230 amino acid residues in length, whereas NIV 6196 like strains were seen to encode a 219-residue NS1 protein. Our investigation reveals that influenza A (H1N1) is evolving and acquiring mutations, as can be noted from the mean distance in NS1 protein sequence between the clusters: approximately 12% for KOL 596 cluster, 17% for KOL 989 cluster, and 25% for NIV 6196 cluster. NS1 protein of none of the clusters was seen to have ESEV, EPEV, or KSEV as its terminal amino acid sequence; these sequences are reported to be associated with increased virulence in influenza A (H5N1) virus. All study samples belonged to NIV 6196 cluster and had a loss of 11 amino acids at the C-terminal end of NS1 protein. Analysis of NS1 protein shows that the four clusters were derived from three major reassortment events, with KOL 989 cluster derived from seasonal H3N2 virus, KOL 507 and KOL 596 clusters from prepandemic seasonal H1N1, and NIV 6196 cluster from H1N1 pdm (09) lineage. This has resulted in a large mean distance between clusters and loss of terminal amino acid residues. High values for mean distance and loss of residues between NIV 6196 cluster and KOL 507 could be explained by the introduction of the pandemic strain into the human population in year 2009. Circulation of three different clusters in a period of four years from 2007 to 2009, with high mean distance between them, shows that influenza A virus has evolved rapidly (by antigenic shift and drift) in the past and could do so in the future, which highlights the need for continuing surveillance and monitoring of influenza virus infection across the nation and worldwide.

Conclusions
Sequence analysis shows that NS1 protein is mutating more rapidly than NEP and that within NS1 protein RBD is more conserved than ED. The prominent change seen in RBD was E55Q.
Figure caption: Bootstrap support values (based on 500 replications) above 50% are shown at the branch nodes. Each branch is denoted by accession number and strain name.