In Silico Biology of H1N1: Molecular Modelling of Novel Receptors and Docking Studies of Inhibitors to Reveal New Insight in Flu Treatment

Influenza is an infectious disease caused by RNA viruses of the family Orthomyxoviridae. The new influenza H1N1 viral strain emerged through the combination of genes from human, swine, and avian H1N1 viruses. The influenza virus is roughly spherical and is enveloped by a lipid membrane. This membrane carries two glycoproteins: hemagglutinin (HA), which attaches the virus to the host cell surface, and neuraminidase (NA), which releases newly formed virions from the infected cell. We have developed homology models of both the hemagglutinin and neuraminidase receptors from H1N1 strains in eastern India. Docking of B-Sialic acid and O-Sialic acid into the optimized and energy-minimized homology models shows important H-bonding interactions with ALA142, ASP230, GLN231, GLU232, and THR141. This information can be used for structure-based and pharmacophore-based drug design. We have also calculated ADME properties (Human Oral Absorption (HOA) and % HOA) for Oseltamivir, whose absorption has long been a subject of debate.


Introduction
The H1N1 virus particle is about 80-120 nanometers in diameter and roughly spherical [1,2]. It is made up of a viral envelope containing two main types of glycoproteins, hemagglutinin (HA) and neuraminidase (NA), wrapped around a central core [3,4] that contains the single-stranded viral RNA. The eight single (nonpaired) RNA strands encode eleven proteins: HA, NA, NP, M1, M2, NS1, NEP, PA, PB1, PB1-F2, and PB2. The HA segment encodes hemagglutinin and the NA segment encodes neuraminidase. The influenza A virus can be further classified into subtypes by the serological reactivity of its surface glycoprotein antigens. H1N1 is a serotype of influenza A virus that commonly causes swine flu in humans [5][6][7][8]. During infection, the influenza virus attaches to the cell receptor (sialic acid) through HA. HA also plays an important role in the release of the viral RNA into the cell by causing fusion of the viral and cellular membranes [9]. Once the (-) strand influenza viral RNAs enter the nucleus, they serve as templates for the synthesis of mRNAs by RNA-dependent RNA polymerase [10]. The new (-) strand viral RNAs produced in the cell nucleus are exported to the cytoplasm and are joined with the viral proteins PA, PB1, PB2, and NP. The M1 protein binds to the membrane into which HA, NA, and M2 have been inserted. The assembly and precise packaging of both viral RNAs and viral proteins form the new virions [11]. These virions are produced by budding and remain attached to sialic acid receptors on the cell surface until the viral neuraminidase removes the sialic acids and releases them.
It follows that if the viral neuraminidase is rendered ineffective by any means, the virions will fail to be released from the infected host cell and will not go on to infect fresh hosts. Several neuraminidase inhibitors, such as Oseltamivir and Zanamivir, have been designed to work by blocking the function of the viral neuraminidase protein.
Neuraminidase inhibitor treatment limits the severity and spread of viral infections. Likewise, if the cell receptor (sialic acid) that helps in viral attachment to the host cells is altered, the virions will fail to infect. There are a number of chemically different forms of sialic acid that can be used in this regard [9,12]. In this in silico study, we present molecular modelling of novel neuraminidase and hemagglutinin receptors and the interaction of sialic acid types with the hemagglutinin receptor, to reveal insights for structure-based and pharmacophore-based drug design of novel therapeutics.

Materials.
The protein sequences of neuraminidase (NA) and hemagglutinin (HA) were taken from the flu.gov database of NCBI [13]. The NCBI influenza virus sequence database contains nucleotide sequences as well as protein sequences and their encoding regions derived from the nucleotide sequences. The neuraminidase protein sequences were retrieved with the keywords (Type: A, Host: Human, Country/Region: India, Protein: NA, Subtype: H1 and N1, Sequence type: Protein). This returned a list of 350 NA protein sequences from different regions of India, from which the sequences from eastern India were considered for research. A similar search for hemagglutinin (HA) returned a list of 95 protein sequences from different regions of India, of which the sequences from eastern India were considered. About 15 sequences of neuraminidase and 95 protein sequences of hemagglutinin were downloaded in FASTA format for analysis. Prior to model development, the sequences were analyzed in BioEdit [14]. All sequences were found to be of approximately the same length. The longest NA sequence, with accession ID ADD85917, was selected for 3D model development; it contains 453 amino acid residues and has a molecular weight of 49654.19 Daltons. Similarly, the longest HA sequence, with accession ID ADD85911, was selected for model development; it contains 335 amino acid residues and has a molecular weight of 37140.01 Daltons.
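The selection step described above (take the longest sequence, then report its length and molecular weight) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the molecular weight is estimated from standard average residue masses plus one water, and the result may differ slightly from the value reported by BioEdit.

```python
from typing import Dict

# Average residue masses in Daltons (free amino acid minus one water).
RESIDUE_MASS = {
    "A": 71.0788, "R": 156.1875, "N": 114.1038, "D": 115.0886, "C": 103.1388,
    "E": 129.1155, "Q": 128.1307, "G": 57.0519, "H": 137.1411, "I": 113.1594,
    "L": 113.1594, "K": 128.1741, "M": 131.1926, "F": 147.1766, "P": 97.1167,
    "S": 87.0782, "T": 101.1051, "W": 186.2132, "Y": 163.1760, "V": 99.1326,
}
WATER_MASS = 18.01528  # added once per chain for the terminal H and OH


def parse_fasta(text: str) -> Dict[str, str]:
    """Return {record id: sequence} from FASTA-formatted text."""
    records: Dict[str, str] = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first token of the header as the record id.
            parts = line[1:].split()
            current = parts[0] if parts else f"record{len(records) + 1}"
            records[current] = ""
        elif current is not None:
            records[current] += line.upper()
    return records


def longest_record(records: Dict[str, str]) -> str:
    """Return the id of the longest sequence (first one wins on ties)."""
    return max(records, key=lambda rid: len(records[rid]))


def molecular_weight(sequence: str) -> float:
    """Estimate the average molecular weight of a peptide chain in Daltons."""
    return sum(RESIDUE_MASS[aa] for aa in sequence) + WATER_MASS


if __name__ == "__main__":
    # Toy input with made-up ids and sequences, not the study data.
    toy = ">seqA example\nMKTAY\n>seqB example\nMKT\n"
    recs = parse_fasta(toy)
    best = longest_record(recs)
    print(best, len(recs[best]), round(molecular_weight(recs[best]), 2))
```

Applied to the downloaded FASTA files, `longest_record` would be expected to pick out the same accessions that were chosen by inspection, provided no sequence contains ambiguous residue codes (which `molecular_weight` does not handle).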

Molecular Modelling.
Molecular modelling of the novel HA and NA receptor proteins was performed using a modelling server (I-TASSER [15]) and standalone software (Modeller 9.9 [16][17][18][19]). For modelling in Modeller 9.9, suitable templates were searched with BLAST-P [20] against the PDB [21]. Six templates (for NA) and ten templates (for HA) were selected on the basis of query coverage and E-value. Alignment files of the target protein sequence with the selected templates were prepared, and ten models were generated for each protein (see the supplementary material for more information).
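Template selection by query coverage and E-value can be illustrated with a short Python sketch. The cut-off values, the `BlastHit` record, and the function name are assumptions made for illustration; the text does not state the thresholds that were actually applied.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class BlastHit:
    template_id: str       # identifier of the candidate template structure
    query_coverage: float  # percent of the target covered by the alignment
    e_value: float         # BLAST expectation value (smaller is better)


def select_templates(
    hits: List[BlastHit],
    n_templates: int,
    min_coverage: float = 80.0,   # assumed cut-off, not from the paper
    max_e_value: float = 1e-5,    # assumed cut-off, not from the paper
) -> List[BlastHit]:
    """Keep hits that pass both cut-offs, best coverage and lowest E-value first."""
    kept = [
        h for h in hits
        if h.query_coverage >= min_coverage and h.e_value <= max_e_value
    ]
    kept.sort(key=lambda h: (-h.query_coverage, h.e_value))
    return kept[:n_templates]


if __name__ == "__main__":
    # Hypothetical hits with placeholder ids, not real BLAST-P output.
    hits = [
        BlastHit("tmpl1", 99.0, 1e-150),
        BlastHit("tmpl2", 85.0, 1e-120),
        BlastHit("tmpl3", 40.0, 1e-30),   # rejected: low coverage
        BlastHit("tmpl4", 99.0, 1e-160),
    ]
    for hit in select_templates(hits, n_templates=2):
        print(hit.template_id, hit.query_coverage, hit.e_value)
```

Sorting on coverage first and E-value second is one reasonable reading of "on the basis of query coverage and E-value"; other weightings are equally possible.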

Molecular Docking.
Oseltamivir and the sialic acids (B-Sialic acid and O-Sialic acid) were obtained from PubChem [22] in SDF format and converted to PDB format for docking. Schrodinger [23,24] software was used for flexible docking. The receptors were prepared by assigning bond orders, adding hydrogens, setting proper ionization states of residues, capping the termini, and so forth. The receptors were then refined with H-bond assignment (water orientations, at neutral pH), and the energy was minimized with the OPLS 2005 force field. A receptor grid was generated with the site placed around the centroid of selected residues. The ligands were prepared in LigPrep with the following parameters: force field OPLS 2005; ionization at target pH 7.0 ± 2.0; generation of tautomers and stereoisomers (all combinations of specified chiralities, with chiralities determined from the 3D structure); and at most 32 ligands generated. Finally, the ligands were docked by the XP (Extra Precision) method with ligand flexibility (nitrogen inversions and ring conformations). The results were exported with XP descriptor information, and Epik state penalties were added to the docking score.
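The grid centre used in docking is the centroid of the selected residues. A minimal sketch of that calculation from a PDB file is given below. It assumes standard fixed-column PDB records and takes the centroid as the unweighted mean over all atoms of the chosen residues; the function name is hypothetical, and the docking software may define the centroid differently.

```python
from typing import Optional, Set, Tuple


def residue_centroid(
    pdb_text: str,
    residues: Set[Tuple[str, int]],
    chain: Optional[str] = None,
) -> Tuple[float, float, float]:
    """Mean x, y, z over all atoms of the given (residue name, number) pairs."""
    sx = sy = sz = 0.0
    n_atoms = 0
    for line in pdb_text.splitlines():
        if not line.startswith(("ATOM  ", "HETATM")) or len(line) < 54:
            continue
        res_name = line[17:20].strip()   # PDB columns 18-20
        chain_id = line[21]              # PDB column 22
        res_seq = int(line[22:26])       # PDB columns 23-26
        if (res_name, res_seq) not in residues:
            continue
        if chain is not None and chain_id != chain:
            continue
        sx += float(line[30:38])         # PDB columns 31-38
        sy += float(line[38:46])         # PDB columns 39-46
        sz += float(line[46:54])         # PDB columns 47-54
        n_atoms += 1
    if n_atoms == 0:
        raise ValueError("none of the selected residues were found")
    return (sx / n_atoms, sy / n_atoms, sz / n_atoms)
```

For example, calling `residue_centroid(pdb_text, {("ARG", 109), ("GLU", 219)})` on the text of a model file returns a single point that could serve as a grid centre; which residues were actually selected for the grid is not stated in the text, so this pair is purely illustrative.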

The 3D Model of NA Receptor.
I-TASSER generated five 3D models, each with a C-score reflecting the accuracy of the prediction. The best model was chosen on the basis of the highest C-score. One model of NA was selected from those generated in Modeller 9.9 on the basis of its scores (molpdf: 15581.55371, DOPE score: −44868.22266, and GA341 score: 1.00000).

The 3D Model of HA Receptor.
The best model was selected from among those generated by I-TASSER, with C-score = 1.47 and estimated accuracy (TM-score) = 0.92 ± 0.06. Similarly, a model of HA was selected from among those generated by Modeller 9.9 on the basis of its scores (molpdf: 15373.00293, DOPE score: −33933.94141, and GA341 score: 1.00000).
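One common way to choose among Modeller candidates is to discard models with a low GA341 score and then take the one with the lowest (most negative) DOPE score. The sketch below illustrates that rule; it is an assumption about how such a selection could be made, not a statement of the exact criterion used here. The 0.7 GA341 cut-off is a conventional choice, and every score in the example other than the reported NA values is made up.

```python
from typing import List, NamedTuple


class ModelScore(NamedTuple):
    name: str
    molpdf: float  # Modeller objective function value
    dope: float    # DOPE statistical potential (lower is better)
    ga341: float   # GA341 reliability score, between 0 and 1


def pick_best_model(models: List[ModelScore], ga341_cutoff: float = 0.7) -> ModelScore:
    """Return the lowest-DOPE model among those passing the GA341 cut-off."""
    reliable = [m for m in models if m.ga341 >= ga341_cutoff]
    if not reliable:
        raise ValueError("no model passes the GA341 cut-off")
    return min(reliable, key=lambda m: m.dope)


if __name__ == "__main__":
    candidates = [
        # Scores reported in the text for the selected NA model.
        ModelScore("NA_selected", 15581.55371, -44868.22266, 1.0),
        # Hypothetical competitors, invented for illustration only.
        ModelScore("NA_other_1", 15900.0, -44100.0, 1.0),
        ModelScore("NA_other_2", 16250.0, -45500.0, 0.4),
    ]
    print(pick_best_model(candidates).name)
```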

Evaluation of the Models.
PROCHECK checks the stereochemical quality of a protein structure, producing a number of PostScript plots analyzing its overall geometry. All the models were evaluated in the PROCHECK NT standalone version. It produced 10 PostScript files featuring Ramachandran plots, Chi1-Chi2 plots, main chain parameters, side chain parameters, residue properties, main chain bond lengths, main chain bond angles, RMS distances from planarity, and distorted geometry. The selected models were found to be satisfactory for the calculated stereochemical parameters.
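A Ramachandran plot is built from the backbone torsion angles phi and psi of each residue, and each of these is a dihedral angle defined by four consecutive backbone atoms. The sketch below shows the underlying geometry only; the function names are hypothetical, and PROCHECK's classification of angle pairs into favoured, allowed, and disallowed regions is not reproduced.

```python
import math
from typing import Sequence, Tuple

Vec = Tuple[float, float, float]


def _sub(a: Sequence[float], b: Sequence[float]) -> Vec:
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a: Sequence[float], b: Sequence[float]) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a: Sequence[float], b: Sequence[float]) -> Vec:
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def _scale(a: Sequence[float], s: float) -> Vec:
    return (a[0] * s, a[1] * s, a[2] * s)


def dihedral_degrees(p0, p1, p2, p3) -> float:
    """Signed dihedral angle (degrees) about the p1-p2 bond."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    # Unit vector along the central bond.
    b1 = _scale(b1, 1.0 / math.sqrt(_dot(b1, b1)))
    # Components of the outer bonds perpendicular to the central bond.
    v = _sub(b0, _scale(b1, _dot(b0, b1)))
    w = _sub(b2, _scale(b1, _dot(b2, b1)))
    x = _dot(v, w)
    y = _dot(_cross(b1, v), w)
    return math.degrees(math.atan2(y, x))
```

For residue i, phi would be `dihedral_degrees(C[i-1], N[i], CA[i], C[i])` and psi would be `dihedral_degrees(N[i], CA[i], C[i], N[i+1])`, with each argument an (x, y, z) coordinate taken from the model.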

Flexible Docking with Schrodinger.
Docking results of Oseltamivir, beta-sialic acid, and O-sialic acid were imported into XP Visualizer and the corresponding docking poses were viewed in the workspace (Figures 3 and 4). The overall dock scores (GScore) for the three ligands were −5.18, −9.74, and −8.65, respectively (Table 1). The GScore is the total Glide score, based on ChemScore, that is used for ranking ligand poses found in docking. The H-bond interaction among the ligand

Discussions
For the NA receptor, the structures modelled by Modeller 9.9 and I-TASSER had 91.20% and 83.50% of amino acids in the favourable region, respectively, and 0.30% and 0.80% in the disallowed region, respectively. Similarly, for the HA receptor, the structures modelled by Modeller 9.9 and I-TASSER had 90.10% and 77.80% of amino acids in the favourable region, respectively, and 0.00% and 0.70% in the disallowed region, respectively (Table 2). All these models were evaluated with the PROCHECK NT [25] standalone version, which indicates that the models built by Modeller 9.9 show the highest percentage of amino acids in the allowed region. Further, these structures were validated with the PROSA 2003 standalone version, which produced Z-scores in the range −5 to −7. Hence, these models were considered the best models and are represented in Figures 1 and 2.
The docking of Oseltamivir on neuraminidase shows H-bond interactions with residues ARG109, ARG143, and GLU219. The docking of beta-sialic acid on hemagglutinin shows hydrogen-bond interactions with residues ALA142, ASP230, GLN231, GLU232, and THR141. The docking of O-sialic acid on hemagglutinin likewise shows hydrogen-bond interactions with residues ALA142, ASP230, GLN231, GLU232, and THR141. This residue information (Table 3)
that may cause false positives in high-throughput screening (HTS) assays. The range of values that cause a molecule to be flagged as dissimilar to other known drugs can be modified in the QPlimits. The 16 conformers of Oseltamivir obtained from LigPrep were imported into QikProp and their ADME properties were generated using fast mode. About 51 descriptors and properties were reported, of which a few important ones are presented in Table 4. According to the Hazardous Substances Data Bank [26], the percent human oral absorption is 75%.
It has previously been debated whether Oseltamivir (Tamiflu) has a low rate of absorption, but our predicted qualitative human oral absorption for all 16 conformers was 3 (the highest) and the percent human oral absorption was >67.739% (average). The predicted values of aqueous solubility, brain/blood partition coefficient, and skin permeability for all 16 conformers were also good.

Conclusion
The docking studies of Oseltamivir (Tamiflu) on the optimized and energy-minimized model of NA showed some important H-bond interactions with functionally important residues. Similarly, the docking studies of B-Sialic acid and O-Sialic acid on the optimized and energy-minimized model of HA showed some important H-bond interactions. This information can be used in structure-based and pharmacophore-based drug design for the development of novel therapeutic agents for the prevention and treatment of influenza.