Epitope - based peptide vaccine against glycoprotein G of Nipah henipavirus using immunoinformatics approaches

Background Nipah virus (NiV) is a member of the genus Henipavirus of the family Paramyxoviridae, characterized by high pathogenicity and endemic in South Asia, first emerged in Malaysia in 1998. The case-fatality varies from 40% to 70% depending on the severity of the disease and on the availability of adequate healthcare facilities. At present no antiviral drugs are available for NiV disease and the treatment is just supportive. Clinical presentation ranges from asymptomatic infection to fatal encephalitis. Bats are the main reservoir for this virus, which can cause disease in humans and animals. The last investigated NiV outbreak has occurred in May 2018 in Kerala. Objective This study aims to predict effective epitope-based vaccine against glycoprotein G of Nipah henipavirus using immunoinformatics approaches. Methods and Materials Glycoprotein G of Nipah henipavirus sequence was retrieved from NCBI. Different prediction tools were used to analyze the nominee’s epitopes in BepiPred-2.0: Sequential B-Cell Epitope Predictor for B-cell, T-cell MHC class II & I. Then the proposed peptides were docked using Autodock 4.0 software program. Results and Conclusions Peptide TVYHCSAVY shows a very strong binding affinity to MHC I alleles while FLIDRINWI shows a very strong binding affinity to MHC II and MHC I alleles. This indicates a strong potential to formulate a new vaccine, especially with the peptide FLIDRINWI that is likely to be the first proposed epitope-based vaccine against glycoprotein G of Nipah henipavirus. This study recommends an in-vivo assessment for the most promising peptides especially FLIDRINWI.


INTRODUCTION:
Nipah virus (NiV) is an RNA virus that belongs to the Genus Henipavirus within the family Paramyxoviridae and has first emerged in Malaysia in 1998, gaining its name from a village called Sungai Nipah where it was isolated from the cerebrospinal fluid (CSF) of one of the patients [1][2][3][4] . NiV is transmitted by zoonotic (from bats to humans, or from bats to pigs, and then to humans) as well as human-to-human routes. Its clinical presentation varies from asymptomatic (subclinical) infection to acute respiratory illness and fatal encephalitis with most of the patients have been in direct contact with infected pigs, it has also been found that the virus causes central nervous system illnesses in pigs and respiratory illnesses in horses resulting in a significant economic loss for farmer [1,[5][6][7][8][9] . Large fruit bats of the genus Pteropus seem to act as a natural reservoir of NiV based on the isolation of Hendra virus which showed the presence of neutralizing antibodies to the Hendra virus on the bats [10,11] . Although there are no more cases of NiV in Malaysia, the outbreaks have been frequently occurring in India, Bangladesh, Thailand, and Cambodia [12] . The case fatality rate ranges from 50 to 100%, making them one of the deadliest viruses known to infect human [3,13,14] .
Laboratory diagnosis of Nipah virus infection is made using reverse transcriptase polymerase chain reaction (RT-PCR) from throat swabs, cerebrospinal fluid, urine, and blood analysis during acute and convalescent stages of the disease. IgG and IgM antibody detection can be done after recovery to confirm Nipah virus infection. Immunohistochemistry on tissues collected during autopsy also confirms the disease [15,16] . Currently, there is no effective treatment for the Nipah Virus infection however a few precautions include practicing standard infection control, barrier nursing to avoid the spread of infection from person to person as well as the isolation of those suspected to have the infection [7,8,17] . Recent computational approaches have provided further information about viruses, including the study conducted by Badawi M, et al. on ZIKA virus, where the envelope glycoprotein was obtained using protein databases. The most immunogenic epitope for the T and B cells involved in cell-mediated immunity were previously analyzed [18] .
The main focus of the analysis was the MHC class-I potential peptides using in silico analysis techniques [19,20] . In this study, the same techniques were applied to keep MHC class I and II along with the world population coverage as our main focus. Furthermore, we aim to design an Epitope-Based Peptide Vaccine against Nipah virus using peptides of its glycoprotein G as an immunogenic part to stimulate a protective immune response [3] .
Nipah virus invades host cells by the fusion of the host cell membranes at physiological pH without requiring viral endocytosis. Cell-cell fusion is a pathological lineament of Nipah virus infections, resulting in cell-to-cell spread, inflammation, and destruction of endothelial cells and neurons [21] . Both Nipah virus entry and Cell-cell fusion require the concerted efforts of the attachment of glycoprotein G and fusion (F) glycoprotein. Upon receptor binding, Nipah virus glycoprotein G triggers a conformational cascade in Nipah virus glycoprotein F that executes viral and/or cell membrane fusion [22] . Due, to the potency of glycoprotein G over F, we have considered it as the target of this study. It could be the first vaccine developed for humans against glycoprotein G of Nipah henipavirus to be put forward using an immunoinformatics approach and population coverage analytical tools. Even though there's a lot of challenges regarding the development of peptide vaccines, we have decided to develop them for fighting the Nipah virus infection because they make a very good alternative strategy that relies on the usage of short peptide fragments to induce immune responses that are extremely targeted, avoiding all allergenic as well as reactogenic sequences [23][24][25][26] . Antigenic epitopes from single proteins may not be really necessary, whereas some of these epitopes may even be detrimental to the induction of protective immunity. This logic has created an interest in peptide vaccines and especially those containing only epitopes that are capable of inducing desirable T cell and B cell mediated immune response. Less than 20 amino acid sequences make up the peptides used in such vaccines, which are then synthesized to form an immunogenic peptide molecule. These molecules represent the specific epitope of an antigen. These vaccines are also capable of inducing immunity against different strains of a specific pathogen by forming non-contiguous and immunodominant epitopes that are usually conserved in the strains of the pathogen [27] .
The production of peptide vaccines is extremely safe and cost-effective especially when they are compared to conventional vaccines. Traditional vaccines that prevent emerging infectious diseases (EIDs) are very difficult to produce because they require the need to culture pathogenic viruses in vitro however, epitope based peptide vaccines do not require any means of in vitro culturing making them biologically safe, they allow for the large scale of bio-processing to be carried out rapidly and economically and finally their selectivity allows for the precise activation of immune responses by means of selecting immunodominant and conserved epitopes [25,28] .
The complexity of the epitope based peptide vaccine's design depends largely on the properties of the carrier molecules from their reactogenicity to their allergenicity [29,30] . When it comes to the selection of epitopes, it is directed towards B cells, cytotoxic T cells and the induction of the helper T cells as well. Then, it is important to identify the epitopes capable of activating T cells vital for bringing about protective immunity. One of the issues concerned with peptide vaccines representing T cells in a human population that is highly MHC-heterogeneous is identifying highly conserved immunodominant epitopes which are considered to be broad-spectrum vaccines due to their ability to work against multiple serovars of a pathogen [30] .
For the purpose of this research, we have used a variety of bioinformatics tools for the prediction of epitopes along with the population coverage and epitope selection algorithms including the translocation of peptides into MHC class I and MHC class II.    [31] in FASTA format on July 2018. Different prediction tools of Immune Epitope Database IEDB analysis resource (http://www.iedb.org/) [32] were then used to analyze the candidate epitopes.

Conservation region and physicochemical properties:
Conservation regions were determined using multiple sequence alignment with the help of Clustal-W in the Bio-edit software version [33] . Epitope conservancy prediction for individual epitopes was then calculated using the IEDB analysis resource. Conservancy can be defined as the portion of protein sequences that restrain the epi-tope measured at or exceeding a specific level of identity [34] . The physicochemical properties of the retrieved sequence; molecular weight and amino acid composition; were also determined by Bio-edit software version.

B cell epitope prediction tools:
Candidate epitopes were analyzed using several B-cell prediction methods to determine the antigenicity, flexibility, hydrophilicity and surface accessibility. The linear prediction epitopes were obtained from Immune epitope database (http://tools.iedb.org/bcell/result/) [35] by using BepiPred test with a threshold value of 0.149 and a window size 6.0 Moreover, surface accessible epitopes were predicated with a threshold value of 1.0 and window size 6.0 using the Emini surface accessibility prediction tool [35] .
Kolaskar and Tongaonker antigenicity methods (http://tools.iedb.org/bcell/result/) were proposed to determine the sites of antigenic epitopes with a default threshold value of 1.030 and a window size 6.0 [36] .

Peptide binding to MHC class 1 molecules:
The peptide binding was assessed by the IEDB MHC 1 prediction tool at http://tools.iedb.org/mhc1. This tool employs different methods to determine the ability of the submitted sequence to bind to a specific MHC class 1 molecule. The artificial neural network (ANN) method was used to calculate IC50 values of peptide binding to MHC-1 molecules. For both frequent and non-frequent alleles, peptide length was set to 9 amino acids earlier to the prediction. The alleles having binding affinity IC50 equal to or less than 500 nM were considered for further analysis [37] .

Peptide binding to MHC class II molecules:
To predict the peptide binding to MHC class II molecules, MHC II prediction tool http://tools.iedb.org/mhcII provided by Immune Epitope Database (IEDB) analysis resource and human allele references set was used [38] . The Artificial Neural Network prediction method was chosen to identify the binding affinity to MHC II grooves and MHC II binding core epitopes. All epitopes that bind to many alleles at score equal to or less than 1000 half-maximal inhibitory concentration (IC50) were selected for further analysis.

Population coverage:
The population coverage of each epitope was calculated by IEDB population coverage tool at (http://tools.iedb.org/tools/population/iedb_input) This tool is aimed in order to determine the fraction of individuals predicted to respond to a given set of epitopes with known MHC restrictions [39] . For every single population coverage, the tool computed the following information: (1) predicted population coverage, (2) HLA combinations recognized by the population, and (3) HLA combinations recognized by 90% of the population (PC90). All epitopes and their MHCI and MHC II molecules were assessed against population coverage area selected before submission.

Homology modeling:
The 3D structure of glycoprotein G of Nipah virus was predicted using the Raptor X web portal (http://raptorx.uchicago.edu/) the reference sequence was submitted in FASTA format on 14/9/2018 and structure received on 15/9/2018 [40] . The Structure was then treated with UCSF Chimera 1.10.2 to visualize the position of proposed peptides [41] .

Ligand Preparation:
In order to estimate the binding affinities between the epitopes and molecular structure of MHC I & MHC II, in silico molecular docking were used. Sequences of proposed epitopes were selected from Nipah virus reference sequence using Chimera 1.10 and saved as a pdb file. The obtained files were then optimized and energy minimized. The HLA-A0201 was selected as the macromolecule for docking. Its crystal structure (4UQ3) was downloaded from the RCSB Protein Data Bank (http://www.rcsb.org/pdb/home/home.do), which was in a complex with an azobenzene-containing peptide [42] .
All water molecules and heteroatoms in the retrieved target file 4UQ3 were then removed.
Molecular docking was performed using Autodock 4.0 software, based on Lamarckian Genetic Algorithm; which combines energy evaluation through grids of affinity potential to find the suitable binding position for a ligand on a given protein [44,45] Polar hydrogen atoms were added to the protein targets and Kollman united atomic charges were computed. The target's grid map was calculated and set to 60×60×60 points with grid spacing of 0.375 Ǻ . The grid box was then allocated properly in the target to include the active residue in the center. The genetic algorithm and its run were set to 100.The docking algorithms were set to default. Finally, results were retrieved as binding energies and poses that showed lowest binding energies were visualized using UCSF chimera.

Nipah virus glycoprotein G physical and chemical parameters:
The physicochemical properties of Nipah virus glycoprotein G protein were assessed using   were shown in Figure (3-5).

Population coverage:
Population coverage test was performed to detect all epitopes binds to MHC1 alleles and MHC11 alleles for the world, South Asia, Southeast Asia, Sudan, and North Africa.           P  o  p  u  l  a  t  i  o  n  /  a  r  e  a  C  l  a  s  s  I  I  c  o  v  e  r  a  g  e   a   a  v  e  r  a  g  e  _  h  i  t   b   p  c  9  0   c   S  o  u  t  h  A  s  i  a  5  6  .  0  %  5  0  .  5  -1  0  .  0  9   A  v  e  r  a  g  e  5  6 .       f Nipah imera X

Discussion:
In our current study, potential peptides were suggested to design an epitope-based peptide vaccine ≤ 500 [37] . 191 peptides were found to bind MHC I with different affinities. It is well known that a better immune response depends on the successful recognition of epitopes by HLA alleles with significant affinity. However, a peptide recognized by the highest number of HLA alleles has the best potential to induce a strong immune response. Therefore taking that fact, only three peptides were selected based on a higher number of alleles and 100% conservancy. The  The selected peptides were further subjected to both MHC I and MHC II based population coverage analysis in all World, Southeast Asia, South Asia, North Africa, and Sudan as shown on (Table 14).
Among the five primarily selected peptide, the obtained results showed very strong potential in proposing peptide FLIDRINWI as vaccine candidate than others by considering its overall peptide conservancy, population coverage and by the affinity for the highest number of HLA alleles.
The peptide FLIDRINWI demonstrate a special population coverage results for both MHC I and MHC II alleles with total binding to 12 alleles in MHC II and 8 alleles in MHC I, This finding demonstrate a very strong potential to formulate an epitopes-based peptide vaccine against Nipah virus glycoprotein G and make the peptide FLIDRINWI highly proposed. Furthermore, an insilico molecular docking was done to explore the binding affinity between the aforementioned peptides and the target HLA-A0201; which has been selected for docking pertained with its involvement in several immunological and pathological diseases [46,47,48] . Although a numerous studies have demonstrated an association between HLA alleles and disease susceptibility, defining protective HLA allelic associations potentially allows the identification of pathogen epitopes that are restricted by the specific HLA alleles. These epitopes may then be incorporated into vaccine design in the expectation that the natural resistance can be replicated by immunization [47,48] . Calculation of the root means square deviation (RMSD) between co-ordinates of the atoms and formation of clusters based on RMSD values computed the resemblance of the docked structures. The best docking result is considered to be the conformation with the lowest binding energy. The least energy was predicted for the peptide FLIDRINWI (-6.95 Kcal/mol) and the 3D structure of the allele and peptide shown in (figure 7). According to these interesting outcomes, formulating a vaccine using FLIDRINWI peptide is highly promising that is encouraging to be the first proposed epitope-based peptide vaccine against Nipah virus. However, novel vaccines approach as epitope-based peptide vaccines can possibly conquer these obstructions for these immunizations can make increasingly successful, specific and long lasting impregnable reaction and without all the adverse effects of the classical vaccines. Moreover, based on our previous studies peptide-based vaccines have been effectively proposed by utilizing in silico approach against Madurella mycetomatis, Mokola Rabies Virus, Lagos Rabies Virus and others [49,50,51] . Such investigations in regards to those viruses built up immunoinformatics considered as a computational analysis. In this study the same techniques were used, to design the Epitope-Based Peptide Vaccine against Nipah virus glycoprotein G as a target immunogenic part of the virus to stimulate a protective immune response.
One of the limitations of this study was the absent of peptides with a strong binding affinity to Bcell The sequence of Nipah virus glycoprotein G was subjected to Bepipred linear epitope prediction, Emini surface accessibility, and Kolaskar and Tongaonkar antigenicity methods in IEDB.

Conclusions:
To the best of our knowledge, this study is considered to be the first to propose epitope-based peptide vaccine against glycoprotein G of Nipah virus, which is expected to be highly antigenic with a minimum allergic effect. Furthermore, this study proposes promising peptide FLIDRINWI with a very strong binding affinity to MHC1 and MHC11 alleles. This peptide shows an exceptional population coverage results for both MHC1 and MHC11 alleles.
In-vivo and in-vitro assessment for the most promising peptides namely; (FLIDRINWI), (TVYHCSAVY) and (FAYSHLERI) are recommended to be explored and find out more on their ability to be developed into vaccines against Nipah virus glycoprotein G.