Epitope-Based Peptide Vaccine against Glycoprotein G of Nipah Henipavirus Using Immunoinformatics Approaches

Background. Nipah virus belongs to the genus Henipavirus of the family Paramyxoviridae. It is endemic mostly in South Asia and first emerged in Malaysia in 1998. Bats are the main reservoir for this virus, which causes disease in both humans and animals. The last outbreak occurred in May 2018 in Kerala. The virus is characterized by high pathogenicity and by fatality rates that vary from 40% to 70% depending on the severity of the disease and on the availability of adequate healthcare facilities. Currently, there are no antiviral drugs available for NiV disease, and treatment is only supportive. Clinical presentations range from asymptomatic infection to fatal encephalitis. Objective. This study aimed to predict an effective epitope-based vaccine against glycoprotein G of Nipah henipavirus using immunoinformatics approaches. Methods and Materials. The sequence of glycoprotein G of the Nipah virus was retrieved from NCBI. Different prediction tools were used to analyze the epitopes, namely, BepiPred-2.0: Sequential B Cell Epitope Predictor for B cell epitopes and prediction tools for T cell MHC class I and class II epitopes. Then, the proposed peptides were docked using the AutoDock 4.0 software program. Results and Conclusions. The two peptides TVYHCSAVY and FLIDRINWI showed a very strong binding affinity to MHC class I and MHC class II alleles. Furthermore, considering the conservancy, the affinity, and the population coverage, the peptide FLIDRINWI is highly suitable for formulating a new vaccine against glycoprotein G of Nipah henipavirus. An in vivo study of the proposed peptides is also highly recommended.


Introduction
Nipah virus (NiV) is an RNA virus that belongs to the genus Henipavirus within the family Paramyxoviridae. It first emerged in Malaysia in 1998 and takes its name from the village of Sungai Nipah, where it was isolated from the cerebrospinal fluid (CSF) of one of the patients [1][2][3][4]. NiV is transmitted zoonotically (from bats to humans, or from bats to pigs and then to humans) as well as from human to human. Its clinical presentation varies from asymptomatic (subclinical) infection to acute respiratory illness and fatal encephalitis in most of the patients who have been in direct contact with infected pigs. The virus has also been found to cause central nervous system illnesses in pigs and respiratory illnesses in horses, resulting in a significant economic loss for farmers [1][5][6][7][8][9]. Large fruit bats of the genus Pteropus appear to act as a natural reservoir of NiV, based on studies of Hendra virus that showed neutralizing antibodies to the Hendra virus in these bats [10,11]. Although there are no more cases of NiV in Malaysia, several outbreaks have occurred in India, Bangladesh, Thailand, and Cambodia [12]. The case fatality rate ranges from 50% to 100%, making it one of the deadliest viruses known to infect humans [3,13,14].
Laboratory diagnosis of Nipah virus infection is made using reverse transcriptase polymerase chain reaction (RT-PCR) on throat swabs, cerebrospinal fluid, urine, and blood during the acute and convalescent stages of the disease. IgG and IgM antibody detection can be done after recovery to confirm Nipah virus infection. Immunohistochemistry on tissues collected during an autopsy can also confirm the disease [15,16]. Currently, there are no effective treatments for Nipah virus infection. Therefore, precautions should be followed, such as practicing standard infection control, barrier nursing to avoid the spread of the infection from person to person, and isolating those suspected to have the infection [7,8,17]. Recent computational approaches have provided further information about viruses, including the study conducted by Badawi M et al. on Zika virus, in which the envelope glycoprotein was obtained from protein databases and the most immunogenic epitopes for the T and B cells involved in cell-mediated immunity were analyzed [18]. The main focus of that analysis was the potential MHC class I peptides identified using in silico analysis techniques [19,20]. In this study, the same techniques were applied with MHC classes I and II, along with world population coverage, as our main focus. Furthermore, we aimed to design an epitope-based peptide vaccine against Nipah virus using peptides of its glycoprotein G as the immunogenic part to stimulate a protective immune response [3].
Nipah virus invades host cells by fusion with the host cell membranes at a physiological pH that is optimal for cleavage, without requiring viral endocytosis. Cell-cell fusion is a pathological hallmark of Nipah virus infections, resulting in cell-to-cell spread, inflammation, and destruction of endothelial cells and neurons [21]. Both Nipah virus entry and cell-cell fusion require the concerted efforts of the attachment glycoprotein G and the fusion (F) glycoprotein. Upon receptor binding, Nipah virus glycoprotein G triggers a conformational cascade in Nipah virus glycoprotein F that executes viral and/or cell membrane fusion [22]. Because of the potency of glycoprotein G over F, we selected glycoprotein G as the target of this study. Although there are many challenges in the development of peptide-based vaccines, we decided to study and propose a new vaccine against the Nipah virus, since such vaccines offer a helpful alternative strategy that relies on short peptide fragments to induce immune responses [23][24][25][26]. Not all antigenic epitopes of a single protein may be necessary, and some of these epitopes may even be detrimental to the induction of protective immunity. This logic has created an interest in peptide vaccines, especially those containing only epitopes that are capable of inducing desirable T cell- and B cell-mediated immune responses. The peptides used in such vaccines are sequences of fewer than 20 amino acids, which are synthesized to form an immunogenic peptide molecule representing a specific epitope of an antigen. These vaccines are also capable of inducing immunity against different strains of a specific pathogen by incorporating noncontiguous and immunodominant epitopes that are usually conserved among the strains of the pathogen [27].
The production of peptide vaccines is extremely safe and cost-effective, especially compared to conventional vaccines. Traditional vaccines against emerging infectious diseases (EIDs) are very difficult to produce because pathogenic viruses must be cultured in vitro. Epitope-based peptide vaccines, however, do not require in vitro culturing, which makes them biologically safe and allows large-scale bioprocessing to be carried out rapidly and economically. Finally, their selectivity allows precise activation of immunological responses through the selection of immunodominant and conserved epitopes [25,28]. The complexity of designing an epitope-based peptide vaccine depends largely on the reactogenicity and allergenicity of its carrier molecules [29,30]. The selection of epitopes is based on the analysis of B cells, cytotoxic T cells, and the induction of helper T cells. It is then important to identify the epitopes capable of activating the T cells that are vital for stimulating protective immunity. One of the issues for T cell peptide vaccines in a human population that is highly MHC-heterogeneous is to identify the highly conserved immunodominant epitopes; such epitopes are considered broad-spectrum vaccine candidates because of their ability to work against multiple serovars of a pathogen [30]. In this study, we used a variety of bioinformatics tools for the prediction of epitopes, along with population coverage and epitope selection algorithms, including the translocation of peptides into MHC class I and MHC class II.  [32] were then used to analyze the candidate epitopes.

Conservation Region and Physicochemical Properties.
Conserved regions were determined by multiple sequence alignment using ClustalW in the BioEdit software version 7.2.5 [33]. Epitope conservancy for individual epitopes was then calculated using the IEDB Analysis Resource. Conservancy is defined as the fraction of protein sequences that contain the epitope at or above a specified level of identity [34]. The physicochemical properties of the retrieved sequence, namely, molecular weight and amino acid composition, were also determined using the BioEdit software.
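The conservancy definition above can be illustrated with a minimal sketch. It is not the IEDB implementation: the function name `epitope_conservancy`, the ungapped sliding-window comparison, and the three toy sequences are assumptions made only for illustration, and the sequences are not real Nipah virus strains.

```python
def epitope_conservancy(epitope, sequences, min_identity=1.0):
    """Fraction of sequences that contain the epitope at or above min_identity.

    Identity is the best ungapped match of the epitope against any window of
    the same length in a sequence (an illustrative simplification).
    """
    if not sequences:
        raise ValueError("at least one sequence is required")

    k = len(epitope)

    def best_identity(seq):
        best = 0.0
        for start in range(len(seq) - k + 1):
            window = seq[start:start + k]
            matches = sum(a == b for a, b in zip(epitope, window))
            best = max(best, matches / k)
        return best

    hits = sum(best_identity(seq) >= min_identity for seq in sequences)
    return hits / len(sequences)


# Hypothetical toy sequences; the third carries one substitution (R -> K).
strains = [
    "MKAFLIDRINWITAG",
    "MKAFLIDRINWISAG",
    "MKAFLIDKINWITAG",
]

print(epitope_conservancy("FLIDRINWI", strains))         # exact matches only
print(epitope_conservancy("FLIDRINWI", strains, 8 / 9))  # one mismatch allowed
```

With an identity threshold of 100%, only two of the three toy sequences count; relaxing the threshold to 8 of 9 residues brings in the third.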

B Cell Epitope Prediction Tools.
Candidate epitopes were analyzed using several B cell prediction methods to determine their antigenicity, flexibility, hydrophilicity, and surface accessibility. The predicted linear epitopes were obtained from the Immune Epitope Database (http://tools.iedb.org/bcell/result/) [35] using the BepiPred method with a threshold value of 0.149 and a window size of 6.0. Moreover, surface-accessible epitopes were predicted with a threshold value of 1.0 and a window size of 6.0 using the Emini surface accessibility prediction tool [35]. The Kolaskar and Tongaonkar antigenicity method (http://tools.iedb.org/bcell/result/) was also used to determine the sites of antigenic epitopes with a default threshold value of 1.030 and a window size of 6.0 [36].  [38]. The artificial neural network prediction method was chosen to identify the binding affinity of MHC II grooves and MHC II binding core epitopes. All epitopes that bound to many alleles with a half-maximal inhibitory concentration (IC50) score equal to or less than 1000 were selected for further analysis.
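The IC50 selection rule above can be sketched as a simple filter over prediction rows. This is an illustration only: the function name `promiscuous_binders`, the row layout, the peptide labels, and every IC50 value below are hypothetical placeholders, not output of the IEDB tools.

```python
from collections import defaultdict


def promiscuous_binders(predictions, ic50_cutoff=1000.0):
    """Group alleles by peptide, keeping only pairs with IC50 <= cutoff.

    predictions: iterable of (peptide, allele, ic50) tuples.
    Returns (peptide, sorted alleles) pairs, most alleles first.
    """
    alleles_by_peptide = defaultdict(set)
    for peptide, allele, ic50 in predictions:
        if ic50 <= ic50_cutoff:
            alleles_by_peptide[peptide].add(allele)
    ranked = sorted(
        alleles_by_peptide.items(),
        key=lambda item: (-len(item[1]), item[0]),
    )
    return [(peptide, sorted(alleles)) for peptide, alleles in ranked]


# Placeholder rows with made-up IC50 values, for illustration only.
rows = [
    ("PEPTIDE_A", "HLA-DRB1*01:01", 120.0),
    ("PEPTIDE_A", "HLA-DRB1*04:01", 950.0),
    ("PEPTIDE_A", "HLA-DRB1*07:01", 1800.0),
    ("PEPTIDE_B", "HLA-DRB1*01:01", 40.0),
    ("PEPTIDE_C", "HLA-DRB1*15:01", 2500.0),
]

ranked = promiscuous_binders(rows)
for peptide, alleles in ranked:
    print(peptide, len(alleles), alleles)
```

Passing `ic50_cutoff=500.0` gives the stricter cutoff that the study reports for MHC class I.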

Population Coverage.
The population coverage of each epitope was calculated with the IEDB population coverage tool (http://tools.iedb.org/tools/population/iedb_input). This tool was used to determine the fraction of individuals predicted to respond to a given set of epitopes with known MHC restrictions [39]. For every population coverage calculation, the tool computed the following information: (1) predicted population coverage, (2) HLA combinations recognized by the population, and (3)  , which was in a complex with an azobenzene-containing peptide [42]. All water molecules and heteroatoms in the retrieved target file 4UQ3 were then removed. The target structure was further optimized and energy minimized using Swiss PDB viewer V.4.1.0 software [43].
Molecular docking was performed using AutoDock 4.0 software, based on the Lamarckian genetic algorithm, which combines energy evaluation through grids of affinity potential with a search for the most suitable binding position of a ligand on a given protein [44,45]. Polar hydrogen atoms were added to the protein targets, and Kollman united atom charges were computed. The target's grid map was set to 60 × 60 × 60 points with a grid spacing of 0.375 Å. The grid box was positioned in the target so that the active residue lay at its center. The number of genetic algorithm runs was set to 100, and the remaining docking parameters were left at their default values. Finally, results were retrieved as binding energies, and the poses with the lowest binding energies were visualized using UCSF Chimera.
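As a quick check on the size of the search space these grid settings imply, the box extents can be computed directly. The sketch below assumes that the 60 counts grid intervals along each axis, so that an edge is 60 × 0.375 Å = 22.5 Å; the box center at the origin and the helper names `grid_box` and `inside` are hypothetical, since the actual center coordinates are not given in the text.

```python
def grid_box(center, npts=(60, 60, 60), spacing=0.375):
    """Return the (min corner, max corner) of a grid box in angstroms.

    Assumes npts counts intervals per axis, so edge length = npts * spacing.
    """
    half = [n * spacing / 2 for n in npts]
    lower = tuple(c - h for c, h in zip(center, half))
    upper = tuple(c + h for c, h in zip(center, half))
    return lower, upper


def inside(point, box):
    """True if a 3D point lies within the box (boundaries included)."""
    lower, upper = box
    return all(lo <= p <= hi for p, lo, hi in zip(point, lower, upper))


# Hypothetical center; the study centers the box on the active residue.
box = grid_box(center=(0.0, 0.0, 0.0))
print(box)
print(inside((5.0, -3.0, 10.0), box))
print(inside((12.0, 0.0, 0.0), box))
```

Under that assumption the box reaches 11.25 Å from its center along each axis, which is the volume a docked peptide has to fit within.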

Journal of Immunology Research

Nipah Virus Glycoprotein G Physical and Chemical Parameters.
The physicochemical properties of the Nipah virus glycoprotein G protein were assessed using BioEdit software version 7.0.9.0. The protein length was found to be 602 amino acids, and the molecular weight was 67035.54 Daltons. The amino acids that form the Nipah virus glycoprotein G protein are shown in Table 1 along with their numbers and molar percentages (Mol%).
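The residue counts and molar percentages reported in such a composition table follow from simple counting. The sketch below is not BioEdit's code; the function name `composition` is an assumption, and the short peptide FLIDRINWI is used as input only because the full 602-residue sequence is not reproduced here.

```python
from collections import Counter


def composition(sequence):
    """Map each residue to (count, mol%) for a one-letter amino acid sequence."""
    sequence = sequence.upper()
    if not sequence:
        raise ValueError("sequence must not be empty")
    total = len(sequence)
    counts = Counter(sequence)
    return {
        residue: (count, 100.0 * count / total)
        for residue, count in sorted(counts.items())
    }


table = composition("FLIDRINWI")
for residue, (count, mol_percent) in table.items():
    print(f"{residue}\t{count}\t{mol_percent:.2f}")
```

Mol% here is the residue count divided by the sequence length, so the percentages in a complete table sum to 100.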

B Cell Epitope Prediction.
The reference sequence of the Nipah virus glycoprotein G was subjected to BepiPred linear epitope prediction. The Emini surface accessibility and the Kolaskar and Tongaonkar antigenicity methods in IEDB were used to predict binding to B cells and to test surface accessibility and immunogenicity. The results are shown in Figures 1-3.
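The window-and-threshold idea shared by these linear predictors can be sketched generically. This is a simplification and not the IEDB implementation: the function name `windows_above`, the use of a plain window mean, and the per-residue scores are hypothetical, and each real method derives its scores from its own scale.

```python
def windows_above(scores, window, threshold):
    """Return merged (start, end) spans whose window mean meets the threshold.

    Spans are 0-based and half-open; overlapping or touching windows merge.
    """
    spans = []
    for start in range(len(scores) - window + 1):
        mean = sum(scores[start:start + window]) / window
        if mean >= threshold:
            end = start + window
            if spans and start <= spans[-1][1]:
                spans[-1] = (spans[-1][0], end)
            else:
                spans.append((start, end))
    return spans


# Hypothetical per-residue scores with one high-scoring stretch.
scores = [0.5] * 4 + [1.5] * 6 + [0.5] * 4
print(windows_above(scores, window=6, threshold=1.2))
```

A larger window smooths the score profile and yields fewer, longer candidate regions; a higher threshold trades sensitivity for specificity.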

Prediction of Cytotoxic T Cell Epitopes and Interaction with MHC Class I Alleles.
The Nipah virus glycoprotein G sequence was analyzed using the IEDB MHC class I binding prediction tool based on ANN-align with half-maximal inhibitory concentration (IC50) ≤ 500; the list of the most promising epitopes that had a binding affinity with the class I alleles, along with their positions in the Nipah virus glycoprotein G, is shown in Table 2.

Prediction of T Helper Cell Epitopes and Interaction with MHC Class II Alleles.
The Nipah virus glycoprotein G sequence was analyzed using the IEDB MHC class II binding prediction tool based on NN-align with half-maximal inhibitory concentration (IC50) ≤ 1000. The epitopes, their corresponding MHC class II alleles, and their positions in the Nipah virus glycoprotein G were listed, and the most promising epitopes, selected by their strong binding affinity to MHC class II alleles and by the number of alleles they bind, are shown in Table 3.

Population Coverage.
Population coverage was calculated for all the epitopes that bind to the MHC class I and MHC class II alleles available in the database, for the world, South Asia, Southeast Asia, Sudan, and North Africa.
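The intuition behind a population coverage figure can be shown with a simplified calculation. This sketch is not the IEDB tool: it assumes Hardy-Weinberg proportions within a locus and independence between loci, the function names are hypothetical, and the allele frequencies are made up for illustration.

```python
def locus_coverage(covered_allele_freqs):
    """Fraction of individuals carrying at least one covered allele at a locus.

    Assumes Hardy-Weinberg proportions: an individual is uncovered only if
    both chromosomes carry uncovered alleles.
    """
    p = sum(covered_allele_freqs)
    if not 0.0 <= p <= 1.0 + 1e-9:
        raise ValueError("allele frequencies must sum to a value in [0, 1]")
    p = min(p, 1.0)
    return 1.0 - (1.0 - p) ** 2


def combined_coverage(loci):
    """Coverage across loci, assuming the loci are independent."""
    uncovered = 1.0
    for freqs in loci:
        uncovered *= 1.0 - locus_coverage(freqs)
    return 1.0 - uncovered


# Hypothetical frequencies of the alleles bound by an epitope set.
locus_a = [0.20, 0.10]
locus_b = [0.50]
print(locus_coverage(locus_a))
print(combined_coverage([locus_a, locus_b]))
```

Because coverage depends on how frequent the bound alleles are in a given population, the same epitope set can score differently in different regions.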

Discussion
Traditional vaccination approaches depend on whole pathogens that are either live-attenuated or inactivated. A significant issue with these vaccines is safety: the pathogens used for vaccination may become reactivated and cause infection. Additionally, because of the genetic variation among pathogen strains around the world, such vaccines are likely to lose their efficacy in some regions or even in certain populations.
However, novel vaccine approaches such as DNA- and epitope-based vaccines may overcome the obstacles of this type of vaccination, producing more effective, specific, and long-lasting immune responses with minimal structures and without undesired effects [46]. Moreover, many peptide-based vaccines have been proposed using in silico approaches against Madurella mycetomatis, Mokola rabies virus, Lagos rabies virus, and others [47][48][49][50][51][52]. Such investigations of those pathogens have established immunoinformatics within the field of computational analysis.
In our present work, potential peptides were suggested to design an epitope-based vaccine for Nipah virus, using the latest amino acid sequences of glycoprotein G (glycoside hydrolase family) for a total of 21 strains of Nipah virus that were retrieved from the NCBI database (https://www.ncbi.nlm.nih.gov/protein) [31] in July 2018, after the last outbreak at the end of May 2018 in Kerala, India, according to the WHO report [53]. Figure 4 summarizes the method of the present work.
Various literature sources were surveyed to define the antigenic part of the virus. Glycoprotein G, which lies on the outer surface of the virus, was chosen as our target. Initially, we evaluated the binding affinity to MHC alleles by submitting the protein reference sequence to the IEDB MHC binding prediction tool, based on the ANN align method with IC50 ≤ 500 [37], for MHC class I molecules. A total of 191 peptides were found to bind to MHC class I with different affinities. It is well known that a better immune response depends on the successful recognition of epitopes by HLA molecules with significant affinity. Therefore, a peptide recognized by the highest number of HLA alleles has the best potential to induce a strong immune response, which led us to take into account only the three peptides found with 100% conservancy. The conserved peptide FLIDRINWI was found to interact with 8 alleles (  The reference sequence of Nipah virus glycoprotein G was reanalyzed using the IEDB MHC II binding prediction tool based on NN-align with half-maximal inhibitory concentration (IC50) ≤ 1000 [38]. The analysis resulted in the prediction of 398 peptides, from which FSWDTMIKF, FLIDRINWI, and ILSAFNTVI were proposed according to their high number of binding alleles (15, 12, and 15 alleles, respectively). Additionally, the sequence of Nipah virus glycoprotein G was subjected to BepiPred linear epitope prediction, Emini surface accessibility, and Kolaskar and Tongaonkar antigenicity methods in IEDB. Unfortunately, the peptides with the strongest binding affinities were absent from the results of these three tests.
Population coverage results for the total peptides found and the proposed peptides binding to MHC classes I and II alleles are summarized in Tables 4 and 5. The results for binding to MHC I alleles revealed a 99.84% projected population coverage in the world, 98.55% in Southeast Asia, 98.40% in South Asia, 99.23% in North Africa, and 99.36% in Sudan, while the results for the total number of peptides binding to MHC II alleles showed only a 56.84% projected population coverage in the world, 48.63% in Southeast Asia, 56.00% in South Asia, 62.37% in North Africa, and 55.75% in Sudan.
The selected peptides were further subjected to both MHC I- and MHC II-based population coverage analysis for the whole world, Southeast Asia, South Asia, North Africa, and Sudan, as shown in Table 5. Among the six initially selected epitopes, the results showed very strong potential for proposing the epitope FLIDRINWI as a vaccine candidate compared to the rest, taking into consideration its overall epitope conservancy, its population coverage, and its affinity for the highest number of HLA molecules. Furthermore, in silico docking was carried out to measure the binding efficacy between the proposed peptides and HLA-A*02:01, which was specifically chosen because of its contribution to several immunological and pathological diseases [54][55][56]. Numerous investigations have shown a relationship between HLA alleles and disease susceptibility; defining protective HLA allelic associations may permit the identification of pathogen epitopes that are restricted by particular HLA alleles. These epitopes may then be incorporated into a vaccine design in the expectation that natural immunity will be reproduced by immunization [55,56].
Calculating the root mean square deviation (RMSD) between atomic coordinates and forming clusters based on the RMSD values gave a measure of the resemblance of the docked structures. The most favorable docking is considered to be the conformation with the lowest binding energy. The lowest-energy prediction for the peptide FLIDRINWI (-6.95 kcal/mol) and the 3D structure of the allele with its peptide are shown in Figure 5. Furthermore, the monoisotopic mass, sum formula, and molecular weight of the three highly proposed peptides are shown in Table 6.
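The two quantities used to compare docked poses, RMSD between coordinate sets and the lowest binding energy, can be sketched as follows. This is an illustration rather than AutoDock's code: the function names, the pose dictionaries, the coordinates, and the energies are all hypothetical, and the RMSD is computed without superposition on the assumption that all poses share the receptor's coordinate frame.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation between two equally sized lists of 3D points."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


def best_pose(poses):
    """Return the pose with the lowest (most negative) binding energy."""
    return min(poses, key=lambda pose: pose["energy"])


# Hypothetical poses: two atoms each, energies in kcal/mol.
poses = [
    {"name": "pose_1", "energy": -5.1, "coords": [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]},
    {"name": "pose_2", "energy": -6.3, "coords": [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]},
    {"name": "pose_3", "energy": -4.8, "coords": [(3.0, 0.0, 0.0), (4.0, 0.0, 0.0)]},
]

print(rmsd(poses[0]["coords"], poses[1]["coords"]))
print(best_pose(poses)["name"])
```

Poses whose pairwise RMSD falls under a chosen tolerance would be grouped into one cluster, and the lowest-energy member of a cluster is typically taken as its representative.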
Given these outcomes, formulating a vaccine using the suggested peptide is highly promising, and the peptide is proposed as a universal epitope-based peptide vaccine against Nipah virus.

Conclusions
The present study proposed a very promising epitope-based peptide vaccine against glycoprotein G of Nipah virus. It is expected to be highly antigenic with a minimum allergic effect. The proposed peptide FLIDRINWI has a strong binding affinity to both MHC class I and MHC class II alleles. Moreover, it shows an exceptional population coverage result for both MHC class I and MHC class II alleles in the whole world, Southeast Asia, South Asia, North Africa, and Sudan.
The findings of the current study still need to be validated; an in vivo assessment of the most promising peptides, namely, FLIDRINWI, TVYHCSAVY, and FAYSHLERI, is therefore highly recommended, and the present results, as shown in Figures 5-9, will serve as the ground data for such work.

Data Availability
The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest
The authors declare that they have no competing interests.