Toxoplasma gondii Tyrosine-Rich Oocyst Wall Protein: A Closer Look through an In Silico Prism

Toxoplasmosis is a global threat with significant zoonotic concern. The present in silico study aimed to determine the bioinformatics features and immunogenic epitopes of a tyrosine-rich oocyst wall protein (TrOWP) of Toxoplasma gondii. After retrieving the amino acid sequence from the UniProt database, several parameters were predicted, including antigenicity, allergenicity, solubility, physico-chemical features, signal peptide, transmembrane domain, and posttranslational modifications. Following secondary and tertiary structure prediction, the 3D model was refined, and immunogenic epitopes were predicted. TrOWP was a 25.57 kDa hydrophilic molecule with 236 residues, a signal peptide, and significant antigenicity scores. Moreover, several linear and conformational B-cell epitopes were present. Potential mouse and human cytotoxic T-lymphocyte (CTL) and helper T-lymphocyte (HTL) epitopes were also predicted in the sequence. These findings are promising, as they point to characteristics of TrOWP that favor its inclusion in future vaccination experiments.


Introduction
The model apicomplexan Toxoplasma gondii (T. gondii) infects a large number of warm-blooded animal species, including humans [1]. Reportedly, one-third of the global population shows serological evidence of previous exposure to the parasite [2]. The parasite is also of veterinary importance as a well-known abortifacient among domestic livestock [3]. Within the feline intestine, gametogony and sporogony occur, giving rise to unsporulated oocysts. These are shed via feces into the environment, become infective, and contaminate food/water supplies [4]. Additionally, fast-replicating tachyzoites (transfusion-mediated and congenital infection) and slow-dividing bradyzoites (cyst-contaminated muscle tissues and organ transplant) are involved in alternative transmission pathways [5]. Notwithstanding its widespread prevalence, T. gondii infection rarely results in clinical disease in immunocompetent individuals, whereas a decreased immune status, as in pregnancy and/or immunosuppressive disorders, may allow the opportunistic parasite to invade the fetus via the placenta or the central nervous system (CNS) tissues, respectively [6].
Unfortunately, the current chemotherapeutic options for acute infections and/or recrudescence of infection are only active against the tachyzoite stage, and a number of side effects have been reported in treated patients [7]. Preventive measures, such as a One-Health vaccination approach, would therefore be of greater benefit [8]. Since Toxoplasma is an obligate intracellular organism, CD8+ T-cell responses with an interferon gamma (IFN-γ) upsurge are the preferred immunity to combat acute infection. However, humoral responses are also advantageous, in particular following an oral infection with bradyzoites or oocysts [9].
Despite over 30 years of preclinical and clinical research on Toxoplasma vaccination using various platforms, no commercial human vaccine has been registered yet [10]. The only available vaccine is the so-called "Toxovax®," a live attenuated strain (S48) of T. gondii for prevention of abortion in sheep, but it cannot be used in humans owing to safety concerns [11]. The multistage life cycle of Toxoplasma requires expression of a large number of proteins during parasite stage-to-stage transition. Surface antigens (SAGs) along with organellar proteins including micronemes (MICs), rhoptries (ROPs), dense granule antigens (GRAs), and their associated molecules have been nominated as the main repository of valuable vaccine candidates. Interestingly, some of these proteins are stage-specific (SAG1 in tachyzoites), while others are associated with multiple stages (MIC4, MIC13, ROP2, GRA8, GRA14) [12].
Although human T. gondii infection acquired via tissue cysts can be prevented through proper cooking of meat products, oocysts can thoroughly contaminate water streams, soils, and foodstuffs and survive for a long time in moist surroundings as well as in fresh/marine waters [13]. This outstanding ability of oocysts is not yet fully understood, but the underlying mechanism may lie within key molecular components of the oocyst wall [14,15]. In light of advances in omics-based technologies, there has been a growing effort to decipher the parasite biology precisely and to characterize critical immunocorrelates of T. gondii protection [16]. In this sense, bioinformatics methods have further facilitated the identification of biophysical features and novel immunogenic antigens/peptides in different Toxoplasma infectious stages [17]. The present in silico study was performed to take a closer look at the bioinformatics properties of the T. gondii tyrosine-rich oocyst wall protein (TrOWP) and its immunodominant cytotoxic T-lymphocyte (CTL), helper T-lymphocyte (HTL), and B-cell epitopes using comprehensive immunoinformatics tools.

Figure 1: Secondary structure prediction for TrOWP using GOR IV web server. (a) Sequence-based prediction results indicated that 54.24%, 33.05%, and 12.71% of the sequence are dedicated to the random coil (yellow C), alpha helix (blue H), and extended strand (red E), respectively. (b) Graphical illustration of the secondary structure by GOR IV server.

sequence transformation into uniform vectors of amino acid characteristics via auto cross covariance (ACC) [18]. In addition, the ANTIGENpro online tool uses microarray data in a pathogen- and alignment-free manner [19]. The allergenicity of the protein was evaluated using the AllerTOP v2.0 (https://www.ddg-pharmfac.net/AllerTOP/) and AlgPred (http://crdd.osdd.net/raghava/algpred/) servers.
AllerTOP utilizes E-descriptors, k-nearest neighbors, and auto/cross variance transformation algorithms [20]. AlgPred employs several machine learning algorithms to determine IgE-specific epitopes and MEME (Multiple Em for Motif Elicitation)/MAST (Motif Alignment and Search Tool) allergen motifs [21]. Protein solubility was calculated using two web servers: SOLpro (http://scratch.proteomics.ics.uci.edu/) and Protein-Sol (https://protein-sol.manchester.ac.uk/). SOLpro employs a two-stage support vector machine (SVM) algorithm to predict solubility upon overexpression in Escherichia coli (E. coli) [22]. In the Protein-Sol server, the population average of the experimental dataset is 0.45, and a protein with a solubility value above this score is predicted to be more soluble than the average [23]. Finally, the physico-chemical properties of TrOWP were determined using the ExPASy ProtParam web server (https://web.expasy.org/protparam/), which reports the protein molecular weight (MW), positively and negatively charged residues, isoelectric point (pI), in vitro and in vivo estimated half-life, instability index, aliphatic index, and grand average of hydropathicity (GRAVY) [24].
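Two of the ProtParam quantities listed above follow simple closed-form definitions and can be reproduced locally. The sketch below is only an illustration, not the server's code: it computes GRAVY as the mean Kyte-Doolittle hydropathy per residue and the aliphatic index as X(Ala) + 2.9·X(Val) + 3.9·(X(Ile) + X(Leu)), where X is the mole percent. The function names are hypothetical, and the example peptide is the short "EEAAEPDE" epitope reported later in this article, used here only because the full TrOWP sequence is not reproduced in the text.

```python
# Kyte-Doolittle hydropathy scale (standard published values).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence):
    """Grand average of hydropathicity: mean hydropathy value per residue."""
    seq = sequence.upper()
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


def aliphatic_index(sequence):
    """Aliphatic index from the mole percent of Ala, Val, Ile, and Leu."""
    seq = sequence.upper()

    def mole_percent(aa):
        return 100.0 * seq.count(aa) / len(seq)

    return (mole_percent("A")
            + 2.9 * mole_percent("V")
            + 3.9 * (mole_percent("I") + mole_percent("L")))


def charged_residue_counts(sequence):
    """Return (negative, positive) counts, i.e. (Asp + Glu, Arg + Lys)."""
    seq = sequence.upper()
    return seq.count("D") + seq.count("E"), seq.count("R") + seq.count("K")


# Short peptide taken from the epitope list in this article (not the full protein).
peptide = "EEAAEPDE"
print(round(gravy(peptide), 4), aliphatic_index(peptide), charged_residue_counts(peptide))
```

A negative GRAVY value indicates a net hydrophilic sequence, which is how the score is read in the Discussion.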

Refinement of the 3D Structure and Validations.
The best-fit, top-ranked 3D model provided by the I-TASSER server was subsequently subjected to structural refinement and relaxation using the GalaxyRefine server (http://galaxy.seoklab.org/cgi-bin/submit.cgi?type=REFINE). This server improves the predicted structure by rebuilding and repacking side chains, followed by an overall relaxation of the structure via dynamics simulations. Five refined models are returned, each described by several quality scores including global distance test-high accuracy (GDT-HA), root mean square deviation (RMSD), MolProbity, Clash score, Poor rotamers, and Rama favored [35]. In the next step, the quality of the refinement was checked through Ramachandran plot analysis of Zlab (https://zlab.umassmed.edu/bu/rama/) and the ERRAT tool of the SAVES 6.0 server (https://saves.mbi.ucla.edu/). "ERRAT analyzes the statistics of non-bonded interactions among various atom types by comparison with statistics from highly-refined structures" [36,37]. Furthermore, Ramachandran analysis is directed towards protein structure confirmation through
For human CTL epitopes, the NetCTL 1.2 web server was used, which predicts CTL epitopes with respect to 12 major MHC supertypes (threshold 0.75). In order to cover up to 90% of human leukocyte antigen (HLA) in the global population, we utilized the five most frequent supertypes (A1, A2, A3, A24, and B7) for CTL epitope prediction [45]. As with mouse MHC-I epitopes, screening was performed regarding immunogenicity, allergenicity, and hydrophobicity. Moreover, human HTL epitopes were predicted using the IEDB MHC-II tool with respect to the full HLA reference set, along with antigenicity, allergenicity, IFN-γ, and IL-4 induction analyses.
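The selection step described above (five supertypes, score threshold 0.75) amounts to a simple filter over the server output. The following is a minimal sketch under stated assumptions: the record layout, the function name, and the inclusive comparison (score >= threshold) are illustrative choices rather than the NetCTL output format or decision rule, and the peptides and scores in the example are placeholders, not results from this study.

```python
from typing import NamedTuple


class Prediction(NamedTuple):
    peptide: str      # 9-mer peptide (placeholder strings below)
    supertype: str    # MHC class I supertype label
    score: float      # combined prediction score


# Supertypes and threshold as stated in the text.
SUPERTYPES = ("A1", "A2", "A3", "A24", "B7")
THRESHOLD = 0.75


def select_epitopes(predictions, supertypes=SUPERTYPES, threshold=THRESHOLD):
    """Keep predictions for the chosen supertypes at or above the threshold,
    sorted from highest to lowest score."""
    kept = [p for p in predictions
            if p.supertype in supertypes and p.score >= threshold]
    return sorted(kept, key=lambda p: p.score, reverse=True)


# Placeholder records (hypothetical peptides and scores, for illustration only).
example = [
    Prediction("AAAAAAAAA", "A1", 0.90),
    Prediction("CCCCCCCCC", "B27", 0.95),   # supertype not in the chosen set
    Prediction("DDDDDDDDD", "A2", 0.50),    # below threshold
    Prediction("EEEEEEEEE", "B7", 1.20),
]
for p in select_epitopes(example):
    print(p.peptide, p.supertype, p.score)
```

The retained peptides would then go through the immunogenicity, allergenicity, and hydrophobicity screens mentioned in the text.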

Secondary and Tertiary Structure Prediction.
Based on GOR IV and PSIPRED output, three secondary structure constituents were present in the sequence: 128 residues (54.24%) in random coil, 78 (33.05%) in alpha helix, and 30 (12.71%) in extended strand. The graphical representation of secondary structure prediction by the GOR IV and PSIPRED servers is illustrated in Figures 1 and 2, respectively. Subsequent homology modelling analysis by the I-TASSER server predicted five tertiary models based on the top 10 threading templates using the LOMETS threading approach. Each predicted model is assigned a C-score as a confidence index, ranging between -5 and 2, where higher C-score values indicate a higher confidence in prediction. The models provided by I-TASSER had C-scores from -4.3 to -5. Model number 1, with a C-score of -4.3, an estimated TM-score of 0.26 ± 0.08, and an estimated RMSD of 16.3 ± 3.0 Å, was chosen for further refinement and validation (Figure 3(a)).
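The percentages reported above follow directly from the per-state residue counts over the 236-residue sequence. As a quick arithmetic check, the sketch below computes the composition of a per-residue state string (C = coil, H = helix, E = strand). The state string is synthetic, built only to match the reported counts; the actual residue-by-residue assignment is not reproduced here, and the function name is hypothetical.

```python
from collections import Counter


def state_composition(states):
    """Percentage of residues in each secondary structure state (C, H, E)."""
    counts = Counter(states)
    total = len(states)
    return {state: round(100.0 * counts[state] / total, 2) for state in "CHE"}


# Synthetic string with the reported counts: 128 coil, 78 helix, 30 strand.
states = "C" * 128 + "H" * 78 + "E" * 30
print(len(states), state_composition(states))
```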

Prediction of Linear and Conformational B-Cell Epitopes
Multistep linear B-cell epitope prediction revealed nine peptides shared among three web servers (BCPREDS, ABCpred, and SVMTriP) with high antigenicity scores and good water solubility and without allergenic properties (Table 1 and Supplementary File 1). The linear B-cell epitopes predicted on the basis of physico-chemical parameters are tabulated in Table 2. In addition, the ElliPro tool of the IEDB server identified four conformational epitopes in the protein, with lengths and scores as follows: (i) 62 residues (0.731), (ii) 46 residues (0.707), (iii) 3 residues (0.562), and (iv) 3 residues (0.552) (Figure 4).
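One way to picture the cross-checking among the three servers is as an interval intersection: each server reports epitopes as residue ranges, and a shared region is a stretch of positions covered by all of them. The sketch below illustrates that idea only. The article does not state how "shared" was defined, so the intersection rule, the minimum length, the function names, and the example ranges are all assumptions made for illustration.

```python
def covered_positions(intervals):
    """Set of 1-based residue positions covered by inclusive (start, end) ranges."""
    positions = set()
    for start, end in intervals:
        positions.update(range(start, end + 1))
    return positions


def consensus_regions(predictions, min_length=6):
    """Contiguous regions covered by every predictor, at least min_length long."""
    if not predictions:
        return []
    shared = set.intersection(*(covered_positions(iv) for iv in predictions))
    regions = []
    run_start = prev = None
    for pos in sorted(shared):
        if prev is None or pos != prev + 1:
            # A gap closes the previous run; keep it if it is long enough.
            if prev is not None and prev - run_start + 1 >= min_length:
                regions.append((run_start, prev))
            run_start = pos
        prev = pos
    if prev is not None and prev - run_start + 1 >= min_length:
        regions.append((run_start, prev))
    return regions


# Hypothetical residue ranges for three predictors (not data from this study).
server_a = [(10, 25), (40, 60)]
server_b = [(15, 30), (45, 52)]
server_c = [(12, 22), (50, 70)]
print(consensus_regions([server_a, server_b, server_c]))
```

Lowering `min_length` keeps shorter overlaps; the resulting regions would still need the antigenicity, allergenicity, and solubility screening described in the text.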

Discussion
One-third of the global population is affected by the apicomplexan protist T. gondii, and its clinical significance is conspicuous in pregnant women and immunocompromised patients [46,47]. Despite decades of research in the field of toxoplasmosis vaccination using various antigens and immunization platforms, a human vaccine is still lacking [10]. In the molecular era, the combination of mathematical algorithms with large repositories of biological data has yielded the interdisciplinary science of bioinformatics. "Immunoinformatics" is a subdivision aimed at characterizing novel vaccine candidates, their biophysical features, and immunogenic epitopes, and at evaluating intermolecular interactions as well as the dynamics of host responses to an injected vaccine through in silico approaches. Several advantages are anticipated from this approach, including (i) time- and cost-effectiveness; (ii) a precisely targeted, durable immune response with the desired polarity in cellular components; and (iii) abolition of unfavorable responses through specific, epitope-based construct design [48]. The present in silico study was done to identify the bioinformatics features of the poorly characterized Toxoplasma TrOWP, along with potential CTL, HTL, and B-cell epitopes for future vaccine design. Toxoplasma oocysts are shed in felid feces in an unsporulated form and, under optimum climatic conditions (20-25°C, moist soil), turn into infective sporulated oocysts [49]. Early studies demonstrated that T. gondii infection was lacking on cat-free islands, emphasizing the significance of oocysts in transmission dynamics and infection maintenance [50,51]. Such infective stages are highly resistant to a wide range of temperatures (-20°C to +37°C) [52], salinity up to 15 ppt (parts per thousand) [53], and chemical inactivation agents used in water treatment [54], whereas they succumb to temperatures over 45°C as well as desiccation [55-57].
Oocysts therefore signal a possible health hazard in regions where drinking water treatment plants rely only on chemical disinfection without further filtration [54]. Insights into the structural conformation of oocyst wall components may hold the key to such a large extent of environmental resistance [14,15]. This insight was mostly gained from pioneering investigations on the oocyst walls of other apicomplexan members, including Cryptosporidium parvum and Eimeria species [13]. Proteomics analyses during the last decade led to the identification of approximately 225 proteins in fractions of the T. gondii oocyst wall, with a predominance of PAN-domain-containing, cysteine-rich, or tyrosine-rich proteins [15,58,59]. Tyrosine-rich proteins were initially discovered in Eimeria maxima within macrogamete type-2 wall-forming bodies and the inner layer of the oocyst wall, showing a high rate of conservation across Eimeria species [60-62]. Detailed transcriptomic and proteomic information suggested that such proteins are different in T. gondii, localizing in both layers of the oocyst wall [59,63]. Here, a TrOWP of T. gondii was further characterized using a set of bioinformatics web servers.
The basic features of a good vaccine candidate are antigenicity and lack of allergenicity; accordingly, TrOWP was shown to have adequate antigenic scores by the VaxiJen v2.0 (0.8769) and ANTIGENpro (0.879297) web servers. It was also classified as a nonallergenic protein by the AllerTOP v2.0 server, and the AlgPred server found no IgE-specific epitopes or MEME/MAST motifs. The ProtParam server revealed some of the critical physico-chemical parameters of the protein: this 236-amino-acid protein had a MW of 25.57 kDa, suggesting adequate immunogenicity, since molecules over 5-10 kDa are strong immunogens [64]. The pI is defined as the pH at which the net charge is zero; for this protein it was relatively acidic (4.85). Negatively charged residues (Asp + Glu) were more prevalent in the sequence than positively charged residues (Arg + Lys). According to the instability index of 29.51 and the aliphatic index of 54.24, TrOWP was a stable and moderately thermotolerant molecule. In fact, a higher aliphatic index indicates higher tolerance to a wide range of temperatures [65]. Moreover, the protein was highly hydrophilic in nature, as evidenced by a negative GRAVY score (-0.785). Protein solubility is another important factor in purification experiments. Two web servers, SOLpro and Protein-Sol, indicated that TrOWP is a highly soluble molecule, with scores of 0.623441 and 0.810, respectively. Altogether, such biophysical characteristics are fundamental for subsequent extraction/purification in laboratory experiments.
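The interpretations in the paragraph above rest on a few simple cutoffs, which can be made explicit. The sketch below applies them to the values reported in this article. The Protein-Sol cutoff (0.45) is the population average quoted in the Methods; the instability index cutoff of 40 is the conventional ProtParam rule and the pH 7 reference for "acidic" is a generic convention, neither of which is stated in this text, so both are labeled as assumptions. The function and key names are hypothetical.

```python
# Values reported for TrOWP in this article.
REPORTED = {
    "instability_index": 29.51,
    "aliphatic_index": 54.24,
    "gravy": -0.785,
    "pI": 4.85,
    "protein_sol": 0.810,
}

INSTABILITY_CUTOFF = 40.0    # assumption: conventional ProtParam stability cutoff
PROTEIN_SOL_AVERAGE = 0.45   # population average quoted in the Methods
NEUTRAL_PH = 7.0             # assumption: generic reference point for "acidic"


def interpret(values):
    """Translate reported scores into the qualitative labels used in the text."""
    return {
        "stable": values["instability_index"] < INSTABILITY_CUTOFF,
        "hydrophilic": values["gravy"] < 0.0,
        "acidic_pI": values["pI"] < NEUTRAL_PH,
        "above_average_solubility": values["protein_sol"] > PROTEIN_SOL_AVERAGE,
    }


for label, verdict in interpret(REPORTED).items():
    print(f"{label}: {verdict}")
```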
Synthesized proteins may be directed towards the cellular secretory pathway for several purposes, such as serving as an excretory/secretory antigen, virulence factor, and/or structural molecule; such proteins are marked by a signal peptide [66]. In the present in silico study, the SignalP-5.0 web tool was used to detect three types of signal peptide: "standard secretory signal peptides transported by the Sec translocon and cleaved by Signal Peptidase I (Sec/SPI), lipoprotein signal peptides transported by the Sec translocon and cleaved by Signal Peptidase II (Sec/SPII), and Tat signal peptides transported by the Tat translocon and cleaved by Signal Peptidase I (TAT/SPI)" [29]. TrOWP possessed only a standard signal peptide, with a probability of 0.9967. Newly synthesized proteins undergo several enzymatic modifications, including glycosylation, acetylation, palmitoylation, and phosphorylation, collectively known as posttranslational modifications (PTMs) [67]. Each modification has its own function, such as protein half-life alteration (glycosylation), signal transduction (phosphorylation), membrane anchoring (acetylation), and enhanced protein hydrophobicity (palmitoylation) [68]. Among the PTMs examined in the present study, only 26 phosphorylation sites and 13 acetylation sites were predicted in TrOWP, whereas palmitoylation and N-glycosylation sites were lacking. In general, prediction of PTMs, in particular for eukaryotic proteins, is beneficial for choosing appropriate expression hosts for recombinant protein production. Complex protein structures can be produced efficiently using yeast- and mammal-based expression machineries [68]. It is believed that "the presence of hydrogen bonds in a polypeptide chain between amino hydrogen and carboxyl oxygen represents the secondary structure, with frequent α-helices and β-structures" [69]. Moreover, the tertiary or 3D structure of a protein is defined by the involved bonds and their interactions.
High hydrogen-bond energy among α-helices and β-structures strongly maintains the protein conformation and, therefore, strengthens the possible interaction with antibodies [70]. Secondary structure prediction using two web servers, GOR IV and PSIPRED, showed a predominance of random coil (54.24%), followed by alpha helix (33.05%) and extended strand (12.71%). In the next step, the I-TASSER tool of the Zhang Lab automatically generated 3D models from the TrOWP amino acid sequence. "To select the final models, I-TASSER uses SPICKER program to cluster all the decoys based on the pair-wise structure similarity, and report up to five models which corresponds to the five largest structure clusters and a C-score of higher value signifies a model with a high confidence" [71]. Here, model number 1 was the best-fit model, based on the C-score of -4.3, estimated TM-score of 0.26 ± 0.08, and estimated RMSD of 16.3 ± 3.0 Å. Subsequently, the GalaxyRefine server was used to rebuild and repack side chains in order to improve the global quality of the structure. The outputs of this server for the selected model were GDT-HA (0.8951), RMSD (0.579), MolProbity (2.789), Clash score (34.5), Poor rotamers (1.1), and Rama favored (79.5). ERRAT and Ramachandran analyses also confirmed the improved quality of the refined model in comparison with the initial model.
Upon oocyst challenge, the gut mucosal barrier is the first site where the host senses the parasite, through innate lymphoid cell (ILC) types 1 and 3 [72]. ILC1 secretes IFN-γ and tumor necrosis factor alpha (TNF-α), while ILC3 activates CD4+-dependent T-cell responses in the lamina propria. Consequently, sensing of the infection through the so-called toll-like receptors (TLRs) summons macrophages and dendritic cells (DCs) to the infection site, where they produce a Th1-type cytokine, IL-12, to further elicit CD4+ and CD8+ T cells. Furthermore, natural killer (NK) cells are stimulated to secrete another Th1-type cytokine, IFN-γ, which is essential to limit the proliferation of parasites. Additionally, a Th2 response with IL-4 production is a vital marker for B-cell proliferation and differentiation as a specific humoral response [9]. On this basis, specific CTL (CD8+), HTL (CD4+), and linear B-cell epitopes of TrOWP were predicted using various web servers. Regarding linear B-cell epitopes, a cross-checking approach was performed using the BCPREDS, ABCpred, and SVMTriP servers to find common epitopic regions in the protein sequence. The final epitopes were further screened with respect to antigenicity, allergenicity, and water solubility. Accordingly, the following six peptides qualified as the best linear B-cell epitopes: "EEAAEPDE," "TNNEDEQ," "EPDEDKKDDS," "QGNDEHSSQ," "AGAPQNEVAAT," and "SQKLSFIECDCRK." Human CTL epitopes were predicted in the context of the most frequent MHC-I supertypes (A1, A2, A3, A24, and B7) to cover a large proportion of the global population. Likewise, mouse CTL epitopes were predicted with respect to six mouse MHC-I alleles (H2-Db, H2-Dd, H2-Kb, H2-Kd, H2-Kk, and H2-Ld). All of these peptides were screened regarding immunogenicity, allergenicity, and hydrophobicity.

Conclusion
In spite of over three decades of research in the field of vaccination against toxoplasmosis, finding the most efficient antigen(s) and immunization platform is still a global health challenge. Computer-aided tools help to identify immunostimulant regions of a given antigen and to better design multiepitope-based vaccine candidates for improved immunization. Previously, several surface and/or excretory/secretory antigens of T. gondii were evaluated using bioinformatics tools, whereas the lack of studies on oocyst wall proteins led us to select T. gondii TrOWP. The present study investigated the bioinformatics properties of TrOWP and its potent immunogenic mouse and human CTL/HTL epitopes to show its potential as a vaccine candidate. TrOWP was predicted to be a highly antigenic, nonallergenic, and soluble protein with several B-cell-specific and MHC-binding peptides in mouse and human, which highlights its significance for future vaccinology studies to prevent Toxoplasma infection, either alone or in multiepitope formulations, in experimental models.

Data Availability
The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest
The authors declare that there are no conflicts of interest.

Supplementary Materials
Table 1. Specific linear B-cell epitopes of T. gondii Tyrosine-rich Oocyst Wall Protein predicted by the BCPREDS web server (Threshold 75%).
Table 2. Specific linear B-cell epitopes of T. gondii Tyrosine-rich Oocyst Wall Protein predicted by the ABCpred web server (Threshold 0.75%).
Table 3. Specific linear B-cell epitopes of T. gondii Tyrosine-rich Oocyst Wall Protein predicted by the SVMTriP web server.