Computer-Aided Analysis of Multiple SARS-CoV-2 Therapeutic Targets: Identification of Potent Molecules from African Medicinal Plants

The COVID-19 pandemic, which started in Wuhan, China, has spread rapidly over the world with no known antiviral therapy or vaccine. Interestingly, traditional Chinese medicine helped in flattening the pandemic curve in China. In this study, molecules from African medicinal plants were analysed as potential candidates against multiple SARS-CoV-2 therapeutic targets. Sixty-five molecules from the ZINC database subset (AfroDb Natural Products) were virtually screened with some reported repurposed therapeutics against six SARS-CoV-2 and two human targets. Molecular docking, druglikeness, absorption, distribution, metabolism, excretion, and toxicity (ADMET) of the best hits were further simulated. Of the 65 compounds, only three, namely, 3-galloylcatechin, proanthocyanidin B1, and luteolin 7-galactoside found in almond (Terminalia catappa), grape (Vitis vinifera), and common verbena (Verbena officinalis), were able to bind to all eight targets better than the reported repurposed drugs. The findings suggest these molecules may play a role as therapeutic leads in tackling this pandemic due to their multitarget activity.


Introduction
Coronaviruses (CoVs) are members of Coronaviridae family belonging to the order Nidovirales. ey are enveloped viruses of about 60 to 140 nm in diameter with positive-sense single-stranded RNA genome (+ssRNA) classified into four genera, namely, alpha (α), beta (β), gamma (c), and delta (δ) [1]. ese viruses have spikes protruding from their surface, giving them a crown-like structure (hence, the name corona), which makes them bind to the human lower respiratory system [2]. Since the turn of the 21 st century, CoVs have caused several pandemics that are not only of significant public health concerns but also distort socioeconomic activities in infected regions [3]. In 2003, Severe Acute Respiratory Syndrome (SARS) was identified in Guangdong, China, with SARS-CoV as the causative pathogen, while the Middle East respiratory syndrome (MERS) caused by MERS-CoV resurfaced a decade later in Jeddah, Saudi Arabia [4]. ese zoonotic pathogens belong to the β-CoV genus, with pneumonia and acute respiratory distress syndrome (ARDS) as prominent symptoms [5]. In Wuhan, China, a novel pandemic initially known as 2019 novel coronavirus (2019-nCoV) was reported in December 2019. It was later called coronavirus disease (COVID-19) by the World Health Organization (WHO) on 11 February 2020 [6]. e disease exhibits similar symptoms with SARS, consequently making the International Committee on Taxonomy of Viruses (ICTV) name the viral pathogen severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) [7]. Irrespective of the numerous ways and policies to contain this pandemic, it has continued to spread worldwide with an increase in its related mortality. e WHO situation report as of 8 June 2020 shows 6,931,000 confirmed cases with 400,857 reported deaths worldwide. In Africa, 135,412 confirmed cases with 3,236 deaths were reported with the transmission classified majorly as community-based [8]. e novelty of this disease means there is no antiviral therapy or vaccine to combat it. Nonetheless, the severity of this disease has led to its prioritisation and increased research on the disease and the virus by researchers worldwide, leading to a better understanding of its aetiology, pathogenesis, management, and treatment [1]. Recently, lopinavir, ritonavir, remdesivir, chloroquine, hydroxychloroquine, camostat, and nafamostat, to mention a few, have been proposed as potential drug candidates that could be repurposed to combat this pandemic [9][10][11][12]. Traditional Chinese Medicine (TCM) and Ayurveda system of medicine have been attributed to play a part in the flattening of the pandemic curve in China with about 60,000 people treated with TCM [13,14]. eoretically, the secondary active metabolites in these natural products may have been responsible for the attributed success of TCM against COVID-19 in China.
Computational method approaches are important techniques which efficiently filter, screen, select, and identify potential leads for drug development from diverse chemical databases [15]. Numerous computational screening analyses have been carried out to screen and identify drug candidates with therapeutic and prophylactic potential against various proteins of SARS-CoV-2 from the ZINC database [16][17][18][19][20]. Network analysis leaning algorithm (machine learning, deep learning, and artificial intelligence (AI)) approach based on a fully connected neural network in combination with virtual screening methods as well as network-based meta-analysis has been utilised in investigating potential anti-SARS-CoV-2 leads from ZINC database [21][22][23][24][25]. e Unites States Food and Drug Administration-(USFDA-) approved drugs [26], drugbank [27,28], traditional Ayurvedic, Chinese and natural medicine [20,[28][29][30][31], dark chemical matter, and fooDB [25] are some of the ZINC database subsets that have been rigourously screened for molecules to combat SARS-CoV-2 with main protease, RNA-dependent RNA polymerase, and angiotensin-converting enzyme-2 as the major therapeutic targets. Despite the volume of research on computational screening analyses from different databases, there is a paucity of information on small molecules from African medicinal plants and other therapeutic targets that can help combat SARS-CoV-2. Hence, this study analysed a plethora of natural products (NPs) from African medicinal plants with known bioactivities in human as therapeutic candidates targeting and inhibiting SARS-CoV-2 RNA synthesis, replication, structural protein function, and host-specific receptors/enzymes.

Protein Retrieval, Preparation, and Binding Pocket
Prediction.

Structure Based-Virtual Screening and Molecular
Docking. iGEMDOCK v2.1 was used to screen the ligands in the binding pockets of the proteins with population size 300, 80 generations, and 10 solutions as the set screening parameters [42]. Top ranking ligands present across all eight proteins [8] were further subjected to molecular docking using AutoDock Vina [43]. Gasteiger charges were computed, polar hydrogen was assigned, and grid map was set at 1Å space for PLpro, 3CLpro, RdRp, helicase, 2OMT, S-RBD, ACE2, and TMPRSS2 protein crystal structures using Autodock 4.2 as shown in Table 1 before molecular docking was carried out [34,35].

Homology Modelling.
e SARS-CoV-2 helicase FASTA sequence (601 amino acid residues) which was inputted into SWISS-MODEL generated five helicase homology models from 5 templates, namely, 6jyt, 5wwp, 4non, 5ftf, and 6sje. However, the model built using 5wwp as template was selected based on its global model quality estimate, QMEAN, template resolution, sequence identity, similarity, and coverage over other templates (Table S2, Figure S1). In SWISS-MODEL, structural information of templates is extracted based on OpenStructure comparative modelling engine to provide complete stoichiometry and generate a 3D homology model structure [37]. According to Xiang [46], if the sequence identity similarity of 30% upward is shared between target and template, the homology model is considered dependable and successful. With a 72.2% similarity and QMEAN of −1.72, the selected model was not only successful but might have structural similarity and behaviour with experimental models [47]. Upon energy minimisation of the modelled helicase, a root-mean-square deviation (RMSD) score of 0.387 and 0.588Å was observed when superimposed with the modelled helicase and 5wwp, respectively ( Figure S2). is observation suggests the homology model has a high similarity and structural orderliness with the template as corroborated by studies on TMPRSS2 and dipeptidyl peptidase 4 [17,36]. e model also had good stereochemical quality as conferred by the Ramachandran plot with reduced noise as shown in Table S3 with a 3D verification score of 94.7% and quality factor of 88.14% further corroborating the reliability of the model ( Figures S3 and S4, respectively).

Structure-Based Virtual Screening and Molecular
Docking. In the course of drug discovery, structure-based virtual screening is a computational approach utilised to identify promising novel small chemical ligands from curated chemical compound databases with potential activity against drug targets [48]. also known as 3-galloylcatechin, proanthocyanidin B1, and luteolin 7-galactoside, respectively, appeared across all therapeutic targets as the 3 rd , 10 th , and 5 th respective ranking molecule for PLpro; 1 st , 3 rd , and 9 th respective ranking molecule for 3CLpro; 2 nd , 4 th , and 9 th respective ranking molecule for RdRp; 4 th , 1 st , and 3 rd respective ranking molecule for helicase; 3 rd , 4 th , and 1 st respective ranking molecule for 2OMT; 8 th , 2 nd , and 12 th respective ranking molecule for S-RBD; 10 th , 1 st , and 7 th respective ranking molecule for ACE2; and 4 th , 5 th , and 10 th respective ranking molecule for TMPRSS2 (Table S4). ese three compounds were further docked using Autodock Vina in the binding pocket of the therapeutic targets predicted by DogSiteScorer, as shown in Figure S5. Pocket and subpocket detection and analysis tools are useful for assessing the druggability of therapeutic targets. DoGSiteScorer detects druggable pockets using a support vector machine based on the integration of grid-based method and Gaussian filter difference [41]. Pocket detection is usually done prior to molecular docking simulations. e selected hit ligands exhibited better Autodock binding fitness than the proposed COVID-19 treatment drug candidates in the binding pocket of PLpro (−5.7, −4.1, and −4.6 kcal/mol, respectively), 3CLpro (−6.3, −5.1, and −5.6 kcal/mol, respectively), RdRp (−8.7, −9.8, and −8.5 kcal/mol, respectively), helicase (−9.8, −10.3, and −9.2 kcal/mol, respectively), 2OMT (−8.6, −10.5, and −9.6 kcal/mol, respectively), S-RBD (−5.2, −6.4, and −6.1 kcal/ mol, respectively), ACE2 (−8.5, −9.7, and −8.9 kcal/mol, respectively), and TMPRSS2 (−8.4, −8.9, and −8.9 kcal/mol, respectively) ( Table 2). Interestingly, none of them were able to target the S-RBD-ACE2 interface (result not shown). In AutoDock Vina, ligands with more electronegative binding energy are ranked higher and believed to have better binding fitness [43]. 3-Galloylcatechin, proanthocyanidin B1, and luteolin 7-galactoside all had lower AutoDock Vina scores than the proposed standards in most of the targets, suggesting better binding fitness, lower inhibition constant, and better experimental activity values [49]. 3-Galloylcatechin is an extremely weak base flavonoid with almond (Terminalia catappa) and grapes (Vitis vinifera) as rich sources [50]. Proanthocyanidin B1-like 3-galloylcatechin is also a flavonoid made up of (−)-epicatechin and (+)-catechin units, joined by a β-interflavanyl bond. It is found in cocoa powder and grapes at high concentration [51]. Luteolin 7galactoside is a flavonoid glycoside found in fruits, herbs, and spices such as common verbena (Verbena officinalis) [52]. e antiviral activity of 3-galloylcatechin and proanthocyanidin B1 has previously been reported [53,54]. Ghosh and Chakraborty [55] and Nguyen and Woo [56] in their respective studies have identified gallocatechin gallate, a closely related molecule to 3-galloylcatechin, as a candidate molecule against SARS-CoV 3CLpro. Competitive inhibition of this enzyme was the proposed mode of gallocatechin gallate action. e inhibitory activity of procyanidin B1 on SARS-CoV infection has been reported by Zhuang et al. [57]. ough procyanidin B1 mode of action was not elucidated, transferrin receptor and ACE2 expression were found not to play a role in the inhibitory property. Luteolin 7-galactoside, on the other hand, does not have any reported antiviral activity, but rather its effectiveness in treating sore throats, respiratory tract diseases, and its anti-inflammatory activity is well established [58,59]. e carbohydrate moiety stereoisomer, luteolin 7-glucoside, on the other hand, has been reported to possess antiviral activity [60,61]. Luteolin was recently reported to possess possible strong antiviral activity against SARS-CoV-2 as a result of its low binding energy in the active site of some therapeutic targets [62].
Molecular docking simulation predicts the binding properties and interactions between a ligand and its target [63]. Looking at the binding mode and molecular interaction of the hit ligands in the binding pocket of SARS-CoV-2 therapeutic targets (PLpro, 3CLpro, RdRp, helicase, 2OMT, S-RBD, ACE2, and TMPRSS2) as simulated by Autodock Vina, conventional hydrogen bonds, carbon-hydrogen bond, Van der Waal forces, and various pi (π) interactions were responsible for stabilising the ligand interactions (Figures 1-16). HIS1017 and CYS1015 were noteworthy common amino acid residues that interacted via hydrogen bond and Van der Waal interaction, respectively, with all the hit ligands and proposed drugs in the binding site of PLpro (Figure 1). In the binding pocket of 2OMT as depicted in Figures 9-10, ASN101 (hydrogen bond), GLY71, ASP133 (Van der Waal forces), LEU100, and MET131 (π interactions) were the common amino acid residues that interacted with the hit molecules and proposed standards to stabilise them in the binding site while ASP618 (Van der Waal interaction) was synonymous in the binding site of RdRp (Figures 7-8). ese amino acid residues could be targeted as possible therapeutic targets in the inhibition of these proteins due to their noncovalent stabilising activity in the druggable pocket [64]. HIS contains an imidazole side chain with acid-base properties which is essential in the catalytic mechanism of enzymes where it is found [65]. For 3CLpro, helicase, S-RBD, ACE2, and TMPRSS2, the amino acid residues responsible for stabilising the molecules in the binding site were not consistent for the ligands and proposed standards (Figures 3-6 and 11, 12-15, and 16, respectively).
All ligands and proposed standards were able to bind properly in the binding pocket of the therapeutic targets as predicted by DogSiteScorer after the docking simulation by Autodock Vina (Figure S5). On the other hand, luteolin 7galactoside and ritonavir bound to the allosteric site of PLpro with ritonavir extending into the active site while lopinavir as well as ritonavir bound to the allosteric site of 3CLpro.
is binding pose could be attributed to their chemical structures requiring a more substantial binding pocket. Chloroquine (S-RBD and ACE2) and hydroxychloroquine (TMPRSS2) were also found to bind to allosteric sites after docking ( Figures S6 and S7, respectively). e binding of hydroxychloroquine in the positively charged allosteric site of TMPRSS2 was similar to earlier findings of remdesivir binding with TMPRSS2 where hydrogen bonds with ASN and ARG were also formed [20]. It is of interest that the high binding energies of 3-galloylcatechin, proanthocyanidin B1, and luteolin 7-galactoside to ACE2 and TMPRSS2 could have clinical implication in the prevention of SARS-CoV-2 transmission [66]. Various studies have reported other phytomolecules present in African medicinal plants that have exhibited potential activity as antagonists against SARS-CoV-2 by interfering with different therapeutic targets [67,68]. ese findings further buttress the role of NPs and their identified bioactives in tackling the COVID-19 pandemic.

Physicochemical, Pharmacokinetic, and Toxicity Evaluation of Hit Ligands.
e physicochemical properties of the hit ligands are presented in Table 3. Interestingly, none of the hit molecules were able to pass Lipinski's druglikeness test with 3-galloylcatechin, proanthocyanidin B1, and luteolin 7galactoside violating 1, 3, and 2 rules, respectively. ey also violated Ghose, Veber, Egan, Muegge, and leadlikeness      [36]. e druglikeness failure exhibited by the compounds is not surprising as NPs have been reported to fail Lipinski's druglikeness test, which is attributed to their mechanism of absorption [73]. NPs have been reported to be bioavailable by exploiting complex mechanisms such as active transport unlike synthetic drugs that utilise passive diffusion [74]. is is because NPs look a lot like biosynthetic intermediates and endogenous metabolites than synthetic compounds [75]. ey also differ in terms of elemental composition and stereochemical complexity [76]. Interestingly, many NPs that violate RO5 have been reported to remain bioavailable by Lipinski [77]. Advances in synthetic biology and organic synthesis methodology such as biosynthetic gene cluster manipulation, total synthesis, semisynthesis, or a combination of these methods have identified a new generation of natural product scaffolds that can be systematically targeted, to increase the activity, decrease the toxicity, and/or improve the physicochemical and pharmacokinetic properties [78]. is can also be applied as done in other natural leads during synthesis and lead optimization to improve the druglikeness of the compounds [79,80]. e synthetic accessibility score of the compounds ranged from 4.16 to 5.32. e score which ranges from 1 to 10 is based primarily on the assumption that molecular fragment frequency in easily obtainable molecules actually correlate with the ease of synthesis [44]. e observed values for the hit ligands suggest their fragmental contribution and chemical moieties should be moderately favourable for synthetic synthesis in the pharmaceutical industry leading to potential drug discovery outcome [81][82][83].
In the course of drug discovery, compounds with predicted favourable pharmacokinetic and toxicity properties have the potential to pass standard clinical trial criteria's making them drug candidates [84]. Looking at the absorption properties presented in Table 4, none of the hit compounds were permeable to Caco-2 and human intestine.   ey were also not P-glycoprotein substrates, inhibitors, and bioavailable at 30%. However, 3-galloylcatechin was predicted to inhibit the P-glycoprotein, permeate the blood-brain barrier, and become bioavailable at 20%. In contrast, proanthocyanidin B1 and luteolin 7-galactoside could permeate the blood-brain barrier and become bioavailable at 20%, respectively. ese compounds are not P-glycoprotein substrates, ensuring their bioavailability and intracellular concentration are not affected [85]. A look at the distribution properties showed 87.287, 76.369, and 78.270% as the plasma protein binding and −1.129, −0.720, and −1.028 L/kg as the volume distribution for 3-galloylcatechin, proanthocyanidin B1, and luteolin 7-galactoside, respectively. ese hit compounds were predicted to be neither inhibitors nor substrate of the various cytochrome P 450 (CYP 450 ) isoforms. 3-Galloylcatechin was nonetheless the only exception inhibiting CYP3A4 and CYP2C19 isoforms (Table 4). is inhibition could lead to the accumulation of drugs metabolised by these CYP isoforms and, hence, drug toxicity may occur [86]. A half-life of 1.534, 2.11, and 1.483 hours as well as a clearance rate of 1.204, 1.015, and 1.232 mL/min/kg were the respective elimination properties of 3-galloylcatechin, proanthocyanidin B1, and luteolin 7-galactoside (Table 4). Toxicity property presented in Table 4 showed these compounds met the maximum recommended daily dose by the U.S. Food and Drug Administration without being hepatotoxic, skin sensitisers, and mutagenic. ey were found to be human ether-a-go-go-related (hERG) gene blocker with the ability to induce drug liver injury. On the other hand, 3-galloylcatechin was found to be mutagenic, while luteolin 7-galactoside was not a predicted hERGrelated gene blocker. Compounds identified as inhibitors of CYP isoforms, hERG blockers, and AMES mutagen can be optimised by the addition of analogues to their cores during optimization process to avoid the development of long QT syndrome and mutagenicity and overcome the few lapses in the pharmacokinetic properties [78]. ese identified hits, however, possess good ADMET properties while meeting the maximum recommended daily dose of USFDA.

Conclusion
In this study, computer-aided analysis was utilised to identify 3-galloylcatechin, proanthocyanidin B1, and luteolin 7-galactoside (ZINC 3978503, ZINC 5085289, and ZINC 40422816, respectively) found in some medicinal plants (almond (Terminalia catappa), grape (Vitis vinifera), and common verbena (Verbena officinalis)) of African flora as hit compounds against multiple SARS-CoV-2 targets. ese compounds could be possible leads and nutraceuticals involved in the treatment or as prophylaxis of COVID-19. e scaffolds of these compounds can be optimised to improve the few lapses in its metabolism and toxicity. e results further suggest these compounds will help to overcome in some degree the old paradigm "one gene, one drug, one disease" of drug discovery. Nonetheless, further in vitro, in vivo, and clinical research is required to validate the pharmacotherapeutic significance of these hit compounds as anti-SARS-CoV-2 therapy.

Data Availability
e data used to support the findings of this study are included within the article and supplementary files.

Conflicts of Interest
e authors declare no conflicts of interest regarding the publication of this paper.