Molecular Characterization and In Silico Analyses of Maurolipin Structure as a Secretory Phospholipase A2 (sPLA2) from Venom Glands of Iranian Scorpio maurus (Arachnida: Scorpionida)

Venom is a mixture of various compounds with specific biological activities, such as the phospholipase A2 (PLA2) enzyme present in scorpion venom. PLA2 plays a key role in inhibiting ryanodine receptor channels and has neurotoxic activity. This study is the first investigation of the molecular characterization, cloning, and in silico analyses of PLA2 from Iranian Scorpio maurus, named Maurolipin. After RNA extraction from S. maurus venom glands, cDNA was synthesized and amplified through RT-PCR using specific primers. Amplified Maurolipin was cloned into the TA cloning vector pTG19. For in silico analyses, the characterized gene was analyzed using different software tools. The Maurolipin coding gene, 432 base pairs in length, encoded a protein of 144 amino acid residues and 16.34 kilodaltons. Comparing the coding sequence of Maurolipin with other characterized PLA2 from different species of scorpions showed that this protein was a member of the PLA2 superfamily. According to the SWISS-MODEL prediction, Maurolipin had 38.83% identity with bee venom PLA2; Phyre2 predicted 39% identity with the insect phospholipase A2 family, with 100% confidence. According to the three-dimensional structure prediction, Maurolipin, with five disulfide bonds, is highly similar in structure to PLA2 of the group III subfamily. The in silico analyses showed that the phospholipase A2 coding gene and protein structure differ according to scorpion species and the geographical conditions in which they live.

Iran, located in the Middle East, has a climate well suited to scorpions [28]. According to previous studies, Iran ranks among the highest countries in the world for scorpion envenomation [29].
The latest research has shown that Iran possesses 68 scorpion species comprising 19 genera and four families [30]. The data indicate that the south and southwest of Iran host about 95% of these species and are therefore regarded as densely populated areas [31]. Fars Province, located in the southwest of Iran, has hot and humid weather, and scorpions are an important public health problem in this region [32]. Scorpio maurus, a species of the Scorpionidae family, is a notable species in this province [33].

Objectives
This study aimed to identify PLA2 from the venom glands of the Iranian scorpion S. maurus, based on molecular characterization and in silico analyses.

Scorpion Collection.
The scorpions were collected from Zarrin Dasht County, Fars Province, in the southwest of Iran and transferred alive to the Medical Entomology laboratory at Shiraz University of Medical Sciences, Shiraz, Iran. The morphological characteristics of this species are shown in Figures 1(a)-1(c). Samples were identified using a valid key [30]. Before RNA extraction, the venom of the collected scorpions was milked manually. Three days after venom milking, the telson of one scorpion was separated and stored at −70°C. The remaining body parts were stored in 96% ethanol and kept in the archives of the Museum of the Department of Medical Entomology at Shiraz University of Medical Sciences.

RNA Extraction.
Total RNA was extracted from the venom glands of one S. maurus telson using a High Pure RNA Isolation Kit (Roche®). The RNA sample was treated with DNase according to the manufacturer's manual. The concentration of the extracted RNA was measured using a Nanodrop (Analytik Jena®).

Reverse Transcription Polymerase Chain Reaction (RT-PCR).
According to the manufacturer's instructions, 3 μL of total RNA was used as a template for cDNA synthesis with AccuPower® CycleScript RT Premix (dN6) (Bioneer Company, Korea). The RNA template (0.1 to 1 μg) was brought up to a 20 μL volume with sterile water and dissolved by vortexing. The cDNA synthesis reaction was performed in four steps according to the manufacturer's manual: 30 sec at 25°C for primer annealing, 4 min at 45°C for cDNA synthesis, 30 sec at 55°C for melting secondary structure and cDNA synthesis, and 5 min at 95°C for heat inactivation. The synthesized cDNA was kept at −70°C as a template for the amplification steps. The desired DNA sequences were amplified in a total volume of 20 μL containing 10 μL of Taq DNA Polymerase Master Mix RED (2X), 1 μL of forward primer, 1 μL of reverse primer, 1 μL of synthesized cDNA, and 7 μL of sterile water. Amplification was performed for 35 cycles of 30 sec at 94°C for denaturation, 30 sec at 50°C for annealing, and 30 sec at 72°C for extension, followed by 10 min at 72°C for a final extension.
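The reaction composition and cycling program above can be checked with a few lines of arithmetic. The following Python sketch is illustrative only: the function names and the `overage` parameter are hypothetical, and the time estimate ignores ramping between temperatures and any initial denaturation step, neither of which is described in the text.

```python
# Component volumes (in microliters) of one 20 uL PCR reaction, as listed in the text.
PCR_MIX_UL = {
    "Taq DNA Polymerase Master Mix RED (2X)": 10.0,
    "forward primer": 1.0,
    "reverse primer": 1.0,
    "synthesized cDNA": 1.0,
    "sterile water": 7.0,
}


def total_volume(mix=PCR_MIX_UL):
    """Sum of all component volumes for a single reaction."""
    return sum(mix.values())


def scale_mix(n_reactions, overage=0.0, mix=PCR_MIX_UL):
    """Scale each component for n reactions.

    `overage` is a hypothetical pipetting allowance (0.1 means 10% extra);
    it is not part of the published protocol.
    """
    factor = n_reactions * (1.0 + overage)
    return {name: round(volume * factor, 2) for name, volume in mix.items()}


def cycling_seconds(cycles=35, denature=30, anneal=30, extend=30, final_extension=600):
    """Hold time of the program: cycles x (three 30 s steps) plus final extension."""
    return cycles * (denature + anneal + extend) + final_extension


if __name__ == "__main__":
    print("Volume per reaction (uL):", total_volume())
    print("Hold time (min):", cycling_seconds() / 60)
```

With the published values, the component volumes add up to the stated 20 μL, and the hold times alone come to a little over one hour.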
3.5. TA Cloning. PCR products were run on a 1% tris-borate-EDTA (TBE) agarose gel with an appropriate DNA ladder (100 bp). The specific band was visualized in the gel documentation instrument, and the selected band was purified according to the protocol of the GeneAll® kit. The purified PCR product was ligated into the linearized pTG19 vector with a TA cloning kit (Vivantis®) according to the manufacturer's instructions, at a 1:3 ratio. Escherichia coli strain DH5α competent cells were prepared before the ligation reaction and stored at −70°C. Ligation mixtures were transformed into the prepared competent cells by the heat shock method. Blue-white screening was used on Luria-Bertani (LB) agar plates containing ampicillin (100 μg/ml), 5-bromo-4-chloro-3-indolyl-β-D-galactopyranoside (X-Gal) (40 μg/ml), and isopropyl β-D-1-thiogalactopyranoside (IPTG) (40 μg/ml) to detect recombinant colonies. Each white colony was suspended in 30 μl of sterile water and boiled for 10 min; 1 μl of this suspension was used as the DNA template in a PCR reaction to identify recombinant colonies containing the target insert, using specific and universal M13 primers. After the recombinant colonies were detected by PCR, plasmid DNA was extracted using the GeneAll® plasmid isolation kit according to the manufacturer's manual. The Maurolipin coding gene was sequenced using universal M13 forward and reverse primers.
Apis mellifera (GenBank: EF373554.1), and Homo sapiens (GenBank: M86400.1) by utilizing the maximum likelihood method [34]. Multiple sequence alignment for evolutionary analyses was conducted in MEGA7 software based on the Clustal W method [35]. The percentage of replicate trees in which the associated taxa clustered together in the bootstrap test (1000 replicates) was shown next to the branches [36].

In Silico Analyses
3.7.1. Structural Characteristics of Maurolipin. MEGA software (Version 7.0) was used for the sequence alignments. All primers were designed with Gene Runner (Version 4) and Oligo 7 software. Primer specificity was determined using Primer-BLAST on NCBI (https://blast.ncbi.nlm.nih.gov/Blast.cgi). To identify the structural features of Maurolipin and confirm that the obtained sequence belonged to the phospholipase protein family, the Maurolipin coding sequence was translated with Gene Runner software (Version 4.0), and the deduced sequence was evaluated with several tools. Protein BLAST (https://www.ncbi.nlm.nih.gov) was performed to identify closely related proteins and record their characteristics. To compare and classify protein structure, similar proteins were chosen for alignment with Clustal Omega (https://www.ebi.ac.uk/Tools/msa/clustalo/). The amino acid composition of Maurolipin was analyzed with the ProtParam online tool (https://web.expasy.org/protparam/). Disulfide bridges in the target protein were predicted with the DiANNA 1.1 web server (https://clavius.bc.edu/~clotelab/DiANNA/) [37,38]. Disulfide bridge formation is essential for biological activity in many proteins [39].
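As a rough illustration of the translation and composition steps described above (performed in the study with Gene Runner and ProtParam), the following Python sketch translates a coding sequence with the standard genetic code and counts its residues. It is not the software used by the authors. The function names and the short toy sequence are hypothetical, and the Maurolipin sequence itself is not reproduced here.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order ("*" marks a stop codon).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def codon_count(n_bases):
    """Number of complete codons in a coding sequence of n_bases nucleotides."""
    if n_bases % 3 != 0:
        raise ValueError("coding sequence length is not a multiple of three")
    return n_bases // 3


def translate(cds):
    """Translate a DNA coding sequence, stopping at the first stop codon."""
    cds = cds.upper().replace("U", "T")
    protein = []
    for i in range(0, len(cds) - len(cds) % 3, 3):
        residue = CODON_TABLE[cds[i:i + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


def composition(protein):
    """Residue counts, analogous to the composition table reported by ProtParam."""
    return Counter(protein)


if __name__ == "__main__":
    toy_cds = "ATGTGTTGTCGTACTCATGATTAA"  # hypothetical toy sequence, not Maurolipin
    toy_protein = translate(toy_cds)
    print(toy_protein, dict(composition(toy_protein)))
    print("Codons in a 432 bp coding sequence:", codon_count(432))
```

The last line reproduces the relationship reported in the study: 432 nucleotides correspond to 144 codons.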

Three-Dimensional Structure Prediction.
To predict the three-dimensional (3D) structure, the SWISS-MODEL online tool (https://swissmodel.expasy.org/) and Phyre2 (https://www.sbg.bio.ic.ac.uk/phyre2/html/page.cgi?id=index) were used; both predict the 3D structure of a query protein through sequence alignment with template proteins. This model was chosen for superimposition. The predicted 3D structure was evaluated with UCSF Chimera software (Version 1.14).

Active Site Structure.
PLA2 cleaves the sn-2 position of the glycerol backbone of phospholipids, mainly in a metal-dependent reaction, to produce a lysophospholipid (LysoPL) and a free fatty acid (FA) [10,11]. Superimposition of the active site was carried out with UCSF Chimera (Version 1.14) and DeepView/Swiss-PdbViewer (Version 4.10) software. The root-mean-square deviation (RMSD) of the active-site residues of bee venom PLA2 and Maurolipin was calculated with Chimera software (Version 1.14) to measure the average distance between corresponding atoms in the two protein chains, based on alpha-carbon atoms. Active site prediction was undertaken with ExPASy-PROSITE (https://prosite.expasy.org/).
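The RMSD measure described above can be sketched in a few lines of Python. This is a minimal illustration, assuming that the two sets of alpha-carbon coordinates have already been paired residue by residue and superimposed (in the study, both steps were handled by Chimera). The function name and the coordinates in the usage example are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally long lists of (x, y, z) points.

    Assumes the points are already paired (atom i in A corresponds to atom i in B)
    and already superimposed; no fitting is performed here.
    """
    if len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must have the same length")
    if not coords_a:
        raise ValueError("coordinate lists must not be empty")
    squared_sum = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared_sum += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared_sum / len(coords_a))


if __name__ == "__main__":
    # Hypothetical alpha-carbon coordinates (in angstroms) for two short fragments.
    fragment_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    fragment_b = [(0.0, 0.0, 0.0), (1.0, 0.0, 2.0)]
    print(round(rmsd(fragment_a, fragment_b), 3))
```

Identical coordinate sets give an RMSD of zero, and the value grows as corresponding atoms move apart.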

Characterization of Maurolipin Coding Sequence.
To identify the Maurolipin coding sequence, RT-PCR was performed with specific primers on the synthesized complementary DNA (cDNA), yielding a fragment close to the expected size of 432 bp (Figure 1(d)). In colony PCR, the expected size of the product amplified with universal M13 primers was 580 bp (Figure 2). The Maurolipin coding sequence was deposited in GenBank under accession number MW241004. The Maurolipin protein-coding sequence was assessed using protein BLAST alignment.

Phylogenetic Analysis.
Initial tree(s) for the heuristic search were obtained automatically by applying the Neighbor-Join and BioNJ algorithms to a matrix of pairwise distances estimated using the maximum composite likelihood (MCL) approach and then selecting the topology with the superior log likelihood value. The tree was drawn to scale, with branch lengths measured in the number of substitutions per site (scale bar, 0.1). The close relationship among the insect PLA2 sequences was quite evident. In addition, the phylogenetic analyses showed that the enzyme detected in humans differed significantly from that of insects (Figure 3).
Table 1. The predicted disulfide bonds in five positions were between (Cys8 and Cys45), (Cys25 and Cys46), (Cys52 and Cys75), (Cys77 and Cys84), and (Cys115 and Cys131) (Figure 5).
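The five predicted disulfide bonds listed above can be checked for internal consistency with a short Python sketch: five bonds should involve ten distinct cysteines, with no residue used twice. The function names are hypothetical, and the sketch only examines the reported positions; it does not predict bonds the way DiANNA does.

```python
# Predicted disulfide bonds in Maurolipin, as residue-number pairs from the text.
DISULFIDE_PAIRS = [(8, 45), (25, 46), (52, 75), (77, 84), (115, 131)]


def paired_cysteines(pairs):
    """Return the sorted cysteine positions, raising if any position is used twice."""
    positions = [position for pair in pairs for position in pair]
    if len(set(positions)) != len(positions):
        raise ValueError("a cysteine appears in more than one disulfide bond")
    return sorted(positions)


def sequence_separations(pairs):
    """Number of residues separating the two cysteines of each bond."""
    return [second - first for first, second in pairs]


if __name__ == "__main__":
    print("Cysteines involved:", paired_cysteines(DISULFIDE_PAIRS))
    print("Separations:", sequence_separations(DISULFIDE_PAIRS))
```

The separations show that four of the five bonds link cysteines more than 15 residues apart, while Cys77 and Cys84 form a short local loop.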

4.4.2. Three-Dimensional Structure Prediction. The three-dimensional (3D) structure predicted by SWISS-MODEL showed that Maurolipin resembled chain A of bee venom PLA2, group III subfamily, with 38.83% identity (Figure 6). The superimposition of the 3D structures of bee venom and S. maurus PLA2 is shown in Figure 7. The 3D structure predicted by Phyre2 indicated that the target sequence was close to the insect PLA2 family, with 100% confidence and 39% identity. Maurolipin was also similar to vertebrate PLA2, with 97.3% confidence and 33% identity.
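The identity percentages reported above come from SWISS-MODEL and Phyre2. As a hedged illustration of what such a figure measures, the sketch below computes percent identity over the gap-free columns of a pairwise alignment. Tools differ in which denominator they use, so this definition is an assumption and need not reproduce the published values. The function name and the toy alignment are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of identical residues over alignment columns without a gap.

    Both inputs must be rows of the same alignment (equal length).
    Columns where either sequence has a gap are skipped.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for residue_a, residue_b in zip(aligned_a, aligned_b):
        if residue_a == gap or residue_b == gap:
            continue
        compared += 1
        if residue_a == residue_b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Hypothetical toy alignment, not the Maurolipin / bee venom PLA2 alignment.
    print(percent_identity("ACDE-G", "ACEEFG"))
```

In the toy alignment, five columns are compared and four match.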

Active Site Structure.
Comparison of the Maurolipin sequence with similar characterized proteins revealed a conserved catalytic site (active site), which is common in the PLA2 superfamily of secretory and cytosolic enzymes. The conserved domain of Maurolipin spanned amino acid positions 17 to 118. According to the ExPASy-PROSITE online tool, the Maurolipin active site was located at amino acid positions 45 to 52, which was conserved among different species of scorpions (Figure 4). The active sites of Maurolipin and bee venom PLA2 were compared by calculating the root-mean-square deviation (RMSD), which is reported in Table 2. The catalytic domain included the "CCRTHDXC" motif and the Ca2+-binding domain that supports the active site in the PLA2 superfamily.
Figure 3: Phylogenetic relationship between different species of scorpions, Apis mellifera, and Homo sapiens based on PLA2 coding sequences. The accession number of each sequence is shown in front of its name. The percentage of replicate trees in which the related taxa clustered together in the bootstrap test (1000 replicates) is displayed next to the branches. The scale bar corresponds to 0.1 substitutions per nucleotide.
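A motif such as "CCRTHDXC" can be located in a protein sequence with a simple regular expression. The sketch below assumes that "X" stands for any single residue, which is the usual convention but is not stated explicitly in the text. It is not the PROSITE scanning procedure, and the function name and toy sequence are hypothetical.

```python
import re

# "CCRTHDXC" with X read as any single residue (an assumption about the notation).
ACTIVE_SITE_MOTIF = re.compile(r"CCRTHD.C")


def find_motif(protein):
    """Return 1-based (start, end) positions of each motif occurrence, end inclusive."""
    return [
        (match.start() + 1, match.end())
        for match in ACTIVE_SITE_MOTIF.finditer(protein.upper())
    ]


if __name__ == "__main__":
    toy_protein = "AAKCCRTHDGCYL"  # hypothetical toy sequence, not Maurolipin
    print(find_motif(toy_protein))
```

The motif is eight residues long, the same length as the reported active-site window at positions 45 to 52. Its cysteines fall at the first, second, and eighth positions, which matches the placement of Cys45, Cys46, and Cys52 in the predicted disulfide bonds.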

Prediction of Protease Cleavage Sites.
Based on PROSPER, three protease families predicted to cleave the Maurolipin sequence were identified and are shown in Table 3. According to the PeptideCutter web server, the enzymes caspase-1 to -10, enterokinase, factor Xa, granzyme B, and thrombin are predicted not to cut the Maurolipin sequence.

Discussion and Conclusion
Prediction of protein structure is a focus of interest for many investigators. The current study is the first investigation of the molecular characterization and in silico analyses of Maurolipin structure from the venom glands of Iranian S. maurus, and it adds to the literature on the molecular characterization of the PLA2 coding gene from this species. Phospholipase A2 (PLA2) plays a relevant role in the inflammatory process; it catalyzes the hydrolysis of phospholipids at the sn-2 position of the glycerol backbone and releases fatty acids and lysophospholipids [40,41]. Group III PLA2 has been identified from various sources such as reptiles [42,43], mammals [44], parasites [45], and arthropods including scorpions [20,23]. The coding sequence of PLA2 was detected from S. maurus venom glands for the first time in Iran. The detected PLA2 gene encodes a protein of 144 amino acid residues named Maurolipin. To date, several studies have reported genes encoding phospholipases A2 from different species of scorpions, including Hemilipin from H. lepturus [23], Leptulipin from H. lepturus [25], Imperatoxin I and Phospholipin from P. imperator [21,22], Phaiodactylipin from A. phaiodactylus [20], Heteromtoxin from Heterometrus laoticus [10], phospholipase A2 from H. fulvipes [46], MtPLA2 from Mesobuthus tamulus [47], Sm-PLVG from Tunisian S. maurus [27], and phospholipase A2 from Hadrurus gertschi [48]. The results of BLAST in NCBI showed that the sequence detected from Iranian Scorpio maurus has very high similarity to the corresponding sequence of Tunisian S. maurus (GenBank: MF347455.1) [49]. However, its similarity to the sequences of other scorpion species is very low. Comparing the coding sequence of Maurolipin with other characterized PLA2 from different species of scorpions showed that Maurolipin was a member of the PLA2 superfamily.
Phylogenetically, there is a weak relationship between groups I, II, and III of phospholipase A2, but they are quite similar at the calcium-binding site and the active site region [50,51]. In the current study, the maximum likelihood analysis showed that the characterized PLA2 from Iranian S. maurus and those of other arthropods, scorpions, and Apis mellifera (GenBank: EF373554.1) are well clustered. In contrast, the one from Homo sapiens (GenBank: M86400.1) is distantly located, similar to the phylogenetic analyses of A. phaiodactylus PLA2 [51]. The difference of 13 amino acids between the sequences encoded by Maurolipin and Sm-PLVG is most likely due to differences in the geographical conditions in which the scorpions live, because the structure of a protein reflects its genetic sequence. The residues Asp2, Leu9, Phe17, Glu30, Glu32, Ser67, Pro69, Met86, Asp90, Thr95, Asn102, Lys109, and Tyr110 of Maurolipin were Val, Ser, Leu, Val, Lys, Phe, Phe, Thr, Asn, Asp, Asp, Asp, and Asn in Sm-PLVG (GenBank: AVD99009.1), respectively. Indeed, the differences in amino acid residues are greater among different species of scorpions.
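The 13 residue differences listed above can be written in conventional substitution notation with a short Python sketch. Two points are assumptions rather than statements from the study: the eighth Sm-PLVG residue is taken to be Thr, and the identity estimate assumes that both proteins are 144 residues long and align without gaps. The names used in the code are hypothetical.

```python
# One-letter codes for the residues that occur in the comparison.
THREE_TO_ONE = {
    "Asp": "D", "Leu": "L", "Phe": "F", "Glu": "E", "Ser": "S", "Pro": "P",
    "Met": "M", "Thr": "T", "Asn": "N", "Lys": "K", "Tyr": "Y", "Val": "V",
}

# Maurolipin residues and positions, in the order given in the text.
MAUROLIPIN = [
    ("Asp", 2), ("Leu", 9), ("Phe", 17), ("Glu", 30), ("Glu", 32), ("Ser", 67),
    ("Pro", 69), ("Met", 86), ("Asp", 90), ("Thr", 95), ("Asn", 102),
    ("Lys", 109), ("Tyr", 110),
]

# Corresponding Sm-PLVG residues (eighth entry read as Thr, an assumption).
SM_PLVG = [
    "Val", "Ser", "Leu", "Val", "Lys", "Phe", "Phe", "Thr", "Asn", "Asp",
    "Asp", "Asp", "Asn",
]


def substitutions():
    """Substitutions written as <Maurolipin residue><position><Sm-PLVG residue>."""
    return [
        f"{THREE_TO_ONE[residue]}{position}{THREE_TO_ONE[other]}"
        for (residue, position), other in zip(MAUROLIPIN, SM_PLVG)
    ]


def estimated_identity(length=144, differences=len(MAUROLIPIN)):
    """Percent identity if the two proteins differ only at the listed positions."""
    return 100.0 * (length - differences) / length


if __name__ == "__main__":
    print(", ".join(substitutions()))
    print(f"Estimated identity: {estimated_identity():.2f}%")
```

Under these assumptions, 13 differences in 144 residues correspond to roughly 91% identity, in line with the very high similarity to the Tunisian sequence reported by BLAST.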
Based on the three-dimensional structure prediction results, Maurolipin is highly similar in structure to phospholipase A2 of the group III subfamily; it was only slightly similar to vertebrate phospholipase A2. Phylogenetic analysis also confirms these results. Similar to the characterized PLA2 of A. phaiodactylus [51] and Imperatoxin I of P. imperator [21], Maurolipin is closely related to the A. mellifera structure, the only known representative structure in group III PLA2. Evaluation of the 3D structure of the target protein revealed that it was very similar to chain A of the characterized PLA2 from bee venom. Four disulfide bonds were predicted for phospholipase A2 of M. tamulus at the positions of eight cysteines, (Cys8-Cys30), (Cys29-Cys68), (Cys35-…) [47]. This is similar to the positions of the human group III PLA2 disulfide bonds in ten cysteines, (Cys8-Cys30), (Cys29-Cys68), (Cys35-Cys61), (Cys59-Cys91), and (Cys101-Cys113), whereas different positions were predicted for the five disulfide bonds in Maurolipin [52]. The histidine-aspartic acid (His-Asp) pair is reported to be necessary for the catalytic mechanism of phospholipase A2. Active-site residues are universally conserved within protein families, reflecting their key role in substrate catalysis [53]. In this study, the RMSD between Maurolipin and bee venom PLA2 was calculated. The RMSD value measures the average deviation between the equivalent atoms of two proteins, depending on conformational differences and structural dimensions [54,55]. The smaller the RMSD, the more similar the two structures. In this study, the active site of Maurolipin also contains the His-Asp pair, and the effective amino acid in the catalytic domain was His49, while His34 played a significant role in the active site of bee venom PLA2 and Sm-PLVG [49].

Data Availability
All data generated or analyzed during this study are available upon reasonable request to the corresponding author.

Ethical Approval
No human or animal data or tissue was used in this study.

Consent
No written consent has been obtained from the patients as there is no patient identifiable data included.

Conflicts of Interest
The authors declare that there are no conflicts of interest.

Authors' Contributions
PS-A and KA conceived and designed the study and contributed to the discussion of results. HA, AS, QA, and AR conceived the study and analyzed data; PS-A, KA, and DM drafted the manuscript. All authors have read and approved the final manuscript.