Sequence analysis of earthworm hemolysins

Annelids are capable of defending themselves from pathogens and of recognizing degenerated self tissue. These reactions require specialized immune mechanisms that are effected by proteins and cellular reactions. Hemolytic proteins are the most striking humoral defense molecules in the earthworm Eisenia fetida. Besides their hemolytic activity, these proteins possess agglutinating, antibacterial, cytotoxic, and clotting properties. Hemolytic proteins from both coelomocytes (CL39, CL41) and coelomic fluid (H1−3) of wildtype E. fetida were isolated and assigned to fetidin and lysenin using mass spectrometry and bioinformatic tools. Glycosylation was found for H1. In silico analyses of the hemolysins revealed two hemolysin isoforms and consensus sites for N-glycosylation and for the peroxidase proximal heme-binding ligand.


Introduction
Earthworms, and in fact all annelids, are interesting subjects for immunological study. They are capable of defending themselves from pathogens and of recognizing degenerated self tissue. These reactions require specialized immunological mechanisms. Hemolytic proteins are the most striking humoral defense molecules in the earthworm Eisenia fetida. Together with several bacterial toxins, toxins secreted from invertebrate cells, and even perforin and components of the complement cascade in vertebrates, they belong to a family of phylogenetically very old, barrel-like, pore-forming, membrane-attacking proteins. Hemolytic and agglutinating proteins in E. fetida were originally described by DuPasquier and Duprat [3]. Besides their hemolytic activity, these proteins possess agglutinating [4,6,14], antibacterial [11,13], cytotoxic [7], and clotting [15] properties. We isolated and analysed hemolytic proteins from both CL (CL39 and CL41) and CF (H1−3) of wildtype E. fetida, as illustrated in Fig. 1, and assigned them to fetidin and lysenin [9], which have been described earlier [10,12]. The results supported the concept that the hemolytic proteins in E. fetida originate from chloragocytes and are secreted into CF [9]. Here, we present sequence analyses based on the MS data using similarity, pattern, and profile searches.

Experimental
Proteins were isolated as described [9] and illustrated in Fig. 2. Database, similarity (BLAST2 [1]), pattern, and profile (Prosite) searches were performed via the public ExPASy server, searching SwissProt (University of Geneva and European Bioinformatics Institute), and the NCBI Entrez server, searching GenBank (National Center for Biotechnology Information, Bethesda, MD, USA).

Results and discussion
Eisenia is not currently the subject of extensive sequencing projects, although efforts have been started to sequence Lumbricus rubellus (http://convoluta.cap.ed.ac.uk/Lumbribase/lumbribase/lumbribase.html). Therefore, only 5 proteins can be found in SwissProt and 74 in GenBank among hundreds of thousands of entries. Only the fact that relevant earthworm proteins have been analysed at the genomic level [10,12] allowed us to interpret the sequencing data of the CL and CF hemolysin proteins [9]. We had noted that the proteins share large parts of their sequences, as demonstrated in Fig. 3. There are four extremely similar entries in GenBank: fetidin (U02710, CL39 [10]), lysenin (D85846, CL41 [12]), and lysenin-related proteins 1 and 2 (LPR1 and LPR2; D85848 and D85847 [12]). Fetidin and lysenin are 90% identical, and fetidin and LPR2 are 75% identical, so the analytical challenge was to find the few peptides that differ. Protein sequence analysis also showed that the amino acid sequences of LPR1 and fetidin are completely identical (Fig. 3b). However, detailed analysis at the genomic level (Figs 3a, 4) revealed that these proteins are isoforms. Guanine 621 in fetidin is replaced by adenine 672 in LPR1 (the positions being defined by the lengths of the submitted mRNA sequences). Nevertheless, this mutation has no effect on translation, since both AAA and AAG code for Lys178 (Fig. 3b).
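The reasoning above (a single G/A exchange in the third codon position that leaves the protein unchanged) can be sketched in a few lines of Python. This is a minimal illustration only: the two nucleotide fragments are hypothetical and are not taken from the GenBank entries U02710 or D85848, the codon table is deliberately partial, and the helper names are our own.

```python
# Minimal sketch: a synonymous (silent) substitution and percent identity.
# The fragments below are hypothetical, NOT the fetidin/LPR1 mRNA sequences.

# Partial codon table, just enough for the toy fragments.
CODON_TABLE = {
    "AAA": "K", "AAG": "K",  # both codons encode lysine
    "GGT": "G", "TGG": "W",
    "GAA": "E", "TTC": "F",
}


def translate(cds: str) -> str:
    """Translate a coding sequence codon by codon."""
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


def differences(a: str, b: str) -> list:
    """Return the 1-based positions at which two aligned sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return [i + 1 for i, (x, y) in enumerate(zip(a, b)) if x != y]


def percent_identity(a: str, b: str) -> float:
    """Percentage of identical positions in two aligned sequences."""
    return 100.0 * (len(a) - len(differences(a, b))) / len(a)


frag_g = "GGTTGGAAGGAATTC"  # hypothetical fragment carrying the AAG codon
frag_a = "GGTTGGAAAGAATTC"  # same fragment with AAA instead

print(differences(frag_g, frag_a))            # nucleotide positions that differ
print(translate(frag_g), translate(frag_a))   # identical peptides
print(round(percent_identity(frag_g, frag_a), 1))
```

The same `percent_identity` helper applies to aligned protein sequences, which is the sense in which the 90% and 75% figures are quoted above; producing the alignment itself (for example with BLAST) is outside the scope of this sketch.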
For N-glycosylation of the high-mannose type, which the MS results indicated to be present ([9], Fig. 4), the signature can be found at Asn250 (Fig. 3b). This consensus sequence is also found in lysenin. LPR2 does not have this site, but there is an N-glycosylation pattern towards the N-terminal end at Asn33. However, because the pattern N-x-S/T is so short, it often occurs without functional relevance. More interesting in that respect is the finding of a peroxidase proximal heme-ligand signature in all three hemolysins (Fig. 3b). The pattern is

[DET]-[LIVMTA]-x(2)-[LIVM]-[LIVMSTAG]-[SAG]-[LIVMSTAG]-H-[STA]-[LIVMFY]

with alternative amino acid residues in brackets and the residues found in the hemolysins italicized. In this pattern, His is the proximal heme-binding ligand according to Prosite rules [2,5,8]. Heme-binding peroxidases carry out a variety of biosynthetic and degradative functions using hydrogen peroxide as the electron acceptor. Hydrogen peroxide is a toxic substance that causes dramatic intracellular degradation if it accumulates in the cell and is not metabolized. A detoxification function of earthworm hemolysins can therefore be hypothesized and needs to be investigated.
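How such a pattern search works can be illustrated with a short Python sketch that converts a subset of the Prosite pattern syntax into a regular expression and scans a sequence for overlapping hits. This is not the Prosite server implementation used in this study. The test sequence is a made-up toy peptide, not a hemolysin, and the N-glycosylation sequon is written as N-{P}-[ST], following the Fig. 4 convention that X is any amino acid except Pro.

```python
import re

# Patterns in PROSITE-style syntax. The peroxidase pattern is the one quoted
# in the text; the sequon follows Fig. 4 (X = any residue except Pro).
PEROXIDASE_PROXIMAL = (
    "[DET]-[LIVMTA]-x(2)-[LIVM]-[LIVMSTAG]-[SAG]-[LIVMSTAG]-H-[STA]-[LIVMFY]"
)
N_GLYCOSYLATION = "N-{P}-[ST]"

_ELEMENT = re.compile(
    r"(x|[A-Z]|\[[A-Z]+\]|\{[A-Z]+\})(?:\((\d+)(?:,(\d+))?\))?"
)


def prosite_to_regex(pattern: str) -> str:
    """Convert a subset of PROSITE syntax to a Python regular expression.

    Supported: literal residues, x (any residue), [..] allowed sets,
    {..} forbidden sets, and (n) or (n,m) repeat counts.
    """
    parts = []
    for element in pattern.split("-"):
        match = _ELEMENT.fullmatch(element)
        if match is None:
            raise ValueError(f"unsupported pattern element: {element!r}")
        core, low, high = match.groups()
        if core == "x":
            core = "."
        elif core.startswith("{"):
            core = "[^" + core[1:-1] + "]"
        if low:
            core += "{" + low + ("," + high if high else "") + "}"
        parts.append(core)
    return "".join(parts)


def scan(pattern: str, sequence: str) -> list:
    """Return (1-based start, matched segment) for all hits, overlaps included."""
    # A lookahead with a capture group lets finditer report overlapping hits.
    regex = re.compile("(?=(" + prosite_to_regex(pattern) + "))")
    return [(m.start() + 1, m.group(1)) for m in regex.finditer(sequence)]


# Toy peptide (hypothetical): one sequon (NGT), one non-sequon (NPS),
# and one stretch that satisfies the peroxidase pattern.
TOY = "MKNGTWDLKKLASAHSLQNPSE"

print(scan(N_GLYCOSYLATION, TOY))
print(scan(PEROXIDASE_PROXIMAL, TOY))
```

In the toy peptide the three-residue sequon is matched at one site, while NPS is rejected because of the proline. Short patterns of this kind are expected to occur by chance far more often than the eleven-position peroxidase pattern, which is the point made above about functional relevance.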

Conclusion
Hemolytic proteins from both CL (CL39 and CL41) and CF (H1−3) of wildtype E. fetida were carefully isolated by puncturing the coelomic cavity, minimizing protein cross-contamination between cells and fluid. MS analysis assigned the hemolysins to fetidin and lysenin, and high-mannose N-glycosylation was detected for CF fetidin. In silico sequence analyses of four Eisenia hemolysins revealed high similarity among them; in fact, the sequences of LPR1 and fetidin differ in only one base, with no effect on translation. Consensus sites for N-glycosylation are pointed out in this study, but it is not yet known whether glycosylation occurs at those sites. The same is true for the peroxidase proximal heme-binding ligand site discovered towards the N-terminal end of the protein, although this pattern is highly specific. The data motivate experiments to study these findings, especially with respect to a hemolysin involvement in hydrogen peroxide detoxification.
Fig. 1. Earthworm proteins were subjected to MS-based analysis, which is part of the proteomics toolkit [9]. (a) Proteins are separated using gel electrophoresis and digested tryptically in the gel spot of interest. Peptides are then eluted and subjected to mass mapping or sequencing. MS data allow the identification of known proteins from databases. In the case of unknown proteins, MS contributes detailed structural information. (b) Functional analyses require knowledge at the genomic and proteomic level. In both areas high-throughput techniques are available, which generate huge amounts of data for gene expression or protein identification experiments. However, in order to access modifications, mutations, and isoforms, which are of utmost importance for functional studies, the methods need to be refined and optimised for the respective analysis task.

Fig. 2. The source of hemolysins was CF of adult E. fetida harvested by puncturing the coelomic cavity with a very thin glass capillary.

Fig. 4. Basic core structure for Asn-glycosylation of the high-mannose type. The number of mannose residues is variable. X is any amino acid except Pro.