Next Generation Sequencing Identifies Five Major Classes of Potentially Therapeutic Enzymes Secreted by Lucilia sericata Medical Maggots

Lucilia sericata larvae are used as an alternative treatment for recalcitrant and chronic wounds. Their excretions/secretions contain molecules that facilitate tissue debridement, disinfect, or accelerate wound healing and have therefore been recognized as a potential source of novel therapeutic compounds. Among the substances present in excretions/secretions various peptidase activities promoting the wound healing processes have been detected but the peptidases responsible for these activities remain mostly unidentified. To explore these enzymes we applied next generation sequencing to analyze the transcriptomes of different maggot tissues (salivary glands, gut, and crop) associated with the production of excretions/secretions and/or with digestion as well as the rest of the larval body. As a result we obtained more than 123.8 million paired-end reads, which were assembled de novo using Trinity and Oases assemblers, yielding 41,421 contigs with an N50 contig length of 2.22 kb and a total length of 67.79 Mb. BLASTp analysis against the MEROPS database identified 1729 contigs in 577 clusters encoding five peptidase classes (serine, cysteine, aspartic, threonine, and metallopeptidases), which were assigned to 26 clans, 48 families, and 185 peptidase species. The individual enzymes were differentially expressed among maggot tissues and included peptidase activities related to the therapeutic effects of maggot excretions/secretions.


Introduction
The maggots of certain flies have been used as traditional medicines for centuries [1] but modern maggot debridement therapy (MDT) was established approximately 100 years ago. MDT was then widely used for the treatment of chronic wounds until the mid-1940s, since when the technique has been supplanted by antibiotics and improved wound care [2]. MDT, using exclusively Lucilia sericata maggots, has recently undergone a renaissance, and medical maggots are now approved as an alternative approach for the treatment of many types of chronic and necrotic wounds, including diabetic ulcers [3][4][5], postsurgical wounds [6], and burns [7,8].
Maggots applied to hard-to-heal wounds debride the necrotic tissue, disinfect the wound, and stimulate the healing process [9]. The beneficial effect of MDT cannot be attributed to a single molecule but rather to the synergistic action of various bioactive substances, including a large variety of proteolytic enzymes, which are present in maggot excretion/secretion products (MEP) [10].
Debridement, the removal of necrotic tissue and wound slough, is a well-documented effect of MDT [11][12][13]. The maggots perform physical debridement with their mandibles, but chemical debridement with enzymes is the most important component. The maggots release their digestive enzymes into the wound, where the enzymes liquefy necrotic and infected tissue before the maggots ingest it. Chambers et al. identified three classes of proteolytic enzymes (aspartic, serine, and metallopeptidases) from MEP and proposed that mainly serine peptidases are responsible for the superficial debridement activity of maggots [14]. Only two such serine peptidases have been identified and characterized thus far. Chymotrypsin 1 was identified from MEP and produced in recombinant form [15]. The recombinant enzyme was shown to degrade the eschar from venous leg ulcers in vitro [15] and to be unaffected by two endogenous inhibitors from wound eschar, α1-antichymotrypsin and α1-antitrypsin [16]. We recently produced and characterized a Jonah-like chymotrypsin, which digested three specific extracellular matrix proteins (laminin, fibronectin, and collagen IV) in vitro, and proposed a function for it in wound debridement [17].
The natural habitat of L. sericata larvae is rotting organic matter such as cadavers and excrement, but this ecological niche also favors many microorganisms so the larvae must have adequate defenses against infection. The maggots therefore protect themselves by producing many antimicrobial substances [18][19][20][21][22] and by digesting microbes, which are thus eliminated in the larval gut [23,24]. Interestingly, MEPs also show activity against relevant human pathogens including antibiotic-resistant bacterial strains [25][26][27] and biofilms [28][29][30][31]. Recently, two molecules with antibiofilm activity have been identified from MEP. Affinity purified DNase disrupted Pseudomonas aeruginosa biofilm [32] and recombinant chymotrypsin I was active against Staphylococcus epidermidis and S. aureus biofilms [33].
Surprisingly, medical maggots also directly promote wound healing [10]. MEPs stimulate fibroblast migration [34,35] and proliferation [36] and increase angiogenesis [37,38]. MEPs also influence the activation of the human complement system [39], reduce proinflammatory responses [40][41][42][43], and induce fibrinolysis [44]. Recently we discovered that MEPs contain peptidases that influence blood coagulation as part of the wound healing process [45] and this activity was attributed to a Jonah-like serine peptidase. The recombinant enzyme was shown to reduce the clotting time of human plasma by substituting for the intrinsic clotting factors kallikrein, factor XI, and factor XII [17].
However, although several molecules have been identified from MEP, it is still recognized as a largely unexplored source of compounds with therapeutic potential. Future studies should focus on the identification, isolation, and/or production of effector molecules and on testing their therapeutic potential. Here, we analyzed the transcriptome of different larval tissues to systematically identify MEP peptidases. It is not clear whether MEP components are exclusively produced by salivary glands or also by other tissues, so we dissected three individual tissues (gut, crop, and salivary glands) as well as the remaining larval biomass to generate tissue-specific sequence data. The extracted mRNA was sequenced using the Illumina HiSeq2000 Genome Analyzer platform and paired-end read technology. After preprocessing, 123,856,654 paired reads remained in the panel of libraries. These were processed further to yield a final assembly of 41,421 contigs in 17,479 clusters, resulting in the identification of 1729 contigs in 577 clusters encoding five different functional classes of proteolytic enzymes.

Preparation of Biological Material.
First-instar L. sericata maggots were obtained from BioMonde GmbH (Barsbüttel, Germany) and were cultured under sterile conditions on Columbia agar plates with "sheep blood +" (Oxoid Deutschland GmbH, Wesel, Germany) at 28 °C for 48 h in the dark. The larvae were cleaned and then infected with a mixture of Pseudomonas aeruginosa (DSM 50071) and Staphylococcus aureus (DSM 2569) as previously described [21]. The larval gut, salivary glands, and crop were dissected under a binocular microscope 8 h after infection. Dissected tissues and the remaining larval body for Illumina sequencing were frozen in liquid nitrogen and stored at -80 °C. Samples for qRT-PCR analysis were processed immediately as described below.

RNA Isolation and Illumina Sequencing.
Total RNA was extracted from individual tissues and the rest of the larval body using the innuPREP RNA Mini Isolation Kit (Analytik Jena, Jena, Germany) following the manufacturer's instructions. Additional RNA purification, quantification, and quality control were carried out as previously described [46]. An additional Turbo DNase treatment (Thermo Fisher Scientific, Waltham, MA, USA) was applied before the second purification step to eliminate contaminating DNA. The DNase was removed and the RNA purified using the RNeasy MinElute Cleanup Kit (Qiagen, Hilden, Germany) following the manufacturer's protocol. RNA was eluted in 20 µL Ambion RNA Storage Solution (Thermo Fisher Scientific) and poly(A)+ mRNA was prepared using the Ambion MicroPoly(A) Purist Kit (Thermo Fisher Scientific) according to the manufacturer's instructions. The integrity and quantity of the mRNA were confirmed using an Agilent 2100 Bioanalyzer and RNA Nano chips (Agilent Technologies, Santa Clara, CA, USA).
Transcriptome sequencing was carried out on an Illumina HiSeq2000 Genome Analyzer platform using paired-end (2 × 100 bp) read technology for the larval tissues, with RNA fragmented to an average length of 150 nucleotides. Sequencing was carried out by Eurofins MWG Operon (Ebersberg, Germany) and resulted in totals of 29, 33, 26, and 34 million reads for the rest of body, gut, crop, and salivary glands, respectively.

Assembly.
The reads from all libraries were assembled de novo in two steps. One assembly was computed using the Trinity assembler [49] followed by 28 individual Velvet/Oases assemblies [50] with k-mer parameters ranging from 21 to 75. In the second step, the resulting transcript sequences were combined and high-quality sequences were extracted using the EvidentialGene pipeline [51]. Potential isoforms were detected by clustering the protein sequences from the EvidentialGene pipeline using CD-HIT [52] with 90% identity.
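The k-mer sweep described above can be sketched as follows. The text gives only the range (21 to 75) and the number of Velvet/Oases assemblies (28); the step size is not stated, so the step of 2 (odd k values only) used here is an assumption, chosen because it reproduces exactly 28 values.

```python
# Minimal sketch of the k-mer sweep for the Velvet/Oases assemblies.
# Assumption (not stated in the text): odd k-mer sizes, i.e. a step of 2.
K_MIN = 21
K_MAX = 75
K_STEP = 2  # assumed step size

kmer_sizes = list(range(K_MIN, K_MAX + 1, K_STEP))

# Under this assumption the sweep yields one assembly per k value.
print(f"{len(kmer_sizes)} assemblies, k = {kmer_sizes[0]} ... {kmer_sizes[-1]}")
```

Odd k values are a common choice for de Bruijn graph assemblers because they avoid k-mers that are their own reverse complement.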

Quality Assessment.
The CEGMA method [53] was used to assess the completeness of the transcriptome. The detection step of the CEGMA pipeline was replaced by a BLASTp search [54] against the CEGMA EuKaryotic Orthologous Groups (KOG) sequences, because protein sequences had already been identified in the previous step. The completeness of individual sequences was estimated by computing the "ortholog hit ratio" [55] against the D. melanogaster protein sequences.
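As an illustration of the completeness estimate, the ortholog hit ratio is commonly computed as the length of the region of a reference ortholog covered by the best hit, divided by the full length of that ortholog. The sketch below assumes this definition; the function name and the example lengths are hypothetical and are not taken from the study.

```python
def ortholog_hit_ratio(aligned_length: int, ortholog_length: int) -> float:
    """Fraction of a reference ortholog covered by the best hit.

    Assumed definition: aligned length of the hit region divided by the
    full length of the reference (here, a D. melanogaster protein).
    A value near 1.0 suggests a near full-length transcript.
    """
    if ortholog_length <= 0:
        raise ValueError("ortholog_length must be positive")
    if aligned_length < 0:
        raise ValueError("aligned_length must not be negative")
    return aligned_length / ortholog_length


# Hypothetical example: 450 aligned residues against a 500-residue ortholog.
example_ratio = ortholog_hit_ratio(450, 500)
print(f"ortholog hit ratio = {example_ratio:.2f}")
```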

Functional Annotation and Peptidase Identification.
Putative transposable elements were identified using TransposonPSI [56]. Furthermore, all sequences with HMMER 3.0 [57] hits against the Pfam domains [58] described previously [59] were marked as potential transposable elements. All sequences in clusters with at least one putative transposable element were annotated as transposable elements. All sequences were uploaded to the SAMS web server [60] and automatically annotated using BLAST [54] and HMMER [57] searches against different databases. Next, all peptidases were identified using the EC numbers [61] from the automatic annotation of the transcriptome data and further classified using the MEROPS database [62].
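The cluster-level rule for transposable elements (one flagged sequence marks the whole cluster) can be sketched as a simple set operation. The function name, the data layout, and the contig and cluster identifiers below are hypothetical and serve only to illustrate the rule.

```python
def propagate_te_annotation(cluster_of: dict, te_flagged: set) -> set:
    """Return all sequence IDs that end up annotated as transposable elements.

    cluster_of: maps each sequence ID to its cluster ID.
    te_flagged: sequence IDs flagged by at least one detection method.
    Every sequence in a cluster with one or more flagged members is annotated.
    """
    te_clusters = {cluster_of[seq_id] for seq_id in te_flagged if seq_id in cluster_of}
    return {seq_id for seq_id, cluster_id in cluster_of.items() if cluster_id in te_clusters}


# Hypothetical toy data: contig_a is flagged, so its cluster mate contig_b is annotated too.
cluster_of = {"contig_a": "cluster_1", "contig_b": "cluster_1", "contig_c": "cluster_2"}
te_flagged = {"contig_a"}
te_annotated = propagate_te_annotation(cluster_of, te_flagged)
print(sorted(te_annotated))
```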

Mapping and Digital Gene Expression Analysis.
Digital gene expression analysis was carried out using QSeq software (DNAStar Inc.) to remap the Illumina reads onto the reference backbone and then count the mapped reads to estimate expression levels. For read mapping, we used the following parameters: n-mer length = 40; read assignment quality options required at least 40 mappable bases (the amount of mappable sequence used as a criterion for inclusion) and at least 90% of bases matching (the minimum similarity fraction) within each read for it to be assigned to a specific contig; maximum number of hits for a read = 10 (reads matching more distinct places than this were excluded); n-mer repeat settings were determined automatically and other settings were left unchanged. Biases in the sequence datasets and differences in transcript size were corrected using the RPKM measure (reads per kilobase of transcript per million mapped reads) to obtain comparable estimates of relative expression levels. For the selected protease groups, gene expression (log2-transformed RPKM values) was visualized as heat maps using custom scripts and matplotlib [63], a 2D plotting library, with the Jet colormap [64].
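The RPKM normalization and log2 transformation can be written out as below. This is a minimal sketch of the standard RPKM formula, not the QSeq implementation; the pseudocount added before the log2 transform is an assumption (the text does not say how zero values were handled), and the example numbers are hypothetical.

```python
import math


def rpkm(read_count: int, transcript_length_bp: int, total_mapped_reads: int) -> float:
    """Reads per kilobase of transcript per million mapped reads."""
    if transcript_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("length and total mapped reads must be positive")
    return read_count * 1e9 / (transcript_length_bp * total_mapped_reads)


def log2_rpkm(value: float, pseudocount: float = 1.0) -> float:
    """log2-transform an RPKM value; the pseudocount is an assumed guard against log2(0)."""
    return math.log2(value + pseudocount)


# Hypothetical contig: 500 reads on a 2,000 bp transcript in a library of 25 million mapped reads.
example_rpkm = rpkm(500, 2000, 25_000_000)
print(f"RPKM = {example_rpkm:.1f}, log2(RPKM + 1) = {log2_rpkm(example_rpkm):.2f}")
```

Dividing by transcript length makes long and short contigs comparable, and dividing by the number of mapped reads makes the four tissue libraries comparable despite their different sequencing depths.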

Quantitative Reverse Transcription Real-Time PCR Reaction (qRT-PCR).
A subset of differentially expressed […] "hypothetical proteins." Gene ontology (GO) analysis was used to explore the functional characteristics of all contigs and assign them to three independent categories: biological processes, molecular function, and cellular components (Figure 1). In addition, a BLASTp search against the MEROPS database v9.12 [67] identified 1729 contigs in 577 clusters as peptidases. The identified peptidases represented ∼4% of the total number of contigs. This result is consistent with data from other organisms, in which peptidases represent more than 2% of all genes [68].

Peptidases.
Peptidases are proteolytic enzymes that hydrolyze peptide bonds and they are found ubiquitously in all biological systems from viruses to vertebrates. Based on the key amino acid residues responsible for proteolytic activity, six different peptidase classes are recognized (aspartic, cysteine, serine, glutamic, threonine, and metallopeptidases) as well as further unclassified peptidases [69]. From the 1729 L. sericata contigs (in 577 clusters) identified as peptidases, 1655 contigs (in 557 clusters) were assigned to one of five peptidase classes (aspartic, cysteine, serine, threonine, and metallopeptidases) whereas 74 contigs (in 20 clusters) remained unassigned. As summarized in Figure 2, serine peptidases were the most prominent class (837 contigs in 270 clusters) followed by metallopeptidases (565 contigs in 202 clusters), cysteine peptidases (145 contigs in 45 clusters), threonine peptidases (51 contigs in 25 clusters), and aspartic peptidases (57 contigs in 15 clusters). The MEROPS database was used to subdivide the identified enzymes further into clans (peptidases with evolutionarily conserved tertiary structures, orders of catalytic residues, and common sequence motifs around the catalytic site), families (peptidases with similar amino acid sequences), and species (peptidases with similar properties and a unique MEROPS identity) [70,71]. Accordingly, we identified 26 clans containing 48 families and 185 peptidase species (Table 2). We found that almost half of the identified clusters represented serine peptidases in clan PA and family S1. GO analysis was then carried out to assign functional categories to each of the identified peptidase clusters. We were able to assign 534 of 577 clusters to three different categories: biological process (345 clusters), molecular function (533 clusters), and cellular component (70 clusters) (Figure 3).
We found that most of the peptidases (310 clusters) are involved in the biological process (level 3) category of "primary metabolic process" (Figure 3(a)). The molecular function (level 3) of most peptidases was either catalytic activity (201) or hydrolase activity (253) (Figure 3(b)) as expected given the molecular role of peptidases. Interestingly, only 70 clusters were assigned a cellular component function (Figure 3(c)).
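The per-class counts reported in this section can be cross-checked against the stated totals with a few lines of arithmetic. The sketch below uses only the numbers given in the text; the variable names are illustrative.

```python
# (contigs, clusters) per peptidase class, as reported in the text.
class_counts = {
    "serine": (837, 270),
    "metallo": (565, 202),
    "cysteine": (145, 45),
    "threonine": (51, 25),
    "aspartic": (57, 15),
}
unassigned = (74, 20)   # contigs, clusters without a class
reported_total = (1729, 577)

assigned_contigs = sum(contigs for contigs, _ in class_counts.values())
assigned_clusters = sum(clusters for _, clusters in class_counts.values())

# Share of all peptidase clusters contributed by each class.
cluster_share = {
    name: clusters / reported_total[1] for name, (_, clusters) in class_counts.items()
}

print(f"assigned: {assigned_contigs} contigs in {assigned_clusters} clusters")
for name, share in sorted(cluster_share.items(), key=lambda item: -item[1]):
    print(f"{name:>10}: {share:.1%} of peptidase clusters")
```

The class sums match the 1655 contigs in 557 clusters stated above, and adding the unassigned sequences recovers the totals of 1729 contigs in 577 clusters.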

Aspartic Peptidases.
Aspartic peptidases contain an aspartic acid residue at the active site [72]. An aspartic peptidase activity was previously identified in MEPs using class-specific inhibitors [14] and the corresponding gene was shown to be strongly upregulated in L. sericata larvae following an immune challenge [18]. We identified 57 contigs in 15 clusters representing aspartic peptidases, and these were further assigned to two clans, three families, and seven peptidase species (Table 3) with different tissue-specific expression profiles (Additional file 2).

Family A1.
Preprocathepsin D-like peptidases are the largest group of aspartic peptidases. The majority of clusters included a signal peptide, propeptide, and mature enzyme containing all of the conserved catalytic and substrate-binding residues found in human lysosomal cathepsin D [69].
With the exception of cluster LST_LS005572 and two incomplete clusters (LST_LS009595 and LST_LS016491), all clusters lacked the polyproline loop (DxPxPx(G/A)P) (Figure 4). The absence of this loop is a characteristic of pepsin and digestive cathepsin D peptidases in the Brachyceran infraorder Muscomorpha and may be associated with the extracellular role of these enzymes [73]. The aspartic peptidase gene previously identified in challenged L. sericata larvae [GenBank: FG360526] is homologous to cluster LST_LS005916, which was predominantly expressed in the larval gut. Aspartic peptidases can kill bacteria in vitro in an acidic medium [74] and may also kill bacteria in the Musca domestica larval midgut (Espinoza-Fuentes, Terra 1987). Based on its localization in the gut and induction by an immune challenge, we propose a similar role for this L. sericata aspartic peptidase. More broadly, heat map analysis (Additional file 2) revealed that the majority of A1 family aspartic peptidases are predominantly expressed in the larval gut, suggesting a role in digestion and/or the elimination of bacteria.
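The polyproline loop motif DxPxPx(G/A)P can be screened for with a regular expression, reading x as any residue. The sketch below is illustrative only: the function name and the toy sequences are hypothetical, and it is not the procedure used to produce Figure 4.

```python
import re

# DxPxPx(G/A)P, with x read as any single residue.
POLYPROLINE_LOOP = re.compile(r"D.P.P.[GA]P")


def has_polyproline_loop(sequence: str) -> bool:
    """Return True if the protein sequence contains the DxPxPx(G/A)P motif.

    Alignment gap characters ('-') are removed first so that aligned
    sequences can be passed in directly.
    """
    ungapped = sequence.replace("-", "").upper()
    return POLYPROLINE_LOOP.search(ungapped) is not None


# Hypothetical toy sequences, not taken from the transcriptome.
toy_sequences = {
    "with_loop_G": "MKTDAPSPVGPLL",
    "with_loop_A": "MKTDAPSPVAPLL",
    "gapped_loop": "MKTDAP--SPVGPLL",
    "no_loop": "MKTDAGSPVGPLL",
}
loop_calls = {name: has_polyproline_loop(seq) for name, seq in toy_sequences.items()}
print(loop_calls)
```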

Family A22.
This family of intramembrane peptidases comprises two subfamilies. The A22a subfamily is typified by presenilin, an enzyme that plays a central role in intramembrane proteolysis [75] and the pathogenesis of Alzheimer's disease [76]. The A22b subfamily is typified by impas 1 peptidase, which is responsible for the degradation of liberated signal peptides and may play an essential role in the development of D. melanogaster larvae [77]. We identified four clusters assigned to three peptidase species encoding members of the A22b subfamily, and two of them (LST_LS005714 and LST_LS015224) were strongly upregulated in the salivary glands (Additional file 2) as previously reported in D. melanogaster [77]. We therefore propose a similar function for the L. sericata peptidase in larval development.

Family A28.
Only one contig, in a single cluster, was assigned to family A28. A homologous skin aspartic peptidase (SASPase) was recently identified in human skin [78] although its biological role remains unclear.
Figure 4: Partial amino acid sequence alignment of A1 peptidases found in the L. sericata transcriptome. Amino acid sequences of A1 aspartic peptidases were aligned using MAFFT [199]. Only one cluster (LST_LS005572) contained a polyproline loop (underlined). An asterisk (*) indicates positions with a single, fully conserved residue. A colon (:) indicates conservation between groups of strongly similar properties, scoring > 0.5 in the Gonnet PAM 250 matrix. A period (.) indicates conservation between groups of weakly similar properties, scoring ≤ 0.5 in the Gonnet PAM 250 matrix.
LST_LS005916 AIGATFNYDYYTYTVDCSSIDSLPALTLNIGGTTFTIEASDYILQ----SEGVCSSAFENIGTDF-------WILGDI-FIGRYYSIFDLANNRVGFATAV
LST_LS005431 AIGATFNYDYYTYTVDCSSIDSLPALTLNIGGTTFTIEASDYILQ----SEGVCSSAFENAGTDF-------WILGDI-FIGR------------------
LST_LS005915 AIGATFNYDYYTYTVDCSSIDSLPALTLNIGGTTFTIEASDYILQ----SEGVCSSAFENIGTDF-------WILGDI-FIGRYYSIFDLANNRVGFATAV
LST_LS005911 AIGATFNETTYEFMLDCSTLDSLPDVNFHIGDGIYTLEPSDYVLQ----ADDQCATAFEDAGMNI

[…] on the strict localization in the L. sericata gut and induction by an immune challenge, we speculate that these enzymes are probably required for the elimination of bacteria in the larval gut [23] rather than the digestion of food, but additional experiments are needed to confirm their specific function.

Family C13.
The C13 family of cysteine peptidases comprises two types of enzymes. The first is the asparaginyl endopeptidases, which were originally found in legumes [90] and later in schistosomes [91], mammals [92], and recently also arthropods [93]. These are acidic lysosomal enzymes that favor asparagine at the P1 position [94] and whose roles include antigen presentation [95], enzyme transactivation [96], and blood meal digestion [97]. The second is the glycosylphosphatidylinositol (GPI):protein transamidases, which are required for the removal of C-terminal peptides and the attachment of GPI anchors [98]. We identified two clusters encoding GPI:protein transamidases and two clusters remained unidentified.

Family C14.
Caspases are intracellular endopeptidases that are highly specific for the cleavage of aspartyl bonds. With the exception of caspase 1, which is responsible for the production of interleukin-1 in monocytes [69], most caspases regulate apoptosis by taking part in a protease cascade [99]. The D. melanogaster genome encodes seven caspases. Dronc (Drosophila Nedd2-like caspase), Dredd (death related ced-3/Nedd2-like), and Strica (serine/threonine rich caspase) possess long N-terminal domains and function as upstream or initiator enzymes, whereas Drice (Drosophila ICE), Dcp-1 (death caspase-1), and Decay (death executioner caspase related to Apopain/Yama) are downstream or effector caspases [100,101]. Damm (death-associated molecule related to Mch2) caspase shares the features of both groups but its biological role is not fully understood [69]. The L. sericata transcriptome database contained eight clusters in seven peptidase species representing caspases, with different tissue-specific expression profiles (Additional file 4). Phylogenetic analysis (Figure 5) revealed one L. sericata homolog each for the effector caspases Dcp-1 and Decay, two homologs for Drice, one homolog each for the initiator caspases Dredd and Strica, and two homologs for Dronc. We did not find a sequence representing the D. melanogaster Damm caspase.

Other Cysteine Peptidase Families.
Several cysteine peptidase families were more or less equally distributed among the L. sericata tissues we tested, and these are probably required for essential cellular functions. The C2 family of calcium-dependent peptidases (calpains) consists of ubiquitous, intracellular, neutral peptidases associated with diverse biological functions ranging from signal transduction to apoptosis [102]. Ubiquitinyl hydrolases (family C12) are intracellular enzymes that remove ubiquitin from ubiquitinylated proteins and peptides [69]. Members of family C15 are ubiquitous, intracellular peptidases that remove pyroglutamate from the N-terminus of peptides and hydrolyze biologically active peptides such as neurotensin and gonadotropin [103]. Gamma-glutamyl hydrolases (family C26) are primarily lysosomal enzymes, which are widely distributed in nature and probably required for the turnover of cellular folates [69]. Hedgehog proteins (family C46) are self-splicing, two-domain signaling proteins originally discovered in D. melanogaster [104]. They are found in most metazoan species and play multiple roles in pattern formation during development [105]. Members of family C54, which was first discovered in the budding yeast Saccharomyces cerevisiae, are necessary for autophagy [106]. Otubains (family C65) are isopeptidases involved in the removal of ubiquitin from polyubiquitin [107]. These enzymes share no homology with other deubiquitinylating enzymes but belong to the ovarian tumor (OTU) family and possess a cysteine peptidase domain [108].

Metallopeptidases.
The metallopeptidases are a ubiquitous and highly diverse group of enzymes containing both endopeptidases and exopeptidases. MEROPS database v9.12 [62] lists more than 15 clans and 71 families involved in diverse biological processes such as digestion, wound healing, reproduction, and host-pathogen interactions. Although these enzymes vary widely at the sequence, structural, and even functional levels, all members require a metal ion for catalytic activity [69]. More than 30% of all peptidase clusters in our L. sericata transcriptome database (202 of 577 clusters) were found to encode metallopeptidases, which were further assigned to 9 clans, 20 families, and 53 peptidase species (Table 5). The metallopeptidases are therefore the second largest group of peptidases in the L. sericata transcriptome and the most diverse in terms of the number of families. Although the variability and abundance of metallopeptidases in L. sericata indicate their importance, their roles are not well understood and few studies have addressed specific biological activities. A metallopeptidase with exopeptidase characteristics and a pH optimum of 8 was detected in L. sericata MEPs […]

Family M1.
[…] [109]. This family mostly comprises membrane-bound or cytosolic exopeptidases that remove the N-terminus of their substrates. However, the specificity of the S1 subsite varies considerably, which allows this family to be involved in many different biological processes [110]. Insect M1 peptidases are mainly expressed in the gut, where they play important intermediate roles in protein digestion [111] as well as host-pathogen interactions. Membrane-bound aminopeptidases in the gut are receptors for Bacillus thuringiensis toxins in several insect species [112][113][114]. Aminopeptidases have also been detected in other insect tissues, such as the fat body [115], salivary glands [116], and Malpighian tubules [116]. Although their interactions with B. thuringiensis toxins have been confirmed, their endogenous role is unclear [116]. Aminopeptidase N in the hemocoel plays an important role in the postembryonic development of the pest moth Achaea janata [117]. We identified 33 clusters encoding 8 peptidase species (Table 5) and 6 of them were predominantly expressed in the larval gut (Additional file 4).

Family M2.
Family M2 contains angiotensin converting enzyme (ACE), a peptidyl-dipeptidase that removes dipeptides from the C-terminus of angiotensin. ACE was originally identified in mammals, where it regulates vascular homeostasis [69]. The first insect ACE was found in M. domestica [118] and several ACE paralogs have been identified in every insect genome investigated thus far [119]. Insect ACE cleaves peptides with roles in development [119,120], reproduction [121], and immunity [122,123]. Recently, ACE was shown to be involved in aphid-plant interactions by modulating the feeding behavior and survival of aphids on plants [124]. Six ACE paralogs were identified in the D. melanogaster genome, but only Ance and Acer are active enzymes [125]. These enzymes have distinct tissue localization and substrate profiles, but their exact role is unclear. Ance is expressed mostly in the gut and around the reproductive organs, suggesting a role in processing peptides in gut muscle cells [126] and during spermatogenesis [121]. In contrast, Acer was exclusively found in developing heart cells [127]. We identified 26 clusters belonging to the M2 family, 16 of which were assigned to the Ance peptidyl-dipeptidase species (Table 5) and were predominantly expressed in the larval gut (Additional file 4). Based on this localization, we speculate that L. sericata Ance plays a similar role to its D. melanogaster ortholog. Interestingly, D. melanogaster Ance was shown to hydrolyze the two important bioactive peptides angiotensin I and bradykinin [119], which are the major substrates of mammalian ACE. It would be interesting to see whether L. sericata Ance can also cleave these substrates, which would suggest a potential endogenous role in hormonal signaling.

Family M3.
The L. sericata transcriptome was shown to contain mitochondrial intermediate peptidase and thimet oligopeptidase, which were expressed similarly in all the tissues we sampled (Additional file 4). Both enzymes are intracellular endopeptidases. Mitochondrial intermediate peptidase processes mitochondrial protein precursors during their import into the mitochondria [128], whereas thimet oligopeptidase degrades small peptides (5-53 residues) with broad specificity and plays an important role in antigen presentation [129].

Family M8.
Two L. sericata clusters belong to family M8, which is typified by leishmanolysin, an important virulence factor found in Leishmania parasites [130]. Leishmanolysin is a membrane-bound peptidase which degrades extracellular matrix proteins, thus enabling parasite migration [131]. Furthermore, a D. melanogaster M8 metallopeptidase was found to be involved in cell migration during embryogenesis and coordinated mitotic progression [132].

Family M10.
As their name indicates, matrix metalloproteinases (MMPs) play important roles in extracellular matrix remodeling and turnover. Aberrant MMP activity is associated with many forms of cancer, making them medically relevant [137]. Most MMPs are oncogenic, that is, higher activity promotes cancer, but some (including MMP3 and MMP8) have the opposite effect [138]. It is difficult to determine their precise individual roles because there are 24 human MMPs with overlapping expression profiles and activities, but insects could be used as a simplified model to probe their functions in more detail. Only two D. melanogaster MMPs have been described [139,140], as well as three from the red flour beetle Tribolium castaneum [141] and one from the greater wax moth Galleria mellonella [142]. All insect MMPs play important physiological roles and some also promote tumor progression, suggesting they have similar functions to their human counterparts [143]. We identified only four clusters representing L. sericata MMPs, which were assigned to two peptidase species (Table 5). These enzymes were generally expressed at low levels but were slightly upregulated in the larval gut (Additional file 4). The role of these enzymes remains unknown and further studies are required to clarify their physiological functions and whether L. sericata MMPs contribute to the degradation of extracellular matrix proteins in human wounds.

Family M12.
Family M12 comprises two subfamilies, namely, subfamily M12a, which is typified by astacin, and subfamily M12b, which is typified by adamalysin. Astacin is an endopeptidase, originally identified in the crayfish Astacus astacus, which is probably involved in digestion [69]. Hundreds of astacins have been identified in many different species, but no examples have yet been identified in plants or fungi [144]. In addition to digestion, astacins may also play roles in embryogenesis and extracellular protein remodeling [145]. Adamalysins are membrane-bound proteins with disintegrin and metallopeptidase domains. They have a broad substrate range and are therefore involved in many important physiological processes, such as protein shedding, development, and spermatogenesis [146]. Adamalysins are also known to facilitate cell signaling and have been implicated in carcinogenesis, making them medically relevant [147]. We identified 13 L. sericata clusters representing subfamily M12a and another 13 clusters representing subfamily M12b. Only one cluster (LST_LS007850) was mainly expressed in the larval gut, indicating a potential role in digestion, whereas the others showed diverse tissue-specific expression profiles and their roles remain unclear.

Family M13.
Neprilysin and endothelin converting enzyme (ECE) are the two best characterized members of metallopeptidase family M13 in mammals. Neprilysin is involved in biological processes such as reproduction and the modulation of neuronal activity and blood pressure, whereas ECEs are responsible for the final step in the synthesis of endothelins, which are potent vasoconstrictors [69]. Insect family M13 metallopeptidases are membrane-bound peptidases with a broad substrate range and tissue distribution [125]. The precise biological roles of these enzymes in insects are still unclear, but they are associated with immunity to bacteria, fungi, and protozoa [122,148], metamorphosis [149], reproduction [150], and neuropeptide metabolism [151]. We identified 34 clusters coding for M13 peptidases in L. sericata and they were predominantly expressed in the larval body following the removal of the gut, crop, and salivary glands (Additional file 4). Among the 34 clusters, 15 were further assigned to 8 peptidase species, whereas 13 remained unassigned and 6 represented nonpeptidase homologs (Table 5).

Family M14.
Most family M14 enzymes are carboxypeptidases that remove a single amino acid residue from the C-terminus of polypeptides. Carboxypeptidases are required for digestion and are widely distributed among insects [152], but they also process bioactive peptides (carboxypeptidase E) and hydrolyze bacterial cell walls (γ-D-glutamyl-(L)-meso-diaminopimelate peptidase I) [69].
Recently, a partial L. sericata sequence encoding an M14 metallopeptidase was found to be upregulated by an immune challenge [18]. We identified 33 clusters representing M14 family metallopeptidases that were differentially expressed among the L. sericata tissues we tested (Additional file 4). These clusters were assigned to 9 peptidase species, whereas 14 remained unassigned and one was shown to represent a nonpeptidase homolog (Table 5). The previously identified M14 peptidase [GenBank: FG360509] was found to be homologous to cluster LST_LS004029, which is strongly expressed in the gut. The localization of this enzyme in the gut and its induction in response to an immune challenge suggest that it contributes to the elimination of ingested bacteria as previously described for L. sericata larvae [23].
3.7.9. Family M16. Family M16 can be divided into three subfamilies: M16A comprising oligopeptidases such as insulysin, nardilysin and pitrilysin, M16B which includes mitochondrial processing peptidase, and M16C which includes eupitrilysin [69]. We identified four M16A clusters and two peptidase species with pitrilysin-like characteristics. Pitrilysin is an endopeptidase originally found in Escherichia coli which is homologous to human insulin-degrading enzyme [153]. We also identified five M16B clusters and three peptidase species similar to mitochondrial processing peptidase, which cleaves the N-terminal signals of mitochondrial proteins during their import from the cytosol [69]. We also identified one M16C cluster representing one peptidase species similar to eupitrilysin.

Family M17.
Leucyl aminopeptidase (LAP) is a cocatalytic peptidase (i.e., it requires two metal ions for activity) with diverse biological roles [154]. We identified four clusters and two peptidase species similar to LAP, with the strongest expression in gut tissues (Additional file 4). LAPs were previously identified in the digestive organs of blood-feeding parasites including ticks [155], schistosomes [156], and Plasmodium spp. [157] and were found to be involved in the final stage of hemoglobin digestion. Because hemoglobin could also represent part of the L. sericata diet, we can speculate that L. sericata LAPs are similarly required for hemoglobin digestion.

Families M19 and M50.
Families M19 and M50 each comprise strictly membrane-bound enzymes. The family M19 dipeptidases degrade extracellular glutathione or inactivate leukotriene D4, whereas the family M50 enzymes regulate gene expression by processing different transcription factors [69]. We identified one cluster coding for a family M19 nonpeptidase homolog and one representing a family M50 S2P peptidase. The latter is a strongly hydrophobic peptidase found on the endoplasmic reticulum membrane. D. melanogaster S2P (ds2p) is required to cleave the sterol regulatory element binding protein (SREBP) and thus helps to regulate lipid biosynthesis [158].
3.7.12. Families M20 and M28. Families M20 and M28 comprise divergent cocatalytic exopeptidases. Family M20 contains only carboxypeptidases, whereas family M28 includes both carboxypeptidases and aminopeptidases. We identified one cluster in family M20, which was tentatively identified as peptidase T and one cluster tentatively identified as a homolog of D. melanogaster putative protein CG10062. Six unassigned clusters were also identified in family M28.

Family M24.
Members of family M24 are mostly intracellular, cocatalytic exopeptidases characterized by the so-called pita-bread fold [159], which have been found in every genome sequence published thus far [109]. They are involved in many fundamental biological processes, including the removal of N-terminal methionine residues from nascent polypeptides (methionyl aminopeptidase), intracellular protein turnover, and collagen metabolism (Xaa-Pro dipeptidase). They are also involved in angiogenesis and their specific inhibitors are therefore sought as potential anticancer drugs [160]. We identified nine clusters representing methionyl aminopeptidases and one Xaa-Pro dipeptidase. Clusters LST LS003866, LST LS017277, and LST LS003028 were predominantly expressed in the larval gut, whereas the other M24 family metallopeptidases were expressed at similar levels in all the tissues we investigated (Additional file 4).

Families M48 and M79.
The members of families M48 and M79 are membrane-bound metallopeptidases involved in the release of tripeptides from Saccharomyces cerevisiae mating factor [161] and the Ras oncoprotein [162], to facilitate membrane attachment. Both families are medically relevant because of their ability to regulate the function of Ras, which is involved in many forms of cancer. We identified two clusters in one peptidase species representing family M48 and one cluster in one peptidase species representing family M79. All clusters were expressed at similar levels in all dissected tissues.

Family M49.
Dipeptidyl-aminopeptidase III (DPP-3) is an exopeptidase that may be involved in the metabolism of angiotensin peptides and enkephalin in mammals [163]. Insect orthologs of DPP-3 were purified from the foregut membrane of the cockroach Blaberus craniifer [164] and from adult D. melanogaster [165]. Purified DPP-3 hydrolyzed an insect neuropeptide (proctolin), suggesting a role in neuropeptide signaling. We identified two clusters and one peptidase species related to DPP-3, which were expressed at similar levels in all the L. sericata tissues we tested.

Family M67.
Family M67 metallopeptidases are responsible for the removal of ubiquitin from ubiquitinylated proteins prior to their degradation in the proteasome. We identified one cluster and one peptidase species representing family M67 expressed at similar levels in all the L. sericata tissues we tested.

Threonine Peptidases.
Threonine peptidases were discovered in 1995 in archaean proteasomes [166]. They are N-terminal nucleophile peptidases belonging to clan PB and can be divided into three families, namely, T1, T2, and T3. The T1 family comprises peptidases of the proteasome and related compound peptidases. The proteasome plays a central role in intracellular protein turnover and is a large supramolecular complex with many subunits [167]. The T2 family comprises the aspartyl glucosylaminases, which are necessary for the degradation of asparagine-linked glycoproteins [168]. The T3 family comprises the γ-glutamyltransferases, which play a key role in glutathione metabolism [69]. Among 25 L. sericata clusters identified as threonine peptidases, 21 clusters and 14 peptidase species represented family T1, whereas 4 clusters and 2 peptidase species represented family T2 (Table 6). All 25 clusters were expressed in all the L. sericata tissues we investigated and the expression levels were universally low (Additional file 5).
3.9. Serine Peptidases. Serine peptidases require a serine residue for their catalytic activity and represent one of the most abundant and functionally diverse groups of enzymes. Serine peptidases are involved in a broad range of biological processes including digestion, development, immunity, and blood coagulation [169]. MEROPS database v9.12 [62] lists 45 families in 15 clans as well as a further 7 unassigned families. We found that serine peptidases are the largest group of peptidases in the L. sericata transcriptome. We identified more than 800 contigs in 270 clusters, which were assigned to 8 clans, 12 families, and 86 peptidase species (Table 7). These clusters showed a number of distinct tissue-specific expression profiles (Additional file 6).
3.9.1. Family S1. Clan PA family S1 comprises endopeptidases containing the catalytic triad His-Asp-Ser, and this was the largest peptidase family we found in the L. sericata transcriptome. Most S1 peptidases possess an N-terminal signal peptide and are synthesized as propeptides that must be cleaved to generate the active form. S1 peptidases are usually soluble, secreted enzymes, but membrane-bound and inactive homologs have also been described [69]. Many S1 peptidases have been identified in insects, where their roles include digestion [111], immunity [170], wound responses [171], and development [172]. S1 serine peptidases from L. sericata maggots have been associated with several of the beneficial effects of MEPs including blood coagulation [45], biofilm eradication [33], and wound debridement [15]. Although serine peptidases play an important role in MDT, only a small number of complete and partial L. sericata sequences representing these enzymes have been published thus far. We detected 230 clusters representing S1 peptidases of subfamily S1A, which is typified by chymotrypsin and trypsin. We assigned 216 clusters to 62 peptidase species, whereas 14 clusters represented nonpeptidase homologs (Table 7). Interestingly, only 21 of the 62 species have already been provisionally identified and associated with specific functions, whereas the remaining 41 putative peptidases have not been characterized. Among the identified peptidases, we detected 39 clusters in 7 peptidase species encoding trypsin-like peptidases (S01.110, S01.116, S01.117, S01.130, S01.A83, S01.A87, and S01.A91). These were mainly expressed in the larval gut (Additional file 6) and probably function as digestive enzymes as reported for other insect species [111]. [173].
TmSPE peptidase (S01.507) [174], Persephone (S01.421) [175], Grass (S01.502), and Spirit (S01.B27) [176] facilitate the activation of Toll pathway signaling, which triggers the synthesis of antimicrobial peptides in response to fungi and Gram-negative bacteria [177]. We also identified ovochymase (S01.024), which was discovered in Xenopus laevis eggs and may play a role in fertilization or early development [178], the Easter peptidase (S01.201) required for dorsoventral patterning in D. melanogaster embryos [179], the Tequila peptidase (S01.461) that mediates long term memory formation in D. melanogaster [180], the proapoptotic DmHtrA2-type mitochondrial peptidase (S01.476) [181], and testis-specific protein 50 (S01.993), which is necessary for spermatogenesis in mammals and is upregulated in breast cancer [182]. All the previously identified L. sericata serine peptidase genes were also identified in the transcriptome dataset (Table 8). Interestingly, only four of these previously described genes could be assigned to peptidase species with a known function, whereas most were identified based on homology to putative proteins in D. melanogaster (Table 8).
We also found that although many of the enzymes were detected in MEPs, the corresponding mRNA was predominantly expressed in the larval gut (Additional file 6). The same phenomenon was confirmed for the Jonah-like peptidase, where a high level of Jonah mRNA expression was observed in the gut but the native enzyme was only detected in MEPs [17]. These results indicate that peptidases in MEPs are not produced exclusively by the salivary glands but rather by a combination of the salivary glands, gut, and crop. Although further studies are required to confirm this hypothesis, we suggest that regurgitation and/or vomiting in dipteran species [183] contributes to the production of beneficial MEP molecules in L. sericata larvae.
Cluster LST LS005873 was tentatively identified as chymotrypsin m-type 2 (S01.168) and this is identical to the previously described L. sericata chymotrypsin I [GenBank: CAS92770]. A recombinant form of this enzyme was shown to degrade wound eschar ex vivo [15] and to degrade microbial surface components recognizing adhesive matrix molecules from the slough [184]. As shown in Additional file 6, we identified seven further clusters representing the same peptidase species (S01.168). Another recombinant serine peptidase known as sericase [GenBank: AAA17384] was shown to enhance fibrinolysis [44]. Sericase was found to be identical to L. sericata trypsin-like serine peptidase, which was proposed to facilitate wound debridement [185]. We identified three clusters (LST LS007476, LST LS007613, and LST LS010750) homologous to sericase, which represent one peptidase species provisionally identified as D. melanogaster putative protein CG7542 (S01.B07). Moreover, the most prominent cluster (LST LS007476) was also found to be homologous to a previously identified serine peptidase [GenBank: FG360529] which is induced at the transcriptional level following an immune challenge [18]. Our data indicate that sericase is probably involved in several MEP functions including fibrinolysis, debridement, and other immune responses.
Debrilase [GenBank: AJN88395] is a serine peptidase known to play a role in L. sericata MDT. Debrilase is homologous to cluster LST LS015273, which along with another eight clusters represents one peptidase species provisionally identified as D. melanogaster putative protein CG17571 (S01.A85). All the clusters share the same tissue-specific expression profile with strong upregulation in the gut (Additional file 6). Recently, L. sericata MEPs were shown to reduce the clotting time of human plasma, and this phenomenon was attributed to a serine peptidase activity [45]. Recombinant Jonah-like chymotrypsin was confirmed to reduce the clotting time of human plasma and to degrade certain extracellular matrix proteins [17]. We found 10 clusters representing one peptidase species, tentatively identified as Jonah 65Aiv (S01.B05). These clusters were predominantly expressed in the gut (Additional file 6). Interestingly, cluster LST LS015269 was found to be homologous to an L. sericata serine peptidase [GenBank: FG360505] that is upregulated in immune-challenged larvae, thus indicating a role in immunity [18].
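The cluster and species counts quoted in the text can be cross-checked with simple arithmetic. The following is a minimal sketch in Python, not part of the original analysis: it uses only totals and subgroup sizes stated in the text, and the helper name `check_partition` is introduced here purely for illustration.

```python
# Counts as stated in the text; each total should equal the sum of its
# reported subgroups. The dictionary layout is an illustrative assumption.
reported = {
    "S1A clusters": (230, {"assigned to peptidase species": 216,
                           "nonpeptidase homologs": 14}),
    "S1A peptidase species": (62, {"provisionally identified": 21,
                                   "uncharacterized": 41}),
    "M13 clusters": (34, {"assigned": 15, "unassigned": 13,
                          "nonpeptidase homologs": 6}),
    "threonine peptidase clusters": (25, {"family T1": 21, "family T2": 4}),
}


def check_partition(total, parts):
    """Return True when the subgroup counts add up to the stated total."""
    return sum(parts.values()) == total


results = {label: check_partition(total, parts)
           for label, (total, parts) in reported.items()}

for label, ok in results.items():
    print(f"{label}: {'consistent' if ok else 'MISMATCH'}")
```

A check of this kind only confirms that the reported subgroups are internally consistent; it says nothing about whether individual clusters were assigned correctly.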

Family S8.
Family S8 comprises two subfamilies of enzymes. Subfamily S8a is typified by the endopeptidase subtilisin, originally identified in Bacillus subtilis [69], as well as tripeptidyl-peptidase II, an exopeptidase involved in general intracellular protein turnover [69]. Subfamily S8b is typified by kexin (whose mammalian homolog is known as furin), which processes numerous proteins ranging from growth factors and chemokines to extracellular matrix proteins [186], and is therefore associated with diseases such as Alzheimer's disease, atherosclerosis, and cancer [187]. We identified three clusters of L. sericata subtilisin-like enzymes, two clusters similar to tripeptidyl-peptidase II, and two clusters of furin-like enzymes (Table 7).

3.9.3. Family S9. The family S9 prolyl oligopeptidases are intracellular enzymes that strictly cleave substrates containing proline residues, and they are thought to process neuropeptides in humans [188]. Interestingly, a prolyl oligopeptidase was recently identified in the human parasite Schistosoma mansoni. Although the enzyme is not secreted by the parasite, it cleaves the human vasoregulatory peptides bradykinin and angiotensin I in vitro, thus potentially modulating or dysregulating homeostasis in its host [189]. We identified 9 clusters and 4 peptidase species representing L. sericata prolyl oligopeptidases. The clusters showed different tissue-specific expression profiles but three of them (LST LS016452, LST LS001966, and LST LS016700) were predominantly expressed in the gut and/or the salivary glands (Additional file 6). This specific distribution in L. sericata tissues associated with the production of MEPs indicates a potential role in wound homeostasis but more detailed experiments are required to confirm this hypothesis.
3.9.4. Family S10. Family S10 comprises lysosomal carboxypeptidases with predominantly regulatory functions, although hemipteran S10 peptidases were recently shown to be involved in the digestion of food [152]. Among four family S10 clusters identified in L. sericata, two (LST LS003337 and LST LS015778) were strongly upregulated in the gut, whereas the others (LST LS004162 and LST LS016840) were expressed at similar levels in the L. sericata tissues we investigated. The clusters induced in the gut were assigned to peptidase species without a known function, whereas the other two were annotated as vitellogenic carboxypeptidase-like proteins, suggesting a role in vitellogenesis.
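Descriptions such as "strongly upregulated in the gut" versus "expressed at similar levels" imply a rule for calling a tissue-specific profile. The text does not state the criterion used, so the following Python sketch is only one possible rule: the two-fold threshold, the function name `dominant_tissue`, and all expression values are hypothetical and chosen for illustration.

```python
def dominant_tissue(expression, min_fold=2.0):
    """Return the tissue whose expression is at least `min_fold` times that
    of every other tissue, or None if no tissue dominates.

    `expression` maps tissue name -> normalized expression value.
    The two-fold default is an assumed threshold, not taken from the study.
    """
    ranked = sorted(expression.items(), key=lambda item: item[1], reverse=True)
    (top_tissue, top_value), (_, runner_up) = ranked[0], ranked[1]
    if top_value > 0 and top_value >= min_fold * runner_up:
        return top_tissue
    return None


# Hypothetical values for two made-up clusters (not data from the study).
cluster_a = {"gut": 480.0, "crop": 35.0,
             "salivary glands": 20.0, "rest of body": 15.0}
cluster_b = {"gut": 52.0, "crop": 47.0,
             "salivary glands": 55.0, "rest of body": 50.0}

print(dominant_tissue(cluster_a))  # gut
print(dominant_tissue(cluster_b))  # None
```

A stricter or looser threshold would change which clusters count as tissue-specific, so any such rule should be reported alongside the expression profiles.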
3.9.5. Families S14 and S41. Family S14 comprises cytosolic ATP-dependent Clp endopeptidases and their homologs. Clp peptidases together with their ATP-binding subunits create an oligomeric complex of 20-26 subunits [69] that mediates protein quality control and regulatory degradation [190]. Family S41 comprises endopeptidases that are involved in the degradation of incorrectly synthesized proteins. They possess the catalytic tetrad Ser-His-Ser-Glu, which is unusual for serine peptidases, and neither the position of the active site residues nor the residues themselves are conserved [69]. We found that the L. sericata family S14 (one cluster in one peptidase species) and family S41 (three clusters in two peptidase species) endopeptidases were similarly expressed in all the tissues we analyzed and are likely to be involved in the regulation of protein synthesis.
3.9.6. Family S16. The family S16 enzyme Lon is a bacterial ATP-dependent endopeptidase containing a peptidase domain and an ATP-binding domain in a single subunit. Similar enzymes are found in many other organisms [109] where they facilitate the degradation of unfolded proteins [191]. We identified three clusters assigned to two peptidase species, which were present in all L. sericata tissues.
3.9.7. Family S26. Family S26 consists of ubiquitous, membrane-bound enzymes with a catalytic dyad, which are involved in the cleavage of signal peptides, thus facilitating the secretion of proteins [69]. We identified two L. sericata family S26 clusters and two peptidase species representing signal peptidases and another cluster that remained unassigned (Table 7). Cluster LST LS009731 was strongly expressed in the salivary glands, which are known to secrete large amounts of protein, thus indicating a role in protein secretion.
3.9.8. Family S28. Family S28 comprises the lysosomal Pro-Xaa carboxypeptidases, which are lysosome-specific exopeptidases found solely in eukaryotes, featuring an unusual selectivity for the cleavage of Pro-Xaa bonds. In humans, such enzymes inactivate angiotensin II and activate plasma kallikrein [192]. We identified three L. sericata family S28 clusters assigned to two peptidase species that were upregulated in the crop and larval body samples (Additional file 6). The precise role of these enzymes remains unclear although they may contribute to the procoagulation activity of MEPs as recently described [45].
3.9.9. Family S49. Only one cluster in one peptidase species was assigned to family S49, which comprises the signal peptidases required for intracellular protein processing and the regulation of protein export [193].
3.9.10. Family S51. Family S51 is typified by aspartyl dipeptidase, an exopeptidase originally identified in Salmonella typhimurium [194] that hydrolyzes α-aspartyl bonds. The crystal structure of the S. typhimurium aspartyl dipeptidase has been solved, revealing an unusual catalytic triad with Ser and His as predicted but Glu instead of Asp [195]. The biological role of this enzyme is not clear, but it seems to be involved in the production of nutritional amino acids [69]. Two clusters belonging to one peptidase species were identified in the L. sericata transcriptome.
3.9.11. Family S54. Family S54 is typified by Rhomboid-1, an intramembrane enzyme identified in D. melanogaster [196] that plays an important role in embryonic development by cleaving the Spitz protein and thus activating the epidermal growth factor receptor [197]. We identified three clusters in three peptidase species, which were expressed in all L. sericata tissues.

qRT-PCR Verification of Gene Expression.
To experimentally verify the results from digital gene expression analysis, we performed qRT-PCR analysis of one peptidase gene from each peptidase class (aspartic, cysteine, metallo, threonine, and serine). These genes (Table 9) code for peptidases with various physiological functions and show different tissue expression profiles. All the tested genes showed expression profiles similar to those obtained by digital gene expression analysis (Figure 6).
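One way to compare qRT-PCR results with digital gene expression profiles is to convert Ct values into relative expression and then test whether the tissues rank in the same order in both datasets. The text does not state which quantification or comparison method was used, so the Python sketch below rests on assumptions: it applies the widely used 2^-ΔΔCt calculation (assuming roughly 100% primer efficiency) and a Spearman rank correlation, and all Ct and expression values are hypothetical.

```python
from statistics import mean


def relative_expression(ct_target, ct_reference, ct_target_cal, ct_reference_cal):
    """Fold change by the 2^-ddCt method relative to a calibrator sample."""
    delta_sample = ct_target - ct_reference
    delta_calibrator = ct_target_cal - ct_reference_cal
    return 2.0 ** -(delta_sample - delta_calibrator)


def ranks(values):
    """Rank values from 1..n, assigning tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman rank correlation (Pearson correlation of the ranks)."""
    rx, ry = ranks(x), ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical data for one gene: (Ct of target, Ct of reference gene).
tissues = ["gut", "crop", "salivary glands", "rest of body"]
ct = {"gut": (20.0, 18.0), "crop": (24.0, 18.0),
      "salivary glands": (25.0, 18.0), "rest of body": (26.0, 18.0)}
digital = {"gut": 950.0, "crop": 70.0,
           "salivary glands": 35.0, "rest of body": 12.0}

calibrator = ct["rest of body"]  # calibrator tissue chosen arbitrarily
qpcr = [relative_expression(*ct[t], *calibrator) for t in tissues]
rho = spearman(qpcr, [digital[t] for t in tissues])
print(dict(zip(tissues, qpcr)), rho)
```

A rank-based comparison is used here because qRT-PCR fold changes and sequencing-derived counts are on different scales, so agreement in tissue order is more informative than agreement in absolute values.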

Conclusion
The purpose of this study was to provide an overview of the distribution of proteolytic enzymes in L. sericata, focusing on the tissue-specific expression profiles and potential functions as the basis for further, more detailed studies of individual peptidases. We identified 577 clusters representing five classes of proteolytic enzymes (aspartic, cysteine, threonine, serine, and metallopeptidases), which were further assigned to 26 clans, 48 families, and 185 peptidase species with diverse tissue-specific patterns of distribution. We identified all previously described therapeutic peptidases and found that most of them were expressed most strongly in the larval gut, thus indicating that the larval gut contributes to the production of beneficial enzymes found in the MEPs. Although the majority of the enzymes we identified were serine peptidases, most of them were novel putative peptidases whose function is unclear, but whose distinct tissue-specific expression profiles indicate an important role in MEP activity. Several peptidases with the most intriguing expression profiles have been prepared as synthetic genes, allowing the functional analysis of the corresponding recombinant peptidases.