Proteolysis in hyperthermophilic microorganisms

Proteases are found in every cell, where they recognize and break down unneeded or abnormal polypeptides or peptide-based nutrients within or outside the cell. Genome sequence data can be used to compare proteolytic enzyme inventories of different organisms as they relate to physiological needs for protein modification and hydrolysis. In this review, we exploit genome sequence data to compare hyperthermophilic microorganisms from the euryarchaeotal genus Pyrococcus, the crenarchaeote Sulfolobus solfataricus, and the bacterium Thermotoga maritima. An overview of the proteases in these organisms is given based on those proteases that have been characterized and on putative proteases that have been identified from genomic sequences, but have yet to be characterized. The analysis revealed both similarities and differences in the mechanisms utilized for proteolysis by each of these hyperthermophiles and indicated how these mechanisms relate to proteolysis in less thermophilic cells and organisms.


Introduction
Proteases are critical to the maintenance of cellular function. They hydrolyze both external and internal nutrient sources and recognize and break down unneeded or abnormal polypeptides, the latter produced as a result of environmental stress, mutation or errors in biosynthetic processes (Tomoyasu et al. 2001). Cells have an array of proteases for processing proteins and polypeptides (e.g., Allison and Macfarlane 1990, Guedon et al. 2001) and for maintaining metabolic function under abnormal conditions (Arsene et al. 2000). Proteases range from simple monomeric hydrolases to complex, multi-subunit structures with molecular masses on the order of 1 MDa. Some proteases have a high functional and structural complexity, particularly those that are ATP-dependent, e.g., Lon, ClpXP, ClpYQ, FtsH and the proteasome (Gottesman 1996, Maupin-Furlow et al. 2000), but whether simple or complex, all proteases must be precisely regulated to avoid destruction of the cell's metabolic machinery.
Here, we exploit genomic sequence data, in conjunction with physiological, biochemical and biophysical information, to investigate relationships among proteolytic inventories of hyperthermophilic microorganisms (capable of growth at 80 °C) of the euryarchaeotal genus Pyrococcus, the crenarchaeote Sulfolobus solfataricus and the bacterium Thermotoga maritima. Thermotoga maritima, initially isolated from Vulcano, Italy, has an optimum growth temperature of 80 °C and is a fermentative anaerobe that reduces sulfur facultatively and prefers simple and complex sugars as growth substrates (Huber et al. 1986). Sulfolobus solfataricus, initially isolated from a solfatara field near Naples, Italy (Zillig et al. 1980), is an aerobe that grows at pH values as low as 2.0, with an optimal growth temperature of 80 °C (Brock et al. 1972). This archaeon grows chemolithotrophically by oxidizing metal cations (Fe2+) or sulfur as well as heterotrophically on simple sugars. Pyrococcus is a marine hyperthermophilic genus that includes species with optimal growth temperatures between 96 and 100 °C. Pyrococcus spp. have been isolated from deep-sea hydrothermal vent systems, such as those found along the North Fiji basin (P. abyssi, Erauso et al. 1993) and Okinawa Trough (P. horikoshii, Gonzalez et al. 1998), as well as from shallow marine environments, such as those found around Vulcano Island, Italy (P. furiosus, Fiala and Stetter 1986). Pyrococcus spp. are strict anaerobes that reduce sulfur facultatively and grow heterotrophically by fermentation of proteinaceous compounds and sometimes simple sugars and α-keto acids such as pyruvate.

Inferring protease inventory from genomic sequences
Initial efforts to assess the extent and variety of proteases in hyperthermophiles by biochemical methods significantly underestimated this biocatalytic feature. For example, Connaris et al. (1991) and Blumentals et al. (1990) reported that gelatin-based zymograms of cell-free extracts from P. furiosus revealed the presence of up to 13 clearing zones, some of which were later attributed to multiple versions of a single protease (Halio et al. 1996, Chang et al. 2001). Similar experiments with T. maritima revealed an even more limited set of proteases than observed in P. furiosus (Hicks et al. 1998). Genome sequence data, however, indicate that the proteolytic genotypes of these organisms are more expansive than can be inferred from zymogram analyses. Tables 1 and 2 show the confirmed and putative protease-related genes in the genomes of P. furiosus (http://comb5-156.umbi.umd.edu/genemate/), P. abyssi, P. horikoshii (Kawarabayasi et al. 1998), S. solfataricus (She et al. 2001) and T. maritima (Nelson et al. 1999). This information was also used to examine other hyperthermophilic and mesophilic organisms (for more information see http://www.che.ncsu.edu/extremophiles/). Protease-related genes were identified through BLAST analyses of completely sequenced prokaryotic genomes, the online database at the Kyoto Encyclopedia of Genes and Genomes (http://www.genome.ad.jp/kegg/), and the AlignAce output files (http://arep.med.harvard.edu/microbial_motifs/). Putative proteases/peptidases were analyzed for known motifs with ScanProsite (http://ca.expasy.org/tools/scnpsit1.html), for the presence of signal peptides using SignalP (http://www.cbs.Dtu.dk/services/SignalP/) and for genomic organization using the STRING analysis method (Snel et al. 2000), to detect possible protease-gene relationships. Homology was confirmed when amino acid sequence identities were at least 25% over 50% or more of the protein. Genomic analysis showed that Pyrococcus spp., S. solfataricus and T. maritima have numerous protease homologs to putative and confirmed proteases in other archaea and bacteria, although there are many differences among these organisms in protease inventory. This is true even when comparing the protease inventory of the three species within the genus Pyrococcus.
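The homology criterion stated above (at least 25% identity over at least 50% of the protein) can be written as a simple filter over pairwise alignment results. The following is a minimal sketch, not the procedure used to build Tables 1 and 2: the `Hit` record, the hit values and the choice to measure coverage against the query length are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """One pairwise alignment between a query protein and a candidate homolog."""
    query_id: str
    subject_id: str
    percent_identity: float  # identity over the aligned region, 0-100
    aligned_length: int      # number of query residues covered by the alignment
    query_length: int        # full length of the query protein


def is_homolog(hit: Hit, min_identity: float = 25.0, min_coverage: float = 0.5) -> bool:
    """Threshold from the text: >= 25% identity over >= 50% of the protein.

    Assumption: coverage is taken relative to the query length; the text does
    not say which of the two proteins defines "the protein".
    """
    coverage = hit.aligned_length / hit.query_length
    return hit.percent_identity >= min_identity and coverage >= min_coverage


# Hypothetical hits, for illustration only (not data from this review).
hits = [
    Hit("query_A", "subject_1", 31.0, 420, 700),  # 60% coverage, 31% identity
    Hit("query_A", "subject_2", 45.0, 150, 700),  # high identity, short alignment
    Hit("query_B", "subject_3", 22.0, 300, 320),  # long alignment, low identity
]

accepted = [h.subject_id for h in hits if is_homolog(h)]
print(accepted)
```

Only the first hit satisfies both conditions. Requiring coverage as well as identity guards against accepting a short, well-conserved domain as evidence that the whole protein is a protease homolog.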

The ATP-dependent proteases in hyperthermophiles
All prokaryotic genomes sequenced to date, including hyperthermophiles, indicate the presence of several ATP-dependent proteases, although there is some variation in the roster of these enzymes among organisms. These proteases are typically implicated in protein and peptide turnover, and stress response (for reviews see Gottesman 1996, 1999, Porankiewicz et al. 1999, Schmidt et al. 1999). Although it is unclear how particular functional or abnormal proteins are selected for proteolytic processing by ATP-dependent proteases, it is a key issue in understanding cellular function.

Table footnote: The Lon protease in Pyrococcus lacks the ATP-binding domain, but was placed in the ATP-dependent protease section because there is no experimental evidence aside from the sequence to suggest otherwise.

The Lon protease
Lon protease is unique among ATP-dependent proteases in that it is not based on the assemblage of small subunits (i.e., 15-25 kDa) into stacked rings that interact with separate ATPases. Rather, Lon protease is a homotetramer, composed of 87-kDa subunits, each containing an active site serine and a single ATP-binding site (Maurizi 1992). Lon is ubiquitous across all three domains of life and seemed at one point to be the only such ATP-dependent protease. Analysis of the T. maritima genome revealed a homolog to the Escherichia coli Lon, TM1633, referred to here as LonA. However, a second open reading frame, TM1869, was also annotated as a possible Lon homolog, designated LonB, which was previously not identified. LonA has both the active site and ATP-binding site motifs typically noted in other versions of Lon. LonB contains the active site region (with the conserved serine residue), but not the ATP-binding site (Figure 1). Homologs to both LonA and LonB were noted in several bacterial genomes. However, euryarchaeal genomes appear to encode only a single LonB homolog. In Pyrococcus spp., the lon gene contains an intein (http://www.neb.com/neb/frame_tech.html), a self-splicing element functional at the protein level. Whether this impacts its regulation or the expression of the encoded protein is unknown. No Lon homolog could be identified in the S. solfataricus genome. The archaeal Lons also have one or two putative transmembrane regions, which are absent in the Eubacteria, suggesting that they may be membrane-associated.
LonA and LonB are distinct enzymes, as is readily apparent when a phylogenetic tree is constructed based on their amino acid sequences (Figure 2). Their metabolic roles and biochemical specifics, however, are unknown.
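The LonA/LonB distinction drawn above rests on whether a Lon homolog carries an ATP-binding site motif. A minimal sketch of such a check is given here, assuming the motif can be approximated by the Walker A (P-loop) consensus [AG]-x(4)-G-K-[ST]; the review does not state which motif definition was used, and both sequences below are invented fragments, not TM1633 or TM1869.

```python
import re

# Walker A (P-loop) consensus, [AG]-x(4)-G-K-[ST], written as a regular expression.
# Assumption: this stands in for the "ATP-binding site motif" mentioned in the text.
WALKER_A = re.compile(r"[AG].{4}GK[ST]")


def has_atp_binding_motif(sequence: str) -> bool:
    """Return True if the protein sequence contains a Walker A-like motif."""
    return WALKER_A.search(sequence.upper()) is not None


def classify_lon_like(sequence: str) -> str:
    """Label a sequence already known to be a Lon homolog (hypothetical helper).

    Follows the distinction in the text: LonA has the ATP-binding motif,
    LonB does not. The active-site serine is not checked here.
    """
    return "LonA-like" if has_atp_binding_motif(sequence) else "LonB-like"


# Invented toy fragments, for illustration only.
toy_with_motif = "MKTLLGPPGVGKTSLAKSIAR"
toy_without_motif = "MKTLLDPSNVEELSLAKSIAR"

print(classify_lon_like(toy_with_motif))
print(classify_lon_like(toy_without_motif))
```

A pattern match of this kind is only a first screen; a motif can be present in a non-functional context or diverged beyond the consensus, which is one reason the text treats the Pyrococcus Lon assignment as provisional.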

The FtsH (or HflB) proteinase
FtsH (or HflB) is an ATP-dependent proteinase with a zinc metalloprotease motif that is present in Bacteria, mitochondria, and chloroplasts, but not in Archaea (Schumann 1999, Langer 2000). Unlike the proteasome, Lon protease and Clp protease, which are cytoplasmic, the FtsH protein is anchored to the cytoplasmic membrane through two transmembrane regions. Among hyperthermophiles, FtsH has been located only in the bacteria Aquifex aeolicus and T. maritima (see Table 1).
Although Archaea lack FtsH, we speculate that the archaeal Lon, which has one to two putative transmembrane regions, may replace FtsH from a metabolic standpoint.

The Clp family of proteases
The Clp family occurs in both the eubacteria and eukaryotes, including hyperthermophilic bacteria, but appears to be absent in hyperthermophilic archaea. Two types of Clp proteases are known, with either ClpP or ClpQ (HslV) as the proteolytic subunit. For full proteolytic activity, ClpP and ClpQ must associate with their respective ATPase subunits. In the case of ClpP this can be either ClpX or ClpA ATPases, whereas ClpQ associates with ClpY (HslU) ATPase (Porankiewicz et al. 1999). Both the ClpP and ClpQ proteases, as well as their respective ATPase subunits ClpC (three homologs), ClpX and ClpY were identified in the genomes of T. maritima and A. aeolicus (Deckert et al. 1998). In both T. maritima and A. aeolicus, the clpP gene is preceded by a putative trigger factor. The trigger factor may be involved in the export of proteins, acting as a chaperone to keep them in an open conformation. Moreover, ClpP in the A. aeolicus genome is followed by ClpX. In T. maritima, the Clp ATPases are scattered throughout the genome and are not linked to other proteases. However, ClpC-1 in T. maritima (TM0198) is linked to radA, which is involved in DNA repair. We postulate that these genes are involved in stress response and are coordinately regulated. The genes encoding ClpQ and ClpY are linked on the T. maritima genome and are likely co-transcribed, whereas they occur separately in the A. aeolicus genome. As more is learned about gene regulation in these two hyperthermophiles, the significance of these alternative arrangements for ClpQ and ClpY should become clearer.
In E. coli, expression of lon, clpP and the ATPase subunits clpX and clpB is σ32-dependent (Bukau 1993), whereas in gram-positive Bacillus subtilis and Lactococcus lactis, the heat-shock response involves at least three classes of heat-inducible genes (Hecker et al. 1996). It has recently been found that, in gram-positive bacteria, clpP is under the control of a novel regulator, CtsR, belonging to Class III, which is σB-independent and lacks the cis-acting CIRCE operator sequence (Hecker et al. 1996). Analysis of the T. maritima genome reveals 70 putative transcriptional regulators that have moderate degrees of identity to regulators from both gram-positive and gram-negative sources (Nelson et al. 1999). Among these are homologs to the E. coli sigma E (rpoE) and the heat-shock operon repressor (HrcA) from B. subtilis (Nelson et al. 1999). However, homologs to E. coli σ32 and the gram-positive regulator CtsR could not be identified. It is possible that T. maritima uses a novel mechanism, or a regulatory pathway that is a hybrid of that in gram-positive and gram-negative organisms.

The proteasome
The 20S proteasome, or multicatalytic proteinase (MCP), is a cylindrically shaped protease found in the Archaea, Eukarya and the gram-positive actinomycetes (Bochtler et al. 1999, Barber and Ferry 2001). The lack of the proteasome in other bacteria suggests that actinomycetes acquired the protease through lateral gene transfer (Lupas et al. 1997). Although most bacteria lack a version of the proteasome, they contain the related complex ClpQY (or HslVU), which shares a similar fold and catalytic mechanism with the proteasome (Bochtler et al. 1997). The proteasome from the thermophilic archaeon Thermoplasma acidophilum yielded the first structure of the proteasome and it has since become a prototype for the three-dimensional structure and topology of the molecule (Lowe et al. 1995). Native versions of the archaeal proteasome have been isolated and characterized from Methanosarcina thermophila, Methanococcus jannaschii and P. furiosus and appear to share many structural and biochemical properties with the T. acidophilum proteasome (Maupin-Furlow and Ferry 1995, Bauer et al. 1996, Wilson et al. 2000). The 20S structure comprises four heptameric rings stacked on top of one another (Rechsteiner et al. 1993). Each ring comprises either α- or β-type subunits, arranged in the order α7β7β7α7, with a centralized hollow channel running the entire length of the complex (DeMartino and Slaughter 1999). To safeguard against unwanted protein degradation, the proteasome confines proteolytic activity to the interior region of this self-compartmentalized structure (DeMartino and Slaughter 1999, Goldberg 2000). Although the archaeal 20S proteasome functions as a discrete protease in vitro, it is not known if the 26S proteasome is the only functional form in vivo (Maupin-Furlow et al. 2000). Archaeal proteasomes contain various peptidase activities; most have only chymotrypsin-like activity, although some also have high trypsin-like or caspase-like activity. The archaeal proteasome acts in a processive manner, chopping protein substrates at multiple places to yield peptide fragments of three to 30 amino acids in length (Kisselev et al. 1998). Although the physiological role of the archaeal proteasome is unclear, inhibitor-based studies show that T. acidophilum cells can proliferate without a functional proteasome under normal growth conditions, but cannot grow without proteasome activity under heat shock conditions (Ruepp et al. 1998).

ARCHAEA ONLINE at http://archaea.ws
All archaeal genomes sequenced to date contain homologs of the 20S proteasome core structure, including members of both Crenarchaeota and Euryarchaeota (see Tables 1 and 2). The genomes of Aeropyrum pernix, Pyrococcus spp. and S. solfataricus each contain two different β subunit homologs (see Tables 1 and 2), and two different α subunits have been identified in the halophilic archaeon Haloferax volcanii (Wilson et al. 1999). In Archaea, the genes encoding the α and β proteasome subunits appear to be transcribed as a part of independent operons that have conserved gene organization, whereas the genes surrounding pan (proteasome-activating nucleotidase) do not appear to be conserved (Maupin-Furlow et al. 2000). In addition, the archaeal proteasome appears to be a part of a superoperon containing subunits of the exosome, indicating a possible functional link between RNA processing and proteolysis in this domain (Koonin et al. 2001).
It is clear that hyperthermophiles, like their mesophilic counterparts, have a full complement of ATP-dependent proteases at their disposal. However, the proteases differ between the two domains, the proteasome and Lon proteases being present in Archaea and the Lon, Clp and FtsH being present in Eubacteria. Although much is known about ATP-dependent proteases and their metabolic roles in mesophilic eubacteria, one cannot assume that they have identical roles or regulation patterns in hyperthermophilic bacteria. Moreover, little is known about the metabolic roles and regulation of the proteasome and Lon in Archaea.

ATP-independent proteases in hyperthermophiles
Most heterotrophic hyperthermophiles can grow on proteinaceous substrates as primary carbon and energy sources. Such substrates must initially be acted on by extracellular proteases, which may or may not be cell-associated. The products of extracellular hydrolysis are transported into the cell, presumably by an ABC-type transporter, where they are further broken down to individual amino acids by the concerted action of intracellular proteases and peptidases. It is thought that peptides can be oxidized to CO2 in Thermoproteus tenax, Archaeoglobus fulgidus and S. solfataricus, with sulfur, thiosulfate, sulfate, oxygen, nitrite and nitrate serving as terminal electron acceptors (Schönheit and Schäfer 1995). In species from the genera Thermococcus, Pyrococcus, Thermotoga, Desulfurococcus and Pyrodictium, peptides are likely fermented to free acids, such as acetate, isovalerate, butyrate and phenylpyruvate, generating ATP by substrate-level phosphorylation (Schäfer et al. 1993, Schönheit and Schäfer 1995). Parts of these pathways can be constructed by analysis of the genomes of these organisms, although the details of metabolic schemes involved in peptide fermentation in these organisms are unknown.
Proteases and peptidases presumably involved in the initial steps in protein and peptide utilization have been isolated from hyperthermophilic archaea, including Pyrococcus spp. (Halio et al. 1996, Voorhorst et al. 1996, Chang et al. 2001), Thermococcus stetteri (Klingeberg et al. 1995), A. pernix (Sako et al. 1997) and P. abyssi (Dib et al. 1998) (see Table 3). The only protease isolated from T. maritima thus far is a homomultimeric protease that has moderate amino acid sequence identity to bacteriocins from mesophilic bacteria (Hicks et al. 1998). Within the classical classification scheme for proteases, namely, serine, aspartic, metallo and cysteine, the majority of the enzymes characterized to date from hyperthermophiles are extracellular and belong to the serine class. Although these enzymes have been well characterized biochemically, little is known about their metabolic significance and even less about their regulation.
Proteolysis in hyperthermophilic bacteria is poorly understood, despite the availability of complete genomic sequences for T. maritima (Nelson et al. 1999) and A. aeolicus (Deckert et al. 1998). To date, only two proteases (Hicks et al. 1998, Choi et al. 1999) and a leucine aminopeptidase (Khan et al. 2000) have been characterized biochemically from hyperthermophilic bacteria. A 43-kDa serine protease was identified in Aquifex pyrophilus using a sequence tag specific for serine proteases (Choi et al. 1999). The gene encoding the protease was sequenced and found to contain a putative signal sequence. The protease was identified in the cell wall fraction of A. pyrophilus, demonstrating that the protease is expressed in the native organism and that it is exported from the cell. However, the role of the protease is unclear because Aquifex spp. appear unable to use peptides as a carbon or nitrogen source (Deckert et al. 1998). Choi et al. (1999) suggested that the protease is associated with the cellular S-layer, but whether it is involved in protein degradation or in S-layer formation is unclear. Similarly, in T. maritima, analysis of the genome identified about 35 proteases and peptidases, 12 of which may be exported (see Tables 1 and 2). However, growth of T. maritima on peptides as sole carbon and energy sources has not been reported.
Among the hyperthermophiles, proteolysis has been best studied in members of the order Thermococcales. These organisms can use peptides as sole carbon and energy sources and this is reflected in a range of proteolytic activities in their cell extracts and confirmed by genomic sequence analysis. Initial efforts to study P. furiosus indicated that the organism was highly proteolytic (Blumentals et al. 1990, Eggen et al. 1990, Connaris et al. 1991). Analysis of the P. furiosus genome has thus far revealed the presence of about 40 genes encoding proteases, protease subunits, or peptidases (Table 1). Nine of the identified proteases contain a putative signal sequence, suggesting that they are exported from the cell. A comparative genomics approach was also used to compare P. furiosus, P. abyssi and P. horikoshii in an effort to assess the proteolytic inventory within this narrow phylogenetic range of the Thermococcales (Table 1). Similar to P. furiosus, both P. abyssi and P. horikoshii are highly proteolytic, with 34 genes encoding proteolytic enzymes detected in their genome sequences. It is apparent that, despite their close phylogenetic relationship, there are distinct differences between these organisms in their respective proteolytic inventories. Pyrococcus furiosus has four proteases (pf_699579, pf_1553191, pf_1757236 and pf_1136394) and three aminopeptidases (pf_1898123, pf_1902688 and pf_1906416) that are absent from P. abyssi and P. horikoshii. There is only one protease unique to P. abyssi (CAAX prenyl protease, PAB0555), whereas no proteases yet identified are unique to P. horikoshii. Moreover, there is a clear distinction between the pyrolysin proteases in the Thermococcales. The pyrolysin from P. furiosus is only 17% identical to the pyrolysin-like proteases of P. abyssi and P. horikoshii, and 33% identical to stetterolysin from Thermococcus stetteri, whereas the pyrolysin-like proteases of P. abyssi and P. horikoshii are 69% identical. Differences in proteolytic content of Thermococcales can also be seen by zymogram analysis. For example, Figure 3 shows that the extracellular proteases expressed by P. furiosus, P. abyssi, Thermococcus profundus, Thermococcus peptonophilus and Thermotoga maritima vary when grown in the same medium at their respective temperature optima with tryptone (5 g l⁻¹) as the primary carbon and energy source.
Several extracellular (Morikawa et al. 1994, Klingeberg et al. 1995, Dib et al. 1998, Kannan et al. 2001) and membrane-associated proteases (Voorhorst et al. 1996, 1997) have been characterized biochemically from both Thermococcus and Pyrococcus spp. With the exception of the thiol protease from T. kodakaraensis, all the extracellular and membrane-associated proteases have been classified as serine proteases. The proteases range in size from 40 to 68 kDa and are monomeric, with the exception of the P. abyssi protease, which is multimeric. The serine protease from T. kodakaraensis has broad substrate specificity, cleaving at the carboxyl termini of Tyr, Phe, Leu, Gln, His, Thr, Ser and Ala (Kannan et al. 2001). The other serine proteases from Thermococcus stetteri and P. abyssi have fairly narrow substrate specificity, preferring Arg/Phe and aromatic/Leu at the P1 position, respectively (Klingeberg et al. 1995, Dib et al. 1998). Substrate specificity of the thiol protease was not determined.
An intracellular protease identified in P. furiosus (Blumentals et al. 1990) and designated Pyrococcus furiosus protease I (PfpI) was found to be based on a single 18.8-kDa subunit (Halio et al. 1997). The gene encoding this subunit has putative homologs in other cells and microorganisms from the three domains of life, including M. jannaschii, B. subtilis, E. coli and Homo sapiens, but not T. maritima or Saccharomyces cerevisiae. In vitro, PfpI occurs in at least three functional forms, a trimer, a hexamer and a dodecamer, and is most active as a dodecamer (Chang et al. 2001). The physiological role of PfpI is unclear, but it complements the proteasome in P. furiosus; in vitro, a synergistic relationship between the two proteases has been noted (Chang et al. 2001). Because P. furiosus lacks the tricorn protease (Table 1), we speculate that PfpI assumes this role in P. furiosus. It also appears to be a predominant protease in P. furiosus based on zymogram analyses. The three-dimensional structure of the PfpI homolog in P. horikoshii (PhpI) (90% identical at the amino acid level) was recently reported (Du et al. 2000). The structure was consistent with previously reported biophysical information: PhpI is a dodecamer consisting of two identical six-member rings, each with axes of symmetry such that it consists of a dimer of trimers or a trimer of dimers. This supports the observation that, in vitro, it exists in at least three functional forms (P.M. Hicks, North Carolina State Univ., and R.M. Kelly, unpublished data). Furthermore, even though PfpI/PhpI are not ATP-dependent, the structure of PhpI suggests a similar barrel-like compartmentalization of the active site, reminiscent of the 20S proteasome and ClpP. The possible relationship of PfpI/PhpI and other ATP-independent proteases with protected active sites to ATP-dependent proteases needs to be examined, especially with respect to the evolutionary significance of the energetic requirement.
Sulfolobus solfataricus can use proteinaceous compounds as primary carbon and energy sources, and analysis of the S. solfataricus genome revealed the presence of 37 genes encoding proteases, protease subunits, or peptidases (Table 2). The tricorn protease as well as the interacting F factors F1, F2 and F3, which are aminopeptidases, were also identified (She et al. 2001). Although it has not been characterized in S. solfataricus, the tricorn protease has been well characterized in the thermophilic archaeon Thermoplasma (Tamura et al. 1998). Tricorn protease, in conjunction with its three interacting factors, degrades oligopeptides in a sequential manner, yielding free amino acids.
Two proteases have been characterized in Sulfolobus spp.: a heat-stable intracellular serine protease from S. solfataricus (Burlini et al. 1992), and the acid protease thermopsin from S. acidocaldarius (Fusek et al. 1990, Lin et al. 1991). The protease from S. solfataricus has a subunit molecular mass of 54 kDa and the active form is 118 kDa, suggesting that it exists as a dimer. The enzyme has chymotrypsin-like specificity, preferring aromatic or bulky aliphatic amino acids, but differs from chymotrypsin in being unable to digest natural substrate proteins like insulin chains A and B (Burlini et al. 1992). Thermopsin has a predicted molecular mass of 32.6 kDa, but was found to be a monomer with a molecular mass of 46 kDa. The difference in molecular mass is likely because thermopsin is a glycoprotein (Lin et al. 1991). Thermopsin is capable of degrading various protein substrates, such as hemoglobin, ovalbumin, bovine serum albumin and insulin chain B (Fusek et al. 1990). The enzyme has a rather broad substrate specificity but, similar to pepsin, prefers hydrophobic residues to flank the cleavage site.
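The mass arithmetic behind the two inferences above (a dimer for the S. solfataricus protease, a glycosylation contribution for thermopsin) can be made explicit. The sketch below uses only the masses quoted in this paragraph; the function names are illustrative, and rounding the mass ratio to the nearest integer is a simplifying assumption rather than a substitute for direct structural evidence.

```python
def estimate_subunit_count(native_kda: float, subunit_kda: float) -> int:
    """Estimate oligomeric state as the nearest integer mass ratio (an assumption)."""
    return round(native_kda / subunit_kda)


def mass_excess_kda(observed_kda: float, predicted_kda: float) -> float:
    """Mass not accounted for by the predicted polypeptide, e.g. attached glycans."""
    return observed_kda - predicted_kda


# S. solfataricus serine protease: 54-kDa subunit, 118-kDa active form.
n_subunits = estimate_subunit_count(118.0, 54.0)
residual = 118.0 - n_subunits * 54.0  # mass left over after assuming n subunits

# Thermopsin: 32.6 kDa predicted from sequence, 46 kDa observed for the monomer.
thermopsin_excess = mass_excess_kda(46.0, 32.6)

print(f"estimated subunits: {n_subunits}, unexplained mass: {residual:.1f} kDa")
print(f"thermopsin mass excess: {thermopsin_excess:.1f} kDa")
```

The ratio 118/54 is about 2.2, so a dimer is the nearest integer assembly, with roughly 10 kDa unexplained. For thermopsin, the observed mass exceeds the predicted polypeptide mass by about 13 kDa, which is the portion the text attributes to glycosylation.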

Proteases specific to hyperthermophiles
Genome sequence analysis shows that there is considerable conservation of certain proteases across all domains of life and growth temperature ranges. However, some proteases appear unique or more common to hyperthermophiles. For example, three clostripain-related proteases appear to be found only in T. maritima. These proteases have limited homology (32% identical in a 125 amino acid region) to the heterodimeric cysteine endoprotease from Clostridium histolyticum (Dargatz et al. 1993). However, given the difficulties in cloning and expressing active forms of hyperthermophilic proteases in mesophilic hosts, biophysical and biochemical information for these enzymes may need to rely to a great extent on direct purification.
There are examples of proteases that, from a structural or functional perspective, may be unique to hyperthermophiles. For example, the homomultimeric protease in T. maritima based on an approximately 31-kDa subunit has significant homology at the amino acid sequence level to a gene encoding a mesophilic bacteriocin, linocin M18; this protease has been tentatively named maritimacin (Hicks et al. 1998). A putative homolog to this protease has also been identified in the genome sequence of P. furiosus, but the enzyme appears to be absent from the rest of the Archaea (Table 1). Linocin M18, a multimeric assembly of a single 31-kDa subunit isolated from Brevibacterium linens, was found to be antagonistic to a wide spectrum of mesophilic coryneform and other gram-positive bacteria (Valdes-Stauber and Scherer 1996). However, the physiological function of this protein in these two hyperthermophilic organisms and its relationship to bacteriocins are unknown. Initial efforts to screen the purified T. maritima protease for bacteriocin-like activity focused on several hyperthermophiles that are readily cultured, including hyperthermophilic archaea. However, no significant antagonistic activity was found, although the presence of the putative protease seemed to extend the lag phase for Thermococcus litoralis (Hicks et al. 2001).
Pyrolysin, a cell envelope-associated protease purified from P. furiosus, has N-terminal sequence homology with subtilisin-like serine proteases (Voorhorst et al. 1996). The enzyme exhibits endopeptidase activity and may be involved in the first step of protein utilization during proteolytic growth (Voorhorst et al. 1996). Pyrolysin contains 1398 amino acid residues and is a mosaic of domains shared by other proteases, including a conserved pre-proenzyme region, a catalytic domain of 500 residues, and a large C-terminal extension, making it one of the largest known serine proteases (de Vos et al. 2001). Pyrolysin exhibits highest identity with the catalytic domain of the eukaryal tripeptidyl peptidases II, a subgroup of the subtilisin-like proteases (Voorhorst et al. 1996). Although the catalytic domain of pyrolysin is found in a putative subtilase from P. furiosus, the intact gene is absent from the genome sequences of the rest of the Archaea, including P. horikoshii and P. abyssi. However, a gene was identified in the genome sequences of P. horikoshii (PH0310) and P. abyssi (PAB1252), which includes the conserved pre-pro- and C-terminal sequences of pyrolysin, but contains a putative protease with a thiol-protease catalytic domain rather than serine protease activity (de Vos et al. 2001).
Another surface layer-bound subtilisin-like protease, STABLE, has been identified in Staphylothermus marinus (Mayr et al. 1996) that has some relationship to pyrolysin in that specific motifs are common to both proteases. This protease presumably digests and provides peptides to this organism, which grows by sulfur-dependent peptide fermentation.

Evolutionary aspects of proteolytic processes
Given that hyperthermophiles arguably represent an early evolutionary linkage, the question concerning the minimum set of proteases required for cellular function in these organisms arises. About half of the genes contained in hyperthermophilic genomes are still unassigned in terms of function (Nelson et al. 1999), and it seems certain that as yet undetected proteases are among those unidentified genes. About 60-70 putative and actual proteases/peptidases can be gleaned from the E. coli genome sequence by informatics techniques similar to those used here for the hyperthermophiles (K.R. Shockley and R.M. Kelly, unpublished data); this is about twice as many as noted in each of the five hyperthermophiles examined here (Table 2). We also note that the E. coli genome is about 2.5-fold larger than the pyrococcal and T. maritima genomes, which may relate to differences in protease inventory. However, S. solfataricus and E. coli have comparably sized genomes but differ significantly in the numbers of putative and confirmed proteases. Information about the expression and activation of specific proteases under various environmental conditions is needed to resolve the relationship between genotype and phenotype for each organism. The biochemical properties of specific proteases in these organisms will have to be reconciled with their regulation to provide some perspective on the global regulation of proteolysis.
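The "about twice as many" comparison can be checked against the approximate gene counts quoted in this review. The sketch below is illustrative arithmetic only: it assumes that the count of 34 applies to P. abyssi and P. horikoshii individually, takes the midpoint of the 60-70 range for E. coli, and treats the "about" figures as exact.

```python
# Approximate counts of protease-related genes as quoted in the text.
# Assumption: 34 is read as the count for each of P. abyssi and P. horikoshii.
inventory = {
    "P. furiosus": 40,
    "P. abyssi": 34,
    "P. horikoshii": 34,
    "S. solfataricus": 37,
    "T. maritima": 35,
}

ecoli_low, ecoli_high = 60, 70
ecoli_midpoint = (ecoli_low + ecoli_high) / 2

mean_hyperthermophile = sum(inventory.values()) / len(inventory)
fold_difference = ecoli_midpoint / mean_hyperthermophile

# The text puts the E. coli genome at about 2.5-fold the size of the pyrococcal
# and T. maritima genomes; scaling the E. coli count by that factor gives a
# rough count per equivalent genome length for comparison with those organisms.
genome_size_ratio = 2.5
ecoli_scaled = ecoli_midpoint / genome_size_ratio

print(f"mean hyperthermophile inventory: {mean_hyperthermophile:.1f}")
print(f"E. coli / hyperthermophile ratio: {fold_difference:.2f}")
print(f"E. coli count scaled to a 2.5-fold smaller genome: {ecoli_scaled:.1f}")
```

Under these assumptions the ratio is somewhat below two, and the E. coli count scaled by genome size falls below the counts for the pyrococci and T. maritima, which is consistent with the point made above that genome size alone does not explain the differences in protease inventory.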
It is intriguing to consider the origin and development of proteases based on the multimeric assembly of small (~20-30 kDa) subunits into complex ring-like structures. This is the basis for the proteasome and Clp proteases, both of which are ATP-dependent. Such structures help sequester the active sites in these proteases, presumably to avoid unwanted protein turnover. The PfpI from P. furiosus and its homologs are based on a similar structural organization (~19-kDa subunits arranged into two six-member rings) yet have no ATP-dependence (Du et al. 2000). Furthermore, there may be an additional structural relationship between these proteases, given that antibodies raised against the eukaryotic proteasome and E. coli ClpP recognized PfpI in Western blots (Halio 1995). The evolutionary significance of these multi-subunit proteases and the complexity introduced through ATP-dependence merits further examination. It will be interesting to determine whether various proteolytic phenotypes recruit specific sets of proteases to certain tasks involving protein turnover under normal and stressed conditions and how the interplay between ATP-dependence and ATP-independence is regulated. If it turns out that hyperthermophiles contain fewer proteases than other cells and organisms, they may provide an interesting perspective on the complex nature of protease function and regulation.

Figure 3. Extracellular proteolytic inventory from various hyperthermophiles. Pyrococcus furiosus and P. abyssi were grown at 95 °C, and T. profundus, T. peptonophilus and Thermotoga maritima were grown at 80 °C. The cells were removed after 14 h of growth and the extracellular enzymes were precipitated by addition of ammonium sulfate to 80%. Two µg of total extracellular protein was loaded onto a 12% SDS-polyacrylamide gel containing gelatin.

Table 2. Protease-related genes in selected microorganisms.