Biosynthesis of ribose-5-phosphate and erythrose-4-phosphate in archaea: a phylogenetic analysis of archaeal genomes

A phylogenetic analysis of the genes encoding enzymes in the pentose phosphate pathway (PPP), the ribulose monophosphate (RuMP) pathway, and the chorismate pathway of aromatic amino acid biosynthesis, employing data from 13 complete archaeal genomes, provides a potential explanation for the enigmatic phylogenetic patterns of the PPP genes in archaea. Genomic and biochemical evidence suggests that three archaeal species (Methanocaldococcus jannaschii, Thermoplasma acidophilum and Thermoplasma volcanium) produce ribose-5-phosphate via the nonoxidative PPP (NOPPP), whereas nine species apparently lack an NOPPP but may employ a reverse RuMP pathway for pentose synthesis. One species (Halobacterium sp. NRC-1) lacks both the NOPPP and the RuMP pathway but may possess a modified oxidative PPP (OPPP), the details of which are not yet known. The presence of transketolase in several archaeal species that are missing the other two NOPPP genes can be explained by the existence of differing requirements for erythrose-4-phosphate (E4P) among archaea: six species use transketolase to make E4P as a precursor to aromatic amino acids, six species apparently have an alternate biosynthetic pathway and may not require the ability to make E4P, and one species (Pyrococcus horikoshii) probably does not synthesize aromatic amino acids at all.


Introduction
The pentose phosphate pathway (PPP) is a ubiquitous metabolic pathway in bacteria and eukaryotes. The function of the PPP is threefold: to generate NADPH for reducing power in reductive biosynthesis, to convert three- and six-carbon sugar phosphates from the glycolytic pathway into ribose-5-phosphate (R5P) for nucleotide biosynthesis, and to provide erythrose-4-phosphate (E4P), a precursor for aromatic amino acid biosynthesis. The classical PPP is divided into two phases (Figure 1). In the oxidative PPP (OPPP), glucose-6-phosphate (G6P) undergoes oxidative decarboxylation to form ribulose-5-phosphate (Ru5P) with the concomitant reduction of two molecules of NADP+ to NADPH. In the nonoxidative phase (NOPPP), R5P isomerase converts Ru5P to R5P, while the enzymes Ru5P 3-epimerase, transketolase and transaldolase act in concert to shuttle excess Ru5P back into three- and six-carbon glycolytic intermediates, with E4P formed in the process.
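The carbon bookkeeping of the nonoxidative phase can be checked with a short sketch. The reaction list below is the standard textbook formulation rather than something stated in this report: xylulose-5-phosphate (X5P) and sedoheptulose-7-phosphate (S7P) are not named in the text and are included here as assumptions, and F6P and GAP stand for fructose-6-phosphate and glyceraldehyde-3-phosphate.

```python
# Carbon atoms per metabolite. X5P and S7P are textbook intermediates
# that the text does not name; they are included as assumptions.
CARBONS = {"Ru5P": 5, "R5P": 5, "X5P": 5, "S7P": 7, "GAP": 3, "F6P": 6, "E4P": 4}

# Textbook NOPPP reactions written as (substrates, products).
NOPPP_REACTIONS = {
    "R5P isomerase": (["Ru5P"], ["R5P"]),
    "Ru5P 3-epimerase": (["Ru5P"], ["X5P"]),
    "transketolase (first)": (["X5P", "R5P"], ["S7P", "GAP"]),
    "transaldolase": (["S7P", "GAP"], ["F6P", "E4P"]),
    "transketolase (second)": (["X5P", "E4P"], ["F6P", "GAP"]),
}


def carbon_total(metabolites):
    """Sum the carbon atoms over a list of metabolite names."""
    return sum(CARBONS[m] for m in metabolites)


def is_balanced(reaction):
    """Return True if substrates and products carry the same number of carbons."""
    substrates, products = reaction
    return carbon_total(substrates) == carbon_total(products)


balanced = {name: is_balanced(rxn) for name, rxn in NOPPP_REACTIONS.items()}

# Net conversion of the nonoxidative phase: 3 Ru5P -> 2 F6P + 1 GAP.
net_in = carbon_total(["Ru5P"] * 3)
net_out = carbon_total(["F6P", "F6P", "GAP"])

for name, ok in balanced.items():
    print(f"{name}: {'balanced' if ok else 'NOT balanced'}")
print(f"net: {net_in} C in, {net_out} C out")
```

Because every step is carbon-balanced in both directions, the same list read right to left describes the 'reverse' operation of the NOPPP (F6P and GAP to Ru5P) discussed below.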
The role of the PPP in archaea has been an open question for more than a decade. Before the advent of whole genome sequencing projects, it was reported that activities of NOPPP enzymes could be detected in the methanogenic archaeon Methanococcus maripaludis, but OPPP enzyme activities were not observed (Yu et al. 1994). The authors proposed that R5P synthesis in methanococci occurs solely via the NOPPP enzymes acting in the 'reverse' direction (fructose-6-phosphate (F6P) and glyceraldehyde-3-phosphate (GAP) to Ru5P), and not by oxidative decarboxylation of G6P. This hypothesis was further supported by studies with Methanococcus voltae (Choquet et al. 1994) and M. maripaludis (Tumbula et al. 1997), in which 13C labeling patterns were consistent with R5P and E4P synthesis via the NOPPP. Further genomic analysis indicated that Methanocaldococcus jannaschii has a complete NOPPP but is missing the OPPP entirely (Selkov et al. 1997). Conversely, 13C labeling studies with other methanogenic archaea including Methanobacterium thermoautotrophicum (Ekiel et al. 1983, Eisenreich et al. 1991, Choquet et al. 1994) showed ribose labeling patterns consistent with the synthesis of R5P by the OPPP. The same studies also revealed labeling patterns in phenylalanine and tyrosine that were inconsistent with the biosynthesis of these aromatic amino acids starting from E4P, calling into question the role of E4P and the NOPPP in these species.
Recent progress in the sequencing of archaeal genomes has revealed a confusing picture of the PPP in the archaeal domain (Table 1). Among the thirteen completed archaeal genomes included in the Clusters of Orthologous Groups (COG) database (Tatusov et al. 1997, 2001; http://www.ncbi.nlm.nih.gov/COG/), orthologs of genes encoding enzymes of the OPPP are virtually nonexistent: only one species, Halobacterium sp. NRC-1, has a recognizable ortholog of 6-phosphogluconate dehydrogenase (6PGDH), whereas glucose-6-phosphate dehydrogenase (G6PDH) and 6-phosphogluconate lactonase (6PGL) orthologs are missing in all 13 genomes. Nonoxidative PPP orthologs are present to a varying extent in archaea: all 13 genomes contain R5P isomerase orthologs, but only three species (M. jannaschii, Thermoplasma volcanium and Thermoplasma acidophilum) have a complete set of recognizable NOPPP genes. Transketolase orthologs are present in an additional four archaeal genomes, but these four are missing genes for transaldolase and Ru5P 3-epimerase. Only two archaeal PPP enzymes have been biochemically characterized to date: R5P isomerase from Pyrococcus horikoshii (Ishikawa et al. 2002) and transaldolase from M. jannaschii (Soderberg and Alver 2004).
The biochemical and genomic data raise important questions about how R5P and E4P are synthesized in the 10 archaeal species that apparently lack a complete NOPPP. A possible answer could involve the existence of novel, as-yet unrecognized PPP genes in archaea. Another possibility is the existence of alternate pathways. Aromatic amino acid biosynthesis in some archaeal species appears to occur via an alternate route that does not use E4P (Tumbula et al. 1997, White 2004). An alternate source of pentose biosynthesis, which to our knowledge has not yet been recognized as such, is the ribulose monophosphate (RuMP) pathway.
In the RuMP pathway (Figure 2), Ru5P condenses with formaldehyde to form 3-hexulose-6-phosphate (D-arabino-3-ketohexulose 6-phosphate, Hu6P) in an aldol reaction catalyzed by 3-hexulose-6-phosphate synthase (HPS, encoded by hps). 3-Hexulose-6-phosphate is then converted to F6P by 3-hexulose-6-phosphate isomerase (PHI, encoded by phi). Until recently, it was believed that the RuMP pathway was present exclusively in methylotrophic bacteria for formaldehyde fixation. However, hps and phi orthologs have recently been cloned from Bacillus subtilis and shown to encode HPS and PHI enzymes (Yasueda et al. 1999). Further sequence analysis indicates that hps and phi orthologs are widely distributed among bacterial and archaeal genomes (Reizer et al. 1997). The function of the phi ortholog in the archaeon M. jannaschii has been biochemically confirmed (Martinez-Cruz et al. 2002), and in the same study HPS activity was indirectly observed in a Pyrococcus horikoshii extract. However, the function of an archaeal hps ortholog has yet to be experimentally confirmed. It has been proposed that the role of the RuMP pathway in nonmethylotrophic bacteria could be xylose metabolism (Reizer et al. 1997), methylamine metabolism or formaldehyde detoxification (Yasueda et al. 1999). However, all of these functions require the presence of a complete NOPPP in order to supply Ru5P, a requirement that is met by bacteria but not by most archaea. This suggests that the RuMP pathway enzymes have another function in archaea, possibly related to pentose synthesis.
The availability of a growing database of complete microbial genome sequences and the development of new software tools to analyze phylogenetic patterns of inheritance provide a novel opportunity to better understand how various species have evolved different metabolic strategies. This report gives the results of a genomic analysis study that suggests an explanation for the enigmatic phylogenetic patterns of the PPP genes in archaeal genomes, and discusses the implications for nucleotide and aromatic amino acid biosynthesis in archaea.

Materials and methods
Included in this analysis are the 13 complete archaeal genomes in the COG database at the National Center for Biotechnology Information (NCBI) (Tatusov et al. 1997, 2000; http://www.ncbi.nlm.nih.gov/COG/). The STRING resource (Snel et al.

Biosynthesis of ribose-5-phosphate
The genomic evidence strongly suggests that in archaea, as in bacteria and eukaryotes, Ru5P is the original source of the pentose moiety in nucleotides. Orthologs of R5P isomerase (COG 0120), which catalyzes the reversible interconversion of R5P and Ru5P, are present in all 13 archaeal genomes compared in this study (Table 1). Also ubiquitous in archaeal genomes are most of the genes for nucleotide biosynthesis, including putative genes for phosphoribosylpyrophosphate (PRPP) synthase (COG 0462), orotate phosphoribosyltransferase (COG 0461) and glutamine phosphoribosyl amidotransferase (COG 0034), the key enzymes responsible for incorporating R5P into nucleotides. The question thus becomes how Ru5P is synthesized by those archaea that lack a complete set of NOPPP orthologs.
The RuMP pathway, operating in the 'reverse' direction, is a possible alternate source of Ru5P. The two reactions of the RuMP pathway (an aldol condensation and a 3-ketose to 2-ketose isomerization) are thermodynamically reversible. Furthermore, it appears that in archaeal genomes, the phylogenetic profiles of the RuMP pathway and the NOPPP are for the most part complementary. Orthologs for hps and phi (COG 0269 and COG 0794, respectively) are present in 10 of the 13 genomes (Table 1), and all but one of these (M. jannaschii) lack a complete NOPPP, which is the source of Ru5P in the 'forward' RuMP pathway of methylotrophic bacteria. This indicates that the role of the RuMP pathway in these archaeal species may be reversed: that is, its function may be to produce Ru5P from F6P.
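The complementarity of the two phylogenetic profiles can be illustrated with a minimal sketch that encodes only the presence and absence calls stated in this report. Three of the 13 genomes are methanogens that are not named individually in the text, so they appear here under the placeholder names methanogen_A to methanogen_C, which are hypothetical labels and not real identifiers.

```python
# Ten genomes are named in the text; the remaining three methanogens are
# represented by placeholder labels (hypothetical, for illustration only).
GENOMES = [
    "Methanocaldococcus jannaschii",
    "Thermoplasma acidophilum",
    "Thermoplasma volcanium",
    "Halobacterium sp. NRC-1",
    "Pyrococcus horikoshii",
    "Pyrococcus abyssi",
    "Aeropyrum pernix",
    "Pyrobaculum aerophilum",
    "Sulfolobus solfataricus",
    "Archaeoglobus fulgidus",
    "methanogen_A",
    "methanogen_B",
    "methanogen_C",
]

# Genomes reported to carry a complete set of recognizable NOPPP genes.
COMPLETE_NOPPP = {
    "Methanocaldococcus jannaschii",
    "Thermoplasma acidophilum",
    "Thermoplasma volcanium",
}

# Genomes reported to lack the RuMP pathway genes (hps and phi).
LACKS_RUMP = {
    "Thermoplasma acidophilum",
    "Thermoplasma volcanium",
    "Halobacterium sp. NRC-1",
}

# Profile per genome: (has complete NOPPP, has RuMP pathway genes).
profile = {g: (g in COMPLETE_NOPPP, g not in LACKS_RUMP) for g in GENOMES}

# A genome is "complementary" when exactly one of the two pathways is present.
complementary = [g for g, (noppp, rump) in profile.items() if noppp != rump]
both = [g for g, (noppp, rump) in profile.items() if noppp and rump]
neither = [g for g, (noppp, rump) in profile.items() if not noppp and not rump]

print(f"complementary: {len(complementary)} of {len(GENOMES)}")
print(f"both pathways: {both}")
print(f"neither pathway: {neither}")
```

Under these calls, the two exceptions to complementarity are the one genome with both pathways and the one genome with neither.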
The operation of a reverse RuMP pathway for the production of Ru5P implies the concurrent production of formaldehyde, which is toxic to cells. Some mechanism must therefore exist to remove formaldehyde in all species that synthesize Ru5P in this manner. Five of the 13 genomes analyzed in this study (the four methanogens plus the sulfate reducer Archaeoglobus fulgidus) contain members of COG 1795, a formaldehyde-activating enzyme first characterized in the methylotroph Methylobacterium extorquens (Vorholt et al. 2000). It is significant that in four of these five species this gene is fused to hps, possibly encoding a bifunctional enzyme that activates formaldehyde for oxidation as soon as it is produced as a byproduct of Ru5P synthesis. Also present in nine of the 13 genomes are members of COG 2414, a tungsten-dependent formaldehyde ferredoxin oxidoreductase that has been characterized in Pyrococcus furiosus (Roy et al. 1999). Among the nine archaeal genomes that contain the RuMP pathway genes, only Sulfolobus solfataricus is missing both COG 1795 and COG 2414. This genome does, however, contain an open reading frame (GenBank accession no. AAK40795) that appears to be an ortholog of a glutathione-independent formaldehyde dehydrogenase characterized from Pseudomonas putida (Ogushi et al. 1984, Tanaka et al. 2002).
Table 1. Phylogenetic patterns of archaeal gene orthologs from the pentose phosphate pathway (PPP) and the ribulose monophosphate (RuMP) pathway. GenBank accession numbers are listed for each ortholog.
Experimental confirmation of the biochemical function of the protein encoded by hps in archaea will be critical to confirming or refuting the role of the RuMP pathway in ribose synthesis. The homology between some archaeal hps orthologs and the biochemically characterized hps gene in B. subtilis is relatively low, with expected values ranging from 10^-43 for the Pyrococcus abyssi gene to 10^-14 for the M. jannaschii gene. In addition, hps orthologs are fused in several archaeal genomes to members of COG 0684, which encodes demethylmenaquinone methyltransferase. This fusion event further clouds predictions of the function of hps in archaea.
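To give a sense of scale for the two expected values quoted above, the following sketch converts their ratio into an approximate difference in alignment bit score. It assumes that the values are BLAST-style E-values related to the bit score S by E = m * n * 2^(-S), and that both comparisons used the same search space (m * n); neither assumption is stated in this report.

```python
import math

# Expected values quoted in the text for the comparison with B. subtilis hps.
evalue_p_abyssi = 1e-43
evalue_m_jannaschii = 1e-14

# Difference in orders of magnitude between the two expected values.
orders_of_magnitude = math.log10(evalue_m_jannaschii) - math.log10(evalue_p_abyssi)

# Assumption: E = m * n * 2**(-S) with an identical search space m * n,
# so a ratio of E-values maps directly to a difference in bit scores.
bit_score_gap = orders_of_magnitude * math.log2(10)

print(f"orders of magnitude apart: {orders_of_magnitude:.0f}")
print(f"approximate bit score gap: {bit_score_gap:.1f} bits")
```

Under these assumptions the M. jannaschii alignment would score roughly 96 bits lower than the P. abyssi alignment, which is one way of expressing how much weaker the sequence evidence is for that ortholog.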
An alternate interpretation of results from previous isotopic labeling studies can be provided in light of the genomic data now available. Previous reports (Eisenreich et al. 1991, Choquet et al. 1994) stated that 13C labeling patterns in several methanogenic archaea (but not in methanococci) pointed to the action of OPPP enzymes in the conversion of F6P to Ru5P. However, the same labeling pattern observed in these studies would also be expected if the F6P to Ru5P conversion were carried out instead by a reverse RuMP pathway, a more likely scenario given the lack of OPPP orthologs in methanogenic genomes.
Although both T. volcanium and T. acidophilum are missing the RuMP pathway genes, these species do have a complete NOPPP that could provide a source of Ru5P. Methanocaldococcus jannaschii has both a complete NOPPP and an RuMP pathway; therefore, Ru5P could derive from either or both of these pathways. However, labeling studies (Choquet et al. 1994, Tumbula et al. 1997) suggest that the NOPPP alone is responsible for ribose biosynthesis in methanococci.
Only Halobacterium lacks both an NOPPP and an RuMP pathway. However, Halobacterium is also the only archaeal genome that contains a recognizable OPPP gene (the ortholog of genes encoding 6PGDH (COG 1023)), which suggests that pentose synthesis in Halobacterium may take place via a modified OPPP. Although it is not yet clear how this could occur, there are several reasonable possibilities. Halobacterium may have a novel G6PDH that is not recognizable by sequence homology to known enzymes. 6-Phosphogluconate could be produced by oxidation of glucose to gluconate by glucose dehydrogenase (COG 1063, GenBank accession no. AAG18991), followed by phosphorylation of gluconate by an as yet uncharacterized putative sugar kinase (e.g., COG 0524, GenBank accession no. AAG20057). Alternatively, the oxidative decarboxylation of gluconate to ribulose could be followed by phosphorylation to Ru5P by an as yet unidentified ribulokinase, if the predicted 6PGDH enzyme in Halobacterium is capable of oxidizing a nonphosphorylated substrate. Biochemical characterization of the predicted 6PGDH enzyme in Halobacterium, as well as experimental confirmation of the presence or absence of G6PDH, gluconate-6-kinase and ribulokinase activities in cell extracts, will be necessary to address the questions surrounding the origin of Ru5P in Halobacterium.
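The reasoning used in this section to assign a candidate Ru5P source to each genome can be written as a simple decision rule. The function name and the route labels below are illustrative choices, not terms from the COG or STRING resources, and the rule lists only routes that are possible on genomic grounds; it does not encode the labeling evidence that favors the NOPPP in methanococci.

```python
def candidate_ru5p_routes(complete_noppp, rump_genes, pgdh_ortholog):
    """List candidate Ru5P sources suggested by gene presence/absence alone.

    complete_noppp: a complete set of recognizable NOPPP genes is present
    rump_genes:     hps and phi orthologs are present
    pgdh_ortholog:  a 6PGDH ortholog is present
    """
    routes = []
    if complete_noppp:
        routes.append("NOPPP")
    if rump_genes:
        routes.append("reverse RuMP")
    # A modified OPPP is only invoked when neither other route is available.
    if not routes and pgdh_ortholog:
        routes.append("modified OPPP (putative)")
    return routes or ["unknown"]


# Presence/absence calls as reported in the text for four example genomes.
examples = {
    "Methanocaldococcus jannaschii": (True, True, False),
    "Thermoplasma volcanium": (True, False, False),
    "Sulfolobus solfataricus": (False, True, False),
    "Halobacterium sp. NRC-1": (False, False, True),
}

for genome, calls in examples.items():
    print(f"{genome}: {', '.join(candidate_ru5p_routes(*calls))}")
```

The rule deliberately returns more than one route where the genome permits it, which matches the statement that Ru5P in M. jannaschii could derive from either or both pathways.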

Biosynthesis of erythrose-4-phosphate and aromatic amino acids
Another critical function of the NOPPP in bacteria and eukaryotes is the generation of E4P as a precursor for aromatic amino acid biosynthesis. The curious NOPPP phylogeny in archaeal genomes can be explained by the presence in some archaea of an alternate biosynthetic pathway that does not use E4P. The first phase of the classical aromatic amino acid biosynthesis pathway involves the seven-step synthesis of chorismate, beginning with the condensation of E4P and phosphoenolpyruvate (PEP) to form 2-dehydro-3-deoxy-D-arabinoheptulosonate-7-phosphate (DAHP) and the subsequent cyclization to 3-dehydroquinate (DHQ). Most of this pathway is present in 12 of the 13 archaeal genomes analyzed here, although it is entirely missing in Pyrococcus horikoshii, which requires at least one aromatic amino acid (tryptophan) for growth (Gonzalez et al. 1998). For the fifth enzymatic step, archaea possess a novel shikimate kinase (COG 1685) that has only recently been identified and characterized (Daugherty et al. 2001). Seven of the genomes are missing orthologs from COG 2876 and COG 0337, the first two biosynthetic steps responsible for the formation of DHQ from E4P and PEP (Table 2). Significantly, with the exception of M. jannaschii, these are the same genomes that are also missing transketolase, and thus do not possess a recognizable means of making E4P.
The hypothesis that these archaeal species possess an alternate beginning phase of the aromatic amino acid biosynthesis pathway that does not involve E4P as a precursor (Tumbula et al. 1997) was recently confirmed (White 2004). In M. jannaschii, two genes were identified that catalyze the formation of DHQ from L-aspartate semialdehyde and 6-deoxy-5-ketohexulose, rather than from E4P and PEP. Although the novel aldolase identified in that study is widespread among archaea, including several species that have genes for the classical (E4P- and PEP-utilizing) chorismate synthesis pathway, the phylogenetic pattern of the novel DHQ synthase is highly complementary to the patterns observed both for the classical chorismate pathway genes and also for transketolase (Table 2).
It is notable that four of the 13 genomes (Pyrococcus abyssi, Aeropyrum pernix, Pyrobaculum aerophilum and Sulfolobus solfataricus) have an incomplete NOPPP: orthologs of transketolase and R5P isomerase are present but not orthologs of epimerase or transaldolase (Table 1). Apparently, these four species use the RuMP pathway for Ru5P synthesis and thus do not require a complete NOPPP, but do require transketolase for generation of E4P from F6P and GAP. All four genomes contain orthologs for DAHP synthase, the E4P-utilizing step in aromatic amino acid biosynthesis. Furthermore, in three of these four genomes, the transketolase genes are located in operon-like clusters along with aromatic amino acid biosynthesis genes.
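The grouping of genomes by their apparent E4P requirement can be reproduced with a short sketch built from the presence and absence calls described in this section. As before, the three methanogen genomes not named individually in the text are given placeholder labels (methanogen_A to methanogen_C), and the category strings are illustrative wording rather than established terms.

```python
from collections import Counter

# Ten named genomes plus three placeholder labels for unnamed methanogens.
GENOMES = [
    "Methanocaldococcus jannaschii", "Thermoplasma acidophilum",
    "Thermoplasma volcanium", "Halobacterium sp. NRC-1",
    "Pyrococcus horikoshii", "Pyrococcus abyssi", "Aeropyrum pernix",
    "Pyrobaculum aerophilum", "Sulfolobus solfataricus",
    "Archaeoglobus fulgidus", "methanogen_A", "methanogen_B", "methanogen_C",
]

# Transketolase: the three complete-NOPPP genomes plus four additional genomes.
TRANSKETOLASE = {
    "Methanocaldococcus jannaschii", "Thermoplasma acidophilum",
    "Thermoplasma volcanium", "Pyrococcus abyssi", "Aeropyrum pernix",
    "Pyrobaculum aerophilum", "Sulfolobus solfataricus",
}

# Classical first two chorismate steps (COG 2876 and COG 0337): present in the
# six genomes that are not among the seven reported as missing them.
CLASSICAL_START = TRANSKETOLASE - {"Methanocaldococcus jannaschii"}

# Chorismate pathway reported as entirely missing.
NO_CHORISMATE_PATHWAY = {"Pyrococcus horikoshii"}


def e4p_strategy(genome):
    """Classify a genome by its apparent need for, and source of, E4P."""
    if genome in NO_CHORISMATE_PATHWAY:
        return "no aromatic amino acid synthesis"
    if genome in CLASSICAL_START:
        if genome in TRANSKETOLASE:
            return "E4P via transketolase"
        return "classical start, E4P source unclear"
    return "alternate DHQ route, E4P not required"


tally = Counter(e4p_strategy(g) for g in GENOMES)
for strategy, count in tally.items():
    print(f"{strategy}: {count}")
```

The resulting split of six, six and one genomes corresponds to the three groups described in the abstract.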

Conclusions
The archaeal phylogeny of the PPP, when taken in isolation, is an enigmatic picture of partial pathways and seemingly inexplicable patterns of gene inheritance. When combined with genomic information about the RuMP pathway and the chorismate pathway of aromatic amino acid biosynthesis, however, a more coherent picture begins to emerge. In most archaea, the ribose biosynthesis function of the OPPP is apparently carried out either by the enzymes of the NOPPP or by a reverse RuMP pathway. Those species that require E4P for aromatic amino acid biosynthesis produce it with transketolase, whereas those species with an alternate early phase of the chorismate pathway do not require transketolase. Finally, pentose synthesis in Halobacterium may occur via a modified OPPP, although the details of this putative pathway await experimental determination. The development of this picture of pentose and erythrose synthesis in archaea is an example of how the analysis of phylogenetic patterns within multiple genome sequences can yield valuable insights into diverse metabolic pathways. It will be interesting to see if the patterns observed in this study hold true as more complete genome sequences become available, and as the predicted biochemical functions of more genes are confirmed experimentally.

Table 2. Distribution of the novel archaeal chorismate biosynthesis pathway genes (White 2004) compared with the first two steps of classical chorismate biosynthesis and transketolase. GenBank accession numbers are listed for each ortholog.