A Genomic Approach to Study Anthocyanin Synthesis and Flower Pigmentation in Passionflowers

Most plant pigments with colors ranging from red to purple belong to the anthocyanin group of flavonoids. The flowers of plants belonging to the genus Passiflora (passionflowers) show a wide range of floral adaptations to diverse pollinating agents, including variation in the pigmentation of floral parts ranging from white to red and purple. Exploring a database of expressed sequence tags obtained from flower buds of two divergent Passiflora species, we obtained assembled sequences potentially corresponding to 15 different genes of the anthocyanin biosynthesis pathway in these species. The obtained sequences code for putative enzymes involved in the production of flavonoid precursors, as well as those involved in the formation of particular (“decorated”) anthocyanin molecules. We also obtained sequences encoding regulatory factors that control the expression of structural genes and regulate the spatial and temporal accumulation of pigments. The identification of some of the putative Passiflora anthocyanin biosynthesis pathway genes provides novel resources for research on secondary metabolism in passionflowers, especially on the elucidation of the processes involved in floral pigmentation, which will allow future studies on the role of pigmentation in pollinator preferences at the molecular level.


Introduction
Anthocyanins belong to a diverse group of secondary metabolites of the phenylpropanoid class, the flavonoids, which are found in different plant species. They represent some of the most important natural pigments and are responsible for the wide range of red to purple colors present in many flowers, fruits, seeds, leaves, and stems. Besides having great economic relevance, flower and fruit pigments play an important ecological role in attracting animals for pollination and seed dispersal, which is a spectacular example of coevolution between plants and animals [1][2][3].
Briefly, the anthocyanin biosynthetic pathway is initiated by chalcone synthase (CHS), which catalyzes the stepwise condensation of three acetate residues from malonyl-CoA with one molecule of 4-coumaroyl-CoA to form the basic structure of flavonoids (tetrahydroxychalcone), which is rapidly isomerized to the colorless naringenin by chalcone isomerase (CHI). Naringenin is then converted to dihydroflavonol by flavanone 3-hydroxylase (F3H). Dihydroflavonol 4-reductase (DFR), an enzyme specific to anthocyanin synthesis, catalyzes the production of leucoanthocyanidins from dihydroflavonols, which can be hydroxylated at the 3′ or 5′ position of the B-ring by flavonoid 3′-hydroxylase (F3′H) to produce dihydroquercetin or by flavonoid 3′,5′-hydroxylase (F3′5′H) to form dihydromyricetin. Subsequently, leucoanthocyanidin oxidase/anthocyanidin synthase (LDOX/ANS) is responsible for the formation of the anthocyanidins from the colorless leucoanthocyanidins. GT enzymes (O-glucosyltransferases) represent the final step in anthocyanin biosynthesis: anthocyanidins are converted into differentially "decorated" anthocyanin molecules [15,16]. Biochemical approaches have demonstrated that all anthocyanin pigments are derived from one of three aglycones: pelargonidin, cyanidin, and delphinidin. The main determinants of the apparent color of these pigments are the hydroxylation and methylation patterns, as well as the number and type of sugars on the B-ring of the flavonoid molecule [1,3,[17][18][19]]. Figure 1 depicts a generalized anthocyanin biosynthesis pathway. At least two groups of genes are required for anthocyanin biosynthesis: the first group is represented by the structural genes encoding enzymes for the production of the flavonoid precursors, as well as those involved in the formation of particular ("decorated") anthocyanin molecules.
The second group includes the genes encoding regulatory factors that control the expression of the structural genes. This control is mainly orchestrated by complexes formed by MYB and basic helix-loop-helix (bHLH) transcription factors, which also include WDR (WD40 repeat) proteins [2,4,15,16,[20][21][22][23]].
There are about 600 Passiflora species widely distributed in tropical and subtropical regions. Some Passiflora species have economic importance due to the production of fruits (passionfruit) or use as ornamentals. Nevertheless, a large number of Passiflora species are rare and/or endangered, as the environment of their diversity center has been increasingly degraded by human activities [24]. An enormous floral diversity is observed among Passiflora species, including variation in color, size, morphology, and fusion of floral organs. These and other floral characteristics, including evolutionary innovations such as the presence of coronal filaments and an androgynophore, are indicative of the wide range of pollination syndromes found in the genus [24]. Wild passionflowers may be pollinated by insects (bees and wasps), hummingbirds, and bats [24]. The most striking feature of floral variation among passionflowers is the wide range of pigmentation patterns of the corona filaments. Most of the floral pigments in Passiflora are different types of anthocyanin molecules [25,26]. Among all Passiflora species, P. edulis Deg and P. suberosa L. are of particular interest, because they are model Passiflora species for which expressed sequence tags (ESTs) were produced within the frame of the "PASSIOMA" Project [27]. P. edulis Deg flowers are pollinated by large bees of the genus Xylocopa. These flowers are about 8-12 cm wide, and their coronas contain multiple series of purplish filaments with white tips. The flowers of P. suberosa L. are small (2-3 cm wide) and show two morphologically distinct series of corona filaments: the outer series is greenish, and the inner series is formed by smaller purple filaments. The flowers of P. suberosa are pollinated by wasps [28].
We are particularly interested in the characterization of genes involved in the anthocyanin biosynthetic pathway of these two Passiflora species. With this aim, we searched for putative Passiflora genes responsible for flower pigmentation, using the key proteins known to be involved in the different enzymatic steps of anthocyanin biosynthesis as baits to search for expressed sequence tags (ESTs) in the PASSIOMA database.

Searching Passiflora ESTs Homologous to Anthocyanin Biosynthetic Genes.
The clustered expressed sequence tags (ESTs) from the PASSIOMA Project database [27] were used as the primary source of data for our analyses. These sequences were assembled from ESTs obtained from the sequencing of several P. edulis or P. suberosa cDNA libraries, made from floral buds at different developmental stages (see [27] for details on library construction, sequencing, and database structure). Nucleotide sequences and their respective deduced amino acid sequences from genes known to be involved in anthocyanin biosynthesis (see Figure 1) were obtained from the National Center for Biotechnology Information (NCBI; http://www.ncbi.nlm.nih.gov/). Searches for putative homologous sequences in the PASSIOMA database were conducted using the tBLASTN module, which compares an amino acid query sequence with a translated nucleotide sequence database [29]. We generally used Arabidopsis thaliana or Petunia hybrida sequences as queries, as the anthocyanin biosynthesis pathways in these model species are more thoroughly studied at the molecular level [30][31][32]. All sequences in the PASSIOMA database that exhibited a significant alignment with the query (e-value lower than 1e-5) were retrieved.
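The e-value cutoff described above can be sketched as a small filter over tabular BLAST output. This is a minimal illustration that assumes the hits are available in the standard 12-column tab-separated BLAST format (e-value in column 11, bit score in column 12); how hits were actually exported from the PASSIOMA database is not stated here, and the hit lines, names, and helper functions below are hypothetical.

```python
from typing import NamedTuple

E_VALUE_CUTOFF = 1e-5  # threshold stated in the text


class Hit(NamedTuple):
    query: str
    subject: str
    evalue: float
    bitscore: float


def parse_tabular_hits(text: str) -> list[Hit]:
    """Parse tab-separated BLAST lines (12 columns, e-value in column 11)."""
    hits = []
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        hits.append(Hit(fields[0], fields[1], float(fields[10]), float(fields[11])))
    return hits


def significant(hits: list[Hit], cutoff: float = E_VALUE_CUTOFF) -> list[Hit]:
    """Keep hits whose e-value is strictly lower than the cutoff."""
    return [h for h in hits if h.evalue < cutoff]


# Hypothetical hits, for illustration only (not PASSIOMA data).
example = "\n".join([
    "query_CHS\tread_A\t82.1\t120\t21\t0\t1\t120\t5\t364\t3e-60\t210",
    "query_CHS\tread_B\t35.0\t40\t26\t0\t10\t49\t90\t209\t0.002\t31.2",
    "query_CHS\tread_C\t55.4\t74\t33\t0\t30\t103\t2\t223\t1e-05\t48.9",
])

kept = significant(parse_tabular_hits(example))
print([h.subject for h in kept])
```

Because the text says "lower than", the comparison is strict, so a hit sitting exactly at the cutoff (read_C in the toy data) is not retained.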
All reads identified with a given query sequence were clustered using the CAP3 algorithm [33] as implemented in the BioEdit software [34]. The resulting cluster consensus sequences were reinspected for the occurrence of conserved motifs using InterProScan [35] and were compared to NCBI databases using BLAST [29]. Sequences lacking the main motifs present in the query sequence were discarded. Validated sequences were then included in phylogenetic analyses.

Comparison of the Amino Acid Sequences and Phylogenetic Analysis.
All amino acid sequences were aligned with the CLUSTALX software using default parameters [36]. The obtained alignments were corrected by hand where necessary and imported into the molecular evolutionary genetics analysis (MEGA) software [37]. Phylogenetic trees were obtained using parsimony and/or genetic distance calculations (in the latter case using the pairwise deletion option and the Poisson correction model). Neighbor-joining trees [38] with bootstrap support (10,000 replicates) were also constructed.
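The distance option named above can be illustrated with a short sketch. It computes the Poisson-corrected distance d = -ln(1 - p), where p is the proportion of differing amino acid sites, and applies pairwise deletion by skipping every site at which either of the two sequences has a gap. The function name, the set of characters treated as gaps, and the two toy sequences are assumptions made for illustration; this is not MEGA's code.

```python
import math

GAP_CHARS = {"-", "?"}  # assumption: characters treated as gap or missing data


def poisson_distance(seq_a: str, seq_b: str) -> float:
    """Poisson-corrected distance between two aligned protein sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must come from the same alignment")
    compared = 0
    different = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in GAP_CHARS or b in GAP_CHARS:
            continue  # pairwise deletion: ignore this site for this pair only
        compared += 1
        if a != b:
            different += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    p = different / compared  # proportion of differing sites
    if p >= 1.0:
        return math.inf  # correction is undefined when every site differs
    return -math.log(1.0 - p)


# Toy aligned sequences (hypothetical): 9 comparable sites, 2 differences.
d = poisson_distance("MKSLGV-RKG", "MESLGVRRKA")
print(round(d, 4))
```

With pairwise deletion, a gapped site is dropped only for the pair being compared, so partial EST-derived sequences can still contribute the sites they do cover.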

Results
The cDNA libraries of the PASSIOMA Project were obtained from mRNA extracted from floral buds at different developmental stages, and it is expected that all EST sequences correspond to genes expressed during Passiflora flower development [27]. This sequence search detected a total of 75 Passiflora EST sequences, 34 of them corresponding to P. edulis sequences and 41 of them corresponding to sequences derived from P. suberosa libraries. After clustering with the CAP3 algorithm and detailed comparison of the deduced amino acid sequences, the number of valid clusters was reduced to 15, potentially corresponding to 15 different genes.
Figure 1: Schematic representation of the anthocyanin biosynthetic pathway (adapted from [16]). Enzymes are indicated in red, and classes of compounds are in green. Anthocyanidin is further modified with glycosyl, acyl, or methyl groups, resulting in the "decorated" anthocyanin. In this case, UF3GT is responsible for the glycosylation of anthocyanidins. The proposed anthocyanin biosynthetic pathway for
When the validated amino acid sequences obtained from the PASSIOMA database were compared to other plant protein sequences in the public databases, the first BLAST hits generally corresponded to Populus and Ricinus sequences. This was expected, as Passiflora and these genera belong to the same order (Malpighiales) and are considered to be closely related [39]. We obtained assembled EST sequences corresponding to genes of the following gene families: CHS, DFR, GT, GST, MYB, and WD40 (see Table 1). We therefore used the 15 Passiflora assembled sequences from the PASSIOMA database and a selected set of genes from divergent plant species from the public databases to explore their evolutionary relationships. The resulting sequence alignments allowed the construction of phylogenetic trees for each of these families of genes involved in the different enzymatic steps of the anthocyanin pathway.
The similarities between the genes identified in this study and those reported from other plant species are summarized in Table 1 and ranged from 70% (PACEPE3030G03.g; representing a putative member of the GST, glutathione S-transferase superfamily) to 96% (PACEPE3007G07.g; potentially encoding a WD40 protein).
Some of these gene sequences showed significant similarity to elements required for early or late steps of the pathway; others putatively encode regulatory proteins involved in the control of the spatial and temporal patterns of pigmentation, while others are responsible for intracellular transport of the anthocyanin molecules. The role of each of these genes in the anthocyanin biosynthesis and the probable implications for the understanding of the Passiflora flower pigmentation are presented in the Discussion.
To determine the phylogenetic relationships of different CHSs, we aligned protein sequences from a diverse range of plant species (moss, ferns, gymnosperms, and angiosperms), a cyanobacterium (Synechococcus sp.), and Passiflora representatives of the CHS superfamily (Figure 2). The phylogenetic tree was resolved into three clades, all highly supported with 100% bootstrap values. The Passiflora proteins were consistently positioned in different clades.
Figure 2: A neighbor-joining phylogenetic tree of chalcone synthase (CHS) amino acid sequences. The cluster containing all anther-specific CHS-like enzymes is highlighted. Bootstrap values from 1,000 replicates were used to assess the robustness of the trees. Only bootstrap values above 75% are indicated at the nodes. Accession numbers for genes from other species are given in Supplementary data.
One of these monophyletic clades (highlighted in Figure 2) contains all the anther-specific CHS-like genes (ASCLs; [40,41]). The remaining sequences, including three Passiflora members, were clustered in the other sister clade together with all CHS genes from seed plants.

Dihydroflavonol 4-Reductases (DFR).
A single Passiflora cDNA sequence of 850 bp encoding a predicted protein of 204 amino acids showed a significant e-value (1e-95) and 94% similarity to a Populus DFR sequence (Table 1). Figure 3 shows an alignment of the deduced amino acid sequence of the Passiflora DFR with some other plant sequences containing an NADP-binding domain, considered the region of substrate preference of DFR enzymes [42,43]. Additionally, the Passiflora DFR showed an aspartic acid residue at position 134, as observed for the Petunia and Populus proteins, whereas Gerbera and some Lotus DFRs show an asparagine residue at the same position (Figure 3). We adopted the terminology suggested by Shimada and coworkers [44] to designate the conserved motifs present in the DFR sequence. A neighbor-joining tree was constructed based on the alignment of DFR sequences shown in Figure 3. The monocot and eudicot DFRs were positioned separately. While monocot DFR genes formed one clade, the eudicot DFR sequences diverged into two clades. Clearly, Asn-type DFRs are found in a larger number of species. On the other hand, Asp-type DFRs are restricted to some species, including Passiflora and Populus (Figure 4).
Figure 3: Multiple sequence alignment of the Passiflora sequence with some plant DFR sequences. The identical and similar residues are highlighted on a black and gray background, respectively. The NADP-binding domain is underlined. Boxed amino acids have been considered to control the substrate specificity of the DFR enzyme [40], and the amino acid residue indicated by an arrowhead is especially important for this specificity [41]. The alignment was performed using the CLUSTALX and BOXSHADE programs.
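The Asp-type versus Asn-type distinction at position 134 can be read directly from a pairwise alignment against a reference DFR. The sketch below assumes that positions are counted on the ungapped reference sequence (1-based) and looks up the query residue aligned to a given reference position. The function names and toy sequences are hypothetical, and position 4 stands in for position 134 only to keep the example short.

```python
def residue_at_reference_position(aligned_ref: str, aligned_query: str, ref_pos: int) -> str:
    """Return the query residue aligned to 1-based position ref_pos of the ungapped reference."""
    if len(aligned_ref) != len(aligned_query):
        raise ValueError("sequences must come from the same alignment")
    seen = 0  # number of reference residues passed so far (gaps not counted)
    for ref_char, query_char in zip(aligned_ref, aligned_query):
        if ref_char != "-":
            seen += 1
            if seen == ref_pos:
                return query_char
    raise ValueError("reference sequence is shorter than ref_pos")


def dfr_type(residue: str) -> str:
    """Label a DFR by the residue found at the specificity-determining position."""
    return {"D": "Asp-type", "N": "Asn-type"}.get(residue.upper(), "other")


# Toy alignment (hypothetical sequences, not real DFR proteins).
toy_ref = "MA-TDLK"    # reference carries D at its 4th residue
toy_query = "MAGTNLK"  # query carries N at the aligned column

print(dfr_type(residue_at_reference_position(toy_ref, toy_ref, 4)))
print(dfr_type(residue_at_reference_position(toy_ref, toy_query, 4)))
```

Counting on the ungapped reference rather than on alignment columns keeps the numbering stable when gaps are introduced upstream of the site of interest.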

Glucosyltransferases (GT).
We identified two Passiflora EST clones, PACEPE3030G03.g and PACEPS7021H02.g, encoding proteins with sequence similarity to Ricinus communis glucosyltransferases (Table 1). The first cDNA sequence contained an ORF specifying a protein of 124 amino acids, and the second cDNA encoded a protein of 200 amino acid residues. These putative Passiflora GT proteins were compared with the GT enzymes described by Kovinich and colleagues [45] and retrieved from the NCBI database. The resulting phylogenetic tree comprised five clades, which agree with the in vitro substrate specificities of the enzymes [45]. Phylogenetic analysis revealed that the Passiflora sequences were positioned within the Cluster II proteins (Figure 5).

Glutathione S-Transferases (GSTs).
We have identified five Passiflora sequences representing putative members of the GST family. Each member was represented by a single EST sequence. Comparison of these deduced GST protein sequences with those in the GenBank database revealed homology with multifunctional GSTs from Populus, Ricinus, and Glycine spp. (see Table 1). Phylogenetic relationships among the putative Passiflora GSTs and family members of other plant species were established (Figure 6). Based on sequence similarity, the five Passiflora putative GSTs were grouped into three clades. PACEPE3018F08.g, PACEPS4006H06.g, and PACEPS7023B03.g are type I GSTs, PACEPE3007A05.g is a type II GST, and PACEPE3013H01.g is a type III GST [46].
Figure 6: A neighbor-joining phylogenetic tree of glutathione S-transferase (GST) amino acid sequences with three types representing the phi, tau, and zeta classes. Phi and tau are plant-specific GSTs. Bootstrap values from 1,000 replicates were used to assess the robustness of the trees. Only bootstrap values above 75% are indicated at the nodes. Accession numbers for genes from other species are given in Supplementary data.
We could not find any putative homologs of chalcone isomerases (CHI), flavanone 3-hydroxylases (F3H), or anthocyanidin synthases (ANS; see Figure 1) in the PASSIOMA database. Three EST sequences were identified corresponding to a putative flavonoid 3′-hydroxylase (F3′H) gene, and one sequence was found that showed significant homology to genes encoding flavonoid 3′,5′-hydroxylases (F3′5′H; data not shown). As these sequences were incomplete at their 5′ end, they were not considered in our analyses.
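The coverage of the core pathway by the validated sequences can be summarized with a small bookkeeping sketch. The enzyme order and the substrate and product names follow the pathway outline given in the Introduction, and the set of families found is taken from the Results; the data structure and function name are illustrative choices, not part of the original analysis.

```python
# Core enzymatic steps as (enzyme, substrate, product), in pathway order.
CORE_STEPS = [
    ("CHS", "4-coumaroyl-CoA + malonyl-CoA", "tetrahydroxychalcone"),
    ("CHI", "tetrahydroxychalcone", "naringenin"),
    ("F3H", "naringenin", "dihydroflavonol"),
    ("DFR", "dihydroflavonol", "leucoanthocyanidin"),
    ("LDOX/ANS", "leucoanthocyanidin", "anthocyanidin"),
    ("GT", "anthocyanidin", "anthocyanin"),
]

# Structural enzyme families with validated sequences, as reported in the text.
FOUND_IN_PASSIOMA = {"CHS", "DFR", "GT"}


def missing_steps(steps, found):
    """List, in pathway order, the enzymes with no validated sequence."""
    return [enzyme for enzyme, _substrate, _product in steps if enzyme not in found]


for enzyme, substrate, product in CORE_STEPS:
    status = "found" if enzyme in FOUND_IN_PASSIOMA else "not found"
    print(f"{enzyme:9s} {substrate} -> {product} [{status}]")

print(missing_steps(CORE_STEPS, FOUND_IN_PASSIOMA))
```

The hydroxylases F3′H and F3′5′H are left out of this list because their ESTs were detected but excluded from the analyses as incomplete.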

Identification and Phylogenetic Analysis of Passionflower Genes Potentially Involved in Spatially and Temporally Patterning Anthocyanin Deposition.
Based on the searches in the PASSIOMA database, we identified one potential homolog of an MYB transcription factor of the R2R3 class. The P. suberosa cDNA clone PACEPS7022E07.g encodes a protein of 132 amino acids showing 91% similarity to the Ricinus communis R2R3 MYB. On the other hand, PACEPE3007G07.g is a putative P. edulis WD40 gene of 886 bp encoding 291 amino acid residues showing 96% similarity to an R. communis WD40 (Table 1). Figure 7 shows an alignment of the deduced PACEPS7022E07.g protein sequence with 17 other plant anthocyanin-related R2R3-MYB proteins, indicating the presence of a conserved DNA-binding domain, designated as the R2R3 domain. All sequences analyzed also contained a second conserved amino acid motif in the R3 repeat (red box), important for the interaction between MYB and bHLH proteins in Arabidopsis [48]. The four specific residues required for this interaction in maize [49] are also indicated by the arrows in Figure 7. The third conserved motif appears to be ANDV (blue box) in the R3 repeat of all eudicot R2R3-MYB proteins related to anthocyanin biosynthesis.
Figure 7: Multiple sequence alignment of the R2R3 MYB domains involved in anthocyanin production, including the deduced amino acid sequence of Passiflora suberosa. R2R3 repeats refer to two imperfect repeats of the MYB domain. The identical and similar residues are highlighted on a black and gray background, respectively. The red box shows the R/B-like bHLH-interacting motif in the R3 repeat [45], and arrows indicate four specific residues of maize C1 required for interaction with the bHLH cofactor R [46]. The blue box shows a conserved motif in the R2R3 repeats of eudicot MYBs related to the anthocyanin pigments [47]. The alignment was performed using the CLUSTALX and BOXSHADE programs.

A phylogenetic tree of selected plant R2R3-MYB transcription factors, including PACEPS7022E07.g, was constructed using the alignment of the conserved R2R3 repeats (Figure 8). The Passiflora sequence was placed within the clade including ZmC1 (Zea mays), PhPH4 (Petunia hybrida), and VvMYB5a and VvMYB5b (Vitis vinifera), which are known to be involved in the regulation of the anthocyanin pathway in these species [49][50][51].
Sequence comparison of selected plant WD40 proteins with the sequence obtained from P. edulis indicated that the four WD repeats are highly conserved among all species analyzed (Figure 9). Phylogenetic analysis of these amino acid sequences confirmed that the P. edulis WD40 grouped together with the Ricinus communis WD40 and was more closely related to other dicot proteins (Figure 10).
No putative homologs to bHLH transcription factors were found in the PASSIOMA database.

Discussion
The flavonoid pathway results in the production of a range of flavonoid compounds, including anthocyanins (Figure 1). CHS is the first enzyme in the flavonoid branch of the phenylpropanoid pathway and is encoded by members of a plant-specific multigene family of polyketide synthases. Nevertheless, genes belonging to the CHS family have recently been described in some microorganisms (Azotobacter vinelandii [52] and Neurospora crassa [53]), which indicates that CHS functions might have evolved prior to the divergence of land plants. Thus, the biological functions of some of the CHS superfamily members are clearly important to plant adaptation. CHS proteins are collectively linked to the biosynthesis of different plant products with diverse functions such as UV protection, defense against pathogens, pigment biosynthesis, and pollen fertility [54,55].
Sequence analysis indicated that two Passiflora CHS deduced proteins belong to a small distinct group of chalcone synthases that includes angiosperm and gymnosperm homologs of anther-specific chalcone synthase-like genes (ASCLs; highlighted in Figure 2). Furthermore, all ASCLs form a monophyletic clade. Recently, ASCL transcripts were detected within the tapetum cells during the microspore stage in wheat [56]. These genes apparently have important roles in anther development and in pollen fertility [40,41,56].
The remaining three Passiflora CHSs were clustered together in a sister clade containing all seed plant CHS genes. Their products are considered key in the biosynthesis of flavonoids. These include the CHSA and CHSJ genes, known to be expressed in floral tissues and involved in floral pigmentation in petunia [30,31,57]. Moreover, two genes divergent from the typical CHSs formed a separate clade. The SyPKS gene from a cyanobacterium encodes an enzyme of the thiolase superfamily [58], whereas the function of the PpCHS11 gene (from Physcomitrella patens) may more closely resemble that of the most recent common ancestor of all plant CHSs than do other members of the plant CHS superfamily [55].
We did not identify putative genes encoding CHI enzymes. Besides the general limitations and drawbacks of the EST-based approach, another possible explanation is the rapid isomerization of chalcone to naringenin: even in the absence of a functional CHI enzyme, chalcone can spontaneously isomerize to form naringenin [15].
DFR is an enzyme that catalyzes the reduction of three dihydroflavonols, dihydromyricetin (DHM), dihydroquercetin (DHQ), and dihydrokaempferol (DHK), into colorless leucoanthocyanidins. These are further converted to delphinidin, cyanidin, and pelargonidin, respectively (Figure 1). The synthesis of the three different anthocyanidins is mainly determined by the enzymatic activities of two hydroxylases, F3′H and F3′5′H: F3′H converts DHK to DHQ, and F3′5′H converts DHK to DHM [15].
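The branch logic of this step can be made explicit with a short sketch. It assumes a deliberately simplified model in which DHK is always present, each active hydroxylase adds its product to the substrate pool without depleting DHK, and DFR specificity is represented as the set of dihydroflavonols it accepts. The function names and this all-or-nothing treatment are illustrative assumptions, not a kinetic model.

```python
# Dihydroflavonol substrate of DFR -> anthocyanidin ultimately derived from it.
PRODUCT = {"DHK": "pelargonidin", "DHQ": "cyanidin", "DHM": "delphinidin"}


def available_substrates(f3ph_active: bool, f3p5ph_active: bool) -> set:
    """Dihydroflavonols present, given which B-ring hydroxylases are active."""
    substrates = {"DHK"}  # simplification: DHK is never fully consumed
    if f3ph_active:
        substrates.add("DHQ")  # F3'H converts DHK to DHQ
    if f3p5ph_active:
        substrates.add("DHM")  # F3'5'H converts DHK to DHM
    return substrates


def possible_anthocyanidins(f3ph_active, f3p5ph_active, dfr_accepts=frozenset(PRODUCT)):
    """Anthocyanidins reachable when DFR accepts only the listed substrates."""
    pool = available_substrates(f3ph_active, f3p5ph_active)
    return {PRODUCT[s] for s in pool if s in dfr_accepts}


print(sorted(possible_anthocyanidins(False, False)))
print(sorted(possible_anthocyanidins(True, True)))
# A DFR that does not accept DHK yields no pelargonidin.
print(sorted(possible_anthocyanidins(True, False, dfr_accepts={"DHQ", "DHM"})))
```

The optional `dfr_accepts` argument is the hook for substrate specificity: restricting it removes the corresponding pigment even when the precursor is available.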
In some plant species, DFR displays distinct substrate specificity according to the hydroxylation pattern of the anthocyanin molecule [30]. A hypothesis to explain substrate specificity was proposed based on the alignment of the Petunia DFR amino acid sequence with those of other plants. The alignment indicated a variable region that controls substrate recognition. Petunia hybrida does not naturally produce orange flowers because its DFR enzyme cannot use dihydrokaempferol as a substrate to produce pelargonidin, owing to an aspartic acid residue at position 134 [30,42], as was also observed for Passiflora (Figure 3). The Petunia enzyme instead converts dihydroquercetin to leucocyanidin and, more efficiently, dihydromyricetin to leucodelphinidin [30,59]. On the other hand, some Gerbera genotypes have an asparagine residue at this position and can use all three dihydroflavonols as DFR substrates, consequently producing orange to red flowers [9,30]. Thus, flower color is partly determined by a single amino acid change that alters the substrate specificity of the DFR enzyme.
Almost all anthocyanidins undergo several modifications, which vary across species and involve enzymes of the glucosyltransferase, methyltransferase, and acyltransferase families. The most common is glycosylation at the 3-position of anthocyanidins (represented in Figure 1) to produce stable anthocyanin molecules [15,30,31,60]. UDP-glucose:flavonoid 3-O-glucosyltransferase (3GT), which belongs to a large multigene family of glucosyltransferases (GTs), catalyzes the final step in anthocyanin biosynthesis.
In this work, we adopted the classification of GTs into clusters according to Kovinich and colleagues [45]. Cluster I includes 3GT enzymes. Cluster II includes GTs with multiple substrate preferences, generally for chalcones, flavones, and flavonols but not anthocyanidins. Enzymes from Cluster III have isoflavone 7-O and anthocyanidin 3,5-O-GT activities. Cluster IV enzymes glycosylate flavonol and isoflavonol substrates, and Cluster V contains anthocyanin 5-O and/or flavone 7-O-UGT enzymes [45]. Our results indicated that the Passiflora glucosyltransferase gene sequences were grouped in Cluster II, together with other family members that show high catalytic specificity for more than one class of flavonoid substrates (Figure 5). DicGT5 (from Dianthus caryophyllus) is a chalcononaringenin 2′-O-glucosyltransferase [61], whereas the Beta vulgaris GT has a flavonoid-7,4′-O-betanidin-5-O-glucosyltransferase activity [62]. Both GTs have non-anthocyanidin substrate specificity. Despite these results, neither the substrate specificity nor the in vivo function of the Passiflora GTs can be predicted solely from amino acid sequence similarities; both must be experimentally determined.
Anthocyanin biosynthesis has been demonstrated to occur predominantly in the cytosol, but these pigments are exclusively accumulated in the vacuole of epidermal cells [20]. Transport of pigments to the vacuoles requires a glutathione S-transferase and a specific carrier protein localized in the vacuolar membrane. GSTs are multifunctional proteins encoded by a large gene family present in all cellular organisms. Plant GSTs are classified on the basis of sequence identity into four classes: phi, tau, theta, and zeta [46]. The two small zeta and theta classes include GSTs from animals and plants, while the phi and tau classes are plant-specific. Several studies have confirmed the involvement of GSTs in the vacuolar transport of anthocyanins. PhAN9 (from Petunia), ZmBZ2 (from maize), and AtTT19 (from Arabidopsis) are GST proteins involved in anthocyanin transport [30][31][32][63][64][65].
To characterize their phylogenetic relationships, the deduced amino acid sequences from the Passiflora putative GSTs were compared with other plant GST sequences, including the ones mentioned above. Figure 6 shows that the Passiflora GSTs are distributed among three different clades: three sequences were positioned in the same clade as PhAN9 and AtTT19 (phi class), whereas one sequence was grouped together with ZmBZ2 (tau class; [66]). Although these known proteins belong to distinct GST clades, they perform similar functions [63][64][65].
Interestingly, PACEPE3007A05.g was clustered with the carnation (Dianthus caryophyllus) type II GST (zeta class), which is associated with petal senescence in response to ethylene [67,68].
At the moment, we can classify the Passiflora GSTs into type I (phi), type II (zeta), and type III (tau). At least four of them might be involved in the anthocyanin pathway, and PACEPE3007A05.g might be related to other biological processes associated with flower development, such as those observed for the carnation GST.
In all analyzed species, the spatial and temporal expression of the structural genes of the anthocyanin biosynthetic pathway is controlled by regulatory genes, which affect the intensity and pattern of anthocyanin biosynthesis [15]. MYB and basic helix-loop-helix (bHLH) transcription factors and WD40 proteins form a transcriptional complex for the activation of the structural genes [4,12,20,47,69,70]. MYB and bHLH proteins are encoded by large multigene families, and those associated with anthocyanin biosynthesis are characterized by a conserved DNA-binding domain consisting of two imperfect repeats (named R2R3) and a specific bHLH domain, respectively. These two gene families have been extensively studied in model plants such as Arabidopsis and maize [48,49,71].
A multiple sequence alignment of the R2R3 domains of selected MYB proteins known to be involved in the regulation of anthocyanin biosynthesis and the deduced amino acid sequence of PACEPS7022E07.g confirmed the presence of the conserved R2R3-MYB domain in this P. suberosa sequence (Figure 7), as well as that of a second conserved domain in the R3 repeat (red box, Figure 7), which is known to be necessary for the interaction between MYB and bHLH transcription factors [48,49]. Additionally, a third motif in the R3 repeat (ANDV, blue box in Figure 7) represents a conserved motif shared among all eudicot MYBs involved in anthocyanin biosynthesis [72].
The phylogenetic tree obtained using the alignment shown in Figure 7 is presented in Figure 8 and indicates that LhMYB6 and LhMYB12 clustered outside the eudicot clade. These two genes regulate anthocyanin biosynthesis in the flowers of lily (Lilium hybrid), a monocot [73]. One clade is formed exclusively by eudicot anthocyanin regulators (PhAN2, AtPAP1, AtPAP2, AmROSEA1, and AmROSEA2; [12,71,[74][75][76][77]]). Curiously, one regulator of anthocyanin biosynthesis in maize (a monocot), ZmC1, was positioned in the same clade as dicot members such as PhPH4 (from Petunia) and VvMYB5a and VvMYB5b (from Vitis), as well as the Passiflora R2R3-MYB sequence. PhPH4 is expressed in the petal epidermis and activates vacuolar acidification in petunia [50]. The VvMYB5a and VvMYB5b genes are involved in the regulation of anthocyanin biosynthesis during grape berry development [51].
WD40 proteins are highly conserved and can be found in organisms that do not biosynthesize anthocyanins, such as algae, fungi, and animals [78,79]. In plants, these proteins are involved in a plethora of developmental and biochemical functions. As an example, the Arabidopsis TRANSPARENT TESTA GLABRA 1 (TTG1), which is a WD40 protein, is involved in regulating trichome formation, anthocyanin biosynthesis, seed coat pigmentation, and seed coat mucilage production. A common feature of WD40 repeat proteins is that they facilitate protein-protein interactions between the MYB and bHLH proteins [22,79].
The alignment of the Passiflora WD40 protein sequence with other known WD40s from different plant species revealed the presence of conserved WD40 motifs in the C-terminal region (Figure 9). The phylogenetic tree constructed based on this alignment is shown in Figure 10. The results indicated that the monocot sequences ZmPAC1 and OsWD clustered together, whereas the eudicot WD40s known to function as anthocyanin regulators were grouped into a different clade, with the Passiflora WD40 being closely related to the Ricinus communis protein (RcWD, Table 1 and Figure 10). Although WD40 proteins are required to regulate anthocyanin and proanthocyanidin biosynthesis together with MYB and bHLH transcription factors, their potential involvement in other biological processes is enormous; therefore, it is premature to say what functions PACEPE3007G07.g might perform in Passiflora.
The fact that no putative homologs of bHLH transcription factors were found in the PASSIOMA database may reflect the high degree of novelty still observed in most of the PASSIOMA libraries, which indicates that the full spectrum of expressed genes was not sampled [27]. A deeper sequencing effort might reveal that such homologs are indeed expressed in Passiflora flowers, as these elements are generally essential to the stability of the MYB-WD40 protein complex [30][31][32].

Conclusions and Perspectives
We took the first steps toward the understanding of the molecular processes involved in the biosynthesis of anthocyanins in Passiflora that could account for the differences in pollinator preferences found in the genus. We identified 15 putative coding sequences derived from two distinct Passiflora species (P. edulis and P. suberosa) expressed in developing flower buds and potentially involved in the anthocyanin biosynthetic pathway. Comparisons of deduced amino acid sequences from the 15 Passiflora cDNAs with selected sequences from other plant species revealed strong similarity with genes that encode key elements involved in the biosynthesis (8 sequences), transcriptional regulation (2 sequences), and transport (5 sequences) of anthocyanin molecules.
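The breakdown of the 15 sequences can be cross-checked with a short tally. The per-family counts below are read off the Results and Discussion text (five CHS, one DFR, two GT, five GST, one MYB, and one WD40 sequence), and the role assigned to each family follows the grouping used in this section; the variable names are illustrative.

```python
from collections import Counter

# Validated Passiflora sequences per gene family, as described in the text.
SEQUENCES_PER_FAMILY = {"CHS": 5, "DFR": 1, "GT": 2, "GST": 5, "MYB": 1, "WD40": 1}

# Functional role assigned to each family in the Conclusions.
ROLE = {
    "CHS": "biosynthesis",
    "DFR": "biosynthesis",
    "GT": "biosynthesis",
    "GST": "transport",
    "MYB": "transcriptional regulation",
    "WD40": "transcriptional regulation",
}

per_role = Counter()
for family, count in SEQUENCES_PER_FAMILY.items():
    per_role[ROLE[family]] += count

total = sum(SEQUENCES_PER_FAMILY.values())
print(dict(per_role), total)
```

The tally reproduces the stated split of 8 biosynthesis, 2 regulation, and 5 transport sequences, for a total of 15.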
Studies to determine the temporal and spatial expression patterns of the putative Passiflora anthocyanin-related genes presented here are already under way in our group. We expect that future work on the manipulation of their expression patterns, using transgenic approaches, will help us to unravel important aspects relating anthocyanin biosynthesis, flower pigmentation, and flower pollination in rapidly changing tropical environments.