The Coevolution of Phycobilisomes: Molecular Structure Adapting to Functional Evolution

Phycobilisome is the major light-harvesting complex in cyanobacteria and red alga. It consists of phycobiliproteins and their associated linker peptides which play key role in absorption and unidirectional transfer of light energy and the stability of the whole complex system, respectively. Former researches on the evolution among PBPs and linker peptides had mainly focused on the phylogenetic analysis and selective evolution. Coevolution is the change that the conformation of one residue is interrupted by mutation and a compensatory change selected for in its interacting partner. Here, coevolutionary analysis of allophycocyanin, phycocyanin, and phycoerythrin and covariation analysis of linker peptides were performed. Coevolution analyses reveal that these sites are significantly correlated, showing strong evidence of the functional and structural importance of interactions among these residues. According to interprotein coevolution analysis, less interaction was found between PBPs and linker peptides. Our results also revealed the correlations between the coevolution and adaptive selection in PBS were not directly related, but probably demonstrated by the sites coupled under physical-chemical interactions.


Introduction
The process of photosynthesis is initiated by the absorption of light. In cyanobacteria and red algae, the main accessory light-harvesting complexes are comprised of the phycobilisomes (PBSs), which are attached to the cytoplasmic surface of the thylakoid membrane except Gloeobacter violaceus PCC7421 having no thylakoid membrane [1][2][3][4][5][6]. PBSs are composed of rods and a core and biochemically consist of phycobiliproteins (PBPs) and linker polypeptides, which are particularly superior subjects for the detailed analysis of structure and function due to their various components affected by growth conditions [2]. In view of the spectral properties as well as pigment compositions, allophycocyanin (APC), phycocyanin (PC), and phycoerythrin (PE) are the principal classes of PBPs in cyanobacteria. They consist of two different subunits, α and β, which exhibit high affinity for one another and associate into (α/β)-monomers to be organized as (α/β) 3 -trimers and (α/β) 6 -hexamers [7]. Different PBPs contain different kinds and different numbers of chromophores, covalently attached to the apoprotein by thioether bonds to cysteine residues. PC has three phycocyanobilin chromophores attached to the monomer through thioester linkages at the α84, β84, and β155 positions [8,9]. In addition, unlike PBPs, most of the linker polypeptides do not bear chromophores [10]. Previous studies have provided a system of abbreviations to characterize linker peptides in PBSs: rod linker (L R ), rod-core linker (L RC ), core linker (L C ), and core-membrane linker (L CM ) [11,12]. They can induce the aggregation of the PBP trimers (L R ) and also connect the rods to the core (L RC ), and the core to the thylakoid membrane (L CM ). The light energy absorbed by PE is transferred to PC, and then to APC, finally to the chlorophyll a in a quite efficient way [2,4]. PBPs are important for absorbing light energy, while the linker polypeptides are important for stability and assembly of the complex.
Previous researches are mainly focused on PBPs. Electron microscopic and crystallographic studies have revealed that the tertiary fold and the general architecture of macromolecular assemblages are remarkably conserved and provided a wealth of information on structure and function relationship of PBPs [13][14][15][16][17][18]. Amino acid sequence alignments and phylogenetic analyses have been used to go through the parse for the evolution of PBPs [7]. Also, the divergence and evolution of linker family have been investigated [19].
Light quality and quantity are key factors affecting the composition of PBSs. Two different forms of PE gene, found in two ecotypes of Prochlorococcus, are specifically adapted to either high-light (HL) or low light (LL) conditions which are under different selective pressure [19][20][21]. The structure and function of linker peptides in PBSs have shown a great diversity based on the light condition [1,22]. The method to respond to high-light stress in marine cyanobacteria is decreasing the content of PBSs per cell [23].
As we know, coevolution is prevalent at species as well as molecular levels. In the molecular level, coevolution between amino acid sites can be the result of their structural, functional, physical interaction, phylogenetic convergence, and their stochastic covariation [24]. Coevolving sites are a powerful indicator of the structures, interactions, and functions between residues [25][26][27]. The strength and pattern of coevolution vary depending on their environment. Since the nature and strength of residue interactions vary according to the involved residues and their local and global environments, coevolution exhibits a complex dependence [28].
The availability of protein sequences and their previous information allow us to perform a systematic screening on PBS protein families. Here we extended an exhaustive coevolution analysis of PBP genes and the linker polypeptide genes from the well-annotated and even unfinished cyanobacterium and red alga genomes. Intramolecular and interprotein coevolution of PBPs and covariation analysis of linker peptides in the varieties of PBSs were analyzed, and specific comparison to positive selection was also performed for better understanding the evolution of PBSs.

Sequence Collection, Alignment.
For the large amounts of data, sequences with PBPs and linker peptides in 21 cyanobacteria and 5 red algae were obtained from GenBank with the accession numbers which could be found in the accessory files. The [29,30] and MUSCLE [31] software and then manually adjusted using BioEdit (http://www.mbio.ncsu.edu/BioEdit/bioedit.html). Further analyses were all performed on this set of aligned amino acid sequences.

Coevolution
Analysis. Many methods, parametric or nonparametric, suffer from inaccuracies from their inability to erase the background noise [24,25]. Coevolution analysis using protein sequences (CAPS) compares the correlated variance of the evolutionary rates corrected by the time based on the divergence of the protein sequences [25]. It uses the blocks substitution matrix (BLOSUM) method between two sequences at these particular sites [32]. This application is based on CAPS Version 1.0 [33]. This method has proved to be successful in disentangling real coevolutionary signal from the background noise and minimizing false positive rate with high sensitivity [25]. CAPS can produce the files containing information of coevolutionary networks and compensatory mutations. CAPS program is available at http://bioinf.gen.tcd.ie/caps/. Also, we run the InterMap 3D 1.3 server (http://www.cbs.dtu.dk/services/InterMap3D/) [34,35] to measure the atomic distance as a complementary explanation for the evidence of coevolution.

Covariation Analysis.
Detecting structural interactions and statistical covariance among separate amino acid sites is significant for understanding protein covariation and evolution [27,36]. Such analyses are based on the assumption that functionally significant coordinated residues in proteins originated by physicochemical properties (e.g., volume, charge, polarity, and hydrophobicity) of the residues [37]. Here, we use the software CRASP (Correlation analysis of the amino acid substitutions in protein sequences) to run the coevolved analysis. The CRASP program is available at http://wwwmgs.bionet.nsc.ru/mgs/programs/crasp/. Figure 1 shows amino acid sequences alignment of PC α subunits. In addition, numbers of highly conserved amino acids of the PBPs were identified from the sequence alignments (see detailed information in the supplementary materials available online at doi: 10.1155/2011/230236). Figure 2 provides clear coevolution relevance with α and β subunits in PE, PC, and APC, respectively. In addition to the implementation of the method previously published [25], CAPS also performs a preliminary analysis of compensatory mutations by testing the correlation in the hydrophobicity and the molecular weight variations between coevolving amino acids [33]. Some of the coevolving groups detected are significantly correlated either in hydrophobicity or molecular weight or both (details are shown in Table 1).

Interprotein Coevolution Analysis.
Interprotein coevolution, in addition to the intramolecular analysis developed previously, can also be operated by CAPS. Detecting correlation in the molecular weights and hydrophobicities in the groups of coevolution is not available in such condition. We run all the possible interprotein coevolution analysis according to the locations in PBSs, including two proteins of PBPs or linker peptides or both. The linker peptides had few connections to PBPs according to their coevolution results. No coevolution groups were found in the CAPS output in APC-L CM , L C -L CM , PE-L R , and L RC -L C . Figure 3 shows six interprotein coevolution networks in PC-L R , PC-L RC , PC-APC, PE-PC, APC-L C , and L R -L RC . Compared to intramolecular coevolution, less groups were found in interprotein analysis. Besides, it is obvious that the relationships among two proteins of PBPs or linker peptides were much closer than the connections between the PBPs and linker peptides.

Covariation Analysis.
These characteristics of amino acids reflect physical and chemical interactions between residues. It has been suggested that these linker proteins play roles in rod-core assembly and complex stabilization [38]. There are many physical-chemical scale parameters such as flexibility, volume, polarity, and hydrophobicity. Here, we firstly considered such amino acids characteristics as volumes for covariation analysis. As can be seen in Figure 4, nearly all the lineages were highly correlated at the 99% significance level, while some of them approaching to 99.99%. In L C , the lengths of the sequences are very short (approximately 67 amino acids); thus the number of the coevolved sites were the least. Other linker peptides contain numerous coevolved residues owning to the physical-chemical interactions. Then alignment consequences of these peptides are narrowly conservative. The number of amino acids, residue-residue interactions, the dependence of covariations on phylogenetic distances and interior environment would be the main factors to account for the covariation outcomes. Then we chose some other amino acid characteristics (polarity, hydrophobicity, and flexibility) to perform the covariation analyses. Results from the properties are similar to the former analysis, just changing the branch locations with the same residues.

Atomic
Distance. The analysis of the atomic distances (AD) identified a certain percentage of coevolving residues within each group as spatially close. Physical distance (<10Å) is one pattern within the residues in the coevolutionary events [39].
In Figure 5, large amounts of spatial couplings and few physical interactions were detected in all PBPs and linker peptides.
Spatially proximal pairs of sites and clusters of distant sites located in functional domains, suggest a functional dependency between them [39]. Furthermore, linker family shares the coevolution positions in which most atomic distances showed not available for their unknown protein tertiary structures.

Discussion
Intramolecular coevolution detected among PBPs reveals the strong coevolved connections between sites. The factors on compensatory mutations including hydrophobicities and molecular weights are among the most important in explaining amino acid contribution to protein structure with less error [40]. Most of the coevolving residues are significantly correlated in hydrophobicities and molecular weights except the PE α subunit. It may be caused by the microenvironment of PBS. The interactions of hydrophobicities are responsible for different phenomena such as structure stabilization of proteins [41] and folding of proteins [42]. PE is the outmost portion of the structure of PBS, so it might possess less physical-chemical interactions than APC and PC. Other possible explanations include the coupling patterns which can balance the formation of the region and the interior environment such as water dynamics.
Apt proposed a hypothetical outline that different types of PBPs and linker peptides originated from the same ancestor [7]. The results of interprotein coevolution analysis in PBPs verify the previous hypothesis, and so as linker peptides. Apt and Zhao also supposed that the linker polypeptides developed from an earlier ancestor of PBPs [7,43]. The rare relationship between PBPs and linker peptides depending on the interprotein coevolution analysis demonstrates less interaction in the long period of evolution. This hypothesis would in part be overturned by this point.
Interestingly, a significant proportion of the sites detected coevolving had been previously proposed to be under adaptive evolution [25,39]. Based on S. sp. PCC 6803 PCα protein numbering in Zhao's research [43], these residues are 4P, 5L, 7E, 15Q, 25Q, 66T, 88I, 107L, 118S, 119P, 134K, and 140H (these positions are not the same with the paper [43]     and 140H are involved in the coevolutionary analysis. The positive selective residues are usually between or adjacent to the coevolutionary sites, such as 118S and 119P within the coevolved residues 115I, 116D, and 120R. In PC-β subunit, 30T, 57R, 61A, 103S, 127V, 129A, 130G, 133K, 139L, 167A, 168A, and 171V were under the adaptive selection. Surprisingly, just one residue 61A was detected coevolved. Most of the sites are also among the coevolved sites that shows the potential connection between coevolved and selective residues. And we found that the coevolution positions occurred through the whole molecule, while many sites with elevated dN/dS ratios (the frequency of nonsynonymous versus synonymous substitutions) in different PBP lineages were located in the chromophore-binding domain and the helical hairpin domains (X and Y) [43]. The identification of genes showing particular amino acid residues that have undergone adaptive evolution is a key in determining functionally or structurally important protein regions. Conserved amino acids throughout protein evolution are expected to have critical effects on protein functions [44,45]. Former researches had concluded the relationship between coevolution and selective pressure with the fixed associations [25,43]. In this paper, we found that most of the sites under adaptive selection were adjacent to or among the coevolved ones. One hypothesis is that according to the physical-chemical properties, the residues under positive selection are one key factor to stimulate the sites coevolved and vice versa. Coevolutionary sites may be important in two ways. First, some sites are functionally important because they provide the ability to respond to the dynamic circumstance [46,47]. Second, the regions may be indirectly important because they fall in the vicinity of important amino acid sites, and therefore their variability may dramatically affect functional sites. In the latter case, variable amino acid sites tend to coevolve to preserve the structural characteristics of the functional sites [26]. It is expected that compensatory coevolution may occur either between amino acid sites three dimensionally proximal (indicating structural and probably functional coevolution) or alternatively between sites apparently far apart from one another but in contact with functionally important sites. Certain variability coupled with the strong functional constraints and the involvement in the network of interactions for coevolutionary processes would both arise from the environmental factors especially light acclimation. Hence, the complex relations between coevolution and selective constraints are worth pursuing at a deeper level.
The coevolutionary analysis is regarded as an important tool to gain functional and structural relationships in a protein. The evolution of amino acid residues is hence depending on their mutation and the constraint pressure imposed by their complex networks [48]. Amino acid interdependency can lead to coevolution. Many evidences pointed to the importance of coevolution in shaping the molecular function [24,25,27]. Moreover, structural and functional coupling of distant interacting residues requires coevolution among these amino acid residues. Some possible explanations include the coupling of binding energy via pathways in the protein, interactions with intermediate molecules, and the surrounding environment. Various environmental factors especially light acclimation were the primary influences in coevolution. And the detail evolution mechanism in PBSs mediating by the light can be further resolved.