Computational Analysis of the Binding Specificities of PH Domains

Pleckstrin homology (PH) domains share low sequence identity but highly conserved structures. They are found in many proteins, where they mediate signal-dependent membrane targeting by binding inositol phosphates and thereby support different physiological functions. To understand the sequence-structure relationship and binding specificities of PH domains, we combined quantum mechanical (QM) calculations with sequence-based and structure-based binding analysis. On the structural side, the binding specificities were shown to correlate with the hydropathy characteristics of PH domains and the electrostatic properties of the bound inositol phosphates. By comparing these structural properties with sequence-based profiles of physicochemical properties, PH domains can be classified into four functional subgroups according to their binding specificities and affinities for inositol phosphates. The method not only provides a simple and practical paradigm for predicting binding specificities in functional genomic research but also gives new insight into the basis of diseases related to PH domain structures.

In the present work, we focus on eleven well-known PH domains whose three-dimensional structures have been determined. Five of these PH domains bind inositol phosphates with different specificities [16-21]. In addition to evolutionary classification, PH domains have been divided into four functional subclasses according to binding affinities and specificities [2] (Table 1). In summary, PH domains in Group 3 bind their preferred ligands with affinities similar to those in Group 1, whereas PH domains in Group 2 have 4- to 8-fold weaker binding affinities. PH domains in Group 4 have low affinity and less specificity. Concerning the three inositol phosphates, the affinity of Ins(1,4,5)P3 for its preferred PH domains is generally lower than those of Ins(1,3,4,5)P4 and Ins(1,3,4)P3 [22,23]. To understand the sequence-structure relationship of the four groups of PH domains, we have investigated a number of sequence profiles to characterize their physicochemical properties [24]. More recently, molecular dynamics (MD) simulations were performed to study the structure and binding affinity of functional mutations of the Btk PH domain [25] and the role of membrane penetration and electrostatics in the interaction between the GRP1 PH domain and PI(3,4,5)P3 [26,27].
Despite the large body of PH domain literature, several questions remain unresolved: (1) Why is the binding affinity of Ins(1,4,5)P3 for its preferred PH domains weaker than those of the other two inositol phosphates? (2) What factors determine the binding specificities of PH domains for different inositol phosphates? (3) Given that PH domains have low sequence identities but highly conserved structures, is there any intrinsic relationship between the binding specificities and the sequence profiles of physicochemical properties? To address these questions, a systematic comparison of the sequence-structure-function relationship is needed. In this paper, we first calculated the properties of both the inositol phosphates and the PH domains, and then analyzed the binding specificities and affinities of PH domains from both the structural and sequence aspects.

Sequence Analysis.
Because the sequence similarities of PH domains are limited (average identity is 16%), a reliable multiple sequence alignment could not be obtained with general alignment programs. The sequence alignment of the selected PH domains was therefore retrieved from the protein family (Pfam) database, which is built on hidden Markov models [30]. The alignment was further refined using structure-based alignment and is shown in Figure 1. The PH domains were selected to represent the different functional subgroups and include Btk (PDB codes: 1BTK, 1B55), Grp1 (1FGZ, 1FGY, 1FHW, 1FHX), Plc-δ (1MAI), spectrin (1BTN, 1MPH, 1DRO), pleckstrin (1PLS), β-Ark (1BAK), Dapp1 (1FB8, 1FAO), dynamin (1DYN), and UNC-89 (1FHO). MEGA6 was used to construct a phylogenetic tree of the PH domains based on the maximum composite likelihood method [31]. The profiles of physicochemical properties of PH domains, including flexibility, hydropathy, isotropic surface area, and electronic charge concentration, were calculated as previously described [24].
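As an illustration of what a sequence-based property profile looks like, the following minimal sketch computes a window-averaged hydropathy profile. It assumes the Kyte-Doolittle scale and a window of 7 residues; the scales and window sizes actually used are those of [24] and are not stated here, so both choices, and the function name `sliding_profile`, are assumptions made for illustration only.

```python
# Kyte-Doolittle hydropathy scale (positive values are hydrophobic).
# Assumption: the profiles in [24] may use a different scale.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def sliding_profile(sequence, scale=KYTE_DOOLITTLE, window=7):
    """Return the window-averaged property value for every full window.

    The result has len(sequence) - window + 1 entries; entry i is the
    mean of the scale values of residues i .. i + window - 1.
    """
    if window < 1 or window > len(sequence):
        raise ValueError("window must be between 1 and the sequence length")
    values = [scale[residue] for residue in sequence.upper()]
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


if __name__ == "__main__":
    # Toy peptide, not a real PH domain sequence.
    for position, value in enumerate(sliding_profile("MKKRSLEEIIVLFAG", window=5)):
        print(position, round(value, 2))
```

Minima of such a profile mark hydrophilic stretches and maxima mark hydrophobic ones; the same windowing can be applied to any per-residue scale, such as flexibility or charge.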

Structure and Binding Analysis.
The electrostatic potentials of PH domains were calculated using a finite difference solution to the nonlinear Poisson-Boltzmann equation [32]. The grid was 20 Å larger than the PH domain molecule and contained 123 grid points in the longest dimension. The solute dielectric constant was set to 2. Solvent accessible surface areas (SASA) were calculated according to the algorithm of Lee and Richards [33], with a solvent radius of 1.4 Å for water. The fractions of residues exposed to solvent were calculated directly from the experimental structures and were subsequently used to generate profiles with the sliding window averaging technique to facilitate comparison with the predicted properties. For NMR entries lacking an average structure (1MPH, 1PLS, 1BAK, and 1FHO), the first structure in the entry was used. The detailed interatomic contacts between inositol phosphates and PH domains were investigated with the LIGIN program [34]. All other structural analyses were performed with the InsightII software of Accelrys, Inc.
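For reference, one common way of writing the nonlinear Poisson-Boltzmann equation that such finite difference solvers discretize is given below. The notation is an assumption made here (Gaussian units, a 1:1 electrolyte, reduced potential); it is not taken from [32].

```latex
\nabla \cdot \left[ \varepsilon(\mathbf{r}) \, \nabla \phi(\mathbf{r}) \right]
  - \bar{\kappa}^{2}(\mathbf{r}) \, \sinh \phi(\mathbf{r})
  = - \frac{4 \pi e^{2}}{k_{B} T} \sum_{i} z_{i} \, \delta(\mathbf{r} - \mathbf{r}_{i}),
\qquad
\phi(\mathbf{r}) = \frac{e \, \psi(\mathbf{r})}{k_{B} T}
```

Here $\psi$ is the electrostatic potential, $\varepsilon(\mathbf{r})$ is the position-dependent dielectric constant (2 inside the solute, as stated above), $\bar{\kappa}^{2}(\mathbf{r})$ is the ionic screening term, which vanishes inside the solute, and $z_{i}$ are the partial charges located at $\mathbf{r}_{i}$. Replacing $\sinh \phi$ by $\phi$ gives the linearized equation.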

Results and Discussion
Binding PH domains have been identified in various species [35,36], and a few reports have discussed the evolution of TFKs including PH domains [37]. It is therefore of interest to compare the evolutionary relationships among the different groups of PH domains with their binding specificities. The phylogenetic tree for 12 PH domains (Figure 2) was constructed based on the sequence alignment. The four groups are evidently not clearly separated in the phylogenetic tree. For example, Btk, Plc-δ, and PDK1, which belong to three different groups, are phylogenetically close. This result may be attributed to the low sequence identities but extremely conserved structures of PH domains. A detailed analysis of their structural features is therefore needed. We calculated and analyzed the binding specificity of PH domains (receptors) for inositol phosphates (ligands). First, the geometries and electronic properties of the three inositol phosphates were calculated. Next, the electronic properties and hydropathy of the receptors were investigated with structure-based analysis. Then, based on the structural characteristics of PH domains and inositol phosphates, the binding affinities and specificities of PH domains for inositol phosphates were analyzed.
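The low sequence identity that blurs the tree can be quantified directly from an alignment. The sketch below computes pairwise identity over aligned, gap-free columns and averages it over all pairs. The convention for the denominator (columns where neither sequence has a gap) is an assumption, since several conventions exist, and the sequences in the usage example are toy strings, not PH domain sequences.

```python
from itertools import combinations

GAP = "-"


def pairwise_identity(seq_a, seq_b):
    """Fraction of identical residues over columns where neither row has a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == GAP or b == GAP:
            continue  # skip columns with a gap in either sequence
        compared += 1
        if a == b:
            identical += 1
    return identical / compared if compared else 0.0


def mean_identity(alignment):
    """Average pairwise identity over all pairs in a {name: aligned_seq} dict."""
    pairs = list(combinations(alignment.values(), 2))
    if not pairs:
        raise ValueError("need at least two sequences")
    return sum(pairwise_identity(a, b) for a, b in pairs) / len(pairs)


if __name__ == "__main__":
    toy_alignment = {  # hypothetical aligned fragments
        "domA": "MK-LEGW",
        "domB": "MRALEG-",
        "domC": "LK-VDGW",
    }
    print(round(mean_identity(toy_alignment), 3))
```

One minus the pairwise identity is the p-distance, the simplest distance from which a tree can be built; the tree in Figure 2 was built with MEGA6 rather than with this simplified measure.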

Geometries and Electronic Properties of Inositol Phosphates.
The electrostatic potentials of the three inositol phosphates were calculated by QM (Figure 3). The calculated single point energies, geometries, and electronic properties are listed in Table 2. In all the optimized structures, the myo-inositol ring adopted the conformation with 1-axial/5-equatorial oxygen positions (the C2-hydroxyl in the axial position and the other hydroxyls/phosphates in the equatorial orientation). The electronic charge distribution of Ins(1,4,5)P3 clearly differs from that of Ins(1,3,4)P3 (or Ins(1,3,4,5)P4) (Figure 3). In Ins(1,4,5)P3, the negative charge (in red) is concentrated on one side of the molecule. The molecules were oriented based on the superimposition of their inositol carbon atoms. To compare the geometries and properties of Ins(1,3,4)P3 and Ins(1,4,5)P3, Ins(1,4,5)P3 was rotated by 180° to superimpose the phosphate groups of the two molecules. It has been reported that Ins(1,3,4,5)P4 binds to Btk in the opposite orientation compared to Ins(1,4,5)P3 binding to the Plc-δ PH domain, although the interacting residues are in corresponding positions [3]. Figure 3 provides a qualitative explanation for this phenomenon, since the dipole moments of Ins(1,4,5)P3 and Ins(1,3,4,5)P4 point in almost the same direction when the molecules are in inverted orientations.
The data in Table 2 confirmed the results of QM calculations. Although the chemical compositions of Ins(1,3,4)P3 and Ins(1,4,5)P3 are the same, their single point energies differ by 6.4 kcal/mol, indicating that the Ins(1,4,5)P3 conformation is theoretically more stable. The comparison of the geometries of the compounds provides an explanation.

BioMed Research International

The electronic spatial extent is a measure of molecular volume, whereas the dipole moment is an index of molecular polarity. Ins(1,3,4)P3 thus has a wider electronic charge distribution and greater polarity than Ins(1,4,5)P3. Since electrostatic interaction is the main contributor to the interaction between PH domains and inositol phosphates, Ins(1,4,5)P3, which has the smaller electronic spatial extent, binds its preferred PH domain more weakly than Ins(1,3,4)P3 does. It is evident from Figure 3 and Table 2 that the phosphate groups of Ins(1,4,5)P3 are on one side of the molecule and that its shape is much flatter than that of Ins(1,3,4)P3. The different geometries and electronic properties of Ins(1,3,4)P3 and Ins(1,4,5)P3 contribute to their different specificities in PH domain interactions. The hydrophilic phosphate groups of inositol phosphates favour a hydrophilic environment in the binding region of proteins. Among the three inositol phosphates, Ins(1,3,4,5)P4 requires the most hydrophilic environment for binding because it has four hydrophilic phosphate groups. For Ins(1,4,5)P3, the binding environment in the PH domain should be hydrophilic on one side of the ligand. The binding environment for Ins(1,3,4)P3 is less hydrophilic than that for Ins(1,4,5)P3, because its three phosphate groups are distributed separately and widely across the molecule.
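To make the role of the dipole moment concrete, the sketch below computes the dipole vector of a set of point charges and the angle between two dipoles. It is a classical point-charge simplification, not the QM calculation behind Table 2, and the function names are hypothetical. For a molecule with a net charge, as for the inositol phosphates, the dipole depends on the choice of origin, so two molecules should be compared in a common frame, for example after superimposing the inositol ring atoms.

```python
import math

E_ANGSTROM_TO_DEBYE = 4.80320  # one elementary charge times one Angstrom, in Debye


def dipole_moment(charges, coords):
    """Dipole vector in Debye for point charges (in e) at coords (in Angstrom)."""
    if len(charges) != len(coords):
        raise ValueError("charges and coords must have the same length")
    mu = [0.0, 0.0, 0.0]
    for q, (x, y, z) in zip(charges, coords):
        mu[0] += q * x
        mu[1] += q * y
        mu[2] += q * z
    return [component * E_ANGSTROM_TO_DEBYE for component in mu]


def magnitude(vector):
    """Euclidean length of a 3-vector."""
    return math.sqrt(sum(component * component for component in vector))


def angle_between(u, v):
    """Angle in degrees between two non-zero 3-vectors."""
    norm = magnitude(u) * magnitude(v)
    if norm == 0.0:
        raise ValueError("angle is undefined for a zero vector")
    cosine = sum(a * b for a, b in zip(u, v)) / norm
    cosine = max(-1.0, min(1.0, cosine))  # guard against rounding error
    return math.degrees(math.acos(cosine))


if __name__ == "__main__":
    # Toy example: +1 e and -1 e separated by 1 Angstrom along x.
    mu = dipole_moment([1.0, -1.0], [(0.5, 0.0, 0.0), (-0.5, 0.0, 0.0)])
    print(mu, magnitude(mu))
```

A small angle between the dipoles of two ligands in their bound orientations corresponds to the situation described for Ins(1,4,5)P3 and Ins(1,3,4,5)P4, whose dipoles align when one ligand is inverted.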

Electrostatic Properties and Hydropathy of PH Domains.
The calculated electrostatic potentials of PH domain structures (Figure 4) indicate that the binding sites for inositol phosphates are conserved and positively charged, and thus that electrostatic interactions play the key role in the binding of inositol phosphates. The location and orientation of the inositol phosphate binding sites can be estimated by calculating the electrostatic properties of PH domains. Hydrophobicity is also important for both the function and the stability of a protein, and hydropathy profiles may indicate functional sites [38]. The hydropathy analyses of PH domains are shown in Figure 4. The hydropathy environments of the PH domain binding regions are not as conserved as the electronic charge distributions, so hydropathy profiles can help to explain the different affinities and specificities in inositol phosphate binding. The binding environments are most hydrophilic for PH domains in Group 1 (Figures 4(a) and 4(b)). The hydrophilic binding environments for PH domains in Group 2 lie on one side of the bound Ins(1,4,5)P3 molecule (Figures 4(c) and 4(d)). The hydrophilic binding environments for PH domains in Group 3 are less strict (Figure 4(e)). These observations are in agreement with the results obtained from the QM calculations of inositol phosphates.
We compared the structural profiles of electrostatic properties and hydropathy with the results obtained from sequence profiles [24]. In both sets of profiles, eight conserved extrema were found in structurally essential regions. For example, residues with smaller electronic charge correspond to the residues forming the hydrophobic core of PH domains, which may contribute to stabilizing the structure. According to the profiles of electronic charge concentration, the most charged segments are generally located in the β1/β2 and β7/α1 loops. Indeed, the β1/β2 loop is positively charged and thus appears to be the most important segment for the binding of inositol phosphates. Table 3 lists the contact surface areas of phosphate groups in different PH domain-inositol phosphate complexes and the normalized complementarity (NC) function calculated by the LIGIN program [34]. All the illegitimate contacts are of the hydrophilic-hydrophobic type. In all cases, the C1-phosphate group has the lowest NC function, indicating that the C1-phosphate group generally points outward and that phosphoinositides can be replaced by inositol phosphates to study the binding specificities of PH domains. The binding affinities of inositol (tetra- and penta-) phosphates for the Btk PH domain have the following order: Ins(1,3,4,5,6)P5 ≅ Ins(1,3,4,5)P4 > Ins(1,3,4,6)P4 > Ins(1,4,5,6)P4.
In the structure of the Dapp1 PH domain-Ins(1,3,4,5)P4 complex, the C6-hydroxyl is in a hydrophobic environment but points outwards from the domain, so it has only a small effect on the affinity. The surroundings of the C5-phosphate contain both hydrophobic and hydrophilic residues, and thus the C5-phosphate has a minor effect on the interaction (NC = 0.20). In conclusion, the hydropathy analysis of the binding environment provides explanations for the experimentally obtained order of binding affinities.
Contrary to the Plc-δ domain (Figure 4(c)), the β5/β6 loop in the spectrin PH domain (Figure 4(d)) is more hydrophilic and more positively charged than the β3/β4 loop. Consequently, the inositol phosphate binds between the β1/β2 and β5/β6 loops in the spectrin PH domain.

Analysis of Binding Affinity.
The positions of binding sites in SASA profiles of known PH domain structures are shown in Figure 5. The PH domain binding sites are generally hydrophilic, flexible, and charged. Charge concentration, hydrophilicity, and flexibility are the main factors that determine the binding affinity and specificity. Since the β1/β2 loop is located between the β3/β4 and β5/β6 loops, the residues in this loop play crucial roles in inositol phosphate binding. In Figures 5(a) and 5(b), the binding sites in the β1/β2 loop of Group 1 PH domains (Btk and Grp1) are located in the region of high hydrophilicity, high flexibility, and high electronic charge concentration. Group 1 PH domains are specific and have high affinity for Ins(1,3,4,5)P4. In addition, the high affinity of the Grp1 PH domain for Ins(1,3,4,5)P4 includes a contribution from the β6/β7 loop. In the Btk PH domain-Ins(1,3,4,5)P4 complex, the β6/β7 loop is not involved since it is hydrophobic.
The binding sites in the β1/β2 loop of Group 2 PH domains (Plc-δ and spectrin, Figures 5(c) and 5(d)) are also hydrophilic. The flexibility and electronic charge concentration are also high, but the electronic spatial extent and dipole moment of Ins(1,4,5)P3 are relatively small, and its hydrophilic phosphate groups are distributed on one side of the molecule. The affinities of Group 2 PH domains are therefore reduced compared to Group 1 and Group 3. Compared to Group 2 PH domains, the binding environment of the Group 3 PH domain is less hydrophilic (Figure 5(e)), because the phosphate groups of its ligands are distributed separately, even for Ins(1,3,4)P3. The high binding affinity of the Dapp1 PH domain for Ins(1,3,4)P3 could be related to the electronic properties of Ins(1,3,4)P3. The Akt PH domain has profiles similar to those of Dapp1 [24,41] and binds Ins(1,3,4)P3 through the β1/β2, β3/β4, and β6/β7 loops. Group 4 PH domains may be identified by inspecting the profiles of electronic charge concentration and hydropathy: the peaks in the β1/β2 loop are generally low, and the binding of Group 4 PH domains to inositol phosphates is less specific. Accordingly, the positively charged β1-β2 loop in all the PH domain structures appears to be the most important segment for the binding of inositol phosphates. The inositol-binding affinities can thus be explained by the length of the β1-β2 loops. The β1-β2 loops of the Btk PH domain (Figure 6(a)) and the Plc-δ PH domain (Figure 6(c)) contain 11 and 9 residues, respectively, consistent with their higher inositol-binding affinities. In comparison, the β1-β2 loops of the Akt PH domain (Figure 6(b)) and the dynamin PH domain (Figure 6(d)) are shorter, and these domains have lower inositol-binding affinities. For PH domains without a known structure, the sequence profiles can pinpoint possible binding sites, guide experiments, and provide an understanding of the sequence-function relationships.
A signature motif for 3-phosphate binding has been suggested [23]; however, it does not distinguish between Group 1 and Group 3 PH domains. With profile analysis, these two groups can be distinguished, since the binding sites of Group 1 PH domains are generally more hydrophilic. Motif information should therefore be combined with profile analysis to predict the binding specificities of PH domains. By mapping the signature motif for 3-phosphate binding, Group 1 and Group 3 PH domains can be distinguished from Groups 2 and 4. Group 1 and Group 3 PH domains are then separated by hydropathy profile analysis. By analyzing the hydropathy and electronic charge concentration, it is possible to identify Group 4 PH domains: since they have less specificity and lower binding affinity, the hydrophilicity of the sequence profile is weaker and the electronic charge concentration of the sequence profile is lower. Figure 7 gives an example of predicting the specificity of the PH domain of expressed sequence tag AA054961, which bears the signature motif for 3-phosphate binding. Since the β1/β2 loop in this PH domain includes a notable hydrophobic peak, it is predicted to belong to Group 3.
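The decision procedure described above can be sketched as a small rule-based function. This is only an illustration under stated assumptions: the inputs (a motif flag and the peak hydropathy and charge concentration of the β1/β2 loop), the two cutoffs, and the name `predict_group` are hypothetical, and the text gives no numerical thresholds. Combining weak hydrophilicity and low charge with a logical AND for Group 4 is likewise an assumption.

```python
def predict_group(has_3p_motif, loop_hydropathy_peak, loop_charge_peak,
                  hydrophobic_cutoff=0.0, charge_cutoff=0.5):
    """Assign a PH domain to functional Group 1-4 from simple profile features.

    has_3p_motif          -- True if the 3-phosphate binding motif is present
    loop_hydropathy_peak  -- peak hydropathy in the beta1/beta2 loop
                             (positive = hydrophobic); hypothetical feature
    loop_charge_peak      -- peak charge concentration in the same loop,
                             arbitrary units; hypothetical feature
    Both cutoffs are placeholder values, not taken from the paper.
    """
    hydrophobic_loop = loop_hydropathy_peak > hydrophobic_cutoff
    if has_3p_motif:
        # Motif-bearing domains are Group 1 or 3; a hydrophobic peak in
        # the loop indicates the less hydrophilic Group 3.
        return 3 if hydrophobic_loop else 1
    # Domains without the motif are Group 2 or 4; Group 4 shows weak
    # hydrophilicity together with low charge concentration.
    low_charge = loop_charge_peak < charge_cutoff
    return 4 if (hydrophobic_loop and low_charge) else 2


if __name__ == "__main__":
    # Made-up feature values for illustration only.
    print(predict_group(True, 1.2, 0.8))    # motif + hydrophobic peak
    print(predict_group(False, -1.5, 0.9))  # no motif, hydrophilic, charged
```

In practice the cutoffs would have to be calibrated against domains of known specificity before the rules are applied to an uncharacterized sequence.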

Conclusions
The different binding affinities and specificities of PH domains for the three inositol phosphates Ins(1,3,4)P3, Ins(1,4,5)P3, and Ins(1,3,4,5)P4 were compared and explained from both the structural and sequence aspects. First, the electrostatic and geometric properties of the three inositol phosphates were calculated by a quantum mechanical method. Since the electronic charge distribution of Ins(1,4,5)P3 is smaller, its interaction with PH domains is generally weak. The phosphate groups of Ins(1,4,5)P3 are on one side of the molecule, and the binding region is more hydrophilic on one side of the bound molecule than it is for Ins(1,3,4)P3. Then, the structure-based profiles of the electrostatic properties and hydropathy of PH domains showed that the hydrophobic environment is essential for binding specificity. These structural results were compared with sequence profiles for the analysis of the binding specificity of PH domains, which also supported the essential role of the hydrophobic environment in binding specificity. The agreement between information from one-dimensional sequence profiles and three-dimensional structures provides a simple but practical method to investigate the sequence-structure relationship of PH domains. The overall flowchart of our research is summarized in Figure 8, which also contains two future directions.
PH domains can also specifically recognize and associate with signalling molecules, such as PTEN and PI3K [8]. This constitutes the basis for the participation of PH domains in a variety of signalling pathways. Therefore, further understanding of the interaction between inositol phosphates and their downstream molecules not only reveals a consistent picture of the PH domain-mediated signalling network but also provides new insights into the mechanism of diseases related to these signalling pathways. Although MD simulation has been used to understand the interaction of the PKB PH domain with inositol phosphates involved in the PI3K pathway [42], this remains a major challenge in the field. In our previous work, the relationship between sequence profiles of binding sites and the effect of disease-causing PH domain mutations was analyzed, and the MD method was used to classify "folding mutations" and "disease-causing mutations" [25]. However, the analysis and discussion of disease-causing variations affecting binding specificities and affinities pose another challenge. For example, the Btk PH domain is the most studied PH domain and contains the highest number of unique disease-causing variations among the human protein kinases. PON-BTK [43] provides a method for analyzing and classifying disease-causing mutations. With these mutation data, it is possible to reveal the basis of XLA by binding analysis. We hope that, in the near future, our method will be applied to PH domains to understand the basis of diseases with respect to signalling pathways involving inositol phosphates and harmful mutations of PH domains.