Classification of Nonenzymatic Homologues of Protein Kinases

Protein Kinase-Like Non-kinases (PKLNKs) are closely related to protein kinases but lack the crucial catalytic aspartate in the catalytic loop and hence cannot function as protein kinases. Using various sensitive sequence analysis methods, we have recognized 82 PKLNKs from four higher eukaryotic organisms, namely, Homo sapiens, Mus musculus, Rattus norvegicus, and Drosophila melanogaster. On the basis of their domain combination and function, PKLNKs have been classified mainly into four categories: (1) ligand-binding PKLNKs, (2) PKLNKs with an extracellular protein-protein interaction domain, (3) PKLNKs involved in dimerization, and (4) PKLNKs with a cytoplasmic protein-protein interaction module. While members of the first two classes have a transmembrane domain tethered to the PKLNK domain, members of the other two classes are cytoplasmic. The current classification scheme is intended to provide a convenient framework for classifying PKLNKs from other eukaryotes, which would be helpful in deciphering their roles in cellular processes.


Introduction
It is now well known that enzymes, in their role as biocatalysts, are the most important control points in living organisms, and the catalytic residues of an enzyme are key to its molecular function. Bartlett and colleagues [1] have described pairs of active and inactive enzyme homologues that have the same structural scaffold but different functions. Catalytically inactive enzyme homologues are represented in a large variety of enzyme families, with families of signaling enzymes having a high number of enzymatically inactive members [2].
Phosphorylation by Ser/Thr/Tyr protein kinases plays a crucial role in cellular signal transduction. A canonical kinase domain consists of 12 subdomains containing a few conserved residues of functional importance. Subdomains I, II, VIB, and VII are considered to be the most important ones. Subdomain I includes a β-turn structure with 2 or 3 glycine residues (G-X-G-X-X-G), while subdomain II comprises an invariant lysine that participates in anchoring and orienting the ATP (adenosine triphosphate). Subdomain VIB contains the catalytic loop with a key aspartate [D] residue that mediates the transfer of a phosphate group from ATP to the appropriate substrate. The D residue of the DFG motif in subdomain VII ligates Mg2+, which in turn interacts with the β and γ phosphates of ATP [3]. The roles of these residues in protein kinases are well established. The catalytic residues of protein kinases are usually highly conserved to maintain their ability for efficient cellular signal transduction. However, there have been a few reports of proteins with substitutions or deletions at essential catalytic sites. Among the functionally important residues in a Ser/Thr/Tyr kinase, the aspartate residue in subdomain VIB, which acts as the catalytic base, seems to be the most important, as we are not aware of a properly functional kinase which lacks this residue.
Although the importance of protein kinases has long been recognized, studies on protein kinase homologues lacking one or more catalytic residues are more recent. Several studies on the repertoire of kinases in various organisms have revealed the presence of enzymatically inactive homologues of protein kinases [4][5][6] which lack catalytic function and instead serve as scaffolds or kinase substrates. Boudeau and colleagues have discussed the roles of human kinase-like proteins in regulating diverse cellular processes [7]. Despite considerable sequence similarity to enzymatically active protein kinases, Protein Kinase-Like Nonkinase (PKLNK, also referred to as Kinase Homology Domain, KHD, in some earlier publications) domains lacking key residues are thought to have regulatory roles. Examples of proteins containing such domains, which lack the catalytic base aspartate, are Janus Kinase (JAK), in which a PKLNK domain is tethered to a tyrosine kinase domain, and membrane guanylyl cyclases (or particulate guanylyl cyclases), in which a regulatory PKLNK domain is situated N-terminal to the guanylyl cyclase domain [8][9][10]. The PKLNK domain in JAK is tethered to a functional kinase domain; in guanylyl cyclases (GC), however, a functional kinase domain is absent, and the PKLNK is tethered to a cyclase domain.
The PKLNK domain of guanylyl cyclase-A serves as an important mediator in transducing ligand-induced signals to activate the catalytic cyclase domain of the receptor. Deletion of the PKLNK domain from GC-A, -B, and -C resulted in constitutive activation of these enzymes [11,12], and the PKLNK domain has been shown to act as a repressor of the catalytic domain in the basal state [13]. The PKLNK of guanylyl cyclase-A (natriuretic peptide receptor A) is more closely related to protein tyrosine kinases than to protein serine/threonine kinases [11,12,14]. The PKLNK in receptor guanylyl cyclases provides a critical structural link between the extracellular domain and the catalytic domain in regulating the activity of this family of receptors. Modeling of the PKLNK of human GC-C indicates that it can adopt a structure similar to that of tyrosine kinases [15]. There are many other protein kinase-like domains which lack other catalytically important residues yet play important roles as regulatory proteins, for example, "dead" RTK-ErbB3 [16], OTK (Off Track Kinase), WNK (with no lysine [K]) [17], and so forth. Recently, the crystal structure of the first PKLNK, VRK3 (a member of the vaccinia-related kinase family), which lacks the aspartate in the catalytic loop, has been reported [18]; it revealed that VRK3 cannot bind ATP because of residue substitutions in the binding pocket compared with ATP-binding homologues. However, VRK3 still shares prominent structural similarity with enzymatically active protein kinases.
In the past, our group has reported the presence of ABC1, RIO1, and kinases in archaea and bacteria that share significant similarity with the Ser/Thr/Tyr kinase family [19]. The sequences of these protein kinases were examined for the presence of the catalytic aspartate in the catalytic loop. Sixteen prokaryotes have been predicted to have at least one member lacking the catalytic aspartate, and the total number of such sequences is 23. This indicates that PKLNKs evolved well before the divergence of prokaryotes and eukaryotes.
In the current work, we present a detailed analysis of the PKLNKs from four completely sequenced higher eukaryotes, namely, Homo sapiens, Mus musculus, Rattus norvegicus, and Drosophila melanogaster. An attempt has been made to classify these PKLNKs based upon their amino acid sequences and domain tethering preferences in order to understand the molecular basis of the evolution and functions of these proteins.

Materials and Methods
In order to identify the repertoire of PKLNKs in various eukaryotic organisms, a PSI-BLAST [20] search was performed using traditional protein kinases as queries against the Nonredundant Data Base (NRDB), a database of protein amino acid sequences maintained at NCBI, USA. Hits were analyzed for the absence of the catalytic base aspartate in the catalytic loop. In the PKLNKs lacking the catalytic aspartate, we further looked for the presence of other key residues such as the glycines of the glycine-rich motif in subdomain I, the lysine in subdomain II, the glutamic acid in subdomain III, and the DFG motif in subdomain VII. However, we have considered all the kinase-like sequences lacking the catalytic base (Asp) residue for the present analysis, irrespective of these other residues. The subfamily classification of these PKLNKs and the recognition of other domains in the multidomain proteins containing a PKLNK domain have been made using the procedures and protocols developed earlier in our group in connection with the analysis of kinomes [4,6,21]. We have essentially employed multiple sensitive search and analysis methods, namely PSI-BLAST [20], RPS-BLAST [22], and HMMer [23], the last of which matches sequences to Hidden Markov Models (HMMs) of various families in Pfam (release 23) [24], to identify the various domains in the multidomain sequences. PSI-BLAST has been used to detect homologues of noncatalytic kinase-like domains using an E-value cut-off of 0.0001, which has been previously benchmarked [25]. Hits lacking significant sequence similarity with the query have been further examined manually.
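The two screening criteria described above (an E-value at or below the cut-off, and no aspartate at the catalytic base position) can be illustrated with a minimal sketch. It assumes that each hit is already available as a pairwise alignment against a reference kinase whose catalytic Asp position is known; the function names, the alignment representation, and the short loop fragments in the usage note are hypothetical illustrations and not the pipeline used in this study.

```python
from typing import Optional

E_VALUE_CUTOFF = 1e-4  # cut-off quoted in the text


def passes_evalue(e_value: float, cutoff: float = E_VALUE_CUTOFF) -> bool:
    """Keep a hit only if its E-value is at or below the cut-off."""
    return e_value <= cutoff


def residue_at_reference_position(ref_aligned: str, hit_aligned: str,
                                  ref_pos: int) -> Optional[str]:
    """Return the hit residue aligned to 1-based position ref_pos of the
    ungapped reference. Returns None if the hit has a gap in that column
    or the alignment does not cover the position."""
    if len(ref_aligned) != len(hit_aligned):
        raise ValueError("aligned sequences must have equal length")
    seen = 0  # number of reference residues (non-gap columns) passed so far
    for ref_res, hit_res in zip(ref_aligned, hit_aligned):
        if ref_res != "-":
            seen += 1
            if seen == ref_pos:
                return None if hit_res == "-" else hit_res
    return None


def lacks_catalytic_asp(ref_aligned: str, hit_aligned: str,
                        ref_asp_pos: int) -> bool:
    """True if the hit has anything other than D (including a gap) in the
    column that holds the catalytic base Asp of the reference kinase."""
    return residue_at_reference_position(ref_aligned, hit_aligned, ref_asp_pos) != "D"
```

For example, with an illustrative reference loop fragment `HRDLKPEN` (Asp at position 3), `lacks_catalytic_asp("HRDLKPEN", "HRNLKPEN", 3)` flags the hit as a PKLNK candidate, whereas a hit that retains the aspartate is not flagged. A gap at the catalytic position is treated as absence of the aspartate, which is a choice made for this sketch.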
In the current analysis, we have identified a total of 82 PKLNKs in the four organisms. The CD-HIT program [26,27] was used to eliminate redundant sequences, defined here as sequences with 100% sequence identity, so the data set is devoid of redundant sequences.
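The effect of the redundancy filter can be sketched as follows. This is not the CD-HIT algorithm: it is a simplified stand-in that removes only exact full-length duplicates, whereas CD-HIT clusters by alignment and may also merge a shorter sequence contained within a longer one. The function name and the (identifier, sequence) record format are assumptions made for illustration.

```python
from typing import Iterable, List, Tuple

Record = Tuple[str, str]  # (identifier, amino acid sequence); hypothetical format


def remove_exact_duplicates(records: Iterable[Record]) -> List[Record]:
    """Keep the first record for every distinct sequence (case-insensitive).

    Simplified stand-in for a 100% identity filter: only identical
    full-length sequences are treated as redundant.
    """
    seen = set()
    unique = []
    for identifier, sequence in records:
        key = sequence.upper()
        if key not in seen:
            seen.add(key)
            unique.append((identifier, sequence))
    return unique
```

Keeping the first occurrence preserves the input order, so the surviving identifier for a group of identical sequences is simply the one listed first.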
CLUSTALW [28] has been used to align the nonenzymatic domains of the 82 PKLNKs (see Table 1 in Supplementary Material available online at doi: 10.1155/2009/365637). Further, the catalytic domains of protein kinases and the PKLNK domains from mouse have been aligned, and MEGA [29] was used to generate phylogenetic dendrograms.
Domain assignment for the regions outside the noncatalytic kinase domain of these PKLNKs has been carried out using HMMer search methods, by querying each PKLNK against the HMMs of the 10340 protein families available in the Pfam database (http://pfam.sanger.ac.uk/). The MulPSSM (Multiple PSSM) [30] approach was then used to assign domains to the regions left unassigned by the HMMer approach. Transmembrane regions were detected using TMHMM [31].
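Once domain and transmembrane assignments are available, deciding what is tethered N-terminal or C-terminal to the PKLNK domain amounts to ordering the assigned regions along the sequence. The sketch below assumes the assignments have been collected as (name, start, end) tuples; this tuple format, the function names, the label `PKLNK`, and the coordinates in the usage note are hypothetical and are not output formats of HMMer, MulPSSM, or TMHMM.

```python
from typing import List, Tuple

Hit = Tuple[str, int, int]  # (domain name, start, end), 1-based inclusive; hypothetical


def domain_architecture(hits: List[Hit]) -> List[str]:
    """Return domain names ordered from N- to C-terminus by start coordinate."""
    return [name for name, _start, _end in sorted(hits, key=lambda h: (h[1], h[2]))]


def n_terminal_to(hits: List[Hit], domain: str, anchor: str = "PKLNK") -> bool:
    """True if some `domain` region ends before the first `anchor` region starts."""
    anchor_starts = [start for name, start, _end in hits if name == anchor]
    if not anchor_starts:
        return False
    first_anchor = min(anchor_starts)
    return any(name == domain and end < first_anchor for name, _start, end in hits)
```

With made-up coordinates such as `[("Guanylate_cyc", 850, 1040), ("PKLNK", 530, 800), ("ANF_receptor", 50, 400), ("TM", 460, 482)]`, `domain_architecture` returns the receptor-like order ANF receptor, transmembrane region, PKLNK, guanylate cyclase, and `n_terminal_to(hits, "ANF_receptor")` is true.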

Results and Discussion
In the current analysis, we have identified 82 PKLNKs. The main criteria used to detect these PKLNKs are an acceptable E-value against protein kinases and the absence of the catalytic base residue (Asp). There are 31 PKLNKs identified in Homo sapiens (Table 1), 18 in Drosophila melanogaster (Table 2), 13 in Rattus norvegicus (Table 3), and 20 in Mus musculus (Table 4). Although the catalytic Asp is absent in these sequences, we looked for the presence or absence of other key residues, characteristic of functional protein kinases, in the 82 identified PKLNKs. The glycine-rich loop in subdomain I (consensus sequence G-X-G-X-X-G) contains at least two glycine residues in 26 gene products (see Supplementary Table 1). Phosphorylation of the activation segment is required for the activation of most protein kinases that contain an arginine (R) preceding the catalytic base aspartate. We have essentially looked for the H-R-X motif (where X can be any residue but cannot be D) in [...] (Table 1).
[Table 5: List of PKLNKs analysed, with the number of residues and the nearest protein kinase subfamily for each.]
Though these PKLNKs lack the crucial aspartate in the catalytic loop, they are closely related to functional protein kinases in terms of sequence similarity. Table 5 provides information on the closest protein kinase subfamily for each PKLNK. Many of the PKLNKs are closely related to the tyrosine kinase or tyrosine kinase-like group. Further, a phylogenetic tree has been constructed from the mouse PKLNK domains and the catalytic domains of the mouse protein kinase subfamilies to which these PKLNKs are most closely related (Figure 1). Most of the mouse PKLNKs group with the protein kinase subfamilies to which they are most closely related.
This information provides a hint about the nearest evolutionary relationship between PKLNKs and protein kinases. However, two mouse PKLNKs do not group with their closest kinase subfamilies (Figure 1): one (gi|6005792) is closely related to the tyrosine kinase-like group, and the other (gi|158635954) is not closely related to any of the known protein kinase subfamilies. This suggests that these two PKLNKs are evolutionarily quite diverged. [...] (Table 6). The Pfam domains tethered to the PKLNK domain and their frequency of occurrence are listed in Table 7.
As can be seen in Table 7, the most commonly tethered domains are the ANF receptor, transmembrane, and guanylate cyclase domains. Interestingly, in most cases all three domains are present in the same polypeptide. Some domain families that occur in repeats, such as the immunoglobulin I-set domain and the HEAT domain, which are mainly involved in cell-cell recognition and protein-protein interactions, respectively, have also been found tethered to the PKLNK domain. Prediction of transmembrane domains has revealed receptor PKLNKs, most of which have a single-pass transmembrane region. Interestingly, a Drosophila protein (gi|21626698) has two PKLNK domains together with many I-set (immunoglobulin) repeats and fn3 domains, an arrangement observed here for the first time (Figure 2(a)) and not seen in any functional protein kinase. Our study has revealed that these two PKLNK domains are closely related to the myosin light chain kinase subfamily of the calcium/calmodulin-dependent kinase group. A few PKLNKs are closely related to the receptor guanylate cyclase family of protein kinases, which is characterized by an extracellular ANF receptor domain. Interestingly, some of these PKLNKs have no extracellular domain predicted at the N-terminus (Figure 2(b)), suggesting an evolutionary paradigm. Based upon their broad function, the domains tethered to PKLNKs can be grouped into four categories:
(1) Domains which are mainly involved in ligand binding, like the ANF receptor, Receptor L domain, and Ephrin receptor ligand binding domain.
There are 17 gene products which have a domain architecture similar to that of the ANP receptor [12], in which the ANF receptor domain is followed by the PKLNK domain, which is in turn followed by the guanylate cyclase domain. The ANF receptor is an extracellular ligand-binding domain found in a wide range of receptors [33]. Guanylate cyclase catalyses the formation of cyclic GMP (cGMP) from GTP; cGMP acts as an intracellular messenger and regulates various cellular processes such as smooth muscle relaxation, retinal phototransduction, regulation of ion channels, and so forth [34,35]. The ephrin receptor ligand-binding domain (EPH lbd), which binds ephrin, is found in the Eph receptors, a large family of receptor tyrosine kinases. Biochemical studies suggest that multimerization of the EPH lbd modulates the cellular response and acts on the actin cytoskeleton [36].
(2) Domains which are extracellular and involved in protein-protein interactions like I-set (Immunoglobulin like domain) and Fn3 (Fibronectin type III) domains.
(3) Domains involved in dimerization like Death domain, SAM (Sterile Alpha Motif) domain, and Furin-like domain.
Proteins containing death domains are well known to participate in the signaling events which regulate apoptosis [37], indicating a role for PKLNKs in apoptosis. Proteins containing SAM domains undergo homo- and hetero-oligomerization with other SAM domains and are involved in various developmental processes [38]. The Furin-like domain is found tethered to receptor tyrosine kinases. It is rich in cysteine and is involved in receptor aggregation.
(4) Domains involved in protein-protein interactions like Ank (Ankyrin repeats) and Heat repeats.
Ank is one of the most common protein-protein interaction modules and occurs in a large number of functionally diverse proteins. PKLNKs containing Ank repeats are likely to play roles in diverse functions such as signal transduction, ion transport, transcription initiation, and so forth. The Heat domain is a tandemly repeated domain of 30-40 amino acids. PKLNKs containing a Heat domain might have a role in intracellular transport processes. Apart from the domains discussed above, some further accessory domains are found tethered to the PKLNK domain and provide functional diversity to the PKLNKs. A human PKLNK (gi|18676872) has a TBC domain and a Rhodanese domain at the C-terminus. The TBC domain is involved in GTPase signaling, and the Rhodanese domain, which shares an evolutionary relationship with a large family of proteins, is involved in cyanide detoxification [39]. Another human PKLNK (gi|17368698) has a TUDOR domain N-terminal to the PKLNK domain, which indicates a role in RNA binding [40]; this domain has so far not been seen tethered to a protein kinase (Figure 2(c)).
A Drosophila PKLNK (gi|17368346) has WSC domains N-terminal to the PKLNK domain; these are likely to be extracellular carbohydrate-binding domains. At least three PKLNKs (gi|20869393, gi|21627748, gi|7020363) have a PX domain N-terminal to the PKLNK domain, which might have a role in lipid signaling.
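The four-way classification by tethered domain can be expressed as a simple lookup, sketched below. The category names and member domains are taken from the list above; the function name, the exact domain-name strings, and the decision to return every matching category (rather than a single one) are assumptions made for this illustration.

```python
from typing import Iterable, List

# Domains named in the text for each of the four categories.
CATEGORY_DOMAINS = {
    "ligand binding": {
        "ANF receptor", "Receptor L domain",
        "Ephrin receptor ligand binding domain",
    },
    "extracellular protein-protein interaction": {"I-set", "Fn3"},
    "dimerization": {"Death", "SAM", "Furin-like"},
    "cytoplasmic protein-protein interaction": {"Ank", "HEAT"},
}


def classify_pklnk(tethered_domains: Iterable[str]) -> List[str]:
    """Return every category that has at least one of its domains tethered.

    An empty list means only accessory or unlisted domains were found.
    """
    found = set(tethered_domains)
    return [category for category, domains in CATEGORY_DOMAINS.items()
            if found & domains]
```

For instance, `classify_pklnk(["ANF receptor", "Guanylate cyclase"])` gives `["ligand binding"]`, while a PKLNK carrying only an accessory domain such as TUDOR falls outside the four categories and yields an empty list.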
A phylogenetic tree has been generated from the nonenzymatic PKLNK domains of these 82 PKLNKs (Figure 3), in which, interestingly, we have observed some clusters with similar domain organization. Some of the frequently found tethered domains have been represented in [...] (Figure 3).
[Table 9: List of PKLNKs which interact with a large number of proteins. The in vivo/in vitro protein-protein interaction data were obtained from the HPRD database [32].]
There are 13 PKLNK sequences which have an SH2 (Src homology 2) domain at the N-terminus (represented in red in Figure 3). The SH2 domain functions as a regulatory module of intracellular signaling cascades by interacting with phosphopeptides. All of these SH2-containing PKLNKs except one (gi|2288925) have a protein kinase domain tethered at the C-terminus. These 12 protein kinase domains are close homologues of the protein tyrosine kinase 7 subfamily. To the best of our knowledge, this kind of domain architecture, in which an SH2 domain is followed by a PKLNK domain which is in turn followed by a protein kinase domain, has not been reported elsewhere. However, JAK1 (Janus kinase 1) has a very similar domain combination in which, apart from these three domains, a FERM domain, which is involved in binding to cytokine receptors [32,41], is present at the N-terminus [42]. The biological function of these PKLNKs might lie in the regulation of tyrosine protein kinase activity.
Further, we have compared the domain structures of PKLNKs and their closest protein kinase subfamilies. Interestingly, we have observed that a few domain combinations are unique to either PKLNKs or protein kinases. A few Pfam domains, such as TUDOR and HNOBA (Heme NO binding associated), have not been seen tethered to protein kinase domains so far. The HNOBA domain is known to function as a heme-dependent sensor for gaseous ligands and to transduce diverse downstream signals across diverse organisms [43]. The domain structures shared by PKLNKs and protein kinases have also been studied (Table 8).

Protein-Protein Interaction of Human PKLNKs.
Understanding the biological roles of proteins in the cellular environment is the main aim of genome analysis. Protein-protein interactions are of central importance for almost all cellular processes in a living cell. In the current section, we have focused on human PKLNKs. We have looked for the protein-protein interactions of PKLNKs using the HPRD database (http://www.hprd.org/) [44]. At least 9 human PKLNKs are shown to interact with various other proteins (Table 9), and most of these partners are signaling proteins and adapter proteins which modulate cell signaling and play critical roles in cell polarization, differentiation, cell adhesion, neuronal cell development, apoptosis, homeostasis, and so forth. Four of these nine PKLNKs, which are closely related to the receptor guanylate cyclase (RGC) family of protein kinases, are reported to interact mainly with natriuretic peptide and guanylate cyclase. The protein-protein interaction information obtained from HPRD emphasizes the role of PKLNKs in signaling.

Conclusions
This work represents a functional analysis of noncatalytic PKLNKs across a data set of 82 PKLNKs from four higher eukaryotes. Our analysis has indicated that the existence of noncatalytic PKLNKs is quite common. The fact that noncatalytic PKLNKs are well conserved between Homo sapiens, Mus musculus, Rattus norvegicus, and Drosophila melanogaster strongly argues against their being pseudogenes, as otherwise they would have been lost over evolutionary time. Our study suggests that most noncatalytic PKLNKs are derived from active protein kinase ancestors and have lost one or more of the critical catalytic residues within the active site, which provides new insight into nature's way of eliciting new functions from kinase-like domains. Based upon domain tethering preferences, we have classified PKLNKs into four main classes. Members of two classes are receptor PKLNKs, which are mainly involved in extracellular ligand binding and protein-protein interaction, while members of the other two classes are cytoplasmic and are mainly involved in dimerization and protein-protein interaction in the cytoplasm. The phylogenetic analysis reveals function-based clustering of these PKLNKs. Conservation of some of the modular organization across the four organisms suggests a central role in eukaryotic signaling pathways. Since many of the PKLNKs have other domains tethered to them and are involved in protein-protein interactions, one can speculate that, though the kinase-like domain is nonenzymatic, it might have a role in regulation and scaffolding. Some of these catalytically inactive PKLNKs, which are close homologues of receptor tyrosine kinases, are shown to be overexpressed in cancer cells. Additional studies are required to determine the precise function and role of these PKLNKs in tumorigenesis and their usefulness in the diagnosis of tumors.
The domain organization of these PKLNKs revealed that some of them have new and hence unique domain organizations so far not seen in any other family of gene products. 3D structure determination and biochemical analysis can further explore the functional roles of these PKLNKs. The presence of putative PKLNKs in higher eukaryotes indicates that we have more to learn about cellular signaling involving these noncatalytic domains. The evolutionary history of these PKLNKs would be of particular interest. It is hoped that this analysis will provide a better understanding of the frequent occurrence of PKLNKs in different organisms and hence of their function.