Identification of a Fungi-Specific Lineage of Protein Kinases Closely Related to Tyrosine Kinases

Tyrosine kinases (TKs) specifically catalyze the phosphorylation of tyrosine residues in proteins and play essential roles in many cellular processes. Although TKs mainly exist in animals, recent studies revealed that some organisms outside the Opisthokont clade also contain TKs. The fungi, as the sister group to animals, are thought to lack TKs. To better understand the origin and evolution of TKs, it is important to investigate whether fungi have TK or TK-related genes. We therefore systematically searched for possible TKs across the fungal kingdom using profile hidden Markov model searches and phylogenetic analyses. Our results confirmed that fungi lack orthologs of animal TKs. We identified a fungi-specific lineage of protein kinases (FslK) that appears to be a sister group closely related to TKs. Sequence analysis revealed that members of the FslK clade contain all the conserved protein kinase sub-domains and thus are likely enzymatically active. However, they lack key amino acid residues that determine TK-specific activities, indicating that they are not true TKs. Phylogenetic analysis indicated that the last common ancestor of fungi may have possessed numerous members of FslK. The ancestral FslK genes were lost in Ascomycota and in the Ustilaginomycotina and Pucciniomycotina of Basidiomycota during evolution. Most of these ancestral genes, however, were retained and expanded in Agaricomycetes. The discovery of the fungi-specific lineage of protein kinases closely related to TKs helps shed light on the origin and evolution of TKs and also has potential implications for the importance of these kinases in mushroom fungi.


Background
Proteins undergo various post-translational modifications such as ribosylation, acetylation, thiolation, and phosphorylation. In eukaryotic organisms, reversible protein phosphorylation achieved by protein kinases (PKs) and phosphatases plays critical roles in the regulation of enzyme activity and intracellular signaling. Most PKs catalyze ATP-dependent phosphorylation of serine (Ser) or threonine (Thr), and some of these, known as dual-specificity kinases, can also phosphorylate tyrosine (Tyr) [1,2,3]. Tyrosine kinases (TKs) are a distinct group that specifically catalyzes the phosphorylation of Tyr residues in proteins. In animals, TKs play essential roles in cell proliferation and differentiation, immune responses, organ development, and other cellular processes [4,5]. Mutations in TK genes have been linked to various human diseases, such as cancer and immune diseases [6,7,8].
The Ser/Thr kinase catalytic domains are highly conserved and can be divided into 11 subdomains [1]. Tyrosine kinases contain highly conserved catalytic domains similar to those in protein Ser/Thr kinases but with unique subdomain motifs. Three motifs in subdomains VI, VIII and XI are highly conserved in TKs but are not found in Ser/Thr kinases [1,9,10]. The high degree of conservation of these motifs can be used to distinguish TKs from Ser/Thr kinases.
To date, most TKs have been found in metazoan species. Previous studies have demonstrated that TK genes underwent duplication and loss during the evolution of metazoans [11]. A further study on dozens of eukaryotic genomes revealed that TKs appeared early in the common ancestor of metazoans and expanded after the divergence of the metazoans, especially after the split of the vertebrate lineage from the Ciona lineage [12]. More recently, TKs were shown to have been established before the divergence of filastereans from the Metazoa and Choanoflagellata clades [13]. Many organisms outside the Opisthokont clade, such as the amoebozoans Acanthamoeba castellanii, Dictyostelium discoideum and Entamoeba histolytica [13], the green alga Chlamydomonas reinhardtii [14], and the oomycete Phytophthora infestans [15], were also found to contain TK or putative TK genes.
Fungi were found to have tyrosine kinase-like kinases (TKLs) [16], a group of kinases that share high sequence similarity with TKs but function mainly as serine-threonine kinases. However, it is generally thought that fungi lack TKs [13,16,17]. Recently, possible TK genes were identified in the basidiomycete Laccaria bicolor using sequence searches [16], but whether these genes are true TKs remains to be determined. To systematically investigate whether TK genes occur in fungi, in this study we searched for possible TKs across the fungal kingdom using profile hidden Markov models (HMMs) [17] and determined their relationships with TKs by phylogenetic analysis. Our results confirmed that fungi lack orthologs of animal TKs. However, they have a specific lineage of protein kinases that is most closely related to TKs. Most of these genes were found in Agaricomycetes of Basidiomycota, and none were found in Ascomycota or in other lineages of Basidiomycota. Members of this lineage are predicted to have enzymatic activity but lack key amino acid residues that determine TK-specific activity. The evolution of members of this lineage was also addressed.

Identification of possible TKs in fungi
To systematically search for possible fungal TKs, we used the HMMER program [18] to search the predicted proteomes of 84 fungi from the phyla Ascomycota, Basidiomycota, Chytridiomycota, and Zygomycota (Figure 1; Table S1) with the multi-level HMM library of protein kinases [17]. Only the fungal sequences designated as TKs (best matches) were selected and then submitted to the Pfam server for kinase domain confirmation. These sequences were further subjected to preliminary phylogenetic analysis with classical TKs and TKLs downloaded from Kinbase (http://kinase.com/kinbase/). We also included some representative fungal sequences classified as TKLs by our HMMER searches. In the resulting phylogenetic tree, fungal sequences identified as TKs were clustered into two distinct clades (Figure S1). One clade (fungal clade 2) clustered with TKLs and was thus excluded from the following analysis. The other clade (fungal clade 1) was most closely related to animal TKs. To identify new sequences belonging to this clade, we built an HMM profile from the 18 sequences of fungal clade 1 and added it to the kinase HMM library for a further search against the fungal proteomes. The close relationships of the newly identified sequences with TKs were also confirmed by phylogenetic analysis. In total, we identified 241 sequences from 14 fungi (Figure 1; Table S2). These sequences formed a distinct clade in the phylogenetic tree. We named this clade the fungi-specific lineage of protein kinases (FslK).
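The best-match selection described above can be sketched in a few lines. This is only an illustration under stated assumptions: the hits are taken to be already parsed into (sequence id, kinase group, score) tuples, the sequence ids and group labels are hypothetical, and treating the score cutoff of 20 given in the methods as inclusive is our assumption rather than something the text specifies.

```python
SCORE_CUTOFF = 20.0  # score cutoff stated in the methods; inclusive comparison is an assumption


def best_group_per_sequence(hits, cutoff=SCORE_CUTOFF):
    """Map each sequence id to its top-scoring (group, score) among hits at or above the cutoff."""
    best = {}
    for seq_id, group, score in hits:
        if score < cutoff:
            continue  # discard hits below the reporting threshold
        if seq_id not in best or score > best[seq_id][1]:
            best[seq_id] = (group, score)
    return best


def select_candidates(hits, target_group="TK", cutoff=SCORE_CUTOFF):
    """Return sorted ids of sequences whose best-scoring group is the target group."""
    best = best_group_per_sequence(hits, cutoff)
    return sorted(seq_id for seq_id, (group, _) in best.items() if group == target_group)


# Hypothetical parsed hits: (sequence id, kinase group of the matched model, score)
hits = [
    ("seqA", "TK", 85.2),
    ("seqA", "TKL", 60.1),
    ("seqB", "TKL", 120.4),
    ("seqB", "TK", 90.0),
    ("seqC", "TK", 15.0),
    ("seqD", "TK", 20.0),
]
candidates = select_candidates(hits)
print(candidates)
```

Here seqB is not selected even though it has a strong TK hit, because its best match is a TKL model; only sequences whose top-scoring model belongs to the TK group are carried forward.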

Phylogenetic position of the FslK
Because organisms beyond animals, including the amoebozoans A. castellanii, D. discoideum and E. histolytica and the oomycete P. infestans, were also found to contain TKs, we performed a comprehensive phylogenetic analysis with selected representative members of FslK to determine their evolutionary relationship with all known TKs, using two independent phylogenetic methodologies: maximum likelihood (ML) and Bayesian inference (BI). The green alga C. reinhardtii was also reported to have TKs [14]. We identified 16 possible TK sequences from the C. reinhardtii genome and included them in our analysis.
In the resulting ML and BI trees (Figure 2), the known TKs, including classic TKs from animals and choanoflagellates and previously reported TKs from the pre-opisthokont species E. histolytica, A. castellanii and P. infestans, formed a well-supported clade (named the TK clade). Three sequences of C. reinhardtii also fell into the TK clade. The FslK clustered with a clade of C. reinhardtii sequences (Cr clade 1), and together they formed a sister group to the TK clade. Fungi are evolutionarily closer to animals than these pre-opisthokont species are. If fungi had orthologs of animal TKs, those orthologs should cluster with them in the TK clade. Instead, the position of the FslK clade suggests that orthologs of animal TKs were lost in fungi.
Since the TK activity of members of Cr clade 1 is unclear, we do not know whether the last common ancestor of the TK clade and Cr clade 1 had TK activity. Therefore, whether the FslK members have TK activity cannot be determined solely from their phylogenetic position.

The members of FslK may have no TK activity
We performed a comparative analysis of TK-unique motifs and of specific residues in the catalytic domain related to TK activity to explore whether FslK members have tyrosine kinase catalytic activity. Three motifs in subdomains VI, VIII and XI are reported to be TK specific [1,9,10]. However, in our analysis the sequence pattern of the motif in subdomain XI [CW(X)6RPXF] was found to be shared by TKs and TKLs (Figure S2) and therefore was excluded from our subsequent analysis. A new motif with sequence pattern [GXR(L/M)] in subdomain X was found to be TK specific (Figure S2) and was used in our analysis. The comparative analysis showed that the sequence patterns of FslK clearly differ from those of TKs (Figure 3). The key amino acid residues 'AARN' in motif 1, which stabilize the relative positions of the substrate-binding site and the catalytic loop of TKs, and the first conserved proline (P) residue in motif 2, which is important for substrate recognition [10,19], were not found in FslK members. In addition, TKs have a leucine (L) or methionine (M) residue in the fourth position of motif 3, while members of FslK have a 'P' residue at the equivalent site. These residues are important for TK activity and are diagnostic for TKs, and their absence in the members of FslK suggests that they have no TK activity. Members of Cr clade 1 do not contain these key residues either, and thus may also have no TK activity.
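The two sequence patterns quoted above translate directly into regular expressions, with X read as any residue. The sketch below is a simplified illustration, not the alignment-based comparison used in the study: it scans unaligned strings, so a chance match elsewhere in a real sequence is possible, and the toy peptide fragments are invented for the example.

```python
import re

# [GXR(L/M)]: G, any residue, R, then L or M in the fourth position
TK_MOTIF = re.compile(r"G.R[LM]")
# [CW(X)6RPXF]: C, W, any six residues, R, P, any residue, F
SHARED_MOTIF = re.compile(r"CW.{6}RP.F")


def has_tk_motif(sequence):
    """True if the sequence contains the G-X-R-(L/M) pattern anywhere."""
    return TK_MOTIF.search(sequence) is not None


def has_shared_motif(sequence):
    """True if the sequence contains the CW-(X)6-RP-X-F pattern anywhere."""
    return SHARED_MOTIF.search(sequence) is not None


# Invented fragments: one with L and one with P in the fourth motif position
tk_like = "AAGSRLPEE"
fslk_like = "AAGSRPPEE"

print(has_tk_motif(tk_like), has_tk_motif(fslk_like))
print(has_shared_motif("KCWAAAAAARPEFT"))
```

A proline in the fourth position fails the first pattern, which mirrors the distinction drawn in the text; in practice the check would be applied to the aligned motif columns rather than to whole sequences.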
TKs, especially metazoan TKs, contain additional domains outside the catalytic domain [12]. We examined additional domains in the FslK members. Unlike the TKs, most of the FslK members have no additional domains; only 16 of the 241 members (Table S3) have additional domains, and these are unlikely to be related to TK activity.
These results, together with the phylogenetic analysis, suggest that TK activity was most likely acquired by the ancestor of the TK clade after it diverged from the last common ancestor of FslK and Cr clade 1.

Distribution and evolution of FslK members
Among the 241 FslK members, only 6 are from Chytridiomycota. All the others are from Agaricomycotina of Basidiomycota. Surprisingly, no FslK sequences were detected in the 62 ascomycetes examined. Within Agaricomycotina, only the mushroom-forming Agaricomycetes contain FslK members, whereas the two Tremellomycetes, Cryptococcus neoformans and Cryptococcus gattii, lack any putative FslK (Figure 1).
Phylogenetic analysis showed that FslK members clustered into multiple sub-clades. Each sub-clade contains sequences from different species. Moreover, two distantly related sub-clades both contain sequences from Basidiomycetes and Chytridiomycetes (Figure 4). These observations suggest that the last common ancestor of fungi possessed numerous paralogous genes from which the sub-clades descended. All these ancestral genes may have been lost in Ascomycetes and also in Ustilaginomycotina and Pucciniomycotina of Basidiomycota. In contrast, Agaricomycetes retained most of these copies. Furthermore, lineage- or species-specific gene duplications (gains) have also occurred in some Agaricomycetes. For example, the wood-decaying fungus Fomitiporia mediterranea and the ectomycorrhizal basidiomycete L. bicolor each contain numerous FslK members, and many of them in each species have high sequence identity and cluster together in the phylogenetic tree (Figure 4), suggesting that recent expansion occurred in these species.

Members of FslK may have important functions in Agaricomycetes
Sequence conservation of proteins is correlated with their functions, and proteins with important molecular functions are more conserved because they are under higher selection pressure than less important ones [20,21]. Alignment of FslK sequences revealed that although some of them are truncated in the kinase catalytic domain, most have complete catalytic domains and are highly conserved in residues required for catalysis, such as those required for ATP and substrate binding (Figure 5). This suggests that members of FslK have protein kinase catalytic activity. On careful examination of the sequences, we found that many of the FslK members with incomplete catalytic domains are truncated because of sequencing gaps or incorrect annotation, and the truncation of the others may be due to sequence degradation caused by functional redundancy after recent gene duplication.
We further investigated the expression of FslK genes by searching for corresponding EST data in the NCBI and JGI databases. Most genes examined were found to be expressed (Table S4), suggesting that these genes are functional in these organisms.
Taken together, the above evidence suggests that the FslK genes likely play important roles in fungi. Considering that multiple ancestral genes of FslK members were retained and further expanded in Agaricomycetes, some members of FslK most likely play important roles in controlling cellular development and differentiation processes specific to mushroom-forming fungi. However, their exact functions need to be experimentally determined.
In summary, we systematically investigated possible TKs in fungi using HMMs and phylogenetic analysis. Our results confirmed that fungi lack orthologs of animal TKs. However, there is a specific lineage of protein kinases in fungi (FslK) that is most closely related to TKs. Kinases of the FslK clade lack key amino acid residues that determine TK-specific activities and therefore may not be true TKs. However, they contain conserved catalytic domains of protein kinases and thus are likely enzymatically active. Phylogenetic analysis revealed that the last common ancestor of fungi possessed several FslK genes. The ancestral FslK genes may have been lost in Ascomycota and also in Ustilaginomycotina and Pucciniomycotina of Basidiomycota during evolution. However, most of these ancestral genes were retained and further expanded in Agaricomycetes, suggesting that the FslK kinases possibly have important functions in controlling cellular processes specific to mushroom fungi. This discovery of the FslK protein kinases closely related to TKs helps shed light on the origin and evolution of TKs and also has potential implications for the importance of these kinases in mushroom fungi.
Catalytic domain sequences of TKs and TKLs from Homo sapiens, Drosophila melanogaster, Caenorhabditis elegans, M. brevicollis, Trichomonas vaginalis, and Tetrahymena thermophila, and of TKLs from Coprinopsis cinerea, were obtained from Kinbase (the kinase database, http://kinase.com/kinbase), which holds the currently accepted classification of eukaryotic kinases [23]. The catalytic domains of putative TKs of Entamoeba histolytica were obtained from the Kinomer database. TKs of A. castellanii and of the three oomycetes P. infestans, P. sojae, and P. ramorum identified in previous studies [13,15] were retrieved from GenBank.

Identification of possible TKs in fungi
The Hmmscan program in the HMMER 3.0 package [18] was employed to search the multi-level HMM library of protein kinases with each fungal proteome as queries using score of 20 as the cutoff. Only the fungal sequences designated as TKs (best matches) were selected and deposited into the Pfam server to confirm if they are kinases. These sequences were further subject to the preliminary phylogenetic analysis with classic TKs and TKLs to determine if they are closely related to TKs.
Figure 2. Phylogenetic position of the FslK. Phylogenetic trees were calculated using Maximum-likelihood (ML) and Bayesian inference (BI) methods, respectively. Both methodologies gave similar tree topology. The tree presented here is the BI tree. Numbers on major branches indicate SH-like approximate likelihood ratio test (SH-aLRT) probabilities/Bayesian posterior probabilities. Branches with Bayesian posterior probability less than 0.5 have been collapsed. The simple cladogram of eukaryotic groups on the top right corner was drawn according to the tree of life (http://tolweb.org/tree/). Ac, Acanthamoeba castellanii; At, Arabidopsis thaliana; Ce, Caenorhabditis elegans; Cr, Chlamydomonas reinhardtii; Dd, Dictyostelium discoideum; Dm, Drosophila melanogaster; Eh, Entamoeba histolytica; Hs, Homo sapiens; Mb, Monosiga brevicollis; Mm, Mus musculus; Pi, Phytophthora infestans; Pr, Phytophthora ramorum; Ps, Phytophthora sojae; Su, Sea Urchin; Tv, Trichomonas vaginalis. For abbreviations of fungi see Table S1. doi:10.1371/journal.pone.0089813.g002
… 6 RPXF in gray was found to be not specific to TKs in this study (Figure S2)

Sequence alignment and phylogenetic analysis
Multiple sequence alignments were performed with the PSI-Coffee program [24]. Alignments used for phylogenetic analysis were trimmed by trimAl [25] in gappyout mode. Some sequences that were truncated because of incorrect gene prediction were manually revised.
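To make the trimming step concrete, the sketch below removes alignment columns that are mostly gaps. It is a deliberately simplified stand-in: it applies a fixed gap-fraction threshold, whereas the gappyout mode of the trimming program chooses its cutoff automatically from the gap distribution of the alignment. The function name, the 0.5 threshold, and the toy alignment are all assumptions made for illustration.

```python
def trim_gappy_columns(alignment, max_gap_fraction=0.5):
    """Drop columns whose fraction of '-' characters exceeds max_gap_fraction.

    alignment: list of equal-length aligned sequences (strings).
    Returns a new list of trimmed sequences in the same order.
    """
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(row) != length for row in alignment):
        raise ValueError("all aligned sequences must have the same length")
    n_rows = len(alignment)
    keep = [
        col
        for col in range(length)
        if sum(row[col] == "-" for row in alignment) / n_rows <= max_gap_fraction
    ]
    return ["".join(row[col] for col in keep) for row in alignment]


# Toy alignment: column 3 is all gaps, column 6 is two-thirds gaps
toy = ["MK-LV-", "MR-LVA", "MK-IV-"]
trimmed = trim_gappy_columns(toy)
print(trimmed)
```

A fixed threshold is easy to reason about but needs tuning per data set, which is one reason an automated mode is attractive for alignments of many divergent kinase domains.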
Phylogenetic trees were constructed with two independent methods: maximum likelihood (ML) and Bayesian inference (BI). The ML trees were constructed with PhyML 3.1 [26] using the best-fit model LG+G selected by ProtTest3 [27], with the SPR algorithm and 16 categories of γ-distributed substitution rates. The reliability of internal branches was evaluated with SH-aLRT supports. The BI tree was constructed with MrBayes-3.2 [28] using mixed models of amino acid substitution with 16 categories of γ-distributed substitution rates, performing two runs of four Markov chain Monte Carlo (MCMC) chains each, sampling every 1000th generation over 1.1 × 10^6 generations after a burn-in of 101 samples.
… Table S1. doi:10.1371/journal.pone.0089813.g004
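The sampling schedule of the Bayesian analysis implies a specific number of retained samples per run, which a short calculation makes explicit. The sketch reads the generation count as 1.1 × 10^6 and assumes that the starting state at generation 0 is recorded as a sample; the function name is hypothetical.

```python
def mcmc_sample_counts(generations, sample_every, burn_in_samples, include_start=True):
    """Return (total samples, samples kept after burn-in) for one MCMC run."""
    total = generations // sample_every + (1 if include_start else 0)
    if burn_in_samples > total:
        raise ValueError("burn-in exceeds the number of samples")
    return total, total - burn_in_samples


# 1.1e6 generations, one sample every 1000 generations, burn-in of 101 samples
total, kept = mcmc_sample_counts(1_100_000, 1000, 101)
print(total, kept)
```

Under these assumptions each run records 1101 samples and keeps 1000 after burn-in, so the burn-in of 101 samples corresponds to discarding the first 100,000 generations plus the starting state.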

Examination of functional domains
Conserved protein domains were searched in the Pfam database [29] at Sanger and the CDD database at NCBI (http://www.ncbi.nlm.nih.gov/cdd). Sequence logos were generated by WebLogo (http://weblogo.berkeley.edu/) [30].
Figure 5. Multiple sequence alignments of representative members of FslK. The consensus sequences of Protein kinase domain (Pkinase, pfam00069) and Catalytic domain of Protein Tyrosine Kinases (PTKc, cd00192) were used as references. Eleven sub-domains of FslK catalytic domains were shown. Conserved amino acid residues related to crystal structure and catalytic function [1,9,10] in protein kinases were indicated below. The default color scheme for ClustalW alignment in the Jalview program was used. doi:10.1371/journal.pone.0089813.g005

Supporting Information
Figure S1 Phylogenetic analysis of fungal sequences classified as TKs with those of classical TKs and TKLs. The phylogenetic tree was built with the kinase domain sequences using ML methodologies with SPRs algorithms and 16 categories of γ-distributed substitution rates. The reliability of internal branches was evaluated based on SH-aLRT supports. The base tree was drawn using Interactive Tree Of Life Version 2.2.2 (http://itol.embl.de/#). The p-values of approximate likelihood ratios (SH-aLRT) are plotted as circle marks on the branches (only p-values > 0.5 are indicated) and circle size is proportional to the p-values. Sequences in fungal clade 1 and clade 2 were designated as TKs by multi-level HMM library of protein kinases. Other fungal sequences were designated as TKLs. Abbreviated species names are as follows: At, Arabidopsis thaliana; Ce, Caenorhabditis elegans; Cr, Chlamydomonas reinhardtii; Dd, Dictyostelium discoideum; Dm, Drosophila melanogaster; Eh, Entamoeba histolytica; Hs, Homo sapiens; Mb, Monosiga brevicollis; Mm, Mus musculus; Ot, Ostreococcus tauri; Ol, Ostreococcus lucimarinus; Su, Sea Urchin; Tt, Tetrahymena thermophila; Tv, Trichomonas vaginalis; Vv, Vitis vinifera. For abbreviations of fungi see Table S1. (PDF)
Figure S2 The identification of a new TK-specific motif in sub-domain X. Asterisks indicate the newly identified TK-specific motif [GXR(M/L)]. The previous reported motif [CW(X)6RPXF] shaded in gray is common in TKs and TKLs. (PDF)
Table S1 Information of fungal species used in this study. (XLSX)