Genome-Wide Identification of Mitogen-Activated Protein Kinase Gene Family across Fungal Lineage Shows Presence of Novel and Diverse Activation Loop Motifs

The mitogen-activated protein kinase (MAPK) is characterized by the presence of the T-E-Y, T-D-Y, and T-G-Y motifs in its activation loop region and plays a significant role in regulating diverse cellular responses in eukaryotic organisms. The availability of large-scale genome data in the fungal kingdom encouraged us to identify and analyse the MAPK gene family of 173 fungal species. The analysis resulted in the discovery of several novel activation loop motifs (T-T-Y, T-I-Y, T-N-Y, T-H-Y, T-S-Y, K-G-Y, T-Q-Y, S-E-Y and S-D-Y) in fungal MAPKs. The phylogenetic analysis suggests that fungal MAPKs are non-polymorphic, evolved from their common ancestors around 1500 million years ago, and are distantly related to plant MAPKs. We are the first to report the presence of nine novel activation loop motifs in fungal MAPKs. The specificity of the activation loop motif plays a significant role in controlling different growth- and stress-related pathways in fungi. Hence, the presence of these nine novel activation loop motifs in fungi is of special interest.


Introduction
Among the several signal transduction pathways present in a eukaryotic system, the mitogen-activated protein kinase (MAPK) pathway is one of the most important [1][2][3][4]. It is considered one of the evolutionarily conserved signal transduction pathways that transduce an extracellular signal to the nucleus and maintain proper adjustment of cellular responses [2]. The signaling pathways consist of myriad cascades that are induced in response to environmental cues. Recent studies of MAPKs show that this pathway is involved in diverse cellular responses [5][6][7]. The MAPK cascade consists of a three-kinase signaling module: MAP kinase kinase kinases (MAP3Ks), MAP kinase kinases (MAP2Ks), and MAP kinases (MAPKs) [1,2,8], which are connected to each other by sequential phosphorylation events [1,5]. In response to extracellular stimuli, signalling molecules activate upstream MAP kinase kinase kinase kinases (MAP4Ks) or MAP kinase kinase kinases (MAP3Ks), which act as adaptor-signaling molecules [8] and phosphorylate downstream MAP2Ks at the S/T-X3-5-S/T motif of the activation loop region [9]. Subsequently, MAP2Ks phosphorylate MAPKs at the conserved T-x-Y (T-E-Y/T-D-Y) motif in the activation loop region [1,10]. Upon activation, the MAPKs are able to phosphorylate a large number of downstream targets, including other kinases and transcription factors that regulate growth, development and stress responses. The MAPK pathway module present in fungi is activated by Ste20, a member of the p21-activated protein kinase (PAK) family. The p21 protein is the monomeric Ras-related GTPase Cdc42, which activates Ste20 [11]. Upon activation by Cdc42, Ste20 modulates phosphorylation of Ste11 [11]. Ste20 thus acts as a kinase upstream of the MAPKKK in the MAPK module [12]. Ste11 then phosphorylates the downstream protein Ste7 (MAPKK), which in turn phosphorylates Fus3 (MAPK). A scaffold protein, Ste5, binds all three kinases of the module together [11,12].
The MAPKs contain the characteristic T-E-Y, T-D-Y, T-P-Y or T-G-Y motif in the activation loop region [4][13][14][15][16]. Reports suggest that the activation loop motifs T-E-Y and T-D-Y are common to plants and animals, whereas the T-E-Y and T-G-Y motifs are unique to animals and fungi [13]. The MAPK motif specificity in fungi plays distinct roles in maintaining different pathways. Fus3 (T-E-Y) mediates the cellular response to peptide pheromone, Kss1 (T-E-Y) helps in adjustment to nutrient-limiting conditions, and Hog1 (T-G-Y) controls the response to hyperosmotic conditions in fungi [11]. Phosphorylation of both the threonine and tyrosine residues in the T-x-Y motif of a MAPK is important for locking the kinase domain in a catalytically competent conformation. The phosphorylation of tyrosine is usually followed by phosphorylation of threonine, although phosphorylation of either residue can occur in the absence of the other [17]. This sequential activation of the MAPK cascade controls different aspects of cellular activity and regulates proper growth and development of the organism [18,19]. Fungi are very important microorganisms with a diverse genomic organization [20]. The MAPK gene family plays significant roles in fungal growth and development as well as in different signaling and stress responses [20]. MAPK research in fungal biology is in its infancy and requires in-depth investigation. Therefore, to learn more about fungal MAPKs, we conducted a genome-wide identification of the MAPK gene family in 173 fungal species and analysed their diversity and evolutionary aspects.

Nomenclature of fungal MAPKs
The nomenclature of a gene is important for establishing its exact identity. However, it was very difficult to name all the identified fungal MAPKs. Therefore, we named only the fungal MAPKs that contained a novel activation loop motif. Each name was formed by taking the first letter of the genus name in upper case and the first letter of the species name in lower case, followed by MPK. Functional characterization of the novel activation loop motifs is yet to be done; therefore, the fungal MAPKs that contain novel activation loop motifs were not named after the genes HOG, SLT2, KSS1, FUS3 and SMK1 found in Saccharomyces cerevisiae.
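The naming rule described above can be sketched in a few lines of Python. This is only an illustration of the stated convention; the function name `mapk_name` is hypothetical, and the sketch does not cover suffixes such as the "-A" seen in ScMPK-A, whose assignment rule is not described in the text.

```python
def mapk_name(species: str) -> str:
    """Build a MAPK name from a binomial species name.

    Convention from the text: first letter of the genus in upper case,
    first letter of the species epithet in lower case, followed by "MPK".
    """
    parts = species.split()
    if len(parts) < 2:
        raise ValueError("expected a binomial name such as 'Genus species'")
    genus, epithet = parts[0], parts[1]
    return f"{genus[0].upper()}{epithet[0].lower()}MPK"


print(mapk_name("Saccharomyces cerevisiae"))  # ScMPK
```

Because only two letters are used, different species can map to the same prefix, so a real naming scheme would need an additional disambiguating suffix.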

Molecular modeling of fungal MAPKs
During the identification of fungal MAPKs, we identified several MAPKs that contain novel activation loop motifs. Therefore, representative molecular structures were modelled for the MAPKs that contained a novel activation loop motif. The fungal MAPK sequences that contained a novel activation loop motif were used to build the molecular structures with the help of the SWISS-MODEL workspace [26]. After the models were built, they were analysed for the presence of the novel motifs using PyMOL software.

Multiple sequence alignment
Multiple sequence alignments of all the identified fungal MAPKs were conducted using Multalin software (http://multalin.toulouse.inra.fr/multalin/) [S1 Fig]. Owing to the large data files of the multiple sequence alignments, it was difficult to incorporate them all in the manuscript as a conceptual figure. Therefore, we conducted another multiple sequence alignment of a few selected fungal MAPK sequences that contained the novel activation loop motifs. The parameters used to run the Multalin program were as follows. Protein weight matrix: Blosum62-12-12; gap penalty at the opening: default; gap penalty at extension: default; gap penalty at the extremities: none; one iteration only: no; high consensus level: 90%; and low consensus level: 50%.
Table 1. MAPK gene family of fungi. The MAPK gene family of 173 fungal species (including three Oomycota species) shows the presence of several novel activation loop motifs (T-T-Y, T-I-Y, T-N-Y, T-H-Y, T-S-Y, K-G-Y, T-Q-Y, S-E-Y and S-D-Y).

Phylogenetic analysis
Different phylogenetic trees were constructed to infer the phylogenetic relationships of fungal MAPKs. In the first case, all the MAPK sequences of the fungal species were used to construct a phylogenetic tree. In the second case, a phylogenetic tree was constructed from the fungal MAPK sequences that contained only novel activation loop motifs, together with AtMPK4, AtMPK16, OsMPK6, and OsMPK14 from A. thaliana and O. sativa as representatives of the T-E-Y and T-D-Y motifs. In the third case, a phylogenetic tree was constructed using all representative MAPKs of A. thaliana and O. sativa with the fungal MAPKs that contained the novel activation loop motifs. In all three cases, a Clustal file was first generated using the ClustalW or Clustal Omega program with the default parameters [27,28]. The resulting Clustal files were downloaded and converted to MEGA file format using MEGA5 software [29]. The resulting MEGA files were used to construct the phylogenetic trees. The parameters used to construct the phylogenetic trees were as follows. Analysis: phylogeny reconstruction; scope: all selected taxa; statistical method: maximum likelihood; test of phylogeny: bootstrap method; number of bootstrap replications: 1000; substitution type: amino acids; model/method: Jones-Taylor-Thornton (JTT); rates among sites: uniform rates; pattern among lineages: homogeneous; gaps/missing data treatment: pairwise deletion/use all sites; and swap filter: very strong.
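The bootstrap test of phylogeny mentioned above rests on resampling alignment columns with replacement and rebuilding a tree from each pseudo-replicate. The following minimal sketch shows only the resampling step, under stated assumptions: the toy alignment and the function name `bootstrap_alignment` are hypothetical, and tree inference itself (done in MEGA5 in this study) is not reproduced.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Return one bootstrap pseudo-replicate of an alignment.

    `alignment` maps sequence names to aligned strings of equal length.
    Columns are drawn with replacement, and the same drawn columns are
    applied to every sequence so that site patterns stay intact.
    """
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all aligned sequences must have the same length")
    n_cols = lengths.pop()
    cols = [rng.randrange(n_cols) for _ in range(n_cols)]
    return {name: "".join(seq[c] for c in cols) for name, seq in alignment.items()}


# Made-up toy alignment, for illustration only.
toy = {"seqA": "MTEYVATRW", "seqB": "MTGYVSTRY", "seqC": "MTDYVVTRW"}
rng = random.Random(0)  # fixed seed so the sketch is repeatable
replicates = [bootstrap_alignment(toy, rng) for _ in range(1000)]
print(len(replicates), len(replicates[0]["seqA"]))
```

Bootstrap support for a clade is then the fraction of replicate trees in which that clade appears.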

Statistical analysis
Tajima's neutrality test and Tajima's relative rate test were conducted in order to understand the statistical significance and the rate of evolution of fungal MAPKs [30]. The parameters used for Tajima's neutrality test were: analysis, Tajima's test of neutrality; scope, all selected taxa; substitution type, amino acids; and gaps/missing data treatment, pairwise deletion. The parameters used for Tajima's relative rate test were: analysis, Tajima's relative rate test; scope, three chosen sequences; substitution type, amino acids; and gaps/missing data treatment, complete deletion.
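For readers who want to see how the D statistic is obtained, the sketch below implements the standard Tajima (1989) formula from three summary numbers: the number of sequences, the number of segregating sites, and the mean number of pairwise differences. It is an illustration, not MEGA5's code; the function name `tajimas_d` is hypothetical, and details such as gap handling for amino acid alignments are not modelled.

```python
import math


def tajimas_d(n: int, s: int, k: float) -> float:
    """Tajima's D from n sequences, s segregating sites and mean pairwise differences k."""
    if n < 4 or s < 1:
        raise ValueError("need at least 4 sequences and 1 segregating site")
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)
    # Numerator: difference between the two estimators of diversity.
    # Denominator: estimated standard deviation of that difference.
    return (k - s / a1) / math.sqrt(e1 * s + e2 * s * (s - 1))


print(round(tajimas_d(10, 16, 3.888889), 3))
```

With n = 10, S = 16 and k = 3.888889, the function returns approximately -1.446. D is zero when k equals S/a1, negative when there is an excess of rare variants, and positive when intermediate-frequency variants dominate.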

Gene duplication analysis
The duplication events of all the Saccharomyces cerevisiae MAPKs were studied using Notung 2.6 software [31,32]. Owing to the lack of a species tree for the 173 species, a study of gene duplication/loss across all the fungal MAPKs was not feasible. Therefore, we conducted the duplication analysis of the fungal MAPKs that contained the novel activation loop motifs using the online server Pinda (http://orion.mbg.duth.gr/Pinda) [33]. The fungal MAPK sequences that gave a z-score below four were considered non-duplicated, while those that gave a z-score above four were considered duplicated [33].

Identification of novel activation loop motifs
Commonly, the fungal MAPKs contain either a T-E-Y or a T-G-Y motif in the activation loop region [19,20]. Besides the T-E-Y or T-G-Y motif, fungal MAPKs were also found to contain several new activation loop motifs (Tables 1 and 2). The novel activation loop motif S-D-Y is present only in the Basidiomycota group. Earlier, it was thought that the "x" in the T-x-Y motif is restricted to G (glycine), P (proline), D (aspartate), and E (glutamate), which belong to the polar, non-polar and negatively charged classes of amino acids. In our study, we found that the "x" amino acid in the T-x-Y motif is highly variable; it can, for example, be polar, as in T-T-Y (Table 1). The species Laccaria amethystina encodes a maximum of 20 MAPKs in its genome. Of the 173 species studied, 31 encode 10 or more MAPKs (Table 1). A majority of the fungal species putatively encode six (40 species), seven (23 species) or eight (20 species) MAPKs in their genomes (Table 1). Of the 73 species of Basidiomycota, 29 encode 10 or more MAPKs. In contrast, of the 82 species of Ascomycota, only one (Thielavia appendiculata) encodes 10 MAPKs. The MAPK gene family of Basidiomycota is therefore larger than that of Ascomycota.
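The text does not describe how the activation loop motif was located in each sequence. As a hedged illustration, the sketch below assumes that the activation segment lies between the conserved DFG and APE landmarks of the kinase domain, scans that window for a three-residue pattern ending in tyrosine, and labels it using the motif lists given in this article. The function name, the anchor choice and the toy sequences are all assumptions made for the example.

```python
import re

# Motif sets as listed in the text (written without hyphens).
KNOWN_MOTIFS = {"TEY", "TDY", "TGY", "TPY"}
NOVEL_MOTIFS = {"TTY", "TIY", "TNY", "THY", "TSY", "KGY", "TQY", "SEY", "SDY"}


def activation_loop_motif(sequence: str):
    """Return (motif, label) for the first listed motif between DFG and APE, else None."""
    start = sequence.find("DFG")
    if start == -1:
        return None
    end = sequence.find("APE", start + 3)
    if end == -1:
        return None
    loop = sequence[start + 3:end]
    # The lookahead allows overlapping three-residue windows to be tested.
    for match in re.finditer(r"(?=([TSK].Y))", loop):
        motif = match.group(1)
        if motif in KNOWN_MOTIFS:
            return "-".join(motif), "known"
        if motif in NOVEL_MOTIFS:
            return "-".join(motif), "novel"
    return None


# Hypothetical toy fragments, not sequences from this study.
print(activation_loop_motif("GSDFGLARVADPDHDHTGFLTEYVATRWYRAPEIM"))
print(activation_loop_motif("AADFGLSRAAKGYVVTRWYRAPEAA"))
```

Kinases whose landmarks deviate from DFG or APE would be missed by this simple scan, so hits should be checked against a multiple sequence alignment.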

Phylogeny
The grouping and classification of the genes in a gene family is important for understanding the functional aspects of that family. Therefore, fungal MAPKs were grouped by constructing a phylogenetic tree. The phylogenetic analysis shows that fungal MAPKs fall into four major groups and two minor groups. The major groups are named group A (red), B (lime), C (magenta), and D (deep blue) (Fig 3). The minor groups are named group E (black) and F (light blue). Group E lies between groups A and B, and group F lies between groups B and C. The phylogenetic tree shows that the fungal MAPK groups originated polyphyletically. This could be because the study was conducted with 173 diverse species, including slime molds. Slime molds are not considered true fungi, and hence their common ancestor could be different from that of other fungi. The classification of the fungal kingdom is changing very rapidly and is yet to be settled. This may therefore be the reason for the presence of polyphyletic groups in fungal MAPKs.
During the study, we found several fungal MAPKs that contain the novel activation loop motifs. It was therefore important to understand their grouping and their phylogenetic relationship with MAPKs carrying the common activation loop motifs (T-E-Y and T-D-Y), as well as their evolution and subsequent divergence. Therefore, we constructed a phylogenetic tree of these MAPKs, which resolved into four groups (Fig 4, Table 2). We named them groups A (red), B (lime), C (magenta), and D (blue). Fungi and plants have a close evolutionary relationship. During evolution, fungi helped the photosynthetic plant lineage to move to the terrestrial ecosystem and green the earth in the early Palaeozoic era, and played a significant role in assisting the colonization of terrestrial environments [39]. Besides, the mutualistic relationships of fungi are much closer to plants [39]. Therefore, we used the MAPK sequences of the plant lineage to construct the phylogenetic tree and understand their evolutionary linkage. Later, we constructed another phylogenetic tree using all the representative MAPKs of A. thaliana and O. sativa to understand the evolutionary relationship of the fungal MAPKs that contain the novel activation loop motifs. This phylogenetic tree resolved into four groups (Fig 5). Of these four groups, two were occupied by fungal MAPKs (magenta and blue) and the other two by plant MAPKs (red and lime). The phylogenetic result suggests that fungal MAPKs are evolutionarily older than plant MAPKs and were derived from their common ancestors during the process of evolution.

Statistical analysis
Tajima's relative rate test evaluates the molecular evolutionary hypothesis (i.e. a constant rate of molecular evolution) between two sequences using an out-group. It can be applied to nucleotide and protein sequences. Therefore, we conducted Tajima's relative rate test on fungal MAPKs that contained novel activation loop motifs, combined with AtMAPKs and OsMAPKs that contained common activation loop motifs. During the analysis, sequences A (OsMPK6) and B (ScMPK-A), with sequence C (CjMPK), were used as selected by the MEGA program. The analysis gave a p-value of 0.01004 and a χ² value of 6.63 (Table 3). Tajima's relative rate test with fungal MAPKs that contained a representative of each novel activation loop motif, combined with the representative AtMAPKs and OsMAPKs, resulted in a p-value of 0.0522 and a χ² value of 3.77 (Table 3). The sequences A (DeMPK) and B (AtMPK1), with sequence C (CjMPK), were used during this analysis. When all the MAPKs that contained the novel activation loop motifs were analysed with Tajima's relative rate test, the resulting p-value was 0.0000 and the χ² value was 97.92 (Table 3).
[Figure caption] The phylogenetic tree shows the presence of four major and two minor groups. The major groups are named group A (red), B (lime), C (magenta) and D (deep blue), and the minor groups are named group E (black) and F (light blue). Group E lies between groups A and B, while group F lies between groups B and C. The majority of the MAPK sequences of fungi-like organisms (Oomycetes) fall in group D (deep blue). The phylogenetic tree was constructed using the maximum likelihood statistical method and the Jones-Taylor-Thornton model with 1000 bootstrap replicates. The MAPK sequences of all fungal species can be found in S1 Appendix.
The sequences A (JGI sequence ID: 687684) and B (JGI sequence ID: 91669), with sequence C (JGI sequence ID: 5240), were used during this analysis as selected by the MEGA software. A p-value less than 0.05 is often used to reject the null hypothesis of equal rates between the lineages (p ≤ 0.01: very strong presumption against the null hypothesis; 0.01 < p ≤ 0.05: strong presumption against the null hypothesis; 0.05 < p ≤ 0.1: low presumption against the null hypothesis).
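The relationship between the χ² values and the p-values reported above can be reproduced with a short sketch of the standard form of Tajima's relative rate test: sites unique to lineage A (m1) and to lineage B (m2) are counted relative to the out-group, and (m1 - m2)²/(m1 + m2) is compared with a χ² distribution with one degree of freedom. The function names and the toy sequences are hypothetical, and the sketch is not MEGA's implementation.

```python
import math


def chi2_p_value(chi2: float) -> float:
    """Upper-tail p-value of a chi-square statistic with one degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2.0))


def relative_rate_test(seq_a: str, seq_b: str, outgroup: str):
    """Return (m1, m2, chi2, p) for three aligned sequences of equal length."""
    if not (len(seq_a) == len(seq_b) == len(outgroup)):
        raise ValueError("sequences must be aligned to the same length")
    m1 = m2 = 0
    for a, b, c in zip(seq_a, seq_b, outgroup):
        if "-" in (a, b, c):
            continue  # complete deletion: skip any site containing a gap
        if a != b and b == c:
            m1 += 1  # change unique to lineage A
        elif a != b and a == c:
            m2 += 1  # change unique to lineage B
    if m1 + m2 == 0:
        raise ValueError("no informative sites")
    chi2 = (m1 - m2) ** 2 / (m1 + m2)
    return m1, m2, chi2, chi2_p_value(chi2)


# Made-up toy alignment: one change unique to each lineage, so chi2 is 0.
print(relative_rate_test("MKTAYV", "MRTAYL", "MKTAYL"))
# p-values for the chi-square values quoted in the text.
for value in (6.63, 3.77, 97.92):
    print(value, chi2_p_value(value))
```

Under this formula, χ² values of 6.63 and 3.77 correspond to p-values of about 0.010 and 0.052, in line with the values quoted in the text, and a χ² of 97.92 gives a p-value that rounds to 0.0000.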
Tajima's neutrality test distinguishes between DNA sequences that evolve randomly (neutrally) and those that evolve under a non-random process such as balancing selection. It describes the evolution of a particular gene, group of genes or genome, and indicates whether they evolved neutrally, by directional selection, or by balancing selection. Therefore, we conducted Tajima's neutrality test on fungal MAPKs. When fungal MAPKs that contained novel activation loop motifs were analysed in combination with a few AtMAPKs and OsMAPKs (two AtMPKs and two OsMAPKs) as representatives of the T-E-Y and T-D-Y motifs, the Tajima's D value was 5.189926 (Table 4). When fungal MAPKs that contained novel activation loop motifs were combined with all representative MAPKs of A. thaliana and O. sativa, the resulting D value was 5.233833 (Table 4).

AtMPKs were grouped with AkMPK (T-T-Y motif) and AlMPK (T-I-Y motif) of fungi. This suggests that the T-T-Y and T-I-Y motifs of AkMPK and AlMPK may also occur in plants, which is yet to be elucidated. The other MAPKs, which have the T-N-Y, T-H-Y, T-S-Y, K-G-Y, T-Q-Y and S-E-Y activation loop motifs, fall into unique groups that are specific to fungi. The activation loop T-E-Y motif is very common and is present in all three kingdoms (plants, animals and fungi). Therefore, one fungal MAPK that contained the T-E-Y motif (MbMPK) was included in this study to better understand the grouping and phylogenetic relationships. Because the T-E-Y motif is common to all three kingdoms while the other motifs are unique to fungi, MbMPK is placed between the fungal and plant MAPKs. ScMPK in the figure represents SLT2 of Saccharomyces cerevisiae.
doi:10.1371/journal.pone.0149861.g005
Table 3. Tajima's relative rate test. The equality of evolutionary rate between sequences A (OsMPK6) and B (ScMPK-A), with sequence C (CjMPK), was tested for fungal MAPKs combined with a few AtMAPKs and OsMAPKs as representatives of the T-E-Y and T-D-Y motifs. Sequences A (DeMPK) and B (AtMPK1), with sequence C (CjMPK), were used for fungal MAPKs combined with all the representative AtMAPKs and OsMAPKs, as per the default selection of the MEGA program. In the case of all fungal MAPKs that contain the novel activation loop motifs, the equality of the evolutionary rate was calculated between sequences A (687684) and B (91669), with sequence C (5240). A p-value less than 0.05 is often used to reject the null hypothesis of equal rates between lineages. The analysis involved 3 amino acid sequences. All positions containing gaps and missing data were eliminated. Evolutionary analyses were conducted in MEGA5.

When all the fungal MAPKs of the 173 species were analysed with Tajima's neutrality test, the resulting D value was -3.500934 (Table 4). When fungal MAPKs that contained only novel activation loop motifs were analysed, the resulting D value was 4.751476. As a rule of thumb, a D value greater than +2 or less than -2 is considered highly significant. Therefore, all the results of Tajima's neutrality test were significant.

Gene duplication analysis
Gene duplication is a major mechanism for generating new genetic material during molecular evolution, and thereby creates genetic novelty. To understand this phenomenon, we analysed the duplication events of the model fungus S. cerevisiae. Of the six MAPKs of S. cerevisiae, five were found to be duplicated and no MAPK was found to be lost during evolution (Fig 6). Owing to certain limitations and the lack of a species tree for all 173 fungal species studied, it was difficult to study the duplication events of all the fungal MAPKs, and hence we restricted this part of the study to S. cerevisiae. Still, it was of particular interest to understand the duplication events of the MAPKs that contained the novel activation loop motifs. Therefore, we studied the duplication events of these MAPKs using the Pinda server. We found that the MAPKs that contain the most common activation loop motifs, T-E-Y, T-G-Y, K-G-Y, T-Q-Y, S-D-Y, and T-S-Y, are highly duplicated and have a z-score of more than four (Table 5). This shows that MAPKs with the most common forms of the activation loop motif are highly duplicated. This result is due to the presence of large numbers of evolutionarily conserved orthologous genes (Fig 6). Of the six MAPKs of S. cerevisiae, five were found to be orthologous genes and only one (Smk1) was found to be paralogous (Fig 6). The MAPKs that contain the activation loop motifs T-T-Y, T-I-Y, S-E-Y, T-H-Y and T-N-Y show a z-score of less than four and are found to be non-duplicated.

Discussion
The origin and evolution of fungi date back 1500 million years, when fungi diverged from other forms of life (Fig 7) [40,41]. Various groups of fungi colonized the land more than 500 million years ago during the Cambrian period [42]. We found that the T-E-Y motif is common to the MAPKs of almost all fungi. Because the fungal kingdom has existed for about 1500 million years, this suggests that T-E-Y is the most ancient MAPK motif and might have been present for 1500 million years. The number of MAPK sequences identified from different fungal species ranges from 2 to 20 (Table 1). The diverse numbers of MAPK sequences and the diverse activation loop motifs in fungal species reflect a wide diversity of MAPKs in fungi.
Table 4. Tajima's test for neutrality. Statistical analysis was carried out using MEGA5. In the statistical analysis, all positions with less than 95% site coverage were eliminated. That is, fewer than 5% alignment gaps, missing data, and ambiguous bases were allowed at any position. All ambiguous positions were removed for each sequence pair. Abbreviations: m = number of sequences; S = number of segregating sites; ps = S/n; Θ = ps/a1; π = nucleotide diversity; and D is the Tajima test statistic.
The T-N-Y, T-S-Y, and K-G-Y motifs are specific to Ascomycota only. The S-D-Y motif, found in M. bicolour, is specific to Basidiomycota, and the T-Q-Y motif, found in U. ramanniana, is specific to Zygomycota only. As fungi have been in existence for the past 1500 million years, it is possible that fungal MAPKs have also been in existence for that long. The absence of T-D-Y motifs in fungi suggests that this motif might have evolved recently and is present only in the plant and animal kingdoms.
In fungi, MAPK signaling cascades are regulated through the cell wall integrity (CWI) pathway [43], the high osmolarity glycerol (HOG) pathway, the Kss1/Fus3 cascade, and the Mpk1 and Smk1 (sporulation and meiosis) pathways (Fig 8) [18][44][45][46][47][48][49][50]. The HOG pathway is crucial for adaptation to osmotic stress and is regulated by the accumulation of glycerol in intracellular spaces [51]. During starvation, fungi activate the Kss1 pathway [52][53][54]. In the Kss1 pathway, the upstream MAPKKK Ste11 activates Ste7, which is followed by activation of the MAPK Kss1 [55,56]. The Fus3 pathway is involved in mating and cell cycle arrest [48,52,57]. Mating is induced by specialized pheromones, which are sensed by the G-protein coupled receptors Ste2 and Ste3; this, in turn, leads to the activation of the MAPK signaling module. The Fus3 and Kss1 MAPKs contain the T-E-Y motif in the activation loop region. Therefore, the T-E-Y motif in the activation loop region of the MAPKs that regulate the Fus3 and Kss1 pathways is presumed to be responsible for mating and filamentous growth, respectively. The HOG1 MAPK, which is responsible for hyperosmosis-mediated glycerol synthesis, contains the T-G-Y motif. This indicates that the T-G-Y motif in the activation loop might play a critical role in the HOG pathway. The Mpk1 MAPK, whose pathway is responsible for cell wall remodeling in fungi, contains the T-E-Y motif. The Smk1 protein, which mediates the Smk1 pathway, contains the T-N-Y motif. Hence, the T-N-Y motif in the activation loop might be responsible for regulating the Smk1 pathway. The activation loop motif K-G-Y was found in MAPKs of S. cerevisiae and X. parietina, which has not been reported before. Therefore, a mitogen-activated protein kinase that contains the K-G-Y motif is new to S. cerevisiae. The presence of the K-G-Y motif is consistent with the K-D-X motif of the KDX1 protein in the Saccharomyces Genome Database.
The KDX1 protein is considered catalytically dead, and the presence of the K-G-Y motif in this MAPK is therefore of particular interest. As the tyrosine residue is a potential phosphorylation site for upstream MAP2Ks, ScMPK-A, which carries the K-G-Y motif, might be functionally active. Motif-specific MAPKs regulate different pathways in S. cerevisiae, and the K-G-Y motif might likewise regulate some important pathway that is yet to be elucidated. In this context, the presence of several new activation loop motifs in fungal MAPKs is very interesting, and they might regulate novel pathways in fungi. A detailed experimental study of fungal MAPKs that contain different activation loop motifs may reveal new MAPK pathways in fungi.
The mutualistic relationships of fungi are closer to plants, and fungi have helped plants to colonize the terrestrial environment owing to their ecological dominance [39,58]. Besides this, during the evolution of life, the evolution of plants was followed by the evolution of fungi, and fungi are evolutionarily close to plants. Therefore, to better understand the evolution of fungal MAPKs, it was necessary to conduct a comparative analysis of fungal MAPKs with plant MAPKs, and MAPKs of A. thaliana and O. sativa were included in this study to construct the phylogenetic tree. The resulting phylogenetic tree shows four distinct groups (Fig 5). The two upper groups contain the MAPKs of O. sativa and A. thaliana (together with fungal AkMPK and AlMPK, which contain the T-T-Y and T-I-Y motifs), and the two lower groups contain the MAPKs of fungi. The two fungal MAPKs AkMPK and AlMPK, which contain the T-T-Y and T-I-Y motifs, respectively, were found to group with the MAPKs of A. thaliana and O. sativa (Fig 5). This suggests that the T-T-Y and T-I-Y motifs might also occur in plants, although this is yet to be reported. The two lower clades of the phylogenetic tree are distinctly separated from the MAPKs of higher plants. This suggests that fungal MAPKs evolved independently and polyphyletically from their common ancestor and diverged during evolution, which led to the presence of the diverse activation loop motifs. Earlier, it was widely reported that motif specificity plays an important role in the grouping of MAPKs [2]. The group D MAPKs of plants and animals always contain the T-D-Y motif. Here, however, the plant MAPKs of A. thaliana and O. sativa with the T-D-Y motif do not group together (Fig 5). This indicates that the T-D-Y motif has no role in the grouping of MAPKs in this analysis, and that motif-based grouping might be applicable to plants and animals only.
When MAPK sequences from all the fungal species were used to construct the phylogenetic tree, it resulted in six distinct groups, similar to the MAPKs of plant systems (Fig 2) [16]. This indicates that, although the MAPKs of plants and fungi evolved independently from their polyphyletic common ancestors, their basic sequence and group architecture remain conserved. This points to conserved and lineage-specific evolution of MAPKs. When the phylogenetic tree was constructed from the sequences of MAPKs with novel activation loop motifs, it resulted in four groups (groups A, B, C and D) (Fig 4, Table 2). The activation loop motifs T-E-Y and T-G-Y were found to be distributed in all four groups, while the activation loop motif T-N-Y was distributed in groups A and C (Table 2). Similarly, MAPKs with the activation loop motif T-H-Y were distributed in groups A and B only. The activation loop motifs S-D-Y, S-E-Y, T-T-Y, and T-I-Y are unique to group D; the T-S-Y and T-Q-Y motifs are unique to groups A and B, respectively (Table 2). The K-G-Y activation loop motif is found only in groups A and C.
In Tajima's relative rate test, the p-values of the three groups range from 0.0000 to 0.0522, which are at or close to the conventional 0.05 significance level [30]. The χ² values for all three groups were also significant (Table 3). Tajima's test for neutrality (D test) also gave significant results. The D value of all the fungal MAPKs studied was found to be -3.500934. In the case of fungal MAPKs that contained only novel activation loop motifs combined with selected AtMAPKs and OsMAPKs, the resulting D value was found to be 5.189926 [30,59,60]. Similarly, the D value of fungal MAPKs that contained only novel activation loop motifs was found to be 4.751476 (Table 4). When fungal MAPKs that contained novel activation loop motifs were combined with all representative MAPKs of A. thaliana and O. sativa, the resulting D value was 5.233833 (Table 4). A negative Tajima's D value represents very low frequencies of genetic polymorphism relative to expectation [30,59,60]. This represents an expansion in population size after selective and purifying selection. A positive Tajima's D value represents high frequencies of polymorphism [30]. This indicates a decrease in population size under balancing selection. The D values of the data sets containing fungal MAPKs with novel activation loop motifs were found to be positive. This suggests that these MAPKs underwent very high frequencies of genetic polymorphism. The novel activation loop motifs of MAPKs evolved recently and have undergone significant genetic polymorphism. When Tajima's D = 0, theta-Pi is equivalent to theta-k (observed = expected). This implies that the average heterozygosity is equal to the number of segregating sites, or, in other words, that the observed variation is similar to the expected variation [30,59,60]. This signifies that the population is evolving at mutation-drift equilibrium and that there is no evidence of selection.
When Tajima's D < 0, the average heterozygosity is lower than expected from the number of segregating sites, and rare alleles are present at low frequencies [30,59,60]. This signifies a recent selective sweep, a population expansion after a recent bottleneck, or linkage to a swept gene. When Tajima's D > 0, the average heterozygosity is higher than expected from the number of segregating sites, and multiple alleles can be present, some at low and others at high frequencies. This signifies balancing selection or a sudden contraction of the population [30,59,60]. For fungal MAPKs with novel activation loop motifs, and for their comparison with selected A. thaliana and O. sativa MAPKs, the D value is greater than zero (D > 0) (Table 4). This suggests that fungal MAPKs with novel activation loop motifs and plant MAPKs have multiple alleles present at high frequencies, consistent with balancing selection and a sudden decrease in population size. The D value of all fungal MAPKs together was found to be -3.500934, which is less than zero (D < 0). This indicates that rare MAPK alleles are present at very low frequencies, consistent with a population expansion after a recent bottleneck. As a rule of thumb, D values greater than +2 or less than -2 are considered significant [30]. In all cases, the absolute D value was greater than 2, and hence the analyses were highly significant. For fungal MAPKs that contain the novel activation loop motifs analysed with all representative AtMAPKs and OsMAPKs, the resulting D value was 5.233833. This suggests that the fungal MAPKs containing novel activation loop motifs and the plant MAPKs are highly polymorphic and evolutionarily conserved, and that they have evolved only recently. In the gene duplication study, the MAPKs that contain the new activation loop motifs T-T-Y, T-I-Y, S-E-Y, T-H-Y and T-N-Y have a z-score below four (Table 5).
A gene with a z-score above four is considered, with high significance, to be duplicated [33]. Thus, the fungal MAPKs that contain these novel activation loop motifs are non-duplicated; they most probably evolved only recently and are yet to undergo any duplication event. The numbers of MAPK sequences that contain the T-T-Y, T-I-Y, S-E-Y, T-H-Y and T-N-Y motifs are very low, which is consistent with the non-duplicated status of these MAPKs.

Conclusion
Genome-wide identification of the MAPK gene family in fungi revealed the presence of several novel activation loop motifs. The evolutionary study shows that fungal MAPKs that contain the T-E-Y motif in the activation loop region are older than those with other activation loop motifs, and that MAPKs that contain the T-D-Y motif are presumed to have evolved recently. The Mpk1, Fus3 and Kss1 pathways are mediated by MAPKs that contain the T-E-Y motif. The HOG1 pathway is mediated by a MAPK that contains the T-G-Y motif, and the Smk1 pathway is mediated by a MAPK that contains the T-N-Y motif. This suggests that the activation loop motif of a fungal MAPK decides the fate of different pathways. From this, we speculate that fungal MAPKs with novel activation loop motifs might regulate other novel pathways in different species, which are yet to be elucidated.