Genome-Wide Identification and Analysis of Expression Profiles of Maize Mitogen-Activated Protein Kinase Kinase Kinase

Mitogen-activated protein kinase (MAPK) cascades are highly conserved signal transduction modules in animals, yeast and plants. Plant MAPK cascades have been implicated in development and stress responses. Although MAPKKKs have been investigated in several plant species, including Arabidopsis and rice, no systematic analysis has been conducted in maize. In this study, we performed a bioinformatics analysis of the entire maize genome and identified 74 MAPKKK genes. Phylogenetic analyses of MAPKKKs from maize, rice and Arabidopsis classified them into three subgroups: Raf, ZIK and MEKK. Evolutionary relationships within subfamilies were also supported by exon-intron organization and conserved protein motifs. Further expression analysis of the MAPKKKs in microarray databases suggested that MAPKKKs are involved in important signaling pathways in different organs and developmental stages of maize. Our genomic analysis of maize MAPKKK genes provides important information for the evolutionary and functional characterization of this family in maize.

Maize (Zea mays L.) is one of the oldest and most important crops worldwide, relied upon for human food, animal feed and starch-based ethanol production. So far, seven MAPKs and four MKKs have been characterized in maize [43][44][45][46][47][48][49][50][51]. However, to our knowledge, the maize MAPKKK gene family has not been characterized in detail. In this study, we performed a bioinformatics analysis of the entire maize genome and identified 74 MAPKKK genes. In addition, we provide detailed information on the genomic structures, chromosomal locations and phylogenetic relationships of maize MAPKKK genes. We then investigated their transcript profiles in different organs and developmental stages using microarray data, which should help future studies elucidate the precise roles of MAPKKKs in maize growth and development.

Identification of MAPKKK Gene Family in Maize
The completed genome sequence of Zea mays was downloaded from the maize sequence database (http://www.maizesequence.org/index.html). To identify the maize MAPKKK gene family, Arabidopsis and rice MAPKKK protein sequences were first used as queries to search the maize genome database and NCBI using the BLASTP program. A self-BLAST of the resulting sequences was then carried out to remove redundancy. The Pfam (http://pfam.sanger.ac.uk/search) and SMART (http://smart.embl-heidelberg.de/) databases were used to confirm each predicted maize MAPKKK protein sequence.
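The hit-collection step of such a search can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes BLASTP results in the standard 12-column tabular layout, the e-value cutoff is an assumed value that the text does not state, and all sequence identifiers are made up.

```python
# Hypothetical helper: collapse BLASTP tabular hits into unique candidates.
def collect_candidates(tab_lines, max_evalue=1e-10):
    """Return {subject_id: best_evalue} for hits at or below max_evalue.

    Expects the 12-column tabular layout (query, subject, % identity, ...,
    e-value in column 11, bit score in column 12). max_evalue is an
    assumed cutoff, not one taken from the text.
    """
    best = {}
    for line in tab_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        subject, evalue = fields[1], float(fields[10])
        if evalue > max_evalue:
            continue  # hit is too weak to keep
        # Several queries can hit the same subject; keep it once,
        # remembering the best (smallest) e-value seen.
        if subject not in best or evalue < best[subject]:
            best[subject] = evalue
    return best


# Made-up hits: two queries find ZmA, one weak hit finds ZmB.
hits = [
    "AtQ1\tZmA\t62.1\t300\t100\t3\t1\t300\t5\t304\t1e-90\t310",
    "OsQ1\tZmA\t70.4\t310\t80\t2\t1\t310\t2\t311\t1e-120\t400",
    "AtQ1\tZmB\t28.0\t90\t60\t4\t10\t99\t20\t109\t0.5\t30",
]
candidates = collect_candidates(hits)
print(sorted(candidates))
```

Keying the result on the subject identifier removes the redundancy that arises when several Arabidopsis and rice queries recover the same maize protein. Domain confirmation with Pfam and SMART would remain a separate step.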

Gene Structure Analysis of Maize MAPKKK Genes
Information on the maize MAPKKK genes, including accession number, chromosomal location, ORF length and exon-intron structure, was retrieved from the B73 maize sequencing database (http://www.maizesequence.org/index.html).

Phylogenetic Analysis of Maize MAPKKK Proteins
Multiple alignments of MAPKKK proteins were carried out using the Clustal X v1.83 program. The protein sequences of Arabidopsis and rice MAPKKKs were obtained from the TIGR database. Phylogenetic analysis was performed with the MEGA5.0 program using the neighbor-joining method, and the bootstrap test was carried out with 1000 replicates.

Chromosomal Locations and Gene Duplication of MAPKKK Genes
Genes were mapped on chromosomes by identifying their chromosomal position provided in the maize sequence database. Gene duplication events of MAPKKK genes in maize B73 were also investigated. We defined gene duplication according to the following criteria: 1) the alignment length covered >80% of the longer gene; 2) the aligned region had an identity >80%; 3) only one duplication event was counted for tightly linked genes. All of the relevant genes identified in the maize genome were aligned using Clustal X v1.83 and analyzed using MEGA v5.0.
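The three duplication criteria can be expressed as a small filter. The sketch below is one possible reading of them, not the authors' implementation: the `PairAlignment` record, the gene names and the `linked_cluster` mapping (which assigns tightly linked genes to a shared cluster label so that they contribute a single event) are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PairAlignment:
    gene_a: str
    gene_b: str
    len_a: int          # length of gene A
    len_b: int          # length of gene B (same unit as len_a)
    aligned_len: int    # length of the aligned region
    identity: float     # percent identity over the aligned region


def passes_criteria(aln, min_cov=80.0, min_identity=80.0):
    """Criteria 1 and 2: coverage of the longer gene and identity both > 80%."""
    longer = max(aln.len_a, aln.len_b)
    coverage = 100.0 * aln.aligned_len / longer
    return coverage > min_cov and aln.identity > min_identity


def count_events(alignments, linked_cluster):
    """Criterion 3: tightly linked genes share a label and count once."""
    events = set()
    for aln in alignments:
        if not passes_criteria(aln):
            continue
        a = linked_cluster.get(aln.gene_a, aln.gene_a)
        b = linked_cluster.get(aln.gene_b, aln.gene_b)
        events.add(frozenset((a, b)))  # unordered pair, deduplicated
    return len(events)


# Made-up alignments for illustration only.
alignments = [
    PairAlignment("g1", "g2", 1000, 900, 850, 90.0),    # 85% coverage: kept
    PairAlignment("g1", "g3", 1000, 500, 450, 95.0),    # 45% coverage: dropped
    PairAlignment("g4", "g5", 1000, 1000, 900, 80.0),   # identity not above 80: dropped
    PairAlignment("g6", "g7", 1200, 1200, 1100, 88.0),  # kept
    PairAlignment("g6", "g8", 1200, 1150, 1000, 85.0),  # kept, g8 linked to g7
]
linked = {"g7": "cluster1", "g8": "cluster1"}
n_events = count_events(alignments, linked)
print(n_events)
```

Coverage is measured against the longer gene of each pair, so a short gene that aligns fully to part of a long one does not qualify. How "tightly linked" is decided (for example, by a distance threshold on the chromosome) is not specified in the text and is left to the caller here.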

Expression Analyses of the MAPKKK Genes
Microarray expression data from various datasets were obtained using Genevestigator (https://www.genevestigator.com/gv/) with the Maize Gene Chip platform. The maize MAPKKK expression data were obtained by searching the Maize Gene Chip with the identified MAPKKK IDs (Table 1).

Plant Materials and Growth Conditions
For the maize inbred line Qi 319 (from Shandong Academy of Agricultural Sciences), embryos at 25 days after pollination were harvested from greenhouse-grown plants raised in sand under 16 h of light (25 °C) and 8 h of dark (20 °C), and tissues and organs of eight-week-old seedlings were harvested for expression analysis. Samples were collected and immediately frozen in liquid N2 for further use. Two biological replicates were performed for each sample.

RNA Isolation and Real-time Quantitative RT-PCR Expression Analysis
Total RNAs were extracted according to the instructions of Trizol reagent (Invitrogen, Carlsbad, CA, USA) from leaves of maize seedlings with different treatments. The first strand cDNAs were synthesized using First Strand cDNA Synthesis kit (Fermentas, USA).
Real-time quantitative RT-PCR reactions were performed in a Bio-Rad MyiQ™ Real-time PCR Detection System (Bio-Rad, USA) using the TransStart Top Green qPCR SuperMix (TransGen, China) according to the manufacturer's instructions. Each PCR reaction (20 µl) contained 10 µl of 2× real-time PCR Mix (containing SYBR Green I), 0.5 µl of each primer, and appropriately diluted cDNA. The thermal cycling conditions were 95 °C for 30 s followed by 45 cycles of 95 °C for 15 s, 55 °C to 60 °C for 30 s, and 72 °C for 15 s. The Zmactin gene was used as the internal reference for all qRT-PCR analyses. Each treatment was repeated three times independently. Relative gene expression was calculated according to the delta-delta Ct method of the system. The primers used are described in Table S1 in File S1.
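For readers unfamiliar with the delta-delta Ct calculation, a minimal sketch follows. It assumes roughly 100% amplification efficiency (a doubling per cycle) and a calibrator sample chosen by the analyst; the text specifies neither, and the Ct values below are invented for illustration.

```python
from statistics import mean


def relative_expression(target_sample, ref_sample, target_calib, ref_calib):
    """Fold change of a target gene by the delta-delta Ct method.

    Each argument is a list of replicate Ct values:
      target_sample / ref_sample : target and reference gene (e.g. actin)
                                   in the tissue of interest
      target_calib / ref_calib   : the same genes in the calibrator tissue
    Assumes one doubling of product per cycle.
    """
    delta_sample = mean(target_sample) - mean(ref_sample)
    delta_calib = mean(target_calib) - mean(ref_calib)
    delta_delta = delta_sample - delta_calib
    return 2.0 ** (-delta_delta)


# Invented triplicate Ct values: delta Ct is about 4 in the sample and
# about 7 in the calibrator, so the fold change is about 2**3.
fold = relative_expression(
    target_sample=[22.1, 21.9, 22.0],
    ref_sample=[18.0, 18.1, 17.9],
    target_calib=[25.0, 25.2, 24.8],
    ref_calib=[18.0, 18.0, 18.0],
)
print(round(fold, 2))
```

Normalizing to the reference gene first corrects for differences in cDNA input between samples; subtracting the calibrator's delta Ct then expresses each tissue relative to a common baseline.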

Genome-wide Identification of MAPKKK Family in Maize
The availability of the complete maize genome sequence has made it possible for the first time to identify all the MAPKKK family members in this plant species. BLAST searches of the maize sequence database and the NCBI database were performed using 80 Arabidopsis and 75 rice MAPKKK sequences as queries. This analysis identified 74 putative MAPKKK gene family members in the complete maize genome, designated ZmMAPKKK1-ZmMAPKKK74 according to their group, since no standard nomenclature has been followed for MAPKKKs in either Arabidopsis or rice. All 74 MAPKKKs had conserved protein kinase domains. Because some genes of the family had alternative splice variants, the subsequent analysis was restricted to a single variant per gene. Detailed information on the maize MAPKKK genes identified in the present study, including accession numbers, number of amino acids, molecular weight and isoelectric point (pI), is listed in Table 1. ZmMAPKKK ORF lengths ranged from 1062 bp (ZmMAPKKK57) to 4014 bp (ZmMAPKKK14) and the molecular weights ranged from 39.8 kDa (ZmMAPKKK57) to 148.1 kDa (ZmMAPKKK14). Since the maize genome (~2300 Mb) is much larger than the genomes of Arabidopsis (125 Mb) and rice (389 Mb), the number of MAPKKK genes in maize might be expected to be larger than that in Arabidopsis and rice. However, according to the present study, the number of maize MAPKKK genes was in fact smaller than that of Arabidopsis and rice (Figure 1).

Comparative Phylogenetic Analysis of MAPKKK Gene in Maize, Arabidopsis and Rice
To examine the evolutionary relationships between different MAPKKK members in maize, Arabidopsis and rice, an unrooted tree was constructed from alignments of the full MAPKKK amino acid sequences using the Neighbor-Joining (NJ) method in MEGA5.0. The phylogenetic analysis indicated that ZmMAPKKKs can be divided into three major groups: MEKK, Raf and ZIK. There were 46 MAPKKKs from maize, 43 from rice and 48 from Arabidopsis in the Raf group. Only 6 MAPKKKs from maize, 10 from rice and 11 from Arabidopsis were grouped into the ZIK group (Figure 1). Inspection of the phylogenetic tree indicated 19 ZmMAPKKK paralogous gene pairs, and these gene pairs represented 52% of the maize MAPKKK gene family members (Figure S1 in File S1), suggesting that the maize MAPKKK gene family may have undergone multiple duplications during its evolutionary history. Phylogenetic analysis also showed that there were 16 pairs of maize/rice MAPKKK proteins in the same clade of the phylogenetic tree (Figure 1).

Gene Structural Organization and Analysis of Conserved Domain in MAPKKK Genes
Based on the predicted sequences, the maize MAPKKK gene structures were determined. As shown in Figure 2, most maize MEKK group genes had 8-17 exons, whereas six genes (ZmMAPKKK17, ZmMAPKKK18, ZmMAPKKK19, ZmMAPKKK20, ZmMAPKKK21 and ZmMAPKKK22) had only one exon and one gene (ZmMAPKKK14) had 24 exons, which was consistent with the exon numbers of their orthologs in Arabidopsis and rice. Members of the Raf and ZIK groups possessed 2-17 exons and 7-9 exons, respectively. The conserved exon numbers in each subgroup among all three species supported their close evolutionary relationship and the classification into subgroups.
Using Clustal X to analyze the full protein sequences of all MAPKKKs, we found that most of the Raf group proteins had a C-terminal kinase domain and extended N-terminal domains. In contrast, most of the ZIK group members had an N-terminal kinase domain, whereas the kinase domains of MEKK family proteins were located at the N-terminal, C-terminal or central part of the protein, which was consistent with their orthologs in Arabidopsis and rice (data not shown) [24]. In addition, we investigated the conserved motifs in their kinase domains. Among the three families, the MEKK family is relatively well characterized. Most MEKK-like proteins seem to participate in canonical MAP kinase cascades that activate downstream MKKs. AtMEKK1 and AtMEKK2 were shown to play important roles in plant innate immunity [28,30,52]. More recently, Hashimoto et al. (2012) reported that NbMAPKKKα, NbMAPKKKβ and NbMAPKKKγ functioned as positive regulators of PCD [53]. All members of the maize MEKK family shared the conserved motif G(T/S)Px(W/F)MAPEV, which confirmed their association with the MEKK family [24] (Figure 3A). ZIK-like kinases, also known as WNK (With No lysine (K)) kinases, have not been shown to phosphorylate MKKs in plants and are involved in internal rhythms. AtWNK1 phosphorylated the putative circadian clock component APRR3 in vitro and might be involved in a signal transduction cascade regulating its biological activity [54]. AtWNK2/5/8 regulated flowering time by modulating the photoperiod pathway [55]. Recently, OsWNK1 was found to respond differentially to various abiotic stresses and also showed a rhythmic expression profile under diurnal and circadian conditions at the transcription level [56]. The conserved motif of ZIK family proteins in maize was investigated using Clustal X and, as shown in Figure 3B, a conserved signature motif GTPEFMAPE(L/V)(Y/F) was found in all members [24].
Compared with the ZIK and MEKK-like families, the Raf family has many more members. Two of the best-studied Arabidopsis Raf-like MAPKKKs, CTR1 and EDR1, are known to participate in ethylene-mediated signaling and defense responses. However, neither CTR1 nor EDR1 has been confirmed to participate in a classic MAPK cascade. As shown in Figure 3C, all members of the Raf family except ZmMAPKKK47 have the conserved motif GTXX(W/Y)MAPE, which strongly supported their identity as members of the Raf subfamily [24].
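The three signature motifs translate directly into regular expressions, which gives a quick way to screen a protein sequence for them. The sketch below is illustrative only: the translation of "x"/"X" to "any residue" is the usual convention, and the test sequences are invented, not maize proteins.

```python
import re

# Signature motifs from the text, with x/X read as "any residue".
MOTIFS = {
    "MEKK": re.compile(r"G[TS]P.[WF]MAPEV"),     # G(T/S)Px(W/F)MAPEV
    "ZIK": re.compile(r"GTPEFMAPE[LV][YF]"),     # GTPEFMAPE(L/V)(Y/F)
    "Raf": re.compile(r"GT..[WY]MAPE"),          # GTXX(W/Y)MAPE
}


def matching_groups(protein_seq):
    """Return the sorted names of all signature motifs found in the sequence."""
    seq = protein_seq.upper()
    return sorted(name for name, rx in MOTIFS.items() if rx.search(seq))


# Invented toy sequences, one per case.
print(matching_groups("AAAGSPYWMAPEVKKK"))   # MEKK-type signature
print(matching_groups("MMGTPEFMAPELYQQ"))    # ZIK-type signature
print(matching_groups("LLGTAHYMAPEAA"))      # Raf-type signature
print(matching_groups("GTPAWMAPEV"))         # satisfies two patterns
```

The patterns are not mutually exclusive: a sequence such as GTPAWMAPEV satisfies both the MEKK and the Raf pattern, so the function reports every match rather than forcing a single label. A motif hit is therefore best treated as supporting evidence alongside the phylogenetic placement, not as a classifier on its own.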

Genomic Distribution and Gene Duplication
The physical locations of the MAPKKK genes on maize chromosomes are depicted in Figure 4. Seventy-three ZmMAPKKKs were mapped across all 10 maize chromosomes, and one (ZmMAPKKK29) was situated on an unanchored contig (chromosome unknown). Ten genes were present on each of chromosomes 3 and 5; nine on each of chromosomes 1, 2 and 4; and four on each of chromosomes 6, 7 and 10. In addition, chromosome 8 carried 8 MAPKKK genes, whereas chromosome 9 carried 6.
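The per-chromosome counts can be checked against the totals with a short tally. The dictionary below simply transcribes the numbers given in the paragraph above, reading "ten on chromosomes 3 and 5" as ten on each; the variable names are illustrative.

```python
# Per-chromosome MAPKKK gene counts as stated in the text.
genes_per_chromosome = {
    1: 9, 2: 9, 3: 10, 4: 9, 5: 10,
    6: 4, 7: 4, 8: 8, 9: 6, 10: 4,
}
unanchored = 1  # ZmMAPKKK29, located on an unanchored contig

anchored_total = sum(genes_per_chromosome.values())
family_total = anchored_total + unanchored

print(anchored_total, family_total)
```

The anchored genes sum to 73 and the family total to 74, which agrees with the numbers reported for the mapped genes and for the family as a whole.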
Gene duplication events play a significant role in the amplification of gene family members in the genome. Several rounds of genome duplication events have been found in the maize genome [57]. The expansion mechanism of the maize MAPKKK gene family was analyzed to understand gene duplication events. As shown in Figure 4, nineteen paralogous pairs among the 74 maize MAPKKKs were identified, including 17 segmental duplication events between chromosomes and 2 duplication events within the same chromosome (ZmMAPKKK30 and ZmMAPKKK31; ZmMAPKKK50 and ZmMAPKKK51). Furthermore, these gene pairs shared similar exon-intron structures. This result suggested that duplication events played vital roles in the expansion of MAPKKK genes in the maize genome.

Expression Pattern of the Maize MAPKKK Genes in Different Tissues and Developmental Stages
To observe the expression profiles of the MAPKKK genes during maize development, we analyzed their expression under normal growth conditions by a Genevestigator analysis (https://www.genevestigator.ethz.ch/) in 18 different tissues: seedlings, coleoptiles, radicles, tassel, anther, ear, silk, caryopsis, embryo, endosperm, pericarp, culm, internode, foliar leaf, juvenile leaf, adult leaf, blade and primary root. Fifty-seven genes corresponded to probes, whereas no corresponding probes were found for the other 17 MAPKKK genes. A heatmap of the expression profiles of the 57 MAPKKK genes during maize development is shown in Figure 5. Eight MAPKKKs (ZmMAPKKK25, ZmMAPKKK28, ZmMAPKKK36, ZmMAPKKK43, ZmMAPKKK52, ZmMAPKKK53, ZmMAPKKK67 and ZmMAPKKK72) had higher expression in anther than in other organs. Eight MAPKKKs (ZmMAPKKK4, ZmMAPKKK9, ZmMAPKKK32, ZmMAPKKK33, ZmMAPKKK37, ZmMAPKKK46, ZmMAPKKK69 and ZmMAPKKK73) had higher expression in embryo than in endosperm, whereas ZmMAPKKK22, ZmMAPKKK29, ZmMAPKKK39 and ZmMAPKKK49 showed the opposite pattern. In addition, five MAPKKKs (ZmMAPKKK18, ZmMAPKKK22, ZmMAPKKK55, ZmMAPKKK63 and ZmMAPKKK62) were expressed with high abundance in primary roots, which was consistent with their expression in radicle. Notably, ZmMAPKKK10 and ZmMAPKKK11 demonstrated a unique expression pattern in silk. The expression patterns of duplicated MAPKKK gene pairs were also investigated. Only seven pairs (ZmMAPKKK33 and ZmMAPKKK32, ZmMAPKKK44 and ZmMAPKKK45, ZmMAPKKK52 and ZmMAPKKK53, ZmMAPKKK64 and ZmMAPKKK65, ZmMAPKKK63 and ZmMAPKKK62, ZmMAPKKK67 and ZmMAPKKK68, ZmMAPKKK70 and ZmMAPKKK71) shared similar expression patterns in nearly all organs, whereas the other paralogs did not. These results suggest that although duplicated genes share high amino acid sequence similarity, they do not necessarily have similar functions or participate in the same signaling pathways.
In addition, we examined the expression profiles of MAPKKK family genes at different developmental stages through analysis of publicly available microarray data sets. All 57 genes were expressed in at least one developmental stage (Figure 6). Nine MAPKKK genes (ZmMAPKKK10, ZmMAPKKK11, ZmMAPKKK16, ZmMAPKKK25, ZmMAPKKK28, ZmMAPKKK30, ZmMAPKKK56, ZmMAPKKK70, ZmMAPKKK71) were expressed in all developmental stages shown in Figure 6 except for the inflorescence formation stage, whereas another nine MAPKKK genes (ZmMAPKKK1, ZmMAPKKK9, ZmMAPKKK26, ZmMAPKKK33, ZmMAPKKK45, ZmMAPKKK49, ZmMAPKKK65, ZmMAPKKK68, ZmMAPKKK73) had higher expression during inflorescence formation than in other developmental stages. In addition, ZmMAPKKK37, ZmMAPKKK46 and ZmMAPKKK58 had higher expression in the germination stage than other genes, whereas ZmMAPKKK43, ZmMAPKKK69 and ZmMAPKKK71 had their highest expression in the anthesis stage. ZmMAPKKK52 was expressed with low abundance in all stages. Moreover, several paralogs (ZmMAPKKK15 and ZmMAPKKK16, ZmMAPKKK71 and ZmMAPKKK70, ZmMAPKKK52 and ZmMAPKKK53, ZmMAPKKK62 and ZmMAPKKK63, ZmMAPKKK64 and ZmMAPKKK65) showed highly similar expression profiles, which may indicate subfunctionalization in the course of evolution. However, other gene pairs showed quite different expression profiles across the maize developmental stages.
Next, we used quantitative real-time RT-PCR to validate the tissue expression patterns obtained from the microarray database. Nine genes (ZmMAPKKK10, ZmMAPKKK11, ZmMAPKKK16, ZmMAPKKK18, ZmMAPKKK27, ZmMAPKKK47, ZmMAPKKK51, ZmMAPKKK55, and ZmMAPKKK63) were selected to confirm their expression in primary root, pericarp, internode, adult leaf, silk, culm, seedling, endosperm, embryo and tassel. Surprisingly, most of our qRT-PCR data did not correspond with the microarray data (Figures 5 and 7). For example, our qRT-PCR results showed that ZmMAPKKK11, ZmMAPKKK18, ZmMAPKKK47 and ZmMAPKKK51 exhibited their highest expression levels in embryo (Figure 7), whereas the microarray data showed that these four genes had higher expression in silk, root and coleoptiles than in embryo (Figure 5). However, ZmMAPKKK11, ZmMAPKKK18, ZmMAPKKK47 and ZmMAPKKK51 showed higher expression in the dough stage (Figure 6), suggesting that they may play important roles in seed development, which was consistent with our qRT-PCR data (Figure 7). The conflicting results between our qRT-PCR and the microarray database may be due to differences in plant materials, growth conditions and experimental conditions. From these results, we speculate that most MAPKKK genes, which showed different expression levels across all the maize organs examined, might play key roles in plant development, and that several MAPKKK genes may function specifically at particular maize developmental stages. However, additional biological experiments are needed to determine the functions of the MAPKKK family.

Conclusion
An increasing body of evidence has shown that mitogen-activated protein kinase (MAPK) cascades are involved in plant development and stress responses. Although MAPKKKs have been investigated in several plant species, including Arabidopsis and rice, no systematic analysis has been conducted in maize. In the present study, we performed a genome-wide survey and identified 74 MAPKKK genes from maize. Phylogenetic analysis of MAPKKKs from maize, rice and Arabidopsis classified them into three subgroups. Members within each subgroup may have recent common evolutionary origins, since they shared conserved protein motifs and exon-intron structures. Furthermore, microarray analysis showed that a number of maize MAPKKK genes are differentially expressed across different tissues and developmental stages. In addition, quantitative real-time RT-PCR was performed on nine selected MAPKKK genes to examine their expression patterns in different tissues. Our observations may lay the foundation for future functional analysis of maize MAPKKK genes to unravel their biological roles.

Supporting Information
File S1. Supporting Information file containing Figure S1 and Table S1. (DOC)

Author Contributions