Conserved Transcriptional Regulatory Programs Underlying Rice and Barley Germination

Germination is a biological process important to plant development and agricultural production. Barley and rice diverged 50 million years ago, but share a similar germination process. To gain insight into the conservation of their underlying gene regulatory programs, we compared transcriptomes of barley and rice at start, middle and end points of germination, and revealed that germination regulated barley and rice genes (BRs) diverged significantly in expression patterns and/or protein sequences. However, BRs with higher protein sequence similarity tended to have more conserved expression patterns. We identified and characterized 316 sets of conserved barley and rice genes (cBRs) with high similarity in both protein sequences and expression patterns, and provided a comprehensive depiction of the transcriptional regulatory program conserved in barley and rice germination at gene, pathway and systems levels. The cBRs encoded proteins involved in a variety of biological pathways and had a wide range of expression patterns. The cBRs encoding key regulatory components in signaling pathways often had diverse expression patterns. Early germination up-regulation of cell wall metabolic pathway and peroxidases, and late germination up-regulation of chromatin structure and remodeling pathways were conserved in both barley and rice. Protein sequence and expression pattern of a gene change quickly if it is not subjected to a functional constraint. Preserving germination-regulated expression patterns and protein sequences of those cBRs for 50 million years strongly suggests that the cBRs are functionally significant and equivalent in germination, and contribute to the ancient characteristics of germination preserved in barley and rice. The functional significance and equivalence of the cBR genes predicted here can serve as a foundation to further characterize their biological functions and facilitate bridging rice and barley germination research with greater confidence.


Introduction
Seed germination is a biological process important to plant development, plant evolution and agricultural production. Strictly defined, germination begins with the uptake of water by dry quiescent seeds and ends with visible emergence of an embryo tissue from its surrounding tissues [1]. Seed germination is accompanied by many distinct metabolic, cellular and physiological changes. For example, upon imbibition, the dry quiescent seeds take up water and rapidly resume many fundamental metabolic activities such as respiration, RNA metabolism, and protein synthesis using surviving structures and components in the desiccated cells. These concerted biological activities transform a dehydrated and resting embryo with almost undetectable metabolism into one with vigorous metabolism culminating in growth [2,3].
Transcriptional regulatory programs underlying seed germination and its associated biological pathways have been investigated in divergent plant species [4,5,6,7,8,9,10,11]. Extremely complex transcriptional regulatory programs are activated over the course of seed germination. In barley germination and seedling growth, 50% of examined genes are expressed in dry and germinating seeds at a detectable level. Twenty-five percent of those examined genes are differentially regulated over the course of seed germination and seedling growth. Based on global and dynamic expression changes of the germination-regulated genes, the transcriptional regulatory program underlying barley seed germination is divided into early and late phases. Each phase is accompanied by differential expression of a distinct set of genes and biological pathways. For example, the early phase of seed germination is accompanied by transcriptional up-regulation of cell wall synthesis and regulatory components including transcription factors, signaling proteins, and post-translational modification proteins. During the late germination phase, histone families and many metabolic pathways are up-regulated. Stress related pathways and seed storage protein genes are down-regulated through the entire course of germination. Comparing transcriptomes of barley and Arabidopsis showed that high accumulation of many seed stored transcripts in Arabidopsis and barley dry seeds has been preserved for 200 million years of monocot-dicot divergence [9,11].
Barley and rice diverged 50 million years ago, but remain highly similar in seed germination and seedling growth [3,12]. For example, both rice and barley are endospermic, starchy cereal species and have a highly conserved seed storage mobilization pathway. Both produce hydrolytic enzymes in aleurone tissues during seed germination and seedling growth, and translocate these enzymes to the starchy endosperm to mobilize seed storage reserves. Seed germination and the associated production of hydrolytic enzymes are induced by gibberellic acid through a highly conserved signal transduction pathway [10,13,14,15]. To gain insight into the transcriptional regulatory programs underlying the conserved characteristics of barley and rice germination, we determined transcriptomes of rice grains at start-, mid- and end-germination points, and developed a bioinformatic and evolutionary approach to compare them with our previously determined transcriptomes of barley at the equivalent germination stages [9]. Genome-wide sequence comparison identified germination regulated rice and barley gene pairs with strong sequence similarity. While a small percentage of these pairs showed similar expression patterns over the course of seed germination, a majority had divergent expression patterns. The analysis also identified a collection of germination regulated barley-rice gene sets. The rice and barley genes in each set shared strong similarities in protein sequences and expression patterns. Gene expression patterns and protein sequences change quickly if there are no functional constraints [16,17,18,19,20,21,22]. Seed germination is accomplished through the concerted activities of many gene products, which are mainly defined by their protein sequences and accumulation patterns. The preservation of germination-regulated expression patterns and protein sequences of the barley and rice genes in each set suggests that the barley and rice genes are functionally important and equivalent in germination, and likely contribute to the molecular and cellular processes conserved in barley and rice germination.

Transcriptomes of Barley and Rice at Three Distinct and Equivalent Developmental Stages of Germination
An objective of this study was to compare transcriptomes of rice and barley over the course of germination and to identify germination regulated barley and rice genes with conserved protein sequences and expression patterns. Since expression of germination related genes is often differentially regulated with respect to specific developmental stages over the course of seed germination [6,9], it is critical to compare their transcript accumulation levels at distinct and equivalent physiological stages. Our previous studies showed that the transcriptional regulatory program underlying seed germination is divided into early and late germination phases that are separated by the mid-time point of germination [9]. Transcriptomes of barley at the start (dry), middle (9 hr) and end (18 hr) points of germination were previously determined and used for the comparison [9]. It took 42 hours for radicles to emerge from rice grains under germination conditions identical to those used for barley. To compare transcriptomes of germinating rice and barley grains at equivalent stages, we examined transcriptomes of rice at 0 (dry), 21 and 42 hours of germination as the start, middle and end stages of germination. Three independent biological replications were conducted for each stage in rice and barley transcriptome assays.
Both barley and rice transcriptome data used in this study were produced using the Affymetrix GeneChip technologies (GeneChip Barley Genome Array and GeneChip Rice Genome Array), and were analyzed using identical statistical approaches and parameters to reduce variation from different transcriptome assay platforms and statistical analysis. One-way ANOVA identified a total of 3599 barley and 18665 rice probe-sets that were differentially regulated between any two examined stages of germination with a false discovery rate less than 5%. Because non-specific hybridization between paralogous genes could cause an inaccurate assignment of signal intensity to gene family members, the probe-sets flagged by Affymetrix as potentially cross-hybridizing were removed from further analysis. A total of 2537 barley and 13813 rice probe-sets were identified as germination regulated genes, and were used for further comparative analysis. A much higher number of germination regulated probe-sets were identified in rice than in barley. This was partly because the GeneChip Rice Genome Array has twice as many probe-sets as the GeneChip Barley Genome Array. In addition, probe-sets on the barley array were designed using EST sequences while those on the rice array were designed using genes predicted from the genome sequence, which is likely to lead to a lower percentage of germination regulated genes on the barley array than on the rice array.
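The probe-set selection described above can be illustrated with a minimal Python sketch. It assumes log2 signal intensities arranged as (probe-sets, stages, replicates) and uses the Benjamini-Hochberg procedure for the false discovery rate; the text does not state which FDR procedure was used, so that choice, the function names and the toy data are assumptions made only for illustration.

```python
import numpy as np
from scipy.stats import f_oneway


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    p = np.asarray(pvalues, dtype=float)
    n = p.size
    order = np.argsort(p)
    ranked = p[order] * n / np.arange(1, n + 1)
    # Enforce monotonicity, working down from the largest p-value.
    ranked = np.minimum.accumulate(ranked[::-1])[::-1]
    adjusted = np.empty(n)
    adjusted[order] = np.clip(ranked, 0.0, 1.0)
    return adjusted


def germination_regulated(log2_signal, fdr=0.05):
    """Flag probe-sets whose signal differs among stages (one-way ANOVA).

    log2_signal has shape (probe_sets, stages, replicates).
    """
    pvals = np.array([f_oneway(*probe).pvalue for probe in log2_signal])
    return benjamini_hochberg(pvals) < fdr


# Toy data (hypothetical): 6 probe-sets, 3 stages, 3 biological replicates.
rng = np.random.default_rng(0)
data = rng.normal(8.0, 0.2, size=(6, 3, 3))
data[0, 2, :] += 3.0  # probe-set 0 rises at the end of germination
data[1, 0, :] += 3.0  # probe-set 1 is high in dry seed only
flags = germination_regulated(data)
```

Cross-hybridizing probe-sets would still need to be removed afterwards using the array annotation, which this sketch does not attempt.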

Conservation and Divergence of Transcriptional Regulatory Programs Underlying Barley and Rice Germination
A total of 1507 pairs of barley and rice genes (BRs) with protein sequence similarity at an e-value less than 1e-50 were identified among the germination regulated barley and rice genes. The BRs contained 805 barley and 1054 rice genes (Table 1). Pearson correlation coefficients (PCC) between log2 signal intensities of each paired barley and rice gene at start-, mid- and end-stages of germination were calculated to determine the similarity of their expression patterns. Sixty percent of the BRs had a PCC value higher than 0.5, indicating that the barley and rice genes in each of the BRs had a good similarity in their transcript accumulation patterns (Figure 1, Table 2). However, forty percent of the BRs had a PCC value lower than 0.5, indicating that a significant percentage of BRs had low similarity or no similarity in their expression patterns. Thus, the BRs with high protein sequence similarity preferentially preserved their expression patterns after rice and barley diverged from their most recent ancestor. However, a significant percentage of the BRs had evolved into different gene expression patterns.
Figure 1. The germination regulated barley and rice genes (BRs) were paired randomly or paired based on their sequence similarity at an e-value less than 1e-50, and their PCC values were determined. The distribution of PCC values for BR genes with an e-value less than 1e-50 (dark blue) was compared with that of randomly paired BR genes (light blue). The percentage of BRs (Y-axis) in each defined PCC value range (X-axis) was graphed. doi:10.1371/journal.pone.0087261.g001
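As a brief illustration of the expression-pattern comparison, the following Python sketch computes the PCC between the log2 profiles of one barley gene and one rice gene across the three germination stages. The function name and the profile values are hypothetical and are not data from this study.

```python
import numpy as np


def expression_pcc(barley_log2, rice_log2):
    """Pearson correlation between two log2 expression profiles.

    Each profile holds one value per germination stage (start, mid, end).
    """
    x = np.asarray(barley_log2, dtype=float)
    y = np.asarray(rice_log2, dtype=float)
    return float(np.corrcoef(x, y)[0, 1])


# Hypothetical mean log2 signal intensities at start, mid and end stages.
barley = [7.1, 9.0, 10.2]
rice_similar = [6.5, 8.8, 9.1]    # rises like the barley profile
rice_opposite = [10.0, 8.0, 6.5]  # falls while the barley profile rises

similar_pcc = expression_pcc(barley, rice_similar)
opposite_pcc = expression_pcc(barley, rice_opposite)
```

Because PCC is insensitive to scale and offset, it compares the shape of the two profiles rather than their absolute signal levels, which is convenient when the two species were assayed on different arrays.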
A collection of randomly paired barley/rice genes was generated from the germination regulated barley and rice genes. The randomly paired BRs had a relatively symmetrical distribution of PCC values, with a slightly higher percentage in the PCC range from 0.8 to 1.0 than in the range from -0.8 to -1.0. Interestingly, twenty-seven percent of the randomly paired BRs had a PCC value greater than 0.8 (Figure 1).
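A high share of strongly correlated random pairs is not unexpected when each profile has only three time points. The Python sketch below is a simplified illustration and not the procedure used in this study: it assumes independent Gaussian profiles, which real expression data are not, and compares the simulated fraction of pairs with PCC above 0.8 with the value expected for the correlation of three independent points. The function name and simulation size are arbitrary.

```python
import math

import numpy as np


def random_pair_pcc(n_pairs, n_points=3, seed=0):
    """PCC values for pairs of independent Gaussian profiles."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=(n_pairs, n_points))
    y = rng.normal(size=(n_pairs, n_points))
    xc = x - x.mean(axis=1, keepdims=True)
    yc = y - y.mean(axis=1, keepdims=True)
    denom = np.sqrt((xc ** 2).sum(axis=1) * (yc ** 2).sum(axis=1))
    return (xc * yc).sum(axis=1) / denom


pcc = random_pair_pcc(100_000)
simulated = float((pcc > 0.8).mean())

# With three points, the null density of r is 1 / (pi * sqrt(1 - r^2)),
# so P(r > 0.8) = 1/2 - arcsin(0.8) / pi, roughly 0.20.
analytic = 0.5 - math.asin(0.8) / math.pi
```

Under these assumptions about one fifth of unrelated pairs exceed a PCC of 0.8 by chance alone, which is why sequence-based pairs are compared against a random-pair baseline rather than against zero.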
The percentage of BRs with similar expression patterns (PCC value from 0.5 to 1.0) positively correlated with their protein sequence similarities in the e-value range of 1e-5 to 1e-100 (Table 2). However, there was little difference in the distribution of PCC values between BRs with e-values ranging from 1e-50 to 1e-100 and BRs with e-values less than 1e-100. Chi-square analysis was performed to compare distributions of PCC values between randomly paired BRs and BRs within a given range of e-values. There was a significant difference in the distribution of PCC values between BRs with e-values from 1e-50 to 1e-100 and randomly paired BRs at P < 0.01 (Table 2). However, there was no significant difference in the distribution of PCC values between BRs with e-values from 1e-20 to 1e-50 and randomly paired BRs at a P value of 0.1. Thus, the BRs at e-values less than 1e-50 were used for identification of BRs that had conserved expression patterns.
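The chi-square comparison of PCC distributions can be sketched as follows. This is an illustrative Python example under stated assumptions: the bin edges, the function name and all counts are hypothetical, and the binning actually used for Table 2 is not reproduced here.

```python
import numpy as np
from scipy.stats import chi2_contingency


def compare_pcc_distributions(pcc_a, pcc_b, edges=(-1.0, -0.5, 0.0, 0.5, 1.0)):
    """Chi-square test on binned PCC values from two sets of gene pairs."""
    counts_a, _ = np.histogram(pcc_a, bins=edges)
    counts_b, _ = np.histogram(pcc_b, bins=edges)
    chi2, p_value, dof, _ = chi2_contingency(np.vstack([counts_a, counts_b]))
    return chi2, p_value, dof


# Hypothetical PCC values placed at bin midpoints for a deterministic example.
midpoints = np.array([-0.75, -0.25, 0.25, 0.75])
sequence_paired = np.repeat(midpoints, [10, 15, 25, 50])  # skewed to high PCC
randomly_paired = np.repeat(midpoints, [25, 25, 25, 25])  # flat distribution

chi2, p_value, dof = compare_pcc_distributions(sequence_paired, randomly_paired)
```

Every bin must contain at least one observation across the two sets, otherwise the expected counts are zero and the test is undefined.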

Barley and Rice Genes with Conserved Protein Sequences and Germination Regulated Expression Patterns (cBRs)
A total of 483 BRs with a PCC value higher than 0.9 were identified among the 1507 germination regulated BR genes. Those BRs accounted for 32% of the germination regulated BRs. The 483 BRs were comprised of 368 distinct barley genes and 388 distinct rice genes. Those genes represented a small percentage of the 2537 barley and 13813 rice germination regulated genes.
Thus, the majority of the germination-regulated genes had diverged beyond our thresholds in protein sequences, gene expression patterns, or both. The 483 BRs were further merged into 262 single-gene cBRs containing only one gene from each species and 60 multi-gene cBRs (Table 1 and Table 3). Barley and rice genes in each of those BRs were differentially regulated during seed germination, and shared strong similarity in both protein sequences and transcriptional expression patterns. We referred to the BRs as conserved BRs (cBRs). Each multi-gene cBR had at least three genes with a one-to-many, many-to-one or many-to-many barley and rice gene relationship. Any pair of "orthologous" or paralogous genes in each multi-gene cBR had sequence similarity with an e-value less than 1e-50 and expression pattern similarity with a PCC value higher than 0.9. The largest multi-gene cBR (cBR_M2) encoded a U-box domain containing RING protein family and had a total of 20 rice and barley genes (Table 3). However, the numbers of rice and barley genes in each cBR were not always equally distributed. For example, cBR_M2 was composed of 17 barley genes and 3 rice RING protein genes.
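One way to picture how gene pairs are merged into single-gene and multi-gene sets is to treat each pair as an edge and collect connected groups. The Python sketch below does this with a union-find structure. It is an assumption about the grouping step, not the procedure used in this study; in particular, it does not check that every pair of genes inside a group meets both thresholds, as stated above for the multi-gene cBRs. The gene identifiers are hypothetical.

```python
from collections import defaultdict


def group_pairs(pairs):
    """Merge gene pairs that share a gene into sets (connected components)."""
    parent = {}

    def find(gene):
        parent.setdefault(gene, gene)
        while parent[gene] != gene:
            parent[gene] = parent[parent[gene]]  # path halving
            gene = parent[gene]
        return gene

    for barley_gene, rice_gene in pairs:
        parent[find(barley_gene)] = find(rice_gene)

    groups = defaultdict(set)
    for gene in list(parent):
        groups[find(gene)].add(gene)
    return list(groups.values())


# Hypothetical barley (Hv) and rice (Os) gene pairs passing both thresholds.
pairs = [("Hv1", "Os1"), ("Hv2", "Os2"), ("Hv3", "Os2"),
         ("Hv4", "Os4"), ("Hv4", "Os5")]
gene_sets = group_pairs(pairs)
single_gene_sets = [s for s in gene_sets if len(s) == 2]
multi_gene_sets = [s for s in gene_sets if len(s) >= 3]
```

In this toy input the first pair forms a single-gene set, while the many-to-one and one-to-many relationships each produce a multi-gene set of three genes.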

Diverse Gene Expression Patterns Were Preserved in Barley and Rice Germination
There are eight possible expression patterns based on up- or down-regulation of a gene in the early and late germination phases. All of the possible expression patterns were observed for the cBRs, and were preserved in both rice and barley since their divergence (Tables 3 and 4). Table 4 summarizes the cBRs in the eight expression patterns. A total of 71 cBRs showed up-regulated expression patterns in both early and late germination phases, and made up the largest group of cBRs (Group 1). Many cBRs in Group 1 encoded proteins related to cell wall metabolism, cell organization, chromatin structure, protein degradation, and signaling G-proteins. Interestingly, Group 3 had 28 cBRs that were transiently up-regulated in the early germination phase. Expression levels of most cBRs in Group 3 at the end of germination were down-regulated to the levels at the dry seed stage. Preserving transient up-regulation in early germination followed by down-regulation in late germination in both barley and rice indicated that those genes likely participated in biological processes specific to early germination. Many cBRs in Group 3 encoded proteins involved in cell wall modification, protein degradation, protein modification, and signal transduction. Cell wall modification is required to weaken cell walls during early germination to permit radicle protrusion and to provide access to stored metabolites in the endosperm [23]. Also in Group 3 were proteins such as F-box proteins, receptor-like kinases, G-proteins and calcium-dependent protein kinases, which play important roles in a variety of signal transduction pathways. Those signaling components likely played roles in transducing a variety of signals in the early germination phase to initiate the biological pathways required in seed germination. Sixty-two cBRs in Group 8 were down-regulated in both early and late stages. They encoded proteins with a wide range of biological functions.
Those cBRs highly accumulated in dry mature grains and their accumulation gradually decreased over the course of seed germination. This raises the possibility that these cBRs encoded proteins involved in seed development and maturation. The highly accumulated transcripts were degraded over the course of seed germination.
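The count of eight patterns is consistent with each phase taking one of three states (up, unchanged, down), minus the combination that is unchanged in both phases and therefore not germination regulated. That reading is an assumption, as are the 2-fold threshold and the function names in the Python sketch below; the group numbering of Table 4 is not reproduced.

```python
from itertools import product

STATES = ("up", "unchanged", "down")


def call_phase(log2_fold_change, threshold=1.0):
    """Classify one germination phase from its log2 fold change."""
    if log2_fold_change >= threshold:
        return "up"
    if log2_fold_change <= -threshold:
        return "down"
    return "unchanged"


def expression_pattern(dry, mid, end, threshold=1.0):
    """Return (early, late) calls from log2 signals at the three stages."""
    early = call_phase(mid - dry, threshold)
    late = call_phase(end - mid, threshold)
    return early, late


# Every combination except "unchanged in both phases".
ALL_PATTERNS = [p for p in product(STATES, repeat=2)
                if p != ("unchanged", "unchanged")]

transient = expression_pattern(6.0, 9.0, 6.2)    # up early, back down late
late_only = expression_pattern(6.0, 6.1, 11.0)   # flat early, up late
```

The transient profile corresponds to the behaviour described for Group 3, and the late-only profile to genes that change little in early germination and rise strongly afterwards.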

The cBRs Encoded Proteins in Diverse Biological Pathways
The genes represented on the rice and barley GeneChips are classified into 35 functional groups based on their functions in metabolic pathways, signaling pathways and gene families in MapMan and PageMan [24,25]. The cBRs encoded proteins in most of the functional groups (Figure 2 and Table 3). For example, 13 cBRs encoded proteins in cell wall metabolic pathways while 22 cBRs were functionally related to signaling pathways. Eighty-nine cBRs encoded proteins that are not classified into any of the functional groups. cBRs in the same functional group often had diverse expression patterns. For example, cBRs in stress-related pathways had both up-regulated and down-regulated expression patterns in the early phase of germination. Conversely, cBRs in several functional groups had similar expression patterns. For example, all three cBRs in the biodegradation of xenobiotics pathway were down-regulated in both early and late phases of germination, while all eight cBRs in DNA related pathways except cBR_M23 were up-regulated in both early and late phases of germination (Figure 2 and Table 3). Interestingly, a large number of transcription factor genes are differentially regulated over the course of barley germination [9]. However, a limited number of cBRs encoded transcription factors. Only a PHD finger protein (cBR_207) and an AP2/EREBP protein (cBR_191) were down-regulated during seed germination (Table 3). Therefore, germination regulated transcription factor genes evolved quickly in their protein sequences and/or their expression patterns.

Biological Pathways Regulated by Conserved Transcriptional Regulatory Programs
Representation analysis of cBRs in each functional group showed that the cBRs in a number of biological pathways were preferentially regulated in conserved expression programs ( Figure 3A). Early germination up-regulated cBRs were overrepresented in cell wall metabolic pathways and peroxidase gene family ( Figure 3A, 3B and 3C). A total of 13 cBRs such as arabinogalactan protein (AGP), cellulose synthase, beta-glucanase, beta-D-xylosidase, expansins and xyloglucan endotransglucosylase were identified in the cell wall metabolic pathway. All of the 13 cBRs were up-regulated during early germination, except that cBR_228 encoding beta-D-xylosidase was slightly down-regulated ( Figure 3B). In addition, five cBRs encoded peroxidases; and four of them were up-regulated in the early germination phase ( Figure 3C). Most of the peroxidase genes were also preferentially up-regulated in the late germination phase. It was reported previously that peroxidase activity increases significantly in the micropylar end of germinating tomato seeds [26]. The conserved up-regulation of peroxidase genes in barley and rice provides additional evidence supporting the functional importance of peroxidase in seed germination.
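The idea behind the over-representation analysis can be illustrated with a single 2x2 Fisher's exact test in Python. This is a sketch only and does not reproduce the PageMan workflow: the cell wall counts (12 of 13 up-regulated in early germination) come from the text, whereas the split of the remaining 309 cBRs is hypothetical and chosen purely for illustration.

```python
from scipy.stats import fisher_exact


def overrepresentation_p(in_group_hits, in_group_total,
                         background_hits, background_total):
    """One-sided Fisher's exact test for enrichment of hits in a group."""
    table = [
        [in_group_hits, in_group_total - in_group_hits],
        [background_hits, background_total - background_hits],
    ]
    _, p_value = fisher_exact(table, alternative="greater")
    return p_value


# Cell wall cBRs: 12 of 13 up-regulated in early germination (from the text).
# Remaining cBRs: 120 of 309 up-regulated early (hypothetical counts).
p_cell_wall = overrepresentation_p(12, 13, 120, 309)
```

A small p-value here would indicate that early up-regulation is more frequent among cell wall cBRs than among the remaining cBRs; in a full analysis the test is repeated for every functional group and regulation pattern.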
The cBRs encoding chromatin remodeling and structural proteins were preferentially up-regulated during the late germination phase. There were 8 cBRs in chromatin structure pathways. All of them were dramatically up-regulated during the late germination phase by more than 4.7 fold with an average of 30 fold. However, expression levels of those cBRs had no or little change during the early germination phase ( Figure 3D). Thus, the specific and strong up-regulation of chromatin-related genes in the late germination phase was conserved in rice and barley. Five of the eight cBRs encoded histone proteins. For example, the cBR_M23 was composed of 8 barley and 2 rice histone H4 genes. Two of the eight cBRs encoding replication licensing factor MCM proteins were specifically up-regulated in late germination phase. MCM encodes a conserved minichromosome maintenance protein and plays an essential function as a helicase in DNA replication elongation in eukaryotes. MCM proteins also participate in other chromosome processes including transcription, chromatin remodeling, and genome stability [27].

Biological Pathways and Gene Families Containing cBRs with Diverse Expression Patterns
Interestingly, the cBRs in a number of signaling pathways and gene families had diverse expression patterns. The cBRs encoding 14-3-3 proteins, G-proteins, receptor kinases, calmodulin and calcium-dependent protein kinase in signaling pathways were identified. The expression patterns of those cBRs were highly diverse (Table 3 and Figure 4A). A total of 12 cBRs encoded G-proteins, but their expression patterns were highly diverse over the course of germination. For example, cBR_M17 was up-regulated by 13-fold in the early germination phase. In contrast, another ras-related G-protein cBR (cBR_246) was down-regulated by 2.4-fold in the early germination phase. Two cBRs (cBR_M37 and cBR_71) encoded 14-3-3 proteins. cBR_71 was down-regulated while cBR_M37 was up-regulated over the course of seed germination. Fourteen cBRs encoded proteins in ubiquitin/26S proteasome-mediated protein degradation pathways, which often play important roles in a variety of signal transduction pathways (Figure 4B). Most of the cBRs encoded E2 and E3 regulatory proteins such as E2, HECT, RING and F-box proteins, and had diverse expression patterns. For example, four cBRs encoding F-box proteins were differentially regulated during seed germination, and showed diverse expression patterns.
Both alpha- and beta-amylases are key enzymes required in seed storage starch mobilization during seed germination and seedling growth [1,23]. Interestingly, the cBRs encoding alpha- and beta-amylases had opposite transcriptional patterns. The alpha-amylase cBR was up-regulated in late germination stages while the beta-amylase cBR was down-regulated in late germination (Figure 4C). In addition, two cBRs encoding cysteine proteases and two cBRs encoding serine proteases were identified. Both cysteine and serine proteases were suggested to play a role in protein mobilization during seed germination [28]. Interestingly, one cysteine protease cBR and one serine protease cBR were up-regulated while the others were down-regulated in both the early and late germination phases (Figure 4D). The functional and evolutionary significance of preserving the opposite transcriptional regulatory programs for these functionally related genes remains to be explored.

Discussion
Barley and rice diverged from their common ancestor 50 million years ago [12]. However, they share a great similarity morphologically and physiologically in germination and seedling growth. In this study, we measured the transcriptomes of germinating rice grains at dry, mid- and end-points of seed germination, which should represent the most distinct stages of the dynamic transcriptional changes over the seed germination process. Having previously determined transcriptomes of barley at the three equivalent stages [9], we designed a systems and evolutionary strategy to compare the dynamic transcriptomic changes over the course of seed germination to gain insight into the divergence and conservation of gene regulatory programs underlying rice and barley germination.
Figure 3. Panel 3A shows biological pathways and families over- and under-represented with early or late germination regulated cBRs. The functionalities were displayed on the right, and the germination phase and regulation patterns were displayed on the top. The representation analysis was conducted for all cBRs. Log2 fold change values in early and late germination phases were used in the PageMan analysis. Fisher's exact test and an ORA cutoff value of 1 were used. A false color scale was used to indicate the statistic Z value. Blue and red indicate significance in over-representation and under-representation. The cBRs encoding proteins in cell wall metabolism and peroxidase families were preferentially regulated in the early germination phase (Figure 3B and 3C) while the cBRs encoding proteins in chromatin structure/modeling pathways were preferentially up-regulated in the late germination phase (Figure 3D). Log2 of average fold changes from dry seed over the course of germination for the cBRs in those pathways were graphed. Dry, middle (Mid) and end (End) points of germination were indicated on the X-axis. doi:10.1371/journal.pone.0087261.g003
One-Way ANOVA analysis of the transcriptomes revealed that 2537 barley and 13813 rice genes were differentially regulated over the course of seed germination. Comparing their encoding protein sequences and expression patterns identified 322 sets of conserved barley and rice genes (cBRs) sharing strong similarity in both protein sequences and gene expression patterns. The collection of cBRs contained 368 barley genes and 388 rice genes. Thus, only a very small percentage of the germination-regulated genes preserved their protein sequences and gene expression patterns; and a significant divergence occurred in transcriptional regulatory programs underlying rice and barley germination since the barley-rice divergence. As expected, protein sequence similarity of germination regulated barley and rice genes positively correlated to the similarity of their expression patterns, suggesting co-evolution of protein functions and gene expression patterns.
Biological functions of genes are mainly determined by their protein sequences and their expression patterns. Both protein sequences and expression patterns change quickly if the genes have no functional significance [17,29,30,31]. Therefore, we hypothesized that the germination regulated expression patterns and protein sequences of the barley and rice genes in each cBR have been preserved for 50 million years after the split of rice and barley from their common ancestor because the genes are functionally important to seed germination, and should contribute to the characteristics shared by rice and barley germination. Additionally, 60 of the 322 cBRs were multi-gene cBRs. Each multi-gene cBR contained at least one pair of paralogs. Duplicated paralogous genes are subjected to few functional constraints, which offers a great opportunity for their sub-functionalization or neo-functionalization through divergence of their protein sequences and/or expression patterns [17,19,20,21,32]. Preserving germination regulated expression patterns and protein sequences of those paralogous genes in the multi-gene cBRs suggests that they may be subjected to negative selection, and provides additional evidence supporting their functional significance in seed germination.
We identified a number of biological pathways enriched with cBRs of similar expression patterns, suggesting that their underlying transcriptional regulatory programs are highly conserved in rice and barley. Preserving coordinate regulation of gene expression patterns across rice and barley in each of those pathways provided further evolutionary evidence for the functional significance of those biological pathways in seed germination. Most of those biological pathways have previously been proposed to be functionally important in seed germination based on a variety of evidence. For example, a total of 13 cBRs were identified in the cell wall metabolic pathway, and 12 of the 13 cBRs were up-regulated during early germination. Cell wall metabolism plays an important role in germination for most angiosperm seeds. It is required for two important germination processes, radicle elongation growth and endosperm weakening [33,34]. It was previously reported that endosperm weakening is accompanied by the induction of cell wall remodeling enzymes in several species. They include endo-beta-mannanase, beta-1,3-glucanases, expansins, xyloglucan endotransglycosylase, pectin methylesterase, polygalacturonase and arabinogalactan protein [34]. We identified cBRs encoding each of these proteins. Three cBRs encoding expansins were up-regulated during early germination. Expansins are involved in modifying the cell wall matrix during plant growth and development, and have been demonstrated to have cell wall extension activity in vitro and in vivo [35]. It was proposed that expansins are involved in the expansion of cucumber hypocotyls [36]. During germination of tomato seeds, a specific alpha-expansin transcript accumulates in the endosperm cap, presumably in association with the weakening of cell walls that facilitates emergence of the radicle [37]. The functional significance of expansins in germination might be an important force preserving the early germination up-regulated expression patterns and protein sequences of the cBRs. Cell wall precursor synthesis, cellulose synthesis and cell wall modification genes are up-regulated during the early germination phase in barley [9]. A number of cell wall degradation related genes are preferentially expressed in after-ripening barley coleorhiza, and are likely to be associated with breaking seed dormancy [7]. Preserving early germination up-regulation of those cell wall metabolic enzyme genes in barley and rice also provided further evidence supporting the hypothesis that the early germination process turns on the transcriptional regulatory programs underlying cell wall metabolism to weaken the coleorhiza and facilitate root emergence.
The cBRs encoding chromatin remodeling and structural proteins were preferentially up-regulated during the late germination phase. There were 8 cBRs in chromatin structure pathways. All of them were dramatically up-regulated during the late germination phase by more than 4.7 fold with an average of 30 fold. Histone modification and chromatin remodeling play important roles in reprogramming transcriptional programs. Chromatin-based regulation of seed dormancy and germination was also reported [38,39,40]. Mutation of histone monoubiquitination genes in Arabidopsis reduces ubiquitinated forms of histone H2B and alters expression levels for several dormancy-related genes [39]. A transient histone deacetylation event occurs during seed germination one day after imbibition, and is likely to serve as a key developmental signal that affects the repression of a number of histone deacetylase regulated genes [40]. Preserving preferential up-regulated expression of cBRs in late germination phase suggests an important role for histone modification and chromatin remodeling in germination, which likely supports radicle elongation and quick seedling growth in late and post-germination phase.
Interestingly, a number of biological pathways and gene families contained cBRs with diverse expression patterns. The cBRs encoding proteins in signaling pathways such as G-proteins and kinases often had diverse germination regulated expression patterns. G-proteins are involved in seed germination [41]. Diverse expression patterns of those G-protein cBRs suggested The cBRs encoding Gproteins and 14-3-3 proteins (4A), proteins in ubiquitin dependent degradation pathways (4B), cysteine and serine proteases (4C), and alpha and beta amylases (4D) with diverse expression patterns were shown. Log2 of average fold changes in reference to dry seeds over the course of germination for each cBR was graphed. Dry, middle (Mid) and end (End) points of germination were indicated X-axis. The diagram of ubiquitin dependent degradation pathway was displayed in 4B. doi:10.1371/journal.pone.0087261.g004 that those G-protein cBRs may participate in diverse signaling pathways in seed germination process. Thus, those cBRs had distinct biological functions in the most recent ancestor of barley and rice, and their protein sequences and germination regulated expression patterns have been preserved after their split from the ancestor. In addition, two distinct regulatory programs controlling alpha-and beta-amylases production were conserved in barley and rice. Starch, a major storage reserve in rice and barley grains, is mobilized during seed germination to support seedling growth. Alpha-and beta-amylases are key enzymes required in starch mobilization [1,23]. The alpha-amylase cBR was up-regulated in late germination stages while the beta-amylase cBR was downregulated in late germination ( Figure 3D). Alpha-amylase genes are up-regulated in cereal grain germination and seedling growth. They are also induced by GA in barley aleurone tissues [10,15,23,42,43]. 
The preserved up-regulation of alpha-amylase genes is consistent with their biological function in starch degradation during seed germination and seedling growth [44]. In contrast, previous biochemical studies showed that beta-amylase is synthesized and stored exclusively in the starchy endosperm during seed maturation rather than in the aleurone after the initiation of germination [45,46]. The accumulation level of beta-amylase transcript does not respond to GA treatment in barley aleurone [10]. Thus, the alpha- and beta-amylase cBRs had two opposite expression patterns that have been preserved in barley and rice seed germination over the 50 million years since the barley-rice divergence. Two cBRs encoding proteases also showed opposite expression patterns during seed germination. The functional and evolutionary significance of preserving two opposite transcriptional regulatory programs for these functionally related genes remains to be explored.
We also hypothesized in this study that the barley and rice genes in each cBR have equivalent or similar biological functions because of their strong similarity in protein sequences and expression patterns. Rice serves as a model for monocot plant research, and has rich research resources such as a large collection of genetic mutants and substantial genomic information. Barley germination has been extensively studied biochemically and physiologically. Identification of the functionally equivalent rice and barley genes should greatly facilitate integration of research resources and knowledge from rice and barley research. In addition, gene expression changes in response to a biological process have been used successfully to predict the functional involvement of a gene in that process. However, such prediction is often limited to a single species, where it is difficult or even impossible to distinguish coincidentally regulated genes from those that are physiologically important. We hypothesized that the evolutionary conservation in the expression patterns of inter-species and intra-species homologous genes could be used to predict their biological functions with higher confidence [47,48]. Overall, the evolutionary and systems strategies described in this manuscript have broad application in predicting genes that are functionally important and equivalent in a biological process, and in translating research and knowledge across plant species with greater confidence.

Plant Growth and Harvest
Oryza sativa L. ssp. japonica (cv. Nipponbare) seeds were used in the experiment. Plump and healthy seeds were imbibed in water for three hours and then germinated on water-saturated germination pack in the dark at 30 °C. Twenty seeds were planted in each 15 cm diameter Petri dish and spaced evenly to reduce variation. The seeds were harvested at each representative time point: 0 h (dry grains), 21 h and 42 h. Three replications were conducted for each time point. Each replication represented an independent germination experiment under identical growth conditions. The seeds for each replication were pooled, immediately frozen in liquid nitrogen, and then stored at -80 °C for RNA extraction.

RNA Purification
Plant tissue (2 g) was ground in liquid nitrogen using a mortar and pestle, followed by the addition of 10 ml extraction buffer (4% p-aminosalicylic disodium, 1% 1,5-naphthalenedisulfonic acid) and 10 ml phenol. The mixture was inverted several times, then 10 ml chloroform was added and the solution was homogenized for 45 seconds using a Polytron. After centrifugation, the aqueous phase was transferred into a new tube. Calcofluor white (60 µl of 10% solution) was added, mixed thoroughly and centrifuged for another 15 min at 4 °C, 12,000 rpm. RNA in the supernatant was precipitated using 1/10 volume of 3 M NaOAc and 2 volumes of 100% ethanol. After centrifugation, the pellet was dissolved in 8 ml water. Then 5 ml of 8 M LiCl was added and the solution was incubated on ice overnight. The resulting RNA pellet, isolated after centrifugation, was dissolved in water. RNA quality and quantity were determined using a Nano-Drop AN1000 (Nano-Drop, Wilmington, DE) and an Agilent 2100 Bioanalyzer (Agilent, Palo Alto, CA).

Microarray Assay and Data Analyses
cDNA and biotin-labeled cRNA were prepared and analyzed as recommended by Affymetrix (Santa Clara, CA). According to the manufacturer's protocol, 10 µg of total RNA was used in a reverse transcription reaction to generate first-strand cDNA using SuperScript II (Invitrogen, Carlsbad, CA). After second-strand synthesis, double-stranded cDNA was used in an in vitro transcription reaction to generate biotinylated cRNA. For each sample, 10 µg of fragmented cRNA was used in the hybridization. Staining and scanning steps were performed according to the manufacturer's recommended protocols (Affymetrix, Inc., Santa Clara, CA).
The GeneChip probe-level data were background-corrected, normalized and summarized using the GC-Robust Multi-Array Analysis (GC-RMA) approach [49]. In this approach, quantile normalization was used to remove the variation introduced during sample preparation, manufacturing of the arrays, and processing of the arrays, so that GeneChips from different time points and replicates were comparable. The expression value for each gene was derived from its probe pairs based on a log-scale linear additive model [50].
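The quantile normalization step can be illustrated with a minimal Python sketch. It is not the GC-RMA implementation used in the study: background correction and the additive-model summarization are omitted, the function name `quantile_normalize` is hypothetical, and the intensities are made-up toy values.

```python
def quantile_normalize(chips):
    """Quantile-normalize a list of chips (equal-length lists of intensities).

    Each value is replaced by the mean, across chips, of the values that
    share its rank, so that all chips end up with the same distribution.
    """
    n_probes = len(chips[0])
    sorted_chips = [sorted(chip) for chip in chips]
    # Mean intensity at each rank across all chips.
    rank_means = [
        sum(sc[rank] for sc in sorted_chips) / len(chips)
        for rank in range(n_probes)
    ]
    normalized = []
    for chip in chips:
        # Probe indices ordered from lowest to highest intensity
        # (ties are broken by probe position, a simplification).
        order = sorted(range(n_probes), key=lambda i: chip[i])
        out = [0.0] * n_probes
        for rank, probe_idx in enumerate(order):
            out[probe_idx] = rank_means[rank]
        normalized.append(out)
    return normalized


# Toy example: two chips, three probes each (hypothetical values).
toy_chips = [[5.0, 2.0, 3.0], [4.0, 1.0, 6.0]]
toy_normalized = quantile_normalize(toy_chips)
```

After normalization every chip contains the same set of values, which is what makes chips from different time points and replicates directly comparable.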
The pre-normalized data were then analyzed with GeneSpring 7.2 software (Silicon Genetics, Redwood City, CA). Within each array, a further "per gene: normalize to median" step (with cutoff 0.01) was applied. Probe sets with absent calls across all 9 chips, as determined with Microarray Suite 5.0 (Affymetrix, Santa Clara, CA), were considered the most unreliable data and were filtered out. Statistical analyses were performed using the one-way ANOVA provided in GeneSpring 7.2 (parametric test, variances assumed equal; Benjamini and Hochberg multiple testing correction; FDR set at 0.05) to identify genes that were differentially expressed between any two time points during seed germination.
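The Benjamini and Hochberg correction named above can be sketched in a few lines of Python. This is an illustration of the general procedure only, not of GeneSpring's internal code; the function name `benjamini_hochberg` and the p-values are hypothetical, and the ANOVA that would produce the p-values is not shown.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    # Gene indices ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values
    # never increase as the raw p-value decreases.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical ANOVA p-values for four genes.
toy_pvalues = [0.01, 0.04, 0.03, 0.20]
toy_adjusted = benjamini_hochberg(toy_pvalues)
# Genes passing an FDR cutoff of 0.05.
toy_significant = [i for i, q in enumerate(toy_adjusted) if q <= 0.05]
```

With these toy values only the first gene passes the 0.05 cutoff, although three of the four raw p-values are below 0.05.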
Because non-specific hybridization between homologous genes could produce an inaccurate correlation of their expression profiles, we excluded probe sets flagged by Affymetrix as potentially cross-hybridizing. The flagged probe sets included those with the suffix _x_at, which designates probe sets where it was not possible to select either a unique probe set or a probe set with identical probes among multiple transcripts; _s_at, which designates probe sets with common probes among multiple transcripts from different genes; and _i_at, _g_at, _f_at and _r_at.
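A suffix-based filter of this kind is straightforward to express in code. The sketch below is illustrative only: the helper names and the example probe set IDs are hypothetical, and it assumes that the flag always appears as the final suffix of the ID.

```python
# Suffixes listed in the text as marking potentially cross-hybridizing probe sets.
CROSS_HYB_SUFFIXES = ("_x_at", "_s_at", "_i_at", "_g_at", "_f_at", "_r_at")


def is_unique_probe_set(probe_set_id):
    """True if the ID carries none of the cross-hybridization suffixes."""
    return not probe_set_id.endswith(CROSS_HYB_SUFFIXES)


def filter_probe_sets(probe_set_ids):
    """Keep only probe sets that are not flagged as cross-hybridizing."""
    return [p for p in probe_set_ids if is_unique_probe_set(p)]


# Hypothetical IDs for illustration.
example_ids = ["Contig100_at", "Contig100_x_at", "Contig200_s_at", "Os.1.1.S1_at"]
kept = filter_probe_sets(example_ids)  # ["Contig100_at", "Os.1.1.S1_at"]
```

The match is case-sensitive, so an ID ending in `S1_at` is retained while one ending in `_s_at` is removed.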

Identification of Barley-Rice (BR) Genes
The exemplar sequences of all probe sets on the Barley Genome GeneChip and Rice Genome GeneChip were downloaded (http://www.affymetrix.com/products/arrays). An all-against-all reciprocal tBLASTX search was used to identify BRs at a given level of sequence homology. Pearson correlation coefficients (PCCs) of log2 expression values were calculated between homologs in R. Barley and rice genes with significantly changed expression levels during seed germination were permuted to produce 100,000 random pairs to determine the distribution of PCCs for the randomized population. Chi-square analysis was used to compare the PCC values observed between the barley and rice genes in each BR with the PCC values from the randomized pairs. The microarray data used in this study were deposited in the NCBI Gene Expression Omnibus database (GSE23595).
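The PCC calculation and the random-pair null distribution can be sketched as follows. The study performed these calculations in R; this Python version is only an illustration under stated assumptions. The function names, the gene IDs, the three-point expression profiles and the sampling scheme (independent draws with replacement) are all hypothetical and may differ from the authors' procedure.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")  # undefined for a flat profile
    return sxy / math.sqrt(sxx * syy)


def random_pair_pccs(barley, rice, n_pairs, seed=0):
    """PCCs for randomly paired barley and rice genes (null distribution).

    barley and rice map gene IDs to log2 expression profiles.
    """
    rng = random.Random(seed)
    barley_ids, rice_ids = list(barley), list(rice)
    return [
        pearson(barley[rng.choice(barley_ids)], rice[rng.choice(rice_ids)])
        for _ in range(n_pairs)
    ]


# Hypothetical log2 profiles at the dry, middle and end time points.
toy_barley = {"b1": [7.0, 8.5, 10.0], "b2": [9.0, 8.0, 6.5], "b3": [5.0, 7.5, 6.0]}
toy_rice = {"r1": [6.0, 7.0, 9.5], "r2": [8.0, 7.5, 5.0], "r3": [4.0, 4.5, 7.0]}

observed = pearson(toy_barley["b1"], toy_rice["r1"])
null_pccs = random_pair_pccs(toy_barley, toy_rice, n_pairs=1000)
```

Comparing the observed PCCs of homologous pairs with `null_pccs` indicates how much more often homologs are co-regulated than randomly paired germination-regulated genes.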