Comprehensive Analysis of Genic Male Sterility-Related Genes in Brassica rapa Using a Newly Developed Br300K Oligomeric Chip

To identify genes associated with genic male sterility (GMS) that could be useful for hybrid breeding in Chinese cabbage (Brassica rapa ssp. pekinensis), floral bud transcriptome analysis was carried out using a B. rapa microarray with 300,000 probes (Br300K). Among 47,548 clones deposited on a Br300K microarray with seven probes of 60 nt length within the 3' 150 bp region, a total of 10,622 genes were differentially expressed between fertile and sterile floral buds; 4,774 and 5,848 genes were up-regulated over 2-fold in fertile and sterile buds, respectively. However, the expression of 1,413 and 199 genes showed fertile and sterile bud-specific features, respectively. Genes expressed specifically in fertile buds, possibly GMS-related genes, included homologs of several Arabidopsis male sterility-related genes, genes associated with the cell wall and synthesis of its surface proteins, pollen wall and coat components, signaling components, and nutrient supplies. However, most early genes for pollen development, genes for primexine and callose formation, and genes for pollen maturation and anther dehiscence showed no difference in expression between fertile and sterile buds. Some of the known genes associated with Arabidopsis pollen development showed similar expression patterns to those seen in this study, while others did not. BrbHLH89 and BrMYP99 are putative GMS genes. Additionally, 17 novel genes identified only in B. rapa were specifically and highly expressed only in fertile buds, implying their possible involvement in male fertility. All data suggest that Chinese cabbage GMS might be controlled by genes acting in post-meiotic tapetal development that are different from those known to be associated with Arabidopsis male sterility.


Introduction
Pollen development, a process stemming from anther cell division and differentiation leading to male meiosis, as well as pollen wall and coat development and anther dehiscence, relies on the functions of numerous genes from both the microspore itself and sporophytic anther tissues including the tapetum [1][2][3][4][5][6][7]. Since pollen development is known to be regulated by the levels of transcripts and small RNAs [8], transcriptome analysis can provide insights into male sterility.
During the last decade, transcriptomic studies of the anther have identified thousands of transcripts expressed in various plant species, including B. oleracea [9]. In the model plant Arabidopsis, gene expression profile studies by microarray during pollen development have been extensively carried out to identify genes specific for stamen [10][11][12][13][14] and pollen development [15][16][17][18][19][20]. Since the Brassica and Arabidopsis genera share about 85% exon sequence similarity [21], the Arabidopsis microarray was applied to Brassica species [22] to investigate gene expression in flower buds of the Ms-cd1 (male sterile mutants of B. oleracea) [23] and in male sterility in B. napus [24,25]. However, these arrays represent parts of genes for each plant, and do not cover the majority of genes. Using a B. rapa-specific microarray, transcriptome analysis from floral buds, which include both gametophytic and sporophytic tissues, was conducted to identify genes associated with genic male sterility (GMS) in Chinese cabbage.
Chinese cabbage (Brassica rapa L. ssp. pekinensis), a popular leafy vegetable, is a cross-pollinating crop with significant heterosis; however, F1 seed production using manual pollination is limited by the small reproductive organs and the small number of seeds per fruit. Therefore, the method of choice to date is to use self-incompatible lines or male sterile lines. Because the utilization of self-incompatible lines is hampered by difficulty in parent reproduction, inbreeding depression after selfing for multiple generations, and contamination with non-hybrid seed, the use of male sterile lines appears to be a more promising method for hybrid seed production in Chinese cabbage. In Chinese cabbage, two types of male sterile sources are available: GMS and cytoplasmic male sterility (CMS) [41]. F1 hybrid seeds from CMS lines have not been widely used because the F1 plants do not show heterosis, but rather chlorosis (a cytoplasmic negative effect), at low temperatures. By contrast, GMS has more obvious advantages, such as stable and complete sterility, extensive distribution of restorers, and no negative cytoplasmic effect; thus it has been considered to be a good male sterile resource.
Previously, Feng et al. [42,43] obtained four 100% male sterile lines in Chinese cabbage by mutual crossing of nine AB lines. They found that male sterility was controlled by three alleles at one locus: "Msf" as the dominant restorer, "Ms" as the dominant sterile allele, and "ms" as the recessive fertile allele. The dominance relationship is "Msf" > "Ms" > "ms", as described in the genetic model shown in Figure S1. Although the 100% male sterile GMS line has been utilized in commercial Chinese cabbage hybrid seed production in China, the molecular genetic mechanisms of GMS remain unknown. To identify the Msf gene(s) and understand GMS mechanisms in Chinese cabbage, we carried out microarray experiments using the newly developed Br300K chip designed from 47,548 B. rapa Unigenes. The results revealed that the Chinese cabbage GMS mechanism might be different from the Arabidopsis one. Many genes regulating pollen wall and coat formation processes were specifically up-regulated in the fertile line, but down-regulated in the sterile line. All data analyzed in this study indicated that Chinese cabbage GMS might be controlled by genes acting in post-meiotic tapetal development.

Plant materials
As shown in Figure S1, fertile plants (MsfMs) and sterile plants (MsMs) were obtained by planting seeds from a cross between male fertile (MsfMs) and sterile (MsMs) plants, which segregated in a 1:1 ratio. The seeds were sown and grown in a greenhouse at Chungnam National University in spring and autumn of 2009 and 2010. After flowering, MsfMs and MsMs plants were identified and floral buds were sampled from at least 10 plants with transcriptome profiles representing 'f' difference, each at different developmental stages. The bud samples were divided into three and four pools for sterile and fertile buds, respectively, and stored at -70 °C until use.
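The three-allele model and the 1:1 segregation described above can be illustrated with a short sketch. It assumes only the dominance order stated in the text (Msf > Ms > ms) and equal gamete frequencies; the function names `phenotype` and `cross` are hypothetical and are not part of the original study.

```python
from collections import Counter
from itertools import product

# Dominance order taken from the text: Msf > Ms > ms.
DOMINANCE = ["Msf", "Ms", "ms"]


def phenotype(genotype):
    """Return 'fertile' or 'sterile' for a pair of alleles.

    The most dominant allele present decides the phenotype:
    Msf (restorer) and ms (recessive fertile) give fertile plants,
    Ms (dominant sterile) gives sterile plants.
    """
    top = min(genotype, key=DOMINANCE.index)
    return "sterile" if top == "Ms" else "fertile"


def cross(parent_a, parent_b):
    """Count offspring phenotypes, assuming equal gamete frequencies."""
    return Counter(phenotype(pair) for pair in product(parent_a, parent_b))


# Fertile MsfMs crossed with sterile MsMs, as in the plant materials.
offspring = cross(("Msf", "Ms"), ("Ms", "Ms"))
print(offspring)
```

Under these assumptions the fertile and sterile classes are equally frequent, which matches the 1:1 ratio reported for the sampled population.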

Construction of the Br300K chip
A 300k microarray chip (Br300K; version 2.0) for B. rapa designed from 47,548 Unigenes ( Figure S2) was manufactured at NimbleGen, Inc. (http://www.nimblegen.com/) as described recently [44]. Random GC probes (40,000) were used to monitor the hybridization efficiency and four corner fiducial controls (225) were included to assist with overlaying the grid on the image. To assess the reproducibility of the microarray analysis, we repeated the experiment two or three times with independently prepared total RNAs. The normal distribution of Cy3 intensities was tested by qqline. The data were normalized and processed with cubic spline normalization using quantiles to adjust signal variations between chips and Robust Multi-Chip Analysis (RMA) using a median polish algorithm implemented in NimbleScan [45,46].

RNA isolation and hybridization to the Br300K Microarray GeneChip
Total RNA was isolated from samples using an easy-BLUETM total RNA extraction kit (Invitrogen, NY, U.S.A.) and was then purified using an RNeasy MinEluteTM Cleanup Kit (Qiagen, Germany). For biological repeats, RNAs were extracted from two samples collected in 2009 and 2010, and subjected to microarray analysis.
For the synthesis of double-stranded cDNAs, a Superscript Double-Stranded cDNA Synthesis Kit (Invitrogen, NY, U.S.A.) was used. Briefly, 1 µl of oligo dT primer (100 µM) and 10 µl (10 µg) of total RNA were combined and denatured at 70 °C for 10 min and renatured by cooling the mixture on ice. First-strand DNA was synthesized by adding 4 µl of 5X First Strand Buffer, 2 µl of 0.1M DTT, 1 µl of 10 mM dNTP mix, and 2 µl of SuperScript enzyme and by incubating at 42 °C for 1 h. To synthesize the second strand, 91 µl of DEPC-water, 30 µl of 5X Second Strand Buffer, 3 µl of 10 mM dNTP mix, 1 µl of 10 U/µl DNA ligase, 4 µl of 10 U/µl DNA Polymerase I, and 1 µl of 2 U/µl RNase H were added to the first-strand reaction mixture and the reaction was allowed to proceed at 16 °C for 2 h. After the RNA strand was removed by RNase A (Amresco, OH, U.S.A.), the reaction mixture was clarified by phenol/chloroform extraction and then cDNA was precipitated by centrifugation at 12,000 × g after adding 16 µl of 7.5 M ammonium acetate and 326 µl of cold ethanol. For the synthesis of Cy3-labeled target DNA fragments, 1 µg of double-stranded cDNA was mixed with 40 µl (1 OD) of Cy3-9mer primers (Sigma-Aldrich, MO, U.S.A.), and denatured by heating at 98 °C for 10 min. Next, 10 µl of 50X dNTP mix (10mM each), 8 µl of deionized water, and 2 µl of Klenow fragment (50 U/µl, NEB, MA, U.S.A.) were added and the reaction mixture was incubated at 37 °C for 2 h. DNA was precipitated by centrifugation at 12,000 × g after adding 11.5 µl of 5M NaCl and 110 µl of isopropanol. Precipitated samples were rehydrated with 25 µl of water. The concentration of each sample was determined by spectrophotometry. Thirteen micrograms of DNA were used for microarray hybridization. The sample was mixed with 19.5 µl of 2X hybridization buffer (NimbleGen, WI, U.S.A.) and finalized to 39 µl with deionized water. Hybridization was performed in a MAUI chamber (Biomicro, CA, U.S.A.) at 42 °C for 16 h. 
After the hybridization, the microarray was removed from the MAUI Hybridization Station and immediately immersed in a shallow 250 ml Wash I solution (NimbleGen, WI, U.S.A.) at 42 °C for 10-15 sec with gentle agitation and then transferred to a second dish of Wash I and incubated for 2 min with gentle agitation. The microarray was transferred into a dish of Wash II solution and further washed in Wash III solution for 15 seconds with agitation. The microarray was dried in a centrifuge for 1 min at 500 × g and scanned for Cy3 signal using a GenePix 4000B scanner (Molecular Devices, CA, U.S.A.) preset to a 5 µm resolution. Signals were digitized and analyzed by NimbleScan (NimbleGen, U.S.A.). The grid was aligned to the image with a chip design file (NimbleGen Design File, NDF). The alignment was verified to ensure that the grid corners were overlaid on the image corners. This was further confirmed by uniformity of scores in the program. The analysis was performed in a two-part process. First, pair report files were generated in which sequence, probe, and signal intensity information for the Cy3 channel were collected. Data-based background subtraction using a local background estimator was performed to improve fold-change estimates on arrays with high background signal. The data were normalized as mentioned in the microarray construction section. The complete microarray data have been deposited in NCBI's Gene Expression Omnibus (GSE47665).

Gene chip data analysis
Genes with an adjusted P value (adj.P.Value) or false discovery rate below 0.05 were collected, and from these we selected genes whose expression, compared with expression at stage 1, was greater than 1 or less than -1 in at least one stage. Multivariate statistical tests such as clustering, principal component analysis, and multidimensional scaling were performed with Acuity 3.1 (Molecular Devices, U.S.A.). Hierarchical clustering was performed with similarity metrics based on squared Euclidean correlation and average linkage clustering was used to calculate the distance between genes.
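The two-step selection described above can be sketched as follows. This is an illustration rather than the authors' pipeline: the record layout, the function name `select_genes`, and the gene labels are hypothetical, and the expression values are assumed to be log-scale ratios relative to stage 1, which the text implies but does not state.

```python
def select_genes(records, alpha=0.05, threshold=1.0):
    """Return the names of genes passing both selection criteria.

    records: iterable of (gene, adjusted_p, ratios) tuples, where
    ratios holds the expression of each later stage relative to
    stage 1 (assumed here to be on a log scale).
    """
    selected = []
    for gene, adjusted_p, ratios in records:
        significant = adjusted_p < alpha
        # Greater than 1 or less than -1 in at least one stage.
        changed = any(abs(r) > threshold for r in ratios)
        if significant and changed:
            selected.append(gene)
    return selected


# Hypothetical example records, not taken from the data set.
example = [
    ("geneA", 0.01, [0.2, 1.8, 2.5]),    # significant and changed
    ("geneB", 0.01, [0.1, -0.4, 0.9]),   # significant, never changed
    ("geneC", 0.20, [3.0, 3.5, 4.0]),    # changed, not significant
    ("geneD", 0.04, [-1.2, -0.3, 0.0]),  # significant and changed
]
print(select_genes(example))
```

Keeping the significance cutoff and the change threshold as parameters makes it easy to see how the gene list responds to stricter or looser settings.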

Comparison of B. rapa genes on the Br300K microarray with other known plant genes
In the Brassica rapa 300k Microarray v2.0, designed from 47,548 Unigenes, 31,057 cDNA/EST-supported genes were compared with the genome sequences of B. napus, Arabidopsis, and rice sequences at the amino acid levels using BLASTP analysis. The numbers of genes for the comparison were 33,410 from the Arabidopsis TAIR9 database, 30,192 from the rice RAP2.0 database, and 56,628 putative ORFs among 80,696 B. napus consensus sequences.

RT-PCR analysis
Total RNA (5 µg) from each sample was combined with random hexamer primers in a SuperScript first-strand cDNA synthesis system according to the manufacturer's instructions (Invitrogen, U.S.A.). Complementary DNA was diluted 10-fold and 1 µl of the diluted cDNA was used in a 20 µl PCR mixture. RT-PCR primers are listed in Table S1 and primers for BrACT1, used as controls, were 5′-GTCTTGACCTTGCTGGACGTGA-3′ (forward) and 5′-CCTTTCAGGTGGTGCAACGAC-3′ (reverse). A standard PCR was performed with 5 min denaturation at 94 °C, followed by 25 cycles of 94 °C for 30 s, 55 °C for 30 s, and 72 °C for 90 s. PCR products were analyzed following electrophoresis through a 1% agarose gel.

Floral structure of GMS Chinese cabbage
To investigate developmental defects in Chinese cabbage, flowers from sterile and fertile plants were examined (Figure S3, Table S2). All floral organ measurements except pistil length and diameter were smaller in sterile flowers than in fertile flowers (significant difference: p=0.01, by t-test). However, the morphology of all of the floral organs except for the stamens was normal. In sterile flowers, the length of the stamens was greatly reduced, with shortened filaments. In addition, anthers appeared to be thin and pale white and did not bear any pollen grains. These observations imply that genes regulating floral organ identity seemed to be normal, whereas genes for anther and pollen development were defective or expressed abnormally. Moreover, the expression of genes associated with cell growth and hormonal signaling might be altered.

Anther development in floral buds used in microarrays
To gain information complementary to the microarray experiments, anther development was examined for sterile and fertile floral buds (Figure 1). Detailed microscopic study led to the division of anther development of Chinese cabbage into five stages: pollen mother cell (PMC), tetrad, uninucleate, bicellular, and mature pollen stages (Figure 1 plus data not shown). The anthers of sterile and fertile floral buds appeared to be similar before the tetrad stage. After the tetrad stage, the fertile anthers could release microspores, which develop into mature pollen. However, in the sterile anthers, PMCs seem to remain associated with each other in the locule, unlike the normal PMCs that dissociate from each other during meiosis. In addition, the tapetum swelled to expand at the centre of the locule. These events were followed by abnormal degradation of the endothecium and collapse of pollen grains in the mature pollen stage. Based on Arabidopsis microsporogenesis [28], the early microsporogenesis process should be normal in our GMS plants. Instead, genes associated with tapetal development or post-meiotic tapetal function were defective in the GMS cabbage. Taken together, the sterile buds showed two distinct defects: the failure of microspore release or imperfect tetrad formation, and the swollen tapetum layer. This may imply that expression of GMS-related genes must commence from an early stage of male sporogenesis if microspores are to be released.
Using morphological features and floral bud size, fertile and sterile bud samples were classified into four stages (F1, F2, F3, and F4) and three stages (S1, S2, and S3), respectively ( Figure S4, Table 1). At each corresponding stage, the sizes of

Analysis of B. rapa genes on Br300K microarray
To demonstrate the necessity of the B. rapa microchip for Chinese cabbage study, and to verify the microarray results, genes used in construction of the Br300K chip were analyzed for sequence similarity to other plant genes. When the 31,057 B. rapa amino acid sequences with cDNA/EST support were compared to those of Arabidopsis, B. napus, and rice, the numbers of genes with BLASTP scores higher than 30 were 18,078, 17,441, and 15,361, respectively. Figure S5A shows the percentage of similar genes in the three plants after grouping genes according to BLASTP score bins: <=70, 100, 200, 300, and >=300. As expected, more B. rapa sequences showed homology with Arabidopsis and B. napus than with rice. In the BLAST score bin 300-1,000, 40.6% and 39.8% of the genes had homologs in Arabidopsis and B. napus, respectively, while 18.9% of the genes had homologs in rice. Interestingly, in the bins less than 200, more genes had counterparts in rice than in Arabidopsis and B. napus. This is consistent with the longer evolutionary distance between B. rapa and rice compared with that between B. rapa and B. napus or Arabidopsis.
When the probe-designed regions of B. rapa genes were compared with the 18,078 Arabidopsis homologs, the percentage distribution of BLASTn score bins was lower than that of BLASTP score bins (Figure S5B). Comparison of 39,181 B. rapa genes with Arabidopsis ones showed an average sequence identity of 89%, suggesting that existing Arabidopsis oligomeric chips are not appropriate for analysis of B. rapa gene expression. In conclusion, genome-wide transcriptome analysis of Chinese cabbage requires the use of a B. rapa-specific microarray, instead of Arabidopsis chips.

Analysis of microarray data
To identify genes with altered expression, including candidate GMS gene(s) and/or GMS-related genes in the Chinese cabbage, we carried out microarray analyses using the newly developed Br300K chip and RNAs from fertile and sterile buds (Table S3). Among 47,548 genes on the Br300K chip, 7,213 genes showed values of less than 500 in PI (probe intensity) from all tested floral bud samples. We ignored these genes in subsequent analyses. The remaining 40,335 genes were subjected to significance analysis of microarrays (SAM) [47]. The false discovery cutoff was set at <5% and genes changing over 2-fold were selected. A total of 10,622 genes were differentially expressed; 4,774 genes were up-regulated over 2-fold in at least one of four fertile buds compared with sterile buds, while 5,848 genes were down-regulated (Tables S3, S4). About 12-20% of the differentially expressed genes appeared to have no Arabidopsis counterparts, indicating that they might be present in B. rapa and/or other plants but not in Arabidopsis. Among the up-regulated genes in any stage of the fertile buds, 41% showed up-regulation in all stages, indicating that many genes may function in several developmental stages of pollen formation. There were 11,390 clones that were classified as "no hit found" in the initial analysis with Arabidopsis thaliana annotation (Table S3). Among these, 293 clones were specifically expressed in fertile buds and only 28 clones in sterile buds (Tables S5, S6). When these sequences were subjected to BLASTn, most of the F-specific clones showed similarity to B. oleracea (12), B. napus (15), and other plant clones (62). Seventy clones (56 fertile-specific and 14 sterile-specific) were matched only to B. rapa bacterial artificial chromosome (BAC) clone sequences, implying that they are specific to B. rapa and will be important for further research to discover novel GMS-related genes.
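The intensity floor and the 2-fold criterion described above can be sketched as below. This is a simplified illustration under stated assumptions: the SAM significance test is omitted, the fertile/sterile ratios are assumed to be precomputed (the text does not specify how fertile and sterile buds were paired), and the function names are hypothetical.

```python
def passes_intensity_floor(pi_values, floor=500):
    """Keep a gene unless its PI is below the floor in every sample."""
    return any(value >= floor for value in pi_values)


def fold_change_calls(ratios, fold=2.0):
    """Flag changes of more than `fold` in either direction.

    ratios: fertile/sterile intensity ratios for one gene
    (hypothetical pairing of buds, assumed to be precomputed).
    """
    return {
        "up_in_fertile": any(r > fold for r in ratios),
        "up_in_sterile": any(r < 1.0 / fold for r in ratios),
    }


# Gene counts reported in the text are consistent with each other.
genes_on_chip = 47_548
low_signal = 7_213
print(genes_on_chip - low_signal)  # genes passed on to SAM
print(4_774 + 5_848)               # differentially expressed genes

# Hypothetical PI values for two genes across seven bud pools.
print(passes_intensity_floor([120, 300, 90, 410, 60, 75, 499]))
print(passes_intensity_floor([120, 300, 90, 2_400, 60, 75, 499]))
print(fold_change_calls([1.1, 3.4, 8.0, 0.9]))
```

Applying the intensity floor before any statistics avoids calling large fold changes between two values that are both close to background.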
In addition, several genes that were classified as unknown function but were specifically expressed in the fertile buds, such as Brapa_ESTC000796, Brapa_ESTC008117, and Brapa_ESTC049183, would be good candidates for GMS-associated genes.
To verify the general pattern of gene expression during pollen development, we selected genes showing the highest PI values in each of the floral buds, and carried out semiquantitative RT-PCR (Figure S6, Table S7). As shown in Figure S6, most of the genes that showed the highest PI values in sterile buds were also expressed in fertile buds. In addition, genes showing the highest PI value in F1 and F2 buds were also expressed in sterile buds at very low levels. However, some genes from F2 buds were not expressed in sterile buds at all, indicating a possible involvement in male fertility. As expected, genes that had the highest PI value in F4 buds were specifically expressed in fertile buds. They started expression in the F2 buds and continued through to the F4 buds, the pollen maturation stage, indicating that, in GMS plants, expression of genes in late stages of pollen development may be inhibited.

Genotype-specific expression of genes
In addition to being significantly different by SAM, genotype-specific genes were defined as genes that had PI values of over 1,000 in at least one bud type in a genotype, but less than 500 in all buds of the other genotype, e.g., F-specific genes have a PI value of over 1,000 in any of the fertile buds (F1-F4 buds), but less than 500 in all three sterile buds (Tables S8, S9). The total numbers of F- and S-specific genes were 1,413 and 199, respectively, implying that the expression of large numbers of genes which might be important for fertility was defective in GMS floral buds. Of the F-specific genes, 71% showed the highest expression in F4 buds, the pollen maturation stage, indicating that putative GMS genes affect the expression of many genes involved in the late stage of pollen development. Approximately 1%, 9%, and 17% of genes were highly expressed in F1 (before tetrad), F2 (at tetrad), and F3 (after tetrad) buds, respectively, indicating that 90% (1,272 genes) of the genes were highly expressed after the tetrad stage. By contrast, among the genes that were more highly expressed in the sterile buds, most (82%) were highly expressed at the tetrad stage.
A Venn diagram and K-means clustering of the genes listed in Tables S8 and S9 are shown in Figure 2. As shown in Figure 2A, genes with PI values over 1,000 in all four fertile buds and three sterile buds totaled 337 and 16, respectively. Genes showing the highest PI value in F1 buds were not expressed in F3 and F4 buds, suggesting that none of these were related to male gametogenesis in our GMS Chinese cabbage. These could be excluded from putative GMS genes. On the other hand, genes showing the highest PI values in F2 buds were expressed through the F3 bud stage (Figure 2B). Genes showing the highest PI values in F3 buds were also expressed in both F2 and F4 buds, indicating these genes could be related to GMS phenotypes. Genes showing the highest PI values in F4 buds commenced expression in F3 buds and dramatically increased their levels at the F4 bud stage. Genes showing the highest PI values in S1 buds were also expressed in S2 buds, whereas most genes showing the highest PI values in S2 buds were only expressed at that stage. Several genes showing the highest PI values in S3 buds were highly expressed in S2 buds as well. All of these data indicate that fertile or sterile bud-specific genes might function in a relatively broad range of pollen development. Alternatively, our samples may include several stages of pollen development.
Genotype-specific genes were functionally grouped based on 'The Arabidopsis Information Resource; http://www.Arabidopsis.org/'. As shown in Table 2, most of the sterile bud-specific genes were highly expressed in S2 buds, the dominant categories of which were transferase activity, transcription factors, protein binding, and membrane metabolism. A high proportion of fertile bud-specific genes were associated with transporter activity, kinase activity, and lipid metabolic processes. In addition, F-specific genes were largely expressed in F4 buds.
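The PI thresholds that define genotype-specific genes can be written as a small classifier. The sketch below covers only the intensity rule; the SAM significance requirement is assumed to have been applied beforehand, and the function name `genotype_specific` and the example values are hypothetical.

```python
def genotype_specific(fertile_pi, sterile_pi, high=1000, low=500):
    """Classify one gene from its PI values in fertile and sterile buds.

    A gene is F-specific when at least one fertile bud exceeds `high`
    and every sterile bud stays below `low`; S-specific is the mirror
    case. Any other gene returns None.
    """
    if max(fertile_pi) > high and max(sterile_pi) < low:
        return "F-specific"
    if max(sterile_pi) > high and max(fertile_pi) < low:
        return "S-specific"
    return None


# Hypothetical PI values: four fertile buds (F1-F4), three sterile (S1-S3).
print(genotype_specific([80, 150, 900, 5200], [60, 75, 110]))
print(genotype_specific([90, 120, 200, 310], [150, 2400, 480]))
print(genotype_specific([1500, 1800, 2100, 2600], [700, 650, 900]))
```

Leaving a gap between the two thresholds (500 and 1,000) keeps genes with borderline signal out of both classes.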

Genes showing dramatically altered expression
The following categories were selected by both previous reports and highly altered gene groups found in this study: peroxidases (PODs), purple acid phosphatases (PAPs), multidrug and toxic compound extrusion (MATE) efflux family proteins, cytochrome P450 family proteins, lipid transfer protein (LTP) family, Cys-proteinase, kinases, transporters, and carbon supply-related genes. Among 68 BrPOD genes, 14 (eight Arabidopsis counterparts) and eight (two Arabidopsis counterparts) genes were specifically expressed in sterile and fertile buds, respectively ( Figure S7). These numbers, compared with their Arabidopsis counterparts, indicate that BrPOD genes are present in multiple copies in Chinese cabbage. Jiang et al. [48] reported that the expression level of reactive oxygen species (ROS)-scavenging genes was high during pollen development. However, major cell wall peroxidases reported by Bayer et al. [49] in Arabidopsis were highly expressed in both buds, implying that fertile bud-specific PODs found in this study might be novel genes expressed during pollen development in Chinese cabbage.
PAPs belong to a metallophosphoesterase superfamily and are characterized by their pink or purple color in solution [50]. Our microarray revealed that several BrPAP genes were highly and specifically expressed in either fertile or sterile buds of Chinese cabbage. Among 18 BrPAPs on the Br300K chip, three (BrPAP3, 7, and 8) were specifically expressed in sterile buds, while another three (BrPAP5, 6, and 11) were specifically expressed in fertile buds ( Figure S7), suggesting that the latter three might play an important role in pollen development. In tobacco (Nicotiana tabacum), NtPAP12 is bound to the cell wall and enhances the activities of cellulose and callose synthases [51]. Due to sequence similarity among PAP genes in plants, we speculate that BrPAP5, 6, and 11 might have similar functions during pollen development to NtPAP12.
MATE family proteins are known to confer tolerance to toxins like aluminum in plants [52,53], and Chinese cabbage contains many MATE genes. Among 65 MATE efflux family protein genes on the Br300K chip, two and four genes (three Arabidopsis counterparts) were specifically expressed in sterile buds and fertile buds, respectively ( Figure S7). The rest showed no significant difference between sterile and fertile buds. The role of MATE efflux proteins in pollen development is not clear, but their expression implies some sort of function of these genes related to the developmental process.
Numerous P450s have been known to be involved in the biosynthesis and metabolism of triterpenoids and steroids [54], the phenylpropanoid pathway [55], and lipid exine synthesis [8], all of which are required for normal pollen development. Among 311 cytochrome P450 (CYP) genes on the Br300K chip, 11 and 15 were specifically expressed in sterile and fertile buds, respectively ( Figure S8). In particular, seven fertile bud-specific genes (which were similar to seven Arabidopsis counterparts) (BrCYP71B2, BrCYP86C2, BrCYP86C3, BrCYP86C4, BrCYP705A24, BrCYP707A3, and BrCYP735A1) were first reported as pollen development-related P450s in this study. The CYP98A8 gene, mentioned by Matsuno et al. [55], was not F-specific, but its expression levels were 14-287-fold increased (in an allelic-specific manner) in the fertile buds. However, the upstream gene of CYP98A8, BrSHT (spermidine hydroxycinnamoyl transferase, AT2G19070), was specifically and highly expressed in the fertile buds, indicating possible involvement in pollen fertility.
The transport of lipid molecules from the tapetum to the microspore surface has been considered to be an essential process for pollen wall formation. LTPs are basic extracellular small (9 kDa) proteins present in high amounts (as much as 4% of the total soluble proteins) in higher plants [56] and are involved in the fertilization process, such as pollen tube growth, pollen allergens, and pollen tube adhesion [57,58]. Among 116 LTP family genes on the Br300K microarray, five (three Arabidopsis counterparts) and 18 (nine Arabidopsis counterparts and five Brassica-specific genes) were specifically expressed in sterile and fertile buds, respectively (Figure S9). A previous report found that LTP types 1 and 2 (At3g51590 and At1g66850) were significantly reduced in the Arabidopsis ams mutant [59]. The fertile bud-specific expression of B. rapa genes homologous to these LTPs might imply the importance of their function in pollen development after meiosis. BrATA7 in particular, which has 70% identity to the A. thaliana anther-specific gene 7 (AT4G28395) [60] at the amino acid sequence level, would be another candidate GMS gene.
Since several Cys proteases and their inhibitors are thought to be involved in PCD in tapetum [59,[61][62][63][64], it can be assumed that Cys-proteinases are important in pollen development in Chinese cabbage. Among 50 Chinese cabbage Cys-proteinase genes, 12 genes (corresponding to three Arabidopsis genes; AT1G06260, AT2G31980, and AT4G36880) were highly and specifically expressed in fertile buds (Figure S9). These fertile-bud-specific genes might be related to pollen development in Chinese cabbage. Some of these have not been mentioned in other male sterile plants, implying the presence of PCD regulatory pathways that differ from those of Arabidopsis. The swollen tapetum layer might also be caused by the inhibition of PCD [65], resulting from defective AtMYB103/80, MS1, and AMS [20,[37][38][39]. On the other hand, the swollen tapetum layer observed in Figure 1 might be influenced only by transcription factor AMS (Table 3) and various proteinase genes.
Extracellular invertase genes (also known as cell wall invertases or beta-fructofuranosidases) were expressed specifically in anther and they supplied carbohydrate to the developing microspores [66]. Repression of or interference with extracellular invertase caused male sterility, while complementation restored fertility [66]. Arabidopsis contains six cell wall invertases (AtcwINV1-AtcwINV6) (At3g13790, At3g52600, At1g55120, At2g36190, At3g13784, and At5g11920) [67]. Among these, AtcwINV2, 4, and 5 were expressed in flower and/or seeds, while AtcwINV1, AtcwINV3, and AtcwINV6 were expressed in all tissues [67]. In our microarray data, the counterparts of AtcwINV1 and AtcwINV3 were expressed in all floral buds, while that of AtcwINV6 was not expressed in floral buds (data not shown). However, the counterpart of AtcwINV2 was highly expressed in F4 buds, indicating that its function may be important in pollen development at the late stage ( Figure S9).
Kinases and phosphatases are major regulatory components that control various pathways. This fact naturally leads to the presumption of involvement of these gene products in pollen development. Particularly, receptor-like protein kinases regulated male sterility from the early stages [64,68,69] to the late pollen developmental stage [70]. Among 1,226 protein kinase genes on the Br300K chip, 63 of them, including those mentioned in Ms-cd1 B. oleracea by Kang et al. [23], were differentially expressed (Table S10). All receptor-like kinase genes were expressed in fertile buds, showing the highest expression level in F4 buds. In particular, receptor-like kinase genes (counterparts of AT3G21910, AT3G21920, AT3G21930, AT3G21990, AT3G22040, AT3G29040, and AT3G58310) were highly expressed and up-regulated in the fertile buds, implying a critical role in pollen development. ASK1 (Arabidopsis SKP1-like 1) is a component of Skp1-Cullin-F-box-protein (SCF) complexes involved in protein degradation by the 26S proteasome. It also plays a role in male meiosis [71,72]. Knockout of the ask1 gene in Arabidopsis caused male sterility [71]. In this study, no difference in BrASK1 expression was observed between sterile and fertile buds (Table S1). However, BrASK2 appears to be essential for male fertility (Figure 3), supporting the hypothesis that either our GMS occurs after meiosis of the male gametophyte, or that different regulatory mechanisms for fertility operate between the two species. In other words, BrASK2 appears to have taken over BrASK1 function in B. rapa.
In addition, it was reported that the Arabidopsis magnesium transporter family member AtMGT9, which functions as a low-affinity Mg2+ transporter, has a crucial role in male gametophyte development and male fertility [24]. In our microarray data, three alleles belong to this transporter family. One (Brapa_ESTC020685) showed no difference in its expression between sterile and fertile buds, but two (Brapa_ESTC020255 and Brapa_ESTC046558) were up-regulated in fertile buds, specifically, F2 and F3 buds. Particularly, Brapa_ESTC046558 seems to display fertile-specific expression, implying that it might be involved in male fertility.

Pollen wall and coat formation genes
After microspore release from the tetrad, formation of the pollen wall and the pollen coat are major events controlled by the tapetum layer and microspores. Based on the cytological study (Figure 1), the change in the expression of numerous genes involved in pollen wall and coat formation in GMS floral buds (Tables 4-5) seemed to be the result of defects in an early event in male gametophyte development. These genes might participate in the fertilization process.

1) Pollen cell wall formation genes.
Since the formation and modification of the pollen cell wall are also important for normal pollen development, we analyzed microarray data related to two categories: cell wall modification-related genes and cell wall arabinogalactan proteins (AGPs). A large number of genes involved in pollen cell wall formation and modification were specifically expressed in fertile buds.
Cell wall modification-related genes include six families: methyltransferase, pectate lyase, pectinesterase, polygalacturonase, glycosyl hydrolase, and fructosidase genes. Five hundred and twenty-three Chinese cabbage clones contain such genes. Among these, 158 were highly expressed in fertile buds, including all genes mentioned by Kang et al. [23]. However, the degree of up-regulation was much higher in Chinese cabbage (up to 1,004-fold) than in B. oleracea (31-fold) (Table 4). Fourteen invertase/pectin methylesterase inhibitor family protein genes, 14 pectinesterase genes, 11 glycosyl hydrolase family protein genes, 8 polygalacturonase genes, and 5 pectate lyase family protein genes were highly and specifically expressed in fertile buds. These results are similar to those of the B. oleracea experiment, but the changes in expression were more dramatic and many novel genes might be induced in Chinese cabbage. BrPGA4 (polygalacturonase 4) and BcMF2 (At1G02790 homolog) have many alleles in Chinese cabbage, the expression of which showed two patterns: one group was highly expressed in F3 and F4 buds, whereas expression of the others began in F1 buds and continued to F4 buds. Interestingly, among the invertase/pectin ... (Tables S8 and S9), suggesting the existence of allele-specific expression patterns.
To release microspores from the early PMC stage, several specialized PMC wall layers must be generated and degraded [35]. Ms-cd1 B. oleracea, similar to our GMS, exhibited degradation of the primary PMC wall and delayed degradation of callose surrounding the tetrads, thereby arresting microspore release [23]. In our microarray data, two important enzymes for the degradation of esterified and unesterified pectin, pectin methylesterase (PME) and polygalacturonase (PG), were differentially expressed, whereas callose degradation genes were not, indicating little difference in the mechanism underlying male sterility.
One putative PG gene, Brassica campestris Male Fertility 9 (BcMF9), conferred male fertility by acting as a coordinator in the late stages of tapetum degeneration, and subsequently in the regulation of wall material secretion and, in turn, exine formation [8]. In our microarray, its homolog also showed altered expression, with high levels in F3 and F4 buds, suggesting an important role in GMS.
Arabinogalactan proteins (AGPs) connect the plasma membrane to the cell wall [73]. They are a family of extensively glycosylated hydroxyproline-rich glycoproteins located on the cell surface, and they are required for stamen and pollen development and function [73,74]. Therefore, it was expected that Chinese cabbage AGPs might also be involved in male fertility. Similar to Arabidopsis data, BrAGP6, BrAGP11, BrAGP14, BrAGP23, BrAGP40, and BrAGP41 were highly expressed in fertile buds, particularly F3 and F4 buds. However, expression of the remaining 19 BrAGPs (BrAGP1-4, BrAGP8-10, and BrAGP26 and 27) showed no difference between fertile and sterile buds (Table 4). These data indicate that at least six AGPs could be associated with pollen development in Chinese cabbage.
2) Pollen coat-related genes.
The pollen coat of the family Brassicaceae, including A. thaliana, B. napus, B. oleracea, and B. rapa, consists of lipids and proteins that facilitate adhesion to insect vectors and mediate pollen-stigma interactions during pollination and fertilization processes [75,76]. Lipases and oleosins (largely oleo-pollenins) are major protein components (over 90%) of the pollen coat [76,77], while protein kinases and pectin esterase are minor components [76].
Pollen coat lipases are largely composed of GDSL lipases and extracellular lipases (EXLs) [77,78]. Among 95 clones encoding GDSL lipase genes from Chinese cabbage, three genes (corresponding to two Arabidopsis genes) and 13 genes (corresponding to nine Arabidopsis genes) were specifically expressed in sterile and fertile buds, respectively (Table 5). The remaining genes were either not expressed or constitutively expressed in both types of floral buds. In addition, 58 genes belonging to extracellular lipases and other lipases were found on the Br300K microarray. Among these, 3 and 51 genes were specifically expressed in sterile and fertile buds, respectively (Table 5). BrEXL4, BrEXL6, and the putative family II EXLs were highly expressed in the fertile buds. One interesting finding was a very highly up-regulated gene in fertile buds (F1, F2, and F3 buds) encoding a beta-ketoacyl-CoA synthase family protein, which catalyzes wax synthesis. Another was that the acyl-activating enzyme 11 (AAE11) gene was highly expressed only in S3 and F4 buds.
Oleo-pollenins (oleosin-like proteins) make up 50-80% of total pollen coat proteins by mass, whereas oleosins and caleosins are minor components of the pollen coat [76]. The oleo-pollenins include many from the glycine-rich protein (GRP) family [75,79]. In our microarray data, one BrGRP (AT1G55990 homolog) gene was expressed specifically in sterile buds. However, 35 genes were specifically and highly expressed in fertile buds (Table 5), which included Arabidopsis counterparts, B. napus homologs, B. oleracea homologs, and B. rapa genes. Only one of these is a caleosin-related family protein.
Pectin esterases and protein kinases are less-abundant proteins in the pollen coat that facilitate the penetration of the emerging pollen tube into the stigmatic surface and participate in signaling processes, respectively [76]. In our microarray data, one pollen coat receptor-like kinase (AT3G21920 homolog) and one Chinese cabbage pollen coat protein homolog (BAN103) (U77666) showed fertile bud-specific expression (Table 5). In particular, the receptor-like protein kinase might play a role throughout normal pollen development.
In addition to the above proteins, our microarray data revealed that genes encoding five pollen-specific proteins, one phosphatase, two polcalcins, three pollen Ole e 1 allergens, and one channel were specifically and highly expressed in fertile buds. These data indicate that, in addition to cell wall and pollen coat proteins, many pollen components are required for male fertility or male gametophyte development (Table 5). Although many genes essential for the formation of both pollen wall and coat were suppressed in GMS, pollen maturation and anther dehiscence would be expected to be normal, since the expression of genes essential for late-stage pollen development, such as PM-ANT1, ER-ANT1, and the mitochondrial ATP/ADP carriers AAC1 and AAC2 [80], was high in all S1-3 and F1-4 floral buds.
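The fertile bud-specific, sterile bud-specific, constitutive, and non-expressed categories used throughout this section can be illustrated with a minimal sketch. It assumes that each gene has one intensity value per bud (S1-S3 and F1-F4) and that a single detection threshold separates expressed from non-expressed genes. The threshold of 1000, the function names, and the example values are hypothetical and are not taken from the Br300K analysis.

```python
# Hypothetical detection threshold; the value actually used for the
# Br300K data is not stated in this section.
DETECTION_THRESHOLD = 1000.0

STERILE_STAGES = ("S1", "S2", "S3")
FERTILE_STAGES = ("F1", "F2", "F3", "F4")


def classify_gene(intensities, threshold=DETECTION_THRESHOLD):
    """Classify one gene from a dict mapping bud stage to intensity."""
    in_sterile = any(intensities[s] >= threshold for s in STERILE_STAGES)
    in_fertile = any(intensities[s] >= threshold for s in FERTILE_STAGES)
    if in_fertile and not in_sterile:
        return "fertile-specific"
    if in_sterile and not in_fertile:
        return "sterile-specific"
    if in_sterile and in_fertile:
        return "expressed in both"
    return "not expressed"


def peak_fold_change(intensities, floor=1.0):
    """Ratio of the peak fertile intensity to the peak sterile intensity.

    Both peaks are clamped to `floor` so the ratio is defined for zeros.
    """
    peak_fertile = max(intensities[s] for s in FERTILE_STAGES)
    peak_sterile = max(intensities[s] for s in STERILE_STAGES)
    return max(peak_fertile, floor) / max(peak_sterile, floor)


# Made-up profiles for four placeholder genes (not real measurements).
EXAMPLE = {
    "gene_a": {"S1": 50, "S2": 60, "S3": 40,
               "F1": 80, "F2": 900, "F3": 12000, "F4": 20000},
    "gene_b": {"S1": 5000, "S2": 5200, "S3": 4800,
               "F1": 5100, "F2": 4900, "F3": 5300, "F4": 5000},
    "gene_c": {"S1": 3000, "S2": 2500, "S3": 4000,
               "F1": 100, "F2": 90, "F3": 120, "F4": 80},
    "gene_d": {"S1": 10, "S2": 20, "S3": 15,
               "F1": 30, "F2": 25, "F3": 40, "F4": 35},
}

if __name__ == "__main__":
    for name, profile in EXAMPLE.items():
        print(name, classify_gene(profile), round(peak_fold_change(profile), 2))
```

Under these assumptions, a gene counts as fertile bud-specific only when it is undetected in every sterile bud, which is stricter than a fold-change cut-off alone.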

Expression analysis of transcription factors
Transcription factors can regulate a number of genes associated with a specific trait, so their effects are likely to be broader than those of structural genes. We analyzed several major transcription factors showing altered expression in GMS Chinese cabbage (Figure 4). Among 56 BrWRKY transcription factor genes, seven genes (BrWRKY26, BrWRKY28, BrWRKY33, BrWRKY41, two BrWRKY71, and BrWRKY75) were expressed specifically in sterile buds, whereas three genes (BrWRKY7, BrWRKY21-1, and BrWRKY68) were expressed specifically in fertile buds. In particular, BrWRKY21-1 (homologous to B. napus WRKY21-1 [81]) was highly expressed in F3 and F4 buds, implying a possible involvement in pollen development and/or pollen fertility.
NAC [for NAM (no apical meristem), ATAF1/2, and CUC2 (cup-shaped cotyledon 2)] transcription factors are one of the largest plant TF families. They share an N-terminal NAC domain. Since NAC transcription factors have been found to be key regulators of stress perception and developmental programmes [82], examining their expression profiles could provide insight into their involvement in pollen development. A total of 66 NAC transcription factors were analyzed in this microarray. Among them, two (BrNAC42 and BrNAC92) were expressed in sterile buds, while another two (BrNAC56 and BrNAC73) were expressed in fertile buds. Two BrNAC56 genes (Brapa_ESTC000813 and Brapa_ESTC007054), homologs of NARS2/NAC2, which regulates embryogenesis in Arabidopsis [83], were expressed from F2 to F4 floral buds, whereas two novel BrNAC73 genes (Brapa_ESTC01835 and Brapa_ESTC038584) were expressed in F3 and F4 floral buds, indicating possible involvement in pollen development. The remaining 47 genes were constitutively expressed in both types of buds, and 15 genes were not expressed in the tested tissues.
Among 279 BrMYB transcription factor genes, 14 (9 Arabidopsis genes) and 8 (7 Arabidopsis genes) were specifically expressed in sterile and fertile buds, respectively. BrMYB46, BrMYB85, BrMYB99, BrMYB103 (MYB80 or MS188), BrMYB108, and two MYB genes appeared to be fertile bud-specific. Interestingly, most fertile bud-specific MYB genes were highly expressed in F4 buds, whereas BrMYB99 was highly and specifically expressed in F1 and F2 buds. BrMYB99 is therefore a putative candidate for control of the early stage of Chinese cabbage GMS, while the others are putative candidates for pollen fertility.
Among 1,542 zinc finger family protein genes deposited on the Br300K chip, 2 and 23 genes were specifically expressed in sterile and fertile buds, respectively. The two sterile bud-specific genes are C3H4-type RING finger and C2H2-type (BrZAT11) genes, while the fertile bud-specific genes comprise C2H2-, C3H3-, CCH-, DHHC-, and Dof-type protein genes. Among these, C2H2-type family protein genes were remarkably highly expressed in F3 and F4 buds.
Analysis of known transcription factors revealed two (AT1G33770 and AT1G75490 homologs) and 11 (FIS3, HOS9/PF2, ATHB-7, AGD10/MEER28/RPA, MSG2/IAA19, ZFWD1, At-HSF4A, AT4G35700, AT4G21895, and AT1G77570 homologs) genes that were specifically expressed in sterile and fertile buds, respectively. Most of these are associated with dehydration stress and ovule development. None of these genes has previously been reported to be related to male fertility, which suggests that they may have additional functions, including roles in pollen development, that remain to be elucidated.

Prediction of gene function through analysis of expression profiling during floral bud development
Analysis of gene expression levels (expressed as PI values) during floral bud development provides an opportunity to identify sequentially operating genes and to predict the function of genes previously known in other plant systems. As shown in Figure 5, a regulatory pathway somewhat similar to that underlying Arabidopsis pollen development might also exist in Chinese cabbage. The expression of BrNZZ/SPL and BrEXS/EMS1 began in F1 buds and continued through to the pollen maturation stage, F4. Interestingly, BrMYB103/MYB80, one of the BrMS5s, BrMYB35, an LTP family protein gene, BrMS1, and BrMYB99 were expressed only in F1 and F2 floral buds, not in F3 and F4 buds. In addition, the transcript levels for BrMS2 and BrATA1 were high in F1 and F2 buds, but not detectable in F4 buds. On the other hand, the transcripts for BrATA20, a microtubule motor gene, BcMF7, and BrMYB103 were not detectable in F1 buds. According to Figure 5, the chronological working order of floral bud developmental genes in Chinese cabbage appears to differ from that in Arabidopsis. BrMYB35 and BrMYB103/80 appear to act upstream of BrMS1 and BrMYB99. BrMS1, BrMS2, and BrAMS might function at similar stages of pollen development.
Just as Arabidopsis contains multiple copies of the male sterility 5 (MS5) gene [84], the Br300K microarray includes five BrMS5 genes: homologs of AT1G04770, AT3G51280, AT4G20900, AT5G44330, and AT5G48850 (ATSDI1; sulfur deficiency-induced 1). Although mutation of the Arabidopsis AT4G20900 gene led to male sterility [84], transcripts of its homolog could not be detected in any of the seven floral buds, suggesting that it is not related to pollen development in Chinese cabbage. Instead, the homologs of AT5G44330 and AT3G51280 might be functional, but they were also expressed in all sterile buds, indicating that they might not be major determinants in GMS even though they are required for pollen development. The counterpart of AT5G48850, the expression of which was highest in F3 buds, was also expressed in all seven floral buds, indicating that MS5 genes do not play a critical role in Chinese cabbage GMS. All BcMF genes showed the highest expression levels in F4 buds. However, some of them were expressed in all floral buds, whereas others were expressed only in F3 and F4 buds. Arabidopsis BES1 (BRI1-EMS-SUPPRESSOR1), an important transcription factor for brassinosteroid signaling, is considered to be a master gene that controls many transcription factors essential for anther and pollen development as well as MS1-downstream genes [40]. However, four homologs (Brapa_ESTC001714, Brapa_ESTC013323, Brapa_ESTC021551, and Brapa_ESTC039699) of Arabidopsis BES1 were highly expressed in all seven floral buds (Table S3), suggesting that the mechanism underlying GMS is different from that of Arabidopsis.
Homologs of two Arabidopsis genes whose mutation causes defective tetrad formation, AtPS1 (Parallel Spindle 1) (At1G34355) and JASON (At1G0660) [85], were expressed in both sterile and fertile floral buds in our GMS (Table S3), indicating either that meiosis II and tetrad formation proceed normally or that other genes are involved in these processes.
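The stage-by-stage reading of expression profiles described in this section can be sketched as follows. This is only an illustration of the reasoning, assuming one intensity value per fertile bud stage and a single detection threshold; the threshold, function names, placeholder gene names, and values are all hypothetical. Genes are ordered by the first stage at which they are detected and then by the last.

```python
FERTILE_STAGES = ("F1", "F2", "F3", "F4")


def expression_window(profile, threshold=1000.0):
    """Return (first, last) fertile stage at which a gene is detected.

    Returns None when the gene is below the threshold at every stage.
    """
    detected = [s for s in FERTILE_STAGES if profile[s] >= threshold]
    if not detected:
        return None
    return detected[0], detected[-1]


def order_by_onset(profiles, threshold=1000.0):
    """Sort gene names by onset stage, then by last detected stage.

    Genes that are never detected are left out of the ordering.
    """
    rank = {stage: i for i, stage in enumerate(FERTILE_STAGES)}
    windows = {}
    for gene, profile in profiles.items():
        window = expression_window(profile, threshold)
        if window is not None:
            windows[gene] = window
    return sorted(
        windows,
        key=lambda g: (rank[windows[g][0]], rank[windows[g][1]], g),
    )


# Made-up profiles for placeholder genes (not real measurements).
PROFILES = {
    "late_gene": {"F1": 20, "F2": 50, "F3": 8000, "F4": 15000},
    "early_gene": {"F1": 6000, "F2": 5500, "F3": 100, "F4": 30},
    "broad_gene": {"F1": 3000, "F2": 3200, "F3": 3100, "F4": 2900},
    "silent_gene": {"F1": 10, "F2": 10, "F3": 10, "F4": 10},
}

if __name__ == "__main__":
    for gene in order_by_onset(PROFILES):
        print(gene, expression_window(PROFILES[gene]))
```

An ordering of this kind reflects only the timing of transcript accumulation; it does not by itself establish that an earlier gene regulates a later one.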

Comparison of B. rapa GMS with Arabidopsis MS genes
Genes regulating anther and pollen development in Arabidopsis have been well established by genetic and molecular biological studies. To determine whether B. rapa GMS is also controlled by homologs of Arabidopsis genes, the alteration of expression of those genes was compared with previous results (Table 3). Genes associated with stamen formation, microsporangium differentiation (except NZZ/SPL and EXS/EMS1), and early tapetum development (except bHLH89) were not down-regulated in B. rapa GMS buds, indicating that putative GMS gene(s) might function downstream of these groups of genes. However, alteration of NZZ/SPL and EXS/EMS1 expression in GMS might imply the presence of different pathways in the two plants. Other early genes associated with anther development in Arabidopsis, such as MS5 [84], MYB33, and MYB65 [86], showed no change in their expression in Chinese cabbage. The rice UNDEVELOPED TAPETUM1 gene and its putative Arabidopsis thaliana ortholog DYSFUNCTIONAL TAPETUM1 (DYT1), encoding a basic helix-loop-helix (bHLH) transcription factor, are crucial for tapetal differentiation and the formation of microspores [35,87]. The B. rapa ortholog of Arabidopsis DYT1 was absent from our microarray, but BrDYT1 (Bra013519 [88]), which was 86% identical to the Arabidopsis ortholog, was not expressed in any floral buds (data not shown). Instead, another bHLH transcription factor, BrbHLH89, might replace DYT1 function in Chinese cabbage (Table 3). Among major genes essential for post-meiotic tapetal function that are controlled by DYT1 [28,35,36], MS1 and AMS appear to be related to GMS, but MYB35 and MYB103/80 do not (Figure 5, Table 3).
Most genes related to later pollen development were downregulated in GMS floral buds, but some genes, such as ATA1, MS2, ATLP-3, AtMYB32, and DEX2, were not. In addition, expression of several genes associated with pollen wall development, such as FLP1 and DEX2, was high in all seven buds. These data imply that exine formation genes are expressed in GMS buds, even in the aborted pollen grains.
AMS, a basic helix-loop-helix (bHLH) transcription factor, plays a role in completion of meiosis [38], and regulates 13 genes involved in anther development, including lipid transport and metabolism [59]. BrAMS showed altered expression, especially in F3 and F4 buds. The Brassica genome may contain two (or three) copies of AMS (Bra002004 and Bra030041) (http://brassicadb.org), and both showed similar patterns of expression, but Bra030041 (Brapa_ESTC011209 and Brapa_ESTC010964) changed to a greater degree. B. rapa GMS showed phenotypes somewhat similar to those of the Arabidopsis ams mutant, such as reduced filament length, a swollen tapetum layer, and no pollen production. However, B. rapa GMS also showed failure of tetrad formation and release, indicating that additional genes are involved. BrAMS was expressed in both S1 and S2, but not in S3. In addition, BrAMS expression was high in F3 and F4 buds. This indicates that the BrAMS gene itself might be normal, but that the signaling that controls BrAMS transcription could be disturbed in GMS buds.
An ortholog of another bHLH gene, bHLH89 (At1G06170), revealed a more dramatic change in GMS, indicating a more important role than BrAMS in GMS. Interestingly, both bHLH genes were highly expressed in S1, S2, F1, and F2 buds, but were completely suppressed in S3 while remaining at relatively high levels in F3 and F4 buds. This result indicates that upstream component(s) might play a major role in GMS. Another interesting finding was that the expression of chalcone synthase (CHS) was AMS-dependent, but that the expression of the ABC transporter WBC27 (AT3G13220) was not AMS-dependent in GMS. Since both genes are direct targets of AMS and essential for pollen fertility in Arabidopsis [59], our data indicate somewhat different pollen development processes between the two plants.
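One simple way to screen whether a putative target follows a regulator across the seven buds, in the sense of the AMS-dependence discussed above, is to correlate their log-transformed expression profiles. The sketch below is a hedged illustration of that idea, not the procedure used in this study; the Pearson cut-off of 0.8, the function names, and all intensity values are hypothetical.

```python
import math

# Bud order assumed for every profile list below.
STAGES = ("S1", "S2", "S3", "F1", "F2", "F3", "F4")


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (NaN if constant)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return float("nan")
    return cov / math.sqrt(var_x * var_y)


def follows_regulator(regulator, target, min_r=0.8):
    """True when log2 profiles of target and regulator correlate at >= min_r."""
    log_reg = [math.log2(v + 1) for v in regulator]
    log_tgt = [math.log2(v + 1) for v in target]
    r = pearson(log_reg, log_tgt)
    return (not math.isnan(r)) and r >= min_r


# Made-up profiles: the regulator is suppressed in S3 only.
REGULATOR = [4000, 3800, 50, 4200, 4100, 3000, 2800]
DEPENDENT_TARGET = [2000, 1900, 30, 2100, 2050, 1500, 1400]
INDEPENDENT_TARGET = [1000, 1100, 1050, 980, 1020, 1000, 990]

if __name__ == "__main__":
    print("dependent:", follows_regulator(REGULATOR, DEPENDENT_TARGET))
    print("independent:", follows_regulator(REGULATOR, INDEPENDENT_TARGET))
```

Co-variation of this kind is only suggestive, since two genes can share a profile because both respond to a common upstream factor.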

qRT-PCR confirmation of microarray analysis
To confirm our microarray data, we selected several genes that had been previously identified in Arabidopsis and other Brassica species. Transcript levels of these genes were examined by semi-quantitative RT-PCR (Figure 3). Some genes identified in Arabidopsis spl and ems mutants [14] were expressed in both sterile and fertile buds, indicating that these are not closely related to Chinese cabbage GMS. Others (BrEST10704, BrATA7, and BrbHLH) were specifically expressed in fertile buds or up-regulated after F2 buds, implying possible involvement in pollen fertility (Figure 3A). BrAG (Agamous), which determines organ identity, was expressed in all seven floral buds, suggesting that it might not be critical in our GMS (Figure 3B). Except for BrMYB33, BrNAC25, and BrASK2, most genes associated with pollen development in Arabidopsis might not be associated with Chinese cabbage GMS determination (Figure 3B). On the other hand, most genes related to the tapetum, pollen coat, pollen wall, kinases, transport, and so on were specifically expressed in fertile buds (Figure 3C, 3D), implying that they are directly or indirectly linked to male fertility, as either cause or effect.
Counterparts of Arabidopsis CYP98A8, which is highly expressed in the tapetum and developing pollen, and SHT, which is coexpressed with CYP98A8 [55], were expressed in Chinese cabbage in a fashion similar to that in Arabidopsis, indicating that they are involved in male fertility as well.
Overall, most of the important genes essential for the early stage of microsporogenesis in Arabidopsis, including EXS/EMS1, NZZ/SPL, MS5, MS1, MS2, AMS, bHLH89, MYB103/80, MYB35, and MYB65, were highly expressed at least in S1 and S2 buds, suggesting that these are not GMS genes in Chinese cabbage. Instead, a signaling factor(s) or another transcription factor(s) that controls the expression of all these genes would be a better candidate for the GMS gene(s), even though we did not identify it in this study. However, BrMYB99, which was specifically expressed in F1 and F2 buds (Figure 3C), could be a putative GMS gene, even though the GMS phenotype was different from that of the Arabidopsis mutant [13].
Since pollen development is a complex process regulated by the expression of sense and antisense transcripts as well as small RNAs [89], more comprehensive molecular and genetic studies will be required to elucidate the GMS mechanism in Chinese cabbage. In addition, 17 B. rapa-specific genes had no Arabidopsis counterpart genes (Table S5), including Brapa_ESTC048170, Brapa_ESTC049217, and Brapa_ESTC050778. These genes, which were highly and specifically expressed in fertile buds, will be important to investigate in terms of function.
In conclusion, we identified many genes that are differentially expressed between fertile and sterile buds of Chinese cabbage. Most of these genes are already known from other male sterile plants, but some are newly identified in Chinese cabbage, including 17 novel genes. Expression of core transcription factors involved in pollen development was quite similar to that in Arabidopsis, with some exceptions. Numerous genes controlling pollen wall and pollen coat formation were greatly down-regulated in sterile buds, possibly as an indirect effect of the GMS gene defect. All data suggest that Chinese cabbage GMS might be controlled by genes acting in post-meiotic tapetal development.

Table S5. List of specifically expressed genes in fertile buds that were initially classified as no hit found (NHF). All sequences were subjected to a repeated BLASTn search in NCBI. (XLSX)
Table S6. List of specifically expressed genes in sterile buds that were initially classified as no hit found (NHF). All sequences were subjected to a repeated BLASTn search in NCBI.