De Novo Assembly and Transcriptome Analysis of Wheat with Male Sterility Induced by the Chemical Hybridizing Agent SQ-1

Wheat (Triticum aestivum L.), one of the world’s most important food crops, is a strictly autogamous (self-pollinating) species with exclusively perfect flowers. Male sterility induced by chemical hybridizing agents has increasingly attracted attention as a tool for hybrid seed production in wheat; however, the molecular mechanisms of male sterility induced by the agent SQ-1 remain poorly understood due to limited whole transcriptome data. Therefore, a comparative analysis of wheat anther transcriptomes for male fertile wheat and SQ-1–induced male sterile wheat was carried out using next-generation sequencing technology. In all, 42,634,123 sequence reads were generated and were assembled into 82,356 high-quality unigenes with an average length of 724 bp. Of these, 1,088 unigenes were significantly differentially expressed between the fertile and sterile wheat anthers, including 643 up-regulated unigenes and 445 down-regulated unigenes. The differentially expressed unigenes with functional annotations were mapped onto 60 pathways using the Kyoto Encyclopedia of Genes and Genomes database. They were mainly involved in coding for the components of ribosomes, photosynthesis, respiration, purine and pyrimidine metabolism, amino acid metabolism, glutathione metabolism, RNA transport and signal transduction, reactive oxygen species metabolism, mRNA surveillance pathways, protein processing in the endoplasmic reticulum, protein export, and ubiquitin-mediated proteolysis. This study is the first to provide a systematic overview comparing wheat anther transcriptomes of male fertile wheat with those of SQ-1–induced male sterile wheat and is a valuable source of data for future research in SQ-1–induced wheat male sterility.


Introduction
Wheat (Triticum aestivum L.) is one of the most important cultivated crops in the global food system; it supplies the needed food protein and calories for over 35% of the world's population

Materials and Methods
Chemical hybridizing agent SQ-1
SQ-1 was obtained from the Key Laboratory of Crop Heterosis of Shaanxi Province and used to produce hybrid wheat seed. It is a pyridazinecarboxylic acid compound that is readily soluble in water. Stamen sterility (male sterility rate > 98%) was induced without affecting pistil fertility when 5.0 kg ha⁻¹ SQ-1 was evenly sprayed on the leaves of wheat at the 8.5 stage of the Feekes scale.

Plant material and sample preparation
The wheat cultivar Xi-nong 1376 was sown in the experimental field of Northwest A & F University, Yangling, Shaanxi, China (108°E, 34°15′N) on October 7, 2013. The experimental plot contained about 9,000 plants grown in 180 rows (1.5 m long each), with 25 cm between rows and 3 cm between plants within a row. The plot was divided into two groups (treatment and control) of 90 rows each. When the average growth stage of the wheat reached the 8.5 stage of the Feekes scale, 5.0 kg ha⁻¹ SQ-1 was evenly sprayed on the leaves of the wheat in the treatment group, and the same amount of water was sprayed on the leaves of the wheat in the control group. At the uninucleate pollen stage, anthers were collected separately from fertile wheat spikes and SQ-1-induced male sterile spikes, frozen immediately in liquid nitrogen, and stored at -80°C until use.

RNA extraction and preparation of library for Illumina sequencing
Each frozen anther sample was ground in a mortar with liquid nitrogen, after which total RNA was extracted using Trizol Reagent (Invitrogen Life Technologies, USA) according to the standard protocol. The quality of total RNA was checked by electrophoresis in a 1.5% agarose gel and the concentration of total RNA was determined by NanoDrop (Thermo Scientific, Wilmington, DE, USA). The RNA integrity value was further verified using an Agilent 2100 Bioanalyzer (Agilent Technologies, Santa Clara, CA, USA). The two cDNA libraries of fertile wheat and SQ-1-induced male sterile wheat were prepared according to the manufacturer's instructions for mRNA-Seq sample preparation (Illumina, Inc., San Diego, CA, USA). The cDNA libraries were sequenced with paired-end reads of 100 bp on an Illumina HiSeq 2500 instrument by Biomarker Technologies Co. Ltd. (Beijing, China). The dataset was submitted to the NIH Short Read Archive (accession number: SRP051670).

Sequence data analysis and de novo assembly
Before assembly, the raw paired-end reads were filtered to obtain high-quality clean reads. Low-quality sequences were removed, including sequences with ambiguous bases (denoted with an "N" in the sequence trace) and reads with more than 20% low-quality bases (quality value < 20). After purity filtering was completed, the high-quality reads were assembled by Trinity (release 20131110) with default parameters to construct unique consensus sequences [47].
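The two read-filtering criteria above can be illustrated with a minimal Python sketch. It is not the pipeline used in this study: the function names are hypothetical, Phred+33 quality encoding is assumed, and the rule that a pair is kept only when both mates pass is an assumption, since the text does not state how pairs were handled.

```python
def passes_filter(seq, quals, min_q=20, max_low_frac=0.20):
    """Return True if a read survives both filters described in the text."""
    # Filter 1: discard reads containing ambiguous bases ("N").
    if "N" in seq.upper():
        return False
    if not quals:
        return False
    # Filter 2: discard reads with MORE than 20% of bases below Q20.
    low = sum(1 for q in quals if q < min_q)
    return low / len(quals) <= max_low_frac


def phred33(qual_string):
    """Decode a FASTQ quality string (assumption: Phred+33 encoding)."""
    return [ord(c) - 33 for c in qual_string]


def filter_pairs(pairs):
    """Keep a pair only if both mates pass (assumption, not from the text).

    Each mate is a (sequence, quality_string) tuple.
    """
    kept = []
    for mate1, mate2 in pairs:
        ok1 = passes_filter(mate1[0], phred33(mate1[1]))
        ok2 = passes_filter(mate2[0], phred33(mate2[1]))
        if ok1 and ok2:
            kept.append((mate1, mate2))
    return kept
```

A read with exactly 20% low-quality bases is retained here, because the stated criterion removes only reads with more than 20%.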

Analysis of differential gene expression
The gene expression level was measured by the values of RPKM (reads per kb per million reads) [48]. Unigenes that were differentially expressed between the male fertile and SQ-1-induced male sterile wheat were analyzed by Chi-square test using IDEG6 software (http://telethon.bio.unipd.it/bioinfo/IDEG6/). The false discovery rate (FDR) method was introduced to determine the threshold p-value at FDR < 0.01, and an absolute value of log2 Ratio ≥ 1 was used as the threshold to determine the significance of the differential expression of unigenes.
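As an illustration of the expression measure, the sketch below computes RPKM from a read count and the log2 ratio between the two libraries. The function names and the small pseudocount (used only to avoid division by zero) are assumptions for this example and are not taken from the study.

```python
import math


def rpkm(read_count, unigene_length_bp, total_mapped_reads):
    """Reads per kilobase of unigene per million mapped reads."""
    return read_count * 1e9 / (unigene_length_bp * total_mapped_reads)


def log2_ratio(rpkm_sterile, rpkm_fertile, pseudo=0.001):
    """log2(sterile / fertile), with fertile wheat as the reference.

    The pseudocount is an assumption that keeps the ratio finite
    when a unigene has zero reads in one library.
    """
    return math.log2((rpkm_sterile + pseudo) / (rpkm_fertile + pseudo))
```

For example, 100 reads on a 1,000 bp unigene in a library of one million mapped reads gives an RPKM of 100, and a unigene with |log2 Ratio| ≥ 1 differs at least two-fold between the libraries.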

Gene annotation and classification
In order to perform functional annotation, the assembled unigenes were submitted to public databases and compared with the NCBI non-redundant protein database (NR) [49], the Swiss-Prot database (http://www.uniprot.org/) [50], the Clusters of Orthologous Groups (COG) database (http://www.ncbi.nlm.nih.gov/COG/) [51], and the Kyoto Encyclopedia of Genes and Genomes (KEGG) database (http://www.genome.jp/kegg/) [52] using BlastX (v. 2.2.26) with an E-value of less than 1e-5, while the Gene Ontology (GO) annotations were analyzed using the Blast2GO (v. 2.5) program (http://www.geneontology.org/) [53]. All differentially abundant unigenes between the male fertile and SQ-1-induced male sterile wheat were mapped to the GO and KEGG pathway databases, and then the respective numbers of unigenes for every GO term and KEGG Orthology (KO) term were calculated. To compare these unigenes with the whole transcriptome background for wheat, significantly enriched GO and KO terms from the set of differentially abundant unigenes were identified using the hypergeometric test. The formula for the gene enrichment test was

$$P = 1 - \sum_{i=0}^{m-1} \frac{\binom{M}{i}\binom{N-M}{n-i}}{\binom{N}{n}}$$

in which N represents the total number of unigenes with GO and KEGG pathway annotation; n represents the number of differentially abundant unigenes in N; M represents the number of unigenes that were annotated to certain GO or KO terms; and m represents the number of differentially abundant unigenes in M. The initial p-values were then adjusted using a Bonferroni correction and a corrected p-value of 0.05 was adopted as a threshold.
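The enrichment test can be sketched in a few lines of Python using exact integer binomial coefficients. This is an illustrative implementation of a standard hypergeometric upper-tail test with the symbols N, n, M, and m as defined above, not the software used in the study; the function names are hypothetical. The upper tail is summed directly, which is mathematically equal to one minus the lower tail but avoids loss of precision for very small p-values.

```python
from math import comb


def hypergeom_pvalue(N, n, M, m):
    """P(X >= m): probability of seeing at least m term-annotated unigenes
    among n differentially abundant ones, given M annotated unigenes in a
    background of N."""
    total = comb(N, n)
    upper_tail = sum(
        comb(M, i) * comb(N - M, n - i)
        for i in range(m, min(n, M) + 1)
    )
    return upper_tail / total


def bonferroni(pvalues):
    """Bonferroni correction: multiply by the number of tests, cap at 1."""
    k = len(pvalues)
    return [min(1.0, p * k) for p in pvalues]
```

A term would then be called enriched when its corrected p-value falls below 0.05.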

Illumina sequencing and de novo assembly
A total of 42,634,123 paired-end sequence reads (20,586,618 and 22,047,505 for fertile wheat and SQ-1-induced male sterile wheat, respectively) remained, with the Q30 percentage over 92%. The high-quality reads were aligned to assembled unigenes, with more than 80% of the high-quality reads mapping to a unique or to multiple unigene locations (Table 1). Trinity (release 20131110) software was used to generate 4,062,157 contigs, which were further assembled

Prediction of open reading frames
We used GetORF software from the EMBOSS (v. 6.0.1) analysis package to perform open reading frame (ORF) predictions. A total of 81,838 ORFs were detected from 82,356 unigenes, with an average length of 402 bp and an N50 length of 846 bp. The majority of ORFs (68.31%) were 0-300 bp long, whereas 17,566 ORFs (21.46%) were 300-1,000 bp long and 8,369 ORFs (10.23%) exceeded 1,000 bp (Table 2). The remaining 518 unigenes with no ORFs were noncoding sequences or likely originated from untranslated regions.
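The summary statistics reported for the ORFs (N50 and the share of ORFs in each length class) can be computed as in the following sketch. The function names are hypothetical, and the treatment of the class boundaries (300 bp and 1,000 bp counted in the lower class) is an assumption, since the text does not specify it.

```python
def n50(lengths):
    """Length L such that sequences of length >= L hold at least half
    of the total bases."""
    if not lengths:
        raise ValueError("lengths must not be empty")
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length


def length_class_fractions(lengths):
    """Fractions of sequences in the classes 0-300, 300-1,000, >1,000 bp.

    Assumption: a boundary value falls into the lower class.
    """
    short = sum(1 for x in lengths if x <= 300)
    medium = sum(1 for x in lengths if 300 < x <= 1000)
    long_ = sum(1 for x in lengths if x > 1000)
    total = len(lengths)
    return short / total, medium / total, long_ / total
```

The reported percentages are consistent with the counts: 17,566 of 81,838 ORFs is 21.46% and 8,369 of 81,838 is 10.23%.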

Analysis of differentially expressed unigenes
A sequence similarity search was conducted against several public databases: the NR database, the Swiss-Prot database, the COG database, the GO database, and the KEGG database, using Blast with an E-value of less than 1e-5. A total of 967 differentially expressed unigenes (88.88%) exhibited gene annotation (Table 3).
To identify differentially expressed unigenes between fertile wheat and SQ-1-induced male sterile wheat, putative differentially expressed unigenes were identified on the basis of RPKM values that were calculated from the read counts mapped onto the reference transcriptome. A total of 1,088 unigenes were differentially expressed between fertile wheat and SQ-1-induced male sterile wheat according to a comparison of expression levels with Fold Change (FC) ≥ 2 and FDR < 0.01 (Fig 2). Using the fertile wheat as a reference, 643 up-regulated unigenes (with higher expression levels in the SQ-1-induced male sterile wheat) and 445 down-regulated unigenes (with higher expression levels in the fertile wheat) were identified (Fig 3). More unigenes were up-regulated than down-regulated in sterile wheat (643 versus 445). The two functional categories with the highest levels of up-regulation were "posttranslational modification, protein turnover, chaperones" and "oxidation-reduction process," while the two with the highest levels of down-regulation were "translation, ribosomal structure and biogenesis" and "amino acid transport and metabolism."
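The thresholding step can be sketched as follows. The text does not name the FDR procedure, so the Benjamini-Hochberg adjustment shown here is an assumption, as are the function names; fold change is taken as sterile over fertile, so that values of at least 2 count as up-regulated and values of at most 0.5 as down-regulated.

```python
def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values (assumed FDR procedure)."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def classify(fold_change, fdr, min_fc=2.0, max_fdr=0.01):
    """Label one unigene as 'up', 'down', or 'ns' (not significant).

    fold_change is sterile / fertile, with fertile wheat as the reference.
    """
    if fdr >= max_fdr:
        return "ns"
    if fold_change >= min_fc:
        return "up"
    if fold_change <= 1.0 / min_fc:
        return "down"
    return "ns"
```

Counting the "up" and "down" labels over all unigenes would yield totals of the kind reported above (643 and 445).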

GO functional classification of differentially expressed unigenes
GO is an international classification system for standardized gene functions that may be obtained from the NR annotation information. The GO terms consist of the following three broad categories: cellular components, molecular functions, and biological processes. GO assignment was used to assign 637 differentially expressed unigenes to 57 subcategories. Among them, 394 differentially expressed unigenes were related to cellular components, 487 were grouped under molecular functions, and 492 were involved in biological processes. Within the cellular component category, the majority of differentially expressed unigenes were enriched in the subcategories of "cell part," "cell," and "organelle." Within the biological process category, the great majority were related to "metabolic process," "cellular process," and "response to stimulus." Within the molecular function category, the largest proportion of differentially expressed unigenes were involved in "binding" and "catalytic activity," and a relatively large number were related to "transporter activity" (Fig 4), which may play an important role in the transport of ions, small molecules such as amino acids, and macromolecules such as proteins.
ROS are chemically reactive molecules containing oxygen, including hydroxyl radical (OH), lipid peroxide (ROO-), singlet oxygen (¹O₂), superoxide radical (O₂⁻), and hydrogen peroxide (H₂O₂), which can lead to the significant destruction of cells [54]. Catalase and peroxidase were dramatically down-regulated in SQ-1-induced male sterile wheat. In addition, 3 down-regulated unigenes were related to "plant-pathogen interaction" (pathogenesis-related protein 1, jasmonate ZIM domain-containing protein, and calmodulin) and play important roles in the response to environmental stress. A few down-regulated unigenes were involved in "amino acid metabolism," including the metabolism of glutamate, tryptophan, glycine, and methionine.
Glutathione is an abundant and important antioxidant in plants that can react with electrophilic or oxidizing species before the latter can damage more critical cellular constituents, such as nucleic acids and proteins [55]. Among all the up-regulated unigenes, the pathway containing the largest number was "glutathione metabolism" (6 unigenes). A few up-regulated unigenes were related to protein degradation, including "protein processing in endoplasmic reticulum" (5 unigenes), "ubiquitin mediated proteolysis" (3 unigenes), and "protein export" (4 unigenes). mRNA degradation plays an important role in the regulation of gene expression, and 2 up-regulated unigenes were involved in the "mRNA surveillance pathway," which includes nonsense-mediated mRNA decay and no-go decay. In addition, 2 up-regulated unigenes were closely associated with anaerobic respiration, a form of respiration that uses electron acceptors other than oxygen and is generally less efficient energetically than aerobic respiration.

Discussion
Wheat PMS is closely related to material metabolism and energy metabolism and is characterized by little starch accumulation in the mature pollen and deformed shape of abortive pollen particles [29]. As a key regulator of pyruvate metabolism, pyruvate dehydrogenase complex (PDC) links the glycolysis metabolic pathway to the citric acid cycle. PDC-E1α was down-regulated in PMS wheat relative to fertile wheat, a change that results in an energy shortage during pollen development [32]. Furthermore, some studies showed that the expression levels of glyceraldehyde-3-phosphate dehydrogenase (GAPDH) genes in the PMS lines were lower than those in fertile lines. The reduced transcription of GAPDH genes in wheat anthers was closely associated with pollen abortion [30]. The expression level of the gene for aconitase, a key enzyme in the tricarboxylic acid cycle, was significantly lower in PMS wheat anthers than in fertile anthers at the binucleate and trinucleate stages. The shortage of energy and of some metabolic intermediates caused by abnormal expression of the aconitase gene probably induced pollen abortion [29]. Our results showed that a large number of unigenes related to material and energy metabolism were down-regulated in PMS wheat, including ribulose-bisphosphate carboxylase, glyceraldehyde-3-phosphate dehydrogenase, fructose-bisphosphate aldolase, and pyruvate dehydrogenase. These genes were involved in the categories of "photosynthesis" (8 unigenes), "carbon fixation in photosynthetic organisms" (6 unigenes), "citrate cycle" (1 unigene), "glycolysis/gluconeogenesis" (3 unigenes), and "starch and sucrose metabolism" (2 unigenes).
The amount of ROS can increase sharply under conditions such as ultraviolet light, environmental stress (including cold, heat, salt, drought, and heavy metal stress), or anthropic action through xenobiotics, such as herbicides [54]. Increased ROS metabolism was closely related to pollen abortion in PMS wheat. From the late stage of mononucleate pollen to the initial stage of binucleate pollen, the rate of generation of O₂⁻ and the amounts of H₂O₂ and MDA in PMS wheat anthers were significantly higher than those in fertile anthers, but the activity levels of SOD, POD, CAT, and APX were significantly lower than those in fertile anthers. The serious imbalance in ROS metabolism and the increase in membrane lipid peroxidation were probably the major cause of pollen abortion [31]. In our study, we found that the activities of POD and CAT were significantly higher in fertile anthers than in PMS wheat anthers, and they affected the normal development of wheat pollen directly or indirectly.
The proteasomal degradation pathway is essential for many cellular processes, including the regulation of gene expression, the cell cycle, and responses to oxidative stress, and it also participates in plant pollen abortion. From the early stage of mononucleate pollen to the initial stage of trinucleate pollen, the expression level of the ubiquitin-related modifier 1 gene was significantly higher in PMS wheat anthers than in fertile anthers, and the F-box protein gene was significantly up-regulated in PMS wheat anthers from the late stage of mononucleate pollen to the initial stage of binucleate pollen. These genes are the promoters for pollen abortion of wheat PMS induced by the chemical hybridization agent SQ-1 [28].
Differential proteomic analysis of polyubiquitinated proteins associated with wheat male sterility identified 127 differentially expressed polyubiquitinated proteins, including heat shock protein 70, ATPase subunit, ubiquitin-related enzyme, glycosyltransferase, and 20S proteasome subunit. Proteins in PMS wheat anthers whose expression was up-regulated included those in the categories of protein metabolism, carbohydrate and energy metabolism, amino acid metabolism, and plant secondary metabolism. However, PMS wheat anthers expressed much lower levels of photosynthesis-related proteins. Alteration of polyubiquitinated proteins is associated with wheat male sterility [56].
Our results showed that most of the up-regulated unigenes were closely associated with the categories of protein degradation and glutathione metabolism. The down-regulated unigenes were related to the categories of "photosynthesis" (17 unigenes) and "ribosome" (18 unigenes) (Fig 6). The down-regulated expression of ribosome-related unigenes affected protein biosynthesis, which was probably the key reason for wheat pollen abortion.

Conclusions
This is the first study to obtain a large-scale SQ-1-induced male sterile wheat anther transcriptome dataset using high-throughput Illumina sequencing technology. Using functional annotation and enrichment analysis, we analyzed differentially abundant unigenes between fertile wheat and SQ-1-induced male sterile wheat and established a simple model of SQ-1-induced male sterility in wheat (Fig 6).
After SQ-1 was sprayed on the leaves of wheat, it was transported from the leaves to the flowers. Its effects included down-regulated expression of numerous unigenes involved in "nucleotide metabolism," "ribosome," "aerobic respiration," "photosynthesis," and "ROS metabolism." In addition, a large number of unigenes were up-regulated, including those that were closely associated with "anaerobic respiration," "glutathione metabolism," "abnormal RNA degradation," and "protein degradation." We identified a large number of important genes related to wheat male sterility and provided a framework for further mechanistic studies on SQ-1-induced male sterility in wheat.