The G123 rice mutant, carrying a mutation in SE13, presents alterations in the expression patterns of photosynthetic and major flowering regulatory genes

Day length is a determinant of flowering time in rice. Phytochromes participate in flowering regulation by measuring the number of daylight hours to which the plant is exposed. Here we describe G123, a rice mutant generated by irradiation, which displays insensitivity to the photoperiod and early flowering under both long day and short day conditions. To detect the mutation responsible for the early flowering phenotype exhibited by G123, we generated an F2 population, derived from crossing with the wild-type, and used a pipeline to detect genomic structural variation, initially developed for human genomes. We detected a deletion in the G123 genome that affects the PHOTOPERIOD SENSITIVITY13 (SE13) gene, which encodes a phytochromobilin synthase, an enzyme implicated in phytochrome chromophore biosynthesis. The transcriptomic analysis, performed by RNA-seq, in the G123 plants indicated an alteration in photosynthesis and other processes related to response to light. The expression patterns of the main flowering regulatory genes, such as Ghd7, Ghd8 and PRR37, were altered in the plants grown under both long day and short day conditions. These findings indicate that phytochromes are also involved in the regulation of these genes under short day conditions, and extend the role of phytochromes in flowering regulation in rice.


Introduction
An optimal flowering time, or heading date, that adjusts to local agroclimatic conditions is essential for maximizing the yield potential of rice crops. In line with this, flowering regulation in rice has played an important role in its expansion and diversification, and is one of the main factors contributing to the adaptation of this crop to northern regions [1]. Rice domestication took place in a tropical region with a short day (SD) length and temperatures that vary only slightly throughout the year [2]. During its expansion, rice crops reached northern areas where permissive temperatures occur only in summer, when day length is long. In these [...]
PLOS ONE | https://doi.org/10.1371/journal.pone.0233120 | May 18, 2020
[...] GENES1 (HOS1) and CO to form a three-protein complex that coordinates the photoperiodic response [17]. In rice, phytochromes inhibit flowering by negatively modulating both the Hd1 and Ehd1 flowering pathways. Furthermore, PhyA homodimers and PhyB-PhyC heterodimers are independently sufficient to activate Ghd7 transcription, while PhyB homodimers can repress it [13]. More recently, PhyA, PhyB and OsGI, a circadian oscillator protein, have been described to interact with Ghd7 [18].
There is direct evidence that phytochromes control the flowering signaling pathway through PHOTOPERIOD SENSITIVITY 5 (SE5), which encodes a heme oxygenase that converts heme into biliverdin IXα during the biosynthesis of phytochromobilin (PΦB), the phytochrome chromophore [19]. It has been reported that SE5 negatively controls Ehd1 expression and, thus, inhibits flowering. Furthermore, SE5 confers photoperiodic sensitivity through the regulation of Hd1 [20]. Mutants defective in SE5 are deficient in active phytochromes and exhibit very early heading under both SD and LD conditions. Furthermore, the deficiency of both PhyA and PhyB in se5 plants results in the absence of a light response in the mutant [19]. Similarly, PHOTOPERIOD SENSITIVITY 13 (SE13/OsHY2) encodes a PΦB synthase that participates in the final step of PΦB synthesis [21]. Plants that are phytochrome-defective because they lack functional SE13/OsHY2 flower earlier than plants with functional phytochromes [21], as also occurs in se5 mutants.
In order to understand the factors involved in photoperiodic flowering regulation in rice, we characterized G123, an early flowering mutant that derives from the Gleva variety that is widely cultivated in Spain. We determined the changes in the expression patterns of the main flowering regulatory genes in this mutant, and carried out a comparative transcriptomic analysis with the parental variety. The genome structure analysis allowed us to identify the mutation responsible for the early heading phenotype displayed by this mutant.

Plant material and growing conditions
Seeds of the Gleva variety were irradiated with fast neutrons (25 Gy) at the Instituto Tecnologico e Nuclear (Sacavem, Portugal), and were germinated and grown in pots in a greenhouse at a controlled temperature (25˚C) and relative humidity (50% RH) under natural daylight conditions (latitude 39˚28' N). Adult plants were grouped into families of five plants and their seeds were collected. One hundred and twenty-two M2 plants from each family were grown in rows, spaced 20 × 20 cm, in the field. They were screened for plants that flowered earlier than Gleva. The M3 plants were cultivated in summer in pools resembling cultivation field conditions and flowering dates were recorded. Other traits, such as height, panicle length, number of panicles and grain weight per plant, were also scored.
For the photoperiod sensitivity assays, the gene expression analysis, RNA-seq and mutation detection, plants were cultivated in growth chambers (SANYO Mod. MLR350) equipped with broad-spectrum fluorescent tubes (400-700 nm) (GROLUX F36W / GRO-T8, Sylvania, Germany) with a light intensity of 250 μmol m⁻² s⁻¹. Plants were cultivated separately for each analysis. Plants were grown under LD (14 h light:10 h dark) or SD (10 h light:14 h dark) conditions, or under 12 h light:12 h dark photoperiod conditions. Temperature was kept constant at 27˚C in all the experiments. To monitor the effect of the photoperiod while minimizing differences in flowering induction times, the seeds of Gleva and G123 were sown in pots and grown under 12 h light:12 h dark photoperiod conditions for 4 weeks, followed by 1 week under the LD or SD conditions. For the expression pattern analysis, at the end of week 5, time-series samples were taken every 4 h from the second leaf of three different plants. The time when plants began to receive light was considered 0 h. For the RNA-seq analysis, a new set of plants was grown and the second leaves of these plants were collected 20 h after dawn. Samples were frozen in liquid nitrogen and stored at -80˚C until the RNA extraction procedure.
Three hundred sixty-five F2 plants derived from a cross between Gleva and G123 were grown in pots in a greenhouse under natural light conditions in summer. The heading date was considered the time when half of the first panicle had emerged. Plants that flowered earlier than 72 days after sowing (DAS) were considered early flowering plants. A chi-square test was used to test the hypothesis of a single recessive gene.

RNA isolation
For the quantitative real-time PCR, total RNA was isolated using extraction buffer (0.1 M LiCl; 0.1 M Tris pH 8; 1% SDS; 0.01 M EDTA) and a mixture of phenol:chloroform:isoamyl alcohol (25:24:1), and was then precipitated with LiCl at a final concentration of 2 M and resuspended in TE. The RNA concentration was measured with a Qubit® 2.0 Fluorometer (Life Technologies, USA) using the Qubit™ RNA BR Assay Kit (Ref: Q10211) following the manufacturer's instructions.
The RNA isolation for the RNA-seq analysis was performed using the NucleoSpin® RNA Plant Kit (Ref: 740949.50, MACHEREY-NAGEL, Germany) following the manufacturer's instructions. The quality and concentration of RNA were tested by agarose gel electrophoresis, with a BioAnalyzer 2100 (Agilent) and with a NanoDrop® spectrophotometer (Thermo Scientific). mRNA was enriched using oligo-dT beads.

Quantitative real-time PCR
The gene expression analyses were carried out on 2 replicates using RNA extracted from the second leaf of 2 different plants. The analyses were performed in a single step with the LightCycler® FastStart DNA MasterPLUS SYBR Green I Kit (Applied Biosystems™, Ref: 03515885001) following the manufacturer's instructions. First-strand cDNA was synthesized from 100 ng of total RNA by M-MuLV reverse transcriptase (Roche®). The real-time PCR procedure involved incubation at 48˚C for 30 min and 95˚C for 10 min, followed by 45 cycles at 95˚C for 2 s, 55-61˚C for 8 s and, finally, 72˚C for 8 s. Next, samples were incubated at 95˚C for 15 s and 42˚C for 1 min, followed by a temperature gradient from 42˚C to 95˚C with a ramp of 0.1˚C/s. Fluorescence intensity was measured during both the extension at 72˚C and the temperature gradient. The specificity of the reaction was verified by a melting curve analysis, obtained during the temperature gradient, and by sequencing the reaction product. The expression of a rice ubiquitin gene was used for normalization purposes. The sequences of primers, extension times and number of cycles are provided in S1 Table.

mRNA-Seq library construction and sequencing
mRNA-Seq library construction and sequencing were performed by Novogene Bioinformatics Technology Co., Ltd (Hong Kong). Briefly, a library with an insert size of 250-300 bp was constructed and sequenced as paired-end reads of 150 bp. Following random mRNA fragmentation, cDNA was synthesized using random hexamer primers and reverse transcriptase. The complementary strand was then synthesized following the Illumina mRNA Sequencing Sample Preparation Guide. End repair and ligation of the paired-end sequencing adapters were then performed. The employed adapters were: 5' adapter 5'-AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCT and 3' adapter 5'-GATCGGAAGAGCACACGTCTGAACTCCAGTCACATCACGATCTCGTATGCCGTCTTCTGCTTG, where the six bases ATCACG in the 3' adapter correspond to the index. Size selection and PCR amplification enrichment were performed. The quality of the library was tested using Qubit 2.0 and Bioanalyzer 2100, and the effective concentration of the library was quantified accurately by Q-PCR.
The mRNA-Seq libraries were sequenced on an Illumina HiSeq/MiSeq sequencer. Raw reads were filtered to eliminate low-quality reads and adapters. Reads containing adapter sequences, more than 10% indeterminate bases, or more than 50% low-quality bases (Q score ≤ 5) were eliminated.
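The two numeric criteria above can be illustrated with a minimal Python sketch. It is not the sequencing provider's software: the function name `keep_read` and its parameters are hypothetical, quality values are assumed to be already decoded to integer Phred scores, and adapter screening is deliberately left out.

```python
def keep_read(seq, quals, max_n_frac=0.10, low_q=5, max_low_q_frac=0.50):
    """Return True if a read passes the N-content and low-quality filters.

    seq   -- base string in which 'N' marks an indeterminate base
    quals -- list of integer Phred scores, one per base (assumed decoded)
    """
    # Empty or inconsistent records are rejected outright.
    if not seq or len(seq) != len(quals):
        return False
    # Fraction of indeterminate bases in the read.
    n_frac = seq.upper().count("N") / len(seq)
    # Fraction of bases whose quality is at or below the low-quality cutoff.
    low_frac = sum(q <= low_q for q in quals) / len(quals)
    # A read is discarded only when a fraction exceeds its limit.
    return n_frac <= max_n_frac and low_frac <= max_low_q_frac
```

Whether the limits are inclusive is an assumption here; the text only states that reads above them were eliminated.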

RNA-Seq and differential expression analysis
Prior to the differential expression analysis, an extra quality analysis was performed using the FastQC High Throughput Sequence QC Report (version: 0.11.5, www.bioinformatics.babraham.ac.uk/projects/). Version 7.5.2 of the CLC Genomics Workbench software (QIAGEN, Germany) [22] was used for the differential expression analysis. Reads were filtered according to default parameters for Illumina reads, plus a restriction of a 250-300 bp distance between pairs. Reads were trimmed based on their quality with a limit of 0.049 and a maximum of 2 ambiguous nucleotides. In addition, 15 nucleotides at the 5' end of all the reads were first trimmed due to a discrepancy in the percentage of bases according to the FastQC reports. Release 7 of the rice pseudomolecules and genome annotation data was used as a reference. The mapping and distribution of reads across genes were carried out with default parameters. Expression levels were normalized by RPKM (Reads Per Kilobase per Million mapped reads). The differential expression analysis was performed by employing a re-implementation of the "Exact test" of the edgeR Bioconductor package [23] for a two-group comparison with a common dispersion cutoff of 5 and p-values with FDR correction. The significance threshold was set at FDR < 0.1.
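As an illustration of the RPKM normalization mentioned above, the following sketch applies the standard definition (reads per kilobase of gene length per million mapped reads). The function name and the example numbers are hypothetical and are not taken from the data set; the CLC Genomics Workbench computes this value internally.

```python
def rpkm(read_count, gene_length_bp, total_mapped_reads):
    """Reads Per Kilobase of gene per Million mapped reads.

    read_count         -- reads mapped to the gene
    gene_length_bp     -- gene (exon model) length in base pairs
    total_mapped_reads -- total number of mapped reads in the sample
    """
    if gene_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("gene length and library size must be positive")
    # 1e9 = 1e3 (bp per kilobase) * 1e6 (reads per million).
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)
```

For example, a hypothetical 2,000 bp gene with 500 mapped reads in a library of 10 million mapped reads has an RPKM of 25.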
Functional annotation according to Gene Ontology (GO), and the enrichment analysis, were carried out through the CARMO platform (Comprehensive Annotation of Rice Multi-Omics data) [24] (http://bioinfo.sibs.ac.cn/carmo/). The GO terms with FDR < 0.05 were considered enriched.

Nuclear genome DNA extraction and genome sequencing
Nuclear genome DNA was isolated from 2 g of fresh leaves from Gleva, G123 and from the 20 F2 plants showing an early phenotype by following a modified CTAB protocol [25]. A nuclear DNA mixture (Epool), in equal amounts, was prepared from the 20 F2 individuals that presented the same phenotype as the mutant.
The final concentration and quality of DNA were checked with the Qubit™ dsDNA BR Assay Kit (Ref: Q32853) following the manufacturer's instructions and using a Qubit® 2.0 Fluorometer (Life Technologies, USA).
Library construction and genome sequencing were carried out at Novogene Bioinformatics Technology Co., Ltd as follows: the DNA from each sample was cut into fragments of approximately 350 bp, which were used to construct a genomic DNA library using the NEBNext® DNA Library Prep Kit following the manufacturer's instructions. In the next step, the repair of ends, addition of dAMP tails (dA-tailing), and further ligation with the NEBNext adapter were done, and the required fragments (300-500 bp) were enriched by the P5 and P7 indexed oligos. After purification and quality checking, the resulting library was ready for sequencing.
The quality control of the library was first performed with a Qubit® 2.0 Fluorometer and an Agilent® 2100 Bioanalyzer. Finally, real-time PCR (qPCR) was performed to determine the effective concentration of each library. The libraries with an appropriate insert size (~350 bp) and an effective concentration above 2 nM were selected and mixed according to their effective concentration and the expected amount of data to be produced. Paired-end sequencing was performed on the Illumina® sequencing platform with a read length of 150 bp at each end (PE150). The raw data obtained from sequencing were filtered to discard read pairs showing contamination with adapters, or in which indeterminate nucleotides constituted more than 10% of the sequence, or in which low-quality nucleotides (base quality less than 5, Q < 5) constituted more than 50% of the read. DNA sequencing generated 44.7 Gb of clean reads. The sequencing and cleaning process statistics are summarized in S2 Table.

Detection of the mutation
Raw sequencing data were filtered by applying a minimum sequencing quality threshold of 30 in at least 70% of the read length. Those reads that did not fulfill these conditions, as well as their pairs, were discarded. Then reads were mapped using bwa mem with an increased reseeding (-r 1.2) against the reference genome Os-Nipponbare-Reference-IRGSP-1.0 v7 (http://rice.plantbiology.msu.edu/). The resulting BAM files were processed using samtools [26]. The mapping statistics are summarized in S3 Table. Structural variations were detected using LUMPY [27] in the multisample mode with the default parameters by analyzing together the Gleva, G123 and Epool genomes. The resulting VCF was genotyped by SVTYPER [28]. This generated a VCF, including the putative SVs found in any of the three genomes compared to the employed reference genome.
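The read-quality criterion stated at the start of this section (a quality of at least 30 in at least 70% of the read length, with both mates discarded if either fails) can be sketched as follows. This is only an illustration, not the tool actually used: the function names are hypothetical and the quality strings are assumed to use the Phred+33 encoding.

```python
def decode_phred33(qual_string):
    """Convert a FASTQ quality string to integer Phred scores (Phred+33 assumed)."""
    return [ord(ch) - 33 for ch in qual_string]


def passes_quality(quals, min_q=30, min_frac=0.70):
    """True if at least min_frac of the bases have a quality of min_q or more."""
    if not quals:
        return False
    return sum(q >= min_q for q in quals) / len(quals) >= min_frac


def filter_pairs(pairs):
    """Keep only pairs in which BOTH mates pass the quality criterion.

    pairs -- iterable of (quals_mate1, quals_mate2) lists of Phred scores
    """
    return [pair for pair in pairs
            if passes_quality(pair[0]) and passes_quality(pair[1])]
```

Discarding the pair as a unit keeps the two FASTQ files synchronized, which paired-end mappers such as bwa mem rely on.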
Finally, SVs were filtered using the allele balance (AB) reported by SVTYPER; that is, the proportion of reads supporting the variation against the total reads for each sample. The SVs with an AB<0.001 (virtually no reads supporting the variation) for Gleva and an AB>0.99 for G123 and Epool were selected by SnpSift [29]. The selected variations, which were absent in Gleva but present in G123 and Epool, were manually verified with the IGV genome browser to check for false-positives. The statistics of the detected SVs are summarized in S4 Table. RNA-seq and genomic sequencing data for the mutation identification have been deposited at the European Nucleotide Archive (ENA) in the European Bioinformatics Institute (EBI) with the accession number PRJEB37950.
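The allele-balance selection described above was done with SnpSift; the sketch below merely restates the same rule in Python to make the logic explicit. The function name and the dictionary layout are hypothetical, and the AB values are assumed to have been extracted beforehand from the genotyped VCF.

```python
def is_candidate_sv(ab_by_sample,
                    control="Gleva",
                    carriers=("G123", "Epool"),
                    max_control_ab=0.001,
                    min_carrier_ab=0.99):
    """Apply the allele-balance rule to one structural variant.

    ab_by_sample -- dict mapping sample name to its allele balance (AB),
                    or to None when the sample has no AB value
    """
    control_ab = ab_by_sample.get(control)
    # The wild type must show virtually no read support for the variant.
    if control_ab is None or control_ab >= max_control_ab:
        return False
    # The mutant and the early-flowering pool must be essentially homozygous.
    for sample in carriers:
        ab = ab_by_sample.get(sample)
        if ab is None or ab <= min_carrier_ab:
            return False
    return True
```

Requiring AB>0.99 in the pooled sample is what links the variant to the phenotype: because the trait is recessive, every early-flowering F2 plant should be homozygous for the causal mutation.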

Characterization of G123, a rice mutant displaying early flowering and insensitivity to the photoperiod
The G123 mutant was isolated as an early flowering mutant in the field screenings, under natural LD conditions, from a mutant M2 population derived from the irradiation of Gleva, a local temperate japonica cultivar widely grown in Spain. The G123 plants flowered in the field 82 days after sowing (DAS), which was 1 week earlier than the wild-type plants.
Exposure to different photoperiod conditions showed that the G123 plants were insensitive to the photoperiod because the number of hours of light did not affect the heading date. Plants were cultivated in growth chambers under LD (14 h light:10 h dark) and SD (10 h light:14 h dark) conditions, and the heading date was recorded. Under both SD and LD conditions, the G123 plants flowered 53 days after germination, while the Gleva plants flowered 68 days after germination under SD conditions and 89 days after germination under LD conditions (Fig 1A). The G123 plants were consistently shorter and displayed a slightly yellowish color compared to the wild-type plants. They also developed more panicles, but the total grain weight was 25.1% lower (Fig 1B and Table 1).
To determine whether the observed early flowering phenotype was due to a recessive mutation in a single gene, the 365 F2 plants derived from a cross between G123 and Gleva were grown in pots in a greenhouse under natural light conditions in summer, and the heading date was recorded. As seen in Fig 2, the flowering frequencies showed a bimodal distribution. The progeny segregated in a 289:82 ratio for plants with a vegetative cycle longer or shorter than 72 DAS, respectively, which fits a 3:1 segregation ratio (chi-squared test: χ² = 1.8, p = 0.18). This indicated that the early flowering phenotype in G123 is conferred by a single recessive mutation.
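A goodness-of-fit test of this kind can be sketched with the Python standard library alone. The function name is hypothetical, and the sketch assumes a plain Pearson chi-square statistic without continuity correction and one degree of freedom; the text does not say which software or correction was used.

```python
import math


def chi_square_3_to_1(n_late, n_early):
    """Pearson chi-square test of a 3:1 (late:early) segregation ratio.

    Returns (chi2, p_value) for one degree of freedom.
    """
    total = n_late + n_early
    if total <= 0:
        raise ValueError("at least one plant is required")
    observed = (n_late, n_early)
    expected = (0.75 * total, 0.25 * total)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # With 1 degree of freedom the chi-square survival function
    # reduces to erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value
```

Calling the function with the numbers of late- and early-flowering plants returns the statistic and its p-value; a p-value above 0.05 means that the single recessive gene hypothesis is not rejected.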

The expression of the flowering regulatory genes is altered in the G123 mutant
The daily expression pattern of the main genes involved in the photoperiod regulation of flowering in the Gleva and G123 plants reflected their flowering phenotype differences (Fig 3). Hd3a showed higher expression levels in G123 than in Gleva at 4 h after dawn under SD conditions and at 8 h after dawn under LD conditions. The RFT1 expression levels in the G123 plants grown under the SD conditions were lower than in Gleva at the end of the dark period, but under LD conditions, the RFT1 levels were generally lower throughout the photoperiod. Under SD conditions, Ehd1 expression was generally higher in G123 than in Gleva, and increased in the dark phase and decreased in the light phase. Under LD conditions, the levels of Ehd1 mRNA were similar in both varieties, except at 4 h after dawn when Ehd1 expression peaked for G123. Both lines exhibited similar Hd1 expression patterns under both LD and SD conditions, except at the end of the day, when the G123 plants showed lower levels than the wild type. We also analyzed the expression of other genes involved in flowering regulation that modulate Hd1 and Ehd1 expression to some extent (Fig 3). OsGI displayed similar expression patterns in G123 and Gleva under both SD and LD conditions, with expression peaks at 8 h and 12 h, respectively. Ghd7 and Ghd8, both inhibitors of flowering under LD conditions, generally exhibited similar and rather constant expression profiles in both varieties. However, under the SD conditions, both genes showed an expression peak in G123 in the dark phase 12 h after dawn. Under LD conditions, the Ghd7 and Ghd8 expression levels at dawn tended to be lower in G123 than in Gleva. This expression pattern was also observed for Pseudo-Response Regulator 37 (OsPRR37), which codes for a protein with a CCT domain and whose expression is governed by the circadian clock [30]. The expression levels of DTH2 in both G123 and Gleva increased in the dark under SD conditions, but remained low in the mutant under LD conditions.
The OsEARLY FLOWERING3-1 (OsELF3-1) expression was also activated in the dark in both G123 and Gleva under SD, and levels decreased in the daytime, but the expression remained low in G123 under the LD conditions. Hd6 expression remained low in the G123 plants, but exhibited a peak at 0 h in the Gleva plants under both SD and LD conditions (Fig 3). SE13 expression was also analyzed: in Gleva plants, expression levels increased at night and decreased in the daytime under both SD and LD conditions. No expression was observed in the G123 plants (Fig 3).

RNA-seq transcriptome analysis
To study the effect caused by the mutation on the transcriptome, an RNA-seq experiment was performed to detect the genes differentially expressed between G123 and Gleva. G123 and Gleva were exposed for one week to LD conditions, and mRNA was isolated from leaf samples 20 h after dawn, when the expression of the two pivotal regulatory genes, RFT1 and Hd3a, in Gleva and G123 was clearly different. A threshold of 1.5-fold and an FDR <0.1 were set to evaluate the differential gene expression. Following this criterion, the analysis revealed that 116 genes were differentially expressed between both genotypes, of which 62 were up-regulated and 54 were down-regulated in G123 versus Gleva (Table 2). According to Gene Ontology (GO) terms, the classification of the differentially expressed genes in G123 indicated that the mutant showed major alterations in photosynthesis, and in other processes related to the response to light. The functional annotation, as well as the assignment of the functional categories to the 116 genes based on their GO, were carried out with the Comprehensive Annotation of Rice Multi-Omics database (CARMO, http://bioinfo.sibs.ac.cn/carmo).
The biological process classification according to the GO term annotation of the 62 up-regulated genes in G123 revealed that the bulk of these genes were included in the groups related to transport, photosynthesis, light harvesting, and responses to blue, red and far red light (Table 3). Of the genes related to the response to light, at least eight also encoded chlorophyll-binding proteins. The cellular component classification clearly showed that all the GO terms were directly associated with chloroplast or membranes. Thus, the terms related to the organelle were very abundant, e.g. those defined as thylakoid, thylakoid membrane, chloroplast envelope, plastoglobuli, chloroplast stroma or the light-harvesting complex. In particular, the molecular function classification grouped the genes mostly included in the GO terms involving general binding, metal ion binding or chlorophyll binding.
Regarding the down-regulated genes in G123, the biological process classification revealed that the most numerous groups of genes corresponded to the GO terms related to metabolic processes and response to internal stimuli, which included 21 and 10 genes, respectively (Table 4). Similarly, the most relevant group of genes in the cellular component classification was included in the term cytoplasm, while the molecular function classification grouped the genes associated with transferase and transporter activity.
It is worth mentioning that changes in the expression of genes involved in flowering regulation were also observed in G123, as expected for an early flowering mutant. In G123, Hd3a (LOC_Os06g06320), one of the master genes that induce flowering, was up-regulated by 37.9-fold. Similarly, other genes implicated in flowering, such as MADS14 (LOC_Os03g54160) and MADS18 (LOC_Os07g41370), which act downstream of Hd3a, also displayed expression levels that were 20.5- and 3.8-fold higher in G123 (Table 2).
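The selection criterion used in this section (a change of at least 1.5-fold together with an FDR below 0.1) can be expressed as a short Python sketch. The function name is hypothetical, fold change is assumed to be the G123/Gleva expression ratio, and treating the 1.5-fold limit as inclusive is an assumption.

```python
import math


def classify_gene(fold_change, fdr, min_fold=1.5, max_fdr=0.1):
    """Label a gene as 'up', 'down' or 'unchanged' in the mutant.

    fold_change -- mutant/wild-type expression ratio (must be > 0)
    fdr         -- FDR-corrected p-value of the differential expression test
    """
    if fold_change <= 0:
        raise ValueError("fold change must be a positive ratio")
    if fdr >= max_fdr:
        return "unchanged"
    # Working on the log2 scale makes up- and down-regulation symmetric:
    # a ratio of 1/1.5 is as far from 1 as a ratio of 1.5.
    log2_fc = math.log2(fold_change)
    threshold = math.log2(min_fold)
    if log2_fc >= threshold:
        return "up"
    if log2_fc <= -threshold:
        return "down"
    return "unchanged"
```

With this rule a gene such as Hd3a, reported above as 37.9-fold higher in G123, would be labeled "up" provided its FDR is below 0.1.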

Detection of the mutation
Several approaches were adopted to identify the putative gene responsible for the early flowering phenotype of G123. First, the differentially expressed genes identified in the transcriptome analysis were represented on a volcano plot, on which statistical significance, given as the -log10 of the p-value, was plotted against the variation of expression, given as the log2 of the fold change (S1 Fig). Three of the most down-regulated genes, LOC_Os01g72170, LOC_Os01g72130 and LOC_Os01g72120, encoded proteins with glutathione S-transferase function. A fourth gene, LOC_Os01g72100, encoded a calmodulin-like protein. The fifth most down-regulated gene compared to the wild type, LOC_Os01g72090, encoded SE13/OsHY2, a phytochromobilin synthase involved in both the biosynthesis of phytochromes and the response of plants to the photoperiod. Interestingly, these five genes are located in a 29796 bp region in chromosome 1, between positions 41825087 and 41854883. This observation suggests that the downregulation of these five genes may be due to a deletion in that region.
In order to investigate the mutations produced by the irradiation that affect the heading date in G123, we generated an F2 population derived from a cross between Gleva and G123. The nuclear DNA from the leaf samples of Gleva, G123 and from a bulk (Epool) of the 20 F2 plants showing early flowering, similar to the phenotype exhibited by G123, was sequenced and compared. In an attempt to detect variations among the genomes of Epool, G123 and the wild type, an analysis of structural variations (SV) was done with a combination of two programs: LUMPY and SVTYPER [27,28]. The SV search included mutations larger than 50 bp, which comprised deletions, duplications, insertions, inversions, and intra- and inter-chromosomal translocations. Different SV types in the genome sequences of the three DNA samples were detected by comparison with the reference genome, and those found in both G123 and the Epool, but not in Gleva (see Materials and Methods), were selected for further analyses. Ten SVs, consisting of five deletions, four translocations (each represented by two entries in the table) and one inversion, remained after filtering (Table 5). After hand curation in the IGV browser software, only one variation was fully confirmed: a deletion of 33373 bp located in chromosome 1 at position 41822688-41856061 bp. Eight genes were present in this region and are indicated in Table 6. Most of these genes were previously identified in the RNA-seq analysis (Table 2) as the genes that exhibited the highest differential expression between G123 and Gleva. Six of them encoded proteins with glutathione S-transferase activity, and one of the remaining two genes was CML10, which encodes a calmodulin-like protein. Interestingly, the other gene was SE13/OsHY2, a gene involved in the response of plants to light that is, therefore, putatively implicated in the early flowering phenotype observed in the mutant line G123. This also agrees with the lack of SE13 expression observed in the G123 plants (Fig 3).
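The coordinates reported in this section can be cross-checked with a few lines of Python. The helper names are hypothetical; the positions are those given in the text for the confirmed deletion and for the region spanned by the five most down-regulated genes, and lengths are computed as simple end minus start differences, which is how the reported sizes are obtained.

```python
def span(start, end):
    """Length of a region computed as end minus start."""
    return end - start


def contains(outer, inner):
    """True if the inner (start, end) interval lies within the outer one."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]


# Chromosome 1 coordinates as reported in the text.
deletion = (41822688, 41856061)       # deletion confirmed in IGV
gene_cluster = (41825087, 41854883)   # region holding the five down-regulated genes

deletion_length = span(*deletion)
cluster_length = span(*gene_cluster)
cluster_inside_deletion = contains(deletion, gene_cluster)
```

The region that the RNA-seq data pointed to lies entirely within the deletion detected by the structural variation analysis, so the two independent lines of evidence agree.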

Discussion
The rice mutant line G123 was identified in a screening for early flowering plants. In addition to early flowering, the G123 mutant also exhibited photoperiod insensitivity, which suggests that its mutation affects photoperiod-mediated flowering regulation. The analysis of the structural variations in the G123 genome indicates that SE13/OsHY2 is the most probable candidate responsible for the early flowering phenotype. SE13/OsHY2 encodes a phytochromobilin synthase that participates in the last step of the synthesis of phytochromobilin, a chromophore that forms part of the phytochrome structure [19]. Phytochromes participate in photoperiod flowering regulation as they inhibit Hd3a under LD conditions through Hd1, and also repress RFT1 expression by inhibiting Ehd1 [31]. Consequently, plants defective in phytochrome due to a lack of SE13/OsHY2 functionality should flower earlier than plants with functional phytochromes, as observed in G123.
The SE13/OsHY2 gene was first described in X61, a Gimbozu mutant [21], in which a deletion of a single nucleotide in the first exon caused a shift in the reading frame to produce a [...]
The RFT1 levels were higher in Gleva than in G123, probably due to their differences in vegetative cycle duration. Although there are some connections between the two flowering regulatory pathways, Hd3a expression is regulated by Hd1 and Ehd1, while RFT1 expression is regulated by Ehd1 [32]. In regions located at northern latitudes with a temperate climate, varieties with non-functional Hd1 alleles can often be found. As Hd1 is an inhibitor of flowering under LD conditions, in these situations, flowering is governed by Ehd1. This is not the case for Gleva, which carries a functional Hd1 allele, implying that both regulatory pathways are functional in Gleva [6]. This agrees with the fact that the Hd3a expression levels in the G123 mutant are higher than in the wild type, which indicates that Hd3a also promotes flowering in G123. In previous studies by our research group, another mutant that exhibits photoperiod insensitivity, s73, was isolated in an irradiated Bahia variety collection. The identification of a null mutation in SE5, and the analysis of Ehd1 silencing in both Bahia and s73 backgrounds, proved not only that SE5 regulates Ehd1 expression, but also that SE5 confers photoperiodic sensitivity through Hd1 regulation. These results provided direct evidence that phytochromes inhibit flowering by negatively modulating both the Hd1 and Ehd1 flowering pathways [20]. SE5 encodes a heme oxygenase that acts one step upstream of SE13/OsHY2 in the phytochromobilin synthesis pathway to produce the substrate of SE13/OsHY2, a molecular connection that explains why s73 plants present similar alterations to G123. The Hd3a expression in s73 displayed much higher levels than those in the non-mutated parental variety, as in G123 and X61. Moreover, the expression of Ehd1, an Hd3a inducer, also peaked and was much higher than that observed in the non-mutated line at 4 h after dawn under LD conditions, which agrees with the lack of Ehd1 inhibition by phytochromes. This reinforces the hypothesis that Hd3a also induces flowering in the G123 mutant.
The early flowering phenotype of G123 was also reflected in the alteration of the other genes participating in the flowering regulation pathway. HD5/DTH8/Ghd8 codes for a HEME ACTIVATOR PROTEIN 3 (HAP3), a subunit of the CCAAT-box-binding transcription factor complex [33]. It acts as a repressor of flowering under LD conditions, and delays flowering by inhibiting the expression of Ehd1 and, consequently, of Hd3a and RFT1 [34]. Conversely, under SD conditions, Ghd8 induces the expression of these regulators [33]. Ghd8 expression is not affected by Ghd7 or Hd1, which indicates the occurrence of a different genetic pathway in the control of flowering [34]. The Ghd8 and Ghd7 expression patterns in the G123 mutant were similar under LD conditions and their levels were lower than in Gleva, which is in agreement with the observed early flowering in the mutant. As the expression of both genes is activated by light [8], regulation by phytochromes is altered in G123. It is noteworthy that in the mutant, both Ghd7 and Ghd8 presented peaked induction at 12 h after dawn in the dark phase under SD conditions and may, thus, act as inducers of flowering in the SD photoperiod (Fig 3).
ELF3-1 promotes rice flowering under the LD conditions by inhibiting Ghd7 expression [35]. ELF3-1-defective plants exhibit higher levels for the Ehd1, RFT1 and Hd3a expressions under LD [36]. OsELF3-1 might be involved in PhyB-mediated flowering regulation as it has been reported that oself3-1 mutation suppresses the photoperiod-insensitive early flowering of se5 [37]. Furthermore, Ghd7 expression is activated by pulses of light at higher rates in ef7, a mutant defective in ELF3-1, than in wild-type plants [21]. In Arabidopsis, it has been demonstrated that ELF3 interacts directly with PhyB and other proteins to form complexes capable of regulating the gene expression of several flowering regulatory pathway genes [38]. Recently, an interaction between OsELF3 and PhyB has been demonstrated in yeast cells [37]. We have previously shown that Gleva exhibits higher OsELF3-1 levels and lower Ghd7 levels than G123 under LD. Under these conditions, the accumulation of phytochromes is greater due to the number of light hours in accordance with lack of phytochrome action in G123.
Hd6 encodes an α-subunit of protein kinase CK2 (CK2α) and requires a functional Hd1 gene to perform its function, acting independently of circadian clock mechanisms [39,40]. In our analysis, we observed that Gleva displayed higher Hd6 expression levels when Hd1 expression peaked, which occurred at the end of the dark period under both SD and LD conditions. This observation agrees with the flowering times of both Gleva and the mutant. Finally, Days to heading on chromosome 2 (DTH2) encodes an Hd1-like protein that induces Hd3a and RFT1 expression by acting independently of Hd1 and Ehd1. The circadian clock regulates DTH2 expression [41], which increased in the dark phase of the day under SD and LD conditions in both G123 and Gleva, with lower levels in G123 than in Gleva.
The phenotypic data and the transcriptome analysis of the G123 mutant indicated that the deletion detected in SE13/OsHY2 was very likely responsible for the altered phenotype of G123. Consistent with the defect in phytochrome content, the transcriptome analysis revealed that photosynthesis and other processes related to the response to light were extensively altered in G123. In the G123 plants, not only was a major group of genes corresponding to transport and photosynthesis up-regulated, but so were other genes involved in the response to light and related to different chloroplast elements, such as thylakoids or stroma. Therefore, the role of SE13/OsHY2 in the synthesis of phytochromes and its function in flowering regulation could explain the phenotype observed in the mutant.
In the last few years, several methodologies based on whole genome sequencing have been developed to detect the mutations responsible for altered phenotypes. In our case, in order to identify the mutation responsible for the early flowering phenotype exhibited by G123, we used a structural variation (SV) detection pipeline that combines two programs, LUMPY and SVTYPER [27,28], which were initially developed to detect such variations in human genomes. By using these tools in conjunction with a pooled F2 generation, we avoided having to grow several generations of plants, with the time that this entails. It is worth mentioning that our attempts to identify the G123 mutation with another method, MutMap, which was developed to identify single nucleotide polymorphism (SNP) mutations [42], were not successful because it is restricted to detecting SNP-type mutations. In contrast, the methodology used herein allowed the detection of a deletion of 29.8 kb, which is most probably responsible for the observed phenotype. The combination of SV detection and a pooled F2 is a novel methodology for detecting mutations in plants. It generates only a few false positives, enables easy hand curation, and offers the possibility of reducing the time spent identifying mutations.
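As an illustration of the hand-curation step, the following minimal Python sketch filters genotyped SV calls from a pooled sample for deletions that are nearly fixed in the pool. It is not the pipeline used in this study. It assumes a VCF-like input with SVTYPE, SVLEN and END entries in the INFO column and per-sample AO/RO (alternate/reference observation) counts, that the pool was built from F2 plants showing the mutant phenotype, and that the causal mutation is recessive. The function names, the thresholds and the toy records are hypothetical.

```python
from typing import Dict, Iterable, Iterator


def parse_info(info: str) -> Dict[str, str]:
    """Split a VCF INFO column ("A=1;B=2;FLAG") into a dictionary."""
    parsed = {}
    for item in info.split(";"):
        key, _, value = item.partition("=")
        parsed[key] = value
    return parsed


def candidate_deletions(
    vcf_lines: Iterable[str],
    min_alt_fraction: float = 0.9,  # hypothetical threshold, not from the study
    min_reads: int = 10,            # hypothetical threshold, not from the study
) -> Iterator[dict]:
    """Yield deletions whose alternate-allele fraction in the pool is high.

    Assumes one sample column (the pooled F2) and that the FORMAT column
    contains AO and RO read-support counts.
    """
    for line in vcf_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header and empty lines
        fields = line.rstrip("\n").split("\t")
        info = parse_info(fields[7])
        if info.get("SVTYPE") != "DEL":
            continue  # keep deletions only
        sample = dict(zip(fields[8].split(":"), fields[9].split(":")))
        alt_reads, ref_reads = int(sample["AO"]), int(sample["RO"])
        total = alt_reads + ref_reads
        if total < min_reads:
            continue  # too little support to estimate a fraction
        fraction = alt_reads / total
        if fraction >= min_alt_fraction:
            yield {
                "chrom": fields[0],
                "start": int(fields[1]),
                "end": int(info["END"]),
                "length": abs(int(info["SVLEN"])),
                "alt_fraction": fraction,
            }


# Toy records with made-up coordinates, for illustration only.
EXAMPLE_VCF = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tpool",
    "chrA\t1000\t1\tN\t<DEL>\t.\t.\tSVTYPE=DEL;SVLEN=-5000;END=6000\tGT:AO:RO\t1/1:40:1",
    "chrB\t2000\t2\tN\t<DEL>\t.\t.\tSVTYPE=DEL;SVLEN=-300;END=2300\tGT:AO:RO\t0/1:12:14",
    "chrC\t500\t3\tN\t<DUP>\t.\t.\tSVTYPE=DUP;SVLEN=800;END=1300\tGT:AO:RO\t1/1:30:0",
]

if __name__ == "__main__":
    for hit in candidate_deletions(EXAMPLE_VCF):
        print(hit)
```

In this toy input only the first record passes: the second deletion segregates in the pool and the third call is not a deletion. Filtering on read counts rather than on the called genotype is a design choice for pooled data, in which the sample is a mixture of individuals and not a single diploid plant.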

Conclusions
This manuscript reports the generation and identification of a mutant, G123, that displays an early flowering phenotype. The structural variation proposed as responsible for the mutant phenotype was identified by an analysis technique that combines LUMPY and SVTYPER [27,28] in conjunction with a pooled F2 population. This approach suggests that SE13/OsHY2, a gene encoding a phytochromobilin synthase implicated in phytochrome chromophore biosynthesis, is the candidate gene for the altered phenotype of the mutant. The expression analysis of the major flowering regulatory genes indicated that, in the absence of functional phytochromes, flowering in the G123 mutant was governed mainly by Hd3a rather than by RFT1 under LD conditions. We also showed that the G123 transcriptome presents major alterations in the expression of a group of genes involved in both photosynthesis and the light response. The SE13/OsHY2 gene is proposed as an interesting donor in breeding programs to reduce the vegetative cycle of elite varieties.

Supporting information
S1 Fig. Volcano plot analysis of the differentially expressed genes in G123. Statistical significance, given by the -log10 of the p-value (ordinate axis), is plotted against the variation of expression given by the log2 of fold change (abscissa axis) for each gene. (TIF) S1