Reference gene selection for molecular studies of dormancy in wild oat (Avena fatua L.) caryopses by RT-qPCR method

Molecular studies of primary and secondary dormancy in Avena fatua L., a serious weed of cereal and other crops, are intended to reveal the species-specific details of underlying molecular mechanisms which in turn may be useable in weed management. Among others, quantitative real-time PCR (RT-qPCR) data of comparative gene expression analysis may give some insight into the involvement of particular wild oat genes in dormancy release, maintenance or induction by unfavorable conditions. To assure obtaining biologically significant results using this method, the expression stability of selected candidate reference genes in different data subsets was evaluated using four statistical algorithms i.e. geNorm, NormFinder, Best Keeper and ΔCt method. Although some discrepancies in their ranking outputs were noticed, evidently two ubiquitin-conjugating enzyme homologs, AfUBC1 and AfUBC2, as well as one homolog of glyceraldehyde 3-phosphate dehydrogenase AfGAPDH1 and TATA-binding protein AfTBP2 appeared as more stably expressed than AfEF1a (translation elongation factor 1α), AfGAPDH2 or the least stable α-tubulin homolog AfTUA1 in caryopses and seedlings of A. fatua. Gene expression analysis of a dormancy-related wild oat transcription factor VIVIPAROUS1 (AfVP1) allowed for a validation of candidate reference genes performance. Based on the obtained results it can be recommended that the normalization factor calculated as a geometric mean of Cq values of AfUBC1, AfUBC2 and AfGAPDH1 would be optimal for RT-qPCR results normalization in the experiments comprising A. fatua caryopses of different dormancy status.


Introduction
Seed dormancy is an innate seed property responsible for the temporal suppression of germination despite suitable conditions [1,2]. Still on the mother plant, seeds acquire a state of primary dormancy which after shedding, depending on external factors (e.g. temperature, light, different chemicals), decreases gradually or is abruptly lost to allow seed germination [1]. Seeds of many plant species require variable periods of dry after-ripening or moist chilling to become non-dormant. Either in primary dormant seeds or seeds that fully or partially lost PLOS  tissues [36][37][38][39][40], there is a need for re-validation of reference genes for analyses dedicated to wild oat caryopses. Thus, in the present study, a qRT-PCR protocol based on SYBR reagent was used for the identification of genes with minimal expression variation during primary dormancy release and thermodormancy induction in A. fatua caryopses. Seven homologs of genes traditionally used as internal controls [33,41] were selected as candidate reference genes for evaluation, including translation elongation factor (EF1α), α-tubulin (TUA) and TATA-binding protein (TBP) as well as two homologs of glyceraldehyde 3-phosphate dehydrogenase (GAPDH) and ubiquitin-conjugating enzyme (UBC). The expression stability of these genes was assessed by four different statistical algorithms to select the most suitable internal control. The validation of the optimal reference gene was performed on dormancy-related AfVP1 gene. Our work was aimed to assist future studies on molecular basis of seed dormancy induction, maintenance, and release in wild oat as a contribution to the on-going research focused on characterization of functional networks of seed dormancy-associated genes.

Ethics statement
Before entering the crop field, an oral permission for the collection of caryopses was obtained from the land owner. No other permits or approvals for the collection of plant material used in the studies were required because A. fatua is not endangered or protected species in Poland. It is not mentioned in any of the relevant documents [42][43][44]. On the contrary, as a dangerous weed species, it is subject to different weed control programs, usually including herbicide application.

Plant material
Ripe caryopses of Avena fatua L. were obtained from plants naturally growing in Poland. Spikelets were collected in crop fields near Szczecin in July 2011. After collection, they were dried at room temperature for 7 days to a constant moisture content (~11%). Afterwards, one part of harvested spikelets was immediately stored at -20˚C to maintain primary dormancy of caryopses, the other part was subjected to after-ripening, i.e. dry-storage at 25˚C for six months. Thus released from dormancy caryopses were then also kept at -20˚C. In the experiments, only dehulled caryopses were used so both lemma and palea were removed by hand. The expression stability of putative reference genes was analyzed in primary dormant (PD) and non-dormant (ND) A. fatua caryopses, either dry or subjected to different temperatures or application of plant growth regulators during imbibition. For comparison with the vegetative tissues, also samples of leaves and roots of eight day old seedlings were examined.
PD or ND caryopses, 25 per sample, were incubated in darkness, in 6-cm Petri dishes on one layer of filter paper (Whatman No. 1) moistened with 1.5 mL deionized water or a solution of applied plant growth regulator, karrikinolide KAR 1 (3x10 -9 M) or gibberellin A 3 (10 −5 M). The temperatures of incubation were 20, 30 or 35˚C, of which the latter two were found as supraoptimal temperatures (SOT) inducing thermodormancy in PD and/or ND caryopses. Samples were collected at three or four time points (hour 8, 24, 36 for all combinations and 41 or 96 for selected treatments) during imbibition before the completion of germination that is before radicle protrusion through colleorhiza.
In total, samples of eleven treatments of ND and seventeen treatments of PD caryopses (differing in incubation time, temperature, solution as indicated in S1 Table) as well as two types ofvegetative organs (leaf and root) were obtained, each of them in three biological replicates. All collected samples were snap-frozen in liquid nitrogen and stored at -80˚C until used.

Total RNA preparation, quality control and cDNA synthesis
Total RNA from caryopses was extracted according to the protocol described by Oñate-Sánchez and Vicente-Carbajosa [45] for seeds and tissues with a high content of polysaccharides, with some modifications. Each sample of plant material comprised 25 caryopses which were ground in a mortar using liquid nitrogen. Ca. 80 mg of powdered tissue was mixed with 1 ml of the extraction buffer (0.2 M Tris pH:8, 0.4 M LiCl, 25 mM EDTA, 1% SDS), 0.5 ml chloroform and 0.1 ml Plant RNA Isolation Aid (Ambion). The extraction mixture was incubated on ice for 10 min and centrifuged (14,000 rpm, 15 min, 4˚C). RNA was then purified from the lysate with water-saturated acidic phenol (pH:4.5) and chloroform. Afterwards, RNA was precipitated overnight at 4˚C from aqueous phase through addition of 1/2 volume of 5 M LiCl and spinned for 30 min at 4˚C. The pellet was then washed with 75% cold ethanol.
The isolation of total RNA from seedlings (leaf/root separately) was conducted from 100 mg of frozen tissue homogenized in 1 ml Tri Reagent solution (Sigma-Aldrich). After homogenization, samples were incubated at room temperature for 5 min, vortexed and centrifuged (14,000 rpm, 10 min, 4˚C) to remove the insoluble material. For phase separation, 100 μl of BCP (1-bromo-3-chloropropane) was added to the supernatant, mixed by inversion and kept at room temperature for 5 min followed by centrifugation (14,000 rpm, 15 min, 4˚C). RNA from the aqueous phase was precipitated with isopropanol and washed with ethanol.
Total RNA, isolated either from caryopses or seedlings, was suspended in 25 μl of RNasefree water and subjected to quantitative and qualitative control. The RNA concentration of each sample was estimated using a BioSpec-Nano micro-volume UV-Vis Spectrophotometer (Shimadzu Scientific Instruments) and RNA integrity was assessed by 2% agarose gel electrophoresis through visualisation of the two ribosomal subunits. Potential genomic DNA contamination was eliminated by DNase I (Ambion) treatment of all RNA samples. They were suspended in 25 μl nuclease-free water and their concentration, purity and integrity were rechecked. Only the RNA samples with A260/A280 ratios between 1.8 and 2.1 and A260/A230 ratios greater than 2.0 were further used. First-strand cDNA of each sample was synthetized from 1 μg of DNase-treated total RNA in a 20 μl reaction volume using the High-Capacity cDNA Reverse Transcription Kit (Thermo Fisher) according to the manufacturer's protocol. cDNA samples were ten times diluted with nuclease-free water and used for RT-qPCR.

Candidate reference gene selection and primer design
Candidate genes were selected based on their previous description as good plant internal control genes for qRT-PCR analysis, including seeds, in a number of species [10,40,41,[46][47][48]. Homologous sequences of these selected genes were retrieved from our A. fatua RNA-seq assembled transcriptome dataset (S1 File) by sequence search using cDNA sequences of homologous genes identified in different species of Poaceae family performed in Geneious R6 software (Biomatters Ltd.) (S2 Table). Strict parameters were used to identify the BLASTn hit with greatest homology and included: the full/partial transcript being >400 bp in length; an Escore <1e -20 , and greatest homology the query sequence according to the Bit score and pairwise identities for the transcript and deduced amino acid sequences.
The qRT-PCR primers for reference gene candidates, except for AfTBP2, and also for the target gene AfVP1, were designed with Primer3: WWW primer tool [49], using the following parameters: amplicon length around100 bp; primer size between 18 and 22 bp; melting temperature (Tm) between 57 and 61˚C; GC content between 40 and 60%. Amplification primers used for TBP were adopted from the published sequence data of Wrzesińska et al. [10]. The sequences of qRT-PCR primers with the characteristics of their corresponding amplicons are listed in Table 1 and S3 Table.

RT-qPCR conditions and primer specificity
Real-time reactions were performed using SYBR Select Master Mix (Applied Biosystems) as follows: 1 μL of cDNA diluted 10 times, 5 μL of the mix and 0.2 μM of each primer, in a final volume of 10 μL. The StepOne system (Applied Biosystems) was employed and PCR cycling consisted of the following steps: initial 2 min at 95˚C for pre-denaturation, followed by 40 cycles of 15 s at 95˚C for denaturation and 1 min at 60˚C for annealing and extension. Afterwards, the final extension was performed at 60˚C for 1 min and was followed by melting curve analysis using default parameters in order to verify the PCR specificity by constant increase in temperature from 60˚C to 95˚C, at increments of 0.5˚C. Three biological replicates per treatment/organ type were analyzed with each PCR reaction done in triplicate.

Primer amplification efficiency calculation and Cq determination
Efficiency of each primer pair and the quantification cycle (Cq) determination were done by LinRegPCR method [50]. Cq value is the amplification cycle number at which the fluorescence from amplification reaches a set threshold of signal level [51]. Raw fluorescence data generated with the StepOne software (Applied Biosystems) was used for these calculations. LinRegPCR calculates individual efficiency values for each RT-qPCR reaction and averages them to obtain mean efficiency value for each gene. After running the LinReg algorithm, Cq values were transferred as a Microsoft Excel file (Microsoft, Redmond, WA) for further gene expression stability analysis.

Analysis of stability of candidate reference genes
Expression stability of the seven potential internal control genes across all treatments and samples in total and in subsets were determined using four statistical algorithms: geNorm [52], NormFinder [53], BestKeeper [54] and the comparative ΔCt method [55] following developer's instructions. The Cq values of all reference genes used in the geNorm and NormFinder were converted into relative quantities by ΔCq method based on the efficiencies of primers. The geNorm algorithm was used to rank the reference genes by calculating their expression stability value M determined as the standard deviation of the logarithmically transformed expression ratios of a particular gene and all other candidate genes, with 1.5 as a recommended cut-off threshold value. The lower M value was attributed to the particular gene, the more stable it was considered. Additionally, using the geNorm software the pairwise variations (Vn/Vn +1) was calculated in order to determine the minimum number of reference genes for optimal normalization. An expression stability value (M) for each candidate reference was calculated with NormFinder software, based on the inter and intragroup variance estimation approach. Three main measures offered by BestKeeper software: the standard deviation (SD) of the Cq of all samples for each analyzed gene, the coefficient of variation (CV) of a potential reference gene and the correlation coefficient with the BestKeeper index (r) were used to evaluate gene expression stability. Since the BestKeeper program does not indicate which of them is more reliable nor has more weight to select the best reference gene, we used the approach described by Olias et al. [56]. First, each candidate reference gene was evaluated for the three measures (SD, CV and r) separately, then the mean of the rankings was calculated to determine the final rank of each gene. In the comparative ΔCt method, candidate reference genes were ranked due to the mean standard deviation of ΔCt in analyzed samples.

Validation of reference genes
To validate the choice of the optimal reference genes in our experimental system, their performance as single or combined normalization factors in qRT-PCR analysis of a target gene was assessed. The expression levels of AfVP1 gene were normalized by the most stable reference genes (according to the expression stability rankings provided by four different statistical algorithms). Samples were collected from ND and PD caryopses, either dry or incubated in water at 20˚C for 8, 24 or 36 hours. The relative expression ratios were calculated using the efficiency corrected model [54]. The final value of relative quantification was described as fold change of gene expression in the test samples (ND caryopses) compared to control (PD caryopses) which served as a calibrator with set value of 1 at each time-point. Data were expressed as mean ± SD of three biological replicates for each treatment.

Candidate reference gene selection and sequences identification
The selection of genes which could serve as reference genes for normalization in qRT-PCR gene expression analysis in A. fatua was based on literature recommendations. Since only very limited wild oat genome sequence information was available, the A. fatua transcriptome dataset was queried for the chosen (EF1α, GAPDH, TUA, UBC) candidate genes using the publicly available gene sequences from Avena sativa, Hordeum vulgare and Brachypodium distachyon, all belonging to Poaceae family. The longest homologous sequences of A. fatua showing the highest similarity to the query sequences were selected as targets for qRT-PCR analysis ( Table 1, S2 Table). There were, however, also other sequences retrieved for each of the candidate gene indicating the existence of multigene families. In two cases, namely GAPDH and UBC probable gene families, two paralogs were appointed as candidate reference genes and their expression stability was subsequently assessed. Whenever possible, the retrieved sequences were compared with partial coding sequences of A. fatua recently submitted to NCBI database. The results of the pairwise alignment of these transcripts revealed some sequence discrepancies (Table 2). Homolog AfEF1a (CL3197 con-tig1) represented nearly complete coding sequence missing only 7 codons at 3' end, whereas its counterpart from GenBank (KT153026) was shorter. There was a 94.6% pairwise identity between them. Two homologues transcripts, AfGAPDH1 (Unigene36693) and AfGAPDH2 (Unigene36695) comprised full or nearly full (only one codon missing) coding sequences if compared to the reference GAPDH sequences from H. vulgare. They showed 99.3% and 83.2% identity with the partial AfGAPDH cds (KT153027), respectively. Out of several retrieved homologous transcripts of TUA, the longest was Unigene36141 designated as AfTUA1 and further analyzed for its expression stability. AfTUA1 showed only 79.3% pairwise identity with the partial AfTUA transcript available in NCBI database (KT153029). Higher identity of 96.7% was observed between KT153029 and Unigene2539, confirming the presence of the second AfTUA homolog.
The use of a partial AfTBP sequence (KT153028) in the search of homologs in our NGS transcriptome dataset revealed the presence of even eight AfTBP paralogs (S2 Table). The KT153028 query had the highest identity of 97.5% with CL4896 contig1 designated AfTBP2 due to its similarity to Oryza sativa TBP2 gene (AF464907), while 83.1% with the longest transcript CL3311 contig2, designated AfTBP1 (Table 3). Only the expression stability of AfTBP2 was further analyzed for the sake of comparison with previous results of Wrzesińska et al. [10]. The sequence of the target gene AfVP1 (AJ001140) had 98.6% pairwise identity with the transcript found in the NGS transcriptome dataset (S2 Table).

Primer specificity, amplification efficiency and gene expression profiles
The gene specific primers were designed to accurately amplify the candidate reference genes using qRT-PCR. The melting temperatures (Tm) of all PCR products ranged from 78.14˚C for AfTBP2 to 83.94˚C for AfTUA1 (Table 1). All the primer pairs amplified a single PCR product of the expected size, which appeared as one peak on the melting curve (S1 Fig). According to Taylor et al. [57], amplification efficiencies within the range of 1.8-2.0 can be considered ideal for qRT-PCR and R 2 values greater than 0.98 indicate that the efficiencies are reliably determined. In our analyses, the PCR efficiency of each of the qRT-PCR primer pairs met these standards. All primer pairs had mean efficiencies between 1.807 and 1.883, therefore sufficient for qRT-PCR analysis (Table 1). Also, very high values of the coefficents of determination exceeding 0.999, implied high accuracy of efficiency estimation.
Since all qRT-PCR reactions were performed with an equivalent amount of template cDNA, transcript abundance of these genes in different samples was estimated by direct comparison of Cq values (Fig 1). Over all samples, the Cq values of the seven candidate reference genes ranged from 17.71 to 29.61 ( Fig 1A). Among them, AfEF1α was the most abundantly expressed in all of the samples (mean Cq ± SD = 20.12 ± 1.21) followed by two AfGAPDH paralogs, AfGAPDH1 (mean Cq ± SD = 21.06 ± 0.76) and AfGAPDH2 (mean Cq ± SD = 22.53 ± 1.51), whereas AfTBP2 had the lowest transcription level (mean Cq ± SD = 26.68 ± 0.80). AfTUA1 was close in Cq value but substantially differed in SD value (mean Cq ± SD = 25.99 ± 1.78) in comparison to AfTBP2. The expression of AfTUA1 was the most variable among all candidates tested, with the widest Cq range of 9.46 cycles. Lower discrepancies in Cq values were detected for AfEF1α and AfGAPDH2 (5.64 and 6.84 cycles respectively). The variability of four other genes'expression levels were quite even, showing differences of ca. 4.5 cycle. A considerable contribution to the differences in expression levels of most analyzed genes came from seedling samples ( Fig 1B). As soon as they were omitted from the calculations, a substantial decrease (by more than 1 cycle) in Cq range was observed, especially for AfGAPDH1, AfTUA1, AfUBC1 and AfUBC2. Only AfEF1α and AfTBP2 remained then unaffected. In comparison, when Cq value from dry caryopses where excluded, a significant decrease in expression level range was observed for AfEF1α, AfTBP2 and also AfTUA1, but not for AfGAPDH1, AfUBC1 and AfUBC2 (Fig 1C). If only Cq values from imbibed caryopses were analyzed, both the Cq range and SD achieved minimal values (Fig 1D). Either standard deviation or the range of Cq values can reveal the expression stability of candidate reference genes, however more robust stability ranking was obtained using the four different computational programs.

Expression stability of candidate reference genes
In order to perform the comprehensive expression analysis of candidate reference genes and choose an appropriate normalizing factor for different experiments, the statistical analyses were not only conducted in the overall dataset comprising all biological samples but also in five subsets (sample groups) distinguished due to the initial dormancy status of caryopses and imbibition conditions ( Table 3). The Cq values for each candidate reference gene were used for stability comparison.
In geNorm analysis none of the candidate reference genes had M value higher than 1.5 thus exhibiting a relative stability (Fig 2). When the results from all samples of A. fatua caryopses and seedlings were combined (Total dataset), AfGAPDH1 and AfUBC2 had the highest expression stability with M value of 0.387, closely followed by AfUBC1 (M = 0.427). In most data subsets, these three reference genes were also ranked at the top, though flipping sometimes in their positions. Only in the subset NC+SOT, a sample group comprising ND caryopses subjected to SOT treatments, the pair of highest stability indicated by geNorm was AfGAPDH1 and AfTBP2 (M = 0.314), while AfUBC2 and AfUBC1 occupied the 3 rd (M = 0.426) and 4 th (M = 0.485) position respectively. Two reference genes were indicated by geNorm analysis as the most unstable, either AfTUA1 or AfGAPDH2, depending on the data subset.
The results of the pairwise variation measure showed that generally in all experimental sample groups, the V2/3 value was very close to 0.15 (Fig 3) which is a commonly accepted cut-off threshold [52]. In the Total dataset and two data subsets (NC+S, PC+KG), the pairwise variation increased for V3/4 and V4/5, confirming that the two top-ranked candidate reference genes would be sufficient as the normalization factor for these sample groups. For three other data subsets i.e. NPC, NPC+SOT and NC+SOT, the variation value V3/4 was lower than V2/3, which was nearly as high or even slightly exceeding the 0.15 threshold. Such results suggested that in their case three rather than two top ranked genes should be included for calculation of a reliable normalization factor.
In the Total dataset, AfUBC1 (M = 0.286) was appointed as the most stable gene by Norm-Finder (Table 4). Next two positions in the stability ranking were assigned to AfUBC2 (M = 0.314) and AfTBP2 (M = 0.327). So, two out of three geNORM top-ranked genes appeared as highly stable in NormFinder algorithm. Unexpectedly, AfGAPDH1 was ranked only at the 5 th position, but together with AfGAPDH2, gave the optimum combination. Interestingly, both AfGAPDH paralogs were individually assessed as rather unstable, and yet, when paired, exhibited a low variation value of 0.155. In the data subsets, either AfTBP2 (NPC, NC +S, PC+KG) or AfUBC2 (NPC+SOT, NC+SOT) were indicated by NormFinder as the most stable (M value between 0.2 and 0.3). Thus AfTBP2 and two AfUBC paralogs appeared as the three top-ranked in stability by NormFinder, although in PC+KG data subset, AfUBC2 was assigned only 6 th position. Similarly as in geNorm analysis, usually AfTUA1 and AfGAPDH2 were at the bottom of stability rankings in NormFinder. The stability rankings for candidate reference genes in different data subsets were calculated on the basis of three measures generated by BestKeeper program (SD, CV and r) ( Table 5). Similarly to geNorm, in the Total dataset and most of the subsets, the genes of highest stability assessed by BestKeeper, were AfUBC2, AfUBC1 and AfGAPDH1. Some discrepancies between BestKeeper (Table 5) and geNorm (Fig 2) concerned stability evaluation of AfGAPDH1 and AfTBP2 in two data subsets comprising only ND caryopses. The ranks of these two genes assigned by BestKeeper, 1 st place in NC+S and 3 rd place in NC+SOT for AfTBP2, while 4 th and 5 th for AfGAPDH1, respectively, repeated the results from NormFinder ( Table 4). The least stable genes assessed by BestKeeper were also AfTUA1 and AfGAPDH2.  Pairwise variation (V) analyses of the candidate reference genes. The pairwise variation (V n/n+1 ) was analyzed for the normalization factors NFn and NFn+1 by the geNorm program to determine the optimal number of reference genes for accurate normalization. The cutoff value was proposed to be 0.15, below which the inclusion of an additional reference gene is not necessary. For dataset symbols, refer to Table 3. https://doi.org/10.1371/journal.pone.0192343.g003

Table 4. Candidate genes expression stability value (M) of single and best combination of two genes estimated by NormFinder algorithm in different datasets.
For dataset symbols, refer to Table 3. Based on comparative ΔCt method, the highest stability in the whole dataset showed AfUBC1 (SD = 0.723), AfUBC2 (SD = 0.752) and AfGAPDH1 (SD = 0.790), while the lowest AfGAPDH2 and AfTUA1 with SD of 0.915 and 1.065, respectively (Table 6). In most data subsets, two paralogs of UBC were ranked as any of the top three accompanied either by AfGAPDH1 or AfTBP2. Only in the data subset comprising PD caryopses treated with plant growth regulators (PC +KG), AfUBC2 was ranked quite low (SD = 0.755; 5 th place) and the three most stable genes were AfUBC1 (SD = 0.0,687), AfTBP2 (SD = 0.690) and AfGAPDH1 (SD = 0.693).

Relative expression of AfVP1 gene
There was no absolute agreement between the four statistical algorithms as to which of candidate reference genes or their combination could be considered as the most appropriate for SD-standard deviation, CV-coefficient of variation, r-coefficient of correlation. Ã Same ranking for a given dataset https://doi.org/10.1371/journal.pone.0192343.t005 Reference genes for wild oat dormancy molecular studies qRT-PCR analyses in A. fatua dormancy/germination studies. Therefore, for NPC data subset, we compared the performance of single and multiple normalization factors suggested as optimal by each stability assessment method (Fig 4). The relative expression levels of the target gene (AfVP1) were calculated at distinct time-points during imbibition of ND in parallel with PD caryopses. The overall observation of AfVP1 expression profiles revealed that there was not much difference in its transcript abundance between dry ND and PD caryopses. It was also not much regulated during the beginning of ND caryopses germination process up to 24 h of incubation. Only 2 hours before radicle protrusion through coleorhiza (36 h of incubation), there was observed a downregulation of AfVP1 by about twofold. The results obtained through normalization using candidate reference genes indicated as the most suitable by GeNorm (combination of three genes: AfGAPDH1, AfUBC1, AfUBC2) (Fig 4A), NormFinder (combination of two genes: AfEF1a, AfGAPDH1) ( Fig 4C) as well as BestKeeper and ΔCt method (single gene: AfUBC1) (Fig 4D) gave generally similar patterns of AfVP1 gene expression. However, if the normalization was conducted with the AfTBP, the most stable single gene according to NormFinder, the expression profile of AfVP1 was quite different (Fig 4B). In that case, there was unexpectedly higher expression of AfVP1 in ND caryopses, as compared with PD ones, at the beginning of incubation (8h) and only slightly reduced shortly before the completion of germination (36h).

Discussion
Comparative studies of the dormancy-related gene expression in A. fatua caryopses of different dormancy status, employing a powerful method of RT-qPCR, may provide insights into molecular mechanisms of the dormancy induction, maintenance and release as well as germination process per se. Therefore, several experiments were planned and performed to obtain reliable results which would facilitate further work on the expression profiles of the genes of interest involved in those physiological phenomena in A. fatua. Although there are several publications dedicated to reference genes used in seed biology studies [40,47,48,58,59,60], to our knowledge, this is the first report concerning reference gene selection and validation for that specific developmental stage of A. fatua life cycle comprising dormant and germinating caryopses.
Since, among other factors, dormancy mechanisms contribute to persistence of the soil seed banks [61], it seems vital to reveal its mechanisms in a serious weed species like wild oat in order to enable improvements in integrated weed-control strategies. Taking this practical aspect into account, the investigations were focused on the issue of hormonal regulation of primary and secondary dormancy in natural populations of A. fatua, thus field-collected batches of caryopses were used in the studies. If biologically significant results are to be obtained from RT-qPCR gene expression analyses, the raw data must be normalized to alleviate the influence of technical variation which is usually accomplished by the parallel quantification of endogenous reference genes [33,34,52,62]. For a good reference gene it is expected that its expression variation should not arise from the regulation by experimental conditions, but reflect the errors produced by the sample processing steps. Therefore, in the present study, a SYBR green-based RT-qPCR assay was carried out on several candidate reference genes (AfEF1α, AfGAPDH1, AfGAPDH2, AfTBP2, AfTUA1, AfUBC1 and AfUBC2) using different samples of A. fatua in order to identify the most stable ones. The stability assessments were conducted on the whole dataset in order to compare different plant organs: seeds (caryopses) v. seedlings (vegetative organs/tissues) and also in data subsets which matched different experimental setups focused on: i. the germination of ND caryopses up to seedling stage (NC+S) or in comparison to PD caryopses (NPC); ii. the influence of supraoptimal temperature on ND caryopses (NC+SOT) also in comparison to PD caryopses (NPC+SOT); iii. the release of PD by KAR 1 or GA 3 (PC+KG) ( Table 3, S1 Table).
Vast experimental evidence leads to the conclusion that there are no genes that are constantly expressed throughout the different stages in the plant's lifecycle [41]. The levels of expression and thus particular transcripts abundance in different organs and developmental stages may substantially differ, which was also observed when comparing caryopses and seedlings of A. fatua (Fig 1). The expression of four out of seven candidate reference genes (AfGAPDH1, AfTUA1, AfUBC1 and AfUBC2) was induced after germination. In our experimental system, comprising dry and imbibed caryopses additional factor might have affected gene expression and thus hinder the finding of an appropriate reference gene. That is the dryto-hydrated (inactive-to-active) state transition being a part of the germination process [63]. We did observe a fluctuating expression levels of some candidate reference genes i.e. AfEF1α, AfGAPDH2, AfTBP2 and AfTUA1 which could be specifically attributed to the change in the moisture content and metabolism activation in A. fatua caryopses (Fig 1). Similar concerns were addressed by Chen et al. [58] who suggested that most house-keeping genes, traditionally chosen as reference genes for normalization in qPCR analyses, were influenced by the "hydration bias".
The results of four widely accepted statistical algorithms employed for gene stability evaluation in A. fatua, did not distinguish one universal most stable gene to serve as an internal control in all of our experimental layouts. However, starting with a whole dataset, whichever the method used, at least one of two homologs of AfUBC with either AfGAPDH1 or AfTBP2 were appointed as highly (1 st to 3 rd rank) stable genes (Fig 2, Tables 4-6). In comparison, AfGAPDH2 and AfTUA1 were indicated as least stable, occupying from 5 th to 7 th place in the rankings. In most data subsets, a similar tendency in the stability of candidate genes was observed, although some discrepancies in results returned by different algorithms appeared. The differences in stability estimation due to the applied methods have been also observed elsewhere [10,59,62,64,65] and mainly attributed to differences in the type of input data (raw v. normalized Cq) and the parameters upon which the rankings are based. In several studies, the problem of the disagreement between the four methods has been solved by ranking the candidate reference genes either according to the geometric mean of the ranking numbers calculated e.g. by a web-based tool RefFinder [10,57,66] or the consensus reached with a crossentropy Monte Carlo algorithm offered by RankAggreg, an R package for weighted rank aggregation [67,68]. Robledo et al. [65] criticized the motion of selecting the best reference gene(s) due to such consensus rankings as having no biological meaning and recommended, instead, relying rather on assessment of the two most important and complementary issues i.e. absence of intergroup variation and correlation between reference genes provided by Norm-Finder supported by descriptive statistics from BestKeeper (SD, CV and r) if the four methods disagree. According to Kozera and Rapacz [33], it is advisable to use at least a pair of genes responsible for distant functions, because of a very little chance for a common regulation of their expression. Others advise to use at least three reference genes in order to achieve best results [69,70]. As shown by the pairwise variation analysis (Fig 3) and further confirmed by the validation experiment in which the target AfVP1 gene expression was assessed (Fig 4, S2  Fig), in our experimental conditions also two or three reference genes of high expression stability, performed better than a single gene.
The selection of an appropriate reference gene for normalization may have a great impact on the conclusions drawn from the experiment on biological significance of target gene expression data. Since there was no absolute agreement between different algorithms as to how many and which reference genes would be most suitable for RT-qPCR normalization, different possibilities were checked by normalizing raw Cq values of AfVP1 gene (Fig 4, S2 Fig). Our data show that the combined use of at least two out of three the most stable reference genes (AfGAPDH1, AfUBC1 and AfUBC2) can provide an optimal normalizing factor for qRT-PCR analyses comprising ND and/or PD caryopses. UBC belongs to the group of traditional reference genes and has been often reported as having a high expression stability [41,71,72]. It was the only classic HKG found in a set of highly stable genes in Arabidopsis thaliana and tomato seeds [37,47]. The UBC gene was ranked the third among 21 genes from the results of overall ranking of the best reference genes for all samples including seeds of Euphorbia esula [60]. GAPDH gene was also indicated as one the most stable genes in herbicide-resistant A. fatua populations treated with different herbicides [10]. Taking into account a very high sequence similarity (99.3%) ( Table 2) between the NGS sequenced transcript Uni-gene36693 designed AfGAPDH1 and GAPDH partial cds (KT153027) it may be assumed that they represent the same gene homolog. In comparison, the other GAPDH homolog identified through NGS sequencing in wild oat, Unigene36694 (AfGAPDH2) was quite unstable in our experimental system (Fig 1, Tables 4-6). This observation may reflect the disagreement in reports concerning GAPDH gene expression stability. For example, GAPDH was recommended as the appropriate reference gene during seed development in tung tree [48] or under different germination time points in Suaeda aralocaspica [46] as well as in combination with MDH (malate dehydrogenase) for gene expression analysis in pulp and seed samples of Theobroma grandiflorum [59], but the results in Euphorbia showed that the levels of transcripts from GAPDH1 and GAPDH2 were very unstable in buds, seeds, and various organs [60]. As pointed by Saraiva et al. [73] such contrasting results between different species or even within the same species not necessarily should be attributed to differential expression behavior in response to distinct treatments, they as well might be caused by primer heterogeneity from different studies that could amplify different members of a gene family and thereby presenting a differential expression profile.

Conclusions
From our findings it can be concluded that the normalization factor calculated as a geometric mean of Cq values of two UBC homologs accompanied by AfGAPDH1 would be optimal for RT-qPCR results normalization in the experiments comprising A. fatua caryopses of different dormancy status. This conclusion remains in agreement with the approach proposed by Vandesompele et al. [52] and widely considered as the most appropriate in gene expression studies. Based on the obtained results it would be advisable to use AfUBC1, AfUBC2 and AfGAPDH1 as reference genes also in further studies on molecular basis of seed dormancy induction, maintenance, and release in A. fatua caryopses on condition of prior validation.  Table. Coding sequences of Avena fatua L. candidate reference and target genes found in NGS RefSeq unigene dataset by Geneious software. Locations of qRT-PCR primers marked in colour (forward in blue, reverse in orange). (DOCX) S1 File. Dataset of de novo assembled Avena fatua L. unigene sequences. RNA-seq library constructed using a pooled sample of RNA isolated from caryopses and seedlings was sequenced using an Illumina HiSeq2000 genome analyzer at customer service of BGI-Tech Solutions Co., Ltd., (Hong Kong, China). (7Z)