Identification of Candidate Genes Associated with Positive and Negative Heterosis in Rice

To identify the genes responsible for yield related traits, and heterosis, massively parallel signature sequencing (MPSS) libraries were constructed from leaves, roots and meristem tissues from the two parents, ‘Nipponbare’ and ‘93-11’, and their F1 hybrid. From the MPSS libraries, 1–3 million signatures were obtained. Using cluster analysis, commonly and specifically expressed genes in the parents and their F1 hybrid were identified. To understand heterosis in the F1 hybrid, the differentially expressed genes in the F1 hybrid were mapped to yield related quantitative trait loci (QTL) regions using a linkage map constructed from 131 polymorphic simple sequence repeat markers with 266 recombinant inbred lines derived from a cross between Nipponbare and 93-11. QTLs were identified for yield related traits including days to heading, plant height, plant type, number of tillers, main panicle length, number of primary branches per main panicle, number of kernels per main panicle, total kernel weight per main panicle, 1000 grain weight and total grain yield per plant. Seventy one QTLs for these traits were mapped, of which 3 QTLs were novel. Many highly expressed chromatin-related genes in the F1 hybrid encoding histone demethylases, histone deacetylases, argonaute-like proteins and polycomb proteins were located in these yield QTL regions. A total of 336 highly expressed transcription factor (TF) genes belonging to 50 TF families were identified in the yield QTL intervals. These findings provide the starting genomic materials to elucidate the molecular basis of yield related traits and heterosis in rice.


Introduction
Rice is one of the most important cereal crops, feeding half of the world's population. Because of the increasing population and the reduction of arable land for rice production, improving grain yield is one of the most important goals of rice breeding programs [1,2]. The genetic basis of yield and its component traits is complex and is controlled simultaneously by QTLs that are sensitive to environmental changes [3][4][5]. Hybrid rice, in which F1 plants are used, has provided the highest yield potential in comparison with inbred cultivars. Since the 1970s, hybrid rice has been widely cultivated in China and is now being extended to the United States and worldwide.
Rice yield is either directly or indirectly affected by various yield related traits including days to heading [DTH], plant height [PHT], plant type [PTY], number of tillers [NOT], main panicle length [PLE], number of primary branches per main panicle [NOB], number of kernels per main panicle [NOK], total kernel weight per main panicle [KWP], 1000 grain weight [TGW] and total grain yield per plant [TYP]. Heading date is important to rice breeders because it affects the adaptation of plants to various crop seasons and cultivation areas [6]. Heading date is regulated by a complex gene network consisting of a series of genetic factors [7]. Many genes that control heading date have been identified by QTL analysis [8][9][10][11]. Some of the important heading date QTLs, including Hd1, Hd3a and Ehd1, have been cloned [12][13][14]. In addition, genes influencing heading date, plant height and rice yield, such as Ghd7 and Ghd8, have also been cloned [15,16]. A major plant height gene, the semi-dwarf gene sd1, was responsible for the green revolution in rice [17]. Some major QTLs for grain shape and 1000 grain weight, such as GS3, GW2 and qSW5/GW5, were fine mapped and cloned [2,18,19]. The QTL Gn1a, which influences the number of kernels per panicle, was isolated by a map-based cloning strategy [1]. In addition, QTLs controlling grain weight, gw8.1 and gw9.1 [20,21], and number of spikelets per panicle, GPP1, gpa7 and SPP3b/TGW3b, were recently fine mapped [22][23][24]. In spite of hundreds of QTL mapping studies in rice for yield related traits, few of the underlying genes have been isolated. Most of the genes either cloned or fine mapped so far belong to major QTLs, and the genes located in the minor QTL regions have not been fully explored.
In hybrids, novel patterns of gene action resulting from the combination of allelic variants are thought to be responsible for heterosis [25][26][27][28]. Dominance [29], over-dominance [30,31] and epistasis [32,33] have been used to explain heterosis. For example, indica x japonica crosses show the greatest heterosis of any combination between subspecies [34]. Gene expression and QTL analysis provide an avenue for identifying candidate genes for heterosis [35]. Several genomic approaches have been employed in rice, and many genes underlying yield related traits have been identified [1,2,18,19,36,37,38]. For example, plant height is related to the synthesis of sucrose phosphate synthase [SPS] [39] and to phytohormones such as gibberellin and brassinolide [40,41]. Further, large-scale transcriptome profiling has been used to identify genes related to heterosis in crop plants such as rice [42][43][44], maize [45] and wheat [46]. Using a cDNA microarray consisting of 9198 expressed sequence tags [ESTs], gene expression profiles of the elite hybrid rice Shanyou 63 and its parents [Zhenshan 97 and Minghui 63] revealed patterns of expressed genes that may be associated with heterosis at three stages of young panicle development [42]. In addition, differentially expressed genes related to heterosis were identified in the super hybrid rice LYP9 compared to its parents [93-11 and PeiAi64S] using microarray and SAGE technologies [44,45]. The root transcriptomes of the super-hybrid rice variety Xieyou 9308 and its parents were analyzed at the tillering and heading stages using RNA sequencing technology [RNA-Seq] to identify candidate genes for heterosis [46].
Both positive and negative heterosis can be employed in breeding, depending on the target traits. In general, positive heterosis is desirable for yield, and negative heterosis of growth duration is useful for earliness [47][48][49]. In the F1 hybrid, the combination of allelic variants results in novel patterns of gene action, possibly leading to heterosis [45,49,50]. Genetic variation, epistatic interaction, epigenetic modification and small-RNA-directed gene regulation have also been shown to be related to heterosis [30,33,51,52,53]. Expression of transcription factors [TFs] and polymorphic cis-regulatory elements in the promoters of related genes in hybrids play an important role in heterotic gene expression and heterosis in rice [54]. Recently, gene expression profiling in Arabidopsis suggested that genes involved in the circadian rhythm, such as MYB-like transcription factors, were associated with heterosis [55]. However, the molecular mechanism of either positive or negative heterosis remains poorly understood. To investigate heterosis at the transcriptome level, EST library sequencing, microarray hybridization and serial analysis of gene expression [SAGE] have been used in crop plants. However, these technologies have drawbacks, such as low throughput, high cost, low sensitivity, cloning bias, high background signal, and predetermined probe requirements [43,44,56] [47,57]. The MPSS and sequencing-by-synthesis [SBS] tags are short cDNA tags or digital gene expression tags, which are mainly derived from the 3′ regions of a transcript. These are deep sequencing methods previously used in rice and Arabidopsis [58,59]. These tag- or sequence-based technologies determine the expression level of a gene by counting the precise abundance of a specific transcript in a library [25,58,59].
An inter-subspecific F1 hybrid was developed from a cross between Nipponbare [japonica] and 93-11 [indica]. The genomes of both parents have been completely sequenced [60][61][62]. Nipponbare is a rice cultivar developed in Japan [63]. Cultivar 93-11 is an elite parental line used in developing several super hybrid rice varieties in China, such as LYP9, YLY7 and YLY1 [44]. Thus, the F1 hybrid produced in this study provides a unique opportunity to investigate the molecular basis of yield related traits. The objectives of the present study were to 1) evaluate yield related traits and determine the transcription profiles of leaves, roots and meristems of the two parents [Nipponbare and 93-11] and their F1 hybrid using MPSS technology; 2) determine their commonly and specifically expressed genes; and 3) map differentially expressed transcripts onto a genetic map and analyze their potential roles in rice heterosis.

Results
Phenotyping and transcriptome sequencing of hybrid and their parents
F1 hybrid plants, RILs and the parents Nipponbare and 93-11 were phenotyped for yield traits (Table 1, Table S1, Figure S1). In this study, DTH, PHT and NOT in the F1 hybrid plants were greater than in the parents. PTY, PLE, NOK, KWP, and TYP in the F1 hybrid were less than in the parents. About 1.0 to 3.0 million 17-base MPSS signatures were obtained from each of the 21 libraries (Table 2, Table S2). These signatures were clustered and processed with reliability and significance filters as described by Meyers et al. [40][41][42] (Figure S2). To compare the expression levels across the libraries, the frequencies of signatures in the individual libraries were normalized to one million [transcripts per million or TPM] [40][41][42]. The number of distinct signatures ranged from 14,127 to 28,621 in the MPSS libraries. The number of distinct genes identified using reliable and significance-filtered signatures ranged from 5,444 to 12,717. About 56 to 87% of the signatures from Nipponbare matched the Nipponbare genomic sequence. Similarly, about 77 to 87% of the signatures obtained from the 93-11 tissues matched the 93-11 genomic sequence (Table S2). In the F1, about 74 to 84% of signatures matched Nipponbare and 75 to 85% of signatures matched 93-11 (Table S2). The significant MPSS signatures from all 21 libraries were classified into seven classes based on their location on the annotated genes (Table S3).
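The normalization to transcripts per million mentioned above is simple arithmetic, and a minimal sketch may make it concrete. The function name and the example counts below are hypothetical and are not taken from the study's pipeline; the sketch only assumes that each library is a mapping from signature to raw count.

```python
def to_tpm(counts):
    """Scale the raw signature counts of one library to transcripts per million (TPM)."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("library contains no signatures")
    return {sig: n * 1_000_000 / total for sig, n in counts.items()}


# Hypothetical library with three 17-base signatures and made-up counts.
library = {
    "GATC" + "A" * 13: 50,
    "GATC" + "C" * 13: 150,
    "GATC" + "G" * 13: 800,
}
tpm = to_tpm(library)
```

After this scaling, the values of every library sum to one million, so the abundance of a signature can be compared across libraries of different sequencing depth.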

Commonly and specifically expressed genes
Based on the results of a Venn diagram, we analyzed the differences that existed in gene expression among leaves, roots and meristem tissues. The similarities between any two genotypes [Nipponbare, 93-11 and F 1 hybrid] were established based on the number of similar genes expressed between any two genotypes, and the number of genotype specifically expressed genes. The number of genes expressed in all three tissues [leaves, roots and meristems] in Nipponbare, 93-11 and the F 1 hybrid were identified. A total of 7,812, 5,181 and 4,009 genes were commonly expressed in leaves, roots and meristem tissues, respectively. The number of commonly expressed genes was higher than specifically expressed genes in all 3 tissues (Figure 1).
The expression levels of Nipponbare and 93-11 were compared with their F1 hybrid and the differentially expressed signatures were classified into eight expression patterns: above high parent level [AHPL], HPL, MPL, LPL, BLPL, SEF1, EOPF1 and EPAF1 (Table S4). AHPL and SEF1 represent a majority of the gene expression patterns in leaves, roots and meristems (Figure 2). The combined AHPL and SEF1 expression patterns represented 58%, 48% and 54% in leaves, roots and meristems, respectively. AHPL expression patterns represented 22%, 17%, and 17%, while SEF1 expression patterns represented 36%, 31% and 37% in leaves, roots and meristems, respectively. These findings suggest that novel patterns of gene action, which are thought to be involved in heterosis, arise in the hybrid from the combination of allelic variants from the parents.
Table 2. Columns include roots and meristems; rows include total reads sequenced. Italics indicate a match to the 93-11 genomic sequence. doi:10.1371/journal.pone.0095178.t002
Seventy-one QTLs above the significance threshold for yield related traits were mapped (Figure 3; Table 3; Table S5). Three novel QTLs were identified, two related to PTY and one related to KWP. No QTL above the significance threshold was found for LOG. From the 71 mapped QTLs, we selected those with mean LOD scores of ≥5 for a given trait to identify the differentially expressed genes located within them (Table S6). Based on the physical location of SSR markers on the chromosomes, the differentially expressed genes between flanking markers were identified. Expression patterns for the differentially expressed genes in each QTL are listed in Table S6.
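The step of collecting genes that lie between two flanking markers can be sketched as a simple interval filter. This is only an illustration under stated assumptions: the `Gene` record, the function name, the locus labels and all coordinates are hypothetical, and a gene is counted only when it lies entirely inside the marker interval, which may differ from the rule used in the study.

```python
from typing import NamedTuple


class Gene(NamedTuple):
    locus: str   # hypothetical gene label
    chrom: int   # chromosome number
    start: int   # physical start position (bp)
    end: int     # physical end position (bp)


def genes_in_interval(genes, chrom, left_bp, right_bp):
    """Return loci on `chrom` lying entirely between two flanking marker positions."""
    lo, hi = sorted((left_bp, right_bp))
    return [g.locus for g in genes
            if g.chrom == chrom and g.start >= lo and g.end <= hi]


# Made-up genes and marker positions, for illustration only.
genes = [
    Gene("geneA", 8, 1_200_000, 1_203_500),
    Gene("geneB", 8, 9_800_000, 9_804_000),
    Gene("geneC", 3, 1_250_000, 1_252_000),
]
hits = genes_in_interval(genes, chrom=8, left_bp=1_000_000, right_bp=5_000_000)
```

In practice the resulting list would then be intersected with the differentially expressed signatures to obtain the candidate genes for each QTL.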
Of the seven QTLs for DTH, five showed an average LOD score ... We identified seven QTLs related to PLE, of which three showed average LOD scores of ≥5. We identified six QTLs related to NOB, two of which showed an average LOD score of ≥5. A total of ten QTLs were identified for NOK. Of these, only qnok3 showed a LOD score of 5 or higher (LOD 6.5). Of the nine QTLs detected for total kernel weight per panicle, qkwp7 was a novel QTL. The allele qkwp8 was detected in the public rice QTL database, but was unnamed. QTLs for KWP were found on chromosomes 1, 2, 6, 7, 8 and 11, and four of these had a LOD of ≥5. A total of seven QTLs were detected for TGW. The qtgw1, qtgw3, qtgw5 and qtgw7 alleles were only detected at the Stuttgart location. Of these, two QTLs had a LOD of ≥5. Total grain yield per plant is the most important trait to be mapped on chromosomes for improving rice yield. The QTLs related to TYP were detected in the Beaumont 2009 and Stuttgart 2011 experiments. Six QTLs were found to be associated with TYP, of which four [qtyp1, qtyp2, qtyp4 and qtyp5] were detected at the Beaumont 2009 location. The QTL qtyp4, located on chromosome 8 at 2.8-31.5 cM, showed a LOD score of 5.72 and contained 442 differentially expressed genes between the closely linked markers RM6863 and MJIndel1.
Expressed TF genes located in the mapped yield QTLs were identified using a homology search in the rice transcription factor database [http://ricetfdb.bio.uni-potsdam.de/v2.1/]. A total of 336 expressed TF genes belonging to 50 TF families were identified in the combined leaf, root and meristem tissues and were located in yield related QTL intervals (Table S7).

Expression of genes involved in epigenetics
A total of 99 epigenetic/chromatin-related genes were expressed in leaves, roots and meristem tissues (Table S8). Several genes belonging to flowering, histone demethylases, histone deacetylases, genes encoding argonaute like proteins and polycomb group genes were highly represented. In leaves, genes belonging to the AHPL expression pattern were expressed including genes encoding flowering control-associated proteins [Os03g58070-qnot4;

Functional classification of genes showing AHPL expression pattern in F 1 hybrids
The genes in the mapped yield related QTL showing AHPL expression patterns were classified into different groups based on KEGG's functional classification of genes (Figure S3). Biochemical pathways including carbohydrate metabolism, energy metabolism and metabolism of cofactors and vitamins were highly represented in leaves, roots and meristem tissues. Leaf and meristem tissues showed expression of more energy and nucleotide metabolism genes compared to roots (Figure S3). Because in leaves, seven biochemical pathways including carbohydrate metabolism, energy metabolism, nucleotide metabolism, metabolism of cofactors and vitamins, amino acid metabolism, translation, and sorting and degradation were highly represented, we further classified these genes based on mapped QTLs. More carbohydrate metabolism related genes were present in QTLs qdth2, qnot4, qnot6, qple5 and qkwp5. The QTLs qnot4 and qnot5 were highly represented by genes belonging to all seven biochemical pathways (Figure S3). Some of the genes belonging to the photosynthesis and carbon ... was located in the QTL qpty3/qnot7. Many of these genes were expressed in either of the parents and/or the F1 (Table S9).
Figure 3. Chromosomal locations of yield related QTL identified using RILs obtained from the Nipponbare x 93-11 cross. The seventy-one QTL identified were mapped for yield related traits including days to heading (qdth), plant height (qpht), tiller angle (qpty), tiller number (qnot), panicle length (qple), number of primary branches per panicle (qnob), number of kernels per panicle (qnok), total kernel weight per panicle (qkwp), 1000 grain weight (qtgw) and total grain yield per plant (qtyp). doi:10.1371/journal.pone.0095178.g003

Discussion
For the first time, genome-wide gene expression data from the leaves, roots and meristems of rice were mapped onto 71 QTLs of yield related traits. Among them, sixty-eight QTLs have been previously reported by others, while three QTLs [qpty1, qpty2, qkwp7] were novel and could be specific to 93-11 and Nipponbare. Three QTLs [qkwp8, qnok8 and qpht3] were reported in the Gramene/Q-TARO database without gene designations (Table S5). Of the seven QTLs detected for DTH, the alleles from Nipponbare decreased DTH at four loci while three alleles increased DTH. All seven QTLs for DTH had been reported previously (Table S5). The similarity of the regions associated with QTLs for DTH in this study to those in other studies involving indica and japonica cultivars suggests that the same alleles are responsible for DTH across different genetic and environmental backgrounds. Of the nine QTLs detected for PHT, Nipponbare alleles increased PHT at three QTLs. All nine QTLs for PHT had been previously reported (Table S5). All the QTLs for NOT had been reported earlier (Table S5). In addition, seven QTLs for NOT, TGW and PLE, and six QTLs for grain yield per plant were reported in multiple rice germplasm lines in rice production areas worldwide. These findings suggest that these QTLs may have been intensively selected during the domestication and breeding process.
In comparison with the parents, changes including increased DTH, increased PHT, narrow PTY, increased NOT, slightly decreased PLE, moderate NOB, decreased NOK, decreased KWP, moderate TGW and decreased TYP were observed in the F1 hybrid (Table S1). The OsSUT1 gene at QTL [qdth1] was located on chromosome 3 for DTH and PHT [64]. Another QTL, Ghd7 on chromosome 7, was predicted to encode a CCT-domain protein controlling grain yield, PHT and DTH in rice [15]. For PTY, the TAC1 gene on chromosome 9 was identified at qTA-9 in a 93-11 x Nipponbare cross [5]. The cytokinin dehydrogenase 5 precursor gene [Gn1a, Os01g56810] for grain number was located at qple1 for PLE [1]. The OsSPL14 gene, encoding a squamosa promoter-binding-like protein 9 for panicle number, panicle branching and high grain productivity, was located at qnob4 [64]. The Hd1 gene [zinc finger protein CONSTANS] for DTH was located at qdth5 [12]. The TAC1 gene for PTY was located at qpty3 [65] (Table S9).
Consistently, in the F1 hybrid, HPL and LPL may explain dominance, and AHPL and BLPL may explain over-dominance (Table S4). For example, genes showing the AHPL pattern in the F1 are involved in plant growth, development and signal transduction, including granule-bound starch synthase I, growth regulator and phosphatidylinositol 3- and 4-kinase family protein. The genes involved in carbon fixation and photosynthesis pathways that showed the AHPL expression pattern in the F1 leaf included vacuolar ATP synthase subunit D 1, vacuolar ATP synthase catalytic subunit A, ribulose-phosphate 3-epimerase, uridine/cytidine kinase-like 1, ribose-5-phosphate isomerase, ferredoxin and ferredoxin-NADP reductase. The sucrose biosynthesis genes phosphoglucomutase and sucrose-phosphate synthase 1 also belonged to AHPL (Table S6). Similarly, genes for photosynthesis, carbon fixation, and starch and sucrose metabolism were mapped at yield related QTL, and their enhanced expression was found in the super rice hybrid [4]. Besides photosynthesis and the sucrose and starch pathways, oxidative phosphorylation, the citrate cycle [TCA cycle], and stress-resistance pathways may also contribute to heterosis [46]. In our study, the gene for sucrose phosphate synthase [SPS], the major limiting enzyme for sucrose synthesis, mapped at ph1, which is responsible for plant height, and was highly expressed [AHPL] in F1 hybrid leaves compared to those of the parents (Table S6). Higher SPS activity was proposed to be responsible for increasing panicle length [9].
Heterosis may also result from a combination of genetic and epigenetic regulation [26,33,43]. Altered gene expression caused by interactions between transcription factors and the allelic promoter regions in the hybrids was one plausible mechanism for heterosis in rice [50]. Many differentially expressed TF genes in super hybrid rice were located in grain yield related QTLs [43]. In this study, TF genes of the HLH family encoding helix-loop-helix DNA binding domain containing proteins, including Os03g53020 [qdth2, qnot4], Os01g57580 [qple1] and Os02g47660 [qple2], showed the AHPL expression pattern in leaves, roots and meristems, and were located in the mapped yield related QTL. The LAX1 gene, encoding a bHLH transcription factor, was involved in the formation of all types of axillary meristems throughout the ontogeny of rice [66], and a mutant of LAX1 [lax 1-2] was shown to reduce tiller number [67], suggesting that LAX1 function may be required for the generation of axillary meristems of both tillers and panicles [67].
The TF family AP2-EREBP members were potential targets of miRNA [68]. Noncoding RNAs were involved in epigenetic regulation, as were other epigenetic mechanisms including DNA methylation, acetylation and deacetylation of histones, and chromatin remodeling [69][70][71]. Three genes encoding AP2 domain containing proteins belonging to the TF family AP2-EREBP showed an AHPL expression pattern in F1 hybrid leaves (Table S6). In Arabidopsis, epigenetic regulation of a few regulatory genes for growth and development was observed in hybrids [55]. In this study, a total of 99 chromatin-related genes were expressed in leaves, roots and meristem tissues. Specifically, several epigenetic related genes belonging to flowering, histone demethylases, histone deacetylases, argonaute like protein genes, and polycomb group genes were highly expressed in F1 hybrids, suggesting their potential roles in heterosis.
In the past decade, oligoarrays, SAGE, MPSS, and SBS have been used for transcriptome profiling. Illumina's MPSS technology has been used to generate expression data for many organisms [58,59,77,78,79]. Thus far, MPSS has been the most popular tag-based technology for sequencing the transcriptomes of various organisms [72][73][74][75][76][77][78]. More genes were identified using MPSS technology than with SAGE or oligoarrays [76]. In this study, MPSS technology was used to analyze the transcriptomes of the leaves, roots and meristem tissues obtained from Nipponbare, 93-11 and their F1 hybrid. A total of 1 to 3 million signatures were obtained from each library. The number of redundant and nonredundant signatures generated in this study was similar to those in previous reports in rice and Arabidopsis [58,59,72,73,74,75,76]. It is important to note that a significant proportion of MPSS signatures failed to match the Nipponbare genome. One of the most plausible explanations is alternative splicing. Published reports demonstrated that alternative splicing in rice ranged from 13 to 21% [79]. Other possibilities include sequencing errors in the MPSS signatures or in the Nipponbare genome, un-sequenced regions of the Nipponbare genome, and unknown mechanisms.
In summary, MPSS technology was used to obtain genome wide expression profiles in leaves, roots and meristem from 'Nipponbare' and '93-11', and their F1 hybrid. Commonly and specifically expressed rice genes were identified, and mapped to 71 yield related QTL regions for days to heading, plant height, plant type, number of tillers, main panicle length, number of primary branches per main panicle, number of kernels per main panicle, total kernel weight per main panicle, 1000 grain weight and total grain yield per plant. Differentially expressed genes at yield related QTLs are the important candidate genes for further functional validation to unravel their role in positive and negative heterosis in F1 hybrids. This study provides the starting genomic materials to elucidate the molecular basis of yield related traits and heterosis in rice.

Materials and Methods
The cross between japonica cultivar Nipponbare and indica cultivar 93-11 was made at Ohio State University. The RIL population, an F5-7, was developed using an F2 and F3 ... (Table 1). The trait data collected per location varied depending on the weather conditions during that growing season [e.g., a hot summer]. The traits DTH, PHT and PTY were obtained in six field experiments. KWP, LOG, NOT, NOK, PLE, TGW and TYP were obtained from four field experiments. NOB was obtained in three field experiments (Table 1).

Evaluation of yield related traits
Yield related traits were measured in RILs using a modified procedure from Moncada et al. [80] and a source book from the International Rice Research Institute [IRRI] [November 2002] entitled 'Standard Evaluation System for Rice'. Number of days to heading was recorded when 50% of plants had flowers on at least one panicle. Lodging was measured using a 0-9 scale, where 0 stands for no lodging, 1 for up to 10% lodged, 2 for 11 to 20% lodged, 3 for 21 to 30% lodged, 4 for 31 to 40% lodged, 5 for 41 to 50% lodged, 6 for 51 to 60% lodged, 7 for 61 to 70% lodged, 8 for 71 to 80% lodged, and 9 for 81 to 100% lodged. Tiller angle [plant type] was measured using a 1-9 scale, where 1 means tillers were erect, with an angle of less than 30° from the perpendicular; 3 means tillers were intermediate, with an angle of about 45°; 5 means tillers were open, with an angle of about 60°; 7 means tillers were spreading, with an angle of more than 60° but with culms not resting on the ground; and 9 means procumbent, with the culm or its lower part resting on the ground surface. Plant height [cm] was measured for each plot. Averages were calculated for each RIL and each trait in all 6 experiments [mentioned above]. Number of tillers per marked plant was counted. One 'main panicle' from the marked plant was harvested to record panicle length [length of the panicle from the base to the tip of the panicle], number of primary branches per main panicle, number of kernels per main panicle, and 1000 grain weight per main panicle. Main panicle data were collected for each marked RIL. The total grain yield per plant was calculated by collecting all the kernels from the entire plant [for each marked RIL]. Phenotypic data were analyzed using Microsoft Access and JMP Genomics [version 5.1] software (Table 1).

Genotyping and data analysis
A total of 266 F5-F7 RILs were used for SSR analysis. DNA extraction and quantification were performed as previously described [81,82]. Except for two indels and two SSR primers designed in house, the primer sequences and map positions of the SSR markers were obtained from the Gramene database [http://www.gramene.org/qtl/index.html] [83][84][85]. LJSSR1 and Con673 sequences were described by Li et al. [85]. MJIndel1 and MJIndel2 were designed using the annotated Nipponbare and 93-11 genomes. The sequences for MJIndel1 were F: attggatcaacacaccacac R: cagtcgaactccatcttcct and for MJIndel2 were F: aacttcaacaccaccctttga R: tttccaggtccagctcctaa. Marker amplification and allele calling were performed as described by Liu et al. [81]. A linkage map was constructed using JoinMap 4 based on the Kosambi function. Composite interval mapping [CIM] was applied to the phenotypic data obtained from 6 field experiments using Windows QTL Cartographer version 2.5 to identify QTLs affecting each yield related trait. The threshold was estimated by 1000 permutations at P < 0.01 by QTL Cartographer for each trait (Table S5).
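For readers unfamiliar with the Kosambi function named above, it converts a recombination fraction r between two markers into a map distance, d = 25 ln[(1 + 2r) / (1 - 2r)] in centimorgans. The sketch below implements this standard textbook formula and its inverse; it is not JoinMap's internal code, and the function names are hypothetical.

```python
import math


def kosambi_cm(r):
    """Map distance in cM for a recombination fraction r (0 <= r < 0.5)."""
    if not 0 <= r < 0.5:
        raise ValueError("recombination fraction must be in [0, 0.5)")
    return 25.0 * math.log((1 + 2 * r) / (1 - 2 * r))


def kosambi_r(d_cm):
    """Inverse mapping: recombination fraction for a map distance in cM."""
    return 0.5 * math.tanh(d_cm / 50.0)


# A recombination fraction of 0.10 corresponds to roughly 10.14 cM.
distance = kosambi_cm(0.10)
```

Unlike a simple r x 100 conversion, the Kosambi function allows for crossover interference, so map distances grow faster than recombination fractions as markers get farther apart.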

Construction of the MPSS libraries, sequencing, and bioinformatics
MPSS library construction, sequencing and annotation were performed essentially as previously described [58,59,72,73,74,75,76]. Briefly, to obtain high quality data, sequences obtained from MPSS technology were passed through two filters, reliability and significance. The 'reliability' filter determines if a given signature is found in more than one library (reliable signature) or is present in only one library (unreliable signature). The 'significance' filter determines if a given signature is found in any library at ≥4 TPM (transcripts per million) (significant signature) or at <4 TPM (nonsignificant signature) in a normalized library. The significant and reliable distinct signatures were identified in 21 MPSS libraries as previously described. For leaf tissue, the signature frequency data obtained from 4 replications were used to calculate the mean value. This included the transcripts expressed in only one, two, three or all four replications. Similarly, mean values were calculated for 2 replications of root tissues. The expression levels of Nipponbare and 93-11 were compared with that of their F1 hybrid and the differentially expressed signatures were classified into 8 expression patterns [AHPL, HPL, MPL, LPL, BLPL, SEF1, EOPF1 and EPAF1] with some modifications [26,44]. To avoid sequence bias, we performed cluster analysis on the signatures matching both the Nipponbare and 93-11 genomes to identify commonly and specifically expressed genes among Nipponbare, 93-11 and the F1 hybrid. Clustering analysis was carried out using Microsoft Access and JMP Genomics [version 5.1] software. Bioinformatic analyses, including identification of antisense transcripts, alternate transcripts and TFs, and functional classification of genes using the KEGG database, were conducted as previously described [58,59,72,73,74,75,76].
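The two filters described above can be expressed compactly. The following is a minimal sketch, assuming that the input is a mapping from library name to normalized signature abundances in TPM; the function name, the data layout and the example values are hypothetical and do not reproduce the published pipeline.

```python
def classify_signatures(tpm_by_library, min_tpm=4.0):
    """Flag each signature as reliable (seen in more than one library)
    and significant (at least `min_tpm` TPM in any library)."""
    libraries_seen = {}
    max_tpm = {}
    for signatures in tpm_by_library.values():
        for sig, tpm in signatures.items():
            if tpm > 0:
                libraries_seen[sig] = libraries_seen.get(sig, 0) + 1
                max_tpm[sig] = max(max_tpm.get(sig, 0.0), tpm)
    return {
        sig: {
            "reliable": libraries_seen[sig] > 1,
            "significant": max_tpm[sig] >= min_tpm,
        }
        for sig in libraries_seen
    }


# Made-up abundances for two hypothetical libraries.
example = {
    "leaf": {"sig1": 10.0, "sig2": 2.0},
    "root": {"sig1": 1.0, "sig3": 5.0},
}
flags = classify_signatures(example)
kept = sorted(s for s, f in flags.items() if f["reliable"] and f["significant"])
```

Here only `sig1` passes both filters: `sig2` is seen in a single library below 4 TPM, and `sig3` is abundant enough but appears in only one library.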
The entire dataset is available at the NCBI's Gene Expression Omnibus database.

Supporting Information
Figure S1 Transgressive variation among the F6-8 generation RILs of the Nipponbare X 93-11 cross in the field at Stuttgart, Arkansas. Also, phenotypic variation (plant height, number of tillers and maturity) of F1 hybrids and their parents in both field and growth chamber conditions. (TIF)
Filter results for 21 MPSS libraries. A total of 179,151 distinct 17-base expressed signatures from 21 MPSS libraries were processed according to three filters (significance, reliability, and genomic match) as described by Meyers et al. [58][59].
Table S4 Gene expression levels in leaf, root and meristem tissues of F1 hybrids and their parents Nipponbare and 93-11. Mean signature frequencies (copy number) were calculated from four leaf replications and two root replications separately. The mean signature value of F1 hybrids was compared with their parents to classify the signature into one of the 8 expression patterns (AHPL, HPL, MPL, LPL, BLPL, SEF1, EOPF1 and EPAF1). Detailed annotation of each signature is presented. (XLSX)
Table S5 List of known and novel yield related QTL identified in this study based on the Gramene QTL database (ftp://ftp.gramene.org/pub/gramene/CURRENT_RELEASE/data/qtl/) and the Japanese rice QTL database (Q-TARO database, http://qtaro.abr.affrc.go.jp/qtab/table). SSR marker intervals, percent variation, additive effect and LOD score for each QTL identified in each location are presented. The average LOD score was calculated for QTL identified in more than one location. Similar QTL locations identified in other studies (indicated in red font) are presented for the known QTL in our study. Yellow highlighted QTL are reported in the Japanese Q-TARO database. (XLSX)
Table S6 List of all expressed genes identified in mapped yield related QTL regions. The expressed genes located between the two flanking SSR markers of yield related QTL are listed. The expressed genes belonging to the SEF1 and AHPL expression pattern categories located in mapped yield related QTL regions are also listed. The QTL identified by a mean LOD score of ≥5 are used for this analysis. (XLSX)
Table S7 List of TF genes identified in mapped yield related QTL regions. The TF genes located between the two flanking SSR markers of yield related QTL are listed. The QTL identified by a mean LOD score of ≥5 are used for this analysis. TF family names and the expression level patterns of TF genes in different tissues are presented here. (XLSX)