QTL and QTL networks for cold tolerance at the reproductive stage detected using selective introgression in rice

Low temperature stress is one of the major abiotic stresses limiting the productivity of Geng (japonica) rice grown the temperate regions as well as in tropical high lands worldwide. To develop rice varieties with improved cold tolerance (CT) at the reproductive stage, 84 BC2 CT introgression lines (ILs) were developed from five populations through backcross breeding. These CT ILs plus 310 random ILs from the same BC populations were used for dissecting genetic networks underlying CT in rice by detecting QTLs and functional genetic units (FGUs) contributing to CT. Seventeen major QTLs for CT were identified using five selective introgression populations and the method of segregation distortion. Of them, three QTLs were confirmed using the random populations and seven others locate in the regions with previously reported CT QTLs/genes. Using multi-locus probability tests and linkage disequilibrium (LD) analyses, 46 functional genetic units (FGUs) (37 single loci and 9 association groups or AGs) distributed in 37 bins (~20%) across the rice genome for CT were detected. Together, each of the CT loci (bins) was detected in 1.7 populations, including 18 loci detected in two or more populations. Putative genetic networks (multi-locus structures) underlying CT were constructed based on strong non-random associations between or among donor alleles at the unlinked CT loci/FGUs identified in the CT ILs, suggesting the presence of strong epistasis among the detected CT loci. Our results demonstrated the power and usefulness of using selective introgression for simultaneous improvement and genetic dissection of complex traits such as CT in rice.


Introduction
As the staple food for the half world population, rice (Oryza sativa L.) grows worldwide. There are two major subspecies of O. sativa, Xian (indica) or Geng (japonica) [1], the former of which is widely grown in the tropical and subtropical areas, while the latter in the temperate areas of high latitude/altitude of Asia. For Geng rice crops grown primarily in high-latitude or high- altitude regions of China, Japan, Korea, and other parts of the world, low temperature, or cold stress, is one of the most common environmental stresses affecting rice growth and development, and thus grain yields. Cold stress includes two major classes, chilling (0-15˚C) and cold deep-water irrigation (CDWI) (18-19˚C) stress, both of which are major environmental factors limiting rice productivity [2,3]. The former occurs primarily at early stages of rice crops, causing poor crop establishment and delayed development, while the latter causes low fertility and poor grain filling. Thus, developing cold tolerant (CT) rice varieties is the most efficient way to resolve the problem. CT in rice is a quantitative trait controlled by multiple genes and involving complex biological and genetic mechanisms. To date, QTL mapping using bi-parental or multiple cross populations identified more than 250 QTLs affecting CT at different development stages of rice [3][4][5][6][7][8][9][10]. Mapping QTL for CT at the booting stage is more difficult because of difficulties in phenotyping. Of these, 108 QTLs for CT were detected at the booting stage (S1 Table). Four QTLs, qCT8, qCTB7, qCTB3 and qCT-3-2, have been fine mapped [6,[11][12][13], qCTB10-2 was delimited to a 132.5 kb region containing 17 candidate genes and 4 genes were cold-inducible [14], and only two CT QTL, Ctb1 and CTB4a, have been cloned and functionally characterized [15,16]. Ctb1 encodes an F-box protein and functions as part of the E3 ubiquitin ligase complex that is associated with increased spikelet fertility (SF) under cold stress [11]. CTB4a encodes a conserved leucine-rich repeat receptor-like kinase [16]. Despite the progress in genetic and molecular dissection of CT in rice, few of these mapped or cloned CT QTL have been used in breeding because of the possible epistasis and QTL x environment interactions [17]. In contrast, breeding progress for improving CT in Geng rice remains slow by the conventional pedigree method. This is because almost all crosses have been made between different varieties of the same Geng subspecies, which is known to have very low genetic diversity [18]. Breeders have also been reluctant to use Xian varieties as donors because of the genetic barriers such as hybrid sterility and hybrid breakdown commonly seen in Xian-Geng crosses.
To overcome the problem of breeding and genetic research being separate activities in the past study, we have been practicing a strategy of simultaneous improvement and genetic dissection of complex traits by selective introgression, which have been demonstrated to be powerful in several successful applications [19][20][21][22][23][24][25][26][27]. This strategy includes two parts: (1) development of large numbers of trait-specific introgression lines (ILs) by backcross (BC) breeding; (2) genetic and molecular dissection of target traits using the ILs and marker-based tracking. In this study, we report the application of selective introgression in characterizing the genetic networks underlying CT at the reproductive stage in rice, demonstrating clear advantages of selective introgression over the classical QTL mapping method for simultaneous improvement and genetic dissection of complex traits.

Plant materials
The materials used in this study included two sets of introgression lines (ILs) from five BC 2 F 4 populations derived from crosses between Chaoyou1 (CY1, the recipient), a high yield Geng variety, and 5 donors (3 Xians and 2 Gengs from China, Vietnam, Nepal and India) ( Table 1). The first set of ILs consisted of 84 BC 2 F 4 ILs developed by two rounds of selection for cold tolerance (CT) at the reproductive stage under cold water, as described previously [22]. The second set of ILs included 310 random ILs developed by single-seed decent from the same five BC 2 populations without any selection. Fig 1 shows the procedure for developing the two sets of ILs. Briefly, seeds of the five BC 2 F 4 bulk populations were sown in the seedling nursery on April 15, 2008, and 450 40-day seedlings of each BC 2 F 4 bulk population were transplanted into a 45-row plot at the Rice Research Institute of Jilin Academy of Agricultural Sciences (JAAS). When the RP (CY1) entered the panicle initiation stage, the cold-water treatment was initiated by constant irrigation of cold water (19± 0.5˚C). The cold-water depth in the pond was 20 cm and the treatment was maintained for~30 days until panicles of almost all plants exerted completely. Then, irrigation with normal temperature water was resumed until the maturity. Under this cold-water treatment, CY1 had spikelet fertility (SF) of 24.8±4.3%, then, 162 BC 2 F 4 plants with SF>50% were selected. The selected 162 BC 2 F 5 introgression lines (ILs) from the five populations were progeny tested under the same condition in 2009. Under this cold-water treatment, CY1 had a Phenotyping of the random populations for SF and yield related traits under cold water stress and non-stress conditions , 1000-grain weight (GW, in g), biomass (BM, in g) per plant, grain yield per plant (GY, in g) and harvest index (HI) in the same way as described previously [22].

Genotyping of the selected and random ILs
In the 2010 cold treatment experiment when CY1 had SF of 10.5%, 84 BC 2 F 6 ILs with SF>30% and the 310 random BC 2 F 4 ILs (ranging from 59 to 65 per population) were used for genotyping (Table 1). More than 600 anchor SSR markers covering the whole rice genome from the Grammene database (Grammene, http://www.gramene.org) were used to survey the polymorphism between CY1 and the donors, from which an average of 113 well-distributed polymorphic SSR markers were used to genotype the selected and random ILs. Marker positions and genetic distances between linked markers were based on the Cornell SSR maps (Grammene, http://www.gramene.org), which covered all 12 rice chromosomes with a total genome size range from 1,125.0 cM in population CY1/Yuanjing7 to 1649.8 cM in population CY1/ Chhomrong, with average genetic distances between adjacent markers ranging from 15.2 to 23.7 cM.

Data analyses
Three different approaches were used to detect QTL and genetic networks underlying CT that were segregating in the five populations. First, we used the segregation distortion method to detect main-effect QTLs in the selected IL populations. This method takes advantage of several populations together and performs a joint analysis to detect QTL that are respond to directional selection in multiple populations based on a consensus map [28]. We also used the traditional method to verify CT QTL by detecting marker-SF associations in the random ILs based on t-tests. In this method, we used the minimum threshold of P<0.05 because of the small population sizes of the random populations. The third methodology was more comprehensive and was used to detect QTL and QTL networks in three steps based on the molecular quantitative genetics theory [3,29]. In this theory, two concepts, functional genetic units (FGUs) and the principle of hierarchy, are defined here based on two commonest types of functional relationships between genes acting in the same signaling pathways affecting complex traits. In the selection experiment, alleles at segregating regulatory loci affecting the target trait (CT) were expected to respond more strongly than their regulated downstream ones, and show greater shifts in their introgression frequencies (IF) of the functional genotypes (FG) in response to selection. Here, FG was defined as one of the two homozygotes and heterozygote that contain the functional allele at involved loci. Based on the concepts, a FGU could be either single loci showing significant excess introgression or an association group (AG) of r (r!2) unlinked but perfectly associated loci of equal introgression in CT ILs selected from each BC population. Two types of statistical tests were performed to detect FGUs associated with ST. First, X 2 tests were performed to detect whether the allelic and genotypic frequencies at individual loci across the genome in ST ILs from each BC population deviated significantly from the random unselected populations. Second, a multi-locus probability test, , was used to detect individual AGs for CT, where p i is the frequency of the donor introgression in the random ILs from each BC population, n is the number of the selected ILs, m is the number of ILs that have co-introgression of the donor alleles, and (n−m) is the number of ILs having no introgression at the r unlinked loci in the AG. Here (P i ) m is the probability of m ILs having co-introgression of the donor alleles and (1-P i ) n-m is the probability of (n-m) ILs having no introgression at r unlinked loci. The threshold to claim a significant case was P 0.005 for individual cases. For any detected AG of r (r ! 2) perfectly associated loci, there will be rÁ(r-1)/2 significant pairwise associations between the r loci, which were also confirmed by the linkage disequilibrium (LD) analyses [30]. Because each of the BC populations were related to other populations by sharing the same recipient, we also listed putative ST loci detected with the sub-thresholds of 0.05 < P < 0.005 if they were detected with the selected threshold in one or more related populations.
To reveal the multi-locus structure of the detected FGUs-the putative genetic network underlying ST in the ILs from each BC population, pairwise gametic LD analyses were performed to characterize the relationships between alleles at all ST FGUs detected in ILs from each BC population [3,29]. The LD statistics [26],D AB ¼p AB Àp ApB , were calculated using the genotypic data of ILs from each BC population, wherep AB ,p A andp B were the frequencies of co-introgressed functional genotype AB, and functional genotypes at FGUs A and B, respectively. Then, a multi-locus genetic network containing all FGUs detected in the ILs selected from each BC population was constructed in the following 2 steps based on the principle of hierarchy: 1) all FGUs detected in the ILs from a single population were divided based on the LD results into major groups such that individual FGUs of different introgression frequencies within each group were all significantly and positively associated withDs AB 0 = 1.0 (p 0.05), and FGUs in different groups were either independent, or negatively associated; and 2) all associated FGUs within each group were connected, forming multiple layers, according to their progressively reduced FG frequencies and inclusive relationships [3].

Performances of selected and random ILs under cold water and normal conditions
When compared with the normal conditions, the cold-water stress of 2010 caused delayed heading by 14 days and greatly reduced trait values of CY1 for all measured traits ( However, ILs from each of the random populations showed significantly lower SF than CY1 under both the normal and cold-water conditions, except for ILs from populations CY1/ Yunjing7 and CY1/Chhomrong derived from parents of the same Geng subspecies (Table 2). Thus, the reduced fertility in the random progenies from the three inter-subspecific crosses was commonly seen in most breeding populations of this type [31]. Also, ILs from each population showed considerable variation in SF under both the normal and cold water conditions. The CT selected ILs had an average 9.3% of the donor genomes, significantly less than the expected 12.5% for BC 2 populations, and ranged from 4.7% in those of population CY1/Yunj-ing7 to 13.1% in the ILs of population CY1/Chhomrong. The average heterozygosity of the selected ILs was 1.9%, ranging from 1.0% in those of population CY1/Yunjing7 to 4.4% in the ILs of population CY1/X22. In contrast, the random ILs showed slightly less donor introgression (8%) but higher heterozygosity at 2.9%, when compared to the selected ILs.

Genetic networks (multi-locus structures) underlying CT
Total of 46 functional genetic units (FGUs) (37 single loci and 9 association groups or AGs) for CT were detected by χ 2 tests (single loci) and multi-locus linkage disequilibrium analyses in 84 cold-tolerant introgression lines (ILs) selected from five populations (Fig 1) and distributed on all 12 chromosomes except chromosome 7 (Fig 3). The number of detected CT loci ranged from 6 loci in 4 FGUs in the CY1/Yuanjing7 (B) population to 16 loci in 13 FGUs in the CY1/Doddi (E) population. The average IF of the donor alleles at the 58 CT loci was 0.46, or 5.75 times as much as the average introgression in the random populations (Table 5). These FGUs were distributed in 39 bins (~20%) across the rice genome. Fig 4A-4E shows the five putative genetic networks or multi-locus structures each containing all CT FGUs identified in the selected CT ILs from each BC population and their corresponding graphical genotypes at the detected CT FGUs. Fig 4A is the network containing all 9 FGUs (11 loci) detected in the 15 CT CY1/X22 (A) ILs, which has two highly associated groups of FGUs plus 2 independent loci (RM312 in bin1.3 and RM160 in bin9.7). Branch A-1 consisted of two sub-branches with OSR13 (bin 3.4) of high introgression (IF = 0.667) placed in the upstream of the network as the putative regulator. Sub-branch 1 had RM21 (bin11.5) in the upstream and agCT A1 consisting of three unlinked but perfectly associated loci at RM556 (bin8.4), RM257 (bin9.6) and RM228 (bin10.6) in the downstream. Sub-branch 2 had RM207 (bin2.9) in the upstream and RM154 (bin2.1) in the upstream. Fig 4B shows  loci) with two major branches plus three independent loci detected in the 22 ILs from population CY1/Fengaizhan. Branch C-1 was the most important one consisting of six unlinked but highly associated FGUs (1 AG and 4 loci) of high introgression. RM522 (bin1.2) was placed at the top of the network as the putative regulator because 20 of the 22 CT ILs had donor alleles at this locus, with the remaining 4 FGUs (RM200 in bin1.10, agCT C1 (bins 2.8 and 9.7), RM303 (bin4.6) and RM209 (bin11.5) in the downstream. The fourth network (Fig 4D) consisted of 11 FGUs (3 AGs and 8 loci) with 4 branches plus two independent loci of high introgression identified in the 15 ST CY1/Chhomrong (D) ILs. The two important independent loci at RM85 (bin3.12) and RM277 (bin12.4) each had an IF of 0.667. Branch D-1 consisted of four unlinked but highly associated loci with RM3 in bin6.5 of high introgression (IF = 0.667) on the top as the putative regulator and three downstream loci of lower introgression at RM317 (bin4.6), RM408 (bin8.1) and RM518 (bin4.2) in the downstream. Branch D-2 consisted of two highly associated FGUs, with RM439 (bin6.7) in the upstream and agCT D2 in the downstream. agCT D2 comprises four unlinked but perfectly associated loci near RM315 (bin1.11), RM227 (bin3.12), RM87 (bin5.8) and RM247 (bin12.2) detected with a P value of 7.3 −15 . Branch D-3 was agCT D3 consisting of two unlinked but perfectly associated loci near RM207 (bin2.9) and RM434 (bin9.5) detected with a P value of 1.8 −13 . Branch D-4 was agCT D1 consisting of two unlinked but perfectly associated loci near RM7 (bin3.5) and RM282 (bin3.6). The fifth network (Fig 4E) consisted of 13 FGUs (3 AGs and 10 loci) in three branches identified in the 17 ST CY1/Doddi (E) ILs. Branch E1 consisted of highly associated FGUs with agCT E1 in the upstream and agCT E3 in the downstream. agCT E1 contained two unlinked but perfectly associated loci near RM411 in bin3.8 and RM141 in bin6.7 detected with a P value of 9.7 −17 , while agCT E3 contained two unlinked but perfectly associated loci near RM406 in bin2.9 and RM405 in bin5.2 detected with a P value of 1.6 −12 . Branch E-2 contained two FGUs with RM433 in bin in the upstream and agCT E2 in the downstream. agCT E2 consisted of two perfectly associated loci near RM282 (bin3.6) and RM216 (bin10.2). Branch E-3 contained 3 unlinked but highly associated loci with RM341 (bin2.5) in the upstream, and RM274 (bin5.8) and RM185 (bin4.4) in the downstream (Table 5, Fig 4E).

Discussion
In Geng breeding efforts for improved CT at the reproductive stage, donors for CT are almost exclusively Geng varieties that are known to be more tolerant to cold than Xians. However, we previously demonstrated that Xian varieties could be used as donors for improving CT in Geng varieties [22]. In fact, the Xian donors in three of the five populations used in this study were from Vietnam, India and south China, and are known to have poor CT. Interestingly, they all appeared to be good donors for CT as far as the number of selected CT progeny concerned. This was true for tolerances to other abiotic stresses such as drought, salinity and submergence in most BC breeding populations involving a diverse set of donors and recipients [19][20][21]23]. We have proposed that future rice improvement requires successful integration of trait improvement and gene/QTL discovery [24,31,32], which can be achieved by examining how phenotypic selection was operating on the genetic variation of complex traits in the process of breeding. Thus, this study presents another case in our efforts for simultaneous improvement and genetic dissection of complex traits by marker-facilitated characterization of donor introgression in selected ILs. Moreover, when compared with the previous reports [3,25,29], the inclusion of random populations in this study provided additional evidence to justify the strategy of selective introgression. Several interesting findings regarding the real power of this strategy and the genetic basis of CT at the reproductive stage in rice worth more discussion. The first one was the genomewide under-introgression of the donor genomes in both the random and selected ILs in the CY1 (Geng) genetic background as compared with the Mendelian expectation, which was in contrast to the general genomewide over-ingression in ILs of several Xian genetic backgrounds [2,25]. This characteristic introgression patterns with the two subspecific genetic backgrounds of rice appeared to be consistent with the fact that Geng genomes have smaller pan genome size but larger core genome than Xian genomes [1], and with our observation that Geng genomes have much higher "genetic load" than the Xian genomes [33]. This also suggests that the overall level of donor introgression may also be characteristic of specific genetic backgrounds. Nevertheless, this genomewide under-introgression actually increased the power in detecting CT QTLs by selective introgression. Thus, the detection of 37 genomic regions harboring CT loci or an average nine FGUs (7 loci and 2 AGs) per population was highly efficient (Table 5, Fig 3) with the average 16.8 ILs per population. This was in contrast to the extremely low power in QTL detection using the random populations of larger size. Moreover, most identified CT loci were detected with P<0.0001 and 45% of the  Table S1. Solid arrow lines connected two FGUs in each branch of a network represent putative functional relationships with those of high introgression as putative regulators in the upstream and those of low introgression in the downstream, and the thickness of an arrow line was proportional to the introgression frequency of the downstream FGU in Table 5.
https://doi.org/10.1371/journal.pone.0200846.g004 loci were detected in two or more populations, suggesting the rate of false positives, if any, was low in this study. This was not surprising because the 84 CT ILs were selected from the original 5 BC populations of 2,250 BC 2 F 2 plants.
Secondly, we observed strong non-random associations between or among some of the detected CT loci, primarily detected as AGs each consisting of two or more unlinked but perfectly associated loci in the selected CT ILs from a single population. In these cases, the donor alleles at all loci of an AG were the target of the strong phenotypic selection for CT. The strongest one was branch C-1 that included 6 unlinked but highly associated loci of high introgression. In fact, 18 of the 22 CT ILs from population C (CY1/Fengaizhan) had the introgressed donor alleles at most C-1 loci. Theoretically, pronounced non-random associations between or among unlinked loci in selected progeny of a bi-parental cross suggest the presence of strong epistasis among genes acting in a same positively regulated pathway that leads to the target trait (CT) under strong directional selection [34]. If so, branch C-1 might represent an important pathway for CT in rice because ILs of this population showed the highest level of CT (Table 1). Our observation for the presence of several major branches in the CT genetic networks suggests the presence of several positively regulated pathways controlling CT in rice. Although it remains very challenging to determine what specific pathways were implicated by these putative QTL branches or associated QTL groups, transcriptome analyses using the CT ILs and its recipient under stress and non-stress conditions should provide valuable information in this respect [35][36][37][38][39].
Clearly, the putative genetic networks detected in this study should be verified. While it would be very difficult to clone and verify the molecular functions of the detected CT loci individually using the classical approaches of map-based cloning or of molecular biology, it is not difficult to confirm the putative CT genetic networks genetically. A straightforward strategy is to break, by recombination, each identified network or AG into individual loci and evaluate their effects individually in the progeny derived from the crosses between different ILs or between crosses between the ILs and CY1. Also, various omic tools and bioinformatic analyses can be very powerful to gain useful information for inferring the possible molecular mechanisms of the detected genetic networks by assessing the genomewide responses at different levels of carefully selected progeny from these crosses to cold stress [24,34,40]. Thus, the CT ILs developed in this study provide valuable materials for genetic and molecular dissection of complex genetic networks underlying complex traits such as CT in rice.
Our results have important implications for improving complex traits of Geng rice. The Geng rice gene pool is known to have very limited genetic diversity [1,41]. Our result that all four Xian donors contributed CT enhancing alleles at quite different loci to the Geng recipient, CY1, suggested that there is a rich source of hidden genetic diversity in the Xian gene pool for improving CT of Geng rice. Thus, our strategy of using BC breeding, strong phenotypic selection plus genetic tracking and characterization using DNA markers provides a good solution for broadening the genetic diversity in the Geng gene pool. Thus, genetic complementarity provides an appropriate explanation for the hidden diversity and transgressive segregation of CT observed in this study and other complex traits in rice [19][20][21]. Finally, the CT FGUs and their relationships identified in the CT ILs provide useful genetic information for further improving rice CT and simultaneous verification of the identified CT genetic networks using pyramiding populations derived from crosses between ILs carrying independent CT FGUs from different donors.

Conclusions
Total of 17 QTLs for CT were detected in 84 cold-tolerant ILs selected from five BC2 populations in CY1 genetic background using a consensus linkage map, three of which (qCT3.12, qCT6.7 and qCT9.6) were validated in random BC populations. In addition, 46 functional genetic units (FGUs) including 37 single loci and 9 association groups for CT were detected by χ 2 tests and multi-locus linkage disequilibrium analyses. The findings reported herein may be useful for knowledge-based rice improvement of CT. Future research will focus on validating the effects of these putative genetic networks detected in this study. Our results demonstrated that selective introgression is powerful in simultaneous improvement and genetic dissection of complex traits such as CT in rice.
Supporting information S1