Identification of QTL for Fiber Quality and Yield Traits Using Two Immortalized Backcross Populations in Upland Cotton

Two immortalized backcross populations (DHBCF1s and JMBCF1s) were developed by crossing a recombinant inbred line (RIL) population with its two parents, DH962 and Jimian5 (used as the males), respectively. The fiber quality and yield component traits of the two backcross populations were phenotyped in four environments (two locations, two years). One hundred seventy-eight quantitative trait loci (QTL) were detected, including 76 for fiber quality traits and 102 for yield components, explaining 4.08–17.79% of the phenotypic variation (PV). Among the 178 QTL, 22 stable QTL were detected in more than one environment or population. A stable QTL, qFL-c10-1, was detected in the previous F2 population, in a RIL population in 3 environments and in the two BCF1 populations of this study, explaining 5.79–37.09% of the PV. Additionally, 117 and 110 main-effect QTL (M-QTL) and 47 and 191 digenic epistatic QTL (E-QTL) were detected in the DHBCF1s and JMBCF1s populations, respectively. Digenic epistasis played a more important role in lint percentage, fiber length and fiber strength than in the other traits. These results provide more resources for obtaining stable QTL and confirm the authenticity and reliability of the QTL for molecular marker-assisted selection breeding and QTL cloning.


Introduction
Cotton is the largest natural textile fiber provider and one of the important oil crops worldwide. Approximately 50 species have been discovered in the cotton genus (Gossypium), among which only four cultivated species (G. herbaceum, G. arboreum, G. hirsutum, and G. barbadense) could be used for fiber production [1]. Of the four cultivated species, G. hirsutum, also known as upland cotton, is widely cultivated because of its wide adaptability and high production, accounting for over 95% of the world's cotton production [2].
Cotton fiber is an important raw material for the textile industry, and cotton fiber products are very popular because of their softness and comfort [3]. Under the stricter requirements of the modern textile industry, the fiber quality of current cotton cultivars cannot meet industrial demands. Thus, research on fiber development is particularly urgent. Cotton fiber is a spindly single cell derived from the ovule epidermis, and its development is a complex process [4]. The molecular mechanism of fiber initiation and elongation has been a research focus in cotton, and many novel genes related to fiber development have been detected [5–16]. For example, some genes related to fiber development, identified from a normalized fiber cDNA library, have been verified using transgenic analysis in our laboratory [8,14,15,17–19]. Meanwhile, cotton breeders have been working on the improvement of lint yield. In the past several decades, the yield of cotton has been improved greatly, but this trend has stagnated in recent years. The development of cultivars with both high yield and good fiber quality is the most urgent task for the cotton industry.
Fiber quality traits have been shown to be negatively correlated with yield traits in previous studies [3,20]. Although many genes related to fiber development and yield traits have been detected by reverse genetics, these genes are difficult to use directly in breeding. The fast development of molecular marker technology has made it possible to map QTL for fiber quality and yield traits and to pyramid favorable genes controlling cotton yield and fiber quality using marker-assisted selection (MAS). The genome of upland cotton is complex and large [21], and the genetic background of upland cotton is narrow [22]. Both factors hinder the development of QTL mapping in upland cotton. At present, hundreds of QTL related to fiber quality and yield traits have been obtained using population genetics in upland cotton [3,23–30]. Some stable QTL related to yield traits have been obtained, for example, qBS-D8-1 and qLP-D6-1 [31]. At the same time, many available QTL related to fiber length and fiber strength were also detected in previous studies, distributed on D3 and D11 [32], A1, D5 and D9 [24], and A9 [3,23].
In this study, two immortalized backcross populations were developed from recombinant inbred lines (RILs) [3]. Both backcross populations were planted in four environments to detect stable QTL and confirm available QTL related to fiber quality and yield traits; thus, useful information will be provided for marker-assisted selection breeding and cloning candidate genes in the future.

Plant materials
A RIL population was developed by crossing G. hirsutum acc. DH962 and G. hirsutum cv. Jimian5 in a previous study [3]. Two backcross populations were developed in this research. The first backcross population contained 178 BCF1 hybrids (DHBCF1s), produced by crossing the RILs with DH962 (used as the male), and the second population contained 178 BCF1 hybrids (JMBCF1s), produced by crossing the RILs with Jimian5 (used as the male). Each plot was 5 m long with 10 plants. A randomized block design was used to arrange the lines in the field. The boll number per plant (BN) was recorded in the middle of September, and twenty naturally opened bolls of each line were harvested in early October for fiber quality and yield investigation. Fiber qualities were measured using an HVI1000 Automatic Fiber Determination System at 20 °C and 65% relative humidity in the Institute of Cotton Research, Shihezi Academy of Agricultural Sciences, Xinjiang. Six yield and five fiber quality components were analyzed, including the seed cotton weight per boll (SCW), lint weight per boll (LW), lint percentage (LP), boll number per plant (BN), lint index (LI), seed index (SI), fiber length (FL, mm), fiber strength (FS, cN/tex), fiber length uniformity ratio (FU), fiber elongation (FE), and micronaire (MIC).

Genotype analysis
A total of 634 primers were selected from Wang et al. [33] to genotype the RIL population [3], and a genetic map including 616 loci was constructed. The genotypes of the two backcross populations were deduced from the genotypes of the RIL population, as in previous studies [34,35]. The genotypes of DHBCF1s (AA or AB) were deduced from the cross between the RIL genotypes (AA or BB) and DH962 (AA), and the genotypes of JMBCF1s (BB or AB) were deduced from the cross between the RIL genotypes (AA or BB) and Jimian5 (BB). If the RIL genotype at a locus was heterozygous, the genotype of the corresponding BCF1 hybrid was also deduced to be heterozygous.
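The deduction rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the single-letter genotype codes ("A" for AA, "B" for BB, "H" for heterozygous, "-" for missing) and the function names are hypothetical and are not taken from the original analysis.

```python
# Hypothetical genotype codes (an assumption of this sketch):
#   "A" = AA (DH962 allele), "B" = BB (Jimian5 allele),
#   "H" = heterozygous (AB), "-" = missing data.

def deduce_bcf1(ril_genotype: str, male_parent: str) -> str:
    """Deduce the BCF1 genotype at one locus from the RIL genotype.

    male_parent is "A" for DH962 (DHBCF1s) or "B" for Jimian5 (JMBCF1s).
    """
    if male_parent not in ("A", "B"):
        raise ValueError("male_parent must be 'A' or 'B'")
    if ril_genotype == "-":
        return "-"  # missing stays missing
    if ril_genotype == "H":
        return "H"  # rule stated in the text: heterozygous RIL -> heterozygous BCF1
    if ril_genotype not in ("A", "B"):
        raise ValueError(f"unknown genotype code: {ril_genotype!r}")
    # Homozygous RIL x homozygous parent: same allele -> homozygote,
    # different allele -> heterozygote.
    return ril_genotype if ril_genotype == male_parent else "H"


def deduce_population(ril_rows: dict, male_parent: str) -> dict:
    """Apply the rule to every locus of every RIL (line name -> list of codes)."""
    return {
        line: [deduce_bcf1(g, male_parent) for g in genotypes]
        for line, genotypes in ril_rows.items()
    }


if __name__ == "__main__":
    rils = {"RIL001": ["A", "B", "H", "-"]}  # made-up example line
    print(deduce_population(rils, "A"))  # DHBCF1s-style deduction
    print(deduce_population(rils, "B"))  # JMBCF1s-style deduction
```

With DH962 as the male, every deduced genotype is AA or AB; with Jimian5 as the male, every deduced genotype is BB or AB, which matches the two genotype classes given for each population.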

Data analysis and QTL detection
The differences in the phenotypic data between DH962 and Jimian5 were tested using a t-test. The phenotypic data of the fiber quality and yield components were analyzed using SPSS version 21.0 (SPSS, Chicago, IL, USA). The linkage map of the RIL population from a previous report was used for QTL mapping in the present study [3]. Additionally, the physical locations of the marker sequences were determined using a BLASTN search against the G. hirsutum (TM-1) genome [21] with an E-value cut-off of 1e-10. The composite interval mapping (CIM) method of Windows QTL Cartographer version 2.5 (http://statgen.ncsu.edu/qtlcart/WQTLCart.htm) was used to identify QTL for fiber quality and yield components in the two backcross populations. The mapping population types of the DHBCF1s and JMBCF1s populations were B1 and B2, respectively. The standard model (Model 6) was used to identify QTL action. The LOD threshold values were estimated by running 1,000 permutations to declare significant QTL for all of the traits [36]. A LOD ≥ 2.5 was used to declare suggestive QTL; when the confidence interval of such a QTL overlapped with that of a QTL detected in another environment or population with a LOD ≥ 2.0, it was considered a common QTL [37]. The main-effect QTL (M-QTL), digenic epistatic QTL (E-QTL) and their environmental interactions (QTL×environment, QE) in the two backcross populations were identified using two-locus analysis in the software ICIMapping 4.1 (http://www.isbreeding.net/software/?type=detail&id=18). The mapping population types of the DHBCF1s and JMBCF1s populations were P1BC1F1 and P2BC1F1, respectively. The ICIM-ADD and ICIM-EPI models were used for the analysis of M-QTL and E-QTL, respectively. A LOD ≥ 2.5 was used to declare suggestive M-QTL, and a threshold of LOD ≥ 5.0 was used to declare the presence of E-QTL. QTL nomenclature followed the method in a previous report [38].
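The rule for declaring a common QTL (LOD ≥ 2.5 for the QTL itself, plus an overlapping confidence interval with LOD ≥ 2.0 in another environment or population) can be illustrated with a minimal Python sketch. The record layout, the function names and all positions in the example are hypothetical; the actual analysis was done with Windows QTL Cartographer output, not with this code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QtlPeak:
    """One detected peak; field names are assumptions of this sketch."""
    trait: str
    chrom: str
    left_cm: float   # left bound of the confidence interval (cM)
    right_cm: float  # right bound of the confidence interval (cM)
    lod: float
    source: str      # environment/population label, e.g. "2014JZ-DHBCF1s"


def intervals_overlap(a: QtlPeak, b: QtlPeak) -> bool:
    """True if both peaks are for the same trait and chromosome and their intervals overlap."""
    return (
        a.trait == b.trait
        and a.chrom == b.chrom
        and a.left_cm <= b.right_cm
        and b.left_cm <= a.right_cm
    )


def is_common_qtl(primary: QtlPeak, others: list,
                  declare_lod: float = 2.5, support_lod: float = 2.0) -> bool:
    """Apply the thresholds stated in the text to one candidate QTL."""
    if primary.lod < declare_lod:
        return False
    return any(
        o.source != primary.source
        and o.lod >= support_lod
        and intervals_overlap(primary, o)
        for o in others
    )


if __name__ == "__main__":
    # Positions and LOD scores below are invented for illustration only.
    p = QtlPeak("LP", "c11", 10.0, 18.0, 3.1, "2014JZ-DHBCF1s")
    q = QtlPeak("LP", "c11", 15.0, 22.0, 2.2, "2014JZ-JMBCF1s")
    print(is_common_qtl(p, [q]))
```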
The graphic representation of the linkage map and QTL was drawn using MapChart V2.2 software [39].
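The permutation approach used to set LOD thresholds in this section can be sketched as follows. This is a deliberately simplified illustration: it uses a single-marker scan with two genotype classes rather than composite interval mapping, and the function names, genotype codes and toy data are all assumptions of the sketch, not the procedure implemented in Windows QTL Cartographer.

```python
import math
import random


def marker_lod(phenotypes, genotypes):
    """Single-marker LOD: (n/2) * log10(RSS_null / RSS_genotype_means)."""
    n = len(phenotypes)
    grand_mean = sum(phenotypes) / n
    rss_null = sum((y - grand_mean) ** 2 for y in phenotypes)
    groups = {}
    for y, g in zip(phenotypes, genotypes):
        groups.setdefault(g, []).append(y)
    rss_model = 0.0
    for ys in groups.values():
        mean = sum(ys) / len(ys)
        rss_model += sum((y - mean) ** 2 for y in ys)
    if rss_null == 0.0:
        return 0.0       # no phenotypic variation at all
    if rss_model == 0.0:
        return math.inf  # genotype explains the phenotype perfectly
    return (n / 2.0) * math.log10(rss_null / rss_model)


def permutation_threshold(phenotypes, markers, n_perm=1000, alpha=0.05, seed=0):
    """Empirical genome-wide LOD threshold from shuffled phenotypes.

    markers is a list of genotype lists, one list per marker.
    """
    rng = random.Random(seed)
    shuffled = list(phenotypes)
    maxima = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # breaks any genotype-phenotype association
        maxima.append(max(marker_lod(shuffled, m) for m in markers))
    maxima.sort()
    index = min(n_perm - 1, math.ceil((1.0 - alpha) * n_perm) - 1)
    return maxima[index]


if __name__ == "__main__":
    rng = random.Random(1)
    n_lines = 178  # population size reported in the text
    markers = [[rng.choice("AH") for _ in range(n_lines)] for _ in range(20)]
    phenotypes = [rng.gauss(30.0, 1.0) for _ in range(n_lines)]  # toy trait values
    print(permutation_threshold(phenotypes, markers, n_perm=200))
```

Taking the maximum LOD across all markers in each permutation is what makes the resulting threshold genome-wide rather than per-marker.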

Fiber quality and yield traits under four environments
The trait data of fiber qualities and yield components of the parents and the two BCF1 populations across four environments are shown in S1 Table. Significant differences between the parents were observed for most of the fiber and yield traits, except SI and LI. The parent DH962 was better in fiber qualities, and Jimian5 performed well in yield components. Skewness and kurtosis values showed that the fiber quality and yield traits of the two BCF1 populations were approximately normally distributed (S1 Table; Fig 1; S1 Fig). For DHBCF1s, all the maximum phenotypic values were larger than those of the parent DH962. In JMBCF1s, the minimum phenotypic values, except for FL in 2013HG, were smaller than those of the parent Jimian5. These results showed that all traits exhibited transgressive segregation in the two BCF1 populations. Meanwhile, the average levels of the fiber quality traits of DHBCF1s were higher than those of JMBCF1s, and the average levels of the yield component traits of JMBCF1s were higher than those of DHBCF1s (S1 Table).
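As a rough illustration of how skewness and kurtosis are used to judge approximate normality, the following Python sketch computes moment-based skewness and excess kurtosis for one trait. It is an assumption-laden sketch: the values in S1 Table were produced with SPSS, which reports bias-corrected estimators that differ slightly from the simple moment estimators used here, and the cut-off of 1 in absolute value is a common rule of thumb rather than a criterion stated in the text.

```python
def skewness_and_excess_kurtosis(values):
    """Moment-based skewness (g1) and excess kurtosis (g2) of a sample."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    if m2 == 0.0:
        raise ValueError("all observations are identical")
    return m3 / m2 ** 1.5, m4 / m2 ** 2 - 3.0


def roughly_normal(values, limit=1.0):
    """Rule-of-thumb check (hypothetical cut-off): |skewness| and |kurtosis| below limit."""
    skew, kurt = skewness_and_excess_kurtosis(values)
    return abs(skew) < limit and abs(kurt) < limit


if __name__ == "__main__":
    fiber_length_mm = [28.9, 29.4, 30.1, 30.3, 30.8, 31.2, 31.9]  # made-up values
    print(skewness_and_excess_kurtosis(fiber_length_mm))
    print(roughly_normal(fiber_length_mm))
```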

Correlation between fiber quality and yield traits in two backcross populations
In DHBCF1s (S2 Table), SCW was significantly and positively correlated with LW, SI and MIC. LW was significantly and positively correlated with LP and MIC and was significantly and negatively correlated with FL and FS. LP was significantly and negatively correlated with FL, FU and FS. FL was significantly and positively correlated with FU and FS, and significantly and negatively correlated with MIC and FE. FU was significantly and positively correlated with FS, and significantly and negatively correlated with FE. MIC was significantly and positively correlated with FS and FE. All other correlations were neither significant nor stable. In JMBCF1s (S3 Table), SCW was significantly and positively correlated with LW, LI, SI and MIC. LW was significantly and positively correlated with LP and LI. LP was significantly and negatively correlated with SI, FL and FS. LI was significantly and positively correlated with SI and MIC. FL was significantly and positively correlated with MIC, FU, FS and FE. FE was significantly and positively correlated with FS and MIC. Some stable correlations between different traits were obtained from the results of the two BCF1 populations. SCW was significantly and positively correlated with LW, SI and MIC. LW was significantly and positively correlated with LP. LP was significantly and negatively correlated with FL and FS. FL was significantly and positively correlated with FU and FS.
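A pairwise trait correlation of the kind summarized above can be sketched in Python as follows. The sketch assumes a Pearson correlation and uses a large-sample normal approximation to the t statistic for the p-value, which is reasonable for populations of about 178 lines but is not necessarily the exact test applied in SPSS; the function names and the example values are hypothetical.

```python
import math
from statistics import NormalDist


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long samples."""
    if len(x) != len(y) or len(x) < 3:
        raise ValueError("need two samples of equal length >= 3")
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0.0 or syy == 0.0:
        raise ValueError("a constant sample has no defined correlation")
    return sxy / math.sqrt(sxx * syy)


def approx_p_value(r, n):
    """Two-sided p-value for r, treating t as standard normal (large n only)."""
    if abs(r) >= 1.0:
        return 0.0
    t = r * math.sqrt((n - 2) / (1.0 - r * r))
    return 2.0 * (1.0 - NormalDist().cdf(abs(t)))


if __name__ == "__main__":
    lint_percentage = [38.1, 39.5, 40.2, 41.0, 42.3, 43.1]  # made-up values
    fiber_length = [31.0, 30.6, 30.1, 29.8, 29.0, 28.7]     # made-up values
    r = pearson_r(lint_percentage, fiber_length)
    print(r, approx_p_value(r, len(lint_percentage)))
```

A negative r with a small p-value in such a comparison would correspond to the kind of negative LP and FL relationship reported for both populations.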
LP: Fifteen QTL associated with LP were detected in the two populations, explaining 5.22–12.54% of the PV (S4 Table). Seven QTL were identified in the DHBCF1s population, and 9 QTL were detected in the JMBCF1s population. qLP-c11 was identified in the DHBCF1s and JMBCF1s populations in the same environment (2014JZ), explaining 5.78–9.70% of the PV (Table 1).
BN: Fourteen QTL were identified on 5 chromosomes and 1 linkage group in the two populations (S4 Table). Among the 14 QTL, 7 were identified in each of the DHBCF1s and JMBCF1s populations. Five QTL were located on LG2-c9/23, and 4 were located on Chr26.
SI and LI: Five and eight QTL were detected for SI and LI, respectively (S4 Table). qSI-c26 was identified in the DHBCF1s and JMBCF1s populations in the same environment (2014JZ), located between markers CCRI272 and MON_CGR6759 (Table 1). For LI, two common QTL were identified on Chr26 (Table 1). qLI-c26-1 was identified in the two backcross populations in the same environment (2014JZ), explaining 7.39–11.11% of the PV. qLI-c26-2 was also identified in the two populations in the same environment (2014JZ).
MIC: Twenty QTL were detected on 10 chromosomes and 2 linkage groups, explaining 4.63–13.70% of the PV (S4 Table). Three stable QTL were identified in more than one environment or population (Table 2). qMIC-c1/15-2 was identified in the DHBCF1s and JMBCF1s populations in the same environment (2014JZ), explaining 13.68–13.70% of the PV. qMIC-c9 was located between markers MON_DPL0530 and NAU2354 and was detected in the JMBCF1s population in three environments, explaining 4.91–7.02% of the PV. qMIC-c10-1 was detected in the JMBCF1s population in two environments, explaining 4.63–5.71% of the PV.
FE: Three QTL were detected on 2 chromosomes and 1 linkage group, explaining 5.85–16.13% of the PV (S4 Table). qFE-c22 was identified in the JMBCF1s population in 13JZ, explaining 16.13% of the PV, with a LOD score of 7.03.

Discussion
In the present study, a RIL population was crossed with its two parents (DH962 and Jimian5) as the males to construct two immortalized BCF1 populations. S1 Table shows that the average levels of fiber quality traits of DHBCF1s were higher than those of JMBCF1s, and the average levels of yield component traits of JMBCF1s were higher than those of DHBCF1s. The recurrent parent therefore clearly affected population performance. The differences in the fiber quality and yield component traits between the two BCF1 populations were useful for QTL mapping of the different traits [35,40].
In our previous studies, 33 QTL were detected using an F2 population derived from a cross between DH962 and Jimian5 [33]. A RIL population developed from the same parents was phenotyped under 8 environments, identifying 134 QTL for fiber quality and yield traits [3]. In the present study, 178 QTL were detected in four environments using the two BCF1 populations. Using the F2 population, the RIL population and the two BCF1 populations developed from the same parents could mutually increase the power of QTL detection, a finding that was consistent with previous studies in cotton [35,40]. Some new stable QTL were detected using the two BCF1 populations (Tables 1 and 2). For example, qSCW-c1/15-1 and qLW-c1/15-3 were detected in both BCF1 populations and in the same genomic region. Two new QTL for FL, qFL-c2 and qFL-c21-2, were identified. A stable QTL, qMIC-c9, was detected only in JMBCF1s, in 3 environments. In addition, 5 of the 33 QTL in the F2 population and 17 of the 134 QTL in the RIL population were verified in the two BCF1 populations (Table 3).
Fiber length is one of the most important indicators of fiber quality. The QTL qFL-c10-1 was detected in the F2 population, in the RIL population in 3 environments and in the two BCF1 populations in 3 environments, explaining 5.79–37.09% of the PV. A total of 470 QTL for fiber length distributed on 26 chromosomes have been collected in the Cotton QTL Database (http://www2.cottonqtldb.org:8081/index). Compared with these QTL, qFL-c10-1 was identified only in our study; thus, the region between markers CIR305 and HAU-J5638 would be a novel and important research focus for MAS and map-based cloning. qFL-c10-2 was also an important locus for fiber length: it was detected in the JMBCF1s population in two years and in the RIL population, and it was also identified as a major QTL in previous studies [23,25]. qFE-c22 was detected as a major QTL in the RIL (qFE-c22) and F2 (qFE-c22-1) populations, respectively. Among the yield component traits, qSCW-c9/23-2 was detected in the RIL population in two years and was verified as qSCW-c9/23-3 in the two BCF1 populations. Three QTL related to lint percentage were also verified in the BCF1 populations. The stable QTL for fiber quality and yield component traits identified in this research were more comprehensive and significant, and they could be used for future fine mapping and gene cloning to promote molecular breeding in cotton.
To date, the current release (Release 2.1) of the Cotton QTL Database has collected 4,189 QTL from 132 cotton publications. The large number of QTL distributed across the whole cotton genome reveals the complexity of the cotton genome and the difficulty of QTL mapping in cotton. The identification of common QTL among different studies is useful to confirm the authenticity and reliability of QTL. Compared with previous studies, some common QTL were detected according to the same markers on the same chromosomes. The QTL qSCW-c21 was identified in a natural population by association analysis [41]. qLW-c26 corresponded to the QTL qLY-26 in an F2:3 population [42]. The QTL qBN-Chr14-1 was detected as a stable QTL qBNP-Chr14-1 in a RIL and a BC population derived from a cross between the upland cotton lines GX1135 and GX100-2 [40]. qBN-c14-2 was detected as qNB-D2-1 in a 4WC population [43]. Additionally, some stable QTL for fiber quality traits were obtained. qFL-c10-2 was identified as a stable QTL in two studies [23,25]. qFL-c25 was detected as the major QTL qFL-C25-2 in a RIL population [44]. qFU-c22 was the same as the QTL qUI-c22 in a randomly mated recombinant inbred population [45]. Tan et al. (2015) obtained qFM24.1 and qFS07.1 using a high-density intraspecific genetic map [24], and they were the same as qMIC-c24 and qFS-c7 in the present study. Additionally, qFS-c7 was also verified in F2 and RIL populations [46]. qFS-c13-1 and qFS-c13-2 were detected in a RIL population [44] and a natural population [47], respectively. The stable QTL qFE-c22 was also confirmed as qELO-c22 in a previous study [45]. The 12 common QTL detected in different populations confirmed the stability and veracity of these QTL, providing resources for the fine mapping of these candidate QTL and for developing functional markers for MAS.
After analysis by ICIM, 227 M-QTL were detected in the two BCF1 populations. Comparing the results of the CIM and ICIM analyses, 94 QTL detected by CIM were verified in the ICIM analysis. More QTL were detected by ICIM than by CIM, which was consistent with previous studies [40,48]. For E-QTL, 238 E-QTL and QEs were obtained. This result showed that E-QTL and QEs existed widely in the BCF1 populations, and epistasis played an important role in heterosis of the BCF1 populations [40,48]. The E-QTL and QEs identified in the DHBCF1s and JMBCF1s populations were more numerous for LP, FL and FS than for other traits, indicating that digenic epistasis played a more important role in the heredity and expression of LP, FL and FS.