Identification of Chromosome Segment Substitution Lines of Gossypium barbadense Introgressed in G. hirsutum and Quantitative Trait Locus Mapping for Fiber Quality and Yield Traits

Chromosome segment substitution lines MBI9804, MBI9855, MBI9752, and MBI9134 were obtained by advanced backcrossing and continuous inbreeding from an interspecific cross between CCRI36, a cultivar of upland cotton (Gossypium hirsutum), as the recurrent parent and Hai1, a cultivar of sea island cotton (G. barbadense), as the donor parent. These lines were used to construct a multiple-parent population of (MBI9804×MBI9855)×(MBI9752×MBI9134). The segregating generations of double-crossed F1, F2, and F2:3 were used to map quantitative trait loci (QTLs) for fiber quality and yield-related traits. The recovery rate of the recurrent parent CCRI36 in the four parental lines ranged from 94.3% to 96.9%. Each of the parental lines harbored 12–20 introgressed segments from Hai1 across 21 chromosomes. The number of introgressed segments ranged from 1 to 27 for the individuals in the three generations, mostly from 9 to 18, which represented a genetic length of between 126 cM and 246 cM. A total of 24 QTLs controlling fiber quality and 11 QTLs controlling yield traits were detected using the three segregating generations. These QTLs were distributed across 11 chromosomes and explained 1.78%–20.27% of the observed phenotypic variation. Sixteen QTLs were consistently detected in two or more generations; four of them were for fiber yield traits and 12 were for fiber quality traits. One introgressed segment could significantly reduce both lint percentage and fiber micronaire. This study provides useful information for gene cloning and marker-assisted breeding for excellent fiber quality.


Introduction
Cotton is one of the most important cash crops in the world, and cotton fiber provides the main natural raw material for the textile industry. Upland cotton (Gossypium hirsutum) has a high yield and wide adaptability, but a relatively low fiber quality. On the other hand, sea island cotton (G. barbadense) has excellent fiber quality but low yield and limited adaptability. Therefore, one way to improve the fiber quality of upland cotton is to introgress favorable genes from sea island cotton into upland cotton. However, the yield and quality of cotton are quantitative traits that are affected by multiple genes and are often negatively correlated [1][2][3]. The simultaneous improvement of cotton fiber quality and yield is therefore a difficult task for breeders in conventional breeding [4]. With the continuous development and improvement of molecular marker technologies, researchers have conducted extensive studies to construct cotton genetic maps and identify quantitative trait loci (QTLs). This could make it possible to simultaneously improve both fiber quality and yield in a breeding program.
The construction of the first cotton molecular genetic map [5] facilitated QTL mapping. Completion of the cotton genome draft sequence laid a foundation for further molecular design breeding at the whole-genome level [6][7][8][9]. Most segregating populations used for QTL identification are F2 [10][11][12], BC1 [13], BIL [14], and RIL [15][16][17][18] populations. However, because some of these segregating populations (e.g., F2) are not immortal, the results of QTL identification are usually difficult to repeat. Furthermore, although several QTLs for cotton fiber quality or yield traits have been identified, fine mapping and cloning of these genes have rarely begun. Chromosome segment substitution lines (CSSLs), also known as introgression lines, are permanent populations that possess the same genetic background as the recurrent parent. Differences among CSSLs usually involve only one or a few introgressed chromosome segments, which effectively eliminates interference from the genetic background. CSSLs are also highly efficient in detecting QTLs with minor effects. Therefore, CSSLs are ideal materials for QTL fine mapping, gene cloning, and investigating QTL interactions. Since Eshed and Zamir first constructed introgression lines of tomato [19], such lines have been successfully applied in rice, corn, and other plants [20][21][22][23].
CSSLs are seldom reported in QTL studies in cotton. Stelly et al. first constructed 17 chromosome substitution lines of G. barbadense in TM-1 background of G. hirsutum [24]. Subsequently, the same research team performed a thorough analysis of the genetic effects of CSSLs [25][26][27][28][29][30][31]. Their results showed that the sea island cotton genotype has positive effects on fiber quality traits, suggesting that these particular traits are influenced by multiple genes [32]. Other researchers also used CSSLs for QTL mapping of fiber quality and yield traits [33][34][35][36][37]. Although these CSSL populations are beneficial for QTL mapping, a large gap between the QTL mapping and the application of these lines in breeding programs remains to be resolved.
To obtain new lines for direct application in breeding while also conducting basic research, we constructed a CSSL population using upland cotton CCRI36 and sea island cotton Hai1, both of which are commercially grown cultivars. A genetic linkage map containing 2,292 markers was constructed [38], and cotton fiber quality and yield-related QTLs were identified in this CSSL population. Liang et al. [39] detected 20 yield-related QTLs in the BC5F2 generation of this CSSL population. Through multi-ecological environment evaluations of the yield and fiber quality of the population (BC5F3, BC5F3:4, and BC5F3:5), Zhang et al. [40] and He et al. [41] mapped specific QTLs for fiber quality and yield-related traits using selected lines with stable and excellent fiber quality and yield.
In the present study, based on the phenotypic performance of this CSSL population [38], as well as previous findings from multi-ecological environment investigations [38][39][40][41], four introgression lines with excellent fiber quality, MBI9804, MBI9855, MBI9752, and MBI9134, were selected as parental lines, and a double-crossed population of (MBI9804×MBI9855)×(MBI9752×MBI9134) was constructed. The introgressed Hai1 segments were evaluated in the segregating generations F1 and F2 and verified in the following F2:3 generation using SSR markers. We also performed QTL mapping for fiber quality and yield-related traits with these three generations. New stable QTLs for fiber quality and yield traits were identified and validated in multiple generations. Our study lays a foundation for fine mapping of fiber quality QTLs and for using them in breeding via marker-assisted selection (MAS).

Materials and Methods

Materials
The CSSL population was constructed by crossing and backcrossing between the donor parent Hai1 and the recurrent parent CCRI36. Hai1 is a commercially grown cultivar of G. barbadense that carries a dominant glandless gene, is highly resistant to Verticillium wilt, and has excellent fiber quality [42]. CCRI36 is a widely grown upland cotton cultivar with high yield and early maturity, developed by the Institute of Cotton Research of Chinese Academy of Agricultural Sciences (State Approval Certificate of Cotton 990007). The development of the CSSL population was reported by Shi et al. [38]. The four BC5F4 introgression lines, MBI9804, MBI9855, MBI9752, and MBI9134, with stable and excellent fiber quality performance, were selected as parental lines to construct a double-crossed population of (MBI9804×MBI9855)×(MBI9752×MBI9134). In 2012, the double-crossed F1 was planted in the experimental farm (Anyang, Henan Province) of the Institute of Cotton Research of Chinese Academy of Agricultural Sciences. All parental lines were planted in two rows, and a total of 868 individuals of the double-crossed F1 were planted in 45 rows. Each row was 5 m long with 20 plants, and rows were 0.8 m apart. The

Investigation of Fiber Yield and Quality Traits
In 2012 and 2013, the phenotypic traits of each plant were investigated. Naturally opened bolls were harvested from each plant for indoor testing, which included seed cotton weight, fiber weight, boll weight (BW), lint percentage (LP), fiber length (FL), fiber uniformity (FU), fiber micronaire (FM), and fiber strength (FS). In 2014, 30 naturally opened bolls were harvested from each plot and subjected to the same indoor tests as in 2012 and 2013. The fiber quality traits were tested with an HFT9000 instrument in the Cotton Quality Supervision and Testing Center of the Ministry of Agriculture of China. HVICC international calibration cotton samples were used.

DNA Extraction and SSR Molecular Detection
In 2012 and 2013, young leaves of the parental lines and the double-crossed F1 and F2 individual plants were sampled. DNA was extracted using a modified CTAB method [43]. SSR amplification and polyacrylamide gel electrophoresis were performed following Zhang's description [44]. Based on the previously constructed CCRI36×Hai1 BC1F1 genetic linkage map [38], SSR markers were selected at intervals of 10–20 cM. A total of 526 pairs of polymorphic SSR markers between CCRI36 and Hai1 were selected for screening of polymorphisms among the four parental lines. Finally, 51 of the 526 markers were identified as polymorphic and used for genotyping of the double-crossed F1 and F2 individual plants. The sequences of the SSR primers were uploaded to the CMD database (http://www.cottonmarker.org/). The primers used in the present study were synthesized by Beijing Sunbiotech Co., Ltd. (Beijing, China).

Analysis of Phenotypic Traits
EXCEL 2013 software was used for the descriptive statistical analysis of fiber quality traits (including FL, FS, FM, and FU) and yield-related traits (including BW and LP) for the double-crossed F1 and F2 individual plants and the F2:3 family lines. The statistical values included the average, maximum, minimum, transgressive rate over the recurrent parent (%), coefficient of variation, skewness, and kurtosis. Correlation analysis and ANOVA were performed using the SPSS20.0 software.
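As an illustration of the descriptive statistics listed above, the following is a minimal Python sketch and not the EXCEL procedure actually used. It assumes that the transgressive rate is the percentage of individuals whose trait value exceeds that of the recurrent parent, and it uses simple moment-based estimators of skewness and excess kurtosis, which differ slightly from the sample-adjusted formulas in spreadsheet software. The function name `describe` and the example values are hypothetical.

```python
import statistics


def describe(values, recurrent_parent_value):
    """Summarize one trait in one generation.

    Assumption: the transgressive rate is the share (%) of individuals
    exceeding the recurrent parent value.
    """
    n = len(values)
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)  # sample standard deviation

    # Central moments (population form) for skewness and kurtosis.
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n

    return {
        "mean": mean,
        "max": max(values),
        "min": min(values),
        "cv_pct": 100 * sd / mean,
        "skewness": m3 / m2 ** 1.5,
        "excess_kurtosis": m4 / m2 ** 2 - 3,
        "transgressive_rate_pct": 100 * sum(v > recurrent_parent_value for v in values) / n,
    }


if __name__ == "__main__":
    # Hypothetical fiber length values (mm) and recurrent parent value.
    fiber_length = [28.9, 30.1, 29.4, 31.2, 30.6, 29.8]
    for key, value in describe(fiber_length, recurrent_parent_value=29.0).items():
        print(f"{key}: {value:.3f}")
```

An absolute skewness below 1 in such a summary is the criterion used in the Results to judge whether a trait is approximately normally distributed.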

Genotypic Analysis for Parents and Population
Genotypic analysis of the parental lines and population was performed based on the SSR polymorphic results using the GGT2.0 software developed by van Berloo (http://www.plantbreeding.wur.nl/UK/software_ggt.html) [45]. The background recovery rate of the CSSLs to the recurrent parent and the number and length of the introgressed segments were calculated.
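The quantities calculated here can be sketched in a few lines of Python. This is a simplified illustration and not the algorithm implemented in GGT2.0: it assumes that each introgressed segment is given as a (start, end) interval in cM and that the background recovery rate is 100% minus the introgressed share of the total detected map length. The function name, the segment coordinates, and the total map length are hypothetical.

```python
def background_recovery(segments_cm, total_map_length_cm):
    """Return segment count, introgressed length, and recovery rate.

    segments_cm: list of (start, end) positions in cM of donor segments.
    total_map_length_cm: total genetic length covered by the markers.
    Assumption: recovery rate = 100 - introgressed share of the map (%).
    """
    introgressed = sum(end - start for start, end in segments_cm)
    introgressed_pct = 100 * introgressed / total_map_length_cm
    return {
        "n_segments": len(segments_cm),
        "introgressed_cm": introgressed,
        "introgressed_pct": introgressed_pct,
        "recovery_pct": 100 - introgressed_pct,
    }


if __name__ == "__main__":
    # Hypothetical line with two donor segments on a 1,000 cM map.
    print(background_recovery([(0, 10), (50, 65)], total_map_length_cm=1000))
```

Under this definition the introgressed percentage and the recovery rate always sum to 100%, which matches the way the two ranges are reported for the parental lines (2.4%-5.9% introgressed and 94.1%-97.6% recovered).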

QTL Mapping
The linkage map was constructed using the MapChart2.2 software [46]. QTL mapping was performed using the QTL IciMappingV4.0 software developed by Wang et al. [47]. The nomenclature of QTL was: q + trait abbreviation + chromosome number + serial number of the marker closely linked to the trait. For example, qFS-2-7 represented a QTL controlling fiber strength near the seventh marker on chromosome 2 (Chr2).
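The naming rule can be made concrete with a short Python sketch that splits a QTL name into its parts. The regular expression and the function name `parse_qtl_name` are illustrative assumptions; the trait abbreviations are those defined in this paper.

```python
import re

# Trait abbreviations as defined in this study.
TRAITS = {
    "FL": "fiber length",
    "FS": "fiber strength",
    "FM": "fiber micronaire",
    "FU": "fiber uniformity",
    "BW": "boll weight",
    "LP": "lint percentage",
}

# q + trait abbreviation + chromosome number + marker serial number.
QTL_NAME = re.compile(r"^q([A-Z]+)-(\d+)-(\d+)$")


def parse_qtl_name(name):
    """Split a QTL name such as 'qFS-2-7' into trait, chromosome, marker."""
    match = QTL_NAME.match(name)
    if match is None:
        raise ValueError(f"not a valid QTL name: {name!r}")
    trait, chromosome, marker = match.groups()
    if trait not in TRAITS:
        raise ValueError(f"unknown trait abbreviation: {trait!r}")
    return {
        "trait": TRAITS[trait],
        "chromosome": int(chromosome),
        "marker_index": int(marker),
    }


if __name__ == "__main__":
    print(parse_qtl_name("qFS-2-7"))
```

For qFS-2-7 the sketch returns fiber strength, chromosome 2, and marker index 7, in agreement with the example given in the text.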

Fiber Quality and Yield Traits and Correlation Analysis
The four parents had longer fiber than the recurrent parent CCRI36 (P < 0.05). MBI9804 and MBI9134 had significantly stronger fiber than CCRI36 (P < 0.01) (Tables 1 and 2). In the three generations, except for FU in 2013 and FM in 2014, the average values of the traits were higher than those of the recurrent parent CCRI36 (P < 0.05). The absolute value of the skewness was <1, indicating that the fiber quality and yield traits were approximately normally distributed in the three generations. The recovery rates to the recurrent parent CCRI36 for FL, FU, FM, and FS increased from F1 to F2:3, whereas those of BW and LP decreased. The coefficient of variation indicated that the fiber quality traits FM, FS, and FL were highly variable compared to FU.
Correlation analysis between fiber quality and yield traits (Table 3) showed that FS was significantly positively correlated with FL in all three generations and significantly positively correlated with FU in F1 and F2. FL was significantly positively correlated with FU in F1 and F2 and significantly negatively correlated with FM in F2 and F2:3, but significantly positively correlated with FM in F1.

Genotypic Analysis of the Parents and Populations
Introgressed Hai1 chromosome segments in the four parental lines were identified by SSR markers using the GGT2.0 software (S1 Table, Fig 1). The background recovery rate in the four parental lines ranged from 94.1% to 97.6%. Each of the parental lines harbored 12-20 introgressed Hai1 chromosome segments, spanning a genetic length of 116.3 cM-283.1 cM and accounting for 2.4%-5.9% of the total detected genetic length. The introgressed Hai1 segments were distributed across 21 chromosomes in all four parental lines, mainly on Chr10, Chr11, Chr20 and Chr23. Chr20 had the largest number of introgressed segments, whereas no introgressed segments were detected on Chr4, Chr19, Chr22, Chr24 and Chr25. Homozygous introgressed Hai1 segments outnumbered heterozygous introgressed segments in three of the parental lines, the exception being MBI9855. Genotyping analysis (S2 Table) showed that the average recovery rates to CCRI36 in the F1, F2 and F2:3 generations were 96.1%, 96.3% and 96.0%, respectively. The average length of the homozygous introgressed Hai1 segments ranged from 71.8 cM to 110.0 cM in the three generations, whereas the average length of the heterozygous introgressed Hai1 segments was 68.5-103.5 cM. The average length of the total introgressed Hai1 segments was 172.0-182.8 cM.
In all three generations, most individual plants contained 4 or more introgressed Hai1 segments, whereas a few plants contained 1-3 introgressed Hai1 segments. The minimum number of introgressed Hai1 segments was one and the maximum was 27, with an average of 13.6-14.5 introgressed Hai1 segments. The length of the introgressed segments was mainly between 126 cM and 246 cM. In F2, no heterozygous segments were detected in three plants. In F1, no homozygous segments were detected in four plants and one introgressed segment was detected in only one plant (S3 Table, Fig 2a and 2b).
Fiber uniformity: Two QTLs for fiber uniformity were mapped on Chr17 and Chr20. qFU-17-5 explained 2.49%-3.83% of the observed phenotypic variation with a negative additive effect. qFU-20-9 was detected in both F1 and F2, explaining 3.57% of the observed phenotypic variation, with a positive additive effect, indicating that Hai1 alleles increased fiber uniformity.
Boll weight: Two QTLs for BW were mapped on Chr11 and Chr12, explaining 6.02%-9.50% of the observed phenotypic variation. The positive additive effects indicated that Hai1 alleles increased boll weight.

Assessment and Application of Chromosome Segment Substitution Lines
CSSLs are usually applied to investigate the genetic behavior and effects of introgressed chromosome segments from the donor parent in the background of the recurrent parent. Their application to QTL mapping generally improves accuracy. CSSLs also provide permanent segregating populations for studying the multi-environmental stability of the mapped QTLs. In previous studies, a CSSL population was constructed using a widely planted upland cotton cultivar, CCRI36, which is characterized by high yield and early maturity, as the recurrent parent, and a sea island cotton cultivar, Hai1, which has good fiber quality and a high level of resistance to Verticillium wilt, as the donor parent [38,39,40,41]. In the current study, four introgression lines with excellent fiber quality derived from this CSSL population were used to construct the segregating populations of double-crossed F1, F2, and F2:3. The introgressed Hai1 chromosome segments were detected in individual plants of all three generations. The genotyping of each individual plant in the three generations was relatively clear. The recovery rate to the recurrent parent in all three generations was >95%. These introgression lines are ideal materials for further fine mapping, gene interaction analysis, heterosis mechanism study, and functional genomics.

QTL Cluster and Linkage Distribution
Cluster distribution of QTLs is a relatively common phenomenon [14,17,53,55,56,57,58]. Said et al. [59] comprehensively analyzed 2,134 previously reported QTLs in intra- and interspecific populations and found that numerous QTLs were distributed in clusters in certain chromosome regions in specific populations. In the present study, QTLs controlling different traits were also detected at the same SSR marker locus. QTLs for FL and FS were mapped to the neighboring regions of markers HAU1980b on Chr2, NAU3648, NAU5421 and HAU1980a on Chr14, and CGR5565a, NAU5013, and NAU3665 on Chr20. QTLs for FU and LP were mapped to the neighboring regions of marker NAU5013 (S4 Table). QTLs for FM (qFM-3-12, qFM-17-7, and qFM-17-8) were consistently detected in the neighboring regions of the molecular markers linked to the QTLs for LP (qLP-3-12, qLP-17-7, and qLP-17-8), suggesting that these major QTLs are closely linked to the same markers in the introgressed Hai1 segment, thereby providing an explanation for the significant correlations between the two traits in all three generations (Table 3). These results indicate that these loci might be pleiotropic or might be closely linked to other genes.
The chromosome segments of these QTL hotspot clusters could be useful for molecular breeding based on common molecular markers [59]. When a chromosome segment clusters favorable alleles of QTLs for both cotton fiber quality and yield traits, it could be used more readily for the simultaneous improvement of these traits. However, when a chromosome segment clusters favorable alleles for one trait with unfavorable alleles for a negatively correlated trait, it would be very difficult to improve these traits simultaneously. An in-depth study of this linkage mechanism and breaking the linkage between unfavorably linked genes would play a significant role in cotton molecular breeding.
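The co-location of FM and LP QTLs described in this section can be checked directly from the QTL names, because the nomenclature encodes the chromosome and the serial number of the linked marker. The following Python sketch groups QTLs that share both values. It is only an illustration of the bookkeeping, not part of the mapping analysis; the function name `colocated_qtls` is hypothetical, and names are assumed to follow the q-trait-chromosome-marker form.

```python
from collections import defaultdict


def colocated_qtls(names):
    """Group QTL names by (chromosome, marker index).

    Returns only the loci to which QTLs for two or more different
    traits were assigned.
    """
    traits_at_locus = defaultdict(set)
    for name in names:
        if not name.startswith("q"):
            raise ValueError(f"not a valid QTL name: {name!r}")
        trait, chromosome, marker = name[1:].split("-")
        traits_at_locus[(int(chromosome), int(marker))].add(trait)
    return {
        locus: sorted(traits)
        for locus, traits in traits_at_locus.items()
        if len(traits) > 1
    }


if __name__ == "__main__":
    # FM and LP QTLs reported at shared marker loci in this study.
    qtls = ["qFM-3-12", "qFM-17-7", "qFM-17-8",
            "qLP-3-12", "qLP-17-7", "qLP-17-8"]
    for locus, traits in sorted(colocated_qtls(qtls).items()):
        print(locus, traits)
```

Each of the three loci (Chr3 marker 12, Chr17 markers 7 and 8) is shared by an FM and an LP QTL, which is the pattern interpreted above as pleiotropy or close linkage.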

Sources of QTL Synergistic Genes
Among the QTLs for fiber quality traits mapped in the present study, the synergistic genes for 14 QTLs were from CCRI36, whereas the synergistic genes for 21 QTLs were from Hai1. Among the QTLs controlling cotton yield traits, the synergistic genes for 8 QTLs were from CCRI36, whereas the synergistic genes for 3 QTLs were from Hai1. These results suggest that fiber quality or yield QTLs are not necessarily derived from the superior parent and the parent with relatively poor traits can also contribute genes that favor fiber yield and quality. Our findings also indicate that introgression between upland cotton and sea island cotton may broaden genetic variations as well as increase the potential of favorable gene rearrangements.
Supporting Information S1

Author Contributions
Conceived and designed the experiments: YLY YZS.