Genetic and phenotypic effects of chromosome segments introgressed from Gossypium barbadense into Gossypium hirsutum

MBI9915 is an introgression cotton line with excellent fiber quality. It was obtained by advanced backcrossing and continuous inbreeding from an interspecific cross between the upland cotton (Gossypium hirsutum) cultivar CCRI36 as the recurrent parent and the sea island cotton (G. barbadense) cultivar Hai1 as the donor parent. To study the genetic effects of the introgressed chromosome segments in G. hirsutum, an F2 secondary segregating population of 1537 individuals was created by crossing MBI9915 and CCRI36, and an F2:3 population was created by randomly selecting 347 individuals from the F2 generation. Quantitative trait locus (QTL) mapping was performed, and interactions for fiber length and strength were identified, using IciMapping software. The genotype analysis showed that the recovery rate for MBI9915 was 97.9%, with a total of 6 heterozygous segments and 13 homozygous segments. A total of 18 QTLs for fiber quality and 6 QTLs for yield related traits were detected using the two segregating generations. These QTLs were distributed across 7 chromosomes and explained 0.81%–9.51% of the observed phenotypic variations. Six QTLs were consistently detected in two generations and 6 QTLs were identified in previous studies. A total of 13 pairs of interactions for fiber length and 13 pairs of interactions for fiber strength were identified in two generations. Among them, 3 pairs of interactions for fiber length and 3 pairs of interactions for fiber strength were identified in both generations; 4 pairs of interactions affected fiber length and fiber strength simultaneously. The results showed that 5 chromosome segments (Seg-5-1, Seg-7-1, Seg-8-1, Seg-20-2 and Seg-20-3) have important effects on fiber yield and quality. This study provides useful information for gene cloning and marker-assisted breeding for excellent fiber quality.


Introduction
Cotton is the most important natural fiber in the textile industry. With improved living standards and the development of the textile industry, higher quality cotton fiber is desirable. Upland cotton and sea island cotton are cultivated tetraploid varieties in the genus Gossypium [1]. Upland cotton (Gossypium hirsutum L.) has a high yield and wide adaptability, but relatively low fiber quality while sea island cotton (Gossypium barbadense L.) has low yield and limited adaptability but excellent fiber quality [2]. One approach for improving both cotton fiber quality and yield is by integrating the high yield genes of G. hirsutum and excellent fiber quality genes of G. barbadense through hybridization.
Cotton yield and quality are quantitative traits controlled by multiple genes and subject to environmental influences [3][4][5]. The use of molecular markers makes it easier for breeders to improve economic and agronomic traits of crops more rapidly and precisely [6]. Reinisch et al. [7] constructed the first cotton molecular genetic map of RFLP markers in 1994. Jiang et al. [8] detected three fiber strength quantitative trait loci (QTL) in an F2 generation established by a cross of G. barbadense and G. hirsutum. These explained 30.9% of the total phenotypic variation. Paterson et al. [9] detected 68 QTL related to fiber quality in multiple environments. Wu et al. [10] found 13 fiber quality trait QTL in an F2 population derived from hybridization of G. hirsutum Handan 208 and G. barbadense Pima 90. Jamshed et al. [11] identified 165 QTLs for fiber quality traits in a G. hirsutum recombinant inbred line. Of these, 47 QTL were stable across multiple environments. Wang et al. [12] detected 64 QTL for fiber quality and 70 QTL for yield components in a population of 178 recombinant inbred lines (RILs). Yang et al. [13] detected 44 fiber quality QTL on 17 chromosomes in BC1 and its derived BC1F2 lines. In spite of these QTL data, the complex genetic background of the study populations makes QTL results difficult to use for cultivar trait improvements.
Chromosome segment substitution lines (CSSLs), also known as introgression lines, are permanent populations that possess the same genetic background as the recurrent parent. Differences among CSSLs usually involve only one or a few of the introgressed chromosome segments, which effectively eliminates interference from the genetic background. Therefore, CSSLs are ideal materials for QTL fine-mapping, gene cloning, and study of QTL interactions [14]. Since Eshed and Zamir first constructed introgression lines of tomato [15], CSSLs have been successfully applied in rice [16], corn [17], and wheat [18]. However, CSSLs are less commonly used for cotton QTL mapping. Stelly et al. [19] constructed 17 chromosome substitution lines of G. barbadense in the TM-1 background of G. hirsutum. Luan et al. [20] detected 24 QTL associated with fiber yield and quality using two G. hirsutum introgression populations. Zhu et al. [21] detected 2 QTL for lint percent and a QTL for seed index in F2 and F2:3 populations derived from a cross between two introgressed lines. Wang et al. [22] identified six stable QTL associated with fiber quality using 174 introgression lines. Cao et al. [23] fine-mapped clustered QTL for fiber quality on chromosome 7 using a G. barbadense introgressed line. Wang et al. [24] detected 24 QTL for fiber quality and lint quantity based on three phenotypic datasets collected over 2 years in two locations.
To introgress the preferred fiber quality from G. barbadense into a commercial G. hirsutum variety, a high-density simple sequence repeat (SSR) genetic linkage map was developed from a BC1F1 population derived from an interspecific backcross between the highly resistant line Hai1 (G. barbadense) and CCRI36 (G. hirsutum) as the recurrent parent [25]. A total of 48 QTLs for verticillium wilt resistance were identified in BC1F1, BC1S1 and BC2F1 populations from the same parents [26]. A total of 20 QTL for yield traits and 33 QTL for fiber quality traits were detected using 303 chromosome segment substitution lines (BC5F2) [27]. Genetic effects and heterosis of yield and yield component traits were analyzed through hybridizing 10 chromosome segment substitution lines (CSSLs) each from two CSSL populations that produced 50 F1 hybrids according to North Carolina Design II [28]. Chromosome segment substitution lines MBI9804, MBI9855, MBI9752 and MBI9134 were used to construct a multiple parent population of (MBI9804×MBI9855)×(MBI9752×MBI9134). A total of 24 QTLs controlling fiber quality and 11 QTLs controlling yield traits were detected using the three segregating generations of double-crossed F1, F2 and F2:3 [29].
We focused on the genetic effects of the introgressed segments in the introgression line with excellent fiber quality, MBI9915, which was selected from the BC5F3:5 of an interspecific cross between CCRI36 (a cultivar of G. hirsutum) as the recurrent parent and Hai1 (a cultivar of G. barbadense) as the donor parent. An F2 secondary segregating population was constructed by crossing CCRI36 and MBI9915. An F2:3 generation was constituted by random selection of 347 individuals from the F2 population. QTL were mapped, and interactions for fiber length and fiber strength were identified, using SSR markers.

Materials and methods
Materials
The introgression line with excellent fiber quality, MBI9915, was selected from the BC5F3:5 of an interspecific cross of G. hirsutum CCRI36 (Chinese Cotton Research Institute 36) as the recipient parent and G. barbadense Hai1 as the donor parent [30]. The line was crossed with CCRI36 to produce an F2 secondary segregating generation of 1537 individuals, and the F2:3 generation was formed by randomly selecting 347 F2 individuals.
In 2013, the F2 generation was planted in 68 rows at the Institute of Cotton Research of Chinese Academy of Agricultural Sciences (Anyang, Henan Province). The parental lines and F1 were planted in two rows each. In 2014, a total of 347 F2 individual plants were randomly selected and grown as F2:3 plant rows, and CCRI36 was planted as the control in one row for every 20 experimental rows at the experimental farm (Anyang, Henan Province). In both years, rows were 5 m long and spaced 0.8 m apart, with about 20 plants per row.

Investigation of fiber yield and quality traits
In 2013, the phenotypic traits of individual plants were studied. Naturally opened bolls were harvested and evaluated for boll weight (BW), lint percentage (LP), fiber length (FL), fiber micronaire (FM), and fiber strength (FS). In 2014, 30 naturally opened bolls in each plot were harvested for evaluation as in 2013. The fiber quality traits were tested with an HFT9000 in the Cotton Quality Supervision and Testing Center of the Ministry of Agriculture of China. HVICC international calibration cotton samples were used.

DNA extraction and SSR molecular detection
Young leaves of the parental lines, F1, and F2 individual plants were sampled. DNA was extracted using a modified CTAB method [31]. SSR amplification and polyacrylamide gel electrophoresis were performed following the method of Zhang [32].
All the SSR markers in the genetic linkage map constructed by Shi [25] were used for screening of polymorphisms among the parental lines. A total of 41 markers were identified as polymorphic and used for genotyping the F2 individual plants. The sequences of the SSR primers were uploaded to the CMD database (http://www.cottonmarker.org/). The primers used in the present study were synthesized by Beijing Sun biotech Co., Ltd. (Beijing, China).

Data analysis
SAS 9.2 software was used for the descriptive statistical analysis and correlation analysis of fiber quality traits (including FL, FS and FM) and yield related traits (including BW and LP) for the F2 individual plants and the F2:3 family lines.
Genotypic analysis of the parental lines and population was performed based on the SSR polymorphic results using the GGT 2.0 software developed by van Berloo (http://www.plantbreeding.wur.nl/UK/software_ggt.html) [33].
QTL mapping and interaction analysis were performed using the QTL IciMapping V4.0 software developed by Wang et al. [34].

Fiber quality and yield analysis
Mean values of MBI9915 and the recurrent parent (CCRI36) are shown in Table 1. Compared to CCRI36, MBI9915 had longer fiber (> 30 mm), stronger fiber (> 33 cN·tex⁻¹) and lower micronaire. These results indicate that the MBI9915 introgression line had excellent fiber quality.
The fiber quality performance and the yield related traits for the F2 and F2:3 generations are presented in Table 2 and Fig 1. The transgressive rate of the F2 and F2:3 generations was 31.23%-93.94%. The coefficients of variation of fiber quality and yield traits for the F2 and F2:3 generations ranged from 3.48% to 12.64%, indicating that the traits segregated widely in the two generations. The absolute value of the skewness was < 1, indicating that the fiber quality and yield traits were approximately normally distributed in the two generations.
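The two descriptive statistics used above can be sketched as follows. This is a minimal illustration, not the SAS 9.2 procedure used in the study: the function names and the fiber-length values are hypothetical, and SAS may apply a small-sample correction to skewness that this moment-based version omits.

```python
import math

def coefficient_of_variation(values):
    """Sample standard deviation expressed as a percentage of the mean."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / (n - 1)
    return 100.0 * math.sqrt(variance) / mean

def skewness(values):
    """Moment-based skewness g1 = m3 / m2**1.5, without small-sample correction."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

# Hypothetical fiber-length values (mm), for illustration only.
fiber_length = [28.9, 29.4, 30.1, 30.6, 29.8, 31.2, 30.0, 29.5]

cv = coefficient_of_variation(fiber_length)
g1 = skewness(fiber_length)
# The criterion used in the text: |skewness| < 1 is read as approximately normal.
roughly_symmetric = abs(g1) < 1
```

A skewness check of this kind only screens for asymmetry; it does not by itself test normality.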

Traits correlation analysis
Correlation analysis results between fiber quality and yield related traits are presented in Table 3. The significant negative correlation between boll weight and lint percent indicates the difficulty in simultaneously improving the BW and LP. Among the fiber quality traits, fiber length and fiber strength were significantly positively correlated in the two generations, while significantly negatively correlated with fiber micronaire. This indicated that fiber quality traits are easier to simultaneously improve. Lint percent was positively correlated with fiber

Genotypic analysis of the parents
A total of 41 pairs of SSR markers were identified as polymorphic between the two parents. Introgressed Hai1 chromosome segments in MBI9915 were identified by SSR markers using GGT2.0 software (Fig 2).
Lint percent: A total of 4 QTLs for LP were identified. One was mapped on chromosome 5 and three were mapped on chromosome 20, explaining 1.16%~8.91% of the observed phenotypic variation. The negative additive effect for qLP-5-1 and qLP-20-1 indicated that the CCRI36 alleles increased the lint percent. qLP-20-2 and qLP-20-3 had the opposite additive effect.
Fiber length: A total of 5 QTLs controlling FL were detected on five chromosomes (Chr5, Chr7, Chr16, Chr20 and Chr22), explaining 0.81%~5.13% of the observed phenotypic variations. Among them, qFL-5-1 and qFL-7-1 were identified in two generations. qFL-5-1, linked to TMB1296, explained 5.13% and 3.22% of the observed phenotypic variations in the F2 and F2:3 generations, respectively, with an additive effect of 0.37 mm in both generations. qFL-7-1, linked to NAU1085, explained 0.81% and 3.24% of the observed phenotypic variations in the F2 and F2:3 generations, respectively, with additive effects of 0.13 mm and 0.26 mm, respectively. The positive additive effect for all five QTLs indicated that Hai1 alleles increased fiber length.
Fiber strength: A total of 9 QTLs controlling FS were detected on four chromosomes (Chr5, Chr7, Chr8 and Chr20), explaining 1.26%~9.13% of the observed phenotypic variations. Three chromosomes (Chr5, Chr7 and Chr8) had only one QTL each. A total of 6 QTL, of which two (qFS-20-1 and qFS-20-3) were detected in two generations, were mapped on chromosome 20. qFS-20-1, linked to Gh119, explained 9.13% and 1.69% of the observed phenotypic variations, with additive effects of 0.92 and 0.44 in F2 and F2:3, respectively. qFS-20-3, linked to TMB1125, explained 4.75% and 1.66% of the observed phenotypic variations, with additive effects of 0.06 and 0.52 in F2 and F2:3, respectively. The positive additive effect of all 9 QTLs indicated that Hai1 alleles increased fiber strength.
Micronaire: A total of 4 QTLs controlling FM were detected on four chromosomes (Chr5, Chr8, Chr20 and Chr24), explaining 3.26%~6.37% of the observed phenotypic variations. Among them, qFM-8-1, linked to NAU1209, explained 6.37% and 3.47% of the observed phenotypic variations, with additive effects of -0.13 and -0.08 in F2 and F2:3, respectively. Except for the positive additive effect of qFM-20-1, the negative additive effect for all the other QTL indicated that Hai1 alleles decreased fiber micronaire.
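The sign convention used in these results (a positive additive effect means the Hai1 allele raises the trait value, a negative one means the CCRI36 allele does) can be sketched in a few lines. The additive effects below are the ones quoted in the text; the function and variable names are hypothetical and are not part of IciMapping.

```python
# (QTL, generation) -> additive effect, copied from the values quoted in the text.
ADDITIVE_EFFECTS = {
    ("qFL-7-1", "F2"): 0.13,
    ("qFL-7-1", "F2:3"): 0.26,
    ("qFS-20-1", "F2"): 0.92,
    ("qFS-20-1", "F2:3"): 0.44,
    ("qFM-8-1", "F2"): -0.13,
    ("qFM-8-1", "F2:3"): -0.08,
}

def increasing_parent(additive_effect):
    """Return the parent whose allele raises the trait value, or None if the effect is zero."""
    if additive_effect > 0:
        return "Hai1"
    if additive_effect < 0:
        return "CCRI36"
    return None

def consistent_direction(qtl):
    """True if the same parent's allele raises the trait in every generation recorded."""
    effects = [a for (name, _), a in ADDITIVE_EFFECTS.items() if name == qtl]
    return len({increasing_parent(a) for a in effects}) == 1

fm_parent = increasing_parent(ADDITIVE_EFFECTS[("qFM-8-1", "F2")])
```

For micronaire, where a lower value is desirable, "CCRI36" as the increasing parent corresponds to the Hai1 allele being the favorable one.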

Interactions between fiber length and fiber strength
Interactions for fiber length and fiber strength in the F2 and F2:3 populations were identified using IciMapping software. Introgression segments were named as Seg + chromosome number + serial number of the segment. For example, Seg-20-3 represents the third segment on chromosome 20 (Chr20).
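The segment naming scheme just described is simple enough to handle programmatically. The sketch below is illustrative only; the `Segment` type and both helper functions are hypothetical names introduced here.

```python
from typing import NamedTuple

class Segment(NamedTuple):
    chromosome: int  # chromosome number, e.g. 20 for Chr20
    serial: int      # serial number of the segment on that chromosome

def parse_segment(name: str) -> Segment:
    """Split a name such as 'Seg-20-3' into chromosome and serial number."""
    prefix, chromosome, serial = name.split("-")
    if prefix != "Seg":
        raise ValueError(f"not a segment name: {name!r}")
    return Segment(int(chromosome), int(serial))

def format_segment(segment: Segment) -> str:
    """Rebuild the name from its parts."""
    return f"Seg-{segment.chromosome}-{segment.serial}"

# The five segments highlighted in the abstract, grouped by chromosome.
names = ["Seg-5-1", "Seg-7-1", "Seg-8-1", "Seg-20-2", "Seg-20-3"]
on_chr20 = [n for n in names if parse_segment(n).chromosome == 20]
```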
A total of 13 pairs of interactions for fiber strength were identified in two generations, explaining 0.55%-7.60% of the observed phenotypic variations (Fig 4, S1 Table). Among these, 3 pairs of interactions (Seg-5-1 with Seg-20-2, Seg-7-1 with Seg-8-1, Seg-7-1 with Seg-15-1) for fiber strength were identified in both generations. The main positive effects of the interaction between Seg-5-1 and Seg-20-2 were Add by Add and Dom by Add, which indicated that the interaction increased fiber strength. The Add by Add effect of Seg-7-1 with Seg-8-1 was 0.30 and 0.47 in the F2 and F2:3 populations, respectively, which indicated that the interaction increased fiber strength. The Add by Add effect of Seg-7-1 with Seg-15-1 was -0.17 and -0.37 in the F2 and F2:3 populations, respectively, which indicated that the interaction decreased fiber strength. The regions around 17.86 cM and 22.86 cM on Seg-8-1 had opposite Add by Add interaction effects with Seg-20-1. This result indicated that linkage drag on Seg-8-1 should be broken.
A total of 13 pairs of interactions for fiber length were identified in two generations, explaining 0.38%-5.64% of the observed phenotypic variations (Fig 4, S2 Table). Among these, 3 pairs of interaction (Seg-7-1 with Seg

Selection of parental materials
Cotton fiber yield and quality are quantitative traits controlled by multiple genes, which are vulnerable to environmental influences [3][4][5]. The development of molecular markers provides crop breeders with a rapid and precise alternative approach for improving economic and agronomic traits [6]. Chromosome segment substitution lines (CSSLs) are permanent populations that possess the same genetic background as the recurrent parent, which effectively eliminates the interference of the genetic background. Therefore, CSSLs are ideal materials for QTL fine-mapping, gene cloning, and investigating QTL interactions [13].
MBI9915 is an introgression line with excellent fiber quality. It was obtained by advanced backcrossing and continuous inbreeding from an initial interspecific cross between the G. hirsutum cultivar CCRI36, as the recurrent parent, and the G. barbadense cultivar Hai1, as the donor parent. The background recovery rate for MBI9915 was 97.30%, with 19 Hai1 introgression segments distributed on 15 chromosomes and covering a total of 105.5 cM, which effectively reduces the interference of the genetic background. MBI9915 is a very important research material, with fiber length of more than 30 mm and fiber strength of more than 33 cN·tex⁻¹ in a three-year continuous evaluation. Research on the genetic effects of the introgressed segments provides a basis for fine-mapping and cloning fiber quality QTLs.
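As a rough point of comparison for the recovery rate reported above, the textbook expectation for the recurrent-parent genome proportion after n backcrosses is 1 - (1/2)^(n+1). This assumes no selection and ignores linkage, and subsequent selfing does not change the expected proportion. The sketch below is not the marker-based GGT calculation used in the study, and the function name is hypothetical.

```python
def expected_recovery(backcrosses: int) -> float:
    """Expected recurrent-parent genome proportion after the given number of backcrosses.

    Assumes no selection and unlinked loci: the F1 carries 1/2 of the
    recurrent genome and each backcross halves the remaining donor share.
    """
    if backcrosses < 0:
        raise ValueError("number of backcrosses must be non-negative")
    return 1.0 - 0.5 ** (backcrosses + 1)

# MBI9915 was selected from a BC5 lineage, so five backcrosses are assumed here.
bc5_expected = expected_recovery(5)   # 1 - 1/64
bc5_percent = 100.0 * bc5_expected
```

Under these assumptions the expectation for a BC5-derived line is about 98.4%, so a marker-based estimate slightly below that is plausible for a line retaining several donor segments; the comparison is indicative only, since the observed value depends on marker coverage and on selection during line development.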
Fiber length: qFL-7-1 linked to NAU1085 explained 0.81% and 3.24% of the observed phenotypic variations in the F 2 and F 2:3 generations, with the additive effect of 0.13 and 0.26 in two generations. This QTL was reported by Ma [35] and Guo [36]. qFL-20-1 linked to NAU3665 was detected with the additive effect of 0.24, which was reported by Zhai [29]. qFL-22-1 was detected linked to NAU2026 with the additive effect of 0.27. This was also reported by Liang [26]. The result indicated that these QTL were stable between generations and in different environmental conditions.
Fiber strength: qFS-7-1 was detected on chromosome 7 with the common marker of NAU1085 reported by He [37]. qFS-8-1 was identified linked to the common marker of NAU1209 and qFS-20-2 linked to NAU3813b as reported by Zhai et al. [29].
It is difficult to detect stable QTL in different generations, different backgrounds and different environments because the quantitative traits are usually susceptible to environmental effects. Therefore, these stable QTL could play an important role in improving fiber quality and yield traits.

Genetic effect and interaction of the introgressed chromosome segments
Clustered distribution of QTL is a relatively common phenomenon [38][39][40][41]. Said et al. [42] analyzed 2,134 previously reported QTL in intra- and inter-species populations and found numerous QTL that were distributed in clusters within defined chromosome regions in the specific populations. We found some clusters on the introgressed segments of the cotton genomes. Five QTL (qFL-5-1, qFM-5-1, qFS-5-1, qBW-5-1 and qLP-5-1) were distributed in a cluster linked to TMB1296 in Seg-5-1. Two QTL (qFL-7-1 and qFS-7-1) were linked to NAU1085 on Seg-7-1. Two QTL (qFM-8-1 and qFS-8-1) were linked to NAU1209 on Seg-8-1. Eleven QTL (3 for lint percentage, 5 for fiber strength, and 1 each for fiber length, boll weight and micronaire) were distributed in clusters in the region from 152.25 cM to 194 cM on Chr20. Lacape et al. [43] suggested that these clustered QTL may belong to the same genetic factor group contributing to the complex network of fiber development and affecting multiple fiber quality traits. Zhang et al. [44] reported that improvement of the auxin expression level in the epidermis of the cotton ovule in the initial stage of fiber development could increase lint percentage and decrease fiber micronaire. The cluster on Seg-7-1 was identified by Guo [36]. The chromosome segments with these QTL hotspots or clusters could be useful for molecular breeding based on common molecular markers [42]. When the favorable alleles of the QTLs for cotton fiber quality and yield traits are clustered in the same chromosome segments, they could be more easily used for simultaneous improvement of traits.
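One simple way to surface clusters of this kind is to group QTL by the marker they are linked to. The sketch below uses only QTL-marker pairs stated in the text; the function name and the `min_size` threshold are assumptions made for illustration, and a real analysis would cluster on map positions and confidence intervals rather than on a shared marker name.

```python
from collections import defaultdict

# QTL -> linked marker, as stated in the text.
QTL_MARKERS = [
    ("qFL-5-1", "TMB1296"), ("qFM-5-1", "TMB1296"), ("qFS-5-1", "TMB1296"),
    ("qBW-5-1", "TMB1296"), ("qLP-5-1", "TMB1296"),
    ("qFL-7-1", "NAU1085"), ("qFS-7-1", "NAU1085"),
    ("qFM-8-1", "NAU1209"), ("qFS-8-1", "NAU1209"),
    ("qFL-22-1", "NAU2026"),
]

def clusters_by_marker(pairs, min_size=2):
    """Group QTL sharing a linked marker; keep groups with at least min_size members."""
    groups = defaultdict(list)
    for qtl, marker in pairs:
        groups[marker].append(qtl)
    return {m: qtls for m, qtls in groups.items() if len(qtls) >= min_size}

clusters = clusters_by_marker(QTL_MARKERS)
# Traits represented in each cluster, read from the QTL name (q + trait + chromosome + serial).
traits = {m: sorted({q[1:3] for q in qtls}) for m, qtls in clusters.items()}
```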
In addition to QTL, interactions are an important genetic basis for improvement of cotton yield and fiber quality traits [45,46]. The interactions can be divided into three types (QTL with QTL, QTL with non-QTL, non-QTL with non-QTL) [47]. Therefore, the study of cotton molecular markers should be extended to non-mendelian factors [46]. A total of 4 pairs of interaction (Seg-5-1 with area around 165 cM on Seg-20-2, Seg-7-1 with Seg-8-1, Seg-7-1 with Seg-15-1, Seg-15-2 with Seg-20-2) were found to affect fiber length and fiber strength simultaneously. These interactions could further explain the correlation between fiber length and fiber strength, and contribute to the simultaneous improvement of fiber length and fiber strength.
Seg-20-2 was between NAU3813b and NAU4928 on chromosome 20. A total of 2 QTL clusters were detected on this segment. qLP-20-1 and qFS-20-2 were linked to NAU3813b (152. These results show that fiber length and fiber strength both have a complex genetic basis that involves numerous interactions. Therefore, it is important to consider genetic interaction effects for fiber yield and quality in future research.
Chromosome segment substitution lines (CSSLs) are useful for the precise mapping of quantitative trait loci (QTLs) and dissection of the genetic basis of complex traits. Li et al. [48] confirmed a major QTL (qGR2) on chromosome 2 by using a CSSL-derived F2 population and delimited it to a 10.4-kb interval containing three putative candidate genes, of which only OsMADS29 was preferentially expressed in the seed. Functional analysis using CSSLs with HI6 indicated that HI6 reduced the size of the lower parts of the plant, which is not important for production, while maintaining the size of the other organs related to production (e.g., flag leaf and panicle), resulting in improved nitrogen use efficiency [49]. Li et al. [50] precisely mapped qRBSDV-6MH between the markers S18 and S23, a physical distance of 627.6 kb on the Nipponbare reference genome, using a set of chromosome segment substitution lines. Liu et al. [51] mapped SPP1 to a 2.2-cM interval between RM1195 and RM490 using a random NIL-F2 population of 210 individuals, which explained 51.1% of SPP variation. Four newly developed InDel markers were then used for high-resolution mapping of SPP1 with a large NIL-F2 population. Finally, the locus was narrowed down to a bacterial artificial chromosome clone spanning 107 kb, in which 17 open reading frames were identified [51]. Liu et al. [52] mapped a major QTL influencing four fiber quality traits to a 0.28-cM interval and identified three candidate genes by RNA-Seq and RT-PCR analysis. Fang et al. [53] mapped qFS07.1 to a 62.6-kb genome region containing four annotated genes on chromosome A07 of G. hirsutum.
This study showed that 5 chromosome segments (Seg-5-1, Seg-7-1, Seg-8-1, Seg-20-2 and Seg-20-3) have important effects on fiber yield and quality. The results provide useful information for fine-mapping, gene cloning and marker-assisted breeding for excellent fiber quality.