Partial Dominance, Overdominance and Epistasis as the Genetic Basis of Heterosis in Upland Cotton (Gossypium hirsutum L.)

Determining the genetic basis of heterosis may promote hybrid production in Upland cotton (Gossypium hirsutum L.). This study was designed to explore the genetic mechanism of heterosis for yield and yield components in F2:3 and F2:4 populations derived from the hybrid ‘Xinza No. 1’. Replicated yield field trials of the progenies were conducted in 2008 and 2009. Phenotypic data analyses indicated overdominance in F1 for yield and yield components. Additive and dominance effects at the single-locus level and digenic epistatic interactions at the two-locus level were analyzed using 421 marker loci spanning 3814 cM of the genome. A total of 38 and 49 QTLs controlling yield and yield components were identified in the F2:3 and F2:4 populations, respectively. Analyses of these QTLs indicated that partial dominance and overdominance contributed simultaneously to heterosis in Upland cotton. Most of the QTLs showed partial dominance, whereas 13 QTLs showed overdominance in the F2:3 population and 19 QTLs showed overdominance in the F2:4 population. Among them, 21 QTLs were common to both the F2:3 and F2:4 populations. A large number of two-locus interactions for yield and yield components were detected in both generations. AA (additive × additive) epistasis accounted for the majority of epistatic effects. Thirty-three complementary two-locus homozygotes (11/22 and 22/11) were the best genotypes for AA interactions in terms of bolls per plant. The double homozygotes 11/22, 22/11 and 22/22 performed best for AD/DA interactions, while the genotype 11/12 performed best for DD interactions. These results indicated that (1) partial dominance and overdominance effects at the single-locus level and (2) epistasis at the two-locus level explained the genetic basis of heterosis in Upland cotton.


Introduction
Cotton (Gossypium spp.) is the most important fiber crop in the world. A number of experiments have shown significant heterosis in Upland cotton for yield and yield components [1][2][3]. Hybrids with acceptable heterosis have been released for cotton production in both India and China.
Rapid advances in high-density molecular genetic linkage maps have enabled a fine dissection of quantitative traits into the genetic effects of individual Mendelian loci [4]. To understand the genome of cotton and detect quantitative trait loci (QTLs) for lint yield and fiber quality, the first genetic linkage map was constructed in cotton [5]. Since then, a variety of genetic linkage maps with higher density have been constructed in different mapping populations derived from either interspecific tetraploids or cultivated allotetraploid cotton [6][7][8][9][10][11][12]. Because genetic polymorphism in intraspecific Upland cotton was lower than that in interspecific tetraploids, the genome coverage of genetic linkage maps in intraspecific tetraploids was relatively low [13]. An intraspecific genetic linkage map comprising 604 loci and covering 3141 cM was recently constructed in Upland cotton [14]. Other Upland cotton intraspecific linkage maps were also reported, such as a linkage map with 471 loci and coverage of 3070 cM [15], and another linkage map with 978 loci and coverage of 4,184 cM [16]. More recently, a genetic map containing 421 SSR loci with a span of 3814.3 cM was constructed using an F2 population in Upland cotton [17]. Moreover, the most marker-rich intraspecific linkage map, which contained 1540 loci and spanned 2,842.06 cM, was constructed using a recombinant inbred line (RIL) population derived from a cross between the Upland cotton cultivars/lines 'Yumian 1' and '7235' [18]. QTL mapping of yield and fiber quality traits based on high-density genetic linkage maps of intraspecific tetraploids can be very useful in revealing the genetic basis of lint yield and fiber properties. Recently, Liu et al. (2012) identified a total of 225 QTLs controlling yield and its components in the XZM2-derived RIL and IF2 populations; of a total of 111 non-redundant QTLs, 23 were detected in both populations simultaneously [19].
A total of 20 QTLs for fiber quality traits were mapped using 177 RILs derived from 'Xinza No. 1', a cross between 'GX1135' and 'GX100-2' [20]. Sequences of the D5 genome of G. raimondii, the A2 genome of G. arboreum and the genome of cultivated Upland cotton (Gossypium hirsutum TM-1) recently became publicly available [21][22][23][24][25]. These physical maps provide new opportunities to develop rich SSR and functional markers for the construction of high-density genetic maps and to further dissect the genetic basis of complex quantitative traits as well as heterosis in Upland cotton.
The utilization of heterosis has brought significant economic benefits in crops during the last century. However, the mechanism of heterosis remains enigmatic [26]. Different hypotheses, including the dominance hypothesis [27][28][29], the overdominance hypothesis [30][31] and the epistasis hypothesis [32][33][34], were proposed to explain the mechanisms of heterosis. Epistasis is a phenomenon in which the effect of one gene is modified by one or several other genes. It can be contrasted with dominance, which is an interaction between alleles at the same gene locus. The dominance hypothesis attributes the superiority of hybrids to the suppression of undesirable (deleterious) recessive alleles from one parent by dominant alleles from the other parent. The overdominance hypothesis states that some combinations of alleles are especially advantageous when paired in a heterozygous individual. Recent research using quantitative molecular tools provided new evidence for the different hypotheses in dissecting heterosis [26][35][36][37][38]. In two rice BC1F7 populations between recombinant inbred lines (RILs) and their parents, the heterozygotes were superior to the respective homozygotes at most QTLs, which supported the hypothesis that dominance was the genetic basis of heterosis in hybrid rice [39]. Stuber et al. [40] studied heterosis in the elite maize hybrid 'B73 × Mo17' and concluded that overdominance (or pseudo-overdominance), i.e., a higher yield in the heterozygote than in either homozygote, was the major cause of heterosis for grain yield. The hypothesis of epistasis as the genetic basis of heterosis was also supported by a series of studies [41][42][43]. The fact that a large number of digenic interactions for yield and yield components were detected in an F2:3 population provided evidence for epistasis as the primary genetic basis of heterosis in rice [43]. Hua et al. [41][42] and Zhou et al. [44] further verified that heterozygotes were not necessarily advantageous for phenotypic traits and that epistasis was an important genetic basis of heterosis in the elite rice hybrid 'Shanyou 63'. Multiple genetic mechanisms have also been found to play roles in heterosis. In rapeseed, epistasis as well as all levels of dominance from partial dominance to overdominance were responsible for the expression of heterosis [45]. Another study in the hybrid maize 'Yuyu No. 22', based on genetic dissection of an "immortalized F2" population, suggested that the genetic basis of grain yield heterosis relied on the cumulative effects of dominance, overdominance, and epistasis [46].
As with the genetic mechanisms of heterosis, reports on the genetic basis of inbreeding depression were inconsistent due to the diverse materials used in different experiments [47][48][49][50][51][52][53][54]. Wright [51] suggested that recessive or partially recessive deleterious effects of alleles were the most important cause of inbreeding depression. However, overdominance and epistasis were suggested as the primary genetic basis of inbreeding depression in rice [48][49]. The dominance effect was clearly a cause of inbreeding depression for yield, while dominance and epistasis effects were the genetic basis of heterosis in soybean [53]. To understand the genetic basis of inbreeding depression in rice, Li and Luo et al. [48][49] reported hybrid breakdown (part of inbreeding depression) in an intersubspecific F4 population and in recombinant inbred lines and their two BC and two testcross hybrid populations derived from crosses between the RILs and their parents; these results implied epistasis as a genetic basis of inbreeding depression. Li et al. [54] reported that hybrid breakdown (part of inbreeding depression) in an intersubspecific F4 population was largely due to additive epistatic loci, which implies epistasis as a genetic basis of heterosis.
In the present research, the main objective was to explore the genetic basis of heterosis using populations of the F2:3 and F2:4 generations in Upland cotton by dissecting QTL effects at both the single- and two-locus levels. These results are expected to improve our understanding of the genetic basis of heterosis in cotton, which will promote cotton hybrid production for lint yield.

Ethics Statement
University (Beijing) in October 2007. All the F2 seedlings were transported by air and transplanted at the Hainan South Propagation Station (Sanya, Hainan Province) to produce F3 seeds by self-pollination. The F3 family lines were bulk self-pollinated to produce F4 seeds. A population of 173 F2:3 family lines was planted with their parents and the single F1 individual as controls in 2008, and a population of 173 F2:4 family lines was planted with their parents and F2 plants as controls in 2009.

Field Planting and Traits Examination
Field trials of the F2:3 and F2:4 populations were conducted at two locations, Quzhou Experimental Station of China Agricultural University at Handan City, Hebei Province (36°78'N, 114°92'E) and Hejian Guoxin Cotton Breeding Experiment Station at Cangzhou City, Hebei Province (38°43'N, 116°09'E), during 2008 and 2009. The two parents GX1135 (P1) and GX100-2 (P2) and their F1 hybrid, 'Xinza No. 1', were also included as controls in the field tests. Field experiments followed a randomized complete block design with two replications at each location. Plants were grown in two-row plots with 14 plants each. Plots were 4 m in length with 80 cm row spacing at Handan and 4 m in length with 60 cm row spacing at Cangzhou. Plants were spaced 30 cm apart within rows. Field management followed conventional standard field practices. Seedlings approximately 25 days old were transplanted to the fields in 2008. The seeds of the F2:4 population were sown directly in 2009. Bolls from seven and five consecutive plants in the middle of each plot were sampled at Handan and Cangzhou, respectively. Boll samples were ginned to determine seed-cotton yield (SY, t/ha), lint yield (LY, t/ha), bolls per plant (BNP), boll weight (BW, g) and lint percentage (LP, %). Traits examined included yield per plant, measured as the seed cotton weight of all bolls of the individual. The remaining bolls in each plot were collected for yield estimation. Mid-parent heterosis (MPH) was calculated as MPH = (F1 - M)/M × 100, where M = (P1 + P2)/2 is the mean yield of both parents and P1 and P2 are the maternal and paternal yields, respectively. Hybrid breakdown (part of inbreeding depression) in an intersubspecific F4 population was analyzed by the method provided previously by Li et al. [54].
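The mid-parent heterosis formula above can be written as a short function. The following is only a minimal sketch: the function name and the example yields are hypothetical and are not data from this study.

```python
def mid_parent_heterosis(f1: float, p1: float, p2: float) -> float:
    """Return mid-parent heterosis (MPH) in percent.

    MPH = (F1 - M) / M * 100, with M = (P1 + P2) / 2.
    """
    mid_parent = (p1 + p2) / 2.0
    if mid_parent == 0:
        raise ValueError("mid-parent value must be non-zero")
    return (f1 - mid_parent) / mid_parent * 100.0


# Hypothetical yields in t/ha, chosen only to illustrate the arithmetic.
example_mph = mid_parent_heterosis(f1=3.0, p1=1.8, p2=2.2)
```

With these illustrative values the mid-parent mean is 2.0 t/ha, so the hybrid exceeds it by 50%.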

DNA Isolation and Genotype Analysis
Young leaves were collected from labeled F2, P1, P2, and F1 individuals, frozen in liquid nitrogen, and stored at -80°C. DNA was extracted from each individual as described by Paterson et al. [58]. A total of 16,405 SSR primer pairs were used to screen for polymorphic markers between the parents. Among these, the sequences of 13,468 primer pairs, including BNL, NAU, TM, JESPER, CIR, HAU, CM, MUSS, MUSB and MUCS primers, were downloaded from the Cotton Microsatellite Database (CMD, http://www.cottonmarker.org). Detailed information on these primers is described in the literature [6-9, 14, 59]. The remaining 2,937 primer pairs were designed and developed from a DNA sequence library [60]. The polymorphic markers identified were used to genotype individuals of the F2 population.

Map Construction and QTL Analysis
MAPMAKER 3.0 [61] was employed to construct a genetic linkage map. Linkage groups were assigned to chromosomes based on previously chromosome-anchored SSR markers [6, 11-12, 18, 62]. When no chromosome reference was available, the linkage group was described as "un××", with "××" referring to its serial number. QTLs were detected by composite interval mapping as described by Zeng [63] using WinQTL Cartographer 2.5 [64]. A stringent LOD threshold of 3.0 was used to declare suggestive QTL [65], whereas the same QTL detected in another environment with a LOD of at least 2.0 was considered to be a common QTL, as described by Shen et al. [66]. Graphic representations of the linkage groups and the QTLs marked on them were created with Map Chart 2.2 as described by Voorrips [67].
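The two LOD thresholds described above can be illustrated as a simple decision rule. This is a sketch under stated assumptions, not the procedure implemented in WinQTL Cartographer: the function name, the dictionary layout and the environment labels are hypothetical, and the peaks passed in are assumed to have already been matched as the same QTL across environments.

```python
from typing import Dict

DECLARE_LOD = 3.0  # threshold used in the text to declare a QTL
CONFIRM_LOD = 2.0  # threshold in another environment for a common QTL


def classify_qtl(lod_by_env: Dict[str, float]) -> str:
    """Classify one QTL from its peak LOD score in each environment."""
    declared = [env for env, lod in lod_by_env.items() if lod >= DECLARE_LOD]
    if not declared:
        return "not declared"
    # Environments reaching the lower threshold include the declaring one,
    # so two or more entries mean support from at least one other environment.
    supported = [env for env, lod in lod_by_env.items() if lod >= CONFIRM_LOD]
    return "common" if len(supported) >= 2 else "declared"


# Hypothetical LOD scores for illustration only.
status_a = classify_qtl({"Handan": 3.4, "Cangzhou": 2.1})
status_b = classify_qtl({"Handan": 3.4, "Cangzhou": 1.2})
status_c = classify_qtl({"Handan": 2.5, "Cangzhou": 2.5})
```

In this sketch a QTL that never reaches LOD 3.0 is not declared, even if it exceeds 2.0 in several environments.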

Analysis, Confirmation of Digenic Interaction and Components Partition
Analysis of digenic interactions and further confirmation followed the methods described by Yu et al. [43] and Hua et al. [42]. The detected epistasis was confirmed by a randomization test conducted to identify those interactions more likely to be 'really' significant. In doing so, the entry order of the trait data in the analysis was randomly permuted and the F-statistic values were recalculated for the digenic interactions using the same marker data. This procedure was repeated 1000 times, and the resulting 1000 F-values were compared with the F-statistic value from the original data. If no more than one F-value from the random permutations was larger than the F-statistic value from the original data, the digenic interaction was regarded as significant [41]. Significant interactions were further partitioned into three components, each specified by a single degree of freedom: AA (additive × additive), AD/DA (additive × dominance or dominance × additive), and DD (dominance × dominance). The significance of each term was assessed in an orthogonal contrast test using the statistical package STATISTICA [68].
The measurements of yield and yield components, including boll number per plant (BNP), boll weight (BW), and lint percentage (LP), were made for hybrids and their parents in the F2:3 and F2:4 populations (S1 File). Data are listed in Table 1 and illustrated in S1 Fig. Transgressive segregation in both directions was observed for all yield traits in the F2:3 and F2:4 populations. Seed cotton yield (SY) and lint yield (LY) of the F2:4 population grown at Handan were much lower than those of the same population grown at Cangzhou, probably due to drought conditions at Handan, which caused early aging during the experiments. Lint yield was significantly correlated with all other yield traits, and seed-cotton yield was significantly correlated with all the traits except lint percent. The correlation coefficients between different traits varied greatly, from -0.23 to 0.97. The highest correlations were detected for bolls per plant vs. lint yield, 0.55 and 0.56 at Handan and Cangzhou, respectively, in the F2:3 populations (Table 2). Significant heterosis for yield was detected in the hybrid 'Xinza No. 1', with 56% and 62% mid-parent heterosis (MPH) for seed cotton yield and lint yield, respectively. In the F2 population, MPH values were 38.1% and 43.3% for seed yield and lint yield, respectively.

Linkage Map and QTL Mapping for Yield and Yield Components
A total of 450 polymorphic markers, screened from 16,405 SSR primer pairs between the parents, were used to construct a linkage map (S2 File). The linkage map was obtained with 421 loci linked into 49 groups, leaving 29 loci unlinked. The map spanned 3814 cM with an average distance of 8.9 cM between adjacent markers, accounting for 73% of the entire tetraploid genome. Forty-eight of the 49 linkage groups were assigned to 26 chromosomes. The remaining linkage group could not be associated with any chromosome and was tentatively named 'Un 1'.
A total of 67 QTLs controlling yield and yield components were identified in the F2:3 and F2:4 generations using composite interval mapping. The data are given in Table 3. There were 38 QTLs identified in the F2:3 generation, among which seven were major QTLs detected in both environments. There were 49 QTLs identified in the F2:4 generation, among which six were major QTLs detected in both environments.
For seed cotton yield, a total of 12 QTLs were identified in the two generations, of which three QTLs were detected simultaneously in both generations. Seven QTLs were identified in the F2:3 generation, accounting for 5.2% to 12.7% of total variance, and eight QTLs were identified in the F2:4 generation, accounting for 3.6% to 23.1% of total variance. Among them, two QTLs, qSY-chr26-1 and qSY-chr14-1, were identified at both the Handan and Cangzhou locations in the F2:3 generation. Three QTLs, qSY-chr11-1, qSY-chr12-1 and qSY-chr14-1, were detected in both the F2:3 and F2:4 generations. Six QTLs identified in the F2:3 generation showed negative additive effects (Table 3), and the alleles from the female parent GX1135 increased phenotypic variation (S2 Fig). Three QTLs in the F2:3 generation showed positive additive effects, and the alleles from GX100-2 increased phenotypic variation. In the F2:3 and F2:4 generations, eight QTLs showed negative dominance effects and the remaining nine QTLs showed positive dominance effects.
For lint yield, a total of 10 QTLs were identified in the two generations, among which three QTLs were detected simultaneously in both generations. Five and eight QTLs were identified in the F2:3 and F2:4 generations, respectively, accounting for 4.8% to 12.6% and 4.7% to 16.2% of total phenotypic variance in the two respective generations. Four of these QTLs were identified simultaneously in more than two environments. Two QTLs, qLY-chr26-1 for lint yield and qSY-chr26-1 for seed cotton yield, were mapped to the same interval on chromosome 26 in the same environments (S2 Fig).
For bolls per plant, 14 QTLs were found in the two generations, among which three QTLs were detected simultaneously in both generations. Eight and nine QTLs were identified in the F2:3 and F2:4 generations, respectively, accounting for 4.8% to 12.2% and 4.5% to 20.2% of total phenotypic variance in the two respective generations. Four of them were detected simultaneously in more than two environments. Five and six QTLs with negative additive effects were detected in the F2:3 and F2:4 generations, respectively. The alleles from the female parent GX1135 increased phenotypic variation of lint yield. The remaining eight QTLs showed positive additive effects in the two generations; for these QTLs, the alleles from GX100-2 increased phenotypic variation. Six and four QTLs with negative dominance effects were identified in the F2:3 and F2:4 generations, respectively, suggesting that for these QTLs heterozygotes would have lower phenotypic values than the mean of the two homozygotes. The remaining nine QTLs showed positive dominance effects, indicating that heterozygotes would have higher phenotypic values than the mean of the two respective homozygotes.
A total of 15 QTLs for boll weight were detected in the two generations, among which five QTLs were detected simultaneously in both generations. Eight and 12 QTLs were identified in the F2:3 and F2:4 generations, respectively, explaining 5.9% to 9.7% and 3.8% to 22.1% of phenotypic variance in the two respective generations. Two QTLs (qBW-chr5-1 and qBW-chr5-2) were detected simultaneously in the F2:3 and F2:4 generations with negative additive effects. For these QTLs, the alleles from the female parent GX1135 increased phenotypic variation. Seven and eight QTLs showed negative dominance effects in the F2:3 and F2:4 generations, respectively, and three and eight QTLs, respectively, showed positive dominance effects.
For lint percent, a total of 15 QTLs were identified in the two generations, among which seven QTLs were observed simultaneously in both generations. Ten and 12 QTLs were identified in the F2:3 and F2:4 generations, respectively, explaining 5.4% to 14.3% and 4.8% to 16.2% of phenotypic variance in the two respective generations. Among them, seven QTLs were detected in more than two environments. Four and five QTLs in the F2:3 and F2:4 generations, respectively, had negative additive effects. For these QTLs, alleles from the female parent GX1135 increased phenotypic variation (Table 3). The remaining 15 QTLs in the two generations showed positive additive effects, with the alleles from the male parent GX100-2 increasing phenotypic variation. Six QTLs in each of the F2:3 and F2:4 generations had negative dominance effects. The remaining 12 QTLs in the two generations showed positive dominance effects.

Dominance and Overdominance Effects of QTL
A locus is regarded as showing overdominance if the ratio of the estimated dominance effect to the absolute value of the additive effect is larger than one. Similarly, a locus is regarded as showing partial dominance if the ratio is between 0 and 1 [34]. A total of 32 and 36 QTLs in the F2:3 and F2:4 populations, respectively, showed partial dominance, with ratios ranging from 0 to 1. One QTL for seed cotton yield showed dominance in the F2:4 generation. Thirteen and 19 QTLs in the F2:3 and F2:4 populations, respectively, showed overdominance, with ratios greater than 1. Partial dominance occurred more frequently than overdominance and dominance. More than half of the QTLs listed in Table 3 showed varying degrees of negative dominance, indicating that heterozygosity is not necessarily more favorable than homozygosity for yield. Similar results were reported previously in a study of an elite rice hybrid [41].
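The classification rule above can be sketched as a small function. Two points are assumptions made here for illustration and are not stated in the text: the magnitude of the dominance effect is used, so that negative dominance is classified by its size, and "dominance" is taken to mean a ratio equal to one within a small tolerance. The effect values in the usage lines are hypothetical.

```python
def dominance_class(additive: float, dominance: float, tol: float = 1e-9) -> str:
    """Classify a QTL by the degree of dominance |d| / |a|."""
    if additive == 0:
        raise ValueError("additive effect must be non-zero to form the ratio")
    # Assumption: the absolute dominance effect is used, so the sign of d
    # (positive or negative dominance) does not change the class.
    ratio = abs(dominance) / abs(additive)
    if abs(ratio - 1.0) <= tol:
        return "dominance"  # assumed meaning: ratio equal to one
    if ratio > 1.0:
        return "overdominance"
    return "partial dominance"  # ratio between 0 and 1, as in the text


# Hypothetical additive and dominance estimates for illustration only.
class_partial = dominance_class(additive=-1.0, dominance=0.4)
class_over = dominance_class(additive=0.5, dominance=-1.2)
class_full = dominance_class(additive=0.5, dominance=0.5)
```

Because the ratio divides by the additive effect, a QTL with a very small additive estimate is easily classified as overdominant, which is worth keeping in mind when counts of overdominant loci are compared between generations.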

Relationship between Marker Heterozygosity and Performance
Correlation coefficients between heterozygosity of the marker genotypes and the performance of these genotypes in terms of yield and yield components were not significant in the four environments (Table 4). These results implied that it is difficult to predict agronomic performance from the heterozygosity of marker genotypes. One possible reason for the low correlation is that only a portion of the 362 marker loci is related to the performance of those traits. These results were consistent with previous reports [41,69].
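As an illustration of how such a relationship might be computed, the sketch below scores marker heterozygosity as the fraction of heterozygous calls among scored co-dominant markers and correlates it with a trait using the Pearson coefficient. The genotype coding (11, 12, 22), the treatment of any other code as missing, and the function names are assumptions made for this example; the text does not specify how heterozygosity was scored.

```python
from math import sqrt
from typing import Sequence


def heterozygosity(genotypes: Sequence[str]) -> float:
    """Fraction of heterozygous (12) calls among scored markers."""
    scored = [g for g in genotypes if g in ("11", "12", "22")]
    if not scored:
        raise ValueError("no scored markers")
    return sum(g == "12" for g in scored) / len(scored)


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length samples."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two samples of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation undefined for a constant sample")
    return sxy / sqrt(sxx * syy)


# Hypothetical marker calls for one family line; "--" marks a missing call.
het_example = heterozygosity(["11", "12", "22", "12", "--"])
```

Each family line would contribute one heterozygosity value and one trait mean, and `pearson_r` would then be applied across the lines within each environment.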

Identification and Verification of Digenic Interactions
The numbers of digenic interactions identified by two-way ANOVA for lint yield and yield components are given in Table 5. A total of 65,341 tests were performed using 362 co-dominant markers at the whole-genome level. For individual tests at the 0.001 probability level, the expected number of spurious interactions would be 65. The number of interactions for the traits analyzed in the four environments ranged from 70 to 234, except that the numbers for boll weight of the F2:3 population at Handan (S1) and for lint percent of the F2:4 population at Cangzhou (N1) were 64 and 65, respectively. To further assess the likelihood that the interactions identified above were chance events, each of the declared significant interactions was subjected to randomization tests [41]. The numbers of significant interactions decreased after the randomization tests, and the reductions were more than the expected numbers based on chance events in all cases. This result indicated that the randomization test was highly stringent in identifying significant interactions. For example, the number of significant interactions identified for lint yield in the F2:3 populations was 109 and 106 at the two respective experimental sites, and was reduced to 97 and 93, respectively, after the randomization test. In the F2:4 populations, the number of significant interactions was 99 and 160 at the two respective experimental sites, and was reduced to 93 and 149, respectively, after the randomization tests (Table 5). The interactions that survived the randomization tests may therefore be regarded as the minimum number of significant interactions for each trait at the 0.001 probability level. In this way, large numbers of significant digenic interactions in the two populations were confirmed.
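The test count and the randomization rule can be sketched as follows. The first function reproduces the arithmetic in the text: 362 markers give 362 × 361 / 2 = 65,341 marker pairs, of which about 65 are expected to pass the 0.001 level by chance. The second function is a generic permutation test with the decision rule described in the methods (at most one of 1000 permuted statistics larger than the observed one). The statistic is passed in as a function; the `mean_gap` statistic used in the usage lines is a toy stand-in, not the two-way ANOVA interaction F-statistic of the study, and all names and data are hypothetical.

```python
import random
from math import comb
from typing import Callable, List, Sequence, Tuple


def expected_false_positives(n_markers: int, alpha: float) -> Tuple[int, float]:
    """Return the number of marker pairs and the tests expected at alpha by chance."""
    n_tests = comb(n_markers, 2)
    return n_tests, n_tests * alpha


def randomization_test(
    trait: Sequence[float],
    statistic: Callable[[Sequence[float]], float],
    n_perm: int = 1000,
    max_exceed: int = 1,
    seed: int = 0,
) -> Tuple[float, int, bool]:
    """Permute trait order and count permuted statistics above the observed one."""
    rng = random.Random(seed)
    observed = statistic(trait)
    shuffled: List[float] = list(trait)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # marker data stay fixed; only trait order changes
        if statistic(shuffled) > observed:
            exceed += 1
    return observed, exceed, exceed <= max_exceed


# Toy stand-in statistic: absolute gap between two fixed genotype groups.
GROUPS = [0, 0, 0, 1, 1, 1]


def mean_gap(values: Sequence[float]) -> float:
    g0 = [v for v, g in zip(values, GROUPS) if g == 0]
    g1 = [v for v, g in zip(values, GROUPS) if g == 1]
    return abs(sum(g0) / len(g0) - sum(g1) / len(g1))


n_tests, n_spurious = expected_false_positives(362, 0.001)
strong = randomization_test([1, 2, 3, 4, 5, 6], mean_gap)
weak = randomization_test([1, 6, 2, 5, 3, 4], mean_gap)
```

In the first toy data set the groups are perfectly separated, so no permutation can exceed the observed gap; in the second the groups overlap and many permutations exceed it, so the interaction would not be retained.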
In the F2:3 population, 97 digenic interactions for lint yield were detected at Handan (S1) and 93 at Cangzhou (N1), with one digenic interaction detected at both locations. For bolls per plant, 77 and 211 digenic interactions were detected at Handan (S1) and Cangzhou (N1), respectively, with one digenic interaction detected at both locations. For boll weight, 77 digenic interactions were detected at Handan and 60 at Cangzhou, with no digenic interaction common to both locations. For lint percent, 98 digenic interactions were detected at Handan and 61 at Cangzhou, with six digenic interactions common to both locations.
Interactions were partitioned by orthogonal contrasts into significant genetic components, and the significant genetic components were further confirmed by randomization tests (Table 6). These results indicated that all genetic components, AA, AD/DA and DD, existed in the F2:3 generation for all traits with significant interactions. In general, among the two-locus interaction components, AA interactions for lint yield and yield components had the highest frequencies. In contrast, DD interactions occurred with the lowest frequencies and AD/DA interactions occurred with intermediate frequencies.
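One common way to express such a partition is through single-degree-of-freedom contrasts on the 3 × 3 table of two-locus genotype means. The sketch below assumes equal cell sizes and uses the standard unweighted coefficients, additive (-1, 0, 1) and dominance (-1, 2, -1) over the genotypes 11, 12 and 22; the four interaction contrasts are the outer products of these vectors. This is not necessarily the exact parameterization used in STATISTICA for the study, and the names are hypothetical.

```python
from typing import Dict, List, Sequence

ADDITIVE = (-1.0, 0.0, 1.0)    # genotypes 11, 12, 22 at one locus
DOMINANCE = (-1.0, 2.0, -1.0)  # heterozygote against the two homozygotes

Table = List[List[float]]


def outer(u: Sequence[float], v: Sequence[float]) -> Table:
    """Outer product: rows follow locus 1, columns follow locus 2."""
    return [[ui * vj for vj in v] for ui in u]


CONTRASTS: Dict[str, Table] = {
    "AA": outer(ADDITIVE, ADDITIVE),
    "AD": outer(ADDITIVE, DOMINANCE),
    "DA": outer(DOMINANCE, ADDITIVE),
    "DD": outer(DOMINANCE, DOMINANCE),
}


def dot(a: Table, b: Table) -> float:
    """Sum of element-wise products of two 3 x 3 tables."""
    return sum(x * y for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def partition_interaction(cell_means: Table) -> Dict[str, float]:
    """Value of each interaction contrast for a 3 x 3 table of genotype means."""
    return {name: dot(coeffs, cell_means) for name, coeffs in CONTRASTS.items()}


# A purely additive table (row effect + column effect) has no interaction.
additive_table = [[r + c for c in (0.0, 1.0, 3.0)] for r in (10.0, 12.0, 11.0)]
no_interaction = partition_interaction(additive_table)

# A table shaped exactly like the AA contrast loads only on AA.
pure_aa = partition_interaction(CONTRASTS["AA"])
```

With unequal genotype frequencies, as in real F2-derived populations, the contrasts would need to be weighted by cell sizes to remain orthogonal, which this sketch does not attempt.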

Effects of Epistatic Interaction
Significant heterosis was detected for boll number per plant, and this yield component contributed directly to lint yield [3]. It was also identified as the most important yield component for lint yield in recent studies [70][71]. This yield component was therefore used as an example in this study to illustrate digenic effects in the F2:3 population. Effects of digenic interactions for the best double homozygotes, i.e., homozygotes of two-locus combinations with significant AA, are given in Table 7. Of a total of 56 two-locus combinations with significant AA, 33 complementary two-locus homozygotes (11/22 or 22/11) had distinct advantages over the means of the two parental genotypes (11/11 or 22/22) (Table 7). Among them, twenty-two cases were for the complementary two-locus homozygote 11/22, and eleven cases were for the complementary two-locus homozygote 22/11. Among the 56 epistatic interactions, the best locus combinations were the homozygotes 22/22 and 11/11 in sixteen and six cases, respectively. Occasionally, single heterozygotes (six cases and one case out of 27 tests for 11/12 and 22/12, respectively) could be comparable to the best two-locus genotypes. Generally, the two-locus combinations (11/22 or 22/11, eight cases out of 27 tests) and homozygotes (22/22, 11 cases out of 27 tests) with significant AD/DA interactions had larger heterosis effects than single heterozygotes (11/12, 22/12) (Table 8). These results indicated complementary effects in two-locus homozygotes. To test the hypothesis that double heterozygotes (12/12) have an advantage over single heterozygotes, two-locus combinations with significant DD effects were compared with two-locus heterozygotes (Table 9). Most of the best two-locus genotypes were homozygous at one locus and heterozygous at the other locus (11/12 and 22/12). An advantage of the double heterozygotes over single heterozygotes was not observed.
In summary, the complementary two-locus homozygotes (11/22 or 22/11) frequently showed large effects on heterosis. Single heterozygotes (11/12 and 22/12) and homozygotes (22/22) were the best two-locus genotypes for yield in a few cases, while in no case did double heterozygotes (12/12) show larger effects on heterosis than double homozygotes and single heterozygotes.

Necessity for Constructing a High-Density Linkage Map in Upland Cotton
Microsatellite markers have been widely used, even as framework markers, to construct linkage maps in crops [72] because of their relatively high polymorphism, detectability, and stability in the genome. Although EST-SSR sequences are now abundant in GenBank, these sequences are so conserved that the polymorphism of the SSR markers developed from them is comparatively low in Upland cotton [15]. Since the first genetic linkage map was constructed in an F2 population derived from crosses between G. hirsutum and G. barbadense [5], a variety of linkage maps have been constructed and a number of QTLs for yield and fiber quality traits have been identified from these linkage maps. However, heterosis of yield and yield components had not been analyzed in these high-density maps.
In order to construct a high-density linkage map for the analysis of yield heterosis in Upland cotton, 450 polymorphic primers were used to construct a linkage map in 173 F2:3 progeny lines derived from the two mapping parents 'GX1135' and 'GX100-2'. The genetic linkage map was constructed with 421 mapped loci and a coverage of 3814 cM of genetic length in the cotton genome. In this research, the majority of primers with a high rate of polymorphism were developed from genomic clones from microsatellite-enriched cotton DNA libraries, which were sequenced to identify SSR-containing target regions, and from SSR-containing EST collections [73] (S1 Table). Physical maps of the D5-genome cotton species, the A2-genome cotton species, and the AD1 genome of Upland cotton have been completed [21][22][23][24][25]. The physical map of the whole genome of Sea Island cotton is not currently available, but it is expected to be released soon (www.cottongen.org). These physical maps will provide extensive information for constructing high-density genetic linkage maps, molecular marker selection, and map-based cloning in Upland cotton. With the emergence of next-generation sequencing technology and more sequences from newly released expression DNA libraries, more and more functional markers will be developed for gene mapping and tagging.

Genetic Effects of QTL for Yield and Yield Components
In the present study, only a few QTLs showed overdominance, while the majority showed partial dominance. Alleles from both the female and male parents acted in the direction of increasing phenotypic variation, which has been observed in previous studies [39][40][41].
A number of QTLs for yield traits were detected simultaneously in the F2:3 and F2:4 populations with both additive and dominance types of genetic effects. The QTLs identified for yield and yield components simultaneously in the F2:3 and F2:4 populations, as shown in Table 3, were further analyzed. Generally, QTLs detected in the F2:3 generation had larger effects than those in the F2:4 generation, which indicated inbreeding depression. For example, the dominance effect of qSY-chr14-1 was 3.32 in the F2:3 generation, larger than its dominance effect of 0.10 in the F2:4 generation. For lint percent, seven QTLs were detected in both populations with similar magnitudes of additive and dominance effects in the two populations. Overall, the effects of the QTLs detected in the F2:3 population were substantially larger than those detected in the F2:4 population. The same phenomenon was observed in a previous study of QTLs for yield using F2 and F2:3 populations of a vegetatively reproduced rice [74]. These results are consistent with the contention that dominance effects were the genetic basis of heterosis and inbreeding depression in Upland cotton.

Repeatability of QTLs among Different Studies
Although a number of QTLs for yield and yield components were identified in earlier reports using both interspecific populations and intraspecific Upland cotton populations, only a limited number of markers were reported to be common among different populations in the cotton genome. Therefore, it is necessary to fill this gap by identifying enough QTLs for yield traits that are consistent across different segregating populations and different environments. Some QTLs contributing to yield traits were identified on the same chromosomes in different studies. For example, QTLs for lint yield were detected on chromosomes 1, 2, 3, 5, 6, 9, 12, 13, 14, 15, 16, 20, 23, 24, 25 and 26 [7, 75-80]. Among these, a common QTL for lint yield on chromosome 14 was detected in different mapping populations and environments [7,77]. In the current study, the QTL qLY-chr13-1 was also detected on chromosome 13 in the interval between markers GH157 and BNL1495. Another QTL for lint yield was detected in a study by Wu et al. [79] with flanking markers BNL1421 and BNL1495. These two QTLs for lint yield may overlap or be the same QTL, given their proximity in the genome.
In the F2:3 and F2:4 populations, three types of digenic interactions, AA, AD/DA and DD, were detected. AA interactions were detected with the highest frequency and DD interactions with the lowest frequency. None of the double heterozygotes was detected as the best genotype in terms of bolls per plant. The large number of significant digenic interactions for yield traits identified in the F2:3 and F2:4 populations indicated that epistasis was the genetic basis of heterosis in Upland cotton.
In two-locus combinations showing significant AA effects, the best genotypes for bolls per plant were the complementary two-locus homozygotes (11/22 or 22/11). Only one single heterozygote (11/12) was detected as the best genotype. The high heterotic values of two-locus combinations with significant AD/DA interactions indicated that the complementary two-locus homozygotes (11/22 or 22/11) were the best genotypes. In two-locus combinations with significant DD, the best genotypes were frequently single heterozygotes (11/12 and 22/12). No double heterozygote (12/12) was detected as the genotype with the best performance. Similar results were also reported in previous studies by Hua et al. [41] and Zhou et al. [44]. These phenomena indicated a low correlation between the heterozygosity of parental genotypes and their performance. However, single-locus heterozygotes showed higher correlations between parental genotypes and their performance than double heterozygotes. These results implied that double heterozygosity does not necessarily favor the expression of a trait. The lack of correlation between heterozygosity and performance is possibly due to the small portion of heterozygous loci analyzed for heterosis [41]. The lack of correlation could also be caused by a low association between marker heterozygosity and QTLs [81].
The genetic basis of heterosis has been analyzed extensively by different research groups [37-41, 44, 82]. Because of the different materials and methods used, the genetic and molecular mechanisms underlying heterosis reported in previous studies were inconsistent. In this research, we dissected the genetic basis of heterosis in F2:3 and F2:4 populations at the single- and two-locus levels and concluded that dominance effects at the single-locus level and epistasis effects at the two-locus level were the genetic basis of heterosis in Upland cotton.