From Many, One: Genetic Control of Prolificacy during Maize Domestication

A reduction in number and an increase in size of inflorescences is a common aspect of plant domestication. When maize was domesticated from teosinte, the number and arrangement of ears changed dramatically. Teosinte has long lateral branches that bear multiple small ears at their nodes and tassels at their tips. Maize has much shorter lateral branches that are tipped by a single large ear with no additional ears at the branch nodes. To investigate the genetic basis of this difference in prolificacy (the number of ears on a plant), we performed a genome-wide QTL scan. A large effect QTL for prolificacy (prol1.1) was detected on the short arm of chromosome 1 in a location that has previously been shown to influence multiple domestication traits. We fine-mapped prol1.1 to a 2.7 kb “causative region” upstream of the grassy tillers1 (gt1) gene, which encodes a homeodomain leucine zipper transcription factor. Tissue in situ hybridizations reveal that the maize allele of prol1.1 is associated with up-regulation of gt1 expression in the nodal plexus. Given that maize does not initiate secondary ear buds, the expression of gt1 in the nodal plexus in maize may suppress their initiation. Population genetic analyses indicate positive selection on the maize allele of prol1.1, causing a partial sweep that fixed the maize allele throughout most of domesticated maize. This work shows how a subtle cis-regulatory change in tissue specific gene expression altered plant architecture in a way that improved the harvestability of maize.


Introduction
The "domestication syndrome" of crop plants is a suite of adaptive traits that arose in response to direct and indirect selection pressures during the domestication process [1][2][3]. This suite of traits includes an increase in seed or fruit size, larger inflorescences, an increase in apical dominance, more determinate growth and flowering, loss of natural seed dispersal, loss of seed dormancy, and, in some cases, the gain of self-compatibility. These traits make crop plants easier to cultivate and harvest, resulting in increased value for human use.
Among the domestication syndrome traits, the increase in apical dominance improves agricultural performance by enhancing harvestability. Apical dominance confers a reduction in the number of branches and inflorescences per plant. The inflorescences that do form, however, have more and/or larger fruits or seeds. Thus, increased apical dominance can afford easier harvestability by reducing the number of inflorescences to be harvested without a concomitant loss in yield per plant. Moreover, larger seeds allow for more vigorous growth after germination when seedlings can face intense competition from weedy species. Finally, the fewer but larger inflorescences mature in a narrower window of time, enabling all the fruit/seed of a plant to be harvested at the same time of optimal maturation.
Maize was domesticated from Balsas teosinte (Zea mays subsp. parviglumis) through a single domestication event in Mexico about 9000 years ago [4,5]. During maize domestication, there was a profound increase in apical dominance such that the amount of branching and the number, size and arrangement of the female inflorescences (ears) changed dramatically [6,7]. The teosinte plant has multiple long lateral branches, each tipped with a tassel. At each node along these lateral branches, there are clusters of several small ears (Figure 1A). Summed over all branches, a single teosinte plant can easily have more than 100 small ears. By comparison, the maize plant has relatively few lateral branches (often just two), each tipped by a single large ear rather than a tassel as in teosinte (Figure 1C). Modern commercial varieties of maize typically have only one or two ears per plant, and even traditional landraces of maize rarely have more than 6 ears per plant. In maize genetics and breeding, the number of ears on a plant is scored as prolificacy, teosinte having high and modern maize low prolificacy.
Here, we report a genome-wide scan for prolificacy QTL using a maize-teosinte BC2S3 mapping population [8]. We also report the fine-mapping of one of the discovered QTL to a 2.7 kb "causative region" located 7.5 kb upstream of the coding sequence of the known maize gene grassy tillers1 (gt1), which encodes a homeodomain leucine zipper (HD-ZIP) transcription factor [9]. We characterize the change in expression of gt1 between the maize and teosinte alleles of our mapping population, and the relationship between this expression change and reduced prolificacy in maize. We also performed a molecular population genetic analysis that suggests the causative region was the target of a partial selective sweep that brought a haplotype at low frequency in teosinte to a higher frequency over most of the range of maize landraces. Our results show that a subtle change in tissue-specific gene expression is associated with a reduction in prolificacy during domestication.

Results
A major QTL (prol1.1) largely controls prolificacy
Whole genome QTL mapping for loci affecting prolificacy was performed using a set of 866 maize-teosinte BC2S3 recombinant inbred lines (RILs). This analysis identified eight QTL, distributed across the first 5 chromosomes (Figure 2, Table 1). Of the eight QTL, one has a much larger effect than the other seven. This QTL (prol1.1) is located on the short arm of chromosome 1 and accounts for 36.7% of the phenotypic variance. Plants in the mapping population that are homozygous teosinte at prol1.1 typically produce multiple ears at each node like teosinte (Figure 1B). The 1.5 LOD support interval surrounding prol1.1 defines a 0.79 Mb segment between 22.63 Mb and 23.42 Mb (B73 Reference Genome v2) on chromosome 1. This region contains just 25 annotated genes including gt1. The other seven QTL have much smaller LOD scores and smaller effects. This disparity in QTL size suggests that although the seven smaller QTL contribute to prolificacy, the phenotype is primarily controlled by prol1.1.

prol1.1 maps to the promoter of gt1
We chose prol1.1 for fine-mapping to identify the underlying causative gene. Two markers (umc2226 and bnlg1803) that flank the QTL interval were used to screen for recombinant chromosomes in one of the 866 BC2S3 RILs that is heterozygous in the prol1.1 QTL interval. After screening ~4000 plants of this RIL, 23 plants with a cross-over between the two markers were identified and self-pollinated to create progeny lines homozygous for the 23 recombinant chromosomes. The physical position of each of the 23 recombination events was determined using a combination of gel-based markers and DNA sequencing (Figure 3, Figure S1; Table S1).

Author Summary
Crop species underwent profound transformations in morphology during domestication. Maize experienced a more striking change in morphology than other crops. Among the changes in maize from its ancestor, teosinte, was a switch from 100 or more small ears per plant in teosinte to just one or two large ears in maize. We show that this change in ear number has a relatively simple genetic architecture involving a gene of large effect, called grassy tillers1. Moreover, we show that grassy tillers1 experienced a tissue-specific gain in expression in maize that is associated with suppressing the initiation of multiple ears per plant such that only one or two large ears are formed. Our results show how simple changes in gene expression can lead to profound differences in form.
Progeny lines homozygous for the 23 recombinant chromosomes were grown in a randomized-block design and scored for prolificacy. We also included two lines derived from the same BC2S3 RIL as controls: one homozygous teosinte and the other homozygous maize in the QTL interval. This set of 25 progeny lines fell into two discrete classes for prolificacy (Figure 3). One class, which included the maize control line, had an average prolificacy score of 2.38 ± 0.05 ears. The other class, which included the teosinte control line, had an average prolificacy score of 7.24 ± 0.12 ears. Separately, to estimate dominance relationships, we compared the trait values of the maize, teosinte and heterozygous genotypic classes at prol1.1. The dominance/additivity ratio is 0.08, indicating additive gene action (Table S2).
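As an illustration of how a dominance/additivity ratio is commonly obtained from three genotypic class means, the following minimal sketch may help. The function name, the sign convention (teosinte minus maize), and the genotype means in the usage example are assumptions made for illustration only; they are not the values in Table S2, and the authors' exact procedure may differ.

```python
def dominance_additivity_ratio(mean_maize, mean_het, mean_teosinte):
    """Return d/a from three genotypic class means.

    a (additive effect): half the difference between the two homozygotes.
    d (dominance deviation): heterozygote mean minus the midparent value.
    A ratio near 0 suggests additive gene action; a ratio near +1 or -1
    suggests complete dominance of one allele.
    """
    midparent = (mean_maize + mean_teosinte) / 2.0
    a = (mean_teosinte - mean_maize) / 2.0
    if a == 0:
        raise ValueError("homozygous classes do not differ; d/a is undefined")
    d = mean_het - midparent
    return d / a


# Hypothetical ear counts, NOT taken from the paper's tables.
ratio = dominance_additivity_ratio(mean_maize=2.0, mean_het=4.2, mean_teosinte=6.0)
print(round(ratio, 2))
```

With these illustrative means the heterozygote sits close to the midparent, so the ratio is small, which is the pattern described as additive gene action.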
Examination of the relationship between the two phenotypic classes and the recombination breakpoints revealed that all members of the maize class carry maize chromosome between two markers (Figure 3, Figure S1). Correspondingly, all members of the teosinte phenotypic class carry teosinte chromosome between these two markers. No other chromosomal region shows this absolute correspondence with phenotype. Thus, substitution mapping based on the recombination breakpoints indicates that prol1.1, the factor that governs prolificacy, maps to this interval. This interval, which we will refer to as the "causative region," is approximately 7.5 kb upstream of gt1 and measures 2720 bp in W22, 3142 bp in our teosinte parent, and 2736 bp in the B73 reference genome (Figure 3, Figure S1). The sequence alignment of W22 and the teosinte parent expands to ~4.2 kb because there are several large insertions unique to either W22 or teosinte (see below).
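The logic of substitution mapping described above can be sketched in a few lines: keep only those markers whose genotype agrees with the phenotypic class in every recombinant line. The marker names (m1 to m6), the genotype strings, and the function name below are hypothetical and serve only to illustrate the idea; they do not correspond to the markers in Figure 3 or Table S1.

```python
def concordant_markers(markers, lines):
    """Return the markers whose genotype matches the phenotypic class
    in every line.

    markers: ordered list of marker names along the chromosome.
    lines:   list of (genotypes, phenotype_class) tuples, where genotypes
             is a string with one character per marker ('M' = maize,
             'T' = teosinte) and phenotype_class is 'M' or 'T'.
    """
    return [
        name
        for i, name in enumerate(markers)
        if all(genotypes[i] == phenotype for genotypes, phenotype in lines)
    ]


# Hypothetical recombinant lines; breakpoints differ from line to line.
markers = ["m1", "m2", "m3", "m4", "m5", "m6"]
lines = [
    ("MMMMTT", "M"),  # maize-like phenotype, breakpoint between m4 and m5
    ("TTMMMM", "M"),  # maize-like phenotype, breakpoint between m2 and m3
    ("TTTTMM", "T"),  # teosinte-like phenotype
    ("MMTTTT", "T"),  # teosinte-like phenotype
]
print(concordant_markers(markers, lines))  # ['m3', 'm4']
```

In this toy data set only m3 and m4 track the phenotype in every line, so the causal factor would be placed in the segment they span, bounded by the nearest flanking breakpoints.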

The decrease in prolificacy in maize is correlated with an increase in kernel weight
The maize allele of prol1.1 confers a reduction in ear number, which by itself would cause a reduction in yield. To test whether there is a compensatory increase in either the number of kernels per ear or kernel weight, we assayed plants of the BC2S3 family used for fine-mapping to determine if prol1.1 has associated effects on these traits. The prol1.1 maize allele is not associated with an increase in ear size as measured by the total number of spikelets (kernel forming units) produced in the primary ear (maize = 418, heterozygous = 423, teosinte = 421, p = 0.86; Table S2). However, the maize allele is associated with an increase in kernel weight (maize = 0.216 g, heterozygous = 0.208 g, teosinte = 0.187 g, p < 0.0001; Table S2). Other aspects of plant architecture such as tillering and the number of nodes along the maize culm that produce ears do not appear to be affected by prol1.1 (Table S2). Thus, these data suggest that the reduction in secondary ears caused by prol1.1 in maize was compensated for by an increase in kernel weight such that yield itself may not have changed. Confirmation of this interpretation would require a formal yield trial comparing the maize and teosinte genotypes.

Maize and teosinte alleles of gt1 show near equal expression
The location of prol1.1 at ~7.5 kb upstream of the coding sequence of gt1 suggests that it may represent a cis-regulatory element of gt1.
To investigate this possibility, we used ESTs from Genbank and genomic sequence of our maize and teosinte parents to construct a gene model for gt1 (Figure S2). This model agrees with the gt1 gene model presented elsewhere [9]. gt1 possesses three exons with two small introns and a transcript of ~1350 bp that encodes a protein of 239 amino acids. The homeodomain and a putative nuclear localization signal are located in Exon 2.
We performed RT-PCR with primers designed to amplify most of the predicted transcript (1203 bp of the predicted 1350 bp) using cDNAs isolated from immature ear-forming axillary branches of isogenic lines derived from our mapping population possessing the maize and teosinte alleles. We observed three size classes of RT-PCR products, presumably corresponding to three splice variants or isoforms of gt1 ( Figure 4). The three size classes are present with both maize and teosinte alleles. We cloned and sequenced all three size classes and aligned these with the genomic sequence ( Figure S3). The largest class contains the entire predicted open reading frame, encoding a predicted protein of 239 amino acids. The middle-sized product is missing most of Exon 2 and part of Exon 3. The smallest-sized product is missing all of Exon 2 and parts of Exons 1 and 3. Critically, the middle and small-sized products are both missing the homeodomain and all or part of the putative nuclear localization signal.
The relative band intensities of different sized RT-PCR products ( Figure 4) suggest that transcript abundance for the isoforms differs between the maize and teosinte alleles: teosinte having a greater abundance of the full length product and maize a greater abundance of the middle-sized product that lacks the homeodomain. To test whether these differences in band intensity for the different isoforms are independent of the causative region, we performed RT-PCR with two of our recombinant isogenic lines. One of these has the teosinte causative region linked to the maize coding sequence (T:M), and the other has the maize causative region linked to the teosinte coding sequence (M:T). RT-PCR assays with these recombinant lines confirm that the differential band intensity for the isoforms is determined by the coding sequence and not the causative region 7.5 kb upstream of the coding sequence ( Figure 4).
To investigate the effect of the causative region on transcript abundance for our maize and teosinte alleles, we used an allele specific expression assay [10]. cDNA was made from RNA from immature ear-forming axillary branches of plants heterozygous at prol1.1-gt1. PCR primers were designed flanking a 2 bp indel in the 3′ non-translated region that distinguishes the maize and teosinte alleles (Figure S2). This indel is in all three isoforms, and thus PCR products measure the overall difference in the abundance of the maize and teosinte transcripts without regard to any differences in relative abundance of the isoforms between maize and teosinte. In a heterozygous plant, the maize and teosinte alleles are expressed in the same cells with a common set of trans-acting factors; therefore, any difference in transcript abundance of the alleles in heterozygous plants must be due to cis-regulatory factors. This assay shows a ratio of 1.35 for teosinte:maize gt1 transcript, suggesting a modest but statistically significant excess of teosinte relative to maize transcript (z-test, p < 0.001).
As an additional test of the effects of the causative region on gt1 transcript abundance, we used quantitative PCR (qPCR) to compare overall gt1 transcript abundance in immature ear-forming axillary branches of isogenic lines that are homozygous for the maize vs. teosinte alleles at prol1.1-gt1. For this assay, we used a primer pair in the 3′ UTR of all three isoforms. The abundance of gt1 transcript relative to actin transcript for the teosinte class (1.03, n = 12) was slightly higher than that for the maize class (0.88, n = 12); however, this difference is not statistically significant (t-test, p = 0.077). Both the allele specific expression assay and qPCR suggest that the teosinte transcript abundance might be slightly higher than that of maize, but any difference is modest.

Maize prol1.1 directs increased gt1 expression in primary branch nodes
Although a substantial change in gt1 transcript levels was not detected between the maize and teosinte alleles of prol1.1 in immature ear-forming axillary branches, we hypothesized that the absence of secondary ears in maize could be caused by a more subtle change that does not drastically alter overall transcript level but instead impacts the domain of gt1 expression. In order to test for such a tissue-specific expression difference, we performed RNA in situ hybridization on immature primary ear-forming branches of lines containing all possible combinations of the maize and teosinte causative region (prol1.1) and gt1 coding sequence (M:M, T:T, M:T, and T:M). A previous study demonstrated that gt1 is strongly expressed in the leaves of dormant tiller-forming lateral buds [9], thus we anticipated that gt1 expression might differ in the leaves (husks) surrounding secondary ear buds of maize and teosinte. Contrary to this expectation, our sections revealed that lines containing the maize allele of prol1.1 (M:M and M:T) rarely, if at all, initiate secondary ear buds (Text S1, Table S3). Expression of gt1 was observed in young leaves surrounding secondary ears of lines containing the teosinte allele of prol1.1 (T:T and T:M) ( Figure  S4), but was weak compared to dormant buds [9], and required an extended incubation for detection, suggesting that these secondary ears are not dormant. Interestingly, an up-regulation of gt1 expression was observed in the stem node or nodal plexus [11] of primary branches for lines containing the maize allele of prol1.1 (M:M and M:T, Figure 5 A,B). This nodal gt1 expression was either absent or only weakly detectable above background in lines containing the teosinte allele of prol1.1 (Figure 5 C,D). While the nodal stripe of gt1 was weak, the difference between the maize and teosinte prol1.1 lines was consistently observed in both late ( Figure 5) and early staged ( Figure S5) ear-forming axillary branches. 
Taken together, these observations suggest that the allelic differences at prol1.1 involve changes in a cis-regulatory element that causes increased gt1 expression in the nodal plexus of maize, which in turn inhibits the initiation of secondary ear buds.

A partial selective sweep occurred at prol1.1
To investigate whether the causative region shows evidence of past selection during maize domestication, we sequenced the entire causative region (~2.7 kb) plus flanking sequence (~1000 bp upstream and ~700 bp downstream) in 15 inbred maize landraces and 9 inbred teosinte (Text S2, Table S4). Diversity statistics across the region in both teosinte (S = 85, π = 0.00844 and Tajima's D = -1.16) and maize (S = 32, π = 0.00307 and Tajima's D = -0.439) are within the previously estimated range of these statistics for neutral genes [12], where S and π are the number of segregating sites and nucleotide diversity, respectively. Although these data would superficially appear to be consistent with a loss of diversity due to the domestication bottleneck alone, a neighbor-joining tree of the sequences separates most maize from most teosinte sequences in the causative region (Figure S6). This separation of the mostly maize and mostly teosinte clusters reflects differences at numerous SNPs and multiple putative transposon insertions (Figure S7). We will refer to these maize and teosinte clusters hereafter as the class-M and class-T haplotypes, respectively. Linkage disequilibrium (LD) analysis of maize sequences confirms this separation, identifying a 2.5 kb block of strong LD corresponding to SNPs that differentiate class-M from class-T maize sequences (Figure 6A, Figure S8). This high LD block lies completely within the 2.7 kb causative region. The maize class-M haplotype in this block exhibits extremely low levels of nucleotide diversity (π = 0.000740) and a strongly negative Tajima's D value (D = -1.966). These values are extremely unlikely under neutrality (p < 0.01; Text S2), leading us to investigate instead a partial sweep model to explain the observed sequence data.
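For readers unfamiliar with the diversity statistics reported above, the following minimal sketch computes S, π and Tajima's D from an alignment using the standard formulas of Tajima (1989). It assumes equal-length sequences with no gaps or missing data, and the toy alignment is invented for illustration; it is not the software or the data used in this study.

```python
from itertools import combinations
from math import sqrt


def diversity_stats(seqs):
    """Return (S, pi, D) for a list of aligned, equal-length sequences.

    S  = number of segregating sites
    pi = mean pairwise differences per site (nucleotide diversity)
    D  = Tajima's D (NaN when there are no segregating sites)
    """
    n, length = len(seqs), len(seqs[0])
    S = sum(1 for i in range(length) if len({s[i] for s in seqs}) > 1)
    pairs = list(combinations(seqs, 2))
    # k: average number of differences between two sequences
    k = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs) / len(pairs)
    pi = k / length
    if S == 0:
        return S, pi, float("nan")
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)
    D = (k - S / a1) / sqrt(e1 * S + e2 * S * (S - 1))
    return S, pi, D


# Toy alignment in which every variant is a singleton.
toy = ["AAAAAAAAAA", "TAAAAAAAAA", "ATAAAAAAAA", "AATAAAAAAA"]
S, pi, D = diversity_stats(toy)
print(S, round(pi, 3), D < 0)
```

An excess of rare variants, as in the toy alignment, pulls the average pairwise difference below its neutral expectation and makes D negative, which is the direction of the signal reported for the class-M haplotype.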
To investigate the unusual pattern of diversity for the maize class-M haplotypes, we applied a maximum likelihood method to estimate the selection coefficient (s) and the degree of dominance (h) using structured coalescent simulations (Text S2). We specified a partial sweep model (Figure 6B), consistent with the observation of both class-M and class-T haplotypes in domesticated maize sequences, and performed structured coalescent simulations over a wide range of parameter settings similar to previous studies [12,13]. Our maximum likelihood estimates suggest that the class-M allele is dominant (h = 1.0) and under reasonably strong selection (s = 0.0015) (Figure 6C). We also estimated the age of the class-M haplotype to be ~13,000 generations ago using Thomson's method [14,15]. Although the observed length (2.5 kb) of the swept region may seem short, simple calculations show that this length falls within the ~1-7 kb range expected given available estimates of recombination and the age of the haplotype (Text S2).
We assayed a diverse sample of maize and teosinte to better estimate the frequencies of the class-M and class-T haplotypes (Table S5). We used a ~250 bp insertion specific to the class-T haplotype as a marker. We observed that the class-M haplotype exists at a relatively low frequency in ssp. parviglumis (5%) and ssp. mexicana (8%) while the class-T haplotype exists at a moderate frequency in maize landraces (29%) (Table 2). These frequencies are consistent with the partial selective sweep discussed above that brought the class-M haplotype from a low frequency (5%) in the progenitor population to a relatively high frequency (71%) in domesticated maize.
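To give a rough feel for the time scale such a frequency change implies, the sketch below iterates a textbook deterministic one-locus selection model from 5% to 71% using the point estimates s = 0.0015 and h = 1.0. This is only a back-of-the-envelope illustration under assumed genotype fitnesses of 1 + s, 1 + hs and 1 in an infinite population; it is not the structured coalescent method used for the estimates above, and it ignores drift and demography.

```python
def generations_to_reach(p0, p_target, s, h, max_gen=1_000_000):
    """Count generations for a favored allele to rise from p0 to p_target.

    Assumed fitnesses: favored homozygote 1 + s, heterozygote 1 + h*s,
    other homozygote 1. Deterministic recursion, no drift.
    """
    p, gen = p0, 0
    while p < p_target and gen < max_gen:
        q = 1.0 - p
        w_bar = p * p * (1 + s) + 2 * p * q * (1 + h * s) + q * q
        p = (p * p * (1 + s) + p * q * (1 + h * s)) / w_bar
        gen += 1
    return gen


gens = generations_to_reach(p0=0.05, p_target=0.71, s=0.0015, h=1.0)
print(gens)
```

Under these simplifying assumptions the rise takes a few thousand generations, which fits within the roughly 9000 years since domestication for an annual crop.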
An examination of the distribution of the class-T haplotype in maize shows a distinct geographic pattern (Figure S9). With only three exceptions, the class-T haplotype is limited to southern Mexico, the Caribbean Islands and the northern coast of South America. One exception is its occurrence in the landrace Tuxpeño Norteño in northern Mexico, but this is a landrace thought to be recently derived from the landrace Tuxpeño of southern Mexico [4]. The two other exceptions are found in southern Brazil in landraces thought to have been brought to Brazil in the 1800s from the southern USA [16]. In turn, the southern US landraces are thought to have been brought there from southern Mexico and the Caribbean in the 1600s by the Spanish [17]. Thus, the class-T haplotype in maize has a distribution centered on southern Mexico and the Caribbean with recent dispersals to other regions.

Discussion
A critical challenge during the domestication of crop plants was to improve the harvestability of the crop as compared to its progenitor. Many wild species are adapted to "spread their bets" and thereby increase the probability of successful reproduction under diverse environments [2]. This is especially true of annual species, like the ancestors of many crops, that colonize disturbed habitats [2]. In unfavorable environments, such species can flower and mature rapidly, producing smaller numbers of branches, inflorescences, flowers and seeds but still complete their reproductive cycle. In favorable environments, such species can flower over a longer period, sequentially producing more branches, inflorescences, flowers and seeds over time, maximizing their reproductive output. The latter strategy is not optimal for a crop as greater efficiency of harvest is achieved by having all seed mature synchronously. Similarly, harvesting a single large inflorescence or fruit from a plant is easier than harvesting dozens of smaller ones [18]. Thus, diverse crops have been selected to produce smaller numbers of larger seeds, fruits or inflorescences as a means of improving harvestability [2]. In the terminology of modern day maize breeders, crops were selected to be less prolific.
Our QTL mapping for prolificacy confirms the results of three prior studies that indicated this trait is controlled by a relatively small number of QTL including one of large effect on the short arm of chromosome 1. First, in an F2 cross of Chalco teosinte (Zea mays ssp. mexicana) with a Mexican maize landrace (Chapalote), one of the four detected QTL was located on the short arm of chromosome 1 and accounted for upwards of 19% of the phenotypic variance in prolificacy [19]. Second, in an F2 cross of Balsas teosinte with a different Mexican maize landrace (Reventador), one of the seven detected QTL was located on the short arm of chromosome 1 and accounted for 25% of the phenotypic variance [20]. Finally, in a maize-teosinte BC1 cross of Balsas teosinte by a US inbred line (W22), seven prolificacy QTL were detected [21]. All seven QTL had small effects, but the one that explained the greatest portion of the variance (4.5% averaged over two environments) was on the short arm of chromosome 1. As in these prior studies, the QTL mapping reported here indicates that prolificacy is under relatively simple genetic control, involving only 8 QTL but including one QTL (prol1.1) of large effect. prol1.1 accounted for 36.7% of the variation in the number of ears and reduces the number of ears from 7.2 for the teosinte homozygous class to 2.4 for the maize homozygous class.
The genetic architecture of the change in prolificacy during domestication appears to be relatively simple in several other crops as well. In tomato, five QTL of roughly equal effects for the number of flowers per truss between wild and domesticated tomato were detected [22,23]. In the common bean, three QTL were detected for the reduction in the number of pods per plant in a cross of wild and domesticated bean [24]. The QTL of largest effect confers a reduction from 29 to 17 pods per plant and accounts for 32% of trait variation. In pearl millet, the reduction in the number of spikes per plant is governed by four QTL, including one that controls 37% of trait variation [25]. In sunflower, the reduction of number of heads per plant was governed by seven QTL, one of which had a much larger effect than the other six [26]. This large effect QTL accounts for a difference of 4.8 heads per plant between the cultivated and wild genotypes, and it colocalizes with the previously described Branching (B) locus, which is known to influence apical dominance [27]. Thus, simple genetic architecture including QTL of relatively large effect is common for this trait.
One theory of crop domestication is that trait change is often the result of recessive, loss of function alleles [28]. Contrary to this expectation, prol1.1 acts in an additive fashion with a dominance/additivity ratio of 0.08, suggesting that domestication did not involve selection for a simple loss of function. Moreover, our expression assays indicate that gt1 has roughly equal expression in maize and teosinte ear-forming axillary branches and the phenotypic change is caused by a relatively subtle gain/increase of expression in the nodal plexus of the ear-forming branches of maize. These results demonstrate that rather than a simple loss of function allele, the gene underlying this QTL experienced an increase or gain of expression in a specific tissue. While selection for loss of function alleles may be a common feature of domestication, none of the three positionally mapped maize domestication QTL (teosinte branched1, teosinte glume architecture1, and gt1) involved a loss of function allele [29, 30, this paper].
Seventy-five years ago, the "teosinte hypothesis" that a small number of large-effect gene substitutions could convert teosinte into a useful food crop was proposed [31]. The experimental basis for this model was that maize-like and teosinte-like segregants were recovered in a large maize-teosinte F2 population at frequencies suggesting that as few as five loci might control the critical differences in ear architecture. Subsequent QTL mapping identified six regions of the genome that harbor QTL of large effect on plant and ear architecture, consistent with the teosinte hypothesis [32]. Fine-mapping of two of these QTL identified an underlying gene of large effect in both cases. One of these is teosinte glume architecture (tga1) that controls the difference between covered vs. naked grain [30], and the other is teosinte branched (tb1), which conferred increased apical dominance during domestication [29]. In this paper, we have shown that a gene of large effect (gt1) also underlies a third of these six QTL of large effect. This result adds further support to the view that a small number of genes of large effect were key in the dramatic morphological changes that occurred during maize domestication. Nevertheless, it is also clear that a larger number of QTL of smaller effect on morphology were involved in converting teosinte into modern maize [8,32,33].
The role played by genes of large effect, like gt1, is not limited to maize domestication, but seems to be a common feature of plant domestication [34]. Recently, a large effect gene in sorghum that encodes a YABBY transcription factor was shown to control shattering vs. non-shattering inflorescences [35]. Previously, two domestication genes controlling shattering had been identified in rice, one encoding a homeodomain and the other a myb-domain transcription factor [36,37]. In tomato, two domestication genes for increase in fruit size have been isolated, one encoding a YABBY transcription factor and the other a putative cell signaling gene [38,39]. A single gene (PROG1), which encodes a zinc finger transcription factor, controls differences in plant architecture and grain yield between wild and cultivated rice [40,41].
The fine-mapping of prol1.1 was initiated using a publicly available set of maize-teosinte RILs [8]. These RILs allow some QTL to be mapped to relatively small intervals. We mapped prol1.1 to a 0.79 Mbp segment that included only 25 annotated genes and then fine-mapped it to a 2.7 kbp causative interval. These same maize-teosinte RILs were recently used to fine-map a QTL (dtp10.1) for photoperiod response that was involved in the adaptation of maize to northern latitudes [8,42]. The dtp10.1 QTL was mapped to a 7.6 Mbp interval containing 103 annotated genes, and then fine-mapped to a 202 kbp interval containing a single annotated gene (ZmCCT). Features of prol1.1 and dtp10.1 that made them good candidates for fine-mapping were (a) having large effects with strong statistical support (LOD > 100) so that progeny lines with recombinant chromosomes possessing the maize vs. teosinte alleles of the QTL segregated into two distinct classes (i.e. Mendelized) and (b) being located in genomic regions with sufficient recombination to capture multiple cross-overs per gene in an F2 family of 2000 plants. For example, prol1.1 is located near the end of the short arm of chromosome 1, where we observed a recombination rate of 1.3 × 10^-3 cM/kbp, which is over twice the genome-wide rate reported for a maize-teosinte cross [21].
The location of prol1.1 just 7.5 kb 5′ of grassy tillers1 (gt1) suggested that it may act as a cis-regulatory element of gt1. Whipple et al. [9] identified gt1 as a HD-Zip transcription factor, a class of proteins that is unique to plants. The role of gt1 in maize development is complex. Although named for the excessive tillering caused by loss of function alleles, these alleles also cause the derepression of carpels in tassel florets, leading to the formation of sterile carpels [9]. Additional changes include an increased number of ear-forming nodes along the main culm, elongation of the lateral branches, and elongation of the blades on the husk leaves.
The formation of secondary ears is occasionally (but not typically) seen with maize gt1 mutant alleles, consistent with the effect of prol1.1 on gt1 expression that we observed. The infrequency of this phenotype with the maize mutant alleles might be due to differences in genetic background between our lines, for which about 10% of the genome comes from teosinte, and the elite maize inbreds in which gt1 mutant alleles have been assayed. One curiosity is that the teosinte allele we studied does not confer an increase in tillering (Table S2), suggesting the role of gt1 in regulating tillering is conserved between maize and teosinte.
Another HD-Zip transcription factor, six-rowed spike1 (Vrs1), has been identified as a domestication gene, controlling the change from two-rowed spikes in the wild progenitor of barley to six-rowed spikes found in domesticated barley [43]. Vrs1 is expressed in the lateral spikelet primordia of immature spikes of wild barley where it represses their development. Loss of function vrs1 alleles selected during domestication fail to repress the development of these lateral spikelets, resulting in two additional fully fertile spikelets per rachis node. A comparison of gt1 and vrs1 offers an interesting contrast. Loss of function vrs1 alleles were selected in barley, producing a larger number of organs (spikelets or grains) per spike, while selection for an allele that confers the gain of nodal expression of gt1 in maize caused a reduction in the number of organs (ears) per plant. In maize, our data suggest the reduction in ear number may be compensated for by an increase in grain weight such that yield may not be affected. It would be of interest to know if the production of more grains per spike in barley is compensated for by a reduction in the number of spikes per plant such that yield is not affected although harvestability is improved.
The nature of the causative polymorphism for prol1.1 that governs gt1 expression in the nodal plexus and represses secondary ear formation remains unknown. There are multiple polymorphisms that distinguish the class-M and class-T haplotypes for the causative region, all of which are potential candidates for the functional variant that controls expression in the nodal plexus (Figure S7). Among these polymorphisms are at least four transposable element insertions including Cinful, Pif/Harbinger, and hAT elements. Given the evidence that a Hopscotch transposon is the functional variant at tb1 [29], the transposons in the causative interval of gt1 are good candidates for future functional assays. Transposon inserts have also been identified in alleles of genes involved in millet and tomato domestication or improvement [44,45], suggesting that transposons may be important contributors to regulatory variation in crop plants.
DNA sequence analysis of the prol1.1 locus in diverse maize and teosinte accessions revealed two distinct haplotypes. Both haplotypes were present in maize and teosinte, but the class-M haplotype was common in maize and rare in teosinte. Neutral coalescent simulations revealed that patterns of diversity in the class-M haplotype in maize were unlikely in the absence of selection, and subsequent parameter estimation supported a partial sweep model in which selection acted to increase the frequency of the class-M haplotype during domestication. The estimated age of the class-M haplotype at 13,000 BP predates maize domestication and is consistent with its observed presence in approximately 5% of the teosinte sampled. This observation suggests that selection at prol1.1 acted on standing variation, similar to observations for tb1 [29] and barren stalk1 [46].
It is curious that the class-T haplotype is found at a frequency of nearly 30% in maize, although the multi-eared phenotype that this haplotype confers is rare in maize. Furthermore, none of the maize races (Table S3) that carry the class-T haplotype are known to exhibit multiple ears along a single shank. These observations suggest that these landraces may have other factors that suppress the formation of multiple ears on a single shank. Thus, there may have been two pathways to the switch from several ears to a single ear per node in maize, one governed by prol1.1 and a second controlled by unknown factors that suppress multiple ear formation in plants carrying the class-T haplotype at prol1.1.
The presence of such a second genetic pathway could also explain the incomplete selective sweep at prol1.1. In some maize populations, fixation of low-prolificacy alleles at genes in this proposed second pathway could have reduced or eliminated selection on prol1.1.
Previous analysis of gt1 and surrounding sequence uncovered evidence of selection at the 3′ UTR of the gene [9]. We reanalyzed these sequence data (Text S2) and identified two distinct haplotypes distinguished by a ~40 bp indel. The class-M haplotype at this locus bears the signature of a partial sweep from standing variation similar to that seen at prol1.1 (Text S2). A PCR survey of a large panel of maize landraces reveals that the class-M haplotype at the 3′ UTR has an overall frequency of 78%. Combined with the small size of both sweeps and geographical differences in the abundance of each haplotype (Figure S9), these results suggest that the class-M haplotypes at prol1.1 and gt1 may represent independent selective events [47], perhaps on different regulatory aspects of gt1. Neither prol1.1 nor gt1 was identified in a recent whole-genome analysis of selection during domestication [48], likely due to the short span of the selected region and the presence of the class-T allele in 30% of maize lines. This result highlights the difficulty in identifying small selected regions from genome-wide scans, especially in the case of soft sweeps [49,50].
The shade avoidance response in plants involves an increase in plant height, a decrease in branching, a reduction in the number of flowers, and early flowering [51]. During domestication, human preference for easier harvestability resulted in a form of plant architecture that mimics the shade avoidance response in that crops are less branched and produce fewer reproductive structures. Two maize domestication genes, gt1 and tb1, are members of the developmental network controlling the shade avoidance response [9], suggesting that domestication acted to constitutively fix aspects of the shade avoidance syndrome in maize. As the shade avoidance network becomes better known, it will be of interest to see if additional genes within this network also play a role in domestication.

QTL mapping
Whole-genome QTL mapping for prolificacy in maize was performed using a set of 866 maize-teosinte BC2S3 RILs that were genotyped at 19,838 markers using a "genotype by sequence" (GBS) approach [8,52]. The 19,838 markers were selected from over 50,000 GBS markers as the subset that defines the end-points of all cross-overs in the 866 RILs. For the RILs, the maize inbred W22 was the recurrent parent and the teosinte parent was CIMMYT accession 8759 of Zea mays ssp. parviglumis. QTL mapping was carried out using a modified version of R/qtl [53] that allows the program to take into account the BC2S3 pedigree of the lines [8]. Given that the LSM showed continuous variation, the QTL model was set to "normal" for a normal distribution in R/qtl. The percentage of variance explained by each QTL was estimated by a drop-one ANOVA as implemented in R/qtl [53].

Fine mapping
We used one of the BC2S3 RILs (MR0091) for fine-mapping of prol1.1. MR0091 is heterozygous for a 33.9 Mb region including this QTL and homozygous maize for all other prolificacy QTL. We screened ~4,000 MR0091-derived plants for cross-overs in the QTL interval between markers umc2226 and bnlg1803. Twenty-three individuals with cross-overs in the QTL interval were identified and selfed. Selfed progeny from these 23 individuals that were homozygous for the recombinant chromosome, plus two control lines (homozygous non-recombinant maize and teosinte), were grown in a randomized block design with four blocks of 25 entries each. Prolificacy was scored as the total number of ears observed on the top two lateral branches of each plant. Thus, for maize (W22), which has a single ear per lateral branch, the prolificacy score is 2. LSMs with standard errors for prolificacy for each of the recombinant chromosome progeny lines and controls were determined by ANOVA with line and block effects using the software package JMP.
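The scoring rule and per-line summary described above can be illustrated with a minimal Python sketch. It is not the JMP analysis used in the study: it assumes a balanced design in which every line is scored in every block, in which case the least squares mean of a line under an additive line-plus-block model equals the plain average of its block means. The function names and the example data are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def prolificacy_score(ears_per_branch):
    """Total number of ears on the top two lateral branches of one plant."""
    if len(ears_per_branch) != 2:
        raise ValueError("expected ear counts for exactly two lateral branches")
    return sum(ears_per_branch)


def line_means(records):
    """Average prolificacy per line from (line, block, score) records.

    Assumption (not from the source): the design is balanced, so the mean
    of the block means stands in for the least squares mean of each line.
    """
    by_cell = defaultdict(list)
    for line, block, score in records:
        by_cell[(line, block)].append(score)

    by_line = defaultdict(list)
    for (line, _block), scores in by_cell.items():
        by_line[line].append(mean(scores))

    return {line: mean(block_means) for line, block_means in by_line.items()}
```

For example, a W22-like plant with one ear on each of the top two lateral branches gives `prolificacy_score((1, 1)) == 2`, matching the score stated in the text.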

Expression assays
For all expression assays, total cellular RNA was isolated using Trizol (Invitrogen) from immature ear-forming axillary branches. A 1 µg aliquot of each RNA sample was DNase treated and reverse transcribed using a polyT primer and Superscript III reverse transcriptase (Invitrogen). cDNA integrity was checked by using 0.5 µl of the RT reactions as the template for PCR (Taq Core Kit, Qiagen) with actin primers (5′-ccaaggccaacagagagaaa-3′, 5′-ccaaacggagaatagcatgag-3′). The same actin primers were used to check for genomic DNA contamination; none was detected.
To confirm the intron-exon structure of gt1, PCRs were performed on cDNAs with primers (5′-acaggctacagaggcagagc-3′, 5′-gcgcacttgcatgataatccacac-3′) that amplify most of the predicted transcript (Figure S2). cDNAs derived from both the maize and teosinte alleles were used. PCR products were assayed on standard Tris-borate-EDTA agarose gels. These PCRs consistently revealed three size classes of products for both maize and teosinte alleles. These PCR products were cloned using the TOPO TA Cloning Kit (Invitrogen) and the clones sequenced at the University of Wisconsin Biotechnology Center using Sanger sequencing. Since the relative abundance of the three PCR size classes differed between the maize and teosinte alleles, we also assayed cDNAs derived from two lines with recombinant alleles: one having the teosinte "causative region" and maize coding region (W22-QTL1S-IN0383), the other having the maize "causative region" and teosinte coding region (W22-QTL1S-IN1043) (Figure S1).
To compare gt1 transcript accumulation for the maize and teosinte alleles, we performed an allele-specific expression assay [10] with cDNAs from ear-forming axillary branches of 20 plants that were heterozygous for the maize/teosinte alleles of our mapping population. One-µl aliquots of the 20 RT reactions were used as the template for PCRs with a primer pair in the 3′ UTR of gt1 including one fluorescently labeled primer (5′-FAM-catgatggacctcgcgcccg-3′, 5′-gcgcacttgcatgataatccacac-3′). This primer pair flanks a 2 bp indel that distinguishes the maize and teosinte transcripts. PCR products were assayed on an ABI 3700 fragment analyzer (Applied Biosystems) and the areas under the peaks corresponding to the maize and teosinte transcripts were determined using Gene Marker version 1.70 (Softgenetics, State College, PA). The relative message level associated with the maize vs. teosinte alleles in each of the 20 samples was calculated as the ratio of the areas under the teosinte and maize allele peaks (teosinte/maize). Two technical replicates were performed for each of the 20 biological replicates. The same assay was also performed with the DNA from each plant used for RNA extraction to assess any bias in allele amplification in the PCRs. The DNA analysis showed a slight bias towards the maize allele, with maize/teosinte ratios of 1.05. Thus, the area under the teosinte peak with the cDNAs was multiplied by 1.05 to correct this bias.
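The bias correction described above amounts to a small calculation, sketched here in Python. The 1.05 factor comes from the text; the function names, the example peak areas, and the choice to average the two technical replicates per plant are illustrative assumptions, since the text does not say how replicates were combined.

```python
from statistics import mean

# Maize/teosinte peak-area ratio observed with genomic DNA (value from the text).
AMPLIFICATION_BIAS = 1.05


def corrected_ratio(teosinte_area, maize_area, bias=AMPLIFICATION_BIAS):
    """Teosinte/maize transcript ratio after scaling the teosinte peak area."""
    if maize_area <= 0:
        raise ValueError("maize peak area must be positive")
    return (teosinte_area * bias) / maize_area


def plant_ratio(technical_replicates, bias=AMPLIFICATION_BIAS):
    """Average corrected ratio over the technical replicates of one plant.

    Assumption (not from the source): technical replicates are combined
    by a simple arithmetic mean. Each replicate is a (teosinte_area,
    maize_area) pair.
    """
    return mean(corrected_ratio(t, m, bias) for t, m in technical_replicates)
```

With hypothetical peak areas of 1000 (teosinte) and 2100 (maize), the corrected ratio is about 0.5, meaning the teosinte allele would contribute roughly half as much message as the maize allele in that plant.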
We also compared transcript accumulation for the maize and teosinte alleles using quantitative real-time PCR (qPCR) with cDNA from immature ear-forming axillary branches of 12 homozygous maize and 12 homozygous teosinte plants as described above. For this assay, cDNA was first concentrated using RNAClean XP beads (Beckman Coulter). qPCR was performed on an ABI Prism 7000 sequence detection system (Applied Biosystems) with Power SYBR Green PCR Master Mix (Applied Biosystems). Transcript abundance for gt1 was assayed using a set of primers in the 3′ UTR (5′-gcaatcaaggtcactagtatagtctg-3′; 5′-gcgcacttgcatgataatccacac-3′). Actin primers (see above) were used as the control. Annealing was at 52°C for 30 sec and extension was at 72°C for 45 sec.

In situ hybridization
Young ear-forming axillary buds (44-50 days after planting) were collected from the top two nodes bearing lateral buds from field-grown plants. These ears were fixed in 4% paraformaldehyde in 1× phosphate-buffered saline overnight at 4°C, then dehydrated with an ethanol series and embedded in paraffin wax. Embedded tissue was sectioned to 8 µm with a Leica RM2155 microtome. The full gt1 cDNA coding sequence was used as a probe as described previously [9]. In situ hybridization with digoxigenin-UTP labeled antisense probe was performed as previously described [54]. Strong gt1 expression characteristic of dormant lateral bud leaves or tassel floret carpels requires a relatively short development of the color reaction (3-4 hrs), while weaker gt1 expression in leaves of non-dormant buds and shoot nodes requires a more extended development (15-20 hrs).

Population genetics
We sequenced the gt1 control region plus some flanking sequence (AGP v2: 23,231,760 to 23,235,500) for a set of 15 diverse maize and 9 diverse teosinte lines (Table S4; GenBank Accessions KC759702-KC759727). Initial PCR primers were designed at either end of this interval based on the B73 reference genome. PCR products for each of the 24 diverse lines were sequenced using the Sanger method. A primer walk across the interval was performed for each of the 24 lines. In cases where B73-specific primers failed for one of the diverse lines because of sequence divergence or large insertions, we used consensus sequence data from the diverse lines that were successfully amplified to design primers in conserved regions.
Sequences were aligned with Clustal X [55] and checked manually. Alignment regions with gaps or ambiguous alignment were removed from further analysis. Because the teosinte and maize individuals sequenced were inbred lines, we treated the sequence as haploid data (Table S4). After removing all gapped and tri-allelic sites, 2,871 base pairs remained. We calculated the number of segregating sites (S), nucleotide diversity (π) and Tajima's D for both maize and teosinte using custom Perl scripts. We used MEGA5 [56] to infer a neighbor-joining (NJ) tree for the region (Figure S4A), and STRUCTURE [57] to test for admixture (Text S2). We used structured coalescent simulations to estimate the maximum likelihood values of the selection coefficient (s) and degree of dominance (h) of the class-M haplotype. We simulated a simple domestication model including a demographic bottleneck and a partial selective sweep (Text S2). Coalescent simulations made use of a modified version of the mbs software [58].
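The three summary statistics named above can be computed from a gap-free haploid alignment as in the following Python sketch. It is not the custom script used in the study; it applies the standard textbook definitions (with Tajima's 1989 variance constants), and the function names and toy alignment are illustrative.

```python
from itertools import combinations
from math import sqrt


def segregating_sites(seqs):
    """Number of alignment columns with more than one allele (S)."""
    return sum(1 for column in zip(*seqs) if len(set(column)) > 1)


def mean_pairwise_differences(seqs):
    """Average number of differences over all pairs of sequences.

    Dividing this value by the alignment length gives per-site
    nucleotide diversity.
    """
    pairs = list(combinations(seqs, 2))
    total = sum(sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs)
    return total / len(pairs)


def tajimas_d(seqs):
    """Tajima's D, or None when it is undefined (n < 4 or S == 0)."""
    n = len(seqs)
    s = segregating_sites(seqs)
    if n < 4 or s == 0:
        return None
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n ** 2 + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    variance = e1 * s + e2 * s * (s - 1)
    if variance <= 0:
        return None
    return (mean_pairwise_differences(seqs) - s / a1) / sqrt(variance)


# Toy alignment of four haploid sequences (hypothetical data).
toy = ["AAAA", "AAAT", "AATT", "ATTT"]
S = segregating_sites(toy)
k = mean_pairwise_differences(toy)
pi_per_site = k / len(toy[0])
D = tajimas_d(toy)
```

The sketch expects sequences of equal length with gapped and tri-allelic columns already removed, as described in the text.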
To estimate population frequencies of the class-M and class-T haplotypes in the gt1 control region, we chose an ~250 bp insertion in the teosinte haplotype at AGP v2: 23,232,564 in the B73 reference genome as a marker for the teosinte haplotype. This insertion was identified from the sequences of the 24 diversity lines discussed above. The insertion is present in all of the class-T haplotypes. Primers (5′-gagactggcgactggtcct-3′, 5′-gacgtgcagacagcagacat-3′) were designed in conserved sequences flanking the insertion. PCRs with these primers yield an ~600 bp product for the teosinte haplotype and an ~350 bp product for the maize haplotype. PCR product size differences were scored on 2% agarose gels for a panel of 68 maize landraces, 90 Z. mays ssp. parviglumis and 96 Z. mays ssp. mexicana (Table S5).

Figure S1 Breakpoints of introgressed teosinte chromosomal segments used in the substitution mapping of prol1.1. Positions shown at the top of each column are based on the B73 maize reference genome AGP_v2. Details for the markers listed in the top row can be found in Table S1. Rows represent the 23 Recombinant Chromosome Lines plus the maize and teosinte control lines. The genotypes for markers and intervals of each line are shown: "M" = maize, "T" = teosinte, "-" = undetermined, and "F" = fixed such that the maize and teosinte sequences are identical in the interval. The causative interval is highlighted in orange. (TIF)
Figure S2 Gene model for grassy tillers1 (gt1) inferred from our maize (W22) and teosinte genomic sequence and full-length ESTs obtained from GenBank (EB673843, DV519626). The following inferred features are marked: TATA box (underlined), transcription start (bold arrow), translation start (red text), introns (lower case), nuclear localization signal (yellow highlight), homeobox (bold horizontal line), stop codon (red text). The nuclear localization signal was predicted using the software NLStradamus (http://www.moseslab.csb.utoronto.ca/NLStradamus/). (TIF)
Figure S3 Nucleotide sequences for the large, medium and small RT-PCR products aligned with W22 maize genomic sequence, gt1 coding sequence, and sequence for the homeodomain.