Comparative evolution of vegetative branching in sorghum

Tillering and secondary branching are two plastic traits with high agronomic importance, especially in terms of the ability of plants to adapt to changing environments. We describe a quantitative trait analysis of tillering and secondary branching in two novel BC1F2 populations totaling 246 genotypes derived from backcrossing two Sorghum bicolor x S. halepense F1 plants to a tetraploidized S. bicolor. A two-year, two-environment phenotypic evaluation in Bogart, GA and Salina, KS permitted us to identify major effect and environment specific QTLs. Significant correlation between tillering and secondary branching followed by discovery of overlapping sets of QTLs continue to support the developmental relationship between these two organs and suggest the possibility of pleiotropy. Comparisons with two other populations sharing S. bicolor BTx623 as a common parent but sampling the breadth of the Sorghum genus, increase confidence in QTL detected for these two plastic traits and provide insight into the evolution of morphological diversity in the Eusorghum clade. Correspondence between flowering time and vegetative branching supports other evidence in suggesting a pleiotropic effect of flowering genes. We propose a model to predict biomass weight from plant architecture related traits, quantifying contribution of each trait to biomass and providing guidance for future breeding experiments.


Introduction
Plant architecture is the three-dimensional organization of a plant body. The above-ground architecture includes the pattern of vegetative branching, sizes and shapes of stalks, leaves and floral organs, and plant height. The expression of plant architecture varies during different developmental stages by a series of highly regulated endogenous genetic factors [1][2][3] and exogenous constraints exerted by environments. Genetic factors impart the biodiversity of plant architecture, contributing to adaptation to different niches, are often utilized in the classification of taxa. On the other hand, responsiveness to biotic and abiotic stresses tailors plant architecture to fitness under different environments [4,5].
Important aspects of plant architecture are tillering and vegetative branching, which are considered to be medium to low heritability traits with a high degree of plasticity [6,7] complexity of these traits is due in part to their non-deterministic and genetic pathways controlling axillary meristem initiation and outgrowth that affect the number of tillers and patterns of vegetative branching [2,3,8]. Many of these genes are involved in the production, signal transduction, transport, degradation and interactions of hormones such as auxin, cytokinin and strigolactone. Those hormones act directly and locally to promote or repress axillary meristem activity [7,[9][10][11][12].
Recent studies have also suggested that genes involved in controlling flowering time also influence the activity of axillary meristems and thus influence tillering and vegetative branching. For example, the flowering locus T (Ft) gene that regulates flowering time in many species, has recently been found to trigger storage organ formation through direct interaction with the TCP factors [13]. The rice homolog of Leafy (Lfy) from Arabidopsis, expressed during the development of axillary bud and inflorescence branch primordia, is also required to produce tillers and panicle branches (Rao, 2008) [37]. As a C4 model plant, sorghum has a relatively small genome (~730 Mb) with a high quality reference genome sequence [14] and provides many avenues to study traits related to plant architecture. Using colinearity, results from sorghum may be extrapolated to many other C4 plants with large genomes, such as sugarcane. The flexibility to make crosses between the five main sorghum races (bicolor, guinea, caudatum, durra and kafir), and with wild relatives such as S. propinquum and S. halepense which vary widely in plant architecture, makes sorghum particularly attractive to dissect and compare genetic components of plant architecture. Compared to voluminous studies of plant height and flowering [15][16][17][18][19][20][21][22][23][24], understanding of genetic components for tiller number and vegetative branching in sorghum has been relatively limited [6,[25][26][27][28], possibly due to difficulties in phenotyping and the lower heritability of these traits.
Here, we describe a quantitative trait study of two important components of plant architecture, tillering and vegetative branching, in two half-sib tetraploid BC 1 F 2 populations derived from crossing Sorghum bicolor BTx623 and Sorghum halepense Gypsum 9E. A two-year, twoenvironment phenotypic evaluation in Bogart, GA and Salina, KS permitted us to identify major effect and environment specific QTLs [29,30]. Quantitative trait loci (QTLs) discovered in these two populations are compared to those from two diploid sorghum recombinant inbred line (RIL) populations sharing BTx623 as a common parent but sampling the breadth of the Sorghum genus, one a cross to S. bicolor IS3620C [31], and the other to S. propinquum [32]. QTLs identified in this study and their comparison elucidate morphological evolution in the Sorghum genus, are of practical use for marker-assisted breeding, and provide a foundation for molecular cloning and functional analysis.

Materials and methods
Population development is shown in Fig 1. Genetic maps of two BC1F1 populations derived from crosses of S. bicolor (sorghum) and S. halepense were produced with totals of 722 and 795 single nucleotide polymorphism (SNP) markers. These maps respectively span 37 and 35 linkage groups, with 2-6 for each of the 10 basic sorghum chromosomes due to fragments covering different chromosomal portions or independent segregation from different S. halepense homologs. Details of population development, genotyping methods and methods for QTL analysis were discussed in Kong, Nabukalu [29] and [30].

Phenotyping
We evaluated tillering (TL) and secondary branching per tiller (BRCH) in the BC 1  . Tillering (TL) was measured by counting the number of tillers with mature seed heads before plants were senesced. Secondary branches per tiller (BRCH) was calculated by taking the average number of secondary branches from two representative tillers (S1 and S2 Files).
Phenotyping of vegetative branching in the IS-RIL population was consistent with our system applied to the S. bicolor × S. propinquum RILs described in Kong, Guo [6]. To compare secondary branching across population and environments, we used the number of mature tillers (TL), and calculated the average number of secondary branches per mature tiller (BRCH) in the IS-RIL and PQ-RIL population. The variance component method was used to calculate broad-sense heritability [H = V G /(V G +V GE /e +V residual /er)] where V G is the variance estimate for genotype, V E is the variance estimate for environment, V GE is the genotype by environment interactions, e is the number of environments and r is the number of subsamples.

Genetic analysis
To fully utilize the available data while protecting against false-positive results, genetic analysis employed two approaches. Using genetic maps that were constructed as described [29] from selected well-groomed SNP segregation data for each of the two SBSH-BC 1 F 2 populations, interval mapping was conducted [33]. Permutation tests (with α = 0.10) suggested LOD scores of 2.9 and 3.1 for H4 and H6 populations, respectively. QTLs with LOD scores of 2.5 were listed as marginal QTLs. Additional QTLs were added to the model after considering the larger effect QTL as a fixed effect. For each trait, percentage of variance explained were calculated based on an additive QTL model with QTL positions refined.
In addition, single marker analysis was conducted using each SNP marker that met quality standards described (whether in the genetic map or not), using hierarchical clustering to separate SNP markers on potentially different homologous chromosomes and inferring QTLs only if more than 4 SNPs were found within a cluster cut at height of 0.3 in recombination frequency to mitigate spurious associations [30]. Similarities and differences in the results of these analyses were addressed in results.

Mixed modeling for biomass
We constructed a mixed effect model with Biomass as the response variable; FL, PH, TL, BRCH, mid-stalk diameter (MD), the number of nodes (ND), and population (H4 or H6) as fixed explanatory variables; and the environment (ENV) as a random effect. MD was the stalk diameter at the middle of a plant. The average of the six phenotypes from two blocks and three subsamples was taken for the mixed effect modeling. A natural log transformation was used for Biomass to normalize the data. Mixed effect modeling and model selection used the lme4 and lmerTest packages in R [34,35]. We used a modified method to calculate R-squared for the fixed and model effects [36] for the mixed effect modeling.

Summary statistics and heritability analysis
The average number of mature tillers (ML) of S. halepense G9E was 16, higher than the 2.6 of diploid S. bicolor BTx623 (S1 Table). Tetraploid BTx623 had an average of 0.77 more tillers than diploid BTx623 in 2013 (t = 2.91, p = 0.006) and 1.58 more in 2014 (t = 3.82, p = 0.0005). In the BC 1 F 2 population, average TL for most lines fell within the range of those of their parents, showing less transgressive segregation than plant height (PH) and flowering time (FL) [30]. The average TL was 1.46 more in Salina than Athens (t = 14.07, p<0.001). Average TL in Athens was 2.24 (t = -21.87, p<0.001) fewer in 2013 than 2014; and in Salina 2013 was 2.14 fewer in 2013 than 2014 (t = 19.07, p<0.001). Average TL of the BC 1 F 2 population is 0.30 greater than that of the PQ-RILs (t = 2.52, p = 0.020, S2 Table), and 2.83 greater than that of the IS-RILs (t = 36.19, p<0.001, S3 Table). Broad sense heritability estimates for TL were intermediate for all three populations, at 35%, 36%, and 30% for the PQRIL, ISRIL and SH-BC 1 F 2 populations, respectively (S1-S3 Tables).
The number of secondary branches per primary tiller (BRCH) is sensitive to environmental changes and is also a fail-safe for a plant in case the primary seed head is damaged. Average BRCH of S. halepense is 13, dramatically higher than the 0.286 of S. bicolor BTx623 (S1 Table). There were no statistically significant differences for BRCH between diploid and tetraploid BTx623 in Athens 2014, Salina 2013 and 2014, while there was 2.1 more BRCH in tetraploid BTx623 than in diploid BTx623 (t = 4.16, p = 0.0011) in Athens 2013. The average number of BRCH of most progeny lines fell within the range of the respective parents. For the SH-BC 1 F 2 progeny lines, the average number of BRCH in Athens was 1.29 more than in Salina (t = 25.50, p<0.001). Average BRCH in Athens was 0.45 more in 2013 than 2014 (t = 7.70, p<0.001); and in Salina was 0.60 more in 2013 than 2014 (t = 7.98, p<0.001). The average number of BRCH of the SH-BC 1 F 2 population was 2.28 smaller than that of the PQ-RIL population (t = -14.38, p<0.001, S2 Table), and 0.99 smaller than that of the ISRIL population (t = -0.99, p<0.001, S3 Table). Broad-sense heritability estimates for BRCH are relatively low, 7% and 10% for the PQ-RIL and SH-BC 1 F 2 populations, respectively, but intermediate for the ISRIL population, 40.9% (S1-S3 Tables).

Trait correlations
In all four environments, FL is positively and significantly correlated with PH (Fig 2)

Genetic analysis
Number of tillers. We detected a total of two marginal QTLs, qTL.4A.H4.1 and qTL.4D. H4.1, for TL in the H4-derived population (Table 1) Fig and Fig 3). Fewer signals detected for TL in multiple environments reflect lower heritability and large genotype by environment interactions. In the H4 population, we detected two QTLs for TL, qTL2.H4.1 and qTL.H7.1 in addition to the two QTLs on chromosome 4 detected by interval mapping (S4 Table). As was true for H4 QTLs found by interval mapping, S. halepense alleles increase the number of TL. In the H6 population, we detected a total of 14 QTLs for TL on chromosomes 1 (2), 2, 3(2), 4 (2), 6 (3), 9 (2), 10 (2) with three new QTLs (qTL4.H6.1, qTL4.H6.2 and qTL10.H6.2) not overlapping with any QTLs detected in the interval mapping, all with S. halepense alleles increasing the number of TL (S4 Table). The other 11 QTLs from the single-marker analysis overlap with seven QTLs from interval mapping based on their physical positions. We consider all marginal TL QTLs from interval mapping to be real QTLs, since results from single-marker analysis suggested lower P-values (smaller than 0.0001) for the peak SNPs (S4 Table).

Number of secondary branches per primary branch (BRCH).
We detected a total of five QTLs and two marginal QTLs for BRCH in the H4-derived population, including six from Athens 2014 and one from Salina 2013 ( Table 2). No QTLs were found in Athens 2013 or Salina 2014. The six QTLs detected in Athens 2014 together explain 22.0% of the total phenotypic variance, while the one QTL detected from Salina 2013 explains about 8.28% of the phenotypic variance. It is interesting that six out of seven QTLs show negative allele effects ( Table 2), suggesting that S. halepense alleles contribute to decreased BRCH, which is unexpected and contrary to the difference between parents. Those QTLs with negative additive effect might reflect late release of apical dominance from S. halepense, which is associated with fewer BRCH.
We detected a total of five QTLs and two marginal QTLs for BRCH in the H6-derived population, with one QTL, qBRCH.3E.H6.1, significant in two environments, Salina 2013 and 2014 (  correspondence among different environments. This observation is consistent with low heritability estimates and large genotype by environment interactions (S2 Fig and Fig 4). In the H4-derived population, we detected a total of 11 QTLs for BRCH on chromosomes 1 (2), 3, 4

QTL correspondence across traits in the BC 1 F 2 population
In most environments, TL and BRCH are significantly and positively correlated (Fig 2), therefore some QTL regions are expected to overlap due to their developmental relationship [6].  (Table 3).
Similarly, a total of two and five QTLs for BRCH show possible correspondence to FL in the H4 and H6-derived BC 1 F 2 populations, respectively (
Despite that BRCH is a plastic trait with low heritability, we still found two BRCH QTLs in the H4-derived SBSH-BC 1 F 2 , qBRCH4D.H4.1 and qBRCH5C.H4.1 overlapping with two IS-RIL QTLs, qRBCH4.1 and qBRCH5.1; and three H6-derived SBSH-BC 1 F 2 QTLs overlapping with two IS-RIL QTLs, qBRCH4.2 and qBRCH10.1 (Table 4). Two pairs of QTLs, qBRCH.5C.H4.1 and qBRCH5.1 from IS-RIL and qBRCH.10C.H6.1 and qBRCH10.1 from IS-RIL show opposite allele effects from S. halepense, suggesting that S. bicolor alleles increase BRCH in the SBSH-BC 1 F 2 population but decrease it in the ISRIL. This implies that IS3620c has an allele conferring abundant branching, with the BTx623 allele conferring less branching than IS3620c but more than S. halepense. In addition, a total of five and four H4 and H6derived SBSH-BC 1 F 2 QTLs for BRCH overlap with QTLs for various degrees (primary, secondary or tertiary) of vegetative branching in PQRIL described in Kong, Guo [6]. Most overlapping pairs of QTLs of SBSH-BC 1 F 2 and the PQRIL show the same direction of effect, from S. halepense and S. propinquum, respectively, except one case on chromosome 7 where QTLs within PQRIL shows different directions of effects (Table 4; within the PQRIL population, qSR_7.1 showed negative effect while qVG7.1 and qIM2_7.1 showed positive effect).

A regression model to predict biomass
We performed a regression analysis to predict biomass weight (Biomass, using natural log transformation) with respect to traits related to plant architecture while controlling for population structure and environmental factors. Our final model consists of a total of seven variables, with plant height (PH), mid-stalk diameter (MD), number of mature tillers (TL), number of secondary branches (BRCH), flowering time (FL), and population (H4 or H6) as fixed effects and environmental factors as a random effect (Table 5). Fixed effects (the environmental factor) in this model collectively explain about 71.76% of the total variance using a modified method for estimating R-squared in mixed models [36]. The typical log error in this model is about 0.3148, and can be decomposed into environmental error that is estimated to be normally distributed with a mean of zero and standard deviation of 0.1260; and the inherent residual error that is estimated to be normally distributed with a mean of zero and standard deviation of 0.2885 (Table 5A). The model suggests that PH, TL and MD are the three most important variables in predicting Biomass, followed by FL and BRCH (Table 5B). For

Discussion
The present study offers several new insights into the genetic control of tillering and vegetative branching. First, it adds more information to current knowledge of vegetative branching in sorghum, an under-studied trait, especially providing early insights into QTL polymorphism in S. halepense. Correspondence of QTL regions between three populations sharing S. bicolor BTx623 as a common parent, with the other parents being morphologically and genetically distinct genotypes that represent cultivated (IS3620C), wild diploid (S. propinquum) and wild polyploid (S halepense) sorghums, provides information about common QTLs shared between or among populations and taxon-specific QTLs that contribute to divergence. Finally, constructing a mixed model to predict dry biomass with respect to various traits associated with plant architecture and the environmental factors provides a framework to prioritize each trait in selection for biomass, as well as quantifying environmental influences.

QTL mapping
QTL mapping results for two relatively plastic traits, TL and BRCH, suggest high genotype by environment interactions and population differences. We only found three and one QTLs significant in multiple environments for TL and BRCH with interval mapping, respectively, with 6 and 13 significant in only single environments. Overlapping SNP sets from single marker analysis are much lower for these traits than for highly heritable traits such as plant height and flowering time (S1 and S2 Figs). QTL results are very different in the two populations derived from two different sibling S. halepense x S. bicolor F 1 plants, possibly due to Ma and Dw genes on chromosome 6. We detected fewer TL QTLs in the H4 than the H6-derived population, as was also true of FL and PH QTLs [30]. The number of BRCH QTLs for the two populations does not follow this pattern, but most H4-derived QTLs had negative effects in interval mapping, indicating the S. halepense allele reduced BRCH. Unexpected (Tables 3 and 4). This finding suggests that delaying flowering might reduce tillers and branching, perhaps due to late release of apical dominance. A surprising number of genomic regions were significant for FL and TL or FL and BRCH, perhaps suggesting pleiotropic relationships (Tables 3 and 4, Fig 5). For example, genes regulating flowering such as MADS box proteins also influence determinacy of other meristems [39]. Further, the flowering locus T (FT) gene that regulates flowering time in many species, has recently been found to trigger storage organ formation through direct interaction with the TCP factors [13]. We found a total of six genomic regions harboring QTLs responsible for both FL and TL, and four regions for both FL and BRCH in their respective populations. Previous study [21,27,40] has suggested that regions on chromosome 6 that harbor Ma1 also contain QTLs for tiller number. One explanation might be that Ma1, which appears to be a homolog of the Arabidopsis Ft and Rice Hd3a genes [18], influences organ formation. The Ma1 associated region in this study affected both TL and BRCH, while another QTL region at 47.2Mb on chromosome 6 affecting all three traits, FL, TL and BRCH. This QTL (~47.2Mb) might be related to the Sb06g019010 or (Sobic.006G107400.1 in Sorghum bicolor v3.11) gene encoding the 'number of apical meristem' (NAM) protein [6,41].

Regression model for predicting biomass
Plant architecture related traits can predict Biomass with relatively high accuracy. A mixed model for predicting dry biomass weight (Biomass) retained a total of five traits, plant height (PH), mid-stalk diameter (MD), mature tillers (TL), number of secondary branches (BRCH), and flowering time (FL) as significant predictors of dry biomass. The fixed effects explains 71.76% of the total variance, and a log error of 0.3148. Application of this model might be a cost-efficient method for predicting Biomass for future experiments, quantifying the contribution of individual traits to Biomass and providing guidance for improving genotypes aimed at biomass production.