The infrageneric phylogeny and temporal divergence of Sorghum were explored in the present study. Sequence data of two low-copy nuclear (LCN) genes, phosphoenolpyruvate carboxylase 4 (Pepc4) and granule-bound starch synthase I (GBSSI), from 79 accessions of Sorghum plus Cleistachne sorghoides, together with those from outgroups, were used for maximum likelihood (ML) and Bayesian inference (BI) analyses. Bayesian dating based on three plastid DNA markers (ndhA intron, rpl32-trnL, and rps16 intron) was used to estimate the ages of major diversification events in Sorghum. The monophyly of Sorghum plus Cleistachne sorghoides (with the latter nested within Sorghum) was strongly supported by the Pepc4 data using BI analysis, and the monophyly of Sorghum was strongly supported by GBSSI data using both ML and BI analyses. Sorghum was divided into three clades in the Pepc4, GBSSI, and plastid phylograms: the subg. Sorghum lineage; the subg. Parasorghum and Stiposorghum lineage; and the subg. Chaetosorghum and Heterosorghum lineage. Two LCN homoeologous loci of Cleistachne sorghoides were discovered for the first time in the same accession. Sorghum arundinaceum, S. bicolor, S. x drummondii, S. propinquum, and S. virgatum were closely related to S. x almum in the Pepc4, GBSSI, and plastid phylograms, suggesting that they may be potential genome donors to S. x almum. Multiple LCN and plastid allelic variants were identified in S. halepense of subg. Sorghum. The crown ages of Sorghum plus Cleistachne sorghoides and subg. Sorghum are estimated to be 12.7 million years ago (Mya) and 8.6 Mya, respectively. Molecular results support the recognition of three distinct subgenera in Sorghum: subg. Chaetosorghum with two sections, each with a single species; subg. Parasorghum with 17 species; and subg. Sorghum with nine species. We also provide a new nomenclatural combination, Sorghum sorghoides.
Citation: Liu Q, Liu H, Wen J, Peterson PM (2014) Infrageneric Phylogeny and Temporal Divergence of Sorghum (Andropogoneae, Poaceae) Based on Low-Copy Nuclear and Plastid Sequences. PLoS ONE 9(8): e104933. doi:10.1371/journal.pone.0104933
Editor: Manoj Prasad, National Institute of Plant Genome Research, India
Received: April 23, 2014; Accepted: July 12, 2014; Published: August 14, 2014
Copyright: © 2014 Liu et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: The authors confirm that all data underlying the findings are fully available without restriction. The Pepc4, GBSSI, and combined plastid matrices were submitted to TreeBASE (http://treebase.org, study no. TB2: S15625). The data may be accessed on the TreeBASE website using the identifier S15625.
Funding: This work was supported by the National Natural Science Foundation of China (31270275, 31310103023), the Special Basic Research Foundation of Ministry of Science and Technology of the People’s Republic of China (2013FY112100), the Key Project of Key Laboratory of Plant Resources Conservation and Sustainable Utilization, South China Botanical Garden, CAS (201212ZS), the 42nd Scientific Research Foundation for the Returned Overseas Chinese Scholars, State Education Ministry (2011-1139), and the Laboratories of Analytical Biology of the National Museum of Natural History, Smithsonian Institution. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Competing interests: The authors have declared that no competing interests exist.
Cultivated sorghum [Sorghum bicolor (L.) Moench] ranks fifth in both production and planted area of cereal crops worldwide, behind only wheat, rice, maize, and barley. Sorghum Moench comprises 31 species exhibiting considerable morphological and ecological diversity in global tropical, subtropical, and warm temperate regions. The genus has panicles bearing short and dense racemes of paired spikelets (one sessile, the other pedicelled), whose sessile spikelets resemble the single sessile spikelets of Cleistachne Benth. These two genera were assigned to Sorghinae Clayton & Renvoize, one of the 11 subtribes of the tribe Andropogoneae Dumort. Previous studies of the genus using chloroplast DNA (cpDNA) and nuclear ribosomal DNA (nrDNA) internal transcribed spacer (ITS) sequences indicated that Cleistachne was sister to or part of an unresolved polytomy within Sorghum. The ambiguous relationship between Sorghum and Cleistachne is reflected by the absence of pedicelled spikelets in Cleistachne and the unverified hypothesis for the allotetraploid origin of Cleistachne sorghoides Benth. Within Andropogoneae, Sorghastrum Nash has sometimes been considered either a subgenus of Sorghum, owing to its somatic chromosome number of 40, or a distinct genus whose pedicelled spikelets are reduced to vestigial pedicels. Therefore, the generic limits of Sorghum have long been a controversial issue that needs to be tested using highly informative molecular markers.
Five morphological subgenera are recognized in Sorghum: Sorghum, Parasorghum, Stiposorghum, Chaetosorghum, and Heterosorghum. Subgenus Sorghum contains ten species (including the cultivated sorghum) that are distributed throughout Africa, Asia, Europe, Australia, and the Americas. The seven species of subg. Parasorghum occur in Africa, Asia, and northern Australia, and the ten species of subg. Stiposorghum occur in northern Australia and Asia. Subgenera Chaetosorghum and Heterosorghum are native to northern Australia and the Pacific Islands. Culm nodes are glabrous or slightly pubescent in three subgenera (Sorghum, Chaetosorghum, and Heterosorghum) and bear a ring of hairs in subg. Parasorghum and Stiposorghum. Subgenus Sorghum is characterized by the presence of well-developed pedicelled spikelets, while subg. Chaetosorghum and Heterosorghum are characterized by pedicelled spikelets which are reduced to glumes.
The five morphological subgenera of Sorghum have not been shown to be concordant with molecular phylogenetic hypotheses. The combined ITS1/ndhF/Adh1 sequence data support a clade of Sorghum plus Cleistachne sorghoides that is divided into two lineages, one containing subg. Sorghum, Chaetosorghum and Heterosorghum, as well as Cleistachne sorghoides, and the other containing subg. Parasorghum and Stiposorghum. Uncertainty about relationships in Sorghum has led to its reclassification into three distinct genera: Sarga Ewart, including species of subg. Parasorghum and Stiposorghum; Sorghum, including S. bicolor, S. halepense (L.) Pers., and S. nitidum (Vahl) Pers.; and Vacoparis Spangler, including species of subg. Chaetosorghum and Heterosorghum. Ng’uni et al. argued that this reclassification was unwarranted. Based on plastid and ITS sequence data, they found that Sorghum consisted of two lineages: one lineage containing species of subg. Sorghum, Chaetosorghum and Heterosorghum, and a second lineage containing species of subg. Parasorghum and Stiposorghum. More than 80% of samples were confined to Australia in previous molecular studies, which focused on resolving interspecific relationships in subg. Sorghum. Therefore, a molecular analysis based on a greater sampling of taxa throughout their geographic ranges is essential to explore the infrageneric relationships in Sorghum.
The species of Sorghum are an excellent group for understanding the evolutionary patterns in crop species and wild relatives, since the genus contains a large tertiary gene pool (GP-3, a genetic entity developed by Harlan and De Wet to deal with varying levels of interfertility among related taxa) and a relatively small secondary gene pool (GP-2). Members of the primary gene pool (GP-1) from the same species (such as the cereal species) can interbreed freely. Members of GP-2 are closely related to members of GP-1; although there are some hybridization barriers between members of GP-1 and GP-2, they can occasionally produce fertile first-generation (F1) hybrids. Members of GP-3 are more distantly related to members of GP-1, and gene transfers between members of GP-1 and GP-3 are impossible without artificial disturbance measures. Members of subg. Sorghum are found in GP-2, except for S. bicolor, which belongs to GP-1, while species of the other four subgenera are found in GP-3. Subgenus Sorghum is traditionally treated as two complexes: the Arundinacea complex, consisting of annual non-rhizomatous species such as S. arundinaceum (Desv.) Stapf, S. bicolor, S. x drummondii (Nees ex Steud.) Millsp. & Chase, and S. virgatum (Hack.) Stapf; and the Halepensia complex, consisting of perennial rhizomatous species such as S. almum Parodi, S. halepense (L.) Pers., S. miliaceum (Roxb.) Snowden, and S. propinquum (Kunth) Hitchc. Members of GP-3 contain wild genetic resources of important agronomic traits, e.g., drought tolerance and disease resistance. Nevertheless, studies of interspecific relationships among GP-3 species have lagged behind owing to small sampling, so a detailed understanding of relationships among GP-3 species would facilitate the exploitation of these valuable agronomic traits.
To date, 21.8% of grass species have been documented to have arisen as a result of hybridization events. Plastid genes are commonly employed in phylogenetic reconstructions because they exist in high copy numbers in plant genomes, sequencing them often does not require cloning steps, and they are uniparentally (in most cases, maternally) inherited in angiosperms. Low-copy nuclear (LCN) genes carry bi-parentally inherited genetic information and often provide critical phylogenetic information for tracking the evolution of plant lineages involving hybridization and allopolyploidization. For these reasons, LCN gene data complementing plastid gene data are more effective in identifying allopolyploids and their genome donors. Several studies using this method have successfully resolved the backbone phylogenetic patterns of economically important crop genera, e.g., Eleusine Gaertn., Gossypium L., and Hordeum L.
The middle Miocene-Pliocene interval of 1.8–17.6 million years ago (Mya) was a crucial period in the diversification of Poaceae. The C4 clades within the subfamily Panicoideae originated in the middle Miocene (ca. 14.0 Mya) in global tropical and subtropical regions. Subsequently, the ecological expansion of C4 Panicoideae became associated with climate aridification and cooling through the late Miocene-Pliocene boundary (3.0–8.0 Mya). Sorghum, documented as an ecologically dominant member during the C4 grassland expansion, is characterized by its modern geographic distribution spanning five continents. Therefore, its ecological abundance in the late Tertiary, coupled with its wide geographic distribution in modern times, implies that Sorghum may have established conservative ecological traits during the early diversification process, i.e., Sorghum is a niche-conservative C4 genus. However, the paucity of accurate age estimations of major diversification events in Sorghum has impeded our understanding of whether temporal relationships existed between the diversification of Sorghum and palaeoclimatic fluctuations during the middle Miocene-Pliocene interval. Our study will shed some light on the impact of palaeoclimatic fluctuations on the diversification of niche-conservative C4 grasses.
Here we explore the infrageneric phylogeny and temporal divergence of Sorghum by employing sequence data from two LCN and three plastid genes. The study aims to: (1) reconstruct infrageneric phylogenetic relationships in Sorghum; (2) investigate interspecific phylogenetic relationships among GP-3 species; and (3) estimate divergence times of major lineages in order to understand the impact of palaeoclimatic fluctuations on the diversification of Sorghum.
Materials and Methods
Plant Sampling and Sequencing
We sampled 79 accessions of 28 species in Sorghum, covering the morphological diversity and the geographic ranges of five subgenera (Table 1), plus the monotypic genus Cleistachne, together with seven species in six allied genera as outgroups. Seeds were obtained from the International Livestock Research Institute (ILRI), the International Crops Research Institute for the Semi-Arid Tropics (IS), and the United States Department of Agriculture (USDA). Leaf material was obtained from seedlings and dry herbarium specimens deposited at CANB, IBSC, K, and US (Table S1).
Two LCN genes, phosphoenolpyruvate carboxylase 4 (Pepc4) and granule-bound starch synthase I (GBSSI), were chosen for this study. The housekeeping Pepc4 gene encodes the PEPC enzyme responsible for the preliminary carbon assimilation in C4 photosynthesis, whereas the GBSSI gene encodes the GBSSI enzyme for amylose synthesis in plants and prokaryotes. These two LCN genes have been used for accurate phylogenetic assessments in Poaceae. They are predominantly low-copy in Poaceae, making it possible to establish orthology and track homoeologues arising by allopolyploidy. Based on genome-wide research on cereal crops, these two LCN genes appear to be on different chromosomes, so each of the LCN markers can provide an independent phylogenetic estimate.
Genomic DNA was extracted with the DNeasy Plant Mini Kit (Qiagen, Valencia, CA, USA) in accordance with the manufacturer’s instructions. Two LCN markers were amplified using primers and protocols listed in Table 2. PCR products were purified by the PEG method. Cycle sequencing reactions were conducted in 10 µL volumes containing 0.25 µL of BigDye v.3.1, 0.5 µL of primer, 1.75 µL of sequencing buffer (5×) and 1.0 µL of purified PCR product. For accessions that failed direct sequencing, the purified PCR products were cloned into pCR4-TOPO vectors and transformed into Escherichia coli TOP10 competent cells following the protocol of the TOPO TA Cloning Kit (Invitrogen, Carlsbad, CA, USA). Transformed cells were plated and grown for 16 h on LB agar with X-Gal (Promega, Madison, WI, USA) and ampicillin (Sigma, St. Louis, MO, USA). We initially picked a small number of colonies and added more where needed; in total, eight to 24 colonies were selected from each individual via blue-white screening in order to assess allelic sequences and PCR errors. Inserts were sequenced with primers T7 and T3 on the ABI PRISM 3730XL DNA Analyzer (Applied Biosystems, Foster City, CA, USA).
Cloned sequences of nuclear loci were initially aligned with MUSCLE v.3.8.31 and adjusted in Se-Al v.2.0a11 (http://tree.bio.ed.ac.uk/software/seal/). Subsequently, the corrected clones were assembled into individual-specific alignments that were analyzed separately using a maximum parsimony optimality criterion with the default parsimony settings in PAUP* v.4.0b10. The resulting trees were used to determine unique alleles present in each individual. Alleles were recognized when one or more clones from a given individual were united by one or more characters. After identifying all sequence clones for a given allele, the sequences were combined in a single project in Sequencher v.5.2.3 (Gene Codes Corp., Ann Arbor, Michigan, USA) and manually edited using a “majority-rule” criterion to form a final consensus allele sequence. Instances of PCR errors were easily identified because they never occurred in more than one sequence. Newly obtained consensus sequences of 62 Pepc4 alleles and 76 GBSSI alleles were submitted to GenBank (http://ncbi.nlm.nih.gov/genbank; Table S1).
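The majority-rule editing step described above can be illustrated with a minimal Python sketch. It assumes that the clones assigned to one allele are already aligned to equal length; the function names and the toy sequences are hypothetical, and the sketch is not the Sequencher workflow used in the study. It also flags variants carried by a single clone, which is how putative PCR errors would show up under the criterion stated above.

```python
from collections import Counter


def majority_consensus(clones):
    """Return the most frequent character in each alignment column.

    Ties are broken in favour of the character seen first in the column.
    """
    if not clones:
        raise ValueError("need at least one clone sequence")
    length = len(clones[0])
    if any(len(seq) != length for seq in clones):
        raise ValueError("clones must be aligned to the same length")
    return "".join(Counter(column).most_common(1)[0][0] for column in zip(*clones))


def singleton_differences(clones, consensus):
    """List (clone_index, position, base) for variants found in exactly one clone."""
    hits = []
    for pos, column in enumerate(zip(*clones)):
        counts = Counter(column)
        for idx, base in enumerate(column):
            if base != consensus[pos] and counts[base] == 1:
                hits.append((idx, pos, base))
    return hits


# Hypothetical toy data: three clones of one allele, one with a lone substitution.
toy_clones = ["ACGTAC", "ACGTAC", "ACGAAC"]
toy_consensus = majority_consensus(toy_clones)
toy_singletons = singleton_differences(toy_clones, toy_consensus)
```

Positions reported by `singleton_differences` are zero-based; a variant shared by two or more clones is not reported, in line with the observation that PCR errors never occurred in more than one sequence.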
Three plastid markers (ndhA intron, rpl32-trnL, and rps16 intron) were amplified and sequenced to estimate lineage ages in Sorghum. Primer sequences and amplification protocols for the plastid markers are listed in Table 2. PCR products were purified by the PEG method. Cycle sequencing reactions were conducted in 10 µL volumes and were run on an ABI PRISM 3730XL DNA Analyzer. Both strands were assembled in Sequencher v.5.2.3. Sequence alignment was initially performed using MUSCLE v.3.8.31 in the multiple alignment routine, followed by manual adjustment in Se-Al v.2.0a11. The Pepc4, GBSSI, and combined plastid matrices were submitted to TreeBASE (http://purl.org/phylo/treebase/phylows/study/TB2:S15625).
Each data set was analyzed with maximum likelihood (ML) using GARLI v.0.96 and Bayesian inference (BI) using MrBayes v.3.2.1. The substitution model for different data partitions was determined by the Akaike Information Criterion (AIC) implemented in Modeltest v.3.7, and the best-fit model for each data set is listed in Table 3. The ML topology was estimated using the best-fit model, and ML bootstrap support (MLBS) of internal nodes was determined by 1000 bootstrap replicates in GARLI v.0.96 with runs set for an unlimited number of generations and automatic termination following 10,000 generations without a significant topology change (lnL increase of 0.01). The output file containing the best trees for bootstrap reweighted data was then read into PAUP* v.4.0b10, where the majority-rule consensus tree was constructed to calculate bootstrap support values.
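A bootstrap support value of this kind is the percentage of replicate trees that contain a given bipartition. The following Python sketch shows the calculation under the assumption that each replicate tree has already been reduced to a set of bipartitions, each written as the set of taxa on one side of the branch. The taxon names A to D and the function names are hypothetical, and the sketch is not the PAUP* procedure itself.

```python
from collections import Counter


def split_support(replicate_splits):
    """Map each bipartition to the percentage of replicates that contain it."""
    n = len(replicate_splits)
    if n == 0:
        raise ValueError("need at least one replicate")
    counts = Counter()
    for splits in replicate_splits:
        counts.update(set(splits))  # count each split once per replicate
    return {split: 100.0 * c / n for split, c in counts.items()}


def majority_rule(support, threshold=50.0):
    """Keep only the splits found in more than `threshold` percent of replicates."""
    return {split: pct for split, pct in support.items() if pct > threshold}


# Hypothetical toy data: four bootstrap replicates over taxa A, B, C, D.
AB = frozenset({"A", "B"})
CD = frozenset({"C", "D"})
AC = frozenset({"A", "C"})
toy_replicates = [{AB, CD}, {AB}, {AC}, {AB, CD}]

toy_support = split_support(toy_replicates)
toy_consensus_splits = majority_rule(toy_support)
```

For unrooted trees each bipartition would need a canonical form (for example, always the side that excludes a fixed reference taxon) before counting; that step is omitted here for brevity.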
Bayesian inference (BI) analyses were conducted in MrBayes v.3.2.1 using the best-fit model for Pepc4 and GBSSI loci (Table 3). Each analysis consisted of two independent runs for 40 million generations, with trees sampled every 1000 generations. The first 25% of samples (the first 10 million generations of each run) were conservatively discarded as burn-in, and the pooled post-burn-in trees (c. 60,000) were used to construct the majority-rule (50%) consensus trees and to calculate the Bayesian posterior probabilities (PP) for internal nodes using the “sumt” command. The AWTY (Are We There Yet?) approach was used to explore the convergence of paired MCMC runs in BI analysis. The stationarity of two runs was inspected by cumulative plots displaying the posterior probabilities of splits at selected increments over an MCMC run, and the convergence was visualized by comparative plots displaying posterior probabilities of all splits for paired MCMC runs.
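The figure of roughly 60,000 pooled trees follows directly from the run settings given above. A small Python sketch of the arithmetic (the function name is hypothetical; the exact count reported by the software can differ by a tree or so per run depending on whether the starting state is counted as a sample):

```python
def retained_trees(generations, sample_every, burnin_fraction, runs):
    """Number of post-burn-in trees pooled across independent runs."""
    sampled = generations // sample_every          # trees sampled per run
    discarded = int(sampled * burnin_fraction)     # burn-in trees per run
    return runs * (sampled - discarded)


# Settings stated in the text: 2 runs, 40 million generations,
# sampling every 1000 generations, first 25% discarded.
pooled = retained_trees(40_000_000, 1000, 0.25, runs=2)
burnin_generations = int(40_000_000 * 0.25)
```

With these settings each run contributes 30,000 trees after 10 million burn-in generations, giving 60,000 pooled trees.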
The nuclear data were used to help determine bi-parental contributions, and multiple alleles were present for most polyploid taxa. Thus, the nuclear data cannot be combined with the plastid dataset, which provided the maternal phylogenetic framework. We rooted the Pepc4 tree using species of Apluda, Bothriochloa, Chrysopogon, Dichanthium and Sorghastrum as outgroups and rooted the GBSSI tree using species of Bothriochloa, Dichanthium, Microstegium and Sorghastrum as outgroups, because clean GBSSI sequences of Apluda and Chrysopogon could not be isolated in the laboratory. The appropriate choice of outgroups was confirmed by phylogenetic proximity (the monophyletic ingroup being supported), genetic proximity (short branch length being observed) and base compositional similarity (ingroup-like GC%; Table 3).
For molecular dating analyses using the plastid markers, a strict molecular clock model was rejected at a significance level of 0.05 (IL = 686.7024, d.f. = 60, P = 0.025) based on a likelihood ratio test. A Bayesian relaxed clock model was implemented in BEAST v.1.7.4 to estimate lineage ages in Sorghum. Three plastid markers were partitioned using BEAUti v.1.7.4 (within BEAST) with the best-fit model determined by Modeltest v.3.7 (Table 3).
The Andropogoneae crown age was estimated at 17.1±4.1 Mya, and other estimates fall within this confidence interval, although the most reliable fossils of subfamily Panicoideae were the petrified vegetative parts from the Ricardo Formation in California, now dated to be approximately 12.5 Mya. Because the lineages may have occurred earlier than the fossil record, the Sorghum stem age was set as a normal prior distribution (mean 17.1, SD 4.1). A Yule prior (Speciation: Yule Process) was employed. An uncorrelated lognormal distributed relaxed clock model was used, which permitted evolutionary rates to vary along branches according to a lognormal distribution. Following optimal operator adjustment, as suggested by output diagnostics from preliminary BEAST runs, two independent MCMC runs were performed with 40 million generations, each run sampling every 1000 generations with the first 25% of the samples discarded as burn-in. All parameters had a potential scale reduction factor that was close to one, indicating that the posterior distribution had been adequately sampled. The convergence between two runs was checked using the “cumulative” and “compare” functions implemented in AWTY. A 50% majority-rule consensus from the retained posterior trees (c. 60,000) of the two runs was obtained using TreeAnnotator v.1.7.4 (within BEAST) with a PP limit of 0.5 and mean lineage heights.
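To see what the normal calibration prior (mean 17.1, SD 4.1) implies in practice, its central 95% interval can be computed with the Python standard library. This is only an illustration of the prior stated above, not of the BEAST analysis, and the helper name is hypothetical.

```python
from statistics import NormalDist

# Calibration prior stated in the text: normal with mean 17.1 Mya and SD 4.1.
stem_age_prior = NormalDist(mu=17.1, sigma=4.1)


def central_interval(dist, mass=0.95):
    """Return the (lower, upper) bounds enclosing the central `mass` of `dist`."""
    tail = (1.0 - mass) / 2.0
    return dist.inv_cdf(tail), dist.inv_cdf(1.0 - tail)


lower, upper = central_interval(stem_age_prior)
print(f"95% prior interval: {lower:.1f}-{upper:.1f} Mya")
```

The interval spans roughly 9.1 to 25.1 Mya, so the prior is broad enough to accommodate ages both younger and older than the approximately 12.5 Mya fossil date.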
Phylogenetic analyses of Pepc4 sequences
The aligned Pepc4 matrix comprised 1225 characters, including partial exons 8 and 9, complete intron 9, at lengths of 841 bp, 190 bp, and 194 bp, respectively (Table 3). The Pepc4 data provided a relatively high proportion of parsimony-informative characters (249 bp; 20.3%). The log likelihood scores of 56 substitution models ranged from 5883.8525 to 6165.2119, and Modeltest indicated that the best-fit model under AIC was GTR+I+G with base frequencies (πA = 0.19, πC = 0.32, πG = 0.31, and πT = 0.18), and substitution rates (rAC = 1.7, rAG = 2.6, rAT = 2.8, rCG = 2.3, rCT = 3.6, and rGT = 1). Within the Bayesian phylogenetic inference, two chains converged at similar topologies. The standard deviation of split frequencies reached values lower than 0.01 during analysis, and the stationarity was reached after 2.27 million generations (Figure S1). The ML and the BI analyses indicated an identical phylogenetic pattern for Sorghum plus Cleistachne sorghoides.
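The matrix composition reported above can be cross-checked with a few lines of Python. The region lengths and the count of parsimony-informative characters are taken from the text; the dictionary keys and the function name are hypothetical labels.

```python
def summarize_matrix(region_lengths, informative):
    """Return (total aligned length, percent parsimony-informative characters)."""
    total = sum(region_lengths.values())
    return total, round(100.0 * informative / total, 1)


# Region lengths (bp) as listed for the aligned Pepc4 matrix.
pepc4_regions = {
    "exon 8 (partial)": 841,
    "exon 9 (partial)": 190,
    "intron 9 (complete)": 194,
}

pepc4_total, pepc4_pi_percent = summarize_matrix(pepc4_regions, informative=249)

# Base frequencies reported for the GTR+I+G model should sum to one.
pepc4_base_freqs = {"A": 0.19, "C": 0.32, "G": 0.31, "T": 0.18}
freq_sum = sum(pepc4_base_freqs.values())
```

The three regions add up to the stated 1225 characters, and 249 informative characters correspond to 20.3% of the matrix.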
The monophyly of Sorghum plus Cleistachne sorghoides (with the latter nested within Sorghum) received strong support from the BI analysis (PP = 0.99). Three clades (designated as clades P-I, P-II, and P-III) were observed in the Pepc4 phylogram with strong support (Figure 1). The Pepc4 sequences from one accession of Cleistachne sorghoides fell into two divergent lineages [clade P-I and an independent branch with strong support (MLBS = 100%, PP = 1.00)], with clade P-I having the A-type sequence and the independent branch having the B-type sequences (putative homoeologues, a potential result of allotetraploidy, where each sequence type represents a different parental lineage). Clade P-I contained species of subg. Sorghum, S. ecarinatum Lazarides, and the A-type sequence of Cleistachne sorghoides with strong support (MLBS = 100%, PP = 1.00). Clade P-II comprised subg. Parasorghum and Stiposorghum with moderate to strong support (MLBS = 88%, PP = 1.00). Clade P-III contained S. laxiflorum with strong support (MLBS = 95%, PP = 0.99). Clade P-I was sister to clade P-III (PP = 0.94), while clade P-II was sister to the B-type sequences of C. sorghoides (PP = 0.58). Finally, the group of clade P-I plus clade P-III was sister to the group of clade P-II plus the B-type sequences of C. sorghoides in the Pepc4 phylogram (PP = 0.99) (Figure 1).
Numbers above branches are maximum likelihood bootstrap/Bayesian posterior probability (MLBS/PP). Taxon labels are in the format: Sorghum brachypodum-2-Cowie8981-×2 where Sorghum brachypodum indicates that the sequence belongs to the species Sorghum brachypodum; -2- = the second sequence listed in Table S1 for the species; Cowie8981 = specimen voucher information; -×2 indicates we recovered 2 clones for the sequence; and without any mark after specimen voucher information indicates the sequence is derived from PCR-direct sequencing. Coloured taxon labels and circles correspond to the listed subgenera and geographic ranges at the top left corner of the figure, respectively.
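The taxon-label convention described in this caption is regular enough to parse programmatically. The Python sketch below is one way to do so; the regular expression and function name are hypothetical, and it assumes that voucher codes contain no hyphens and that the clone count is marked with the multiplication sign used in the caption.

```python
import re

# species [-sequence index] -voucher [-×clone count]
LABEL_RE = re.compile(
    r"^(?P<species>.+?)"          # species name, e.g. "Sorghum brachypodum"
    r"(?:-(?P<index>\d+))?"       # optional sequence number from Table S1
    r"-(?P<voucher>[^-]+?)"       # specimen voucher, assumed hyphen-free
    r"(?:-×(?P<clones>\d+))?$"    # optional number of clones recovered
)


def parse_label(label):
    """Split a taxon label into species, sequence index, voucher and clone count."""
    match = LABEL_RE.match(label.strip())
    if match is None:
        raise ValueError(f"unrecognized taxon label: {label!r}")
    return {
        "species": match["species"],
        "index": int(match["index"]) if match["index"] else None,
        "voucher": match["voucher"],
        # None means no clone mark, i.e. PCR-direct sequencing per the caption.
        "clones": int(match["clones"]) if match["clones"] else None,
    }


example = parse_label("Sorghum brachypodum-2-Cowie8981-×2")
```

A label without a clone mark yields `clones` set to `None`, which under the caption's convention indicates a sequence obtained by PCR-direct sequencing.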
Phylogenetic analyses of GBSSI sequences
The aligned GBSSI matrix comprised 1501 characters, including partial exons 8 and 13, complete exons 9, 10, 11, and 12, and introns 8, 9, 10, 11, and 12, at lengths of 82 bp, 33 bp, 185 bp, 204 bp, 106 bp, 138 bp, 158 bp, 152 bp, 145 bp, 130 bp, and 168 bp, respectively (Table 3). The log likelihood scores of 56 substitution models ranged from 11947.3877 to 12361.0693, and Modeltest indicated that the best-fit model under AIC was TIM+G with base frequencies (πA = 0.23, πC = 0.26, πG = 0.28, and πT = 0.23) and substitution rates (rAC = 1.0, rAG = 1.5, rAT = 1.1, rCG = 1.1, rCT = 1.9, and rGT = 1). Within the Bayesian phylogenetic inference, two chains converged at similar topologies. The standard deviation of split frequencies reached values lower than 0.01 during analysis, and stationarity was reached after 1.09 million generations (Figure S2). The ML and the BI analyses generated an identical phylogenetic pattern for Sorghum.
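The reported figures can be checked for internal consistency in Python. The sketch below sums the eleven region lengths and tests whether the substitution rates satisfy the constraints of the TIM model, in which the transversion rates are paired (rAC = rGT and rAT = rCG) while the two transition rates are free. The variable and function names are hypothetical.

```python
# Region lengths (bp) in the order listed for the aligned GBSSI matrix.
gbssi_lengths = [82, 33, 185, 204, 106, 138, 158, 152, 145, 130, 168]
gbssi_total = sum(gbssi_lengths)

# Substitution rates and base frequencies reported for the TIM+G model.
gbssi_rates = {"AC": 1.0, "AG": 1.5, "AT": 1.1, "CG": 1.1, "CT": 1.9, "GT": 1.0}
gbssi_freqs = {"A": 0.23, "C": 0.26, "G": 0.28, "T": 0.23}


def fits_tim(rates, tol=1e-9):
    """True if the rates obey the TIM constraints rAC = rGT and rAT = rCG."""
    return (abs(rates["AC"] - rates["GT"]) < tol
            and abs(rates["AT"] - rates["CG"]) < tol)


rates_fit_tim = fits_tim(gbssi_rates)
freqs_sum_to_one = abs(sum(gbssi_freqs.values()) - 1.0) < 1e-6
```

The lengths add up to the stated 1501 characters, and the rates are consistent with the TIM constraints.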
The monophyly of Sorghum received strong support (MLBS = 100%, PP = 1.00) (Figure 2). Three clades (designated as clades G-I, G-II, and G-III) were recognized in the GBSSI phylogram with strong support. Clade G-I contained subg. Sorghum species, S. leiocladum (Hack.) C.E. Hubb., and S. versicolor Andersson with strong support (MLBS = 100%, PP = 1.00). Clade G-II comprised species of subg. Parasorghum and Stiposorghum with strong support (MLBS = 100%, PP = 1.00). Clade G-III consisted of S. laxiflorum and S. macrospermum with strong support (MLBS = 100%, PP = 1.00). Clade G-I was shown to be sister to clade G-II with weak support (MLBS = 61%, PP = 0.71), and this group, in turn, showed a strong association with clade G-III (MLBS = 100%, PP = 1.00) in the GBSSI phylogram (Figure 2).
Numbers above branches are maximum likelihood bootstrap/Bayesian posterior probability (MLBS/PP). Taxon labels are in the format: Sorghum matarankense-2-Perry2691-×3 where Sorghum matarankense indicates that the sequence belongs to the species Sorghum matarankense; -2- = the second sequence listed in Table S1 for the species; Perry2691 = specimen voucher information; ×3 indicates we recovered 3 clones for the sequence; and without any mark after specimen voucher information indicates the sequence is derived from PCR-direct sequencing. Coloured taxon labels and circles correspond to the listed subgenera and geographic ranges at the top left corner of the figure, respectively.
Two (A- and B-type) homoeologous loci of GBSSI sequences were identified for two accessions of Cleistachne sorghoides, providing strong evidence for the presence of two divergent genomes. The A-type GBSSI sequences of Cleistachne sorghoides were characterized by three features: a large number of variations in introns 8, 9, 11, and 12 (e.g., the strong support for A-type homoeologues of C. sorghoides and Sorghastrum nutans in Figure 2); a distant relationship between the A-type and B-type homoeologues of C. sorghoides (Figure 2); and 13 insertions (3–17 bp in length) distributed in introns 8, 9, 11, and 12, implying the likelihood of sequence divergence after the speciation event of C. sorghoides.
The combined plastid matrix of 62 accessions comprised 2858 characters, of which 113 were parsimony-informative (4.0%). The “cumulative” and “compare” results implemented in AWTY showed that two runs had reached stationarity after 2.57 million generations (Figure S3). The BEAST analysis generated a well-supported tree (MLBS = 90%, PP = 0.99) for Sorghum plus Cleistachne sorghoides (Figure 3), which was identical to the topologies from ML and BI analyses. Three clades were recognized for Sorghum plus Cleistachne sorghoides. Clade II included Cleistachne sorghoides and subg. Parasorghum and Stiposorghum (lineage number 2), and clade I (i.e., subg. Sorghum) (lineage number 3) was sister to clade III (i.e., subg. Chaetosorghum and Heterosorghum). Here we discuss divergence times for the lineages of interest as shown in Table 4.
Numbers above the branches are maximum likelihood bootstrap/Bayesian posterior probability (MLBS/PP). Taxon labels are in the format: Sorghum almum-Liu236 where Sorghum almum indicates that the sequence belongs to the species Sorghum almum; -Liu236 = specimen voucher information. Coloured taxon labels and circles correspond to the listed subgenera and geographic ranges at the top left corner of the figure, respectively. Numbers 1–9 indicate the lineages of interest as shown in Table 4.
The uncorrelated-rates relaxed molecular clock suggests that the diversification of Sorghum plus Cleistachne sorghoides lineage occurred in the middle Miocene (12.7 Mya with 95% HPD of 5.5–16.7 Mya; lineage number 1 in Figure 3), which is the stem age for clade II (lineage number 2) and for clades I and III (lineage number 3). The crown age of clade II excluding S. grande was determined to be 10.5 (4.1–13.8) Mya in the late Miocene (lineage number 4), which is also the divergence time of clade II excluding S. grande and Cleistachne sorghoides (lineage number 5). The crown age of clade I was 10.5 (4.1–14.1) Mya in the late Miocene (lineage number 6), which is also the stem divergence time of clade III (lineage number 7) in Figure 3. Two lineages containing S. bicolor were estimated at 3.9 (0.3–4.3) Mya in the early Pliocene (the Africa-America-Asia-Europe lineage; lineage number 8) and 2.4 (0.0–3.4) Mya in the early Pliocene (the Africa-Asia lineage; lineage number 9), respectively (Table 4).
Origin of Cleistachne sorghoides
Plastid, Pepc4 and GBSSI data support the hypothesis for the allotetraploid origin of Cleistachne sorghoides. Based on the plastid data, Cleistachne sorghoides shared a common ancestor with clade II excluding S. grande (lineage number 4 in Figure 3), which may represent a source of the maternal parent for C. sorghoides. The plastid sequence similarity between C. sorghoides and clade II excluding S. grande also indicated that C. sorghoides became separated from the common ancestor at a relatively ancient time. The Pepc4 data provide evidence for this ancient allopolyploid origin because the conservative Pepc4 gene evolved more slowly than non-housekeeping genes. Two Pepc4 homoeologous loci of C. sorghoides were isolated from the same accession, and this indicates the presence of two divergent genomes in C. sorghoides. The maternal lineage identified by the plastid tree was confirmed by the weak relationship between clade P-II and B-type homoeologues of C. sorghoides in the Pepc4 phylogeny (Figure 1). The GBSSI tree was found to be complementary to the nrDNA ITS tree, in which C. sorghoides was deeply nested within the subg. Parasorghum and Stiposorghum lineage. The authors inferred that the ITS sequences of C. sorghoides might have undergone complete homogenization towards the maternal parent, i.e., the subg. Parasorghum and Stiposorghum lineage. The B-type homoeologues of Cleistachne sorghoides showed no close relationship with any sampled species in the GBSSI tree (Figure 2), providing indirect evidence for the full divergence of B-type GBSSI homoeologues of C. sorghoides away from the maternal parent in Sorghum (clade II) in the GBSSI tree.
The paternal parent of Cleistachne sorghoides remains unresolved due to the incongruence between the two LCN trees. In the Pepc4 tree, A-type homoeologue of C. sorghoides shared a common ancestor with clade P-I native to the Old World, while A-type GBSSI homoeologues of C. sorghoides showed a strong relationship with Sorghastrum nutans in the GBSSI tree. Considering its geographic range in North America, Sorghastrum nutans seems a much less likely candidate as the paternal parent for C. sorghoides because geographically there is no opportunity for sexual contact with its potential maternal lineage.
To explain the paternal genome of Cleistachne sorghoides, it seems likely that C. sorghoides acquired the A-type Pepc4 sequences via hybridization with the ancestor of subg. Sorghum, and subsequently the A-type GBSSI sequences of C. sorghoides experienced recombination (gene exchange) with species of the African-American disjunct Sorghastrum. A prerequisite of this hypothesis is that East Africa and India would have been the geographic location of the recombination episode, perhaps in the fallow lands of Sudan, Uganda, Kenya, Congo, and India, where the native distribution of C. sorghoides is found. Therefore, the recombination event of C. sorghoides placed its GBSSI homoeologues near the outgroup location in the GBSSI phylogram. The LCN data indicate that C. sorghoides may have experienced a complex speciation process. Based on support from Pepc4, combined plastid, and previous restriction site data, we chose to transfer Cleistachne sorghoides into Sorghum (Table 5).
Infrageneric phylogenetic relationships in Sorghum
The monophyly of Sorghum plus Cleistachne sorghoides is supported by Pepc4 and plastid data, as well as the combined ITS1/ndhF/Adh1 data, where Sorghum plus Cleistachne sorghoides are resolved into a distinct clade with 100% support. Nevertheless, the result contradicts the monophyly of Sorghum supported by GBSSI data. The absence of a definitive boundary for members of the subtribe Sorghinae has led others to suggest that the subtribe might have experienced rapid radiation. The gene recombination event was inferred to explain the GBSSI sequence divergence of C. sorghoides from Sorghum; thus the unresolved phylogenetic position of the B-type GBSSI homoeologues of C. sorghoides in the GBSSI tree may indicate a complex phylogenetic history of the Sorghinae.
Three infrageneric lineages were supported by the LCN and the plastid data: the subg. Sorghum lineage; the subg. Parasorghum and Stiposorghum lineage; and the subg. Chaetosorghum and Heterosorghum lineage. The subg. Chaetosorghum and Heterosorghum lineage comprised S. macrospermum (Chaetosorghum) and S. laxiflorum (Heterosorghum) (Figures 2 and 3). These two species were easily distinguished from the remaining Australian native species of Sorghum in having glabrous culm nodes, reduced pedicelled spikelets, and a minute obtuse callus. The two species possessed a relatively smaller 2C DNA content (2.07 pg to 2.49 pg) than the remaining congeneric Australian species. The close relationship between S. macrospermum and S. laxiflorum was also supported by nrDNA ITS and the combined ITS1/ndhF/Adh1 data. On the basis of morphological, cytogenetic, and molecular sequence evidence, it is appropriate to recognize a distinct subg. Chaetosorghum comprising two sections: sect. Chaetosorghum (E.D. Garber) Ivanjuk. & Doronina (S. macrospermum) and sect. Heterosorghum (E.D. Garber) Ivanjuk. & Doronina (S. laxiflorum) (Table 5), although we could not obtain clean Pepc4 sequences of S. macrospermum in the laboratory.
Most species of subg. Parasorghum and Stiposorghum were resolved into one well-supported lineage in the two LCN phylograms. The two subgenera were traditionally distinguished by length and shape of the callus on the sessile spikelet: Parasorghum was characterized by a short and blunt callus with an articulation joint, whereas Stiposorghum was characterized by a long and pointed callus with a linear joint. However, doubts have recently been cast on the systematic value of the callus owing to the continuity of character states across the subgeneric boundary. The subjective nature of determining callus morphology was also reflected by the molecular results because members of Parasorghum and Stiposorghum were aligned into a single lineage. Since there were no well-defined taxonomic and genetic boundaries between these two subgenera, the most practical solution is to combine them into a single subg. Parasorghum (Table 5).
Subgenus Chaetosorghum (including S. macrospermum and S. laxiflorum) appears closely related to subg. Sorghum with strong support (PP = 1.00) in the plastid tree (Figure 3), and such a relationship is consistent with nrDNA ITS, the combined ITS1/ndhF/Adh1, and Pepc4 sequence data (Figure 1). Although the relationship between subg. Chaetosorghum and the clade G-I+clade G-II lineage received weak support (MLBS = 0.61, PP = 0.71) in the GBSSI tree, the placement of subg. Chaetosorghum in Sorghum is unequivocally supported by the sequence data.
Interspecific relationships within subg. Sorghum and GP-3 species
In the Pepc4 phylogram, weak support (MLBS<50%, PP<0.5) was found for S. bicolor (Australian and Mexican accessions) and its immediate wild relatives, i.e., S. almum, S. arundinaceum, S. x drummondii, S. propinquum, and S. virgatum (Figure 1). In the GBSSI phylogram, the five species formed a strongly supported clade G-I (Figure 2). Based on the short branch lengths within clade P-I and clade G-I, the ease of hybrid formation between S. bicolor and certain members of subg. Sorghum, and their similar karyotypes, it is reasonable to infer that the ancestors of S. bicolor may be members of subg. Sorghum. It was suggested that S. almum was a recent fertile hybrid between S. bicolor and S. halepense, but S. arundinaceum, S. bicolor, S. x drummondii, S. propinquum, and S. virgatum appear closely related to S. almum in the Pepc4, GBSSI, and plastid phylograms, suggesting that they may be potential genome donors to S. almum.
Sorghum bicolor is an annual diploid species native to Africa. Four main hypotheses have been proposed to explain its early evolutionary history: (1) annual S. arundinaceum was assumed to be the wild progenitor of S. bicolor based on a cytological study; (2) S. bicolor was thought to be an interspecific hybrid and a descendant of two diploid species (2n = 10); (3) S. bicolor may have arisen by chromosome doubling from one diploid ancestor (2n = 10); or (4) S. bicolor may share a common ancestor with sugarcane and maize through an ancient polyploidization event. The first hypothesis is supported by our study, in which both LCN trees confirm a close relationship between S. arundinaceum and S. bicolor. An ancient forest-savanna species native to tropical Africa, Sorghum arundinaceum extends eastwards to India and Australia, and has been introduced to tropical America. It is possible that cultivated sorghum originated from S. arundinaceum native to forest-savanna in the sub-Saharan belt north of the equator before it colonized regions from the Atlantic to the Indian Oceans.
The separation of S. sudanense (Sudan grass) from S. x drummondii is supported by our study. The two species are distributed from Sudan to Egypt in East Africa and naturalized in China and the Americas. The relationship between these two species was incongruent between the two LCN gene phylograms. The Pepc4 sequences suggest that S. sudanense is sister to the lineage containing S. x drummondii and the remainder of subg. Sorghum with strong support (MLBS = 100%, PP = 1.00, Figure 1); it thus appears that S. sudanense is genetically distant from S. x drummondii. In the GBSSI phylogram, however, the two species are nested within a strongly supported clade G-I (MLBS = 100%, PP = 1.00, Figure 2). One interpretation of the incongruent pattern might be that S. sudanense was a consequence of sympatric speciation among different East African populations of S. x drummondii harboring abundant genetic variation. Sorghum sudanense has obovate caryopses with smooth surfaces whereas S. x drummondii has obovate or elliptic caryopses with striate surfaces (H. Liu et al., unpublished data). Perhaps caryopses with different surface sculptures are the phenotypic consequence of adaptation to different microhabitats. Recognition of the two taxa at the specific level, as opposed to merging them as varieties, is compatible with our results.
The genome origin of S. halepense has been debated for years. It was believed that S. halepense experienced homoeologous chromosome transpositions from potential progenitors S. bicolor and S. propinquum. Some workers proposed that S. halepense was a segmental allotetraploid hybrid between S. arundinaceum and S. propinquum. If so, the maternal parents of S. halepense may have come from members of subg. Sorghum, since S. halepense is deeply nested within lineage number 6 (Figure 3). Furthermore, the plastid data support S. arundinaceum and S. x drummondii as potential progenitors of S. halepense. An alternative hypothesis is that S. halepense is an interspecific hybrid and a descendant of S. bicolor and S. virgatum. However, the Pepc4 and GBSSI data contradict this hypothesis since no corresponding loci were isolated from S. halepense. In the GBSSI tree, four sequences of S. halepense formed a lineage (MLBS = 85%, PP = 1.00), which was sister to the S. sudanense lineage. These results are consistent with the hypothesis that S. halepense arose via homoeologous chromosome transpositions from members of subg. Sorghum. Sorghum halepense exhibits disomic inheritance, allowing the independent assortment of DNA segments between progenitors and resulting in a complex evolutionary pattern. This assumption is substantiated by allozyme studies, in which high-frequency alleles found in S. halepense were not detected in S. bicolor or S. propinquum, providing further evidence for the absence of alleles from progenitors of S. halepense.
Based on GBSSI and plastid data, Sorghum nitidum is nested within the subg. Parasorghum and Stiposorghum lineage. Sorghum nitidum is distributed in southeast Asia, the Pacific Islands, and northern Australia, and exhibits significant morphological variation. The species is characterized by a hairy ring around the nodes, awnless or awned lemmas in sessile spikelets, and relatively small chromosomes. Based on ITS and ndhF analyses, S. nitidum is embedded in subg. Sorghum. However, the genome size of S. nitidum (2.20 pg) resembles that of members of subg. Parasorghum and Stiposorghum (0.64 pg–2.30 pg) rather than that of subg. Sorghum (0.26 pg–0.42 pg). Our study supports a close relationship between S. nitidum and the subg. Parasorghum and Stiposorghum lineage.
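The genome-size argument above amounts to a simple range comparison. As a minimal sketch, the following Python snippet checks which of the quoted 2C ranges contains the S. nitidum value. The numbers are taken exactly as quoted in the paragraph; the function name `matching_groups` and the dictionary layout are hypothetical and serve only as illustration.

```python
# 2C DNA content ranges (pg), exactly as quoted in the text above.
RANGES_PG = {
    "subg. Parasorghum and Stiposorghum": (0.64, 2.30),
    "subg. Sorghum": (0.26, 0.42),
}


def matching_groups(value_pg, ranges=RANGES_PG):
    """Return the names of the groups whose quoted range contains value_pg."""
    return [name for name, (low, high) in ranges.items() if low <= value_pg <= high]


s_nitidum_pg = 2.20  # genome size quoted for S. nitidum
print(matching_groups(s_nitidum_pg))
```

A value falling in neither range would return an empty list, which is one reason such a comparison can only be supporting evidence alongside the phylogenetic placement.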
Palaeoclimatic hypothesis for lineage divergence in Sorghum
It is recognized that the evolution of organisms is profoundly influenced by past tectonic activities and climate changes. Two major Sorghum lineages (lineage numbers 2 and 3) diverged from a common ancestor at 12.7 (95% HPD: 5.5–16.7) Mya (Figure 3) in the middle Miocene-Pliocene interval marked by aridification, which induced the emergence of C4 grasslands in Africa. The eastern branch of the East African Rift has continuously uplifted since the early Miocene, and the increasingly arid climate of tropical and subtropical Africa was caused by the topographic barrier that the eastern branch of the Rift posed to moist maritime air from the Indian Ocean. The resultant formation of new ecological niches presumably catalyzed the diversification of Sorghum (e.g., lineage numbers 8 and 9 in Figure 3) in Africa at a time when significant faunal turnover was observed, e.g., leaf-mining flies, savanna-inhabiting crickets, prairie-adapted rodents, and grass-feeding mammals.
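The age estimates in this section are reported with 95% highest posterior density (HPD) intervals. As a hedged illustration of what such an interval summarizes, the Python sketch below computes the narrowest interval that contains a given fraction of posterior samples. It is a generic shortest-interval calculation, not necessarily the routine used by the dating software in this study. The function name `hpd_interval` is hypothetical, and the node ages are synthetic values drawn from an arbitrary gamma distribution, not output from our analysis.

```python
import math
import random


def hpd_interval(samples, mass=0.95):
    """Return the narrowest interval (lo, hi) covering `mass` of the samples."""
    if not samples:
        raise ValueError("samples must be non-empty")
    xs = sorted(samples)
    n = len(xs)
    k = max(1, math.ceil(mass * n))  # number of samples the interval must cover
    # Slide a window of k consecutive sorted samples and keep the narrowest one.
    best_lo, best_hi = xs[0], xs[k - 1]
    for i in range(1, n - k + 1):
        lo, hi = xs[i], xs[i + k - 1]
        if hi - lo < best_hi - best_lo:
            best_lo, best_hi = lo, hi
    return best_lo, best_hi


# Synthetic "posterior" of node ages in Mya (an assumption for illustration only).
rng = random.Random(0)
ages = [rng.gammavariate(9.0, 1.4) for _ in range(5000)]
lo, hi = hpd_interval(ages)
print(f"95% HPD: {lo:.1f}-{hi:.1f} Mya")
```

Because the HPD is the shortest interval with the stated coverage, it can be asymmetric around the point estimate, as is the case for the intervals reported here.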
The northern Australian endemic species of Sorghum (mostly in lineage number 5, Figure 3) diverged by 9.0 (HPD: 3.3–11.5) Mya around the late Miocene/Pliocene boundary, when the monsoonal palaeoclimate was characterized by south-eastward dry trade winds in winter and north-westward moist flow in summer. The Australian endemic species [e.g., S. intrans, S. leiocladum, S. matarankense E.D. Garber & L.A. Snyder, and S. timorense (Kunth) Büse] are geographically restricted to rocky hills, coastal dunes, and seasonally flooded swamps in northern Australia, where the local vegetation was affected by the lowering seas, leading to the dominance of monsoonal savannas. Meanwhile, the highly dissected tropical areas became even more scattered in northern Australia, causing complex topography in the monsoonal savannas. Therefore, it is reasonable to hypothesize that the dominance of monsoonal savanna in the late Miocene contributed to the high level of endemism of Sorghum in Australia.
Traditionally, Cleistachne has been separated from Sorghum because it has only single spikelets whose pedicels are thought to represent raceme peduncles, whereas Sorghum has sessile and pedicelled spikelets, although the sessile spikelets can be much reduced. Our study and that of early workers agree that Cleistachne is allied with Sorghum; we thus propose the new combination as below.
Sorghum sorghoides (Benth.) Q. Liu & P.M. Peterson, comb. nov. Basionym: Cleistachne sorghoides Benth., Hooker’s Icon. Pl. 14: t. 1379. 1882.
We also propose a new subgeneric classification of Sorghum (Table 5). Within Sorghum we recognize three subgenera: Chaetosorghum, Parasorghum, and Sorghum; and choose to retain two sections within Chaetosorghum: Chaetosorghum and Heterosorghum. Alternatively, based on our molecular results, one could use the new generic name Sarga to represent species in subg. Parasorghum, Sorghum for species in subg. Sorghum, Vacoparis for species in subg. Chaetosorghum, and retain Cleistachne. Perhaps with a greater number of molecular markers, the apparent hybrid origin of S. sorghoides and the phylogenetic positions of S. burmahicum Raizada, S. controversum (Steud.) Snowden, S. derzhavinii Tzvelev, and S. trichocladum (Rupr. ex Hack.) Kuntze (all incertae sedis in our classification) will be elucidated.
The monophyly of Sorghum plus Cleistachne sorghoides is supported by the Pepc4 and the plastid data, and we provide a new combination, Sorghum sorghoides. Molecular results support the allotetraploid origin of S. sorghoides. Based on combined plastid data, members of subg. Parasorghum may represent the maternal parents, while the paternal parents of S. sorghoides remain unresolved because of incongruence between the Pepc4 and the GBSSI phylograms. Sorghum macrospermum is sister to S. laxiflorum, forming a distinct clade, which we refer to as subg. Chaetosorghum with two sections, Chaetosorghum (S. macrospermum) and Heterosorghum (S. laxiflorum). Most members of the two subgenera Parasorghum and Stiposorghum are resolved into one well-supported lineage by the two LCN phylograms. Therefore, we choose to recognize a single subg. Parasorghum, and place Stiposorghum in synonymy. The two LCN gene trees and the combined plastid tree are consistent with the hypothesis that S. halepense originated via homoeologous chromosome transpositions. During the middle Miocene-Pliocene interval, the formation of new ecological niches in tropical and subtropical Africa presumably catalyzed the diversification of Sorghum in Africa. Furthermore, it seems reasonable to infer that the dominance of monsoonal savanna in the late Miocene contributed to the high level of endemism of Sorghum in Australia. Molecular results support the recognition of three distinct subgenera in Sorghum: subg. Chaetosorghum with two sections each containing a single species, subg. Parasorghum with 17 species, and subg. Sorghum with nine species.
Results of the exploration of Pepc4 MCMC convergence using the AWTY (Are We There Yet?) approach. (a) Cumulative plot of the posterior probabilities of 20 splits at selected increments over one of two MCMC runs. (b) Comparative plot of posterior probabilities of all splits for paired MCMC runs.
Results of the exploration of GBSSI MCMC convergence using the AWTY (Are We There Yet?) approach. (a) Cumulative plot of the posterior probabilities of 20 splits at selected increments over one of two MCMC runs. (b) Comparative plot of posterior probabilities of all splits for paired MCMC runs.
Results of the exploration of three plastid sequences (ndhA intron, rpl32-trnL and rps16 intron) MCMC convergence using the AWTY (Are We There Yet?) approach. (a) Cumulative plot of the posterior probabilities of 20 splits at selected increments over one of two MCMC runs. (b) Comparative plot of posterior probabilities of all splits for paired MCMC runs.
Taxon name, chromosome number, source, and GenBank accession numbers of Pepc4, GBSSI, and three plastid (ndhA intron, rpl32-trnL, and rps16 intron) sequences used in the study.
We thank ILRI-Addis Ababa, IS-Andhra Pradesh, and USDA-Beltsville Germplasm System for seeds, and six anonymous reviewers for their constructive comments that improved the manuscript.
Conceived and designed the experiments: QL PMP. Performed the experiments: QL HL. Analyzed the data: QL HL. Contributed reagents/materials/analysis tools: QL HL JW PMP. Contributed to the writing of the manuscript: QL HL JW PMP. Obtained necessary plant material: QL HL PMP.
- 1. FAO (Food and Agriculture Organization of the United Nations) (2011) FAOSTAT Database. FAO, Rome, Italy. Available: http://faostat.fao.org. Accessed 30 September 2011.
- 2. Garber ED (1950) Cytotaxonomic studies in the genus Sorghum. Univ Calif Publ Bot 23: 283–361.
- 3. Lazarides M, Hacker JB, Andrew MH (1991) Taxonomy, cytology and ecology of indigenous Australian sorghums (Sorghum Moench: Andropogoneae: Poaceae). Aust Syst Bot 4: 591–635. doi: 10.1071/sb9910591
- 4. Clayton WD, Vorontsova MS, Harman KT, Williamson H (2006 onwards). GrassBase –The online world grass flora. Available: http://www.kew.org/data/grasses-db.html. Accessed 8 November 2006.
- 5. Liu H, Liu Q (2014) Geographical distribution of Sorghum Moench (Poaceae). J Trop Subtrop Bot 22: 1–11.
- 6. Clayton WD, Renvoize SA (1986) Genera graminum: grasses of the world. Kew Bull Addit Ser 13: 320–375. doi: 10.2307/4114451
- 7. Soreng RJ, Davidse G, Peterson PM, Zuloaga FO, Judziewicz EJ, et al. (2014) A world-wide phylogenetic classification of Poaceae (Gramineae): căo (草), capim, çayır, çimen, darbha, ghaas, ghas, gish, gramas, graminius, gräser, grasses, gyokh, he-ben-ke, hullu, kasa, kusa, nyasi, pastos, pillu, pullu, zlaki, etc. Available: http://www.tropicos.org/projectwebportal.aspx?pagename=ClassificationNWG&projectid=10. Accessed 13 January 2014.
- 8. Dillon SL, Lawrence PK, Henry RJ (2001) The use of ribosomal ITS to determine phylogenetic relationships within Sorghum. Plant Syst Evol 230: 97–110. doi: 10.1007/s006060170007
- 9. Dillon SL, Lawrence PK, Henry RJ, Ross L, Price HJ, et al. (2004) Sorghum laxiflorum and S. macrospermum, the Australian native species most closely related to the cultivated S. bicolor based on ITS1 and ndhF sequence analysis of 25 Sorghum species. Plant Syst Evol 249: 233–246. doi: 10.1007/s00606-004-0210-7
- 10. Sun Y, Skinner DZ, Liang GH, Hulbert SH (1994) Phylogenetic analysis of Sorghum and related taxa using internal transcribed spacers of nuclear ribosomal DNA. Theor Appl Genet 89: 26–32. doi: 10.1007/bf00226978
- 11. Clayton WD, Renvoize SA (1982) Gramineae (Part 3). In: Polhill RM, editor. Flora of Tropical East Africa. Rotterdam: August Aimé Balkema. pp. 320–734.
- 12. Celarier RP (1958) Cytotaxonomy of the Andropogoneae. III. Subtribe Sorgheae, genus Sorghum. Cytologia 23: 395–418. doi: 10.1508/cytologia.23.395
- 13. De Wet JMJ (1978) Systematics and evolution of Sorghum sect. Sorghum (Gramineae). Am J Bot 65: 477–484. doi: 10.2307/2442706
- 14. Dillon SL, Lawrence PK, Henry RJ, Price HJ (2007) Sorghum resolved as a distinct genus based on combined ITS1, ndhF and Adh1 analyses. Plant Syst Evol 268: 29–43. doi: 10.1007/s00606-007-0571-9
- 15. Spangler RE (2003) Taxonomy of Sarga, Sorghum and Vacoparis (Poaceae: Andropogoneae). Aust Syst Bot 16: 279–299. doi: 10.1071/sb01006
- 16. Ng’uni D, Geleta M, Fatih M, Bryngelsson T (2010) Phylogenetic analysis of the genus Sorghum based on combined sequence data from cpDNA regions and ITS generate well-supported trees with two major lineages. Ann Bot 105: 471–480. doi: 10.1093/aob/mcp305
- 17. Harlan JR, De Wet JMJ (1971) Toward a rational classification of cultivated plants. Taxon 20: 509–517. doi: 10.2307/1218252
- 18. Stenhouse JW, Prasada Rao KE, Gopal Reddy V, Appa Pao KD (1997) Sorghum. In: Fuccillo D, Sears L, Stapleton P, editors. Biodiversity in Trust: Conservation and Use of Plant Genetic Resources in CGIAR Centers. Cambridge: Cambridge University Press. pp. 292–308.
- 19. Snowden JD (1955) The wild fodder sorghums of the section Eu-sorghum. J Linn Soc Lond 55: 191–260. doi: 10.1111/j.1095-8339.1955.tb00011.x
- 20. Knobloch IW (1968) A check list of crosses in the Gramineae. New York: Stechert-Hafner Service Agency.
- 21. Knobloch IW (1972) Intergeneric hybridization in flowering plants. Taxon 21: 97–103. doi: 10.2307/1219229
- 22. Ness RW, Graham SW, Barrett SCH (2011) Reconciling gene and genome duplication events: using multiple nuclear gene families to infer the phylogeny of the aquatic plant family Pontederiaceae. Mol Biol Evol 28: 3009–3018. doi: 10.1093/molbev/msr119
- 23. Zhang N, Zeng LP, Shan HY, Ma H (2012) Highly conserved low-copy nuclear genes as effective markers for phylogenetic analyses in angiosperms. New Phytol 195: 923–937. doi: 10.1111/j.1469-8137.2012.04212.x
- 24. Zimmer EA, Wen J (2012) Using nuclear gene data for plant phylogenetics: progress and prospects. Mol Phylogenet Evol 65: 774–785. doi: 10.1016/j.ympev.2012.07.015
- 25. Liu Q, Triplett JK, Wen J, Peterson PM (2011) Allotetraploid origin and divergence in Eleusine (Chloridoideae, Poaceae): evidence from low-copy nuclear gene phylogenies and a plastid gene chronogram. Ann Bot 108: 1287–1298. doi: 10.1093/aob/mcr231
- 26. Cronn R, Wendel JF (2004) Cryptic trysts, genomic mergers, and plant speciation. New Phytol 161: 133–142. doi: 10.1111/j.1469-8137.2004.00947.x
- 27. Brassac J, Jakob SS, Blattner FR (2012) Progenitor-derivative relationships of Hordeum polyploids (Poaceae, Triticeae) inferred from sequences of TOPO6, a nuclear low-copy gene region. PLoS ONE 7: e33808. doi: 10.1371/journal.pone.0033808
- 28. Edwards EJ, Osborne CP, Strömberg CAE, Smith SA, C4 Grasses Consortium (2010) The origins of C4 grasslands: integrating evolutionary and ecosystem science. Science 328: 587–591. doi: 10.1126/science.1177216
- 29. Cerling TE, Harris JM, Macfadden BJ, Leakey MG, Quade J, et al. (1997) Global vegetation change through the Miocene/Pliocene boundary. Nature 389: 153–158. doi: 10.1038/38229
- 30. Strömberg CAE (2005) Decoupled taxonomic radiation and ecological expansion of open-habitat grasses in the Cenozoic of North America. Proc Natl Acad Sci USA 102: 11980–11984. doi: 10.1073/pnas.0505700102
- 31. Hartley W (1958) Studies on the origin, evolution, and distribution of the Gramineae. I. The tribe Andropogoneae. Aust J Bot 6: 116–128. doi: 10.1071/bt9580116
- 32. Keng YL (1939) The gross morphology of Andropogoneae. Sinensia 10: 274–343.
- 33. Liu Q, Peterson PM, Ge XJ (2011) Phylogenetic signals in the realized climate niches of Chinese grasses (Poaceae). Plant Ecol 212: 1733–1746. doi: 10.1007/s11258-011-9946-7
- 34. Li N (2009) Cytology and seed biology of Sorghum halepense and its three related species. M.S. Thesis. Jinhua: Zhejiang Normal University.
- 35. Martin JH (1959) Sorghum and pearl millet. In: Happert H, Rudorf W, editors. Handbuch der Pflanzenzüchtung, 2nd edition, vol. 2. Berlin: Paul Parey. pp. 565–587.
- 36. Price HJ, Dillon SL, Hodnett G, Rooney WL, Ross L, et al. (2005) Genome evolution in the genus Sorghum (Poaceae). Ann Bot 95: 219–227. doi: 10.1093/aob/mci015
- 37. De Wet JMJ, Huckabay JP (1967) The origin of Sorghum bicolor. II. Distribution and domestication. Evolution 21: 787–802. doi: 10.2307/2406774
- 38. Reddi VR (1970) Chromosome association in one induced and five natural tetraploids of Sorghum. Genetica 41: 321–333. doi: 10.1007/bf00958915
- 39. Chen SL, Phillips SM (2006) Sorghum Moench. In: Wu ZY, Raven PH, editors. Flora of China, vol. 22. Beijing: Science Press and St. Louis: Missouri Botanical Garden Press. pp. 602–604.
- 40. Lu QS (2006) Sorghum. In: Dong YC, Liu X, editors. Crops and Their Wild Relatives in China: Food Crops. Beijing: China Agriculture Press. pp. 360–405.
- 41. Mathews S, Spangler RE, Mason-Gamer RJ, Kellogg EA (2002) Phylogeny of Andropogoneae inferred from phytochrome B, GBSSI, and ndhF. Int J Plant Sci 163: 441–450. doi: 10.1086/339155
- 42. Spangler RE, Zaitchik B, Russo E, Kellogg E (1999) Andropogoneae evolution and generic limits in Sorghum (Poaceae) using ndhF sequences. Syst Bot 24: 267–281. doi: 10.2307/2419552
- 43. Nadeem Ahsan SM, Vahidy AA, Ali SI (1994) Chromosome numbers and incidence of polyploidy in Panicoideae (Poaceae) from Pakistan. Ann Mo Bot Gard 81: 775–783. doi: 10.2307/2399922
- 44. Baltisberger M, Kocyan A (2010) IAPT/IOPB chromosome data 9. Taxon 59: 1298–1302. doi: 10.2307/25065590
- 45. Celarier RP (1956) Cytotaxonomy of the Andropogoneae. I. Subtribes Dimeriinae and Saccharinae. Cytologia 21: 272–291. doi: 10.1508/cytologia.21.272
- 46. Vahidy AA, Davidse A, Shigenobu Y (1987) Chromosome counts of Missouri Asteraceae and Poaceae. Ann Mo Bot Gard 74: 432–433. doi: 10.2307/2399415
- 47. Lepiniec L, Vidal J, Chollet R, Gadal P, Crétin C (1994) Phosphoenolpyruvate carboxylase: structure, regulation and evolution. Plant Sci 99: 111–124. doi: 10.1016/0168-9452(94)90168-6
- 48. Mason-Gamer RJ, Weil CF, Kellogg EA (1998) Granule-bound starch synthase: structure, function, and phylogenetic utility. Mol Biol Evol 15: 1658–1673. doi: 10.1093/oxfordjournals.molbev.a025893
- 49. Christin PA, Besnard G, Samaritani E, Duvall MR, Hodkinson TR, et al. (2008) Oligocene CO2 decline promoted C4 photosynthesis in grasses. Curr Biol 18: 37–43. doi: 10.1016/j.cub.2007.11.058
- 50. Mahelka V, Kopecký D (2010) Gene capture from across the grass family in the allohexaploid Elymus repens (L.) Gould (Poaceae, Triticeae) as evidenced by ITS, GBSSI, and molecular cytogenetics. Mol Biol Evol 27: 1370–1390. doi: 10.1093/molbev/msq021
- 51. Fortuné PM, Schierenbeck K, Ainouche A, Jacquemin J, Wendel JF, et al. (2007) Evolutionary dynamics of Waxy and the origin of hexaploid Spartina species. Mol Phylogenet Evol 43: 1040–1055. doi: 10.1016/j.ympev.2006.11.018
- 52. Paterson AH, Bowers JE, Bruggmann R, Dubchak I, Grimwood J, et al. (2009) The Sorghum bicolor genome and the diversification of grasses. Nature 457: 551–556. doi: 10.1038/nature07723
- 53. Shaw J, Lickey EB, Schilling EE, Small RL (2007) Comparison of whole chloroplast genome sequences to choose noncoding regions for phylogenetic studies in Angiosperms: the tortoise and the hare III. Am J Bot 94: 275–288. doi: 10.3732/ajb.94.3.275
- 54. Peterson PM, Romaschenko K, Johnson G (2010) A classification of the Chloridoideae (Poaceae) based on multi-gene phylogenetic trees. Mol Phylogenet Evol 55: 580–598. doi: 10.1016/j.ympev.2010.01.018
- 55. Hiraishi A, Kamagata Y, Nakamura K (1995) Polymerase chain reaction amplification and restriction fragment length polymorphism analysis of 16S rRNA genes from methanogens. J Ferment Bioeng 79: 523–529. doi: 10.1016/0922-338x(95)94742-a
- 56. Li FW, Pryer KM, Windham MD (2012) Gaga, a new fern genus segregated from Cheilanthes (Pteridaceae). Syst Bot 37: 845–860. doi: 10.1600/036364412x656626
- 57. Rothfels CJ, Schuettpelz E (2013) Accelerated rate of molecular evolution for vittarioid ferns is strong and not driven by selection. Syst Biol 63: 31–54. doi: 10.1093/sysbio/syt058
- 58. Edgar RC (2004) MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res 32: 1792–1797. doi: 10.1093/nar/gkh340
- 59. Swofford DL (2003) PAUP*. Phylogenetic analysis using parsimony (* and other methods), ver. 4.0b10. Sunderland: Sinauer Associates.
- 60. Grusz AL, Windham MD, Pryer KM (2009) Deciphering the origins of apomictic polyploids in the Cheilanthes yavapensis complex (Pteridaceae). Am J Bot 96: 1636–1645. doi: 10.3732/ajb.0900019
- 61. Zwickl DJ (2006) Genetic algorithm approaches for the phylogenetic analysis of large biological sequence datasets under the maximum likelihood criterion. Ph.D. Thesis. Austin: University of Texas at Austin.
- 62. Ronquist F, Teslenko M, van der Mark P, Ayres DL, Darling A, et al. (2012) MrBayes 3.2: efficient Bayesian phylogenetic inference and model choice across a large model space. Syst Biol 61: 539–542. doi: 10.1093/sysbio/sys029
- 63. Posada D, Crandall KA (1998) Modeltest: testing the model of DNA substitution. Bioinformatics 14: 817–818. doi: 10.1093/bioinformatics/14.9.817
- 64. Nylander JAA, Wilgenbusch JC, Warren DL, Swofford DL (2008) AWTY (are we there yet?): a system for graphical exploration of MCMC convergence in Bayesian phylogenetics. Bioinformatics 24: 581–583. doi: 10.1093/bioinformatics/btm388
- 65. Rota-Stabelli O, Telford MJ (2008) A multi criterion approach for the selection of optimal outgroups in phylogeny: recovering some support for Mandibulata over Myriochelata using mitogenomics. Mol Phylogenet Evol 48: 103–111. doi: 10.1016/j.ympev.2008.03.033
- 66. Felsenstein J (1981) Evolutionary trees from DNA sequences: a maximum likelihood approach. J Mol Evol 17: 368–376. doi: 10.1007/bf01734359
- 67. Drummond AJ, Rambaut A (2007) BEAST: Bayesian evolutionary analysis by sampling trees. BMC Evol Biol 7: 214. doi: 10.1186/1471-2148-7-214
- 68. Vicentini A, Barber JC, Aliscioni SS, Giussani LM, Kellogg EA (2008) The age of the grasses and clusters of origins of C4 photosynthesis. Glob Change Biol 14: 2963–2977. doi: 10.1111/j.1365-2486.2008.01688.x
- 69. Nambudiri EMV, Tidwell WD, Smith BN, Hebbert NP (1978) A C4 plant from the Pliocene. Nature 276: 816–817. doi: 10.1038/276816a0
- 70. Jacobs BF, Kingston JD, Jacobs LL (1999) The origin of grass-dominated ecosystems. Ann Mo Bot Gard 86: 590–643. doi: 10.2307/2666186
- 71. Kellogg EA (2000) Molecular and morphological evolution in the Andropogoneae. In: Jacobs SWL, Everett J, editors. Grasses: Systematics and Evolution. Collingwood: Commonwealth Scientific and Industrial Research Organization Publishing. pp. 149–158.
- 72. Whistler DP, Burbank DW (1992) Miocene biostratigraphy and biochronology of the Dove Spring Formation, Mojave Desert, California, and characterization of the Clarendonian mammal age (late Miocene) in California. Geol Soc Am Bull 104: 644–658. doi: 10.1130/0016-7606(1992)104<0644:mbabot>2.3.co;2
- 73. Ho SYW, Phillips MJ (2009) Accounting for calibration uncertainty in phylogenetic estimation of evolutionary divergence times. Syst Biol 58: 367–380. doi: 10.1093/sysbio/syp035
- 74. Gelman A, Rubin DB (1992) Inference from iterative simulation using multiple sequences. Statist Sci 7: 457–511. doi: 10.1214/ss/1177011136
- 75. Hata S, Izui K, Kouchi H (1998) Expression of a soybean nodule-enhanced phosphoenolpyruvate carboxylase gene that shows striking similarity to another gene for a house-keeping isoform. Plant J 13: 267–273. doi: 10.1046/j.1365-313x.1998.00022.x
- 76. Duvall MR, Doebley JF (1990) Restriction site variation in the chloroplast genome of Sorghum (Poaceae). Syst Bot 15: 472–480. doi: 10.2307/2419363
- 77. Wu TP (1990) Sorghum macrospermum and its relationship to the cultivated species S. bicolor. Cytologia (Tokyo) 55: 141–151. doi: 10.1508/cytologia.55.141
- 78. Wu TP (1993) Cytological and morphological relationships between Sorghum laxiflorum and S. bicolor. J Hered 84: 484–489.
- 79. Liao F, Liu Y, Yang XL, Huang GM, Niu CJ (2009) Molecular phylogenetic relationships among species in the genus Sorghum based on partial Adh1 gene. Hereditas 31: 523–530. doi: 10.3724/sp.j.1005.2009.00523
- 80. Doggett H (1970) Sorghum. London: Longmans, Green and Company.
- 81. Gu MH, Ma HT, Liang GH (1984) Karyotype analysis of seven species in the genus Sorghum. J Hered 75: 196–202.
- 82. Van Oosterhout SAM (1992) The biosystems and ethnobotany of Sorghum bicolor in Zimbabwe. Ph.D. Thesis. Harare: University of Zimbabwe.
- 83. Tang H, Liang GH (1988) The genomic relationship between cultivated sorghum [Sorghum bicolor (L.) Moench] and Johnsongrass [S. halepense (L.) Pers.]: a re-evaluation. Theor Appl Genet 76: 277–284. doi: 10.1007/bf00257856
- 84. Swigoňová Z, Lai JS, Ma JX, Ramakrishna W, Llaca V, et al. (2004) Close split of sorghum and maize genome progenitors. Genome Res 14: 1916–1923. doi: 10.1101/gr.2332504
- 85. Paterson AH, Bowers JE, Chapman BA (2004) Ancient polyploidization predating divergence of the cereals, and its consequences for comparative genomics. Proc Natl Acad Sci USA 101: 9903–9908. doi: 10.1073/pnas.0307901101
- 86. House LR (1985) A guide to sorghum breeding. Andhra Pradesh: International Crops Research Institute for the Semi-Arid Tropics.
- 87. Bolnick DI, Fitzpatrick BM (2007) Sympatric speciation: models and empirical evidence. Annu Rev Ecol Evol Syst 38: 459–487. doi: 10.1146/annurev.ecolsys.38.091206.095804
- 88. Jiang B, Peterson PM, Liu Q (2011) Caryopsis micromorphology of Eleusine Gaertn. (Poaceae) and its systematic implications. J Trop Subtrop Bot 19: 195–204.
- 89. Zhang Y, Hu XY, Liu YX, Liu Q (2014) Caryopsis micromorphological survey of Themeda (Poaceae) and allied spathaceous genera in the Andropogoneae. Turk J Bot 38: 1206–1212. doi: 10.3906/bot-1308-27
- 90. Udall JA, Quijada PA, Osborn TC (2005) Detection of chromosomal rearrangements derived from homologous recombination in four mapping populations of Brassica napus L. Genetics 169: 967–979. doi: 10.1534/genetics.104.033209
- 91. Doggett H (1976) Sorghum. In: Simmonds NW, editor. Evolution of Crop Plants. London: Longman Scientific and Technical. pp. 112–117.
- 92. Paterson AH, Schertz KF, Lin YR, Liu C, Chang YL (1995) The weediness of wild plants: molecular analysis of genes influencing dispersal and persistence of Johnsongrass, Sorghum halepense (L.) Pers. Proc Natl Acad Sci USA 92: 6127–6131. doi: 10.1073/pnas.92.13.6127
- 93. Bhatti AG, Endrizzi JE, Reeves RG (1960) Origin of Johnsongrass. J Hered 51: 107–110.
- 94. Gaut BS, Doebley JF (1997) DNA sequence evidence for the segmental allotetraploid origin of maize. Proc Natl Acad Sci USA 94: 6809–6814. doi: 10.1073/pnas.94.13.6809
- 95. Morden CW, Doebley J, Schertz KF (1990) Allozyme variation among the spontaneous species of Sorghum section Sorghum (Poaceae). Theor Appl Genet 80: 296–304. doi: 10.1007/bf00210063
- 96. Linder HP, Rudall PJ (2005) Evolutionary history of Poales. Annu Rev Ecol Evol Syst 36: 107–124. doi: 10.1146/annurev.ecolsys.36.102403.135635
- 97. Zachos J, Pagani M, Sloan L, Thomas E, Billups K (2001) Trends, rhythms, and aberrations in global climate 65 Ma to present. Science 292: 686–693. doi: 10.1126/science.1059412
- 98. Guiraud R, Bosworth W, Thierry J, Delaplanque A (2005) Phanerozoic geological evolution of Northern and Central Africa: an overview. J Afr Earth Sci 43: 83–143. doi: 10.1016/j.jafrearsci.2005.07.017
- 99. Lærdal T, Talbot MR (2002) Basin neotectonics of Lakes Edward and George, East African Rift. Palaeogeogr Palaeoclimatol Palaeoecol 187: 213–232. doi: 10.1016/s0031-0182(02)00478-9
- 100. Sepulchre P, Ramstein G, Fluteau F, Schuster M, Tiercelin JJ, et al. (2006) Tectonic uplift and East Africa aridification. Science 313: 1419–1423. doi: 10.1126/science.1129158
- 101. Swezey CS (2009) Cenozoic stratigraphy of the Sahara, Northern Africa. J Afr Earth Sci 53: 89–121. doi: 10.1016/j.jafrearsci.2008.08.001
- 102. Winkler IS, Mitter C, Scheffer SJ (2009) Repeated climate-linked host shifts have promoted diversification in a temperate clade of leaf-mining flies. Proc Natl Acad Sci USA 106: 18103–18108. doi: 10.1073/pnas.0904852106
- 103. Voje KL, Hemp C, Flagstad Ø, Saetre GP, Stenseth NC (2009) Climatic change as an engine for speciation in flightless Orthoptera species inhabiting African mountains. Mol Ecol 18: 93–108. doi: 10.1111/j.1365-294x.2008.04002.x
- 104. Finarelli J, Badgley C (2010) Diversity dynamics of Miocene mammals in relation to the history of tectonism and climate. Proc R Soc Lond B Biol Sci 277: 2721–2726. doi: 10.1098/rspb.2010.0348
- 105. Janis CM, Damuth J, Theodor JM (2000) Miocene ungulates and terrestrial primary productivity: where have all the browsers gone? Proc Natl Acad Sci USA 97: 7899–7904. doi: 10.1073/pnas.97.14.7899
- 106. Wheeler MC, McBride JL (2005) Australian-Indonesian monsoon. In: Lau WKM, Waliser DE, editors. Intraseasonal Variability in the Atmosphere-Ocean Climate System. Heidelberg: Springer-Praxis. pp. 125–173.
- 107. Martin HA (2006) Cenozoic climatic change and the development of the arid vegetation in Australia. J Arid Environ 66: 533–563. doi: 10.1016/j.jaridenv.2006.01.009
- 108. Russell-Smith J, Needham S, Brock J (1995) The physical environment. In: Press T, Lea D, Webb A, Graham A, editors. Kakadu: Natural and Cultural Heritage Management. Darwin: Australian Nature Conservation Agency. pp. 94–126.
- 109. Fujita MK, McGuire JA, Donnellan SC, Moritz C (2010) Diversification and persistence at the arid-monsoonal interface: Australia-wide biogeography of the Bynoe’s gecko (Heteronotia binoei; Gekkonidae). Evolution 64: 2293–2314. doi: 10.1111/j.1558-5646.2010.00993.x
- 110. Phillips S (1995) Poaceae (Gramineae). In: Hedberg I, Edwards S, editors. Flora of Ethiopia and Eritrea. Addis Ababa: Addis Ababa University and Uppsala: Uppsala University.