Interspecific Phylogenic Relationships within Genus Melilotus Based on Nuclear and Chloroplast DNA

Melilotus comprises 19 species, but the phylogenetic relationships among them remain unclear. In the present work, three chloroplast regions (rbcL, matK and trnL-F) and one nuclear region, ITS (internal transcribed spacer), were sequenced from 48 populations of 18 Melilotus species, and phylogenetic trees were constructed to study their interspecific relationships. Based on the phylogenetic tree generated in this study using rbcL analysis, the genus Melilotus is clearly monophyletic within the legume family. Both Bayesian and maximum-parsimony approaches were used to analyze the data. The nrDNA ITS provided more informative characters (9.8%) than cpDNA (3.0%). Melilotus contains two closely related groups, clade I and clade II. M. spicatus, M. indicus and M. segetalis are closely related, as are M. infestus, M. siculus and M. sulcatus. A comparison of the molecular phylogeny with the flower color classification of Melilotus showed that flower color is not very informative for the phylogenetics of this genus.


Introduction
Melilotus (sweet clover) belongs to the tribe Trifolieae of the legume family and comprises 19 annual and biennial species [1]. All species are native to Eurasia or North Africa [2], and three species are cultivated: M. albus, M. officinalis and M. indicus [3]. M. albus and M. officinalis are mainly capable of self-pollination [4], but when the pistil is longer than the stamens, there is very little self-pollination [5]. Melilotus also has entomophilous flowers, which can lead to hybridization. Several species have invaded the Northwest Territories in Canada and the Midwestern USA, among which M. albus and M. officinalis are often studied [6,7,8]. Members of the genus Melilotus have high seed yields and, relative to most other forages, are more tolerant of environmental extremes such as drought, cold and high salinity [9,10]. In addition to being an important forage crop, Melilotus has important medicinal value [11]. Furthermore, the nitrogen fixation rate of Melilotus is higher than that of other legumes, making it beneficial for crop rotations [12].
Members of Melilotus exhibit wide variation in flower structure, flower color, and seed, leaf and pod characteristics [13,14]. Classification of Melilotus based on morphological traits and growth habits is difficult [15,16]. However, apart from morphological studies, no other taxonomic assessments of the interspecific phylogenetic relationships have been conducted. DNA analysis has been widely used in phylogenetic and classification studies. These methods are more effective and specific than traditional morphological methods for resolving phylogenetic relationships and genetic variation among sibling species and morphologically intermediate species [17,18].
Phylogenetic results based on a single gene may be misleading, especially for cpDNA, which is inherited maternally [19]. Hybridization between different species or genera may lead to reticulate evolution [20]. Employing a different molecular marker can help to assess and reduce this problem. Nuclear ribosomal genes, with their alternating gene and spacer regions and tandem repeat structure, provide this option [21,22,23]. The nrDNA internal transcribed spacer (ITS) region and chloroplast DNA have higher variability and are thus suitable for classification at lower taxonomic levels [24,25,26]. Accordingly, these regions are useful for inferring phylogenetic relationships at lower taxonomic levels and have been successfully used in plant systematics [27,28]. Here we selected three cpDNA regions (rbcL, matK and trnL-F) and one nrDNA region (ITS) to study the interspecific relationships [29,30].
In this study, plant samples were collected from 48 populations of 18 Melilotus species, covering every species of the genus except M. macrocarpus. To study the phylogenetic relationships among members of the genus and to generate more accurate estimates of its genetic diversity, we constructed molecular phylogenetic trees from the nrDNA ITS alone, from the three cpDNA regions, and from the concatenated sequences of all four genes. Finally, the molecular phylogenetic classification was compared with classifications based on flower color and karyotype.

Sampling
Seeds from 48 populations representing 18 species were obtained from the National Plant Germplasm System (NPGS, USA) and planted at Yuzhong (35°57'N, 104°09'E) in Gansu Province, China (Table 1). Samples were collected from public land rather than protected areas in northwest China, and no samples of endangered or protected species were included in our study.
Young leaves from 2 to 12 individuals of each population were sampled (totaling 406 individuals). Leaves were frozen in liquid nitrogen and stored at -80°C.

DNA extraction, amplification and sequencing
Four genes were amplified and sequenced: three chloroplast genes (cpDNA), trnL-F, rbcL and matK, and one nuclear region (nrDNA), ITS (Table 2). For each population, 2 to 12 independent DNA samples were obtained to check for sequencing errors. Total genomic DNA was extracted using an SDS (sodium dodecyl sulfate) method [31]. Polymerase chain reactions were then conducted in a 25-μL tube containing 1 μL genomic DNA (50 ng / mL), 1 μL of each primer (5 pmol / mL), 12.5 μL Takara Taq DNA polymerase master mix and 9.5 μL deionized water. For nuclear DNA ITS, the region was amplified using a PCR protocol of 94°C for 3 min, followed by 35 cycles of denaturation at 94°C for 30 s, annealing at 50°C for 30 s, and extension at 72°C for 1 min, and a final extension at 72°C for 10 min. For trnL-F gene using a PCR
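The reaction composition and the ITS thermal profile given above can be checked with a short calculation. The following Python sketch is illustrative only and is not part of the authors' workflow: it assumes that "each primer" means one forward and one reverse primer, it counts programmed hold times only (ramp times between temperatures are ignored), and the helper `scale_mix` with its overage parameter is a hypothetical convenience for preparing a shared mix.

```python
# Reaction components in microliters, as listed in the text.
# Assumption: "1 uL of each primer" means one forward and one reverse primer.
REACTION_UL = {
    "genomic DNA": 1.0,
    "forward primer": 1.0,
    "reverse primer": 1.0,
    "Taq master mix": 12.5,
    "deionized water": 9.5,
}

# ITS thermal profile from the text: (step name, temperature in C, seconds).
INITIAL_DENATURATION = ("initial denaturation", 94, 180)
CYCLE_STEPS = [
    ("denaturation", 94, 30),
    ("annealing", 50, 30),
    ("extension", 72, 60),
]
N_CYCLES = 35
FINAL_EXTENSION = ("final extension", 72, 600)


def total_volume_ul(components):
    """Sum the component volumes of one reaction."""
    return sum(components.values())


def hold_time_seconds(initial, cycle_steps, n_cycles, final):
    """Total programmed hold time; ramping between steps is not included."""
    per_cycle = sum(seconds for _, _, seconds in cycle_steps)
    return initial[2] + n_cycles * per_cycle + final[2]


def scale_mix(components, n_reactions, overage=0.1):
    """Hypothetical helper: scale volumes for n reactions plus pipetting overage."""
    factor = n_reactions * (1.0 + overage)
    return {name: round(vol * factor, 2) for name, vol in components.items()}


if __name__ == "__main__":
    print("Reaction volume (uL):", total_volume_ul(REACTION_UL))
    seconds = hold_time_seconds(
        INITIAL_DENATURATION, CYCLE_STEPS, N_CYCLES, FINAL_EXTENSION
    )
    print("Programmed hold time (min):", seconds / 60)
```

Under these assumptions the listed components sum to the stated 25 μL, and the ITS program holds for 4980 s (83 min) before any ramping time is added.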

Phylogenetic analyses
Phylogenetic analyses were performed using Bayesian and maximum-parsimony approaches. Sequence alignment was initially performed using ClustalX [32] and manually adjusted using MEGA5.0 [33]. The maximum-parsimony analyses involved a heuristic search strategy with 1000 replicates of random sequence addition in combination with TBR branch swapping in MEGA5.0. All character states were treated as unordered and equally weighted. Informative insertions and deletions (indels) were coded as binary characters (0, 1) according to Graham et al. (2000). A strict consensus tree was constructed from the most parsimonious trees. Bayesian analyses were conducted using MrBayes version 3.1 [34]. A model of sequence evolution for the combined dataset was selected using the program ModelTest version 3.6 [35] as implemented in MrMTgui [36], based on the Akaike information criterion (AIC) [37]. The dataset was analyzed as a single partition using the GTR + I + G model. Four chains were run for one million generations, beginning with a random tree and saving a tree every 100 generations. Finally, phylogenetic analyses were conducted on the ITS region, on the three cpDNA regions, and on the combined dataset of all four genes (ITS, rbcL, matK and trnL-F). One sequence from each population was used to construct phylogenetic trees for the genus Melilotus.
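Combining several aligned regions into one dataset, as described above for the four-gene analysis, amounts to joining each population's aligned sequences end to end and recording where each region starts and stops. The Python sketch below illustrates that step under stated assumptions: the function name `concatenate_alignments`, the toy population labels and the use of "?" for a region missing from a population are all hypothetical, and the authors' actual procedure and software for this step are not described in the text.

```python
def concatenate_alignments(alignments, order):
    """Join per-region alignments into one matrix.

    alignments: {region: {population: aligned sequence}}
    order: list of region names giving the concatenation order
    Returns (matrix, partitions), where partitions maps each region to its
    1-based inclusive (start, end) columns in the concatenated alignment.
    """
    # Every sequence within a region must have the same aligned length.
    lengths = {}
    for region in order:
        region_lengths = {len(seq) for seq in alignments[region].values()}
        if len(region_lengths) != 1:
            raise ValueError(f"unequal sequence lengths in region {region!r}")
        lengths[region] = region_lengths.pop()

    # Use every population seen in any region.
    populations = sorted(set().union(*(alignments[r].keys() for r in order)))

    matrix = {pop: "" for pop in populations}
    partitions = {}
    start = 1
    for region in order:
        n = lengths[region]
        for pop in populations:
            # Assumption: a region missing for a population is padded with "?".
            matrix[pop] += alignments[region].get(pop, "?" * n)
        partitions[region] = (start, start + n - 1)
        start += n
    return matrix, partitions


if __name__ == "__main__":
    # Toy data only; these are not sequences from the study.
    toy = {
        "ITS": {"pop1": "ACGT", "pop2": "ACGA"},
        "rbcL": {"pop1": "GG-", "pop2": "GGA"},
        "matK": {"pop1": "TT"},
    }
    matrix, partitions = concatenate_alignments(toy, ["ITS", "rbcL", "matK"])
    for pop, seq in matrix.items():
        print(pop, seq)
    print(partitions)
```

The recorded column ranges make it possible to analyze the regions separately again or to define partitions later, even though the analysis described here treated the dataset as a single partition.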

Phylogenetic analyses
rbcL analysis in the legume family. The phylogenetic tree in the legume family shown in
4-gene analysis. The 4-gene tree of Melilotus, based on 2998 bp of four concatenated genes (rbcL, matK, trnL-F and ITS) with T. lupinaster and M. sativa as outgroups, is shown in Fig 4. The major clades recovered in the above tree were also successfully resolved by this analysis. Clade I was recovered and contained 10 related species. Subgroup IIa and M. infestus formed a subclade, namely clade 2, and subgroup IIb with M. spicatus formed a highly supported

Discussion
In this study, clade I, which contains 10 species, was found in all four trees of ITS and cpDNA genes. However, in the two cpDNA trees (Fig 1 and Fig 2), the other 8 species formed another large clade, clade II. In the nrDNA tree (Fig 3) and the 4-gene tree (Fig 4), clade 1, which fell within clade II in the other two trees, clustered within clade I instead; therefore, no clade II was recovered. Several species are closely related in the 3-cpDNA tree, the nrDNA tree and the 4-gene tree, e.g., M. spicatus, M. segetalis and M. indicus in clade 1, and M. siculus, M. sulcatus and M. infestus in clade 2.
As shown in Table 3, the ITS region provided more informative characters (9.8%) than cpDNA (3.0%). Liu et al. [38] reported a similar result for Ligularia-Cremanthodium-Parasenecio: ITS (39.6%) had more parsimony-informative characters than a cpDNA combination of ndhF and trnL-trnF (2.5%). The higher sequence variability of the ITS region compared with cpDNA, which has also been demonstrated in many other taxa [39,40,41,42], may lead to incongruence between phylogenetic trees. nrDNA is biparentally inherited and has high rates of intraspecific gene flow, which can enhance species delimitation. However, the maternally inherited chloroplast DNA is more frequently introgressed and is of more limited use in species delimitation than nuclear DNA [43,44]. In addition, incomplete lineage sorting [45] and hybridization between and within species [20] may also cause phylogenetic incongruence.
According to Steven [16], plant morphology may vary greatly even within a single plant and was thus not used for species classification. Steven reviewed the agronomy and taxonomy of the genus Melilotus and divided it into two groups according to flower color, namely, white and yellow. The white group contains four species, M. albus, M. tauricus, M. wolgicus and M. speciosus, and the other species compose the yellow group (Table 4). Our results showed no obvious link between flower color and the molecular phylogenetic classification.
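The informative-character percentages compared in this Discussion (9.8% for ITS versus 3.0% for cpDNA) can be illustrated with a small calculation. The sketch below assumes these figures are the fraction of aligned sites that are parsimony-informative, using the standard definition that a site is informative when at least two character states each occur in at least two sequences. Treating gaps, "?" and "N" as missing data is an assumption made here, the function names are hypothetical, and the toy sequences are not data from the study.

```python
from collections import Counter


def is_parsimony_informative(column, ignore="-?N"):
    """True if at least two states each occur in at least two sequences.

    Assumption: gaps and missing or ambiguous symbols in `ignore` are skipped.
    """
    counts = Counter(c for c in column.upper() if c not in ignore)
    return sum(1 for n in counts.values() if n >= 2) >= 2


def informative_fraction(sequences):
    """Return (informative sites, total sites, fraction) for an alignment."""
    lengths = {len(seq) for seq in sequences}
    if len(lengths) != 1:
        raise ValueError("all aligned sequences must have the same length")
    n_sites = lengths.pop()
    informative = sum(
        is_parsimony_informative("".join(seq[i] for seq in sequences))
        for i in range(n_sites)
    )
    return informative, n_sites, informative / n_sites


if __name__ == "__main__":
    # Toy alignment of four sequences and five sites.
    toy = ["AAGCT", "AAGTT", "ATGCA", "ATGTC"]
    informative, n_sites, fraction = informative_fraction(toy)
    print(f"{informative} of {n_sites} sites informative ({fraction:.1%})")
```

In the toy alignment only the second and fourth columns qualify: the first and third are constant, and the fifth has just one state shared by two or more sequences, so it cannot favor one tree over another under parsimony.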
Clarke studied the number and morphology of chromosomes in the genus Melilotus [46], reporting a chromosome number of 2n = 16. Karyotype analyses of all Melilotus species were conducted by Kita [47]. The 19 species examined are grouped into three types: A, B and C. Type B is further divided into Type B-1 and Type B-2 (Table 4). The grouping based on karyotype analyses indicates that the species within each type are closely related [47]. Except for M. elegans, clade 1 of the phylogenetic trees is consistent with Type A, and clade 2 comprises all Type B and Type C species. The molecular phylogenetic classification in our study thus supports the karyotype classification well. This consistency between molecular phylogeny and karyotype indicates that karyotype may carry significant phylogenetic signal in the genus Melilotus.
However, the phylogeography of Melilotus species and populations, which depends on their distributions around the world, remains largely unknown. Genetic diversity analysis within the genus Melilotus using SSR markers is ongoing in our group and will provide supplementary evidence on the interspecific relationships.