Past Climate Change and Plant Evolution in Western North America: A Case Study in Rosaceae

Species in the ivesioid clade of Potentilla (Rosaceae) are endemic to western North America, an area that underwent widespread aridification during the global temperature decrease following the Mid-Miocene Climatic Optimum. Several morphological features interpreted as adaptations to drought are found in the clade, and many species occupy extremely dry habitats. Recent phylogenetic analyses have shown that the sister group of this clade is Potentilla section Rivales, a group with distinct moist habitat preferences. This has led to the hypothesis that the ivesioids (genera Ivesia, Horkelia and Horkeliella) diversified in response to the late Tertiary aridification of western North America. We used phyloclimatic modeling and a fossil-calibrated dated phylogeny of the family Rosaceae to investigate the evolution of the ivesioid clade. We have combined occurrence- and climate data from extant species, and used ancestral state reconstruction to model past climate preferences. These models have been projected into paleo-climatic scenarios in order to identify areas where the ivesioids may have occurred. Our analysis suggests a split between the ivesioids and Potentilla sect. Rivales around Late Oligocene/Early Miocene (∼23 million years ago, Ma), and that the ivesioids then diversified at a time when summer drought started to appear in the region. The clade is inferred to have originated on the western slopes of the Rocky Mountains from where a westward range expansion to the Sierra Nevada and the coast of California took place between ∼12-2 Ma. Our results support the idea that climatic changes in southwestern North America have played an important role in the evolution of the local flora, by means of in situ adaptation followed by diversification.


Introduction
Understanding the influence of climate change on the evolution and distribution of the world's biota constitutes a major task in biology. An accurate estimation of how species have responded to changes in the past may enable us to better predict future responses to global warming, with far-reaching implications for the work of policy-makers and conservation biologists [1].
A suitable area for assessing the effect of climate change on plant evolution is western North America. This is a botanically diverse region, rich in both total species numbers and proportion of endemic species, and has undergone major climatic and geologic changes during the Cenozoic (the last 65 Ma). At the beginning of the Eocene (∼55.8-33.9 Ma) a warm and humid tropical climate prevailed in the region, but global cooling has since then gradually changed the conditions [2]. Onset of glaciation in Antarctica by the end of the Eocene was accompanied by rapid decline of global deep-sea temperatures [3]. Increased upwelling of cool Pacific ocean water off the Californian coast eventually led to summer drought by mid-Miocene (∼15 Ma) [4]. Global cooling also strengthened the westerlies [5], which increased winter precipitation after mid-Miocene (∼11.6 Ma). A Mediterranean type of climate, with summer droughts and winter precipitation, was in place in Late Miocene (∼10 Ma) [2]. Climate change in the area has been suggested to trigger the evolution of evening primroses (genus Oenothera, family Onagraceae) [6][7], but several questions remain concerning how general niche conservatism/lability has been in the area, and from which areas and habitat zones the local flora originated.
The 'ivesioids' are a well-supported plant clade [8][9] confined to western North America [10]. It is nested within Potentilla L. (cinquefoil) in the Rosaceae, a cosmopolitan family of large ecological and economic importance, which includes many edible fruits (apples, plums, cherries, pears, strawberries, almonds) as well as ornamentals (roses, firethorns, hawthorns). As currently circumscribed (Figures 1 and S1; [8][9][11][12]), the ivesioid clade includes more than 50 species classified in three genera: Ivesia, Horkelia and Horkeliella [10][13][14]. Common to many of them is that they grow under extremely dry conditions and have developed means to avoid drought (petrophily on protected rock faces, tolerance of alkalinity) or minimize water loss (increased pubescence, numerous minute leaflet segments in a tightly overlapping arrangement).
Potentilla sect. Rivales is the sister group of the ivesioids [8][9]. Species in this group preferably occupy seasonally inundated flats or lake and stream shores, and have a widespread distribution in the Northern hemisphere. In contrast, the ivesioid species usually reside in extremely arid regions, alpine habitats and sites with a Mediterranean type of climate in the Great Basin (Figure 2) and adjacent arid parts of western North America, and comprise many narrowly endemic species [10][13][14].
Phyloclimatic modeling [7][15][16][17][18][19] combines phylogenetic estimation of species relationships with bioclimatic models [20]. These models use climate data from known species locations to predict areas of suitable climate for that species, by projecting the models into a present-day climatic scenario. They can thus estimate the total potential distribution of species even when not all localities and populations have been sampled. Furthermore, different methods for ancestral state reconstruction can be used to reconstruct the climatic preferences for ancestral nodes in a dated phylogeny. Historical distributions regulated by climatic conditions can then be estimated by projecting the optimized models into past climate scenarios, leading to an estimate of ancestral distributions. These models can thus be used to evaluate the evolutionary importance of niche conservatism for producing the distribution of plant diversity seen today (e.g., [21]), and help predict how this diversity may be affected in the future by global warming.
The primary objective of this study is to test the hypothesis that species in the ivesioid clade evolved in response to late Tertiary development of dry conditions in western North America. Under such circumstances, we would expect its stem node to have originated in western North America, and that the crown age of this clade, reflecting the onset of diversification of dry-adapted species, is not older than the proposed time of the aridification in the region. To address this, we have performed a molecular dating analysis of a plastid phylogeny of Rosaceae to establish the age of the ivesioid clade and produced niche models for both extant species and well-supported nodes of the phylogeny. Projections of these models into palaeoclimatic scenarios were used to estimate the geographic origin of the group and to infer changes in geographical distributions over time.

Molecular data
A taxonomically representative set of sequences (selected to represent all subfamilies of Rosaceae; see Table 1) from the plastid matK and trnL-trnF intergenic spacer was downloaded from the National Center for Biotechnology Information (www.ncbi.nlm.nih.gov). Pisum sativum (Fabaceae) was chosen as outgroup for the analysis, and Rhamnus cathartica (Rhamnaceae) was also included as representative for another family in the order Rosales.
In addition, new sequences from species in genus Potentilla, including the ivesioid clade, were generated. The matK region was amplified and sequenced with the trnk-3914 FM primer [22] and the matK2R primer [23]. The trnL intron and the trnL-trnF intergenic spacer were amplified with the trnLc and trnLf primers [24]. Two additional primers, trnLe and trnLd, together with the PCR amplification primers were used for sequencing. The PCR amplification of the two regions was performed using 12.5 µl MasterAmp 2× PCR PreMix G (Epicentre Biotechnologies, Madison, Wisconsin, USA), 0.6 µM of the forward and reverse primers, 1 unit Thermoprime Plus DNA Polymerase (ABgene House, Epsom, UK), 1 µl template DNA and purified water to a final volume of 25 µl.
The PCR mix was heated to 95°C for 5 minutes followed by 35-45 cycles of a denaturation step at 95°C for 30 seconds, annealing at 55°C for 30 seconds and extension for 1 minute (trnL/F) or 2 minutes (matK) at 72°C. The program ended with an additional 10 minutes (trnL/F) or 7 minutes (matK) extension step at 72°C. The resulting PCR products were sequenced by Macrogen Inc. (Seoul, Korea). The matK and trnL/F sequences (Table 1) were then aligned separately with mafft-linsi v.6.717b [25] and subsequently concatenated into a common matrix.

Phylogenetic inference and Molecular dating
Figure 1. Molecular chronogram of Rosaceae. Maximum clade credibility tree obtained from 25000 post burn-in Bayesian chronograms generated in BEAST, with median branch lengths. Grey bars at nodes represent 95% Highest Posterior Densities of node ages. The red dots indicate age constraints used for the analysis; (1) The split between Rosales and Fabales was constrained to an age of 104-115 Ma based on a previous analysis [31], and (2) a Crataegites borealis fossil was used to set a conservative minimum age of 85.8 Ma on Rosaceae [32,34]. Subclades of Rosaceae were calibrated using fossil data from (3)
We investigated whether the sequences evolved in a clocklike way by generating a neighbor joining tree in PAUP [26] and comparing Maximum Likelihood scores calculated from the data with and without enforcing a molecular clock. A likelihood ratio (LR) test was then performed with $LR = 2(L_{\text{mol. clock enforced}} - L_{\text{no mol. clock enforced}})$ and assumed to be distributed as a $\chi^2$ with $S-2$ degrees of freedom, $S$ being the number of taxa in the dataset. Since the LR test rejected a molecular clock (p<0.001), we chose to estimate divergence times with the relaxed clock algorithm implemented in the software BEAST v.1.6.1 [27] using the beagle library for likelihood calculations [28]. Fourteen runs of 10 million generations were performed, assuming an uncorrelated lognormal clock model and a pure birth (Yule) process under the GTR+Γ model, sampling every 2500th generation. The nucleotide substitution model was selected using the program MrAic [29] and the Akaike information criterion. Performance of the analysis (convergence of the independent runs and effective sample sizes for all sampled parameters) was evaluated using Tracer v.1.5 [30], after which 2500 trees were removed from each of the fourteen tree sets as the initial burn-in.
Median and 95% Highest Posterior Density (HPD) intervals of node ages were then calculated from the remaining 21000 trees using the software TreeAnnotator v.1.6.1 [28].
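The number of post burn-in trees follows from simple bookkeeping over the run settings given above. The snippet below is a minimal sketch, assuming that each run logs exactly one tree per sampling interval and that the initial state is not counted (tree logs often also contain state 0, which would add one tree per run); the function and constant names are illustrative only.

```python
# Bookkeeping for the post burn-in tree sample described in the text.
# Assumption: one tree is logged per sampling interval; state 0 is not counted.
N_RUNS = 14
GENERATIONS_PER_RUN = 10_000_000
SAMPLE_EVERY = 2_500
BURNIN_TREES_PER_RUN = 2_500


def post_burnin_trees(n_runs, generations, sample_every, burnin_trees):
    """Return the number of trees left after discarding burn-in from every run."""
    trees_per_run = generations // sample_every
    if burnin_trees > trees_per_run:
        raise ValueError("burn-in exceeds the number of sampled trees")
    return n_runs * (trees_per_run - burnin_trees)


remaining = post_burnin_trees(
    N_RUNS, GENERATIONS_PER_RUN, SAMPLE_EVERY, BURNIN_TREES_PER_RUN
)
print(remaining)  # 14 * (4000 - 2500) = 21000
```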
Calibration. The crown age of the tree, corresponding to the split between Fabales (here represented by Pisum) and Rosales (all other species), was set as a uniform prior between 104 and 115 Ma. This interval corresponds to the lower age estimate for the Rosales stem lineage and the upper age estimate for the Rosid crown group, respectively, as inferred in BEAST in a large fossil-based dating analysis of the Rosids [31]. Although this maximum age may be incorrect, in the absence of further evidence we consider this a conservative assumption since the Rosales clade has a well-supported position within the Rosids (Figure 1 in [31]). In addition, seven carefully chosen fossils were used to impose minimal age constraints on the prior distributions. The oldest fossil assigned to the crown group of Rosaceae is Crataegites borealis [32] from the Kolyma area in Siberia. It belongs to the Bour-kemuss Formation of the Zyrianka Coal Basin and has been dated to Early Albian (99.6-112 Ma) in the Cretaceous, by stratigraphic methods ([33] and references therein). The 40Ar/39Ar dates for the geographically adjacent but stratigraphically younger Chauna group tephra were determined to fall within the Coniacian stage (85.8-89.3 Ma) in late Cretaceous [34]. Crataegites borealis is based on a number of very well preserved leaf imprints [32]. The similarity to modern-day leaves of Crataegus is striking and there is no obvious reason to dispute the taxonomic position of the fossils in the crown group of Rosaceae.
Fossils of Spiraea (Amygdaloideae) and Neviusia (Rosoideae) were found at Republic, Washington, USA [35] and dated to 48-49 Ma [36]. Representatives of the genera Holodiscus (Amygdaloideae; [37]) and Rosa (Rosoideae; [38]) are known from Florissant, Colorado and are dated to 34.1 Ma in Late Eocene [39]. Chamaebatiaria (Rosoideae) fossils belong to the Creede Flora, Colorado [40], and the formation in which they were found has been dated to early Late Oligocene (26.85 Ma [41]). The oldest fossils of the genus Potentilla (Rosoideae; [42]) are from brown coal strata in Lausitz, Germany, formed in Early-Middle Miocene (11.6-23.0 Ma; [43]). Reference to an older Potentilla fossil is given by Wolfe and Schorn [44] from the Creede Flora in North America (27.3 Ma). The fossil, a leaf imprint, was originally described as a member of Ranunculaceae by Axelrod [40] but was reclassified to Potentilla/Rosaceae by Wolfe and Schorn [44]. We have examined the photograph of this fossil and dispute its reclassification, choosing instead the younger European fossil for calibration of the genus. Macrofossils of Fragaria (Rosoideae) were found in the Beaufort formation, Prince Patrick Island in the Canadian Arctic [45]. The Beaufort formation is considered to be of the same age as the Lost Chicken tephra in Alaska dated to 2.9±0.4 Ma [46].

Species distribution data
Locality data for ivesioid species were downloaded from the Global Biodiversity Information Facility portal (www.gbif.org), Jepson Online interchange (ucjeps.berkeley.edu/interchange.html) and the Consortium of Pacific Northwest Herbaria websites (www.pnwherbaria.org). Duplicated data points were removed manually. If, in total, fewer than ten locations were found in the online databases, more locality data were collected from herbarium labels. The occurrence data were then plotted in a GIS using QGIS (http://qgis.org/), to verify that they agreed with current known distributions. In this way the data points were cleaned by visual inspection (e.g. samples from coastal species ending up in the ocean were excluded).

Climate scenarios
Climate datasets for present day conditions (experiment set named xakxu) and paleoclimatic scenarios for 10 Ma (xakfl), 8 Ma (xakxu) and 3 Ma (xaiud) were provided by the BRIDGE project (www.bridge.bris.ac.uk/resources/simulations). Each dataset contained the sixteen climate variables listed in Table 2.

Selection of climate variables
Various strategies have been proposed for selecting climate variables to use in bioclimatic modeling. Methods for selecting or rejecting variables have included the quantification of variable contribution to the model, or, specifically for phyloclimatic modeling [19], an assessment of the phylogenetic conservatism of individual variables, but most have investigated the correlation of prospective variables [15][47][48]. Correlated climate variables will emphasize certain climate components (e.g. temperature or precipitation) if included in the analysis, and potentially result in incorrect inference of climate models. Thuiller [48] used principal component analysis to select uncorrelated variables for the models and Beaumont et al. [47] evaluated several different methods, including random sampling of variables, to assess the extent to which parameter choice influenced the predicted areas. The latter investigation showed that the size of the predicted area of distribution decreased when more climate variables were included in the analysis. Hence, selecting climate variables is an important step in inference of ancestral distribution areas.
We have used a novel method to exclude correlated variables while taking their prediction power in the form of Area under the Receiver Operating Characteristic (AUC) values into account. AUC is a measure of how well a model discriminates between sites where a species is present, compared to where it is absent [49]. The values range from 0 to 1, where a score of 1 indicates a perfect prediction of distribution and a score of 0.5 equals a random prediction of sites [50]. We produced bioclimatic models for each of the 38 species in the ivesioid clade, using one of the sixteen variables at a time. We recorded the AUC value for each model, and hence, each climate variable given a particular species, and the number of environmentally unique occurrence points for each analysis. An environmentally unique occurrence point is a species location with a value not previously sampled by the projected model. We then calculated the mean AUC value from each analysis with ten or more environmentally unique locations.
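The AUC and the averaging step described above can be illustrated in a few lines. This is a minimal sketch of the general rank-based definition of AUC, not the routine used by the modeling software; the function names, the score lists and the pairing of each AUC value with a count of environmentally unique points are assumptions made for the example.

```python
def auc(presence_scores, absence_scores):
    """Rank-based AUC: probability that a presence site outscores an absence site.

    Ties contribute 0.5, so a model that cannot discriminate scores 0.5.
    """
    if not presence_scores or not absence_scores:
        raise ValueError("both score lists must be non-empty")
    wins = 0.0
    for p in presence_scores:
        for a in absence_scores:
            if p > a:
                wins += 1.0
            elif p == a:
                wins += 0.5
    return wins / (len(presence_scores) * len(absence_scores))


def mean_auc(results, min_unique_points=10):
    """Average AUC over analyses with enough environmentally unique points.

    `results` is a list of (auc_value, n_unique_points) pairs (hypothetical layout).
    Returns None when no analysis passes the threshold.
    """
    kept = [value for value, n_unique in results if n_unique >= min_unique_points]
    return sum(kept) / len(kept) if kept else None


# Hypothetical suitability scores for illustration only.
perfect = auc([0.9, 0.8], [0.2, 0.1])      # every presence outscores every absence
uninformative = auc([0.5, 0.5], [0.5, 0.5])  # all ties
```

With these toy inputs the first model separates presences from absences completely and the second cannot separate them at all, matching the interpretation of scores of 1 and 0.5 given above.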
Correlation between the sixteen climate variables was then assessed using the function cor in the statistical package R [51]. Variable pairs with correlation coefficients greater than 0.8 were identified and the climate variable with the lowest AUC value was excluded (Table 2). The four variables remaining after this exclusion process were used to build the bioclimatic models for the extant species and the ancestral nodes.
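One way to read this exclusion rule is as a greedy filter: variables are visited from highest to lowest mean AUC, and a variable is kept only if it is not strongly correlated with any variable already kept. The sketch below follows that reading. It is an illustration under stated assumptions rather than the procedure actually run in R: the order of comparisons, the use of the absolute value of Pearson's r, and all variable names and numbers are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def select_variables(values, auc_by_variable, threshold=0.8):
    """Keep the higher-AUC member of every strongly correlated pair.

    Assumption: |r| is compared with the threshold, and variables are
    visited in order of decreasing mean AUC.
    """
    ranked = sorted(values, key=lambda name: auc_by_variable[name], reverse=True)
    kept = []
    for name in ranked:
        if all(abs(pearson(values[name], values[k])) <= threshold for k in kept):
            kept.append(name)
    return kept


# Hypothetical toy data: two temperature layers that track each other closely.
values = {
    "temp_mean": [1.0, 2.0, 3.0, 4.0, 5.0],
    "temp_coolest": [1.1, 2.1, 2.9, 4.2, 5.0],
    "precip_warmest": [3.0, 1.0, 4.0, 1.0, 5.0],
}
auc_by_variable = {"temp_mean": 0.70, "temp_coolest": 0.85, "precip_warmest": 0.75}
selected = select_variables(values, auc_by_variable)
print(selected)
```

In this toy case the two temperature layers are almost perfectly correlated, so only the one with the higher mean AUC survives alongside the precipitation layer.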

Bioclimatic models for extant species
Locality data together with the four selected climate variables (Table 2) were used to define the climate preferences for each of the 38 ivesioid species, using the Envelope Score algorithm implemented in OpenModeller v.1.1.0 (openmodeller.sourceforge.net). The Envelope Score is a modified version of the Bioclimatic Envelope Algorithm (Bioclim) that uses the observed maximum and minimum values in each environmental variable to determine the climate preferences for a taxon [20]. These preferences, called the bioclimatic envelope, can then be projected into a climate scenario to identify areas with a suitable climate for the taxon. The probability of a suitable environment in the projected model is determined by the number of layers with a value within the min-max threshold, divided by the total number of layers in the model [52].
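The scoring rule described in this paragraph is simple enough to write out directly. The following is a minimal sketch of that rule as stated in the text, not OpenModeller's implementation; the dictionary layout, the variable names and all climate values are hypothetical.

```python
def build_envelope(occurrences):
    """Return {variable: (min, max)} from a list of per-locality climate records."""
    variables = occurrences[0].keys()
    return {
        v: (min(o[v] for o in occurrences), max(o[v] for o in occurrences))
        for v in variables
    }


def envelope_score(envelope, cell):
    """Fraction of layers whose value in `cell` lies inside the min-max range."""
    inside = sum(1 for v, (low, high) in envelope.items() if low <= cell[v] <= high)
    return inside / len(envelope)


# Hypothetical climate values at three localities of one species.
localities = [
    {"temp_sd": 7.0, "temp_coolest": -4.0, "precip_coolest": 1.2, "precip_warmest": 0.3},
    {"temp_sd": 8.5, "temp_coolest": -6.5, "precip_coolest": 1.8, "precip_warmest": 0.5},
    {"temp_sd": 7.8, "temp_coolest": -5.0, "precip_coolest": 1.5, "precip_warmest": 0.2},
]
envelope = build_envelope(localities)

# A grid cell that matches three of the four layers (too wet in the warmest month).
cell = {"temp_sd": 8.0, "temp_coolest": -5.5, "precip_coolest": 1.4, "precip_warmest": 0.9}
score = envelope_score(envelope, cell)
print(score)  # 0.75
```

With four layers the score can only take the values 0, 0.25, 0.5, 0.75 and 1, which is why suitable areas are described in terms of how many variables support them.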
The Bioclim methodology treats the environmental parameters independently of each other. This is a prerequisite for the ancestral state reconstruction, where each variable in the bioclimatic envelope has to be optimized independently with currently available methods. Also, the simplicity of the algorithm makes it possible to combine these optimized variables into an ancestral bioclimatic envelope. More complex algorithms do not permit this independent treatment of variables as they attempt to account for the correlation between variables, and have therefore not been used for phylogenetic niche modeling [19].

Ancestral state reconstruction
Ancestral climate preferences were reconstructed for each node in the ivesioid phylogeny (Figures 3 and S2) using the function ace in the package ape [53] of the statistical program R [51]. Independent optimizations were done for the maximum and the minimum values of each variable by fitting a Brownian motion model using Maximum Likelihood optimization [54]. Optimized models for each node are presented in Table 3.
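To show what a Brownian motion reconstruction does with a single continuous character, the sketch below computes a root estimate with Felsenstein's pruning recursion, in which each node value is an average of its descendants weighted by inverse branch length. This is an illustration only and not the ace routine: it yields the Brownian motion estimate at the root of a tree, whereas estimates for internal nodes from a full Maximum Likelihood fit also use information from the rest of the tree. The tree, branch lengths and trait values are hypothetical, and branch lengths are assumed to be strictly positive.

```python
def bm_estimate(node):
    """Return (estimate, extra_variance) for the subtree rooted at `node`.

    A tip is {"value": x, "length": v}; an internal node is
    {"children": [...], "length": v}. `length` is the branch leading to the node.
    """
    if "value" in node:
        return node["value"], 0.0
    numerator = 0.0
    denominator = 0.0
    for child in node["children"]:
        estimate, extra = bm_estimate(child)
        effective_length = child["length"] + extra  # branch plus inherited uncertainty
        numerator += estimate / effective_length
        denominator += 1.0 / effective_length
    return numerator / denominator, 1.0 / denominator


# Hypothetical tree ((A:1, B:1):1, C:2) with one climate limit per tip,
# e.g. the minimum of a temperature variable. The same call would be
# repeated for the maximum and for each of the other variables.
tree = {
    "length": 0.0,
    "children": [
        {
            "length": 1.0,
            "children": [
                {"value": 10.0, "length": 1.0},
                {"value": 14.0, "length": 1.0},
            ],
        },
        {"value": 20.0, "length": 2.0},
    ],
}
root_value, _ = bm_estimate(tree)
print(round(root_value, 3))
```

Running the recursion once for the minimum and once for the maximum of each variable gives the two limits that define an ancestral envelope for that variable.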

Ancestral bioclimatic model
The optimized maximum and minimum values for the four climate variables were used to build bioclimatic models for all nodes in the ivesioid clade. Models for nodes with a posterior probability higher than 0.95 were then projected into the climate scenario that corresponded best in time with the age of the node, as follows: nodes 40, 41, 43 and 71 were projected into the climate scenario for 10 Ma; node 57 into the 8 Ma scenario; and nodes 49, 54, 55, 58, 66 and 73 into the climate scenario for 3 Ma.
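The assignment of nodes to scenarios can be reproduced with a nearest-age rule. The sketch below assumes that "corresponded best in time" means the scenario with the smallest absolute age difference, which is an interpretation rather than a stated rule; it uses only the median node ages quoted elsewhere in this paper (nodes 49, 54 and 55 are omitted because their ages are not given in the text).

```python
# Ages (Ma) of the available palaeoclimate scenarios.
SCENARIOS_MA = [10.0, 8.0, 3.0]

# Median node ages (Ma) as quoted in the Results and Discussion.
# Node 43 is reported as 12.2 or 12.3 Ma; either value gives the same result.
NODE_AGES_MA = {40: 23.4, 41: 17.7, 43: 12.3, 71: 10.7, 57: 7.6, 66: 4.5, 73: 2.0, 58: 1.7}


def nearest_scenario(age_ma, scenarios=SCENARIOS_MA):
    """Return the scenario age closest to the node age (assumed matching rule)."""
    return min(scenarios, key=lambda s: abs(s - age_ma))


assignment = {node: nearest_scenario(age) for node, age in NODE_AGES_MA.items()}
for node, scenario in sorted(assignment.items()):
    print(f"node {node}: {scenario:g} Ma scenario")
```

Under this rule the nodes with known ages fall into the same three groups as listed above, although nodes 40 and 41 are considerably older than the oldest available scenario.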
The same models were also projected into present-day climate data to evaluate whether the variation in predicted geographic area between nodes in the tree depends on variation in the climate scenarios or in the inferred models. By keeping one of these variables constant (in this case the climate scenario), any variation in the inferred area with a suitable climate will depend on the inferred model. Differences between optimised models can be visualised in this way. This analysis was performed to identify shifts in climate preferences during the evolution of the ivesioids. Additionally, projecting the models for extant taxa and the ancestral niches into the same present-day climate scenario permits a visual comparison of the differences between the extant and ancestral niches.

Test of models
AUC values for all niche models of extant taxa were calculated. A test of the correlation of the projected surfaces for all extant taxa was performed using the niche.overlap tool in the phyloclim package [55] of R [51]. Additionally, the age.range.correlation tool (also in phyloclim) was used to test for correlation between the niche overlap of two taxa and age to their most recent common ancestor (MRCA).
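Niche overlap between two projected surfaces is commonly summarised with Schoener's D and a Hellinger-distance-based I statistic, both computed on suitability surfaces normalised to sum to one. The sketch below implements those general definitions and is not the phyloclim code; that niche.overlap uses exactly these formulas is an assumption here, and the toy surfaces are hypothetical.

```python
from math import sqrt


def normalise(surface):
    """Scale non-negative suitability values so that they sum to one."""
    total = sum(surface)
    if total <= 0:
        raise ValueError("surface must contain positive suitability")
    return [value / total for value in surface]


def schoener_d(surface_x, surface_y):
    """D = 1 - 0.5 * sum(|p_x - p_y|) over grid cells."""
    px, py = normalise(surface_x), normalise(surface_y)
    return 1.0 - 0.5 * sum(abs(a - b) for a, b in zip(px, py))


def hellinger_i(surface_x, surface_y):
    """I = 1 - 0.5 * sum((sqrt(p_x) - sqrt(p_y))^2) over grid cells."""
    px, py = normalise(surface_x), normalise(surface_y)
    return 1.0 - 0.5 * sum((sqrt(a) - sqrt(b)) ** 2 for a, b in zip(px, py))


# Hypothetical suitability values over the same four grid cells.
species_a = [1.0, 0.75, 0.0, 0.0]
species_b = [0.0, 0.0, 0.5, 1.0]
print(schoener_d(species_a, species_a), hellinger_i(species_a, species_a))
print(schoener_d(species_a, species_b), hellinger_i(species_a, species_b))
```

Both statistics equal 1 for identical surfaces and 0 for surfaces that share no suitable cells, so the two toy comparisons above mark the ends of the scale.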

Phylogenetic inference and Molecular dating
The dated phylogeny of the Rosaceae family (Figures 1 and S1) identified the three subfamilies Rosoideae, Amygdaloideae and Dryadoideae as monophyletic. Except for the position of the species Lyonothamnus floribundus, the tribes presented by Potter et al. [56] were also in congruence with our phylogeny. Potentilla, and the ivesioids were inferred to be monophyletic.
The estimated 95% Highest Posterior Density (HPD) of the crown age of Rosaceae was 108.

Bioclimatic models
Species distribution data. The number of occurrence points used varied between 10 and 256 for 36 of the 38 species (Table 4). Locality data for two species, Ivesia longibracteata (five points) and I. cryptocaulis (two points), were still less than the desired ten data points after the dataset had been complemented with data from herbarium collections.
Climate variables. The four climate variables selected to build the models by analyzing correlation and AUC values were Standard deviation of mean temperature, Mean temperature in coolest month, Mean daily precipitation in coolest month and Mean daily precipitation in warmest month. Table 4 shows the maximum and minimum values for the four climate variables for all included species. A visualisation of one character mapped onto the final phylogeny is shown in supplementary figure S2.
Bioclimatic models, extant species. The projected areas range from the restricted species such as I. utahensis, which is an endemic to northern Utah, up to the wide-ranging species I. kingii, which finds climatically suitable areas in part of the Great Basin. The niche correlation analysis produced D and I correlation coefficients [57] for each pairwise comparison of species. Coefficients range from 0 to 1, signifying low to high correlation between age to most recent common ancestor and niche overlap.
Bioclimatic models, ancestral nodes. Figure 3 shows the fully resolved maximum clade credibility sub-tree of the ivesioids from the BEAST analysis. Ten of the branches have a posterior probability greater than 0.95 and are subjects for further investigation.
Projections into palaeoclimatic scenarios. The reconstructed ancestral climate models, projected into their respective climate scenarios, are shown in figure 4. Node 40 is the MRCA of the ivesioid species and its sister clade Potentilla sect. Rivales, and hence represents the age of the ivesioid stem lineage. This lineage emerged at 23.4 Ma and is shown to diverge at 17.7 Ma (crown age; node 41 in figure 3). The bioclimatic model for node 40 projected into a climate scenario from 10 Ma indicates an area of suitable climate from where the clade could have evolved (areas marked in red in Figure 5a). Most of Utah, parts of Nevada, Arizona, Colorado and New Mexico are inferred to have had a suitable climate by all four variables.
In node 41, the suitable area inferred by all four climate variables has decreased, but still includes parts of Utah. Due to low support for the topology of the tree, the two well-supported clades (A and B in Figure 3) as well as four taxa with uncertain position (I. lycopodioides, I. longibracteata, I. jaegeri and I. baileyi) are treated as being derived from this node. For the radiation in clade A (node 71), three variables inferred a suitable climate in Northeastern Nevada, Northern Arizona and New Mexico. The model for the other supported node in clade A (node 73) inferred the Northern parts of the Sierra Nevada, Northwest Nevada and Southwestern Idaho as having had a suitable climate.
The inferred suitable area for node 43, MRCA of clade B, resembles that of node 41, but is weaker (only yellow areas in figure 4, map 43) and slightly more southern. A westward movement of suitable climate is seen in nodes 54, 55, 58 and 66, which have models predicting large parts of the Sierra Nevada and the coast of Northern California. The projected models for two nodes do not corroborate this westward movement of a suitable climate. They are Node 57, with a large part of the Great Basin, Western Montana and parts of Arizona and Canada inferred, and node 49 with only a small part of Southeastern Oregon inferred by three climate variables. Most models also show a weak support for a suitable climate on the East coast of North America and Europe (data not shown).
Projections of ancestral models into present-day climate. Projections of the ancestral model for the MRCA with P. biennis (node 40) into present-day climate show that the ivesioids originate from a climate corresponding to what is now found in the Sierra Nevada, Nevada, Southwestern Oregon and Northeast Arizona (Figure 5b). The preference for present-day central Sierra Nevada climate prevails for all nodes in clade B (Figure 6; Maps 49, 54, 55, 57, 58 and 66) and is only slightly weakened for nodes 55, 58 and 66. The three latter nodes have an affinity for a climate found around the San Bernardino mountains in the south. As in the projections into palaeoclimate scenarios, there is a shift in climate preferences that includes the type of climate now found along the coast of California, sometime after 12.2 Ma (node 43).

Phylogenetic inference and Molecular dating
The dated phylogeny of Rosaceae is congruent with previous analyses of the relationships in the family (Figures 1 and S1). The topology of the Potentilla clade was also congruent with that reported by Dobes and Pauli [9] and Töpel et al. [8] with few exceptions.

Locality data
Locality data can be highly influential on the model predictions for extant species [58]. It is important to use locality data from all climate regions occupied by the species to be able to create a model that predicts the true climate preferences for the species. Still, fewer than ten locality points were used in the analyses for I. longibracteata and I. cryptocaulis. Instead of following the procedure of Evans et al. [7] and manually adding extra points from these areas, we used only the observed locations, and thereby violated the rule of thumb of only including taxa with more than 10 or 20 data points in the analysis (10 points [19], 10-20 points [58]). The two species are both narrow endemics, with the former known only from Castle Crags (41.17°N, 122.33°W) in the Trinity Mountains, California, and the latter from the summit of Mt. Charleston (36.3°N, 115.6°W) in Spring Mountains, Nevada (Barbara Ertter, personal communication). We manually analyzed the climate data from these areas, and found that adding more occurrence points from the known area of distribution would sample the same values as were already in the model. In effect, each of the two species occupies only one climatic niche (at the scale of our data) in its known area of distribution. The Envelope Score algorithm, used to build the bioclimatic models, only uses the observed minimum and maximum value for each environmental variable to define the bioclimatic envelope of a species. Adding more points from these areas would therefore not change the models. The bioclimatic models for I. longibracteata and I. cryptocaulis might nevertheless predict too narrow an area of suitable climate if the distribution of these two species is not limited by the climate. Our primary goal is not to model extant species, but rather to reconstruct ancestral models. We therefore believe that it is better to include these minimal models than to exclude these species from the analysis.

Origin of the ivesioid clade
The crown age of the ivesioid clade (24.3-12.1 Ma; median 17.7 Ma) corresponds to the time when summer drought started to appear in western North America [2]. This supports the hypothesis that the group evolved in response to the Miocene aridification of western North America. Furthermore, the area where the ivesioids are inferred to have originated includes the eastern parts of the Great Basin and the western side of the Rocky Mountains (Figure 5a). This region represents the eastern extension of the present day distribution of the group. Potentilla biennis, sister species of the ivesioids, has a distribution from the Sierra Nevada in the west to North Dakota in the east, and from southern British Columbia and Oregon in the north to Arizona in the south. Hence, both species in the ivesioid clade and in the sister group Potentilla sect. Rivales can still be found in their optimized ancestral area. In addition, an area outside of the present area of distribution, corresponding to the southeastern parts of North America, is inferred by three variables to have had a suitable climate (yellow area in figure 5a). The hypothesis that the ancestor of the ivesioid clade evolved on the east side of the Rocky Mountains and migrated to the Great Basin, the Sierra Nevada and the coast of California is less parsimonious than an origin and diversification in the Great Basin. The stronger prediction of the Great Basin (red areas in figure 5a) also supports this notion.
The ivesioid clade is inferred to have originated in a climate resembling that of present-day western Nevada, the Sierra Nevada and southeast Oregon (Figure 5b), which indicates that the ancestor had fairly wide climate preferences. However, this result may be due to limitations in the method used for the ancestral state reconstruction and may be unduly influenced by the outgroup, Potentilla biennis, which has a relatively wide niche. This is a generic problem with ancestral state reconstruction, but any artificial widening of the ancestral niche preferences would still encompass the 'true' niche.

Diversification in the ivesioid clade
The ancestral niche models for nodes older than 10 million years (node 71 in clade A and node 43 in clade B, as well as the nodes 40 and 41, Figure 4), more or less uniformly infer the central Great Basin as the ancestral area. These models are all projected into the same climate scenario, permitting a direct comparison without additional uncertainty caused by potentially conflicting palaeo-climate layers. Furthermore, the geographic areas identified when these models are projected into the present-day climate scenario are also very similar (Figure 6). Hence, the result from our analyses suggests that the diversification of the group and the emergence of the two clades A and B at approximately 17.7 Ma was not driven by climate change or a shift in climate preferences. This is supported by the low correlation between age to MRCA and niche overlap, and uniformly low niche correlation within and between clades. The split may instead have been associated with a shift in pollination syndrome.
Clade A consists of species with flowers that have shallow hypanthia and narrow filaments. Their morphology points towards a pollination syndrome involving small flies and beetles [59].
In contrast, clade B mostly consists of species with wide and flattened filaments, forming a cone on top of a deep hypanthium and are pollinated by bees or bumblebees [14]. Most species in genus Potentilla have shallow hypanthia and narrow filaments, and an adaptation to a bee pollination syndrome in clade B could have been an important force for this split.
Clade A. Only one clade with a posterior probability greater than 0.95 was found within clade A. It includes the two species Ivesia santolinoides and I. unguiculata, which do not occur in the Great Basin. Instead, these species have the most westerly distribution of the species in clade A, and are found only in the Sierra Nevada and adjacent mountain ranges. The rest of the species in clade A are mainly confined to the interior of the Great Basin. From the MRCA of clades A and B (node 41) and further into clade A, there is a narrowing of climate preferences, and a more westerly area of suitable climate is inferred between 10.7 Ma (node 71) and 2.0 Ma (node 73). Projecting these models into present-day climate shows that the optimized climate models change only slightly (Figure 6). The detected westward shift of suitable area is therefore probably due to differences in the underlying paleoclimate scenarios used for the different nodes, and does not represent a change in climate preferences.
Clade B. A similar pattern is seen in clade B. The ancestral area with a suitable climate is inferred to be the interior of the Great Basin until 7.6 Ma (Figure 4; Maps 41, 43 and 57) for at least part of the clade. Furthermore, projections into present-day climate demonstrate that the climate preferences of the ancestral nodes remained relatively stable until that time (Figure 6; Maps 41, 43 and 57). At 4.5 Ma we find the earliest indication of a preference for the climate of coastal California (Figure 4; Map 66). This type of climate preference appears in several places in the tree after 4.5 Ma (Figure 4; Maps 54, 55 and 58). Hence, a westward migration, as seen in clade A, is also inferred to have happened in clade B, but it continued past the Sierra Nevada to the coastal areas of California. The Mediterranean type of climate of this area emerged approximately 10 Ma [2]. It is therefore reasonable to believe that species in clade B have found suitable habitats in the coastal regions of California on at least two occasions: between 12.3 Ma (node 43) and 4.5 Ma (node 66), and between 7.6 Ma (node 57) and 1.7 Ma (node 58).
Niche conservatism. We observe a general pattern of niche conservatism amongst earlier lineages, until around 7.5-5 Ma (e.g., node 57 in Figure 3). There follows a greater amount of niche partitioning amongst related lineages, including a transition towards the coastal Mediterranean type climate in parts of clade B. This partitioning is evident for extant taxa, as there are low levels of niche overlap between sister species. If sister species shared more similar niches, we would expect to see a correlation between niche overlap and age to MRCA (i.e., more closely related species would have more similar niches), but this is not the case. The mean niche similarity within clades A and B is similar to the overall niche similarity for all species, so there is no major niche differentiation between clades. Many species pairs in clade B, such as I. shockleyi+I. sericoleuca and I. kingii+I. cryptocaulis, follow a schizo-endemic distribution pattern, i.e., one wider-ranging species sister to a narrow endemic, but these sister groupings receive low support. This pattern has been reported for a number of plant groups in the Mediterranean region [60], and has been interpreted as the wider-ranging species being progenitor to the local endemic. Our results corroborate the generality of this pattern, which should be especially important in areas containing distinct micro-habitats (e.g., moist rock crevices in the middle of a wide arid zone, as observed for the ivesioids). We suggest that this may be an underestimated process in plant evolution, which could potentially explain at least some of the plant species richness observed today as well as the uneven distribution of certain species as compared to others that are closely related.
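The comparison between niche overlap and age to MRCA can be illustrated with a minimal Python sketch. It assumes that niche overlap is measured with Schoener's D on suitability values over a shared set of grid cells, and that the relationship is summarized with a Pearson correlation; neither choice is confirmed by the text above, and the function names are hypothetical.

```python
from math import sqrt

def schoener_d(p, q):
    """Schoener's D for two suitability surfaces over the same grid cells.

    Assumption: p and q are equal-length sequences of non-negative
    suitability scores. Returns 1.0 for identical niches, 0.0 for no overlap.
    """
    if len(p) != len(q):
        raise ValueError("surfaces must cover the same grid cells")
    sp, sq = sum(p), sum(q)
    if sp <= 0 or sq <= 0:
        raise ValueError("each surface needs a positive total suitability")
    # Normalize each surface to sum to 1, then compare cell by cell.
    return 1.0 - 0.5 * sum(abs(a / sp - b / sq) for a, b in zip(p, q))

def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)
```

For each species pair, one would compute `schoener_d` from the two niche models and pair it with the age of the pair's MRCA; a strongly negative `pearson_r` across pairs would indicate that closely related species have more similar niches. Because pairwise values share species and are not independent, a permutation-based test such as a Mantel test would be a more defensible way to assess significance.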

Conclusions
The phyloclimatic evolution of the ivesioids inferred here provides temporal and spatial support for the hypothesis that this group evolved in response to the late Tertiary development of dry conditions in western North America. The age of the MRCA of the clade (24.3-12.1 Ma; median 17.7 Ma), in the Early-Middle Miocene, coincides with the time when summer drought began in western North America. The hypothesis is further supported by the fact that the eastern parts of the Great Basin and the western slopes of the Rocky Mountains are inferred to have been the ancestral area of the clade. No other part of North America is strongly inferred to have had a suitable climate for the ancestor at this node; thus, migration into the Great Basin from areas not presently occupied by ivesioid species is unlikely.
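An age range of this kind is commonly summarized as a highest posterior density (HPD) interval over the sampled node ages. The following Python sketch shows one common way to obtain such an interval, assuming the range above is a 95% HPD computed as the shortest window containing the requested share of posterior samples; the function name is hypothetical, and this is not necessarily the exact procedure used by BEAST.

```python
from math import ceil

def hpd_interval(samples, mass=0.95):
    """Shortest interval containing at least `mass` of the samples.

    Assumption: `samples` are posterior draws of a single node age (Ma).
    """
    xs = sorted(samples)
    n = len(xs)
    if n == 0:
        raise ValueError("need at least one sample")
    # Number of draws the interval must contain (small guard for float error).
    k = max(1, ceil(mass * n - 1e-9))
    # Slide a window of k consecutive sorted draws and keep the narrowest one.
    start = min(range(n - k + 1), key=lambda i: xs[i + k - 1] - xs[i])
    return xs[start], xs[start + k - 1]
```

Unlike an equal-tailed interval, the shortest-window approach excludes a long tail on one side when the posterior is skewed, which is often the case for node ages bounded by minimum-age fossil constraints.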
A shift in pollination syndrome possibly led to diversification of the ivesioids at approximately 17.7 Ma. The resulting two clades experienced a westward range expansion from the foothills of the Rocky Mountains and the central Great Basin to the Sierra Nevada: between 10.7 and 2.0 Ma in clade A, and on at least two occasions, between 12.3 and 4.5 Ma and between 7.6 and 1.7 Ma, in clade B. After a Mediterranean type of climate became established on the coast of California ∼10 Ma, several lineages crossed the Sierra Nevada and found new suitable habitats to exploit. Our results thus suggest that the evolution and current distribution of this morphologically aberrant and diverse group have to a large extent been influenced by past climate change.

Figure S1 Same molecular chronogram of Rosaceae as shown in Figure 1, but also including species names. Maximum clade credibility tree obtained from 25000 post burn-in Bayesian chronograms generated in BEAST, with median branch lengths. Grey bars at nodes represent 95% Highest Posterior Densities of node ages. The red dots indicate age constraints used for the analysis; (1) The split between Rosales and Fabales was constrained to an age of 104-115 Ma based on a previous analysis [31], and (2) a Crataegites borealis fossil was used to set a conservative minimum age of 85.8 Ma on Rosaceae [32,34]. Subclades of Rosaceae were calibrated using fossil data from (3) [42]. A uniform prior with a maximum age of 115 Ma was used for all calibration points. Also indicated are the tribes of Rosaceae (species highlighted in blue and yellow) as well as the ivesioid clade highlighted in red. Time scale from [61].