Genome downsizing, physiological novelty, and the global dominance of flowering plants

The abrupt origin and rapid diversification of the flowering plants during the Cretaceous has long been considered an “abominable mystery.” While the cause of their high diversity has been attributed largely to coevolution with pollinators and herbivores, their ability to outcompete the previously dominant ferns and gymnosperms has been the subject of many hypotheses. Common among these is that the angiosperms alone developed leaves with smaller, more numerous stomata and more highly branching venation networks that enable higher rates of transpiration, photosynthesis, and growth. Yet, how angiosperms pack their leaves with smaller, more abundant stomata and more veins is unknown but linked—we show—to simple biophysical constraints on cell size. Only angiosperm lineages underwent rapid genome downsizing during the early Cretaceous period, which facilitated the reductions in cell size necessary to pack more veins and stomata into their leaves, effectively bringing actual primary productivity closer to its maximum potential. Thus, the angiosperms' heightened competitive abilities are due in no small part to genome downsizing.


Author summary
The angiosperms, commonly referred to as the flowering plants, are the dominant plants in most terrestrial ecosystems, but how they came to be so successful is considered one of the most profound mysteries in evolutionary biology. Prevailing hypotheses have suggested that the angiosperms rose to dominance through an increase in their maximum potential photosynthesis and whole-plant carbon gain, allowing them to outcompete the ferns and gymnosperms that had previously dominated terrestrial ecosystems. Using a combination of anatomy, cytology, and modelling of liquid water transport and CO2 exchange between leaves and the atmosphere, we now provide strong evidence that the success and rapid spread of flowering plants around the world was the result of genome downsizing. Smaller genomes permit the construction of smaller cells that allow for greater CO2 uptake and photosynthetic carbon gain. Genome downsizing occurred only among the angiosperms, and we propose that it was a necessary prerequisite for rapid growth rates among land plants.

Introduction
The flowering plants are highly competitive in almost every terrestrial ecosystem, and their rapid rise during the early Cretaceous period irrevocably altered terrestrial primary productivity and global climate [1][2][3]. Terrestrial primary productivity is ultimately determined by the photosynthetic capacity of leaves. The primary enzyme in photosynthesis, rubisco, functions poorly when CO2 is limiting, which requires leaf intercellular CO2 concentrations (c i) to be maintained within a narrow range [4] through adjustments in leaf surface conductance to CO2 and water vapor. This surface conductance is one of the greatest biophysical limitations on photosynthetic rates across all terrestrial plants [5,6]. In order for CO2 to diffuse from the atmosphere into the leaf, the wet internal surfaces of leaves must be exposed to the dry ambient atmosphere, which can cause leaf desiccation and prevent further CO2 uptake. As a consequence, increasing leaf surface conductance to CO2 also requires increasing rates of leaf water transport in order to avoid desiccation [7]. Both theory and empirical data suggest that among all major clades of terrestrial plants, the upper limit of leaf surface conductance to CO2 and water vapor is tightly coupled to the biophysical limitations of cell size [8][9][10][11]. Cellular allometry, in particular the scaling of genome size, nuclear volume, and cell size, represents a direct physical constraint on the number of cells that can occupy a given space and, as a result, on the distance between cell types and tissues [12][13][14]. Because leaves with many small stomata and a high density of veins can maintain higher rates of gas exchange than leaves with fewer, larger stomata and larger, less numerous veins [15], variation in cell size can drive large changes in potential carbon gain. Without reducing cell size, increasing stomatal and vein densities would displace other important tissues, such as photosynthetic mesophyll cells [16]. Therefore, the densities of stomata on the leaf surface and of veins inside the leaf are inversely related to the sizes of guard cells and the primary xylem elements comprising them.
While numerous environmental and physiological factors can influence the final sizes of somatic eukaryotic cells, the minimum size of meristematic cells and the rate of their production are strongly constrained by nuclear volume, more commonly measured as genome size [17][18][19]. Among land plants, the bulk DNA content of cells varies by three orders of magnitude, with the angiosperms exhibiting both the largest range in genome size and the smallest absolute genome sizes [20]. Whole-genome duplications and subsequent genomic rearrangements, including genome downsizing, are thought to have directly contributed to the unparalleled diversity in anatomical, morphological, and physiological traits of the angiosperms [12,[21][22][23][24][25][26][27][28]. We extend this prior work and test the hypothesis that genome size variation is responsible not only for gene diversification but also directly limits minimum cell size and, thus, is the underlying variable constraining stomatal size and density and leaf vein density (D v ). Due to the strong influence of cell size on maximum potential carbon gain, the allometric scaling of genome size and cell size is predicted to directly influence primary productivity across all major clades of terrestrial plants [12,13,27,29].

Results and discussion
To determine whether genome downsizing among the angiosperms drove the anatomical and physiological innovations that resulted in their ecological dominance, we compiled data for genome size, cell size (guard cell length; l g ), stomatal density (D s ), and D v for almost 400 species of ferns, gymnosperms, and angiosperms. Consistent with prior studies and with our predictions, genome size varied substantially among major clades (Fig 1) and was a strong predictor of anatomical traits across the major groups of terrestrial plants even after accounting for phylogenetic relatedness of species (Fig 2, Table 1). Species with smaller genomes have smaller, more numerous stomata and higher leaf vein densities. Genome size explained between 31% and 54% of interspecific variation in l g, D s , and D v across the major groups of terrestrial plants, and both phylogenetic and non-phylogenetic analyses showed that a single relationship predicted each of these traits from genome size across all species (Table 1). In both phylogenetic and non-phylogenetic analyses there were strong, significant correlations between anatomical traits both among the major clades and within the angiosperms, highlighting the coordinated evolution of these traits throughout the history of seed plants (S1 Table).
Because genome size directly affects minimum cell size, variation in genome size has numerous consequences for the structure and organization of cells and tissues in leaves, which directly influence rates of leaf water loss (transpiration) and photosynthesis. Physical resistance to diffusion across leaf surfaces is ultimately determined by the sizes of epidermal cells, and the maximum diffusive conductance of CO2 and water vapor is higher in leaves with more numerous, smaller stomata [8,10,11]. While the effects of cell size on leaf epidermal properties have been well characterized, the effects of cell size on the efficiency of liquid water supply through the leaf are, perhaps, less obvious. Because the highest hydraulic resistance in the leaf occurs in the path between the veins and the sites of evaporation, shortening this path length by increasing D v reduces the resistance outside the xylem and increases leaf hydraulic conductance [7,30]. Given a constant leaf volume, increasing D v without displacing photosynthetic mesophyll cells requires reductions in vein and conduit sizes that can only be accomplished by decreasing cell size [16,31]. However, smaller conduits have higher hydraulic resistances. To overcome hydraulic limitations associated with reductions in conduit size, other innovations in xylem anatomy that reduce hydraulic resistance have been hypothesized to facilitate narrower xylem conduits and high D v . In particular, the development of low resistance end walls between adjacent cells is thought to have given angiosperms a hydraulic advantage as conduit diameters decreased. Only in angiosperm lineages with very high D v does the primary xylem have simple perforation plates, which have lower resistance to water flow than scalariform perforation plates [16]. Similarly, the low resistance of gymnosperm torus-margo pits compared to angiosperm pits can result in higher xylem-specific hydraulic conductivity for small diameter conduits [32].
In both cases, while smaller conduits have higher resistance, this potential cost has been offset by other innovations that reduce hydraulic resistance at the scale of the whole xylem network.
Fig 2 (caption excerpt). Phylogenetically corrected major axis regressions have similar slopes, R2, and p-values, and are shown in Table 1. Data can be found in S1 Data. D s , stomatal density; D v , leaf vein density; l g , guard cell length. https://doi.org/10.1371/journal.pbio.2003706.g002
Table 1. Non-phylogenetic standard major axis regressions of D v , l g , D s , g s, max , and g s, op versus genome size for all species and for each clade separately and phylogenetic standard major axis regressions. Asterisks indicate significance level: * p < 0.05; *** p < 0.001. No regressions were significant with p < 0.01.
Fig 3 (caption excerpt). Variation in vapor pressure deficit will affect the intercept of g s, op but not the slope. Points are omitted for clarity. Phylogenetically corrected major axis regressions are similarly significant and are reported in Table 1. g s, max , maximum stomatal conductance; g s, op , operational stomatal conductance.
We examined the consequences of variation in genome size on terrestrial primary productivity by calculating maximum stomatal conductance (g s, max ) and operational stomatal conductance (g s, op ) using theoretical and empirical models that directly relate leaf anatomy to gas exchange (see Materials and Methods). Genome size was a highly significant predictor of both g s, max and g s, op , whether or not phylogenetic relatedness of species was incorporated (Fig 3, Table 1). Scaling relationships that accounted for phylogenetic relatedness of all species in our dataset were as significant as non-phylogenetic analyses and had similar slopes. Thus, a single relationship between genome size and stomatal conductance exists among all land plants. We tested assumptions about how vein positioning in the leaf influences g s, op by modeling stomatal conductance for leaves of varying thickness and found that regardless of leaf thickness (70, 100, 130 μm), the slopes of the relationships between genome size and g s, op were significantly steeper than the slope of the relationship between genome size and g s, max (all p < 0.001).
Thus, across all species, shrinking the genome brings g s, op closer to g s, max (Fig 3, Table 1), which facilitates faster rates of growth.
The timing of these physiological innovations further corroborates their role in promoting angiosperm domination of terrestrial ecosystems. Unlike other major clades of terrestrial plants, genome sizes, D v , D s , and l g of the angiosperms expanded into new regions of trait space during the Cretaceous period (Fig 4), increasing rates of leaf level carbon assimilation and ushering in an era of greater terrestrial primary productivity [12,15,27]. To determine how the upper or lower limits of trait values changed through time, linear and nonlinear curves were fit through the upper or lower 10% of trait values during the period of rapid angiosperm diversification (165-60 Ma). For the angiosperms, extreme values of genome size and anatomical traits were fit by a logarithmic curve better than by a linear relationship (genome size change in Akaike Information Criterion (ΔAIC) = 31.8; D v ΔAIC = 6.6; l g ΔAIC = 16.3; D s ΔAIC = 5.7), indicating that Cretaceous angiosperms pushed the frontiers of genome size, cell size, and vein and stomatal densities. In contrast to the angiosperms, fern and gymnosperm lineages exhibited no such sudden change in any trait during the Cretaceous period (Fig 4). Reconstructions of D v matched well with fossil data, but the limited available data for l g and D s among Cretaceous angiosperms precludes a comparable analysis (S1 Fig).
These results suggest that the ability to develop leaves with high vein and stomatal densities derives not exclusively from common developmental programs underlying these traits nor from genetic correlations (i.e., linkage between genes controlling both traits), but, even more fundamentally, from biophysical scaling constraints that limit minimum cell size [8,29]. Together with analyses of trait evolution, the scaling relationships between genome size and gas exchange rates suggest that rapid genome downsizing among the angiosperms during the Cretaceous period facilitated increased rates of photosynthesis and biomass accumulation (Fig 2, Fig 3 and Fig 4). Importantly, while genome downsizing has been critical to increasing leaf gas exchange rates among the angiosperms, it was not a key innovation that occurred only at the root of the angiosperm phylogeny. Rather, the angiosperms exhibit a wide range of genome sizes, and coordinated changes in genome size and physiological traits have occurred repeatedly throughout their evolutionary history (Table 1, S1 Table). While whole-genome duplications have been particularly important in promoting diversification among the angiosperms [21], larger genomes increase minimum cell size and depress maximum potential gas exchange, thereby reducing competitive ability in productive habitats. Our results suggest that reductions in minimum cell size through genome downsizing can recover leaf gas exchange capacity subsequent to genome duplications and diversification events. If heightened competitive ability among the angiosperms drove their ecological dominance, then innovations that reduced minimum cell size were critical to this transformative process [29].
Although genome size limits minimum cell size [19,25], final cell size can vary widely as cells grow and differentiate. After cell division and during cell expansion, various factors influence how large a cell becomes. Intracellular turgor pressure overcomes the mechanical rigidity of the cell wall to enlarge cellular boundaries. The magnitude of turgor pressure is itself controlled by water availability around the cell and by the osmotic potential inside the cell. Final cell size is influenced, therefore, by both biotic and abiotic factors that affect pressure gradients in and around the cell. By reducing nuclear volume and the lower limit of cell size [19,25], genome downsizing expands the range of final cell size that is possible. Species that can vary cell size across a wider range can more finely tune their leaf anatomy to match environmental constraints on leaf gas exchange. Indeed, D v , D s , l g , and g s are more variable among species with small genomes (Fig 2, Fig 3 and Fig 4), and the variance in these traits unexplained by genome size is likely due to environmental variation. Thus, genome size may predict ecological breadth insofar as species with small genomes can exhibit greater plasticity in final cell size and can inhabit a wider range of environmental conditions, although more analyses of within- and between-species variation in genome size are needed to clarify this [33,34]. Interestingly, only the angiosperms occupy this region of trait space, and the angiosperms tend to be more productive than either the ferns or the gymnosperms across a broad range of environmental conditions. Therefore, rapid genome downsizing by the angiosperms during the Cretaceous period likely explains not only their greater potential and realized primary productivity (Fig 3 and Fig 4) but also why they were able to expand into and create new ecological habitats, fundamentally altering the global biosphere and atmosphere [3].
Prevailing theories have suggested that the global dominance of angiosperms occurred due to higher maximum photosynthetic capacity and growth, despite Cretaceous declines of atmospheric CO2 that would have otherwise depressed rates of photosynthesis [3,12,15,35]. In habitats that can support high rates of primary productivity, maximum rates of gas exchange and growth are generally greater for angiosperms than for gymnosperms and ferns and are due, we show, to reductions in genome and cell sizes that occurred after the appearance of early angiosperms. Smaller genomes and cells increased leaf surface conductance to CO2 and enabled higher potential and realized primary productivity. Furthermore, because genome downsizing lowers the limit of minimum cell size, final cell size can vary much more widely, which may facilitate a closer coupling of anatomy and physiology to environmental conditions [36]. Therefore, genome downsizing among the angiosperms allowed them to outcompete other plants in almost every terrestrial ecosystem.

Leaf traits
Published data for l g , D s , and D v were compiled from the literature (S1 Data). Genome size data for each species were taken from the Plant DNA C-values database (release 6.0, December 2012), managed by the Royal Botanic Gardens, Kew [37]. In total, our dataset comprised 393 species of vascular plants, of which 289 were angiosperms, 53 were gymnosperms, and 51 were ferns. The dataset comprised here represents 0.1% of the estimated angiosperm species diversity. Of the 416 families and 64 orders of extant plants currently accepted by the Angiosperm Phylogeny Group IV, the 289 species in our dataset represented 102 families and 43 orders. Among angiosperm clades, the species diversity in our dataset is positively correlated with the number of known species in those clades (S2 Fig). The Plant DNA C-values database currently contains data for over 7,000 angiosperms, and our sample of 289 for which there were anatomical traits had genome sizes highly representative of all angiosperms in the database with no significant differences between the mean genome sizes of the two datasets (S3 Fig). For the 289 angiosperms in the dataset, there were D v data for 165, guard cell size data for 184, and D s data for 184. Similarly, there were D v data for 23 gymnosperms and for 10 ferns, there were l g data for 20 gymnosperms and for 38 ferns, and there were D s data for 37 gymnosperms and 26 ferns.

Calculating g s, max and g s, op
For each species, we calculated g s, max and g s, op . g s, max is defined by the dimensions of stomatal pores and their abundance, and represents the biophysical upper limit of gas diffusion through the leaf epidermis. Anatomical measurements of guard cells were used to calculate g s, max as [8,9]:
$$g_{s,max} = \frac{\dfrac{d_{H_2O}}{m_v}\, D_s\, a_{max}}{d_p + \dfrac{\pi}{2}\sqrt{a_{max}/\pi}}$$
where d H2O is the diffusivity of water in air (0.0000249 m2 s−1), m v is the molar volume of air normalized to 25˚C (0.0224 m3 mol−1), D s is stomatal density (mm−2), a max is maximum stomatal pore size, and d p is the depth of the stomatal pore. The a max term can be approximated as π(l p /2)2, where l p is stomatal pore length, with l p being approximated as l g /2, where l g is guard cell length. For studies that only reported l p , we calculated l g as 2·l p [8,42]. d p is assumed to be equal to guard cell width (W). If W was not reported, d p was estimated as 0.36·l g [11]. g s, op , by contrast, more accurately defines the stomatal conductance leaves attain under natural conditions, when limitations in leaf hydraulic supply constrain stomatal conductance. We used an empirical model of g s, op that directly relates D v to stomatal conductance during periods of steady state transpiration (E) [7] as:
$$g_{s,op} = \frac{E}{v} = \frac{K_{leaf}\,\Delta\Psi}{v}$$
where:
$$K_{leaf} = 12670\, d_m^{-1.27}$$
$$d_m = \frac{\pi}{2}\sqrt{d_x^2 + d_y^2}$$
K leaf is leaf hydraulic conductance (mmol m−2 s−1 MPa−1), d m is the post vein distance to stomata (μm), d x is the maximum horizontal distance from vein to the stomata (μm), d y is the distance from vein to the epidermis (μm), ΔΨ is the water potential difference between stem and leaf (set to 0.33 MPa [43]), and v is vapor pressure deficit set to 2 kPa. Variation in v would affect the intercept but not the slope of g s, op . In order to test the influence of variation in leaf thickness on g s, op , we used three values of d y (70, 100, and 130 μm).
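As a minimal sketch of the g s, max calculation described above, the following Python function assumes the standard anatomical form of the maximum conductance equation from the cited stomatal literature, with the constants and approximations given in the text (l p = l g /2, a max = π(l p /2)2, and d p = 0.36·l g when guard cell width is unavailable). The function name, argument names, and example inputs are hypothetical and are not taken from the authors' code.

```python
import math

# Constants as stated in the text.
D_H2O = 2.49e-5  # diffusivity of water vapor in air, m^2 s^-1
M_V = 0.0224     # molar volume of air, m^3 mol^-1


def gs_max(guard_cell_length_um, stomatal_density_mm2, guard_cell_width_um=None):
    """Anatomical maximum stomatal conductance in mol m^-2 s^-1.

    Assumes the standard form
        g = (d/v) * D_s * a_max / (d_p + (pi/2) * sqrt(a_max / pi)).
    """
    l_g = guard_cell_length_um * 1e-6          # guard cell length, m
    l_p = l_g / 2.0                            # pore length approximated as l_g / 2
    a_max = math.pi * (l_p / 2.0) ** 2         # maximum pore area, m^2
    if guard_cell_width_um is None:
        d_p = 0.36 * l_g                       # fallback pore depth from the text
    else:
        d_p = guard_cell_width_um * 1e-6       # pore depth equals guard cell width
    d_s = stomatal_density_mm2 * 1e6           # stomata per m^2
    end_correction = (math.pi / 2.0) * math.sqrt(a_max / math.pi)
    return (D_H2O / M_V) * d_s * a_max / (d_p + end_correction)


if __name__ == "__main__":
    # Hypothetical leaves: large sparse stomata versus small dense stomata.
    print(gs_max(20.0, 300.0))
    print(gs_max(10.0, 1200.0))
```

Under these assumptions, halving guard cell length while quadrupling stomatal density keeps total pore area per unit leaf area constant but doubles the computed conductance, because the diffusion path through each pore is shorter. This is the sense in which smaller cells raise the upper limit of gas exchange.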
The steady state equations presented above can be related directly to photosynthesis as:
$$A_{op} = \frac{g_{s,op}\,(c_a - c_i)}{1.6}$$
where A op is operational photosynthetic capacity (μmol m−2 s−1), c a is the molar concentration of CO2 in the atmosphere, c i is the molar concentration of CO2 in the air spaces inside the leaf, and 1.6 accounts for the difference in diffusivity between H2O and CO2 in air.
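The link between operational conductance and photosynthesis can be illustrated with a short Python sketch. It assumes the usual diffusion form in which assimilation equals conductance to water vapor, divided by 1.6 to convert to CO2, multiplied by the CO2 gradient between atmosphere and leaf interior. The function name and the example values for g s, op , c a , and c i are hypothetical.

```python
def a_op(gs_op, c_a, c_i):
    """Operational photosynthetic capacity in umol m^-2 s^-1.

    gs_op : stomatal conductance to water vapor, mol m^-2 s^-1
    c_a   : atmospheric CO2, umol mol^-1
    c_i   : intercellular CO2, umol mol^-1
    The factor 1.6 converts conductance to water vapor into conductance to CO2.
    """
    if c_i > c_a:
        raise ValueError("c_i is expected not to exceed c_a for net CO2 uptake")
    return gs_op * (c_a - c_i) / 1.6


if __name__ == "__main__":
    # Hypothetical leaf with illustrative CO2 concentrations.
    print(a_op(0.4, 400.0, 280.0))
```

Because the relationship is linear in conductance, any proportional gain in g s, op at a fixed CO2 gradient translates into the same proportional gain in A op .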

Analyses of trait evolution
To determine the temporal patterns of trait evolution, we generated a phylogeny from the list of taxa (S1 Data) using Phylomatic (v. 3) and its stored family-level supertree (v. R20120829).
To date nodes in the supertree, we compiled node ages from recent, fossil-calibrated estimates of crown group ages. Node ages were taken from Magallón et al. [44] for angiosperms, Lu et al. [45] for gymnosperms, and Testo and Sundue [46] for ferns. The age of all seed plants was taken as 330 million years [47]. Because there is some uncertainty in the maximum age of the ancestor of all angiosperms, we took the angiosperm crown age used by Brodribb and Feild [12] to make our results directly comparable to theirs. We tested this assumed angiosperm age by using different ages for the crown group angiosperms ranging from 130 Ma to 180 Ma, and the results were not qualitatively different. Of the 254 internal nodes in our tree, 82 had ages. These ages were assigned to nodes, and branch lengths between these dated nodes were evenly spaced using the function "bladj" in the software Phylocom (v. 4.2 [47]). Polytomies were resolved by randomly bifurcating and adding 5 million years to each of these new branches and subtracting an equivalent amount from the descending branches so that the tree remained ultrametric. For all subsequent analyses of character evolution, this method for randomly resolving polytomies was repeated 100 times to account for phylogenetic uncertainty. For ancestral state reconstructions, the ages and character estimates at each node were averaged across the 100 randomly resolved trees. Ancestral state reconstructions were calculated using the residual maximum likelihood method, implemented in the function "ace" from the R package ape [48]. To determine when changes in traits pushed the frontiers of trait values, the upper (D v and D s ) and lower (genome size and l g ) limits of traits were estimated by first extracting the upper or lower ten percent of reconstructed trait values in sequential 5 million-year windows and then attempting to fit curves to these values.
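The windowing step described above can be sketched as follows. This is an illustrative Python version, not the authors' code: the function name is hypothetical, windows are assumed to be half-open intervals of node age, and the number of values kept per window is assumed to be the floor of ten percent of the values in that window, with at least one value retained.

```python
import math


def extreme_values_by_window(points, start, stop, width=5.0, fraction=0.1, upper=True):
    """Return the most extreme (age, value) pairs within sequential age windows.

    points   : iterable of (age_ma, trait_value) pairs from a reconstruction
    start    : youngest window edge in Ma; stop : oldest window edge in Ma
    upper    : True keeps the largest values (e.g. vein density),
               False keeps the smallest (e.g. genome size)
    """
    points = list(points)
    selected = []
    lo = start
    while lo < stop:
        hi = lo + width
        window = [(age, val) for age, val in points if lo <= age < hi]
        if window:
            # Assumption: keep floor(fraction * n) values, but never fewer than one.
            k = max(1, math.floor(fraction * len(window) + 1e-9))
            window.sort(key=lambda p: p[1], reverse=upper)
            selected.extend(window[:k])
        lo = hi
    return selected


if __name__ == "__main__":
    # Hypothetical reconstructed node values: (age in Ma, trait value).
    nodes = [(61.0, 1.0), (62.0, 5.0), (63.0, 3.0), (71.0, 2.0)]
    print(extreme_values_by_window(nodes, 60.0, 75.0, upper=True))
    print(extreme_values_by_window(nodes, 60.0, 75.0, upper=False))
```

The selected points would then be passed to the curve-fitting step, so that the fitted curve tracks the frontier of trait space and not the central tendency.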
This method is similar to a previous analysis of D v evolution through time [38], which is included here for comparison. We compared three types of curve fits: a linear fit that lacked a slope (equivalent to the mean of the reconstructed trait values), a linear fit that included both a slope and an intercept, and a nonlinear, logarithmic curve. Curves were fit to reconstructed trait values for each clade between 165 and 60 Ma, which corresponds to the time period encompassing the major diversification and expansion of the angiosperms, and the best fit was chosen based on AIC scores, with a difference in AIC of 5 taken to indicate significant differences in fits. Phylogenetic generalized least squares regression was used to determine whether traits underwent correlated evolution. A regression was performed for each pairwise combination of traits for only species with data for both traits. Phylogenetic regressions used a Brownian motion correlation structure from the R package ape [49].
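To illustrate how such a model comparison works, the sketch below fits a constant, a straight line, and a logarithmic curve by least squares and scores each with AIC. Several choices here are assumptions and not details from the text: the logarithmic model is taken to be value = a + b·ln(age), AIC is computed from the Gaussian least-squares likelihood as n·ln(RSS/n) + 2k, and all function names and example data are hypothetical.

```python
import math


def _ols(x, y):
    """Ordinary least squares for y = a + b*x; returns (a, b, residual sum of squares)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = my - b * mx
    rss = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    return a, b, rss


def _aic(rss, n, n_coef):
    """Gaussian least-squares AIC; the extra parameter is the residual variance."""
    return n * math.log(rss / n) + 2 * (n_coef + 1)


def compare_fits(ages, values):
    """Return AIC scores for constant, linear, and logarithmic fits."""
    n = len(values)
    mean = sum(values) / n
    rss_const = sum((v - mean) ** 2 for v in values)
    _, _, rss_lin = _ols(ages, values)
    # Assumed logarithmic form: value = a + b * ln(age).
    _, _, rss_log = _ols([math.log(a) for a in ages], values)
    return {
        "constant": _aic(rss_const, n, 1),
        "linear": _aic(rss_lin, n, 2),
        "log": _aic(rss_log, n, 2),
    }


if __name__ == "__main__":
    # Hypothetical trait limits following a logarithmic trend with small wiggles.
    ages = [60.0 + 5.0 * i for i in range(22)]
    values = [2.0 + 3.0 * math.log(a) + 0.01 * (-1) ** i for i, a in enumerate(ages)]
    scores = compare_fits(ages, values)
    print(min(scores, key=scores.get), scores)
```

With the threshold stated in the text, a model would be preferred only when its AIC is lower than the alternative by at least 5.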
We acknowledge the potential for high uncertainty in ancestral state character reconstructions when working with a small subset of species relative to the broader species pool [50,51].
In an effort to minimize uncertainty, we sampled basal angiosperms as much as possible and performed two additional analyses that suggest our dataset is robust to incomplete sampling. First, we performed a bootstrapping analysis in which we randomly sampled species from our entire genome size dataset (35%, 52%, and 78% of angiosperm species), reconstructed genome size, and fit curves to the lower limit of reconstructed genome sizes, as before. This procedure was replicated 100 times at each level of sampling diversity. This analysis revealed that using only 35% of the angiosperms in our dataset still produced estimates of minimum genome size that are consistent with the entire dataset (S4 Fig). Second, the species diversity of 20 named nodes in our dataset is strongly correlated with the actual extant species diversity of those clades (S2 Fig). Additionally, our sample of genome size variation does not differ significantly from the genome size variation among approximately 7,000 measured species (S3 Fig). Furthermore, our analysis of vein density evolution based on 151 angiosperm species is almost identical to the previous analysis by Brodribb and Feild [12], which relied on 504 angiosperm species (Fig 4), and both of these modeled limits of vein density agree strongly with fossil data [38]. Overall, these analyses strongly suggest that the trait values represented in our taxon sampling are robust, given the incredible extant diversity of angiosperms and the data currently available.

Scaling relationships
Scaling relationships between genome size and D v , l g , g s, max , and g s, op were calculated from log-transformed data and analyzed using the function "sma" in the R package smatr [52]. Analyses were performed for the entire dataset and also for individual clades. Slope tests were used to determine whether the scaling relationship between genome size and g s, max was significantly different from the relationship between genome size and g s, op and whether the scaling relationships between genome size and g s, op and g s, max differed among clades. To account for the non-independence of sampling related species, phylogenetic standard major axis regressions were performed on all species using the function "phyl.RMA" in the R package phytools.
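For readers unfamiliar with standard major axis (SMA) regression, the following Python sketch shows the textbook non-phylogenetic estimator on log-transformed data: the slope is the ratio of the standard deviations of the two log-transformed variables, signed by their covariance. It is an illustration only and does not reproduce the inference, slope tests, or phylogenetic correction provided by the R functions named above. The function name and example data are hypothetical.

```python
import math
import statistics


def sma_fit(x, y):
    """Standard major axis fit of log10(y) on log10(x); returns (slope, intercept)."""
    lx = [math.log10(v) for v in x]
    ly = [math.log10(v) for v in y]
    mx, my = statistics.mean(lx), statistics.mean(ly)
    cov = sum((a - mx) * (b - my) for a, b in zip(lx, ly))
    if cov == 0:
        raise ValueError("SMA slope is undefined when the variables are uncorrelated")
    # SMA slope: ratio of standard deviations, carrying the sign of the covariance.
    slope = math.copysign(statistics.stdev(ly) / statistics.stdev(lx), cov)
    intercept = my - slope * mx
    return slope, intercept


if __name__ == "__main__":
    # Hypothetical data: a trait that declines as a power law of genome size.
    genome_size = [1.0, 10.0, 100.0, 1000.0]
    trait = [5.0 * g ** -0.5 for g in genome_size]
    print(sma_fit(genome_size, trait))
```

Unlike ordinary least squares, SMA treats both variables as measured with error, which is why it is commonly preferred for estimating allometric scaling exponents.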
Supporting information S1 Table. l