Does the Mode of Plastid Inheritance Influence Plastid Genome Architecture?

Plastid genomes show an impressive array of sizes and compactnesses, but the forces responsible for this variation are unknown. It has been argued that species with small effective genetic population sizes are less efficient at purging excess DNA from their genomes than those with large effective population sizes. If true, one may expect the primary mode of plastid inheritance to influence plastid DNA (ptDNA) architecture. All else being equal, biparentally inherited ptDNAs should have a two-fold greater effective population size than those that are uniparentally inherited, and thus should also be more compact. Here, we explore the relationship between plastid inheritance pattern and ptDNA architecture, and consider the role of phylogeny in shaping our observations. Contrary to our expectations, we found no significant difference in plastid genome size or compactness between ptDNAs that are biparentally inherited relative to those that are uniparentally inherited. However, we also found that there was significant phylogenetic signal for the trait of mode of plastid inheritance. We also found that paternally inherited ptDNAs are significantly smaller (n = 19, p = 0.000001) than those that are maternally, uniparentally (when isogamous), or biparentally inherited. Potential explanations for this observation are discussed.


Introduction
Plastids originate from an ancient endosymbiosis of a cyanobacterium by a eukaryotic host [1]. They first arose in the ancestor of the Archaeplastida (i.e., Plantae), and were then passed on laterally to diverse lineages through eukaryote-eukaryote endosymbioses [2,3]. The genomes within contemporary plastids show a remarkable, and puzzling, diversity of sizes (5 to >1000 kilobases; kb) and compactnesses (<5 to >85% noncoding DNA) [4]. However, the evolutionary forces that gave rise to this variation are poorly understood.
The mutational hazard hypothesis argues that large, bloated genomes, with lots of intergenic and intronic DNA, pose a greater mutational burden to their hosts than genomes that are compact [5]. This is because any expansion in DNA content increases the potential for deleterious mutations, where the higher the mutation rate the greater the burden of having excess DNA. It follows, therefore, that species with large effective genetic population sizes (N e), where natural selection is efficient, are better at perceiving and eliminating "burdensome" excess DNA than those with a small N e [5]. Many studies have explored the relationship between N e and genome compactness [6][7][8], but few have employed plastid DNA (ptDNA).
Effective genetic population size is a difficult parameter to measure, and one that is likely influenced by the mode of inheritance. Plastid genomes, unlike most nuclear chromosomes, are typically uniparentally inherited [9]. For sexually reproducing species with male and female gametes, maternal plastid inheritance is the norm. Studies, however, have identified diverse species with paternal or biparental modes of plastid inheritance [10][11][12][13]. Other things being equal, the N e of uniparentally inherited plastid genomes should be half that of biparentally inherited ones. Further, the influence of differential migration (e.g. seeds are heavier and less numerous than pollen) and an individual's size at reproduction (e.g. smaller individuals produce greater amounts of pollen vs. seeds) mean that maternal vs. paternal modes of organellar inheritance can also lead to overall differences in the N e of ptDNAs [14].
In this study, we use newly available data on plastid genome sequence and inheritance pattern to investigate how differing modes of inheritance impact ptDNA architecture. Based on the mutational hazard hypothesis, we predict that biparentally inherited ptDNAs, given their potential for having a higher N e, will be more compact than those that are uniparentally inherited. We also expect to see differences in genomic architecture between paternally vs. maternally vs. uniparentally (when isogamous) inherited ptDNAs.

Methods
By searching the literature, we found 81 species for which both plastid inheritance statistics and complete ptDNA sequence data are available, including 69 land plants, 6 green algae, 2 red algae, 2 apicomplexans, and 2 stramenopiles (Table 1). The mode of plastid inheritance is thought to vary continuously rather than discretely between taxa; however, determining an appropriate scale for ranking the degree of biparental inheritance was difficult because of large differences in sample sizes between species. Instead, we categorized the primary pattern of plastid inheritance using the following: inheritance determined from genetic analysis of mutant plastids; ptDNA restriction analysis and/or analysis of ptDNA sequence data of progeny with known parentage; epifluorescence microscopy employing DNA fluorochromes to detect plastids in viable, mature sperm cells; and ultrastructural observations using transmission electron microscopy (TEM). Further, we noted cases where an interspecific, intergeneric, or widely divergent strain cross was used to assess the mode of plastid inheritance, because at least one previous study has shown that taxonomically divergent crosses can cause the breakdown of the typical pattern of cytoplasmic maternal inheritance [15]. In a few cases the primary mode of inheritance was undetermined for the species with a complete plastid sequence in our dataset, so we screened the literature for plastid inheritance studies from other members of the same genus or higher-level taxonomic group; if the mode of inheritance was identical within the group, then we assumed all members of that group had the same mode of plastid inheritance (e.g. maternal inheritance for the genus Cuscuta or paternal inheritance for the order Pinales).
Noncoding ptDNA content was calculated as follows: genome length minus the collective length of all annotated protein-, rRNA-, and tRNA-coding regions, not including the portions of these regions that are also annotated as introns. Intronic and non-standard open reading frames were treated as noncoding DNA. This method is contingent on the authors of the GenBank records having properly annotated their entry.
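The calculation described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the procedure used to build Table 1: the function name, the 1-based inclusive coordinates, and the toy genome are all hypothetical, and the merging of overlapping features (so that shared bases are counted once) is an assumption of the sketch, since the text speaks only of the "collective length" of coding regions.

```python
def noncoding_length(genome_length, coding_exons):
    """Return the number of bases not covered by any coding exon.

    coding_exons: iterable of (start, end) pairs, 1-based and inclusive,
    for protein-, rRNA-, and tRNA-coding regions with introns already
    excluded (one pair per exon).
    """
    covered = 0
    current_start = current_end = None
    for start, end in sorted(coding_exons):
        if current_end is None or start > current_end + 1:
            # A new, non-adjacent block begins: bank the previous one.
            if current_end is not None:
                covered += current_end - current_start + 1
            current_start, current_end = start, end
        else:
            # Overlapping or abutting features are merged so that shared
            # bases are counted once (an assumption of this sketch).
            current_end = max(current_end, end)
    if current_end is not None:
        covered += current_end - current_start + 1
    return genome_length - covered


# Hypothetical toy genome of 1,000 bp with three annotated features;
# the first two overlap by 11 bp.
toy_exons = [(1, 100), (90, 200), (501, 600)]
toy_noncoding = noncoding_length(1000, toy_exons)
toy_fraction = toy_noncoding / 1000
```

In practice the intervals would come from the feature table of each GenBank record, so the result inherits any annotation errors in that record.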
We performed a linear regression between plastid genome length (independent variable) and the amount of noncoding ptDNA (dependent variable). Both variables were log-transformed to meet the assumptions of homoscedasticity and normality. To test the effect of plastid inheritance pattern on noncoding ptDNA content and plastid genome size, we performed two nonparametric analyses. The factor "plastid inheritance" contained four levels: biparental vs. maternal vs. paternal vs. uniparental isogamous. The first analysis tested how all four levels affected the dependent variables (using separate Kruskal-Wallis tests for each variable). For the second analysis, we pooled the last three levels into "uniparental" and used Wilcoxon rank sign tests. We applied non-parametric tests because our data were not normally distributed and because of the uneven sample sizes between levels of the factor "mode of plastid inheritance." When more than two levels were used, we looked for significant differences between the various levels by performing post-hoc multiple comparisons using the Kruskal-Wallis test (function 'kruskalmc' in the R package 'pgirmess'). Statistical analyses were performed with R v.2.14.2 (R Core Development Team 2012).
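For readers unfamiliar with the Kruskal-Wallis test, the statistic it is built on can be sketched as follows. This is a minimal illustration, not the R code used for the analyses: the function names are hypothetical, the group values are arbitrary toy numbers rather than data from Table 1, and no tie correction is applied (R's `kruskal.test` does apply one, so results can differ when ties are present).

```python
def average_ranks(values):
    """Rank values from 1..N, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # sorted positions i..j hold ranks i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic (no tie correction) for a dict of samples."""
    pooled = [x for sample in groups.values() for x in sample]
    ranks = average_ranks(pooled)
    n_total = len(pooled)
    weighted, offset = 0.0, 0
    for sample in groups.values():
        # Ranks were assigned on the pooled list, in group order.
        rank_sum = sum(ranks[offset:offset + len(sample)])
        weighted += rank_sum ** 2 / len(sample)
        offset += len(sample)
    return 12.0 / (n_total * (n_total + 1)) * weighted - 3 * (n_total + 1)


# Arbitrary toy values (NOT taken from Table 1): three fully separated groups.
toy_groups = {
    "paternal": [1, 2, 3],
    "maternal": [4, 5, 6],
    "biparental": [7, 8, 9],
}
toy_h = kruskal_wallis_h(toy_groups)
```

Under the null hypothesis, H is compared against a chi-squared distribution with (number of groups minus 1) degrees of freedom; because the test works on ranks, it does not require normally distributed data, which is the reason given above for choosing it.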

Phylogenetic Independent Contrasts and Phylogenetic Signal in Our Dataset
Because our dataset comprised several groups of very closely related species (Table 1), we considered whether the effects of phylogenetic non-independence (and by proxy pseudoreplication) [16,17] were influencing the conclusions from our initial analyses. First, we checked the tree topology of our dataset using a taxonomic tree generated from the NCBI Taxonomy Database [18,19], and a maximum-likelihood phylogeny (10000 bootstraps using the PhyML plugin for Geneious Pro v. 5.4.4 [20]) based on the deduced amino acid sequences of the plastid-encoded rbcL gene (see Table 1 for GenBank accession numbers). Both trees had identical topologies except that the rbcL tree contained no apicomplexans because their ptDNAs do not contain rbcL. Because most tests of phylogenetic independence require a tree to be rooted, we forcibly rooted our rbcL tree in the red algal species Gracilaria tenuistipitata var. liui.
Phylogenetic independent contrasts (PICs) for the continuous variables of ptDNA size and noncoding content were performed using the 'crunch' function within the 'caper' package [20] of R v.2.14.2 (R Core Development Team 2012). To investigate the association between plastid genome size and noncoding ptDNA content, we fit a linear model of the standardized contrasts against each other. We were unable to obtain a large number of contrasts for our dataset that incorporated all nodes of the phylogeny (taxonomic or gene tree) for the categorical variable of primary mode of inheritance. This is because the tips of our phylogeny did not possess sufficient variation in the categorical trait, and with categorical variables only the tips are used in assessing the role of phylogenetic non-independence [21,22]. Instead, we performed an analysis of phylogenetic signal strength (D) [23] for the binary trait of biparental vs. uniparental plastid inheritance to see if these traits were "clumped" or randomly distributed [22,23] in the phylogeny. D values that are negative or close to 0 are more phylogenetically conserved (or clumped), which can indicate non-independent evolutionary events, whereas D values closer to 1 are overdispersed and therefore can be a sign of randomness in the trait's distribution within a phylogeny.
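The idea behind standardized contrasts for continuous traits can be illustrated with the textbook pruning procedure for a strictly bifurcating tree. The sketch below is a teaching example under stated assumptions, not a reimplementation of the 'crunch' function: the nested-tuple tree encoding, the function names, the three-tip tree, and the trait values are all hypothetical, and the regression through the origin reflects the usual convention for contrasts rather than a detail stated in the text.

```python
import math


def contrasts(node, trait):
    """Return (value, branch_length, contrasts) for a subtree.

    A tip is (name, branch_length); an internal node is
    ((left, right), branch_length). The tree must be strictly bifurcating.
    """
    label, length = node
    if isinstance(label, str):
        return trait[label], length, []
    left, right = label
    x1, v1, c1 = contrasts(left, trait)
    x2, v2, c2 = contrasts(right, trait)
    # Standardized contrast between the two daughter lineages.
    contrast = (x1 - x2) / math.sqrt(v1 + v2)
    # Ancestral estimate: daughters weighted by inverse branch length.
    value = (x1 / v1 + x2 / v2) / (1 / v1 + 1 / v2)
    # Lengthen the stem to reflect uncertainty in that estimate.
    length = length + v1 * v2 / (v1 + v2)
    return value, length, c1 + c2 + [contrast]


def slope_through_origin(cx, cy):
    """Least-squares slope of cy on cx with the intercept fixed at zero."""
    return sum(a * b for a, b in zip(cx, cy)) / sum(a * a for a in cx)


# Hypothetical tree ((A:1, B:1):1, C:2) with made-up trait values.
toy_tree = (
    (
        ((("A", 1.0), ("B", 1.0)), 1.0),
        ("C", 2.0),
    ),
    0.0,
)
toy_size = {"A": 1.0, "B": 3.0, "C": 6.0}
toy_noncoding = {tip: 2.0 * x for tip, x in toy_size.items()}

_, _, size_pics = contrasts(toy_tree, toy_size)
_, _, noncoding_pics = contrasts(toy_tree, toy_noncoding)
toy_slope = slope_through_origin(size_pics, noncoding_pics)
```

A tree with n tips yields n - 1 contrasts, each computed from lineages that diverged at a single node, which is why the contrasts can be treated as independent data points where the raw species values cannot.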

Results and Discussion
As Plastid Genome Size Increases so does the Amount of Noncoding ptDNA
Consistent with previous observations [5,24], the amount of noncoding ptDNA in nucleotides co-varied positively with plastid genome size for our dataset (n = 81), adjusted R² = 0.78, p ≤ 0.000001 (Fig. 1A and B). Log transformation of both variables enabled our linear model to meet the more crucial assumption for linear regression, homoscedasticity, but transformation did not improve normality. There was one significant high-leverage outlier (Volvox carteri) and two moderate statistical outliers (the apicomplexans Toxoplasma gondii and Eimeria tenella). Removal of these statistical outliers from our dataset (n = 78) did not alter the significance of the linear relationship, adjusted R² = 0.76, p ≤ 0.000001. When we fit a linear model to our standardized phylogenetic independent contrasts there was still a significant positive relationship (p = 0.00078) between plastid genome size and the amount of noncoding ptDNA, but the strength of the relationship decreased, adjusted R² = 0.136. The assumptions of homoscedasticity and normality were violated in fitting this linear model, and neither log transformation of the variables nor the removal of the high-leverage outlier Volvox carteri helped us meet these assumptions. Overall, we contend that if more taxa were added to our dataset, this pattern would remain consistent with the past observations that plastid genome size scales positively with the amount of noncoding ptDNA [24].

Plastid Genome Size and Compactness do not Vary Significantly between Taxa with Biparental vs. Uniparental Plastid Inheritance Patterns
Neither plastid genome size nor the amount of noncoding ptDNA varied significantly with respect to the primary mode of plastid inheritance when only two types of inheritance pattern were considered (uniparental vs. biparental) (plastid genome size: Wilcoxon signed rank test χ² = 2, df = 1, p-value = 0.12; noncoding ptDNA: Wilcoxon signed rank test χ² = 2, df = 1, p-value = 0.23). Our analysis of phylogenetic signal strength revealed that the binary trait of mode of plastid inheritance was clumped, D = -0.0052, and the probability that this trait was distributed at random in the phylogeny is effectively zero. This is likely due to the pseudoreplication produced from including multiple species of the same genus (e.g. Oenothera, Pinus, Cuscuta, Picea). Reducing our dataset by randomly including only one taxon from each of the pseudoreplicated genera produced no significant difference between biparental and uniparental taxa (Wilcoxon signed rank test, df = 1, p-value range = 0.32-0.54). We expected uniparentally inherited plastids, because of their potential for a reduced N e, to have more bloated ptDNAs than biparentally inherited ones, especially when looking within lineages. Our results suggest that forces other than, or in addition to, inheritance pattern are influencing N e(ptDNA) and ultimately shaping plastid genome architecture. Population bottlenecks can severely reduce the effective population size of a species [25]. Our dataset includes many crop and model species (e.g., Triticum aestivum and Arabidopsis thaliana), including some that show biparental plastid inheritance (e.g., Pisum sativum and Medicago truncatula). In the process of being bred for "desirable traits" or under laboratory conditions, it is likely that these species experienced multiple and frequent bottlenecks, which may have greatly reduced N e(ptDNA) and canceled out the slight increases in N e(ptDNA) due to biparental modes of plastid inheritance. Similarly, several of the taxa showing biparental plastid inheritance are the products of hybridizations, events that can alter genome architecture and size [26]. Indeed, the hybrid Pelargonium × hortorum (the garden geranium) has a very large ptDNA genome (217 kb), and one that is thought to have been shaped by one or many hybridization events [27]. In contrast, Geranium palmatum, a close relative of Pelargonium × hortorum but not a hybrid, has a relatively small ptDNA genome (156 kb).

Table 1. Organisms, coarse taxonomic group, plastid genome size, coding proportion of ptDNA, primary mode of plastid inheritance, and references to support the mode of inheritance.

It has also been argued that biparental organelle inheritance, as compared to uniparental inheritance, is more likely to cause the rapid spread of deleterious cytoplasmic elements (such as a mutant organelle genome with a replication advantage over the wild-type genome) through a sexual population [28]. Although our study was not designed to test this particular hypothesis, our observation that ptDNA architecture did not vary significantly with respect to the primary mode of plastid inheritance does not support the view that biparental organelle inheritance promotes the spread of selfish cytoplasmic elements.
Reduced ptDNA Size for Species with Paternally Inherited Plastomes: Lineage Specific Gene Loss or Male-biased Mutation?
Are Paternally Inherited ptDNAs Truly Smaller than those Following Other Patterns of Inheritance?
In our dataset, all of the taxa with paternally inherited plastid genomes belong to pinophytes (i.e., conifers). The ptDNAs of pinophytes tend to have fewer NADH dehydrogenase-encoding ndh genes (because of gene loss or gene transfer to the nuclear genome) than those from most other land plant lineages [29,30], which largely explains their smaller sizes. Gnetophytes, which are close relatives of pinophytes, also have small plastid genomes with a reduced number of ndh genes [31]. However, unlike pinophytes, gnetophytes are believed to have maternally inherited plastids (at least for some Ephedra species) [32,33], supporting the notion that the small ptDNAs within these two groups are probably the product of gene loss and not plastid inheritance pattern.
That said, male-biased mutation pressure [34][35][36] may also help to explain why pinophytes have smaller plastid genomes. It is well-established that male-biased mutation occurs in the biparentally inherited nuclear genomes of various animal taxa because male germ-line cells go through many rounds of cell division, which means they are subjected to increased mutation rates compared to female germ-line cells. Female germ-line cells do not typically undergo cell division throughout the lifespan, and so are effectively buffered from the potentially deleterious effects of mutation. Plants (unlike animals) were long hypothesized not to have a separation between germ-line and somatic cells, yet both nuclear- and plastid-encoded genes that are transferred paternally still undergo greater amounts of mutation compared to those that are maternally transmitted [34][35][36]. It is possible that paternally inherited plastid genomes have higher mutation rates because of male-biased mutation, and thus are potentially subject to more intense selection pressure for genome compaction [5].

Concluding Remarks
Considering all of the data available at present, we have shown that the ptDNA genomic traits of size and compactness do not vary significantly with respect to mode of plastid inheritance, i.e. biparental vs. uniparental modes of inheritance. These observations are not in line with our expectations formulated under the mutational hazard hypothesis. We expected uniparentally inherited ptDNAs to be larger and more bloated than biparentally inherited ones; they were not. However, we did find that paternally inherited ptDNAs were more compact and smaller than maternally and biparentally inherited plastid genomes. One hypothesis for this observation is that paternally inherited ptDNAs have a higher mutation rate due to male-biased mutation pressure. If true, this may mean that there is a greater "burden" associated with carrying excess DNA in plastid genomes that are paternally inherited relative to those that are maternally or biparentally inherited.