Temporal changes in genetic diversity and forage yield of perennial ryegrass in monoculture and in combination with red clover in swards

  • Christophe Verwimp,

    Roles Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Software, Validation, Visualization, Writing – original draft, Writing – review & editing

    Affiliations Plant Sciences Unit, Research Institute for Agriculture, Fisheries and Food, Melle, Belgium, Department of Biology, Plant Conservation and Population Biology, University of Leuven, Heverlee, Belgium

  • Tom Ruttink,

    Roles Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Project administration, Resources, Software, Supervision, Validation, Writing – original draft, Writing – review & editing

    Affiliation Plant Sciences Unit, Research Institute for Agriculture, Fisheries and Food, Melle, Belgium

  • Hilde Muylle,

    Roles Conceptualization, Investigation, Methodology, Resources, Supervision, Writing – review & editing

    Affiliation Plant Sciences Unit, Research Institute for Agriculture, Fisheries and Food, Melle, Belgium

  • Sabine Van Glabeke,

    Roles Data curation, Formal analysis, Methodology, Resources, Software, Writing – review & editing

    Affiliation Plant Sciences Unit, Research Institute for Agriculture, Fisheries and Food, Melle, Belgium

  • Gerda Cnops,

    Roles Conceptualization, Resources, Writing – review & editing

    Affiliation Plant Sciences Unit, Research Institute for Agriculture, Fisheries and Food, Melle, Belgium

  • Paul Quataert,

    Roles Conceptualization, Formal analysis, Investigation, Resources, Writing – review & editing

    Affiliation Research Institute for Nature and Forest, Brussels, Belgium

  • Olivier Honnay,

    Roles Conceptualization, Investigation, Methodology, Resources, Supervision, Writing – review & editing

    Affiliation Department of Biology, Plant Conservation and Population Biology, University of Leuven, Heverlee, Belgium

  • Isabel Roldán-Ruiz

    Roles Conceptualization, Funding acquisition, Investigation, Methodology, Project administration, Resources, Supervision, Writing – original draft, Writing – review & editing

    isabel.roldan-ruiz@ilvo.vlaanderen.be

    Affiliations Plant Sciences Unit, Research Institute for Agriculture, Fisheries and Food, Melle, Belgium, Department of Plant Biotechnology and Bioinformatics, Ghent University, Zwijnaarde, Belgium

Abstract

Agricultural grasslands are often cultivated as mixtures of grasses and legumes, and an extensive body of literature is available regarding interspecific interactions, and how these relate to yield and agronomic performance. However, knowledge of the impact of intraspecific diversity on grassland functioning is scarce. We investigated these effects during a 4-year field trial established with perennial ryegrass (Lolium perenne) and red clover (Trifolium pratense). We simulated different levels of intraspecific functional diversity by sowing single cultivars or by combining cultivars with contrasting growth habits, in monospecific or bispecific settings (i.e. perennial ryegrass whether or not in combination with red clover). Replicate field plots were established for seven seed compositions. We determined yield parameters and monitored differences in genetic diversity in the ryegrass component among seed compositions, and temporal changes in the genetic composition and genetic diversity at the within plot level. The composition of cultivars of both species affected the yield and species abundance. In general, the presence of clover had a positive effect on the yield. The cultivar composition of the ryegrass component had a significant effect on the yield, both in monoculture, and in combination with clover. For the genetic analyses, we validated empirically that genotyping-by-sequencing of pooled samples (pool-GBS) is a suitable method for accurate measurement of population allele frequencies, and obtained a dataset of 22,324 SNPs with complete data. We present a method to investigate the temporal dynamics of cultivars in seed mixtures grown under field conditions, and show how cultivar abundances vary during subsequent years. We screened the SNP panel for outlier loci, putatively under selection during the cultivation period, but none were detected.

Introduction

Whereas it is widely accepted that plant species richness and plant trait diversity have a positive effect on the functioning of ecosystems, recent progress in community ecology has also emphasized the importance of intraspecific diversity [1–4]. Positive relationships between ecosystem functioning and diversity at both the inter- and intraspecific level are known as diversity effects. Such diversity effects result from functional complementarity among members of the same or different species, resulting in structural, trophic or phenological niche differentiation. A large part of the Earth’s terrestrial ecosystem is covered with grasslands [5], which support a wide range of ecosystem services including forage production, water regulation, maintenance of soil fertility and structure, carbon sequestration, and provisioning of habitat to many plant and animal species [6]. These services strongly depend on the grasslands’ ecosystem functioning, which in turn is affected by the diversity they harbor, i.e. by the variety of species, functional traits and genes present in the plant community [7]. Identification of relevant diversity effects in grassland communities and understanding the underlying mechanisms may support effective management strategies and sustainable grassland exploitation.

Grasslands can develop progressively upon grazing or mowing activity, but in highly productive livestock systems they are sown for the production of high quality forage. These agronomic grasslands support the production of meat and dairy products, and the total value of grass production in the EU is estimated at more than 23 billion Euro [8]. Such grasslands in Europe are often dominated by species of the genus Lolium (accounting for about 23% of the grasslands), with L. perenne (perennial ryegrass) as the most prevalent species [9]. Cultivation of perennial ryegrass may comprise sowing one or a mixture of cultivars, combinations with other grass species such as timothy (Phleum pratense) or tall fescue (Festuca arundinacea), or with legumes such as white clover (Trifolium repens) or red clover (Trifolium pratense). Although perennial ryegrass is still frequently cultivated as a monoculture, sustainable agricultural practice is progressing towards combinations of ryegrass with legumes, which reduces the need for nitrogen fertilizer application [10, 11]. The close interaction between intermingled grass and legume plants has a synergistic effect on biomass production, and mixed species swards can deliver higher yields than would be predicted from their component monocultures [12]. While monospecific swards can produce high quality forage under favorable environmental and soil fertility conditions, multispecies swards may increase resistance and resilience against environmental disturbances such as persistent periods of drought [13, 14]. Especially in a context of global climate change, where such disturbances are expected to become more frequent [15], increasing diversity might prove to be key for the sustainable production of high quality forage.

The importance of intraspecific diversity for the functioning of cultivated grasslands has not been investigated in depth so far. However, the intraspecific diversity of perennial ryegrass might be an important factor for sward productivity and resilience, as cultivars of this species are typically genetically very diverse [16–18]. Ryegrass cultivars are commonly derived from multiple parental components due to the necessity to incorporate a sufficiently high number of self-incompatibility alleles (at the self-incompatibility loci S and Z), to allow abundant seed set [9, 19–21]. Furthermore, genetic diversity benefits performance [22]. Given this high level of genetic diversity of the sown individuals, the genetic composition of the grassland sward is likely to change during the cultivation period. Changes might be driven by self-thinning or density-dependent mortality [23], or by the selection of specific genotypes that are better adapted to the prevailing conditions (for example, differences in early vigor, tolerance to defoliation, or competitive ability). Therefore, understanding temporal changes in the genetic composition and diversity is essential to get a more complete understanding of the complex plant-to-plant interactions in highly productive grassland swards. Furthermore, understanding changes in the genetic composition of highly productive ryegrass grasslands in response to particular management practices can be of high significance to breeding applications in at least two ways. First, by providing information on the proportion of genetic diversity originally present in the seed composition that remains present in the field. Currently, the necessity to incorporate genetic diversity in ryegrass cultivars compromises to a certain extent the selection intensity that can be applied. Knowing what proportion of this diversity remains present in the ryegrass population after establishment could guide the optimization of breeding programs by allowing a more precise fine-tuning of the balance between genetic diversity and selection intensity. Second, revealing the identity of particular genes, and corresponding alleles, that are preferentially selected under specific circumstances could enable breeders to target traits associated with performance and to exploit polymorphisms in these genes during selection.

Here, we monitored changes in the genetic diversity and forage yield of cultivated ryegrass populations over the course of four years. Experimental field plots were established with populations of perennial ryegrass, with or without red clover. We simulated different levels of potential niche differentiation by mixing phenotypically contrasting cultivars in different combinations, i.e. low vs. high tillering cultivars for the ryegrass component [24], and erect vs. creeping growth habit cultivars for the clover component [25]. We used genotyping-by-sequencing (GBS) of pooled samples (pool-GBS) to quantify genome-wide allele frequencies. This approach has recently been applied to perennial ryegrass to differentiate cultivars [26] and to characterize the genetic basis of flowering time and crown rust resistance [27]. Because we here specifically aimed to compare measurements of genetic diversity among single populations at different time points, we empirically validated this method with special emphasis on SNP data completeness, allele frequency accuracy and removal of non-reproducible SNPs. Our specific objectives were: (1) Investigating whether the composition of ryegrass and clover cultivars of the initial seed mixtures affects forage production and abundance of the clover and ryegrass components. (2) Validating the reliability of GBS to characterize the genetic diversity of the ryegrass component of grassland swards using pooled samples. We compared alternative allele frequencies (AAF) estimated from pooled samples (AAFpool) with AAF calculated from the constituent individual samples (AAFind). (3) Characterizing the temporal changes in genetic diversity of the ryegrass component during the cultivation period, based on AAFpool. (4) Screening for the presence of outlier SNP loci, which may be indicative of selection of specific alleles during cultivation. (5) Investigating how temporal changes in genetic diversity of the perennial ryegrass component relate to the composition of the initial seed mixture, forage production and abundance of the ryegrass and clover components.

Material and methods

Field trial

A field trial was established in April 2011 on a sandy loam soil in Merelbeke, Belgium (50.9867 N 3.7912 E). Seven different seed compositions (Table 1) were sown in plots of 6 by 1.8 m according to a randomized complete block design with two replicates, rendering 14 field plots. Not all possible combinations of cultivars were included in the seed compositions. Seed compositions 1 and 2 were monospecific and consisted of the ryegrass cultivars Merks and Meloni, respectively. Merks is high tillering, late heading, and was derived from a polycross with three components. Meloni is low tillering, intermediate heading and was derived from a pair cross. Seed compositions 3 to 7 comprised both perennial ryegrass and red clover. Two red clover cultivars were used. Crossway was chosen because of its creeping growth habit; Lemmon was chosen because of its erect growth habit. Compositions 3 and 4 consisted of both red clover cultivars combined with either Merks or Meloni. Compositions 5, 6, and 7 consisted of both ryegrass cultivars with either Lemmon plus Crossway, Lemmon, or Crossway, respectively. Sowing densities were 1400 seeds/m² for the ryegrass monoculture plots and 1190 seeds/m² for the mixed species plots, with a ratio of 70 perennial ryegrass seeds to 30 red clover seeds, as is common agricultural practice.

The trial was fertilized and weeded according to common agricultural practice (S1 Table), and mown with a Haldrup plot harvester (Haldrup GmbH, Ilshofen, GER). Three cuts were harvested in 2011 (year 1), four cuts in 2012 (year 2) and five cuts in 2013 (year 3) and 2014 (year 4). The total harvest of each cut was weighed to determine the fresh weight. A subsample of approximately 450 g was dried in a ventilated oven at 65°C for 48 h to determine its water content. This value was subsequently used to estimate the herbage dry matter weight (DMW; t/ha). The botanical composition of mixed species swards was determined right before each cut by collecting four subsamples of approximately 350 g per plot to account for local heterogeneity. The subsamples were separated manually into three fractions: grass, clover and weed (a minor weed fraction was harvested in the first year, but was negligible in subsequent years). Each fraction was dried in a ventilated oven at 65°C for 48 h, and the respective dry weights were determined. The proportion of each fraction was averaged over the four subsamples.
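The yield calculation described above can be illustrated with a minimal Python sketch. The function names are hypothetical, and the harvested area is an assumption: the text gives the plot dimensions (6 by 1.8 m, i.e. 10.8 m²) but does not state the area actually cut by the harvester.

```python
def dry_matter_yield_t_per_ha(fresh_weight_kg, sub_fresh_g, sub_dry_g, harvested_area_m2):
    """Estimate herbage dry matter weight (t/ha) for one cut of one plot.

    The dry fraction of the oven-dried subsample is applied to the
    fresh weight of the whole harvest.
    """
    dry_fraction = sub_dry_g / sub_fresh_g
    dry_matter_kg = fresh_weight_kg * dry_fraction
    # kg/m2 to t/ha: multiply by 10,000 m2/ha and divide by 1,000 kg/t
    return dry_matter_kg / harvested_area_m2 * 10.0


def mean_botanical_fractions(subsamples):
    """Average the dry-weight proportion of each fraction over subsamples.

    subsamples: list of dicts with the dry weights (g) of the
    'grass', 'clover' and 'weed' fractions of one subsample.
    """
    fractions = ("grass", "clover", "weed")
    totals = {f: 0.0 for f in fractions}
    for dry_weights in subsamples:
        total = sum(dry_weights[f] for f in fractions)
        for f in fractions:
            totals[f] += dry_weights[f] / total
    return {f: totals[f] / len(subsamples) for f in fractions}
```

In this sketch the proportions are computed per subsample and then averaged, which follows the wording of the text; pooling the dry weights before computing proportions would give a slightly different value when subsample weights differ.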

Leaf samples for genetic analysis were collected in 2011 after establishment of the 14 plots, and immediately before the spring cut during three subsequent years (2012–2014). At each sampling moment, 40 perennial ryegrass leaves were collected in each plot, rendering a total of 56 sets of 40 leaves (Fig 1). Leaves were picked randomly, but sampling the same plant twice was avoided by maintaining a minimum distance of at least 15 cm between sampling positions. Individual leaf samples were immediately frozen at -80°C, freeze-dried and vacuum packed for storage. With this sampling strategy, we intended to get a representation of the genotypes present in the sward at a given moment in time.

Fig 1. Schematic illustration of the experimental setup of this study.

An overview of the field trial, the perennial ryegrass population samples and the pooling and replication strategy is provided. For GBS of individual plants, genotypes are called per plant, i.e. homozygote reference (0), heterozygote (1) or homozygote alternative (2). The alternative allele frequency (AAFind) in the set of 40 individual plants is the sum of alternative alleles, divided by the total number of chromosomes investigated (80 in this case). For pool-GBS, the alternative allele frequency (AAFpool) is directly measured from the read data, i.e. the alternative allele read count divided by the total read depth (RD) of a locus. For genotyping the 56 population samples, leaf tissue was weighed and pooled in replicate. DNA was extracted from each pooled sample (112 in total), and libraries were prepared. The amplification step was done in three separate PCR reactions, and each PCR product was split and sequenced on two separate HiSeq lanes. Thereafter, sequence read data was merged for each of the original 56 population samples, and AAFpool was measured for population genetic analysis.

https://doi.org/10.1371/journal.pone.0206571.g001

Genotyping-by-sequencing of pooled samples (pool-GBS)

Pooling strategy.

The pool-GBS procedure was validated on a set of 40 plants derived from seed composition 1 (20 plants of replicate A and 20 plants of replicate B, both sampled in 2011). The 40 individual leaf samples were genotyped individually and allele frequencies were calculated (Fig 1) as described below. This information was compared to allele frequencies derived from three replicate pools, each containing equal weights (5 mg) of leaf tissue of the same 40 plants (Fig 1). For one individual sample, three GBS libraries were generated to estimate the reproducibility of the GBS procedure on an individual genotype basis. Comparison of the three replicated pools allowed us to estimate the reproducibility of pool-GBS. AAF derived from individual samples, from the pools, and from pairwise merging of replicate pools were compared to assess the accuracy of allele frequency estimation and reproducibility of SNP calling of the pool-GBS procedure. Based on the results of this validation step (see Results section), we developed the following pooling strategy for the samples of the field experiment: two replicate tissue pools were created for each of the 56 sets of 40 leaves, by weighing 5 mg from each leaf sample of the respective plots, yielding a total of 112 pooled samples.

GBS library preparation.

DNA of the 40 individuals and three pools of the validation set was isolated with the Bio-Nobile Quickpick Plant DNA extraction kit. DNA of the 112 tissue pooled samples was isolated using the CTAB procedure of Doyle [28]. DNA integrity was checked by gel electrophoresis and the concentration was measured with Quantifluor intercalating dye on a Promega Quantus fluorometer (Promega, Madison, USA). All samples were genotyped using a single-enzyme GBS procedure based on Elshire [29] and Byrne [26]. In short, 100 ng of genomic DNA was digested with PstI (New England Biolabs, Ipswich, USA), and barcoded adapters were ligated with T4 ligase (New England Biolabs, Ipswich, USA) in a final volume of 50 μL. Ligation products were purified with AM-pure magnetic beads [30] and eluted in 50 μL TE. PCR amplification was performed separately for each adapter-ligated sample. For the validation experiment we used an aliquot of 2 μL of the bead-purified ligate as template for PCR amplification. The fragment size distribution of amplified libraries was evaluated using a Qiagen QIAxcel system (Qiagen, Venlo, NL). Libraries were quantified with the Quantus fluorometer, and then normalized, pooled, bead-purified, and 100 bp paired-end sequenced on an Illumina HiSeq 2000 instrument by BGI (Beijing, CN). Because of the sequencing of short fragments, only the forward reads were used for the data analysis.

The protocol for the 112 tissue pool samples was slightly adjusted, based on the results of the validation step. In this case, the 50 μL bead-purified ligate was split into three aliquots, and 16.7 μL was used for each of three replicate PCR reactions to account for PCR amplification bias. Amplified libraries were quantified as described above and pooled in equal amounts into three ‘super’ libraries (one for each set of replicated PCR reactions). Super libraries were again bead-purified and 100 bp single-end sequenced on two parallel lanes per super library on a HiSeq 2500 instrument (Genomic Services Lab at HudsonAlpha, Huntsville, USA) (Fig 1). In this case we used single-end sequencing because the results of the validation experiment indicated that very short insert sizes were sequenced (approximately 100 bp). This means that a large proportion of read pairs overlapped at the 3’ end, resulting in read-through (sequencing of adaptor sequences). Moreover, the overlapping part of read pairs was redundant as they were observations of the same molecule (sequenced twice).

NGS read data processing, mapping and analysis.

Reads were demultiplexed with GBSX 1.3 [31] allowing 1 mismatch in the barcodes. Sequence quality was checked with FastQC [32] and summarized with MultiQC v0.7 [33]. Reads containing uncalled bases (Ns) were discarded using a custom Python script. Reads with average base quality below 35 were discarded with prinseq-lite 0.20.4 [34]. 3’ restriction site remnants and common adapter sequences were removed with Cutadapt [35]; 5’ restriction site remnants were removed with FASTX-Toolkit 0.0.13 [36]. All reads were trimmed to a maximum length of 86 bp to account for variable barcode lengths. Reads shorter than 50 bp after trimming were discarded. Trimmed reads were aligned to the perennial ryegrass reference genome [19] with the BWA-MEM algorithm in BWA 0.7.8 with default parameters [37]. Alignments were sorted, indexed, and filtered on mapping quality 20 with SAMtools 1.2 [38]. For each of the 56 samples, the 12 BAM files were merged (i.e. two tissue pool replicates x three PCR replicates x two sequencing lane replicates), yielding 56 BAM files for further analysis. These BAM files were converted to mpileup format with SAMtools, while filtering on minimum RD 30. All previous steps were parallelized with GNU parallel [39].
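The read-level filters in this pipeline (uncalled bases, mean base quality, maximum and minimum length) can be sketched in a few lines of Python. This is only an illustration of the filtering logic, not the custom script or the prinseq-lite and Cutadapt invocations used in the study; the function names are hypothetical, a Phred+33 quality encoding is assumed, and adapter and restriction site trimming is omitted.

```python
def mean_phred(quality_string, offset=33):
    """Mean base quality of a read, assuming Phred+33 encoded qualities."""
    return sum(ord(c) - offset for c in quality_string) / len(quality_string)


def filter_and_trim(seq, qual, min_mean_q=35, max_len=86, min_len=50):
    """Return the trimmed (seq, qual) pair, or None if the read is discarded."""
    if not seq or "N" in seq.upper():
        return None  # read contains uncalled bases
    if mean_phred(qual) < min_mean_q:
        return None  # average base quality too low
    # trim to a common maximum length (adapter removal is not modeled here)
    seq, qual = seq[:max_len], qual[:max_len]
    if len(seq) < min_len:
        return None  # too short after trimming
    return seq, qual
```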

For each genomic position, we counted the number of samples with missing data. For the individual plants, we retained loci with a maximum of 1, 5 or 10 missing genotype calls. For the pooled samples, we removed all loci with any missing data. The three datasets (individual plants, pool replicates of the validation experiment, and the 56 pools of the field experiment) were also filtered on excessive RD, i.e. positions with an extremely high number of reads, which probably represent repetitive regions, were removed. The thresholds for this filtering step were determined for each dataset separately: a maximum total RD of 6 k for the dataset of 40 individual samples (and replicate individual samples), 3.5 k for the three replicate pools of the validation experiment, 7 k for the pairwise merged pools and 150 k for the 56 pooled samples of the field experiment.
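As an illustration, the completeness and depth filters for the pooled samples could be expressed as follows. This is a sketch under assumptions: the function and argument names are hypothetical, a pooled sample is treated as missing at a position when its RD is below the minimum of 30 used during mpileup conversion, and the excessive-RD threshold is applied to the RD summed over samples.

```python
def filter_loci(depths, max_missing=0, min_depth=30, max_total_depth=150_000):
    """Return the loci that pass the missing-data and excessive-RD filters.

    depths: dict mapping a locus identifier to a list of per-sample read depths.
    """
    kept = []
    for locus, per_sample in depths.items():
        # assumption: a sample counts as missing when its RD is below min_depth
        missing = sum(1 for d in per_sample if d < min_depth)
        if missing > max_missing:
            continue
        # excessive total RD probably reflects repetitive regions
        if sum(per_sample) > max_total_depth:
            continue
        kept.append(locus)
    return kept
```

For the individual plants, the same function could be called with `max_missing` set to 1, 5 or 10 and the dataset-specific RD threshold.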

RD saturation curves were constructed by first merging all read data of the validation experiment, followed by computational subsampling of reads, read mapping, and calculating the number of genomic reference positions with a minimal RD of 10, 30, 100, or 300 reads at increasing numbers of reads mapped.

SNP calling and allele frequency measurement.

For individual plants, SNPs were called with GATK HaplotypeCaller 3.3.2 [40]. Indels and multi-allelic SNPs were removed, and genotype calls with RD below 10 or a genotype quality (GQ) score below 30 were flagged as missing data with VCFtools 0.1.14 [41]. AAFind at each SNP position over the set of 40 individual plants was calculated as the number of alternative alleles, counting homozygous reference calls as 0, heterozygous calls as 1 and homozygous alternative calls as 2, divided by the number of sampled chromosomes. To identify SNPs and estimate AAFpool of the pooled samples, we used SNAPE-pooled. This is a Bayesian method to differentiate genotyping errors from real alleles on a per-sample basis, using Watterson’s estimator theta as a diversity prior determined by NPstat [42, 43]. The theta value was calculated for each pooled sample separately, with a fixed minimum minor allele count (MAC) of three reads. SNAPE-pooled was run with an informative prior and folded spectrum. For each sample, the algorithm estimates the AAFpool values based on allelic RDs, and reports the posterior probabilities used to call an allele fixed reference (1 − p(0) < 0.9) or fixed alternative (p(1) > 0.9). A custom Python script was used to merge the SNAPE-pooled output files into a single AAFpool matrix and to set frequencies to zero if 1 − p(0) < 0.9 and to one if p(1) > 0.9, where p represents the posterior probability as described in Raineri [42]. After all the above filters had been applied, we removed positions that were tri- or tetra-allelic across all samples.
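The two allele frequency definitions, and the thresholding applied to the SNAPE-pooled output, can be written compactly. The following Python sketch uses hypothetical function names; it assumes diploid genotype codes (0, 1, 2), excludes missing calls from the chromosome count, and takes the posterior probabilities as plain numbers rather than parsing SNAPE-pooled output.

```python
def aaf_individuals(genotypes):
    """AAFind from per-plant genotype codes (0, 1, 2; None = missing call)."""
    called = [g for g in genotypes if g is not None]
    if not called:
        return None
    # each called diploid plant contributes two sampled chromosomes
    return sum(called) / (2 * len(called))


def aaf_pool(alt_reads, total_reads):
    """AAFpool as alternative allele read count over total read depth."""
    return alt_reads / total_reads if total_reads else None


def apply_fixation_calls(aaf, one_minus_p0, p1, threshold=0.9):
    """Set the frequency to 0 or 1 when the posterior supports fixation."""
    if one_minus_p0 < threshold:
        return 0.0  # insufficient support for the alternative allele
    if p1 > threshold:
        return 1.0  # alternative allele called as fixed
    return aaf
```

With 40 fully genotyped plants the denominator of `aaf_individuals` is 80, matching the number of chromosomes given in the caption of Fig 1.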

For the validation experiment, we calculated the Pearson’s correlation and the median deviation between AAFind and AAFpool for SNPs that were identified by both approaches.
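Both agreement measures are straightforward to compute for a set of shared SNPs. The sketch below is illustrative only; the function names are hypothetical, and the median deviation is assumed to be the median of the absolute differences between paired AAFind and AAFpool values, which the text does not specify.

```python
import math
import statistics


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def median_abs_deviation_between(x, y):
    """Median absolute difference between paired values (assumed definition)."""
    return statistics.median(abs(a - b) for a, b in zip(x, y))
```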

Data analysis

Herbage DMW and species composition.

The herbage DMW harvested each year (Fig 2) was analyzed with linear mixed models implemented in R 3.4.0 [44] with the lmer function of the lme4 package (version 1.1–13) [45]. To test the effects of species or cultivar on the total DMW, three hypotheses were formulated (Table 2). Seed compositions 1 to 4 allow testing the contrast between the ryegrass cultivars and the presence or absence of red clover (H1). Seed compositions 3 to 5 allow testing the effect of the ryegrass component when combined with both red clover cultivars (H2). Finally, seed compositions 5 to 7 allow testing the effect of the clover component when combined with both ryegrass cultivars (H3). The significance of the variables expressing the hypotheses was determined with likelihood ratio tests implemented with the anova function. The following three models were tested:

Fig 2. Forage yield harvested per year.

The first row shows the DMW of the total yield (both species); the second and third rows show the DMW of the grass component and the clover component, respectively. The first column shows the seed compositions containing the ryegrass cultivar Merks, the second column shows the seed compositions containing both ryegrass cultivars, and the third column shows the seed compositions containing the ryegrass cultivar Meloni.

https://doi.org/10.1371/journal.pone.0206571.g002

Table 2. Effects of the perennial ryegrass and red clover cultivars on forage yield and species abundance.

https://doi.org/10.1371/journal.pone.0206571.t002

In model 1, Ylmjk represents the total DMW (both species) of field plot replicate j of the seed composition with ryegrass cultivar l (Merks or Meloni) and clover component m (a mixture of the two red clover cultivars, or no clover) in year k (1 to 4). (Ll x Tm) is the interaction between Ll and Tm, which are both fixed effects. Rj (replicate) and Yk (year) are random effects. The term μ represents the mean total DMW, and ϵij is the error term. Model 2 is similar to model 1, but the clover component (and the interaction term) have been excluded. In this case, the ryegrass component l is either Merks, Meloni or a mixture of both cultivars. Model 3 is similar to model 2, but in this case Ll has been replaced by Tm. In this case, the clover component m is either Lemmon, Crossway or a mixture of both cultivars. Additionally, H2 and H3 were tested considering the DMW of grass and clover separately.
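Written out from the term definitions given above, the three models would take approximately the following form. This is a reconstruction from the description and not a verbatim copy of the published equations; in particular, the subscripts of the error term are written out in full here as an assumption.

```latex
\begin{align}
Y_{lmjk} &= \mu + L_l + T_m + (L_l \times T_m) + R_j + Y_k + \epsilon_{lmjk} && \text{(model 1)} \\
Y_{ljk}  &= \mu + L_l + R_j + Y_k + \epsilon_{ljk} && \text{(model 2)} \\
Y_{mjk}  &= \mu + T_m + R_j + Y_k + \epsilon_{mjk} && \text{(model 3)}
\end{align}
```

Here $L_l$, $T_m$ and their interaction are fixed effects, while $R_j$ (replicate) and $Y_k$ (year) are random effects, as stated in the text.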

Genetic composition and diversity of the ryegrass component.

The SNPs were filtered on minimum 10% and maximum 90% mean AAFpool per locus across the 56 samples (see Results section for justification). Principal component analysis (PCA) was then used to detect the most pronounced trends in the data. We used the prcomp function (3.0.2) on the centered, non-scaled AAFpool data (samples were oriented as rows and SNPs as columns). Hence, the scores on the principal components represent the 56 samples, i.e. 14 field plots at 4 time points, and a line plot of year against score for one of the principal components shows changes in the genetic composition of each field plot over time.
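To make the two steps concrete, the following Python sketch filters SNPs on mean AAFpool and computes sample scores on the first principal component of the centered, non-scaled matrix. It is not the prcomp implementation used in the study: power iteration on the sample-by-sample cross-product matrix is substituted here to keep the example dependency-free, the function names are hypothetical, and the sign of the scores is arbitrary.

```python
import math


def filter_by_mean_aaf(aaf, low=0.10, high=0.90):
    """Keep SNP columns whose mean AAF across samples lies within [low, high]."""
    n_samples, n_snps = len(aaf), len(aaf[0])
    keep = [j for j in range(n_snps)
            if low <= sum(row[j] for row in aaf) / n_samples <= high]
    return [[row[j] for j in keep] for row in aaf]


def first_pc_scores(aaf, n_iter=500):
    """Scores of the samples (rows) on PC1 of the centered, non-scaled data."""
    n, m = len(aaf), len(aaf[0])
    col_means = [sum(row[j] for row in aaf) / n for j in range(m)]
    centered = [[row[j] - col_means[j] for j in range(m)] for row in aaf]
    # sample-by-sample cross-product matrix of the centered data
    gram = [[sum(a * b for a, b in zip(r1, r2)) for r2 in centered]
            for r1 in centered]
    v = [float(i + 1) for i in range(n)]  # arbitrary non-constant start vector
    for _ in range(n_iter):
        w = [sum(gram[i][k] * v[k] for k in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            return [0.0] * n  # no variation in the data
        v = [x / norm for x in w]
    eigenvalue = sum(v[i] * sum(gram[i][k] * v[k] for k in range(n))
                     for i in range(n))
    # scores equal the leading eigenvector scaled by the singular value
    return [x * math.sqrt(max(eigenvalue, 0.0)) for x in v]
```

Plotting these scores against sampling year, one line per field plot, gives the kind of line plot described in the text.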

The expected heterozygosity was calculated per locus and sample, based on the AAFpool values following He = 2 × AAFpool × (1 − AAFpool). Subsequently, the genetic diversity contained in each sample was calculated by averaging He over all SNPs (i.e. mean He). As each plot was sampled at four discrete time points, the data were organized in 14 temporal series (corresponding to the 14 field plots) for further inspection. He values are independent of the alternative allele being the minor or major allele, and allow comparing genetic diversity in samples corresponding to different field plots, or in samples corresponding to different time points of the same field plot. Mean He is maximal (0.50) when all alleles occur in equal proportions and decreases as the frequency of one of the alleles (either reference or alternative) increases. A low mean He value might thus be indicative of an overall lower level of diversity when samples of different plots are compared. Correspondingly, changes in mean He might be indicative of selection when samples taken at different time points are compared. Mean He data were analyzed with a linear model with the aov function in R (model 4).

In model 4, mean He ijk represents the genetic diversity of field plot replicate j (A or B) of seed composition i (7 in total) in year k (1 to 4). The factor Si represents seed composition, and the interaction of seed composition and replicate (Si x Rj) represents differences of AAFpool between replicate field plots (irrespective of time). The interaction of seed composition and year (Si x Yk) captures temporal changes in mean He, which are consistent across replicate field plots of a given seed composition. Plot-specific changes of mean He are captured by the residual term ϵijk. The degrees of freedom of this model are 55 for the dependent variable mean He, 6 for Si, 7 for (Si x Rj), 21 for (Si x Yk) and 21 for the residuals.
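The per-sample diversity measure that serves as the dependent variable in this model follows directly from the formula He = 2 × AAFpool × (1 − AAFpool). A minimal Python sketch, with hypothetical function names:

```python
def expected_heterozygosity(aaf):
    """Expected heterozygosity of a biallelic locus: He = 2 * p * (1 - p)."""
    return 2.0 * aaf * (1.0 - aaf)


def mean_he(aaf_values):
    """Genetic diversity of one sample: He averaged over all its SNPs."""
    return sum(expected_heterozygosity(p) for p in aaf_values) / len(aaf_values)
```

Because He is symmetric around 0.5, a frequency of 0.2 and a frequency of 0.8 give the same value, which is why the measure does not depend on whether the alternative allele is the minor or the major allele.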

Next, we investigated possible temporal changes in the relative abundance of the two ryegrass cultivars in the plots in which both were sown together (seed compositions 5 to 7), using cultivar private SNPs. We considered ‘Merks private SNPs’ those that are not polymorphic in plots in which only Meloni was sown, but polymorphic in plots where Merks was sown (either as a single ryegrass cultivar or in mixture with Meloni). Conversely, Meloni private SNPs were not polymorphic in plots in which only Merks was sown, but polymorphic in plots where Meloni was sown. The mean AAFpool per sample was calculated for both private SNP sets and represented graphically for inspection.
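The selection of cultivar private SNPs can be sketched as follows. The names are hypothetical, and two points that the text leaves open are filled in as assumptions: a SNP counts as private only if it is monomorphic in every sample of the plots sown without the cultivar and polymorphic in every sample of the plots sown with it, and "polymorphic" means an AAFpool strictly between 0 and 1.

```python
def is_polymorphic(aaf):
    """A locus is polymorphic in a sample when both alleles are present."""
    return 0.0 < aaf < 1.0


def private_snp_indices(samples_without, samples_with):
    """Indices of SNPs private to the cultivar present in samples_with.

    Both arguments are lists of AAFpool vectors (one vector per sample),
    with SNPs in the same order in every vector.
    """
    n_snps = len(samples_with[0])
    return [j for j in range(n_snps)
            if all(not is_polymorphic(s[j]) for s in samples_without)
            and all(is_polymorphic(s[j]) for s in samples_with)]


def mean_aaf(sample, indices):
    """Mean AAFpool of one sample over a set of SNP indices."""
    return sum(sample[j] for j in indices) / len(indices)
```

For Merks private SNPs, `samples_without` would hold the samples from Meloni-only plots and `samples_with` the samples from plots in which Merks was sown. The sketch does not handle loci at which the other cultivar is fixed for the alternative allele; for those, the private allele is the reference allele and 1 − AAFpool would be the relevant quantity.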

Temporal changes of allele frequencies.

The time series of AAFpool values were analyzed with an approach similar to that used for mean He (described above). He reflects the balance in allele frequency of two alleles at a single locus. This on its own does not inform us about the identity of the allele that is more or less abundant. Linear regression of AAFpool values allows investigating these aspects. The AAFpool data were analyzed with the aov function using the following model:

Model 5 is similar to model 4, except for an additional covariate PC1 that represents the scores of the first principal component of the PCA. In this model, PC1 captures the differentiation between Merks and Meloni, and changes in the relative abundance of these two cultivars in the plots in which they were sown together (seed compositions 5 to 7). This was considered necessary after inspection of the PCA results and the temporal changes of mean He (see Discussion for further explanation). The variance of AAFpool was partitioned to estimate the relative importance of the interaction between seed composition and year (Si x Yk). This component captures the variance attributed to temporal changes in AAFpool that are consistent for field plots of the same seed composition.
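Based on the term definitions given for models 4 and 5, the two linear models can be written out approximately as follows. This is a reconstruction from the description, not a verbatim copy of the published equations; the coefficient $\beta$ for the PC1 covariate is notation introduced here for illustration.

```latex
\begin{align}
\overline{H_e}_{ijk} &= \mu + S_i + (S_i \times R_j) + (S_i \times Y_k) + \epsilon_{ijk} && \text{(model 4)} \\
\mathrm{AAF}^{\mathrm{pool}}_{ijk} &= \mu + \beta\,\mathrm{PC1}_{ijk} + S_i + (S_i \times R_j) + (S_i \times Y_k) + \epsilon_{ijk} && \text{(model 5)}
\end{align}
```

The degrees of freedom reported for model 4 are consistent with 7 seed compositions, 2 replicates and 4 years:

```latex
\begin{align}
\text{total: } & 7 \cdot 2 \cdot 4 - 1 = 55, &
S_i\text{: } & 7 - 1 = 6, &
(S_i \times R_j)\text{: } & 7\,(2 - 1) = 7, \\
(S_i \times Y_k)\text{: } & 7\,(4 - 1) = 21, &
\text{residual: } & 55 - 6 - 7 - 21 = 21.
\end{align}
```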

Identification of outlier loci.

Finally, we screened the SNPs for outliers, i.e. loci that were putatively under selection during the cultivation period. The allele frequencies of loci under selection (or linked with unobserved loci under selection), are expected to change more dramatically over time than those of neutral loci. The quotient of the mean sum of squares (MSS) of (Si x Yk) and the residual sum of squares (RSS) (model 5) was used as test statistic. This value represents the ratio of variance explained by changes in allele frequency that are consistent for replicated field plots of the same seed composition to changes that are not consistent. This model assumes that the ryegrass component of replicate field plots experiences similar selection pressures and responds similarly. We compared several probability density functions to fit the null distribution of the test statistic, including the chi-square, F, gamma and log-normal distribution (see Results). The fitdistr function of the MASS package of R (version 7.3–47) was used for maximum likelihood-fitting of the distributions (Venables and Ripley, 2002).
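The outlier screen can be illustrated with a short Python sketch that computes the test statistic from the sums of squares of one SNP and fits one of the candidate null distributions. The function names are hypothetical, the log-normal distribution is chosen here only because its maximum likelihood estimates have a closed form, and the sketch does not reproduce the fitdistr call or the distribution eventually selected in the study.

```python
import math


def selection_statistic(ss_interaction, df_interaction, rss):
    """Quotient of the mean sum of squares of (Si x Yk) and the RSS."""
    return (ss_interaction / df_interaction) / rss


def fit_lognormal_mle(values):
    """Maximum likelihood estimates (mu, sigma) of a log-normal distribution."""
    logs = [math.log(v) for v in values]
    n = len(logs)
    mu = sum(logs) / n
    # the maximum likelihood estimate of the variance uses n, not n - 1
    sigma = math.sqrt(sum((x - mu) ** 2 for x in logs) / n)
    return mu, sigma


def upper_tail_probability(x, mu, sigma):
    """P(X > x) under a log-normal distribution with parameters mu and sigma."""
    z = (math.log(x) - mu) / sigma
    return 0.5 * math.erfc(z / math.sqrt(2.0))
```

In such a screen, the statistic would be computed for every SNP, the null distribution fitted to the genome-wide set of statistics, and SNPs with a very small upper-tail probability (after correction for multiple testing) flagged as candidate outliers.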

Results

Herbage yield and species abundance

The total herbage DMW showed a similar temporal pattern across field plots (Fig 2). In general, the establishment year was the least productive, while the second year was the most productive.

Hypothesis 1 was tested using seed compositions 1 to 4. Both the ryegrass cultivar (either Merks or Meloni) and the presence of red clover significantly affected the herbage yield (Table 2, H1; pLRT = 0.021 and pLRT = 2.23 x 10⁻⁶, respectively). The interaction of these effects was not significant (pLRT = 0.145). The total DMW was higher when perennial ryegrass was sown in combination with red clover (Fig 2; seed compositions 3 > 1 and 4 > 2). The DMW of the Meloni monoculture was higher than that of Merks, and this was more evident in the plots in which Merks or Meloni were combined with red clover (seed compositions 4 > 3).

When ryegrass was sown in combination with a mixture of both red clover cultivars, the ryegrass component significantly affected total DMW (Table 2, H2a, pLRT = 0.015). When the effect of the ryegrass component on the DMW of grass and clover was tested separately, a significant marginal effect was only detected for the clover DMW (pLRT = 0.031).

For field plots in which a mixture of both ryegrass cultivars was sown in combination with red clover (seed compositions 5 to 7), plots containing only Lemmon had the highest yield, followed by mixed red clover cultivars, and finally Crossway (seed composition 6 > 5 > 7). The effect of red clover composition was significant for the total DMW, and for the grass DMW and clover DMW separately (Table 2, H3, pLRT = 1.5 x 10⁻⁶; pLRT = 2.2 x 10⁻³; pLRT = 4.3 x 10⁻⁶, respectively). However, the effect on the DMW of both species separately was reversed. The highest yield of clover was obtained with Lemmon, and the highest yield of ryegrass was obtained with Crossway (Fig 2).

Taken together, these results indicate that the presence of red clover had a significant effect on the total DMW. Meloni was higher yielding than Merks in ryegrass monocultures and in combination with red clover. In combinations with perennial ryegrass, Lemmon (erect growing habit) displayed a better competitive ability than Crossway (creeping growing habit), resulting in a larger total DMW although the yield of the grass component was significantly lower when combined with Lemmon.

Comparison of pool-GBS allele frequencies and frequencies estimated from GBS genotyping of individuals

First, the pool-GBS procedure was validated using a subset of 40 leaf tissue samples. GBS sequencing resulted in 2.4 ± 0.4 M reads for the individual plants (n = 40) and 14.1 ± 1.1 M reads for the pools used for validation (n = 3). To determine the optimum number of reads per sample, we constructed RD saturation curves at varying levels of minimum RD threshold. The saturation curves suggest that the majority of potentially available GBS loci are covered if at least ~20 M reads are obtained per sample and that the coverage increases when data of replicate pools are merged (S1 Fig).

Comparison of the genotype calls of three technical replicates of a single plant (replicate GBS library preparation and sequencing) shows that individual genotyping was highly reproducible. On average, 93.6% of the genotype calls were identical in the three pairwise comparisons and 92.4% of the genotype calls were identical across all three replicate SNP sets (S3E Fig). Next, we compared the alternative allele frequencies based on the genotyping data of 40 individual plants (AAFind) to the allele frequencies estimated in three replicate pools (AAFpool). Because the 40 individual plants were not sequenced to saturation (S1A Fig), we considered three thresholds of missing data across the individuals, i.e. max. 1, 5 or 10 missing plants (out of 40) per SNP (S2A Fig, columns), and compared them to replicate pool 1 at increasing minimum RD threshold per SNP position in the pool data, i.e. minimum 30, 100 and 300 reads per SNP position (S2A Fig, rows). In general, correlations were high, with r ranging from 0.94 to 0.96. As the maximum missing data threshold for the AAFind became less stringent and more SNPs were considered in the correlation, r slightly decreased. As the minimum RD threshold for the pool increased, fewer SNPs were considered and r slightly increased (S2A Fig). Overall, the allele frequencies measured in one pool agreed very well with those estimated by individual genotyping. This was consistent for the three pool-GBS replicates (S2B Fig), and shows that maximizing the number of loci screened by using an RD threshold of 30 for SNAPE-pooled does not negatively affect the AAFpool accuracy. Next, we analyzed the correlation of the AAFpool of two pool-GBS replicates, considering only the SNP positions that were also identified in the individuals. All three pairwise AAFpool comparisons showed a slightly lower r than AAFind versus AAFpool comparisons (S2C Fig). The highest correlation (r > 0.967) with the AAFind was achieved when read data from two replicate pools were merged before SNP calling (S2D Fig). Therefore, we decided to create two replicates of leaf tissue pools for the genotyping of the 56 population samples.
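
The comparison of AAFind and AAFpool can be illustrated with a short sketch. It assumes diploid genotype calls coded as the number of alternative alleles (0, 1, 2, or None for a missing call) and pooled read counts per SNP. The function names, the default thresholds (taken from the RD 30 and max. 5 missing plants settings mentioned above) and the toy data are illustrative assumptions, not the pipeline used in the study.

```python
import math

def aaf_from_genotypes(genotypes, max_missing=5):
    """Alternative allele frequency from diploid calls (0, 1, 2 or None if missing)."""
    called = [g for g in genotypes if g is not None]
    if not called or len(genotypes) - len(called) > max_missing:
        return None  # too many missing plants for this SNP
    return sum(called) / (2 * len(called))

def aaf_from_pool(alt_reads, total_reads, min_rd=30):
    """Alternative allele frequency from pooled read counts, if read depth suffices."""
    if total_reads < min_rd:
        return None
    return alt_reads / total_reads

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Toy SNPs: (genotypes of four plants, alternative reads, total reads in the pool).
snps = [
    ([0, 1, 2, 1], 45, 100),
    ([0, 0, 1, 0], 12, 90),
    ([2, 2, 1, 2], 70, 80),
    ([0, 1, None, 1], 5, 20),  # pool read depth below 30, so this SNP is skipped
]
pairs = []
for genotypes, alt, total in snps:
    ind, pool = aaf_from_genotypes(genotypes), aaf_from_pool(alt, total)
    if ind is not None and pool is not None:
        pairs.append((ind, pool))
r = pearson_r([p[0] for p in pairs], [p[1] for p in pairs])
```

Loosening `max_missing` or lowering `min_rd` admits more SNPs into the comparison, which is the trade-off between SNP number and correlation described in the text.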

Reproducibility of SNP identification

More than half of the SNPs were called uniquely by SNAPE-pooled in a given pool sample but were not identified by GATK in the set of 40 individuals, even though sufficient RD was available at the respective loci (S3A Fig). This was consistent across the three pool replicates (S3B Fig). Likewise, pairwise comparisons of replicate pools showed that up to 29% of the SNPs were uniquely identified by SNAPE-pooled in a given pool but not in a replicate pool sample, despite the pools containing material from the same 40 leaves (S3C Fig). Combining the read data of two pool replicates showed similar patterns (S3 Fig, compare B and D). Furthermore, intersecting the SNAPE-pooled SNP sets of the three pool replicates showed that 59.8% of all SNPs were called uniquely in a single pool, and only 31.9% were common to the three pools (S3E Fig).

We ran the NPstat/SNAPE-pooled pipeline with a range of parameter values to test whether the number of non-reproducible SNPs could be reduced (results not shown). The minor allele read count (MAC) threshold of NPstat was increased from minimum 2 (default) up to minimum 6, which resulted in (reproducible) estimates for theta from ca. 0.014 to ca. 0.007. However, varying the theta diversity prior for SNAPE-pooled in this range did not reduce the number of non-reproducible SNPs in the output of SNAPE-pooled.

In order to identify the source of the non-reproducible SNP calls, we compared the AAF spectra of uniquely called (non-reproducible) SNPs versus SNPs that were consistently called (reproducible) by SNAPE-pooled in replicate pools. Pairwise comparisons of replicate pools revealed that non-reproducible SNPs were strongly skewed towards low AAFpool values (S4 Fig). For instance, 5.1% to 5.8% of the reproducible SNPs and 65.6% to 69.7% of the non-reproducible SNPs had an AAFpool < 3% across the various pairwise pool comparisons. Taken together, these data suggest that non-reproducible SNPs are derived from randomly distributed and typically low-frequency read errors.
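
The comparison of reproducible and non-reproducible SNP calls can be sketched as below. The example assumes each pool is represented as a mapping from a SNP identifier to its AAFpool. The identifiers, values and function names are hypothetical and serve only to show the bookkeeping.

```python
def split_by_reproducibility(pool_a, pool_b):
    """Split the SNPs called in pool_a by whether they were also called in pool_b."""
    reproducible = {s: f for s, f in pool_a.items() if s in pool_b}
    unique = {s: f for s, f in pool_a.items() if s not in pool_b}
    return reproducible, unique

def fraction_below(aaf_by_snp, threshold=0.03):
    """Fraction of SNPs with an alternative allele frequency below the threshold."""
    if not aaf_by_snp:
        return 0.0
    return sum(1 for f in aaf_by_snp.values() if f < threshold) / len(aaf_by_snp)

# Toy call sets for two replicate pools (not data from the study).
pool_a = {"s1": 0.45, "s2": 0.02, "s3": 0.01, "s4": 0.30, "s5": 0.025}
pool_b = {"s1": 0.44, "s4": 0.28, "s6": 0.015}

reproducible, unique = split_by_reproducibility(pool_a, pool_b)
low_in_reproducible = fraction_below(reproducible)
low_in_unique = fraction_below(unique)
```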

In conclusion, SNP filtering based on p(0) and p(1) values as recommended by SNAPE-pooled needs to be complemented with additional filtering based on the allele frequency spectrum. We chose a cutoff of minimum 10% < mean AAFpool < 90% for at least one sample out of 56 as criterion to retain SNPs for further data analysis.
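
A filter of this kind might look like the sketch below. It implements one possible reading of the criterion, namely that a SNP is retained when its AAFpool lies strictly between 10% and 90% in at least one sample. That reading, the function names and the toy data are assumptions made for illustration.

```python
def passes_aaf_filter(aaf_values, low=0.10, high=0.90, min_samples=1):
    """True if at least `min_samples` samples have low < AAF < high."""
    n_inside = sum(1 for a in aaf_values if low < a < high)
    return n_inside >= min_samples

def filter_snps(aaf_by_snp, **kwargs):
    """Keep only the SNPs that pass the allele frequency filter."""
    return {snp: vals for snp, vals in aaf_by_snp.items()
            if passes_aaf_filter(vals, **kwargs)}

# Toy AAFpool values across four samples for three hypothetical SNPs.
aaf_by_snp = {
    "snp_a": [0.02, 0.00, 0.01, 0.03],  # low frequency in every sample: dropped
    "snp_b": [0.05, 0.40, 0.35, 0.08],  # intermediate in two samples: kept
    "snp_c": [0.99, 0.97, 1.00, 0.98],  # near fixation everywhere: dropped
}
kept = filter_snps(aaf_by_snp)
```

Raising `min_samples` would make the filter stricter at the cost of discarding SNPs that are informative in only a few populations.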

SNP genotyping of the 56 pooled samples

Based on the insights obtained in the validation experiment, we used the strategy of creating two replicate pools of tissue, preparing one GBS library per tissue pool, performing three independent PCRs per library, sequencing all of those in parallel on two lanes, and merging the read data per pooled sample to estimate AAFpool profiles of the 56 samples of the field experiment (Fig 1).

Sequencing resulted in 29.3 ± 5.2 M reads per pooled sample, which covered on average 2.3 ± 0.2 Mbp of the reference genome with sufficient RD to estimate AAFpool (RD > 30) (Raineri [42], and results of the validation experiment) (S1B Fig). In total, 3.76 Mbp of the reference genome sequence was covered by reads from at least one sample. However, we expected that comparing the genetic diversity of samples is more accurate when a common set of loci is considered. Therefore, we only considered loci with complete data (56 AAFpool values), which cover ca. 1.1 Mbp of the reference genome sequence. We removed approximately 15 kbp of base positions that were covered with excessive total RD (above 150 k reads over all samples) and likely represent genomic repeat sequences. The remaining read data covered ca. 1.085 Mbp, which represents ca. 0.1% of the 1.13 Gbp reference genome sequence [19].
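
The coverage figures in this paragraph can be checked with a few lines of arithmetic. The sketch below only uses the rounded quantities quoted in the text (complete-data loci, removed repeat-like positions and genome size), so the result is approximate; the variable names are illustrative.

```python
complete_bp = 1.1e6   # loci with complete data across the 56 samples (ca. 1.1 Mb)
repeat_bp = 15e3      # positions removed for excessive total read depth (ca. 15 kb)
genome_bp = 1.13e9    # size of the reference genome sequence (1.13 Gbp)

remaining_bp = complete_bp - repeat_bp
fraction = remaining_bp / genome_bp

print(f"{remaining_bp / 1e6:.3f} Mb remaining, {100 * fraction:.2f}% of the genome")
```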

SNAPE-pooled identified 238 k SNPs. We removed 9 k SNPs with more than two alleles across all samples. As expected from the validation experiment, many SNPs which were polymorphic in just one sample had a low AAFpool. To reduce the fraction of non-reproducible SNPs (see validation), we only retained SNPs with a mean AAFpool per SNP of minimum 10% and maximum 90%, resulting in a dataset of 22,324 SNPs. Usually, heteroscedastic allele frequencies are normalized [46]. However, we preferred to work with non-normalized allele frequencies because normalization inflates AAFpool values close to either zero or one and/or present in only a few samples, which are abundant in our dataset.

Overall patterns of differentiation among populations based on PCA

We estimated the overall patterns of differentiation among samples with PCA based on AAFpool of 22,324 SNPs. One meaningful principal component was obtained, explaining 77% of the variance (Fig 3). Samples corresponding to field plots that contained either Merks or Meloni clustered on opposite sides of this PC (seed compositions 1 and 3 versus 2 and 4). Therefore, this PC captures the variance in AAFpool that can be attributed to differentiation between the ryegrass cultivars Merks and Meloni. Samples corresponding to mixtures of both cultivars take an intermediate position on the PC1 axis, and show a rather consistent time-related pattern, with samples taken in years 1 and 4 clustering closer to Merks, and samples taken in years 2 and 3 clustering closer to Meloni.

Fig 3. Principal component analysis (PCA).

Results for 56 pooled samples based on AAFpool of 22,324 SNPs are summarized. PCA scores of the first PC for each plot over the four years are shown. Lines represent the time series of a field plot, and the colors indicate the seed composition. The screeplot shows the portion of the total variance explained by the first five PCs. The first PC explains 77% of the total variance.

https://doi.org/10.1371/journal.pone.0206571.g003

Temporal patterns of change in genetic diversity

The overall genetic diversity of perennial ryegrass populations and its change over time were estimated by calculating mean He per sample. The main factors explaining mean He are seed composition Si and its interaction with year (Si x Yk), which captures changes in overall genetic diversity that are consistent across replicates (Table 3). The highest mean He values were found in samples representing plots in which both ryegrass cultivars were sown (seed compositions 5, 6 and 7) (Fig 4A). Among plots in which only one ryegrass cultivar was sown, samples of Merks (seed compositions 1 and 3) were more diverse than samples of Meloni (seed compositions 2 and 4). The genetic diversity of Merks plots remained relatively stable across the four-year experiment, while the diversity of Meloni plots increased. The genetic diversity in plots with both ryegrass cultivars increased from year 1 to year 2, remained constant from year 2 to year 3 and decreased again from year 3 to year 4.
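
As an illustration of how a mean He per sample can be derived from pooled allele frequencies, the sketch below assumes the standard expected heterozygosity of a biallelic locus, He = 2p(1 - p), without any sample-size correction. The exact estimator used in the study is defined in its Methods and may differ; the function names and toy frequencies are hypothetical.

```python
def expected_heterozygosity(aaf):
    """He of a biallelic locus with alternative allele frequency `aaf`."""
    return 2.0 * aaf * (1.0 - aaf)

def mean_he(aaf_values):
    """Mean He over all loci of one sample."""
    return sum(expected_heterozygosity(p) for p in aaf_values) / len(aaf_values)

# Toy AAFpool profiles over four loci (not data from the study).
single_cultivar = [0.05, 0.90, 0.50, 0.00]
mixture = [0.30, 0.60, 0.50, 0.20]

he_single = mean_he(single_cultivar)
he_mixture = mean_he(mixture)
```

Because He peaks at an allele frequency of 0.5, mixing two differentiated cultivars pulls many loci toward intermediate frequencies and raises mean He, which matches the pattern reported for the mixed-cultivar plots.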

Fig 4. Time series of the genetic diversity of perennial ryegrass swards.

Genetic diversity of the 14 field plots as estimated by mean He, based on 22,324 SNPs (A), and time series of mean AAFpool per sample, based on 246 Merks private SNPs (B) and 617 Meloni private SNPs (C). Lines represent the time series of a field plot, and the colors indicate the seed composition. The genetic diversity of field plots with seed compositions containing Merks only (1 and 3) remained stable throughout the four years. The field plots of seed compositions containing Meloni only (2 and 4) were less diverse than those of Merks, but increased slightly after establishment. The field plots containing both cultivars (5 to 7) were more diverse and more variable than the plots containing a single cultivar. These changes coincide with changes in the mean AAFpool of cultivar private SNPs, and are attributed to shifts in the cultivar composition of the population.

https://doi.org/10.1371/journal.pone.0206571.g004

Table 3. ANOVA results of the mean He based on 22,324 SNPs (Fig 4A).

https://doi.org/10.1371/journal.pone.0206571.t003

The distinct temporal patterns observed in the scores of PC1 (Fig 3) and in the mean He values (Fig 4A) for the plots in which both ryegrass cultivars were sown suggest that the genetic composition of the mixed ryegrass cultivar plots changed throughout the four years. These changes in the cultivar composition were confirmed by inspection of cultivar private SNPs. In total, 246 (1.1%) and 617 (2.8%) of the 22,324 SNPs were uniquely detected in either Merks or Meloni respectively (Fig 4B and 4C). In both cases, mean AAFpool per sample was relatively constant for ryegrass monoculture plots but displayed temporal changes for the mixed cultivar plots. These results confirm that the cultivar composition of mixed cultivar plots changed over the course of four years. Merks was more prominent in years 1 and 4, Meloni was more prominent in years 2 and 3.
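
The use of cultivar private SNPs as a proxy for cultivar abundance can be sketched as follows. The example assumes that a SNP counts as private to a cultivar when its alternative allele is detected in that cultivar's monoculture samples but in none of the other cultivar's monoculture samples. The detection threshold, sample labels and function names are hypothetical.

```python
def private_snps(aaf_by_snp, own_samples, other_samples, detect=0.0):
    """SNPs whose alternative allele is seen in own_samples but in no other_samples."""
    private = []
    for snp, aaf in aaf_by_snp.items():
        in_own = any(aaf[s] > detect for s in own_samples)
        in_other = any(aaf[s] > detect for s in other_samples)
        if in_own and not in_other:
            private.append(snp)
    return private

def mean_private_aaf(aaf_by_snp, snps, sample):
    """Mean AAFpool of a set of private SNPs in one sample."""
    return sum(aaf_by_snp[s][sample] for s in snps) / len(snps)

# Toy AAFpool values: two monocultures and one mixed plot in two years.
aaf_by_snp = {
    "snp1": {"merks": 0.30, "meloni": 0.00, "mix_y1": 0.20, "mix_y2": 0.08},
    "snp2": {"merks": 0.00, "meloni": 0.40, "mix_y1": 0.10, "mix_y2": 0.30},
    "snp3": {"merks": 0.25, "meloni": 0.35, "mix_y1": 0.30, "mix_y2": 0.30},
    "snp4": {"merks": 0.20, "meloni": 0.00, "mix_y1": 0.12, "mix_y2": 0.06},
}
merks_private = private_snps(aaf_by_snp, ["merks"], ["meloni"])
meloni_private = private_snps(aaf_by_snp, ["meloni"], ["merks"])
merks_y1 = mean_private_aaf(aaf_by_snp, merks_private, "mix_y1")
merks_y2 = mean_private_aaf(aaf_by_snp, merks_private, "mix_y2")
```

In this toy case the mean frequency of the Merks private alleles drops between the two mixed-plot samples, which would be read as a decline in the relative abundance of Merks.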

Variance decomposition of AAFpool frequency data and identification of outlier loci

The density distributions of the total variance (TSS), the variance captured by each predictor component (MSS) and the residual variance (RSS) are shown in Fig 5. The largest part of the variance was explained by PC1 (median MSS PC1 = 0.4354), which agrees with the results of the PCA (Fig 3, screeplot). Variance due to overall differences among seed compositions (Si), to differences between replications (Si x Rj), or to temporal changes in AAFpool (Si x Yk) was small and of the same order of magnitude as the residual variance (median MSS of Si = 0.0043; median MSS of (Si x Rj) = 0.0033; median MSS of (Si x Yk) = 0.0035; median RSS = 0.0032). The quotient of the MSS of (Si x Yk) and the RSS follows a log-normal distribution (Fig 6A). The p-values estimated with the log-normal probability density function based on the observed mean and standard deviation are uniformly distributed (Fig 6B). Therefore, no outlier SNPs or loci putatively under selection were identified.

Fig 5. Linear regression of AAFpool based on 22,324 SNPs.

Density distributions of the total variance (TSS) (A) and the (scaled) variances (MSS) captured by each component (B-F) of model 5 (see Material and Methods) are shown. The TSS, the MSS of seed composition Si, its interactions with Rj and Yk, and the RSS follow lognormal distributions. MSS of component PC1 follows a chi-square distribution with one degree of freedom. The green lines represent the predicted probability density functions. The blue dotted lines represent the observed median. The green line represents the expected median of the chi-square distribution.

https://doi.org/10.1371/journal.pone.0206571.g005

Fig 6. SNP outlier analysis.

SNP outlier analysis based on AAFpool to identify loci putatively under selection. Density distribution of the test statistic, i.e. the quotient of MSS of (Si x Yk) and RSS (A), and distribution of the corresponding p-values (B). The blue dotted line indicates the observed median. The green line shows the probability density function of the expected lognormal distribution based on the mean and standard deviation of the test statistic.

https://doi.org/10.1371/journal.pone.0206571.g006

Discussion

Effect of the seed composition on establishment, herbage yield and species abundance

The herbage yield differed among seed compositions throughout the four years of cultivation, including the first year of establishment. This suggests that plant-plant interactions affected biomass production from the establishment of the field and onwards. Fluctuations in yield and species abundance were consistent between field plot replicates. The total DMW peaked during the second year, and competition might have been the most intense during this period.

Red clover often lacks persistence, but in this field trial, the proportion of clover to grass did not decrease, confirming the viability of red clover as a companion for perennial ryegrass [47, 48]. As expected, the presence of clover had a significant positive effect on the total DMW across field plots in which it was combined with perennial ryegrass, despite the fact that plots in which red clover was combined with perennial ryegrass received less N-fertilizer than those in which only ryegrass was sown. Synergy between grasses and legumes is commonly attributed to the nitrogen-fixing capacity of the companion legume [11, 12]. However, other types of species interactions drive grassland dynamics. First, structural complementarity between neighboring ryegrass and clover plants can improve the sward canopy structure (i.e. structural niche differentiation between clover and grass). Secondly, differences in relative growth rates of the component species can affect sward dynamics [49], and biomass production can benefit from asynchrony in growth during the season [50, 51].

It is expected that these functional interactions are affected by the intraspecific diversity of the component species [4]. In this experiment, the intraspecific diversity of the clover component significantly affected total DMW in field plots where both perennial ryegrass cultivars were sown. The erect growing habit of Lemmon is better suited for cultivation under mowing conditions [25], and resulted in a higher clover DMW. In contrast, the ryegrass DMW was the highest in combination with Crossway. This indicates that the functional traits of the clover cultivar affected the competitive interactions between both species. The functional diversity of the ryegrass component also affected herbage yield, and the ryegrass component significantly affected total DMW of field plots where a single perennial ryegrass cultivar was sown. The total DMW was consistently higher for Meloni compared to Merks, both in monoculture and in combination with clover. The interaction of this effect and the effect of the presence of clover was not significant, suggesting that the functional diversity of the ryegrass component did not affect the beneficial interaction between both species. Considering the field plots where two red clover cultivars were sown in combination with one or two perennial ryegrass cultivars, the ryegrass component also significantly affected total DMW. Remarkably, the effect of the ryegrass component was significant for the clover DMW and not significant for the grass DMW. Taken together, these results indicate that the functional diversity of perennial ryegrass affects the growth of red clover, with Meloni being better suited for cultivation in combination with red clover. The intraspecific diversity of both species affects functional interactions within grassland swards.

Validation of the pool-GBS approach

Single nucleotide polymorphisms (SNPs) have proven to be effective to characterize the genetic diversity among plant populations. Using whole genome shotgun sequencing (pool-seq), allele frequencies can be estimated directly from pools of plants [52–54]. Yet this protocol is still costly for sequencing many samples, especially for species with large genomes. Complexity reduction methods such as GBS target a fraction of the genome associated with restriction sites, thereby reducing the sequencing cost per sample [29, 55, 56]. The combination of pool-seq and GBS (pool-GBS) is a promising strategy for assessing genetic diversity among populations on a genome-wide scale [57]. This approach has previously been applied to perennial ryegrass [26, 27], barley [58, 59], multiple species of alpine shrubs [60], herring [61], and cyst nematodes [62]. To the best of our knowledge, this approach has not been used before for the characterization of temporal changes in the genetic composition of plant populations. Therefore, we considered the potential limitations of this methodological approach. The pool-GBS procedure was optimized towards three main criteria: read data completeness, accuracy of measurements of allele frequencies and removal of non-reproducible SNPs.

A central goal of this study was to compare quantitative estimates of genetic diversity across populations. Estimation of the diversity index mean He strongly depends on the set of genomic loci under consideration. For comparison of samples, mean He is preferentially calculated for a common set of loci. High proportions of missing data have been reported for GBS data [29, 63]. Therefore, we investigated the read data distribution of the pool-GBS procedure in a validation experiment, and determined a suitable amount of read data per sample to minimize missing data. For the 56 pooled samples, we obtained consistent sampling of ca. 0.1% of the perennial ryegrass reference genome with a very low level of missing data across populations. Creating two independent tissue pools per population and merging the read data of such replicated libraries further increased the number of consistently detected loci.

As AAFpool is directly estimated from the read counts at a certain locus, genotyping errors can be confounded with low frequency alleles [42, 53]. Moreover, the accuracy of the allele frequency measurement is affected by the number of plants per pool [57, 64, 65]. While Byrne [26] and Ashraf [27] sampled several hundreds of perennial ryegrass seedlings to represent the population genetic parameters, our representation of the genetic diversity of a field plot relied on sampling of leaf material from 40 randomly selected individual plants per field plot. Therefore, we empirically validated the accuracy of allele frequencies obtained with pool-GBS. AAFpool estimated in three replicate pools correlated very well with the AAFind estimated from individual GBS genotyping. However, pool-GBS was sensitive to genotyping errors, and stringent filtering of the SNP loci was required to identify a reliable set of 22,324 high quality SNPs across the 56 population samples.

Genetic characterization and temporal dynamics of the ryegrass component

The perennial ryegrass cultivars could be differentiated with PCA based on the AAFpool of 22,324 SNPs, as in Byrne [26]. The genetic differentiation between Merks and Meloni represented the majority of the variance in the AAFpool dataset. Perennial ryegrass cultivars are genetically very diverse [16–18], and some genotypes within a sward might be better adapted to the prevailing conditions. Moreover, these conditions change during the cultivation period, e.g. changes in the biomass production and competitive pressure within the sward, and variability of the environmental conditions. Therefore, we expected changes in the genetic composition of the ryegrass component. The largest changes in the genetic composition of the ryegrass populations were observed in field plots containing both ryegrass cultivars. The representation of the cultivars in the plots where they were sown together changed throughout the cultivation period. This illustrates the dynamic behavior of the ryegrass component in mixed cultivar field plots. If a cultivar is less prominent at a certain point in time, it is not necessarily removed from the population. Changes in cultivar abundance might be related to their functional characteristics, similar to changes in species abundance [66]. Similar dynamics have been described between grass species by Brophy [49]. They observed that species dynamics are primarily driven by relative growth rates, and secondarily by density dependent and climatic factors, and concluded that the species with the highest relative growth rates became dominant over time. In this study, however, we did not observe a clear dominance of either one of the species or one of the cultivars. In field plots containing a single perennial ryegrass cultivar, we did not detect pronounced genetic fluxes. The genetic diversity of Merks populations remained stable across the four years, while the genetic diversity of the Meloni populations showed a slight increase. This may be related to the decrease of grass biomass production towards the third and the fourth year of cultivation. It is possible that a subset of Meloni genotypes that was more dominant during the first and second year became less dominant in subsequent years, increasing the chance of sampling a more diverse set of genotypes.

Detection of loci putatively under selection

We developed a statistical test to identify loci whose allele frequencies change significantly over time. However, we did not identify convincing signatures of selection in the perennial ryegrass populations investigated. Possibly, selection pressures were too low to have any detectable effect on the AAFpool data. An alternative explanation for not detecting outlier loci is related to linkage disequilibrium (LD) patterns in these populations. With the current spacing of GBS tags across the genome (approx. 1 stack of 100 bp per 100–200 kbp), the genetic markers observed may not be in genetic linkage with loci under selection, if LD extends only for short distances [67]. In this situation, genome complexity reduction approaches such as GBS are likely to miss the majority of outliers that might be present [68].

Conclusions

The interactions between neighboring plants play an important role in the functioning of cultivated grasslands, and are determined by the functional diversity of the sown species and cultivars. This study showed that herbage yield and species abundance of field plots are affected by the mixture of perennial ryegrass and red clover cultivars used. Analysis of genetic diversity among ryegrass populations, based on GBS genotyping of pooled samples, showed that the abundance of perennial ryegrass cultivars is highly dynamic in mixtures. Taken together, these results illustrate that phenotypic traits of both perennial ryegrass and red clover affect their behavior in seed mixtures of both species and that the dominance of cultivars in mixtures can shift throughout the cultivation period.

Supporting information

S1 Table. Fertilization of the field plots.

https://doi.org/10.1371/journal.pone.0206571.s001

(XLSX)

S1 Fig. Saturation curves.

Saturation curves show the relationship between the number of reads per sample (x-axis) and the number of base positions of the perennial ryegrass reference genome (Byrne et al., 2015) that are covered (y-axis) at various minimum RD thresholds: RD 10 (blue), RD 30 (red), RD 100 (orange) and RD 300 (green). A shows the data of individual plants (between 0 and 4 M reads), data of three replicate pools (between 12 and 14 M reads) and pairwise merged data of pools (between 26 and 30 M reads) of the validation experiment. The line curves were constructed by resampling reads of the validation experiment. These curves suggest that the larger part of potentially available GBS loci is covered if at least ~20 M reads are obtained per sample. B shows the data for the 56 population samples of the field experiment. Data of the technical replicates (between 0 and 14 M reads) were merged for each population sample (between 20 and 54 M reads) (see pooling and replication scheme, Fig 1).

https://doi.org/10.1371/journal.pone.0206571.s003

(PDF)

S2 Fig. Allele frequency correlations.

Allele frequency correlations of SNPs that were identified in a set of 40 plants (AAFind) and three pool replicates of the same set (AAFpool), showing the number of SNPs (n), the median deviation (d), Pearson’s correlation coefficient (r) and the least squares regression (red). A: The effect of SNP filtering on the correlation of AAFind and AAFpool. The SNPs of the individuals were filtered on maximum missing data (MD) 1, 5 or 10 out of 40 samples, and the SNPs of the pool were filtered on minimum RD of 30, 100 or 300. For subsequent comparisons we consistently used the thresholds RD 30 and MD 5. B: Correlation of AAFpool of the three pool replicates to AAFind. C: Pairwise correlations of AAFpool of the three pool replicates. Only SNPs that were also detected in the individuals were considered for this comparison. D: Correlation of AAFpool obtained by pairwise merging of pool replicates to AAFind.

https://doi.org/10.1371/journal.pone.0206571.s004

(PDF)

S3 Fig. Venn diagrams of overlapping SNPs.

Venn diagrams showing the number of SNPs that were identified in individual samples, pooled samples, or both. The comparisons of SNP datasets in A–D follow the same order as the allele frequency correlations (S2 Fig). E: Venn diagram of heterozygous loci across three replicates of one individual plant, and three-way comparison of the three pool replicates.

https://doi.org/10.1371/journal.pone.0206571.s005

(PDF)

S4 Fig. Pairwise comparisons of replicate pools.

Pairwise comparisons of AAFpool distributions of replicate pools. Each distribution shows the AAFpool distribution of a pooled sample, colors indicate whether the SNP was detected in the corresponding replicate (blue for detected, red for not detected and black for no data available). Non-reproducible SNPs are strongly skewed towards low AAFpool values.

https://doi.org/10.1371/journal.pone.0206571.s006

(PDF)

Acknowledgments

Sincere thanks to Alex De Vliegher for expert supervision of the field trial, and all the people who have provided technical assistance in the field and in the laboratory (K. Succaet, T. Vanderstocken, L. Van Gyseghem, N. Mergan, C. Pardon, A. Staelens).

References

  1. Hughes AR, Inouye BD, Johnson MT, Underwood N, Vellend M. Ecological consequences of genetic diversity. Ecol Lett. 2008;11(6):609–623. pmid:18400018
  2. Albert CH, Grassein F, Schurr FM, Vieilledent G, Violle C. When and how should intraspecific variability be considered in trait-based plant ecology? Perspectives in Plant Ecology, Evolution and Systematics. 2011;13(3):217–225. https://doi.org/10.1016/j.ppees.2011.04.003.
  3. Violle C, Enquist BJ, McGill BJ, Jiang L, Albert CH, Hulshof C, et al. The return of the variance: intraspecific variability in community ecology. Trends Ecol Evol. 2012;27(4):244–252. pmid:22244797
  4. Hart SP, Schreiber SJ, Levine JM. How variation between individuals affects species coexistence. Ecol Lett. 2016;19(8):825–838. pmid:27250037
  5. Ojima DS, Dirks BO, Glenn EP, Owensby CE, Scurlock JO. Assessment of C budget for grasslands and drylands of the world. Water, Air, & Soil Pollution. 1993;70(1):95–109. https://doi.org/10.1007/BF01104990.
  6. Reheul D, De Cauwer B, Cougnon M. The role of forage crops in multifunctional agriculture. Fodder crops and amenity grasses: Springer; 2010. p. 1–12. https://doi.org/10.1007/978-1-4419-0760-8_1.
  7. Cardinale BJ, Duffy JE, Gonzalez A, Hooper DU, Perrings C, Venail P, et al. Biodiversity loss and its impact on humanity. Nature. 2012;486(7401):59–67. pmid:22678280
  8. Euroseeds. Factsheet. 2016.
  9. Humphreys M, Feuerstein U, Vandewalle M, Baert J. Ryegrasses. Fodder crops and amenity grasses: Springer; 2010. p. 211–260. https://doi.org/10.1007/978-1-4419-0760-8_10.
  10. Luscher A, Mueller-Harvey I, Soussana JF, Rees RM, Peyraud JL. Potential of legume-based grassland-livestock systems in Europe: a review. Grass Forage Sci. 2014;69(2):206–228. pmid:26300574
  11. Suter M, Connolly J, Finn JA, Loges R, Kirwan L, Sebastia MT, et al. Nitrogen yield advantage from grass-legume mixtures is robust over a wide range of legume proportions and environmental conditions. Glob Chang Biol. 2015;21(6):2424–2438. pmid:25626994
  12. Finn JA, Kirwan L, Connolly J, Sebastià MT, Helgadottir A, Baadshaug OH, et al. Ecosystem function enhanced by combining four functional types of plant species in intensively managed grassland mixtures: a 3-year continental-scale field experiment. Journal of Applied Ecology. 2013;50(2):365–375. https://doi.org/10.1111/1365-2664.12041.
  13. van Rooijen NM, de Keersmaecker W, Ozinga WA, Coppin P, Hennekens SM, Schaminée JHJ, et al. Plant Species Diversity Mediates Ecosystem Stability of Natural Dune Grasslands in Response to Drought. Ecosystems. 2015;18(8):1383–1394. https://doi.org/10.1007/s10021-015-9905-6.
  14. De Keersmaecker W, van Rooijen N, Lhermitte S, Tits L, Schaminée J, Coppin P, et al. Species-rich semi-natural grasslands have a higher resistance but a lower resilience than intensively managed agricultural grasslands in response to climate anomalies. Journal of Applied Ecology. 2016;53(2):430–439. https://doi.org/10.1111/1365-2664.12595.
  15. Lloret F, Escudero A, Iriondo JM, Martínez-Vilalta J, Valladares F. Extreme climatic events and vegetation: the role of stabilizing processes. Global Change Biology. 2012;18(3):797–805. https://doi.org/10.1111/j.1365-2486.2011.02624.x.
  16. Bolaric S, Barth S, Melchinger A, Posselt U. Genetic diversity in European perennial ryegrass cultivars investigated with RAPD markers. Plant breeding. 2005;124(2):161–166. https://doi.org/10.1111/j.1439-0523.2005.01108.x.
  17. Roldán-Ruiz I, Dendauw J, Van Bockstaele E, Depicker A, De Loose M. AFLP markers reveal high polymorphic rates in ryegrasses (Lolium spp.). Molecular Breeding. 2000;6(2):125–134. https://doi.org/10.1023/A:1009680614564.
  18. Roldán-Ruiz I, Van Euwijk F, Gilliland T, Dubreuil P, Dillmann C, Lallemand J, et al. A comparative study of molecular and morphological methods of describing relationships between perennial ryegrass (Lolium perenne L.) varieties. Theoretical and Applied Genetics. 2001;103(8):1138–1150. https://doi.org/10.1007/s001220100571.
  19. 19. Byrne SL, Nagy I, Pfeifer M, Armstead I, Swain S, Studer B, et al. A synteny-based draft genome sequence of the forage grass Lolium perenne. Plant J. 2015;84(4):816–826. pmid:26408275
  20. 20. Manzanares C, Barth S, Thorogood D, Byrne SL, Yates S, Czaban A, et al. A Gene Encoding a DUF247 Domain Protein Cosegregates with the S Self-Incompatibility Locus in Perennial Ryegrass. Mol Biol Evol. 2016;33(4):870–884. pmid:26659250
  21. 21. Thorogood D, Yates S, Manzanares C, Skot L, Hegarty M, Blackmore T, et al. A Novel Multivariate Approach to Phenotyping and Association Mapping of Multi-Locus Gametophytic Self-Incompatibility Reveals S, Z, and Other Loci in a Perennial Ryegrass (Poaceae) Population. Front Plant Sci. 2017;8:1331. pmid:28824669
  22. 22. Boller B, Grieder C, Schubiger F. Performance of diploid and tetraploid perennial ryegrass synthetics with variable numbers of parents. Breeding in a World of Scarcity: Springer; 2016. p. 15–19. https://doi.org/https://doi.org/10.1007/978-3-319-28932-8_2.
  23. 23. Lee JM, Thom ER, Wynn K, Waugh D, Rossi L, Chapman DF. High perennial ryegrass seeding rates reduce plant size and survival during the first year after sowing: does this have implications for pasture sward persistence? Grass and Forage Science. 2017;72(3):382–400. https://doi.org/10.1111/gfs.12243.
  24. 24. Saracutu O, Cnops G, Roldán-Ruiz I, Rohde A. Phenotypic assessment of variability in tillering and early development in ryegrass (Lolium spp.). Sustainable use of genetic diversity in forage and turf breeding: Springer; 2010. p. 155–160. https://doi.org/10.1007/978-90-481-8706-5_22.
  25. 25. Van Minnebruggen A, Roldán-Ruiz I, Van Bockstaele E, Haesaert G, Cnops G. The relationship between architectural characteristics and regrowth in Trifolium pratense (red clover). Grass and Forage Science. 2015;70(3):507–518. https://doi.org/10.1111/gfs.12138.
  26. 26. Byrne SL, Czaban A, Studer B, Panitz F, Bendixen C, Asp T. Genome wide allele frequency fingerprints (GWAFFs) of populations via genotyping by sequencing. PLoS One. 2013;8(3):e57438. pmid:23469194
  27. 27. Ashraf BH, Byrne S, Fe D, Czaban A, Asp T, Pedersen MG, et al. Estimating genomic heritabilities at the level of family-pool samples of perennial ryegrass using genotyping-by-sequencing. Theor Appl Genet. 2016;129(1):45–52. pmid:26407618
  28. 28. Doyle JJ, Doyle JL, Hortoriun LB. Isolation of plant DNA from fresh tissue. Focus. 1990;12:13–15.
  29. 29. Elshire RJ, Glaubitz JC, Sun Q, Poland JA, Kawamoto K, Buckler ES, et al. A robust, simple genotyping-by-sequencing (GBS) approach for high diversity species. PLoS One. 2011;6(5):e19379. pmid:21573248
  30. 30. Rohland N, Reich D. Cost-effective, high-throughput DNA sequencing libraries for multiplexed target capture. Genome Res. 2012;22(5):939–946. pmid:22267522
  31. 31. Herten K, Hestand MS, Vermeesch JR, Van Houdt JK. GBSX: a toolkit for experimental design and demultiplexing genotyping by sequencing experiments. BMC Bioinformatics. 2015;16:73. pmid:25887893
  32. 32. Andrews S. FastQC: a quality control tool for high throughput sequence data 2010 [Available from: https://www.bioinformatics.babraham.ac.uk/projects/fastqc/.
  33. 33. Ewels P, Magnusson M, Lundin S, Käller M. MultiQC: summarize analysis results for multiple tools and samples in a single report. Bioinformatics. 2016;32(19):3047–3048. pmid:27312411
  34. 34. Schmieder R, Edwards R. Quality control and preprocessing of metagenomic datasets. Bioinformatics. 2011;27(6):863–864. pmid:21278185
  35. 35. Martin M. Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet journal. 2011;17(1):pp. 10–12. https://doi.org/10.14806/ej.17.1.200.
  36. 36. Gordon A, Hannon G. Fastx-toolkit. FASTQ/A short-reads preprocessing tools (unpublished). 2010.
  37. 37. Li H, Durbin R. Fast and accurate short read alignment with Burrows–Wheeler transform. Bioinformatics. 2009;25(14):1754–1760. pmid:19451168
  38. 38. Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, et al. The sequence alignment/map format and SAMtools. Bioinformatics. 2009;25(16):2078–2079. pmid:19505943
  39. 39. Tange O. Gnu parallel-the command-line power tool. The USENIX Magazine. 2011;36(1):42–47. https://doi.org/10.5281/zenodo.16303.
  40. 40. McKenna A, Hanna M, Banks E, Sivachenko A, Cibulskis K, Kernytsky A, et al. The Genome Analysis Toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data. Genome research. 2010;20(9):1297–1303. pmid:20644199
  41. 41. Danecek P, Auton A, Abecasis G, Albers CA, Banks E, DePristo MA, et al. The variant call format and VCFtools. Bioinformatics. 2011;27(15):2156–2158. pmid:21653522
  42. 42. Raineri E, Ferretti L, Esteve-Codina A, Nevado B, Heath S, Pérez-Enciso M. SNP calling by sequencing pooled samples. BMC bioinformatics. 2012;13(1):239. https://doi.org/10.1186/1471-2105-13-239.
  43. 43. Ferretti L, Ramos-Onsins SE, Perez-Enciso M. Population genomics from pool sequencing. Mol Ecol. 2013;22(22):5561–5576. pmid:24102736
  44. 44. R Core Team. R: A language and environment for statistical computing [Internet]. Vienna, Austria; 2014. 2017.
  45. 45. Bates D, Maechler M, Bolker B, Walker S, Christensen R, Singmann H, et al. lme4: linear mixed-effects models using Eigen and S4. 2013. https://doi.org/10.18637/jss.v067.i01.
  46. 46. Pickrell JK, Pritchard JK. Inference of population splits and mixtures from genome-wide allele frequency data. PLoS Genet. 2012;8(11):e1002967. pmid:23166502
  47. 47. Clavin D, Crosson P, Grant J, O'Kiely P. Red clover for silage: management impacts on herbage yield, nutritive value, ensilability and persistence, and relativity to perennial ryegrass. Grass and Forage Science. 2017;72(3):414–431. https://doi.org/10.1111/gfs.12249.
  48. 48. Phelan P, Moloney A, McGeough E, Humphreys J, Bertilsson J, O’Riordan E, et al. Forage legumes for grazing and conserving in ruminant production systems. Critical Reviews in Plant Sciences. 2015;34(1–3):281–326. https://doi.org/.1080/07352689.2014.898455.
  49. 49. Brophy C, Finn JA, Lüscher A, Suter M, Kirwan L, Sebastià M-T, et al. Major shifts in species’ relative abundance in grassland mixtures alongside positive effects of species diversity in yield: a continental-scale experiment. Journal of Ecology. 2017;105(5):1210–1222. https://doi.org/10.1111/1365-2745.12754.
  50. 50. Eriksen J, Askegaard M, Søegaard K. Complementary effects of red clover inclusion in ryegrass-white clover swards for grazing and cutting. Grass and Forage Science. 2014;69(2):241–250. https://doi.org/10.1111/gfs.12025.
  51. 51. Husse S, Huguenin-Elie O, Buchmann N, Lüscher A. Larger yields of mixtures than monocultures of cultivated grassland species match with asynchrony in shoot growth among species but not with increased light interception. Field Crops Research. 2016;194:1–11. https://doi.org/10.1016/j.fcr.2016.04.021.
  52. 52. Futschik A, Schlötterer C. The next generation of molecular markers from massively parallel sequencing of pooled DNA samples. Genetics. 2010;186(1):207–218. pmid:20457880
  53. 53. Schlötterer C, Tobler R, Kofler R, Nolte V. Sequencing pools of individuals—mining genome-wide polymorphism data without big funding. Nat Rev Genet. 2014;15(11):749–763. pmid:25246196
  54. 54. Rellstab C, Zoller S, Tedder A, Gugerli F, Fischer MC. Validation of SNP allele frequencies determined by pooled next-generation sequencing in natural populations of a non-model plant species. PLoS One. 2013;8(11):e80422. pmid:24244686
  55. 55. Davey JW, Hohenlohe PA, Etter PD, Boone JQ, Catchen JM, Blaxter ML. Genome-wide genetic marker discovery and genotyping using next-generation sequencing. Nat Rev Genet. 2011;12(7):499–510. pmid:21681211
  56. 56. Poland JA, Rife TW. Genotyping-by-Sequencing for Plant Breeding and Genetics. The Plant Genome Journal. 2012;5(3). https://doi.org/10.3835/plantgenome2012.05.0005.
  57. 57. Gautier M, Foucaud J, Gharbi K, Cezard T, Galan M, Loiseau A, et al. Estimation of population allele frequencies from next-generation sequencing data: pool-versus individual-based genotyping. Mol Ecol. 2013;22(14):3766–3779. pmid:23730833
  58. 58. Bélanger S, Clermont I, Esteves P, Belzile F. Extent and overlap of segregation distortion regions in 12 barley crosses determined via a Pool-GBS approach. Theor Appl Genet. 2016;129(7):1393–1404. pmid:27062517
  59. 59. Bélanger S, Esteves P, Clermont I, Jean M, Belzile F. Genotyping-by-Sequencing on Pooled Samples and its Use in Measuring Segregation Bias during the Course of Androgenesis in Barley. Plant Genome. 2016;9(1). https://doi.org/10.3835/plantgenome2014.10.0073.
  60. 60. Bell N, Griffin PC, Hoffmann AA, Miller AD. Spatial patterns of genetic diversity among Australian alpine flora communities revealed by comparative phylogenomics. Journal of Biogeography. 2017. https://doi.org/10.1111/jbi.13120.
  61. 61. Corander J, Majander KK, Cheng L, Merilä J. High degree of cryptic population differentiation in the Baltic Sea herring Clupea harengus. Molecular ecology. 2013;22(11):2931–2940. pmid:23294045
  62. 62. Mimee B, Duceppe MO, Veronneau PY, Lafond-Lapalme J, Jean M, Belzile F, et al. A new method for studying population genetics of cyst nematodes based on Pool-Seq and genomewide allele frequency analysis. Mol Ecol Resour. 2015;15(6):1356–1365. pmid:25846829
  63. 63. Fu YB. Genetic diversity analysis of highly incomplete SNP genotype data with imputations: an empirical assessment. G3 (Bethesda). 2014;4(5):891–900. https://doi.org/10.1534/g3.114.010942.
  64. 64. Anderson EC, Skaug HJ, Barshis DJ. Next‐generation sequencing for molecular ecology: a caveat regarding pooled samples. Molecular ecology. 2014;23(3):502–512. pmid:24304095
  65. 65. Fracassetti M, Griffin PC, Willi Y. Validation of Pooled Whole-Genome Re-Sequencing in Arabidopsis lyrata. PLoS One. 2015;10(10):e0140462. pmid:26461136
  66. 66. Barot S, Allard V, Cantarel A, Enjalbert J, Gauffreteau A, Goldringer I, et al. Designing mixtures of varieties for multifunctional agriculture with the help of ecology. A review. Agronomy for Sustainable Development. 2017;37(2). https://doi.org/10.1007/s13593-017-0418-x.
  67. 67. Auzanneau J, Huyghe C, Julier B, Barre P. Linkage disequilibrium in synthetic varieties of perennial ryegrass. Theoretical and Applied Genetics. 2007;115(6):837–847. pmid:17701396
  68. 68. Hoban S, Kelley JL, Lotterhos KE, Antolin MF, Bradburd G, Lowry DB, et al. Finding the Genomic Basis of Local Adaptation: Pitfalls, Practical Solutions, and Future Directions. Am Nat. 2016;188(4):379–397. pmid:27622873