Using a Control to Better Understand Phyllosphere Microbiota

An important data gap in our understanding of the phyllosphere surrounds the origin of the many microbes described as phyllosphere communities. Most sampling in phyllosphere research has focused on the collection of microbiota without the use of a control, so the opportunity to determine which taxa are actually driven by the biology and physiology of plants as opposed to introduced by environmental forces has yet to be fully realized. To address this data gap, we used plastic plants as inanimate controls adjacent to live tomato plants (phyllosphere) in the field with the hope of distinguishing between bacterial microbiota that may be endemic to plants as opposed to introduced by environmental forces. Using 16S rRNA gene amplicons to study bacterial membership at four time points, we found that the vast majority of all species-level operational taxonomic units were shared at all time-points. Very few taxa were unique to phyllosphere samples. A higher taxonomic diversity was consistently observed in the control samples. The high level of shared taxonomy suggests that environmental forces likely play a very important role in the introduction of microbes to plant surfaces. The observation that very few taxa were unique to the plants compared to the number that were unique to controls was surprising and further suggests that a subset of environmentally introduced taxa thrive on plants. This finding has important implications for improving our approach to the description of core phytobiomes as well as potentially helping us better understand how foodborne pathogens may become associated with plant surfaces.


Introduction
Culture independent phyllosphere research has greatly expanded our understanding of the diversity of microbes associated with plant surfaces [1][2][3][4][5][6][7][8][9][10][11][12][13]. Food safety initiatives have played a small but important role in the advancement of culture independent phytobiome research. The microbiology of living surfaces of fresh produce has clear implications for public health and food safety [14]. The description of agro-ecologies (beginning with crop phytobiomes) along the farm to fork continuum has begun to establish microbial baselines that will contribute to an improved understanding of precisely where and how human pathogens may become associated with food plant ecologies in agricultural settings. At least nineteen Salmonellatomato associated outbreaks occurred between the years of 1990 and 2014, causing thousands of illnesses (FDA internal document) [7]. Understanding how contamination events occur is extremely important and thus, the tomato microbiome has become an important study system for food safety research.
Agricultural management practices for food crops have been studied to better understand the role they may play in shaping phyllosphere microbiota. For example, the impact on phyllosphere microbiota by different irrigation waters has been studied [15,16] as well as the impact of different pesticide schedules [17][18][19] and even organic and conventional management [8]. Seasonality and biogeography have also been contrasted to farming systems [20][21][22]. An interesting trend that was observed in many of these studies was the lack of statistically significant differences in plant microbial communities that correlated to the treatment queried. For example, Telias et al. studied water sources used in agriculture (ground water compared to surface pond water) and showed that the microbial communities of water sources were highly divergent, but the communities collected from tomatoes treated with the different waters did not exhibit those same differences [15,16] This suggests that environmental pressures (potentially air) in the phyllosphere exerted stronger pressures than either water source did. Work by Perazzolli et al. supported this observation by demonstrating that different pesticides used on the same crop had less influence on phyllosphere crop microbiota than biogeographical factors did [22]. They found that epiphytic microbes associated with grape vines were not significantly altered by different pesticides treatments (bio-control and traditional pesticide). Instead, the primary driver of microbial differences appeared to be biogeography-which suggests that environmental parameters such as wind and air associated with each region likely influenced the consortia of microbes found on the grape plant surfaces.
Furthermore, work that examined the influence of two very different pesticides on tomato crop microflora, found that the most striking differences in microbial composition were associated with the sampling time-points and not with the pesticide treatments [18] suggesting once again, that environmental parameters, largely unaccounted for to date, may be the most influential drivers of the microbiology of the phyllosphere. Marine et al. demonstrated that for leafy greens grown in the mid-Atlantic region, seasonal events and weather conditions, as opposed to farming systems, were the most important risk determinants for crop contamination by human pathogens [20]. The overall importance of seasonality on phyllosphere community structure and membership has been demonstrated by both culture dependent and culture independent research studies [23][24][25][26].
While these studies all suggest that environmental factors may play the most significant roles in microbiologically seeding the phyllosphere, there is also evidence that host plant species play significant roles in the shaping of phyllosphere microbial ecology and succession [27][28][29][30].
While there are undoubtedly important drivers from both host plants and environmental factors, the fact remains that almost every phyllosphere study in the literature to date has sampled microbes from surfaces of plants and described these consortia as phyllosphere microbiota without any type of control-such as an inanimate surface placed at a similar elevation to the plant part sampled. Thus we have an opportunity to learn more about drivers of phyllosphere microbiota by employing an inanimate control to help us better understand differences between environmentally introduced microbial species and endemic or host plant mediated microbiota.
To attempt to distinguish between host plant mediated and environmentally introduced microbiota, we interspersed sterilized plastic plants in a row of live tomato plants and sampled each surface type at four time-points throughout a growing season. DNA was extracted and used with 16S rRNA amplicon sequencing to describe bacterial membership for each sample type.

Shared and Unique Bacterial Communities
A total of eight independent replicates for each treatment (n = 2) and time-point (n = 4) were used to compare the bacterial composition associated with each surface type (i.e. control (plastic) and phyllosphere). 16S rRNA amplicon sequences were filtered for quality and clustered into operational taxonomic units (OTUs) using the QIIME package (see methods). To normalize for differences in sequencing depth, all replicates were subsampled to 2,500 sequences prior to downstream statistical comparisons. Excluding low abundance taxa-(less than 0.5% of libraries), we observed very high percentages of shared OTUs. Shared bacterial taxa ranged from 92.59% to 100% for all time-points (Table 1). Only 2 unique OTUs were observed in control samples on August 15 th 2013, and 2 OTUs unique to phyllosphere were observed on June 30 th 2014 and July 31 st 2014 (Table 1). When Time-point 0 was examined independently at a deeper level of sequencing depth (16,074 sequences per sample), the percentage of shared taxa remained high at 94.7%. With the inclusion of low abundance OTUs (all those that occurred in less than 0.5% of the data), an interesting trend was observed. For almost every time pointthere was a greater diversity of low abundance OTUs present in control samples when compared to phyllosphere samples ( Table 2). This implies that the air is host to a greater diversity of microbes than can be found on the surfaces of plants and that a subset of bacterial members thrive in the phyllosphere. This trend was even more pronounced in Time-point 0, when it was examined independently. Without rarefying to 2500 but instead maintaining all 16,074 sequences recovered for each replicate and including the low abundance OTUs-3,249 sequences were uniquely associated with controls (representing 198 OTUs) while only 63 sequences (representing 36 OTUs) were uniquely associated with phyllosphere samples (live tomato plants) (Fig 1). The most abundant phyllosphere unique taxa were Rubrobacter, Acidovorax, Peptoniphilus, Porphyromonas and undefined members of Acidimicrobiaceae, Nitrospiraceae families and the phylum Chloroflexi. Dominant unique taxa for controls were Turicibacter, Vagococcus, Bacteriodes, Wohlfahrtimonas, Prevotella, and undetermined members of Lachnospiraceae, Ruminococcaceae and Veillonellaceae families (Fig 1). Profiles of dominant bacterial families observed in our rarefied datasets for all time-points and treatments, including the store microbiota are shown by independent replicate in S1 Fig. Fig 2 shows the most abundant families that were observed in 16S rRNA gene libraries for merged independent replicates of phyllosphere (P) and control (C) samples. Despite the dominance of Enterobacteriaceae in phyllosphere and control samples, high-resolution taxonomic analysis using the Resphera Insight protocol described Pantoea, Erwinia and Serratia species, but found no evidence of Salmonella across the sample set in either phyllosphere or control samples. Little separation by treatment (control and phyllosphere) was evident using nonmetric Multi Dimensional Scaling (nMDS) ordinations to look at Bray Curtis dissimilarity of bacterial communities (Fig 3). Microbiota that was washed off plastic plants pre-surface sterilization however, was clearly different from microbiota associated with plastic and live plants from the field environment. The washing step was performed to ensure that microbiota from the store environment where plants were purchased was not erroneously described as part of the environmentally driven consortia. Fig 3A shows nMDS ordination of 16S libraries for all environments: store, control (plastic plants) and phyllo (phyllosphere, live tomato plants). Fig 3B shows an nMDS ordination of the same data separated by time point. To further test the global null hypothesis of independence for covariate variables (store, phyllo, enviro, and time-points: T0, T1, T2, T3) and associated response variables, a conditional inference regression tree was modeled onto the data. Nodes were generated by performing a binary split on all covariates (where the null hypothesis could not be rejected) using a p value set at 0.1 as a cutoff (Fig 4) [31]. For ordination component MDS2, the time at which the samples were collected determines their position in the ordination ( Fig 4A). However, for component MDS1, whether or not the plant was real (phyllo) or plastic (control) does not play a significant role in its location on the ordination chart. Nonparametric significance tests comparing the MDS1 component for time point 0 (p < 0.001) and time-points 1, 2, and 3 (p < 0.001) support this conclusion. The distances between control and phyllosphere samples from the same time-point were significantly closer to each other than they were to other time-points of same treatment (P = 2e-16; Mann-Whitney), suggesting that temporal changes are more influential drivers of community composition than treatment alone ( Fig 4B). Additionally, we performed an analysis of multivariate homogeneity of group dispersions (using the permutest in the Vegan package). Significant differences in dispersion were identified between T0 compared to T1 and T1 compared to T2 (P<0.007 for each comparison). We did not identify significant differences in dispersions among time-points within each environment (control and phyllosphere), most likely due to limited group sizes (S2 Fig).

Differential Abundance of Bacterial Communities
For every time-point there were numerous taxa that were significantly differentially enriched from one treatment to the other. For example, in Time-point 0, five different genus level OTUs in the Family Sphingomonadaceae were differentially enriched-with each one occurring in greater abundance in control samples. There is research that describes the protective effect that specific Sphingomonadaceae taxa can have on host plants [32]. There is even a species of Sphingomonas with the species epitaph, "phyllosphaerae" [33]. The enhanced abundance of Sphingomonas OTUs in control samples suggests that these taxa may not actually be endemic to the phyllosphere but rather, adapted to this niche. The composition itself (shared incidence of each taxa in each sample) was highly similar for both sample types. The differential abundance of specific groups described above provided the first insight into microbiota that were responding to host plant physiology in contrast to environmental deposition. To further explore bacterial relationships thriving in response to biological and physiological drivers from living leaves, we performed a network analysis by computing Spearman's correlation coefficients with corresponding P values for all pairwise distances of bacteria in phyllosphere and control samples (Fig 5). Examining significant pairwise correlations (P< 0.05 with False Discovery Rate (FDR) correction [34], a total of 23 unique correlations were identified in control samples. For phyllosphere samples, 37 unique correlations were observed and 21 significant correlations were shared between the two sample types (Fig 5).

Discussion
Work supporting the observation that microbial consortia associated with plant surfaces may be heavily influenced by air was recently presented at the 15 th 'International Symposium for Microbial Ecology (2014) [35]. Looking to understand the origins of the microbes that inhabit clouds, Santl-Temkiv et al. examined 16S rRNA and rDNA from soil, water, plant surfaces and air, and showed that communities from air and plant surfaces were the most similar [35]. Work by Gales et al., actually demonstrated that plant surfaces are useful for the monitoring of  Distinguishing between Host Plant Mediated and Environmentally Introduced Phyllosphere Microbiota bioaerosol emissions from a composting plant [36] because of their accurate representation of how far airborne microbiota actually travel from the composting plant. As previously mentioned, a number of studies suggest that seasonal and/or biogeographic factors (potentially airborne or other environmental pressures) appear to have a stronger influence on the composition of phyllosphere microbiota than pesticides [18], water sources [15] and other agricultural management practices [20,22,25].
In contrast to these findings, other work has described phyllosphere and air bacterial communities as distinct from one another other, with only 18% of OTUS at 97% shared in a study done comparing greenhouse air (represented by glass slides placed adjacent to plants) and Arabidopsis thaliana plant surfaces [30,37]. Maignien et al. describe the combination of selective and random forces that shape the microbial ecology of epiphytic bacterial populations of surfaces of Arabidopsis thaliana leaves grown in a green house. They suggest that proximity of plants may play a role in the bacterial community development and succession but they also acknowledge that random and stochastic processes appear to contribute to the development of phyllosphere bacterial communities that remain distinct from greenhouse air communities. The greenhouse air however, was constant in temperature and without wind pressures and thus does represent the pressures that occur in field conditions. Indeed, the authors mention that greenhouse Arabidopsis phyllosphere communities differed from those of field grown plants. Research by Williams et al. also supports the observation that microbiota of laboratory grown plants is significantly distinct from that of plants grown in field conditions [38]. Interestingly however, and consistent with the results presented here, Maignien et al. described a greater diversity of unique OTUs (1,003) associated with greenhouse air (measured using deposition on glass slides) compared to surfaces of greenhouse grown plants (435 OTUs). This observation supports the hypothesis that microbial colonizers of the phyllosphere represent a subset of the microbiota associated with air. A second study that suggests that microbiota from plant surfaces is distinct from air microbiota was conducted by Vokou et al. [37]. However, the total number of sequences generated for this study was not sufficient to make robust inference about the scope of alpha or beta community diversity for phyllosphere or air communities.
While air may play a poorly described but important role in phyllopshere microbial composition, it is well documented that host plants themselves mediate important microbial dynamics in the phyllosphere. We saw evidence of this in the differential abundance of certain taxa between control and phyllosphere samples. Especially within the family Sphingomonadaceae. Other important evidence of influence of host plant on phyllosphere bacterial microbiota was evident in the network analyses ( Fig 5). As previously mentioned, 21 correlations were shared between the two sample types with 37 unique to phyllosphere in contrast to only 23 unique to controls. A greater number of unique phyllosphere relationships, despite the fact that there were less unique phyllosphere OTUs, demonstrates the robust influence of host plant biology on phyllosphere microbiota. Correlations unique to the phyllosphere appeared among members of Bacilli, and Betaproteobacteria including genera such as Ralstonia, Staphylococcus, and Arthrobacter (Fig 5). The examination of the Ralstonia node for control and phyllosphere samples is particularly striking. Ralstonia is an important and well known plant pathogen of many plants. Ralstonia solanacearum causes wilt in a variety of plants-notably tomato. A complex network of relationships between Ralstonia and other bacteria is easily observed in phyllosphere while no relationships between Ralstonia and other genera are evident in control samples (Fig 5).

Conclusions
The most surprising observation for the bacterial taxa associated with control and phyllosphere samples was the high percentage of shared taxonomy for each sample type. These findings suggest that airborne or other environmental pressures may play an important role in driving the consortia of microbes that are associated with phytobiomes. These results also highlight the need for enhanced sampling methodologies to more comprehensively describe endophytic and epiphytic core phytobiomes. More sophisticated air sampling will be needed for future work to exclude the possibility that live plants may have introduced bacterial microbiota to controls. A repertoire of "usual suspects" spanning numerous bacterial and fungal phyla has been reported across a variety of plant phytobiomes, however the delineation of a core microbiome for any plant remains elusive. Efforts to describe core endophytes are closer to accomplishing this goal than efforts to describe core epiphytes. Indeed, the delineation of a "core microbiome" for the well studied human GI tract also has yet to be established. There are so many genetic, temporal, physiological, and stochastic environmental factors that exert small yet significant pressures on undescribed and interwoven ecologies.
A research goal for both human microbiome and phytobiome research is the description of core microbiota and how endemic, established or introduced taxa play a role in community stability, persistence, and antagonism or suppression of pathogens that enter the host microbiome. The observations presented here using plastic plants as a control surface, and by Maignien et al., using glass slides, demonstrated that more bacterial diversity was associated with representative air samples (plastic plants and glass slides) than with surfaces of plants. This suggests that improved methods need to be developed in replicated and diverse biogeographic regions before a core phytobiome can be described for any plant species. The great diversity of microbes in agricultural air systems is worthy of further study with state of the art air monitoring and integrative data techniques. Because of the proximity of the plastic plants used in this study to the live plants, we cannot rule out the possibility that the community composition of plastic plants actually represents local dispersal from colonizers of living tomato plants. However this does not explain the augmented diversity of OTUs associated with the control environment. The many (37) significant unique correlations between genera in phyllosphere samples observed in the network analysis (Fig 5) definitely provide an exciting insight into relationships that are host plant mediated. There was also evidence of many shared relationships between the two sample types as well (21) (Fig 5). These results suggest that stochastic processes, including wind driven dispersal and drift may play a significant role in the structuring of phyllosphere communities. They also provide very interesting insight into some of the potentially host plant mediated dynamics of the phytobiome. Continued examination of both control and phyllosphere samples will help us improve our understanding of which microbiota in phyllosphere communities are host plant mediated and which may be introduced by environmental forces.

Field Collection
Commercial tomato plants, variety BHN 602, were planted a minimum of 60 cm (2 feet) apart in 23 meter rows comprised of approximately 25 plants at the Wye Research and Education Center of the University of Maryland. The plants were border plants surrounding another 5experiment and received no treatment inputs such as pesticide applications. Plastic plants (figs) were purchased from Ikea to serve as non living controls possessing roughly same color, height and surface area of mid season tomato plants. The selection of the plastic plants for use as a control was made based on similarity of height, color, and surface area of plastic leaves to live tomato leaves. Several other materials were used in preliminary trials such as yellow and green sticky cards but these were determined to be too differentially selective for insect microbiota and DNA recovery was also significantly more challenging. The lack of sticky surface associated with the plastic plants provided a less biased surface to estimate bacterial deposition by environmental forces. Cost and availability were also practical considerations for selections of the controls. Plastic plants were washed thoroughly and rinsed with a 3% bleach solution for "sterilization". Plants were approximately 60 centimeters tall, which matched the height of the tomatoes at sampling time. Plastic plants were interspersed between live plants with a range of approximately 30 centimeters to 60 cm at the base of each plastic and live plant. All plants were supported by the same twine system, installed to support the tomato plants. In general, plastic and live plants were spatially separated by at least 12 cm at leaf tips, however this distance became smaller for a few live and plastic plant pairs over the six weeks of the experiment. . Additional sequencing of the microbiota associated with the plastic plants post purchase and pre-washing and "sterilization" was performed to provide an understanding of how effective our washing techniques were and to ensure that diversity described from the controls did not represent the store environment. Using sterile water and sonication, store microflora wash was removed, centrifuged and DNA was extracted from the pellet. All plastic and living leaves were collected from similar altitudes on the adjacent plants and from the front of each plant to ensure that no plastic and living leaves that had been physically touching were collected. Field collected samples were placed in ziplock bags and stored at approximately 4°C in a cooler with ice packs for transportation back to the lab. Samples were sonicated in 200 ml of sterile water to disrupt biofilms associated with plastic leaf and tomato leaf surfaces. The resulting "wash" water was centrifuged and DNA was extracted from the pellet. No specific permissions were required for collection from these research fields other than the consent of the Wye Research and Education Center (WREC) scientists and extension agents who direct the activities of the WREC Station. The field studies did not involve endangered or protected species.

16S amplicon sequence analysis
Raw fastq files reflecting forward reads output by the MiSeq platform were initially filtered for quality and length (!200bp) using QIIME [40,41]and spurious hits to the PhiX control genome were identified using BLASTN and removed. Passing sequences were trimmed of the forward primer, and evaluated for chimeras with UCHIME (de novo mode) [42], and subsequently filtered for host-related contaminant including chloroplast DNA using the RDP Bayesian classifier [43]. Next a large-scale BLASTN search of the GreenGenes database (v13_05) was performed to identify unknown contaminant sequences. Sequences without a database match of at least 70% identity along 60% of their length were removed. Identified contaminants included a substantial number of mitochondrial DNA. This resulted in an average of 160,000 raw reads per independent replicate with an average length of 221 bases. A full table of sequences throughout stages of quality screening is available in S1 Table. The final dataset of high-quality 16S rRNA gene amplicon sequences were characterized for diversity and taxonomic composition using QIIME with the GreenGenes database. Sequences were clustered into operational taxonomic units (OTUs) using UCLUST (de novo) [44] with a 97% identity threshold. Representative sequences of each cluster were assigned to a taxonomic lineage by the RDP classifier (trained on the GreenGenes 16S database, v13_05) using a minimum threshold of 0.50. Representatives were input to PYNAST [45] to generate a multiple sequence alignment, which was subsequently used to construct a neighbor-joining phylogenetic tree with FastTree [46]. After full characterization of the clean sequence dataset, sampling depth was normalized by rarefaction to 2,500 sequences per sample to include as many independent replicates as possible. The rarefied 16S sequence set was further evaluated by the Resphera Insight protocol to obtain high-resolution taxonomic assignments (Baltimore, MD; www.respherabio.com).

Statistical analysis
Beta-diversity distance metrics (Bray-Curtis) were computed from rarefied OTU tables and visualized using principal coordinate analysis in QIIME [40]. Hierarchical clustering and visualization were performed in R (v.2.12.0). The false discovery rate (FDR) was employed to control for false positives in comparative statistical testing [34]. nMDS ordination plots were created by applying nonmetric multidimensional scaling to the Bray-Curtis Dissimilarity matrix for the samples. nMDS ordination was achieved by the metaMDS wrapper function from the vegan package (http://CRAN.R-project.org/package=vegan), which uses the monoMDS function from the same package. The ordination was applied such that the data was scaled down to two dimensions. In addition (with a random seed of 246), 20 starting point iterations were performed within the metaMDS function call, leading to a minimum stress level of 0.1575973. Once the ordination was applied, the data was graphed using the ggplot2 package in R. The ellipses seen in the plots are crated using the stat_ellipse function from the ggplot2 package, which assumes a multivariate t distribution [47]. The ellipse is the upper 95 th percentile limit of the assumed distribution (95% confidence ellipse).
To test the global null hypothesis of independence for covariate variables (store, phyllo, enviro, and timepoints: T0, T1, T2, T3) and associated response variables (MDS1 and 2), the ctree() function in the party package of R was used to model a conditional inference regression tree onto the data after nMDS scaling. If the null hypothesis cannot be rejected, the process ends. If the null hypothesis is rejected, the covariate with the strongest association (determined using Bonferroni corrected p-values from permutation tests) is selected. A binary split is subsequently performed on the selected covariate. These steps are repeated until all possible nodes are generated using a p value set at 0.1 as a cutoff [31]. An analysis of multivariate homogeneity of group dispersions was also performed using the Vegan R package (betadisper) to provide permutation-based tests of dispersion homogeneity. Additionally, the Bray-Curtis distance matrix was used with ADONIS in R to perform a two-factor PERMANOVA analysis evaluating both time point and environment type (Control/Phyllo). ADONIS identified both time and type as significantly associated with total community composition (P < 0.001), with time influencing composition more so than type.
To evaluate differences in microbial networks among control and phyllosphere communities, we computed Spearman's correlation coefficients for all defined genera and their corresponding statistical significance, corrected using FDR [34]. Those correlations with significant P-values (P < 0.05) were included for comparative analysis and visualized using Cytoscape v3. x (www.cytoscape.org).

Data Submission
All 16S rRNA gene fastq files have been deposited in the SRA of NCBI associated with accession number [SRP043640]. All metadata has been submitted according to MIMARKS (minimum information about a marker gene sequence) [48]. Permutation-based tests of dispersion homogeneity were performed using the Vegan R package (betadisper). Significant differences in dispersion were identified between T0 compared to T1 and T1 compared to T2 (P<0.007 for each comparison; permutest in Vegan package). No significant differences in dispersions among time-points within each environment were identified (most likely due to limited group sizes). (TIF) S1 Table. Raw Sequence Data and Preprocessing Details. Full report of pre-processing and quality screening of sequences used for downstream analyses. (XLSX)