Functional Genes That Distinguish Maize Phyllosphere Metagenomes in Drought and Well-Watered Conditions

The phyllosphere microbiome consists of the genetic content of the microorganisms that colonize the external aerial portions of plants. Relationships of plant responses to specific microorganisms‐‐both pathogenic and beneficial‐‐have been examined, but responses of the phyllosphere microbiome overall are not well described. Changing crop growth conditions such as increased drought can have profound impacts on crop productivity and epiphytic microbial communities provide a new target for crop yield optimization. We provide the first comparison of Zea mays leaf microbiomes between drought and well-watered conditions. We examined the maize leaf phyllosphere microbiome from three physically disparate locations with and without drought treatment, through the application of deep sequencing with short sequence read technology. Functional profiles revealed a wide variety of metabolic and regulatory processes including the ability to adapt to changing environmental conditions associated with, and external to the plant and the presence of potential plant growth promoting traits suggesting possible beneficial plant-microbiome interactions. Both specific field site and drought affected taxonomic and functional composition of these leaf epiphyte communities.


Introduction
Plants form a wide variety of intimate associations with a diversity of microorganisms (bacteria, fungi and viruses) that include those below ground in the rhizosphere and above in the phyllosphere, which consists of the aerial parts of plant especially the leaves (Müller et al., 2016;Bulgarelli et al., 2013). Microorganisms can exist as endophytes within the plant, as epiphytes attached on plant surfaces and in the roots and the soil surrounding the roots (Berg et al., 2014;Knief, 2014;Mine et al., 2014). The ubiquity and intricacy of these plant-microbe associations give rise to the notion that the plant is a "meta-organism" or "holobiont" consisting of the host and its microbiome (the collection of microorganisms and their gene content) which maintain a continuous relationship over the lifetime of the plant (Berg et al., 2014). The plant microbiome is complex and dynamic existing as both an agonist and antagonist of plant fitness and adaptability (Vacher et al., 2016). Therefore, elucidating the nature and extent of these interactions offers significant opportunities for improving plant health for example, through nutrient cycling, neutralizing toxic compounds, discouraging pathogens, and promoting resistance to abiotic stresses, generating significant impact on plant productivity (Churchland and Grayston, 2014;Choudhary et al., 2011;Howden and Preston, 2009;Phieler et al., 2014). Optimization of selective breeding for epiphytes presents new challenges in ensuring that colonization occurs as needed, while also presenting new potential for effective indirect genetic selection (Parnell et al., 2016;Mitter et al., 2017).
In contrast to the rhizosphere, the phyllosphere is both a relatively understudied and transitory microbial environment (Bringel and Couée, 2015;Müller et al., 2016). While annual plants undergo a life cycle within a single growth season, deciduous perennial and evergreen plants shed leaves throughout the year (Niinemets, 2007) and these types of differences in plant traits shape their associated microbial communities (Lambais et al., 2017). Furthermore, microbial epiphytes of the phyllosphere experience an environment that is distinguished from the rhizosphere by atmospheric influences including direct sunlight exposure during diurnal cycles, and plant epidermal cells are covered by a waxy cuticle that forms an important trophic barrier designed to reduce water evaporation and release of plant metabolites, resulting in an oligotrophic environment (Müller and Riederer, 2005).
Drought is an abiotic stress representing an important limitation of crop plants that can negatively affect plant productivity (Schmidhuber and Tubiello, 2007). Corn, Zea mays L., is one of the most widely grown and economically important annual crops, making the understanding of the role that the phyllosphere may play in association with maize undergoing stress a priority. Only between 0.1% and 10% of the microbial diversity is culturable in vitro (Torsvik and Øvreås, 2002;Kell et al., 1998). This has led to the use of culture independent methods to study microbial community structure and function. For approximately the past two decades the sequencing of amplicons representing biomarkers such as the 16S rRNA gene and internally transcribed spacer (ITS) regions have been used to describe microbial and fungal diversity, respectively (Methé and Lasa, 2013;Rajendhran and Gunasekaran, 2011). More recently, research of microbial communities has emphasized a systems-level approach relying on high throughput methodologies such as metagenomics, a term which describes a technological approach consisting of sequencing and analysis of genes derived from whole communities as opposed to individual genomes; thereby emphasizing that microbes ultimately function within communities rather than as individual species. Typically functional genes are different pattern than taxa, and for phyllosphere specifically functional genes are better predictor of niche/site (Suen et al., 2007). Here we present a metagenomic-based study of the maize phyllosphere during conditions of drought stress in three locations.

Materials and Methods
Seed Stocks Zea mays L. inbred B73 seed was supplied by the Maize Stock Center, http://maizecoop.cropsci.uiuc.edu/, and seed was increased at the Central Crops Agricultural Research Station using standard maize nursery procedures. Genotype of the seed lots used for these experiments was verified by SSR genotyping using eleven markers and comparing fragment sizes to the sizes listed in the MaizeGDB database (Sen et al., 2009), http://www.maizegdb.org/ssr.php.

Field Sites, Planting and Growth Conditions
Research field sites were generously provided by collaborators with ongoing scientific and extension projects; no additional permits or permissions were required. Replicated field plots were planted in Albany, CA at the USDA University of California-Berkeley field site (abbreviated as CA), 37°53'12.8"N 122°17'59.8"W, on June 6, 2012. The field site had uniform soil and subsurface irrigation. The southern section had normal irrigation throughout the season, while the northern section had normal irrigation until vegetative growth stage V5 when all watering was stopped. Seeds were planted at two sites in Texas, Dumas Etter field (abbreviated as DE) 35.998744°N 101.988583°W on May 8, 2013, and Halfway, TX (abbreviated as HF), 34.184136°N 101.943636°W on April 26, 2012. The sites had center-pivot irrigation and standard maize field management. Drought treatment blocks were watered at 75% of the normal rate at DE and at 50% of the normal rate at the HF field site. The DE field site had one replicate plot that experienced additional rain late in the season (after phyllosphere sample collection). The HF field site had no unmanaged precipitation between July 9 and harvest. Late-season (post-phyllosphere sampling) field trait measurement methods and data files for each field site are provided in Supplemental Plant Traits files 1-7.
Field Trait Measurement At the Texas field sites plant and ear heights were measured once per plot when growth was complete after tasseling. Ears were harvested and shipped to UNCW for measurement. For each ear, cob diameter at the base was measured with digital calipers, and twenty seeds were removed from the middle of each cob, placed in envelopes, and weighed. For the Albany, CA site, individual plant heights were measured and cobs were collected at the end of the season, October 1-3, 2012. Seed development was not complete, so only cob traits were measured. Cob diameter at base was measured with digital calipers; cob length was measured with a ruler. Plant data for each location and trait are included in supplemental files 1 to 6, with metadata about the column headers in supplemental file 7.
Leaf Washes and DNA Extraction Samples were collected from Dumas Etter TX on June 26, 2012 and from Halfway TX on June 27, 2012, at developmental stage V8. Albany, CA phyllosphere samples were taken August 7 and 8, 2012, at developmental stage V8. Six fully expanded leaves from the top quarter of the plants in each plot were placed into sterile bags (Whirl-Pak, Nasco, Fort Atkinson, WI) prefilled with 300 ml sterile water and 3 microliters Silwet L-77 (EMCO, North Chicago, IL). Bags were moved to nearby shelters, sonicated for one minute to loosen epiphytic microbes, and the 300 ml of wash solution was filtered through sterile Pall microfunnel 0.2 micron filter cups (VWR, Radner, PA) to collect microbial cells on the filter surface. The filter was removed from the cup with sterile tweezers and dropped into small sterile Whirl-Pak bags then stored frozen until DNA extraction. DNA was extracted from each filter was with a PowerSoil Mega kit (MoBio, Carlsbad, CA). Samples were concentrated with filter-sterilized sodium chloride and absolute ethanol according to the manufacturer's instructions and shipped frozen to JCVI for sequencing.

Sequencing, Quality Control and Removal of Host Sequence
Three test samples were used to aid in determining good library construction and sequencing protocols for this project. Two nucleic acid negative controls were also processed through library construction and sequencing to test for the presence of any significant contamination of experimental samples by exogenous DNA. After sequencing of the phyllosphere experimental samples, a quality control procedure was used to remove completely reads of overall low quality or of very low complexity. Remaining reads were trimmed to remove adapter sequence from the library construction process as well as any low quality bases (typically present at the 3' end of the sequence). Due to the nature of the sampling collection and nucleic acid procedures, plant host genomic DNA was inevitably included in the nucleic acid samples used for library construction. Therefore, a screening process was implemented to remove reads most likely to be of maize origin. The remaining high quality reads were subsequently used for phyllosphere microbial community analysis. The raw data and processed reads are accessible from the NCBI Short Read Archive under Bioproject PRJNA297239.
Read Annotation Reads were functionally and taxonomically annotated by association with Uniref clusters through blastx. Briefly, for each collected sample subjected to sequencing, a maximum between 200,000 or the number of reads in the sample, were blasted against the UniRef90 database of peptide sequences downloaded on July 21, 2014. The blastx was performed with an e-value of 1e-10 and a fixed database size of 1e6, to ensure a consistent quality of match would be detected, independent of actual database size. Hits were then scored by computing a composite percent identity score = (percent identity of alignment * percent of read aligned). Taxonomic and functional assignments were then made on the groups of UniRef90 hits identified at the cutoffs of 90%, 75%, 60%, and 45%. Within each cutoff group, per read taxonomic assignments were made based on the highest common ancestor using the taxonomic hierarchies defined in NCBI (http://www.ncbi.nlm.nih.gov/taxonomy) across all the taxonomies assigned to each hit UniRef cluster. Reads associated with Viridiplantae, Chordata, Orthomyxoviridae with the NCBI taxonomy ID's 3309, 7711, 11308, respectively, or lower in the taxonomic hierarchy, were considered to be contaminants and removed before further processing. Next, for functional assignments, the annotation from the highest quality hit for each annotation category was selected for each read. The annotation categories that could be associated with each read by UniRef cluster proxy included PFam, TIGRFams, GO Processes, GO Functions, and ECs (see Table 2, Annotation results). For each of the annotation categories and the key levels of assigned taxonomy (Species, Genus, Family, Order, Class, Phylum, Superkingdom), profiles for each of the samples consisting of counts to each of the categories (functional or taxonomic) were generated for downstream statistical analyses.
Statistical Analysis of Plant Traits Plant traits (seed weight, plant height, and cob diameter) were analyzed with linear models using JMP11 Pro (SAS, Cary, NC). Models were fit with water treatment and plot (as nominal factors) for each trait, as multiple plants were measured within each plot.
Outlier Tests Read count data (Supplemental Table A and Supplemental Table B, complete files at https://github.com/Talleyman/Metagenomic-Analysis-Part-2) was tested for outliers using JMP Pro v11 for the Grubbs and Dixon tests, and using SAS 9.3 (SAS, Inc, Cary, NC) for the zRobust test; code and additional details on outlier tests are provided in the Supplemental Output File zRobust and summary outlier results in Supplemental Table C.
Community Composition Analysis Functional annotations from the 45% cutoff were structurally compared using typical ecological PERMANOVA analysis methods and important functional features identified using methods optimized for high-quality high-power metagenomics feature selection. Overall community composition was compared for ribosomal and functional classifications with the Permanova and Permdisp add-on package in Primer-E software (Clarke and Gorley, 2006;Anderson, 2006), http://www.primer-e.com/, using square-root transformation and Bray-Curtis distance (there was no imputed HF sample used for the Primer-E analyses). The Primer-E SIMPER module was used to output lists of the most important features, using the default 90% importance threshold. The SIMPER output files are included in the supplemental output files simper 1-9.
Annotation Analysis For feature selection, functional annotation read counts were used as input to the R package ENNB (Pookhao et al., 2015), which uses a two-stage process with an elastic net then negative binomial fit to identify significant annotations, though it is only possible to fit one factor (nested or full factorials for multiple experimental factors are not possible to fit using this two-stage method). The package was downloaded from the An web page (http://cals.arizona.edu/~anling/software.htm) and loop scripts written to run each of the different annotation types individually, with method 1, the trimmed mean (TMM) from the EdgeR package, and method 2, DE-Seq-type count overdispersion, used to adjust for library size differences. Before analysis, the annotation data sets were cleaned to remove any values lower than those found in the soil or mock-collected sequenced sample annotations from each annotation column in the phyllosphere sample data. Statistical analysis of annotations different in drought and well-watered conditions were carried out across all field sites. The missing HF sample was imputed using the R package MI for the ENNB analysis. The original data files, cleaned data, R scripts for repeated runs of the ENNB package, and the outputs are available from github (https://github.com/Talleyman/Metagenomic-Analysis-Part-2).
Annotation Visualization Lists of GO Process and GO Function annotations that were significantly different upon output from ENNB were visualized using REVIGO (Supek et al., 2011), http://revigo.irb.hr/, with the Simrel and medium list defaults selected. The REVIGO cytoscape-format xgmml network files were matched with Primer-E SIMPER analyses and any nodes not found in the SIMPER analysis or not consistent in abundance count direction by treatment factor were removed from the REVIGO network visualization. The REVIGO networks were color-coded and the network layout redrawn using Cytoscape v3.2.1 (Shannon et al., 2003). Taxonomic visualizations were created from the ENNB significant taxa output list, checked for consistency with SIMPER genus output, and visualized using FigTree v1.4.2 (http://tree.bio.ed.ac.uk/software/figtree/) and the NCBI taxonomy common tree (http://www.ncbi.nlm.nih.gov/taxonomy). An analysis of the core microbiome was carried out using a subset of 13 of the samples (not shown).

Plant Traits
All plant measurements at all sites showed significantly less growth in the treatment with less water (Figure 1). Plot effects were examined for each trait and no significant interaction between plot and replicate was found. Mid-season plant heights were significantly less (P<0.0001) in the drought condition for the California-Albany (CA) site; the drought-treated plants were ~20% shorter, with an estimated difference between normal water and drought of 0.158 meters. The Texas-Dumas-Etter (DE) field site with plot 101 excluded exhibited significant (P=0.0139) effects of drought on end-of season seed weight (Figure 1b), with the seed weights in drought reduced by about 25% (estimated difference of 0.468 grams less in drought samples); plot 101 from the 75% site had a late-season rain event and was thus excluded from the analysis. Drought reduced seed weight by ~50% at the Halfway, TX (HF) field site (Figure 1c), with P<0.0001 and an effect difference of 1.206 g less in drought seed samples. Cob diameters were also significantly smaller in the drought-treated plants (Figure 1 d, e, f) with the effect size differences ranking the drought intensity of DE (1.66 mm less in drought) less than CA (2.52 mm less in drought), with the most severe cob diameter drought effects at the HF site (3.24 mm less in drought).

Metagenomic sequences
The mock and test samples yielded very few good sequencing reads or high quality base calls. Most of this sequence was determined to be maize in origin. The screening process generated on average ~40% of reads of maize origin. The remaining reads totaling approximately 2.2GB of sequence were subsequently used for phyllosphere microbial community analysis. Approximately one-third of reads were either novel or non-coding. Given that the majority of reads appeared to be bacterial, the former explanation likely dominates the source of the unannotated reads, given the higher proportion of bacterial DNA dedicated to coding sequence. A proportion of UniRef90 clusters are known not to be annotated with any taxonomic or functional information. However, of the phyllosphere reads that did hit UniRef90 clusters, 98.5% of these hits could be associated with taxonomy.

Metagenomic Analysis of Drought Effects within Field Sites
Outlier tests (Grubbs, Dixon's Q, and zRobust) of the read counts for each library type and annotation datatype (as listed in Supplemental Tables A and B) indicated that there was no consistently significant outlier sample (Supplemental Table C and Supplemental File zRobust). Overall community structure comparisons of all leaf samples' annotations using Permanova and Permdisp with square-root transformation and Bray-Curtis relatedness were not significant for any annotation types for either single-factor or two-factor (treatment within site) models (data not shown).
The ENNB analysis with normalization by TMM generated a list of significant GO Process and GO Function annotations in watered as compared to drought-treated phyllosphere samples for each field site, with groups of related GO terms from REVIGO analysis indicated by edges between GO node terms. The GO Process terms identified as semantically important in the drought treatment for the DE site (Figure 2a) include biochemical pathways involved in amino acid and nuclei acid biosynthesis, regulation of transcription, and response to stress, with the two most distinguishing annotations (with the largest node size) being chemotaxis and cyclic nucleotide biosynthesis. REVIGO process annotations that were important in the GO list from the watered-treatment samples include copper ion homeostasis and urea catabolism (blue nodes in Figure 2a). For the HF field site, chemotaxis and amino acid biosynthesis were also prominent, with higher levels in drought samples (and large node sizes in Figure 2b). Protein synthesis and transcriptional function annotations were also prominent. At this field site, gamma-aminobutyric acid catabolism was detected in a cluster of annotations related to amino acid metabolism (Figure 2b, top right). One GO Process annotation, dexoyribonucleotide catabolism, was detected for the CA site (data not shown), with read count abundance for that annotation higher in drought samples.
Within the DE site, GO Function annotations relating to transport (for example, porins and ion transporters, top left section of Figure 3a) and elimination reactions (large cluster of reactions, top second from left in Figure 3a) distinguish the drought versus watered microbial communities. The most prominent annotation that is more abundant in watered samples is RNA methyltransferase activity (lower left corner of 3a). For the HF site, there is a cluster of annotations related to regulation of protein activity that are more abundant in drought (second from left at the top of Figure 3b); the most prominent annotation is 'structural constituent of ribosome' (lower left corner of Figure 3b). There were no significantly different functional annotations for the CA site. Both DE and HF sites share the annotation "5formyltetrahydrofolate cyclo-ligase activity", which may play a role in thiamin metabolism (Pribat et al., 2011), though the abundance in drought samples is higher in DE and lower in HF. Across the GO Function categories more nodes were prominent (larger node size in Figure 3) in both the DE and HF sites in comparison to the node size distributions in Figure 2 for GO Process annotations.
Overall important taxa list from the drought versus water comparison within the DE site ( Figure  4) shows that nine groups had higher read counts in watered samples, while 20 groups had higher read counts in drought. Actinobacteria and Gammaproteobacteria contained taxa with a higher abundance in drought, while all other groups contained taxa with abundances both higher and lower in the watered and dry samples. Groups well-known to associate with plants included Pantoea, Enterobacter and Ustilago, all with higher abundance in drought-treatment leaf samples at the DE site. There were no significant taxa abundance differences identified within the HF or CA sites for the water treatment factor using the ENNB-with-TMM method.
Most GO annotations found by ENNB analysis (Figures 2 and 3) were also identified as important contributors to sample difference using the SIMPER method. If annotations were not present in both sets of results, they are not shown in the figures. Specifically, ten nonconcordant nodes were removed from the DE process graphic (Figure 2a), none from the HF process graphic (Figure 2b), and one node was removed from the CA process network --leaving only one difference for the CA site. Twelve non-concordant nodes were removed from the DE function graphic (Figure 3a), one node was removed from the HF function graph (Figure 3b), and five taxa were removed from the DE genus collection (Figure 4). As expected, normalization method and sample inclusion strongly affect the significant-annotation output lists. We include the output files for alternative normalization methods in our github repository. The effect of sample removal can be observed for any annotation of interest by using the TMM output files provided in the repo, to explore the dependence of abundance count averages on sample presence.
Analysis of a subset of 13 MiSeq samples (data not shown) provided a core microbiome containing bacterial and eukaryotic components. Analysis of the core bacterial taxonomy across all phyllosphere samples, using >80% ubiquity and >1% identify categories, indicated that at the genus level Acinetobacter and Pseudomonas have the greatest abundances. A potentially novel genus in the family Enterobacteriaceae is also present. Of the top five eukaryotic categories that could be associated with a genus three were from the Family Pleosporaceae: Pyrenophora, Bipolaris, and Setosphaeria and the fourth genus Parastagonospora shares the same order Pleosporales with the first three genera. In addition, the fifth genus Fusarium shares the same division Ascomycota with all four genera (Pyrenophora, Bipolaris, Setosphaeria, Parastagonospora).

Discussion
The field site, CA, with the least difference in the plant cob trait when comparing drought to well-watered (Figure 1 d, e, f), had the fewest significant changes in annotation. The moderate intensity drought treatment site in regards to the cob trait, the DE site, had the largest numbers of important annotations (Figures 2 and 3) and had significant taxonomic differences (Figure 4). This suggests that environmental gradients across sites may be needed to determine which field sites are predictive for each other.
One striking pattern in all the GO annotation networks is the much larger number of categories identified as more abundant in the drought samples (Figures 2 and 3). A pattern such as this, with larger numbers of contributing features in a stressful environment, is also see in transcriptomic analyses of increasing (need cite) and combined (Zandalinas et al., 2017) abiotic stress and in genetic analyses of important loci under stress conditions (cite crossa?).
Effect estimation was less consistent than significance list comparisons across different methods. Better effect estimation methods would be a useful focus for future research, especially for estimation of read count abundances that have truncated or zero-inflated distributions and are thus not well represented by means or medians. Additionally, it would be helpful to have methods that can fit a weak hierarchy interaction (i.e. one main effect and the interaction) for the variable selection step, while maintaining good performance on correlated datatypes. The HierNet (https://cran.r-project.org/web/packages/hierNet/index.html) R packages has this functionality, though the package does not yet handle categorical factors such as those in our experiment. Our experiment was designed to examine annotation patterns and treatment differences; as we expected the experiment was underpowered to detect community composition differences in correlated/nested annotation datatypes with small numbers of replicates. We thus endorse efforts to improve methods for detection of community shifts in multivariate count data (Widder et al., 2016).
We compared our GO annotations within sites to the twelve pfam terms in the rice metaproteome (their Table 2) (Knief et al., 2012) that have GO annotation equivalents. We examined the rice metaproteome terms by considering the term itself and the parent and child terms. Elven of the rice category annotations are also found in our important annotations (Figures 2 and 3); the missing category is 'heme binding'. In contrast to the similarity we see between rice and maize leaf communities, a metaproteomic analysis of tree phyllospheres did not reveal any annotated proteins shared across all plant species (Lambais et al., 2017).
In summary, we identified sets of functional gene annotations that differ between drought and well-watered maize leaf samples from field plots. The field site and the drought treatment affected the type and number of functional genes identified. For future selective breeding, sites and treatment regimes should be chosen for optimal predictive power for the target production area.

Figure Legends
Figure 1 Plant Traits Error bars are standard error, colors are the same within a field site. a) Comparison of plant heights in drought and well-watered plots from the California-Albany (CA) field site; bar heights indicate average height in meters. Drought (less drip irrigation) n=40, wellwatered regular drip irrigation n=40. b) Comparison of seed weights from mild drought (75% of normal irrigation) and well-watered (100% irrigation) field blocks at the Texas-Dumas-Etter (DE) field site. Colored bar indicates mean value. Drought n=18, well-watered n=17. c) Comparison of seed weights from in intense drought (~half of normal watering level, DRT) and well-watered (WW) field plots at the Texas-Halfway (HF) site. Colored bar indicates mean value. DRT n=4 and WW n=22. Zeros (cobs with no seeds from DRT) were not included in the analysis. d) Box plot of cob diameter of CA samples; white line is mean and quantiles are indicated by the box and whiskers, n=12. e) Comparison of cob diameter by water treatment in samples from the DE site. Colored bar indicates mean value, n(75%)=28, n(100%)=26. f) Comparison of cob diameter by water treatment in samples from the HF site. Colored bar indicates mean value, n(DRT)=10, n(WW)=22.

Figure 2 Network Visualization of Gene Ontology Process Annotation Differences Between
Normal Water and Drought Treatment Significant annotations from ENNB analysis were grouped by semantic similarity into a network. The size of each node is proportional to the SIMPER dissimilarity that describes the extent of difference between the treatment samples, with larger sizes reflecting increased contribution to differentiation. Tan nodes had higher read count abundances in drought and blue nodes had higher abundance in fully watered samples. a) DE field site GO Process annotations significantly different between fully watered and 75% watered microbial phyllosphere samples. b) HF field site GO Process annotations significantly different between the 50% water treatment samples and the fully watered samples.

Figure 3 Network Visualization of Gene Ontology Function Annotation Differences Between
Drought and Normal Water Treatment Significant annotations from ENNB analysis were grouped by semantic similarity into a network. Tan nodes had higher abundances in drought and blue nodes had higher abundance in full-watered samples. The size of the node is proportional to the SIMPER dissimilarity that describes the extent of difference between the treatment samples, with larger sizes reflecting increased differentiation. a) DE GO Function annotations significantly different between 75% water and full watered treatment microbial phyllosphere samples within the DE site. b) HF GO Function annotations significantly different between 50% water and full watered treatment microbial phyllosphere samples within the HF site. Tan nodes have higher abundances in drought and blue nodes have higher abundance in full-watered samples. The size of the node is proportional to the SIMPER dissimilarity that describes the extent of difference between the treatment samples, with larger sizes reflecting increased differentiation.

Figure 2sb
HFs GO Process annotations signifcantly different between 50% water and full watered treatment microbial phyllosphere samples within the HF site. Tan nodes have higher abundances in drought and blue nodes have higher abundance in full-watered samples. The size of the node is proportional to the SIMPER dissimilarity that describes the extent of difference between the treatment samples, with larger sizes reflecting increased differentiation.
S Figure 3a DE GO Function annotations signifcantly different between 75% water and full watered treatment microbial phyllosphere samples within the DE site. Tan nosdes have higher abundances in drought and blue nodes have higher abundance in full-watered samples. The size of the node is proportional to the SIMPER dissimilarity that describes the extent of difference between the treatment samples, with larger sizes reflecting increased differentiation.
Figure 3b HF GO Function annotations significantly different between 50% water and full watered treatment microbial phyllosphere samples within the HF site. Tan nodes have higher abundances in drought and blue nodes have higher abundance in full-watered samples. The size of the node is proportional to the SIMPER dissimilarity that describes the extent of difference between the treatment samples, with larger sizes reflecting increased differentiation.