Habitat and Host Indicate Lineage Identity in Colletotrichum gloeosporioides s.l. from Wild and Agricultural Landscapes in North America

Understanding the factors that drive the evolution of pathogenic fungi is central to revealing the mechanisms of virulence and host preference, as well as developing effective disease control measures. Prerequisite to these pursuits is the accurate delimitation of species boundaries. Colletotrichum gloeosporioides s.l. is a species complex of plant pathogens and endophytic fungi for which reliable species recognition has only recently become possible through a multi-locus phylogenetic approach. By adopting an intensive regional sampling strategy encompassing multiple hosts within and beyond agricultural zones associated with cranberry (Vaccinium macrocarpon Aiton), we have integrated North American strains of Colletotrichum gloeosporioides s.l. from these habitats into a broader phylogenetic framework. We delimit species on the basis of genealogical concordance phylogenetic species recognition (GCPSR) and quantitatively assess the monophyly of delimited species at each of four nuclear loci and in the combined data set with the genealogical sorting index (gsi). Our analysis resolved two principal lineages within the species complex. Strains isolated from cranberry and sympatric host plants are distributed across both of these lineages and belong to seven distinct species or terminal clades. Strains isolated from V. macrocarpon in commercial cranberry beds belong to four species, three of which are described here as new. Another species, C. rhexiae Ellis & Everh., is epitypified. Intensive regional sampling has revealed a combination of factors, including the host species from which a strain has been isolated, the host organ of origin, and the habitat of the host species, as useful indicators of species identity in the sampled regions. We have identified three broadly distributed temperate species, C. fructivorum, C. rhexiae, and C. nupharicola, that could be useful for understanding the microevolutionary forces that may lead to species divergence in this important complex of endophytes and plant pathogens.


Introduction
Delimiting species boundaries among fungi lays the groundwork for detailing the natural history and ecology of species and defines a robust framework from which further comparative studies can be designed (e.g. population genetics/genomics). It is also a prerequisite to providing targeted and effective disease control measures and to identifying specific pathogens against which plant breeders can focus their efforts in developing and selecting disease-resistant cultivars. Strictly agro-centric studies of plant pathogens risk sampling too narrowly, overlooking important adjacent (parapatric) niches driving pathogen evolution. Extensive sampling within, adjacent to, and beyond agricultural landscapes has the potential to provide a broader view of the natural history of pathogen species and offer insight into the evolution of differential traits among closely related lineages [1].
Colletotrichum Corda is among the most important and widespread genera of plant-associated fungi, causing disease and occurring as asymptomatic endophytes on aerial organs of a broad range of host plants [2][3][4][5][6]. Colletotrichum gloeosporioides sensu lato represents an aggregate of species frequently reported as a dominant endophyte of tropical herbaceous plants and is known as a field and post-harvest fruit pathogen of many economically important crops [7][8][9][10][11]. Morphological homoplasy and phenotypic plasticity have previously thwarted efforts to clearly define species boundaries within the species complex, a step that is necessary if we are to develop a greater understanding of the ecology and natural history of each lineage. Molecular markers suitable for resolving species limits and phylogenetic relationships within this species aggregate have recently been proposed and validated [12][13][14][15][16], making it possible to examine the role of geography, host preference/specificity, the nature of host-pathogen/host-endophyte associations, and other niche specialization attributes that may underlie species divergence [14]. Furthermore, the recent epitypification of C. gloeosporioides provides a much needed reference point for developing an improved taxonomic framework for the species complex [17]. The current study uses a phylogenetic approach to infer species boundaries among North American members of the C. gloeosporioides species complex, particularly those in association with the large American cranberry (Vaccinium macrocarpon Aiton) and sympatric plant species. Colletotrichum gloeosporioides s.l. has been reported as a leaf and fruit pathogen of cranberry since the late 1800s [18] and early 1900s [19] and has recently been observed to colonize stem tissue [20]. Contemporary studies of fruit pathogens have confirmed the importance of C. gloeosporioides s.l. in agricultural ecosystems throughout the cultivated range of V. macrocarpon [8,21,22,23]. However, given that C. gloeosporioides s.l. is an aggregate of species that can be difficult or impossible to distinguish morphologically, it is not clear that the strains isolated from cranberry are conspecific throughout the cultivated range of cranberry. In addition, genetic studies have focused on isolates solely from fruit and have not investigated diversity from alternate host organs or sympatric host species, despite evidence that these represent potential reservoirs of diversity for the species complex in cranberry agricultural areas [19,24,25,26,27]. Vaccinium macrocarpon is one of North America's few native, economically important crop species. Cranberry has undergone little selection from wild relatives and is cultivated in several regions alongside extant native populations [28]. It can also be found in sympatry with the closely related species V. oxycoccos L., which has been used for inter-specific hybridization in breeding programs but is not cultivated for agricultural production [29]. In addition, many of the cultivars used in past breeding regimes remain available for further experimentation. These characteristics provide cranberry breeders with excellent resources for meeting the needs of cranberry growers to improve fruit production and reduce pathogen pressure through breeding for disease resistance [30]. However, providing a refined systematic understanding of the pathogens that are the source of disease pressure will be necessary to help focus the efforts of plant breeders.
Host specificity/preference has historically been a focal criterion for species delimitation in Colletotrichum [7,11]. Similarly, contemporary studies have indicated the presence of host-specific lineages [13,31]. However, host-fungus associations within Colletotrichum are variable. It is clear that multiple species are capable of infecting single hosts and, conversely, some Colletotrichum species are capable of infecting multiple host species [13,14,32]. In addition, there is indication that different plant organs may act as selective forces leading to organ specialization in Colletotrichum species [33]. While C. gloeosporioides s.l. is established as a pathogen of a variety of important crop species worldwide [8,14,15,31,34,35,36,37,38], this study represents a unique perspective into the evolution of the species complex by investigating the genetic diversity of C. gloeosporioides s.l. in a single crop species and its surrounding habitat within the host's native range. The specific objectives of this study are to determine: (1) whether there are multiple sympatric lineages within the species complex that infect cranberry, (2) whether host preference/specificity is evident among sympatric lineages within the species complex, (3) whether host organ or nature of the fungus-host association is predictive of phylogenetic structure, and (4) whether we can identify lineages with broad geographical and/or host associations suitable for fine-scale landscape genetic analysis. We sampled horizontally across five sympatric host species in wild and commercial cranberry bogs in order to address questions related to host specificity, and sampled vertically among different plant organs within V. macrocarpon to target questions related to organ specialization (Table 1). 
Utilizing the recent development of molecular phylogenetic markers useful for distinguishing lineages within the species complex [12][13][14][15] and isolates from five major cranberry agricultural areas, wild cranberry bogs, fruit and stem of V. macrocarpon, and from five sympatric host species, we assess lineage diversity among isolates from cranberry and surrounding habitats.

Ethics Statement
All necessary permits were obtained for the described field studies from the Delaware Division of Parks and Recreation, the National Forest Service, and The Nature Conservancy. All other samples collected in this study were on private land and did not require permits. There were no endangered or protected species collected for this study.

Fungal Isolation and Culturing
Colletotrichum was isolated from symptomatic and asymptomatic tissue of several host species in North America with a focus on species sympatric with Vaccinium macrocarpon in wild and agricultural habitats. Sympatric host plant species from which C. gloeosporioides s.l. was isolated include Vaccinium oxycoccos, Rhexia virginica L., Chamaecyparis thyoides (L.) Britton, Sterns, & Poggenb., and Nuphar lutea (L.) Sm. (Table 1). Colletotrichum gloeosporioides s.l. was isolated from symptomatic and asymptomatic stem and fruit of V. macrocarpon (Table 1) in both wild and agricultural habitats. Both symptomatic and asymptomatic tissue was surface sterilized in 10% bleach (final concentration 0.6125% sodium hypochlorite) for between 1.5 and 5 minutes, depending on the permeability of the tissue, and plated on V8 juice agar, 2% malt extract agar (MEA: BD Diagnostics, Franklin Lakes, NJ, USA) or corn meal agar (CMA: BD Diagnostics), with the exception of Nuphar lutea. Nuphar lutea was incubated in a humid chamber under ambient light and conidia were transferred from anthracnose lesions to CMA and potato dextrose agar (PDA: BD Diagnostics) for purification. Isolates were characterized as endophytes if they were isolated from asymptomatic tissue after surface sterilization; all other isolates were considered pathogenic. Strains morphologically similar to Colletotrichum gloeosporioides s.l., based on an assortment of characters including growth rate, colony color, hyphal morphology and/or conidial shape and size, were isolated by transferring conidia or hyphal tips to sterile media and preserved on CMA slants stored at 6 °C, in 1.5 mL microcentrifuge tubes at 6 °C, and in 10% glycerol at −80 °C. Types, epitypes, and representative cultures were deposited at the Centraalbureau voor Schimmelcultures (CBS) with corresponding dried cultures deposited as vouchers at the U.S. National Fungus Collections (BPI).

DNA Extraction and PCR Amplification
Isolates were grown on potato dextrose broth (Difco) for 5-7 days before mycelium was harvested, blotted dry on sterile paper towels and dehydrated for 6-10 hours in a vacuum centrifuge on low heat. Approximately 20 mg of dried tissue was used for DNA extraction using the DNeasy Plant MiniKit (Qiagen Inc., Valencia, CA, USA) or a standard phenol-chloroform extraction method after tissue homogenization in the FastPrep FP120 (MP Biomedicals, LLC., Solon, OH, USA).
PCR reactions were run on an Eppendorf Mastercycler pro S with the following cycling parameters: nrITS: initial denaturing for 2 min at 94 °C, followed by 38 cycles of 94 °C for 1 min, 55 °C for 30 s, and 72 °C for 45 s, followed by a final extension at 72 °C for 5 min; tub2: initial denaturing for 2 min at 94 °C, followed by 30 cycles of 94 °C for 35 s, 52 °C for 55 s, and 72 °C for 2 min, followed by a final extension at 72 °C for 5 min; apn2/matIGS: initial denaturing for 5 min at 95 °C, followed by 10 cycles of 95 °C for 30 s, 62 °C (decreasing by 1 °C each cycle) for 30 s, and 72 °C for 1 min, followed by 35 cycles of 95 °C for 30 s, 52 °C for 30 s, and 72 °C for 1 min, followed by a final extension at 72 °C for 10 min. PCR products were purified and sequenced at the High Throughput Genomics Unit, Department of Genome Sciences, University of Washington using ABI 3730xl sequencers.
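The touchdown profile described for apn2/matIGS can be written out cycle by cycle. The following Python sketch is illustrative only: the function name is hypothetical, and it assumes that the first touchdown cycle anneals at 62 °C and that each later touchdown cycle is 1 °C cooler, which is one reading of "decreasing by 1 °C each cycle".

```python
from typing import List


def touchdown_annealing(start_c: float, step_c: float, touchdown_cycles: int,
                        final_c: float, final_cycles: int) -> List[float]:
    """Return the annealing temperature (deg C) used in each PCR cycle.

    Assumption: cycle 1 anneals at start_c and each subsequent touchdown
    cycle is step_c cooler; the plateau cycles all anneal at final_c.
    """
    touchdown = [start_c - step_c * i for i in range(touchdown_cycles)]
    plateau = [final_c] * final_cycles
    return touchdown + plateau


# apn2/matIGS profile from the text: 10 touchdown cycles from 62 deg C,
# then 35 cycles at 52 deg C.
schedule = touchdown_annealing(62.0, 1.0, 10, 52.0, 35)
print(schedule[:10])   # touchdown phase
print(len(schedule))   # total number of cycles
```

Under this reading the last touchdown cycle anneals at 53 °C, one degree above the 52 °C plateau.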

Contig Assembly, Sequence Editing, and Phylogenetic Inference
Sequences were automatically assembled into contigs, and edited manually in Sequencher version 4.9 (GeneCodes Corp., Ann Arbor, Michigan). Alignments were carried out with the online version of the sequence alignment program MAFFT version 6 [41,42] using the iterative refinement option G-INS-i for each locus independently. GenBank accession numbers for sequences generated for this study are provided in Table S1.
Rojas et al. [14] sampled New World isolates of C. gloeosporioides s.l. representing several lineages within the species complex and demonstrated the utility of a combined analysis of nrITS, btub, apn2, and apn2/matIGS to resolve closely related lineages. Representative sequences from their study of all four markers were combined with sequences generated here to meet the aforementioned objectives. Independent gene trees were inferred under the parsimony (MP) optimality criterion in TNT [43] and maximum likelihood (ML) criterion in RAxML-HPC2 (7.2.8) implemented on the CIPRES Science Gateway portal [44][45][46]. Phylogenetic estimates of the concatenated matrix were inferred under parsimony in TNT, maximum likelihood in RAxML-HPC2 (implemented on the CIPRES cluster), and Bayesian inference in MrBayes v3.1.2 [47,48]. Phylogenetic analyses in TNT were carried out with parsimony ratchet tree searches (one thousand random addition sequence replicates while holding 20 trees per replicate) and TBR branch swapping. Statistical support for the inferred nodes was determined with parsimony bootstrapping on 1000 pseudoreplicate datasets. Rapid bootstrapping in RAxML was carried out implementing the GTRCAT model and the ML tree search under the GTRGAMMA model (-m GTRCAT -x -f a). The best-fit models for Bayesian analyses were selected with MrModeltest 2.3 [49].
Independent analyses of nrITS and partial beta-tubulin were rooted in TNT and RAxML to Coll57 and Coll60, strains of Colletotrichum aff. acutatum Simmonds, a species complex related to Colletotrichum gloeosporioides s.l. [50][51][52]. Strains Coll57 and Coll60 were identified based on morphological similarity and NCBI BLAST similarity searches of nrITS and partial beta-tubulin (nrITS: Glomerella fioriniae, GenBank accession JN121193, e-value 0.0; btub: Glomerella acutata, GenBank accession AB273716, e-value 0.0). Analyses of apn2 and apn2/matIGS were rooted in TNT and RAxML to strain 4766, which exploratory nrITS and tub2 analyses and a previous study of the group [14] indicated is a suitable outgroup for inferring phylogenetic relationships within C. gloeosporioides s.l. Statistical support for the inferred nodes was determined in RAxML-HPC2 by bootstrapping. The number of replicates for the independent gene trees was determined by the extended majority rule (-N autoMRE) convergence criterion implemented in the CIPRES portal, and the combined dataset was analyzed with 1000 pseudoreplicates (-N 1000). Node frequencies for both parsimony and maximum likelihood bootstrap analyses were calculated with the SumTrees 3.1.0 program using the DendroPy Phylogenetic Computing Library version 3.7.1 [53].
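The idea behind a bootstrap node frequency is the fraction of pseudoreplicate trees in which a given group of terminals appears as a clade. The following Python sketch illustrates that idea only and is not the SumTrees implementation: it treats trees as rooted nested tuples and counts rooted clades, whereas SumTrees works with splits, and the terminal names A to D are hypothetical.

```python
from collections import Counter
from typing import Dict, FrozenSet, Iterable, Set, Union

Tree = Union[str, tuple]  # a leaf name, or a tuple of subtrees


def clades(tree: Tree, found: Set[FrozenSet[str]]) -> FrozenSet[str]:
    """Record the leaf set under every internal node; return this node's leaf set."""
    if isinstance(tree, str):
        return frozenset([tree])
    leaves = frozenset().union(*(clades(child, found) for child in tree))
    found.add(leaves)
    return leaves


def node_frequencies(trees: Iterable[Tree]) -> Dict[FrozenSet[str], float]:
    """Fraction of trees in which each clade (leaf set) is present."""
    counts: Counter = Counter()
    n_trees = 0
    for tree in trees:
        found: Set[FrozenSet[str]] = set()
        clades(tree, found)
        counts.update(found)
        n_trees += 1
    return {clade: count / n_trees for clade, count in counts.items()}


# Four toy "bootstrap" trees over hypothetical terminals A-D.
boot_trees = [
    (("A", "B"), ("C", "D")),
    (("A", "B"), ("C", "D")),
    (("A", "C"), ("B", "D")),
    ((("A", "B"), "C"), "D"),
]
freqs = node_frequencies(boot_trees)
print(freqs[frozenset({"A", "B"})])  # clade A+B is present in 3 of 4 trees
```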
The best-fit models implemented in MrBayes for each of nrITS, tub2, apn2 and apn2/matIGS, respectively, were as follows: SYM+Γ, GTR+I, GTR+Γ, HKY+Γ. Four parallel runs were conducted with one cold and three heated Markov chains per run for 10,000,000 generations sampling every 1,000 generations, with each of these models applied independently to each locus in MrBayes. Convergence of parameter estimates was monitored in Tracer v. 1.5 [54] and posterior probabilities calculated from the sampled trees after discarding the first 25% as burn-in. To assess convergence to a global optimum in both ML and Bayesian tree searches of the combined dataset, the log-likelihood scores (−lnL) for the best tree from the maximum likelihood and Bayesian tree searches were calculated in RAxML [45].
In order to estimate the primary concordance tree from the multilocus dataset we used Bayesian concordance analysis (BCA) implemented in BUCKy v. 1.4.0 [55,56]. BUCKy integrates over gene tree uncertainty to estimate the proportion of sequenced markers that support each clade and constructs the tree that reflects the dominant vertical phylogenetic signal (primary concordance tree) [56]. We limited our analysis with BUCKy to a reduced dataset of 55 unique haplotypes (sequences with missing data were considered unique). The posterior distribution of single gene trees was inferred in MrBayes using the same model parameters for each partition as in the concatenated dataset. The primary concordance tree and concordance factors (CF) were estimated across a range of prior values for the discordance parameter (α) of 0.1, 1, 10, and 100. The discordance parameter (α) represents the prior probability distribution that all genes share the same tree, where α = 0 indicates all genes share the same tree while α = ∞ indicates all genes have a distinct set of trees [55]. All analyses were initially run with two Markov chain Monte Carlo (MCMC) search chains of 100,000 generations after a burn-in of 10,000 generations. All runs converged on the same primary concordance tree with identical concordance factors. The final analysis was run with α = 1 and two MCMC chains of 1,000,000 generations following a burn-in of 100,000 generations (-a 1 -c 2 -n 1000000).
Several species within the C. gloeosporioides species complex have been published since the epitypification of C. gloeosporioides [14,17,57,58] and a comprehensive phylogenetic analysis of the species complex, including the description of new species and subspecies, was recently published [16]. Despite agreement regarding the necessity of multilocus genetic data to resolve taxonomic problems within the species complex and the recognition that the phylogenetic resolution provided by the most commonly used markers is significantly lacking, a common set of markers has not yet been adopted among independent research groups [14,15,17,57,58,59]. Nonetheless, there is some overlap in the markers sequenced for recently described species. In order that isolates from this study could be placed in the broader context of the species complex and to evaluate the distinctiveness of the newly described species presented here (see 'Species assignments' below), we analyzed sequence data from representative isolates and type strains available in GenBank. In addition to the complete four-marker dataset described above, hereafter referred to as D4G, two additional datasets were constructed with the available sequence data from nrITS, partial tub2, and apn2/matIGS. The first dataset, hereafter referred to as D3G, included all isolates from the D4G dataset with the addition of sequence data from nine isolates, included in a recent study by , representing three additional species and one subspecies. The combined data set included 2,134 nucleotides (nrITS: 557; tub2: 690; apn2/matIGS: 887) for 91 isolates with no missing markers for any terminal. The second dataset, hereafter referred to as D3G+, included the addition of nrITS and partial tub2 sequence data for thirty additional strains, included in a comprehensive phylogenetic study of the species complex by Weir et al. (2012), representing sixteen species and one forma specialis, with data for the apn2/matIGS marker coded as missing for these additional strains in an alignment of 2,148 nucleotides. Phylogenetic analyses of these concatenated matrices were conducted with RAxML-HPC2 and MrBayes v3.1.2 as previously described. The best-fit models determined in MrModeltest 2.3 for each locus in the expanded datasets were as follows: nrITS: GTR+Γ; tub2: GTR+Γ; apn2/matIGS: HKY+Γ. GenBank accession numbers from previously published studies included in analyses presented here are provided in Table S2 and all multiple sequence alignments are available from the authors upon request. The presentation of results from the analysis of D3G+ is restricted to Text S1 for the sake of brevity and clarity.

Species Assignments
In order to delimit novel species, we applied the criteria of genealogical concordance phylogenetic species recognition (GCPSR) [60,61] to phylogenetic estimates from the D4G dataset. Briefly, following Dettman et al. [60], novel species were recognized if they satisfied one of two criteria: genealogical concordance or genealogical non-discordance. Clades were genealogically concordant if they were present in three of the four single gene trees and genealogically non-discordant if they were strongly supported (MP ≥ 75%; ML ≥ 70%) in a single gene and not contradicted at or above this level of support in any other single gene tree. In addition, novel species were recognized if resolved with strong support (PP ≥ 0.95; ML ≥ 70%; MP ≥ 75%) in all analyses for the combined dataset of nrITS, tub2, apn2, and apn2/matIGS and were not nested within clades containing the type of any previously described species in any of the combined analyses. All isolates were assigned to a species with the exception of GJS0857 and Coll940, which remain singletons.
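The two single-gene criteria can be expressed as a small decision rule. The Python sketch below is a hypothetical illustration, not the procedure actually scripted for this study: the class and function names are invented, "three of the four" is read as at least three, and "strongly supported" is assumed to require both the MP and the ML threshold to be met.

```python
from dataclasses import dataclass
from typing import Dict

MP_MIN = 75.0  # parsimony bootstrap threshold (%) from the text
ML_MIN = 70.0  # maximum likelihood bootstrap threshold (%) from the text


@dataclass
class LocusEvidence:
    present: bool              # clade recovered in this single gene tree
    mp: float = 0.0            # MP bootstrap support for the clade (%)
    ml: float = 0.0            # ML bootstrap support for the clade (%)
    conflict_mp: float = 0.0   # best MP support for any conflicting clade (%)
    conflict_ml: float = 0.0   # best ML support for any conflicting clade (%)


def strongly_supported(mp: float, ml: float) -> bool:
    # Assumption: both thresholds must be met.
    return mp >= MP_MIN and ml >= ML_MIN


def gcpsr_status(loci: Dict[str, LocusEvidence]) -> str:
    """Classify a clade as concordant, non-discordant, or not recognized."""
    if sum(e.present for e in loci.values()) >= 3:
        return "concordant"
    supported = any(e.present and strongly_supported(e.mp, e.ml)
                    for e in loci.values())
    contradicted = any(strongly_supported(e.conflict_mp, e.conflict_ml)
                       for e in loci.values())
    if supported and not contradicted:
        return "non-discordant"
    return "not recognized"


# Hypothetical clade recovered only in apn2/matIGS, with no strong conflict.
example = {
    "nrITS": LocusEvidence(present=False),
    "tub2": LocusEvidence(present=False, conflict_mp=40.0, conflict_ml=35.0),
    "apn2": LocusEvidence(present=False),
    "apn2/matIGS": LocusEvidence(present=True, mp=98.0, ml=95.0),
}
print(gcpsr_status(example))
```

The additional combined-dataset requirement (strong support in all analyses and no nesting within a clade containing an existing type) is not modeled here.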

Quantitative Measures of Genealogical Sorting
The genealogical sorting index is used here to represent a quantitative assessment of the monophyly or genealogical exclusivity of a group of commonly labeled terminals in a set of trees. This quantitative measure enables the comparison of individual markers to assess their ability to differentiate lineages or species. In order to determine the degree of exclusive ancestry of each of the well-supported lineages inferred from the concatenated dataset of nrITS, btub, apn2, and apn2/matIGS (D4G) across the independent gene trees, the bootstrap trees obtained from the parsimony bootstrap analyses for each independent gene and the concatenated dataset were used to calculate the genealogical sorting index (gsi) for each lineage [62]. The gsi was calculated for lineages of the C. gloeosporioides species aggregate using the R (2.13.0) package genealogicalSorting version 0.91 [63,64] after removing outgroup strains (Coll57, Coll60, 4766, 4801, 4766). The genealogical sorting index (gsi) reaches a maximum when a commonly labeled group (species or terminal lineage) reaches monophyly in a tree or set of trees and a minimum when all nodes on the tree are required to unite a group. P-values represent the probability that the calculated gsi from the inferred tree or trees would be observed by chance. The ensemble genealogical sorting index (gsi_T) is the sum of the gsi values for each topology weighted by the probability of a given topology based on the proportion of trees in the sample where it is represented. P-values are calculated by permutation of group labels on the terminals of a tree and determining the frequency distribution from the recalculated gsi value. P-values are estimated from 1,000 permutations of each dataset and represent the probability of observing gsi values ≥ the reported gsi values by chance under the null hypothesis that labeled groups are of mixed ancestry.
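The ensemble statistic and the permutation test described above can be sketched in a few lines of Python. This is not the genealogicalSorting R package and does not compute gsi from a tree; it only shows the weighting and the label-permutation logic, taking per-topology gsi values and a user-supplied statistic as inputs. All names are hypothetical, and the P-value is taken as the plain proportion of permutations with a statistic at least as large as the observed one, which is an assumption about the estimator.

```python
import random
from collections import Counter
from typing import Callable, Dict, Hashable, Sequence


def ensemble_gsi(gsi_by_topology: Dict[Hashable, float],
                 sampled_topologies: Sequence[Hashable]) -> float:
    """Sum of per-topology gsi values weighted by topology frequency in the sample."""
    counts = Counter(sampled_topologies)
    total = len(sampled_topologies)
    return sum(gsi_by_topology[topo] * n / total for topo, n in counts.items())


def permutation_p_value(observed: float,
                        labels: Sequence[str],
                        statistic: Callable[[Sequence[str]], float],
                        n_perm: int = 1000,
                        seed: int = 0) -> float:
    """Proportion of label permutations whose statistic is >= the observed value."""
    rng = random.Random(seed)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # reassign group labels to terminals at random
        if statistic(shuffled) >= observed:
            hits += 1
    return hits / n_perm


# Toy input: topology "t1" appears in 3 of 4 bootstrap trees, "t2" in 1 of 4.
gsi_t = ensemble_gsi({"t1": 1.0, "t2": 0.5}, ["t1", "t1", "t1", "t2"])
print(gsi_t)
```

In practice the `statistic` argument would recompute gsi for the permuted labels on the tree or set of trees; that step is left to the caller here.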

Morphological Studies
Morphological observations were made from strains cultured on potato dextrose agar (PDA), corn meal agar (CMA), V8 agar, and Synthetischer nährstoffarmer agar (SNA) [65] incubated at 22-25 °C. Growth rates were determined from cultures grown in darkness at 25 °C on PDA. Cultures were first grown on CMA and 10 mm plugs were taken from the expanding margin and transferred to 20 mL of PDA in 100 × 15 mm Petri dishes. Each strain was plated to three replicate plates. Three radial measurements were taken from the edge of the plug to the margin of the colony every 24-48 hours over the course of 5 days by marking the bottom of the plate at the margin of the colony using a stereomicroscope equipped with stage lighting, resulting in 9 radial measurements per strain. Each plate was then photographed and the distance between markings was measured.
Microscopic observations of conidia, phialides, and ascospores were made from specimens mounted in water. Hyphal appressoria were observed using slide cultures; a 10 mm² block of CMA was placed on a CMA plate, each of the four corners of the block was inoculated, covered with a sterile coverslip, and incubated at 25 °C. Microscopic observation of perithecial development on cranberry fruit was made by surface sterilizing symptomatic field-collected fruit, cutting in half transversely, placing face down on V8 agar, and incubating at room temperature (22 °C) for approximately 3 weeks. The fruit was removed from the plate after perithecial development and fixed in FAA (3.7% formaldehyde, 5% glacial acetic acid, and 50% ethanol) before dehydration in an alcohol-xylene dehydration series and embedding in Paraplast X-TRA (Leica Microsystems, Buffalo Grove, IL, USA). The fruit was sectioned at 8 μm, stained with acid fuchsin and cotton blue in lactic acid, and mounted in Permount (Fisher Scientific, Pittsburgh, PA, USA). Microscopic images were made with a Nikon DXM1200C digital camera attached to a Zeiss Axioplan compound microscope or with a Nikon SMZ1500 stereoscope equipped with a Nikon DXM1200F digital camera using Nikon ACT-1 software. All measurements were made using ImageJ 1.44p software [66,67] and summary statistics calculated in R version 2.13.0. Growth rates are reported giving the minimum, average, maximum [(minimum-) average (-maximum)] and the standard deviation in millimeters per day. All other measurements additionally include the 1st quartile and the 3rd quartile with the following notation: (minimum-) 1st quartile - average - 3rd quartile (-maximum).
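The two reporting notations can be illustrated with a short Python sketch. The study computed its summary statistics in R; the code below is only an equivalent illustration with hypothetical function names and toy values. The quartiles use linear interpolation between order statistics (the "inclusive" method in Python's statistics module), which is an assumption, since the text does not state which quantile definition was used.

```python
import statistics
from typing import Sequence


def _fmt(value: float, digits: int) -> str:
    return f"{value:.{digits}f}"


def summarize_measurements(values: Sequence[float], digits: int = 1) -> str:
    """Format as '(minimum-) 1st quartile - average - 3rd quartile (-maximum)'."""
    # Assumption: quartiles by linear interpolation ("inclusive" method).
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    mean = statistics.fmean(values)
    return (f"({_fmt(min(values), digits)}-) {_fmt(q1, digits)} - "
            f"{_fmt(mean, digits)} - {_fmt(q3, digits)} "
            f"(-{_fmt(max(values), digits)})")


def summarize_growth(rates_mm_per_day: Sequence[float], digits: int = 1) -> str:
    """Format growth rates as '(minimum-) average (-maximum), SD x mm/day'."""
    mean = statistics.fmean(rates_mm_per_day)
    sd = statistics.stdev(rates_mm_per_day)  # sample standard deviation
    return (f"({_fmt(min(rates_mm_per_day), digits)}-) {_fmt(mean, digits)} "
            f"(-{_fmt(max(rates_mm_per_day), digits)}), "
            f"SD {_fmt(sd, digits)} mm/day")


# Toy values, not data from this study.
print(summarize_measurements([1, 2, 3, 4, 5]))
print(summarize_growth([4.0, 5.0, 6.0]))
```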

Nomenclature
The electronic version of this article in Portable Document Format (PDF) in a work with an ISSN or ISBN will represent a published work according to the International Code of Nomenclature for algae, fungi, and plants, and hence the new names contained in the electronic publication of a PLOS ONE article are effectively published under that Code from the electronic edition alone, so there is no longer any need to provide printed copies.
In addition, new names contained in this work have been submitted to MycoBank from where they will be made available to the Global Names Index. The unique MycoBank number can be resolved and the associated information viewed through any standard web browser by appending the MycoBank number contained in this publication to the prefix http://www.mycobank.org/MB. The online version of this work is archived and available from the following digital repositories: PubMed Central; LOCKSS.

Individual Gene Trees
In order to test the hypothesis that multiple sympatric lineages within the C. gloeosporioides species complex infect cranberry in the field, North American isolates need to be placed in a broader phylogenetic context. Therefore, sequence data generated in this study were combined with an earlier study of isolates of C. gloeosporioides s.l. from the New World tropics [14]. Outgroup sampling was expanded from the aforementioned study to include two isolates of C. aff. acutatum. Individual locus data were analyzed separately to assess the topological congruence among datasets and the utility of each to resolve terminal lineages with robust statistical support for sister group relationships within the C. gloeosporioides species complex. Strain data are summarized in Table 1 and character and tree statistic data are summarized in Table 2.
Sequence data from nrITS has been widely used in fungal phylogenetic studies and has been proposed as a barcode locus for fungi [68]. However, our analysis indicates it neither provides adequate resolution for reliable species assignment, nor does it reliably assess phylogenetic relationships within the C. gloeosporioides species complex, as has been reported in previous studies of Colletotrichum [14,23,69]. Despite the low phylogenetic resolution inferred from nrITS data, six nodes within the species complex were supported in more than 75% of the parsimony bootstrapped datasets (Figure S1) and eight nodes in more than 70% of the maximum likelihood bootstrapped datasets (Figure S5). In addition, C. sp. indet. D (4766, 3386, 4801) was determined to be closely related but peripheral to the C. gloeosporioides species complex, indicating this is a suitable outgroup, as previously suggested by Rojas et al. [14].
Phylogenetic analysis of partial tub2 sequence data largely supported the inferences made from nrITS data, but provided further resolution within the species complex, recovering 20 well-supported nodes in both MP and ML analyses (Figures S2 and S6). Isolates from cranberry (V. macrocarpon and V. oxycoccos) were distributed among 5 clades, including lineages originally described from tropical regions.
Phylogenetic analysis of the apn2 locus provided greater resolution for terminal lineages than that achieved with either tub2 or nrITS. Bootstrap analyses provided significant statistical support for 23 nodes and 20 nodes for MP (Figure S3) and ML (Figure S7) analyses, respectively, with six clades in both analyses containing isolates from V. macrocarpon and V. oxycoccos. While support for terminal lineages from apn2 sequence data was greater than that from partial tub2, support for sister group relationships was not as robust.
A similar topology was recovered from the analysis of nucleotide data from apn2/matIGS, the intergenic spacer bridging the apn2 and mating-type locus, with respect to the nrITS, tub2, and apn2 gene trees while providing greater resolution than the other three datasets. Bootstrap analyses of apn2/matIGS recovered strong branch support for 25 and 29 nodes for MP (Figure S4) and ML (Figure S8), respectively. In agreement with the apn2 analysis, six lineages include isolates from V. macrocarpon and V. oxycoccos. In addition to increased resolution of terminal clades within the species complex, compared with apn2, support for sister group relationships within the species complex is stronger than inferences based on partial tub2.
While statistical node support based on bootstrap resampling was somewhat inconsistent among ML and MP analyses, both sets of analyses were concordant, with differing levels of resolution. Similarly, the resolution of terminal clades among independent gene trees is consistent with a few exceptions. One strain, isolated as a leaf endophyte of Persea americana (Coll11), was relegated with strong support to a terminal clade that includes isolates designated as Colletotrichum siamense based on analysis of nrITS, but this isolate is strongly supported in both the partial tub2 and apn2/matIGS tree as belonging to a lineage that includes isolates of C. tropicale. The resolution of this isolate in the apn2 analysis was ambiguous. The phylogenetic placement of another strain, Coll940, is inconsistent among gene trees. While Coll940 is resolved as sister to a group that includes C. fructicola, "C. ignotum", C. nupharicola and C. sp. indet. C in the tub2 (ML) and apn2 (MP) analyses, its placement is switched with C. sp. indet. C in the apn2/matIGS analyses.

Phylogenetic Analysis of the Combined Datasets
In order to infer organismal phylogenetic relationships and make species assignments, we relied on the combined dataset of 3,119 nucleotide characters from four nuclear markers and eighty-four terminals (D4G). The Bayesian consensus tree is presented in Figure 1 with node posterior probability and bootstrap proportion values from Bayesian, parsimony, and maximum likelihood analyses. The outgroup taxa, C. aff. acutatum and C. sp. indet. D, have been trimmed from Figure 1 due to long branches between the species complex and the outgroup terminals. The relationships among clades in each of the three analyses converged on a topology, with varying levels of node support, identical to the Bayesian analysis presented in Figure 1. The log-likelihood of the tree with the best score (highest log-likelihood value) from the set of four Bayesian runs as calculated in RAxML under the GTRGAMMA model (21298.97) was identical to the tree inferred under the maximum-likelihood criterion.
Thirty-six nodes were inferred with strong bootstrap and posterior probability support in the combined analysis of nrITS, tub2, apn2, and apn2/matIGS. The multilocus analyses corroborated the majority of the single gene trees in recovering C. theobromicola as sister to two principal lineages within the species complex. The first lineage includes two previously described species, C. kahawae and C. rhexiae, and two newly described species, C. temperatum and C. fructivorum. The second lineage includes six previously described species, C. tropicale, C. asianum, C. siamense, C. fructicola, C. nupharicola, and C. gloeosporioides, one new species, C. melanocaulon, and four undescribed sublineages. Added taxon sampling suggests that the original circumscription of C. ignotum (recently placed in synonymy with C. fructicola by Weir et al. [16]), which included isolates in the lineage labeled "C. ignotum 2" in Figure 1, was too broad, as also discussed in Rojas et al. [14].
The primary concordance (PC) tree estimated with Bayesian concordance analysis from the four nuclear markers remained unchanged regardless of the value of the discordance parameter (α). The PC tree (Figure 2) is consistent with the assignment of individual isolates to terminal lineages on the basis of the concatenated analysis. Concordance factors are reported for all nodes above the species level. The low concordance factors (below 0.5) for several species, including C. fructicola, C. asianum, C. siamense, C. kahawae, C. rhexiae, and C. fructivorum, appear to be due to the lack of resolution provided by individual markers rather than topological discordance. Similar to the analysis of the combined data, two principal lineages were resolved in the PC tree. However, relationships among a few species within these two lineages differ from those in the combined analysis. For example, Colletotrichum theobromicola is placed within the principal lineage that includes C. gloeosporioides, and C. asianum is no longer sister to C. tropicale.
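To make the notion of a concordance factor concrete, the sketch below computes a naive, point-estimate analogue: the proportion of gene trees in which a given set of terminals forms a clade. This is only an illustration. Bayesian concordance analysis estimates concordance factors from posterior distributions of gene trees rather than from single trees, and the nested-tuple tree representation, function names, and toy trees here are assumptions, not data from this study.

```python
def clades(tree):
    """Return the set of clades (as frozensets of tip names) in a rooted tree.

    A tree is a nested tuple; a tip is a string.
    """
    found = set()

    def walk(node):
        if isinstance(node, str):
            return frozenset([node])
        members = frozenset().union(*(walk(child) for child in node))
        found.add(members)
        return members

    walk(tree)
    return found


def concordance_factor(gene_trees, group):
    """Proportion of gene trees in which `group` is recovered as a clade."""
    group = frozenset(group)
    return sum(group in clades(tree) for tree in gene_trees) / len(gene_trees)


# Four toy gene trees (one per hypothetical locus) for tips A-D.
toy_trees = [
    (("A", "B"), ("C", "D")),
    (("A", "B"), ("C", "D")),
    (("A", "C"), ("B", "D")),
    ((("A", "B"), "C"), "D"),
]
cf_ab = concordance_factor(toy_trees, {"A", "B"})
cf_cd = concordance_factor(toy_trees, {"C", "D"})
```

With four loci, a clade recovered by only one or two gene trees has a value of 0.5 or lower under this naive count. This cannot by itself distinguish conflict from lack of resolution, the distinction drawn in the text.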
The monophyly of species delimited on the basis of the combined analysis presented in Figure 1 is largely supported by results from the analysis of D3G, which includes three additional species and one subspecies (Figure 3). Alternate sister group relationships are suggested by this analysis; however, branch support values are generally lower than those in Figure 1 and the backbone of the tree topology remains unchanged. The addition of sequence data from ex-type strains of C. asianum and C. siamense allowed for the identification of isolates in clades that include these species. The assignment of isolates Coll38, G.J.S. 08-144, and G.J.S. 08-147 to C. asianum and isolates Coll6, 1092, and NC67 to C. siamense in Figure 1 and Figure 2 is based on the results presented in Figure 3. The inclusion of sequences from the ex-type of C. fructicola confirms that C. ignotum is a synonym, as reported by Weir et al. [16]. In addition, the inclusion of two sequences of C. kahawae subsp. cigarro, including the ex-type, illustrates its close affinity to C. rhexiae and C. fructivorum. Weir et al. [16] also identified an isolate from cranberry (CBS124) as C. kahawae subsp. cigarro; however, our analyses indicate this isolate belongs to a phylogenetically distinct lineage.

Genealogical Sorting Indices
The genealogical sorting index represents a quantitative assessment of the exclusive ancestry of each species in a set of bootstrap trees for each individual locus and in the combined dataset. Terminal labels for this analysis match the species assignments based on the combined four-marker analysis (D4G), as previously described. The results of these analyses are presented in Table 3 and the taxon assignments are presented in Table 4.
Colletotrichum gloeosporioides, C. asianum, and C. theobromicola reached the maximum ensemble gsi (gsi_T) value of 1 when calculated across all bootstrapped trees from the nrITS data. Several species, including C. kahawae, C. tropicale, C. fructicola, C. nupharicola, and "C. ignotum 2", had low but significant gsi_T values based on the nrITS trees, ranging from 0.064 to 0.221. Four species, C. fructivorum, C. rhexiae, C. temperatum, and C. siamense, had moderate values of gsi_T ranging from 0.370 to 0.658. The remaining species, C. melanocaulon, C. sp. indet. C, C. sp. indet. B, and C. sp. indet. A, had non-significant gsi_T values across the nrITS bootstrapped trees.
Unlike the gsi_T values of the nrITS trees, all values calculated from the tub2 trees were significant with the exception of C. kahawae. Colletotrichum temperatum, C. tropicale, C. asianum, C. nupharicola, "C. ignotum 2", C. sp. indet. C, C. gloeosporioides, and C. theobromicola reached the maximum gsi_T value of 1 when calculated across all bootstrapped trees. Colletotrichum rhexiae, C. melanocaulon, C. siamense, and C. sp. indet. B had low but significant gsi_T values ranging from 0.114 to 0.220. Colletotrichum fructivorum, C. sp. indet. A, and C. fructicola had moderate to high gsi_T values ranging from 0.466 to 0.751.
All gsi_T values of trees from the combined dataset were significant, with high values ranging from 0.740 to 1. The only species that did not reach the maximum possible gsi_T value of 1 were C. kahawae (0.740) and C. siamense (0.952). All values based on the combined four-marker dataset represent strong measures of genealogical divergence across all 1000 bootstrap replicates.
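For readers unfamiliar with the statistic, the following minimal Python sketch illustrates how a genealogical sorting index can be computed for one group on one rooted tree, and how an ensemble value can be obtained across a set of trees. It follows the commonly stated definition gs = n / Σ(d_u − 2), summed over the nodes uniting the group, rescaled between the minimum attainable gs and 1. The nested-tuple trees, the function names, the treatment of the root as if it had a parent edge, and the equal weighting of trees are assumptions made for illustration. Permutation-based significance testing is omitted, and this is not the software used in this study.

```python
def tips(node):
    """List the tip names below a node (a tip is a string)."""
    if isinstance(node, str):
        return [node]
    return [t for child in node for t in tips(child)]


def internal_nodes(node):
    """Yield every internal node (nested tuple) at or below `node`."""
    if isinstance(node, str):
        return
    yield node
    for child in node:
        yield from internal_nodes(child)


def gsi(tree, group):
    """Genealogical sorting index of `group` on a rooted tree.

    Assumes 2 <= len(group) < number of tips. For each internal node,
    (number of children - 1) stands in for (degree - 2).
    """
    group = set(group)
    n = len(group) - 1  # minimum number of nodes needed to unite the group
    all_nodes = list(internal_nodes(tree))
    # Most recent common ancestor: smallest node containing the whole group.
    containing = [nd for nd in all_nodes if group <= set(tips(nd))]
    mrca = min(containing, key=lambda nd: len(tips(nd)))
    # Nodes uniting the group: those below the MRCA with a group descendant.
    uniting = [nd for nd in internal_nodes(mrca) if group & set(tips(nd))]
    gs = n / sum(len(nd) - 1 for nd in uniting)
    gs_min = n / sum(len(nd) - 1 for nd in all_nodes)
    return (gs - gs_min) / (1 - gs_min)


def ensemble_gsi(trees, group):
    """Ensemble gsi with every tree weighted equally (e.g. bootstrap trees)."""
    return sum(gsi(tree, group) for tree in trees) / len(trees)


sorted_tree = (("A", "B"), ("C", "D"))    # A and B form a clade
unsorted_tree = (("A", "C"), ("B", "D"))  # A and B are split apart
ladder = (((("A", "B"), "C"), "D"), "E")
gsi_t = ensemble_gsi([sorted_tree, unsorted_tree], {"A", "B"})
```

A group that is monophyletic in every tree scores 1, whereas a group that is monophyletic in only some trees receives an intermediate ensemble value, as in the toy case above.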

Taxonomy and Morphology
Multilocus phylogenetic analysis of Colletotrichum gloeosporioides s.l. strains isolated from V. macrocarpon and other sympatric host species revealed several distinct lineages within the species complex. Comparison of growth rates and conidial morphology indicates that there is significant morphological overlap between Colletotrichum species isolated from V. macrocarpon and sympatric host species as well as between strains representing additional species within the species complex. Colletotrichum nupharicola is exceptional, exhibiting a very slow growth rate, with significantly longer and wider conidia than all other species included in this study. Comparisons of growth rates and conidial dimensions are presented in Figure 4 and Figure 5, respectively. Three new species are described below and C. rhexiae is epitypified. Conidial, ascospore, and colony morphology are represented in Figure 6 for each new species and C. rhexiae. Seta morphology is depicted in Figure 7.
MycoBank MB 801462 (Figure 6, Figure 7). Similar to Colletotrichum rhexiae Ellis & Everh. but setae less abundant, the interquartile range of the ascospore length to width ratio smaller (3.2-3.9 µm) and the interquartile range of the conidial length to width ratio larger on CMA. Common fruit-rot pathogen of cranberry (Vaccinium macrocarpon Aiton) in commercial production.
Growth rate (3.0-) 5.4 (-6.8) mm per day with standard deviation of 1.1 mm on PDA at 25 °C [n = 54]; aerial mycelium floccose, white to greyish white, medium grey and brownish grey in some strains; sectoring common. Perithecia developing and maturing on V8 agar, clustered or solitary, dark brown to black, globose to obpyriform to papillate; ascospores allantoid, olive yellow, (15.
Habitat and Distribution. Commonly isolated as a fruit-rot pathogen and from asymptomatic tissue of Vaccinium macrocarpon in commercial cranberry beds throughout North America. Also isolated from asymptomatic infections of Rhexia virginica growing in commercial cranberry beds and from symptomatic fruit of V. oxycoccos in a wild cranberry bog in Pennsylvania.
Etymology. The specific epithet, "fructivorum", refers to the propensity of the species to be associated with fruit-rot of cranberry. From the Latin fructus, fruit, and -vorous, eating.
Holotype [70] from leaves of Vaccinium macrocarpon in New Jersey. The type material was a slide (slide no. 1447A C.L.S.) of a single ascospore isolate made from this collection deposited in the "pathological collection of the Department of Agriculture", now housed in the collections at the Systematic Mycology and Microbiology Laboratory in Beltsville, Maryland (BPI). While other slides designated by C.L. Shear as type material for new species published in the same protologue were located at BPI, slide no. 1447A C.L.S. was not located and is thought to be lost. However, a culture deposited by Shear in April 1922 at the Centraalbureau voor Schimmelcultures (CBS) in the Netherlands was obtained, sequenced, and found to be conspecific with other isolates described here. The nomenclature used by CBS in designating this isolate as Glomerella rufomaculans-vaccinii, however, seems to be based on a typographical error, as C.L. Shear clearly intended recognition of this taxon at the varietal level and there is no indication in the literature that it has been formally elevated to species. Furthermore, Glomerella rufomaculans was subsumed into Glomerella cingulata (anamorph: Colletotrichum gloeosporioides) with the revisionary work of von Arx in 1957 [11], necessitating the designation of the new specific epithet, Colletotrichum fructivorum. We have chosen a recently isolated strain as the ex-type strain due to the lack of sporulation observed in the culture deposited by Shear and the fact that the holotype designated here is a single ascospore isolate for which we have morphological data for both the anamorph and the teleomorph (Figure 6).
MycoBank MB 801464 (Figure 6, Figure 7). Similar to Colletotrichum gloeosporioides (Penz.) Sacc. but associated with stem canker of Vaccinium macrocarpon Aiton.
Habitat and Distribution. Isolated from stem canker lesions of Vaccinium macrocarpon in commercial cranberry beds in New Jersey.
Etymology. The specific epithet, "melanocaulon", refers to the brown or black stems from which the species was isolated. From the Greek, melano-, black or very dark, and caulon, stem, in agreement with the neuter generic name Colletotrichum.
Growth rate (6.3-) 6.6 (-7.2) mm per day with standard deviation of 0.2 mm on PDA at 25 °C [n = 18]; aerial mycelium floccose, white to greyish white, sectoring observed. Mature ascospores not observed on PDA. On CMA aerial mycelium flocculose to barely visible; perithecia solitary to clustered, dark brown to black, subglobose to obpyriform, ascospores hemisphaeroid to reniform, olive brown, (13.
Habitat and Distribution. Isolated from rotten fruit of an ornamental cultivar, Vaccinium macrocarpon 'Hamilton', growing at The New York Botanical Garden and as a stem endophyte of V. macrocarpon in a commercial cranberry bog in New Jersey.
Etymology. The specific epithet, "temperatum", refers to the known distribution of the species. From the Latin, temperatum, temperate. Holotype

Sympatric Lineages and Host Distribution in North American Cranberry Bogs
Several phylogenetic studies have confirmed the ability of multiple lineages within Colletotrichum gloeosporioides s.l. to colonize the same host family, genus, or species (e.g. Coffea arabica L. [58]; Jasminum sambac (L.) Aiton [71]; Hemerocallis spp. [72]; Amaryllidaceae [73]). Seven well-supported lineages of C. gloeosporioides s.l. were isolated from North American cranberry bogs, either from V. macrocarpon, V. oxycoccos, or other sympatric host plant species (Table 1 and Figure 1). Similar to the results from previous studies focused on a single host or group of closely related hosts, multiple lineages within the species complex can be isolated from Vaccinium spp., including five species (C. fructivorum, C. rhexiae, C. temperatum, C. melanocaulon, and C. fructicola) as endophytes and/or pathogens of V. macrocarpon, two lineages (C. fructivorum and C. sp. indet. C) from V. oxycoccos, and one species (C. fructicola) from V. corymbosum. Likewise, three species have been found to colonize Rhexia virginica L., an herbaceous perennial that inhabits well-drained soils that are seasonally inundated and is often found in sympatry with V. macrocarpon in wild and agricultural habitats in eastern North America.
In contrast, a single lineage has been isolated from Nuphar and Nymphaea within and beyond North American cranberry bogs. Colletotrichum nupharicola was previously described from Nuphar lutea subsp. polysepala (Engelm.) E.O. Beal in Washington and Idaho and Nymphaea odorata Aiton in Rhode Island. Our study indicates that this species is also present in irrigation reservoirs in agricultural cranberry beds in New Jersey, growing on Nuphar lutea (L.) Sm. The morphological distinctiveness of C. nupharicola makes it one of the few species within the species complex that can be reliably identified on the basis of conidial and cultural morphology and thus can be more easily tracked. Colletotrichum nupharicola was frequently encountered on Nuphar during the course of this study; however, other lineages were not isolated from either Nuphar or Nymphaea.
Chamaecyparis thyoides (L.) Britton, Sterns & Poggenb. is a common and dominant tree species adjacent to cranberry bogs in parts of eastern North America and can be found as a weed in commercial cranberry beds. Bills and Polishook [25] reported C. gloeosporioides s.l. from Chamaecyparis thyoides collected near cranberry bogs in eastern North America; however, they reported just two isolates in a survey of endophytes of the host in New Jersey. A single isolate of C. gloeosporioides s.l. was recovered from C. thyoides in this study and found to be sister to Coll887, isolated from a diseased fruit of V. oxycoccos in West Virginia. These isolates form a phylogenetically distinct but undescribed lineage (C. sp. indet. C). Given the apparent ecological distinctiveness, geographical separation, and relative sequence dissimilarity of these isolates, this lineage will remain undescribed until additional, closely related isolates are collected.

Host Preference, Host Organ Preference, and Habitat Distribution
Species of C. gloeosporioides s.l. are largely considered to be host generalists, with few exceptions (such as C. salsolae [16]). There are other species that have been isolated from single host species or genera, but these have been sampled primarily from cultivated plants, where isolates may be transferred with plant material, or from a narrow geographical range in native habitats (e.g. C. ti [16]; C. psidii [16]). Similarly, strains isolated from V. macrocarpon are distributed throughout the C. gloeosporioides species complex, with little indication of host specificity for most lineages. Two of those lineages, C. temperatum and C. melanocaulon, originate solely from V. macrocarpon; however, these were isolated only from cultivated habitats and, given further sampling, may be found on additional hosts. Colletotrichum temperatum has been isolated as a stem endophyte of V. macrocarpon and from rotten fruit of a horticultural variety (V. macrocarpon cv. Hamilton: not grown for agricultural production). The other species, C. melanocaulon, is associated with an emerging stem canker disease on cranberry, where it has been isolated from affected stem tissue collected from field populations in commercial cranberry beds. This study shows, however, that a composite of host, habitat, and host organ origin can be a useful indicator of lineage identity in some cases.
Isolates of C. fructivorum and C. rhexiae are not host specific; C. fructivorum has been isolated from V. macrocarpon, V. oxycoccos and R. virginica, while C. rhexiae has been isolated from V. macrocarpon and R. virginica. Nevertheless, the fact that all isolates originating from V. macrocarpon in wild habitats are conspecific with C. rhexiae, and all isolates from diseased fruit in cranberry agricultural production areas are conspecific with C. fructivorum, suggests that host, host organ, and habitat can be useful indicators of lineage identity. Gonzalez et al. [33] noted a correlation between organ-specific pathogenicity and genotype in isolates from the C. gloeosporioides complex. Given their findings, it appears selection for pathogenicity to fruit may be enabling C. fructivorum to colonize fruit to the exclusion of all other lineages within the species complex that can be found in commercial cranberry beds.
Episodic selection has been proposed to be a common force driving divergence among fungal species [74][75][76]. Host shifts are prominent among the factors responsible for episodic selection and speciation, and a host shift has recently been proposed for C. kahawae [59]. In contrast, habitat transformation has been implicated as a factor contributing to population divergence in Colletotrichum cereale, a broadly distributed turfgrass pathogen [77]. The broad distribution of C. fructivorum in commercial cranberry beds and its restriction to cultivated habitats suggest that the shift to agricultural production of cranberry may have led to its divergence from C. rhexiae. However, this conclusion is preliminary and would be better addressed with additional sampling and more variable molecular markers.
The only North American lineage of C. gloeosporioides s.l. sympatric with cranberry with some indication of host preference over a broad geographical range is C. nupharicola. While C. nupharicola has been isolated from two aquatic host genera, Nuphar and Nymphaea, both of these genera are in the same family, Nymphaeaceae, indicating host specificity at the family level. This species is broadly distributed on Nymphaeaceae from Alaska through the Rocky Mountains to eastern North America (Dennis A. Johnson, pers. communication, and V. Doyle, unpublished data).

Geographic and Host Distribution of Colletotrichum fructivorum
This and other studies of Colletotrichum gloeosporioides s.l. have demonstrated the utility of delimiting species using multi-locus DNA sequence data (e.g. [14,16]). However, given the cryptic nature of many species within the complex and the apparently broad diversity of species at small spatial scales, it is difficult to design an appropriate sampling scheme to capture the species diversity of Colletotrichum within a given host, habitat, or geographic range. To develop a better understanding of scenarios that promote species divergence, it is necessary to understand the factors that lead to population substructure within a focal species. One of the objectives of this study was to identify one or more broadly distributed lineages within the C. gloeosporioides species complex that occur on cranberry to serve as a system for investigation of the factors driving the divergence between populations. Given the widespread occurrence of Colletotrichum gloeosporioides s.l. as the causal agent of fruit-rot in cranberry agricultural areas, the economic importance of cranberry production in North America, and the ease with which C. gloeosporioides s.l. can be isolated from diseased fruit in commercial cranberry production areas, we sampled diseased fruit across several of the major cranberry production areas in the United States and Canada (Table 1). All strains of C. gloeosporioides s.l. isolated from fruit in commercial cranberry production areas were found to belong to C. fructivorum regardless of geographic origin. This reveals that C. fructivorum is broadly distributed in North America from Delaware to Massachusetts and west to Washington and British Columbia. In addition, there is evidence that strains of C. fructivorum are not organ specific (fruit and stems) and not host specific (V. macrocarpon, V. oxycoccos, and R. virginica).
The combination of economic importance, broad geographic distribution, and diverse organ and host association makes this a suitable species for examining biotic and abiotic factors that influence microevolutionary processes in C. gloeosporioides s.l.

Diversity of North American Lineages
Fungal strains included in this study have been placed in a broader phylogenetic context to better understand the relationships among species within the Colletotrichum gloeosporioides species complex. Despite the fact that there is a significant body of research on North American isolates thought to be members of the species complex (e.g. [8,23,24,27,33,78-83]), there is a paucity of modern systematic research addressing the phylogenetic and geographic distribution of North American lineages of C. gloeosporioides s.l. However, a recent study by Weir et al. [16] included several North American isolates.
With the inclusion of C. rhexiae, C. temperatum, C. melanocaulon, and C. asianum, our research indicates there is a broader diversity of species in North America than has previously been reported. The species that have been isolated from native North American plant species include C. aeschynomenes, C. asianum, C. clidemiae, C. fructicola, C. fructivorum, C. melanocaulon, C. nupharicola, C. rhexiae, C. temperatum, C. theobromicola, and C. "f. sp. camelliae", while C. gloeosporioides, C. musae, and C. siamense have been reported from introduced host species. However, there has been very little work addressing the geographic distribution of the species complex in North America. The only species that have been isolated from more than a very localized range are C. fructivorum, C. rhexiae, and C. nupharicola. Our understanding of the importance of the C. gloeosporioides species complex in North America will continue to improve as more North American mycologists and plant pathologists begin to place their isolates in the appropriate phylogenetic context.
Colletotrichum rhexiae and C. fructivorum each present a unique opportunity to investigate the role of host and geography in shaping the microevolutionary patterns that may ultimately lead to species divergence. Colletotrichum rhexiae was described by Ellis and Everhart in the late 1800s [18] from Delaware, while C. fructivorum has been known from the same region since at least 1907, with its formal recognition by Shear [19,70] as Glomerella rufomaculans var. vaccinii, and likely earlier, as implied by the work of Byron Halsted [18]. These species have apparently been sympatric for at least 105 years. While we do not have DNA evidence that C. rhexiae has been present for this length of time, we have examined the original type material and it does not appear to be distinct from our collections of C. rhexiae in New Jersey, Delaware, and Maryland. In contrast, we do have a culture deposited at CBS in 1922 by Shear of what he considered to be G. rufomaculans var. vaccinii, and it is identical in sequence across four loci to isolates collected from cranberry fruit in five US states and British Columbia, from cranberry stem in New Jersey, and from Rhexia virginica growing as a weed in agricultural cranberry beds. The presence of sexual reproductive structures in both species suggests the potential for genetic exchange among conspecific individuals (unless they are obligately homothallic). However, despite their sympatry and ability to reproduce sexually, strong statistical support at the bifurcation between C. rhexiae and C. fructivorum indicates the presence of other biological factors reinforcing species limits. Nonetheless, an objective test of species boundaries would be better approached with more rapidly evolving markers that are variable within species, such as microsatellites, and with more appropriate non-phylogenetic algorithms to determine whether there is any evidence of introgression.
Further investigation using more rapidly evolving genetic markers could also make it possible to infer the reproductive strategy of these species and to better understand the factors influencing species divergence within Colletotrichum. In addition, with the advancement of next-generation sequencing technology, it is possible to investigate these sister species at the genomic level to understand how genomic modifications (gene divergence, content/turnover, or synteny) may influence pathogenicity on different hosts or in distinct organs.

Conclusions
Colletotrichum gloeosporioides s.l. has long been implicated as a pathogen of cranberry in commercial production areas, but a more refined understanding of the species responsible for fruit-rot of cranberry has not been possible due to the difficulty of species delimitation using morphological features. This study resolves this issue, identifying C. fructivorum as the species associated with fruit-rot in cranberry. This study also lays the groundwork for future studies regarding the natural history and ecology of members of C. gloeosporioides s.l. and should provide useful tools for plant breeders and plant pathologists in their efforts to develop resistant cultivars for an industry that utilizes one of North America's few native crop species, V. macrocarpon. We have also determined that C. fructivorum is broadly distributed across the United States and Canada in areas of commercial cranberry production and is capable of infecting alternate host species as well. This makes the species an appropriate model for addressing questions of population structure and dispersal at broad geographical and landscape-level spatial scales. By virtue of a horizontal (among sympatric species) and vertical (among organs within a species) sampling scheme, we were able to uncover greater diversity of C. gloeosporioides s.l. in wild and commercial cranberry beds than has previously been suggested, revealing seven distinct lineages associated with V. macrocarpon and sympatric host species. Likewise, this level of sampling allowed us to determine that host specificity is not strongly implicated in determining macroevolutionary patterns among these species in temperate regions, with the exception of C. nupharicola. Similarly, despite the generalist nature of the species recovered in this study with respect to organ specificity, C. fructivorum does have an affinity for fruit of V. macrocarpon in commercial cranberry beds, to the exclusion of all other related species.
Table S1 GenBank accession numbers for sequence data generated in phylogenetic study of C. gloeosporioides s.l. (DOCX)
Table S2 GenBank accession numbers for sequence data generated in previous studies of C. gloeosporioides s.l. (XLSX)
Text S1 Discussion of the inferences drawn from the analysis of D3G+. (DOCX)