Phylogeographic Analysis of Blastomyces dermatitidis and Blastomyces gilchristii Reveals an Association with North American Freshwater Drainage Basins

Blastomyces dermatitidis and Blastomyces gilchristii are dimorphic fungal pathogens that cause serious pulmonary and systemic infections in humans. Although their natural habitat is in the environment, little is known about their specific ecologic niche(s). Here, we analyzed 25 microsatellite loci from 169 strains collected from various regions throughout their known endemic range in North America, representing the largest and most geographically diverse collection of isolates studied to date. Genetic analysis of multilocus microsatellite data divided the strains into four populations of B. dermatitidis and four populations of B. gilchristii. B. dermatitidis isolates were recovered from areas throughout North America, while the B. gilchristii strains were restricted to Canada and some northern US states. Furthermore, the populations of both species were associated with major freshwater drainage basins. The four B. dermatitidis populations were partitioned among (1) the Nelson River drainage basin, (2) the St. Lawrence River and northeast Atlantic Ocean Seaboard drainage basins, (3) the Mississippi River System drainage basin, and (4) the Gulf of Mexico Seaboard and southeast Atlantic Ocean Seaboard drainage basins. A similar partitioning of the B. gilchristii populations was observed among the more northerly drainage basins only. These associations suggest that the ecologic niche where the sexual reproduction, growth, and dispersal of B. dermatitidis and B. gilchristii occur is intimately linked to freshwater systems. For most populations, sexual reproduction was rare enough to produce significant linkage disequilibrium among loci but frequent enough that mating-type idiomorphic ratios were not skewed from 1:1. Furthermore, the evolutionary divergence of B. dermatitidis and B. gilchristii was estimated at 1.9 MYA during the Pleistocene epoch. We suggest that repeated glaciations during the Pleistocene period and resulting biotic refugia may have provided the impetus for speciation as theorized for other species associated with temperate freshwater systems.


Introduction
Most fungi that are pathogenic to humans reside naturally in the environment with occasional transmission to humans or other animals. Many fungi, especially the dimorphic fungi, display a limited and distinct phylogeographic distribution, suggesting that growth and persistence in the environment is linked to specific biogeographic and ecological factors [1][2][3][4][5][6][7][8][9][10][11][12][13][14]. For example, Coccidioides immitis is localized primarily to the San Joaquin Valley of California, while Coccidioides posadasii has a wider biogeographical distribution including southwestern United States, southern California, Mexico, and South America [2]. Yet, little is known about the specific ecologic niche of human fungal pathogens that are acquired from the environment. An understanding of the ecological factors that favor growth, reproduction, and dispersal would potentially provide a means of predicting and controlling human acquisition of infection. Fortunately, highly discriminatory genetic typing methods such as multilocus microsatellite typing can help infer ecological factors controlling the reproduction and propagation of fungal pathogens in the environment, as we demonstrate here for the systemic fungal pathogens in the genus Blastomyces.
The genus Blastomyces represents two species of thermally dimorphic fungi: Blastomyces dermatitidis and Blastomyces gilchristii [15]. In North America, Blastomyces species are considered endemic to the midwestern, southeastern, and south central United States bordering the Ohio and Mississippi River valleys, and the Canadian provinces and US states bordering the Great Lakes and St. Lawrence River [16]. There are reports of blastomycosis from other countries, specifically China, Zimbabwe, South Africa, and India [17][18][19][20]; however, endemicity in these regions is somewhat dubious. Several strains of African origin were shown by DNA melt curve analysis to represent a separate taxon, possibly Emmonsia spp. [21], and case reports from China and India involved patients with a recent travel history to the endemic regions in North American [18,19]. Still, at least five older cases from 1982-1998 involving two humans, a dog, and two bats from India appear to be autochthonous [19]. Thus, North America represents the primary region of endemicity for Blastomyces spp. Although direct isolation of the fungus from the environment has rarely been successful [22], a variety of ecological factors favoring growth have been postulated based on human and canine case studies [23][24][25][26][27][28][29][30][31][32][33][34][35][36].
In a recent study, multilocus sequence typing (MLST) was used to demonstrate the presence of a cryptic species within what was previously considered a single species Blastomyces dermatitidis [15]. The cryptic species, named Blastomyces gilchristii, was found to be reproductively and genetically isolated from B. dermatitidis and appeared to have an overlapping, although perhaps smaller, geographic distribution compared to B. dermatitidis [15]. Similar groupings have been noted by others [37], who also suggested that differences in clinical manifestations may exist between the two groups [38]. More recently, whole genome sequencing of select B. dermatitidis (ER-3, ATCC 18188 and ATCC 26199) and B. gichristii (SLH14081) strains has provided additional evidence for the delineation of these two species [39].
The goal of this study was to use multilocus microsatellite typing (MLMT) to elucidate the population structures of B. dermatitidis and B. gilchristii from, what is to our knowledge, the largest and most geographically diverse collection of isolates studied to date. In total, four populations of B. dermatitidis and four populations of B. gilchristii were detected, with the geographic distribution of the populations partitioned among the major freshwater drainage basins of North America. The results of our analysis indicate that the reproduction, growth, and dispersal of Blastomyces spp. are intimately linked to freshwater systems, confirming what was previously suspected based on epidemiological investigations of blastomycosis outbreaks [33]. The study demonstrates that discriminatory genetic analyses, such as MLMT, are powerful methods for elucidating information about the population structure and ecology of human pathogenic fungi in a manner that is relevant to understanding and possibly controlling disease acquisition.
Initial analysis of the microsatellite data with the STRUCTURE software suggested two biologically meaningful genetic clusters with a large peak in the ΔK profile (S1a Fig). Plots of the individual STRUCTURE Q-values (averaged across all 8 iterations), which show the posterior mean estimates of the proportion of each isolate's genome inherited from ancestors of each group, indicate little admixture between the two groups (Fig 1a). The most probable ancestor analysis by STRUCTURE, which assigns each isolate to a cluster, indicated that one group contained only isolates previously identified as B. dermatitidis by MLST (n = 24) while the second group contained only B. gilchristii isolates (n = 26) [15]. Therefore, the remaining isolates assigned to group 1 and group 2 were identified as B. dermatitidis or B. gilchristii, respectively. Using multilocus sequence typing (MLST) data [15], Ã BEAST estimated the divergence time of B. dermatitidis and B. gilchristii at 1.9 MYA (95% HPD 0.5-4.4 MYA).
STRUCTURE analysis of the B. dermatitidis and the B. gilchristii microsatellite data analyzed independently suggested that each species contained four populations (S1b Fig). Plots of the individual STRUCTURE Q-values indicate some admixture between the populations of each group (Fig 1b and 1c). The most probably ancestor algorithm was used to assign each isolate to one of B. dermatitidis populations 1-4 or B. gilchristii populations 1-4.
B. dermatitidis and its populations exhibited much higher levels of intra-population genetic diversity than B. gilchristii, as indicated by the higher values for the mean effective number of alleles (N e ) and mean haploid genetic diversity (H e ) ( Table 1). Although the number of isolates sampled for B. dermatitidis was more than twice that of B. gilchristii, this likely had little effect on the N e and H e values since 15-20 isolates is considered sufficient for accurately estimating heterozygosity (i.e. genetic diversity) [40]. B. gilchristii exhibited low intrapopulation genetic diversity, with only 21 unique microsatellite types represented among the 50 isolates.
The genetic distance between B. dermatitidis and B. gilchristii was 1.369 with higher pairwise genetic distances between populations of different species than populations of the same species (Table 2). Interestingly, B. dermatitidis population 4 isolates were genetically more similar to the other three B. dermatitidis populations than any of the other three populations were to each other. In comparison, the pairwise genetic distance values of the B. gilchristii populations were quite low, which is likely due to the low genetic diversity within the species (Table 2). F ST values were used to ascertain the proportion of genetic variance among geographic regions relative to the total variance [5]. F ST values close to zero indicate that genetic variation is shared within and between the populations, whereas higher F ST values indicate that more genetic variation occurs between populations compared to those within populations [5]. Our F ST values indicate that the populations were significantly differentiated (p < 0.001) from one another ( Table 2).
The microsatellite dataset was also analyzed by TreeMix v1.1 to detect any potential migration events between the populations [41]. We identified two migration events, one between B. gilchristii populations 1 and 4 (p = 7.3x10 -5 ) and one between B. gilchristii populations 1 and 2 (p = 0.017) (S2 Fig), which correlate with the comparably lower genetic distance values ( Table 2).   2a). B. gilchristii population 1 isolates were sampled primarily from regions of the Nelson River drainage basin, similar to B. dermatitidis population 1. B. gilchristii population 2 contained only four isolates that were derived from the region that could be considered the northern-most part of the Mississippi River System drainage basin (specifically the Mississippi River (trunk) sub-basin (Fig 2b)).  2b). A consensus neighbour-joining phylogenetic tree confirmed not only the deep genetic divide between the B. dermatitidis and B. gilchristii isolates, but also the genetic similarity of isolates associated with the same drainage basin (Fig 3). A few exceptions were observed. Of note, three Minnesota isolates of B. dermatitidis, from Hennepin and Anoka counties, clustered with population 1 even though their geographic location was in the Mississippi River System drainage basin. Likewise one isolate each from Illinois (Gu), Kentucky (K966), and Louisiana (Ro) clustered with population 3 but were geographically associated with the Mississippi River System drainage basin. Two isolates from Thunder Bay, Ontario located in the St. Lawrence River drainage basin grouped with the population 1 isolates, most of which were from the adjoining Nelson River drainage basin. One Ontario isolate, two Wisconsin (Milwaukee) isolates, and three Michigan isolates from the St. Lawrence River drainage basin clustered with the population 4 isolates, along with one canine isolate from Edmonton (Nelson River drainage basin) and one N. Carolina isolate (SE Atlantic Ocean Seaboard drainage basin). Among the B. gilchristii isolates, one isolate from Minnesota (Hennepin County) and two isolates from Ontario (Sault Ste. Marie) clustered with population 1, but were geographically located in the Mississippi River System drainage basin and the St. Lawrence River drainage basin, respectively. Also, one isolate from British Columbia (Fraser River drainage basin) and one isolate from North Spirit Lake, Ontario (Hudson Bay Seaboard drainage basin) clustered with the population 1 isolates (S2 Table). Inconsistencies were noted between the STRUCTURE assignments and the consensus NJ tree, namely ATCC-28306 and F2012034121 (population 1 isolates that cluster with population 4) and DI 13-61 (a population 3 isolate that clusters with population 4). These inconsistencies were likely due to different method algorithms coupled with higher levels of admixture noted in F2012034121 and DI 13-61 and a unique MLMT profile for ATCC-28306.  To further investigate the association between genetic relatedness and drainage basin localization, a series of distance based redundancy analyses (db-RDA) and partial db-RDAs were performed. The analyses suggested that for both species, genetic distance between isolates was significantly correlated with (constrained by) drainage basin (B. dermatitidis pseudo-F = 5.5916, p < 0.001; B. gilchristii pseudo-F = 13.233, p < 0.005 respectively) even when controlling for geographical distance (B. dermatitidis pseudo-F = 2.7827, p <0.001, B. gilchristii pseudo-F = 7.0482, p < 0.005 respectively). For B. dermatitidis, drainage basin explained 11.4% of the variation in genetic distance between isolates. Of this, 5.0% was due to drainage basin alone while 6.4% was due to a joint effect of drainage basin and geographic distance, since the two are not unrelated explanatory factors. An additional 1.5% was explained by geographic distance alone. For B. gilchristii, 23.2% of the variation in genetic distance was attributed to drainage basin, with 17.0% due to a joint effect of drainage basin and geographic distance. An additional 1.3% was explained by geographic distance alone. The remaining 87.1% and 75.5% of the variation in genetic distance of B. dermatitidis and B. gilchristii respectively was unexplained and may be attributed to other factors not examined in this study.

Linkage disequilibrium and recombination analysis
Like many fungi, Blastomyces reproduces both asexually and sexually. Therefore, a series of analyses evaluating linkage disequlibrium and mating-type idiomorph were conducted to assess the relative proportion of sexual and asexual reproduction in Blastomyces. The level of multilocus linkage disequilibrium in the dataset was high, with significant r d values both with and without clone correction, (Table 3). Only the largely homogeneous B. gilchristii populations 1 and 4 exhibited non-significant r d values when analyzed independently, indicating random mixis (Table 3)  one of the four B. gilchristii populations. However, when the seven supercontigs that contained multiple microsatellite loci were analyzed independently for linkage disequilibrium, mixis was detected for supercontigs 1 and 3 for both species and for supercontigs 1, 2, 4, and 20 for the B. gilchristii clone-corrected dataset (Table 4), which suggests sexual recombination. Mating-type idiomorphic ratios were examined as an indirect measure of the level of sex and recombination within the Blastomyces species [5]. The mating-type idiomorphic ratios did not differ significantly from the 1:1 ratio expected for a sexual population, with the exception of B. gilchristii population 4 (Table 1). However, following clone correction, the mating-type idiomorphic ratio of this population was also not significantly different from 1:1 (Table 1)

Discussion
Microsatellite typing of 169 North American strains of Blastomyces confirmed the presence of two species, B. dermatitidis and B. gilchristii, and revealed the presence of several differentiated populations within each species. Furthermore, the geographic localization of each population was generally partitioned among the major freshwater drainage basins of North America. This study represents a discriminatory population genetic analysis of the largest and most geographically diverse collection of Blastomyces isolates examined to date, including strains from various regions throughout their known endemic range in North America. Although previously known as B. dermatitidis, the presence of two distinct species, B. dermatitidis and B. gilchristii, was recently documented based on multilocus sequence typing and recombination analysis [15]. The microsatellite typing and STRUCTURE analysis performed in this study confirms the presence of two markedly distinct genetic groups, which correlate perfectly with the MLST species delineation in a subset of 50 strains. It also corroborates the findings of other researchers who have also noted two distinct genetic groups based on microsatellite typing and whole genome sequencing [37,39]. Little is known about the phenotypic differences between B. dermatitidis and B. gilchristii, however, preliminary studies have suggested that there may be differences in the clinical spectrum of disease between the two species [38]. Based on the geographic analysis in the current study, B. dermatitidis appears to be present throughout North America from the southeastern United States to Western Canada. Conversely, B. gilchristii appears to be limited to a northern range, present primarily in central Canada and a few of the north central US states bordering Canada. B. gilchristii was found in Ontario, Saskatchewan, Alberta, Minnesota, and Wisconsin; however it was rarely detected among strains isolated from states and provinces bordering the eastern seaboard (i.e. Quebec, New York, and Vermont). While more extensive sampling may reveal the presence of B. gilchristii in additional Canadian provinces and northern US states, the level of sampling done in this study allows us to exclude central and southern US from the endemic range of B. gilchristii.
The entire set of 25 microsatellite loci were typed from 82.2% of strains with 22-24 loci typed from the remaining strains. The inability to type all Blastomyces isolates using this scheme has been noted previously [37]. Although we report a higher percentage of isolates that failed to type at one or more loci, this is probably due to the fact that a more diverse collection of isolates was employed in this study. Nevertheless, other studies involving microsatellite typing of fungi typically employ fewer loci, between 9 and 21 [2,13,14,[43][44][45][46], suggesting that the current dataset is sufficient to discriminate the population structure of Blastomyces despite the missing data.
The isolates tested in this study were partitioned into four genetically distinct populations of B. dermatitidis and four genetically distinct populations of B. gilchristii, with the overall geographic localization of the different populations delineated by the major freshwater drainage basins of North America. Although these observations are similar to the geographic delineation of strains described previously [15], these populations were empirically determined and represent a refinement of the previously hypothesized groups. Among the B. dermatitidis isolates, population 1 was associated with the Nelson River drainage basin; population 2 with the St. Lawrence River drainage basin and the northeast portion of the Atlantic Ocean Seaboard drainage basin; population 3 with the Gulf of Mexico Seaboard and southeast portion of the Atlantic Ocean Seaboard drainage basin; and population 4 with the Mississippi River System drainage basin. For populations 2 and 3, a 1:1 assignment of population to drainage basin was not observed; rather a single population detected by STRUCTURE analysis of the microsatellite data was distributed over two adjacently located drainage basins. It is possible that additional subpopulations also delineated by drainage basin do exist but were undetectable by MLMT. Alternatively, additional ecological factors or mechanisms of dispersal, other than freshwater drainage (i.e. wind dispersal), may also be influencing the geographic distribution of these populations and provide an explanation for the distribution of these populations across adjacent drainage basins.
B. gilchristii was also delineated into four populations with one population associated with the Nelson River drainage basin, a second population associated with what is presumably the northern-most tip of the Mississippi River System drainage basin, and a third population associated with the St. Lawrence River and northeast Atlantic Ocean Seaboard drainage basins. With the exception of a single isolate from Toronto, Ontario, the fourth population consisted entirely of isolates from Eagle River, Oconto Falls, and Tomorrow River, Wisconsin. Although the Wisconsin isolates in this population were located no more than 115 miles (190 km) apart [47], they were split between the Mississippi River System and St. Lawrence River drainage basins. While freshwater drainage appears to be a major determinate of the geographic localization of Blastomyces, the B. gilchristii population 4 isolates are genetically highly similar and yet are partitioned among two major freshwater drainage basins. These observations suggest that alternate climatic or environmental determinants or other mechanisms of dispersal (i.e. wind dispersal) impacted the geographic distribution of these populations on a more localized scale. Furthermore, there is evidence that the B. gilchristii populations are not genetically isolated with migration detected from the more northerly population 1 to the more southerly populations 2 and 4 (S2 Fig). Distance-based redundancy analysis indicated that localization to drainage basin was a significant explanatory factor accounting for the genetic structure (analyzed as genetic distance) of B. dermatitidis (p < 0.001) and B. gilchristii (p < 0.005) isolates. In fact, for both species, drainage basin localization was a more important explanatory factor than geographic distance, accounting for 5.0% vs. 1.5% of the variation in genetic distance for B. dermatitidis and 6.1% vs. 1.3% for B. gilchristii. However, the variation partitioning of genetic distance not explained by drainage basin localization and/or geographic position was substantial, 87.1% for B. dermatitidis and 75.5% for B. gilchristii. Thus, many other factors may be responsible for the variation in genetic distance observed between isolates, not the least of which would be the stochastic nature of microsatellite mutations which are generally considered to be selectively neutral [48]. Overall, these findings provide statistical support to the hypothesis that the population genetic structure of Blastomyces is influenced by drainage basin localization.
Although the link between genetic similarity represented by the clustering of isolates into populations and geographic localization to a drainage basin is readily apparent, exceptions within our dataset do exist. As noted above, other ecological variables and modes of dispersal are likely also shaping the geographic localization of genetically distinct populations. Another potentially significant cause of these exceptions is travel, since the travel histories of the patients from whom the isolates were derived were unknown. Due to the lengthy incubation period of blastomycosis (1-4 months) [32,33], a person can acquire the infection in one location but travel to another location before the symptoms develop and the disease is diagnosed. It is also possible that a person may develop blastomycosis in a rural area of one drainage basin, but travel to an urban area in another drainage basin for diagnosis and treatment. This may be particularly relevant for certain Ontario and Minnesota strains, where it is plausible that the infection could have been acquired in rural areas associated with the Nelson River drainage basin, but diagnosed in an urban area (e.g. Minneapolis, Minnesota, or Thunder Bay or Sault Ste. Marie, Ontario) located in the Mississippi or St. Lawrence River drainage basins, respectively. Thus, environmental isolates would be ideal for this type of geographic analysis, however, they are extremely difficult to obtain [22]. Finally, although most of the isolates in this study are recent (acquired within the last 3-10 years) a handful of strains were isolated up to 50 years ago. Thus, there are temporal differences in addition to geographic differences between the isolates, which could be influencing the results.
The association between geographic localization to a drainage basin and genetic population structure implies that the reproduction, growth, and dispersal of both B. dermatitidis and B. gilchristii are intimately linked to freshwater systems. The ecologic niche of Blastomyces has long been an active area of investigation, in large part due to the difficulty in isolating strains directly from the environment, especially soil [22]. There are fewer than 25 reported isolations of this fungus from the environment, representing a less than 1% success rate, using a variety of techniques, including animal inoculation and a variety of artificial media [49,50]. Only 45% of these isolations have come from soil or soil-containing specimens [32,33,[51][52][53][54][55]. In fact, soil appears to have an inhibitory effect on the survival of the fungus in either its mycelial or yeast form [56], whereas woody plant material appears to support its growth best in vitro, especially when combined with soil extract [57]. Based on epidemiological features of canine and human blastomycosis in certain geographic locales, a variety of environmental features have been postulated to facilitate the growth of Blastomyces, including sandy acidic soil, fluctuations in soil moisture levels, waterways at low elevations, decaying organic matter, high humidity (approaching 100%), coniferous forested regions, and proximity to bodies of water [23][24][25][26][27][28][29][30][31][32][33][34]36]. Proximity to freshwater has been emphasized repeatedly by these studies and ecologic niche modeling which noted that proximity to waterways was a distinguishing characteristic of areas that had the highest predicted occurrence of blastomycosis [26,35]. Experimentally, water has been shown to be critical to the efficient dispersal of the infectious conidia of Blastomyces [58]. Both the 1984 [32] and 2006 [22] Wisconsin outbreaks were associated with warm, rainy and/ or windy weather prior to each outbreak. Together these studies suggest that the fungus may require soil/organic nutrients for mycelial growth and reproduction prior to conidial liberation and dispersal by freshwater. Our phylogeographic analysis provides evidence that freshwater systems, probably through their conidial liberating function, serve to define the distribution of this fungus along major waterways and its genetic separation and differentiation over time.
Like Blastomyces spp., microsatellite and/or sequence typing of other fungi has revealed that most fungi exhibit genetically divergent populations localized to specific geographic regions [1][2][3][4][5][6][7][8][9][10][11][12][13][14]. Two notable exceptions are Aspergillus fumigatus and Penicillium chrysogenum which show world-wide dispersal with no correlation between genotype and geographic location [59][60][61]. Furthermore, population genetic structure has been used to infer the mechanism of dispersal of various fungal species, including dispersion by host species [3,5,6,45], environmental factors such as seawater and extreme weather [12,62], and human transportation [8,14,44]. While many fungi exhibit a worldwide distribution, Blastomyces species exhibit a relatively restricted endemic range, primarily central and eastern North America for B. dermatitidis and the north central US and Canada for B. gilchristii. However, if indeed freshwater basins or their shorelines create the optimal environmental locale for reproduction and dissemination; this may explain why the endemic region of these two fungi is relatively restricted. Unlike other fungi where population structure and dispersal is linked to a host species (either animal hosts or plants transported by humans) or large-scale climactic (wind or seawater) factors, the opportunities for large-scale spread of Blastomyces species via freshwater may simply have been limited. Alternatively, Blastomyces may simply be very fastidious, with the resulting ecologic niche capable of supporting growth and reproduction mainly limited to the freshwater systems of North America.
Like many fungi, Blastomyces is capable of clonal propagation as well as sexual reproduction [63]. As well, genetic recombination indicative of sexual reproduction was previously demonstrated by a variety of tests (PTPLT, linkage disequilibrium I A and r d values, split decomposition analysis, Phi test, and 4-gametes tests) performed on seven nuclear gene sequences of the two species [15]. Moreover, whole genome sequencing has identified reduced levels of synteny in the GC-poor regions of select B. gilchristii and B. dermatitidis strains, effectively preventing opportunities for meiotic recombination between species and providing additional evidence for their delineation [39]. Among fungi, the relative contributions of the two modes of growth, clonal propagation and sexual reproduction, differ and will have an impact on population genetic structure [4,5,43]. For our microsatellite dataset, the level of linkage disequilibrium within most populations was significant (with the exception of B. gilchristii populations 1 and 4), and PTPLT analysis suggested that the populations were not freely recombining. The significant level of linkage disequilibrium is contrasted with the low (non-significant) level of linkage disequilibrium previously detected in the multilocus nuclear gene sequence dataset [15]. This is probably due to the different levels of heterozygosity in the datasets, since the detection of linkage depends on polymorphisms [59]. Despite the substantial linkage disequilibrium, we did detect recombination within each species in at least two of the supercontigs containing multiple microsatellite loci, indicating that sexual reproduction does occur [59]. Furthermore, MLMT clones with different mating-type idiomorphs were detected among both the B. dermatitidis and the B. gilchristii isolates, which provides explicit evidence of sexual recombination on the smallest possible scale [5]. The mating-type idiomorphic ratios of the populations of either species (clone-corrected data) was not significantly different from the 1:1 ratio expected for fungi undergoing regular sexual reproduction [59]. Thus, it is probable that in addition to clonal propagation, both B. dermatitidis and B. gilchristii undergo regular sexual reproduction and recombination at a rate sufficient to maintain a 1:1 ratio of mating-types idiomorphs, but rare enough that significant linkage disequilibrium exists within each population.
Based on Bayesian analysis of nucleotide sequences, we estimate that B. dermatitidis and B. gilchristii diverged 1.9 MYA during the Pleistocene epoch. Spanning 2.58 to 0.011 MYA, the Pleistocene epoch was characterized by a series of glaciations events that advanced to envelop large portions of North America, reaching as far south as Iowa, Illinois, Indiana, and Ohio, and then receded [64]. The repeated glaciations caused unparalleled alterations to the freshwater environment, destroying some habitats and creating new systems of lakes and rivers [64]. While most temperate species experienced a reduction and fragmentation of their habitat during this period, populations of freshwater organisms were particularly affected, as the displacement and geographic isolation of subsets of populations created an opportunity for allopatric speciation events [64,65]. When the glaciers retreated, large proglacial lakes formed which could have enabled the widespread distribution of aquatic species [64]. In North America, the Pleistocene glaciations are considered a "speciation pump" and are responsible for reproductive isolation and allopatric speciation of several freshwater fish species. Genetic diversity and intraspecific divergence was greater for freshwater fish species in southern non-glaciated regions compared to northern glaciated regions [64,66]. Given the association of B. dermatitidis and B. gilchristii with the freshwater drainage basins in North America and estimated time of divergence, it is possible that their speciation was caused by the Pleistocene glaciations. Allopatric speciation could have occurred through displacement of local population subsets to different refugia, via the proposed conidial-liberating function of water on the glacial fronts. Similarly, one population subset may have remained frozen in glacial ice for thousands of years while another subset evolved at lower latitudes. The meltwater produced during the glacial retreat could have mixed the two species to create the overlapping geographic distributions observed today. As temperatures warmed during the last glacial maximum (18 000-21 000 years ago), populations within each species may have expanded their historical geographic distribution from isolated refugia as additional landscapes became suitable [65]. Like freshwater aquatic species [66], we noted a deeper tree topology, greater genetic diversity, and greater intra-specific divergence for B. dermatitidis whose geographic range was largely non-glaciated (i.e. the southern Mississippi River basin and southeastern United States) as compared to B. gilchristii, whose geographic range was entirely glaciated.
In conclusion, microsatellite typing coupled with phylogeographic analysis elucidated the population structure of the environmental fungi and serious human pathogens B. dermatitidis and B. gilchristii. This analysis suggested that the major freshwater drainage basins of North America are an essential component of their ecologic niches and provides a plausible mechanism for the dispersal, speciation, and population structure of North American Blastomyces. Further investigation using this knowledge will provide a greater understanding of the ecology of these important fungi; thereby controlling exposure where possible and facilitating the rapid recognition of outbreaks when they occur, leading to prompt public health responses and appropriate treatment.

Strains and culture conditions
One hundred and sixty-nine strains of Blastomyces spp. were collected from various regions of North America (S2 Table). Strains were isolated from human (n = 164), canine (n = 1), and environmental (n = 4) sources between 1963 and 2013. Fifty of the strains included in this study were previously examined by MLST and identified as B. dermatitidis or B. gilchristii [15]. . Strains were identified as Blastomyces spp. by the referring facility using one or more conventional direct detection methods i.e. microscopy, culture identification, mould-yeast phase conversion, AccuProbe Blastomyces dermatitidis culture identification test (Gen-Probe, San Diego, CA), or PCR and sequencing. Upon receipt at PHO, strains were cultured as mold on potato dextrose agar (BD, Mississauga, ON) incubated at 25°C and subsequently subcultured to D-media [67] and incubated at 37°C for 1-3 weeks to convert to the yeast phase.
This study was approved by the Ethics Review Board at Public Health Ontario. Blastomyces specimens were collected from sources listed in the methods section, and the data were analyzed anonymously.

Multilocus microsatellite typing
DNA was isolated from D-media yeast cultures using the Norgen Fungi/Yeast Genomic DNA Isolation Kit (Norgen Biotek, Thorold, ON, Canada). Primers and PCR reactions described by Meece et al. (2011) [37] were used to amplify 25 microsatellite loci from each strain. Loci 17 and 21 were omitted from our dataset due to inconsistent amplification among strains. Furthermore, reverse primers for loci 3, 7, and 10 were modified to include 5' tails (gggatgcTGCTGATTCAGACGGTGAAG, gggatgcTAGATTTCGAGCCCAGCATT, and gggatgcCAAAATGGGAAAGGAAAGCA, respectively) to eliminate ambiguity due to the inconsistent incorporation of +A artifacts during PCR amplification. PCR products were diluted 1 in 10 in water before mixing 0.5 μl of the PCR product with 0.5 μl of GeneScan-500 ROX size standard (Life Technologies, Carlsbad, CA, USA) and 9.0 μl of Hi-Di formamide (Life Technologies). Samples were denatured at 95°C for 3 minutes and cooled immediately on ice prior to fragment analysis using a 50-cm capillary array on a Genetic Analyzer 3730xl (Life Technologies). Results were analyzed using GeneMapper v4.0 (LifeTechnologies) with microsatellite fragments scored by fragment size (S1 Table).

Phylogenetic and population genetic analysis
The number of populations represented by the microsatellite data was assessed using the Bayesian clustering method implemented in the software STRUCTURE v2.3.4 [68][69][70]. Using the admixture model with correlated frequencies, we tested the probability of 1 to 10 clusters for the entire dataset, the B. dermatitidis isolates only, and the B. gilchristii isolates only. Each model was simulated 8 times for 500,000 simulations with a burn-in of 100,000. The number of clusters was determined using the ad hoc statistic delta K (ΔK) [71], with the ancestral population identity of each strain determined from the most probable ancestor algorithm (Q-output) of STRUCTURE.
Microsatellite data were analyzed using TreeMix v1.1 [41] to build a maximum likelihood tree and infer admixture events with sample size correction turned off. Based on an initial tree, B. gilchristii population 1 was chosen as the outgroup. Migration events were added (m = 2) until the p-value exceeded 0.05.
A neighbor-joining (NJ) phylogenetic tree was constructed using Nei's genetic distance matrix [75] calculated by MSA. Although all loci were previously reported to contain dinucleotide repeats [37], we noted a range of fragment sizes for most microsatellite loci suggesting the presence of indels within the flanking regions. Therefore, we chose to use Nei's genetic distance calculation to measure genetic distance, since this it is appropriate for obtaining a correct tree topology from complex microsatellite data where an infinite-allele model is appropriate [76]. A Majority Rule (extended) tree was constructed from 500 bootstrap replications of the neighbour-joining algorithm using Phylip v3.7a [77] with visual presentation in TreeView v1.6.6 [78].
In order to assess the correlation between genetic distance and geographic distance, pairwise matrices based on Nei's genetic distance [75] were calculated and constructed for the B. dermatitidis and the B. gilchristii isolates using MSA. B. dermatitidis isolates SACS and SACR were omitted from this dataset due to uncertainty in geographic origin [79]. Depending on the information available, the geographic location of each isolate was coded as the latitude and longitude of the approximate centre of the city, county, or state/province (S2 Table) with two exceptions: three isolates from Texas were assigned a location in the eastern portion of the state near the Mississippi River and 1 isolate from Alberta was placed in the south-central portion of the province in keeping with what is known about the endemic range of Blastomyces [16]. Additionally, each isolate was categorized into a major North American river drainage basin based on their geographic location (S2 Table). The major North American river drainage basins considered were the: Nelson River drainage basin, Hudson Bay Seaboard drainage basin, Fraser River drainage basin, St. Lawrence River drainage basin, Mississippi River System drainage basin (which included the Arkansas/Red River, Mississippi River (trunk), Missouri River, and Ohio River sub-basins), Gulf of Mexico Seaboard drainage basin, northeast Atlantic Ocean Seaboard drainage basin, and southeast Atlantic Ocean drainage basin. No isolates were assigned to the remaining North American major river drainage basins.
Distance based redundancy analysis (db-RDA) [80] was used to explore the relationship between the response matrix (genetic distance) and an explanatory matrix (geographic distance and/or drainage basin). Using R software [81], the latitude and longitude geographic coordinates were transformed into a 2-dimensional orthogonal system expressed in kilometers using the geoXY function of the SoDA package. Using the vegan package, a principal coordinate analysis (PCoA) of the genetic distance matrix was computed, correcting for negative eigenvalues. All principal coordinates were retained and tested against the explanatory variables of geographic distance and/or drainage basin (treated as categorical data) using db-RDA and partial db-RDA. The significance of the db-RDA was tested by comparing the true value of the pseudo-F test statistic to those of 1000 random permutations of the data with significance assigned if the proportion of the permuted values (p value) that was equal to or smaller than the true value was less than or equal to α = 0.05. Adjusted R 2 values were computed for each test and the partition of variance attributable to each of the explanatory variables was calculated [80].
The distribution of isolates belonging to each B. dermatitidis and B. gilchristii population was mapped across North America using ArcGIS 10.2.1 software (ESRI, Toronto, ON) and maps imported from the Commission for Environmental Cooperation [42].

Recombination and linkage disequilibrium analysis
Linkage disequilibrium (r d ) values were calculated using Multilocus v1.3b [82] for the complete data sets and the clone-corrected data sets of B. dermatitidis and B. gilchristii divided into four populations each (B. dermatitidis populations 1-4 and B. gilchristii populations 1-4) ( Table 1) . The parsimony tree permutation length test (PTPLT) was performed to detect random mating using PAUP 4.0 BETA [84] using files generated by Multilocus v3.1 as previously described [15].
The mating-type idiomorph of each isolate was determined as previously described [15,37] with frequencies compared to the null hypothesis ratio of 1:1 using chi square tests (p 0.05).

Divergence time estimation
The estimated time of divergence of B. dermatitidis and B. gilchristii was estimated using the Ã BEAST extension of BEAST v1.8.0 [85,86]. We used partial sequences of 7 nuclear genes (arf6, chs2, drk1, fads, its-2, pyrF, and tub1) of 40 B. dermatitidis isolates and 40 B. gilchristii isolates, previously published [15]. The substitution rate of its-2 was fixed at 8.3 x10− 4 substitutions per site per million years as previously estimated for B. dermatitidis [87]. The evolutionary rates of the remaining genes were estimated relative to the fixed its-2 substitution rate (arf 6 9.828 x10 -4 , chs2 9.537 x10 -4 , drk1 1.292 x10 -3 , fads 4.622 x10 -4 , opd 1.371 x10 -3 , tub1 1.668 x10 -3 ) [85,86]. An uncorrelated relaxed exponential clock with an exponential prior distribution and a mean of 1.0 x10− 3 substitutions per site per million years [87] was implemented for all genes. The clock, substitution, and tree models were unlinked as appropriate for nuclear genes capable of recombination. The HKY substitution model was chosen for all genes with en estimated base frequency but without gamma site heterogeneity or invariant sites. The Yule speciation model was employed with a random starting tree for each gene. Two independent, identical Ã BEAST analyses were run each for 100 million generations with trees and parameters sampled every 10,000 generations. The two runs were examined in Tracer v1.6 [88] to confirm that all parameters had converged on the same stationary distribution with effective sample sizes >200. The two runs were combined in LogCombiner v1.8.0 with burn-in values of 2500 for each tree file, thus discarding the first 25% of samples, and annotated in TreeAnnotator v1.8.0 to generate the maximum clade credibility tree.