Global Distribution of Polaromonas Phylotypes - Evidence for a Highly Successful Dispersal Capacity

Bacteria from the genus Polaromonas are dominant phylotypes in clone libraries and culture collections from polar and high-elevation environments. Although Polaromonas has been found on six continents, we do not know if the same phylotypes exist in all locations or if they exhibit genetic isolation by distance patterns. To examine their biogeographic distribution, we analyzed all available, long-read 16S rRNA gene sequences of Polaromonas phylotypes from glacial and periglacial environments across the globe. Using genetic isolation by geographic distance analyses, including Mantel tests and Mantel correlograms, we found that Polaromonas phylotypes are globally distributed, showing weak isolation by distance patterns at global scales. More focused analyses using discrete, equally sampled distance classes revealed that only two distance classes (out of 12 total) showed significant spatial structuring. Overall, our analyses show that most Polaromonas phylotypes are truly globally distributed, but that some, as yet unknown, environmental variable may be selecting for unique phylotypes at a minority of our global sites. Analyses of aerobiological and genomic data suggest that Polaromonas phylotypes are globally distributed as dormant cells through high-elevation air currents; Polaromonas phylotypes are common in air and snow samples from high altitudes, and a glacial-ice metagenome and the two sequenced Polaromonas genomes contain the gene hipA, suggesting that Polaromonas can form dormant cells.


Introduction
Over two decades' worth of sequence collection from culture-independent studies of microbial communities has finally provided the necessary data to address questions regarding microbial biogeographic patterns. Hypotheses about whether and why microorganisms have biogeographic patterns, given their small size and ease of dispersal, can now be tested [1,2]. Most recent studies of microbial biogeography have focused on patterns of genetic markers [3], phylogenetic community structure [4,5] or broad taxonomic groups [6]. However, sufficient long-read sequence data are now available to gain a better understanding of the global distribution of specific microbial clades [7], even for clades that are not currently cultured [8]. Such an understanding is a first step in describing the ecology and environmental importance of microbes that are presumed to be ubiquitous in similar environments across the globe.
Isolated "extreme" ecosystems, such as glacial and periglacial environments, are ideal for testing biogeographic hypotheses because they contain relatively low microbial diversity and they are geographically widespread. This is especially true of high-elevation environments because, like widespread thermal environments [9], they are often separated by large expanses of temperate habitats, yet occur on every continent [8,10]. However, unlike thermal environments, cold, high-elevation environments are linked through the upper atmosphere via the movement of cold air masses. Surprisingly few studies have examined the geographic distribution of microbial families, genera, or species in the cryosphere [7,11,12,13] and only one study has focused on the biogeography of microorganisms in extreme high-elevation ecosystems [8].
There is a growing body of evidence that members of the genus Polaromonas are among the dominant bacteria of glacial ice and sediments worldwide. They have been isolated from glacial ice [14], sea ice [13], and sub-glacial sediments [15], and detected in 16S rRNA gene clone libraries [16][17][18] and metagenomic libraries [19] of glaciers. In addition, Nemergut et al. [10] found Polaromonas phylotypes in clone libraries of very recently deglaciated soils but not in soils from the same chronosequence that had been deglaciated for more than four years. Finally, our recent research in the high Himalayas [8], Andes [20,21], Rocky Mountains (this study) and Alaska Range (this study) suggests that sequences closely aligned with Polaromonas are almost ubiquitous in unvegetated periglacial soils in widely separated mountain ranges. Given our recent studies of high-elevation periglacial soils and numerous sequences from other studies of glacial sediments worldwide [16][17][18][19][22], we determined the biogeographic distribution of Polaromonas phylotypes in periglacial sediments and ice across the globe.

Results
Polaromonas sequences from sediments and ice of glaciers worldwide (Figure 1) formed a monophyletic clade that was significantly differentiated from its two closest sister clades, Rhodoferax and Variovorax [23]. This conclusion was supported both by Bayesian (0.99 posterior probability) and Maximum Likelihood (100% of bootstraps) approaches (Figure 2). Guide sequences from four well-described strains of Polaromonas [23][24][25][26] were encompassed by this clade, supporting our use of the term Polaromonas for the whole clade. Phylotypes from the two most intensively sampled sites (Toklat Glacier, Alaska and Nunavut, High Arctic) are distributed throughout this clade (Figure 2). For example, phylotypes from Toklat Glacier occurred in both sub-clades of Polaromonas and basal to those sub-clades (Figure 2), indicating that Polaromonas phylotypes from the Toklat Glacier are also globally distributed.
To test whether Polaromonas phylotypes are globally distributed we examined genetic divergence by geographic distance patterns [8,27,28] for pair-wise comparisons among all of the phylotypes in the Polaromonas clade from Figure 2. These analyses revealed a high level of genetic overlap among phylotypes (Figure 3) even at the largest geographic distances represented in our analyses (18,830 km; John Evans Glacier in the Arctic to Collins Glacier in Antarctica), confirming that at least some phylotypes of Polaromonas are globally distributed to glacial environments. A Mantel correlogram [29] (Figure 4) revealed that significant (P < 0.004, Bonferroni-corrected) patterns of genetic divergence occurred within only two distinct distance classes, meaning that within the bounds of these classes genetic distance is positively correlated with geographic distance for Polaromonas phylotypes. The first significantly correlated distance class spans pairwise distances of 6.5 kilometers to 2882 kilometers, and the second spans pairwise distances of 6471 kilometers to 7588 kilometers. All five larger distance classes (>7588 km) showed no significant isolation by distance patterns, indicating that some, as yet unknown, environmental factor may drive genetic differences at the two shorter distance classes mentioned above. However, the fact that 10 out of the 12 distance classes resulted in non-significant Mantel tests is strong evidence for the global dispersal of Polaromonas phylotypes.
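The permutation logic behind a Mantel test can be illustrated with a minimal Python sketch. This is not the R code used in the study; the function name `mantel_test`, the one-sided P-value convention, and the toy data are assumptions made for illustration only. The sketch correlates the upper triangles of a genetic and a geographic distance matrix, then repeatedly shuffles the sequence labels of the genetic matrix to build a null distribution for the Mantel statistic.

```python
import numpy as np

def mantel_test(genetic, geographic, permutations=1000, seed=0):
    """Return (r_M, one-sided P) for two symmetric distance matrices."""
    genetic = np.asarray(genetic, dtype=float)
    geographic = np.asarray(geographic, dtype=float)
    n = genetic.shape[0]
    upper = np.triu_indices(n, k=1)           # count each pair once
    geo_vec = geographic[upper]
    r_obs = np.corrcoef(genetic[upper], geo_vec)[0, 1]

    rng = np.random.default_rng(seed)
    at_least_as_large = 0
    for _ in range(permutations):
        order = rng.permutation(n)
        # Permute rows and columns together so the matrix stays a valid
        # distance matrix with relabelled sequences.
        shuffled = genetic[np.ix_(order, order)]
        r_perm = np.corrcoef(shuffled[upper], geo_vec)[0, 1]
        if r_perm >= r_obs:
            at_least_as_large += 1
    # The observed arrangement is counted as one of the permutations.
    p_value = (at_least_as_large + 1) / (permutations + 1)
    return r_obs, p_value

if __name__ == "__main__":
    # Hypothetical toy data: six sites on a line, noisy genetic distances.
    rng = np.random.default_rng(42)
    coords = np.arange(6.0)
    geo = np.abs(coords[:, None] - coords[None, :])
    noise = rng.normal(0.0, 0.5, size=geo.shape)
    gen = np.abs(geo + (noise + noise.T) / 2)
    np.fill_diagonal(gen, 0.0)
    print(mantel_test(gen, geo, permutations=1000))
```

Permuting rows and columns with the same ordering is the key design choice: it preserves the dependence among pairwise distances that share a sequence, which an ordinary correlation test would ignore.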
Comparison among the genomes of P. naphthalenivorans, Polaromonas sp. strain JS666, and the metagenome from the Northern Schneeferner Glacier [19] allowed us to infer what genes may contribute to global dispersal. Polaromonas was one of the dominant contributors to annotated proteins and 16S phylotypes from the Northern Schneeferner Glacier metagenome; out of the 508,960 classifiable protein-coding genes, 48,922, or 9.3%, were aligned to the Polaromonas sp. strain JS666 genome with a maximum allowable E-value of 1e-05. This provides coverage of 3,820 of the total 5,656 genomic features (68%). In addition to the expected presence of genes that help Polaromonas cope with the osmotic and oxidative stress of glacial life, the occurrence of at least one dormancy-inducing gene, hipA, was established in all three genomic datasets. Since all Polaromonas genomes lack the necessary genes for spore formation, we suggest they use an alternative dormancy mechanism, which also provides a high capacity for successful airborne dispersal.

Discussion
The goal of this study was to statistically describe the global biogeographic distribution of known Polaromonas phylotypes in glacial environments in order to better understand their role in the cryosphere and to determine if they are endemic to each site or are globally dispersed. We restricted our analyses to long sequence lengths (>1280 bp) that were in a monophyletic clade (Figure 2) bound at basal and distal levels by described Polaromonas spp. [23][24][25][26]. These and other precautions (see Methods) ensure that the biogeographic patterns we observed are not due to inclusion of misidentified Polaromonas sequences that would artificially inflate genetic distance by geographic distance comparisons. This focused phylogenetic approach also allows us to shed new light on whether Polaromonas phylotypes are globally dispersed. Are they being constantly distributed globally or are they endemic to individual sites?
Overall our results show that some Polaromonas phylotypes are globally distributed to glacial environments, but that some unmeasured environmental variable may be influencing spatial structuring of Polaromonas phylotypes within a minority (2 out of 12) of distance classes tested. Testing the influence of environmental variables is not possible at the present time because we do not have access to raw samples from most of the sites, but future research could elucidate what factors, in addition to geographic distance, contribute to the genetic structuring of Polaromonas phylotypes. Most importantly, however, the present study shows that Polaromonas phylotypes are more widely dispersed than other microbial clades that have been studied using a phylogenetic approach similar to that employed here. For example, recent studies of Bacteria and Archaea show stronger patterns of apparent endemicity [8,9,28] than we observed, as do microbial eukaryotes in glacial [8] and temperate environments [5]. Indeed, Robeson et al. [5] found spatial autocorrelation ranges of about a hundred meters for soil rotifer phylotypes, indicating that they are many orders of magnitude less globally dispersed than the Polaromonas clade studied here.
Figure 1. Global sites used in these analyses (see Table 1 and Table S1). doi:10.1371/journal.pone.0023742.g001
Another strong indicator of the global distribution of Polaromonas phylotypes comes from comparing the within-site variability to the global variability of phylotypes. For example, the most intensively sampled site in our analyses was the Toklat Glacier in Alaska, and Figure 2 shows that phylotypes from this site are spread throughout the Polaromonas phylogeny. In addition, different sequences from the Toklat Glacier are almost identical to those from glaciers in the Arctic [17,22], Tibet (unpubl.) and New Zealand [15], among others (Figure 2). In other words, the local diversity of Polaromonas phylotypes is roughly equivalent to the global diversity, with both spatial scales showing a 4% total range in genetic distances.
The apparent global distribution of Polaromonas phylotypes shown here raises the question of how these Gram-negative bacteria with no known spores are dispersed and if there is a global locus from which they are originating. An interesting consideration is the presence of the hipA gene in the two publicly available Polaromonas genomes, as well as in the only metagenome of glacial ice [19]. HipA is linked to the formation of metabolically dormant cells [30] as it phosphorylates elongation factor Tu (EF-Tu), thereby blocking translation and inducing dormancy [31]. Recent cell sorting efforts for E. coli persister cells demonstrated highly reduced translation and apparently smaller cell sizes compared to vegetative cells that had high levels of translation [32].
Interestingly, in all species tested in culture so far, formation of persister cells follows a similar pattern to sporulation: low occurrence during mid-log phase followed by a significant increase in persister cell formation during stationary phase [32,33]. Whether or not the formation of persister cells via hipA in Polaromonas provides an easily dispersible propagule or whether it is found in all Polaromonas phylotypes remain open but testable hypotheses.
At present there is not enough information to indicate that Polaromonas phylotypes originate from any one locus or habitat type, but the genus has been found in many non-glacial soils and sediments and its members are often characterized as being psychrotrophic rather than psychrophilic [23,24,26,34,35]. It is therefore possible that glacial Polaromonas species are being transported from temperate soils and sediments and are not indigenous to glacial and periglacial environments, where they could just be persisting as dormant cells. More physiologic and genomic [19] studies of glacial Polaromonas species are needed to determine if specific glacial phylotypes exist.
The unexpected global distribution pattern of Polaromonas phylotypes in high-elevation periglacial environments also suggests that they are being transported there in the upper atmosphere. Although independent confirmation of this is sparse, Fahlgren et al. [36] found Polaromonas sequences in air samples using two different sampling devices on Fløyen Mountain, Norway but not in samples taken at sea level. In addition, Polaromonas is a common inhabitant of snow sampled at very high elevations. Hervàs and Casamayor [37] found Polaromonas sequences (= clade GKS16 [38]) deposited in surface snow in the highest reaches of the Pyrenees Mountains and Liu et al. [39] found that Polaromonas was one of only two genera found in all snow clone libraries from 4 geographically distributed sites on the Tibetan Plateau. The hypothesis that Polaromonas is being dispersed in high-altitude air currents and snow also explains why Liu et al. [14] found Polaromonas of decreasing importance in clone libraries in the order supra-glacial snow > glacial ice > glacial melt water. Obviously more work is needed to understand the aerobiology of Polaromonas species, but our biogeographic analyses and the studies discussed above are consistent with the hypothesis that Polaromonas species are transported across great distances in the atmosphere.
Finally, recent studies of non-glacial (but seasonally cold) environments shed some light on the possible roles of Polaromonas in glacial systems. Strains of Polaromonas from seasonally cold soils are able to oxidize a wide array of unusual energy sources including H2 [26], arsenite [34] and a broad range of recalcitrant organic compounds [24,25,35]. Furthermore, genomic studies are revealing that this extreme metabolic versatility may be due to high levels of horizontal gene transfer [35] allowing Polaromonas to adapt to shifting availabilities of energy sources in periglacial environments. Thus, a picture is emerging of Polaromonas as a metabolically diverse "opportunitroph" [40,41] that takes advantage of transient periods of higher temperatures and substrate availability that occur in all but the most extreme glacial environments.
Figure 3. All pairwise comparisons of genetic distance by geographic distance for glacier-associated Polaromonas sequences (n = 1378). There was a slight (r_M = 0.09) but significant (P = 0.01, Mantel test) increase in genetic distance with geographic distance for the entire data set, but a Mantel correlogram (Figure 4) revealed that spatial structuring was not evident in 10 out of 12 total distance classes across the globe. This pattern was maintained even when only the hyper-variable regions of the 16S rRNA gene were analyzed (see Figure S1). The largest geographic distance comparison in this study was 18,838 km, between the Collins Glacier in Antarctica and the John Evans Glacier in Nunavut, Canada. Circle size is proportional to the number of pairwise comparisons at each point on the plot, with bin sizes of 1, 2, 3-4, 5, and >5 for the smallest to largest circles. doi:10.1371/journal.pone.0023742.g003
Figure 4. Mantel correlogram for Polaromonas phylotypes within different distance classes (the midpoint of each distance class is plotted). Shaded diamonds represent distance classes that contained statistically significant spatial structuring, as determined by running a Mantel test on each distance class and comparing the test's P-value after applying the Bonferroni correction (corrected alpha = 0.004). Unlike previous studies that used this approach to show spatial structuring at large scales [58], our data clearly show that only a minority of distance classes indicate significant isolation by distance patterns, supporting our contention that Polaromonas phylotypes are globally distributed. doi:10.1371/journal.pone.0023742.g004
Taken together, our biogeographic analyses, aerobiological studies, and genomic data allow us to deduce a probable explanation for the unusual distribution of Polaromonas phylotypes in glacial systems. Members of this clade are globally dispersed and show as much genetic diversity within an environment as they do across the globe. This high level of local diversity combined with their apparent propensity for rapid evolution through horizontal gene transfer [35] allows them to adapt to shifting environmental gradients (freeze-thaw cycles, extreme drying and physical disruption due to glacial movement) common in high alpine environments [21,42]. These shifting environmental gradients likely result in periodic decimation of local populations, allowing for the establishment of Polaromonas phylotypes from the atmosphere. This continuous input of widely dispersed phylotypes would result in the weak genetic isolation by distance patterns observed in the present study. However, the ecological role of Polaromonas spp. in high altitude sediments, ice and the atmosphere remains an unsolved mystery.

Methods
To describe the global biogeographic distribution of Polaromonas phylotypes we used previously published and unpublished sequences (see Table 1 for references) from GenBank [43] and new sequences that we obtained from periglacial sediments from our previously described sites in the Himalayas [44], Colorado Rocky Mountains [45], Andes [21] and Alaska [4]. Sediment samples (0 to 4 cm deep) were collected sterilely during the summer at all sites in a grid pattern in order to obtain spatial representation as described in King et al. [4,46]. The location of each sample site was logged using a Garmin 60CSx GPS unit. Samples were frozen in the field and shipped to Colorado where they were kept at -80°C until DNA was extracted. Figure 1 shows the global sites used in these analyses.
To extract DNA, 0.4 grams of sediment from each sample was processed with the Mo Bio PowerSoil™ DNA Isolation Kit (Carlsbad, CA, USA) and 3 µl of each extraction was PCR amplified in 25 µl reaction volumes using primers 8F (5'-agagtttgatcctggctcag-3') and 1391R (5'-gacgggcggtgwgtrca-3') [8]. The reaction conditions consisted of 1 µM of each primer, 250 µM each dA, dT, dG, dC, 0.25 µl bovine serum albumin, 1 unit of OmniKlen™ Taq polymerase, and 3 µl DNA extract as template. For negative controls, sterile Millipore water was used as template. Denaturing temperature was 94°C (1 minute), the annealing temperature was 53°C (30 seconds) and the extension temperature was 72°C (2 minutes 30 seconds). PCR products were purified using the QIAquick gel extraction protocol (Qiagen, Valencia, CA, USA), with HyperLadder II™ as a reference. Plasmids were cloned into OneShot™ E. coli using the Invitrogen TOPO TA™ cloning kit (Invitrogen, Carlsbad, CA, USA). Colonies were grown on selective media for 18 hours, pelleted and sent overnight on dry ice to Functional Biosciences (Madison, WI, USA) and sequenced bi-directionally using sequencing primers T7 and M13R. Sequencher 4.6 (Gene Codes Co., Ann Arbor, MI, USA) was used to interpret the chromatograms, edit out unreliable data, and assemble contigs. Sequences were imported into ARB v. 9.4 [47], and aligned using the SILVA reference database [48]. Sequences unique to this study were deposited in GenBank [43] under accession numbers JF719322-JF719338 and JF729309. Other glacial Polaromonas sequences, as well as known Polaromonas guide sequences, were downloaded from GenBank and imported into ARB. All Polaromonas sequences were aligned in ARB, and then filtered by base frequency to exclude any position in the alignment that had below 30% identity across all sequences. Sequences were exported from ARB to a FASTA file, and Mesquite [49] was used to convert between file formats.
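The base-frequency filter described above can be illustrated with a small Python sketch. It is not ARB's implementation; the function name `filter_columns`, the treatment of gap characters, and the choice to measure identity against all sequences (so that gaps lower a column's score) are assumptions made for illustration. The sketch keeps only those alignment columns in which the most common base is shared by at least 30% of the sequences.

```python
from collections import Counter

def filter_columns(aligned, min_identity=0.30, gap_chars="-."):
    """Keep alignment columns whose most common base reaches min_identity.

    `aligned` is a list of equal-length aligned sequences (strings).
    Returns the indices of retained columns and the filtered sequences.
    """
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("all aligned sequences must have the same length")

    keep = []
    for col in range(length):
        # Tally bases in this column, ignoring gap characters.
        bases = Counter(seq[col].upper() for seq in aligned
                        if seq[col] not in gap_chars)
        top = bases.most_common(1)[0][1] if bases else 0
        # Assumption: identity is measured against all sequences,
        # so columns dominated by gaps score low and are dropped.
        if top / len(aligned) >= min_identity:
            keep.append(col)

    filtered = ["".join(seq[c] for c in keep) for seq in aligned]
    return keep, filtered

if __name__ == "__main__":
    # Hypothetical toy alignment: the last two columns are uninformative.
    toy = ["ACGT-", "ACGA-", "ACTC-", "AGAG-"]
    print(filter_columns(toy))
```

Applying the same retained-column list to every export keeps the trees and the genetic distance matrix based on identical alignment positions.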
Phylogenetic trees were constructed using two robust methods in order to clearly define a well-supported Polaromonas clade. RAxML [50] was used to make maximum likelihood (ML) trees with 500 bootstraps, and MrBayes [51][52][53] was run for 5 million generations at a temperature value of 0.02. The Bayes and ML trees were then compared for structural similarity and mutual support of the node separating the out-group genera (Variovorax and Rhodoferax) from Polaromonas phylotypes. The Bayes tree had a posterior probability of 0.99 for this node, and in the ML tree, 100% of bootstraps contained that node, confirming that the Polaromonas clade discussed below is indeed monophyletic. Genomic and metagenomic comparisons were performed using the RAST and MG-RAST annotation and comparison platforms, which utilize the manually curated SEED database [54,55]. The genomes of Polaromonas naphthalenivorans CJ2 (CP000529) [35] and Polaromonas sp. JS666 (CP000316) [25], as well as the glacier metagenome (SRX000607) [19], were obtained from the NCBI database. Comparisons of annotated metabolic subsystems between the genomic data were sorted by identity and function.
Geographic distances between sample sites were computed in R [56] using the Fields package [57] and used to construct a geographic distance matrix using in-house software. An uncorrected genetic distance matrix was exported from ARB using the same filter that was used to export sequences for the trees. To test for a correlation between these matrices, Mantel tests were performed in R using 1000 randomized permutations per test. A Mantel correlogram was constructed to the specifications set forth by Legendre and Legendre [29,59]. The application of Sturges' rule resulted in the data being partitioned into 12 distance classes, each containing 115 pairwise comparisons, so that each distance class had the same statistical power. Mantel tests were carried out on each of the 12 distance classes, and a Bonferroni correction was applied to the original alpha value of 0.05, resulting in a corrected alpha value of 0.004. Using this approach, only 2 of the 12 distance classes showed significant P-values (P < 0.004); however, the first distance class was not testable (showed a null result for P and r_M values) because it contained no geographic variation. Guide sequences and the outgroup sequences were not included in biogeographic analyses. Several groupings of our Polaromonas sequences were tested in this manner to see whether variables in addition to geographic distance, such as sequence length, distance from glaciers, elevation, or whether the sequences were obtained from culture-dependent or culture-independent studies, were correlated with spatial structuring. However, at the present time we do not have enough environmental data to disentangle the effects of geographic distance from environmental variation across the 2 distance classes that showed significant spatial structuring.
Figure S1 All pairwise comparisons of genetic distance (only hyper-variable regions of the 16S rRNA gene) by geographic distance for glacier-associated Polaromonas sequences (n = 1378). There was a significant (P = 0.016, Mantel test) increase in genetic distance with geographic distance for the entire data set. Circle size is proportional to the number of pairwise comparisons at each point on the plot, with bin sizes of 1, 2, 3-4, 5, and >5 for the smallest to largest circles. (TIF)
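The distance-matrix and distance-class steps described in the Methods can be sketched in Python as follows. This is an illustrative stand-in, not the R and in-house code used in the study: the haversine formula with a mean Earth radius of 6371 km replaces the Fields package, Sturges' rule is taken as 1 + log2(n) rounded up, and all function names are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumption of this sketch

def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine great-circle distance between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    # Clamp to guard against rounding slightly above 1 for antipodal points.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))

def sturges_classes(n_pairs):
    """Number of distance classes from Sturges' rule, rounded up (assumed)."""
    return math.ceil(1 + math.log2(n_pairs))

def equal_count_classes(distances, n_classes):
    """Split sorted pairwise distances into classes of near-equal size."""
    ordered = sorted(distances)
    base, extra = divmod(len(ordered), n_classes)
    classes, start = [], 0
    for i in range(n_classes):
        size = base + (1 if i < extra else 0)  # spread the remainder evenly
        classes.append(ordered[start:start + size])
        start += size
    return classes

def bonferroni_alpha(alpha, n_tests):
    """Per-test significance threshold after Bonferroni correction."""
    return alpha / n_tests

if __name__ == "__main__":
    k = sturges_classes(1378)          # 1378 pairwise comparisons in the paper
    print(k, bonferroni_alpha(0.05, k))
```

With 1378 pairwise comparisons this rule gives 12 classes, and 0.05 / 12 rounds to the corrected alpha of 0.004 reported above. Equal-count classes trade evenly spaced distance boundaries for equal statistical power per class, which is the rationale the Methods give.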

Supporting Information
Table S1 Distances in kilometers between sites on Figure 1 and Table 1.