Prokaryotic community shifts during soil formation on sands in the tundra zone

A chronosequence approach, i.e., a comparison of spatially distinct plots with different stages of succession, is commonly used for studying microbial community dynamics during paedogenesis. The successional traits of prokaryotic communities following sand fixation processes have previously been characterized for arid and semi-arid regions, but they have not been considered for the tundra zone, where the environmental conditions are unfavourable for the establishment of complicated biocoenoses. In this research, we characterized the prokaryotic diversity and abundance of microbial genes found in a typical tundra and wooded tundra along a gradient of increasing vegetation—unfixed aeolian sand, semi-fixed surfaces with mosses and lichens, and mature soil under fully developed plant cover. Microbial communities from typical tundra and wooded tundra plots at three stages of sand fixation were compared using quantitative polymerase chain reaction (qPCR) and high-throughput sequencing of 16S rRNA gene libraries. The abundances of ribosomal genes increased gradually in both chronosequences, and a similar trend was observed for the functional genes related to the nitrogen cycle (nifH, bacterial amoA, nirK and nirS). The relative abundance of Planctomycetes increased, while those of Thaumarchaeota, Cyanobacteria and Chloroflexi decreased from unfixed sands to mature soils. According to β-diversity analysis, prokaryotic communities of unfixed sands were more heterogeneous compared to those of mature soils. Despite the differences in the plant cover of the two mature soils, the structural compositions of the prokaryotic communities were shaped in the same way. Thus, sand fixation in the tundra zone increases archaeal, bacterial and fungal abundances, shifts and unifies prokaryotic communities structure.


Introduction
For the investigation of microbial succession during soil-forming processes, a chronosequence approach, i.e., a comparison of spatially distinct plots of different ages, is commonly used. Currently, chronosequences of soil formation can be observed in areas with variable climatic conditions and on omnigenous parent material, such as glacial retreats [1][2][3][4], sand dunes [5][6][7], a1111111111 a1111111111 a1111111111 a1111111111 a1111111111 volcanic rocks [8], or anthropogenic landscapes [9]. Changes in microbial community structure during the process of sand dune fixation have mostly been studied for arid and semi-arid regions [6,10,11] and for coastal environments [5]. Currently, successional traits during sand fixation in the cold climate of the tundra have received increasing attention. Biogeocoenoses of Subarctic region play an important role in regulating the global carbon balance, but they are considered to be susceptible to consequences of climate change (e.g. an increase of mean annual temperatures), and have vulnerable vegetation cover [12]. The environmental conditions for soil formation in the tundra zone are specific, when the sandy substrates are depleted of nutrients, and the average temperature is unfavourable for the development of highly productive plant community.
While the succession of plant communities is relatively well studied, information on the prokaryotic community assemblage during soil formation is still lacking [10,13]. It is known that the first organisms to colonize parent rock are phototrophs, diazotrophs, chemolithotrophs and heterotrophs, whose taxonomic composition depends on the substrate properties [4,8,14]. Several bacterial phyla have been suggested to be associated with the initial stages of soil formation, mainly Bacteroidetes [3] and Cyanobacteria [15,16]. The prokaryotic community acts as the primary producer of organic matter and modifies the parent material for further colonization by plants. Available nitrogen is a limiting factor of plant growth, especially on lean substrates in cold environments [17][18][19]. Some prokaryotes (e.g. from phyla Cyanobacteria, Proteobacteria, Firmicutes, Actinobacteria) are able to perform nitrogen fixation, which leads to the accumulation of available nitrogen during inhabitation of barren substrates, such as rocks and sands [4]. Both archaeal and bacterial ammonia oxidizers produce nitrate (NO 3 − ), which appears to be a crucial form of nitrogen for plants in the tundra zone [20]. Denitrification is a multi-step process of full or partial NO 3 − reduction, which may lead to nitrogen losses through N 2 and N 2 O emission [21,22]. Additionally, the presence of vegetation shapes prokaryotic community structure during soil formation [23]. In comparison to bacteria, fungi are less adapted to life on barren substrates and depend strongly on plants during the early stages of colonization [24]. The diversity of soil microorganisms changes during the process of soil formation; however, there is no distinct and universal pattern of prokaryotic diversity shifts with the successional stage of paedogenesis [2,3,23,25]. Previous studies have shown that at the earliest stages of soil formation after the retreat of glaciers (0-100 years), the bacterial diversity was relatively low, whereas it increased with the age of soil [2] or was the highest in middle-aged soils [3]. In contrast, in a longer timescale of ecosystem development (60-120 000 years), the diversity of the soil prokaryotic community decreased with the site age [25]. The patterns of prokaryotic diversity change among chronosequences of soil formation have been mostly studied for glacier retreats but not for soils formed on aeolian sand dunes.
The aim of this research was to reveal the traits of microbial community succession during sand fixation in the tundra zone. Two chronosequences of soil formation on aeolian sands with similar initial stages and different mature vegetation (typical tundra and wooded tundra) were compared. Taxonomic composition and diversity of the prokaryotic community, the abundances of bacterial, archaeal, and fungal ribosomal genes and functional genes related to the N cycle were estimated for three stages of sand fixation (unfixed sand-semi-fixed surface -mature soil). We hypothesized that 1) the ribosomal and functional genes abundances increases along the chronosequences, 2) α-diversity of prokaryotic communities increases gradually with soil formation and plant colonization on sands, and 3) unfixed sands harbour similar prokaryotic community structures, while the communities in mature soils under the two vegetation types vary from each other.

Sampling site description
Sand fixation chronosequences at two sites on the shores of the Pechora River (Northwestern Russia, Nenetsia region) were studied. This region is located in the southern tundra zone with a humid subarctic climate and an average annual temperature of -3.6˚C. The mean annual precipitation is 445 mm. For both sites, sampling was performed in August 2015 on three types of surfaces: 1 -unfixed aeolian sand, 2 -semi-fixed surface with mosses and lichens, and 3 -mature soil under developed plant cover (Fig 1). The two sites differed in the plant cover that developed on mature soil-typical tundra vegetation with subshrubs (Site I) and wooded tundra with rare trees and subshrubs (Site II).
The first site (Site I) was located on a flat sand hill, probably a moraine formation, with a height of approximately 30 m (67˚58 0 34.3˝N, 52˚55 0 19.9˝E, near Nelmin Nos). The areas of unfixed surface (sand with gravel and rare moss and grass shoots) were found on the hilltop. Presumably, the substrate on that surface was unfixed due to wind and snow erosion. The semi-fixed surface was covered by scanty vegetation: the cushion-like moss Racomitrium canescens, the lichen Stereocaulon paschale, rare subshrubs and other lichens. Vegetation on the mature soil was typical for the tundra zone: lichens (mostly Cladonia arbuscula and Flavocetraria nivalis), subshrubs (g. Empetrum, g. Arctostaphylos, g. Ledum), and f. Gramineae. The soil was classified as Arenosol in the WRB classification [26].
The second site (Site II) was located in a deflation basin with unfixed aeolian sand, which formed small dunes and was gradually covered by vegetation (67˚36 0 23.2˝N, 53˚08 0 12.2˝E, near Naryan-Mar). The unfixed surface was an aeolian sand without gravel. The semi-fixed surface was partly covered by shoots of moss (g. Polytrichum) and lichens (mostly Stereocaulon paschale). Vegetation on the mature soil consisted of various lichens, subshrubs (g. Empetrum, g. Arctostaphylos, Vaccinium vitis-idaea), grasses (Festuca rubra) and small trees (g. Juniperus, g. Betula). The soil was classified as Arenosol in the WRB classification [26] or Psammozem on buried podzol.
For every surface type on each site, five samples were taken from depths of 1-5 cm that lacked plants, mosses and lichens. Sampling plots of different types were located on a transect with 3-5 m between each plot. For molecular analyses, samples were stored at -70˚C. The total organic carbon (TOC) and total nitrogen (TN) contents were estimated for the average sample from each plot using a Vario MACRO Cube CN-analyser (Elementar Analysensysteme GmbH, Germany).

DNA extraction
Total DNA was extracted from 0.5 g of frozen samples using the FastDNA SPIN kit for Soil (MP Biomedicals, USA) as recommended by the manufacturer. The homogenization step was performed with a Precellys 24 homogenizer (Bertin Technologies, France), program 5 (30 sec, 6500 rev. / min). DNA quality was estimated by electrophoresis in agarose gels (1% w/v in TAE) with further visual DNA detection using the Gel Doc XR+ System (Bio-Rad Laboratories, USA). DNA quantity was estimated by Qubit 3 Fluorometer (Thermo Fisher Scientific, USA) using Qubit dsDNA HS Assay Kit (Thermo Fisher Scientific, USA).

Quantitative PCR analysis
qPCR assays were used for quantitative estimation of ribosomal and N-cycle genes. To estimate the functional potential of microbial communities for N-fixation, ammonia-oxidation and denitrification, the abundance of genes encoding key enzymes of these processes (nifH, amoA, nirK and nirS, respectively) was measured [18]. 16S ribosomal genes of Bacteria and Archaea, the ITS region of Fungi and functional genes nifH, bacterial amoA, nirK and nirS were quantified using primer sets described in Table 1. All reactions were performed in a C1000 Thermal Cycler with the CFX96 Real-Time System (Bio-Rad Laboratories, USA). The qPCR mix contained 10 μl of 2X concentrated master mix for qPCR (SYBR Green Supermix (Bio-Rad Laboratories, USA) for the ITS region of Fungi, BioMaster HS-qPCR SYBR Blue (Biolabmix, Russia) for the other genes), 0.5-0.8 μM of each primer, and 1 μl of extracted soil DNA template in a total volume of 20 μl. Quantification of the initial gene copy abundance was performed in CFX Manager. PCR conditions for ribosomal genes were 3 min at 95˚C, followed by 49 cycles of 95˚C for 10 sec, 50˚C for 10 sec, and 72˚C for 20 sec. PCR conditions for N-cycle genes were 3 min at 95˚C, followed by 40 cycles of 95˚C for 20 sec, 54˚C for 20 sec, and 72˚C for 20 sec. To ensure qPCR specificity, melting curve analysis was performed (from 65˚C to 95˚C with an increment of 0.5˚C). Triplicate standard curves ranged from 10 3

GTSAACGTSAAGGARACSGG GASTTCGGRTGSGTCTTGA
Pseudomonas sp [31,32] to 10 8 gene copy number/μl. Standards were made by purifying PCR products and quantifying the concentration by Qubit fluorometer 2 (Thermo Fisher Scientific, USA). Reference organisms (exept amoA gene) for the construction of standard curves for PCR products are described in the Table 1. Efficiencies of qPCR were 82-101% and coefficients of determination were R 2 > 0.90 for all standard curves.

Sequencing of 16S rRNA gene libraries
High-throughput sequencing of the 16S rRNA gene libraries was performed for 5 replicates of each studied sample. The purified DNA isolates were amplified with universal multiplex primers F515 (5 0 -GTGCCAGCMGCCGCGGTAA-3 0 ) and R806 (5 0 -GGACTACVSGGGTATCTAAT-

Processing of 16S rRNA gene data
Sequencing data were processed using QIIME v.1.9.1 [34] and Trimmomatic [35]. Sequence pairs with both forward and reverse reads of at least 180 nucleotides were merged using the fastq-join algorithm. Trimming, i.e. filtering of sequences according to the reading quality parameters was performed using the Trimmomatic program [35], so that the quality of 4 adjacent nucleotides was not lower than 16. Operational taxonomic units (OTU) picking based on 97% nucleotide similarity was performed in the QIIME environment. Chimeras were filtered using VSEARCH algorithm [36]. Reference sequences for OTU selection as well as taxonomic affiliation were obtained from SILVA database version 128, 2017 (https://www.arb-silva.de/ download/archive/qiime). Singletons (OTUs containing only one sequence) and 16S rRNA sequences of chloroplasts and mitochondria were removed.

Statistical and sequence analyses
Statistical analysis of gene abundance data was performed in Microsoft Excel and STATIS-TICA 10.0. A multiple t-test was performed to test for significant (p<0.05) differences between gene abundances in three plot types of each chronosequence. Pearson correlation test was performed to check correlations between substrate chemical properties and gene abundances. Several indices were used for the estimation of total diversity of the studied prokaryotic communities (α-diversity). The Shannon index was calculated (H = S pi lnpi, where pi is the relative abundance of species i in the community). The Chao1 index and the phylogenetic diversity whole tree metric were calculated for characterization of the real number of OTUs in the prokaryotic community [37,38]; they were compared with the total number of observed OTUs. Data were normalized to 4090 sequences per sample.
The analysis of structural differences between prokaryotic communities (β-diversity) was performed using binary metrics of similarity-weighted UniFrac, unweighted UniFrac and Bray-Curtis metrics [39]. Based on weighted UniFrac distances, non-metric multidimensional scaling (NDMS) was carried out to construct diagrams of similarity in prokaryotic community structures. The significance of the differences between structures of prokaryotic communities on three stages of soil formation was estimated using ANOSIM (Analysis of similarities) script in QIIME (999 permutations).

Chemical properties of substrates on the studied plots
Both TOC and TN contents increased in correspondence with the stage of soil formation. For both sampling sites, unfixed sands and semi-fixed surfaces had extremely low organic carbon (0-0.18%) and nitrogen (0.02-0.04%) contents, while mature soils had much higher amounts of C and N ( Table 2). The difference in percentage of TOC between the studied plots was higher than the difference in N content. The amount of DNA recovered from samples increased along two chronosequences.

Ribosomal and N-cycle gene abundances in the two chronosequences
The gradual increase in ribosomal gene copy numbers was revealed in both chronosequences from the unfixed sand to the mature soil. At Site I, there was a statistically significant increase (p<0.001) of one order of magnitude in bacterial and fungal gene copy numbers per gram of substrate, while archaeal gene copy number increased by two orders of magnitude (Fig 2). The gene abundances in the samples from the semi-fixed surfaces were intermediate between those of the unfixed sand and the mature soil. An increase of two orders of magnitude in the number of all ribosomal genes was also observed for Site II; however, mean values of fungal gene copy numbers in the semi-fixed surface and in the mature soil did not significantly differ from each other (Fig 2, S1 Table).
Similar trends were observed for the distribution of functional genes along the chronosequences (Fig 3). For both sites, there was a two-order increase in the amount of all N-cycle genes from the unfixed sand to the mature soil. At Site I, the semi-fixed surface was more similar to the unfixed surface; at Site II, conversely, there was stronger similarity between the semifixed surface and the mature soil. The nirK gene, which is associated with denitrification, was the most abundant among the investigated functional genes in all samples, while amoA genes (associated with nitrification) were least abundant.
Bacterial gene abundance in the substrate correlated with TOC percentage (p<0.05), while archaeal gene abundance correlated with TN (p<0.05), and fungal gene abundance correlated with both TOC and TN (Table 3). All functional genes related to the nitrogen cycle were significantly correlated with TN (p<0.01 for nirK and nirS, p<0.005 for nifH and amoA).

Prokaryotic community structure among the chronosequences
In total, 261 161 sequences of the 16S rRNA gene were obtained (from 2458 to 21 196 sequences per sample) with a mean length of 292 bp. One sample was excluded from the analysis of α-diversity due to a low number of sequences.
Phyla Proteobacteria and Acidobacteria were predominant in all samples (up to 35% of relative abundance) (Fig 4). The taxonomic structure on the phylum level was similar for the prokaryotic communities in the two mature soils under different vegetation types. The comparison of abundances of different phyla showed that the relatively high abundances of Thaumarchaeota (up to 7% on Site I), Chloroflexi (up to 12%) and Cyanobacteria (up to 14% on Site II) were associated with unfixed sands, while Planctomycetes was more abundant in the mature soils and semi-fixed surfaces (Fig 4). The prokaryotic community structure of semifixed surfaces was intermediate between those of unfixed sands and mature soils. Chloroflexi was relatively more abundant in all samples of unfixed sands and semi-fixed surfaces. Genera Chamaesiphon (up to 9% relative abundance), Crinalium, Leptolyngbya and Stigonema belonging to phylum Cyanobacteria were most abundant in the unfixed sand from Site II. There was no significant correlations found between relative abundance of phyla and TOC and TN amounts, except Firmicutes (S2 Table).

The diversity of prokaryotic communities among the chronosequences
The highest prokaryotic α-diversity was found in mature soil from Site I (Table 4). Prokaryotic diversity indices for unfixed sand and the semi-fixed surface at Site I did not differ significantly.  Increased α-diversity was observed among the chronosequence at Site I, while no significant difference was revealed for prokaryotic communities at Site II due to high variation between indices for samples from each plot. The shift in prokaryotic community composition during the process of sand fixation was observed in both chronosequences (Fig 5). Both unweighted and weighted UniFrac analyses showed similar community compositions of the two mature soils and semi-fixed surfaces, while the sand samples formed separate clusters and differed from each other. In weighted UniFrac metrics, prokaryotic communities of unfixed sands were separated from those of semi-fixed surfaces and mature soils. Bray-Curtis metrics also showed that prokaryotic communities of the two sands were significantly different from the other samples. For all metrics (Bray Curtis, Weighted UniFrac, Unweighted UniFrac) the differences between prokaryotic communities on  three types of plots were significant (p<0.01) in both chronosequences. According to R test, the difference between soil formation stages is more pronounced on the Site II (with wooded tundra vegetation on the mature soil plot) comparing to the Site I (S3 Table).

Quantitative analysis of ribosomal and N-cycle genes in the two chronosequences
The observed ribosomal gene copy numbers both in semi-fixed surfaces and in mature soils correspond with the previously obtained data on the microbial population abundance for soils  and soil-like substrates in northern latitudes [40,41]. The higher abundance of bacterial ribosomal genes in comparison to the abundance of fungal genes in all samples can be explained by low diversity of plant communities at the studied plots; as previously shown in other studies, Fungi are more dependent on the plant communities of barren substrates than Bacteria [24]. However, the abundance of genes increased from unfixed sand to mature soil in both chronosequences, which can be an effect of plant community development and consequently higher available organic carbon and nitrogen contents. Available organic matter in soil is known to be a limiting factor of microbial community development [42], and a correlation between the quantity of ribosomal genes and soil organic carbon content was previously observed for other soils [43]. The abundances of all functional genes associated with transformation of nitrogen also increased in both chronosequences from unfixed sand to mature soil and correlated with total nitrogen content. In other studies, the abundance of nifH genes related to N 2 fixation was found to be correlated with total nitrogen content, nitrate and ammonium concentrations [44,45]. This finding is consistent with the study of metagenomes of Arctic tundra soil, where N-assimilation genes were present in all bacterial genomes in microbiomes of different types of polygonal landscapes [46]. Another study showed that in soils formed on glacial retreats, the nitrogen fixation rates significantly increased during the first 4-5 years of succession [15]. Thus, nitrogen-fixing bacteria are important for soil microbial community assemblage and functioning, especially in the tundra zone characterized by scarce vegetation and low nitrogen content. The lowest abundance of the bacterial amoA gene among all functional genes studied can be explained by the low amount of organic nitrogen in all samples. Bacterial amoA gene abundance in soil is known to be related to the available ammonia concentration [47]. In all samples, nirK gene abundance was the highest in comparison to other N-cycle genes. Genes associated with denitrification (nirK and nirS) were previously found to be more abundant than nifH and amoA in other soils [44]. The abundances of the two nitrite reductase-encoding genes (nirK and nirS) observed in this study were disproportionate. Similar trend with higher abundance of nirK genes comparing to nirS was previously observed in Arctic soils [12]. According to previous studies, nirK (copper nitrite reductase) is more widespread in terrestrial ecosystems, while nirS (cytochrome cd1 nitrite reductase) is more abundant in marine environments [48][49][50].
Thus, the obtained data for both Sites supported the hypothesis of increased gene abundance among the chronosequences. The similar dynamics of ribosomal and functional gene abundance in the two chronosequences can be explained by the congruent patterns of increased total organic carbon and total nitrogen from unfixed sands to mature soils on the two sites. Gene copy number indirectly indicates the biomass of different functional and taxonomic groups in soil microbial communities, which increases during primary succession on different types of barren substrates [7].

Changes of prokaryotic community structure during soil formation
The prokaryotic communities of all samples were dominated by phyla Acidobacteria and Proteobacteria, which was previously observed for other soils of the tundra zone [46,51].
Phylum Thaumarchaeota was found to be relatively more abundant in unfixed sands of both sites. All OTUs belonging to Thaumarchaeota were uncultivable Archaea and were previously observed in other terrestrial environments. Among all Archaeal phyla, Thaumarchaeota are known to be predominate in Arctic and Antarctic soils [51]. We suggest that the relative abundance of Thaumarchaeota, but not their absolute number, decreased from unfixed sands to mature soils because archaeal gene abundances in the unfixed sands were a hundred-fold lower than in the mature soils. It is also possible that archaeal OTUs from mature soils were not presented in the database and were ranked as unclassified.
Family Ktedonobacteraceae belonging to phylum Chloroflexi that were predominate in unfixed sands are known to be negatively correlated with organic matter content in deforested soil [52]. This family is mostly represented by uncultivable genera; its cultivable representatives are filamentous, aerobic and mesophilic [53]. Ktedonobacteraceae was previously found in young soils [23,54,55]. In this study, the relative abundance of Chloroflexi decreased with succession development, what is in accordance with previous findings [23].
Representatives of phylum Cyanobacteria were more abundant in the samples of unfixed sand from Site II, which can be explained by low levels of insolation of sand on Site I due to gravel cover. Cyanobacteria are known as free-living phototrophs capable of nitrogen fixation, especially in extreme environments [16,51,56]. Representatives of this phylum could be the primary producers of organic matter in unfixed sands due to the lack of organic carbon and nitrogen. A decrease in Cyanobacteria abundance and number of observed OTUs belonging to this phylum with soil age was previously observed for soils formed by glacial isostatic adjustment in Fennoscandia [57]. Genus Leptolyngbya was found on plots of unfixed sand on Site II. It is known as a producer of adhesive extra-cellular polysaccharides and organic acids that can degrade rock [58]. Thus, all these taxa inhabit barren substrates, and their active presence in the community can be considered an indicator of the primary stage of development of microbial succession.

Diversity changes among the chronosequences
The gradual increase of prokaryotic α-diversity from initial stages of sand fixation to mature soils was expected for both chronosequences. However, prokaryotic α-diversity increased from the unfixed sands to the mature soil on Site I, while prokaryotic communities of all samples on Site II did not follow the same pattern, and their α-diversities did not change with the successional stage. The obtained results partly contradict previously discovered trends of incremental growth of prokaryotic α-diversity during revegetation on moving dunes [10]. However, some studies reported the highest prokaryotic diversity at the early stages of soil formation [59,60]. Although the microbial biomass on barren substrates was relatively low, the high diversity of the unfixed sand prokaryotic community can be explained by the variety of necessary adaptations to harsh environmental conditions.
The prokaryotic community structures in samples of the two mature soils appeared to be very similar comparing to those in the unfixed sand samples. The estimation of β-diversity showed the same pattern of unevenness of prokaryotic communities in the unfixed sand samples. This observed dissimilarity could be a consequence of random propagule input in unfixed sands, while the developed vegetation on both mature soils allowed the formation of more stable prokaryotic communities. Another possible explanation of prokaryotic community differences in unfixed sands could be that the diversity depends on both richness and evenness [61][62][63], and microbiomes in unfixed sands were less even, than in mature soils.

Conclusions
Using a chronosequence approach, we found the expected trends in microbial populations: increased microbial community abundance and change of prokaryotic community structure from unfixed sands to mature soils. The highest prokaryotic diversity and abundance, as well as the amount of microorganisms involved in the nitrogen cycle, were revealed in mature soil under developed plant cover. However, the prokaryotic diversity during soil formation increased slightly, with the minimum values found in sand under pioneer vegetation (intermediate stages of succession). In contrast with our predictions, the analysis of β-diversity shown that prokaryotic communities under the unfixed sands were more dissimilar, than under vegetation. Therefore, plant colonization of aeolian sands in tundra multiplies and unifies the prokaryotic communities.
Supporting information S1