Differential Assemblage of Functional Units in Paddy Soil Microbiomes

Flooded rice fields are not only a global food source but also a major biogenic source of atmospheric methane. Using metatranscriptomics, we comparatively explored structural and functional succession of paddy soil microbiomes in the oxic surface layer and anoxic bulk soil. Cyanobacteria, Fungi, Xanthomonadales, Myxococcales, and Methylococcales were the most abundant and metabolically active groups in the oxic zone, while Clostridia, Actinobacteria, Geobacter, Anaeromyxobacter, Anaerolineae, and methanogenic archaea dominated the anoxic zone. The protein synthesis potential of these groups was about 75% and 50% of the entire community capacity, respectively. Their structure-function relationships in microbiome succession were revealed by classifying the protein-coding transcripts into core, non-core, and taxon-specific transcripts based on homologous gene distribution. The differential expression of core transcripts between the two microbiomes indicated that structural succession is primarily governed by the cellular ability to adapt to the given oxygen condition, involving oxidative stress, nitrogen/phosphorus metabolism, and fermentation. By contrast, the non-core transcripts were expressed from genes involved in the metabolism of various carbon sources. Among those, taxon-specific transcripts revealed highly specialized roles of the dominant groups in community-wide functioning. For instance, taxon-specific transcripts involved in photosynthesis and methane oxidation were a characteristic of the oxic zone, while those related to methane production and aromatic compound degradation were specific to the anoxic zone. Degradation of organic matters, antibiotics resistance, and secondary metabolite production were detected to be expressed in both the oxic and anoxic zones, but by different taxonomic groups. Cross-feeding of methanol between members of the Methylococcales and Xanthomonadales was suggested by the observation that in the oxic zone, they both exclusively expressed homologous genes encoding methanol dehydrogenase. Our metatranscriptomic analysis suggests that paddy soil microbiomes act as complex, functionally coordinated assemblages whose taxonomic composition is governed by the prevailing habitat factors and their hierarchical importance for community succession.


Introduction
Comparative taxonomic profiling of microbial communities, or microbiomes, has been carried out to measure structural responses to changing environmental conditions. Using the 16S rRNA gene, correlation between microbiome composition and selective pressures were detected in various habitats with different environmental or geographic characteristics [1], [2], [3]. Microbiomes in nature exhibited interlineage associations, ranging from specific grouping of a few lineages to large assemblages with shared habitat preferences [4]. These studies showed that microbiomes possess compositional patterns that clearly deviate from a random distribution [5]. However, both theoretical and experimental community ecology and functional equivalence hypothesis between multiple species predict that identical environmental conditions can select for taxonomically different communities [6], [7]. Conservation of functional profiles among different microbiota corroborated that functional rather than taxonomic diversity represents ecological community [8]. Because both microbiome structure and function are strongly influenced by the environment, it is critical to identify the fundamental ecological processes or key factors that govern microbiome succession.
A considerable amount of bacterial genomes is dedicated to shaping the organisms' habitats and maintaining their ecosystem [9]. Cultivation-independent genomic approaches, such as metagenomics, have greatly advanced our understanding of taxonomic and genetic diversity and ecological potential of microbial communities [10]. However, these approaches provide limited information on the functional significance of the observed genes. Metatranscriptomics, which investigates the transcriptional expression of the metagenome by large-scale RNA sequencing, gains insight into the in situ activities of microbiome functioning in response to environmental cues [11], [12], [13], [14]. For example, community-wide responses of marine microbial assemblages to substrate amendments (e.g., dimethylsulfoniopropionate, polyamines putrescine and spermidine) were investigated by metatranscriptomics to characterize the functional activity of members capable of utilizing these substrates [15], [16], [17]. A recent soil metatranscriptome study revealed that the degradation of the pollutant phenanthrene is initiated in accordance with the increased expression of dioxygenase, stress response and detoxification genes [18]. Furthermore, metatranscriptomics showed to be quantitative as deduced from the observation that the transformation of the herbicide atrazine proportionally increased with the abundance of transcripts implicated in the corresponding biochemical reactions [19]. Apparently, metatranscriptomics makes it possible to relate community dynamics to specific environmental cues, while gene-based approaches reveal structural and functional diversity accumulated by historical changes in environmental conditions [20].
Here, we employed metatranscriptomics to reveal structure-function relationships in microbiome succession in flooded paddy soil. Microorganisms in this ecosystem experience periodic cycles of flooding and drainage during plant cultivation. Flooding creates oxygen-limited conditions with different physico-chemical properties such as oxic surface layer, anoxic bulk soil, and rhizosphere. Distinct activities and spatial distribution of functional groups of microorganisms are established between these compartments [21], [22]. Microbial activities determine the balance between methane oxidation in the oxic surface layer and methane production in the anoxic bulk soil, resulting in net emission of this greenhouse gas to the atmosphere. Hence, paddy soil microbiomes exposed to the contrasting oxygen conditions (oxic surface layer vs. anoxic bulk soil) appeared to be a suitable model system to comparatively explore structural and functional succession driven by different environmental cues.
The relative contribution of taxonomic groups to the ribosomal metatranscriptome can be considered a function of population size and metabolic activity [23], [24]. Accordingly, Blazewicz et al. [25] proposed that cellularly expressed ribosomal RNA is an excellent indicator for the protein synthesis potential of microbial populations. Based on this concept, we identified the taxonomic groups that exhibited greatest protein synthesis potential during the different rice plant growth stages in flooded paddy soil. In the ripening stage, the metabolic activities in oxic and anoxic paddy soil were investigated by classifying the functional metatranscriptomes into core, non-core, and taxon-specific transcripts based on homologous gene distribution between these dominant taxonomic groups. Ubiquitous gene sets among community members differentiate the characteristics of a particular community from those of other communities, while they cannot be indicative of specific features of individual community members. Our approach allowed us to identify the ecological roles of individual taxonomic groups and biochemical coordination between community members, thereby revealing the ecological processes that drive differential microbiome succession in flooded paddy soil.

Rice microcosms
Paddy soil samples were provided by the Italian Rice Research Institute in Vercelli, Italy (GPS coordinates: N45.3202, E8.4186). No specific permission was required for the sampling location. Soil was taken from a drained paddy field in spring 2011 and was air dried and stored at room temperature. The soil was sieved (<2 mm) prior to use. The study did not involve endangered or protected species. Using a nylon bag to separate roots from bulk soil, rice plants (Oryza sativa var. KORAL type japonica) were grown in microcosms in the greenhouse as previously described by Shrestha et al. [26]. After transplantation of rice seedlings into the flooded microcosms, samples were taken from the surface layer (oxic zone) and the bulk soil (anoxic zone) at different incubation periods (25,45, and 90 days). These incubation periods respectively correspond to the following plant growth stages: tillering, flowering, and ripening. The upper 2-mm surface layer was sampled using a spatula, after removal of the floodwater. Previous studies have shown that the upper 2-mm surface layer represents the oxic zone and oxygen is depleted from 200 μM at the surface layer to undetectable amounts beneath a depth of approximately 2 mm [27]. Dark gray bulk soil beneath the bottom of the nylon bag was collected, in a depth of approximately 6-8 cm. Aliquots of one gram of wet soil were transferred into 2-ml tubes and immediately shock-frozen in liquid nitrogen. Soil aliquots were stored at -80°C overnight and subjected to RNA extraction at the following day. Two independent microcosms were sampled for the 45-and 90-day incubation periods, while a single microcosm was used for the 25-day incubation period.

Metatranscriptome library preparation
Total RNA was extracted from three soil aliquots of each of the 25-and 45-day-old microcosms using the method described by Mettel et al. [28]. The three aliquots were pooled, resulting in microcosm-specific extracts of total RNA for both the oxic and anoxic zones. A total of 12 soil aliquots were collected from each of the two 90-day-old microcosms and used for the extraction of total RNA, in order to have sufficient amount of RNA for mRNA enrichment. mRNA was enriched using the Ribo-Zero rRNA removal kit (Meta-Bacteria) according to the manufacturer's instructions (Epicentre Biotechnologies, Madison, WI). The concentration of RNA was measured using Qubit RNA assay kit (Invitrogen, Eugene, OR). The integrity of total RNA and depletion of rRNA in enriched mRNA was checked by Experion RNA HighSens chips (Bio-Rad, Hercules, CA). The cDNA libraries of both total RNA and enriched mRNA were constructed by random priming using NEBNext mRNA Library Prep Reagent Set for 454 (New England BioLabs, Ipswich, MA) and tagged using the GS FLX Titanium Rapid Library MID adaptors (454 Life Sciences, Brandford, CT) according to the manufacturers`manuals.
The cDNA libraries constructed from total RNA of the three different incubation periods were mixed in equal molar ratio and sequenced on the 454 GS Junior system, following the protocol of the manufacturer (454 Life Sciences). The cDNA libraries of enriched mRNA were sequenced using the 454 GS FLX instrument (454 Life Sciences) at the Max Planck Genome Centre Cologne (Germany). The raw 454 reads of each cDNA library have been deposited under the study number PRJNA215834 in the NCBI Sequence Read Archive.

Bioinformatic analysis
The raw 454-pyrosequencing reads were preprocessed based on quality scores using PRINSEQ [29]. Low-quality reads were removed if one of the following criteria was met: length < 200 bp, mean quality score < 20, or more than 1% of ambiguous signals (N). Simple repeat sequences like homopolymers were eliminated when the complexity score, calculated by DUST algorithm, was above 7 [30]. Exact duplicate and reverse complimentary reads of the same length were defined as artificial replicates and discarded [31]. Terminal base calls at the 3'-end were trimmed off if the quality score was below 10.
Using BLASTN, the preprocessed reads derived from total RNA were screened against SILVA small and large subunit ribosomal RNA databases (SSU Ref and LSU Ref) [32], using an e-value cutoff of 1e-10. They were classified into SSU rRNA-tags, LSU rRNA-tags, or non-rRNA-tags. If reads had significant matches in both SSU Ref and LSU Ref, they were assigned according to where their best hit alignment had a higher bit score, using a custom-coded Python script. The SSU rRNA-tags were selected to determine the microbiome composition. They were grouped into operational taxonomic units (OTUs) by mapping them to sequences of the non-redundant (nr) SILVA SSU Ref database, which was dereplicated with 99% identity cutoff. Using BLAST, the mapping cutoff was 5% sequence divergence. SSU rRNA-tags mapped to the same reference sequence were grouped into the same OTU. Mapped reference sequences were selected to represent the respective OTUs. Richness estimates and diversity indices were computed using QIIME [33].
Preprocessed reads of enriched mRNA were differentiated into rRNA-tags and non-rRNAtags as described above. The non-rRNA-tags were compared against the Rfam database using INFERNAL with default parameters to eliminate small RNA-derived reads [34]. The remaining reads were defined as mRNA-tags and subjected to BLASTX searches against NCBI nr protein database, using a bit score cutoff of 50. The annotated mRNA-tags were functionally classified into SEED subsystems and taxonomically classified based on top hit and the lowest common ancestor (LCA) using MEGAN4 [35]. Statistical analysis based on functional profiles was done using STAMP software package [36]. Taxon-specific protein databases were constructed by searching the NCBI nr protein database with name of a particular taxon in the organism field. The mRNA-tags were compared by BLASTX independently to each taxon-specific database using a bit score cutoff of 50. The BLAST outputs were parsed using custom-made Python scripts to classify the mRNA-tags into core, non-core, and taxon-specific data sets on the basis of the presence or absence of homologous genes in the taxon-specific databases (S1 and S2 SuppInfo). The mRNA-tags that had no homolog in any of the taxon-specific protein databases were subjected to BLASTX search against the NCBI nr protein database for taxonomic classification.

Basic statistics of metatranscriptome sequencing
Total RNA extracted from the oxic and anoxic zones of flooded rice microcosms was analyzed by 454-pyrosequencing at different incubation periods (25, 45 and 90 days). This approach aimed at identifying the microbial groups contributing most to the cellularly expressed rRNA pools. Across all samplings, a total of 35,851 and 27,304 reads of small subunit ribosomal RNA (SSU rRNA-tags) were obtained for the oxic and anoxic zones, respectively. The average read lengths ranged from 423 bp to 469 bp (S1 Table). Microcosms with rice plants in the ripening stage (90 days) were used to comparatively explore the in situ microbiomes`functioning. 454-pyrosequencing of enriched mRNA resulted in 73,974 and 464,906 raw reads from the oxic and anoxic zone, respectively. Reads derived from rRNA and small non-coding RNA were excluded from further analysis. After processing of the raw data, 48,816 reads (oxic zone) and 92,245 reads (anoxic zone) were classified as protein-coding transcripts (mRNA-tags) (S2 Table).

The microbiomes' taxonomic compositions
The random SSU rRNA-tags were clustered into OTUs using a sequence identity cutoff of 95%. Rarefaction curves revealed that, on average, 35% of the estimated genus richness was covered (S1A Fig). About 20% of SSU rRNA-tags could not be mapped using the 95% identity cutoff and were identified to represent novel genera. However, the decrease of the sequence identity cutoff to 90% and 85% increased the mapping efficiency up to 98% (S1B Fig). This clearly showed that microbial diversity at higher taxonomic ranks, such as family or order level, is confidently explored by the given set of SSU rRNA-tags.
The taxonomic composition of the microbiomes established in the oxic and anoxic zones greatly differed, even at phylum level ( Fig 1A). Spatial separation between the oxic surface layer and anoxic bulk soil explained more than 70% of the compositional variance, as evidenced by principal coordinate analysis ( Fig 1B). The effect of incubation time was less than 9%, thereby suggesting that the community composition was stable over the different plant growth stages. Despite the high taxon richness, the microbiomes in both oxygen zones were dominated by a limited number of taxonomic groups. Members of the Cyanobacteria, Fungi, Xanthomonadales, Myxococcales, and Methylococcales were the most abundant and metabolically active taxa in the oxic zone, while Clostridia, Actinobacteria, Geobacter, Anaeromyxobacter, Anaerolineae, and Euryarchaeota were most prevalent in the anoxic zone. Each of these taxa persistently contributed > 5% to the total SSU rRNA-tags or showed significant fold increase in their relative abundance over incubation time (Fig 2). They will be referred to as the dominant groups. Collectively, they accounted for up to 75% (oxic zone) and 50% (anoxic zone) of total SSU rRNA-tags, suggesting that they occupied crucial roles in microbiome functioning related to their high protein synthesis potential [25]. Representative groups that contributed < 3% to the total SSU rRNA-tags at the order level were as follows: Acidimicrobiales, Burkholderiales, Sphingobacteriales, and Verrucomicrobiales in the oxic zone, and Acidobacteriales, Bacillales, and Syntrophobacterales in the anoxic zone. Some order level groups were detected in both oxygen zones, including Rhizobiales and Rhodospirillales. Highly similar taxonomic patterns between biological replicates corroborated experimental reproducibility of RNA extraction and sample-specific analysis of total RNA by 454-pyrosequencing. Due to the near-steady-state community organization of major taxa throughout incubation period, functional succession in the oxic and anoxic zones was assessed only in 90-day-old microcosms using combined libraries of two biological replicates.

SSU rRNA versus mRNA: a taxonomic perspective
In a conventional approach, the taxonomic origin of mRNA-tags was deduced from the (i) top hit and (ii) lowest common ancestor (LCA) of maximum 100 BLASTX hits in NCBI nr protein database (S2 Fig). We made an attempt to quantify the discrepancy between the taxonomic patterns derived from SSU rRNA and mRNA. Compositional dissimilarity between the taxonomic patterns was calculated at the phylum level using Bray-Curtis indices (Fig 3). Unexpectedly, the taxonomic composition of the mRNA-tags from the oxic zone was more dissimilar to its corresponding rRNA-tags than to the taxonomic compositions of the rRNA-tags and mRNAtags from the anoxic zone. Overall, the patterns were not separated according to the oxic or anoxic zone, although mRNA-tags and SSU rRNA-tags of each zone were obtained from the same extract of total RNA. Additionally, there was only a weak correspondence between the relative abundances of the dominant groups in the rRNA and mRNA data sets (Fig 4A). Except for Actinobacteria, all the dominant groups were highly underrepresented in mRNA-tags relative to their contribution to rRNA-tags. In fact, these taxonomic discrepancies are inevitable Temporal changes in the relative abundance of the dominant taxonomic groups in the SSU rRNA-tags over incubation period. Each of the dominant groups persistently contributed > 5% to total SSU rRNA-tags or showed significant fold changes in their relative abundance during incubation, while no other taxonomic group accounted for greater than 3% at order level. The abundance cutoff was applied to the lowest taxonomic rank possible, explaining why the taxonomic ranks vary from genus to phylum levels.
doi:10.1371/journal.pone.0122221.g002 due to the following reasons: (i) taxonomic bias of mRNA-tags towards phyla that are overrepresented by many sequenced genomes [37], (ii) horizontal gene transfer events between phylogenetically distant microorganisms [38], and (iii) high sequence conservation in house-keeping genes and different evolutionary rates among functional groups of proteins [39]. However, a significant underrepresentation of Cyanobacteria in protein-coding transcripts has also been observed in the metatranscriptomes of marine waters and photosynthetic microbial mats, despite the fact that the Cyanobacteria residing in these environments are well represented by genome and metagenome sequences [40], [41]. The cyanobacterial example implies that incongruent representation of taxa between rRNA and mRNA pools could have reasons other than difficulties in taxonomic classification of mRNAs. To address the question of  The relative abundance of the taxon-specific mRNA-tags for each of the dominant groups was normalized to that of the entire microbiome by taking into account the proportion of the dominant group-derived mRNA-tags on total mRNA-tags (see Fig 5). Linear regression curves were generated with intercept value of 0 for the oxic zone (red line) and anoxic zone (blue line). underrepresentation, we assessed how widely homologs of the expressed genes are distributed among the dominant groups and aimed to extract mRNA-tag data sets that can be specifically assigned to particular taxonomic groups.
Defining core, non-core and taxon-specific mRNA-tags On the basis of their contribution to the SSU rRNA-tag data sets, the protein synthesis potential of the dominant groups was estimated about 75% and 50% of the total microbiome in the oxic and anoxic zone, respectively. Given that (i) 10 8 to 10 9 bacterial cells are present in one gram of paddy soil [42], (ii) bacterial cells contain a number of genes ranging from about 500 to 9000 and (iii) multiple transcript copies are being produced from most genes in active cells [43], we anticipated that the mRNA-tags obtained from a major, but still limited, sequencing effort will be derived primarily from the dominant groups. Analysis of the transcriptional activity of the low-abundant taxanomic groups was not the objective of this study and would have required a much greater sequencing effort. Therefore, instead of searching against NCBI nr protein database, we first compared mRNA-tags to the taxon-specific protein databases that had been constructed for each of the dominant groups (Table 1). Overlaying the annotation profiles of the taxon-specific protein databases classified mRNA-tags into core, non-core and taxon-specific mRNA-tags based on homologous gene distribution (S3 Fig). The mRNA-tags shared between all of the dominant groups or, at least, all bacterial members were defined as core mRNA-tags. Confinement to only bacterial members was made because sequence variations among the three domains of life may be too great to detect all gene homologs [44]. Among the non-core mRNA-tags, those unique to only one of the dominant groups were defined as taxon-specific mRNA-tags.
Among the mRNA-tags functionally annotated, 77.2% (oxic zone) and 75.2% (anoxic zone) had homologs with one or several of the dominant groups (Fig 5). The normalized representation of the dominant groups was prominently increased among the taxon-specific mRNA-tags relative to their abundance among SSU rRNA-tags (Fig 4B), suggesting that the protein synthesis potential of a particular taxonomic group within a complex community is well reflected by Table 1. Size of the taxon-specific protein databases derived for the dominant groups from NCBI nr protein database and the number of mRNAtags that showed BLASTX hits to the respective database. the expression level of genes unique to that group. The increased representation of the dominant groups in taxon-specific mRNA-tags is assumed to be due to the exclusion of highly conserved genes of uncertain taxonomic origin and of most horizontally transferred genes. Given the sequencing effort applied, our study presumably provides an insight into the transcripts highly expressed by the dominant groups in the two replicate microcosms used for functional metatranscriptomics. Analysis of single composite samples by 454-pyrosequencing has been applied in various studies to compare community-wide transcriptional activity between different environmental conditions [18], [45].
Functional profiles of core and non-core mRNA-tags The core mRNA-tags accounted for 38.5% and 24.1% of the dominant group-derived mRNAtags in the oxic and anoxic zone, respectively (Fig 5). Both predicted function and alignment position were highly conserved for most core mRNA-tags among their homologs, although some homologs were annotated as hypothetical proteins. Apparently, the core mRNA-tags were expressed from highly conserved ubiquitous genes, representing functional features exhibited by all the dominant groups in the respective oxygen zone. We examined the functional profiles of the core mRNA-tags using SEED subsystem-based annotation and compared them with those of the non-core mRNA-tags. The core mRNA-tags mostly encoded basic cellular machineries, such as protein and RNA metabolism and regulation and cell signaling, while the non-core mRNA-tags were more frequently implicated in carbohydrate, amino acids and derivatives, and lipid metabolisms (Fig 6). Principal component analysis, however, showed that the functional profiles of the core mRNA-tags remarkably differed between the two microbiomes. Essential capabilities to adapt to either aerobic or anaerobic conditions were considerably represented in the core mRNA-tags, but differentially expressed between the oxic and anoxic zones (e.g., oxidative stress vs. fermentation). Taken together, the core mRNA-tags were revealed to encode not only basic cellular machineries but also cellular functions that are indispensible to be metabolically active under the prevailing conditions. Differential expression of core features between the two microbiomes pointed towards different oxygen availability as a key determinant for structural succession. Carbon bioavailability is one of the primary factors affecting composition and activity of soil microbiomes [46], [47]. A large fraction of the mRNA-tags related to carbohydrate metabolism was classified into non-core mRNA-tags. In general, a wider range of enzymatic reactions that, within the KEGG pathways, belong to carbohydrate metabolism were affiliated with noncore mRNA-tags than with core mRNA-tags (Fig 7). As shown in the following sections, knowledge of the distribution of non-core mRNA-tags among the dominant groups provides insight into how soil microbiomes vary in functional complexity and maintain their heterotrophic lifestyle using a variety of carbon sources as substrates. Enzymatic reactions involved in central carbon metabolism including 'glycolysis/gluconeogenesis', 'pyruvate metabolism', and 'citrate cycle', however, were detected in the core-mRNA-tags with higher coverage and quantity than other pathways of carbohydrate metabolism. These core pathways are highly conserved and universally present in most organisms to generate ATP from glucose metabolism or to serve for anaplerotic reactions. In addition to carbohydrate metabolism, SEED subsystems related to carbon cycling, such as 'photosynthesis', 'serine-glyoxylate cycle', 'methanogenesis', and 'pyruvate metabolism II: acetyl-coA, acetogenesis from pyruvate', were differentially represented between the non-core mRNA-tags of the oxic and anoxic zones (Fig 6). These observations substantiate that carbon sources affect microbiome composition by activating or enriching groups of microorganisms able to utilize particular substrates.
Taxon-specific mRNA-tags from the oxic zone A large fraction of the Cyanobacteria-specific mRNA-tags was related to chlorophyll metabolism and photosystem I and II, revealing high photosynthesis activity under light (Fig 8). Transcripts for nitrate ABC transporter and nitrate/nitrite reductases suggest that Cyanobacteria assimilated nitrate as an inorganic nitrogen source [48], [49]. The cyanobacterial assemblage was mainly composed of small unicellular Synechococcus spp. Both prevalence of Synechococcus and functional annotation of the taxon-specific transcripts indicate that Cyanobacteria coupled oxygenic photosynthesis with nitrate cycling. Photosynthetic activity agrees well with previous findings of increased O2 concentrations during the daytime, fulfilling the high biological oxygen demand for methane oxidation and organic matter degradation in the oxic zone [50]. In addition, the expression of genes encoding exopolysaccharide synthesis and proteins for polysaccharide translocation demonstrates that carbon fixed by photosynthesis was converted into polysaccharides and presumably supplied to other community members. Principal component analysis of the core mRNA-tags (circle) and non-core mRNA-tags (triangle) based on the functional expression profiles using SEED subsystems. Samples indicated in red and blue relate, respectively, to the oxic and the anoxic zone. Differential expression of subsystems between two samples was calculated using two-sided Chi-square test with Yate's correction. Level 1 subsystems which were most significantly overrepresented in either core or non-core mRNA-tags (P-value < 0.01) are shown in a horizontal comparison. Level 2 subsystems showing differential expression between the oxic and anoxic zones are displayed by vertical comparison.   Members of the Myxococcales mainly belonged to Polyangiaceae, Nannocystaceae, and Haliangiaceae as derived from the SSU rRNA-tags, implying a saprophytic life style [51], [52]. Taxon-specific expression of various cellulolytic enzymes, including glucanase, amylase, glucosidase, chitosanase, and xylosidase, as well as N-acetylmuramoyl-L-alanine amidase that hydrolyzes amide bonds in cell wall peptidoglycan agrees well with their known capabilities of feeding on cells and recycling carbon and nitrogen from biomolecules under nutrient limitation in soil [53]. Secondary metabolite production, such as polyketides and insecticidal toxin complex, was found to be active only among members of the Myxococcales, while antibiotic resistance activity was observed for all of the dominant groups. Polyketide is a secondary metabolite of various structural types that can be transformed into antibiotics (e.g., rapamycin, erythromycin, and lovostatin), underlining the potential of the soil microbiome to be a reservoir of bioactive molecules [54].
One of the biogeochemical characteristics in the oxic zone is methane oxidation. The amount of methane that is oxidized prior to emission to the atmosphere, is in the range of 45% to 60% of the methane produced in the anoxic zone [55]. Despite the relatively minor contribution of Methylococcales to the SSU rRNA-tags of the dominant groups, a high transcript abundance of genes encoding particular methane monooxygenase and methanol dehydrogenase was detected. High gene expression of ecologically relevant processes performed by low-abundant organisms was frequently observed in natural environments. For example, Crenarchaeota were found to be very active in ammonia oxidation as derived from the normalized expression levels of Crenarchaeota-related amt and amoABC genes [56].
Transcripts encoding filamentous hemagglutinin and xylose isomerase were frequently detected among the Xanthomonadales-specific mRNA-tags. Hemagglutinin is involved in plant tissue attachment and biofilm formation [57]. Xylose isomerases catabolize xylan, which is the main building block of hemicellulose. The co-expression of these genes suggests that Xanthomonadales play a direct or indirect role in the decomposition of plant material, presumably in concerted action with other cellulose-degrading organisms such as Fungi. As deduced from forest soil eukaryotic metatranscriptome data [45], Fungi expressed enzymes involved in the turnover of plant biomass such as pectin depolymerase, xylosidase, and glycoside hydrolases. Signature transcripts of eukaryotic organisms, like those encoding myosin and splicing factor, were detected exclusively in our Fungi-specific mRNA-tags.

Metabolic interactions revealed by non-core mRNA-tags
Members of the Xanthomonadales have frequently been found in relatively high abundance in environments where methane oxidation occurs such as peatlands and fruitland overlying a coalbed methane seep [58], [59]. In our study, most of the SSU rRNA-tags affiliated with Xanthomonadales had the highest sequence identity with an uncultured bacterial 16S rRNA gene, which was obtained from stable isotope probing experiments using 13 CH 4 as a substrate [60]. The relative abundance of the Xanthomonadales-related SSU rRNA-tags increased with incubation period, concomitantly with those of Methylococcales (Fig 2). These observations led us to speculate that there may be a metabolic link between these two groups. Taxon-specific mRNA-tags did not provide an explanation for the concurrent occurrence of Xanthomonadales with methaneoxidizing bacteria. However, transcripts for methanol dehydrogenase shared exclusively by Methylococcales and Xanthomonadales were detected in the non-core mRNA-tags, suggesting in situ methanol oxidation activities of both taxa. As methanol is the first metabolic intermediate of methane oxidation by methanotrophic bacteria, this homolog distribution pattern of methanol dehydrogenase transcripts allows us to assume that members of the Xanthomonadales feed on methanol produced by and released from Methylococcales. Such cross-feeding of carbon via methanol or formate has previously been observed to occur between methanotrophs and methylotrophs [61]. Microorganisms involved in cross-feeding would require homologous genes to utilize the shared metabolic intermediates. Investigation of non-core, but not taxonspecific, mRNA-tags could provide insights into the metabolic network and/or food chain transfer in complex microbiomes.
Taxon-specific mRNA-tags from the anoxic zone Under anaerobic condition in the dark, different microbial guilds interact to decompose complex organic matters to methane, and microorganisms gain energy by anaerobic respiration or fermentation [62]. The taxon-specific mRNA-tags should allow us to identify particular roles of the dominant groups in the anaerobic food chain (Fig 8). The Actinobacteria-specific mRNA-tags encoded diverse cellulolytic enzymes, such as cellulase, xylosidase, cutinase and lignin peroxidase, as well as glycoside hydrolase. Among the mRNA-tags derived from the minor taxonomic groups (< 3% of SSU rRNA-tags), multiple transcripts for acetyl xylan esterase were affiliated with Acidobacteria. The esterase activity leads to the solubilization of xylan [63]. Thus, it is reasonable to assume that Actinobacteria and Acidobacteria play important roles in the decomposition of lignocellulosic materials in plant cell walls. In addition, transcripts involved in antibiotic production, including lantibiotic modifying enzyme and polyketide synthase, were expressed in the anoxic zone only by Actinobacteria. Notably, Actinobacteria in the anoxic zone and Myxococcales in the oxic zone occupied comparable functional roles, being on the one hand primary degraders of complex organic matters and on the other hand secondary metabolite producers (S4 Fig). Among the Clostridia-specific mRNA-tags, transcripts encoding diverse types of glycoside hydrolase as well as cellulosome components, including cellulosome-anchoring protein and endoglucase, suggest cellulose degradation activity. Fermentation activity was also revealed by high abundance of transcripts for sugar ABC transporter extracellular solute binding protein.
Clostridial species, like C. acetobutylicum and C. thermocellum, transport sugars into cells by either phosphoenolpyruvate(PEP)-dependent phosphotransferase system or ATP-dependent transporter system [64], [65]. Although many transcripts related to fermentation pathways were present in the core mRNA-tags, high taxon-specific expression of sugar transport systems indicates that degradation products from cellulose hydrolysis were utilized primarily by Clostridia, rather than by other dominant groups. High abundance of S-layer transcripts suggests ongoing cell proliferation and relates well to the gradual increase of Clostridia among the SSU rRNA-tags with incubation period (Fig 2). Given that the production of S-layer proteins is energetically expensive [66], Clostridia were metabolically highly active, corroborating that this group is of significant importance for the anaerobic breakdown of polymers in flooded paddy soil.
In anoxic paddy soil, the oxidation of acetate by Geobacter is coupled with the dissimilatory reduction of iron [67], [68]. Transcripts encoding type IV pili and type IV pilus secretin were homologous to unique gene sequences from Geobacter metallireducens, whose expression products are involved in extracellular electron transfer [69], [70]. The notable abundance of flagellin and cytochrome C transcripts indicated the motility of Geobacter spp. to access and reduce Fe(III) oxide [71]. The functional expression profile of the Anaeromyxobacter-specific mRNA-tags obviously differed from that of Myxococcales in the oxic zone (Fig 8, S4 Fig). While Anaeromyxobacter exhibited strong transcript activity in aromatic compound degradation, predation and secondary metabolite production activities that are typical of other myxobacteria were not detected. The expression of different types of motility was also evident. Type-IV pilus-based social (S-) motility and flagellum-mediated motility was highly expressed by Anaeromyxobacter, whereas transcripts encoding adventurous (A-) gliding motility was a characteristic of the myxobacteria inhabiting the oxic zone.
Most of the SSU rRNA-tags derived from Euryarchaeota were affiliated with methanogenic archaea. The most abundant transcript species among the Euryarchaeota-specific mRNA-tags encoded methyl coenzyme M reductase (MCR), which catalyzes the terminal step of methane formation. Most MCR transcripts were derived from Methanosarcina and Methanosaeta, the two different types of acetoclastic methanogens. Additionally, all archaeal S-layer protein transcripts were derived from Methanosaeta spp. (S5 Fig). Both the gradual increase in the relative abundance of SSU rRNA-tags related to Clostridia and acetoclastic methanogens with incubation period and the functional annotation of their taxon-specific mRNA-tags suggest a close metabolic link between the two groups, with acetate as the key metabolic intermediate [71]. Putative roles of Anaerolineae could not be inferred, primarily due to the high abundance of hypothetical proteins (~63%) among their taxon-specific transcripts.

Concluding Remarks
Metatranscriptome data of natural microbial communities are largely derived from highly conserved genes which are involved in basic cellular maintenance [72]. Due to the high sequence conservation and ubiquity, transcripts of these genes are frequently affiliated with incorrect taxonomy. Their high abundance in metatranscriptomes makes it difficult to detect ecologically significant expression of genes that are present in low abundance in metagenomes. This is despite the fact that the expression level of such genes, normalized to gene abundance within community, is significantly higher than that of highly conserved genes [72]. Classification of transcripts based on homologous gene distribution allowed us to identify the functional signatures of active populations and the metabolic contributions of individual taxa to communitywide functioning. This approach provided insights into the ecological process of microbiome succession in response to the prevailing conditions. Cyanobacteria, Fungi, Xanthomonadales, Myxococcales, and Methylococcales were the most abundant and metabolically active groups in the oxic zone, while Clostridia, Actinobacteria, Geobacter, Anaeromyxobacter, Anaerolineae, and methanogenic archaea dominated the anoxic zone. They were stably maintained throughout the incubation period. Distinct taxonomic composition between the two oxygen zones was related to the expression of different sets of highly conserved and ubiquitous genes, whose expression is essential for metabolic activity in the respective zone. The differences in the core features between the two microbiomes suggest that different oxygen availability is the necessary condition in determining structural succession.
Although metabolic traits and most substrate utilization capacities are known to be highly redundant among microbial groups, we observed limited functional redundancy among the dominant groups of either oxic or anoxic zone. Some functions were identified to be expressed in both the oxic and anoxic zones, but by different taxonomic groups. For instance, polymer degradation and secondary metabolite production were performed by either Myxococcales (oxic zone) or Actinobacteria (anoxic zone). While Fungi and Xanthomonadales were suggested to cooperate in the decomposition of plant materials, most of the dominant groups in the anoxic zone collaboratively transformed organic matter. These functions were maintained in both microbiomes, given that plant materials are one of the major carbon sources in soil and antibiotics are essential to manage interaction networks within a natural microbial community [73]. Preselection of active populations by different oxygen availability results in distinct sets of taxa that carry out similar functions during successional change in the two oxygen zones. This finding corroborates that microbiome functioning cannot be determined solely on the basis of taxonomic composition or genetic potential [74], [75].
By contrast, photosynthesis and methane oxidation carried out respectively by Cyanobacteria and Methylococcales were unique to the oxic zone. Methane production by methanogenic archaea and aromatic compound degradation by Anaeromyxobacter was a characteristic of the anoxic zone. Nutrient or energy sources that are exclusively or majorly available in one of the two zones, such as light in the oxic surface layer or acetate in the anoxic bulk soil, induced the activities of particular taxonomic groups which are capable of assimilating or metabolizing these resources. While oxygen led to a prescreening of adaptable taxa, nutrient and energy sources determined the expression of metabolic diversity on spatial and temporal scale. Each of the dominant groups was revealed to serve as a functional unit activated by a relevant energy or nutrient source. Microorganisms embedded in a complex community exhibit different gene expression profiles towards energetically favored or taxonomically specialized metabolic reactions, partly owing to metabolite exchange. Systematic assemblages of functional units benefit from metabolic networks to access resources and niches that cannot be occupied by the activity of individual groups. S2 SuppInfo. A Python script to classify putative mRNA-tags into core, non-core, and taxon-specific transcripts based on homologous gene distribution. (PY) S1 Table. Statistics of ribosomal metatranscriptome libraries. Total RNA was extracted from two independent replicate microcosms, except for the 25-day incubation period. The proportion of exact duplicates and preprocessed reads is given in parenthesis. The values were calculated in relation to the total number of raw reads. Exact duplicate reads were omitted from further analysis. The proportions of reads derived from SSU rRNA, LSU rRNA, and non-rRNA were calculated in relation to the total number of preprocessed reads. (DOCX) S2 Table. Statistics of functional metatranscriptome libraries. The proportions of exact duplicate and preprocessed reads were calculated in relation to the total number of raw reads. Exact duplicate reads were omitted from further analysis. The proportions of reads derived from rRNA, small RNA, and putative mRNA were calculated in relation to the total number of preprocessed reads. (DOCX)