Long-term taxonomic and functional divergence from donor bacterial strains following fecal microbiota transplantation in immunocompromised patients

Immunocompromised individuals are at high risk of developing Clostridium difficile-associated disease (CDAD). Fecal microbiota transplantation (FMT) is a highly effective therapy for refractory or recurrent CDAD and, despite safety concerns, has recently been offered to immunocompromised patients. We investigated the genomics of bacterial composition following FMT in immunocompromised patients over a 1-year period. Metagenomic, strain and gene-level bacterial dynamics were characterized in two CDAD-affected hematopoietic stem cell (HCT) recipients following FMT. We found alterations in gene content, including loss of virulence and antibiotic resistance genes. These alterations were accompanied by long-term bacterial divergence at the species and strain levels. Our findings suggest limited durability of the specific bacterial consortium introduced with FMT and indicate that alterations of the functional potential of the microbiome are more complex than can be inferred by taxonomic information alone. Our observation that FMT alone cannot induce long-term donor-like alterations of the microbiota of HCT recipients suggests that FMT cannot indefinitely supersede environmental and/or host factors in shaping bacterial composition.


Introduction
Fecal microbiota transplantation (FMT) is a remarkably safe and effective therapy for resolving recurrent Clostridium difficile-associated disease (CDAD) in immunocompetent individuals [1], and it is increasingly being used to treat CDAD in immunocompromised individuals [2]. Among immunocompromised populations at greatest risk of developing CDAD are hematopoietic stem cell transplantation (HCT) recipients, who experience attack rates as high as 25%, PLOS  and who are up to nine times more likely to suffer from CDAD than immunocompetent individuals [3][4][5]. A small number of recent studies have reported that FMT can safely resolve CDAD in post-HCT recipients [2,[6][7][8][9]. However, persisting concerns regarding the safety of administering FMT therapy to immunocompromised patients continue to limit use of the therapy in post-HCT individuals. Better understanding the effects of FMT through investigations of post-treatment gut bacterial community structure in immunocompromised patients is needed. Several studies leveraging 16S rRNA gene sequencing have begun to reveal the taxonomic underpinnings of FMT therapy by informing on genus-and species-level bacterial dynamics. These studies have demonstrated that FMT induces taxonomic compositional changes in the recipient gut microbiota, resulting in a more donor-like state [10][11][12][13][14][15]. These analyses have provided a foundation for metagenomic investigation of FMT: there are good reasons for considering whole genomes and strain-level diversity, since bacterial strains with identical 16S rRNA sequences can exhibit wide variability in terms of phenotype and pathogenicity resulting from nucleotide level DNA sequence differences or differing gene content. For example, commensal Escherichia coli are abundant facultative anaerobes in the human gut [16], yet enterohemorrhagic Escherichia coli are important pathogens. A recent study of patients undergoing FMT for treatment of metabolic syndrome was the first to report strain-level dynamics after FMT. In this study of 10 patients, the investigators found that strains of donor bacteria are capable of both replacing and coexisting with same-species recipient strains at three months post-FMT [17]. These findings are intriguing, but questions remain regarding the generalizability of this finding to patients undergoing FMT for treatment of CDAD where, unlike the metabolic syndrome cohort, patients receive antibiotics prior to FMT and have an underlying dysbiosis. In the setting of prior antibiotic therapy, it is presumed that the total amount and diversity of microorganisms present in the FMT recipient is lower, and thus the gut may represent a relatively barren ecological niche for repopulation with donor-derived microbes. Additionally, no studies have yet reported on strain level engraftment at time points greater than three months after FMT.
To date, the collective understanding of FMT impact on the taxonomic composition of the gut microbiota has been defined by studies of immunocompetent patients. However, as immunocompromised patients are at greatest risk of developing CDAD, there is much to gain from an investigation of FMT in this patient population, in which the impaired host immune system may have diminished influence on the gut microbiota community. Further, although the recent studies relating the success of FMT in immunocompromised individuals have begun to assuage concerns of safety [2,[6][7][8][9], those concerns are not unfounded [18]. Characterizing the taxonomic and functional capacity of the immunocompromised patient microbiota post-FMT carries potential for both improving clinical outcomes of CDAD in patients lacking a robust immune system and advancing our understanding of the human microbiota landscape during transitions between states of health and disease.
We present eight HCT patients with three or more microbiologically-documented CDAD episodes, all of whom were successfully treated with FMT. For two of these patients, we were able to perform a longitudinal study of microbiota dynamics, employing shotgun metagenomic sequencing of serial stool samples obtained before, immediately after and over one year following FMT. We show that recipient stool assumed donor-like taxonomic and functional composition immediately following FMT, but that functional and taxonomic concordance were diminished after one year. These findings suggest that environmental, lifestyle and other host-factors are important determinants in long-term shaping of the microbiota. Furthermore, we show that despite the divergence of overall taxonomic and functional similarity at >1 year post-FMT, potentially important specific changes to the gene repertoire were retained, including the loss of pathogenicity genes from the pan-genome of Escherichia coli and a reduction of community-wide antibiotic-resistance genes.

Treatment outcomes of FMT in eight HCT patients
FMT was delivered as an oral, encapsulated therapy using stool from a healthy unrelated donor (Table 1). All eight of the HCT patients to receive FMT experienced resolution of CDAD at 8 weeks, a standard time frame for determining cure. Subjects G and H died of underlying CNS disease and intracranial hemorrhage at 208 and 101 days following FMT, respectively. Both subjects were negative for CDAD at the time of death. Subjects A, B, C, E and F were free of CDAD at time of final stool sample collection at 408, 384, 456, 448 and 410 days post-FMT, respectively. Subject D experienced a recurrence of CDAD, and tested positive for CDAD at time of final stool sample collection, 179 days post-FMT. FMT is typically performed in patients with CDAD after significant clinical improvement is achieved with antimicrobial therapy. Subject B had active diarrhea at the time of FMT; the remaining five subjects were symptomatically quiescent. FMT was well tolerated in all subjects, and the subject with active diarrhea experienced symptom improvement within days of treatment. No attributable side effects or exacerbation of graft-versus-host disease were observed.

Short-term concordance and long-term divergence between recipient and donor microbiota following FMT
To better understand the taxonomic impact of FMT therapy in HCT recipients, we employed shotgun metagenomic sequencing of serial stool samples-both pre-and post-FMT-from ; and one stool sample was obtained at a long-term timepoint post FMT (days 408 and 384 for Subjects A and B, respectively). We also performed shotgun metagenomic sequencing of one long-term post-FMT sample from subjects C, D, E and F; pre-FMT samples were not available for these individuals. Long-term samples for subjects G and H were not collected as subjects were deceased at time of collection. Samples immediately pre-FMT and within 21 days post-FMT were not available for Subjects C, D, E, F, G and H. In order to draw parallels between our results and the majority of FMT metagenomic studies to date, we began with an investigation of bacterial composition following FMT at the level of genus, a level of resolution typically targeted with the common approach of 16S rRNA gene amplicon analysis. For both Subjects A (Fig 1a) and B (Fig 1b) we observed a rapid shift towards a more donor-like state for all bacterial members within the most abundant genera. Although donor-concordant patterns of engraftment were found to be sustained for all shortterm samples (<21 days), compositional similarity was largely diminished in the long-term (>1 year) timepoints for both Subjects A and B.
To quantitatively assess parity between donor and recipient gut communities, we calculated whole-community similarity of all timepoints for Subjects A and B and long-term timepoints for Subjects C, D, E and F to corresponding donor timepoints at both the levels of genus (Fig 1c, S1 Table) and gene (Fig 1d, S2 Table). We observed a large increase in community-wide donor similarity at both the level of genus and gene in subjects A and B with FMT, with maintenance of elevated similarity during the three week short-term follow-up period. However, at >1 year post-FMT, donor similarity was reduced beyond the range of variability observed in donor self-similarity. As a point of comparison, we also calculated donor self-similarity using multiple stool samples taken over 180 days from donor #29, who provided fecal samples for all subjects aside from D, G and H. Donor self-similarity was calculated as similarity across the donor time series to the initial timepoint. At the taxonomic level, donor self-similarity fluctuated by as much as 33%, though at the level of gene, variations in donor self-similarity were minimal (within 1%). Although stool sampled immediately pre-FMT and within a month post-FMT were not collected for Subjects C, D, E, F, G or H, we were able to obtain samples for Subjects C, E and F at >1 year post-FMT, and for Subject D at~6 months following FMT. We noted low donor similarity indices in these samples for both genus and gene (Fig 1c and  1d) (S1 and S2 Tables).

Strain-level genomic remodeling accompanies FMT-induced community-level turnover
To achieve a higher-resolution perspective of the metagenomic events surrounding FMT, we performed strain-level analysis on a subset of high-abundance taxa. Since the presence of genus Roseburia was found to be negligible in Subject A pre-FMT, yet abundant in both the donor and recipient post-FMT (Fig 1a), we sought to investigate the concordance of Roseburia strains between donor and Subject A over both the short-term (within 21 days of FMT) and the long-term (>1 year). Using nucleotide polymorphisms to distinguish between strains within the Roseburia genus, we observed the ratio of concordant to discordant single nucleotide variants (SNVs) to be similar between Subject A short-term samples post-FMT (Fig 2a), and those from the donor sampled from within a period of a month (Fig 2b). At >1 year post- Genus-level taxonomic composition reveals extensive short-term donor microbial engraftment with FMT, followed by long-term reduction in donor similarity relative to donor self-similarity over a similar time period. Genera present in the donor appear in (a) Subject A and (b) Subject B recipients with inconsistent long-term residence. For visual clarity, genera represented by >5% of assigned reads from at least one timepoint are shown. (c) Whole-community Bray-Curtis donor similarity for Subjects A and B, and an additional four patients (Subjects C, D, E and F) for which long-term and donor samples were obtained, is shown for genus-level composition. Donor similarity is calculated for each recipient timepoint with the corresponding donor sample used for FMT. As a point of reference, we show multiple timepoints for Donor #29, where similarity is measured against the first timepoint. Microbial sequencing reads were classified using Kraken [29] in conjunction with a sequence database collected from NCBI Refseq and Genbank microbial genome references. (d) Whole-community donor Chao similarity [33] in gene content is shown. Gene abundances were measured by alignment to the Uniref50 functionally annotated protein sequence database [32].
https://doi.org/10.1371/journal.pone.0182585.g001 and Subject A recipient shortly after FMT from those in Subject A after one year following FMT. (b) Roseburia strains demonstrate stable, low-level diversity in the donor across eight time points taken over~8 months, contrasting with the divergence observed in the recipient. Sequences were assembled from short read data obtained at the long-term time-point, and resultant contigs were classified taxonomically (see Methods). Contigs belonging to genus Roseburia were used as a reference for alignment of read data from all timepoints, followed by SNV calling and variable site selection. All polymorphisms distinguishing the donor from the long-term sample were selected, and presence of these discriminant polymorphisms in intermediate timepoints was recorded (adapted from a previously described approach [17]). https://doi.org/10.1371/journal.pone.0182585.g002 Long-term microbiota dynamics after FMT FMT, we found the strains present within the genus Roseburia in Subject A to be genetically different from those of the donor.
As organisms from the genus Escherichia were observed in both donor and Subject A pre-FMT (Fig 1a), we focused on the representative species, Escherichia coli, and investigated whether strains post-FMT showed greater concordance with recipient pre-FMT or donor strains of E. coli. Using PanPhlAn (see Materials and methods) to calculate individual gene abundances within the E. coli pan-genome, we observed large-scale changes in the gene complement of E. coli in the recipient following FMT (Fig 3). Hierarchical clustering identified 719 genes present prior to FMT that disappear within the first two timepoints post-FMT and remain absent through the >1 year timepoint (Fig 3, S1 File). This group is significantly enriched for known pathogenicity genes (Fig 3, S2 File). While there was a loss of many notable E. coli-specific virulence genes over time, there was an overall increase in the relative abundance of E. coli in the broader microbial community in Subject A post-FMT.
To investigate the effects of FMT on the antibiotic resistome, we measured the abundance of antibiotic resistance genes from donor stool and Subject A both pre-and post-FMT (Fig 4). Subject B was excluded from this analysis due to a large concentration of human DNA pre-FMT, reducing the available microbial data volume. We performed alignment of shotgun data to the Comprehensive Antibiotic Resistance Database [19] to obtain antibiotic resistance gene counts, normalized by total per-sample coverage and aggregated by resistance phenotype. Following FMT, we observed an immediate decrease in the abundance of antibiotic resistance genes by 2.4-fold which remained stable beyond 1 year. In addition, we noted a sustained increase in Chao similarity of unaggregated gene abundance profiles between the resistome profile of Subject A and the donor (S3 Table).

Discussion
Our study provides one of the first metagenomic analysis of strain-and gene-level bacterial engraftment following FMT, and the first such analysis in patients being treated for CDAD. The success of FMT in resolving CDAD in HCT patients has been demonstrated previously [2,[6][7][8][9], however a detailed characterization of this "black box" therapy has been lacking. We used a shotgun metagenomic approach to examine both short-and long-term bacterial engraftment following FMT to treat CDAD in HCT patients. We show that although concordance was observed between microbiota composition of the donor and recipient in the shortterm, concordance was reduced after 1 year. Previous studies investigating long-term donor similarity have been confined to immunocompetent individuals and have either (1) been limited to 16S rRNA sequencing analysis [13][14][15] or (2) observed strain-level changes, but only through to three months following FMT. [17] In the case of the 16S rRNA studies, Broecker et al. [13] investigated family donor FMT in one subject, and reported long-term (>4.5 years) donor similarity but limited short-term (<7 months) concordance. Conversely, we observed short-term similarity and long-term donor discordance. For all eight HCT patients in our study, recipients received stool from a healthy, unrelated donor, in contrast to the method employed by Broecker et al, where fecal material was donated by the sister of the FMT recipient. Our findings that unrelated donors and recipients display dissimilar microbiota profiles at long-term timepoints post-FMT suggest that FMT engraftment is broadly impacted by external effects such as diet, host genetics, host immune surveillance, and subsequent antibiotic exposure. These factors may supersede FMT composition in determining long-term FMT durability.
Also employing 16S rRNA sequencing were studies by 1) Jalanka et al., who used FMT from "universal donors" to treat CDAD in 14 immunocompetent patients, and report on  [37] was used to identify the presence of individual genes within the pan-genome of E. coli. Reads were aligned to a nonredundant collection of genes present in phylogenetically diverse E. coli isolate genome sequences. Those genes occurring at an abundance consistent with E. coli-specific occurrence were chosen and designated present or absent, as described previously [37]. Gene occurrence profiles were bacterial composition through to 1 year following FMT [14]Weingarden et al., who followed taxonomic changes for 151 days post-FMT from 4 immunocompetent CDAD-afflicted subjects treated with stool from a single donor. [15] Like Broecker et al., Jalanka et al. found the unrelated donor bacterial profile to be retained in recipients at 1 year post-FMT. However, hierarchically clustered and divided into 10 groups with similar occurrence over the time series. Cumulative gene counts across groups are indicated on the left margin. Genes associated with virulence were identified by alignment to the Virulence Factor Database [39]. Over-or underrepresentation of virulence genes in cluster groups was determined using Fisher's exact test to detect significant departure from random occurrence. Significant (p < 0.05) underrepresentation was found in group 1, and significant overrepresentation was found in groups 5 and 9. KEGG annotations of genes within each group are given in S1 File. Virulence factors found within each group are given in S2 File.
https://doi.org/10.1371/journal.pone.0182585.g003 Weingarden et al. report the highest similarity between donor and recipient to be 1 day following FMT, with considerable divergence at all later time points. Such findings are more consistent with our observations of short-term similarity and long-term dissimilarity between donor and recipient gut microbiota composition. We attribute the respective discrepancy between the findings of Broecker and Jalanka with those of Weingarden and ourselves to differences in sensitivity between 16S and metagenomic methodologies, and to possible involvement of the host immune status and potential subsequent therapies in influencing the long-term fate of donor microbiota. Our analysis of nucleotide-level similarity between donor and recipient, adapted from Li et al [17], showed donor concordance over short-term timepoints consistent with that group's findings, however after 1 year post-FMT we report high discordance between donor and recipient genus Roseburia SNVs (Fig 2). We focused on this genus since it appeared to be absent in the recipient pre-FMT, abundant in both the donor and the recipient at all timepoints post-FMT, and showed the highest abundance of reads at >1 year post-FMT. Given the consistent presence of Roseburia across all timepoints, it was surprising that the genomic sequence(s) of Roseburia at >1 year post-FMT was extensively dissimilar to that of the Roseburia organism (s) that had initially engrafted in the recipient. It is possible that intra-species competition, neutral drift, or other mechanisms of nucleotide divergence drove strain turnover within the ecological niche harbouring Roseburia; however, more studies are needed to understand the importance of sub-species interactions and how such fluctuations might affect gut microbiome behavior and ultimately human health.
To gain a better appreciation for how gene-level differences between same-species strains of bacteria affect engraftment following FMT, we chose to investigate the gene complement of E. coli, a species that exhibits wide variability between strains, ranging from dominant commensal in the human gut to highly virulent pathogen [16]. We chose this organism both because it occurs at high abundance through the time series obtained from Subject A, and because there is an extensive genome collection available for E. coli. At the taxonomic level, the presence of E. coli was consistent in Subject A throughout all samples; however, at the gene level, we observed massive losses and gains in the total E. coli accessory genome between all measured timepoints, with the most pronounced differences occurring pre-and post-FMT (Fig 3). Whether this change is due to strain replacement or genomic remodeling within a single resident strain is unconfirmed, although the rapid timeframe in which it occurs suggests the former.
Multi-drug-resistant (MDR) bacteria are becoming increasingly common in patients with hematological malignancies, such as HCT recipients [20]. It has been previously shown that FMT is capable of reducing antibiotic resistance genes in immunocompetent patients being treated for CDAD [21]. Due presumably to both the requirement that antibiotic treatment be administered for at least 3 CDAD episodes before FMT will be attempted, as well as the use of antibiotics in conjunction with HCT therapy, recipients have been found to have higher numbers of antibiotic resistance genes than those of healthy donors. We sought to determine (1) whether immunocompromised patients carry an increased number of antibiotic resistance genes compared to a healthy donor, and (2) if FMT is similarly capable of reducing the abundance and variety of antibiotic resistance genes in HCT patients. We demonstrate that, for the immunocompromised subjects analyzed, there was an increased presence of antibiotic resistance genes in the recipient compared to the donor, and that abundance of the antibiotic resistome was reduced by >50% followed FMT (Fig 4). Furthermore, this reduction persisted more than one year after FMT. This is consistent with findings in immunocompetent recipients, and bodes well for the potential reduction of antibiotic resistance via FMT in HCT patients. Given the lean toolkit of antibiotics that remain effective in the face of MDR pathogens, identifying strategies to mitigate antibiotic resistance is of tremendous importance, especially for immunocompromised individuals. Reducing antibiotic resistance may prove to be yet another potential application of FMT.
In conclusion, our work establishes a foundation for future studies employing a shotgun metagenomic approach to better understand the evolution of microbial communities after FMT in various disease states. We demonstrate that granular shotgun metagenomic analyses show high initial donor similarity and long-term donor divergence at the communitywide taxonomic and functional levels, as well as in single-species gene composition and nucleotide variation. While further studies involving a larger cohort of patients and more detailed statistical analysis are necessary to validate our early findings, the findings reported herein lend support to the safe use of FMT for CDAD and reveal preliminary evidence for short and long-term FMT dynamics in immunocompromised HCT patients. In addition, our results strongly support the value of detailed analysis of metagenomic sequence data, and suggest a holistic model of FMT engraftment durability informed by more than donor or recipient microbiome taxonomic composition at the species, genus or phylum level.

Stool sample collection
Bone marrow transplant recipient fecal samples were collected by patients at home using collection kits provided. Fecal samples were shipped to the laboratory on pre-frozen freezer packs and stored at -80˚C within 24 hours of production. Upon defecation, pre-FMT and early (<21 days) timepoint fecal samples were aliquoted in 95% EtOH. Donor and all late timepoint (>6 months) fecal samples were stored in the absence of EtOH.

Fecal microbiota transplantation
Donors were screened and fecal samples were collected, and processed into FMT capsules (Capsugel hypromellose capsules) as described [22][23][24]. FMT was performed as previously reported [22]. Briefly, each patient received 15 frozen FMT capsules with water on each of two consecutive days followed by monitoring for any adverse effects. Subjects took nothing by mouth for four hours before and one hour after administration. Subjects were instructed to stop oral vancomycin 24-48 hours prior to FMT. Subject A vomited more than 12 hours after the first dose but did not bring up any capsules. There were no serious adverse events attributable to FMT.

DNA extraction
Stool samples were stored at -80˚C until use. DNA extractions were performed on all samples using the QIAamp DNA Stool Mini Kit (QIAGEN 1 ) as per manufacturer instructions, with an added bead-beating step using the Mini-Beadbeater-16 (BioSpec Products) and 1 mm diameter Zirconia/Silica beads (BioSpec Products) consisting of 7 rounds of alternating 30 second bead-beating bursts followed by cooling on ice. DNA concentration and quality estimations were performed using Qubit 1 Fluorometric Quantitation (Life Technologies) and Bioanalyzer 2100 (Agilent). DNA libraries were prepared using the Nextera XT DNA Library Prep Kit (Illumina 1 ). Prepared libraries were subjected to 100 base pair, paired-end sequencing on a HiSeq 2500 (Illumina 1 ).

Metagenomic shotgun sequencing and analysis
Data from individual sequencing runs were first aggregated by clinical sample using an inhouse script. Paired-end raw reads from shotgun sequencing were trimmed using cutAdapt v1.7.1 [25] using a minimum length 80bp, maximum N count of 0, minimum terminal base score of 30, and the Illumina Nextera transposase and adapter sequences. In addition, reads were trimmed by 16bp at the 5' end to remove low quality bases and deduplicated with PrinSeq v0.20.4 [26]. Sequencing was performed to asymptotic saturation of available taxonomic diversity, assessed by calculating rarefaction curves using the vegan v2.4 [27] package in the R statistical computing language [28] and defining an asymptotic threshold at less than one new genus discovered per each additional million reads (S1 Fig). Reads were taxonomically assigned using Kraken v0.10.4 [29] based on k-mer resemblance to human, bacterial, viral and fungal k-mer profiles (k = 31) generated from the NCBI Bacterial Refseq and Genbank collections (accessed November 14, 2015), filtered with the kraken-filter utility at a threshold of 0.2, and results visualized with in-house scripts. Similarity between samples based on taxonomic calls was calculated with the Bray-Curtis similarity metric [30].
Functional analysis was conducted on quality filtered reads using HUMAnN2 v0.2.1 [31] to obtain read alignments to the Uniref50 functionally annotated protein sequence database [32]. Relative gene family abundances were obtained by dividing gene family coverage by per-sample total coverage. The Chao similarity metric was chosen to assess functional similarity due to the high diversity present in the functional repertoire (~2 million distinct genes) and the consequent risk of negative bias due to undersampling, a problem affecting alternative choices of similarity index such as Bray-Curtis or Jaccard. [33] Roseburia SNV diversity was calculated by first assembling reads from the subject 1 longterm time-point with Spades v3.6.1 [34], then aligning reads from all timepoints to the contigs so obtained using BWA [35], and finally calling SNVs using the GATK Unified Genotyper [36]. SNVs discriminating donor from long-term recipient timepoint were selected as previously described [17] and visualized with an in-house script.
PanPhlAn [37] was used to obtain gene repertoire membership information for Escherichia coli using the 2016 pan-genome index for this organism. Gene functions were annotated for all genes using the KEGG [38] database according to the method described in the PanPhlAn documentation. Those genes occurring at an abundance consistent with E. coli-specific occurrence were chosen and designated present or absent as described previously [37]. Genes associated with virulence were identified by alignment to the Virulence Factor Database [39]. Over-or underrepresentation of virulence genes in cluster groups was determined using Fisher's exact test to detect significant departure from random occurrence.
Antibiotic resistance profiles were obtained by using BWA [35] to align read data from all samples to the Comprehensive Antibiotic Resistance Database [19] (accessed 6/6/16) and collecting per-gene read counts and aggregating by resistance phenotypes using an in-house script.
Supporting information S1  (Ihaka & Gentleman, 1996). Curves represent repeated random subsamplings of genus-level taxonomic calls taken at increasing sample sizes. Singleton calls were omitted from this analysis. We assess sufficient sampling when a curve has transitioned from the vertical phase to an asymptotic phase. Samples are labeled in the form "Subject_Timepoint", with 'lt' identifying long-term timepoints and donor samples prefixed with the letter D. Timepoints are numbered in chronological order for each subject. Note that labels D29_11 and D29_7 are coincident, as are D29_9 and B_2.