Fig 1.
Bacterial abundance correlations with age and sex.
A) Q-Q plot for correlations of 116 common bacterial taxa with age in samples collected in winter. Gray shading represents the 95% confidence interval of the null. The point circled in orange is genus Bifidobacterium. B) Abundance of genus Bifidobacterium is inversely correlated with age in samples collected during the winter (** q ≤ 0.01). C) Q-Q plot for correlations of 116 common bacterial taxa with sex in samples collected in winter. The point circled in orange represents genus Scardovia. D) Genus Scardovia was significantly more abundant in females (n = 60) than in males (n = 33) in winter (** q ≤ 0.01).
Table 1.
Number of bacterial taxa that vary by sex or age in each season.
The total number of taxa whose relative abundances were significantly correlated with age or sex at various q-value cutoffs is listed. The total number of taxa tested per season is indicated in the bottom row (total). In the text, we discuss the number of significantly correlated taxa in each season with a q-value threshold of ≤ 0.05. The abundances of few bacterial taxa appear to vary consistently with age; however, the abundances of many bacterial taxa are correlated with sex in this population.
Fig 2.
“Chip heritability” for 102 bacterial taxa tested in the “seasons combined” analysis.
Each point represents the estimated percent variance explained (PVE, or “chip heritability”) for the joint effect of all genotypes analyzed in the GWAS for bacterial abundance during the “seasons combined” analyses. Bars indicate standard error measurements around the estimate. A number of bacterial taxa showed non-zero PVE estimates (listed in order from highest to lowest PVE) with error bars that do not intersect zero, indicating that cumulative common genetic variation can explain some portion of the variation in bacterial abundance observed between individuals. Bacterial taxa that also had at least one nominally significant genetic association at a genome-wide association level are labeled in purple, with the level of significance indicated (q ≤ 0.2 or q ≤ 0.1).
Fig 3.
GWAS of genus Akkermansia relative abundance.
A) Manhattan plot of GWAS results for the normalized relative abundance of genus Akkermansia from the “seasons combined” analysis. Each point represents a tested SNP, displayed by chromosomal position (x-axis). The y-axis shows –log10(P-value) for each SNP. SNPs significantly associated with normalized Akkermansia relative abundance (q ≤ 0.2) are shown in purple on chromosome 3. B) Q-Q plot for P-values from the GWAS of the relative abundance of genus Akkermansia. The majority of SNPs lie along the null line, indicating that the test statistics did not appear to be inflated (due to population stratification, for example). Five SNPs (all in linkage disequilibrium (LD) on chromosome 3) were significantly associated with Akkermansia abundance. The point circled in orange was the most highly associated SNP (rs4894707). C) Normalized Akkermansia abundance, segregated by genotype class at rs4894707 on chromosome 3. Only two genotype classes are represented at this SNP (MAF = 0.185 and Hardy-Weinberg P-value = 0.007 in a larger sample of 1,415 Hutterites that includes the individuals in this study). This SNP lies in an untranslated region (UTR) of the gene PLD1, which has been implicated in obesity studies in African American populations [56].
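As an illustration of how the Q-Q comparison in panel B can be constructed, the following minimal sketch pairs observed –log10(P-values) with the values expected under a uniform null. The function name `qq_points` and the plotting position (i − 0.5)/n are assumptions made for this example; they are not taken from the study's methods.

```python
import math
from typing import List, Sequence, Tuple


def qq_points(pvalues: Sequence[float]) -> List[Tuple[float, float]]:
    """Return (expected, observed) -log10(P) pairs for a Q-Q plot.

    Assumption: expected P-values under the null use the plotting
    position (i - 0.5) / n for rank i of n; other conventions exist.
    """
    n = len(pvalues)
    if n == 0:
        raise ValueError("at least one P-value is required")
    if any(p <= 0.0 or p > 1.0 for p in pvalues):
        raise ValueError("P-values must lie in (0, 1]")
    # Sort both series ascending so the smallest observed P-value
    # (largest -log10 value) is paired with the largest expected value.
    observed = sorted(-math.log10(p) for p in pvalues)
    expected = sorted(-math.log10((i - 0.5) / n) for i in range(1, n + 1))
    return list(zip(expected, observed))


if __name__ == "__main__":
    for exp, obs in qq_points([0.25, 0.75, 0.5, 0.01]):
        print(f"expected={exp:.3f} observed={obs:.3f}")
```

Points that fall on the diagonal (observed equal to expected) correspond to the null line described in the caption; points rising above it in the upper tail correspond to SNPs with stronger association than expected by chance.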
Table 2.
Number of bacterial taxa with at least one SNP association reaching suggestive significance per season.
For each season, the total number of bacterial taxa examined and the number of single-nucleotide polymorphisms (SNPs) tested are listed. The number of taxa for which at least one SNP was associated with abundance at either a Bonferroni-corrected P-value cutoff or a q-value cutoff of 0.1 or 0.2 is listed. The total number of taxa for which at least one associated SNP fell below q-value ≤ 0.2 or a Bonferroni threshold is listed under “total significant”.
Fig 4.
Identification of candidate tissues.
At increasingly significant P-value thresholds, variants identified through GWAS were enriched in DNase hypersensitivity (DHS) peaks in a tissue-specific manner. A) For genus Akkermansia, low P-value GWAS SNPs were significantly enriched in DHS peaks in endothelial cell types (red), but not in DHS peaks of the 15 other tissues examined (gray). The x-axis shows the P-value threshold bins examined and the y-axis represents fold enrichment for SNPs overlapping DHS peaks in that bin compared to genome-wide for that tissue type. Both the abundance of genus Akkermansia and endothelial barrier function have been associated with obesity, providing a mechanistic hypothesis that can be further investigated. B) For genus Akkermansia, the significance of enrichment of GWAS SNPs overlapping DHS peaks in endothelial tissue in the lowest P-value bin (P ≤ 0.0005) was determined by GWAS permutation (P ≤ 0.05, see Materials and Methods). The distribution of permuted GWAS SNP enrichments in DHS peaks of endothelial tissue is displayed as a boxplot with the actual enrichment plotted as a red star. C) For genus Faecalibacterium, low P-value GWAS SNPs are significantly enriched in DHS peaks of both intestine (orange) and stomach (pink) tissues (P ≤ 0.05). D) For genus Faecalibacterium, the significance of enrichment of GWAS SNPs overlapping DHS peaks of both intestine and stomach tissues in the lowest P-value bin (P ≤ 0.0005) was determined by GWAS permutation (intestine P ≤ 0.01, stomach P ≤ 0.05, see Materials and Methods). The distribution of permuted enrichments for each identified candidate tissue is displayed as a boxplot with the actual enrichment plotted as an orange (intestine) or pink (stomach) star. Members of Faecalibacterium are some of the most common species in the gut and are known to be associated with dysbiosis in patients with irritable bowel syndrome.
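The fold enrichment on the y-axis and the permutation-based significance can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure from Materials and Methods: the function names `fold_enrichment` and `empirical_p` are hypothetical, each SNP is assumed to carry a single overlap flag for one tissue, and the add-one correction in the empirical P-value is a common convention chosen here for the example.

```python
from typing import Sequence


def fold_enrichment(pvalues: Sequence[float],
                    in_dhs: Sequence[bool],
                    threshold: float) -> float:
    """Fraction of SNPs with P <= threshold that overlap a DHS peak,
    divided by the fraction of all tested SNPs that overlap a DHS peak."""
    if len(pvalues) != len(in_dhs):
        raise ValueError("pvalues and in_dhs must have the same length")
    selected = [hit for p, hit in zip(pvalues, in_dhs) if p <= threshold]
    if not selected:
        raise ValueError("no SNPs fall at or below the threshold")
    background = sum(in_dhs) / len(in_dhs)
    if background == 0:
        raise ValueError("no SNPs overlap a DHS peak genome-wide")
    return (sum(selected) / len(selected)) / background


def empirical_p(observed: float, permuted: Sequence[float]) -> float:
    """One-sided empirical P-value: share of permuted enrichments that are
    at least as large as the observed one (add-one correction assumed)."""
    exceed = sum(1 for value in permuted if value >= observed)
    return (exceed + 1) / (len(permuted) + 1)


if __name__ == "__main__":
    # Toy data: 8 SNPs, 3 of which overlap a DHS peak in the tissue.
    pvals = [0.0001, 0.0002, 0.3, 0.5, 0.7, 0.9, 0.04, 0.6]
    overlaps = [True, True, False, False, True, False, False, False]
    observed = fold_enrichment(pvals, overlaps, threshold=0.0005)
    print(observed)
    # Enrichments from hypothetical permuted GWAS runs.
    print(empirical_p(observed, [1.0, 1.5, 2.9, 0.8]))
```

In this toy example both SNPs in the lowest bin overlap a peak while 3 of 8 SNPs do so overall, so the bin is enriched relative to the genome-wide background. The permuted enrichments play the role of the boxplot distribution, and the observed value plays the role of the star.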