Table 1.
Relevant Characteristics of the Infants in This Study
Table 2.
Infant Stool Sample Collection Schedule
Table 3.
Reference Pool Composition
Figure 1.
Comparison of Microarray- and Sequencing-Based Community Profiles
Microarray-derived and sequencing-derived estimates of taxonomic group abundance are compared for 12 biological samples.
(A) Abundance estimates for all prokMSA level 2 taxa measured on the array are compared. Each column represents a single biological sample and each row corresponds to a single taxonomic group, identified (to the right of each row) by its numerical prokMSA OTU code, along with the roughly corresponding conventional name for the group.
(B) Comparison of sequence-based and microarray-based relative abundance estimates for level 2 taxonomic groups in 12 samples (same as in [A]). The x-axis represents the relative abundance as estimated by the frequency of clones from a given taxonomic group, and the y-axis represents the relative abundance as estimated by microarray profiling.
(C) Same as (B) for level 3 taxonomic groups.
Figure 2.
Variation in the Overall Density of Fecal Bacteria during the First Year of Life
For each baby sample, bacterial abundance was estimated by TaqMan real-time PCR with universal bacterial primers. Estimated rRNA gene copies per gram of feces (y-axis) are plotted as a function of days of life (x-axis). Both axes are on a logarithmic scale. Abundance measurements are truncated on the lower end at the value corresponding to the 95th percentile of the extraction (negative) controls (copy number corrected by median stool mass). Episodes of antibacterial or antifungal (nystatin) treatment are indicated on the temporal axis by gray or pink bars, respectively (see Table 1 for additional information).
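The lower-end truncation described in this caption can be illustrated with a minimal sketch. The function names, the linear-interpolation percentile, and the reading of "corrected by median stool mass" as a division of the control copy number by the median mass are assumptions made for illustration, not details confirmed by the caption.

```python
import math

def percentile(values, q):
    """Linear-interpolation percentile (q in [0, 100]) of a non-empty list."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lo, hi = math.floor(pos), math.ceil(pos)
    frac = pos - lo
    return ordered[lo] * (1 - frac) + ordered[hi] * frac

def truncate_at_floor(sample_copies_per_gram, control_copies, median_stool_mass_g):
    """Clamp per-gram abundance estimates at a floor derived from negative controls.

    Assumption: the floor is the 95th percentile of the control copy numbers,
    divided by the median stool mass to express it per gram of feces.
    """
    floor = percentile(control_copies, 95) / median_stool_mass_g
    truncated = [max(value, floor) for value in sample_copies_per_gram]
    return truncated, floor
```

Any sample whose estimate falls below the floor is plotted at the floor value, so low readings that cannot be distinguished from the extraction controls are not over-interpreted.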
Figure 3.
Overview of Microbial Community Profiles of All Samples
Each column (n = 430) represents one biological sample and each row (n = 2,149) represents one taxonomic group or species. Samples are organized in temporal order, beginning with birth at left and any maternal- or other family-derived samples at the right of each time course. Wedges above columns are numbered according to baby identifier. Rows (taxa) are sorted by their numerical codes so that subgroups of a given group lie directly below the more general group (e.g., 2.15, then 2.15.1, then 2.15.1.1). The symbols “>” and “> >” are added to the names of labeled taxonomic groups that are subgroups, at level 3 or level 4, respectively, of a labeled level 2 taxonomic group. Increasing darkness of the grayscale corresponds to higher estimated relative abundance.
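The row ordering and subgroup markers described above can be sketched as follows. This is only an illustration: the helper names are hypothetical, and it assumes that a group's level equals the number of dot-separated components in its code (so 2.15 is level 2 and 2.15.1 is level 3), as the example in the caption suggests.

```python
def code_key(code):
    """Sort key that compares code components numerically, not as text."""
    return tuple(int(part) for part in code.split("."))

def subgroup_prefix(code, labeled_level2):
    """Return '>' or '> >' for level 3 or level 4 subgroups of a labeled level 2 group."""
    parts = code.split(".")
    parent = ".".join(parts[:2])
    if len(parts) < 3 or parent not in labeled_level2:
        return ""
    return {3: ">", 4: "> >"}.get(len(parts), "")

codes = ["2.15.1", "2.2", "2.15", "2.15.1.1", "2.10"]
ordered = sorted(codes, key=code_key)
```

Sorting on integer tuples places 2.10 after 2.2 and keeps each subgroup directly below its parent, which a plain string sort would not guarantee.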
Figure 4.
Clustering of Samples Based on Population Profiles of Most Common and Abundant Taxa
(A) Each column (n = 430) represents one biological sample, and each row (n = 52) represents one level 4 taxonomic group or species for which two or more samples had relative abundance estimates greater than 1%. All samples, including stool samples from the subject infants, parents, and siblings, as well as milk and vaginal samples, are represented. Samples were clustered by centered Pearson correlation, so that columns representing the most similar samples are grouped together, whereas taxonomic groups (rows) are numerically sorted rather than clustered. Increasing darkness of the grayscale corresponds to higher estimated relative abundance. Values are log2 of relative abundance.
(B) Selected clusters illustrating that (1) profiles from early baby stool samples cluster by baby, (2) very early baby samples cluster with maternal samples, and (3) samples from the pair of fraternal twins cluster together and intermingle.
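The row filter and the similarity measure used for this figure can be sketched in a few lines. The data layout (a dictionary mapping each taxon to its per-sample relative abundances, expressed as fractions) and all function names are assumptions for illustration. The log2 transformation and the hierarchical clustering step itself are omitted.

```python
import math

def select_rows(abundance, threshold=0.01, min_samples=2):
    """Keep taxa whose relative abundance exceeds the threshold in at least min_samples samples."""
    return {taxon: values for taxon, values in abundance.items()
            if sum(v > threshold for v in values) >= min_samples}

def sample_columns(abundance):
    """Transpose taxon rows into per-sample profiles (one list per column)."""
    return [list(column) for column in zip(*abundance.values())]

def centered_pearson(x, y):
    """Pearson correlation computed after subtracting each vector's mean."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    denominator = math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    if denominator == 0:
        return 0.0  # assumption: treat constant profiles as uncorrelated
    return sum(a * b for a, b in zip(dx, dy)) / denominator

def correlation_distance(x, y):
    """Distance between two sample profiles for use by a clustering routine."""
    return 1.0 - centered_pearson(x, y)
```

A clustering routine would then group columns by `correlation_distance`, while rows keep their numerical order.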
Figure 5.
Similarity of Microbiota between Babies
For each pair of samples, we calculated the nearest-neighbor samples according to Pearson correlation of the level 4 relative abundance profiles. For each baby, we then computed the percentage of nearest-neighbor samples that came from each baby. The shade of gray indicates the percentage of samples from baby Y (column) that were nearest neighbors of the samples from baby X (row), such that each row sums to 100%.
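One way to read this calculation is sketched below. It assumes that each sample has a single nearest neighbor (the most highly correlated other sample, with the sample itself excluded) and that samples arrive as hypothetical `(baby_id, profile)` pairs. Neither detail is specified in the caption.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length profiles (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    denominator = math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    return sum(a * b for a, b in zip(dx, dy)) / denominator if denominator else 0.0

def nearest_neighbor_percent(samples):
    """samples: list of (baby_id, profile). Returns percent[baby_x][baby_y]."""
    babies = sorted({baby for baby, _ in samples})
    counts = {x: {y: 0 for y in babies} for x in babies}
    for i, (baby_x, profile_x) in enumerate(samples):
        # Most correlated other sample; the sample itself is excluded.
        best = max((j for j in range(len(samples)) if j != i),
                   key=lambda j: pearson(profile_x, samples[j][1]))
        counts[baby_x][samples[best][0]] += 1
    percent = {}
    for x in babies:
        total = sum(counts[x].values())
        percent[x] = {y: 100.0 * counts[x][y] / total for y in babies}
    return percent
```

Because each row is normalized by the number of samples from baby X, every row of the resulting matrix sums to 100%, matching the description in the caption.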
Figure 6.
Temporal Patterns in Average Pairwise Similarity of Infant Stool Microbiota Profiles
(A) Similarity between infants over time. For each time point for which at least six babies were profiled, we calculated the mean pairwise Pearson correlation between the level 4 taxonomic profiles of all babies at that time point. The mean pairwise Pearson correlation between these profiles in 18 adult participants in this study (nine males and nine females) is also shown (open circle indicated by the arrow).
(B) Progression towards adult-like flora over time. For each time point for which at least four babies were profiled, we calculated the mean Pearson correlation between the level 4 taxonomic profiles of all babies at that time point and a “generic adult” profile. The generic adult profile is the centroid of the profiles of 18 adults (nine male and nine female; parents in this study).
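Both summary statistics in this figure can be sketched briefly. The function names and the input layout (a dictionary mapping each time point to a list of baby profiles) are hypothetical. The minimum-baby thresholds of six and four follow the caption.

```python
import math
from itertools import combinations

def pearson(x, y):
    """Pearson correlation of two equal-length profiles (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    denominator = math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    return sum(a * b for a, b in zip(dx, dy)) / denominator if denominator else 0.0

def mean_pairwise_correlation(profiles):
    """Panel A: average correlation over all unordered pairs of profiles."""
    pairs = list(combinations(profiles, 2))
    return sum(pearson(a, b) for a, b in pairs) / len(pairs)

def centroid(profiles):
    """Element-wise mean profile, used here as the 'generic adult'."""
    return [sum(column) / len(profiles) for column in zip(*profiles)]

def mean_correlation_to_reference(profiles, reference):
    """Panel B: average correlation of each profile with a reference profile."""
    return sum(pearson(p, reference) for p in profiles) / len(profiles)

def similarity_over_time(profiles_by_day, min_babies=6):
    """Apply the panel A statistic to each time point with enough babies."""
    return {day: mean_pairwise_correlation(profiles)
            for day, profiles in sorted(profiles_by_day.items())
            if len(profiles) >= min_babies}
```

For panel B, the same per-day loop would call `mean_correlation_to_reference` with `centroid(adult_profiles)` as the reference and a threshold of four babies.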
Figure 7.
Temporal Profiles of the Most Abundant Level 3 Taxonomic Groups
Level 3 taxonomic groups were selected for display if their mean (normalized) relative abundance across all baby samples was greater than 1%. The x-axis indicates days since birth and is shown on a log scale, and the y-axis shows estimated (normalized) relative abundance. For some babies, no values are plotted for the first few days because the total amount of bacteria in the stool samples collected on those days was insufficient for microarray-based analysis.
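The selection rule for this figure is simple enough to state as a short sketch. The function name and data layout (a dictionary mapping each taxon to its normalized relative abundances across all baby samples, expressed as fractions) are assumptions for illustration.

```python
def select_abundant_taxa(abundance, threshold=0.01):
    """Return taxa whose mean relative abundance across samples exceeds the threshold."""
    return sorted(taxon for taxon, values in abundance.items()
                  if sum(values) / len(values) > threshold)
```

Unlike a rule based on the number of samples above a cutoff, this criterion uses the mean, so a taxon that is very abundant in only a few samples can still qualify.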