Identification of shared bacterial strains in the vaginal microbiota of related and unrelated reproductive-age mothers and daughters using genome-resolved metagenomics

  • Michael T. France,

    Roles Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Software, Visualization, Writing – original draft, Writing – review & editing

    Affiliations Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, Maryland, United States of America, Department of Microbiology and Immunology, University of Maryland School of Medicine, Baltimore, Maryland, United States of America

  • Sarah E. Brown,

    Roles Data curation, Investigation, Writing – review & editing

    Affiliation Department of Epidemiology and Public Health, University of Maryland School of Medicine, Baltimore, Maryland, United States of America

  • Anne M. Rompalo,

    Roles Conceptualization, Funding acquisition, Supervision, Writing – review & editing

    Affiliation Division of Infectious Diseases, Johns Hopkins School of Medicine, Baltimore, Maryland, United States of America

  • Rebecca M. Brotman,

    Roles Conceptualization, Funding acquisition, Supervision, Writing – review & editing

    Affiliations Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, Maryland, United States of America, Department of Epidemiology and Public Health, University of Maryland School of Medicine, Baltimore, Maryland, United States of America

  • Jacques Ravel

    Roles Conceptualization, Funding acquisition, Project administration, Supervision, Writing – review & editing

    Affiliations Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, Maryland, United States of America, Department of Microbiology and Immunology, University of Maryland School of Medicine, Baltimore, Maryland, United States of America


It has been suggested that the human microbiome might be vertically transmitted from mother to offspring and that early colonizers may play a critical role in development of the immune system. Studies have shown limited support for the vertical transmission of the intestinal microbiota but the derivation of the vaginal microbiota remains largely unknown. Although the vaginal microbiota of children and reproductive age women differ in composition, the vaginal microbiota could be vertically transmitted. To determine whether there was any support for this hypothesis, we examined the vaginal microbiota of daughter-mother pairs from the Baltimore metropolitan area (ages 14–27, 32–51; n = 39). We assessed whether the daughter’s microbiota was similar in composition to their mother’s using metataxonomics. Permutation tests revealed that while some pairs did have similar vaginal microbiota, the degree of similarity did not exceed that expected by chance. Genome-resolved metagenomics was used to identify shared bacterial strains in a subset of the families (n = 22). We found a small number of bacterial strains that were shared between mother-daughter pairs but identified more shared strains between individuals from different families, indicating that vaginal bacteria may display biogeographic patterns. Earlier-in-life studies are needed to demonstrate vertical transmission of the vaginal microbiota.


The human body is colonized by microbial populations which together comprise our microbiota [1]. These populations have been shown to be critical determinants of our health and well-being [2,3] and are founded early in life [4]. Initial colonization of newborn infants primarily occurs during and immediately following birth [5,6], although there is an active debate on whether in utero seeding plays a role in this early colonization [7–13]. Establishment of the microbiota is theorized to be critical to the programming of the neonatal immune system [14–17]. The maternal microbiota has long been hypothesized to be a major contributor of microbial strains to their newborn offspring through a process of vertical transmission [18]. As the neonate moves through the vaginal canal, it is expected to be exposed to the mother’s vaginal microbiota and perhaps their fecal and skin microbiota. It follows then that the microbiota of neonates born via C-section has been observed to transiently differ in composition from those born via vaginal delivery [14,19,20]. Studies seeking to demonstrate this process of vertical transmission have identified shared bacterial phylotypes in the microbiota of mothers and their neonates using 16S rRNA amplicon sequencing [21,22]. However, these data lack the resolution necessary to identify shared strains [23]. The most convincing evidence for vertical transmission comes from studies which either used cultivation or shotgun metagenomic based techniques to identify bacterial strains in the microbiota of mothers and their infants [24–28]. Yet, these probable vertically transmitted strains have been shown to be minority members of the neonate’s microbiota and to be short-lived [24]. More study is needed to define the provenance of a neonate’s microbiota.

Much of the work on maternal microbiota transmission has focused on the neonate’s intestinal, skin, or oral microbiota [24–28]. The source of the bacterial species and strains that inhabit the vagina is not known. Reproductive-age women routinely have communities which are dominated by Lactobacillus with L. crispatus, L. iners, L. jensenii, and L. gasseri being the most prevalent species [29,30]. A significant proportion of these women, however, have communities which do not contain a high relative abundance of lactobacilli and instead are characterized by a more even distribution of several obligate or facultative anaerobes including species in the Gardnerella, Atopobium, and Prevotella genera [29,30]. These Lactobacillus-deficient communities are more common among women of Hispanic and African descent [29–31] and have been associated with increased risk for adverse health events, including reproductive tract infections [32–36]. Less is known about the microbial communities which comprise the vaginal microbiota of pre-pubertal children, but they have been shown to differ in composition and in bacterial load from those found in reproductive-age women [37–39]. Any bacteria which are transferred at the time of birth may not be capable of surviving in a child’s vagina, which has been shown to have neutral or alkaline pH and to have a paucity of Lactobacillus [38]. It is not until early puberty that the species which are common in a reproductive-age woman’s vaginal microbiota (e.g. L. crispatus, L. iners, G. vaginalis) gain dominance in an adolescent’s vaginal microbiota [40]. While it is thought that the hormonal and physiological changes which occur during puberty are responsible for this shift in the composition of the vaginal microbiota [41–43], it is not clear where the strains come from. It could be that they are of maternal origin and have persisted at low abundance throughout early life or that they are acquired later in life through some other mechanism.

To determine whether there was evidence for the vertical transmission of the vaginal microbiota which persisted into adolescence and adulthood, we characterized the vaginal microbiota of pre-menopausal mothers and their post-menarcheal daughters. Metataxonomics was used to investigate similarities in community composition between the mother-daughter pairs and genome-resolved metagenomics was used to identify bacterial strains which the two had in common.


Cohort description/sample collection

We characterized the vaginal microbiota of 87 reproductive-age women including 45 daughters and their 42 mothers. The average age of the daughters was 19 (14–35) and the average age of the mothers was 41 (32–51). Participants included in this study self-identified as Black or African American and took part in a douching intervention study [44]. Data were not collected on the mode of birth (vaginal delivery versus C-section), although the estimated C-section rate in the US around the time of the daughters’ births was around 22% [45]. Vaginal swab specimens were collected and stored at -80°C. All participants or their legal guardians (in the case of minors) provided written informed consent prior to enrollment in the study. In addition, all minor participants provided written assent to their participation in the study. All procedures were conducted in accordance with relevant guidelines and regulations and were approved by the institutional review boards at the University of Maryland Baltimore (#HP-00045398) and the Johns Hopkins University School of Medicine (#NA-00004835).

DNA extraction

DNA was extracted from 200 μL of vaginal swab specimen resuspended in 1 mL of phosphate buffered saline transport medium. DNA extractions were performed using the MagAttract PowerMicrobiome DNA/RNA Kit (Qiagen; Hilden, Germany) and bead-beating on a TissueLyser II (Qiagen) according to the manufacturer’s instructions and automated on a Hamilton STAR robotic platform (Hamilton Robotics; Reno, NV, USA).

16S rRNA gene sequencing

The V3V4 region of the 16S rRNA gene was amplified and sequenced as described previously [46]. The protocol utilizes two amplification steps: one which targets the V3V4 region and one which adds barcoded sequencing (primer sequences in S1 Table). Pooled amplicons were then sequenced on an Illumina HiSeq 2500 (Illumina; San Diego, CA, USA) and the resulting paired-end sequence reads were processed using DADA2 [47] to identify amplicon sequence variants (ASVs) and remove chimeric sequences as described previously [29]. The median number of sequences per sample following processing was 16,990 (range: 182–36,401). Each ASV was assigned to a taxonomic group using the RDP Naïve Bayesian Classifier [48] trained with the SILVA 16S rRNA gene database [49]. Genera common in the vaginal environment (e.g. Lactobacillus, Gardnerella, Prevotella, Sneathia, and Mobiluncus) were further classified at the species level using speciateIT (version 1.0). The sequence counts attributed to ASVs assigned to the same phylotype were added together. Samples for which fewer than 500 sequences were generated were dropped from the analysis. Phylotypes with a study-wide average relative abundance of < 10⁻⁴ were removed. The final dataset contained 81 samples and 100 phylotypes and can be found in S1 Table. Taxonomic profiles were assigned to community state types (CSTs) using VALENCIA [29]. Similarity in taxonomic composition between mother-daughter pairs was assessed using the Yue-Clayton θ [50]. A permutation test was performed to determine whether these observed values differed from those expected by chance alone. Taxonomic profiles for mothers and daughters were each shuffled randomly and then similarity was assessed in the same manner. This process was repeated 100 times.
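As an illustration of the similarity measure, the sketch below computes the Yue-Clayton θ for two relative-abundance profiles using the standard form of the index (the sum of abundance products divided by the sum of squared differences plus the sum of products). The function name and the example profiles are hypothetical and are not taken from the study's code or data.

```python
def yue_clayton_theta(p, q):
    """Yue-Clayton similarity between two relative-abundance profiles.

    p and q map phylotype name -> relative abundance (each summing to ~1).
    """
    taxa = set(p) | set(q)
    # The sum of products is non-zero only for phylotypes present in both profiles.
    cross = sum(p.get(t, 0.0) * q.get(t, 0.0) for t in taxa)
    # Squared differences penalise mismatched abundances, shared or not.
    sq_diff = sum((p.get(t, 0.0) - q.get(t, 0.0)) ** 2 for t in taxa)
    denom = sq_diff + cross
    return cross / denom if denom > 0 else 0.0


# Hypothetical profiles, not taken from the study data.
mother = {"L_crispatus": 0.5, "L_iners": 0.5}
daughter = {"L_crispatus": 0.5, "G_vaginalis": 0.5}
theta = yue_clayton_theta(mother, daughter)  # 0.25 / (0.5 + 0.25) = 1/3
```

Identical profiles give θ = 1 and profiles with no phylotypes in common give θ = 0, so the index reflects both which phylotypes are shared and how closely their abundances agree.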

Shotgun metagenomics

Shotgun metagenomic data were generated for 22 families and included 47 total samples (22 mothers and 25 daughters). The samples selected for shotgun metagenomics are indicated in Fig 1. Families which either had similar taxonomic profiles or had species in common were selected for this analysis. Shotgun metagenomic sequence libraries were prepared from the extracted DNA using Illumina Nextera XT Flex kits according to manufacturer recommendations. The resulting libraries were sequenced on an Illumina HiSeq 4000 (10 per lane, 150 bp paired-end mode) at the Genomic Resource Center at the University of Maryland School of Medicine. The average number of read pairs generated for each library was 37,000,000 (range: 24,500,000 to 81,800,000). Human reads were identified in the resulting sequence datasets using BMtagger and removed. Sequence datasets were further processed using sortmeRNA [51] to identify and remove ribosomal RNA reads and the remaining reads were trimmed for quality (4 bp sliding window, average quality score threshold Q15) using Trimmomatic v0.3653 [52]. Reads trimmed to less than 75 bp were removed from the dataset.

Fig 1. Taxonomic composition of the vaginal microbiota of reproductive age mothers and daughters.

Stacked bars represent the relative abundances of individual bacterial phylotypes. Each plot displays the profiles for members belonging to the same family (M-Mother, D-Daughter). Two families (23,27) had 2 and 3 daughters, respectively. Samples denoted with black diamonds were selected for shotgun metagenomic analysis.

Genome resolved metagenomics

The taxonomic composition of each metagenome was established by mapping to the VIRGO non-redundant gene catalog [53]. De novo assembly was performed on each metagenome using metaspades [54,55] with k-mer sizes: 21, 33, 55, 77, 99, 101, and 127. The resulting assemblies were separated into single genome bins using a reference-guided approach. For each metagenome, the sequence reads were mapped back to the corresponding assembly, to establish the contig coverage, and to the VIRGO gene catalog, to establish the taxonomy of the contig. Contigs demonstrating at least 5X coverage and which were found to have at least 90% of the reads mapping to VIRGO genes with the same taxonomic annotation were separated into species bins. The species bins were further split into metagenome assembled genomes (MAGs) based on differences in contig coverage. The quality of the resulting MAGs was examined using checkM [56] and those demonstrating at least 80% completion and less than 5% contamination were used in the subsequent analyses (S2 Table). The average completeness of the MAGs was 97.04% (80.9%-100%) and the average contamination was 1.05% (0.0%-4.94%). Genes were identified in each MAG using prodigal [57] and OrthoMCL was used to identify those which were common to at least 95% of the MAGs [58]. Thirteen such genes were identified and their amino acid sequences were individually aligned using Muscle [59] and then concatenated into a single alignment using phyutility [60]. PartitionFinder was used to select an appropriate partitioning scheme and model of molecular evolution [61]. The phylogeny of the 225 MAGs was established using RaxML-ng with 10 parsimony and 10 random starting trees [62]. Bootstrap convergence was detected using the autoMRE setting and occurred after 750 replicates. Relative abundance of MAGs in their resident communities was approximated as the percent of reads from the metagenome mapping to the MAG.
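The two filtering rules described above (contig assignment and MAG quality) can be sketched as follows. This is a minimal illustration under stated assumptions, not the pipeline used in the study: the function names, the `Mag` record, and the input layout are hypothetical, and the 90% rule is interpreted here as the fraction of a contig's VIRGO-mapped reads that carry the most common taxonomic annotation.

```python
from dataclasses import dataclass


def assign_contig(coverage, reads_by_taxon, min_cov=5.0, min_frac=0.90):
    """Return the species bin for a contig, or None if it fails either rule.

    coverage: mean read depth of the contig.
    reads_by_taxon: taxon -> number of reads mapping to VIRGO genes with that annotation.
    """
    total = sum(reads_by_taxon.values())
    if coverage < min_cov or total == 0:
        return None
    taxon, n_reads = max(reads_by_taxon.items(), key=lambda kv: kv[1])
    return taxon if n_reads / total >= min_frac else None


@dataclass
class Mag:
    name: str
    completeness: float   # percent, e.g. as estimated by a tool such as checkM
    contamination: float  # percent


def passes_quality(mag, min_completeness=80.0, max_contamination=5.0):
    """At least 80% complete and less than 5% contaminated."""
    return mag.completeness >= min_completeness and mag.contamination < max_contamination
```

For example, `assign_contig(12.0, {"L_crispatus": 95, "L_iners": 5})` places the contig in the L. crispatus bin, whereas the same contig at 3X coverage, or with only 80% of reads agreeing, would be left unassigned.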

Identification of shared strains

Similarity between the MAGs was assessed using inStrain [63]. An all-versus-all strategy was used wherein a separate inStrain profile was built by mapping the sequence reads from each metagenome against the MAGs recovered from each participant using Bowtie2 [64]. This resulted in 2209 inStrain profiles. For each participant, the set of 47 inStrain profiles was then summarized using the inStrain compare function with Ward linkage, which afforded the determination of coverage overlap and sequence similarity between the sequence reads of each metagenome and the MAGs of each participant. We then applied a stringent sequence similarity threshold (70% coverage and at least 99.9% sequence similarity) to identify participants with shared bacterial strains. A network diagram representing strain sharing was built using the NetworkX Python package. Sequence similarity between shared strains identified in the same family versus different families was compared using a Wilcoxon rank sum test.
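The thresholding and network-building steps can be sketched as below. The all-versus-all design accounts for the profile count (47 metagenomes × 47 participant MAG sets = 2209). The record fields (`a`, `b`, `species`, `overlap`, `identity`) are hypothetical and do not correspond to inStrain's actual output columns, and a plain adjacency map stands in for the NetworkX graph so that the example has no third-party dependencies.

```python
from collections import defaultdict


def shared_strain_edges(comparisons, min_overlap=0.70, min_identity=0.999):
    """Keep between-participant comparisons that pass both thresholds."""
    edges = []
    for c in comparisons:
        if c["a"] == c["b"]:
            continue  # a participant compared against their own MAGs
        if c["overlap"] >= min_overlap and c["identity"] >= min_identity:
            edges.append((c["a"], c["b"], c["species"]))
    return edges


def same_family(a, b):
    """Participants are (role, family) tuples, e.g. ("M", 5) and ("D", 5)."""
    return a[1] == b[1]


def build_network(edges):
    """Undirected adjacency map: participant -> set of (partner, species)."""
    graph = defaultdict(set)
    for a, b, species in edges:
        graph[a].add((b, species))
        graph[b].add((a, species))
    return dict(graph)


# Hypothetical comparison records, not taken from the study data.
comparisons = [
    {"a": ("M", 5), "b": ("D", 5), "species": "L. crispatus", "overlap": 0.92, "identity": 0.9995},
    {"a": ("M", 5), "b": ("D", 8), "species": "L. iners", "overlap": 0.85, "identity": 0.9950},
    {"a": ("M", 1), "b": ("M", 3), "species": "L. jensenii", "overlap": 0.75, "identity": 0.9992},
]
edges = shared_strain_edges(comparisons)
network = build_network(edges)
within_family = [e for e in edges if same_family(e[0], e[1])]
```

In this toy input the second record is discarded because its identity falls below 99.9%, leaving one within-family edge and one between-family edge.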


Similarity in the taxonomic composition of mothers and their daughters

We first asked whether mothers and their daughters had vaginal microbiota which were similar in taxonomic composition. Metataxonomics was used to assess the composition of the vaginal microbiota for 42 mothers and their 45 daughters (Fig 1). One family had two daughters, and another had three (26 & 23, respectively). Similarity between communities was assessed using Yue & Clayton’s θ, which is a function of the relative abundances of shared and non-shared species in the communities. While there were several examples of mother-daughter pairs which were very similar in taxonomic composition (e.g. families 1, 8, 23), there were also several which were not (e.g. families 6, 15, 24). The average similarity between mother-daughter pairs was 0.3, indicating that most were found to have communities with different compositions. Similarity between mother-daughter pairs was higher when the daughter had a community state type (CST) that was not dominated by Lactobacillus (CST IV, Fig 2A).

Fig 2.

Similarity in the taxonomic composition between mother-daughter pairs, delineated by the daughter’s CST assignment (A). Higher values of the Yue-Clayton index signify communities that bear greater taxonomic compositional similarity. Permutation tests were used to establish whether the observed similarities between mothers and their daughters were different than that expected by chance alone (B). Black points represent the average number of permuted mother-daughter pairs whose similarity fell within 0.1 increments of the Yue-Clayton similarity index, while yellow stars represent the observed number of pairs. Error bars span the range between the 2.5% and 97.5% quantiles of the 100 random permutations.

Because the taxonomic composition of the human vaginal microbiota routinely resembles one of a limited number of configurations, some degree of similarity is expected even between entirely unrelated individuals. To determine whether the observed similarity between mother-daughter pairs exceeded that expected by chance alone, we used a permutation test. Taxonomic profiles were shuffled and the similarity between these randomized mother-daughter pairs was assessed in the same manner. As can be seen in Fig 2B, the distribution of similarity scores for the permuted data did not differ substantially from the observed distribution. The observed data were found to have slightly fewer pairs with a θ value between 0 and 0.1 (p = 0.02) and slightly more pairs with a θ value between 0.1 and 0.2 (p = 0.02). This result indicates that the observed similarity in composition between mother-daughter pairs was not different from that for randomly selected mother-daughter pairs.
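A minimal sketch of such a permutation test is given below. It assumes a precomputed square matrix `sim` in which `sim[i][j]` is the similarity between mother i and daughter j (one daughter per mother, a simplification of the cohort), re-pairs daughters with mothers at random, and tallies pairs in 0.1-wide bins as in Fig 2B. The function names and the nearest-rank quantile rule are assumptions made for illustration, not the authors' implementation.

```python
import random


def bin_counts(values, n_bins=10):
    """Count similarity values falling in equal-width bins over [0, 1]."""
    counts = [0] * n_bins
    for v in values:
        idx = min(int(v * n_bins), n_bins - 1)  # a value of 1.0 goes in the last bin
        counts[idx] += 1
    return counts


def permutation_bins(sim, n_perm=100, seed=0):
    """Return observed bin counts (true pairs on the diagonal) and a list of
    bin counts, one per random re-pairing of daughters with mothers."""
    rng = random.Random(seed)
    n = len(sim)
    observed = bin_counts([sim[i][i] for i in range(n)])
    order = list(range(n))
    permuted = []
    for _ in range(n_perm):
        rng.shuffle(order)
        permuted.append(bin_counts([sim[i][order[i]] for i in range(n)]))
    return observed, permuted


def percentile_interval(permuted, bin_idx, lower=0.025, upper=0.975):
    """Nearest-rank 2.5% and 97.5% quantiles of the permuted counts for one bin."""
    counts = sorted(row[bin_idx] for row in permuted)
    last = len(counts) - 1
    return counts[round(lower * last)], counts[round(upper * last)]
```

An observed bin count lying outside the interval returned by `percentile_interval` for that bin would suggest that true mother-daughter pairs fall in that similarity range more or less often than randomly matched pairs.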

Genome resolved metagenomics

In the above analysis we demonstrated that most mother-daughter pairs did not have vaginal microbiota with similar taxonomic profiles. Yet, many pairs were found to have species in common, just at different relative abundances (e.g. L. crispatus in family 4). To determine whether the populations of these shared species were comprised of the same strain(s), we selected 22 families (47 participants) to conduct shotgun metagenomic sequencing (denoted by black diamonds in Fig 1). The resulting metagenomes were assembled and binned allowing us to recover 225 near-complete MAGs (Fig 3). We recovered about 5 MAGs per metagenome with a minimum of 1 and a maximum of 13. The recovered MAGs included: 14 Atopobium, 12 “Ca. Lachnocurva vaginae”, 53 Gardnerella, 16 L. crispatus, 27 L. iners, 6 L. jensenii, 29 Prevotella, 10 S. amnii, and 10 S. sanguinegens. The remaining 48 MAGs comprised 24 species. We were able to recover four Gardnerella MAGs from a single metagenome which contained Gardnerella.

Fig 3. Phylogenetic tree displaying the taxonomic diversity of the 225 metagenome assembled genomes (MAGs) derived from the shotgun metagenomic data generated for 22 families.

The maximum likelihood phylogeny was established using a concatenated alignment of the amino acid sequences for 13 orthologous genes which were found to be present in at least 99% of the MAGs: gapD, gltX, ileS, pheS, pheT, cysS, hisS, uvrD, ruvX, rpsO, Ffh, obgE, and lepA.

Identification of shared strains

To identify pairs of MAGs which originate from the same bacterial strain we used the inStrain tool and associated workflow. If the vaginal microbiota is vertically transmitted, we expected that mothers and daughters which had species in common might also have strains in common. Furthermore, if this transmission had happened at the time of birth, the mother and daughter strains should not be identical but instead show some degree of sequence divergence consistent with the amount of time passed. For this reason, we used a stringent threshold for defining shared strains of at least 99.9% sequence identity and at least 70% overlap. Among our set of 225 MAGs, we identified 49 pairs which met this threshold. Among these, ten were between a mother and daughter from the same family, representing six mother-daughter pairs. These ten pairs of strains were found at both high and low relative abundances and their abundance was generally similar between the mother’s and daughter’s communities (Table 1). Daughters in pairings which were found to share strains were younger at the time of sampling than those in pairings which were not found to share strains (15.17 versus 19.62; t = -4.31; p<0.001). No trend was observed with the mother’s age (39.6 versus 41.9; t = -0.81; p = 0.45). Of these six mother-daughter pairs, three were found to share L. crispatus strains (families 1, 3, and 5) with family 5 also sharing MAGs classified as Clostridiales and P. timonensis. Family 30 shared MAGs classified as Gardnerella, L. iners, and A. rimae while family 21 shared only “Ca. L. vaginae” and family 41, only P. timonensis. The remaining 39 instances of shared strains were between daughters and mothers from different families (n = 18), mothers from different families (n = 13), and daughters from different families (n = 8). These pairings were parsed into a diagram representing the network of shared strains among the mothers and daughters in this study (Fig 4). A large part of this network was comprised of a L. crispatus and a L. jensenii strain which were identified in five and four metagenomes, respectively.

Fig 4. Network diagram of shared bacterial strains identified in this cohort.

A stringent threshold, 99.9% sequence identity, 70% coverage, was used to identify shared strains. Lines represent the shared bacterial strains and connect the participants in which the strain was found. Mothers are represented by circles and daughters by squares. Numbers on the nodes signify the family and can be linked back to the taxonomic profile of the participant using Fig 1.

Comparison of sequence identity among strains shared within versus between families

We next asked whether the strains identified as shared within mother-daughter pairs were more or less similar than those shared between families. Sequence similarity was measured as the number of single nucleotide polymorphisms (SNPs) per megabase pair of aligned sequence. We found that strains shared within families tended to be more similar to one another than those shared between families, but this difference was not significant (Fig 5, W = 136, p = 0.149). For the strains shared within families, these values were used in combination with the daughter’s age to calculate the per-year substitution rate under the hypothesis of vertical transmission at the time of birth. These values ranged from 1.05 × 10⁻⁶ to 3.24 × 10⁻⁵ per base pair per year and are listed in Table 1.
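The conversion from SNP density to a per-year rate can be illustrated with the short sketch below. It assumes the rate is simply the per-site divergence divided by the daughter's age; the text does not state whether the divergence was additionally halved to account for the mother's and daughter's populations evolving independently, so the `lineages` parameter is a hypothetical knob for that choice. The input values in the example are illustrative and are not taken from Table 1.

```python
def substitution_rate(snps_per_mbp, years, lineages=1):
    """Substitutions per base pair per year from SNPs per megabase pair.

    lineages=1 divides the divergence by the elapsed time only;
    lineages=2 assumes both populations accumulated changes independently.
    """
    if years <= 0 or lineages <= 0:
        raise ValueError("years and lineages must be positive")
    per_site = snps_per_mbp / 1_000_000  # SNPs per base pair
    return per_site / (years * lineages)


# Illustrative input: 100 SNPs per Mbp between strains, daughter aged 20.
rate = substitution_rate(100, 20)  # 1e-4 per site / 20 years = 5e-6
```

For reference, the 99.9% identity threshold corresponds to at most 1,000 SNPs per megabase pair, which bounds the rates that can be obtained for any given age.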

Fig 5. Number of single nucleotide polymorphisms identified in strains found to be shared between members of the same family or between members of different families.

Values are scaled per mega base pair of compared sequence.


The origin of the bacterial strains which constitute the vaginal microbiota is not currently known. We found limited evidence for vertical transmission of these bacteria from a mother to her daughter which had persisted through the daughter’s adolescence. While some mother-daughter pairs were found to have communities of similar taxonomic composition, the observed similarities could be explained by chance. A small subset of the mother-daughter pairs were also found to have bacterial strains in common, consistent with vertical transmission, but shared strains were more frequently identified in unrelated individuals. These results do not eliminate vertical transmission as a possible mechanism by which the vaginal microbiota is founded but rather suggest that mothers and daughters do not necessarily have similar vaginal microbiota later in life. Because we examined the communities years after the birth of the daughter, there was plenty of time for either the mother’s or the daughter’s vaginal microbiota to experience strain turnover. Longitudinal studies have indicated that the vaginal microbiota of reproductive-age women does experience changes in composition over time, although the studies followed women for only a few months [65,66]. Daughters who were found to share strains with their mother were also younger than those who were not, further suggesting that time may play a role. It is not difficult to imagine that the populations of some bacterial strains might go extinct over the course of a person’s life. The mechanisms by which new bacterial strains might be introduced into the vaginal microbiota are not well understood. Unprotected vaginal sex and other sexual practices could result in the introduction of new bacterial strains [67–69], but there are likely other mechanisms as well. Strain turnover in either the mother’s or the daughter’s vaginal microbiota would erode any signal of vertical transmission.

In our analysis, six mother-daughter pairs were found to have matching bacterial strains in their vaginal microbiota. Under the hypothesis of vertical transmission at the time of birth, substitutions are expected to accumulate in both the mother’s and the daughter’s populations, as they evolve independently post-transmission. We used the daughter’s age to calculate the substitution rate under this hypothesis and arrived at values between 3 × 10⁻⁵ and 1 × 10⁻⁶ substitutions per site-year for each shared strain. It is difficult to say whether our observed values fit the vertical transmission narrative, as the expected substitution rate is not well understood. A study on the population genomics of Neisseria gonorrhoeae, a sexually transmitted pathogen, estimated a rate of 3 × 10⁻⁵ substitutions per site-year [70]. Another study, which surveyed the substitution rates experienced by a number of human pathogens, estimated rates between 10⁻⁵ and 10⁻⁸ substitutions per site-year [71]. The authors also found a strong negative relationship between the estimated substitution rate and the timescale over which it was measured, and suggest that this relationship results from the accumulation of deleterious mutations which have yet to be purged by purifying selection [71]. The timescale separating our hypothesized vertically transmitted strains is rather short, consistent with our relatively high estimated substitution rates. Many of the bacteria common to the human vagina have reduced genome sizes and have lost components of DNA repair machinery [72,73]. These bacteria may experience higher than average mutation rates [74–77], which could lead to higher estimates of their substitution rate [78]. We cannot say for certain that our observation of shared strains among these six mother-daughter pairs is the result of vertical transmission at the time of birth, but we find this explanation reasonable.

The majority of shared strains were identified in women from unrelated families. We hypothesize that this observation may reflect the biogeography of vaginal bacteria. The mothers and daughters included in this study were all living in the Baltimore metropolitan area at the time of sampling. These individuals may be more likely to share bacterial strains with one another simply due to their geographic proximity. Lourens Baas Becking put forth the hypothesis that bacteria did not display biogeographic patterns, suggesting that in the microbial world “everything is everywhere, but the environment selects” [79]. In the years since Baas Becking put forth his hypothesis, there have been several demonstrations to the contrary [80–84]. The dispersal of some bacterial species appears to be constrained, leading them to exhibit biogeographic patterns. We suggest that many of the species common to the vaginal microbiota (e.g. L. crispatus, L. iners, L. jensenii, G. vaginalis, “Ca. L. vaginae”, A. vaginae) are among those likely to have their dispersal constrained. With the exception of L. crispatus, which is sometimes found in the intestinal tracts of chickens [85,86], these species are not routinely found anywhere other than the vagina. Many of these bacteria are also fastidious and require anaerobic conditions for their robust growth. It is therefore not clear how they might disperse over the great distances necessary to erode biogeographic patterns, except by means of their host. If dispersal is primarily achieved via sexual activity, sexual networks could underpin biogeographic patterns observed for vaginal bacteria [87]. Additional studies of participants from around the world might further illuminate these biogeographic patterns and the factors which govern them. Results from such studies could have translational impact and would inform whether geo-adapted probiotic formulations are needed to modulate the vaginal microbiota.

We implemented a stringent sequence similarity cutoff (≥99.9% similarity) to identify strains which may have been vertically transferred from mother to daughter. This is because it is not enough to identify instances where two bacterial assemblies belong to the same lineage. Their genome sequences must also be similar enough that any observed sequence differences can be explained by the post-transmission evolution of the two populations. In our case, the transmission was hypothesized to have occurred at birth, meaning that the two populations had evolved independently, in most cases, for about two decades. Other studies which have examined the maternal transmission of microbes from mother to offspring have done so shortly after birth [24–26]. In this case, there should be minimal sequence differences between the two populations of vertically transferred strains. It makes sense, then, to utilize even more stringent sequence similarity cutoffs than that used here (e.g. ≥99.99% similarity). Marker-gene-based tools like StrainPhlan [23] do not have the capability to implement such genome-wide sequence similarity cutoffs and therefore may not be the best tool for this analysis. We, as have others [63], advocate for the use of appropriate sequence similarity cutoffs to identify recent microbial transfer events. Identifying the same species or even the same bacterial strain in two samples is not enough.

This study has a number of important limitations which should be considered. First, we examined the mother’s and daughter’s vaginal microbiota not at birth, but instead sometime after the daughter had experienced menarche. This means that there was plenty of time for the daughter or the mother to gain or lose bacterial strains in their communities. Second, we are missing a great deal of metadata which could help explain why some mother-daughter pairs were found to share bacterial strains, and some were not. For example, we do not know which daughters were born by cesarean section and which were vaginally delivered. Nor do we have sexual behavior or partner history data for the participants in this study. Third, the participants had originally been enrolled in a douching intervention study: douching may influence the composition of the vaginal microbiota [88]. Finally, the cohort examined was relatively small and included only women in the greater Baltimore metropolitan area who identified as Black or African American.

Yet even with these limitations, we did identify several mother-daughter pairs which shared strains whose sequences were similar enough to have been vertically transmitted at the time of birth. These results motivate future studies which investigate the extent to which the vaginal microbiota is vertically transmitted and the importance of this event to the daughter’s future reproductive health. We also identified shared strains in unrelated individuals, suggesting that vaginal bacteria might display biogeographic patterns. These patterns could be confirmed by large-scale, multi-regional studies which examine the extent to which strains of vaginal bacteria show geographic specificity.

Supporting information

S1 Table. Taxonomic compositions.

Table containing the taxonomic composition derived from the 16S rRNA gene amplicon sequencing.


S2 Table. Metagenome assembled genomes inventory.

Table describing the completeness and degree of contamination of the metagenome assembled genomes generated in this study.



References

  1. Human Microbiome Project Consortium. Structure, function and diversity of the healthy human microbiome. Nature. 2012;486(7402):207–14. pmid:22699609
  2. Cho I, Blaser MJ. The human microbiome: at the interface of health and disease. Nat Rev Genet. 2012;13(4):260–70. pmid:22411464
  3. Pflughoeft KJ, Versalovic J. Human Microbiome in Health and Disease. Annual Review of Pathology: Mechanisms of Disease: Annual Reviews; 2012. p. 99–122. pmid:21910623
  4. Robertson RC, Manges AR, Finlay BB, Prendergast AJ. The Human Microbiome and Child Growth—First 1000 Days and Beyond. Trends Microbiol. 2019;27(2):131–47. pmid:30529020
  5. Gritz EC, Bhandari V. The human neonatal gut microbiome: a brief review. Front Pediatr. 2015;3:17. pmid:25798435
  6. Moore RE, Townsend SD. Temporal development of the infant gut microbiome. Open Biol. 2019;9(9):190128. pmid:31506017
  7. Leiby JS, McCormick K, Sherrill-Mix S, Clarke EL, Kessler LR, Taylor LJ, et al. Lack of detection of a human placenta microbiome in samples from preterm and term deliveries. Microbiome. 2018;6(1):196. pmid:30376898
  8. Perez-Munoz ME, Arrieta MC, Ramer-Tait AE, Walter J. A critical assessment of the "sterile womb" and "in utero colonization" hypotheses: implications for research on the pioneer infant microbiome. Microbiome. 2017;5(1):48. pmid:28454555
  9. Rackaityte E, Halkias J, Fukui EM, Mendoza VF, Hayzelden C, Crawford ED, et al. Viable bacterial colonization is highly limited in the human intestine in utero. Nat Med. 2020;26(4):599–607. pmid:32094926
  10. Romero R, Miranda J, Chaemsaithong P, Chaiworapongsa T, Kusanovic JP, Dong Z, et al. Sterile and microbial-associated intra-amniotic inflammation in preterm prelabor rupture of membranes. J Matern Fetal Neonatal Med. 2015;28(12):1394–409. pmid:25190175
  11. Aagaard K, Ma J, Antony KM, Ganu R, Petrosino J, Versalovic J. The placenta harbors a unique microbiome. Science Translational Medicine. 2014;6(237):237ra65.
  12. Silverstein RB, Mysorekar IU. Group therapy on in utero colonization: seeking common truths and a way forward. Microbiome. 2021;9(1):7. pmid:33436100
  13. Walter J, Hornef MW. A philosophical perspective on the prenatal in utero microbiome debate. Microbiome. 2021;9(1):5. pmid:33436093
  14. Wampach L, Heintz-Buschart A, Fritz JV, Ramiro-Garcia J, Habier J, Herold M, et al. Birth mode is associated with earliest strain-conferred gut microbiome functions and immunostimulatory potential. Nat Commun. 2018;9(1):5091. pmid:30504906
  15. de Aguero MG, Ganal-Vanarburg SC, Fuhrer T, Rupp S, Uchimura Y, Li H, et al. The maternal microbiota drives early postnatal innate immune development. Science. 2016;351(6279):1296–302. pmid:26989247
  16. Dzidic M, Boix-Amoros A, Selma-Royo M, Mira A, Collado MC. Gut Microbiota and Mucosal Immunity in the Neonate. Med Sci (Basel). 2018;6(3). pmid:30018263
  17. Sanidad KZ, Zeng MY. Neonatal gut microbiome and immunity. Curr Opin Microbiol. 2020;56:30–7. pmid:32634598
  18. Mueller NT, Bakacs E, Combellick J, Grigoryan Z, Dominguez-Bello MG. The infant microbiome development: Mom matters. Trends Mol Med. 2015. p. 109–17. pmid:25578246
  19. Dominguez-Bello MG, Costello EK, Contreras M, Magris M, Hidalgo G, Fierer N, et al. Delivery mode shapes the acquisition and structure of the initial microbiota across multiple body habitats in newborns. Proc Natl Acad Sci U S A. 2010;107(26):11971–5. pmid:20566857
  20. Stinson LF, Payne MS, Keelan JA. A Critical Review of the Bacterial Baptism Hypothesis and the Impact of Cesarean Delivery on the Infant Microbiome. Frontiers in Medicine: Frontiers; 2018. p. 135. pmid:29780807
  21. Mortensen MS, Rasmussen MA, Stokholm J, Brejnrod AD, Balle C, Thorsen J, et al. Modeling transfer of vaginal microbiota from mother to infant in early life. Elife. 2021;10. pmid:33448927
  22. Backhed F, Roswall J, Peng Y, Feng Q, Jia H, Kovatcheva-Datchary P, et al. Dynamics and Stabilization of the Human Gut Microbiome during the First Year of Life. Cell Host Microbe. 2015;17(5):690–703. pmid:25974306
  23. Truong DT, Tett A, Pasolli E, Huttenhower C, Segata N. Microbial strain-level population structure and genetic diversity from metagenomes. Genome Res. 2017;27(4):626–38. pmid:28167665
  24. Ferretti P, Pasolli E, Tett A, Asnicar F, Gorfer V, Fedi S, et al. Mother-to-Infant Microbial Transmission from Different Body Sites Shapes the Developing Infant Gut Microbiome. Cell Host Microbe. 2018;24(1):133–45 e5. pmid:30001516
  25. Yassour M, Jason E, Hogstrom LJ, Arthur TD, Tripathi S, Siljander H, et al. Strain-Level Analysis of Mother-to-Child Bacterial Transmission during the First Few Months of Life. Cell Host Microbe. 2018;24(1):146–54 e4. pmid:30001517
  26. Wang S, Zeng S, Egan M, Cherry P, Strain C, Morais E, et al. Metagenomic analysis of mother-infant gut microbiome reveals global distinct and shared microbial signatures. Gut Microbes. 2021;13(1):1–24. pmid:33960282
  27. Milani C, Mancabelli L, Lugli GA, Duranti S, Turroni F, Ferrario C, et al. Exploring Vertical Transmission of Bifidobacteria from Mother to Child. Appl Environ Microbiol. 2015;81(20):7078–87. pmid:26231653
  28. Makino H, Kushiro A, Ishikawa E, Muylaert D, Kubota H, Sakai T, et al. Transmission of intestinal Bifidobacterium longum subsp. longum strains from mother to infant, determined by multilocus sequencing typing and amplified fragment length polymorphism. Appl Environ Microbiol. 2011;77(19):6788–93. pmid:21821739
  29. France MT, Ma B, Gajer P, Brown S, Humphrys MS, Holm JB, et al. VALENCIA: a nearest centroid classification method for vaginal microbial communities based on composition. Microbiome. 2020;8(1):166. pmid:33228810
  30. Ravel J, Gajer P, Abdo Z, Schneider GM, Koenig SSK, Mcculle SL, et al. Vaginal microbiome of reproductive-age women. Proceedings of the National Academy of Sciences. 2011. p. 4680–7. pmid:20534435
  31. Zhou X, Brown CJ, Abdo Z, Davis CC, Hansmann MA, Joyce P, et al. Differences in the composition of vaginal microbial communities found in healthy Caucasian and black women. ISME J. 2007;1(2):121–33. pmid:18043622
  32. Ma B, Forney LJ, Ravel J. Vaginal Microbiome: Rethinking Health and Disease. Annual Review of Microbiology. 2012. p. 371–89. pmid:22746335
  33. Gosmann C, Anahtar MN, Handley SA, Huttenhower C, Farcasanu M, Abu-Ali G, et al. Lactobacillus-deficient cervicovaginal bacterial communities are associated with increased HIV acquisition in young South African women. Immunity. 2017. p. 29–37. pmid:28087240
  34. Tamarelle J, de Barbeyrac B, Le Hen I, Thiebaut A, Bebear C, Ravel J, et al. Vaginal microbiota composition and association with prevalent Chlamydia trachomatis infection: a cross-sectional study of young women attending a STI clinic in France. Sex Transm Infect. 2018;94(8):616–8. pmid:29358524
  35. Elovitz MA, Gajer P, Riis V, Brown AG, Humphrys MS, Holm JB, et al. Cervicovaginal microbiota and local immune response modulate the risk of spontaneous preterm delivery. Nat Commun. 2019;10(1):1305. pmid:30899005
  36. Petrova MI, Lievens E, Malik S, Imholz N, Lebeer S. Lactobacillus species as biomarkers and agents that can promote various aspects of vaginal health. Frontiers in Physiology. 2015. p. 1–18.
  37. Gerstner GJ, Grunberger W, Boschitsch E, Rotter M. Vaginal organisms in prepubertal children with and without vulvovaginitis. Archives in gynecology. 1982;231:247–52.
  38. Hammerschlag MR, Alpert S, Onderdonk AB, Thurston P, Ellen D, McCormack WM, et al. Anaerobic microflora of the vagina in children. AJOG. 1978;131(8):853–6. pmid:686083
  39. Hill GB, St. Claire KK, Gutman LT. Anaerobes predominate among the vaginal microflora of prepubertal girls. Clinical Infectious Diseases. 1995;20:S269–S70. pmid:7548572
  40. Hickey RJ, Zhou X, Settles ML, Erb J, Malone K, Hansmann MA, et al. Vaginal microbiota of adolescent girls prior to the onset of menarche resemble those of reproductive-age women. mBio. 2015. p. e00097–15. pmid:25805726
  41. Hickey RJ, Zhou X, Pierson JD, Ravel J, Forney LJ. Understanding vaginal microbiome complexity from an ecological perspective. Translational Research. 2012. p. 267–82. pmid:22683415
  42. Kaur H, Merchant M, Haque MM, Mande SS. Crosstalk Between Female Gonadal Hormones and Vaginal Microbiota Across Various Phases of Women’s Gynecological Lifecycle. Front Microbiol. 2020;11:551. pmid:32296412
  43. Nunn KL, Ridenhour BJ, Chester EM, Vitzthum VJ, Fortenberry JD, Forney LJ. Vaginal Glycogen, Not Estradiol, Is Associated With Vaginal Bacterial Community Composition in Black Adolescent Women. J Adolesc Health. 2019;65(1):130–8. pmid:30879880
  44. Mark H, Sherman SG, Nanda J, Chambers-thomas T, Barnes M, Rompalo A. What has Changed about Vaginal Douching among African American Mothers and Daughters? 2010;27:418–24.
  45. Gregory KD, Curtin SC, Taffel SM, Notzon FC. Changes in indications for Cesarean Delivery: United States, 1985 and 1994. American Journal of Public Health. 1998;88(9):1384–7.
  46. Holm JB, Humphrys M, Robinson CK, Settles ML, Ott S, Fu L, et al. Ultra-high throughput multiplexing and sequencing of >500 bp amplicon regions on the Illumina HiSeq 2500 platform. mSphere. 2019. p. e00029–19.
  47. Callahan BJ, McMurdie PJ, Rosen MJ, Han AW, Johnson AJA, Holmes SP. DADA2: High-resolution sample inference from Illumina amplicon data. Nature Methods. 2016. p. 581–3. pmid:27214047
  48. Wang Q, Garrity GM, Tiedje JM, Cole JR. Naive Bayesian classifier for rapid assignment of rRNA sequences into the new bacterial taxonomy. Appl Environ Microbiol. 2007;73(16):5261–7. pmid:17586664
  49. Quast C, Pruesse E, Yilmaz P, Gerken J, Schweer T, Glöckner FO, et al. The SILVA ribosomal RNA gene database project: improved data processing and web-based tools. Nucleic Acids Res. 2013;41:590–6.
  50. Yue JC, Clayton MK. A Similarity Measure Based on Species Proportions. Communications in Statistics—Theory and Methods. 2005;34(11):2123–31.
  51. Kopylova E, Noé L, Touzet H. SortMeRNA: fast and accurate filtering of ribosomal RNAs in metatranscriptomic data. Bioinformatics. 2012;28(24):3211–7. pmid:23071270
  52. Bolger AM, Lohse M, Usadel B. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics (Oxford, England): Oxford University Press; 2014. p. 2114–20.
  53. Ma B, France MT, Crabtree J, Holm JB, Humphrys MS, Brotman RM, et al. A comprehensive non-redundant gene catalog reveals extensive within-community intraspecies diversity in the human vagina. Nat Commun. 2020;11(1):940. pmid:32103005
  54. Bankevich A, Nurk S, Antipov D, Gurevich AA, Dvorkin M, Kulikov AS, et al. SPAdes: a new genome assembly algorithm and its applications to single-cell sequencing. J Comput Biol. 2012;19(5):455–77. pmid:22506599
  55. Nurk S, Meleshko D, Korobeynikov A, Pevzner PA. metaSPAdes: a new versatile metagenomic assembler. Genome Res. 2017;27(5):824–34. pmid:28298430
  56. Parks DH, Imelfort M, Skennerton CT, Hugenholtz P, Tyson GW. CheckM: assessing the quality of microbial genomes recovered from isolates, single cells, and metagenomes. Genome Res. 2015;25(7):1043–55. pmid:25977477
  57. Hyatt D, Chen G-L, LoCascio PF, Land ML, Larimer FW, Hauser LJ. Prodigal: prokaryotic gene recognition and translation site identification. BMC Bioinformatics. 2010;11:119.
  58. Li L. OrthoMCL: Identification of Ortholog Groups for Eukaryotic Genomes. Genome Research. 2003;13(9):2178–89. pmid:12952885
  59. Edgar RC. MUSCLE: a multiple sequence alignment method with reduced time and space complexity. BMC Bioinformatics. 2004;5:113. pmid:15318951
  60. Smith SA, Dunn CW. Phyutility: A phyloinformatics tool for trees, alignments and molecular data. Bioinformatics. 2008. p. 715–6. pmid:18227120
  61. Lanfear R, Calcott B, Ho SY, Guindon S. Partitionfinder: combined selection of partitioning schemes and substitution models for phylogenetic analyses. Mol Biol Evol. 2012;29(6):1695–701. pmid:22319168
  62. Kozlov AM, Darriba D, Flouri T, Morel B, Stamatakis A. RAxML-NG: a fast, scalable and user-friendly tool for maximum likelihood phylogenetic inference. Bioinformatics. 2019;35(21):4453–5. pmid:31070718
  63. Olm MR, Crits-Christoph A, Bouma-Gregson K, Firek BA, Morowitz MJ, Banfield JF. inStrain profiles population microdiversity from metagenomic data and sensitively detects shared microbial strains. Nat Biotechnol. 2021. pmid:33462508
  64. Langmead B, Salzberg SL. Fast gapped-read alignment with Bowtie 2. Nature Methods: Nature Publishing Group; 2012. p. 357–9. pmid:22388286
  65. Gajer P, Brotman RM, Bai G, Sakamoto J, Schutte UME, Zhong X, et al. Temporal dynamics of the human vaginal microbiota. Science Translational Medicine. 2012. p. 132ra52. pmid:22553250
  66. Srinivasan S, Liu C, Mitchell CM, Fiedler TL, Thomas KK, Agnew KJ, et al. Temporal variability of human vaginal bacteria and relationship with Bacterial vaginosis. PLoS ONE. 2010;5:e10197. pmid:20419168
  67. Mandar R, Punab M, Borovkova N, Lapp E, Kiiker R, Korrovits P, et al. Complementary seminovaginal microbiome in couples. Res Microbiol. 2015;166(5):440–7. pmid:25869222
  68. Vodstrcil LA, Walker SM, Hocking JS, Law M, Forcey DS, Fehler G, et al. Incident bacterial vaginosis (BV) in women who have sex with women is associated with behaviors that suggest sexual transmission of BV. Clin Infect Dis. 2015;60(7):1042–53. pmid:25516188
  69. Mehta SD, Zhao D, Green SJ, Agingu W, Otieno F, Bhaumik R, et al. The Microbiome Composition of a Man’s Penis Predicts Incident Bacterial Vaginosis in His Female Sex Partner With High Accuracy. Front Cell Infect Microbiol. 2020;10:433. pmid:32903746
  70. Perez-Losada M, Crandall KA, Zenilman J, Viscidi RP. Temporal trends in gonococcal population genetics in a high prevalence urban community. Infect Genet Evol. 2007;7(2):271–8. pmid:17141576
  71. Duchene S, Holt KE, Weill FX, Le Hello S, Hawkey J, Edwards DJ, et al. Genome-scale rates of evolutionary change in bacteria. Microb Genom. 2016;2(11):e000094. pmid:28348834
  72. France MT, Mendes-Soares H, Forney LJ. Genomic comparisons of Lactobacillus crispatus and Lactobacillus iners reveal potential ecological drivers of community composition in the vagina. Applied and Environmental Microbiology. 2016. p. 7063–73. pmid:27694231
  73. Macklaim JM, Gloor GB, Anukam KC, Cribby S, Reid G. At the crossroads of vaginal health and disease, the genome sequence of Lactobacillus iners AB-1. Proceedings of the National Academy of Sciences of the United States of America. 2011. p. 4688–95. pmid:21059957
  74. Acosta S, Carela M, Garcia-Gonzalez A, Gines M, Vicens L, Cruet R, et al. DNA Repair Is Associated with Information Content in Bacteria, Archaea, and DNA Viruses. J Hered. 2015;106(5):644–59. pmid:26320243
  75. Batut B, Knibbe C, Marais G, Daubin V. Reductive genome evolution at both ends of the bacterial population size spectrum. Nat Rev Microbiol. 2014;12(12):841–50. pmid:25220308
  76. Bourguignon T, Kinjo Y, Villa-Martin P, Coleman NV, Tang Q, Arab DA, et al. Increased Mutation Rate Is Linked to Genome Reduction in Prokaryotes. Curr Biol. 2020;30(19):3848–55 e4. pmid:32763167
  77. Marais GA, Calteau A, Tenaillon O. Mutation rate and genome reduction in endosymbiotic and free-living bacteria. Genetica. 2008;134(2):205–10. pmid:18046510
  78. Bromham L. Why do species vary in their rate of molecular evolution? Biol Lett. 2009;5(3):401–4. pmid:19364710
  79. Baas-Becking LGM. Geobiologie of Inleiding Tot de Milieukunde. The Hague, The Netherlands: W. P. van Stockum & Zoon N. V.; 1934.
  80. Cho JC, Tiedje JM. Biogeography and degree of endemicity of fluorescent Pseudomonas strains in soil. Applied and Environmental Microbiology. 2000. p. 5448–56. pmid:11097926
  81. Remold SK, Purdy-Gibson ME, France MT, Hundley TC. Pseudomonas putida and Pseudomonas fluorescens species group recovery from human homes varies seasonally and by environment. PLoS ONE. 2015. p. e0127704. pmid:26023929
  82. Amenyogbe N, Dimitriu P, Smolen KK, Brown EM, Shannon CP, Tebbutt SJ, et al. Biogeography of the relationship between the child gut microbiome and innate immune system. mBio. 2020;12(1):e03079–20.
  83. Li SP, Wang P, Chen Y, Wilson MC, Yang X, Ma C, et al. Island biogeography of soil bacteria and fungi: similar patterns, but different mechanisms. ISME J. 2020;14(7):1886–96. pmid:32341471
  84. Martiny JBH, Bohannan BJM, Brown JH, Colwell RK, Fuhrman JA, Green JL, et al. Microbial biogeography: putting microorganisms on the map. Nature Reviews Microbiology. 2006. p. 102–12. pmid:16415926
  85. Pan M, Hidalgo-Cantabrana C, Barrangou R. Host and body site-specific adaptation of Lactobacillus crispatus genomes. NAR Genomics and Bioinformatics. 2020;2(1). pmid:33575551
  86. Adhikari B, Kwon YM. Characterization of the Culturable Subpopulations of Lactobacillus in the Chicken Intestinal Tract as a Resource for Probiotic Development. Front Microbiol. 2017;8:1389. pmid:28798730
  87. Kenyon C, Buyze J, Klebanoff M, Brotman RM. The role of sexual networks in studies of how BV and STIs increase the risk of subsequent reinfection. Epidemiol Infect. 2018;146(15):2003–9. pmid:30182860
  88. Brotman RM, Klebanoff MA, Nansel TR, Andrews WW, Schwebke JR, Zhang J, et al. A longitudinal study of vaginal douching and bacterial vaginosis—a marginal structural modeling analysis. Am J Epidemiol. 2008;168(2):188–96. pmid:18503038