Evidence for genetically-based sperm discrimination in the vaginal tract of a primate species

  • Rachel M. Petersen,

    Roles Conceptualization, Data curation, Formal analysis, Funding acquisition, Investigation, Methodology, Visualization, Writing – original draft, Writing – review & editing

    Rachel.m.petersen@vanderbilt.edu

    Current address: Department of Biological Sciences, Vanderbilt University, Nashville, Tennessee, United States of America

    Affiliation Department of Anthropology, New York University, New York, New York, United States of America

  • Lee (Emily) M. Nonnamaker,

    Roles Investigation

    Current address: Department of Biology, University of Florida, Gainesville, Florida, United States of America

    Affiliation Department of Biological Sciences, University of Notre Dame, Notre Dame, Indiana, United States of America

  • Jaclyn A. Anderson,

    Roles Investigation

    Affiliation Department of Anthropology and Archaeology, University of Calgary, Calgary, Alberta, Canada

  • Christina M. Bergey,

    Roles Methodology

    Affiliation Department of Genetics, Human Genetics Institute, Rutgers University, Piscataway, New Jersey, United States of America

  • Christian Roos,

    Roles Methodology, Writing – review & editing

    Affiliations Primate Genetics Laboratory, German Primate Center, Leibniz Institute for Primate Research, Göttingen, Germany, Gene Bank of Primates, German Primate Center, Leibniz Institute for Primate Research, Göttingen, Germany

  • Amanda D. Melin,

    Roles Methodology, Writing – review & editing

    Affiliations Department of Anthropology and Archaeology, University of Calgary, Calgary, Alberta, Canada, Department of Medical Genetics, University of Calgary, Calgary, Alberta, Canada, Alberta Children’s Hospital Research Institute, University of Calgary, Calgary, Alberta, Canada

  • James P. Higham

    Roles Conceptualization, Funding acquisition, Methodology, Writing – review & editing

    Affiliation Department of Anthropology, New York University, New York, New York, United States of America

Abstract

Females influence offspring paternity through diverse pre- and post-copulatory mechanisms. Sperm discrimination—the differential physiological response to ejaculates based on male or sperm characteristics—can bias fertilization outcomes, but in vivo evidence of this process in large-bodied mammals is lacking. Here, in a study of nine females and four males, we tested whether two aspects of female physiology that affect sperm survival—vaginal immune response and pH—are modulated by male genetic makeup in a non-human primate, the olive baboon (Papio anubis). Our findings suggest post-copulatory differences in vaginal gene expression and pH, with the strongest immune responses and largest pH decreases, harmful to sperm, exhibited by females mating with genetically similar males. These findings are consistent with genetically-based post-copulatory mate discrimination, offering new insights into how interactions between male gametes and the female reproductive tract may shape conception probability in primates.

Introduction

Characterizing the mechanisms and outcomes of sexual selection, and specifically mate choice, has been a major goal of evolutionary biologists [1–4]. Female mate choice can occur either prior to copulation in the form of behavioral mating biases, or after copulation in the form of fertilization biases, a process termed cryptic female choice (CFC) [5–8]. To date, empirical evidence demonstrating in vivo CFC in mammals is concentrated in rodent taxa [9,10]. However, the heightened maternal investment and prolonged offspring care common to large-bodied mammals, as well as discrepancies between mating observations and genetic paternity, suggest that CFC may be widespread [11–13]. Moreover, investigating these processes in species that share aspects of their reproductive physiology with humans, such as other primates, is likely critical for improving our understanding of human infertility.

Studies indicate that the female reproductive tract can discriminate between sperm cells based on their genetic material [14–16], providing a potential mechanism for genetically-based CFC. In mammals, in vitro experiments in mice show higher fertilization success for sperm from more distantly related males [17], and artificial insemination experiments in pigs reveal dramatic shifts in oviductal gene expression in response to sex-sorted X- versus Y-chromosome-bearing sperm [18]. In humans, in vitro experiments have shown both differential sperm responsiveness to follicular fluid and differential gene expression in vaginal epithelial cells in response to seminal fluid; however, how these responses relate to the genetic make-up of the egg and sperm remains uncertain [19,20]. The major histocompatibility complex (MHC) is a highly polymorphic genomic region involved in pathogen identification and immune response regulation. It is also an attractive candidate target of CFC due to its previously implicated role in mate choice and important contribution to reproductive success [21–23]. While pre-copulatory MHC-based mate preferences are well documented across taxa, including non-human primates [24–29], the role of the MHC in post-copulatory sexual selection remains largely unexplored. Evidence for MHC-driven sperm selection is limited to a handful of studies in rodents, fish, and birds [30–33], with no documented evidence in primates, despite its potential relevance to human fertility.

In this study, we aimed to explore potential mechanisms of CFC in a non-human primate, the olive baboon (Papio anubis). Olive baboon females mate with multiple males across their ovarian cycle; however, males often attempt to monopolize access to fertile females through mate guarding. These consortships, in which a male closely associates with and guards a female, can persist for several hours to multiple days, during which time the ejaculate from only a single male may be present in the female’s reproductive tract [34]. Furthermore, females invest substantial energy in each offspring and have a relatively slow reproductive rate, providing conditions that are likely to promote selection for CFC [35]. We focused on vaginal pH and gene expression, as these may contribute to sperm survival [36,37] and can be characterized following mating in unanesthetized individuals using positive reinforcement training. We conducted both genome-wide reduced representation DNA sequencing and MHC genotyping on four intact males and nine parous females and strategically paired each male with 2–3 females to encompass a broad range of genetic diversity and complementarity (i.e., similarity) values across mating dyads. We first characterized vaginal pH and gene expression across the cycle in the absence of mating and used these samples as baseline comparisons for post-copulatory responses. We asked how vaginal gene expression and pH change: (1) across female ovarian cycle phases; (2) in response to mating; and (3) in relation to the genetic diversity and complementarity of the mating male. We hypothesized that females would exhibit a stronger immune response and lower vaginal pH, both potentially harmful to sperm survival, after mating with males who are less genetically diverse and complementary.
We predicted this pattern based on the selective pressures favoring offspring with greater genetic diversity, particularly at the MHC, while reducing the risks associated with inbreeding.

Results

Vaginal gene expression varies across the ovarian cycle

We determined the timing of ovulation based on vaginal cytology (Fig 1A; see Materials and methods). We identified a 5-day fertile phase, a 5-day pre-fertile phase, a 5-day post-fertile phase, and classified the remainder of the cycle as the non-fertile phase [38–41]. We analyzed 32 non-copulatory vaginal RNA samples from eight females (8 per cycle phase) and 275 non-copulatory pH measurements from nine females (68.8 ± 21.8 s.d. per cycle phase; Fig 1A; additional details on dataset composition provided in Table A in S2 Appendix).

Fig 1. Differential gene expression measured across ovarian cycle phases.

(A) We analyzed 32 RNA-seq samples and 275 pH measurements taken across the four cycle phases as determined by vaginal cytology; (B) The number of differentially expressed (DE) genes across each phase comparison. The largest differences in gene expression were observed when comparing the non-fertile phase to the pre-fertile, fertile, and post-fertile phases; (C) Numerous DE genes were unique to particular cycle phases (e.g., 143 DE genes were unique to the fertile phase), while others were shared across two or more phases (e.g., 1,974 DE genes were shared across the pre-fertile, fertile, and post-fertile phases); (D) Enrichment distributions showing the ranked distribution of genes in the top 5 over/underrepresented gene sets in the fertile phase; (E) Normalized read counts of TLR2, a gene involved in the positive regulation of the inflammatory response and suppressed in the fertile phase; (F) Normalized read counts of SLC4A8, a gene involved in ion transmembrane transport and activated in the fertile phase. The data underlying this figure are provided in S1 Data. Artwork by LMN.

https://doi.org/10.1371/journal.pbio.3003699.g001

To understand baseline vaginal gene expression, we performed differential gene expression analyses with robust empirical Bayes moderation followed by adaptive shrinkage [42,43]. We included cycle phase as the predictor variable, female ID as a blocking factor (i.e., a random effect), and RNA quality (RIN) as a covariate. As some samples were taken prior to male introduction into the enclosure, we also included “male presence” as a covariate (a binary yes or no variable). We found 2,480 differentially expressed (DE) genes between the non-fertile versus pre-fertile phase (2,330 at LFSR < 0.05, 2,050 at LFSR < 0.01), 2,537 between the non-fertile versus fertile phase (2,403 at LFSR < 0.05, 2,174 at LFSR < 0.01), and 2,277 between the non-fertile versus post-fertile phase (2,080 at LFSR < 0.05, 1,675 at LFSR < 0.01; Fig 1B), with many of these shared across the pre-fertile, fertile and post-fertile phases (n = 1,974; Fig 1C and Table B in S2 Appendix). We performed gene set enrichment analysis (GSEA) to describe the biological functions of DE genes and found that the most strongly upregulated pathways in the fertile phase involve G protein-coupled receptor activity and ion transmembrane transport, and the most strongly downregulated pathways include positive regulation of the inflammatory response, phagocytic vesicles, and cell adhesion (Fig 1D and Fig A in S1 Appendix and Table C in S2 Appendix). For example, TLR2, a gene involved in the positive regulation of the inflammatory response, was suppressed in the fertile phase (Fig 1E) and SLC4A8, a gene involved in ion transmembrane transport, was activated in the fertile phase (Fig 1F).

To assess changes in vaginal pH across the cycle, we used robust linear mixed models [44] including female ID as a random effect and average temperature, which is known to impact pH readings, as a covariate [45,46]. We did not find statistically significant differences in vaginal pH across cycle phases (Fig B in S1 Appendix and Table D in S2 Appendix).

Vaginal gene expression and pH indicate responses to copulation

We analyzed 25 post-copulatory RNA samples from six females and 15 post-copulatory pH measurements from five females (described in Table A in S2 Appendix), both collected four hours after an observed copulation with ejaculation to maximize potential changes in gene expression [19]. All post-copulatory samples were collected during the pre-fertile, fertile, or post-fertile phases, and compared to non-copulatory samples taken from the same females during those same three phases when copulation had not been observed that day and there was no evidence of a sperm plug (non-copulatory RNA samples: n_non-cop = 30, pH measurements: n_non-cop = 47; Fig 2A). To account for male-derived RNA present in the vagina, we performed RNA-seq on two semen samples from each male collected opportunistically following masturbation (N = 8 samples total). From these samples, we identified 1,442 genes highly expressed in semen (average expression of >20 cpm, Table E in S2 Appendix), and removed these genes from all subsequent post-copulatory gene expression analyses.
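The semen-gene filter described above can be sketched in a few lines. This is a minimal illustration, assuming plain counts-per-million computed from raw read counts without library-size normalization factors; the function names and the toy gene labels (GENE_A and so on) are hypothetical and are not taken from the study's pipeline.

```python
def counts_per_million(counts):
    """Convert one sample's raw counts {gene: reads} to counts per million (CPM)."""
    total = sum(counts.values())
    return {gene: reads * 1_000_000 / total for gene, reads in counts.items()}


def semen_enriched_genes(semen_samples, threshold_cpm=20.0):
    """Return genes whose mean CPM across all semen samples exceeds the threshold."""
    cpm_per_sample = [counts_per_million(sample) for sample in semen_samples]
    genes = set().union(*(sample.keys() for sample in cpm_per_sample))
    enriched = set()
    for gene in genes:
        # A gene absent from a sample contributes 0 CPM to the average.
        mean_cpm = sum(s.get(gene, 0.0) for s in cpm_per_sample) / len(cpm_per_sample)
        if mean_cpm > threshold_cpm:
            enriched.add(gene)
    return enriched


def drop_genes(sample_counts, genes_to_drop):
    """Remove the flagged genes from a vaginal sample before downstream analysis."""
    return {g: c for g, c in sample_counts.items() if g not in genes_to_drop}


# Toy, invented counts for two semen samples (hypothetical gene labels).
semen_1 = {"GENE_A": 900_000, "GENE_B": 10, "GENE_C": 99_990}
semen_2 = {"GENE_A": 500_000, "GENE_B": 20, "GENE_C": 499_980}
to_remove = semen_enriched_genes([semen_1, semen_2])
filtered = drop_genes({"GENE_A": 5, "GENE_B": 7}, to_remove)
```

Averaging CPM across samples, rather than requiring every sample to pass the cutoff, matches the "average expression of >20 cpm" wording; a stricter per-sample rule would be a different design choice.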

Fig 2. Differential gene expression and pH in response to copulation.

(A) We analyzed 30 RNA-seq samples and 47 pH measurements taken when there was no evidence of recent copulation, and 25 RNA-seq samples and 15 pH measurements taken 4 hours following copulation; (B) Genes within two immune-related GO pathways upregulated in post-copulatory samples. Gene set nodes are sized based on the number of genes within them, and gene nodes are colored by their log fold change (logFC) in expression; (C) Normalized counts of TLR2 in non-copulatory vs. post-copulatory samples; (D) Predicted expression of immune system related genes that are differentially expressed in post-copulatory vs. non-copulatory samples. Columns represent samples (left = non-copulatory, right = post-copulatory), rows represent genes, and cell color represents the predicted increase (blue) vs. decrease (purple) in expression, scaled across each row; (E) Vaginal pH was significantly lower in post-copulatory compared to non-copulatory samples. Colored points and error bars represent model predictions ± one standard error and black points represent raw data; (F) Females show non-uniform patterns in the direction and magnitude of pH change between non-copulatory and post-copulatory samples. The data underlying this figure are provided in S2 Data. Artwork by LMN.

https://doi.org/10.1371/journal.pbio.3003699.g002

We performed differential expression analyses using robust estimation followed by adaptive shrinkage, with post-copulatory status (yes or no) as the predictor variable, dyad ID as a blocking factor, and RIN, cycle phase, and male presence as covariates. We identified 941 DE genes in post-copulatory versus non-copulatory samples (715 at LFSR < 0.05, 383 at LFSR < 0.01; Table F in S2 Appendix). DE genes were enriched for two ontology pathways, both of which are involved in immune system processes (Fig 2B and Table G in S2 Appendix). These enriched pathways include well-described genes that regulate chemokine signaling, such as TLR2 (Fig 2C), and are generally predicted to have higher expression in post-copulatory versus non-copulatory contexts (Fig 2D).

To assess alterations in vaginal pH following copulation, we used robust linear mixed models including dyad ID as a random effect and cycle phase and average temperature as covariates. Although we did not find an association between cycle phase and pH in our dataset, we included cycle phase as a covariate due to previous work observing lower vaginal pH around the time of ovulation in baboons and humans [47,48]. We found that post-copulatory pH measurements were significantly lower than non-copulatory measurements (estimate = −0.39, SE = 0.12, p = 0.001; Table H in S2 Appendix and Fig 2E), with substantial individual variation in the magnitude of pH change following copulation (Fig 2F). Although variance tests can be sensitive to small sample sizes, we nonetheless detected significantly greater variance in pH among post-copulatory samples compared to non-copulatory ones (Breusch-Pagan test: p = 0.05), which aligns with our initial hypothesis and supports the use of an interaction model to test for the role of male genetic diversity and complementarity in moderating post-copulatory vaginal pH.

Male genetic diversity and complementarity modulate post-copulatory vaginal gene expression and pH

To explore whether post-copulatory vaginal gene expression and pH are modulated by male genetic makeup, we estimated both genome-wide and MHC diversity and complementarity. We used double digest restriction-site associated DNA sequencing (ddRAD-seq) to estimate standardized multi-locus heterozygosity (stMLH) and kinship to approximate genome-wide diversity and complementarity, respectively. We used amplicon sequencing of the antigen-binding cleft of four MHC loci (2 class I: A and B, and 2 class II: DQA and DRB) to calculate MHC diversity as the number of class I and class II alleles and complementarity as the proportion of shared alleles between dyads. We paired males and females based on their relative genetic compatibility to produce mating dyads with kinships ranging from −0.18 to 0.24 and MHC complementarity ranging from 10% to 40% (class I loci) and 0% to 40% (class II loci). We also characterized biologically relevant MHC “supertypes” based on amino acid polarity at positively selected sites within the antigen-binding cleft (detailed methods in [49]) to determine supertype-based diversity and complementarity. In total, we tested five measures of male diversity and five measures of complementarity between each mating dyad, summarized in Table I in S2 Appendix.

We performed 10 separate differential gene expression analyses, one for each measure of male genetic diversity or complementarity, applying robust estimation and adaptive shrinkage. Each model was constructed with dyad ID as a blocking factor, RIN, cycle phase, and male presence as covariates, and an interactive effect between post-copulatory status (yes or no) and male genotype as the predictor variable. Measures of male MHC diversity and complementarity were associated with an excess of low p-values relative to the null expectation, suggesting that male MHC diversity and complementarity broadly influence gene expression (Fig 3A and 3B). Complementarity as measured using alleles was associated with stronger deviations from the null expectation compared to supertypes (Fig 3B). Genome-wide diversity (stMLH), in contrast, was not as strongly associated with gene expression changes (Fig 3A). Post-copulatory expression of 456 genes was associated with male MHC allele or supertype diversity, meeting both an LFSR < 0.1 and a family-wise error rate (FWER)-adjusted p < 0.05 criterion (Table J in S2 Appendix). Although the different MHC metrics did not share any of the same significant genes, GSEA revealed that male MHC I allelic diversity and MHC II supertype diversity were both positively associated with the expression of genes involved in RNA polymerase activity and intercellular signaling pathways (Table L in S2 Appendix). Likewise, we identified 590 genes whose post-copulatory expression was associated with either MHC or genome-wide complementarity at an LFSR < 0.1 and FWER-adjusted p < 0.05, representing 25 and 17 GSEA pathways, respectively (Tables K and L in S2 Appendix). 
Once again, each metric was associated with a unique set of significant genes; however, GSEA revealed that MHC I and MHC II allelic complementarity were both positively associated with pathways involved in immune response and cellular signaling (Fig 3C and Figs C and D in S1 Appendix and Table L in S2 Appendix). For example, MAP3K2, which functions in the MAP kinase signaling pathway and has been implicated in the activation of NF-κB and downstream cytokine production, is expressed more in females who mated with males with whom they share a greater number of MHC I alleles (i.e., low complementarity) in comparison to females who mated with males with whom they share fewer MHC I alleles (i.e., high complementarity; Fig 3D). All genotype-dependent differential expression and GSEA results are summarized in Table M in S2 Appendix.

Fig 3. Post-copulatory vaginal gene expression in relation to male diversity and complementarity in five mating dyads.

(A) and (B) Quantile–quantile (Q–Q) plots comparing observed to expected p-values for gene expression association with measures of male diversity (A) and complementarity (B). Low p-values are highly enriched in our observed data compared to the null expectation (black line on x = y) when assessing the effect of male MHC diversity (but not genome-wide diversity) and MHC I and II allelic complementarity; (C) Enrichment distributions showing the ranked distribution of genes in the top 8 overrepresented gene sets that are upregulated in expression with low MHC I allelic complementarity; (D) Expression of MAP3K2, a gene which plays a key role in phagocytosis, in a non-copulatory context (left panel) and in a post-copulatory context subset by the degree of MHC I allelic complementarity (high complementarity: sharing < 30% of alleles, low complementarity: sharing > 30% of alleles). The data underlying this figure are provided in S3 Data.

https://doi.org/10.1371/journal.pbio.3003699.g003

To evaluate the robustness of our findings, we conducted a leave-one-out sensitivity analysis in which we iteratively excluded a single mating dyad from our dataset and re-ran the differential gene expression analysis. For each iteration, we tested the interaction between post-copulatory status and MHC I or MHC II allelic complementarity, the two genotype features that yielded the strongest initial associations. Across iterations, we consistently observed a substantial number of DE genes associated with MHC I allelic complementarity (range: 203–1,191, mean = 837.2, SD = 343.1, LFSR < 0.1), indicating that this association is not driven by any single dyad. Differential expression linked to MHC class II allelic complementarity was more variable, yet still consistently present across dyads (range: 86–1,006, mean = 456.3, SD = 317.3, LFSR < 0.1), indicating that our MHC class II results may be more sensitive to the individual dyads included in the analysis. Future studies with larger sample sizes will be necessary to validate these results.

To assess how male genetic diversity and complementarity modulate post-copulatory vaginal pH, we again fit 10 separate models, one for each measure of male genetic diversity or complementarity. We included dyad ID as a random effect, cycle phase and average temperature as covariates, and an interaction between post-copulatory status and male genotype as the predictor variable. We found a significant interaction for three measures of genetic complementarity: kinship, class II allelic complementarity, and class II supertype complementarity (Table N in S2 Appendix). For all three metrics, the largest drops in post-copulatory vaginal pH are observed among females mating with genetically similar males (i.e., low complementarity), and the smallest drops (or potential increases) in vaginal pH are observed among females mating with genetically dissimilar males (i.e., high complementarity; Fig 4). Model fit was evaluated with Akaike’s Information Criterion (AIC). For all three significant models, inclusion of the interaction term significantly improved model fit compared to simplified models that did not include male genotype (dAIC > 2). To assess whether our model estimates were driven by particular mating dyads, we refit models testing for the effect of kinship, MHC II allelic complementarity, and MHC II supertype complementarity using the leave-one-out method (see Materials and methods). Our results were generally recapitulated across iterations (Fig E in S1 Appendix). Although smaller sample sizes generate larger standard errors, all model estimates trended in the same direction as the model which included all mating dyads.
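The leave-one-out logic can be illustrated with a short sketch. It is a simplification under stated assumptions: a plain difference in mean pH (post-copulatory minus non-copulatory) stands in for the robust mixed-model interaction estimate used in the study, and the record fields, dyad labels, and pH values are invented for illustration only.

```python
from statistics import mean


def post_minus_non_cop(records):
    """Stand-in estimator: mean post-copulatory pH minus mean non-copulatory pH."""
    post = [r["ph"] for r in records if r["post_cop"]]
    non = [r["ph"] for r in records if not r["post_cop"]]
    return mean(post) - mean(non)


def leave_one_dyad_out(records, estimator):
    """Re-estimate after dropping each dyad in turn; returns {dropped_dyad: estimate}."""
    dyads = sorted({r["dyad"] for r in records})
    return {
        dropped: estimator([r for r in records if r["dyad"] != dropped])
        for dropped in dyads
    }


# Invented toy measurements, not study data.
toy = [
    {"dyad": "d1", "post_cop": False, "ph": 5.0},
    {"dyad": "d1", "post_cop": True, "ph": 4.5},
    {"dyad": "d2", "post_cop": False, "ph": 5.2},
    {"dyad": "d2", "post_cop": True, "ph": 5.0},
    {"dyad": "d3", "post_cop": False, "ph": 4.8},
    {"dyad": "d3", "post_cop": True, "ph": 4.0},
]
estimates = leave_one_dyad_out(toy, post_minus_non_cop)
same_direction = all(value < 0 for value in estimates.values())
```

Passing the estimator as a function keeps the resampling loop independent of the model, so the same loop could wrap any refit that returns a single coefficient.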

Fig 4. Post-copulatory vaginal pH in relation to male genetic complementarity.

Model predictions illustrating the interaction between genetic complementarity and post-copulatory status in predicting post-copulatory vaginal pH, with the lowest post-copulatory pH observed among females mating with males with high degrees of kinship (A) and low degrees of MHC class II allelic (B) and supertype (C) complementarity. Filled points and error bars represent model predictions ± one standard error and open points represent raw data. The data underlying this figure are provided in S4 Data.

https://doi.org/10.1371/journal.pbio.3003699.g004

Discussion

Together, our findings suggest that aspects of female reproductive physiology can respond differentially to male inseminations, providing preliminary support for genetically-based sperm discrimination, a potential mechanism by which post-copulatory mate choice may occur. Vaginal immune responses can protect females from infection, but these processes may need to be carefully regulated mid-cycle to accommodate exposure to paternal-derived molecules [50–52]. Our dataset supports this hypothesis, revealing a mid-cycle suppression of immune-related genes. In contrast, these same immune pathways show heightened expression post-copulation, with the magnitude of this response linked to male genetic characteristics. The striking convergence on similar pathways influenced by genotype at MHC class I and class II loci presents a particularly compelling case that vaginal responses may contribute to CFC, especially given the absence of a strong correlation between genetic diversity at these loci in this population [49]. The female immune system poses a potential detriment to sperm survival through processes enriched following mating with genetically similar males [53]; however, a strong immune response may also prime the female reproductive tract for implantation [54–56], and future studies will be needed to distinguish between vaginal immune responses promoting and antagonizing successful conception [57–59]. Lastly, we find that post-copulatory vaginal pH is strongly associated with male genetic complementarity, with the largest drops (detrimental to sperm survival) occurring after mating with genetically similar males, suggesting that vaginal pH dynamics may also serve as a mechanism of CFC alongside changes in vaginal immune response.

This study furthers our understanding of how the mammalian vaginal environment, experienced as a first point of contact between male gametes and the female reproductive tract, may mechanistically contribute to sperm success and potential offspring genotypes. While these findings are based on a limited dataset and should be interpreted with corresponding caution, they provide intriguing support for a potential mechanism of non-directional sexual selection driven by genetic complementarity (i.e., non-additive mate choice). In this context, genotype-by-genotype interactions would drive CFC, dampening consistent directional shifts in allele frequencies over time across the population. Future work with larger sample sizes will be needed to confirm these patterns, as well as to explore male-driven sexually antagonistic strategies that circumvent female-mediated processes. Because non-human primates are our close evolutionary relatives, we are excited by the potential of future research on them to clarify the molecular underpinnings and evolutionary origins of variation in conception probability in humans as well as other mammals.

Materials and methods

Study subjects and experimental design

We worked with a population of captive olive baboons housed at le Centre National de La Recherche Scientifique Station de Primatologie (CNRS SdP), in Rousset, France. Study subjects consisted of 13 individuals, 4 intact males and 9 parous females. We created 4 small study groups composed of 1 male and either 2 or 3 females (3 groups contained 2 females, 1 group contained 3 females). Females were not on any form of contraception. Prior to the start of this study, each group of 2–3 females was housed with a vasectomized male and none of the females were pregnant. To create our study groups, CNRS SdP staff relocated the resident vasectomized male and allowed each group of females to live without a male for one month (the length of one ovarian cycle), during which time females underwent positive reinforcement clicker training to present their hindquarters for vaginal swabbing and pH measurement. After one month, the resident ethologist managed the introduction of an intact male by introducing males to females first from an adjacent enclosure, allowing visual and olfactory interaction for 2–3 days prior to physical introduction. After the intact male was physically introduced, we collected data on each group for two months (the duration of two ovarian cycles). During the course of the study, two females became pregnant, and all data collected from these females following the ovulation window in which they conceived was discarded. All manipulations and treatments received ethical approval from the Ministry of Higher Education, Research and Innovation in France (APAFIS#15021-2018051115066627) and the NYU University Animal Welfare Committee (18-1504), and were compliant with the European Science Foundation animal-handling guidelines to minimize pain and distress.

Genome-wide and MHC genotyping

We used genotyping data generated as part of a previous study assessing the concordance between genome-wide and MHC diversity and complementarity in olive baboons living at CNRS SdP [49]. Library preparation, sequencing, and bioinformatic methods are described in detail in [49] and in brief below.

We extracted DNA from whole blood using the Qiagen QIAamp DNA mini kit (N = 4; n_female = 3, n_male = 1) or the GEN-IAL First-DNA All tissue kit (N = 9; n_female = 6, n_male = 3) following manufacturer’s instructions. To assess genome-wide diversity and complementarity between dyads, we performed double digest restriction-site associated DNA sequencing (ddRAD-seq). We prepared ddRAD-seq libraries following [60]. We digested 1 µg of DNA using restriction enzymes (SphI and MluCI), and size selected for 185 (±19) bp fragments using the Blue Pippin System. We ligated Illumina platform adapters, indexed samples using NEBNext Multiplex Oligos for Illumina sequencing, and sequenced on the Illumina HiSeq 2500 platform using one lane and 150 bp PE reads. We excluded low-quality reads and reads not containing both enzyme cut sites, mapped reads to the olive baboon reference genome (Panu v3) using the bwa mem aligner with default parameters [61], and performed shared SNP calling using the STACKS v2 reference pipeline [62]. We required that a locus be sequenced in at least 80% of individuals to be included in the final SNP set, and excluded SNPs in strong linkage disequilibrium (r² > 0.5) using PLINK [63]. Our final SNP set consisted of 35,509 SNPs to be used in the calculation of stMLH [64] for each individual, and for genome-wide complementarity between each dyad (i.e., kinship). We calculated stMLH by dividing the proportion of genotyped loci at which an individual was heterozygous by the population mean heterozygosity at all genotyped loci, using the “Rhh” package in R [65]. We calculated kinship between each dyad using the relationship inference algorithm in the software package KING v2.2.4 [66].
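The stMLH definition above can be made concrete with a small sketch. This is not the Rhh implementation: it is a plain-Python illustration that assumes genotypes are coded 1 (heterozygous), 0 (homozygous), or None (missing), and that the population mean heterozygosity in the denominator is averaged over the loci genotyped in the focal individual. The function name and toy individuals are hypothetical.

```python
def standardized_mlh(genotypes):
    """Compute stMLH per individual.

    genotypes: {individual: [1 if heterozygous, 0 if homozygous, None if missing]}
    with one entry per locus, in the same locus order for every individual.
    """
    individuals = list(genotypes)
    n_loci = len(genotypes[individuals[0]])

    # Heterozygosity of each locus across individuals genotyped at that locus.
    locus_het = []
    for j in range(n_loci):
        calls = [genotypes[i][j] for i in individuals if genotypes[i][j] is not None]
        locus_het.append(sum(calls) / len(calls) if calls else None)

    st_mlh = {}
    for i in individuals:
        # Assumes every individual is genotyped at one or more loci.
        typed = [j for j in range(n_loci) if genotypes[i][j] is not None]
        prop_het = sum(genotypes[i][j] for j in typed) / len(typed)
        mean_het = sum(locus_het[j] for j in typed) / len(typed)
        st_mlh[i] = prop_het / mean_het
    return st_mlh


# Invented toy genotypes at four loci for three hypothetical individuals.
toy_genotypes = {
    "ind_A": [1, 1, 0, 0],
    "ind_B": [1, 0, 0, 0],
    "ind_C": [0, 0, 0, 1],
}
toy_stmlh = standardized_mlh(toy_genotypes)
```

Values above 1 indicate an individual that is more heterozygous than the population average at the loci where it was genotyped, and values below 1 indicate the reverse.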

To assess MHC diversity and complementarity between dyads, we performed PCR amplification of the functionally important antigen-binding regions of two class I MHC loci (A and B) and two class II MHC loci (DQA and DRB). We chose to assess both class I and class II loci because they encode molecules that are present on different cell types and perform unique functions: class I molecules are present on the surface of nearly all nucleated cells and bind to intracellular pathogens such as viruses, and class II molecules are found on the surface of antigen-presenting cells and bind to extracellular pathogens such as bacteria [67]. Moreover, previous results suggest that in this population, genetic diversity at MHC class I loci is not strongly associated with diversity at class II loci, meaning that cryptic choice mechanisms may favor diversity and/or complementarity at one locus and not the other [49]. We targeted a 195 bp segment within the α1 domain of the class I receptor types (MHC-A and -B), a 188 bp segment within the α1 domain of the DQA receptor, and a 252 bp segment within the β1 domain of the DRB receptor. These sequences make up part of the antigen-binding cleft of each receptor type, and amino acid variation within these regions can result in variable pathogen recognition and binding [68]. We amplified the desired sequences using the MilliporeSigma FastStart High Fidelity PCR System and primers described in Table O in S2 Appendix. Following amplification, we selected the amplicon of the appropriate length using gel electrophoresis and band excision, performed an indexing PCR using Hot Start Pfu DNA Polymerase, and sequenced on the Illumina MiSeq platform with v2 chemistry and 200 bp PE reads. Following sequencing, we trimmed and mapped sequences to MHC-A, -B, -DQA, and -DRB sequences taken from the IPD-MHC database and for each individual retained unique MHC sequences that had over 1,000 reads and were also present at >5% copy number in another individual. 
We calculated MHC diversity and complementarity for class I and class II loci separately. We calculated an individual’s MHC allelic diversity as the number of unique MHC alleles and calculated MHC allelic complementarity as the number of MHC alleles shared between two individuals divided by the total number of unique MHC alleles possessed by the two individuals in total.
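As an illustration only (this is not the authors' code), the two allele-level measures defined above can be sketched in Python. The function names and the allele labels are hypothetical placeholders. Note that, as defined in the text, the complementarity value is the proportion of alleles shared, so a larger value means more sharing between the two individuals.

```python
def allelic_diversity(alleles):
    """Number of unique MHC alleles carried by one individual."""
    return len(set(alleles))


def allelic_complementarity(alleles_a, alleles_b):
    """Alleles shared by two individuals divided by all unique alleles they carry."""
    a, b = set(alleles_a), set(alleles_b)
    union = a | b
    if not union:
        raise ValueError("at least one allele is required")
    return len(a & b) / len(union)


# Hypothetical allele labels, for illustration only.
female = ["A*01", "A*02", "B*07"]
male = ["A*02", "B*07", "B*15", "B*22"]

male_diversity = allelic_diversity(male)
dyad_complementarity = allelic_complementarity(female, male)
print(male_diversity, dyad_complementarity)
```

The same two functions apply unchanged at the supertype level if each allele is first replaced by its supertype label.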

Identification of MHC supertypes

To support the potential biological relevance of our measures of MHC diversity and complementarity, we additionally identified MHC supertypes based on the physicochemical properties of the amino acids involved in antigen binding and calculated MHC diversity and complementarity for each dyad at the supertype level. To do so, we followed methods from [69], which are described briefly below and in detail for this specific dataset in [49]. First, we identified positively selected sites (PSS) within the antigen-binding region of each MHC locus by comparing rates of synonymous (dS) to non-synonymous (dN) nucleotide substitutions in protein-coding regions using methods described by [70]. To do so, we determined sequence reading frames by performing an alignment to published sequences in the IPD-MHC database using NCBI’s basic local alignment search tool (BLAST). We translated aligned sequences in R using the package ‘seqinr’ [71] and performed multiple protein sequence alignment in MAFFT v.7 [72]. We converted protein alignments into codon alignments using PAL2NAL v.14 [73], and constructed a phylogenetic tree of the alignments using randomized axelerated maximum likelihood (RAxML) [74] and a generalized time reversible (GTR) GAMMA substitution model, with the best-scoring tree selected using 100 bootstrap iterations. We then computed substitution rate ratios (dN/dS) by inputting the PAL2NAL codon alignment and RAxML tree into the CODEML program within the Phylogenetic Analysis by Maximum Likelihood (PAML) package [75]. This software identifies statistically significant PSS using the Bayes Empirical Bayes (BEB) analysis computed under NSsite model 8 [76]. Next, we aligned the amino acids associated with each PSS and described the physicochemical properties of each site in the form of five z-descriptors: z1 (hydrophobicity), z2 (steric bulk), z3 (polarity), z4, and z5 (electronic effects) [77].
We compiled a mathematical matrix containing the five z-scores of each PSS of each allele and performed an agglomerative hierarchical clustering analysis using Euclidean distance and the average linkage method with the R function ‘hclust’ in the ‘stats’ package [78]. We used the R package ‘dynamicTreeCut’ [79] to identify significant clusters, while specifying a minimum cluster size of 2 [80]. These methods for determining MHC supertypes have been shown to identify biologically relevant variation in MHC allele functionality in both human and non-human primate studies [69,81–84]. We calculated an individual’s MHC supertype diversity as the number of unique MHC supertypes and calculated MHC supertype complementarity as the number of MHC supertypes shared between two individuals divided by the total number of unique supertypes possessed by the two individuals in total. Using our genome-wide metrics, as well as our allele-based and supertype-based MHC descriptors, we calculated in total 5 metrics of diversity and 5 metrics of genetic complementarity for each individual, summarized in Table I in S2 Appendix.
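The clustering step described above can be illustrated with a minimal pure-Python sketch of agglomerative clustering with Euclidean distance and average linkage. It is a simplified stand-in, not the analysis itself: it stops at a fixed number of clusters (the hypothetical parameter `n_clusters`) rather than using the dynamic tree cut, and the allele names and z-descriptor values are invented for illustration.

```python
import math
from itertools import combinations


def euclidean(u, v):
    """Euclidean distance between two equal-length numeric vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def average_linkage(points, n_clusters):
    """Merge the closest pair of clusters until n_clusters remain.

    points: dict mapping an allele name to its vector of z-descriptors.
    Returns a list of clusters, each a sorted list of allele names.
    """
    clusters = [[name] for name in points]
    while len(clusters) > n_clusters:
        best = None
        for i, j in combinations(range(len(clusters)), 2):
            # Average linkage: mean distance over all cross-cluster pairs.
            total = sum(euclidean(points[a], points[b])
                        for a in clusters[i] for b in clusters[j])
            dist = total / (len(clusters[i]) * len(clusters[j]))
            if best is None or dist < best[0]:
                best = (dist, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # safe because i < j
    return [sorted(c) for c in clusters]


# Invented z-descriptor vectors (z1..z5 for a single site), illustration only.
z_matrix = {
    "allele_1": [1.0, 0.5, 0.0, 0.0, 0.2],
    "allele_2": [1.1, 0.4, 0.1, 0.0, 0.2],
    "allele_3": [-2.0, 1.5, 0.8, 0.3, -0.5],
    "allele_4": [-2.1, 1.6, 0.7, 0.2, -0.4],
}
supertypes = average_linkage(z_matrix, n_clusters=2)
print(supertypes)
```

With several positively selected sites, each allele's vector would simply be the concatenation of the five z-descriptors at every site.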

Vaginal RNA sample collection

We collected vaginal RNA samples (N = 307 samples, 34.1 ± 2 samples per female) every other day throughout sexual skin tumescence and detumescence, and every 3 days throughout the rest of the cycle. To collect RNA samples, we inserted a sterile cotton swab into the vaginal opening (~2 to 3 inches) and rotated for 10 seconds. Once removed, we immediately placed the swab into a 1.5 mL DNA lo-bind collection tube containing 500 µl of Qiagen RNA Protect cell reagent and placed it into a cooler for transport back to the lab within one hour. When taking a post-copulatory sample, we removed any visible sperm plug from the vaginal opening using autoclaved forceps before inserting the cotton swab. We performed a piggyback centrifugation to transfer the vaginal cells in solution from the sample collection tube (containing the swab) into a new cryotube and froze at −80°C.

Vaginal pH measurement

We measured the vaginal pH of each female daily (N = 359 measurements, 39.9 ± 6.4 measurements per female), using an ISFET probe and portable SI400 pH meter (Sentron). Prior to each sampling, we calibrated the probe using pH 4 and pH 7 buffers. The linear relationship between the raw voltage reading and the pH values of the known calibration solutions always fell between 95% and 105%, indicating proper function of the probe. We collected three sequential pH measures to determine an average pH reading for each female each day. In the case that a female was not cooperative in taking three separate readings, we instead took only two (n = 7) or one (n = 5). To do so, we inserted the probe into the vaginal opening (~2 to 3 inches) and waited for the reading to stabilize (~6 to 10 s) before recording the value. The ISFET probe simultaneously measures temperature and performs an automatic temperature compensation correction to account for differences in temperature between readings. We refrained from taking pH readings within 30 min following urination, and took post-copulatory measurements (npost-cop = 15) immediately following post-copulatory RNA sampling, approximately 4 hours after an observed copulation with ejaculation. Between sequential readings from the same individual, we cleaned the probe using deionized water. Between individuals, we cleaned the probe with 70% ethanol and deionized water. The probe was stored overnight in a pH 7 buffer, as per the manufacturer’s instructions.

Vaginal cytology

We predicted the timing of ovulation using vaginal cytology. We collected vaginal swabs for cytological slides by inserting a sterile cotton swab into the posterior vagina and rotating it for 10 s before removal. We prepared slides by rolling the swab across a glass microscope slide, applying a spray fixative (CytoRAL), and staining slides with a commercially available simplified Harris-Schorr staining kit (Diagnoestrus; RAL Diagnostics). Vaginal epithelial cells undergo characteristic cyclical changes throughout the ovarian cycle, allowing cycle phase to be determined by the proportion of cell types present on each slide (Fig F in S1 Appendix) [85,86]. Approaching ovulation, white blood cells (WBCs) and mucus are present, and the proportion of large, geometric superficial cells gradually increases. Ovulation is detected by a sharp drop in the proportion of red-staining superficial cells, quantified by assessing the stained color of 100 cells and calculating the eosinophilic index (EI) as the number of red cells plus 0.5 times the number of red/blue (polychromatophilic) cells (Fig G in S1 Appendix) [87]. In addition to this quantitative measure, ovulation is also qualitatively associated with the disappearance of WBCs and mucus. The postovulatory phase is characterized by the return of WBCs and mucus, cellular clumping, and a return of basal and intermediate cell types.
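The eosinophilic index described above is simple arithmetic, and a short Python sketch makes the weighting explicit. The function name and the example cell counts are hypothetical; the only facts taken from the text are that 100 cells are scored and that polychromatophilic cells count half.

```python
def eosinophilic_index(red_cells, polychromatophilic_cells, cells_scored=100):
    """EI = red cells + 0.5 * red/blue (polychromatophilic) cells."""
    if red_cells < 0 or polychromatophilic_cells < 0:
        raise ValueError("cell counts must be non-negative")
    if red_cells + polychromatophilic_cells > cells_scored:
        raise ValueError("more colored cells than cells scored")
    return red_cells + 0.5 * polychromatophilic_cells


# Hypothetical slide: 40 red and 20 red/blue cells among 100 scored.
ei = eosinophilic_index(40, 20)
print(ei)
```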

From these patterns, we identified a 2-day ovulation window as the day of ovulation and the previous day. We then defined a 5-day fertile phase as the two days prior to and one day following the 2-day ovulation window [40]. We classified the 5 days preceding the fertile phase as the pre-fertile phase and the 5 days following the fertile phase as the post-fertile phase. This method for pre-fertile, fertile, and post-fertile phase classification is well established in the primatological literature, and has been used in numerous studies with respect to non-human primate sexual swellings and behavior [38–40]. The evaluation of cytological slides to determine ovarian cycle phase has been used with great success in this study population [88,89].
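The phase windows defined above can be written out as a small lookup, sketched here in Python. The day indexing is an assumption made for illustration: day 0 is the day of ovulation, so the ovulation window is days -1 and 0 (Fig G instead labels both window days as 0). The function name is hypothetical.

```python
def classify_cycle_day(day_from_ovulation):
    """Assign a cycle phase from the day offset relative to ovulation (day 0).

    Ovulation window: days -1 and 0.
    Fertile phase: two days before the window through one day after it (-3..+1).
    Pre-fertile phase: the 5 days before the fertile phase (-8..-4).
    Post-fertile phase: the 5 days after the fertile phase (+2..+6).
    """
    d = day_from_ovulation
    if -3 <= d <= 1:
        return "fertile"
    if -8 <= d <= -4:
        return "pre-fertile"
    if 2 <= d <= 6:
        return "post-fertile"
    return "non-fertile"


phases = {d: classify_cycle_day(d) for d in range(-10, 9)}
print(phases)
```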

Semen sample collection

To account for male-derived RNA present in post-copulatory vaginal RNA samples, we collected two masturbatory semen samples from each male (N = 8 samples) and conducted RNA sequencing. To do so, we collected coagulated semen left on the enclosure substrate immediately following an observed masturbation. We used autoclaved tweezers to place the sample into a 5mL lo-bind Eppendorf tube and immediately transported it back to the lab. Under a sterile fume hood, we removed the solid portion of the ejaculate, measured the volume of the remaining liquid portion using a pipette, added RNAprotect Cell Reagent in a volume 5 times the liquid sample volume, and froze at −80°C. All samples were frozen within 20 min following the time of ejaculation. We used semen samples collected after masturbation to minimize potential contamination from female-derived RNA. Furthermore, collection of semen from the female vaginal tract post-mating would have required disruption of the sperm plug, which was incompatible with later vaginal RNA sampling 4 hours later. Although the composition of ejaculates produced via masturbation may differ from those produced in a mating context, this approach represented the most feasible and controlled option for characterizing male-derived RNA in ejaculates.

RNA extraction and sequencing

We extracted RNA from vaginal and semen samples using the Qiagen RNeasy Mini kit, according to the manufacturer’s recommended protocol. We incorporated a preliminary PBS wash of the cells, used a QIAshredder for sample homogenization, performed an on-column DNase digestion to improve quality and concentrations, and measured RNA concentration and integrity using an Agilent TapeStation. Across all collected samples, vaginal sample concentrations ranged from 0.07 to 499.5 ng/µl (mean = 15.4 ± 2.7 ng/µl) and semen sample concentrations ranged from 0.005 to 0.5 ng/µl (mean = 0.18 ± 0.03 ng/µl). Library preparation and sequencing were performed at the University of Calgary’s Centre for Health Genomics and Informatics sequencing core. We sequenced 106 vaginal samples and 8 semen samples with RIN values varying from 1.6 to 9.2 (mean = 5.3 ± 1.7 s.d.). We performed strand-specific library preparation using the NEBNext Ultra II RNA kit with rRNA depletion following the manufacturer’s instructions and performed whole transcriptome sequencing on one NovaSeq6000 S2 100 cycle v1.5 run, generating 50 bp PE reads.

Data processing of RNA-seq libraries

RNA sequencing generated an average of 38.1M (±21.4 s.d.) reads per sample. We trimmed and filtered reads for quality using the program Trimmomatic [90], with the following parameters: -phred33 LEADING:3 TRAILING:3 SLIDINGWINDOW:4:15 MINLEN:36. We used the splice-aware alignment tool STAR [91] to align sequences to the olive baboon reference genome (NCBI: GCA_008728515.1). Due to the degraded nature of our samples, we used the following parameters to allow for shorter alignments, as has been done with success in other studies: --outFilterScoreMinOverLread 0.3 --outFilterMatchNminOverLread 0.3 --outFilterMatchNmin 0 [92–94]. Due to issues with paired-end sequence alignment, substantially more reads displayed unique mapping in single-end versus paired-end mapping mode (paired-end mode: 3.8 ± 0.56 million uniquely mapped reads per sample, single-end mode: 9.9 ± 0.52 million uniquely mapped reads per sample). To maximize read counts mapped to genomic features, we used only R1 data for all analyses presented here. Studies have demonstrated false positive and false negative discovery rates of approximately 5% each for DE genes using single-end as opposed to paired-end reads [95]. These small discrepancies can exacerbate differences in identified gene ontology terms, with overlap between single- and paired-end data falling into the range of 40% [95]. To mitigate potential mapping errors due to the large bacterial cell populations present in the vagina, we additionally filtered mapped reads based on their taxonomic classification using Kraken 2 [96]. We built a custom database containing the olive baboon reference genome, as well as all bacterial, archaeal, fungal, protozoal, and viral genomes available through NCBI. We classified sequences using default Kraken 2 parameters, and filtered the STAR mapping results to include only reads that were either confidently classified as olive baboon or not classified as any other type of microorganism.
Due to a high number of reads mapping to bacterial, archaeal, fungal, protozoal, and viral genomes, a mean of 27.8% of reads per sample (9.9M ± 5.6 s.d.) passed our Kraken 2 classification filter and were mapped uniquely to the baboon genome. From the filtered alignment files, we generated read counts of genomic features using the program Rsubread [97] and the Panubis1.0 genome annotation release 104.
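The taxonomic filtering logic can be sketched as follows. This is an illustrative Python reading of the rule described above, not the authors' pipeline. It assumes Kraken 2's default tab-separated per-read output (classification status `C` or `U`, read ID, numeric taxonomy ID, and further columns) and treats NCBI taxonomy ID 9555 as olive baboon; both should be verified against the actual database before use. The function name and the read IDs are hypothetical.

```python
def reads_to_keep(kraken_lines, host_taxids):
    """Return IDs of reads that are unclassified or assigned to a host taxon."""
    keep = set()
    for line in kraken_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:
            continue  # skip malformed lines
        status, read_id, taxid = fields[0], fields[1], fields[2]
        if status == "U" or taxid in host_taxids:
            keep.add(read_id)
    return keep


# Hypothetical per-read output lines, for illustration only.
example_lines = [
    "C\tread1\t9555\t50\t9555:16",   # assigned to the assumed host taxon
    "C\tread2\t562\t50\t562:16",     # assigned to a bacterial taxon
    "U\tread3\t0\t50\t0:16",         # unclassified
]
kept = reads_to_keep(example_lines, host_taxids={"9555"})
print(sorted(kept))
```

The resulting set of read IDs would then be used to subset the STAR alignments before counting reads per feature.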

Modeling vaginal physiology across cycle phases

To examine how vaginal gene expression differs across cycle phases, we performed differential expression analysis using the edgeR package in R [98]. To do so, we first subset samples to include the 8 from each cycle phase (pre-fertile, fertile, post-fertile, and non-fertile) that had the highest number of uniquely mapped reads (N = 32 samples total). Eight of the 9 study females are represented in the final set of 32 samples, with a mean of 6.25 females included within each cycle phase. By analyzing only a subset of all sequenced samples, we ensured an equal sample number across cycle phases and increased the mean number of uniquely mapped reads across samples from 9.9M to 13.7M. We filtered the list of genes included in the analysis by removing ribosomal protein genes, genes without human orthologs, and genes in which more than half of the samples had less than 10 counts per million, resulting in a mean library size of 4.9M read counts across 3,154 analyzable genes. We normalized library sizes based on the filtered gene list using the edgeR function ‘calcNormFactors’ and transformed count data for linear modeling using the ‘voomWithQualityWeights’ function in the package ‘limma’ [42]. To model our data, we controlled for female ID using the limma function ‘duplicateCorrelation’ with female ID as a blocking variable and included sample RIN and male presence (whether or not the sample was taken during the month prior to male introduction or after the male had been introduced) as covariates. We fit a linear model for each gene using the function ‘lmFit’ and stabilized the variance estimates across genes by applying a robust empirical Bayes moderation to the standard errors of the fitted coefficients using the function ‘eBayes’ and argument ‘robust=TRUE’. 
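The expression filter described above (dropping genes for which more than half of the samples fall below 10 counts per million) can be sketched in Python. This is an illustration under simplifying assumptions: counts per million are computed from raw column sums, the removal of ribosomal protein genes and genes without human orthologs is not shown, and the gene names and counts are invented.

```python
def filter_low_expression(counts, min_cpm=10.0):
    """Keep genes unless more than half of the samples are below min_cpm.

    counts: dict mapping gene name to a list of raw counts, one per sample.
    """
    genes = list(counts)
    n_samples = len(counts[genes[0]])
    # Library size of each sample = sum of counts over all genes.
    lib_sizes = [sum(counts[g][s] for g in genes) for s in range(n_samples)]
    kept = []
    for g in genes:
        n_low = sum(1 for s in range(n_samples)
                    if counts[g][s] * 1e6 / lib_sizes[s] < min_cpm)
        if n_low <= n_samples / 2:
            kept.append(g)
    return kept


# Invented counts for four samples; every library sums to 1,000,000 reads.
raw_counts = {
    "geneA": [600000, 600000, 600000, 600000],
    "geneB": [399978, 399978, 399996, 399978],
    "geneC": [20, 20, 2, 2],   # low in exactly half of the samples: kept
    "geneD": [2, 2, 2, 20],    # low in three of four samples: removed
}
kept_genes = filter_low_expression(raw_counts)
print(kept_genes)
```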
We then applied empirical Bayes adaptive shrinkage using the ‘ash’ function from the ‘ashr’ package in R, which borrows information across genes to shrink effect size and uncertainty estimates towards zero, generate more robust posterior estimates, and calculate local false sign rate (LFSR) [43]. LFSR is a measure which integrates both effect size and certainty to generate a posterior probability that the sign (positive or negative) assigned to an estimated effect is incorrect. This approach is particularly advantageous in small-sample settings when variance estimates are unstable because it downweights imprecise measurements and provides a more reliable alternative to standard FDR corrections [99]. We identified DE genes as those falling below a 10% LFSR, a cut-off which is standard in the field of genomics [100–102], and also report the number of DE genes at more stringent 5% and 1% cutoffs. We performed GSEA using the ‘fgsea’ function in the R package ‘clusterProfiler’ [103] and the Papio anubis Ensembl genome annotations available through biomaRt [104], using a p-value cutoff of 0.05.
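To make the LFSR thresholds concrete, the sketch below computes a local false sign rate from the posterior probabilities that an effect is negative, exactly zero, or positive, taking the smaller of the two "wrong sign or zero" probabilities. This follows the standard definition of the measure as we understand it; it does not reproduce the shrinkage model itself, and the function name and probabilities are hypothetical.

```python
def local_false_sign_rate(p_negative, p_zero, p_positive):
    """LFSR = min(P(effect >= 0), P(effect <= 0)) given posterior probabilities."""
    total = p_negative + p_zero + p_positive
    if abs(total - 1.0) > 1e-9:
        raise ValueError("posterior probabilities must sum to 1")
    return min(p_zero + p_positive, p_zero + p_negative)


# Hypothetical posterior for one gene: very likely a positive effect.
lfsr = local_false_sign_rate(p_negative=0.02, p_zero=0.05, p_positive=0.93)
is_de_at_10_percent = lfsr < 0.10
is_de_at_5_percent = lfsr < 0.05
print(lfsr, is_de_at_10_percent, is_de_at_5_percent)
```

In this invented case the gene would count as DE at the 10% cutoff but not at the 5% cutoff.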

To examine how vaginal pH changes across cycle phases, we conducted robust linear mixed modeling using the R package ‘robustlmm’ [44]. We included cycle phase as a categorical predictor variable with the non-fertile phase as the reference category, vaginal pH (averaged across the 3 measurements for that day) as the response variable, vaginal temperature (averaged across the 3 measurements for that day) as a covariate, and female ID as a random effect. For this analysis, we included only pH measurements in which there was no observed mating or obvious signs of previous mating (i.e., sperm plug present) that day (N = 275 measurements, 30.6 ± 5.41 s.d. per female, 68.8 ± 21.8 s.d. per cycle phase). Visual inspection of quantile–quantile and residual variance plots confirmed homoscedastic residual variance structure and variance inflation factor (VIF) <2 confirmed no issues of collinearity.

Modeling vaginal physiology in response to mating

To examine how vaginal gene expression changes in response to mating, we again performed differential expression analysis using the edgeR package in R. We analyzed 25 post-copulatory and 30 non-copulatory RNA samples, all of which were taken from the pre-fertile, fertile, or post-fertile phases and had greater than 5M uniquely mapped reads. We were able to collect post-copulatory samples from six of the nine females, and thus limited our non-copulatory samples to those six females as well. This resulted in a mean of 4.3 post-copulatory samples per female and 5 non-copulatory samples per female (Table A in S2 Appendix), with 12.3M (± 7 s.d.) uniquely mapped reads for post-copulatory samples and 8.5M (± 4 s.d.) for non-copulatory samples. To account for male-derived RNA present in the vagina, we first removed genes found to be highly expressed in semen samples. From these samples, we identified 1,442 genes highly expressed in semen (average expression of >20 cpm, Table E in S2 Appendix), and removed these genes from all subsequent post-copulatory gene expression analyses. We then filtered the remaining genes by removing ribosomal protein genes, genes without human orthologs, and genes in which more than half of the samples had less than 10 counts per million, resulting in a mean library size of 1.9M read counts across 2,716 analyzable genes. We normalized library sizes based on the filtered gene list, controlled for dyad ID using a blocking variable, and fit a linear model applying a robust empirical Bayes moderation, including sample RIN, cycle phase, and male presence as covariates. The limma function ‘duplicateCorrelation’ supports the inclusion of only a single blocking factor, and thus we included dyad ID as the blocking variable as this uniquely identifies each male-female pair, capturing the repeated measures associated with both individual IDs. We identified DE genes as those falling below a 10% LFSR and performed GSEA as described above.

To examine how vaginal pH differs following mating, we conducted robust linear mixed effects modeling. We used a binary predictor variable (yes or no) indicating whether the pH measurement for that day was a post-copulatory or non-copulatory measurement (N = 62; npost-cop = 15, nnon-cop = 47). Post-copulatory pH measurements were obtained from five out of nine females during their pre-fertile or fertile phase, thus we limited non-copulatory measurements to those same females and phases (Table A in S2 Appendix). We used pH as the response variable, phase and temperature as covariates, and dyad ID as a random effect. Visual inspection of a quantile-quantile plot revealed greater residual variance in post-copulatory versus non-copulatory samples, which we statistically confirmed with a Breusch-Pagan test using the ‘bptest’ function in the ‘lmtest’ package in R [105]. We used VIF to confirm no issues of collinearity (VIF < 2).

Modeling vaginal physiology in relation to genetic diversity and complementarity

To test how post-copulatory gene expression is related to male genetic diversity and complementarity, we used the same subset of RNA-seq samples described above for our post-copulatory analyses. We fit 10 separate models, each testing for the interactive effect between post-copulatory status (yes or no) and one genotype metric (listed in Table I in S2 Appendix). We controlled for dyad ID using a blocking variable and ran a linear model with robust empirical Bayes moderation on the transformed counts including sample RIN, cycle phase, and male presence as covariates. We identified DE genes as those falling below a 10% LFSR and performed GSEA as described above. Because we tested genotype × post-copulatory status effects across 10 separate models, we applied an additional FWER correction using the Holm method to adjust the p-values for each gene across all models [106]. In the Results, we report how many genes with LFSR < 10% also meet a FWER-adjusted p < 0.05 after this correction.
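The Holm step-down adjustment referred to above can be sketched in a few lines of Python. The implementation below is a generic illustration, not the authors' code; in the analysis described here, the input for one gene would be its p-values from the 10 genotype models. The example p-values are invented.

```python
def holm_adjust(pvalues):
    """Holm step-down adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The k-th smallest p-value (k = rank + 1) is multiplied by m - k + 1.
        candidate = min(1.0, (m - rank) * pvalues[idx])
        # Enforce monotonicity so adjusted values never decrease with rank.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


# Invented p-values for illustration.
adjusted = holm_adjust([0.01, 0.04, 0.03])
print(adjusted)
```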

To test whether post-copulatory vaginal pH is related to male genetic diversity and complementarity, we conducted robust linear mixed effects modeling using the same pH measurements as described above for our post-copulatory analyses. We ran 10 separate models, each testing for the interactive effect between post-copulatory status (yes or no) and one genotype metric (listed in Table I in S2 Appendix). We included cycle phase and temperature as covariates and dyad ID as a random effect. We confirmed homoscedastic residual variance structure by visually inspecting quantile–quantile and residual variance plots, and adjusted p-values for multiple hypothesis testing using the Holm FWER correction [106].

To evaluate the robustness of our findings to the influence of individual mating pairs, we conducted a leave-one-out sensitivity analysis. In this approach, we iteratively removed a single mating dyad from our dataset and reran our analyses testing for the effect of male genotype in modulating post-copulatory vaginal gene expression and pH. For each iteration, we recorded the number of DE genes and the interaction effect size estimate and associated confidence intervals. This method allowed us to identify potentially influential observations and quantify the overall stability of our results.
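The leave-one-out procedure can be sketched generically in Python: drop one dyad at a time and refit on the remaining records. The `fit` argument below is a deliberately simple placeholder (a difference in mean pH), standing in for the mixed models and differential expression models described above; the record layout, dyad labels, and pH values are all invented for illustration.

```python
def leave_one_dyad_out(records, fit):
    """Refit after removing each dyad in turn; returns {dyad_removed: estimate}."""
    dyads = sorted({r["dyad"] for r in records})
    return {d: fit([r for r in records if r["dyad"] != d]) for d in dyads}


def mean_ph_shift(records):
    """Placeholder estimator: mean post-copulatory pH minus mean non-copulatory pH."""
    post = [r["ph"] for r in records if r["post_cop"]]
    non = [r["ph"] for r in records if not r["post_cop"]]
    return sum(post) / len(post) - sum(non) / len(non)


# Invented measurements, for illustration only.
data = [
    {"dyad": "A", "post_cop": True, "ph": 6.0},
    {"dyad": "A", "post_cop": False, "ph": 5.0},
    {"dyad": "B", "post_cop": True, "ph": 7.0},
    {"dyad": "B", "post_cop": False, "ph": 5.0},
    {"dyad": "C", "post_cop": True, "ph": 6.5},
    {"dyad": "C", "post_cop": False, "ph": 5.5},
]
loo_estimates = leave_one_dyad_out(data, mean_ph_shift)
print(loo_estimates)
```

An estimate that changes sharply when one particular dyad is removed flags that dyad as potentially influential.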

Supporting information

S1 Appendix.

Fig A. GSEA pathways enriched for differential expression in the fertile phase. The “activated” panel represents pathways enriched for genes with heightened expression in the fertile compared to non-fertile phase and the “suppressed” panel represents pathways enriched for genes with lower expression in the fertile compared to non-fertile phase. Dot size represents the number of genes represented within that pathway and the darker colors represent lower p-values. The data underlying this figure are provided in S5 Data.
Fig B. Vaginal pH did not vary significantly between cycle phases. Error bars represent model predictions ± one standard error and points represent the raw data. The data underlying this figure are provided in S5 Data.
Fig C. GSEA pathways enriched for differential expression post-copulation. Three gene set pathways are enriched for genes with heightened expression in post-copulatory versus non-copulatory contexts. Dot size represents the number of genes represented within that pathway and the darker colors represent lower p-values. The data underlying this figure are provided in S5 Data.
Fig D. GSEA pathways enriched for differential expression in relation to MHC I allelic complementarity. Pathways enriched for genes with heightened expression after mating with males with low complementarity. Dot size represents the number of genes represented within that pathway and the darker colors represent lower p-values. The data underlying this figure are provided in S5 Data.
Fig E. Leave-one-out sensitivity analysis of vaginal pH model estimates. Shown are estimated interaction effects between post-copulatory status and male genotype on vaginal pH across leave-one-out iterations. Points represent the estimated interaction term (post-copulatory status × male genotype), and error bars denote 95% confidence intervals. The genotype included in each model is indicated in the facet title. The data underlying this figure are provided in S5 Data.
Fig F. Vaginal epithelial cells stained using a modified Harris-Schorr technique. Epithelial cell types include basal cells (A), intermediate cells (B), polychromatophilic superficial cells (C), and eosinophilic superficial cells (D). The preovulatory phase (E) exhibits a gradual increase in the proportion of red to blue staining cells, and the post-ovulatory phase (F) is characterized by an increase in cellular clumping, mucus, and WBCs.
Fig G. Composite profile demonstrating fluctuations in EI over the course of the ovarian cycle (N = 22 cycles). Points represent mean EI values on each day in relation to ovulation, and error bars represent the standard error of the mean. The two-day ovulation window is designated by consecutive 0’s on the x axis and is shaded in red. The days leading up to ovulation are designated by negative numbers and the days following ovulation designated by positive numbers. The data underlying this figure are provided in S5 Data.

https://doi.org/10.1371/journal.pbio.3003699.s001

(DOCX)

S2 Appendix.

Table A. Description of dataset composition for vaginal pH and gene expression analyses. Total number of samples, as well as number of samples per cycle phase or copulatory status for cycle phase and post-copulatory analyses, respectively. Measures of male genetic heterozygosity and complementarity are described in Table I in S2 Appendix.
Table B. Differentially expressed genes- between phases. Genes with significant (FDR < 10%) differential expression in pairwise comparisons between cycle phases. Negative coefficients correspond to lower expression in the phase listed first in the comparison column, and positive coefficients correspond to higher expression in the phase listed first.
Table C. Gene set enrichment analysis- between phases. GO pathways that are overrepresented among genes that are differentially expressed between the fertile and non-fertile phase.
Table D. Vaginal pH linear model results- cycle phase.
Table E. Genes highly expressed in semen. Genes with an average of >20 cpm in semen samples.
Table F. Differentially expressed genes- post-copulatory versus non-copulatory. Genes with significant (LFSR < 10%) differential expression in post-copulatory versus non-copulatory samples. Negative coefficients correspond to lower expression post-copulation and positive coefficients correspond to higher expression post-copulation.
Table G. Gene set enrichment analysis- post-copulatory versus non-copulatory. GO pathways that are overrepresented among genes that are differentially expressed between post-copulatory versus non-copulatory samples.
Table H. Vaginal pH linear model results- post-copulatory status.
Table I. Measures of male genetic diversity and dyadic complementarity.
Table J. Differentially expressed genes- male genetic diversity. Genes with a significant (FDR < 10%) interactive effect between post-copulatory status and one metric of male genetic diversity (listed in the “genotype metric” column).
Table K. Differentially expressed genes- genetic complementarity. Genes with a significant (FDR < 10%) interactive effect between post-copulatory status and one metric of genetic complementarity (listed in the “genotype metric” column).
Table L. Gene set enrichment analysis- male genetic diversity and complementarity. Pathways with a significant enrichment (FDR < 10%) interactive effect between post-copulatory status and one metric of male genetic diversity or complementarity (listed in “genotype metric” column).
Table M. Number of genes/pathways whose post-copulatory expression is significantly modified by an aspect of male genetic makeup across five mating dyads. Number of differentially expressed (DE) genes passing an LFSR < 0.1, LFSR < 0.05, and LFSR < 0.01 threshold, number of LFSR < 0.1 genes which also pass a family-wise error rate (FWER)-adjusted p < 0.05 threshold, and number of gene set enrichment analysis (GSEA) pathways at a p < 0.05 threshold.
Table N. Vaginal pH linear model results- male genetic diversity and complementarity. Interactive effect of post-copulatory status and each measure of male genetic diversity or complementarity in predicting vaginal pH. Each genetic metric was tested in a separate model.
Table O. Primer sequences used to amplify MHC A, B, DQA, and DRB loci.

https://doi.org/10.1371/journal.pbio.3003699.s002

(XLSX)

S5 Data. Data underlying Figs A, B, C, D, E and G in S1 Appendix.

https://doi.org/10.1371/journal.pbio.3003699.s007

(XLSX)

Acknowledgments

We would like to thank members of the Primate Hormones and Behavior lab at NYU, the Primate Genetics Lab at DPZ, and the Melin lab at the University of Calgary for their support in completing this work. We extend a huge thank you to all of the staff at the CNRS Station de Primatologie for their assistance in executing this project, specifically Romain Lacoste, Slaveia Garbit, Magali Ghirart, Pascaline Boitelle, and Pau Molina. Thank you to Beth Archie and Cliff Jolly for their valuable feedback throughout the formulation and execution of this project. Thank you to Kristi Holt for collecting data and Stefano Vaglio for his insight into baboon training. Thank you to Patrícia Ströher, Gwen Duytschaever, and the University of Calgary’s Centre for Health Genomics and Informatics sequencing core for facilitating RNA preparation and sequencing. This work was supported in part through the NYU IT High Performance Computing resources, services, and staff expertise.

References

  1. Darwin C. The descent of man, and selection in relation to sex. London: John Murray. 1871.
  2. Berglund A, Bisazza A, Pilastro A. Armaments and ornaments: an evolutionary explanation of traits of dual utility. Biol J Linn Soc. 1996;58(4):385–99.
  3. Coleman SW, Patricelli GL, Borgia G. Variable female preferences drive complex male displays. Nature. 2004;428(6984):742–5. pmid:15085130
  4. Prum RO. Aesthetic evolution by mate choice: Darwin’s really dangerous idea. Philos Trans R Soc Lond B Biol Sci. 2012;367(1600):2253–65. pmid:22777014
  5. Eberhard WG. Female control: sexual selection by cryptic female choice. Princeton University Press; 1996.
  6. Firman RC, Gasparini C, Manier MK, Pizzari T. Postmating female control: 20 years of cryptic female choice. Trends Ecol Evol. 2017;32(5):368–82. pmid:28318651
  7. Marie-Orleach L, Vellnow N, Schärer L. The repeatable opportunity for selection differs between pre- and postcopulatory fitness components. Evol Lett. 2020;5(1):101–14. pmid:33552539
  8. Rosenthal GG, Ryan MJ. Sexual selection and the ascent of women: mate choice research since Darwin. Science. 2022;375(6578):eabi6308. pmid:35050648
  9. Martín-Coello J, Benavent-Corai J, Roldan ERS, Gomendio M. Sperm competition promotes asymmetries in reproductive barriers between closely related species. Evolution. 2009;63(3):613–23. pmid:19087184
  10. Sutter A, Lindholm AK. No evidence for female discrimination against male house mice carrying a selfish genetic element. Curr Zool. 2016;62(6):675–85. pmid:29491955
  11. Coltman DW, Bancroft DR, Robertson A, Smith JA, Clutton-Brock TH, Pemberton JM. Male reproductive success in a promiscuous mammal: behavioural estimates compared with genetic paternity. Mol Ecol. 1999;8(7):1199–209. pmid:10447860
  12. Curie-Cohen M, Yoshihara D, Luttrell L, Benforado K, MacCluer JW, Stone WH. The effects of dominance on mating behavior and paternity in a captive troop of rhesus monkeys (Macaca mulatta). Am J Primatol. 1983;5(2):127–38. pmid:31991947
  13. Stern BR, Smith DG. Sexual behaviour and paternity in three captive groups of rhesus monkeys (Macaca mulatta). Anim Behav. 1984;32:23–32.
  14. Hardy MP, Dent JN. Transport of sperm within the cloaca of the female red-spotted newt. J Morphol. 1986;190(3):259–70. pmid:3806681
  15. Roldan ER, Vitullo AD, Merani MS, Von Lawzewitsch I. Cross fertilization in vivo and in vitro between three species of vesper mice, Calomys (Rodentia, Cricetidae). J Exp Zool. 1985;233(3):433–42. pmid:3882881
  16. Yeates SE, Diamond SE, Einum S, Emerson BC, Holt WV, Gage MJG. Cryptic choice of conspecific sperm controlled by the impact of ovarian fluid on sperm swimming behavior. Evolution. 2013;67(12):3523–36. pmid:24299405
  17. Firman RC, Simmons LW. Gametic interactions promote inbreeding avoidance in house mice. Ecol Lett. 2015;18(9):937–43. pmid:26154782
  18. Almiñana C, Caballero I, Heath PR, Maleki-Dizaji S, Parrilla I, Cuello C, et al. The battle of the sexes starts in the oviduct: modulation of oviductal transcriptome by X and Y-bearing spermatozoa. BMC Genomics. 2014;15(1):293. pmid:24886317
  19. Sharkey DJ, Macpherson AM, Tremellen KP, Robertson SA. Seminal plasma differentially regulates inflammatory cytokine gene expression in human cervical and vaginal epithelial cells. Mol Hum Reprod. 2007;13(7):491–501. pmid:17483528
  20. Fitzpatrick JL, Willis C, Devigili A, Young A, Carroll M, Hunter HR, et al. Chemical signals from eggs facilitate cryptic female choice in humans. Proc Biol Sci. 2020;287(1928):20200805. pmid:32517615
  21. Forsberg LA, Dannewitz J, Petersson E, Grahn M. Influence of genetic dissimilarity in the reproductive success and mate choice of brown trout – females fishing for optimal MHC dissimilarity. J Evol Biol. 2007;20(5):1859–69. pmid:17714303
  22. Thoss M, Ilmonen P, Musolf K, Penn DJ. Major histocompatibility complex heterozygosity enhances reproductive success. Mol Ecol. 2011;20(7):1546–57. pmid:21291500
  23. Kalbe M, Eizaguirre C, Dankert I, Reusch TBH, Sommerfeld RD, Wegner KM, et al. Lifetime reproductive success is maximized with optimal major histocompatibility complex diversity. Proc Biol Sci. 2009;276(1658):925–34. pmid:19033141
  24. Sauermann U, Nürnberg P, Bercovitch FB, Berard JD, Trefilov A, Widdig A, et al. Increased reproductive success of MHC class II heterozygous males among free-ranging rhesus macaques. Hum Genet. 2001;108(3):249–54. pmid:11354639
  25. Schwensow N, Fietz J, Dausmann K, Sommer S. MHC-associated mating strategies and the importance of overall genetic diversity in an obligate pair-living primate. Evol Ecol. 2008;22:617–36.
  26. 26. Setchell JM, Charpentier MJE, Abbott KM, Wickings EJ, Knapp LA. Opposites attract: MHC-associated mate choice in a polygynous primate. J Evol Biol. 2010;23(1):136–48. pmid:19891747
  27. 27. Huchard E, Baniel A, Schliehe-Diecks S, Kappeler PM. MHC-disassortative mate choice and inbreeding avoidance in a solitary primate. Mol Ecol. 2013;22(15):4071–86. pmid:23889546
  28. 28. Yang B, Ren B, Xiang Z, Yang J, Yao H, Garber PA, et al. Major histocompatibility complex and mate choice in the polygynous primate: the Sichuan snub-nosed monkey (Rhinopithecus roxellana). Integr Zool. 2014;9(5):598–612. pmid:24382257
  29. 29. Chaves PB, Strier KB, Di Fiore A. Paternity data reveal high MHC diversity among sires in a polygynandrous, egalitarian primate. Proc Biol Sci. 2023;290(2004):20231035. pmid:37528707
  30. 30. Gasparini C, Congiu L, Pilastro A. Major histocompatibility complex similarity and sexual selection: different does not always mean attractive. Mol Ecol. 2015;24(16):4286–95. pmid:25940673
  31. 31. Løvlie H, Gillingham MAF, Worley K, Pizzari T, Richardson DS. Cryptic female choice favours sperm from major histocompatibility complex-dissimilar males. Proc Biol Sci. 2013;280(1769):20131296. pmid:24004935
  32. 32. Rülicke T, Chapuisat M, Homberger FR, Macas E, Wedekind C. MHC-genotype of progeny influenced by parental infection. Proc Biol Sci. 1998;265(1397):711–6. pmid:9608731
  33. 33. Yeates SE, Einum S, Fleming IA, Megens H-J, Stet RJM, Hindar K, et al. Atlantic salmon eggs favour sperm in competition that have similar major histocompatibility alleles. Proc Biol Sci. 2009;276(1656):559–66. pmid:18854296
  34. 34. Higham JP, Semple S, MacLarnon A, Heistermann M, Ross C. Female reproductive signaling, and male mating behavior, in the olive baboon. Horm Behav. 2009;55(1):60–7. pmid:18786539
  35. 35. Jolly CJ, Phillips-Conroy JE. Testicular size, mating system, and maturation schedules in wild Anubis and Hamadryas baboons. Int J Primatol. 2003;24:125–42.
  36. 36. Olmsted SS, Dubin NH, Cone RA, Moench TR. The rate at which human sperm are immobilized and killed by mild acidity. Fertil Steril. 2000;73(4):687–93. pmid:10731526
  37. 37. Schjenken JE, Robertson SA. The female response to seminal fluid. Physiol Rev. 2020;100(3):1077–117. pmid:31999507
  38. 38. Deschner T, Heistermann M, Hodges K, Boesch C. Female sexual swelling size, timing of ovulation, and male behavior in wild West African chimpanzees. Horm Behav. 2004;46(2):204–15. pmid:15256310
  39. 39. Higham JP, Heistermann M, Ross C, Semple S, MacLarnon A. The timing of ovulation with respect to sexual swelling detumescence in wild olive baboons. Primates. 2008;49(4):295–9. pmid:18810314
  40. 40. Young C, Majolo B, Heistermann M, Schülke O, Ostner J. Male mating behaviour in relation to female sexual swellings, socio-sexual behaviour and hormonal changes in wild Barbary macaques. Horm Behav. 2013;63(1):32–9. pmid:23146839
  41. 41. Wilcox AJ, Dunson D, Baird DD. The timing of the “fertile window” in the menstrual cycle: day specific estimates from a prospective study. BMJ. 2000;321(7271):1259–62. pmid:11082086
  42. 42. Ritchie ME, Phipson B, Wu D, Hu Y, Law CW, Shi W, et al. limma powers differential expression analyses for RNA-sequencing and microarray studies. Nucleic Acids Res. 2015;43(7):e47. pmid:25605792
  43. 43. Stephens M, Carbonetto P, Dai C, Gerard D, Lu M. ashr: methods for adaptive shrinkage, using Empirical Bayes. 2020.
  44. 44. Koller M. Robustlmm: an R package for robust estimation of linear mixed-effects models. J Stat Softw. 2016;75:1–24.
  45. 45. Cook JD, Strauss KA, Caplan YH, Lodico CP, Bush DM. Urine pH: the effects of time and temperature after collection. J Anal Toxicol. 2007;31(8):486–96. pmid:17988463
  46. 46. Karlsson AH, Rosenvold K. The calibration temperature of pH-glass electrodes: significance for meat quality classification. Meat Sci. 2002;62(4):497–501. pmid:22061758
  47. 47. Miller EA, Beasley DE, Dunn RR, Archie EA. Lactobacilli dominance and vaginal pH: why is the human vaginal microbiome unique? Front Microbiol. 2016;7:1936.
  48. 48. Miller EA, Livermore JA, Alberts SC, Tung J, Archie EA. Ovarian cycling and reproductive state shape the vaginal microbiota in wild baboons. Microbiome. 2017;5(1):8. pmid:28103920
  49. 49. Petersen RM, Bergey CM, Roos C, Higham JP. Relationship between genome-wide and MHC class I and II genetic diversity and complementarity in a nonhuman primate. Ecol Evol. 2022;12(10):e9346. pmid:36311412
  50. 50. Munoz-Suano A, Hamilton AB, Betz AG. Gimme shelter: the immune system during pregnancy: the immunology of pregnancy. Immunol Rev. 2011;241:20–38.
  51. 51. Wira CR, Rodriguez-Garcia M, Patel MV, Biswas N, Fahey JV. Endocrine regulation of the mucosal immune system in the female reproductive tract. Mucosal Immunology. 2015. p. 2141–56.
  52. 52. Wagner RD, Johnson SJ. Probiotic lactobacillus and estrogen effects on vaginal epithelial gene expression responses to Candida albicans. J Biomed Sci. 2012;19(1):58. pmid:22715972
  53. 53. Wigby S, Sirot LK, Linklater JR, Buehner N, Calboli FCF, Bretman A, et al. Seminal fluid protein allocation and male reproductive success. Curr Biol. 2009;19(9):751–7. pmid:19361995
  54. 54. Robertson SA. Seminal plasma and male factor signalling in the female reproductive tract. Cell Tissue Res. 2005;322(1):43–52. pmid:15909166
  55. 55. Scherjon S, Lashley L, van der Hoorn M-L, Claas F. Fetus specific T cell modulation during fertilization, implantation and pregnancy. Placenta. 2011;32 Suppl 4:S291–7. pmid:21592567
  56. 56. Schjenken JE, Robertson SA. Seminal fluid and immune adaptation for pregnancy – comparative biology in mammalian species. Reprod Domest Anim. 2014;49 Suppl 3:27–36. pmid:25220746
  57. 57. Lee JY, Lee M, Lee SK. Role of endometrial immune cells in implantation. Clin Exp Reprod Med. 2011;38(3):119–25. pmid:22384430
  58. 58. Robertson SA, Care AS, Moldenhauer LM. Regulatory T cells in embryo implantation and the immune response to pregnancy. J Clin Invest. 2018;128(10):4224–35. pmid:30272581
  59. 59. Robertson SA, Prins JR, Sharkey DJ, Moldenhauer LM. Seminal fluid and the generation of regulatory T cells for embryo implantation. Am J Reprod Immunol. 2013;69(4):315–30. pmid:23480148
  60. 60. Peterson BK, Weber JN, Kay EH, Fisher HS, Hoekstra HE. Double digest RADseq: an inexpensive method for de novo SNP discovery and genotyping in model and non-model species. PLoS One. 2012;7(5):e37135. pmid:22675423
  61. 61. Li H. Aligning sequence reads, clone sequences and assembly contigs with BWA-MEM. arXiv. 2013. http://arxiv.org/abs/1303.3997
  62. 62. Rochette NC, Rivera-Colón AG, Catchen JM. Stacks 2: analytical methods for paired-end sequencing improve RADseq-based population genomics. Mol Ecol. 2019;28(21):4737–54. pmid:31550391
  63. 63. Purcell S, Neale B, Todd-Brown K, Thomas L, Ferreira MAR, Bender D, et al. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am J Hum Genet. 2007;81(3):559–75. pmid:17701901
  64. 64. Coltman DW, Pilkington JG, Smith JA, Pemberton JM. Parasite-mediated selection against inbred Soay sheep in a free-living island population. Evolution. 1999;53(4):1259–67. pmid:28565537
  65. 65. Alho JS, Välimäki K, Merilä J. Rhh: an R extension for estimating multilocus heterozygosity and heterozygosity-heterozygosity correlation. Mol Ecol Resour. 2010;10(4):720–2. pmid:21565077
  66. 66. Manichaikul A, Mychaleckyj JC, Rich SS, Daly K, Sale M, Chen W-M. Robust relationship inference in genome-wide association studies. Bioinformatics. 2010;26(22):2867–73. pmid:20926424
  67. 67. Piertney SB, Oliver MK. The evolutionary ecology of the major histocompatibility complex. Heredity. 2006;96(1):7–21. pmid:16094301
  68. 68. Hughes AL, Yeager M. Natural selection and the evolutionary history of major histocompatibility complex loci. Front Biosci. 1998;3:d509–16. pmid:9601106
  69. 69. Schwensow N, Fietz J, Dausmann KH, Sommer S. Neutral versus adaptive genetic variation in parasite resistance: importance of major histocompatibility complex supertypes in a free-ranging primate. Heredity. 2007;99(3):265–77. pmid:17519969
  70. 70. Goodswen SJ, Kennedy PJ, Ellis JT. A gene-based positive selection detection approach to identify vaccine candidates using Toxoplasma gondii as a test case protozoan pathogen. Front Genet. 2018;9:332. pmid:30177953
  71. 71. Charif D, Lobry JR. SeqinR 1.0-2: a contributed package to the R project for statistical computing devoted to biological sequences retrieval and analysis. Biological and Medical Physics, Biomedical Engineering. Springer Berlin Heidelberg. 2007. p. 207–32.
  72. 72. Katoh K, Standley DM. MAFFT multiple sequence alignment software version 7: improvements in performance and usability. Mol Biol Evol. 2013;30(4):772–80. pmid:23329690
  73. 73. Suyama M, Torrents D, Bork P. PAL2NAL: robust conversion of protein sequence alignments into the corresponding codon alignments. Nucleic Acids Res. 2006;34 Suppl 2:W609–12. pmid:16845082
  74. 74. Stamatakis A. RAxML version 8: a tool for phylogenetic analysis and post-analysis of large phylogenies. Bioinformatics. 2014;30(9):1312–3. pmid:24451623
  75. 75. Yang Z. PAML 4: phylogenetic analysis by maximum likelihood. Mol Biol Evol. 2007;24(8):1586–91. pmid:17483113
  76. 76. Yang Z, Wong WSW, Nielsen R. Bayes empirical bayes inference of amino acid sites under positive selection. Mol Biol Evol. 2005;22(4):1107–18. pmid:15689528
  77. 77. Sandberg M, Eriksson L, Jonsson J, Sjöström M, Wold S. New chemical descriptors relevant for the design of biologically active peptides. A multivariate characterization of 87 amino acids. J Med Chem. 1998;41(14):2481–91. pmid:9651153
  78. 78. R Core Team. R: A language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing; 2021.
  79. 79. Langfelder P, Zhang B, Horvath S. Defining clusters from a hierarchical cluster tree: the Dynamic Tree Cut package for R. Bioinformatics. 2008;24(5):719–20. pmid:18024473
  80. 80. Greenbaum J, Sidney J, Chung J, Brander C, Peters B, Sette A. Functional classification of class II human leukocyte antigen (HLA) molecules reveals seven different supertypes and a surprising degree of repertoire sharing across supertypes. Immunogenetics. 2011;63(6):325–35. pmid:21305276
  81. 81. Lund O, Nielsen M, Kesmir C, Petersen AG, Lundegaard C, Worning P, et al. Definition of supertypes for HLA molecules using clustering of specificity matrices. Immunogenetics. 2004;55(12):797–810. pmid:14963618
  82. 82. Sette A, Sidney J. Nine major HLA class I supertypes account for the vast preponderance of HLA-A and -B polymorphism. Immunogenetics. 1999;50(3–4):201–12. pmid:10602880
  83. 83. Southwood S, Sidney J, Kondo A, del Guercio MF, Appella E, Hoffman S, et al. Several common HLA-DR types share largely overlapping peptide binding repertoires. J Immunol. 1998;160(7):3363–73. pmid:9531296
  84. 84. Trachtenberg E, Korber B, Sollars C, Kepler TB, Hraber PT, Hayes E, et al. Advantage of rare HLA supertype in HIV disease progression. Nat Med. 2003;9(7):928–35. pmid:12819779
  85. 85. Wildt DE, Doyle LL, Stone SC, Harrison RM. Correlation of perineal swelling with serum ovarian hormone levels, vaginal cytology, and ovarian follicular development during the baboon reproductive cycle. Primates. 1977;18(2):261–70.
  86. 86. Hendrickx A. Embryology of the baboon. Chicago, IL: University of Chicago Press; 1971.
  87. 87. MacLennan AH, Wynn RM. Menstrual cycle of the baboon. I. Clinical features, vaginal cytology and endometrial histology. Obstet Gynecol. 1971;38(3):350–8. pmid:4999309
  88. 88. Vaglio S, Minicozzi P, Kessler SE, Walker D, Setchell JM. Olfactory signals and fertility in olive baboons. Sci Rep. 2021;11(1):8506. pmid:33875713
  89. 89. Vaglio S, Ducroix L, Rodriguez Villanueva M, Consiglio R, Kim AJ, Neilands P, et al. Female copulation calls vary with male ejaculation in captive olive baboons. Behaviour. 2020;157:807–22.
  90. 90. Bolger AM, Lohse M, Usadel B. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics. 2014;30(15):2114–20. pmid:24695404
  91. 91. Dobin A, Davis CA, Schlesinger F, Drenkow J, Zaleski C, Jha S, et al. STAR: ultrafast universal RNA-seq aligner. Bioinformatics. 2013;29(1):15–21. pmid:23104886
  92. 92. Janiszewska M, Tabassum DP, Castaño Z, Cristea S, Yamamoto KN, Kingston NL, et al. Subclonal cooperation drives metastasis by modulating local and systemic immune microenvironments. Nat Cell Biol. 2019;21(7):879–88. pmid:31263265
  93. 93. Littleton ES, Childress ML, Gosting ML, Jackson AN, Kojima S. Genome-wide correlation analysis to identify amplitude regulators of circadian transcriptome output. Sci Rep. 2020;10(1):21839. pmid:33318596
  94. 94. Rajkov J, El Taher A, Böhne A, Salzburger W, Egger B. Gene expression remodelling and immune response during adaptive divergence in an African cichlid fish. Mol Ecol. 2021;30(1):274–96. pmid:33107988
  95. 95. Corley SM, MacKenzie KL, Beverdam A, Roddam LF, Wilkins MR. Differentially expressed genes from RNA-Seq and functional enrichment results are affected by the choice of single-end versus paired-end reads and stranded versus non-stranded protocols. BMC Genomics. 2017;18(1):399. pmid:28535780
  96. 96. Wood DE, Lu J, Langmead B. Improved metagenomic analysis with Kraken 2. Genome Biol. 2019;20(1):257. pmid:31779668
  97. 97. Liao Y, Smyth GK, Shi W. The R package Rsubread is easier, faster, cheaper and better for alignment and quantification of RNA sequencing reads. Nucleic Acids Res. 2019;47(8):e47. pmid:30783653
  98. 98. Robinson MD, McCarthy DJ, Smyth GK. edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics. 2010;26(1):139–40. pmid:19910308
  99. 99. Stephens M. False discovery rates: a new deal. Biostatistics. 2017;18(2):275–94. pmid:27756721
  100. 100. Resztak JA, Wei J, Zilioli S, Sendler E, Alazizi A, Mair-Meijers HE, et al. Genetic control of the dynamic transcriptional response to immune stimuli and glucocorticoids at single-cell resolution. Genome Res. 2023;33(6):839–56. pmid:37442575
  101. 101. Lea AJ, Peng J, Ayroles JF. Diverse environmental perturbations reveal the evolution and context-dependency of genetic effects on gene expression levels. Genome Res. 2022;32(10):1826–39. pmid:36229124
  102. 102. Petersen RM, Vockley CM, Lea AJ. Uncovering methylation-dependent genetic effects on regulatory element function in diverse genomes. Genome Res. 2025;35(8):1781–93. pmid:40659498
  103. 103. Xu S, Hu E, Cai Y, Xie Z, Luo X, Zhan L, et al. Using clusterProfiler to characterize multiomics data. Nat Protoc. 2024;19(11):3292–320. pmid:39019974
  104. 104. Durinck S, Spellman PT, Birney E, Huber W. Mapping identifiers for the integration of genomic datasets with the R/Bioconductor package biomaRt. Nat Protoc. 2009;4(8):1184–91. pmid:19617889
  105. 105. Zeileis A, Hothorn T. Diagnostic checking in regression relationships. R News. 2002.
  106. 106. Holm S. A simple sequentially rejective multiple test procedure. Scand J Stat. 1979;6:65–70.