Human and pathogen genotype-by-genotype interactions in the light of coevolution theory

Abstract

Antagonistic coevolution (i.e., reciprocal adaptation and counter-adaptation) between hosts and pathogens has long been considered an important driver of genetic variation. However, direct evidence for this is still scarce, especially in vertebrates. The wealth of data on genetics of susceptibility to infectious disease in humans provides an important resource for understanding host–pathogen coevolution, but studies of humans are rarely framed in coevolutionary theory. Here, I review data from human host–pathogen systems to critically assess the evidence for a key assumption of models of host–pathogen coevolution—the presence of host genotype-by-pathogen genotype interactions (G×G). I also attempt to infer whether observed G×G fit best with “gene-for-gene” or “matching allele” models of coevolution. I find that there are several examples of G×G in humans (involving, e.g., ABO, HBB, FUT2, SLC11A1, and HLA genes) that fit assumptions of either gene-for-gene or matching allele models. This means that there is potential for coevolution to drive polymorphism also in humans (and presumably other vertebrates), but further studies are required to investigate how widespread this process is.

Introduction

Population genetic analyses of humans as well as other organisms have shown that immune genes and other genes at the host–pathogen interface are often highly polymorphic. Moreover, many of these polymorphisms are associated with susceptibility to infectious and inflammatory/autoimmune disease and have therefore likely been subject to natural selection [1,2]. Natural selection is normally expected to eliminate genetic variation, so why are immune genes then so variable?

A popular idea is that the high level of polymorphism is a result of host–pathogen coevolution driven by negative frequency-dependent selection (NFDS; Box 1), often referred to as “Red Queen dynamics” [3,4]. This can occur because infection typically requires pathogen binding to some host molecule to gain access to tissue and/or pathogen evasion of immune recognition to avoid clearance. Regardless of the type of molecular interaction, pathogens should evolve to infect common host genotypes that are then selected against and decline in frequency, followed by pathogen adaptation to alternative host genotypes which are then selected against. Such persistent NFDS as a result of continuous pathogen adaptation to the currently most frequent host genotype can lead to the maintenance of 2 or more alternative alleles for long time periods at the loci involved.
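The verbal argument above can be illustrated with a toy simulation. The following is a minimal sketch, not a model taken from the cited literature: it assumes a haploid host and pathogen with 2 alleles each, discrete generations, and strict matching (pathogen allele P1 infects only host allele H1, and P2 only H2). The function names and the parameter values for `s` and `t` are hypothetical and chosen only for illustration.

```python
def simulate_matching_alleles(h0=0.6, p0=0.5, s=0.2, t=0.2, generations=300):
    """Toy haploid model of indirect NFDS (all parameters are illustrative).

    h: frequency of host allele H1; p: frequency of pathogen allele P1.
    s: fitness cost of infection to the host.
    t: fitness gain to the pathogen from being able to infect.
    """
    h, p = h0, p0
    trajectory = [(h, p)]
    for _ in range(generations):
        # Host fitness declines with the frequency of the pathogen allele that infects it.
        w_h1 = 1 - s * p
        w_h2 = 1 - s * (1 - p)
        # Pathogen fitness increases with the frequency of the host allele it can infect.
        w_p1 = 1 + t * h
        w_p2 = 1 + t * (1 - h)
        # Standard haploid selection update, applied to both species simultaneously.
        h_next = h * w_h1 / (h * w_h1 + (1 - h) * w_h2)
        p_next = p * w_p1 / (p * w_p1 + (1 - p) * w_p2)
        h, p = h_next, p_next
        trajectory.append((h, p))
    return trajectory


def count_crossings(values, threshold=0.5):
    """Count how often a frequency series crosses the threshold."""
    return sum(
        1 for a, b in zip(values, values[1:])
        if (a - threshold) * (b - threshold) < 0
    )


if __name__ == "__main__":
    host_freqs = [h for h, _ in simulate_matching_alleles()]
    print("host allele H1 crossed 0.5", count_crossings(host_freqs), "times")
```

In this sketch, the common host allele is selected against once the matching pathogen allele has increased, so both allele frequencies oscillate rather than going to fixation within the simulated time span. Real models differ in many respects (diploidy, costs, population structure), so the output should be read only as a qualitative illustration.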

Box 1. Glossary

Host–pathogen coevolution: a form of antagonistic coevolution, where there is reciprocal selection for adaptation and counter-adaptation in 2 species that affect each other’s fitness negatively.

Negative frequency-dependent selection: when the fitness of an allele is negatively correlated with its frequency (direct NFDS) or the frequency of an allele at another locus (indirect NFDS). In case of host–pathogen coevolution, there needs to be indirect NFDS in the sense that the fitness of a host allele depends on the frequency of the pathogen allele with which it interacts [3,10].

The finding that genes at the host–pathogen interface in humans and other organisms often have signatures of balancing selection [5–7] is clearly consistent with the idea of coevolution by NFDS. However, balancing selection on such genes could also be a result of other forms of pathogen-mediated balancing selection, like heterozygote advantage or spatiotemporal heterogeneity in pathogen abundance driven by environmental factors [3]. None of the latter processes involve reciprocal selection for adaptation and counter-adaptation as in coevolution; instead, they represent unidirectional selection by pathogens on the host.

In invertebrates and plants, “time-shift experiments”—where hosts are exposed to pathogens from the past, present, and future—have demonstrated that coevolution by NFDS indeed plays a role in natural populations [8,9]. However, such experiments are difficult to perform on vertebrates, and there is little other evidence that balancing selection in vertebrates is a result of host–pathogen coevolution by NFDS. Moreover, even if NFDS in principle is a very powerful driver of polymorphism, theoretical models have shown that it only occurs in a quite narrow parameter space [10]. Thus, it is relevant to ask: How important is coevolution, with continuous adaptation and counter-adaptation of host and pathogen, as a driver of polymorphism in vertebrates?

A good way to start investigating the role of coevolution by NFDS in vertebrates is to test assumptions that are specific to models of host–pathogen coevolution. The key assumption of classical models of host–pathogen coevolution by NFDS is that infection depends not only on genetic variation in host and pathogen, but also on the combination of host and pathogen genotypes. Thus, in statistical terms, there needs to be a host genotype-by-pathogen genotype interaction (G×G) for susceptibility to infection [3,4,11].

There are 2 basic types of models of host–pathogen coevolution, with different types of G×G; “matching allele” (MA) and “gene-for-gene” (GFG) models [10,12]. Briefly, MA models assume G×G such that different host genotypes are susceptible to different pathogen genotypes, while GFG models assume G×G such that host genotypes differ in the range of pathogen genotypes they are susceptible to. Both scenarios can lead to coevolution by NFDS and the long-term maintenance of polymorphism, but the GFG scenario will only do so if there is a cost of resistance (see Fig 1 for details). Thus, testing for G×G and investigating the nature of G×G provides a key to understanding host–pathogen coevolution.
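The difference between the 2 types of G×G can be made concrete with 2×2 infection matrices. The sketch below is an illustration under simplifying assumptions (haploid host and pathogen, 2 genotypes each, infection coded as 0 or 1); the matrices and the function name `gxg_type` are hypothetical and not taken from the cited models. It computes the interaction contrast on the additive scale: a contrast of zero means no G×G, a full crossover corresponds to MA, and any other non-zero contrast corresponds to host genotypes differing in the range of pathogen genotypes that can infect them (GFG).

```python
def gxg_type(infects):
    """Classify a 2x2 infection matrix.

    infects[i][j] is 1 if pathogen genotype j can infect host genotype i, else 0.
    """
    (a, b), (c, d) = infects
    # Interaction contrast: does the effect of pathogen genotype differ between hosts?
    interaction = (a - b) - (c - d)
    if interaction == 0:
        return "no GxG"
    if abs(interaction) == 2:
        return "MA"   # each host genotype is susceptible to a different pathogen genotype
    return "GFG"      # host genotypes differ in the range of pathogens that infect them


# Rows: host genotypes; columns: pathogen genotypes (illustrative matrices).
MATCHING_ALLELE = [[1, 0],
                   [0, 1]]
GENE_FOR_GENE = [[1, 1],
                 [0, 1]]
HOST_EFFECT_ONLY = [[1, 1],
                    [0, 0]]

if __name__ == "__main__":
    for name, matrix in [("matching allele", MATCHING_ALLELE),
                         ("gene-for-gene", GENE_FOR_GENE),
                         ("host effect only", HOST_EFFECT_ONLY)]:
        print(name, "->", gxg_type(matrix))
```

The third matrix shows why genetic variation in host susceptibility alone is not enough: one host genotype is resistant to everything, but the effect of pathogen genotype is the same in both hosts, so there is no G×G.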

Fig 1. Coevolutionary consequences of different types of G×G.

For simplicity, the figure illustrates a scenario where both host and pathogen are haploid and where the G×G involves 1 host locus and 1 pathogen locus (each with 2 different alleles). In MA models, there is a G×G such that different pathogen genotypes infect different host genotypes (a). MA models readily lead to NFDS and the long-term maintenance of polymorphism at interacting loci in both host and pathogen, either in the form of cyclic allele frequencies or a stable polymorphism (b). This occurs because resistance to 1 pathogen genotype comes with a cost in the form of susceptibility to other pathogen genotypes. In other words, under the MA scenario, there is a trade-off between resistance to different pathogen genotypes. In GFG models, there is a G×G such that some pathogen genotypes infect a wider range of genotypes than others (c). In the basic GFG scenario, there is no cost of host resistance or pathogen infectivity. When a host allele that improves resistance without any costs (to the host) occurs in a population, it will be favoured by selection and driven to fixation. Similarly, when a pathogen allele that improves infectivity without costs (to the pathogen) occurs, it will go to fixation. GFG models without costs of resistance or infectivity therefore lead to selective sweeps with only brief, transient polymorphisms, often referred to as arms race coevolution ((d); note that successive sweeps often occur at different sites in the genome, as indicated by different types of lines). However, if there is a cost of host resistance in the currency of another trait related to fitness so that no host genotype has highest fitness under all conditions (and a cost of pathogen infectivity so that no pathogen genotype has highest fitness under all conditions), also GFG models can lead to coevolution by NFDS and the long-term maintenance of polymorphism in the same way as matching allele models (b) [12]. 
Note that different types of molecular interactions between host and pathogen can result in both MA and GFG type G×G (see [11]). Whether NFDS results in cycles or stable polymorphism (b) depends on the relative importance of 2 different types of NFDS; direct NFDS (where the fitness of an allele is negatively correlated with its frequency) and indirect NFDS (where the fitness of an allele in the host is negatively correlated with the frequency of an allele at the locus involved in G×G in the coevolving pathogen) [10]. Based on figures in [4,11]. GFG, gene-for-gene; G×G, host genotype-by-pathogen genotype interaction; MA, matching allele; NFDS, negative frequency-dependent selection.

https://doi.org/10.1371/journal.pgen.1010685.g001

There are numerous studies demonstrating G×G in plant and invertebrate host–pathogen systems (for examples, see [13,14]), but explicit tests for G×G in vertebrates have been scarce [15]. However, during the last decade, several genome-wide tests for G×G in human host–pathogen systems have been published. Here, I systematically review the evidence for G×G from these studies, as well as candidate gene analyses, and evaluate the implications for our understanding of the importance of coevolution between pathogens and humans (and vertebrates in general) as a cause of balancing selection. I focus on the following questions: For which human genes is there evidence of G×G? Are these G×G of MA or GFG type? Are there other types of costs (e.g., risk of autoimmune disease) associated with genes involved in G×G (which could help maintain polymorphism in case of GFG type G×G)? Do genes involved in G×G show signatures of balancing selection (as would be expected if they are engaged in coevolution by NFDS)?

G×G in humans

Literature search

Studies of G×G in humans have not used consistent terminology (for example, the term genotype-by-genotype interaction or similar is rarely used in the literature on humans), so it is difficult to perform a focused literature search with narrow search terms. Instead, I identified relevant papers by a combination of broad reading of the literature (particularly review papers of genetics of susceptibility to pathogens in humans) and a literature search with broad search terms (Box 2).

Box 2. Literature search

I first identified relevant papers by broad reading of the literature, in particular review papers of genetics of susceptibility to pathogens in humans; this yielded a first set of 13 papers showing G×G, involving 8 different pathogens. To find more papers, I performed a literature search in Web of Science Core Collection in Dec 2022. To this end, I extracted key words from the titles and abstracts of the first set of papers and constructed a query with relatively broad search terms, but which still yielded a manageable number of records [Topic = human AND (genetic varia* OR polymorph*) AND (bacteria* OR viral OR virus OR parasite OR pathogen) AND (interact* OR interplay OR “genome to genome”), which yielded approximately 3,900 records]. By scanning titles and abstracts of these records, I identified papers that considered genetic variation of both host and pathogen; these papers (approximately 1% of the records) were examined in detail (both original results and cited references). In the end, this literature search produced 15 additional papers with evidence for G×G. Most of these concerned pathogens and/or human genes already included in the first set of papers; the list of pathogens and host genes involved in G×G should thus be reasonably complete.

I selected studies showing G×G for any infection-related trait; thus, not only analyses of susceptibility to infection (which is the trait that is traditionally the focus of models of coevolution), but also studies using disease severity, pathogen load, immune escape mutations, etc., as outcome. I only considered natural genetic variation, so studies of genetically modified pathogens or human cell lines were excluded.

Study types and prevalence of G×G

I found evidence for G×G in 10 human host–pathogen systems, including protozoan, bacterial, and viral pathogens (Table 1). Evidence for G×G comes from several different types of studies, from epidemiological analyses to in vitro assays. Moreover, G×G were detected in several different ways. Several studies tested for G×G between 1 or several host candidate genes and pathogen strains. Another approach, employed in some of the most recent studies, is genome-wide testing for G×G in both host and pathogen, that is by performing genome-wide SNP typing of both host and pathogen and then testing for G×G between all pairs of host and pathogen SNPs, referred to as “genome-to-genome” analysis [16]. Other studies used various combinations of candidate gene analysis, pathogen strain identification, and genome-wide analyses.

To gain insight into how common G×G are, it is useful to focus on studies based on genome-wide analyses of humans (genome-to-genome studies and genome-wide tests for interactions with pathogen strains), as they should provide a more unbiased estimate of the occurrence of G×G than candidate gene studies. Genome-wide analyses have been performed with viral and bacterial pathogens. All 3 genome-wide tests for G×G with viruses found evidence for G×G [17–19]. Most of these concern immune escape mutations, the only exception being [18], which also found G×G for viral load. Of the genome-wide analyses for G×G with bacterial pathogens, 2 studies found statistically significant G×G [20,21] while 1 did not [22]. In all studies where G×G were found, only 1 or a few host loci were involved. Thus, the currently available data indicate that G×G occur in most host–pathogen pairs, but that at most a few host genes are involved in each pair. It should be noted, though, that the multiple testing burden is considerable in genome-wide tests for G×G [16], so future studies with higher power may reveal that a larger number of host loci are often involved in G×G in each host–pathogen pair.
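To give a sense of the multiple testing burden, the following back-of-the-envelope sketch computes a Bonferroni-corrected significance threshold when every host variant is tested against every pathogen variant. The variant counts are hypothetical round numbers, not values from any of the reviewed studies, and Bonferroni correction is used here only as the simplest possible assumption; actual studies may use other corrections.

```python
def bonferroni_threshold(n_host_variants, n_pathogen_variants, alpha=0.05):
    """Per-test significance threshold when all host-pathogen variant pairs are tested."""
    n_tests = n_host_variants * n_pathogen_variants
    return alpha / n_tests


if __name__ == "__main__":
    # Hypothetical study: 500,000 host SNPs tested against 1,000 pathogen variants.
    threshold = bonferroni_threshold(500_000, 1_000)
    print(f"pairwise tests: {500_000 * 1_000:.1e}, threshold: {threshold:.1e}")
```

With these illustrative numbers the per-test threshold is on the order of 1e-10, considerably stricter than the 5e-8 threshold conventionally used in single-genome association studies, which helps explain why only strong interactions are currently detectable.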

Table 1. Evidence for G×G in human host–pathogen systems.

https://doi.org/10.1371/journal.pgen.1010685.t001

For which genes is there evidence of G×G?

Human genes with evidence for G×G include some genes that are textbook examples of associations with susceptibility to infectious disease, such as the MHC class I genes (HLA-A, HLA-B, HLA-C; G×G with, e.g., Plasmodium falciparum and HIV) and the blood group antigen gene ABO (G×G with Helicobacter pylori and Vibrio cholerae). A recent study also found G×G between HBB (encoding the hemoglobin β subunit)—which has well-known effects on susceptibility to malaria—and P. falciparum [23]. Specifically, the protective effect of the HbS variant at HBB was found to depend on the genotype at 3 different loci in the P. falciparum genome, with all 3 loci in strong linkage disequilibrium such that the minor alleles occur together.

In addition, genes with evidence for G×G include some canonical immune genes (e.g., TLR2, IFNL4, KIR2DL2, CD209) and other genes at the host–pathogen interface with well-documented associations with susceptibility to infection (FUT2, SLC11A1), but also several genes that were not previously recognised in this context (e.g., FARP1, STK32C, UNC5D).

Are these G×G of MA or GFG type?

None of the studies put their results in the context of MA versus GFG. I therefore inferred the type of G×G based on the published data. G×G were detected for several different infection phenotypes, including both binary (e.g., infection status) and continuous traits (e.g., pathogen load or disease severity in infected individuals). For case-control studies of infection status or other binary disease phenotypes, where it is possible to calculate host genotype odds ratios (OR) separately for each pathogen genotype, this information can be used to distinguish MA and GFG. To see this, consider the simplest case where there is G×G between a pair of loci that are bi-allelic in both host and pathogen, as in Fig 1. If there is a trade-off between resistance to different pathogen genotypes as in the MA scenario (Fig 1A), host genotype should be associated with both pathogen genotypes, but in opposite ways. Thus, the OR should be >1 for one pathogen genotype but <1 for the other (note that none of the ORs need to be significantly different from 1, but they should be different from each other). In contrast, if host genotypes differ in the range of pathogen genotypes they are resistant/susceptible to, as in the GFG scenario (Fig 1C), host genotype should only be associated with one of the pathogen genotypes. Thus, the OR should be significantly different from 1 for one pathogen genotype but equal to 1 for the other.
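The reasoning above can be expressed as a simple decision rule. The sketch below is a simplified heuristic written for illustration, not the procedure used in any of the reviewed studies. It assumes a 2×2 table of counts (carriers and non-carriers of a host allele among cases and controls) for each of 2 pathogen genotypes, uses Woolf's log-normal approximation for the 95% confidence interval, and tests whether the 2 log odds ratios differ with a z-test. All counts and function names are hypothetical, and the approximation requires that no cell count is zero.

```python
import math


def odds_ratio(carrier_cases, carrier_controls, noncarrier_cases, noncarrier_controls):
    """Odds ratio for carriers vs non-carriers with a Woolf 95% CI (no zero cells)."""
    log_or = math.log((carrier_cases * noncarrier_controls)
                      / (carrier_controls * noncarrier_cases))
    se = math.sqrt(1 / carrier_cases + 1 / carrier_controls
                   + 1 / noncarrier_cases + 1 / noncarrier_controls)
    ci = (math.exp(log_or - 1.96 * se), math.exp(log_or + 1.96 * se))
    return {"or": math.exp(log_or), "log_or": log_or, "se": se, "ci": ci}


def excludes_one(ci):
    """True if the confidence interval does not contain OR = 1."""
    return ci[0] > 1 or ci[1] < 1


def classify_gxg(table_p1, table_p2):
    """Heuristic classification from pathogen genotype-specific ORs."""
    r1, r2 = odds_ratio(*table_p1), odds_ratio(*table_p2)
    # G×G requires that the two ORs differ from each other.
    z = (r1["log_or"] - r2["log_or"]) / math.sqrt(r1["se"] ** 2 + r2["se"] ** 2)
    if abs(z) < 1.96:
        return "no clear GxG"
    # GFG: association with one pathogen genotype only.
    if excludes_one(r1["ci"]) != excludes_one(r2["ci"]):
        return "GFG-like"
    # MA: ORs on opposite sides of 1 (neither needs to differ significantly from 1).
    if r1["log_or"] * r2["log_or"] < 0:
        return "MA-like"
    return "GxG of unclear type"


if __name__ == "__main__":
    # Hypothetical counts: (carrier cases, carrier controls, non-carrier cases, non-carrier controls)
    print(classify_gxg((10, 200, 100, 200), (50, 200, 50, 200)))
    print(classify_gxg((30, 100, 60, 100), (60, 100, 30, 100)))
```

Because the rule compares the 2 odds ratios directly before asking whether either differs from 1, it can in principle flag MA type patterns in which the ORs point in opposite directions without either being individually significant.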

Of the case-control studies with evidence for G×G for infection status or other binary disease phenotypes, 8 present pathogen genotype-specific ORs, with 1 to 3 host genes involved in G×G with each pathogen [20,23–28]. In all but 1 case, the OR is significantly different from 1 for one pathogen genotype but not for others. Thus, in the majority of cases the pattern is most consistent with GFG type G×G. Perhaps the most striking example is HBB and resistance to malaria [23]. Here, HbS is protective against severe malaria if an individual is infected with a parasite having the major allele at all 3 loci involved in G×G (OR≈0.02) but not when infected with parasites having the minor allele at all 3 loci (OR≈1) (based on data in Fig 2 of [23]). Similar differences between host genotypes in the range of pathogen strains they are susceptible to (but without any indication of a trade-off between resistance to different strains) are seen with, for example, ABO (with H. pylori and V. cholerae) [27,29] and FUT2 (with Norovirus) [25]. The only indication of MA type G×G in case-control studies concerns HLA class II and risk of cervical cancer caused by human papilloma virus (HPV), where different HLA haplotypes affect susceptibility to different HPV types [28].

Fig 2. Examples of G×G for continuous traits related to resistance/susceptibility to infectious disease.

(a) G×G between NK cell Killer Immunoglobulin Receptor genotype and HIV genotype for inhibition of viral replication (median ± interquartile range). NK cells with the KIR2DL2 allele strongly inhibit replication of HIV with wild-type alleles at vpu and env (WT_WT) but have limited inhibitory effect on HIV with variant alleles (V_V), while NK cells without KIR2DL2 have limited inhibitory effect on both WT_WT and V_V. Thus, presence/absence of KIR2DL2 affects the range of HIV genotypes an individual is susceptible to, consistent with the GFG scenario. Data extracted from Fig 1B (day 3) of [34]. (b) G×G between HLA-A genotype and HIV genotype for viral load (mean ± SE). Arginine (R) at residue 432 in the pol gene is an immune escape mutation from the HLA-A allele 03:01. In individuals carrying A*03:01, viral load is suppressed in infections with virus without the Pol432R escape mutation, while there is no effect of Pol432 genotype on viral load in individuals without A*03:01 (and no G×G for viral load between Pol432 genotype and other HLA alleles). Thus, there is no indication of a trade-off between resistance to Pol432R and other genotypes as would be the case in an MA type G×G; instead the pattern is consistent with a GFG type G×G. Data from [32]. (c) G×G between SLC11A1 genotype and M. tuberculosis lineage for tuberculosis severity (median ± IQR). A recently evolved M. tuberculosis sublineage (L4.6) in combination with homozygosity for an ancestral SLC11A1 allele (genotype GG) and an original M. tuberculosis lineage (L3 or L4) in combination with ≥1 derived SLC11A1 allele (genotype GA or AA) is associated with more severe disease than the other combinations of host and pathogen genotypes (the interaction is highly significant: P = 0.00022). Thus, there is a trade-off between resistance to disease by different lineages, consistent with an MA type G×G. Data extracted from Fig 2 in [35].
GFG, gene-for-gene; G×G, host genotype-by-pathogen genotype interaction; MA, matching allele.

https://doi.org/10.1371/journal.pgen.1010685.g002

There are also several studies that have performed analyses of associations between pathogen and host alleles in chronic viral infections [17–19,30–33]. Such G×G are generally interpreted as being a result of within-host evolution of immune escape, although they could also reflect differences in susceptibility to infection with viruses carrying different alleles at the start of the infection. These studies have primarily found G×G involving HLA genes. It is generally difficult to infer whether these G×G are of MA or GFG type, because specific HLA alleles are often associated with escape mutations at several positions in the viral genome. However, one of the studies of HIV found that different HLA alleles were associated with different amino acid escape mutations at a particular position [31], a pattern clearly indicating a trade-off between resistance to different pathogen genotypes; thus in at least some cases G×G for escape mutations are consistent with the MA scenario.

Besides studies based on epidemiological analyses of presence/absence of infectious disease or immune escape mutations, there are also some studies finding G×G for various continuous infection-related traits like pathogen replication and disease severity [18,21,32,34–37]. Given that the trait is associated with both host and pathogen fitness, G×G affecting such traits could also lead to coevolution. A study using an in vitro assay of HIV replication found that NK cells with at least 1 copy of the KIR2DL2 allele inhibit replication of 1 specific HIV genotype, while NK cells without KIR2DL2 have limited inhibitory effect regardless of HIV genotype [34], a pattern consistent with a GFG type G×G (Fig 2A). Similarly, a study analysing effects of HIV escape mutations on viral load showed that certain virus genotypes had reduced viral load in individuals carrying a specific MHC allele while there was no effect of virus genotype on viral load in individuals without that allele; this pattern also appears consistent with the GFG scenario (Fig 2B) [32]. In contrast, an analysis of tuberculosis patients showed that SLC11A1 genotype had opposite effects on disease severity depending on Mycobacterium tuberculosis lineage [35], consistent with an MA type G×G (Fig 2C). Overall, 4 of the 7 analyses of continuous traits showed results consistent with GFG type G×G, while 3 are consistent with MA type G×G (Table 1).

Finally, in vitro functional analyses of the ability of different Helicobacter pylori isolates to bind host receptors showed that most isolates are generalists and bind both A and H antigen (from individuals with blood group A and O, respectively) while a significant fraction of strains in South America are specialists and bind only H antigen [29], consistent with GFG type G×G.

Are there other types of costs associated with genes involved in G×G?

Since several of the G×G appeared to be of the GFG type, it would be of interest to know if the genes involved are associated with other types of diseases that could lead to the fitness cost necessary to generate NFDS and help maintain polymorphism in case of a GFG type G×G (Fig 1). To identify potential costs of alleles conferring resistance to a particular pathogen genotype, I searched the GWAS catalog [38] (either directly or via LDtrait at LDlink [39] to check if SNPs involved in G×G were in linkage disequilibrium with SNPs associated with other diseases) and PheWAS Resources (HLA genes) [40] for disease-associations with genes involved in G×G.

For several of the genes with GFG type G×G, there is indeed strong evidence for costs associated with the allele that confers resistance to a subset of pathogen genotypes. For example, the nonfunctional FUT2 allele, which protects against some Norovirus strains, is also associated with Crohn’s disease and other diseases [41,42]. Similarly, KIR2DL2, which inhibits replication of a specific HIV genotype [34], is associated with several autoimmune diseases [43]. Overall, costs are known for about two thirds of the genes with indication of GFG type G×G (Table 1). Interestingly, there is also evidence for costs of resistance in case of SLC11A1, which is one of few genes showing clear MA type G×G (where costs are not necessary to generate NFDS; Fig 1). Here, high and low expression alleles are associated with susceptibility to autoimmune and infectious disease, respectively [44].

Do genes involved in G×G show signatures of balancing selection?

If the G×G identified in humans really lead to coevolution by NFDS, one would expect the genes involved to exhibit signatures of balancing selection that can be detected by analyses of population samples of DNA sequence data [45]. For 12 of the 20 genes involved in G×G, there are such signatures of balancing selection, based on genome-wide scans or candidate gene analyses (Table 1). Most show signatures of long-term balancing selection, in some cases—for example, ABO—in the form of “trans-species polymorphisms,” meaning that the polymorphism has been maintained by selection in primates for tens of millions of years [46]. An exception to the trend for long-term balancing selection is HBB that shows a signature of recent positive or balancing selection (for recent selection the signatures of positive and balancing selection are indistinguishable) [47].

Conclusions

The present review has shown that several human genes are involved in G×G, as assumed by models of host–pathogen coevolution. Most of the G×G seem to fit the GFG rather than MA scenario, particularly for case-control studies of infection status and other binary disease phenotypes, which means a cost of resistance is required for these G×G to lead to maintenance of polymorphism by NFDS. Such costs are known for at least some of the genes with evidence for G×G. Taken together, this shows there is scope for coevolution by NFDS also in vertebrates. These conclusions come with several caveats, though.

First, for G×G to result in coevolution, the phenotypic trait concerned must be associated with both host and pathogen fitness. While most studied traits (Table 1) clearly can affect host fitness, the relevance for pathogen fitness is doubtful in some cases, for example, meningitis in Streptococcus pneumoniae infection [20] and risk of cervical cancer caused by HPV [28]. Second, in case of chronic viral infections (HIV, HCV, and EBV), the G×G are thought to be a result of within-host evolution of immune escape, and it is not always clear if these G×G also affect some aspect of host fitness, such as susceptibility to infection or severity of disease, as would be required for coevolution to occur. However, a recent study of HIV found that at least some of the immune escape mutations led to G×G for viral load [32], indicating that G×G involving immune escape mutations might indeed affect host fitness. Third, inferring if G×G are of GFG or MA type from currently segregating host and pathogen alleles might be misleading. For example, what is actually an MA type G×G might appear to be a GFG type G×G if rare alleles are not sampled [11,48]. Fourth, the preponderance of GFG type G×G in case-control studies of binary disease phenotypes might be an artefact of the fact that these studies analyse each pathogen strain separately and only report host polymorphisms where the OR is different from 1 for at least one of the pathogen strains. Thus, these analyses will miss MA type G×G where the ORs for 2 pathogen strains are in opposite directions and different from each other, but neither is different from 1. Even with these caveats in mind, there are some strong cases for coevolutionarily relevant G×G of both GFG and MA type (GFG: e.g., HBB, ABO, FUT2, and HLA genes; MA: e.g., SLC11A1 and HLA genes).

The G×G for HBB illustrates that different types of pathogen-mediated balancing selection can act on a given gene simultaneously. HBB is the textbook example of heterozygote advantage, where individuals with 1 copy of the HbS variant have improved resistance to malaria, whereas HbS homozygosity leads to sickle cell disease [49]. The finding that HBB is involved in a G×G with P. falciparum shows that there might also be NFDS on this gene. It is often expected that several different types of pathogen-mediated balancing selection operate simultaneously on a given gene and HBB is perhaps the clearest evidence yet that this is the case.

The G×G for ABO and the HLA genes illustrate another aspect of pathogen-mediated balancing selection: a given gene might be coevolving with more than 1 pathogen simultaneously, so-called “diffuse coevolution” [3]. Diffuse coevolution is expected to be common, perhaps the norm, but ABO and the HLA genes are, as far as I am aware, the first cases where specific genes have been shown to be involved in G×G with 2 or more different pathogens, thus demonstrating that there actually is opportunity for diffuse coevolution.

In conclusion, there is some evidence from humans for G×G, a key assumption of models of host–pathogen coevolution by NFDS (but not other types of pathogen-mediated balancing selection). This indicates that balancing selection on genes at the host–pathogen interface in humans (and other vertebrates) could indeed be a result of coevolution, as is commonly assumed. Nevertheless, more studies testing for G×G are clearly desirable, in particular, genome-to-genome studies as they give an unbiased perspective on which genes are involved. Recent development of statistical approaches should facilitate this [50,51]. Still, it is important to recognise that the presence of G×G only shows that there is opportunity for coevolution by NFDS, not that it has occurred. Demonstrating that polymorphism is a result of coevolution would require additional analyses. One way would be to test if there is NFDS. Specifically, balancing selection by antagonistic coevolution requires indirect NFDS, i.e., the fitness of a host allele should be negatively correlated with the frequency of a pathogen allele. This could be tested by following 1 or more populations over time. Advances in the analysis of ancient DNA from both mammals and pathogens should make this possible even for humans and other species with long generation times [52]. Nevertheless, identifying the genes involved in G×G, as described in this review, would be a critical first step.
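One simple way to picture such a test of indirect NFDS is sketched below. It is an illustration under strong assumptions, not an established method: host and pathogen allele frequencies are assumed to be known without error at consecutive time points, the per-interval change in the logit of the host allele frequency is used as a crude proxy for the fitness of that allele, and the time series are entirely hypothetical. Under indirect NFDS, this proxy should be negatively correlated with the frequency of the pathogen allele that interacts with the host allele.

```python
import math


def logit(x):
    return math.log(x / (1 - x))


def pearson(xs, ys):
    """Pearson correlation coefficient, written out to keep the sketch dependency-free."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def indirect_nfds_correlation(host_freqs, pathogen_freqs):
    """Correlate pathogen allele frequency with the host allele's fitness proxy.

    host_freqs has one more entry than pathogen_freqs: pathogen_freqs[i] is the
    frequency at the start of the interval from host_freqs[i] to host_freqs[i + 1].
    """
    fitness_proxy = [logit(b) - logit(a) for a, b in zip(host_freqs, host_freqs[1:])]
    return pearson(pathogen_freqs, fitness_proxy)


if __name__ == "__main__":
    # Hypothetical time series, invented for illustration only.
    host = [0.50, 0.55, 0.57, 0.55, 0.49, 0.45, 0.45, 0.49]
    pathogen = [0.30, 0.42, 0.58, 0.72, 0.65, 0.50, 0.35]
    print("correlation:", round(indirect_nfds_correlation(host, pathogen), 3))
```

With real data, sampling error, genetic drift, and temporal autocorrelation would all need to be accounted for, so a negative correlation from a naive analysis like this one would be suggestive at best.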

Acknowledgments

I thank Charlie Cornwallis and Maria Strandh for comments on the manuscript and Rasmus Lykke Marvig for providing data for Fig 2B.

References

  1. Karlsson EK, Kwiatkowski DP, Sabeti PC. Natural selection and infectious disease in human populations. Nat Rev Genet. 2014;15:379–393. pmid:24776769
  2. Quintana-Murci L, Clark AG. Population genetic tools for dissecting innate immunity in humans. Nat Rev Immunol. 2013;13:280–293. pmid:23470320
  3. Ebert D, Fields PD. Host–parasite co-evolution and its genomic signature. Nat Rev Genet. 2020;21:754–768. pmid:32860017
  4. Woolhouse MEJ, Webster JP, Domingo E, Charlesworth B, Levin BR. Biological and biomedical implications of the co-evolution of pathogens and their hosts. Nat Genet. 2002;32:569–577. pmid:12457190
  5. Bitarello D, De Filippo C, Kleinert P, Meyer D, Andrés AM. Signatures of long-term balancing selection in human genomes. Genome Biol Evol. 2018;10:939–955. pmid:29608730
  6. Leffler EM, Gao Z, Pfeifer S, Ségurel L, Auton A, Venn O, et al. Multiple instances of ancient balancing selection shared between humans and chimpanzees. Science. 2013;339:1578–1582. pmid:23413192
  7. Koenig D, Jörg H, Li R, Bemm F, Slotte T, Neuffer B, et al. Long-term balancing selection drives evolution of immunity genes in Capsella. Elife. 2019;8:e43606. pmid:30806624
  8. Decaestecker E, Gaba S, Raeymaekers JAM, Stoks R, Van Kerckhoven L, Ebert D, et al. Host-parasite “Red Queen” dynamics archived in pond sediment. Nature. 2007;450:870–873. pmid:18004303
  9. Thrall PH, Laine A-L, Ravensdale M, Nemri A, Dodds PN, Barrett LG, et al. Rapid genetic change underpins antagonistic coevolution in a natural host-pathogen metapopulation. Ecol Lett. 2012;15:425–435. pmid:22372578
  10. Tellier A, Brown JK, Boots M, John S. Theory of Host–Parasite Coevolution: From Ecology to Genomics. eLS. 2021;2:1–10.
  11. Dybdahl MF, Jenkins CE, Nuismer SL. Identifying the molecular basis of host-parasite coevolution: merging models and mechanisms. Am Nat. 2014;184:1–13. pmid:24921596
  12. Agrawal A, Lively CM. Infection genetics: gene-for-gene versus matching-alleles models and all points in between. Evol Ecol Res. 2002;4:79–90.
  13. Carius HJ, Little TJ, Ebert D. Genetic variation in a host-parasite association: potential for coevolution and frequency-dependent selection. Evolution. 2001;55:1136–1145. pmid:11475049
  14. Salvaudon L, Héraudet V, Shykoff JA. Genotype-specific interactions and the trade-off between host and parasite fitness. BMC Evol Biol. 2007;7:189. pmid:17919316
  15. Råberg L, Clough D, Hagström Å, Scherman K, Andersson M, Drews A, et al. MHC class II genotype-by-pathogen genotype interaction for infection prevalence in a natural rodent-Borrelia system. Evolution. 2022;76:2067–2075. pmid:35909235
  16. Fellay J, Pedergnana V. Exploring the interactions between the human and viral genomes. Hum Genet. 2020;139:777–781. pmid:31729546
  17. Bartha I, Carlson JM, Brumme CJ, McLaren PJ, Brumme ZL, John M, et al. A genome-to-genome analysis of associations between human genetic variation, HIV-1 sequence diversity, and viral control. Elife. 2013;2:e01123. pmid:24171102
  18. Ansari MA, Pedergnana V, Ip CLC, Magri A, Von Delft A, Bonsall D, et al. Genome-to-genome analysis highlights the effect of the human innate and adaptive immune systems on the hepatitis C virus. Nat Genet. 2017;49:666–673. pmid:28394351
  19. Rüeger S, Hammer C, Loetscher A, McLaren PJ, Lawless D, Naret O, et al. The influence of human genetic variation on Epstein–Barr virus sequence diversity. Sci Rep. 2021;11:4586. pmid:33633271
  20. 20. Lees JA, Ferwerda B, Kremer PHC, Wheeler NE, Serón MV, Croucher NJ, et al. Joint sequencing of human and pathogen genomes reveals the genetics of pneumococcal meningitis. Nat Commun. 2019;10:2176. pmid:31092817
  21. 21. McHenry ML, Wampande EM, Joloba ML, Malone LL, Mayanja-Kizza H, Bush WS, et al. Interaction between M. tuberculosis lineage and human genetic variants reveals novel pathway associations with severity of TB. Pathogens. 2021;10:1487. pmid:34832643
  22. 22. Nelson CL, Pelak K, Podgoreanu MV, Ahn SH, Scott WK, Allen AS, et al. A genome-wide association study of variants associated with acquisition of Staphylococcus aureus bacteremia in a healthcare setting. BMC Infect Dis. 2014;14:83. pmid:24524581
  23. 23. Band G, Leffler EM, Jallow M, Sisay-Joof F, Ndila CM, Macharia AW, et al. Malaria protection due to sickle haemoglobin depends on parasite genotype. Nature. 2021;602:106–111. pmid:34883497
  24. 24. Caws M, Thwaites G, Dunstan S, Hawn TR, Lan NTN, Thuong NTT, et al. The influence of host and bacterial genotype on the development of disseminated disease with Mycobacterium tuberculosis. PLoS Pathog. 2008;4:e1000034. pmid:18369480
  25. 25. Nordgren J, Svensson L. Genetic Susceptibility to Human Norovirus Infection: An Update. Viruses. 2019;11:226. pmid:30845670
  26. 26. Toyo-oka L, Mahasirimongkol S, Yanai H, Mushiroda T, Wattanapokayakit S, Wichukchinda N, et al. Strain-based HLA association analysis identified HLA-DRB1*09:01 associated with modern strain tuberculosis. HLA Immune Response Genet. 2017;90:149–156. pmid:28612994
  27. 27. Cooling L. Blood Groups in Infection and Host Susceptibility. Clin Microbiol Rev. 2015;28:801–870. pmid:26085552
  28. 28. Apple RJ, Erlich HA, Klitz W, Manos MM, Becker TM, Wheeler CM. HLA DR-DQ associations with papillomavirus-type specificity. Nat Genet. 1994;6:157–162.
  29. 29. Aspholm-Hurtig M, Dailide G, Lahmann M, Kalia A, Ilver D, Roche N, et al. Functional adaptation of BabA the H. pylori ABO blood group antigen binding adhesin. Science. 2004;305:519–522. pmid:15273394
  30. 30. Naret O, Chaturvedi N, Bartha I, Hammer C, Fellay J. Correcting for Population Stratification Reduces False Positive and False Negative Results in Joint Analyses of Host and Pathogen Genomes. Front Genet. 2018:9. pmid:30105048
  31. 31. Brumme ZL, Brumme CJ, Heckerman D, Korber BT, Daniels M, Carlson J, et al. Evidence of Differential HLA Class I-Mediated Viral Evolution in Functional and Accessory/Regulatory Genes of HIV-1. PLoS Pathog. 2007;3:e94. pmid:17616974
  32. 32. Gabrielaite M, Bennedbæk M, Zucco AG, Ekenberg C, Murray DD, Kan VL, et al. Human Immunotypes Impose Selection on Viral Genotypes Through Viral Epitope Specificity. J Infect Dis. 2021;224:2053–2063. pmid:33974707
  33. 33. Timm J, Li B, Daniels MG, Bhattacharya T, Reyor LL, Allgaier R, et al. Human leukocyte antigen-associated sequence polymorphisms in hepatitis C virus reveal reproducible immune responses and constraints on viral evolution. Hepatology. 2007;46:339–349. pmid:17559151
  34. 34. Alter G, Heckerman D, Schneidewind A, Fadda L, Kadie CM, Carlson JM, et al. HIV-1 adaptation to NK-cell-mediated immune pressure. Nature. 2011;476:96–100. pmid:21814282
  35. 35. McHenry ML, Bartlett J, Igo RP, Wampande EM, Benchek P, Mayanja-Kizza H, et al. Interaction between host genes and Mycobacterium tuberculosis lineage can affect tuberculosis severity: Evidence for coevolution? PLoS Genet. 2020;16:e1008728. pmid:32352966
  36. 36. Littera R, Zamboni F, Tondolo V, Fantola G, Chessa L, Orrù N, et al. Absence of activating killer immunoglobulin-like receptor genes combined with hepatitis C viral genotype is predictive of hepatocellular carcinoma. Hum Immunol. 2013;74:1288–1294. pmid:23756163
  37. 37. Liu YC, Chen Z, Neller MA, Miles JJ, Purcell AW, McCluskey J, et al. A Molecular Basis for the Interplay between T Cells, Viral Mutants, and Human Leukocyte Antigen Micropolymorphism. J Biol Chem. 2014;289:16688–16698. pmid:24759101
  38. 38. Buniello A, Macarthur JAL, Cerezo M, Harris LW, Hayhurst J, Malangone C, et al. The NHGRI-EBI GWAS Catalog of published genome-wide association studies, targeted arrays and summary statistics 2019. Nucleic Acids Res. 2019;47:D1005–D1012. pmid:30445434
  39. 39. Machiela MJ, Chanock SJ. LDlink: a web-based application for exploring population-specific haplotype structure and linking correlated alleles of possible functional variants. Bioinformatics. 2015;31:3555–3557. pmid:26139635
  40. 40. Karnes JH, Bastarache L, Shaffer CM, Gaudieri S, Xu Y, Glazer AM, et al. Phenome-wide scanning identifies multiple diseases and disease severity phenotypes associated with HLA variants. Sci Transl Med. 2017;9:eaai8708. pmid:28490672
  41. 41. McGovern DPB, Jones MR, Taylor KD, Marciante K, Yan X, Dubinsky M, et al. Fucosyltransferase 2 (FUT2) non-secretor status is associated with Crohn’s disease. Hum Mol Genet. 2010;19:3468–3476. pmid:20570966
  42. 42. Rausch P, Rehman A, Künzel S, Häsler R, Ott SJ, Schreiber S, et al. Colonic mucosa-associated microbiota is influenced by an interaction of crohn disease and FUT2 (Secretor) genotype. Proc Natl Acad Sci U S A. 2011;108:19030–19035. pmid:22068912
  43. 43. Moesta AK, Parham P. Diverse functionality among human NK cell receptors for the C1 epitope of HLA-C: KIR2DS2, KIR2DL2, and KIR2DL3. Front Immunol. 2012;3:1–13. pmid:23189078
  44. 44. Blackwell JM, Goswami T, Evans CAW, Sibthorpe D, Papo N, White JK, et al. SLC11A1 (formerly NRAMP1) and disease resistance. Cell Microbiol. 2001;3:773–784. pmid:11736990
  45. 45. Fijarczyk A, Babik W. Detecting balancing selection in genomes: Limits and prospects. Mol Ecol. 2015;24:3529–3545. pmid:25943689
  46. 46. Ségurel L, Thompson EE, Flutre T, Lovstad J, Venkat A, Margulis SW. The ABO blood group is a trans-species polymorphism in primates. Proc Natl Acad Sci U S A. 2012;109:4–11. pmid:23091028
  47. 47. Hanchard NA, Rockett KA, Spencer C, Coop G, Pinder M, Jallow M, et al. Screening for recently selected alleles by analysis of human haplotype similarity. Am J Hum Genet. 2006;78:153–159. pmid:16385459
  48. 48. Frank SA. Specificity versus detectable polymorphism in host-parasite genetics. Proc R Soc London B. 1993;254:191–197. pmid:8108452
  49. 49. Aidoo M, Terlouw DJ, Kolczak MS, McElroy PD, Ter Kuile FO, Kariuki S, et al. Protective effects of the sickle cell gene against malaria morbidity and mortality. Lancet. 2002;359:1311–1312. pmid:11965279
  50. 50. Wang M, Roux F, Bartoli C, Huard-Chauveau C, Meyer C, Lee H, et al. Two-way mixed-effects methods for joint association analysis using both host and pathogen genomes. Proc Natl Acad Sci U S A. 2018;115:E5440–E5449. pmid:29848634
  51. 51. MacPherson A, Otto SP, Nuismer SL. Keeping pace with the Red Queen: Identifying the genetic basis of susceptibility to infectious disease. Genetics. 2018;208:779–789. pmid:29223971
  52. 52. Kerner G, Patin E, Quintana-Murci L. New insights into human immunity from ancient genomics. Curr Opin Immunol. 2021;72:116–125. pmid:33992907
  53. 53. Gilbert SC, Plebanski M, Gupta S, Morris J, Cox M, Aidoo M, et al. Association of malaria parasite population structure, HLA, and immunological antagonism. Science. 1998;279:1173–1177. pmid:9469800
  54. 54. DeGiorgio M, Lohmueller KE, Nielsen R. A model-based approach for identifying signatures of ancient balancing selection in genetic data. PLoS Genet. 2014;10:e1004561. pmid:25144706
  55. 55. Ntoumi F, Rogier C, Dieye A, Trape J-F, Millet P, Mercereau-Puijalon O. Imbalanced distribution of Plasmodium falciparum MSP-1 genotypes related to sickle-cell trait. Mol Med. 1997;3:581–592. pmid:9323709
  56. 56. Ashley-Koch A, Yang Q, Olney RS. Sickle Hemoglobin (HbS) allele and sickle cell disease: a HuGe review. Am J Epidemiol. 2000;151:839–845. pmid:10791557
  57. 57. Salie M, Van Der Merwe L, Möller M, Daya M, Van Der Spuy GD, Van Helden PD, et al. Associations between human leukocyte antigen class i variants and the mycobacterium tuberculosis subtypes causing disease. J Infect Dis. 2014;209:216–223. pmid:23945374
  58. 58. Ogarkov O, Mokrousov I, Sinkov V, Zhdanova S, Antipina S, Savilov E. “Lethal” combination of Mycobacterium tuberculosis Beijing genotype and human CD209 -336G allele in Russian male population. Infect Genet Evol. 2012;12:732–736. pmid:22027159
  59. 59. van Crevel R, Parwati I, Sahiratmadja E, Marzuki S, Ottenhoff THM, Netea MG, et al. Infection withMycobacterium tuberculosisBeijing Genotype Strains Is Associated with Polymorphisms inSLC11A1/NRAMP1in Indonesian Patients with Tuberculosis. J Infect Dis. 2009;200:1671–1674. pmid:19863441
  60. 60. Barnes I, Duda A, Pybus OG, Thomas MG. Ancient urbanization predicts genetic resistance to tuberculosis. Evolution. 2011;65:842–848. pmid:20840594
  61. 61. Rowe JA, Handel IG, Thera MA, Deans AM, Lyke KE, Koné A, et al. Blood group O protects against severe Plasmodium falciparum malaria through the mechanism of reduced rosetting. Proc Natl Acad Sci U S A. 2007;104:17471–17476. pmid:17959777
  62. 62. Gendzekhadze K, Norman PJ, Abi-Rached L, Graef T, Moesta AK, Layrisse Z, et al. Co-evolution of KIR2DL3 with HLA-C in a human population retaining minimal essential diversity of KIR and HLA class I ligands. Proc Natl Acad Sci U S A. 2009;106:18692–18697. pmid:19837691
  63. 63. Ansari MA, Aranday-Cortes E, Ip CL, Da Silva FA, Lau SH, Bamford C, et al. Interferon lambda 4 impacts the genetic diversity of hepatitis c virus. Elife. 2019;8:1–23. pmid:31478835
  64. 64. Key FM, Peter B, Dennis MY, Huerta-Sánchez E, Tang W, Prokunina-Olsson L, et al. Selection on a Variant Associated with Improved Viral Clearance Drives Local, Adaptive Pseudogenization of Interferon Lambda 4 (IFNL4). PLoS Genet. 2014;10:e1004681. pmid:25329461
  65. 65. Andrés AM, Hubisz MJ, Indap A, Torgerson DG, Degenhardt JD, Boyko AR, et al. Targets of balancing selection in the human genome. Mol Biol Evol. 2009;26:2755–2764. pmid:19713326
  66. 66. Ferrer-Admetlla A, Sikora M, Laayouni H, Esteve A, Roubinet F, Blancher A, et al. A natural history of FUT2 polymorphism in humans. Mol Biol Evol. 2009;26:1993–2003. pmid:19487333
  67. 67. Siewert KM, Voight BF. Detecting long-term balancing selection using allele frequency correlation. Mol Biol Evol. 2017;34:2996–3005. pmid:28981714
  68. 68. Zehbe I, Tachezy R, Mytilineos J, Voglino G, Mikyškova I, Delius H, et al. Human papillomavirus 16 E6 polymorphisms in cervical lesions from different European populations and their correlation with human leukocyte antigen class II haplotypes. Int J Cancer. 2001;94:711–716. pmid:11745467
  69. 69. Beskow AH, Engelmark MT, Magnusson JJ, Gyllensten UB. Interaction of host and viral risk factors for development of cervical carcinoma in situ. Int J Cancer. 2005;117:690–692. pmid:15929080