Dissecting the Genetic Architecture of Host–Pathogen Specificity

In this essay, I argue that unraveling the full genetic architecture (i.e., the number, position, effect, and interactions among genes underlying phenotypic variation) and molecular landscape of host-pathogen interactions can only be achieved by accounting for their genetic specificity. Indeed, the outcome of host-pathogen interactions often depends on the specific pairing of host and pathogen genotypes [1]. In such cases, the infection phenotype does not merely result from additive effects of host and pathogen genotypes, but also from a specific interaction between the two genomes (Box 1). This specific component, which can be measured by the interaction term in a two-way statistical analysis of phenotypic variation as a function of host and pathogen genotypes, is referred to as a genotype-by-genotype (G×G) interaction [1]. By analogy to genotype-by-environment (G×E) interactions that occur when different genotypes respond differently to environmental change, G×G interactions occur when the response of host genotypes differs across pathogen genotypes. Although the concept of G×G interactions has mostly been used by evolutionary ecologists to describe the specificity of host immune defenses against pathogens [2], it can be applied to any phenotype resulting from the specific interaction between two genomes. The general definition of G×G interactions allows its use to characterize phenotypes ranging from macroscopic traits such as lifespan [3] to the level of gene expression [4]. Here, the genetic specificity of host-pathogen associations is defined in the sense of G×G interactions. This definition differs from that of immunological specificity, which is the ability of a host to recognize and mount an immune response against a particular pathogen genotype or antigen. Whereas immunological specificity often depends on infection history (i.e., past exposure to a pathogen), genetic specificity describes the intrinsic compatibility between host and pathogen genotypes and occurs independently of infection history.
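As a minimal sketch of how the interaction term of such a two-way analysis isolates the specific component, the Python snippet below splits a table of mean infection phenotypes (rows are host genotypes, columns are pathogen genotypes) into a grand mean, host main effects, pathogen main effects, and residual G×G deviations. The function name `gxg_effects` and both data tables are hypothetical and chosen only for illustration; testing whether the interaction is statistically significant would additionally require replicated measurements within each host-pathogen combination.

```python
import numpy as np

def gxg_effects(cell_means):
    """Split a host x pathogen table of mean phenotypes into
    grand mean, host effects, pathogen effects, and G x G deviations."""
    y = np.asarray(cell_means, dtype=float)
    grand = y.mean()
    host = y.mean(axis=1) - grand        # one main effect per host genotype (row)
    pathogen = y.mean(axis=0) - grand    # one main effect per pathogen genotype (column)
    # Whatever the two main effects cannot explain is the specific G x G component.
    interaction = y - grand - host[:, None] - pathogen[None, :]
    return grand, host, pathogen, interaction

# Hypothetical infection scores (illustrative values, not real data).
additive = [[0.2, 0.4],
            [0.5, 0.7]]   # host and pathogen effects simply add up
crossing = [[0.2, 0.7],
            [0.7, 0.2]]   # the outcome depends on the specific pairing

_, _, _, inter_additive = gxg_effects(additive)
_, host_crossing, pathogen_crossing, inter_crossing = gxg_effects(crossing)

print("G x G deviations, additive table:\n", inter_additive)
print("G x G deviations, crossing table:\n", inter_crossing)
```

In the second table, neither the host nor the pathogen genotype has any main effect, so an analysis that varies only one of the two organisms would see no consistent genetic effect even though the phenotype is entirely determined by the genotype pairing.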
In some instances, the specificity of host-pathogen associations can be explained to a large extent by major genes of hosts and pathogens, as in the gene-for-gene model of plant-pathogen compatibility [5,6]. In general, however, multiple genes and epistatic interactions among these genes determine the infection outcome [7–9]. A recent meta-analysis of 500 published studies reporting quantitative trait loci (QTL) for host resistance to pathogens in plants and animals revealed that the genetic architecture of this trait varies dramatically across different combinations of host and pathogen genotypes [9]. Thus, different host-pathogen associations involve different QTL and epistatic interactions, indicating that a substantial portion of phenotypic variation derives from the specific interaction between the two genomes. This is made even more complex when multiple pathogen species or strains infect the same host [10] and/or when G×G interactions are environment-dependent [11,12].
It is striking that, to date, quantitative genetic studies of host-pathogen systems have neglected the specific component of the interaction. Dissecting the genetic architecture of complex infection traits has traditionally relied on QTL mapping strategies [7,9] and more recently on association analyses of candidate gene polymorphisms [8]. A major caveat of these QTL mapping and association studies is that they focus on either the host or the pathogen genome. Because they consider variation in only one of the two interacting organisms, these studies ignore specific host genome by pathogen genome interactions. In order to fully dissect the genetic architecture and explore the molecular landscape of host-pathogen interactions, it will be necessary to account for the specific component of the relationship. This should be made possible by recent developments in molecular strategies combining host and pathogen genetics [13–15] and in quantitative genetic models of host-pathogen interactions allowing detection of host QTL by pathogen QTL interactions [16,17]. Advantage could also be taken of existing methods for analysis of gene-gene and gene-environment interactions [18–21]. A critical (and limiting) aspect of investigating genetic specificity is the need to include different combinations of host and pathogen genotypes in the experimental design.
From a fundamental standpoint, improved knowledge of the genetic architecture of host-pathogen specificity has important implications for our understanding of the ecology and evolution of host-pathogen associations. The genetic specificity of host-pathogen interactions is thought to promote the maintenance of host and pathogen genetic diversity via frequency-dependent coevolutionary cycles [22–25], which in turn favor higher rates of mutation, recombination, and sexual reproduction [26]. Unraveling the genetic architecture and molecular landscape of host-pathogen specificity, combined with molecular evolution analyses, will shed light on the mechanistic basis of the infection process and the biochemistry of host-pathogen recognition [27–30]. The genetic model and precise epistatic interactions underlying host-pathogen specificity are critical determinants of coevolutionary dynamics and the evolution and maintenance of sex and recombination [27,31]. In conjunction with gene flow and genetic drift, the genetic basis of specificity can also influence the spatial structure and local adaptation of host and pathogen populations [32].
From a more applied perspective, exploring the genetic basis of host-pathogen specificity will provide important insights into the mechanisms of disease emergence. Pathogens with a broad host range (i.e., a low degree of host specificity) are those most likely to emerge or re-emerge following ecological

Box 1. A Quantitative Genetic Model of Host-Pathogen Interactions
Quantitative genetics is the area of genetics dealing with the inheritance of traits showing continuous phenotypic variation [35]. Typically, quantitative phenotypes are modeled as the result of combined effects of the genes (G) and the environment (E). The basic model to describe the phenotype of an individual is:

$$y = m + g + e \qquad \text{(Equation 1.1)}$$

where $y$ is the phenotypic value of the individual, $m$ is the mean value of the population, $g$ is the genetic contribution to the deviation from the mean (usually termed "genotypic value"), and $e$ is the environmental (non-genetic) deviation. By extending this model to a quantitative trait resulting from the interaction between a host and a pathogen, the model becomes:

$$y = m + g_H + g_P + g_{HP} + e \qquad \text{(Equation 1.2)}$$

where $g_H$ is the host genotypic value, $g_P$ is the pathogen genotypic value, and $g_{HP}$ is the genotypic value due to the specific G×G interaction. This simple model ignores interactions between genes and environment (G×E and G×G×E effects), which occur when genotypic values vary across environments. The genetic component of phenotypic variance in a host-pathogen interaction can thus be partitioned into three distinct terms: variance due to the additive effect of the host genotype, variance due to the additive effect of the pathogen genotype, and variance due to the specific interaction between the two genomes. Whereas the first two terms can be characterized by considering either the host or the pathogen genetic variation alone, exploring the genetic basis of host-pathogen specificity requires that genetic variations in both the host and the pathogen are considered simultaneously.
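The variance partition described above can be written out explicitly. The expression below is a sketch that rests on an assumption not stated in the text, namely that the host, pathogen, interaction, and environmental deviations of Equation 1.2 are mutually uncorrelated; the symbols $\sigma^2_H$, $\sigma^2_P$, $\sigma^2_{HP}$, and $\sigma^2_e$ are introduced here for illustration only.

```latex
% Assumption: g_H, g_P, g_{HP} and e are mutually uncorrelated.
\sigma^2_y = \underbrace{\sigma^2_H + \sigma^2_P + \sigma^2_{HP}}_{\text{genetic variance}} + \sigma^2_e ,
\qquad
\text{share due to specificity} = \frac{\sigma^2_{HP}}{\sigma^2_H + \sigma^2_P + \sigma^2_{HP}}
```

Under this assumption, a study that varies only host genotypes can estimate $\sigma^2_H$ and one that varies only pathogen genotypes can estimate $\sigma^2_P$, but $\sigma^2_{HP}$ is estimable only from a design that crosses both.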
In the case of a trait determined by two haploid loci $i$ and $j$ of a single organism, we can define $a_i$ as the additive effect of locus $i$, $a_j$ as the additive effect of locus $j$, and $b_{ij}$ as the interaction effect between loci $i$ and $j$ to decompose the genotypic value into:

$$g = a_i + a_j + b_{ij}$$

Summing additive effects ($\Sigma a$) and interaction effects ($\Sigma b$) over loci, the phenotype can be written as:

$$y = m + \Sigma a + \Sigma b + e \qquad \text{(Equation 1.3)}$$

Decomposing the host and pathogen genotypic values of Equation 1.2 in the same way gives:

$$y = m + (\Sigma a_H + \Sigma b_H) + (\Sigma a_P + \Sigma b_P) + g_{HP} + e \qquad \text{(Equation 1.4)}$$

By using the notations $\Sigma a_{HP} = \Sigma a_H + \Sigma a_P$ (sum of additive effects of host and pathogen loci) and $\Sigma b_{HP} = \Sigma b_H + \Sigma b_P + g_{HP}$ (sum of interaction effects between host loci, between pathogen loci, and specific G×G interactions between host and pathogen loci), the equation becomes:

$$y = m + \Sigma a_{HP} + \Sigma b_{HP} + e \qquad \text{(Equation 1.5)}$$

The striking similarity between Equations 1.5 and 1.3 illustrates how the phenotype of a host-pathogen interaction can simply be modeled as that of a third organism that combines both genomes. In such a model, the specific G×G interaction is included among all interaction terms, supporting the view that considering specificity in the genetic architecture of host-pathogen interactions is as important as including intra-genome epistasis. Like epistasis [36,37], host-pathogen specificity may thus largely contribute to the unexplained genetic variation in susceptibility to infectious diseases missed by conventional QTL mapping strategies or genome-wide association studies [38,39].
changes [33]. Disease emergence can also result from pathogen adaptation to a novel host species or population, which largely depends on the initial compatibility between host and pathogen genotypes [34]. Characterizing the genetic and molecular basis underlying host-pathogen specificity thus holds considerable promise for understanding, predicting, and preventing disease emergence. It will help to identify host species and populations most at risk for emergence of a given pathogen and uncover new molecular targets to interfere with the ability of emerging pathogens to jump from one host to another.
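The "third organism" view of Box 1 can be made concrete with a small Python sketch. It treats one haploid host locus and one haploid pathogen locus as two loci of a single combined genome and solves for the mean, the two additive effects, and the cross-genome interaction effect from the four genotype-combination means. The -1/+1 allele coding, the function name `fit_two_locus`, and the phenotype values are all assumptions made for illustration; this is not the estimation procedure of any cited study.

```python
import numpy as np

def fit_two_locus(mean_by_genotype):
    """Solve y = m + a_H*x_H + a_P*x_P + b_HP*x_H*x_P exactly
    from the four (host allele, pathogen allele) phenotype means."""
    combos = [(-1, -1), (-1, 1), (1, -1), (1, 1)]   # assumed -1/+1 allele coding
    # Design matrix columns: intercept, host locus, pathogen locus, host x pathogen.
    X = np.array([[1, xh, xp, xh * xp] for xh, xp in combos], dtype=float)
    y = np.array([mean_by_genotype[c] for c in combos], dtype=float)
    m, a_h, a_p, b_hp = np.linalg.solve(X, y)
    return m, a_h, a_p, b_hp

# Hypothetical infection success: high only when host and pathogen alleles "match".
means = {(-1, -1): 0.9, (-1, 1): 0.1, (1, -1): 0.1, (1, 1): 0.9}

m, a_h, a_p, b_hp = fit_two_locus(means)
print(f"m={m:.2f}  a_H={a_h:.2f}  a_P={a_p:.2f}  b_HP={b_hp:.2f}")
```

With these illustrative values, both additive effects are zero and all genetic variation sits in the interaction term, which is the situation in which mapping either genome alone would detect nothing. The interaction column is handled exactly like an epistatic term between two loci of one organism, mirroring the parallel between Equations 1.3 and 1.5.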