Understanding Human Variation in Infectious Disease Susceptibility through Clinical and Cellular GWAS

Over the last ten years, advances in genotyping and high-throughput sequencing technologies have resulted in an explosion of genetic information. Whereas prior attempts at discovering human genetic differences affecting susceptibility to disease relied on genotyping one or a handful of candidate genetic variants, genome-wide association studies (GWAS) have now become a common means of searching for susceptibility genes in an unbiased way. These studies have highlighted the relevance of particular pathways in pathogenesis of infectious and autoimmune disease. Thus, GWAS of clinical phenotypes can alert host-pathogen researchers to unexpected links between their pathway of study and human disease. Complementary to this, cellular GWAS using pathogens as probes can reveal how genetic variation affects cellular processes important for disease pathogenesis.

What Can GWAS Do for Cellular Microbiology? Identification of Genes and Pathways Important in Autoimmune and Infectious Disease Pathogenesis
In GWAS, controls and cases with a disease are genotyped at hundreds of thousands to millions of loci and the genotype frequencies are compared to identify alleles that may be protective or result in increased susceptibility [1]. Prior to the advent of GWAS, a handful of examples demonstrated that common genetic variation could have profound effects on infectious disease susceptibility. The sickle cell allele of hemoglobin protecting against malaria [2] and the CCR5 deletion allele protecting against HIV infection [3] are textbook examples. GWAS provide a way to systematically search for such genetic differences.
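The case-control comparison described above can be sketched for a single variant. This is a minimal illustration under stated assumptions, not the analysis pipeline of any study cited here: the function name `allelic_association`, the allele counts, and the choice of a simple 2x2 allelic chi-square test are all hypothetical. Published GWAS typically fit regression models with covariates and apply a stringent multiple-testing threshold; 5e-8 is the conventional genome-wide significance level.

```python
import math

# Conventional genome-wide significance threshold.
GENOME_WIDE_ALPHA = 5e-8


def allelic_association(case_alt, case_ref, ctrl_alt, ctrl_ref):
    """Compare allele counts between cases and controls at one variant.

    Returns (odds_ratio, chi2, p_value) for a 2x2 allelic test with
    1 degree of freedom and no continuity correction.
    """
    table = [[case_alt, case_ref], [ctrl_alt, ctrl_ref]]
    total = case_alt + case_ref + ctrl_alt + ctrl_ref
    row_sums = [sum(row) for row in table]
    col_sums = [case_alt + ctrl_alt, case_ref + ctrl_ref]

    # Pearson chi-square: sum of (observed - expected)^2 / expected.
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_sums[i] * col_sums[j] / total
            chi2 += (table[i][j] - expected) ** 2 / expected

    # Odds of carrying the alternate allele in cases relative to controls.
    odds_ratio = (case_alt * ctrl_ref) / (case_ref * ctrl_alt)

    # Upper-tail probability of a chi-square variable with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return odds_ratio, chi2, p_value


if __name__ == "__main__":
    # Hypothetical allele counts: 500 cases and 500 controls, 2 alleles each.
    odds_ratio, chi2, p_value = allelic_association(300, 700, 200, 800)
    print(f"OR={odds_ratio:.2f} chi2={chi2:.2f} p={p_value:.2e}")
    print("genome-wide significant:", p_value < GENOME_WIDE_ALPHA)
```

An odds ratio above 1 would suggest the alternate allele is enriched in cases (increased susceptibility), and a value below 1 would suggest protection. In a real study the same test is repeated at every genotyped locus, which is why the significance threshold is so strict.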
GWAS of autoimmune diseases have been particularly successful. For inflammatory bowel disease (IBD), a GWAS of 75,000 people revealed 163 loci that can account for approximately 15% of the total disease variance of Crohn's disease [4]. The genomic regions implicated by these loci include genes showing a striking enrichment for immune-related gene ontology terms, including regulation of cytokine production and activation of lymphocyte signaling [4]. The causal variants, which nearby genes are affected, and how the affected genes alter pathophysiology are unknown for most of these loci. However, a few successful exceptions should spur researchers to mine this list of disease-relevant genetic differences. A frameshift mutation in NOD2 that results in a truncated protein was identified as a Crohn's disease susceptibility allele [5,6] prior to the GWAS era through linkage followed by candidate gene studies. NOD2 is a member of the NLR family of intracellular sensors that responds to both bacterial (muramyl dipeptide; MDP) and viral (ssRNA) patterns [7,8]. Mice with the NOD2 mutation had increased intestinal inflammation in the dextran sodium sulphate (DSS) model of colitis, and macrophages from the mice exhibited increased NF-κB signaling and IL-1β secretion in response to MDP [9]. The first reported GWAS hit for IBD was a non-synonymous mutation in an autophagy gene, ATG16L1 [10]. Prior to this finding, autophagy had not been known to play a role in Crohn's disease susceptibility, and this discovery prompted further research in the interplay of autophagy, infection, and immunity. Recent cellular studies have linked these two susceptibility genes. NOD2 recruits ATG16L1 to the plasma membrane to cause autophagy of invasive bacteria [11], while a separate study showed that NOD2 activation by MDP enhances ATG16L1-mediated autophagy to increase antigen presentation in dendritic cells [12].
These studies of IBD demonstrate how careful and extensive follow-up of GWAS hits can be transformative to the understanding of pathophysiology. Researchers can determine whether their gene of interest has been implicated in GWAS by searching the NHGRI GWAS catalog [13] or the GWASdb website, which manually curates more hits and provides an easy-to-navigate browser [14].
Two successful examples from infectious disease GWAS further show how GWAS can inform our understanding of disease and even lead to changes in clinical practice. GWAS of leprosy have revealed eight loci affecting susceptibility [15,16]. Overlap in leprosy and IBD-associated GWAS variants has clearly demonstrated the shared genetic underpinnings controlling susceptibility to infectious and autoimmune disease. Five of the eight loci associated with leprosy are also associated with Crohn's disease [4]. For example, an SNP upstream of NOD2 protects against IBD but leads to increased risk of leprosy [4]. Thus, there are phenotypic trade-offs in genetic variation that may only be revealed under certain environmental situations.
GWAS of treatment response to hepatitis C virus (HCV) infection has prompted changes in clinical practice. Genetic variation at the IL28B locus (encoding interferon lambda-3) has been strongly associated with sustained virological response to pegylated interferon alpha plus ribavirin in patients with chronic HCV infection, with carriers of the beneficial genotype having 2- to 3-fold greater odds of eradicating the virus [17]. This effect appears to be particularly strong in patients with the G1 viral genotype [18]. Genotyping individuals for this variant has now become common in managing treatment [19]. While the clinical utility of IL28B genetic testing may begin to wane in the era of direct-acting anti-HCV drugs, the biological information gleaned from these studies will have lasting value, both in terms of recognizing the role of interferon lambdas in HCV infection, and perhaps as a novel route of therapeutic management [19].
Funding: DCK received funding from the NIAID Northwest Regional Center of Excellence for Biodefense and Emerging Infectious Diseases Research (U54 AI057141), an NIAID Research Scholar Development Award (K22 AI093595), and from a Duke School of Medicine Whitehead Scholarship. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Competing Interests: The authors have declared that no competing interests exist.
* E-mail: dennis.ko@duke.edu
Despite these successes, there are certainly limitations to GWAS [20,21] and fewer examples of GWAS discoveries in infectious disease than for most common, chronic human diseases. Notably, sample sizes of infectious disease GWAS have been relatively modest compared to other GWAS. Insufficient coverage of variation in African populations on genotyping platforms is also likely partially responsible. Selection of controls is also inherently difficult for infectious disease GWAS, especially for nosocomial and opportunistic infections, for which information on pathogen exposure in uninfected individuals may be limited and patients have varied and multiple comorbidities that can lead to confounding. Perhaps most importantly, infectious disease GWAS have an additional source of genetic variation not present in GWAS of other human diseases: genetic variation in the pathogen. This is illustrated above by the observation that the influence of IL28B genotype on HCV clearance depends strongly on the viral genotype. The host-pathogen arms race results in two moving targets and in some cases tremendous genetic heterogeneity. For example, HIV-1 is incredibly diverse, with thousands of genetically different viruses classified in a complex tree of types, subtypes, sub-subtypes, and recombinant forms [22]. Despite these challenges, loci associated with infectious disease typically have larger effect sizes than loci found in GWAS of noninfectious diseases and thus should be a priority [20]. Infectious disease GWAS will benefit from increased sample size, better coverage of variants in genotyping, and stratified analysis to minimize confounding due to variation in the pathogen.
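The article does not specify how a stratified analysis should be performed. One standard option is the Mantel-Haenszel estimator, which pools per-stratum 2x2 tables into a single odds ratio so that differences between strata (here, pathogen genotypes) do not distort the host association. The sketch below is illustrative only: the function name `mantel_haenszel_or`, the stratum labels, and all counts are hypothetical.

```python
def mantel_haenszel_or(strata):
    """Pooled odds ratio across strata (Mantel-Haenszel estimator).

    Each stratum is a tuple of counts:
    (case_carriers, case_noncarriers, control_carriers, control_noncarriers)
    """
    numerator = 0.0
    denominator = 0.0
    for case_car, case_non, ctrl_car, ctrl_non in strata:
        n = case_car + case_non + ctrl_car + ctrl_non
        numerator += case_car * ctrl_non / n
        denominator += case_non * ctrl_car / n
    return numerator / denominator


if __name__ == "__main__":
    # Hypothetical counts, one table per pathogen genotype.
    strata_by_pathogen_genotype = {
        "genotype_A": (30, 70, 20, 80),
        "genotype_B": (10, 40, 5, 45),
    }
    pooled = mantel_haenszel_or(strata_by_pathogen_genotype.values())
    print(f"Mantel-Haenszel pooled OR = {pooled:.2f}")
```

Comparing the pooled estimate with the odds ratio computed separately within each stratum can also indicate whether the host effect differs by pathogen genotype, as reported for IL28B and HCV.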

What Can Cellular Microbiology Do for GWAS? Cellular GWAS as a New Discovery Tool
While clinical GWAS have been successful at highlighting important pathways in disease pathogenesis, there is clearly a need for additional approaches directed at understanding how specific genetic variants affect disease. How can we more effectively move from lists of SNPs to greater biological insight? One approach is to complement the GWAS of organismal/clinical traits with GWAS of different phenotypic scales (Figure 1). For several years, molecular GWAS of gene expression have identified genes whose level of transcription is associated with nearby genetic variation (cis-eQTLs; expression quantitative trait loci [23,24]). These resources are useful for GWAS researchers in trying to narrow down what genes are affected by functional genetic variants within a genomic region. Researchers can take advantage of online tools such as the eQTL Browser [25] to determine if there is human genetic variation near their gene of interest that may regulate levels of expression. The availability of cell lines with alternative alleles, as well as genome engineering approaches to introduce genetic differences into isogenic backgrounds [26], makes this an exercise that can lead to hypothesis-driven experiments to understand how human variants alter cell biology.

Figure 1. GWAS of varying phenotypic scales. GWAS have primarily been used to characterize disease-related characteristics in patient populations, but new approaches have expanded the phenotypes used in GWAS. "Clinical GWAS" search for associations between genetic differences (primarily in the form of SNPs) and human disease traits such as disease risk, severity of disease, disease progression, and response to treatment. "Molecular GWAS" search for associations between SNPs and molecular phenotypes such as levels of mRNAs, proteins, or metabolites. Finally, "cellular GWAS" connect SNPs to particular cellular processes. Phenotypic variation in these cellular processes can be examined by manipulation either pharmacologically or using pathogens. doi:10.1371/journal.ppat.1003424.g001

The flow of scientific inspiration can also proceed from cellular microbiology to GWAS. Inspired by the way cellular microbiology [27] has led to numerous key discoveries in basic cell biology, cellular GWAS approaches that utilize pathogens as probes can serve to connect cellular processes to human diseases. In a cellular GWAS, cells from hundreds of genotyped individuals are exposed to a stimulus and the varied responses serve as quantitative traits for GWAS. For example, in the platform developed by one of the authors ([28]; Hi-HOST: High throughput Human in vitrO Susceptibility Testing), cellular GWAS was carried out on the phenotype of pyroptosis [29], Salmonella-induced inflammatory cell death. Experimental follow-up of an eQTL near APIP (Apaf-1-interacting protein) led to the discovery that APIP is an enzyme in the methionine salvage pathway and that this metabolic pathway regulates caspase-1 activation [30]. Genotyping data for this SNP in patients with the physiological criteria for sepsis suggested that the APIP allele that results in a more robust caspase-1 response in vitro reduced the odds of death [30]. These findings are now being examined further in an APIP mouse model and in patient populations with Salmonella infections. Pyroptosis is just one consequence of Salmonella infection, and by monitoring multiple cellular phenotypes, we are able to probe human variation in macropinocytosis (by measuring Salmonella invasion), endosomal biology (by measuring intracellular survival and replication), and numerous pro- and anti-inflammatory signaling pathways (by measuring the cytokine response).
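Treating a cellular response as a quantitative trait can be sketched as a regression of the measured phenotype on genotype dosage at one SNP. This is a generic additive-model illustration, not the Hi-HOST analysis itself: the function name `additive_association`, the dosage coding (0, 1, or 2 copies of an allele), and the phenotype values are assumptions made for the example.

```python
def additive_association(dosages, phenotypes):
    """Ordinary least-squares fit of a cellular phenotype on allele dosage.

    dosages:    copies of the tested allele per cell line (0, 1, or 2)
    phenotypes: measured response per cell line (e.g., percent cell death)
    Returns (beta, intercept, r_squared).
    """
    n = len(dosages)
    mean_g = sum(dosages) / n
    mean_y = sum(phenotypes) / n

    # Sums of squares and cross-products around the means.
    sxx = sum((g - mean_g) ** 2 for g in dosages)
    syy = sum((y - mean_y) ** 2 for y in phenotypes)
    sxy = sum((g - mean_g) * (y - mean_y) for g, y in zip(dosages, phenotypes))

    beta = sxy / sxx                    # change in phenotype per allele copy
    intercept = mean_y - beta * mean_g  # predicted phenotype at dosage 0
    r_squared = sxy ** 2 / (sxx * syy)  # fraction of variance explained
    return beta, intercept, r_squared


if __name__ == "__main__":
    # Hypothetical data: six cell lines, two per genotype class.
    dosages = [0, 0, 1, 1, 2, 2]
    percent_cell_death = [10.0, 12.0, 15.0, 17.0, 20.0, 22.0]
    beta, intercept, r_squared = additive_association(dosages, percent_cell_death)
    print(f"beta={beta:.2f} intercept={intercept:.2f} r2={r_squared:.3f}")
```

In a genome-wide setting this fit would be repeated for every variant, with a significance test on `beta` and correction for multiple testing. Covariates and relatedness among cell lines, which a real analysis would need to handle, are omitted here.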
Not only can cellular GWAS approaches help elucidate clinical GWAS of bacterial infections, but unexpected connections between cellular processes and noninfectious diseases may emerge from this approach. Increasing the number and types of stimuli will lead to a large catalog of cellular GWAS that have targeted various cellular processes. We have thus far focused on Salmonella and Yersinia phenotypes ([28,30] and D. Ko, unpublished data), while a similar approach has been used for HIV [31]. Cellular GWAS have also been undertaken for several different drug responses [32,33], and hits from a cellular GWAS of taxol sensitivity showed a statistically significant overlap with a clinical GWAS of peripheral neuropathy induced by this same drug in cancer patients [34]. Cellular GWAS can help to make sense of clinical GWAS associations by revealing what cellular processes may be involved and by providing an experimentally tractable system to allow for hypothesis testing.

Perspective
For the last twenty years, cellular microbiology has provided amazing insights into the physiology of cells [27]. Cellular microbiology is now well poised to contribute to the field of human genetics. Pathogens have clearly been a driving force, and perhaps have even been the main selective pressure, during human evolution [35]. What better way to study functional consequences of common genetic variants that have undergone natural selection than with the agents that have driven that change? Discoveries await both cellular microbiologists and human geneticists, and the results should benefit our understanding of basic biology and susceptibility to disease.