Assembly of Inflammation-Related Genes for Pathway-Focused Genetic Analysis

Recent identifications of associations between novel variants in inflammation-related genes and several common diseases emphasize the need for systematic evaluations of these genes in disease susceptibility. Considering that many genes are involved in the complex inflammation responses and many genetic variants in these genes have the potential to alter the functions and expression of these genes, we assembled a list of key inflammation-related genes to facilitate the identification of genetic associations of diseases with an inflammation-related etiology. We first reviewed various phases of inflammation responses, including the development of immune cells, sensing of danger, influx of cells to sites of insult, activation and functional responses of immune and non-immune cells, and resolution of the immune response. Assisted by the Ingenuity Pathway Analysis, we then identified 17 functional sub-pathways that are involved in one or multiple phases. This organization would greatly increase the chance of detecting gene-gene interactions by hierarchical clustering of genes with their functional closeness in a pathway. Finally, as an example application, we have developed tagging single nucleotide polymorphism (tSNP) arrays for populations of European and African descent to capture all the common variants of these key inflammation-related genes. Assays of these tSNPs have been designed and assembled into two Affymetrix ParAllele customized chips, one each for European (12,011 SNPs) and African (21,542 SNPs) populations. These tSNPs have greater coverage for these inflammation-related genes compared to the existing genome-wide arrays, particularly in the African population. These tSNP arrays can facilitate systematic evaluation of inflammation pathways in disease susceptibility. For additional applications, other genotyping platforms could also be employed. For existing genome-wide association data, this list of key inflammation-related genes and associated subpathways can facilitate comprehensive inflammation pathway- focused association analyses.


INTRODUCTION
Inflammation is an essential component of immune-mediated protection against pathogens and tissue damage. Immune responses are also responsible for the unfavorable rejection of tissue/organ transplants, hypersensitivity reactions (e.g., atopy, anaphylaxis, contact hypersensitivity, delayed-type hypersensitivity), and septic shock. Aberrant or unchecked immune responses may lead to a state of chronic inflammation [1][2][3]. This may occur when the immune response: 1) is activated in the absence of 'danger' signals; 2) fails to fully turn-off (resolve) after elimination of the danger; and 3) fails to completely clear the danger stimulus. Factors that may influence the initiation, activity, and resolution of immune responses include health (physical and emotional), age, diet, medications, and genetic predisposition.
Inflammation may also be a contributing factor for some diseases. The role of chronic inflammation in a wide variety of diseases is well-appreciated, including rheumatoid arthritis and other autoimmune disorders [4], cardiovascular disease [5][6][7], gastrointestinal disorders [8,9], and a number of cancers [10][11][12][13][14]. Perhaps the best evidence for the importance of chronic inflammation in disease is the efficacy of NSAIDs in reducing the risk or severity of these disorders [15]. There is mounting evidence that dietary factors that may influence inflammation, such as the balance of omega-3 vs. omega-6 polyunsaturated fatty acids (PUFAs), have an impact on disease risk and progression [16]. Genetic studies also provide evidence that variant alleles of genes associated with inflammatory pathways impact the risk of disease initiation, progression, and severity (see Table 1). The role of inflammation as a mediator of disease is currently receiving extensive attention, resulting in the National Institute of Allergy and Immunologic Diseases (NIAID) plans for an NIH Roadmap initiative with the overarching theme: ''Inflammation as a Common Mechanism of Disease'' (http://nihroadmap.nih.gov/inflammation/index.asp).
Numerous genetic linkage and case-control association studies have implicated genetic variations in genes important in immunity and inflammation and inflammatory diseases. Single missense heritable mutations can be the sole or major determinant for inflammatory diseases, such as Familial Cold Autoinflammatory Syndrome (missense mutations in exon 3 of Cias1 account for all cases) [17][18][19] and Familial Mediterranean Fever (MEFV, five founder missense mutations, when homozygous, account for 74% of cases) [20]. For many complex inflammatory diseases, polymorphisms in inflammatory genes are more likely to act as modifiers for disease susceptibility rather than sole determinants. Recent meta-analyses report modest associations between single nucleotide polymorphisms (SNPs) in TNF (encoding TNF-a) and increased risk for asthma [21], system lupus erythamatosus (SLE) [22], and psoriatic arthritis [23]. A variant of PTPN22 (encoding a lymphoid-specific protein tyrosine phosphatase) is modestly associated with multiple autoimmune diseases (rheumatoid arthritis, SLE, type 1 diabetes, and Graves' disease) [24]. Associations of IRF5 (interferon regulator factor 5) genetic variants and increased SLE risk have been highly replicated [25]. IFIH1, encoding an innate immunity viral mRNA detector (early type I IFN-b responsive gene, Helicard), is strongly associated with type 1 diabetes risk [26]. Association of a small risk for type 1 diabetes and CTLA4 has been consistently replicated [27][28][29]. An insertion polymorphism in CARD15, the gene encoding the microbial nucleotide detector Nod2, is a major risk factor for Crohn's disease [30]. An association between SNP's in IL23R (IL-23 receptor bchain) and increased risk of inflammatory bowel disease has been reported for a genome-wide association study and confirmed in three independent populations [31]. A recent meta-analysis demonstrated that an IL4R (IL-4 receptor alpha chain) variant modestly increases risk for atopic asthma [32]. A description of these studies and reported associations is provided in Table 1. This list is not intended to be comprehensive but rather serves as an example of representative genetic variations which have been well-established to be associated with common inflammatory disorders.
In addition to inflammatory/autoimmune diseases, polymorphisms in inflammation-associated genes may also contribute to risk for diseases in which inflammatory/immune-disorders are not the primary characteristic. There is evidence from the Breast Cancer Association Consortium that CASP8 (caspase 8) and TGFB1 (TGF-b1) variants impart risk, albeit low penetrance, for breast cancer [33]. Analysis of two independent populations suggest that a rare polymorphism in TNF may also be a low-penetrance risk factor for breast cancer [34]. Several groups report associations between various PTGS2 (COX-2) genetic variants and colorectal cancer risk [35][36][37][38][39]. There is also evidence to suggest that a low frequency COX2 variant allele decreases prostate cancer risk [40], particularly in subjects who frequently eat fish [41]. Interestingly, in a preliminary study using MegAllele TM I&I panel, including 9,275 SNPs in 1,086 genes involved in immunity and inflammation, more SNPs were found to be significantly associated with prostate cancer risk than expected by chance, which suggests multiple genetic variants in this pathway impart modest risk to prostate cancer [42]. Because genetic associations of disease and genetic variations in inflammatory genes are often relatively modest, it is likely that polymorphisms in multiple inflammatory genes cooperate in an additive or synergistic manner to impact disease risk. Pathway analyses may help to reveal gene-gene interactions or risks imparted independently from other genes in the pathway. The advantages of performing analyses at pathway levels are illustrated by Dinu et al. [43]. Associations between CFH (complement factor H) and agerelated macular degeneration have been replicated in numerous studies [44]. To test whether genetic variations in the multiple complement pathway genes impact macular degeneration risk, Dinu et al. analyzed the existing genome-wide association study data of Klein et al. [45], restricting the analyses to genes in the complement pathway in subjects carrying the CFH risk allele. Significant associations were detected for a C7 and MBL2 variants and severity of macular degeneration in the context of the complement pathway analysis and the CHF risk allele, and these associations would not have been significant in a genome-wide analysis of the data.

Choosing genes
Various aspects of immunity contribute to the development of an overall inflammatory immune response. These phases include the development of immune cells, sensing of danger, influx of cells to sites of insult, activation and functional responses of immune and non-immune cells, and resolution of the immune response. To broadly cover most aspects of inflammatory responses, the various phases of immune responses were considered in choosing genes for the SNP array panel, outlined in Table 2. A schematic representation of most of these phases is presented in Fig. 1.
Priority was given first to genes of known function in inflammatory responses (in both immune and non-immune cells), and then to genes expressed in immune cells with function implied by homology to other genes but exact function not clear. Ubiquitously expressed genes required for the normal function of most cell types of diverse origin were given lower priority. However, special emphasis was placed on genes at nodes for signaling to and from multiple pathways, most notably genes in NF-kB, MAPK, and PI3K signaling pathways.
Pathways were built using Ingenuity Pathways Analysis, as described in Methods section, using both pre-defined 'canonical pathways' and custom-built pathways based on our own queries for genes/pathways not included in the canonical pathways.
Multiple functional pathways are involved each of the immune response phases, and each functional pathway may contribute to several of the immune response phases. For example (Fig. 2), the response of a macrophage responding to a danger signal during a gram-negative bacterial infection involves innate pathogen recognition of LPS by the TLR4 complex. TLR4 transduces signals via NF-kB, MAPK, and PI3K signaling pathways, stimulating synthesis of eicosanoids and cytokines to signal other cells of the danger. LPS may also stimulate expression of stressinduced proteins, such as MIC-A and MIC-B (ligands for natural killer cell activating receptors) and T cell co-stimulatory molecules (B7 family proteins).
Because immune response phases utilize multiple functional pathways and these pathways are overlapping among phases, the genes chosen for the SNP array panel were assigned to one of the following functional pathways: Adhesion-Extravasation-Migration: adhesion molecules; chemoattractants and chemoattractant receptors; cytoskeletal rearrangement signaling, motility proteins.
Apoptosis signaling: death receptors and ligands and extrinsic apoptosis pathway signaling; mitochondrial-dependent, intrinsic apoptosis pathway signaling; cellular stress signaling.

Phase of immune response Description
Hematopoiesis/homeosta-sis/tolerance The generation and differentiation of immune cells and maintenance of their number in circulation and tissues; prevention of self-reactivity.
Danger signal Innate recognition of and response to pathogenic foreign substances or stress.

Extravasation
The process of circulating immune cells crossing from blood into peripheral tissues and secondary lymphoid tissues.
Migration to site of inflammation The process of immune cells, after extravasation, reaching the site of inflammatory insult, including chemoattraction, adhesion to substrates, and degradation of extracellular matrix.

Interactions between resident cells, immune cells, and pathogens at site of inflammation
Interactions between resident cells, immune cells, and pathogens at site of inflammation-how infiltrating cells interact with the resident inflammatory cells, non-immune cells (e.g., epithelia), pathogens, and other infiltrating cells, that leads to activation of effector functions.

Activation of inflammatory cells
The signaling pathways and transcription factors stimulated by activating, co-stimulatory, and inhibitory receptors that leads to activation, proliferation, differentiation, and survival of responding immune cells.
Effector functions of inflammatory cells The factors produced/released by immune cells in attempt to resolve the pathogenic insults, including release of cytotoxic/cytostatic mediators and mediators to enhance or fine-tune the immune response.

Response of target cells
The pathways in non-immune cells (e.g., epithelia) activated in response to the effector functions of immune cells.

Resolution of immune response vs. chronic inflammation
The pathways that lead to the downregulation of immune responses and inflammation after the pathogenic insult is cleared; the factors maintaining late-phase immune responses when the insult is not totally resolved. Complement cascade: components of classical, alternative, and lectin-dependent complement pathways.
Innate pathogen detection: Toll-like receptors (TLRs); intracellular nucleotide detectors (e.g., Nod1 and putative members of CARD/Nod family); peptidoglycan recognition proteins; associated signaling molecules linking these detectors to major signaling pathways.
Natural Killer Cell Signaling: Natural killer (NK) cellspecific activating and inhibitory receptors; adaptors and signaling molecules for transducing NK cell-specific receptor signaling.
NF-kB signaling: Molecules for regulation of NF-kB activation, including adaptors linking. other pathways to and from the NF-kB pathway (e.g., from MAPK, pathogen-detection, and TNF-a signaling pathways).
Antigen presentation: Major Histocompatibility Complex (MHC) molecules and associated proteins; proteins involved in uptake, processing, and loading of peptides on MHC molecules.
PI3K/AKT Signaling: Molecules involved in regulation of PI3K-dependent signaling, including adaptors linking other pathways to and from the PI3K pathway.  Table 2) are shown in bold. Additional aspects not shown are the involvement of secondary lymphoid tissues for initial T cell and B cell activation by dendritic cells that migrate from the site of inflammation to lymph nodes and other secondary lymphoid structures. The resolution of immune responses, immunological memory, and homeostasis are also not depicted. doi:10.1371/journal.pone.0001035.g001 ROS/glutathione/cytotoxic granules: Molecules involved in the generation and response to leukocyte-derived cytotoxic agents (reactive oxygen species (ROS), nitric oxide, cytotoxic granules of granulocytes and natural killer cells), including contents of granulocyte and NK cell cytotoxic granules; glutathione peroxidases; peroxiredoxins; catalase; proteinases; superoxide dismutase.
TNF superfamily and signaling: Receptors and ligands of the TNF-a superfamily; adaptors and signaling molecules involved in transducing signals from receptor stimulation to other major signaling pathways (e.g., MAPK and NF-kB pathways).
Examples of the functional subpathways and types of genes chosen for the different phases of immune responses are presented in Table 3 (proteins encoded by the genes are provided). Most of these immune response phases also utilize common signaling pathways, including elements of MAPK, NF-kB, PI3K/Akt, GCPR, cytokine, and leukocyte signaling pathways. An example of the integration of multiple functional subpathways for one phase of an immune response (danger signal) is depicted in Fig. 2. The numbers of genes chosen in each subpathway are listed in Table 4, and the complete list of 1027 candidate genes and their primary subpathway (and secondary subpathway for a subset of genes) is provided in Supplementary Table S1.

Example application: WFINFLAM tSNP panel
Because the components comprising inflammation are very numerous and interacting across many pathways, without strong a priori evidence it is difficult to choose a handful of candidate genes to fully cover the potential genetic risk factors contributing to the inflammatory component of a particular disease. For this reason, panels of single nucleotide polymorphisms (SNPs) in an array of inflammation-related genes broadly covering most aspects of immunity and inflammation based on our assembled list will be critical in objectively evaluating the impact of genetic variations in inflammation-related genes on an inflammation-dependent outcome.
There is a commercially available product, Affymetrix Gene-ChipH Human Immune and Inflammation 9K SNP panel, that attempts to serve the purpose. This application -specific panel contains ,9,000 SNPs to cover ,1,000 immunity-and inflammation-related genes (http://www.affymetrix.com/support/ technical/datasheets/humanimmune_9k_snp_datasheet.pdf). However, the rationale for choosing the ,1,000 immunity and inflammation genes in this panel is not clear, and the coverage of these genes, regardless of their biological functions, may not be sufficient. In order to investigate the impact of genetic variations in a broad array of inflammation-related genes on disease risk, we created two tagging SNP (tSNP) panels, one each for populations with either Caucasian or African ancestries, for Affimetrix ParAllele genotyping chips. These tSNP panels were designed to capture majority of the genetic variations in these 1027 inflammation candidate genes.
tSNPs for the 1027 inflammation-associated candidate genes were chosen based on a pair-wise r 2 threshold of 0.8 and MAF $5% using data in the HapMap Phase II database (HapMap Data Release 21a/phaseII; http://hapmap.org/index.html.en). tSNPs were chosen separately for the CEU (representing European ancestry) and YRI populations (representing African ancestry). In order to accommodate as many relevant inflammation genes as possible, less stringent criteria with r 2 threshold .0.5 were employed for intronic regions (excluding 5kb in the beginning and end of these big introns) greater than 50kb in certain large genes with .100 tSNPs. There were seven genes in this category, and the intronic regions for which a less stringent r 2 threshold was applied are listed in Supplementary Table S2. For genes without genotype information from HapMap, additional SNPs in six inflammation genes were chosen to be included based on information from other resources (see Supplementary Table  S3), as described in Methods section.
The resulting inflammation tSNP panels, WFINFLAM-CEU for Caucasians and WFINFLAM-YRI for African descent, include 12,011 SNPs and 21,542 SNPs respectively in 1027 inflammationassociated candidate genes. There is an average of 11.7 and 21 SNPs in each candidate gene in WFINFLAM-CEU and WFIN-FLAM-YRI panels, respectively. Table 4 briefly summarizes the numbers of genes and tSNPs included in each subpathway. The annotation table for this panel, including the list of 1027 genes, their chromosomal positions, their associated primary and secondary subpathways, and the number of tSNPs chosen for each gene can be found in Supplementary Table S1 (and also our website: http://www1.wfubmc.edu/Genomics/Publications+ and+Data/). Additionally, the annotation file for the SNPs included in the WFINFLAM-CEU and WFINFLAM-YRI panels, including the chromosomal locations, their associated genes and the sub-pathways, can also be found in Supplementary Tables

DISCUSSION
Various components and complex interactions comprise immune and inflammation responses, and numerous genes are involved in this complex network. With a thorough review of various aspects of inflammatory immune responses, and a systematic search for gene-gene interactions using Ingenuity Pathway Analysis, we have provided a comprehensive list of inflammation-associated genes and subpathways for genetic association studies. Genome-wide association studies have been a very popular approach to test the association between disease phenotypes and genetic variations. However, we believe there are still several advantages for a pathway-focused study. First of all, compared to whole-genome analyses, restricting analyses to SNPs in a specific pathway reduces the number of multiple tests performed in the analysis of a study population, thereby reducing the probability of false positive associations and increasing the effective power of the study. This kind of study design is particularly effective when inflammation plays an important role in disease etiology and the goal of the studies is to delineate genetic variations in inflammation pathway to disease risk and/or progression. A related second advantage of restricted pathway analysis is in study design. A large proportion of investigators may not have access to the very large number of subjects and multiple confirmation populations needed to overcome false positive associations due to multiple testing in genome-wide association studies. Studies restricted to a pathway analysis permit the use of study populations that are not large enough for use in whole-genome association studies. When target diseases are not prevalent and inflammation is obviously involved in disease etiology, researchers will gain the most out of an inflammation pathway-specific study design. Although some genes not related to inflammation found in whole-genome panels may impart some risk to inflammation-associated diseases, associated genetic variants would be anticipated to be concentrated in a panel of SNPs in inflammation-associated genes. Therefore, the drawback of potentially missing associated non-inflammation genes is offset by the increased probability of detecting true associations in an inflammation-restricted panel. Thirdly, pathway analysis is far less expensive to perform than whole-genome analysis, especially considering the cost for second, and/or third stage confirmation studies needed to follow-up the significant results from an initial screening in order to rule out false positive associations. Lastly, the functional subpathways are also pre-defined with available biological information. This refined information provides investigators with the opportunity to test gene-gene interactions within subpathways in which synergistic interactions are more likely to be concentrated. Additionally, the interplays between subpathways are also clearly defined to enable investigators to test biologically feasible interactions between subpathways.
However, the results from this manuscript also have potential utility for investigators who have more interests in surveying the whole genome. For whole-genome analyses where there is a prior hypothesis for inflammation being associated with the outcome, the inflammation pathway and subpathways defined in this manuscript may provide a framework for testing whether SNPs in the inflammation pathway or subpathways as a whole are overrepresented for significant associations to the outcome. Although pathway networks can be constructed for whole-genome analyses, such networks should be designed a priori before beginning the study.
The WFINFLAM and WFINFLAM-YRI SNP array panels for inflammation-associated genes provide a powerful tool for analyzing the contribution of genetic variation in diseases that have inflammatory components. Although whole-genome SNP panels are currently available that include almost all of the genes included in the WFINFLAM panels, the coverage of SNPs in genes included in the WFINFLAM array is superior to the coverage of currently available in whole-genome arrays, especially for populations with African ancestry background. For researchers who would like to use other genotyping platforms, the inflammatory gene list provided here would be a good starting point for designing genotyping assays for other platforms. Additionally, alternate approaches, other than r 2 based method, for choosing SNPs based on the inflammatory gene list provided here could also be considered. For example, researchers may specifically focus on ''high-prior'' polymorphisms that are known to be functional or have been previously linked to the specific diseases under study, alone or in combination with the tSNPs provided in the WFINFLAM panel. This approach may be more efficient and powerful than the r 2 based method alone, especially if the targeted ''high-prior'' polymorphisms are causal and their linkage to the nearby tSNPs is incomplete.
In addition, precaution may be warranted for tSNP panels designed based on the HapMap project. The transferability of the LD patterns between populations studied in HapMap project and other study populations may need to be validated. The transferability of HapMap-based selection of tSNPs using the reference CEU population to several other diverse populations of European ancestry has been demonstrated to be almost as effective for overall SNP coverage in selected genomic regions or randomly selected SNPs in the respective populations [46][47][48][49][50]. However, the coverage with tSNPs based on HapMap CEU data for a small subset of specific genes or SNPs may be lower in certain populations despite overall similar coverage [51], particularly for isolated indigenous populations [52]. Other than the data provided by the HapMap project, we are not aware of complete sequence variant information available for validation of the transferability of tSNP panels that were designed based on HapMap data to other study populations.
In summary, pathway analysis of inflammation-associated genes is a powerful approach for determining genetic risk factors for both inflammatory diseases and other diseases that may have an underappreciated modest inflammatory component, such as cancers. The inflammation pathway gene list and functionally-defined subpathways provide useful tools for assessing the impact of genetic variations in inflammation pathways on disease risk, in situations where either pathway-focused studies or genome-wide analyses are employed.

Selection of genes
Networks of genes involved in the regulation of the phases of immune responses (described in Results and Table 2 ) were built using Ingenuity Pathways Analysis (Ingenuity Systems: www. analysis.ingenuity.com). Both pre-defined 'canonical pathways' and custom-built pathways based on our own queries for genes/ pathways not included in the canonical pathways were used to establish these networks. Canonical pathways included: actin cytoskeleton signaling; antigen presentation pathway; apoptosis signaling; B cell receptor signaling; calcium signaling; cAMP signaling; chemokine signaling; complement and coagulation cascade; death receptor signaling; ERK/MAPK signaling; FcEpsilon receptor signaling; G-coupled protein receptor signaling; GM-CSF signaling; IGF-1 signaling; IL-10 signaling; IL-2 signaling; IL-4 signaling; IL-6 signaling; integrin signaling; interferon signaling; JAK/STAT signaling; leukocyte extravasation signaling; natural killer cell signaling; NF-kB signaling; nitric oxide signaling; notch signaling; p38 MAPK signaling; PI3K/Akt signaling; PPAR signaling; protein ubiquination pathway; PTEN signaling; SPK/JNK signaling; T cell receptor signaling; TGFb signaling; Toll-like receptor signaling.
These networks were then arranged into inflammation subpathways by: 1) combining several networks together (e.g., ERK/MAPK, p38 MAPK, and SAPK/JNK canonical pathways into the 'MAPK signaling' inflammation subpathway; IL-2-, IL-4-, IL-6-, IL-10-, Interferon-, GM-CSF-, IGF-1-, JAK/STAT6-, and TGF-b-signaling canonical pathways into 'cytokine signaling' inflammation subpathway; actin cytoskeleton-, chemokine-, integrin-, and leukocyte extravasation-signaling canonincal pathways into the 'adhesion-extravasation-migration' inflammation subpathway; etc.); 2) adding additional genes to bridge networks within a subpathway and to include appropriate genes not included in the canonical pathways (e.g., additional cytokines and their receptors were added to the 'cytokine signaling' inflammation subpathway; additional integrins and chemokines/chemoattractant molecules were added to the 'adhesion-extravasation-migration' inflammation subpathway; CD antigens expressed by leukocytes not already included were considered for addition to several subpathways; other missing genes/ pathways considered important by the panel of investigators, such as the scavenger receptor network for 'leukocyte signaling' inflammation subpathway and Nod1/CARD family networks for 'innate pathogen detection' inflammation subpathway; etc.); 3) trimming the networks of genes with low priority for inclusion in the inflammation panel. Genes with lower priority include: genes not expressed in immune cells or not directly involved in cells responding to inflammation, including non-immune cells (e.g., skeletal muscle-specific myosins in the actin cytoskeleton signaling canonical pathway; calsequestrins expressed mainly in various muscle cells, in the calcium signaling canonical pathway; GH1 and GHR, growth hormone expressed in pituitary gland and its receptor, and NGFB and NGFR, nerve growth factor and its receptor, in the NF-kB canonical pathway; etc.; MAPK8IP1, specific for pancreatic cell function, in the PI3K/AKT canonical pathway; etc.); genes with unknown function, though genes with high homology to known inflammatory mediators were considered (e.g., bcl-2 family homologs, IL-1b family homologs). Special emphasis was placed on genes at nodes for signaling to and from multiple pathways, most notably genes in NF-kB, MAPK, and PI3K signaling pathways.

Choosing tagging SNPs
Tagging SNPs (tSNPs) for candidate genes were chosen using Tagger server (http://www.broad.mit.edu/mpg/tagger/server. html). The target sequences included genomic regions containing the entire candidate genes, 5kb before transcription start site, and 2kb after the transcription end site, based on annotation in NCBI Build 35. When candidate regions for two or more genes overlapped, the combined genomic regions were used for choosing tSNPs. Two separate lists of non-synonymous coding SNPs for the CEPH population (CEU) and for the Yuruba population (YRI) were downloaded from HapMap (rel21a_NCBI_Build35) and forced in as tagging SNPs for ancestry-specific panels. Pair-wise r 2 threshold of 0.8 and MAF$5% were used. For genes without genotype information from HapMap, SNPs in Affymetrix 500k array, as well as SNPs with frequency data from the Innate Immunity PGA (IIPGA) (http://innateimmunity.net/), were manually chosen to be included in the list based on allele frequencies and inter-SNP distance (the chromosomal positions are based on IIPGA annotation because some of these genes/ SNPs were ambiguously mapped in build 35). Some tSNPs were dropped out from the final list because they did not pass the Affymetrix design review due to the following reasons: nonbiallelic SNPs, existing SNPs or ambiguous bases too close to the SNP of interest, SNPs exceeding T m ranges, or SNPs failing BLAST searches. In the attempt to fill in the gaps due to tSNPs failing design criteria, tSNPs for genes with dropped-out tSNPs were chosen again by forcing out the tSNPs failing design review and forcing in the remaining tSNPs when running Tagger. In total, there are 12,011 SNPs in WFINFLAM panel for Caucasians and 21,542 SNPs in WFINFLAM-YRI for African descents in 1027 inflammation-associated candidate genes that passed the Affymetrix design review.

Estimation of genomic coverage by tagging SNPs
We used LdCompare (Hao 2006; http://www.affymetrix.com/ support/developer/tools/devnettools.affx) to estimate the coverage of the 1027 inflammation-associated candidate genes with both WFINFLAM panel for Caucasians and WFINFLAM-YRI panel for African descents. We have also computed the coverage of several commercial genotyping arrays, including Affymetrix Mapping 500K, Illumina HumanHap 550 and HumanHap 650, for these 1027 genes. Single-marker coverage (pairwise r 2 ) and multiple-marker coverage (multiple-marker r 2 ) were computed and combined to estimate the coverage for the genomic regions containing the entire candidate genes, 5kb before transcription start site, and 2kb after the transcription end site. The summary for the coverage for these genes in all these genotyping panels are detailed in Table S6.