Inflammation and Immune-Related Candidate Gene Associations with Acute Lung Injury Susceptibility and Severity: A Validation Study

Introduction: Common variants in genes related to inflammation, innate immunity, epithelial cell function, and angiogenesis have been reported to be associated with risks for Acute Lung Injury (ALI) and related outcomes. We tested whether previously reported associations can be validated in an independent cohort at risk for ALI.
Methods: We identified 37 genetic variants in 27 genes previously associated with ALI and related outcomes. We prepared allelic discrimination assays for 12 SNPs from 11 genes with MAF>0.05 and genotyped these SNPs in Caucasian subjects from a cohort of critically ill patients meeting criteria for the systemic inflammatory response syndrome (SIRS) followed for development of ALI, duration of mechanical ventilation, and in-hospital death. We tested for associations using additive and recessive genetic models.
Results: Among Caucasian subjects with SIRS (n = 750), we identified a nominal association between rs2069832 in IL6 and ALI susceptibility (ORadj 1.61; 95% confidence interval [CI], 1.04–2.48, P = 0.03). In a sensitivity analysis limiting ALI cases to those who qualified for the Acute Respiratory Distress Syndrome (ARDS), rs61330082 in NAMPT was nominally associated with risk for ARDS. In terms of ALI outcomes, SNPs in MBL2 (rs1800450) and IL8 (rs4073) were nominally associated with fewer ventilator-free days (VFDs), and SNPs in NFE2L2 (rs6721961) and NAMPT (rs61330082) were nominally associated with 28-day mortality. The directions of effect for these nominal associations matched those previously reported, but none of the associations survived correction for multiple hypothesis testing.
Conclusion: Although our primary analyses failed to statistically validate prior associations, our results provide some support for associations between SNPs in IL6 and NAMPT and risk for development of lung injury, and between SNPs in IL8, MBL2, NFE2L2 and NAMPT and severity of ALI outcomes. These associations provide further evidence that genetic factors in genes related to immunity and inflammation contribute to ALI pathogenesis.


Introduction
Acute Lung Injury (ALI) and its more severe manifestation, Acute Respiratory Distress Syndrome (ARDS), are associated with high mortality rates among intensive care unit patients [1][2][3][4]. Despite multiple clinical trials, no pharmacological agent has been shown to improve ALI-related outcomes [5]. Failure of these trials is due, in part, to an incomplete understanding of the key biologic pathways involved in the development and persistence of ALI in humans. One approach to identifying the key biologic pathways in ALI pathogenesis is the gene-association study.
Candidate gene studies in ALI have focused on variation within pathways thought to be important in ALI pathogenesis including genes involved in inflammation, immunity, oxidation, coagulation, angiogenesis, cell growth, endothelial barrier, surfactant function, and transcription regulation [6]. However, only a subset of putative ALI risk alleles have been validated in independent populations [7][8][9][10][11]. Robust associations between candidate gene polymorphisms and ALI validated in independent populations could support the development of more accurate models predicting risk for ALI and provide additional rationale for novel therapeutic interventions.
In this study, we sought to determine whether previously reported associations between variants in genes of immunity, inflammation, angiogenesis and oxidative stress and risk for development of ALI or related outcomes are robust. We used a nested case-control study design in a prospective cohort of critically ill patients with SIRS followed for development of ALI. Secondary analyses examined ALI-related outcomes in a case-only design.

SNP selection
We searched PubMed (www.ncbi.nlm.nih.gov/pubmed/) for publications reporting associations between genetic variants and development of ALI and/or ALI-related outcomes as of August 2010. A panel of ALI investigators (DSO, BJG, MMW) selected variants based on strength of prior studies, biological relevance, and HapMap [12] minor allele frequencies (MAF) of >0.05 in Caucasians. Variants were then further narrowed to those for which TaqMan allelic discrimination assays could be designed or that were in high linkage disequilibrium (LD) with another SNP for which an assay could be prepared. As we were unable to design a TaqMan assay for rs1800795, we used a surrogate SNP, rs2069832 (pairwise r² = 0.97).
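The pairwise r² used to justify a surrogate SNP can be computed from allele and haplotype frequencies with the standard definition r² = D² / (pA(1-pA) pB(1-pB)), where D = pAB - pA·pB. The following is a minimal Python sketch of that definition; the function name and the example frequencies are illustrative assumptions and are not taken from the study data.

```python
def ld_r_squared(p_ab: float, p_a: float, p_b: float) -> float:
    """Pairwise LD r^2 between two biallelic SNPs.

    p_ab: frequency of the haplotype carrying allele A at SNP 1 and allele B at SNP 2
    p_a:  frequency of allele A at SNP 1
    p_b:  frequency of allele B at SNP 2
    """
    for p in (p_a, p_b):
        if not 0.0 < p < 1.0:
            raise ValueError("allele frequencies must lie strictly between 0 and 1")
    # D measures the departure of the haplotype frequency from independence.
    d = p_ab - p_a * p_b
    return d * d / (p_a * (1.0 - p_a) * p_b * (1.0 - p_b))


# Illustrative (made-up) frequencies: the two alleles always travel together.
perfect = ld_r_squared(p_ab=0.4, p_a=0.4, p_b=0.4)
# Illustrative (made-up) frequencies: the two alleles are independent.
independent = ld_r_squared(p_ab=0.16, p_a=0.4, p_b=0.4)
```

An r² close to 1, as reported for rs2069832 and rs1800795, means that genotypes at one SNP predict genotypes at the other almost perfectly, which is why a surrogate assay is acceptable.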

Study population
The cohort used for this study has been previously described [13]. Briefly, patients admitted to the intensive care unit at Harborview Medical Center, Seattle, WA (Dec 2006-Dec 2010) with systemic inflammatory response syndrome (SIRS) were followed prospectively for development of ALI and related outcomes such as death and other organ dysfunction. Patients were categorized as having SIRS if they had three or more of the following concurrently within a 24-hour period: a) body temperature <36°C or >38°C, b) HR >90/min, c) RR >20/min or, if on the ventilator, PCO2 <32 mmHg, and d) WBC <4,000/mm³ or >12,000/mm³. Exclusion criteria included trauma, HIV or immunosuppression, neurological injury, current diagnosis of cancer, and presence of a "do not resuscitate" order. ALI cases were defined per the American-European Consensus Committee (AECC) criteria for ARDS: presence of a proximal risk factor for ALI, a PaO2/FiO2 (ratio of partial pressure of oxygen in arterial blood to fraction of oxygen in gas delivered) <300, absence of overt congestive heart failure, and a chest radiograph with bilateral parenchymal opacities as adjudicated by three critical care physicians [13]. We further subclassified cases as ARDS if the PaO2/FiO2 was <200. According to the recently published Berlin Definition for Acute Respiratory Distress Syndrome, our definition of ALI would equate to mild, moderate, or severe ARDS while our definition of ARDS would equate to moderate and severe ARDS [14]. At-risk patients who did not meet criteria for development of ALI during the ICU hospitalization were classified as control subjects. This study was approved by the University of Washington, Division of Human Subjects Research.
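The enrollment and oxygenation thresholds described above can be expressed as a short classification routine. The Python sketch below is an illustration only: the `Vitals` record and its field names are hypothetical, and the oxygenation function covers just the PaO2/FiO2 cut-offs, not the risk-factor, heart-failure, or radiographic criteria that were adjudicated by physicians.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Vitals:
    """Values observed within one 24-hour window (hypothetical field names)."""
    temp_c: float
    heart_rate: float
    resp_rate: float
    wbc_per_mm3: float
    ventilated: bool = False
    pco2_mmhg: Optional[float] = None


def sirs_criteria_met(v: Vitals) -> int:
    """Count how many of the four SIRS criteria are satisfied."""
    temperature = v.temp_c < 36.0 or v.temp_c > 38.0
    heart = v.heart_rate > 90
    # Respiratory criterion: tachypnea, or ventilated with PCO2 < 32 mmHg.
    respiratory = v.resp_rate > 20 or (
        v.ventilated and v.pco2_mmhg is not None and v.pco2_mmhg < 32
    )
    white_cells = v.wbc_per_mm3 < 4000 or v.wbc_per_mm3 > 12000
    return sum([temperature, heart, respiratory, white_cells])


def has_sirs(v: Vitals) -> bool:
    """SIRS requires three or more criteria present concurrently."""
    return sirs_criteria_met(v) >= 3


def oxygenation_class(pf_ratio: float) -> str:
    """Oxygenation criterion only: ARDS is the PaO2/FiO2 < 200 subset of ALI."""
    if pf_ratio < 200:
        return "ARDS"
    if pf_ratio < 300:
        return "ALI"
    return "no hypoxemia"


# Made-up example patient: febrile, tachycardic, tachypneic, normal WBC.
example = Vitals(temp_c=38.6, heart_rate=110, resp_rate=24, wbc_per_mm3=9000)
```

In this sketch a subject with a PaO2/FiO2 of 250 would be labelled "ALI" but not "ARDS", which corresponds to mild ARDS under the Berlin Definition mapping described above.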

Genotyping
Genomic DNA was extracted from whole blood samples using the Puregene DNA Isolation Kit (Qiagen, CA). All SNPs were genotyped using TaqMan-based allelic discrimination assays (Applied Biosystems, CA) on a multichannel microfluidics chip (Fluidigm, CA). Genotyping was carried out per the vendor's protocol in a blinded fashion.

Statistical Analysis
We identified differences in demographic variables between ALI and control subjects using Fisher's exact test for categorical variables and Student's t-test for continuous variables. Observed genotype frequencies were compared with expected frequencies to test for deviations from Hardy-Weinberg equilibrium using Fisher's exact test. To reduce possible confounding from population stratification and racial differences, all analyses were restricted to Caucasian subjects. As a screen for major population substructure, we performed principal component analysis (PCA) using genotypes from 67 SNPs that were available for our samples and publicly available for HapMap 3 subjects [12]. None of the first 5 PCs was associated with case/control status.
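To make the Hardy-Weinberg comparison concrete, the sketch below computes expected genotype counts from observed allele frequencies and applies a 1-degree-of-freedom chi-square goodness-of-fit test. This is a simplified stand-in, not the exact test named above; the function name and the example counts are illustrative assumptions.

```python
import math
from typing import Tuple


def hwe_chi_square(n_aa: int, n_ab: int, n_bb: int) -> Tuple[float, float]:
    """Chi-square test (1 df) for deviation from Hardy-Weinberg equilibrium.

    Returns (chi-square statistic, p-value) for observed genotype counts.
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotypes supplied")
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p                      # frequency of allele B
    observed = (n_aa, n_ab, n_bb)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    # Skip cells with zero expectation (monomorphic SNP) to avoid dividing by zero.
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of a chi-square variable with 1 df: erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Made-up counts: exactly at HWE proportions, then a complete heterozygote deficit.
in_equilibrium = hwe_chi_square(25, 50, 25)
heterozygote_deficit = hwe_chi_square(50, 0, 50)
```

For small samples or rare alleles an exact test is generally preferred, which is consistent with the exact test named in this section.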
We used multivariate logistic regression to estimate the genotype-specific odds ratio (OR) and 95% confidence interval (CI) for ALI susceptibility. The genotype associations were analyzed in both additive and recessive models. Covariates were chosen a priori based on known effects on risk for ALI development and mortality and included age, sex, APACHE III score, comorbidities, smoking status and alcohol abuse. A two-sided P value <0.05 was considered to be nominally significant in all analyses. To control for type I error from multiple hypothesis testing, we used a false discovery rate (FDR) approach, setting a threshold of 0.1 as indicative of corrected significance. FDR was calculated separately for each association test using the total number of SNPs tested.
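The two genetic models differ only in how a genotype is coded before it enters the regression, and the FDR step rescales each nominal P value by the number of SNPs tested. The Python sketch below illustrates both. It assumes the Benjamini-Hochberg procedure, which the text does not name explicitly, and the P values in the example are invented for illustration.

```python
from typing import List


def additive_code(minor_allele_count: int) -> int:
    """Additive model: effect scales with the number of minor alleles (0, 1 or 2)."""
    if minor_allele_count not in (0, 1, 2):
        raise ValueError("minor allele count must be 0, 1 or 2")
    return minor_allele_count


def recessive_code(minor_allele_count: int) -> int:
    """Recessive model: only minor-allele homozygotes are coded as exposed."""
    if minor_allele_count not in (0, 1, 2):
        raise ValueError("minor allele count must be 0, 1 or 2")
    return 1 if minor_allele_count == 2 else 0


def benjamini_hochberg(p_values: List[float]) -> List[float]:
    """Benjamini-Hochberg adjusted P values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotone adjusted values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Invented example: one nominal hit (P = 0.03) among 12 SNPs, the rest at P = 0.5.
illustrative_p = [0.03] + [0.5] * 11
illustrative_q = benjamini_hochberg(illustrative_p)
```

In this invented example the single nominal P value of 0.03 adjusts to 0.36, above the 0.1 threshold, which shows how a nominal association among 12 tested SNPs can fail correction.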
We conducted a sensitivity analysis restricting case and control subjects to avoid phenotype misclassification. We restricted cases of ALI to only those qualifying for moderate and severe ARDS by oxygenation criteria (PaO2/FiO2 <200) and we restricted at-risk controls to only subjects who did not have an episode of hypoxemia (PaO2/FiO2 <300).
In secondary analyses, we tested for associations between genotypes and ventilator-free days (VFDs) using linear regression and between genotypes and mortality within 28 days using logistic regression, adjusting for covariates as above.
In a post hoc analysis, and for SNPs with adequate genotyping data available in prior publications, we completed unadjusted meta-analyses for genotype associations with ALI risk or mortality in ALI subjects by combining our results with the published data.
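The pooling method is not specified in the text. As one common way to combine unadjusted genotype counts across studies, the Python sketch below assumes a fixed-effect, inverse-variance meta-analysis of log odds ratios; the function names and the example counts are hypothetical and do not reproduce the study's results.

```python
import math
from typing import List, Tuple


def log_or_and_se(a: int, b: int, c: int, d: int) -> Tuple[float, float]:
    """Log odds ratio and its standard error from a 2x2 table.

    a: exposed cases, b: unexposed cases, c: exposed controls, d: unexposed controls.
    All four cells must be non-zero.
    """
    log_or = math.log((a * d) / (b * c))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return log_or, se


def pool_fixed_effect(studies: List[Tuple[float, float]]) -> Tuple[float, float, float]:
    """Inverse-variance pooled OR with a 95% CI from (log OR, SE) pairs."""
    weights = [1.0 / se ** 2 for _, se in studies]
    pooled = sum(w * lo for w, (lo, _) in zip(weights, studies)) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))
    low = pooled - 1.96 * pooled_se
    high = pooled + 1.96 * pooled_se
    return math.exp(pooled), math.exp(low), math.exp(high)


# Two made-up studies with identical counts: the pooled OR equals the
# single-study OR (2.25) while the confidence interval narrows.
study = log_or_and_se(20, 80, 10, 90)
pooled_or, ci_low, ci_high = pool_fixed_effect([study, study])
```

A random-effects model would be an alternative when the contributing studies differ in design or population, as the published ALI cohorts do.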
We estimated that, under an additive model, our study would have adequate statistical power (1-β > 0.8) to detect associations of moderate strength with ALI (relative risk (RR) >1.5) for SNPs with MAF >0.2. Assuming a recessive model, we would have adequate power to detect a strong association (RR greater than or equal to 3) with SNPs having a MAF >0.25. See Figure S2 for a description of power calculations and the range of power over varying models, allele frequencies and genotype effect strengths.
Statistical analyses were run using the R statistical package and SNP and Variation Suite 7 (Golden Helix, Bozeman, MT).

Characteristics of the Study Population
We genotyped 879 subjects: 238 subjects who developed ALI and 641 at-risk control subjects (see Figure 1 for the subject exclusion flow chart). For analyses, we included only subjects who met 3 criteria for SIRS (Figure 1). These subjects had a mean age of 54 (StdDev ±16) years, were predominantly male (63.1%), and were moderately ill with a mean APACHE III score of 64.7 (StdDev ±25). The baseline characteristics for the study population and comparisons of co-morbidities and clinical risk factors for ALI between groups are shown in Table 1. Compared with subjects who did not develop ALI, ALI cases were of similar age, more likely to be male, and had higher APACHE III scores. The proportion of subjects with coexisting end-stage renal disease, liver failure, alcohol abuse, and shock was greater in the ALI group. Hospital mortality for subjects with ALI was 24.6%, which is similar to observed rates in recent ARDSNet studies [15,16]. As would be expected, ALI cases were more likely to have acute kidney injury, sepsis, and shock, and had fewer VFDs.

SNPs selected for validation
We searched PubMed for gene association studies of ALI and ARDS susceptibility and related outcomes. At the time of the search (August 2010), we identified 37 genetic variants in 27 genes including SNPs, insertion/deletions, and SNP haplotypes (Table S1). The set of SNPs selected for this validation study tagged unique linkage disequilibrium bins that have been associated with ALI risk and/or ALI-related outcomes and were located in or near the candidate genes. See Table 2 for the list of SNPs used in this study.

Genotyping results
We achieved genotype call rates across all subjects of over 99% for each of the 12 SNPs. One of the SNPs, rs4444903, had genotype frequencies for the control subjects that deviated from predicted frequency under Hardy-Weinberg equilibrium (HWE). The remaining SNPs had genotype results for the control subjects that were not significantly different from HWE predictions using the χ² test.
We did not observe strong evidence of population stratification by PCA comparing 67 shared genotypes among our subjects and HapMap3 subjects. When plotting PC1 vs. PC2, our subjects clustered with HapMap CEU and TSI subjects but not with Asian or African clusters as expected. No subject outliers or associations between phenotype and eigenvalue were observed. See Figure S1.

Association between SNPs and development of ALI
Genotype frequencies for control and case subjects are presented in Table 3. The A allele of rs4444903 in EGF was over-represented in cases but genotype frequencies for this SNP in controls deviated from HWE.
In analyses adjusted for age, gender, APACHE III score, presence of diabetes mellitus, end-stage renal disease, chronic alcohol use, cirrhosis and smoking status, we identified a nominal association between the A allele of rs2069832 in IL6 and ALI susceptibility by recessive modeling (OR adj 1.61; 95% confidence interval [CI], 1.04-2.48, P = 0.03) (Table 4). The direction of effect and magnitude of the odds ratio were consistent with the prior publication [17]. However, this association did not meet our predetermined threshold for FDR (FDR <0.1). We did not observe any other significant associations with ALI susceptibility for any of the remaining SNPs in adjusted (Table 4) or unadjusted analyses (Table 3).
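A reported odds ratio, confidence interval, and P value can be checked for internal consistency. The Python sketch below assumes the interval is a Wald interval on the log-odds scale (an assumption, since the text does not state how the interval was constructed) and recovers the implied standard error, z statistic, and two-sided P value; the function name is illustrative.

```python
import math
from typing import Tuple


def wald_from_ci(odds_ratio: float, ci_low: float, ci_high: float,
                 z_crit: float = 1.959964) -> Tuple[float, float, float]:
    """Recover (SE of log OR, z, two-sided P) from an OR and its 95% CI.

    Assumes a symmetric Wald interval on the log scale:
    log(OR) +/- z_crit * SE.
    """
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * z_crit)
    z = math.log(odds_ratio) / se
    # Two-sided normal tail probability.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return se, z, p_value


# Values reported for rs2069832 in IL6 under the recessive model.
se, z, p = wald_from_ci(1.61, 1.04, 2.48)
```

Under that assumption, the values reported for rs2069832 imply a two-sided P value close to the reported 0.03, so the estimate, interval, and P value are mutually consistent.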
We evaluated whether we could complete meta-analyses by combining our genotype results with the previously published results but found that many of the publications lacked adequate genotyping details. However, we were able to complete meta-analyses for several of the SNPs, including those in NAMPT, VEGF, IL8, MBL2 and TNF. In unadjusted genotype association meta-analyses using additive modeling, we observed an association between rs59744560 in NAMPT and increased risk for ALI (OR 1.33 (95% CI 1.14-1.56), P <0.0003) and an association between rs61330082 in NAMPT and decreased risk for ALI (OR 0.86 (95% CI 0.74-1.0), P = 0.049). We did not observe associations between SNPs in VEGF, IL8, MBL2 or TNF and ALI risk in this meta-analysis. See Table S3.
Figure 1. In this study, we excluded non-Caucasian subjects, subjects who were found to have previous enrollment or missing data, and subjects that did not meet 3 SIRS criteria present concurrently within a 24 hour period. doi:10.1371/journal.pone.0051104.g001

Association between SNPs and development of moderate or severe ARDS
We performed a sensitivity analysis excluding subjects with a higher likelihood for phenotypic misclassification as suggested by prior reports [13,18]. We compared controls without hypoxemic respiratory failure (n = 288) and cases with the more severe lung injury phenotypes, moderate and severe ARDS (n = 126). We identified a nominal association between rs61330082 in NAMPT and reduced ARDS susceptibility (OR adj 0.61; 95% confidence interval [CI], 0.40-0.95, P = 0.02), which replicates the direction of effect seen previously for this SNP [19]. We observed trends toward increased ARDS risk for homozygous carriers of rs6721961 in NFE2L2 (OR adj 3.83; 95% confidence interval [CI], 0.93-15.7), and rs2069832 in IL6 (OR adj 1.91; 95% confidence interval [CI], 0.96-3.40) (Table 5). None of the other SNPs were associated with ARDS risk in this sensitivity analysis.

Associations with mortality and ventilator-free days
Several of the SNPs selected for this study were previously associated with ALI-related outcomes [17,20,21]. In secondary analyses, we tested the 12 SNPs for associations with 28-day mortality and VFDs among subjects with ALI. Genotype frequencies for survivors and non-survivors of ALI are presented in Table S2. We observed nominal associations between rs4073 in IL8 and rs1800450 in MBL2 and decreased VFDs in adjusted analyses (Table 6). We also observed nominal associations between rs61330082 in NAMPT and rs6721961 in NFE2L2 and increased risk for mortality when using a recessive model (Table 7). None of these associations met our pre-test threshold for significance by FDR. Using genotype data included in published studies, we combined our data and the published data to complete meta-analyses for the SNPs in TNF and MBL2. In an unadjusted genotype association meta-analysis using recessive modeling, rs1800629 in TNF was associated with increased mortality (OR 4.6 (95% CI 1.4-14.9), P = 0.001) and rs1800450 in MBL2 displayed a trend toward increased risk (OR 3.07 (95% CI 0.99-9.5), P = 0.07). See Table S4.

Discussion
At the time of our study, candidate gene association studies had identified more than 30 genetic variants in over 27 genes that associate with risk for ALI or ALI-related outcomes. These associations provide evidence that genetic factors contribute to ALI and strengthen the link between these specific genes and disease pathogenesis. In this study, we sought to identify robust associations between previously associated genetic variants and risk for ALI and related outcomes. Using a nested case-control study in a well-phenotyped ICU cohort, we compared SIRS subjects who remained at risk for ALI to those who developed ALI. Although the associations with ALI in the primary multivariable analyses did not survive correction for multiple comparisons, our nominal association results provide some support for prior findings. We observed a nominal association between a SNP in IL6 (rs2069832, surrogate for rs1800795) and risk for ALI. In analyses restricting case and control definitions to minimize misclassification, we observed a nominal association between rs61330082 in NAMPT and reduced risk for moderate and severe ARDS, consistent with previous reports [19]. In terms of ALI-related outcomes, we observed nominal associations between rs4073 in IL8 and rs1800450 in MBL2 and decreased VFDs. We observed nominal associations between rs61330082 in NAMPT and rs6721961 in NFE2L2 and increased 28-day mortality. None of the associations observed in the secondary analyses survived correction for multiple hypothesis testing.
IL6 and IL8 have been extensively studied as important proinflammatory mediators in ALI. Elevated levels of IL-6 and IL-8 are associated with development of ALI, and persistence of elevated levels has been associated with poor outcomes [22][23][24]. rs1800795 is a promoter SNP that results in decreased expression of IL-6, and multiple disease associations have been reported for this SNP [25]. Similarly, rs4073 in IL8 is a promoter SNP that affects gene transcription and has multiple published disease associations [26][27][28]. Our study employed a larger, carefully phenotyped ICU population than the prior published studies of genetic risk associations between these genes and ALI. The CC genotype of rs1800795 in IL6, which results in lower promoter activity, was previously associated with reduced mortality in ARDS but not with risk for ALI [17]. Nonas et al. observed that a 3-SNP haplotype in IL6 that included the C allele of rs1800795 was associated with reduced risk of ALI [29]. In a subsequent publication, Flores et al. tested 14 SNPs across IL6 tagging all common linkage disequilibrium bins and found that none of these SNPs demonstrated an association with ALI. However, they did find that carriage of two copies of a 6-SNP haplotype that included the G allele at rs1800795 was associated with an OR of 2.73 (95% CI, 2.39-5.37) for development of ALI [11]. Finally, Sutherland et al. reported that carriage of 2 copies of an IL6 haplotype clade that included the G allele of rs1800795 was associated with increased risk of death and fewer days alive and free of ALI [30]. Here we report that a surrogate for rs1800795 (rs2069832, r² = 0.98) is associated with increased risk for ALI and, similar to the studies by Flores and Sutherland, the effect was best modeled assuming a recessive effect.
The SNP in IL8 was previously associated with increased days on mechanical ventilation [20]; in our study, rs4073 in IL8 was similarly associated with reduced days alive and free of ventilation. These data suggest that common variants in inflammatory cytokines contribute to disease pathogenesis among subjects at risk for or who have ALI.
NAMPT (PBEF1) has been associated with lung injury pathogenesis in recent translational and genetic association studies. Pre-B-cell colony-enhancing factor (PBEF) is a cytokine encoded by the gene NAMPT that has been shown to be present at higher concentrations in serum and BALF from patients with ALI [19]. Two studies have reported associations between SNPs in NAMPT and development of ALI. Ye et al. identified 11 SNPs in NAMPT and, in a comparison between ALI cases and healthy controls, demonstrated that two of the SNPs in the promoter region, T-1001G (rs59744560) and C-1543T (rs61330082), were associated with risk for sepsis and ALI [19]. A second study using a nested case-control design in a larger population replicated the association of the -1001G allele with increased risk for ALI, while the -1543T allele was associated with decreased risk for ARDS [9]. In our study, we observed the same effect of reduced risk of ALI for carriers of the -1543T genotype in sensitivity analyses restricting to cases with ARDS and controls without hypoxemia. On the other hand, we did not replicate the risk association for the -1001G SNP (rs59744560) in our primary adjusted analysis. In a post hoc meta-analysis combining our data and published genotype data for rs59744560, we did observe a strongly significant association with risk for ALI. Taken together, these findings support a role for NAMPT in ALI pathogenesis and suggest that NAMPT promoter variants are associated with risk for disease development.
NFE2L2 is a gene that has been identified as a potential mediator of hyperoxia-induced lung damage via linkage analysis and positional cloning in mice [31]. NFE2L2 is a transcription factor that targets antioxidant response elements (AREs) leading to gene regulation that is protective in the setting of oxidative stress. Resequencing of NFE2L2 identified a variant (rs6721961) that resulted in diminished promoter activity in an in vitro transfection experiment [8]. The authors also reported that rs6721961 was associated with increased risk for development of trauma-associated acute lung injury. We observed a trend for association of this variant with increased risk for ALI among subjects with SIRS, with a stronger trend in the sensitivity analysis with ARDS as the outcome. Our analysis resulted in a strong odds ratio for increased risk for developing ALI similar to the prior report [8].
Misclassification of ALI, particularly the inclusion of subjects without true ALI among cases, can lead to significant reductions in statistical power to detect true genetic associations [32]. For example, a recent study found that removal of subjects with equivocal CXR results from a set of patients classified with ALI resulted in an improvement in the strength of associations observed [18]. We undertook a sensitivity analysis to determine if we would observe stronger associations following reduction of potential misclassification of ALI cases and at-risk controls. We restricted cases to only those ALI cases qualifying for ARDS (PaO2/FiO2 <200) while also restricting control subjects to those who did not have hypoxemia. This approach reduced the analysis sample size but resulted in associations with effects that were farther from the null for the SNPs in IL6, NAMPT and NFE2L2. While these findings did not achieve statistical significance after correction for type I error, they provide support for the contention that phenotypic misclassification is a major issue reducing power and precision in case-control studies of ALI.
Overall, our study failed to demonstrate robust associations between the SNPs studied and risk for ALI and related outcomes. Reasons for lack of validation include inadequate statistical power, misclassification of subject phenotypes, sample population differences compared to the prior studies, population substructure, and absence of true association with ALI. As power to identify small effect sizes was limited, especially in the recessive model, we cannot exclude the possibility that we failed to observe true associations between the SNPs and ALI susceptibility and related outcomes. A limited analysis on 67 genotypes demonstrated that subjects from our cohort clustered with HapMap CEU and TSI subjects, suggesting that there was no obvious evidence for cryptic admixture. However, more subtle differences in population substructure could also have affected our ability to detect true associations. Finally, most of the SNPs included in this study have not been definitively shown to have functional significance and may merely serve as markers for causal variants. Future work will require more dense genotyping or sequencing at these loci to gain further insight into potential mechanisms.
This study used a well-phenotyped cohort of ICU subjects with SIRS and a nested case-control design to replicate ALI risk associations for several previously identified genetic variants. We observed nominal associations between SNPs in IL6 and NAMPT and risk for ALI and ARDS, but these associations failed to meet significance after adjusting for multiple comparisons. Nominal associations were also identified between SNPs in IL8, MBL2, NAMPT and NFE2L2 and ALI severity outcomes, namely VFDs and 28-day mortality. The failure to replicate the majority of ALI risk and outcome associations with statistical significance emphasizes the challenges of genetic association studies in critical care populations, including sample size and disease classification. Our results also highlight the importance of independent confirmatory studies of associations with ALI susceptibility and related outcomes in order to support a better understanding of ALI pathogenesis and prognosis.

Supporting Information
Figure S1 Genotype principal component analysis plot demonstrating overlap of our subjects with HapMap ethnic populations. Population stratification was assessed by PCA for 67 shared genotypes among our subjects and HapMap 3 subjects [12]. Our subjects overlapped with CEU and TSI HapMap subjects but separated from subjects of African or Asian ethnicity when plotting eigenvalues 1 versus 2. (TIF)
Figure S2 Plot of power estimates based on minor allele frequency and genotype effect size. Using CaTS [39], we generated results for estimated statistical power (1-β > 0.8) to detect additive and recessive model associations over varying allele frequencies and genotype relative risks, using our known sample size and case frequency and an alpha error rate of 0.05. Each line represents a genotype relative risk: A, power estimates for the additive model for genotype relative risks of 1.25, 1.5, 1.75 and 2; B, power estimates for the recessive model for genotype relative risks of 1.5, 2, 2.5 and 3. (TIF)
Table S1 Genetic variants associated with acute lung injury and associated outcomes in peer reviewed publications. (DOCX)