Tagging Single Nucleotide Polymorphisms in the IRF1 and IRF8 Genes and Tuberculosis Susceptibility

Genes encoding IRF1 and IRF8 protein have been proposed as candidate tuberculosis susceptibility genes. In order to elucidate whether the IRF1 and IRF8 variants were associated with tuberculosis susceptibility, we conducted a case-control study consisting of 495 controls and 452 ethnically matched cases with tuberculosis in a Chinese population. Seven haplotype tagging single-nucleotide polymorphisms (tagSNPs) (rs2057656; rs2706381; rs2070724; rs2070721; rs2549008; rs2549007; rs2706386) from HapMap database were analyzed, which provided an almost complete coverage of the genetic variations in the IRF1 gene. Fifteen tagSNPs (rs12924316; rs182511; rs305080; rs2292980; rs925994; rs424971; rs16939967; rs11117415; rs4843860; rs9926411; rs8064189; rs12929551; rs10514611; rs1044873; rs6638) were observed in the IRF8 gene. All these tagSNPs were genotyped by SNPstream genotyping and SNaPshot typing. None of the seven tagSNPs was individually associated with tuberculosis in the IRF1 gene. In the IRF8 gene, interestingly, we found that three tagSNPs (rs925994 and rs11117415 located in the intron region; rs10514611 located in the 3′UTR) were associated with risk of tuberculosis after Bonferroni correction. Per allele OR was 1.75 (95% CI 1.35∼2.27, P = 0.002), 4.75 (95% CI 2.16∼10.43, P = 0.002) and 3.39 (95% CI 1.60∼7.20, P = 0.015) respectively. Luciferase reporter gene assay showed that the construct that contained the non-risk allele C of rs10514611 showed significantly higher luciferase activity than did the risk T allele (P<0.01), which implied rs10514611 was a potential functional SNP site. Our results indicated that the IRF8 gene might participate in genetic susceptibility to tuberculosis in a Chinese population.


Introduction
The infectious disease tuberculosis (TB) is still widespread. According to the World Health Organization, two billion people are infected with the causative bacillus Mycobacterium tuberculosis [1]. Epidemiological data have shown that only one tenth of Mycobacterium tuberculosis infections will eventually become clinically apparent. Other studies have demonstrated that the prevalence and incidence of TB are significantly higher in identical twins than in fraternal twins [2]. Additionally, its prevalence differs significantly among races and ethnic groups [3]. These observations suggest a possible influence of genetic susceptibility on the development of TB [4][5][6][7].
The immune response to TB is primarily cell mediated, and involves a variety of T cells, macrophages, and cytokines. Studies have shown that T helper (Th) cell subsets (Th1 versus Th2) and their cytokines are essential to prevent the invasion of Mycobacterium tuberculosis [8][9][10]. The interferon regulatory factor (IRF) family is a group of newly identified transcription factors that plays an important role in the regulation of Th cell differentiation [11][12][13]. Among the IRF family, IRF1 and IRF8 proteins are implicated in resistance to intracellular infection [14][15][16]. Genes encoding IRF1 and IRF8 protein have been proposed as candidate TB susceptibility genes. The IRF1 gene (Gene ID: 3659) is located at chromosome 5q31.1. It comprises ten exons [17][18]. The gene product is a protein of 325 amino acids. The IRF8 gene (Gene ID: 3394) is located at chromosome 16q24.1. It comprises nine exons. The gene product is a protein of 426 amino acids [19].
Several recent findings suggest that the IRF1 gene is an important risk factor for some allergic diseases [20][21][22]. Vollstedt et al., in contrast, found no association of the IRF1 gene with pulmonary TB [23]. However, whether more common IRF8 variants are also associated with TB has not been investigated. Different populations may have different genetic associations with disease. To elucidate the role of IRF1 and IRF8 in TB susceptibility, we tested the association of seven tagSNPs of the IRF1 gene and fifteen tagSNPs of the IRF8 gene with risk of TB in a Chinese case-control study. We attempted to identify sufficient SNPs to tag all the common haplotypes across the IRF1 and IRF8 genes.

Study Population
The case-control study included 452 histopathologically confirmed TB patients and 495 TB-free controls. TB group: Han Chinese with TB were selected from the Hangzhou Municipal Red-Cross Hospital based on the diagnostic criteria of TB. Sputum culture, tuberculin skin test, clinical X-ray examination, and pathological examinations were performed for all cases. Patients with positive sputum culture results were confirmed as sputum smear-positive TB, which required two consecutive positive smears. Patients with negative sputum culture results but positive by the purified protein derivative test (PPD test), X-ray examination, and clinical manifestations consistent with the diagnostic criteria of TB were confirmed as sputum culture-negative TB. All patients responded to anti-mycobacterial treatment and were followed up. Exclusion criteria were: patients with pneumonia, lung cancers, or other diseases with similar clinical features; patients with hepatitis; individuals with HIV/AIDS or those who had received immunosuppressors; and patients with other severe diseases. Control group: Healthy Han Chinese were randomly recruited from the same district as a control population. These individuals were not related to members of the TB group, and none had a history of TB, as confirmed by X-ray and physical examinations and tuberculin skin test.
All subjects signed informed consent forms voluntarily and the research was approved by the Medical Ethics Commission of Zhejiang University.

SNP Identification and Selection
Using the HapMap genome browser (http://www.hapmap.org/cgi-perl/gbrowse/hapmap3r2_B36) based on the CHB+JPT population, seven tagSNPs (r² coefficient cut-off of 0.85 with a minor allele frequency of 0.15) were selected to capture the IRF1 gene within a 23 kb region of chromosome 5, and fifteen tagSNPs were selected for the IRF8 gene within a 23 kb region of chromosome 16.
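As a rough illustration of the r² criterion used in tag selection, the sketch below computes the pairwise r² between two biallelic SNPs from haplotype and allele frequencies and compares it with the 0.85 cut-off. The function names and the example frequencies are hypothetical; the selection in this study was made with the HapMap browser, and its tagging algorithm is not reproduced here.

```python
def ld_r2(p_ab: float, p_a: float, p_b: float) -> float:
    """Pairwise r^2 from the frequency of the A-B haplotype (p_ab)
    and the frequencies of allele A (p_a) and allele B (p_b)."""
    d = p_ab - p_a * p_b  # linkage disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("r^2 is undefined when a SNP is monomorphic")
    return d * d / denom


def is_tagged(p_ab: float, p_a: float, p_b: float, cutoff: float = 0.85) -> bool:
    """True if one SNP captures the other at the given r^2 cut-off."""
    return ld_r2(p_ab, p_a, p_b) >= cutoff


# Hypothetical frequencies: two SNPs whose alleles always travel together.
print(is_tagged(p_ab=0.3, p_a=0.3, p_b=0.3))
# Hypothetical frequencies: two weakly correlated SNPs.
print(is_tagged(p_ab=0.2, p_a=0.3, p_b=0.4))
```

A SNP pair that passes the cut-off needs only one member genotyped, which is how a small set of tagSNPs can cover a whole gene region.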

Genotyping
Genomic DNA was extracted from 1.5 ml of peripheral blood samples using the Puregene DNA isolation kit (Gentra Systems, Minneapolis, MN, USA). Two tagSNPs of the IRF1 gene and ten tagSNPs of the IRF8 gene were genotyped using the GenomeLab SNPstream Genotyping System (Beckman Coulter, Fullerton, CA). Five tagSNPs of the IRF1 gene and five tagSNPs of the IRF8 gene were genotyped using SNaPshot typing (Applied Biosystems, Foster City, CA, USA). Primers are shown in Table S1 and Table S2. Repeated genotyping of >10% randomly selected samples yielded 100% identical results.

Statistical Analysis
Data from the control and TB groups were compared and analyzed with SPSS 11.5 software (SPSS Inc., Chicago, IL, USA). HWE software was used to test the Hardy-Weinberg equilibrium. A χ² test was performed to compare the distribution of genotypes between the TB and control groups, and the results were calculated using SNPstats software (http://bioinfo.iconcologia.net/snpstats/). Based on the multivariable logistic regression method, the case-control association of genotypes was tested under five inheritance models (codominant, dominant, recessive, overdominant, log-additive). The models were coded as follows for genotypes AA, AB, and BB (assuming B is the risk allele): dominant, 0 1 1 (AA vs AB+BB); recessive, 0 0 1 (AA+AB vs BB); log-additive, 0 1 2 (trend test on B allele count); overdominant, 0 1 0 (AA+BB vs AB); codominant, 0 1 0 and 0 0 1 (AA vs AB and AA vs BB). Odds ratios (OR) and 95% confidence intervals (95% CI) were given. Akaike's information criterion (AIC) was used to choose the inheritance model that best fitted the data.
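The genotype coding for the five inheritance models can be written out as a small lookup table. This is only an illustrative sketch: the names MODEL_CODING and encode are hypothetical, and the study itself fitted the models in SNPstats rather than with code like this.

```python
# Covariate coding for genotypes AA, AB, BB, assuming B is the risk allele.
# Each genotype maps to a tuple of regression covariates; the codominant
# model needs two indicator variables (AB vs AA, BB vs AA).
MODEL_CODING = {
    "dominant":     {"AA": (0,), "AB": (1,), "BB": (1,)},
    "recessive":    {"AA": (0,), "AB": (0,), "BB": (1,)},
    "log-additive": {"AA": (0,), "AB": (1,), "BB": (2,)},
    "overdominant": {"AA": (0,), "AB": (1,), "BB": (0,)},
    "codominant":   {"AA": (0, 0), "AB": (1, 0), "BB": (0, 1)},
}


def encode(genotypes, model):
    """Translate a list of genotype labels into covariate tuples for one model."""
    coding = MODEL_CODING[model]
    return [coding[g] for g in genotypes]


sample = ["AA", "AB", "BB"]
for model in MODEL_CODING:
    print(model, encode(sample, model))
```

Each coded column would then enter a logistic regression as the genotype term, and the fitted models can be compared by AIC.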
Bonferroni corrections for multiple SNPs were performed using the formula $\alpha = 1 - (1 - \alpha')^{1/n}$ (corrected for $n$ comparisons; $\alpha' = 0.05$, $n = 22$). A P-value less than 0.05 was regarded as statistically significant.
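Read as α = 1 − (1 − α′)^(1/n), the correction formula can be evaluated directly for the 22 SNPs tested. The sketch below does so and, for comparison, also computes the simpler α′/n threshold. The function names are hypothetical, and the formula with the 1/n exponent is the form usually known as the Šidák correction, which gives a threshold very close to α′/n.

```python
def per_test_alpha(alpha_family: float, n_tests: int) -> float:
    """Per-test threshold from alpha = 1 - (1 - alpha')**(1/n)."""
    return 1 - (1 - alpha_family) ** (1 / n_tests)


def simple_bonferroni_alpha(alpha_family: float, n_tests: int) -> float:
    """Per-test threshold from the plain alpha'/n rule."""
    return alpha_family / n_tests


# alpha' = 0.05 and n = 22, as stated in the text.
print(round(per_test_alpha(0.05, 22), 6))           # about 0.002329
print(round(simple_bonferroni_alpha(0.05, 22), 6))  # about 0.002273
```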
Using online statistical software (http://sampsize.sourceforge.net/), with the current sample size (495 controls and 452 cases) and significance threshold (0.05), we estimated that we had >65% power to detect a common risk allele (MAF > 15%) with an odds ratio (OR) of 1.5.

Plasmid Construction and Reporter Gene Assay
PolymiRTS (http://compbio.uthsc.edu/miRSNP) was employed to search for putative SNPs that affected miRNA targeting in human. To produce pMIR-IRF8 plasmid, a fragment covering

Epidemiological Features in the Control and TB Group
A total of 495 DNA samples in the control group (mean age: 46 years, interquartile range, from 37 to 56 years) and 452 samples in the TB group (mean age: 45 years, interquartile range, from 36 to 53 years) were extracted. The control group included 253 males (51%) and 242 females (49%), while the TB group included 317 males (70%) and 135 females (30%) (Table 1). Twenty-four percent of the TB group had a family history of TB compared to 7% of the control group. The TB group encompassed 396 patients with pulmonary TB and 56 patients with extrapulmonary TB (including lymph node, pleural, bone and renal TB).

Relationship between tagSNPs of the IRF1 Gene and TB
Table 2 showed the primary information for the seven tagSNPs in the IRF1 gene. The genotype distributions of the seven studied tagSNPs were in Hardy-Weinberg equilibrium. The genotyping rate exceeded 96%, indicating reliable genotyping. Logistic regression analysis revealed that none of the studied tagSNPs of the IRF1 gene was associated with TB risk (Table 3).

Relationship between tagSNPs of the IRF8 Gene and TB
Table 4 showed the primary information for the fifteen tagSNPs in the IRF8 gene. The genotype distributions of the fifteen studied tagSNPs were in Hardy-Weinberg equilibrium (all P > 0.05) among the subjects except for rs424971 (P < 0.01). The linkage disequilibrium between tagSNP pairs was shown in Figure 2. According to the r² values, the fourteen tagSNPs could represent the whole gene.
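A Hardy-Weinberg check of the kind reported here can be sketched as a one-degree-of-freedom χ² goodness-of-fit test on genotype counts. This is an assumption about the method: the text names only "HWE software", which may have used an exact test instead. The function name and the genotype counts are hypothetical.

```python
import math


def hwe_chi2(n_aa: int, n_ab: int, n_bb: int):
    """Chi-square test (1 df) of observed genotype counts against
    Hardy-Weinberg expectations. Returns (chi2, p_value)."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1 - p
    observed = (n_aa, n_ab, n_bb)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of the chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical counts that match Hardy-Weinberg proportions exactly.
print(hwe_chi2(25, 50, 25))
# Hypothetical counts with no heterozygotes at all, a strong departure.
print(hwe_chi2(50, 0, 50))
```

A SNP with a small P-value in such a test, as reported for rs424971, is usually interpreted with caution because the departure may reflect genotyping error.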
Logistic regression analysis revealed that three tagSNPs (rs925994, rs11117415 and rs10514611) in the IRF8 gene were significantly associated with TB. The other tagSNPs were not significantly associated with TB (Table 5).
Akaike's information criterion (AIC) is a measure of the goodness of fit of an estimated statistical model; it judges a model by how close its fitted values tend to be to the true values in terms of a certain expected value. In the case of rs925994, the log-additive model was accepted as the best fit for these data because it had the smallest AIC value. The per-allele OR was 1.75 (95% CI 1.35∼2.27), indicating a 75% increase in risk with each additional copy of the A allele of rs925994 (P = 0.002 with Bonferroni correction for 22 SNPs). Under the codominant model for rs11117415, patients with genotypes AG and GG had a 1.44-fold (OR = 1.44, 95% CI 1.09∼1.91) and 4.75-fold (OR = 4.75, 95% CI 2.16∼10.43) increase in risk, respectively, compared to patients with genotype AA (P = 0.002 with Bonferroni correction for 22 SNPs). Under the recessive model for rs10514611, patients with genotype TT had a 3.39-fold (OR = 3.39, 95% CI 1.60∼7.20) increase in risk compared to patients with genotype CC or TC (P = 0.015 with Bonferroni correction for 22 SNPs).
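To show how an odds ratio and its 95% confidence interval relate to genotype counts, the sketch below computes an unadjusted OR with a Wald interval from a 2×2 table, as would apply to a recessive comparison (TT versus CC+TC). It is a simplified stand-in for the logistic regression run in SNPstats; the function name and the counts are hypothetical and are not the study data.

```python
import math


def odds_ratio_ci(case_exp: int, case_unexp: int,
                  ctrl_exp: int, ctrl_unexp: int, z: float = 1.96):
    """Unadjusted odds ratio with a Wald confidence interval.
    'exp' means carrying the risk genotype; z = 1.96 gives a 95% interval."""
    odds_ratio = (case_exp * ctrl_unexp) / (case_unexp * ctrl_exp)
    # Standard error of log(OR) from the four cell counts.
    se_log_or = math.sqrt(1 / case_exp + 1 / case_unexp
                          + 1 / ctrl_exp + 1 / ctrl_unexp)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts: 10 of 30 cases and 5 of 45 controls carry the risk genotype.
or_, lo, hi = odds_ratio_ci(10, 20, 5, 40)
print(f"OR = {or_:.2f}, 95% CI {lo:.2f} to {hi:.2f}")
```

An interval that excludes 1 corresponds to a nominally significant association before any correction for multiple testing.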

Functional Analysis of rs10514611 by Reporter Gene Assay
MicroRNAs are small noncoding transcripts of ~22 nucleotides that inhibit the translation or promote the degradation of complementary mRNA, usually by binding to the 3′-UTR of the gene. Analysis using PolymiRTS (http://compbio.uthsc.edu/miRSNP) indicated that rs10514611 was located within a predicted binding site of miR-330 (Table S3). We therefore examined whether rs10514611 of the IRF8 gene could be a target for microRNA.
To confirm the functional significance of rs10514611, we performed a reporter assay using reporter constructs containing the IRF8 3′-UTR. We inserted the IRF8 3′-UTR that included either the rs10514611 C allele or the rs10514611 T allele into the 3′-UTR region of the pMIR-REPORT vector.
When the reporter vectors were transfected into the HeLa cell line, the construct that contained the non-risk allele (rs10514611 C) showed significantly higher luciferase activity than did the construct with the risk T allele (P < 0.01) (Fig. 3). These results indicated that rs10514611, which was associated with risk of TB, in particular when occurring homozygously as the TT genotype, might change the expression of IRF8 protein, which plays an important role in resistance to intracellular infection.

Discussion
IRF1 and IRF8 play an important role in the pathogenesis of TB. However, the IRF1 and IRF8 gene polymorphisms have not been studied much in TB. In our present study, we conducted a case-control study consisting of 495 controls and 452 ethnically matched cases with TB in a Chinese population to investigate the relationship between them and TB susceptibility.
Most tagSNP selection strategies are haplotype-based. Therefore, these tagSNPs are informative polymorphisms that best characterize the haplotype diversity of a given chromosomal region [24]. A series of studies show that it is possible to retain much of the information of haplotypes by retaining only a reduced subset of markers (tagSNPs) [25,26]. Therefore, we used tagSNPs as markers to investigate the relationship between the polymorphisms of the IRF gene family and TB susceptibility. Here, we studied seven tagSNPs of the IRF1 gene and TB risk in a case-control study of a Chinese population by using the Environmental Genome Project database (http://egp.gs.washington.edu). Among the seven tagSNPs, rs2549007 is notable in that it is located in the 5′ upstream region containing the NF-κB binding site, and it has been shown to have higher transcriptional activity [20,27]. The positional effect has been associated with susceptibility to atopy [20]. Lee et al. reported that rs2549007 and the haplotype formed by this SNP and other adjacent SNPs, including rs2706384, rs2549009 and rs839, had a close relationship with Behcet's disease [21]. Schedel et al. reported that rs2070721, rs41525648 and rs2070729 were associated with hereditary allergies [20]. Mangano et al. found that rs2070724 was related to plasmodium infections [22]. Taken together, these data suggested that the seven tagSNPs were related to some infectious disease. However, we did not find an association between the tagSNPs of the IRF1 gene and TB in the present study. Vollstedt et al. (2009) also reported no association of the IRF1 gene with TB in two Southeast Asian populations (Indonesian and Vietnamese) [23], which was in accordance with our results. Nevertheless, the relationship between the IRF1 gene and TB should be examined in larger numbers of cases and controls in different populations.
Furthermore, rs2549007 (MAF: 0.34) and rs2549008 (MAF: 0.11) in the promoter region of the IRF1 gene were present in this Chinese population, which provided additional SNP information for the IRF1 gene.
IRF8 is a member of the interferon regulatory factors that has a pivotal role in mediating resistance to pathogenic infections and in promoting the differentiation of myeloid cells [16]. Loss-of-function mutation in the mouse IRF-8 gene may impair the induction of type I IFN, which results in rapid dissemination of Mycobacterium tuberculosis infection and rapid necrosis of infected tissues [16]. Furthermore, IRF8 modulates TLR signaling and may contribute to the crosstalk between IFN-γ and TLR signal pathways, thus acting as a link between innate and adaptive immune responses [28]. In the present study, we analyzed fifteen tagSNPs of the IRF8 gene in a Han Chinese population and identified three of them (rs925994, rs11117415 and rs10514611) as associated with susceptibility to TB. Per allele OR was 1.75 (95% CI 1.35∼2.27, P = 0.002), 4.75 (95% CI 2.16∼10.43, P = 0.002), and 3.39 (95% CI 1.60∼7.20, P = 0.015) respectively. All these findings indicated that IRF8 might be involved in the pathogenesis of TB through genetic mechanisms.
Another new finding of this study was that rs10514611 was located in the binding region of microRNA miR-330, according to bioinformatics analyses. Transfection experiments showed that miR-330 could inhibit the expression of a reporter construct containing the risk allele (rs10514611 T). The base substitution of rs10514611 might affect the base-pairing between the miR-330 and its target site, and hence the expression of the risk allele was downregulated by miR-330. These results suggested a potential mechanism of down-regulated expression of the IRF8 in the carriers of the risk allele. Indeed, the absence of IRF8 expression impairs killing of intraphagosomal Mycobacterium bovis in mice [29][30][31]. IRF8 expression also induces the expression of interleukin 12 [32]. Thus it can be seen that downregulation of IRF8 expression might be the reason for rs10514611 (T)'s association with susceptibility to TB. In addition, these investigations of miR-330 expression pattern might provide new insight into the role of microRNA in TB.
However, there were three limitations regarding the present study. One limitation was the sample size. Increasing the sample size may prove helpful or necessary in order to increase power to detect underlying susceptibility genes or loci. Another limitation could be due to the correction for multiple genetic models and the method of multiple testing. The Bonferroni correction for multiple SNPs was overly conservative and might give rise to false-negative results in association studies. The third limitation was sample heterogeneity, because about 39.4% of the cases were clinically diagnosed without microbiologic confirmation. Therefore, the association of the IRF8 gene with TB should be tested for reproducibility to validate the role of IRF8 in TB occurrence.
In conclusion, the present study suggested that genetic variants in the IRF8 gene were associated with risk of TB. The tagSNPs rs925994, rs11117415, and rs10514611 in the IRF8 gene had a strong association with TB risk, and IRF8 might emerge as a new and attractive molecular target in TB.