The Influence of Host and Bacterial Genotype on the Development of Disseminated Disease with Mycobacterium tuberculosis

The factors that govern the development of tuberculosis disease are incompletely understood. We hypothesized that some strains of Mycobacterium tuberculosis (M. tuberculosis) are more capable of causing disseminated disease than others and may be associated with polymorphisms in host genes responsible for the innate immune response to infection. We compared the host and bacterial genotype in 187 Vietnamese adults with tuberculous meningitis (TBM) and 237 Vietnamese adults with uncomplicated pulmonary tuberculosis. The host genotype of tuberculosis cases was also compared with the genotype of 392 cord blood controls from the same population. Isolates of M. tuberculosis were genotyped by large sequence polymorphisms. The hosts were defined by polymorphisms in genes encoding Toll-interleukin 1 receptor domain containing adaptor protein (TIRAP) and Toll-like receptor-2 (TLR-2). We found a significant protective association between the Euro-American lineage of M. tuberculosis and pulmonary rather than meningeal tuberculosis (odds ratio (OR) for causing TBM 0.395, 95% confidence interval (C.I.) 0.193–0.806, P = 0.009), suggesting these strains are less capable of extra-pulmonary dissemination than others in the study population. We also found that individuals with the C allele of TLR-2 T597C were more likely to have tuberculosis caused by the East-Asian/Beijing genotype (OR = 1.57 [95% C.I. 1.15–2.15]) than other individuals. The study provides evidence that M. tuberculosis genotype influences clinical disease phenotype and demonstrates, for the first time, a significant interaction between host and bacterial genotypes and the development of tuberculosis.


Introduction
It is estimated that one third of the world's population is infected with Mycobacterium tuberculosis (M. tuberculosis), although the majority will never develop active disease. The factors that govern the development of tuberculosis disease are complex and incompletely understood. Various factors have been clearly associated with increased susceptibility to tuberculosis. HIV infection is by far the most important; it increases the lifetime risk of sub-clinical infection converting to active disease from 1 in 10 to 1 in 3 [1] and is strongly associated with disseminated disease. Defining the contribution of host genetic polymorphisms to disease susceptibility has been more difficult. Studies have suggested polymorphisms in several genes are associated with the development of pulmonary tuberculosis. Some of the genes with polymorphisms that have been validated in multiple studies and may have an effect on gene function include solute carrier family 11, member 1 (SLC11A1, formerly NRAMP1) [2][3][4][5][6], interferon gamma [7,8], TIRAP/MAL [9], P2XA7 [10,11], and CCL2 (or MCP-1) [12][13][14]. Others have shown that the less common extrapulmonary manifestations of tuberculosis may have a different host genetic susceptibility profile and have implicated various polymorphisms in components of the innate host response to infection [15][16,17][18,19]. We have recently reported associations between the development of TBM and single nucleotide polymorphisms (SNPs) in the Toll-interleukin-1 receptor domain containing adaptor protein (TIRAP) and Toll-like receptor-2 (TLR-2) genes [19,20]. However, tuberculosis disease results from the interactions between host and bacteria and there have been no studies examining the influence and relationship of both host and bacterial genotype variation on clinical disease phenotype.
M. tuberculosis exhibits a clonal population structure [21,22] and therefore was regarded until recently as an organism with little relevant genetic variation [23]. However, studies examining M. tuberculosis isolates from wider geographic distributions using whole genome scanning approaches have revealed a cladal phylogeographic distribution with significant variation between major lineages, each of which is associated with specific geographic regions [24,25] (Figure 1). The degree to which this genetic variation influences disease phenotype has been difficult to study. In vitro and in vivo models of infection have shown different genotypes of M. tuberculosis induce different patterns of host immune response [26][27][28][29][30], but the relevance of these findings to human disease remains uncertain. Epidemiological studies have found some genotypes may be associated with different disease phenotypes. For example, several studies have suggested an association between mycobacterial plc gene polymorphism and disseminated extra-pulmonary disease [31][32][33], but these studies have been small, retrospective, or unable to determine if differences are due to host genetic susceptibility or bacterial genetic virulence determinants.

Author Summary
Tuberculosis, caused by the bacterium Mycobacterium tuberculosis, kills over 2 million people each year. It is estimated that approximately one-third of the world population is infected with M. tuberculosis, though the majority will never develop active disease. The most severe form of tuberculosis occurs when the bacterium spreads to the brain to cause meningitis. We examined whether the genetic variation of the person and the bacteria influenced the type of disease a person develops. We have previously shown that certain mutations in genes of the human immune system can predispose adults in Vietnam to developing tuberculous meningitis. In this study we show that some strains of M. tuberculosis commonly found in Europe and America are less likely to cause tuberculous meningitis in Vietnamese adults than strains predominantly found in Asia. We then looked at the interaction between M. tuberculosis strains and mutations in human immune genes and show that a particular mutation, TLR2 T597C, is more commonly found in patients infected with the East-Asian/Beijing strains of M. tuberculosis. This is the first study to look at both the host and pathogen genotypes together in tuberculosis infection, and the findings suggest that the outcome of exposure to M. tuberculosis can depend on both the human genotype and the bacterial genotype.
There has been much interest in the Beijing genotype of M. tuberculosis, which is highly prevalent in Asia and the states of the former USSR and has been responsible for outbreaks of multidrug-resistant tuberculosis in the USA [23,34]. Animal models of infection with this genotype have suggested it leads to a hypervirulent phenotype compared with other common strains of M. tuberculosis [35]. This behaviour has been attributed to an intact polyketide synthase (pks 15/1) gene and the production of a phenolic glycolipid (PGL) [29]. PGL synthesis appears to attenuate the early host immune response to infection and is associated with reduced production of inflammatory cytokines [30]. The ability of Beijing strains to elude the host innate immune response may explain why a recent study has found this genotype is associated with haematogenously disseminated disease [36]. Animal infection models suggest haematogenous dissemination of infection occurs before the onset of T-cell mediated immunity [37], which supports the hypothesis that the ability of different strains of M. tuberculosis to produce different clinical phenotypes varies depending on their interaction with the host innate immune response.
The study described here examined the relationship between polymorphisms in genes responsible for host innate immunity, bacterial genotype, and the development of pulmonary or meningeal tuberculosis. TBM represents the most severe form of haematogenously disseminated tuberculosis causing death or severe disability in more than half of sufferers [38]. We demonstrate that bacterial genotype does influence disease phenotype and interactions between bacterial and host genotype further influence disease expression.

Association between bacterial genotype and disease phenotype
Spoligotyping, RFLP, and MIRU typing. To investigate whether different strains of M. tuberculosis are associated with disseminated disease, we examined isolates from HIV-negative adult patients in Vietnam who either had meningeal disease (n = 187) or localized pulmonary TB (n = 237). Isolates of M. tuberculosis were collected from the CSF of patients with meningitis or the sputum of those with pulmonary TB. The median age of TBM patients was 32 years (range 15-78 years) and of pulmonary patients 36 years (range 15-89 years) (Table 1). We then genotyped each strain by 3 standard methods: spoligotyping, RFLP, and MIRU typing. Three pulmonary isolates showed evidence of mixed culture by more than one method on repeated occasions (dual bands on LSP typing, dual peaks on MIRU, secondary banding on RFLP, for example) and were therefore excluded from further analysis. It is not known if these cases represent mixed infections or laboratory contamination but it is likely that in a sample of this size some patients would be infected with multiple strains. 234 pulmonary isolates were therefore included in all further analyses. Table 2 summarises how the methods clustered the isolates and their respective ability to discriminate between strains. Overall, 348/421 (82.7%) of isolates clustered by spoligotyping, of which 159/421 (37.8%) were ST1 or the 'Beijing' genotype (including variants lacking additional spacers 37-43) and 74/421 (17.6%) belonged to the Vietnam genotype, ST319 [39]. By RFLP, the single largest cluster, the Hanoi genotype [39], was formed by single copy isolates, n = 119/421 (28.3%). MIRU typing clustered 57.7% (n = 243/421) of isolates. The 3 largest clusters were composed of MIRU 233325173533 (n = 28), MIRU 364225223533 (n = 20), and MIRU 223325173533 (n = 15).
There was no significant difference (P > 0.05) between the proportions clustering in the pulmonary and meningeal tuberculosis groups by any of these three methods and no significant associations were found between any cluster and the two disease phenotypes.
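The ability of a typing method to discriminate between strains is commonly summarised with the Hunter-Gaston discriminatory index, D = 1 - [1/(N(N-1))] Σ n_j(n_j - 1), where N is the number of strains, and n_j the number of strains of the j-th type. The sketch below assumes this is the index behind the comparison in Table 2; the function name and the toy type labels are hypothetical and are not study data.

```python
from collections import Counter


def hunter_gaston_index(type_labels):
    """Hunter-Gaston discriminatory index for a list of per-isolate type labels.

    Returns the probability that two isolates drawn at random (without
    replacement) belong to different types.
    """
    n_total = len(type_labels)
    if n_total < 2:
        raise ValueError("at least two isolates are required")
    counts = Counter(type_labels)  # n_j for each type j
    same_type_pairs = sum(n_j * (n_j - 1) for n_j in counts.values())
    return 1.0 - same_type_pairs / (n_total * (n_total - 1))


# Hypothetical toy example: six isolates falling into three types.
toy_labels = ["A", "A", "A", "B", "B", "C"]
print(round(hunter_gaston_index(toy_labels), 3))
```

A value near 1 means almost every isolate has a unique pattern; a value near 0 means most isolates fall into one cluster.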
LSP typing and the pks 15/1 7 bp deletion. We next examined whether M. tuberculosis clades defined by large-sequence polymorphisms (LSPs) were associated with the clinical disease phenotype. The Indo-oceanic lineage, also known as East-African Indian (EAI) [40], or ancestral lineage [41], with RD239 deleted, represented 104/234 (44.4%) pulmonary isolates and 88/187 (47.1%) of the meningeal isolates (Table 3). The East Asian or 'Beijing' lineage (RD105 deleted) represented 87/234 (37.1%) of pulmonary isolates and 81/187 (43.3%) meningeal isolates. There was no significant association between either of these lineages and disease phenotype. However, we found a significant association between the Euro-American lineage and pulmonary rather than meningeal tuberculosis (13% (13/234) vs. 5.9% (7/187), crude odds ratio for causing TBM 0.40, 95% confidence interval 0.19-0.80, P = 0.009) (Table 3). We sequenced the pks gene codons 54 to 154 to confirm that all isolates in the Euro-American lineage were wild-type, identical to the H37Rv sequence. In addition, we sequenced the pks 15/1 gene from 12 isolates randomly selected from the RD105 and RD239 deleted clades and demonstrated all contained the identical 7 bp insertion described in HN878 [35,42]. As expected, all RD105 or RD239 deleted isolates were subsequently shown to have the pks 7 bp insertion by MAS-PCR screening.
To confirm the association was not an artifact of demographic differences between the populations we performed multivariate logistic regression with genotype, disease phenotype, age, sex and the participant address (classified into 5 areas) entered into the model. Age and sex influence susceptibility to extrapulmonary tuberculosis [43], and certain genotypes of M. tuberculosis are associated with young age in Vietnam [39]; analysis by residential district eliminated any potential bias in urban/rural populations of M. tuberculosis. By this analysis the Euro-American isolates were still strongly associated with pulmonary rather than meningeal disease (OR for TBM = 0.40, 95% C.I. 0.20-0.83, P = 0.013).
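For readers unfamiliar with how a crude odds ratio and its confidence interval are obtained from a 2x2 table, the following is a minimal sketch using Woolf's log-scale approximation. The text does not state which interval method was used, so the method is an assumption, and the function name and counts are hypothetical illustrations rather than study data.

```python
from math import exp, log, sqrt


def odds_ratio_woolf(a, b, c, d, z=1.96):
    """Crude odds ratio and Woolf (log-scale Wald) confidence interval.

    Table layout (all counts must be positive):
        a = TBM cases with the lineage,    b = pulmonary cases with the lineage
        c = TBM cases with other lineages, d = pulmonary cases with other lineages
    z = 1.96 gives an approximate 95% interval.
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("Woolf's method needs all four cells to be positive")
    odds_ratio = (a * d) / (b * c)
    se_log_or = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = exp(log(odds_ratio) - z * se_log_or)
    upper = exp(log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts, chosen only to illustrate the arithmetic.
or_, lo, hi = odds_ratio_woolf(10, 30, 90, 70)
print(f"OR = {or_:.2f}, 95% C.I. {lo:.2f}-{hi:.2f}")
```

An interval that excludes 1 corresponds to a nominally significant association at the 5% level; the adjusted estimate in the text comes from logistic regression, which this sketch does not reproduce.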
To provide further support for the biological significance of this finding we investigated whether outcome from TBM was influenced by bacterial lineage. No deaths occurred among those infected with fully drug susceptible Euro-American isolates (0/8), whereas 22.6% (27/119) of patients with susceptible isolates of Indo-Oceanic and East-Asian lineages had died by 9 months (Fisher's exact test, P = 0.201).
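The mortality comparison above can be checked with a short, dependency-free implementation of the two-sided Fisher's exact test. This is a sketch, assuming the usual two-sided definition (summing the probabilities of all tables no more likely than the observed one); the function name is ours, and the counts are the 0/8 and 27/119 deaths quoted in the text.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact P value for the 2x2 table [[a, b], [c, d]].

    Rows are groups, columns are outcome (e.g. died, survived). Margins are
    held fixed and the hypergeometric probabilities of every table that is no
    more likely than the observed one are summed.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    denominator = comb(row1 + row2, col1)

    def table_probability(x):
        # Probability that the first cell equals x given the fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / denominator

    p_observed = table_probability(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    return sum(
        p
        for p in (table_probability(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)
    )


# Euro-American: 0 died, 8 survived; other lineages: 27 died, 92 survived.
p_value = fisher_exact_two_sided(0, 8, 27, 92)
print(f"P = {p_value:.3f}")
```

With only eight patients in the Euro-American group, the test has little power, which is consistent with the non-significant result despite the absence of deaths in that group.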

Relationship between host and bacterial genotypes and disease phenotype
The polymorphisms found in the TIRAP and TLR-2 genes and their associations with disease phenotype have been reported previously [19,20]. In brief, we found previously that the TIRAP SNP C558T and the TLR-2 SNP T597C were associated with susceptibility to meningeal rather than pulmonary tuberculosis and this was reconfirmed in the current dataset. Therefore, we examined whether these polymorphisms were associated with infection with any particular bacterial genotype and whether the relationship influenced disease phenotype.
We analyzed the distribution of alleles and genotypes of the TB groups in comparison with the cord-blood controls (Table 4). TIRAP C558T was associated with susceptibility to TBM as previously reported (OR = 2.96 [95% C.I. 1.71-5.11]); however, there was no stronger association between TIRAP C558T and TB caused by any unique M. tuberculosis lineage (data not shown). As previously reported [20], the TLR2 T597C polymorphism was associated with all cases of tuberculosis (control vs. all isolates; OR = 1.28 [95% C.I. 1.01-1.62], P = 0.045). However, the allelic
Table 2 footnotes. a: where N = the total number of strains in the sample population, s = the total number of types described and n_j = the total number of strains belonging to the j-th type [52]. b: ST319, also known as the Vietnam genotype [39].
There was no association between the TLR2 597C polymorphism and tuberculosis caused by the Indo-Oceanic (P = 0.457) and Euro-American isolates (P = 0.505).
There was an overall association of TLR2 T597C with meningeal disease (OR = 1.51 [95% C.I. 1.12-2.03], P = 0.006) but this was not significant for meningeal disease caused by non-Beijing isolates (control vs. TBM non-Beijing; OR = 1.25 [95% C.I. 0.86-1.82], P = 0.243). The strongest allelic association was between TLR2 T597C and TBM caused by Beijing genotype isolates (control vs. TBM East Asian/Beijing; OR = 1.91 [95% C.I. 1.28-2.86], P = 0.001). On genotypic analysis this association was also highly significant (χ² = 16.39, P = 0.0003) (Table 4). We previously used a likelihood ratio test with Bayesian Information Criterion values to determine that the association between TLR2 T597C genotypes and TB showed best fit with a dominant (comparing 597TT/TC vs. 597CC) rather than a recessive (comparing 597TT vs. 597TC/CC) model [20]. When we analyzed the association of TB caused by the Beijing lineage and TLR2 T597C using a dominant model for all types of clinical TB, we found a highly significant association (

Discussion
The influence of bacterial and host genotype on the development of different forms of TB has been difficult to study in humans. We have compared bacterial and host genotype, and their interaction, across two large groups of Vietnamese adults with pulmonary or meningeal tuberculosis. The study demonstrated a relationship between M. tuberculosis phylogenetic lineage and disease phenotype: disease caused by the Euro-American lineage was significantly more likely to be pulmonary than meningeal, which suggests that this lineage may be less capable of extra-pulmonary dissemination in the study population. However, the proportion of Euro-American isolates in this study population is relatively small and therefore a larger study is required to confirm this finding. It is possible that the predominance of young males among the TBM cases presented a skewed distribution of M. tuberculosis lineages or that TBM susceptibility factors differ among the elderly or young children.
It is tempting to speculate that the associations between bacterial lineage and disease phenotype are explained by the presence or absence of a functional pks 15/1. Recent studies have suggested that the phenolic glycolipid (PGL) produced by some pks 15/1 intact isolates specifically inhibits the innate immune response and may be responsible for a propensity to dissemination [29,35]. In these studies, production of pro-inflammatory cytokines from M. tuberculosis-infected macrophages was inhibited by PGL in a dose-dependent manner. In addition, bacteria producing PGL were more capable of dissemination from the brain to other organs in animal models than others [35]. Isolates unable to express PGL, such as the Euro-American lineages, may conversely cause less extra-pulmonary disease. However, the explanation for our findings is unlikely to be as simple and extrapolation from such model studies is highly speculative. It is becoming increasingly clear that antigenic variation in M. tuberculosis is greater than previously thought and the causative mechanism of phenotypic disease variation is unlikely to be a single antigen 'switch'. PGL synthesis is under complex regulation and cannot be predicted simply by the presence of an intact pks 15/1 gene sequence [44]. We found no differential association with disease phenotype between the East Asian and Indo-Oceanic lineages, although it is probable the Indo-Oceanic isolates do not express the PGL [44]. Of note, the patients infected with Euro-American isolates had lower mortality from TBM compared with patients infected with other lineages. This correlates well with evidence from animal models which showed rabbits infected with these strains had less severe clinical manifestations, milder focal meningeal inflammation and minimal infiltrate despite the presence of significant bacillary loads [35].
The lower mortality in human disease provides further evidence that bacterial genotype may have a significant influence on disease phenotype which could have direct clinical relevance. Bacterial genotyping may allow clinicians to identify patients who are more likely to respond poorly to treatment and for whom more aggressive treatment might be beneficial. However, the number of TBM patients infected with Euro-American isolates in this study was small and a larger study is required to confirm these findings and examine potential confounders such as BCG vaccination status and immunosuppressive co-morbidities.
Recent studies have indicated that the different lineages of M. tuberculosis are strongly associated with specific geographical regions [24]. A global phylogeography of M. tuberculosis has been proposed which suggests lineages may have become specifically adapted to their populations. Such co-evolution, or its absence, may influence disease expression and indicates interactions between bacterial and host genotype should be studied. We hypothesized that polymorphisms in genes responsible for the innate immune response to infection may influence the host response to infection and may result in increased susceptibility to disease from some bacterial lineages but not others. We found that a polymorphism in the TLR2 gene was associated with disease caused by the East Asian or Beijing lineage. This is the first time a relationship between bacterial and host genotype has been observed in TB, although it has previously been observed with other pathogens [45].
TLR2 is a trans-membrane protein which recognizes bacterial ligands, such as the 19 kDa lipoprotein, and initiates a signal transduction cascade which activates dendritic cells and macrophages. The SNP T597C is a synonymous SNP that is not known to affect gene function, although we have previously demonstrated it was associated with TBM disease severity and the co-existence of miliary tuberculosis, the most extreme form of disseminated tuberculosis [20]. This suggests a polymorphism, or polymorphisms in linkage disequilibrium (LD) with TLR2 597C, are important in multiple facets of tuberculosis susceptibility. The causal polymorphism may lie in the promoter region, a regulatory region, or in a nearby gene, and must be identified before its effect on disease pathogenesis and interaction with Beijing genotype strains can be understood. However, it is possible that the causal mutation that is in LD with TLR2 597C may be associated with an impaired immune response to M. tuberculosis and lead to more aggressive disease, prolonged bacteraemia, and an increased chance of seeding to the meninges. The Beijing genotype may further exploit the host susceptibility to infection through its own ability to subvert the host innate immune response. We have previously demonstrated a strong association between Beijing genotype and TBM in HIV positive patients in the same population [46], supporting the hypothesis that infection of an immune suppressed host with an immune subversive bacterium represents a synergistic combination that results in an increased likelihood of disease. There was no overall association of Beijing genotype with TBM in this HIV negative Vietnamese study population, although the proportion of Beijing genotype isolates was greater in the meningeal group ( [36] and it remains possible that a larger study would show an association too small to reach significance here.
In summary, this study provides evidence that M. tuberculosis genotype influences disease phenotype. In addition, although many reports describe host susceptibility or bacterial genetic associations with clinical phenotype in isolation, we have reported the first association between host and bacterial genotype in concert in M. tuberculosis disease. Studies of host susceptibility or pathogen virulence should be conducted in the context of both. Future vaccine candidates may need to be evaluated against a range of M. tuberculosis genotypes and host ethnicities if they are to prove globally effective, particularly against disseminated disease.

Methods
This study compared the host and bacterial genotypes of Vietnamese adults with TBM or uncomplicated pulmonary tuberculosis. All patients were from a single ethnicity (Vietnamese Kinh) and were not infected with HIV.

Disease phenotypes, patient recruitment, and sample collection
The patients were recruited to the study as previously described [19,20]. Briefly, patients with TBM were recruited at Pham Ngoc Thach Hospital for Tuberculosis and Lung Diseases (PNT) and the Hospital for Tropical Diseases (HTD) in Ho Chi Minh City, Vietnam between March 2000 and April 2003. To enter the study patients had to have clinical evidence of meningitis (nuchal rigidity and abnormal CSF parameters) and M. tuberculosis cultured from the CSF, and be >15 years old with a negative HIV test. All patients were followed for 9 months after the start of treatment; disability was assessed in survivors by the modified Rankin score [38].
Adult patients with uncomplicated pulmonary tuberculosis were recruited between September 2003 and December 2004 at 5 district tuberculosis units (DTUs) from Ho Chi Minh City and the surrounding districts, chosen to represent the geographic distribution of isolates among TBM patients in order to avoid an urban/rural bias in one sample set. Cases were defined by the culture of M. tuberculosis from sputum, a chest X-ray appearance consistent with active tuberculosis without evidence of miliary or extra-pulmonary tuberculosis, and no clinical evidence of extrapulmonary disease. As far as possible, patients were prospectively matched to TBM patients by age (±5 years) and district of residence, defined in five groups as: urban, sub-urban, rural (surrounding HCMC), rural South-East or rural South-West. Matched patients were recruited from a DTU within each of these districts. Gender matching was attempted but not achieved due to a larger number of men with pulmonary TB attending the DTUs.
The control group comprised 389 DNA samples extracted from the umbilical cord blood of newborn babies born at Hung Vuong Hospital, Ho Chi Minh City, in 2003. All samples came from unrelated individuals who were ethnic Vietnamese Kinh, as assessed by questionnaire.
Written informed consent was obtained from each patient or an accompanying relative if the patient could not provide consent. All protocols were approved by ethical review committees at the HTD, PNT

Host genotyping
Host genotyping and identification of TLR2 and TIRAP SNPs have been reported in detail previously [19,20]. Briefly, polymorphisms in both genes were identified by sequencing a randomly selected sub-group of patients with TBM. All subjects were then genotyped for the designated SNPs by an allele-specific primer extension assay (MassARRAY™, Sequenom, San Diego, USA).

M. tuberculosis genotyping
All M. tuberculosis isolates were genotyped by four established methods: IS6110 restriction fragment length polymorphisms (RFLP) [47], spacer oligonucleotide typing (spoligotyping) [48], 12 allele mycobacterial interspersed repetitive unit (MIRU) typing [49], and large sequence polymorphisms (LSP) defined by deligotyping [50]. RFLP has limited discrimination in low-copy number isolates (<5 IS6110 copies) which are prevalent in Vietnam, spoligotyping is unable to discriminate Beijing genotype isolates, which account for approximately 40% of M. tuberculosis isolates in this region, and the discriminatory power of MIRU typing was unknown in Vietnam. LSP typing is a relatively new genotyping technique which has been shown to classify isolates in geographically-related clades.
Briefly, bacterial DNA was extracted from cultures on Lowenstein-Jensen media by the cetyl trimethylammonium bromide (CTAB) method [51] and diluted to a working concentration of 15 ng/µl. Spoligotyping [48] and RFLP [47] were carried out according to the standard protocols. MIRU was performed following the method of Supply et al. with minor modifications for a Beckman CEQ8000 sequencer [49]. Wellred Oligos were provided by Proligo, Singapore with dye D2 labelling replacing FAM, dye D3 labelling replacing HEX, and dye D4 labelling replacing NED. Mapmarker 600-1200 bp standard labelled with D1 dye (Bioventures Inc, USA) was included with each run. Assignment of amplicon size was performed manually with reference to the standard.
LSPs were defined following the method of Tsolaki et al. [50]. Isolates were first characterised for RD105 and RD239 deletion as it was anticipated that the majority of isolates would contain one of these two deletions. Isolates without RD105 or RD239 were sequenced in the pks gene to identify the Euro-American lineage using primers pksi GCAGGCGATGCGTCATGGGG and pksj TCTTGCCCACCGACCCTGGC to amplify a 520 bp fragment [42].
MAS-PCR was used to screen for the pks 15/1 7 bp deletion with outer primers pks1i 3′-GCAGGCGATGCGTCATGGGG-5′ and pks1j 3′-TCTTGCCCACCGACCCTGGC-5′ [42] and an internal primer pks1insR 3′-ACGGCTGCGGCTCCCGATGCT-5′. The PCR mix contained 0.1 µM each outer primer, 0.2 µM pks1insR, 0.2 mM dNTPs, 1.5 mM MgCl2, Hotstart Taq (Qiagen), 1× buffer (supplied with enzyme), 10.85 µl ELGA water and 15 ng DNA template in a final volume of 20 µl. The PCR programme was an initial denaturing step of 95°C for 15 minutes, followed by 30 cycles of 94°C for 30 seconds, 67°C for 30 seconds and 72°C for 30 seconds, with a final extension of two minutes at 72°C. Isolates with a 7 bp deletion produced 2 bands of 520 bp and 259 bp while isolates without the deletion produced a single band of 520 bp, validated by comparison with sequencing data for 43 wild-type and 12 Δ7 bp pks15/1 isolates.
Spoligotyping neighbour joining phylogenetic trees were created with the Euclidean distance coefficient on Bionumerics software. RFLP phylogenetic trees were created with 2% position tolerance and 1% optimization using the Unweighted Pair Group Method with Arithmetic Mean (UPGMA) and the Dice coefficient on Bionumerics software. MIRU trees were created using UPGMA with the categorical multistate coefficient. For all methods, isolates were considered clustered if 100% similarity was observed.
The prevalence of genotypes among meningeal and pulmonary isolates was compared by Chi-square test. The association of LSP genotype and disease phenotype was further analysed by a forward stepwise logistic regression model (P < 0.05 to enter; P > 0.055 to remove) to identify variables associated with disease phenotype on multivariate analysis. The variables examined in the model were LSP genotype, site of TB, age, sex and residential district. For analysis of host polymorphisms, allelic and genotypic frequencies were compared between the groups using a Chi-square test. We also analyzed the data with recessive and dominant models as previously described [20]. P values of ≤ 0.05 were considered statistically significant.
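As an illustration of the Chi-square comparison of genotype frequencies, the sketch below computes a Pearson statistic for a contingency table and converts a statistic to a P value for the 2 degrees of freedom that arise when three genotypes are compared between two groups. The function names and the toy counts are hypothetical; the only figure taken from the text is the reported genotypic statistic of 16.39, and the assumption of 2 degrees of freedom is ours.

```python
from math import exp


def pearson_chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, dof


def chi_square_p_value_2df(statistic):
    """Upper-tail P value of a chi-square statistic with exactly 2 df.

    For 2 degrees of freedom the survival function has the closed form
    exp(-x / 2), so no statistics library is needed.
    """
    return exp(-statistic / 2.0)


# Hypothetical genotype counts (TT, TC, CC) for controls and cases.
toy_table = [[200, 150, 40], [60, 70, 35]]
stat, dof = pearson_chi_square(toy_table)
print(f"toy table: chi-square = {stat:.2f}, df = {dof}")

# Check of the reported genotypic result, assuming 2 degrees of freedom.
print(f"P = {chi_square_p_value_2df(16.39):.4f}")
```

Under that assumption, the closed-form P value for 16.39 rounds to the 0.0003 reported for the genotypic analysis.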