A Geographically-Restricted but Prevalent Mycobacterium tuberculosis Strain Identified in the West Midlands Region of the UK between 1995 and 2008

Background We describe the identification of, and risk factors for, the single most prevalent Mycobacterium tuberculosis strain in the West Midlands region of the UK.
Methodology/Principal Findings Prospective 15-locus MIRU-VNTR genotyping of all M. tuberculosis isolates in the West Midlands between 2004 and 2008 was undertaken. Two retrospective epidemiological investigations were also undertaken using univariable and multivariable logistic regression analysis. The first study of all TB patients in the West Midlands between 2004 and 2008 identified a single prevalent strain in each of the study years (total 155/3,056 (5%) isolates). This prevalent MIRU-VNTR profile (32333 2432515314 434443183) remained clustered after typing with an additional 9 MIRU-VNTR loci and spoligotyping. The majority of these patients (122/155, 79%) resided in three major cities located within a 40 km radius. From the apparent geographical restriction, we have named this the “Mercian” strain. A multivariate analysis of all TB patients in the West Midlands identified that infection with a Mercian strain was significantly associated with being UK-born (OR = 9.03, 95%CI = 4.56–17.87, p<0.01), Black Caribbean (OR = 5.68, 95%CI = 2.96–10.91, p<0.01), and resident in Wolverhampton (OR = 9.29, 95%CI = 5.69–15.19, p<0.01), and negatively associated with age >65 years old (OR = 0.25, 95%CI = 0.09–0.67, p<0.01). A second, more detailed investigation analyzed a cohort of 82 patients resident in Wolverhampton between 2003 and 2006. A significant association with being born in the UK remained after a multivariate analysis (OR = 9.68, 95%CI = 2.00–46.78, p<0.01), and excess alcohol intake and cannabis use (OR = 6.26, 95%CI = 1.45–27.02, p = .01) were observed as social risk factors for infection.
Conclusions/Significance The continued consistent presence of the Mercian strain suggests ongoing community transmission. Whilst significant associations have been found, there may be other common risk factors yet to be identified. Future investigations should focus on targeting the relevant risk groups and elucidating the biological factors that mediate continued transmission of this strain.


Introduction
DNA fingerprinting of Mycobacterium tuberculosis has a key role in TB control and cluster investigation as the molecular data obtained can be used to direct and focus public health control efforts [1,2]. For example, DNA fingerprinting enhanced the investigation of a large outbreak in North London where many of the epidemiological links would not have been established by routine contact tracing or traditional epidemiological investigations alone [3]. Large-scale studies of M. tuberculosis strains have also enabled the assessment of the impact of global strain migration and the transmission dynamics of specific strains on a local or regional level [4][5][6][7][8].
The number of cases of tuberculosis in the UK has consistently increased each year since the late 1980s with 8,655 cases (14.1 cases per 100,000 population) diagnosed in 2008. There were 1,012 clinical cases (18.7 per 100,000) in the West Midlands region of the UK with a 43% increase in case numbers in the region since 2000. Birmingham is the largest city in the West Midlands with a rate of 42.4 cases per 100,000 in 2008. There were 44.3 cases per 100,000 in London in 2008. There is large variation in the incidence of TB across the West Midlands, with rates highest in one urban area of Birmingham (>80 cases per 100,000) and lowest in rural Worcestershire (<4 cases per 100,000 in 2008) [9].
We analyzed all M. tuberculosis isolates in the West Midlands region of the UK between 2004 and 2008 by universal prospective DNA fingerprinting and identified the most prevalent strain. We then examined the geographical distribution and epidemiological characteristics of cases infected with this strain in the West Midlands region, and in the city of Wolverhampton, which was found to have the highest proportion of patients with this strain.

Study Population
The setting for this study was the West Midlands region of the UK. This region had a total population of 5.4 million in 2008. The city of Birmingham has the largest population in the West Midlands with one million inhabitants [10]. Prospective universal DNA fingerprinting was undertaken between 2004 and 2008 with retrospective genotyping carried out on strains isolated before 2004. Retrospective observational epidemiological investigations were undertaken within one city and on a regional scale.
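As a quick arithmetic check, the regional notification rate quoted in the Introduction (1,012 cases, 18.7 per 100,000) is consistent with the population figure given here. The sketch below assumes the rounded population of 5.4 million as the denominator; the function name is illustrative only and is not part of the study's methods.

```python
def rate_per_100k(cases: int, population: float) -> float:
    """Crude notification rate per 100,000 population."""
    return cases / population * 100_000


# 1,012 West Midlands cases in 2008; the denominator is the rounded
# 5.4 million population quoted in the text (an assumption of this check).
west_midlands_rate = rate_per_100k(1012, 5.4e6)
print(round(west_midlands_rate, 1))
```

Because the denominator is rounded, the result should only be expected to agree with the published rate to about one decimal place.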

Case definition
Patients with the MIRU-VNTR profile of the most prevalent M. tuberculosis strain in the West Midlands were included in further epidemiological investigation.

Mycobacterial strains and DNA fingerprinting
The HPA

Confirmation of the most prevalent MIRU-VNTR profile as a single strain by IS6110 RFLP
For M. tuberculosis isolates originally cultured between 1995 and 2003, DNA fingerprinting was undertaken retrospectively on requested clusters with significant epidemiological links. Strains were retrospectively analyzed by MIRU-VNTR and IS6110 RFLP typing to confirm the genetic relatedness within clusters. IS6110 RFLP interrogates a different and independent genetic sequence than MIRU-VNTR typing. IS6110 RFLP was undertaken in accordance with the international standardized protocol using M. tuberculosis strain MT14323 as a control strain [15].

Assignation of global clade lineage
Spoligotyping was carried out to identify the global strain family to which the most prevalent strain belongs. Spoligotyping was performed using the Luminex Multianalyte Profiling System as previously described [16]. Spoligotype families were assigned by comparison to the international SpolDB4 database [17].

Geographical distribution of the Mercian strain within the West Midlands region
Laboratory records of patients with the most prevalent strain were used to map patient residential location using postcode within the West Midlands.

City specific epidemiological investigation
When it became apparent that there was a cohort of patients in Wolverhampton with an indistinguishable MIRU-VNTR profile, a retrospective investigation was undertaken to identify common factors and potential epidemiological links. Patient case notes were reviewed, and the specialist tuberculosis nurses involved in the care of these patients were interviewed, for culture-positive patients resident in Wolverhampton diagnosed with the same indistinguishable MIRU-VNTR profile between June 2003 and February 2006. These patients were compared to culture-positive cases diagnosed with other strains in 2004.
A questionnaire was designed to collect comprehensive epidemiologic information including demographic characteristics, clinical history, predisposing risk factors and evidence of contact with patients with active disease caused by any strain. Information was also obtained on occupational, social and recreational history, compliance with tuberculosis treatment and change in weight after eight weeks treatment. Chest radiographs of all patients were reviewed for the presence of cavitation.

Statistical analysis
Proportions calculated from epidemiological data obtained from the West Midlands regional and Wolverhampton city datasets were compared using Pearson's chi-squared test with Fisher's exact test where necessary. Univariate and multivariate logistic regression modeling was used to test the significance of odds ratios in Stata v10 (Stata Corp, College Station, TX, USA). The multivariable model was assembled by adding covariates individually in decreasing order of significance and the “goodness of fit” of each model was assessed using the likelihood ratio test. All cases with missing values for the variables examined were excluded from the multivariate model, leaving 114 patients infected by the Mercian strain and 1,891 patients in the control group. Differences in proportions between entries with complete data for each variable and entries with missing data for at least one variable were analysed. A univariate analysis of the epidemiological investigation of patients resident in Wolverhampton was undertaken using EpiData Analysis v2.2 (EpiData Association, Odense, Denmark). The extent of any association was expressed as an odds ratio (OR) with 95% confidence intervals.
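To illustrate how an odds ratio and its 95% confidence interval are obtained from a 2x2 table, a minimal sketch follows. It computes an unadjusted odds ratio with a Wald interval on the log scale, which is a simplification and not the Stata or EpiData procedure used in the study; the adjusted estimates reported here come from logistic regression models. The function name and the example counts are hypothetical.

```python
import math


def odds_ratio_wald_ci(a: int, b: int, c: int, d: int, z: float = 1.96):
    """Unadjusted odds ratio with a Wald confidence interval.

    a: cases with the factor,    b: cases without the factor,
    c: controls with the factor, d: controls without the factor.
    z = 1.96 gives an approximate 95% interval.
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive for the Wald interval")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) from the four cell counts
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts for illustration only; not taken from Table 1 or Table 2.
or_, lower, upper = odds_ratio_wald_ci(20, 10, 15, 30)
print(f"OR = {or_:.2f}, 95%CI = {lower:.2f}-{upper:.2f}")
```

An interval that excludes 1 corresponds to a significant association at roughly the 5% level. With sparse cells, an exact method would be more appropriate than the Wald approximation.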

Ethics Statement
This report details the current status of the investigation into the most prevalent strain in the West Midlands, which has been undertaken as part of normal public health practice by microbiologists, respiratory physicians, and public health teams. Therefore, specific ethical approval was not required. The Health Protection Agency (HPA) has Patient Information Advisory Group permission under the Health and Social Care Act 2001 to collect and analyse such data for public health purposes.

Case definition
Inclusion of a patient for further epidemiological investigation was based on MIRU-VNTR typing, with a confirmed case defined as a patient with microbiologically confirmed tuberculosis and an isolate that had the 32333 2432515314 MIRU-VNTR profile.
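The case definition can be expressed as a simple profile comparison. The sketch below is illustrative only: the isolate records are hypothetical, the function and variable names are not from the study, and it assumes each locus is encoded as a single digit so that the 15-locus profile can be compared as a 15-character string.

```python
from collections import Counter

# 15-locus profile from the case definition, written without the display space.
MERCIAN_PROFILE = "323332432515314"


def normalise_profile(profile: str) -> str:
    """Remove whitespace so profiles typed with different spacing compare equal."""
    return "".join(profile.split())


def is_confirmed_case(profile: str, microbiologically_confirmed: bool) -> bool:
    """Confirmed tuberculosis plus an exact match to the 15-locus profile."""
    return microbiologically_confirmed and normalise_profile(profile) == MERCIAN_PROFILE


# Hypothetical isolates: (MIRU-VNTR profile, microbiologically confirmed?)
isolates = [
    ("32333 2432515314", True),
    ("323332432515314", True),
    ("32333 2432515324", True),   # differs at one locus, so not a case
    ("32333 2432515314", False),  # profile matches but TB not confirmed
]

confirmed = [profile for profile, ok in isolates if is_confirmed_case(profile, ok)]

# The most prevalent profile in a collection is the most frequent exact match.
profile_counts = Counter(normalise_profile(profile) for profile, _ in isolates)
most_common_profile, count = profile_counts.most_common(1)[0]
```

Exact matching mirrors the use of "indistinguishable" profiles in the text; it makes no allowance for single-locus variants.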

UK distribution of the most prevalent MIRU-VNTR profile in the West Midlands
The HPA UK Mycobacterium tuberculosis Strain Typing Database was interrogated to analyze the national distribution of the most prevalent strain in the West Midlands with a total of 176 isolates identified across the UK between 2004 and 2008. Only 6/162 (4%) of these isolates were identified in patients resident outside of the Midlands. Since this MIRU-VNTR profile appeared to be geographically restricted to the West Midlands in the UK, we have named the profile the “Mercian strain”, after the Anglo-Saxon kingdom of Mercia [18].

Geographical distribution of the Mercian strain within the West Midlands region

West Midlands regional epidemiological analysis
A total of 124/156 (79%) tuberculosis patients with the Mercian strain were successfully matched to notification data in the HPA Enhanced Tuberculosis Surveillance system. There were 2,066 tuberculosis patients with other strain types notified in the West Midlands between 2004 and 2008. Patient characteristics identified as significant risk factors in a univariate analysis (Table 1 and Table S1 for all epidemiological variables) were residence in the West Midlands West Health Protection Unit Area and then specifically residence in Wolverhampton, being UK-born, and Black Caribbean or White ethnic group. Significant negative associations were identified with age >65 years old, the Black African ethnic group and extra-pulmonary disease. No significant association with resistance to any of the first-line tuberculosis drugs (p = .79) or multi-drug resistance (p = .92) was identified. The significant variables were then included in a multivariate logistic regression, which identified that being UK-born, Black Caribbean ethnic group, and residence in Wolverhampton were significantly positively associated with the Mercian strain, whereas age >65 years old was significantly negatively associated (Table 1). Therefore, age <65 years old is positively associated with the Mercian strain. There were no statistically significant differences between patients with complete data for each variable compared to those with missing data in both this analysis and the following city-wide epidemiological investigation.

City-wide epidemiological investigation in Wolverhampton
The Mercian strain in Wolverhampton was significantly associated with white UK-born patients who presented with cavitations on chest X-ray and produced smear-positive specimens (Table 2 and Table S2 for all epidemiological variables). Patients infected with the Mercian strain continued to experience weight loss at 8 weeks after starting anti-tubercular chemotherapy, a statistically significant result (p<0.05). However, there was no significant difference between treatment completion rates after 12 months.
Examination of the epidemiological factors revealed that cases with the Mercian strain were more likely to have a previous history of TB (9/35, 26%) and to have had significant previous contact with a case of TB (24/35, 69%), in particular with patients with the Mercian strain (13/35, 37%) (Table 2). Significant social factors detected were evidence of excess alcohol intake and cannabis use.

Discussion
We describe here the identification of the most prevalent M. tuberculosis strain in the West Midlands, which we have termed the Mercian strain. Concordant MIRU-VNTR and RFLP data from six different geographical locations across the West Midlands indicated that this strain is present in 3 major cities. Regional, national, and global genotyping databases provided evidence that this strain was restricted to the West Midlands region in England. Regional data showed that this strain primarily infected UK-born, Black Caribbean patients less than 65 years old.
The regional and Wolverhampton epidemiological investigations presented in this report identified significant associations for the Mercian strain. However, they do not provide a full explanation of why the Mercian strain is more prevalent compared to other strains in the West Midlands. Drug and alcohol use were identified as significant social factors in Wolverhampton. Alcohol and drug use have been identified as significant associations in previously reported tuberculosis outbreaks, particularly in low-incidence countries [19][20][21][22]. The cumulative number of cases and continuing presence of the Mercian strain does not follow a typical point-source outbreak pattern. The significant association with younger age suggests that cases caused by the Mercian strain have arisen as a result of recent transmission and not re-activation in older patients. A possible transmission scenario is that after the initial emergence of the Mercian strain there have been several independent clusters of transmission each with their own common social link. This has resulted in a large, complex social network where transmission persists and the complete transmission scenario is yet to be fully elucidated.
Both epidemiological investigations presented in this report were retrospective and did not involve direct patient interviews. The Mercian strain continues to be identified in the West Midlands, which means that enhanced epidemiological knowledge could be obtained by prospectively investigating social links as each new patient is diagnosed. Potential factors that may cause a delay in diagnosis should be investigated as well. The data presented here identified the infected patient population and also important common social factors. The exact interaction of patient population and social factors should be investigated further to identify and fully understand any confounding factors.
It must be noted that the Wolverhampton epidemiological investigation applied a detailed questionnaire that was only used in this location. Of the three major cities in the West Midlands, Wolverhampton had the highest proportion of the Mercian strain (21%). Patients with the Mercian strain in Birmingham and Coventry might differ in their use of drugs and alcohol. The results from the Wolverhampton and region-wide analysis do not concord exactly as different ethnic population groups were identified as at highest risk: the White population in Wolverhampton but the Black Caribbean group across the West Midlands.
Detection of this strain was only possible with the commencement of universal prospective typing of all M. tuberculosis isolates in the West and East Midlands. Only with universal prospective DNA fingerprinting was the extent of the Mercian strain in the West Midlands fully characterized. Since the Mercian strain is not drug resistant and has no associated phenotypic properties that could differentiate it from other M. tuberculosis complex strains, it would only have been possible to detect this strain by universal prospective typing. The patient population in which the Mercian strain has been identified is different to the UK-wide situation for TB, as the majority of patients diagnosed each year in the UK are not born in the UK and originate from the Indian Sub-Continent [9].
The 156 individual patients detected between 2004 and 2008 make the Mercian strain one of the largest known community-based clusters in the world. Previous major prevalent strains have been identified in New York [23,24], Rotterdam [25], North London [3], and Rio de Janeiro [26].
The most prevalent strain detected in the UK was an isoniazid resistant strain in North London that was previously reported in 70 patients [3], with a current total of over 300 cases caused by this strain (Ibrahim Abubakar, Consultant Epidemiologist & TB Section Head, Respiratory Diseases Department -Tuberculosis Section, Health Protection Agency, personal communication). Isoniazid resistance acted as a very useful marker for detection of the strain. It was noted that without the drug resistance marker only prospective typing of all isolates would have detected this large, complex outbreak. This strain was predominantly found in young White or Black Caribbean UK-born adults with drug misuse as a common epidemiological factor [3]. It is possible that patients in this population group take longer to present clinically as TB may not be suspected when initial symptoms develop or they might not seek medical help soon after onset. Both factors aid strain transmission and disease progression.
A strain was identified in 93/314 (30%) patients in Rio de Janeiro that uniquely lacked a major region of genomic DNA (>26.3 kb) which contained 10 genes including two potentially immunogenic PPE genes [26,27]. The RD Rio strain was associated with a higher frequency of cavitary pulmonary disease [28]. The major deletion identified in the RD Rio strain has been hypothesized as having a major impact on the virulence properties of the RD Rio strain. As the genomic content of the Mercian strain has not been characterized, further work should determine whether such a deletion or other similar major genomic variation has altered the virulence of this strain leading to multiple transmission events in and between three cities in the West Midlands.
Retrospective epidemiological studies have identified the earliest isolate of the Mercian strain from 1995 in an archive strain collection. This isolate was part of a cluster of 11 isoniazid resistant strains identified between 1995 and 2000 which was reported previously before the full regional extent of the Mercian strain was known [2]. We have typed very few archived M. tuberculosis strains from 1995 so the full extent of drug sensitive and drug resistant Mercian strains 15 years ago has not yet been assessed. The cluster of isoniazid resistant Mercian strains was present in one specific location. From 1995-2000, there was no other investigated instance of increased isoniazid resistance in the rest of the West Midlands caused by the Mercian strain.
As the Mercian strain has been present since prospective DNA fingerprinting was commenced in 2004 with a median of 30 isolates per year (range 27-37) and has represented a consistent proportion of strains (Figure 1), it is likely that the Mercian strain first emerged in the West Midlands well before 2004. Universal prospective DNA fingerprinting is an essential part of many countries' TB control programs as it has been used to estimate transmission in specific population groups. The Netherlands has undertaken universal prospective DNA fingerprinting of all M. tuberculosis isolates since 1993 [29]. This enabled the identification of transmission after migration in patients recently arrived in the Netherlands, as infections in 30% to 40% of Turkish, Moroccan, and Somali patients could be attributed to recent transmission [29].
It has also been shown in the Netherlands that combining data obtained from nationwide tuberculosis contact investigation and DNA fingerprinting surveillance greatly increased the number of defined epidemiological links. In 2,206 clustered cases, the number of epidemiologic links increased from 462 before DNA fingerprinting data were known to 1,002 established epidemiologic links after cluster investigation involving a combination of molecular and epidemiological data. DNA fingerprinting did not increase the number of patients identified as contacts, but cluster monitoring did enable the identification of transmission events not detected by contact investigations, the development and evaluation of focused interventions and evaluation of regional tuberculosis eradication programs [30]. A study in Maryland showed that cluster investigation including DNA fingerprinting analysis identified 43/113 (38%) of all detected patient links [31]. A more recent study showed that DNA fingerprinting data could be used to prospectively identify rapidly expanding clusters before expansion actually occurred based on the properties of the first two patients in a cluster. If the first two patients in a cluster were identified within 3 months of each other, one or both were <35 years old, and both patients resided in an urban area and originated from Sub-Saharan Africa, there was a more than 5 times increased probability that this strain in an initial cluster of two paired patients would be identified in 5 or more patients within 2 years [32].
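The early-warning criteria described for the first two patients in a cluster [32] can be written as a simple rule check. The sketch below is a hypothetical encoding of the criteria as summarised above, not the cited study's model: the record fields are invented for illustration, and "within 3 months" is assumed to mean 91 days or fewer.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Patient:
    diagnosis_date: date
    age_years: int
    urban_resident: bool
    from_sub_saharan_africa: bool


def meets_expansion_criteria(first: Patient, second: Patient) -> bool:
    """True if the first two patients of a cluster satisfy all four criteria."""
    days_apart = abs((second.diagnosis_date - first.diagnosis_date).days)
    within_three_months = days_apart <= 91  # assumption: 3 months taken as 91 days
    one_or_both_under_35 = first.age_years < 35 or second.age_years < 35
    both_urban = first.urban_resident and second.urban_resident
    both_from_ssa = first.from_sub_saharan_africa and second.from_sub_saharan_africa
    return within_three_months and one_or_both_under_35 and both_urban and both_from_ssa


# Hypothetical patients for illustration only
p1 = Patient(date(2008, 1, 10), 29, True, True)
p2 = Patient(date(2008, 3, 1), 41, True, True)
p3 = Patient(date(2008, 9, 1), 30, True, True)

flag_close_pair = meets_expansion_criteria(p1, p2)    # diagnosed 51 days apart
flag_distant_pair = meets_expansion_criteria(p1, p3)  # diagnosed more than 3 months apart
```

A flag of this kind only marks a cluster for closer monitoring; it does not by itself estimate the probability of expansion.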
We describe here the identification of the most prevalent M. tuberculosis strain in the West Midlands region of the UK, with 156 isolates in a 5 year period between 2004 and 2008. The Mercian strain has been significantly associated with UK-born patients and appears to be geographically restricted to the West Midlands region in the UK, with evidence of ongoing transmission.

Supporting Information
Table S1 Univariate and multi-variate analysis of sociodemographic, clinical, and bacteriological data for patients with the Mercian strain (n = 124) and all other patients with strain typing data (n = 2,066) in the West Midlands from 2004-2008. (DOCX)