Determinants of site of tuberculosis disease: An analysis of European surveillance data from 2003 to 2014

Background: We explored host-related factors associated with the site of tuberculosis (TB) disease using variables routinely collected by the 31 EU/EEA countries for national surveillance.

Methods: Logistic regression models were fitted to case-based surveillance data reported to the European Centre for Disease Prevention and Control for TB cases notified from 2003 to 2014. Missing data on HIV infection and on susceptibility to isoniazid and rifampicin for many patients precluded the inclusion of these variables in the analysis. Records from Finland, Lithuania, Spain and the United Kingdom were excluded for lack of exact details of disease localisation; records from other countries that were missing one or more variables (e.g. previous treatment history, geographical origin), or that described mixed pulmonary and extrapulmonary disease or more than one form of extrapulmonary disease, were also removed (total exclusion = 38% of 913,637 notifications).

Results: A total of 564,916 TB cases reported by 27 EU/EEA countries had exclusive pulmonary (PTB; 83%) or extrapulmonary (EPTB; 17%) disease. EPTB was associated with age <15 years (aOR: 5.50), female sex (aOR: 1.60), no previous TB treatment (aOR: 3.10), and geographic origin (aOR range: 0.52–3.74). Origin from the Indian subcontinent or Africa was most strongly associated with lymphatic, osteo-articular and peritoneal/digestive localization (aOR>3.7), and age <15 years with lymphatic (aOR: 17.96) and central nervous system disease (aOR: 11.41).

Conclusions: Awareness of host-related determinants of site of TB is useful for diagnosis. The predilection for EPTB among patients originating from countries outside Europe may reflect strain preferences for disease localization, geographic/ethnic differences in disease manifestation and other factors, such as HIV.


Introduction
Tuberculosis (TB) remains a serious international public health problem, with a disproportionate number of the 10 million new cases emerging each year concentrated in Asia and Africa [1]. The countries of the European Union and European Economic Area (EU/EEA) together reported 58,008 cases in 2014 to the joint TB surveillance activities of the European Centre for Disease Prevention and Control (ECDC) and the World Health Organization (WHO) (Table 1), which account for about 1% of all TB cases notified globally [2]. Despite its relatively small and declining TB caseload, this group of 31 countries presents a very diverse TB epidemiological pattern, ranging from low TB incidence in its western and southern regions to moderate levels further east. The large majority of TB cases in western European countries are of foreign origin or occur in subgroups at a higher risk of infection and disease than the general population. Overall, 75% of the TB cases show exclusive pulmonary involvement (PTB), 19% solely extrapulmonary disease (EPTB), and 6% have mixed disease [3]. These proportions, however, differ markedly between countries and population groups. In general, PTB patients have more severe forms of disease, as evidenced by a higher risk of dying and a lower likelihood of completing treatment successfully when compared to the average EPTB case. Given this unfavourable prognosis, as well as the risk of direct transmissibility, PTB remains a priority for public health action in TB control. Nonetheless, EPTB also presents public health concerns given that it is often difficult to diagnose (particularly in children) and can have clinically significant sequelae [4]. A number of high-income countries have reported an increased frequency of EPTB over time [3,5,6]. Site of disease has been reported to differ substantially by geography and by other risk factors [3,7–10].
In the EU/EEA, the reporting of case-based data for the European-level surveillance of TB dates back to the mid-1990s. These data allow an in-depth study of epidemiological patterns and determinants of outcomes (e.g. drug resistance, death). In this paper, we analyse data reported by various EU/EEA countries in the 12 most recent years to identify demographic and clinical host-related factors associated with the site of disease among TB cases.

Data sources and collection
Surveillance data described in this article were reported within the framework of collaborative surveillance of TB in Europe through a network of national TB surveillance authorities [11,12]. Reporting has followed a standardised methodology allowing comparison between countries and over time [13–16].

review group and access will be granted if the request fulfils the requirements described in the Policy on data submission, access, and use of data within TESSy. Detailed information and forms can be obtained from https://ecdc.europa.eu/en/publications-data/european-surveillance-system-tessy.

Funding:
The study described in this paper was done in the normal course of work and did not benefit from any ad hoc financing. USAID was a principal salary supporter of the WHO co-author involved in this article. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Competing interests:
The authors declare no competing interests. USAID was a principal salary supporter of the WHO co-author involved in this article. DF and AD are staff members of the World Health Organization (WHO). They alone are responsible for the views expressed in this publication and they do not necessarily represent the decisions or policies of WHO. The designations used and the presentation of the material in this publication do not imply the expression of any opinion whatsoever on the part of WHO concerning the legal status of any country, territory, city or area, or of its authorities, nor concerning the delimitation of its frontiers or boundaries. Authors only used routinely collected surveillance data which were entirely anonymised to them for this study, and thus no ethical clearance was sought.

Inclusion criteria and definitions
For this analysis, PTB is defined as TB of the lung parenchyma, tracheo-bronchial tree, or larynx, while EPTB refers to TB affecting any other anatomical site. The preferred method of reporting site of disease requires information on both the major and minor localisation, as well as details of the specific site of EPTB disease. This method allows reporting of up to two sites; if more than two sites are present in an individual patient, only the two main sites are included. The following groupings of EPTB cases were used: pleural, lymphatic (both intrathoracic and extra-thoracic), osteo-articular (spine, bone, and joint), central nervous system (CNS; including meningeal site), genito-urinary (kidney, ureter, bladder, genital tract), peritoneal/digestive tract and disseminated (including TB of more than two organ systems, miliary TB or the isolation of Mycobacterium tuberculosis from the blood). In order to explore determinants of specific disease localisation, only patients with either exclusive PTB or exclusive EPTB were retained in the analysis (45,636 exclusions made for patients with both PTB and EPTB involvement or with two EPTB sites; Fig 1). Country of origin was defined by birth in all included countries except for Austria and Belgium, where it was defined by citizenship. Countries were classified in seven geographical groups, namely Africa, Americas & Oceania, Europe (three subgroupings), the Indian subcontinent, and the Rest of Asia (Fig 2; Table 1 footnote). The three European subgroupings were: Central European countries that have joined the EU since 2004 (Bulgaria, Croatia, Czech Republic, Estonia, Hungary, Latvia, Lithuania, Poland, Romania, Slovakia, and Slovenia); other Central and Eastern Europe (Belarus, Republic of Moldova, the Russian Federation and Ukraine as well as all other Central European countries in the Balkans); and Western Europe (the remaining countries).
The Indian subcontinent grouped the following Asian countries: Bangladesh, Bhutan, India, Maldives, Nepal, Pakistan and Sri Lanka. Rest of Asia consisted of other countries in the region not in the Indian subcontinent or Oceania, and includes Central Asia (Fig 2).
Children were defined as individuals who were under 15 years of age at the time of the report; elderly were those over 64 years. Previous TB treatment referred to chemotherapy

The study period consisted of the 12 consecutive years from 2003 to 2014 inclusive. Countries eligible for inclusion in this analysis were the 28 belonging to the EU as of 2014, as well as Iceland, Liechtenstein and Norway (Table 1). Countries which did not report complete details of EPTB localisation were not included in the analysis (Finland, Lithuania, Spain and the United Kingdom); records from other countries were excluded for earlier years when no data were reported for site of disease (Denmark, Latvia, Poland and Sweden) or patient geographic origin (France, Norway and Poland) (Fig 1).
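The case definitions above can be illustrated with a minimal sketch. It assumes each record carries a major and an optional minor site; the site labels, function names and the "other" age label are hypothetical and are not taken from the TESSy coding scheme. A case is retained only when it has exactly one distinct site, which mirrors the exclusion of mixed PTB/EPTB disease and of cases with two EPTB sites.

```python
from typing import Optional

# Hypothetical label for pulmonary disease; actual surveillance codes may differ.
PULMONARY = "pulmonary"


def classify_site(major: Optional[str], minor: Optional[str] = None) -> Optional[str]:
    """Return 'PTB' or 'EPTB' for an exclusive site, or None if the case is excluded."""
    sites = {s for s in (major, minor) if s}
    if len(sites) != 1:
        # Unknown site, mixed PTB/EPTB, or two different EPTB sites.
        return None
    return "PTB" if PULMONARY in sites else "EPTB"


def age_group(age_years: int) -> str:
    """Children are under 15 years and elderly over 64 years, as defined in the text."""
    if age_years < 15:
        return "child"
    if age_years > 64:
        return "elderly"
    return "other"  # assumed label for ages 15-64
```

For example, under these assumptions a record with major site "pulmonary" and minor site "pleural" would be dropped, whereas a record with "lymphatic" alone would be kept as EPTB.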

Statistical analysis
The 'null hypothesis' of no association between site and putative explanatory variables was tested using data from all countries pooled for all available years. Bivariate and multivariable analyses were restricted to cases with full data on site, sex, age, TB treatment history, and geographic origin. Owing to the large proportion of cases without information on (i) susceptibility to isoniazid and rifampicin and (ii) HIV status (Table 1), the analysis did not consider these variables.
Crude odds ratios (OR) with 95% confidence intervals (95% CI) were used to express magnitude of association between categorical variables in bivariate analysis (Table 2). Pearson's χ2 was used to test relationships and a p-value<0.05 was considered statistically significant and was also used as the threshold to include variables in the logistic regression (see below).
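As an illustration of the bivariate step, the sketch below computes a crude OR with a 95% CI and Pearson's χ2 for a single 2×2 table. It is not the authors' code: the log-based (Woolf) interval is an assumption, since the text does not state which interval method was used, and the function names are hypothetical. For one degree of freedom the χ2 p-value can be obtained from the complementary error function.

```python
import math


def crude_or_with_ci(a, b, c, d, z=1.96):
    """Crude odds ratio with a Woolf (log-based) 95% CI.

    a: exposed EPTB, b: exposed PTB, c: unexposed EPTB, d: unexposed PTB.
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


def pearson_chi2_2x2(a, b, c, d):
    """Pearson's chi-square (no continuity correction) and p-value, 1 degree of freedom."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p_value = math.erfc(math.sqrt(chi2 / 2))  # survival function of chi-square, 1 df
    return chi2, p_value
```

Calling `crude_or_with_ci(40, 60, 20, 80)` on these made-up counts returns the point estimate together with its interval; a variable would pass the screening threshold described above when the p-value from `pearson_chi2_2x2` is below 0.05.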
Two multivariable regression analyses were performed. Binary logistic regression was first used to identify variables independently associated with exclusive EPTB disease (Table 2). Multinomial logistic regression was used to compare the risk factors for different forms of EPTB, using PTB as a reference (Table 3). To simplify the presentation, the results of multinomial regression are shown without adjustment for interaction. Adjusted ORs (aORs) were used to quantify the magnitude of associations.
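To make the link between a logistic regression coefficient and an odds ratio concrete, the following sketch fits logit(p) = b0 + b1·x by Newton-Raphson for a single binary covariate and grouped counts. It is a simplified stand-in, not the analysis reported here, which used R's glm with several covariates; the function name and the counts in the usage note are hypothetical. With one binary covariate and no adjustment, exp(b1) reproduces the crude OR; an aOR is the same quantity read from a model that also contains the other covariates.

```python
import math


def fit_logistic_binary(events0, total0, events1, total1, iterations=25):
    """Fit logit(p) = b0 + b1*x for binary x using Newton-Raphson on grouped data.

    events0/total0: EPTB cases and all cases with x = 0;
    events1/total1: EPTB cases and all cases with x = 1.
    """
    groups = [(0.0, events0, total0), (1.0, events1, total1)]
    b0 = b1 = 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, events, total in groups:
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            resid = events - total * p      # score contribution
            w = total * p * (1.0 - p)       # information weight
            g0 += resid
            g1 += resid * x
            h00 += w
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        # Newton step: beta <- beta + H^{-1} g for the 2x2 information matrix H.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1
```

With 20 EPTB cases among 100 unexposed and 40 among 100 exposed, `math.exp(b1)` from `fit_logistic_binary(20, 100, 40, 100)` agrees with the cross-product ratio (40 × 80) / (60 × 20).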
Maps were created with R (using the ggplot2 package) working within RStudio [17]. Logistic regression was done with R (glm function) and multinomial regression with Stata version 12.1 (StataCorp, College Station, Texas).

Description
The thirty-one EU/EEA countries reported a total of 913,637 TB cases, fairly evenly spread over the 12-year period (Table 1). Half the included cases were from Romania (31%), Poland (11%) and the United Kingdom (11%). Out of all cases reported, 69% were exclusively PTB, 20% were exclusively EPTB, 6% had both PTB and EPTB, and 5% had an unknown site. Culture confirmation was more frequent in PTB (69%) than in exclusive EPTB cases (33%), half of whom had no culture result. Multidrug resistance was confirmed in 4% of PTB cases and 1% of EPTB, with 36% overall having no result. HIV infection was recorded in 1% overall but 87% of cases had no information.

Risk factors
After exclusion of cases without key data, 564,916 records (62%; Table 2; Figs 1 and 3) from twenty-seven countries were retained. The distribution of the variables of interest among retained cases was similar to that of the whole population before restriction; the only exception was the proportion of cases from the Central European EU countries (53% in complete dataset and 74% after exclusions; Tables 1 and 2). Seventeen percent of cases had exclusively EPTB. Children and the elderly accounted for smaller proportions of the PTB cases (2% and 17% respectively) than of the EPTB cases (10% and 20%). The ratio of males to females was 1.2 among the EPTB cases, but was 2.2 among the PTB cases. More of the EPTB cases were previously untreated for TB (95%) than PTB cases (81%). Only 6% of PTB cases originated from outside Europe, with most reported from Romania and other Central European EU countries. In contrast, a larger proportion of EPTB cases originated from Western Europe (23%), Africa (9%) and Asia (8%). Among the 95,003 exclusively EPTB cases, the most frequent localizations reported were pleural (40%), lymphatic (29%), osteoarticular (9%), and genito-urinary (6%) (Table 3). The more severe forms of EPTB, CNS (3%) and disseminated (1%), were relatively uncommon.

Analysis
At bivariate analysis, the strongest, statistically significant positive associations with EPTB were observed with being a child (OR = 5.84), not previously treated for TB (4.28; 0.23 if previously treated), being from Africa (2.09 compared with Western Europe) or the Indian subcontinent (3.25), or female (1.85) (Table 2). The adjusted ORs from logistic regression reproduced the magnitude and direction of associations observed at bivariate analysis. Year of report bore no relationship to the site of disease.
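Several of the summary figures in this section can be cross-checked with simple arithmetic. The sketch below is an illustrative check using only the rounded values quoted in the text, so small discrepancies with the published ORs are expected; the variable names are hypothetical. The crude OR for female sex follows from the two male-to-female ratios because (F/M in EPTB) / (F/M in PTB) = (M/F in PTB) / (M/F in EPTB).

```python
# Counts quoted in the text.
notified = 913_637
retained = 564_916
eptb_retained = 95_003

retained_share = retained / notified       # reported as 62%
eptb_share = eptb_retained / retained      # reported as 17%

# Male-to-female ratios: 2.2 among PTB cases, 1.2 among EPTB cases.
# Approximate crude OR of EPTB for females versus males (reported: 1.85).
or_female_approx = 2.2 / 1.2

# The OR for previously treated cases is the reciprocal of the OR for
# previously untreated cases (reported: 4.28 and 0.23).
or_previously_treated = 1 / 4.28

print(f"retained: {retained_share:.1%}, EPTB among retained: {eptb_share:.1%}")
print(f"approx. crude OR, female sex: {or_female_approx:.2f}")
print(f"OR if previously treated: {or_previously_treated:.2f}")
```

The female-sex estimate obtained this way differs slightly from the reported 1.85, which is consistent with the ratios having been rounded to one decimal place.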
At multinomial regression (Table 3), an origin from the Indian subcontinent or Africa was most strongly associated with lymphatic, osteoarticular and peritoneal/digestive localization (aOR at least 3.83). Age <15 years was most strongly associated with lymphatic and central nervous system disease (aOR at least 11.41), but all types of EPTB were more likely in childhood, with the exception of genito-urinary TB, which was associated with age 45-64 years and >64 years (aORs 2.43 and 3.35 respectively). Female sex was significantly associated with all forms of EPTB, especially lymphatic forms (aOR 2.55). Patients with a previous history of TB treatment were less likely to have EPTB, regardless of form (aORs ranging from 0.15 in pleural to 0.59 in osteoarticular).

Discussion
We explored pooled, case-based surveillance data reported by EU/EEA countries for twelve recent years to identify factors associated with the site of disease in notified TB patients. We observed that EPTB was more likely at the extremes of age, in females, in individuals not previously treated for TB, and in patients originating from Africa or Asia. These findings are important for the post-2015 End TB Strategy with its stress on treating all forms of TB [18].
Our analysis quantifies the differential effects of demographic and clinical factors on the disease manifestation using a multinational routine surveillance database which has been consolidated over several years through the long-standing commitment of a vigorous network of public health workers [11]. Two important strengths of the dataset we used were its diversity in terms of patient mix in the countries of the EU/EEA and its size: more than half a million cases from 27 countries could be included in the main analysis. This allowed, for instance, a grouping of patients into geographical subsets in which meaningful patterns of TB localization could be discerned. These included regions of the world from where such information is otherwise sparse.
The effect of age and sex on site of TB disease that we report concurs with what has been described from various settings, albeit with a much larger geographic span and number of observations [19–26]. In our study, the risk for most types of EPTB was associated with the extremes of age (<15 and >64 years), possibly reflecting the performance of an immune system which is still maturing in infancy and childhood or declining with advancing age. The associations we have described between EPTB and female sex have been reported in different parts of the world [9,27,28]. They may be influenced in part by a higher frequency of smoking in males, which is known to predispose to PTB [27].
Our analysis has shown a clear association between certain forms of EPTB and the geographic origin of patients. After adjustment for age, sex, and TB treatment history, cases originating from the Indian subcontinent, Africa, and the rest of Asia were more likely to develop EPTB than cases originating from Europe and the Americas, with aORs ranging between 1.5 and 3.7. For lymphatic, osteoarticular and peritoneal/digestive EPTB, the association with an origin from the Indian subcontinent was the highest of all regional variations observed, with aORs ranging from 6.4 to 8.8. The association between EPTB and ethnicity has been demonstrated in other studies, suggesting a relationship between the genetic background of the host and some M. tuberculosis genotypes which are more prevalent in some geographical areas [29,30]. Host predisposition for TB and for certain forms of disease based on human genetic characteristics has been described, although the relationship with any particular site was not strong [31,32].
In the dataset we analysed, severe forms of EPTB (CNS and disseminated) were relatively uncommon. Challenges remain to diagnose EPTB, including in well-resourced, high-income settings where TB is becoming increasingly rare [33]. Delays in diagnosis and treatment of EPTB predispose to progression of disease. Maintaining a higher index of suspicion thus remains important, bearing in mind the protean manifestations of TB. The use of appropriate diagnostics is key, particularly molecular techniques which are now validated for the diagnosis of several forms of EPTB and for use in children [34,35].
Some limitations in our analysis should be highlighted. We had no means of independently verifying the quality of reporting or any differences in the validity of reports between countries and over time. However, most institutions belonging to this European network have been using a fairly standardised reporting methodology for many years. The variables in the ECDC TESSy database do not include those for certain risk factors that are known to influence progression and form of TB disease, such as the delay to start of treatment, adequacy of treatment administered, diabetes, cancer, and other comorbidities, immunosuppressive therapy, tobacco smoking, and alcohol consumption [20,21,36–39]. Moreover, incomplete information on susceptibility to isoniazid and rifampicin and on HIV status precluded the investigation of their potential associations with TB localisation within the scope of our analysis. While it may be difficult to harmonise variables about modifiable risk factors like alcohol use and smoking within multinational databases such as TESSy, it may be feasible to do so at national level and explore associations which could guide frontline staff in their work. Finally, it may be useful in the future to use TB strain genotype data being collated within TESSy to explore transmission at the molecular level.
In conclusion, it is likely that the predilection for TB site of disease is determined by a complex interplay between the microbiology and host genetic, behavioural, clinical and demographic features [40,41]. Our analysis sheds light on the determinants of TB localisation from a surveillance dataset spanning a broad swathe of the global population, extending well beyond the high- and middle-income settings of the western and central European countries reporting the data. The fact that our findings correspond to those described elsewhere attests to the utility of routinely collected data for planning and decision-making. It is likely that EPTB will remain a topic of discussion in TB care and prevention, especially given that issues such as early case finding, childhood TB, improved diagnostics, and more effective treatment for all forms of TB remain integral to the implementation of the post-2015 End TB Strategy and to TB elimination in low-incidence settings [18,42].