A Novel Scoring System to Measure Radiographic Abnormalities and Related Spirometric Values in Cured Pulmonary Tuberculosis

Background Despite chemotherapy, patients with cured pulmonary tuberculosis may be left with lung functional impairment.
Objective To evaluate a novel scoring system based on the degree of radiographic abnormalities and related spirometric values in patients with cured pulmonary tuberculosis.
Methods One hundred and twenty-seven patients with cured pulmonary tuberculosis were prospectively enrolled in a referral hospital specializing in respiratory diseases. Spirometry was performed and the extent of radiographic abnormalities was evaluated twice by each of two readers to generate a novel quantitative score. Scoring reproducibility was analyzed by the intra-class correlation coefficient (ICC) and the Bland-Altman method. Multiple linear regression models were fitted to assess the association of the extent of radiographic abnormalities with spirometric values.
Results The intra-observer agreement for scoring of radiographic abnormalities (SRA) showed an ICC of 0.81 (CI:95%, 0.67–0.95) and 0.78 (CI:95%, 0.65–0.92), for reader 1 and 2, respectively. Inter-observer reproducibility for the first measurement was 0.83 (CI:95%, 0.71–0.95), and for the second measurement was 0.74 (CI:95%, 0.58–0.90). The Bland-Altman analysis of the intra-observer agreement showed a mean bias of 0.87% and -0.55% and an inter-observer agreement of -0.35% and -1.78%, indicating a minor average systematic variability. After adjustment for age, gender, height, smoking status, pack-years of smoking, and degree of dyspnea, the scoring degree of radiographic abnormalities was significantly and negatively associated with absolute and percent predicted values of FVC: -0.07 (CI:95%, -0.01 to -0.04); -2.48 (CI:95%, -3.45 to -1.50); and FEV1 -0.07 (CI:95%, -0.10 to -0.05); -2.92 (CI:95%, -3.87 to -1.97) respectively, in the patients studied.
Conclusion The extent of radiographic abnormalities, as evaluated through our novel scoring system, was inversely associated with spirometric values, and exhibited good reliability and reproducibility. As intra-observer and inter-observer agreement of the SRA varied from good to excellent, the use of SRA in this setting appears acceptable.


Introduction
Globally, tuberculosis remains a significant cause of high morbidity and mortality, with 1.4 million deaths and 8.7 million new cases recorded in 2011. Tuberculosis is also a significant health issue in Mexico, where an estimated minimum of 18 persons per 100,000 inhabitants develop tuberculosis each year [1]. In 2011, 15,843 new cases of pulmonary tuberculosis were identified, representing 81.6% of all cases of tuberculosis recorded that year [2]. The pathologic hallmark of tuberculosis is the formation of granulomas in the affected tissues, which is considered beneficial to the host and an important defense mechanism required to confine and control the infection. The cellular defense mechanisms in the host, which drive granuloma formation in tuberculosis, contribute to both clearance of the infection and tissue destruction. Healing occurs with progressive destruction of the parenchyma and variable degrees of fibrotic response [3]. In this context, lung remodelling in tuberculosis refers to anatomical and structural changes that are not easily reversed (laying down of extracellular matrix), in contrast to reversible changes, such as edema and cellular infiltration. Immune response to infection contributes to residual cavitation, lung fibrosis or scarring, and distortion of lung architecture, leading to volume loss and bronchiectasis [4].
Despite chemotherapy, lung remodelling in tuberculosis may result in variable degrees and patterns of lung functional impairment with significant morbidity [5][6][7]. In fact, treated pulmonary tuberculosis is considered one of the non-traditional risk factors for chronic obstructive pulmonary disease (COPD) [8], and in tuberculosis endemic areas, cured pulmonary tuberculosis contributes to the prevalence of COPD as defined by Global Initiative for Chronic Obstructive Lung Disease (GOLD) criteria [9,10]. In addition, two population-based surveys from Latin American middle-aged and older adults demonstrated that tuberculosis is associated with airflow obstruction [11,12].
We hypothesized that the degree of radiographic abnormalities, as detected by chest radiography, is associated with spirometric values: forced vital capacity (FVC) and forced expiratory volume in one second (FEV1).
The aim of this study was to evaluate a novel scoring system based on the degree of chest radiographic abnormalities and the related spirometric values in patients with cured pulmonary tuberculosis.

Materials and Methods
The study was approved by the institutional Ethical and Research Review Board of the "Instituto Nacional de Enfermedades Respiratorias". Written informed consent was obtained from each participant, and confidentiality was ensured.

Design, study setting, and population
This is a cross-sectional study. Patients were enrolled over a 30-month period, from 2006 to 2008, at a respiratory disease-dedicated Tuberculosis Clinic affiliated with a tertiary-care hospital in Mexico City, which mainly receives patients from the metropolitan area of Mexico City and neighboring states. All patients with pulmonary tuberculosis who had completed antituberculosis treatment, whose cure was confirmed by negative sputum bacilloscopy and/or Mycobacterium tuberculosis culture, and who consented in writing to participate in the study were eligible for inclusion. We excluded patients who were too physically impaired to perform spirometry; active smokers; ex-smokers who had stopped smoking less than 6 months before being considered for the study; individuals exposed to smoke from biomass fuels or with a history of occupational exposure to industrial fumes; patients with pleural tuberculosis; and patients with any associated comorbidity that may have caused previous structural lung damage or functional limitation, such as interstitial lung disease, asthma, or COPD.
Patients were treated according to the guidelines of Mexico's National Tuberculosis Control Program [13].

Study protocol, variables, and instruments of measurement
At recruitment, information on general characteristics, comorbidities, smoking history, and respiratory symptoms was obtained. Tuberculosis history and details of data on tuberculosis treatment and microbiological status were obtained from clinical charts. The degree of dyspnea was assessed according to the Medical Research Council (MRC) dyspnea scale [14].

Pulmonary function testing
Spirometry, both pre- and post-bronchodilation, was performed in our pulmonary function laboratory using standard procedures for grading quality of the test and its interpretation [15][16][17].
Spirometry was performed with a PB 100 spirometer (Puritan Bennett, Lenexa, KS, USA) and EasyOne Plus Diagnostic Spirometry System SN: 46563/2002. The subject (seated and wearing nose clips) performed a forced exhalation, which yielded the forced expiratory volume in the first second (FEV1), peak expiratory flow (PEF), and forced vital capacity (FVC). Each subject was allowed to perform up to 15 forced expiratory maneuvers, in order to obtain three acceptable maneuvers with the highest FEV1 and FVC values reproducible within 150 mL. All tests were administered by specifically trained personnel.
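The repeatability criterion described above can be sketched in a few lines. This is only an illustration, assuming that "reproducible within 150 mL" means the two largest values of each index among the acceptable maneuvers differ by no more than 0.150 L; the function names and the list-of-litres data layout are hypothetical and are not taken from the study's software.

```python
def is_repeatable(values_l, tolerance_l=0.150):
    """True if the two largest values (litres) differ by at most tolerance_l.

    Assumption: repeatability is judged on the two largest values only.
    """
    if len(values_l) < 2:
        return False
    largest, second = sorted(values_l, reverse=True)[:2]
    # Small epsilon so that a difference of exactly 150 mL is not
    # rejected because of floating-point rounding.
    return (largest - second) <= tolerance_l + 1e-9


def session_is_repeatable(fev1_l, fvc_l):
    """True if at least three acceptable maneuvers exist and both
    FEV1 and FVC meet the 150 mL criterion."""
    enough_maneuvers = len(fev1_l) >= 3 and len(fvc_l) >= 3
    return enough_maneuvers and is_repeatable(fev1_l) and is_repeatable(fvc_l)
```

For example, FVC values of 3.00, 3.10, and 3.05 L from three acceptable maneuvers would pass, because the two largest values differ by 50 mL.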
The spirometry data were classified according to acceptability and reproducibility in a quality control code from A to F as described in Table 1 [18].
Tuberculosis Radiographic Sequelae and Spirometry PLOS ONE | www.plosone.org
The following values were recorded: FEV1, FVC, and their ratio (FEV1/FVC). Salbutamol (200 µg) was administered from a calibrated dose inhaler, and spirometry was repeated after 15 minutes. A significant bronchodilator response was defined as an increase of 200 mL and 12% in either the post-bronchodilator FEV1 or FVC. All measurements were expressed as both percentage of predicted normal (% predicted) and absolute values. The reference equations used to calculate the % predicted spirometry values were obtained from Pérez-Padilla [19]. We used the lower limit of normal to interpret spirometric patterns, as suggested by the ERS/ATS guidelines, and graded the degree of spirometric impairment by the % predicted FEV1 as: mild, >70%; moderate, 60-69%; moderately severe, 50-59%; severe, 35-49%; and very severe, <35% [16,17].
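The two rules in this paragraph (bronchodilator response and severity grading) can be expressed as a short sketch. The thresholds come from the text; the function names are hypothetical, and the handling of boundary values (for example, exactly 70% or a value such as 69.5%) is an assumption, since the text lists integer bands only.

```python
def significant_bd_response(pre_l, post_l):
    """True if the post-bronchodilator value rose by at least 200 mL
    AND at least 12% relative to the pre-bronchodilator value.
    Apply separately to FEV1 and to FVC; either one qualifying is enough."""
    change_l = post_l - pre_l
    change_pct = 100.0 * change_l / pre_l
    return change_l >= 0.200 and change_pct >= 12.0


def impairment_grade(fev1_pct_predicted):
    """Severity grade from % predicted FEV1.

    Assumption: each band is closed at its lower bound, so 70.0 is 'mild'
    and 69.5 is 'moderate'. The grade is meant for spirometries already
    classified as abnormal by the lower limit of normal.
    """
    if fev1_pct_predicted >= 70:
        return "mild"
    if fev1_pct_predicted >= 60:
        return "moderate"
    if fev1_pct_predicted >= 50:
        return "moderately severe"
    if fev1_pct_predicted >= 35:
        return "severe"
    return "very severe"
```

A rise in FEV1 from 2.00 L to 2.30 L (300 mL, 15%) would count as a significant response, whereas a rise from 3.00 L to 3.25 L (250 mL, about 8%) would not, because both conditions must hold.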

Development of the scoring system for grading radiographic abnormalities
We used routine film posterior-anterior chest radiographs as a non-invasive technique to quantify the extent of lung remodelling in our patients. Each chest radiograph was assessed for the presence, distribution, and extent of pulmonary abnormalities, such as airspace consolidation and fibrosis, lung distortion, traction bronchiectasis, irregular interfaces, and parenchymal bands. We developed a quantitative scale to measure the degree of radiographic abnormalities. The pulmonary parenchyma was evaluated in four quadrants, with the division between the upper and lower lung in both sides being arbitrarily set at the carina section. Each quadrant was scored from 0 to 5, where 0 indicated a normal appearance, and 5 indicated severe abnormality. The score represented the percentage of lung parenchyma involvement. The maximum score for the four lung zones was 20 (Figures 1 and 2). The same image was read twice separately by two experienced observers (pulmonologist researcher RBS, reader one; and radiologist RCP, reader two) who were blinded to clinical or lung functional information. The time elapsed between the first and second measurements was two weeks.
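The arithmetic of the score can be sketched as follows. This is a minimal illustration, assuming integer quadrant scores; the quadrant labels and function names are hypothetical. The mapping from percentage of parenchymal involvement to each 0-5 grade is not spelled out in the text, so it is deliberately left out. The averaging helper reflects the later statement that the score used for association analysis is the mean of four measurements, read here as two readers times two sessions.

```python
QUADRANTS = ("right_upper", "right_lower", "left_upper", "left_lower")


def sra_total(quadrant_scores):
    """Sum the four quadrant scores (each 0-5) into a total of 0-20."""
    if set(quadrant_scores) != set(QUADRANTS):
        raise ValueError("expected exactly the four quadrants")
    for name in QUADRANTS:
        value = quadrant_scores[name]
        if not isinstance(value, int) or not 0 <= value <= 5:
            raise ValueError(f"{name}: score must be an integer from 0 to 5")
    return sum(quadrant_scores[name] for name in QUADRANTS)


def mean_sra(totals):
    """Average the four readings of one radiograph
    (assumed: two readers, two sessions each)."""
    if len(totals) != 4:
        raise ValueError("expected four readings")
    return sum(totals) / 4
```

For instance, quadrant scores of 3, 1, 4, and 0 give a total of 8 out of 20.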

Medical Research Council (MRC) Dyspnea Scale
The MRC dyspnea scale was used to identify patients according to their level of perceived breathlessness. We used the revised 5-grade version of the MRC dyspnea scale [14].

Statistical Analysis
We reported the means and standard deviations (SD) of normally distributed values, and the medians and interquartile ranges (IQR) of non-normally distributed values. Categorical variables were summarized using absolute numbers and percentages.
We measured the intra-observer and inter-observer concordance using the intra-class correlation coefficient (ICC) to evaluate the reliability and reproducibility of chest radiograph scoring. The Bland-Altman analysis [20] with 95% limits of agreement and Pitman's test of difference in variance were used to test the intra-observer and inter-observer agreement for scoring of radiographic abnormalities for the first and second measurements.
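The two agreement measures can be sketched for a pair of score lists (two readers, or two sessions of one reader). This is an illustration only: the analyses in the study were run in Stata, the text does not state which ICC form was used, and the two-way random-effects, absolute-agreement, single-measurement form, ICC(2,1), is assumed here. Pitman's test is not implemented, and the function names are hypothetical.

```python
from statistics import mean, stdev


def bland_altman(a, b):
    """Return (bias, lower limit, upper limit) for paired scores,
    with 95% limits of agreement at bias +/- 1.96 SD of the differences."""
    diffs = [x - y for x, y in zip(a, b)]
    bias = mean(diffs)
    sd = stdev(diffs)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


def icc_2_1(a, b):
    """ICC(2,1) for two sets of ratings of the same subjects.

    Assumption: two-way random effects, absolute agreement, single rating.
    """
    n, k = len(a), 2
    values = list(a) + list(b)
    grand = mean(values)
    row_means = [(x + y) / 2 for x, y in zip(a, b)]
    col_means = [mean(a), mean(b)]
    ss_rows = k * sum((r - grand) ** 2 for r in row_means)
    ss_cols = n * sum((c - grand) ** 2 for c in col_means)
    ss_total = sum((v - grand) ** 2 for v in values)
    ss_error = ss_total - ss_rows - ss_cols
    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )
```

A bias close to zero with narrow limits indicates little systematic difference between the two sets of readings, while an ICC close to 1 indicates that most of the variance in scores comes from differences between patients rather than between readings.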
The reported score for association analysis is the mean of four measurements.
To study the relation of radiographic abnormality scores to spirometric values (FVC and FEV1), Pearson's correlation test was applied. Simple linear regression was used to quantify unadjusted associations between spirometric values (FVC and FEV1) and our scoring of radiographic abnormalities, age, gender, height (as a continuous variable), years of education (as a continuous variable), smoking status (former vs. never), pack-years of smoking (as a continuous variable), comorbidity, time of the disease (as a continuous variable), degree of dyspnea, and oxygen pulse saturation (as a continuous variable). All of the individuals in our study population are mestizos (as is most of the Mexican population); therefore, we did not adjust for race/ethnicity.
Multiple linear regression was used to assess the cross-sectional relationship between our radiographic abnormality scoring system and spirometric values (FVC and FEV1) as the response variable. A forward selection model with a p-value entry criterion of 0.05 was used to create adjusted models, using the following covariates: age, gender, height, smoking status (former vs. never), pack-years of smoking (as a continuous variable), degree of dyspnea, and oxygen pulse saturation (as a continuous variable). Factors for which there were statistically significant associations on adjusted models were chosen as covariates in subsequent adjusted models.
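The unadjusted analysis (Pearson's correlation and simple linear regression of a spirometric value on the radiographic score) can be sketched as below. This is a plain-Python illustration rather than the Stata procedure used in the study; the function names are hypothetical and the data in the usage note are made up.

```python
from math import sqrt


def _centered_sums(x, y):
    """Return (mean x, mean y, Sxy, Sxx, Syy) for paired observations."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return mx, my, sxy, sxx, syy


def pearson_r(x, y):
    """Pearson correlation coefficient between x and y."""
    _, _, sxy, sxx, syy = _centered_sums(x, y)
    return sxy / sqrt(sxx * syy)


def simple_ols(x, y):
    """Least-squares fit y = intercept + slope * x; returns (slope, intercept)."""
    mx, my, sxy, sxx, _ = _centered_sums(x, y)
    slope = sxy / sxx
    return slope, my - slope * mx
```

With x as the radiographic score and y as FVC in litres, a negative slope is read as the average change in FVC per one-point increase in score. In a made-up example where FVC falls from 3.5 L at score 0 to 2.0 L at score 15 along a straight line, the slope is -0.1 L per point.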
We examined models separately for absolute values and % predicted FVC and FEV1 values. Multicollinearity was a minor issue for most variables; most variance inflation factors (VIF) were smaller than 3.07. Assumptions of linearity, homoscedasticity, and normally distributed error terms were met for the sample.
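To make the idea of an adjusted coefficient concrete, the sketch below fits a multiple linear regression by solving the normal equations in plain Python. It is an illustration under stated assumptions, not the authors' Stata procedure: forward selection, confidence intervals, and variance inflation factors are not implemented, the function names are hypothetical, and the data in the usage note are synthetic.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular matrix: predictors may be collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ols(predictor_rows, y):
    """Ordinary least squares with an intercept.

    predictor_rows: one tuple of covariate values per subject.
    Returns [intercept, coef_1, coef_2, ...].
    """
    x = [[1.0] + list(row) for row in predictor_rows]
    p = len(x[0])
    xtx = [[sum(xi[i] * xi[j] for xi in x) for j in range(p)] for i in range(p)]
    xty = [sum(xi[i] * yi for xi, yi in zip(x, y)) for i in range(p)]
    return solve_linear_system(xtx, xty)
```

If each row holds (score, age) and y holds FVC in litres, the second returned value is the change in FVC per one-point increase in score with age held constant, which is how the adjusted coefficients reported in the Results are read.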
All analyses were performed with statistics software (Stata 12.0, StataCorp, College Station, TX, USA).

Results
One hundred and twenty-seven subjects were enrolled in the study. Participants were mostly middle-aged women with up to 10 years of formal schooling. One-third of the population were ex-smokers. The median (IQR) of pack-years of smoking was 1.65 (0.3-5). Almost one-half of the participants had underlying medical conditions, including diabetes (33.07%, n = 42), hypertension (6.30%, n = 8), HIV infection (2.36%, n = 3), and hepatic cirrhosis (0.79%, n = 1). Most participants (85%) reported having suffered only one episode of tuberculosis. Median (IQR) time elapsed from the end of anti-tuberculosis treatment to inclusion in the study was 11 (6-18) months. Fifty-seven percent (n = 73) of participants reported some degree of dyspnea. MRC dyspnea grades 1 and 2 were reported most often, in 39% (n = 49) and 14% (n = 18) of cases, respectively (Table 2).
In 77 subjects, the time span between the start of disease symptoms and the moment of tuberculosis diagnosis was evaluated. The median (IQR) was significantly larger in the group with abnormal spirometric patterns than in the group with normal spirometric patterns: 165 (75-360) and 90 (30-180) days, respectively (p < 0.05).
The mean (SD) of the four radiographic abnormalities scores was 6.46 (4.14).

Bland-Altman Plot analysis
The intra-observer variability of radiographic abnormality scoring by reader 1 (RBS) showed a mean bias of 0.87%, indicating a minor average systematic variability (Pitman's test; p <0.05). The intra-observer variability for reader 2 (RCP) showed a minimal mean bias of -0.55% and no significant differences between the two evaluations (Pitman's test; p >0.05), which indicates high agreement (Figure 3, A and B).
A comparison of radiographic abnormality scores assessed between reader 1 and reader 2 for the first measurement showed a minimal difference (-0.35%) and a high agreement between the two readers; Pitman's test showed that there is no significant (p >0.05) difference between the measuring errors by the two readers. In contrast, the results for the second measurement showed moderate agreement (mean bias of -1.78%) and significant variation between the two readers (Pitman's test; p <0.05) (Figure 4, A and B).
Pearson's correlation test demonstrated a significant negative correlation between absolute and % predicted normal values of FVC and FEV1 (L) and the scoring of radiographic abnormalities (Figure 5).
Simple linear regression analysis showed that scoring of radiographic abnormalities, age, gender, and MRC dyspnea scale were negatively associated with absolute values of FVC (Tables 4 and 5). For % predicted normal values of FVC, SRA and dyspnea MRC3 were negatively associated, while oxygen pulse saturation was positively associated. The same analysis for FEV1 revealed that SRA and dyspnea MRC3 were negatively associated, whereas oxygen pulse saturation was positively associated (Tables 6 and 7). Multiple linear regression models revealed that after adjusting for age, gender, height, smoking status, pack-years of smoking, degree of dyspnea, and oxygen pulse saturation, the degree of radiographic abnormalities was independently associated with absolute values of FVC (0.07 L decrease for each unit increase in score of lung damage; CI: 95%, -0.01 to -0.04; p <0.001) and FEV1 (0.07 L decrease for each unit increase in score of lung damage; CI: 95%, -0.10 to -0.05; p <0.001), and with % predicted values of FVC (2.48% decrease for each unit increase in score of lung damage; CI: 95%, -3.45 to -1.50; p <0.001) and FEV1 (2.92% decrease for each unit increase in score of lung damage; CI: 95%, -3.87 to -1.97; p <0.001) (Tables 8 and 9).
Additional adjustment for smoking status (former vs. never), pack-years of smoking, height, and age changed parameter estimates by less than 3% for absolute and % predicted values of FVC and FEV1, suggesting that those variables were not confounders (Tables 10 and 11).
None of the interactions between age and the scoring of radiographic abnormalities were significant.

Discussion
Figure 3. Bland-Altman plot of measurement differences against measurements average with a 95% limit of agreement superimposed for pair-wise comparisons of scoring of radiographic abnormalities (SRA), for intra-observer agreement: A) between reader one for the first and second measurements, B) between reader two for the first and second measurements.
doi: 10.1371/journal.pone.0078926.g003

Our results indicate that after adjustment for age, height, smoking status (former vs. never), pack-years of smoking, and degree of dyspnea, the scoring degree of radiographic abnormalities was significantly and inversely associated with FVC and FEV1 in the studied patients with cured pulmonary tuberculosis. This association was independent of the reader, as evidenced by the good reproducibility of the data, and that intra-observer and inter-observer agreement of the SRA varied from good to excellent.
Moreover, our findings confirm that radiographic abnormalities, lung functional impairment, and dyspnea are common in patients from our institution who have successfully completed tuberculosis treatment. We demonstrated that 96.85% (123/127) of patients had some degree of radiographic abnormality, 41% (52/127) had impaired lung function, and 57% (73/127) had some degree of dyspnea as well.
While our study is cross-sectional, the demonstrated associations are in keeping with previous research, which indicates that cured pulmonary tuberculosis causes variable degrees and patterns of lung functional impairment [5,6,21]. A correlation between lung structure-evaluated by conventional computerized tomography-and lung function in pulmonary tuberculosis has also been demonstrated [22].
In addition, Plit and colleagues [23] reported that although antituberculosis treatment improved lung function in patients with pulmonary tuberculosis, a large proportion of patients experience residual impairment: 28% develop airflow limitation and 24% develop a restrictive pattern. They also demonstrated that lung damage-evaluated through radiographic scores-is a factor that influences post-treatment lung function.
Normal spirometric patterns, observed in 59% (n = 75) of participants, were the most prevalent in our study. Among the patients with spirometric impairment, an obstructive pattern was the most prevalent, as observed in 57.69% (n = 30/52) of cases, followed by a restrictive pattern in 42.31% (n = 22/52) of cases. This contrasts with another study, in which mixed ventilatory disorder was the most prevalent (34%), followed by obstructive (24%) and restrictive (18%) patterns, and in which only 24% of the sample showed a normal spirometry pattern [24].
Long delays in diagnosis and treatment occur in Mexico, further aggravating not only the transmission of the disease, but also the possibility of increased lung damage, which, according to our results, seems to affect lung function negatively. We demonstrated that the median (IQR) time span between the onset of disease symptoms and tuberculosis diagnosis was significantly longer in patients with lung functional impairment than in patients without lung functional impairment. Our results contrast with a recent publication by Vecino et al., who showed that pulmonary impairment after tuberculosis is not related to the delay in tuberculosis diagnosis or treatment and does not change significantly during follow-up [21].
Our findings are in agreement with previous reports, conducted in different settings, such as population-based studies, that demonstrated a strong association between pulmonary tuberculosis and subsequent impairment in lung function [25,26]. In previous studies, considerable variability in frequency and patterns of lung functional impairment among patients with pulmonary tuberculosis is most probably due to differences in patient characteristics, such as previous treatment schedule, smoking status, and varying intervals of treatment onset.
We found a heterogeneous spectrum of radiographic abnormalities, including parenchymal fibrosis, bronchiectasis, and emphysema (data not shown). The mechanism of airway obstruction in our patients may be partial destruction inside the airways and/or of lung parenchyma, causing loss of radial traction and consequent narrowing of the airway. The restrictive pattern that we also observed may be due to alterations in lung parenchyma and/or pleural involvement.
The mean of the differences between equations for all cases of the inter-observer and intra-observer evaluations of SRA ranged from -1.78 to 0.87, indicating a minor average systematic variability. The best agreement occurred between the readers for the first measurement of SRA, with a mean difference of -0.354 (Pitman's test, p >0.05), and the greatest variability occurred for the second measurement of SRA, with a mean difference of -1.78 (Pitman's test, p <0.05).
Our results showed that the use of our scoring system for evaluating the degree of radiographic abnormality predicted lung functional impairment; in this context, measurement of lung functional impairment in subjects with cured pulmonary tuberculosis may be useful, in particular, in subjects with high degrees of radiographic abnormalities.
It is also important to assess the degree and impact of breathlessness in the overall clinical evaluation of these patients. However, because these patients may reduce their activities, a simple assessment of dyspnea may be insufficient; hence, an assessment of functional status, utilizing a tool such as the MRC dyspnea scale, is necessary to fully evaluate the functional impact of radiographic abnormalities in cured pulmonary tuberculosis patients. This was demonstrated in a 6-month longitudinal study, conducted in 115 patients with smear-positive pulmonary tuberculosis from Papua, Indonesia, whose permanent lung damage influenced exercise tolerance and quality of life in a region with a high number of pulmonary tuberculosis cases and chronic lung dysfunction [27].
From the operational point of view, our study had the advantage of using a low-cost strategy to evaluate lung function. We were able to document an association between the degree of radiographic abnormalities and spirometric values in a single visit conducted after treatment completion. Previous studies have used prospective designs or have evaluated lung damage by conventional computerized tomography [21,23]. Both methods entail problems with patient compliance and higher costs, which are particularly significant in low and medium resource settings. Despite our rather simple design, our results are similar to those shown in large population-based studies, prospective designs, or studies that have used computerized tomography to evaluate lung damage [21,23-26].

Strengths and limitations
Some strengths and limitations of this study must be addressed. To date, there are few studies that link radiographic abnormalities with pulmonary function in cured pulmonary tuberculosis. The use of ICCs and Bland-Altman plots with limits of agreement to evaluate our SRA provides a robust examination of agreement. Our results were obtained from a selected population with cured pulmonary tuberculosis; therefore, we may assume that the lung functional impairment and dyspnea are due to the effects of tuberculosis-induced lung remodelling. Nevertheless, due to its cross-sectional design, a potential weakness of this study is the lack of spirometry data prior to tuberculosis; hence, we cannot state with certainty that pulmonary tuberculosis caused the observed spirometry changes, nor can we exclude that pulmonary tuberculosis caused some damage that reduced spirometric results while still leaving them defined as normal (for example, an FVC that fell from 110% of predicted to 81% of predicted).
Our use of chest X-ray instead of computed tomography allowed us to reduce costs without affecting our ability to demonstrate a relation between radiographic abnormalities and spirometric values. Moreover, we were able to document that this low cost and frequently available resource is useful and valid to determine extent of damage to lung parenchyma.

Future research
Additional studies are needed to investigate pathophysiological factors associated with lung remodelling, changes in pulmonary function, exercise tolerance, and health-related quality of life following pulmonary TB treatment. Understanding this relationship will be useful to assess the long-term impact of tuberculosis on patients' quality of life and to implement preventive and control measures.

Conclusion
This study showed that spirometric values were associated with the extent of radiographic abnormalities assessed by chest radiography using a simple, valid, and reproducible scoring method, whose use in this setting appears acceptable. Intra-observer and inter-observer agreement of the SRA varied from good to excellent. In patients with cured pulmonary tuberculosis from this population, spirometric impairment and dyspnea are common.