Differences and agreement between two portable hand-held spirometers across diverse community-based populations in the Prospective Urban Rural Epidemiology (PURE) study

  • MyLinh Duong,

    Roles Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Project administration, Resources, Supervision, Validation, Writing – original draft, Writing – review & editing

    duongmy@mcmaster.ca

    Affiliation Department of Medicine, Population Health Research Institute, McMaster University and Hamilton Health Sciences, Hamilton, Ontario, Canada

  • Sumathy Rangarajan,

    Roles Data curation, Funding acquisition, Methodology, Project administration, Resources, Supervision, Writing – original draft, Writing – review & editing

    Affiliation Department of Medicine, Population Health Research Institute, McMaster University and Hamilton Health Sciences, Hamilton, Ontario, Canada

  • Michele Zaman,

    Roles Formal analysis, Investigation, Methodology, Validation, Writing – original draft, Writing – review & editing

    Affiliation Department of Medicine, Population Health Research Institute, McMaster University and Hamilton Health Sciences, Hamilton, Ontario, Canada

  • Nafiza Mat Nasir,

    Roles Data curation, Project administration, Writing – review & editing

    Affiliation Faculty of Medicine, Universiti Teknologi MARA, Sungai Buloh Campus, Selangor, Malaysia

  • Pamela Seron,

    Roles Data curation, Supervision, Writing – review & editing

    Affiliation Facultad de Medicina, Universidad de La Frontera, Temuco, Chile

  • Karen Yeates,

    Roles Data curation, Supervision, Writing – review & editing

    Affiliation Pamoja Tunaweza Research Centre, Moshi, Tanzania

  • Afzalhussein M. Yusufali,

    Roles Data curation, Supervision, Writing – review & editing

    Affiliation Dubai Medical University, Hatta Hospital, Dubai Health Authority, Dubai, United Arab Emirates

  • Rasha Khatib,

    Roles Data curation, Supervision, Writing – review & editing

    Affiliation Advocate Aurora Research Institute, Milwaukee, IL, United States of America

  • Lap Ah Tse,

    Roles Data curation, Writing – review & editing

    Affiliation JC School of Public Health and Primary Care, The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong

  • Chuangshi Wang,

    Roles Data curation, Writing – review & editing

    Affiliation Medical Research & Biometrics Center, National Center for Cardiovascular Diseases, Fuwai Hospital, Chinese Academy of Medical Sciences, Beijing, China

  • Andreas Wielgosz,

    Roles Data curation, Investigation, Methodology, Supervision, Writing – review & editing

    Affiliation University of Ottawa Department of Medicine, Ottawa, Ontario, Canada

  • Koon Teo,

    Roles Data curation, Methodology, Supervision, Writing – review & editing

    Affiliation Department of Medicine, Population Health Research Institute, McMaster University and Hamilton Health Sciences, Hamilton, Ontario, Canada

  • Rajesh Kumar,

    Roles Data curation, Supervision, Writing – review & editing

    Affiliation State Health System Resource Center, Punjab, India

  • Alvaro Avezum,

    Roles Data curation, Supervision, Writing – review & editing

    Affiliation International Research Center, Hospital Alemão Oswaldo Cruz, São Paulo, SP, Brazil

  • Rosnah Ismail,

    Roles Data curation, Supervision, Writing – review & editing

    Affiliation Community Health Department, Faculty of Medicine, Universiti Kebangsaan Malaysia, Kuala Lumpur, Malaysia

  • Burcu Tumerdem Çalık,

    Roles Data curation, Writing – review & editing

    Affiliation Faculty of Health Sciences, Department of Health Management, Marmara University, Istanbul, Turkey

  • Soumya Gopakumar,

    Roles Data curation, Writing – review & editing

    Affiliation Health Action by People and Government Medical College, Thiruvananthapuram, Kerala, India

  • Omar Rahman,

    Roles Data curation, Supervision, Writing – review & editing

    Affiliation University of Liberal Arts Bangladesh, Dhaka, Bangladesh

  • Katarzyna Zatońska,

    Roles Data curation, Supervision, Writing – review & editing

    Affiliation Wroclaw Medical University Bujwida Wroclaw, Poland, EU

  • Annika Rosengren,

    Roles Data curation, Supervision, Writing – review & editing

    Affiliation University of Gothenburg, Gothenburg, Sweden

  • Johanna Otero,

    Roles Data curation, Supervision, Writing – review & editing

    Affiliation Instituto Masira, Universidad de Santander (UDES), Bucaramanga, Colombia

  • Roya Kelishadi,

    Roles Data curation, Supervision, Writing – review & editing

    Affiliation Cardiovascular Research Institute, Chamran Hospital, Isfahan, Iran

  • Rafael Diaz,

    Roles Data curation, Supervision, Writing – review & editing

    Affiliation Estudios Clinicos Latinoamerica ECLA Rosario, Santa Fe, Argentina

  • Thandi Puoane,

    Roles Data curation, Supervision, Writing – review & editing

    Affiliation University of the Western Cape, School of Public Health, Cape Town, South Africa

  •  [ ... ],
  • Salim Yusuf

    Roles Conceptualization, Formal analysis, Funding acquisition, Investigation, Methodology, Project administration, Resources, Supervision, Writing – original draft, Writing – review & editing

    Affiliation Department of Medicine, Population Health Research Institute, McMaster University and Hamilton Health Sciences, Hamilton, Ontario, Canada

Abstract

Introduction

Portable spirometers are commonly used in longitudinal epidemiological studies to measure and track the forced expiratory volume in the first second (FEV1) and forced vital capacity (FVC). During the course of a study, it may be necessary to replace spirometers with a different model. This raises questions regarding the comparability of measurements from different devices. We examined the correlation, mean differences and agreement between two different spirometers across diverse populations and participant characteristics.

Methods

From June 2015 to January 2018, a total of 4,603 adults were enrolled from 628 communities in 18 countries and 7 regions of the world. Each participant performed concurrent measurements with the MicroGP and EasyOne spirometers. Measurements were compared using the intra-class correlation coefficient (ICC) and the Bland-Altman method.

Results

Approximately 65% of the participants achieved clinically acceptable quality measurements. Overall correlations between paired FEV1 (ICC 0.88 [95% CI 0.87, 0.88]) and FVC (ICC 0.84 [0.83, 0.85]) were high. Mean differences between paired FEV1 (-0.038 L [-0.053, -0.023]) and FVC (0.033 L [0.012, 0.054]) were small. The 95% limits of agreement were wide but unbiased (FEV1 984, -1060 ml; FVC 1460, -1394 ml). Similar findings were observed across regions. The source of variation between spirometers was mainly at the participant level. Older age, higher body mass index, tobacco smoking and known COPD/asthma did not adversely affect the inter-device variability. Furthermore, there were small and acceptable mean differences between paired FEV1 and FVC z-scores using the Global Lung Initiative normative values, suggesting minimal impact on lung function interpretation.

Conclusions

In this multicenter, diverse community-based cohort study, measurements from two portable spirometers showed good correlation and small, unbiased differences. These data support their interchangeable use across diverse populations to provide accurate trends in serial lung function measurements in epidemiological studies.

Introduction

Lung function assessments are now more accessible with the wide adoption of handheld portable spirometers in community and ambulatory care settings. These devices are easy to operate and many have inbuilt quality check software to enable high quality measurements. They are also commonly employed in research studies to provide rapid and reliable lung function measurements and to track lung function over time [1]. However, in large multicenter trials, it is common to have different portable spirometers across study sites depending on the local availability of these devices, and it is often necessary to replace older devices with newer models over time [2]. This raises questions regarding the reliability and agreement of measurements obtained from different devices. Therefore, it is important to ascertain the reliability, differences and agreement between different spirometers, and to identify factors that may contribute to the variability between them.

To date, only a few small studies have examined the variability between different portable spirometers [2–11]. Many were conducted in highly selected healthy young individuals in a laboratory setting. Only a few were conducted in the community, and these were limited to one population (generally from Europe or North America). It is unclear whether these findings can be generalized to other populations with different anthropometrics, demographics and underlying disease prevalence. Furthermore, little is known about the sources of variability between spirometers.

The Prospective Urban Rural Epidemiology (PURE) study is an international prospective cohort study comprising adults recruited from urban and rural communities in high-, middle- and low-income countries. Baseline spirometry data were collected with a handheld portable turbine spirometer without flow volume loops (FVL). In the course of cohort follow-up, a new portable ultrasonic spirometer was introduced, which provided FVL. In the present study, we examined the correlation, agreement and mean difference between measurements from the old and new spirometers in an unselected sub-sample. We also assessed whether the correlation and agreement between spirometers differ across diverse populations from different socioeconomic and geographic regions. Lastly, we examined the impact of utilizing two different spirometers on the interpretation of spirometry measurements, using the Global Lung Initiative (GLI) normative values. Our findings address some of the challenges associated with the widespread use of portable spirometers and their role in providing access to lung function measurements in the community. This information will facilitate correct interpretation of data and offer insight into how best to address the variability between spirometers.

Methods

The PURE study began recruitment in 2004 of community-based adults aged 35 to 70 years from 628 urban and rural communities across 18 high-, middle- and low-income countries. The study design and methodology have been described elsewhere [12]. In brief, standardized approaches were used for the enumeration of households, identification of participants, recruitment and data collection. As it was not feasible to collect data from a representative sample of each country, the sampling method used for each country aimed to reduce participation bias based on local risk factors and disease prevalence. Baseline data were collected between 2004 and 2009, and follow-up occurred every 3 years. The study is coordinated by the Population Health Research Institute, Hamilton Health Sciences, McMaster University (Hamilton, ON, Canada). Ethics approval was provided by the Hamilton Health Sciences Research Ethics Board and the research ethics committees of the other participating centers (Appendix I in S1 File). All participants provided written informed consent to participate in the study.

Baseline spirometry was measured with the MicroGP spirometer (MicroMedical, Chatham, IL, USA), without FVL, following the 2005 American Thoracic Society/European Respiratory Society (ATS/ERS) spirometry standardization guidelines [13]. The MicroGP spirometer contains a turbine, which generates rotational flow during the spirometry maneuver. The rotation of the low-inertia vane is converted into electrical impulses by means of an infrared light-emitting diode and a photodiode sensor. A microprocessor within the device converts the electrical pulses into spirometry measurements, which are displayed digitally. According to the manufacturer, the MicroGP has an accuracy of ±2%. In 2015, the EasyOne (ndd Medical Technologies, Inc., Switzerland) ultrasonic spirometer was introduced, which provided automated quality checks, messaging, quality grades and FVL. The quality grades provided by the EasyOne after each test session are: (1) Grade A or B for three acceptable efforts with <100 ml (Grade A) or <150 ml (Grade B) variability between the two highest FEV1 and FVC values; (2) Grade C for two or more acceptable efforts with <200 ml variability; (3) Grade D for one acceptable effort or highly variable efforts (≥200 ml); and (4) Grade F for no acceptable efforts. The EasyOne spirometer uses an ultrasonic sensor to measure airflow. It has no moving parts and its accuracy does not depend on mechanical function or on the measurement of pressure or volume displacement. Accordingly, the manufacturer reports an accuracy within 3%, which is maintained throughout the operational life of the device without the need for regular calibration.
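The grading rules quoted above can be written as a small decision function. The following is a minimal sketch, not the device firmware: the function name and arguments are hypothetical, "three acceptable efforts" is read as three or more, and the variability threshold is assumed to apply to whichever of FEV1 or FVC varies more.

```python
def easyone_style_grade(n_acceptable: int, fev1_spread_ml: float, fvc_spread_ml: float) -> str:
    """Assign a session quality grade from the rules quoted in the Methods.

    n_acceptable   -- number of acceptable efforts in the test session
    fev1_spread_ml -- difference between the two highest FEV1 values (ml)
    fvc_spread_ml  -- difference between the two highest FVC values (ml)
    """
    if n_acceptable <= 0:
        return "F"                      # no acceptable efforts
    if n_acceptable == 1:
        return "D"                      # a single acceptable effort
    # Assumption: the stricter of the two spreads decides the grade.
    worst = max(fev1_spread_ml, fvc_spread_ml)
    if worst >= 200:
        return "D"                      # highly variable efforts
    if n_acceptable >= 3 and worst < 100:
        return "A"
    if n_acceptable >= 3 and worst < 150:
        return "B"
    return "C"                          # two or more efforts, <200 ml spread
```

Under these assumptions, a session with three acceptable efforts and spreads of 80 ml and 120 ml would be graded B, because the larger spread falls between 100 and 150 ml.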

All study visits were conducted in dedicated research clinics in the community for all sites and countries. Participants were coached by trained staff before performing pre-bronchodilator forced inspiratory and expiratory maneuvers (up to six attempts). All tests were performed in a standing position, with the participant's back straight and wearing a nose clip. With the introduction of the EasyOne spirometer, each center enrolled the first five consecutive participants from each community into the present substudy. Each participant provided spirometry measurements using the two devices in random order within 3 hours, supervised by the same research staff. The order of spirometer measurements was randomly generated by the coordinating site and issued to the center prior to the day of testing. Spirometers were calibrated monthly (or as needed after extreme weather or handling) using a 3 L syringe to ensure an error of <105 ml (3.5%).
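The two calibration tolerances quoted above describe the same limit for a 3 L syringe, since 105 ml is 3.5% of 3000 ml. A minimal sketch of such a check follows; the function and constant names are hypothetical and the strict inequality mirrors the "<105 ml" wording.

```python
SYRINGE_ML = 3000.0      # 3 L calibration syringe
TOLERANCE_ML = 105.0     # tolerance quoted in the Methods

def calibration_within_tolerance(measured_ml: float) -> bool:
    """Return True if the measured syringe volume is within the stated tolerance."""
    return abs(measured_ml - SYRINGE_ML) < TOLERANCE_ML

# The same tolerance expressed as a percentage of the syringe volume (about 3.5)
tolerance_pct = 100 * TOLERANCE_ML / SYRINGE_ML
```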

Statistical analysis

Means and frequency statistics were used to describe the data. The highest FEV1 and FVC from each spirometer were analyzed. The assumptions of normality and constant variance of the FEV1 and FVC were assessed by visual inspection of histograms and plots of residuals against fitted values. The correlation and agreement between spirometers were assessed with scatterplots, intra-class correlation coefficients (ICC) and Bland-Altman plots [14]. Mean differences between paired FEV1 and paired FVC were calculated as absolute (EasyOne − MicroGP) and relative ([EasyOne − MicroGP]/average × 100) differences between spirometers. A random-intercept multilevel ‘null’ model was used to estimate the source (region, country, center and participant level) of variation between spirometers. Stratified analyses by region, sex, age, body mass index (BMI), smoking status, known COPD or asthma, education level and quality grades were performed to explore the effect of each factor on the reliability and agreement between spirometers. Countries were classified into seven regions according to geographic location and socioeconomic level (by the World Bank classification) [15]. To examine the impact on interpretation, the GLI normative values were used to transform FEV1 and FVC into z-scores prior to Bland-Altman analysis [16]. We used the ATS/ERS recommendation for between-effort repeatability within a test session of <150 ml to assess whether mean differences between spirometers met the criterion for within-test reproducibility [17]. Similarly, a difference in z-score of <0.5 SD was regarded as not a meaningful difference between GLI values adjusted for age, sex, height and ethnicity [18]. All analyses were performed using SAS version 9.4 (The SAS Institute, Cary, NC, USA) and STATA 15 (StataCorp LLC, Texas, USA).
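The absolute and relative differences and the Bland-Altman limits described above can be sketched in a few lines. This is an illustration only: the analyses were run in SAS and STATA, the function name is hypothetical, the readings are made-up values, and the limits are assumed to be the conventional mean difference ± 1.96 SD with a normal-approximation confidence interval.

```python
import math
from statistics import mean, stdev

def bland_altman(easyone, microgp):
    """Return Bland-Altman summary statistics for paired readings in the same units."""
    if len(easyone) != len(microgp) or len(easyone) < 2:
        raise ValueError("need at least two paired readings of equal length")
    diffs = [e - m for e, m in zip(easyone, microgp)]        # EasyOne minus MicroGP
    avgs = [(e + m) / 2 for e, m in zip(easyone, microgp)]   # x-axis of the plot
    rel = [100 * d / a for d, a in zip(diffs, avgs)]         # percent of the pair average
    bias = mean(diffs)
    sd = stdev(diffs)                                        # sample SD (n - 1 denominator)
    half_ci = 1.96 * sd / math.sqrt(len(diffs))              # normal approximation
    return {
        "bias": bias,
        "bias_ci": (bias - half_ci, bias + half_ci),
        "loa": (bias - 1.96 * sd, bias + 1.96 * sd),
        "relative_bias_pct": mean(rel),
    }

# Hypothetical FEV1 readings in litres, for illustration only
easyone_fev1 = [3.0, 2.5, 4.0, 3.5]
microgp_fev1 = [3.1, 2.4, 4.2, 3.4]
result = bland_altman(easyone_fev1, microgp_fev1)
```

With so few pairs a t-based interval would be more appropriate than the 1.96 multiplier; the normal approximation is used here only because it is adequate at the sample size of the study.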

Results

A total of 4,603 participants from 628 communities in 18 countries across 7 regions completed measurements from the two spirometers. Baseline characteristics of included participants are shown in Table 1. Similar to the larger PURE study (Appendix II in S1 File), there were more females and individuals between the ages of 50–65 years. The overall proportion of participants meeting quality grades A, B or C on the EasyOne device was 65%, which is similar to the larger PURE study. There was a trend toward a higher prevalence of comorbidities, including COPD/asthma and cardiac diseases, and a lower education level in the substudy.

The correlations, mean differences and limits of agreement (LoA) between paired FEV1 and FVC by region are shown in Table 2. Overall, paired FEV1 and FVC between spirometers were highly correlated (Fig 1). The overall mean differences between spirometers, whether in absolute volume or as a percentage of mean FEV1 or FVC, were small and within acceptable limits of between-effort reproducibility (Fig 2). The 95% LoA between paired measurements were wide and showed no association with the size of FEV1 or FVC. Correlations between paired FEV1 and FVC were similarly high across regions except for South Asia, where the correlation was low to moderate (Table 2). For South America and the Middle East, the correlations between paired FVC were lower than those between paired FEV1. Across regions, the mean differences between paired FEV1 were small (ranging from -83 ml [relative difference -4%] to 49 ml [2.5%]) and showed no consistent bias. The mean differences between paired FVC were larger, particularly for the Middle East (-203 ml [-6%]) and South America (141 ml [6.3%]), and again showed no consistent bias across regions. The 95% LoA were wide for both FEV1 and FVC, suggesting large variation in agreement between spirometers across regions.
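To give a sense of how wide the overall limits are, the reported LoA can be converted back into a mean difference and a standard deviation of the paired differences. The sketch below uses the overall limits quoted in the Abstract and assumes they were computed as mean difference ± 1.96 SD, the usual Bland-Altman convention; the function name is hypothetical.

```python
def summarize_loa(upper_ml: float, lower_ml: float):
    """Recover the mean difference and SD implied by a pair of 95% limits of agreement."""
    bias_ml = (upper_ml + lower_ml) / 2           # midpoint of the limits
    sd_ml = (upper_ml - lower_ml) / (2 * 1.96)    # half-width divided by 1.96
    return bias_ml, sd_ml

# Overall limits of agreement reported in the Abstract (ml)
fev1_bias, fev1_sd = summarize_loa(984.0, -1060.0)
fvc_bias, fvc_sd = summarize_loa(1460.0, -1394.0)
```

The midpoints come out at -38 ml for FEV1 and 33 ml for FVC, matching the reported mean differences of -0.038 L and 0.033 L, and the implied SDs of the paired differences are roughly 520 ml and 730 ml.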

Fig 1. Overall correlation between paired FEV1 and FVC from the MicroGP and EasyOne spirometers.

ICC = intraclass correlation coefficient and 95% CI for paired FEV1 (L) and FVC (L) measured within 3 hours and conducted in a random order. All measurements were supervised by the same trained study coordinator. The line of identity is provided, representing a perfect correlation between paired measurements.

https://doi.org/10.1371/journal.pgph.0000141.g001

Fig 2. Bland-Altman plots for paired FEV1 and FVC measured by the MicroGP and EasyOne spirometers.

Differences between paired FEV1 and FVC were calculated as the absolute mean difference (EasyOne minus MicroGP) in Panels A and B, or the relative mean difference ((EasyOne minus MicroGP)/average × 100) in Panels C and D, plotted against the average ((EasyOne + MicroGP)/2) on the x-axis. The 95% limits of agreement (LoA) are provided (blue lines). The 95% CIs for the mean differences and LoA are also provided (broken lines).

https://doi.org/10.1371/journal.pgph.0000141.g002

Table 2. Correlations, mean differences and agreement between spirometers by region.

https://doi.org/10.1371/journal.pgph.0000141.t002

To understand the source of variation between spirometers, the ICC and variance components between spirometers were assessed at the region, country, center and participant levels (Table 3). The highest ICCs between paired FEV1 and FVC were observed at the participant level, indicating that measurements between spirometers were highly correlated within individuals. This corresponds to the largest variance component, suggesting that participant factors contributed significantly to the variation between spirometers. The correlation and variance between spirometers at the region, country and center levels were substantially lower, suggesting that these levels contribute substantially less to the variation between spirometers. Furthermore, the increases in the ICC and variance component from the region to the country and center levels were modest compared with the large increase from the center to the participant level. This further highlights the importance of participant factors in contributing to the variation between spirometers.
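One common way to obtain level-specific ICCs from a nested random-intercept null model is to divide the variance accumulated down to a given level by the total variance. The sketch below illustrates that calculation; the variance values are invented for illustration and are not taken from Table 3, the function name is hypothetical, and whether the authors used exactly this cumulative definition is an assumption.

```python
def nested_iccs(level_variances: dict, residual_variance: float) -> dict:
    """Cumulative ICC at each nested level of a random-intercept null model.

    level_variances must be ordered from the highest level (region) to the lowest
    (participant); residual_variance is the within-participant, between-device part.
    """
    total = sum(level_variances.values()) + residual_variance
    iccs, running = {}, 0.0
    for level, variance in level_variances.items():
        running += variance             # variance shared by units at this level
        iccs[level] = running / total
    return iccs

# Hypothetical variance components, for illustration only
hypothetical = {"region": 0.02, "country": 0.03, "center": 0.05, "participant": 0.70}
iccs = nested_iccs(hypothetical, residual_variance=0.20)
```

In this made-up example the ICC rises only from 0.02 to 0.10 between the region and center levels and then jumps to 0.80 at the participant level, which is the kind of pattern the paragraph above describes.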

Table 3. ICC and variance estimates between spirometers at the region, country, center and individual levels.

https://doi.org/10.1371/journal.pgph.0000141.t003

To explore the participant factors that may contribute to the variation between spirometers, we examined the baseline characteristics of participants whose inter-device differences were within and outside the 95% LoA for the overall population (Appendix III in S1 File). The distributions of age, body mass index and sex were similar between these two groups. Furthermore, COPD/asthma, cardiac disease, stroke and tobacco smoking did not adversely affect the agreement between spirometers. However, there were higher percentages of lower quality grade spirometry and lower education level in those outside the 95% LoA. Separate stratified analyses were conducted to further explore the effects of sex, age, BMI, smoking status, known COPD or asthma, education and quality grades on spirometer variability (Table 4, Appendix IV in S1 File). The correlations between paired FEV1 and FVC were generally high and similar across strata. The mean differences between spirometers were small with minimal variation across strata, even for the lower quality grades. However, there were lower correlations, larger variability and wider LoA between spirometers among those with lower education level and lower quality grades.

Table 4. Stratified analyses by demographic, anthropometric, clinical characteristics and quality grades.

https://doi.org/10.1371/journal.pgph.0000141.t004

Similar Bland-Altman analyses were conducted on the FEV1 and FVC z-scores using age-, sex-, height- and ethnicity-appropriate GLI normative values (Table 5). Mean differences between paired FEV1 and FVC z-scores from the two spirometers were small and less than 0.5 SD for the overall substudy and across regions.
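The GLI reference equations express a measurement as a z-score through the LMS method, in which L, M and S are the skewness, predicted median and coefficient of variation for a person of a given age, sex, height and ethnicity. The sketch below shows the transformation and the resulting inter-device z-score difference; the L, M and S values and the two readings are hypothetical placeholders, since real values come from the GLI lookup tables, and the function name is an assumption.

```python
import math

def lms_z_score(measured: float, L: float, M: float, S: float) -> float:
    """Convert a measurement to a z-score with the LMS method."""
    if measured <= 0 or M <= 0 or S <= 0:
        raise ValueError("measured, M and S must be positive")
    if L == 0:
        return math.log(measured / M) / S          # limiting case as L tends to 0
    return ((measured / M) ** L - 1) / (L * S)

# Hypothetical LMS values for one participant (not taken from GLI tables)
L_HYP, M_HYP, S_HYP = 1.0, 3.0, 0.12

# Hypothetical paired FEV1 readings (L) that differ by the overall mean of 0.038 L
z_easyone = lms_z_score(2.900, L_HYP, M_HYP, S_HYP)
z_microgp = lms_z_score(2.938, L_HYP, M_HYP, S_HYP)
z_difference = z_easyone - z_microgp
```

Because both devices are used on the same person, the same L, M and S apply to both readings, so the z-score difference depends only on the inter-device difference relative to the predicted value and its spread. With these placeholder values it is about -0.11, well inside the 0.5 SD threshold.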

Table 5. Mean differences and agreement between Z-scores from the two spirometers for overall study population and by region.

https://doi.org/10.1371/journal.pgph.0000141.t005

Discussion

In this large international multi-center community-based sub-study, we examined the correlation, mean differences and agreement between measurements from two portable spirometers commonly used in community and field studies, and how they may vary across diverse populations. On average, 65% of participants achieved quality grades A, B or C, which represent clinically acceptable efforts. The overall correlations between paired FEV1 and paired FVC between spirometers were high. The overall mean differences between measurements were small and within acceptable limits of between-effort reproducibility. There were moderate to high correlations between spirometers across diverse populations from different geographic and socio-economic regions. Mean differences between paired FEV1 were uniformly small across regions, while larger differences between paired FVC were observed. In both cases, no systematic bias was observed across regions. The main source of variation between spirometers was at the participant level, with much less variation observed among regions, countries and centers. Exploratory analyses of participant factors showed that low education level and poor quality grade efforts were associated with higher variability between spirometers.

As portable spirometers become widely adopted and used in the community, more information on their quality of measurements, reliability, biases and agreement is needed to enable correct interpretation and comparison of lung function data across spirometers.

To date, most studies have compared different portable devices in highly selected healthy and mainly young non-smokers within a single population [2–11]. These studies have reported high correlation and agreement between devices, which are likely to be inflated given the controlled setting under which the comparisons were made. The relatively small sample sizes and homogeneity of the populations studied also limit the ability of prior studies to adequately address the source of variation between spirometers. In contrast, we examined two commonly used portable spirometers in large numbers of unselected individuals from a wide range of urban and rural communities and geographic regions. The measurements were collected outside of a controlled laboratory setting, which makes our findings more generalizable to a broader range of populations and settings.

Similar to other community-based studies, we found that, on average, 25 to 35% of efforts were of suboptimal quality grade [19]. Even with these data included, there were high correlations and small mean differences between paired FEV1 across regions. For paired FVC, there was more variation in the correlation and mean differences between spirometers. However, for most regions, the mean differences between paired FVC remained within the acceptable limits of between-effort reproducibility [17]. Furthermore, we observed no consistent bias between spirometers across regions, suggesting that the variation between devices was random in nature. We found that the LoA were wide and variable across regions for both FEV1 and FVC. This was expected, as other studies have shown that the LoA tend to widen with larger sample sizes and a wider range of data [20]. Also, in keeping with previous findings, we observed larger LoA between paired FVC than FEV1 [2, 21].

To date, there has been very limited information on the source of variability between spirometers. The few studies that have examined the effect of age and sex on inter-device variability have reported disparate findings [2, 7, 9]. These studies were generally small in sample size and included healthy volunteers across a limited age range. Our large sample size and diverse population enabled a robust analysis of the potential sources of variation between spirometers at the region, country, center and participant levels. We found that the largest source of variability was at the participant level, with much smaller contributions at the region, country or center levels. Importantly, participant factors such as older age, higher BMI, previous and current smoking and known COPD/asthma did not adversely affect the variation between spirometers. However, participants with a low education level or poor quality grade efforts were more likely to show lower correlation and larger variation between spirometers. Even in these subgroups, the mean differences between spirometers remained small and unbiased, suggesting sufficient precision and comparable estimates of group means across devices.

Our findings have a number of implications. First, we report on the robustness of the FEV1 measurement, which was highly correlated, with small and unbiased differences between devices across diverse populations. The correlation and mean differences between paired FVC, however, were more variable but unbiased across regions. This suggests that a more customized approach by region may be needed to adjust for the larger differences in FVC between spirometers. Second, the LoA were wide but random, suggesting considerable between-subject variability in agreement between devices. In this regard, it is important to differentiate the need for individual versus group-level precision in estimating lung function for different types of studies. In population-based studies, excluding participants is undesirable (since excluded participants may be systematically different from those included), and retaining them inherently leads to larger inter-subject variability. Furthermore, the focus of population-based studies is mainly on the average differences in lung function between populations or the mean changes over time. In this context, it is more relevant to determine whether, on average, the recordings from different devices are well correlated and collected without systematic bias. Therefore, the precision of group mean estimates to provide accurate trends is more important than the precision of individual measurements. By contrast, in clinical studies the within-subject variability may be more relevant in assessing changes in lung function within individuals or small groups in response to an intervention. Here the precision of individual measurements is likely to be more important. To that end, our findings suggest that the two spirometers were, on average, highly correlated and had sufficiently high precision in estimating the group means in the overall population and in key subgroups without bias. Furthermore, when the data were transformed using GLI normative values, we observed very small and acceptable differences in the mean z-scores across spirometers, suggesting limited impact on interpretation of the data. Lastly, we did not observe a large contribution to the variation between spirometers at the region, country or center levels, suggesting consistent execution of spirometry measurements across these levels. The main source of variation identified was at the participant level and may be related to factors such as low education level and poor-quality spirometry efforts. While every reasonable effort should be made to increase the precision of individual lung function measurements, efforts beyond what is easily achievable may not necessarily increase the power of the study, but could considerably increase its complexity and cost and therefore compromise study feasibility [22]. Moreover, such methods may create biases (and distort results), particularly if stringent criteria exclude participants with specific conditions or demographics that may influence lung function.
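The distinction between individual and group-level precision can be made concrete with the overall FEV1 figures from the Abstract. The sketch below assumes the limits of agreement were computed as mean difference ± 1.96 SD and treats the paired differences as independent; the variable names are hypothetical.

```python
import math

N_PARTICIPANTS = 4603                          # paired measurements in the substudy
LOA_UPPER_ML, LOA_LOWER_ML = 984.0, -1060.0    # overall FEV1 limits of agreement

# SD of the paired differences implied by the limits (assumes mean ± 1.96 SD)
sd_ml = (LOA_UPPER_ML - LOA_LOWER_ML) / (2 * 1.96)

# Individual level: 95% of single inter-device differences fall within this distance of the mean
individual_half_width_ml = 1.96 * sd_ml

# Group level: the standard error shrinks with the square root of the sample size
standard_error_ml = sd_ml / math.sqrt(N_PARTICIPANTS)
group_half_width_ml = 1.96 * standard_error_ml
```

Under these assumptions a single participant's two readings can differ by roughly a litre in either direction, whereas the mean difference is pinned down to within about 15 ml. That is consistent with the reported 95% CI of -0.053 to -0.023 L around the mean of -0.038 L.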

The strengths of our study include the large sample size and the diverse and unselected populations, which increase the generalizability of our findings. Measurements were taken in random order and supervised by the same trained staff, thereby minimizing procedure-related variability. Furthermore, all spirographs available from the EasyOne were inspected and assessed by a staff respirologist to ensure agreement with the assessment. Limitations include the measurement of lung function without bronchodilation. The use of bronchodilation can help to reduce variable airway tone in asthmatic patients, which may contribute to the variation between spirometers. However, participants were not requested to withhold any medications prior to testing. Therefore, it is reasonable to assume that those with chronic lung diseases, including asthma, would have taken their inhaler medications prior to spirometry assessments and are therefore less likely to exhibit variable airway tone.

In conclusion, we found moderate to high correlation and small mean differences between paired FEV1 and FVC between the MicroGP and EasyOne spirometers across diverse populations. The differences between paired measurements showed no consistent biases across regions. Our findings support the use of these two spirometers in large long-term studies to provide reliable and comparable measurements, with highly correlated and small unbiased differences between group means across diverse populations.

Acknowledgments

We would like to acknowledge the assistance of the following members of our team who were involved in the collection, cleaning and validation of the spirometry data: Maha Mushtaha, Roxanna Solano, Justina Greene, Steven Chen and Alex Dragoman. We also would like to acknowledge the statistical help and assistance from Dr Shrinkant Bangdiwala and Ms Chinthanie Ramasundarahettige.

References

  1. Ferguson GT, Enright PL, Buist AS, Higgins MW. Office spirometry for lung health assessment in adults: a consensus statement from the National Lung Health Education Program. Respir Care 2000;45:513–30. pmid:10813228
  2. Gerbase MW, Dupuis-Lozeron E, Schindler C, Keidel D, Bridevaux PO, Kriemler S, et al. Agreement between spirometers: A challenge in the follow-up of patients and populations? Respiration 2013;85:505–14. pmid:23485575
  3. Viegi G, Simoni M, Pistelli F, Englert N, Salonen R, Niepsuj G, et al. Inter-laboratory comparison of flow-volume curve measurements as quality control procedure in the framework of an international epidemiological study (PEACE project). Respir Med 2000;94:194–203. pmid:10783929
  4. Swart F, Schuurmans M, Heydenreich JC, Pieper CH, Bolliger CT. Comparison of a new desktop spirometer (Spirospec) with a laboratory spirometer in a respiratory out-patient clinic. Respir Care 2003;48:591–95. pmid:12780945
  5. Rebuck DA, Hanania N, D’Urzo AD, Chapman KR. The accuracy of a handheld spirometer. Chest 1996;109:152–57. pmid:8549178
  6. Nelson SB, Gardner R, Crapo RO, Jensen RL. Performance evaluation of contemporary spirometers. Chest 1990;97:288–97. pmid:2298052
  7. Milanzi EVB, Koppelman GH, Oldenwening M, Augustijin S, Aalders-de Ruijter B, Farenhorst M, et al. Considerations in the use of different spirometers in epidemiology studies. Environ Health 2019;18:39. pmid:31023382
  8. Maree DM, Videler EA, Hallauer M, Pieper CH, Bolliger CT. Comparison of a new desktop spirometer (Diagnosa) with a laboratory spirometer. Respiration 2001;68:400–04.
  9. Kunzli N, Ackermann-Liebrich U, Keller R, Perruchoud AP, Schindler C. Variability of FVC and FEV1 due to technician, team, device and subject in an eight centre study: three quality control studies in SAPALDIA. Swiss Study in Air Pollution and Lung Disease in Adults. Eur Respir J 1995;8:371–76. pmid:7789479
  10. Caras WE, Winter MG, Dillard T, Reasor T. Performance comparison of the handheld MicroPlus portable spirometer and the SensorMedics Vmax22 diagnostic spirometer. Respir Care 1999;44:1465–73.
  11. Barr RG, Stemple KJ, Mesia-Vela S, Basner RC, Derk SJ, Hennenberger PK, et al. Reproducibility and validity of a handheld spirometer. Respir Care 2008;53:433–41. pmid:18364054
  12. 12. Teo K, Chow CK, Vaz M, Rangarajan S, Yusuf S et al, on behalf of The PURE Investigators-writing group. The Prospective Urban Rural Epidemiology (PURE) study: examining the impact of societal influences on chronic non-communicable diseases in low-, middle-, and high-income countries. Am Heart J 2009;158:1–7. pmid:19540385
  13. 13. Miller MR, Hankinson J, Brusasco V, Burgos F, Casaburi R, Coates A, et al. Standardisation of spirometry. Eur Respir J 2005;26: 319–38. pmid:16055882
  14. 14. Bland JM, Altman DG. Statistical methods for assessing agreement between two methods of clinical measurements. Lancet 1986;1:307–10. pmid:2868172
  15. 15. World Bank. How we classify countries. http://data.worldbank.org/about/country-classification [cited 2013 Jan 15].
  16. 16. Quanjer PH, Stanojevic S, Cole TJ, Baur X, Hall GL, Culver BH, et al, ERS Global Lung Function Initiatives. Multi-ethnic reference values for spirometry for the 3–95 yr age range: the global lung 2012 equations. Eur Respir J 2012;40:1324–43. pmid:22743675
  17. 17. Graham BL, Steenbruggen I, Miller MR, Barjaktarevic IZ, Cooper BG, Hall GL, et al., on behalf of the American Thoracic Society and the European Respiratory Society. An Official American Thoracic Society and European Respiratory Society Technical Statement. Am J Respir Crit Care Med 2019;200:e70–e88. pmid:31613151
  18. 18. Quanjer PH, Stanojevic S. Do the Global Lung Function Inittiative 2012 equations fit my population? Eur Respir J 2016;48:1782–85. pmid:27811067
  19. 19. Levy ML, Quanjer PH, Booker R, Cooper BG, Holmes S, Small IR. Diagnostic spirometer in primary care. Proposed standards for general practice compliant with American Thoracic Society and European Respiratory Society recommendations. Prim Care Respir J 2009;18:130–47. pmid:19684995
  20. 20. Stöckl D, Cabaleiro D, Van Uytfanghe K, Thienpont L. Interpreting method comparison studies by use of the Bland-Altman plot: Reflectng the importance of sample size by incorporating confidence limits and pre-defined error limits in the graphic. Clin Chem 2004;50:2216–18. pmid:15502104
  21. 21. Bridevaux P-O, Dupuis-Lozeron E, Schindler C, Keidel D, Gerbase MW, Probst-Hensch NM, et al. Spirometer replacement and serial lung function measurements in population studies: Results from the SAPALDIA Study. Am J Epidemiol 2015;181:752–61. pmid:25816817
  22. 22. Vickers AJ. How many repeated measures in repeated measures designs? Statistical issues for comparative trials. BMC Med Res Methodol 2003;3:22. pmid:14580266