Abstract
Air pollution inhaled dose is the product of pollutant concentration and minute ventilation ($\dot{V}_E$). Previous studies have parameterized the relationship between $\dot{V}_E$ and variables such as heart rate (HR) and have observed substantial inter-subject variability. In this paper, we evaluate a method to estimate $\dot{V}_E$ with easy-to-measure variables in an analysis of pooled-data from eight independent studies. We compiled a large diverse data set that is balanced with respect to age, sex and fitness level. We used linear mixed models to estimate $\dot{V}_E$ with HR, breath frequency (fB), age, sex, height, and forced vital capacity (FVC) as predictors. FVC was estimated using the Global Lung Function Initiative method. We log-transformed the dependent and independent variables to produce a model in the form of a power function and assessed model performance using a ten-fold cross-validation procedure. The best performing model using HR as the only field-measured parameter was $\dot{V}_E = e^{-9.59}\,\mathrm{HR}^{2.39}\,\mathrm{age}^{0.274}\,\mathrm{sex}^{-0.204}\,\mathrm{FVC}^{0.520}$ with HR in beats per minute, age in years, sex is 1 for males and 2 for females, FVC in liters, and a median(IQR) cross-validated percent error of 0.664(45.4)%. The best performing model overall was $\dot{V}_E = e^{-8.57}\,\mathrm{HR}^{1.72}\,f_B^{0.611}\,\mathrm{age}^{0.298}\,\mathrm{sex}^{-0.206}\,\mathrm{FVC}^{0.614}$, where fB is breaths per minute, and a median(IQR) percent error of 1.20(37.9)%. The performance of these models is substantially better than any previously-published model when evaluated using this large pooled-data set. We did not observe an independent effect of height on $\dot{V}_E$, nor an effect of race, though this may have been due to insufficient numbers of non-white participants. We did observe an effect of FVC such that these models over- or under-predict $\dot{V}_E$ in persons whose measured FVC was substantially lower or higher than estimated FVC, respectively. Although additional measurements are necessary to confirm this finding regarding FVC, we recommend using measured FVC when possible.
Citation: Greenwald R, Hayat MJ, Dons E, Giles L, Villar R, Jakovljevic DG, et al. (2019) Estimating minute ventilation and air pollution inhaled dose using heart rate, breath frequency, age, sex and forced vital capacity: A pooled-data analysis. PLoS ONE 14(7): e0218673. https://doi.org/10.1371/journal.pone.0218673
Editor: Hugo A. Kerhervé, Univ Rennes, FRANCE
Received: January 25, 2019; Accepted: June 6, 2019; Published: July 9, 2019
Copyright: © 2019 Greenwald et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: The data file for this manuscript is included as a supplementary file.
Funding: This work was supported in part by NIH/NIEHS grant K25ES020355 (R. Greenwald), Research Foundation Flanders postdoctoral scholarship 12L8818N (E. Dons), NIH/NIEHS grant R01ES020017 (N. Good), and CDC NIOSH grant T42OH009229 (N. Good). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Competing interests: The authors have declared that no competing interests exist.
Introduction
The public health consequences of ambient air pollution have been well-documented by more than three decades of epidemiologic, observational and clinical studies. The global burden of disease attributable to ambient air pollution is at a historical high and was estimated to be greater than 3 million deaths per year in 2010 [1]. Although air quality has improved significantly in recent decades in many parts of the developed world, ambient air pollution continues to present a formidable public health burden and is estimated to lead to over 200,000 premature deaths per year [2] in the United States. Efforts to better understand the causal relationships between environmental exposures and health effects are hampered by exposure misclassification, which can obscure the true association between exposure and disease and bias effect estimates [3–5]. Although much of the exposure misclassification in air pollution studies is spatial in nature, vast differences in ventilation rate between individuals also contribute to exposure misclassification.
Given that air pollution inhaled dose is a function of both pollutant concentration and the inhaled volume of air, it is important to accurately account for minute ventilation ($\dot{V}_E$, the volume of air inhaled per minute) in order to reduce this misclassification and advance the science of air pollution exposure assessment. Although $\dot{V}_E$ is difficult or intrusive to measure in natural settings, in this paper, we describe a methodology for estimating $\dot{V}_E$ using data that is easily obtainable using wearable devices.
Several previous studies have estimated $\dot{V}_E$ from measurements of heart rate (HR) [6–9], breath frequency (fB) [10], power expenditure [11], or metabolic equivalents (METs) [12], and are succinctly summarized by Dons et al. [13]. The typical model structure in these previous studies includes $\dot{V}_E$ as the dependent variable and HR or other physiological parameters as the predictors, often log-transformed. Since there is tremendous inter-subject variability in the relationship between $\dot{V}_E$ and HR, these models suffer from poor generalizability and typically have a wide range of percent error. A small-scale pilot study by the lead author of this paper collected data from fifteen adolescent athletes [14] and took the novel approach of using $\dot{V}_E$ normalized by forced vital capacity ($\dot{V}_E$/FVC) as the dependent variable. This approach effectively models the fraction of lung capacity an individual inhales per minute rather than the absolute volume of air and has the effect of reducing inter-subject variability in the relationship between $\dot{V}_E$ and physical activity. However, major limitations of this pilot study were small sample size and a non-representative study population. To address these concerns, we sought to explore a variety of methodologies in a much larger and more diverse data set by pooling data from several previously published studies.
Methods
Data collection
We performed a PubMed database search with the terms “heart rate”, “breathing rate” and “minute ventilation” and narrowed the scope to papers with publication dates in the previous five years. This search returned 327 results, and upon closer examination, we identified 24 studies in which the abstract or main text indicated that all three parameters were measured in healthy humans with time resolution of one minute or less. These studies had a wide variety of scientific objectives, and only a few were related to air pollution exposure. In addition, we had previously identified six publications that were focused on estimating air pollution dose but did not meet our PubMed database search criteria. We contacted by email the corresponding authors for all 30 identified papers and invited their participation in this analysis. Twelve authors responded agreeing to participate (in one case, the principal investigator had retired and the funding agency agreed on his behalf), two authors responded that the data was unavailable, and we received no response from the remaining 16 after two attempts. Of the twelve positive responses, ten investigators submitted data, of which eight data sets were usable for this analysis. Data from one study was unusable due to misaligned time stamps and another due to poor quality heart rate data (in neither case was this important for the original purpose of the respective study). The final eight participating studies produced a data set that includes 14,550 one-minute data points from 471 unique individuals in the age range of 4–80 years. The data set is balanced with respect to sex, includes individuals of a variety of different fitness levels from five different countries on three different continents, and is racially and ethnically diverse (though disproportionately white). In all cases, the data was deidentified, and a summary of subject characteristics is provided in Table 1. The participating studies are Greenwald et al. [14], Cozza et al. [8], Ramos et al. [9], Adams et al. [15], Giles et al. [16], Jakovljevic et al. [17], Villar et al. [18], and Good et al. [19]. All studies were approved by their respective Institutional Review Boards.
An inherent strength of a pooled-data study is the greater level of generalizability that arises from analyzing data collected following diverse protocols using a variety of methodologies, instrumentation, and personnel. The data assembled in this paper include subjects at rest, sitting, standing, walking, running, cycling, and performing routine activities in an ambulatory setting. A summary of study protocols and data collection methodologies is provided in Table 2.
Model selection
We devoted considerable effort to exploring a wide variety of modeling approaches in order to identify the most appropriate and best performing predictive models. Given that the primary rationale for this study was to develop a practical model for assessing air pollution inhaled dose in field studies, we focused on modeling methodologies that are easily implemented in a variety of applications and predictor variables that may be easily and inexpensively measured with high time-resolution. We therefore developed a list of potential predictors that included HR, fB, age, sex, height and weight as well as second order and/or interaction terms for these predictors. Due to their limited practicality in ambulatory settings, we did not include tidal volume (VT), metabolic equivalents (METs) or oxygen consumption (VO2) as predictors. In addition, given that the relationships of HR and $\dot{V}_E$ as well as fB and $\dot{V}_E$ are non-linear [20] and have different response and relaxation times following stimuli [21], we examined the effect of log-transforming predictor variables and/or the dependent variable as well as including HR and fB lags (value 1, 2, 3, or 4 minutes previously) and factorials (current value multiplied by the value 1, 2, 3, and up to 4 minutes previously) as predictors.
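The lagged and factorial predictors described above could be derived from a one-minute HR series roughly as follows. This is only a sketch: the function names are hypothetical, and the factorial term is implemented under one possible reading of the description (the product of the current value and each of the preceding values up to the given order), which may differ from the exact construction used in the analysis.

```python
def lagged(series, lag):
    """Return the value observed `lag` minutes earlier (None where unavailable)."""
    values = list(series)
    n = len(values)
    return [None] * min(lag, n) + values[: max(n - lag, 0)]


def factorial_term(series, order):
    """Assumed reading: product of the current value and the `order` preceding values."""
    values = list(series)
    out = []
    for t in range(len(values)):
        if t < order:
            # Not enough history at the start of a session.
            out.append(None)
        else:
            product = 1.0
            for v in values[t - order : t + 1]:
                product *= v
            out.append(product)
    return out


# Hypothetical one-minute HR readings (beats per minute)
hr = [70, 80, 95, 110, 120]
hr_lag2 = lagged(hr, 2)            # HR two minutes earlier
hr_fact1 = factorial_term(hr, 1)   # current HR times HR one minute earlier
```

Either construction leaves the first few minutes of each session without a value, which is the loss of data points that accompanies these terms.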
A source of inter-subject variability in $\dot{V}_E$ is differences in lung volume, and we therefore explored four distinct modeling approaches to parameterize the effect of this variability. We ultimately rejected the first three of these approaches, but we will briefly describe them in order to justify our model selection. A common measure of functional lung volume is forced vital capacity (FVC), or the volume of air that can be exhaled with maximum effort, and another lung function parameter potentially useful for predicting $\dot{V}_E$ is forced expiratory volume in 1 second (FEV1). In persons with normal lung function, FEV1 is about 80% of FVC. In persons with obstructive airway disease such as asthma or COPD, FEV1 can be reduced relative to FVC, and during intense physical activity, this reduction in the ability to rapidly exhale may be relevant for the estimation of $\dot{V}_E$. FVC and FEV1 in healthy individuals are strongly correlated with height, and to a lesser extent, with age, sex, and race [22, 23]. Several previous well-powered studies [22, 23] have parameterized the influence of these variables on lung function and have developed algorithms for predicting FVC and FEV1. Our exploratory but ultimately rejected modeling approaches are labelled Approaches A-C (see Table A in the Supporting Information file titled S1 Text): Approach A uses $\dot{V}_E$ normalized by FVC as the dependent variable, Approach B uses $\dot{V}_E$ as the dependent variable and includes determinant factors of FVC as predictors, and Approach C is the same as Approach B but also includes FVC as a predictor of $\dot{V}_E$. As we describe below, the best-performing modeling approach is referred to as Approach D and uses log-transformed $\dot{V}_E$ as the dependent variable and log-transformed HR, fB, FVC, and subject-specific traits as predictor variables.
For Approaches A, C and D, we used the measured value of FVC or FEV1 in the subset of data for which it was available, and we also examined the entire dataset using predictions of FVC based on height, age, sex, and race or ethnicity according to the method of the Global Lung Function Initiative [23]. This method uses five racial or ethnic categories: Caucasians, African-Americans, North East Asians, South East Asians, and an Other category for all other ethnicities. We categorized white subjects from the United Kingdom, Portugal, Brazil, Canada, and the United States as Caucasian. All subjects of African ancestry were American and assigned to the African-American category. There were 25 American or Brazilian subjects listed as Asian; however, with no additional information regarding North or South Asian ancestry, we assigned these subjects to the Other category. In addition, there were 53 American subjects who self-identified as Hispanic, but again, with no additional knowledge of racial or national ancestry, these subjects were classified as Other. FVC or FEV1 predictions obtained using this method will not capture changes incurred by airway disease; however, the subjects enrolled in all studies were stated to be healthy.
Statistical methods
We used general linear mixed models to reduce the inherent bias of within-subject repeated measures data [24]. All models were fitted using the lme4 or nlme packages for R v3.2.2 (R Foundation for Statistical Computing). Presented results are from the lme4 package, while the nlme package was used to investigate covariance matrix structure. In particular, we examined the effect on model performance of using the variance components and first order autoregressive covariance matrix structures, and the best performing models used the variance components structure. We created a categorical variable called “study” that corresponds to each of the contributing studies. We included a random effect for subject and a random slope for both HR and fB with subject. We additionally evaluated the effect of including a random effect for “study” to account for systematic differences between each study, although this random effect was not found to be important or to improve model performance. We visually evaluated residual plots and did not observe evidence of heteroscedasticity. P-values were calculated for each predictor by using likelihood ratio tests to compare the full model with the predictor in question to the reduced model without it. The level of significance was set a priori at 0.05.
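For orientation, the random-effects structure described above can be written out for a log-transformed model with HR, fB, age, sex, and FVC as predictors. The notation below is illustrative rather than taken from the analysis itself: $i$ indexes subjects, $j$ indexes one-minute observations, $b_{0i}$ is the subject random intercept, and $b_{1i}$ and $b_{2i}$ are the subject random slopes for HR and fB.

```latex
\ln \dot{V}_{E,ij} = (\beta_0 + b_{0i})
  + (\beta_1 + b_{1i}) \ln \mathrm{HR}_{ij}
  + (\beta_2 + b_{2i}) \ln f_{B,ij}
  + \beta_3 \ln \mathrm{age}_i
  + \beta_4 \ln \mathrm{sex}_i
  + \beta_5 \ln \mathrm{FVC}_i
  + \varepsilon_{ij},
\qquad
(b_{0i}, b_{1i}, b_{2i}) \sim \mathcal{N}(\mathbf{0}, \mathbf{G}),
\quad
\varepsilon_{ij} \sim \mathcal{N}(0, \sigma^2)
```

Predictions for a new individual use only the fixed effects $\beta_0, \ldots, \beta_5$, since the random effects of an unseen subject are unknown.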
Cross validation
We performed a ten-fold cross-validation procedure to assess model performance. Subjects were randomly divided into ten folds, each consisting of a training set of 423 or 424 subjects and a validation set of 47 or 48. Parameter estimates were calculated based on the training sets, predictions were made for the validation sets, and then the predictions from all ten validation sets were assembled and compared with observations. The cross-validated percent error was calculated as (prediction − observation)/observation × 100%. We evaluated both model accuracy and precision by examining median percent error (favoring models with a smaller absolute value) and inter-quartile range (IQR, favoring models with a smaller spread in the distribution).
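The fold assignment and error summary can be sketched as below. This is a minimal illustration under stated assumptions: the function names are hypothetical, the model-fitting step is omitted, and the quartiles use the default method of Python's `statistics.quantiles`, which may differ slightly from the quantile definition used in the original R analysis.

```python
import random
import statistics


def subject_folds(subject_ids, k=10, seed=0):
    """Randomly assign each unique subject (not each data point) to one of k folds."""
    subjects = sorted(set(subject_ids))
    random.Random(seed).shuffle(subjects)
    return [subjects[i::k] for i in range(k)]


def percent_error(predictions, observations):
    """(prediction - observation) / observation * 100 for each pair."""
    return [(p - o) / o * 100.0 for p, o in zip(predictions, observations)]


def median_iqr(errors):
    """Median as a measure of accuracy, IQR as a measure of precision."""
    q1, _, q3 = statistics.quantiles(errors, n=4)
    return statistics.median(errors), q3 - q1


# 471 subjects split into ten validation folds
folds = subject_folds(range(471), k=10)
fold_sizes = sorted(len(fold) for fold in folds)

# Hypothetical predictions and observations pooled across validation folds
errors = percent_error([11.0, 9.0, 20.0, 31.5, 38.0], [10.0, 10.0, 20.0, 30.0, 40.0])
median_error, iqr_error = median_iqr(errors)
```

Splitting by subject rather than by data point keeps all of an individual's repeated measures in the same fold, so the validation error reflects prediction for people the model has not seen.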
Results
Modeling approach
Parameter estimates and results for the best-performing models using approaches described above as A, B, or C (i.e. the dependent variable was not log-transformed, regardless of whether predictor variables were log-transformed) are included in S1 Text. Using these approaches, we observed substantial evidence of interaction between several predictor variables, namely HR with fB, and both HR and fB with either FVC or the determinants of FVC (i.e. age, height, and sex). By this we mean that the p-values of these interaction terms were significant, and addition of these terms improved cross-validation predictive performance. In addition, we observed a significant effect of adding a second order term for HR (Table A in S1 Text). The interaction of FVC (or its determinants) with HR or fB was reduced for Approach A, and as a consequence, it generally performed better than Approaches B or C. This can likely be explained by noting that the difference between Approach A and C is analogous to algebraically rearranging Eq 1 to produce Eq 2:

$$\frac{\dot{V}_E}{\mathrm{FVC}} = \beta_0 + \beta_1\,\mathrm{HR} + \beta_2\,f_B \tag{1}$$

$$\dot{V}_E = \beta_0\,\mathrm{FVC} + \beta_1\,\mathrm{HR}\cdot\mathrm{FVC} + \beta_2\,f_B\cdot\mathrm{FVC} \tag{2}$$

Eq 3 expresses this as a hierarchically well-formulated model:

$$\dot{V}_E = \beta_0 + \beta_1\,\mathrm{HR} + \beta_2\,f_B + \beta_3\,\mathrm{FVC} + \beta_4\,\mathrm{HR}\cdot\mathrm{FVC} + \beta_5\,f_B\cdot\mathrm{FVC} \tag{3}$$
In other words, Approach A moves the interaction of FVC (or its determinants) to the left-hand side of the model. Approach B is similar except that FVC is substituted with a function of age, height, and sex, leading to an even more complicated arrangement of interaction terms. The above equations are simplified in that HR, fB, and FVC are the only predictors shown, but the best performing models using these approaches also included a second order term for HR, interaction of HR with fB, age, height, and sex. An additional drawback to these approaches is related to the fact that the pooled dataset for this analysis includes a large number of data points from subjects at rest (approximately 10%). Approaches A, B, and C performed poorly for subjects at rest and occasionally produced negative predictions of $\dot{V}_E$ for subjects with HR of less than about 60 beats per minute. The minimum observed $\dot{V}_E$ for a subject at rest was 0.78·FVC, and we therefore substituted 0.78·FVC for any predicted ventilation value less than that for models using Approaches A, B, or C.
The difference between Approaches B and C could be characterized as a statistical power issue. By including predictors of FVC (height, age, sex, race), but not predicted FVC, Approach B essentially attempts to duplicate the FVC predictions of the GLI study, only with a smaller sample size and less statistical power. Approach C on the other hand leverages the larger sample size of the GLI study to produce better predictions of $\dot{V}_E$ than Approach B.
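After observing the increasing model complexity and poor performance at rest, we evaluated Approach D, which uses a log-transformed dependent variable as well as log-transformed predictor variables. There are several notable advantages to using a model of this form: it eliminates the need for higher order terms for any predictor variable, it cannot produce a nonsensical negative prediction, and interaction terms between predictors are implicit. Eq 4 illustrates a simple model using only HR and fB as predictors with the log-transformed interaction term between them explicitly included:

$$\ln \dot{V}_E = \beta_0 + \beta_1 \ln \mathrm{HR} + \beta_2 \ln f_B + \beta_3 \ln(\mathrm{HR}\cdot f_B) \tag{4}$$

This can be rearranged to give:

$$\ln \dot{V}_E = \beta_0 + \beta_1 \ln \mathrm{HR} + \beta_2 \ln f_B + \beta_3 \ln \mathrm{HR} + \beta_3 \ln f_B \tag{5}$$

$$\ln \dot{V}_E = \beta_0 + (\beta_1 + \beta_3) \ln \mathrm{HR} + (\beta_2 + \beta_3) \ln f_B \tag{6}$$

$$\dot{V}_E = e^{\beta_0}\,\mathrm{HR}^{\beta_1 + \beta_3}\,f_B^{\beta_2 + \beta_3} \tag{7}$$

Eq 8 shows the same model without an explicit interaction term:

$$\ln \dot{V}_E = \gamma_0 + \gamma_1 \ln \mathrm{HR} + \gamma_2 \ln f_B \tag{8}$$

This can be rearranged to give:

$$\dot{V}_E = e^{\gamma_0}\,\mathrm{HR}^{\gamma_1}\,f_B^{\gamma_2} \tag{9}$$

Evaluation of the above models with and without an explicit interaction term shows that indeed $\gamma_1 = \beta_1 + \beta_3$ and $\gamma_2 = \beta_2 + \beta_3$ such that Eqs 7 and 9 are equivalent. This obviates the need for explicitly including interaction terms or higher order terms such as $\mathrm{HR}^2$. Although the best-performing model using Approach A is similar in cross-validated performance (when corrected for values less than 0.78·FVC) to the best model using Approach D, the Approach D models are much simpler and easier to evaluate, and therefore, all presented results are from Approach D.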
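The algebra behind this equivalence can be checked numerically. The sketch below uses hypothetical coefficient values (not estimates from this study) and hypothetical function names; it confirms that a log-linear model with an explicit ln(HR·fB) term gives the same prediction as a power function whose exponents absorb the interaction coefficient.

```python
import math


def log_model_with_interaction(hr, fb, b0, b1, b2, b3):
    """Log-linear form with an explicit ln(HR * fB) term (the Eq 4 form)."""
    return b0 + b1 * math.log(hr) + b2 * math.log(fb) + b3 * math.log(hr * fb)


def power_model(hr, fb, intercept, hr_exponent, fb_exponent):
    """Power-function form without an interaction term (the Eq 9 form)."""
    return math.exp(intercept) * hr ** hr_exponent * fb ** fb_exponent


# Hypothetical coefficients chosen only for illustration
b0, b1, b2, b3 = -8.0, 1.2, 0.4, 0.3

max_rel_diff = 0.0
for hr, fb in [(60, 12), (120, 25), (180, 45)]:
    with_interaction = math.exp(log_model_with_interaction(hr, fb, b0, b1, b2, b3))
    # Fold the interaction coefficient into the HR and fB exponents.
    without_interaction = power_model(hr, fb, b0, b1 + b3, b2 + b3)
    rel_diff = abs(with_interaction - without_interaction) / without_interaction
    max_rel_diff = max(max_rel_diff, rel_diff)
```

Because ln(HR·fB) is an exact sum of ln(HR) and ln(fB), the interaction coefficient cannot be separated from the two main-effect exponents, which is why the explicit term adds nothing to the fitted model.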
Best-performing models
Given that HR is easier to measure in field studies than fB and consumer- or medical-grade wearable devices for measuring HR have greatly proliferated in recent years, we separately evaluated models using HR as the only continuously-measured variable. The best-performing of these is labeled Model D1 in Table 3, and the cross-validation results are shown in Fig 1. Models including fB as a predictor have noticeably improved predictive performance (in that the IQR of the cross-validation error is reduced). The best-performing of these is labeled Model D2 in Table 3, and the cross-validation results are shown in Fig 2. Because one of the contributing studies did not measure fB, there were 471 subjects and 14,550 data points available for estimating Model D1, but only 421 subjects and 13,767 data points available for Model D2.
Fig 1. Cross-validation results for Model D1. The median(IQR) percent error from cross-validation for this model is -0.664(45.4)%. Circles are persons without an FVC measurement; triangles are persons with measured FVC = 85–115% of the predicted value; diamonds are persons with measured FVC < 85% predicted, and squares are persons with measured FVC > 115% predicted. Dashed lines are ±25% error.
Fig 2. Cross-validation results for Model D2. The median(IQR) percent error from cross-validation for this model is 1.20(37.9)%. Circles are persons without an FVC measurement; triangles are persons with measured FVC = 85–115% of the predicted value; diamonds are persons with measured FVC < 85% predicted, and squares are persons with measured FVC > 115% predicted. Dashed lines are ±25% error.
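These models yield a power function of the form $\dot{V}_E = e^{\beta_0}\,\mathrm{HR}^{\beta_1}\,f_B^{\beta_2}\,\mathrm{age}^{\beta_3}\,\mathrm{sex}^{\beta_4}\,\mathrm{FVC}^{\beta_5}$ where β0 is the model intercept.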
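As an illustration of how such a power function is evaluated in practice, the sketch below codes Models D1 and D2 using the coefficients reported in the abstract. Several things here are assumptions rather than statements from the paper: the function and variable names are hypothetical, $\dot{V}_E$ is assumed to be in liters per minute, and the dose helper assumes a concentration expressed per cubic meter of air.

```python
import math

# Intercept and exponents as reported in the abstract of this paper.
MODEL_D1 = {"intercept": -9.59, "HR": 2.39, "age": 0.274, "sex": -0.204, "FVC": 0.520}
MODEL_D2 = {"intercept": -8.57, "HR": 1.72, "fB": 0.611, "age": 0.298,
            "sex": -0.206, "FVC": 0.614}


def predict_ve(coefs, **predictors):
    """Evaluate exp(b0) * prod(x_k ** b_k) through the equivalent log-linear sum."""
    log_ve = coefs["intercept"]
    for name, exponent in coefs.items():
        if name == "intercept":
            continue
        log_ve += exponent * math.log(predictors[name])
    return math.exp(log_ve)


def inhaled_dose(concentration_per_m3, ve_l_per_min, minutes):
    """Dose = concentration x ventilation x time (assumes VE in L/min, 1000 L per m^3)."""
    return concentration_per_m3 * (ve_l_per_min / 1000.0) * minutes


# Hypothetical subject: 30-year-old male (sex = 1), FVC 5.0 L, HR 70 bpm, fB 15 breaths/min
ve_d1 = predict_ve(MODEL_D1, HR=70, age=30, sex=1, FVC=5.0)
ve_d2 = predict_ve(MODEL_D2, HR=70, fB=15, age=30, sex=1, FVC=5.0)

# Hypothetical exposure: 20 units per m^3 for 60 minutes at the Model D2 ventilation rate
dose = inhaled_dose(20.0, ve_d2, 60)
```

Units follow the abstract: HR in beats per minute, fB in breaths per minute, age in years, FVC in liters, and sex coded 1 for males and 2 for females. Either predicted or measured FVC can be passed as the `FVC` argument.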
Discussion
Effect of FVC
Only three of the eight contributors performed baseline lung function measurements [8, 14, 16, 25]. These included 83 unique subjects and 4,226 one-minute data points, and these subjects were disproportionately high-performing athletes. As a consequence, our models using measured baseline FVC as a predictor have substantially less statistical power than models using estimated FVC. On the other hand, given that both airway disease and genetic diversity can result in large differences between an individual’s predicted and actual FVC, FVC measurements have the advantage of capturing the effect of these differences on $\dot{V}_E$. The cross-validation results shown in Figs 1 and 2 use predicted FVC as a predictor variable, but data points are shape- and color-coded based on measured FVC. These results suggest that how well an individual’s predicted lung function agrees with measured lung function has an important influence on predictions of $\dot{V}_E$. Table 4 describes the results of Model D2 stratified by lung function status. $\dot{V}_E$ is substantially overestimated for persons with lower than normal lung capacity and underestimated for persons with higher than normal FVC. $\dot{V}_E$ is somewhat underestimated for persons with measured FVC close to the predicted volume, though to a lesser extent than for persons with high FVC. $\dot{V}_E$ is somewhat overestimated for persons with unmeasured FVC. These results are similar in other models including FVC as a predictor, but are exaggerated in models that do not include FVC. Given that $\dot{V}_E$ is increased during physical activity by increasing tidal volume as well as fB, and that tidal volume is related to FVC, the observation that $\dot{V}_E$ is overestimated for persons with low FVC is consistent with these persons having lower than normal tidal volume as well. Since tidal volume cannot be easily measured in ambulatory settings, our findings support the use of FVC measurements as an appropriate proxy to adjust for the effect of lung volume.
Table 5 presents the same analysis as Table 4 except that measured FVC is used to predict $\dot{V}_E$. Note that these results are still from Model D2 wherein parameter estimates are calculated based on predicted FVC. $\dot{V}_E$ predictions using measured FVC are substantially more accurate for persons with measured FVC differing from the predicted volume in either direction, and the distribution of error is more symmetrical. We therefore recommend that predictions of $\dot{V}_E$ using Models D1 or D2 be made using measurements of FVC if possible, particularly for persons with non-normal lung function.
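A rough sense of the size of this adjustment follows from the power-function form: with all other predictors held fixed, swapping predicted FVC for measured FVC rescales the Model D2 prediction by the FVC ratio raised to the FVC exponent (0.614, from the abstract). The worked values below are approximate and use the 85% and 115% thresholds that define the low and high FVC groups.

```latex
\frac{\dot{V}_E(\mathrm{FVC}_{\mathrm{measured}})}{\dot{V}_E(\mathrm{FVC}_{\mathrm{predicted}})}
  = \left(\frac{\mathrm{FVC}_{\mathrm{measured}}}{\mathrm{FVC}_{\mathrm{predicted}}}\right)^{0.614},
\qquad
0.85^{0.614} \approx 0.905,
\qquad
1.15^{0.614} \approx 1.090
```

In other words, for a person whose measured FVC is 15% below or above the predicted value, using the measured value shifts the predicted $\dot{V}_E$ by roughly 9–10% in the corresponding direction.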
Table 4. Model performance is shown for subjects with and without FVC measurements, and subjects with FVC measurements are further stratified into low, high, and normal FVC groups.
Table 5. Model performance is shown for subjects with FVC measurements stratified into low, high, and normal FVC groups.
Effect of FEV1
In obstructive airway disease such as asthma, bronchial constrictions can reduce expiratory flow during rapid exhalation without altering vital capacity [26]. The resulting reduced FEV1/FVC ratio is a classic trait of obstructive airway disease, and it is plausible that in such cases, the baseline FEV1 value would have a stronger influence on $\dot{V}_E$ than FVC. We therefore explored the influence of baseline FEV1 measurements on $\dot{V}_E$ predictions, but did not observe any improvement in model performance. It should be noted that since predictions of FEV1 assume no airway disease, there is no meaningful difference between models developed using predicted FVC and predicted FEV1 (though the parameter estimates for each are of course different). The performance of models using measured FEV1 as a predictor was essentially no different than models using measured FVC. However, this finding may be due to the fact that all subjects with lung function measurements had a FEV1/FVC ratio close to normal (median = 0.81), and we cannot draw a conclusion on the relative merits of baseline FVC versus FEV1 for purposes of estimating $\dot{V}_E$.
Effect of age
For models including age as a predictor, but not height, sex, or FVC, the parameter estimate(standard error) for age is 0.45(0.023) when fB is included and 0.43(0.024) when it is not. This implies that all else being equal, $\dot{V}_E$ would increase with age. However, previous studies have suggested that in adulthood, resting $\dot{V}_E$ is not sensitive to age independent of other factors [27]. In order to explore this discrepancy, we divided the dataset into two strata by age, successively using the ages 15–30 years as the cutpoint. In each case, the younger stratum had a larger parameter estimate for age, and we found a negligible effect of age on $\dot{V}_E$ in strata consisting of persons over 24 years. Other factors that influence $\dot{V}_E$ during activity are known to be affected by age, including VO2 max, maximum voluntary ventilation, response to hypoxia, and FVC [27]. These factors lead to age-related differences in HR and fB for the same level of activity, and as a consequence, models including HR, fB, FVC as well as age are equally predictive of $\dot{V}_E$ in the adult population as in the child or adolescent population. We did not observe an improvement in predictive performance by stratifying the model by age (regardless of the cutpoint), perhaps due to the reduction in statistical power resulting from stratification.
Effect of height
Previous studies have found that HR is higher [28] and FVC is lower [22, 23] in persons of shorter stature. If HR and FVC are included as predictors of $\dot{V}_E$, the addition of height does not improve predictive performance and the parameter estimate is non-significant, suggesting that much of the effect of height on $\dot{V}_E$ is a result of the height-related changes to HR and FVC. If FVC is not included as a predictor, however, the effect of height is pronounced and statistically significant. When comparing models including FVC but not height to models including height but not FVC, the predictive performance of the FVC models is better, particularly when using measured rather than estimated FVC. Taken together, these findings suggest that height does not have a large effect on $\dot{V}_E$ independent of its effect on lung capacity.
Effect of sex
We evaluated the effect of sex in three different ways: including sex as a predictor, stratifying by sex, and cross-validating by sex (i.e. using males as the training set and females as the validation set, then vice versa). All three methods suggested a small but significant effect of sex on predictions of $\dot{V}_E$. Using sex as a predictor produced a statistically-significant parameter estimate for sex which implied that all else being equal (including FVC), $\dot{V}_E$ is 13% lower in females than in males. When stratifying by sex, the parameter estimates for HR and age were similar across strata while those for fB and FVC were markedly different. In addition, cross-validation by sex was substantially worse than the random 10-fold cross validation. Taken together, these results suggest an effect of sex on $\dot{V}_E$ that is independent of FVC. Including sex as a predictor resulted in better-performing models than stratifying by sex, perhaps due to the reduction in statistical power resulting from stratification.
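The 13% figure can be traced directly to the sex exponents reported in the abstract. Because sex is coded 1 for males and 2 for females and enters the model as a power term, the female-to-male ratio with all other predictors equal is 2 raised to the sex exponent (values below are approximate):

```latex
\frac{\dot{V}_{E,\mathrm{female}}}{\dot{V}_{E,\mathrm{male}}}
  = \frac{2^{\beta_{\mathrm{sex}}}}{1^{\beta_{\mathrm{sex}}}}
  = 2^{\beta_{\mathrm{sex}}},
\qquad
2^{-0.204} \approx 0.868 \ \text{(Model D1)},
\qquad
2^{-0.206} \approx 0.867 \ \text{(Model D2)}
```

Both models therefore imply a $\dot{V}_E$ that is roughly 13% lower in females than in males at the same HR, fB, age, and FVC.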
Effect of race or ethnicity
Previous studies that have examined lung function in diverse populations have observed important differences associated with race or national origin [22, 23]. This diversity is likely the result of both differences in developmental environment and genetic factors [29, 30] such as adaptation to high altitude [31, 32]. We attempted to evaluate the effect of race or ethnicity in a similar fashion as the effect of sex. However, the compiled dataset was disproportionately composed of white subjects, and there were insufficient numbers of all other race or ethnicity categories to meaningfully evaluate each on its own. We instead used white and non-white race categories where non-white consisted of the African-American, Asian, and Hispanic categories. We acknowledge that this is not an ideal approach for assessing the role of human genetic diversity on $\dot{V}_E$. In particular, we note that there is likely a great deal of diversity within each race category, that each of the non-white categories is likely to be quite different from the others, and that the Hispanic category does not necessarily identify genetic background and could include persons with various contributions of European, African, and Native American genetics. Stratifying Models D1 and D2 by race resulted in parameter estimates that were somewhat different from each other; but when cross-validating by race, the percent error was unchanged from random 10-fold cross-validation. It is conceivable that there is no effect of race independent of an effect of race on FVC; however, it is also conceivable that there is an effect of race but that this pooled data set was not sufficiently powered in non-white racial categories to detect that effect.
Effect of lagged HR
HR and fB have different response and relaxation times following stimuli [21], and the relationship between HR and $\dot{V}_E$ may be different when HR is increasing during activity than when it is decreasing. To parameterize this phenomenon, we evaluated models including either lagged or factorial terms for HR and fB as predictors. Note that these terms are unavailable for the first several minutes of each participant’s session (for however many minutes are lagged or included in the factorial term), and this results in some loss of statistical power. Factorial terms performed better than lagged terms in all cases. The parameter estimates for HR factorials were significant ($p < 1 \times 10^{-6}$), but fB factorials were not. Nonetheless, inclusion of these terms as predictors did not improve model performance, and we did not include them in our recommended models. It is possible that the direction of the HR trend (increasing or decreasing) is unimportant for predicting $\dot{V}_E$; however, this data set was primarily assembled from exercise tests of increasing intensity such that the vast majority of data points are from an increasing HR trend. It is therefore also conceivable that including factorial terms to identify the HR trend may be useful in predicting $\dot{V}_E$ in the post-maximum exertion time period, but that effect is not detectable with this dataset.
Effect of “study”
We evaluated the possible systematic effects of which participating study collected data in two ways: we included a random effect for “study”, and we cross-validated by study (i.e., we in turn used data from each study as a validation set and data from the other seven studies as the training set). The random effect for study was very small in comparison to the random effect for subject, and the cross-validation results by study were not meaningfully different from the random 10-fold cross-validation. These results are shown in Table B and Figure D of S1 Text. Taken together, this suggests that there were no large systematic differences in relationships between variables in data collected from the various contributing studies, and we therefore did not include a random effect for “study” in the final analysis.
Comparison with previous studies
Several different models for predicting $\dot{V}_E$ have been previously proposed. The underlying methodologies for these models are diverse and include static estimates of $\dot{V}_E$ based on the type of activity, models based on energy expenditure, metabolic equivalents, oxygen consumption, HR, fB, or a combination of HR and fB. Most of these previously published models have not been cross-validated in a large sample. Dons et al. [13], whose lead author is a co-author of this paper, recently compared the calculated $\dot{V}_E$ and air pollution dose using 16 different models on subjects using wearable sensors. This study found a very wide range of predicted $\dot{V}_E$. For some activities, the predictions differed by a factor of 2–4 using the same data as input. The application of previously-published models to our assembled dataset is shown in Table 6 along with the results of Models D1 and D2 from this paper. In addition to the random 10-fold cross-validation results, we have also included the results of cross-validation by study in this table as this may be a more fitting comparison for models from other studies. Please note that Table 6 only displays models that can be evaluated using data included in our dataset. Both Models D1 and D2 presented here have a substantially lower percent error than any previously published model. The best performing model evaluated by Dons et al. was that of Zuurbier et al. [6]. When evaluated using our pooled data set, the performance of this model is substantially worse than either Model D1 or Model D2 with a median(IQR) percent error of 4.20(68.3)% as compared to -0.664(45.4)% and 1.20(37.9)% for Models D1 and D2 respectively.
For reference, the results of Models D1 and D2 from this paper are shown, including both random 10-fold cross-validation and cross-validation by study.
Limitations
Parameter estimates for all models in this study were calculated using predicted rather than measured FVC. By definition, these predictions are accurate for persons with average lung function, but this obscures the fact that there is a wide range of lung function values even among healthy individuals. The standard deviation for FVC predictions from the GLI study is approximately ±10% of the predicted value, and the lower limit of normal is approximately 20% lower than the predicted value. In addition, many persons do not have normal lung function, including people who are susceptible to the health effects of air pollution exposure. This includes asthmatics [35–37] and persons with chronic obstructive pulmonary disease [38]. Asthmatics frequently have lower lung function than non-asthmatics depending on phenotype and age of onset [39]. Furthermore, air pollution exposure itself is associated with decreased lung function [35, 40–43]. Given the large number of nominally healthy subjects included in this data set, it is plausible that there were approximately equal numbers of participants with FVC above and below the predicted value and that the parameter estimates are not biased. Of the 471 subjects included in the data set, 82 had an FVC measurement; of these, fourteen had measured FVC more than 15% lower than the predicted value, and eight had measured FVC more than 15% higher. As previously discussed, this small difference in lung function had an observable effect on minute ventilation predictions, and this error was ameliorated by estimating minute ventilation using the measured value of FVC instead.
It is additionally possible that if the data set had included large numbers of participants with asthma or other airway disease, or who otherwise had measured lung function substantially different from predicted, the calculated parameter estimates for Models D1 and D2 would be meaningfully different from those reported here. It is further possible that measured FEV1 would be a better predictor of minute ventilation than measured FVC.
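The size of the error introduced by using predicted rather than measured FVC follows directly from the power-function form of the models. The short sketch below uses the FVC exponents reported in the abstract (0.520 for Model D1, 0.614 for Model D2) and assumes that all other predictors are held fixed and that the model evaluated with measured FVC is the reference. The function name is hypothetical, and this is an arithmetic illustration rather than a result from the paper.

```python
# FVC exponents of Models D1 and D2 as reported in the abstract.
FVC_EXPONENT = {"D1": 0.520, "D2": 0.614}

def prediction_bias_percent(measured_to_predicted_ratio, model="D1"):
    """Percent difference between minute ventilation computed with predicted
    FVC and minute ventilation computed with measured FVC.

    measured_to_predicted_ratio: measured FVC divided by predicted FVC,
    e.g. 0.85 for a person whose measured FVC is 15% below prediction.
    """
    b = FVC_EXPONENT[model]
    # VE(predicted FVC) / VE(measured FVC) = (predicted / measured) ** b
    return 100.0 * (measured_to_predicted_ratio ** (-b) - 1.0)

for ratio in (0.85, 1.15):
    for model in ("D1", "D2"):
        bias = prediction_bias_percent(ratio, model)
        print(f"measured/predicted FVC = {ratio:.2f}, Model {model}: {bias:+.1f}%")
```

A positive value indicates over-prediction, which occurs when measured FVC is below the predicted value; a negative value indicates under-prediction, which occurs when measured FVC is above it.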
Another limitation of this paper is that all changes in minute ventilation were driven by physical activity. It has been previously established that noise and anxiety affect minute ventilation [44–47], and it is plausible that changes in minute ventilation driven by noise or anxiety will have a different relationship with HR and fB than those driven by physical activity. In the context of air pollution exposure, this would be relevant for persons in a loud or stressful transportation environment with elevated air pollutant concentrations. Additional research is necessary to determine whether this is true and, if so, to what extent, and what parameters might be useful for accurately estimating minute ventilation in persons experiencing noise, stress, or anxiety.
Conclusion
We describe a method for estimating minute ventilation in healthy individuals using HR as the continuously measured predictor. Model accuracy and precision are improved by including continuously measured fB data as well. These predictions have been validated in a large, diverse data set of 471 unique persons aged 4–80 years, collected as part of eight independent studies. We found FVC to be an important factor in predicting minute ventilation; predicted FVC calculated according to a large, well-powered study such as the GLI is a substantial improvement over not accounting for FVC; however, using measurements of FVC to estimate minute ventilation further improved predictions, especially in persons with lung function higher or lower than normal. We additionally found age and sex to be important predictors; however, we did not find height or race to be important predictors independent of their influence on FVC. These models have been validated in individuals whose minute ventilation is modulated in response to physical activity, and model results may not be accurate for predicting minute ventilation that is modulated by stress, noise, or anxiety. This method is more accurate and precise than other predictive models for estimating minute ventilation and has the advantage of relying on predictors that are easily measured in the field without specialized equipment.
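For readers who wish to apply the models, the two equations given in the abstract can be written as short functions. The sketch below is illustrative only. The coefficients are those reported in the abstract; the function names, the assumption that minute ventilation is returned in liters per minute, and the dose helper (which assumes concentration in µg/m³ and a constant ventilation over the interval) are not taken from the paper.

```python
import math

def ve_model_d1(hr, age, sex, fvc):
    """Model D1: HR in beats/min, age in years, sex 1 = male / 2 = female,
    FVC in liters. Output assumed to be in L/min."""
    return (math.exp(-9.59) * hr**2.39 * age**0.274
            * sex**-0.204 * fvc**0.520)

def ve_model_d2(hr, fb, age, sex, fvc):
    """Model D2: as Model D1, plus breath frequency fb in breaths/min."""
    return (math.exp(-8.57) * hr**1.72 * fb**0.611 * age**0.298
            * sex**-0.206 * fvc**0.614)

def inhaled_dose_ug(concentration_ug_m3, ve_l_min, minutes):
    """Inhaled dose as concentration times minute ventilation times duration.
    Assumes concentration in ug/m^3 and ventilation in L/min (1 m^3 = 1000 L)."""
    return concentration_ug_m3 * (ve_l_min / 1000.0) * minutes

# Hypothetical person: 30-year-old male, FVC 5 L, HR 100 bpm, 20 breaths/min.
ve_d1 = ve_model_d1(hr=100, age=30, sex=1, fvc=5.0)
ve_d2 = ve_model_d2(hr=100, fb=20, age=30, sex=1, fvc=5.0)
dose = inhaled_dose_ug(concentration_ug_m3=35.0, ve_l_min=ve_d2, minutes=30)
print(f"Model D1: {ve_d1:.1f} L/min, Model D2: {ve_d2:.1f} L/min")
print(f"Dose over 30 min at 35 ug/m^3 (Model D2): {dose:.1f} ug")
```

In practice, HR and fB would be time series from a wearable sensor, and the dose would be accumulated by summing concentration × minute ventilation × time step over the monitoring period. Where a measured FVC is available, the paper recommends using it in place of the predicted value.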
Supporting information
S1 Text. Supporting information file.
This file contains a text description and results of preliminary exploratory models. It includes three figures showing cross-validation results of these models as well as a figure showing the results of cross-validation by study of Model D2 from the main text. Finally, it includes a table of the random effects for subject, study, HR, and fB.
https://doi.org/10.1371/journal.pone.0218673.s001
(DOCX)
S1 Data. The deidentified pooled data file used for all models is included in a supplementary file labeled S1 Data.
https://doi.org/10.1371/journal.pone.0218673.s002
(TXT)
Acknowledgments
The authors wish to acknowledge Dr. Carla Ramos of CTN Tecnico Lisboa and Dr. Izabela Campos Cozza of Universidade de São Paulo for kindly sharing data.
References
- 1. Lim SS, Vos T, Flaxman AD, Danaei G, Shibuya K, Adair-Rohani H, et al. A comparative risk assessment of burden of disease and injury attributable to 67 risk factors and risk factor clusters in 21 regions, 1990–2010: a systematic analysis for the Global Burden of Disease Study 2010. The Lancet. 2012;380(9859):2224–60. pmid:23245609
- 2. Caiazzo F, Ashok A, Waitz IA, Yim SHL, Barrett SRH. Air pollution and early deaths in the United States. Part I: Quantifying the impact of major sectors in 2005. Atmospheric Environment. 2013;79:198–208.
- 3. Brokamp C, LeMasters GK, Ryan PH. Residential mobility impacts exposure assessment and community socioeconomic characteristics in longitudinal epidemiology studies. Journal of Exposure Science & Environmental Epidemiology. 2016;26(4):428–34. PMC4913165. pmid:26956935
- 4. Strickland MJ, Gass KM, Goldman GT, Mulholland JA. Effects of ambient air pollution measurement error on health effect estimates in time-series studies: a simulation-based analysis. Journal of Exposure Science & Environmental Epidemiology. 2015;25(2):160–6. pmid:23571405
- 5. Goldman GT, Mulholland JA, Russell AG, Gass K, Strickland MJ, Tolbert PE. Characterization of ambient air pollution measurement error in a time-series health study using a geostatistical simulation approach. Atmospheric Environment. 2012;57:101–8. PMC3628542. pmid:23606805
- 6. Zuurbier M, Hoek G, van den Hazel P, Brunekreef B. Minute ventilation of cyclists, car and bus passengers: an experimental study. Environmental Health. 2009;8(48):1–10. pmid:19860870
- 7. Satoh T, Higashi T, Sakurai H, Omae K. Development of a new exposure monitoring system considering pulmonary ventilation. The Keio Journal of Medicine. 1989;38(4):432–42. pmid:2630781
- 8. Cozza IC, Zanetta DMT, Fernandes FLA, da Rocha FMM, de Andre PA, Garcia MLB, et al. An approach to using heart rate monitoring to estimate the ventilation and load of air pollution exposure. Science of The Total Environment. 2015;520:160–7. pmid:25813969
- 9. Ramos CA, Reis JF, Almeida T, Alves F, Wolterbeek HT, Almeida SM. Estimating the inhaled dose of pollutants during indoor physical activity. Science of The Total Environment. 2015;527–528:111–8. pmid:25958360
- 10. Bigazzi AY, Figliozzi MA. Dynamic ventilation and power output of urban bicyclists. Journal of the Transportation Research Board. 2015;2520:52–60.
- 11. Faria M, Duarte G, Vasconcelos A, Farias T. Evaluation of a numerical methodology to estimate pedestrians’ energy consumption and PM inhalation. Transportation Research Procedia. 2014;3:780–9.
- 12. Johnson T. A guide to selected algorithms, distributions, and databases used in exposure models developed by the Office of Air Quality Planning and Standards. Research Triangle Park, NC: Environmental Protection Agency, 2002 Contract No.: CR827033.
- 13. Dons E, Laeremans M, Orjuela JP, Avila-Palencia I, Carrasco-Turigas G, Cole-Hunter T, et al. Wearable sensors for personal monitoring and estimation of inhaled traffic-related air pollution: evaluation of methods. Environmental Science & Technology. 2017;51(3):1859–67. pmid:28080048
- 14. Greenwald R, Hayat MJ, Barton J, Lopukhin A. A novel method for quantifying the inhaled dose of air pollutants based on heart rate, breathing rate and forced vital capacity. PLoS ONE. 2016;11(1):14. pmid:26809066
- 15. Adams WC. Measurement of breathing rate and volume in routinely performed daily activities. Sacramento: California Air Resources Board, Division R; June 1993. Report No.: A033-205 Contract No.: A033-205.
- 16. Giles LV, Brandenburg JP, Carlsten C, Koehle MS. Physiological responses to diesel exhaust exposure are modified by cycling intensity. Med Sci Sports Exerc. 2014;46(10):1999–2006. pmid:24561816.
- 17. Jakovljevic DG, Popadic-Gacesa JZ, Barak OF, Nunan D, Donovan G, Trenell MI, et al. Relationship between peak cardiac pumping capability and indices of cardio-respiratory fitness in healthy individuals. Clinical Physiology and Functional Imaging. 2012;32(5):388–93. pmid:22856346
- 18. Villar R, Beltrame T, Hughson RL. Validation of the Hexoskin wearable vest during lying, sitting, standing, and walking activities. Applied Physiology, Nutrition, and Metabolism. 2015;40(10):1019–24. pmid:26360814
- 19. Good N, Carpenter T, Anderson B, Wilson A, Peel JL, Browning R, et al. Development and validation of models to predict personal ventilation rate for air pollution research. Journal of Exposure Science and Environmental Epidemiology. 2018;in press.
- 20. Babb TG. Exercise ventilatory limitation: the role of expiratory flow limitation. Exercise and Sport Sciences Reviews. 2013;41(1):11–8. PMC3529766. pmid:23038244
- 21. Barrera-Ramirez J, Bravi A, Green G, Seely AJ, Kenny GP. Comparison of heart and respiratory rate variability measures using an intermittent incremental submaximal exercise model. Applied Physiology, Nutrition and Metabolism. 2013;38(11):1128–36. PubMed Central PMCID: PMC24053520. pmid:24053520
- 22. Hankinson JL, Odencrantz JR, Fedan KB. Spirometric reference values from a sample of the general U.S. population. Am J Respir Crit Care Med. 1999;159(1):179–87. pmid:9872837
- 23. Quanjer PH, Stanojevic S, Cole TJ, Baur X, Hall GL, Culver BH, et al. Multi-ethnic reference values for spirometry for the 3–95 year age range: the Global Lung Function 2012 equations. European Respiratory Journal. 2012;40(6):1324–43. PMC3786581. pmid:22743675
- 24. Diggle PJ, Heagerty P, Liang K-Y, Zeger SL. Analysis of Longitudinal Data. Oxford: Oxford University Press; 2002. 400 p.
- 25. Giles LV, Carlsten C, Koehle MS. The pulmonary and autonomic effects of high-intensity and low-intensity exercise in diesel exhaust. Environmental Health. 2018;17(1):87. pmid:30541575
- 26. Anderson SD, Alison JA. Exercise as a Stimulus. In: Barnes PJ, Drazen JM, Rennard SI, Thomson NC, editors. Asthma and COPD (Second Edition). Oxford: Academic Press; 2009. p. 495–506.
- 27. Sharma G, Goodwin J. Effect of aging on respiratory system physiology and immunology. Clinical Interventions in Aging. 2006;1(3):253–60. PMC2695176. pmid:18046878
- 28. Smulyan H, Marchais SJ, Pannier B, Guerin AP, Safar ME, London GM. Influence of body height on pulsatile arterial hemodynamic data. Journal of the American College of Cardiology. 1998;31(5):1103–9. pmid:9562014
- 29. Hubert HB, Fabsitz RR, Feinleib M, Gwinn C. Genetic and environmental influences on pulmonary function in adult twins. Am Rev Respir Dis. 1982;125(4):409–15. pmid:7200340.
- 30. Wilk JB, Djousse L, Arnett DK, Rich SS, Province MA, Hunt SC, et al. Evidence for major genes influencing pulmonary function in the NHLBI Family Heart Study. Genetic Epidemiology. 2000;19(1):81–94. pmid:10861898
- 31. Kiyamu M, Elías G, León‐Velarde F, Rivera‐Chira M, Brutsaert TD. Aerobic capacity of Peruvian Quechua: a test of the developmental adaptation hypothesis. American Journal of Physical Anthropology. 2015;156(3):363–73. pmid:25385548
- 32. Weitz CA, Garruto RM, Chin CT. Larger FVC and FEV1 among Tibetans compared to Han born and raised at high altitude. American Journal of Physical Anthropology. 2016;159(2):244–55. pmid:26407532
- 33. Do Vale ID, Vasconcelos AS, Duarte GO. Inhalation of particulate matter in three different routes for the same OD pair: A case study with pedestrians in the city of Lisbon. Journal of Transport & Health. 2015;2(4):474–82.
- 34. McArdle WD, Katch FI, Katch VL, editors. Exercise Physiology. Fifth ed. Philadelphia: Lippincott Williams & Wilkins; 2011.
- 35. McCreanor J, Cullinan P, Nieuwenhuijsen MJ, Stewart-Evans J, Malliarou E, Jarup L, et al. Respiratory effects of exposure to diesel traffic in persons with asthma. New England Journal of Medicine. 2007;357(23):2348–58. pmid:18057337
- 36. Mirabelli MC, Golan R, Greenwald R, Raysoni AU, Holguin F, Kewada P, et al. Modification of traffic-related respiratory response by asthma control in a population of car commuters. Epidemiology. 2015;26(4):546–55. Epub April 23, 2015. pmid:25901844.
- 37. Zora JE, Sarnat SE, Raysoni AU, Johnson BA, Li W-W, Greenwald R, et al. Associations between urban air pollution and pediatric asthma control in El Paso, Texas. Science of the Total Environment. 2013;448(1):56–65. pmid:23312496
- 38. Bloemsma LD, Hoek G, Smit LAM. Panel studies of air pollution in patients with COPD: Systematic review and meta-analysis. Environmental Research. 2016;151:458–68. pmid:27565881
- 39. National Asthma Education and Prevention Program. Expert Panel Report 3: Guidelines for the Diagnosis and Management of Asthma. Bethesda, MD: National Heart, Lung, and Blood Institute; August 28, 2007. Report No.: 07–4051.
- 40. Holguín F, Flores S, Ross Z, Cortez M, Molina M, Molina L, et al. Traffic-related exposures, airway function, inflammation, and respiratory symptoms in children. Am J Respir Crit Care Med. 2007;176(12):1236–42. pmid:17641154
- 41. Schultz ES, Litonjua AA, Melén E. Effects of long-term exposure to traffic-related air pollution on lung function in children. Current Allergy and Asthma Reports. 2017;17(6):41. PMC5446841. pmid:28551888
- 42. Frye C, Hoelscher B, Cyrys J, Wjst M, Wichmann H-E, Heinrich J. Association of lung function with declining ambient air pollution. Environmental Health Perspectives. 2003;111(3):383–7. pmid:12611668
- 43. Gauderman WJ, Avol E, Gilliland F, Vora H, Thomas D, Berhane K, et al. The effect of air pollution on lung development from 10 to 18 years of age. New England Journal of Medicine. 2004;351(11):1057–67. http://content.nejm.org/cgi/content/abstract/351/11/1057. pmid:15356303
- 44. Bernardi L, Porta C, Sleight P. Cardiovascular, cerebrovascular, and respiratory changes induced by different types of music in musicians and non‐musicians: the importance of silence. Heart. 2006;92(4):445–52. PMC1860846. pmid:16199412
- 45. Gomez P, Danuser B. Affective and physiological responses to environmental noises and music. International Journal of Psychophysiology. 2004;53(2):91–103. pmid:15210287
- 46. Masaoka Y, Homma I. Expiratory time determined by individual anxiety levels in humans. Journal of Applied Physiology. 1999;86(4):1329–36. pmid:10194219
- 47. Masaoka Y, Homma I. Anxiety and respiratory patterns: their relationship during mental stress and physical load. International Journal of Psychophysiology. 1997;27(2):153–9. pmid:9342646