Predicting VO2peak from Submaximal- and Peak Exercise Models: The HUNT 3 Fitness Study, Norway

  • Henrik Loe,

    Affiliations K.G. Jebsen Center of Exercise in Medicine at Department of Circulation and Medical Imaging, Norwegian University of Science and Technology, Trondheim, Norway, Valnesfjord Rehabilitation Center, Valnesfjord, Norway

  • Bjarne M. Nes,

    Affiliation K.G. Jebsen Center of Exercise in Medicine at Department of Circulation and Medical Imaging, Norwegian University of Science and Technology, Trondheim, Norway

  • Ulrik Wisløff

    Ulrik.wisloff@ntnu.no

    Affiliation K.G. Jebsen Center of Exercise in Medicine at Department of Circulation and Medical Imaging, Norwegian University of Science and Technology, Trondheim, Norway


Abstract

Purpose

Peak oxygen uptake (VO2peak) is seldom assessed in health care settings, although it is inversely linked to cardiovascular risk and all-cause mortality. The aim of this study was to develop VO2peak prediction models for men and women based on directly measured VO2peak from a large healthy population.

Methods

VO2peak prediction models based on submaximal- and peak performance treadmill work were derived from multiple regression analysis. 4637 healthy men and women aged 20–90 years were included. Data splitting was used to generate validation and cross-validation samples.

Results

The accuracy of the peak performance models was 10.5% (SEE = 4.63 mL⋅kg-1⋅min-1) and 11.5% (SEE = 4.11 mL⋅kg-1⋅min-1) for men and women, respectively, with 75% and 72% of the variance explained. For the submaximal performance models, accuracy was 14.1% (SEE = 6.24 mL⋅kg-1⋅min-1) and 14.4% (SEE = 5.17 mL⋅kg-1⋅min-1) for men and women, respectively, with 55% and 56% of the variance explained. The validation and cross-validation samples displayed SEE and variance explained in agreement with the total sample. Cross-classification between measured and predicted VO2peak accurately classified 91% of the participants within the correct or nearest quintile of measured VO2peak.

Conclusion

Judicious use of the exercise prediction models presented in this study offers valuable information in providing a fairly accurate assessment of VO2peak, which may be beneficial for risk stratification in health care settings.

Introduction

Peak oxygen uptake (VO2peak) is widely referred to as cardiorespiratory fitness (CRF) [1], and is inversely linked to cardiovascular disease, hypertension, certain cancers, metabolic syndrome [2,3], and all-cause mortality [4]. At present there is no consensus on a precise threshold of cardiorespiratory fitness below which cardiovascular risk increases. However, values below 8 METs and 6 METs in healthy men and women, respectively, are linked with higher all-cause mortality and adverse cardiovascular effects [4]. Additionally, data suggest that MET levels > 9 and > 7 (vs. lower MET levels) among men and women, respectively, are associated with a mortality risk reduction of ≥ 50% over an average of 8 years of follow-up [5]. Despite being an essential health indicator, VO2peak is rarely assessed in health care settings [5,6], likely because direct gas analysis measurement of VO2peak is expensive and requires advanced equipment and trained personnel [2]. However, reliable and valid prediction models should be considered, as several studies have shown that either directly measured or estimated VO2peak enhances CVD-mortality prediction beyond traditional risk factors [7,8].

Although a maximal test is considered a safe practice, complications and adverse effects occur, normally linked to underlying disease [9]. Consequently, health care personnel should monitor when testing individuals at high risk.

There exist several VO2peak prediction models in the literature. Common limitations in these models are the use of uniform age samples [10–12], only one gender being represented [13–17], as well as models being based on subjects with homogeneous cardiorespiratory fitness levels [10,12,13,15,18]. Hence, they yield fair VO2peak predictions only in subjects similar to those used in generating the model [2,19].

Therefore, the aim of the present study was to develop VO2peak prediction models from both submaximal- and peak treadmill performance, on the basis of data from a large healthy population of both men and women 20–90 years, with a great diversity in measured VO2peak. If these models show fair predictive accuracy they will provide a safe and feasible method for estimating VO2peak for a wide variety of people.

Methods

Study sample

In 2006–2008, the total population above 20 years of age in Nord-Trøndelag county, Norway, was invited to the third wave of the HUNT study (HUNT 3). Out of a total population of 94194, 54% accepted the invitation (n = 50821). A sub-study (The HUNT Fitness Study) invited healthy subjects (without cardiovascular disease, cancer, pulmonary disease and use of blood pressure medication) in three pre-selected municipalities within the county to perform treadmill testing with direct measurement of maximal oxygen uptake (VO2max). Out of 12609 potentially eligible participants, 5633 appeared, and 1003 failed to complete the cardiopulmonary exercise test (CPET), withdrew or were excluded for medical reasons detected during the medical interview. 4637 participants completed the exercise testing.

Ethics statement

The study was approved by REK- Regional Committees for Medical and Health Research Ethics (2013/1788/REK nord), the Norwegian Data Inspectorate and the National Directorate of Health. The study was conducted in conformity with the Declaration of Helsinki and all participants signed a document of informed consent.

Exercise test procedures

A 10-minute warm-up was implemented with workload individualized to induce some sweat and moderately augmented heart rate and breathing, but without exhaustion. After the warm-up, subjects entered the treadmill used for testing (DK7830; DK City, Taichung, Taiwan) and were equipped with a heart rate monitor (Polar S610 or RS400; Polar, Kempele, Finland) and face mask (Hans Rudolph; Shawnee, KS). Subjects were instructed to avoid grasping the handrail. Cardiorespiratory variables were measured continuously using ergospirometry (MetaMax II; Cortex Biophysik GmbH, Leipzig, Germany) connected to computer software (Cortex MetaSoft, version 1.11.5). A graded individualized treadmill protocol, starting with the warm-up workload, was used with subjects walking or running at gradually increased speed and/or inclination. Treadmill speed was increased (0.5–1.0 km⋅h-1) when VO2 measurements remained stable for > 30 s, keeping a fixed inclination if possible. The test was terminated when the subject reached volitional exhaustion (e.g. leg fatigue and shortness of breath), preferably within 8–12 minutes (Table 1). VO2max was taken as the mean of the three successive highest 10-s VO2 values and defined by a leveling off of VO2 (<2 mL⋅kg-1⋅min-1 change over the span of these successive measurements) despite increasing speed and/or inclination, in combination with a respiratory exchange ratio (R) above 1.05 and subjective volitional exhaustion (e.g. leg fatigue and shortness of breath). Since a total of 17.6% of the subjects failed to reach all the criteria, the term VO2peak was used. During the incremental test most subjects had their steady-state VO2 measured at one (n = 2827) or two (n = 2576) submaximal levels. At the first submaximal level (VO2 below the ventilatory anaerobic threshold, established by V-slope), steady-state VO2 was obtained from each subject after 3 minutes. Measurements at this level were used to develop the submaximal models. 
At each level, as well as at peak performance, treadmill velocity and inclination, in addition to heart rate, were also registered. Velocities in the range 5.9–8.0 km⋅h-1 typically represent the transition from walking to running, with individual variation attributed to differences in e.g. stride length, leg length and body size [20–22]. Test velocities used in development of the VO2peak prediction models suggest that most participants (92%) walked during the first submaximal measurement, whereas approximately 80% were running during peak measurements. An average of 87% of all participants used 10% treadmill inclination during both the first submaximal and the peak measurements. For development of the submaximal performance models, peak heart rates (HRpeak) were predicted from age in two gender-specific linear regression models based on the HUNT 3 fitness data (men: HRpeak = 215.336 − (0.73 x age), R2 = 0.40, SEE = 12.25; women: HRpeak = 212.497 − (0.702 x age), R2 = 0.40, SEE = 11.71) and integrated into the fraction peak heart rate variable (Fraction peakHR: HRsubmax/(215.336 − 0.73 x age) for men; HRsubmax/(212.497 − 0.702 x age) for women). All tests were performed by trained personnel, and test equipment was routinely calibrated, with ventilation volume calibrated every third test and gas calibrated every fifth test. Height was measured in centimeters with one decimal, and weight in kilograms with one decimal, by internally standardized procedures.
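The age-based HRpeak regressions and the fraction peakHR variable described above can be sketched in a few lines of Python. This is only an illustration: the function and parameter names are hypothetical, and the grouping HRsubmax/(intercept − slope x age) is an assumed reading of the fraction as printed in the text.

```python
def predicted_hr_peak(age_years: float, sex: str) -> float:
    """Age-predicted peak heart rate (beats per minute) from the two
    gender-specific regressions quoted in the text."""
    if sex == "male":
        return 215.336 - 0.73 * age_years
    if sex == "female":
        return 212.497 - 0.702 * age_years
    raise ValueError("sex must be 'male' or 'female'")


def fraction_peak_hr(hr_submax: float, age_years: float, sex: str) -> float:
    """Submaximal heart rate as a fraction of age-predicted HRpeak.

    Assumption: the denominator is the whole age-predicted HRpeak expression.
    """
    return hr_submax / predicted_hr_peak(age_years, sex)


if __name__ == "__main__":
    # Hypothetical 40-year-old man with a submaximal heart rate of 130 bpm.
    print(round(predicted_hr_peak(40, "male"), 3))
    print(round(fraction_peak_hr(130, 40, "male"), 3))
```

Keeping the HRpeak estimate in its own function makes it easy to substitute a measured peak heart rate where one is available, although the published models were fitted with the age-predicted value.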

Table 1. Test protocol for using the VO2peak prediction models derived from treadmill work.

https://doi.org/10.1371/journal.pone.0144873.t001

Statistical analysis

Descriptive statistics are given as mean and standard deviation for men and women, respectively. Potential variables were chosen on the basis of correlation with measured VO2peak in previous literature, and subsequently entered in a hierarchical linear regression model. All the retained variables (treadmill inclination and velocity, weight, age and fraction HRpeak) contributed considerably to total model fit. The models were checked for normality and homoscedasticity of residuals, and these assumptions were satisfied. All models presented in this paper were derived from the total sample. Internal cross-validation was checked by data-splitting procedures, i.e. SPSS randomly selected approximately 50% of all cases, here denoted the validation sample, with the remaining cases denoted the cross-validation sample. In these subsets, linear regression analyses were performed on the validation sample and applied to predict VO2peak in both the validation and the cross-validation samples. Model fit was evaluated by squared multiple regression coefficients (R2) and standard errors of the estimate (SEE). R2 and adjusted R2 increased similarly for each new independent variable added to the models, and were either identical or differed only in the third decimal place. R2 is therefore reported throughout this paper. To be able to compare the model precision to models derived from external samples, we also calculated the % SEE, which expresses the SEE as a percentage of the measured mean VO2peak. In the total sample, as well as in subgroups of age, VO2peak and treadmill velocity, we calculated constant error (CE) and total error (TE) for the model. CE represents the mean difference between measured and predicted values (∑(measured − predicted)/n), while TE represents the square root of the mean squared difference (√(∑(measured − predicted)²/n)). 
Pearson correlation and variance explained between measured and predicted VO2peak were used to examine potential shrinkage between validation and cross-validation samples. Further internal validation was done by cross-classifying subjects into quintiles of measured and predicted VO2peak. Measures of rank correlation and agreement were tested by use of Kendall's tau and Cohen's kappa statistics. A two-sided paired-samples t-test was used to establish differences between measured and predicted VO2peak. Statistical analyses were performed with SPSS 20.0 (Statistical Package for the Social Sciences, Chicago, IL, USA).
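As a minimal sketch of the error measures defined in this section, the following Python functions compute CE, TE and % SEE from paired lists of measured and predicted values. The function names and the example numbers are hypothetical and are not taken from the HUNT 3 data.

```python
import math


def constant_error(measured, predicted):
    """CE: mean of (measured - predicted). A negative CE means the model
    overestimates on average, a positive CE means it underestimates."""
    return sum(m - p for m, p in zip(measured, predicted)) / len(measured)


def total_error(measured, predicted):
    """TE: square root of the mean squared (measured - predicted) difference."""
    squared = sum((m - p) ** 2 for m, p in zip(measured, predicted))
    return math.sqrt(squared / len(measured))


def percent_see(see, mean_measured):
    """% SEE: standard error of the estimate as a percentage of the
    measured mean VO2peak."""
    return 100.0 * see / mean_measured


if __name__ == "__main__":
    # Made-up values in mL/kg/min, for illustration only.
    measured = [40.0, 50.0, 30.0]
    predicted = [42.0, 47.0, 31.0]
    print(constant_error(measured, predicted))
    print(round(total_error(measured, predicted), 3))
    print(percent_see(4.0, 40.0))
```

Because positive and negative differences cancel in CE but not in TE, a CE close to zero can coexist with a sizeable TE, which is why both are reported for each subgroup.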

Results

Descriptive characteristics are presented in Table 2. Descriptive data in the validation and cross-validation samples were equally distributed (Table 3). Additional descriptive data of the HUNT 3 fitness population are displayed in a previous study [23].

Table 3. Descriptive data for the male and female validation and cross-validation sample: The HUNT 3 fitness study.

https://doi.org/10.1371/journal.pone.0144873.t003

Predicting VO2peak from peak treadmill performance

Peak treadmill inclination and velocity accounted for most of the variance explained by the VO2peak prediction model (men: R2 = 0.72, p<0.001; women: R2 = 0.68, p<0.001), with velocity being the paramount factor. A modest influence was seen from weight and age, and the total explained variance for the peak performance prediction model was R2 = 0.75 (p<0.001) in men and R2 = 0.72 (p<0.001) in women. Including resting heart rate and peak heart rate in the model did not considerably change R2 and SEE, and these variables were thus excluded from the models. A strong correlation was demonstrated between the predicted and measured VO2peak (men: r = 0.87; women: r = 0.85) (Fig 1). Two gender-specific VO2peak prediction equations were derived from multiple linear regression using the total sample. Men: VO2peak = 24.24 + (0.599 x treadmill inclination in %) + (3.197 x treadmill velocity in km⋅h-1) − (0.122 x body weight in kg) − (0.126 x age in years). Women: VO2peak = 17.21 + (0.582 x treadmill inclination in %) + (3.317 x treadmill velocity in km⋅h-1) − (0.116 x body weight in kg) − (0.099 x age in years) (Tables 4, 5 and 6).
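The two peak performance equations can be applied directly. The sketch below transcribes the coefficients quoted above into a single Python function; the function name, parameter names and the example subjects are hypothetical.

```python
def vo2peak_from_peak_test(inclination_pct, velocity_kmh, weight_kg,
                           age_years, sex):
    """Predicted VO2peak (mL/kg/min) from peak treadmill performance,
    using the coefficients quoted in the text."""
    if sex == "male":
        return (24.24 + 0.599 * inclination_pct + 3.197 * velocity_kmh
                - 0.122 * weight_kg - 0.126 * age_years)
    if sex == "female":
        return (17.21 + 0.582 * inclination_pct + 3.317 * velocity_kmh
                - 0.116 * weight_kg - 0.099 * age_years)
    raise ValueError("sex must be 'male' or 'female'")


if __name__ == "__main__":
    # Hypothetical subjects, both finishing at 10% inclination.
    print(round(vo2peak_from_peak_test(10, 11, 80, 40, "male"), 1))
    print(round(vo2peak_from_peak_test(10, 9, 65, 40, "female"), 1))
```

A single estimate should be read together with the reported SEE (4.63 and 4.11 mL⋅kg-1⋅min-1 for men and women, respectively) rather than as an exact value.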

Fig 1. Correlation plots between measured and predicted VO2peak with 95% prediction bands from peak treadmill performance (A and B), and submaximal treadmill performance (C and D).

https://doi.org/10.1371/journal.pone.0144873.g001

Table 4. Multiple linear regression coefficients for predicting VO2peak (mL⋅kg-1⋅min-1) from peak measurements in total sample.

https://doi.org/10.1371/journal.pone.0144873.t004

Table 5. Multiple linear regression analysis for predicting VO2peak (mL⋅kg-1⋅min-1) from peak measurements in total sample: The HUNT 3 fitness study.

https://doi.org/10.1371/journal.pone.0144873.t005

Table 6. VO2peak prediction models based on peak (P) and submaximal (S) treadmill performance in men and women.

https://doi.org/10.1371/journal.pone.0144873.t006

Cross-validation of the peak performance prediction model

The coefficient of determination (R2) remained stable between the total sample (0.75 and 0.72) and the validation sample (0.76 and 0.72) among men and women, respectively (Tables 5 and 7), thus suggesting an internally robust prediction model. Also, there were no significant differences between measured and predicted VO2peak, and CE values were close to zero in the total sample (−0.03 and 0.02), validation sample (−0.03 and −0.03) and cross-validation sample (−0.21 and −0.04) in men and women, respectively (Tables 8–10), signifying a valid prediction of the mean VO2peak without systematic over- or underprediction. Our prediction models remain stable when stratified into subgroups of age and treadmill velocity (non-significant differences between measured and predicted VO2peak). However, when divided into VO2peak subgroups, the least fit participants (<35 mL⋅kg-1⋅min-1 for men and <30 mL⋅kg-1⋅min-1 for women) tended to be overestimated (men: p<0.001; women: p<0.001), and the most fit participants (>50 mL⋅kg-1⋅min-1 for men and >40 mL⋅kg-1⋅min-1 for women) tended to be underestimated (men: p<0.001; women: p<0.001). This is in agreement with the validation and cross-validation samples. The least fit men and women showed CE values of −3.50 and −2.60 vs. 2.73 and 2.81 in the most fit, respectively, with corresponding TE values of 5.01 and 4.35 vs. 5.27 and 4.80, with similar tendencies in validation and cross-validation samples. In the medium fit participants (VO2peak between 35 and 50 mL⋅kg-1⋅min-1 for men and between 30 and 40 mL⋅kg-1⋅min-1 for women) the model appears to predict VO2peak fairly well (men: p<0.05; women: p<0.001), with CE values in men and women of −0.28 and −0.37, respectively. The same tendencies are shown in the validation and cross-validation samples (Tables 8–10). 
Pearson correlation showed minimal shrinkage between the validation sample (men: r = 0.870, R2 = 0.757, p < 0.01; women: r = 0.846, R2 = 0.716, p < 0.01) and the cross-validation sample (men: r = 0.863, R2 = 0.745, p = 0.01; women: r = 0.850, R2 = 0.723, p < 0.01). Thus, the entire sample was used in development of the models.
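The data-splitting check can be illustrated on synthetic data. The sketch below is not the SPSS procedure used in the study: it uses a single made-up predictor, fits ordinary least squares on a random half, and compares explained variance, here computed as 1 − SSE/SST rather than as a squared Pearson correlation, in the fitted half and in the held-out half. All names, coefficients and the seed are assumptions chosen for illustration.

```python
import random


def fit_line(x, y):
    """Ordinary least squares for one predictor; returns (intercept, slope)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def explained_variance(y, pred):
    """1 - SSE/SST for a set of predictions."""
    my = sum(y) / len(y)
    sse = sum((a - b) ** 2 for a, b in zip(y, pred))
    sst = sum((a - my) ** 2 for a in y)
    return 1.0 - sse / sst


def split_sample_check(n=4000, seed=7):
    """Fit on a random half, then score both halves with the same model."""
    rng = random.Random(seed)
    velocity = [rng.gauss(10.0, 2.0) for _ in range(n)]          # synthetic
    vo2 = [10.0 + 3.2 * v + rng.gauss(0.0, 4.5) for v in velocity]
    idx = list(range(n))
    rng.shuffle(idx)
    validation, cross_validation = idx[: n // 2], idx[n // 2:]
    intercept, slope = fit_line([velocity[i] for i in validation],
                                [vo2[i] for i in validation])

    def score(sample):
        pred = [intercept + slope * velocity[i] for i in sample]
        return explained_variance([vo2[i] for i in sample], pred)

    return score(validation), score(cross_validation)


if __name__ == "__main__":
    r2_validation, r2_cross = split_sample_check()
    print(round(r2_validation, 3), round(r2_cross, 3))
```

A small gap between the two values is what the text refers to as minimal shrinkage; a large drop in the held-out half would point to overfitting.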

Table 7. Multiple linear regression analysis for predicting VO2peak (mL⋅kg-1⋅min-1) from peak measurements in men and women, validation samples: The HUNT 3 fitness study.

https://doi.org/10.1371/journal.pone.0144873.t007

Table 8. Measured vs. Predicted VO2peak, from peak measurements, in the total sample: The HUNT 3 fitness study.

https://doi.org/10.1371/journal.pone.0144873.t008

Table 9. Cross-validation of Measured vs. Predicted VO2peak, from peak measurements, in men: The HUNT 3 fitness study.

https://doi.org/10.1371/journal.pone.0144873.t009

Table 10. Cross-validation of Measured vs. Predicted VO2peak, from peak measurements, in women: The HUNT 3 fitness study.

https://doi.org/10.1371/journal.pone.0144873.t010

Cross-classification of participants in the peak performance prediction model

The models categorized participants fairly accurately into the correct measured VO2peak group when cross-classifying participants into quintiles of measured and predicted VO2peak (Table 11). In total, 75.3% and 77.6% of the men and women predicted to be in the lowest quintile were classified correctly into the lowest measured quintile, respectively, while 95.4% and 96.7% were classified within the correct or closest measured quintile. 77.4% and 78.0% of the men and women predicted to be in the highest quintile were correctly classified into the highest measured quintile, respectively, with 95.8% and 95.9% being classified into one of the two highest quintiles (Table 11). The rank correlation between measured and predicted quintiles was 0.74 and 0.70 in men and women, respectively, while the measure of agreement by Kappa statistic was 0.45 in men and 0.41 in women.
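Cross-classification into quintiles can be sketched as follows. This is an illustrative Python version, not the SPSS analysis used in the study: quintiles are assigned by rank within each list, and the function returns the 5 x 5 table (rows: predicted quintile, columns: measured quintile), the share classified into the same quintile, the share within one quintile, and an unweighted Cohen's kappa. All names and example values are hypothetical.

```python
def quintile_labels(values):
    """Assign each value a quintile 0..4 by rank (ties broken by position)."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    labels = [0] * n
    for rank, idx in enumerate(order):
        labels[idx] = rank * 5 // n
    return labels


def cross_classify(measured, predicted):
    """Cross-tabulate predicted against measured quintiles."""
    qm, qp = quintile_labels(measured), quintile_labels(predicted)
    n = len(measured)
    table = [[0] * 5 for _ in range(5)]
    for m, p in zip(qm, qp):
        table[p][m] += 1
    exact = sum(m == p for m, p in zip(qm, qp)) / n
    within_one = sum(abs(m - p) <= 1 for m, p in zip(qm, qp)) / n
    # Unweighted Cohen's kappa: (observed - chance agreement) / (1 - chance).
    chance = sum((sum(table[k]) / n) * (sum(row[k] for row in table) / n)
                 for k in range(5))
    kappa = (exact - chance) / (1.0 - chance)
    return table, exact, within_one, kappa


if __name__ == "__main__":
    # Made-up values in mL/kg/min, for illustration only.
    measured = [20, 25, 30, 35, 40, 45, 50, 55, 60, 65]
    predicted = [22, 31, 24, 36, 41, 44, 52, 61, 54, 66]
    table, exact, within_one, kappa = cross_classify(measured, predicted)
    print(exact, within_one, round(kappa, 2))
```

The percentages reported in the text are conditional on the predicted quintile; in this sketch they correspond to table[0][0] / sum(table[0]) for the lowest predicted quintile and table[4][4] / sum(table[4]) for the highest.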

Table 11. Cross-tabulation between measured and predicted VO2peak quintiles from peak performance for men and women.

https://doi.org/10.1371/journal.pone.0144873.t011

Predicting VO2peak from submaximal treadmill performance

Treadmill inclination and velocity accounted for the major part of the explained variance in predicting VO2peak (men: R2 = 0.40, p<0.001; women: R2 = 0.43, p<0.001), with velocity being the most important variable (men: R2 = 0.35, p<0.001; women: R2 = 0.34, p<0.001). Weight (men: R2 = 0.06, p<0.001; women: R2 = 0.06, p<0.001) and fraction peakHR (HRsubmax and age, men: R2 = 0.09, p<0.001; women: R2 = 0.06, p<0.001) accounted for a lesser but considerable part of the total variance (men: R2 = 0.55, p<0.001; women: R2 = 0.56, p<0.001). Resting heart rate yielded negligible changes in R2 and SEE and was hence excluded from the prediction model. A high correlation (r = 0.75) was observed between predicted and measured VO2peak among men and women (Fig 1). The final regression models derived from the total sample were, for men: VO2peak = 35.25 + (1.276 x treadmill inclination in %) + (6.402 x velocity in km⋅h-1) − (0.196 x weight in kg) − (27.615 x HRsubmax/(215.336 − 0.73 x age in years)), and for women: VO2peak = 23.77 + (1.205 x treadmill inclination in %) + (6.051 x velocity in km⋅h-1) − (0.160 x weight in kg) − (20.671 x HRsubmax/(212.497 − 0.702 x age in years)) (Tables 6, 12 and 13).
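A minimal Python sketch of the submaximal equations follows. The coefficients are those quoted above; the function and parameter names are hypothetical, and the heart rate term is assumed to mean HRsubmax divided by the whole age-predicted HRpeak expression, since the grouping is ambiguous as printed.

```python
def vo2peak_from_submax_test(inclination_pct, velocity_kmh, weight_kg,
                             hr_submax, age_years, sex):
    """Predicted VO2peak (mL/kg/min) from the first submaximal treadmill level.

    Assumption: the heart rate term is coefficient * HRsubmax / predicted
    HRpeak, where predicted HRpeak is the age regression quoted in the text.
    """
    if sex == "male":
        fraction = hr_submax / (215.336 - 0.73 * age_years)
        return (35.25 + 1.276 * inclination_pct + 6.402 * velocity_kmh
                - 0.196 * weight_kg - 27.615 * fraction)
    if sex == "female":
        fraction = hr_submax / (212.497 - 0.702 * age_years)
        return (23.77 + 1.205 * inclination_pct + 6.051 * velocity_kmh
                - 0.160 * weight_kg - 20.671 * fraction)
    raise ValueError("sex must be 'male' or 'female'")


if __name__ == "__main__":
    # Hypothetical 40-year-old man walking at 5 km/h and 10% inclination
    # with a steady-state heart rate of 130 bpm.
    print(round(vo2peak_from_submax_test(10, 5, 80, 130, 40, "male"), 1))
```

For a fixed workload, a lower steady-state heart rate yields a higher predicted VO2peak, which is the intended behaviour of the fraction peakHR term.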

Table 12. Multiple linear regression coefficients for predicting VO2peak (mL⋅kg-1⋅min-1) from submaximal measurements in total sample.

https://doi.org/10.1371/journal.pone.0144873.t012

Table 13. Multiple linear regression analysis for predicting VO2peak from submaximal measurements on total sample: The HUNT 3 fitness study.

https://doi.org/10.1371/journal.pone.0144873.t013

Cross-validation of the submaximal performance prediction model

R2 was stable between the total sample (0.55 and 0.56) and validation sample (0.54 and 0.56) in men and women, respectively (Tables 13 and 14), indicating an internally stable prediction model. Furthermore, there were no significant differences between the measured and predicted VO2peak, and CE was close to zero in the total sample (−0.14 and −0.12), validation sample (−0.27 and −0.03) and cross-validation sample (−0.09 and −0.11), among men and women, respectively (Tables 15–17), thus suggesting a valid estimation of mean VO2peak. The submaximal prediction models are less stable when stratified into subgroups of age, VO2peak and treadmill velocity. There was a tendency towards underestimating VO2peak (p<0.001 and p<0.001, with CE 3.03 and 1.46 among men and women, respectively, TE 6.78 and 5.63) in the youngest subjects (<40 years), and towards overestimating it (p<0.001 and p<0.001, with CE −2.88 and −1.70 among men and women, respectively, TE 6.42 and 4.86) in the oldest subjects (>60 years), whereas VO2peak was predicted fairly well (non-significant difference and p<0.01, with CE −0.41 and −0.58 among men and women, respectively, TE 5.90 and 5.02) in the middle age group (40–60 years). In the VO2peak subgroups there was an apparent overestimation (p<0.001 and p<0.001, with CE −5.82 and −4.37 among men and women, respectively, TE 8.04 and 6.12) in the least fit groups (<35 mL⋅kg-1⋅min-1 for men and <30 mL⋅kg-1⋅min-1 for women), and an underestimation (p<0.001 and p<0.001, with CE 5.30 and 4.10 among men and women, respectively, TE 7.41 and 5.94) in the most fit groups (>50 mL⋅kg-1⋅min-1 for men and >40 mL⋅kg-1⋅min-1 for women). However, the submaximal models predicted VO2peak fairly well (p<0.001 and p<0.001, with CE −0.79 and −0.53 among men and women, respectively, TE 4.83 and 4.07) in the medium fit subjects (VO2peak between 35 and 50 mL⋅kg-1⋅min-1 for men and between 30 and 40 mL⋅kg-1⋅min-1 for women). 
VO2peak was predicted fairly well in all velocity groups, with no significant differences between measured and predicted VO2peak (<5 km⋅h-1: CE −0.44 and −0.49 among men and women, respectively, and TE 5.80 and 4.80; 5–6 km⋅h-1: CE −0.35 and −0.24, TE 6.24 and 5.28; >6 km⋅h-1: CE −0.36 and −0.91, TE 6.54 and 6.96). Findings in the total sample are in agreement with the overall tendencies in the validation and cross-validation samples (Tables 15–17). Pearson correlation showed minor shrinkage between the validation sample (men: r = 0.733, R2 = 0.537, p < 0.01; women: r = 0.749, R2 = 0.561, p < 0.01) and the cross-validation sample (men: r = 0.755, R2 = 0.570, p = 0.01; women: r = 0.743, R2 = 0.552, p < 0.01). Hence, the entire sample was utilized in development of the models.

Table 14. Multiple linear regression analysis for predicting VO2peak from submaximal measurements in men and women from validation samples: The HUNT 3 fitness study.

https://doi.org/10.1371/journal.pone.0144873.t014

Table 15. Measured vs. Predicted VO2peak, from submaximal measurements, in the total sample: The HUNT 3 fitness study.

https://doi.org/10.1371/journal.pone.0144873.t015

Table 16. Cross-validation of Measured vs. Predicted VO2peak, from submaximal measurements in men: The HUNT 3 fitness study.

https://doi.org/10.1371/journal.pone.0144873.t016

Table 17. Cross-validation of Measured vs. Predicted VO2peak, from submaximal measurements in women: The HUNT 3 fitness study.

https://doi.org/10.1371/journal.pone.0144873.t017

Cross-classification of participants in the submaximal performance prediction model

Cross-classification of predicted (from submaximal performance) and measured VO2peak achieved a fairly accurate placing of subjects into the correct VO2peak quintile (Table 18). In total, 62.0% and 50.3% of the men and women in the lowest predicted quintile were correctly placed in the lowest measured quintile, respectively, increasing to 91.3% and 80.9% within the closest measured quintiles. A total of 59.0% and 72.5% of the men and women in the highest predicted quintile were correctly categorized into the highest measured quintile, respectively, increasing to 84.9% and 89.0% within one of the two highest quintiles (Table 18). The rank correlation between measured and predicted quintiles was 0.61 and 0.60 in men and women, respectively, while the measure of agreement by Kappa statistic was 0.30 in men and 0.28 in women.

Table 18. Cross-tabulation between measured and predicted VO2peak quintiles from submaximal performance in men and women.

https://doi.org/10.1371/journal.pone.0144873.t018

Discussion

The exercise-based prediction models generated in this study accurately placed approximately 91% of the low- and high-fit participants within the correct or nearest quintile of measured VO2peak, and predicted VO2peak with fair precision using both the peak performance and submaximal models.

Accuracy of the VO2peak prediction models

The peak performance models displayed accuracy (SEE) of 10.5% (R2 = 0.75) and 11.5% (R2 = 0.72) in men and women, respectively. This is better than some previous research reporting accuracy in the range 13.3–16.6% [14,24], but equal to or less accurate than that reported by others (4.5–11.4%) [10–13,15,18,25]. Better prediction accuracy in other models may partly be attributed to the homogeneous fitness level of their sample subjects [10,12,13,15,18] and/or a narrow age range [10–12]. Validating other models using HUNT 3 data is difficult given the use of different independent variables, e.g. watts on a cycle ergometer [13,18,25] or the 20-m shuttle run [10–12]. Although the ACSM running model [20], like ours, uses speed and gradient, it was developed from steady-state submaximal aerobic exercise and can be used exclusively for predicting VO2 during steady-state submaximal work. Hence, it will overestimate VO2 for peak exercise, since the contribution from anaerobic metabolism is significant [20], which was confirmed in a previous validation study [24]. However, we were able to validate a model by Uth and colleagues [15] using heart rate ratio (HRpeak/resting heart rate) as predictor variable for VO2max. The Uth model, derived from 46 well-trained men, presented a SEE of 4.5%. This accuracy was considerably lower when validated using HUNT 3 data (18% and 19% SEE in men and women, respectively), which is supported by Esco and colleagues [14], who also observed a substantial reduction in accuracy (SEE of 16.6%) when using 109 healthy men to validate the Uth model. This underscores the importance of similar gender, age and physical fitness between the subjects using the model and the subjects used in developing the model to assure the best possible accuracy [2,19].

Submaximal VO2peak prediction models are generally outperformed on accuracy by models derived from peak workload [26], which is also the case in this study, with accuracies (SEE) of 14.1% (R2 = 0.55) and 14.4% (R2 = 0.56) in men and women, respectively. Moreover, non-exercise based prediction models derived from HUNT 3 fitness data [27] yielded a somewhat better accuracy (12.8% and 14.3% in men and women, respectively) than the present submaximal models, while the present peak models had better accuracy. Previous research reported prediction error in the range 7.3–20.9% [2,16,28–36]. The benchmark Åstrand-Ryhming nomogram [37] reported accuracy of approximately 10%, which was confirmed when validated by Cink & Thomas [38]. Both Åstrand and Cink observed minor differences between measured and predicted VO2peak. However, both used small groups of physically fit college students for their calculations. Validating the Åstrand-Ryhming nomogram using untrained sedentary subjects [39] showed a 26.5% systematic underestimation of VO2max. Several peak [13,18,25] and submaximal models [16,29–31,34,35] used a cycle ergometer to measure VO2peak/max; however, compelling evidence points to a 6–15% lower VO2peak on a cycle ergometer compared to that obtained when running [40–44].

Cross-validation of VO2peak prediction models

Randomly splitting data into validation and cross-validation samples established good stability throughout all models, suggesting minor shrinkage in accuracy if used on other similar populations. Moreover, data splitting will minimize potential overfitting that might deteriorate the external validity of the models [45].

For the peak performance models, error estimates are fairly stable across subgroups of both age and treadmill velocity. Conversely, in the VO2peak subgroups we observed a trend of systematic under- and overestimation of the predicted values in the high- and low-fit participants, respectively. This is consistent with previous findings [12,46,47].

Similarly for the submaximal models, error estimates are reasonably stable across the treadmill velocity subgroups, whereas across the age subgroups there is a tendency towards under- and overestimating VO2peak in the youngest and oldest, respectively. For the VO2peak subgroups an even greater tendency towards under- and overestimation in the high- and low-fit participants, respectively, is observed compared to the peak performance models. Wier and colleagues [26] argue that the underestimation of the fittest participants is of less importance from a public health perspective, since a high level of fitness is not associated with adverse health outcomes. However, it highlights the necessity of using models derived from aerobically fit subjects to obtain high predictive accuracy and stability for a well-trained population. Such models have previously been developed [12,15,18], while models with high predictive accuracy for low-fit populations are scarce. The models' inability to accurately identify fitness level in the low-fit subjects represents a potential concern, since low aerobic fitness is associated with increased prevalence of chronic disease, e.g. cardiovascular disease and metabolic syndrome, as well as a higher mortality risk [3,4,48]. However, cross-classification accurately predicted approximately 91% of participants, in both sexes, within the nearest quintile of measured VO2peak.

There are several factors that might contribute to the systematic over- and underestimation of VO2peak, as well as to the attenuation of prediction accuracy. The statistical rationale is that our models are based on linear regression, where the distribution assumptions smooth out extreme observations compared to the grand mean, and may therefore under predict high observations and conversely over predict low observations (regression-to-the-mean phenomenon).
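The statistical argument can be illustrated with a small simulation on synthetic data (not HUNT 3 data). In this sketch a single made-up predictor is fitted by ordinary least squares, and the constant error (mean of measured minus predicted) is then computed within the lowest and highest fifths of the measured values. The names, coefficients and seed are assumptions chosen only to show the direction of the effect.

```python
import random
import statistics


def subgroup_constant_errors(n=5000, seed=1):
    """Return CE in the lowest measured fifth, overall, and highest fifth."""
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]            # synthetic predictor
    measured = [40.0 + 6.0 * xi + rng.gauss(0.0, 4.0) for xi in x]

    # Ordinary least squares with one predictor.
    mx, my = statistics.fmean(x), statistics.fmean(measured)
    slope = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, measured))
             / sum((xi - mx) ** 2 for xi in x))
    intercept = my - slope * mx
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, measured)]

    # Stratify on the measured value, as in the fitness subgroups.
    ranked = sorted(measured)
    low_cut, high_cut = ranked[n // 5], ranked[4 * n // 5]
    ce_low = statistics.fmean(r for r, yi in zip(residuals, measured)
                              if yi < low_cut)
    ce_high = statistics.fmean(r for r, yi in zip(residuals, measured)
                               if yi >= high_cut)
    return ce_low, statistics.fmean(residuals), ce_high


if __name__ == "__main__":
    ce_low, ce_all, ce_high = subgroup_constant_errors()
    print(round(ce_low, 2), round(ce_all, 2), round(ce_high, 2))
```

Even though the overall CE is essentially zero, the lowest measured fifth shows a negative CE (overestimation) and the highest fifth a positive CE (underestimation), the same pattern reported for the least and most fit subgroups.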

For the submaximal performance models there are additional plausible factors. Genetics accounts for an additional source of prediction inaccuracy, as maximal heart rate is heterogeneous, with significant variations in a population [1]. Based on HUNT 3 fitness data, our group recently reported a standard deviation in measured maximal heart rate of ±14 beats⋅min-1 [23]. Consequently, embedding the fraction of maximal/peak heart rate as a separate equation in the model weakens the accuracy of the VO2peak prediction [1]. Furthermore, the underestimation of the best trained may partly reflect good movement economy, and conversely the overestimation of the least fit may be attributed to poor movement economy. These additional possible explanations are supported by the considerably higher over- and underestimation of VO2peak in the submaximal performance models compared to the peak performance models. Moreover, if a person using the prediction equation has a better movement economy than that of the subjects in the HUNT 3 fitness study, he or she will be overestimated using the submaximal model, and conversely underestimated with poor movement economy. In that case the predicted aerobic capacity is better or worse because of movement economy, not because of VO2peak.

The independent variables' influence on VO2peak

Calculating standardized β weights for the models based on peak performance revealed velocity as the key determinant of VO2peak, followed by age and weight, in both sexes. Not surprisingly, inclination had the least impact on VO2peak, since approximately 87% of the subjects were tested at 10% treadmill inclination in the peak performance models.
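
As an illustration of how standardized β weights rank predictors, the following minimal sketch fits a multiple regression on z-scored variables, so that each coefficient is expressed in standard deviation units. The data are synthetic and the coefficients, distributions and helper names (solve, zscores, standardized_betas) are assumptions made for this example only; they are not the models or data of this study.

```python
import random
import statistics


def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [vr - f * vc for vr, vc in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def zscores(values):
    """Centre to mean 0 and scale to SD 1."""
    mean, sd = statistics.fmean(values), statistics.stdev(values)
    return [(v - mean) / sd for v in values]


def standardized_betas(predictors, outcome):
    """predictors: dict name -> list of values. Returns dict name -> standardized beta."""
    names = list(predictors)
    z = [zscores(predictors[name]) for name in names]
    zy = zscores(outcome)
    # Normal equations on z-scored data; no intercept is needed because all means are 0.
    xtx = [[sum(p * q for p, q in zip(zi, zj)) for zj in z] for zi in z]
    xty = [sum(p * q for p, q in zip(zi, zy)) for zi in z]
    return dict(zip(names, solve(xtx, xty)))


rng = random.Random(1)
n = 2000
data = {
    "velocity": [rng.gauss(9.0, 2.0) for _ in range(n)],    # km/h, assumed
    "age": [rng.uniform(20.0, 90.0) for _ in range(n)],     # years, assumed
    "weight": [rng.gauss(80.0, 12.0) for _ in range(n)],    # kg, assumed
}
# Made-up generating model, for illustration only.
vo2peak = [
    20.0 + 3.0 * v - 0.15 * a - 0.10 * w + rng.gauss(0.0, 4.0)
    for v, a, w in zip(data["velocity"], data["age"], data["weight"])
]

betas = standardized_betas(data, vo2peak)
for name, beta in sorted(betas.items(), key=lambda kv: -abs(kv[1])):
    print(f"{name:>8}: {beta:+.2f}")
```

Because every variable is on the same scale after standardization, the absolute size of each β weight can be compared directly across predictors, which is the basis for the ranking reported above.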

Likewise, for the submaximal models, velocity was paramount in determining VO2peak in both sexes. In men, the subsequent determinants of VO2peak, in order of importance, were fraction of HRpeak (consisting of age and work heart rate), weight and inclination. For women, the order was inclination, fraction of HRpeak and weight. The greater influence of inclination in women may be related to a larger diversity in running inclination. Explained variance in the submaximal models was 55% and 56% in men and women, respectively, which is better than in some previous models (31–51%) [17,31,34], yet worse than in others (60–83%) [2,16,28,29,32,33,35].

Strengths and Limitations

The large sample size, including both men and women, and the wide age range make this study robust. Our direct test to volitional exhaustion, measuring VO2peak by ventilatory gas analysis, is preferable to indirect estimates when deriving prediction equations from population studies, since direct measurements display higher correlations as well as lower standard error of estimate [5]. The low participation rate may contribute to bias caused by self-selection. Still, 5633 (45%) of those invited to the present Fitness study from the total HUNT population volunteered for the cardiopulmonary exercise test (CPET). Out of these 5633, 1003 candidates withdrew, did not complete the CPET or were excluded for medical reasons, leaving 4631 (37%) completed tests. Some potential candidates declined participation due to long waiting lines caused by limited capacity at test sites. Consequently, it is possible that those who finally took part were healthier than those who withdrew from testing. However, comparing the Fitness study participants to a healthy sample from the total HUNT population (i.e. free from pulmonary and cardiovascular diseases, sarcoidosis or cancer) established that there were no considerable differences between the two [3]. Nevertheless, the consistent overestimation of the least fit candidates, who carry the highest health risks, is of greater concern. This should be taken into account when applying the models.

Practical implications

In a health care setting, the models' ability to detect subjects with low VO2peak is paramount for identifying persons in need of physical activity and lifestyle intervention. Cross-classification of participants into quintiles of measured and predicted VO2peak demonstrates the models' reasonable ability to classify participants appropriately. More importantly, the use of both peak and submaximal performance models is considered generally safe in patients with high-risk cardiovascular disease [49]. Our models are derived from a large population of both men and women, with wide heterogeneity in fitness levels and a large age span (20–90 years). This provides a high degree of applicability for widespread use.
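
The quintile cross-classification can be sketched as follows. This is a minimal illustration with synthetic values, not the procedure or data of this study (for example, it does not stratify by sex); the function names quintile_labels and cross_classification and the simulated error level are assumptions.

```python
import random


def quintile_labels(values):
    """Assign each value a quintile 0..4 by rank (0 = lowest fifth)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    labels = [0] * len(values)
    for rank, i in enumerate(order):
        labels[i] = rank * 5 // len(values)
    return labels


def cross_classification(measured, predicted):
    """Return (share in same quintile, share in same or adjacent quintile)."""
    qm, qp = quintile_labels(measured), quintile_labels(predicted)
    n = len(measured)
    same = sum(a == b for a, b in zip(qm, qp)) / n
    within_one = sum(abs(a - b) <= 1 for a, b in zip(qm, qp)) / n
    return same, within_one


rng = random.Random(2)
# Synthetic VO2peak values (mL/kg/min) and predictions with random error (assumed).
measured = [rng.gauss(40.0, 9.0) for _ in range(3000)]
predicted = [m + rng.gauss(0.0, 4.5) for m in measured]

same, within_one = cross_classification(measured, predicted)
print(f"same quintile: {same:.1%}, same or adjacent quintile: {within_one:.1%}")
```

The second figure corresponds to the kind of "correct or nearest quintile" agreement discussed above; a participant counted there may still be misplaced by one fitness category, which matters most at the low-fit end of the distribution.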

Conclusions

The VO2peak prediction models presented in this study are inexpensive and uncomplicated to use, and thus a convenient option both for recreational athletes and in health care settings. Judicious and appropriate use of these predictive models will provide a fairly accurate estimate of peak oxygen uptake, which is beneficial for establishing cardiorespiratory fitness and may improve risk stratification.

Acknowledgments

The HUNT 3 fitness study is a collaboration between the HUNT Research Center (Faculty of Medicine, Norwegian University of Science and Technology, NTNU), Nord-Trøndelag County Council and the Norwegian Institute of Public Health, Liaison Committee between the Central Norway Regional Health Authority (RHA) and the Norwegian University of Science and Technology (NTNU).

Author Contributions

Conceived and designed the experiments: UW. Performed the experiments: UW BMN. Analyzed the data: HL BMN UW. Contributed reagents/materials/analysis tools: HL BMN UW. Wrote the paper: HL BMN UW. Drafting of manuscript or revising it critically for important intellectual content: HL BMN UW. Final approval of the version to be published: HL BMN UW.

References

  1. Evans HJ, Ferrar KE, Smith AE, Parfitt G, Eston RG (2015) A systematic review of methods to predict maximal oxygen uptake from submaximal, open circuit spirometry in healthy adults. J Sci Med Sport 18: 183–188. pmid:24721146
  2. Larsen GE, George JD, Alexander JL, Fellingham GW, Aldana SG, Parcell AC (2002) Prediction of maximum oxygen consumption from walking, jogging, or running. Res Q Exerc Sport 73: 66–72. pmid:11926486
  3. Aspenes ST, Nilsen TI, Skaug EA, Bertheussen GF, Ellingsen O, Vatten L, et al. (2011) Peak oxygen uptake and cardiovascular risk factors in 4631 healthy women and men. Med Sci Sports Exerc 43: 1465–1473. pmid:21228724
  4. Kodama S, Saito K, Tanaka S, Maki M, Yachi Y, Asumi M, et al. (2009) Cardiorespiratory fitness as a quantitative predictor of all-cause mortality and cardiovascular events in healthy men and women: a meta-analysis. JAMA 301: 2024–2035. pmid:19454641
  5. Jurca R, Jackson AS, LaMonte MJ, Morrow JR Jr., Blair SN, Wareham NJ, et al. (2005) Assessing cardiorespiratory fitness without performing exercise testing. Am J Prev Med 29: 185–193.
  6. Guazzi M, Adams V, Conraads V, Halle M, Mezzani A, Vanhees L, et al. (2012) EACPR/AHA Scientific Statement. Clinical recommendations for cardiopulmonary exercise testing data assessment in specific patient populations. Circulation 126: 2261–2274. pmid:22952317
  7. Gupta S, Rohatgi A, Ayers CR, Willis BL, Haskell WL, Khera A, et al. (2011) Cardiorespiratory fitness and classification of risk of cardiovascular disease mortality. Circulation 123: 1377–1383. pmid:21422392
  8. Stamatakis E, Hamer M, O'Donovan G, Batty GD, Kivimaki M (2013) A non-exercise testing method for estimating cardiorespiratory fitness: associations with all-cause and cardiovascular mortality in a pooled analysis of eight population-based cohorts. Eur Heart J 34: 750–758. pmid:22555215
  9. (1997) Clinical exercise testing with reference to lung diseases: indications, standardization and interpretation strategies. ERS Task Force on Standardization of Clinical Exercise Testing. European Respiratory Society. Eur Respir J 10: 2662–2689. pmid:9426113
  10. Tsiaras V, Zafeiridis A, Dipla K, Patras K, Georgoulis A, Kellis S, et al. (2010) Prediction of peak oxygen uptake from a maximal treadmill test in 12- to 18-year-old active male adolescents. Pediatr Exerc Sci 22: 624–637. pmid:21242610
  11. Leger LA, Lambert J (1982) A maximal multistage 20-m shuttle run test to predict VO2 max. Eur J Appl Physiol Occup Physiol 49: 1–12. pmid:7201922
  12. Kilding AE, Aziz AR, Teh KC (2006) Measuring and predicting maximal aerobic power in international-level intermittent sport athletes. J Sports Med Phys Fitness 46: 366–372. pmid:16998439
  13. Lamberts RP, Davidowitz KJ (2014) Allometric scaling and predicting cycling performance in (well-) trained female cyclists. Int J Sports Med 35: 217–222. pmid:23900902
  14. Esco MR, Olson MS, Williford HN, Mugu EM, Bloomquist BE, McHugh AN (2012) Crossvalidation of two heart rate-based equations for predicting VO2max in white and black men. J Strength Cond Res 26: 1920–1927. pmid:21964424
  15. Uth N, Sorensen H, Overgaard K, Pedersen PK (2004) Estimation of VO2max from the ratio between HRmax and HRrest—the Heart Rate Ratio Method. Eur J Appl Physiol 91: 111–115. pmid:14624296
  16. Cao ZB, Miyatake N, Higuchi M, Miyachi M, Ishikawa-Takata K, Tabata I (2010) Predicting VO2max with an objectively measured physical activity in Japanese women. Med Sci Sports Exerc 42: 179–186. pmid:20010115
  17. Kendall KL, Fukuda DH, Smith AE, Cramer JT, Stout JR (2012) Predicting maximal aerobic capacity (VO2max) from the critical velocity test in female collegiate rowers. J Strength Cond Res 26: 733–738. pmid:22289694
  18. Malek MH, Berger DE, Housh TJ, Coburn JW, Beck TW (2004) Validity of VO2max equations for aerobically trained males and females. Med Sci Sports Exerc 36: 1427–1432. pmid:15292753
  19. Malek MH, Housh TJ, Berger DE, Coburn JW, Beck TW (2005) A new non-exercise-based Vo2max prediction equation for aerobically trained men. J Strength Cond Res 19: 559–565. pmid:16095416
  20. Armstrong LB, Balady GJ, Berry MJ, Davis SE, Davy BM, Davy KP, et al. (2006) ACSM's Guidelines for Exercise Testing and Prescription. U.S.: Lippincott Williams & Wilkins. 288 p.
  21. Rotstein A, Inbar O, Berginsky T, Meckel Y (2005) Preferred transition speed between walking and running: effects of training status. Med Sci Sports Exerc 37: 1864–1870. pmid:16286854
  22. Turvey MT, Holt KG, Lafiandra ME, Fonseca ST (1999) Can the transitions to and from running and the metabolic cost of running be determined from the kinetic energy of running? J Mot Behav 31: 265–278.
  23. Loe H, Rognmo O, Saltin B, Wisloff U (2013) Aerobic capacity reference data in 3816 healthy men and women 20–90 years. PLoS One 8: e64319. pmid:23691196
  24. Koutlianos N, Dimitros E, Metaxas T, Cansiz M, Deligiannis A, Kouidi E (2013) Indirect estimation of VO2max in athletes by ACSM's equation: valid or not? Hippokratia 17: 136–140. pmid:24376318
  25. Magrani P, Pompeu FA (2010) [Equations for predicting aerobic power (VO(2)) of young Brazilian adults]. Arq Bras Cardiol 94: 763–770. pmid:20499007
  26. Wier LT, Jackson AS, Ayers GW, Arenare B (2006) Nonexercise models for estimating VO2max with waist girth, percent fat, or BMI. Med Sci Sports Exerc 38: 555–561. pmid:16540845
  27. Nes BM, Janszky I, Vatten LJ, Nilsen TI, Aspenes ST, Wisloff U (2011) Estimating VO2peak from a nonexercise prediction model: the HUNT Study, Norway. Med Sci Sports Exerc 43: 2024–2030. pmid:21502897
  28. Billinger SA, Van Swearingen E, McClain M, Lentz AA, Good MB (2012) Recumbent stepper submaximal exercise test to predict peak oxygen uptake. Med Sci Sports Exerc 44: 1539–1544. pmid:22382170
  29. Coquart JB, Eston RG, Grosbois JM, Lemaire C, Dubart AE, Luttenbacher DP, et al. (2010) Prediction of peak oxygen uptake from age and power output at RPE 15 in obese women. Eur J Appl Physiol 110: 645–649. pmid:20532554
  30. Cao ZB, Miyatake N, Aoyama T, Higuchi M, Tabata I (2013) Prediction of maximal oxygen uptake from a 3-minute walk based on gender, age, and body composition. J Phys Act Health 10: 280–287. pmid:22821953
  31. Cao ZB, Miyatake N, Higuchi M, Ishikawa-Takata K, Miyachi M, Tabata I (2009) Prediction of VO2max with daily step counts for Japanese adult women. Eur J Appl Physiol 105: 289–296. pmid:18985375
  32. Swain DP, Wright RL (1997) Prediction of VO2peak from submaximal cycle ergometry using 50 versus 80 rpm. Med Sci Sports Exerc 29: 268–272. pmid:9044233
  33. Oja P, Laukkanen R, Pasanen M, Tyry T, Vuori I (1991) A 2-km walking test for assessing the cardiorespiratory fitness of healthy adults. Int J Sports Med 12: 356–362. pmid:1917218
  34. Faulkner J, Parfitt G, Eston R (2007) Prediction of maximal oxygen uptake from the ratings of perceived exertion and heart rate during a perceptually-regulated sub-maximal exercise test in active and sedentary participants. Eur J Appl Physiol 101: 397–407. pmid:17684757
  35. Okura T, Tanaka K (2001) A unique method for predicting cardiorespiratory fitness using rating of perceived exertion. J Physiol Anthropol Appl Human Sci 20: 255–261. pmid:11759263
  36. Marsh CE (2012) Evaluation of the American College of Sports Medicine submaximal treadmill running test for predicting VO2max. J Strength Cond Res 26: 548–554. pmid:22262016
  37. Astrand PO, Ryhming I (1954) A nomogram for calculation of aerobic capacity (physical fitness) from pulse rate during sub-maximal work. J Appl Physiol 7: 218–221. pmid:13211501
  38. Cink RE, Thomas TR (1981) Validity of the Astrand-Ryhming nomogram for predicting maximal oxygen intake. Br J Sports Med 15: 182–185. pmid:7272663
  39. Rowell LB, Taylor HL, Wang Y (1964) Limitations to Prediction of Maximal Oxygen Intake. J Appl Physiol 19: 919–927. pmid:14207745
  40. Astrand PO, Rodahl K, Dahl HA, Strømme SB (2003) Textbook of Work Physiology. Physiological Bases of Exercise. Champaign, IL, U.S.: Human Kinetics. 273 p.
  41. Hermansen L, Saltin B (1969) Oxygen uptake during maximal treadmill and bicycle exercise. J Appl Physiol 26: 31–37. pmid:5762873
  42. Miles DS, Critz JB, Knowlton RG (1980) Cardiovascular, metabolic, and ventilatory responses of women to equivalent cycle ergometer and treadmill exercise. Med Sci Sports Exerc 12: 14–19. pmid:7392896
  43. Faulkner JA, Roberts DE, Elk RL, Conway J (1971) Cardiovascular responses to submaximum and maximum effort cycling and running. J Appl Physiol 30: 457–461. pmid:5572760
  44. Miyamura M, Honda Y (1972) Oxygen intake and cardiac output during maximal treadmill and bicycle exercise. J Appl Physiol 32: 185–188.
  45. Rosner B (2006) Fundamentals of Biostatistics. Belmont, CA, U.S.: Thomson. 868 p.
  46. Heil DP, Freedson PS, Ahlquist LE, Price J, Rippe JM (1995) Nonexercise regression models to estimate peak oxygen consumption. Med Sci Sports Exerc 27: 599–606. pmid:7791593
  47. Jackson AS, Blair SN, Mahar MT, Wier LT, Ross RM, Stuteville JE (1990) Prediction of functional aerobic capacity without exercise testing. Med Sci Sports Exerc 22: 863–870. pmid:2287267
  48. Myers J, Prakash M, Froelicher V, Do D, Partington S, Atwood JE (2002) Exercise capacity and mortality among men referred for exercise testing. N Engl J Med 346: 793–801. pmid:11893790
  49. Skalski J, Allison TG, Miller TD (2012) The safety of cardiopulmonary exercise testing in a population with high-risk cardiovascular diseases. Circulation 126: 2465–2472. pmid:23091065