Exploratory analysis of the accuracy of age-based maximal heart rate equations across cardiorespiratory fitness levels

Joel Martin; Bryndan Lindsey; Courtney Gerrity; Jatin Ambegaonkar

doi:10.1371/journal.pone.0335842

Abstract

Introduction

Maximal heart rate (MHR) is a key measure for cardiorespiratory exercise prescription yet is often estimated using age-based prediction equations. The accuracy of these equations may vary by individual characteristics, including cardiorespiratory fitness (CRF), but limited research has examined predictive accuracy across CRF levels. Therefore, we evaluated the accuracy of seven commonly used MHR prediction equations in adults with varying CRF to assess whether prediction error differs by fitness level.

Materials and methods

Data from 230 healthy adults (76% male, mean age 38.5 ± 12.3 years) who completed maximal graded exercise tests between 2019 and 2024 were analyzed retrospectively. Predicted MHR values were calculated using the Fox, Tanaka, Gellish, Arena, Åstrand, Nes, and Fairbairn equations. Linear mixed-effects models (LMM) tested the influence of VO₂max and its interaction with prediction equation on error, with sex included as a covariate. Estimated marginal means and slopes were extracted, with pairwise contrasts adjusted by the Tukey method. Prediction equation accuracy was evaluated by comparing predicted and measured MHR using Bland-Altman analyses, and metrics including mean absolute error (MAE), root mean square error (RMSE), and intraclass correlation coefficients (ICC).

Results

LMM indicated a significant main effect of prediction equation on error (p < 0.001) and a significant equation × VO₂max interaction (p = 0.015), though neither sex (p = 0.49) nor VO₂max (p = 0.18) alone influenced error. The conditional R² for the LMM was 0.70, with a marginal R² of 0.02. Post-hoc linear regressions showed that higher VO₂max was associated with greater prediction error for several equations in males, but not females, with little variance explained (R² ≤ 0.06). Agreement analyses indicated small mean biases across equations (–3 to +6 bpm) but wide limits of agreement (~±18–24 bpm). The Arena, Tanaka, and Gellish equations showed the lowest MAE and RMSE. Among the equations, Fox showed the most stable performance across MHR ranges, being the only formula without proportional bias across the sample.

Discussion

The findings indicate that CRF had only a limited influence on MHR prediction error, with small associations observed in males but not females, reinforcing age as the primary determinant of MHR. Although some equations (e.g., Tanaka, Gellish, Arena, Fox) performed better than others across agreement metrics, none demonstrated high individual-level accuracy, which highlights a lack of precision when estimating MHR for exercise prescription and monitoring purposes. Future work should explore more individualized modeling approaches, though adjusting for CRF alone may not substantially improve prediction accuracy in healthy adults.

Citation: Martin J, Lindsey B, Gerrity C, Ambegaonkar J (2025) Exploratory analysis of the accuracy of age-based maximal heart rate equations across cardiorespiratory fitness levels. PLoS One 20(10): e0335842. https://doi.org/10.1371/journal.pone.0335842

Editor: Stefano Amatori, eCampus University: Universita degli Studi eCampus, ITALY

Received: June 4, 2025; Accepted: October 16, 2025; Published: October 30, 2025

Copyright: © 2025 Martin et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Data Availability: All relevant data are within the manuscript and its Supporting information files. The dataset generated and analyzed during the current study has been deposited in the Open Science Framework and is publicly available under DOI https://doi.org/10.17605/OSF.IO/VDG92 (version 1.0). The dataset can be accessed at: https://osf.io/vdg92/?view_only=47d195f2b4d94641a1b54dd2a67cca2c.

Funding: The author(s) received no specific funding for this work.

Competing interests: The authors have declared that no competing interests exist.

Introduction

Maximal heart rate (MHR) in beats per minute (bpm) is a fundamental parameter in exercise physiology, commonly used to assess cardiorespiratory fitness (CRF) [1] and prescribe training intensity [2]. Direct measurement of MHR requires a graded exercise test (GXT) taken to volitional exhaustion, which may pose logistical challenges and potential health risks, particularly for individuals with underlying medical conditions or limited exercise tolerance [3]. As a result, age-based MHR prediction equations are widely used as a safer, more convenient alternative to directly measuring MHR [1]. The most common prediction equation is the simple ‘MHR = 220 – age’ formula, developed by Fox and colleagues [4], although it has been criticized for high inter-individual variability [3,5]. In response, numerous alternative equations have been proposed to improve prediction accuracy across different populations [1,6–8].

Several widely used MHR prediction equations are derived from large population studies or specific subgroups [1,6–8]. Notable examples include the Tanaka equation (MHR = 208 − 0.7 × age) introduced by Tanaka et al. [1] based on a meta-analysis of 18,712 subjects, the Gellish equation (MHR = 207 − 0.7 × age) from a longitudinal fitness study in 2007 with participants (n = 908) of a broad age and fitness spectrum [6], and the Nes equation (MHR = 211 − 0.64 × age) derived from a large (n = 3320) Norwegian cohort [9]. Other widely used formulas include Åstrand’s formula (216.6 − 0.84 × age) [10], and Arena’s formula (209.3 − 0.72 × age) which were proposed in the context of cardiac rehabilitation and functional capacity testing to gauge effort [11].

While these MHR prediction equations rely primarily on age, cardiorespiratory responses to exercise may differ between sexes due to physiological and hormonal differences [12]. Accordingly, some MHR models have attempted to account for sex differences [7]. For instance, Fairbairn’s equations are sex-specific (male: MHR = 208 − 0.8 × age; female: MHR = 201 − 0.63 × age), with the inclusion of sex intended to improve accuracy for female individuals [7]. With numerous MHR prediction formulas available and derived from different samples [1,6–8], it is evident that no single equation perfectly fits all, as each carries assumptions based on the characteristics of its derivation sample [13]. Importantly, across all formulas, age has emerged as the predominant factor to predict MHR.

One major limitation of all age-based MHR prediction formulas is that age explains only part of the variability in MHR [1,13,14]. While MHR declines about 0.5 to 1 beats per year in adulthood [14], other factors such as sex [7], genetics [15], and CRF level [16] may also influence MHR. Notably, there has been debate about the effect of CRF and/or endurance training on MHR [16]. Higher CRF achieved through endurance training is generally associated with enhanced stroke volume and autonomic regulation, increased vagal tone at rest [17], and significantly lower resting heart rates [18]. Some evidence indicates that chronically aerobic-trained people show a slight reduction in MHR compared to age-matched untrained peers [16], potentially due to cardiac remodeling and increased parasympathetic influence that accompanies aerobic conditioning [16]. However, others have observed increases in MHR after detraining or inconsistent changes with training [19,20]. Zavorsky [16] and Carter et al. [21] have detailed these conflicting findings, noting instances where MHR in previously sedentary adults decreased with exercise training, only to increase again upon training cessation [16,21]. Interestingly, Lach et al. [22] found that adding CRF and body composition to prediction models only marginally improved accuracy (R² = 0.22 vs. 0.19 for age alone), suggesting limited influence of CRF on MHR for physically active adults. To our knowledge this finding has yet to be explored in subsequent studies. Thus, while age remains the most informative single covariate in population models of MHR, the influence of CRF level on MHR and on the accuracy of prediction is not well resolved.

In summary, the literature indicates that age-based MHR predictions are convenient but lack accuracy on the individual level [13], and it remains uncertain whether any equation is more suitable based on CRF level. The potential error introduced by CRF level represents an important gap in the literature, as knowledge of consistent over- or underestimation by a specific equation would help exercise professionals make more informed decisions when selecting the most appropriate formula. To address this gap, the present study undertook an exploratory analysis of the accuracy of several commonly used age-predicted MHR equations across adults with varying CRF levels. In line with recent recommendations for distinguishing exploratory from confirmatory research [23,24], no a priori hypotheses were tested. The study findings may provide preliminary insights of value to exercise science practitioners, clinicians, and researchers who frequently utilize MHR prediction equations in individuals with varying levels of CRF when devising cardiorespiratory training and exercise programs for their clients and patients.

Materials and methods

2.1. Participants and design

This study was a retrospective analysis of de-identified graded exercise test (GXT) data collected between 2019 and 2024 at a university exercise physiology laboratory. The data were accessed on January 23, 2025 for research purposes. A total of 230 adults (174 males, 56 females) who had performed GXTs during this period were included. Participant ages ranged from 18 to 68 years (mean ± SD: 38.5 ± 12.3 years) with a range of CRF levels. The participants were individuals from the local community who voluntarily participated in the laboratory’s human performance GXT testing services. In most cases, they were recreational athletes of varying levels seeking estimates of VO₂max and MHR to guide training. Others underwent testing for general health or fitness assessment purposes. Inclusion criteria for the analysis were: age ≥ 18; completion of a GXT to volitional fatigue with valid attainment of VO₂max (see below); and no acute medical issues at the time of testing. The study protocol was reviewed and approved by the George Mason University Institutional Review Board (IRB #: 1665548) and participants provided informed consent prior to testing. Because this was a retrospective exploratory analysis of an existing dataset, we analyzed all eligible records without an a priori sample size calculation or power analysis, following recommendations to justify sample size by using the full accessible dataset and to transparently acknowledge the absence of prospective calculations [25]. The exploratory nature of this study is intended to generate preliminary insights and identify patterns that can inform future confirmatory studies, which can plan sample sizes for desired accuracy or a smallest effect size of interest [24].

2.2. Graded exercise test protocol

All participants performed a maximal GXT on a motorized treadmill (Desmo, Woodway, Waukesha, WI, USA) to determine VO₂max and MHR. Tests were conducted by trained technicians following standard laboratory procedures. Participants were advised to refrain from heavy exercise and stimulants on the day of the test and provided basic health screening information before starting. The protocol consisted of a 3-minute warm-up followed by a continuous incremental test. During warm-up, participants began with a brisk walk (~3.0 mph, 0–1 min), transitioned to a light jog (~4–5 mph, 1–2 min), and progressed to their self-selected “working pace” (2–3 min). The working pace was typically ~5–6 mph but was adjusted individually to reflect a pace the participant self-reported they could maintain for 15–20 minutes. At minute 3, the first stage began at 0% grade. Thereafter, treadmill grade increased by 2% every 2 minutes (e.g., 2% at min 5, 4% at min 7, 6% at min 9, etc.). Once a 10% grade was reached, further increases were made by small increments in speed (~0.2–0.3 mph) rather than grade, to preserve running form. The test followed GXT self-paced recommendations [26] and was designed to elicit volitional exhaustion within ~8–12 minutes, excluding the warm-up. Throughout the test, heart rate (HR) was measured using a telemetry HR monitor (H10, Polar, Polar Electro, Kempele, Finland), which has been shown to have superior signal quality relative to other comparable devices [27]. Expired gases were collected using a calibrated metabolic cart (Parvo Medics TrueOne 2400; Parvo Medics, Sandy, UT, USA) to determine oxygen uptake. The metabolic system was calibrated prior to each test according to manufacturer guidelines using standardized gas concentrations and volume flow calibration. Each test continued until the participant reached volitional exhaustion (e.g., gait instability, participant request to stop) or met termination criteria.
Achievement of VO₂max (termination criteria) was verified by meeting at least two of the following criteria: (1) respiratory exchange ratio (RER) ≥ 1.10 at peak exercise; (2) heart rate plateauing near age-predicted MHR (within ~10 bpm) or reaching ≥90% of age-predicted MHR; and/or (3) rating of perceived exertion (RPE) ≥ 19 on a 6–20 Borg scale [3]. All participants included in the analysis satisfied these criteria, indicating equivalent effort across the sample [3]. Measured MHR was defined as the highest heart rate value attained during the GXT.
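The two-of-three rule above can be illustrated with a minimal sketch. The function name and arguments are hypothetical. The source does not state which age-based equation supplied the age-predicted MHR for criterion 2, so that value is passed in by the caller, and the heart rate plateau is approximated here simply as a peak HR within 10 bpm of the age-predicted value.

```python
def vo2max_attained(rer, peak_hr, age_predicted_mhr, rpe):
    """Return True when at least two of the three criteria are met.

    Hypothetical helper: argument names and the plateau approximation
    are assumptions, not the laboratory's actual procedure.
    """
    criteria = [
        # (1) respiratory exchange ratio at peak exercise
        rer >= 1.10,
        # (2) HR within ~10 bpm of, or at least 90% of, age-predicted MHR
        abs(peak_hr - age_predicted_mhr) <= 10
        or peak_hr >= 0.90 * age_predicted_mhr,
        # (3) rating of perceived exertion on the 6-20 Borg scale
        rpe >= 19,
    ]
    return sum(criteria) >= 2


# Example: RER and HR criteria met, RPE not met, so the test counts as maximal.
print(vo2max_attained(rer=1.12, peak_hr=175, age_predicted_mhr=180, rpe=17))
```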

2.3. Maximum heart rate prediction equations

The performance of seven common age-based MHR prediction equations was evaluated. The equations were chosen for their frequent use in both research and practice [3,13]. The equations were:

Fox [4]: MHR = 220 − age
Tanaka [1]: MHR = 208 − 0.7 × age
Gellish [6]: MHR = 207 − 0.7 × age
Arena [11]: MHR = 209.3 − 0.72 × age
Åstrand [10]: MHR = 216.6 − 0.84 × age
Nes [9]: MHR = 211 − 0.64 × age
Fairbairn [7]: MHR = 208 − 0.8 × age for male; MHR = 201 − 0.63 × age for female
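For illustration, the seven equations can be written as a small lookup of functions. This is a sketch rather than the authors' code (the analyses were run in R): the dictionary layout, the `predict_mhr` name, and the `"male"`/`"female"` labels are assumptions, while the coefficients are those listed above.

```python
# Age-based MHR prediction equations (age in years, result in bpm).
# Every function takes (age, sex) so that all seven share one interface;
# only Fairbairn actually uses sex.
EQUATIONS = {
    "Fox": lambda age, sex: 220 - age,
    "Tanaka": lambda age, sex: 208 - 0.7 * age,
    "Gellish": lambda age, sex: 207 - 0.7 * age,
    "Arena": lambda age, sex: 209.3 - 0.72 * age,
    "Astrand": lambda age, sex: 216.6 - 0.84 * age,
    "Nes": lambda age, sex: 211 - 0.64 * age,
    "Fairbairn": lambda age, sex: (
        208 - 0.8 * age if sex == "male" else 201 - 0.63 * age
    ),
}


def predict_mhr(age, sex):
    """Return a dict mapping equation name to predicted MHR for one person."""
    return {name: equation(age, sex) for name, equation in EQUATIONS.items()}


# Example: predictions for a 40-year-old male.
for name, bpm in predict_mhr(40, "male").items():
    print(f"{name:10s} {bpm:6.1f}")
```

For a 40-year-old, the predictions span roughly 176 to 185 bpm, which shows how much the choice of equation alone shifts the estimate.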

2.4. Statistics

The data were exported from the metabolic cart computer software into Excel (Microsoft Corporation, Redmond, WA, USA) for further analysis. All statistical analyses and visualizations were performed using R (version 4.2.1, R Core Team, Vienna, Austria). Statistical significance was set to p < 0.05 for all analyses, with adjustments as described below. The dataset supporting this study is openly available on the Open Science Framework at https://doi.org/10.17605/OSF.IO/VDG92. All variables required for analysis (age, sex, MHR, and VO₂max) were complete with no missing values. Therefore, no participants were excluded due to incomplete demographic or physiological data.

Outlier detection was performed on prediction error (Prediction Error = MHR_predicted – MHR_measured) and absolute error values. Outliers were flagged as values exceeding ±3 standard deviations from the mean [28]. This pre-specified threshold was used as a quality-control screen to limit inclusion of likely artifacts from HR telemetry, such as signal dropouts or chest strap slippage [27]. Raw data from flagged cases were reviewed and 3 participants’ data were removed for MHR values flagged as outliers resulting in a final analytical sample of N = 230. Normality of the primary continuous variables (age, VO₂max, MHR, and RER) was assessed using the Shapiro–Wilk test and inspection of Q–Q plots. To further examine the distributional properties, kernel density plots were generated for the overall sample and stratified by sex.
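The ±3 SD screen described above might be sketched as follows. The function name is hypothetical, and the use of the sample standard deviation (n − 1 denominator) is an assumption, since the text does not specify it. Flagged values were reviewed against the raw data before removal, so the sketch only flags and does not delete.

```python
from statistics import mean, stdev


def flag_outliers(errors, n_sd=3.0):
    """Return one boolean per value: True if it lies beyond mean ± n_sd * SD.

    Assumes the sample SD (n - 1 denominator); the source does not say which.
    """
    centre = mean(errors)
    spread = stdev(errors)
    return [abs(e - centre) > n_sd * spread for e in errors]


# Hypothetical prediction errors (bpm); the last value mimics a telemetry artifact.
errors = [-2, -1, 0, 1, 2] * 4 + [60]
flags = flag_outliers(errors)
print([e for e, is_outlier in zip(errors, flags) if is_outlier])
```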

Descriptive statistics were calculated for the full sample and stratified by sex. Sex differences in continuous variables were assessed using non-parametric Mann–Whitney U tests due to violations of normality assumptions. Effect sizes, with 95% confidence intervals (CI), for group differences were calculated as Cohen’s d, with d = 0.2, 0.5, 0.8 indicating small, medium, and large effects, respectively [29]. To assess the relationship between MHR and age, we regressed measured MHR on the equation-predicted MHR (simple linear model: MHR_measured ~ MHR_predicted) and reported coefficients of determination (R²) to summarize the proportion of variance explained. To examine whether the relationship between age and relative VO₂max differed by sex, scatterplots stratified by sex were generated with locally estimated scatterplot smoothing (LOESS) curves. Linear regression models including sex, age, and the sex × age interaction term were fit to test for sex-specific differences in slopes. R² values were computed for each sex to quantify the strength of association between age and VO₂max. Slope contrasts were performed to compare the estimated age-related decline in VO₂max between males and females.

To evaluate the effects of CRF on MHR prediction equations, we quantified error using absolute differences between predicted and measured values. Linear mixed-effects models (LMMs) were fit using the lme4 package in R. Fixed effects included Equation (seven levels: Fox, Tanaka, Gellish, Arena, Åstrand, Nes, Fairbairn), VO₂max (continuous), their interactions, and sex (as a covariate). VO₂max values were mean centered prior to analysis so that model intercepts reflected prediction accuracy at the sample average, and to reduce multicollinearity between main effects and interaction terms. Subjects were modeled with random intercepts to account for repeated measures across equations. Estimated marginal means and simple slopes were extracted from the models using the emmeans package, with pairwise post hoc contrasts adjusted using the Tukey method to control for multiple comparisons. LMMs were selected over repeated-measures ANOVA because they better accommodate unbalanced data, relax sphericity assumptions, and allow simultaneous inclusion of continuous covariates and interaction terms, providing a more flexible framework for modeling interindividual variability [30].

To further evaluate whether CRF modified the accuracy of MHR prediction equations, we fit separate ordinary least squares regression models with absolute error as the dependent variable and VO₂max as the predictor. The regression slope quantified whether individuals with higher CRF exhibited systematically greater or lower prediction errors. Regression models were fit in R using the lm() function. For each model, the slope estimate with 95% CI, p-value, R², and residual standard error (RMSE) were calculated.
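As a rough illustration of these per-equation regressions, the sketch below fits absolute error on VO₂max by ordinary least squares using closed-form expressions. It is not the authors' `lm()` code; the function name and toy data are hypothetical. Confidence intervals and p-values are omitted because they require the t distribution, which the Python standard library does not provide.

```python
from math import sqrt


def simple_ols(x, y):
    """Fit y = intercept + slope * x and return the fit summaries."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return {
        "slope": slope,            # bpm of absolute error per mL/kg-min
        "intercept": intercept,
        "r2": 1 - ss_res / ss_tot,
        # residual standard error with n - 2 degrees of freedom
        "rse": sqrt(ss_res / (n - 2)),
    }


# Hypothetical data: VO2max (mL/kg-min) and absolute prediction error (bpm).
vo2max = [30, 40, 50, 60]
abs_error = [5, 7, 6, 8]
print(simple_ols(vo2max, abs_error))
```

A positive slope here would mean that absolute error grows with fitness, which is the pattern the analysis was designed to detect.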

Bland–Altman analyses were conducted for each prediction equation against measured MHR to evaluate systematic and proportional bias, stratified by sex. Systematic bias was quantified as the mean difference (MHR_predicted – MHR_measured), and 95% limits of agreement (LOA) were calculated as the mean difference ± 1.96 SD of the differences. Proportional bias was assessed by linear regression of the prediction error on measured MHR, with significance indicating a non-constant bias across the range of values. To further evaluate the agreement between MHR_predicted and MHR_measured, several metrics were computed for each prediction equation. These included mean absolute error (MAE) and root-mean-square error (RMSE) to quantify systematic and absolute differences. Agreement between MHR_predicted and MHR_measured was further evaluated using an intraclass correlation coefficient (two-way mixed-effects model, ICC(3,1)). Each metric was calculated for the entire sample and separately for males and females. ICC values were interpreted using the following thresholds: < 0.50 = poor, 0.50–0.74 = moderate, 0.75–0.89 = good, and ≥0.90 = excellent reliability [31]. To facilitate comparison, equations were ranked from 1 (best) to 7 (worst) for each metric within each group.
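The bias, limits of agreement, MAE, and RMSE defined above can be sketched for a single equation as follows. The function name and example values are hypothetical, and the sample SD (n − 1 denominator) is assumed for the limits of agreement. The ICC and the significance test for proportional bias are left out of this sketch.

```python
from math import sqrt
from statistics import mean, stdev


def agreement_metrics(predicted, measured):
    """Summarize agreement between predicted and measured MHR (bpm)."""
    # Error is defined as predicted minus measured, as in the text.
    diffs = [p - m for p, m in zip(predicted, measured)]
    bias = mean(diffs)
    sd = stdev(diffs)  # assumption: sample SD of the differences
    return {
        "bias": bias,
        "loa_lower": bias - 1.96 * sd,
        "loa_upper": bias + 1.96 * sd,
        "mae": mean([abs(d) for d in diffs]),
        "rmse": sqrt(mean([d * d for d in diffs])),
    }


# Hypothetical predicted and measured values for four participants.
predicted = [180, 185, 175, 190]
measured = [178, 190, 170, 188]
for key, value in agreement_metrics(predicted, measured).items():
    print(f"{key:10s} {value:7.2f}")
```

Because RMSE squares each difference before averaging, it weights large individual misses more heavily than MAE, which is why both are reported.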

Results

3.1. Participant characteristics

Table 1 summarizes the characteristics of the 230 participants, overall and by sex. Shapiro–Wilk tests indicated that VO₂max (W = 0.993, p = 0.31) was normally distributed, whereas age (W = 0.969, p < 0.001), MHR (W = 0.985, p = 0.017), and respiratory exchange ratio (W = 0.915, p < 0.001) were not normally distributed. The data distributions are visualized in Fig 1. The sample was 75.7% male. Sex differences (p < 0.05) were observed for height, body mass, and relative VO₂max, with males presenting significantly greater values and effect sizes ranging from medium to large. No significant sex differences were detected for age, MHR, or respiratory exchange ratio (all p > 0.05).

Table 1. Overall participant (n = 230) characteristics.

https://doi.org/10.1371/journal.pone.0335842.t001

Fig 1. Kernel density plots of age, maximum heart rate, VO₂max, and respiratory exchange ratio.

https://doi.org/10.1371/journal.pone.0335842.g001

3.2. Relationships of maximum heart rate and VO₂max with age

Simple linear regressions of measured versus predicted MHR showed comparable variance explained across all equations (Fig 2), with R² values ranging from 0.40 to 0.45. Despite differences in slope and intercept across the age-based formulas, the non–sex-specific equations (Fox, Tanaka, Åstrand, Gellish, Nes, and Arena) all explained a similar proportion of variance in measured MHR (R² = 0.42). This similarity reflects the fact that each equation relies primarily on age as the predictor, producing highly collinear estimates.
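The identical R² across the non–sex-specific equations follows from the fact that each prediction is a linear function of age, and the squared correlation is unchanged by a linear transformation of the predictor. A minimal numerical check, using hypothetical ages and measured values rather than the study data, might look like this:

```python
def r_squared(x, y):
    """Squared Pearson correlation, equal to R² of a simple linear regression."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy ** 2 / (sxx * syy)


# Hypothetical data for five participants.
ages = [22, 30, 41, 55, 63]
measured = [195, 192, 181, 170, 168]

fox = [220 - a for a in ages]
tanaka = [208 - 0.7 * a for a in ages]

# All three values agree up to floating-point rounding.
print(r_squared(fox, measured))
print(r_squared(tanaka, measured))
print(r_squared(ages, measured))
```

The sex-specific Fairbairn equation is not a single linear function of age across the whole sample, so its R² can differ from the others.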

Fig 2. Measured maximum heart rate versus age with age-based prediction equations.

Note: Measured maximal heart rate as a function of age, with predicted values from seven age-based equations superimposed. Colored lines represent the prediction equations, with associated R² values denoting model fit to the measured data.

https://doi.org/10.1371/journal.pone.0335842.g002

Linear regression models (Fig 3) revealed that age was negatively associated with relative VO₂max in both males and females. Among males, age explained 24% of the variance in VO₂max (R² = 0.236, p < 0.001), with an average decline of –0.37 mL/kg-min per year (95% CI [–0.47, –0.27]). Among females, age explained 18% of the variance (R² = 0.18, p = 0.001), with an average decline of –0.28 mL/kg-min per year (95% CI [–0.45, –0.12]). The Age × Sex interaction was not statistically significant (β = 0.085, 95% CI [–0.11, 0.28], p = 0.386), indicating that the rate of decline in VO₂max with age did not differ significantly between males and females.

Fig 3. Age-related decline in relative VO₂max stratified by sex.

Note: Scatterplots with LOESS smoothing (solid black line) and linear regression fits (dashed line) are shown for males and females. The shaded area represents the 95% confidence interval around the LOESS curve.

https://doi.org/10.1371/journal.pone.0335842.g003

3.3. Influence of VO₂max on predicted maximum heart rate

LMM results indicated a significant main effect of prediction equation on error, F(6, 1389) = 5.83, p < 0.001, and a significant equation × VO₂max interaction, F(6, 1389) = 2.63, p = 0.015 (Table 2). Neither sex (p = 0.49) nor VO₂max (p = 0.18) had an independent effect on prediction error. Post hoc contrasts revealed that, relative to the Fox equation (reference), the Tanaka (β = –1.03, 95% CI [–1.63, –0.43], p < 0.001), Gellish (β = –0.98, 95% CI [–1.58, –0.38], p = 0.001), and Arena (β = –0.98, 95% CI [–1.58, –0.38], p = 0.001) equations were associated with significantly lower absolute error. No other between-equation differences were significant. Only one significant slope difference was detected in the equation × VO₂max interaction, with the Fairbairn equation demonstrating a stronger negative association between VO₂max and error (β = –0.079, p = 0.016). Given the significant equation × VO₂max interaction, we investigated simple slopes of VO₂max within each equation. Among these exploratory comparisons, only the Åstrand equation exhibited a significant positive slope (β = 0.106, 95% CI [0.027, 0.185], p = 0.009), indicating that error increased as VO₂max increased. However, the conditional R² for the model was 0.70, with a marginal R² of 0.02, indicating that variance was dominated by subject-level random effects rather than fixed effects.

Table 2. Linear Mixed-Effects Model Examining the Effects of Prediction Equation, VO₂max, and Sex on Absolute Error of Maximum Heart Rate Prediction.

https://doi.org/10.1371/journal.pone.0335842.t002

Regression analyses stratified by sex revealed that VO₂max was significantly associated with absolute error in several prediction equations among males but not females. In males, higher VO₂max was linked to greater prediction error for the Fox (β = 0.10 bpm per mL/kg-min, p = 0.033), Tanaka (β = 0.10, p = 0.03), Arena (β = 0.10, p = 0.02), Åstrand (β = 0.16, p = 0.001), and Nes (β = 0.11, p = 0.05) equations. The Gellish equation showed a positive slope that approached significance (β = 0.08, p = 0.055). In contrast, the Fairbairn equation did not show a significant relationship (β = 0.03, p = 0.57). Among females, slopes were small and non-significant across all equations (β range: –0.05 to 0.03, all p > 0.61). The proportion of variance explained by VO₂max was low (R² ≤ 0.06), indicating that although CRF influenced prediction error in males, the effect size was small. Figs 4–10 present the regression slopes with 95% CIs, highlighting significant positive associations for several equations in males, but no associations in females. Notably, the 95% CIs of the slopes overlapped between sexes, suggesting no sex difference in slope for any equation.