Abstract
Introduction
Maximal heart rate (MHR) is a key measure for cardiorespiratory exercise prescription yet is often estimated using age-based prediction equations. The accuracy of these equations may vary by individual characteristics, including cardiorespiratory fitness (CRF), but limited research has examined predictive accuracy across CRF levels. Therefore, we evaluated the accuracy of seven commonly used MHR prediction equations in adults with varying CRF to assess whether prediction error differs by fitness level.
Materials and methods
Data from 230 healthy adults (76% male, mean age 38.5 ± 12.3 years) who completed maximal graded exercise tests between 2019 and 2024 were analyzed retrospectively. Predicted MHR values were calculated using the Fox, Tanaka, Gellish, Arena, Åstrand, Nes, and Fairbairn equations. Linear mixed-effects models (LMM) tested the influence of VO₂max and its interaction with prediction equation on error, with sex included as a covariate. Estimated marginal means and slopes were extracted, with pairwise contrasts adjusted by the Tukey method. Prediction equation accuracy was evaluated by comparing predicted and measured MHR using Bland-Altman analyses, and metrics including mean absolute error (MAE), root mean square error (RMSE), and intraclass correlation coefficients (ICC).
Results
LMM indicated a significant main effect of prediction equation on error (p < 0.001) and a significant equation × VO₂max interaction (p = 0.015), though neither sex (p = 0.49) nor VO₂max (p = 0.18) alone influenced error. The conditional R2 for the LMM was 0.70, with a marginal R2 of 0.02. Post-hoc linear regressions showed higher VO₂max was associated with greater prediction error for several equations in males, but not females, with a small amount of variance explained (R2 ≤ 0.06). Agreement analyses indicated small mean biases across equations (–3 to +6 bpm) but wide limits of agreement (~±18–24 bpm). The Arena, Tanaka, and Gellish equations showed the lowest MAE and RMSE. Among the equations, Fox showed the most stable performance across MHR ranges, being the only formula without proportional bias across the sample.
Discussion
The findings indicate that CRF had only a limited influence on MHR prediction error, with small associations observed in males but not females, reinforcing age as the primary determinant of MHR. Although some equations (e.g., Tanaka, Gellish, Arena, Fox) performed better than others across agreement metrics, none demonstrated high individual-level accuracy, which highlights a lack of precision when estimating MHR for exercise prescription and monitoring purposes. Future work should explore more individualized modeling approaches, though adjusting for CRF alone may not substantially improve prediction accuracy in healthy adults.
Citation: Martin J, Lindsey B, Gerrity C, Ambegaonkar J (2025) Exploratory analysis of the accuracy of age-based maximal heart rate equations across cardiorespiratory fitness levels. PLoS One 20(10): e0335842. https://doi.org/10.1371/journal.pone.0335842
Editor: Stefano Amatori, eCampus University: Universita degli Studi eCampus, ITALY
Received: June 4, 2025; Accepted: October 16, 2025; Published: October 30, 2025
Copyright: © 2025 Martin et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: All relevant data are within the manuscript and its Supporting information files. The dataset generated and analyzed during the current study has been deposited in the Open Science Framework and is publicly available under DOI https://doi.org/10.17605/OSF.IO/VDG92 (version 1.0). The dataset can be accessed at: https://osf.io/vdg92/?view_only=47d195f2b4d94641a1b54dd2a67cca2c.
Funding: The author(s) received no specific funding for this work.
Competing interests: The authors have declared that no competing interests exist.
Introduction
Maximal heart rate (MHR) in beats per minute (bpm) is a fundamental parameter in exercise physiology, commonly used to assess cardiorespiratory fitness (CRF) [1] and prescribe training intensity [2]. Direct measurement of MHR requires a graded exercise test (GXT) taken to volitional exhaustion, which may pose logistical challenges and potential health risks, particularly for individuals with underlying medical conditions or limited exercise tolerance [3]. As a result, age-based MHR prediction equations are widely used as a safer, more convenient alternative to directly measuring MHR [1]. The most common prediction equation is the simple ‘MHR = 220 – age’ formula, developed by Fox and colleagues [4], although it has been criticized for high inter-individual variability [3,5]. In response, numerous alternative equations have been proposed to improve prediction accuracy across different populations [1,6–8].
Several widely used MHR prediction equations are derived from large population studies or specific subgroups [1,6–8]. Notable examples include the Tanaka equation (MHR = 208 − 0.7 × age) introduced by Tanaka et al. [1] based on a meta-analysis of 18,712 subjects, the Gellish equation (MHR = 207 − 0.7 × age) from a longitudinal fitness study in 2007 with participants (n = 908) of a broad age and fitness spectrum [6], and the Nes equation (MHR = 211 − 0.64 × age) derived from a large (n = 3320) Norwegian cohort [9]. Other widely used formulas include Åstrand’s formula (216.6 − 0.84 × age) [10], and Arena’s formula (209.3 − 0.72 × age) which were proposed in the context of cardiac rehabilitation and functional capacity testing to gauge effort [11].
While these MHR prediction equations rely primarily on age, cardiorespiratory responses to exercise may differ between sexes due to physiological and hormonal differences [12]. Accordingly, some MHR models have attempted to account for sex differences [7]. For instance, Fairbairn’s equations are sex-specific (male: MHR = 208 − 0.8 × age; female: MHR = 201 − 0.63 × age), with the inclusion of sex intended to improve accuracy for female individuals [7]. With numerous MHR prediction formulas available and derived from different samples [1,6–8], it is evident that no single equation fits all individuals, as each carries assumptions based on the characteristics of its derivation sample [13]. Importantly, across all formulas, age has emerged as the predominant factor for predicting MHR.
One major limitation of all age-based MHR prediction formulas is that age explains only part of the variability in MHR [1,13,14]. While MHR declines about 0.5 to 1 beats per year in adulthood [14], other factors such as sex [7], genetics [15], and CRF level [16] may also influence MHR. Notably, there has been debate about the effect of CRF and/or endurance training on MHR [16]. Higher CRF achieved through endurance training is generally associated with enhanced stroke volume and autonomic regulation, increased vagal tone at rest [17], and significantly lower resting heart rates [18]. Some evidence indicates that chronically aerobic-trained people show a slight reduction in MHR compared to age-matched untrained peers [16], potentially due to cardiac remodeling and increased parasympathetic influence that accompanies aerobic conditioning [16]. However, others have observed increases in MHR after detraining or inconsistent changes with training [19,20]. Zavorsky [16] and Carter et al. [21] have detailed these conflicting findings, noting instances where MHR in previously sedentary adults decreased with exercise training, only to increase again upon training cessation [16,21]. Interestingly, Lach et al. [22] found that adding CRF and body composition to prediction models only marginally improved accuracy (R2 = 0.22 vs. 0.19 for age alone), suggesting limited influence of CRF on MHR for physically active adults. To our knowledge this finding has yet to be explored in subsequent studies. Thus, while age remains the most informative single covariate in population models of MHR, the influence of CRF level on MHR and on the accuracy of prediction is not well resolved.
In summary, the literature indicates that age-based MHR predictions are convenient but lack accuracy at the individual level [13], and it remains uncertain whether any equation is more suitable at a given CRF level. The potential error introduced by CRF level represents an important gap in the literature, as knowledge of consistent over- or underestimation by a specific equation would help exercise professionals select the most appropriate formula. To address this gap, the present study undertook an exploratory analysis of the accuracy of several commonly used age-predicted MHR equations across adults with varying CRF levels. In line with recent recommendations for distinguishing exploratory from confirmatory research [23,24], no a priori hypotheses were tested. The study findings may provide preliminary insights of value to exercise science practitioners, clinicians, and researchers who frequently use MHR prediction equations in individuals with varying levels of CRF when devising cardiorespiratory training and exercise programs for their clients and patients.
Materials and methods
2.1. Participants and design
This study was a retrospective analysis of de-identified graded exercise test (GXT) data collected between 2019 and 2024 at a university exercise physiology laboratory. The data were accessed on January 23, 2025 for research purposes. A total of 230 adults (174 males, 56 females) who had performed GXTs during this period were included. Participant ages ranged from 18 to 68 years (mean ± SD: 38.5 ± 12.3 years) with a range of CRF levels. The participants were individuals from the local community who voluntarily participated in the laboratory’s human performance GXT testing services. In most cases, they were recreational athletes of varying levels seeking estimates of VO₂max and MHR to guide training. Others underwent testing for general health or fitness assessment purposes. Inclusion criteria for the analysis were: age ≥ 18; completion of a GXT to volitional fatigue with valid attainment of VO₂max (see below); and no acute medical issues at the time of testing. The study protocol was reviewed and approved by the George Mason University Institutional Review Board (IRB #: 1665548) and participants provided informed consent prior to testing. Because this was a retrospective exploratory analysis of an existing dataset, no a priori sample size calculation or power analysis was conducted; instead, we analyzed all eligible records, following recommendations to justify sample size by using the full accessible dataset and to transparently acknowledge the absence of prospective calculations [25]. The exploratory nature of this study is intended to generate preliminary insights and identify patterns that can inform future confirmatory studies, which can plan sample sizes for desired accuracy or a smallest effect size of interest [24].
2.2. Graded exercise test protocol
All participants performed a maximal GXT on a motorized treadmill (Desmo, Woodway, Waukesha, WI, USA) to determine VO₂max and MHR. Tests were conducted by trained technicians following standard laboratory procedures. Participants were advised to refrain from heavy exercise and stimulants on the day of the test and provided basic health screening information before starting. The protocol consisted of a 3-minute warm-up followed by a continuous incremental test. During warm-up, participants began with a brisk walk (~3.0 mph, 0–1 min), transitioned to a light jog (~4–5 mph, 1–2 min), and progressed to their self-selected “working pace” (2–3 min). The working pace was typically ~5–6 mph but was adjusted individually to reflect a pace the participant self-reported they could maintain for 15–20 minutes. At minute 3, the first stage began at 0% grade. Thereafter, treadmill grade increased by 2% every 2 minutes (e.g., 2% at min 5, 4% at min 7, 6% at min 9, etc.). Once a 10% grade was reached, further increases were made by small increments in speed (~0.2–0.3 mph) rather than grade, to preserve running form. The test followed GXT self-paced recommendations [26] and was designed to elicit volitional exhaustion within ~8–12 minutes, excluding the warm-up. Throughout the test, heart rate (HR) was measured using a telemetry HR monitor (H10, Polar, Polar Electro, Kempele, Finland), which has been shown to have superior signal quality relative to other comparable devices [27]. Expired gases were collected using a calibrated metabolic cart (Parvo Medics TrueOne 2400; Parvo Medics, Sandy, UT, USA) to determine oxygen uptake. The metabolic system was calibrated prior to each test according to manufacturer guidelines using standardized gas concentrations and volume flow calibration. Each test continued until the participant reached volitional exhaustion (e.g., gait instability, participant request to stop) or met termination criteria.
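The grade schedule described above can be sketched as a small function. This is only an illustration under stated assumptions: the function name `treadmill_grade` is hypothetical, the warm-up grade is assumed to be 0% (the text does not state it), and the speed increments applied after the 10% grade is reached are not modeled because their timing is not specified.

```python
def treadmill_grade(minute: float) -> float:
    """Return the treadmill grade (%) at a given elapsed time in minutes."""
    warmup_end = 3.0      # the 3-minute warm-up precedes the first stage
    stage_minutes = 2.0   # grade rises every 2 minutes
    step_percent = 2.0    # by 2% per stage
    max_grade = 10.0      # beyond this, speed is raised instead of grade

    if minute < warmup_end:
        return 0.0  # assumption: warm-up grade is not given in the text
    completed_stages = int((minute - warmup_end) // stage_minutes)
    return min(max_grade, step_percent * completed_stages)


if __name__ == "__main__":
    for t in (3, 5, 7, 9, 13):
        print(t, treadmill_grade(t))
```

Under this schedule the 10% grade is reached at minute 13, i.e. 10 minutes after the warm-up, which sits inside the ~8–12 minute target window for the incremental phase.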
Achievement of VO₂max (termination criteria) was verified by meeting at least two of the following criteria: (1) respiratory exchange ratio (RER) ≥ 1.10 at peak exercise; (2) heart rate plateauing near age-predicted MHR (within ~10 bpm) or reaching ≥90% of age-predicted MHR; and/or (3) rating of perceived exertion (RPE) ≥ 19 on a 6–20 Borg scale [3]. All participants included in the analysis satisfied these criteria, indicating equivalent effort across the sample [3]. Measured MHR was defined as the highest heart rate value attained during the GXT.
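The "at least two of three" rule can be expressed as a short check. The sketch below is an assumption-laden illustration, not the laboratory's procedure: the function name `vo2max_attained` is hypothetical, the age-predicted MHR is passed in because the text does not say which equation was used for this criterion, and the heart rate plateau is approximated by the peak value falling within 10 bpm of the prediction.

```python
def vo2max_attained(rer: float, peak_hr: float,
                    predicted_mhr: float, rpe: int) -> bool:
    """Return True when at least two of the three stated criteria are met."""
    criteria = [
        # (1) respiratory exchange ratio at peak exercise
        rer >= 1.10,
        # (2) peak HR within 10 bpm of, or at least 90% of, age-predicted MHR
        #     (assumption: a true plateau test would need the HR time series)
        abs(peak_hr - predicted_mhr) <= 10 or peak_hr >= 0.90 * predicted_mhr,
        # (3) rating of perceived exertion on the 6-20 Borg scale
        rpe >= 19,
    ]
    return sum(criteria) >= 2


if __name__ == "__main__":
    print(vo2max_attained(rer=1.12, peak_hr=175, predicted_mhr=180, rpe=17))
```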
2.3. Maximum heart rate prediction equations
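The performance of seven common age-based MHR prediction equations was evaluated. The equations were chosen for their frequent use in both research and practice [3,13]. The equations, with coefficients as given in the Introduction, were: Fox (MHR = 220 − age), Tanaka (MHR = 208 − 0.7 × age), Gellish (MHR = 207 − 0.7 × age), Arena (MHR = 209.3 − 0.72 × age), Åstrand (MHR = 216.6 − 0.84 × age), Nes (MHR = 211 − 0.64 × age), and Fairbairn (male: MHR = 208 − 0.8 × age; female: MHR = 201 − 0.63 × age).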
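As a minimal sketch, the seven equations can be written as Python functions using the coefficients quoted in the Introduction. The names `EQUATIONS`, `predict_all`, and `prediction_error`, and the "M"/"F" coding of sex, are illustrative assumptions rather than the authors' code.

```python
def fairbairn(age: float, sex: str) -> float:
    """Sex-specific Fairbairn equations (coefficients from the Introduction)."""
    if sex == "M":
        return 208 - 0.8 * age
    if sex == "F":
        return 201 - 0.63 * age
    raise ValueError("sex must be 'M' or 'F'")


# Every entry takes (age, sex) so all equations share one call signature;
# only Fairbairn actually uses sex.
EQUATIONS = {
    "Fox": lambda age, sex: 220 - age,
    "Tanaka": lambda age, sex: 208 - 0.7 * age,
    "Gellish": lambda age, sex: 207 - 0.7 * age,
    "Arena": lambda age, sex: 209.3 - 0.72 * age,
    "Åstrand": lambda age, sex: 216.6 - 0.84 * age,
    "Nes": lambda age, sex: 211 - 0.64 * age,
    "Fairbairn": fairbairn,
}


def predict_all(age: float, sex: str) -> dict:
    """Predicted MHR (bpm) from each equation for one person."""
    return {name: eq(age, sex) for name, eq in EQUATIONS.items()}


def prediction_error(predicted: float, measured: float) -> float:
    """Signed error: positive values mean the equation overestimates MHR."""
    return predicted - measured


if __name__ == "__main__":
    for name, value in predict_all(40, "M").items():
        print(f"{name}: {value:.1f} bpm")
```

For a 40-year-old, the non-sex-specific predictions span roughly 179 to 185 bpm, which illustrates how closely the age-based formulas track one another.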
2.4. Statistics
The data were exported from the metabolic cart computer software into Excel (Microsoft Corporation, Redmond, WA, USA) for further analysis. All statistical analyses and visualizations were performed using R (version 4.2.1, R Core Team, Vienna, Austria). Statistical significance was set to p < 0.05 for all analyses, with adjustments as described below. The dataset supporting this study is openly available on the Open Science Framework at https://doi.org/10.17605/OSF.IO/VDG92. All variables required for analysis (age, sex, MHR, and VO₂max) were complete with no missing values. Therefore, no participants were excluded due to incomplete demographic or physiological data.
Outlier detection was performed on prediction error (Prediction Error = MHRpredicted – MHRmeasured) and absolute error values. Outliers were flagged as values exceeding ±3 standard deviations from the mean [28]. This pre-specified threshold was used as a quality-control screen to limit inclusion of likely artifacts from HR telemetry, such as signal dropouts or chest strap slippage [27]. Raw data from flagged cases were reviewed and 3 participants’ data were removed for MHR values flagged as outliers resulting in a final analytical sample of N = 230. Normality of the primary continuous variables (age, VO₂max, MHR, and RER) was assessed using the Shapiro–Wilk test and inspection of Q–Q plots. To further examine the distributional properties, kernel density plots were generated for the overall sample and stratified by sex.
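The ±3 SD screen can be sketched as follows. The function name `flag_outliers` is hypothetical, and the use of the sample standard deviation (n − 1 denominator) is an assumption, since the text does not say which form was used. Flagged cases are only candidates for manual review, as described above.

```python
from statistics import mean, stdev


def flag_outliers(values, n_sd: float = 3.0):
    """Return indices of values lying more than n_sd SDs from the mean."""
    centre = mean(values)
    spread = stdev(values)  # assumption: sample SD (n - 1 denominator)
    return [i for i, v in enumerate(values) if abs(v - centre) > n_sd * spread]


if __name__ == "__main__":
    # Synthetic prediction errors (bpm) with one implausible value appended.
    errors = [-2, -1, 0, 1, 2] * 6 + [60]
    print(flag_outliers(errors))
```

Note that in very small samples a single extreme value inflates the SD enough that it may never exceed the 3 SD threshold, so a screen of this kind is best suited to samples of the size analyzed here.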
Descriptive statistics were calculated for the full sample and stratified by sex. Sex differences in continuous variables were assessed using non-parametric Mann–Whitney U tests due to violations of normality assumptions. Effect sizes, with 95% confidence intervals (CI), for group differences were calculated as Cohen’s d, with d = 0.2, 0.5, 0.8 indicating small, medium, and large effects, respectively [29]. To assess how well each age-based equation tracked measured MHR, we regressed measured MHR on the equation-predicted MHR (simple linear model: MHRmeasured ~ MHRpredicted) and reported coefficients of determination (R2) to summarize the proportion of variance explained. To examine whether the relationship between age and relative VO₂max differed by sex, scatterplots stratified by sex were generated with locally estimated scatterplot smoothing (LOESS) curves. Linear regression models including sex, age, and the sex × age interaction term were fit to test for sex-specific differences in slopes. R2 values were computed for each sex to quantify the strength of association between age and VO₂max. Slope contrasts were performed to compare the estimated age-related decline in VO₂max between males and females.
To evaluate the effects of CRF on MHR prediction equations, we quantified error using absolute differences between predicted and measured values. Linear mixed-effects models (LMMs) were fit using the lme4 package in R. Fixed effects included Equation (seven levels: Fox, Tanaka, Gellish, Arena, Åstrand, Nes, Fairbairn), VO2max (continuous), their interactions, and sex (as a covariate). VO2max values were mean centered prior to analysis so that model intercepts reflected prediction accuracy at the sample average, and to reduce multicollinearity between main effects and interaction terms. Subjects were modeled with random intercepts to account for repeated measures across equations. Estimated marginal means and simple slopes were extracted from the models using the emmeans package, with pairwise post hoc contrasts adjusted using the Tukey method to control for multiple comparisons. LMMs were selected over repeated-measures ANOVA because they better accommodate unbalanced data, relax sphericity assumptions, and allow simultaneous inclusion of continuous covariates and interaction terms, providing a more flexible framework for modeling interindividual variability [30].
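The data layout implied by this model (one row per subject per equation, with mean-centered VO₂max) can be sketched in plain Python. This only illustrates the data preparation; the model itself was fit with lme4 in R. The function and field names are hypothetical, and the lme4-style formula in the comment is an assumption consistent with the description above, not the authors' reported code.

```python
from statistics import mean


def to_long_format(subjects, equations):
    """Build one row per subject x equation with centered VO2max.

    subjects: list of dicts with keys id, age, sex, vo2max, mhr_measured
    equations: dict mapping equation name -> callable(age, sex)
    """
    vo2_mean = mean([s["vo2max"] for s in subjects])
    rows = []
    for s in subjects:
        for name, eq in equations.items():
            predicted = eq(s["age"], s["sex"])
            rows.append({
                "subject": s["id"],                    # random-intercept grouping
                "equation": name,                      # repeated factor
                "sex": s["sex"],                       # covariate
                "vo2max_c": s["vo2max"] - vo2_mean,    # mean-centered VO2max
                "abs_error": abs(predicted - s["mhr_measured"]),
            })
    return rows


if __name__ == "__main__":
    demo_equations = {
        "Fox": lambda age, sex: 220 - age,
        "Tanaka": lambda age, sex: 208 - 0.7 * age,
    }
    demo_subjects = [
        {"id": 1, "age": 30, "sex": "M", "vo2max": 40.0, "mhr_measured": 188},
        {"id": 2, "age": 50, "sex": "F", "vo2max": 50.0, "mhr_measured": 172},
    ]
    for row in to_long_format(demo_subjects, demo_equations):
        print(row)
    # Assumed lme4-style specification matching the text:
    # abs_error ~ equation * vo2max_c + sex + (1 | subject)
```

Centering means the equation main effects are interpreted at the sample-average VO₂max rather than at an implausible VO₂max of zero.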
To further evaluate whether CRF modified the accuracy of MHR prediction equations, we fit separate ordinary least squares regression models with absolute error as the dependent variable and VO₂max as the predictor. The regression slope quantified whether individuals with higher CRF exhibited systematically greater or lower prediction errors. Regression models were fit in R using the lm() function. For each model, the slope estimate with 95% CI, p-value, R2, and residual standard error (RMSE) were calculated.
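For illustration, the slope and R2 of such a simple regression can be computed from first principles. The sketch below is a plain-Python stand-in for what `lm()` reports for a one-predictor model; the function name `ols_fit` and the toy numbers are assumptions, and confidence intervals and p-values are omitted.

```python
def ols_fit(x, y):
    """Least-squares fit of y = intercept + slope * x; returns slope, intercept, R2."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r_squared = 1 - ss_res / ss_tot
    return slope, intercept, r_squared


if __name__ == "__main__":
    vo2max = [30, 40, 50, 60]          # mL/kg-min (toy values)
    abs_error = [5.0, 6.0, 7.0, 8.0]   # bpm (toy values)
    slope, intercept, r2 = ols_fit(vo2max, abs_error)
    print(f"slope = {slope:.2f} bpm per mL/kg-min, R2 = {r2:.2f}")
```

A positive slope corresponds to absolute error growing with VO₂max, which is the direction reported for several equations in males.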
Bland–Altman analyses were conducted for each prediction equation against measured MHR to evaluate systematic and proportional bias, stratified by sex. Systematic bias was quantified as the mean difference (MHRpredicted – MHRmeasured), and 95% limits of agreement (LOA) were calculated as the mean difference ± 1.96 SD of the differences. Proportional bias was assessed by linear regression of the prediction error on measured MHR, with significance indicating a non-constant bias across the range of values. To further evaluate the agreement between MHRpredicted and MHRmeasured, several metrics were computed for each prediction equation. These included mean absolute error (MAE) and root-mean-square error (RMSE) to quantify the magnitude of the differences. Agreement between MHRpredicted and MHRmeasured was further evaluated using an intraclass correlation coefficient (ICC; two-way mixed-effects model, ICC(3,1)). Each metric was calculated for the entire sample and separately for males and females. ICC values were interpreted using the following thresholds: < 0.50 = poor, 0.50–0.74 = moderate, 0.75–0.89 = good, and ≥0.90 = excellent reliability [31]. To facilitate comparison, equations were ranked from 1 (best) to 7 (worst) for each metric within each group.
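The bias, limits of agreement, MAE, and RMSE defined above can be computed as in the following sketch. The function name `agreement_metrics` and the toy values are illustrative assumptions, and the SD of the differences is taken as the sample SD (n − 1 denominator), which the text does not specify. The proportional-bias regression and ICC are not included.

```python
from math import sqrt
from statistics import mean, stdev


def agreement_metrics(predicted, measured):
    """Bland-Altman bias and 95% LOA, plus MAE and RMSE, for paired values."""
    diffs = [p - m for p, m in zip(predicted, measured)]  # predicted - measured
    bias = mean(diffs)
    sd = stdev(diffs)  # assumption: sample SD of the differences
    return {
        "bias": bias,
        "loa_lower": bias - 1.96 * sd,
        "loa_upper": bias + 1.96 * sd,
        "mae": mean([abs(d) for d in diffs]),
        "rmse": sqrt(mean([d * d for d in diffs])),
    }


if __name__ == "__main__":
    predicted = [180, 185, 175, 190]  # bpm (toy values)
    measured = [178, 188, 170, 190]   # bpm (toy values)
    for key, value in agreement_metrics(predicted, measured).items():
        print(f"{key}: {value:.2f}")
```

Because positive and negative errors cancel in the bias but not in MAE or RMSE, an equation can show near-zero mean bias while still having wide limits of agreement and a sizeable typical error.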
Results
3.1. Participant characteristics
Table 1 summarizes the characteristics of the 230 participants. Shapiro–Wilk tests indicated that VO₂max (W = 0.993, p = 0.31) was normally distributed, whereas Age (W = 0.969, p < 0.001), MHR (W = 0.985, p = 0.017), and respiratory exchange ratio (W = 0.915, p < 0.001) were not normally distributed. The data distributions are visualized in Fig 1. The sample was 75.7% male. Sex differences (p < 0.05) were observed for height, body mass, and relative VO₂max, with males presenting significantly greater values and effect sizes ranging from medium to large. No significant sex differences were detected for age, MHR, or respiratory exchange ratio (all p > 0.05). Descriptive statistics for the overall sample and by sex are presented in Table 1.
3.2. Relationships of maximum heart rate and VO2max with age
Simple linear regressions of measured versus predicted MHR showed comparable variance explained across all equations (Fig 2), with R2 values ranging from 0.40 to 0.45. Despite differences in slope and intercept across the age-based formulas, the non–sex-specific equations (Fox, Tanaka, Åstrand, Gellish, Nes, and Arena) all explained a similar proportion of variance in measured MHR (R2 = 0.42). This similarity reflects the fact that each equation relies primarily on age as the predictor, producing highly collinear estimates.
Note: Measured maximal heart rate as a function of age, with predicted values from seven age-based equations superimposed. Colored lines represent the prediction equations, with associated R2 values denoting model fit to the measured data.
Linear regression models (Fig 3) revealed that age was negatively associated with relative VO₂max in both males and females. Among males, age explained 24% of the variance in VO₂max (R2 = 0.236, p < 0.001), with an average decline of –0.37 mL/kg-min per year (95% CI [–0.47, –0.27]). Among females, age explained 18% of the variance (R2 = 0.18, p = 0.001), with an average decline of –0.28 mL/kg-min per year (95% CI [–0.45,–0.12]). The Age × Sex interaction was not statistically significant (β = 0.085, 95% CI [–0.11, 0.28], p = 0.386), indicating that the rate of decline in VO₂max with age did not differ significantly between males and females.
Note: Scatterplots with LOESS smoothing (solid black line) and linear regression fits (dashed line) are shown for males and females. The shaded area represents the 95% confidence interval around the LOESS curve.
3.3. Influence of VO2max on predicted maximum heart rate
LMM results indicated a significant main effect of prediction equation on error, F(6, 1389) = 5.83, p < 0.001, and a significant equation × VO₂max interaction, F(6, 1389) = 2.63, p = 0.015 (Table 2). Neither sex (p = 0.49) nor VO₂max (p = 0.18) had an independent effect on prediction error. Post hoc contrasts revealed that, relative to the Fox equation (reference), the Tanaka (β = –1.03, 95% CI [–1.63, –0.43], p < 0.001), Gellish (β = –0.98, 95% CI [–1.58, –0.38], p = 0.001), and Arena (β = –0.98, 95% CI [–1.58, –0.38], p = 0.001) equations were associated with significantly lower absolute error. No other between-equation differences were significant. Only one significant slope difference was detected in the equation × VO₂max interaction, with the Fairbairn equation demonstrating a stronger negative association between VO₂max and error (β = –0.079, p = 0.016). Given the significant equation × VO₂max interaction, we investigated simple slopes of VO₂max within each equation. Among these exploratory comparisons, only the Åstrand equation exhibited a significant positive slope (β = 0.106, 95% CI [0.027, 0.185], p = 0.009), indicating that error increased as VO₂max increased. However, the conditional R2 for the model was 0.70, with a marginal R2 of 0.02, indicating that variance was dominated by subject-level random effects rather than fixed effects.
Regression analyses stratified by sex revealed that VO₂max was significantly associated with absolute error in several prediction equations among males but not females. In males, higher VO₂max was linked to greater prediction error for the Fox (β = 0.10 bpm per mL/kg-min, p = 0.033), Tanaka (β = 0.10, p = 0.03), Arena (β = 0.10, p = 0.02), Åstrand (β = 0.16, p = 0.001), and Nes (β = 0.11, p = 0.05) equations. The Gellish equation showed a positive slope that approached significance (β = 0.08, p = 0.055). In contrast, the Fairbairn equation did not show a significant relationship (β = 0.03, p = 0.57). Among females, slopes were small and non-significant across all equations (β range: –0.05 to 0.03, all p > 0.61). The proportion of variance explained by VO₂max was low (R2 ≤ 0.06), indicating that although CRF influenced prediction error in males, the effect size was small. Figs 4–10 present the regression slopes with 95% CIs, highlighting significant positive associations for several equations in males, but no associations in females. Notably, the 95% CIs of the slopes overlapped between sexes, suggesting no clear sex difference in slope for any equation.
Note: Points represent individual observations, and solid lines represent sex-specific ordinary least squares regression fits with shaded 95% confidence intervals. Regression slopes with 95% CIs are displayed at the right side of each panel.
3.4. Prediction equation agreement with maximum heart rate
Bland–Altman plots and corresponding statistics for the seven prediction equations are presented in Figs 11–17. Across the sample, mean biases were generally small, ranging from a slight underestimation with the Fairbairn equation (−3.26 bpm) to overestimation with the Nes equation (+6.02 bpm). LOA were wide across all equations (approximately ±18–24 bpm), indicating substantial individual-level variability. Evidence of proportional bias was present for most equations, with stronger effects observed for Tanaka, Gellish, Arena, Nes, and Fairbairn (all p < 0.01). In these cases, predicted MHR increasingly underestimated measured values at higher heart rates. In contrast, the Fox equation demonstrated neither significant mean nor proportional bias, suggesting relatively stable performance across the range of measured MHR. Sex-stratified analyses revealed broadly similar patterns for males and females, though females tended to exhibit slightly wider limits of agreement and, in some cases, stronger proportional bias (e.g., Fairbairn, Nes).
Abbreviations: MHR = Maximum Heart Rate; bpm, beats per minute. The x-axis represents measured MHR in bpm, and the y-axis shows the difference between predicted and measured MHR (bias). The solid horizontal line represents the mean bias, while the upper and lower dashed lines indicate the 95% limits of agreement (LOA). The sloped dashed line shows the proportional bias estimated from linear regression, with the regression equation and p-value reported at the top of each plot.
To complement the Bland–Altman results, pairwise contrasts from the LMM identified several significant between-equation differences for the overall sample. Relative to Fox, the Tanaka (estimate = 1.03, 95% CI [0.43, 1.63], p = 0.014), Gellish (0.98 [0.38, 1.58], p = 0.023), and Arena (0.98 [0.38, 1.58], p = 0.024) equations all showed significantly lower absolute error. Additionally, the Nes equation showed significantly greater absolute error than Tanaka (–1.22 [–1.83, –0.62], p = 0.001), Gellish (–1.17 [–1.77, –0.57], p = 0.002), and Arena (–1.17 [–1.77, –0.56], p = 0.003). No other contrasts reached significance. These results indicate that although several equations (Tanaka, Gellish, Arena) performed differently from Fox, the Nes equation was consistently distinct from the others, aligning with its tendency toward larger positive bias observed in the Bland–Altman analysis.
Agreement metrics for each prediction equation are presented in Table 3 and the relative rankings across metrics are illustrated in Figs 18–20. For the entire sample, the lowest MAE and RMSE were observed for the Tanaka (MAE: 7.40 bpm, RMSE: 9.21 bpm) and Gellish equations (MAE: 7.45 bpm, RMSE: 9.19 bpm), while the Nes equation consistently showed the poorest accuracy (MAE: 8.62 bpm, RMSE: 10.94 bpm). ICC(3,1) values were in the moderate range (0.50–0.64), with Fox demonstrating the highest value among the MHR prediction equations.
Abbreviations: CCC, Lin’s concordance correlation coefficient; RMSE, root mean square error; MAE, mean absolute error; ICC, intraclass correlation coefficient. Bias is the mean bias from the Bland Altman analyses. The ranking indicates the performance of each equation per metric, with a lower ranking indicating better performance.
Among males, findings were similar. Tanaka (MAE: 7.32 bpm, RMSE: 9.04 bpm) and Gellish (MAE: 7.33 bpm, RMSE: 8.99 bpm) yielded the most accurate predictions, while Nes again produced the largest positive bias (+6.24 bpm). ICC was highest for Fox (0.63), but differences across equations were small (0.49–0.63).
Among females, mean bias values ranged from an underestimation of –4.30 bpm (Fairbairn) to an overestimation of +5.31 bpm (Nes). The Arena (MAE: 7.61 bpm, RMSE: 9.74 bpm) and Tanaka (MAE: 7.66 bpm, RMSE: 9.72 bpm) equations performed best on error metrics. Fox provided the highest ICC with measured MHR (0.67), while Nes again showed the weakest agreement (0.54).
Discussion
The primary objective of this study was to determine whether CRF level influences predictive error across commonly used MHR prediction equations. The LMM revealed a significant interaction between prediction equation and VO₂max, indicating that the accuracy of MHR prediction equations was moderated by CRF level. Post-hoc analyses provided limited evidence of an overall CRF effect across the full sample; however, sex-stratified analyses demonstrated that this influence was more pronounced in males. For several equations (Fox, Tanaka, Arena, Åstrand, Nes), higher VO₂max was associated with greater prediction error, suggesting that fitter males tended to deviate more from their predicted MHR. Notably, the variance explained by VO₂max was small (R² ≤ 0.06), highlighting that CRF had only a small influence on prediction accuracy. In contrast, females showed no meaningful relationship between CRF and prediction error, with consistently small and nonsignificant slopes across all equations.
Previous studies have reported inconsistent findings regarding the influence of aerobic training on MHR [16,17,19–21]. Some have proposed that endurance training may slightly reduce MHR due to autonomic adaptations and enhanced stroke volume [16,17,21], while others report negligible or variable effects depending on age, training history, or the duration of training and detraining [19,20]. More specifically, several studies have also reported minimal effects of CRF on MHR responses [1,22]. For example, Tanaka et al. [1] found that MHR is predominantly age-determined and largely independent of the habitual physical activity patterns (e.g., sedentary or exercise behaviors) that alter CRF [3]. Similarly, Lach et al. [22] reported that incorporating CRF and body composition into age-based MHR prediction models yielded only marginal improvements in model fit. The present study provides further evidence that CRF may have a weak influence on the accuracy of MHR prediction, particularly in males, with those who are fitter possibly having slightly greater prediction errors. One possible explanation is that individuals with lower CRF may exhibit a more uniform cardio-autonomic response (i.e., limited stroke-volume reserve and typical sympathetic activation) at maximal exertion [32], resulting in MHR values that align more closely with age-based expectations. In contrast, higher CRF individuals may display greater physiological variability due to adaptations such as increased stroke volume, lower resting vagal set-point with different β-adrenergic sensitivity, and task termination that may be influenced by peripheral factors, all of which could subtly affect MHR responses and increase prediction error [16,21,33].
A secondary aim of this study was to evaluate the overall accuracy of commonly used age-based MHR prediction equations stratified by sex and independent of CRF level. Bland–Altman analyses indicated that mean biases were generally small across equations, but wide limits of agreement (±18–24 bpm) highlighted substantial individual-level variability. Proportional bias was evident for most formulas, particularly Tanaka [1], Gellish [6], Arena [11], Nes [9], and Fairbairn [7], where predicted values increasingly underestimated measured MHR at higher heart rates. Notably, the Fox equation [4] showed neither significant mean nor proportional bias, reflecting relatively stable performance across the full range of MHR. When evaluating accuracy metrics, Tanaka [1] and Gellish [6] consistently demonstrated the lowest MAE and RMSE values, whereas Nes [9] produced the poorest agreement, with larger positive bias and significantly greater error compared with several other formulas. These patterns were largely consistent when stratified by sex, though females tended to show slightly wider limits of agreement and somewhat stronger proportional bias. Collectively, the findings indicate that although certain equations (e.g., Tanaka, Gellish, Arena) provide modestly better accuracy, all formulas exhibited considerable error margins, limiting their precision for individual-level application. These findings are consistent with prior work by Shookster et al. [13], who also found considerable individual error and proportional bias across MHR equations and emphasized the limited utility of any single formula when applied universally.
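The Bland–Altman quantities discussed above (mean bias, 95% limits of agreement, and proportional bias) can be sketched as follows. This is a minimal illustration rather than the study's analysis code: it assumes the limits of agreement are computed as bias ± 1.96 SD of the differences and that proportional bias is the ordinary least-squares slope of the difference regressed on measured MHR, as described in the figure notes. The function name and the example values are hypothetical.

```python
import statistics


def bland_altman(predicted, measured):
    """Summarize agreement between predicted and measured MHR (bpm).

    Requires at least two observations and non-constant measured values.
    """
    diffs = [p - m for p, m in zip(predicted, measured)]
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)  # sample SD of the differences

    # Proportional bias: OLS slope of (predicted - measured) on measured MHR.
    mean_x = statistics.mean(measured)
    sxx = sum((x - mean_x) ** 2 for x in measured)
    sxy = sum((x - mean_x) * (d - bias) for x, d in zip(measured, diffs))
    slope = sxy / sxx
    intercept = bias - slope * mean_x

    return {
        "bias": bias,
        "loa_lower": bias - 1.96 * sd,
        "loa_upper": bias + 1.96 * sd,
        "slope": slope,
        "intercept": intercept,
    }


# Hypothetical example (not study data): three people of the same age all
# receive the same predicted MHR, but their measured MHR differs.
result = bland_altman(predicted=[180, 180, 180], measured=[170, 180, 190])
for key, value in result.items():
    print(f"{key}: {value:.2f}")
```

In this toy case the mean bias is zero, yet the slope is negative: the prediction overestimates those with low measured MHR and underestimates those with high measured MHR. A negative slope of this kind corresponds to the pattern of increasing underestimation at higher heart rates described above.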
4.1. Practical implications
One of the clearest implications of this study is the considerable individual variability in MHR, even after accounting for age, sex, and CRF. The 95% LOA spanned approximately ±20 bpm for all formulas. In practical terms, this degree of error could shift an individual across two heart rate training zones (e.g., from moderate to vigorous intensity), which has meaningful consequences for exercise prescription. Thus, while age-based MHR prediction equations are useful for providing population-level estimates and general fitness guidance (e.g., prescribing approximate target zones in a group class), they are not sufficiently precise for individualized programming, athletic performance optimization, or clinical decision-making. In these contexts, reliance on prediction equations could lead to systematic under- or over-prescription of training intensities, supporting the need for direct measurement of MHR when accuracy is critical.
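To make the zone-shift point concrete, the sketch below classifies a fixed target heart rate against different possible true MHR values. The %MHR cut-offs follow commonly cited ACSM-style intensity categories and are an assumption of this example, not values reported in the study; the function name and the numbers are hypothetical.

```python
# Upper bounds (as a fraction of MHR) for each intensity category.
# These cut-offs are assumed for illustration (ACSM-style %MHR bands).
BANDS = [
    (0.57, "very light"),
    (0.64, "light"),
    (0.77, "moderate"),
    (0.96, "vigorous"),
]


def intensity_zone(heart_rate, true_mhr):
    """Return the intensity category of a heart rate given the true MHR."""
    fraction = heart_rate / true_mhr
    for upper_bound, label in BANDS:
        if fraction < upper_bound:
            return label
    return "near-maximal"


# A target of 126 bpm is 70% of a predicted MHR of 180 bpm ("moderate").
target_hr = 126

# If the true MHR lies 20 bpm either side of the prediction, the same
# target heart rate falls into a different category.
for true_mhr in (160, 180, 200):
    print(f"true MHR {true_mhr} bpm -> {intensity_zone(target_hr, true_mhr)}")
```

Under these assumed cut-offs, the same 126 bpm target is vigorous for a true MHR of 160 bpm, moderate for 180 bpm, and light for 200 bpm.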
For practitioners, clinicians, and researchers who rely on MHR for exercise prescription, screening, or monitoring, the results of this study highlight both the utility and the limitations of available equations. Among the prediction equations evaluated, the Fox [4] formula exhibited arguably the most consistent performance across the sample, showing comparable error to Gellish [6], Tanaka [1], and Arena [11] while avoiding proportional bias. These characteristics support its continued use as a practical and generalizable option, especially in time-limited or equipment-limited settings. Given that all equations produced typical errors of approximately ±7–10 bpm, it is critical to treat age-based MHR predictions as approximate estimates rather than absolutes. Accordingly, exercise professionals should communicate this inherent uncertainty when prescribing training intensities based on predicted MHR. One approach is to present prediction intervals (e.g., 175–185 bpm) instead of single-point estimates (e.g., 180 bpm), thereby providing a more transparent representation of the expected range of error. Also, exercise professionals should consider additional methods, such as perceived exertion or heart rate monitoring over time with wearable devices [34], to refine exercise intensity for individuals. These approaches are supported by evidence highlighting both the limitations of heart rate–based prescriptions throughout an exercise session [35] and the value of RPE as a practical indicator of exercise intensity in applied contexts [36]. For instance, within-session cardiovascular drift and HR lag (i.e., slow kinetics during rapid workload changes) can decouple heart rate from true metabolic load, limiting the precision of HR-based prescriptions [35]. 
In addition, medications such as β-blockers can blunt the chronotropic response [37], rendering HR-based zones misleading and necessitating greater reliance on alternative markers (e.g., RPE, speed/power) in certain populations. Recent work also outlines treadmill protocols and monitoring strategies that improve constant-intensity prescription, such as step-ramp-step approaches in general and clinical populations [38] and individualized load management that integrates HR-based and subjective metrics [39]. Thus, incorporating multiple markers of exercise intensity can enhance safety, personalization, and client understanding when MHR is used for exercise prescription.
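The interval-style reporting suggested above can be sketched in a few lines. This example assumes the Fox equation in its commonly cited form (220 − age) and a user-chosen margin; the default of 10 bpm is an assumption taken from the upper end of the typical error reported here, and the function name is hypothetical.

```python
def predicted_mhr_range(age, margin_bpm=10):
    """Return (lower, point, upper) predicted MHR in bpm.

    Uses the commonly cited Fox form (220 - age). The margin is an
    illustrative choice, not a formal statistical prediction interval.
    """
    point = 220 - age
    return point - margin_bpm, point, point + margin_bpm


lower, point, upper = predicted_mhr_range(age=40)
print(f"Estimated MHR: about {point} bpm (plausible range {lower}-{upper} bpm)")
```

A narrower or wider margin can be passed explicitly, for example `predicted_mhr_range(40, margin_bpm=5)` to reproduce the 175–185 bpm example given in the text.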
Finally, with respect to sex differences, our findings suggest that commonly used MHR prediction equations perform similarly for males and females once age is accounted for, with no meaningful systematic bias detected. Although females exhibited slightly greater variability, these differences were small and unlikely to warrant sex-specific prediction equations in healthy adult populations. This aligns with prior work by Tanaka et al. [1] and Nes et al. [9], who likewise concluded that sex does not substantially alter the age–MHR relationship. Nevertheless, the smaller female sample in our study limits our ability to detect subtle differences in variability or outlier behavior. Future work with larger and more balanced samples may help clarify whether nuanced sex-related distinctions exist. For practitioners, the key takeaway is that variability at the individual level remains the dominant source of error, outweighing sex-based differences.
4.2. Limitations
This study has several limitations. First, the sample was predominantly male (~76%), limiting the ability to fully generalize findings to females. Although we included the Fairbairn equation [7], which incorporates sex-specific coefficients, a more balanced sample would allow for stronger sex-based comparisons. Second, because this was a retrospective exploratory analysis of existing data, an a priori power calculation was not conducted. As such, the study may be underpowered to detect small effects. Furthermore, underpowered designs can also increase the proportion of false positives in a field affected by publication bias [23]. However, from a practical perspective, large effects are of greater interest for practitioners, whereas small effects are unlikely to meaningfully influence training or clinical decision-making. Another potential limitation is the broad adult age range (18–68 y, including a subset >65 y), which may introduce heterogeneity (e.g., age-related medications, chronotropic changes) that could attenuate associations [37,40]. Finally, while effort was made to ensure tests reached true physiological maximums with standard criteria [3], it is possible that some participants did not achieve their absolute MHR [26], introducing random error into the criterion measure. This may be due in part to the GXT protocol implemented in the present study. Although the protocol was designed for an ~8–12-minute time to task failure, using a grade-then-speed progression, graded running up to 10% incline may have increased local muscular discomfort and caused some participants to terminate the test due to peripheral factors rather than central cardiorespiratory limitations. While VO₂max attainment criteria were met for all included tests, we cannot exclude the possibility that this underestimated true MHR in a subset of participants.
4.3. Future directions
Future researchers should continue to explore individual-level variability in MHR and seek models that improve prediction accuracy beyond age-based equations [1,4,6,7,9–11]. While adding CRF or body composition to regression models has produced only marginal gains [22], more complex methods, such as machine learning, may offer improvements when applied to large, diverse datasets [41]. Recent work using random forest models and other nonlinear approaches has shown potential to reduce prediction error by up to 20–25%, although individual-level variance remains high [41]. Further studies are also needed to evaluate MHR prediction accuracy in specific subpopulations, including older adults, individuals with cardiovascular or metabolic diseases, and elite endurance athletes, who may deviate from age-based norms due to factors like altered β-adrenergic responsiveness [33], medication effects [37], chronotropic incompetence [40], or training-induced cardiac remodeling that shifts the heart rate–work relationship [16].
Conclusion
In summary, our findings highlight the substantial individual variability in MHR prediction, with small to no practically meaningful effect of CRF on prediction accuracy. Age-based MHR equations are useful for population-level benchmarking and for setting initial training targets, but their typical error (≈±7–10 bpm) and inter-individual heterogeneity limit precision for individualized programming. Thus, when individual accuracy is needed, MHR should be directly measured rather than predicted and, if possible, complemented with additional markers (e.g., RPE, speed, power). Future researchers should continue to explore individualized modeling approaches, though adjusting for CRF alone may not improve prediction accuracy among healthy adults. Importantly, the present results should be viewed as exploratory and require confirmation in future studies.
References
- 1. Tanaka H, Monahan KD, Seals DR. Age-predicted maximal heart rate revisited. J Am Coll Cardiol. 2001;37(1):153–6. pmid:11153730
- 2. Swain DP, Abernathy KS, Smith CS, Lee SJ, Bunn SA. Target heart rates for the development of cardiorespiratory fitness. Med Sci Sports Exerc. 1994;26:112–116.
- 3. Swain DP, American College of Sports Medicine. ACSM’s resource manual for Guidelines for exercise testing and prescription. Philadelphia: Wolters Kluwer Health/Lippincott Williams & Wilkins; 2014.
- 4. Fox SM 3rd, Naughton JP, Haskell WL. Physical activity and the prevention of coronary heart disease. Ann Clin Res. 1971;3(6):404–32. pmid:4945367
- 5. Robergs RA, Landwehr R. The surprising history of the “HRmax=220-age” equation. J Exerc Physiol Online. 2002;5:1–10.
- 6. Gellish RL, Goslin BR, Olson RE, McDonald A, Russi GD, Moudgil VK. Longitudinal modeling of the relationship between age and maximal heart rate. Med Sci Sports Exerc. 2007;39(5):822–9. pmid:17468581
- 7. Fairbarn MS, Blackie SP, McElvaney NG, Wiggs BR, Paré PD, Pardy RL. Prediction of heart rate and oxygen uptake during incremental and maximal exercise in healthy adults. Chest. 1994;105:1365–9.
- 8. Gulati M, Shaw LJ, Thisted RA, Black HR, Bairey Merz CN, Arnsdorf MF. Heart rate response to exercise stress testing in asymptomatic women: the St. James Women Take Heart Project. Circulation. 2010;122(2):130–7. pmid:20585008
- 9. Nes BM, Janszky I, Wisløff U, Støylen A, Karlsen T. Age-predicted maximal heart rate in healthy subjects: The HUNT fitness study. Scand J Med Sci Sports. 2013;23:697–704.
- 10. Åstrand P. Experimental studies of physical working capacity in relation to sex and age. FIEP Bull Line. 1952 [cited 17 May 2025]. Available: https://www.semanticscholar.org/paper/Experimental-studies-of-physical-working-capacity-%C3%85strand/ff1d4734f8af092c5e7b00ae98eafd4b4acd6238
- 11. Arena R, Myers J, Kaminsky LA. Revisiting age-predicted maximal heart rate: Can it be used as a valid measure of effort? Am Heart J. 2016;173:49–56. pmid:26920596
- 12. Bassareo PP, Crisafulli A. Gender Differences in Hemodynamic Regulation and Cardiovascular Adaptations to Dynamic Exercise. Curr Cardiol Rev. 2020;16(1):65–72. pmid:30907327
- 13. Shookster D, Lindsey B, Cortes N, Martin J. Accuracy of commonly used age-predicted maximal heart rate equations. Int J Exerc Sci. 2020;13:1242–50.
- 14. Zhu N, Suarez-Lopez JR, Sidney S, Sternfeld B, Schreiner PJ, Carnethon MR, et al. Longitudinal examination of age-predicted symptom-limited exercise maximum HR. Med Sci Sports Exerc. 2010;42(8):1519–27. pmid:20639723
- 15. Williford HN, Esco MR, Olson MS, Gaston K, Russell AR. The Accuracy of Selected Equations to Predict Maximal Heart Rate in African American Men. J Strength Cond Res. 2011;25:S80–1.
- 16. Zavorsky GS. Evidence and possible mechanisms of altered maximum heart rate with endurance training and tapering. Sports Med. 2000;29(1):13–26. pmid:10688280
- 17. Lavie CJ, Arena R, Swift DL, Johannsen NM, Sui X, Lee D-C, et al. Exercise and the cardiovascular system: clinical science and cardiovascular outcomes. Circ Res. 2015;117(2):207–19. pmid:26139859
- 18. Reimers AK, Knapp G, Reimers C-D. Effects of Exercise on the Resting Heart Rate: A Systematic Review and Meta-Analysis of Interventional Studies. J Clin Med. 2018;7(12):503. pmid:30513777
- 19. Houmard JA, Costill DL, Mitchell JB, Park SH, Hickner RC, Roemmich JN. Reduced training maintains performance in distance runners. Int J Sports Med. 1990;11(1):46–52. pmid:2318562
- 20. McConell GK, Costill DL, Widrick JJ, Hickey MS, Tanaka H, Gastin PB. Reduced training volume and intensity maintain aerobic capacity but not performance in distance runners. Int J Sports Med. 1993;14(1):33–7. pmid:8440543
- 21. Carter JB, Banister EW, Blaber AP. Effect of endurance exercise on autonomic control of heart rate. Sports Med. 2003;33(1):33–46. pmid:12477376
- 22. Lach J, Wiecha S, Śliż D, Price S, Zaborski M, Cieśliński I, et al. HR Max Prediction Based on Age, Body Composition, Fitness Level, Testing Modality and Sex in Physically Active Population. Front Physiol. 2021;12:695950. pmid:34393819
- 23. Mesquida C, Murphy J, Lakens D, Warne J. Replication concerns in sports and exercise science: a narrative review of selected methodological issues in the field. R Soc Open Sci. 2022;9(12):220946. pmid:36533197
- 24. Ditroilo M, Mesquida C, Abt G, Lakens D. Exploratory research in sport and exercise science: Perceptions, challenges, and recommendations. J Sports Sci. 2025;43(12):1108–20. pmid:40197233
- 25. Lakens D. Sample Size Justification. Collabra: Psychology. 2022;8(1):33267.
- 26. Beltz NM, Gibson AL, Janot JM, Kravitz L, Mermier CM, Dalleck LC. Graded Exercise Testing Protocols for the Determination of VO2max: Historical Perspectives, Progress, and Future Considerations. J Sports Med (Hindawi Publ Corp). 2016;2016:3968393. pmid:28116349
- 27. Lindsey B, Snyder S, Zhou Y, Shim JK, Hahn J-O, Evans W, et al. Activity Type Effects Signal Quality in Electrocardiogram Devices. Sensors (Basel). 2025;25(16):5186. pmid:40872047
- 28. Aguinis H, Gottfredson RK, Joo H. Best-Practice Recommendations for Defining, Identifying, and Handling Outliers. Organ Res Methods. 2013;16(2):270–301.
- 29. Cohen J. Statistical power analysis for the behavioral sciences. 2nd ed. Hillsdale, NJ: L. Erlbaum Associates; 1988.
- 30. Yu Z, Guindani M, Grieco SF, Chen L, Holmes TC, Xu X. Beyond t test and ANOVA: applications of mixed-effects models for more rigorous statistical analysis in neuroscience research. Neuron. 2022;110(1):21–35. pmid:34784504
- 31. Koo TK, Li MY. A Guideline of Selecting and Reporting Intraclass Correlation Coefficients for Reliability Research. J Chiropr Med. 2016;15(2):155–63. pmid:27330520
- 32. Ross R, Goodpaster BH, Koch LG, Sarzynski MA, Kohrt WM, Johannsen NM, et al. Precision exercise medicine: understanding exercise response variability. Br J Sports Med. 2019;53(18):1141–53. pmid:30862704
- 33. Gourine AV, Ackland GL. Cardiac Vagus and Exercise. Physiology (Bethesda). 2019;34(1):71–80. pmid:30540229
- 34. Oyeleye M, Chen T, Titarenko S, Antoniou G. A Predictive Analysis of Heart Rates Using Machine Learning Techniques. Int J Environ Res Public Health. 2022;19(4):2417. pmid:35206603
- 35. Teso M, Colosio AL, Pogliaghi S. An Intensity-dependent Slow Component of HR Interferes with Accurate Exercise Implementation in Postmenopausal Women. Med Sci Sports Exerc. 2022;54(4):655–64. pmid:34967799
- 36. Ferri Marini C, Micheli L, Grossi T, Federici A, Piccoli G, Zoffoli L, et al. Are incremental exercise relationships between rating of perceived exertion and oxygen uptake or heart rate reserve valid during steady-state exercises?. PeerJ. 2024;12:e17158. pmid:38711624
- 37. Fletcher GF, Ades PA, Kligfield P, Arena R, Balady GJ, Bittner VA, et al. Exercise standards for testing and training: a scientific statement from the American Heart Association. Circulation. 2013;128(8):873–934. pmid:23877260
- 38. Faricier R, Keltz RR, Hartley T, Mackay N, Murias JM, Huitema AA, et al. A Protocol to Establish Exercise Intensity Domains for Aerobic Exercise Training in Coronary Artery Disease. Med Sci Sports Exerc. 2025;57(7):1593–602. pmid:39999365
- 39. Nuuttila O-P, Uusitalo A, Kokkonen V-P, Weerarathna N, Kyröläinen H. Monitoring fatigue state with heart rate-based and subjective methods during intensified training in recreational runners. Eur J Sport Sci. 2024;24(7):857–69. pmid:38956784
- 40. Brubaker PH, Kitzman DW. Chronotropic incompetence: causes, consequences, and management. Circulation. 2011;123(9):1010–20. pmid:21382903
- 41. Cundrič L, Bosnić Z, Kaminsky LA, Myers J, Peterman JE, Markovic V, et al. A Machine Learning Approach to Developing an Accurate Prediction of Maximal Heart Rate During Exercise Testing in Apparently Healthy Adults. J Cardiopulm Rehabil Prev. 2023;43(5):377–83. pmid:36880964