Appropriate scaling approach for evaluating peak VO2 development in Southern Chinese 8 to 16 years old

Objective To investigate scaling approaches for evaluating the development of peak VO2 and improving the identification of low cardiopulmonary fitness in Southern Chinese children and adolescents. Methods Nine hundred and twenty Chinese children and adolescents (8 to 16 years) underwent a graded cardiopulmonary exercise test on a treadmill until volitional exhaustion. Peak VO2 was corrected for the effects of body mass by ratio or allometric scaling. Z score equations for predicting peak VO2 were developed. Correlations between scaled peak VO2, z scores, body size and age were tested to examine the effectiveness of the approach. Results Eight hundred and fifty-two participants (48% male) were included in the analyses. Absolute peak VO2 significantly increased with age in both sexes (both P<0.05), while ratio-scaled peak VO2 increased only in males (P<0.05). Allometrically scaled peak VO2 increased from 11 years in both sexes, plateauing by 12 years in girls and continuing to rise until 15 years in boys. Allometrically scaled peak VO2 was not correlated with body mass, but remained correlated with height and age in all but the older girls. Peak VO2 z score was not correlated with body mass, height or age. Conclusions Absolute and allometrically scaled peak VO2 values are provided for Hong Kong Chinese children and adolescents by age and sex. Peak VO2 z scores improve the evaluation of cardiopulmonary fitness, allowing comparisons across ages and sexes, and will likely provide a better metric for tracking change over time in children and adolescents, regardless of body size and age.


Introduction
It is well established that adequate aerobic fitness is associated with reduced risk of all-cause mortality and chronic diseases across the lifespan [1][2][3][4][5]. Peak oxygen uptake (peak VO2) is widely recognized as one of the best indicators of aerobic fitness in the child, providing a composite measure of the pulmonary, cardiovascular, and hematological components of oxygen delivery and oxygen utilization in the exercising muscles.
In clinical settings, peak VO2 has been used as a predictor of mortality and of hospital admissions [6][7][8], as an indicator of the severity of functional impairment, and for tracking responses to intervention [9]. Despite the proven usefulness of peak VO2, achieving quality and consistency of data in children remains a challenge. Peak VO2 is developmentally divergent, varying with age, maturity and sex, and is highly correlated with body size and composition [10,11]. By convention, peak VO2 has been scaled by simply dividing absolute peak VO2 (mL·min⁻¹) by body mass (ratio scaling). This, however, results in a different pattern of development compared with absolute peak VO2: in boys, absolute peak VO2 increases with age, but ratio-scaled peak VO2 remains unchanged [12]. Similarly, in girls, absolute peak VO2 increases with age until it levels off in puberty, but ratio-scaled peak VO2 declines with age [12].
Ratio scaling is only robust if its underlying mathematical assumptions are met; otherwise spurious interpretation may result [13]. Theoretically, physiological variables should be scaled for size using the general allometric equation to derive the appropriate size power function (y/x^b), thereby providing a more appropriate interpretation of size-related (x^b) changes in physiologic function (y) [14]. The development of peak VO2 has been described using allometric scaling (log-linear analysis of covariance with body mass as the covariate) in Caucasian children and adolescents, and the same developmental pattern as absolute peak VO2 has been noted with age, i.e., peak VO2 increases in boys with age, whereas in girls it increases until about 14 years of age, when a leveling off is observed [15]. In a recent study, z score equations designed to remove the effect of body size on peak VO2 have also been presented for a group of healthy children in Canada [16].
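The relationship between ratio scaling and the general allometric equation can be written out explicitly. The symbols a (proportionality constant) and ε (multiplicative error term) are notation assumed here for illustration and do not appear in the original text.

```latex
y = a\,x^{b}\,\varepsilon
\quad\Longrightarrow\quad
\ln y = \ln a + b\,\ln x + \ln \varepsilon ,
\qquad
\frac{y}{x^{b}} = a\,\varepsilon .
```

Ratio scaling corresponds to the special case b = 1. If the true exponent is below 1, the ratio y/x = a x^(b-1) ε still falls as body mass x rises, so heavier children appear less fit for reasons of size alone.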
Peak VO2 has been shown to vary by ethnic group in children, with sparse data on Southern Chinese children [17,18]. Little consideration has been given to the appropriate scaling of peak VO2 in the Chinese child, which is necessary for evaluating the development of peak VO2 and for identifying children with low cardiopulmonary fitness. The purpose of this study was therefore to examine which scaling approach is appropriate for evaluating the development of peak VO2 in Southern Chinese children and adolescents, how power functions may differ by age and sex, and how this affects the interpretation of peak VO2.

Sampling frame
The sample selection was based on a stratified (by districts) and clustered (all subjects within class) randomised sampling frame. All primary and secondary schools registered under the Education Bureau according to the four geographic regions in Hong Kong were stratified. Selection of a school was based on computer generated random numbers. If the selected school refused to participate, the next randomly selected school was invited. A total of 12 primary and 19 secondary schools were randomly chosen. Two classes in each school year grade were randomly selected from the schools. All students of the selected classes were invited to participate in the study.

Collection of data
Participants were scheduled to undergo assessment at the cardiopulmonary exercise laboratory in the Prince of Wales Hospital. Participants were asked to abstain from alcohol, any caffeinated drink and vigorous exercise for 24 hours prior to testing. They were also asked to eat only a light snack and drink only 0.5L of water within 3 hours of the test.

Measures
Anthropometric measurements. Body weight was measured to the nearest 0.1 kg and percentage body fat to the nearest 0.1% using foot-to-foot bio-electrical impedance (TBF-401, Tanita Co, Tokyo, Japan) [19], with subjects barefoot and dressed in a light T-shirt and shorts. Standing height without shoes was measured to the nearest 0.1 cm with a Harpenden stadiometer (Holtain, Crymych, UK). Body mass index (BMI) was calculated as weight in kilograms divided by height in meters squared [20].
Peak VO2. Peak VO2 was assessed using an incremental treadmill running test to maximum [21]. Participants began walking on the treadmill at an age-specific walking speed for 3 minutes. The speed was then increased by 1 km·h⁻¹ every minute until it reached the running pace of the participant. At this point the speed was held constant and the gradient increased by 1% every minute until volitional exhaustion. Heart rate was monitored continuously during the test. Participants wore a comfortably fitted facemask (Hans Rudolph paediatric large size, 8950 series), and breath-by-breath gas samples were collected and analyzed throughout the test using the Medgraphics CPX/D metabolic cart (Medical Graphics Corporation, St. Paul, MN), calibrated prior to each test. Peak VO2 was accepted when two of the following three conditions were reached: 1) a respiratory exchange ratio (RER) >1.0, 2) heart rate within 5% of the age-predicted maximum, 3) the participant was exhausted and refused to carry on despite strong verbal encouragement [22].
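The two-of-three rule for accepting a test as maximal can be sketched in a few lines. This is only an illustration: the function name and arguments are hypothetical, the age-predicted maximum heart rate is passed in because the text does not state which prediction formula was used, and "within 5%" is interpreted here as reaching at least 95% of that value.

```python
def is_maximal_effort(rer, peak_hr, predicted_max_hr, exhausted):
    """Return True when at least two of the three stated criteria are met.

    rer              -- peak respiratory exchange ratio
    peak_hr          -- highest heart rate recorded during the test (beats/min)
    predicted_max_hr -- age-predicted maximum heart rate (formula not given in the text)
    exhausted        -- True if the participant stopped despite strong encouragement
    """
    criteria = [
        rer > 1.0,                           # criterion 1: RER > 1.0
        peak_hr >= 0.95 * predicted_max_hr,  # criterion 2 (assumed reading of "within 5%")
        bool(exhausted),                     # criterion 3: volitional exhaustion
    ]
    return sum(criteria) >= 2
```

A participant with an RER of 1.05 and a peak heart rate of 195 against a predicted maximum of 200 would be accepted under this reading even without the exhaustion criterion.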

Sample size calculation
Assuming peak VO2 is normally distributed within each age and sex group, the sample size was calculated using the standard deviation of the 100αth centile (SD_c100α) and the age- and sex-specific SD described by Healy [23]. To determine the age- and sex-specific mean and standard deviation (SD) of peak VO2 for the sample size calculation, we collected pilot data from 261 healthy children aged 9-16 years. Age groups were analysed in 2-year intervals, and the required sample size for each sex and age group was calculated to obtain the 97th centile with an error of ±3%. We assumed that children under 9 would be the same as those under 10 years of age, and therefore proposed recruiting an additional 43 boys and 43 girls aged 8 years to help recruitment of the younger years. The estimated total number of subjects required was 828 (Table 1).
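As a rough illustration of this kind of calculation, the sketch below uses the standard normal-theory result that a centile estimated as mean + z × SD has a standard error of SD × sqrt((1 + z²/2)/n). Whether this matches the exact expression and error definition the authors took from [23] is an assumption, and the function names are hypothetical.

```python
import math
from statistics import NormalDist


def centile_se(sd, n, centile):
    """Normal-theory standard error of an estimated centile (mean + z * SD)."""
    z = NormalDist().inv_cdf(centile)
    return sd * math.sqrt((1 + z ** 2 / 2) / n)


def n_for_centile_se(sd, target_se, centile):
    """Smallest n whose centile standard error does not exceed target_se."""
    z = NormalDist().inv_cdf(centile)
    return math.ceil((1 + z ** 2 / 2) * (sd / target_se) ** 2)
```

Extreme centiles such as the 97th need noticeably more subjects than the median for the same precision, which is why the calculation targets the extreme centile rather than the mean.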

Allometric scaling
Peak VO2 was allometrically scaled using the procedure described by Vanderburgh et al [24,25]. Peak VO2 and body weight were log-transformed. A log-linear regression model was constructed using log (peak VO2) as the dependent variable and log (body weight) as the independent variable. The interaction effects of age and sex were tested and found to significantly modulate the association between body mass and peak VO2, justifying the need for sex- and age-specific exponents. Therefore, allometric scaling was done separately for 4 different subgroups: (1) younger males (aged from 8 to 11.99 years); (2) older males (aged from 12 to 16.99 years); (3) younger females (aged from 8 to 11.99 years) and (4) older females (aged from 12 to 16.99 years). Outliers were identified using the least median of squares eliminator and removed from analyses [26]. Regression diagnostics were performed to ensure the models were appropriate [27]. The Shapiro-Wilk normality test was used to evaluate the normality of the residuals. Homoscedasticity was assessed by plotting the standardized residuals against the standardized predicted values. The resulting beta coefficients were used as the allometric exponents. Peak VO2 can then be allometrically scaled using the equation:

$$\text{allometrically scaled peak VO}_2 = \frac{\text{unscaled peak VO}_2}{\text{body mass}^{\text{exponent}}}$$

Pearson correlation analysis was used to examine the association of the scaled peak VO2 with body size (body mass and height) and age to verify the effectiveness of the allometric scaling approach for controlling for body size within the specific age groups.
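The log-linear fit and the scaling step can be sketched as follows. This is a minimal illustration on synthetic data, assuming NumPy and ordinary least squares; it omits the outlier elimination and residual diagnostics described above, and the function names and simulated values are hypothetical rather than taken from the study.

```python
import numpy as np


def allometric_exponent(mass_kg, peak_vo2):
    """Fit log(peak VO2) = log(a) + b * log(mass); return (b, a)."""
    slope, intercept = np.polyfit(np.log(mass_kg), np.log(peak_vo2), 1)
    return slope, np.exp(intercept)


def allometric_scale(peak_vo2, mass_kg, exponent):
    """Divide peak VO2 by body mass raised to the fitted exponent."""
    return np.asarray(peak_vo2) / np.asarray(mass_kg) ** exponent


# Synthetic subgroup (illustrative only): true exponent 0.67, log-normal error.
rng = np.random.default_rng(0)
mass = rng.uniform(25, 70, 200)
vo2 = 0.15 * mass ** 0.67 * np.exp(rng.normal(0, 0.1, 200))

b, a = allometric_exponent(mass, vo2)
scaled = allometric_scale(vo2, mass, b)

# On the log scale the scaled values are the regression residuals plus a
# constant, so their correlation with log(mass) is zero by construction.
r_log = np.corrcoef(np.log(mass), np.log(scaled))[0, 1]
```

In practice the fit would be run separately within each of the four age and sex subgroups, giving one exponent per subgroup.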

Development of z score equations
Given the necessity for different allometric exponents for body mass in the 4 different subgroups in this dataset, z scores for the four subgroups were developed to allow comparisons of peak VO2 across age and sex. A stepwise regression was used to evaluate the association between peak VO2 and body size and age. The linear (x), second-order (x²) and third-order (x³) effects of age, height and weight were tested, and the resulting regression equations were used to predict peak VO2. Subjects with a resulting standardised predicted VO2 greater than 3 or less than -3 were considered outliers and were excluded. As the variance of the residuals varied with body size, and the variation differed between sexes and age groups, age- and sex-specific models were established to predict the standard deviation from body size using the method suggested by Altman [28]. First, linear regression was used to examine the association between the absolute values of the residuals and body size. Since the residuals were normally distributed, their absolute values had a half-normal distribution. The mean of a half-normal distribution is σ√(2/π), where σ is the SD of the underlying normal distribution. The SD of the residuals can therefore be estimated as the product of the mean of the absolute residuals and √(π/2). It follows that the predicted values from the regression of the absolute residuals on body size, multiplied by √(π/2), are estimates of the SD of the signed residuals, and hence of peak VO2. As a result, two regression models were developed for each sex and age group, one for the prediction of peak VO2 and the other for the prediction of SD. Z scores were calculated using the following formula:

$$z = \frac{\text{observed peak VO}_2 - \text{predicted peak VO}_2}{\text{predicted SD}}$$

The calculated z scores were evaluated for departure from a normal distribution by visual assessment of the histogram and by using the Shapiro-Wilk statistic.
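The two-model procedure (one regression for the mean, one for the SD via absolute residuals) can be sketched as below. This is an illustration under simplifying assumptions: ordinary least squares with a single linear predictor stands in for the stepwise polynomial models, the data are synthetic, and all names are hypothetical.

```python
import numpy as np


def fit_mean_and_sd_models(x_mean, x_sd, y):
    """Fit y on x_mean, then |residuals| on x_sd; return both coefficient vectors."""
    n = len(y)
    A = np.column_stack([x_mean, np.ones(n)])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    abs_resid = np.abs(y - A @ beta)
    B = np.column_stack([x_sd, np.ones(n)])
    gamma, *_ = np.linalg.lstsq(B, abs_resid, rcond=None)
    return beta, gamma


def z_scores(x_mean, x_sd, y, beta, gamma):
    """z = (observed - predicted) / predicted SD."""
    n = len(y)
    predicted = np.column_stack([x_mean, np.ones(n)]) @ beta
    # Predicted mean absolute residual times sqrt(pi/2) estimates the SD.
    predicted_sd = (np.column_stack([x_sd, np.ones(n)]) @ gamma) * np.sqrt(np.pi / 2)
    return (y - predicted) / predicted_sd


# Synthetic data (illustrative only): spread of peak VO2 grows with height.
rng = np.random.default_rng(1)
height = rng.uniform(130, 175, 2000)
vo2 = 0.03 * height - 2.5 + rng.normal(0, 0.002 * height)

beta, gamma = fit_mean_and_sd_models(height, height, vo2)
z = z_scores(height, height, vo2, beta, gamma)
```

If the two models are adequate, the resulting z scores should have a mean near 0 and an SD near 1 across the whole range of body size.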
Pearson correlation analysis was used to examine the association of the calculated z scores with body size and age to verify the validity of the computed z scores in controlling for body size and age.

Statistical analysis
Student's t test, Mann-Whitney U test, and chi-square test were used for group comparisons of parametric, nonparametric, and categorical data, respectively. The interaction effects of age and gender, as well as pubertal stage and gender, on peak VO2 were assessed by two-way ANOVA. The level of significance was set at p<0.05. Bonferroni correction was used to adjust the significance level for multiple pairwise comparisons. All statistical analyses were performed using SPSS for Windows (version 21, IBM Corp., Armonk, NY, USA).
Percentile curves for absolute peak VO2 (expressed in L·min⁻¹) and allometrically scaled peak VO2 were constructed using the LMS method [29]. The LMS method, fitted by maximum penalized likelihood, was used to model the centiles. It estimates the measurement centiles in terms of three age- and sex-specific cubic spline curves: the L curve (the Box-Cox power that transforms the data to follow a normal distribution), the M curve (median), and the S curve (coefficient of variation).
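Once L, M and S are known for a given age and sex, converting between a measurement and its centile uses the standard LMS relations, C = M(1 + L·S·z)^(1/L) and z = ((y/M)^L − 1)/(L·S). The sketch below assumes these textbook formulas; it does not fit the smoothed curves themselves, and the function names are hypothetical.

```python
import math
from statistics import NormalDist


def lms_z(y, L, M, S):
    """z score of measurement y given LMS parameters for that age and sex."""
    if abs(L) < 1e-12:  # limiting case L -> 0 uses the log transform
        return math.log(y / M) / S
    return ((y / M) ** L - 1) / (L * S)


def lms_centile(p, L, M, S):
    """Measurement value at centile p (0 < p < 1) given LMS parameters."""
    z = NormalDist().inv_cdf(p)
    if abs(L) < 1e-12:
        return M * math.exp(S * z)
    return M * (1 + L * S * z) ** (1 / L)
```

The two functions are inverses of each other, so a value returned by `lms_centile` for centile p maps back to the z score of p under `lms_z`.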
Children's absolute peak VO2 was also compared to the peak VO2 predicted by Armstrong and Welsman's regression equations, which were generated from data on European children [12].

Results
A total of 1386 students were invited to join the study, of whom 281 (20%) refused to participate, 37 (3%) were excluded due to their medical history, 50 (4%) were unable to attend due to their parents' schedules, and 98 (7%) could not be contacted for scheduling. Of the 920 participants who underwent CPET, 68 (7%) failed to meet the criteria for a maximal effort and were excluded. Data from 852 participants (410 males, 442 females) were included in the final analysis. The physical characteristics of the subjects are provided in Table 2.

Absolute peak VO2
Absolute peak VO2 was significantly greater in older (F(1,848) = 814, P<0.001) and male (F(1,848) = 52, P<0.001) participants. A significant age × sex interaction was present (F(1,834) = 102, P<0.001), with subgroup analyses demonstrating an increase in peak VO2 with age in both males (F(1,408) = 582, P<0.001) and females (F(1,440) = 228, P<0.001) (Fig 1). Post-hoc analysis confirmed that the difference in peak VO2 between boys and girls became apparent from 12 years of age. Scatter plots of peak VO2 against age for boys and girls are presented in S1 and S2 Figs, respectively. Using the LMS method, smoothed percentile curves for unscaled peak VO2 were constructed for boys (Table 3 and Fig 2) and girls (Table 4 and Fig 3). The smoothed centiles were established with all standard errors within 9%.

Allometric scaling
The allometric exponents generated for the younger males (aged from 8 to 11.99 years) and older males (aged from 12 to 16.99 years) were 0.635 and 0.839, respectively. For the younger females (aged from 8 to 11.99 years) and older females (aged from 12 to 16.99 years), the exponents were 0.678 and 0.798. Regression diagnostics confirmed that the residuals for the log-linear regression using body mass were normally distributed and homoscedastic. Percentiles for the 4 subgroups are presented in Table 5, and the smoothed percentile curves for allometrically scaled peak VO2 of the 4 subgroups are presented in Fig 5.

Z score equations
The regression equations for predicted peak VO2 and SD for the four subgroups are displayed in Table 6. An automated Excel file (S1 File) has been developed to calculate the z score for peak VO2. The resulting z score for peak VO2 was normally distributed and had no residual correlation with age, weight or height. The agreement between the percentile ranks from z score and allometric scaling was high (ICC = 0.95). Pearson correlations of absolute, ratio-scaled and allometrically scaled peak VO2, and of the z score of peak VO2, with body mass, height and age within each subgroup are presented in Table 7. All correlations between the z score of peak VO2 and body mass, height, and age were close to zero (p>0.2 for all) in all subgroups. For allometrically scaled peak VO2, only the correlations with body mass were close to zero (p>0.7 for all) in all subgroups. Significant associations with height and age were apparent in males and in the younger females. Both absolute and ratio-scaled peak VO2 were significantly associated with body size and/or age in all subgroups.

Discussion
Our findings demonstrate that allometric scaling of peak VO2 for body mass was effective in removing the influence of body mass on peak VO2. The scaling exponent, however, differed by age and sex, and the scaled peak VO2 remained correlated with height and age within all age- and sex-specific groups except older girls. As a result, z score equations for different sex and age groups were developed, which were effective in removing the influence of body mass, height and age on peak VO2. These provide an effective metric for identifying children with low peak VO2.
Similar to previous reports, we show that absolute peak VO2 increases with age in both sexes [30][31][32][33]. A greater peak VO2 in boys compared to girls becomes significant from 12 to 13 years of age, with the difference gradually widening as age increases. This sex-related variability in peak VO2 likely relates in part to differences in body composition, with boys having a higher ratio of fat-free mass/stature² and a lower ratio of total body fat/stature² after adolescence [34]. Indeed, we show that boys had a significantly lower percentage of body fat (thus a higher percentage of fat-free mass) than girls. In addition, the lower peak VO2 in the girls in our study compared to boys after 12 years of age may be related to the onset of menarche at around 12 years in girls from Southern China [35], which results in increases in fat mass and a reduced rate of growth [36]. The absolute peak VO2 values presented here are considerably lower than the values predicted using Armstrong and Welsman's regression equations [12], which were generated from data on Caucasian children and adolescents. It is possible that because Southern Chinese children reach peak height velocity at an earlier age, they have less time for prepubertal growth [37] than Caucasian children and therefore show a developmentally divergent peak VO2. The developmental pattern of ratio-scaled peak VO2 in girls in the current study was similar to that of Northern Chinese girls, with peak VO2 declining with age [38]. However, the boys in our study showed a different developmental pattern of ratio-scaled peak VO2 from the Northern Chinese, with values remaining steady from 8 to 12 years and then gradually increasing. In Northern Chinese boys, ratio-scaled peak VO2 increased from 10 to 13 years and then remained steady [38]. Although ratio scaling remains the most popular method of expressing peak VO2 [12,13,39], ratio-scaled peak VO2 was negatively correlated with body mass and height.
When peak VO2 was allometrically scaled [24,25], Pearson correlation analyses confirmed that the allometric scaling effectively removed the influence of body mass. It should be noted that the exponents for our younger male and female subgroups were very close to the theoretical value of 0.67 [40], whereas greater exponents were generated in the older male and female subgroups. These findings suggest that the influence body mass has on peak VO2 differs by age. The developmental pattern for allometrically scaled peak VO2 differed from that of absolute peak VO2, with increases observed from 11 to 12 years of age in both sexes, and continued increases from 12 to 15 years of age in older males. Allometric scaling of peak VO2 is a preferred approach for the removal of body mass [41]; however, our data show that peak VO2 remains associated with height and age in males and in the younger females.
Table 6 notes: Height is in centimeters, weight is in kilograms, and age is in years. Regression equations for predicted peak VO2 = β1(Height²) + β2(Height) + β3(Weight) + β4(Age) + β5. Regression equations for predicted SD = [β6(Height) + β7(Age) + β8] × √(π/2). https://doi.org/10.1371/journal.pone.0213674.t006
Since 4 different subgroups with different scaling exponents for body mass were required, a z score calculation was developed to allow comparisons across different age and sex groups [42], which improves the ability to monitor changes in cardiopulmonary fitness longitudinally. There were no associations between the z score of peak VO2 and body mass, height, or age in any subgroup, suggesting that the z score calculation was effective in removing the influence of body size and age on peak VO2.
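Applying the Table 6 equation forms to a single child can be sketched as follows. The function is hypothetical, the z score is assumed to be (observed − predicted)/predicted SD, and the coefficients in the usage lines are placeholders chosen for easy arithmetic, not the published values, which must be read from Table 6 for the relevant subgroup.

```python
import math


def peak_vo2_z(observed, height_cm, weight_kg, age_yr, mean_coef, sd_coef):
    """z score from the two regression forms given in the Table 6 notes.

    mean_coef -- (b1, b2, b3, b4, b5) for predicted peak VO2
    sd_coef   -- (b6, b7, b8) for predicted SD
    """
    b1, b2, b3, b4, b5 = mean_coef
    b6, b7, b8 = sd_coef
    predicted = (b1 * height_cm ** 2 + b2 * height_cm
                 + b3 * weight_kg + b4 * age_yr + b5)
    predicted_sd = (b6 * height_cm + b7 * age_yr + b8) * math.sqrt(math.pi / 2)
    return (observed - predicted) / predicted_sd


# Placeholder coefficients (NOT from Table 6), for illustration only.
example_mean_coef = (0.0, 0.02, 0.01, 0.05, -2.0)
example_sd_coef = (0.0, 0.0, 0.2)
z_example = peak_vo2_z(1.8, 150, 40, 12, example_mean_coef, example_sd_coef)
```

With these placeholder values the predicted peak VO2 is 2.0, so an observed value of 1.8 gives a negative z score, flagging a result below the subgroup expectation for that height, weight and age.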
Cardiopulmonary fitness is one of the most important predictors of lower cardiovascular and metabolic health risk and a valuable clinical diagnostic and prognostic tool. Identifying children who have relatively low cardiopulmonary function is important given that improvements in cardiopulmonary function are possible with proper exercise training [43].
Both the allometric scaling and z score approaches likely support better decision making in identifying children with low cardiopulmonary fitness (e.g. with a percentile rank <10). It should be noted that the relatively low R-squared values (<0.5, Table 6) of the regression equations for predicted peak VO2 in older children (12-16.99 y) may reduce the reliability of the resultant z scores [44,45]. However, our further analysis found that the agreement between allometric scaling and z score was high. Because the z score is independent of body size and age, it allows for longitudinal tracking and may be a better choice for monitoring children and adolescents over time. Nevertheless, these are complex approaches to data treatment, and their use in the clinical setting may be limited until accessible data processing is made available. We therefore provide an Excel file for easy calculation of peak VO2 z scores. The development of mobile apps or online sites will be important in the future for easy access to pediatric normal values.

Study limitations
The sample size estimation was based on establishing a representative cohort with precise extreme centiles for each age and sex. The estimated sample size did not provide sufficient power for log-linear regression analysis to determine sex-specific exponents with high precision for each chronological year. Therefore, we formed younger (<12 years) and older (≥12 years) groups to ensure the precision of the exponents. We did not include lean body mass in the current study because we only had bioelectrical impedance data on a limited number of our participants. Although scaling for lean body mass is preferred [41], many clinical testing settings like ours do not have easy access to more valid measures of lean body mass, such as dual-energy x-ray absorptiometry or magnetic resonance imaging. In addition, physical activity level was not measured in this study, so its impact on peak VO2 could not be examined. In this study, the cardiopulmonary exercise test was performed on a treadmill, and the reference values would not be applicable if the test were performed using other ergometers such as a cycle ergometer.

Conclusions
In conclusion, we have compared different body size scaling approaches for peak VO2 in Southern Chinese girls and boys and have shown that these differentially affect the interpretation of peak VO2 with changes in age and sex. We recommend the use of z scores for identifying Southern Chinese children with poor cardiopulmonary fitness, and provide an accessible data processing tool for this purpose.