Reliability and Validity of a 20-s Alternative to the Wingate Anaerobic Test in Team Sport Male Athletes

The intent of this study was to evaluate the relative and absolute reliability of a 20-s anaerobic test (WAnT20) versus the 30-s Wingate anaerobic test (WAnT30), and to verify how far the various indices of the WAnT30 could be predicted from WAnT20 data in male athletes. The participants were Exercise Science majors (age: 21.5±1.6 yrs, stature: 1.83±0.08 m, body mass: 81.2±10.9 kg) who participated regularly in team sports. In Phase I, 41 participants performed duplicate WAnT20 and WAnT30 tests to assess reliability. In Phase II, 31 participants performed one trial each of the WAnT20 and WAnT30 to determine the ability of the WAnT20 to predict components of the WAnT30. In Phase III, 31 participants were used to cross-validate the prediction equations developed in Phase II. Respective intra-class correlation coefficients (ICC) for peak power output (PPO) (ICC = 0.98 and 0.95) and mean power output (MPO) (ICC = 0.98 and 0.90) did not differ significantly between WAnT20 and WAnT30. ICCs for minimal power output (POmin) and fatigue index (FI) were poor for both tests (range 0.53 to 0.76). Standard errors of measurement (SEM) for PPO and MPO were less than their smallest worthwhile changes (SWC) in both tests; however, POmin and FI values were “marginal,” with SEM values greater than their respective SWCs in both tests. Stepwise regression analysis showed that MPO had the highest coefficient of determination (R² = 0.97), with POmin and FI considerably lower (R² = 0.71 and 0.41, respectively). Cross-validation showed insignificant bias, with limits of agreement of 0.99 ×/÷ 1.04 (ratio scale), 6.5±92.7 W, and 1.6±9.8% between measured and predicted MPO, POmin, and FI, respectively. The WAnT20 offers a reliable and valid test of leg anaerobic power in male athletes and could replace the classic WAnT30.


Introduction
Classical analyses of human physical performance suggested three primary energy sources: an anaerobic power (largely phosphagen based, and typically depleted within 2-4 s), an anaerobic capacity (limited largely by lactate accumulation, and exhausted within about 45 s), and an aerobic power capable of sustaining effort for much longer periods [1,2]. The 5 s and 30 s power output measurements of the standard 30-s Wingate Anaerobic Test (WAnT30) were designed to examine the first two of these energy reserves [3]. The WAnT30 is both a reliable and a valid test [4][5][6], and is the most popular method of evaluating anaerobic ability, although there are other potential measures such as short sprints [7], continuous vertical jumping [8], and elliptical all-out tests [9]. However, during the 30 s maximal effort of the WAnT30, the accumulation of [H+] as a by-product of anaerobic glycolysis results in a drop in blood pH [10]. The increased acidity impairs the activity of the enzymes involved in energy release and reduces maximal muscle fibre recruitment [11]. Further, the acute increase in blood glucose usage can result in a temporary hypoglycemia [12]. Undesirable side effects of the 30 s test thus include headache, vomiting, dizziness, and nausea [13]. In athletic applications where frequent assessments are required, awareness of these side effects may lead to less than maximal efforts during repeated testing, with a negative impact upon the reliability and validity of the test [14]. A previous study has shown that a 10 s reduction of test duration reduces physical discomfort in more than 90% of participants [14], leading to the view that a shortening of the test protocol might be helpful in some athletic and clinical applications.
The aerobic contribution is a further argument in favour of shortening the test. Although the objective of the WAnT30 is to assess peak anaerobic power and anaerobic capacity, it is recognized that over the 30 s of maximal effort, some ATP regeneration occurs through oxidative phosphorylation [15]. The magnitude of this aerobic contribution has been variously estimated at 9-19% [16], 28% [3], 40% [17], or even 44% [18], although one report [19] found that the aerobic contribution was only 16% even during the final 5 s of the test.
A previous study [14] found no difference in peak power output (PPO) between 20 s and 30 s tests, since PPO is usually attained in the first 5 s of effort.
Furthermore, it appears that the mean power output (MPO), the minimum power output (POmin), and the fatigue index (FI) as determined by a WAnT30 can be predicted accurately from a WAnT20 [14]. The studies cited all emphasized the decrease in physical discomfort when using the WAnT20, but observations were limited to non-athletes, and the reliability [1] of the WAnT20 relative to the WAnT30 was not formally established. Moreover, the prediction algorithms developed in these earlier reports were not validated by subsequent, independent studies. Therefore, the objectives of the present study were to evaluate the reliability of the WAnT20 relative to the WAnT30, and to verify how far the data obtained from the WAnT20 could be used to predict the traditional WAnT30 measures in male team-sport athletes.

Participants
The human subject committee of the local university (i.e., High Institute of Sport and Physical Education of Ksar Said, Tunis, Tunisia) approved the study in accordance with the 1975 Declaration of Helsinki. Eighty-one competitive, male team-sport athletes (age: 21.5±1.6 yrs, stature: 1.83±0.08 m, body mass: 81.2±10.9 kg) were recruited. All were Exercise Science and Physical Education students, participating in regular training and competitive sports team schedules (soccer, basketball, rugby, and handball); they had an average of 6.3±0.9 yrs of training. All participants were fully informed of the nature of the study and provided written informed consent in accordance with accepted policy statements regarding the use of human participants.

Procedures
The standard 30-s Wingate Anaerobic Test (WAnT30) and WAnT20 tests were both performed on a friction belt cycle ergometer (Monark 894 E Peak Bike, Weight Ergometer, Vansbro, Sweden) with a basket weight loading system interfaced to a microcomputer. Monark Anaerobic Test Software (version 2.22) was used to record second-by-second power output throughout the test. The external loading was set at 7.5% of the individual's body mass [4,14]. Participants were familiarized and habituated to the test protocol on two separate occasions prior to collection of definitive data; they performed high-velocity sprint exercises interspersed with 3 min of rest, so as to minimize continued test learning during the definitive experiment. Optimal saddle and handlebar positions were determined for each subject prior to their first test, and the same placements were used in subsequent tests. Toe clips were used throughout.
Definitive tests were preceded by a standardized warm-up [20] that comprised three 30-s periods of active rest (zero-resistance pedalling at 60 rpm) alternating with three 30-s bouts of exercise at increasing external resistance (25, 50, and 75% of the test resistance, respectively). Participants were instructed to pedal at maximal effort throughout the definitive test. Both WAnT30 and WAnT20 tests began from a standstill position, with full application of the predetermined resistance. Participants were allowed to stand on the pedals during the first seconds of the tests, and vigorous verbal encouragement was given throughout. An active recovery period of three min followed each test. Participants maintained their normal intake of food and fluids, but they abstained from physical exercise and consumption of alcohol and caffeine for 1 day, and ate no food for two hours before testing.
Phase I examined the respective reliability of the WAnT20 and WAnT30 protocols. Forty-one athletes performed each test twice, on separate days, at the same time of day, and in a randomized order. PPO (highest 5-s output), POmin (lowest 5-s output), MPO (average output throughout the test), and FI (percentage drop in power output from PPO to POmin) were determined for each of the two protocols [4].
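The four indices defined above can be illustrated with a minimal sketch operating on a second-by-second power trace. The function name `wingate_indices`, the use of rolling 5-s averages, and the example trace are assumptions made for illustration; the ergometer software may segment the trace differently.

```python
from statistics import mean


def wingate_indices(power_w, window_s=5):
    """Return PPO, POmin, MPO (W) and FI (%) from a second-by-second power trace."""
    if len(power_w) < window_s:
        raise ValueError("power trace is shorter than the averaging window")
    # Assumption: 5-s outputs are rolling averages over consecutive seconds.
    windows = [
        mean(power_w[i:i + window_s])
        for i in range(len(power_w) - window_s + 1)
    ]
    ppo = max(windows)      # highest 5-s output
    po_min = min(windows)   # lowest 5-s output
    mpo = mean(power_w)     # average output over the whole test
    fi = (ppo - po_min) / ppo * 100.0  # percentage drop from PPO to POmin
    return {"PPO": ppo, "POmin": po_min, "MPO": mpo, "FI": fi}


# Hypothetical 20-s trace (W), for illustration only.
example_trace = [1000] * 5 + [800] * 5 + [600] * 5 + [500] * 5
example_indices = wingate_indices(example_trace)
```

Applying the same function to the first 20 values of a 30-s trace would give the predictor values of the kind used in Phase II.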
In Phase II, 31 athletes performed a single WAnT30. The standard WAnT30 indices (as detailed above) were used as criterion indices, while the corresponding data from the first 20 s of the same test were used as predictors.
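The criterion/predictor relationship can be sketched with an ordinary least-squares fit of one criterion index on one predictor. This is a simplified single-predictor illustration, not the stepwise procedure used in the study; the function names and the data values are hypothetical.

```python
def fit_simple_regression(x, y):
    """Least-squares fit of y = intercept + slope * x; returns (slope, intercept, R^2)."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need at least three paired observations")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Coefficient of determination: 1 - residual SS / total SS.
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r_squared = 1.0 - ss_res / ss_tot
    return slope, intercept, r_squared


def predict(x_value, slope, intercept):
    """Predicted criterion value for a new predictor value."""
    return intercept + slope * x_value


# Hypothetical MPO values (W): first 20 s of the test vs. full 30-s test.
mpo_first20 = [700.0, 650.0, 720.0, 610.0, 680.0]
mpo_30 = [640.0, 598.0, 655.0, 560.0, 622.0]
slope, intercept, r_squared = fit_simple_regression(mpo_first20, mpo_30)
```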
In Phase III, 31 athletes performed both WAnT30 (criterion) and WAnT20 (predictor) tests. The agreement between the indices measured in the WAnT30 and those predicted from the equations developed in Phase II was then quantified, using the 95% limits of agreement (LoA) method.

Statistical Analyses
Data analyses were performed using SPSS software (version 19.0 for Windows) and MedCalc version 11.1.1.0. Means and standard deviations (SD) were calculated for each variable. The normality of appropriate data sets was confirmed by applying the Anderson-Darling test of normality [1], allowing hypotheses to be tested by parametric statistical techniques. A maximum a priori α of 0.05 was applied throughout.
In Phase I, we evaluated the hypothesis that the sample means of test and retest values did not differ, using a paired sample t-test. To help protect against type II errors, an estimate of the effect size (dz) was made to determine if differences between trials were trivial [21]. Reliability was assessed by calculating intra-class correlation coefficients (ICC), model 3,1 [22]. ICCs >0.90 were considered as high, 0.80 to 0.90 as moderate, and <0.80 as low. The absolute reliability of WAnT20 and WAnT30 values was expressed as the standard error of measurement (SEM) [23]. To complement the SEM, the smallest worthwhile change (SWC) was determined by rearrangement of Cohen's d effect size calculation, where the smallest worthwhile effect (0.2) is multiplied by the between-subject SD [24]. By comparing SWC with SEM, test sensitivity was determined, using the thresholds proposed by Lexell and Downham [24]. When SEM was < SWC, the test's capacity to detect change was considered "good"; when SEM was equal to SWC it was considered "satisfactory"; and when SEM was > SWC the test was rated as "marginal". The square root of the mean square error (MSE) was used to calculate SEM (SEM = √MSE) [10,25]. The SEM% (SEM/mean × 100, using the mean of all measurements from both sessions) was also calculated in order to compare the WAnT20 and WAnT30 indices. Before reporting the relevant data in the units of measurement, heteroscedasticity was assessed. Since heteroscedasticity was found in the present data, a log transformation was applied, and an antilog (back transformation) was performed to give values that could be interpreted in relation to the original scale [26].
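The reliability statistics described above can be sketched for a two-trial design as follows. The two-way ANOVA decomposition is the standard route to ICC(3,1) and to SEM as the square root of the MSE. Taking the between-subject SD as the SD of each subject's mean across trials is an assumption, as are the function name `reliability_summary` and the example data.

```python
from math import isclose, sqrt
from statistics import mean, stdev


def reliability_summary(trial_1, trial_2):
    """ICC(3,1), SEM, SEM%, SWC and sensitivity rating for test-retest data."""
    n, k = len(trial_1), 2
    rows = list(zip(trial_1, trial_2))
    grand = mean(list(trial_1) + list(trial_2))
    subject_means = [mean(r) for r in rows]
    trial_means = [mean(trial_1), mean(trial_2)]
    # Two-way ANOVA sums of squares (subjects x trials).
    ss_total = sum((v - grand) ** 2 for r in rows for v in r)
    ss_subjects = k * sum((m - grand) ** 2 for m in subject_means)
    ss_trials = n * sum((m - grand) ** 2 for m in trial_means)
    ss_error = ss_total - ss_subjects - ss_trials
    ms_subjects = ss_subjects / (n - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    icc = (ms_subjects - ms_error) / (ms_subjects + (k - 1) * ms_error)
    sem = sqrt(ms_error)                 # SEM = sqrt(MSE)
    swc = 0.2 * stdev(subject_means)     # assumption: SD of subject means
    if isclose(sem, swc):
        rating = "satisfactory"
    elif sem < swc:
        rating = "good"
    else:
        rating = "marginal"
    return {"ICC": icc, "SEM": sem, "SEM%": sem / grand * 100.0,
            "SWC": swc, "rating": rating}


# Hypothetical test and retest PPO values (W), for illustration only.
ppo_test = [900, 800, 1000, 700]
ppo_retest = [910, 790, 1010, 690]
summary = reliability_summary(ppo_test, ppo_retest)
```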
In Phase II, a stepwise regression equation was developed to predict MPO, POmin, and FI from data collected during the first 20 s of the WAnT30. In Phase III, Bland-Altman plots were used to determine the goodness of fit for the developed prediction equations [27].
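The 95% limits of agreement underlying a Bland-Altman analysis can be sketched as the mean difference plus or minus 1.96 SD of the differences, with a log-transformed variant for heteroscedastic data. The function names are hypothetical, and natural logarithms are assumed for the ratio version.

```python
from math import exp, log
from statistics import mean, stdev


def limits_of_agreement(measured, predicted):
    """Return (bias, half_width) so that the 95% LoA are bias +/- half_width."""
    diffs = [m - p for m, p in zip(measured, predicted)]
    bias = mean(diffs)
    half_width = 1.96 * stdev(diffs)
    return bias, half_width


def ratio_limits_of_agreement(measured, predicted):
    """LoA on log-transformed data, back-transformed to the ratio scale.

    Returns (ratio_bias, ratio_agreement); the 95% ratio limits are
    ratio_bias / ratio_agreement to ratio_bias * ratio_agreement.
    """
    log_bias, log_half_width = limits_of_agreement(
        [log(m) for m in measured], [log(p) for p in predicted]
    )
    return exp(log_bias), exp(log_half_width)


# Hypothetical measured and predicted values (W), for illustration only.
measured_w = [100.0, 110.0, 120.0]
predicted_w = [98.0, 112.0, 118.0]
bias_w, half_width_w = limits_of_agreement(measured_w, predicted_w)
```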

Results
Summary results for WAnT20 and WAnT30 tests and retests are shown in Tables 1 and 2. Residual data for WAnT20 and WAnT30 test and retest were normally distributed (Anderson-Darling p = 0.13-0.8), with no significant differences between test and retest outcomes for WAnT20 or WAnT30. The relative and absolute reliability of POmin and FI were poor for both test durations; the SEMs for these two measures were larger than their respective SWCs, indicating that both were of marginal value (Tables 1 and 2). However, the PPO and MPO values satisfied the ICC (3,1) criterion of high relative reliability, and this was confirmed by SWC values that were larger than their SEM counterparts (Tables 1 and 2).
Table 3 presents stepwise regression data. The MPO accounted for the greatest coefficient of determination (R² = 0.98). Table 4 indicates that residual errors between measured and estimated MPO, POmin, and FI were normally distributed, and that the mean biases were not statistically significant. The raw MPO data showed evidence of heteroscedasticity, with a positive coefficient (r = 0.24; p = 0.20). Data were therefore transformed into natural logarithms. The dependent t-test performed between the mean log transformed indices for measured and predicted MPOs showed no significant systematic bias (p = 0.08; dz = 0.12). Residual errors were normally distributed, and the 95% ratio-limits of agreement were −0.004±0.017.

Discussion
The purpose of the present study was to determine the relative and absolute reliability as well as the validity of the WAnT20 when compared to the standard WAnT30. Our results demonstrated that the WAnT20 is a reliable tool for the evaluation of the anaerobic performance of the legs in male team sport athletes.
Furthermore, it appears that, if desired, the traditional WAnT30 indices can be predicted accurately using data collected during the WAnT20.

Relative and Absolute Reliabilities
In this study we analysed the respective test-retest reliabilities of the WAnT20 and WAnT30 by complementary indices of relative and absolute reliability. Relative reliability is indicated by the ICC [22]. The ICCs for PPO and MPO were very high for both protocols, although tending to be slightly higher for 20 s than for 30 s tests. In contrast, the ICCs for FI and POmin fell below the minimal standard of acceptability for reliability (Tables 1 and 2). The ICC cannot be used as the sole statistical measure of reliability, since it is affected by sample heterogeneity [27]. Consequently, we determined the SEM as a measure of absolute reliability [12]. Retest reliability and measurement errors were comparable between the two test protocols. However, the FI and POmin for both protocols also showed larger coefficients of variation than the MPO and PPO. We conclude that with either protocol, the most sensitive measures for evaluating real change are the MPO and PPO, and that the reliability of the 20 s test is at least as good as the traditional 30 s protocol for these two indices.
We also examined the likelihood that the true values of estimated differences in test outcomes would be substantial (i.e., larger than the SWC). Inspection of Tables 1 and 2 shows that the SWCs for PPO and MPO for both test protocols were greater than their SEMs, indicating that these indices have a good ability to detect real changes in anaerobic performance of the legs in team athletes. In contrast, the data for FI and POmin had SEMs greater than their SWCs, calling into question their use in assessing the anaerobic performance of team athletes. Oliver [25] has suggested that the mathematical procedures involved in calculating fatigue levels can influence their reliability.

Prediction of the WAnT30 Indices from Data Collected during the First 20 s of the Test
The results from the second phase of our study show that the traditional WAnT30 indices can, if desired, be predicted accurately based on data collected during the first 20 seconds of the same test (Table 3). In our study, all participants achieved PPO within the first 5-10 s of the WAnT30. It seems that if PPO is the only variable of interest, then the WAnT30 could easily be shortened to 10 seconds. However, in order for MPO (R² = 0.97) and POmin (R² = 0.71) to be predicted effectively, the test must continue for 20 seconds. These results agree with the observations of Stickley et al [28] and Laurent et al [10], who found that MPO and POmin could be predicted from the first 20 seconds of a WAnT30 in female and male college students, respectively. Unlike these two studies, the present study showed a limited ability to predict FI (R² = 0.41), bringing into question the value of the FI as a measure of relative performance decline during all-out effort. The development of an anaerobic protocol even shorter than 20 seconds would further reduce detrimental side effects. However, regression equations based on only the first 15 seconds of the WAnT30 were less effective in predicting MPO (R² = 0.945), POmin (R² = 0.677), and FI (R² = 0.345). Stickley et al [28] had similar findings in female college students, with coefficients of determination for MPO, POmin, and FI of 0.965, 0.796, and 0.548, respectively, based upon 15 s tests.

Validity of the WAnT20 versus the WAnT30
In Phase III of this study, participants tended to achieve a slightly greater PPO during WAnT20 (924±165 W) than during WAnT30 (916±134 W), although this difference was not statistically significant. Significantly greater values for MPO, POmin and FI for WAnT20 (675±118 W, 484±100 W, and 49.7±7.1%, [28], who found that these indices could be predicted by a WAnT20 protocol in both female and male collegiate students. Hachana et al [23] were also able to produce simple regression equations to estimate WAnT30 from a WAnT15 test. The prediction algorithms developed in these earlier reports were not validated by subsequent independent studies [29]. However, the accuracy of our prediction equations (Table 3) was evaluated by the Bland and Altman method [27]; POmin and FI data were homoscedastic (Table 4), i.e., the calculated limits of agreement remain constant throughout the range of measurement and can therefore be accepted [27]. Indeed, the differences between measured and predicted POmin and FI for male physical education students would be expected to lie within the limits of 6.5±92.8 W and 1.6±9.8%, respectively. Heteroscedasticity occurs when the random error in data increases as the measured values increase [30]; in such circumstances, it is necessary to transform the original test data into natural logarithms and then repeat the limits of agreement tests [27]. Nevill and Atkinson [31] suggested that if the coefficient of correlation between absolute residual errors and the individual means was positive, but not necessarily statistically significant, it was also desirable to transform raw data into natural logarithms and recalculate the limits of agreement in order to reduce heteroscedasticity. Our MPO raw data showed some evidence of heteroscedasticity (Table 4), and accordingly we transformed these data into natural logarithms for analysis.
Recalculating antilogs resulted in limits of agreement of −0.004±0.017, a mean bias on the ratio scale of 0.99 (exponential of −0.004) and an agreement of ×/÷1.04 (exponential of 0.017); thus, 95% of the ratios for the sample (log transformed test score divided by log transformed retest score) should lie between the values of 0.95 (0.99 ÷ 1.04) and 1.03 (0.99 × 1.04). Assuming the bias for estimating MPO of 0.004 to be negligible, the predicted and measured WAnT30 MPO would differ due to measurement error by no more than 1.7%. To put this limit of agreement into a practical context, if a subject presented with a WAnT20 predicted MPO of 500 W, the worst case scenario is that this athlete had an MPO as low as 500 × 0.95 = 475 W or as high as 500 × 1.03 = 515 W.
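The arithmetic of this practical example can be reproduced with a short sketch. The ratio bias (0.99) and agreement ratio (1.04) are the values reported above; the function name `ratio_interval` is hypothetical, and the ratio limits are rounded to two decimals as in the text.

```python
def ratio_interval(predicted_w, ratio_bias=0.99, ratio_agreement=1.04):
    """Return the 95% ratio limits and the corresponding power range (W)."""
    lower_ratio = round(ratio_bias / ratio_agreement, 2)  # bias divided by agreement
    upper_ratio = round(ratio_bias * ratio_agreement, 2)  # bias times agreement
    lower_w = round(predicted_w * lower_ratio)
    upper_w = round(predicted_w * upper_ratio)
    return lower_ratio, upper_ratio, lower_w, upper_w


# A WAnT20-predicted MPO of 500 W, as in the worked example.
lower_ratio, upper_ratio, lower_w, upper_w = ratio_interval(500)
```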

Limitations
Our findings have been limited to a group of young men engaged in team sports. Further data are needed to confirm that a 20 s protocol is appropriate for assessing the anaerobic performance of those engaged in other types of sport, at other levels of training, in different age groups, and particularly in female participants. The effects of habituation and test learning on a 20 s test are also an important area for future enquiry, as are more quantitative evaluations of the extent of symptoms with 20 and 30 s tests.

Conclusion
In summary, our study demonstrates that the WAnT20 is a reliable tool for measuring the anaerobic performance of the leg muscles in young men trained for team sports, and that, if desired, the data can be used to predict traditional Wingate test indices. Unlike previous studies, we formally established the reliability of the WAnT20 and cross-validated its prediction of WAnT30 indices. Therefore, WAnT20 data can be used by coaches, clinicians, and athletes as reliable and good predictors of the standard WAnT30 parameters.