Abstract
Shorter distance events in track and field are replete with folk tales about which lane assignments on the track are advantageous. Estimating the causal effect of lane assignments on race times is difficult because lane assignments are typically non-random. To estimate these effects I exploit a random assignment rule for the first round of races in short distance events. Using twenty years of data from the IAAF World Championships and U20 World Championships, I find no evidence of lane advantages in the 100m. Contrary to popular belief, the data suggest that outside lanes in the 200m and 400m produce faster race times. In the 800m, which is unique in having a lane break, there is some weak evidence that outside lanes produce slower race times, possibly reflecting the advantage that runners in inside lanes have an established position on the track at the lane break. Given that these results do not support common convictions on lane advantages, they also serve as an interesting case study on false beliefs.
Citation: Munro D (2022) Are there lane advantages in track and field? PLoS ONE 17(8): e0271670. https://doi.org/10.1371/journal.pone.0271670
Editor: Roy Cerqueti, Sapienza University of Rome, ITALY
Received: November 10, 2021; Accepted: July 5, 2022; Published: August 3, 2022
Copyright: © 2022 David Munro. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: All data is publicly available and can be accessed via: https://www.worldathletics.org/competitions A full replication package is included in my submission materials and, in addition, can be located here: https://github.com/dmunro-git/Lane-Advantages.
Funding: The author received no specific funding for this work.
Competing interests: The authors have declared that no competing interests exist.
Introduction
In shorter distance track and field events one frequently encounters tales about lane advantages, that is, which lane assignments on the track produce the fastest event times. This track and field folklore is often heard from coaches and teammates; however, it is also codified in competition rules and appears in the popular press (e.g. [1, 2]). These beliefs are held at the highest levels of the sport. Following his bronze medal win in the 100m at the 2020 Olympics, Andre De Grasse noted: “I knew it was going to be a tough one after I drew lane nine. I didn’t have a great semifinal and I knew I had to come out and try and execute as best as I can” [3]. In the context of this paper, readers may find it interesting to note that De Grasse’s race in lane nine was his personal record.
Common narratives claim that in races with corners, running in the outside lanes (typically lanes 7 and 8, or 7 through 9) is disadvantageous as the runner cannot see any of their competitors, and that the very inside lanes (typically 1 and 2) are also undesirable as they have the tightest corners. Therefore, the middle lanes of the track are deemed the most desirable. In support of the belief that inside corners are slower, researchers examining the biomechanics of running find evidence that tighter corners do in fact slow runners down. Tighter corners both reduce running speeds (e.g. [4, 5]) and lower foot force production [6]. However, no such empirical evidence exists to support the claim that one’s inability to see competitors when running in an outside lane creates a disadvantage. While beliefs about lane advantages are commonly connected to races with corners, the De Grasse quote above highlights that these beliefs persist in events run on straightaways. There is also no empirical work examining the existence of lane advantages in straightaway races. While assessing the impact of seeing competitors, and its ultimate effect on race times, is clearly a relevant question in the context of track and field performance, it also relates more broadly to questions about the performance effects of motivational or psychological factors. For example, there is some evidence [7] that sports teams losing games at halftime end up winning the game more frequently (though [8] find opposing evidence), that professional golfers respond differently to the possibility of losses [9], and, in the non-sports world, that incentives framed as losses (as opposed to gains) can increase worker productivity [10]. This paper contributes to this literature by analyzing these motivational or psychological effects in a track and field context.
Estimating the causal impact of these lane advantages is a difficult empirical task as lane assignments are typically non-random and instead are a function of seed times or race times in prior elimination rounds of the event. Lanes deemed as advantageous in the folklore are assigned to runners with superior seed times or qualifying times. This endogenous assignment to treatment (lanes) prevents a causal interpretation of any differences observed in race times by lane.
Besides the biomechanical evidence, other researchers have approached this question in various ways. Using a mathematical model of track geometry, [11] exploit different track designs and find tighter corners (smaller radii) produce slower times. [12] examines the effects that lane assignments have on world records, but does not control for endogenous assignment to lanes. [6] examines the effects of lane assignments on placings in races, and only finds statistical differences in events with endogenous assignment to lanes.
To overcome the issue of endogenous assignment to lanes and obtain a causal estimate of the effect of lane assignments on race times, I exploit a random assignment rule used in the first round of major track meets. Using twenty years of data from the IAAF, I estimate these effects in the 100m, 200m, 400m, and 800m. Beginning with the 100m, the data suggests that lane assignments have no effect on race times. This null effect is precisely measured and, given the level of statistical power in the analysis, if true lane advantages do exist in the 100m, the precise null results suggest they must be quite small. In the 200m, there is robust evidence that outside lanes on the track produce the fastest race times. Further, average race times appear to be roughly monotonically decreasing with lane number. This is consistent with the evidence on the biomechanics of running, but inconsistent with the view that outside lanes are undesirable. The common belief in the track and field folklore is that the middle lanes (often 3–6) are the most desirable. Thus, these estimates suggest that these beliefs are incorrect. To give a brief sense of magnitude, depending on the estimation strategy, I find that lane 8 produces, on average, race times which are between 0.084 and 0.178 seconds faster than lane 2. While this is small in absolute magnitude, it could amount to important differences in race results given the standard deviation of race times in the data is less than 1 second.
Consistent with the results in the 200m, outside lanes in the 400m also appear to produce faster race times on average, though these results are somewhat weaker statistically than those from the 200m. It is important to emphasize that as average race times increase so does the dispersion of race times, so statistical power becomes more of an issue in the 400m and 800m. In an alternative estimation approach which pools runners to increase statistical power, there is stronger statistical evidence that outside lanes in the 400m are faster on average, and the magnitude of the effects is similar to that found in the 200m. Finally, in the 800m, there is some mixed evidence that outside lanes produce slower race times. The 800m is unique in this collection of events in its use of a lane break. Thus, the result that outside lanes produce slower times is consistent with the notion that inside lanes are advantageous in the 800m as those runners have an established position on the inside of the track at the lane break. Results from all these events are generally consistent across various statistical models.
These baseline results reflect the net impact of lane assignments, where both the tighter corners (biomechanical effects) and the motivational or psychological effects related to seeing competitors could be simultaneously impacting race times. The evidence from the 200m highlights that if the motivational/psychological effects do slow runners down in outside lanes, they are dominated by the tight corners effect. Leveraging the fact that being in the outermost lane on the track (where runners see no other runners for some portion of the race in the 200m and 400m) is not perfectly correlated with lane number, I also estimate the marginal effect of being in the outermost lane. In the 200m, there is evidence that, all else equal, being in the outermost lane slows runners down. This suggests that race times may be influenced to some degree by motivational or psychological factors related to seeing competitors.
I end the paper with some discussion of the implications of these results for race rules, and address why common beliefs about lane advantages are not supported by the data. Finally, I also highlight other events and competitions in which this approach to estimating the effects of lane assignments could be implemented.
Data and empirical strategy
The data come from the IAAF World Championships and U20 World Championships from 2000 to 2019 and were accessed from [13]. Prior to 2000, World Championship data did not include reaction times for the 100m through 400m, and data on season’s bests and personal bests, which are important regressors below, become very sparse. As a result, I focus on post-2000 data. Data were collected for Men’s and Women’s 100m, 200m, 400m, and 800m. In aggregate, this amounts to roughly 8000 individual race times for these events over this time period. A replication package for the analyses conducted in this paper can be found at [14].
Causal inference framework
Estimating the causal effect of lane assignments involves estimating how a runner’s performance would have changed if they had run their race in a different lane. Denote the observed race time of runner i in heat j as Yi,j. There are typically nine lanes on the track, and so possibly nine treatment statuses, but as a simple illustrative example, suppose we are interested in measuring the causal effect of running in lane 8. One lane must be chosen as a reference point to compare all other lanes against, and as is discussed below, I choose lane 2 for this. Denote T8,i,j as a binary indicator variable denoting assignment to lane 8: T8,i,j ∈ {0, 1}. The observed race time can be written in terms of potential outcomes:
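$$Y_{i,j} = \begin{cases} Y_{1,i,j} & \text{if } T_{1,i,j} = 1 \\ Y_{3,i,j} & \text{if } T_{3,i,j} = 1 \\ \;\;\vdots & \\ Y_{9,i,j} & \text{if } T_{9,i,j} = 1 \\ Y_{2,i,j} & \text{if } T_{k,i,j} = 0 \text{ for all } k \neq 2 \end{cases} \tag{1}$$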
Where the last condition is when all other indicator variables are zero, and the runner is in lane 2. In the lane 8 example,
$$Y_{i,j} = Y_{2,i,j} + \left(Y_{8,i,j} - Y_{2,i,j}\right) T_{8,i,j} \tag{2}$$
(Y8,i,j − Y2,i,j) is the causal effect of running in lane 8 (relative to lane 2). The fundamental empirical challenge in estimating the true causal effect of lane assignments on race times is that, for most races, the assignment to lanes is conditional on a runner’s ability. To understand this issue in this example, the difference in average race times between lanes 2 and 8 can be written as:
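$$\underbrace{E[Y_{i,j} \mid T_{8,i,j} = 1] - E[Y_{i,j} \mid T_{8,i,j} = 0]}_{\text{observed difference in average race times}} = \underbrace{E[Y_{8,i,j} - Y_{2,i,j} \mid T_{8,i,j} = 1]}_{\text{average effect of lane 8 on runners in lane 8}} + \underbrace{E[Y_{2,i,j} \mid T_{8,i,j} = 1] - E[Y_{2,i,j} \mid T_{8,i,j} = 0]}_{\text{selection bias}} \tag{3}$$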
The selection bias term captures the difference in average Y2,i,j between those who were assigned to lane 8 and those who were assigned to lane 2. In races where assignment to lanes is not random (e.g. runners are assigned to lanes conditional on ability), the selection bias term will not be equal to zero. With random assignment, runner ability is independent of treatment status, and thus the selection bias term in Eq (3) is zero. For more discussion on this see [15].
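The role of the selection bias term can be illustrated with a small simulation. This is only a sketch under assumed values: the lane 8 effect of -0.10 seconds, the distribution of potential times, and the non-random assignment rule are all hypothetical and are not taken from the data.

```python
import random
from statistics import mean

random.seed(0)
TRUE_EFFECT = -0.10  # hypothetical lane 8 effect in seconds (an assumption)

def simulate(assign_by_ability, n=100_000):
    """Return the raw difference in mean observed times, lane 8 minus lane 2."""
    lane8, lane2 = [], []
    for _ in range(n):
        y2 = random.gauss(21.5, 0.7)   # potential time if run in lane 2
        y8 = y2 + TRUE_EFFECT          # potential time if run in lane 8
        if assign_by_ability:
            # Hypothetical non-random rule: slower runners tend to get lane 8.
            in_lane8 = y2 + random.gauss(0, 0.5) > 21.5
        else:
            in_lane8 = random.random() < 0.5  # lane drawn by lot
        if in_lane8:
            lane8.append(y8)  # only the lane 8 potential outcome is observed
        else:
            lane2.append(y2)  # only the lane 2 potential outcome is observed
    return mean(lane8) - mean(lane2)

biased_gap = simulate(assign_by_ability=True)   # causal effect plus selection bias
random_gap = simulate(assign_by_ability=False)  # close to the causal effect
print(f"non-random: {biased_gap:+.3f}  random: {random_gap:+.3f}")
```

Under the non-random rule the raw gap mixes the causal effect with differences in ability, while under the lottery it should sit near the assumed -0.10.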
To quantify the observed difference of race times across lanes I estimate the following statistical model:
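$$Y_{i,j} = \alpha + \sum_{k \neq 2} \beta_k T_{k,i,j} + \sum_{f} \gamma_f X_{f,i,j} + \varepsilon_{i,j} \tag{4}$$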
Where Yi,j denotes the observed race time of runner i in heat j, Tk,i,j denotes indicator variables for each lane (excluding lane 2, the reference lane), and Xf,i,j denotes a collection of control variables. Simply estimating Eq (4) on all race data would likely lead to biased estimates of treatment effects as lane assignments are typically non-random. To overcome this fundamental issue and to estimate the causal effects of lane assignments on race times, I leverage the random assignment rule implemented by the IAAF in the first round of each event. This random assignment is important from a causal inference perspective as the runners in each lane will (on average) have the same characteristics, and thus any differences in race times can be attributed to lane assignments. This is the independence criterion highlighted above. Specifically, [16] states: “In the first round and any additional preliminary qualification round as per Rule 166.1, the lane order shall be drawn by lot.” From personal correspondence with rules officials at the IAAF I have confirmed that this random assignment rule was initiated in the 1985–86 rulebook under rule 141.11 and is still in place today. In recent years, the Men’s 100m in the World Championship also included Preliminary Round heats, which occurred prior to Round 1 heats. This preliminary round is for “unqualified” athletes. According to the rules, random assignment to lanes is supposed to occur in both the Preliminary Round and Round 1 heats. However, from examining the data, and corresponding with rules officials, it appears that the fastest race times from the Preliminary Round heats were sorted into the outside lanes for the Round 1 heats, resulting in a non-random assignment in Round 1 of these events. As such, for the events where there is a Preliminary Round prior to Round 1, I exclude the Round 1 data from the analysis.
In practice, one could explore the average differences in race times across lanes in a non-parametric manner (e.g. t-tests). However, runner ability varies quite a bit in the first heats and, as a result, there is substantial variation in race times. This makes detecting any statistical differences challenging. The use of the statistical model in Eq (4) is useful as it includes various control variables which help explain much of this variation and helps to sharpen the estimates of lane effects.
The following control variables (Xf,i,j) are included in the regression: the recorded wind measurement in each heat (included only in the 100m and 200m events, where positive (negative) measurements denote tailwinds (headwinds)), the runner’s reaction time to the start gun (included in the 100m, 200m, and 400m), the runner’s season’s best race time, the runner’s personal best race time, and, when data for Men’s and Women’s races are pooled, a dummy variable indicating male events. An additional desirable feature of including season’s best is that it controls for any year effects (e.g. sprinters getting faster over time). The coefficients of interest are those on the lane dummy variables, which estimate the causal effect of lane assignments on race times. The additional covariates are useful in explaining much of the variation in race times, which helps to sharpen the estimates of lane effects. Another approach to estimating lane effects would be to exploit within-sprinter variation (i.e. observing the same sprinter in multiple lane assignments). However, the vast majority of runners (70–80%) appear in the data only once, which severely hampers such an approach. The inclusion of personal best in regression Eq (4) plays this role to some degree for athletes who are in the data more than once, but only if personal best does not change between observations of the same athlete.
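A regression of this form, with lane dummies (lane 2 omitted as the reference) and a control variable, can be sketched as follows. This is a minimal illustration on simulated data rather than the replication code: the lane effects, the single season's best control, the noise level, and the helper names `solve` and `ols` are all assumptions made for the example.

```python
import random

def solve(A, b):
    """Solve the linear system A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [A[i][:] + [b[i]] for i in range(n)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

def ols(X, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y."""
    p = len(X[0])
    XtX = [[sum(row[a] * row[b] for row in X) for b in range(p)] for a in range(p)]
    Xty = [sum(row[a] * yi for row, yi in zip(X, y)) for a in range(p)]
    return solve(XtX, Xty)

random.seed(1)
LANES = [3, 4, 5, 6, 7, 8]                                 # lane 2 is the reference
TRUE_EFFECT = {k: -0.015 * (k - 2) for k in range(2, 9)}   # hypothetical effects

X, y = [], []
for _ in range(7000):
    lane = random.randint(2, 8)                            # lane drawn by lot
    sb = random.gauss(21.5, 0.7)                           # season's best (control)
    dummies = [1.0 if lane == k else 0.0 for k in LANES]
    X.append([1.0] + dummies + [sb])
    y.append(0.2 + 0.99 * sb + TRUE_EFFECT[lane] + random.gauss(0, 0.15))

names = ["const"] + [f"lane{k}" for k in LANES] + ["season_best"]
coefs = dict(zip(names, ols(X, y)))
print({name: round(value, 3) for name, value in coefs.items()})
```

Each lane coefficient is read as the average difference in race time relative to lane 2, holding the control fixed. Because the control absorbs most of the variation in race times, the lane coefficients are estimated more tightly than a raw comparison of lane means would allow.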
Results
100m
I begin by analyzing lane assignment effects in the 100m. Narratives about lane advantages tend to be focused on races with corners (200m and 400m) but the quote from Andre De Grasse in the introduction highlights that they also persist in the 100m. Similar to the 200m and 400m, the belief that middle lanes are best in the 100m could relate to the fact that middle lanes improve a runner’s vantage and help them judge where they are relative to their competitors. Indeed, in lane assignment rules used for later rounds of races, the fastest qualifying times are assigned to inside lanes, which suggests they are viewed as favorable.
To begin each analysis, I confirm whether the randomization across lanes is effective. To do this I estimate the following statistical model:
$$SB_{i,j} = \alpha + \sum_{k \neq 2} \delta_k T_{k,i,j} + u_{i,j} \tag{5}$$
Where SBi,j denotes a runner’s season’s best. If runners are assigned lanes based on ability (e.g. their performance in meets taking place earlier in the season), this would be highly problematic for assessing the causal impact of lane assignments. To qualify for the World or U20 Championships, athletes must meet the entry standard in a window that typically spans a year prior to the event. To ensure that lane assignments in Round 1 are indeed random, I proxy for an athlete’s ability with their season’s best (prior to the event being analyzed) and test whether there are statistical differences in season’s best across lanes. Results from this randomization check are reported in Table 1 below. As discussed below, lane 2 was chosen as the baseline to compare against all other lanes.
Columns 1 and 3 in Table 1 report the results from estimating Eq (5) using all data from the Men’s and Women’s races, respectively. As discussed in more detail below, columns 2 and 4 report the results with outliers excluded. Columns 5 and 6 report the results using all data and data with outliers removed for the pooled Men’s and Women’s data.
In general, the randomization appears to effectively balance runners into lanes based on their season’s best times. In a few cases there are statistically significant results, but these normally appear in lanes with a low number of observations, reported in square brackets. It is not reported in the table because there are no corresponding regression results, but lane 2 has a similar number of observations to lane 3. A common issue in the 100m, and all other events, is that in these first round races lanes 1 and 9 are often empty. As an example, in Table 1 the Women’s 100m has 40 or fewer observations in lanes 1 and 9, relative to around 100 observations in the other lanes. Because lanes 1 and 9 have far fewer observations than the other lanes, they are more susceptible to issues relating to low statistical power. As such, all estimated lane effects for lanes 1 and 9 throughout this paper should be treated with caution as they are more susceptible to Type-1 error (see, e.g., [17]). In addition, small sample sizes are susceptible to Type-M error (exaggerating the magnitude of the effects) [18].
At the bottom of the randomization tables, F-statistics and their associated p-values are reported for joint significance tests of the lanes. Only the Women’s races are jointly significant at the 5% level, and when these data are pooled with the Men’s data, the lane estimates fail significance at the 5% level, suggesting that the randomization is generally effective.
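The joint significance test described above can be sketched as follows. Assuming homoskedastic errors (an assumption of this sketch; the paper does not state how its F-statistics are computed), the joint F-test on the lane dummies in a regression of season's best on lane indicators coincides with the one-way ANOVA F-statistic. The data are simulated and the function name is hypothetical.

```python
import random

def joint_f_statistic(values, lanes):
    """F-statistic for the null that the mean of `values` is equal across lanes."""
    groups = {}
    for v, lane in zip(values, lanes):
        groups.setdefault(lane, []).append(v)
    n, k = len(values), len(groups)
    grand_mean = sum(values) / n
    ss_between = ss_within = 0.0
    for g in groups.values():
        m = sum(g) / len(g)
        ss_between += len(g) * (m - grand_mean) ** 2
        ss_within += sum((v - m) ** 2 for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

random.seed(2)
season_best = [random.gauss(11.0, 0.5) for _ in range(2100)]

# Case 1: lanes 2 to 8 drawn by lot, independent of ability.
drawn_by_lot = [random.randint(2, 8) for _ in season_best]

# Case 2: hypothetical seeding rule that sorts runners into lanes by season's best.
order = sorted(range(len(season_best)), key=season_best.__getitem__)
seeded = [0] * len(season_best)
for rank, i in enumerate(order):
    seeded[i] = 2 + rank * 7 // len(order)

f_random = joint_f_statistic(season_best, drawn_by_lot)
f_seeded = joint_f_statistic(season_best, seeded)
print(f"F (drawn by lot) = {f_random:.2f}, F (seeded) = {f_seeded:.2f}")
```

A small F-statistic is what effective randomization should produce, while assignment based on ability yields a very large one. The statistic is compared against an F distribution with (k - 1, n - k) degrees of freedom to obtain a p-value.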
As an additional robustness check, I examine if the propensity (probability) a runner is assigned to a specific lane is statistically related to their season’s best. These results are reported in Tables 12–15 in S1 Appendix. None of the regressions show a significant relationship between season’s best and treatment status (lane assignments), providing additional evidence that the randomization is effective.
Moving on to the estimates of the effect of lane assignments on race times, results from model Eq (4) for the 100m data are reported in Table 2. Racers who do not start (DNS) or who are disqualified (DQ) do not register race times. In addition, I exclude any racers with missing season’s best and personal best data as these are important in the analysis. Again, I report the results separately for Men’s and Women’s races and also pool the Men’s and Women’s data in the “Pooled” columns to help improve statistical power. I chose lane 2 as the baseline to compare the other lanes against. I do this because, as is highlighted above, lane 1 consistently has far fewer observations than lanes 2 through 8 and thus may be more susceptible to issues relating to low statistical power. Columns 1 and 3 in Table 2 do not show any systematic effect of lane assignments on race times. There are a few statistically significant lane effects in the Men’s data in column 1. For example, lane 3 produces race times which are on average 0.061 seconds slower than lane 2 (significant at the 5% level). In the Women’s data (column 3), by contrast, lane 7 is 0.0442 seconds faster on average relative to lane 2 (weakly significant at the 1-sided 10% level). However, the lack of consistency between lanes within the Men’s and Women’s races, along with the lack of consistency between genders, suggests these results may be anomalous. Pooling the Men’s and Women’s data (column 5) to improve statistical power only yields a weakly significant effect (1-sided 10%) for lane 9 (0.043 seconds faster than lane 2). But again, this result should be treated with caution as lane 9 has far fewer observations than other lanes.
Another concern one might have with the data is the presence of extreme outliers. For example, in the data there are race times that are more than three standard deviations slower than the mean. It is possible these extreme outliers have an important influence on the estimated lane effects. It also seems plausible that these extreme outliers are unrelated to lane assignments. For example, a runner who sustains an injury during the race may have a much slower race time than the norm. To control for these outliers, I conduct the same analysis where I exclude the slowest 5% of the race times in the Men’s and Women’s race, reported in columns 2 and 4 respectively, and results from the pooled analysis are reported in column 6. Excluding these extreme outliers does not generate a meaningful change in the overall regression results in the 100m. In the pooled data with outliers excluded only lane 9 again has a significant lane effect, being on average 0.0513 seconds faster than lane 2. While this effect should be treated with caution because of the low number of observations, it is also the opposite effect relative to the common narrative pertaining to outside lanes in the 100m.
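The outlier exclusion can be sketched as a simple trimming step. This is an illustration only: the exact rule used in the replication package (for example, whether the cutoff is computed within each event and gender) is not specified here, and the function name and example times are hypothetical.

```python
def drop_slowest(times, percent=5):
    """Return `times` in original order with the slowest `percent` percent removed."""
    n_drop = len(times) * percent // 100
    if n_drop == 0:
        return list(times)
    cutoff = sorted(times)[-n_drop]          # fastest time among those dropped
    return [t for t in times if t < cutoff]  # ties at the cutoff are dropped too

# Nineteen ordinary times plus one very slow time (e.g. an injured runner).
times = [10.1 + 0.05 * i for i in range(19)] + [15.9]
kept = drop_slowest(times)
print(len(times), len(kept), max(kept) < 12)
```

Trimming on the outcome in this way removes observations such as an injured runner's time, which are plausibly unrelated to lane assignment but can have a large influence on estimated lane means.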
An important point worth emphasizing with this empirical strategy is that runners, of course, are not blinded to their lane assignments. In an analogy from clinical trials for drugs, it is as if “control” subjects do not receive a placebo and are aware of their treatment status. A concern with non-placebo trials is that control subjects engage in differential behavior because of their status (e.g. seek their own treatment), which may impact the estimates of treatment effects. Though leveraging random assignment ensures runner characteristics will be balanced across lanes, it is possible that runners adjust their effort in response to their lane assignments. The main objective in these early rounds is to qualify to advance to later rounds. Runners may be interested in preserving energy for later races and, as such, give “just enough” effort to advance. The concern is that these “just enough” effort types may supply different levels of effort conditional on their lane assignments, which may impact the estimates of lane effects. The IAAF rules that determine qualification for later rounds vary by meet as they can be determined by Technical Delegates. However, it is common that two or three racers from each heat automatically advance, with the possibility of more runners qualifying on time. Aside from these “just enough” types, the remaining athletes are likely to be “maximum effort” types in attempting to qualify to advance. While it is certainly plausible that differential effort provision conditional on lanes could exist, it is important to note that the “just enough” athletes would constitute the minority of runners because of qualification rules. As an additional robustness check, I re-estimate the model Eq (4) excluding runners who finished in first or second place. Because excluding two runners per race amounts to an important reduction in sample size, I do this on the pooled data.
Excluding these runners does not have a meaningful impact on the results, reported in the final column of Table 2, and suggests that differential provision of effort across lanes does not impact the estimates of lane effects. Collectively, these results suggest that there is no robust evidence of lane effects in 100m races. To ease interpretation of the results, Fig 1a plots the lane coefficient estimates from the second last column of Table 2 (i.e. the results generated from pooled data excluding outliers).
Fig 1. These figures plot the estimated lane effects using pooled men’s and women’s data and excluding outliers. 95% confidence intervals are denoted by the smaller symbols.
An issue that is relevant throughout this paper is statistical power. From a null finding of lane effects one cannot, of course, conclude that no lane effects exist. One can only conclude that given the statistical power in this analysis, if true lane effects do exist, their magnitude was not detectable. To provide a sense of the role that statistical power is playing in these null results I briefly highlight the lane effects that could be detected given this sample size. Following [20] I report some Minimum Detectable Effects (MDE) from the above regressions. Analyzing MDEs is a common approach to evaluate ex-post statistical power (see, e.g., [21]). At statistical power of 0.8 and a significance level of 5% or 10%, the MDE is found by multiplying the standard error on the coefficient estimate by 2.8 and 2.49, respectively. For example, using the results from the pooled data in column 6 in Table 2, the standard error on the lane 8 coefficient is 0.0171. Thus, the MDEs at the 5 or 10% significance level would be 0.0479 and 0.0426, respectively. While one cannot rule out true lane effects in the 100m from the null results in Table 2, these MDEs help establish that if lane effects do exist in the 100m, they must be quite small.
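The MDE arithmetic above can be reproduced in a few lines. This is a minimal sketch that takes the multipliers (2.8 and 2.49) and the standard error (0.0171) directly from the text; the function and variable names are hypothetical.

```python
# Multipliers quoted in the text for statistical power of 0.8.
MDE_MULTIPLIER = {0.05: 2.8, 0.10: 2.49}

def minimum_detectable_effect(std_error, significance=0.05):
    """Smallest true effect detectable with power 0.8 at the given significance level."""
    return MDE_MULTIPLIER[significance] * std_error

se_lane8_100m = 0.0171  # standard error on the lane 8 coefficient, pooled 100m data
mde_5 = minimum_detectable_effect(se_lane8_100m, 0.05)
mde_10 = minimum_detectable_effect(se_lane8_100m, 0.10)
print(f"{mde_5:.4f} {mde_10:.4f}")  # 0.0479 0.0426
```

The same function can be applied to any other reported standard error to gauge the size of lane effect the corresponding regression could plausibly detect.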
200m
200m races are more generally thought to have lane advantages and a common view is that periphery lanes—outside and inside lanes—are slower. Here I repeat the same general analysis strategy as above. To begin, the randomization check is reported in Table 9 in the S1 Appendix. These results show the randomization is effective. Only in lane 1 of the pooled data does season’s best appear to be (weakly) related to lane assignments, which again could be a result of many fewer observations in lane 1.
Results from running model Eq (4) on the 200m data are reported in Table 3. The Men’s, Women’s, and Pooled data including or excluding outliers all show evidence that outside lanes produce lower average race times than lane 2. This consistency across Men’s and Women’s races, and the fact that these lane advantages seem to monotonically increase as the lane number increases are reassuring results. The estimated lane coefficients using pooled data and excluding outliers are plotted in Fig 1b. The estimates are also robust to excluding runners who finish first or second in each race, reported in the final column of Table 3. As discussed above, this suggests that differential provision of effort across lanes from faster athletes is not driving the results.
These estimates suggest the advantage of outside lanes can be sizable. For example, in the Women’s data, excluding outliers, lane 8 is estimated to be 0.1781 seconds faster than lane 2. The standard deviation (SD) of race times in this data is 0.68 seconds. A common way to gauge the magnitude of an effect is to compute the effect size, that is, the estimated effect divided by the standard deviation of the outcome. These estimated results produce an effect size of 0.262, which is sizable. Put a different way, these lane effects could easily be the difference between qualifying, or not, to advance to the next round of the race.
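As a worked check of this calculation, writing \hat{\beta}_8 for the estimated lane 8 coefficient (notation assumed here) and using the two figures quoted above:

$$\text{effect size} = \frac{|\hat{\beta}_8|}{\mathrm{SD}(Y)} = \frac{0.1781}{0.68} \approx 0.262$$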
Of particular interest is the fact that these estimated lane advantages are the opposite of what is commonly believed regarding outside lanes. The seeming persistence and pervasiveness of false beliefs is interesting and I return to it in the Discussion section.
400m
Turning to the 400m races, I again first present the randomization check in Table 10, reported in the S1 Appendix. These results again show robust evidence that the randomization successfully balances racers by their season’s bests across the different lanes. The only, weakly, significant result is for lane 3 in the Women’s data, and this disappears when outliers are excluded.
The estimates of lane effects on race times in the 400m are somewhat consistent with the 200m, but are much noisier. These results are reported in Table 4. Wind speed is not recorded in the 400m, so these results are estimated by running model Eq (4) without wind as a control. There is some mixed evidence in the Women’s data that outside lanes produce faster race times, consistent with the lane advantages estimated in the 200m races. However, race times do not appear to be monotonically decreasing with lane number, and, in addition, these effects are absent in the Men’s data. When the data are pooled together and outliers are excluded, lanes 4, 5, 6, 7 and 9 show some evidence of faster race times relative to lane 2. These results are not greatly impacted by excluding runners who finish in first or second (reported in the final column). The estimates using pooled data and excluding outliers are graphically depicted in Fig 1c. Visually, the results for the 200m and 400m look somewhat similar, with race times tending to decrease with lane number, but from the 95% confidence intervals, it is clear that the 400m results are statistically weaker.
One important issue with the 400m, and the 800m below, is that the longer average race times tend to be associated with greater dispersion in race times. For example, as noted above, the standard deviation of race times excluding outliers in the 200m Women’s data is 0.68 seconds. The analogous standard deviation in the 400m is 1.64 seconds. As a result, for a given number of observations, statistical power weakens as event times increase. To give a sense of statistical power, I again report the MDE for the 400m. For example, using the pooled data without outliers, the estimate for lane 8 has a standard error of 0.0754. With statistical power of 0.8, this gives an MDE of 0.2111 and 0.1878 for the 5% and 10% significance levels, respectively. Thus, given the number of observations in the 400m data, statistical power would be insufficient to pick up lane effects similar in magnitude to those in the 200m. Of course, it is also important to emphasize that even if there are lane effects in the 400m that are of similar magnitude to those in the 200m, their relative importance would be much smaller in the 400m since they represent a much smaller fraction of the mean or standard deviation of race times.
Also of interest is that these results are in contrast to the common belief that outside lanes are a significant disadvantage in the 400m. In both [1, 2] there is discussion about the gold medal race in the 2016 Olympics by Wayde van Niekerk. He is the first man to win the 400m from lane 8 and these articles clearly highlight the sentiment that this is impressive because lane 8 places runners at a disadvantage. However, the results in Table 4 show that, if anything, outside lanes produce average race times that are faster than lane 2. Of course, winning from lane 8 is impressive in that the runner registered one of the slowest qualifying times for the final, but this does not necessarily suggest that lane 8 itself is a disadvantage: van Niekerk’s improvement from his semifinal time was an impressive 1.42 seconds, whereas the average improvement of all the other runners in that race was 0.172 seconds.
800m
The 800m race differs from the above events in that lane assignments are not fixed for the duration of the race. Runners are assigned to a lane and must remain in that lane until the break line 100m from the start. This unique feature of the 800m, relative to the other shorter distance events, makes it interesting to explore in the context of lane assignment effects.
I again begin with the randomization check for the 800m data, reported in Table 11 in the S1 Appendix. There appears to be robust evidence that the randomization is effective. Moving on to the estimates of lane effects, I again implement model Eq (4). However, wind speed and reaction times are not recorded for the 800m and are thus not included in the regression. In addition, since the 800m tends to be a pack race (runners tend to run together in a pack for some portion of the race), I also include race fixed effects. Thus, lane effects are estimated after controlling for the average time in a race. On occasion, when tracks do not have a ninth lane, 800m races can have two runners assigned to lane 8. This is quite rare in the data, but I exclude these racers when it does occur. These regression results are reported in Table 5.
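To illustrate what race fixed effects do, the sketch below applies the within-race transformation: each runner's time is expressed as a deviation from the mean time of their race, so a fast or slow pack no longer contributes to lane comparisons. The data and the function name `demean_within_race` are hypothetical. In a full regression the regressors would be demeaned in the same way; only the outcome is shown here.

```python
from collections import defaultdict


def demean_within_race(rows):
    """Subtract each race's mean time from the times of its runners."""
    by_race = defaultdict(list)
    for race_id, _lane, time in rows:
        by_race[race_id].append(time)
    race_mean = {race: sum(times) / len(times) for race, times in by_race.items()}
    return [(race_id, lane, time - race_mean[race_id]) for race_id, lane, time in rows]


# Hypothetical (race_id, lane, time in seconds) records: a fast heat and a slow heat.
rows = [
    ("heat1", 2, 106.0), ("heat1", 5, 107.0), ("heat1", 8, 108.0),
    ("heat2", 2, 110.0), ("heat2", 5, 111.0), ("heat2", 8, 112.0),
]
demeaned = demean_within_race(rows)
for race_id, lane, deviation in demeaned:
    print(race_id, lane, deviation)
```

In this toy example the two heats differ by four seconds on average, yet after demeaning the lane 8 runner is one second slower than the race mean in both heats, which is the kind of within-race variation that the lane coefficients capture.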
The results are somewhat mixed, possibly due to the issues regarding longer race times and dispersion highlighted above, but there is some weak evidence that outside lanes tend to produce slower race times on average. For example, in the pooled data excluding outliers, lanes 5, 7 and 8 show positive and significant (weakly in some cases) effects on race times, ranging from 0.213 to 0.357 seconds. These results are generally consistent, but somewhat statistically weaker, when runners who finish in first or second are excluded. The results using pooled data and excluding outliers are reported in Fig 1d.
Of interest, the result that outside lanes produce slower race times on average is the opposite of the general result found in the 200m and 400m. As noted above, one possible explanation for this may be the unique lane break feature of the 800m. Since the inside lane of the track minimizes the distance covered, after the break line all runners converge to the inside lanes. This might make the inside lanes advantageous as runners in the outside lanes either have to jockey for position with runners who have an established position on the inside of the track, or continue to run in lanes which lengthen the distance travelled around the track.
Vantage points and effort effects
As noted above, the narrative that outside lanes are undesirable in races with corners stems from the idea that not being able to see competitors puts runners at a disadvantage. It could be the case that seeing a competitor generates additional motivation for runners and spurs increased effort. Because of staggered starts, runners in higher-numbered lanes can see fewer competitors, and the runner in the outermost lane can see no other runners (until they are passed). This effect will likely be the most dramatic in the 200m and 400m. If these “effort effects” of lanes do exist, a natural interpretation is that they would cause average race times to increase with lane number (i.e. outside lanes would be slower). This effect goes in the opposite direction compared to the biomechanical effects of tight corners. The results reported above should be thought of as the net effects of lane assignments. It is possible that both margins (effort and biomechanical effects) impact runners, but the results in the 200m and 400m suggest that, if anything, race times decrease with lane number. As such, the narrative that outside lanes are undesirable because of effort effects is not well supported by the data.
This, of course, does not rule out that effort effects are active, just that the tight corner effects dominate. While the net effects are clearly what ultimately matters in terms of assessing the desirability of lanes, it is still an interesting question to know if effort effects are present. To assess this question, I revisit the 200 and 400m results and leverage the fact that it is not always the same lane which is the outermost one. Because not all lanes are full in each race and/or some tracks do not have a 9th lane, there is some variability in which lane is the outermost (commonly lanes 8 or 9, and occasionally 7). Because the outermost lane and lane numbers are not perfectly correlated, I can leverage this fact to estimate the separate effect of being in the outermost lane (i.e. not having any competitors ahead of you to start a race).
To estimate these effects I run the same model as Eq (4) but where I also include an additional indicator variable taking a value of 1 (0) when a runner is (is not) in the outermost lane. In other words, controlling for lane effects as in Eq (4), is there a separate statistical effect of being in the outermost lane? To cut down on repetition, these results are estimated on the data without outliers and are reported in Table 6.
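The construction of the outermost indicator can be sketched as follows. This is a minimal illustration that assumes "outermost" means the highest-numbered occupied lane in a given race; the race identifiers, data, and the function name `add_outermost_indicator` are hypothetical and are not taken from the replication package.

```python
from collections import defaultdict


def add_outermost_indicator(rows):
    """Append 1 if the runner is in the highest-numbered occupied lane of the race, else 0."""
    max_lane = defaultdict(int)
    for race_id, lane in rows:
        max_lane[race_id] = max(max_lane[race_id], lane)
    return [(race_id, lane, int(lane == max_lane[race_id])) for race_id, lane in rows]


# Hypothetical races: A uses lanes 2-9, B fills lanes 1-8, C stops at lane 7.
rows = (
    [("A", lane) for lane in range(2, 10)]
    + [("B", lane) for lane in range(1, 9)]
    + [("C", lane) for lane in range(1, 8)]
)
flagged = add_outermost_indicator(rows)
outermost_lanes = {race: lane for race, lane, flag in flagged if flag == 1}
print(outermost_lanes)
```

Because the flagged lane differs across races (9, 8 and 7 in this toy example), the indicator is not collinear with any single lane dummy, which is what allows its coefficient to be estimated alongside the lane effects.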
The coefficient of interest is on the outermost indicator variable (α1). In the men’s 200m, the estimated coefficient is 0.0973 with a p-value of 0.052. So there is reasonable statistical evidence that, after controlling for lane effects, being in the outermost lane does generate slower race times on average. In the women’s 200m, the coefficient falls to 0.0597 and is insignificant at standard levels. And in the pooled data the coefficient is 0.0778 with a p-value of 0.058. Overall, while the evidence is somewhat mixed between the men’s and women’s races, there is some evidence that being in the outermost lane does have a negative impact on runners. Again, it is important to emphasize that these results do not mean that the outside lanes generate an overall slowdown in race times. The results above clearly highlight that being in the outside lanes in the 200m generates, on average, faster race times. The positive coefficient on the outermost variable is simply the marginal impact of being in the outermost lane.
The results from the 400m races are much more mixed, with none of the outermost coefficients being statistically significant. Again, this could stem from weaker statistical power at this distance.
Alternative regression models
A desirable feature of model Eq (4) is that it is agnostic about the structure of lane advantages, allowing each lane to have a separate treatment effect. However, one of the downsides of this approach is that it requires eight regressors, which compromises statistical power. This may be especially worrisome in the 400m and 800m where statistical power issues are more salient. In this section I explore two alternative statistical models to help alleviate the statistical power issue. In the first approach, I repeat the general approach in Eq (4) but instead pool runners together in lanes 1 and 2, 3 and 4, 5 and 6, and 7, 8, and 9. This dramatically increases the number of observations per regressor but, of course, has the obvious downside of assuming the statistical impact of, for example, lanes 3 and 4 is identical. Specifically, with this alternative regression I estimate:
(6)
Where T3,4, T5,6, and T7,8,9 denote dummy variables which take a value of 1 when racers are in lanes 3 or 4, 5 or 6, and 7 or 8 or 9, respectively, and take a value of 0 otherwise. Lanes 1 and 2 are now the baseline grouping. To reduce repetition, I report only results which pool men’s and women’s data and exclude outliers.
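The lane grouping can be illustrated with a small sketch. It relies on a standard property of OLS: with an intercept and mutually exclusive group dummies and no other controls, each dummy's coefficient equals that group's mean outcome minus the baseline group's mean. The paper's model also includes further controls, so this is a simplified illustration on hypothetical data rather than the estimation itself; `lane_group` and `group_effects` are illustrative names.

```python
from statistics import mean


def lane_group(lane):
    """Map a lane number (1-9) to the coarse grouping described in the text."""
    if lane in (1, 2):
        return "1,2"  # baseline group
    if lane in (3, 4):
        return "3,4"
    if lane in (5, 6):
        return "5,6"
    if lane in (7, 8, 9):
        return "7,8,9"
    raise ValueError(f"unexpected lane: {lane}")


def group_effects(rows):
    """Each group's mean time minus the baseline (lanes 1-2) mean time."""
    times = {}
    for lane, time in rows:
        times.setdefault(lane_group(lane), []).append(time)
    baseline = mean(times["1,2"])
    return {g: mean(t) - baseline for g, t in times.items() if g != "1,2"}


# Hypothetical (lane, time in seconds) observations, not real race data.
rows = [
    (1, 21.0), (2, 21.2), (3, 21.0), (4, 21.0), (5, 20.9),
    (6, 21.1), (7, 20.8), (8, 20.9), (9, 21.0),
]
effects = group_effects(rows)
print(effects)
```

Three group coefficients replace the eight lane coefficients, so each is estimated from roughly two to three times as many observations, at the cost of forcing lanes within a group to share one effect.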
Results from Eq (6) are reported in Table 7. Overall, the general results are quite similar to those generated from the original model Eq (4). There is no evidence of lane assignments impacting race times in the 100m and there is strong evidence that outside lanes produce faster race times in the 200m. The results from the 400m become somewhat more significant and show some evidence that outside lanes in the 400m are also faster. And in the 800m, outside lanes tend to produce slower race times, but the effects remain quite weak statistically.
For the last model, instead of utilizing indicator variables for lane assignments I implement a continuous lane variable. In particular, I estimate the following statistical model:
(7)
where Zi,j is a continuous lane variable taking values from 1 to 9. Of course, the implied assumption here is that lane numbers impact race times in a linear fashion. This further increases statistical power, but, of course, has the undesirable feature of imposing a functional form which may or may not capture the true data generating process. Results of this regression are reported in Table 8.
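The linear specification can be sketched with the closed-form solution for a one-regressor OLS fit, where the slope is the covariance of lane and time divided by the variance of lane. The example below omits the other controls in the paper's model and uses hypothetical data constructed to fall by 0.02 seconds per lane; `ols_slope_intercept` is an illustrative name.

```python
def ols_slope_intercept(x, y):
    """Closed-form simple OLS: slope = cov(x, y) / var(x)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


# Hypothetical data: times fall by 0.02 s per lane, purely for illustration.
lanes = list(range(1, 10))
times = [21.0 - 0.02 * lane for lane in lanes]

slope, intercept = ols_slope_intercept(lanes, times)
print(f"slope per lane: {slope:.4f}, intercept: {intercept:.4f}")
```

A negative slope corresponds to faster times in outside lanes and a positive slope to slower times, with a single parameter summarizing the whole lane profile.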
The results in Table 8 are again quite similar to those found from the original model Eq (4). The β1 coefficient is small and highly insignificant in the 100m and it is negative and highly significant in the 200m. While the coefficient is negative in the 400m, and of similar magnitude to that in the 200m, it falls just short of significance at the 10% level (p-value = 0.109). Finally, the β1 coefficient is positive in the 800m, but also fails significance at standard levels (p-value = 0.22).
In summary, employing these alternative statistical models seems to buttress the initial estimates of lane effects found from estimating model Eq (4). In all cases there is no evidence of lane effects in the 100m and robust evidence that outside lanes are faster in the 200m. With coarser lane groupings the evidence of faster outside lanes in the 400m becomes somewhat stronger, and this aligns with the effects seen in the 200m. And finally, through all the different statistical models the evidence from the 800m suggests outside lanes are slower, but these effects are quite weak statistically.
Discussion
Leveraging a random assignment rule implemented in the first round of IAAF events, this paper provides causal estimates of the effects of lane assignments in sprint distance track and field events. I find no evidence of lane advantages in the 100m, which suggests that a runner’s vantage point is inconsequential for their performance. In the 200m, I find robust evidence that outside lanes on the track produce faster race times. This result is consistent with the biomechanical evidence on the impact of tight corners on running speeds. While average race times in outside lanes are also faster in the 400m, the statistical evidence is somewhat weaker than in the 200m. But it is important to note that statistical power becomes more of an issue in events with longer race times. Finally, I find some weak evidence that outside lanes in the 800m tend to produce slower race times, which may be a product of the unique lane break feature of the 800m.
There are a number of interesting points worth discussing. The first is that the results in the 200m and 400m suggest that the commonly held belief that middle lanes are best is incorrect. Why these seemingly false beliefs persist is an interesting question. One possible interpretation is that in most observed track and field races, slower athletes are assigned to the periphery lanes of the track. For example, in the IAAF rules, after round 1, athletes are ranked by their round 1 race times and: “Three draws will be made: i) one for the four highest ranked athletes or teams to determine placings in lanes 3, 4, 5 and 6, ii) another for the fifth and sixth ranked athletes or teams to determine placings in lanes 7 and 8, and iii) another for the two lowest ranked athletes or teams to determine placings in lanes 1 and 2.” Thus, the runners assigned to the periphery lanes are the slowest runners in the race. As another example, in the widely used track and field software called Hytek, used in Olympic trials and NCAA championships, the “standard lane preferences” option in the software ranks lanes from most preferred to least preferred as: 4, 5, 3, 6, 2, 7, 1, 8. Again, the slowest runners are assigned to the periphery lanes. Failure to account for this non-random assignment to lanes may reinforce the idea that periphery lanes are slower.
While this can possibly explain the persistence of false beliefs about lane advantages, it fails to explain the origin of these lane assignment rules. One possible explanation of the origin of these rules is a technological constraint. I have heard, but unfortunately have not been able to find documentation to support, that assigning faster runners to the middle lanes was done in the hand timing era to make it easier for timers to see runners cross the finish line in an “inverted-V” pattern, with the middle runners crossing first. This would allow hand timers on either side of the track to observe runners in a sequential finish, and make accurate timing easier. This is an interesting possible explanation: a technical constraint was the impetus for lane assignment rules, which themselves led to beliefs that runners in middle lanes perform the best because of a failure to account for non-random assignment to lanes. Beyond addressing the specific question of lane advantages, the results in this paper could also be viewed as an interesting case study in the persistence of false beliefs (e.g. [22, 23]).
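The way seeded lane assignment can make periphery lanes look slow, even with no true lane effect, can be sketched with the lane preference order quoted above. The sketch assumes the fastest seed simply receives the most preferred lane, the next fastest the second most preferred, and so on; how any particular software applies the order internally is not documented here, and the seed times and the function name `assign_lanes_by_seed` are hypothetical.

```python
# Lane preference order quoted in the text for the "standard lane preferences" option.
LANE_PREFERENCE = [4, 5, 3, 6, 2, 7, 1, 8]


def assign_lanes_by_seed(seed_times):
    """Give the fastest seed the most preferred lane, the next fastest the second, and so on."""
    ranked = sorted(seed_times.items(), key=lambda item: item[1])
    return {runner: lane for (runner, _time), lane in zip(ranked, LANE_PREFERENCE)}


# Hypothetical seed times in seconds for eight runners.
seeds = {f"runner{i}": 20.0 + 0.1 * i for i in range(8)}
lanes = assign_lanes_by_seed(seeds)

# Compare average seed times in middle lanes (3-6) and periphery lanes (1, 2, 7, 8).
middle = [seeds[r] for r, lane in lanes.items() if lane in (3, 4, 5, 6)]
periphery = [seeds[r] for r, lane in lanes.items() if lane in (1, 2, 7, 8)]
gap = sum(periphery) / len(periphery) - sum(middle) / len(middle)
print(f"periphery minus middle mean seed time: {gap:.2f} s")
```

The gap in this example arises entirely from who is placed in each lane, before any race is run, which is why comparisons of lane times under seeded assignment cannot be read as lane effects.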
These results are also interesting in the context of the design of lane assignment rules. Lane assignment rules are designed to “reward” the fastest qualifying times with advantageous lanes in later rounds. However, the results here suggest that estimated lane advantages are not consistent with the implied advantages in lane assignment rules. This opens an interesting discussion about fairness and whether these lane assignment rules should be modified. In addition, in track meets that use seed times to assign lanes in the first round of events, there is a question about the fairness of the competition. If the goal of competition is to put all athletes on an even playing field to begin the competition, events that use non-random lane assignments in the first round put some runners at an entrenched disadvantage.
There are a number of ways this work could be expanded. To begin, I have not examined lane advantages in sprint distance hurdle events or relays. Perhaps more interesting given the results from the 200m and 400m is to examine the effect of lane assignments for indoor events, which use a 200m track that has tighter corners than outdoor 400m tracks. It is also possible this methodology could be extended to examine the effect of lane assignments in other sports. If a similar random assignment rule is used at some point in the competition, one could examine this question in swimming, cycling, and speed skating events.
Acknowledgments
I thank Erick Gong for helpful discussions as well as the Editor, Roy Cerqueti, and three anonymous reviewers for helpful comments.
References
- 1. Morgan F. Are there lane advantages in athletics, swimming, and track cycling?; 2016. http://www.bbc.co.uk/newsbeat/article/37083059/are-there-lane-advantages-in-athletics-swimming-and-track-cycling.
- 2. Boylan P. How do lane assignments and starting spots work in track?; 2016. https://www.sbnation.com/2016/8/15/12486250/rio-2106-track-athletics-lane-staggered-start-400-record-wayde-van-niekerk.
- 3. Strashin J. Andre De Grasse wins bronze medal in Olympic men’s 100m; 2021. https://www.cbc.ca/sports/olympics/summer/trackandfield/track/olympics-track-and-field-100-metre-august-1-1.6126000.
- 4. Taboga P, Kram R, Grabowski AM. Maximum-speed curve-running biomechanics of sprinters with and without unilateral leg amputations. Journal of Experimental Biology. 2016;219(6):851–858. pmid:26985053
- 5. Churchill SM, Trewartha G, Salo AI. Bend sprinting performance: new insights into the effect of running lane. Sports Biomechanics. 2019;18(4):437–447. pmid:29562837
- 6. Hanley B, Casado A, Renfree A. Lane and heat draw have little effect on placings and progression in Olympic and IAAF World Championship 800 m running. Frontiers in Sports and Active Living. 2019;1:19. pmid:33344943
- 7. Berger J, Pope D. Can losing lead to winning? Management Science. 2011;57(5):817–827.
- 8. Klein Teeselink B, van den Assem MJ, van Dolder D. Does losing Lead to winning? An empirical analysis for four sports. Management Science. 2022.
- 9. Pope DG, Schweitzer ME. Is Tiger Woods loss averse? Persistent bias in the face of experience, competition, and high stakes. American Economic Review. 2011;101(1):129–57.
- 10. Hossain T, List JA. The behavioralist visits the factory: Increasing productivity using simple framing manipulations. Management Science. 2012;58(12):2151–2167.
- 11. Quinn MD. The effect of track geometry on 200-and 400-m sprint running performance. Journal of Sports Sciences. 2009;27(1):19–25. pmid:18979339
- 12. Morton RH. Statistical effects of lane allocation on times in running races. Journal of the Royal Statistical Society: Series D (The Statistician). 1997;46(1):101–104.
- 13. World Athletics. Competitions; 2020. https://www.worldathletics.org/competitions/.
- 14. Munro D. Replication Package; 2022. https://github.com/dmunro-git/Lane-Advantages.
- 15. Angrist JD, Pischke JS. Mostly harmless econometrics: An empiricist’s companion. Princeton university press; 2009.
- 16. World Athletics. Competition Rules 2018-2019; 2018.
- 17. Leppink J, Winston K, O’Sullivan P. Statistical significance does not imply a real effect. Perspectives on medical education. 2016;5(2):122–124. pmid:26984161
- 18. Gelman A, Carlin J. Beyond power calculations: Assessing type S (sign) and type M (magnitude) errors. Perspectives on Psychological Science. 2014;9(6):641–651. pmid:26186114
- 19. White H. A heteroskedasticity-consistent covariance matrix estimator and a direct test for heteroskedasticity. Econometrica. 1980; p. 817–838.
- 20. Bloom HS. Minimum detectable effects: A simple way to report the statistical power of experimental designs. Evaluation Review. 1995;19(5):547–556.
- 21. McKenzie D, Ozier O. Why ex-post power using estimated effect sizes is bad, but an ex-post MDE is not. World Bank Development Impact Blog. 2019.
- 22. Laney C, Fowler NB, Nelson KJ, Bernstein DM, Loftus EF. The persistence of false beliefs. Acta Psychologica. 2008;129(1):190–197. pmid:18620329
- 23. Nunn N, Sanchez de la Sierra R. Why being wrong can be right: Magical warfare technologies and the persistence of false beliefs. American Economic Review. 2017;107(5):582–87.