Abstract
We introduce a novel system of matching and scoring players in tournaments, called Multi-Tier Tournaments, illustrated by chess and based on the following rules: 1. Players are divided into skill-based tiers, based on their Elo ratings. 2. Starting with one or more mini-tournaments of the least skilled players (Tier 1), the winner or winners—after playing multiple opponents—move to the next-higher tier. 3. The winners progress to a final tier of the best-performing players from lower tiers as well as players with the highest Elo ratings. 4. Performance in each tier is given by a player’s Tournament Score (TS), which depends on his/her wins, losses, and draws (not on his/her Elo rating). Whereas a player’s Elo rating determines in which mini-tournament he/she starts play, TS and its associated tie-breaking rules determine whether a player moves up to higher tiers and, in the final mini-tournament, wins the tournament. This combination of players’ past Elo ratings and current TS’s provides a fair and accurate measure of a player’s standing among the players in the tournament. We apply a variation of Multi-Tier Tournaments to the top 20 active chess players in the world (as of February 2024). Using a dataset of 1209 head-to-head games, we illustrate the viability of giving lower-rated players the opportunity to progress and challenge higher-rated players. We also briefly discuss the application of Multi-Tier Tournaments to baseball, soccer, and other sports that emphasize physical rather than mental skills.
Citation: Brams SJ, Seven MM (2025) Multi-Tier Tournaments: Matching and Scoring Players. PLoS One 20(8): e0328826. https://doi.org/10.1371/journal.pone.0328826
Editor: Julio Alejandro Henriques Castro da Costa, Portugal Football School, Portuguese Football Federation, PORTUGAL
Received: April 19, 2025; Accepted: July 8, 2025; Published: August 13, 2025
Copyright: © 2025 Brams, Seven. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: All relevant data are within the paper and its Supporting information files.
Funding: The author(s) received no specific funding for this work.
Competing interests: The authors have declared that no competing interests exist.
1. Introduction
There is a wide variety of tournament formats. Two of the most popular, Swiss and knockout, each have structural limitations that raise concerns about their fairness, equality of opportunity, and strategic manipulability. In Swiss chess tournaments, for example, color imbalance (i.e., unequal numbers of games played with White or Black) can significantly affect outcomes. This arises because, as we discuss in more detail in Section 3, playing with White gives an advantage to a player, which is similar to a team’s home advantage in soccer. In addition, the pairings in chess may give rise to the “Swiss Gambit,” wherein players benefit from an early loss by facing weaker opponents in later rounds. By comparison, knockout systems are unforgiving in that a single early loss, perhaps due to a minor injury or simply a bad day, can eliminate strong players.
To address these issues, we introduce a novel system of matching and scoring players in tournaments, called Multi-Tier Tournaments, illustrated for chess by the following rules:
- 1. Players are divided into skill-based tiers, based on their Elo ratings.
- 2. Starting with one or more mini-tournaments of the least skilled players (Tier 1), the winner or winners, after playing multiple opponents, move to the next-higher tier.
- 3. The winners progress from lower tiers to a final tier, which comprises the best-performing players from lower tiers together with the players with the highest Elo ratings.
- 4. Performance in each tier is given by a player’s Tournament Score (TS), which depends on his/her wins, losses, and draws (not on his/her Elo rating).
Elo [1] ratings are a predictor of a match between two players, based on the difference in the players’ ratings. Thus, when a low-rated player defeats a high-rated player, the Elo score of the low-rated player significantly increases and the high-rated player’s score significantly decreases. On the other hand, the more similar the players’ ratings are, the less the players’ ratings are affected by the outcome of their match. In this manner, Elo ratings are self-correcting, providing a measure of the strength of players. For more details, see https://en.wikipedia.org/wiki/Elo_rating_system. Like Elo, the tournament score (TS) we propose, according to rule 4, depends on a player’s wins, draws, and losses, but only for players in each mini-tournament (more on this later).
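The self-correcting behavior of Elo ratings can be illustrated with a minimal sketch. It assumes the standard logistic Elo formula with a 400-point scale and a K-factor of 20; the function names and the K-factor are chosen here only for illustration and are not taken from the paper.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score of A against B under the standard logistic Elo model."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


def updated_rating(rating_a: float, rating_b: float, score_a: float, k: float = 20.0) -> float:
    """New rating of A after one game; score_a is 1 (win), 0.5 (draw), or 0 (loss).

    k is an assumed K-factor, used only for illustration.
    """
    return rating_a + k * (score_a - expected_score(rating_a, rating_b))


# Rating gain from a single win in three situations.
upset_gain = updated_rating(2600, 2800, 1.0) - 2600     # underdog wins
even_gain = updated_rating(2700, 2700, 1.0) - 2700      # equally rated players
favorite_gain = updated_rating(2800, 2600, 1.0) - 2800  # favorite wins
print(round(upset_gain, 2), round(even_gain, 2), round(favorite_gain, 2))
```

Under these assumptions, an upset win moves a rating the most and a win by the favorite moves it the least, with a win between equally rated players in between.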
Whereas a player’s Elo rating determines in which mini-tournament he/she starts play, TS and its associated tie-breaking rules determine whether a player moves up to higher tiers and, in the final mini-tournament, wins the tournament. This combination of players’ past Elo ratings and current TS’s provides a fair and accurate measure of the player’s standing among the players in the tournament.
We apply a variation of Multi-Tier Tournaments to the top 20 active chess players in the world (as of February 2024). Using a dataset of 1209 head-to-head games that we collected, we illustrate the viability of giving lower-rated players the opportunity to progress and challenge higher-rated players. We also discuss the application of Multi-Tier Tournaments to baseball, soccer, and other sports that emphasize physical rather than mental skills.
The International Chess Federation [2, Chapter C.04.1] prescribes several rules for organizing tournaments, which are distinctive in two major respects: (i) they have a fixed number of rounds, which is significantly less than the number of participants; (ii) they match players against opponents with similar scores after each round. Although the complete list of rules is extensive, the following are especially relevant to our paper:
- (1) Two players do not play against each other more than once.
- (2) The difference between the number of Black and the number of White games played by each player is not greater than two.
- (3) No player plays the same color three times in a row.
Rule (1) promotes diversity and reduces the potential bias arising from repeated matchups of the same players. Rule (2) promotes “color fairness,” which may be undermined by the first-mover advantage of White in chess if one player plays more games with this color. (To mitigate this bias, Brams and Ismail (2021) proposed a “catch-up” rule to put White and Black more on a par.) However, rule (2) allows a player to have one extra White game. Our analysis in Section 3 indicates a significant bias favoring the players who had an extra White game. For example, 66% of the top 10 finishers in major Swiss tournaments over the past decade had an extra White game. Rule (3) addresses “psychological fairness,” because consecutively playing with the same color may confer a psychological advantage (when playing with White) or disadvantage (when playing with Black).
As we will show, our approach not only satisfies rules (2) and (3), and other standards of fairness, but it also better balances the distribution of colors than the currently used Swiss system. However, because it allows players to move into higher and higher levels of competition, it is possible for two players who meet early in a tournament to be later opponents, which violates rule (1). We do not think this is a serious violation of fair play. Knockout tournaments satisfy (1) by eliminating competitors after one loss, but they are not necessarily fair to players, because a player’s lapse in one game cannot be made up for by wins in other games. We believe it is multiple games, at each tier in a tournament, that should be the principal determinant of who stays in and who is eliminated.
What is distinctive about our approach is its simplicity. It does not depend on indicators of performance other than wins, losses, and draws in multiple contests; where draws are not allowed or possible, this indicator can be dropped. (We propose, however, rules for breaking ties when players have identical TS’s. These are specific to chess and will need to be revised in applying Multi-Tier Tournaments to other games and sports.) This makes it applicable to very different games and sports and, thereby, a general way of ranking competitors and choosing winners in tournaments.
2. Selecting and scoring players in mini-tournaments and determining winners
(i) Selection algorithm
Before players can be matched and scored in a tournament, they must be selected for a mini-tournament. Players in the first mini-tournament of Multi-Tier Tournaments (Tier 1) are defined as the subset of players who have the lowest Elo ratings. For example, if there are 20 players in a tournament, they might be the 8 players with the lowest ratings. Each of these players plays against the 7 others in Tier 1 and receives a Tier Score (TS), based on his/her wins, losses, and draws (more on this later, when we relax the assumption that every player plays against every other player in a mini-tournament).
We first introduce the notation and give a more formal statement of our proposal. A Multi-Tier Tournament is given by the tuple $(N, E, K, (n_k), (r_k), (m_k))$, where $N = \{1, \dots, n\}$ represents the set of players with $n = |N|$, and $E = (e_1, \dots, e_n)$ denotes the initial ordered Elo ratings, with $e_1$ being the lowest rating and $e_n$ the highest (ties broken randomly). (Players who do not have an Elo rating are assigned an initial rating, which is usually 1500.) The tournament is organized into $K$ tiers, with tier 1 as the lowest tier and tier $K$ as the highest. Each tier, except the final one, consists of a subset of players $N_k \subseteq N$, where $n_k = |N_k|$ gives the number of players in each tier $k$. The number of rounds in each tier is given by $r_k$, and $m_k$ represents the number of players who progress to the next tier. Naturally, for every $k$, we assume $m_k < n_k$.
In every tier $k$ except the final one, the top $m_k$ players at the end of $r_k$ rounds advance to the next-higher tier (if there is one). Players are allocated to tiers as follows:
- i. Players are initially ranked based on their Elo ratings given in the sequence $E = (e_1, \dots, e_n)$.
- ii. Tier 1 consists of the lowest-rated players with Elo indices $1, \dots, n_1$. The top $m_1$ players in this tier advance to tier 2.
- iii. Tier 2 initially consists of $n_2 - m_1$ players, whose ranks are given by the indices $n_1 + 1, \dots, n_1 + n_2 - m_1$. The top $m_2$ players advance to tier 3.
- iv. For higher tiers except the final tier: Tier $k$ initially has $n_k - m_{k-1}$ players with Elo indices $\sum_{j=1}^{k-1} (n_j - m_{j-1}) + 1, \dots, \sum_{j=1}^{k} (n_j - m_{j-1})$, where $m_0 = 0$. The top $m_k$ players advance to tier $k + 1$.
- v. Final tier $K$: The tournament winner is the player with the highest Tournament Score in this tier.
Example 1.
To illustrate how players move up from an 8-player mini-tournament in a 20-player tournament, assume that the 2 players with the highest TS’s in Tier 1 move up to Tier 2, joining the 6 players with the next-highest Elo scores. (Like Tier 1, Tier 2 would now have 8 players.) At the conclusion of the Tier 2 mini-tournament, the 2 players with the highest TS’s in Tier 2 would move up to Tier 3, joining the 6 players with the highest Elo scores, so now Tier 3 also has 8 players. The Tier 3 players then play a third and final mini-tournament, the winner of which becomes the overall winner of the tournament.
Altogether, 8 + 6 + 6 = 20 different players participate in up to three rounds of play. In Tier 1, each of the 8 players plays against the 7 other players, and the 2 players with the highest TS’s move up to Tier 2. Next, each of the 8 Tier 2 players plays against the 7 other players in Tier 2, and the 2 players with the highest TS’s move up to Tier 3. These 8 players play a final mini-tournament, and the player with the highest TS wins the tournament.
Players from Tier 1 who move up to Tier 2 play a total of 7 + 7 = 14 games. If they succeed in Tier 2 and move up to Tier 3, they play a total of 7 + 7 + 7 = 21 games. But if a player is not one of the top two in either Tier 1 or Tier 2, he/she is knocked out of the final competition.
The 6 players with the highest Elo scores, who start in Tier 3, play only 7 games, whether they win or lose in this mini-tournament. Only the single player with the highest TS in Tier 3 wins the tournament. Thus, a player who starts out with a low or even moderate Elo rating, and therefore in a lower tier, must play more games, and perform well in them, than players who start out in the highest tier.
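The allocation in Example 1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `initial_tiers`, its parameters, and the player labels `P1` to `P20` are hypothetical, and players are assumed to be listed from lowest to highest Elo rating.

```python
from typing import List


def initial_tiers(players_by_elo: List[str], tier_sizes: List[int],
                  promoted: List[int]) -> List[List[str]]:
    """Split players (sorted from lowest to highest Elo) into starting tiers.

    tier_sizes[k] is the full size of tier k+1 once promoted players have joined;
    promoted[k] is how many players move up from tier k+1 to tier k+2.
    """
    if len(promoted) != len(tier_sizes) - 1:
        raise ValueError("need one promotion count per non-final tier")
    tiers, start, incoming = [], 0, 0
    for k, size in enumerate(tier_sizes):
        seats = size - incoming  # seats filled directly by Elo rating
        tiers.append(players_by_elo[start:start + seats])
        start += seats
        incoming = promoted[k] if k < len(promoted) else 0
    if start != len(players_by_elo):
        raise ValueError("tier sizes do not account for every player")
    return tiers


# Example 1: 20 players, three tiers of 8, two players promoted from each lower tier.
players = [f"P{i}" for i in range(1, 21)]  # P1 has the lowest Elo, P20 the highest
tiers = initial_tiers(players, tier_sizes=[8, 8, 8], promoted=[2, 2])
print([len(t) for t in tiers])  # [8, 6, 6]
```

The starting sizes 8, 6, and 6 match the count 8 + 6 + 6 = 20 in Example 1; the two empty seats in Tiers 2 and 3 are filled by the players promoted from the tier below.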
(ii) Scoring algorithm and tie-breaking
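In the mini-tournaments of each tier, each player i receives a Tier Score (TSiJ) against all his/her opponents in subset J of his/her tier (as we will illustrate shortly, this may not be all members of his/her tier), according to the following formula:

$$TS_{iJ} = \frac{W_{iJ} - L_{iJ}}{W_{iJ} + D_{iJ} + L_{iJ}},$$

where $W_{iJ}$, $D_{iJ}$, and $L_{iJ}$ are the numbers of games that i wins, draws, and loses against the opponents in J.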
TSiJ varies from −1 (for a player who loses all his/her games) to +1 (for a player who wins all his/her games). Note that TSiJ is based only on i’s wins, losses, and draws within a given tier. A player’s TS does not carry over from one tier to another.
TSiJ differs from Elo ratings in not being a dynamically changing measure of player i’s strength, which may go up, down, or stay the same after each of a player’s matches. Instead, TSiJ is a summary measure of i’s strength against his/her opponents in each of the mini-tournaments in which he/she plays (subscript J in TSiJ is the subset of these players). For example, a player could win a mini-tournament and move up to the next-higher tier, but in the latter mini-tournament he/she might not do so well and so be eliminated from the tournament.
Notice that the denominator of TSiJ includes draws as well as wins and losses; the greater the number of drawn games, the less the absolute value of TSiJ. If the numerator is positive, this lowers the value of TSiJ (more drawn games make TSiJ less positive). If the numerator is negative, this raises the value of TSiJ (more drawn games make TSiJ less negative). Thus, more drawn games hurt a winning player but help a losing player, as one would expect. If the numerator is 0, the number of drawn games is irrelevant.
Put another way, a win leaves a player with a higher TSiJ than a draw in the same game would, and a draw leaves a player with a higher TSiJ than a loss would. We think this is what a fair-scoring algorithm should do.
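The effect of draws on the score can be checked with a short sketch. It assumes, as the text describes, that the numerator is wins minus losses and that the denominator counts all games; the function name `tier_score` and the sample records are hypothetical.

```python
from fractions import Fraction


def tier_score(wins: int, draws: int, losses: int) -> Fraction:
    """(wins - losses) / (wins + draws + losses), returned as an exact fraction."""
    games = wins + draws + losses
    if games == 0:
        raise ValueError("player has no games in this tier")
    return Fraction(wins - losses, games)


# Same numerator (+2), with and without three extra draws.
print(tier_score(3, 0, 1), tier_score(3, 3, 1))   # 1/2 2/7
# Same numerator (-2), with and without three extra draws.
print(tier_score(1, 0, 3), tier_score(1, 3, 3))   # -1/2 -2/7
# Extremes: all wins and all losses.
print(tier_score(7, 0, 0), tier_score(0, 0, 7))   # 1 -1
```

Extra draws pull the score toward 0 from either side, which is the sense in which they hurt a winning player and help a losing one.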
But what if the number of players in a mini-tournament is too great for each player to play a game against every other player in his/her starting tier? This will be the case when the number of players in a mini-tournament is greater than about 10, in which case we assume that players play against only a proper subset of opponents. (One could add more tiers to solve this problem, but this may make the tournament unduly lengthy if each tier takes a few days to complete. Thus, if there are 10 players in a mini-tournament, and each player plays against 3 opponents a day, the mini-tournament will take three days to complete against the 9 opponents of each player.) Before addressing the question of choosing such a subset, we suggest four rules to break ties when two or more players have the same TSiJ’s at the end of a mini-tournament.
When two players, A and B, have identical TSiJ’s after play concludes in any tier, we propose the following rules to break ties:
- (i) If A and B play each other in a mini-tournament and A defeats B, A is ranked above B. If they draw or do not play a game against each other, go to step (ii).
- (ii) If A wins more games than B, A is ranked above B. (Note: A may also lose more games than B, but if B has more draws, it is possible that A and B may have the same TSiJ.)
- (iii) If A and B win the same number of games but A takes an average of fewer moves to defeat his/her opponents, A is ranked above B.
- (iv) If (i), (ii), and (iii), applied in that order, all fail to break a tie in TSiJ’s between A and B, and the tie must be broken to determine which players go on to the next-higher tier or which player wins the tournament, use a random device to break the tie (e.g., by tossing a coin).
We think (i), (ii), and (iii) will break almost all ties between two players in a mini-tournament. (Tie-breaking rule (iii) is perhaps least transparent. But we think defeating an opponent more quickly is consistent with being a stronger player.) These tie-breaking rules, except for (iv), are consistent with the better performance of A or B. They can be extended to break ties among more than two players, and thereby to determine, if necessary, which player or players remain in the competition, move to a higher tier, or win the final mini-tournament and, therefore, the tournament.
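The four tie-breaking rules can be sketched for two players as follows. The `Record` structure, its field names, and the sample players are hypothetical; the sketch assumes that A and B meet at most once in a mini-tournament and that their Tier Scores have already been found to be equal.

```python
import random
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Record:
    """Hypothetical per-player summary of one mini-tournament."""
    name: str
    wins: int
    # Result against each opponent actually played: "win", "draw", or "loss".
    results: Dict[str, str] = field(default_factory=dict)
    # Number of moves in each game this player won.
    moves_in_wins: List[int] = field(default_factory=list)


def average_winning_moves(r: Record) -> Optional[float]:
    """Mean length of won games, or None if the player won no games."""
    return sum(r.moves_in_wins) / len(r.moves_in_wins) if r.moves_in_wins else None


def break_tie(a: Record, b: Record, rng: Optional[random.Random] = None) -> Record:
    """Return the higher-ranked of two players whose Tier Scores are equal."""
    # (i) Head-to-head result, if the two played and the game was decisive.
    head_to_head = a.results.get(b.name)
    if head_to_head == "win":
        return a
    if head_to_head == "loss":
        return b
    # (ii) More wins.
    if a.wins != b.wins:
        return a if a.wins > b.wins else b
    # (iii) Fewer moves, on average, in won games.
    avg_a, avg_b = average_winning_moves(a), average_winning_moves(b)
    if avg_a is not None and avg_b is not None and avg_a != avg_b:
        return a if avg_a < avg_b else b
    # (iv) Random device.
    return (rng or random).choice([a, b])


# A (3 wins, 3 draws, 1 loss) and B (2 wins, 5 draws) both score 2/7 and drew each other.
a = Record("A", wins=3, results={"B": "draw"}, moves_in_wins=[40, 35, 51])
b = Record("B", wins=2, results={"A": "draw"}, moves_in_wins=[30, 32])
print(break_tie(a, b).name)  # A, by rule (ii)
```

Extending this to ties among three or more players would require a further convention (for example, applying the rules pairwise), which the sketch does not attempt.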
(iii) Subset-construction algorithm
In rare cases in which there are too many players in a tier for each player to play against all other players in that tier, we need a way to find subsets of players whose average Elo rating is approximately the same. Thereby the level of competition that every player faces in his/her subset will be approximately the same as the level he/she would face in any other subset of opponents.
Example 2.
To illustrate how to construct equally competitive subsets, assume a 2-tier system, but now with 30 players in the lowest tier (Tier 1). For each player to play against his/her 29 opponents would take an inordinate amount of time. Instead, we propose that the 30 players be split into three subsets of 10 players each in such a way that each subset has about the same average Elo score.
To illustrate how to do this, assume that the top three players in Tier 1 (i.e., the winners of each of the three mini-tournaments in this tier) join 7 players in Tier 2 in a second and final mini-tournament comprising 7 + 3 = 10 players. Altogether, there are 37 players:
- Tier 1 winners: 3 winners of each of the Tier 1 subsets, who play 9 + 9 = 18 games in Tier 1 and Tier 2.
- Tier 1 losers: 30–3 = 27 players who play 9 games and are not the winners of their Tier 1 subsets.
- Overall winner: Winner of the Tier 2 mini-tournament, who may be either an original Tier 2 player, who plays 9 games, or one of the Tier 1 winners, who plays 9 + 9 = 18 games.
Each player in Tier 1 has the same opportunity to win in this tier and move up to Tier 2, whichever subset he/she is placed in, according to the following calculations for determining the composition of each of the three Tier 1 subsets of 10 players each:
- For the $\binom{30}{10} = 30{,}045{,}015$ combinations (subsets) of 10 players each, determine the single subset whose 10 members come closest to having the same average Elo score as that of all 30 Tier 1 players (ties can be broken randomly). (To be sure, about 30 million is a large number of subsets, but checking all of them is not beyond the ability of present-day computers.)
- With this subset of 10 players determined, for the $\binom{20}{10} = 184{,}756$ combinations (subsets) of the remaining 20 players, identify the single subset whose 10 members come closest to having the same average Elo score as all 30 Tier 1 players.
- Because the first two subsets each have about the same average Elo score as all 30 Tier 1 players, the last remaining subset of 10 players must also have about the same average Elo score.
Thereby we identify three Tier 1 subsets of 10 players that each have approximately the same average rating. Hence, our matching algorithm precludes any Tier 1 player from feeling discriminated against because he/she faces a higher level of competition in his/her 9 matches than would occur if he/she were in a different Tier 1 subset. (This is a slight exaggeration, illustrated by the subset in which the player (say, A) with the highest Tier 1 Elo score appears. A will face a slightly lower average level of competition than any player in the other two 10-member subsets, because A’s high Elo score pushes down the average score of the other 9 members of his/her subset. But if A has a bit more of a competitive advantage in his/her subset than players with lower Elo scores do in other subsets, this seems appropriate because A is the top-rated player in his/her subset, just as Tier 2 players enjoy an advantage by immediately being placed in Tier 2.)
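The subset construction of Example 2 can be sketched as a greedy search over combinations. The function name `balanced_subsets` and the six-player toy ratings are hypothetical, and the sketch departs from the text in one respect: ties are broken by taking the first best subset found rather than randomly. It also assumes the number of players is a multiple of the subset size.

```python
from itertools import combinations
from math import comb
from typing import Dict, List, Tuple


def balanced_subsets(elo: Dict[str, float], size: int) -> List[Tuple[str, ...]]:
    """Greedily peel off subsets whose mean Elo is closest to the overall mean."""
    target = sum(elo.values()) / len(elo)  # mean over all players in the tier
    remaining = sorted(elo)                # fixed order makes ties deterministic
    subsets = []
    while len(remaining) > size:
        best = min(
            combinations(remaining, size),
            key=lambda s: abs(sum(elo[p] for p in s) / size - target),
        )
        subsets.append(best)
        remaining = [p for p in remaining if p not in best]
    subsets.append(tuple(remaining))  # whatever is left forms the last subset
    return subsets


# Number of candidate subsets examined in Example 2 (30 players, subsets of 10).
print(comb(30, 10), comb(20, 10))  # 30045015 184756

# Toy tier: six players split into three subsets of two.
toy = {"a": 2700, "b": 2710, "c": 2720, "d": 2730, "e": 2740, "f": 2750}
print(balanced_subsets(toy, 2))  # [('a', 'f'), ('b', 'e'), ('c', 'd')]
```

In the toy tier every subset has exactly the overall mean of 2725. With 30 players the same code would examine about 30 million subsets in its first pass, which is slow in pure Python but still feasible.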
Recall that we did not need the subset-selection algorithm in Example 1, because in each of the three tiers there were only 8 players, so the players in each tier could play against all others in the tier. In Example 2, this was not true for the 30 players in Tier 1, so we showed how to construct three homogeneous subsets of 10 players each that were equally competitive. Elo scores of players were used for this purpose, and they could be used again if too many players remain in contention in higher tiers to permit each player to play against every other.
But it is TS’s that determine who advances in a tournament. We believe that Elo scores should be used only to determine who, initially, gets placed in what tiers and, if the numbers in a higher-level tier are too great to permit each player to play against every other, to construct equally competitive subsets in that tier. In summary, advancement from these subsets depends only on TS’s, not Elo scores.
3. Practical concerns in chess tournaments
(i) Color imbalance
Perfect color fairness—whereby each player plays an equal number of games as Black and as White—is not always achievable, especially in tournaments with an odd number of rounds.
Even in tournaments with an even number of rounds, color fairness is not always satisfiable. For instance, in the 2023 FIDE Grand Swiss, which is one of the most prestigious Swiss chess tournaments, 9 of the top 10 finishers played one more game as White than as Black (6 vs. 5) (https://chess-results.com/tnr793016.aspx). A similar pattern was observed in the 2023 FIDE Women’s Grand Swiss tournament (https://chess-results.com/tnr793017.aspx).
To see the effects of color imbalance on the outcomes in top-level Swiss tournaments, we analyzed the Grand Swiss and Grand Prix events organized by FIDE between 2017 and 2023. The tournaments include Grand Swiss tournaments in 2023, 2021 (https://chess-results.com/tnr587230.aspx), and 2019 (https://chess-results.com/tnr478041.aspx) and Grand Prix tournaments in Sharjah (https://chess-results.com/tnr263691.aspx), Moscow (https://chess-results.com/tnr280762.aspx), Geneva (https://chess-results.com/tnr288645.aspx), and Palma (https://chess-results.com/tnr307271.aspx) in 2017. We excluded previous Grand Prix tournaments that used the knockout format. A follow-up study by Csato [3], focusing on the three Grand Swiss tournaments mentioned above, found that playing an extra game as White has a significant positive effect on the points this player scored.
These tournaments are of utmost importance in order to qualify for the Candidates Tournament, which determines the challenger for the world championship. We find a significant bias favoring the players who had an extra White game. Of the seven players who qualified for the Candidates Tournament from these events, six (86%) played at least one more game as White. Moreover, 66% of the top 10 finishers in all tournaments had an extra White game. These findings are consistent with White’s historically higher win percentage in elite events, as reported in Brams and Ismail [4].
(ii) The Swiss Gambit
Another concern of Swiss tournaments has been the varying strength of players, based on their Elo scores, in matches. In some cases, losing in the first-round game can lead to weaker opponents in subsequent rounds. The winner of the FIDE Grand Swiss 2023 not only had an additional game as White but also faced the weakest average opposition among the top five players, in part due to losing his first-round game (https://chess-results.com/tnr793016.aspx). Although the eventual winner’s performance was undoubtedly exceptional, such small advantages can add up to influence the selection of the top two players, who qualify for the Candidates Tournament.
To what extent are Multi-Tier Tournaments vulnerable to manipulation? For example, can a player benefit by deliberately losing or drawing against an opponent in order to be paired against weaker opponents in later rounds? Such an attempt at selective pairing is known as the “Swiss Gambit,” perhaps because it was perfected in Swiss tournaments.
In Multi-Tier Tournaments, because it is always preferable to have a greater Elo score in order to enter a Multi-Tier Tournament at a higher level, players have no incentive to lower their Elo scores by losing or drawing. Neither do they have an incentive to deliberately lose in a mini-tournament, because this lowers their TS’s and hurts their chances of moving to a higher tier and, possibly, winning the tournament.
(iii) Early withdrawal
Another form of manipulation is for a player to withdraw from a tournament after pairings are announced to avoid playing against a player whom he/she thinks might beat or draw against him/her. This tactic is obviated by the difficulty of being able to anticipate one’s opponents, even in Tier 1, because this may depend on the subset to which the matching algorithm assigns each player.
One way of countering such tactics is to impose hefty penalties for withdrawing in the middle of a tournament. But fines alone might not deter a top player like Leinier Dominguez, who in 2023 quit after two draws in an open tournament in Spain, saying that “I’ve come to a point where I’m simply risking too much if I continue [and lose or draw against a lower-rated player]” (https://www.chess.com/news/view/dominguez-quits-sitges-wesley-so-firouzja-candidates).
In a Multi-Tier Tournament, a highly rated player like Dominguez is assured that he will start against players mostly at his level. However, there may still be uncertainty about Dominguez’s specific opponents if subsets are chosen in the manner we described, rendering uncertain whom he will play against if there is more than one subset in his tier. Nevertheless, he can rest assured that his opponents will be at roughly his level or have proved themselves by advancing to a high level in the tournament.
(iv) Announcement of pairings
Another issue is the timing of matches. Under the Swiss system, matches for each round cannot be finalized until all games in the current round are completed. Because chess games are often lengthy, players may remain uninformed about their next opponents until late the evening before they play.
This uncertainty impacts players’ ability to prepare. Players who can afford extensive support teams, including seconds and coaches, are at an advantage compared to those who must manage their preparation themselves. (In chess, a second is a strong player assisting a player mainly during a tournament, focusing on analyzing opponents, preparing opening strategies, and offering psychological support. A coach, on the other hand, is involved in long-term training, working on a player’s overall game, teaching new concepts, and improving a player’s tactical and strategic understanding.) Furthermore, this lack of early pairing information hinders television and online coverage, because broadcasters are unable to plan in advance for broadcasting key matchups that can attract fan viewership.
Our approach offers the following benefits:
- Perfect or near-perfect White-Black color balance: If there is an even number of rounds, all players play Black and White the same number of times; if an odd number, all players play one color no more than one more round than the other.
- Grouping of players into tiers with the most similar Elo ratings.
- Prevention of two consecutive games with the same color.
We think our rules for matching and scoring players ensure a fair and comprehensive assessment of the players’ performance in each mini-tournament, independent of the players’ Elo ratings in their tier. They ensure that almost all players can be strictly ranked, even if their TSiJ’s are the same, so there will be almost no doubt about who the few players are who move up to compete at the next-higher tier. Similarly, there will almost always be a single winner of the last mini-tournament and, therefore, of the entire tournament.
4. Application of multi-tier tournaments to top 20 chess players
In Table 1, we list the top 20 active chess players, based on their Elo ratings as of February 2024. As in Example 1, we divide players into three tiers: Initially, Tier 1 comprises the 8 lowest-rated players, at ranks 13–20; Tier 2 comprises the 6 players at ranks 7–12; Tier 3 comprises the 6 highest-rated players, at ranks 1–6. Later, the two highest scorers in Tiers 1 and 2 move up, respectively, to Tiers 2 and 3, which then each have 8 members who compete against each other.
Like Example 1, there is no need to construct fair subsets of players, because each of the 8 players in each tier can play against every one of the 7 other players in its tier to determine which 2 players move up to a higher tier. But unlike Example 1, the number of games played by each member of a tier against others in it is not 1, as in Multi-Tier Tournaments, but may vary radically, from 0 (there are seven pairs of players who play no games against each other) to 71, as in the case of the Carlsen (#1) – Aronian (#19) pair (Carlsen won 12, lost 8, and drew 51 games against Aronian).
It was not appropriate to calculate, for each tier, TSiJ of player i against all other players J in his/her tier, because this measure is too much influenced by the outlier pairs in a tier, especially those pairs which happen to play many games against each other (perhaps because they are geographically proximate or are invited to, and choose to participate in, many tournaments). To remedy the problem of combining the wins, draws, and losses of such pairs with those pairs playing far fewer games, we normalize TS.
For every two players, A and B, in a tier, we calculate A’s Normalized Tier Score (NTS) against B in his/her tier as follows:

$$NTS_{AB} = \frac{W_{AB} - L_{AB}}{W_{AB} + D_{AB} + L_{AB}},$$

where $W_{AB}$, $D_{AB}$, and $L_{AB}$ are the numbers of games that A won, drew, and lost against B, with $NTS_{AB} = 0$ if A and B did not play any games. We illustrate the aggregation problem that would have arisen if we had used the formula for TS in section 2, TSiJ (where subscript J indicates all opponents of i). Assume A plays a large number of games against B but many fewer games against the other 6 players, C, D, E, F, G, and H. In that case, the AB pair would have had an unduly large effect when we sum all of A’s wins, draws, and losses against his/her opponents, including B, using TSiJ.
But if we first normalize values of TS between −1 and +1 for each of A’s opponents, as $NTS_{AB}$ does, their summation does not give unduly large weight to the AB pair. Then, taking the average of the NTSs, we obtain a summary measure of how, hypothetically, A would do against all his/her opponents in his/her tier, but now based on the varying numbers of games A has played against his/her opponents over 20 years (see next paragraph), instead of just one game in a single tournament that never occurred.
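The difference between pooling all games and averaging per-opponent scores can be seen in a small sketch. The function names are hypothetical, and so are the records against C and D; the record against B reuses the 12 wins, 51 draws, and 8 losses quoted above for the Carlsen–Aronian pair. The sketch assumes that an opponent never played contributes a score of 0 to the average.

```python
from typing import Dict, Tuple

Record = Tuple[int, int, int]  # (wins, draws, losses) of A against one opponent


def pair_score(record: Record) -> float:
    """Score against one opponent, in [-1, +1]; 0 if no games were played."""
    wins, draws, losses = record
    games = wins + draws + losses
    return 0.0 if games == 0 else (wins - losses) / games


def normalized_tier_score(records: Dict[str, Record]) -> float:
    """Average of the per-opponent scores, so every opponent counts equally."""
    return sum(pair_score(r) for r in records.values()) / len(records)


def pooled_tier_score(records: Dict[str, Record]) -> float:
    """Single score over all games pooled together; frequent pairings dominate."""
    wins = sum(r[0] for r in records.values())
    draws = sum(r[1] for r in records.values())
    losses = sum(r[2] for r in records.values())
    return (wins - losses) / (wins + draws + losses)


# Many games against B, only three each against the hypothetical opponents C and D.
a_records = {"B": (12, 51, 8), "C": (0, 1, 2), "D": (0, 2, 1)}
pooled = pooled_tier_score(a_records)
normalized = normalized_tier_score(a_records)
print(round(pooled, 3), round(normalized, 3))
```

In this toy case the pooled score is slightly positive because the 71 games against B swamp the six games against C and D, whereas the normalized average is negative because A has a losing record against two of the three opponents.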
We used head-to-head data from 1209 games played between May 2004 and February 2024, excluding so-called rapid and blitz games, which require somewhat different skills. The data are available from a website that archives historical chess results (https://www.chessgames.com). Each matchup is based on an average of 14.4 games, with Tier 1 players averaging 6.6 games against each other, Tier 2 players averaging 10.1 games against each other, and Tier 3 players averaging 26.4 games against each other. Higher-ranked players tend to be older and are invited to more tournaments. (These are the average figures for each tier before winners from the lower tiers are added to the higher tiers.)
As Table 2 shows, MVL and Aronian have the greatest TS’s in both Tiers 1 and 2. Hence, they advance to Tier 3. Consistent with Carlsen’s dominance in chess for over ten years, he is the clear winner in this Multi-Tier Tournament, with an average TS of 0.22 in Tier 3. Nakamura and Caruana take second and third places, respectively. Notably, despite starting the competition in the lowest tier, MVL and Aronian perform better than three Tier 3 players.
5. Other approaches to the design of fair tournaments and limitations
FIDE has approved several versions of Swiss tournaments, including the Dutch, Burstein, Lim, and Dubov systems. These systems introduce changes to either the first-round pairing or later-round pairings under the Swiss system. For example, the Dubov system uses the average rating of the opponents to create pairings within the same score groups. However, this system still relies on the current score to match players and, unlike our system, it does not necessarily improve the overall average of opponents’ ratings or other desired aspects of matches (for a detailed overview, see Held [5]). The most commonly used version is, in fact, the Dutch system, which matches players based on Elo ratings for the first round and then uses the players’ updated scores for following rounds. For instance, in a six-player tournament with players ranked from 1 to 6, with 1 being the highest-rated, the first-round pairing would be 1–4, 2–5, and 3–6 according to the Dutch system.
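The six-player example can be reproduced with a short sketch that pairs the top half of the rating list against the bottom half. This is an illustration only: the function name `first_round_pairs` is hypothetical, and the actual FIDE rules also handle byes, color allocation, and later rounds, none of which are modeled here.

```python
def first_round_pairs(ranked_players):
    """Pair the i-th player of the top half with the i-th player of the bottom half.

    `ranked_players` must be sorted from highest-rated to lowest-rated.
    Simplifying assumption: an even number of players (no byes).
    """
    if len(ranked_players) % 2 != 0:
        raise ValueError("this sketch assumes an even number of players")
    half = len(ranked_players) // 2
    return list(zip(ranked_players[:half], ranked_players[half:]))

print(first_round_pairs([1, 2, 3, 4, 5, 6]))  # [(1, 4), (2, 5), (3, 6)]
```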
While sharing similarities with league systems often used in sports like soccer, Multi-Tier Tournaments differ in several key aspects. First, unlike leagues, which run simultaneously over extended periods, Multi-Tier Tournaments progress sequentially within a single time frame so that the players in the lowest tier have an opportunity to win the entire event. Second, leagues use a single or double round-robin format, whereas Multi-Tier Tournaments, similar to Swiss tournaments, are mainly designed for tournaments with a large number of players and a small number of rounds.
Recently, UEFA updated the format for the Champions League. Similar to both Swiss and Multi-Tier Tournaments, teams play fewer games than in a round-robin, and like Multi-Tier Tournaments, the format incorporates a degree of fairness by having teams play half their matches at home and half away. However, there are also key differences between this format and Multi-Tier Tournaments. Multi-Tier Tournaments use cardinal Elo ratings for fairer pairing allocation, whereas UEFA uses a complex set of criteria, including random allocation and ordinal rankings, to determine pairings. In addition, Multi-Tier Tournaments do not include a knockout component, whereas the UEFA Champions League does.
Various metrics to evaluate tournament success have been explored in the literature. Scarf et al. [6] analyzed UEFA Champions League formats via simulations and found the round-robin to be the fairest in aligning pre- and post-tournament rankings. Similarly, Appleton [7] and Sziklai et al. [8] conducted simulations to assess different tournament formats, focusing on the win probability of the best teams and players. Key contributions in this field also include Glenn [9] and Searls [10], who both focus on the probability of the best players and teams winning in different tournament designs. As for Swiss tournaments, Olafsson [11] explored algorithmic properties of different tournament designs, whereas Fuhrlich et al. [12] and Sauer et al. [13] proposed a Swiss-based pairing algorithm that produced improvements in properties like color fairness over FIDE’s Swiss systems. For a review of the tournament design literature, see the recent survey by Devriesere et al. [14].
Tournament ranking systems have been theoretically explored by several researchers. Rubinstein [15] axiomatized the standard points-based system, Csato [16] studied Swiss system chess team tournaments, and Arlegi and Dimitrov [17] examined fairness in knockout tournaments.
Tiebreaking in tournaments typically follows one of two approaches. One involves a final contest to resolve ties, as seen in tennis tiebreakers, soccer penalty shootouts, and some American football overtime games. These methods, and their fairness, have been the subject of several studies. Che and Hendershott [18], Brams and Sanderson [19], and Granot and Gerchak [20] focus on bidding to determine which team picks a favored option. Additionally, studies by, for example, Apesteguia and Palacios-Huerta [21], Brams and Ismail [22], Brams et al. [23], Cohen-Zada et al. [24], Anbarci et al. [25], and Lambers and Spieksma [26] focus on determining a fairer ordering of play in tiebreak contests. (For further empirical analysis of the first-mover advantage in penalty shootouts, see Kocher et al. [27], Arrondel et al. [28], Rudi et al. [29], and Kassis et al. [30].) The second approach analyzes different rules for tiebreaking in Swiss chess tournaments [31] and the FIFA World Cup group stage [32]. For an overview of tiebreaking methods, see [33; Chapter 1.3].
The concept of strategyproofness, or incentive compatibility, in tournament rules has also been a subject of interest. Selected contributions in this area include works by Pauly [34], Kendall and Lenten [35], Brams et al. [23], Dagaev and Sonin [36], Csato [33,37], Anbarci et al. [25], and Guyon [38].
Finally, the literature on the maximum number of consecutive games that can be played as White or Black in tournaments dates back to De Werra [39], who studied round-robin soccer tournaments. In this context, playing at home is similar to playing as White, and playing away is similar to playing as Black. For a review of the relevant results, see Goossens and Spieksma [40].
5.1. Limitations: Complexity considerations, tie-breaking systems, and outcome structure
The format of Multi-Tier Tournaments requires partitioning players into tiers based on their Elo ratings. This is a computationally simple task. In many top-level tournaments, it is unnecessary to further subdivide tiers.
For example, in our main application, we assumed a tournament with 20 players divided into three tiers. A round-robin format was used within each tier, each consisting of eight competitors. However, in rare cases, such as when a single tier includes too many players (e.g., 30), we suggest dividing the tier into further fair subsets with approximately equal average Elo ratings. While there are about 30 million possible ways to form such balanced subsets with 30 players, this is still within the capabilities of modern computers. That said, we do not recommend creating overly large tiers and suggest, instead, increasing the number of tiers when needed. The number of ways to divide n players into k subsets grows exponentially with n, which makes brute-force computation infeasible in tournaments with hundreds of participants. In such cases, approximate methods, such as random draws to form groups of roughly equal strength, can be used to create near-balanced subsets. Moreover, Makur and Singh [41] recently found that a group size of nine in Large Language Model competitions leads to more accurate modeling.
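One way to read the random-draw suggestion is as a simple search: draw many random partitions and keep the one whose subsets have the closest average ratings. The sketch below is an assumption-laden illustration, not a procedure specified in this paper; the function name `balanced_split`, the number of draws, the seed, and the ratings are all hypothetical.

```python
import random
from statistics import mean

def balanced_split(ratings, k, draws=2000, seed=0):
    """Return (groups, spread) for the best of `draws` random partitions.

    `ratings` maps player name -> Elo rating. `spread` is the gap between the
    highest and lowest group average; smaller means better balanced.
    This is a heuristic: it does not guarantee the optimal partition.
    """
    if not 1 <= k <= len(ratings):
        raise ValueError("k must be between 1 and the number of players")
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    names = list(ratings)
    best_groups, best_spread = None, float("inf")
    for _ in range(draws):
        rng.shuffle(names)
        # Deal the shuffled players into k groups of near-equal size.
        groups = [names[i::k] for i in range(k)]
        averages = [mean(ratings[p] for p in group) for group in groups]
        spread = max(averages) - min(averages)
        if spread < best_spread:
            best_groups, best_spread = groups, spread
    return best_groups, best_spread

# Invented ratings for a 30-player tier, split into 3 subsets of 10.
ratings = {f"P{i}": 2600 + 5 * i for i in range(30)}
groups, spread = balanced_split(ratings, k=3)
print([len(g) for g in groups], round(spread, 1))
```

More draws can only keep or improve the best spread found, at a cost that grows linearly with the number of draws rather than exponentially with the number of players.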
In our chess-based application discussed above, players in the top tier each play seven games, which is generally seen as the minimum needed to determine a winner in elite competitions. While it would be desirable from an organizational perspective for top players to play even more games, doing so might reduce the appeal of the format from the players’ perspective. In chess, for example, top players often avoid open tournaments partly because playing games against lower-rated opponents offers them little opportunity to increase their ratings but involves significant risk in case of a draw or loss. In physical sports, while rating systems may not be a concern, it is still desirable to ensure that the strongest teams compete against each other, as this increases spectator interest.
While our empirical application focused on chess, the format of Multi-Tier Tournaments can be extended to other sports such as soccer or basketball, where scoring and tie-breaking rules differ. For instance, in soccer, the win-tie-loss point system is 3-1-0 (compared to 1-0.5-0 in chess), and tie-breakers may include goal differences (not just wins, losses, or ties). In most North American sports, including basketball, there are no ties, and the win-loss point system is 1-0. As a result, the Tournament Score (TS) formula may require sport-specific modifications, such as adjusting the weight assigned to ties in the denominator of TS.
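To see why the weight given to ties matters, the two point systems mentioned above can be compared on the same records. This is a minimal sketch with invented records; it computes plain tournament points only and is not the TS formula, and the names `points`, `CHESS`, and `SOCCER` are hypothetical.

```python
# Point schemes as (win, tie, loss) values, taken from the text above.
CHESS = (1.0, 0.5, 0.0)
SOCCER = (3, 1, 0)

def points(wins, ties, losses, scheme):
    """Total points for a record under a (win, tie, loss) point scheme."""
    win_pts, tie_pts, loss_pts = scheme
    return wins * win_pts + ties * tie_pts + losses * loss_pts

# Invented records over five games: (wins, ties, losses).
record_x = (3, 0, 2)  # decisive results only
record_y = (1, 4, 0)  # mostly ties, no losses

for scheme in (CHESS, SOCCER):
    print(points(*record_x, scheme), points(*record_y, scheme))
```

Under the chess scheme both records earn 3.0 points, whereas under the soccer scheme they earn 9 and 7 points, so a ranking that is tied in one sport is not tied in the other.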
As is standard practice when introducing a new format in competitive sports, further research and pilot testing would be desirable. In sports other than chess, some modifications, such as incorporating specific tie-breaking mechanisms, adapting scoring systems, and addressing physical constraints, may be necessary. For example, applying Multi-Tier Tournaments to the FIFA World Cup would naturally extend the tournament’s length due to longer recovery times between matches, unlike in chess, where several rounds can be played daily, depending on the format. However, it is still desirable that the players or teams with the best records, based on their TS’s, play abbreviated Multi-Tier Tournaments rather than knockout tournaments, which inordinately depend on performance against only one other competitor.
6. Conclusions
In contrast to Swiss tournaments, Multi-Tier Tournaments prioritize determining the pairings as fairly as possible prior to the start of games in a tier. In doing so, Multi-Tier Tournaments minimize, and eliminate whenever possible, biases related to the number of games played as White and Black (color fairness) and the number of consecutive games with each color (psychological fairness).
We showed how Multi-Tier Tournaments group players into tiers, and even within tiers, fairly. They also ensure that lower-rated players are afforded the opportunity to advance through the tiers, based on their performance. At the same time, they give a break to higher-rated players by allowing them to advance by playing fewer games.
Compared to knockout tournaments, where a single loss results in elimination, Multi-Tier Tournaments allow for a player’s loss in one game to be countered by his/her winning in subsequent games. Although our application focused on chess, Multi-Tier Tournaments are applicable to other games and sports where Elo or other scoring or rating methods are used, including tennis, soccer, and the big three of American sports—baseball, basketball, and football.
In fact, one may consider a season of play in a sport like baseball a tournament of sorts, wherein teams play about the same number of games against teams in their league over the course of a season. Their records in these leagues determine which teams go into the playoffs, which might be considered mini-tournaments themselves, requiring winning, for example, 2 out of 3, 3 out of 5, or 4 out of 7 games.
In most sports leagues, the number of games that teams play over a regular season does not depend on how they are rated (it is different leagues that reflect different levels of skill). Only in the playoffs are the numbers narrowed down, as in Multi-Tier Tournaments, when the best-performing teams or players ascend to higher tiers as poorer teams are eliminated. Multi-Tier Tournaments show how this competition can be structured in a fair and systematic way.
References
- 1. Elo AE. The Rating of Chess Players, Past and Present. New York: Arco Publishing; 1978.
- 2. FIDE. FIDE Handbook. [cited 2025 June]. handbook.fide.com/chapter/C0401202507
- 3. Csato L. Most Swiss-system tournaments are unfair: Evidence from chess. arXiv preprint. 2024.
- 4. Brams SJ, Ismail MS. Fairer chess: a reversal of two opening moves in chess creates balance between White and Black. In: 2021 IEEE Conference on Games (CoG), 1–4. IEEE; 2021.
- 5. Held M. Swiss Dubov and FIDE Swiss (Dutch): A Comparison Between Swiss Pairing Systems. 2020. [cited 2023 Nov 24]. Available from: https://spp.fide.com/wp-content/uploads/2020/11/Dubov-vs-FIDE-Swiss.pdf
- 6. Scarf P, Mat Yusof M, Bilbao M. A numerical study of designs for sporting contests. Eur J Operat Res. 2009;198(1):190–8.
- 7. Appleton DR. May the best man win? J R Stat Soc Series D: Statist. 1995;44(4):529–38.
- 8. Sziklai BR, Biró P, Csató L. The efficacy of tournament designs. Comput Operat Res. 2022;144:105821.
- 9. Glenn WA. A comparison of the effectiveness of tournaments. Biometrika. 1960;47(3/4):253–62.
- 10. Searls DT. On the probability of winning with different tournament procedures. J Am Stat Assoc. 1963;58(304):1064–81.
- 11. Olafsson S. Weighted matching in chess tournaments. J Operat Res Soc. 1990;41(1):17–24.
- 12. Führlich P, Cseh Á, Lenzner P. Improving ranking quality and fairness in Swiss-system chess tournaments. In: Proceedings of the 23rd ACM Conference on Economics and Computation, 1101–1102. EC’22. Boulder, CO, USA: Association for Computing Machinery; 2022.
- 13. Sauer P, Cseh Á, Lenzner P. Improving ranking quality and fairness in Swiss-system chess tournaments. J Quant Anal Sports. 2024;20(2):127–46.
- 14. Devriesere K, Csató L, Goossens D. Tournament design: A review from an operational research perspective. arXiv preprint. 2024. https://arxiv.org/abs/2404.05034
- 15. Rubinstein A. Ranking the participants in a tournament. SIAM J Appl Math. 1980;38(1):108–11.
- 16. Csato L. On the ranking of a Swiss system chess team tournament. Ann Operat Res. 2017;254(1–2):17–36.
- 17. Arlegi R, Dimitrov D. Fair elimination-type competitions. Eur J Operat Res. 2020;287(2):528–35.
- 18. Che Y-K, Hendershott T. How to divide the possession of a football? Econ Lett. 2008;99(3):561–5.
- 19. Brams SJ, Sanderson ZN. Why you shouldn’t use a toss for overtime. Plus Magazine; 2013. https://plus.maths.org/content/toss-overtime
- 20. Granot D, Gerchak Y. An auction with positive externality and possible application to overtime rules in football, soccer, and chess. Operat Res Lett. 2014;42(1):12–5.
- 21. Apesteguia J, Palacios-Huerta I. Psychological pressure in competitive environments: evidence from a randomized natural experiment. Am Econ Rev. 2010;100(5):2548–64.
- 22. Brams SJ, Ismail MS. Making the Rules of Sports Fairer. SIAM Rev. 2018;60(1):181–202.
- 23. Brams SJ, Ismail MS, Kilgour DM, Stromquist W. Catch-Up: A Rule That Makes Service Sports More Competitive. Am Math Mthly. 2018;125(9):771–96.
- 24. Cohen-Zada D, Krumer A, Shapir OM. Testing the effect of serve order in tennis tiebreak. J Econ Behav Organ. 2018;146:106–15.
- 25. Anbarci N, Sun C-J, Unver MU. Designing practical and fair sequential team contests: the case of penalty shootouts. Games Econ Behav. 2021;130:25–43.
- 26. Lambers R, Spieksma FCR. A mathematical analysis of fairness in shootouts. IMA J Manag Math. 2021;32(4):411–24.
- 27. Kocher MG, Lenz MV, Sutter M. Psychological pressure in competitive environments: new evidence from randomized natural experiments. Manag Sci. 2012;58(8):1585–91.
- 28. Arrondel L, Duhautois R, Laslier JF. Decision under psychological pressure: The shooter’s anxiety at the penalty kick. J Econ Psychol. 2019;70:22–35.
- 29. Rudi N, Olivares M, Shetty A. Ordering sequential competitions to reduce order relevance: Soccer penalty shootouts. PLoS One. 2020;15(12):e0243786. pmid:33378400
- 30. Kassis M, Schmidt SL, Schreyer D, Sutter M. Psychological pressure and the right to determine the moves in dynamic tournaments – evidence from a natural field experiment. Games Econ Behav. 2021;126(3):771–96.
- 31. Anbarci N, Ismail MS. AI-powered mechanisms as judges: breaking ties in chess. PLoS One. 2024;19(11):e0305905. pmid:39485752
- 32. Csato L. How to avoid uncompetitive games? The importance of tie-breaking rules. Eur J Operat Res. 2023;307(3):1260–9.
- 33. Csato L. Tournament design: how operations research can improve sports rules. Springer Nature; 2021.
- 34. Pauly M. Can strategizing in round-robin subtournaments be avoided? Soc Choice Welf. 2013;43(1):29–46.
- 35. Kendall G, Lenten LJA. When sports rules go awry. Eur J Operat Res. 2017;257(2):377–94.
- 36. Dagaev D, Sonin K. Winning by losing: incentive incompatibility in multiple qualifiers. J Sports Econ. 2018;19(8):1122–46.
- 37. Csato L. UEFA Champions League entry has not satisfied strategyproofness in three seasons. J Sports Econ. 2019;20(7):975–81.
- 38. Guyon J. ‘Choose your opponent’: A new knockout design for hybrid tournaments. J Sports Anal. 2022;8(1):9–29.
- 39. De Werra D. Scheduling in sports. In: Hansen P, editor. Studies on Graphs and Discrete Programming, volume 11 of Annals of Discrete Mathematics. Amsterdam, The Netherlands: North-Holland; 1981. pp. 381–95.
- 40. Goossens DR, Spieksma FCR. Soccer schedules in Europe: an overview. J Sched. 2011;15(5):641–51.
- 41. Makur A, Singh J. Hypothesis Testing for Generalized Thurstone Models. International Conference on Machine Learning. 2025.