How Many Political Parties Should Brazil Have? A Data-Driven Method to Assess and Reduce Fragmentation in Multi-Party Political Systems

In June 2013, Brazil faced the largest and most significant mass protests in a generation. These were exacerbated by the population’s disenchantment towards its highly fragmented party system, which is composed by a very large number of political parties. Under these circumstances, presidents are constrained by informal coalition governments, bringing very harmful consequences to the country. In this work I propose ARRANGE, a d A ta d R iven method fo R A ssessing and reduci NG party fragm E ntation in a country. ARRANGE uses as input the roll call data for congress votes on bills and amendments as a proxy for political preferences and ideology. With that, ARRANGE finds the minimum number of parties required to house all congressmen without decreasing party discipline. When applied to Brazil’s historical roll call data, ARRANGE was able to generate 23 distinct configurations that, compared with the status quo, have (i) a significant smaller number of parties, (ii) a higher discipline of partisans towards their parties and (iii) a more even distribution of partisans into parties. ARRANGE is fast and parsimonious, relying on a single, intuitive parameter.


Introduction
In June 2013, Brazil faced the largest and most significant mass protests in a generation, comparable in size to the protests that triggered the collapse of the military dictatorship in 1984 [1]. The 2013 protests had been exacerbated by the broader disenchantment of the population towards the party system in Brazil [2]. Banners with sentences such as "no party represents me" or "we don't have a party, we are Brazil!" were commonly seen among the protesters. In response to these protests, the government proposed a program of political reform [1]. However, more than one year has passed and very little has been done.
Why are people so unhappy with the Brazilian party system? To illustrate its incapability, consider the following examples. 19 months before Rio de Janeiro stages South America's first Olympic games, an Evangelical pastor without any link to sports was nominated as Brazil's fragmentation in this particular country is much higher than it should be. If this is the case, ARRANGE immediately provides a new congressmen-party configuration that both reduces the country's level of party fragmentation and increases intra-party similarity, which could potentially increase the efficiency of the party system [4][5][6]8]. To the best of my knowledge, this is the first work that proposes a data-driven method to assess and potentially reduce the number of parties in political systems. In summary, the main contributions of this paper are:

1.
A new problem is addressed: what is the minimum number of political parties a country should have given the roll votes of its congressmen? Again, although party fragmentation has been extensively studied in the literature, to the best of my knowledge, this is the first time roll call data has been used to assess it.
2. ARRANGE, a fast and parsimonious method that receives as input roll call data and outputs a new party system configuration that potentially reduces its actual level of fragmentation.
3. It is shown that Brazil has and had many ideologically redundant parties, i.e., parties that are similar in the ideological space. Thus, if today Brazil has one of the highest levels of party system fragmentation in the world (more than 20 parties), this work proves it can be much lower (down to 4 parties).

Fundamentals and Related Work
The constant advancement of information systems allows, at a growing rate, more data to be stored and generated from the most diverse situations. It is fascinating that, behind all these data, we see the reflection of the environment itself. In order to find knowledge in this invaluable evolving database, a growing number of data-driven methods are being proposed along various research areas. For instance, there are data-driven methods to predict hospital mortality from instance-based patient data [16] and flu epidemics [31]. In the social sciences, Silva et. al. proposed two data-driven methods to quantitatively characterize cultural behaviors of geographical regions [17,18] and Park et. al. [19] designed and evaluated a measure that captures diversity of musical tastes from social media data. In economics, [22] proposed a data-driven approach to understand online consumer behavior and engagement with brands. For the benefit of the industry sector, there are data-driven methods to monitor industrial processes [21] and to assist the development and deployment of intelligent transportation systems [20]. The constant advancement of information systems allows, at a growing rate, more data to be stored and generated from the most diverse situations. It is fascinating that, behind all these data, we see the reflection of the environment itself. In order to find knowledge in this invaluable evolving database, a growing number of data-driven methods are being proposed along various research areas. For instance, there are data-driven methods to predict hospital mortality from instance-based patient data [16] and flu epidemics [31]. In the social sciences, Silva et. al. proposed two data-driven methods to quantitatively characterize cultural behaviors of geographical regions [17,18] and Park et. al. [19] designed and evaluated a measure that captures diversity of musical tastes from social media data. In economics, [22] proposed a data-driven approach to understand online consumer behavior and engagement with brands. For the benefit of the industry sector, there are data-driven methods to monitor industrial processes [21] and to assist the development and deployment of intelligent transportation systems [20].
In the political sciences, data analysis methods from roll votes primarily focuses on the estimation of cleavages and ideologies across congressmen and parties [23,24,32] to characterize and predict legislative behavior [25,33,34]. Recently, with the advancement of political weblogs and online social networks, researchers are also extracting political knowledge from user generated data on the Web. There are studies that focus on mining political opinions [26] and profiles [27] from the texts users post on social media applications. Others, such as [35], extract political opinions from general texts, such as statement records of U.S. senators and online news. More recently, Leman Akoglu [28] classified the political polarity of individuals using roll call votes of U.S. congressmen and texts posted on political forums. The idea behind all these studies is that political preferences tend to be stable over time and can be predicted accurately.
Still in the political sciences, the study of party systems is one of its largest sub-fields [11]. Within this sub-field, since Duverger's seminal paper [13], many studies focused on predicting and understanding the factors that determine the number of parties that compete in a given polity [7,9,12,36,37]. In summary, there are two lines of thought: one that emphasizes the role of electoral laws in structuring coalition incentives, and another that emphasizes the importance of preexisting social cleavages. Another fundamental problem in this sub-field is to count the number of parties by taking into account their relative size [38]. If, for instance, a party has a very small percentage of seats in Congress (e.g. one seat of one thousand), then it should be counted accordingly. The metric that considers this is called the effective number of parties. Conceptually, the effective number of parties is simply the number of "viable" or "important" political parties in a party system that includes parties of unequal sizes. Since Laakso and Taagepera's seminal work [38], several ways of computing the effective number of parties were proposed [39][40][41]. The number of effective parties is a frequent metric for assessing party system fragmentation in a country [9].
The high interest in these problems comes from the fact that the actual number of parties usually determines the number of effective parties, or how fragmented a party system is [38]. Highly fragmented party systems can affect governance drastically [42]. The more fragmented the party system is, the less likely it is that the president's party will control a majority of seats in the legislature. Simone Bohn [7] reviewed the literature and concluded that party system fragmentation impacts the electoral dynamic, the process of coalition formation, governing, and ultimately, the survival of political systems in presidential democracies. Thus, in this paper, we measure party fragmentation by counting both the actual and the effective number of parties.
Another crucial factor for governance is party discipline, i.e., the ability of a political party to get its members to support the policies of their party leadership. Mainwaring and Shugart [5] assessed the effects of this on the costs of governing. If parties are not disciplined, presidents will be forced to rely on ad-hoc coalitions based on the distribution of patronage to individual legislators, which raises the costs of governing and reduces policy coherence. Limongi and Figueiredo [8] argued that "institutional engineering" should focus on electoral formulas that reduce party fragmentation and increase party discipline. Brazil is a special case in politics for its high level of party fragmentation, being consistently analyzed in the literature [6-8, 37, 42, 43]. Thus, using Brazil as the use case makes this work specially challenging but rewarding.
Finally, it is important to emphasize that this work differs significantly from those that focus on algorithms to find communities in networks [30,44] or from time series [29]. Although such algorithms could be applied here to detect communities of congressmen that are ideologically similar and, therefore, could compose a political party, this is very different from our problem in two major aspects. First, here our goal is to find the minimum number of communities (in our case, political parties), which is an optimization problem not addressed by community detection algorithms, which usually aim to maximize modularity [45] or any other cohesion metric [46]. Second, while traditional community detection algorithms do not allow two disjoint subgraphs to be part of the same community, our major constraint here is party discipline. Thus, here we allow two disjoint subgraphs (in our case, two ideologically dissimilar groups of congressmen) to be part of the same community if the party discipline constraint is satisfied. The comparison between the communities (political parties) generated by the proposed method and by the state of the art community detection algorithms is left for future work.

Data Description
All the data used in this work was collected from the Open Data (Dados Abertos) project of the House of Representatives (or Chamber of Deputies) of Brazil. In total, I collected 744,195 thousand roll votes on 774 bills of 1,582 thousand congressman that worked in the House of Representatives of Brazil from November, 4th, 1998 to December, 3rd, 2014. The reason for this particular time interval is related to the purpose of this work. Since party discipline is a fundamental metric of evaluation, I only collected the bills in which the party leaders declared the desired vote for their fellow partisans. More than 95% of the votes given during this period had a declared party leader vote. Moreover, note that congressmen vote for bills and their amendments. An amendment is a proposition presented as ancillary to the bill, to amend its form or content. Thus, bills and amendments compose a total of 2,162 thousand propositions to be voted by the congressmen. Each congressman may or may not agree with the vote of his/ her party leader. There are, in total, 35,216 thousand declared votes of the leaders of the 36 parties that had congressmen elected for the House of Representatives during the analyzed period.

Formal Definitions
As discussed previously, reducing party fragmentation only makes sense if party discipline does not decrease significantly. One way to measure the discipline of a congressman is to compute the fraction of votes given by him/her through his political life that agreed with his/her party leader. However, in Brazil it is common for congressmen to switch parties. Also, it is well known that some parties demand (or inspire) higher levels of discipline than others [8]. Thus, instead of analyzing the discipline of the whole political life of a congressman, I will analyze his/her discipline as a member of a single party, i.e., a congressman may have different levels of discipline if he/she was member of different parties during his/her political life. For simplicity, from now on I will assume that the vote given by the leader of the party was given by the party itself, e.g., I will call party vote the vote given by the leader of the party.
Thus, given the set of m congressmen U ¼ fu 1 ; u 2 ; . . . ; u m g and the N political parties that compose the set P ¼ fp 1 ; p 2 ; . . . p N g, I define a partisan a ≔ (u, p) as the tuple formed by a congressman u and a political party p. The set containing all M partisans is defined as A ¼ fa 1 ; a 2 ; . . . a M g. These partisans' job is to vote for the set B ¼ fb 1 ; b 2 ; . . . ; b n g of n bills and amendments that were put to vote in the House of Representatives during the analyzed period. From now on, I will use the term propositions to refer to both bills and amendments. Since a partisan a had not necessarily voted for all propositions, I define the set B a B as the set of propositions that were voted by partisan a. I also define A p as the set of partisans which are members of party p, i.e., A p ¼ fa i ja i ≔ ðu; p 0 Þ^p 0 ¼ pg.
Before the partisans give their votes for a given proposition b, their parties have to announce their votes for b. Again, since a party p had not necessarily voted for all propositions, I define the set B p B as the set of propositions that were voted by party p. For each proposition b 2 B p of a given party p, there is a vote v b p associated with it. In the same way, for each proposition b 0 2 B a of a given partisan a, there is a vote v b 0 a associated with it. Thus, a party p and a partisan a have, respectively, a set of votes V p and V a , where jV p j ¼ jB p j and jV a j ¼ jB a j. The set of all votes given by partisans and parties is simply V. A vote v b p given by party p on a proposition b may be of four types: Y (yes), N (no), O (obstruction) and F (free), i.e., v b p 2 fY; N; O; Fg. If the vote is Y (N), the party approves (disapproves) the proposition. If the vote is O, the party is trying to avoid the vote on the proposition, i.e., its partisans are called to withdraw from the plenary. Finally, if the vote is F, its partisans are free to vote at will. Similarly, a partisan a vote v b a on a proposition b may be of three types: Y, N and O, i.e., v b a 2 fY; N; Og. If the vote is Y (N), the partisan approves (disapproves) the proposition. If the vote is O, the partisan withdrew from plenary.
In this work, the vote is the fundamental feature that determines a preference or ideology. Since for all propositions in our dataset we have both the vote of the partisan and the party, here I define a general function agrees(v 1 , v 2 ) that receives two votes as input and outputs 1 if the votes are in accordance or 0 otherwise. This function is defined as: Note that both v1 and v2 can be a vote of a partisan or a party. Also note that a F vote implies accordance, since the party that gave that vote does not particularly care about its members' votes. Once we know how to compare votes, we can propose a way to compare the similarity sim(i, j) between two vote sets V i and V j . It is given as: In summary, sim(i, j) sums all the votes in agreement between the vote sets V i and V j considering only the propositions that are in both sets. From Eq 2, we can define the three levels of discipline that we will use throughout this paper. First, I define partisan discipline as the discipline d a ! p of a partisan a ≔ (u, p) towards his/her party p, calculated as: Second, I define party discipline as the discipline d p throughout all the votes that were given by partisan members of the party p, calculated as: Finally, I define overall discipline as the discipline dÃ throughout all the votes given in the House of Representatives during the analyzed period, calculated as:

Politics in Brazil
In this section I will show a summarized view of the fundamental characteristics of Brazilian party system that are relevant to the purpose of this work. Because all parties in Brazil are often referred by their acronyms, I will also use their acronyms instead of their names. Thus, please refer to Table 1 for a list of all parties' names and their respective acronyms and sizes, in this case given by the size of their B p s.
First, in Fig 1, I show the historical participation of all parties in the House of Representatives during the analyzed period. Party participation, which I will interchangeably call party size, is represented by the total number of propositions that were voted by the members of the party (horizontal axis) and the total number of partisans that are and were members of the party (vertical axis). Observe the heterogeneity of this universe. From the three biggest parties (PMDB, PSDB and PT), with hundreds of partisans who voted for thousands of propositions, to the two smallest ones (PSTU and PMR), which together have only three partisans and a little over hundred voted propositions, there are another 31 parties with very distinct levels of representation. In order to verify if there is any correlation between participation and party discipline, I show in Fig 2, for each party p, the total number of votes given by partisans members of p (horizontal axis) and the party discipline d p (vertical axis). Note that there is no apparent relationship between party discipline and participation. In fact, the Pearson's correlation coefficient between party discipline and the total number of votes is 0.24, but since the p-value for testing the hypothesis of no correlation is 0.28, we cannot affirm the correlation is significant. Nevertheless, as already observed by [8] using a smaller dataset, party discipline in Brazil is consistently high: no party has a historical party discipline below 0.75 and only two parties have a figure below 0.85.
In Fig 3, we show the behavior of partisans discipline in Brazil during the analyzed period. First, observe in Fig 3a the Cumulative Distribution Function (CDF) of all partisans' disciplines. Note that the curve representing partisan discipline in Brazil is not very far away from the ideal curve, where all partisans have discipline of 1.0. Thus, together with party discipline, partisan discipline is also usually high in Brazil: only 6.3% of partisans have discipline lower or equal to 0.8. In Fig 3b, we plot the heatmap of partisans using their disciplines and total number of votes. The color bar at the right indicates the number of partisans in a given area of the map. Observe that the vast majority of partisans are located in the upper-right of the heatmap, i.e., they have given many votes and have high partisan discipline. The Pearson's correlation coefficient of 0.03 and p value = 0.23 indicate that there is no correlation between discipline and participation.
Although party fragmentation in Brazil has reached one of the highest levels ever found in the world [8,9], we have seen that party and partisan discipline is consistently high. Does this mean that the level of party fragmentation in Brazil is necessary? According to the seminal doctrine of responsible party government [47], parties must differ sufficiently between themselves, providing the electorate with a proper range of choice between alternative actions. Given that, I reformulate the previous question: is the actual level of party fragmentation in Brazil a consequence of a high number of sufficiently different parties?
Instead of performing a deep clustering analysis to answer this question, I will apply the Principal Component Analysis (PCA) [48] technique to the matrix M V BR composed by the  voting vectors V a i of each partisan a i . PCA is a widely used statistical technique for unsupervised dimension reduction. It transforms the data into a new coordinate system such that the greatest variance is achieved by projecting the data into the first coordinate, namely principal component, the second greatest variance is achieved by projecting into the second coordinate, the second component, and so on. In order to draw more interesting conclusions from this analysis, I will also apply PCA to the matrix M V US composed by the voting vectors of the U.S. congressmen. For this, I will use the same dataset used in [28], which consists of votes on 1655 bills in The House of Representatives in years 2009-2010 by 451 representatives. In the matrices the YES votes are represented as 1, the NO votes as −1, and the F, O and non-attendance as 0. To make the comparison more precise, I will only use the votes in years 1999-2000 for constructing M V BR . This period comprises votes on the 349 bills by 767 contemporary partisans of 18 parties in the first two years of president Cardoso in power. Also, it is the two year period that gives the higher explained variance by the first two components of the PCA.
In Fig 4, I show the first two components of the PCA for both M V BR and M V US , where each point represents a congressman and each symbol represents a party. Observe that, for both USA and Brazil the first two components explain a significant part of the variance: 67% and 53%, respectively. However, only the USA PCA can visually divide the members of different parties. For Brazil, many members of different (same) parties are located together (apart). This suggests that parties in Brazil are not sufficiently different to justify one of the highest levels of party fragmentation ever found in the world [8,9].

The ARRANGE Method
In this section I describe the method ARRANGE, which has, basically, two steps. First, based on the votes given by party leaders, it tries to find pairs of parties that can be merged into one. The idea is that parties that always give the same vote could be merged into a single party. Then, ARRANGE attempts to assign partisans to new parties with the objective of minimizing the total number of parties receiving partisans and preserving party discipline. Finally, I describe the quality outputs of ARRANGE. Formally, the problem tackled by ARRANGE is: Problem 1 Given a set of parties P, a set of partisans A, a set of bills and amendments B and the set of all votes given by parties and partisans V, find the minimum set of parties P Ã to which the partisans in A can be assigned in a way that overall, party and partisan disciplines are maximized.
Merging Political Parties. In order to assign partisans to other parties, it is necessary to formally define the ways it can be done. Thus, here I define an option as the descriptor of the party that can receive an external partisan as member. More formally, given a partisan a ≔ (u, p i ), his/her current party p i , and his/her set of propositions B a , an option o a ≔ (a, p j , sim(a, p j )) is a tuple composed by the partisan a, a party p j 6 ¼ p i , and the similarity value sim(a, p j ) between a and p j . More importantly, the option o a = (a, p j , sim(a, p j )) exists if and only if B a B p j , i.e., if the party p j has voted for all propositions in B a . The set O a ¼ fo 1 a ; o 2 a ; . . .g is composed by all the options of partisan a or, more formally O a ¼ f½o a ≔ ða ≔ ðu; p i Þ; p j ; simða; p j ÞÞ : Moreover, an option o a ≔ (a, p j , sim(a, p j )) of partisan a ≔ (u, p i ) is characterized as a good option if sim(a, p j ) ! sim(a, p i ). The set of good options O Ã a for partisan a ≔ (u, In Fig 5a, I show the histogram of the number of good options jO Ã a j for all partisans of our dataset. Consistent with the high levels of party and partisan discipline, observe that the majority of partisans do not have a single good option, i.e., for 69% of the partisans there is no other party in Brazil that offers more similar voting vectors. In fact, only % 11% of the partisans have more than three good options. This result suggests that it is very difficult to reduce the number of parties in Brazil by moving partisans from one party to another.
The main reason for the low number of good options is related to the short lifetime of many parties in Brazil, as it can be observed in Fig 1. Since an option exists if and only if the propositions voted on by the partisan is a subset of the propositions voted on by the parties, many partisans with a long history of votes cannot find options for them among the small parties. Thus, here I propose a method for creating new parties by merging existing ones. The method is based on a simple idea: if two parties are not contemporary or are contemporary, but all votes given by them are equal, then these parties can be merged into a new one.
More formally, two parties p i and p j can be merged into a new party p i_j if one of the two conditions below is met: i.e., parties p i and p j have not voted for a common proposition.
i.e., parties p i and p j have given the same vote for all common propositions.
Given these two conditions, the first thing we have to do is find all pairs of parties to which at least one of the two conditions is valid. Once this is done, for every pair of parties (p, q) that can be merged, we create a merged party p_q, for which the sets of propositions and votes are, respectively, B pÀq ¼ B p [ B q and V pÀq ¼ V p [ V q . All parties created in this step are put in set P 0 . After this, we repeat this process by verifying, for each party p 2 P 0 , all parties q 2 P that can be merged to p. We merge p to q into p_q as previously, but this time adding the merged parties to P 00 . Once this process is done, we copy P 00 to P 0 , empty P 00 , and restart the process of finding parties q 2 P eligible to be merged to parties p 2 P 0 . The process ends when set P 00 does not receive a new merged party. All parties p 2 P 0 that were not merged are put in the final set of merged parties p 2 P M . This whole process is described in Algorithm 1.  ▷ two parties to be merged 2: for all q 2 P do Basically, what Algorithm 1 does is to find all possible k-combinations N k À Á of set P for all 1 k N. It is well known that P N k¼1 N k À Á ¼ 2 N À 1, making the worst-case complexity for this problem to be O(2 N ). Nevertheless, in practice, finding parties in P that can be merged with parties in P 0 gets significantly harder as k increases, which, in practice, makes Algorithm 1 computationally feasible for the problem in question. For the case of the 36 Brazilian parties, k went up to 6 and the algorithm stopped, generating a set P M containing 95 parties, in which only 5 were not a product of a merge, namely PDT, PMDB, PPS, PSDB and PT. Besides these, Algorithm 1 generated 12 parties of size 2, 17 parties of size 3, 20 parties of size 4, 31 parties of size 5 and 10 parties of size 6.
In Fig 5b, I show the histogram of the number of good options jO Ã a j for all partisans considering the new set of merged parties P M . Observe that the number of partisans that do not have a single good option dropped from % 69% to % 33%, all of them being members of the five parties that were not merged. Moreover, the number of partisans that have more than three good options grew from % 12% to % 47%. Thus, I conclude that merging parties significantly raises the chances of reducing the number of parties in Brazil by moving partisans from one party to another without decreasing party discipline.
Finding the Minimum Set. Now that most of the partisans have multiple good options, we can find ways of redistributing them among the parties in P M . The idea in this redistribution is to find a set of parties P Ã & P M that has a lower cardinality than P, i.e., P Ã & P M is a set of parties able to receive all partisans a 2 A as members and the number of parties N Ã ¼ jP Ã j in P Ã has to be lower than the actual number of parties N ¼ jPj. However, this has to be done cautiously, since there are two partially conflicting goals: 1. Minimize the number of parties; 2. Maximize party and partisan discipline.
These goals are conflicting because minimizing the number of eligible parties to receive partisans as members implies reducing the options for moving partisans and, as a consequence, the number of good options. If a partisan does not have a good option, he/she is obliged to stay in his/her party in order to not decrease his/her partisan discipline. On the other hand, maximizing the discipline implies in maximizing the size of the set of good options and, therefore, the number of parties to be considered has to be as high as possible.
In order to solve this conflict, I model this redistribution problem as a set cover problem (SCP) [49]. SCP is a well studied problem for the field of approximation algorithms [50], being also one of Karp's 21 NP-complete problems shown to be NP-complete. In summary, given a set of elements {1, 2, . . ., m} (called the universe) and a set S of n sets whose union equals the universe, the SCP is to identify the smallest subset of S whose union equals the universe. More formally, given a universe X and a family S of subsets of X , a cover is a subfamily C S of sets whose union is X .
In our case, the universe X is the set of partisans A and the family S of subsets of X is a family of subsets of partisans A M p 2 A where each subset A M p is composed by the partisans that are eligible for moving to party p 2 P M . In order to build A M p , it is necessary to recalculate the set of options O a and good options O Ã a of each partisan a with respect to the merged parties in P M . Once this is done, I define that each set A M p is composed by all the partisans a that have a good option o a ≔ (a, p, sim(a, p)), i.e., A M p ¼ faja 2 A^p 2 P M^9 ½o a ≔ ða; p; simða; pÞÞ 2 O Ã a g. Since this problem is NP-complete, I recur to a greedy algorithm to solve it. Literature shows that the greedy algorithm is essentially the best possible polynomial time approximation algorithm for set cover under plausible complexity assumptions [51]. The greedy algorithm for set covering chooses sets according to one rule: at each stage, choose the set that contains the largest number of uncovered elements. It can be shown [49] that this algorithm achieves an approximation ratio of H(s), where s is the size of the set to be covered and H(n) is the n-th harmonic number: HðnÞ ¼ P n k¼1 1 k ln n þ 1. For our specific problem, this algorithm works as follows, being described in Algorithm 2. First, we create two empty sets: P Ã , which will contain the final set of parties, and A 0 , which receives the covered partisans during the process. Then, while A 0 does not contain all partisans, we find the party p 2 P M to which A M p contains the largest number of uncovered partisans. Then, we add all partisans a 2 A M p to A 0 , make A M p an empty set (so it is not selected again), and add p to the final set of parties P Ã .
Note that so far this process guarantees that all partisans will, at least, have the same partisan discipline as their actual ones, since only good options are used to build the sets A M p ; p 2 P M . Nevertheless, it is possible that relaxing this constraint a little might diminish the total number of parties that compose P Ã considerably. For instance, we may allow a partisan to be member of a party if their similarity is at most 0.05 smaller than his/her similarity with his/her actual party. This relaxation increases the number of partisans that are eligible to be member of other parties and, therefore, may reduce the size of P Ã .
Thus, here I introduce the parameter δ, which is the maximum allowed difference between the actual partisan discipline and the future one. We accommodate this in ARRANGE by simply changing the way the set of good options is constructed. With the introduction of δ, the set of good options O Ã a is defined as: In summary, after generating the set of merged parties P M using Algorithm 1, it is necessary to create the sets A M p for all p 2 P M considering δ. These sets A M p will contain all partisans that Algorithm 2. Find the minimum set of parties P Ã . are eligible for being members of party p given δ. This is done by selecting a value for δ and then running the procedure FIND of Algorithm 2. If, for instance, δ = 0, then no partisans are allowed to decrease their discipline when moving to a different party. On the other side, if δ = 1, partisans are allowed to be members of any party that voted for all propositions that they voted on, i.e., partisan discipline is not a constraint. After running Algorithm 2, we will have a minimum set of parties P Ã able to accommodate all partisans in A. Then, all we have to do is to assign a party p 2 P Ã to each partisan a 2 A. This is done by selecting the option o a ¼ ða ≔ ðu; p i Þ; p j ; simða; p j ÞÞ 2 O Ã a that gives the maximum similarity value sim(a, p j ) for all parties p j 2 P Ã and making p j the new party of partisan a.
Quality Signals. The main goal of ARRANGE is to reduce party fragmentation by reducing the number of parties that are able to accommodate all elected partisans. Thus, the main output of ARRANGE is N Ã (or jP Ã j). Nevertheless, it is necessary to assess the quality of the cover P Ã , since it is essential to reduce party fragmentation by achieving desirable levels of party and partisan discipline. But what are the desirable levels of party and partisan discipline?
In Fig 3a, I showed the CDF of the actual partisan discipline distribution during the analyzed period. The most desirable, or ideal, partisan discipline distribution is shown as a dashed red line, which represents the situation where all partisans have discipline of 1.0. Thus, a new partisan discipline distribution, generated by Algorithm 2 with parameter δ and defined by the random variable X δ , is considered desirable if its CDF F X δ (x) is closer to the ideal than the actual one, defined by the random variable X 0 and CDF F X 0 (x). More formally, considering that the area under the ideal CDF curve is 0 for both partisan and party discipline distributions, I propose the following definition: Definition 1 A new discipline distribution defined by random variable X δ and its CDF F X δ (x) where F X 0 (x) is the CDF of the actual discipline distribution defined by random variable X 0 .
In other words, if the area under F X δ (x) is smaller than the area under F X 0 (x), then X δ represents a desirable discipline distribution. Moreover, given that discipline ranges from 0 to 1, I propose the following lemma: Lemma 1 Given a random variable X 0 representing the actual discipline distribution and a random variable X δ representing a new discipline distribution, if the expected value E X δ [x] of X δ is higher than the expected value E X 0 [x] of X 0 , then X δ represents a desirable discipline distribution.
Proof 1 From probability theory, we know that E X ½x ¼ R 1 0 1 À F X ðxÞ dx for a given random variable X and CDF F X (x) [52]. Since discipline has values from 0 to 1, we can write Thus, now I can formally define three binary quality signals for the new partisan configuration over the new set of parties P Ã generated by Algorithm 2 with parameter δ. These signals indicate, respectively, if the new partisan configuration has desirable levels of partisan, party and overall discipline, and is defined as: • Q 1 : 1 if the overall discipline d d Ã of the new configuration is greater than the overall discipline d 0 Ã of the actual configuration; 0 otherwise. Recall that the overall discipline dÃ is defined in Eq 5.
• Q 2 : 1 if the expected (average) partisan discipline of the new configuration E A X d ½x is greater than the expected party discipline of the actual configuration E A X 0 ½x or, in other words, if the new partisan discipline distribution is a desirable partisan discipline distribution; 0 otherwise.
• Q 3 : 1 if the expected (average) party discipline of the new configuration E P X d ½x is greater than the expected party discipline of the actual configuration E P X 0 ½x or, in other words, if the new party discipline distribution is a desirable party discipline distribution; 0 otherwise.

Results
In this section I show the results for ARRANGE for a hundred equally distributed values of δ between 0 and 1. From now on I will call a configuration c δ the distribution of partisans among parties generated by ARRANGE for a particular δ value. Moreover, in all plots I will indicate whether the quality signals described in previous section were 1 or 0. Namely, I will use yellow stars for results where all quality signals were 1 (Q 1^Q2^Q3 ), blue diamonds when only signals Q 2 and Q 3 were 1 (Q 2^Q3 ), red circles when only signal Q 3 was 1 and, finally, green squares when all quality signal were 0. In summary, ARRANGE was able to generate 31 distinct configurations that, compared with the status quo, have (i) a significantly smaller number of parties, (ii) higher discipline of partisans towards their parties and (iii) more even distributions of partisans into parties. Besides comparing with the status quo, ARRANGE will be compared with two random models: random-sq and random-δ. The competitors can be summarized as: 1. status quo. It is the existing state of affairs, i.e., the actual and historical situation in Brazil during the analyzed period.
2. random-sq. It randomly redistributes the partisans among the existing parties. For each partisan a 2 A, the model randomly pick an option ½o a ≔ ða; p; simÞ 2 O a and assigns a to party p. By using the options set O a I guarantee that the party p allocated to a has voted for every proposition voted by a, i.e., B a B p .
3. random-δ. Works in the same way as random-sq, but instead of allocating partisans to the actual set of parties P, it randomly redistributes the partisans among the minimum set of parties P Ã generated by Algorithm 2.
The random models are used to quantify the payoffs obtained by using ARRANGE when it generates the minimum party set P Ã (random-sq) and it efficiently allocates partisans to the parties of this set (random-δ). Although not always visible, for all results of both models I also show the 99% confidence interval.
In Fig 6, I show the number of parties N Ã generated by ARRANGE for different values of δ. I also show the status quo, i.e., the actual number of parties N of which partisans were members during the analyzed period, and the number of parties generated by random-sq. First, note that the number of parties generated by ARRANGE is significantly lower than the status quo, decreasing as δ increases. Even when no partisans are allowed to decrease its discipline (δ = 0), N Ã = 22, a number % 39% lower than 36, the status quo. Moreover, ARRANGE was able to generate a configuration (c 0.19 ) with only 13 parties (% 64% reduction) and with all quality signals equal to 1 (Q 1^Q2^Q3 ), i.e., with overall, party and partisan disciplines greater than the status quo. When only Q 2 and Q 3 are 1 (Q 2^Q3 ), ARRANGE could generate a configuration (c 0.19 ) with N Ã = 10 parties, a % 72% reduction. Finally, when only Q 3 is 1 (Q 3 ), ARRANGE could provide a % 86% reduction in the number of parties by generating a configuration (e.g. c 0.41 ) with only 5 parties. It is worth mentioning that the number of parties in Brazil is so excessive that even random-sq was able to generate a configuration with fewer parties than the status quo.
Concerning discipline, I show in Fig 7 the overall discipline (Fig 7a) and the average party (Fig 7b) and partisan (Fig 7c) disciplines of the configurations produced by ARRANGE and its competitors. Observe that for all discipline metrics the results produced by ARRANGE are very similar with the status quo, even when all quality signals are 0. The discipline values decrease significantly only for δ values close to 1, when the number of parties generated by ARRANGE is 2 or 1. It is also interesting to note that the random models are able to produce configurations with considerably high levels of discipline, which shows that Brazilian political parties are, on average, very similar to each other.
In order to analyze how well distributed are the partisans among parties, we compute the Gini coefficient [53] for each configuration generated by ARRANGE and its competitors. The  Gini coefficient was initially proposed to describe the income inequality in a population [53]. It assumes values from 0, which expresses perfect equality, where all parties have the same number of partisans, to 1, which expresses maximal inequality among values, where all partisans are allocated to a single party. Observe in Fig 8 that ARRANGE is able to produce configurations in which the partisans are more evenly distributed than the status quo for all values of δ. While the Gini coefficient is % 0.64 for the status quo, it decreased to % 0.43 for c 0.15 (Q 1^Q2 Q 3 ), to % 0.32 for configurations with c 0.41 (Q 3 ), and to % 0 for c 0.5 (all quality signals equal to 0, but with discipline values similar with the status quo). Model random-sq has a slightly high Gini coefficient than the status quo because a few big parties are more likely to randomly receive new partisans, since they appear as an option in the option sets O a for all a 2 A.
Another relevant characteristic to be measured in political party systems is switching, i.e., party changes among partisans. As stated by [43], switching effectively destroys the meaning of party labels, raises voters' information costs, and eliminates party accountability, being a threat to the very core of democratic representation. Thus, in Fig 9, I plot the number of party changes among partisans that would occur if the configurations generated by ARRANGE and the random models were the reality. Observe that in all scenarios generated by ARRANGE the number of changes is lower than the status quo, random-sq and random-δ. This is another evidence of the importance of reducing party fragmentation to have ideologically well defined parties. Now I will take a closer look at particular configurations generated by ARRANGE, namely c 0.15 , c 0.19 and c 0.41 . These are the configurations that have the lowest number of parties and achieved, respectively, quality signals Q 1^Q2^Q3 , Q 2^Q3 and only Q 3 . In Fig 10a, I show the number of active parties per year for the status quo and these three configurations. First, observe that the number of parties changes constantly over the years in Brazil, this being a harmful consequence of its highly fragmented party system. On the other hand, observe that the configurations generated by ARRANGE are (i) much more stable over the years and (ii) have a significantly smaller number of parties than the status quo.  Besides counting the actual number of parties per year, it is also important to measure the effective number of parties. As described previously, it is a concept which provides for an adjusted number of political parties in a country's party system, weighting the partisan count per party by their relative strength [38]. In our case, the relative strength refers to their seat share in the parliament. This measure is especially useful to detect trends toward fewer or more numerous parties over time [38]. The number of parties equals the effective number of parties only when all parties have equal strength. In any other case, the effective number of parties is lower than the actual number of parties. It is also a frequent metric for the fragmentation of a party system [9]. Moreover, although several indexes for computing the effective number of parties exist [41], in this paper I use the Golosov index N p ¼ P N 1 ð1 þ ðs 2 1 =s i Þ À s i Þ À1 , where N is the actual number of parties, s i is the proportional share of each party p i , and s 1 is the highest share of a party [41]. For the best of my knowledge, N p is the most recent one and its results confirm it works better than earlier proposed alternatives in measuring the effective number of components in highly fragmented and highly concentrated party systems, which is the case of Brazil. Thus, in Fig 10b, we show the effective number of parties N p per year, calculated for configurations c 0.15 , c 0.19 and c 0.41 and for the status quo. First, observe that N p is significantly lower and more stable for c 0.15 , c 0.19 and c 0.41 than for the status quo. While N p grows constantly after the year 2000 for the status quo, it remains practically constant for the three configurations generated by ARRANGE. Moreover, if we consider only c 0.41 , Brazil would go from having one of the most fragmented party systems in the world [8] to having one of the least fragmented [9], averaging 3.0 effective parties per year.
In Fig 4b I showed the first two components of the PCA for the matrix M V BR composed by the votes of the partisans of Brazil during the years of 1999 and 2000. In Fig 11, I show this same result, but I replace the status quo party labels by the ones generated by ARRANGE in c 0.15 , c 0.19 and c 0.41 . Observe that all three configurations have a visually better clustering of partisans of the same party than the status quo. This suggests that ARRANGE is also able to provide configurations in which parties are more different among themselves than in the status quo. I leave a deeper quantitative analysis for future work.

Discussion
In practical terms, the main contribution of ARRANGE to a government and its population is the ability to provide a quantitative assessment of fragmentation in its party system. Given that the effective number of parties is expected to reflect the number of issue (or ideological) dimensions in a country [12], highly fragmented party systems overestimate this number of issue dimensions, providing a distorted view to the population that harms democracy [47]. Thus, by mapping plenary votes into ideological preferences, ARRANGE quantitatively provides an "ideal" number of parties for a country given the ideological preferences of its congressmen, revealing the presumable true number of issue dimensions that exists in this particular country. I used the expression "presumable true" because, as verified by [8], it is not always true that a disciplined and cohesive party represents an ideological cleavage (or group) in society, i.e., its members may simply be a group of congressmen obeying its leader in order to obtain a particular benefit.
It is also worth mentioning that ARRANGE also provides the list of parties that should ideally exist and which partisans should be their members. Although I know a democratic government cannot implement this solution easily, it can be used to support significant reforms in its political system. With this in mind, I could not fail to mention in this paper that one of the possible reasons for Brazil's high level of party fragmentation is the so called fundo partidário, which are funds distributed by the federal government to Brazilian political parties for them to spend indiscriminately. A share of the amount paid by the federal government through the fundo partidário is the same for every party, but another part is proportional to the number of elected congressmen, senators and governors by each organization. In 2014, PT, the party with the highest share (16.5%), received R$50,314,999.19 million reais from the fund. On the other hand, PROS, the party with the lowest share (0.16%), received R$493,873.68 thousand reais in 2014 [54]. Consider that, in 2014, the exchange rate of the real varied from U$2.19 to U$2.72 U.S. dollars. It is out of the scope of this paper to point fundo partidário as the main culprit for Brazil's high level of party fragmentation, but it poses as a clear incentive for the creation of many parties in Brazil.
In spite of the fact that this work considers Brazil as its use case, the methods and results shown here can be easily replicated to other countries that have highly fragmented party systems. Carsten Anckar studied party system fragmentation in 77 countries [9] and reported high levels of fragmentation for many countries besides Brazil, such as Bolivia, Bulgaria, Denmark, Ecuador, Finland, France, Guatemala, India, Israel, Italy, Netherlands and Thailand. Nevertheless, it is important to point out that this work was motivated and eased by the Open Data initiative of the Brazilian government, that provides public data related to politics and also to many other areas, such as demographics, government spending, budget and road accidents.

Conclusions
In this work, I proposed the method ARRANGE to assess and reduce fragmentation in multiparty political systems. From roll votes data of partisans and their respective party leaders, ARRANGE redistributes the partisans into new parties considering two conflicting objectives: to minimize the number of parties and to maximize party discipline. When applied to Brazilian historical roll call data, ARRANGE was able to generate 31 distinct configurations that, compared with the status quo, have (i) a significantly smaller number of parties, (ii) higher discipline of partisans towards their parties and (iii) more even distributions of partisans into parties. These results show that Brazil has and had many redundant parties, i.e., parties that are very similar ideologically. Thus, if today Brazil has one of the highest levels of party system fragmentation in the World [8,9], this work proved it could be much lower. Finally, it is important to point out that ARRANGE is a general method and could be directly applied to analyze fragmentation in any of the many highly fragmented party systems that exists in the world [9].