Voting contagion: Modeling and analysis of a century of U.S. presidential elections

Social influence plays an important role in human behavior and decisions. Sources of influence can be divided as external, which are independent of social context, or as originating from peers, such as family and friends. An important question is how to disentangle the social contagion by peers from external influences. While a variety of experimental and observational studies provided insight into this problem, identifying the extent of contagion based on large-scale observational data with an unknown network structure remains largely unexplored. By bridging the gap between the large-scale complex systems perspective of collective human dynamics and the detailed approach of social sciences, we present a parsimonious model of social influence, and apply it to a central topic in political science—elections and voting behavior. We provide an analytical expression of the county vote-share distribution, which is in excellent agreement with almost a century of observed U.S. presidential election data. Analyzing the social influence topography over this period reveals an abrupt phase transition from low to high levels of social contagion, and robust differences among regions. These results suggest that social contagion effects are becoming more instrumental in shaping large-scale collective political behavior, with implications on democratic electoral processes and policies.


Introduction
The understanding of collective human dynamics in theoretical and real-life social systems has gained increasing attention in recent decades [1][2][3][4][5][6][7]. At the core of these efforts are models that incorporate a collection of interconnected individuals who change their behavior based on micro-level processes of social influence exerted by their neighbors, but also based on personal influences independent of social context. The macro-level characteristics of the system emerge as a product of the collective dynamics of these personal influences and micro-level social influence processes. The question of how to separate and measure the effect of social influence is therefore a major challenge for understanding collective human behavior. Although a variety of experimental [8][9][10][11][12] and observational [13][14][15][16] studies have attempted to address this challenge, identifying the extent of social influence based on large-scale, macro-level observational data in the presence of unknown network structure remains largely unexplored. To close this gap, we present a simple and universal method for measuring social influence, taking the voter model of statistical physics as our basic dynamical system [2][3][4][17][18][19]. We apply our model to understanding the collective dynamics of voting in US presidential elections, a topic at the core of collective political behavior.
The study of electoral behavior has attracted considerable attention from political scientists. Most studies of voting behavior in the United States and other democracies view vote choices as the result of several interrelated attitudinal and social factors [20]. Attitudinal factors that reflect short-term fluctuations in the partisan division of the vote include evaluations of the candidates' personal qualities and government performance, and orientations toward issues of public policy. Long-term factors, which persist beyond a particular election, include partisan loyalties [21][22], ideological orientations [23], and social characteristics such as race, religion, social class, and region [20]. Recent studies have also elucidated the role of social networks in spreading voting behavior [24]. Voters embedded in social networks of friends, family members, neighbors, and co-workers [12] influence each other in terms of voter turnout [12][25][26][27] or support for particular candidates [24,28]. Social networks enable boundedly rational voters to limit the cost of searching for political information [23] by relying on readily available information from their peers. These peer groups can also include "opinion leaders" who can considerably influence the behavior of voters in their network of contacts by being perceived as trustworthy and highly informed on political issues [21]. The opinion leaders (also known as "zealots," "inflexible," "stubborn" or "frozen" voters in the sociophysics literature, see [18][19][29][30][31][32][33][34][35][36][37][38]) are individuals who hardly change their political preferences, and who influence the voting behavior of uncommitted individuals. Opinion leaders often interpret media messages and pass them on to "opinion followers" [21].
Other sources of political information that were shown to influence citizen attitudes and voting behavior are the mass media [39][40][41][42][43] and a variety of organized efforts at political persuasion such as campaign persuasion [44].
Thus the picture that emerges from the modern history of social science academic voting research suggests that voters are embedded in interpersonal social networks that can increase the likelihood of voting contagion and behavior change via social imitation; but are also exposed to what we might call "external influences"-social forces, which are often consistently skewed in favor of one candidate over another [24], that affect voters. As mentioned above, these external influences include various individual prejudiced attitudes and orientations, party identification, individual's upbringing, religion, ideology, campaign persuasion, and exposure to the mass media, such as television and newspapers. Since in this paper our focus is on understanding the dynamics of flexible voters who are free to change their voting behavior, without loss of generality, one can consider exposure to opinion leaders (including peers, journalists, or politicians), to be an external influence, despite sometimes being a peer influence effect. The reasoning for this is that opinion leaders are ideologically inflexible and unwavering candidate supporters, and thus convey a consistent partisan bias in favor of one candidate over another [24]. Collectively, the voter's electoral decision can be explained in terms of peer effects (via social imitation) and partisan biases conveyed by competing external influences.
A pertinent question here is how to disentangle the effect of social contagion from that of exposure to external influences. This identification problem goes beyond voting. People hold opinions on a multitude of topics that inform alternative courses of action, from crime participation [45] and smoking [46] to riots and protests [47] and financial markets [31]. These opinions can be either the result of individual considerations or, when confronted with information that is difficult to acquire or process, influenced by the views of others through social interactions. In this paper we describe a general methodology for detecting behavioral contagion from large-scale observational data. We extend the basic voter model [2][3][4][17][18][19] by taking into account the dynamic response of social networks to external influences. Our model focuses on two characterizations of voting behavior. The first is that of most studies of voting behavior, which consider vote choices to be driven, as outlined above, by various individual's biases and other external pressures. The second-from complex systems science and recent observational and face-to-face studies-is that of internal self-reinforcing dynamics where voters' opinions are changed under the influence of their peers. Incorporating both, we construct a universal representation of the largest scale system behavior when there is both external and interpersonal influence. The extended voter model is able to reproduce remarkably well statistical features and patterns of almost a century of county-level U.S. presidential election data. More importantly, our model presents a general framework for detecting social contagion from large-scale election return data, and can be applied more generally to many different systems that involve social contagion.
Here, electoral votes cast for U.S. presidential candidates at the county level are analyzed, covering the period from 1920 through 2012. Counties are grouped by state, and the corresponding distributions of the fraction of votes (vote-share) in a county for the Democratic candidate in an election year are studied. Fig 1 shows the county vote-share distributions for various states and election years. The data indicate that there is a wide variation in the characteristics of voting behavior with no apparent pattern of voting dynamics in time or geographical space. Here we show that much of this observed variability of county vote-shares may be best explained by fluctuating peer influences across time and space. Although the study of collective voting behavior has recently been the focus of discussion in the context of identification and modelling of universal patterns of observed voting behavior [48][49][50][51][52][53][54][55][56][57][58][59], the mechanisms leading to such diverse spatiotemporal variation in voting patterns as shown in Fig 1, as well as other spatiotemporal patterns discussed in our paper (see section Empirical Results of US Presidential Elections: 1920 to 2012), are poorly understood. The model presented below provides a parsimonious quantitative framework that is capable of explaining and reproducing, among other things, the full range of empirical county vote-share distributions for all states and election years. Using the model, we develop an index of social influence that enables us to examine and reveal remarkably robust patterns of spatial and temporal variation in social influence over 92 years of US presidential elections. The statistical physics model presented in this paper is obviously limited in ignoring many psychological and social factors influencing the decisions of individual voters. However, this limitation should not be perceived as an overly simplified assumption that overlooks human complexity.
Indeed, as demonstrated by other models of social complexity [1][2][3][4][5][6][7], at times the details of a complex system do not matter if one wants to understand the large-scale behavior of the system. In this case, only the broad-brush features of the system are necessary to understand the complexity of human choices; in our case, the relative strength of external to peer influences are shown as a plausible explanation for observed voting behavior.

Models of opinion dynamics
Models of opinion formation, which explore the dynamics of competing opinions taking into account the interactions among agents, have been extensively studied [1][2][3][4][5][6][7][60][61]. In their most basic form, these models consist of voters, represented by nodes on a social network, having only two possible opinions, 0 or 1. Each voter may change her mind by using various interaction mechanisms, for example, randomly adopting the opinion of a connected neighbor (essentially a noisy majority-vote rule, see [3,17,58]), or by applying local majority rules [1,3,5]. The stochastic dynamics of these simple interaction models ultimately leads to a uniform state corresponding to the all-nodes-0 or all-nodes-1 states where all voters share the same political choices. Obviously, consensus states are not observed in real-world political elections, and thus the basic models cannot be plausibly considered as realistic models that are able to describe empirical voting data. Accordingly, more realistic models of opinion dynamics have been proposed that incorporate, among other features, social impact theory [60][61][62], opinion leaders and zealots [18-19, 29-38, 62-63], external influences and fields [2][18][19][64][65][66][67][68][69][70], individual biases [71][72], contrarians [73], an individual's own current opinion [74][75], word-of-mouth spreading [52], non-overlapping cliques [59], or noisy diffusive processes [58]. Below we further elaborate on the themes of opinion leaders and zealots, external influences, and individual biases. These themes play an important role in our model and have been observed empirically in studies of electoral behavior (see Introduction).
Opinion leaders have often been modeled by considering the presence of biased voters who favor one opinion over the other, and who will not change their opinion (also known as zealots [29], inflexible [30], frozen [18][19][31], stubborn [32][33][34][35], or committed voters [36][37][38]). The problem was originally introduced and studied with a single zealot in regular lattices [29], and has subsequently been incorporated in models that use repeated local updates of random groupings of agents in the limit where the number of voters and zealots goes to infinity [30], and with arbitrary numbers of voters and zealots in fully connected networks where complete and exact results of the stochastic dynamics have been obtained [18][19]. Further studies that explore the use of zealots in the context of the voter model include [31-34, 36-38, 76]. The role of external influences (distinct from social imitation) in opinion formation has often been modeled as an external perturbation or modulation acting on all agents in the system, or by some external field or global coupling [3]. These perturbations could account for the effects of propaganda [65], fashion waves [66][67], or the mass media [68][69][70]; but are also driven by individual biases and prejudices [71][72], or level of political awareness [64]. More generally, these perturbations represent the dynamic response of a complex system to an external environment [18][19][63]. As mentioned in the Introduction, this view is deeply aligned with empirically and theoretically grounded research by political scientists, who have uncovered external forces in the form of prejudiced attitudes and orientations of individuals, party identification, an individual's upbringing, religion, ideology, exposure to campaign persuasion and the mass media (such as television and newspapers), or partisan bias conveyed by opinion leaders.
While the above models show the important role of contagion spreading via social interactions in collective opinion dynamics, any further progress in understanding real-world voting phenomena needs to supplement and contrast these theoretical efforts with large sets of empirical voting data. Some effort has been made in this regard [52,56,58], particularly with the aim of explaining and reproducing the distribution of votes in bipartisan and proportional elections. Our paper is a contribution in this direction. Below we present an exactly solvable model of stochastic voter dynamics and obtain, among other results, the stationary vote-share fluctuations across counties, which are in excellent agreement with almost a century of observed U.S. presidential election data at the state level, for every election year. The model is further validated by reproducing empirical temporal and spatial election patterns, as identified by social science academic voting research.

The model
To model the dynamics of elections, we take the prototypical voter model [2][3][17][18][19] as our basic dynamical system, and modify it to more closely reflect features of real-world political elections (see Introduction). We consider a network with N "free" nodes representing uncommitted voters [18][19][29][30][31][32][33][34][35][36][37][38], and links between pairs of free nodes representing peer influences. We add to this network of uncommitted voters "fixed" nodes representing unwavering candidate supporters and opinion leaders that influence all other uncommitted voters, but are not themselves influenced by other nodes [18][19][29][30][31][32][33][34][35][36][37][38]. The assumption that there exists a directed link from each uncommitted voter to each fixed node is motivated by the empirical fact that uncommitted voters are consistently exposed to partisan bias in favor of one of the candidates over another [20-22, 24, 39-44], conveyed by opinion leaders and other external sources (see detailed discussion below). The number of fixed nodes that are biased in favor of the first candidate (named '0') is N 0 and the number biased in favor of the second candidate (named '1') is N 1 . Thus, we consider a network with N + N 0 + N 1 nodes. Each node has an internal state which can take only the values 0 and 1, representing whether the voter chooses the first or second candidate; or, for fixed nodes, whether the node is biased in favor of the first or second candidate. We assume a mean-field interaction model in which each uncommitted voter is equally likely to interact with another uncommitted voter [3,17,58,77]. Accordingly, individuals update their contacts in a fully mixed fashion within the population, which implies a homogeneous random network for the uncommitted voters' social ties. 
We assume that the N free voters change their internal state following the noisy majority-vote model [3,17,58]: At each time step a random free voter is selected and its state is updated with probability 1 − p by copying the state of one of its connected neighbors, chosen at random from all nodes; and with probability p the state remains the same. The N 0 and N 1 voters that are biased towards the first and second candidate, respectively, remain fixed in state 0 and state 1, respectively.
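The update rule above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: it assumes a fully connected network (so only the count k of free voters in state 1 needs to be tracked), integer numbers of fixed nodes, and that a selected voter never copies itself. The function name `simulate_vote_share` and its parameters are hypothetical.

```python
import random


def simulate_vote_share(n_free, n0, n1, p=0.0, steps=200_000, seed=0):
    """Time-averaged vote-share for candidate 1 under the copy-a-neighbor rule.

    Assumptions (illustrative): fully connected network, integer n0 and n1,
    and a selected free voter copies one of the other nodes, never itself.
    """
    rng = random.Random(seed)
    k = n_free // 2                      # free voters currently in state 1
    others = n_free + n0 + n1 - 1        # nodes a selected voter can copy
    total = 0.0
    for _ in range(steps):
        # A uniformly chosen free voter is in state 1 with probability k / n_free.
        in_state_1 = rng.random() < k / n_free
        if rng.random() >= p:            # with probability 1 - p, copy a node
            ones_elsewhere = n1 + k - (1 if in_state_1 else 0)
            copies_1 = rng.random() < ones_elsewhere / others
            if copies_1 and not in_state_1:
                k += 1
            elif in_state_1 and not copies_1:
                k -= 1
        total += k / n_free              # with probability p nothing changes
    return total / steps
```

For example, `simulate_vote_share(50, 5, 15, p=0.2)` should hover near 15 / (5 + 15), since the fixed nodes pull the long-run average toward their own proportion while individual runs keep fluctuating.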
Our model's assumptions and the noisy majority-vote update rule that we use [3,17,58] share important features with other variants of the majority rule principle. For example, the elegant majority rule proposed in [77]-see also the excellent review in [3]-assumes that all agents in the population can communicate with each other; forming, at each iteration, a random group of agents who take the majority opinion inside the group [3]. In this model, therefore, multiple individuals' vote choices are updated simultaneously at each time step, at variance with our noisy majority-vote update rule where a single individual's vote choice is updated at each time step [3]. The presence of inflexible agents with opposing views in this model [30,35] leads to a solution, in the mean field limit [3,77], that eventually settles into a fixed value of vote-share for one candidate, depending on the initial conditions. Our generalized voter model, on the other hand, does not necessarily settle into a fixed value. Instead, our main result shows that despite fluctuations of the voting dynamics, voter choices converge in distribution. Moreover, the long-run stationary distribution of vote-shares does not depend on the initial vote choices of uncommitted voters. This result, first reported in [18] within a fully solvable model, accounts for both the finite number of voters in a population and the numerous sources that convey consistent partisan biases to uncommitted voters. These properties and the foregoing model's assumptions essentially create the kind of characteristic vote-share fluctuations across counties as recently observed in the sociophysics literature (e.g., [52,56,58,78]) for various countries; and thus support the plausibility of the model and its capacity to describe real world voting phenomena. Of course, successful matching to a variety of spatiotemporal real-world election data is the ultimate test of any theory.
The parameters N 0 and N 1 of the fixed voters can be interpreted according to two viewpoints. We emphasize that both viewpoints are valid and useful: (1) Zealots and opinion leaders: As originally stated above, fixed voters can be viewed as unwavering candidate supporters and opinion leaders (peers, journalists, or politicians) that influence uncommitted voters, but are not themselves influenced by their neighbors' vote choices; (2) External factors: Alternatively, following our assumption in which each uncommitted voter is equally likely to interact with the fixed voters, the parameters N 0 and N 1 give the "effective strength" of the consistent partisan bias conveyed by the fixed voters in favor of one of the candidates (with effective strength N 0 ) over another (with effective strength N 1 ). As stressed in the Introduction, these consistent partisan biases conveyed by opinion leaders are merely one instance of a broad class of consistent external factors that influence the choices of uncommitted voters. These external factors include exposure to television, newspapers, or campaign persuasion. Recognizing that no voter is a "blank slate," these external factors also include any prejudiced beliefs, party identification, an individual's upbringing, religion, or political ideology of uncommitted voters [20][21][22]. Mathematically, this broad interpretation is achieved (see Materials and methods) by analytically extending the parameters N 0 and N 1 to non-integer values, thus enabling the modeling of arbitrary strengths of these external influences in favor of one of the candidates over another. According to this viewpoint, copying the state of a connected voter represents mutual influence among friends, neighbors, and family members via social imitation or via a consistent partisan bias acting on uncommitted voters (by opinion leaders or other external sources).
External influences of opposite partisan biases do not cancel; instead larger N 0 and N 1 reflect increasing probability that consistent partisan biases determine the choices of uncommitted voters, independent of the voting choices of other uncommitted voters. Here we assume that there are many external sources of competing political information, and that over the election period in question the sources are persistent in their proportion of partisan biases regarding the two-major party candidates, though vary in the way they influence individual voters' choices. Election years that are consistently biased towards the first (second) party's candidate would be represented by N 0 greater (smaller) than N 1 .

The limiting stationary distribution of votes
We have previously proposed the above model as a widely applicable theory of collective behavior of complex systems [18-19, 31, 79-80], where the generalized voter model was solved exactly for a fully connected network. The fully connected network case was also shown to be equivalent (up to simple scaling) to a homogeneous random network (see Materials and methods). More specifically, at equilibrium, the probability of finding the network in the global state of k free voters in state 1 (i.e., voting for candidate 1) is given, independently of the initial state, as follows (see derivation in Materials and methods):

$$\rho(k) = \frac{\binom{N_1+k-1}{k}\binom{N+N_0-k-1}{N-k}}{\binom{N+N_0+N_1-1}{N}} \tag{1}$$

where N is the number of free voters, k is the number of free voters in state 1, and $\binom{n}{k}$ are binomial coefficients. As mentioned above, analytically extending the parameters N 0 and N 1 to non-integer values makes it possible to capture not only the case of zealots and opinion leaders but also the generalized effects of external factors (see Materials and methods). In this case, the solution in Eq 1 remains the same, with the difference that factorials must be replaced by gamma functions. Indeed, as we move around in the (N 0 , N 1 )-parameter space, the stationary distribution in Eq 1 exhibits strikingly different shapes. The different shapes of the stationary distributions depend on the magnitude of the external parameters, N 0 and N 1 , compared to the extent of social imitation within the network of uncommitted voters, and the relative partisan bias of opinion leaders or other external influences (e.g., television and newspapers) toward the first or second candidate (i.e., N 0 > N 1 or vice versa). As shown in Materials and methods, these distributions vary from skewed unimodal distributions with intermediate peaks or peaks at all nodes 1 or all nodes 0, to bimodal and uniform distributions.
Interestingly, Eq 1 remains valid for other network topologies (including random, regular lattice, scale-free and small world networks) if N 0 and N 1 are re-scaled according to the degree distribution (see Materials and methods).
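To make the range of shapes concrete, the sketch below evaluates the stationary distribution numerically. It assumes that Eq 1 has the beta-binomial form ρ(k) = C(N1+k−1, k) · C(N+N0−k−1, N−k) / C(N+N0+N1−1, N), with factorials replaced by gamma functions so that non-integer N0 and N1 are allowed; that form is an assumption of this example, and the function names are hypothetical.

```python
from math import exp, lgamma


def log_binom(a, b):
    """Log of the generalized binomial coefficient C(a, b) via gamma functions."""
    return lgamma(a + 1) - lgamma(b + 1) - lgamma(a - b + 1)


def stationary_pmf(n_free, n0, n1):
    """Stationary probabilities rho(k), k = 0..n_free, for real n0, n1 > 0.

    Assumes the beta-binomial form stated in the lead-in above.
    """
    log_norm = log_binom(n_free + n0 + n1 - 1, n_free)
    return [
        exp(
            log_binom(n1 + k - 1, k)
            + log_binom(n_free + n0 - k - 1, n_free - k)
            - log_norm
        )
        for k in range(n_free + 1)
    ]
```

Under this form, `stationary_pmf(10, 1, 1)` is uniform over k, values of N0 and N1 below 1 give a bimodal shape with peaks at the two consensus states, and large values give a single interior peak.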
In this paper we are mostly interested in the fraction of voters (vote-share) that voted for a candidate rather than the actual number of voters. Thus, we define the vote-share for candidate 1 as the scaled variable v = k/N. The mean and variance of v can be computed from Eq 1 as follows:

$$\mu_v = \frac{N_1}{N_0+N_1} \tag{2}$$

$$\sigma_v^2 = \frac{\mu_v(1-\mu_v)}{N} + \left(1-\frac{1}{N}\right)\frac{\mu_v(1-\mu_v)}{1+N_0+N_1} \tag{3}$$

The variance of vote-shares in Eq 3 has an appealing interpretation. When peer influences (via social imitation) are very weak compared to external forces ($N_0, N_1 \to \infty$), the variance of vote-shares becomes

$$\sigma_v^2 = \frac{\mu_v(1-\mu_v)}{N}$$

This is the variance of vote-shares that one would expect if all uncommitted voters are solely influenced, each with probability $\mu_v$, by the consistent partisan biases exerted by either opinion leaders with opposing views or other external forces (e.g., mass media), independent of the voting choices of other uncommitted voters. The second term on the right side of Eq 3, which is a decreasing nonlinear function of the external influence parameters, represents the effect of social imitation and peer influence within the network of uncommitted voters. This second term, which we call the "social influence index," provides us with a method of detecting and isolating the effect of social imitation and social contagion. We use this index extensively in this paper to explore and understand how social influence changes across states and over almost a century of county-level U.S. presidential election years.
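The two-term decomposition of the variance can be checked numerically. The sketch below is illustrative only: it assumes the variance is the binomial term μ(1−μ)/N plus the social term (1 − 1/N) μ(1−μ)/(1 + N0 + N1), and it assumes the stationary probabilities satisfy the ratio ρ(k+1)/ρ(k) = (N1+k)(N−k) / ((k+1)(N+N0−k−1)). Both are assumptions of this example, and all function names are hypothetical.

```python
def variance_terms(n_free, n0, n1):
    """Return (mean, binomial term, social influence term) of the vote-share."""
    mu = n1 / (n0 + n1)
    binomial_term = mu * (1 - mu) / n_free
    social_term = (1 - 1 / n_free) * mu * (1 - mu) / (1 + n0 + n1)
    return mu, binomial_term, social_term


def stationary_weights(n_free, n0, n1):
    """Stationary probabilities built from the assumed ratio rho(k+1)/rho(k)."""
    weights = [1.0]
    for k in range(n_free):
        ratio = (n1 + k) * (n_free - k) / ((k + 1) * (n_free + n0 - k - 1))
        weights.append(weights[-1] * ratio)
    total = sum(weights)
    return [w / total for w in weights]


def numerical_moments(n_free, n0, n1):
    """Mean and variance of v = k / n_free computed directly from the pmf."""
    pmf = stationary_weights(n_free, n0, n1)
    shares = [k / n_free for k in range(n_free + 1)]
    mean = sum(v * q for v, q in zip(shares, pmf))
    var = sum((v - mean) ** 2 * q for v, q in zip(shares, pmf))
    return mean, var
```

With N = 100 and N0 = N1 = 10, the binomial term is 0.0025 and the social term is about 0.0118, so in this hypothetical setting most of the vote-share variance comes from imitation rather than from independent responses to external bias.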

Estimation of external influence from large scale voting data
The U.S. presidential election data are often collected at the level of counties. This data provides, among others, information on the vote-share in each county i (a single realization from an unobserved stationary vote-share distribution). Thus, in order to divulge the phenomenology of voting contagion in electoral voting behavior, we need to show how to estimate the external parameters of the generalized voter model from real data. The unknown external influence parameters N 0 and N 1 for any state in any election year can be estimated from a sample of observed vote-shares across counties as follows. Suppose a particular state has n counties, and let v i be the fraction of voters in the i th county that voted for candidate 1, and N i be the total number of votes cast for all candidates in the county. We assume that all counties of a state are influenced by the same external parameters N 0 and N 1 . Accordingly, the voting dynamics in the i th county is governed by the generalized voter model, which applies to a subnetwork of N i free nodes and N 0 and N 1 fixed nodes (note that each county has a different number of free nodes). We assume that the vote-share distribution (Eq 1) in each county is in equilibrium, and that the corresponding mean and variance are given by Eqs 2 and 3.
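Using Eq 2, the expected value of the vote-share in county i does not depend on i, and is equal to $\mu_i = \mu = N_1/(N_0 + N_1)$. We thus estimate $\mu_i$ by simply taking the sample average $\hat{\mu}$ of vote-shares across all n counties. For the variance of the vote-share $\sigma_i^2$ in county i, a crude estimate based on the single observed vote-share data point $v_i$ is provided by $(v_i - \hat{\mu})^2$. Obviously, this estimate is imperfect, and we define the residual between $\sigma_i^2$ and the estimate of $\sigma_i^2$ as

$$\varepsilon_i = (v_i - \hat{\mu})^2 - \sigma_i^2 \tag{4}$$

Using Eqs 3 and 4, we define a system of nonlinear estimation equations (one equation for each county) that relate $(v_i - \hat{\mu})^2$, the estimate of $\sigma_i^2$, to the external parameters $N_0$ and $N_1$:

$$(v_i - \hat{\mu})^2 = \frac{\mu(1-\mu)}{N_i} + \left(1 - \frac{1}{N_i}\right)\frac{\mu(1-\mu)}{1+N_0+N_1} + \varepsilon_i, \qquad i = 1, \ldots, n \tag{5}$$

The estimation procedure first estimates $\mu$ on the right-hand side of Eq 5 by $\hat{\mu}$, and then selects the sum of parameters $N_0 + N_1$ that minimizes the sum of squared errors $\sum_{i=1}^{n} \varepsilon_i^2$ in Eq 5. The least squares estimate is given by

$$\frac{1}{1+\hat{N}_0+\hat{N}_1} = \frac{\sum_{i=1}^{n}\left(1-\frac{1}{N_i}\right)\left[(v_i-\hat{\mu})^2 - \frac{\hat{\mu}(1-\hat{\mu})}{N_i}\right]}{\hat{\mu}(1-\hat{\mu})\sum_{i=1}^{n}\left(1-\frac{1}{N_i}\right)^2} \tag{6}$$

Eq 6 and the condition $\hat{\mu} = \hat{N}_1/(\hat{N}_0+\hat{N}_1)$ fully determine the estimated external parameters. We can then use the estimate in Eq 6 to obtain the "social influence index" of the state as defined in Eq 3:

$$\text{social influence index} = \left(1-\frac{1}{\bar{N}}\right)\frac{\hat{\mu}(1-\hat{\mu})}{1+\hat{N}_0+\hat{N}_1} \tag{7}$$

where $\bar{N} = \sum_i N_i/n$ is the average number of voters per county. Eq 7 forms the basis for the statistical analysis of social imitation for all states across U.S. presidential elections (see section Empirical Results of US Presidential Elections: 1920 to 2012).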
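The estimation steps can be sketched as follows. This is an illustration under stated assumptions rather than the authors' implementation: it assumes each county's squared deviation equals μ(1−μ)/N_i + (1 − 1/N_i) μ(1−μ) x + ε_i with x = 1/(1 + N0 + N1), so that the least-squares problem is linear in x, and it assumes the index is the social term evaluated at the average county size. The function name and return layout are hypothetical.

```python
def estimate_external_influence(vote_shares, votes_cast):
    """Least-squares estimate of (N0, N1) and the social influence index.

    vote_shares: county vote-shares v_i for candidate 1 within one state.
    votes_cast:  total votes N_i cast in each county.
    Assumes the linear-in-x form described in the lead-in, x = 1/(1+N0+N1).
    """
    n = len(vote_shares)
    mu_hat = sum(vote_shares) / n
    spread = mu_hat * (1 - mu_hat)
    numerator = 0.0
    denominator = 0.0
    for v, n_i in zip(vote_shares, votes_cast):
        coeff = (1 - 1 / n_i) * spread             # multiplies x in county i
        target = (v - mu_hat) ** 2 - spread / n_i  # deviation minus binomial term
        numerator += coeff * target
        denominator += coeff * coeff
    x_hat = numerator / denominator
    if x_hat <= 0:
        raise ValueError("vote-shares are not over-dispersed; no finite estimate")
    total = 1 / x_hat - 1                          # estimate of N0 + N1
    n1_hat = mu_hat * total                        # from mu_hat = N1 / (N0 + N1)
    n0_hat = total - n1_hat
    n_bar = sum(votes_cast) / n                    # average votes per county
    index = (1 - 1 / n_bar) * spread * x_hat
    return n0_hat, n1_hat, index
```

When every county has the same size, the index plus the binomial term μ̂(1−μ̂)/N reproduces the mean squared deviation of the observed vote-shares, which is a convenient sanity check on synthetic inputs.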
Derivation of the stationary vote-share distribution at the county level
For the U.S. presidential elections from 1920 to 2012, we empirically find that the external parameters $\hat{N}_0, \hat{N}_1 \gg 1$ for all states and across election years. Moreover, we notice that the total number of voters $N_i$, in any county i for any given election year, is large. Thus, the voting dynamics in any county applies to a network of voters with a very large number of free and fixed nodes. Driven by these facts, we find that in the limit $N_i \to \infty$ the stationary distribution in Eq 1, which characterizes the long-run distribution of votes in the i th county, is approximately a Gaussian distribution (see Materials and methods). More specifically, the asymptotic vote-share distribution in county i is given by a Gaussian

$$f(v_i) = \frac{1}{\sqrt{2\pi\sigma_i^2}}\exp\left(-\frac{(v_i-\mu)^2}{2\sigma_i^2}\right)$$

with mean $\mu$ and variance $\sigma_i^2$ given by Eqs 2 and 3. We stress that this predicted Gaussian vote-share distribution (and its characteristic mean and variance) at the county level is not assumed from the outset but turns out to be the consequence of basic principles of voting behavior and the generalized voter model. We next derive the stationary vote-share fluctuations across counties.

Derivation of the stationary vote-share distribution across counties
While the stationary vote-share distribution at the county level is not observed (but predicted to be Gaussian), the availability of large sets of empirical voting data enables us to obtain, for each state in every election year, the probability distribution of observed vote-shares across all n counties in the state (see Fig 1). As mentioned above, this candidate vote-share distribution has been the focus of recent attention. A plausible model for the stationary vote-share distribution across counties is to describe it as a Gaussian scale mixture [81] with n different components (representing the n counties in the state), each distributed as a normal distribution with the same mean $\mu$ and different variances $\sigma_i^2$, as specified above. Let v denote the random variable corresponding to this Gaussian mixture (this is called the "vote-share per county" in Fig 1). This Gaussian mixture is a unimodal distribution with mode at $\mu$, skewness value 0, variance $\sigma_v^2 = \frac{1}{n}\sum_{i=1}^{n}\sigma_i^2$, and kurtosis $\beta_2 = 3\,\frac{1}{n}\sum_{i=1}^{n}\sigma_i^4 \big/ \sigma_v^4$. Using the Pearson system, the Gaussian scale mixture can be shown to be approximately a t-distribution [82]. More specifically, let $c_0 = 2\sigma_v^2\beta_2/(5\beta_2 - 9)$ and $c_2 = (\beta_2 - 3)/(5\beta_2 - 9)$ be the Pearson coefficients corresponding to the Gaussian mixture, and let $\alpha = \sqrt{(1 - c_2)/c_0}$ and $m = (1 - c_2)/c_2$. Then, the scaled and shifted random variable $\alpha(v - \mu)$ is approximately distributed as a Student's t-distribution with m degrees of freedom [82]. Notice that the parameters $\mu$, $\alpha$, and m of this Student's t-distribution can be completely specified once the external parameters $N_0$, $N_1$ are estimated (as was shown above), and the number of counties in the state n and the total number of votes $N_i$ in each county are given (these data are publicly available in many countries).
The above key result implies that the scaled and shifted vote-shares across counties can be described by a Student's t-distribution with m degrees of freedom. Finally, we empirically find for our comprehensive U.S. presidential election data that the number of degrees of freedom $m \gg 100$ for all states in every election year. In this case, the Student's t-distribution with m degrees of freedom approaches the normal distribution; and thus the distribution of the scaled and shifted vote-shares across counties is predicted to closely match the standard normal distribution. We emphasize that this predicted Gaussian vote-share distribution across counties is derived from first principles and does not involve any a priori assumption about the vote-share distribution. Successful matching to election data will be a corroboration of this theory.
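The mapping from county variances to the t-distribution parameters can be illustrated with a short sketch. It assumes an equal-weight Gaussian scale mixture, for which the variance is the average of the county variances and the kurtosis is β2 = 3 · mean(σ_i⁴) / (mean(σ_i²))²; the equal weighting and the function name are assumptions of this example, while the Pearson coefficients follow the definitions in the text.

```python
from math import inf, sqrt


def t_approximation_parameters(county_variances):
    """Return (alpha, m) such that alpha * (v - mu) is roughly Student-t with m dof.

    Assumes every county enters the Gaussian scale mixture with weight 1/n.
    """
    n = len(county_variances)
    var_v = sum(county_variances) / n
    # Fourth central moment of a zero-mean Gaussian with variance s is 3 * s**2.
    fourth_moment = 3 * sum(s * s for s in county_variances) / n
    beta2 = fourth_moment / var_v ** 2
    denom = 5 * beta2 - 9
    c0 = 2 * var_v * beta2 / denom
    c2 = (beta2 - 3) / denom
    alpha = sqrt((1 - c2) / c0)
    # Equal variances give beta2 = 3 and c2 = 0: the mixture is exactly Gaussian.
    m = inf if c2 <= 0 else (1 - c2) / c2
    return alpha, m
```

As a consistency check on the formulas, a Student's t-distribution with m degrees of freedom has kurtosis 3(m − 2)/(m − 4), which gives c2 = 1/(m + 1) and therefore (1 − c2)/c2 = m, and the scaled variable has variance m/(m − 2) as expected.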

Empirical results of US presidential elections: 1920 to 2012
Analyzing the county vote-share probability distributions
Our analysis is based on US presidential election data from 1920 to 2012 [83]. States with fewer than 10 counties (i.e., Connecticut, Delaware, Hawaii, and Rhode Island) and Washington, D.C. were excluded from the analysis. For each state, in every election year, the data include the number of counties n for which vote-share data were available, the vote-share $v_i$ in county i, and the total number of votes cast for all candidates in county i, $N_i$. The external influence parameters $N_0$ and $N_1$, and the distribution parameters $\alpha$ and m, were estimated for all states and every election year. Using these parameters, we constructed the probability distribution of the scaled and shifted vote-share quantity $\alpha(v_i - \mu)$ and compared it with the predicted normal distribution. Fig 2 shows that this theoretical prediction fits remarkably well for most states and election years, representing almost a century of county-level U.S. presidential election data, and is consistent with observations in other countries [78].

Analyzing the evolution of social influence
We can use the index of social influence defined in Eq 7 to explore the level of social interactions across states and election years. We first examine the distribution of the social influence index, aggregated over all states and election years. The histogram in the upper panel of Fig 3 shows a right-skewed distribution. This means that while the bulk of the distribution occurs for small values of social contagion, the electorate in US presidential elections is at times highly volatile and subject to wide swings of social contagion effects well above the typical value. This is reflected by the long right tail of the histogram. Here we find that the log-logistic distribution provides a slightly better fit than the log-normal distribution. The log-logistic distribution is similar in shape to the log-normal distribution, but has heavier tails [84].
To analyze the electoral dynamics, we examine the spatial and temporal variation in social influence from 1920 to 2012. First, we examine the evolution of social influence over time. Fig 4 shows the time series of the average social influence for each of the nine U.S. census divisions (panels a-f) along with the time series of (normalized) social influence averaged over all U.S. states (panel g). To enable the comparison of the various time series, all data are normalized to Z-scores. Specifically, for each individual time series we express the social influence in terms of standard deviations from its mean, calculated from 1920 to 2012. We use hierarchical clustering to identify clusters of U.S. census divisions with the highest within-cluster time-series correlation and the greatest between-cluster time-series variability. The hierarchical clustering suggests three clusters: two main clusters (arranged in panels d and f) with within-cluster average correlations of 0.824 and 0.909, and a relatively high between-cluster average correlation of 0.775; and a singleton cluster (New England) with a relatively low between-cluster average correlation of 0.1071.
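The normalization and the pairwise correlation used to compare divisions can be sketched in a few lines of Python. This is a minimal illustration, assuming the population standard deviation is used for the Z-scores (the text does not specify which convention was applied); the function names are hypothetical, and the clustering step itself is not reproduced here.

```python
import math


def zscores(series):
    """Express each value as a number of standard deviations from the series mean.

    Assumption: population standard deviation (divide by n).
    """
    n = len(series)
    mean = sum(series) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in series) / n)
    if sd == 0.0:
        raise ValueError("a constant series cannot be normalized")
    return [(x - mean) / sd for x in series]


def pearson_correlation(a, b):
    """Pearson correlation of two equally long time series."""
    if len(a) != len(b):
        raise ValueError("series must have the same length")
    za, zb = zscores(a), zscores(b)
    return sum(x * y for x, y in zip(za, zb)) / len(za)
```

Because correlation is invariant to shifting and rescaling, two divisions with very different absolute levels of social influence can still be perfectly correlated, which is why the normalized curves can be compared directly.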
Remarkably, we find that despite variations in social influence across states and divisions, the normalized time series pertaining to the overwhelming majority of states (with the exception of the three-state region of New England analyzed here) collapse onto a very similar curve. Indeed, as can be seen, the normalized curves in panels d and f show a very similar pattern, which is also similar to the observed temporal pattern of social influence when averaged over all states (Fig 4g). That is, the pattern in Fig 4g shows a monotonic upward trend, which means that social influence increases through time (Mann-Kendall test, p < 0.001). Moreover, the period of 1984-2012 displays much higher levels of social influence than the period of 1920-1980 (Mann-Whitney U-test, p < 0.001, see S1 Fig).
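For readers unfamiliar with the trend test cited above, the following Python sketch implements a standard Mann-Kendall test using the normal approximation with a continuity correction and a tie correction. Whether the reported p-value is one-sided or two-sided is not stated in the text; the two-sided version is assumed here, and the function name is illustrative.

```python
import math
from collections import Counter


def mann_kendall(series):
    """Return (S, z, two-sided p-value) of the Mann-Kendall trend test."""
    n = len(series)
    if n < 3:
        raise ValueError("need at least three observations")
    # S counts concordant minus discordant pairs in time order.
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = series[j] - series[i]
            s += (diff > 0) - (diff < 0)
    # Variance of S under the null hypothesis, corrected for tied values.
    tie_term = sum(t * (t - 1) * (2 * t + 5) for t in Counter(series).values())
    var_s = (n * (n - 1) * (2 * n + 5) - tie_term) / 18.0
    if s == 0 or var_s == 0.0:
        z = 0.0
    elif s > 0:
        z = (s - 1) / math.sqrt(var_s)   # continuity correction
    else:
        z = (s + 1) / math.sqrt(var_s)
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided (assumption)
    return s, z, p_value
```

With 24 elections between 1920 and 2012, a strictly increasing series gives the maximal statistic S = 276, so even a noisy but persistent upward trend can reach very small p-values.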
New England is an apparent exception to this pattern (Fig 4a and 4b). However, this exception may be explained by historical events and our model. One of the most distinctive characteristics that sets New England apart from other political regions in America is its town meeting form of government, a local institution that did not spread to other states [85]. The town meeting is the legislative assembly of a town in which qualified voters make laws in face-to-face communal decision making [85]. Town meetings defined New England's politics until the middle decades of the 20th century. This changed in 1962 with the Supreme Court's "one person, one vote" decisions, which shifted the power dynamic away from the small towns that practiced town meetings and face-to-face interactions, and toward cities that adopted representative politics [85]. Thus the relatively high levels of social interaction observed in New England prior to the 1960 election (see Fig 4b), contrary to the patterns observed in other regions, correspond to the period in which town meetings, a powerful platform of social influence via face-to-face, communal decision making, had wide legislative powers. This was followed by a sharp decline in relative social influence (see Fig 4b) after the Supreme Court's "one person, one vote" decision, which had the effect of shifting power from face-to-face communication and social interaction to representative politics. This political transition changed not only the relative level of social interactions (and thus the variability of the vote-share distributions) but also the partisan bias (hence the mean of the vote-share distributions) towards the Democrats [85].

Analyzing the correlation trends of social influence
As further support for the usefulness and consistency of our model, we examine how the spatial variation of social influence across states changes over time. We can characterize each election as a vector of state-level indices of social influence (using Eq 7), and measure the similarity between each pair of elections by the corresponding correlation coefficient (Fig 5b). This type of analysis, combined with the findings in Fig 4, reveals intriguing patterns that go beyond short-term fluctuations in the partisan division of the vote. Hierarchical clustering of the elections by the social influence correlation distance shows several marked clusters of highly similar election years (see Fig 5b): two large clusters, 1932-1972 and 1984-2012, and three smaller clusters, 1920-1924, 1928, and 1976-1980. Remarkably, these clusters of social interactions correspond closely to the partitioning of American history into distinct party systems [86].
There have been six party system periods in American history, separated by relatively significant change in party loyalties [86][87][88][89][90]. Clustering analysis reveals that during 1932-1972, external forces (in the form of attitudes, orientations, party identification, individual's upbringing, religion, or ideology) are strong compared to social/peer influences, and indicate a stable long-term electorate phase. This result is plausibly supported by the historical account. The stock market crash of 1929 and the ensuing depression signaled the realignment of the fifth party system from a Republican to Democratic majority with the election of 1932 and the New Deal coalition [86,88]. The change was also influenced by demographic changes of rising American electorate of African Americans, blue collar workers, Catholics and urban ethnics, and a shrinking Republican base of white Protestants, small town residents, farmers, and middle class businessmen [88]. The distinction between external and social influence stands despite some fluctuations in Republican vs. Democratic selections.
The cluster 1976-1980 identified by our clustering analysis (see also Fig 5b) suggests that the elections of 1976 and 1980 formed a transition period to the post-New Deal era of weakened partisanship among the voters [88]. This transition period corresponds to the Watergate scandal and Nixon's resignation in 1974, and Democrat Jimmy Carter's victory in the 1976 presidential election.
Fig 5. a) Sample maps of social influence clusters (see S2-S25 Figs and S1 Movie for a complete analysis). The colored areas reflect the significance (p-value) of local concentration of social influence for each state. The p-values for each state are derived from a random permutation test of local clustering using the Getis-Ord Local $G_i^*$ statistic. This analysis was performed with a contiguity spatial weight matrix that indicates whether states share a boundary or not. The variable of concern is the social influence index calculated using Eq 7 in the main text. Low p-values (p-value ≤ 0.1) indicate statistically significant high levels of social influence at a state and its surrounding neighbors (hot spots). High p-values (p-value ≥ 0.9) indicate statistically significant low levels of social influence at a state and its surrounding neighbors (cold spots). b) Heatmap of the correlation coefficients between pairs of election years characterized by their state-level social influence profiles. https://doi.org/10.1371/journal.pone.0177970.g005
Whereas the previous fifth party system was characterized by strong party loyalties and partisan attachments, the sixth party system (overlapping with the 1984-2012 cluster in Fig 5b) is characterized by electoral dealignment, i.e., the weakening of party loyalties among voters [88,90,91], reduced political involvement [92], and the critical role of voters' personal social interaction networks in determining vote choices [24]. As partisanship declined and more voters became independents [86], inter-election vote swings increased [88]. Moreover, the external influence of television and newspapers declined as the media were considerably less likely to be sources of partisan-biased information [24,88,93]. This led to a period of strong competition where neither Democrats nor Republicans created a true majority party, resulting in alternating control of the presidency, split-ticket voting, and divided government. These trends seem to be consistent with our model, which shows higher levels of social contagion for the 1984-2012 period (Fig 4g), relative to the 1920-1980 period, combined with the long-term stability of social influence patterns indicated by the high levels of association between the 1984-2012 elections (Fig 5b).

Mapping the geography of social influence
S2 Movie in Supplementary Material shows maps (excluding Alaska and Hawaii), color-coded by levels of social influence, for all election years. In order to better characterize the spatial patterns of social influence observed in S2 Movie, we apply a variety of spatial statistical data analysis methods. First, we utilized a random permutation test of spatial autocorrelation using the Moran's I statistic [94,95]. The random permutation tests suggest (see S1 Table) the presence of significant positive spatial correlation, for all election years, between states' own levels of social influence and the levels of their neighbors, as indicated by the level of significance (p-value) shown in the third column of S1 Table. This analysis was performed with a contiguity spatial weight matrix (row normalized) that indicates whether states share a boundary or not. While the Moran's I statistic indicates that the spatial distribution of high and/or low values is more spatially clustered than would be expected if underlying processes were random, it does not identify unexpected spatial spikes of high or low social influence values. We thus applied random permutation tests of spatial clustering using the Getis-Ord General $G^*$ statistic [96][97]. The tests indicate (see S2 Table) that social influence is significantly concentrated in space, as shown by the significance levels (p-value) in the third column of S2 Table. That is, for all election years, the observed Getis-Ord General $G^*$ is larger than the expected General $G^*$, indicating that the spatial distribution of high social influence values is more spatially clustered than would be expected if underlying spatial processes were truly random.
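A random permutation test of Moran's I can be sketched as follows. This is an illustrative pure-Python implementation under stated assumptions: the contiguity matrix is a plain list of 0/1 rows, every state has at least one neighbor, and the pseudo p-value is one-sided (testing for positive autocorrelation). The function names, the number of permutations, and the seed are hypothetical choices, not the settings used in the paper.

```python
import random


def row_normalize(adjacency):
    """Divide each row of a 0/1 contiguity matrix by its number of neighbors."""
    return [[w / sum(row) for w in row] for row in adjacency]


def morans_i(values, weights):
    """Moran's I for values observed on units linked by a spatial weight matrix."""
    n = len(values)
    mean = sum(values) / n
    dev = [x - mean for x in values]
    w_total = sum(sum(row) for row in weights)
    cross = sum(weights[i][j] * dev[i] * dev[j]
                for i in range(n) for j in range(n))
    return (n / w_total) * cross / sum(d * d for d in dev)


def moran_permutation_test(values, weights, n_perm=999, seed=0):
    """Return (observed I, one-sided pseudo p-value) by shuffling values over units."""
    rng = random.Random(seed)
    observed = morans_i(values, weights)
    shuffled = list(values)
    at_least_as_large = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if morans_i(shuffled, weights) >= observed:
            at_least_as_large += 1
    return observed, (at_least_as_large + 1) / (n_perm + 1)
```

Shuffling the values over the states while holding the map fixed gives the reference distribution "if underlying processes were random" that the observed statistic is compared against.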
In order to identify where high or low values of social influence cluster spatially, we further applied a random permutation test of local clustering using the Getis-Ord Local $G_i^*$ statistic [96][97]. Low p-values of the random permutation test indicate statistically significant high levels of social influence at a state and its surrounding neighbors (hot spots). High p-values indicate statistically significant low levels of social influence at a state and its surrounding neighbors (cold spots). This analysis was performed with a contiguity spatial weight matrix that indicates whether states share a boundary or not. The corresponding maps of hot spot analysis, for all election years, are presented in S2-S25 Figs and S1 Movie. A sample of these maps of social influence clusters is presented in Fig 5a. The colored areas in Fig 5a reflect the significance (p-value) of local concentration of social influence for each state.
It would be interesting to speculate on the political, economic, social, and psychological factors that drive geographic variation in voting contagion. Research in the geographical and psychological sciences, which examines the geographical distribution of political, economic, social, and personality traits within the United States [98][99][100][101][102][103], suggests that the Great Plains and West North Central region is characterized by individuals that are typified by conservative social values, low openness and resistance to change, and preference of familiarity over novelty. This region comprises states with comparatively small minority populations [101], is less affluent, has fewer highly educated residents, is less innovative compared with other regions, and tends to be politically conservative and religious [100]. Individuals in this region choose to settle near family and friends and maintain intimate social relationships with them, but also tend to display low levels of social tolerance and acceptance for people who are from different cultures, unconventional, or live alternative lifestyles [100].
Altogether, the above characteristics indicate a region where voters' choices are plausibly based upon strong ideology, party identification, orientations and attitudes rooted in religion and traditional social values, and reinforced by face-to-face interactions with like-minded family members and friends. We therefore expect our model to generate a social influence index (see S2 Movie for maps of raw social influence values instead of the Getis-Ord Local $G_i^*$ statistic) that reflects external forces (e.g., in the form of party identification or ideology), which are strong compared to peer influences.
Unlike the very low openness and conservative social values typical of the Great Plains and West North Central region, the Middle Atlantic and Southwest region is marked by moderately to very high openness, and is wealthy, educated, culturally and ethnically diverse, and economically innovative [100,102]. This region appears to be politically liberal, and has fewer mainline Protestants [100]. Residents of this region also appear to be tolerant and accepting of social and cultural differences [100]. Considering the social diversity, tolerance, openness, and open-mindedness in this region, it is plausible that people's orientations and attitudes are influenced by the attitudes of others [100]. This is consistent with our model, which shows high levels of the social influence index (see maps of raw social influence values in S2 Movie), indicating peer influences that are strong compared to external forces in the form of attitudes or ideology. Although further research is needed to uncover the factors affecting social influence, it is plausible that economic, social, and psychological factors, as discussed above, can explain the geographical variability of social influence.

Discussion
Many complex systems can be viewed as comprising numerous interconnected units, each of which independently responds to external forces but is also affected by internal forces exerted by the states of its connected units. In such systems, the stationary distribution of the states of the units may change in characteristic ways depending on the strength of external influences relative to internal influences [18-19, 31, 79-80]. Therefore, a key question is how to disentangle the effect of internal influences from that of exposure to external influences, given observational data about the phenomena we are trying to explain. This identification problem is important not only in the biological and physical sciences (e.g., ecosystems, see [104]), but also in the social sciences, where the importance of social interactions in forming opinions and decisions has been emphasized [12,45,46,105]. The U.S. presidential elections are a case in point. In such situations, voters' candidate choices are affected by many sources that convey consistent partisan biases skewed in favor of one candidate over another. These sources are numerous and include exposure to television, newspapers, campaign persuasion, or opinion leaders (including peers, journalists, or politicians); they also include various individual prejudiced attitudes and orientations, party identification, individual's upbringing, religion, or ideology (no voter is a 'blank slate'). Uncommitted voters are also affected by the choices of other uncommitted voters in their own personal networks, via social imitation mechanisms. All of these empirical facts are deeply rooted in the extensive study of electoral behavior by social and political scientists (see Introduction) as well as studies of opinion dynamics in the sociophysics literature (see Models of Opinion Dynamics).
The vote-share fluctuations across counties, and other spatiotemporal voting patterns, thus depend on the relative magnitude of the persistent partisan biases for one candidate over another.
Individual voters are influenced by a variety of psychological and social factors, but taking them all into account would be not only impossible but also unnecessary for understanding the large-scale behavior of the system. This large-scale behavior can still be captured by introducing a few key parameters, as we have demonstrated in this paper. We presented a general methodology for quantifying the degree of social imitation and peer influence on the basis of given observational data. The methodology is based on an extended version of the voter model [18][19] that takes into account the effect of external forces, and is applied to a comprehensive data set of US presidential elections from 1920 to 2012. An essential element in the model is social interaction between individual voters. The model includes two parameters that reflect the bias in favor of one of two candidates. These tunable parameters represent unwavering candidate supporters (zealots or opinion leaders) that convey a consistent partisan bias in favor of one candidate over another; alternatively, as discussed above, they can be interpreted as external factors that influence uncommitted voters' choices. In addition to these external factors, voters are also influenced by the behavior of others via social imitation.
Our model is validated in several ways. First, we derive the theoretical probability distribution of the vote-share per county, and find a remarkable fit between the theoretical result and the empirically observed county vote-share distributions. Our theoretical result is also consistent with observations in other countries [78]. To our knowledge this is the first study that provides an analytical expression of the stationary vote-share distribution across counties. Second, we examined the temporal dynamics of social influence by calculating the social influence index for each state and each election year. Our analysis reveals a distinct pattern of increasing social influence over 92 years (1920-2012) of US presidential elections. The 1984 election year represents the phase transition point from low (1920-1980) to high (1984-2012) levels of social contagion. The increasing levels of social influence at presidential elections suggest, in turn, a decline of the bias induced by external forces (e.g., partisanship among voters) and increasing independence in voting behavior. Third, we examined how the geographic variation across states in social influence changes over time. This spatiotemporal analysis enables our model to reproduce two stable long-term periods of election years corresponding to two successive long-term periods of low and high levels of social contagion, in alignment with the 1984 phase transition finding. This suggests a new data-driven, large-scale systems approach to characterizing abrupt transitions of political events, which is based on critical realignment in the patterns of social contagion. Finally, we use the model to map the social contagion geography of the United States. Results from spatial analysis reveal robust differences among regions of the United States in terms of their social influence index. In particular, we identify two regions of 'hot' and 'cold' spots of social influence, each comprising states that are geographically close.
We provided some evidence that statewide variation in social contagion may be linked to psychological, social, and economic factors.
More broadly, the results suggest the growing role of social influence, contagion, and 'herd-following' in shaping people's behaviors, tastes, and actions in a variety of real-life situations. Social influence and contagion will likely become increasingly evident as our society becomes more interconnected through the information superhighway and transport infrastructure networks. If we want to truly understand macro-level collective behavior in human systems, and perhaps devise ways by which human society can increase its collective wisdom, it will be important to develop practical and effective methods for measuring and monitoring the extent of social influence.

Dynamic network model of voting
Consider a network representing a county with N nonpartisan voters (variable nodes) taking only the values of 0 or 1, representing support for candidate 0 or 1, respectively (e.g., Republican or Democrat). In addition, there are $N_0$ and $N_1$ partisan voters (frozen nodes) in state 0 and 1, respectively. At each time step, a variable node is selected at random; with probability 1 − p the node copies the state of one of its connected neighbors, and with probability p the state remains unchanged. The partisan nodes can also be interpreted as external perturbations, representing a variety of factors that influence voters' attitudes towards one of the two candidates (e.g., mass media, party identification, individual's upbringing, religion, or ideology). Analytically extending $N_0$ and $N_1$ to be real numbers enables modeling arbitrary strengths of external perturbations.
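The update rule can be simulated directly. The sketch below assumes a fully connected network, so only the number k of variable nodes in state 1 needs to be tracked; the function name, the initial condition (half of the variable nodes in state 1), and the seed are illustrative assumptions rather than details taken from the paper.

```python
import random


def simulate_vote_share(n_free, n0, n1, p, steps, seed=0):
    """Simulate the copying dynamics on a fully connected network.

    n_free: number of variable (nonpartisan) nodes N
    n0, n1: frozen nodes in state 0 and 1 (may be non-integer)
    p:      probability that the selected node keeps its state
    Returns the final fraction of variable nodes in state 1.
    """
    rng = random.Random(seed)
    k = n_free // 2                      # assumed initial condition
    others = n_free + n0 + n1 - 1        # nodes the selected node can copy
    for _ in range(steps):
        if rng.random() < p:
            continue                     # state left unchanged
        node_is_one = rng.random() < k / n_free
        # Nodes in state 1 among the others (free and frozen).
        ones_among_others = k + n1 - (1 if node_is_one else 0)
        copies_one = rng.random() < ones_among_others / others
        if node_is_one and not copies_one:
            k -= 1
        elif not node_is_one and copies_one:
            k += 1
    return k / n_free
```

Running the simulation many times with different seeds and collecting the returned vote shares gives an empirical approximation of the stationary distribution of the model.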
For a fully connected network the behavior of the system can be solved exactly as follows. The nodes are indistinguishable and the state of the network is fully specified by the number of nodes with internal state 1. Therefore, there are only N + 1 distinguishable global states, which we denote $S_k$, $k = 0, 1, \ldots, N$. The state $S_k$ has k variable nodes in state 1 and N − k variable nodes in state 0. If $P_t(k)$ is the probability of finding the network in state $S_k$ at time t, then $P_{t+1}(k)$ can depend only on $P_t(k)$, $P_t(k+1)$ and $P_t(k-1)$. The probabilities $P_t(k)$ define a vector of N + 1 components $\mathbf{P}_t$. The dynamics is described by the equation
$$P_{t+1}(k) = P_t(k)\left\{p + \frac{1-p}{N(N+N_0+N_1-1)}\left[k(k-1+N_1) + (N-k)(N-k-1+N_0)\right]\right\} + P_t(k-1)\,\frac{1-p}{N(N+N_0+N_1-1)}\,(k-1+N_1)(N-k+1) + P_t(k+1)\,\frac{1-p}{N(N+N_0+N_1-1)}\,(k+1)(N+N_0-k-1).$$
The term inside the first brackets gives the probability that the state $S_k$ does not change in that time step and is divided into two contributions: the probability p that the node does not change plus the probability 1 − p that the node does change but copies another node in the same state. In the latter case, the state of the node is 1 with probability k/N, and it may copy a different node in the same state with probability $(k-1+N_1)/(N+N_0+N_1-1)$. Also, if the state of the selected node is 0, which has probability (N − k)/N, it may copy another node in state 0 with probability $(N-k-1+N_0)/(N+N_0+N_1-1)$. The other terms are obtained similarly.
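One time step of this probability update can be written out explicitly. The following Python sketch builds the three contributions (staying in $S_k$, arriving from $S_{k-1}$, arriving from $S_{k+1}$) from the copying probabilities described above; the function name and the list-based representation of $\mathbf{P}_t$ are illustrative assumptions.

```python
def master_equation_step(prob, n0, n1, p):
    """Advance the probability vector P_t (length N + 1) by one time step."""
    n = len(prob) - 1
    norm = n * (n + n0 + n1 - 1)
    new_prob = [0.0] * (n + 1)
    for k in range(n + 1):
        # Selected node keeps its state, or copies a node in the same state.
        stay = p + (1 - p) * (k * (k - 1 + n1)
                              + (n - k) * (n - k - 1 + n0)) / norm
        total = prob[k] * stay
        if k >= 1:
            # From S_{k-1}: a node in state 0 copies a node in state 1.
            total += prob[k - 1] * (1 - p) * (n - k + 1) * (k - 1 + n1) / norm
        if k < n:
            # From S_{k+1}: a node in state 1 copies a node in state 0.
            total += prob[k + 1] * (1 - p) * (k + 1) * (n + n0 - k - 1) / norm
        new_prob[k] = total
    return new_prob
```

Iterating this function from any initial vector conserves total probability, and repeated application converges toward the stationary distribution of the model.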
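In terms of $\mathbf{P}_t$, the dynamics is described by the equation
$$\mathbf{P}_{t+1} = \mathbf{T}\,\mathbf{P}_t \equiv \left(\mathbf{1} - \frac{1-p}{N(N+N_0+N_1-1)}\,\mathbf{A}\right)\mathbf{P}_t,$$
where the time evolution matrix $\mathbf{T}$, and also the auxiliary matrix $\mathbf{A}$, is tri-diagonal. The nonzero elements of $\mathbf{A}$ are independent of p and are given by
$$A_{k,k} = 2k(N-k) + N_1(N-k) + N_0 k, \qquad A_{k,k+1} = -(k+1)(N+N_0-k-1), \qquad A_{k,k-1} = -(N-k+1)(N_1+k-1).$$
The transition probability from state $S_M$ to $S_L$ after a time t can be written as
$$P(L,t;M,0) = \sum_{r=0}^{N} b_{rM}\,a_{rL}\,\lambda_r^{t}, \tag{10}$$
where $a_{rL}$ and $b_{rM}$ are the components of the right and left r-th eigenvectors of the evolution matrix, $\mathbf{a}_r$ and $\mathbf{b}_r$ (normalized so that $\mathbf{b}_r \cdot \mathbf{a}_r = 1$), and $\lambda_r$ are the corresponding eigenvalues. Thus, the dynamical problem has been reduced to finding the right and left eigenvectors and eigenvalues of the time evolution matrix $\mathbf{T}$.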
The eigenvalues $\lambda_r$ of $\mathbf{T}$ are given by
$$\lambda_r = 1 - \frac{1-p}{N(N+N_0+N_1-1)}\,r\,(r-1+N_0+N_1), \qquad r = 0, 1, \ldots, N,$$
and satisfy $0 \le p \le \lambda_r \le 1$. The equation for P(L,t; M,0) shows that the asymptotic behavior of the network is determined only by the right and left eigenvectors with unit eigenvalue, i.e., by the eigenvectors corresponding to $\lambda_0 = 1$. The coefficients of the corresponding (unnormalized) left eigenvector are simply $b_{0k} = 1$. The coefficients $a_{0k}$ of the right eigenvector are obtained using a generating function technique and an associated nonlinear second order differential equation [18][19]. The coefficients are then given by the Taylor expansion of the hypergeometric function $F(-N, N_1, 1-N-N_0, x) \equiv \sum_k a_{0k}x^k$. After normalization, these coefficients give the stationary distribution
$$\rho(k) = \binom{N_1+k-1}{k}\binom{N+N_0-k-1}{N-k}\bigg/\binom{N+N_0+N_1-1}{N}. \tag{12}$$
This is the probability of finding the network with k nodes in state 1 at equilibrium, and it is independent of the initial state. The other eigenvectors, corresponding to $\lambda_r \neq 1$, can also be calculated, and are also related to hypergeometric functions [18][19]. Although these eigenvectors provide a complete description of the dynamics of the network (see Eq 10), they are not particularly illuminating as we are interested in understanding the asymptotic behavior of the system ($\lambda_0 = 1$).
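The stationary distribution can be evaluated numerically for real-valued $N_0$ and $N_1$ by writing the binomial coefficients with gamma functions. The sketch below assumes the beta-binomial form $\rho(k) \propto \Gamma(N_1+k)\,\Gamma(N+N_0-k)/[\Gamma(k+1)\,\Gamma(N-k+1)]$, which reproduces the uniform case $\rho(k) = 1/(N+1)$ at $N_0 = N_1 = 1$ and satisfies detailed balance with the copying rates of the model; the function name is illustrative, and $N_0, N_1 > 0$ is required.

```python
import math


def stationary_distribution(n, n0, n1):
    """Stationary probabilities rho(k), k = 0..n, for real n0, n1 > 0."""
    if n0 <= 0 or n1 <= 0:
        raise ValueError("n0 and n1 must be positive")
    # Logarithm of the normalizing binomial coefficient C(n+n0+n1-1, n).
    log_norm = (math.lgamma(n + n0 + n1) - math.lgamma(n + 1)
                - math.lgamma(n0 + n1))

    def log_rho(k):
        # log C(n1+k-1, k) + log C(n+n0-k-1, n-k) - log_norm
        return (math.lgamma(n1 + k) - math.lgamma(k + 1) - math.lgamma(n1)
                + math.lgamma(n0 + n - k) - math.lgamma(n - k + 1)
                - math.lgamma(n0) - log_norm)

    return [math.exp(log_rho(k)) for k in range(n + 1)]
```

Working with log-gamma values avoids overflow for large N, which matters when N is of the order of the number of voters in a county.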
In the thermodynamic limit $N \to \infty$, we can define continuous variables $v = k/N$, $n_0 = N_0/N$ and $n_1 = N_1/N$ and approximate the asymptotic distribution presented in Eq 12 by a Gaussian
$$\rho(v) \simeq \frac{1}{\sqrt{2\pi\sigma^2}}\exp\left[-\frac{(v-v_0)^2}{2\sigma^2}\right], \qquad v_0 = \frac{n_1}{n_0+n_1}, \qquad \sigma^2 = \frac{n_0 n_1(1+n_0+n_1)}{N(n_0+n_1)^3}.$$
In the limit where $n_0, n_1 \gg 1$, the width depends only on the ratio $\alpha = n_0/n_1$ and is given by $\sqrt{\alpha/N}/(1+\alpha)$. In particular, for $n_0 = n_1 \gg 1$ (i.e., $\alpha = 1$), the width tends to $1/(2\sqrt{N})$. While the model solved above was stated in terms of non-negative integer influence parameters $N_0$, $N_1$, it can be generalized to a model where the external influence parameters $N_0$, $N_1$ are real numbers. In this case, the solution in Eq 12 remains the same, with the difference that factorials must be replaced by gamma functions. Since the numbers $N_0/(N+N_0+N_1-1)$ and $N_1/(N+N_0+N_1-1)$ represent the probabilities that a free node (nonpartisan voter) copies one of the frozen nodes (partisan voters), small (large) values of $N_0$ and $N_1$ can be interpreted as representing a weak (strong) connection between the free nodes and the external system containing the frozen nodes. The external system can be thought of as a reservoir that affects the network but is not affected by it. For $N_0 = N_1 = 1$, we obtain $\rho(k) = 1/(N+1)$ for all values of N, i.e., $N_0 = N_1 = 1$ is the critical value of this model. In this case, all states $S_k$ are equally likely and the system executes a random walk through the state space. In the limit $N \to \infty$, $N_0 = N_1 = 1$ marks the transition between disordered and ordered states.

Model behavior
For $N_0, N_1 > 1$, we obtain skewed unimodal distributions with peak at $N_1/(N_0+N_1)$, corresponding to the fraction of voters in the network that voted for candidate 1. If $N_1 > N_0$, the majority of votes go to candidate 1, and if $N_0 > N_1$ the majority of votes go to candidate 0. We note that the estimation of the influence parameters $N_0$, $N_1$, based on almost a century of US presidential election data, predominantly falls within this regime. For $N_0, N_1 \gg 1$, $\rho(k)$ resembles a Gaussian distribution, and if $N_0 = N_1$ about half the voters vote for candidate 0 and half the voters vote for candidate 1, similarly to a magnetic material at high temperatures.
For $N_0, N_1 < 1$ (the bistable, or hysteresis, region), we obtain bimodal distributions in which either of the two network phases can exist, similar to the magnetization state in the Ising model below the critical temperature. For $N_0, N_1 \ll 1$, the distribution peaks at all nodes 0 or all nodes 1, similar to a magnetized state at low temperatures.
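The three regimes can be checked numerically by locating the peaks of the stationary distribution. The sketch below builds unnormalized weights from the ratio $\rho(k+1)/\rho(k) = (N-k)(k+N_1)/[(k+1)(N-k-1+N_0)]$, which follows from balancing upward and downward transitions of the copying dynamics; the function name and the tolerance are illustrative assumptions, and $N_0, N_1 > 0$ is assumed.

```python
def distribution_peaks(n, n0, n1, tol=1e-12):
    """Return the values of k at which the stationary distribution has a strict peak."""
    # Unnormalized weights from the ratio rho(k+1) / rho(k).
    weights = [1.0]
    for k in range(n):
        ratio = (n - k) * (k + n1) / ((k + 1) * (n - k - 1 + n0))
        weights.append(weights[-1] * ratio)
    peaks = []
    for k in range(n + 1):
        left_ok = k == 0 or weights[k] > weights[k - 1] * (1 + tol)
        right_ok = k == n or weights[k] > weights[k + 1] * (1 + tol)
        if left_ok and right_ok:
            peaks.append(k)
    return peaks
```

For example, `distribution_peaks(50, 3, 3)` gives a single interior peak at k = 25, `distribution_peaks(50, 0.3, 0.3)` gives peaks at the two extremes k = 0 and k = 50, and the critical case `distribution_peaks(50, 1, 1)` gives no strict peak because the distribution is flat.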

Other network topologies
Although the stationary vote-share distribution given by Eq 12 is obtained assuming fully connected networks, it was shown in [18][19] that our exact results are excellent approximations for other networks, including random, regular lattice, scale-free, and small world networks. These approximations can be useful, for example, if our model is applied to a network constructed based on online social networks or commuting networks. For these networks, which are not fully connected, the effect of the frozen nodes is amplified and can be quantified as follows: the probability that a free node copies a frozen node is $P_i = (N_0+N_1)/(N_0+N_1+k_i)$, where $k_i$ is the degree of the node. We can then define effective numbers of frozen nodes in the corresponding fully connected network, $N_{0ef}$ and $N_{1ef}$, as being the values for which
$$\frac{N_{0ef}+N_{1ef}}{N_{0ef}+N_{1ef}+N-1} = \sum_k f(k)\,\frac{N_0+N_1}{N_0+N_1+k}, \tag{14}$$
where the term on the right-hand side in Eq 14 is the expectation with respect to the degree distribution f(k), and the term on the left-hand side is the probability that a free node copies a frozen node in the corresponding fully connected network. Eq 15 is the mean field boundary condition. For well-behaved distributions, $N_{0ef}$ and $N_{1ef}$ can be obtained in terms of central moments of the degree distribution by expanding the right-hand side in Eq 14 around the average degree $\langle k \rangle$ of the real network, as follows:
$$\frac{N_{0ef}+N_{1ef}}{N_{0ef}+N_{1ef}+N-1} = \frac{N_0+N_1}{N_0+N_1+\langle k \rangle}\sum_{n \ge 0}\frac{(-1)^n\,\mu_n}{(N_0+N_1+\langle k \rangle)^n},$$
where $\mu_n = \sum_k (k-\langle k \rangle)^n f(k)$ are the central moments of the distribution f(k). For example, using only the first term in the Taylor expansion gives $(N_{0ef}+N_{1ef})/(N_{0ef}+N_{1ef}+N-1) = (N_0+N_1)/(N_0+N_1+\langle k \rangle)$. This leads to $N_{0ef} = fN_0$ and $N_{1ef} = fN_1$, where $f = (N-1)/\langle k \rangle$. Therefore, as the network acquires more internal connections and $\langle k \rangle$ increases, the effective values $N_{0ef}$ and $N_{1ef}$ decrease.
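The effective strength of the frozen nodes can be computed from an observed degree sequence. The sketch below solves the matching condition for the total $N_{0ef}+N_{1ef}$ by averaging $(N_0+N_1)/(N_0+N_1+k_i)$ over the nodes, and compares it with the first-order estimate $f(N_0+N_1)$ with $f = (N-1)/\langle k \rangle$. It assumes that N is the number of free nodes, that the degrees count connections among free nodes only, and that every degree is positive; the function names are hypothetical.

```python
def effective_frozen_total(degrees, n0, n1):
    """Total effective number of frozen nodes, N0ef + N1ef, for a degree sequence."""
    n = len(degrees)
    total = n0 + n1
    # Average probability that a free node copies a frozen node.
    copy_prob = sum(total / (total + k) for k in degrees) / n
    # Solve S / (S + N - 1) = copy_prob for S.
    return copy_prob * (n - 1) / (1.0 - copy_prob)


def first_order_total(degrees, n0, n1):
    """First-order estimate f * (N0 + N1) with f = (N - 1) / <k>."""
    n = len(degrees)
    mean_degree = sum(degrees) / n
    return (n - 1) / mean_degree * (n0 + n1)
```

For a regular network, where every node has the same degree, the two functions agree exactly; for heterogeneous degree sequences the exact value is larger than the first-order estimate, reflecting the additional amplification contributed by low-degree nodes.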

Supporting information
S1 Movie. Hotspots of social contagion: 92 years of presidential elections. S1 Movie shows colored maps that reflect the significance (p-value) of local concentration of social influence for each state. The p-values for each state are derived from a random permutation test of local clustering using the Getis-Ord Local $G_i^*$ statistic. This analysis was performed with a contiguity spatial weight matrix that indicates whether states share a boundary or not. The variable of concern is the social influence index calculated using Eq 7 in the main text. Low p-values (p-value ≤ 0.1) indicate statistically significant high levels of social influence at a state and its surrounding neighbors (hot spots). High p-values (p-value ≥ 0.9) indicate statistically significant low levels of social influence at a state and its surrounding neighbors (cold spots).
S1 Fig. The Mann-Whitney U-test is a nonparametric test that assesses whether one of two random variables is stochastically larger than the other. Given a time-series of social influence from 1920 to 2012, we define for each election year, y, two samples of social influence: from 1920 to y−4, and from y to 2012. We apply the Mann-Whitney U-test for these two samples, and calculate the corresponding p-value. The optimal break date is the date that achieves the minimum p-value over all potential breaks within the range 1920-2012 (marked by a red circle in the above curve, plotted in a linear-log scale).
S1 Table. Results of random permutation tests of spatial autocorrelation using Moran's I statistic. This analysis was performed with a contiguity spatial weight matrix (row normalized) that indicates whether states share a boundary or not. The variable of concern is the social influence index calculated using Eq 7 in the main text. The observed Moran's I statistics are shown in the second column and the corresponding significance levels (p-values) of the tests are shown in the third column. The random permutation tests suggest the presence of significant positive spatial autocorrelation as indicated by the level of significance (p-value) shown in the third column. (TIF)
S2 Table. Results of random permutation tests of spatial clustering using Getis-Ord General $G^*$ statistic. This analysis was performed with a contiguity spatial weight matrix that indicates whether states share a boundary or not. The variable of concern is the social influence index calculated using Eq 7 in the main text. The observed Getis-Ord General $G^*$ statistics and significance levels (p-values) of the tests are shown in the second and third columns, respectively. The tests indicate that social influence is significantly concentrated in space as shown by the significance levels (p-value) in the third column. For all election years, the observed Getis-Ord General $G^*$ is larger than the expected General $G^*$, indicating that the spatial distribution of high social influence values is more spatially clustered than would be expected if underlying spatial processes were truly random. (TIF)