Analyzed the data: ALH. Wrote the paper: ALH DGR. Conceived and designed the mathematical model and analysis: ALH DGR MAN NAC.

The authors have declared that no competing interests exist.

Many behavioral phenomena have been found to spread interpersonally through social networks, in a manner similar to infectious diseases. An important difference between social contagion and traditional infectious diseases, however, is that behavioral phenomena can be acquired by non-social mechanisms as well as through social transmission. We introduce a novel theoretical framework for studying these phenomena (the SISa model) by adapting a classic disease model to include the possibility of ‘automatic’ (or ‘spontaneous’) non-social infection. We provide an example of the use of this framework by examining the spread of obesity in the Framingham Heart Study Network. The interaction assumptions of the model are validated using longitudinal network transmission data. We find that the current rate of becoming obese is 2

Information, trends, behaviors and even health states may spread between contacts in a social network, similar to disease transmission. However, a major difference is that as well as being spread infectiously, it is possible to acquire this state spontaneously. For example, you can gain knowledge of a particular piece of information either by being told about it, or by discovering it yourself. In this paper we introduce a mathematical modeling framework that allows us to compare the dynamics of these social contagions to traditional infectious diseases. We can also extract and compare the rates of spontaneous versus contagious acquisition of a behavior from longitudinal data and can use this to predict the implications for future prevalence and control strategies. As an example, we study the spread of obesity, and find that the current rate of becoming obese is about 2

Social network effects are of great importance for understanding human behavior. People interact with a varying number of individuals and with some individuals more than others, and this affects behavior in fundamental ways. Sociologists have long studied social influence through networks, and networks now routinely appear in investigations from other fields, including economics

Within network studies, much work has focused on how information, trends, behaviors and other entities spread between the individuals in social networks. These processes are generally referred to as ‘contagion’. Such suggestions of contagious dynamics and the possible relevance of network structure can be rigorously examined using mathematical models of contagious processes. These can then be used to obtain accurate measures of expected prevalences, interventional efficacy, and optimized information flow. Many previous models have been proposed to study influential interactions between individuals. Most of these have considered well-mixed populations, although more recent work has focused on network-structured populations. The most well studied are classic epidemiological models (like SIS and SIR) for the spread of microbial infectious diseases

Each of these models, however, has one or more properties that are problematic for studying social contagion. Many do not capture the probabilistic nature of contagion, or the asymmetry inherent in traditional infectious disease (where the infected state spreads through social contagion whereas the non-infected state does not). Others only consider well-mixed populations, where everyone is influenced by everyone else, ignoring the effect of network structure. Most models inspired by epidemiology are not directly applicable to the social spread of other phenomena, because many phenomena that spread by social contagion may also arise spontaneously. That is, it is possible to adopt a trend or behavior, or obtain information, from an outside source, without directly ‘catching’ it from a contact in the network. In other words, on top of the probability of obtaining the infection from each infected contact, there is also a non-zero probability of ‘automatically’ obtaining the infection, independent of the local network. This ‘automatic’ non-social infection is not included in traditional infectious disease models. Economic models for the diffusion of innovations, based on early work by Bass

Here, we introduce a new model to study the spread of entities in a social network which has all of the important properties listed above. We then analyze its characteristics and show how it can be applied in different contexts. This model is an extension of the classical infectious disease model, combining features from the other models mentioned above. It describes infections that can be contracted both spontaneously and through social (network-structured) transmission, and allows for recovery from infection. As an example, we focus on the spread of obesity in the Framingham Heart Study (FHS) network. The interaction assumptions of the model are validated using longitudinal network transmission data. We show how to quantitatively assess the rate of adopting a trend spontaneously versus by contagion, in order to determine the extent to which social transmission is important. We use the model to predict prevalences and intervention effectiveness (i.e., to obtain quantitative output, not just qualitative behavior). The results of this model are very different from those of models with other interaction assumptions, such as the ‘majority rules’ models. We show that transmissive components are often small compared to the automatic component, but may still contribute materially to prevalence levels. Lastly, we use pair-wise approximations to generate analytic results for infections in network-structured populations, and present simulations using a real social network.

In the simplest infectious disease models

In the standard SIS model, infection can only be transmitted by having a contact between an infected and a susceptible individual. Social ‘infections’, however, can also arise due to spontaneous factors other than transmission. Therefore, we extend the SIS model by adding a term whereby uninfected individuals spontaneously (or ‘automatically’) become infected at a constant rate

There are three processes by which an individual's state can change. (i) An infected individual transmits infection to a susceptible contact with rate
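As a minimal sketch of the well-mixed SISa dynamics described above, the three processes can be combined into one equation for the infected fraction i: di/dt = (a + b·i)(1 − i) − g·i. The symbol `b` (an effective transmission rate that already includes the number of contacts), the function names and the Euler integrator are assumptions made for illustration. The values of `a` and `g` are taken from the 1/a and 1/g entries of the parameter table later in the paper, while the value of `b` is purely hypothetical.

```python
from math import sqrt


def sisa_rate(i, a, b, g):
    """di/dt for the well-mixed SISa model, with i the infected fraction."""
    # Susceptibles (1 - i) become infected spontaneously (a) or through
    # contact with infecteds (b * i); infecteds recover at rate g.
    return (a + b * i) * (1.0 - i) - g * i


def sisa_steady_state(a, b, g):
    """Steady-state infected fraction: positive root of b*i^2 + (a+g-b)*i - a = 0."""
    if b == 0:
        return a / (a + g)  # no transmission: balance of a and g only
    c = a + g - b
    return (-c + sqrt(c * c + 4.0 * a * b)) / (2.0 * b)


def simulate_sisa(a, b, g, i0=0.0, t_end=500.0, dt=0.1):
    """Forward-Euler time series of the infected fraction."""
    i = i0
    trajectory = [i]
    for _ in range(int(t_end / dt)):
        i += dt * sisa_rate(i, a, b, g)
        trajectory.append(i)
    return trajectory


A = 1.0 / 53.0   # spontaneous infection rate per year (from 1/a = 53 years)
G = 1.0 / 24.0   # recovery rate per year (from 1/g = 24 years)
B = 0.015        # HYPOTHETICAL effective transmission rate per year

trajectory = simulate_sisa(A, B, G)
print(trajectory[-1], sisa_steady_state(A, B, G))
```

Because a > 0, the steady state is non-zero even when `b` is zero, which is the qualitative difference from the standard SIS model.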

In the infectious disease literature, a disease is said to be ‘endemic’ if a stable, non-zero fraction of the population is infected at steady state. If a single infected individual is introduced to a totally susceptible population, then the average number of secondary infections they cause before recovery is called the

Traditional models of infection assume that the population is well-mixed. However, this assumption is unrealistic for many diseases, and also for the social spread of trends and behaviors. To account for the population structure, the infectious process can be constrained to take place on a social network. An infected individual can only pass their infection on to the susceptibles to whom they are connected. Properties of the infectious process thus depend on both the epidemiological parameters and the network structure, and there are often no longer simple analytic formulas to describe the reproductive ratio or steady state level of infection. For example, a property of disease spread on networks are

The correlation between infected individuals,

There are no analytic methods to solve SIS-type dynamics on arbitrary networks without making approximations. Thus, simulations are a more accurate tool to explore theoretical disease dynamics in structured populations without making simplifying assumptions about the network structure. For scaled, well-mixed populations, the formulas given in the previous sections for
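A stochastic network simulation of this kind can be sketched as follows. The code is illustrative only: it uses a synchronous discrete-time update in which a susceptible node with n infected neighbours becomes infected with probability (a + beta·n)·dt and an infected node recovers with probability g·dt, and it runs on a ring lattice rather than on the FHS network. The function names, the symbol `beta` for the per-contact transmission rate, and the lattice itself are assumptions, not the simulation code behind the results reported here.

```python
import random


def ring_lattice(n, k):
    """Adjacency lists for n nodes, each linked to its k nearest neighbours (k even)."""
    half = k // 2
    return {
        i: [(i + d) % n for d in range(-half, half + 1) if d != 0]
        for i in range(n)
    }


def simulate_sisa_network(neighbors, a, beta, g, steps, dt=0.1, seed=0):
    """Discrete-time SISa simulation; returns the infected fraction after each step."""
    rng = random.Random(seed)
    infected = {v: False for v in neighbors}  # start with everyone susceptible
    history = []
    for _ in range(steps):
        new_state = {}
        for v, nbrs in neighbors.items():
            if infected[v]:
                # Recover with probability g*dt, otherwise stay infected.
                new_state[v] = rng.random() >= g * dt
            else:
                # Spontaneous infection plus transmission from each infected contact.
                n_inf = sum(infected[u] for u in nbrs)
                new_state[v] = rng.random() < (a + beta * n_inf) * dt
        infected = new_state
        history.append(sum(infected.values()) / len(infected))
    return history


# Illustrative run: all rate values below are placeholders.
history = simulate_sisa_network(ring_lattice(200, 4), a=0.02, beta=0.005, g=0.04, steps=2000)
print(history[-1])
```

The update is only a small-dt approximation of the continuous-time process, so dt should be chosen so that (a + beta·n)·dt and g·dt stay well below 1.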

Here [XYZ] represents the number of situations where an X individual is connected to a Y individual who in turn is connected to a Z individual. We can approximate all these triples in terms of pairs, using a moment closure approximation (

The result of a network structure is that the number of partnerships between susceptible and infected individuals quickly becomes less than if random, and so

Analyzing the n-regular pair-wise equations allows us to get analytic results and determine how and under what conditions network structure affects the spread of behaviors which are both spontaneously acquired and spread interpersonally. Although simple closed-form solutions do not exist when
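Even without a closed form, a pair-wise system of this type can be integrated numerically. The sketch below writes out pair-wise SISa equations for a k-regular network under stated assumptions: pairs are counted in both directions (so [SS] + 2[SI] + [II] = k·N), all quantities are expressed per capita (N = 1), triples are closed with the standard closure [XYZ] ≈ ((k − 1)/k)·[XY][YZ]/[Y] that ignores transitivity, and `beta` denotes the per-contact transmission rate. These choices are a common textbook form and may differ in detail from the equations and closure used by the authors. The value of `beta` is hypothetical.

```python
def pairwise_sisa(a, beta, g, k, t_end=500.0, dt=0.01):
    """Forward-Euler integration of pair-wise SISa equations on a k-regular network.

    Singles and pairs are per capita; the run starts with everyone susceptible.
    """
    S, I = 1.0, 0.0
    SS, SI, II = float(k), 0.0, 0.0
    q = (k - 1) / k  # closure prefactor for a network without transitivity
    for _ in range(int(t_end / dt)):
        # Moment closure: triples centred on a susceptible node.
        SSI = q * SS * SI / S
        ISI = q * SI * SI / S
        dI = beta * SI + a * S - g * I
        dSS = -2 * beta * SSI + 2 * g * SI - 2 * a * SS
        dSI = beta * (SSI - ISI - SI) + g * (II - SI) + a * (SS - SI)
        dII = 2 * beta * (ISI + SI) + 2 * a * SI - 2 * g * II
        S, I = S - dt * dI, I + dt * dI
        SS, SI, II = SS + dt * dSS, SI + dt * dSI, II + dt * dII
    return {"S": S, "I": I, "SS": SS, "SI": SI, "II": II}


# a and g from the 1/a and 1/g table entries; beta is a HYPOTHETICAL value.
result = pairwise_sisa(a=1 / 53, beta=0.005, g=1 / 24, k=3)
print(result["I"])
```

A useful check on any such implementation is that the singles sum to 1 and the pairs sum to k at every step, since each term removed from one pair type is added to another.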

The SISa model provides a formal way for assessing the social contagion of trends and behaviors that may be repeatedly caught and recovered from. Using data from the Framingham Heart Study (FHS)

This epidemiological approach to social contagion has important differences from other models which look at correlations in present and past states of connected individuals. Here, similar to others

The dataset we use is a subset of individuals from the Framingham Heart Study

To study the transmission of obesity, we examine changes in BMI between sequential exams. Seven exams were administered to the Offspring Cohort between 1971 and 2001, with network data collected for each. We examine transitions occurring between each exam. The average fraction of the network that was classified as obese increased between these seven exams, suggesting the transmission process is not yet at steady state (Exam 1: 14

A given state

The structure of the Framingham Heart Study social network varies over the course of time, ranging from 7500 individuals with an average of 5.3 connections each at the first exam, to 3500 individuals with 2.8 connections on average at the seventh exam. Summary statistics are presented in the supplement (

The degree distribution of the Framingham Heart Study social network at the most recent exam (7) considered in this study. Connections include friends, family and coworkers. The average degree is around k = 3 and the transitivity is

The results of infectiousness analysis for the spread of obesity between exams 4 and 5 are shown in

Obesity behaves like a disease agent, infecting those in a susceptible ‘not obese’ state. The probability of transitioning from ‘not obese’ to ‘obese’ increases with the number of ‘obese’ contacts (A), and does not depend on the number of ‘not obese’ contacts (B). Conversely, the probability of recovering to the ‘not obese’ state does not depend on the number of ‘not obese’ contacts (D) or ‘obese’ contacts (C). Labels above the points are the number of observations averaged into each data point, and error bars are the standard error of the proportion.
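The kind of analysis summarized in this figure can be sketched as follows: group individuals by their number of ‘obese’ contacts, compute the proportion that transition along with the standard error of that proportion, and fit a straight line through the proportions. Under the SISa assumptions the intercept can be read as the spontaneous term and the slope as the per-contact transmission term, each per inter-exam interval. The record format, the function names and the use of unweighted least squares are assumptions for illustration, and the data below are synthetic, not FHS data.

```python
from collections import defaultdict
from math import sqrt


def transition_proportions(records):
    """records: iterable of (n_infected_contacts, transitioned) pairs.

    Returns {n_contacts: (proportion, standard_error, n_observations)}.
    """
    counts = defaultdict(lambda: [0, 0])  # n_contacts -> [transitions, observations]
    for n_contacts, transitioned in records:
        counts[n_contacts][0] += int(transitioned)
        counts[n_contacts][1] += 1
    summary = {}
    for n_contacts, (hits, total) in sorted(counts.items()):
        p = hits / total
        summary[n_contacts] = (p, sqrt(p * (1 - p) / total), total)
    return summary


def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# SYNTHETIC records: 10 people each with 0, 1 and 2 infected contacts.
records = (
    [(0, True)] * 1 + [(0, False)] * 9
    + [(1, True)] * 2 + [(1, False)] * 8
    + [(2, True)] * 3 + [(2, False)] * 7
)
summary = transition_proportions(records)
slope, intercept = fit_line(list(summary), [p for p, _, _ in summary.values()])
print(summary, slope, intercept)
```

Dividing the fitted slope and intercept by the time between exams would convert them into rates; weighting the fit by the number of observations per point is a natural refinement.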

Parameter measurements for obesity from each set of consecutive exams. Data point at exam N represents the value for the transition from exam N to N+1. Error bars are 95

Parameter | Description | Value |
a | rate of spontaneous infection | |
g | rate of recovery | |
 | rate of transmission through contact | |
1/a | cycle | 53 years |
1/g | lifetime | 24 years |
 | influence | 0.13 |
 | basic reproductive ratio | 0.35 |

Since these rates were measured for 6 different inter-exam transitions over 30 years, we can look at how the value of these rates changes over time.

We also found that both happiness and depression fit the SISa model, both being contagious from a neutral emotional state

In this section, we will use the SISa model to make predictions and evaluate interventions for the obesity epidemic, using the parameters observed in the FHS data. For simplicity and generality, we will keep the parameters

Time series of an epidemic on the Framingham Heart Study network, using full simulations (light blue) or the n-regular pair-wise equations (dark blue). Parameters used are those measured for the obesity epidemic:

This model predicts that, assuming the rates do not further change over time, the steady state proportion of obese individuals will be 42

We can also compare historical data on the obesity prevalence (from both national studies

A comparison of historical data on the prevalence of obesity in the Framingham Heart Study (blue dots) and the National Health and Nutrition Examination Survey (red dots) with the timeseries predicted from the SISa model with time-varying parameters. For the simulation, we allowed the parameters

We can use the pair-wise equations to see how the steady state prevalence depends on various parameters, which is especially useful to see how interventions that aim to change a certain parameter may affect the prevalence.

Dependence of the equilibrium fraction infected on obesity interventions which act to change the rates of infection (transmission (A) and ‘automatic’ infection (B)) or recovery (C). When not varying, parameters are

In this section we will examine the more general properties of ‘infections’ following SISa model dynamics. While

Time series of an epidemic on the Framingham Heart Study network, using full simulations (light blue) or the n-regular pair-wise equations (dark blue). When the ratio of

We can use the pair-wise equations to see how the steady state prevalence depends on various parameters, which is especially useful to see how interventions that aim to change a certain parameter may affect the prevalence.

Dependence of the equilibrium fraction infected (A) and correlations (

Dependence of the equilibrium fraction infected (A) and correlations (

Dependence of the equilibrium fraction infected (A) and correlations (

In general, the spatial correlations (

The most direct way to compare various parameters for spread, and therefore interventions that reduce one of the parameters, is to look directly at

This graph compares interventions which act to change different parameters of infection (transmission (A), ‘automatic’ infection (B), recovery (C)). Shown is the rate of change of the fraction infected at equilibrium with respect to a change in various parameters of infection. The y axis labels represent the absolute change in the percent infected for a change of 0.01 in one of the parameters. Changing
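A simplified version of this comparison can be sketched with the well-mixed steady state rather than the pair-wise equations used for the figure. The code below reports the absolute change in equilibrium prevalence when one parameter is increased by 0.01, mirroring the quantity on the y axis. The mean-field formula, the symbol `b` for an effective transmission rate and the baseline value of `b` are assumptions; `a` and `g` follow the 1/a and 1/g table entries.

```python
from math import sqrt


def steady_state(a, b, g):
    """Well-mixed SISa equilibrium: positive root of b*i^2 + (a+g-b)*i - a = 0."""
    if b == 0:
        return a / (a + g)
    c = a + g - b
    return (-c + sqrt(c * c + 4.0 * a * b)) / (2.0 * b)


def prevalence_change(a, b, g, delta=0.01):
    """Change in equilibrium prevalence when each parameter alone rises by delta."""
    base = steady_state(a, b, g)
    return {
        "transmission": steady_state(a, b + delta, g) - base,
        "automatic": steady_state(a + delta, b, g) - base,
        "recovery": steady_state(a, b, g + delta) - base,
    }


# b = 0.015 is a HYPOTHETICAL baseline value chosen for illustration.
changes = prevalence_change(a=1 / 53, b=0.015, g=1 / 24)
for name, change in changes.items():
    print(f"{name}: {100 * change:+.1f} percentage points")
```

Raising either infection rate increases the equilibrium prevalence and raising the recovery rate decreases it; the relative sizes depend on the baseline parameters and on network structure, which this mean-field sketch ignores.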

Many analytic models of network phenomena assume the transitivity,

The dependence of the equilibrium fraction infected (A) and correlations (

The dependence of the equilibrium fraction infected (A) and correlations (

We've already discussed how changes in parameters of infection affect the steady state prevalence, and we can consider this an analysis of different types of public health interventions that change rates of recovery, infection or network structure. In previous analysis of the obesity epidemic done by Bahr et al

The SISa model offers a framework for quantitatively analyzing and predicting the public health effects of socially contagious phenomena. Using a longitudinally measured health outcome and social network data, the SISa model can be used to determine the dynamics of a health trend in terms of rates of acquisition, recovery and inter-personal transmission. From these rates, the relative importance of social contagion can be determined, and changes in prevalence over time can be predicted. The framework can also be used to examine how these rates themselves change over time, helping to understand the mechanisms behind drastic changes in disease prevalence, such as in the obesity epidemic currently affecting the United States. Finally, understanding the dynamics of a health behavior using the SISa model allows us to evaluate the benefits of various interventions, especially those that may work within social networks.

The prevalence of obesity in the Framingham Heart Study cohort has increased from 14

Using the SISa model with these parameter values estimated for obesity, we can make predictions about the future of the obesity epidemic and the important factors controlling it. Our models suggest that if the most recent rates stay constant, the population will stabilize at 42

This model allows us to predict how much spatial correlation is expected from a purely infectious process, and to compare this to what is observed in the data, which could be influenced by confounding factors and selection bias in choosing friends. A coefficient of 1 indicates that the arrangement of infected nodes is random, while higher values are indicative of spatial correlations. We observed a correlation coefficient for obese individuals of 1.30, which was quite close to what was predicted from epidemic simulations (1.33). This suggests that infection alone is sufficient to explain the observed correlations, and that selection bias and confounding factors may play only a small role. We also show that network transitivity is not predicted to have a strong effect on prevalences when there is an automatic component to infection. However, our model also shows that, contrary to popular belief, a contagious process on a network does not always result in clustering of infected individuals. This is especially true if there is a large automatic infection term, which is likely with many trends and behaviors.
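A correlation coefficient of this kind can be computed from an edge list as sketched below. The definition used here, C_II = N·[II] / (k̄·[I]²) with [II] counting infected-infected pairs in both directions and k̄ the mean degree, is an assumed standard form that equals 1 when infected nodes are placed without regard to the network; the authors' exact definition may differ. The function name and the toy network are hypothetical.

```python
def infected_correlation(edges, infected):
    """Observed infected-infected pairs relative to the number expected at random.

    edges: iterable of undirected (u, v) pairs; infected: set of infected nodes.
    Nodes without any edge are ignored, and at least one node must be infected.
    """
    edges = list(edges)
    nodes = {u for edge in edges for u in edge}
    n = len(nodes)
    n_infected = sum(1 for v in nodes if v in infected)
    ordered_pairs = 2 * len(edges)  # each edge counted in both directions
    ii_pairs = 2 * sum(1 for u, v in edges if u in infected and v in infected)
    mean_degree = ordered_pairs / n
    return n * ii_pairs / (mean_degree * n_infected ** 2)


# Toy path network 0-1-2-3 (not FHS data).
path = [(0, 1), (1, 2), (2, 3)]
print(infected_correlation(path, {0, 1}))  # adjacent infected nodes: above 1
print(infected_correlation(path, {0, 2}))  # non-adjacent infected nodes: 0
```

Values above 1 indicate that infected individuals are connected to each other more often than a random arrangement would produce, and values below 1 indicate the opposite.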

The SISa approach allows us to compare the effectiveness of different classes of intervention. For the parameter range observed, we find that decreasing the rate of transmission

One possible limitation of this study is the incompleteness of the social network dataset used. Because the Framingham Heart Study was not designed as a study of social networks, no attempt was made to capture all of a person's important social contacts. Many close friends of a person could be missing (usually only one friend per person was recorded), and family and coworkers who play only a small part in one's actual social network may have been counted. However, even if under-sampling of real-world contacts did occur in the FHS Network, it does not change our results qualitatively: our data clearly show that the rate of becoming obese increases with the number of ‘infected’ contacts (i.e., obesity is contagious) while the rate of ‘recovery’ to a non-obese state does not depend on contacts. Under-sampling could, however, quantitatively affect our measurement of the rate constants. If a constant number of contacts for each person were missed, our estimate of the y intercept of the transition graphs would be shifted up from its true value, and the actual

It has recently been suggested that certain types of latent homophily, in which an unobservable trait influences both which friends one chooses and current and future behavior, may be impossible to distinguish from contagion in observational studies and hence may bias estimates of contagion and homophily

The SISa model as presented here assumes that all individuals have the same probability of changing state (though not everyone will actually change state within their lifetime). It is clearly possible, however, that there is heterogeneity between individuals in these rates. We do not have sufficient data on obesity in the Framingham dataset to explore this issue, which would require observing numerous transitions between states for each individual. Exploring individual differences in acquisition rate empirically is a very interesting topic for future research, as is extending the theoretical framework we introduce to take into account individual differences.

The results we have presented here reiterate an important general principle of network processes: networks tend to magnify whatever they are seeded with, but they must be seeded with something. The increase in obesity is not purely a network-diffusion phenomenon. Automatic infection serves to start and continuously seed the epidemic. Here we show that the dominant process in the increasing prevalence of obesity is contact-independent weight gain; however, the rate of interpersonal transmission contributes significantly to the overall prevalence and appears to be increasing steadily over time. Thus consideration of social transmission and network effects is an important issue for health and policy professionals.

Summary statistics for the Framingham Heart Study network at each exam. Out-degree is the number of contacts named by an individual. Total degree includes both those who named an individual and those who were named by an individual. Only friendships are directional; other contacts are symmetrical. Phi (φ) is the transitivity of the network. C_{SI} and C_{II} are the spatial correlations between susceptible and infected individuals, and between infected individuals, respectively. N is the number of people for whom both social network and obesity data were available at a given exam.

(0.01 MB PDF)

Summary of results from regression of probability of transitioning between states and the number of contacts in a given state, similar to those shown in

(0.01 MB PDF)

Deriving pairwise network equations for heterogeneous networks.

(0.15 MB PDF)

We thank Laurie Meneades for assistance with the Framingham Heart Study Network database.