Natural disasters and indicators of social cohesion

Do adversarial environmental conditions create social cohesion? We provide new answers to this question by exploiting spatial and temporal variation in exposure to earthquakes across Chile. Using a variety of methods and controlling for a number of socio-economic variables, we find that exposure to earthquakes has a positive effect on several indicators of social cohesion. Social cohesion increases after a big earthquake and slowly erodes in periods where environmental conditions are less adverse. Our results contribute to the current debate on whether and how environmental conditions shape formal and informal institutions.

1. EQ t j is a dummy that takes the value 1 if region j has been affected by at least one earthquake within the last 3 years as measured from t, i.e. in years t, t − 1 and t − 2. A region is treated as affected (EQ t j = 1) if the epicenter of an earthquake of magnitude M_s higher than 7 was located there and/or the intensity the region experienced (measured on the Modified Mercalli Scale) was equal to or higher than VII. We use earthquake data from 2003-2012 to correspond with the availability of our measures of social cohesion. In addition, for higher threshold values EQ t j does not vary across time and the effect of earthquake exposure is subsumed into the fixed effects.
2. DISTEQ t j denotes the distance in years between period t and the last year that region j was affected by an earthquake. For regions that have not suffered any earthquake in the last 30 years, we set this variable to 30. Figure 1 in the main text illustrates the regional variation in earthquake exposure and visualizes the temporal variation in earthquakes, generating variation in the measure DISTEQ t j .
It should be pointed out that if we lower the threshold to M_s/M_w 6.0+, only one previously unaffected region becomes affected (using the EQ t j dummy); if we raise it to M_s/M_w 8.0+, we exclude earthquakes with important human and economic losses and substantially reduce the variation in our earthquake indicators.
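The two regional exposure measures defined above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `eq_dummy` and `dist_eq` and the input format (a list of years in which a region was affected) are hypothetical, and the cap of 30 is applied to every region, which is one reading of the rule for regions without an earthquake in the last 30 years.

```python
def eq_dummy(affected_years, t, window=3):
    """Return 1 if the region was affected in year t, t-1 or t-2, else 0.

    `affected_years` is a hypothetical list of years in which the region
    met the magnitude/intensity criterion described in the text.
    """
    first_year = t - (window - 1)
    return int(any(first_year <= y <= t for y in affected_years))


def dist_eq(affected_years, t, cap=30):
    """Years elapsed between t and the last year the region was affected.

    Regions with no earthquake on record up to t, or none within the
    last `cap` years, receive the value `cap` (assumed reading).
    """
    past = [y for y in affected_years if y <= t]
    if not past:
        return cap
    return min(t - max(past), cap)
```

For instance, a region hit only in 2010 has `eq_dummy([2010], 2012) == 1` and `dist_eq([2010], 2012) == 2`, while the dummy switches back to 0 in 2013.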
A.2 Comuna level seismic data. Some of our variables that capture social cohesion are expressed at the level of Chilean comunas. To identify the affected comunas, we complement the information of the National Seismological Center with that of the Legal Medical Service (LMS) of the Ministry of Justice (http://www.sml.cl/sml/) and the Chilean Association of Municipalities (http://www.munitel.cl/). The reason is that the seismological service does not always provide information on the affected areas at the comuna level. We define two comuna-level measures of earthquake exposure:
1. EQ 2010 j is a dummy variable that equals 1 if comuna j: (i) is identified by the seismological service as a comuna hit by the 2010 Maule earthquake (i.e., a comuna that suffered an intensity greater than or equal to VII on the Mercalli scale), and/or (ii) had at least one fatal victim, and/or (iii) asked for economic aid.
2. DISTEQ t j measures the distance in years between period t and the last year that comuna j was hit by an earthquake.
A.3 Social Cohesion. Defining and measuring social cohesion is difficult. As noted by [S1], there is a "proliferation of definitions of social cohesion that have proved difficult to combine or reconcile" (p. 409). We focus on measures of positive and negative behavior proposed by the OECD [S3].
Positive Behavior. Our variables Life Satisfaction and Trust are obtained from the 2008, 2009, 2010, 2011 and 2013 waves of the Latinobarómetro, an annual survey that gathers information on attitudes and beliefs of individuals from 18 Latin American countries (more information and the data are available at http://www.latinobarometro.org/). Since we are interested in the effects of earthquake exposure on indicators of social cohesion in Chile, for these two variables we select Chile and exclude all non-Chilean citizens. This reduces our sample to 1159 individuals in 2008, 1183 in 2009, 1173 in 2010, 1185 in 2011 and 1177 in 2013. We group individuals according to their comuna of residence. We have observations for only 98 Chilean comunas. Since we have too few observations for several regions, reliable data are available for far fewer than 15 regions, and Life Sat and Trust are therefore used only at the comuna level. Fewer than 15 cross-sectional units are too few for any meaningful analysis at the level of regions.
To construct the variable Life Sat we use the following question (Q27ST in 2008, Q1ST in 2009-2013): "In general, would you say you are satisfied with your life? Would you say you are . . .?". The possible answers are: (1) Very satisfied, (2) Fairly satisfied, (3) Not very satisfied and (4) Not satisfied at all. LifeSat j,t measures the percentage of people living in comuna j, in period t, who choose options (1) or (2). For the variable Trust we use the following question (Q21WVSST in 2008, Q58ST in 2009, Q55ST in 2010, Q25ST in 2011 and Q29STGBS in 2013): "Generally speaking, would you say that you can trust most people, or that you can never be too careful when dealing with others?". The possible answers are: (1) One can trust most people and (2) One can never be too careful when dealing with others. Trust j,t measures the percentage of people living in comuna j, in period t, who choose (1).
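As an illustration of how the individual answers aggregate into LifeSat j,t and Trust j,t, the following sketch computes the share of respondents per comuna and period who pick a given set of options. The function name `share_by_cell` and the row layout are hypothetical, and the sketch assumes unweighted shares; whether survey weights are applied is not stated in the text.

```python
from collections import defaultdict


def share_by_cell(rows, chosen):
    """Percentage of respondents per (comuna, year) whose answer is in `chosen`.

    `rows` is a hypothetical iterable of (comuna, year, answer_code) tuples.
    Shares are unweighted (an assumption of this sketch).
    """
    counts = defaultdict(lambda: [0, 0])  # (comuna, year) -> [hits, total]
    for comuna, year, answer in rows:
        cell = counts[(comuna, year)]
        cell[1] += 1
        if answer in chosen:
            cell[0] += 1
    return {key: 100.0 * hits / total for key, (hits, total) in counts.items()}


# Toy data: four respondents in one comuna and wave.
answers = [("A", 2008, 1), ("A", 2008, 3), ("A", 2008, 2), ("A", 2008, 4)]
life_sat = share_by_cell(answers, chosen={1, 2})  # options (1) or (2)
trust = share_by_cell(answers, chosen={1})        # option (1) only
```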
Our variable Charity is obtained from the Teletón, a yearly fund-raising event broadcast on television in Chile since 1978 (www.teleton.cl). This charity event collects voluntary donations across the whole country to raise funds for children with disabilities who are treated at health-related organizations of the Fundación Teletón. We use regional data on contributions to the Teletón between 2007 and 2012. It is important to note three features of this charity event. First, data corresponding to previous editions are not available at a regional level. Second, for this dimension of social cohesion, information is only available at the level of regions. Third, in 2009 and 2013 the event did not take place because presidential elections were held. Additionally, we would like to stress that natural disaster relief was never a charity aim of the selected sample. There was a special Teletón event for the victims of the 2010 Maule earthquake, different from the standard 2010 edition, which we excluded from our sample.
(2) Sport club; (3) Religious organizations (this answer explicitly excludes activities such as prayer activity, going to mass and the like); (4) Art groups; (5) Cultural groups; (6) Student/youth centres; (7) Women associations; (8) Associations for elderly people; (9) Volunteer groups; (10) Self-health groups; (11) Political party; (12) None. Volunteering does not include relief efforts directly related to the consequences of earthquakes. The only category that might subsume such effects is category (9), and the percentage of people who tick the corresponding box is 0.42%, 0.45% and 0.39% in the 2009, 2011 and 2013 CASEN waves, respectively. Our variable Volunteering j,t measures the percentage of people living in region (comuna) j, in period t, who choose any option from (1) to (10). We have observations for 15 Chilean regions and for 320 out of 346 comunas. In Table 11 below we also exploit individual-level variation in volunteering.
To measure electoral participation we use the number of persons who showed up at polls in the 2008 and the 2012 elections of mayors and council members. These data come from the Chilean Electoral Service (Servicio Nacional Electoral, available at www.servel.cl). The resulting variable Voting j,t measures the percentage of people who showed up at the polls in region j at period t.
Negative behavior. To construct the variable Crime we use official data on criminal activity provided by the Chilean government. Since 2005 the Ministry of the Interior (Ministerio del Interior y Seguridad Pública) has prepared and published crime rates classified by crime type according to their social impact ("Tasa de casos policiales por delitos de mayor connotación social" in Spanish; see http://www.seguridadpublica.gov.cl). This index encompasses crimes both reported to the police by citizens and discovered by police officers, per 100000 inhabitants. The episodes the index includes range from violent crimes, such as aggravated assault, murder, rape and robbery, to property crimes such as burglary and motor vehicle theft. These data are available at the level of both regions and comunas for the period 2005-2011.
We study two other measures of negative behavior: suicides and corruption. The data on suicides come from the Department of Statistics of the Ministry of Health (www.deis.cl) for 2005-2011. The variable Suicides j,t measures the rate of suicides per 100000 inhabitants in region/comuna j in period t. The data on corruption are from the Citizen Safety Survey (Encuesta Nacional Urbana de Seguridad Ciudadana from the Ministry of the Interior). Individuals are asked whether they, or any member of their family, were solicited for bribes by some public office. We have observations for 2005-2012 at the regional level. Corruption j,t measures the percentage of households solicited for bribes in region j in year t. This variable exhibits very little variation, which may explain why we detect no association between Corruption and Earthquake in our regressions.
From CASEN, at the level of both regions and comunas, we use average years of schooling (Schooling), the percentage of poor people (Poverty), the percentage of females (Women) and migration between regions (Net Migration Rate). The last variable measures migration flows between Chilean regions/comunas. CASEN asked individuals in which comuna their mother lived when they were born. With this information we compute "domestic immigration" and "domestic emigration." By the former we refer to the percentage of Chilean people who live in a region different from the one in which they were born. Domestic emigration measures the percentage of Chilean people who left their region of birth. Net Migration Rate j,t is the difference between domestic immigration and domestic emigration in year t in region/comuna j. It aims to control for possible biases due to the endogenous composition of Chilean regions.
We stress that the years of the CASEN survey do not perfectly match the years of the dependent variables Charity, Crime, Suicides and Corruption. To solve this discrepancy we compute the missing values using the annual rate of increase between periods 2006-2009, 2009-2011 and 2011-2013.
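The interpolation step can be illustrated as follows. The sketch assumes a constant compound annual growth rate between two consecutive survey years, which is one reading of "annual rate of increase" (a linear version would be equally easy to write); the function name `fill_between_waves` is hypothetical and the values are assumed to be strictly positive.

```python
def fill_between_waves(known):
    """Fill years between survey waves using a constant annual growth rate.

    `known` maps survey years to observed (strictly positive) values.
    Compound growth between waves is an assumption of this sketch.
    """
    years = sorted(known)
    filled = dict(known)
    for y0, y1 in zip(years, years[1:]):
        # Annual rate that takes the value at y0 to the value at y1.
        rate = (known[y1] / known[y0]) ** (1.0 / (y1 - y0)) - 1.0
        for y in range(y0 + 1, y1):
            filled[y] = known[y0] * (1.0 + rate) ** (y - y0)
    return filled


# Toy example: a control observed in the 2009 and 2011 waves only.
series = fill_between_waves({2009: 100.0, 2011: 121.0})
```

In this toy example the implied annual rate is 10%, so the 2010 value is approximately 110.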
The variable Income is from the ESI and NESI. Because the ESI data is hard to compare with the NESI series in terms of non-labor income [S2], our definition of income includes labor income (i.e., salaries and wages, monetary or in kind royalties, commissions and income of professionals and self-employed) and pensions and widow's pensions. In particular, we compute per capita household income (income, hereafter) and the corresponding Gini coefficient. All monetary variables are expressed in Chilean Pesos (CLP) at 2007 real prices. Finally, data on population size are obtained from the National Institute of Statistics (INE).
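For readers who want to reproduce the income measures, a minimal sketch of per capita household income and a Gini coefficient is given below. The text does not say which Gini estimator is used, so the standard rank-based formula for an unweighted sample is an assumption, as are the function names.

```python
def per_capita_income(household_income, household_size):
    """Household income divided equally among household members."""
    if household_size <= 0:
        raise ValueError("household_size must be positive")
    return household_income / household_size


def gini(incomes):
    """Gini coefficient of non-negative incomes (unweighted sample formula).

    Uses G = 2 * sum(rank_i * x_i) / (n * sum(x)) - (n + 1) / n
    with incomes sorted in ascending order and ranks starting at 1.
    """
    xs = sorted(incomes)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total <= 0:
        raise ValueError("need at least one positive income")
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2.0 * weighted / (n * total) - (n + 1.0) / n
```

With perfectly equal incomes the function returns 0; if one of four households holds all income it returns 0.75, the maximum for a sample of that size under this formula.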
Since earthquakes do affect economic variables such as income, poverty or migration, the variables EQ t and DISTEQ t could explain some of these variables at time t. To mitigate such effects we control for lagged variables. More precisely, since an earthquake at time t, t − 1, or t − 2 cannot affect income, poverty, Gini and migration in t − 3, the controls are lagged three periods whenever EQ t is applied. For our recency measure DISTEQ t the controls come from t − 1.
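The lag rule can be written down compactly. In this sketch the panel is a plain dictionary keyed by (unit, year); that layout and the names `LAGS` and `lagged_controls` are hypothetical and only serve to make the rule explicit: three lags when EQ t is the regressor, one lag for DISTEQ t.

```python
# Number of periods by which controls are lagged for each exposure measure.
LAGS = {"EQ": 3, "DISTEQ": 1}


def lagged_controls(panel, measure):
    """Map each (unit, t) to the control observed at t - lag.

    `panel` is a hypothetical dict {(unit, year): value}. Observations
    whose lagged year is missing are dropped.
    """
    lag = LAGS[measure]
    return {
        (unit, t): panel[(unit, t - lag)]
        for (unit, t) in panel
        if (unit, t - lag) in panel
    }


# Toy panel for one region.
income = {("R1", 2005): 1.0, ("R1", 2006): 2.0, ("R1", 2008): 3.0}
controls_for_eq = lagged_controls(income, "EQ")
controls_for_disteq = lagged_controls(income, "DISTEQ")
```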
For the variables Life Satisfaction and Trust we use additional controls from the Latinobarómetro database. Since the Latinobarómetro asks individuals about their ideological position, we control for this observable characteristic using Left j,t , Right j,t and None j,t . Left j,t measures the percentage of people living in comuna j, in period t, who place themselves on the left of the left-right axis; Right j,t is the percentage of people in comuna j, at t, who place themselves on the right; and None j,t measures the percentage of people who do not place themselves ideologically on the left or on the right. We also use the average age of individuals and the percentages of people with Low, Medium and High education levels. Finally, we use the variable High-Income j,t , which measures the percentage of people whose total family income covers their needs in a satisfactory manner.

Table 1 in the main text summarizes the earthquake-related variables and the indicators of social cohesion. Tables 1 and 2 provide additional information regarding these variables as well as the descriptive statistics of the control variables. Table 1 summarizes some descriptive statistics at the regional level, separately for affected (EQ = 1) and unaffected (EQ = 0) regions. Apart from the differences in our variables of interest, affected regions tend to be more populated and to have relatively more women. Table 2 shows descriptive statistics at the comuna level, separately for affected (EQ 2010 = 1) and unaffected (EQ 2010 = 0) comunas. All numbers are averages across the years 2009 (POST=0) and 2011 (POST=1).

Regional Regressions
Tables 3 and 4 report the estimates of the fixed-effects model (1) (see Results in the main text). They correspond to the estimations reported in the main text, but with the coefficients of the controls included. Recall that some controls are lagged three periods whenever EQ t j is applied, and one period for our recency measure DISTEQ t j .
Many controls in the vector X it tend to be correlated (e.g. income and poverty). As an additional robustness check, and to address multicollinearity problems, we also performed a principal components analysis. We apply parallel analysis to filter out the most important part of the variance from all the observed measures and to determine the number of components. There is a total of nine components initially (the variables from Table 1 as well as the variable year). The analysis suggests that three components should be retained, as the eigenvalues of the first three components are higher than one. In total, these three components account for 76% of the variance of the eight included variables. These three components are used in the principal component estimations. The results of these estimations are reported in Tables 5 and 6.
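The retention rule mentioned above (keep components whose eigenvalue exceeds one) can be sketched with NumPy. This illustrates only the eigenvalue-greater-than-one criterion on the correlation matrix; it is not an implementation of parallel analysis, and the function name and toy data are hypothetical.

```python
import numpy as np


def retain_components(X, threshold=1.0):
    """Count principal components with eigenvalue above `threshold`.

    X is an (observations x variables) array. Eigenvalues are taken from
    the correlation matrix, i.e. variables are implicitly standardized.
    Returns (number retained, share of total variance they explain).
    """
    corr = np.corrcoef(np.asarray(X, dtype=float), rowvar=False)
    eigenvalues = np.linalg.eigvalsh(corr)[::-1]  # descending order
    keep = eigenvalues > threshold
    return int(keep.sum()), float(eigenvalues[keep].sum() / eigenvalues.sum())


# Toy data: columns 1-2 are perfectly correlated, columns 3-4 perfectly
# negatively correlated, and the two pairs are uncorrelated with each other.
a = np.array([1.0, -1.0, 1.0, -1.0])
b = np.array([1.0, 1.0, -1.0, -1.0])
X_toy = np.column_stack([a, 2.0 * a, b, -b])
n_kept, share = retain_components(X_toy)
```

In the toy data the four variables collapse onto two components that carry all of the variance, so two components are retained.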
Comuna-level regressions
At the comuna level we use the difference-in-differences estimator in expression (2) in Results, where we compare affected and unaffected comunas before and after the 2010 Maule earthquake. Some control variables are again lagged. Table 7 reports the estimates. As with the regional-level regressions, we also conduct a principal component analysis for the comuna-level data. The results are reported in Table 8 and are qualitatively similar to those reported in Table 7.
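The logic of the difference-in-differences comparison can be illustrated with a bare two-by-two computation of group means. Expression (2) itself is given in the main text and includes controls and clustered standard errors, none of which is reproduced here; the function name and the toy numbers are hypothetical.

```python
from statistics import mean


def did_estimate(observations):
    """Two-by-two difference-in-differences from group means.

    `observations` is an iterable of (affected, post, outcome) tuples with
    affected and post coded 0/1. Returns
    (affected post - affected pre) - (unaffected post - unaffected pre).
    """
    cells = {(g, p): [] for g in (0, 1) for p in (0, 1)}
    for affected, post, outcome in observations:
        cells[(affected, post)].append(outcome)
    change_affected = mean(cells[(1, 1)]) - mean(cells[(1, 0)])
    change_unaffected = mean(cells[(0, 1)]) - mean(cells[(0, 0)])
    return change_affected - change_unaffected


# Toy data: outcome rises by 5 in affected comunas and by 1 elsewhere.
toy = [(1, 0, 10.0), (1, 1, 15.0), (0, 0, 8.0), (0, 1, 9.0)]
effect = did_estimate(toy)
```

Without controls, this quantity coincides with the coefficient on the interaction of the affected and post indicators in a regression that also includes both indicators separately.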
Placebo Test I
We conduct two types of placebo tests. We start by using a "fake" treatment group, in which we assign EQ = 1 randomly to regions/comunas. We perform this experiment 10,000 times. If the effect is driven by exposure to earthquakes rather than by other, more mechanical forces, we should see a null effect under this specification.
Regions. For the regional-level regressions we assign each region r random numbers n_r^i, with i ∈ {1, 2, 3}, drawn independently and uniformly from [0,1]. We then assign EQ = 1 for the years 2005-2006 to those regions with n_r^1 ≤ 0.067. For regions with n_r^1 > 0.067 and n_r^2 ≤ 0.13 we assign EQ = 1 for the years 2007-2009; and for regions with n_r^1 > 0.067, n_r^2 > 0.13 and n_r^3 ≤ 0.4, EQ = 1 only after 2009. For the remaining regions EQ = 0 throughout. Because there are only 15 Chilean regions, the probability that we pick the regions actually affected in the data is positive, which can increase the number of times we find effects under this specification. This problem is partially mitigated by simulating the outcome variable, y_j^t. In each replication of the test and for each region and year, we assume that y_j^t is normally distributed with mean µ and variance σ^2. The parameters µ and σ^2 are, respectively, equal to the mean and variance of Crime, at the national level and for the period considered in this work: 2005-2011.
We estimate the fixed-effects model 10000 times using the same controls as in our main specification. Average results are reported in Table 9. At a significance level of 1%, the null hypothesis β_EQ = 0 is rejected in 6% of the replications (this percentage increases to 13 and to 20 at significance levels of 5% and 10%, respectively). Although the percentage of rejections is high, which reflects the small number of Chilean regions (15), the average value of the estimated coefficient β_EQ is virtually equal to zero.
We can contrast this with our results, where for 3 of our 6 indicators of social cohesion we reject the null hypothesis at the 5 percent level. Given that a "random rejection" occurs with probability 0.1338 in our data, the probability that our result is generated randomly is given by $\binom{6}{3} \cdot 0.1338^{3} \cdot 0.8662^{3} \approx 0.031$.
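The sequential assignment rule for the regional placebo can be sketched as follows. The thresholds and year windows are those stated in the text; the function names, the use of Python's `random` module and the reading of "after 2009" as years strictly greater than 2009 are assumptions of this sketch.

```python
import random


def placebo_eq(n1, n2, n3, year):
    """Placebo EQ dummy for one region-year, given the three uniform draws."""
    if n1 <= 0.067:
        return int(2005 <= year <= 2006)
    if n2 <= 0.13:
        return int(2007 <= year <= 2009)
    if n3 <= 0.4:
        return int(year > 2009)  # "only after 2009" (assumed: 2010 onwards)
    return 0  # never treated


def draw_placebo_panel(regions, years, mu, sigma2, rng):
    """One replication: placebo treatment and simulated outcome per region-year.

    `mu` and `sigma2` stand for the national mean and variance of Crime,
    which are not reported in the text and must be supplied by the user.
    """
    panel = {}
    for region in regions:
        n1, n2, n3 = rng.random(), rng.random(), rng.random()
        for year in years:
            y = rng.gauss(mu, sigma2 ** 0.5)
            panel[(region, year)] = (placebo_eq(n1, n2, n3, year), y)
    return panel


replication = draw_placebo_panel(range(15), range(2005, 2012), 0.0, 1.0,
                                 random.Random(0))
```

Each replication would then be passed to the fixed-effects estimation, which is not reproduced here.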
Note that this assumes independence across social cohesion indicators, which seems appropriate in the absence of additional information.
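The probability quoted above is a binomial probability of exactly 3 rejections among 6 independent indicators and can be checked directly. The function name below is hypothetical; the inputs are the figures given in the text.

```python
from math import comb


def prob_exact_rejections(k, n, p):
    """Binomial probability of exactly k rejections in n independent tests."""
    return comb(n, k) * p ** k * (1.0 - p) ** (n - k)


# 3 of 6 indicators rejected, with a random-rejection probability of 0.1338.
prob = prob_exact_rejections(3, 6, 0.1338)  # roughly 0.031, as in the text
```

The same function would give the probability of at least 3 rejections by summing over k = 3, ..., 6, which is slightly larger than the value for exactly 3.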
Comunas. For the comuna-level regressions, we proceed in the same fashion. We pick 22% of Chilean comunas at random, impose EQ = 1 on them and artificially generate the dependent variable. We estimate the difference-in-differences model with standard errors clustered at the province level. In this case, we obtain rejections 1%, 6% and 11% of the time, depending on the significance level we consider. Moreover, the average value of the estimated coefficient on the interaction variable Post×EQ 2010 is approximately equal to zero. Given that a "random rejection" occurs with probability 0.058 in our data, the probability that our result is generated randomly is given by

Table 10 shows that the "placebo interaction" POST13×EQ 2010 c is insignificant throughout.

Individual-level Regressions
Finally, in Table 11 the regressions in Columns (1) and (2) replicate our diff-in-diff approach at the comuna level. Column (1) reports estimates based on the entire sample of 520787 respondents, while Column (2) excludes (in each wave) all individuals who have moved between comunas in the last five years. The comparison of both regressions rules out the possibility that the detected effects are driven by migration of the population across regions.
Column (3) uses Victim of Crime i,t as the dependent variable, while Column (4) uses Burglary i,t as the dependent variable. We exclude (in each wave) those cases in which the location of the crime was different from the victim's home, neighborhood or comuna.

Table note: (1) and (2) correspond to exposure to earthquake and distance in years since the last earthquake, respectively. Both models include lagged values of the components. Significance level (***) 1%, (**) 5% and (*) 10%.

Table note: (1) and (2) and data from waves 2008, 2009 and 2012 of the Citizen Safety Survey in Columns (3) and (4). Column (1) considers all individuals; Column (2) considers a subsample of individuals who have been living in the same comuna during the last 5 years. In Column (3) the endogenous variable measures whether the individual was the victim of a crime, and in Column (4) whether they were the victim of a burglary. Standard errors (in parentheses) clustered at the comuna level. Significance (***) p < 0.01, (**) p < 0.05, (*) p < 0.1