Emotions and decisions in the real world: What can we learn from quasi-field experiments?

Researchers in the social sciences have increasingly studied how emotions influence decision-making. We argue that research on emotions arising naturally in real-world environments is critical for the generalizability of insights in this domain, and therefore to the development of this field. Given this, we argue for the increased use of the “quasi-field experiment” methodology, in which participants make decisions or complete tasks after as-if-random real-world events determine their emotional state. We begin by providing the first critical review of this emerging literature, which shows that real-world events provide emotional shocks that are at least as strong as what can ethically be induced under laboratory conditions. However, we also find that most previous quasi-field experiment studies use statistical techniques that may result in biased estimates. We propose a more statistically-robust approach, and illustrate it using an experiment on negative emotion and risk-taking, in which sports fans completed risk-elicitation tasks immediately after watching a series of NFL games. Overall, we argue that when appropriate statistical methods are used, the quasi-field experiment methodology represents a powerful approach for studying the impact of emotion on decision-making.


Introduction
Emotions are central to the human experience. Given this, understanding how emotions induced in real-world settings affect decision-making is of considerable importance. For example, how do anxiety and elation affect risk-taking by executives or investors? How do emotional events, such as marriage, divorce, or the death of a family member, influence how we behave in the days that follow? And how do natural disasters or pandemic events, and the negative emotions they induce, impact individuals coping with the devastation?
A large body of existing research has established that emotions affect cognitive processes related to risk perception and valuation, and that decision-makers are influenced by emotions unrelated to the decisions themselves [1][2][3][4][5][6][7][8][9][10][11][12][13][14]. A great deal of this work comes from laboratory statistical model. In particular, we show that a panel instrumental variables approach, with repeat data from each participant, can help address some of these statistical concerns. To the best of our knowledge, no previous study has used this approach or laid out a formal statistical model to provide a clear basis for causal inference in quasi-field experiments on emotion. We demonstrate the methodology using a quasi-field experiment that explores the relationship between emotions induced by naturally-occurring events and risk preferences. Our study involves football fans completing risk-elicitation tasks immediately after a series of National Football League (NFL) playoff games. For our analysis, we utilize our proposed panel instrumental variable approach, and find evidence that negative emotions increase risk-taking behavior. Notably, we also find that individual characteristics are correlated with risk-taking behavior, which emphasizes the importance of using an appropriate statistical approach in quasi-field experiments.
We conclude that while there is great promise for quasi-field experiments as a tool to study emotions and decision making, it is critical for researchers to consider potential statistical issues with the approach and build them into their research methodologies and analysis. By doing so, we are confident that social scientists can develop a robust literature that shows us how naturally-occurring emotions impact real-world decisions.

Review of methodological approaches
To provide context for our paper, this section reviews and compares the roles of laboratory and retrospective quasi-experimental methods in studying emotions and moods. We conclude this section with a comparison of these approaches to the more limited quasi-field experiments literature. Note that while there is no consensus on the definition of "emotions" (see Frijda [31] for a review), in the current paper, we draw from Schachter and Singer [32] and define emotions as "[states] of physiological arousal and of [cognitions] appropriate to [that] state of arousal." In addition, we follow the distinction drawn by Clore and Ortony [33]; emotions are linked with personal evaluations of an event while moods are not. However, we defer to the authors' use of terms when discussing their individual work in this section.
In addition, we limit our discussion of laboratory and field experiments to studies about the impact of emotions and moods on actual behavior. Although we limit our discussion in this way, we note that an alternative method of studying the impact of emotions and moods on behavior uses observational or survey data rather than behavioral data. For example, Krekel et al. [34] conducted a meta-analysis of research studies by Gallup to investigate the effect of employees' sense of well-being on productivity and performance. Daly et al. [35], on the other hand, utilized a combination of biological and psychological measurements and economic measures of time discounting to identify correlations between individual characteristics and time preferences.
Furthermore, we note that there have been several comprehensive meta-analyses that examine the generalizability of laboratory results (e.g. [36,37]). In addition, there have also been extensive discussions of the advantages and disadvantages of laboratory experiments and field experiments in economics [38][39][40][41].
Laboratory experiments. Laboratory experiments are often used to study the impact that emotions have on behavior. The selection criteria for this section are papers from either the economics or psychology literature that utilize emotion- or mood-induction methods that are frequently used in laboratory experiments. While we limit our review to commonly used methods in economics studies, there are other methods that have also been used to induce physiological and emotional responses in laboratory experiments (see Martin [42] for a review of other mood induction methods).
The most commonly used techniques to induce emotions or moods in this area of research include mood induction procedures (MIPs), autobiographical recalls, and success-failure experiences (SFEs). In MIPs, moods and emotions are induced using short readings, film clips, and pictures. For example, Conte et al. [43] used film clips to elicit specific emotions: joviality, sadness, anger, and fear. Similarly, Deldin and Levin [44] induced elated and depressed states by asking participants to read a series of 60 statements that were either progressively more positive or more negative. Autobiographical recall, meanwhile, leverages participants' personal experiences as part of the procedure to induce specific emotions. For example, Fessler et al. [45] asked participants to recall or imagine situations where they experienced anger or disgust and then to write an essay about it. In contrast to MIPs and autobiographical recalls, SFEs induce emotions through manipulations of real-world outcomes rather than relying on recalls or participants' imagination. The procedure asks participants to complete a task, then manipulates the outcome (or the perceived outcome) such that participants experience either success or failure from the task.
These methods allow researchers to manipulate moods and emotions in a controlled way to study their impact on preferences. For example, Stanton et al. [46] used film clips to induce happy, sad, or neutral moods to test the impact of mood and framing on risky choices. Ibanez et al. [14] used sets of images to induce amusement, awe, fear, and sadness, in order to test the impact of these emotions on generosity via a dictator game. Combinations of these methods are also used to further reinforce the success of the emotion or mood induction. For example, Andrade and Ariely [47] used a combination of film clips and autobiographical recall to study the extent to which positive and negative emotions influence behavior in ultimatum and dictator games. Similarly, Capra [48] utilized both autobiographical recall and SFE to induce good or bad moods, to study the impact of moods on choices in a dictator game, an ultimatum game, and a trust game.
Overall, the laboratory setting offers several important benefits relative to other techniques when studying emotions. First, laboratory studies provide greater control relative to field studies. Since laboratory participants are randomly assigned to different conditions in a controlled setting, researchers can more confidently assert that differences in participants' subsequent decisions can be attributed to in-lab emotional manipulation. Second, there is clear evidence that the methods discussed in this section are successful in inducing specific emotions and moods. For example, there is evidence that some MIPs can induce similar levels of anxiety and depression to those experienced by clinically depressed participants [38]. And in a meta-analysis, Nummenmaa and Niemi [49] found that SFEs reliably induce both positive and negative emotions.
However, there are also disadvantages to conducting studies on emotion in laboratory settings. One important drawback to MIPs and the similar autobiographical recall method appears to be their susceptibility to demand effects. In a review of experimental papers on the effectiveness of mood induction procedures, Kenealy [50] found that these procedures and results are often inconsistent and subject to demand effects. Westermann et al.'s [51] meta-analysis of various MIPs also found significantly smaller mood induction effects in studies where subjects could guess or were told the purpose of the MIP. Similarly, while autobiographical recall certainly has greater ecological validity relative to the more traditional MIPs, the wording of instructions arguably encourages a demand effect (e.g. "Imagine that someone has done something to make you really angry. Briefly describe the circumstances that would make you the most angry," from [45, p.114]). Because SFEs directly induce emotions by manipulating success or failure in the laboratory, this procedure is less susceptible to demand effects relative to both MIPs and autobiographical recalls. However, SFEs are limited in the types of emotions they can elicit (e.g. it would be difficult to induce fear in a laboratory setting using this method). In addition, note that the MIPs and SFEs we discuss here were selected because they are frequently used in the literature. However, there exist other methods of inducing moods and emotions in the laboratory as well. For example, Cohn et al. [52] used the threat of an electric shock to induce fear in their participants, to test the impact that fear has on risk aversion.
Further, laboratory studies of emotion are generally limited in their external validity; that is, there is always the concern that insights gleaned from emotions induced in the lab (and the resulting behaviors they trigger) may not generalize to real-world contexts where emotions are experienced. For example, emotions are often experienced in a social context [53], which is limited in the laboratory setting. Therefore, a study testing the impact of a serious and personal emotional shock (say, from the death of a loved one) or a shared loss on a community may be difficult to replicate in a laboratory setting with existing MIPs or SFEs. Thus, while laboratory studies have strengths in this domain, it is important that they are supplemented by similar work from field settings.
Retrospective quasi-experiments. A second group of studies uses retrospective statistical techniques to measure how real-world behavior, captured by existing data, changes following emotionally charged real-world events. This methodology requires: (a) an event that causes an as-if random (therefore, "quasi-experimental") emotional shock; (b) a relevant choice occasion that occurs following the event; and (c) data documenting the outcomes of those choices. S1 Table summarizes research that used random, naturally occurring events to retrospectively study the impact of emotions on subsequent behaviors in the field. The selection criteria we used were as follows: papers from either the economics or psychology literature that leverage real-world events as random shocks that induce emotions or moods, in order to retrospectively study the impact of emotions or moods on individual behavior following the shock.
Many studies in this literature utilize sports game outcomes as the real-world shock, since fans tend to experience increased positive emotions after a home-team win and increased negative emotions after a home-team loss [54]. For example, Card and Dahl [26] used upset losses from home-team professional football games as a negative emotional cue, in order to demonstrate a correlation between negative emotion and family violence. Similar work utilizes game outcomes to explore violence related to aggression [55,56], judicial sentencing [57], satisfaction with the political status quo [23,52], and risk aversion [25].
Another set of retrospective quasi-experimental studies relies on variation in weather conditions, based on evidence that weather significantly influences moods [58,59]. For example, Larrick et al. [59] found that high temperatures induce affective aggression, which is associated with greater hostile behavior by baseball pitchers (measured by pitchers hitting batters with pitches). Studies using weather as a natural, emotion/mood-inducing event have also found that weather affects payments at a Pay-What-You-Want restaurant [60], attribution bias [61], and projection bias [62]. Retrospective quasi-experimental studies also utilize other types of events, including terrorist attacks [63], sports outcomes [23,25,26,57], and deaths in the family [64][65][66].
Studies using the retrospective quasi-experimental approach can be compelling. Although this quasi-experimental approach lacks the control of laboratory studies, it utilizes real-world events and moods/emotions, which enhances generalizability. In addition, in contrast to the laboratory approach discussed earlier, this approach is not constrained by the type or intensity of moods and emotions induced by methods such as MIPs, SFEs, or autobiographical recalls.
However, researchers utilizing this quasi-experimental approach are limited by the datasets already in existence. As a result of this limitation, the existing literature using retrospective quasi-experimental methods is constrained by the measures available in the dataset, which limits the mechanisms, through which emotions might influence preferences and choices, that researchers can study. Further, data on post-event emotional state is not always available, making it difficult to determine whether the event influenced decisions via emotions or some other pathway.
We note that there exists an alternative stream of literature that utilizes the quasi-experiment approach in which the main variable of interest is the direct impact of the random event on emotions and moods. There are also studies that leverage random events to examine their impact on behaviors that are reported through observations or surveys. In the current paper, we narrow our focus to approaches used to study the impact of emotions and moods on decision-making, but we include a sample of these other works in the S1 Appendix (S2 Table).
Quasi-field experiments. Quasi-field experiments, which pair laboratory protocols with real-world events that serve as quasi-random emotional shocks, combine some of the best features of laboratory studies and retrospective quasi-experiments. Like laboratory studies, quasi-field experiments use standard protocols to elicit preferences, values, and emotional states, e.g., via choice tasks and surveys. As a result, quasi-field experiments can be used to study many aspects of decision-making (risk aversion, altruism, cooperation, discounting, behavioral biases, etc.). However, unlike lab studies, quasi-field experiments utilize naturally occurring events that cause emotional shocks (like retrospective quasi-experiments). This allows researchers to study a greater variety of emotions and decisions relative to other methodologies, while retaining some external validity and control over process and measurement. Table 1 summarizes quasi-field experiments that study the impact of emotions and moods on decision-making. The selection criteria we used were as follows: papers from either the economics or psychology literature that study the impact that emotions and moods have on preferences using real-world events as mood/emotion-inducing shocks and laboratory elicitation instruments to measure preferences or behaviors ex-post. The table shows that this literature is nascent, with relatively few existing studies compared to the lab and retrospective approaches. We note that although we do not constrain the type of measured preferences in our review, the majority of the papers study risk preferences.
The types of random events used to induce emotions in this literature are varied. For example, Heilman et al. [29] exploited exam performance as a shock to emotions, to explore how the induced emotions affected students' risk preferences. Bassi [69] used variation in weather and a risk preference elicitation task to examine how weather-induced moods affected participants' preference for voting for a political candidate whose performance is more uncertain. Voors et al. [30], meanwhile, tested how exposure to violence affects social preferences, risk preferences, and time preferences using laboratory instruments. Other papers have utilized variation in natural disasters (including floods, earthquakes, and hurricanes) to examine how emotions influence risk-taking in decision tasks [30,67,68]. These papers employ familiar risk elicitation procedures to study changes in risk preferences in response to the emotion induced by the natural disaster. For example, Cameron and Shah [67] asked individuals to complete a task that involved selecting one gamble among a set of six gambles that varied in riskiness, after those individuals had been exposed to a flood or earthquake in Indonesia.
Quasi-field experiments offer a promising complement to laboratory and retrospective quasi-experimental studies. Importantly, our review of existing studies suggests that the magnitude of the emotional responses induced in field settings is comparable to that observed in the lab (see S3 Table). Furthermore, compared to the other, more widely used methodologies in this space, quasi-field experiments have several strengths. The quasi-field experiment approach may be less susceptible to some of the methodological concerns discussed above with lab experiments. For example, using naturally occurring events with quasi-field experiments might make it easier for researchers to mask their hypothesis to reduce the impact of demand effects [51]. Furthermore, the quasi-field experiment approach allows for the study of real-world emotions, which might improve the generalizability of findings relative to controlled lab work. Additionally, in contrast to retrospective quasi-experiments, quasi-field experiments allow the researcher to study the effects of emotion on a wide variety of behaviors, as opposed to relying on the very limited set of decision contexts that occur and are documented following real-world events. Overall, quasi-field experiments are a valuable tool for researchers studying emotion and decision making, and we argue that the methodology merits more widespread use.
However, quasi-field experiments are by no means the perfect methodological tool. As with most quasi-experiments, quasi-field experiments provide less experimenter control relative to the laboratory approach. This lack of control means that there may be correlations between experienced emotions, decisions, and individual characteristics, which laboratory experiments largely circumvent through truly random assignment. To address these potential issues, researchers can include additional statistical tools, like repeat observations, instrumental variables approaches, and/or fixed effects, to control for these potential confounds and omitted variables. However, our literature review finds that the majority of these studies have not incorporated repeat observations, individual fixed effects, or IV regressions in their analysis (S1 and S2 Tables). Given that our approach uses instrumental variables, it is worth emphasizing that this approach depends on the use of a strong instrument [71]. If the instrument is only loosely correlated with the endogenous variable, it becomes difficult to defend a causal interpretation of the resulting coefficient estimate. A first-stage F-test, which tests the null hypothesis that the first-stage coefficients on the instruments are equal to 0, can be used to assess the strength of this correlation. In the next section, we discuss a statistical approach that incorporates these features, which we feel would help researchers develop quasi-field experiments that can convincingly uncover links between emotions and decisions.
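To make the first-stage check concrete, the following is a minimal sketch for the simplest case of one instrument and one endogenous variable, where the F-statistic tests whether the first-stage slope is zero. The function name, the data, and the threshold of 10 (a common rule of thumb, not a value taken from this paper) are assumptions for illustration only.

```python
from statistics import mean

def first_stage_f(instrument, endogenous):
    """F-statistic for H0: slope = 0 in endogenous = a + b * instrument + error."""
    n = len(instrument)
    z_bar, e_bar = mean(instrument), mean(endogenous)
    s_zz = sum((z - z_bar) ** 2 for z in instrument)
    s_ze = sum((z - z_bar) * (e - e_bar) for z, e in zip(instrument, endogenous))
    s_ee = sum((e - e_bar) ** 2 for e in endogenous)
    slope = s_ze / s_zz
    explained = slope * s_ze          # explained sum of squares
    residual = s_ee - explained       # residual sum of squares
    return explained / (residual / (n - 2))

# Hypothetical data: 1 = favorite team lost, 0 = favorite team won.
team_lost = [1, 1, 1, 1, 0, 0, 0, 0]
negative_emotion = [2.1, 1.8, 2.5, 1.9, -1.7, -2.2, -1.6, -2.0]

# A made-up variable unrelated to emotion, to show what a weak instrument looks like.
unrelated = [1, 0, 1, 0, 1, 0, 1, 0]

f_strong = first_stage_f(team_lost, negative_emotion)
f_weak = first_stage_f(unrelated, negative_emotion)
print(f"strong instrument F = {f_strong:.1f}, weak instrument F = {f_weak:.2f}")
```

In this toy example the game outcome moves emotion sharply, so the statistic is large, while the unrelated variable yields a statistic near zero and would not support a causal interpretation.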

A statistical model for quasi-field experiments
In a typical laboratory experiment, researchers randomly assign participants to treatment and control groups. In contrast, quasi-field experiments rely on naturally-occurring emotional shocks. As a result, the statistical validity of their results depends heavily on whether it is possible to identify a source of emotional variation that is unrelated to participants' characteristics. Our review of the literature suggests that many existing quasi-field experiment studies use techniques that are vulnerable to omitted variable bias to estimate the causal effect of emotions on decision making. To the best of our knowledge, studies either perform analyses with individual fixed effects or instrumental variables regression (or neither), but not both. The analyses in these studies, therefore, are susceptible to bias stemming from the fact that unobservable characteristics may be correlated with both the emotion-inducing event and the outcome measure. Note that the problem of omitted variable bias is by no means unique to this setting; many research designs have endogeneity challenges, which formal statistical approaches are designed to address. We put forward our model, and the research design it encourages, as an approach to address the specific endogeneity challenges inherent in field research on emotion.
For example, consider a risk preference study that relies on the emotions induced by exam performance. Even if exam performance strongly impacts emotional state, a comparison of risk-taking behavior after the exam would only be valid if students with good grades on the exam (or grades higher than they expected) were identical in all other respects to students with poor grades (or grades lower than they expected). Such an assumption is unlikely to be valid. In this paper, we propose a panel instrumental variable approach that can be used to help reduce these potential biases. We utilize the instrumental variable approach rather than a reduced form approach, because while the IV approach scales the reduced form, it also accounts for the statistical significance of the first stage regression. Unlike a reduced form regression, the instrumental variable regression requires both the first and second stage regressions to be statistically significant.
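The sense in which the IV approach "scales the reduced form" can be written out for the simple case of a single instrument and a single endogenous variable. The symbols below are labels chosen for this sketch rather than notation from the paper: $\hat{\rho}$ is the reduced-form coefficient from regressing the outcome on the instrument, and $\hat{\pi}$ is the first-stage coefficient from regressing emotion on the instrument.

```latex
\hat{\beta}_{IV} = \frac{\hat{\rho}}{\hat{\pi}},
\qquad
\hat{\rho} = \frac{\widehat{\mathrm{Cov}}(\text{outcome}, \text{instrument})}{\widehat{\mathrm{Var}}(\text{instrument})},
\qquad
\hat{\pi} = \frac{\widehat{\mathrm{Cov}}(\text{emotion}, \text{instrument})}{\widehat{\mathrm{Var}}(\text{instrument})}
```

When $\hat{\pi}$ is close to zero the ratio becomes unstable, which is one way to see why the strength of the first stage matters for the IV estimate.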
To demonstrate the issue and our proposed methodology to help solve it, a formal statistical model is useful. Consider the following model of how risk preferences depend on emotional state. Let $R_{it}$ represent individual $i$'s risk-taking behavior at time $t$. Suppose that $R_{it}$ is a function of current emotional state $E_{it}$, an unobserved individual-specific personality characteristic $\lambda_i$, an unobserved factor $Z_{it}$, and an error term $\varepsilon_{it}$; we refer to this relationship as Eq (1). Suppose also that emotional state itself depends on the emotion-inducing event $M_{it}$ (e.g. midterm exam score relative to expectations), an individual-specific personality characteristic $\gamma_i$, the unobserved factor $Z_{it}$, and an error term $\xi_{it}$; we refer to this relationship as Eq (2). The goal of the study is to estimate $\beta_1$, which represents the effect of emotions $E_{it}$ on risk-taking behavior $R_{it}$. The central challenge is that $E_{it}$ is correlated with both omitted variables $\gamma_i$ and $Z_{it}$. Consider the midterm exam example: it is possible that both emotion and risk preferences could be correlated with personality characteristics such as stability or intelligence. Individual characteristics have been shown to correlate with how individuals experience and express emotions [72]. For example, there is evidence that an individual's gender has an impact on how that individual experiences an emotion [73,74]. Similarly, individual characteristics such as gender have also been shown to be correlated with risk-taking behavior [75,76]. Furthermore, it is also possible that some time-variant omitted variable (e.g., the time of day at which each student chooses to complete a follow-up risk-elicitation task) may influence both emotion and risk preferences. Thus, due to the omission of unobserved correlated variables, using cross-sectional or even panel data to estimate Eq (1) may result in a biased estimate of $\beta_1$.
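One linear specification consistent with the verbal definitions above can be sketched as follows. Only $\beta_1$ and the variable names come from the text; the additive linear form, the intercepts $\beta_0$ and $\alpha_0$, and the coefficients $\beta_2$, $\alpha_1$, and $\alpha_2$ are assumptions made for illustration.

```latex
\begin{align}
R_{it} &= \beta_0 + \beta_1 E_{it} + \lambda_i + \beta_2 Z_{it} + \varepsilon_{it} \tag{1} \\
E_{it} &= \alpha_0 + \alpha_1 M_{it} + \gamma_i + \alpha_2 Z_{it} + \xi_{it} \tag{2}
\end{align}
```

Under this reading, a regression of $R_{it}$ on $E_{it}$ alone leaves $\lambda_i$ and $Z_{it}$ in the error term, and $E_{it}$ is correlated with that error whenever $\lambda_i$ is related to $\gamma_i$ or $Z_{it}$ enters both equations.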
However, if, conditional on an individual-specific fixed effect, performance relative to expectations ($M_{it}$) is orthogonal to the unobserved variable $Z_{it}$, then $\beta_1$ can be estimated using a panel instrumental variables approach that includes individual fixed effects in both stages. This approach depends on the use of a strong instrument [71]. In other words, to avoid generating a biased measure of causal effect, the instrumental variable must strongly influence the endogenous variable (i.e., emotional state).
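The mechanics of a panel instrumental variables estimate can be sketched in a few lines for the just-identified case (one instrument, one endogenous variable). This is an illustrative sketch and not the estimation code used in the study: the function names and the noiseless data are hypothetical, with the data constructed so that the true effect of emotion on risk-taking is 0.5 and the individual effects are correlated across the two equations.

```python
from collections import defaultdict
from statistics import mean

def demean_within(ids, values):
    """Subtract each individual's own mean (the fixed-effects 'within' transform)."""
    totals, counts = defaultdict(float), defaultdict(int)
    for i, v in zip(ids, values):
        totals[i] += v
        counts[i] += 1
    return [v - totals[i] / counts[i] for i, v in zip(ids, values)]

def panel_iv(ids, outcome, emotion, instrument):
    """Just-identified IV estimate with individual fixed effects in both stages."""
    r = demean_within(ids, outcome)
    e = demean_within(ids, emotion)
    m = demean_within(ids, instrument)
    s_mm = sum(a * a for a in m)
    first_stage = sum(a * b for a, b in zip(m, e)) / s_mm    # emotion on instrument
    reduced_form = sum(a * b for a, b in zip(m, r)) / s_mm   # outcome on instrument
    return reduced_form / first_stage, first_stage

def pooled_ols_slope(x, y):
    """Naive slope of y on x that ignores individual effects, for comparison."""
    x_bar, y_bar = mean(x), mean(y)
    return (sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
            / sum((a - x_bar) ** 2 for a in x))

# Hypothetical panel: two individuals, three observations each.
ids        = [1, 1, 1, 2, 2, 2]
instrument = [1, 0, 1, 0, 0, 1]                    # e.g. an emotion-inducing event
emotion    = [5, 3, 5, -1, -1, 1]                  # 2 * instrument + individual effect
outcome    = [7.5, 6.5, 7.5, -2.5, -2.5, -1.5]     # 0.5 * emotion + individual effect

beta_1, first_stage = panel_iv(ids, outcome, emotion, instrument)
naive = pooled_ols_slope(emotion, outcome)
print(f"panel IV: {beta_1:.3f}, first stage: {first_stage:.3f}, pooled OLS: {naive:.3f}")
```

Because the individual effects in this toy data push emotion and the outcome in the same direction, the pooled regression overstates the effect, while the within-transformed IV ratio recovers 0.5.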
Although the panel instrumental variable approach is standard in some areas of economic research, few studies of emotions employ this method. Table 1 shows that few existing quasi-field experiment papers use either individual fixed effects or an instrumental variable approach. To the best of our knowledge, no existing papers on this subject utilize a combination of both. Based on our model, we suggest that many quasi-field experiments and, more broadly, lab-in-the-field studies that leverage emotional shocks as part of the design can be improved from a statistical perspective using a panel instrumental variable approach.

An empirical example
To illustrate our statistical model, this section presents a study we conducted on emotions and risk preference. Specifically, we explore the effect of the emotional shocks induced by professional football (NFL) matches on sports fans' risk preferences. While research using this quasi-field experiment approach has been conducted in the past, as we discussed earlier, our purpose in presenting this study is to show how a panel instrumental variable approach helps address some of the statistical challenges we identified in past work. Note that our study was reviewed and approved by the Harvard University IRB (F22473-102), and consent was obtained from all subjects.
Participants for this study were recruited through two online message boards for NFL football fans (www.thefantasyfootballguys.com and forums.footballguys.com). Participants were pre-screened through an online survey based on: (1) whether they completed the entire prescreening survey; (2) how committed they were to their football team; (3) how available they were to complete post-game surveys; and (4) whether they had a PayPal account to receive payments. The initial email invitations were sent to 203 NFL fans, asking them to participate in a baseline survey for payment at their convenience, to be followed by further surveys immediately after upcoming NFL games. Participants had to complete the baseline survey in order to be part of the rest of the study, and 163 NFL fans ultimately participated in the study.
All participants were paid a guaranteed $5 show-up fee via PayPal after completing each survey, including the baseline survey. In addition, participants had the opportunity to win up to $26 in each survey, based on their choices in the risk elicitation tasks (the dollar values varied slightly depending on the payoffs and probabilities for each survey task).
The baseline survey sent to participants included more information about the study, solicited contact and basic demographic information from participants, and required participants to complete a risk-elicitation task modeled on the Balloon Analogue Risk Task (BART) from Lejuez et al. [77]. In order to measure the baseline level of risk-taking and to familiarize participants with the rules of the BART, all participants played three distinct "games" of the BART at baseline, for real money. The BART worked as follows. Each distinct BART game began with participants being presented with a virtual balloon, which had an initial dollar value to participants (either $1 or $2, as described below). During each of up to seven rounds of the game, the participant chose either to "inflate" or "not inflate" the balloon. If the participant inflated the balloon successfully, he or she could increase the dollar value of the balloon (by either $1 or $2). However, there was a small probability (known to the participant) that adding air to the balloon would make it pop. If the balloon did pop, the participant did not earn any money for that game. If the participant chose to stop inflating the balloon at any point before the balloon popped, the game ended and the participant kept the money he or she had earned up to that point.
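The structure of the inflate-or-stop decision can be illustrated with a short sketch. The starting value, increment, and pop probabilities below are hypothetical (the study's actual values are in Table 2), and the sketch assumes the pop probability is the same in every round, which the description above does not state.

```python
def inflate_expected_value(current_value, increment, pop_probability):
    """Expected payoff of one more pump: the balloon pays 0 if it pops."""
    return (1 - pop_probability) * (current_value + increment)

def risk_neutral_pumps(start_value, increment, pop_probability, max_rounds=7):
    """Pumps taken by a risk-neutral player who compares one pump ahead with stopping."""
    value, pumps = start_value, 0
    while (pumps < max_rounds
           and inflate_expected_value(value, increment, pop_probability) > value):
        value += increment
        pumps += 1
    return pumps

# Hypothetical game: balloon starts at $1 and gains $1 per successful pump.
pumps_low_risk = risk_neutral_pumps(1, 1, 0.10)    # pumping stays attractive all 7 rounds
pumps_high_risk = risk_neutral_pumps(1, 1, 0.25)   # stops once the balloon is worth $3
print(pumps_low_risk, pumps_high_risk)
```

A participant who stops earlier than this benchmark is behaving as if risk-averse, and one who continues past it as if risk-seeking, which is the variation the risk-aversion parameter described later is meant to capture.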
We note that a limitation of this risk preference elicitation method is that we are not able to observe the choices that the participants would have made if the balloon had not popped, making ours an upper bound for how risk-averse (or a lower bound for how risk-loving) a person can be. We sought to minimize the impact of this issue by asking participants to play the BART game three times per participant per week and computing one risk aversion parameter for each participant for each week, with participants popping their balloon all three times in a given week only 18.1% of the time.
Following the baseline survey, each study participant was invited to complete up to six similar "post-game" surveys with BART, after NFL games in late 2012 and early 2013. The number of post-game surveys a given participant completed depended on how many games his/her favorite team played in the postseason. This period covered the NFL playoffs and Super Bowl, when multiple teams were no longer playing, so there were many weeks where there were fewer than 163 participants.
The day before each game, participants were sent an email message reminding them that they would receive a short survey to complete immediately after the football game. Then, the moment that their favorite team's game ended, each participant received a text message and an email with a link to the post-game survey. The survey asked participants to self-report on six emotional states: excitement, nervousness, happiness, anger, sadness, and disappointment. It also asked them to complete three BART games, played for real money. Participants were given a 20-minute window after the game ended to complete the post-game survey. This ensured that their emotional response to the game was still strong at the time of survey completion. Table 2 summarizes the payoffs and probability of popping in the BART games, in each week of the survey. The payoffs and probability of popping varied slightly over the course of the study.
To construct our measure of emotion, we reduce the six emotional variable measures into a single measure using principal component analysis (PCA), which allows us to identify the main component of common variation between the specific emotion variables. Principal component analysis is often used in emotions studies in psychology [78,79] to reduce high-dimensional responses (distinct measures of multiple emotions). This empirical technique is less frequently used by economists in studying emotions (see [80][81][82] for a few examples from economics, however). Importantly, our proposed statistical approach to quasi-field experiments does not rely on the use of PCA; rather, in this specific study, it was useful to employ PCA because the results show that the component vector with the largest eigenvalue has a natural interpretation as "negative emotion" (see S4 Table in the Supporting Information). To construct our final emotion variable, we project each participant's emotional state onto this vector. Specifically, the negative emotion variable created using PCA includes approximately equal amounts of the anger, sadness, and disappointment variables, and an approximately equal (but negative) amount of the excitement and happiness variables.
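As an illustration of this step, the sketch below extracts the first principal component from a small set of made-up emotion ratings and projects each response onto it. It is not the study's code: the ratings are hypothetical, the component is computed from the covariance matrix (the text does not say whether the covariance or correlation matrix was used), and power iteration stands in for a full eigendecomposition.

```python
from statistics import mean

EMOTIONS = ["excitement", "nervousness", "happiness", "anger", "sadness", "disappointment"]

def first_principal_component(rows, orient_index, iterations=500):
    """Leading eigenvector of the sample covariance matrix, found by power iteration."""
    k, n = len(rows[0]), len(rows)
    col_means = [mean(row[j] for row in rows) for j in range(k)]
    centered = [[row[j] - col_means[j] for j in range(k)] for row in rows]
    cov = [[sum(r[a] * r[b] for r in centered) / (n - 1) for b in range(k)]
           for a in range(k)]
    vec = [1.0] + [0.0] * (k - 1)
    for _ in range(iterations):
        nxt = [sum(cov[a][b] * vec[b] for b in range(k)) for a in range(k)]
        norm = sum(x * x for x in nxt) ** 0.5
        vec = [x / norm for x in nxt]
    # The sign of an eigenvector is arbitrary; orient it so the chosen emotion loads positively.
    if vec[orient_index] < 0:
        vec = [-x for x in vec]
    return vec, col_means

def project(row, vec, col_means):
    """Score of one response on the component (centered dot product)."""
    return sum((x - m) * v for x, m, v in zip(row, col_means, vec))

# Hypothetical 1-7 ratings in the order given by EMOTIONS.
ratings = [
    [6, 3, 7, 1, 1, 1], [7, 4, 6, 1, 2, 1], [6, 2, 6, 2, 1, 2],   # fans whose team won
    [2, 4, 1, 6, 6, 7], [1, 3, 2, 5, 6, 6], [2, 5, 1, 6, 5, 7],   # fans whose team lost
]

loadings, means = first_principal_component(ratings, EMOTIONS.index("anger"))
winner_score = project(ratings[0], loadings, means)
loser_score = project(ratings[3], loadings, means)
print(dict(zip(EMOTIONS, (round(x, 2) for x in loadings))))
```

In this toy data the component loads positively on anger, sadness, and disappointment and negatively on excitement and happiness, so a higher score corresponds to more negative emotion.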
To measure risk preferences, we use each decision by respondents in the Balloon Analogue Risk Task (BART) [77] to calculate a "risk aversion" parameter that measures the local curvature of the participant's utility function. This parameter is based on the difference between the expected value of the gamble and the respondent's certainty equivalent. We normalize by dividing this difference by the square root of the participant's wealth in the losing state of the world. This normalization is intended to capture and adjust for the well-documented phenomenon that risk aversion tends to increase as stake sizes increase (see for example Holt and Laury [83] and Weber and Chapman [84], which document the relationship between stake size and risk aversion). Our normalization approach is a simple structural effort to recognize this pattern, but we acknowledge it is a coarse technique for building this intuitive pattern into our analysis. The parameter is defined as follows:

$$\text{risk aversion} = \frac{p_w W_w + (1 - p_w) W_L - W_c}{\sqrt{W_L}}$$

In this equation, $p_w$ is the probability of the winning outcome, $W_w$ is the participant's total wealth if the winning outcome occurs, $W_L$ is her total wealth if the losing outcome occurs, and $W_c$ is her wealth if she chooses the certain outcome instead of taking the gamble.
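As a worked illustration of the verbal definition above, the following sketch computes the parameter as the gap between the gamble's expected value and the certain wealth, scaled by the square root of losing-state wealth. The function name, argument names, and dollar amounts are hypothetical and are not taken from the study; the formula is our reading of the description in the text.

```python
import math

def risk_aversion_parameter(p_win, wealth_win, wealth_lose, wealth_certain):
    """(Expected value of the gamble - certain wealth) / sqrt(losing-state wealth)."""
    if wealth_lose <= 0:
        raise ValueError("losing-state wealth must be positive for the normalization")
    expected_value = p_win * wealth_win + (1.0 - p_win) * wealth_lose
    return (expected_value - wealth_certain) / math.sqrt(wealth_lose)

# Hypothetical gamble: 50% chance of ending with $16, 50% chance of ending with $4.
# The expected value is $10, so indifference at a certain $10 is risk neutral.
neutral = risk_aversion_parameter(0.5, 16.0, 4.0, 10.0)   # 0.0
averse = risk_aversion_parameter(0.5, 16.0, 4.0, 9.0)     # 0.5: accepts less than the EV
seeking = risk_aversion_parameter(0.5, 16.0, 4.0, 11.0)   # -0.5: demands more than the EV
```

The three cases reproduce the sign convention used in the analysis: zero for risk-neutral choices, positive for risk-averse choices, and negative for risk-seeking choices.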
We use each participant's set of choices in each survey (spanning 3 BART games) to calculate the value of the risk aversion parameter at which the respondent switches from the risky option (inflating the balloon) to the safe option (deciding not to inflate). This parameter is 0 for risk-neutral choices, greater than 0 for risk-averse choices, and less than 0 for risk-seeking choices. We then use that parameter value as the dependent variable (measure of risk-taking) in our subsequent analysis.
Note that in 80% of cases (at the participant-week level), the participant chose the safe option at least once when facing a risky decision. In the remaining 20% of cases, participants chose the risky option in all 3 BART games they played in that week, which makes our parameter estimate for them for that week an upper bound on their risk aversion. It is important to reiterate that these methodological approaches provide us with single-variable measures of both emotion and risk preference for each participant after each game they watched. Critically, the fact that we have multiple observations per individual allows us to use the panel instrumental variable approach.
Results from the NFL fans study
Evaluation of NFL game outcomes as an instrumental variable for emotion
We begin by assessing the extent to which NFL game outcomes serve as an appropriate instrument for emotion (Fig 1). Two requirements must be satisfied. First, game outcomes must have strong, statistically significant effects on negative emotion. Second, game outcomes must not be correlated with unobservable respondent characteristics that also influence risk preferences.
Stated in instrumental variable terms, NFL games must (1) induce strong emotional responses and (2) influence risk preferences only through variation in emotional states, and not directly. Panels (a) and (c) of Fig 1 present visual evidence on the first requirement by showing how game outcomes influence fans' emotions. The impact is striking: when a fan's favored team won a game, he or she reported substantially fewer negative emotions. This pattern is strong regardless of the number of points by which the team won the game. To test this requirement formally, we regress negative emotion on whether the team won and by how much. Specifications 1A and 1B in Table 3 report these regressions. Even after controlling for individual fixed effects, the effect of winning on negative emotions is negative and highly significant.
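A first-stage regression of this kind can be sketched as follows. This is an illustration under stated assumptions and not a reproduction of Specifications 1A and 1B: the data are simulated, the helper names `demean_within` and `fixed_effects_ols` are hypothetical, and individual fixed effects are handled with the standard within transformation (subtracting each fan's own mean), which may differ from the estimation routine used in the study.

```python
import numpy as np

def demean_within(values, groups):
    """Subtract each group's mean (the within transformation for fixed effects)."""
    values = np.asarray(values, dtype=float)
    groups = np.asarray(groups)
    out = values.copy()
    for g in np.unique(groups):
        mask = groups == g
        out[mask] = values[mask] - values[mask].mean(axis=0)
    return out

def fixed_effects_ols(y, X, groups):
    """OLS of y on X after removing group means from both sides."""
    beta, *_ = np.linalg.lstsq(demean_within(X, groups),
                               demean_within(y, groups), rcond=None)
    return beta

# Simulated panel (hypothetical): 200 fans observed after 4 games each.
rng = np.random.default_rng(1)
n_fans, n_weeks = 200, 4
fan = np.repeat(np.arange(n_fans), n_weeks)
win = rng.integers(0, 2, size=fan.size).astype(float)
points = rng.integers(1, 22, size=fan.size)               # margin of 1 to 21 points
margin = np.where(win == 1, points, -points).astype(float)
fan_effect = rng.normal(size=n_fans)[fan]                 # stable disposition of each fan
negative_emotion = fan_effect - 2.0 * win + rng.normal(size=fan.size)

# Coefficients on [win, margin]; the simulated effect of a win is -2.0.
first_stage = fixed_effects_ols(
    negative_emotion, np.column_stack([win, margin]), fan)
```

A strongly negative coefficient on the win indicator is what the first requirement asks for; in applied work one would also report its standard error or a first-stage F statistic, which this sketch omits.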
The second requirement is harder to test. We offer one approach in S5 Table in the Supporting Information, which compares the characteristics of participants who are fans of winning versus losing teams. Our results suggest that while participants are generally similar along most observables, fans of losing teams seem to be younger and have lower incomes than fans of winning teams. This is a potential concern, because it might indicate unobservable differences between the fans as well. It therefore suggests that a panel instrumental variable approach using individual fixed effects is appropriate.
Of course, the key concern for our analysis is that unobservable participant characteristics could be related to both the emotion-inducing event (the game outcome) and risk preferences. The outcome of any particular game clearly involves some degree of chance. However, even in circumstances where the real-world emotion-inducing event is plausibly random, bias from selection or other causes might still exist in subtle ways. For example, fans of teams that often lose might root for such teams because they are drawn to underdogs or longshots, by disposition, a characteristic that may be both hard to measure and correlated with risk preferences. Therefore, when repeat observation is possible (as in this case), it should be used as part of a panel instrumental variables approach.
Emotions and risk preferences
We next explore how negative emotions influence risk aversion in the study, using several different approaches to illustrate our methodological point. Table 3 presents the results from our regressions. First, specification 2 shows a "naïve" regression in which risk aversion is regressed directly on negative emotion, without individual or time fixed effects. The coefficient on negative emotion is close to zero and not statistically significant. Next, specification 3A presents a "simple" IV regression approach. This specification uses win/loss status of the football game as an instrumental variable for negative emotion, but does not control for the football game win margin. Again, the effect of negative emotion on risk aversion is small and insignificant. These approaches would therefore suggest minimal evidence of a relationship between negative emotion and risk preference in this context.
Specification 3B presents our preferred panel instrumental variables approach. In this specification, we again use win/loss status of the football game as an instrumental variable for negative emotion. However, we also control for the margin by which the football team won the game, and use weights that are inversely proportional to that margin. This has the effect of placing greater emphasis on games that were "cliff-hangers," in which the football team won or lost by only a few points. This specification suggests that negative emotion has a negative effect on risk aversion (p < .05). Further, the point estimate is larger than in the previous specifications.
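To make the contrast between the naive regression and the panel instrumental variables specification concrete, the sketch below simulates a panel in which an unobserved fan trait and an unobserved week-specific mood both affect emotion and risk aversion. It is an illustration only: the data, the effect size, and the function name `weighted_2sls` are hypothetical, fan fixed effects are implemented as dummy variables, and the weights are set to one over the absolute win margin as one reading of "inversely proportional to that margin". It does not reproduce Table 3.

```python
import numpy as np

def weighted_2sls(y, endog, instrument, exog, weights):
    """Weighted two-stage least squares; returns the coefficient on `endog`."""
    sw = np.sqrt(np.asarray(weights, dtype=float))
    Z = np.column_stack([instrument, exog]) * sw[:, None]   # instrument + controls
    X = np.column_stack([endog, exog]) * sw[:, None]        # regressor + controls
    X_hat = Z @ np.linalg.lstsq(Z, X, rcond=None)[0]        # first-stage fitted values
    return np.linalg.lstsq(X_hat, y * sw, rcond=None)[0][0]

# Simulated panel (hypothetical): 300 fans observed after 4 games each.
rng = np.random.default_rng(2)
n_fans, n_weeks = 300, 4
fan = np.repeat(np.arange(n_fans), n_weeks)
n = fan.size
win = rng.integers(0, 2, size=n).astype(float)
points = rng.integers(1, 22, size=n)
margin = np.where(win == 1, points, -points).astype(float)
trait = rng.normal(size=n_fans)[fan]     # unobserved, fixed within fan
mood = rng.normal(size=n)                # unobserved, varies week to week
TRUE_EFFECT = -0.5                       # simulated effect of emotion on risk aversion
negative_emotion = trait - 2.0 * win + mood + rng.normal(scale=0.5, size=n)
risk_aversion = (TRUE_EFFECT * negative_emotion + trait + 0.5 * mood
                 + rng.normal(scale=0.5, size=n))

# Naive regression: risk aversion on negative emotion and a constant only.
naive = np.linalg.lstsq(np.column_stack([negative_emotion, np.ones(n)]),
                        risk_aversion, rcond=None)[0][0]

# Panel IV: win/loss instruments emotion; controls are margin and fan dummies.
fan_dummies = (fan[:, None] == np.arange(n_fans)).astype(float)
controls = np.column_stack([margin, fan_dummies])
panel_iv = weighted_2sls(risk_aversion, negative_emotion, win,
                         controls, weights=1.0 / np.abs(margin))
```

In this simulation the naive coefficient is pulled toward zero by the unobserved trait and mood, while the panel IV estimate lies close to the simulated effect. Standard errors are omitted; a dedicated econometrics package would normally be used to obtain them.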
Finally, as a robustness check, specification 4 shows the results of estimating a regression discontinuity design that uses win margin as the running variable. Panels (b) and (d) in Fig 1 help to visualize the intuition for this approach by plotting the relationship between win margin and risk aversion. While there is a strong decrease in negative emotion at the point where the score difference is zero, there is no statistically significant discontinuity in risk aversion at the threshold.
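The logic of the regression discontinuity check can be sketched with a local linear fit on each side of a zero win margin. The sketch uses simulated data, and the function name `rd_jump`, the bandwidth of 14 points, and the uniform kernel are assumptions made for illustration; the text does not report the bandwidth or kernel used in specification 4.

```python
import numpy as np

def rd_jump(outcome, running, bandwidth):
    """Estimated jump in `outcome` as `running` crosses zero from below.

    Fits separate linear trends on each side of zero, using only
    observations with |running| <= bandwidth (uniform kernel).
    """
    outcome = np.asarray(outcome, dtype=float)
    running = np.asarray(running, dtype=float)
    keep = np.abs(running) <= bandwidth
    r, y = running[keep], outcome[keep]
    above = (r > 0).astype(float)
    design = np.column_stack([np.ones_like(r), above, r, above * r])
    return np.linalg.lstsq(design, y, rcond=None)[0][1]   # coefficient on `above`

# Simulated data (hypothetical): emotion drops sharply with a win, while
# risk aversion is unrelated to the game outcome.
rng = np.random.default_rng(3)
n = 2000
points = rng.integers(1, 22, size=n)
margin = np.where(rng.integers(0, 2, size=n) == 1, points, -points).astype(float)
win = (margin > 0).astype(float)
negative_emotion = -2.0 * win + 0.02 * margin + rng.normal(size=n)
risk_aversion = rng.normal(size=n)

emotion_jump = rd_jump(negative_emotion, margin, bandwidth=14)
risk_jump = rd_jump(risk_aversion, margin, bandwidth=14)
```

Here the estimated jump in emotion is close to the simulated value of -2, and the jump in risk aversion is close to zero. That is the qualitative pattern the text describes, although in the study the absence of a discontinuity in risk aversion is an empirical finding and not a design feature.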
Overall, when using our preferred panel instrumental variable approach, our results suggest that negative emotion may indeed reduce risk aversion. Notably, our empirical approach shows the importance of the panel instrumental variable method, namely that it allows for a less-biased estimate of the relationship between emotion and risk preference. Indeed, our approach influenced the point estimate and statistical significance in the analysis relative to more common empirical approaches. While there remain limitations to our approach (as discussed further in the next section), we believe that our approach helps reduce bias in estimates and thereby helps researchers better recover true relationships between emotion and decisions.

General discussion
In the section above, we use our NFL fans study as an example to demonstrate how a panel instrumental variables approach can help address omitted variable concerns in quasi-field experiments. We find that there are individual characteristics that may be correlated with both risk preferences and emotional states. Without an appropriate statistical approach, the coefficients might have underestimated the effect of negative emotion on risk aversion. For this reason, we argue that future quasi-field experimental research on emotions and decision making would greatly benefit from using an instrumental variables approach with panel data when possible.
There are limitations to our approach in this specific study that are important to highlight here, however. First, the use of an instrumental variable approach requires several assumptions. For this sports fans study, the most tenuous of these assumptions is the exclusion restriction: our approach assumes that the game outcomes did not affect risk preferences through any channel other than emotion. In this case, it is plausible that this assumption does not hold; for example, the game outcomes might have affected other important variables that could influence risk preferences (e.g., income via gambling). To some extent, using panel data about an individual can help alleviate this concern, by allowing for data collection to test for these alternate channels at multiple moments in time (e.g., by asking whether and how much participants had bet on the game in question in each week). In our study, we did not ask explicitly about betting on the game in question, so we are limited in our ability to directly address this issue. However, researchers deploying our suggested empirical approach in the future should think, ex ante, about possible violations of the exclusion restriction and include questions about possible alternative channels, to limit bias in their estimates as much as possible.
Second, the construction of an individual-level panel dataset necessarily requires eliciting repeated measures. As a result, the study is susceptible to the common issues associated with repeated, self-reported measures (e.g., changes in expectations or perceptions over time, or the possibility that subjects will strive for choice consistency across the repeated measures). In our case, this concern is somewhat alleviated by the large amount of time between games (approximately one week). However, it is certainly possible that this limitation prevents the recovery of a wholly unbiased estimate. For example, if participants feel compelled to make their risky choices consistent from week to week, this might reduce variance in the outcome variable and thereby hide potential links between emotion and risk preferences.
Third, the likelihood that a participant completes a given survey may be influenced by the expected/actual performance of his/her team, which could also bias the results. For example, if participants do not respond to surveys when their teams suffer a loss when they were expected to win, we would lose data from high negative emotion observations. This could reduce variance in a key variable in the analysis, negative emotion, and therefore make it more difficult to recover the true relationship between emotion and decision making.
Fourth, it is challenging to directly interpret effect sizes in our results. That is, because the measures are sensitive to the specific experimental design used, and the specific ways that we measured negative emotion and risk preferences, it is more difficult to directly compare our empirical results to those in the broader literature. That said, we present these results not as definitive proof of links between emotion and risk preferences, but rather to support our broader methodological argument that quasi-field experimental research designs that accommodate panel instrumental variable approaches are valuable for studying emotions and decision making.

Conclusion
Quasi-field experiments are a promising and underused methodology for research on emotions and decision-making. In this paper, we argue that these types of hybrid studies are at least as effective at inducing emotions as laboratory studies. They offer an approach for studying intense emotions that cannot ethically be induced in a laboratory setting, while also providing greater flexibility than "pure" quasi-experimental research. Combined with appropriate statistical models, quasi-field experiments can produce rigorous inference about the causal effect of emotions on decisions, greatly enhancing the literature on this important topic.
The business of everyday life is rich with opportunities for quasi-field experimental research on judgment and decision making. Indeed, numerous events, from sporting events to the news to personal triumphs and tragedies, affect human emotions. We argue that by taking advantage of naturally occurring events in the field and using appropriate statistical techniques, future research could deepen our understanding of how emotions influence the economic choices that individuals make.
Supporting information S1