Fig 1.
Utility and probability weighting functions used in cumulative prospect theory.
(a) Utility functions for gains (u(r − b|γ) = (r − b)^γ, for r > b) and losses (u(r − b|λ, γ) = −λ|r − b|^γ, for r < b) used in the calculation of value under cumulative prospect theory. These are concave for gains and convex for losses, mimicking a diminishing-sensitivity (marginal returns) effect on relative rewards. The steeper utility function for losses reflects loss aversion, amplifying the perceived impact of a loss compared to a gain of similar magnitude. (b) Prelec’s probability weighting function, w(p|α, δ) = exp{−α(−log p)^δ}, is plotted for different values of the Prelec parameter α and for fixed δ = 0.75. The probability weighting function originally proposed by Tversky and Kahneman, w(p|γ) = p^γ/(p^γ + (1 − p)^γ)^{1/γ}, is represented by the black dashed line, for γ = 0.85. Notice that Prelec’s function is very similar to the originally proposed probability weighting function when α = 1, demonstrating both overweighting of low probabilities and underweighting of high probabilities, corresponding to the possibility and certainty effects. (c) Probability anomaly, w(p|α) − p. Blue indicates a positive anomaly, whereas red indicates a negative anomaly. Here it is easy to see the effects of the probability weighting function: for low values of α, low probabilities are overweighted, causing the so-called possibility effect (i.e., highly unlikely events are perceived as more probable than they actually are), and, for high values of α, high probabilities are underweighted, demonstrating the certainty effect (i.e., highly likely events are perceived as less probable than they actually are). Notice that both the certainty and possibility effects come into play when α ≈ 1.
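As a concrete reference, the two families of functions in this figure can be sketched in a few lines of Python. This is a minimal illustration, not the authors’ code: the defaults λ = 2 and γ = 0.85 follow the parameter values quoted in Fig 2, δ = 0.75 follows this caption, and clipping p away from 0 is our implementation choice.

```python
import numpy as np

def utility(x, lam=2.0, gamma=0.85):
    """CPT utility of a relative outcome x = r - b: concave power function
    for gains, steeper (loss-averse) convex power function for losses."""
    return np.where(x >= 0, np.abs(x) ** gamma, -lam * np.abs(x) ** gamma)

def prelec_weight(p, alpha=1.0, delta=0.75):
    """Prelec's probability weighting: w(p) = exp(-alpha * (-log p)^delta)."""
    p = np.clip(p, 1e-12, 1.0)  # keep the logarithm well-defined at p = 0
    return np.exp(-alpha * (-np.log(p)) ** delta)

# Probability anomaly w(p) - p, as in panel (c):
p = np.array([0.01, 0.50, 0.99])
print(prelec_weight(p) - p)  # positive at p = 0.01 (possibility effect),
                             # negative at p = 0.99 (certainty effect)
```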
Fig 2.
CPT-value difference between certainties and gambles, V_Certainty(b) − V_Gamble(b, α), for different reference points b and probability weighting functions parameterized by Prelec’s parameter α.
The CPT-value is calculated using the utility functions u+(x) = x^0.85 for gains (x > 0) and u−(x) = −2|x|^0.85 for losses (x < 0), and probability weighting functions of the form w(p|α, δ) = exp{−α(−log p)^δ}, with δ = 0.75. Blue regions indicate larger positive differences, while red regions indicate larger negative differences. The grey solid line represents the decision boundary, where both values are equal. In (a), the agent is choosing between the certainty of gaining 900 and a gamble in which one might gain 1000 with probability 95%; the optimal choice is ‘Gamble’, since 1000 × 95% = 950 > 900. In (b), the agent is choosing between the certainty of losing 900 and a gamble in which one might lose 1000 with probability 95%; the optimal choice is ‘Certainty’, since −1000 × 95% = −950 < −900. In (c), the agent is choosing between the certainty of gaining 55 and a gamble in which one might gain 1000 with probability 5%; the optimal choice is ‘Certainty’, since 1000 × 5% = 50 < 55. In (d), the agent is choosing between the certainty of losing 55 and a gamble in which one might lose 1000 with probability 5%; the optimal choice is ‘Gamble’, since −1000 × 5% = −50 > −55. In all four cases, the reference point and probability perception influence behavior in significant ways. In (a) and (d), the optimal choice is attained when the agent either has an optimistic view of outcomes (i.e., a low reference point b) and overweights low probabilities (i.e., a low Prelec parameter α), or when the agent underweights high probabilities (i.e., high α) and is pessimistic (i.e., a high reference point b). The opposite happens in (b) and (c). In cases where both certainty and possibility effects are present (i.e., α ≈ 1), behavior becomes strongly non-monotonic as a function of both parameters.
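For a gamble with a single non-zero outcome, the CPT value underlying these panels can be computed directly. The sketch below reuses the utility and prelec_weight helpers from the Fig 1 sketch and assumes the standard rank-dependent treatment of a two-outcome prospect; it is our reconstruction, not the authors’ code.

```python
def cpt_value_two_outcome(x, p, b, alpha=1.0):
    """CPT value of a gamble paying x with probability p (and 0 otherwise),
    evaluated relative to reference point b, with rank-dependent weights."""
    (hi, p_hi), (lo, p_lo) = sorted([(x - b, p), (0.0 - b, 1.0 - p)],
                                    reverse=True)   # order outcomes by rank
    w = lambda q: prelec_weight(q, alpha)
    if lo >= 0:                        # both outcomes are gains
        return w(p_hi) * utility(hi) + (1 - w(p_hi)) * utility(lo)
    if hi <= 0:                        # both outcomes are losses
        return (1 - w(p_lo)) * utility(hi) + w(p_lo) * utility(lo)
    # mixed prospect: the gain and the loss are weighted separately
    return w(p_hi) * utility(hi) + w(p_lo) * utility(lo)

# Panel (a) at b = 0, alpha = 1: certainty of 900 vs. 1000 with probability 95%
print(utility(900 - 0))                      # ~324: value of the certainty
print(cpt_value_two_outcome(1000, 0.95, 0))  # ~319: value of the gamble
```

Even though the gamble has the higher expected value (950 > 900), the concave utility and the underweighting of the 95% probability make the certainty preferable at b = 0 and α = 1, consistent with a positive (blue) difference in panel (a).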
Table 1.
Payoff matrix underlying the utility functions of the 2-player stag hunt game with two actions—Hare (H) and Stag (S).
Specifically, agent 1’s choices are given by the rows and agent 2’s choices by the columns. The utilities given to agents 1 and 2 are, respectively, the first and second numbers of a given cell.
Fig 3.
Probability of hunting stags in a normal-form stag hunt game.
Cumulative prospect theory can explain a wide range of coordinating behaviors in a simple game such as the stag hunt, depending on how probabilities are perceived and how outcomes are framed. The panels show the probability of choosing Stag in the normal-form stag hunt game, for pairs of prospect theory parameters—i.e., reference point b, Prelec parameter α, loss aversion λ, and utility concavity γ. For each pair of parameters, the remaining ones were left at their default values: b = 0, α = 1, λ = 1, and γ = 1. Lighter colors indicate a higher probability of hunting stag. The white line represents the mixed Nash equilibrium probability of 1/3.
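The 1/3 mixed equilibrium corresponds to a simple indifference condition. The payoff numbers below are hypothetical (Table 1’s actual entries are not reproduced in this caption), but they are consistent with a 1/3 equilibrium and show where the white line comes from.

```python
# Hypothetical stag hunt payoffs (illustrative only): hunting hare pays 1
# regardless of the other agent; hunting stag pays 3 when the other agent
# also hunts stag, and 0 otherwise.
stag_coord, stag_alone, hare = 3.0, 0.0, 1.0

# A risk-neutral agent facing an opponent who hunts stag with probability p
# is indifferent when p * stag_coord + (1 - p) * stag_alone == hare:
p_star = (hare - stag_alone) / (stag_coord - stag_alone)
print(p_star)  # 1/3 -- the mixed Nash equilibrium marked by the white line

# A CPT agent distorts this comparison: e.g., for b <= 0 (all outcomes are
# gains) it compares utility(hare - b) against
# w(p) * utility(stag_coord - b) + (1 - w(p)) * utility(stag_alone - b),
# shifting the indifference point as b, alpha, lambda, and gamma vary.
```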
Fig 4.
Markov stag-hunt transition probabilities of an individual agent.
Darker colors indicate higher probability. Shown are 5 different colors, which, from lightest to darkest, correspond to probabilities of 0% (white), 20%, 40% (corners), 60%, and 80% (corners). Black does not appear because there are no degenerate transitions that would make agents get stuck. Each agent chooses one of three actions (Left, Stay, or Right) and, depending on its current state, moves to another state according to the respective transition probabilities. The state of an agent does not change the transition probabilities of the other agent; e.g., an agent cannot block the other agent.
Fig 5.
Markov stag-hunt reward functions.
Darker colors indicate larger rewards. Agents receive a reward at each time step depending on their own state and the state of the other agent. The reward in the hare state (state 3) can be obtained regardless of where the other agent is. The reward structure also allows us to model situations in which an agent can obtain a big reward only if the other agent is willing to coordinate with it; in our case, the stag state (i.e., state 11) holds one such big, but difficult to obtain, reward.
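A reward function with this structure can be written compactly. The sketch below is our reading of the figure: the state indices come from the caption, while the magnitudes R_HARE and R_STAG are hypothetical placeholders.

```python
HARE_STATE, STAG_STATE = 3, 11   # state indices taken from the caption
R_HARE, R_STAG = 1.0, 10.0       # hypothetical magnitudes, not given here

def reward(own_state, other_state):
    """Per-step reward: the hare reward is unilateral, whereas the stag
    reward requires both agents to occupy the stag state simultaneously."""
    r = 0.0
    if own_state == HARE_STATE:
        r += R_HARE                                    # obtainable alone
    if own_state == STAG_STATE and other_state == STAG_STATE:
        r += R_STAG                                    # requires coordination
    return r
```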
Fig 6.
Effect of sophistication on values and policies in the Markov stag hunt game.
(a) CPT values of the agent states, for sophistication levels k = 1, 2, 3 and 4. Joint states with redder colors have higher value. (b) Policies for sophistication levels k = 1, 2, 3 and 4. Joint states with darker colors indicate higher probability. The value of the stag state grows with the sophistication level. We assumed reference points b1 = b2 = 0, discount factors β1 = β2 = 0.9, utility function u(x) = x, and Prelec’s probability weighting function w(p|α, δ) = exp{−α(−log p)^δ}.
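The captions do not spell out how the sophistication levels are constructed; a common construction consistent with Fig 7’s description (conditioning the Markov game on increasingly sophisticated policies) is level-k reasoning, sketched generically below. Here level0_policy and best_response are hypothetical callables standing in for the CPT-valued planning step.

```python
def level_k_policy(k, level0_policy, best_response):
    """Generic level-k sketch: a level-k agent best-responds (here, via a
    CPT-valued planning routine) to an opponent assumed to play the
    level-(k-1) policy, with level 0 given by level0_policy."""
    policy = level0_policy
    for _ in range(k):
        policy = best_response(policy)  # best-respond to the previous level
    return policy
```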
Fig 7.
Role of sophistication level of CPT-agents in the stationary distributions of the Markov stag hunt game.
Darker colors indicate higher probability. We assumed equal agent parameters (i.e., discount factors β1 = β2 = 0.9, reference points b1 = b2 = 0, utility functions u1(x) = u2(x) = x, and Prelec’s weighting function w(p|α, δ) = exp{−α(−log p)^δ}).
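Once the joint policies are fixed, these stationary distributions follow from a standard eigenvector calculation. A generic numerical sketch (not the authors’ code), where P is the joint-state transition matrix of the Markov chain obtained by conditioning the game on both agents’ policies:

```python
import numpy as np

def stationary_distribution(P):
    """Stationary distribution of a Markov chain with row-stochastic
    transition matrix P: the left eigenvector for eigenvalue 1, normalized."""
    eigvals, eigvecs = np.linalg.eig(P.T)
    v = np.real(eigvecs[:, np.argmin(np.abs(eigvals - 1.0))])
    return v / v.sum()

# Example with a 2-state chain:
P = np.array([[0.9, 0.1],
              [0.5, 0.5]])
print(stationary_distribution(P))  # [0.8333..., 0.1666...]
```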
Fig 8.
Role of the reference point of CPT-agents in the stationary distributions of the Markov stag hunt game for several sophistication levels.
Stationary distributions of the resulting Markov chains obtained by conditioning the Markov game on increasingly sophisticated policies, k = 1, 2, 3 and 4, for CPT-agents with several reference points b = −1, 0, 1, 2. Darker colors indicate higher probability. We assumed discount factors β1 = β2 = 0.9, utility function u(x) = x, and Prelec’s weighting function w(p|α, δ) = exp{−α(−log p)^δ}.
Table 2.
Payoff matrix of a symmetric 2-player, 2-action normal-form game.