Stochastic representation decision theory: How probabilities and values are entangled dual characteristics in cognitive processes

Humans are notoriously bad at understanding probabilities, exhibiting a host of biases and distortions that are context dependent. This has serious consequences for how we assess risks and make decisions. Several theories have been developed to replace the normative rational expectation theory at the foundation of economics. These approaches essentially assume that (subjective) probabilities weight multiplicatively the utilities of the alternatives offered to the decision maker, although evidence suggests that probability weights and utilities are often not separable in the mind of the decision maker. In this context, we introduce a simple and efficient framework for describing the inherently probabilistic human decision-making process, based on a representation of the deliberation activity leading to a choice through stochastic processes, the simplest of which is a random walk. Our model leads naturally to the hypothesis that probabilities and utilities are entangled dual characteristics of the real human decision-making process. It predicts the famous fourfold pattern of risk preferences. Through the analysis of choice probabilities, it is possible to identify two previously postulated features of prospect theory: the inverse S-shaped subjective probability as a function of the objective probability, and risk-seeking behavior in the loss domain. It also predicts observed violations of stochastic dominance, while it does not when the dominance is "evident". Extending the model to account for finite human deliberation time and the effect of time pressure on choice, it provides other sound predictions: an inverse relation between choice probability and response time, preference reversal under time pressure, and an inverse double-S-shaped probability weighting function. Our theory, which offers many more predictions for future tests, has strong implications for psychology, economics and artificial intelligence.


Introduction
Randomness is a fundamental component in most human affairs, from economics and politics to medicine and sports. Yet, people often make poor and inconsistent decisions when confronted with it. The rational normative recipe of Expected Utility Theory [1] has shown major limitations in accounting for how strongly people misperceive probabilities and uncertainty [2][3][4], leading to the notion of bounded rationality [5] and a long list of behavioral biases and fallacies. Several attempts [6][7][8][9][10][11][12] have been made to explain such fallacies, replacing the objective probabilities of events with "decision weights", but still retaining a sort of expectation principle, where the attractiveness of an event is decomposed into the product of (subjective) probability and (subjective) value. Abundant evidence [13][14][15][16][17][18] however suggests that the two are not independent; for example, people tend to overestimate the probability of an event if the associated outcome is bad. Rank-Dependent Theories [9][10][11] partially take into account the effect of value on probability, such that decision makers tend to overweight only events with 'extreme' consequences. However, their axiomatic structure prevents them from accounting for observed violations of stochastic dominance [19]. Operationally, estimating the subjective probability and the utility as two separate entities is subject to the joint-hypothesis problem [20], leading to severe limitations for real-life applications.
The above-cited frameworks are deterministic in nature, postulating that the best option will always be chosen. When tested against empirical data, a probabilistic component is needed [21] to account for observed "noise" and "inconsistencies" [22,23]. We can distinguish two classes of probabilistic theories of decision-making: random utility maximization (RUM) models and stochastic decision processes. The former, introduced by Thurstone [24], assumes that the "perceived" utility of an option is a random variable, written as the sum of a "true" fixed utility and a random disturbance encoding the deviation from rational behaviour. Debreu [25] proved the existence of a utility function representing a stochastic preference relation under a minimal set of assumptions. McFadden [26] describes the evolution of RUM models over the past decades, linking them to the Luce choice axiom (LCA) [27], a very useful assumption enforcing desirable properties such as independence from irrelevant alternatives and strong stochastic transitivity. However, several empirical studies [28][29][30][31][32][33] show that humans do not always conform to such a structure, the most famous example being the "red bus/blue bus" problem [34].
The second class of models assumes that the utility of the alternatives is fixed, but that the process leading to a decision is inherently stochastic. Regarding choice in uncertain environments, the most famous model is decision field theory (DFT) [35], where a stochastic process (Brownian motion) is assumed to mimic the fuzzy and hesitant deliberation activity of the human mind. The theory takes inspiration from the Ratcliff drift-diffusion models (DDMs) [36][37][38], which have been shown to describe well choices and reaction times in perceptual decision-making tasks (for example, discriminating a motion direction). Unlike RUM models, these theories can account for the fact that human decision making happens in finite time, and explain how such deliberation time, as well as time pressure, affects choice probability.
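For readers unfamiliar with this class of models, a generic drift-diffusion trial can be sketched in a few lines. This is an illustrative toy only, not DFT's calibrated implementation; the function name and the drift, noise and threshold values are our own arbitrary assumptions:

```python
import random

def ddm_trial(drift, noise=1.0, threshold=1.0, dt=0.01, rng=random):
    """One drift-diffusion trial: a noisy accumulator starts at 0 and moves
    until it hits +threshold (choose option 1) or -threshold (choose option 2).
    Returns (choice, deliberation_time)."""
    x, t = 0.0, 0.0
    step_sd = noise * dt ** 0.5
    while abs(x) < threshold:
        x += drift * dt + rng.gauss(0.0, step_sd)
        t += dt
    return (1 if x >= threshold else 2), t

random.seed(0)
trials = [ddm_trial(drift=0.5) for _ in range(2000)]
p_upper = sum(1 for choice, _ in trials if choice == 1) / len(trials)
# A positive drift (a utility difference favouring option 1) makes the upper
# boundary more likely to be reached first; the estimate is close to the
# analytic splitting probability 1/(1 + exp(-2*drift*threshold/noise**2)).
```

The point relevant here is that choice and deliberation time arise jointly from the same stochastic process, which is what distinguishes this class of models from static RUM formulations.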
In almost all theories, the "true" utility of a gamble or its index of worth is obtained by combining probabilities and outcomes in a (subjective) expectation, irrespective of the probabilistic model adopted (additive random disturbance or drift-diffusion process). As a result, all these frameworks carry some problematic aspects of expected utility theory, the most prominent (and oldest) being embodied in the St. Petersburg paradox [39], where an infinite expectation value of the gamble would imply infinite willingness to pay, while in reality many people would pay at most a small amount [40]. Although Expected Utility was designed to solve this paradox, a simple modification of the gamble, often called the Super St. Petersburg paradox [41], reintroduces the problem: if the lottery provides outcome 2^(2^n), rather than 2^n, with probability 1/2^n, the Bernoullian expected utility of the gamble (logarithmic utility function) diverges again. For any unbounded utility function, there will always be a Super Super St. Petersburg paradox.
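The divergence can be checked directly. In the following sketch (our own illustration, with a hypothetical function name), each term of the Bernoullian log-utility expectation of the Super St. Petersburg gamble simplifies to log 2, so the partial sums grow without bound:

```python
import math

def super_petersburg_partial(n_terms):
    """Partial sum of the Bernoullian (log-utility) expected utility of the
    Super St. Petersburg gamble, which pays 2**(2**n) with probability 2**-n.
    Each term (1/2**n) * log(2**(2**n)) = (1/2**n) * (2**n) * log(2) = log(2),
    so the partial sums grow linearly and the series diverges."""
    return sum((0.5 ** n) * (2 ** n) * math.log(2) for n in range(1, n_terms + 1))

# 10 terms give 10*log(2) ~ 6.93, 100 terms give ~ 69.3, and so on,
# without bound: the logarithmic utility no longer tames the gamble.
```

The same one-line argument shows why any unbounded utility function u admits its own "Super" variant: it suffices to choose outcomes o_n with u(o_n) = 2^n.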
When shifting from the normative perspective of decision theory (telling what people should do), where expected utility proves best, to a descriptive perspective (reporting what people actually do), it is worthwhile to investigate alternative mechanisms of value formation in the human mind that are different from the class of generalized expectation approaches mentioned above. Indeed, from a semantic perspective, separating probability and outcome seems quite odd, since any probabilistic statement must contain explicitly or implicitly information about "value". In other words, a probability number quantifies the likelihood of a concrete event that is specified, and this event carries an explicit value or implicit assessment of worth or impact. For instance, when conceiving the likelihood of a natural disaster, one cannot help thinking of the potential associated destruction and loss of lives, which are therefore implicitly connected to a cost. When thinking of the probability of the election of some political candidate, one cannot avoid envisaging the social, economic and financial consequences, which carry an implicit value judgement. Generally, whatever the event, it carries either a direct value or an indirect value assessment, even if not fully formalized in the mind of the probability assessor. Therefore, the way in which outcomes and probabilities interact in the human mind seems to be much more entangled than represented by the simple factorization prevalent in utility theories and their generalizations in behavioral economics and psychology. The intermingled nature of probabilities and values has been reported by Lopes [42] and is highlighted in the above-cited experiments [13][14][15][16][17][18], which demonstrate the effect of outcomes on perceived probability.

Our contribution
Here we propose a new framework for describing human decisions under risk, based on a representative stochastic process, in the same spirit as drift-diffusion models, but with a notable difference: outcomes and probabilities are not merely multiplied to form an index of worth; rather, they combine in a non-symmetric and non-separable way, as dual characteristics of an event. The difference will become evident when we present the model in detail, but the core concept is the following. In drift-diffusion models, as in DFT and race models, the outcomes and associated probabilities of gambles are combined into a unique entity, a mathematical expectation, which then plays the role of the drift component of the stochastic process representing the decision-maker. The decision is triggered when the process reaches a threshold, called the decision criterion, usually related to the time available for making a choice (the closer the threshold to the starting point, the faster the process will reach it). In our framework, probabilities and outcomes play structurally different roles: a decision occurs when the diffusive particle is absorbed at the end-point of an interval associated with a given event, whose distance is solely determined by the event's probability. The existence of n events is thus represented by n absorbing end-points at the ends of n arms in a starfish configuration along which the Brownian particle diffuses. The n arms have different lengths controlled by the probabilities of the associated events. In this representation, it is natural to conceptualize the values or utilities of the outcomes by adding drifts characterizing each arm of the starfish: the larger the value of an event, the larger the drift that biases the random walk towards the corresponding end-point. Notice that this mapping respects the positivity of the probabilities associated with the arms' lengths, while the drifts can be attractive or repulsive to reflect gains and losses, respectively.
More concretely, consider several outcomes A,B,C. . ., each understood to occur with probabilities p_A, p_B, p_C,. . . Our key idea is that the mind imagines, consciously or unconsciously, bundles of random paths wandering around in some abstract space, where the alternative outcomes A,B,C. . . are identified as distinct domains (absorbing boundaries) in this space. The distance between the domain representing outcome, say, A and the initial position of the particle is inversely proportional to p_A, while the bias responsible for the attraction of the particle to the boundary is proportional to the value of outcome A. The probability for the diffusing particle to be absorbed by a particular domain is then primarily interpreted as a measure of attractiveness of the associated event, as in DFT; at the same time, a conditional absorption probability can be interpreted as a subjective, value-distorted probability, as we will see below.
Thanks to the mutual interaction between perceived probabilities and perceived value of outcomes embedded in the starfish geometry with drifts, our model predicts the famous fourfold pattern of risk preferences [43]. To get an intuition on why this is the case, we derive two previously postulated features of prospect theory [43]: the inverse S-shaped subjective probability as a function of the objective probability and risk-seeking behaviour in the loss domain. However, these two entities are not exactly those described by prospect theory, because they are not separable. Rather, they can be inferred and rationalized by studying how the predicted choice probability depends on events' outcomes and probabilities. Without added assumptions, our model conforms naturally to Luce choice axiom [27], enforcing strong stochastic transitivity for pairwise choices. It also predicts violations (as well as observance) of stochastic dominance, in agreement with empirical data [44].
Moreover, generalizing the model to account for time pressure and finite decision times, it provides other empirically confirmed predictions: the inverse relation between choice probability and response time [45], preference reversal under time pressure [46,47], and an inverse double-S-shaped probability weighting function [48]. Also, note that while usual drift-diffusion models have non-trivial and somewhat artificial generalizations beyond binary choices [49], our representation remains essentially locally one-dimensional for an arbitrary number of available options.
Notwithstanding its predictive power, given its simplicity, the present version of our model has limitations. Because of the Luce choice structure: i) it would predict a non-deterministic choice for a decision between two simple sure outcomes (thus we restrict our choice set, as Luce does); ii) it cannot predict observed violations of independence from irrelevant alternatives [31][32][33] (similarity effect, attraction effect, compromise effect). Furthermore, the proposed stochastic representation is more of an allegory: it should not be taken as literally meaning that the human brain imagines all possible random paths wandering around in some abstract space for several outcomes A, B, C. . ., if only because of limited human working memory. Our framework is proposed as a first minimal-complexity model, or null-model, of human risky choice, which provides the baseline for further elaboration and improvements. Indeed, our present model is characterized essentially by only two tuning parameters (compared, for instance, to the seven parameters of DFT). In the future, we will present extensions of the model obtained by relaxing some assumptions.
In summary, motivated by: i) empirical evidence for "interaction" between probability and value [13][14][15][16][17][18]; ii) empirical evidence for intrinsically probabilistic human choice [50]; iii) the success of drift-diffusion models in describing human behaviour in several tasks, we present a new probabilistic decision theory that combines probability and value in a non-separable way. Despite its simplicity, it provides straightforward derivations, at a more microscopic level, of several known structures that have been documented empirically in human decision theory. The rest of the article is structured as follows: in Section Model we introduce the theoretical model, first without time constraints and then generalizing. In Section Results, we outline the main predictions of our theory. Section Discussion summarizes and concludes.

Stage 1: "Infinite time" Stochastic Representation Decision Theory (SRDT)
This sub-section presents the simplest version of our model, i.e. without considering the role of (finite) time for human decision-making.
Formulation of the stochastic representation of lotteries. In the simplest possible situation, a decision maker (DM) has to make a choice between playing two binary lotteries: if the DM chooses lottery L_1 (resp. L_2), she will receive amount o_A (resp. o_C) with probability p (resp. q), and o_B (resp. o_D) with probability 1−p (resp. 1−q). The amounts can be negative, corresponding to losses.
As mentioned in the introduction, our model is conceptually analogous to drift-diffusion models, including decision field theory (DFT): a stochastic process is assumed to represent the human deliberation activity leading to a decision, and a choice is triggered when the process reaches a certain threshold. Fig 1 shows how the above binary choice is represented in DFT: if the process (a Brownian particle in the simplest case) reaches the upper boundary (resp. lower boundary) first, then lottery L_1 (resp. L_2) is chosen. The drift component of the motion is related to the difference between the expectation-like utilities of the lotteries,

δ ∝ [π(p)u(o_A) + π(1−p)u(o_B)] − [π(q)u(o_C) + π(1−q)u(o_D)],

where u and π are the so-called utility and probability functions, respectively.
In our framework, an alternative way of value formation is assumed, keeping in mind the numerous pieces of evidence [13][14][15][16][17][18] showing a relevant interaction between probability and value perception. We start from a plausible representation of the lotteries' objective probabilities, as perceived by the decision maker. Typically, humans find it easier to understand probabilities in terms of frequencies [51]. Therefore, we propose to model their cognition via the occurrence of favorable random walk paths that hit some target, an absorbing boundary in this case. In other words, we view the cognitive processes leading to the "feeling" or "understanding" of probability as imagining a bundle of random walkers wandering about, and the perception of the actual occurrence of the event as the arrival of random walkers at some boundaries or domains. This representation gives substance and meaning to the perception of probability, equal to the fraction of "successful" paths, as in the standard frequentist approach to probability theory [52].
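A textbook gambler's-ruin computation illustrates why arm lengths inversely proportional to the probabilities reproduce the objective probabilities in the drift-free case. The sketch below is our own illustration; the scale factor and function name are arbitrary choices:

```python
import random

def absorption_fraction(arm1, arm2, n_trials=20000, rng=random):
    """Fraction of unbiased +/-1 random walks, started at the junction (0),
    that are absorbed at +arm1 before -arm2 (two absorbing walls)."""
    hits = 0
    for _ in range(n_trials):
        x = 0
        while -arm2 < x < arm1:
            x += rng.choice((1, -1))
        hits += x == arm1
    return hits / n_trials

random.seed(1)
# Encode an objective probability p = 0.25 through arm lengths proportional
# to 1/p and 1/(1-p): here 12 = 3/0.25 and 4 = 3/0.75.
est = absorption_fraction(12, 4)
# Gambler's-ruin result: P(absorb at +12 first) = 4/(12 + 4) = 0.25 = p.
```

In the drift-free case the fraction of "successful" paths thus reproduces the objective probability exactly, which is the sense in which the arm lengths encode the lotteries' probabilities before any outcome-dependent force is switched on.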
Once the lotteries' objective probabilities are encoded into some absorption probabilities, we introduce lotteries' outcomes and account for: i) their intrinsic utility; ii) their effect on perceived probability. The simplest incarnation of this twofold effect is to introduce an outcomedependent force (derived from a potential energy) that biases the random walk, producing a value-distorted understanding of probability. This construction leads to an effective influence between probabilities and outcomes; such reciprocal interaction will result in a distorted perception of these two entities by the decision-maker, that in turn determines her decision preferences.
Put differently, instead of compressing all the lottery information into an expectation-like index of worth, we "unpack" a lottery by introducing an absorbing branch for each of its outcome-probability pairs. As a consequence, the topology of the space where the stochastic process wanders will depend on the specific choice setup. This condition resonates with the fact that, in many situations, utility maximization is computationally intractable [53].
Operationally, we represent choosing between L_1 and L_2 with a Brownian particle undergoing a continuous random walk [54] that starts at the crossing (taken as the origin) between 4 segments, 2 per lottery, as shown in Fig 2 (to be compared with the single segment used in DFT, shown in Fig 1 along the y-axis, while the x-axis is the time of deliberation). Pictorially, the decision-maker is identified with the Brownian particle itself, whose stochastic path simulates the deliberation act taking place while evaluating the possible alternatives. Each branch encodes information about one lottery outcome, through a potential energy tilting the branch, and its associated probability of occurrence, through the branch length ending with an absorbing boundary. A (perhaps more intuitive) analogous discrete random walk representation is shown in S2 Fig.

Fig 1. DFT-representation of binary choice. If the process reaches the upper boundary (resp. lower boundary) first, then lottery L_1 (resp. L_2) is chosen. The drift component of the motion is related to the difference between the expectation-like utilities of the lotteries. Time elapses along the x-axis ("number of samples" denotes "time"), leading to directed paths along it.
When the process is restricted to represent only one lottery, the probability to be absorbed at the end of one branch can be interpreted as the value-distorted subjective probability of the associated outcome (see sub-section "Subjective Probability"). In the presence of two (or more) lotteries, the probability to be absorbed at the end of one branch of a given lottery gives a contribution to the total probability that this lottery is chosen. The probability of choosing lottery L_1 (resp. L_2), denoted by P(L_1) (resp. P(L_2)), is thus given by

P(L_1) = P(1_a) + P(1_b),   P(L_2) = P(2_c) + P(2_d),

where P(k_{η_k}) denotes the probability for the particle to be absorbed by the wall located at distance η_k on branch k_{η_k}, for k = 1,2, with η_{k=1} ∈ {a,b} and η_{k=2} ∈ {c,d}. In words, the probability of choosing, say, lottery L_1 is given by the sum of two terms: the probability of being absorbed along branch 1_a, representing (o_A, p), plus the probability of being absorbed along branch 1_b, representing (o_B, 1−p).

To quantify the meaning of an outcome o_A, we assume the existence of a preference or value function u(o_A), endowed with the minimal standard properties of being non-decreasing and concave on the gain side to represent risk aversion (see sub-section "Risk-seeking behavior for losses" for the loss side). Then, the form of the potential energy acting on the Brownian particle along a branch with outcome o_A is taken as linear, with a slope proportional to u(o_A), as represented by dashed lines in Fig 2. This corresponds to a constant force acting on the Brownian particle along each segment. The sign of the energy potential is such that the greater an outcome is, the stronger the attraction toward the corresponding branch end-point.
This representation has the advantage of remaining essentially one-dimensional, the motion on each segment being governed by a simple partial differential equation. For example, the probability density p(x,t) of the particle at position x and time t on branch 1_a (of length a) evolves according to the following Fokker-Planck equation [55,56]

∂p(x,t)/∂t = −u(o_A) ∂p(x,t)/∂x + (D/2) ∂²p(x,t)/∂x²,   p(a,t) = 0 for all t (absorbing boundary),   (4)

where u(o_A) is the constant drift acting on the particle, D is the so-called diffusion coefficient, and the two boundary conditions account respectively for the absorbing wall at distance a from the origin and for the probability f(t) of the random walker incoming at the origin from the other branches. Note that Eq (4) is just one possible way to look at the problem, i.e. solving a diffusion process on each branch independently and then matching the fluxes to ensure conservation of probability mass. However, as shown in S1 Appendix, we did not proceed this way: rather, we first solve the absorption problem in the case of only two branches (i.e. a one-dimensional Brownian motion between two absorbing walls), and then show how it can be generalized to an arbitrary number of branches.
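As a sanity check on this one-dimensional picture, the two-branch case (a Brownian particle between two absorbing walls, drift u toward the right wall) can be solved in closed form from the stationary splitting equation (D/2) h'' + u h' = 0 and compared against a direct simulation of the stochastic dynamics. The sketch below is our own illustration with arbitrary parameter values, not the paper's calibrated setup:

```python
import math
import random

def splitting_exact(mu, D, a, b):
    """P(absorb at +a before -b | start at 0) for dx = mu dt + sqrt(D) dW,
    obtained from (D/2) h'' + mu h' = 0 with h(-b) = 0, h(a) = 1."""
    k = 2.0 * mu / D
    return math.expm1(k * b) / (math.exp(k * b) - math.exp(-k * a))

def splitting_mc(mu, D, a, b, n_trials=2000, dt=0.001, rng=random):
    """Euler-Maruyama Monte Carlo estimate of the same splitting probability."""
    sd = math.sqrt(D * dt)
    hits = 0
    for _ in range(n_trials):
        x = 0.0
        while -b < x < a:
            x += mu * dt + rng.gauss(0.0, sd)
        hits += x >= a
    return hits / n_trials

random.seed(2)
exact = splitting_exact(0.5, 1.0, 1.0, 1.0)   # (e - 1)/(e - 1/e), about 0.731
approx = splitting_mc(0.5, 1.0, 1.0, 1.0)
# A positive drift biases absorption toward the corresponding end-point:
# the mechanism by which larger outcomes attract the decision process.
```

For u = 0 the exact formula reduces to the pure arm-length ratio b/(a+b), recovering the drift-free encoding of probabilities described above.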
Simple dimensional analysis of Eq (4) shows that D sets the scale for the impact of the outcome values compared with the probabilities in the value formation process: (i) taking very large D's amounts to neglecting the influence of outcome values; (ii) small D's make outcome values dominant in the construction of preferences.
Explicit expressions for the decision probabilities. As shown in Fig 2, the probability P(L_1) (resp. P(L_2)) for the decision maker to choose lottery L_1 (resp. L_2) is obtained by solving Eq (4) for each of the four branches, with the matching condition of conservation of the probability of presence of the Brownian particle when crossing the junction point at the origin.
Using the theory of random walks and diffusion processes [57], we obtain (see S1 Appendix for the derivation)

P(L_1) = U(L_1) / [U(L_1) + U(L_2)],   P(L_2) = U(L_2) / [U(L_1) + U(L_2)],   (5)

with

U(L_1) = Ũ_p(o_A) + Ũ_{1−p}(o_B)   (6)

and

U(L_2) = Ũ_q(o_C) + Ũ_{1−q}(o_D),   (7)

where Ũ_p(o) denotes the contribution to the effective utility of the branch carrying outcome o with probability p (explicit expressions are derived in S1 Appendix). Expression (5) recovers the ratio scale representation of Luce's choice axiom for binary choice [27], with effective utilities given by (6) and (7). This implies the so-called strong stochastic transitivity for pairwise choices: P_{L_1,L_2}(L_1) ≥ 0.5 and P_{L_2,L_3}(L_2) ≥ 0.5 imply that P_{L_1,L_3}(L_1) ≥ max[P_{L_1,L_2}(L_1), P_{L_2,L_3}(L_2)]. Note that the solution of Eq (4) for N alternatives generalizes into

P_N(L_j) = U(L_j) / Σ_{i=1}^{N} U(L_i),   (8)

where P_N(L_j) is the probability of choosing lottery L_j among the N available lotteries and the U(L_i)'s are generalized utilities given by expressions of the form (6) and (7). As stated in the introduction, because of the Luce choice structure, our theory would predict a non-deterministic decision when the choice is between two sure outcomes (e.g. L_1 = {9,1} vs L_2 = {10,1}). Therefore, following Luce [27], we assume that no such task is present in the choice set. As can be seen from (6) and (7), the utility U(L) of a given lottery is given by the sum of two terms, each representing the attractiveness of an outcome-probability pair, which cannot be decomposed into a simple product of utility and subjective probability, as in expected utility theories. Instead, probabilities and utilities combine and interact in a non-trivial way, with D quantifying the relative importance of value with respect to probability assigned by the DM. This becomes evident when taking the asymptotic limits D→0 and D→∞ of, e.g., P(L_1) (in the presence of another lottery L_2 offered as the second option); the explicit limiting expressions (Eqs (9) and (10)) follow from (6) and (7). In our framework, a decision maker characterized by D→0 (resp. D→∞) is influenced only by outcome values (resp. probabilities), while for finite D his decision derives from an entangled mixture of both.
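The strong stochastic transitivity implied by the Luce ratio form (5) can be checked mechanically. In the sketch below the effective utilities are drawn at random rather than computed from (6) and (7), so it verifies only the structural property of the ratio scale representation, not the model itself:

```python
import random

def luce_binary(u1, u2):
    """Luce-ratio binary choice probability: P_{1,2}(1) = U1/(U1 + U2)."""
    return u1 / (u1 + u2)

random.seed(3)
for _ in range(1000):
    u3, u2, u1 = sorted(random.uniform(0.1, 10.0) for _ in range(3))
    p12, p23, p13 = luce_binary(u1, u2), luce_binary(u2, u3), luce_binary(u1, u3)
    # U1 >= U2 >= U3 makes both premises hold; the SST conclusion follows.
    assert p12 >= 0.5 and p23 >= 0.5
    assert p13 >= max(p12, p23) - 1e-12
```

The check works because P ≥ 0.5 under the Luce form is equivalent to an ordering of the effective utilities, and the ratio P_{1,3} can only grow when the smaller utility in the denominator shrinks.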
Note that the limit for D→0 depends on the sign of the utilities: for example, if u(o_B) and u(o_D) are negative, the choice probability converges to a different limiting expression. This shows an intrinsic difference in perception between gains and losses, an asymmetry that we discuss further in sub-section "Probability-distorted effective utility".

Stage 2: Finite time SRDT
Rationale. Many empirical studies (see [58]) have shown that people do not always choose the best option, but the one that gives a fair trade-off between utility and "cost". A decision is in general a stressful operation, and humans have finite computational resources, so even when there is no explicit time constraint for making a choice, low-effort heuristics become attractive as soon as they provide satisfactory outcomes. Thus, the time dimension in decision-making cannot be neglected, as it is in static theories of decision-making (including RUM models). The next sub-section extends the previously presented model to account for finite-time deliberation.
Theoretical extension. Eq (5) provided the choice probabilities P(L_1) and P(L_2) for a binary choice between lotteries L_1 and L_2, assuming infinite available time to make a decision. We are now interested in calculating the choice probability, say P(L_1), conditioned on the decision occurring at some time t ≤ T, denoted by P(L_1|T). In other words, P(L_1|T) is the probability to be absorbed by one of the outcomes of L_1, given that the particle is absorbed somewhere before time T. This condition mimics either an explicit time limit (time pressure) or an implicit one, due to the accuracy-effort trade-off. Formally, for the binary choice representation in Fig 2, P(L_1|T) is given by

P(L_1|T) = [∫_0^T (J_{1_a}(a,t) + J_{1_b}(b,t)) dt] / [∫_0^T (J_{1_a}(a,t) + J_{1_b}(b,t) + J_{2_c}(c,t) + J_{2_d}(d,t)) dt],   (11)

where J_η(x,t) is the probability current on branch η at position x and time t. Given the structure of the problem, a closed-form expression of J_η(x,t) is hard to obtain. However, a very good approximation of the integrals in (11) is given by the Laplace transform J̃(x,s) of the probability current,

∫_0^T J(x,t) dt ≈ J̃(x, s = 1/T),   (12)

where s is the conjugate variable of time. Therefore, combining Eqs (11) and (12), P(L_1|T) is approximately given by (see S1 Appendix for the derivation)

P(L_1|T) ≈ [J̃_{1_a}(a,1/T) + J̃_{1_b}(b,1/T)] / [J̃_{1_a}(a,1/T) + J̃_{1_b}(b,1/T) + J̃_{2_c}(c,1/T) + J̃_{2_d}(d,1/T)].   (13)

It is easy to check that when there is no time constraint (T→∞), Eq (13) retrieves the usual asymptotic choice probabilities in (5).
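The flavour of the approximation (12) can be illustrated on the simplest case with a known current: the first-passage density of drifted Brownian motion to a single absorbing wall (the inverse Gaussian law), whose Laplace transform is available in closed form. The sketch below is our own toy example, not the four-branch problem, and it only checks the T→∞ consistency noted after Eq (13):

```python
import math

def fpt_density(t, a=1.0, mu=1.0, D=1.0):
    """First-passage-time density (inverse Gaussian law) of Brownian motion
    with drift mu and diffusion D, started at 0, to an absorbing wall at a."""
    return a / math.sqrt(2 * math.pi * D * t ** 3) * math.exp(-(a - mu * t) ** 2 / (2 * D * t))

def fpt_laplace(s, a=1.0, mu=1.0, D=1.0):
    """Closed-form Laplace transform of the density above, E[exp(-s*T_a)]."""
    return math.exp((a / D) * (mu - math.sqrt(mu * mu + 2 * D * s)))

def absorbed_mass(T, n=20000, **kw):
    """Midpoint-rule value of the time-integrated boundary current up to T,
    i.e. the probability of having been absorbed before time T."""
    dt = T / n
    return sum(fpt_density(dt * (i + 0.5), **kw) for i in range(n)) * dt

# Consistency at the unconstrained limit: as T -> infinity (s -> 0), both the
# integrated current and its Laplace transform tend to 1, mirroring the
# statement that Eq (13) retrieves Eq (5) when T -> infinity.
unconstrained = (absorbed_mass(60.0), fpt_laplace(1e-12))
```

At finite T the substitution s = 1/T is only an approximation of the truncated integral, which is exactly the spirit in which Eq (12) is used in the main derivation.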

Fourfold pattern of risk preferences
The fourfold pattern of risk preferences [43] is one prominent example of the inadequacy of Expected Utility to describe observed human behavior. It is experimentally observed that people are: i) risk-averse when gains have moderate probabilities or losses have small probabilities; ii) risk-seeking when losses have moderate probabilities or gains have small probabilities. In Table 1, we report an example of such behavior. Prospect theory, thanks to the interplay of the value function and probability weighting, is able to describe it.
In Fig 3, referring to the example in Eq (15), we show the predicted probability of choosing L_1 in task (A) (Fig 3A) and L_3 in task (B) (Fig 3B) as a function of p, for fixed diffusion coefficient D and different values of r. The fourfold pattern is correctly predicted: in Fig 3A, P(L_1(p)) ≥ 0.5 for small p (risk-seeking, possibility effect) and P(L_1(p)) ≤ 0.5 for large p (risk-averse, certainty effect). The situation is reversed in Fig 3B. Note that, although our model is structurally different from Expected Utility, a more concave utility function, i.e. a higher r, leads to greater risk-aversion in the gain domain. However, r is not an absolute indicator of risk-aversion as in EU, since the ultimate choice probabilities also depend on the values of D and T (when time constraints are considered). Interestingly, our model also predicts that greater risk-aversion for gains corresponds to greater risk-loving for losses, suggesting a positive correlation (some evidence for such a correlation is reported in [59]). Closer inspection of Fig 3A reveals something surprising: the choice probability P(L_1(p)) does not go to 0.5 as p→0, although one would expect it to, since L_1 and L_2 in Eq (15) become identical in this limit. This is due to the fact that the contribution to the choice probability from outcome 100, Ũ_p(100), does not go to 0 as p goes to 0, so there is still a probability to be absorbed along that branch. Conversely, when p is exactly 0, as in L_2, there is no branch corresponding to such an outcome. As the next subsection will explain, this amounts to an infinite overweighting of small probabilities. Technically, this problem is known as a singular perturbation limit [60], where, informally, "the solutions of the problem at a limiting value of the parameter are different in character from the limit of the solutions of the general problem" [61].
In this case, the singular perturbation is characterised by the inequality

lim_{p→0} Ũ_p(100) > 0,

i.e. the contribution of the vanishing-probability branch does not itself vanish. Such a singularity is removed once we include finite time constraints, i.e. by imposing that the decision cannot take an infinite time. Indeed, replacing the effective utilities in Eq (6) with the time-constrained ones in Eq (14), the contribution to the choice probability of the probability-p outcome satisfies the following limit

lim_{p→0} Ũ_p(100|T) = 0 for any finite T.   (18)

Eq (18) means that, when the probability of an outcome goes to 0, the corresponding probability to be absorbed along that branch also goes to 0, no longer contributing to the choice probability of the related lottery. Let us stress that we do not impose a "small" value of T to get rid of the singularity; T can be arbitrarily large, but finite.
The probability of choosing L_1(p) in Eq (15), given that the decision occurs before T < ∞, reads

P(L_1(p)|T) = [Ũ_p(100|T) + Ũ_{1−p}(0|T)] / [Ũ_p(100|T) + Ũ_{1−p}(0|T) + Ũ_1(100p|T)].   (19)

Let us now investigate the role of the other parameters, the diffusion coefficient D and the time constraint T. As said in Section "Explicit expressions for the decision probabilities", D is a kind of "utility numeraire", determining the relative impact of the outcome values compared with the probabilities in the value formation process. The role of T is more subtle: as we will see in the next subsection, a smaller T implies more underweighting (resp. overweighting) of small (resp. high) outcome probabilities. Figs 5 and 6 show the choice probabilities in Eq (16) for different values of D and T, respectively. On the gain side (Fig 5A), as D decreases, the strength of preferences increases and the preference reversal point between the risky and the safe lottery shifts to the right (risk-seeking for a wider range of p's). On the loss side (Fig 5B), decreasing D shifts the curve upward and leftward, implying stronger risk-seeking preferences for a wider range of p's.
Focusing now on Fig 6, we see that, through the underweighting (resp. overweighting) of small (resp. high) probabilities, a smaller T destroys both the possibility effect on the gain side (no risk-seeking behavior for low p) and the certainty effect on the loss side (no risk-seeking behavior for high p). In general, a smaller T implies greater risk-aversion, as we will discuss in subsection "Preference reversal with time pressure". As stated at the beginning of the Section, in prospect theory the fourfold pattern of risk preferences is usually explained in terms of the combined effects of probability weighting and a convex-concave value function. The next three subsections show how, although in our model these two constructs are not separable, it is still possible to identify them as consequences rather than postulates, offering additional intuition on why our theory can explain such patterns. Specifically, the study of how value and time constraints affect probability perception is discussed in subsections "Subjective probability without time-constraints" and "Realistic inverse double-S-shaped probability weighting function". Conversely, the effect of probability on value perception is analyzed in subsection "Probability-distorted effective utility".
Subjective probability without time-constraints. Eq (5), together with (6) and (7), shows the resulting form of the decision probabilities without time-constraints for the binary risky choice (1). From here, we focus on the predicted probability perception of the decision-maker (DM), say, of outcome o_A of lottery L₁. A convenient way to extract this information is to look at the probability of absorption along branch 1_a, conditional on being absorbed along any branch pertaining to L₁:

π(p) = Ũ_p(o_A) / [Ũ_p(o_A) + Ũ_{1−p}(o_B)]    (20)

π(p) can be interpreted as the amount of attention devoted to outcome o_A when the DM is looking at lottery L₁. Several authors [62,63] have established connections between subjective probability and similar psychological notions. Indeed, the fact that π(p) defined in (20) represents a meaningful measure of subjective probability is supported by its asymptotic limits as a function of D. For D → ∞, outcome values (potential energies) become negligible compared with the stochastic component and the probability perception is unaltered, so that the subjective probability is equal to the objective one. In contrast, for D → 0, the decision maker does not pay attention to the probabilities and focuses solely on the payoffs, interpreting their likelihood only as a function of their magnitude. As for Eq (9), a simple interpretation of the D → 0 limit is possible only when both utilities are positive. For u(o_A) and u(o_B) negative, the expression becomes that of Eq (22), which implies that, when negative utilities are involved, the decision maker takes into account the event probabilities even in the D → 0 limit.
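The two asymptotic limits can be illustrated numerically. The functional form Ũ_p(o) = p·exp(u(o)/D) used below is not the model's effective utility of Eqs (5)-(7); it is a hypothetical stand-in, chosen only because it exhibits the same two asymptotics: π(p) → p as D → ∞, and payoff-dominated perception as D → 0.

```python
import math

def pi(p, u_a, u_b, D):
    """Toy subjective probability with the Luce-ratio structure of Eq (20),
    using the ILLUSTRATIVE stand-in U(p, o) = p * exp(u(o) / D), which is
    NOT the paper's effective utility."""
    w_a = p * math.exp(u_a / D)
    w_b = (1.0 - p) * math.exp(u_b / D)
    return w_a / (w_a + w_b)

# Large D: values are negligible, perception unaltered, pi(p) ~ p.
# Small D with u_a > u_b > 0: attention shifts to the larger payoff.
p_large_D = pi(0.3, 5.0, 1.0, 1e6)  # close to the objective 0.3
p_small_D = pi(0.3, 5.0, 1.0, 0.5)  # close to 1: payoff-dominated
```

Only the limiting behavior is meaningful here; the intermediate-D value distortion (inverse S-shape) requires the model's actual effective utility.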
For finite non-zero D, an interesting value-distortion of probability perception arises: Fig 7 shows π(p) vs p for different values of u(o_B)/u(o_A) and D. Our theory thus derives the empirical inverse S-shape of the subjective probability as a function of the objective probability, used for instance in standard prospect theory by Tversky and Kahneman [11], indicating that human beings tend to overestimate rare events and underestimate high-probability events. More specifically, π(p) ≥ p (resp. π(p) < p) for p ≤ p* (resp. p > p*), where p* is the inflection point of π(p). Our theory predicts that the asymmetry in the distortion of π(p) for p → 0 and p → 1 is controlled by u(o_A)/u(o_B): the larger this ratio is, the larger is the subjective distortion for small p's compared with large p's.
There is empirical evidence that changing lottery payoffs changes the inflection point. In [64], for each individual, the authors elicit two probability weighting functions, π⁻_S(p) and π⁻_L(p), for gambles involving small and large losses, respectively. The idea is that, when considering lotteries like L = {−1000€, 0.1; −10€, 0.9} and a second lottery L′ with outcomes of different magnitude, the probability 0.1 may not be weighted in the same way, because of the different magnitude of the consequences and because of the "distance" between the lottery outcomes. Note that rank-dependent models (e.g. CPT) predict that π(0.1) is the same in both lotteries,

since L and L′ are comonotonic. On average, the authors find that small probabilities (up to about 0.33 for small losses and about 0.5 for large losses) are overweighted (indicating pessimism). The usual inverse-S shape thus holds over both small and large losses, but the inflection point shifts to the right for large losses.
While deriving or recovering the empirical inverse S-shape, our formulation of the subjective probability is fundamentally different from those used in existing decision theories, such as the Prelec II weighting function [65], parametrized to account for some assumed probability distortion, which is supposed to be intrinsic to the DM and can be determined by calibration from the results of a number of standard tests and questions presented to the DM [66]. These subjective probabilities are considered independent of the values of the outcomes to which the probabilities are associated. We have previously argued, and also referred to empirical evidence, that there is no such thing as an outcome without value. Even a question as remote from everyday life as, say, the probability of life on Mars carries, depending on the DM, religious, scientific, and cultural values, and possibly more. In our framework, the subjective probability (20) is influenced by the outcome values and represents the contribution of each outcome in a lottery to the choice of that lottery by the DM. Thus, our theory suggests that it is ill-conceived to attempt characterizing the subjective probabilities of a DM independently of the outcome values. Our approach allows us to formulate a general hypothesis, namely that subjective probabilities are value-dependent, which deserves empirical investigation. In existing decision theories, the subjective probabilities are multiplied by the utilities of the associated events to form a measure of worth, and the choice probability "layer" is then added on top. In our theory, subjective probabilities are instead encapsulated into decision probabilities, the former determining the latter. Our model can thus be viewed as a natural generalisation beyond the standard factorisation of probabilities and values to form value preferences.
At this stage, the definition of subjective probability as a relative absorption probability (Eq (20)) may seem somewhat counterintuitive, notwithstanding the fact that it correctly retrieves the objective event probability in the D → ∞ limit. We would like to stress that our mathematical formulation of the subjective probability is fundamentally different from that in expected utility theories. In expected utility, as described by Savage [67], the assumption of separation between preferences and beliefs is crucial for the elicitation of subjective probability. However, as stated in the introduction, the simultaneous estimation of utility and subjective probability is subject to the joint hypothesis-testing problem [20], and many methods have been devised to circumvent such issues [68,69]. Here, in contrast, the subjective estimation of the likelihood of an event depends on the associated magnitude. Consequently, in our model, the subjective probability is actually implied by the utility function, and thus two separate functions cannot really be identified. Our definition of subjective probability should be treated as a way to extract how the choice probability depends on the outcome probabilities, and to get an intuition of why our model is able to explain the fourfold pattern of risk preferences. Concretely, one would just need to estimate the utility function (together with the parameters D and T), and the corresponding "belief function" comes as a result. The next subsection presents an analysis of the subjective probability when time or "energy" constraints are considered.
"Realistic" inverse double-S-shaped probability weighting function. Al-Nowaihi and Dhami [48] report that a theory of choice should be able to describe the following two stylized facts: i) overweighting low probability events and underweighting high probability ones; ii) neglecting extremely low probability events and considering as certain extremely probable events. The first fact is essentially captured by an inverse S-shaped probability weighting function, as derived in Eq (20). The second one is referred by Kahneman and Tversky [43] as an editing phase: "the simplification of prospects can lead the individual to discard events of extremely low probability and to treat events of extremely high probability as if they were certain". Clearly, this resonates with the idea that the DM has limited computational resources and, even when there is no explicit time limit, the processing cost acts as such.
To account for both patterns i) and ii), Al-Nowaihi and Dhami axiomatically construct a composite probability weighting function, shown in Fig 8, obtained by the concatenation of three different Prelec functions, for a total of 6 parameters (see Eq 6.2 in [48]). A DM with such a probability function underweights (ignores) very low-probability events, p ∈ [0, p₁], and overweights (considers as certain) extremely probable events, p ∈ [p₃, 1], reflecting stylized fact ii). Within the middle range p ∈ [p₁, p₃], the function has an inverse-S shape, addressing stylized fact i).
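A minimal sketch of such a composite construction is given below, using genuine Prelec functions w(p) = exp(−β(−ln p)^α) but with illustrative piece boundaries and parameters, not those of Eq 6.2 in [48], and without the continuity matching of the axiomatic construction.

```python
import math

def prelec(p, alpha, beta=1.0):
    """Prelec weighting function w(p) = exp(-beta * (-ln p)**alpha).
    alpha < 1 gives an inverse-S shape, alpha > 1 an S shape."""
    if p <= 0.0:
        return 0.0
    if p >= 1.0:
        return 1.0
    return math.exp(-beta * (-math.log(p)) ** alpha)

def composite_weight(p, p1=0.02, p3=0.98):
    """Piecewise composite in the spirit of Al-Nowaihi and Dhami [48];
    boundaries p1, p3 and piece parameters are illustrative only."""
    if p <= p1 or p >= p3:
        # Steep piece (alpha > 1): neglects very rare events and treats
        # near-certain events as certain (stylized fact ii).
        return prelec(p, alpha=2.0)
    # Inverse-S piece (alpha < 1): overweights low and underweights high
    # probabilities in the middle range (stylized fact i).
    return prelec(p, alpha=0.5)
```

Plotting composite_weight over [0, 1] reproduces the qualitative double-S pattern of Fig 8: near-zero weight below p₁, near-one weight above p₃, and an inverse-S in between.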
Although the proposed probability weighting function addresses the previously mentioned stylized facts, it has six parameters and may seem ad hoc and artificial. Our framework, on the other hand, predicts the desired shape, resulting from the superposition of two effects: finite-time deliberation and value distortion. Indeed, referring to the previously derived value-distorted subjective probability (20) for outcome o_A of lottery L₁ in (1), the time-dependent generalization π(p|T) is (approximately) obtained by replacing the effective utilities with their time-constrained counterparts Ũ_p(o|T) given in Eq (14). In Fig 9, we plot π(p|T) for different values of T, fixing u(o_A) = u(o_B) = D = 10 for illustrative purposes. We can see how the value distortion and the finite-time deliberation play opposite roles: for high values of T, one observes an inverse S-shape, due to the influence of value on probability perception (as in Fig 7). For low values of T, the influence of time pressure becomes dominant, resulting in an S-shaped probability weighting. For intermediate values of T (in the example T = 0.3, black star-dotted line), the superposition of these two "forces" results in an inverse double-S-shaped probability weighting, similar to the one in Fig 8. In summary, our framework predicts the probability weighting function postulated in [48] with only two parameters, D and T, and offers a more microscopic explanation of this observed behavior, in terms of a competition between value distortion and finiteness of computational resources. Let us stress that the time constraint in our model is not necessarily meant as an external time pressure; it can also be conceived as an internal time pressure, arising from energy constraints and the efficiency-accuracy tradeoff. Therefore, at this stage, we are not claiming that an explicit time pressure is needed to recover an inverse double-S curve.
"Probability-distorted" effective utility. In the previous sub-section, we have studied how the outcome values alter the probability perception of the DM. Here, we study the effect of probability on value perception. Eq (5) has introduced the effective utilitiesŨ p ð:Þ, which are transformed from the utilities u(.) via a non-trivial nonlinear operation involving the outcome probabilities. This corresponds to the dual of the value-distorted probability π(p) given in expression (20) in the form of a value perceptionŨ p ð:Þ influenced by probability. Fig 10 shows the transformed utility functionŨ p ð:Þ as a function of the original one u(.) for different values of D p ≔ pD 2 . The interaction between probabilities and values transforms an initially risk-averse (concave) utility function u(.) into a convex risk-seeking utility on a sub-interval of the loss domain, predicting the existence of a reference point to discriminate between behavior toward  gains and behavior toward losses, as postulated in prospect theory. We stress here that this comes as another prediction of the theory without any parameter adjustment or added ingredients. In particular, it is not a phenomenological assumption put in the theory, for instance as in Prospect Theory.
To illustrate this effect quantitatively, let us consider again the utility function u(o) = (1 − e^{−ro})/r for o ∈ ℝ. The corresponding transformed utility function Ũ_p(o) is given by expression (25). Together, Eqs (20) and (25) predict risk-taking behavior on the loss side even when starting with a utility function that is everywhere concave, in agreement with the outlined predictions on the fourfold pattern of risk preferences [43].
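For reference, the CARA utility u(o) = (1 − e^{−ro})/r used above is concave everywhere for r > 0, as a quick numerical check of second differences confirms; the convexity on part of the loss domain arises only through the transformation of Eq (25), which is not reproduced here.

```python
import math

def u(o, r=1.0):
    """CARA utility u(o) = (1 - exp(-r*o)) / r, concave for r > 0."""
    return (1.0 - math.exp(-r * o)) / r

def second_difference(o, r=1.0, h=1e-3):
    """Discrete second derivative u(o+h) - 2u(o) + u(o-h):
    negative everywhere iff u is concave."""
    return u(o + h, r) - 2.0 * u(o, r) + u(o - h, r)

# u(0) = 0, and the second difference is negative on both the gain and
# the loss side: diminishing marginal utility throughout.
```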
Decision tasks like those in Eq (15) are classic examples to which the weak risk-aversion relation, denoted by R_w, can be applied, meaning that L₂ is riskier than L₁. More general relations have been suggested to formalize risk-aversion [70], such as the so-called strong risk-aversion (or second-order stochastic dominance) relation R_s. Within expected utility, these two definitions of risk-aversion coincide [70,71], but, in general, when departing from the expectation structure, the two relations differ [72] and need to be studied separately. An example of strong risk-aversion is provided by the lotteries of Eq (27): according to (27), L₅ is riskier than L₆, and our framework, using for example r = 1 as before, predicts the correct pattern for the majority of D's. In summary, without added assumptions, our theory predicts what has been postulated, for instance, by prospect theory: a concave part of the value function for gains and a convex part for losses. These properties derive naturally from the stochastic representation of probabilities in the presence of values.
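The strong risk-aversion relation R_s can be checked mechanically for discrete lotteries. The sketch below uses two hypothetical mean-preserving-spread lotteries standing in for L₅ and L₆ of Eq (27), whose exact payoffs are given in that equation.

```python
def ssd_dominates(la, lb):
    """True if lottery `la` second-order stochastically dominates `lb`,
    i.e. the running integral of F_lb - F_la is non-negative everywhere.
    Lotteries are lists of (outcome, probability) pairs."""
    def cdf(lot, x):
        return sum(p for o, p in lot if o <= x)
    points = sorted({o for o, _ in la} | {o for o, _ in lb})
    integral, prev = 0.0, points[0]
    for x in points[1:]:
        # The CDFs are piecewise constant, so the integral of their
        # difference is exact on each inter-outcome segment.
        integral += (cdf(lb, prev) - cdf(la, prev)) * (x - prev)
        if integral < -1e-12:
            return False
        prev = x
    return True

# Hypothetical example: same mean, `risky` is a mean-preserving spread.
safe  = [(0.0, 0.5), (10.0, 0.5)]   # mean 5, smaller spread
risky = [(-5.0, 0.5), (15.0, 0.5)]  # mean 5, larger spread
```

Here ssd_dominates(safe, risky) holds while the converse fails, which is exactly the sense in which the spread lottery is "riskier" under R_s.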
However, the way in which the transformed utility presented above determines choice preferences differs from usual decision models: in expected utility, the concavity of the utility function implies risk aversion through Jensen's inequality [73]. Here, due to the non-linear form of Ũ_p(o), it is in general not easy to derive analogous simple constraints on the model parameters.
More generally, we stick with the notion of an (initially concave) utility for the following reason: utility is a well-defined concept in choice under certainty [74], where diminishing marginal utility indicates smaller and smaller increases in "happiness" as wealth increases. In expected utility, the concepts of diminishing marginal utility and risk-aversion are fundamentally entangled: there cannot be one without the other. On the other hand, in generalized theories like rank-dependent utility theory, as shown in [75], it is possible for a decision-maker to be risk-seeking with a concave utility function, provided the probability weighting is sufficiently "optimistic". Therefore, the concepts of diminishing marginal utility and risk-aversion are decoupled to some extent. Analogously, our model's hypothesis is that the utility function, in the context of choice under certainty, has some given form (e.g. the CARA function used in the manuscript), expressing (or not) diminishing marginal utility. Then, as soon as there is some uncertainty, due to the interaction between probability and value, the utility becomes "distorted" and assumes a form like the one in Eq (25), allowing the DM to exhibit risk-seeking behavior for losses.

Stochastic dominance
First-order stochastic dominance [76] is a property that decision theorists are usually not willing to give up, as it essentially encodes the reasonable behaviour that "more is better". A random variable (gamble) L₁ has first-order stochastic dominance over gamble L₂ if P(L₁ ≥ o) ≥ P(L₂ ≥ o) for all o, and P(L₁ ≥ o) > P(L₂ ≥ o) for some o, where {o} is the set of possible outcomes. However, people often violate it when presented with choices like that of Eq (30): even though L₁ stochastically dominates L₂, most people choose L₂. Popular decision models like rank-dependent utility theory [10] and cumulative prospect theory [11] cannot account for this pattern. Within our framework, this is explained when the DM exhibits relatively low values of D, such that the decision is "value-oriented" and the DM does not pay sufficient attention to the probabilities. For this particular gamble, assuming a linear utility function, P(L₂) ≈ 0.65 for small values of D, quite close to the fraction of 70% of people choosing L₂ experimentally found by Birnbaum and Navarrete [77]. Note that our model does not predict any violation when the dominance is "evident", as in the following examples:

(A) L₁ = {1€, 0.5; 3€, 0.5} vs L₂ = {1€, 0.5; 2€, 0.5} → P(L₁) ≥ P(L₂) ∀D
(B) L₃ = {11€, 0.5; 12€, 0.5} vs L₄ = {10€, 1; 0€, 0} → P(L₃) ≥ P(L₄) ∀D    (31)

It is clear that L₁ dominates L₂ in (A) and L₃ dominates L₄ in (B), and people choose accordingly. Fig 12 shows the predicted probability of choosing L₁ in task (A) (Eq (31)) for different values of D and T. As expected, for small values of the diffusion coefficient D, the choice becomes more deterministic. The (explicit or implicit) time constraint T plays a similar role.
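The definition of first-order stochastic dominance above translates directly into a check over the survival functions, which confirms that the dominance in the two tasks of Eq (31) is indeed "evident":

```python
def fsd_dominates(la, lb):
    """True if `la` first-order stochastically dominates `lb`:
    P(la >= o) >= P(lb >= o) for all outcomes o, strictly for some o.
    Lotteries are lists of (outcome, probability) pairs."""
    def tail(lot, x):  # survival function P(L >= x)
        return sum(p for o, p in lot if o >= x)
    strict = False
    for x in sorted({o for o, _ in la} | {o for o, _ in lb}):
        ta, tb = tail(la, x), tail(lb, x)
        if ta < tb - 1e-12:
            return False
        if ta > tb + 1e-12:
            strict = True
    return strict

# The lotteries of Eq (31):
l1 = [(1.0, 0.5), (3.0, 0.5)]    # task (A)
l2 = [(1.0, 0.5), (2.0, 0.5)]
l3 = [(11.0, 0.5), (12.0, 0.5)]  # task (B)
l4 = [(10.0, 1.0)]
```

fsd_dominates(l1, l2) and fsd_dominates(l3, l4) both hold, matching the dominance relations stated in the text.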
Several descriptive theories [78,79] that allow violations in cases like (30) predicted unreasonable behaviour in tasks like (31), essentially because two outcomes with the same objective probability were forced to have the same subjective one [80]. Our framework, thanks to the non-separable form of the lotteries' attractiveness, avoids this problem, confirming its significant predictive power.

Predictions from finite-time SRDT
This Section presents further predictions of our theory when generalized to account for finite decision time.
Inverse relation between choice probability and response time. Several studies (e.g. [45]) report an inverse relation between the probability of choosing an option and the (mean) decision time to choose that option (see Fig 13). Intuitively, the more "difficult" the choice (e.g. two lotteries with similar expected values), the more time it takes to decide, and the choice probability will be around .5, because the optimal decision is not obvious. By construction, our theory predicts this phenomenon.

Preference reversal with time pressure. In [46], the main result is that the fraction of subjects choosing low-risk gambles increased from below .50 (low time pressure) to above .50 (high time pressure). Therefore, subjects essentially became more risk-averse as the time available to make a decision decreased. Our framework is able to predict this pattern. Consider a choice between two lotteries with E[L₁] < E[L₂] but Var(L₁) < Var(L₂), so that L₂ gives on average a greater payoff but is riskier. Assume for simplicity a linear utility function u(x) = x. In Fig 14 we can see clearly that P(L₁|T) goes from below .5 to above .5 as time pressure increases (i.e. as the time available decreases), reflecting the tendency found in [46] of increasing risk-aversion as a function of time pressure. Note that, while decision field theory needs to assume an asymmetric starting point for the random walk in order to capture a preference reversal, our theory predicts this pattern without additional adjusted parameters.
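The inverse relation between choice probability and response time can be illustrated with the simplest instance of the deliberation processes considered in this paper, a biased random walk between two absorbing boundaries. The drift, boundary, and trial numbers below are generic illustrative parameters, not fitted to the model's equations.

```python
import random

def simulate(mu, b=10.0, sigma=1.0, n_trials=2000, seed=7):
    """Random-walk deliberation sketch: start at 0, drift mu per step,
    Gaussian noise sigma; hitting +b chooses L1, hitting -b chooses L2.
    Returns (P(choose L1), mean number of steps to decide)."""
    rng = random.Random(seed)
    n_l1, steps = 0, 0
    for _ in range(n_trials):
        x, t = 0.0, 0
        while abs(x) < b:
            x += mu + rng.gauss(0.0, sigma)
            t += 1
        n_l1 += x >= b
        steps += t
    return n_l1 / n_trials, steps / n_trials

p_easy, rt_easy = simulate(mu=0.5)   # one option clearly preferred
p_hard, rt_hard = simulate(mu=0.05)  # near-indifference
```

The "easy" condition yields a choice probability far from .5 and short deliberation times, while the "hard" condition yields a probability closer to .5 and much longer times: the inverse relation reported in [45].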

Discussion
We have presented a simple and efficient "stochastic representation" framework that describes the human decision-making process as inherently probabilistic. It is based on a representation of the deliberation process leading to a choice through stochastic processes, the simplest of which is a random walk. Differently from random utility theory (where external noise is added to the rational utility and the probability representation amounts to a calibration procedure), our stochastic representation framework relies on a plausible description of the (assumed) intrinsic stochasticity of the human choice process. Our proposed approach does not disentangle probability and value as in expected utility theories; rather, it allows them to interact in a non-trivial way. Despite its simplicity, the model provides straightforward derivations, at a more microscopic level, of several known structures that have been documented empirically in human decision theory. Our theory also provides a number of novel predictions.
Here, only structural properties have been presented, through simple examples, which are not sufficient to falsify the theory. At this stage, the parsimony of its formulation and the wealth of obtained properties, which are in qualitative or semi-quantitative agreement with empirical observations, make our theory worth exploring further. We plan to use more sophisticated procedures to test our model against the major decision theories, based on cross-validation methods: parameters are first estimated from one part of an experiment, and then these same parameters are applied to a separate part of the experiment and the predictions are evaluated. Note that we cannot use the usual Wilks likelihood-ratio test [84], because in general the models will not be nested, but other methods are available, such as the Vuong test [85] and information criteria (AIC [86] and BIC [87]).
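The information-criterion comparison mentioned above is straightforward to compute once each model has been fitted; a minimal sketch, with made-up log-likelihood values standing in for the results of actual fits to choice data:

```python
import math

def aic(log_lik, k):
    """Akaike information criterion: 2k - 2 ln L-hat (lower is better)."""
    return 2 * k - 2 * log_lik

def bic(log_lik, k, n):
    """Bayesian information criterion: k ln n - 2 ln L-hat (lower is better)."""
    return k * math.log(n) - 2 * log_lik

# Hypothetical fits on n = 500 binary choices: a 2-parameter model (D, T)
# vs a 6-parameter weighting-function model.  Log-likelihoods are placeholders.
n = 500
aic_small, aic_large = aic(-310.0, 2), aic(-305.0, 6)
bic_small, bic_large = bic(-310.0, 2, n), bic(-305.0, 6, n)
```

Both criteria penalize the extra parameters of the larger model, with BIC penalizing more heavily for moderate-to-large n; which model wins depends on the actual likelihood gap obtained from the data.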
The current formulation is not meant to be "the" definitive framework (if such a thing exists), since, as already mentioned, it presents some limitations, such as those deriving from the Luce choice axiom. Rather, it is a baseline upon which to construct more elaborate models, keeping in mind the tradeoff between parsimony and explanatory power.
In general, we are aware that testing alternative ways of value formation is very difficult, because of the measurement problem in economics [88]. Indeed, we cannot really measure the "degree of happiness" of the decision-maker; we have to infer it, adopting one particular model, through her choices. This adds a layer of complexity with respect to other hard sciences, such as physics or chemistry. On the other hand, contributions of this type may help to devise more effective ways to elicit preferences, deepening our understanding of decision processes. In addition, further theoretical and empirical work may lead to modifications of the presented theory, in which the expected utility hypothesis (separability of probability and value) can be seen as a particular case of a more complex structure, where probability and value do interact to some extent in the decision maker's mind.