Scientific discovery in a model-centric framework: Reproducibility, innovation, and epistemic diversity

Consistent confirmations obtained independently of each other lend credibility to a scientific result. We refer to results satisfying this consistency as reproducible and assume that reproducibility is a desirable property of scientific discovery. Yet science seemingly also progresses despite irreproducible results, indicating that the relationship between reproducibility and other desirable properties of scientific discovery is not well understood. These properties include early discovery of truth, persistence on truth once it is discovered, and time spent on truth in a long-term scientific inquiry. We build a mathematical model of scientific discovery that presents a viable framework to study its desirable properties, including reproducibility. In this framework, we assume that scientists adopt a model-centric approach to discover the true model generating data in a stochastic process of scientific discovery. We analyze the properties of this process using Markov chain theory, Monte Carlo methods, and agent-based modeling. We show that the scientific process may not converge to truth even if scientific results are reproducible and that irreproducible results do not necessarily imply untrue results. The proportion of different research strategies represented in the scientific population, scientists' choice of methodology, the complexity of truth, and the strength of signal contribute to this counter-intuitive finding. Important insights include that innovative research speeds up the discovery of scientific truth by facilitating the exploration of model space, and that epistemic diversity optimizes across desirable properties of scientific discovery.


Introduction
If obtained independently of each other, multiple confirmations of a claim in a scientific inquiry lend credibility to that claim [1,2]. We refer to this notion of multiple confirmations as reproducibility of scientific results. Ioannidis [3] argued that a research claim is more likely to be false than true, partially due to the prevalence of null hypothesis significance testing as a method of choice in statistical inference and the focus on statistical significance. Recently, attention has been placed on aspects of scientific practice that contribute to a lack of reproducibility. For example, McElreath and Smaldino [4] modeled population dynamics of scientific discovery and investigated how the evidential value of replication studies changed based on the levels of base rate of true hypotheses, statistical power, and false positive rate. They focused on a population of scientists testing a variety of hypotheses who have access to a tally of positive and negative published findings. This tally is assumed to be informative regarding the truth value of a given hypothesis. Relatedly, Smaldino and McElreath [5] provided an evolutionary model of science to study how incentives drive scientific progress, and reported that current incentive structures lead to the degradation of scientific practices. Higginson and Munafò [6] studied incentive structures that favor exploratory research over confirmatory research and concluded that current incentive structures are detrimental to scientific progress. Lastly, Nissen [7] proposed a model focusing on a single scientific claim at a time to explore how publication bias contributes to the process of incorrect findings transitioning from claim to fact.
These recent examples study dynamics of scientific discovery and reproducibility by adopting a hypothesis-centric approach to statistical inference. Limitations of null hypothesis significance testing [8], however, make understanding salient properties of the scientific process challenging, especially for fields that progress by building, comparing, selecting, and re-building models.
Model-centric approaches to statistical inference, where the goal is to select a useful model that approximates the true model (or system) generating the data, have long been a cornerstone in many scientific disciplines. The appeal of a model-centric approach to statistical inference over a hypothesis-centric approach is its generality as an inferential framework. A hypothesis-centric approach can be formulated as a proper subset of a model-centric approach where a true null hypothesis about a parameter specifies a reduced model in a hierarchical modeling framework. Further, a model-centric approach allows us to study the process of scientific discovery under uncertainty while bypassing the complications inherited from the hypothesis testing framework (see [3,8]).
Using notions of probabilistic confirmation, idealized experiments, and model selection, we build a model-centric meta-scientific framework to study scientific discovery and reproducibility of scientific claims. We set up a temporal stochastic process representing scientific discovery, in which a community of scientists explore a space of models and work toward finding the true model generating the data, with a consensus model agreed upon at a given time. Inspired by the literature on epistemic landscapes and the cognitive division of labor [9][10][11], we define four scientist types with distinct research strategies. The first scientist is the theory tester who aims to refine the current consensus model. The second scientist is the maverick who seeks novelty and disregards the current consensus model. The third scientist is the boundary tester who aims to identify the boundaries of the current consensus model. The fourth scientist is the replicator who repeats the latest study conducted in the scientific community.
Using probability calculus and forward-in-time simulations via an agent-based model, we study the properties of the process of scientific discovery, such as the time the scientific community spends at each model, and specifically the true model, the mean first time to hit the true model and staying with the true model, and the rate of reproducibility given a true model. We investigate these properties as a function of diversity of research strategies in scientific populations, the complexity of the true model generating the data, the ratio of error variance to deterministic expectation in the true model, and the model selection statistic.
Our work contributes to meta-research [12] by introducing a theoretical framework in which we can observe the effects of parameters of the process of scientific discovery on a variety of indicators of scientific performance. For example, we take high rate of reproducibility and convergence to truth as two such indicators. We show that a scientific process that maximizes rate of reproducibility does not necessarily facilitate scientific discovery of truth. This fact is explained by differential effects of process parameters such as research strategies on these indicators. Moreover, we provide insight into the workings of scientific processes when unencumbered by issues surrounding null hypothesis significance testing, such as publication bias and p-hacking. We show that even in the absence of questionable research practices and incentive structures, true research claims may not be reproducible or scientific processes may not converge to truth even if they are reproducible.

Methods and materials

A model-centric meta-scientific framework
We adopt the following notion of idealized experiment as the basis for scientific inquiry.
Given some background knowledge K on a natural phenomenon, a scientist makes a prediction M, which is in principle testable using observables, the data D. A mechanism generating D under uncertainty is formulated as a family of probability models M containing M, parameterized by θ ∈ Θ. The extent to which M is confirmed by D is assessed by a fixed and known method S evaluated at D. We let ξ := (M, θ, D, S, K) denote an idealized experiment.
In ξ, D confirms M if P(M|D, K) > P(M|K), where P(M|D, K) and P(M|K) are probabilities of M after and before observing the data, respectively. By Bayes' Theorem, P(M|D, K)/P(M|K) is proportional to the likelihood P(D|M, K). Large P(D|M, K) implies a high degree of confirmation of M. Complex models, however, have a tendency to inflate P(D|M, K) and hence P(M|D, K). As a remedy against overfitting, methods of model selection S traditionally penalize model complexity to prevent selecting unduly complex models. If M_1 and M_2 are two models competing to fit D in a scientific inquiry, scores provided by well-known information-theoretic approximations to compare and select models [13][14][15] imply that if S(M_1) < S(M_2), then M_1 is confirmed to a higher degree relative to M_2.
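As a concrete illustration of score-based confirmation, the following is a minimal sketch, not the paper's implementation: it fits competing linear models by ordinary least squares and scores them with a Gaussian goodness-of-fit term plus the penalty C given in S2 Appendix (2p for AIC, 2p log(n) for SC). The data-generating setup and all names are ours.

```python
import numpy as np

def selection_score(y, X, criterion="AIC"):
    """Penalized goodness-of-fit score S(M) for a linear model
    y = X @ beta + eps; a lower score means a higher degree of
    confirmation."""
    n, p = X.shape
    beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = np.sum((y - X @ beta_hat) ** 2)
    fit = n * np.log(rss)                 # n * log(residual sum of squares)
    C = 2 * p if criterion == "AIC" else 2 * p * np.log(n)  # penalty (S2 Appendix)
    return fit + C

rng = np.random.default_rng(1)
n = 100
x1, x2 = rng.uniform(size=(2, n))
y = 1.0 + 2.0 * x1 + rng.normal(scale=0.5, size=n)  # toy truth involves x1 only

M1 = np.column_stack([np.ones(n), x1])              # simpler candidate model
M2 = np.column_stack([np.ones(n), x1, x2])          # adds an irrelevant predictor
s1, s2 = selection_score(y, M1), selection_score(y, M2)
# s1 < s2 would mean M1 is confirmed to a higher degree than M2
```

Because both criteria share the fit term, for a fixed model the SC score always exceeds the AIC score once n > e, which is why SC discourages complex models more strongly.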
We let ξ_1 := (M_P, θ, D_1, S, K_1) and ξ_2 := (M_P, θ, D_2, S, K_2), subscripts denoting temporal order, whose shared prediction M_P competes against a third model M_G. A replication experiment consists of ξ_1 and ξ_2, conditional on M_G and on performing a replication at ξ_2. We say that ξ_2 in a replication experiment reproduces the result of ξ_1 if two conditions are satisfied. First, either S(M_P) > S(M_G) or S(M_P) < S(M_G) is true in both ξ_1 and ξ_2. Second, sufficient information about ξ_1 is available to ξ_2 through K_2 to assess whether the first condition holds. Else, we say that ξ_2 fails to reproduce the results of ξ_1. A replication experiment formalizes necessary open science practices through this second condition, in the sense of providing sufficient information about ξ_1 to ξ_2 in order to assess the reproducibility of results of ξ_1 by ξ_2.
A stochastic process of scientific discovery with no replication

Based on the concept of idealized experiment, we now build a temporal stochastic process of scientific discovery. First we exclude replication experiments and then include them to show the novelty they bring to the stochastic process. We consider an infinite population of scientists conducting a sequence of idealized experiments ξ^(t) := (M_P^(t), θ, D^(t), S, K^(t)), indexed by time t = 1, 2, …. We assume that M_P^(t) belongs to a set of probability models M = {M_1, M_2, …, M_L} known to all scientists.
Further, we assume that there are A distinct scientist types in the population, each with a well-defined research strategy R ∈ R = {R_1, R_2, …, R_A} of proposing a model in their experiment. These strategies depend only on the current global model M_G^(t−1). At each time step, ξ^(t) is performed and new data D^(t) of size n are generated, independent of everything else, from the distribution M_T(θ_T). Each experiment is performed by a scientist randomly selected from the A types in the population using the categorical distribution with probabilities (p_R1, p_R2, …, p_RA). The selected scientist proposes a model M_P^(t) according to P(M_P^(t) = M_j | M_G^(t−1) = M_i) = Σ_a P(M_j | R_a, M_i) p_Ra, where on the right hand side, the last term is the probability of selecting a scientist with research strategy R_a independent of all else, and the middle term is the probability of proposing model M_j under strategy R_a given the current global model M_i.
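The mixture structure of the proposal kernel can be assembled and analyzed with standard Markov chain tools. The sketch below is a toy under our own assumptions: two invented strategies with invented proposal probabilities P(M_j | R_a, M_i), and, for illustration only, the proposal kernel is treated as the transition kernel itself, ignoring the subsequent score-based comparison against the global model.

```python
import numpy as np

L = 3                        # toy model space M = {M1, M2, M3}
p_R = np.array([0.6, 0.4])   # mixing proportions of the two toy strategies

# P(M_j | R_a, M_i): one L x L proposal matrix per strategy
# (rows: current global model M_i, columns: proposed model M_j)
P_prop = np.array([
    [[0.8, 0.1, 0.1],        # strategy 1: stays near the consensus
     [0.1, 0.8, 0.1],
     [0.1, 0.1, 0.8]],
    [[1/3, 1/3, 1/3],        # strategy 2: ignores the consensus (maverick-like)
     [1/3, 1/3, 1/3],
     [1/3, 1/3, 1/3]],
])

# Marginal kernel: P(M_j | M_i) = sum_a P(M_j | R_a, M_i) * p_Ra
P = np.tensordot(p_R, P_prop, axes=1)

# Stationary distribution: left eigenvector of P for eigenvalue 1, i.e.,
# the long-run proportion of time each model is the global model
w, v = np.linalg.eig(P.T)
pi = np.real(v[:, np.argmax(np.real(w))])
pi /= pi.sum()
```

In the paper's process the transition additionally depends on the data D^(t) and the model selection outcome; this sketch only shows how strategy mixing produces the marginal kernel and how long-run occupancy of models would be computed from it.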

A stochastic process of scientific discovery with replication
In this section, we make two modifications to our process of scientific discovery. First, we introduce a replicator as a type of scientist into the population. Her strategy is to perform replication experiments defined as (ξ^(t−1), ξ^(t), M_G^(t−1)). This dependence on t − 1 implies that the stochastic process of scientific discovery does not admit the Markov property anymore. The stochastic process when a replicator is included in the population is a higher order Markov chain. In fact, the transition probabilities are path dependent on the whole history of the process (see S1 Appendix for a description).
Our second modification is to lift the assumption P(M |R a , M i ) > 0 for all a, i, that we imposed in the process without a replicator. This assumption increases the connectivity of the transition probability matrix, which makes calculations in the long-term behavior of the Markov chain straightforward. Due to our new process not admitting the Markov property, these calculations are irrelevant in the analysis of the process with a replicator. Therefore, we drop the assumption of transitioning from a model to any other model to be nonzero. Removing this assumption allows us to define scientist types that visit only the subset of all models consistent with a specific research strategy. This property of the process renders the effects of each research strategy on the process outcomes well-pronounced.
The properties of first order Markov chains, which apply to our model without replication, are well-studied. The theory for higher order Markov chains is less well understood (see [16]). For example, the existence and uniqueness of stationary probabilities of selecting global models and the mean first passage time to a model are not straightforward to establish in a higher order Markov chain. Thus, we cannot use the theory of Markov chains to investigate properties of interest in our process with replication. For this purpose, we turn to an agent-based model (ABM), which is a forward-in-time, simulation-based implementation of a process at the individual level, using entities defined as agents (see [17,18]). In our ABM, each scientist is represented as an agent that updates the global model by the algorithm given in S2 Appendix.
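A skeletal version of such a forward-in-time ABM loop might look as follows. Everything here is a stand-in of our own: models are integers, the scoring and data-generating functions are toys, and only a maverick-like proposer and the replicator Rey are included; the point is the loop structure, in particular how Rey's re-proposal of the previous study makes the update at time t depend on t − 1.

```python
import random

def abm_run(models, strategies, p_strategies, draw_data, score, n_steps, seed=0):
    """Forward-in-time simulation of the global-model trajectory.
    `strategies` maps a scientist type to a proposal function; the
    replicator "Rey" re-proposes the previous proposal, so the update
    at time t depends on the experiment at time t-1 (non-Markovian)."""
    rng = random.Random(seed)
    global_model = rng.choice(models)
    prev_proposal = global_model
    history = [global_model]
    for _ in range(n_steps):
        name = rng.choices(list(strategies), weights=p_strategies)[0]
        if name == "Rey":                     # replication experiment
            proposal = prev_proposal
        else:
            proposal = strategies[name](global_model, rng)
        data = draw_data(rng)                 # D(t) drawn from the true model
        if score(proposal, data) < score(global_model, data):
            global_model = proposal           # lower score wins the comparison
        prev_proposal = proposal
        history.append(global_model)
    return history

# Toy usage: models are integers and the "score" favors model 2 (the truth)
models = [0, 1, 2]
strategies = {"Mave": lambda g, r: r.choice(models),  # ignores the consensus
              "Rey": None}                            # handled inside abm_run
traj = abm_run(models, strategies, [0.8, 0.2],
               draw_data=lambda r: r.gauss(2.0, 0.5),
               score=lambda m, d: abs(m - d),
               n_steps=200)
```

Quantities of interest, such as time spent at the true model or first passage time, are then read off `traj` across replicate runs.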
Agent-based modeling helps us assess interesting properties of our scientific process when replication is included in the system. We motivate these properties with the same questions (in bold) as we did in our stochastic process of scientific discovery with no replication and specify them in the remainder of this section.
We define the rate of reproducibility conditional on the global model being the true model, P(reproduced | M_G = M_T), which also implies the rate of reproducibility when the true model is not the global model. We estimate these rates of reproducibility by Monte Carlo integration of ABM simulations (see S3 Appendix).
Intuitively, long time spent on the true model, quick discovery of the true model, high stickiness of the true model, and a high rate of reproducibility given that the true model is the global model appear to us to be desirable properties of an efficient process of scientific discovery. We would like to gain insight into 1) the relationship among these properties, and 2) the composition of research strategies in the scientific population that optimizes these properties.
Scientists search for the true model in a universe of linear models

In this section, we consider an application of our model-centric framework where scientists search for a true model in a universe of linear models M. We have chosen linear models because they can accommodate a variety of experimental and observational study designs with straightforward statistical analysis. Further, the complexity of linear models is mathematically tractable and intuitive to assess. We span M as a subset of statistical models obeying y = Xβ + ε, where y is the n × 1 vector of response variables, X is the n × p matrix of predictor variables, β is the p × 1 vector of model parameters, and ε is the n × 1 vector of random errors satisfying the Gauss-Markov conditions of zero expectation and diagonal covariance matrix with constant entries. We define model complexity as a partial ordering of the number of model parameters as described by four conditions: 1) models with a larger number of parameters are more complex than those with a smaller number of parameters, 2) if two models have an equal number of parameters, the model that contains higher order interactions is more complex than the model with lower order interactions or the model without interactions, 3) if two models have the same highest order of interactions, the model that has a higher number of interactions of this order is more complex, 4) else, the two models are equally complex. By convention, we include the predictor x_1 and the response variable y in all models, reflecting our assumption that all scientists in the community focus on a research question that involves at least one common factor of interest and a common response variable.
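Conditions 1-4 can be encoded directly as a lexicographic comparison. The sketch below uses a term representation of our own choosing: a model is a set of terms, each term a frozenset of factor indices, so an interaction is simply a term with more than one index.

```python
def complexity_key(model_terms):
    """Summarize a linear model for the complexity ordering.
    `model_terms` is a collection of terms, each a frozenset of factor
    indices, e.g. {frozenset({1}), frozenset({2}), frozenset({1, 2})}
    represents E(y) = b1*x1 + b2*x2 + b3*x1*x2."""
    n_params = len(model_terms)                     # condition 1
    orders = [len(t) for t in model_terms]
    top = max(orders)                               # condition 2: highest order
    n_top = sum(1 for o in orders if o == top)      # condition 3: count at top order
    return (n_params, top, n_top)

def more_complex(m1, m2):
    """True if m1 is strictly more complex than m2 under conditions 1-4;
    equal keys correspond to condition 4 (equally complex)."""
    return complexity_key(m1) > complexity_key(m2)

m_simple = {frozenset({1}), frozenset({2})}                     # x1 + x2
m_inter  = {frozenset({1}), frozenset({2}), frozenset({1, 2})}  # + x1:x2 interaction
```

Comparing the keys lexicographically applies condition 1 first, then conditions 2 and 3 as tie-breakers, with equality of keys corresponding to condition 4.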

We visualize model complexity on a two dimensional Euclidean space, by representing each model with a unique geometry on an equilateral hexagon inscribed in its tangent circle (Fig 1). Next we discuss the parameters of the system.

Scientist types
We define our scientist types with two considerations in mind. First, since diversity is often a key component of evolving processes, we include distinct research strategies to explore the effect of epistemic diversity in the scientific population. Second, we aim to define simple research strategies where our scientists do not have a memory of their past decisions and they do not interact directly with each other, but only via a consensus model. Such simple research strategies allow results to be obtained by probability calculus and these can be checked against results from forward-in-time simulations.
Nonetheless, the research strategies we include in our idealized model seem reasonably realistic to us in capturing the essence of some well-known research approaches. For example, Mave's strategy resembles the maverick strategy studied in research on epistemic landscapes and the division of cognitive labor [9][10][11]. However, in contrast to that strategy, R_Mave does not actively aim to avoid previously tested models, which means that Mave's strategy is independent of the current scientific consensus (Fig 2A).
For a population with a replicator, we make two changes as described in our process with replication. First, we maintain Tess, Bo, and Mave with the following modification for Tess and Bo. If there are m models satisfying Tess's or Bo's criterion, each of these models is now proposed with probability 1/m, and any other model is proposed with probability 0. Second, we introduce a fourth type of scientist, Rey, the replicator. If selected at time t, Rey conducts a replication experiment defined by (ξ^(t−1), ξ^(t), M_G^(t−1)). Her strategy R_Rey is to set M_P^(t) = M_P^(t−1), repeating the latest study conducted in the scientific community (Fig 2B).

Proportions of scientists in populations
We assess the effect of each strategy by considering populations of scientists in which one scientist type is dominant, alongside an epistemically diverse population in which the types are represented in balanced proportions (see Table 1).

Maximum number of factors in the model
To explore the effect of model complexity on scientific discovery thoroughly, we fix the maximum number of factors that can enter a model.

Predictor variables, error variance, and sample size

We uniformly randomly generate the value of the jth factor at the ith level, x_ij, on the set {1, 2, …, 100} for all i and j. We fix the sample size n = 100 for convenience, and we calibrate the ratio of the error variance σ² to the expected value of the model at the mean value of the predictors, E(y|µ_x), where µ_x = E(x). We use a standardized linear regression model so that if the ith observed response is y*_i, then we set y_i = (y*_i − ȳ*)/s*, where ȳ* is the sample mean and s* is the sample standard deviation of y*_i, i = 1, 2, …, n. Implementing a standardized linear regression model allows us to precisely specify σ² : E(y|µ_x). We fix three levels for σ² : E(y|µ_x): (1 : 4), (1 : 1), and (4 : 1).
We set the true regression coefficients to be Dirichlet distributed with unit parameters or, equivalently, uniformly distributed on the interval (0, 1) with the constraint that they sum to 1. Thus, all regression coefficients have the same mean effect size.
We set the correlation between the first factor x 1 and other factors to a small value of 0.2 to avoid orthogonality between predictors which is an idealized case that we believe rarely achievable in practice. We performed additional analyses with higher correlation between predictor variables and found that correlation between predictors does not affect the results in our system unless it is extremely high and causes multicollinearity.
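The data-generating recipe above can be sketched compactly. This is a loose illustration under our own simplifications: the 0.2 correlation between x_1 and the other predictors is omitted for brevity, and the expected value E(y|µ_x) is computed at the midpoint of the factor levels.

```python
import numpy as np

def simulate_true_model(n=100, k=3, ratio=0.25, rng=None):
    """Draw one data set from a toy true model y = X @ beta + eps.
    `ratio` is sigma^2 : E(y | mu_x); predictor correlation is omitted
    here, so this mirrors the paper's setup only loosely."""
    rng = rng or np.random.default_rng()
    X = rng.integers(1, 101, size=(n, k)).astype(float)  # factor levels in {1..100}
    beta = rng.dirichlet(np.ones(k))                     # coefficients sum to 1
    mu_x = np.full(k, (1 + 100) / 2)                     # E(x_j) = 50.5
    sigma2 = ratio * (mu_x @ beta)                       # calibrate error variance
    y_star = X @ beta + rng.normal(scale=np.sqrt(sigma2), size=n)
    y = (y_star - y_star.mean()) / y_star.std(ddof=1)    # standardize the response
    return X, beta, y

X, beta, y = simulate_true_model(rng=np.random.default_rng(7))
```

Standardizing y is what makes the ratio σ² : E(y|µ_x) the meaningful noise knob, since the raw scale of the response is divided out.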

Model selection criteria
True linear models and design of ABM experiments
We use "true model" to mean the linear structure; for each true model there are infinitely many probability distributions that can be the fully specified true probability model generating the data. All our analyses for each true model integrate out these parameters by simulation. For the system with no replication, we have the computational advantage of employing probability calculus, and we analyze our process under each of the L = 14 possible true models.
For the system with replication, we use three true models representing a gradient of model complexity. Further, we set up a 3 × 3 × 5 × 2 completely randomized factorial design experiment composed of: 3 true models, 3 σ² : E(y|µ_x) levels at (1 : 4), (1 : 1), (4 : 1), 5 scientist populations (see Table 1), and 2 model selection statistics (AIC and SC). Each experimental condition is implemented as an ABM simulation. Each scenario was run for 11000 iterations and replicated 100 times, each using a different random seed. The first 1000 iterations were discarded as burn-in, except when analyzed for the mean first passage time to the true model, for which we considered all iterations.
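The experimental grid can be laid out as a cross product. The population labels below are our shorthand for the five populations of Table 1, not the paper's exact names.

```python
from itertools import product

true_models = ["simple", "moderate", "complex"]
noise_ratios = [(1, 4), (1, 1), (4, 1)]                  # sigma^2 : E(y | mu_x)
populations = ["Tess", "Bo", "Mave", "Rey", "diverse"]   # shorthand for Table 1
criteria = ["AIC", "SC"]

# 3 x 3 x 5 x 2 = 90 experimental conditions
conditions = list(product(true_models, noise_ratios, populations, criteria))

# Each condition is replicated 100 times with a distinct random seed;
# each run lasts 11000 iterations, with the first 1000 discarded as
# burn-in (except for first-passage-time analyses).
runs = [(cond, seed) for cond in conditions for seed in range(100)]
```

Enumerating the grid this way also makes the bookkeeping explicit: 90 cells times 100 seeded replicates gives 9000 ABM runs.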
For the code to perform simulations and analyze the data, and a summary dataset, see S5 Code and Data.

Results in a system with no replication
Boundary tester overfits unduly complex models

In Bo-dominant populations, unduly complex models are among the top three most visited models (Fig 3, column Bo, rows 7-10). This is a consequence of Bo's strategy to add only interaction predictors to her proposed models, which regularly pits the global model against more complex models.
SC increases time spent at the true model

True model is sticky
The top panel in Fig 6 shows that the true model, once discovered, tends to be retained as the global model. Using SC as opposed to AIC as the model selection statistic decreases differences across populations (Fig 7, SC), increasing the speed of discovery for all populations when averaged across all initial and true models. The population to hit the true model fastest is the epistemically diverse population, followed by the Tess-, Mave-, and Bo-dominant populations.

Results in a system with replication
Reproducible science does not imply convergence to truth

Results from our ABM show that the association between reproducibility of scientific claims and discovery of truth must be interpreted with care. Although multiple confirmations of a claim in a scientific inquiry lend credibility to that claim, withstanding the test of multiple confirmations is not sufficient for scientific convergence to truth. As the rate of reproducibility in our system increases, the scientist populations do not necessarily spend more time on the true model, as evidenced by the low value of the sample correlation coefficient for all populations, r = 0.18 (Fig 8A). Further, the rate of reproducibility has a low, positive correlation with the first passage time to the true model, as evidenced by a sample correlation coefficient for all populations of r = 0.18 (Fig 8B). The Mave-dominant population hits the true model fastest (Fig 8C) regardless of the complexity of the true model and the error variance to model expectation ratio (Fig 9A and Fig 9B), with Rey- and Tess-dominant populations hitting simple true models faster than complex models, and Bo hitting complex true models faster than simpler models. We conclude that the efficiency of the scientific process is improved by the presence of mavericks who facilitate the discovery of truth by proposing innovative ideas.
Epistemic diversity optimizes the process of scientific discovery

The epistemically diverse population performs well with respect to all properties and with low variability. Therefore, epistemic diversity is a buffer against weaknesses of each research strategy. We conclude that among the scientist populations we investigate, epistemic diversity optimizes the properties of scientific discovery that we specified as desirable.

Methodological choices are not details and affect the discovery of scientific truth

While scientists focus on their theories and data collection during the scientific process, the methodological choices they make to test their claims may appear to be perfunctory.
While theoretical differences among model selection statistics including AIC and SC have been extensively studied in statistics literature (see [19] and references therein), a statistical method is only guaranteed to operate well under its assumptions. In practice, violations of these assumptions will affect the results of an analysis performed with that method. The effects of model selection statistic on our outcomes of interest are not trivial. When true model complexity is low, using SC for model selection increases the proportion of time true model is selected as global model as compared to AIC ( Fig 11A).
As model complexity increases, however, this difference disappears and AIC has lower variance in performance. When the ratio of error variance to model expectation is low, SC leads to a greater proportion of times true model is the global model. As the ratio of error variance to model expectation increases, AIC and SC have similar proportion of times true model is the global model but AIC has smaller variance ( Fig 11B). Averaged over all other parameters, SC has higher proportion of times true model is global model than AIC (medians = 27.05% and 19.83%, respectively) but greater variability (IQR = 66.03% and 33.80%, respectively).
Complexity of the true model has a small direct effect on scientific discovery

In our system, the complexity of the true model does not drive our outcomes of interest (Fig 12). The median first passage time to hit the true model is 42 steps when the true model is simple (E(y) = β_1x_1 + β_2x_2), 172 steps when the true model is moderately complex, and 215 steps when the true model is complex (Fig 8C).

Discussion
We introduced a meta-scientific framework for studying reproducibility that uses a model-centric rather than a hypothesis-centric approach. We kept our investigation simple. We imagined that our scientists engage in straightforward research strategies and do not: commit experimenter bias, learn from their own or others' experiences, engage in hypothesis testing, or commit measurement errors. Further, they are not prone to p-hacking, publication bias, or institutional incentives. Moreover, we assume that there exists a true model that our population of scientists attempt to discover using idealized experiments under the paradigm of confirmation. Many of these factors that we have abstracted away are potential avenues for future research, particularly for more complex social dynamics, but our goal here was to demonstrate the richness of a model-centric approach, even in its simplicity.
Our study shows that even under an idealized framework, the link between reproducibility and convergence to a scientific truth is not straightforward. A highly reproducible research strategy might steer the scientific community away from the truth, as shown by the behavior of Bo in our system (Bo, the boundary tester, adds predictors, which are interactions, to the current global model). Consequently, while both reproducibility and convergence to a scientific truth are presumably desirable properties of scientific progress, reproducibility does not imply convergence to the truth (nor vice versa), and hence we must be careful not to substitute one for the other. At least in our framework and with our operationalizations of rate of reproducibility and scientific discovery of the truth, these two properties of the scientific process are uncorrelated with each other. This seeming paradox of scientific progress despite irreproducibility has been recently noted by Shiffrin, Börner, and Stigler [20]. Within our model-centric framework, this seeming paradox can be explained by a combination of research strategies and the state of truth. As such, the explanation of the paradox cannot be reduced to questionable research practices or structural incentives. Moreover, only very specific research strategies reach 100% reproducibility in our system even when the true model that generates the data is captured. Even then, in our system, these are not necessarily strategies that maximize other desirable properties of scientific discovery (e.g., first passage time to true model).
Innovation, which is explicitly represented by Mave in our system, makes otherwise inaccessible, possible true models visible to the scientific population and thereby speeds up discovery of truth. The capacity to speed up discovery is not limited to a maverick strategy. What makes the process of scientific discovery efficient is that the whole space of models can be explored at least by some scientists in the population and the true model is accessible. In our framework, there are scenarios in which scientific discovery is unfeasibly slow. In real life, we surmise that the model space might be much larger and the true model-if it exists-might not be necessarily accessible in the search space, which means that discovery of truth might be even more challenging. Therefore, if the connectedness in the model space is indeed as important in the discovery of truth as we find in our system, then innovative scientists might have an indispensable role in facilitating scientific discovery.
Consistent with findings in the literature on cognitive division of labor [10,21], a diversity of strategies in the scientific population optimizes across a range of desirable properties of scientific discovery. If populations are largely homogeneous, with one research strategy dominant over others, then the scientific population tends to perform poorly on at least one of the desirable properties.
We were surprised to find out that the direct effects of true model complexity on the scientific discovery were less pronounced than the effect of research strategies in the population. We do, however, find that the choice of statistics relative to true model complexity have non-trivial effects on our outcomes of interest. This is corroborated by recent statistical theory [19] (and references therein). The difficulty, of course, is that the complexity of the true model is often unknown to a community of scientists and they make their statistical choices under uncertainty. Our point is just that model complexity can have differential effects on scientific discovery and that these effects are moderated by the choice of statistic.
We think that the main limitations of our framework are the lack of capacity to learn and the memorylessness of scientists. As such, Rey (our replicator agent) only provides meta-level information about the scientific process and does not contribute directly to the accumulation of scientific knowledge. Incorporating past reproducibility of specific claims in decision making strategies would allow the replicator to make substantive contributions to scientific progress. We leave such considerations for future work.

Supporting information

S1 Appendix. We assume that, in addition to the types of scientists in A, a replicator is also present in the population. The strategies are given by R = {R_o, R_1, R_2, …, R_A}, where R_o denotes the replicator's strategy.
The transition probabilities of the Markov chain at time t can be expressed by conditioning on whether the scientist chosen at a given time is a replicator. The first term in the sum is the joint probability of choosing a scientist who is not a replicator and of the model she proposes; the second term involves the replicator, whose proposal depends on the experiment at time t − 1. Therefore, when a replicator scientist is included in the population, we have a higher order Markov chain, whose long term dynamics are not feasible to obtain without a forward simulation method.

S2 Appendix. Algorithm for stochastic process of scientific discovery.

Algorithm 1
Require: M, Θ, S, R, P(R_a), P(M | R_a, M_i). At each step, data are generated as y_i ∼ M_T(θ), for i = 1, 2, …, n, independently of each other, and the model selection score uses the penalty C = 2p log(n) if SC is used or C = 2p if AIC is used.
Consider the indicator 1{M_G^(t+1) = M_G^(t) | R_o}, which takes the value 1 if the global model at time t + 1 is equal to the global model at time t, given that at time t we have chosen a replicator. This is a Bernoulli distributed random variable, and its mean is the rate of reproducibility; its Monte Carlo estimate averages the indicator over the instances v = 1, 2, …, V at which the replicator is chosen. We also estimate the rate of reproducibility when the true model is global. Here, V_T is the number of times the replicator is selected when the global model is the true model, V_N is the number of times the replicator is selected when the global model is not the true model, and V_T + V_N = V. If Akaike's Information Criterion is used, 2p is used instead of 2p log(n). The loglikelihood term is proportional to n times the log of the residual sum of squares and can be written as log P(y|X, β̂) = n log(yᵀy − yᵀX(XᵀX)⁻¹Xᵀy) + C, where C is a term dependent only on M_T. For transition probabilities, we are interested in P(S(M_i) < S(M_j)), where subscripts i, j denote quantities that depend on models M_i and M_j. We obtain an estimate of P(S(M_i) − S(M_j) < 0) using Eq (5) conditional on the true model M_T and its predictors X_T as follows. We first generate our set of k predictor variables.
We build X_i and X_j for M_i and M_j, respectively. Then we generate β_Tv, v = 1, 2, …, V, independently of each other. Finally, we simulate y_v | X_T, β_Tv from the normal distribution with expected value E(y_v) = X_Tβ_Tv and variance σ². Each realization (y_1, y_2, …, y_V) is used in Eq (4) to assess whether S(M_i) − S(M_j) < 0, and the estimate is obtained using the mean in Eq (5).

Fig 2. Given the global model M_G, the set of proposal models and their probabilities (given in percentage points inside models) are determined. In a population with no replicator, Bo can propose all models with nonzero probability, where each model formed by adding an interaction to the global model is proposed with probability 0.2, and each of the remaining models is proposed with probability 0.02 (2A). In a population with a replicator, Bo can propose only models formed by adding an interaction, each with probability 0.25 (2B). A model is randomly drawn from the set of proposal models (3) and data are generated from the true model (4).

S7 Fig. (σ² : E(y|µ_x)) = (4 : 1).
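The Monte Carlo estimator described above can be sketched as follows. The score uses the penalty C from S2 Appendix; the design sizes, seeds, and the illustrative comparison at the end are our own assumptions.

```python
import numpy as np

def score(y, X, criterion="SC"):
    """S(M) = n log(RSS) + C, with C = 2p log(n) for SC and 2p for AIC."""
    n, p = X.shape
    beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = np.sum((y - X @ beta_hat) ** 2)
    C = 2 * p * np.log(n) if criterion == "SC" else 2 * p
    return n * np.log(rss) + C

def prob_better(X_i, X_j, X_T, sigma, criterion="SC", V=500, seed=0):
    """Estimate P(S(M_i) < S(M_j)) conditional on the true design X_T by
    averaging the indicator over V simulated data sets; each replicate
    draws beta_T from a unit-parameter Dirichlet and then
    y | X_T, beta_T ~ Normal(X_T @ beta_T, sigma^2 I)."""
    rng = np.random.default_rng(seed)
    n = X_T.shape[0]
    wins = 0
    for _ in range(V):
        beta_T = rng.dirichlet(np.ones(X_T.shape[1]))
        y = X_T @ beta_T + rng.normal(scale=sigma, size=n)
        wins += score(y, X_i, criterion) < score(y, X_j, criterion)
    return wins / V

# Illustrative comparison: M_i uses the true two-factor design, M_j drops x2
rng = np.random.default_rng(3)
X_T = rng.integers(1, 101, size=(100, 2)).astype(float)
p_win = prob_better(X_i=X_T, X_j=X_T[:, :1], X_T=X_T, sigma=0.5)
```

With a small error standard deviation relative to the signal, dropping a true predictor inflates the residual sum of squares far more than the penalty difference, so the estimate should favor the correctly specified design.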