Replication, Communication, and the Population Dynamics of Scientific Discovery

Many published research results are false (Ioannidis, 2005), and controversy continues over the roles of replication and publication policy in improving the reliability of research. Addressing these problems is frustrated by the lack of a formal framework that jointly represents hypothesis formation, replication, publication bias, and variation in research quality. We develop a mathematical model of scientific discovery that combines all of these elements. This model provides both a dynamic model of research and a formal framework for reasoning about the normative structure of science. We show that replication may serve as a ratchet that gradually separates true hypotheses from false, but the same factors that make initial findings unreliable also make replications unreliable. The most important factors in improving the reliability of research are the rate of false positives and the base rate of true hypotheses, and we offer suggestions for addressing each. Our results also bring clarity to verbal debates about the communication of research. Surprisingly, publication bias is not always an obstacle and may instead have positive impacts: suppression of negative novel findings is often beneficial. We also find that communication of negative replications may aid true discovery even when attempts to replicate have diminished power. The model speaks constructively to ongoing debates about the design and conduct of science, focusing analysis and discussion on precise, internally consistent models, as well as highlighting the importance of population dynamics.


Derivation of full model with random replication
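Let $f_{T,s} = n_{T,s}/n$ be the frequency of true hypotheses with tally $s$. Under the assumptions and definitions supplied in the main text, the full recursion for $n'_{T,s}$ is given by:

$$n'_{T,s} = n_{T,s} + a n r \left[ -f_{T,s}\left( (1-\beta) c_{R+} + \beta c_{R-} \right) + f_{T,s-1} (1-\beta) c_{R+} + f_{T,s+1} \beta c_{R-} \right] \qquad (1)$$

for $s$ not equal to $1$ or $-1$. In those cases, there is an additional term. For $s = 1$:

$$n'_{T,1} = n_{T,1} + a n r \left[ -f_{T,1}\left( (1-\beta) c_{R+} + \beta c_{R-} \right) + f_{T,0} (1-\beta) c_{R+} + f_{T,2} \beta c_{R-} \right] + a n (1-r) b (1-\beta) \qquad (2)$$

The $a n (1-r) b (1-\beta)$ term accounts for inflow of novel positive findings, all of which are communicated. For $s = -1$:

$$n'_{T,-1} = n_{T,-1} + a n r \left[ -f_{T,-1}\left( (1-\beta) c_{R+} + \beta c_{R-} \right) + f_{T,-2} (1-\beta) c_{R+} + f_{T,0} \beta c_{R-} \right] + a n (1-r) b \beta c_{N-} \qquad (3)$$

The $a n (1-r) b \beta c_{N-}$ term accounts for inflow of novel negative findings, only $c_{N-}$ of which are communicated. Recursions for false hypotheses can be derived just by substitution of variables: $b \to 1 - b$ and $1 - \beta \to \alpha$. These recursions implicitly define the population growth recursion for $n$:

$$n' = n + a n (1-r) \left[ b \left( 1 - \beta + \beta c_{N-} \right) + (1-b) \left( \alpha + (1-\alpha) c_{N-} \right) \right]$$

This just indicates that the population of published hypotheses grows in proportion to the innovation rate, $1 - r$, and the rates at which true and false hypotheses respectively produce positive and negative findings, as well as the rate at which negative findings are communicated.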
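The flows described above can be sketched as a single update step. This is a minimal illustration, not the authors' Mathematica code. It assumes that a communicated positive replication moves a hypothesis up one tally, a communicated negative replication moves it down one tally, and an un-communicated replication leaves the tally unchanged. The function names, the list layout (index `i` holds tally `s = i - bound`), and the reflecting tally bounds are all assumptions made for the example.

```python
def step(n_true, n_false, a, r, b, alpha, beta, c_n, c_rp, c_rm):
    """Advance tally counts by one time step (reflecting tally bounds assumed)."""
    size = len(n_true)
    mid = size // 2                      # index of tally s = 0
    n = sum(n_true) + sum(n_false)       # all published hypotheses

    def update(counts, base, p_pos):
        f = [c / n for c in counts]
        up = [x * p_pos * c_rp for x in f]          # communicated positive replications
        down = [x * (1 - p_pos) * c_rm for x in f]  # communicated negative replications
        up[-1] = 0.0     # nothing leaves through the upper bound
        down[0] = 0.0    # nothing leaves through the lower bound
        new = []
        for i in range(size):
            inflow = (up[i - 1] if i > 0 else 0.0) + (down[i + 1] if i < size - 1 else 0.0)
            new.append(counts[i] + a * n * r * (inflow - up[i] - down[i]))
        new[mid + 1] += a * n * (1 - r) * base * p_pos              # novel positives
        new[mid - 1] += a * n * (1 - r) * base * (1 - p_pos) * c_n  # novel negatives
        return new

    # False hypotheses use the substitution b -> 1 - b and 1 - beta -> alpha.
    return update(n_true, b, 1 - beta), update(n_false, 1 - b, alpha)


def growth_factor(a, r, b, alpha, beta, c_n):
    """Per-step growth n'/n implied by the inflow of communicated novel findings."""
    g = b * (1 - beta + beta * c_n) + (1 - b) * (alpha + (1 - alpha) * c_n)
    return 1 + a * (1 - r) * g
```

Because replication only moves hypotheses between tallies, the total count changes only through novel findings, so `sum(n_true) + sum(n_false)` should grow by exactly `growth_factor(...)` each step. Precision at a tally is then the true count divided by the total count at that index.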
2. Beyond "true" and "false"

Above we noted that recursions for false hypotheses can be derived just by substitution of variables: $b \to 1 - b$ and $1 - \beta \to \alpha$. In other words, true and false hypotheses are differentiated only by the rate at which they appear in new investigations and their respective probabilities of producing positive findings. This also means it is straightforward to expand the model to additional epistemic states, as "true" and "false" really just mean more and less correct. For example, small, medium, and large effect sizes could be represented by three states, each with its own base rate and probability of producing a positive result. The derivation would remain the same, but an additional set of steady-state solutions would appear.
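As a small illustration of the multi-state idea, the sketch below computes how novel positive findings would divide among three effect-size states before any replication. It assumes only that each state enters new investigations at its base rate and yields a positive finding with its own probability. The state labels and all numbers are hypothetical and chosen for arithmetic convenience.

```python
def novel_positive_shares(states):
    """Share of each epistemic state among novel positive findings.

    `states` maps a label to (base_rate, prob_positive); base rates sum to 1.
    """
    weights = {label: base * p_pos for label, (base, p_pos) in states.items()}
    total = sum(weights.values())
    return {label: w / total for label, w in weights.items()}


# Hypothetical values: (base rate, probability of a positive finding).
states = {"small": (0.6, 0.2), "medium": (0.3, 0.5), "large": (0.1, 0.9)}
shares = novel_positive_shares(states)
```

With these illustrative numbers the weights are 0.12, 0.15, and 0.09, so the shares are 1/3, 5/12, and 1/4. The two-state model is the special case with states `(b, 1 - beta)` and `(1 - b, alpha)`.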

Full communication solution.
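Here we repeat the simplest such solution from the main text and then motivate its justification. The steady state proportion of hypotheses that are both true and have tally $s$, when all findings are communicated, is given by:

$$\hat{f}_{T,s} = b \sum_{m=1}^{\infty} (1-r)\, r^{m-1}\, K\!\left(m, \tfrac{m+s}{2}\right) (1-\beta)^{(m+s)/2} \beta^{(m-s)/2}$$

where $K(a, b)$ is the binomial chooser function, with terms contributing zero when $(m+s)/2$ is not an integer between $0$ and $m$. To motivate this, consider a true hypothesis with tally $s = 1$ after $m = 3$ investigations. Three sequences of findings produce this tally: (1) + + −, (2) + − +, and (3) − + +. The probability of any one of these is $(1-\beta)^2 \beta$, and the probability that an hypothesis is true and has been studied three times is $(1-r) b r^2$.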

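The pattern here generalizes so that the total probability is just:

$$\sum_{m=1}^{\infty} (1-r)\, b\, r^{m-1} \times K\!\left(m, \tfrac{m+s}{2}\right) (1-\beta)^{(m+s)/2} \beta^{(m-s)/2}$$

where the first factor is the probability that an hypothesis is true and has been studied $m$ times, the second factor is the probability of tally $s$ in $m$ findings, and $K(a, b)$ is the binomial chooser function, taken as zero when $b$ is not an integer between $0$ and $a$.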
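The geometric-series pattern can be checked numerically. The sketch below assumes full communication and assumes the general term is the product of two probabilities: that a hypothesis of a given class has been studied m times, and that m findings yield tally s. The function names and the truncation point `m_max` are choices made for this example.

```python
from math import comb


def joint_term(m, s, r, base, p_pos):
    """Probability of belonging to a class, being studied m times, and having tally s.

    `base` is the class base rate and `p_pos` its probability of a positive finding.
    """
    if abs(s) > m or (m + s) % 2:
        return 0.0                       # tally s is unreachable in m findings
    k = (m + s) // 2                     # number of positive findings
    return (1 - r) * base * r ** (m - 1) * comb(m, k) * p_pos ** k * (1 - p_pos) ** (m - k)


def steady_state(s, r, base, p_pos, m_max=200):
    """Truncated sum over the number of investigations m."""
    return sum(joint_term(m, s, r, base, p_pos) for m in range(1, m_max + 1))


def precision(s, r, b, alpha, beta, m_max=200):
    """Proportion of hypotheses at tally s that are true."""
    true = steady_state(s, r, b, 1 - beta, m_max)
    false = steady_state(s, r, 1 - b, alpha, m_max)   # b -> 1 - b, 1 - beta -> alpha
    return true / (true + false)
```

For example, `joint_term(3, 1, r, b, 1 - beta)` reproduces the three-sequence case: three orderings, each with probability $(1-\beta)^2\beta$, weighted by $(1-r) b r^2$. Summing `steady_state` over all tallies should return the base rate, up to the truncated tail.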

This steady-state solution obviously assumes that there has been an infinite amount of research time, such that every $m$ can be realized. In practice, since the sequence is geometric in $r$, the probabilities of higher values of $m$ decline very rapidly, and simulations confirm that steady state is reached quickly, as long as the replication rate $r$ is not close to 1.
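To give a sense of how quickly the series can be truncated, the sketch below assumes the weight on m investigations is $(1-r) r^{m-1}$, so the mass beyond M investigations is $r^M$. The function name and the tolerance are illustrative choices.

```python
def investigations_needed(r, tol=1e-6):
    """Smallest M such that the neglected tail mass r**M is at most tol."""
    if not 0 <= r < 1:
        raise ValueError("r must lie in [0, 1)")
    m = 0
    while r ** m > tol:
        m += 1
    return m
```

With a tolerance of one in a million, `investigations_needed(0.2)` is 9 while `investigations_needed(0.9)` is 132, which illustrates why the approximation degrades as the replication rate approaches 1.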

More importantly, we think, these solutions are never meant to describe actual science, but rather to allow us to reason about causal forces in actual science. So the steady state expressions are important even if, as in many real dynamical systems, they are never exactly realized. For example, problems in evolutionary theory are routinely solved by asking what happens on the infinite time horizon. Such solutions have been incredibly useful, despite the fact that no real population or environment is stationary enough to make the exercise literally sensible.

When the communication parameters $c_{N-}$, $c_{R-}$, and $c_{R+}$ are allowed to be less than one, the above strategy generalizes directly, but the expressions become much more complex, because now the infinite series is over multinomial probabilities of three possible outcomes at each replication investigation of an hypothesis: (1) positive and communicated, (2) negative and communicated, or (3) not communicated.

In addition, when findings are not always communicated, the effective activity rate changes, making other probabilities conditional on observable activity. Still, these solutions can be derived either by the logic to follow or by brute-force solution of the system of recursions. Solving the system of recursions does allow for easily defining reflecting or absorbing tally boundaries, which may be appealing in some contexts. The combinatoric solution to follow assumes unbounded tallies. Solutions in the bounded and unbounded cases are nearly identical for all scenarios considered in the main text. The Mathematica notebooks in the supplemental materials present code for both types of solution.
We present the solutions here as a sequence of conditional probabilities, as we've found this form easier to interpret, and therefore more insightful, than the general multinomial form. Specifically, we decompose the multinomial probabilities into a binomial series for observed/unobserved investigations of a hypothesis and a binomial series for positive/negative findings conditional on being observed. The solutions are built from the following terms.

The term Pr(activity) is the probability of observable activity. The probabilities Pr(s|+) and Pr(s|−) give the probabilities of tally $s$, averaging over number of investigations $m$ and un-communicated findings $u$, beginning with either a positive finding or a negative finding, respectively. This conditioning is necessary because a tally $s$ can be reached by different paths once communication is partial. In these probabilities, $I_a(b)$ is a function that returns 1 when $a = b$ and zero otherwise, and $R = r/\Pr(\text{activity})$ is the probability of replication, conditional on activity as defined earlier.

The term Pr(u|m) gives the probability of $u$ un-communicated findings in $m$ investigations. It depends on the probability that a replication finding is un-communicated, averaging over positive and negative findings.

Finally, the function $S(z|n)$ provides the probability that a sequence of $n$ communicated replication findings produces a difference $z$ between positive and negative replications. It is defined using $K(a, b)$, which is again the binomial chooser function, but evaluating to zero when $b$ is not an integer, together with the probability of a positive replication, conditional on the replication finding being communicated.
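A minimal sketch of the last of these terms follows. It assumes $S(z|n)$ is the binomial probability of $(n+z)/2$ positive findings among $n$ communicated replications. The symbol `q` for the probability of a positive replication conditional on communication, and the function names, are introduced here for illustration only.

```python
from math import comb


def K(n, k):
    """Binomial chooser that evaluates to zero when k is not an integer in 0..n."""
    if k != int(k) or k < 0 or k > n:
        return 0
    return comb(n, int(k))


def prob_positive_given_communicated(p_pos, c_rp, c_rm):
    """Probability a replication is positive, given that it was communicated."""
    pos = p_pos * c_rp
    neg = (1 - p_pos) * c_rm
    return pos / (pos + neg)


def S(z, n, q):
    """Probability that n communicated replications differ by z (positives minus negatives)."""
    k = (n + z) / 2                      # implied number of positive replications
    coef = K(n, k)
    if coef == 0:
        return 0.0                       # parity or range makes z unreachable
    k = int(k)
    return coef * q ** k * (1 - q) ** (n - k)
```

For a true hypothesis, `p_pos` is $1 - \beta$; for a false one it is $\alpha$. For a fixed `n`, the values of `S` over all reachable `z` sum to one, and `S(0, 0, q)` equals one because zero replications leave the tally unchanged.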

Approximate conditions for reduced communication
We argue in the main text that full communication is rarely optimal, from the perspective of precision. Consider the full communication context: $c_{N-} = c_{R-} = c_{R+} = 1$. For small $b$ ($b^2 \approx 0$) and small $r$ ($r^3 \approx 0$), precision as defined in the main text is improved by reducing communication parameters under the following conditions:

• $c_{N-} < 1$ when $\alpha < \beta$ (easy to satisfy)
• $c_{R-} < 1$ when $\alpha > 0.5$ (hopefully not satisfied)
• $c_{R+} < 1$ when $\beta - \alpha \le 1/4$

These conditions are derived by first defining precision at $s = 1$, which is the most conservative precision to investigate, because it benefits the least from replication, and higher tallies always have higher precision than $s = 1$. So improvements at $s = 1$ cascade upwards to higher tallies. Let $\mathrm{PPV}_1$ be the precision at $s = 1$. Then the first condition is proved by computing the derivative $\partial \mathrm{PPV}_1 / \partial c_{N-}$, evaluated at full communication parameter values, and Taylor expanding the result to second order around $r = 0$ and to first order around $b = 0$. Neglecting terms of order $O(b^2)$ and $O(r^3)$ and higher, the resulting expression is negative unless $\alpha > \beta$. Thus suppressing some initial negative findings is favorable, provided the base rate is small and replication is not too common. We think most scientific fields satisfy these conditions, but reasonable people can and do disagree on that point.

In contrast, suppressing negative replications is unlikely to help. By the same strategy, but this time differentiating with respect to $c_{R-}$, the resulting expression is guaranteed positive when $\alpha \le 0.5$, because by assumption $1 - \beta > \alpha$, indicating that $c_{R-} = 1$ is favored. The third condition is derived similarly. In the resulting expression, the last term is the one in play, and the requirement that the expression be negative is guaranteed when $\beta - \alpha \le 1/4$.
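The direction of the first two conditions can be explored numerically with finite differences, as a stand-in for the Taylor expansion. The sketch below is not the authors' derivation. It assumes that communicated replications move a tally up or down by one and that un-communicated replications leave it unchanged, and it solves the resulting steady-state balance by fixed-point sweeps with reflecting tally bounds. Function names, the bound, the sweep count, and the parameter values are illustrative.

```python
def steady_state(r, b, alpha, beta, c_n, c_rp, c_rm, bound=30, sweeps=200):
    """Steady-state tally frequencies (true, false) on tallies -bound..bound."""
    g = b * (1 - beta + beta * c_n) + (1 - b) * (alpha + (1 - alpha) * c_n)
    growth = (1 - r) * g                 # growth from communicated novel findings
    size = 2 * bound + 1
    mid = bound                          # index of tally s = 0

    def solve(base, p_pos):
        up = r * p_pos * c_rp            # rate of moving up one tally
        down = r * (1 - p_pos) * c_rm    # rate of moving down one tally
        src = [0.0] * size
        src[mid + 1] = (1 - r) * base * p_pos
        src[mid - 1] = (1 - r) * base * (1 - p_pos) * c_n
        f = [0.0] * size
        for _ in range(sweeps):
            new = []
            for i in range(size):
                out = (up if i < size - 1 else 0.0) + (down if i > 0 else 0.0)
                inflow = (up * f[i - 1] if i > 0 else 0.0) + \
                         (down * f[i + 1] if i < size - 1 else 0.0)
                new.append((src[i] + inflow) / (growth + out))
            f = new
        return f

    return solve(b, 1 - beta), solve(1 - b, alpha)


def ppv1(**params):
    """Precision at tally s = 1."""
    true, false = steady_state(**params)
    i = len(true) // 2 + 1
    return true[i] / (true[i] + false[i])


# Illustrative values with small b, small r, and alpha < beta.
full = dict(r=0.05, b=0.01, alpha=0.05, beta=0.4, c_n=1.0, c_rp=1.0, c_rm=1.0)
gain_from_c_n = ppv1(**{**full, "c_n": 0.9}) - ppv1(**full)
gain_from_c_rm = ppv1(**{**full, "c_rm": 0.9}) - ppv1(**full)
```

Under these assumptions, lowering `c_n` slightly below one should raise precision at tally one, and lowering `c_rm` should reduce it, in line with the first two conditions. The same helper can be pointed at `c_rp` to explore the third.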