From Traditional Medicine to Witchcraft: Why Medical Treatments Are Not Always Efficacious

Complementary medicines, traditional remedies and home cures for medical ailments are used extensively world-wide, representing more than US$60 billion sales in the global market. With serious doubts about the efficacy and safety of many treatments, the industry remains steeped in controversy. Little is known about factors affecting the prevalence of efficacious and non-efficacious self-medicative treatments. Here we develop mathematical models which reveal that the most efficacious treatments are not necessarily those most likely to spread. Indeed, purely superstitious remedies, or even maladaptive practices, spread more readily than efficacious treatments under specified circumstances. Low-efficacy practices sometimes spread because their very ineffectiveness results in longer, more salient demonstration and a larger number of converts, which more than compensates for greater rates of abandonment. These models also illuminate a broader range of phenomena, including the spread of innovations, medical treatment of animals, foraging behaviour, and self-medication in non-human primates.

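small, this distribution is well approximated by a Poisson. Thus, given $U$, the time spent being ill, $N$ is distributed as a Poisson with parameter $\alpha_1 U$. As described in Methods, $U$ in turn is distributed exponentially with parameter $\lambda = \mu + \nu + \theta + \tau + \sigma$. Because $N$ is a Poisson-exponential mixture, it has a geometric distribution with parameter

$$\frac{\lambda}{\lambda + \alpha_1} = \frac{\mu + \nu + \theta + \tau + \sigma}{\alpha_1 + \mu + \nu + \theta + \tau + \sigma}$$

(see for example reference [1]). This can be seen through the following, letting $f(u)$ represent the probability density function of the exponential distribution:

$$P(N = n) = \int_0^\infty P(N = n \mid U = u)\, f(u)\, du = \int_0^\infty \frac{e^{-\alpha_1 u} (\alpha_1 u)^n}{n!}\, \lambda e^{-\lambda u}\, du = \frac{\lambda}{\lambda + \alpha_1} \left( \frac{\alpha_1}{\lambda + \alpha_1} \right)^n, \quad n = 0, 1, 2, \ldots$$

Cultural fitness of the trait

The cultural fitness of the trait in this model is the mean number of converts per demonstrator. This is analogous to the concept of the basic reproductive value in epidemic theory and absolute fitness in evolutionary theory. The mean of the geometric distribution above is

$$\phi = E(N) = \frac{\alpha_1}{\lambda} = \frac{\alpha_1}{\mu + \nu + \theta + \tau + \sigma}. \qquad (1)$$

Probability of spread of new trait

The probability of the practice being spread to at least one individual after its invention is

$$P(N \geq 1) = 1 - P(N = 0) = \frac{\alpha_1}{\alpha_1 + \lambda}.$$

The probability of spread through the whole population, that is, fixation, is one minus the probability of (eventual) extinction. To derive the ultimate extinction probability, we use a result of linear birth-death processes [2]. Here, the linear stochastic birth rate is $\alpha_1$, which is the rate of converting observers to the treatment, and the death rate is $\lambda$. The extinction probability [2, p147] is then

$$P(\text{extinction}) = \min(1, \lambda/\alpha_1).$$

The probability of spread of the trait is therefore

$$P(\text{spread}) = 1 - \min(1, \lambda/\alpha_1) = \max(0, 1 - \lambda/\alpha_1). \qquad (2)$$

This result can be alternatively derived by considering the process as a discrete time branching process. In this case, the extinction probability is the smallest solution of $x$ in

$$x = \sum_{n=0}^{\infty} P(N = n)\, x^n = \frac{1 - p}{1 - p x},$$

where $p = \alpha_1/(\alpha_1 + \lambda)$, and therefore $x = 1$ or $x = (1 - p)/p$, and $P(\text{extinction}) = \min(1, (1 - p)/p) = \min(1, \lambda/\alpha_1)$.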
The probability of spread of the trait is then as given in Equation (2). The reason a discrete formulation is possible here is that the probabilities of events do not change over time (for each individual), and thus to compute the extinction probability it does not matter whether or not the "generations" are synchronised. Note, in this case, the relationship between the cultural fitness of the trait and the probability of its spread: if the cultural fitness is φ, the probability of spread is 1 − 1/φ for φ > 1 and 0 otherwise.
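The relationships above can be checked numerically. The following is a minimal Python sketch, not part of the original analysis: it compares the geometric form of the convert distribution with a direct numerical integration of the Poisson-exponential mixture, and evaluates the cultural fitness and the probability of spread. The function names and the parameter values are illustrative assumptions.

```python
import math


def geometric_pmf(n, alpha1, lam):
    """Closed-form P(N = n) for the Poisson-exponential mixture."""
    s = lam / (lam + alpha1)          # geometric parameter lambda / (lambda + alpha_1)
    return s * (1.0 - s) ** n


def mixture_pmf_numeric(n, alpha1, lam, steps=20000):
    """Midpoint-rule approximation of the mixture integral over u."""
    upper = 50.0 / (alpha1 + lam)     # the integrand is negligible beyond this point
    h = upper / steps
    total = 0.0
    for i in range(steps):
        u = (i + 0.5) * h
        poisson = math.exp(-alpha1 * u) * (alpha1 * u) ** n / math.factorial(n)
        total += poisson * lam * math.exp(-lam * u) * h
    return total


def cultural_fitness(alpha1, lam):
    """Mean number of converts per demonstrator, phi = alpha_1 / lambda."""
    return alpha1 / lam


def spread_probability(alpha1, lam):
    """One minus the extinction probability min(1, lambda / alpha_1)."""
    return 1.0 - min(1.0, lam / alpha1)


if __name__ == "__main__":
    alpha1, lam = 2.0, 1.0            # hypothetical rates, chosen for illustration
    for n in range(4):
        print(n, geometric_pmf(n, alpha1, lam), mixture_pmf_numeric(n, alpha1, lam))
    phi = cultural_fitness(alpha1, lam)
    print("phi =", phi, "P(spread) =", spread_probability(alpha1, lam))
```

For any rates with $\alpha_1 > \lambda$, `spread_probability` agrees with $1 - 1/\phi$, and it returns zero otherwise.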
Optimal efficacy of trait

Does a treatment with the highest efficacy τ have the highest cultural fitness? In other words, what efficacy maximises the cultural fitness of the behavioural trait? Under this model, the probability of spread (as well as cultural fitness) is an "n-shaped" curve. Thus, intermediate efficacy is favoured. The reason is that, on the one hand, if the treatment efficacy is too low, the practice is abandoned because recovery is too slow, and on the other hand, if it is too high, recovery is fast and opportunities for demonstrating the practice to others are reduced. Inspecting Equation (1), this qualitative result clearly holds as long as the abandonment rate increases with decreasing recovery rate. Figure 3 shows this relationship for alternative values of σ, ρ and a.
The optimal efficacy for this model occurs at

$$\tau^* = \frac{\ln(a\rho)}{a} - \sigma.$$

This optimal efficacy is positive when

$$\rho > \frac{e^{a\sigma}}{a}.$$

In other words, if $\rho < e^{a\sigma}/a$ the treatment efficacy with the highest cultural fitness is negative, that is, the optimal treatment is maladaptive.
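The threshold $\rho = e^{a\sigma}/a$ can be explored with a short Python sketch. It rests on an assumption that is not stated in this section: that the abandonment rate takes the form $\theta = \rho e^{-a(\tau + \sigma)}$, which is one form consistent with the threshold above. Under that assumption, minimising $\lambda$ over $\tau$ gives a closed-form optimum, which the sketch compares with a grid search. All names and parameter values are hypothetical.

```python
import math


def abandonment_rate(tau, sigma, rho, a):
    # ASSUMPTION: theta = rho * exp(-a * (tau + sigma)), i.e. abandonment
    # falls off exponentially as the total recovery rate tau + sigma rises.
    return rho * math.exp(-a * (tau + sigma))


def cultural_fitness(tau, alpha1, mu, nu, sigma, rho, a):
    """phi = alpha_1 / lambda with lambda = mu + nu + theta + tau + sigma."""
    lam = mu + nu + abandonment_rate(tau, sigma, rho, a) + tau + sigma
    return alpha1 / lam


def optimal_efficacy(sigma, rho, a):
    """Stationary point of lambda in tau under the assumed abandonment rate."""
    return math.log(a * rho) / a - sigma


if __name__ == "__main__":
    # Hypothetical parameter values, for illustration only.
    alpha1, mu, nu, sigma, a = 1.0, 0.1, 0.1, 0.5, 1.0
    for rho in (5.0, 1.0):
        tau_star = optimal_efficacy(sigma, rho, a)
        grid = [-0.5 + 0.001 * i for i in range(4001)]   # tau from -0.5 to 3.5
        best = max(grid, key=lambda t: cultural_fitness(t, alpha1, mu, nu, sigma, rho, a))
        print(f"rho={rho}: closed form {tau_star:.4f}, grid optimum {best:.3f}, "
              f"threshold {math.exp(a * sigma) / a:.4f}")
```

With these illustrative values, the larger $\rho$ lies above the threshold and yields a positive optimum, while the smaller $\rho$ lies below it and yields a negative one.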

A.1.2 Continued demonstration
Here we consider a second model which generalises the above model in that demonstration can occur after an individual has recovered. Again, recovery is permanent, with the rate of relapse into illness equal to zero. Note that the probability that the demonstrator recovers rather than dying or abandoning the practice is given by

$$q = \frac{\tau + \sigma}{\lambda} = \frac{\tau + \sigma}{\mu + \nu + \theta + \tau + \sigma}.$$

Again, $N$ is the number of observers converted by the demonstrator and $U$ is the time spent by the demonstrator being sick until recovery, death or abandonment.
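Let $W$ be the time between recovery and death or abandonment if the demonstrator recovered (with $W = 0$ otherwise). We assume $U$ and $W$ are independent. Because the sum of two independent Poisson variables is also Poisson, the conditional distribution of $N$ given the periods $U$ and $W$ is

$$N \mid U, W \sim \text{Poisson}(\alpha_1 U + \alpha_2 W)$$

and, for a demonstrator who recovers,

$$W \sim \text{Exponential}(\zeta),$$

where, in this case, $\zeta = \mu + \theta$.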
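Cultural fitness of the trait

Again, the expectation of the convert distribution gives the cultural fitness $\phi$ of the trait. Noting that $E(U) = 1/\lambda$ and that $E(W) = 1/\zeta$ for a demonstrator who recovers,

$$\phi = E(N) = \alpha_1 E(U) + q\, \alpha_2\, E(W \mid \text{recovery}) = \frac{\alpha_1}{\lambda} + \frac{q\, \alpha_2}{\zeta}, \qquad (3)$$

or written out in full,

$$\phi = \frac{\alpha_1}{\mu + \nu + \theta + \tau + \sigma} + \frac{(\tau + \sigma)\, \alpha_2}{(\mu + \nu + \theta + \tau + \sigma)(\mu + \theta)}.$$

Note that the cultural fitness of the trait is the sum of two components corresponding to the periods of being ill and being healthy.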
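The two-component form of the cultural fitness in this model can be illustrated by simulation. The Python sketch below is an illustration under stated assumptions rather than the authors' code: it draws the ill period from an exponential distribution with parameter λ = µ + ν + θ + τ + σ, lets the demonstrator recover with probability (τ + σ)/λ, draws the healthy period from an exponential distribution with parameter ζ = µ + θ, and averages the conditional Poisson mean α1 U + α2 W. Function names and parameter values are hypothetical.

```python
import random


def cultural_fitness_continued(alpha1, alpha2, mu, nu, theta, tau, sigma):
    """Sum of the ill-period and healthy-period components of cultural fitness."""
    lam = mu + nu + theta + tau + sigma
    zeta = mu + theta
    q = (tau + sigma) / lam           # probability of recovering
    return alpha1 / lam + q * alpha2 / zeta


def simulate_mean_converts(alpha1, alpha2, mu, nu, theta, tau, sigma,
                           runs=200000, seed=1):
    """Monte Carlo estimate of E(N) = E(alpha_1 * U + alpha_2 * W)."""
    rng = random.Random(seed)
    lam = mu + nu + theta + tau + sigma
    zeta = mu + theta
    q = (tau + sigma) / lam
    total = 0.0
    for _ in range(runs):
        u = rng.expovariate(lam)                                  # time spent ill
        w = rng.expovariate(zeta) if rng.random() < q else 0.0    # healthy time, if recovered
        total += alpha1 * u + alpha2 * w                          # conditional Poisson mean
    return total / runs


if __name__ == "__main__":
    params = (2.0, 1.0, 0.2, 0.3, 0.3, 0.7, 0.5)   # hypothetical values
    print("closed form:", cultural_fitness_continued(*params))
    print("simulated:  ", simulate_mean_converts(*params))
```

Averaging the conditional mean rather than sampling Poisson counts keeps the sketch within the standard library and reduces Monte Carlo noise; it estimates the same expectation.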

A.2 Multiple episodes of illness
In this section we turn to the most general model in which an individual can become ill more than once, and demonstration continues after recovery. Consider the distribution of the number of episodes of illness. Let $\psi$ be the probability that the process ends through death or abandonment in a given episode or in the period of recovery immediately following the episode, and let $H$ be the total number of episodes of illness experienced by an individual. This random number is distributed geometrically; that is, $P(H = i) = (1 - \psi)^{i-1} \psi$ for $i = 1, 2, 3, \ldots$.
In this model, $U$ is the total time spent by a demonstrator being ill, which is the sum of the periods of time of each episode of illness. Assuming these periods are independent and identically distributed as exponential with parameter $\lambda$, the sum, conditional on $H$, has a gamma distribution,

$$U \mid H \sim \text{Gamma}(H, 1/\lambda).$$

Similarly, the total time spent being a healthy demonstrator (after the first episode of illness) is

$$W = \begin{cases} W_{H-1} & \text{if death or abandonment occurs during the last phase of illness,} \\ W_H & \text{if death or abandonment occurs during the last phase of healthiness,} \end{cases}$$

where $W_H \sim \text{Gamma}(H, 1/\zeta)$ and $W_{H-1} \sim \text{Gamma}(H - 1, 1/\zeta)$. Finally, as before,

$$N \mid U, W \sim \text{Poisson}(\alpha_1 U + \alpha_2 W).$$

The unconditional distribution of $N$ therefore has the probability mass function given by Equation (4), in which $q = (\tau + \sigma)/\lambda$ (the probability that the demonstrator recovers from a given episode of illness), the remaining factors are the relevant probability density/mass functions, and $\Gamma(h)$ is the gamma function.

A.2.1 Cultural Fitness
Using the same approach as before, we can derive an expression for the cultural fitness of the trait in this general model. Noting that $E(H) = 1/\psi$, which is the mean number of episodes,

$$\phi = \frac{1}{\psi} \left( \frac{\alpha_1}{\lambda} + \frac{q\, \alpha_2}{\zeta} \right).$$

This expression of the cultural fitness generalises the two models introduced in the previous section. When there is only a single episode of illness, the relapse rate is zero and therefore $\psi = 1$ and $\phi$ reduces to the previous formula for cultural fitness (Equation 3). With both the relapse rate and $\alpha_2$ equal to zero, the expression simplifies to the cultural fitness value of the initial model (Equation 1). If there are multiple episodes of illness, but demonstration only takes place during periods of illness, then $\alpha_2 = 0$, the relapse rate is positive, and $\phi = \alpha_1/(\psi\lambda)$.
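The limiting cases described above can be checked against a small simulation of repeated episodes. The Python sketch below is an illustration under explicit assumptions, not the authors' implementation. It assumes a hypothetical parameter `relapse_rate` for the rate of falling ill again, assumes that healthy periods end at total rate ζ = µ + θ + `relapse_rate`, and takes the general cultural fitness to be (α1/λ + q α2/ζ)/ψ, a form chosen because it reproduces the limiting cases stated in the text. All names and values are illustrative.

```python
import random


def ending_probability(mu, theta, q, relapse_rate):
    """psi: chance the process ends during an episode or the recovery after it."""
    zeta = mu + theta + relapse_rate          # ASSUMED exit rate from the healthy state
    return (1.0 - q) + q * (mu + theta) / zeta


def general_cultural_fitness(alpha1, alpha2, lam, zeta, q, psi):
    """ASSUMED general form: (alpha_1/lambda + q*alpha_2/zeta) / psi."""
    return (alpha1 / lam + q * alpha2 / zeta) / psi


def simulate_general(alpha1, alpha2, mu, nu, theta, tau, sigma, relapse_rate,
                     runs=100000, seed=1):
    """Monte Carlo mean of alpha_1 * U + alpha_2 * W over repeated episodes."""
    rng = random.Random(seed)
    lam = mu + nu + theta + tau + sigma
    zeta = mu + theta + relapse_rate
    q = (tau + sigma) / lam
    total = 0.0
    for _ in range(runs):
        while True:
            total += alpha1 * rng.expovariate(lam)     # one episode of illness
            if rng.random() >= q:                      # died or abandoned while ill
                break
            total += alpha2 * rng.expovariate(zeta)    # one healthy period
            if rng.random() >= relapse_rate / zeta:    # died or abandoned while healthy
                break
    return total / runs


if __name__ == "__main__":
    # Hypothetical values: lambda = 2.0, zeta = 1.0, q = 0.5
    psi = ending_probability(0.2, 0.5, 0.5, 0.3)
    print("closed form:", general_cultural_fitness(1.0, 0.5, 2.0, 1.0, 0.5, psi))
    print("simulated:  ", simulate_general(1.0, 0.5, 0.2, 0.3, 0.5, 0.6, 0.4, 0.3))
```

Setting `relapse_rate` to zero gives ψ = 1 and recovers the single-episode model; setting α2 to zero as well recovers the initial model.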
The conversion parameters $\alpha_1$ and $\alpha_2$ have straightforward relationships with the cultural fitness, with both elevating the cultural fitness as they increase. The cultural fitness of the treatment is very sensitive to increasing the relapse rate and $\alpha_2$ from zero. The former change allows multiple episodes of illness while the latter allows demonstration of the treatment during healthy periods. Figures 4 and 5 show this sensitivity for two submodels: 1) a single episode of illness with continued demonstration and 2) multiple episodes of illness with disease-restricted demonstration. In the vicinity of parameter values considered here, the relapse rate and $\alpha_2$ each have a strong effect on cultural fitness by increasing it dramatically and by creating a high secondary peak at high efficacy values ($\tau$). Decreasing $\mu$ or $\nu$ lengthens the periods of being sick and being well, respectively, increasing the opportunities for spread of the treatment and therefore its cultural fitness. More complex relationships hold between cultural fitness and the recovery parameters $\sigma$, $\tau$ and the abandonment parameters $\rho$, $a$, because of the trade-offs mentioned in the main text of the article.

A.2.2 Probability of spread
The probability of spread of the behavioural trait in this general model can be considered in two ways. First, consider the probability that the trait spreads beyond the inventor regardless of whether it spreads any further. As before, this probability is given by $P_{sfi} = 1 - P(N = 0)$. Using Equation (4) and writing $L = \lambda/(\alpha_1 + \lambda)$ and $M = \zeta/(\alpha_2 + \zeta)$, $P_{sfi}$ simplifies to

$$P_{sfi} = 1 - \frac{L \left[ (1 - q) + (\psi + q - 1) M \right]}{1 - (1 - \psi) L M},$$

where again $q = (\tau + \sigma)/\lambda$. This probability of spreading from the inventor can be viewed as an upper bound on the ultimate probability of spread (see also Figure 5c).

A.2.3 Simulation of model
To study the ultimate probability of spread of the treatment in the population we use computer simulations because it is not possible to derive an explicit analytical expression for this probability in the general model. In considering extinction probabilities, a continuous time branching process is equivalent to a discrete time branching process in which all times at which observers are converted by each demonstrator are synchronised. This is possible because the distributions of converts produced by demonstrators are independent and identical, and unaffected by time. The algorithm we use for simulating the model is therefore as follows.
9. If the number of new converts is zero at the first "generation", extinction occurred. Otherwise, track the total number of practising individuals. Return to step 3 in order to repeat the procedure for each practising individual in the population. If the practising population size is beyond some (arbitrary but large) threshold, say fixation has occurred.
10. Repeat the process from step 2 many times and track the proportion of simulations in which extinction occurs.
11. Vary parameters and start again from step 1.
We assume fixation has occurred if the size of the population using the treatment reaches a threshold of 500. The probability of spread is simulated as the proportion of 1000 runs in which fixation occurs.
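As a rough illustration of this kind of simulation, the Python sketch below runs a synchronised discrete-time branching process for the initial model only, where each demonstrator produces a geometric number of converts with parameter λ/(λ + α1). It is a simplified sketch under that assumption and does not reproduce the full numbered algorithm for the general model. The fixation threshold of 500 and the 1000 runs follow the text; the function names, seed and rates are hypothetical.

```python
import math
import random


def sample_converts(rng, alpha1, lam):
    """Draw a geometric number of converts with parameter lambda/(lambda + alpha_1)."""
    s = lam / (lam + alpha1)
    u = 1.0 - rng.random()                     # uniform on (0, 1]
    return int(math.log(u) / math.log(1.0 - s))


def simulate_spread(alpha1, lam, runs=1000, threshold=500, seed=1):
    """Proportion of runs in which the practising population reaches the threshold."""
    rng = random.Random(seed)
    fixed = 0
    for _ in range(runs):
        practising = 1                         # start from the inventor
        while 0 < practising < threshold:
            # Each current demonstrator is replaced by its converts.
            practising = sum(sample_converts(rng, alpha1, lam)
                             for _ in range(practising))
        if practising >= threshold:
            fixed += 1
    return fixed / runs


if __name__ == "__main__":
    alpha1, lam = 2.0, 1.0                     # hypothetical rates
    print("simulated:", simulate_spread(alpha1, lam))
    print("analytic: ", max(0.0, 1.0 - lam / alpha1))
```

For the initial model the simulated proportion can be compared directly with the analytical probability of spread, which gives a sanity check on the threshold and the number of runs before applying the same approach to the general model.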