A New Model and Method for Understanding Wolbachia-Induced Cytoplasmic Incompatibility

Wolbachia are intracellular bacteria transmitted almost exclusively vertically through eggs. In response to this mode of transmission, Wolbachia strategically manipulate their insect hosts' reproduction. In the most common manipulation type, cytoplasmic incompatibility, infected males can only mate successfully with infected females, but infected females can mate successfully with all males. The mechanism of cytoplasmic incompatibility is unknown; theoretical and empirical findings need to converge to broaden our understanding of this phenomenon. For this purpose, two prominent models have been proposed: the mistiming-model and the lock-key-model. The former states that Wolbachia manipulate sperm of infected males to induce a fatal delay of the male pronucleus during the first embryonic division, but that the bacteria can compensate for the delay by slowing down mitosis in fertilized eggs. The latter states that Wolbachia deposit damaging “locks” on sperm DNA of infected males, but can also provide matching “keys” in infected eggs to undo the damage. The lock-key-model, however, needs to assume a large number of locks and keys to explain all existing incompatibility patterns. The mistiming-model requires fewer assumptions but has been contradicted by empirical results. We therefore expand the mistiming-model by one quantitative dimension to create the new, so-called goalkeeper-model. Using a method based on formal logic, we show that both the lock-key-model and the goalkeeper-model are consistent with existing data. Compared to the lock-key-model, however, the goalkeeper-model assumes only two factors and provides an idea of the evolutionary emergence of cytoplasmic incompatibility. Available cytological evidence suggests that the hypothesized second factor of the goalkeeper-model may indeed exist. Finally, we suggest empirical tests that would make it possible to distinguish between the models.
Generalizing our results might prove interesting for the study of the mechanism and evolution of other host-parasite interactions.


Introduction
In the first section, we describe in more detail the general framework from which the diverse models were derived. In the second section, we explain the goalkeeper-model more carefully; further statements are derived and all proofs are given. In the third section, we explain the lock-key-model in more detail; further statements are derived, all proofs are given, and we show a summary comparison of the two models.
According to the available data, both models could be correct. In the fourth section, we present the mistiming-model using our framework; we argue that even with modifications, this model is problematic with regard to certain data. In the fifth section, we describe in more detail how the predictions with regard to CI levels were derived and the data used to test the predictions.

1 Formalism of the models of CI occurrence
We develop a formalism for models of CI occurrence and introduce a number of terms and definitions:
• $F_{CI}(i; ii; iii; iv)$ is a model of CI mechanism. It has four distinct attributes (see below).
A model of CI mechanisms $F_{CI}(i; ii; iii; iv)$ can have the following attributes:
• i: Shows the number of mod/resc factors included in the model; $i \in \mathbb{N}$.
• ii: Shows whether mod and resc are quantitatively identical for their corresponding factor: if $\forall a\, \forall x, y, z, \ldots : x^{mod}_a = x^{resc}_a,\ y^{mod}_a = y^{resc}_a, \ldots$, then ii = 0; otherwise ii = 1.
If mod and resc are identical, then they may be expressed by the same genes or controlled by the same promoter. Otherwise, they probably are not.
Attribute iii examines the importance of quantitative differences. If factors are quasi-binary, then the presence or absence of the corresponding resc factor determines CI, irrespective of quantities. This may be the case when resc factor quantities are always much greater than mod factor quantities.
Of course, factor quantities could also take neither of the proposed forms. If they were positive integers, for example, then iii would be between 0 and 1.
The latter case may be due to Wolbachia contributions exceeding those of the host, so that net host contributions, even if present, are negligible.
[...] $F_{CI}(i; ii; iii; iv)$ is strictly more parsimonious than a model [...] $F_{CI}$ has a higher falsifiability than $F^{(2)}_{CI}$; it is a special case of $F^{(2)}_{CI}$. Simply put, if a statement can be derived with $i = n$ factors, it can also be derived with $i = n + 1$ factors; identical mod and resc in females and males are a special case of them not necessarily being identical; quasi-binary factor quantities are a special case of factors being positive real numbers; and zero net host contribution is a special case of any non-negative net host contribution. Of two unfalsified models, the one with higher falsifiability should be preferred. However, no statement on falsifiability can be made if condition (A) is not met. For example, one cannot say whether the goalkeeper-model or the lock-key-model has a higher falsifiability than the other.

2 Goalkeeper-model

Presentation of the goalkeeper-model
The goalkeeper-model is an $F_{CI}(2; 0; 1; 1)$ model of CI: it takes into account two factors $x, y \in \mathbb{R}^+$; mod and resc functions are quantitatively identical ($x^{mod} = x^{resc}$, $y^{mod} = y^{resc}$), and there is a net host contribution.
Here is how the model treats the four basic crosses that are possible:
• In the control cross (both parents uninfected), incompatibility does not occur if $x^{resc}$ [...]. Since these conditions are necessary for uninfected hosts to rescue themselves, we suppose them to always be true.
• In the reverse cross (only the female infected), CI does not occur if $x_h + x_a \geq 0$ and $y_h + y_a \geq 0$.
Since the left-hand quantities are always positive or zero, their sums are also positive or zero. Therefore, these conditions hold irrespective of the Wolbachia strain.
[...] [1]. If we assume that host 1 produces $x_{h1}$ and $y_{h1}$, and host 2 produces $x_{h2}$ and $y_{h2}$, and further that quantities produced by host 1 are greater than those produced by host 2, then the last condition is met more easily in host 1 than in host 2. Therefore, the Wolbachia strain may induce CI in host 2 but not in host 1 (see also the proof in Table 16).
A goalkeeper-model without host contribution is possible (an $F_{CI}(2; 0; 1; 0)$ model) and would be more parsimonious than the model presented so far. However, such a model would allow neither for intransitivity (table 12) nor for the additional statements M, N, and O (tables 16, 17, and 18). We therefore believe that host contribution as conceptualized here is not only necessary but also provides a useful framework to understand CI-related parasite-host interactions (as studied, e.g., by [2]).
If a host is infected by several different Wolbachia strains simultaneously, we assume total factor quantity to be the sum of each individual strain's factors, i.e. without synergistic or antagonistic effects. This could be an oversimplification since theoretical considerations show that Wolbachia strains should reduce their replication rate when other strains are present [3]. However, empirical work showed that Wolbachia density is not strongly affected by the presence of additional Wolbachia strains [4,5], although a notable exception has been found [6]. Therefore, before relying too strictly on additivity, it should be tested whether Wolbachia density is affected in multiple infections and, if it is, CI levels should be corrected for density effects. Still, as few exceptions to additivity have been observed, our model assumes for reasons of simplicity that factors act additively in multiple infections.

Formalism
We symbolize non-infection with the zero element, 0. Further, we use the function R (rescue) to write “Wolbachia strain a rescues Wolbachia strain b” simply as $aRb$. For two hosts $h_i$ and $h_j$, “a rescues b in host $h_i$” is written as $aR_i b$, whereas “a rescues b in host $h_j$” is written as $aR_j b$.
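The rescue relation and the additivity assumption can be illustrated with a minimal Python sketch. It rests on an assumption that is not spelled out in the surviving text: rescue is taken to succeed when, for each of the two factors, the net host contribution plus the summed quantities of the female's strains is at least the summed quantities of the male's strains (a generalization of the reverse-cross condition above). The names `Factors`, `total`, and `rescues`, and all numerical values, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Factors:
    """Quantities of the two factors x and y (non-negative reals)."""
    x: float
    y: float

    def __add__(self, other: "Factors") -> "Factors":
        return Factors(self.x + other.x, self.y + other.y)


ZERO = Factors(0.0, 0.0)  # non-infection, the zero element


def total(strains) -> Factors:
    """Sum factor quantities over all co-infecting strains (additivity)."""
    result = ZERO
    for strain in strains:
        result = result + strain
    return result


def rescues(female_strains, male_strains, host: Factors) -> bool:
    """ASSUMED condition: for each factor, net host contribution plus the
    female's strains must be at least the male's strains."""
    egg = host + total(female_strains)
    sperm = total(male_strains)
    return egg.x >= sperm.x and egg.y >= sperm.y


# Hypothetical strains and hosts, for illustration only.
a, b, c = Factors(3.0, 1.0), Factors(1.0, 3.0), Factors(1.0, 1.0)
host1, host2 = Factors(1.0, 1.0), Factors(0.2, 0.2)

print(rescues([a], [], host2))                             # reverse cross
print(rescues([a], [b], host2), rescues([b], [a], host2))  # a and b, both directions
print(rescues([], [c], host1), rescues([], [c], host2))    # same strain, two hosts
print(rescues([a, b], [a], host2))                         # double-infected female
```

Under these assumed values, strains a and b fail to rescue each other in either direction, and strain c induces CI in the host with the smaller contribution only, which is the kind of host dependence described in the presentation of the model.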

Formalism
Since CI is determined by whether the set of all locks can be matched by the set of all keys and not by quantitative questions, we use a different language than in the goalkeeper-model. We define the set of all locks of a Wolbachia strain a as $L_a$ and the set of all keys as $K_a$. The locks and keys are called $x_1, x_2, \ldots$.
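The set-based language can be illustrated with a short Python sketch. It assumes, as a reading of the sentence above rather than a definition taken from the source, that rescue succeeds exactly when every lock carried by the male's strains is matched by a key among the female's strains, and that multiple infections pool their locks and keys. The names `Strain` and `lock_key_rescues` and the three example strains are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Strain:
    locks: frozenset  # L_a, the set of all locks of the strain
    keys: frozenset   # K_a, the set of all keys of the strain


def lock_key_rescues(female_strains, male_strains) -> bool:
    """ASSUMED condition: all locks on the sperm are matched by keys in the egg."""
    keys = frozenset().union(*(s.keys for s in female_strains))
    locks = frozenset().union(*(s.locks for s in male_strains))
    return locks <= keys  # subset test


# Hypothetical strains, for illustration only.
a = Strain(locks=frozenset({"x1"}), keys=frozenset({"x1"}))
b = Strain(locks=frozenset({"x2"}), keys=frozenset({"x2"}))
c = Strain(locks=frozenset({"x3"}), keys=frozenset({"x2", "x3"}))

print(lock_key_rescues([a], [b]))        # aRb
print(lock_key_rescues([a, c], [b, c]))  # acRbc
print(lock_key_rescues([], [a]))         # uninfected female, infected male
```

With these example strains, the double infection ac rescues bc although a alone does not rescue b, because the extra key comes from c. This only shows that such a pattern is expressible with sets under the stated assumption.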

3.4 Derivations: lock-key-model II
[...] $acRbc \wedge \neg(aRb)$; contradiction; (1) and (7); $\forall X : \neg(X \wedge \neg X)$
(9) $\exists a, b, c : \neg(acRbc \Rightarrow aRb)$; (1) is false, its negation is true
[...] (2) and (3); $\forall X : \neg(X \wedge \neg X)$
(5) $\exists a, b, c : \neg(aRb \Leftrightarrow acRbc)$; (1) is false, its negation is true

Why the mistiming-model is most likely invalid
The mistiming-model assumes Wolbachia to manipulate sperm content to delay the male pronucleus during the first cell cycle after fertilization. Rescue restores synchrony by applying the same manipulation to the rest of the ovum, thus delaying it by the same degree [7,8]. The mistiming-model is an $F_{CI}(1; 0; 1; 1)$ model, similar to the goalkeeper-model but with only one factor (we assume a net host contribution to make the model harder to falsify). Definitions are given in table 39.
The mistiming-model is a special case of the goalkeeper-model and thus has more predictive power.
However, the mistiming-model cannot account for bidirectional incompatibility (for the proof, see Table 40 on page 21).
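The reason a single factor cannot produce bidirectional incompatibility can be sketched in a few lines. This sketch is not a reproduction of the proof in Table 40. It assumes the one-factor analogue of the reverse-cross condition of the goalkeeper-model, namely that strain a rescues strain b if and only if $x_h + x_a \geq x_b$, with net host contribution $x_h \geq 0$.

```latex
\begin{align*}
\neg(aRb) &\iff x_b > x_h + x_a \\
\neg(bRa) &\iff x_a > x_h + x_b \\
\text{adding both inequalities:}\quad
x_a + x_b &> 2x_h + x_a + x_b \iff 0 > x_h
\end{align*}
```

The last line contradicts $x_h \geq 0$, so under this assumed condition at least one of the two crosses must be compatible.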
Modifying the mistiming-model by assuming different resc factors in the ovum to bind to paternal chromosomes and further slow down their progression allows bidirectional incompatibility to occur [9].
However, this enlarged model can no longer explain unidirectional incompatibility between different Wolbachia strains because sperm would always be further delayed upon entering an ovum containing another strain. Assuming different binding sites for different factors can solve this new problem ([9]; see figure 1) but the model loses predictive power as a consequence.
The modified mistiming-model relies on the assumption that the sperm DNA can be further modified after fertilization. This implies that it is also modified by the factors contributed naturally by the mother, the host contribution to mod. As a consequence, we will momentarily drop the assumption that there is a net host contribution. The mistiming-model is thus altered to be an $F_{CI}(\infty; 0; 1; 0)$ model of CI. The new definitions of the model can be found in Table 41 [...] do not bind to. In addition, b must not produce a higher quantity than a of any factor at a common binding site, leading to definition Y'. This modified mistiming-model can account for both bidirectional incompatibility and unidirectional incompatibility (figure 1, formal proof not shown).
However, in contrast to both the goalkeeper-model and the lock-key-model, the modified mistiming-model predicts transitivity (i.e. when strain a rescues strain b, and strain b rescues strain c, then strain a must also rescue strain c; statement I in table 42). Such transitivity is contradicted by empirical findings [1]. Again, we could add further ad hoc hypotheses, for example by making the mistiming-model an $F_{CI}(\infty; 0; 1; 1)$ model (the same as the previous but with net host contributions). However, this model would be strictly less parsimonious than the goalkeeper-model. We thus prefer to use the goalkeeper-model or the lock-key-model instead to explain CI.

5 Predictions from the norm approach to CI levels in the goalkeeper-model
We define equivalent crosses as those crosses in which the number of excess Wolbachia strains in the female or male are equal. For example, in the cross of uninfected females with wHa-infected males, the wHa strain is in excess in males (it is not present in the females). Likewise, in the cross of wNo-infected females with wHa and wNo double-infected males, the wHa strain is in excess in males (the wNo strain is present in both sexes, but the wHa strain is present only in males). That these crosses should be equivalent can be derived from statement K (table 14). As more precise statistical analyses were not possible from the given data, we treated CI levels as similar if the difference between the CI levels of all equivalent crosses was less than 10%.
Data can also be pooled by publication: for [10], 18 predictions were correct and 11 predictions were false (not significant); for [11], 5 predictions were correct and 2 predictions were false (not significant); for [12], 22 predictions were correct and 2 predictions were false (p < 0.001, one-tailed binomial test, 1 draw excluded).
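The one-tailed binomial tests on the pooled counts can be sketched as follows. The sketch assumes that "chance" means a success probability of 0.5 per prediction and that draws are left out of the number of trials; the function name `binomial_upper_tail` is hypothetical, and the original analysis may have been carried out differently.

```python
from math import comb


def binomial_upper_tail(successes: int, trials: int, p: float = 0.5) -> float:
    """P(X >= successes) for X ~ Binomial(trials, p): a one-tailed p-value."""
    return sum(
        comb(trials, k) * p**k * (1 - p) ** (trials - k)
        for k in range(successes, trials + 1)
    )


# (correct, false) counts per publication as given in the text; draws excluded.
counts = {"[10]": (18, 11), "[11]": (5, 2), "[12]": (22, 2)}

for reference, (correct, false) in counts.items():
    p_value = binomial_upper_tail(correct, correct + false)
    print(reference, f"p = {p_value:.6f}")
```

Under these assumptions, only the counts for [12] fall below 0.001, while those for [10] and [11] stay above the conventional 0.05 threshold, in line with the significance statements in the text.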

Discussion
We analyzed three predictions made by the norm approach to CI levels within the framework of the goalkeeper-model: (1) more Wolbachia strains in females should decrease CI levels, (2) more Wolbachia strains in males should increase CI levels, and (3) equivalent crosses should yield similar CI levels.
Data from three publications generated 60 predictions of which 45 were correct and 14 false (1 draw).
Predictions of type 1 were correct significantly more often than expected by chance; predictions of types 2 and 3 were correct more often than not, but the differences were not significant, possibly owing to small sample sizes.
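The notion of equivalent crosses behind predictions of type 3 can be sketched in Python. The sketch makes two assumptions that go beyond the text: crosses are treated as equivalent when the same strains are in excess in each sex (the text speaks of the number of excess strains, while its example uses the same strain), and the 10% threshold is read as a difference in percentage points. The function names and the CI levels in the usage lines are hypothetical.

```python
def excess_strains(female, male):
    """Return (strains only in the female, strains only in the male)."""
    female, male = frozenset(female), frozenset(male)
    return female - male, male - female


def are_equivalent(cross_1, cross_2) -> bool:
    """ASSUMED reading: equivalent crosses share the same excess strains."""
    return excess_strains(*cross_1) == excess_strains(*cross_2)


def similar_ci(level_1: float, level_2: float, threshold: float = 10.0) -> bool:
    """CI levels (in percent) count as similar if they differ by less than threshold."""
    return abs(level_1 - level_2) < threshold


# The two crosses from the text: each is (female strains, male strains).
cross_1 = (set(), {"wHa"})
cross_2 = ({"wNo"}, {"wHa", "wNo"})

print(are_equivalent(cross_1, cross_2))
print(similar_ci(80.0, 86.0))  # hypothetical CI levels, not data from the study
```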
The norm approach can further be corroborated by looking at the results from [13]. By using predic- [...]
Effects of multiple infections have also been studied in the flower bug Orius strigicollis [6]. The authors' results showed that, for the two cases that were statistically significant, CI levels in males infected with two Wolbachia strains were not higher than in single infections. While this result stands in contrast to our predictions of type 2, the authors provide a likely explanation for this unexpected finding.
When they tested for Wolbachia density in the host, they found reduced densities in hosts with multiple infections. Decreased density probably leads to decreased production of mod factors and thus to lower CI levels. As a result, even though other empirical studies did not find a similar effect of multiple infections on Wolbachia density [4,5], one may have to correct for possible density reductions when making tests like those presented in this section. Conversely, other aspects of the work on Orius [6] are in support of our predictions. Specifically, two statistically significant examples where more Wolbachia strains in females decrease CI levels are in accord with our predictions of type 1. Moreover, the fact that eight of ten possible comparisons of equivalent crosses did not show significant differences is in line with our predictions of type 3. We caution, however, that trying not to find significant differences may lead to false positive results.