Equilibrium Frequency of Endosymbionts in Multiple Infections Based on the Balance between Vertical Transmission and Cytoplasmic Incompatibility

Cytoplasmic incompatibility (CI)-inducing endosymbiotic bacteria, such as Wolbachia and Cardinium, have been well studied through field data and validations on the basis of numerical simulations. However, the analytically derived equilibrium frequency of multiple infections has not yet been determined, although the equilibrium for cases of single infection has been reported. In this study, we considered the difference equation for endosymbionts using three parameters: the probability of the failure of vertical transmission (), CI strength (), and the level of host inbreeding (). To analyze this model, we particularly focused on , i.e., the frequency of host individuals completely infected with all -bacterial strains in the population. , at the equilibrium state, was analytically calculated in the cases where and is any arbitrary value. We found that can be described using two parameters: and , which is identical to . has a larger value in a system with a smaller . In addition, determines the maximum number of strains that infect a single host. Our results revealed the following: i) three parameters can be reduced to a single parameter, i.e., and ii) the threshold of the maximum number of infections is defined by , which prevents additional invasions by endosymbionts.


Introduction
Endosymbiotic bacteria such as Wolbachia and Cardinium are well known to be reproductive manipulators infceting insect cells [1,2]. Cytoplasmic incompatibility (CI) is considered to be the most common and efficient form of manipulation that can spread an infection throughout the host population. CI causes males infected with a bacterial strain reproductively incompatible with uninfected females [3]. Because of CI, infected females exhibit a relatively higher fitness than uninfected females; therefore, the number of infected individuals gradually increases and they become dominant in the host population. For example, approximately only 10% of Drosophila simulans individuals were infected with Wolbachia in the middle of the 1980s in California; however, the infection rate increased to 95% in 1993 [4,5]. In a rearing system of Encarsia pergandiella infected with Cardinium, the infection gradually spread to almost fixation within the population, irrespective of the initial rate of infection [6].
The dynamics of CI-inducing bacteria can be primarily determined by three parameters: vertical transmission efficiency, fecundity of infected females, and CI levels [7], although the dynamics may be affected by other parameters [8,9]. Vertical transmission, from a mother to her offspring, is the main route of transmission of the bacteria to other host individuals. The transmission rate is defined as the proportion of offspring infected from an infected mother, which is approximately 1 (w0:95 in most cases) [10]. Successful transmission has crucial effects on the spread and maintenance of the bacteria [11]. CI-inducing bacteria sometimes affect the fitness of the infected host diretctly through fecundity. The direction of their effect through fecundity depends on the host-bacteria combination, and the effect may be negative [12] or positive [13], or often neutral [14]. The CI level is defined as the proportion of the number of offspring died because of CI relative to the number of total offspring reproduced. In addition, the effect of CI is dependent on the host-bacteria combination, which highly varies from zero to one, i.e., from nearly neutral to complete mortality [15][16][17].
Models have been developed to study the dynamics of CIinducing bacterial symbionts, particularly in Wolbachia, using the three conventional parameters [4,18]. The long-term behavior depends on the initial frequency of bacterial infection [19]. If the initial frequency of bacterial infection is below the threshold, the frequency heads towards extinction; however, if the frequency stochastically exceeds the threshold (e.g., because of random genetic drift), then it is expected to spread to another equilibrium state where both infected and uninfected individuals exist. A higher vertical transmission rate and/or stronger CI will lead to higher infection frequency equilibrium. In most of the models on the basis of difference equations, the equilibria have rarely been analytically derived [20][21][22]. A few analytical results have been reported for single [7] and double infections [23]. Farkas & Hinow [23] analytically examined the case of hosts infected with only a single bacterial strain when two strains were present in the field. However, the symbiotic bacterial dynamics of an arbitrary number of strains remains poorly understood [7,18], which prompts the questions: ''how do the parameters determine the dynamics of infections and how many strains can infect a host population''? Experimental studies have reported multiple infections of Wolbachia in parasitic ants (nine strains) [24], byturid beetles (five strains) [25], and ambrosia beetles (five) [26]. However, whether a host population can be infected with a larger number of species is not known. Thus, the generalization of bacterial numbers is a big issue in endosymbiont studies.
In addition, the modeling of endosymbiotic dynamics in the context of mating system has not been extensively studied. CI is a phenomenon that occurs via mating, thus it is considered that the mating system should affect the dynamics of the endosymbionts. Mating systems can be categorized roughly as random mating and inbreeding. Random mating or panmictic mating is mating between any individuals in the population. In contrast, inbreeding or sib mating, is mating between a brother and a sister. Therefore, CI should be less effective under inbreeding because the mating partner should have the same infection status. Strong inbreeding has been observed in some insects, such as fig wasps and ambrosia beetles, which are known to have Wolbachia infections [26,27]. Some theoretical studies have addressed the dynamics under inbreeding [28][29][30]. However, most of these models were examined numerically rather than analytically. In addition, the scope of these models was limited in the case of single or double infections.
This study aimed to develop an analytically solvable model of the infection dynamics of CI-inducing bacteria in sib/panmictic mating systems. We derived the analytical equilibrium of single and double infections, and generalized arbitrary numbers of strain infections using three parameters (vertical transmission rate, CI levels and the level of inbreeding). Our results show that bacterial infection thresholds can be simply expressed as a function of the ratio between the failure of vertical transmission and the CI level. In addition, we derived the maximum number of bacterial strains that are capable of infecting a host population.

Model
We consider a host population infected with N endosymbiotic strains. The integer i (~0,1, . . . ,2 N {1) represents the state of each host individual, infected or uninfected with bacteria, where i is written in base 2 with N bits and each bit represents the state of infection with bacteria, i.e., 0 and 1 indicate absence and presence, respectively. For example, when i~5 ( = 101 in base 2), the host is infected with the first and third bacterial strains but uninfected with the second strain. For convenience, we introduce a binary function f (i 1 ,i 2 ), which indicates whether the set of infected bacterial strains i 1 is included in that of i 2 . If the set of i 1 is a subset of that of i 2 , f (i 1 ,i 2 )~1, otherwise f (i 1 ,i 2 )~0. For example, f (3,7)~1 because the set of strains i 2 ( = 111 in base 2) contains any strains of i 1 ( = 011 in base 2). In addition, we define n(i) as the number of strains carried by an individual of type i.
We develop an difference equation to express the dynamics of the frequencies of hosts infected with multiple bacterial strains. Let q i be the frequency of individuals infected with the bacterial set i. For simplicity, we assume that all of the bacteria have identical parameters in terms of the vertical transmission rate and CI strength among strains.
Next, we consider the formulation of CI among strains. CI results in a decreased hatching rate by a factor 1{z; therefore, the fitness of individual mating incompatibly is reduced to 1{z. The fitness of mating between a male with bacterial strains of i and a female with bacterial strains of j is generally described as f (i,j)z(1{z) 1{f (i,j) ½ . A bacterial strain can be transferred by vertical transmission from a mother to offspring at egg stage with a probability of 1{m, whereas it fails to be transferred with a probability of m. In the field, measurements of m are typically low (v0:05). An individual host infected with the bacterial set j will produce eggs with i with a probability of (1{m) n(i) m n(j){n(i) when f (i,j)~1. Note that our model assumes a diplodiploid sex determination system in the host insects, but the effects of CI can be quite different between diplodiploidy and haplodiploidy [31].
The occurrence frequency of CI depends on the mating pattern. Under random mating, CI occurs depending on the distribution of q i . The average fitness of a female with the bacterial set j is P 2 N {1 k~0 q k ff (k,j)z(1{z)(1{f (k,j))g. In contrast, under inbreeding, CI depends on m. Mating partner tends to have the same infection status as themselves because the pair are possibly produced by the same mother. CI is less frequent under inbreeding than random mating because it occurs only under following conditions: (i) a mother lacks the bacterial strain(s) because of vertical transmission failure and (ii) a father carries the strain(s) because of their successful transmission from a common grandmother. It is difficult to formulate the exact effect of CI under inbreeding because the frequency in a previous generation is needed. For simplicity, we hereafter assume that m is sufficiently small to regard the average fitness of a female with the bacterial set j as 1. Let q' be the frequency of infected individuals in the next generation and p the probability of random mating. The difference equation of q i is described as follows: where w is a normalization factor that maintains the sum of the updated host frequencies as unity. Eq.(1) can be rewritten as follows: Eq.(2) is symmetrical for p and z, which means that the effect of p on the infection dynamics is equivalent to that of z. In addition, Eq.(2) reveals that the dynamics can be described using only two parameters, m and pz. If two systems have the same value of pz, then the dynamics of these systems should be identical. If we usẽ z z:pz as the effective CI strength, thenz z should correspond to z in the systems with the equivalent dynamics under complete random mating (p~1). As a result, Eq.(1) can be rearranged to a more simple form usingz z as follows: Eq. (3) is identical to the equation developed by Frank [18], but z is substituted byz z. Frank suggested that if all the bacterial strains share the common parameters m and z, the polymorphic equilibrium solutions should be symmetric for the frequencies, i.e., the frequency of the hosts with the same number of strains should be identical. For example, if q 110 w0, then q 110~q101~q011 should be satisfied. Frank performed a reduction of Eq.(3) as follows: where Q i is the frequency of a type that carries i different strains, Frank also proved that a polymorphic equilibrium, where both infected and uninfected individuals exist, can be stably maintained only if there are host individuals infected with all of the bacterial strains in the population [14]. In other words, if the largest n(i) in the population is less than N, polymorphism should be lost and some bacterial strains will eventually fail to achieve infection until stable maintenance is possible ( Figure 1). Frank stated that the presence or absence of individuals infected with all bacterial strains in a population determines the fate of the population. Therefore, we focus on the dynamics of the frequency of hosts infected with all of the bacterial strains in an N-strain population, Q N .
Next, we analytically derive the equilibrium for the system [Eq.(4)]. We will first focus on simple cases with one or two bacterial strains in the population before generalizing the results to an arbitrary number of strains.
there are a maximum of three fixed points in the system: where a:m=z z. The phase portrait of the system is shown in Figure 2a.

Double infection
In the case of N~2, the system [Eq.(4)] can be represented as follows: where w~Q 2 z2Q 1 1{z z Q 1 zQ 2 ð Þ ½ zQ 0 1{z z Q 2 z2Q 1 ð Þ ½ and the total frequency Q 0 z2Q 1 zQ 2 equals unity. This system has five fixed points as follows:     completely infected host can exist at N~2 is narrower than that when N~1. As stated in the previous section, we assume that m is a relatively small parameter. By substituting zero for m as a zero-order approximation, Q Ã 2 at the stable fixed point and the critical value of a (2) c are given as follows: The numerical simulation confirmed that this approximation is adequate and that m had less effcet on Q Ã 2 and a (2) c than a (Figure 3).

N-strain infection
Finally, we calculated the fixed points of Eq.(4) for arbitrary N. We assumed that Q n &0, for 0ƒnƒN{2 in an equilibrium state. Numerical simulations supported this assumption ( Figure S1). For example, when N~10,z z~0:4 and m~0:025, 95.5% of the population possessed N or N{1 bacterial strains. The dominance of highly infected hosts was enhanced by the strain number N. Thus, this assumption is more plausible when the system has a greater N and a lower a. The reduced system is described as follows: where Q N zNQ N{1 &1, w&Q N zNQ N{1 1{z z(Q N z(N{1) ½ Q N{1 ). Because this is a one-dimensional system, the fixed points can be found easily, Q Ã N~0 ,
If m was regarded as a relatively small parameter, (1{m) N &1{Nm. Then, which shows that the frequency of completely infected hosts depends only on a when m is sufficiently small. The difference in Q N between one generation, DQ N , is represented as follows: If N §2 and av N{1 N 2 , the order of the three fixed points is Q Ã{ vQ Ã0 vQ Ãz and Q Ãz is a stable equilibrium. Figure 3 compares Q Ãz at various a with the results calculated by numerical simulations for N~5 and 10. The approximated solution of Q Ãz was generally consistent with the numerical results. In particular, the agreement was better when a was sufficiently small.
Eq. (11) indicates that an exchange of stability, so-called transcritical bifurcation, should occur between Q Ãz and Q Ã0 at When aw N{1 N 2 , the order of fixed points is Q Ã{ vQ Ãz vQ Ã0 and the stable fixed point is Q Ã0 . However, we did not observe a transcritical bifurcation for any parameter set of N,z z and m. Instead, a discrete jump into 0 was observed before Q Ã N reached to 0, which implied that a saddle-node bifurcation also occured at N §3 (Figure 3). The reduced system did not replicate the disappearance of the stable fixed point; however, the analytical solution of Q Ã N agreed with the numerical results until the discrete jump occured. In addition, The numerical simulation confirmed that when the bacterial strain number N was sufficiently high, a discrete jump occured around Q N~0 . Therefore, with a large N limit, the value of a at which the transcritical bifurcation occurs is equal to a at the saddle-node bifurcation. Thus, we obtained the critical a (N) c for a sufficiently large N as follows: We compared the analytically derived a c with numerical simulations. Figure   . For N §3, we plotted the critical a derived from Eq. (12). For larger values of N, the analytical results approached the threshold derived by numerical simulation. The inverse function of Eq.(12) defines the maximum number of bacterial strains that can infect a population, N c , which is given as follows: N c is approximately proportional to a {1 . Therefore, it has a larger value with a lower value of a.

Discussion
In this study, we demonstrated the analytic equilibrium solutions of the frequencies of individuals infected with all the bacterial strains in a population with single, double and arbitrary N infections. Our results provide of a qualitative insight into the symbiotant bacterial dynamics, in constrast to recently developed models that quantitatively simulate specific experimental results [4,32].
Our model used a parameter p, the level of inbreeding, from the perspective of host behavior. We revealed that p has a completely negative effect on CI. Endosymbionts manipulates host reproduction to maintain their own infections in host populations. In contrast, host insects can reduce the prevalence of endosymbiont infections by inbreeding. This result is consistent with previous numerical simulations of inbreeding [28]. Engelstadter et al. claimed that the infection rate decreases with increasingly inbred hosts because uninfected females increasingly mate with uninfected males, which leads to fewer incompatible matings. Our analytic calculation determined a linear relationshipz z~pz, which clearly indicates the negative effect of inbreeding on the CI strength. However, this simple relationship does not appear to hold in some cases of stochastic models. Genetic drift in stochastic island model might have effects on inbreeding [29]. Our result is also consistent with the previous analytical result reported by Dannowski et al. [30]. This model focused on double infections by male-killing bacterial strains and analytically showed that higher level of host inbreeding leads to lower frequency of bacterial infections. Although their result was limited in the case of double infections, our result is applicable to arbitrary number of CI-inducing bacterial strains.
The analytically derived equilibrium depends mainly on one parameter, a. a is a novel index that we defined in this study, which is the ratio of the probability of vertical transmission failure (m) relative to the effective CI level (z z). Previous studies have shown that these parameters are involved with the infection dynamics. Furthermore we demonstrated the simple relationship between these parameters, i.e., the bacterial infection dynamics are determined simply by a balance of both. The equilibrium frequencies of the fully infected host are at a higher rate when a is lower (Figure 3). In this case, CI-inducing bacteria can select two alternative strategies to maintain and spread themselves: increasing the vertical transmission efficiency [16] and/or increasing the CI strength. There have been no reports of the differences among these strategies. Figure 4 presents the asymmetric relationship between m andz z. When m was higher than approximately 0.2, bacterial infections could not be sustained regardless of the value ofz z. Thus, the vertical transmission rate has a threshold value for infection. In contrast, bacterial infections occur with any value ofz z provided an appropriate value of m is selected on the basis onz z. Thus, the vertical transmission rate determines the infection state more strongly than CI.
Eqs (12) and (13)  Provided ava (N1) c is satisfied, the N 1 -infecting population will be maintained. Otherwise, the infection cannot be maintained. For example, a population where a~0:2 can maintain three strains at most because a (4) c (^0:18)vava (3) c (^0:22). If the population actually contains the maximum number of strains, the host population is ''saturated'' with bacteria, i.e. other strains will fail to invade the host population. The parameters are expected to have the same value in our model. Therefore, the frequency of the rarest strain necessarily declines to zero. A higher a value than that of other strains may be needed for a new strain to invade the saturated hosts.
Our theoretical results suggest that the equilibrium frequency of completely infected hosts decreases as N increases (Figure 3). In addition, the range of a where bacteria can be maintained becomes narrower as N increases ( Figure 5). The severe conditions for multiple infections are caused by the increased possibility of vertical transmission failure the offspring by any strains. X -strains that infect a individual can successfully transmit all the strains to the offspring with a lower probability than X {1 strains. Consequently, these results support a noble idea of Wolbachiainduced speciation. Suppose that a is enough large to maintain single infection but not double infection (0.207-0.25), and endosymbiotic strains, A and B strains, infect some insects in two different geographic regions respectively. The infections in both regions will spread until a hybrid zone is produced. Then, horizontal transfer makes double-infected individuals of A and B strains. However, the double infection must be lost because a is too low to maintain the both infections. Therefore, the hybrid zone will be unstable and post-zygotic speciation will be occurred between A-infected and B-infected individuals.
To confirm that our results were consistent with experimental data, we compared the predicted equilibria with the values of CIinducing Wolbachia and Cardinium obtained using natural populations or an artificial line ( Figure 6, Table S1). No data were available for triple or higher multiple infection. Therefore, only five single infections and one double infection were used for reference. No data were available on the inbreeding frequency of the insects studied; hence, p values of all were assumed to be zero, i.e. completely random mating. The experimental values were close to the predicted lines. Thus, our results can be used to estimate the frequency of infected hosts at equilibrium using the parameters (m and z) as well as for estimating the parameters that are often difficult to be measured.
In this study, we aimed to provide a qualitative outline of the dynamics of multiple infections by introducing an analytically solvable model. By ignoring quantitative accuracy, our model sheds light on the mathematical structure related to the multiple infection dynamics. However, we introduced several assumptions to develop a solvable model, which should be eliminated to allow more quantitative comparisons using experimental data. First, we assumed that a common transmission rate or CI level were shared among bacterial strains. If the bacterial parameters are heterogeneous, a bacterial strain with a lower a is more likely to be maintained than other strains. Indeed, some experiments indicate that the CI range is too broad to be regarded as the same value among different strains. For example, the CI levels of two Wolbachia strains (wBruCon and wBruOri) that infect Callosobruchus chinensis are variable [33]. The egg-hatching rate between doubleinfected males and wBruCon-infected females declined to 0 whreas that between double-infected males and wBruOri-infected females decreased to 0.62 due to CI. Another example is Drosophila simulans infected with four Wolbachia strains in Madagascar. The CI levels of the hosts vary between 0 and 1, depending on specific crosses [34]. Second, the effect of CI was assumed to be bidirectional for any combination of strains. However, it is known that some strains can compensate for the effect of CI by substituting for an uninfected strain [35]. To address more complex CI patterns, case-by-case models should be developed. Third, we ignored the bacterial density in a host by categorizing the hosts as infected or uninfected. However, it has been reported that the number of bacteria in a host insect depends on the specific combination of bacterial strains in multiple infections [36]. Thus, the bacterial density might affect the transmission rate and CI strength. The number of bacteria in the hosts should be used as a variable in individual-based models to examine the effects of bacterial density. Forth, we assumed that the geometrical structure of infected hosts was uniform. Thus, we did not consider the desity of the hosts. The crowding of hosts might occur by chance locally, which may cause increasing numbers of bacterial strains to be stably maintained. This effect should also be examined in an individual-based model. Finally, we roughly approximated the frequencies of hosts infected with 0 to n{2 strains as being equals to 0 to calculate the equilibria for N arbitrary strains. A more rigorous approximation method would reduce the error between the numerical and analytical results. The elimination of these assumptions needs to be addressed in future studies.