Intracellular signaling in proto-eukaryotes evolves to alleviate regulatory conflicts of endosymbiosis

The complex eukaryotic cell resulted from a merger between simpler prokaryotic cells, yet the role of the mitochondrial endosymbiosis with respect to other eukaryotic innovations has remained under dispute. To investigate how the regulatory challenges associated with the endosymbiotic state impacted genome and network evolution during eukaryogenesis, we study a constructive computational model where two simple cells are forced into an obligate endosymbiosis. Across multiple in silico evolutionary replicates, we observe the emergence of different mechanisms for the coordination of host and symbiont cell cycles, stabilizing the endosymbiotic relationship. In most cases, coordination is implicit, without signaling between host and symbiont. Signaling only evolves when there is leakage of regulatory products between host and symbiont. In the fittest evolutionary replicate, the host has taken full control of the symbiont cell cycle through signaling, mimicking the regulatory dominance of the nucleus over the mitochondrion that evolved during eukaryogenesis.

These effects can be explained by the reduced symbiont numbers (b,f) which holobionts evolved to limit the effective leakage and transfer rates from symbiont to host. The impact of leakage is most negative on holobionts initialized with identical host and symbiont genomes (a-d) compared to holobionts initialized with distinct host and symbiont genomes (e-h). Standard refers to the experiment without leakage and transfer (see 2).
cell-cycles. Thus, signaling only emerges when relatively primitive hosts and symbionts (i.e. those not adapted to poor nutrient conditions) are exposed to product leakage from the start.
Holobiont populations remain smaller in the presence of product leakage compared to the absence of product leakage, even after having evolved for t = 2 · 10^6 timesteps (Fig Aa,e). The main adaptation to product leakage, and to a lesser extent to gene transfer, is to limit symbiont number (Fig Ab,f), reducing the effective rates of leakage and transfer.
Symbiont numbers are limited by increasing the cell-cycle duration of the symbiont relative to the host (2), such that symbiont numbers are always diluted over multiple holobiont generations and are kept just above the bare minimum by holobiont-level selection. A slow symbiont cell cycle relative to that of the host relaxes selection for short symbiont genomes. Consequently, symbiont genomes expand more in the presence of leakage, and even grow bigger than host genomes (Fig Ac,d,g,h). Interestingly, the small populations of symbionts with large genomes inside each holobiont are more sensitive to Muller's ratchet (2). In particular, the formation and modification of regulatory genes on the symbiont genome is a hazard because these may subsequently leak or transfer to the host genome.
The impact of leakage and transfer is largest when evolution is started with identical host and symbiont genomes (Fig A), i.e. similar to the situation in the experiments described in the main text. When host and symbiont genomes start out identical, genes retain some of their ancestral overlap, resulting in the strongest molecular interference.
The average Hamming distance between the binding sequences of the g5 copies in host and symbiont after t = 2 · 10^6 timesteps is d = 1.78 (N = 9) for the replicates starting with identical host and symbiont and d = 6.29 (N = 21) for the replicates starting with different host and symbiont. This means that in replicates starting with identical host and symbiont, binding sites for g5 in the symbiont remain very sensitive to the host version of g5 if it is leaked to the symbiont, and vice versa. The interference persists for gene g5 in particular because it targets many binding sites which, as we previously showed, constrains the evolution of its binding sequence (see "Gene family analysis" in Appendix 1 of 1), in line with the evolutionary behavior of real transcription factors (3). Apparently, hosts and symbionts are unable to modify their gene regulatory network and to avoid the effects of leakage at the level of gene regulation.
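The Hamming distance used above can be illustrated with a minimal sketch. It assumes that binding sequences are represented as equal-length strings; the example sequences and the function names are hypothetical and are not taken from the model.

```python
def hamming_distance(seq_a: str, seq_b: str) -> int:
    """Number of positions at which two equal-length sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have equal length")
    return sum(a != b for a, b in zip(seq_a, seq_b))


def mean_hamming_distance(pairs) -> float:
    """Average distance over (host sequence, symbiont sequence) pairs."""
    pairs = list(pairs)
    return sum(hamming_distance(h, s) for h, s in pairs) / len(pairs)


# Hypothetical host/symbiont binding sequences, one pair per replicate.
example_pairs = [("10110010", "10110011"), ("11100010", "11100010")]
print(mean_hamming_distance(example_pairs))
```

A small average distance means that the host and symbiont copies of a gene still recognize nearly the same binding sites, which is the situation in which leaked products interfere most strongly.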

Text B. Symbiont control strategies
To understand how the different control mechanisms achieve stabilization of symbiont numbers, we tracked individual cells growing at intermediate nutrient conditions (n_influx = 30), either through their normal division process or by enforcing symmetric distribution of symbionts.
By analogy to the literature on cell size control (e.g. 4; 5; 6), wherein cell behavior can be analyzed by correlating the cell volumes between two consecutive cell cycles, we here compare the symbiont number right after division between two consecutive cell cycles (Fig B). Using this approach, the four different symbiont control strategies could be grouped into three different phenomenological control behaviors as mentioned in the main text.
Host control ensures replication of all symbionts during the holobiont cell cycle, suggesting that deviations in symbiont numbers are propagated to subsequent generations. In line with this description, there is a high correlation between symbiont numbers of consecutive generations (Fig Ba). Furthermore, the clonal growth experiments (i.e. Fig 4 in main text) show that holobiont deaths are much more frequent with host control than in any of the other control mechanisms (effective rate of 0.00843 in Q4 with host control versus 0.00361 for Q3 with bi-directional control, 0.00282 for QHS10 with symbiont control and 0.00256 for P9 with implicit control), indicating that holobiont-level selection is important for stabilizing symbiont numbers in the population. At the same time, births are also more frequent with host control (effective rate of 0.0225 in Q4 versus 0.0094 for Q3, 0.0078 for QHS10 and 0.0114 for P9), indicating that holobiont-level selection acts rapidly and without detriment to the population.
In contrast, bi-directional control and symbiont control establish division checkpoints that stall the cell cycle until sufficiently many symbionts are present. In line with this mechanism, the correlation between symbiont numbers in consecutive generations is small (Fig Bb,c). Strikingly, the removal of stochasticity at cell division, i.e. by enforcing symmetric symbiont distribution, makes the control behaviors of host control on one hand, and of bi-directional control and symbiont control on the other hand, even clearer. Finally, implicit control constitutes a weak sizer that takes several generations to correct deviations in symbiont number. For this reason, we do not find such a clear sizer signal as with bi-directional control or symbiont control (Fig Bd). In particular, the impact of stochasticity in symbiont number at division is relatively small for holobionts with high symbiont number, such that the correlation only decreases very slightly when symmetric distribution is enforced. Still, when we focus on poor nutrient conditions where holobionts have few symbionts such that symmetric distribution has a large impact, we retrieve the sizer signature that was also found for bi-directional control and symbiont control: without stochasticity at cell division, the correlation between symbiont numbers in consecutive cell
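The three phenomenological behaviors described in this section can be illustrated with a toy sketch. It is not the model used in the paper: the update rules, the parameters `target` and `relaxation`, and the function names are all assumptions chosen only to reproduce the qualitative signatures. The sketch reports the least-squares slope of the next-generation symbiont number against the current one rather than a correlation coefficient, because the latter is undefined when a perfect checkpoint gives every daughter the same number.

```python
import random
from statistics import mean


def next_birth_number(n, strategy, target=10, relaxation=0.3, rng=None):
    """Symbiont number at the start of the next cell cycle, given n at birth.

    All rules and parameter values are hypothetical illustrations.
    """
    if strategy == "host":
        # Every symbiont replicates exactly once per holobiont cell cycle.
        before_division = 2 * n
    elif strategy == "checkpoint":
        # Division stalls until 2 * target symbionts are present.
        before_division = max(n, 2 * target)
    elif strategy == "implicit":
        # Weak sizer: only a fraction of the deviation is corrected per cycle.
        before_division = 2 * (n + relaxation * (target - n))
    else:
        raise ValueError(f"unknown strategy: {strategy}")

    if rng is None:
        # Enforced symmetric distribution of symbionts over the daughters.
        return before_division / 2
    # Stochastic division: each symbiont ends up in the tracked daughter
    # with probability 0.5.
    return sum(rng.random() < 0.5 for _ in range(round(before_division)))


def regression_slope(xs, ys):
    """Least-squares slope of ys against xs."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var = sum((x - mx) ** 2 for x in xs)
    return cov / var


births = list(range(2, 19))
for strategy in ("host", "checkpoint", "implicit"):
    nxt = [next_birth_number(n, strategy) for n in births]
    print(strategy, round(regression_slope(births, nxt), 2))
```

Under symmetric division, the slope is close to 1 for the host rule (deviations propagate), close to 0 for the checkpoint rule (deviations are erased within one cycle), and close to 1 - `relaxation` for the weak sizer. Passing `rng=random.Random(seed)` adds division noise, which blurs these signatures in the way the text describes.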

Text C. Symbiont control by two separate checkpoints for host division
In both Q3 and QHS10, where bi-directional and symbiont control strategies evolved, symbionts signal to the host when their numbers are high enough, inciting the host to safely divide. A key difference with bi-directional control in Q3 is that the interaction between symbiont g2 and host g7 in QHS10 is not dosage-sensitive and does not measure the replication status of the host (Figs 4 and U). The host has evolved a second mechanism to infer the replication status of its own genome. A gene near the origin, g13, performs the same regulatory tasks as g7, and thus prevents host division even if g7 is deactivated under the influence of high symbiont number. Instead, g13 needs to be inhibited by g5, which is located right at the terminus of the host genome. The g5-g13 interaction depends on low-affinity binding, so inhibition of g13 only becomes likely when the entire host genome is replicated (giving two copies of g5). Thus, where bi-directional control in Q3 integrates host status and symbiont levels into a single cell-cycle checkpoint, symbiont control in QHS10 rests on two independent checkpoints for holobiont division.
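The logic of the two checkpoints can be sketched as follows. This is only an illustration: the saturating occupancy formula, the `dissociation_constant` value, the `min_symbionts` threshold and both function names are assumptions and do not correspond to the binding rules of the model.

```python
def inhibition_probability(g5_copies: int, dissociation_constant: float = 4.0) -> float:
    """Assumed saturating occupancy of the g13 binding site by g5.

    With low-affinity binding (a large dissociation constant), occupancy
    rises noticeably when the copy number of g5 goes from one to two.
    """
    return g5_copies / (g5_copies + dissociation_constant)


def division_permitted(symbionts: int, min_symbionts: int, g13_inhibited: bool) -> bool:
    """Host division requires both checkpoints to be passed independently."""
    # Checkpoint 1: symbiont signaling deactivates g7 once numbers are high enough.
    g7_deactivated = symbionts >= min_symbionts
    # Checkpoint 2: g13 must be inhibited by g5, which is more likely once
    # the terminus has been replicated and two copies of g5 are present.
    return g7_deactivated and g13_inhibited


print(inhibition_probability(1), inhibition_probability(2))
print(division_permitted(symbionts=12, min_symbionts=10, g13_inhibited=False))
```

Separating the two conditions mirrors the point made above: in this sketch, a high symbiont number alone never permits division, because the host's own replication status is checked by a different gene.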

Figure A: Product leakage and gene transfer are deleterious for holobionts consisting of pre-evolved host-symbiont pairs, as seen in smaller final population sizes (a,e) and larger genomes (c,g and d,h) at t = 2 · 10^6.
Host control ensures replication of all symbionts during the holobiont cell cycle, suggesting that deviations in symbiont numbers are propagated to subsequent generations. In line with this description, there is a high correlation between symbiont numbers of consecutive generations (Fig Ba). Furthermore, the clonal growth experiments (i.e. Fig 4 in main text) show that holobiont deaths are much more frequent with host control than in any

Figure B: Phenomenological control strategies identified by correlations in symbiont numbers between consecutive cell cycles. As in Fig 4 in main text, the most recent common ancestor of each replicate was analyzed at intermediate nutrient conditions (n_influx = 30). Removing the stochasticity at cell division by forcing symmetric division shows the control strategy more clearly (right versus left panels). In the case of P9, the correlation stays well below 1, indicating that holobionts approach the equilibrium symbiont number over multiple generations. See also Fig C.

Figure C: Phenomenological control strategy of P9 becomes more apparent at poor nutrient conditions (n_influx = 5) where, due to low symbiont numbers, the impact of stochasticity at cell division is more prominent (cf. Fig B).

Figure D: Evolutionary dynamics of replicate Q3 where holobionts evolved bi-directional control (cf. Fig 2 in main text). Here, the symbiont evolves a substantially larger genome than the host. Yet, the sizes of their regulatory repertoires are very similar and the host controls more symbiont genes than vice versa. As in Q4, holobionts quickly evolve to be insensitive to leakage (bottom panel) and host-to-symbiont signaling also evolves rapidly. Only from t = 4 · 10^6 does symbiont-to-host signaling evolve, yielding bi-directional control.

Figure E: Evolutionary dynamics of Q5, in which bi-directional control evolved.

Figure F: Evolutionary dynamics of Q7, in which bi-directional control evolved.

Figure G: Evolutionary dynamics of Q8, in which symbiont control evolved.

Figure H: Evolutionary dynamics of Q9, in which bi-directional control evolved.

Figure I: Evolutionary dynamics of Q10, in which host control evolved.

Figure J: Evolutionary dynamics in a replicate where holobionts evolved bi-directional control using host-to-symbiont leakage (QHS2).

Figure K: Evolutionary dynamics in a replicate where holobionts evolved symbiont control (QHS10; cf. Fig 2 in main text). As in Q3, the symbiont evolves a larger genome than the host and the sizes of their regulatory repertoires are comparable. Symbiont-to-host signaling evolves remarkably late, coinciding with substantial population expansion. As before, holobionts quickly evolve to become insensitive to host-to-symbiont leakage (bottom panel). In contrast, holobionts were not exposed to symbiont-to-host leakage, so they remain sensitive to the introduction of this type of leakage.

Figure L: Impact of leakage and signaling along ancestral lineages during evolution with product leakage, gene transfer and signal peptide mutations (Q1-10). Host-to-symbiont signaling always appears early in evolution or does not appear at all. In bi-directional control (Q3, Q5, Q7, and Q9), symbiont-to-host signaling evolves substantially later. In all replicates except Q8, holobionts rapidly become insensitive to leakage. Symbols of replicates correspond to those in Fig 3 in main text: triangles indicate the direction of control (down for host control, up for symbiont control, diamond for bi-directional control), and red dotted lines indicate communication through leakage (down for host-to-symbiont direction, up for symbiont-to-host direction). Replicates were continued beyond t = 10^7, so we could determine the ancestral lineage up to t = 10^7.

Figure M: Impact of leakage and targeting along ancestral lineages during evolution with symbiont-to-host product leakage, gene transfer and signal peptide mutations (QSH1-10). As in Fig L, host-to-symbiont communication arises before symbiont-to-host communication. Two replicates (Q3 and Q7) do not evolve any functional communication. Symbols of replicates correspond to those in Fig 3 in main text. Holobionts were not exposed to host-to-symbiont leakage, and leakage in this direction has little effect on evolved holobionts. Several replicates were continued beyond t = 10^7, so we could determine the ancestral lineage up to t = 10^7.

Figure N: Impact of leakage and targeting along ancestral lineages during evolution with host-to-symbiont product leakage, gene transfer and signal peptide mutations (QHS1-10). Unlike in replicates Q1-10 (Fig L), holobionts do not evolve under symbiont-to-host leakage and thus do not become insensitive to it. Symbols of replicates correspond to those in Fig 3 in main text. Several replicates were continued beyond t = 10^7, so we could determine the ancestral lineage up to t = 10^7.

Figure O: Impact of leakage and targeting along ancestral lineages during evolution with gene transfer and signal peptide mutations but without leakage (T1-10). Functional signaling is only observed transiently in T8. In addition, holobionts do not evolve under leakage, and thus do not become insensitive to it. Still, holobionts might turn out to evolve a transient positive effect of leakage on growth (e.g. early on in T3). Symbols of replicates correspond to those in Fig 3 in main text. Several replicates were continued beyond t = 10^7, so we could determine the ancestral lineage up to t = 10^7.

Figure P: Impact of leakage and targeting along ancestral lineages during evolution without any interference and without the possibility for communication between host and symbiont (P1-10). Symbols of replicates correspond to those in Fig 3 in main text. Holobionts were not exposed to leakage, and in many cases remain particularly sensitive to symbiont-to-host leakage. P7 evolves to be especially sensitive to symbiont-to-host leakage, fitting with the large population reduction observed in Fig 5 in main text. Conversely, host-to-symbiont leakage is relatively harmless, and even transiently beneficial in P4, P7 and P8. Several replicates were continued beyond t = 10^7, so we could determine the ancestral lineage up to t = 10^7.

Figure Q: Competition experiments between the 14 adapted populations of Q1-10 and P1-10 without leakage (cf. Fig R). Two populations are inoculated side-by-side (one at the top, one at the bottom) of an enlarged grid which fits both populations entirely (50 × 275). Each competition experiment lasts for 10^5 timesteps and includes mutations. The time at which one population has expelled the other from the grid is recorded, and a draw is recorded if both populations are still present on the grid at the end. All values in the table denote competitive success of the population in the row relative to the population in the column (s = 10^5 / t_win), and rows and columns have both been sorted from most to least competitive.

Figure R: Competition experiments between the 14 adapted populations of Q1-10 and P1-10 with leakage (cf. Fig Q). Two populations are inoculated side-by-side (one at the top, one at the bottom) of an enlarged grid which fits both populations entirely (50 × 275). Each competition experiment lasts for 10^5 timesteps and includes mutations. The time at which one population has expelled the other from the grid is recorded, and a draw is recorded if both populations are still present on the grid at the end. All values in the table denote competitive success of the population in the row relative to the population in the column (s = 10^5 / t_win), and rows and columns have both been sorted from most to least competitive.

Figure S: Evolution of host control along the ancestral lineage of Q4.

Table A: Important characteristics of pre-evolved free-living prokaryotes, i.e. cell-cycle efficiency (see Fig 6 in main text) and generalist capacity (plasticity measured as the log-difference between cell-cycle duration at n = 0.1 and n = 100).

Table B: Host-symbiont pairs used to initialise evolution experiments (see Table A for characteristics of prokaryotes). C1-4 start with identical host and symbiont; C5-6 have different host and symbiont but with very similar phenotypic behavior, i.e. R8 and R9 are both efficient generalists; C7-12 are asymmetric.

Table C: Genetic sequences of the five core gene types in host and symbiont of the most recent common ancestor of Q4, which lived at t = 8.9 · 10^6.