Competence shut-off by intracellular pheromone degradation in salivarius streptococci

Competence for DNA transformation is a major strategy for bacterial adaptation and survival. Yet, this successful tactic is energy-consuming, shifts dramatically the metabolism, and transitory impairs the regular cell-cycle. In streptococci, complex regulatory pathways control competence deactivation to narrow its development to a sharp window of time, a process known as competence shut-off. Although characterized in streptococci whose competence is activated by the ComCDE signaling pathway, it remains unclear for those controlled by the ComRS system. In this work, we investigate competence shut-off in the major human gut commensal Streptococcus salivarius. Using a deterministic mathematical model of the ComRS system, we predicted a negative player under the control of the central regulator ComX as involved in ComS/XIP pheromone degradation through a negative feedback loop. The individual inactivation of peptidase genes belonging to the ComX regulon allowed the identification of PepF as an essential oligoendopeptidase in S. salivarius. By combining conditional mutants, transcriptional analyses, and biochemical characterization of pheromone degradation, we validated the reciprocal role of PepF and XIP in ComRS shut-off. Notably, engineering cleavage site residues generated ultra-resistant peptides producing high and long-lasting competence activation. Altogether, this study reveals a proteolytic shut-off mechanism of competence in the salivarius group and suggests that this mechanism could be shared by other ComRS-containing streptococci.


Materials and experimental data
Strains, primers, and methods used for generating experimental data used for the parametrization and validation of the model are described in the Materials and Methods section of the main text or in [1]. The experimental data used for the modeling are available in the S1 Dataset. The Matlab code is available as S2 Appendix. Those two files can be run together using Matlab and take approximately 45 min to complete.

Key processes in ComRS signalization
The ComRS activation has been shown to mostly rely on ComR abundance [1][2][3]. Also, experimental data have underlined the necessity of a functional positive feedback loop for bimodal activation [1]. In addition, results obtained experimentally in S. salivarius showed a low sensitivity of the system regarding ComS (peptide pheromone signal), PptAB (peptide exporter), and Opp (peptide importer) abundance [1]. While those three players are crucial to maintain the functionality of the positive loop, it seems they do not stand as bottlenecks for activation. For this reason, we chose to build a model neglecting the export-processing-reimportation of the peptide by assuming the peptide maturation as a single process. In consequence, we will consider that the mature form of ComS (XIP) is equal to ComS and the ComS term will only be used, except if else stated. The characterization of the formation of the ComRS complex in Streptococcus thermophilus showed a high affinity between ComR and ComS [4,5].We therefore simulated in the model that the binding between the two players is irreversible, even though the complex can be further degraded after binding. Finally, because competence induces a cell-division arrest [6], we chose to neglect the dilution of the different players through cellular growth. Since we model the system during, before, and after competence induction, we will have an effect of this dilution. However, we decided to include it in the degradation rate of the different players.
2. ComR and ComS binding is irreversible. 3. Dilution of the players through growth is included in the degradation rate.

Mathematical formulation
The model was built using the four main players of the system: ComS (the pheromone peptide), ComR (intracellular receptor), the ComRS complex (formed by two ComS and two ComR) and ComX (alternative sigma factor activating the late phase of competence). Since we had few quantitative data on the ComR control through CovRS, we decided to model the basal abundance of ComR fluctuating over time according to luciferase expression data, independently of any other player. We based mainly our model on two other models described for S. thermophilus [2] and Streptococcus mutans [7]. Ordinary differential equations (ODEs) of the model were written as follow: We modeled the fluctuation in ComR abundance as the result of basal rate production over time, subtracted by its degradation rate and its inclusion into the ComRS complex (through a non-linear mass action law modeled by a Hill equation). ComS amount was modeled as the result of a basal constant production plus the transcriptional activation by ComRS (through a classical Michaelis-Menten equation involving saturation). Similarly to ComR, the negative terms were associated to degradation and inclusion into the ComRS complex. Regarding ComRS, we described its production as the half quantity (since dimeric) of the non-linear association between ComS and ComR minus its degradation. Finally, ComX was computed as the result of a basal rate of production, plus the transcriptional activation through ComRS (Michaelis-Menten kinetics) and minus its degradation. The different parameters, their meaning, corresponding values, and origin are described in the next section and in the Table 1 at the end of this appendix.

Model calibration
With the mathematic model described above, we ended up with a total of 12 parameters. Fortunately, a previous modeling of the system in S. thermophilus [2] and experimental data obtained during this work helped us to determine the range of values for each parameter. Table 1 at the end of this appendix provides an overview of all the parameters of the model.

ComR basal rate (bR(t))
Since competence is a transient state that is mainly depending on ComR abundance, we suspected that comR expression is also transient. Transcriptional fusion between the comR promoter and a luciferase gene (PcomR-luxAB) described in [1] allowed us to monitor the activation of the promoter over time. Indeed, we observed a strong activation at the early stage of exponential phase, with a sharp decrease upon entry into stationary phase (Fig. 1A). Combining those two data, we were able to generate an approximation of ComR production over time. We used the same method as previously described [2] for processing luminescence measurement into protein amount production per cell. The only change that was brought to this method was the choice of the luminescence to protein amount factor based on the western blot semi-quantification (Fig. 1B). Of note, this quantification relies on the assumption of a complete lysis of the cell population before blotting. Because this hypothesis is certainly not accurate, this quantification should be taken as a model parametrization and certainly not as an absolute real value.

Maximum synthesis rate of comS (maxs)
We used the same approach for modeling the maximum synthesis of ComS than for the basal synthesis of ComR. In short, we used absorbance data describing growth and luciferase expression of a PcomS-luxAB strain to model the growth curve and growth rate together with the activity of the PcomS promoter upon competence activation. For activating the competence, we overexpressed comR thanks to a xylose-inducible promoter fusion (Pxyl2-comR) present in a neutral locus of the strain by adding xylose at a final concentration of 0.5%.
Since we had no quantification of ComS or XIP available, we used the same RLUmol factor obtained for PcomR-luxAB. Because we know that transcription efficiency can vary from one fusion to another and that translation properties of a mRNA will influence translation efficiency, we compared the total amount of ComS produced during our modeling with existing data in S. mutans [8] for validation (see below). Finally, we calculated the maximum production rate and used it for maxs.

Basal rate of comS (bs)
As previously described, we first computed the growth curve and growth rate for a non-activating competence population (0% xylose added) and then modeled the ComS synthesis rate thanks to the PcomR RLUmol factor previously calculated. We then compared the maximum synthesis rate obtained for the non-activating vs activating competence population to obtain the fold increase in PcomS expression upon competence activation. We then used this fold increase (384-fold) to compute the basal rate of ComS synthesis (bs) such as: Where 1 represents the maximum PcomS activation through the ComRS complex and (1+bs) is thus the maximum rate synthesis (basal rate and ComRS induction).

Basal rate of comX (bx)
We used exactly the same procedure for bx as we used for bs. The maxx parameter used as well as the RLUmol factor were chosen as described in the next free parameter calibration section. We used the same equations as described for bs computation, and calculated a PcomX fold-change between competence activated cells and non-competent activated cells of 95.

Free parameters calibration
To calibrate our model with the three comS-related free parameters left describing ComS activity (krs, maxs, ds), we used microscopic data showing that S. salivarius cells do not activate comX at basal ComR amount (401±32,5 molecules/cell) while they nearly all (95%) activate comX at higher ComR amount (3507±713 molecules/cell representing a 8.5-fold increase) [1]. We used those two data (0% of cells activated vs ~100%) in a deterministic frame, considering the bacterial population as one entity being either ON (~100%) or OFF (0%). Because the three parameters (krs, maxs, ds) influence ComS activity, we calibrated them together, computing the maximum number of ComX produced per simulation, using either low or high ComR values (Fig. 2). As we expected the system to be in an ON state (production of ComX) for high ComR values and in an OFF state (no production of ComX) for low ones, we used as output the factor ComXON/OFF: and draw the output as surface values for different parameter values as shown in Fig. 2 consideration. For this reason, we will consider those parameter ranges as non-permissive for bimodality.

Figure 2. Surface plots showing the value of ComXON/OFF for different parameter values of krs, dS and maxS.
Results show that a bimodal state (meaning high ComXON/OFF values) would only be possible for parameters values ranging from 10 -8 to 10 -4 for krs, and either high dS rate for low krs or high maxS for higher krs.
Using parameter values permissive for bimodal activation (krs values of 10 -8 to 10 -4 , dS of 5 to 15 min -1 and maxS production of 1 to 100 mol.cell -1 .min -1 ), we were able to generate a non-linear solution resulting in ComRS activation only at 8.5 (or higher) ComR fold-change (Fig. 3). Dynamics of the system displayed on Fig. 3 showed that our model responds in a non-linear fashion and could produce bimodality. Indeed, we had a system turning ON for small increase in ComR basal expression, suggesting that small heterogeneity in the ComR distribution could produce different phenotypes. We used those results to tune the amount of ComX produced when competence was activated based on the amount of ComX boxes found in the genome of S. salivarius (~150, [3]). For this purpose, we set the maxX parameter at 40 mol.cell -1 .min -1 , resulting in a maximum concentration of 150 mol.cell -1 . While our system showed bimodal properties, the free pheromone ComS produced in the cell was extremely low (max. ~5 mol/cell; Fig. 3, panel 3) and not in agreement with previous reports [8,9]. Because the parametrization of maxS, krs and dS had shown that a low activity of ComS was necessary for bimodal activation (Fig. 2) and that experimental data suggested high krs and maxS values, we tried to conciliate those observations by re-examining the degradation rate. Various types of negative regulation of the activity of the communication pheromone have been described in streptococci: peptide interference [10], transcriptional inactivation [11,12], peptidase degradation [13,14]. Other mechanisms such as specific RNase activity have also been hypothesized [3], suggesting a high diversity of competence control mechanisms through peptide negative regulation. Therefore, we hypothesized the presence of an additional negative player of the system, which is degrading ComS. We aimed to model the presence of a constitutively expressed player (named deg) that would reduce the basal amount of ComS. In parallel, this new player had to be quickly saturated by ComS upon ComRS activation in order to allow high ComS concentration in competence activated cells. We modeled this deg saturating activity through a new Michaelis-Menten equation term in the ComS equation: Where kcat is the apparent constant rate of degradation of ComS through the negative regulator, deg is the constant deg concentration due to constitutive expression, and Kdeg is the ComS concentration required for half max degradation of ComS.
We implemented this new term in our model and searched for permissive values for bimodal activation with parameters dS = 0.01 min -1 , maxS = 1000 mol.cell -1 .min -1 and krs= 0.01 (dS and krs values were chosen based on the previous simulation [2]. The maxS value was chosen to approach literature data [8], suggesting ComS production upon competence activation in the nanomolar to micromolar range. This value was also chosen in accordance with recent data published on the rate quantification of bacterial translation synthesis [15]. We used the same approach described before to generate surface plot showing bimodal landscape (data not shown) but using instead ComSON/OFF in function of two parameters values: kcat × deg and Kdeg. ComSON/OFF was used in order to have an idea of the maximum ComS values for activating cells. Because computation time were far too long for low Kdeg values permissive for bimodal activation, we decided to keep low krs values (10 -6 cell 3 .mol -3 .min -1 ). We realized that those two parameters (Kdeg and krs) were responsible for the same effect on the model: increasing the ComS production necessary for activating the system. While low Kdeg has much more biological significance than krs regarding our in vitro ComRS binding data, low Kdeg was too difficult to compute in our model.
The parameter landscape regarding ComSON/OFF (thus bimodality) showed that only some particular values were permissive for bimodality. Because low Kdeg was necessary for bimodality and low kcat × deg could provide a fair number of ComS molecules produced when competence is activated, we hypothesized that the properties of the negative player would directly influence the bimodal behavior. While high ComS affinity for the degradation regulator could provide a bimodal activation through efficient low ComS degradation, fast saturation can account for the high ComS production in competence activating cells.
Then, we used this new set of equations to model competence over time in relation to ComR abundance. Regarding previous parametrization, we decided to use kcat × deg = 140 mol.cell -1 .min -1 and Kdeg = 1 mol.cell -1 for computations (Fig. 4). As expected, we could not see any competence activation for lower values than 8.5×ComR. However, unlike our previous model (showing low level of unbound ComS when competence was activated), we could reach about 60,000 molecules per cell of ComS in the activating state. Because this amount would correspond to a nanomolar range of concentration in the growth medium, we concluded that our degradation model described more accurately the features of the ComRS system.
Of note, the ComS amount at the late stage of growth in our model is still important (Fig. 4, panel  3), probably because we did not include a specific shut-off mechanism of the system. We designed our model with this typology because we were mainly interested in using the simplest model as possible describing bimodal activation and were lacking experimental data regarding shut-off mechanisms at this point of the research.

Validation
We then sought to validate our model with a set of experimental data. To do this, we determined if our model could predict the percentage of cells activating competence in ComR overexpressing strains [1].
To this aim, we first determined the hypothetical maximum and minimum basal ComS production. For the minimum value, we used the fact that an 8.5-fold increase in WT ComR abundance is the threshold for the activation of the whole population. It means that all the cells in this situation, even the ones with the lowest basal ComS values, are able to trigger the system. We therefore determined that the minimum basal ComS production for the population was bS = 0.0026, the basal value used in our simulation until now.
To compute the maximum basal ComS value, we used our microscopy data together with ComR semi-quantitative immunoblotting showing that a 1.27-fold ComR increase results in the first cells to activate competence.
We computed the basal ComS value necessary to obtain activation at a threshold of 1.27-fold ComR increase. We found a value of 0.03 for activating the system. We therefore hypothesized that the ComS basal distribution within the population should range from 0.0026 to 0.03, which corresponds to a basal rate synthesis of ComS of 2.6 to 31 mol.cell -1 .min -1 . Of note, because experimental data obtained showed that ComR is homogeneously distributed in the population [1], we neglected the effect of small variations in ComR abundance from cell to cell. This approximation seems relevant, as weakly expressed genes such as comS show higher noise in expression [16]. Nevertheless, this assumption was not proven experimentally in S. salivarius, and still needs to be validated.
Recent theoretical and single molecule studies have shown that the random distribution of proteins can be modelized as a simple gamma function, knowing the mean and the dispersion of the protein amount in the population [7,17]. We considered the rough assumption that bS is directly linked to the basal ComS concentration and we modeled this basal rate as a gamma function using singlecell data to have an idea of the dispersion and the mean of bS. For this purpose, we analyzed the fluorescence of a PcomS-gfp + strain for competence non-activated cells (Fig. 5). Those cells were displaying only non-activated comS promoters, which correspond to basal ComS production. We first computed the Fano factor (F) where µ stands for the mean and σ² for the variance.  Knowing that extreme values of fluorescence intensity would be associated to extreme values of predicted basal rates, we used the lower and upper quantile of the cumulative gamma function (α=0.05) and computed their corresponding intensities (8.8 and 50, respectively). Using the maximum and minimum basal rate of ComS (bSmin=0.0026 and bSmax=0.03), we were able to scale the fluorescence intensity to the basal rate thanks to the equation: We then computed the critical bScrit values for competence activation for several ComR concentrations and computed the associated Icrit factor thanks to the scaling equation (Fig. 6). Finally, we reported the Icrit value computed on the gamma cumulative density function (Fig. 5, panel 3) for every ComR fold increase to determine the percentage of cells with higher intensities than the Icrit value. This percentage represents cells with sufficient ComS basal activity to trigger the system. We were then able to generate a chart depicting the percentage of activated cells in the populations as a function of ComR abundance and compared it to experimental data obtained (Fig.  6). The comparison between experimental values and simulation showed that our model was able to describe quantitatively the bimodal competence activation through the ComRS system.

Competence bimodality and comR heterogeneity
Because our system has shown to be extremely ComR sensitive, we thought that although its distribution in the population is quite homogeneous (see experimental data on PcomR-gfp + , PcomR-comR::gfp + and XIP population-wide dose in [1]) it was possible that small uneven ComR distribution in the population could account for bimodality. To validate this hypothesis, we challenged our model with a similar approach as performed with the comS distribution. First, we generated a gamma distribution fitted to the PcomR-gfp + data as previously described (Fig. 7). We first generated a 1.27-fold gamma distribution representing the ComR distribution in an overexpressing strain harboring a xylose-inducible promoter fused to comR and induced with 0.1% of xylose. To do this, we multiplied the mean and the standard deviation of our experimental data in the WT by 1.27 and generated a new fano factor and a noise parameter that we used to compute the gamma distribution of the 1.27-fold ComR amount strain. This strategy implies that the new distribution generated has the same mean over standard deviation ratio. We could reasonably make this assumption since Pxyl2-gfp + data obtained suggested homogeneous Pxyl2 activation and would therefore only increase ComR amount without changing its distribution among the population [1].
We next used the 0.90 quantile value (corresponding to the 10% of cells activated measured experimentally) of the 1.27-fold distribution generated and defined it as the critical value for activation. This value corresponds to a 1.9-fold ComR change.
We then generated several ComR gamma distribution for different fold changes and computed the number of cells activated with the threshold (1.9-fold) defined. We then compared the computed percentages obtained with experimental values (Fig. 8).   8 shows clearly that ComR heterogeneity can be sufficient to explain bimodal activation of competence. Nevertheless, the parametrization used to obtain the fit presented in Fig. 8 must be considered for interpreting the validation. Indeed, the threshold determined based on experimental values is quite low (1.9-fold ComR increase). For triggering the system at this ComR concentration, high basal rate of ComS is needed (about 40 mol.cell -1 .min -1 ) which is not in accordance with experimental data, suggesting low comS expression. Since the model suggests that both ComS and ComR heterogeneity could explain the bimodal activation of competence, it is most likely that heterogeneity on both partners is at the core of the stochastic process giving rise to bimodality.

S XIP addition vs xip overexpression prediction
Now that we had performed parametrization and validation, we could explore the model to ask biological relevant questions. We first investigated puzzling experimental results obtained regarding pheromone-based competence activation in S. salivarius. Particularly, the addition of nanomolar concentration of synthetic sXIP (sXIP) is able to trigger the system while the overexpression of the 12-amino acid maturated form of ComS (xip) fails to activate competence.
In order to evaluate if our model could explain such unexpected behavior, we modeled the addition of sXIP by a sharp and sudden increase in ComS concentration and modeled the xip overexpression by a constant ComS production over time. Of note, we will still consider in this section that XIP and ComS are equivalent.
Since recent investigations have shown the high processivity of the Opp transporter in Bacillus subtilis [18], we assumed that addition of XIP in the nanomolar concentration range would result in a fast uptake by the cells within a few minutes. Because our luciferase experimental data show an activation of the ComRS complex after 5 min upon sXIP addition, we modeled the total uptake taking place within 15 minutes and reaching a maximum rate after 5 minutes. In parallel, experimental sXIP titration with PcomX-gfp + strains showed a threshold activation at a ~1 nM concentration [1]. Since we do not know the processivity of the Opp transporter and the resulting intracellular concentration in sXIP, we performed a simulation for sXIP uptake gradient, starting with a maximum value and considering that every molecule added to the medium is imported by the cells. To model this, we assumed that at t = 120 min, XIP concentration in the medium reach suddenly 10 -9 M. Direct uptake through the Opp transporter results in an intracellular concentration given by the equation: Where [XIP]cell is the intracellular concentration of XIP, [XIP]out is the extracellular concentration of sXIP added and X(t=120) is the bacterial concentration in the medium at the time t = 120 min. Because a complete uptake of every single SXIP molecule present in the medium by the cells is rather unrealistic, we chose a range of total amount of SXIP taken up by the cell weaker than the one calculated for a complete uptake with a SXIP concentration in the medium of 10 -9 M. We then generated Fig. 9, showing the effect of SXIP addition on the different variables of the system.

Figure 9. Modeling of sXIP importation efficiency and its effect on competence activation.
ComR, ComS, ComRS, ComX abundance, XIP importation rate, and ComS degradation rate were computed over time depending on SXIP uptake. Total amount of SXIP molecules imported through the Opp transporter are depicted by color (blue=100, orange=200 ComR, yellow=400, purple =800 and green=1600 mol.cell -1 ). Those values correspond to an intracellular XIP concentration of 300, 600, 1200, 2400 and 4800 nM.
We found an intracellular concentration of 1.5 µM (500 molecules per cell) sufficient to overcome the degradation rate through the degradation machinery (deg) and trigger competence. Regarding the experimental data showing an effect starting at 1 nM, it would suggest that the Opp system is quite efficient.
To investigate and compare the effect of an intracellular overexpression of xip, we modeled a constant production of ComS over time and checked how the overexpression was able to activate the system (Fig. 10). We then computed the total amount of ComS produced until competence was activated (t = 225 min for the highest xip overexpression) through xip overexpression and compare it to the sXIP quantity necessary for competence activation. We found a much higher ComS value produced through overexpression (6469 mol.cell -1 ) than through SXIP addition (500 mol.cell -1 ). This difference is explained by the discrepancy between the two modes of ComS production. In the first one, the overexpression is gradual and rather slow, giving the time for the negative player deg to degrade it. By contrast, SXIP addition results in a sharp increase of ComS, directly overcoming the deg capacity. Of note, those speculations are based on the suggested high processivity of the Opp transporter [18]. Nevertheless, regarding the experimental reactivity of the system to sXIP addition, this can reasonably be hypothesized.