A mechanistic and data-driven reconstruction of the time-varying reproduction number: Application to the COVID-19 epidemic

The effective reproduction number R eff is a critical epidemiological parameter that characterizes the transmissibility of a pathogen. However, this parameter is difficult to estimate in the presence of silent transmission and/or significant temporal variation in case reporting. This variation can arise from a lack of timely or appropriate testing, from public health interventions and/or from changes in human behavior during an epidemic. This is exactly the situation we are confronted with during the COVID-19 pandemic. In this work, we propose to estimate R eff for SARS-CoV-2 (the etiological agent of COVID-19), based on a model of its propagation that includes a time-varying transmission rate. This rate is modeled by a Brownian diffusion process embedded in a stochastic model. The model is then fitted by Bayesian inference (particle Markov Chain Monte Carlo method) using multiple well-documented hospital datasets from several regions in France and in Ireland. This mechanistic modeling framework enables us to reconstruct the temporal evolution of the transmission rate of COVID-19 based only on the available data. Except for the specific model structure, it is non-specifically


Introduction
In the last months of 2019, clustered pneumonia cases were described in China [1]. The etiological agent of this new disease, a betacoronavirus, was identified in January and named SARS-CoV-2. This new coronavirus disease (COVID-19) spread rapidly worldwide, causing millions of cases, killing hundreds of thousands of people and causing socio-economic damage. Until vaccination campaigns are widely implemented, the expansion of COVID-19, with the occurrence of new and more transmissible variants, continues to threaten to overwhelm the healthcare systems of many countries, despite a wide range of public health strategies using different non-pharmaceutical interventions (NPI).
In the early stages of each new epidemic, one of the first steps in designing a control strategy is to estimate pathogen transmissibility in order to provide information on its potential to spread in the population. This is crucial to understand the likely trajectory of the new epidemic and the level of intervention that is needed to control it. Among the various indicators that quantify this transmissibility, the most commonly used is the reproduction number, which measures how many new infected individuals can be generated on average by one infected individual. In the initial phase of the epidemic, when the entire population is susceptible, this quantity is referred to as R 0 , the basic reproduction number, and is defined as the average number of secondary cases caused by one infected individual in an entirely susceptible population [2][3]. As the epidemic develops and the number of infected individuals increases (and the number of susceptible individuals decreases), the effective reproduction number R eff , characterizing the transmission potential according to the immunological state of the host population, is used instead. It can be estimated as a function of time: the instantaneous effective reproduction number R eff (t) quantifies the number of secondary infections caused by an infected person at a specific time-point of an epidemic. The epidemic is able to spread when R eff (t)>1 and is under control when R eff (t)<1. R eff (t) can be used to monitor changes in transmission in near real time, for instance as a consequence of control or mitigation measures.
Classically, R eff (t) is estimated by the ratio of the number of new infections generated at time t to the total infectiousness of individuals in the infected state at time t. This latter quantity is defined based on the generation time distribution (the time between infection and transmission) or on the serial interval (the time between the onset of symptoms of a primary case and the onset of symptoms of secondary cases) [4][5].
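As a minimal sketch of the ratio described above, the following Python snippet divides the new infections at time t by the total infectiousness, computed here as past incidence weighted by a discretized generation-time distribution. The function name `instantaneous_r`, the weights `w` and the toy incidence series are illustrative assumptions, not values or code from this study.

```python
def instantaneous_r(incidence, w):
    """Ratio estimator: R(t) = I(t) / sum_s w[s-1] * I(t - s), for s = 1..len(w).

    incidence: new infections per day (hypothetical data).
    w: discretized generation-time distribution; w[0] is the weight at lag 1
       and the weights are assumed to sum to 1.
    Returns None at time points where the total infectiousness is zero.
    """
    estimates = []
    for t in range(len(incidence)):
        # Total infectiousness: past incidence weighted by the lag distribution
        total_infectiousness = sum(
            w[s - 1] * incidence[t - s]
            for s in range(1, len(w) + 1)
            if t - s >= 0
        )
        if total_infectiousness > 0:
            estimates.append(incidence[t] / total_infectiousness)
        else:
            estimates.append(None)
    return estimates


# Toy example: incidence doubling every day, all generation-time weight at lag 1
w = [1.0]
incidence = [1, 2, 4, 8, 16]
r_values = instantaneous_r(incidence, w)
```

In this toy case every defined estimate equals 2, since each day's incidence is twice that of the previous day. With under-reported or delayed incidence, the same ratio inherits the biases discussed in the following paragraphs.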
For the COVID-19 epidemic, given the data needed to estimate reproduction numbers, caution is urged when interpreting the values obtained and the short-term fluctuations in these estimates, because data quality must be taken into account [6][7][8][9]. In a recent study, O'Driscoll et al [6] compared different methods and concluded that there are many important biases in the R eff (t) estimates and that these can easily lead to erroneous conclusions about changes in transmissibility during an epidemic. These biases are mainly due to the uncertainty in incidence data, which can arise from both the transmission characteristics of this virus (asymptomatic and presymptomatic transmission) and the quality and preparedness of the public health system. For COVID-19, it has been shown that the number of observed confirmed cases significantly underestimates the actual number of infections [10][11]. For instance, during the initial rapid growth phase of the COVID-19 epidemic, the number of confirmed cases underestimated the actual number of infections by a factor of 50 to 100 [10]. In France, it has been estimated that the detection rate increased from 7% in mid-May to 40% by the end of June, compared to well below 5% at the beginning of the epidemic [12]. In addition, these biases can be amplified by the combination of the high proportion of asymptomatic cases [13] with low health-seeking behaviors. Although the estimation of the reproduction number is robust to underreporting [14], this is only true if the reporting rate (among other characteristics) is constant over the observation period. This was not the case for the COVID-19 epidemic, mainly due to fluctuations in testing capacity and in the near real-time availability of information. The uncertainty is not exclusively related to the incidence data: there is also a large variability in the values proposed in the literature for the generation time or serial interval distributions and their averages [15][16]. Other studies have highlighted these issues. Gostic et al [7] quantified the effects of data characteristics on R eff (t) estimates: reporting process; imperfect observation of cases; missing observations of recent infections; estimation of the generation interval. Moreover, Pitzer et al [8] showed that biases in R eff (t) are amplified when reporting delays fluctuate with the availability of testing and with changing testing practices. These two studies concluded that changes in diagnostic testing and reporting processes must be monitored and taken into account when interpreting estimates of the reproduction number of COVID-19. Nevertheless, these changes are extremely difficult to quantify. In these contexts, as testing capacity and reporting delays evolve, the use of hospital admission and mortality data may be preferable for inferring reproduction numbers [9,17]. However, the delays from infection to hospitalization and/or death are also uncertain and, overall, it is difficult to incorporate the uncertainty related to all of these delays [9,17,18].
In addition to the classical methods, a complementary approach consists in inferring changes in transmission using mechanistic mathematical models, and computing R eff based on its proportionality with the transmission rate [19][20][21][22][23][24]. The time variation of R eff is then computed indirectly, either by simply fitting the model to different time periods (before or after the lockdown in the simplest cases) or by using exponential decay models [21], but see Lemaitre et al [20].
To overcome these numerous weaknesses in the data available for estimating R eff , we propose using a framework that has already been implemented to tackle non-stationarity in epidemiology [25][26][27]. It uses diffusion models driven by Brownian motion to model time-varying key epidemiological parameters, embedded in a stochastic state-space framework coupled with Bayesian inference methods. The advantages of this approach compared to the existing ones are (i) the description of the mechanisms underlying pathogen transmission and hence its particularities (asymptomatic phase), (ii) the joint use of multiple datasets (incidence and hospital data), (iii) the explicit consideration of the uncertainty associated with the data used and especially (iv) the monitoring of the temporal evolution of some of the model parameters without inclusion of external variables. Overall, the main advantage of this approach is that it is data-driven. Indeed, in this framework, except for the mechanistic assumptions underlying the model, the estimation of the time-varying parameters, based on the available epidemiological observations [27], is done only under the non-specific assumption that they follow a basic stochastic process.
Applied to COVID-19, this framework makes it possible to monitor the evolution of disease transmission over time under non-stationary conditions such as those that prevailed during this epidemic. To compare different geographical settings with similar population size, we have chosen to illustrate our approach with data from several regions of France and Ireland.

Modeling a time-varying R eff
Our framework is based on three main components: a stochastic epidemiological model embedded in a state-space framework, a diffusion process for each time-varying parameter and a Bayesian inference algorithm based on adaptive particle Markov Chain Monte Carlo (PMCMC) (see Supporting Information). The main advantage of the state-space framework is that it explicitly considers the observation process. This accounts for the unknowns and the uncertainty in the partial observation of the disease. The proposed epidemiological model accounts for the transmission characteristics of COVID-19 and the features of the data. Figure 1 illustrates the model compartments and the transition flows between them. The temporal variation in the transmission rate β(t) was modeled by assuming that it is not driven by specific mechanistic terms but evolves randomly, though constrained by the data. We consider that β(t) follows a continuous diffusion process:
$$d\log\big(\beta(t)\big) = \nu \, dB(t) \qquad (1)$$
where ν is the volatility of the Brownian process (dB) to be estimated. The use of a Brownian process can be viewed as a non-specific hypothesis for the supposed variation of β(t), with the volatility ν acting as a regularizing factor. Intuitively, the higher the value of ν, the larger the variations in β(t), whereas a smaller ν induces smoother fluctuations of β(t). The logarithmic transformation avoids negative values, which have no biological meaning. Based on the SEIR model structure that accounts for the asymptomatic states (see Fig. 1 and eqs S2), the R eff (t) can be computed as: where 1/γ is the average infection duration, τ A is the fraction of asymptomatic individuals in the population, (1-τ A ) the proportion of symptomatic infectious individuals and q i the reduction in the transmissibility of some infected (I 2 ) and asymptomatics (A i ) (see Fig. 1 and eqs S2 in the Supporting Information). The effect of vaccination (already implemented in the studied regions) is introduced in our model simply through its effect on the depletion of susceptibles. The number of "effectively protected vaccinated people", proportional to the number of people vaccinated with one and/or two doses, is removed from the susceptible compartment (see eq. S3 in the Supporting Information).
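To illustrate the role of the volatility ν, here is a minimal sketch that simulates a transmission rate whose logarithm follows a Brownian motion, assuming the diffusion takes the form d log β(t) = ν dB(t), which is consistent with the logarithmic transformation and the volatility described above. The Euler-Maruyama discretization, the function name `simulate_beta` and all numerical values are assumptions chosen for illustration; this is not the inference code used in the study.

```python
import math
import random


def simulate_beta(beta0, nu, n_steps, dt=1.0, seed=0):
    """Euler-Maruyama simulation of d log(beta) = nu * dB.

    beta0: initial transmission rate (hypothetical value).
    nu: volatility of the Brownian process.
    Returns the path [beta(0), beta(dt), ..., beta(n_steps * dt)].
    """
    rng = random.Random(seed)
    log_beta = math.log(beta0)
    path = [beta0]
    for _ in range(n_steps):
        # Brownian increment over dt has standard deviation sqrt(dt)
        log_beta += nu * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        # Exponentiating keeps beta strictly positive
        path.append(math.exp(log_beta))
    return path


# Same random increments, two volatilities: small nu gives a smoother path
smooth = simulate_beta(beta0=0.8, nu=0.02, n_steps=100)
rough = simulate_beta(beta0=0.8, nu=0.20, n_steps=100)
```

Because both paths share the same seed, the excursions of log β in `rough` are ten times those in `smooth`, which makes the regularizing effect of ν directly visible. In the actual framework ν is not fixed by hand but estimated jointly with the other parameters.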
For parameter estimation we used PMCMC [28], which is suitable for partially observed stochastic non-linear systems (see Supporting Information). The implementation is done with the SSM software [26].
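The core ingredient of PMCMC is a particle filter that returns an estimate of the likelihood of the data for a given parameter set; a Metropolis-Hastings sampler then uses this estimate to accept or reject proposed parameters. The sketch below shows only that likelihood step, as a bootstrap particle filter on a deliberately simplified toy model (a random walk on log β observed through Poisson counts). The toy model, the function names, the `scale` parameter and the data are all assumptions for illustration; this is neither the SEIR model of this paper nor the SSM implementation.

```python
import math
import random


def poisson_logpmf(k, lam):
    """Log probability of observing count k under Poisson(lam)."""
    return k * math.log(lam) - lam - math.lgamma(k + 1)


def particle_loglik(counts, nu, beta0=1.0, scale=10.0, n_particles=200, seed=1):
    """Bootstrap particle filter estimate of the log-likelihood.

    Toy state-space model (an assumption, for illustration only):
      state:       log beta_t = log beta_{t-1} + nu * N(0, 1)
      observation: count_t ~ Poisson(scale * beta_t)
    """
    rng = random.Random(seed)
    particles = [math.log(beta0)] * n_particles
    loglik = 0.0
    for y in counts:
        # Propagate each particle through the random-walk transition
        particles = [x + nu * rng.gauss(0.0, 1.0) for x in particles]
        # Weight particles by the observation density (in log space)
        logw = [poisson_logpmf(y, scale * math.exp(x)) for x in particles]
        max_logw = max(logw)
        weights = [math.exp(lw - max_logw) for lw in logw]
        # Add the log of the average unnormalized weight to the log-likelihood
        loglik += max_logw + math.log(sum(weights) / n_particles)
        # Multinomial resampling proportional to the weights
        particles = rng.choices(particles, weights=weights, k=n_particles)
    return loglik


# Hypothetical daily counts
counts = [10, 12, 9, 14, 18, 22, 25]
ll = particle_loglik(counts, nu=0.1)
```

In a full PMCMC run, `particle_loglik` would be called once per proposed parameter value (here only ν) inside the Metropolis-Hastings loop, combined with the prior density.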
Our estimations of R eff were compared, based on data from Ireland, with those obtained with two other methods. The first method, proposed by Cori et al [14] and frequently used in recent studies analyzing COVID-19 data, relies on the number of new infections generated during a given time period and the serial interval distribution and is implemented in the EpiEstim package [29]. In the second method, new infections and hence R eff are generated using a simple discrete SIR model fitted with Kalman-filtering tools [30].

Data used
During the first wave of the COVID-19 epidemic, the number of cases reported was very low and associated with large uncertainties [10,12]. This was due, on the one hand, to the testing capacity (RT-PCR laboratory capacity), which was limited and varied greatly during this epidemic [12] and, on the other hand, to the features of this new virus, such as transmission before symptom onset and substantial asymptomatic transmission, which resulted in a low fraction of infected people attending health facilities for testing. This suggests that hospitalization data were likely to be the most accurate COVID-19 related data [9,17]. Thus, we focused on multiple hospital datasets in France and in Ireland. To avoid the shortcomings associated with the use of cumulative data (see [31]), we used incidence data, i.e. daily hospital admissions, daily ICU admissions, daily hospital deaths and daily hospital discharges. We also used the number of cases currently in hospital and in ICU. To account for the large variability in the daily observations, and in particular for a strong weekly component, we used a weekly average of the observed daily values from the 1st of May for French data and from the 1st of June for Irish data.
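The weekly averaging mentioned above can be sketched as follows. The text does not specify whether the average is taken over calendar weeks or over a sliding window, so this sketch assumes consecutive, non-overlapping 7-day blocks; the function name `weekly_average` and the admission counts are hypothetical.

```python
def weekly_average(daily_values):
    """Average daily counts over consecutive, non-overlapping 7-day blocks.

    A trailing incomplete week is averaged over the days available.
    """
    averages = []
    for start in range(0, len(daily_values), 7):
        week = daily_values[start:start + 7]
        averages.append(sum(week) / len(week))
    return averages


# Hypothetical daily hospital admissions with a weekend dip
daily_admissions = [30, 42, 45, 44, 41, 25, 18, 33, 40, 47, 43, 39, 22, 16]
weekly = weekly_average(daily_admissions)
```

Averaging over full weeks removes the within-week reporting pattern (for example lower counts at weekends) while preserving the weekly totals.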
In France, since the beginning of the epidemic, the French regional health agencies (ARS, Agences Régionales de Santé, https://www.ars.sante.fr/) have been reporting a number of aggregated statistics. We have also used data made available on the open platform for French public data [32]. In Ireland, these data are published by the Health Protection Surveillance Centre (HPSC), and we have used public data [33]. Since the beginning of 2021, these two open platforms have also published daily vaccination data.
Since these multiple hospital-related datasets were only available after the implementation of the first set of mitigation measures in France and in Ireland, and since our aim was to estimate R eff before, during and after the implementation of NPI measures, we also used reported incidence data for the period before the implementation of these measures.

Results
The temporal evolution of both the transmission rate β(t) and R eff (t) (Fig. 2A) and the fit of the model to the observed data are displayed in Fig. 2. Fig. 3 shows the time evolution of R eff (t) (Fig. 3A) and the dynamics of the model's unobserved compartments. Another important characteristic of this epidemic is the fact that the peaks of daily hospital admissions and daily ICU admissions are concomitant (Figs 2G-2H and S3,S6,S9,S12,S15,S18 Figs). This feature has been incorporated in the model by allowing hospitalization or admission to ICU at each stage of the symptomatic infection (Fig. 1 and eqs S2).
The capacity of our framework to describe the different types of data, some of which are characterized by large noise (within the Paris region for instance), is also illustrated in Fig. 2. The main advantage of this framework is its ability to reconstruct the time variation of the transmission rate β(t) (Fig. 2A). Based on the estimation of β(t), we can compute the time variation of R eff (t) (Fig. 2A) and thus reconstruct the observed dynamics of the COVID-19 epidemic. Figure 3 provides the results for each component of the epidemic dynamic model for the Paris region. Our main result is described in Fig. 4, which displays the temporal evolution of R eff in the five French regions considered and in Ireland.
Our estimates of the initial value of R eff lie in the range [3.0-3.5], in agreement with other estimates [11,34]. The peak of R eff (t) just before the start date of the first mitigation measures is presumably an effect of the model accommodating diverging trends between reported case data and hospital-related data. One can then note a decrease of about 80% in R eff between the 1st of March and the 1st of May (during the first lockdown) in all the regions considered (Fig. 4 and Table 1). The reduction in transmission following the second lockdown was smaller, between 40% and 55% in France and less than 25% in Ireland. In France, at the national level, the epidemic showed a slightly increasing plateau before the third wave in April 2021. However, at the regional level and for regions with low seroprevalence, a local third wave was observed. For these waves in Occitanie and Nouvelle Aquitaine, a reduction similar to that of the second wave was observed (Table 1). In Ireland, on the other hand, a large wave due to the UK variant was observed in January 2021, and the reduction of R eff was again substantial (>70%) (Table 1). In all these cases, given the timing of the decline relative to the timing of the NPIs, these sharp decreases seem to be the result of the implementation of the mitigation measures.
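The percentage reductions quoted above are relative decreases in R eff between two dates. As a small worked example of that arithmetic, assuming the reduction is computed from the R eff values at the start and end of the period, the sketch below uses hypothetical values (3.0 and 0.6) chosen only so that the result matches the order of magnitude reported for the first lockdown; they are not estimates from Table 1.

```python
def percent_reduction(r_before, r_after):
    """Relative decrease in R_eff between two dates, in percent."""
    return 100.0 * (1.0 - r_after / r_before)


# Hypothetical values chosen only to illustrate the arithmetic
reduction = percent_reduction(r_before=3.0, r_after=0.6)  # about 80 (percent)
```

Note that an 80% reduction starting from a value near 3 brings R eff below the threshold of 1, whereas a 25% reduction starting from a value around 1.3 leaves it close to 1.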
We can also notice that there is a larger variability at the end of the observation period. As the dynamics of the model are mainly driven by the hospitalization data, these data determine the transmission rate (and hence R eff ) only with a delay δt of the order of the average delay between infection and hospitalization, around 2-3 weeks, so that the estimates over the interval [t-δt,t] remain weakly constrained (Figs 4A-4B). Fig. 4 also illustrates the robustness and the sensitivity of our method. Firstly, Fig. 4A shows that similar medians and CI are found whether or not hospital discharge data are used during the inference process. Secondly, Figs 4A-4B highlight the sensitivity to asymptomatic transmission: when asymptomatic transmission increases (i.e. large q A value), lower values are inferred for β(t), hence leading to lower R eff (the opposite happens when q A decreases).

Figs 3C-3D show that the number of asymptomatic infectious individuals is of the same order of magnitude as the number of symptomatic infectious individuals, but with a larger uncertainty due to the lack of information in the data. Indeed, the data used contain very little information on asymptomatics and we observe identical prior and posterior distributions for the fraction of asymptomatics, τ A (S1,S5,S8 Figs).
In our modeling approach, one can also estimate the number of removed individuals. We can clearly show that, as the epidemic advances, the number of removed individuals increases and that this increase is amplified by the introduction of vaccination (Fig. 3H and S4,S7,S10,S13,S19 Figs), inducing a significant rise in seroprevalence (Table 2). Concerning vaccination, we have only considered an additional depletion term in the dynamics of the susceptibles (see Supporting Information, eqs S2-S3). The effects of vaccination on R eff (t) do not yet seem to be very significant, as the depletion of susceptibles appears to be offset by an increase of β(t). It is also worth noting that the seroprevalence calculated as of the 15th of May for France or the 1st of July for Ireland was entirely consistent with the published results of the seroprevalence surveys (Table 2).
* not using hospital discharge data; ** using hospital discharge data; *** 01-07-2020 for Ireland.
The comparison of our estimations of R eff with those of two other methods is summarized in Fig. 5. It is difficult to compare the absolute values of the estimates obtained with the three methods because the true values are unknown, so we limit ourselves to comparing the trends given by these methods. The trajectories of R eff over time computed by the other methods fall within the range of our 95% CI, which is large because of uncertainties in the transmission rate, in asymptomatic transmission and in the different delays needed to describe hospital data. We can also note that the 95% CI of R eff obtained with the EpiEstim method is very narrow and that its width is smaller than the variability of the fluctuations in its median. The main differences between the three estimates relate to the time-lags of the effects of the lockdown, the peaks of R eff and the date of crossing the threshold of 1. These lags range from one week to more than one month and, compared with the estimates provided by our method, are not consistently early or late in the date of crossing 1.
A final important point concerns the observed incidence. We used the incidence data until the 22nd of March (Fig. 2B, black points; S6, S9, S12, S15, S18 Figs) in the inference process, leading to a median of the posterior distribution of the reporting rate of 2.2% (95% CI: 1.5%-4.2%) for the Paris region (S2 Table). The plotted dynamics of the estimated observed incidence use this value for the whole trajectory. The values of the reporting rate for the other regions are in S2 Table. The comparison of the observed incidence, when available, with the simulated incidence (Fig. 2 and S3,S6,S9,S12,S15,S18 Figs) clearly illustrates that the reporting rate evolved greatly during the course of the epidemic. It is relatively easy to see that during the first wave the observed peak of incidence comes after the peak of hospitalization, whereas for the second wave the opposite happens, in agreement with what can be expected. Moreover, if we compare the intensity of the observed incidence waves, we can see that the second observed wave is 5 to 10 times greater than the first one, while model-based simulations suggest that they have similar magnitudes. This difference cannot be explained solely by the fact that, as the frequency of testing increases, it is also increasingly likely that some of the people who test positive are asymptomatic, whereas in the model the people tested are considered symptomatic. In any case, it appears crucial to be able to take into account a reporting delay that seems relatively large for the COVID-19 epidemic, using models for now-casting [35][36] or a detailed observation model [18].
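One simple way to make the evolution of the reporting rate visible is to divide the observed incidence by the model-simulated incidence at each time point. The sketch below assumes that both series are aligned in time and ignores reporting delays; the function name `implied_reporting_rate` and all numbers are hypothetical and are not taken from the figures or tables of this study.

```python
def implied_reporting_rate(observed, simulated):
    """Ratio of observed to model-simulated incidence, per time point.

    Returns None where the simulated incidence is zero.
    """
    return [
        obs / sim if sim > 0 else None
        for obs, sim in zip(observed, simulated)
    ]


# Hypothetical weekly incidences, not data from this study
simulated_infections = [10000, 20000, 5000, 12000]
observed_cases = [200, 500, 1000, 4800]
rates = implied_reporting_rate(observed_cases, simulated_infections)
```

With these toy numbers the implied rate rises from a few percent to several tens of percent, which is the kind of change that makes case-based estimates of R eff difficult to interpret.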

Discussion
For any epidemic, it is always essential to estimate pathogen transmissibility in order to provide information on its potential spread in the population. This is commonly done using the effective reproduction number R eff (t). Its time evolution allows us to follow, almost in real time, the course of the epidemic, and to obtain information regarding the potential effects of the mitigation measures taken.
Here we propose a mechanistic approach based on time-varying parameters embedded in a stochastic model coupled with Bayesian inference. This enables us to reconstruct the temporal evolution of the transmission rate of COVID-19 with the non-specific hypothesis that it follows a basic stochastic process constrained by the available data. Using this approach we can describe both the temporal evolution of the COVID-19 epidemic and its R eff (t). Thus, we can quantify the effect of mitigation measures on the epidemic waves in France and in Ireland. It is important to note that this methodology overcomes many of the biases associated with estimates of R eff (t) obtained using existing methods in the case of COVID-19 [6][7]. Indeed, these biases are related to underreporting of infectious cases, uncertainties in the generation time and in the serial interval, and also to the importance of silent transmission. Moreover, these biases are amplified by the fact that reporting delays have fluctuated widely over the course of the epidemic [8].
Within our framework, the estimates of R eff (t) represent the values required to reproduce the well-documented hospital data. Such data are clearly of a higher quality than the reported number of cases [17][18]. Moreover, this estimation accounts for transmission mechanisms and for the different delays that describe the transmission process (even during the asymptomatic phase) and the different processes related to the hospital (Fig. 1). Using our approach, by visualizing the evolution of R eff (t) we can follow the course of the COVID-19 epidemic in five French regions and in Ireland. We can therefore quantify the effect of the mitigation measures during and between epidemic waves. For the first lockdown, we estimated a decrease of around 80% in transmission in both countries. For the second wave, our estimated reductions were between 45% and 55% in French regions and around 20% in Ireland (Table 1). While France is undergoing another major epidemic wave, we have estimated a reduction of more than 70% in transmissibility during the third wave in Ireland (Table 1). These reductions in transmission may reflect the nature of the mitigation measures implemented in both countries. For the second wave, these measures were less restrictive than during the first wave; nevertheless, the second wave was also less severe. In Ireland, the mitigation measures introduced in the third wave were similar to those of the first wave. We also found other interesting results, such as a significant, high correlation between the trend in mobility and our estimate of transmission between the epidemic waves (see S22 Fig. and [37]), highlighting the importance of following the evolution of mobility when relaxing mitigation measures in order to anticipate the future spread of SARS-CoV-2.
We have also compared the trend of our estimations of R eff with those of two other methods from the literature [14,30] on data from Ireland (Fig. 5). One of the differences between our estimations and the two others is the greater variability of our R eff estimations, due to the complex underlying mechanistic model and to uncertainties in the transmission rate, in asymptomatic transmission and in the different delays needed to describe the multiple hospital datasets. A second difference lies in the asynchronous peaks of R eff . For instance, the peak of R eff occurs at the beginning of August for the two other methods, while our estimates suggest a peak in mid-September (Fig. 5). These differences may be explained by the differences in model complexity but, above all, by the fact that the two other methods computed their estimations based on new cases only, which are data subject to certain biases [6][7][8]. The peak of R eff in early August could be explained by an increase in testing during the summer holidays, while our estimates peaked later due to the increase in hospitalizations generated by higher numbers of new infections in early September, when the economy restarted. We therefore believe that our estimates, based on admissions to hospital and ICU and on deaths, were more consistent with the peak of the second wave of hospitalizations that was observed in late October in Ireland (S6-S7 Figs). This reinforces the relevance of our modeling and inference framework for presenting a more coherent picture of the evolution of this epidemic.
The main characteristic of our approach is the dependence of the R eff (t) estimations on the relevance of the underlying mechanistic model and on the accuracy and completeness of the available data. In our case, inference was based on hospital data that are clearly of a higher quality and accuracy than the observed number of infected cases. Moreover, despite its relative simplicity, the model incorporating a time-varying transmission rate is able to accurately describe the multiple hospital datasets, which included daily hospital admissions for COVID-19, daily ICU admissions, daily deaths in hospital and also the number of beds used each day in both hospital and ICU. Furthermore, our model can be partially validated by its quite good description of daily hospital discharges, which are not explicitly used in the fitting process. This partial validation is strengthened by the fact that our predicted seroprevalence in the French regions and in Ireland is in complete agreement with the results published from seroprevalence surveys in these settings (see Table 2). Among our model results, we can see that asymptomatic infectious individuals are as numerous as symptomatic ones, but are characterized by a larger uncertainty, due to the lack of information in the data (Figs 3C-3D and figures in the Supporting Information). This is in agreement with recent papers [11,24,38,39], which emphasize that the growth of the COVID-19 epidemic is driven by silent infections. This has also been highlighted in Ireland, where it has been estimated that during the second epidemic wave the ratio of silent infections to known reported cases was approximately 1:1 [40]. It is indeed interesting to note that our model estimates for asymptomatic cases in Ireland lead to similar ratios but with a large 95% CI (see S7 Fig.).
Our study is not without limitations. The model used here is, like all complex SEIR models developed for COVID-19, non-identifiable. This means that it is likely that several solutions, i.e. several sets of parameter values, can reproduce the observations, and we only present one of the most likely ones. This point is very often overlooked, but see [41]. One limitation is the use of the classical homogeneous mixing assumption, which assumes that all individuals interact uniformly and ignores heterogeneity between groups by sex, age or geographical region. However, this kind of data is not readily available. When mixing patterns among age groups are available at the individual level in contact tracing databases, they are only accessible following extensive ethical reviews. Another weakness is related to the absence of an age structure in the model, which would allow age-specific predictions to be generated. In all cases, considering an age structure and a contact matrix appears insufficient, and heterogeneity of contacts is important (see [39]). Nevertheless, in our opinion, these limitations are more than balanced by the fact that we take into account the non-stationarity of the epidemic data and that our results are mainly driven by hospital-related data, which are more accurate and timely than the number of infected cases. As our main objective was to infer a global R eff , and not to explore age-specific mitigation strategies, the simplification of the age structure appears justified.
The agreement between our findings for Ireland on the proportion of asymptomatic individuals and those of others provides further evidence of this [40].
As demonstrated previously, modeling with time-varying parameters is an interesting framework for describing the temporal evolution of an epidemic even if the knowledge about disease transmission is incomplete or uncertain [27]. Indeed, a large part of the unknowns can be absorbed into the time-varying parameters, which are described by a diffusion process but driven by the observed data. This is exactly what we have been confronted with during the COVID-19 pandemic, as the data are uncertain, as are the transmission mechanisms of SARS-CoV-2. We therefore proposed to model the spread of this disease using a stochastic model with a time-varying transmission rate inferred using multiple well-documented hospital datasets. Knowing the transmission rate makes it possible to easily calculate R eff (t), which is a key parameter of the epidemic, in order to monitor the potential effects of public health policies on the course of the COVID-19 epidemic. Therefore, we believe that this framework could be particularly useful to analyze the future evolution of the epidemic in relation to the emergence of new variants and to help refine potential mitigation measures after the third wave, during the period when these measures will be progressively lifted, pending the complete vaccination of the population.

Supporting information
S1 Text: Description of model formulation and inference method used, including S1 and S2 Tables. U stands for uniform distribution and tN for truncated normal distribution (tN[mean,std,limit inf,limit sup]).
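As an illustration of the prior notation above, the following sketch draws samples from a uniform prior U and from a truncated normal prior tN[mean,std,limit inf,limit sup]. Rejection sampling is one simple way to sample a truncated normal and is an assumption here, as are the function names and the numerical values, which are not taken from S1 or S2 Tables.

```python
import random


def sample_truncated_normal(mean, std, lower, upper, rng):
    """Draw from tN[mean, std, lower, upper] by rejection sampling."""
    while True:
        x = rng.gauss(mean, std)
        # Keep only draws that fall inside the truncation limits
        if lower <= x <= upper:
            return x


def sample_uniform(lower, upper, rng):
    """Draw from the uniform prior U[lower, upper]."""
    return rng.uniform(lower, upper)


rng = random.Random(42)
# Hypothetical priors for illustration only
tn_draws = [sample_truncated_normal(0.5, 0.1, 0.3, 0.7, rng) for _ in range(1000)]
u_draws = [sample_uniform(0.0, 1.0, rng) for _ in range(1000)]
```

Rejection sampling is efficient when the truncation limits cover most of the normal mass, as in this example; for limits far in the tails an inverse-CDF method would be preferable.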

S1 Figure:
Prior and posterior distributions for the model inference presented in Fig. 2. I 1 (0) is the initial number of infectious individuals, ν is the volatility of the Brownian process of β(t), 1/σ the average duration of the incubation, 1/γ the average duration of the infectious period, 1/κ the average hospitalization period, 1/δ the average time spent in ICU, τ A the fraction of asymptomatics, τ H the fraction of infectious hospitalized, τ I the fraction of ICU admission, τ D the death rate, ρ I the reporting rate for the infectious, ρ H the reporting rate for the hospitalized people.

1), σ the incubation rate, γ the recovery rate, 1/κ the average hospitalization period, 1/δ the average time spent in ICU, τ A the fraction of asymptomatics, τ H the fraction of infectious hospitalized, τ I the fraction of ICU admission, τ D the death rate, q 1 and q 2 the reduction in the transmissibility of I 2 and A i , q I the reduction in the fraction of people admitted in ICU and q D the reduction in the death rate.

The blue lines are the median of the posterior estimates of R eff (t) and the orange and yellow areas are the 95% CI of R eff. In (A) the orange area corresponds to the case where hospital discharges are included in the inference process, whereas the yellow area corresponds to the model that does not account for them.
In (A) and (B) the dashed curves represent the median of R eff for a preceding time period (blue), or computed with lower transmissibility of the asymptomatics, q A = 0.40 (black), or computed with higher transmissibility, q A = 0.70 (red). The vertical black dashed lines correspond to the start dates of the main mitigation measures; the dot-dashed lines are for cases where only one part of the region has been subjected to these measures. The horizontal dashed line is the threshold R eff = 1.

S2 Figure: The traces of the MCMC chain for the model inference in Fig. 2. I 1 (0) is the initial number of infectious, ν is the volatility of the Brownian process of β(t), 1/σ the average duration of the incubation, 1/γ the average duration of infectious period, 1/κ the average hospitalization period, 1/δ the average time spent in ICU, τ A the fraction of asymptomatics, τ H the fraction of infectious hospitalized, τ I the fraction of ICU admission, τ D the death rate, ρ I the reporting rate for the infectious, ρ H the reporting rate for the hospitalized people.

S3 Figure: Reconstruction of the observed dynamics of COVID-19 in Ile-de-France, the Paris region, but for the inference process the hospital discharges have been used. Caption as for Fig. 2. The black points are observations used by the inference process, the white points are the observations not used.

S4 Figure: Dynamics of COVID-19 in Ile-de-France, the Paris region, but for the inference process the hospital discharges have been used. Caption as for Fig. 3. The black points are observations used by the inference process, the white points are the observations not used.

S5 Figure: Prior and posterior distributions for the model inference presented in S3 Fig. Caption as for S1 Fig.

S6 Figure: Reconstruction of the observed dynamics of COVID-19 in Ireland. Caption as for Fig. 2 but average daily data of the current week is used after 01-06-2020. The black points are observations used by the inference process, the white points are the observations not used.

S7 Figure: Dynamics of COVID-19 in Ireland. Caption as for Fig. 3. The black points are observations used by the inference process, the white points are the observations not used.

S8 Figure: Prior and posterior distributions for the model inferences presented in S6 Fig. Caption as for S1 Fig.

S22 Figure: Parallel trends in effective reproduction number and public transport mobility (https://www.google.com/covid19/mobility/).
(A) Ile-de-France, (B) Ireland, (C) Provence Alpes Côte d'Azur, (D) Occitanie, (E) Nouvelle-Aquitaine, (F) Auvergne Rhône Alpes. Black line: time evolution of the estimated R eff (t); blue line: public transport mobility. In (A) the black line corresponds to the case where hospital discharges are included in the inference process, whereas the dashed line corresponds to the model that does not account for them. The vertical black dashed lines correspond to the start dates of the main mitigation measures; the dot-dashed lines are for cases where only one part of the region has been subjected to these measures.

Figure 2. Reconstruction of the observed dynamics of COVID-19 in Ile-de-France, the Paris region. (A) Time evolution of both β(t) and R eff (t). (B) Simulated and observed incidence. (C-D) New daily admissions to hospital and to ICU. (E) Daily new deaths. (F) Hospital discharges. (G-H) Cases in hospital and in ICU per day (average of daily data over the current week is used after 01-05-2020). The black points are observations used in the inference process, the white points are the observations not used. The blue lines are the median of the posterior estimates of the simulated trajectories, the purple areas are the 50% Credible Intervals (CI) and the light blue areas the 95% CI. In (A) the orange area is the 50% CI of R eff. The vertical dashed lines show the implementation dates of the main NPI measures and the dot-dashed lines are for cases where only one part of the region has been subjected to these measures. The horizontal dashed line is the threshold R eff = 1. For (B-H), the corresponding reporting rate is applied to the simulated trajectories for comparison with observations.

Figure 3. Model dynamics of COVID-19 in Ile-de-France, the Paris region. (A) Time evolution of susceptibles S(t) and R eff (t). (B) Infected non-infectious, E(t) = E 1 (t) + E 2 (t). (C) Symptomatic infectious, I(t) = I 1 (t) + I 2 (t). (D) Asymptomatic infectious, A(t) = A 1 (t) + A 2 (t). (E) Hospitalized individuals, H(t) = H 1 (t) + H 2 (t) + ICU(t). (F) Individuals in ICU, ICU(t). (G) Cumulative death D(t). (H) Removed individuals R(t). The blue lines are the median of the posterior estimates of the simulated trajectories, the purple areas are the 50% Credible Intervals (CI) and the light blue areas the 95% CI. In (A) the orange area corresponds to the 50% CI of R eff. The black points are observations used in the inference process, the white points are the observations not used. In (H) the red line shows the median of R(t) when the "effectively protected vaccinated people" have been subtracted.

Figure 4. Time-varying R eff (t) in five regions of France and in Ireland. (A) Ile-de-France, (B) Ireland, (C) Provence Alpes Côte d'Azur, (D) Occitanie, (E) Nouvelle-Aquitaine, (F) Auvergne Rhône Alpes. The blue lines are the median of the posterior estimates of R eff (t) and the orange and yellow areas are the 95% CI of R eff. In (A) the orange area corresponds to the case where hospital discharges are included

Figure 5. Comparison between our R eff (t) estimation and those obtained with two other methods based on Irish multiple datasets. A/ Comparison with the method implemented in the EpiEstim R package (http://metrics.covid19-analysis.org/). B/ Comparison with the method proposed by Arroyo-Marioli et al. [28] (http://trackingr-env.eba-9muars8y.us-east-2.elasticbeanstalk.com/). Blue lines are the median of the posterior of our estimates of R eff (t) and orange areas are the corresponding 95% CI of our R eff estimates. The black lines represent the median of R eff (t) for the other methods and the black dot-dashed lines delimit the corresponding 95% CI. In B/ the black dashed lines delimit the 65% CI. The vertical black dashed lines correspond to the start dates of the main mitigation measures. The horizontal dashed line is the threshold R eff = 1.
* not using hospital discharge data; ** using hospital discharge data.

Table 2. Comparison between seroprevalence from different surveys and our seroprevalence estimations for the Ile-de-France region, Ireland and four other French regions: Provence Alpes Côte d'Azur (PACA), Occitanie (OC), Nouvelle-Aquitaine (NA), Auvergne Rhône Alpes (ARA). For April 2021, we also show our seroprevalence estimations when "effectively protected vaccinated people" are subtracted from the removed compartment.

Flow diagram of a generalized SEIR model accounting for asymptomatic transmission and a simplified hospital system (see eqs S2). The variables are: the susceptibles S, the infected non-infectious E, the infectious symptomatic I, the infectious asymptomatic A, the removed individuals R, and the hospital-related variables: the hospitalized individuals H, the individuals in intensive care unit ICU, and the deaths at hospital D. The subscripts 1 and 2 stand for the two stages of the Erlang distribution of the sojourn times in E, A, I, H. The flow from H 2 to R represents hospital discharge. Flows in blue are from hospital (H i ) and flows in red from ICU. With λ'(t) = β(t)·(I 1 + q 1 ·I 2 + q 2 ·(A 1 + A 2 ))/N, the force of infection is λ(t) = λ'(t)·S(t), where β(t) is the time-varying transmission rate.

The Supplementary Figures are at this address: https://www.dropbox.com/s/gf1pydxukznwm7f/0430.zip?dl=0
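The force of infection given in the flow diagram caption can be written as a short function, which may help when checking the role of the reduction factors q 1 and q 2. This is a minimal sketch: the function name `force_of_infection` and the numerical values in the example are hypothetical and are not taken from the fitted model.

```python
def force_of_infection(beta, S, I1, I2, A1, A2, N, q1, q2):
    """Return lambda(t) = lambda'(t) * S(t), following the caption's definition.

    lambda'(t) = beta(t) * (I1 + q1*I2 + q2*(A1 + A2)) / N
    q1 and q2 reduce the transmissibility of I2 and of the asymptomatics.
    """
    lam_prime = beta * (I1 + q1 * I2 + q2 * (A1 + A2)) / N
    return lam_prime * S


# Hypothetical compartment sizes and parameters, for illustration only
lam = force_of_infection(
    beta=0.5, S=900, I1=10, I2=10, A1=5, A2=5, N=1000, q1=0.5, q2=0.5
)
```

Setting q 1 = q 2 = 1 recovers the case where all infectious compartments transmit equally.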