Modeling the spread of COVID-19 in Germany: Early assessment and possible scenarios

The novel coronavirus (SARS-CoV-2), identified in China at the end of December 2019 and causing the disease COVID-19, has meanwhile led to outbreaks all over the globe with about 2.2 million confirmed cases and more than 150,000 deaths as of April 17, 2020. In this work, mathematical models are used to reproduce data of the early evolution of the COVID-19 outbreak in Germany, taking into account the effect of actual and hypothetical non-pharmaceutical interventions. Systems of differential equations of SEIR type are extended to account for undetected infections, stages of infection, and age groups. The models are calibrated on data until April 5. Data from April 6 to 14 are used for model validation. We simulate different possible strategies for the mitigation of the current outbreak, slowing down the spread of the virus and thus reducing the peak in daily diagnosed cases, the demand for hospitalization or intensive care unit admissions, and eventually the number of fatalities. Our results suggest that a partial (and gradual) lifting of introduced control measures could soon be possible if accompanied by further increased testing activity, strict isolation of detected cases, and reduced contact to risk groups.


Introduction
In late December 2019, several cases of acute respiratory syndrome were first reported in Wuhan City (Hubei region, China) by Chinese public health authorities. A novel coronavirus was soon identified as the main causative agent. It is now known as severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). The disease caused by SARS-CoV-2, which rapidly spread first through China and then to other countries, is now referred to as coronavirus disease 2019 (COVID-19). The World Health Organization declared COVID-19 a global pandemic on March 11, 2020 [1,2]. As of April 17, 2020, about 2.2 million cases and more than 150,000  [13,14] and 4.71 [15], or could even be larger than 6 [16]. The (effective) reproduction number R for a given population is the corresponding time-dependent quantity which reflects the number of secondary infections generated by one infectious individual in the current population, and is affected by intervention measures aimed at controlling the spread of the disease. Among the first studies to assess the practical implications of public health interventions, Tang et al. [16] identified contact tracing followed by quarantine and isolation, as well as travel restrictions, as the most effective measures to contain the epidemic. Meyer-Hermann and coauthors [23] recently predicted the evolution of the reproduction number for the spread in Germany, with detailed analysis for all federal states, and observed that as of April 3, 2020 the reproduction number R was lowered to values near 1 in all federal states.
In this study, we predict the spread of COVID-19 among the German population by means of mathematical modeling, simulating the implementation or withdrawal of non-pharmaceutical interventions. First model results were presented in our preliminary work [24] and incorporated into the advisory paper [25]. Here we present a follow-up of our study, including case and fatality data, and information about testing activity [26], as of the beginning of April 2020. The proposed setting allows investigating how a specific intervention scenario affects the dynamics of the epidemic, with particular attention to interactions between individuals of the same or different age groups (children, adults, and people older than 60 years). We consider the following scenarios:
• Minimal intervention: The main factor slowing down disease transmission is people's increased awareness in response to initial recommendations coming from health institutions and local governments, and to media coverage (e.g., washing hands, proper coughing and sneezing, keeping distance from obviously sick persons, limited (self-)quarantine of known or suspected cases);
• Baseline scenario: Hitherto adopted main control measures (closure of schools and universities, remote working policy, isolation of identified cases, contact restrictions, a partial economic shutdown, and levels of testing activity as of March 15, 2020) are assumed to be maintained throughout 2020. Model parameters which include such control measures were calibrated on reported cases in Germany as of April 5, 2020 and were used to project data until April 14, 2020;
• High vigilance: This scenario is obtained by enriching the baseline measures with significantly increased testing activity (not only suspected COVID-19 cases but also persons without symptoms or known close contacts to identified cases) and a strict isolation protocol of detected COVID-19 cases for about two weeks;
• Educational/economic reopening: Partial lifting of the restrictions imposed thus far, gradually reopening schools, universities, and resuming economic activities, though largely maintaining remote working policy and limiting use of public transport service and organized club activities. Significantly increased testing activity and strict isolation protocol of detected COVID-19 cases are maintained;
• Phase-out: Gradual lifting of most control measures applied so far (reopening schools, resuming work, regular public transport service, resuming most social and economic activities). Increased testing activity and strict isolation protocol of detected COVID-19 cases are maintained, and the population is assumed to uphold minimal awareness measures;
• Cautious phase-out: As the phase-out scenario, but with slower rollback to educational, social and economic activities, and with improved measures to protect elderly and people at risk.
Our model predicts that the current control measures are necessary to slow down or even suppress the spread of the epidemic and that the removal of restrictions in favor of social and economic activities will accelerate the growth of case numbers unless it is accompanied by a significantly strengthened testing and case isolation policy. However, under such increased vigilance, combined with particular care regarding patients at high risk, we project that a gradual phase out of the most severe measures starting around May 5 will lead to a progression of the epidemic that is sufficiently slow to be handled by the health care system.

Mathematical models and methods
The mathematical models adopted for this study are based on systems of differential equations that describe interactions between different groups of individuals in the population. The proposed approach extends the known S-E-I-R (susceptibles-exposed-infected-recovered) model for disease dynamics [27]. Individuals are classified according to their status with respect to the virus spread in the community. In particular we distinguish between individuals who have been exposed to the virus but are not yet infectious, asymptomatic infectives, infectives with mild or influenza-like symptoms (not reported as SARS-CoV-2 infections), and reported SARS-CoV-2 infectives. We assume that infected patients without SARS-CoV-2 diagnosis are unlikely to die of the virus-induced disease. Undetected infections lead to undetected recoveries in the population, which cannot be reported unless testing for ongoing (virus detection) or previous (antibody detection) SARS-CoV-2 infections is performed.

The core model-Homogeneous population
The most basic approach that we adopted to understand the evolution of the epidemic in time is based on the assumption that the population is homogeneous (in particular with respect to space, as well as to age and social habits of individuals). Of course, this simplistic assumption does not reflect the multidimensional complexity of the ongoing situation, but it can help in determining major factors affecting disease spread. Individuals are classified as follows. Susceptible individuals (S) are those who can be infected with SARS-CoV-2. Exposed individuals (E) have been infected with SARS-CoV-2, though they are not yet infectious, nor symptomatic. After a latent phase (on average about 5.5 days from infection [6]) the first COVID-19 symptoms occur (with probability ρ 0 ) and the exposed individual becomes infectious. We distinguish between asymptomatic undetected (U) COVID-19 patients, symptomatic but not yet detected infectives (I) and reported COVID-19 cases (H). For individuals developing symptoms, the probability of the disease being detected immediately is η 0 , and the probability of detection at a later stage is denoted by η 1 . Individuals who recovered from a detected (R) or an undetected (R u ) infection, as well as patients who died from the infection (D), are removed from the chain of transmission. The transmission diagram of the model is depicted in Fig 1. Susceptible individuals can be infected via contacts with asymptomatic (transmission rate β U ), symptomatic undetected (β I ) and reported cases (β H ). We assume that asymptomatic infectives do not restrict their contacts to others, and therefore have higher transmission rates than symptomatic infected individuals. Detected cases are supposed to reduce their contacts even further. Due to limitations in the identifiability of the parameters with the available data, we fix the ratios of β I and β H to β U and estimate only β U .
Duration of latency (1/γ E ) and infectious periods (1/γ I , 1/γ H and 1/γ U , respectively) are assumed in accordance with available literature. Further details on parameter assumptions are given in Table 1. The dynamics of the core model described above and shown in Fig 1 is given by the system of differential equations (1), where N ≈ 83 million is the total population. The basic reproduction number of system (1) can be calculated analytically, e.g., by means of the next-generation matrix approach [28], and is given by formula (2). The effective reproduction number R at time t can be obtained from the same formula by substituting the values of the time-dependent parameters in (2) and multiplying the result by the susceptible fraction of the population, S(t)/(N − D(t)).
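To make the compartment structure concrete, the following Python sketch integrates one plausible reading of the verbal description of the core model. It is an illustration under stated assumptions and not the published system (1): the exact routing of flows between compartments, the fatality fraction `mu`, and all numerical values are hypothetical placeholders, and a hand-written Runge-Kutta step is used so that the snippet runs with the standard library only.

```python
# Illustrative core model with compartments S, E, U, I, H, R, Ru, D.
# ASSUMPTIONS: flow structure, the fatality fraction "mu", and every
# numerical value below are placeholders chosen for illustration.

def core_rhs(y, p):
    """Right-hand side of the assumed core model."""
    S, E, U, I, H, R, Ru, D = y
    # Force of infection from undetected asymptomatic, undetected symptomatic
    # and reported infectives.
    lam = (p["beta_U"] * U + p["beta_I"] * I + p["beta_H"] * H) / p["N"]
    out_E = p["gamma_E"] * E  # individuals leaving the latent phase
    dS = -lam * S
    dE = lam * S - out_E
    dU = (1 - p["rho0"]) * out_E - p["gamma_U"] * U
    dI = p["rho0"] * (1 - p["eta0"]) * out_E - p["gamma_I"] * I
    dH = (p["rho0"] * p["eta0"] * out_E
          + p["eta1"] * p["gamma_I"] * I      # detected at a later stage
          - p["gamma_H"] * H)
    dR = (1 - p["mu"]) * p["gamma_H"] * H
    dRu = p["gamma_U"] * U + (1 - p["eta1"]) * p["gamma_I"] * I
    dD = p["mu"] * p["gamma_H"] * H
    return [dS, dE, dU, dI, dH, dR, dRu, dD]


def rk4(y, p, dt, steps):
    """Classical fourth-order Runge-Kutta integration; returns all states."""
    traj = [y]
    for _ in range(steps):
        k1 = core_rhs(y, p)
        k2 = core_rhs([a + 0.5 * dt * b for a, b in zip(y, k1)], p)
        k3 = core_rhs([a + 0.5 * dt * b for a, b in zip(y, k2)], p)
        k4 = core_rhs([a + dt * b for a, b in zip(y, k3)], p)
        y = [a + dt / 6 * (b + 2 * c + 2 * d + e)
             for a, b, c, d, e in zip(y, k1, k2, k3, k4)]
        traj.append(y)
    return traj


N = 83e6
params = {
    "N": N,
    "beta_U": 0.5, "beta_I": 0.25, "beta_H": 0.05,   # fixed ratios to beta_U
    "gamma_E": 1 / 5.5, "gamma_U": 1 / 7, "gamma_I": 1 / 8, "gamma_H": 1 / 8,
    "rho0": 0.7, "eta0": 0.1, "eta1": 0.3, "mu": 0.03,
}
y0 = [N - 100.0, 100.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
traj = rk4(y0, params, dt=0.1, steps=600)  # 60 simulated days

# Cumulative reported cases correspond to the sum H + R + D.
cumulative_reported = [state[4] + state[5] + state[7] for state in traj]
```

In this sketch the total population is conserved by construction, which is a convenient sanity check when the flow structure is modified.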
To determine model parameters that could not be inferred from the literature, we use the sum of reported cases from three different time periods (February 28 to March 11, March 11 to March 22, and March 22 to April 11) and fit the sum of H, R, and D to these data using the trust region reflective method [29] as implemented in SciPy [30]. The first period does not include any global contact reducing measures except for school closings in the county of Heinsberg. The second period includes nationwide school closings and an increased use of working from home where possible. The third period includes the closing of most stores and general constraints on contacts in the public sphere. For each period we fit η 1 , 1 − η 0 , and β U , while keeping β I and β H proportional to β U with proportionality factors given in Table 1. Fitted model solutions at the end of each time interval are taken as initial values for integration of system (1) over the next time period. To test the sensitivity of the results to variations in the estimated parameters, we add Monte Carlo sampling of the parameters, generating a large ensemble of model fits. Each of these fits contributes to the solution according to its Akaike weight. The Akaike weight is determined based on the Akaike information criterion [31]. For few data points the corrected Akaike information criterion (AICc) [32] takes the form

$$\mathrm{AICc} = n \ln\left(\frac{\mathrm{SSE}}{n}\right) + 2k + \frac{2k(k+1)}{n-k-1},$$

where SSE is the sum of squared errors, n is the number of data points, and k is the number of degrees of freedom. The weight of each fit i in a set of J fits is then given as

$$w_i = \frac{\exp(-\Delta_i/2)}{\sum_{j=1}^{J} \exp(-\Delta_j/2)},$$

where Δ i = AICc i − AICc min . Based on the sample values and their weights, we can construct a histogram for each parameter and derived properties (cf. Fig 2). A narrowly peaked histogram indicates a well determined parameter with a small standard deviation and a correspondingly narrow confidence interval, whereas a flat histogram shows that the parameter may be indeterminate and that the fit is not sensitive to its variations. For a sufficiently large number of points, we can determine, for example, a 95% confidence interval directly from the histogram by excluding the leftmost and rightmost 2.5% (marked in light blue in Fig 2).
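The weighting of an ensemble of fits can be sketched in a few lines of Python. The snippet assumes the standard least-squares form of the AICc and the usual definition of Akaike weights; the function names and the sample SSE values are hypothetical, and the interval routine is only one simple way of discarding 2.5% of the weight on each side.

```python
import math


def aicc(sse, n, k):
    """Corrected AIC for a least-squares fit (requires n > k + 1)."""
    return n * math.log(sse / n) + 2 * k + 2 * k * (k + 1) / (n - k - 1)


def akaike_weights(aicc_values):
    """Normalized weights exp(-delta_i / 2), with delta_i = AICc_i - AICc_min."""
    best = min(aicc_values)
    rel = [math.exp(-0.5 * (a - best)) for a in aicc_values]
    total = sum(rel)
    return [r / total for r in rel]


def weighted_interval(samples, weights, level=0.95):
    """Central interval that drops (1 - level) / 2 of the weight on each side."""
    tail = (1 - level) / 2
    cum = 0.0
    lower = upper = None
    for value, w in sorted(zip(samples, weights)):
        cum += w
        if lower is None and cum >= tail:
            lower = value
        if upper is None and cum >= 1 - tail:
            upper = value
    return lower, upper


# Hypothetical ensemble: three fits of a 3-parameter model to 30 data points.
sse_values = [12.0, 12.5, 20.0]
scores = [aicc(s, n=30, k=3) for s in sse_values]
weights = akaike_weights(scores)
```

Fits with a larger sum of squared errors receive exponentially smaller weight, so a few poor Monte Carlo samples barely influence the weighted histogram.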

Sensitivity analysis of R and peak cases in the core model
Sensitivity analysis of the reproduction number and of the number of cases at the outbreak peak was further investigated by means of scatter plots (results not shown here) and Sobol analysis (SALib library [36] in Python). The Sobol method is a variance-based method for global sensitivity analysis that decomposes the variance or sensitivity of the model output into contributions of individual parameters or groups of parameters [37,38]. The first-order sensitivity index (S1)-also called main effect-is the ratio of the partial variance of an individual parameter with respect to the total variance. The total-order sensitivity index (ST)-also called total effect-measures the overall influence (including higher-order interactions) of each parameter on the model output. The values of both S1 and ST range between 0 and 1, where higher values indicate a greater contribution of the parameter to the model output [39]. When the total effect due to a parameter is much larger than its main effect, one may need to look at higher-order sensitivity indices to look for interactions between parameters. However, this was not the case for the parameters that we have considered (results not shown here).
We computed the sensitivity indices of the parameters that could not be determined from literature or are expected to vary between populations and over time, namely the transmission parameters, the probability of developing symptoms, and the probability of being detected. Of these parameters, ρ 0 , η 0 , and η 1 are varied between 0 and 1, β U is varied between 0 and 3, and β I and β H are assumed to be fixed multiples of β U as detailed in Table 1. All the parameters are varied simultaneously to also observe their interaction effects.
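The variance decomposition behind the first-order and total-order indices can be illustrated without any external library. The sketch below is a hand-written Monte Carlo estimator (Saltelli-type for S1, Jansen-type for ST), not the SALib implementation used in the study, and `toy_output` is a hypothetical stand-in for the model output rather than formula (2). Only the parameter ranges are taken from the text.

```python
import random


def sobol_indices(f, bounds, n=20000, seed=0):
    """Estimate first-order (S1) and total-order (ST) Sobol indices of f."""
    rng = random.Random(seed)

    def draw():
        return [lo + (hi - lo) * rng.random() for lo, hi in bounds]

    A = [draw() for _ in range(n)]
    B = [draw() for _ in range(n)]
    fA = [f(*row) for row in A]
    fB = [f(*row) for row in B]
    mean = (sum(fA) + sum(fB)) / (2 * n)
    var = (sum((y - mean) ** 2 for y in fA)
           + sum((y - mean) ** 2 for y in fB)) / (2 * n)
    s1, st = [], []
    for i in range(len(bounds)):
        # Rows of A with column i taken from B.
        fAB = [f(*(a[:i] + [b[i]] + a[i + 1:])) for a, b in zip(A, B)]
        s1.append(sum((yb - mean) * (yab - ya)
                      for yb, yab, ya in zip(fB, fAB, fA)) / n / var)
        st.append(0.5 * sum((ya - yab) ** 2
                            for ya, yab in zip(fA, fAB)) / n / var)
    return s1, st


def toy_output(rho0, eta0, eta1, beta_U):
    """HYPOTHETICAL output: transmission reduced by the detected fraction."""
    detected = rho0 * (eta0 + (1 - eta0) * eta1)
    return beta_U * (1 - 0.8 * detected)


# Parameter ranges as stated in the text: probabilities in [0, 1], beta_U in [0, 3].
bounds = [(0.0, 1.0), (0.0, 1.0), (0.0, 1.0), (0.0, 3.0)]
s1, st = sobol_indices(toy_output, bounds, n=5000, seed=1)
```

A total-order index that clearly exceeds the corresponding first-order index would point to interactions between parameters, which is the situation in which higher-order indices become relevant.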

Population structured by age and stages of infection
Refining the core model described above, we include age groups and stages within infective compartments. This makes it possible to consider features like the immune response of individuals during infection or social behavior, in particular interactions among individuals of the same or different age groups. Based on the statistical analysis of the RKI data in our previous study [24], we distinguish three groups: children (0-14y), adults (15-59y) and people 60y or older. For each age group the model tracks susceptibles (S), exposed (E), undetected asymptomatic infectives (U), undetected symptomatic infectives (I), diagnosed infectives (H), recovered (R), undetected recovered (R u ), and deceased (D) individuals, as detailed above for the core model. To obtain more realistic distributions for the duration of the exposed and infective compartments, we split each of these in three stages (E j , U j , I j , H j , j = 1, 2, 3). This is a classical extension of the standard disease transmission model to account for non-exponential distributions of incubation and infectious periods (cf., e.g., [18] for another example of application in modeling). This results in a total of one plus nine infective compartments per age class (stage E 3 is assumed to be infective as well, as individuals have been reported to be infectious before symptom onset [6,11]). The age classes evolve in parallel (maturation during the course of the outbreak is neglected along with demographics in general), but are coupled to one another by contact rates and disease transmission among individuals of different age groups.

PLOS ONE
The impact of current and future control measures on the spread of COVID-19 in Germany
After virus transmission, an exposed individual is assumed to travel through the stages E 1 , E 2 , and E 3 before entering either of the infection stages I 1 , U 1 , or H 1 . In the course of the disease the individual will then pass through the infection stages of the respective compartment. Both symptomatic and asymptomatic infectives in stages U i or I i can enter stages H i+1 or H i , respectively, by being tested. The probability of being tested and positively diagnosed is assumed to be larger for symptomatic than for asymptomatic individuals. Since individuals at stage E 3 are already infectious [11], late symptom onset can be viewed as a prolongation of the latency period. We have therefore omitted any transitions from U to I. The individual leaves the last stage of infection (U 3 , I 3 ) by either recovering to R or R u , depending on whether the infection has been diagnosed. Given the current efforts to detect and control the spread of the disease, it seems a justified assumption that the vast majority of deaths caused by SARS-CoV-2 are investigated, hence we assume only diagnosed individuals (H) to be fatally affected by the disease. Even in the case that the presence of the virus is only discovered postmortem, we model this as detection followed by death (transition I to D).
Equations and parameters of the age structured model. For each age class A we have the compartments introduced above, S, E_j, U_j, I_j, H_j (j = 1, 2, 3), R, R_u, and D, each indexed by A. The transmission rate for S_A individuals is given by

$$\frac{1}{N} \sum_{B} \sum_{X} \beta^{B,A}_{X} \, X_B,$$

where N is the total population and $\beta^{B,A}_{X}$ denotes the transmission rate due to contacts of infectious individuals in compartment $X \in \{E_3, U_j, I_j, H_j \mid j = 1, 2, 3\}$ and age class B, $X_B$, with susceptibles in age class A. Contact rates can be summarized in a (3 × 30) matrix $B = (\beta^{B,A}_{X})_{X,B,A}$. Entries of the contact matrix B are assumed to be influenced by (i) the overall aggressiveness of the virus, β 0 , that describes how easily the virus is transmitted and is a common factor for all entries of B, (ii) an age specific susceptibility factor σ A that describes the average biological susceptibility to the virus and the average activity level (in terms of social contacts) of the given age group, A, and is a common factor for each row of B, (iii) the effective infectivity $\iota^{B}_{X}$ of each given compartment X and age group B that depends on the activity level and the plain infectivity of $X_B$, and is a common factor in each column of B, and finally (iv) a likeness factor φ BA that describes the mixing of age group B with age group A. The factor β 0 has to be estimated by fitting the model to data provided by the RKI [34]. The same is true for the age specific susceptibilities that depend on the immune response of the individual (supposedly on average stronger in younger individuals). The effective infectivity $\iota^{B}_{X}$ is supposed to be lowest for diagnosed individuals (H j ) thanks to deliberate contact reduction. We also assume symptomatic individuals to be less active than asymptomatic ones, which means that $\iota^{B}_{X}$ is smaller for X = I j than for X = U j , even though the biological infectivity should be higher for symptomatic individuals than for asymptomatic ones.
This activity reduction results from both the symptoms restricting the mobility of an affected individual as well as them being aware of their potential to spread a communicable disease. The likeness factor is hard to estimate. Following the work by Prem et al. [21], who focused on the population in Hubei province in China, we consider contacts in three different realms: at home, at work and school, and in other locations, most notably during leisure activities. For each activity, we assume specific contact distributions depending on the ages of the individuals. For example, we assume high contact rates among adults and between adults and children at home. The separation of the realms allows for each intervention to produce a specific effect. For example, school closures as part of the CS intervention (cf. Table 2) predominantly act on the child-child contacts in the school realm. The resulting matrix B under the influence of different interventions is graphically illustrated in Fig 3. Both the assumptions and the final overall contact matrices are largely in keeping with the findings of, e.g., [40] for general communicable diseases.
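The four factors that shape the transmission matrix can be combined as in the following Python sketch. It assumes that each entry is simply the product of the overall factor, the age-specific susceptibility, the effective infectivity, and the likeness factor, which is one natural reading of the description above. All numerical values, the dictionary layout, and the function name are hypothetical.

```python
AGES = ["children", "adults", "seniors"]
# E3 plus three stages each of U, I and H: ten infective compartments per age class.
COMPARTMENTS = ["E3"] + [f"{c}{j}" for c in "UIH" for j in (1, 2, 3)]


def build_transmission_matrix(beta0, sigma, iota, phi):
    """Rows: susceptible age class A. Columns: (compartment X, source age class B).

    ASSUMPTION: each entry is the product beta0 * sigma[A] * iota[X, B] * phi[B, A].
    """
    columns = [(x, b) for x in COMPARTMENTS for b in AGES]
    matrix = [
        [beta0 * sigma[a] * iota[(x, b)] * phi[(b, a)] for (x, b) in columns]
        for a in AGES
    ]
    return columns, matrix


# Hypothetical illustration values only.
beta0 = 0.4
sigma = {"children": 0.6, "adults": 1.0, "seniors": 1.2}
# Effective infectivity: lowest for diagnosed (H), lower for I than for U.
family_infectivity = {"E": 0.8, "U": 1.0, "I": 0.5, "H": 0.1}
iota = {(x, b): family_infectivity[x[0]] for x in COMPARTMENTS for b in AGES}
# Likeness factor: stronger mixing within an age group than between groups.
phi = {(b, a): (1.0 if a == b else 0.3) for b in AGES for a in AGES}

columns, matrix = build_transmission_matrix(beta0, sigma, iota, phi)
```

With separate likeness factors for the home, work and school, and leisure realms, an intervention such as a school closure would scale only the corresponding realm before the realms are summed.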
The progression rates γ X are determined by the estimated mean durations of each stage of the infection. For simplicity, and analogously to [18], we choose all the rates for a given compartment to be the same (that is, γ E1 = γ E2 = γ E3 ≕ γ E and likewise for U, I, and H). In accordance with [35] we assume the mean incubation period to be 5.5 days and therefore put γ E = 3/5.5, the factor 3 giving a mean combined duration of stay in E 1 , E 2 , and E 3 of 5.5 days. In the same manner we take the rates γ U = 3/7 and γ H = γ I = 3/8 corresponding to mean durations of the infectious period of seven days for asymptomatic individuals and eight days for symptomatic ones, in accordance with previous studies [17,21].
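The effect of splitting a compartment into three identical stages can be checked with a short calculation. The sketch below uses the standard properties of the Erlang distribution (the sum of identical exponential stages); the function names are illustrative.

```python
def stage_rate(mean_duration_days, n_stages=3):
    """Per-stage progression rate giving the requested total mean duration."""
    return n_stages / mean_duration_days


def erlang_mean_and_cv(rate, n_stages=3):
    """Mean and coefficient of variation of the total time spent in all stages."""
    mean = n_stages / rate
    variance = n_stages / rate ** 2
    return mean, variance ** 0.5 / mean


gamma_E = stage_rate(5.5)   # 3/5.5 per day
gamma_U = stage_rate(7.0)   # 3/7 per day
gamma_I = stage_rate(8.0)   # 3/8 per day

mean_E, cv_E = erlang_mean_and_cv(gamma_E)
```

The mean total duration is unchanged, while the coefficient of variation drops from 1 for a single exponential stage to 1/sqrt(3) for three stages, so very short and very long incubation times become less likely.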
The probability ρ 0 of developing symptoms at the end of the incubation period is estimated to be $\rho_0^{j} = 0.25$ for juniors, who are reportedly often asymptomatic [41], and $\rho_0^{a} = 0.7$ and $\rho_0^{s} = 0.8$ for adults and seniors, respectively. The probability η 0 of an infection being discovered by the end of the incubation period is assumed to be small in the beginning of the simulation. The same is true for the rates $\hat{\eta}_m$ for discovering an infection in the absence of symptoms. For symptomatic individuals, these rates are larger as symptoms provide a strong suspicion of being infected. We assume a certain age dependence for testing. On the one hand, seniors are at high risk and might be expected to be tested more frequently but on the other hand, severe respiratory tract infections are common in people with limited immune competence and may not be taken as implicating a SARS-CoV-2 infection. Juniors are reportedly not seriously affected, hence we assume that they are less frequently tested. According to recent RKI reports [26], testing has been dramatically increased between March 9 and March 15, so we assume that all the testing rates mirror this increase. In particular, we assume the rates η 0 and $\hat{\eta}_{1/2}$ to be significantly larger than zero since individuals are encouraged to go for a test, even in the absence of symptoms, if they were in close contact with a known infective.
The parameters ν m describe the probability of progressing to the next stage while being diagnosed (i.e., transitioning from an I to an H compartment). Their values are assumed to be close to 1. Smaller values would imply the assumption that being tested is somehow associated with more severe symptoms and would, therefore, slow down the progression toward recovery.
To follow weekly oscillations in the reported cases, which show a regular slump over the weekends, we include time dependent test rates. This allows the simulated case numbers to closely follow the data as illustrated in Fig 4. For long term simulations of scenarios we assumed these fluctuations to fade away and removed the time-dependency from the testing rates. This implicitly represents an assumed increase in testing activities, and de facto increases the reporting ratio by more than 10%.
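A time dependent test rate with a weekend slump can be sketched as a simple step function of the calendar day. The snippet assumes that simulation day 0 is February 15, 2020 (as in the day counts used in the text) and that testing drops by a constant factor on Saturdays and Sundays; the factor 0.5 and the function names are hypothetical.

```python
from datetime import date, timedelta

DAY_ZERO = date(2020, 2, 15)  # simulation day 0


def weekly_test_rate(base_rate, day, weekend_factor=0.5):
    """Testing rate on a simulation day, reduced on Saturdays and Sundays."""
    weekday = (DAY_ZERO + timedelta(days=int(day))).weekday()  # Monday = 0
    return base_rate * (weekend_factor if weekday >= 5 else 1.0)


def long_term_test_rate(base_rate):
    """Constant rate used once the weekend fluctuations are assumed to fade."""
    return base_rate


# Average rate over one week with and without the weekend slump.
with_slump = sum(weekly_test_rate(1.0, d) for d in range(7)) / 7
without_slump = long_term_test_rate(1.0)
```

Replacing the oscillating rate by the constant weekday rate raises the weekly average, which is why dropping the time dependence amounts to an implicit increase in testing.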

Modeling control measures
In the structured model we can include control measures that explicitly affect different age groups. In particular we consider (i) general increased awareness in the population due to the effect of media (M), as well as (ii) active control due to main intervention measures adopted throughout Germany since March 2020. These control measures include: (CS) closure of all schools, universities, and sports clubs, and cancellation of public events; (HO) reduced contacts in workplaces and outside the household (restaurants, bars, public transport); (T0) initial efforts to improve detection by more testing; (IC0) isolation of infected cases; and lock-down measures (LD) closing most activities and prohibiting gatherings of more than two people, in effect by the end of March. Details are summarized in Table 2. Reported cases for the three age groups were used to fit the model including control measures as indicated in Table 2 until April 5 (Fig 5). The obtained setting was used to predict data from April 6 to April 14, as well as for long term predictions of the baseline (BSL) scenario, in which the control measures applied as of early April (CS, HO, T0, IC0 and LD) and the awareness due to the effect of media are maintained until the end of the year (Fig 6(b)). The baseline scenario has then been modified to simulate different possible future scenarios (cf. Table 3). On the one hand, we have enriched the current control measures by further increased testing activity (T1+), stricter isolation of known cases (IC) and increased social distancing from individuals at risk (IO). On the other hand, we have considered possible rollback scenarios (cf. Results).

Fig 5. Crosses denote reported data as of April 15. The model is calibrated on collected data up to April 5 (day 50); data from April 6 to 14 are used for validation. Colors denote the three different age groups: juveniles (0-14y, green), adults (15-59y, blue) and seniors (60y and older, red). It should be noted that the most recent data tend to be lower than expected since not all cases detected on these days have been reported to the RKI yet.
https://doi.org/10.1371/journal.pone.0238559.g005

Fig 6. (a) minimal intervention: increased awareness, quarantine of known or suspected cases, testing of patients with symptoms and contact history; (b) baseline scenario: minimal intervention scenario augmented with school closure, high reduction in economic activities, contact limitation, high testing activity; (c) high vigilance: baseline scenario enriched by isolation of detected cases, combined with increased testing activity; (d) educational/economic reopening: reintroducing in three phases contacts at schools, workplaces, public transportation service; (e) phase-out: rollback of all introduced control measures, up to minimal interventions, accompanied by increased testing also of asymptomatic individuals and strict isolation of identified cases; (f) cautious phase-out: similar to (e), but with slower rollback to regular activities, accompanied by strongly increased testing also of asymptomatic individuals, strict isolation of identified cases, and reduced contacts with elderly and risk groups. Solutions are shown in logarithmic scale for both new and active cases, in order to make peak heights in different orders of magnitude visible. Oscillations up to day 60 are due to the weekend-effect (Fig 4), which is relaxed for long term projections.
https://doi.org/10.1371/journal.pone.0238559.g006

Control measures act fundamentally by reducing contacts between individuals, and hence transmission of the virus from person to person. Recall that in our modeling assumptions, we consider contacts in three different realms: at home, at work and school, and in other locations. Treating different contact categories separately allows for an easier estimate of the effects of contact reducing interventions. School closures are a good example of the use of separate contact matrices for different realms. A visualization of how control measures reduce the transmission rates in the baseline scenario is given in Fig 3(a). In contrast, in rollback scenarios a partial lift of control measures leads to increased transmission rates (Fig 3(b)).

Data
The publicly available dataset provided by the Robert Koch Institute (RKI) [34] was used for this study. Statistical analysis of the dataset was performed (results not shown here) analogously to our previous work [24].

Results
In the core (non-structured) model (1) we estimated how contacts, and hence the reproduction number, decreased in the three considered time slots since the beginning of the epidemic by fitting the model to reported data [34]. The left panel of Fig 7 shows the fits on top of the data. Each period is described by a different value for the reproduction number: R = 6.95 for the first period, R = 3.38 for the second period, and R = 0.97 for the third period. The value of R in the early stage may appear rather large when compared to the widely reported values for different countries. On the other hand, reproduction numbers around 10 have been reported by other authors as well, e.g., [16] for China or [42] for China and the Italian province of Lombardy. An obvious reason for large reproduction numbers could be super-spreading events, such as large public gatherings. Moreover, decreasing detection ratios upon rising case numbers, due to testing capacities being overwhelmed, may lead to lower apparent values of R when only data for confirmed cases are considered. The significant reduction of the reproduction number over time clearly indicates the success of the contact reducing measures.
The inset in Fig 7 shows the predictions based on the fits until mid May, showing the catastrophic increase of infections that would have resulted if no or only insufficient measures had been taken. The right panel in Fig 7 shows a range for cases requiring hospitalization and intensive care using the latest prediction of the reproduction number. These are computed from the number of detected cases predicted by the model, assuming 15-20% hospitalizations in low care units and 2-5% in intensive care units. Under the current prediction these numbers could be handled by German hospitals. Though this is an indicative estimate only, it clearly shows that the reproduction number steadily decreased from R ≈ 7 at the end of February to R ≈ 1 at the beginning of April.
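The ranges for hospital demand follow from a simple proportional calculation, sketched below. The percentages are those quoted above; the function name and the case count in the usage line are hypothetical.

```python
def care_demand(detected_cases, ward=(0.15, 0.20), icu=(0.02, 0.05)):
    """Lower and upper bounds for low care and intensive care demand."""
    return {
        "ward": (detected_cases * ward[0], detected_cases * ward[1]),
        "icu": (detected_cases * icu[0], detected_cases * icu[1]),
    }


# Hypothetical number of detected active cases, for illustration only.
demand = care_demand(10_000)
```

The width of each range reflects only the uncertainty in the assumed fractions, not the uncertainty of the predicted case numbers themselves.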
In the baseline scenario for the refined age and stage structured model, where control measures apply continuously over time, we observe that the dominant eigenvalue of the linearization of the system about the disease-free equilibrium (DFE, i.e., a completely susceptible population) decreases in time and eventually crosses zero (Fig 8). This corresponds to the reproduction number dropping below R = 1 (cf. [28]). The seemingly erratic fluctuations which can be observed in Fig 8 are caused by the weekly oscillations in testing that we call the weekend-effect and that we illustrate in Fig 4. Let us now discuss results on the simulated scenarios in detail. In the minimal intervention scenario, we assume that neither specific measures to reduce contacts between individuals (school closures, interruption of most social and economic activities) nor increased testing activity were undertaken. Under these assumptions, the initial rapid increase of cases would have gone on unabated, and the number of infected individuals requiring hospitalization or even intensive care would have reached unmanageable levels within weeks. This scenario is purely counterfactual and is only detailed here to evaluate the effects of the measures adopted so far. Model simulations (Fig 6(a)) show that in this scenario a peak in infections would have been reached at the beginning of May 2020 (day 79 since February 15), with about 12 million active infections on the peak day. Over the course of the infection about 75 million people would have been infected and 1.6 million would have died. Fortunately, taking into account interventions adopted so far, the actual course of the epidemic is less dramatic, and the model simulations predict a significantly better outcome.
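The correspondence between the sign of the dominant eigenvalue and the threshold R = 1 can be checked on a much simpler model. The sketch below uses a plain SEIR model with one exposed and one infectious compartment as a stand-in, not the age and stage structured model; the parameter values are hypothetical. The linearization about the disease-free equilibrium is the 2×2 matrix F − V with F = [[0, β], [0, 0]] and V = [[γ_E, 0], [−γ_E, γ_I]], and the next-generation matrix F V⁻¹ has spectral radius β/γ_I.

```python
import math


def dominant_eigenvalue(beta, gamma_E, gamma_I):
    """Largest eigenvalue of [[-gamma_E, beta], [gamma_E, -gamma_I]]."""
    trace = -(gamma_E + gamma_I)
    det = gamma_E * (gamma_I - beta)
    # The discriminant equals (gamma_E - gamma_I)**2 + 4 * gamma_E * beta >= 0.
    return 0.5 * (trace + math.sqrt(trace ** 2 - 4 * det))


def reproduction_number(beta, gamma_I):
    """Spectral radius of the next-generation matrix for the simple SEIR model."""
    return beta / gamma_I


gamma_E, gamma_I = 1 / 5.5, 1 / 8      # hypothetical rates per day
for beta in (0.05, 0.10, 0.20, 0.40):  # decreasing contacts lower beta
    eig = dominant_eigenvalue(beta, gamma_E, gamma_I)
    R = reproduction_number(beta, gamma_I)
    print(f"beta={beta:.2f}  eigenvalue={eig:+.4f}  R={R:.2f}")
```

In this stand-in model the eigenvalue is positive exactly when β exceeds γ_I, that is, when the reproduction number exceeds 1, which mirrors the zero crossing described above.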
Compared to our initial investigations [24], we have adjusted the baseline scenario by taking into account the further restrictive measures adopted in the last week of March 2020 (reduction of economic activity, restrictions on meetings in public space, and further increased remote working activity). Said enhanced intervention scenario suggests that the number of active cases peaked with about 33,000 infected individuals at the beginning of April 2020 (Fig 6(b)). This is in accordance with recent modeling studies [23], which suggest that the effective reproduction number R ≈ 1 as of April 2, meaning that the disease free equilibrium is on the verge of instability (i.e., the leading relevant eigenvalue of the linearization about the equilibrium with only susceptible individuals is located close to the imaginary axis, cf. Fig 8). Since, at least for small numbers of active infections, the leading eigenvalue lying on the imaginary axis corresponds to the reproduction number R being close to 1, this makes the predictions for this scenario particularly sensitive to assumptions about the model parameters. The reason is that variations of R make a striking qualitative difference between further exponentially rising case numbers or a slowdown of the epidemic. Taking the latest case numbers (up to April 8) into consideration and assuming that in the weeks to come the testing activity does not take significant slumps on weekends anymore (cf. Fig 4), we project the number of infected individuals, both detected and undetected, to decrease over the coming months. The total number of fatalities would be reduced by more than 90% as compared to the minimal intervention scenario (Fig 9(d)), and the capacity of the health care system would not be severely challenged (see also Fig 9(b)).
In the reported data [34], individuals are counted as "recovered" if they are no longer symptomatic and 14 days (supposedly the longest infectious period under normal circumstances) have passed since the positive test. In contrast, the average infectious period, as chosen for the simulation, is about 1/γ_I = 7 days, meaning that individuals on average recover seven days after becoming infectious. Therefore, the number of recovered individuals according to the model is higher than the officially recorded figure, and that in turn leads to lower numbers of active reported cases in the model as compared to the official data. Using data up to April 5 to estimate parameters, the model predicts 116,000 cumulative reported cases for April 9, and about 149,000 cases for April 16. Maintaining the baseline measures throughout the year (we are assuming no seasonality of the disease) would lead to the eradication of the epidemic.

Simulations of this scenario (Fig 6(b)) suggest about 550,000 infections (about 185,000 thereof asymptomatic), and 14,000 deaths over the course of the epidemic. Enriching the baseline scenario with further increased testing activities and even stricter isolation of detected and suspected cases (high vigilance scenario, Fig 6(c)) would shorten the time necessary to call the disease eradicated (beginning of September 2020), but widen the peak of active cases. Stricter isolation of confirmed and suspected cases and improved testing activity would further reduce the spread of the virus, with the model forecasting about 361,000 infections (of which about 108,000 asymptomatic), and 9,000 deaths over the course of the epidemic.
The above scenarios assume that the current restrictions on public life remain in effect over a long period. As this does not seem to be feasible in practice, we consider further scenarios which include an at least partial lifting of the restrictions imposed thus far. In contrast to our previously simulated scenarios [24], we assume a gradual reopening of economic and educational activity (Fig 6(d)). We assume this to start in about three weeks from now (May 4) with the reopening of schools and childcare facilities as well as many shops, gradually proceeding to the reopening of universities, restaurants and other economic activities, and finally resuming on-site work and most club activities from June 1 on. Combining this partial rollback with further increased testing activity and isolation of identified cases (educational/economic reopening scenario) would lead to a (second) peak in active infections (1.3 million, detected and undetected combined) towards the end of November 2020. If no restrictive measures and interventions were to be (re)introduced, the simulation results in about 32 million total infections and 730,000 deaths over the course of the epidemic, which in this scenario would end only around the end of summer 2021 (notice the different time scales for different scenarios in Fig 6), under the assumption that no reliable treatment becomes available by then.
The last two scenarios that we present here suggest that a complete, though gradual, rollback of all introduced control measures would lead to a second peak towards the end of August 2020 (Fig 6(e), phase-out scenario), or the end of September 2020 in case of a slower reintroduction of regular activities (Fig 6(f), cautious phase-out). In both cases, the second peak would occur earlier than in the educational/economic reopening scenario (Fig 9(a)) and the number of infections at this peak would be considerably larger (7.2 million and 6.3 million active infections in the phase-out and cautious phase-out scenarios, respectively; cf. Fig 9(b)).

Discussion
In this work we proposed a mathematical model for predicting the evolution in time of detected COVID-19 infections in Germany, taking into account the age distribution of cases. Distinguishing between people in different age groups allows the model to better characterize contacts between individuals (e.g., child-child contacts being typically different from senior-adult contacts, cf. [21]) and to fine-tune the effect of intervention measures on contact reduction (cf. Fig 3).
Given the limited knowledge about the novel virus' properties and the unprecedented control measures, there are significant uncertainties regarding the precise effects of single measures on effective contact rates. For example, while school closures can be clearly modeled as reducing child-child contacts in the school domain (cf. Methods), the effectiveness of this specific intervention in curbing an epidemic can vary dramatically [43,44], depending on, e.g., the pathogen and its interaction with the immune system of children. Moreover, it is hard to predict whether the impact of control measures will wear off as the population grows tired and attention fades, or whether the measures will instead become even more effective as habituation makes following the guidelines easier. Our results indicate that the current measures lead to a significant reduction in the reproduction number, R, which is approximately 1 in the second week of April (Fig 8). This agrees with estimates of the RKI [34] and findings by other groups which are also currently studying the situation in Germany [23,45]. A reproduction number this close to 1 naturally results in a significant uncertainty of the projection, since small deviations of the parameters can make the difference between further growing and slowly declining numbers of active cases.
We parametrized the core model using, if possible, known or previously estimated parameter values, in particular those concerning the evolution of the disease (latency time, duration of infection, death rates), or plausible assumptions (e.g., for the relation between infectivity of detected and undetected infectives) as explained in the Methods section. Uncertainties of parameters that could not be inferred from the literature or well identified from the data were investigated via stochastic sampling and Sobol analysis. Such uncertainties could be reduced by integrating further data in the model, e.g., estimated contacts or estimates for the number of unreported cases.
For the simulations of the age-structured model, some of the parameters were taken from the literature, some were inferred from the fit of the core model, and others were fixed by means of plausible assumptions (cf. Methods) that provided good agreement with the data reported in the early phase of the outbreak. The parametrization of the structured model could be improved in the future, in particular if better-quality data become available. Analogously to what was done for the core model, uncertainties could be systematically analyzed for this model as well.
In our baseline scenario we assumed the current reporting ratio in Germany to be rather high (over 50% as of mid-April). Assuming a lower reporting ratio yields more undetected cases and therefore leads to a higher estimate of the reproduction number. Analogously, dropping the weekend effect increases the number of active cases at the peak. On the other hand, lower reporting ratios in early April provide a larger margin for improvement by enhanced testing as in our high vigilance scenario. While reporting ratios are notoriously hard to estimate in early phases of an epidemic, and have a potentially enormous impact on the predictions made by any model [46], several models completely neglect the presence of undetected cases [33,45]. The aggressiveness of the virus and hence the mortality among all affected individuals (whether diagnosed or not) is another unknown, but different assumptions about this parameter can be expected to have similar impacts on all the scenarios discussed here. It may be assumed that earlier detection (as in all our scenarios with enhanced testing) and hence earlier and better care for high-risk patients may result in a mortality even below the one observed in Germany so far. The limited capacity of the health care system, in particular of intensive care units, was not yet directly considered as a parameter of our refined model. However, the predicted number of infected individuals at its peak can be used as a proxy for the expected demand for health care resources at the height of the epidemic. Assuming that a fixed proportion of infected individuals will require intensive care, the maximal number of infectives for a given scenario directly indicates the maximal load on the health care system for this scenario.
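The proxy argument can be written out as a short calculation. The sketch below uses the peak numbers of active infections reported above for four scenarios. The proportion of infected individuals requiring intensive care, `icu_fraction`, is a purely hypothetical placeholder and is not a value estimated in this work, and the function name `peak_icu_demand` is likewise introduced only for illustration.

```python
# Peak active infections (detected and undetected) as reported in the Results.
PEAK_ACTIVE_INFECTIONS = {
    "minimal intervention": 12_000_000,
    "educational/economic reopening": 1_300_000,
    "phase-out": 7_200_000,
    "cautious phase-out": 6_300_000,
}

def peak_icu_demand(peak_infections, icu_fraction):
    """Scale peak infections by a fixed ICU proportion (a simplifying assumption)."""
    if not 0.0 <= icu_fraction <= 1.0:
        raise ValueError("icu_fraction must lie between 0 and 1")
    return {name: peak * icu_fraction for name, peak in peak_infections.items()}

# HYPOTHETICAL placeholder value, chosen only to make the example run.
demand = peak_icu_demand(PEAK_ACTIVE_INFECTIONS, icu_fraction=0.02)
reference = demand["educational/economic reopening"]
for name, beds in demand.items():
    print(f"{name}: {beds / reference:.1f} times the load of the reopening scenario")
```

Because the proportion is assumed to be fixed, it cancels when scenarios are compared with each other. The ranking of scenarios by peak load therefore does not depend on the placeholder value; only the absolute number of required beds does.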
For an initial study, considering the limited available data, we decided to apply the model to the entire German population, treating it as spatially homogeneous. Compartmental models assuming geographically homogeneous regions were also proposed by other authors for the spread of COVID-19 in Germany [17,45] or other countries [47,48]. Of course, this is an important simplifying assumption, as the population density and disease incidence vary from region to region, and from rural to urban areas. Features of the COVID-19 outbreak that emerged from reported data [34] clearly indicate the heterogeneous distribution of cases with respect to the age groups, which led us to highlight this property instead of geographic inhomogeneities. While the transmission dynamics would essentially remain as presented in this manuscript, the deterministic (based on ordinary differential equations) approach that we adopted here could be too coarse when considering much smaller populations distributed over more geographic patches. When the number of infectious individuals is small, or demographic/environmental variability could significantly impact the epidemic outcome, stochastic models might be better suited [49].
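Why a deterministic description can be too coarse for small populations is easiest to see in a stochastic simulation. The following is a minimal sketch of a Gillespie-type simulation of a plain SIR model. It is not the model of this paper, and the function name `gillespie_sir`, the population size, the transmission rate, and the seed are illustrative assumptions. Starting from a single infectious individual with R = 1.5, the deterministic equations always produce an outbreak, whereas individual stochastic runs may die out after a handful of cases.

```python
import random

def gillespie_sir(population, initial_infected, beta, gamma, rng):
    """Simulate one stochastic SIR epidemic until no infectious individuals remain.

    Returns (time of extinction, total number of individuals ever infected).
    """
    s, i, r, t = population - initial_infected, initial_infected, 0, 0.0
    while i > 0:
        rate_infection = beta * s * i / population
        rate_recovery = gamma * i
        total_rate = rate_infection + rate_recovery
        t += rng.expovariate(total_rate)           # waiting time to the next event
        if rng.random() * total_rate < rate_infection:
            s, i = s - 1, i + 1                    # infection event
        else:
            i, r = i - 1, r + 1                    # recovery event
    return t, r

rng = random.Random(2020)                          # fixed seed for reproducibility
gamma = 1.0 / 7.0                                  # seven-day infectious period
beta = 1.5 * gamma                                 # assumed R = 1.5
final_sizes = [gillespie_sir(1000, 1, beta, gamma, rng)[1] for _ in range(200)]
minor = sum(size < 50 for size in final_sizes)
print(f"{minor} of {len(final_sizes)} runs ended with fewer than 50 infections")
```

The runs split into early extinctions and large outbreaks, a distinction that an ordinary differential equation model cannot represent. This is the regime in which the stochastic approaches mentioned above [49] become preferable.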
We have conducted simulations covering one year and more starting from the beginning of the epidemic. Some scenarios predict high peaks in active cases and alarmingly high numbers of deaths far into the future. However, these scenarios should only be understood as predictions for the future if no appropriate measures were taken to contain the epidemic. In particular, postponing the date of the peak sufficiently far into the future would provide time for the development of a vaccine or an effective treatment.