Skip to main content

Early stage COVID-19 disease dynamics in Germany: models and parameter identification

Abstract

Since the end of 2019 an outbreak of a new strain of coronavirus, called SARS-CoV-2, is reported from China and later other parts of the world. Since January 21, World Health Organization (WHO) reports daily data on confirmed cases and deaths from both China and other countries (www.who.int/emergencies/diseases/novel-coronavirus-2019/situation-reports). The Johns Hopkins University (github.com/CSSEGISandData/COVID-19/blob/master/csse_COVID_19_data/csse_COVID_19_time_series/time_series_COVID19_confirmed_global.csv) collects those data from various sources worldwide on a daily basis. For Germany, the Robert-Koch-Institute (RKI) also issues daily reports on the current number of infections and infection related fatal cases (www.rki.de/DE/Content/InfAZ/N/Neuartiges_Coronavirus/Situationsberichte/Gesamt.html). However, due to delays in the data collection, the data from RKI always lags behind those reported by Johns Hopkins. In this work we present an extended SEIRD-model to describe the disease dynamics in Germany. The parameter values are identified by matching the model output to the officially reported cases. An additional parameter to capture the influence of unidentified cases is also included in the model.

There’s an evil virus that’s threatening mankind [...] A menace to society Iron Maiden, Virus, 1996.

Introduction

In December 2019, first cases of a novel pneumonia of unknown cause were reported from Wuhan, the seventh-largest city in China. In the meantime, these cases have been identified as infections with a novel strain of coronavirus, called SARS-CoV-2 and the disease it causes is called coronavirus disease 2019 (COVID-19). At the beginning of January 2020, the virus spread over mainland China and reached other provinces. Increased travel activities due to the Chinese new year festivities supported the expansion of the infection. Since 21 January, WHO’s daily situation reports contain the latest figures on confirmed cases and deaths, see [1].

The first COVID-19 case in Germany was reported in late January 2020 in a company close to Munich, Bavaria. Later cases were imported by travelers from China, Iran or Italy as well as tourists returning from ski holidays in the Austria and Italy. By 1 March 2020 more than 100 cases were reported in Germany and since than the number of cases began to rise exponentially, see Fig. 1. The first deaths were reported on 9 March 2020 [2, 3].

Figure 1
figure1

Case numbers in Germany from 1 March until 7 April 2020, as reported by Johns Hopkins University [5]. The initial time point is chosen as 1 March, since then the number of registered infections exceeds 100 cases

By 16 March 2020 the federal government introduced first measures to reduce the spread of the disease: Schools, kindergartens and universities were closed. On 22 March these measures were tightened by implementing a national curfew and contact ban. People are advised to stay at home, leaving only for work related activities, necessary shopping, medical treatment or sports. All this should not be done in groups of more than two persons if they do not belong to the same household [4].

Our work is based on the data reported by Johns Hopkins University [5]. We refrain from using the official data from the Robert-Koch-Institute [2], since they suffer from a delay by several days due to the more complicate way of aggregating those data. For a detailed explanation of the difference between the data reported by Johns Hopkins and the Robert-Koch Institute we refer to the information given on the webpage of the Robert-Koch-Institute, see [6]. Johns Hopkins University continuously collects the data from internet queries at various sources (local health authorities, newspapers, etc.) whereas the Robert-Koch-Institutes collects the data that are reported for the local health authorities to the district level, then state level and finally aggregates them to the federal statistics. Hence these data lag several days behind the ones collected by Johns Hopkins University.

The paper is organized as follows: In Sect. 2 we describe the model and the parameter identification problem. Our models consists of three variants of a five compartment SEIRD-system without demographic terms, where the transmission rate is either fixed (1a)–(1e) or time-dependent (3a)–(3e) and (4a)–(4e). The fatalities are either described by an ODE, see models (1a)–(1e) and (3a)–(3e), or via a delay term in model (4a)–(4e). In the parameter estimation problem, we determine the transmission rate, detection rate and lethality together with the initial values for the exposed and infected compartment. In Sect. 3 we discuss the sensitivity of our model with respect to detection rate. Section 4 is devoted to the adjoint equations used for solving the optimization problem. The simulation results are presented in Sect. 5. Here we do compare the results obtained from the three models presented in Sect. 2.

Mathematical model

To model the dynamics of the spread of COVID-19 incidences, we propose a hierarchy of SEIRD models. For details regarding the original SIR- and SEIR-model we refer to classical works on mathematical epidemiology, e.g [7]. For our basic SEIRD-model, the total population of Germany with \(N\sim83.000.000\) individuals is subdivided in to susceptiblesS, exposedE, infectedI, recoveredR and deathsD. The susceptibles constitute the reservoir of persons that are not yet infected with SARS-CoV-2. After infection susceptible become exposed meaning that they already carry the virus but are not yet infectious. With a rate ϑ exposed individuals become infectious and transmit the virus with rate β to susceptibles. An infected individual loses infectivity with γ and has a probability μ of dying due to the disease [8]. Figure 2 shows the transmission structure. By C we denote all infected cases, independent of their current status. This artificial compartment is later on used to compare with the total number of registered cases reported by Johns Hopkins or RKI.

Figure 2
figure2

Transmission diagram for the basic SEIRD-model (1a)–(1e). The artificial compartment C contains all infected cases, i.e. current active infections, recovered and deaths

The resulting system of ordinary differential equations (ODE) for the above described SEIRD-model reads as

$$\begin{aligned} & \frac{dS}{dt} = - \frac{\beta}{N} SI ,\qquad S(t_{0}) = S_{0}:=N-E_{0}-I_{0}, \end{aligned}$$
(1a)
$$\begin{aligned} &\frac{dE}{dt} = \frac{\beta}{N} SI - \vartheta E ,\qquad E(t_{0}) = E_{0}, \end{aligned}$$
(1b)
$$\begin{aligned} &\frac{dI}{dt} = \vartheta E - \gamma I,\qquad I(t_{0}) = I_{0}, \end{aligned}$$
(1c)
$$\begin{aligned} &\frac{dR}{dt} = (1-\mu) \cdot\gamma I,\qquad R(t_{0}) = 0, \end{aligned}$$
(1d)
$$\begin{aligned} &\frac{dD}{dt} = \mu\cdot\gamma I,\qquad D(t_{0}) = 0. \end{aligned}$$
(1e)

The starting time \(t_{0}\) is chosen as 1 March and the initial conditions for the recovered and dead compartment are assumed to be zero, since in Germany the first COVID-19 related death was recorded on 9 March. Also we may assume that the number of recovered individuals by 1 March is negligible.

In the sequel, we will also consider two refined versions of the above basic model.

At the onset of the disease, the numbers of exposed, infected, recovered and dead are still small and the number of susceptibles is approximately equal to the entire population N. In this setting, the EI-part of the model reduces to

$$\begin{aligned} \begin{pmatrix} E \\ I \end{pmatrix} ' &= \begin{pmatrix} -\vartheta& \beta \\ \vartheta& -\gamma \end{pmatrix} \cdot \begin{pmatrix} E \\ I \end{pmatrix}. \end{aligned}$$

The maximal eigenvalue λ of this linear system determines the initial growth rate and is given by

λ= 1 2 ( ( ϑ + γ ) + ( ϑ γ ) 2 + 4 ϑ β )

and the doubling time\(T_{2}\) equals

$$\begin{aligned} T_{2} = \frac{\ln2}{\lambda}. \end{aligned}$$

Figure 3 depicts the dependence of the doubling time on the transmission rate β. As of mid April, the doubling time in Germany is approximately 14 days compared to 2.5 days by mid March.

Figure 3
figure3

Plot of the doubling time \(T_{2}\) in days versus the transmission rate β for fixed values \(\vartheta=1/2\) and \(\gamma=1/10\). A reduction of the transmission rate from \(\beta=0.8\) to \(\beta=0.2\) accounts for a slow down of the infection from doubling time 2 days to 10 days

In the basic model (1a)–(1e), the transmission rate β is assumed to be fixed. The German state and federal governments introduced several measures to slow down the spread of the disease. Similar measures are nowadays taken in almost every country worldwide. As of 16 March schools, kindergartens and universities were closed and on 22 March a general contact ban was enforced in Germany. Both measures aim at reducing the transmission rate β.

To include this into the basic model (1a)–(1e), we also consider an alternative model for the transmission rate β: We assume β as a piecewise constant function on the time intervals prior to any measures, (until 15 March), after school closings (between 16 and 22 March) and after the contact ban (after March 22)

$$ \beta(t) = \textstyle\begin{cases} \beta_{0} & t< \text{ 16 March}, \\ \beta_{1} & \text{16 March } \le t \le\text{ 22 March}, \\ \beta_{2}& t> \text{ 22 March}. \end{cases} $$
(2)

The resulting time-dependent SEIRD-model reads as

$$\begin{aligned} &\frac{dS}{dt} = - \frac{\beta(t)}{N} SI,\qquad S(t_{0}) = S_{0}:=N-E_{0}-I_{0}, \end{aligned}$$
(3a)
$$\begin{aligned} &\frac{dE}{dt} = \frac{\beta(t)}{N} SI - \vartheta E ,\qquad E(t_{0}) = E_{0}, \end{aligned}$$
(3b)
$$\begin{aligned} &\frac{dI}{dt} = \vartheta E - \gamma I,\qquad I(t_{0}) = I_{0}, \end{aligned}$$
(3c)
$$\begin{aligned} &\frac{dR}{dt} = (1-\mu) \cdot\gamma I,\qquad R(t_{0}) = 0, \end{aligned}$$
(3d)
$$\begin{aligned} &\frac{dD}{dt} = \mu\cdot\gamma I,\qquad D(t_{0}) = 0. \end{aligned}$$
(3e)

Setting \(\beta:=\beta_{0}=\beta_{1}=\beta_{2}\), the time-dependent model reduces to the basic one.

In order to validate our models and to identify the parameters involved therein, both the registered number of infections and the registered number of COVID-19 related deaths are important indications. The number of registered deaths is probably considerably more reliable, since the number of registered infections depends on the number of tests conducted and the dark figure of undetected, mostly asymptomatic cases, is assumed to be remarkably large [9]. We will discuss this point later in more detail. In the previous basic or time-dependent SEIRD-model, the actual increase of the disease related deaths \(\frac{dD}{dt}\) is assumed to be proportional to the current number of infected persons. The Robert-Koch-Institute specifies an average of 10 days between the onset of symptoms and admission to the intensive care unit [10]. Therefore, we assume \(\tau= 14\) for the time between the onset of infectiousness and death. In order to include this time lag into our model, we introduce a delay-term into the time-dependent model and obtain the final delayed time-dependent model:

$$\begin{aligned} &\frac{dS}{dt} = - \frac{\beta(t)}{N} SI ,\qquad S(t_{0}) = S_{0}:=N-E_{0}-I_{0}, \end{aligned}$$
(4a)
$$\begin{aligned} &\frac{dE}{dt} = \frac{\beta(t)}{N} SI - \vartheta E,\qquad E(t_{0}) = E_{0}, \end{aligned}$$
(4b)
$$\begin{aligned} &\frac{dI}{dt} = \vartheta E - \gamma \bigl[(1-\mu) I +\mu I(t- \tau) \bigr],\qquad I(s) = I_{0}(s) \quad\text{for $s\le t_{0}$}, \end{aligned}$$
(4c)
$$\begin{aligned} &\frac{dR}{dt} = (1-\mu) \cdot\gamma I,\qquad R(t_{0}) = 0, \end{aligned}$$
(4d)
$$\begin{aligned} &\frac{dD}{dt} = \mu\cdot\gamma I(t-\tau),\qquad D(t_{0}) = 0. \end{aligned}$$
(4e)

Note, that for solving this delay differential equation (DDE) we need an initial history of the infected compartment, i.e. values \(I_{0}(s)\) for \(t_{0}-\tau\le s\le t_{0}\).

In all the three models, the parameters \(\vartheta=1/2\) [days−1], \(\gamma=1/10\) [days−1] are assume to be fixed and resemble a latency period of 2 days and a recovery period of 10 days, see [2, Situation report 31 March 2020].

The parameters in the transmission rate, i.e. β, or \(\beta_{0},\beta_{1},\beta_{2}\) the lethality μ and the initial values \(E_{0}, I_{0}\) resp. the initial history \(I_{0}(s)\) for the exposed and infected compartment are yet unknown to us. We will identify them together with the detection rateδ by matching the model output to the given data. The detection rate δ corresponds to the fraction of infected individuals which are positively tested for SARS-CoV-2 and hence appear in the official recordings. Various sources speculate that this detection rate is in the order of magnitude of 10–20% meaning that the true number of infected 5–10 times larger than the number published in the official statistics, see [9].

To match the model output and the reported data we use a least-squares approach. Let \(u=(\beta, \delta, \mu, E_{0}, I_{0})\) resp. \(u=(\beta_{0},\beta_{1},\beta_{2}, \delta, \mu, E_{0}, I_{0})\) denote the unknown model parameters to be determined. Furthermore, let \(Y(t)\) and \(Z(t)\) denote the data for the cumulated infected and dead cases at time t reported by Johns Hopkins University. The deviation between the model and the data is measured by the cost functional

$$\begin{aligned} J(u) &:= \frac{ \Vert \delta(I+R)+D-Y \Vert _{L^{2}}^{2}}{ \Vert Y \Vert _{L^{2}}^{2}} + c_{1} \frac{ \Vert D-Z \Vert _{L^{2}}^{2}}{ \Vert Z \Vert _{L^{2}}^{2}} + c_{2} \Vert u \Vert ^{2} \\ &= \frac{1}{ \Vert Y \Vert _{L^{2}}^{2}} \bigl( \bigl\Vert \delta(I+R)+D-Y \bigr\Vert _{L^{2}}^{2} + \omega_{1} \Vert D-Z \Vert _{L^{2}}^{2} + \omega_{2} \Vert u \Vert ^{2} \bigr), \end{aligned}$$
(5)

where \(\Vert f \Vert _{L^{2}}^{2}:=\int_{t_{0}}^{T_{\text{Fit}}} f(t)^{2} \,dt\) denotes the square of the \(L^{2}\)-norm of the function f on the interval \([t_{0}, T_{\text{Fit}}]\) and \(\omega_{1}= c_{1} \frac{ \Vert Y \Vert _{L^{2}}^{2}}{ \Vert Z \Vert _{L^{2}}^{2}}\) as well as \(\omega_{2}=c_{2} \Vert Y \Vert _{L^{2}}^{2}\). For the given data we have \(\Vert Y \Vert _{L^{2}}^{2} \simeq1.2\cdot 10^{11}\) and \(\Vert Z \Vert _{L^{2}}^{2} \simeq6.5\cdot10^{8}\), hence \(\omega_{1} \simeq c_{1}\cdot185\). The cumulated infected Y, i.e. total positive tests, are to be matched in the SEIRD-model to those individuals who had been infected until time t, i.e. the sum of the infected I, recovered R and deaths D. To account for the uncertainty in the true number of infected and recovered cases, we multiply both compartments by the detection rate δ, which is itself part of the parameters to be identified. For the deaths we assume no undetected cases. By \(T_{\text{Fit}}\) we denote the time horizon used for the comparison between the model and the data. The regularization term \(\omega_{2} \Vert u \Vert ^{2}\) is included to ensure the convexity of the cost-functional. The weighting parameters \(c_{1}, c_{2}\) and hence \(\omega_{1},\omega _{2}>0\) allow to balance the contributions from the least squares error in the fatalities and from the size of the parameter values themselves to the least squares error in the infected cases. The weight \(c_{1}\) for the fatal cases allows to compensate the different order of magnitude between the infected cases and the fatal cases, typically \(c_{1}\simeq2\)–3 leading to \(\omega_{1}\simeq500\). The weight \(c_{2}\) is chosen small, such that the overall cost functional is still dominated by the least square fit between the model output and the given data.

The parameters \(u^{\ast}\) themselves are obtained from minimization problem

$$\begin{aligned} &\min_{u} J(u) \text{ subject to one of the ODE-systems (1a)--(1e), (3a)--(3e) or (4a)--(4e)}. \end{aligned}$$
(6a)
$$\begin{aligned} &u^{\ast}=\mathop {\operatorname {argmin}}_{u} J(u). \end{aligned}$$
(6b)

A few analytical considerations

Due to the absence of demographic terms, our basic model (1a)–(1e) does not allow other equilibria besides the trivial disease free equilibrium \(X^{0}=(N,0,0,0,0)\). Since we focus only on the short-time behavior of the epidemics, demographic terms are excluded and equilibria do not play any important role.

An important issue is the question of wether we can identify the detection rate and lethality during the take-off period of the epidemics? The only data available for parameter identification are the total number of registered cases \(C=I+R+D\) and the deaths D. The total registered cases heavily depend on the number of tests conducted. If a person is infected, but not tested, this person will not appear in the official statistics. Hence, there is a presumably large dark figure in the officially recorded data. Our model parameter δ takes this into account. The other, maybe more reliable, available data are the recorded deaths. Here we may assume that all COVID-19 related deaths are diagnosed and hence there is no dark figure in the D-compartment. A recent analysis by the Federal Statistical Office on the excess mortality in Germany for March and April 2020 confirms this assumption, see [11]. For other countries this assumption might be questionable, since they suffered from major COVID-19 outbreaks in care homes that did not enter the official statistics, e.g. in the UK, see [12].

However, one scenario could be possible. A large dark figure in the entire cases, i.e. a small detection rate δ and a very small lethality could result in the same or at least similar observed data as a moderate or even small dark figure and hence large detection rate δ combined with a higher lethality rate. In that setting a simultaneous identification of both, the detection rate δ and the lethality μ could be difficult due to their counteracting effects.

In order to investigate this scenario, we consider the simultaneous effect of the detection rate δ scaling both the initial values of the E and I compartment to account for undetected cases together with a lethality δμ. Removing the S-compartment by setting \(S=N-E-I-R-D\), the basic SEIRD-system (1a)–(1e) reads as

$$\begin{aligned} &\frac{dE}{dt} = \frac{\beta}{N} (N-E-I-R-D)I - \vartheta E,\qquad E(t_{0}) = E_{0}/\delta, \\ &\frac{dI}{dt} = \vartheta E - \gamma I,\qquad I(t_{0}) = I_{0}/\delta, \\ &\frac{dR}{dt} = (1-\delta\mu) \cdot\gamma I,\qquad R(t_{0}) = 0, \\ &\frac{dD}{dt} = \delta\mu\cdot\gamma I ,\qquad D(t_{0}) = 0. \end{aligned}$$

The sensitivities \(\varSigma_{E}:=\partial_{\delta}E\) and \(\varSigma_{I}, \varSigma_{R}, \varSigma_{D}\) of the solution with respect to the detection rate satisfy the system

$$\begin{aligned} \begin{aligned} &\varSigma_{E}' = \frac{\beta}{N}(N-E-2I-R-D) \varSigma_{I} - \vartheta \varSigma_{E} - \frac{\beta}{N}I (\varSigma_{E}+\varSigma_{R}+\varSigma_{D}), \\ & \varSigma_{E}(t_{0}) = -E_{0}/ \delta^{2}, \end{aligned} \end{aligned}$$
(7a)
$$\begin{aligned} &\varSigma_{I}' = \vartheta\varSigma_{E} - \gamma\varSigma_{I},\qquad \varSigma_{I}(t_{0}) = -I_{0}/\delta^{2}, \end{aligned}$$
(7b)
$$\begin{aligned} &\varSigma_{R}' = (1-\delta\mu)\gamma \varSigma_{I} - \mu\gamma I,\qquad \varSigma_{R}(t_{0}) = 0, \end{aligned}$$
(7c)
$$\begin{aligned} &\varSigma_{D}' = \delta\mu\gamma\varSigma_{I} + \mu\gamma I,\qquad \varSigma_{D}(t_{0}) = 0. \end{aligned}$$
(7d)

In Fig. 4 we show the relative sensitivities \(\varSigma_{C}/C\) and \(\varSigma_{D}/D\) for detection rates \(\delta=0.1, 0.2\) and 0.33. The chosen initial values are \(E_{0}=150\) and \(I_{0}=100\) (detected) cases at day 0. All other parameters resemble the assumed values for Germany. Note, that at the onset of the epidemics, i.e. in case of \(\delta=0.1\) for \(t\lesssim30\) and for \(\delta=0.2, 0.33\) even for \(t\lesssim40\), the sensitivities are very small and hence the solution of the SEIR-model is almost independent of the particular value of the detection rate δ. Hence δ cannot be identified from the observed data in a reliable manner.

Figure 4
figure4

Relative sensitivities of C (left) and D (right) with respect to the detection rate δ for \(\delta=0.1\) (blue solid), \(\delta=0.2\) (red dashed) and \(\delta=0.33\) (green dash-dotted). At the onset of the epidemics, the sensitivities are extremely small, hence no reliable identification of δ is possible

To illustrate these findings, we consider a linearization of a simplified SIR-model during the initial phase of the epidemics. We neglect the exposed compartment and assume that at the initial phase, the number of susceptibles is approximately equal to the entire population. Hence we get the linear system

$$\begin{aligned} &\hat{I}' = (\beta-\gamma)\hat{I}, \qquad\hat{I}(0)= \frac{1}{\delta}I_{0}, \\ &\hat{D}' = \delta\mu\gamma\hat{I},\qquad \hat{D}(0)=0 \end{aligned}$$

with the solution

$$\begin{aligned} \hat{D}(t;\delta) = \frac{\mu\gamma}{\beta-\gamma} \bigl(e^{(\beta-\gamma)t}-1 \bigr)I_{0}. \end{aligned}$$

In this linearized setting, the approximation for the dead compartment is independent of the detection rate δ.

From the graphs in Fig. 4 one can conclude, the a significant dependence of the detected or dead compartment C resp. D is given only after the initial take-off period of the epidemic. In the setting of Germany, this implies, that during the month of March a reliable identification to the detection rate might not be possible.

Adjoint equations and optimization

In order to solve the minimization problem (6a)–(6b), we use the adjoint equations, for details see [13, 14]. We introduce the Lagrangian

$$ \mathcal{L}(t,x,u,z) = J(u) + \int_{t_{0}}^{T_{\mathrm{Fit}}} z(t)^{T} \biggl(g(t,x,u)- \frac{dx}{dt} \biggr) \,dt. $$

Here \(z=(z_{S}, z_{E}, z_{I}, z_{R}, z_{D})\) denotes the adjoint functions to the state variable \(x=(S,E,I,R,D)\) and \(g(t,x,u)\) denotes the right hand side of the ODE resp. DDE system. The gradient of \(\mathcal{L}\) with respect to the unknown parameters u is given by

$$\begin{aligned} &\frac{\partial\mathcal{L}}{\partial\beta_{i}} = 2 c_{2}\beta_{i} + \frac{1}{N} \int_{t_{0}}^{T_{\mathrm{Fit}}} \frac{\partial \beta(t)}{\partial\beta_{i}} S I (z_{E}-z_{S} ) \,dt,\quad i = 0,1,2 \\ &\frac{\partial\mathcal{L}}{\partial\delta} = 2 c_{2}\delta+ \frac{2}{ \Vert Y \Vert _{L^{2}}^{2}} \int _{t_{0}}^{T_{\mathrm{Fit}}} (I+R) \bigl[\delta(I+R)+D - Y \bigr] \,dt, \\ &\frac {\partial\mathcal{L}}{\partial\mu} = 2 c_{2}\mu+ \int _{t_{0}}^{T_{\mathrm{Fit}}} \gamma I (z_{D}-z_{I} ) \,dt, \\ &\frac {\partial\mathcal{L}}{\partial E_{0}} = 2 c_{2} E_{0} + z_{E}(t_{0})- z_{S}(t_{0}), \\ &\frac{\partial\mathcal{L}}{\partial I_{0}} = 2 c_{2} I_{0} + z_{I}(t_{0})-z_{S}(t_{0}). \end{aligned}$$

Note, that in the case \(\beta= \beta_{0} = \beta_{1} = \beta_{2}\) we have \(\frac{\partial\beta(t)}{\partial\beta} = 1\). By adding the time delay, we obtain

$$\begin{aligned} \frac{\partial\mathcal{L}}{\partial\mu} = 2 c_{2}\mu+ \int_{t_{0}}^{T_{\mathrm{Fit}}} \gamma I (z_{I}-z_{R} )+ \gamma I (t-\tau ) (z_{D} - z_{I} ) \,dt. \end{aligned}$$

The adjoint system reads as

$$\begin{aligned} &\frac{d z_{S}}{dt} = \frac {\beta(t)}{N} I (z_{S} - z_{E} ), \\ &\frac{d z_{E}}{dt} = \vartheta (z_{E} - z_{I} ), \\ &\frac{d z_{I}}{dt} = \frac {\beta(t)}{N} S (z_{S}-z_{E} )+ \gamma \bigl[z_{I} - z_{R} + \mu (z_{R} - z_{D} ) \bigr]- \frac{2\delta }{ \Vert Y \Vert _{L^{2}}^{2}} \bigl[\delta(I+R)+D-Y \bigr], \\ &\frac{d z_{R}}{dt} = -\frac{2\delta}{ \Vert Y \Vert _{L^{2}}^{2}} \bigl[\delta(I+R)+D-Y \bigr], \\ &\frac{d z_{D}}{dt} = -\frac{2}{ \Vert Y \Vert _{L^{2}}^{2}} \bigl[\delta(I+R)+D-Y \bigr]- \frac{2c_{1}}{ \Vert Z \Vert _{L^{2}}^{2}}(D-Z). \end{aligned}$$

supplemented by the terminal condition \((z_{S}, z_{E}, z_{I}, z_{R}, z_{D}) (T_{\mathrm{Fit}})=0\). In the case of the time delay we receive

$$\begin{aligned} \frac{d z_{I}}{dt} = {}&\frac{\beta(t)}{N} S (z_{S}-z_{E} )+ (1 - \mu )\gamma (z_{I} - z_{R} )- \frac {2\delta}{ \Vert Y \Vert _{L^{2}}^{2}} \bigl[\delta (I+R)+D-Y \bigr] \\ & {}+ \mu\gamma \bigl[z_{I}(t+\tau) - z_{D}(t+\tau ) \bigr] \cdot\chi_{[t_{0},T_{\mathrm{Fit}}-\tau]}(t). \end{aligned}$$

Here \(\chi_{[a,b]}(t)\) denotes the characteristic function of the interval \([a,b]\), i.e. we define \(\chi_{[a,b]}(t)=1\) for \(t\in[a,b]\) and =0 otherwise.

To solve the optimization problem (6a)–(6b) numerically, we apply the Forward-Backward Sweep method [13] combined with a Quasi-Newton method (BFGS) [15].

In each iteration step the ODEs and DDEs of the state variables and adjoint equations are solved with Runge–Kutta methods before the corresponding gradient and direction of descent can be determined. The algorithm stops as soon as the termination condition \(\Vert J(u_{k+1}) - J(u_{k}) \Vert< \mathtt{TOL} \) is fulfilled.

As initial values we use \(\beta= \beta_{0} = \beta_{1} = \beta_{2} = 0.3\) for the transmission rate. This is justified by the fact that an average Basic Reproduction Number of about \(\mathcal{R}_{0} = 3\) is assumed and in our basic model we have

$$ \mathcal{R}_{0} = \frac{\beta}{\gamma}. $$

Epidemiologically, \(\mathcal{R}_{0}\) indicates the number of new infections an infected individual causes during the infectious period in an otherwise susceptible population. For the sake of simplicity, we assume the same starting value for \(I_{0}\) and \(E_{0}\). This corresponds to the value at the first data point of our measurement, i.e. 130 registered infected persons on 1st March. As already mentioned, we assume that for the recovered and deaths at this time \(R_{0} = D_{0} = 0\) holds. The possible problems with the optimization of δ and μ were already mentioned in the previous section. To increase the probability of generating a global minimum, we use \(n = 1000\) normally distributed start values for both parameters fulfilling \(\delta\sim\mathcal{N}(0.25,0.25^{2})\) and \(\mu\sim\mathcal{N}(0.03,0.03^{2})\) with \(\delta, \mu> 0\). The algorithm selects the best result of these n data fits. The reason for this is the assumption that the proportion of detected cases is between 1–50% and the lethality below 6%. For the case fatality rate μ we have

$$ \frac{Z}{Y} \le\mu\le\frac{Z}{R_{\text{reported}}/\delta+Z}, $$
(8)

where \(R_{\text{reported}}\) stands for the reported recovered at time \(T_{\mathrm{Fit}}\). The approach for this estimation can be found in [10]. The smaller the detection rate δ, the lower the upper bound for the case fatality gets.

In case of the time delay we choose as initial history for \(s \in[t_{0} - \tau,t_{0}]\)

$$ I(s) = I_{0} \exp \biggl(- \frac{\ln(0.1)}{\tau} (s - t_{0} ) \biggr). $$

This is justified by the fact that the number of registered cases has increased tenfold during this period and we assume an exponential growth in this time span.

Simulation results

To estimate the unknown parameters u, we match the data reported on a daily basis by Johns Hopkins [5] to our simulation results for a time period starting on 1 March.

The first results in Fig. 5 show a parameter estimation using the basic model (1a)–(1e) and the time period before the onset of any containment measures, i.e. before the closing of schools on 16 March. We fitted the parameters β, δ and μ along with the initial values \(E_{0}\) and \(I_{0}\) over the time period 1 March to 16 March. The initial values \(E_{0}\) and \(I_{0}\) are also subject to fitting, since the official data does not provide information about the active infections at a given day. The weight \(\omega_{2}=1\) to keep the cost functional dominated by the two least square errors. The other weight is chosen as \(\omega_{1}=500\) to compensate the significantly smaller value of the least square error in the fatal cases.

Figure 5
figure5

Fit of the basic model (1a)–(1e) to the data for the period 1 March to 16 March, i.e. before the onset of containment measures

For the given time period of the fit, the model prediction and the observed data are in good accordance. The estimated parameter values are given in Table 1. The detection rate was estimated as \(\delta=0.37\) implying that the true number of infections exceeds the registered cases by a factor 3. The transmission rate \(\beta=0.57\) accounts for a doubling time of 2.6 days at the initial, uncontrolled phase of the epidemic in Germany.

Table 1 Optimal parameter values for the three models (1a)–(1e), (3a)–(3e) and (4a)–(4e) obtained from the minimization problem (6a)–(6b)

In Fig. 6 we show the results obtained with the time-dependent model (3a)–(3e). In this case, the fitting period equals to the entire simulation period starting from 1 March to 7 April. The weights \(\omega_{1}, \omega_{2}\) are identical to the previous simulation. The obtained transmission rate and according doubling times change from \(\beta_{0}=0.5232\) and \(T_{2}(\beta_{0})=2.8\) days at the initial uncontrolled phase to \(\beta_{2}=0.18\) and \(t_{2}(\beta_{2})=11.4\) days after the contact ban has been introduced. The effect of the contact ban effectively reduces the transmission rate by a factor of about 3 and significantly slows down the speed of the epidemics by increasing the doubling time by a factor 4.

Figure 6
figure6

Fit of the time-dependent model (3a)–(3e) to the data for the period 1 March to 7 April

In Fig. 7 we show the result obtained with the delay model (4a)–(4e). For the delay model, we assume a delay of 14 days between entering the class of infected and death. Again, we show the simulation results compared to the reported cases for the infections and deaths. Quite good agreement is found between the model and the simulation for both, infections and deaths. Compared to the time-dependent model, shown in Fig. 6, the delay model agrees better in particular for the fatal cases. In Table 1 we have listed the estimated parameter values in for the three models. We have also included the normalized \(L^{2}\)-difference between the simulation outcome and the given data, i.e. the first two summands from the cost fuctional (5). A t-test revealed that the deviations of the simulation to the reported data is not normal distributed at a significance level of 5%.

Figure 7
figure7

Fit of the delay model (4a)–(4e) to the data for the period 1 March to 7 April

In the simulations the detection rate is found to be 20–40%, indicating that the true number of SARS-CoV-2 infections might be 3–5 times higher that the officially recorded data suggest. The lethality rate is found to be rather small, taking into account the large number of true cases.

Comparing the obtained values for the lethality, the value for the delay-model seems to be most realistic, since in this model we compare the fatal cases today to the infections that occurred two weeks ago. The two other models related the fatal cases of today to the infected cases today, hence to a significantly larger number. Therefore in these to models, the lethality rate seems to be smaller.

Conclusions and outlook

We present three SIR-based models for describing the outbreak of the SARS-CoV-2 outbreak in Germany. Besides a standard SEIR-model, we consider an extension taking into account the effect of social distancing by a time-dependent reduction of the transmission rate. The third model introduces a delay-term to accurately describe the deaths depending on infected cases that occurred several days in the past. Comparing the simulation results to the data published by Johns Hopkins University allows an estimation of the unknown model parameters. Best results are obtained using the delay equation model. In this setting, we find a detection rate of about 20% and a lethality of about 4%. The social distancing measures were leading to an effective reduction of the transmission rate by a factor 4. That is, after the introduction of the measures roughly just 25% of the social contact compared to the initial period were leading to infections.

References

  1. 1.

    World Health Organization (WHO). Novel Coronavirus (2019-nCoV) situation reports. www.who.int/emergencies/diseases/novel-coronavirus-2019/situation-reports.

  2. 2.

    Robert-Koch-Institute. Daily situation reports. www.rki.de/DE/Content/InfAZ/N/Neuartiges_Coronavirus/Situationsberichte/Gesamt.html.

  3. 3.

    Wikipedia. 2020 coronavirus pandemic in Germany. https://en.wikipedia.org/wiki/2020_coronavirus_pandemic_in_Germany#Statistics

  4. 4.

    Federal Government of Germany. Guidelines for reducing social contacts. www.bundesregierung.de/breg-de/themen/coronavirus/besprechung-der-bundeskanzlerin-mit-den-regierungschefinnen-und-regierungschefs-der-laender-1733248 (in German).

  5. 5.

    Johns Hopkins University. Time series of confirmed COVID-19 cases globally. github.com/CSSEGISandData/COVID-19/blob/master/csse_COVID_19_data/csse_COVID_19_time_series/time_series_COVID19_confirmed_global.csv.

  6. 6.

    Robert-Koch-Institute. Why differ the information reported by RKI and Johns Hopkins University? (in German). https://www.rki.de/SharedDocs/FAQ/NCOV2019/gesamt.html.

  7. 7.

    Martcheva M. An introduction to mathematical epidemiology. Berlin: Springer; 2015.

    Google Scholar 

  8. 8.

    WHO. Q&A on coronaviruses. www.who.int/news-room/q-a-detail/q-a-coronaviruses

  9. 9.

    Magal P, Webb G. Predicting the number of reported and unreported cases for the COVID-19 epidemic in South Korea, Italy, France and Germany. medRxiv https://doi.org/10.1101/2020.03.21.20040154, 2020.

  10. 10.

    Robert-Koch-Institute. Corona fact sheet. https://www.rki.de/DE/Content/InfAZ/N/Neuartiges_Coronavirus/Steckbrief.html.

  11. 11.

    Federal Statistical Office: Press release No. 179 of 22 May 2020. https://www.destatis.de/EN/Press/2020/05/PE20_179_12621.html.

  12. 12.

    The Guardian: UK care home Covid-19 deaths may be five times government estimate´. https://www.theguardian.com/world/2020/apr/18/uk-care-home-covid-19-deaths-may-be-five-times-government-estimate (18 April 2020).

  13. 13.

    Lenhart S, Workman JT. Optimal control applied to biological models. Boca Raton: CRC Press; 2007.

    Google Scholar 

  14. 14.

    Vinter R. Optimal control. Berlin: Springer; 2010.

    Google Scholar 

  15. 15.

    Nocedal J, Wright S. Numerical optimization. Berlin: Springer; 2006.

    Google Scholar 

Download references

Acknowledgements

The authors would like to thanks the MOCOS (MOdeling COrona Spread) International Group consisting of research groups at the universities in Kaiserslautern, Koblenz, Trier and Wroclaw for valuable discussions on this urgent and internationally highly relevant issue. We also express our gratitude to the unknown referees for helping with their comments to improve the manuscript.

Availability of data and materials

The used data on registered COVID-19 cases from Johns Hopkins university is available, see [5].

Authors’ information

Not applicable.

Funding

Not applicable.

Author information

Affiliations

Authors

Contributions

Both authors contributed equally. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Thomas Götz.

Ethics declarations

Competing interests

The authors declare that they have no competing interests.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Reprints and Permissions

About this article

Verify currency and authenticity via CrossMark

Cite this article

Götz, T., Heidrich, P. Early stage COVID-19 disease dynamics in Germany: models and parameter identification. J.Math.Industry 10, 20 (2020). https://doi.org/10.1186/s13362-020-00088-y

Download citation

Keywords

  • COVID-19
  • Epidemiology
  • Disease dynamics
  • SEIRD-model