Robustness of the reproductive number estimates in vector-borne disease systems

Background The required efforts, feasibility and predicted success of an intervention strategy against an infectious disease are partially determined by its basic reproduction number, R0. In its simplest form R0 can be understood as the product of the infectious period, the number of infectious contacts and the per-contact transmission probability, which in the case of vector-transmitted diseases necessarily extend to the vector stages. As vectors do not usually recover from infection, they remain infectious for life, which places high significance on the vector’s life expectancy. Current methods for estimating the R0 for a vector-borne disease are mostly derived from compartmental modelling frameworks assuming constant vector mortality rates. We hypothesised that some of the assumptions underlying these models can lead to unrealistic high vector life expectancies with important repercussions for R0 estimates. Methodology and principal findings Here we used a stochastic, individual-based model which allowed us to directly measure the number of secondary infections arising from one index case under different assumptions about vector mortality. Our results confirm that formulas based on age-independent mortality rates can overestimate R0 by nearly 100% compared to our own estimate derived from first principles. We further provide a correction factor that can be used with a standard R0 formula and adjusts for the discrepancies due to erroneous vector age distributions. Conclusion Vector mortality rates play a crucial role for the success and general epidemiology of vector-transmitted diseases. Many modelling efforts intrinsically assume these to be age-independent, which, as clearly demonstrated here, can lead to severe over-estimation of the disease’s reproduction number. Our results thus re-emphasise the importance of obtaining field-relevant and species-dependent vector mortality rates, which in turn would facilitate more realistic intervention impact predictions.

borne disease, hosts are immune for the rest of their life. Susceptible hosts become exposed to the virus at a rate λ H (force of infection), then become infectious after 1/ H days, and finally recover after 1/γ days. Susceptible vectors become exposed to the virus at a rate λ V , and develop infectiousness after 1/ V days.
The force of infection on susceptible hosts, λ H is dependent on the number of infected vectors in the system, and the force of infection on susceptible vectors, λ V is dependent on the number of infected hosts in the system. All vectors bite hosts at a constant per day rate β. Thus the force of infection on susceptible hosts and on susceptible vectors are where p H,V is the probability of successfully transmitting the disease given a biting event between an infected and susceptible individual, and I H (t) and I V (t) are the number of hosts and vectors infected at time t respectively. Additionally, individuals were infected at an external infection rate ι to account for introduction of the disease from outside the system.
The host and vector populations were divided into a set of communities organized into a nonwrapping lattice. The distribution of humans and mosquitoes was set to be maximally uniform across all communities. We assumed that individuals mix homogeneously within each community. Let C be the set of all communities in the meta-population and define c = [r, c] ∈ C where [r, c] is the community's row and column number in the lattice. In a model with no intercommunity interaction, communities are isolated from one another, so the force of infection terms of each community c ∈ C are where I H (t, c ) and I V (t, c ) are the number of infected hosts and vectors respectively in community c at time t, and N H ( c ) is the number of humans in community c.
Infection between communities was considered by dispersing the total number of transmission events in a community to neighbouring communities, thereby modelling the biting of vectors in surrounding sub-populations. The total number of transmissions in community c ∈ C at time t is the force of infection multiplied by the population size. Inter-community transmissions were included by setting the total number of transmissions for each community to be a weighted sum of the total number of transmissions for each sub-population in a model where all communities are isolated, with weights defined by some disease dispersal kernel Φ (·).
where Φ ζ ( c ) are the normalized weights, N V (t, c ) and N H (t, c ) are the total number of vectors and hosts respectively in community c at time t, and Λ V (t, c ) and Λ H (t, c ) are the force of infection terms on vectors and hosts respectively in community c at time t with inter-community transmissions.
The local disease dispersal kernel is Φ ζ is normalized and dependent on the distance between the source community ζ and each destination community c ∈ C involved in the dispersal of each transmission event. It was assumed that the distance dependence follows a normal distribution centred at mean zero with variance σ 2 . Therefore, φ ζ ( c ) can be computed as the volume of the two dimensional normal distribution centred around ζ with covariance matrix Σ evaluated at the community c.
A model for human movement was incorporated into the model by allowing human individuals to temporarily visit any sub-population in the lattice with probability ω. Combining this with the local disease dispersal kernel, the total number of transmission events for vectors and hosts are computed as where |C| is the total number of communities in the lattice.
The mathematical method was implemented computationally in C++ with CUDA GPU acceleration in order to reduced simulation run-time such that extensive sensitivity analyses could be performed.

Computational implementation
The mathematical method was implemented computationally in C++ with CUDA GPU acceleration in order to reduced simulation run-time such that extensive sensitivity analyses could be performed.
In the model, a human individual is defined by their age, a random probability determining an individuals life expectancy, the community in which they reside, and the immunological history of the individual, what infected class they are in (susceptible, exposed, infected). An infected or exposed human individual will also the age at which they become infectious, and the age at which they recover. Similarly, an individual mosquito is defined by their age, the random probability, and home community. Infected mosquito also have information on the age at which they become infectious. Note that by the compartmental model in the previous subsection, there is no need for recovery times or immunological history since a mosquito is infected for life.
Firstly, a disease free human and mosquito population are initialized. An individuals age is initialized from the survival function, the probability that an individual will survive beyond a given age. For an age-dependent mortality rate µ (t), the survival function S (t) is given by The life expectancy probability is assigned on a uniform distribution of [s (t) , 1]. Individuals are assigned communities uniformly. A small proportion (≈ 0.01%) of individuals are then infected with infected humans being assigned an age at which they recover. There are only two kernel calls for the demographic update; one for mosquitoes, and one for humans. Each kernel is not sub-divided into smaller functions in order to reduce the number of memory stores and loads. It is expected that the mosquito demographic update is slower than the human update since the much higher per day death rate in mosquitoes leads to a large increase in warp divergence. Other simpler optimization strategies suggested in the CUDA manual, such as using fast math libraries, were followed throughout the parallelized parts of code.

ODE-based reproductive number
We applied the next generation approach [2] to a typical SEIR-SEI host-vector system of ordinary differential equations in order to get the formula for the basic reproductive number R ODE 0 .
Firstly, by assuming the entire host and vector population is susceptible to the pathogen, the ODEs (see Methods in the main text) can be reduced to the following subsystem.
where M is the number of vectors per host (N V /N H ). Then, by setting x = [E H , I H , E V , I V ] T , the subsystem can be written asẋ where T represents the transmission of the pathogen between hosts and vectors, and Σ corresponds to the transitions of individuals between exposed and infected classes. The next generation matrix K L can then be calculated.
Diekmann (2010) showed that dominant eigenvalue of the next generation matrix is the reproductive number of the original system of ordinary differential equations. This eigenvalue is the square root of the reproduction number for the full transmission cycle of a vector-borne disease, which is what we are interested in. Therefore However, for most vector-borne diseases the human recovery rate 1/γ, and intrinsic incubation period 1/ H , are much shorter than mean human life expectancy 1/µ H in the ODE system, an approximation can be made by removing the death rate of humans µ H from the formula

IBM-based reproductive number
The basic reproductive number, R IBM 0 , for the individual based model was computed from first principles using the transmission cycle of the pathogen as described by the individual-based model itself. Starting with an infected human in an entirely susceptible population, the human will infect M p V β vectors per day on average and will be infected for 1/γ days. Therefore, a single infected human is expected of infect M p V β/γ vectors.
The proportion of infected vectors that survive the extrinsic incubation period of the disease is denoted by ρ E V →I V and was calculated from the vector mortality risk µ V . First, we calculate the survival function s (τ ). The survival function is a monotonically decreasing function which defines the probability that a vector will survive beyond τ days in age.
The probability that an individual of age τ will survive a mean incubation period of 1/ V is In order to get the mean probability that an infected vector becomes infectious, we weight ρ E V →I V (τ ) by the proportion of the vector population at age τ , derived from the normalization of the survival function.
A single infected vector will infect p H β humans per day on average. As infectious vectors can infect for the rest of their life, the infectious period is defined as difference between the mean life expectancy of an infectious vector, 1/µ I V , and the mean age at which a vector becomes infectious, 1/α I V . Therefore, the infectious period is dependent upon the vector mortality rate.
As the vector population is assumed to be completely susceptible in the definition of R 0 , the mean age of an infectious vector is calculated by taking the weighted average of all possible ages of an infectious vector, τ + 1/ V , with the proportion of infectious vectors at that given age, obtained by the normalization of the survival function perturbed by the incubation period.
The mean life expectancy of an infectious vector, 1/µ I V , is not the same as the mean life expectancy of a vector, because a vector that became infectious at time τ + 1/ V must have survived up until that age, thereby altering its remaining mortality risk distribution. Therefore, the mean life expectancy of an infectious vector that became infected at age τ days, f (τ ), must survive at least τ + 1/ V plus the remaining expected survival time.
where s τ + 1 Combining all these terms, the basic reproduction number of the individual based model can be derived as Basic reproduction number from initial growth rate The basic reproduction number can be estimated from the time-series of an outbreak of a vectorborne disease in an entirely susceptible population of hosts and vectors.
By assuming that the proportion of infected hosts (i = H) and vectors (i = V ) survive the intrinsic and extrinsic incubation period is i / ( i + µ i ), and that the entire host and vector populations are susceptible to the vector-borne disease the infectious stage of the SEIR-SEI ODE system can be rewritten into the following delay differential equatioṅ where τ H = 1/ H and τ V = 1/ V , are the intrinsic and extrinsic incubation periods respectively.
At the start of an epidemic, the growth rate of infected hosts and vectors is assumed to be a constant λ, so the total number of infected hosts and vectors is defined as Substituting these into equations 29 and 30 and re-arranging yields And then substituting the equation for R 0 derived from the ODE system using the nextgeneration method yields Equilibrium estimation methods are more reliable than using the initial growth rate The asymptotically stable steady state of susceptible individuals in an ODE-based SEIR system for a directly transmitted disease is a good estimator of the basic reproductive number and is given as where S * is the mean proportion of susceptible individuals at equilibrium. The directly transmitted disease R 0 estimate was then used as an approximation for the basic reproductive number of an endemic vector-borne disease. For the individual based model (IBM) this required the proportion of susceptible individuals to reach some dynamic equilibrium, as the inherent stochasticity inhibits the system from reaching an asymptotically stable equilibrium (Figure A in S1 Figure).
Under the assumption of constant vector mortality rate, estimating the reproduction number from the dynamic equilibrium of susceptible individuals is much more reliable than from the initial growth rate of an epidemic in an entirely susceptible population ( Figure B in S1 Figure).
Although the estimate is more robust, gathering seroprevalence data in the field offers substantially greater challenges than collecting the total number of cases over time at the start of an outbreak.