Scabies in residential care homes: Modelling, inference and interventions for well-connected population sub-units

In the context of an ageing population, understanding the transmission of infectious diseases such as scabies through well-connected sub-units of the population, such as residential care homes, is particularly important for the design of efficient interventions to mitigate against the effects of those diseases. Here, we present a modelling methodology based on the efficient solution of a large-scale system of linear differential equations that allows statistical calibration of individual-based random models to real data on scabies in residential care homes. In particular, we review and benchmark different numerical methods for the integration of the differential equation system, and then select the most appropriate of these methods to perform inference using Markov chain Monte Carlo. We test the goodness-of-fit of this model using posterior predictive intervals and propagate forward the resulting parameter uncertainty in a Bayesian framework to consider the economic cost of delayed interventions against scabies, quantifying the benefits of prompt action in the event of detection. We also revisit the previous methodology used to assess the safety of treatments in small population sub-units—in this context ivermectin—and demonstrate that even a very slight relaxation of the implicit assumption of homogeneous death rates significantly increases the plausibility of the hypothesis that ivermectin does not cause excess mortality based upon the data of Barkwell and Shields.


Stochastic models for well-connected population sub-units
Stochastic (random) models play a pivotal role in the description of transmission and control of infection within small closely-connected populations. The typical example of such populations is households [2][3][4][5], although the household methodology is generalisable to any wellconnected population sub-unit such as the residential and nursing care homes (RNCs) we consider here. These models are increasingly becoming useful tools for studying disease transmission and control in structured populations, which can in part be attributed to the availability of household-stratified infection data that can be used to parameterise the models [6][7][8] as well as affordability in the amount of computing power. A type of household transmission model that is currently growing in popularity is a stochastic Markovian household model [2,9,10].
In this type of model, individuals are assumed to have two levels of mixing: one representing transmission between people sharing/living in the same household and the other representing global contacts within the population. These kind of models have the advantage that they capture the temporal behavior of the epidemic and offer a computational trade-off between simpler whole-population models [11,12] and more complex, computationally-intensive individual-based models [13,14].
Stochastic models have traditionally been studied via simulation and estimation based upon a large number of event-driven integer-based simulations [15,16]. These methods are powerful but require a large number of replicates to reduce Monte Carlo error, since it is typically unclear if a single simulation represents the average behaviour of the system or the outcome of a combination of rare events. They therefore quickly become computationally intensive due to the combination of the sheer number of replicates required and the number of possible events that can occur in a given time step. In this work, we have presented a method that allows for the complete range of stochastic behaviours to be captured by a large set of ordinary differential equations (ODEs) which we will refer to as the master equation (also known as the forward equation). The master equation is a set of linear ODEs representing the probability of being in each possible state with the dynamics driven by the rates of transition between states. This method has previously been applied to the study of stochastic disease dynamics [9,[17][18][19].
The use of a master equation has existed for quite a long time but has not been widely used in epidemiology and this is partly due to the algorithmic difficulties involved in solving the resulting large system of linear ODEs. There are a range of methods that can be used to solve the system and in this work we concentrate on so-called series-expansion and projectionbased methods. Both method classes are based on the efficient approximation of the action of the matrix exponential on a vector, a problem that has attracted a lot of research and for which various numerical methods have been proposed [20][21][22]. Besides the existence of these numerical methods used to solve the master equation, there still lacks a body of work that computationally benchmarks the algorithms for use in mathematical epidemiology, an area where these algorithms can be of great utility. In this work, we benchmark a number of competing algorithms and then use the best algorithm to solve a system of linear ODEs describing the transmission of scabies in care homes in the UK to enhance the efficiency and quality of an inferential and policy-driven modelling study. We also consider how the safety and efficacy of pharmaceutical interventions can be assessed using probabilistic models.

Scabies
Sarcoptes scabiei is an ectoparasite that infests human skin, where it burrows and lays eggs causing intense itching and scratching, which may in turn lead to secondary bacterial infection [23]. Global prevalence in 2015 was estimated at 204 million [24] but varies enormously both between and within nations [25]. All age prevalence over 30% has been reported in some countries (Papua New Guinea, Panama, Fiji) [26] and overcrowded settings such as slums and refugee camps can have very high prevalence [27]. In most low and middle income settings it particularly affects children [27]. However in developed high-resource settings, the incidence and prevalence in children and schools has declined, while outbreaks are commonly reported in residential and nursing care homes for the elderly [28,29]. The mite is transmitted mainly through skin to skin contact, and also to a lesser extent through "fomites" (e.g., bedding, skin flakes) [23,30]. Itching begins 4-6 weeks after exposure for a first episode, but can start within 24 hours in subsequent infections [30,31]. An infested person becomes infectious in most cases around 10-14 days after becoming exposed, when newly fertilised adult female mites become ready to seek new burrows in which to lay eggs [30]. Scabies is commonly misdiagnosed since the classical physical signs (burrows and papules) are variably present. This is a particular problem in RNCs since many residents have dementia and may not be able to say they are itching, while their increased agitation may be attributed to other causes. As a consequence, recognition of cases and outbreaks in RNCs is often delayed [28]. In the absence of interventions, scabies is generally not self-limiting [30,32] with a study in Bangladesh observing that children could remain infected for more than six months [33].
In the UK, first-line therapy for scabies is topical permethrin, which is applied all over the body, left on for 8 hours before being washed off, and repeated 7 days later. In RNC outbreaks, residents and staff need to be treated simultaneously. This can be distressing and logistically challenging [28], especially as some guidelines recommend prophylactic mass treatment of all residents and staff once an outbreak is declared [34]. Oral ivermectin has been suggested as an effective alternative in healthcare settings, and is included in French national outbreak management guidelines for RNCs [34]. However, its wider usage as a mass treatment in RNCs has been limited in part due to safety concerns raised by Barkwell and Shields [1].
The retrospective study was carried out in a 47-bed closed unit for residents with behavioural tendencies over a period of six months between June and November 1995. A scabies outbreak occurred during this period and the individuals were treated with two different topical agents, lindale and permethrin, but scabies symptoms re-occured. Consequently, the patients were treated with a single oral dose of ivermectin and all the rashes and symptoms had cleared within five days with the individuals requiring no further treatment. However, during the following six months, the authors observed an increased pattern of excess deaths among the residents who had received ivermectin. Barkwell and Shields, in their conclusion, subsequently advised against using ivermectin to treat scabies in the elderly and/or those with an underlying medical condition, suggesting a potential causal association with deaths in the facility. We re-examine their statistical analysis and interpretation and provide a rigorous method for more careful analysis.

The SEI scabies model
The natural history of scabies infection in the absence of interventions is highly dependent on the history of previous exposure as well as immunological competency of the individuals. Walton [35] has reported that spontaneous recovery of scabies in healthy adults can occur only with subsequent re-infestations. Additionally, parasite numbers can be reduced and in approximately 60% of cases re-infestation of sensitised hosts was unsuccessful. It is still unknown how long this capacity for some level of acquired immunity persists, though 15-24 months after infestation with scabies mite extracts injected intradermally have failed to induce immediate wheal reactions in patients. So in the elderly and especially those in care homes with high comorbidities and compromised immunological responses, we do not include the possibility of spontaneous recovery in the absence of treatment. This would mean that following exposure to scabies, the individuals would have a protracted infection that is not self limiting. We therefore assume that scabies follows an SEI model framework in which individuals are initially susceptible (S), then following an infection event spend some time in a latent 'exposed' class (E). Once a fertilised female mite is transferred to a susceptible individual, mite generation time means there is a delay, between 7 and 14 days, before the host can become infective. However, during this period, the mite burrows can still be observed on the host's skin [30]. Eventually, the individuals become infectious, (I), and are able to infect others.
Our starting point is therefore the stochastic SEI model in a closed population of size N. This consists of three non-independent random variables, S(t), E(t) and I(t) such that S(t) + E(t) + I(t) = N, for all values of t, representing the number of individuals who are uninfected with scabies (Susceptible), who have been infected but are not able to infect others (Exposed), and who have been infected and are able to infect others (Infectious) respectively. The state transitions and rates in this model are ðS; E; IÞ ! ðS À 1; E þ 1; IÞ at rate l SI ; ðS; E; IÞ ! ðS; E À 1; I þ 1Þ at rate g E : If we define the expectations " S ¼ E½S, " E ¼ E½E and " I ¼ E½I then in the limit of large N the dynamics of this model will be governed by the more familiar deterministic differential equations: To model the finite-population dynamics, let P s,e,i (t) represent the probability that there are s, e, i numbers of susceptibles, exposed and infected respectively in the population at time t, then the complete dynamics will be modelled by considering all the possible infection configurations as shown in the system ( Eq (3) can be equivalently represented by counting the number of events of each type that occur rather than the number of individuals in each compartment. This is known as the DA (Degree of Advancement) representation and has previously been described elsewhere [36,37].
If we define Z 1 and Z 2 as the number of exposure (E) and progression to active infection (I) events respectively, then the state space of the process at time t can be denoted as Z(t) = (Z 1 , Z 2 ). If we index the states of the system as z i = (z 1 , z 2 ) with i = 1, . . ., n where n ¼ ðNþ1ÞðNþ2Þ 2 is the size of the state space, we can then order the states of the system such that z i < z i+1 .
The within-household transmission parameter λ in the system (3) is modelled as where β > 0 is an overall scaling for transmission and α represents the different ways that mixing behaviour can change with population size, N. If α = 0 then every pair of individuals in the same population makes contacts capable of spreading disease at the same rate regardless of N; and if α > 0 then larger populations reduce the rate of transmission as if each infective individual had a certain demand for contacts that are evenly spread throughout the population. If α < 0 then larger populations enhance transmission-while this would not normally be considered in the context of households, for RNCs we consider there is the possibility that larger facilities will have more opportunities for contact due to, for example more activities in larger communal areas. The parameter τ, represents the transmission between general members of the community i.e. between household mixing, and the proportion of the overall population that is infective is given as IðtÞ ¼ As we are considering a small number of carehomes in a large population [28], we assume that there is no contact between members of different carehomes and therefore no between carehome transmission. It follows then that if carehomes are independent then τ = 0. A more rigorous derivation of Eq 3 can be found in literature [2,38].

The master equation
More insight can be gained by representing the system (3) in vector notation. Let p be the column vector of the probabilities of a household being in a certain configuration at time t. Then (3) can be expressed more succinctly as a linear constant-coefficient initial value problem, the so-called master equation, where Q 2 R nÂn is the household transition matrix of order n (with n being equal to the total number of states the system can occupy; for our SEI model n = (N + 1)(N + 2)/2 as detailed above) and the probability column vector p 0 2 R n represents the initial configuration at time t = 0. The household transition matrix has the property that its elements sum to zero columnwise. The solution vector p(t) represents the transient behavior of the finite-state Markov chain and is easily shown to be a probability vector for all t ! 0. The solution of the master Eq (5) is given by where exp(tQ) = I + (tQ) + (tQ) 2 /2! + Á Á Á denotes the matrix exponential; see, e.g., [39,Chapter 10]. In what follows we sometime drop time t for notational simplicity. Note that the matrix Q is typically sparse because there is a limited number of transitions that a household in a certain configuration can make, i.e., there are few epidemiological state changes compared to the size of the matrix Q. Because a household can only move to a state following the current one in the DA representation, the states can be ordered so that Q is an upper triangular matrix. This leads to computational savings in some of the algorithms discussed later. The matrices we consider here also have a small bandwidth because we consider epidemiological processes with a limited number of events, though this is not a general feature of epidemiological models.

Exponential integration of the master equation
Technically, we are concerned with the fast and sufficiently-accurate computation of the matrix exponential in Eq (6). For scalar problems (i.e., n = 1) the computation of the exponential is trivial. However, the problem becomes challenging as n gets larger in which case the matrix Q is hopefully sparse or otherwise structured as in our case. We make use of the fact that the full matrix exponential exp(tQ) is not required, but merely the vector-matrix product exp(tQ)p 0 . Computationally, these two are different problems and this section focuses on methods which compute this product directly without forming the matrix exponential itself. Methods based on polynomial and rational approximants have proven to be particularly efficient for this task. They have in common that exp(tQ)p 0 % r(Q)p 0 where r is a well-chosen polynomial, or more generally a rational function, which depends on t and the required approximation accuracy. In the following we review a number of methods which fit into this framework. We refer the reader to [20] and [39, Chapter 10] for further reading. Implicit Runge-Kutta methods (DAm). Very simple computational methods are obtained from the truncated Taylor expansion of the exponential in exp(−hQ)p(h) = p 0 , This corresponds to performing one step of an implicit Runge-Kutta method of order m with step size h. The special case m = 1 is called the implicit (or backward) Euler method and it has recently been used in [37] for SIR epidemic modeling. Herein we refer to this method as DA1 (Direct Advancement, order 1). When applied with a probability vector p(0) (i.e., p(0) ! 0 and kp(0)k 1 = 1) and a matrix Q whose entries sum to zero column-wise (as in our case), DA1 yields an approximation e pðhÞ % pðhÞ which is again a probability vector [37]. We shall also consider expansions in (7) up to the third (m = 2) and fourth-order (m = 3) terms, yielding the DA2 and DA3 methods, respectively. DA2 and DA3 involve higher powers of hQ with the added benefit of better accuracy due to a lower truncation error of the expansion. However, these higher-order methods do not necessarily preserve probability vectors as elements of the computed e pðhÞ can become negative. Nevertheless, the property ke pðhÞk 1 ¼ 1 is preserved. We will argue below that the loss of the non-negativity of the computed vectors is not critical, as a simple projection approach remedies this problem.
Note that s steps of the DAm method with step size h yield e pðshÞ ¼ rðQÞp 0 , where r is the (scalar) rational function The classical Runge-Kutta method (RK4). One step of the classical (explicit) Runge-Kutta method is obtained from a truncated Taylor expansion of the exponential in p(h) = exp(hQ)p 0 , The computation of e pðhÞ can be implemented such that only 4 matrix-vector products with Q are required. After s steps of this method with step size h we obtain e pðshÞ ¼ rðQÞp 0 , where r is the polynomial The method by Higham and Al-Mohy (EXPMV2010). A natural extension of the RK4 method to Taylor approximants with m + 1 terms is the method presented in [21]. The polynomial underlying this method is The parameters m (the Taylor degree), h (step size), and s = t/h (number of steps) are chosen automatically based on spectral properties of Q such that r(Q)p 0 % exp(tQ)p 0 to desired accuracy, in this case either half, single, or double precision in IEEE arithmetic. The evaluation of r(Q)p 0 requires ms matrix-vector products with Q.
The scaling-and-squaring method (EXPM). The scaling-and-squaring method is perhaps the most widely used algorithm for computing the full exponential exp(tQ) (rather than just a matrix-vector product with it); see, e.g., [20]. This method is implemented in MATLAB's expm() function. The rational function underlying the approximation r(Q) % exp(tQ) is where R m is the [m, m] Padé approximant to the exponential function, s is the squaring parameter, and h = t/2 s is the scaling factor. Note that r(Q) can be obtained by repeatedly squaring R m (hQ) for s times. Further computational savings are possible by exploiting the structure of the Padé approximant R m . The parameters are chosen such that khQk is of order 1, numerical cost is minimized, and the overall approximation error is small. For an in-depth error analysis and suggestions for improved parameter choices, we refer to [40,41].
The subdiagonal scaling-and-squaring method (SEXPM). The SEXPM method proposed in [22] is very similar in spirit to the scaling-and-squaring method, but for matrices with spectral region in the left half of the complex plane it uses a potentially smaller squaring parameter s. This is possible by employing a subdiagonal [m − 1, m] Padé approximant R m to the exponential function. Note that the eigenvalues of the matrix Q in our application are all 0 and hence SEXPM is applicable in this context.
Chebyshev approximation (Cheby). A method presented in [42] for the approximation of p(t) = exp(tQ)p 0 with a symmetric matrix Q is to approximate the exponential by a series with shifted-and-scaled Chebyshev polynomials C j , each of degree j. The polynomials C j should be shifted to the spectral interval of the matrix Q and normalized for numerical stability. The three-term recurrence available for the C j translates directly into a three-term vector recurrence for evaluating the approximant e pðtÞ :¼ P m j¼0 c j C j ðQÞp 0 . If Q is nonsymmetric (as in our case), the approximation error in the scalar expansion (8) over Q's spectral interval cannot be directly related to the approximation error kpðtÞ À e pðtÞk (say, in the vector 2-norm). As a consequence, our use of this method is not backed up by theory and we include it only for curiosity. Krylov method (EXPOKIT). The approximation of exp(hQ)p 0 using a Krylov method consists of two steps (see, e.g., [43,44]). The first step computes an orthonormal basis V m 2 R nÂm , m ( n, of the Krylov space In the second step the Arnoldi approximation If the dimension m required for a desired accuracy is small, the computation of V m and exp(tH m ) is fast and the Krylov approach can be very efficient. For larger m, the orthogonalization and storage costs for V m and the evaluation of exp(hH m ) may become prohibitive. In this case, h must be either reduced or restarting techniques [45,46] need to be employed. Here we follow the first approach by using the EXPOKIT method [47], which implements adaptive Krylov time stepping. For the evaluation of the small matrix exponentials exp(hH m ) MATLAB's expm is used (scaling and squaring). Rational Krylov method (RATKRYL). Rational Krylov spaces are a natural generalization of (standard) Krylov spaces from polynomials to rational functions. We will consider here a special case of a so-called shift-and-invert or restricted-denominator rational Krylov space defined as Q m ðQ; p 0 ; xÞ ¼ K m ððQ À xIÞ À 1 ; p 0 Þ, where ξ > 0 is a shift parameter and K m has been defined in (9). The use of such rational Krylov spaces has been proposed in [48,49]. The rational Krylov approximation of exp(hQ)p 0 proceeds similarly to the polynomial case. First, an orthonormal basis V m 2 R nÂm of Q m ðQ; p 0 ; xÞ is computed and second, the rational Arnoldi approximation The construction of the basis vectors in V m requires the solution of m − 1 linear systems with the matrix Q − ξI, but the number m of required vectors is potentially much smaller than in the polynomial Krylov case. The choice of the parameter ξ is nontrivial, especially for nonsymmetric matrices Q as in our case; see, e.g., [50] for a review of this and related parameter selection problems. However, we found that for all numerical tests reported here, the choice ξ = 1 worked well.

Projection onto probability vector
It is easy to ensure that all discussed methods return probability vectors by adding a procedure at each time step that zeros-out all negative numbers and renormalizes the result to have unit 1-norm. If the computed vector is sufficiently accurate, such a normalization procedure does not affect the error significantly.
More precisely, let p = exp(tQ)p 0 be the exact probability vector such that p ! 0 component-wise and kpk 1 = 1. Further, let e p % p be a numerical approximation such that kp À e pk 1 ε ⪡ 1. We define P to be the operator that zeros out negative entries of a vector, i.e., where the subscript i refers to the ith component of a vector.
Then, using basic vector norm inequalities and the fact that jp i À ðPe pÞ i j jp i À e p i j, we have j1 À kPe pk 1 j ¼ jkpk 1 À kPe pk 1 j kp À Pe pk 1 kp À e pk 1 ε; and hence kPe pk 1 ¼ 1 þ d with |δ| ε. Now, for the the normalized vector b p ¼ Pe p=kPe pk 1 we have Hence, the normalization procedure guarantees a probability vector b p and only increases the approximation error ke À b pk 1 by a factor %2 compared to the error of the non-normalized approximation e p.

Model fitting and data
In general, we work in a Bayesian framework, which has the benefit of dealing with the statistical challenges of the small datasets we consider in a systematic manner. We do this by calculating the posterior distribution, f, over parameters θ, given data D, using Bayes' theorem: where L is the likelihood of the data given the parameters and π is the prior distribution over parameters. If the integral in the denominator of the right-hand side of (10) above is tractable, then we can simply evaluate f directly, but if it is not then we can use Markov chain Monte Carlo (MCMC) methods to produce samples from the posterior distribution [51,52]. We fitted the stochastic SEI model above to scabies infection data from a study that enrolled carehomes in the UK [28]. In the study, the authors investigated a series of suspected scabies outbreaks in residential care homes, exploring barriers to early recognition and optimal management. Seven care homes agreed to participate and questionnaires were administered requiring details about dates of onset, diagnosis and treatment, clinical features, underlying illness, pre-existing skin conditions and mobility. An outbreak was determined if a report of two or more clinically suspected cases of scabies in a residential care home were reported to the Surrey and Sussex Health Protection Teams of the Public Health England (PHE) by a GP or a carehome manager. Case definitions included suspected cases because a definite diagnosis of scabies by dermatoscopy or microscopy is rare within the carehome setting and not all symptomatic cases had been seen by a doctor. The data we used to fit to the model are tabulated in Table 1. These data involves scabies outbreaks in seven different care homes, for which the resident population (N), days from onset to diagnosis (T) which is also the point at which treatment is initiated, and the number of scabies cases treated (C) are recorded. Days from onset of symptoms was defined as the first reported day of itching or rash. Frequently an exact date was not available and participants stated that symptoms began e.g. 'over a year ago'. The number of cases included suspected cases and hence would include individuals who have been exposed to the mite but not yet infectious, E, and the infectious individuals, I. As a result, the predicted number of cases from the model, comprised of E and I, is then compared to the number of cases reported, C.
Formally, we write the data D, as the number of scabies cases {c i } that are observed at time {t i } in a care home with population {n i } where i 2 {1, . . ., 7}, and assume the stochastic SEI model as our generative model for the data. Our likelihood, assuming that carehomes are independent, therefore takes the form where P½Eðt i Þ þ Iðt i Þ ¼ P s;e;i ðt i ; a; b; gÞ which is obtained by solving Eq 6. Our MCMC procedure was Random-Walk Metropolis Hastings with hand-tuned Gaussian proposals, which was used to obtain samples from the posterior distribution of the model parameters. We ran 16 MCMC chains in parallel each of length 2.5 × 10 4 , burn-in time for each chain was 10 4 and samples were thinned by a factor of 20. Mixing was assessed using trace plots and the total number of samples visualised is 1.2 × 10 4 .

Costs of delayed intervention and goodness-of-fit
We consider here two costs associated with a scabies outbreak with the first case at time 0 and with an intervention starting at time t.
Economic cost. If we assume that each individual that will need to be treated carries a fixed treatment cost, then the total treatment cost C will be given as the sum of the exposed and infective populations at the intervention time t, We call this the economic cost. QALY cost. One of the important quantities required to justify interventions against an infection is the Quality Adjusted Life Years (QALY) lost, which is a human measure of the disease burden. If we assume that each person-day of symptomatic infection carries an equal QALY loss then represents the total person-day of symptomatic infection, and hence will be proportional to the loss of quality-adjusted life years. We call this the QALY cost [53]. Predicting costs and assessing goodness-of-fit. To sample from the posterior predictive distribution of these two costs (which are represented as random variables) we ran a year-long stochastic simulation [15] of the model given in Eq (1) for the observed population for each parameter posterior sample. Since the economic cost was measured for each residential care home, we are also able to assess model goodness-of-fit by assessing if these data points lie within the posterior predictive intervals generated. Treatment with permethrin and ivermectin. A variety of agents can be used to treat Scabies [54] but the most commonly prescribed drugs in Europe are topical permethrin and oral ivermectin. Treatment is modelled by assuming that the reduction in infected cases occurs once scabies cases are identified in a care home. We take the efficacy of a single course of permethrin (two applications a week apart) or two doses of oral ivermectin two weeks apart as 96.9%, and 62.4% [55] respectively. Barkwell and Shields [1] reported 172 deaths in a population of size 210 over a 36 month period, and 15 subsequent deaths over 6 months in a sub-population of 47 who had received ivermectin treatment, as well as 10 deaths in the remaining population of 163 over that 6 month period. They reported deaths for each month in the two sub-populations over the six months following ivermectin treatment. Barkwell and Shields performed two statistical tests on these data: chi-squared and Fisher's exact. Of these, Fisher's exact test is more accurate for small populations and answers the following question: if two groups, one of size 163 and one of size 47, are formed by picking individuals from the total population of 210 (with 25 deaths) uniformly at random, then what is the probability p of the pattern of deaths observed, or one with more deaths in the population of size 47. This test gives p < 0.0001 when applied to the data.

Safety of ivermectin
This work received criticism from a more standard biostatistical and epidemiological perspective-in particular due to the absence of control for illnesses other than scabies-shortly after its publication [56,57]. Here we do not comment on these issues, but rather focus on the extent to which more general heterogeneity between individuals can invalidate methods such as Fisher's exact test.
Mathematically, we model heterogeneity by assuming that the mortality rates in the population are variable, and that in particular the probability of k deaths in a unit of size n over a time period t are given by a Poisson mixture Lðkjm; y; n; tÞ ¼ Here μ is the mean death rate in the population and θ is the variance divided by the mean, which we call the overdispersion. When θ ! 0, we recover the situation where death rates are homogeneous, and larger values of θ imply more heterogeneity. We will carry out two analyses of the Barkwell and Shields data. Mortality patterns at a given heterogeneity. In the first analysis, we will fix θ and use an improper prior on μ, then since the integral in question is feasible to compute numerically, we can work out the marginal posterior f ðmjk; y; n; tÞ ¼ Lðkjm; y; n; tÞ R Lðkjm 0 ; y; n; tÞ dm 0 ð15Þ using numerical quadrature. The posterior probability of observing κ cases in a population of size n 0 over a time period t 0 is then P½k ¼ Z Lðkjm; y; n 0 ; t 0 Þf ðmjk; y; n; tÞ dm: We calculate and report this probability for values of θ ranging from 0 to 0.2.

Estimation of heterogeneity.
In the second analysis, we take all of the different time periods, population sizes and numbers of deaths (which we index with i) reported by Barkwell and Shields [1] and presented in Table 2 to assess which might be plausible values of θ. Here we will consider the total likelihood L tot ðfk i gjm; y; fn i g; ft i gÞ ¼ Since this is a two-dimensional problem we can visualise by gridding the parameter space to obtain values proportional to the posterior density. We can also carry out MCMC on the joint posterior on that basis can perform our second analysis quantifying joint uncertainty in μ and θ.

Implementation of the exponential integrators
We have implemented the exponential integrators discussed above in MATLAB [58]   Scabies in residential care homes: Modelling inference and interventions exponential. We performed the benchmarking by assuming a population sub-unit with a small (N = 10), medium (N = 30) and large (N = 99) number of individuals with corresponding Q matrices of size 66, 496 and 5,050 respectively for the SEI model. For the more complex multistrain model, the matrix Q is of size 120 and 11,440 corresponding to a household with 3 and 8 individuals respectively. We run each algorithm with 100 replicates and report the mean for the computational time against the error, i.e., the relative ℓ 1 -norm.

Comparison of the exponential integrators
The results for numerical performance for the SEI model with the small Q matrix of size 66 × 66 are presented in Fig 1 with the x-axis showing the log time in seconds and the y-axis the error as measured by log of relative ℓ 1 -norm. From the figure, we can observe that the accuracy of DA1, DA2 and DA3 increases with an increase in the step size in what seems to be linear in log scale. The higher the degree to which the polynomial is expanded, the greater the accuracy. However, for comparable accuracy, DA1 takes almost 2 orders of magnitude more time than DA3. This is because DA1 needs to be evaluated over a much smaller step size in order to achieve the same accuracy as DA3 which consequently increases the computational time.
The RK4 approximation appears to be relatively less accurate than all of the other methods for large step sizes but the accuracy increases with a reduction in the step size. The computational time does not appear to be greatly influenced by varying the tolerance for Chebyshev expansion although it does take more time for a lesser accuracy in the solution vector compared to DA1,2,3, RATKRYL and RK4. RATKRYL has the advantage of a steep increase in the accuracy of the solution for very minimal increase in computational time. ODE45 is the slowest of all the methods followed by EXPM2010 and EXPOKIT. However, for a small matrix, EXPOKIT, results in the most accurate solution.
These results are however dependent on the size of the system. For a medium sized matrix of size 496, all the algorithms take more computational time with the accuracy of CHEBY and EXPOKIT decreasing significantly, see Fig 1. For a large matrix, RATKRYL takes the shortest time albeit at the expense of reduced accuracy. However, the error can be further decreased to the desired level by reducing the relative tolerance. Fig B and C in S1 Text. Complex transmission dynamics model show comparable results when the integrators are applied to a more complex multi-strain model. The outcome is consistent with that of the simple SEI model considered here and the efficiency of the methods depends on the system size in both cases.
Since the size of the care homes in our data range between 18 and 99, leading to matrix sizes of between 190 and 4,371, we opt to use the RATKRYL method in the following model fitting computations. This method seems to strike a good trade-off between the accuracy and the computational time for both small and large system sizes. Since it does not guarantee a probability vector, we implement the projection mapping discussed above, guaranteeing probability vectors at an almost negligible computational cost. The error tolerance used for the model fitting is taken to be 10 −3 .

Fitting scabies transmission parameters
Using RATKRYL method, we solve the SEI scabies model and fit it to data collected from 7 carehomes in the UK [28]. We used MATLAB's kdensity, to produce kernel posterior predictive densities, which gives a smooth probability density function from a finite sample of the random variables β, α and γ. The contour plots in Fig 2 N = 99 (bottom). The x-axis shows the computational time while the y-axis shows the accuracy, measured as ℓ 1 -norm between each of the algorithm's results and the reference result (note that both are on a logarithmic scale). first column of Figs 3 and 4, show the mean of the posterior samples representing how well the model fits the data (blue circles) from the seven care homes (row A to G).

Costs of delayed intervention
The black solid lines (mean) and the grey dashed lines (95% CI), in the first column of Figs 3 and 4, represent the model predictions of the number of scabies cases in the presence of treatment with permethrin and ivermectin respectively. Treatment is implemented immediately when one or more cases have been detected in a care home. In all care homes, we can observe that treatment with permethrin leads to a total eradication of scabies with the exception of care home D in which case a late observation of scabies cases occured. On the other hand, treatment with a single dose of oral ivermectin, first column Fig 4, does not lead to eradication except in carehome G but leads to a reduction in the number of cases that later rebound and saturate at long time.
In the second column sub-plots of Figs 3 and 4 treatment with permethrin and ivermectin is seen to lead to a reduction in the QALY cost, computed as the cumulative person-days of symptomatic infection during the epidemic, with permethrin leading to a greater reduction with the exception of care home D. Fig 5A shows the full posterior we obtain for μ and θ. Given the level of uncertainty for such a small dataset, we perform an analysis for given fixed overdispersal θ (Fig 5B, 5D and 5F) which indicates that as θ is increased, the uncertainty in μ also increases and that the probabilities of a given number of deaths in 6 months in a population of size 47 as defined in (16) also become more spread out, which we assume is the salient fact to explain in the Barkwell and Shields data. In particular, the probability that κ ! 15 is 0.0033 for overdispersal 0, which is consistent with the very low p-values reported by Barkwell and Shields. As we increase the overdispersal, we obtain probabilities that κ ! 15 of 0.20 for overdispersal θ = 0.01, 0.31 for overdispersal θ = 0.02, 0.52 for overdispersal θ = 0.1 and 0.58 for overdispersal θ = 1. Therefore, the mortality pattern reported by Barkwell and Shields is not a particularly low probability event. These conclusions are confirmed by our analysis over the full posterior including overdispersal (Fig 5C, 5E and 5G).

Methodological approach
Expansion and projection based methods to compute the exponential of a matrix were used to compare how well they perform when applied to a Markovian SEI household model describing the transmission of scabies. The computational accuracy and efficiency were tested  by performing various numerical tests by changing the step size h and the system size. In order to test the accuracy, the reference solution was obtained by EXPM and computing the relative ℓ 1 -norm. The methods indicate that for large system sizes, the RATKRYL method was superior both in terms of computational time and accuracy. The DA methods would be appropriate when a fast solution is needed without the need for strict accuracy. Otherwise, for strict accuracy, obtained by reducing the time step, there seems to be a linear increase (in log scale) in time. DA1 has the inherent benefit of ensuring the solution is a probability vector negating the need for correction which might come with some computational time saving and a reduction on the error.
In instances where the solution is required at multiple time points, the expansion based methods can deal with this by applying them repeatedly using the last obtained approximation; the accuracy of which is dependent on the step size. However, the projection based methods have the advantage that the projection space is chosen independent of t and hence no time stepping is required. To run the Scabies model, we chose RATKRYL as it runs fast for different choices of tolerance over a wide range of matrix sizes. The efficiency of the method allows us to perform efficient Bayesian inference on limited scabies data, allowing us to incorporate prior information and quantify remaining uncertainty in a consistent manner. A proper choice for the best method will however depend upon the details of implementation and upon the particular problem being solved. We have also demonstrated, in the S1 Text. Complex transmission dynamics model, that the results obtained from the SEI model are fairly consistent with those from a more complex multi-strain disease model. The overarching observation is that the outcome is predominantly dependent on the system size which is a function of both the number of disease states in the model and the number of individuals in the household.
The appetite for computationally efficient algorithms for solving these type of models has been well documented. In a recent study exploring the use of Bayesian design of experiments in designing epidemiological studies that collect time resolved longitudinal data of infections in households, noted that the failure to adopt such methods stem from the absence of sufficiently efficient methods for computing the likelihood [10]. The ability to reduce problems in household epidemic theory to a set of linear differential equations [2,37,59], on which this work builds, can take advantage of the numerical advances in this work with the ability to achieve the modelling ideal of sufficiently accounting for uncertainty in both structural and parameter values and the ability to propagate it forward within a modelling framework.

Implications for scabies epidemiology and control
By fitting the scabies model to data, we estimate that the mean of the posterior distributions are β = 0.0053, α = 0.68, and 1/γ −1 = 5.4 days with the 95% marginal credible intervals presented in Table 3. The economic cost as well as the QALY cost increase with time in the absence of interventions, with the former growing linearly and the latter saturating at long time.
Our results are relevant to understanding and mitigating the spread of scabies in carehomes in two main ways. First, early detection of the index case is critical in establishing who and when to treat. This can be achieved either by frequently screening for scabies in care homes and a mandatory screening for new residents and staff. This would ensure that the infection is not spread to other carehome residents and staff who act as a frequent link between the carehome and the outside world. We note that there has been reported difficulties in diagnosing scabies, even for trained personnel, due to confounding skin conditions e.g eczema, atypical presentation and lack of a specific diagnostic criteria [28,35]. Comprehensive and timely treatment and identification of all cases, both symptomatic and asymptomatic, would help prevent the spread of the disease. From Fig 4 care home G, it is clear that the earlier the intervention is implemented the more QALYs gained. However a delay in administering ivermectin would lead to a rebound of cases albeit with a high degree of uncertainty. Treatment with permethrin leads to a total eradication of scabies but with the drawback that it requires application to both staff and residents and to be left on the body for 8 hours before being washed, which can be distressing and logistically challenging leading to sub-optimal adherence. However, a case can be made for a second and/or third dose with ivermectin in order to achieve better efficacy leading to eradication, which would be easy to do as it is administered orally. Secondly, the degree of uncertainty in the model predictions can be attributed to the lack of sufficient data to parameterize the model. However we are confident that the technique we have adopted, of drawing samples from the posterior distribution and then propagating forward the uncertainty in a simulation model, is the best framework to adopt for studies with limited data. This has also been made possible by the computational efficiency we have achieved in solving the system of ODEs. Although the model would be considered simple, it is possible to make modifications to it in order to accommodate other complex natural histories with minimal or no change to the numerical solving algorithms. This work can therefore be seen as a foundation on which more complex household based models can be built in order to help make public health decisions.
The human and economic costs of delayed interventions provide a motivation for the use of all available treatments-here, we have suggested that heterogeneity in death rates can make conclusions about mortality from ivermectin hard to draw on the basis of small population studies such as that of Barkwell and Shields; in particular, we strongly advise against overinterpretation of very small p-values that can arise as 'highly significant' [60] as a specific instance of known problems with the use of p-values in general [61].
One of the drawbacks to this is that there does not exist a good and standardized diagnostic test to determine if a person is infected. A study from Japan [62] noted that the time it took for a patient to be diagnosed with scabies was 141 days with a range of between 34 and 313 days despite hospital personnel checking for the condition on the patients skin. In another study [28] which involved a semi-structured survey of managers and affected residents of care homes indicated that most outbreaks were attributable to late diagnosis of the index case. There is therefore a need for improved support to hospital staff, care workers and authorities managing these outbreaks.
Any efforts that are directed towards early and correct diagnosis would be beneficial in curbing the spread of scabies. The development of a point-of-care diagnostic test would be a major contribution towards this objective. As there are long term health conditions associated with scabies [63], failed or wrong diagnosis has extra health economic implications and especially in the context of health institutions, outbreaks, and large epidemics or pandemics. Another important, but maybe slightly understated requirement is the need for a harmonised and carefully coordinated response to scabies outbreaks. There is no national guidance policy for the public health management of scabies in the UK and it is usually left to the individual Public Health England Health Protection Teams to make decisions [34]. This may lead to suboptimal test and treat strategies that retain some level of background transmission in the community which potentially make control or elimination difficult. The design of such a national level guideline would benefit from mathematical modelling and the model developed in this work is designed to be a platform on which we can build more sophisticated models to support national-level decision making.