Predicting epidemic evolution on contact networks from partial observations

The massive employment of computational models in network epidemiology calls for the development of improved inference methods for epidemic forecast. For simple compartment models, such as the Susceptible-Infected-Recovered model, Belief Propagation was proved to be a reliable and efficient method to identify the origin of an observed epidemics. Here we show that the same method can be applied to predict the future evolution of an epidemic outbreak from partial observations at the early stage of the dynamics. The results obtained using Belief Propagation are compared with Monte Carlo direct sampling in the case of SIR model on random (regular and power-law) graphs for different observation methods and on an example of real-world contact network. Belief Propagation gives in general a better prediction that direct sampling, although the quality of the prediction depends on the quantity under study (e.g. marginals of individual states, epidemic size, extinction-time distribution) and on the actual number of observed nodes that are infected before the observation time.


Introduction
Governments and health-care systems maintain costly surveillance programs to report and monitor over time new infection cases for a variety of diseases, from seasonal influenza to the most dreadful viruses such as Ebola. Although surveillance is at the core of modern epidemiology, the early detection of a disease does not automatically guarantee that the fate of the epidemics will be easy to predict, because of the intrinsic stochasticity of the transmission process and the incompleteness of the accessible information. The two issues are often intertwined because a mathematical description that provides a sufficiently accurate prediction at some spatial/temporal scale could become inadequate at another, due to the lack of sufficiently detailed information. For instance, individual-based stochastic compartment models, such as the Susceptible-Infected-Recovered model, are widely used to describe disease transmission in contact networks, but human interactions have only recently become the object of accurate data mining, and exclusively in small and controlled environments such as schools and hospitals (e.g. by means of the RFID technology) [1][2][3][4]. For large-scale epidemic forecast a detailed individual-based description is challenging, therefore researchers and practitioners have resort to coarse-grained metapopulation representations integrated with large-scale datasets on human mobility and real-time estimated parameters [5][6][7].
Beside the difficulty of obtaining accurate data on human interactions, also the observation of the epidemic progression is usually partial, in particular during the initial stages of an outbreak. For this reason, the ability to use all available information to produce a reliable forecast from early and partial observations is crucial to minimize the impact of a disease, at the same time saving financial resources. Using a simple SIR model, Holme [8,9] recently showed that even in the ideal situation in which all information about the structure of the interpersonal network is available, the intrinsic stochasticity of the epidemic process makes prediction of relevant quantities, such as the final outbreak size and the extinction time, very difficult. The quality of the prediction depends on the epidemic parameters (transmission and recovery rates) and on the structure of the underlying network. Predicting the evolution of the epidemics becomes then even more difficult when only partial observation of the state of the individuals is provided.
In a recent series of works [10][11][12], the problem of inferring the origin of an epidemics in individual-based models from partial observations was investigated. Among the different methods proposed, the Belief Propagation (BP) method [10] is not only very reliable and efficient in identifying the origin of an observed epidemics, it also makes possible to easily reconstruct the probability marginals of the individual states at any time, exploiting the causality relations that are generated during the epidemic propagation. Hence, the method can be used to complete the missing information at the time of observation and applied to the problem of epidemic forecasting. In the present paper, we will use BP to predict the evolution of a SIR model on a given network from a partial observation of the states of the nodes in the early stage of the dynamics.
The paper is organized as follows. In Section Inference Models we define direct sampling and Bayesian methods and we introduce the metrics used for validation of the results. Section Results contains a comparison between the prediction obtained using Belief Propagation and Monte Carlo sampling for simulated SIR epidemics on random (regular and power-law) graphs, as well as on a real network of sexual contacts. For these networks, the effectiveness of the methods to predict local (e.g. marginals of individual states) and global (e.g. epidemic size, extinction-time distribution) properties are discussed. Section Methods reports the description of the main techniques employed in this work, in particular the static factor-graph representation of the epidemic process and the Belief Propagation equations used to evaluate the relevant posterior probabilities of the epidemic process given the observations.

Inference models Prediction from partial observations in the SIR model
We consider a discrete time susceptible-infected-recovered (SIR) epidemic model on a graph G = (V, E) that represents a contact network of N = |V| individuals. At each time step of the dynamics a node i 2 V can be in one of three possible states: susceptible (S), infected (I) and recovered (R). The state of a node i at time t is represented by a variable x t i 2 fS; I; Rg. The stochastic process is defined by a set of parameters {λ ij , λ ji } (i, j)2E and {μ i } i2V , such that at each time step an infected node i can infect every susceptible node j in his neighborhood @i with a probability λ ij , then recover with probability μ i (see Sec. Methods for further details on the dynamics). For a given assignment of the infection parameters and a given initial condition x 0 ¼ fx 0 i g i2V , a huge number of different realizations of the stochastic process exists, although some of these outcomes are more likely to occur than others. Epidemic forecasting consists in providing predictions about how much likely some outcomes are in the form of probability distributions, in particular the probability marginals for the states of individual nodes. In realistic situations, the epidemic forecast is performed at some time after the initial infection event, when a number of infected cases is discovered in the population. The information available is thus usually localized in time and involves only a fraction of the overall population: we assume that at time T obs the state x T obs i is made available for a set of nodes i 2 V obs & V and no information about the state is supplied for the nodes not in V obs . In order to focus only on the effects of partial observation, we assume that the structure of the contact network is completely known, and it does not change over time. We remark that, in the case we knew how the network changes over time, we could easily generalize the prediction methods to time-varying networks; unfortunately, the prediction of future contacts in time-varying networks is usually by itself a non-trivial inference problem [13,14].

Direct sampling
Since the SIR stochastic process is Markovian, when V obs = V (complete observation) the probability of the future states x t for t > T obs can be estimated performing a direct sampling, that is generating a large number M s of virtual realizations of the Markov chain from the same initial conditions (a complete observation at T obs ) and directly estimating the probability of an event from its relative frequency of occurrence in the experiment. In particular, if we call x t i;' the value of the variable i at time t in the ℓ-th realization of the stochastic process from the same initial conditions, the individual probability marginal Pðx t i ¼ Xjx T obs Þ can be estimated from the experimental averageP that converges as the empirical averages of a Bernoulli variable, i.e it rapidly converges to the correct value with a standard deviation that decreases with the number of trials as / 1= ffiffiffiffiffi ffi M s p . When x T obs is only partially known (V obs & V), the uncertainty about the future evolution of an epidemic state is much larger; for instance, Fig 1 shows five very different evolutions of the epidemic process after the same partial observation. We call y obs the partial configuration that we observe at T obs , such that x T obs i ¼ y obs i ; 8i 2 V obs . In order to apply the direct sampling method to the case of partial information we first need a way to complete the missing information at T obs . In this work we consider two simple ways to choose the states of unobserved nodes at T obs : • random sampling: given the incomplete observation of the system, the states of unobserved nodes at time T obs are drawn randomly, independently and uniformly with the same probability 1/3, then direct sampling is performed with such an initial condition.
• density sampling: given the incomplete observation of the system, the fraction of observed nodes in each state X 2 {S, I, R} at time T obs is used as an empirical probability to assign, independently and uniformly at random, the state of the unobserved nodes. Direct sampling is then employed to predict future states. The method can be generalized to include dependence on node attributes, such as the degree, by assigning to the unobserved nodes a state with a probability computed from the knowledge of the states of observed nodes with the same attributes.
We remark that unlike the case of complete information, the estimators obtained from these methods through direct sampling have non-zero bias.

A bayesian approach
The posterior probability of a configuration x t at time t given an observation y obs at time T obs can be written as where in the last expression we neglected the a priori probability Pðy T obs Þ of the observed state that only acts as a normalization constant in our analysis, while P(x 0 ) is the prior on the initial conditions. In this Bayesian approach, the prediction of the epidemic evolution after T obs requires to compute the joint probability distribution P(x t , y obs |x 0 ) of the states at the observation time and at some later time given the initial condition. In principle, this quantity could be evaluated experimentally, by taking into account all possible realizations compatible with the constraints imposed by the dynamics and the observation and discarding the others. However, the number of possible epidemic trajectories of length t grows as 3 tN , making this brute-force approach computationally unfeasible even for small systems and very early observations. An approximate sampling method, that we call Similarity Sampling, is inspired by the Soft-Margin algorithm recently put forward in [15] to infer epidemic origins. The Similarity Sampling method consists in evaluating P(x t , y obs |x 0 ) by computing an empirical histogram over a large number of realizations of the epidemic process, each of them contributing with a probability weight that reflects the similarity to the actually observed states at T obs . Every node in the  set of infected and removed nodes at the time of observation T obs is used as single seed for a given number of realizations. We include unobserved nodes with at least one not susceptible neighbour. We assume to know the initial time of the epidemics within a ΔT 0 of time steps. Therefore we consider realizations with origin in a range [−ΔT 0 , ΔT 0 ]. The similarity between a generic realizationx and the real one x is measured by computing the Jaccard similarity function where S IþR ðx T obs Þ is the set of infected and recovered individuals that are observed in the configurationx T obs . The weight function considered is a gaussian / e À ð0ðx T obs ;y obs ÞÀ 1Þ 2 =a 2 , where a is a free parameter. Then the individual marginal probability computed by Similarity Sampling reads:P where X 2 {S, I, R} and the factor of normalisation is Z ¼ ;y obs ÞÀ 1Þ 2 =a 2 . Note that the same result can be achieved by normalizingPðx t i ¼ Xjy obs Þ after having computed the sum. According to [15], for a fixed value of a, we consider a number M s of realizations such that max jP M s ðx i Þ À PM s 2 ðx i Þj < 0:1, i.e. the maximum of the differences between individual marginals after M s and M s /2 realizations is smaller than 0.1. For the problem of source inference, Antulov-Fantulin et al [15] applied the convergence criterion to the marginal probability to be infected at time t = 0. Instead, in the case of predicting the evolution of the disease spreading, we check the convergence of the marginal probability of being in any state at any time step. Given the larger number of conditions, reaching the convergence is computationally more demanding. In all results of the present paper we initially set a = 0.125. If the convergence criterion is not met for M s 8 × 10 5 we use a = 0.5. The latter value guarantees the convergence for any instance. However, it is possible to consider the marginal probabilities computed using a = 0.125 even if the convergence criterion has not been met. In this case, results provided by Similarity Sampling may slightly vary. Although the method could provide a much more accurate estimate of the individual probability marginals than random and density sampling methods, such an accuracy is usually obtained through fine-tuning of the parameters and requires a very high computational power beyond the aim of this work. Following the recent work by some of the authors [10,11] we develop here a different approach that consists in addressing the joint probability distribution P(x t , y obs |x 0 ) as a probabilistic graphical model defined on a static representation of the dynamical trajectories. When the underlying contact network is a tree, the factor graph on which the graphical model is defined can be also reduced to a tree, and the joint probability distribution can be computed exactly by solving a set of local fixed-point equations known as Belief Propagation (BP) equations. On more general graphs, the BP equations can be considered as a heuristic algorithm that, under some decorrelation assumptions for the variables, provides a good approximation of the real probability distribution [16]. The BP equations for the quantity P(x 0 |y obs ), representing the posterior probability of the initial configuration given an observation at a later time, were derived in Refs. [10,11]. The BP equations for the more complex graphical model in Eq (2) are discussed in detail in Sec. Methods. We stress that, with the BP approach, the prediction of the future evolution of the epidemics passes through the inference of the (unobserved) dynamical states prior to the time of observation and the reconstruction of the causal relations developed in the dynamics.

Validation metrics
We used the different inference models under study to compute, for every node i, the marginal probabilities Pðx t i ¼ Xjy obs Þ with X 2 {S, I, R}. These quantities are then used in the binary classification problem of discriminating whether a node has been infected at a time t 0 t or not, that turns out to be a relevant measure to quantify the performances of the different prediction methods. In order to do that, we rank the nodes in decreasing order of magnitude of the probabilities Pðx t i ¼ Ijy obs Þ þ Pðx t i ¼ Rjy obs Þ and build a Receiver Operating Characteristic (ROC) curve [10,11]. Starting from the origin of the axes, the ROC curve is obtained from the ordered list of nodes by moving upward by one unit whenever a node is correctly classified as already infected at time t (true positive) or rightward in case it is not (false positive). The area under the ROC curve (AUC) expresses the probability that a randomly chosen node that was infected before time t is actually ranked higher in terms of the corresponding probability marginal than a randomly chosen susceptible one. When the ranking is equal to the real one, the area under the ROC is 1, whereas a completely random ordering gives an area equal to 0.5.
The area under the ROC curve gives indication of the fraction of the correctly classified nodes, but it does not depends much on the actual values of the marginal probabilities. The latter ones have instead a direct effect on a global quantity of crucial important, the size of the epidemic outbreak, i.e. the number of nodes reached by the infection. The average epidemic size at time t can be expressed as function of the local marginals as [17] The extinction time distribution is another relevant global quantity, that cannot be directly computed from the knowledge of the individual probability marginals, and whose characterization on given network structures is a major issue in epidemic studies [18]. In particular, we are interested in the posterior probability distribution PðT ext jx T obs Þ that the discrete-time epidemic process dies out at time T ext when it is conditioned on the (possibly partial) observation of the state x T obs at time T obs . A crucial point of the BP algorithm is that it is very convenient for computing local quantities, such as marginal probability laws for the single variables or pair-correlations. Some global quantities, such as the average epidemic size, can be directly computed from the knowledge of the local probability marginals. Interestingly, the quantity PðT ext jx T obs Þ can be expressed as the difference between two terms involving the free energies of the related graphical models when the epidemics are constrained to vanish before time T ext and T ext − 1, respectively. In Sec. Methods we show that such free energy can be efficiently computed, in the Bethe approximation, as the sum of local terms by means of the BP equations. Due to the intrinsic stochasticity of the SIR model, node classification is not perfect (AUC values smaller than 1) even in case of direct sampling with complete observation. In fact, the corresponding AUC values rapidly decrease after the observation time and recover only at late times since the epidemics dies out and almost all nodes are either in R or S states. If we interpret the average values of the area under the ROC as a proxy for the epidemic predictability, in the case of a complete observation (best-case scenario) the behavior observed is compatible with the effects due to epidemic heterogeneity reported in [5]. The most interesting region corresponds to intermediate times, when the predictability of the process is the lowest. Similarity Sampling perform largely better in the first stage after the observation, corresponding to the exponential outbreak phase [19]. In particular Similarity Sampling gives an AUC value similar to BP at the time of observation, but a lower AUC value in the subsequent time steps.

Results for individual node classification and epidemic size
Fig 2b shows that density sampling strongly overestimates the average epidemic size with respect to results from complete observation; this is probably an effect of the homogeneous deployment over the graph of infected nodes used to complete the information, that favors a larger epidemic spreading. Disregarding existing correlations between the 90% of the nodes, this scheme could lead to the overestimation of the probability of being infected-in a similar way to mean field approximations. In Similarity Sampling the overestimation of the epidemic size is due both to this procedure to set the seeds of the M s virtual epidemic realizations and to the approximation on the initial time. Belief Propagation also slightly overestimates the epidemic size, but we think this is essentially due to the fact that in most of the instances the algorithm does not properly converge to the correct marginals.
The heatplots in Fig 3 display the same set of data classified as function of the number of observed nodes that were infected before the observation time, respectively for density sampling, Similarity Sampling and Belief Propagation. Results for direct sampling with complete observation are presented as a reference. In the case of the average AUC (Fig 3), BP performs better than both density sampling and Similarity Sampling in all regimes, in particular the performance is very good in the first steps after the observation, almost independently of the actual number of infected and recovered nodes in the observation. For all methods the results slightly improve when a larger number of nodes reached by the epidemics is observed at T obs . For the average epidemic size, Fig 3b shows that the early-stage prediction by density sampling is negatively affected by the observation of a larger number of infected and recovered nodes. The opposite occurs, though to a lesser extent, for BP: when few infected nodes are observed BP overestimates the epidemic size, the worst prediction by BP giving an average size about 20% larger than that obtained from complete observation. The deviation observed by Similarity Sampling is also more evident when a lower number of infected and recovered nodes is observed, but the overestimation is more homogeneously distributed. Interestingly, the poor performance at large times is localized on realizations in which only a few of the observed nodes already got infected at T obs .
In Fig 4 we show the results for the classification of individual states and the average epidemic size on random regular networks when different values of the epidemic parameters are considered. We include the results obtained considering the marginal probabilities computed by Similarity Sampling with a = 0.125 even if the convergence criterion is not met. In general, it provides similar results to the case in which the convergence is required. For λ = 0.7, μ = 0.5 Predicting epidemic from partial observations it provides slightly better results shortly after T obs , but worse than those obtained by BP. We verified that the these results are compatible with the ones obtained considering only the epidemic realizations in which the convergence for a = 0.125 is actually reached (additional information on the convergence of Similarity Sampling is provided in Sec. Methods). We argue that this result provides an upper bound to the performances of Similarity Sampling, because these realizations are somehow "simpler" to predict than others for this method.
Barabási-Albert random graph. In the case of heterogeneous graphs, such as those obtained with the Barabási-Albert (BA) growing network model, in addiction to the random observation, it is interesting to define other observation schemes for the same density of observed nodes: • degree-based observation: nodes are observed in descending order of their degree; • local observation: a connected cluster of observed nodes is generated from a randomly chosen infected node.
We investigated the effect of different observation schemes on random sampling, density sampling, Similarity Sampling and BP.
The results for the average AUC, obtained with observation of 30% of the nodes at T obs = 3, are reported in Figs 5 and 6. In the case of complete observation, direct sampling produces monotonically decreasing AUC values for increasing times. The reason is that in finite size networks the parameters chosen give a non-zero probability of finding susceptible nodes in the last stage of the epidemic evolution, then wrong predictions are possible and the AUC remains considerably below one. For random observation, Fig 5a shows that Belief Propagation always gives larger AUC values than the other sampling methods, especially in the first stage of the epidemics, i.e. during the exponential outbreak. The same behavior is found plotting the results as function of the actual number of observed nodes (see heatplots in Fig 6a) that were already infected at the time of observation; in particular, the performances of BP are better when this number is small.
Figs 5b and 6b report results obtained with the observation of a 30%-fraction of nodes forming a connected subgraph. The overall results are very similar to those with random observation, even though density sampling and BP perform slightly better in the time steps immediately after the observation, while Similarity Sampling is slightly worse in the same regime. A degree-based observation is particularly convenient for heterogenous networks .  Fig 5c shows that the average values of the ROC area increase in the first stage of the epidemics for all prediction methods, in particular the difference between values obtained by BP and those from direct sampling with complete observation is less than 2%. The results reported in the heatplots (see Fig 6c) are qualitatively similar to those from the other observation schemes, with slightly better prediction performances at early times when the number of infected nodes in the observation is small. This is possibly due to the fact that these cases correspond to smaller epidemics whose initial evolution is more predictable.
Results on the prediction of the average epidemic size on Barabási-Albert networks are reported in Figs 7 and 8, except for random sampling that strongly overestimates the size in all regimes and observation schemes. With a random observation scheme (see Fig 7a), density sampling and BP provide very accurate prediction along the whole dynamics, while Similarity Sampling provides strong overestimate of the size value at early time and underestimate at late times. Fig 8a suggests that for both density sampling and BP, the accuracy is lower when a small number of infected and recovered nodes is observed. When the number of nodes reached by the infection at T obs is larger, BP performs better than density sampling (4.5% of the nodes larger than the direct sampling with complete observation). The very bad results of Similarity Sampling at late times are mostly due to a very strong underestimation of the average size when the observation contains only few infected/recovered nodes. On the contrary at early times overestimate appears when a large number of infected/recovered nodes is observed.
In Fig 7b we show the prediction of the average epidemic size when the partial observation is performed considering a connected subgraph of 30% of the nodes. In this case all methods Predicting epidemic from partial observations overestimate the epidemic size, with BP performing considerably better than the others. The poor performances of density sampling are expected because it completely neglects the topological information in the observation. For example if infected nodes are surrounded by susceptible ones, the probability of infection for unobserved nodes is lower, but this is not taken into account in the density sampling approach. BP performs instead poorly when there are very few infected nodes in the observed area. This is expected, because in such a situation this method is not able to correctly reconstruct missing information. Finally, Similarity Sampling gives good results for small and intermediate time steps but again it strongly deviates at large times, mostly because of observations with few infected/recovered nodes (see Fig 8b).
We already noticed that Belief Propagation performs very well in the case of a degree-based observation; this is true also for the epidemic size prediction, as shown in Fig 7c. The difference between the average epidemic size predicted by Belief Propagation and the one obtained by direct sampling with complete observation is less than 2% of the nodes in the network. Instead, density sampling overestimates the average epidemic size, especially in the first epidemic outbreak and for a large number of infected and recovered nodes in the observation (see Fig 8c). Density sampling does not make use of the connectivity knowledge, which is a valuable information: an observed highly connected node is more likely to be infected, ignoring this fact leads to assign the same infection probability of the hubs to every node in the

The heatplots represent the average epidemic size as function of time and of the number of observed nodes that were infected before T obs = 3, computed by density sampling, Similarity
Sampling, Belief Propagation, on a Barabási-Albert random graph of N = 1000 nodes and average degree hki % 4 with homogeneous parameters λ = 0.5, μ = 0.6. As a reference, we also plot results obtained, for the same realizations of the SIR process, by direct sampling with complete observation. The prediction is obtained after the observation at T obs of a 30%-fraction of (a) nodes chosen at random uniformly and independently, (b) nodes forming a connected subgraph, (c) the most connected nodes. The horizontal axis refers to the number of infected or recovered nodes present in the 30% observation (also in the case of complete observation). https://doi.org/10.1371/journal.pone.0176376.g008 Predicting epidemic from partial observations network, leading to larger predicted epidemic sizes. In this respect, one could expect that better results could be obtained simply by introducing a degree-dependence in the infection probability inferred from the observation; nevertheless, preliminary results show no significant improvement in the quality of the prediction.
In Fig 9 we show the results for the classification of individual states and the average epidemic size on BA networks when different values of the epidemic parameters are considered. We include the results obtained considering the marginal probabilities computed by Similarity Sampling with a = 0.125 even if the convergence criterion is not met. In this case it provides better results shortly after T obs , but worse predictions at large time. In any case it is still less accurate than BP. We verified that the these results are compatible with the ones obtained considering only the epidemic realizations in which the convergence for a = 0.125 is actually reached (additional information on the convergence of Similarity Sampling is provided in Sec. Methods). Following the same argument as in Fig 4, we argue that this result provides an upper bound to the performances of Similarity Sampling.

Results for the extinction time distribution
The extinction time distribution is a global feature of the epidemic process, that can strongly depend on the epidemic parameters, the initial conditions and the topological structure of the underlying contact network. Here we are interested in predicting the probability distribution for the extinction time when a (possibly partial) observation is provided. Even in the case of complete observation, the results are highly non-trivial, in particular on networks with peculiar topological structure. Fig 10 shows the extinction time distribution P ext (t) = P(t = T ext |y obs ) for regular trees (a) and regular random graphs (b). In the case of trees the probability distribution is highly variable: depending on the observation, the width and the maximum value of the distribution can change significantly. Fig 10c-10e show three different realizations at the time of observation T obs . In terms of the number of infected node and their average degree, the snapshots in panels (c) and (e) are similar, but their extinction time probability distributions are rather different (T peak = 16 and T peak = 23). On the contrary, despite the very different realizations at T obs , snapshots in panels (c) and (d) correspond to similar distributions (T peak = 23 and T peak = 21). This is due to the arrangement of infected and recovered nodes at the time of observation: the configuration in Fig 10e does not allow to access the root of the tree, so the epidemics is limited and diffusion to other branches of the graph is blocked. In Fig 10c and  10d, instead, the epidemics can spread throughout the graph, causing the distribution to reach a maximum at larger times. The heterogeneity of the extinction time distribution is peculiar of trees and graphs with topological bottlenecks, while random graphs, or graphs with small- Predicting epidemic from partial observations world properties in general, are characterized by very similar distributions for different realizations of the epidemic process (with same epidemic parameters and observation).
The results on the prediction of the extinction time distribution from partial observations is shown in Fig 11. Motivated by the observed strong variability of the extinction time distribution, we first considered the case of regular trees of branching ratio equal to 4 (average degree hki % 2). The partial observation was obtained sampling randomly the state of 10% of the nodes at T obs = 5. Fig 11a displays the average difference between the extinction time distribution predicted using direct sampling with complete observation and that obtained using Belief Propagation (red), density sampling (blue), and Similarity Sampling (magenta). All methods present two regions of higher discrepancy with respect to the prediction with complete observation. As shown by the heatplots in Fig 12b, this is usually due to an underestimation of the probability of extinction in the early stage of propagation and to an overestimation of the probability of extinction at large times. BP is usually able to qualitatively identify the most probable extinction time even when the other methods instead assign more probability mass to much larger times. Heatplots show that the two-peak discrepancy is especially due to observations with few infected and recovered nodes, while the discrepancies between the distributions move mostly at intermediate times when this number is increased. BP performs better than the other methods at every time step, although it presents the same qualitative weaknesses. Predicting epidemic from partial observations Interestingly, the Similarity Sampling method overestimates the probability for the epidemics to die out at early time step. In fact, a fraction of the epidemics with a large Similarity index to the observed incomplete snapshot immediately dies out after T obs , leading to an overestimation of the extinction probability at early time steps.
Figs 11b and 12b display the same analysis in the case of random regular graphs of degree k = 4 with partial observation of the 30% of the nodes at T obs = 4. Although all prediction methods under study are able to reproduce the existence of a unique peak, there are remarkable quantitative differences with the results from direct sampling with complete observation. The BP algorithm provides the best performances, in particular for observations with a large number of infected and recovered nodes. For a low number of infected and recovered nodes, instead, BP gives a larger average difference with respect to density sampling. This effect is mostly due to the non-convergence of the BP algorithm in some instances of the epidemic process, leading an overestimation of the probability of long extinction times. At the time steps close to the peak the Similarity Sampling give a larger average difference with respect to density sampling and BP. The Similarity Sampling gives the largest average difference. The main contribution to the average difference comes when low number of infected and recovered nodes are observed. In this case information provided by the observation is insufficient for Similarity Sampling. Predicting epidemic from partial observations A case study of real contact network We consider a real network dataset of the sexual encounters of internet-mediated prostitution [20,21], that was obtained analyzing a Brazilian web community exchanging information between male sex buyers. The original dataset is in the form of a bipartite temporal network, in which an edge between a "sex buyer" A and "sex seller" B is drawn if A posted a comment in a thread about B. The dataset covers the period September 2002 to October 2008 (2,232 days) and 50,185 contacts are recorded between 6,642 sex sellers and 10,106 sex buyers. In our analysis, we do not consider separate classes of vertices and we focus on a sample network comprising a time window between day 1000 and day 1100. The resulting network (SC) has N = 1293 nodes, E = 1571 edges, average degree hki % 2.4 and maximum degree k max = 55.
We study the predictability of the epidemic evolution on a static projection of the sexual contact network when the observation takes place at times T obs = 4,8 as representatives of early and later time observation. In both cases, density sampling and random sampling make unreliable predictions of the classification of individual states of the nodes (see Fig 13a-13c). For T obs = 4, BP gives good results only in the time steps immediately after the observation, then the performances rapidly deteriorate. BP results slightly improve increasing the observation time. Nevertheless BP is better than other methods. For the average epidemic size, Fig 13b shows that Similarity Sampling gives the best prediction at T obs = 4 (though underestimating the epidemic size), whereas BP performs as bad as density sampling (and random sampling even worse). BP results improve considerably for T obs = 8 while Similarity Sampling turns out to overestimate the epidemic size at early times (Fig 13d).
We remark that results are strongly influenced by the number of infected and recovered nodes in the observation. In this respect, in Fig 14, we repeat all measurements considering observations at T obs = 4 containing a number of infected and recovered nodes equal to N I+R ! 6 (corresponding to the 46% of all instances), and at T obs = 8 with N I+R ! 18 (75% of all instances). BP performances improve considerably at T obs = 4, outperforming all other methods in the case of T obs = 8. These results can be better understood if we consider that the network is characterized by a well connected core surrounded by many low degree nodes. When few infected nodes are observed, they typically are low-degree ones and the epidemic process spreads slowly at early time. In this situation Similarity Sampling is facilitated because the trajectories leading to the observed states are a small set. On the contrary, it is less accurate when Fig 14. Average area under the ROC curve (a,c) and average epidemic size (b,d) as function to the time t ! T obs for SIR dynamics (λ = 0.5, μ = 0.4) on the SC network. Results are obtained with random sampling (green), density sampling (blue), Similarity Sampling (magenta) and Belief Propagation (red) from a random observation of 30% of the nodes at T obs = 4 (a,b) and T obs = 8 (c,d). For T obs = 8, only instances with a number of observed infected and recovered nodes N I+R > 18 is considered (75% of instances). For T obs = 4, only instances with observed infected and recovered nodes N I+R > 6 is considered (46% of instances). In all plots direct sampling from a complete observation is shown for comparison (black). https://doi.org/10.1371/journal.pone.0176376.g014 Predicting epidemic from partial observations many infected nodes are observed or the observation occurs at later times. However results can likely be improved by an higher computational power. Belief Propagation does not choose the seed among the observed infected and recovered nodes only, it computes the probability of being seed for each node of the network, hence it is more accurate than Similarity Sampling when many infected and recovered nodes are unobserved. When the number of nodes reached by the epidemic spreading at the observation time is small, the effect of the existence of short loops in the network is more important and BP is more likely to overestimate the probability of a node to be infected [22]. It is worth noting that we provide to Similarity Sampling the information about the initial time t = 0 ± ΔT 0 of the epidemic spreading, on the contrary we don't provide such an information to BP.
We also consider a weighted static projections of the sexual contact network (WSC), in which every existing edge ij is assigned a weight w ij corresponding to the number of contacts between node i and node j during the period under consideration. Then we define the probability that node i infects node j as l ij ¼ 1 À ð1 À lÞ w ij . Fig 15 shows results for the average AUC and the average epidemic size. Belief Propagation provides higher values for the AUC than all the other methods at all times, even though AUC decreases with time much faster compared to direct sampling with complete observation. Immediately after the observation BP also provided the best prediction of the average epidemic size, while at late times Similarity Sampling works better.

Conclusions
In the present work, we extended the Bayesian Belief Propagation approach to the prediction of the future evolution of an epidemics, providing an efficient distributed algorithm to compute, at any time, the marginal probability of the states of individual nodes in the network. Some global quantities, such as the average epidemic size, can be directly computed as function of the individual marginal probabilities. Here we show that also quantities such as the extinction time distribution, that is intrinsically non local, can be reduced, in the BP approach, to a distributed calculation of local marginals on a locally tree-like factor graph. On random regular graphs and Barabási-Albert networks, the predictions obtained with the BP algorithm are compared with those from other heuristics based on Monte Carlo direct sampling from the same partial observations, while direct sampling with complete information is taken as a reference to assess the quality of the results. We also analyzed a real-world contact network obtained from a Brazilian database of sexual encounters. On random networks, BP provides better prediction than the other methods under study at all time steps. For all methods, the accuracy of the prediction is lower when the actual number of infected and recovered nodes in the observation is small. The errors introduced in the analysis of these configurations can result in a significant distortion of the overall results, in particular in the long time regime, as observed in the case of the average epidemic size measured by Similarity Sampling.
In general, BP is more accurate in the classification of individual marginals than in the estimate of the average epidemic size. A possible reason is that, in some cases, BP equations do not properly converge, resulting in a set of local probability marginals that are slightly different to the correct ones. In particular, convergence issues are mostly due to the presence of small loops in the network, that typically lead to an overestimate of the value of probability marginals by BP. As a consequence, the inaccuracies have little effect on the ranking of individual marginals, on which the ROC classification is based, whereas they are amplified when considering a global quantity such as the average epidemic size. Finally, BP usually approximate the extinction time probability distribution better than other methods.
In the real-world case study, results are affected by the presence of a much higher density of edges with respect to random graphs. BP gives better results for observations at late times. When the observation takes place early, the prediction of all methods is clearly worse, but BP gives the best prediction. The inaccurate prediction by BP is probably due to a combined effect of low level of information in the observation and the existence of many short loops in the network that limits the validity of BP. When considering only observations with a sufficiently large number of infected and recovered nodes, BP results improve considerably with respect to all other methods. An evidence of the role played by short loops comes from the better results generally obtained on the weighted network, because by construction the weighted network is effectively more sparse than the unweighted projection first used. In general, on networks that are not locally tree-like Belief Propagation may show convergence issues that become more significant as the density of the loops increases. In this cases, Belief Propagation likely would not be the recommended method. However, since locally tree networks represent a significant portion of real world and synthetic networks involved in epidemic spreading analysis, we believe that that this work still provides a significant improvement in this field.
In conclusion, BP and Similarity Sampling have advantages and drawbacks depending on the time and type of observation, but BP can be considered the most accurate method to predict both local and global quantities when the underlying network is sparse and when the observation contains a sufficiently large number of infected and recovered nodes. We remark that BP results have been obtained with no knowledge about the initial time of the epidemics, that is another important advantage compared with the other methods.

Methods
The SIR epidemic process A node i 2 V can be in one of the possible states: susceptible (S), infected (I) and recovered (R). At each time step an infected node i can infect each of his neighbor j with a given probability λ ij , then recover with probability μ i . The state of a node i at time t is represented by a variable x t i 2 fS; I; Rg. The process is irreversible, so once a node recovered it does not get infected anymore. The Markov chain is described by the following transition probabilities Pðx tþ1 A realization of the SIR stochastic process is univocally expressed in terms of infection times t i and recovery times g i , 8i 2 V. Given the initial configuration x 0 , for each node i 2 V, a recovery time g i is randomly drawn according to the distribution G i ðg i Þ ¼ m i ð1 À m i Þ g i and the infection transmission delays s ij from node i to node j are generated from the conditional distribution ( Infection times are then given by the deterministic equation Factor graph representation Every realization of the trajectory (x 0 , . . .,x t ) is in one-to-one correspondence with a static configuration of individual infection times t = {t i } i2V and recovery times g = {g i } i2V . Using this static representation of the epidemic dynamics we can express the posterior probability as where P(t,g|x 0 ) is the joint probability distribution of infection, Pðy obs jx T obs Þ ¼ Q i2V obs dðy i ; x T obs i Þ is the matrix that connects the complete configuration x T obs to the partially observed one y obs and recovery times conditioned on the initial configuration x 0 , and P(x t |t, g) and Pðx T obs jt; gÞ are deterministic functions of the set (t, g) representing the connection between a (t, g) configuration and configurations x t and x T obs . Although the sum on the right-hand side of Eq (11) still runs over a possibly huge number (exponentially large in N) of configurations, a representation in which the dynamical relationships between trajectories of neighboring variables is reduced to a set of local constraints on the activation/recovery times is more convenient to develop approximation methods using tools from graphical models and statistical mechanics. By means of Bayes' theorem we compute the posterior probability of a configuration x t at time t given an observation x T obs at time T obs : P x t jy obs ð Þ ¼ X t;g P x t jt; g ð ÞP t; gjy obs À Á / X t;g;x T obs P x t jt; g ð ÞP x T obs jt; g ð ÞP t; g ð ÞPðy obs jx T obs Þ ¼ where is the factorized prior on the initial condition, and P(x t |t, g) and Pðx T obs jt; gÞ are deterministic functions of the set (t, g) representing the probability of a configuration x t at time t (and, respectively, at time T obs ): The joint probability distribution of infection and recovery times conditioned on the initial configuration reads where The factor graph representation of a probability distribution is made up of a bipartite graph composed of factor nodes and variable nodes [16]. Each factorized term in Eq (12) is represented by a factor node and each variable of the problem is represented by a variable node. Each factor node is connected to the set of variable nodes involved in the corresponding factorized term. The factor graph of Eq (12) has a loopy structure which can compromise the accuracy of the BP approximation. We can use a factor graph representation that maintains the same topological properties of the original graph in order to guarantee that BP is exact when the underlying graph is a tree. Following [22,23], we do that by grouping pairs of variable nodes (t i , t j ) in the same variable node. For each edge (i, j) emerging from node i we introduce a triplet ðt ðjÞ i ; t ji ; g j i Þ, where t j i and g j i are copies (on j) of the infection time and recovery time of i and the variables t ij ¼ t ðjÞ i þ s ij on which factors ϕ i depend. Including a constraint that forces copies t i and g i to have a common value, we get in which the local contributions can be expressed as function of the Belief Propagation messages The extinction-time constraint The posterior probability P(T ext |y obs ) of the extinction time from a partial observation at T obs can be written as a difference of posterior probabilities that an epidemic ends within a given Predicting epidemic from partial observations time, P T ext jy obs À Á ¼ P t ext < T ext jy obs À Á À P t ext < T ext À 1jy obs À Á : ð28Þ Using the static representation of dynamical trajectories, Pðt ext < T ext jy obs Þ ¼ X t;g Pðt ext < T ext jt; gÞPðt; gjy obs Þ / X t;g;x T obs Pðt ext < T ext jt; gÞPðx T obs jt; gÞPðt; gÞPðy obs jx T obs Þ ¼ X t;g;x 0 ;x T obs Pðt ext < T ext jt; gÞ Pðx T obs jt; gÞPðt; gjx 0 ÞPðx 0 ÞPðy obs jx T obs Þ ¼ X t;g;x 0 Qðy obs ; T ext ; t; g; x 0 Þ ¼ ZðT ext ; y obs Þ: The terms in the latter expression are the same as in Eq (12), with the exception of the following term factorized over the nodes P t ext < T ext jt; g ð Þ ¼ Predicting epidemic from partial observations that constrains the calculation to epidemics that vanish before T ext . In this way, we give null probability to every single site configuration with t i + g i larger than T ext (except for t i = T inf that describes susceptible nodes). The logarithm of the partition function is the free energy of the model, hence À f T ext ; y obs À Á ¼ log Z T ext ; y obs À Á ¼ log P t ext < T ext jy obs À Á : ð31Þ In the factor graph representation, the free-energy can be approximated with the Bethe free-energy, that is computed by means of the BP equations.

Similarity Sampling convergence
Figs 16 and 17 display the effects of forcing the convergence criterion with a = 0.125 in the case of regular random graphs and Barabasi-Albert graphs, respectively. The solid curves represent results obtained considering only the epidemic realizations for which convergence is met, whereas the dashed curves correspond to results obtained including all realizations (as in the main text). Only in the case of BA graphs with λ = 0.5 the two curves are not within the error bars. Nevertheless, this difference does not change the qualitative behaviour and the comparison with the BP method (see Figs 4 and 9).

Author Contributions
Conceptualization: JB AB LDA.