Cost-efficient vaccination protocols for network epidemiology

We investigate methods to vaccinate contact networks—i.e. removing nodes in such a way that disease spreading is hindered as much as possible—with respect to their cost-efficiency. Any real implementation of such protocols would come with costs related both to the vaccination itself, and gathering of information about the network. Disregarding this, we argue, would lead to erroneous evaluation of vaccination protocols. We use the susceptible-infected-recovered model—the generic model for diseases making patients immune upon recovery—as our disease-spreading scenario, and analyze outbreaks on both empirical and model networks. For different relative costs, different protocols dominate. For high vaccination costs and low costs of gathering information, the so-called acquaintance vaccination is the most cost efficient. For other parameter values, protocols designed for query-efficient identification of the network’s largest degrees are most efficient.


I. INTRODUCTION
The problem of targeted vaccination has typically been formulated as follows. Given some knowledge of the contact network, identify the individuals that are potentially most important for disease spreading. To carry out a targeted vaccination campaign, one would first need to gather information about the contact network, then use this information to vaccinate (or otherwise reduce the impact of) the important individuals. There are thus three major costs involved in such an endeavor: the cost of the disease itself (that we can use as our base unit), the cost of gathering the information about the network $c_{\text{info}}$ (in units of the cost of a person getting the disease) and the cost of vaccinating $c_{\text{vacc}}$. We can thus evaluate the cost efficiency of a vaccination protocol by measuring the net saving $\chi$ per person in units of the cost of sick individuals

$$\chi = \frac{\Omega - \Omega' - c_{\text{vacc}}\, f N - c_{\text{info}}\, n(f, G)}{N}, \tag{1}$$

where $\Omega$ and $\Omega'$ are the expected outbreak sizes (number of individuals who had the disease after it became extinct) without and with vaccination, respectively, $N$ is the number of individuals, $f$ is the fraction of individuals to vaccinate and $n$ is the number of inquiries needed to obtain information (which can depend on the specific vaccination protocol, $f$ and the graph $G$ representing the network). The goal of this paper is to address the trade-off between effectiveness and efficiency of vaccination protocols. We perform the study on several real-life data sets and on the configuration model, which is a method to generate synthetic uncorrelated random networks given a degree sequence. The advantages of the configuration model are its simplicity and the fact that infection spreading and vaccination protocols on such networks, and on their generalized versions, have been studied in the literature [4,8,10,20,24,26].
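As a small illustration of the net saving per person, the sketch below evaluates the quantity defined in this paragraph, assuming it has the form (outbreak size without vaccination, minus outbreak size with vaccination, minus vaccination cost, minus information cost) divided by N. The function name, argument names and the numbers in the usage note are hypothetical and chosen only for illustration.

```python
def net_saving(omega_no_vacc, omega_vacc, n_nodes, f, n_inquiries,
               c_vacc, c_info):
    """Net saving per person, in units of the cost of one infection.

    Assumed form (an illustration, not taken verbatim from the paper):
        chi = (Omega - Omega' - c_vacc * f * N - c_info * n) / N
    """
    if n_nodes <= 0:
        raise ValueError("n_nodes must be positive")
    vaccination_cost = c_vacc * f * n_nodes   # f * N people are vaccinated
    information_cost = c_info * n_inquiries   # one unit of c_info per inquiry
    saved_infections = omega_no_vacc - omega_vacc
    return (saved_infections - vaccination_cost - information_cost) / n_nodes
```

For example, with N = 1000, an expected outbreak of 400 without and 100 with vaccination, f = 0.1, 100 inquiries, c_vacc = 0.5 and c_info = 0.1, the saving is (300 - 50 - 10) / 1000 = 0.24 per person.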
The simplest vaccination protocol is just to vaccinate random individuals, the Random protocol, which often serves as a baseline in the literature, see e.g. Refs. [8,9,20,29]. An important paper by Cohen, Havlin and ben Avraham [9] proposed a more effective Acquaintance vaccination. In their approach, one also starts with randomly selected individuals but does not vaccinate them; rather, one asks them to name someone they met (in such a way that contagion could occur). In an uncorrelated network, the probability of reaching a person of degree k in this way is proportional to k. It is important to vaccinate high-degree nodes, not only because they have more people to spread the disease to, but also because they have more people to get the disease from.
* Electronic address: holme@skku.edu
Let $f_c$ denote the fraction of the population that must be vaccinated in order to prevent a global outbreak. Formally, as $N \to \infty$, $f_c = \min\{f : \Omega'(f)/N = o(1)\}$, where $\Omega'(f)$ is the expected outbreak size when a fraction $f$ is vaccinated, and we will use an upper index on $f_c$ to denote a specific vaccination protocol (A for Acquaintance, R for Random). It was shown numerically in Refs. [8,9] that $f_c^{A} < f_c^{R}$. An implicit analytical expression for $f_c^{A}$ in uncorrelated networks (configuration model) was derived in Ref. [8]. Similar results were obtained in Ref. [20] for a more general model of infection spreading, in Ref. [4] for an imperfect vaccine, and in Ref. [10] for the weighted configuration model, where the weights of the edges represent contact probabilities.
A large empirical study based on the 2006 census of the Greater Toronto Area [29] suggests that vaccination of the top-degree nodes (the Degree vaccination protocol) is most effective. However, such a strategy requires complete information about the network, which makes it hard to implement. For analytical results on degree-based vaccination and an implicit expression for $f_c^{\text{Degree}}$ we refer to Ref. [20]. In this paper, by optimizing Eq. (1) rather than $f_c$, we confirm that the Degree protocol is never the most efficient one: in all scenarios, the cost of complete knowledge is not justified by the resulting reduction in outbreak size.
In addition to the Acquaintance protocol, we consider two strategies recently developed for quick detection of high-degree nodes: the Random walk strategy [3] and the Two-step heuristic (TSH) [2]. We also consider three protocols that require complete knowledge of the network, including vaccinating the top-degree nodes. See Section II B for a complete description of all protocols.
We find (Section IV) that, on most data sets, the randomized protocols (Acquaintance, Random walk, TSH) outperform the Random protocol as well as the three protocols that require knowledge of the entire network. For high vaccination costs and low costs of gathering information, the Acquaintance vaccination is the most cost efficient. For other parameter values, either Random walk or TSH is most efficient.
In Section V, we provide a qualitative analysis of the optimal value of f in Eq. (1) for different vaccination protocols, based on the configuration model. The obtained insights correspond well to our findings on the data.

II. PRELIMINARIES
In this section we introduce the methods, data sets and network models we use.

A. SIR simulation
We assume that an infectious disease is spreading over a static contact network represented as a graph G = (V, E). V is the set of N vertices representing individuals; E is the set of M undirected edges representing pairs of individuals between whom the disease can spread. The vertices are, at any given time, in one of three states: susceptible (S), infectious (I) or recovered (R). Susceptible vertices do not have the disease, but can get it. Infectious vertices have the disease and can spread it. Recovered vertices do not have the disease and cannot get it. We assume a disease outbreak starts at time t = 0. At the beginning all vertices are susceptible, except a randomly chosen vertex that is infectious. If an edge connects one infectious and one susceptible vertex, then the susceptible vertex becomes infectious at rate β. Every infectious vertex recovers at rate ν. In this setting, an infectious vertex transmits the disease through an edge before recovering with probability β/(β + ν).
The SIR model is essentially determined by the ratio between β and ν. In the well-mixed, differential-equation version of the SIR model, this ratio is called $R_0$. The actual values of β and ν are only needed to calculate the real time to reach the peak prevalence, extinction, etc. In this paper, we set ν = 1, which is equivalent to measuring time in units of 1/ν. Straightforwardly, one can simulate this model by going through all S-I edges and accepting an infection with probability β/(β + ν). It is, however, much more efficient to perform one infection or recovery event in every iteration of the algorithm. The probability of the next event being an infection is

$$\frac{\beta M_{SI}}{\beta M_{SI} + N_I},$$

where $M_{SI}$ is the number of edges between infectious and susceptible individuals, and $N_I$ is the prevalence (number of infectious individuals) [13]. The time increment since the last iteration is $1/(\beta M_{SI} + N_I)$. Thus, to record time (in units of 1/ν), one adds this amount in every iteration to a variable representing time. If an infection event is not performed, one performs a recovery event. In an infection event, the S-I edge is chosen uniformly at random among all S-I edges. Similarly, in a recovery event, the infectious individual (to recover) is selected uniformly at random among all infectious individuals. For all contact networks and parameter values, we use 300,000 or more runs of the SIR model for averages. We use β = 1/32, 1/16, 1/8, 1/4, 1/2, 1, 2, 4, 8, 16, 32 and (as mentioned) ν = 1.
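The event-based scheme described above can be sketched as follows. This is a minimal, unoptimized illustration under stated assumptions, not the authors' implementation: the graph is assumed to be a dictionary mapping each node to a collection of neighbors, ν is fixed to 1, vaccinated nodes are simply ignored, and the list of S-I edges is rebuilt in every iteration for clarity (an efficient version would update it incrementally). The function names are hypothetical.

```python
import random


def sir_outbreak(adj, beta, vaccinated=(), rng=None):
    """Run one event-driven SIR outbreak with nu = 1.

    Returns (outbreak size, duration in units of 1/nu).
    """
    rng = rng or random.Random()
    removed = set(vaccinated)
    candidates = [v for v in adj if v not in removed]
    if not candidates:
        return 0, 0.0
    state = {v: "S" for v in candidates}   # vaccinated nodes have no state
    seed = rng.choice(candidates)
    state[seed] = "I"
    infectious = [seed]
    size, t = 1, 0.0
    while infectious:
        # All edges joining an infectious and a susceptible vertex.
        si_edges = [(u, v) for u in infectious for v in adj[u]
                    if state.get(v) == "S"]
        total_rate = beta * len(si_edges) + len(infectious)
        t += 1.0 / total_rate              # time increment used in the text
        if rng.random() < beta * len(si_edges) / total_rate:
            _, v = rng.choice(si_edges)    # infection along a random S-I edge
            state[v] = "I"
            infectious.append(v)
            size += 1
        else:
            i = rng.randrange(len(infectious))   # random recovery
            state[infectious[i]] = "R"
            infectious[i] = infectious[-1]
            infectious.pop()
    return size, t


def mean_outbreak_size(adj, beta, vaccinated=(), runs=1000, rng=None):
    """Average outbreak size over independent runs."""
    rng = rng or random.Random()
    return sum(sir_outbreak(adj, beta, vaccinated, rng)[0]
               for _ in range(runs)) / runs
```

Calling `mean_outbreak_size` once without and once with a set of vaccinated nodes gives estimates of the two expected outbreak sizes that enter the net saving.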

B. Vaccination protocols
We compare the performance of seven vaccination protocols. Five of these have been analyzed in the literature, and two are proposed by us in this work (but derived from cost-efficient ways of finding the highest-degree vertices). The vaccination protocols range from simple to complex and use different amounts of information about the network.

Random vaccination
The simplest way of vaccinating a fraction $f$ of a population is to just pick $fN$ persons uniformly at random. In this case, we can assume the information cost to be zero as all we need is a list of contact information of the population.

Acquaintance vaccination
An elegant way of exploiting the network structure to find high-degree individuals to vaccinate is the Acquaintance vaccination scheme by Cohen, Havlin and ben Avraham [9]. In the literature it is often assumed that each individual is sampled a Poisson($f$)-distributed number of times, and each time a sampled individual names one neighbor to vaccinate. When the neighbor has already been vaccinated, no vaccination occurs and the next individual is sampled randomly. Then, the average fraction of vaccinated individuals $v(f)$ is smaller than $f$. The exact formula for $v(f)$ is given e.g. in Refs. [8,20]. Naturally, $v(f)$ is close to $f$ when $f$ is small. Here we assume that when a randomly sampled individual names a contact who has already been vaccinated, the individual is asked to name another contact. We discard the rare cases when all contacts of a random individual have already been vaccinated, and thus assume that $v(f) = f$. Then the information cost of this protocol is $f N c_{\text{info}}$, since one needs to make an inquiry to one node for every node that is vaccinated.
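The variant described above, in which a sampled individual names a not-yet-vaccinated contact, can be sketched as follows. This is an illustration under assumptions: the graph is a dictionary mapping nodes to collections of neighbors, the number of nodes to vaccinate is f N rounded to the nearest integer, and samples whose contacts are all vaccinated (or who have no contacts) are discarded without counting as an inquiry. The function name is hypothetical.

```python
import random


def acquaintance_vaccination(adj, f, rng=None):
    """Return (set of vaccinated nodes, number of inquiries made)."""
    rng = rng or random.Random()
    nodes = list(adj)
    target = round(f * len(nodes))
    # Only nodes with at least one neighbor can ever be named.
    nameable = sum(1 for v in nodes if len(adj[v]) > 0)
    if target > nameable:
        raise ValueError("cannot vaccinate more nodes than can be named")
    vaccinated = set()
    inquiries = 0
    while len(vaccinated) < target:
        person = rng.choice(nodes)                    # uniform random sample
        options = [v for v in adj[person] if v not in vaccinated]
        if not options:
            continue                                  # discarded sample
        vaccinated.add(rng.choice(options))           # named contact
        inquiries += 1                                # one inquiry per vaccination
    return vaccinated, inquiries
```

By construction the number of inquiries equals the number of vaccinated nodes, which mirrors the information cost of one inquiry per vaccinated node stated above.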

Random-walk vaccination
If one is willing to spend more effort on mapping out the network, one can do significantly better than the Acquaintance vaccination in finding the high-degree vertices. This is the idea of Random walk vaccination. Under this heuristic one keeps a list of the $fN$ vertices with the highest observed degree, which is updated during a random walk of inquiries. This is based on Ref. [3], which proposed the method to find high-degree nodes in the World Wide Web in a cost-efficient way. To avoid getting stuck in an isolated subgraph, one allows jumps to random nodes. The protocol has two parameters. The first, α, sets the jumping probability to 1/α. We use α = 3 so that the random walker jumps, on average, every third step (following the recommendation from Ref. [3]). The second parameter, $m$, is the number of steps in the random walk per node on the top list. Rather than fixing this parameter, we use the value that optimizes χ.
The cost of this protocol is $c_{\text{info}}$ times the number of steps in which the random walk continues to a neighbor of the present node (rather than jumping to a random node), i.e. $m\, c_{\text{info}} (1 - 1/\alpha)$.
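A random walk with jumps that maintains a top list of observed degrees can be sketched as below. This is a simplified illustration, not the algorithm of Ref. [3] in full detail. Assumptions: the graph is a dictionary mapping nodes to collections of neighbors, the total number of steps is m times the size of the top list, a jump is also forced when the walker sits on an isolated node, and only neighbor steps are counted as inquiries. The function name is hypothetical.

```python
import heapq
import random


def random_walk_vaccination(adj, f, m, alpha=3.0, rng=None):
    """Return (nodes to vaccinate, number of neighbor steps taken)."""
    rng = rng or random.Random()
    nodes = list(adj)
    top_size = round(f * len(nodes))
    steps = round(m * top_size)
    current = rng.choice(nodes)
    seen_degree = {current: len(adj[current])}    # degrees observed so far
    neighbor_steps = 0
    for _ in range(steps):
        if not adj[current] or rng.random() < 1.0 / alpha:
            current = rng.choice(nodes)           # jump to a random node
        else:
            current = rng.choice(list(adj[current]))
            neighbor_steps += 1                   # counted as an inquiry
        seen_degree[current] = len(adj[current])
    # Keep the visited nodes with the highest observed degree.
    top = heapq.nlargest(top_size, seen_degree, key=seen_degree.get)
    return set(top), neighbor_steps
```

If the walk visits fewer distinct nodes than the size of the top list, this sketch returns fewer nodes than requested; a fuller version would need a rule for that case.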

Two-step heuristic
We also try a protocol that, like the Random walk in the previous section, was developed to cost-efficiently identify high-degree nodes in social media. We call it the Two-Step Heuristic (TSH). Just like Random walk, it has a parameter to tune the amount of information used in the search process [2]. This protocol consists of two stages. In the first stage, one randomly chooses $n_1$ nodes and considers a reduced network of these $n_1$ nodes and their neighbors. In the second stage, one measures the exact degrees of the $n_2$ highest-degree nodes of the reduced network. For simplicity, we set $n_1 = n_2 = n$ (which is not far off the expected optimal parameter setting [2]). This gives $n(f, G) = 2n$, and the total information cost $2n c_{\text{info}}$.
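The two stages can be sketched as follows. This is one possible reading of the description above, not the reference implementation of Ref. [2]. Assumptions: the graph is an undirected simple graph given as a dictionary mapping nodes to collections of neighbors, the reduced network consists of all edges reported by the sampled nodes, and the final choice is the f N candidates with the highest exact degree. The function name is hypothetical.

```python
import heapq
import random


def two_step_heuristic(adj, f, n, rng=None):
    """Return (nodes to vaccinate, number of inquiries made)."""
    rng = rng or random.Random()
    nodes = list(adj)
    top_size = round(f * len(nodes))
    # Stage 1: sample n nodes and collect the edges they report.
    sampled = rng.sample(nodes, min(n, len(nodes)))
    edges = set()
    for u in sampled:
        for v in adj[u]:
            edges.add(frozenset((u, v)))
    reduced_degree = {}
    for edge in edges:
        for v in edge:
            reduced_degree[v] = reduced_degree.get(v, 0) + 1
    # Stage 2: query the exact degree of the n best candidates.
    candidates = heapq.nlargest(n, reduced_degree, key=reduced_degree.get)
    exact_degree = {v: len(adj[v]) for v in candidates}
    chosen = heapq.nlargest(top_size, exact_degree, key=exact_degree.get)
    return set(chosen), len(sampled) + len(candidates)
```

With n equal to the number of nodes, the first stage sees the whole graph, so the sketch reduces to exact degree ranking at a cost of 2n inquiries.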

Degree
Since both Random walk and TSH aim at being cost-efficient methods to rank nodes according to their degree, we also use the correct values of the degree (which could only be obtained by knowing the entire network). The information cost of this protocol is thus $N c_{\text{info}}$.
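For comparison with the sampling-based protocols, exact degree ranking is straightforward once the whole network is known. A minimal sketch, assuming the graph is a dictionary mapping nodes to collections of neighbors and that one inquiry per node is charged, as stated above (the function name is hypothetical):

```python
import heapq


def degree_vaccination(adj, f):
    """Return (the round(f * N) highest-degree nodes, number of inquiries)."""
    top_size = round(f * len(adj))
    chosen = heapq.nlargest(top_size, adj, key=lambda v: len(adj[v]))
    return set(chosen), len(adj)   # the full network costs N inquiries
```

Ties between nodes of equal degree are broken by the order in which `heapq.nlargest` encounters them, which is an arbitrary choice in this sketch.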

Coreness
There are other structures than degree that could be exploited for mitigating disease spreading. Coreness captures not only the degree of a node; it also increases with the connectedness of the node's neighborhood. The idea that dense clusters ("core groups" in the epidemiological literature) are important for disease spreading dates back to Ref. [31]. Coreness is not the only metric to capture this property, but it is a simple and straightforward one. It is the byproduct of a k-core decomposition, which is a way to analyze the network by successively removing nodes from it. Specifically, at level k, one deletes all nodes with degree ≤ k. If nodes get degree ≤ k during the deletion process, one deletes these too, until all remaining nodes have degrees larger than k. The coreness value of a node is the k-value at which it was deleted.
The coreness as an estimate of importance with respect to disease spreading was proposed by Ref. [17]. To use it, one would need to map out the entire network, i.e. all its $M$ edges. However, in reality, the inquiries will be implemented by node, and it is unreasonable to assume that the cost of an inquiry from a node of degree k is proportional to k. Therefore, we choose another simplified approach, in which we assume that knowing the complete network takes one inquiry per node, i.e. the total information cost is $N c_{\text{info}}$. Note that this is a more demanding inquiry, because it requires an individual to list all its neighbors. Still, we use the same cost, meaning the performance of Coreness relative to its cost will be slightly exaggerated compared to the above protocols.
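The deletion procedure that defines coreness can be sketched directly. The sketch below follows the level-by-level description given above (delete nodes of degree at most k, cascade, then increase k) and assumes an undirected simple graph stored as a dictionary mapping nodes to collections of neighbors. The function name is hypothetical.

```python
def coreness(adj):
    """Return a dict mapping each node to the level k at which it is deleted."""
    degree = {v: len(adj[v]) for v in adj}
    alive = set(adj)
    core = {}
    k = 0
    while alive:
        # Nodes whose current degree is already at most k.
        stack = [v for v in alive if degree[v] <= k]
        while stack:
            v = stack.pop()
            if v not in alive:
                continue                  # already deleted at this level
            alive.remove(v)
            core[v] = k
            for u in adj[v]:
                if u in alive:
                    degree[u] -= 1        # u loses a neighbor
                    if degree[u] <= k:
                        stack.append(u)   # cascade the deletion
        k += 1
    return core
```

For a triangle with one pendant node attached, the pendant node gets coreness 1 and the three triangle nodes get coreness 2. Vaccinating by coreness then amounts to picking the f N nodes with the largest values.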

Collective influence
Finally, we use a yet more elaborate algorithm that, like coreness, requires full information about the network. We stick with the authors' rather non-descriptive name Collective Influence (CI) [22]. It starts by defining a quantity

$$x_l(i) = (k_i - 1) \sum_{j:\, d(i,j) = l} (k_j - 1),$$

where $k_i$ is the degree of node $i$ and $d(i, j)$ is the distance (fewest number of edges in any path) between $i$ and $j$. The algorithm proceeds by deleting the node of largest $x_l(i)$, then recalculating $x_l$ for the reduced network and repeating the procedure until $fN$ nodes are deleted. As $l$ grows, the ranking stabilizes but the computation time increases. The choice of $l$ is thus a trade-off between speed and precision. We follow Ref. [22] and set $l = 3$. Just like Coreness, CI needs all the network information. Thus the total cost of information gathering is $N c_{\text{info}}$.
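The greedy procedure can be sketched as below, assuming the quantity is the standard Collective Influence score: (degree of i minus one) times the sum of (degree minus one) over all nodes at distance exactly l from i, with degrees and distances taken in the network that remains after earlier deletions. The graph is assumed to be a dictionary mapping nodes to collections of neighbors. This naive version recomputes every score after each deletion and is meant only for small graphs; the function names are hypothetical.

```python
from collections import deque


def collective_influence(adj, i, l, removed=frozenset()):
    """Score of node i at radius l in the graph without the removed nodes."""
    def degree(v):
        return sum(1 for u in adj[v] if u not in removed)

    dist = {i: 0}
    queue = deque([i])
    frontier_sum = 0
    while queue:                           # breadth-first search up to depth l
        v = queue.popleft()
        if dist[v] == l:
            frontier_sum += degree(v) - 1  # node at distance exactly l
            continue
        for u in adj[v]:
            if u not in removed and u not in dist:
                dist[u] = dist[v] + 1
                queue.append(u)
    return (degree(i) - 1) * frontier_sum


def ci_vaccination(adj, f, l=3):
    """Greedily delete round(f * N) nodes of largest score."""
    removed = set()
    for _ in range(round(f * len(adj))):
        remaining = (v for v in adj if v not in removed)
        best = max(remaining,
                   key=lambda v: collective_influence(adj, v, l, removed))
        removed.add(best)
    return removed
```

On the path 0-1-2-3-4 with l = 1, the middle node has score 2, its neighbors have score 1 and the end nodes have score 0, so the middle node is deleted first.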

III. NETWORKS
Ideally, the underlying network of our study should be as realistic as possible (given a pathogen). Our knowledge of the structure of contact networks is advancing, and there are some data sets available. We use the ones that record actual contacts between people and disregard those where contacts are inferred from interaction on social media, etc. [30]. To better understand how the size of the network and higher-order structures affect the performance of the algorithms, it is desirable to have models able to generate contact networks. We study one of the simplest such models, the configuration model, not because it is able to generate a network with very realistic structure, but because it enables us to compare the results to other studies, in particular analytical ones.

Configuration model
The input to the configuration model is a degree sequence, i.e. a sequence of desired degrees of the nodes of the network. The model proceeds by picking random pairs of nodes and adding an edge between them if their actual degrees are less than their desired degrees. The network is complete when all nodes have their desired degree (except possibly one node, if the sum of the degree sequence is odd). The model does not enforce a simple graph: if there is already an edge between a selected pair of nodes, one still adds another edge, and links from a vertex to itself are also allowed. In other words, the configuration model generates a multigraph. Since the empirical graphs in our study are simple graphs by construction, we convert the output of the configuration model to a simple graph by deleting multiple edges and self-loops. In the literature this construction is sometimes called the erased configuration model [27].
Like many previous studies, we focus on networks with a power-law degree distribution, so the probability of a vertex having degree $k$ is proportional to $k^{-\gamma}$. A problem with this approach is that the degrees (and consequently the number of edges) fluctuate strongly, meaning that we would need a prohibitively large number of runs to obtain reliable averages. To partially mitigate this problem, we truncate the power-law distribution at $N^{1/(\gamma-1)}$, which, asymptotically as $N \to \infty$ and up to a (random) factor independent of $N$, is the highest degree in networks with independent power-law degrees. Such truncation preserves the limiting degree distribution and improves the precision of the estimated average values of the infection outbreak.
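A possible way to generate such networks is sketched below. It uses stub matching (shuffling a list of edge ends and pairing them), which is a common way to realize the configuration model and is swapped in here for the pairwise procedure described above. Assumptions: the minimum degree is 1, the truncation point is the integer part of N^(1/(γ-1)), and the output is a dictionary mapping node indices to neighbor sets. The function names are hypothetical.

```python
import random


def truncated_power_law_degrees(n, gamma, rng):
    """Draw n degrees with P(k) proportional to k**(-gamma), 1 <= k <= k_max."""
    k_max = max(1, int(n ** (1.0 / (gamma - 1.0)) + 1e-9))
    ks = range(1, k_max + 1)
    weights = [k ** -gamma for k in ks]
    return rng.choices(ks, weights=weights, k=n)


def erased_configuration_model(degrees, rng):
    """Pair stubs at random, then drop self-loops and multiple edges."""
    stubs = [v for v, k in enumerate(degrees) for _ in range(k)]
    rng.shuffle(stubs)
    adj = {v: set() for v in range(len(degrees))}
    # If the number of stubs is odd, the last stub is left unmatched.
    for a, b in zip(stubs[::2], stubs[1::2]):
        if a != b:              # skip self-loops
            adj[a].add(b)       # sets silently erase multiple edges
            adj[b].add(a)
    return adj
```

Because of the erasure step, a node's realized degree can be smaller than its desired degree.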

Empirical networks
The first type of empirical networks that we use represents self-reported sexual contacts. Two of these data sets (we label them HIV and Colorado Springs) were gathered by so-called contact tracing, where individuals testing positive for HIV were required to report their recent contacts. The HIV data set is from the first study [1], which used an observed contact network between HIV patients to argue that HIV is a sexually transmitted disease. Colorado Springs is a larger and later contact-tracing data set based on patients from its namesake city in Colorado, USA [18]. Contact tracing does not follow contacts of uninfected individuals. Indeed, HIV only includes positive cases, while Colorado Springs also includes uninfected individuals who had sex with HIV-positive individuals.
We also use two networks of self-reported sexual contacts not related to contact tracing. One (Iceland) comes from Icelandic men who have sex with men [11]; the other (Prostitution) comes from a Brazilian web forum where sex buyers report their encounters with prostitutes [23].
The final type of empirical networks are so-called proximity networks. In these, a link represents a pair of people being close to each other at some time. These data sets all come from the Sociopatterns project (sociopatterns.org) and were collected by radio-frequency identification sensors given to people in some specific social setting. Such sensors record a contact if two persons are within 1-1.5 m of each other. The social setting of one of these data sets is a conference [15] (Conference), another is a hospital [28] (Hospital), and the final two come from a school (School 1 and 2) [25].
The original proximity data sets, along with Prostitution, are time resolved. We construct static networks by aggregating all contacts. (Ideally these data sets should be analyzed as temporal networks [21]; then one could get around the assumption that the past accurately predicts the future [7,19]. However, that is outside the scope of this paper.) We list the basic statistics (sizes, sampling durations, etc.) of the data sets in Table I.
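Aggregating a time-resolved contact list into a static network can be sketched as follows. The input format, a sequence of (time, person, person) triples, is an assumption made for illustration and is not necessarily the format of the original data files. The function name is hypothetical.

```python
def aggregate_contacts(contacts):
    """Collapse time-stamped contacts into an undirected simple graph.

    `contacts` is an iterable of (t, u, v) triples; the time stamp t is
    ignored, and repeated contacts between the same pair give one edge.
    """
    adj = {}
    for _, u, v in contacts:
        if u == v:
            continue                      # ignore self-contacts
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj
```

Note that individuals who never appear in a contact do not appear in the resulting graph.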

IV. NUMERICAL RESULTS
We start by evaluating the vaccination protocols in some detail for the Colorado Springs data set. Then we proceed to take a cruder look at all the data sets to see how network structure affects the results.

A. A case study
The Colorado Springs network serves well as an example since it is of intermediate size in our collection and has typical features, such as a heterogeneous degree distribution. In this section we set β = 2, once again choosing a modest value that is in the interesting range where the disease can spread throughout the population. In Fig. 1, we plot the optimal saved cost $N\chi_{\text{opt}}$ as a function of the two parameters: the relative cost of information $c_{\text{info}}$ and the relative cost of vaccination $c_{\text{vacc}}$. The general pattern is as expected: the protocols needing most information (CI, Degree and Coreness) are also the ones that depend most on $c_{\text{info}}$, while Random, which needs no information at all, depends only on $c_{\text{vacc}}$. The three protocols using an amount of information depending on $f$ (Acquaintance, Random walk and TSH) are affected by both $c_{\text{info}}$ and $c_{\text{vacc}}$. From the heat maps it is hard to see which protocol is the best (except, perhaps, that Acquaintance has the largest χ for large $c_{\text{info}}$). This means that the efficiencies of the best-performing protocols are relatively similar.
The performance of the protocols can be better understood by measuring the fraction of vertices $f_{\text{opt}}$ that needs to be vaccinated to optimize the total costs; see Fig. 2. The protocols whose information costs do not depend on $f$ obviously have no $c_{\text{info}}$ dependence. For the other ones (Acquaintance, Random walk and TSH), $f_{\text{opt}}$ decreases with both $c_{\text{vacc}}$ and $c_{\text{info}}$. Hence, more information does make these protocols more accurate. This can be seen even more clearly in Fig. 3, where we set $f = f_{\text{opt}}$ and study the optimal parameter values ($m_{\text{opt}}$ for Random walk and $n_{\text{opt}}$ for TSH). Both protocols naturally have larger values of their parameters the cheaper the information is. For Random walk the optimal parameter value is largest when $c_{\text{info}}$ is as small and $c_{\text{vacc}}$ as large as possible. Large $c_{\text{vacc}}$ gives small optimal $f$ (see Fig. 2), which lowers the cost needed for gathering information. For large $c_{\text{vacc}}$ and small $c_{\text{info}}$ the relative cost of information gathering is thus so small that the rather small marginal benefit of longer random walks is still affordable. For the TSH protocol the largest parameter value is at an intermediate value of $c_{\text{vacc}}$ (with $c_{\text{info}}$ still as small as possible). One can understand the increase of the parameter value with $c_{\text{vacc}}$ in a similar way as for Random walk. The eventual decrease, for $c_{\text{vacc}} \approx 0.1$, as well as other non-monotonicities in the plot, can be related to how the outbreak size responds to changing $f_{\text{opt}}$.

TABLE I: Basic statistics of the data sets. N is the number of individuals; M is the number of links. x is the connectance (fraction of vertex pairs that are links). C is the clustering coefficient, reported together with its average value over random graphs with the same expected degree sequence as the empirical network [6].

B. Network-structural effects
The picture painted in the previous section remains roughly true for other data sets and β values. In this section, we go directly to our main question of what the most cost effective vaccination protocol is. Figure 4 shows the results for β = 2. The corresponding figure for the other β-values we study can be found in the Supplementary material. From these figures, the conclusions are roughly the same, but for small β, i.e. small outbreak sizes, the results are affected by noise (so the regions are not that clear cut).
For most of the data sets, Acquaintance vaccination is the most efficient protocol for relatively large information costs, TSH is the most efficient for low $c_{\text{info}}$ and large $c_{\text{vacc}}$, while Random walk is the most efficient for the rest of the parameter space. One exception is Prostitution, the largest and sparsest network, where CI is the most cost-effective (despite the fact that it requires global knowledge of the network structure). This network also has zero clustering coefficient, i.e. no triangles (because only heterosexual contacts are recorded). Still, the size and sparsity seem like more fundamental differences from the other networks (cf. Ref. [14]). To understand the role of clustering one could perform the same study on model networks where the clustering can be controlled. The densest network, Hospital, is also different in that TSH performs best over the entire parameter space. Random is never the most efficient, meaning that there are network structures that can be exploited for all data sets and parameter values. Coreness and Degree do not perform best under any circumstance.
In addition to the empirical contact networks, we also study scale-free networks of different sizes; see Fig. 5. These networks behave slightly differently from the empirical networks, with CI dominating the large-$c_{\text{vacc}}$, small-$c_{\text{info}}$ region, Acquaintance dominating the small-$c_{\text{vacc}}$, large-$c_{\text{info}}$ region, Random walk being the best in the region of intermediate $c_{\text{vacc}}$ and $c_{\text{info}}$, and TSH being the best protocol for some low $c_{\text{info}}$ values and intermediate $c_{\text{vacc}}$ values. For N = 2,500, Coreness is the most efficient for low $c_{\text{vacc}}$ and $c_{\text{info}}$ (this observation holds for other β values too; see the Supplementary information).

V. ANALYTICAL RESULTS
In this section we corroborate the results from Sec. IV by analytical calculations. To start, we consider the objective function $\chi(f)$ in more detail. We denote

$$\Delta(f) = \Omega - \Omega'(f), \qquad \chi(f) = \frac{\Delta(f) - c_{\text{vacc}}\, f N - c_{\text{info}}\, n(f, G)}{N}, \tag{3}$$

where $\Omega'(f)$ is the expected outbreak size when a fraction $f$ is vaccinated. In the SIR model,

$$\Delta(f_1 + f_2) \le \Delta(f_1) + \Delta(f_2),$$

for any $f_1, f_2 \ge 0$, where $f_1 + f_2 \le 1$. Such subadditivity plays an important role in algorithmic solutions to the influence maximization problem, which is closely related to the vaccination problem but has a somewhat opposite goal: choose the initial set of individuals who get an infection, or information, so that the information spreading in the network is maximized [16].
Remember the notation $f_c$ for the fraction of the population that needs to be vaccinated in order to prevent a global outbreak. Then $\Delta(f)$ increases to $\Omega$ for $f < f_c$, and $\Delta(f) = \Omega$ when $f \ge f_c$. Combining this with subadditivity, we conclude that the derivative $d\Delta/df$ is positive and decreasing for $f < f_c$.
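The optimal value $f_{\text{opt}}$ of $f$ in Eq. (3) is achieved when

$$\frac{d\Delta}{df}\bigg|_{f = f_{\text{opt}}} = c_{\text{vacc}} N + c_{\text{info}}\, \frac{\partial n(f, G)}{\partial f}\bigg|_{f = f_{\text{opt}}}. \tag{4}$$

Since $n(f, G)$ is non-decreasing in $f$, it follows that the maximum gain is achieved for some $f_{\text{opt}} < f_c$. Let us now look at $f_{\text{opt}}$ for different vaccination strategies. Let us first consider the Random (R) vaccination strategy. Assume that the underlying graph is a configuration model. If the degree distribution has a finite variance, then $f_c^{R}$ can be obtained directly from formula (3.5) of Ref. [8] by setting the reproduction number to its critical value 1. Specifically, we have:

$$f_c^{R} = 1 - \frac{1}{R_0}, \qquad R_0 = \frac{\beta}{\beta + \nu}\, \frac{E[D(D-1)]}{E[D]},$$

where $D$ is the degree of a uniformly chosen node and $R_0$ is the reproduction number without vaccination, and the value is positive if the global outbreak occurs when no vaccination takes place. When the variance is infinite, as in our case γ = 2.5, then $f_c^{R} = 1$, so the global outbreak cannot be prevented by random vaccination.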
We now turn to $f_{\text{opt}}^{R}$. Applying Eq. (4), and noting that Random requires no inquiries, we obtain

$$\frac{d\Delta}{df}\bigg|_{f = f_{\text{opt}}^{R}} = c_{\text{vacc}} N.$$

With $f_c^{R} = 1$, we can expect that $f_{\text{opt}}^{R}$ is quite large for small $c_{\text{vacc}}$, and decreases when $c_{\text{vacc}}$ becomes larger. Interestingly, this is what we observe in the case study in Fig. 2. We can also explain the modest gain in Fig. 1 by the relatively slow growth of $\Delta(f)$.
By Theorem (3.3) of Ref. [8], in the configuration model, when the variance of the degrees is finite, one can find $f_c^{A} \in (0, 1)$ for the Acquaintance protocol. In fact, this can be the case even when the variance of the degrees is infinite, as long as the reproduction number in formula (3.13) of Ref. [8] is smaller than one. This is the effect of vaccinating nodes with a size-biased degree distribution. Moreover, for the same epidemic on the same graph, it holds that $f_c^{A} \le f_c^{R}$, where equality is possible only if both values are equal to one. The optimal fraction of vaccinated individuals $f_{\text{opt}}^{A}$ now satisfies the equation

$$\frac{d\Delta}{df}\bigg|_{f = f_{\text{opt}}^{A}} = (c_{\text{vacc}} + c_{\text{info}})\, N.$$

Note that compared to the random vaccination, the right-hand side has an additional positive term, $c_{\text{info}} N$, because the Acquaintance protocol makes $fN$ inquiries. Combining the above considerations and the sharper growth of $\Delta(f)$, we expect that $f_{\text{opt}}^{A}$ is considerably smaller than $f_{\text{opt}}^{R}$. Again, interestingly, this is the case in Fig. 2. The gain is harder to predict because it depends on both vaccination and information costs. In the case study in Fig. 1, we see that the gain for the acquaintance vaccination is similar to the one for the random vaccination, while in other data sets the acquaintance vaccination outperforms other protocols, especially when the information cost is high; see Fig. 4.
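The effect of the extra information-cost term on the optimal fraction can be illustrated numerically. The sketch below maximizes the net saving over a grid of f values for a toy saving curve. Everything here is an assumption for illustration: the concave curve `toy_delta`, its parameters, the cost values, and the use of the same curve for both protocols (in reality the curve differs between protocols; using one curve isolates the effect of the cost structure). The function names are hypothetical.

```python
def toy_delta(f, omega=500.0, f_c=0.5):
    """Hypothetical saved infections: concave, saturating at omega for f >= f_c."""
    if f >= f_c:
        return omega
    return omega * (1.0 - (1.0 - f / f_c) ** 2)


def optimal_fraction(delta, n_inquiries, n_nodes, c_vacc, c_info, grid=1000):
    """Grid search for the f in [0, 1] that maximizes the net saving per person."""
    def chi(f):
        cost = c_vacc * f * n_nodes + c_info * n_inquiries(f)
        return (delta(f) - cost) / n_nodes

    best = max((i / grid for i in range(grid + 1)), key=chi)
    return best, chi(best)


N = 1000
# Random: no inquiries. Acquaintance: one inquiry per vaccinated node.
f_random, chi_random = optimal_fraction(toy_delta, lambda f: 0.0, N, 0.5, 0.2)
f_acq, chi_acq = optimal_fraction(toy_delta, lambda f: f * N, N, 0.5, 0.2)
```

For this toy curve the first-order condition can be solved by hand: the slope 2000 (1 - 2f) equals 500 at f = 0.375 for the Random cost structure and 700 at f = 0.325 when the information cost per vaccinated node is added, so the extra term lowers the optimal fraction.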
The Degree, Coreness and CI strategies should be the most effective in the configuration model because they target the nodes that have the highest potential for spreading the infection. A formula for the average outbreak size in the configuration model in the case of degree-based vaccination is given in Ref. [20] for a general model of infection spreading. However, these results apply when nodes of degree s are removed with a given probability, and cannot be used directly when the fraction $f$ of highest-degree nodes is removed.
The fraction $f_{\text{opt}}$ for these strategies satisfies the same equation as $f_{\text{opt}}^{R}$:

$$\frac{d\Delta}{df}\bigg|_{f = f_{\text{opt}}} = c_{\text{vacc}} N.$$

Compared to the Acquaintance protocol, we can expect a faster decline of $d\Delta/df$, but the value of $d\Delta/df$ at $f_{\text{opt}}$ is also smaller. In the case study in Fig. 2 we have $f_{\text{opt}}^{\text{Degree}} < f_{\text{opt}}^{A}$. For Coreness and CI it is the other way around. The very large value of $f_{\text{opt}}$, especially for Coreness in Fig. 2, signals that these strategies are in fact inefficient for the Colorado Springs network (case study). In Fig. 1, for the same case study, we observe that Degree, Coreness and CI have a very small gain. The efficiency of CI on the configuration model (Fig. 5) and on the Prostitution data set in Fig. 4, for similar values of the parameters, is an interesting finding that deserves further research. A possible explanation is the small number of triangles, a feature that the Prostitution data set and the configuration model share.
Finally, consider the Random-walk and TSH strategies. Since these strategies target nodes with large degrees, we expect $f_c$ to be slightly larger than for Degree. The optimal value $f_{\text{opt}}$ satisfies

$$\frac{d\Delta}{df}\bigg|_{f = f_{\text{opt}}} = c_{\text{vacc}} N + c_{\text{info}}\, \frac{\partial n(f, G)}{\partial f}\bigg|_{f = f_{\text{opt}}}.$$

For large enough $N$, we expect the last term above to be small. Hence, we expect similar values of $f_{\text{opt}}$ for the Random-walk and TSH strategies as for Degree, but these optimal values will decrease when $c_{\text{info}}$ increases.
As for the net gain, it should be considerably higher than that of the Degree, Coreness and CI strategies when $c_{\text{info}}$ is large enough. The comparison to the Acquaintance strategy is trickier, since the latter also targets high-degree nodes but at lower cost, because $n(f, G) > Nf$ holds for Random walk and TSH. On the other hand, the accuracy of Random walk and TSH is higher. The comparison between the three randomized strategies (Acquaintance, Random walk and TSH) thus depends on the interplay between accurate targeting and information costs.

VI. SUMMARY AND CONCLUSIONS
Our approach is based on the hypothesis that carrying out a network-based method for targeted immunization comes with costs for mapping out the network and for the vaccination itself. These costs need to be put in the context of the societal cost of sick individuals. From this starting point, we have evaluated the cost efficiency of seven network-based vaccination methods. There is not one universally best method. Rather, depending on the network structure and the relative vaccination and information costs, the best method (at least for the networks and parameters we explore) seems to be one of five: Acquaintance, TSH, CI, Coreness and Random walk. We make this point both by analytical calculations and simulations.
Acquaintance vaccination is almost always the most efficient for small $c_{\text{vacc}}$ and large $c_{\text{info}}$. It is the protocol that uses the second least network information after Random. For very large $c_{\text{info}}$, Random will trivially be the most efficient (keep in mind that $c_{\text{info}}$ can, in principle, be larger than one), but we never observe this. TSH dominates the region of large $c_{\text{vacc}}$ and small $c_{\text{info}}$ for denser networks (for sparser networks CI could also be most efficient). Random walk dominates intermediate values of $c_{\text{vacc}}$ and $c_{\text{info}}$. It is hard to speculate why, but it is probably related to the fact that the optimal parameter values of TSH change quickly for small $c_{\text{info}}$ (Fig. 3), meaning that this protocol is more adaptable than Random walk in this region. CI performs well for sparse networks with few triangles, especially in the region of large $c_{\text{vacc}}$ and small $c_{\text{info}}$. Degree is never most efficient, meaning that vaccinating exactly in order of degree is not so important that it is worth obtaining all the network information. Coreness is almost never most efficient, supporting Refs. [12] and [22] (but disagreeing with Ref. [17]).
The main message of this work is that one needs to evaluate any targeted vaccination protocol with the information and vaccination costs in mind. If one evaluates only one special case (like zero vaccination and information costs), it is probable that one reaches the wrong conclusion about the practical efficiency of a vaccination algorithm. In real vaccination campaigns, there are of course yet other complications. It could be hard (politically, and maybe morally) to motivate a targeted vaccination campaign, instead of prioritizing sensitive groups. There could be substantial errors in reporting the network information, which could affect the protocols' efficiency. The assumption that the disease has not yet spread in the network is probably never true in practical situations (see e.g. Refs. [5,24,26] on vaccinations using contact tracing of infected individuals during an epidemic). There are thus several other complicating factors to consider when evaluating theoretical vaccination protocols in more realistic settings. Our work, however, advances towards increased realism, and we expect most conclusions about the efficiency of the protocols to hold qualitatively with the caveats mentioned above.