Abstract
Both snowball sampling and Respondent Driven Sampling (RDS) are used to sample hard-to-reach populations. Snowball sampling was initially developed as a probability sampling method, but in practice, it is widely used as a non-probabilistic sampling method. RDS was developed to address the limitations of snowball sampling and can be used to approximate a probability sampling method in practice. Therefore, RDS is often recommended for bio-behavioral surveys (BBS) for surveillance of HIV, viral hepatitis, and STIs among key populations. In some settings, simpler and cheaper monitoring is desired. WHO and UNAIDS are developing a simplified and rapid bio-behavioral survey methodology, a version of snowball sampling to use when RDS is infeasible. In this paper, we use data-based simulations to examine the potential similarities and differences between results from a snowball sample with recruitment initiated from a health service and samples recruited through RDS methodology.
Citation: Kim D, Gile KJ, Mathers B, Mirandola M, Gios L, Toskin I, et al. (2026) Comparing snowball sampling and RDS: A methodology and case study. PLoS One 21(1): e0331666. https://doi.org/10.1371/journal.pone.0331666
Editor: Jaroslaw Kozak, John Paul II Catholic University of Lublin: Katolicki Uniwersytet Lubelski Jana Pawla II, POLAND
Received: January 2, 2025; Accepted: August 19, 2025; Published: January 14, 2026
Copyright: © 2026 Kim et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: The informed consent obtained from participants did not include provisions for public data sharing. Although the data were collected prior to the adoption of the General Data Protection Regulation (GDPR), the dataset is retroactively subject to restrictions on cross-border data sharing in accordance with GDPR provisions. Furthermore, the dataset contains sensitive personal information—including health status, sexual orientation, HIV status, and drug use—and cannot be sufficiently anonymised without posing a risk of participant re-identification, particularly given the small and marginalised nature of MSM populations in some countries. To request access to data, inquiries may be sent to comitatoetico@aovr.veneto.it. Please address correspondence to the President of the IRB, as this is a role-based position and may change over time. If further clarification would be helpful, you may also contact Lorenzo Gios at lorenzo.gios.aoui@gmail.com.
Funding: This work was supported by funding from the Joint United Nations Programme on HIV/AIDS (UNAIDS) (these are the views of the individual authors and do not reflect UNAIDS’ positions). This manuscript is based on data from the Sialon II project, co-funded under the Second Programme of Community Action in the field of Health (2008-2013) (Work Plan 2010) (Grant Agreement Number: 2010 12 11). The sole responsibility lies with the authors of this manuscript and the Commission is not responsible for any use that may be made of the information contained therein.
Competing interests: The authors have declared that no competing interests exist.
Introduction
Both snowball sampling [1,2] and Respondent Driven Sampling (RDS) [3–5] are commonly used to collect samples from populations where standard sampling approaches are not appropriate or are prohibitively expensive, but the population is well-connected by a social network. Both sampling methods start with a small initial sample, which is expanded by recruiting from within the social networks of previous participants. Many statistical methods have been developed for obtaining valid estimates from data obtained from snowball sampling when the sample begins with a probability sample [1,6–8]. However, getting an initial sample through random sampling is usually challenging. The reliance on an initial convenience sample violates this condition and renders the whole sample a non-probability sample. The dependence on the initial sample is further heightened when there are large numbers of initial samples and few steps or waves of sampling away from the initial sample. In practice, snowball samples are typically treated as convenience samples. Non-network-based sampling methods such as time-location [9,10], venue-based [11], and targeted sampling [12] also require strong assumptions for valid inference for hard-to-reach populations. RDS has several innovations to address the limitations of snowball sampling. First, RDS limits the number of initial seeds to be small and only allows for a limited number of recruits per respondent, which results in longer sample chains (more waves of recruitment) for a desired sample size. This reduces the dependence of the final sample on the initial seeds. Second, in the presence of key assumptions, an RDS sample can be treated as a probability sample for statistical inference [3,13–15]. Third, to further aid practicality in stigmatized populations, RDS also allows for anonymous recruitment. The sample expands when respondents distribute a small number of uniquely identified coupons among their contacts, making them eligible for participation.
Because these innovations allow for practical sampling and more valid statistical inference, RDS is used in many fields and many countries, especially for surveillance of high-risk hard-to-reach populations [16–20].
Bio-behavioral surveys (BBS) [21] for surveillance of HIV, viral hepatitis, and sexually transmitted infections among key populations (such as men who have sex with men, sex workers, and people who inject drugs) commonly use RDS to recruit these groups, which are typically hard to reach through other sampling methods. BBS that employ these methods typically require substantial financial and technical resources. WHO and UNAIDS are developing a simplified and rapid survey methodology that is intended to be less expensive and less technically demanding than a BBS, and able to be implemented on a regular basis by providers of HIV and other health services to these populations. This ‘BBS-Lite’ methodology involves the consecutive sampling of eligible clients accessing health services, who are then provided coupons to recruit other participants through snowball sampling, with an anticipated limited number of waves. To evaluate the strengths and limitations of this proposed methodology, we undertook a simulation study using existing data to compare RDS and snowball sampling methods.
Methods
This paper presents a simulation-based framework to compare two possible sampling methods. In particular, we compare a specific version of snowball sampling modeled after the method planned by WHO and UNAIDS (here called BBS-Lite snowball sampling) and RDS to find possible biases in BBS-Lite as compared to RDS. To best approximate the types of covariate dependence we might be likely to see in real sampling, we used data from previous surveys using RDS in key populations of interest as the basis for our simulation study. This simulation method can be used with existing RDS data to assess whether a BBS-Lite style study is advisable in a given population. A key limitation of this method is that RDS samples have a tree-structure, while full populations are connected by more complex network structures. In most cases, however, RDS samples provide the best, or only, information about the dependence patterns in populations sampled by link-tracing network samples. To keep our study as close to these data as possible, we focus on the outcome that is most observable: comparing sample composition between RDS and BBS-Lite Snowball Sampling. In particular, we focus on the implications of two critical differences between the two methods: the selection of the initial sample and the depth of sampling.
We simulate both BBS-Lite snowball sampling and RDS using data collected in the Sialon II bio-behavioral study [22], which used RDS to recruit men who have sex with men across several European populations. Then, we compare the original data and the simulated data to examine the potential similarities and differences between results from a BBS-Lite snowball sample and a same-sized RDS sample.
Introduction to the data
The Sialon II project [22] is a multi-center biological and behavioral cross-sectional survey carried out across European countries using Time-Location Sampling (TLS) and RDS between 2013 and 2014 to better understand the HIV/STI prevention needs and prevention regulation gaps of Men who have Sex with Men (MSM). RDS was used to recruit a total of 1,305 participants in four countries (400 from Italy, 322 from Lithuania, 183 from Romania, and 400 from Slovakia); see Fig 1 and Table 1. The number of initial sample seeds ranged from 5 to 9, and the maximum number of waves of recruitment in each study ranged between 8 and 21.
Colors represent health service user status (here, HIV testing history), and circle sizes indicate HIV status.
Because the BBS-Lite snowball method begins with an initial sample of health service users, we identified questionnaire items from the Sialon II survey that would indicate a participant was a health service user. The following two variables were used as proxy indicators of health service utilization: 1) “Have you been given condoms at drop-in centers, sexual-health clinics, health care facilities, outreach service/gay/HIV/other association in last 12 months?” (referred to here as ‘Receive condoms’); and 2) “Have you been tested for HIV in the last 12 months?” (referred to here as ‘HIV testing history’). We summarize the number of health service users according to these variables in Table 1. These two variables were selected as proxies for health service utilization because condom distribution and HIV testing typically take place at locations that provide HIV-related care services.
We then examined the sample means and estimated population proportions using the RDS II estimates [14] for several variables we selected for comparison. Variable names and abbreviations are in Table 2, sample means are in Table 3 and RDS II estimates are in Table 4.
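For readers less familiar with the RDS II estimator [14], the following is a minimal sketch of its usual form: a mean weighted by the inverse of each respondent's reported network degree. The function name, the argument layout, and the toy data are hypothetical and are not taken from the Sialon II analysis; the sketch only illustrates how the weighting differs from a plain sample mean.

```python
from typing import Sequence


def rds_ii_estimate(outcomes: Sequence[float], degrees: Sequence[float]) -> float:
    """Inverse-degree-weighted mean, the usual form of the RDS II estimator.

    outcomes: 0/1 indicators (or numeric values) for each respondent.
    degrees:  self-reported network sizes; each must be positive.
    """
    if len(outcomes) != len(degrees):
        raise ValueError("outcomes and degrees must have the same length")
    if any(d <= 0 for d in degrees):
        raise ValueError("degrees must be positive")
    weights = [1.0 / d for d in degrees]
    weighted_sum = sum(w * y for w, y in zip(weights, outcomes))
    return weighted_sum / sum(weights)


# Hypothetical toy data: four respondents with a binary outcome.
toy_outcomes = [1, 0, 0, 1]
toy_degrees = [10, 2, 5, 1]
toy_sample_mean = sum(toy_outcomes) / len(toy_outcomes)
toy_rds_ii = rds_ii_estimate(toy_outcomes, toy_degrees)
```

In this toy example the low-degree respondent with a positive outcome receives a large weight, so the weighted estimate is larger than the unweighted sample mean. When all degrees are equal the two coincide.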
Data generating process
To match our knowledge of the true network as closely as possible, we simulated samples with replacement directly from the sampled network trees of the original RDS data. This process guarantees that each adjacent sampled person is indeed adjacent to their simulated recruiter. It also preserves the observed rates of mixing across subgroups in the population. We simulated samples both using BBS-Lite Snowball Sampling, and RDS, with the latter serving as a check for artifacts of the sampling process induced by our re-sampling procedure over trees rather than a full network. Our primary interest is in whether the simulated samples approximate the original RDS samples.
This study focuses on differences between RDS and BBS-Lite Snowball Sampling resulting from two key differences between the two methods: the composition of the initial sample and the depth, or number of waves, of sampling. For our simulation of snowball sampling, we selected initial seeds from the participants who responded “yes” to our service use question (in separate simulations, this is either the ‘Receive condoms’ or ‘HIV testing history’ variable). For RDS, seeds were chosen at random. In RDS, the number of initial seeds and the number of people that one person can recruit are small (usually 2-3 recruits in practice), and there is no limit on the number of waves. For our simulation of BBS-Lite snowball sampling, we limited the number of waves to a maximum of 2 and the potential number of participants recruited from an individual to 3 or fewer. With these constraints, for the simulated BBS-Lite snowball sample to attain the same sample size as RDS, the number of initial seeds (i.e., those recruited through HIV services) was considerably larger. We summarize these 3 sampling conditions in Table 5.
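A rough back-of-the-envelope calculation shows why the seed count must grow under these constraints. The sketch below assumes that each participant's number of recruits is uniform on {0, 1, 2, 3} and that every recruiter always has someone available to recruit; both are idealizations introduced here for illustration (in the re-sampling, recruiters without recruits in the original tree contribute none, so real chains would be shorter). The function names are hypothetical.

```python
import math


def expected_chain_size(max_recruits: int = 3, waves: int = 2) -> float:
    """Expected participants per seed (seed included), assuming recruits ~ Uniform{0..max_recruits}."""
    mean_recruits = sum(range(max_recruits + 1)) / (max_recruits + 1)
    return sum(mean_recruits ** w for w in range(waves + 1))


def max_chain_size(max_recruits: int = 3, waves: int = 2) -> int:
    """Largest possible chain per seed if everyone recruits the maximum."""
    return sum(max_recruits ** w for w in range(waves + 1))


def seeds_needed(target_size: int, chain_size: float) -> int:
    """Approximate number of seeds needed to reach target_size."""
    return math.ceil(target_size / chain_size)


typical_seeds = seeds_needed(400, expected_chain_size())  # based on the expected chain size
fewest_seeds = seeds_needed(400, max_chain_size())        # based on the largest possible chain
```

Under these assumptions a seed yields 1 + 1.5 + 2.25 = 4.75 participants on average and at most 1 + 3 + 9 = 13, so a sample of 400 would need on the order of 85 seeds, and no fewer than 31.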
The detailed sampling steps were:
- BBS-Lite/Snowball
  1. Randomly select one initial seed from the service-users group (according to ‘Receive condoms’ or ‘HIV testing history,’ depending on the simulation setting).
  2. Randomly assign 0-3 as the number of recruits.
  3. Sample with replacement the assigned number of recruits from among respondents linked to the recruiter in the original RDS data. (Add the new recruits to the set of potential new recruiters.)
  4. Repeat steps 2-3 once to sample 2 waves.
  5. Repeat steps 1-4 until the desired sample size is reached.
- RDS
  1. Randomly select seeds from the original data. The number of seeds is the same as the original data.
  2. Select the first node in the sample that has not yet served as recruiter.
  3. Randomly assign 0-3 as the number of recruits for this recruiter.
  4. Sample with replacement the assigned number of recruits linked to the recruiter in the original RDS data.
  5. Repeat steps 2-4 until the desired sample size is reached.
Each sampling procedure was repeated 1000 times.
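The two procedures above can be sketched in code as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: the original recruitment tree is represented as a hypothetical `children` dictionary mapping each recruiter to their recruits, "linked to the recruiter" is taken to mean the recruiter's own recruits, the number of recruits is drawn uniformly from 0-3, RDS seeds are drawn without replacement, and samples are truncated to the target size. If the simulated RDS chains die out early, this sketch simply returns a smaller sample.

```python
import random
from typing import Dict, Hashable, List, Optional, Sequence

Tree = Dict[Hashable, List[Hashable]]  # recruiter -> recruits in the original RDS data


def simulate_bbs_lite(children: Tree, service_users: Sequence[Hashable], target_size: int,
                      max_recruits: int = 3, max_waves: int = 2,
                      rng: Optional[random.Random] = None) -> List[Hashable]:
    """Repeatedly start a short chain from a random service user (steps 1-5)."""
    rng = rng or random.Random()
    sample: List[Hashable] = []
    while len(sample) < target_size:
        seed = rng.choice(service_users)                  # step 1
        sample.append(seed)
        current_wave = [seed]
        for _ in range(max_waves):                        # steps 2-4: two waves
            next_wave: List[Hashable] = []
            for recruiter in current_wave:
                n_recruits = rng.randint(0, max_recruits)
                candidates = children.get(recruiter, [])
                if candidates:                            # sample with replacement
                    next_wave.extend(rng.choices(candidates, k=n_recruits))
            sample.extend(next_wave)
            current_wave = next_wave
    return sample[:target_size]


def simulate_rds(children: Tree, all_nodes: Sequence[Hashable], n_seeds: int, target_size: int,
                 max_recruits: int = 3, rng: Optional[random.Random] = None) -> List[Hashable]:
    """Grow chains from a few random seeds with no limit on waves (steps 1-5)."""
    rng = rng or random.Random()
    sample = rng.sample(list(all_nodes), n_seeds)         # step 1 (without replacement: assumption)
    next_recruiter = 0
    while len(sample) < target_size and next_recruiter < len(sample):
        recruiter = sample[next_recruiter]                # step 2: first node not yet a recruiter
        next_recruiter += 1
        n_recruits = rng.randint(0, max_recruits)         # step 3
        candidates = children.get(recruiter, [])
        if candidates:                                    # step 4
            sample.extend(rng.choices(candidates, k=n_recruits))
    return sample[:target_size]


# Hypothetical toy recruitment tree: a -> b, c; b -> d; d -> e; e -> f.
TOY_CHILDREN: Tree = {"a": ["b", "c"], "b": ["d"], "d": ["e"], "e": ["f"]}
TOY_NODES = ["a", "b", "c", "d", "e", "f"]

bbs_sample = simulate_bbs_lite(TOY_CHILDREN, ["a"], target_size=20, rng=random.Random(0))
rds_sample = simulate_rds(TOY_CHILDREN, TOY_NODES, n_seeds=2, target_size=20, rng=random.Random(0))
```

In the toy tree, nodes `e` and `f` are more than two steps from the only service user `a`, so they can never appear in the BBS-Lite sample, whereas the RDS sketch can reach them.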
Measures related to sample differences
We focused on two population structures, reflected in our data and in other RDS studies, that might induce bias in BBS-Lite samples, as compared to RDS: the complete inaccessibility of some people in BBS-Lite, and the dependence between the service variable used to seed the RDS study and variables of interest.
The structure of BBS-Lite snowball sampling means that only population members within 2 network steps of health service users can be sampled; the rest of the population is inaccessible. This is true in a real population and also in our simulations. To study the impact of this non-accessibility, we compared the sample composition of accessible and inaccessible respondents in the Sialon II data. The results are in Table 6, which includes the results of nominal -tests comparing the accessible and inaccessible groups for each variable of interest. As expected, the service usage variables used to select seeds differed dramatically across the accessible and inaccessible groups in all cases. Many other variables had nominally significant differences between the accessible and inaccessible groups. In these cases, we may expect to see important biases in BBS-Lite compared to RDS.
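The accessible set can be computed directly from the recruitment tree. The sketch below is a hypothetical illustration: it assumes that the tree is stored as a `children` dictionary (recruiter to recruits) and that a "network step" follows a recruitment link from recruiter to recruit, then compares the mean of a target variable between the two groups. The names and the toy data are not from the Sialon II data.

```python
from typing import Dict, Hashable, Iterable, List, Set, Tuple

Tree = Dict[Hashable, List[Hashable]]  # recruiter -> recruits in the original RDS data


def accessible_nodes(children: Tree, service_users: Iterable[Hashable],
                     max_steps: int = 2) -> Set[Hashable]:
    """Nodes reachable within max_steps recruitment links of any service user."""
    reachable = set(service_users)
    frontier = set(reachable)
    for _ in range(max_steps):
        frontier = {c for r in frontier for c in children.get(r, [])} - reachable
        reachable |= frontier
    return reachable


def group_means(values: Dict[Hashable, float],
                accessible: Set[Hashable]) -> Tuple[float, float]:
    """Mean of a variable among accessible and inaccessible respondents."""
    acc = [v for node, v in values.items() if node in accessible]
    inacc = [v for node, v in values.items() if node not in accessible]

    def mean(xs: List[float]) -> float:
        return sum(xs) / len(xs) if xs else float("nan")

    return mean(acc), mean(inacc)


# Hypothetical toy tree and binary target variable.
toy_children: Tree = {"a": ["b", "c"], "b": ["d"], "d": ["e"], "e": ["f"]}
toy_values = {"a": 0, "b": 0, "c": 1, "d": 0, "e": 1, "f": 1}

toy_accessible = accessible_nodes(toy_children, ["a"])
toy_acc_mean, toy_inacc_mean = group_means(toy_values, toy_accessible)
```

In this toy example the accessible group is `{a, b, c, d}`, and the target variable is much more common among the two inaccessible nodes, which is the kind of difference the comparison in Table 6 is meant to detect.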
An association between the seed and target variables may also induce bias (summarized in Table 7). To measure this, we used a semiparametric test for bivariate association (SPRTBA) [23], which is designed to infer bivariate association between categorical variables in RDS samples, for the categorical variables, and logistic regression as a heuristic for the continuous variables (‘Age,’ ‘NSMP’ and ‘Unprotected NSMP’). Because these two tests use different test statistics, we report only the p-values of each test. Note that the logistic regression may have inflated type-I error rates due to the dependence within the RDS sample, so we interpret Table 7 as a heuristic guide rather than a formal statistical test.
Bias measurement
We evaluated the performance of each simulation setting by comparing the composition of the set of the simulated samples to the composition of the original true RDS data. If the sampling method does not impact sample composition, we expect the observed data composition to be “typical” of the simulated samples. To measure this, we used the quantiles of the observed sample mean among the sample means of each simulated dataset using a measure we called dQ:
Here, X_i are simulated samples, x_obs is the value from the original RDS data, and 𝟙 is an indicator function taking the value 1 when its argument is true, and 0 otherwise.
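As a hedged illustration of a quantile-based measure of this kind, the sketch below assumes that dQ is the absolute distance between 0.5 and the empirical quantile of x_obs among the simulated sample means, that is, $|\frac{1}{N}\sum_{i=1}^{N} \mathbb{1}(X_i \le x_{obs}) - 0.5|$. This form, including the choice of $\le$ rather than $<$, is an assumption consistent with the description above and with values ranging from 0 to 0.5; the function name is hypothetical.

```python
from typing import Sequence


def d_q(simulated_means: Sequence[float], observed_mean: float) -> float:
    """Distance of the observed mean's empirical quantile from the median (assumed form).

    Returns 0 when the observed value sits in the middle of the simulated
    sample means and 0.5 when it lies beyond all of them.
    """
    if not simulated_means:
        raise ValueError("need at least one simulated sample mean")
    n_at_or_below = sum(1 for x in simulated_means if x <= observed_mean)
    return abs(n_at_or_below / len(simulated_means) - 0.5)


# Hypothetical simulated sample means for one variable.
toy_simulated = [0.1, 0.2, 0.3, 0.4]
dq_typical = d_q(toy_simulated, 0.25)  # observed value in the middle
dq_extreme = d_q(toy_simulated, 0.50)  # observed value above every simulation
```

Under this assumed form, small values mean the original RDS composition is "typical" of the simulated samples, and values near 0.5 mean nearly all simulated samples fall on one side of it.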
Results
The primary study results are summarized in Fig 2. Each set of 3 boxplots compares the 1000 simulated samples in each of the 3 simulated sampling conditions with the sample mean of the original RDS data. Boxplots where the red line is far from the middle indicate simulations with biased sample compositions compared to the original RDS data.
Table 8 gives the corresponding measures of dQ. It also indicates the nominal significance levels of the tests comparing accessible and inaccessible samples, as well as the tests for association with seed variables. We see that in nearly all cases when there are nominally significant differences between accessible and inaccessible samples, there is substantial bias in BBS-Lite Snowball Samples. However, this does not explain all observed biases (for example, the Age variable in Lithuania (Fig 2)). In almost all the remaining cases of substantial bias, the association test is nominally significant.
The * indicates a significant difference between the accessible and inaccessible samples through the -test. The † indicates a significant correlation between Seed variables and selected variables.
Figs 3 and 4 plot dQ colored by the nominal significance levels. In Fig 3, we see that the high bars (dQ large) are often, but not always, colored, and several low bars are colored, suggesting that the difference between the accessible and inaccessible populations is a helpful but not wholly reliable indicator of bias. In Fig 4, almost all the high bars are colored, and the low bars are not. This suggests that the association between the seed and target variables is a more reliable indicator of bias. We also note that when there is a nominally significant difference between accessible and inaccessible samples, there is also a nominally significant association in most cases.
The bar height represents the measure of dQ, and the color represents the significance levels of the -test. The last plot shows the significance levels. * indicates a significant difference between the accessible and inaccessible samples through the tests of association (*p-value < 0.5, **p-value < 0.1, ***p-value < 0.01).
The bar height represents the measure of dQ, and the color represents the significance levels of the SPRTBA/logistic regression. The last plot shows the significance levels. † indicates a significant association using SPRTBA/logistic regression between Seed variables and selected variables (†p-value < 0.5, ††p-value < 0.1, †††p-value < 0.01).
We also consider the simulated RDS sampling results as a sanity check. If the simulated RDS samples differ systematically from the original RDS data, then the biases we are seeing in the simulated BBS-Lite Snowball Samples may be due to features other than the approximations of the BBS-Lite structure. In Table 8, we see that for RDS re-sampling, most dQ values are small. However, we find that the dQ values for the variables ‘HIV’, ‘Unprotected NSMP’, ‘ART Coverage’ and ‘Injected Drug’ are quite high in some countries (0.2 < dQ < 0.3). In the cases of the ‘HIV’ and ‘Injected Drug’ variables, the sample proportion is very small. In particular, in Lithuania and Slovakia, the sample proportion of ‘HIV’ and ‘Injected Drug’ is less than 0.05. We see from Fig 5 that there are very few HIV-positive or injecting drug user respondents, so when sampling with a small number of seeds, such respondents are less likely to be sampled. Because of the small numbers in these groups, the probability of sampling someone who is HIV positive or has injected drugs depends strongly on the specific selection of seeds. In the case of the ‘Unprotected NSMP’ and ‘ART Coverage’ variables, the number of informative samples responding to these variables is small. ‘ART Coverage’ applies only to HIV-positive cases, and ‘Unprotected NSMP’ also has few non-zero respondents. Therefore, the resamples have high variance, and bias in RDS resamples, as compared to the RDS reference, occurs largely in the cases of very small sample fractions.
Discussion
In this project, we introduced a method for studying the implications of different network-based sampling methods on sample composition. In particular, we consider the implications of snowball sampling as intended under BBS-Lite, including a restricted number of waves and selecting only seeds accessing health services, as compared to standard BBS sampling using RDS. The method uses previously-sampled RDS data, which may be available in settings considering BBS-Lite sampling.
The simulation found that the sample compositions of RDS re-samples were largely consistent with the original data, while the sample compositions from the BBS-Lite snowball re-samples were often quite different from the original data. We measured these differences using dQ, a measure reflective of bias scaled by variability.
We compare these methods on 11 target variables in data from 4 countries, with a range of variable and sample characteristics. We also consider 2 diagnostic measures, which can be computed based on RDS data alone and are associated with features we expect to be related to bias induced by the limitations of BBS-Lite Snowball Sampling. We find that nominal tests of association between the seed variable and the variables of interest are reliable indicators of substantial differences in sample composition between RDS and BBS-Lite Snowball Sampling. We expect that this is because such dependence induces over (or under) representation of the variable of interest in the BBS-Lite snowball samples with their short sample chains beginning with a given seed population. The bias is exacerbated by homophily on the variable of interest, which induces strong dependence between the initial sample and all other samples collected within the 2 waves in the BBS-Lite approach.
We also consider differences in the sample composition of accessible and inaccessible subsets of nodes based on the BBS-Lite sampling strategy. In the simulated BBS-Lite Snowball Sampling setting, we only recruit two steps away from the initial seeds, making some parts of the original data (and of real populations) inaccessible to the snowball samples. We find that if the accessible and inaccessible groups are nominally significantly different with respect to a target variable, the BBS-Lite sample composition is usually also biased with respect to the RDS sample composition, although this result is not as consistent as the association diagnostic, and the absence of nominal difference in accessibility does not assure similar sample composition.
We found a few cases where the sample proportions of both BBS-Lite and RDS re-samples were biased or had extremely high variance. These instances corresponded to variables with very little variability in the original data. It is also of note that the variance of estimates tends to be higher with RDS re-sampling than with snowball re-sampling. This is because of the greater mixing of the RDS samples, which also increases the representativeness of those samples.
Our study here has focused on the difference in sample composition between RDS and BBS-Lite. This is because the assumptions needed for inference from RDS data are clearly not met by BBS-Lite. This means the BBS-Lite samples should not be used to directly estimate population proportions. We hope that our study has shown some conditions when there should be greater or lesser comparability between a BBS-Lite sample and an RDS sample. In cases where we have some confidence the samples may be comparable, BBS-Lite studies executed between less frequent RDS studies might be used to monitor population change compared to benchmark RDS data.
Acknowledgments
Disclaimer: Some of the authors are present staff members of the World Health Organization. The authors alone are responsible for the views expressed in this article and they do not necessarily represent the views, decisions or policies of the institutions with which they are affiliated.
References
- 1. Goodman LA. Snowball sampling. Ann Math Statist. 1961;32(1):148–70.
- 2. Coleman JS. Relational analysis: the study of social organizations with survey methods. Human Organization. 1958;17(4):28–36.
- 3. Heckathorn DD. Respondent-driven sampling: a new approach to the study of hidden populations. Social Problems. 1997;44(2):174–99.
- 4. Heckathorn DD. Respondent-driven sampling II: deriving valid population estimates from chain-referral samples of hidden populations. Social Problems. 2002;49(1):11–34.
- 5. Malekinejad M, Johnston LG, Kendall C, Kerr LRFS, Rifkin MR, Rutherford GW. Using respondent-driven sampling methodology for HIV biological and behavioral surveillance in international settings: a systematic review. AIDS Behav. 2008;12(4 Suppl):S105-30. pmid:18561018
- 6. TenHouten WD. Generalization and statistical inference from snowball samples. Bulletin of Sociological Methodology/Bulletin de Méthodologie Sociologique. 1992;37(1):25–40.
- 7. Snijders TAB. Estimation on the basis of snowball samples: how to weight? Bulletin of Sociological Methodology/Bulletin de Méthodologie Sociologique. 1992;36(1):59–70.
- 8. Frank O. Statistical inference in graphs. FOA Repro; 1971.
- 9. Leon L, Jauffret-Roustide M, Le Strat Y. Design-based inference in time-location sampling. Biostatistics. 2015;16(3):565–79. pmid:25597489
- 10. Karon JM, Wejnert C. Statistical methods for the analysis of time-location sampling data. J Urban Health. 2012;89(3):565–86. pmid:22421885
- 11. Muhib FB, Lin LS, Stueve A, Miller RL, Ford WL, Johnson WD, et al. A venue-based method for sampling hard-to-reach populations. Public Health Reports. 2001;116(S1):216–22.
- 12. Watters JK, Biernacki P. Targeted sampling: options for the study of hidden populations. Social Problems. 1989;36(4):416–30.
- 13. Gile KJ, Handcock MS. Respondent-driven sampling: an assessment of current methodology. Sociol Methodol. 2010;40(1):285–327. pmid:22969167
- 14. Volz E, Heckathorn DD. Probability based estimation theory for respondent-driven sampling. Journal of Official Statistics. 2008;24(1):79.
- 15. Salganik MJ, Heckathorn DD. Sampling and estimation in hidden populations using respondent-driven sampling. Sociological Methodology. 2004;34(1):193–240.
- 16. Johnston LG. Behavioural surveillance: Introduction to respondent driven sampling. Atlanta, GA: Centers for Disease Control and Prevention; 2008.
- 17. Chopra M, Townsend L, Johnston L, Mathews C, Tomlinson M, O’bra H, et al. Estimating HIV prevalence and risk behaviors among high-risk heterosexual men with multiple sex partners: use of respondent-driven sampling. J Acquir Immune Defic Syndr. 2009;51(1):72–7. pmid:19282783
- 18. Hladik W, Barker J, Ssenkusu JM, Opio A, Tappero JW, Hakim A, et al. HIV infection among men who have sex with men in Kampala, Uganda–a respondent driven sampling survey. PLoS One. 2012;7(5):e38143. pmid:22693590
- 19. Montealegre JR, Johnston LG, Murrill C, Monterroso E. Respondent driven sampling for HIV biological and behavioral surveillance in Latin America and the Caribbean. AIDS Behav. 2013;17(7):2313–40. pmid:23568227
- 20. White RG, Hakim AJ, Salganik MJ, Spiller MW, Johnston LG, Kerr L, et al. Strengthening the reporting of observational studies in epidemiology for respondent-driven sampling studies: “STROBE-RDS” statement. J Clin Epidemiol. 2015;68(12):1463–71. pmid:26112433
- 21. Semá Baltazar C, Boothe M, Chitsondzo Langa D, Sathane I, Horth R, Young P, et al. Recognizing the hidden: strengthening the HIV surveillance system among key and priority populations in Mozambique. BMC Public Health. 2021;21(1):91. pmid:33413261
- 22. Gios L, Mirandola M, Toskin I, Marcus U, Dudareva-Vizule S, Sherriff N, et al. Bio-behavioural HIV and STI surveillance among men who have sex with men in Europe: the Sialon II protocols. BMC Public Health. 2016;16:212. pmid:26935752
- 23. Kim D, Gile KJ, Guarino H, Mateu-Gelabert P. Inferring bivariate association from respondent-driven sampling data. Journal of the Royal Statistical Society Series C: Applied Statistics. 2021;70(2):415–33.