Social venue range and referral chain impact: Implications for the sampling of hidden communities

Background It has been argued that the success of respondent-driven sampling (RDS) in generating unbiased estimates for epidemiologic outcomes depends on participants’ abilities to generate long referral chains. While this is thought to depend on the number of people participants know in the target population, this idea is rarely tested. Furthermore, little attention has been paid to the role of other aspects of social connectedness in recruitment, such as participants’ involvement in local clubs and venues. Methods We examine whether the recruitment potential of young Black men who have sex with men (YBMSM) depends on (1) their social network size and (2) their affiliations with a variety of sex venues across geographic areas. We analyze data from a 2014 RDS study of 598 YBMSM on the south side of Chicago. Results Both a participant’s personal network size and the number of different types of sex venues with which he was affiliated were significantly positively associated with (1) the total number of recruits in the participant’s prospective referral chains and (2) the lengths of those chains. However, only venue affiliation remains significantly associated with recruitment potential in the multivariate model. Conclusions The success of RDS in generating valid samples may depend more on recruiting participants who are involved in multiple venues within the community (i.e., their affiliation networks) than on recruiting those who have large personal social networks.

and (2) form relationships with new contacts in sex markets [32]. On the other hand, the extent of MSM's sex venue involvement and the sizes of their personal social networks may be unrelated in some cases. For one, the principle of social network "homophily" (the tendency for individuals to form ties with others who are similar to them) suggests that one can have numerous network ties and yet exhibit little diversity in one's sex venue exposure [33,34]. Conversely, research shows that people who serve as bridges between otherwise poorly connected groups do not always have large networks [35,36]. These observations suggest that MSM network size and sex venue structure need not be highly correlated. This raises the possibility that the ability of MSM to expand a sample through recruitment of new members may be as closely tied to those MSM's involvement in sex venues, where they are exposed to larger pools of people whom they might not even know, as it is to those MSM's personal social networks.

Study design and objectives
This analysis draws on data from the first wave of the uConnect study (n = 622), completed in 2014. uConnect was implemented by a team of epidemiologists, public health researchers, and social scientists who have expertise in HIV research and prevention, social networks, and sexual behavior in minority communities. It focuses on risk behaviors and health in the lives of young Black MSM (YBMSM) in South Chicago. South Chicago and adjacent neighborhoods constitute the largest Black community area in the United States, and include the highest HIV incidence rates in Chicago [37]. Participants completed a questionnaire about their demographic and background information, social network characteristics, community involvement, sex and sex-drug risk/reduction practices, and STI/HIV testing/treatment. The Survey Research Lab at the University of Chicago conducted pilot interviews and cognitive testing of questions. Study participants provided written informed consent. The Institutional Review Boards at the University of Chicago and NORC at the University of Chicago approved the research protocols.

Sample recruitment and interviewing
RDS implementation began by selecting a diverse profile of YBMSM to serve as seeds. The research team first gathered a group of about twenty community partners to help identify potential seeds in the community. Seeds were recruited using a variety of techniques, including posts on web sites and Facebook and outreach at college campuses, health clinics, and community events. In all, 38 productive seeds, comprising 6.7% of the final sample, were recruited. This excludes 24 initial interviewees who did not recruit any participants. (None of the results reported here depend on whether these non-productive seeds are included.) The median number of referrals was 3. Among all respondents, the distribution of the number of successful referrals was 0 (6.4%), 1 (44.3%), 2 (24.4%), 3 (15.1%), 4 (5.9%), 5 (2.5%) and 6 (1.4%), yielding maximum referral chains of 13. All of the 38 seeds and subsequent recruits were interviewed. At the end of these interviews, each participant was trained on recruiting other YBMSM using six coupons with unique ID numbers. Each participant was offered $60 for his participation in the interview and $20 for each of his recruits. We study the final referral network, which contains a total of 598 YBMSM, including 38 productive seeds and 560 subsequent ("prospective") recruits.
Interviews with each of the participants were conducted in private offices on the University of Chicago campus by trained interviewers. Interviews were conducted using Computer-Assisted Personal Interviewing (CAPI). Some sections of the interview were self-administered. The interview itself involved a variety of questions and activities, yielding an average interview length of 138 minutes.

Study variables
The dependent variables measure the extent to which a given participant's recruitment efforts impacted the eventual uConnect sample. This is measured in terms of the nature of referral chains that stemmed from a participant's recruits. We are not merely interested in whether the respondent was able to recruit participants. Rather, we are interested in whether the referral chains that were generated by participants expanded in the manner preferred by RDS. We measure this in two ways: 1) the total number of RDS recruits in a given respondent's downstream (i.e., prospective) recruitment chain, that is, how many participants were recruited by the respondent, his recruits, and their subsequent recruits (observed range: 0 to 137); and 2) the length of the longest recruitment chain that stems from the respondent (observed range: 0 to 13).
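The two dependent variables can be illustrated with a minimal sketch. It assumes the referral data are available as (recruiter, recruit) pairs in which each recruit has exactly one recruiter, so that the referral network is a forest; the function name `chain_measures` and the toy IDs are hypothetical and are not drawn from the uConnect data or code.

```python
from collections import defaultdict

def chain_measures(referrals):
    """Map each participant to (total downstream recruits, longest chain length).

    `referrals` is an iterable of (recruiter_id, recruit_id) pairs. Assumes
    each recruit appears with only one recruiter (a forest, no cycles).
    """
    children = defaultdict(list)
    participants = set()
    for recruiter, recruit in referrals:
        children[recruiter].append(recruit)
        participants.update((recruiter, recruit))

    measures = {}

    def visit(person):
        # Reuse a result that was already computed for this participant.
        if person in measures:
            return measures[person]
        total, longest = 0, 0
        for child in children.get(person, []):
            child_total, child_longest = visit(child)
            total += 1 + child_total                   # the recruit plus his own chain
            longest = max(longest, 1 + child_longest)  # one step further down
        measures[person] = (total, longest)
        return measures[person]

    for person in participants:
        visit(person)
    return measures

# Toy example (hypothetical IDs): seed "s" recruits "a" and "b";
# "a" recruits "c"; "c" recruits "d".
example = [("s", "a"), ("s", "b"), ("a", "c"), ("c", "d")]
result = chain_measures(example)
```

In this toy network the seed has four downstream recruits and a longest chain of length three, while a participant who recruits no one scores zero on both measures, matching the lower bounds of the observed ranges.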
We examine how social network size and the range of sexual/social venues in which MSM are involved relate to these two measures. Network size is measured as the total number of Black MSM that a given respondent reported knowing (range: 0 to 200). To measure sexual/social venue range, we draw on respondents' reports of how often, during the past 12 months, they had gone to four different types of venues to "meet or socialize with other men." The four types included clubs/bars, gyms, adult bookstores/bathhouses and House/Ball spaces. For each type of venue, we asked where these venues were located. Options ("all that apply") included south side, north side, west side, east side, south side suburbs, or other. We multiplied the number of the four types of venues by the number of the six geographic identifiers to calculate overall sexual/social venue range.
We control for several other factors that may impact referral chain growth, including: 1) at what point (i.e., recruitment "wave") the participant was recruited in the referral chain, operationalized using a continuous measure indicating wave and a squared term to allow for nonlinearity; 2) the participant's age; 3) the participant's ethnicity; 4) whether he was a resident of the south side or a suburb in the south side; 5) indicators of his sexual orientation (gay, bisexual, or other non-gay orientation); and 6) a count of the number of geographic regions in Chicago (range 0 to 6, as above) in which he had attempted to meet men in public places (e.g., public parks) during the past 12 months. To reduce endogeneity associated with our main predictors, we also include two other measures of social connectedness to this population: 7) the number of sex partners with whom the participant reported having sex during the last six months (range: 0 to 100); and 8) an indicator of whether he had used a social media or hook-up application (e.g., Facebook) during the past 24 hours.
uConnect study through their direct recruitment efforts and/or via the efforts of the subsequent recruits in their referral chains. Non-seed recruits generated, on average, 3.5 subsequent recruits. (This number is low partly due to the fact that 57.4% of non-seed participants did not recruit anyone, including those who did not have time to distribute their coupons within the study's recruitment time window.) A total of 186 participants generated more than one recruit in their referral chain. The length of the longest referral chain averaged 2.8 among seeds and 0.97 among other participants. Of the 45.7% of the sample who recruited at least one participant, the average chain length was 2.7, as 54.1% of these people yielded chains containing at least two waves of participants. These recruitment efforts resulted in several long referral chains, with 473 participants (83.4% of the sample) being included in referral networks that contained at least 10 people.
On average, seeds reported a network size of 6.8, and other participants reported an average of 4.2. Seeds reported involvement in 1.9 types of sex venues, on average, and other recruits reported involvement in an average of 1.3 types of venues. Overall, 409 (72.1%) reported involvement in at least one type of sex venue, and 222 (39.2%) reported involvement in more than one type of venue. The most popular venues were clubs or bars on the north side of Chicago (52.2%), followed by House/Ball venues on the south side (24.0%) and south side clubs/bars (19.4%).
Bivariate regression analyses (not shown) provide mixed, marginal evidence of an association between personal network size and recruitment. Personal network size is not significantly associated with the rate of recruit referrals (IRR = 1.020, p = .130), and it is marginally positively associated with the length of recruitment chains (IRR = 1.017, p = .053). The number of venue types in which respondents were involved is more clearly associated with impact on RDS referral chain growth. In a bivariate model (not shown), each additional venue type is associated with a 24.1% increment in the rate of network recruit referral (p = .006). This association is depicted in the scatterplot shown in the left panel of Fig 2. Likewise, each additional venue type is associated with a 14.5% increment in the length of recruitment chains (p = .015). This is depicted in the scatterplot in the right panel of Fig 2.
Neither of the associations between personal network size and measures of recruitment success is significant in the multivariate analysis. The associations between venue range and recruitment success, however, remain statistically significant. Table 2 reveals that each additional venue type is associated with a 30.7% increment in the number of recruits (p < .001).
The multivariate association is illustrated in Fig 3. For example, a YBMSM who was interviewed in the fifth wave and who reported no sex venue involvement is projected to have about three downstream recruits, while someone in the same wave who reported involvement in five types of venues is projected to have about eleven downstream recruits. YBMSM who were enrolled in the middle of the study (e.g., wave 5) had a greater impact on the growth of the referral network than either those recruited early on (wave = 1) or late in the study (wave = 10). Table 2 also shows that each additional venue type is associated with a 14.7% increment in the length of the longest referral chain (p = .010). This is illustrated in Fig 4. A YBMSM who was interviewed in the fifth wave and who reported no venue involvement, for example, is projected to have a prospective chain length of about one, while a participant who reported involvement in five venue types is projected to have a chain length of about two. These results hold when using an alternative RDS person-level weight [5] (S1 Table) as well as (marginally) when no weight is used (S2 Table).

Table 2. Incidence rate ratios from multivariate negative binomial regression models predicting (1) total size and (2) maximum chain length of the prospective RDS recruitment networks of MSM in the uConnect study, 2013-2014 (N = 567)
Abbreviations: RDS, respondent-driven sampling; IRR, incidence rate ratio; CI, confidence interval; MSM, men who have sex with men.
a Applied only to respondents who were provided at least one coupon to distribute.
All models are adjusted using Gile's (2011) person-level RDS sequential sampling weights. Models that employ the Volz-Heckathorn (2008) RDS-II weights yield similar results (S1 Table).
https://doi.org/10.1371/journal.pone.0181494.t002
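The projections quoted for Fig 3 and Fig 4 can be roughly reproduced from the reported percentage increments. Negative binomial regression uses a log link, so an incidence rate ratio acts multiplicatively on the expected count for each one-unit increase in the predictor. The sketch below assumes all other covariates and weights are held fixed and takes the rounded baselines from the text (about three recruits and a chain length of about one at zero venue types); the helper `projected` is hypothetical and is not part of the study's analysis code.

```python
def projected(baseline, pct_increment_per_type, n_types):
    """Scale a baseline expected count by the IRR once per venue type."""
    irr = 1 + pct_increment_per_type / 100  # e.g. a 30.7% increment means IRR = 1.307
    return baseline * irr ** n_types

# Total downstream recruits: 30.7% increment per venue type, five venue types.
recruits_5 = projected(3, 30.7, 5)

# Longest chain length: 14.7% increment per venue type, five venue types.
chain_5 = projected(1, 14.7, 5)

print(round(recruits_5, 1), round(chain_5, 1))
```

Under these assumptions the results land close to the "about eleven" recruits and chain length of "about two" described for a wave-5 participant with five venue types. The small gap for recruits reflects the rounding of the baseline, not a different model.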

Discussion
There is still much to learn about the link between study participants' social connectedness and the success of chain-referral recruitment efforts in hidden populations. The findings of the present study suggest not only that personal network size is not always a useful estimate of participants' impact on prospective recruitment, but also that exposure to non-specific (perhaps unknown) others through a wide-ranging variety of community venues may be a more useful predictor. Young Black MSM who were affiliated with a wider variety of sexual/social venues in different communities not only yielded a larger number of prospective recruits but also generated longer referral chains. These findings hold net of numerous other significant predictors of YBMSM's impact on RDS referral chains, including at which wave participants were added to the study, the participant's age, sexual orientation, their use of social media, their tendency to seek MSM in public places, and, most importantly, the size of their social networks. These findings have important implications for studies of at-risk populations. Because longer referral chains yield larger and more diverse samples, these results suggest that participants who have more diverse venue affiliations may also generate samples that produce less biased estimates [1,12]. Chain-referral studies of hidden populations should remain mindful of not only how the sizes of participants' personal social networks (i.e., the number of others in the target population whom they know) shape their impact on recruitment, but also how participants' exposure to broader pools of potential contacts shapes recruitment. Our analyses provide some evidence that this more generalized form of connectedness (involvement in a range of venues) may even have a larger impact on sample growth than does personal social network size.
Post-estimation tests reveal that this difference in magnitude is statistically significant with respect to both the size of a participant's overall referral network (χ² = 11.1, p < .001) and the length of his longest referral chain (χ² = 5.4, p < .03). Finally, we should note that participants' use of online social media to find sexual partners is significantly associated with the total number of prospective recruits (IRR = 1.716, p < .05). This finding is also consistent with the notion that social connectedness behaviors that link participants to more generalized pools of potential (weak) MSM contacts are more consequential for subsequent recruitment than is personal network size.
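As a rough check on the post-estimation tests, the upper-tail probabilities of the two reported chi-square statistics can be computed directly. The degrees of freedom are not reported here, so the sketch below assumes 1 degree of freedom (a single contrast between two coefficients); that assumption and the helper name `chi2_sf_df1` are ours, not the authors'.

```python
import math

def chi2_sf_df1(x):
    """Upper-tail probability of a chi-square statistic with 1 degree of freedom.

    Uses the identity P(chi2_1 > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)).
    """
    return math.erfc(math.sqrt(x / 2))

p_network_size = chi2_sf_df1(11.1)  # test on total referral network size
p_chain_length = chi2_sf_df1(5.4)   # test on longest referral chain

print(f"{p_network_size:.4f} {p_chain_length:.4f}")
```

Under the 1-degree-of-freedom assumption, the first statistic falls below the .001 threshold and the second falls between .01 and .03, which is consistent with both differences being described as statistically significant.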
This study has several limitations. For one, the data were self-reported. Response bias is possible in the form of over- or underestimation of participants' social network sizes. Such bias, however, would likely be similar across participants and it would be difficult to determine whether it would behave differently from response biases that are related to participants' venue exposure. Estimating one's network size may be a more cognitively challenging process than responding to fixed items regarding affiliation with types of sexual/social venues. Relatedly, our measure of venue range may underrepresent the true range of potential types of venues with which YBMSM are involved, and may also suffer from underreporting due to social desirability bias. And we should underscore that while the measures of both network size and venue range may capture exposure to a variety of MSM in the community, they do not capture other key parts of the recruitment process, such as the ability of participants to recruit people who are likely to redeem the coupons. Finally, these findings are drawn from a sample that was observed within a specific geographic area. The impact of venues on other populations' recruitment efforts may depend on local variability in the presence of venues. Regardless, YBMSM represent a population at greatest risk of HIV in the United States and maximizing recruitment efficiency in this group can provide valid estimates and engage larger numbers of YBMSM in research and downstream HIV prevention interventions.
These findings should inform selection criteria for seeds in future chain-referral RDS studies, and may impact the parameters that are utilized for the weighting of RDS, which currently focus on personal network size. The focus on sexual/social venue involvement as an affiliation-based, as opposed to an individual-based, network approach [18,40,41] suggests the need for alternative or supplemental weighting schemas. We recommend that careful analyses be conducted to determine whether estimates that are derived using more conventional estimators such as the Volz-Heckathorn [5] estimator and the Gile [39] estimator are sensitive to participants' venue affiliations. The development of more efficient referral chains has the potential to decrease the costs of RDS studies, reduce bias in estimates, and engage more at-risk community members around HIV prevention research and intervention.

Supporting information
S1 Table. Incidence