Social Contact Structures and Time Use Patterns in the Manicaland Province of Zimbabwe

Background
Patterns of person-to-person contacts relevant for infectious disease transmission are still poorly quantified in Sub-Saharan Africa (SSA), where socio-demographic structures and behavioral attitudes are expected to be different from those of more developed countries.
Methods and Findings
We conducted a diary-based survey on daily contacts and time-use of individuals of different ages in one rural and one peri-urban site of Manicaland, Zimbabwe. A total of 2,490 daily diaries were collected and used to derive age-structured contact matrices, to analyze time spent by individuals in different settings, and to identify the key determinants of individuals’ mixing patterns. Overall 10.8 contacts per person/day were reported, with a significant difference between the peri-urban and the rural site (11.6 versus 10.2). A strong age-assortativeness characterized contacts of school-aged children, whereas the high proportion of extended families and the young population age-structure led to a significant intergenerational mixing at older ages. Individuals spent on average 67% of daytime at home, 2% at work, and 9% at school. Active participation in school and work emerged as the key drivers of the number of contacts and, similarly, household size, class size, and time spent at work influenced the number of home, school, and work contacts, respectively. We found that the heterogeneous nature of home contacts is critical for an epidemic transmission chain. In particular, our results suggest that, during the initial phase of an epidemic, about 50% of infections are expected to occur among individuals younger than 12 years and less than 20% among individuals older than 35 years.
Conclusions
With the current work, we have gathered data and information on the ways through which individuals in SSA interact, and on the factors that mostly facilitate this interaction. Monitoring these processes is critical to realistically predict the effects of interventions on infectious disease dynamics.

Introduction
A vast epidemiological literature has shown the importance of considering heterogeneous social contact structures when investigating the transmission of airborne infections and the effects of possible control measures such as vaccination. As a consequence, mathematical modeling of infectious diseases, initially based on simplifying theory-driven assumptions [1] such as homogeneous mixing, has gradually shifted towards using empirical evidence on real individuals' interactions. In particular, over the past decade, field data on social contact patterns have been gathered through diary-based surveys for a number of countries in Europe [2][3][4][5][6][7], North America [8], Oceania [9], Asia [6][10][11][12][13][14], South America [15], and Africa [16][17][18]. Age-specific mixing matrices built on the gathered data have been largely used to model the spread of epidemics driven by close-contact interactions [19][20][21][22], and the transmission of endemic childhood infections [23][24][25][26][27]. In addition to these field studies, agent-based modeling techniques have recently been employed to derive synthetic contact matrices for European countries from detailed country-specific socio-demographic data [28,29].
However, two important challenges need to be further explored to better identify sources of heterogeneity in human mixing patterns. The first one regards the poor knowledge of social contact structures in regions, such as sub-Saharan Africa (SSA), characterized by a dramatically high burden of infectious diseases and an unstable and complex socio-demographic context [30]. A second challenge regards the understanding of what type of individual sociodemographic and behavioral features are associated with a given contact pattern. Past modeling efforts have suggested that age is sufficient to capture the main differences in individual social attitudes. However, while in developed countries age may appropriately reflect personal daily routines and activities, it is still unclear whether this is also valid for developing regions and, more generally, which are the key determinants for human interactions in different settings.
To address these questions, we conducted a field study in Zimbabwe, a SSA country characterized by a slowly progressing demographic transition [30], and gathered data on contact patterns and individuals' daily routines in the Manicaland province, a predominantly rural setting, but with growing urban settlements. As part of the study, we measured the number and the location of contacts, along with the age of contactees and with socio-demographic characteristics of households, schools and work settings. In addition, and for the first time to our knowledge, information on daily routines, i.e., on time spent at home, school, work, and in the general community, was also gathered from the same individuals. Time-use studies have extensively been used to address demographic research questions [31,32]. However, they have also been shown to provide information complementary to contact data for understanding the spread of infections [33,34] and to be very useful for parameterizing agent-based models for infection transmission [28,35].
The combination of contact and time use data differentiates this study from previous work and enables us to better understand the heterogeneity in mixing patterns in a typical low-income setting and to identify which components of the individual's behavior affect the level of interactions with the rest of the community. A deeper investigation on the ways individuals mix, and on the possible socio-demographic drivers, can increase, even in these settings, the robustness of epidemiological models aimed at assessing effective and cost-effective intervention policies for infectious diseases control.

Study Population
The study was conducted as part of the sixth round of the Manicaland HIV/STD Prevention Study [36], a general population cohort study carried out between 1998 and 2013 in Manicaland, the easternmost province of Zimbabwe. Our target population consisted of the population of two sites among those already involved in the Manicaland HIV Study, selected as the most extreme ones in terms of urbanization: a small peri-urban township, situated within one of the main tourist areas in Zimbabwe, and a roadside trading center, characterized by a tarred road passing through the village and surrounded by smaller subsistence farming villages, all scattered around the center (population sizes were, respectively, 4,836 and 6,733, in 2012). Compared to the whole country, the rural site is representative of rural Zimbabwe in terms of age distribution and household size, whereas the peri-urban site can be considered as illustrative of a rural-urban transition zone. Further details on the study population can be found in S1 Text.
Individuals of all ages living in the two sites were considered eligible for inclusion in the study. The sample was stratified by age group (i.e., <1, 1-5, 6-12, 13-18, 19-34, 35-60, and >60) and site of residence. The age stratification was designed to mimic the local schooling system (pre-primary, primary, and secondary school), as well as to capture working-age adults and elderly people. Further details of the approach used for the selection of participants and their enrollment can be found in S1 Text. The study was approved by the Imperial College Research Ethics Committee, the Biomedical and Research Training Institute Institutional Review Board in Harare, and the Medical Research Council of Zimbabwe. All respondents older than 17 years had to personally provide written informed consent to study participation. For younger participants, a parent or guardian had to provide written consent on the participant's behalf. In addition, verbal assent was also required from participants aged between 13 and 17 years.

Contact and Time Use Survey
Respondents were asked to personally fill in a diary, prepared in English and translated into the local language (Shona), in order to report all the contacts that they had and where they spent their time during two consecutive, randomly assigned, days. Individuals' age, education, occupational status along with details on participants' household composition and on their schooling and working environments were also collected and complemented with records coming from the Manicaland HIV/STD Prevention Study [36]. In line with previous studies [4,10,11,15], a contact was defined as an interaction between two individuals, either physical (when involving skin-to-skin contact), or non-physical (when involving a two-way conversation with three or more words in the physical presence of another person, but no skin-to-skin contact). Respondents were requested to report both physical and non-physical contacts, separately. Multiple contacts with the same individual were reported only once per day. Moreover, for each encounter, respondents provided information on gender and age group of contactee (exact age when known), and on the social setting where the encounter occurred: i) the participant's home, ii) school, iii) workplace, iv) general community (i.e., any remaining outdoor or indoor setting attended).
Respondents provided information on their use of time by recording all the visited settings during their day. Day time was divided into time slots (of 1 or 2 hours) reflecting the position of the sun during the day. For the early and late night time, longer time windows were considered (8pm-12am; 12am-4am).
Data were gathered from March 2013 to August 2013. During this period schools were closed for holidays from March 28th to May 6th. For illiterate adults and children aged less than 10 years, an additional individual, denoted hereafter as "shadow", was chosen to fill in the questionnaire on behalf of the study participant. Results of a preliminary pilot study on a sample of 25 people, recruited in a different site from those used in the survey, were used to optimize the questionnaire. For this reason, these respondents were not included in the current sample.

Data Analysis
Time use records are used to compute the proportion of individuals in the different settings in each time slot. The routine nature of daily activities is assessed by testing the correlation between the two consecutive days, using the Cramer's V statistic for binary variables, considering individuals' presence over time across the different settings.
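As an illustration of the correlation measure mentioned above, the following minimal sketch computes Cramér's V for two binary sequences. The encoding is an assumption made here for illustration (1 if the respondent was in a given setting during a time slot, 0 otherwise, one sequence per survey day), and the function name `cramers_v_binary` is hypothetical; the exact procedure used in the study is described in S3 Text.

```python
from math import sqrt


def cramers_v_binary(day1, day2):
    """Cramér's V between two equal-length sequences of 0/1 values."""
    if len(day1) != len(day2):
        raise ValueError("sequences must have the same length")
    # 2x2 contingency table: n[a][b] counts slots with day1 == a and day2 == b
    n = [[0, 0], [0, 0]]
    for a, b in zip(day1, day2):
        n[a][b] += 1
    total = len(day1)
    row = [sum(n[0]), sum(n[1])]
    col = [n[0][0] + n[1][0], n[0][1] + n[1][1]]
    if 0 in row or 0 in col:
        return float("nan")  # undefined when one of the variables is constant
    # Pearson chi-square statistic without continuity correction
    chi2 = 0.0
    for a in range(2):
        for b in range(2):
            expected = row[a] * col[b] / total
            chi2 += (n[a][b] - expected) ** 2 / expected
    # For a 2x2 table, min(rows, cols) - 1 equals 1, so V = sqrt(chi2 / total)
    return sqrt(chi2 / total)
```

For binary variables the statistic equals the absolute value of the phi coefficient, so identical presence patterns on the two days give a value of 1 and statistically independent patterns give 0.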
Similarly to past work [3,24,37], an age-specific contact matrix (by five-year age bands), providing the average number of contacts reported by respondents in age group i with contactees in age group j, is computed. Multiple imputation techniques are used when the exact age of the contactee is not available, and bivariate smoothing is performed (see S2 Text). Age-assortativeness of the obtained contact matrices (i.e., preferred mixing among people of the same age) is assessed using the Q index [28], a measure representing departures from proportionate mixing, ranging from zero (proportionate) to one (fully assortative).
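The assortativeness measure can be sketched as follows. The formula used here, Q = (trace(P) - 1) / (n - 1) with P the row-normalized contact matrix, is one commonly used definition of the Q index and is an assumption of this sketch; the definition actually adopted in the study is the one given in [28]. The function name `q_index` is hypothetical.

```python
def q_index(contacts):
    """Q index of a square contact matrix given as a list of rows.

    contacts[i][j] is the average number of contacts of group i with group j.
    Assumed definition: Q = (trace(P) - 1) / (n - 1), where P[i][j] is the
    fraction of group i's contacts that occur with group j.
    """
    n = len(contacts)
    if n < 2:
        raise ValueError("at least two age groups are required")
    trace = 0.0
    for i, row in enumerate(contacts):
        row_total = sum(row)
        if row_total <= 0:
            raise ValueError("each age group needs at least one contact")
        trace += row[i] / row_total  # share of within-group contacts
    return (trace - 1.0) / (n - 1)
```

Under proportionate mixing every row of P is identical and sums to one, so the trace equals 1 and Q is 0; when all contacts are within the same age group, P is the identity matrix and Q is 1.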
Generalized estimating equations (GEEs) are used to statistically identify the key determinants of the overall number of contacts of study participants, as well as of the individual number of contacts at home, at school, at work, and in the general community [38]. Possible explanatory variables include the time spent by individuals in the different settings, the socio-demographic characteristics of respondents (gender, age group, and site of residence), setting-related characteristics (individuals' household, school, class, and workplace sizes), and the type of day (normal weekday or with school holiday, weekend) in which the diary was kept. Moreover, we test the impact of socio-economic conditions of households, which represents a different stratification of the population from the urbanization level of the site, based on a socio-economic status (SES) index constructed using house characteristics and owned assets [39]. Finally, we also test for the possible effect of seasonal changes in climatic conditions, using monthly data on rainfall and temperature. Statistical analyses are performed with Stata (version 14.0) and R (version 3.2.3). More details can be found in S3 Text.

Epidemiological Implications of Social Contact Patterns
The epidemiological consequences of the estimated mixing patterns are evaluated by simulating the age distribution of individuals infected during the initial phase of an epidemic in a fully susceptible population. In particular, such distribution is obtained by computing the leading eigenvector of the next generation matrix associated with contacts, considered for any possible age of the index case in the community, and by assuming an age-independent transmission rate per contact, under the so-called "social contact hypothesis" [3]. It is worth noting that the age distribution of cases during the early phase of the epidemic depends neither on the choice of the duration of the infectivity period, nor on the considered value of the basic reproductive number R0. This means that, although the attack rate, the timing, and the severity associated with an infection are generally disease-specific, the impact of mixing patterns on the age distribution of cases within a susceptible population is mainly driven by the type of contacts relevant for infection transmission and the socio-demographic structure of the considered population.
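The computation described above can be illustrated with a minimal sketch, under stated assumptions. Here the next generation matrix is taken to be K[i][j] = scale * N[i] * C[i][j] / N[j], where C[i][j] is the average number of contacts of a person in age group i with people in group j, N holds the group population sizes, and `scale` lumps together the per-contact transmission rate and the duration of infectivity. This parameterization, the power-iteration method, and the function name `early_case_age_distribution` are choices made for illustration and are not taken from the study.

```python
def early_case_age_distribution(contacts, population, scale=1.0, iterations=500):
    """Leading eigenvector of the next generation matrix, normalized to sum to 1.

    contacts[i][j]: average contacts of a person in group i with group j.
    population[i]:  number of people in group i (fully susceptible).
    scale:          per-contact transmission rate times infectious period
                    (assumed age-independent).
    """
    n = len(contacts)
    # K[i][j]: expected infections in group i caused by one case in group j
    ngm = [
        [scale * population[i] * contacts[i][j] / population[j] for j in range(n)]
        for i in range(n)
    ]
    # Power iteration from a uniform start; assumes a primitive matrix
    v = [1.0 / n] * n
    for _ in range(iterations):
        w = [sum(ngm[i][j] * v[j] for j in range(n)) for i in range(n)]
        total = sum(w)
        v = [x / total for x in w]
    return v
```

Because `scale` multiplies every entry of the matrix, it changes the leading eigenvalue but not the normalized eigenvector, which is consistent with the statement that the early age distribution of cases depends neither on the infectivity period nor on R0.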

Sample Description
A total of 1,245 complete diaries were collected during the study period, 554 from the peri-urban site and 691 from the rural site, for a total of 2,490 person-days (Table 1). Of these daily diaries, 52% were completed during weekdays, 20% during weekends, and 28% during school holidays. Although the samples from the two sites appear similar in terms of age distribution, the peri-urban township is characterized by smaller households (HHs) (median HH size is 4 in the peri-urban site versus 5 in the rural one, p = 0.03), and a higher proportion of nuclear families (55% versus 43%, p<0.001). Moreover, significantly different levels of schooling and working are detected, with 34% and 19% of individuals reporting to be either students or workers in the peri-urban township, as opposed to 42% and 10% in the subsistence farming area, respectively.

Time Use Patterns
Individuals spent overall 66.9% of their day time (i.e., 5am-8pm) at home, 9% at school, 2.2% at work, and 21.9% in the general community. The main differences in daily routines emerge when considering active versus non-active individuals, with active defined as those attending school or work for at least a single time slot (Fig 1). For the active people, the proportion of daily time spent at home and in the general community is much lower than for those non-active (Table 1). The recorded percentage of active students among school-age children is significantly higher in the rural site (53.5% versus 21.8%, p<0.001), reflecting both the different proportions of enrollment between sites (91.1% versus 84.4%, p<0.001) and the occurrence of school holidays in the peri-urban township. Despite this difference in actual school attendance, active school-age individuals in the two sites spend a similar amount of time at school (51% in the rural site versus 46.2% in the peri-urban site, p = 0.64). This does not hold for the time devoted to work, since in the peri-urban site we find a larger proportion of active workers in the age group 19-59 (21.5% versus 5.7% in the rural site, p<0.001), who also spend more of their daytime in the workplace (51.5% versus 30.9% in the rural site, p<0.001). A high correlation between time use patterns over the two days is found across all respondents (0.86, interquartile range: 0.77-0.96). Time use patterns on school holidays and weekends are shown in S4 Text.

Number of Contacts by Age and Setting
A total of 26,981 different contacts were reported over the two survey days, resulting in an estimated average number of contacts of 10.8 (median 9, IQR 6-14) per person per day (most of which are physical contacts, S3 Text). The distribution of the number of contacts is highly right-skewed (Fig 2a), with 28% of people reporting 13 or more different contacts and accounting for 53% of total contacts.
The average number of contacts in the peri-urban site is 11.6 (median 10, IQR 6-14) versus 10.2 in the rural site (median 9, IQR 6-14) (p<0.001, median test). Interestingly, differences between sites are also significant among working-age adults and non-active people, and among individuals living in extended families (see Table A in S3 Text). Although some variations across age groups are present, with infants reporting a significantly lower average number of contacts (8.7, p<0.005), being active at school and, to a lesser extent, at work are found to be the main determinants of the reported number of contacts (Fig 2). In particular, active students and workers report on average, respectively, 35% (p<0.001) and 12% (p = 0.10) more contacts than non-active individuals in the same age bands, but with a reduction of 17% (p = 0.001) of their home contacts.

Age-Specific Mixing Matrices
The derived social contact matrix for all reported contacts is characterized by a remarkable age assortativeness among children aged 5-19 years (Fig 3a), decreasing afterwards, except for a moderate rise among people aged 35-39 years. The low assortativeness characterizing adults compared with what is observed in European countries (e.g., Italy, Fig 3b) is also reflected by the relatively smaller value of the Q index associated with the matrix (0.051 in Manicaland versus 0.11 in Italy), computed excluding weekends and school holidays. Beyond the age-assortativeness mostly driven by school contacts, but also detectable in the general community (Fig C in S3 Text), the mixing structure is strongly affected by the young age distribution of the study population and by the presence of extended families, where mixing is predominantly homogeneous. Some evidence of intergenerational contacts can be noticed, and this is likely due to family contacts between parents/grandparents and children/grandchildren. Interestingly, matrix patterns characterizing individuals older than 20 years mainly reflect the structure of contacts observed at home. These general patterns are confirmed when stratifying contacts by site (Fig D in S3 Text). However, the peri-urban township is characterized by a higher average number of work-related contacts among adults (30-49 years), a relatively higher intensity of contacts between parents and children, and a more assortative mixing in primary schools with few interactions across different ages, possibly indicating more homogeneous classes.
Additional differences characterizing mixing patterns based on alternative contact stratifications are shown in S3 Text.

Determinants for the Number of Contacts
The overall number of reported contacts is positively associated with the time individuals spend outside of their home (Table 2). In particular, the more time individuals spend at work and in the general community, the higher the number of contacts in these settings, and the lower the number of home contacts (Fig 4). The number of school contacts increases with the class size, but is not associated with the time spent at school. A significantly higher number of contacts in the general community is found in the peri-urban township as opposed to the rural area, probably the consequence of a wider social network and a higher proportion of working-age adults. Better-off individuals with medium and high SES, independently of the site where they reside, reported fewer social contacts at home than those with low SES. On the other hand, people living in larger households reported a higher number of home contacts. Infants and pre-school children reported a significantly lower number of overall, home, and general community contacts as opposed to school-aged individuals (6-18 years old). Conversely, working-age adults reported a significantly higher number of contacts in the general community than school-age children. Finally, we did not find any difference between different types of day (normal weekdays, weekdays with school holidays, and weekends), but we found an effect of the seasonal change in the climatic conditions: in particular, school contacts decrease in months characterized by higher rainfall. Since the class size is a major determinant of school contacts, this association might suggest that, on days of heavy rain, a lower number of children attend school, affecting the reported number of contacts in this setting. On the other hand, using a different climatic factor, i.e., the average monthly temperature, we did not find any effect (see Table B in S3 Text).
A high positive correlation between individuals' number of contacts over the two days was found (0.63). However, when stratifying by setting, we found large differences, as values ranged from almost zero for work contacts to 0.61 for home contacts. This shows the routine nature of home contacts as opposed to the randomness in work contacts, mostly in case of more informal occupations.

Characteristics of an Epidemic Driven by the Estimated Contact Matrix
The predicted age distribution of infected individuals during the exponential phase of an epidemic is highly sensitive to the underlying social contact structure (Fig 5). Our results suggest that, when using the gathered Manicaland data, about 50% of infections are expected to occur among individuals younger than 12 years, while less than 20% among individuals older than 35 years. This pattern is mainly driven by the population age distribution and does not remarkably change when the transmission chain is assumed to be driven by physical contacts only. Similarly, the expected age distribution of cases in the overall population does not change when home contacts are considered alone, suggesting the critical role of the heterogeneous mixing at home for the epidemic transmission chain.
Moreover, our analysis indicates that, when relevant transmission paths are driven by a combination of home and school contacts, school-age children are expected to play a pivotal role in the spread of the infection both within schools and at home. In this case, more than 60% of infections occur among individuals younger than 12 years. On the other hand, when the infection is mainly transmitted through contacts occurring at home and at work, more than one third of the infections are expected among adults and less than 25% among children aged 6-12 years.
Table 2. GEE model for the number of social contacts. Coefficients (with respective semi-robust standard error and significance at 5%) for the association between the number of social contacts (overall and by setting of contact), according to GEEs with negative binomial distribution, Manicaland (Zimbabwe), 2013.
Modeling contacts in the general community as driven by proportional mixing leads to overestimating infections in children (mostly pre-school) and underestimating the number of new cases among the elderly. In particular, with proportional mixing, around 55% of infections are predicted to occur among children up to 12 years, whereas, using the collected data on mixing patterns in the general community, only 35% of infections are predicted in the same age group. The predicted age distribution of cases in Manicaland is found to be considerably different from the one obtained when considering a similar infection process in the Italian population. In fact, when the average contact matrix of Italy is used, more than 20% (versus 13% in Manicaland) of infections are predicted to occur among the elderly and less than 30% (versus 50% in Manicaland) among pre-school and primary school children. Finally, predictions based on the projection of Italian contact rates [4] on the Manicaland age-structure would result in significantly larger estimates of the number of infections among pre-school and primary school children (almost 80% of cases), which suggests the crucial importance of deriving contact patterns in countries characterized by different population age distributions and characteristics of the social structures (households, schools, workplaces, etc.).
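For reference, a proportional (proportionate) mixing matrix of the kind used as a comparison above can be sketched as follows. The construction is an assumption of this sketch: each person in group i has `mean_contacts[i]` contacts, distributed across groups in proportion to the total number of contacts offered by each group. With the same mean for all groups, contacts are simply distributed according to population shares. The function name `proportionate_mixing_matrix` is hypothetical, and the exact construction used in the study is not detailed in this section.

```python
def proportionate_mixing_matrix(mean_contacts, population):
    """Contact matrix under proportionate mixing.

    mean_contacts[i]: average daily contacts of a person in group i.
    population[i]:    number of people in group i.
    Returns C with C[i][j] = c_i * (c_j * N_j) / sum_k(c_k * N_k).
    """
    # Total contacts offered by each group
    weights = [c * n for c, n in zip(mean_contacts, population)]
    total = sum(weights)
    if total <= 0:
        raise ValueError("total number of contacts must be positive")
    return [[c_i * w_j / total for w_j in weights] for c_i in mean_contacts]
```

By construction each row sums to the group's mean number of contacts, and the total number of contacts from group i to group j equals that from j to i, so the matrix carries no age preference beyond what population size and activity level imply.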

Discussion
In total, 1,245 diaries (corresponding to 26,981 reported contacts) were collected in one rural and one peri-urban township of the Manicaland province of Zimbabwe, with detailed information on individuals' daily social contacts and use of time. The estimated average number of contacts per person per day is significantly higher in the peri-urban site, and among active students and workers. The derived contact matrix is characterized by a strong age assortativeness among school-aged children and by proportionate mixing at older ages. This pattern is similar to what is observed for rural Kenya [17], but differs substantially from the European-like mixing [4], which was found to be assortative even among the elderly. This may derive from several factors, among which the younger age distribution of the population in SSA as opposed to European-like settings (in Manicaland, 51% of the population under study was below 20 years of age versus 20% in Italy), the higher proportion of extended families (68% in Manicaland versus 22% in Italy [40]), which leads to large differences in the proportion of home contacts (54% versus 19.7%), and the different use of time (S4 Text). In our study population, during weekdays, individuals reported 63.4% of their daytime at home, 15.3% at school, 1.8% at work, and 19.5% in the general community, as opposed to 57%, 6.2%, 16.7% and 20.1% in Italy, respectively. The proportion of pre-primary school attendance in the age group 0-6 years is extremely low (15% versus 30.7% in Italy), and similarly low is the work participation of adults aged 19-59 years (13.7% versus 26.3% in Italy). Whereas the time at work appeared to be positively associated with the number of overall and work contacts, possibly due to the different work schedules people have, the time spent at school, similar among children, did not emerge as a determining factor. Conversely, our results suggest that the number of students in the class, rather than the time spent at school, represents the main determinant of the number of school contacts.
Fig 5. Seven different social structures derived from the collected data were compared in terms of predicted age distribution of generated infection cases: (from left to right) overall contacts, physical contacts, home contacts, home and school contacts, home and work contacts, general community contacts, proportional mixing based on the average number of contacts in Manicaland and the population of Manicaland. For the sake of comparison, the predicted age distribution of cases is shown also for the overall contact matrix for Italy and for the Italian contact rates applied to the population of Zimbabwe [4]. doi:10.1371/journal.pone.0170459.g005
Evidence of within-country heterogeneity was also found when comparing the two study sites, with some differences in contact patterns between the rural and the peri-urban site that are worth mentioning. Firstly, due to the different proportions of nuclear and extended families in the two sites, we observed a large number of contacts between the elderly and the younger age groups in the rural area, as well as between parents and their children in the peri-urban area. Secondly, while a higher average number of contacts among school-aged children was found in the rural site, suggesting a possible effect of larger classes, a higher proportion of work contacts was reported for the township.
Interestingly, our modeling simulations have highlighted that the predicted age distribution of infected cases during the exponential phase of an epidemic is highly sensitive to the underlying social contact structure of the study population. In general, when using the gathered Manicaland data, we found that about 50% of infections are expected to occur among individuals aged less than 12 years, and less than 20% among individuals older than 35 years. This result appears robust to the adopted contact definition (all contacts versus physical contacts only). In addition, we found that the heterogeneous nature of home contacts, which is so characteristic of the Manicaland data, is the main contributing factor to the transmission chain of an epidemic, and that modeling contacts in the general community as driven by proportional mixing would overestimate the proportion of childhood infections.
A limitation of this study is that data collection did not occur at the same time in both sites; rather, it occurred mainly during school holidays in the peri-urban township and during the school term in the subsistence farming area. Even though this procedure possibly led to estimating the number of contacts at school in the township with less accuracy, we do not expect this limitation to have strongly influenced the detection of the qualitative patterns of individual time use and social mixing at school, especially when data from the two sites are pooled together.
Knowledge of social contact patterns is an essential element for a realistic evaluation of the impact of public health control strategies against viral and bacterial infections, such as measles, influenza, tuberculosis, and meningitis, especially in developing countries where vaccination strategies are more difficult to implement and financial constraints are very high. With the current work, we have gathered data on the ways individuals interact in rural and peri-urban areas of SSA and on the factors that mostly facilitate this interaction. We found that the key individual and societal factors contributing to the number of contacts, both overall and in the various settings, are the active participation in school and work, as well as the class size for students. Whereas in the industrialized countries these characteristics may well be captured by individuals' age, in less developed countries this does not seem to be always the case, as individuals of similar age appeared to have different mixing patterns and thus to carry different risks of transmitting infections. Considering the current demographic transition that SSA countries are undergoing, future increases in the school enrollment and work participation rates may have profound effects on individual interactions. On the other hand, our model results indicate that reductions in household and class sizes, which are also expected to occur as part of the demographic transition, may reduce the overall number of contacts that individuals experience in their daily routine in those two settings. Although the final outcome of these processes is extremely difficult to forecast, the current work highlights which components should be closely monitored to identify possible important changes in social structures affecting individuals' mixing patterns and infectious disease dynamics. Additional studies in different and distant demographic settings should be considered in order to assess the evolution of these processes over time, and to link specific structures of mixing patterns to relevant individual behaviors and societal characteristics.
Supporting Information
S1 Dataset. Data for the Statistical Analysis. Data on study participants, with summary of the information on social contacts and time use, on which the statistical model is based. (XLSX)
S2 Dataset. Social contact matrix. Social contact matrix C_ij with the estimated (through bivariate smoothing) average number of contacts between participants in age group i and contactees in age group j, together with the number of individuals in each age group in the reference population. (XLSX)
S1 Supporting Information. Survey Diary. Diary provided to participants aged 6 years or more for the collection of social contacts and time use data. (PDF)
S1 Text. Sampling strategy and study population. Detailed presentation of the study design and of the study population.