The French Connection: The First Large Population-Based Contact Survey in France Relevant for the Spread of Infectious Diseases

  • Guillaume Béraud,

    Affiliations Médecine Interne et Maladies Infectieuses, Centre Hospitalier de Poitiers, Poitiers, France, EA2694, Université Droit et Santé Lille 2, Lille, France, Interuniversity Institute for Biostatistics and statistical Bioinformatics, Hasselt University, Hasselt, Belgium

  • Sabine Kazmercziak,

    Affiliation CRESGE, Université Catholique de Lille, Lille, France

  • Philippe Beutels,

    Affiliation Centre for Health Economics Research & Modeling Infectious Diseases (CHERMID), Vaccine & Infectious Disease Institute, Universiteit Antwerpen, Antwerpen, Belgium

  • Daniel Levy-Bruhl,

    Affiliation Département des maladies infectieuses, InVS, Saint-Maurice, France

  • Xavier Lenne,

    Affiliation Département d’information médicale, Université de Lille Nord de France, Centre Hospitalier Universitaire de Lille, Lille, France

  • Nathalie Mielcarek,

    Affiliations Center for Infection and Immunity of Lille, Institut Pasteur de Lille, Lille, France, INSERM U1019, Lille, France, CNRS UMR8204, Lille, France, Université Lille-Nord de France, Lille, France

  • Yazdan Yazdanpanah,

    Affiliation Service des Maladies Infectieuses et tropicales, Hôpital Bichat Claude Bernard, Paris, France

  • Pierre-Yves Boëlle,

    Affiliation UMR-S 707, INSERM, Paris, France

  • Niel Hens,

    Affiliations Interuniversity Institute for Biostatistics and statistical Bioinformatics, Hasselt University, Hasselt, Belgium, Centre for Health Economics Research & Modeling Infectious Diseases (CHERMID), Vaccine & Infectious Disease Institute, Universiteit Antwerpen, Antwerpen, Belgium

  • Benoit Dervaux

    Affiliations EA2694, Université Droit et Santé Lille 2, Lille, France, DRCI, Centre Hospitalier Universitaire de Lille, Lille, France



Empirical social contact patterns are essential to understand the spread of infectious diseases. To date, no such data existed for France. Although infectious diseases are frequently seasonal, the temporal variation of contact patterns has not been documented hitherto.


COMES-F is the first French large-scale population survey, carried out over 3 different periods (February-March, April, April-May) with some participants common to the first and the last period. Participants described their contacts for 2 consecutive days, and reported separately on professional contacts when typically over 20 per day.


2033 participants reported 38 881 contacts (weighted median [first quartile-third quartile]: 8[5–14] per day), and 54 378 contacts with supplementary professional contacts (9[5–17]). Unlike age, gender, household size, holidays, weekends and occupation, period of the year had little influence on the number of contacts or the mixing patterns. Contact patterns were highly assortative with age, irrespective of the location of the contact, and with gender, with women having 8% more contacts than men. Although most contacts occurred at home and at school, the inclusion of professional contacts modified the structure of the mixing patterns. Holidays and weekends dramatically reduced the number of contacts and, as proxies for school closure, reduced R0 by 33% and 28%, respectively. Thus, school closures could have an important impact on the spread of close contact infections in France.


Despite no clear evidence for temporal variation, trends suggest that more studies are needed. Age and gender were found to be important determinants of the mixing patterns. Gender differences in mixing patterns might help explain gender differences in the epidemiology of infectious diseases.


Mathematical modelling of infectious diseases is invaluable to evaluate control and prevention strategies by comparing their (cost-)effectiveness and to inform public health decision makers. While most models make assumptions on transmission parameters, social contact data studies estimate the probability of contacts between individuals, and consequently of potential pathogen transmission. For instance, social contact data studies have shown better goodness-of-fit than mathematical and parsimonious models on seroprevalence data for varicella[1]. Contact diaries have several advantages in measuring the frequency and intensity of contacts between individuals. They are easy to use, capture social interaction in a wide range of settings and do not rely on peer-group participants[2]. They successfully explained age-specific patterns of infections such as varicella-zoster virus, parvovirus B19[3], mumps[4], influenza[4] and pertussis[5]. Nonetheless, defining a contact suitable for infectious disease transmission remains difficult and varies according to pathogen[2,6]. A population-based contact survey provides the basic material needed to build contact matrices with different levels of contact intimacy (e.g. physical and/or long-duration contacts versus conversational and/or short-duration contacts).

Focusing on 8 European countries, POLYMOD was the first large-scale study to report on contacts between individuals[7]. To date, no such data existed for France. Fumanelli et al[8] estimated contact matrices by inferring the structure of social contacts from demographic data, but at the expense of substantial differences with the empirical contact matrices from the POLYMOD study. Time-Use surveys are widely available and provide a valuable alternative to estimate contact matrices, but they are often restricted to participants older than 8 years[6,9]. With regard to the pandemic influenza A/H1N1 virus, a French household-based survey reported meetings made by participants but with information restricted to the place and the age distribution of contacts[10].

Seasonality is a common feature in infectious diseases, usually attributed to environmental factors such as temperature or humidity[11]. Term-time forcing for measles[12] and other childhood infections[13] suggests the importance of behavioural factors. But few studies have evaluated change in the number of contacts for given persons over a period of time[14,15]. None have compared change in mixing patterns over time. Hence, while contact matrices have been developed at the country-wide level, they lack temporal information.

In this paper we describe the first large-scale population survey investigating contact patterns in France and their temporal variations. Taking advantage of the natural heterogeneity of France, one of the largest European countries, and using one of the largest sample sizes for a country-based population survey carried out to date, we have estimated French contact rates. We have reassessed the influence on mixing patterns of weekends and holidays as well as gender, children’s contact patterns and class size. We have also explored the influence of people with high numbers of professional contacts on one or two consecutive days.


Study design

The study population was randomly sampled from all over France (excluding overseas territories) and the survey was planned over 2 time periods (February-March/April-May 2012) (Fig 1). Because an oversight led to the recruitment of fewer participants than originally planned during the February-March period (Actual Period 1), an additional period (Actual Period 2) was added in April to complete “Design Period 1”. Recruitment during “Design Period 2” was completed according to plan (Actual Period 3). Because Actual Period 2 was chronologically close to “Design Period 2”, we present data analyses according to (1) Design Periods 1 and 2, (2) the 3 Actual Periods 1, 2 and 3, and (3) Actual Period 1 and a combination of Actual Periods 2 and 3. Winter holidays lasted 2 weeks at different dates in February-March according to the place of residence, while Spring holidays lasted 2 weeks in April-May.

Fig 1. Timeline of the study, showing the distribution of participants and contacts over time.

The periods of inclusion were February 20th–March 17th; April 1st–April 7th; April 16th–May 14th. The dot size is proportional to the log of the number of participants. (Design Period 1: 34 days; Design Period 2: 29 days)

Participants were recruited according to quota for age, gender (sex-ratio = 1), days of the week and school holidays from 24 250 persons contacted by random digit dialling (landlines and mobile numbers). Diaries (S1 Fig) were sent to 3977 people who agreed to participate, among whom 2033 actually participated (729 during Design Period 1; 1304 during Design Period 2). Participants common to Design Periods 1 & 2 (n = 278) represented 38% and 21% of the participants in each period, respectively. Only one person per household could participate. Children and teenagers were oversampled to gain accuracy on age groups known to contribute largely to the spread of infections.

Participants had to describe their environment (household, workplace, school…), their socio-professional background, and all their contacts for 2 consecutive days on a paper diary. A contact was defined as talking to someone within a distance of less than 2 meters, or skin-to-skin touching. Each contact had to be described with age (or estimated age category), gender, location, frequency, type (skin contact or not) and duration of the contact. A contact was to be reported only once daily in the diary.

The diary was derived from the POLYMOD study but had some additional features. The daily number of potentially recorded contacts was limited to 40 (versus 29 to 90 in POLYMOD[7]). A specific diary for children 0–15 years old (27.9% of all the diaries) was designed with instructions for caregivers to help complete it. This diary had specific questions about childcare, school and location of contacts (school and day care centre).

Participants were coached by phone and could seek information to complete the diary through a hotline and an email address. They were contacted up to 3 times if the diary was not returned. Participants who returned the diary were offered 5€ for themselves or for donation. Participants provided verbal consent when they agreed to participate in the study, as they were contacted by phone. Moreover, they confirmed their consent by returning the diary. Thus, returning the diary was considered written consent. For children younger than 15 years, their consent and the consent of a parent or legal caregiver had to be obtained in a similar way (first by phone, then by returning the diary). Children between 15 and 17 were considered as adults; thus, their consent was obtained without requiring an adult. Records were anonymized before analysis, thus identifying participant information was and is not available. The study protocol, as well as the consent procedure, was declared to and approved by the French Institutional Review Board correspondent at the Institut Catholique de Lille.

Data analysis

Data analysis was done using the statistical programming language R 3.1.0. Sampling weights were calculated according to participants’ age, household size (2009 national census, INSEE[16]), weekdays and weekends, regular and holiday periods. Continuous variables are expressed as weighted median (first quartile-third quartile).
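As an illustration of the summary statistic used throughout, the following is a minimal sketch of a weighted median and quartiles. It is written in Python rather than R, uses one of several possible weighted-quantile definitions, and is not the authors' code. The function name `weighted_quantile` and the example counts and weights are hypothetical, not study data.

```python
def weighted_quantile(values, weights, q):
    """Smallest value whose cumulative share of the total weight reaches q."""
    pairs = sorted(zip(values, weights))
    total = sum(weight for _, weight in pairs)
    cumulative = 0.0
    for value, weight in pairs:
        cumulative += weight
        if cumulative >= q * total:
            return value
    return pairs[-1][0]


# Hypothetical daily contact counts and sampling weights (illustration only).
contacts = [3, 5, 8, 14, 20]
weights = [1.0, 2.0, 1.0, 0.5, 0.5]

# First quartile, median and third quartile of the weighted distribution.
summary = tuple(weighted_quantile(contacts, weights, q) for q in (0.25, 0.5, 0.75))
print(summary)
```

With sampling weights, a participant from an under-represented stratum counts for more than one observation, which is why the weighted median can differ from the plain median of the reported counts.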

Number of contacts.

Regarding the number of contacts, variable selection was done using random forests[17] (R package randomForest). Age was transformed into five-year age categories and days of the week were transformed into weekday/weekend because of data sparseness and for model interpretability. Generalized Estimating Equations (GEE) (R package geepack) with a negative binomial distribution were used to regress the number of contacts on the selected variables. Variables influencing the number of contacts were compared using the percentage of change and 95% confidence interval (95%CI) based on the estimates from GEE. The GEE approach can handle correlations between repeated observations from the same participants. The degree distribution of the number of contacts was modelled using Generalized Additive Models (GAM) with spline smoothing (R package mgcv[18]) stratified according to age, gender, weekdays and weekend, regular and holiday periods.
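The percentage of change reported for each variable can be read off a count-regression coefficient. The sketch below assumes a log link, which is usual for negative binomial models but is not stated in the text, and a Wald-type interval. The function name `percent_change` and the coefficient and standard error are hypothetical.

```python
import math


def percent_change(beta, standard_error, z=1.96):
    """Convert a log-link coefficient into a percentage change with a Wald 95% CI."""
    point = (math.exp(beta) - 1.0) * 100.0
    lower = (math.exp(beta - z * standard_error) - 1.0) * 100.0
    upper = (math.exp(beta + z * standard_error) - 1.0) * 100.0
    return point, lower, upper


# Hypothetical coefficient: a rate ratio of 1.08 with a standard error of 0.03.
point, lower, upper = percent_change(math.log(1.08), 0.03)
print(f"{point:+.0f}% [{lower:.0f}%;{upper:.0f}%]")
```

Because the interval is built on the log scale and then exponentiated, it is asymmetric around the point estimate, as are the intervals reported in the Results.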

Who mixes with whom?

Contact matrices were obtained using GAM assuming a negative binomial distribution, using a one-year age interval and a tensor product spline as smooth interaction term between contact age and participant age. Different matrices were calculated: a base-case matrix without supplementary professional contacts, a matrix with physical contacts only, a matrix with supplementary professional contact information and matrices according to the 3 actual periods of the study.

To assess the influence of gender, 2-by-2 matrices were built according to gender and age (≤18 years; >18 years) of both participants and contacts.

To assess the influence of the place of contact, location-based matrices were built using 6 age categories and no smoothing.

Data sparseness prompted us to use different methods to obtain the matrices as described above. The reciprocal nature of contacts was taken into account by a ‘smooth-then-constrain’ approach[19], except for location matrices where no reciprocity was imposed.
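Reciprocity means that the total number of contacts made by age group i with age group j must equal the total made by group j with group i. The sketch below shows only a constraining step, by averaging the two totals, which is a common choice and not necessarily the exact procedure of the cited approach; the smoothing step is omitted. The function name `impose_reciprocity`, the rates and the population sizes are hypothetical.

```python
def impose_reciprocity(rates, population):
    """Return per-capita contact rates satisfying rates[i][j] * N[i] == rates[j][i] * N[j]."""
    size = len(rates)
    constrained = [[0.0] * size for _ in range(size)]
    for i in range(size):
        for j in range(size):
            # Average the total number of i-j contacts estimated from each side.
            total = (rates[i][j] * population[i] + rates[j][i] * population[j]) / 2.0
            constrained[i][j] = total / population[i]
    return constrained


# Hypothetical two-group example: mean daily contacts reported by row group with column group.
raw_rates = [[4.0, 2.0], [1.5, 3.0]]
group_sizes = [100, 200]

reciprocal = impose_reciprocity(raw_rates, group_sizes)
print(reciprocal)
```

In this example the two groups disagree on their shared contacts (200 versus 300 per day in total), and the constrained matrix settles on 250 from both sides.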

To compare the impact of different mixing patterns on the spread of infectious diseases, we calculated the relative change in the basic reproduction number R0 for a generic epidemic as the ratio of the dominant eigenvalues of the respective next generation matrices[20,21] (S1 Text). Similarly, we used the leading right eigenvector of the specific next generation matrices to calculate relative incidence by age. For the location matrices, we used the eigenvalue of the contact rate matrices as there was no population size by location, warranting a somewhat different interpretation. Resampling was done to estimate 95%CI of leading eigenvalues and eigenvectors. Changes in R0 and contact rate were compared with a ratio and 95%CI based on a non-parametric bootstrap.
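The eigenvalue comparison can be sketched as follows. This is a minimal illustration and not the authors' implementation: it assumes each next generation matrix is proportional to its contact matrix with the same constant, so that the constant cancels in the ratio, and it uses plain power iteration. The function name `dominant_eigenpair` and the two 2-by-2 matrices are hypothetical.

```python
def dominant_eigenpair(matrix, iterations=200):
    """Power iteration: dominant eigenvalue and right eigenvector scaled to sum to 1."""
    size = len(matrix)
    vector = [1.0 / size] * size
    eigenvalue = 0.0
    for _ in range(iterations):
        product = [sum(matrix[i][j] * vector[j] for j in range(size)) for i in range(size)]
        eigenvalue = sum(product)  # the vector sums to 1, so this estimates the eigenvalue
        vector = [x / eigenvalue for x in product]
    return eigenvalue, vector


# Hypothetical matrices for two age groups (children, adults); not study data.
regular_day = [[6.0, 2.0], [2.0, 3.0]]
closure_day = [[3.0, 2.0], [2.0, 3.0]]

regular_value, relative_incidence = dominant_eigenpair(regular_day)
closure_value, _ = dominant_eigenpair(closure_day)

r0_ratio = closure_value / regular_value
print(f"Relative change in R0: {100 * (r0_ratio - 1):.0f}%")
print("Relative incidence by age group on a regular day:", relative_incidence)
```

Repeating the calculation on matrices re-estimated from resampled participants would give a bootstrap distribution of the ratio, from which a percentile interval can be read.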

Professional contacts.

Participants with more than 20 daily professional contacts were asked not to report them but rather to provide their total number and age distribution (0–3 years; 3–10 years; 11–17 years; 18–64 years; 64+ years) (referred to as “supplementary professional contacts” or “SPC” in the remainder of the paper).

Other contact characteristics were imputed by resampling the characteristics of professional contacts from participants who had between 10 and 20 professional contacts (S2 Text). We repeated our analyses including the SPC and applied censoring, once at 134 contacts (the 95th percentile) to limit the impact of outliers, and once modelling censoring at 29 contacts, similarly to what had been done in the POLYMOD study (S1 Table. Factors influencing the number of contacts, with censoring at 29 similarly to Mossong et al 2008 (non-linear model)). Unless mentioned otherwise, results concern the model without SPC.
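The effect of a threshold on the total number of contacts can be illustrated by simple top-coding. This sketch caps each count at the threshold, which is a simplification: modelling censoring within the likelihood, as in a censored regression, is a different and more involved procedure. The function name `cap_counts` and the counts are hypothetical.

```python
def cap_counts(counts, threshold):
    """Top-code contact counts: values above the threshold are replaced by the threshold."""
    return [min(count, threshold) for count in counts]


# Hypothetical daily contact totals including professional contacts (not study data).
daily_totals = [5, 40, 150, 300]

capped_134 = cap_counts(daily_totals, 134)
capped_29 = cap_counts(daily_totals, 29)

# Share of all contacts removed by each threshold.
reduction_134 = 1 - sum(capped_134) / sum(daily_totals)
reduction_29 = 1 - sum(capped_29) / sum(daily_totals)
print(capped_134, f"{reduction_134:.0%}")
print(capped_29, f"{reduction_29:.0%}")
```

A high threshold changes only the few participants in the upper tail, whereas a low one alters most participants with many contacts and removes a much larger share of the total.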

Contact matrices as well as all the data necessary to reproduce our analysis are freely available on and on a public figshare repository at


Participants’ age was 37 (19–59) years, with 46 (2%) <1 year and 795 (39%) <18 years. Women represented 1136 (56%) of participants. According to the designed quotas, 890 (44%) participants had a weekend day among the 2 consecutive participation days, which was always associated with a weekday. Participation days were neither a weekend day nor a holiday for 552 (27%) participants. Participants were on average significantly older than the originally recruited people who did not send in their diaries (37.1(27.0)y vs 30.9(25.4)y; p<0.001). Non-participants were mostly aged 18–39 years and 3–9 years (whose diaries were filled in by an adult, usually aged 18–39 years). Results have to be interpreted within that context (S3 Text. Design issues). Similarly, employment and school enrolment rates should be interpreted with respect to age quotas (S2 Table).

Number of Contacts

Participants reported 38 881 contacts (8(5–14) per day; Fig 2), with 9% [1%;18%] more contacts in Design Period 2 but without significant differences between Actual Periods 1, 2 or 3 (Table 1) (S2 Fig), or between the combination of Actual Periods 2 & 3 vs. Actual Period 1 (+4%[-4%;13%]). Factors influencing the number of contacts are summarized in Table 1. Region of residence did not influence the number of contacts.

Fig 2. Contact number density.

Histogram of the contact number, including SPC (Supplementary Professional Contacts). Limitation at 40 contacts per day explains the peak at 40 contacts.

Table 1. Factors influencing the number of contacts (SPC: Supplementary Professional Contacts).

The relative number of contacts rapidly increased with age to reach a plateau at around 20 years old. Among children, babies (<1 year) had slightly fewer contacts than toddlers (1–3 years) (Fig 3).

Fig 3. Degree distribution for children <4y, comparing the number of contacts between <1y and 1–3y, shown as density of the number of contacts.

A similar graph showing the frequency of the number of contacts is provided as supplementary material (S3 Fig).

Women had 8% [1%;14%] more contacts than men, mainly due to differences for adult women (Fig 4).

Fig 4. Degree distribution comparing the number of contacts according to gender in <18y and >18y, shown as density of the number of contacts.

A similar graph showing the frequency of the number of contacts is provided as supplementary material (S3 Fig).

During weekends and holidays the number of all contacts decreased by 21% [14%;27%] and 21% [16%;26%], respectively (Table 1), and by 16% [8%;23%] and 19% [13%;25%] for physical contacts. The impact differed between children and adults (Fig 5).

Fig 5. Degree distribution comparing the number of contacts according to weekends and holidays in children (3–18y) and adults, shown as density of the number of contacts.

A similar graph showing the frequency of the number of contacts is provided as supplementary material (S3 Fig).

Duration was associated with frequency of contacts, as daily contacts lasted longer than less frequent contacts (Fig 6A & 6C). Physical contacts were associated with longer duration (Fig 6B & 6D) and more frequent contacts (Fig 6E). Physical contacts occurred more often at home or in private places than at work or study place, and rarely during transport (Fig 6D & 6F).

Fig 6. Characteristics of contact (without SPC).

Distribution of location, duration and frequency for all contacts (A) and physical contacts (B). Duration of contact according to frequency (C). Proportion of physical contacts according to duration (D), frequency (E) and location (F).

Transportation modes did not significantly influence the number of contacts, despite a trend for higher number of contacts with public transport (+36%[-20%;+131%]).

In a subanalysis of contacts made by participants common to Design Periods 1 & 2, trends were similar to those observed in the full data analysis, though only age, household size, regular or holiday period and occupation remained significant.

No association was found between the size of classroom or childcare centre and the total number of contacts or those specifically at kindergarten (4.8[0.0;9.0]), at school (5.0[0.0;16.0]) or study place (6.0[0.0;16.0]).

Participants reported 6%[1%;10%] fewer contacts on the 2nd day of the study. The more contacts they reported on the first day, the larger the proportional decrease in contacts on the second day.

Who mixes with whom

The R0 of an epidemic occurring during Design Period 2 was 12%[1%;23%] higher than during Design Period 1, but lost significance when comparing 2 out of 3 Actual Periods (P1 vs P2: +13%[-9%;46%]; P1 vs P3: +11%[-4%;26%]). Mixing patterns (Figs 7–8) (S3 Table) also showed an important contribution during the initial phase of an epidemic for the 10–20 year olds (predominantly for Actual Periods 2 and 3), and for the 35–50 year olds (predominantly for Actual Period 1).

Fig 7. 3D representation of the base-case matrix without SPC (Supplementary Professional Contacts).

Fig 8. Smoothed contact matrices without SPC, for physical contacts only and with SPC(Supplementary Professional Contacts) (right).

Relative incidence of a new emerging infection in a completely susceptible population, estimated from the corresponding matrix (left).

The central diagonal in Figs 7–9 shows that contact patterns were highly assortative with age (i.e. participants tend to mix with people of similar age). The 2 secondary parallel diagonals for people with age differences of about 30 years exhibited a high contact rate: children mixing with adults aged 30–39 years and adults mixing with older contacts (>60 years). These diagonal bands were found only in the home matrix (Fig 9), and mostly represent contacts between (grand)parents and their (grand)children. These mixing patterns were maintained for physical contacts and over the different periods.

Fig 9. Smoothed contact matrices according to Actual Period (right).

Relative incidence of a new emerging infection in a completely susceptible population, estimated from the corresponding matrix (left).

Men reported significantly fewer contacts with children than women, whatever the contact gender. Men reported significantly fewer contacts with women, whatever the participant or contact age. And boys reported significantly fewer contacts with girls than with boys (Table 2).

Table 2. Gender of participant & Contact, without SPC (Supplementary Professional Contacts): Ratio of contact for male participants compared to female, not taking into account the gender of contact and taking into account the gender of contact.

The impact of school closures on an epidemic was estimated by the relative change in R0 on a weekend day compared to a weekday and on a holiday compared to a regular day. R0 decreased during weekends and holidays by 28%[10%;44%] and 33%[25%;41%], respectively.

Where do people mix

Contact patterns differed according to location (Fig 10), with most contacts made at school and at home and fewer contacts during transport and in public places. Contact patterns were found to be assortative with age at all locations, and transgenerational mixing occurred mainly at home. With home as a baseline (= 1), the contact ratio according to location was higher at school (1.55 [1.25–1.91]), but lower at work (0.56 [0.43–0.72]), in private places (0.42 [0.35–0.50]), in public places (0.59 [0.49–0.69]) and in transportation (0.23 [0.15–0.40]), and not different in open spaces (0.85 [0.68–1.04]).

Fig 10. Contact matrices according to location.

Numbers are the ratio of the contact rate to the contact rate at home (95%CI). No smoothing or reciprocity was applied: because a particular location would not be the same for a participant and a contact (e.g., at home vs. not at home), the matrices were kept asymmetric.

The professional contacts

The total number of contacts with SPC was 54 378 (9(5–17) contacts per day), and was 52 042 (9(5–17)) with SPC censored at 134. Hence the 4% reduction in the total number of contacts involved 22 (1%) participants. Censored participants were similar in gender, age and household size to other participants. SPC increased the variance and attenuated the effect of all variables, except for age <25y, gender, weekend, occupation when employed and the period (Table 1) (S4 Table, S4 Fig). SPC increased the number of contacts during the last period (Design and Actual), and R0 for Actual Periods 2 and 3 by up to 9%[-27%;111%] and 46%[8%;100%] with full SPC, and by 6%[-26%;84%] and 33%[1%;72%] with censored SPC. With SPC, the contact matrix (Fig 8) showed a wider contact “plateau”, corresponding to less assortative mixing for ages 20–65 years. It resulted in a contact ratio at work higher than in any other location, of 3.17[1.83;5.99] (compared to home as baseline). The mixing pattern specifically at the workplace was mildly assortative by age, but showed a cut-off at 45 years (Fig 10) (individuals under 45 years had contacts mainly with individuals under 45 years, while individuals over 45 years had contacts mainly with individuals over 45 years). The number of contacts made specifically at work was 3[0;10] without SPC and 20[6;38] with SPC. With SPC, R0 decreased during weekends and holidays by 63%[53%;70%] and 20%[-1%;22%], respectively, and public transport increased the number of contacts by 96%[28%;198%].

For participants common to Design Periods 1 and 2, SPC led to similar results except that weekends became significantly associated with fewer contacts.


COMES-F is the first study on temporal variation of social contact patterns to use a contact diary approach. The trend toward more contacts in April-May was significant for design periods but only with SPC for actual periods. The period did not influence the number of contacts among participants common to Design Periods 1 and 2. Hence, actual/design periods in this study showed little influence on the number of contacts compared to age, household size, gender, holidays and weekends. Weather may help to explain these minor differences between the periods. Two recent studies showed that weather conditions could influence the number of contacts and mixing patterns[22,23]. DeStefano et al suggested similar trends, although without information on statistical significance or mixing patterns[15]. Besides temporal variation, spatial variation such as place of residence, albeit non-significant, could be a confounder, as weather varies with season and, primarily, latitude, which could influence mixing patterns.

As in POLYMOD, we found that contacts were mostly influenced by age. Contact patterns were likewise highly assortative with age, with a high contact rate for children and adolescents, and a strong child-parent component (Figs 7–9). Mixing with both contacts of similar age and with their parents explains the strong participation of children and adolescents in an epidemic[4] (Figs 8–9). Their high number of contacts favours an important influence of these age groups at the beginning of an epidemic. Additional contacts with other age groups (such as adults aged 35–50 years) lead to a rapid spread among other age groups. Our results present similarities not only with the POLYMOD study, but also with contact survey studies using different methodology, such as a household-structured community cohort [24–26], probability sampling instead of quota [27] or in-person (face-to-face) interviews. The highest number of contacts was always concentrated on children and teenagers, mixing patterns were assortative by age, and physical contact was more likely to be prolonged, frequent and occurring at home.

Our most original result was the difference in gender, with men having fewer contacts than women and mixing being assortative with age and gender. The POLYMOD study[7] found a similar trend, but could not establish its statistical significance. With a different methodology, DeStefano et al[15] found that women had 13% more speaking interactions per day. Nonetheless, to the best of our knowledge, no previous study has ever presented contact matrices according to gender. So far, gender differences in infectious disease epidemiology have been attributed only to hormonal differences[28] or to differences in risk assessment[29] leading to incomplete reporting[30]. We suggest that the higher participation of women in infectious diseases such as influenza[31] or pertussis[32] could also be attributed to behaviour. Data from Japan[33] showed that several infections were more frequent among males during childhood and among females at an older age. The authors’ hypothesis was that mothers were more likely to stay at home with a sick child and consequently more likely to be exposed to infection. More contacts should result in accelerated circulation of pathogens among women. Women should consequently get infected sooner than men, and present a shorter serial interval (time between symptom onset in a case and in its infector). Indeed, in a study on pertussis transmission in Dutch households, the mean serial interval was 20 days when the mother was the infector vs. 28 days when it was the father or a sibling[34]. In a study on household transmission of 2009 influenza in New York City, the secondary attack rate among females was almost twice the rate among males[35]. And in a prospective cohort study, female gender was associated with increased influenza transmission[36]. These differences according to age and gender of both participants and contacts could result from the higher involvement of women in childcare as well as gender differences among professional contacts.
In our study gender preference occurred evenly among children, in accordance with a study carried out in a school using wearable sensors[37]. For strong interaction, gender preference increased with grade while for low interaction it decreased for girls and increased for boys. Therefore, different trends on gender preference according to countries in POLYMOD[7] could have led to an average non-significant trend.

In accordance with previous work[3], most contacts occurred at home and school and far fewer in other locations such as transportation. Whatever the location, contacts were assortative with age, but less assortative in places where the contact rate was the highest (home, school or workplace with SPC). The diagonals on the home matrix demonstrated that parent-children contact occurred primarily at home, in accordance with findings from Lapidus[10]. This finding confirms the importance of home and school in the spread of infectious diseases, both because contact rates are high, and because contacts are not limited to a specific age category, thereby allowing the pathogen to spread across age categories. Therefore, home quarantine or school closures would have a higher impact than transport-related measures on the contact rate and the spread of infections.

Most of the studies on school closures have relied on strong assumptions about contact patterns[38–40], or were based on a specific context (multiple non-pharmaceutical measures taken simultaneously, school closures among others), such as the 1918 pandemic[41], the 2009 H1N1 pandemic in Mexico[42] or SARS in Beijing[43]. Like Hens[21], we used social contact data to quantify the impact of school closure, not relying on data specific to a particular pathogen. Oversampling the number of study days in a holiday period made it possible to study the influence of holidays, also regarded as a proxy for school closures, on mixing patterns. Based on our analysis and on[21], school closure would have more impact on disease transmission in France than in other European countries, as the R0 of an epidemic decreased by 28% and 33% during weekends and holidays, compared to 21% and 17% in the European countries where a significant decrease was found[21]. This difference should be taken into account when estimating the benefits of school closure, which may nonetheless be counterbalanced by a macroeconomic cost that would render such strategies questionable[44].

Unlike Hens et al[19], we found no influence of day-care centre or classroom size on the number of contacts. This could be a methodological issue, resulting from different definitions of the variable, as well as undeniable differences between Belgium and France. Hence, if size of day-care centre or classroom influences the number of contacts, it is neither strong nor linear in our setting.

Inclusion of SPC markedly modified the number of contacts and the influencing variables. This effect resulted partly from our methodology: SPC were included only for weekdays, hence the strong effect of weekends. But it also influenced mixing patterns, as an adult could make contact outside his or her own age category outside the home, notably in the workplace, which could facilitate the spread of a pathogen among different age categories. This observation raises the question of the possible role of the workplace, as well as public transport, in pathogen transmission. Gender difference enhanced by SPC could reflect gender difference in rate of employment. In contrast, the influence of SPC on the period is less clear, even though a higher number of contacts increases the power of the analysis and could render a trend significant. One difficulty is that professional contacts, notably when numerous, are unlikely to have the same importance as others regarding infectious disease transmission (e.g. bus driver). Of note, reporting of professional contacts was limited to 10 in 4 of 8 countries in POLYMOD (20 in Belgium). Because they derive from a parametric approach, our results are sensitive to extreme values (e.g. participants with a high number of professional contacts). Hence, limiting the maximum number of contacts to 40 a day and modelling the SPC separately may effectively help to limit the effect of these outliers. That said, while limiting the number of reported contacts facilitates completion of the diary, it inevitably leads to artificial boundaries (Fig 2). Therefore, separate modelling of SPC (see Hens et al[19]) may represent an optimal trade-off between relevance and feasibility.

Although the impact of professional contacts on social mixing patterns was assessed, more detailed future analyses may require additional poststratification for employment, given the moderate undersampling of employed people in the 50–64y category. When more refined analyses with several poststratification dimensions are of interest, care should be taken with highly variable poststratification weights (see e.g. Vandendijck et al. [45]).
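To make the concern about highly variable weights concrete, the sketch below computes simple poststratification weights (population share divided by sample share) for one hypothetical employment stratum and then applies Kish's approximation of the effective sample size. The stratum labels, shares and function names are assumptions chosen for illustration; this is a generic diagnostic, not the method of Vandendijck et al. nor the weighting actually used in the survey.

```python
from collections import Counter


def poststratification_weights(sample_strata, population_share):
    """Weight each respondent by (population share) / (sample share) of their stratum."""
    n = len(sample_strata)
    counts = Counter(sample_strata)
    return [population_share[s] / (counts[s] / n) for s in sample_strata]


def kish_effective_sample_size(weights):
    """Kish approximation: (sum of weights)^2 / (sum of squared weights)."""
    total = sum(weights)
    return total * total / sum(w * w for w in weights)


# Hypothetical sample in which employed people are undersampled
# (2 of 10 respondents) relative to an assumed 50% population share.
sample = ["employed"] * 2 + ["not_employed"] * 8
population = {"employed": 0.5, "not_employed": 0.5}

weights = poststratification_weights(sample, population)
n_eff = kish_effective_sample_size(weights)

print("weights:", sorted(set(round(w, 3) for w in weights)))
print("effective sample size:", round(n_eff, 2), "of", len(sample))
```

The more unequal the weights, the further the effective sample size falls below the nominal one, which is why adding several poststratification dimensions at once can cost precision.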

The reporting of contacts for 2 consecutive days yielded a large amount of data, resulting in the largest contact survey for a single country. But this advantage is counteracted by reporting fatigue, with fewer contacts reported during the second day. Smieszek et al[46] showed higher underreporting for highly connected individuals than for isolated individuals. As the proportion of short and non-intense contacts increased with the total number of contact partners, underreporting of contacts was correlated with contact duration. The fact that participants were slightly older than non-participants could reflect that older people are less often professionally active and/or employed, whereas those who are have less time to participate. Therefore, there is a limit to the amount of information we could request in view of achieving optimal accuracy.

We have presented the first large population contact survey in France, and one of the largest contact surveys of its kind. It improves our understanding of the spread of infectious diseases, of the role of some age categories and of the impact of school closures in France. It raises more fundamental questions on the optimal design of such surveys, on the role of professional contacts and locations, and on the gender difference in the epidemiology of infectious diseases. Finally, it provides some basic material to be used in applied model-based analyses.

Supporting Information

S3 Fig. Degree distribution graph frequency-based.

S4 Fig. Contact description with SPC (Supplementary Professional Contacts).

S1 Text. Next Generation Matrix and R0 calculation.

S2 Text. Supplementary Professional contact modelling methodology.

S1 Table. Factors influencing the number of contacts, with censoring at 29 similarly to Mossong et al 2008 (non-linear model).

S2 Table. Employment and school enrolment rates.

S3 Table. Base-case contact matrix with age categories for all contact and for skin contact only.

S4 Table. Gender table with SPC (Supplementary Professional Contacts).

Acknowledgments

The Institut Catholique de Lille was the promoter of the survey. The survey was carried out by IPSOS. The computational resources and services used in this work were provided by the VSC (Flemish Supercomputer Center), funded by the Hercules Foundation and the Flemish Government – department EWI. The authors thank Geert Jan Bex for providing help in parallel computing and Jeffrey Arsham, a medical translator, for reading and reviewing the original English-language text.

Author Contributions

Conceived and designed the experiments: BD NH PB DLB XL YY SK NM PYB. Performed the experiments: SK BD. Analyzed the data: GB NH PB BD. Wrote the paper: GB SK PB DLB XL NM YY PYB NH BD.

References


  1. Ogunjimi B, Hens N, Goeyvaerts N, Aerts M, Van Damme P, Beutels P. Using empirical social contact data to model person to person infectious disease transmission: an illustration for varicella. Math Biosci. 2009;218: 80–87. pmid:19174173
  2. Read JM, Edmunds WJ, Riley S, Lessler J, Cummings DAT. Close encounters of the infectious kind: methods to measure social mixing behaviour. Epidemiol Infect. 2012;140: 2117–2130. pmid:22687447
  3. Melegaro A, Jit M, Gay N, Zagheni E, Edmunds WJ. What types of contacts are important for the spread of infections?: using contact survey data to explore European mixing patterns. Epidemics. 2011;3: 143–151. pmid:22094337
  4. Wallinga J, Teunis P, Kretzschmar M. Using data on social contacts to estimate age-specific transmission parameters for respiratory-spread infectious agents. Am J Epidemiol. 2006;164: 936–944. pmid:16968863
  5. Rohani P, Zhong X, King AA. Contact network structure explains the changing epidemiology of pertussis. Science. 2010;330: 982–985. pmid:21071671
  6. De Cao E, Zagheni E, Manfredi P, Melegaro A. The relative importance of frequency of contacts and duration of exposure for the spread of directly transmitted infections. Biostat Oxf Engl. 2014;15: 470–483.
  7. Mossong J, Hens N, Jit M, Beutels P, Auranen K, Mikolajczyk R, et al. Social contacts and mixing patterns relevant to the spread of infectious diseases. PLoS Med. 2008;5: e74. pmid:18366252
  8. Fumanelli L, Ajelli M, Manfredi P, Vespignani A, Merler S. Inferring the Structure of Social Contacts from Demographic Data in the Analysis of Infectious Diseases Spread. Salathé M, editor. PLoS Comput Biol. 2012;8: e1002673. pmid:23028275
  9. Zagheni E, Billari FC, Manfredi P, Melegaro A, Mossong J, Edmunds WJ. Using time-use data to parameterize models for the spread of close-contact infectious diseases. Am J Epidemiol. 2008;168: 1082–1090. pmid:18801889
  10. Lapidus N, de Lamballerie X, Salez N, Setbon M, Delabre RM, Ferrari P, et al. Factors associated with post-seasonal serological titer and risk factors for infection with the pandemic A/H1N1 virus in the French general population. PLoS One. 2013;8: e60127. pmid:23613718
  11. Lowen AC, Mubareka S, Steel J, Palese P. Influenza virus transmission is dependent on relative humidity and temperature. PLoS Pathog. 2007;3: 1470–1476. pmid:17953482
  12. Keeling MJ, Rohani P. Modeling Infectious Diseases in Humans and Animals [Internet]. Princeton: Princeton University Press; 2011. Available:
  13. Metcalf CJE, Bjornstad ON, Grenfell BT, Andreasen V. Seasonality and comparative dynamics of six childhood infections in pre-vaccination Copenhagen. Proc R Soc B Biol Sci. 2009;276: 4111–4118.
  14. Read JM, Eames KTD, Edmunds WJ. Dynamic social networks and the implications for the spread of infectious disease. J R Soc Interface R Soc. 2008;5: 1001–1007.
  15. DeStefano F, Haber M, Currivan D, Farris T, Burrus B, Stone-Wiggins B, et al. Factors associated with social contacts in four communities during the 2007–2008 influenza season. Epidemiol Infect. 2011;139: 1181–1190. pmid:20943002
  16. INSEE: National Institute of Statistics and Economic Studies. Recensement de la population [Internet]. 2009 [cited 8 Oct 2014]. Available:
  17. Breiman L. Random Forests. Mach Learn. 2001;45: 5–32.
  18. Wood SN. Generalized additive models: an introduction with R. Boca Raton, FL: Chapman & Hall/CRC; 2006.
  19. Hens N, Goeyvaerts N, Aerts M, Shkedy Z, Van Damme P, Beutels P. Mining social mixing patterns for infectious disease models based on a two-day population survey in Belgium. BMC Infect Dis. 2009;9: 5. pmid:19154612
  20. Diekmann O, Heesterbeek JA, Metz JA. On the definition and the computation of the basic reproduction ratio R0 in models for infectious diseases in heterogeneous populations. J Math Biol. 1990;28: 365–382. pmid:2117040
  21. Hens N, Ayele GM, Goeyvaerts N, Aerts M, Mossong J, Edmunds JW, et al. Estimating the impact of school closure on social mixing behaviour and the transmission of close contact infections in eight European countries. BMC Infect Dis. 2009;9: 187. pmid:19943919
  22. Willem L, Van Kerckhove K, Chao DL, Hens N, Beutels P. A nice day for an infection? Weather conditions and social contact patterns relevant to influenza transmission. PLoS One. 2012;7: e48695. pmid:23155399
  23. Chan T-C, Fu Y-C, Hwang J-S. Changing social contact patterns under tropical weather conditions relevant for the spread of infectious diseases. Epidemiol Infect. 2015;143: 440–451. pmid:24725605
  24. Horby P, Pham QT, Hens N, Nguyen TTY, Le QM, Dang DT, et al. Social contact patterns in Vietnam and implications for the control of infectious diseases. PLoS One. 2011;6: e16965. pmid:21347264
  25. Read JM, Lessler J, Riley S, Wang S, Tan LJ, Kwok KO, et al. Social mixing patterns in rural and urban areas of southern China. Proc Biol Sci. 2014;281: 20140268. pmid:24789897
  26. Grijalva CG, Goeyvaerts N, Verastegui H, Edwards KM, Gil AI, Lanata CF, et al. A household-based study of contact networks relevant for the spread of infectious diseases in the highlands of Peru. PLoS One. 2015;10: e0118457. pmid:25734772
  27. Fu Y, Wang D-W, Chuang J-H. Representative contact diaries for modeling the spread of infectious diseases in Taiwan. PLoS One. 2012;7: e45113. pmid:23056193
  28. Klein SL, Hodgson A, Robinson DP. Mechanisms of sex disparities in influenza pathogenesis. J Leukoc Biol. 2012;92: 67–73. pmid:22131346
  29. Gustafson PE. Gender differences in risk perception: theoretical and methodological perspectives. Risk Anal Off Publ Soc Risk Anal. 1998;18: 805–811.
  30. Barbara AM, Loeb M, Dolovich L, Brazil K, Russell ML. A comparison of self-report and health care provider data to assess surveillance definitions of influenza-like illness in outpatients. Can J Public Health Rev Can Santé Publique. 2012;103: 69–75.
  31. World Health Organization. Sex, gender and influenza [Internet]. Geneva, Switzerland: World Health Organization; 2010. Available:
  32. Haslam N, Hoang U, Goldacre MJ. Trends in hospital admission rates for whooping cough in England across five decades: database studies. J R Soc Med. 2014;107: 157–162. pmid:24526463
  33. Eshima N, Tokumaru O, Hara S, Bacal K, Korematsu S, Karukaya S, et al. Age-Specific Sex-Related Differences in Infections: A Statistical Analysis of National Surveillance Data in Japan. Kazembe L, editor. PLoS One. 2012;7: e42261. pmid:22848753
  34. Te Beest DE, Henderson D, van der Maas NAT, de Greeff SC, Wallinga J, Mooi FR, et al. Estimation of the serial interval of pertussis in Dutch households. Epidemics. 2014;7: 1–6. pmid:24928663
  35. France A-M, Jackson M, Schrag S, Lynch M, Zimmerman C, Biggerstaff M, et al. Household Transmission of 2009 Influenza A (H1N1) Virus after a School–Based Outbreak in New York City, April–May 2009. J Infect Dis. 2010;201: 984–992. pmid:20187740
  36. McCaw JM, Howard PF, Richmond PC, Nissen M, Sloots T, Lambert SB, et al. Household transmission of respiratory viruses–assessment of viral, individual and household characteristics in a population study of healthy Australian adults. BMC Infect Dis. 2012;12: 345. pmid:23231698
  37. Stehlé J, Charbonnier F, Picard T, Cattuto C, Barrat A. Gender homophily from spatial behavior in a primary school: A sociometric study. Soc Netw. 2013;35: 604–613.
  38. Ferguson NM, Cummings DAT, Fraser C, Cajka JC, Cooley PC, Burke DS. Strategies for mitigating an influenza pandemic. Nature. 2006;442: 448–452. pmid:16642006
  39. Germann TC, Kadau K, Longini IM, Macken CA. Mitigation strategies for pandemic influenza in the United States. Proc Natl Acad Sci U S A. 2006;103: 5935–5940. pmid:16585506
  40. Glass RJ, Glass LM, Beyeler WE, Min HJ. Targeted social distancing design for pandemic influenza. Emerg Infect Dis. 2006;12: 1671–1681. pmid:17283616
  41. Markel H, Lipman HB, Navarro JA, Sloan A, Michalsen JR, Stern AM, et al. Nonpharmaceutical interventions implemented by US cities during the 1918–1919 influenza pandemic. JAMA J Am Med Assoc. 2007;298: 644–654.
  42. Chowell G, Echevarría-Zuno S, Viboud C, Simonsen L, Tamerius J, Miller MA, et al. Characterizing the Epidemiology of the 2009 Influenza A/H1N1 Pandemic in Mexico. Peiris JSM, editor. PLoS Med. 2011;8: e1000436. pmid:21629683
  43. Cowling BJ, Ho LM, Leung GM. Effectiveness of control measures during the SARS epidemic in Beijing: a comparison of the Rt curve and the epidemic curve. Epidemiol Infect. 2008;136.
  44. Keogh-Brown MR, Smith RD, Edmunds JW, Beutels P. The macroeconomic impact of pandemic influenza: estimates from models of the United Kingdom, France, Belgium and The Netherlands. Eur J Health Econ HEPAC Health Econ Prev Care. 2010;11: 543–554.
  45. Vandendijck Y, Faes C, Hens N. Prevalence and Trend Estimation from Observational Data with Highly Variable Post-stratification Weights. In revision. Ann Appl Stat.
  46. Smieszek T, Burri EU, Scherzinger R, Scholz RW. Collecting close-contact social mixing data with contact diaries: reporting errors and biases. Epidemiol Infect. 2012;140: 744–752. pmid:21733249