Fig 1.
The local Facebook friendship networks provided by students A and B are shown in black. In particular, we know that i and j are friends on Facebook and that j and k are not, because i and j are both friends of A and j and k are both friends of B. On the other hand, we do not know whether i and k are friends: the red dashed line represents the lack of knowledge about the potential existence of this relationship.
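The three possible states of a pair (known link, known absence of link, unknown) can be illustrated with a minimal sketch. It assumes that each local network is stored as a set of nodes (the student and their friends) together with the set of links among them; the function name `pair_status` and the toy data are hypothetical and only mirror the situation drawn in the figure.

```python
def pair_status(ego_networks, u, v):
    """Return 'friends', 'not friends' or 'unknown' for the pair (u, v).

    ego_networks: list of (nodes, edges) tuples, one per student who
    shared a local network; edges are frozensets of two node labels.
    Assumption: a local network lists every link among its nodes.
    """
    pair = frozenset((u, v))
    observed = False
    for nodes, edges in ego_networks:
        if u in nodes and v in nodes:
            # Both nodes are visible in this local network, so the
            # presence or absence of the link is known.
            observed = True
            if pair in edges:
                return "friends"
    return "not friends" if observed else "unknown"


# Hypothetical toy data reproducing the configuration of the figure.
net_A = ({"A", "i", "j"},
         {frozenset(p) for p in [("A", "i"), ("A", "j"), ("i", "j")]})
net_B = ({"B", "j", "k"},
         {frozenset(p) for p in [("B", "j"), ("B", "k")]})
ego_networks = [net_A, net_B]

for u, v in [("i", "j"), ("j", "k"), ("i", "k")]:
    print(u, v, pair_status(ego_networks, u, v))
```

Pairs labelled "unknown" correspond to the red dashed line: no shared local network contains both students.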
Fig 2.
Venn diagrams representing the sets of students involved in the various data collection efforts.
Table 1.
Cross-tabulation of pairs of contact reports from the contact diaries.
Each pair of participants with at least one reported contact gives a single observation. For instance, there were 12 pairs of students (i, j) such that i reported contacts with j with total duration between 6 and 15 min while j reported a duration between 15 min and 1 h. The percentages within a cell are computed with respect to the row (right of the cell entry) and column (below the cell entry) totals.
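A cross-tabulation of this kind, with row and column percentages, could be computed along the following lines. This is only a sketch: the function name `cross_tabulate`, the category labels and the toy observations are hypothetical, and the ordering of each pair (which student is i, which is j) is assumed to be fixed by the caller.

```python
from collections import Counter


def cross_tabulate(observations):
    """Count (row, column) observations and add row/column percentages.

    observations: iterable of (category reported by i, category reported by j).
    """
    counts = Counter(observations)
    row_totals = Counter()
    col_totals = Counter()
    for (row, col), n in counts.items():
        row_totals[row] += n
        col_totals[col] += n
    return {
        (row, col): {
            "count": n,
            "row_pct": 100.0 * n / row_totals[row],
            "col_pct": 100.0 * n / col_totals[col],
        }
        for (row, col), n in counts.items()
    }


# Hypothetical toy observations, one per pair of students.
observations = (
    [("6-15 min", "15 min-1 h")] * 3
    + [("6-15 min", "6-15 min")]
    + [("not reported", "15 min-1 h")]
)
table = cross_tabulate(observations)
print(table[("6-15 min", "15 min-1 h")])
```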
Table 2.
Comparison of properties for the contact networks obtained from sensors and diaries.
On the day the contact diaries were collected, only 295 of the 327 participating students wore their sensors correctly. All network properties for the contact diaries network are computed on its symmetrized version. In this summary table, we assume that a contact exists if it is reported by at least one of the two nodes. The values on the right side of the table are computed after matching the two networks. Matching is done by removing the nodes that did not participate in the survey and those that had no contacts recorded by the sensors on the 4th day of the study. A * next to the SPL average means that some isolated nodes appeared after the matching; in this case, we computed the average over the connected pairs only. Standard deviations are given in parentheses.
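The symmetrization, the matching and the average over connected pairs described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure actually used for the table: the function names (`symmetrize`, `match`, `average_spl`) and the toy data are hypothetical, and shortest paths are computed by a plain breadth-first search on an unweighted graph.

```python
from collections import deque


def symmetrize(reports):
    """A contact exists if at least one of the two nodes reported it."""
    return {frozenset((u, v)) for u, v in reports if u != v}


def match(edges, keep):
    """Retain only the links whose two endpoints belong to `keep`."""
    return {edge for edge in edges if edge <= keep}


def average_spl(nodes, edges):
    """Average shortest path length over connected pairs only."""
    adjacency = {n: set() for n in nodes}
    for edge in edges:
        u, v = tuple(edge)
        adjacency[u].add(v)
        adjacency[v].add(u)
    total = 0
    connected_pairs = 0
    for source in nodes:
        # Breadth-first search from `source`.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            x = queue.popleft()
            for y in adjacency[x]:
                if y not in dist:
                    dist[y] = dist[x] + 1
                    queue.append(y)
        total += sum(dist.values())
        connected_pairs += len(dist) - 1  # unreachable nodes are skipped
    return total / connected_pairs if connected_pairs else float("nan")


# Hypothetical toy data: directed diary reports and the set of nodes
# with sensor contacts on the day of interest.
diary_reports = [("a", "b"), ("b", "a"), ("b", "c"), ("d", "e")]
diary_participants = {"a", "b", "c", "d", "e"}
sensor_nodes = {"a", "b", "c", "x"}

common = diary_participants & sensor_nodes
diary_edges = match(symmetrize(diary_reports), common)
print(average_spl(common, diary_edges))
```

Each connected pair is visited once from each endpoint, which leaves the average unchanged; isolated nodes contribute nothing to either the numerator or the denominator.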
Fig 3.
Contact matrices of link densities.
We compare here the contact matrices of link densities built from (a) the network of contacts obtained using the sensor data collected on Dec. 5th and (b) the network of contacts as reported in the contact diaries. We discarded the data corresponding to the MP*1, PC* and PSI* classes because too few students from these classes filled in a contact diary (2 for MP*1, 0 for PC* and PSI*). The similarity between these two matrices is 97%.
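As an illustration, a contact matrix of link densities and a similarity between two such matrices might be computed as below. Two assumptions are made here and are not stated in the caption: the link density between two classes is taken as the number of observed links divided by the maximum possible number of links, and the similarity is taken to be the cosine similarity between the flattened matrices. Function names, class labels and data are hypothetical.

```python
import math


def link_density_matrix(classes, node_class, edges):
    """Density of links between each pair of classes.

    Assumption: density = observed links / maximum possible links.
    """
    size = {c: 0 for c in classes}
    for c in node_class.values():
        size[c] += 1
    index = {c: k for k, c in enumerate(classes)}
    counts = [[0] * len(classes) for _ in classes]
    for u, v in edges:
        a, b = index[node_class[u]], index[node_class[v]]
        counts[a][b] += 1
        if a != b:
            counts[b][a] += 1
    density = [[0.0] * len(classes) for _ in classes]
    for x in classes:
        for y in classes:
            a, b = index[x], index[y]
            if x == y:
                possible = size[x] * (size[x] - 1) / 2
            else:
                possible = size[x] * size[y]
            density[a][b] = counts[a][b] / possible if possible else 0.0
    return density


def cosine_similarity(m1, m2):
    """Cosine similarity between two matrices (assumed similarity measure)."""
    v1 = [x for row in m1 for x in row]
    v2 = [x for row in m2 for x in row]
    dot = sum(a * b for a, b in zip(v1, v2))
    norm = math.sqrt(sum(a * a for a in v1)) * math.sqrt(sum(b * b for b in v2))
    return dot / norm if norm else 0.0


# Hypothetical toy data: two classes of two students each.
classes = ["X", "Y"]
node_class = {1: "X", 2: "X", 3: "Y", 4: "Y"}
sensor_edges = [(1, 2), (1, 3)]
density = link_density_matrix(classes, node_class, sensor_edges)
print(density)
print(cosine_similarity(density, density))
```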
Fig 4.
Sensors vs. contact diaries: distributions of cumulative durations registered by the sensors.
(a) Cumulative distributions of the aggregate durations of contacts registered by the sensors for (i) all 488 links between the 109 nodes belonging to both networks; (ii) the 202 links that were also reported in the diaries; (iii) the 286 links that were not reported in the diaries. (b) Cumulative distribution of aggregate durations of contacts registered by the sensors for the different categories of links reported in the diaries.
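The split of sensor links into those also reported in the diaries and those that were not, followed by an empirical cumulative distribution of their durations, can be sketched as follows. The function names and toy durations are hypothetical, and the sketch uses the convention P(duration ≤ t), which is an assumption since the caption does not say which cumulative convention is plotted.

```python
def split_by_reported(sensor_durations, diary_links):
    """Separate sensor link durations by whether the link is in the diaries.

    sensor_durations: dict mapping frozenset({u, v}) to aggregate duration.
    diary_links: set of frozenset({u, v}) reported in the diaries.
    """
    reported = [d for link, d in sensor_durations.items() if link in diary_links]
    unreported = [d for link, d in sensor_durations.items() if link not in diary_links]
    return reported, unreported


def empirical_cdf(values):
    """Return sorted (value, fraction of values <= value) pairs."""
    ordered = sorted(values)
    n = len(ordered)
    cdf = {}
    for k, v in enumerate(ordered):
        cdf[v] = (k + 1) / n  # later duplicates overwrite earlier ones
    return sorted(cdf.items())


# Hypothetical toy data (durations in seconds).
sensor_durations = {
    frozenset(("a", "b")): 600,
    frozenset(("a", "c")): 20,
    frozenset(("b", "c")): 40,
}
diary_links = {frozenset(("a", "b"))}

reported, unreported = split_by_reported(sensor_durations, diary_links)
print(empirical_cdf(reported))
print(empirical_cdf(unreported))
```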
Table 3.
Average and standard deviation of the distributions of aggregate durations for different sets of links (as in Fig 4).
Table 4.
Sensors vs. contact diaries: cross-tabulation of the number of links in each duration category.
The percentages within a cell are computed with respect to the row (right of the cell entry) and column (below the cell entry) totals.
Fig 5.
Contact and friendship networks.
The three layers of the multiplex are shown using exactly the same layout: each node is placed at the same position in the three panels. The color of each node represents its class and its size represents its degree in the corresponding network (here we consider a symmetrized version of the network of reported friendships). * Strictly speaking, the Facebook data do not provide a network, as we do not have information about the presence or absence of a link between many pairs of nodes (see Fig 1). Figure created using the Gephi software (http://www.gephi.org).
Table 5.
Summary statistics for the global contact network and the network of reported friendships.
All network properties for the network of friendships are computed on its symmetrized version. Values on the right part of the table are obtained after retaining only the students present in both networks.
Fig 6.
Comparison of the networks of contacts and friendships.
(a) Shortest path length distributions for both networks; (b) Distributions of aggregate durations, as measured by the sensors, for different kinds of links in the contact network: (i) all links, (ii) links i − j for which only one of i or j reported a friendship with the other, (iii) links for which both students reported the friendship, and (iv) links for which no friendship was reported; (c) and (d): Contact matrices of link densities. We compare here the contact matrices of link densities built from (c) the global aggregated network of contacts obtained using the sensor data and (d) the symmetrized network of reported friendships. The similarity between these two matrices is ≈ 95%.
Table 6.
Mean and standard deviation of the distributions of aggregate durations for different sets of links (as in Fig 6(b)).
Fig 7.
Fraction of friendship and contact links as a function of the number of features shared by two students.
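One possible reading of this quantity is, for each number k of shared features, the fraction of pairs of students sharing exactly k features that are connected by a link. The sketch below follows that reading, which is an assumption; the function name, the choice of features and the toy data are hypothetical.

```python
from collections import Counter
from itertools import combinations


def link_fraction_by_shared_features(features, edges):
    """Fraction of linked pairs among pairs sharing k features, for each k.

    features: dict mapping each student to a tuple of feature values.
    edges: set of frozenset({u, v}) links.
    """
    linked = Counter()
    total = Counter()
    for u, v in combinations(sorted(features), 2):
        shared = sum(1 for a, b in zip(features[u], features[v]) if a == b)
        total[shared] += 1
        if frozenset((u, v)) in edges:
            linked[shared] += 1
    return {k: linked[k] / total[k] for k in sorted(total)}


# Hypothetical toy data: two features per student.
features = {
    "a": ("c1", "F"),
    "b": ("c1", "F"),
    "c": ("c1", "M"),
    "d": ("c2", "M"),
}
contact_edges = {frozenset(("a", "b")), frozenset(("a", "c"))}
fractions = link_fraction_by_shared_features(features, contact_edges)
print(fractions)
```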
Fig 8.
(a) Distribution of aggregate durations for the different sets of links. (b) Fractions of pairs of students belonging to specific groups (no link, link in both the contact network and Facebook, link in only one of the two) as a function of the number of common features.
Table 7.
Mean and standard deviation of the distributions of aggregate durations for different sets of links in the contact network.
Fig 9.
(a) Conditional probability of finding a link in one layer (row index) given its existence in another one (column index); “C” stands for contact network, “FS” for friendship survey, “FB” for Facebook; (b) Distribution of aggregate durations in the contact network for different sets of links.
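The conditional probabilities of panel (a) can be illustrated by a short sketch in which each layer is a set of undirected links and the probability of a link in the row layer given its existence in the column layer is estimated as the size of the intersection divided by the size of the column layer. The function name and the toy layers are hypothetical. The sketch also ignores the fact that many Facebook pairs have unknown status (see Fig 1), so a careful computation would first restrict all layers to pairs whose status is known.

```python
def conditional_link_probabilities(layers):
    """Estimate P(link in row layer | link in column layer).

    layers: dict mapping a layer name to a set of frozenset({u, v}) links.
    Returns a dict keyed by (row, column).
    """
    result = {}
    for row, row_links in layers.items():
        for col, col_links in layers.items():
            if col_links:
                result[(row, col)] = len(row_links & col_links) / len(col_links)
            else:
                result[(row, col)] = float("nan")
    return result


def link(u, v):
    """Helper building an undirected link."""
    return frozenset((u, v))


# Hypothetical toy layers.
layers = {
    "C": {link("a", "b"), link("a", "c"), link("b", "c"), link("c", "d")},
    "FS": {link("a", "b"), link("a", "c")},
    "FB": {link("a", "b"), link("b", "d")},
}
probabilities = conditional_link_probabilities(layers)
print(probabilities[("C", "FS")], probabilities[("FS", "C")])
```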
Table 8.
Mean and standard deviation of the distributions of aggregate durations for different sets of links.
Only 11 pairs of students have a reported friendship but no Facebook link, so we do not report the corresponding statistics.